From ef0f16f84be674d521193960046469f38de7f8f0 Mon Sep 17 00:00:00 2001 From: Shakeel Butt Date: Fri, 5 Dec 2025 12:01:06 -0800 Subject: [PATCH 0001/1321] cgroup: rstat: use LOCK CMPXCHG in css_rstat_updated On x86-64, this_cpu_cmpxchg() uses CMPXCHG without LOCK prefix which means it is only safe for the local CPU and not for multiple CPUs. Recently the commit 36df6e3dbd7e ("cgroup: make css_rstat_updated nmi safe") make css_rstat_updated lockless and uses lockless list to allow reentrancy. Since css_rstat_updated can invoked from process context, IRQ and NMI, it uses this_cpu_cmpxchg() to select the winner which will inset the lockless lnode into the global per-cpu lockless list. However the commit missed one case where lockless node of a cgroup can be accessed and modified by another CPU doing the flushing. Basically llist_del_first_init() in css_process_update_tree(). On a cursory look, it can be questioned how css_process_update_tree() can see a lockless node in global lockless list where the updater is at this_cpu_cmpxchg() and before llist_add() call in css_rstat_updated(). This can indeed happen in the presence of IRQs/NMI. Consider this scenario: Updater for cgroup stat C on CPU A in process context is after llist_on_list() check and before this_cpu_cmpxchg() in css_rstat_updated() where it get interrupted by IRQ/NMI. In the IRQ/NMI context, a new updater calls css_rstat_updated() for same cgroup C and successfully inserts rstatc_pcpu->lnode. Now concurrently CPU B is running the flusher and it calls llist_del_first_init() for CPU A and got rstatc_pcpu->lnode of cgroup C which was added by the IRQ/NMI updater. Now imagine CPU B calling init_llist_node() on cgroup C's rstatc_pcpu->lnode of CPU A and on CPU A, the process context updater calling this_cpu_cmpxchg(rstatc_pcpu->lnode) concurrently. The CMPXCNG without LOCK on CPU A is not safe and thus we need LOCK prefix. In Meta's fleet running the kernel with the commit 36df6e3dbd7e, we are observing on some machines the memcg stats are getting skewed by more than the actual memory on the system. On close inspection, we noticed that lockless node for a workload for specific CPU was in the bad state and thus all the updates on that CPU for that cgroup was being lost. To confirm if this skew was indeed due to this CMPXCHG without LOCK in css_rstat_updated(), we created a repro (using AI) at [1] which shows that CMPXCHG without LOCK creates almost the same lnode corruption as seem in Meta's fleet and with LOCK CMPXCHG the issue does not reproduces. Link: http://lore.kernel.org/efiagdwmzfwpdzps74fvcwq3n4cs36q33ij7eebcpssactv3zu@se4hqiwxcfxq [1] Signed-off-by: Shakeel Butt Cc: stable@vger.kernel.org # v6.17+ Fixes: 36df6e3dbd7e ("cgroup: make css_rstat_updated nmi safe") Signed-off-by: Tejun Heo --- kernel/cgroup/rstat.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/kernel/cgroup/rstat.c b/kernel/cgroup/rstat.c index a198e40c799b4..150e5871e66f2 100644 --- a/kernel/cgroup/rstat.c +++ b/kernel/cgroup/rstat.c @@ -71,7 +71,6 @@ __bpf_kfunc void css_rstat_updated(struct cgroup_subsys_state *css, int cpu) { struct llist_head *lhead; struct css_rstat_cpu *rstatc; - struct css_rstat_cpu __percpu *rstatc_pcpu; struct llist_node *self; /* @@ -104,18 +103,22 @@ __bpf_kfunc void css_rstat_updated(struct cgroup_subsys_state *css, int cpu) /* * This function can be renentered by irqs and nmis for the same cgroup * and may try to insert the same per-cpu lnode into the llist. Note - * that llist_add() does not protect against such scenarios. + * that llist_add() does not protect against such scenarios. In addition + * this same per-cpu lnode can be modified through init_llist_node() + * from css_rstat_flush() running on a different CPU. * * To protect against such stacked contexts of irqs/nmis, we use the * fact that lnode points to itself when not on a list and then use - * this_cpu_cmpxchg() to atomically set to NULL to select the winner + * try_cmpxchg() to atomically set to NULL to select the winner * which will call llist_add(). The losers can assume the insertion is * successful and the winner will eventually add the per-cpu lnode to * the llist. + * + * Please note that we can not use this_cpu_cmpxchg() here as on some + * archs it is not safe against modifications from multiple CPUs. */ self = &rstatc->lnode; - rstatc_pcpu = css->rstat_cpu; - if (this_cpu_cmpxchg(rstatc_pcpu->lnode.next, self, NULL) != self) + if (!try_cmpxchg(&rstatc->lnode.next, &self, NULL)) return; lhead = ss_lhead_cpu(css->ss, cpu); From 917a240eebbc3cdbbb96a9d686fc32b1f07600c1 Mon Sep 17 00:00:00 2001 From: Zqiang Date: Mon, 8 Dec 2025 19:23:19 +0800 Subject: [PATCH 0002/1321] sched_ext: Fix the memleak for sch->helper objects This commit use kthread_destroy_worker() to release sch->helper objects to fix the following kmemleak: unreferenced object 0xffff888121ec7b00 (size 128): comm "scx_simple", pid 1197, jiffies 4295884415 hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 ad 4e ad de .............N.. ff ff ff ff 00 00 00 00 ff ff ff ff ff ff ff ff ................ backtrace (crc 587b3352): kmemleak_alloc+0x62/0xa0 __kmalloc_cache_noprof+0x28d/0x3e0 kthread_create_worker_on_node+0xd5/0x1f0 scx_enable.isra.210+0x6c2/0x25b0 bpf_scx_reg+0x12/0x20 bpf_struct_ops_link_create+0x2c3/0x3b0 __sys_bpf+0x3102/0x4b00 __x64_sys_bpf+0x79/0xc0 x64_sys_call+0x15d9/0x1dd0 do_syscall_64+0xf0/0x470 entry_SYSCALL_64_after_hwframe+0x77/0x7f Fixes: bff3b5aec1b7 ("sched_ext: Move disable machinery into scx_sched") Cc: stable@vger.kernel.org # v6.16+ Signed-off-by: Zqiang Signed-off-by: Tejun Heo --- kernel/sched/ext.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c index 05f5a49e9649a..073b669869cb0 100644 --- a/kernel/sched/ext.c +++ b/kernel/sched/ext.c @@ -3575,7 +3575,7 @@ static void scx_sched_free_rcu_work(struct work_struct *work) int node; irq_work_sync(&sch->error_irq_work); - kthread_stop(sch->helper->task); + kthread_destroy_worker(sch->helper); free_percpu(sch->pcpu); @@ -4786,7 +4786,7 @@ static struct scx_sched *scx_alloc_and_add_sched(struct sched_ext_ops *ops) return sch; err_stop_helper: - kthread_stop(sch->helper->task); + kthread_destroy_worker(sch->helper); err_free_pcpu: free_percpu(sch->pcpu); err_free_gdsqs: From 57bb9475cdb4af871f376b9bc4e056817bcd95cc Mon Sep 17 00:00:00 2001 From: John Stultz Date: Sat, 6 Dec 2025 02:22:03 +0000 Subject: [PATCH 0003/1321] sched/ext: Avoid null ptr traversal when ->put_prev_task() is called with NULL next Early when trying to get sched_ext and proxy-exe working together, I kept tripping over NULL ptr in put_prev_task_scx() on the line: if (sched_class_above(&ext_sched_class, next->sched_class)) { Which was due to put_prev_task() passes a NULL next, calling: prev->sched_class->put_prev_task(rq, prev, NULL); put_prev_task_scx() already guards for a NULL next in the switch_class case, but doesn't seem to have a guard for sched_class_above() check. I can't say I understand why this doesn't trip usually without proxy-exec. And in newer kernels there are way fewer put_prev_task(), and I can't easily reproduce the issue now even with proxy-exec. But we still have one put_prev_task() call left in core.c that seems like it could trip this, so I wanted to send this out for consideration. tj: put_prev_task() can be called with NULL @next; however, when @p is queued, that doesn't happen, so this condition shouldn't currently be triggerable. The connection isn't straightforward or necessarily reliable, so add the NULL check even if it can't currently be triggered. Link: http://lkml.kernel.org/r/20251206022218.1541878-1-jstultz@google.com Signed-off-by: John Stultz Reviewed-by: Andrea Righi Signed-off-by: Tejun Heo --- kernel/sched/ext.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c index 073b669869cb0..bd74b371f52d9 100644 --- a/kernel/sched/ext.c +++ b/kernel/sched/ext.c @@ -2402,7 +2402,7 @@ static void put_prev_task_scx(struct rq *rq, struct task_struct *p, * ops.enqueue() that @p is the only one available for this cpu, * which should trigger an explicit follow-up scheduling event. */ - if (sched_class_above(&ext_sched_class, next->sched_class)) { + if (next && sched_class_above(&ext_sched_class, next->sched_class)) { WARN_ON_ONCE(!(sch->ops.flags & SCX_OPS_ENQ_LAST)); do_enqueue_task(rq, p, SCX_ENQ_LAST, -1); } else { From b31d2f26bb710e752912cf3c04e87c75880a57ef Mon Sep 17 00:00:00 2001 From: Tejun Heo Date: Tue, 9 Dec 2025 11:04:33 -1000 Subject: [PATCH 0004/1321] sched_ext: Fix bypass depth leak on scx_enable() failure scx_enable() calls scx_bypass(true) to initialize in bypass mode and then scx_bypass(false) on success to exit. If scx_enable() fails during task initialization - e.g. scx_cgroup_init() or scx_init_task() returns an error - it jumps to err_disable while bypass is still active. scx_disable_workfn() then calls scx_bypass(true/false) for its own bypass, leaving the bypass depth at 1 instead of 0. This causes the system to remain permanently in bypass mode after a failed scx_enable(). Failures after task initialization is complete - e.g. scx_tryset_enable_state() at the end - already call scx_bypass(false) before reaching the error path and are not affected. This only affects a subset of failure modes. Fix it by tracking whether scx_enable() called scx_bypass(true) in a bool and having scx_disable_workfn() call an extra scx_bypass(false) to clear it. This is a temporary measure as the bypass depth will be moved into the sched instance, which will make this tracking unnecessary. Fixes: 8c2090c504e9 ("sched_ext: Initialize in bypass mode") Cc: stable@vger.kernel.org # v6.12+ Reported-by: Chris Mason Reviewed-by: Emil Tsalapatis Link: https://lore.kernel.org/stable/286e6f7787a81239e1ce2989b52391ce%40kernel.org Signed-off-by: Tejun Heo --- kernel/sched/ext.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c index bd74b371f52d9..c4465ccefea48 100644 --- a/kernel/sched/ext.c +++ b/kernel/sched/ext.c @@ -41,6 +41,13 @@ static bool scx_init_task_enabled; static bool scx_switching_all; DEFINE_STATIC_KEY_FALSE(__scx_switched_all); +/* + * Tracks whether scx_enable() called scx_bypass(true). Used to balance bypass + * depth on enable failure. Will be removed when bypass depth is moved into the + * sched instance. + */ +static bool scx_bypassed_for_enable; + static atomic_long_t scx_nr_rejected = ATOMIC_LONG_INIT(0); static atomic_long_t scx_hotplug_seq = ATOMIC_LONG_INIT(0); @@ -4318,6 +4325,11 @@ static void scx_disable_workfn(struct kthread_work *work) scx_dsp_max_batch = 0; free_kick_syncs(); + if (scx_bypassed_for_enable) { + scx_bypassed_for_enable = false; + scx_bypass(false); + } + mutex_unlock(&scx_enable_mutex); WARN_ON_ONCE(scx_set_enable_state(SCX_DISABLED) != SCX_DISABLING); @@ -4970,6 +4982,7 @@ static int scx_enable(struct sched_ext_ops *ops, struct bpf_link *link) * Init in bypass mode to guarantee forward progress. */ scx_bypass(true); + scx_bypassed_for_enable = true; for (i = SCX_OPI_NORMAL_BEGIN; i < SCX_OPI_NORMAL_END; i++) if (((void (**)(void))ops)[i]) @@ -5067,6 +5080,7 @@ static int scx_enable(struct sched_ext_ops *ops, struct bpf_link *link) scx_task_iter_stop(&sti); percpu_up_write(&scx_fork_rwsem); + scx_bypassed_for_enable = false; scx_bypass(false); if (!scx_tryset_enable_state(SCX_ENABLED, SCX_ENABLING)) { From a4d39abc50c98ff88895c1abb8a44afe34266bd9 Mon Sep 17 00:00:00 2001 From: Tejun Heo Date: Thu, 11 Dec 2025 15:45:03 -1000 Subject: [PATCH 0005/1321] sched_ext: Factor out local_dsq_post_enq() from dispatch_enqueue() Factor out local_dsq_post_enq() which performs post-enqueue handling for local DSQs - triggering resched_curr() if SCX_ENQ_PREEMPT is specified or if the current CPU is idle. No functional change. This will be used by the next patch to fix move_local_task_to_local_dsq(). Cc: stable@vger.kernel.org # v6.12+ Reviewed-by: Andrea Righi Reviewed-by: Emil Tsalapatis Signed-off-by: Tejun Heo --- kernel/sched/ext.c | 34 +++++++++++++++++++--------------- 1 file changed, 19 insertions(+), 15 deletions(-) diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c index c4465ccefea48..c78efa99406ff 100644 --- a/kernel/sched/ext.c +++ b/kernel/sched/ext.c @@ -982,6 +982,22 @@ static void refill_task_slice_dfl(struct scx_sched *sch, struct task_struct *p) __scx_add_event(sch, SCX_EV_REFILL_SLICE_DFL, 1); } +static void local_dsq_post_enq(struct scx_dispatch_q *dsq, struct task_struct *p, + u64 enq_flags) +{ + struct rq *rq = container_of(dsq, struct rq, scx.local_dsq); + bool preempt = false; + + if ((enq_flags & SCX_ENQ_PREEMPT) && p != rq->curr && + rq->curr->sched_class == &ext_sched_class) { + rq->curr->scx.slice = 0; + preempt = true; + } + + if (preempt || sched_class_above(&ext_sched_class, rq->curr->sched_class)) + resched_curr(rq); +} + static void dispatch_enqueue(struct scx_sched *sch, struct scx_dispatch_q *dsq, struct task_struct *p, u64 enq_flags) { @@ -1093,22 +1109,10 @@ static void dispatch_enqueue(struct scx_sched *sch, struct scx_dispatch_q *dsq, if (enq_flags & SCX_ENQ_CLEAR_OPSS) atomic_long_set_release(&p->scx.ops_state, SCX_OPSS_NONE); - if (is_local) { - struct rq *rq = container_of(dsq, struct rq, scx.local_dsq); - bool preempt = false; - - if ((enq_flags & SCX_ENQ_PREEMPT) && p != rq->curr && - rq->curr->sched_class == &ext_sched_class) { - rq->curr->scx.slice = 0; - preempt = true; - } - - if (preempt || sched_class_above(&ext_sched_class, - rq->curr->sched_class)) - resched_curr(rq); - } else { + if (is_local) + local_dsq_post_enq(dsq, p, enq_flags); + else raw_spin_unlock(&dsq->lock); - } } static void task_unlink_from_dsq(struct task_struct *p, From 06e6eb3d28ef72ad24946bb9f03159aed422a664 Mon Sep 17 00:00:00 2001 From: Tejun Heo Date: Thu, 11 Dec 2025 15:45:04 -1000 Subject: [PATCH 0006/1321] sched_ext: Fix missing post-enqueue handling in move_local_task_to_local_dsq() move_local_task_to_local_dsq() is used when moving a task from a non-local DSQ to a local DSQ on the same CPU. It directly manipulates the local DSQ without going through dispatch_enqueue() and was missing the post-enqueue handling that triggers preemption when SCX_ENQ_PREEMPT is set or the idle task is running. The function is used by move_task_between_dsqs() which backs scx_bpf_dsq_move() and may be called while the CPU is busy. Add local_dsq_post_enq() call to move_local_task_to_local_dsq(). As the dispatch path doesn't need post-enqueue handling, add SCX_RQ_IN_BALANCE early exit to keep consume_dispatch_q() behavior unchanged and avoid triggering unnecessary resched when scx_bpf_dsq_move() is used from the dispatch path. Fixes: 4c30f5ce4f7a ("sched_ext: Implement scx_bpf_dispatch[_vtime]_from_dsq()") Cc: stable@vger.kernel.org # v6.12+ Reviewed-by: Andrea Righi Reviewed-by: Emil Tsalapatis Signed-off-by: Tejun Heo --- kernel/sched/ext.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c index c78efa99406ff..695503a2f7d1e 100644 --- a/kernel/sched/ext.c +++ b/kernel/sched/ext.c @@ -988,6 +988,14 @@ static void local_dsq_post_enq(struct scx_dispatch_q *dsq, struct task_struct *p struct rq *rq = container_of(dsq, struct rq, scx.local_dsq); bool preempt = false; + /* + * If @rq is in balance, the CPU is already vacant and looking for the + * next task to run. No need to preempt or trigger resched after moving + * @p into its local DSQ. + */ + if (rq->scx.flags & SCX_RQ_IN_BALANCE) + return; + if ((enq_flags & SCX_ENQ_PREEMPT) && p != rq->curr && rq->curr->sched_class == &ext_sched_class) { rq->curr->scx.slice = 0; @@ -1636,6 +1644,8 @@ static void move_local_task_to_local_dsq(struct task_struct *p, u64 enq_flags, dsq_mod_nr(dst_dsq, 1); p->scx.dsq = dst_dsq; + + local_dsq_post_enq(dst_dsq, p, enq_flags); } /** From a1f842d113bb3d1ab128ac2d46210c48f62a63f8 Mon Sep 17 00:00:00 2001 From: Emil Tsalapatis Date: Fri, 5 Dec 2025 13:23:14 -0800 Subject: [PATCH 0007/1321] selftests/sched_ext: flush stdout before test to avoid log spam The sched_ext selftests runner runs each test in the same process, with each test possibly forking multiple times. When the main runner has not flushed its stdout, the children inherit the buffered output for previous tests and emit it during exit. This causes log spam. Make sure stdout/stderr is fully flushed before each test. Cc: Ihor Solodrai Signed-off-by: Emil Tsalapatis Signed-off-by: Tejun Heo --- tools/testing/selftests/sched_ext/runner.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/tools/testing/selftests/sched_ext/runner.c b/tools/testing/selftests/sched_ext/runner.c index aa2d7d32dda9b..5748d2c699037 100644 --- a/tools/testing/selftests/sched_ext/runner.c +++ b/tools/testing/selftests/sched_ext/runner.c @@ -46,6 +46,14 @@ static void print_test_preamble(const struct scx_test *test, bool quiet) if (!quiet) printf("DESCRIPTION: %s\n", test->description); printf("OUTPUT:\n"); + + /* + * The tests may fork with the preamble buffered + * in the children's stdout. Flush before the test + * to avoid printing the message multiple times. + */ + fflush(stdout); + fflush(stderr); } static const char *status_to_result(enum scx_test_status status) From 070e46a2bac7d6243144c3a137474de4d4d01473 Mon Sep 17 00:00:00 2001 From: Zqiang Date: Mon, 15 Dec 2025 19:29:40 +0800 Subject: [PATCH 0008/1321] sched_ext: Remove unused code in the do_pick_task_scx() The kick_idle variable is no longer used, this commit therefore remove it and also remove associated code in the do_pick_task_scx(). Signed-off-by: Zqiang Reviewed-by: Andrea Righi Reviewed-by: Emil Tsalapatis Signed-off-by: Tejun Heo --- kernel/sched/ext.c | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c index 695503a2f7d1e..94164f2dec6dc 100644 --- a/kernel/sched/ext.c +++ b/kernel/sched/ext.c @@ -2446,7 +2446,7 @@ static struct task_struct * do_pick_task_scx(struct rq *rq, struct rq_flags *rf, bool force_scx) { struct task_struct *prev = rq->curr; - bool keep_prev, kick_idle = false; + bool keep_prev; struct task_struct *p; /* see kick_cpus_irq_workfn() */ @@ -2488,12 +2488,8 @@ do_pick_task_scx(struct rq *rq, struct rq_flags *rf, bool force_scx) refill_task_slice_dfl(rcu_dereference_sched(scx_root), p); } else { p = first_local_task(rq); - if (!p) { - if (kick_idle) - scx_kick_cpu(rcu_dereference_sched(scx_root), - cpu_of(rq), SCX_KICK_IDLE); + if (!p) return NULL; - } if (unlikely(!p->scx.slice)) { struct scx_sched *sch = rcu_dereference_sched(scx_root); From ff2328d609635e94bca5e00f16210da7983718ad Mon Sep 17 00:00:00 2001 From: Dan Carpenter Date: Thu, 27 Nov 2025 10:14:24 +0300 Subject: [PATCH 0009/1321] btrfs: tests: fix double btrfs_path free in remove_extent_ref() We converted this code to use auto free cleanup.h magic but one remaining free was accidentally left behind which leads to a double free bug. Fixes: a320476ca8a3 ("btrfs: tests: do trivial BTRFS_PATH_AUTO_FREE conversions") Signed-off-by: Dan Carpenter Reviewed-by: David Sterba Signed-off-by: David Sterba --- fs/btrfs/tests/qgroup-tests.c | 1 - 1 file changed, 1 deletion(-) diff --git a/fs/btrfs/tests/qgroup-tests.c b/fs/btrfs/tests/qgroup-tests.c index 05cfda8af422d..e9124605974bf 100644 --- a/fs/btrfs/tests/qgroup-tests.c +++ b/fs/btrfs/tests/qgroup-tests.c @@ -187,7 +187,6 @@ static int remove_extent_ref(struct btrfs_root *root, u64 bytenr, ret = btrfs_search_slot(&trans, root, &key, path, -1, 1); if (ret) { test_err("couldn't find backref %d", ret); - btrfs_free_path(path); return ret; } btrfs_del_item(&trans, root, path); From c52ed1c06c5509a5a7d8c536acf75ec5c209f956 Mon Sep 17 00:00:00 2001 From: Filipe Manana Date: Thu, 27 Nov 2025 16:35:59 +0000 Subject: [PATCH 0010/1321] btrfs: don't log conflicting inode if it's a dir moved in the current transaction We can't log a conflicting inode if it's a directory and it was moved from one parent directory to another parent directory in the current transaction, as this can result an attempt to have a directory with two hard links during log replay, one for the old parent directory and another for the new parent directory. The following scenario triggers that issue: 1) We have directories "dir1" and "dir2" created in a past transaction. Directory "dir1" has inode A as its parent directory; 2) We move "dir1" to some other directory; 3) We create a file with the name "dir1" in directory inode A; 4) We fsync the new file. This results in logging the inode of the new file and the inode for the directory "dir1" that was previously moved in the current transaction. So the log tree has the INODE_REF item for the new location of "dir1"; 5) We move the new file to some other directory. This results in updating the log tree to included the new INODE_REF for the new location of the file and removes the INODE_REF for the old location. This happens during the rename when we call btrfs_log_new_name(); 6) We fsync the file, and that persists the log tree changes done in the previous step (btrfs_log_new_name() only updates the log tree in memory); 7) We have a power failure; 8) Next time the fs is mounted, log replay happens and when processing the inode for directory "dir1" we find a new INODE_REF and add that link, but we don't remove the old link of the inode since we have not logged the old parent directory of the directory inode "dir1". As a result after log replay finishes when we trigger writeback of the subvolume tree's extent buffers, the tree check will detect that we have a directory a hard link count of 2 and we get a mount failure. The errors and stack traces reported in dmesg/syslog are like this: [ 3845.729764] BTRFS info (device dm-0): start tree-log replay [ 3845.730304] page: refcount:3 mapcount:0 mapping:000000005c8a3027 index:0x1d00 pfn:0x11510c [ 3845.731236] memcg:ffff9264c02f4e00 [ 3845.731751] aops:btree_aops [btrfs] ino:1 [ 3845.732300] flags: 0x17fffc00000400a(uptodate|private|writeback|node=0|zone=2|lastcpupid=0x1ffff) [ 3845.733346] raw: 017fffc00000400a 0000000000000000 dead000000000122 ffff9264d978aea8 [ 3845.734265] raw: 0000000000001d00 ffff92650e6d4738 00000003ffffffff ffff9264c02f4e00 [ 3845.735305] page dumped because: eb page dump [ 3845.735981] BTRFS critical (device dm-0): corrupt leaf: root=5 block=30408704 slot=6 ino=257, invalid nlink: has 2 expect no more than 1 for dir [ 3845.737786] BTRFS info (device dm-0): leaf 30408704 gen 10 total ptrs 17 free space 14881 owner 5 [ 3845.737789] BTRFS info (device dm-0): refs 4 lock_owner 0 current 30701 [ 3845.737792] item 0 key (256 INODE_ITEM 0) itemoff 16123 itemsize 160 [ 3845.737794] inode generation 3 transid 9 size 16 nbytes 16384 [ 3845.737795] block group 0 mode 40755 links 1 uid 0 gid 0 [ 3845.737797] rdev 0 sequence 2 flags 0x0 [ 3845.737798] atime 1764259517.0 [ 3845.737800] ctime 1764259517.572889464 [ 3845.737801] mtime 1764259517.572889464 [ 3845.737802] otime 1764259517.0 [ 3845.737803] item 1 key (256 INODE_REF 256) itemoff 16111 itemsize 12 [ 3845.737805] index 0 name_len 2 [ 3845.737807] item 2 key (256 DIR_ITEM 2363071922) itemoff 16077 itemsize 34 [ 3845.737808] location key (257 1 0) type 2 [ 3845.737810] transid 9 data_len 0 name_len 4 [ 3845.737811] item 3 key (256 DIR_ITEM 2676584006) itemoff 16043 itemsize 34 [ 3845.737813] location key (258 1 0) type 2 [ 3845.737814] transid 9 data_len 0 name_len 4 [ 3845.737815] item 4 key (256 DIR_INDEX 2) itemoff 16009 itemsize 34 [ 3845.737816] location key (257 1 0) type 2 [ 3845.737818] transid 9 data_len 0 name_len 4 [ 3845.737819] item 5 key (256 DIR_INDEX 3) itemoff 15975 itemsize 34 [ 3845.737820] location key (258 1 0) type 2 [ 3845.737821] transid 9 data_len 0 name_len 4 [ 3845.737822] item 6 key (257 INODE_ITEM 0) itemoff 15815 itemsize 160 [ 3845.737824] inode generation 9 transid 10 size 6 nbytes 0 [ 3845.737825] block group 0 mode 40755 links 2 uid 0 gid 0 [ 3845.737826] rdev 0 sequence 1 flags 0x0 [ 3845.737827] atime 1764259517.572889464 [ 3845.737828] ctime 1764259517.572889464 [ 3845.737830] mtime 1764259517.572889464 [ 3845.737831] otime 1764259517.572889464 [ 3845.737832] item 7 key (257 INODE_REF 256) itemoff 15801 itemsize 14 [ 3845.737833] index 2 name_len 4 [ 3845.737834] item 8 key (257 INODE_REF 258) itemoff 15787 itemsize 14 [ 3845.737836] index 2 name_len 4 [ 3845.737837] item 9 key (257 DIR_ITEM 2507850652) itemoff 15754 itemsize 33 [ 3845.737838] location key (259 1 0) type 1 [ 3845.737839] transid 10 data_len 0 name_len 3 [ 3845.737840] item 10 key (257 DIR_INDEX 2) itemoff 15721 itemsize 33 [ 3845.737842] location key (259 1 0) type 1 [ 3845.737843] transid 10 data_len 0 name_len 3 [ 3845.737844] item 11 key (258 INODE_ITEM 0) itemoff 15561 itemsize 160 [ 3845.737846] inode generation 9 transid 10 size 8 nbytes 0 [ 3845.737847] block group 0 mode 40755 links 1 uid 0 gid 0 [ 3845.737848] rdev 0 sequence 1 flags 0x0 [ 3845.737849] atime 1764259517.572889464 [ 3845.737850] ctime 1764259517.572889464 [ 3845.737851] mtime 1764259517.572889464 [ 3845.737852] otime 1764259517.572889464 [ 3845.737853] item 12 key (258 INODE_REF 256) itemoff 15547 itemsize 14 [ 3845.737855] index 3 name_len 4 [ 3845.737856] item 13 key (258 DIR_ITEM 1843588421) itemoff 15513 itemsize 34 [ 3845.737857] location key (257 1 0) type 2 [ 3845.737858] transid 10 data_len 0 name_len 4 [ 3845.737860] item 14 key (258 DIR_INDEX 2) itemoff 15479 itemsize 34 [ 3845.737861] location key (257 1 0) type 2 [ 3845.737862] transid 10 data_len 0 name_len 4 [ 3845.737863] item 15 key (259 INODE_ITEM 0) itemoff 15319 itemsize 160 [ 3845.737865] inode generation 10 transid 10 size 0 nbytes 0 [ 3845.737866] block group 0 mode 100600 links 1 uid 0 gid 0 [ 3845.737867] rdev 0 sequence 2 flags 0x0 [ 3845.737868] atime 1764259517.580874966 [ 3845.737869] ctime 1764259517.586121869 [ 3845.737870] mtime 1764259517.580874966 [ 3845.737872] otime 1764259517.580874966 [ 3845.737873] item 16 key (259 INODE_REF 257) itemoff 15306 itemsize 13 [ 3845.737874] index 2 name_len 3 [ 3845.737875] BTRFS error (device dm-0): block=30408704 write time tree block corruption detected [ 3845.739448] ------------[ cut here ]------------ [ 3845.740092] WARNING: CPU: 5 PID: 30701 at fs/btrfs/disk-io.c:335 btree_csum_one_bio+0x25a/0x270 [btrfs] [ 3845.741439] Modules linked in: btrfs dm_flakey crc32c_cryptoapi (...) [ 3845.750626] CPU: 5 UID: 0 PID: 30701 Comm: mount Tainted: G W 6.18.0-rc6-btrfs-next-218+ #1 PREEMPT(full) [ 3845.752414] Tainted: [W]=WARN [ 3845.752828] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014 [ 3845.754499] RIP: 0010:btree_csum_one_bio+0x25a/0x270 [btrfs] [ 3845.755460] Code: 31 f6 48 89 (...) [ 3845.758685] RSP: 0018:ffffa8d9c5677678 EFLAGS: 00010246 [ 3845.759450] RAX: 0000000000000000 RBX: ffff92650e6d4738 RCX: 0000000000000000 [ 3845.760309] RDX: 0000000000000000 RSI: ffffffff9aab45b9 RDI: ffff9264c4748000 [ 3845.761239] RBP: ffff9264d4324000 R08: 0000000000000000 R09: ffffa8d9c5677468 [ 3845.762607] R10: ffff926bdc1fffa8 R11: 0000000000000003 R12: ffffa8d9c5677680 [ 3845.764099] R13: 0000000000004000 R14: ffff9264dd624000 R15: ffff9264d978aba8 [ 3845.765094] FS: 00007f751fa5a840(0000) GS:ffff926c42a82000(0000) knlGS:0000000000000000 [ 3845.766226] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 3845.766970] CR2: 0000558df1815380 CR3: 000000010ed88003 CR4: 0000000000370ef0 [ 3845.768009] Call Trace: [ 3845.768392] [ 3845.768714] btrfs_submit_bbio+0x6ee/0x7f0 [btrfs] [ 3845.769640] ? write_one_eb+0x28e/0x340 [btrfs] [ 3845.770588] btree_write_cache_pages+0x2f0/0x550 [btrfs] [ 3845.771286] ? alloc_extent_state+0x19/0x100 [btrfs] [ 3845.771967] ? merge_next_state+0x1a/0x90 [btrfs] [ 3845.772586] ? set_extent_bit+0x233/0x8b0 [btrfs] [ 3845.773198] ? xas_load+0x9/0xc0 [ 3845.773589] ? xas_find+0x14d/0x1a0 [ 3845.773969] do_writepages+0xc6/0x160 [ 3845.774367] filemap_fdatawrite_wbc+0x48/0x60 [ 3845.775003] __filemap_fdatawrite_range+0x5b/0x80 [ 3845.775902] btrfs_write_marked_extents+0x61/0x170 [btrfs] [ 3845.776707] btrfs_write_and_wait_transaction+0x4e/0xc0 [btrfs] [ 3845.777379] ? _raw_spin_unlock_irqrestore+0x23/0x40 [ 3845.777923] btrfs_commit_transaction+0x5ea/0xd20 [btrfs] [ 3845.778551] ? _raw_spin_unlock+0x15/0x30 [ 3845.778986] ? release_extent_buffer+0x34/0x160 [btrfs] [ 3845.779659] btrfs_recover_log_trees+0x7a3/0x7c0 [btrfs] [ 3845.780416] ? __pfx_replay_one_buffer+0x10/0x10 [btrfs] [ 3845.781499] open_ctree+0x10bb/0x15f0 [btrfs] [ 3845.782194] btrfs_get_tree.cold+0xb/0x16c [btrfs] [ 3845.782764] ? fscontext_read+0x15c/0x180 [ 3845.783202] ? rw_verify_area+0x50/0x180 [ 3845.783667] vfs_get_tree+0x25/0xd0 [ 3845.784047] vfs_cmd_create+0x59/0xe0 [ 3845.784458] __do_sys_fsconfig+0x4f6/0x6b0 [ 3845.784914] do_syscall_64+0x50/0x1220 [ 3845.785340] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 3845.785980] RIP: 0033:0x7f751fc7f4aa [ 3845.786759] Code: 73 01 c3 48 (...) [ 3845.789951] RSP: 002b:00007ffcdba45dc8 EFLAGS: 00000246 ORIG_RAX: 00000000000001af [ 3845.791402] RAX: ffffffffffffffda RBX: 000055ccc8291c20 RCX: 00007f751fc7f4aa [ 3845.792688] RDX: 0000000000000000 RSI: 0000000000000006 RDI: 0000000000000003 [ 3845.794308] RBP: 000055ccc8292120 R08: 0000000000000000 R09: 0000000000000000 [ 3845.795829] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 [ 3845.797183] R13: 00007f751fe11580 R14: 00007f751fe1326c R15: 00007f751fdf8a23 [ 3845.798633] [ 3845.799067] ---[ end trace 0000000000000000 ]--- [ 3845.800215] BTRFS: error (device dm-0) in btrfs_commit_transaction:2553: errno=-5 IO failure (Error while writing out transaction) [ 3845.801860] BTRFS warning (device dm-0 state E): Skipping commit of aborted transaction. [ 3845.802815] BTRFS error (device dm-0 state EA): Transaction aborted (error -5) [ 3845.803728] BTRFS: error (device dm-0 state EA) in cleanup_transaction:2036: errno=-5 IO failure [ 3845.805374] BTRFS: error (device dm-0 state EA) in btrfs_replay_log:2083: errno=-5 IO failure (Failed to recover log tree) [ 3845.807919] BTRFS error (device dm-0 state EA): open_ctree failed: -5 Fix this by never logging a conflicting inode that is a directory and was moved in the current transaction (its last_unlink_trans equals the current transaction) and instead fallback to a transaction commit. A test case for fstests will follow soon. Reported-by: Vyacheslav Kovalevsky Link: https://lore.kernel.org/linux-btrfs/7bbc9419-5c56-450a-b5a0-efeae7457113@gmail.com/ CC: stable@vger.kernel.org # 6.1+ Signed-off-by: Filipe Manana Signed-off-by: David Sterba --- fs/btrfs/tree-log.c | 38 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 38 insertions(+) diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c index fff37c8d96a45..64c1155160a22 100644 --- a/fs/btrfs/tree-log.c +++ b/fs/btrfs/tree-log.c @@ -6051,6 +6051,33 @@ static int conflicting_inode_is_dir(struct btrfs_root *root, u64 ino, return ret; } +static bool can_log_conflicting_inode(const struct btrfs_trans_handle *trans, + const struct btrfs_inode *inode) +{ + if (!S_ISDIR(inode->vfs_inode.i_mode)) + return true; + + if (inode->last_unlink_trans < trans->transid) + return true; + + /* + * If this is a directory and its unlink_trans is not from a past + * transaction then we must fallback to a transaction commit in order + * to avoid getting a directory with 2 hard links after log replay. + * + * This happens if a directory A is renamed, moved from one parent + * directory to another one, a new file is created in the old parent + * directory with the old name of our directory A, the new file is + * fsynced, then we moved the new file to some other parent directory + * and fsync again the new file. This results in a log tree where we + * logged that directory A existed, with the INODE_REF item for the + * new location but without having logged its old parent inode, so + * that on log replay we add a new link for the new location but the + * old link remains, resulting in a link count of 2. + */ + return false; +} + static int add_conflicting_inode(struct btrfs_trans_handle *trans, struct btrfs_root *root, struct btrfs_path *path, @@ -6154,6 +6181,11 @@ static int add_conflicting_inode(struct btrfs_trans_handle *trans, return 0; } + if (!can_log_conflicting_inode(trans, inode)) { + btrfs_add_delayed_iput(inode); + return BTRFS_LOG_FORCE_COMMIT; + } + btrfs_add_delayed_iput(inode); ino_elem = kmalloc(sizeof(*ino_elem), GFP_NOFS); @@ -6218,6 +6250,12 @@ static int log_conflicting_inodes(struct btrfs_trans_handle *trans, break; } + if (!can_log_conflicting_inode(trans, inode)) { + btrfs_add_delayed_iput(inode); + ret = BTRFS_LOG_FORCE_COMMIT; + break; + } + /* * Always log the directory, we cannot make this * conditional on need_log_inode() because the directory From fd9a1f063738a1e03153b6d7cfe8b8758b189cbd Mon Sep 17 00:00:00 2001 From: Filipe Manana Date: Wed, 3 Dec 2025 17:02:00 +0000 Subject: [PATCH 0011/1321] btrfs: do not skip logging new dentries when logging a new name When we are logging a directory and the log context indicates that we are logging a new name for some other file (that is or was inside that directory), we skip logging the inodes for new dentries in the directory. This is ok most of the time, but if after the rename or link operation that triggered the logging of that directory, we have an explicit fsync of that directory without the directory inode being evicted and reloaded, we end up never logging the inodes for the new dentries that we found during the new name logging, as the next directory fsync will only process dentries that were added after the last time we logged the directory (we are doing an incremental directory logging). So make sure we always log new dentries for a directory even if we are in a context of logging a new name. We started skipping logging inodes for new dentries as of commit c48792c6ee7a ("btrfs: do not log new dentries when logging that a new name exists") and it was fine back then, because when logging a directory we always iterated over all the directory entries (for leaves changed in the current transaction) so a subsequent fsync would always log anything that was previously skipped while logging a directory when logging a new name (with btrfs_log_new_name()). But later support for incrementally logging a directory was added in commit dc2872247ec0 ("btrfs: keep track of the last logged keys when logging a directory"), to avoid checking all dir items every time we log a directory, so the check to skip dentry logging added in the first commit should have been removed when the incremental support for logging a directory was added. A test case for fstests will follow soon. Reported-by: Vyacheslav Kovalevsky Link: https://lore.kernel.org/linux-btrfs/84c4e713-85d6-42b9-8dcf-0722ed26cb05@gmail.com/ Fixes: dc2872247ec0 ("btrfs: keep track of the last logged keys when logging a directory") Reviewed-by: Boris Burkov Signed-off-by: Filipe Manana Signed-off-by: David Sterba --- fs/btrfs/tree-log.c | 8 -------- 1 file changed, 8 deletions(-) diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c index 64c1155160a22..31edc93a383e2 100644 --- a/fs/btrfs/tree-log.c +++ b/fs/btrfs/tree-log.c @@ -5865,14 +5865,6 @@ static int log_new_dir_dentries(struct btrfs_trans_handle *trans, struct btrfs_inode *curr_inode = start_inode; int ret = 0; - /* - * If we are logging a new name, as part of a link or rename operation, - * don't bother logging new dentries, as we just want to log the names - * of an inode and that any new parents exist. - */ - if (ctx->logging_new_name) - return 0; - path = btrfs_alloc_path(); if (!path) return -ENOMEM; From 8008fb86f7047790dd759a6f796e30d218fd29f0 Mon Sep 17 00:00:00 2001 From: Qu Wenruo Date: Mon, 8 Dec 2025 19:55:48 +1030 Subject: [PATCH 0012/1321] Revert "btrfs: add ASSERTs on prealloc in qgroup functions" This reverts commit 252877a8701530fde861a4f27710c1e718e97caa. Commit 252877a87015 ("btrfs: add ASSERTs on prealloc in qgroup functions") tries to remove the kfree() on preallocated qgroup during several call sites, but this cannot work as intended: - btrfs_quota_enable() - btrfs_create_qgroup() If add_qgroup_item() failed, we go out_free_path() and at that time prealloc is not yet utilized and will trigger the new ASSERT(). - btrfs_qgroup_inherit() If qgroup_auto_inherit() failed, prealloc is not yet utilized and will trigger the new ASSERT() Reported-by: syzbot+b44d4a4885bc82af2a06@syzkaller.appspotmail.com Link: https://lore.kernel.org/linux-btrfs/69369331.a70a0220.38f243.009e.GAE@google.com/ Signed-off-by: Qu Wenruo Reviewed-by: David Sterba Signed-off-by: David Sterba --- fs/btrfs/qgroup.c | 27 ++++----------------------- 1 file changed, 4 insertions(+), 23 deletions(-) diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c index 9e2b53e90dcbe..d9d8d9968a582 100644 --- a/fs/btrfs/qgroup.c +++ b/fs/btrfs/qgroup.c @@ -1243,14 +1243,7 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info, btrfs_end_transaction(trans); else if (trans) ret = btrfs_end_transaction(trans); - - /* - * At this point we either failed at allocating prealloc, or we - * succeeded and passed the ownership to it to add_qgroup_rb(). In any - * case, this needs to be NULL or there is something wrong. - */ - ASSERT(prealloc == NULL); - + kfree(prealloc); return ret; } @@ -1682,12 +1675,7 @@ int btrfs_create_qgroup(struct btrfs_trans_handle *trans, u64 qgroupid) ret = btrfs_sysfs_add_one_qgroup(fs_info, qgroup); out: mutex_unlock(&fs_info->qgroup_ioctl_lock); - /* - * At this point we either failed at allocating prealloc, or we - * succeeded and passed the ownership to it to add_qgroup_rb(). In any - * case, this needs to be NULL or there is something wrong. - */ - ASSERT(prealloc == NULL); + kfree(prealloc); return ret; } @@ -3279,7 +3267,7 @@ int btrfs_qgroup_inherit(struct btrfs_trans_handle *trans, u64 srcid, struct btrfs_root *quota_root; struct btrfs_qgroup *srcgroup; struct btrfs_qgroup *dstgroup; - struct btrfs_qgroup *prealloc = NULL; + struct btrfs_qgroup *prealloc; struct btrfs_qgroup_list **qlist_prealloc = NULL; bool free_inherit = false; bool need_rescan = false; @@ -3520,14 +3508,7 @@ int btrfs_qgroup_inherit(struct btrfs_trans_handle *trans, u64 srcid, } if (free_inherit) kfree(inherit); - - /* - * At this point we either failed at allocating prealloc, or we - * succeeded and passed the ownership to it to add_qgroup_rb(). In any - * case, this needs to be NULL or there is something wrong. - */ - ASSERT(prealloc == NULL); - + kfree(prealloc); return ret; } From d2c536641b6f564aab04c0bf0c6495988d9bdeb4 Mon Sep 17 00:00:00 2001 From: Qu Wenruo Date: Tue, 25 Nov 2025 18:49:56 +1030 Subject: [PATCH 0013/1321] btrfs: fix a potential path leak in print_data_reloc_error() Inside print_data_reloc_error(), if extent_from_logical() failed we return immediately. However there are the following cases where extent_from_logical() can return error but still holds a path: - btrfs_search_slot() returned 0 - No backref item found in extent tree - No flags_ret provided This is not possible in this call site though. So for the above two cases, we can return without releasing the path, causing extent buffer leaks. Fixes: b9a9a85059cd ("btrfs: output affected files when relocation fails") Signed-off-by: Qu Wenruo Reviewed-by: David Sterba Signed-off-by: David Sterba --- fs/btrfs/inode.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index c4bee47829eda..317db7d10a21d 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -256,6 +256,7 @@ static void print_data_reloc_error(const struct btrfs_inode *inode, u64 file_off if (ret < 0) { btrfs_err_rl(fs_info, "failed to lookup extent item for logical %llu: %d", logical, ret); + btrfs_release_path(&path); return; } eb = path.nodes[0]; From e78d7753b4e6fe11cd4ec280f5f854f9e81aee23 Mon Sep 17 00:00:00 2001 From: Deepanshu Kartikey Date: Wed, 10 Dec 2025 18:58:07 +0530 Subject: [PATCH 0014/1321] btrfs: fix memory leak of fs_devices in degraded seed device path In open_seed_devices(), when find_fsid() fails and we're in DEGRADED mode, a new fs_devices is allocated via alloc_fs_devices() but is never added to the seed_list before returning. This contrasts with the normal path where fs_devices is properly added via list_add(). If any error occurs later in read_one_dev() or btrfs_read_chunk_tree(), the cleanup code iterates seed_list to free seed devices, but this orphaned fs_devices is never found and never freed, causing a memory leak. Any devices allocated via add_missing_dev() and attached to this fs_devices are also leaked. Fix this by adding the newly allocated fs_devices to seed_list in the degraded path, consistent with the normal path. Fixes: 5f37583569442 ("Btrfs: move the missing device to its own fs device list") Reported-by: syzbot+eadd98df8bceb15d7fed@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=eadd98df8bceb15d7fed Tested-by: syzbot+eadd98df8bceb15d7fed@syzkaller.appspotmail.com Reviewed-by: Qu Wenruo Signed-off-by: Deepanshu Kartikey Reviewed-by: David Sterba Signed-off-by: David Sterba --- fs/btrfs/volumes.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index ae1742a35e76f..13c514684cfba 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -7128,6 +7128,7 @@ static struct btrfs_fs_devices *open_seed_devices(struct btrfs_fs_info *fs_info, fs_devices->seeding = true; fs_devices->opened = 1; + list_add(&fs_devices->seed_list, &fs_info->fs_devices->seed_list); return fs_devices; } From 50295599991646660f67d58973dfeade459f19e9 Mon Sep 17 00:00:00 2001 From: Filipe Manana Date: Thu, 11 Dec 2025 11:51:19 +0000 Subject: [PATCH 0015/1321] btrfs: fix changeset leak on mmap write after failure to reserve metadata If the call to btrfs_delalloc_reserve_metadata() fails we jump to the 'out_noreserve' label and there we never free the extent_changeset allocated by the previous call to btrfs_check_data_free_space() (if qgroups are enabled). Fix this by calling extent_changeset_free() under the 'out_noreserve' label. Fixes: 6599716de2d6 ("btrfs: fix -ENOSPC mmap write failure on NOCOW files/extents") Reported-by: syzbot+2f8aa76e6acc9fce6638@syzkaller.appspotmail.com Link: https://lore.kernel.org/linux-btrfs/693a635a.a70a0220.33cd7b.0029.GAE@google.com/ Reviewed-by: Qu Wenruo Signed-off-by: Filipe Manana Reviewed-by: David Sterba Signed-off-by: David Sterba --- fs/btrfs/file.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 7a501e73d8802..1abc7ed2990e0 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -2019,13 +2019,14 @@ static vm_fault_t btrfs_page_mkwrite(struct vm_fault *vmf) else btrfs_delalloc_release_space(inode, data_reserved, page_start, reserved_space, true); - extent_changeset_free(data_reserved); out_noreserve: if (only_release_metadata) btrfs_check_nocow_unlock(inode); sb_end_pagefault(inode->vfs_inode.i_sb); + extent_changeset_free(data_reserved); + if (ret < 0) return vmf_error(ret); From bc68987790ce7629a75932b63d8c9983a973a461 Mon Sep 17 00:00:00 2001 From: Stefan Metzmacher Date: Tue, 2 Dec 2025 22:15:24 +0100 Subject: [PATCH 0016/1321] smb: smbdirect: introduce smbdirect_socket.connect.{lock,work} This will first be used by the server in order to defer the processing of the initial recv of the negotiation request. But in future it will also be used by the client in order to implement an async connect. Cc: Tom Talpey Cc: Long Li Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher Acked-by: Namjae Jeon Signed-off-by: Steve French --- fs/smb/common/smbdirect/smbdirect_socket.h | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/fs/smb/common/smbdirect/smbdirect_socket.h b/fs/smb/common/smbdirect/smbdirect_socket.h index 384b19177e1c3..ee4c2726771a3 100644 --- a/fs/smb/common/smbdirect/smbdirect_socket.h +++ b/fs/smb/common/smbdirect/smbdirect_socket.h @@ -132,6 +132,14 @@ struct smbdirect_socket { struct smbdirect_socket_parameters parameters; + /* + * The state for connect/negotiation + */ + struct { + spinlock_t lock; + struct work_struct work; + } connect; + /* * The state for keepalive and timeout handling */ @@ -353,6 +361,10 @@ static __always_inline void smbdirect_socket_init(struct smbdirect_socket *sc) INIT_WORK(&sc->disconnect_work, __smbdirect_socket_disabled_work); disable_work_sync(&sc->disconnect_work); + spin_lock_init(&sc->connect.lock); + INIT_WORK(&sc->connect.work, __smbdirect_socket_disabled_work); + disable_work_sync(&sc->connect.work); + INIT_WORK(&sc->idle.immediate_work, __smbdirect_socket_disabled_work); disable_work_sync(&sc->idle.immediate_work); INIT_DELAYED_WORK(&sc->idle.timer_work, __smbdirect_socket_disabled_work); From 6f1a0038d8dd8794b8581bcb8eb14148a0601519 Mon Sep 17 00:00:00 2001 From: Stefan Metzmacher Date: Tue, 2 Dec 2025 22:15:25 +0100 Subject: [PATCH 0017/1321] smb: server: initialize recv_io->cqe.done = recv_done just once smbdirect_recv_io structures are pre-allocated so we can set the callback function just once. This will make it easy to move smb_direct_post_recv to common code soon. Cc: Tom Talpey Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher Acked-by: Namjae Jeon Signed-off-by: Steve French --- fs/smb/server/transport_rdma.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/fs/smb/server/transport_rdma.c b/fs/smb/server/transport_rdma.c index 4e7ab8d9314f6..222d1b5365e83 100644 --- a/fs/smb/server/transport_rdma.c +++ b/fs/smb/server/transport_rdma.c @@ -758,7 +758,6 @@ static int smb_direct_post_recv(struct smbdirect_socket *sc, return ret; recvmsg->sge.length = sp->max_recv_size; recvmsg->sge.lkey = sc->ib.pd->local_dma_lkey; - recvmsg->cqe.done = recv_done; wr.wr_cqe = &recvmsg->cqe; wr.next = NULL; @@ -2339,6 +2338,7 @@ static int smb_direct_prepare(struct ksmbd_transport *t) static int smb_direct_connect(struct smbdirect_socket *sc) { + struct smbdirect_recv_io *recv_io; int ret; ret = smb_direct_init_params(sc); @@ -2353,6 +2353,9 @@ static int smb_direct_connect(struct smbdirect_socket *sc) return ret; } + list_for_each_entry(recv_io, &sc->recv_io.free.list, list) + recv_io->cqe.done = recv_done; + ret = smb_direct_create_qpair(sc); if (ret) { pr_err("Can't accept RDMA client: %d\n", ret); From 3f19465ff827c2cec6e87edb763557be61e68dc3 Mon Sep 17 00:00:00 2001 From: Stefan Metzmacher Date: Tue, 2 Dec 2025 22:15:26 +0100 Subject: [PATCH 0018/1321] smb: server: defer the initial recv completion logic to smb_direct_negotiate_recv_work() The previous change to relax WARN_ON_ONCE(SMBDIRECT_SOCKET_*) checks in recv_done() and smb_direct_cm_handler() seems to work around the problem that the order of initial recv completion and RDMA_CM_EVENT_ESTABLISHED is random, but it's still a bit ugly. This implements a better solution deferring the recv completion processing to smb_direct_negotiate_recv_work(), which is queued only if both events arrived. In order to avoid more basic changes to the main recv_done callback, I introduced a smb_direct_negotiate_recv_done, which is only used for the first pdu, this will allow further cleanup and simplifications in recv_done as a future patch. smb_direct_negotiate_recv_work() is also very basic with only basic error checking and the transition from SMBDIRECT_SOCKET_NEGOTIATE_NEEDED to SMBDIRECT_SOCKET_NEGOTIATE_RUNNING, which allows smb_direct_prepare() to continue as before. Cc: Tom Talpey Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher Acked-by: Namjae Jeon Signed-off-by: Steve French --- fs/smb/server/transport_rdma.c | 170 +++++++++++++++++++++++++++------ 1 file changed, 142 insertions(+), 28 deletions(-) diff --git a/fs/smb/server/transport_rdma.c b/fs/smb/server/transport_rdma.c index 222d1b5365e83..f585359684d45 100644 --- a/fs/smb/server/transport_rdma.c +++ b/fs/smb/server/transport_rdma.c @@ -242,6 +242,7 @@ static void smb_direct_disconnect_rdma_work(struct work_struct *work) * disable[_delayed]_work_sync() */ disable_work(&sc->disconnect_work); + disable_work(&sc->connect.work); disable_work(&sc->recv_io.posted.refill_work); disable_delayed_work(&sc->idle.timer_work); disable_work(&sc->idle.immediate_work); @@ -297,6 +298,7 @@ smb_direct_disconnect_rdma_connection(struct smbdirect_socket *sc) * not queued again but here we don't block and avoid * disable[_delayed]_work_sync() */ + disable_work(&sc->connect.work); disable_work(&sc->recv_io.posted.refill_work); disable_work(&sc->idle.immediate_work); disable_delayed_work(&sc->idle.timer_work); @@ -467,6 +469,7 @@ static void free_transport(struct smb_direct_transport *t) */ smb_direct_disconnect_wake_up_all(sc); + disable_work_sync(&sc->connect.work); disable_work_sync(&sc->recv_io.posted.refill_work); disable_delayed_work_sync(&sc->idle.timer_work); disable_work_sync(&sc->idle.immediate_work); @@ -635,28 +638,8 @@ static void recv_done(struct ib_cq *cq, struct ib_wc *wc) switch (sc->recv_io.expected) { case SMBDIRECT_EXPECT_NEGOTIATE_REQ: - if (wc->byte_len < sizeof(struct smbdirect_negotiate_req)) { - put_recvmsg(sc, recvmsg); - smb_direct_disconnect_rdma_connection(sc); - return; - } - sc->recv_io.reassembly.full_packet_received = true; - /* - * Some drivers (at least mlx5_ib) might post a - * recv completion before RDMA_CM_EVENT_ESTABLISHED, - * we need to adjust our expectation in that case. - */ - if (!sc->first_error && sc->status == SMBDIRECT_SOCKET_RDMA_CONNECT_RUNNING) - sc->status = SMBDIRECT_SOCKET_NEGOTIATE_NEEDED; - if (SMBDIRECT_CHECK_STATUS_WARN(sc, SMBDIRECT_SOCKET_NEGOTIATE_NEEDED)) { - put_recvmsg(sc, recvmsg); - smb_direct_disconnect_rdma_connection(sc); - return; - } - sc->status = SMBDIRECT_SOCKET_NEGOTIATE_RUNNING; - enqueue_reassembly(sc, recvmsg, 0); - wake_up(&sc->status_wait); - return; + /* see smb_direct_negotiate_recv_done */ + break; case SMBDIRECT_EXPECT_DATA_TRANSFER: { struct smbdirect_data_transfer *data_transfer = (struct smbdirect_data_transfer *)recvmsg->packet; @@ -742,6 +725,126 @@ static void recv_done(struct ib_cq *cq, struct ib_wc *wc) smb_direct_disconnect_rdma_connection(sc); } +static void smb_direct_negotiate_recv_work(struct work_struct *work); + +static void smb_direct_negotiate_recv_done(struct ib_cq *cq, struct ib_wc *wc) +{ + struct smbdirect_recv_io *recv_io = + container_of(wc->wr_cqe, struct smbdirect_recv_io, cqe); + struct smbdirect_socket *sc = recv_io->socket; + unsigned long flags; + + /* + * reset the common recv_done for later reuse. + */ + recv_io->cqe.done = recv_done; + + if (wc->status != IB_WC_SUCCESS || wc->opcode != IB_WC_RECV) { + put_recvmsg(sc, recv_io); + if (wc->status != IB_WC_WR_FLUSH_ERR) { + pr_err("Negotiate Recv error. status='%s (%d)' opcode=%d\n", + ib_wc_status_msg(wc->status), wc->status, + wc->opcode); + smb_direct_disconnect_rdma_connection(sc); + } + return; + } + + ksmbd_debug(RDMA, "Negotiate Recv completed. status='%s (%d)', opcode=%d\n", + ib_wc_status_msg(wc->status), wc->status, + wc->opcode); + + ib_dma_sync_single_for_cpu(sc->ib.dev, + recv_io->sge.addr, + recv_io->sge.length, + DMA_FROM_DEVICE); + + /* + * This is an internal error! + */ + if (WARN_ON_ONCE(sc->recv_io.expected != SMBDIRECT_EXPECT_NEGOTIATE_REQ)) { + put_recvmsg(sc, recv_io); + smb_direct_disconnect_rdma_connection(sc); + return; + } + + /* + * Don't reset timer to the keepalive interval in + * this will be done in smb_direct_negotiate_recv_work. + */ + + /* + * Only remember the recv_io if it has enough bytes, + * this gives smb_direct_negotiate_recv_work enough + * information in order to disconnect if it was not + * valid. + */ + sc->recv_io.reassembly.full_packet_received = true; + if (wc->byte_len >= sizeof(struct smbdirect_negotiate_req)) + enqueue_reassembly(sc, recv_io, 0); + else + put_recvmsg(sc, recv_io); + + /* + * Some drivers (at least mlx5_ib and irdma in roce mode) + * might post a recv completion before RDMA_CM_EVENT_ESTABLISHED, + * we need to adjust our expectation in that case. + * + * So we defer further processing of the negotiation + * to smb_direct_negotiate_recv_work(). + * + * If we are already in SMBDIRECT_SOCKET_NEGOTIATE_NEEDED + * we queue the work directly otherwise + * smb_direct_cm_handler() will do it, when + * RDMA_CM_EVENT_ESTABLISHED arrived. + */ + spin_lock_irqsave(&sc->connect.lock, flags); + if (!sc->first_error) { + INIT_WORK(&sc->connect.work, smb_direct_negotiate_recv_work); + if (sc->status == SMBDIRECT_SOCKET_NEGOTIATE_NEEDED) + queue_work(sc->workqueue, &sc->connect.work); + } + spin_unlock_irqrestore(&sc->connect.lock, flags); +} + +static void smb_direct_negotiate_recv_work(struct work_struct *work) +{ + struct smbdirect_socket *sc = + container_of(work, struct smbdirect_socket, connect.work); + const struct smbdirect_socket_parameters *sp = &sc->parameters; + struct smbdirect_recv_io *recv_io; + + if (sc->first_error) + return; + + ksmbd_debug(RDMA, "Negotiate Recv Work running\n"); + + /* + * Reset timer to the keepalive interval in + * order to trigger our next keepalive message. + */ + sc->idle.keepalive = SMBDIRECT_KEEPALIVE_NONE; + mod_delayed_work(sc->workqueue, &sc->idle.timer_work, + msecs_to_jiffies(sp->keepalive_interval_msec)); + + /* + * If smb_direct_negotiate_recv_done() detected an + * invalid request we want to disconnect. + */ + recv_io = get_first_reassembly(sc); + if (!recv_io) { + smb_direct_disconnect_rdma_connection(sc); + return; + } + + if (SMBDIRECT_CHECK_STATUS_WARN(sc, SMBDIRECT_SOCKET_NEGOTIATE_NEEDED)) { + smb_direct_disconnect_rdma_connection(sc); + return; + } + sc->status = SMBDIRECT_SOCKET_NEGOTIATE_RUNNING; + wake_up(&sc->status_wait); +} + static int smb_direct_post_recv(struct smbdirect_socket *sc, struct smbdirect_recv_io *recvmsg) { @@ -1731,6 +1834,7 @@ static int smb_direct_cm_handler(struct rdma_cm_id *cm_id, struct rdma_cm_event *event) { struct smbdirect_socket *sc = cm_id->context; + unsigned long flags; ksmbd_debug(RDMA, "RDMA CM event. cm_id=%p event=%s (%d)\n", cm_id, rdma_event_msg(event->event), event->event); @@ -1738,18 +1842,27 @@ static int smb_direct_cm_handler(struct rdma_cm_id *cm_id, switch (event->event) { case RDMA_CM_EVENT_ESTABLISHED: { /* - * Some drivers (at least mlx5_ib) might post a - * recv completion before RDMA_CM_EVENT_ESTABLISHED, + * Some drivers (at least mlx5_ib and irdma in roce mode) + * might post a recv completion before RDMA_CM_EVENT_ESTABLISHED, * we need to adjust our expectation in that case. * - * As we already started the negotiation, we just - * ignore RDMA_CM_EVENT_ESTABLISHED here. + * If smb_direct_negotiate_recv_done was called first + * it initialized sc->connect.work only for us to + * start, so that we turned into + * SMBDIRECT_SOCKET_NEGOTIATE_NEEDED, before + * smb_direct_negotiate_recv_work() runs. + * + * If smb_direct_negotiate_recv_done didn't happen + * yet. sc->connect.work is still be disabled and + * queue_work() is a no-op. */ - if (!sc->first_error && sc->status > SMBDIRECT_SOCKET_RDMA_CONNECT_RUNNING) - break; if (SMBDIRECT_CHECK_STATUS_DISCONNECT(sc, SMBDIRECT_SOCKET_RDMA_CONNECT_RUNNING)) break; sc->status = SMBDIRECT_SOCKET_NEGOTIATE_NEEDED; + spin_lock_irqsave(&sc->connect.lock, flags); + if (!sc->first_error) + queue_work(sc->workqueue, &sc->connect.work); + spin_unlock_irqrestore(&sc->connect.lock, flags); wake_up(&sc->status_wait); break; } @@ -1920,6 +2033,7 @@ static int smb_direct_prepare_negotiation(struct smbdirect_socket *sc) recvmsg = get_free_recvmsg(sc); if (!recvmsg) return -ENOMEM; + recvmsg->cqe.done = smb_direct_negotiate_recv_done; ret = smb_direct_post_recv(sc, recvmsg); if (ret) { From fb860a49f59a040258159252a6582292fe01795a Mon Sep 17 00:00:00 2001 From: Chen Ni Date: Tue, 18 Nov 2025 09:32:29 +0800 Subject: [PATCH 0019/1321] ksmbd: convert comma to semicolon Replace comma between expressions with semicolons. Using a ',' in place of a ';' can have unintended side effects. Although that is not the case here, it is seems best to use ';' unless ',' is intended. Found by inspection. No functional change intended. Compile tested only. Signed-off-by: Chen Ni Acked-by: Namjae Jeon Signed-off-by: Steve French --- fs/smb/server/vfs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/smb/server/vfs.c b/fs/smb/server/vfs.c index 98b0eb966d917..f891344bd76b5 100644 --- a/fs/smb/server/vfs.c +++ b/fs/smb/server/vfs.c @@ -702,7 +702,7 @@ int ksmbd_vfs_rename(struct ksmbd_work *work, const struct path *old_path, rd.old_parent = NULL; rd.new_parent = new_path.dentry; rd.flags = flags; - rd.delegated_inode = NULL, + rd.delegated_inode = NULL; err = start_renaming_dentry(&rd, lookup_flags, old_child, &new_last); if (err) goto out_drop_write; From 936bba897b39e66027abafc2ddae1d3f851800ec Mon Sep 17 00:00:00 2001 From: Alexey Velichayshiy Date: Wed, 10 Dec 2025 16:51:33 +0300 Subject: [PATCH 0020/1321] ksmbd: remove redundant DACL check in smb_check_perm_dacl A zero value of pdacl->num_aces is already handled at the start of smb_check_perm_dacl() so the second check is useless. Drop the unreachable code block, no functional impact intended. Found by Linux Verification Center (linuxtesting.org) with SVACE. Signed-off-by: Alexey Velichayshiy Acked-by: Namjae Jeon Signed-off-by: Steve French --- fs/smb/server/smbacl.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/fs/smb/server/smbacl.c b/fs/smb/server/smbacl.c index 5aa7a66334d93..05598d994a686 100644 --- a/fs/smb/server/smbacl.c +++ b/fs/smb/server/smbacl.c @@ -1307,9 +1307,6 @@ int smb_check_perm_dacl(struct ksmbd_conn *conn, const struct path *path, granted |= le32_to_cpu(ace->access_req); ace = (struct smb_ace *)((char *)ace + le16_to_cpu(ace->size)); } - - if (!pdacl->num_aces) - granted = GENERIC_ALL_FLAGS; } if (!uid) From 2e44e6ca8658211fb495ba65e9ee135e6304d49d Mon Sep 17 00:00:00 2001 From: Namjae Jeon Date: Sun, 14 Dec 2025 15:05:56 +0900 Subject: [PATCH 0021/1321] ksmbd: Fix refcount leak when invalid session is found on session lookup When a session is found but its state is not SMB2_SESSION_VALID, It indicates that no valid session was found, but it is missing to decrement the reference count acquired by the session lookup, which results in a reference count leak. This patch fixes the issue by explicitly calling ksmbd_user_session_put to release the reference to the session. Cc: stable@vger.kernel.org Reported-by: Alexandre Reported-by: Stanislas Polu Signed-off-by: Namjae Jeon Signed-off-by: Steve French --- fs/smb/server/mgmt/user_session.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/fs/smb/server/mgmt/user_session.c b/fs/smb/server/mgmt/user_session.c index 1c181ef999295..7d880ff34402e 100644 --- a/fs/smb/server/mgmt/user_session.c +++ b/fs/smb/server/mgmt/user_session.c @@ -325,8 +325,10 @@ struct ksmbd_session *ksmbd_session_lookup_all(struct ksmbd_conn *conn, sess = ksmbd_session_lookup(conn, id); if (!sess && conn->binding) sess = ksmbd_session_lookup_slowpath(id); - if (sess && sess->state != SMB2_SESSION_VALID) + if (sess && sess->state != SMB2_SESSION_VALID) { + ksmbd_user_session_put(sess); sess = NULL; + } return sess; } From 19a07800d560b07cde5faf1b25888185b33d8035 Mon Sep 17 00:00:00 2001 From: Namjae Jeon Date: Sun, 14 Dec 2025 15:06:34 +0900 Subject: [PATCH 0022/1321] ksmbd: fix buffer validation by including null terminator size in EA length The smb2_set_ea function, which handles Extended Attributes (EA), was performing buffer validation checks that incorrectly omitted the size of the null terminating character (+1 byte) for EA Name. This patch fixes the issue by explicitly adding '+ 1' to EaNameLength where the null terminator is expected to be present in the buffer, ensuring the validation accurately reflects the total required buffer size. Cc: stable@vger.kernel.org Reported-by: Roger Reported-by: Stanislas Polu Signed-off-by: Namjae Jeon Signed-off-by: Steve French --- fs/smb/server/smb2pdu.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/smb/server/smb2pdu.c b/fs/smb/server/smb2pdu.c index 27f87a13f20a7..8aa483800014d 100644 --- a/fs/smb/server/smb2pdu.c +++ b/fs/smb/server/smb2pdu.c @@ -2363,7 +2363,7 @@ static int smb2_set_ea(struct smb2_ea_info *eabuf, unsigned int buf_len, int rc = 0; unsigned int next = 0; - if (buf_len < sizeof(struct smb2_ea_info) + eabuf->EaNameLength + + if (buf_len < sizeof(struct smb2_ea_info) + eabuf->EaNameLength + 1 + le16_to_cpu(eabuf->EaValueLength)) return -EINVAL; @@ -2440,7 +2440,7 @@ static int smb2_set_ea(struct smb2_ea_info *eabuf, unsigned int buf_len, break; } - if (buf_len < sizeof(struct smb2_ea_info) + eabuf->EaNameLength + + if (buf_len < sizeof(struct smb2_ea_info) + eabuf->EaNameLength + 1 + le16_to_cpu(eabuf->EaValueLength)) { rc = -EINVAL; break; From b12bc92a42b00e6325d8d61ff641312e9e6a3e6f Mon Sep 17 00:00:00 2001 From: Al Viro Date: Sat, 13 Dec 2025 12:36:15 -0500 Subject: [PATCH 0023/1321] shmem_whiteout(): fix regression from tree-in-dcache series Now that shmem_mknod() hashes the new dentry, d_rehash() in shmem_whiteout() should be removed. X-paperbag: brown Reported-by: Hugh Dickins Acked-by: Hugh Dickins Tested-by: Hugh Dickins Fixes: 2313598222f9 ("convert ramfs and tmpfs") Signed-off-by: Al Viro --- mm/shmem.c | 14 +------------- 1 file changed, 1 insertion(+), 13 deletions(-) diff --git a/mm/shmem.c b/mm/shmem.c index b329b5302c488..ecc7de7878212 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -4019,22 +4019,10 @@ static int shmem_whiteout(struct mnt_idmap *idmap, whiteout = d_alloc(old_dentry->d_parent, &old_dentry->d_name); if (!whiteout) return -ENOMEM; - error = shmem_mknod(idmap, old_dir, whiteout, S_IFCHR | WHITEOUT_MODE, WHITEOUT_DEV); dput(whiteout); - if (error) - return error; - - /* - * Cheat and hash the whiteout while the old dentry is still in - * place, instead of playing games with FS_RENAME_DOES_D_MOVE. - * - * d_lookup() will consistently find one of them at this point, - * not sure which one, but that isn't even important. - */ - d_rehash(whiteout); - return 0; + return error; } /* From 8518506365f8fcc1390947bed006d7f991b854aa Mon Sep 17 00:00:00 2001 From: Al Viro Date: Sat, 13 Dec 2025 17:50:23 -0500 Subject: [PATCH 0024/1321] shmem: fix recovery on rename failures maple_tree insertions can fail if we are seriously short on memory; simple_offset_rename() does not recover well if it runs into that. The same goes for simple_offset_rename_exchange(). Moreover, shmem_whiteout() expects that if it succeeds, the caller will progress to d_move(), i.e. that shmem_rename2() won't fail past the successful call of shmem_whiteout(). Not hard to fix, fortunately - mtree_store() can't fail if the index we are trying to store into is already present in the tree as a singleton. For simple_offset_rename_exchange() that's enough - we just need to be careful about the order of operations. For simple_offset_rename() solution is to preinsert the target into the tree for new_dir; the rest can be done without any potentially failing operations. That preinsertion has to be done in shmem_rename2() rather than in simple_offset_rename() itself - otherwise we'd need to deal with the possibility of failure after successful shmem_whiteout(). Fixes: a2e459555c5f ("shmem: stable directory offsets") Reviewed-by: Christian Brauner Reviewed-by: Chuck Lever Signed-off-by: Al Viro --- fs/libfs.c | 50 +++++++++++++++++++--------------------------- include/linux/fs.h | 2 +- mm/shmem.c | 18 ++++++++++++----- 3 files changed, 35 insertions(+), 35 deletions(-) diff --git a/fs/libfs.c b/fs/libfs.c index 9264523be85cf..591eb649ebbac 100644 --- a/fs/libfs.c +++ b/fs/libfs.c @@ -346,22 +346,22 @@ void simple_offset_remove(struct offset_ctx *octx, struct dentry *dentry) * User space expects the directory offset value of the replaced * (new) directory entry to be unchanged after a rename. * - * Returns zero on success, a negative errno value on failure. + * Caller must have grabbed a slot for new_dentry in the maple_tree + * associated with new_dir, even if dentry is negative. */ -int simple_offset_rename(struct inode *old_dir, struct dentry *old_dentry, - struct inode *new_dir, struct dentry *new_dentry) +void simple_offset_rename(struct inode *old_dir, struct dentry *old_dentry, + struct inode *new_dir, struct dentry *new_dentry) { struct offset_ctx *old_ctx = old_dir->i_op->get_offset_ctx(old_dir); struct offset_ctx *new_ctx = new_dir->i_op->get_offset_ctx(new_dir); long new_offset = dentry2offset(new_dentry); - simple_offset_remove(old_ctx, old_dentry); + if (WARN_ON(!new_offset)) + return; - if (new_offset) { - offset_set(new_dentry, 0); - return simple_offset_replace(new_ctx, old_dentry, new_offset); - } - return simple_offset_add(new_ctx, old_dentry); + simple_offset_remove(old_ctx, old_dentry); + offset_set(new_dentry, 0); + WARN_ON(simple_offset_replace(new_ctx, old_dentry, new_offset)); } /** @@ -388,31 +388,23 @@ int simple_offset_rename_exchange(struct inode *old_dir, long new_index = dentry2offset(new_dentry); int ret; - simple_offset_remove(old_ctx, old_dentry); - simple_offset_remove(new_ctx, new_dentry); + if (WARN_ON(!old_index || !new_index)) + return -EINVAL; - ret = simple_offset_replace(new_ctx, old_dentry, new_index); - if (ret) - goto out_restore; + ret = mtree_store(&new_ctx->mt, new_index, old_dentry, GFP_KERNEL); + if (WARN_ON(ret)) + return ret; - ret = simple_offset_replace(old_ctx, new_dentry, old_index); - if (ret) { - simple_offset_remove(new_ctx, old_dentry); - goto out_restore; + ret = mtree_store(&old_ctx->mt, old_index, new_dentry, GFP_KERNEL); + if (WARN_ON(ret)) { + mtree_store(&new_ctx->mt, new_index, new_dentry, GFP_KERNEL); + return ret; } - ret = simple_rename_exchange(old_dir, old_dentry, new_dir, new_dentry); - if (ret) { - simple_offset_remove(new_ctx, old_dentry); - simple_offset_remove(old_ctx, new_dentry); - goto out_restore; - } + offset_set(old_dentry, new_index); + offset_set(new_dentry, old_index); + simple_rename_exchange(old_dir, old_dentry, new_dir, new_dentry); return 0; - -out_restore: - (void)simple_offset_replace(old_ctx, old_dentry, old_index); - (void)simple_offset_replace(new_ctx, new_dentry, new_index); - return ret; } /** diff --git a/include/linux/fs.h b/include/linux/fs.h index 04ceeca12a0d5..f5c9cf28c4dcf 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -3247,7 +3247,7 @@ struct offset_ctx { void simple_offset_init(struct offset_ctx *octx); int simple_offset_add(struct offset_ctx *octx, struct dentry *dentry); void simple_offset_remove(struct offset_ctx *octx, struct dentry *dentry); -int simple_offset_rename(struct inode *old_dir, struct dentry *old_dentry, +void simple_offset_rename(struct inode *old_dir, struct dentry *old_dentry, struct inode *new_dir, struct dentry *new_dentry); int simple_offset_rename_exchange(struct inode *old_dir, struct dentry *old_dentry, diff --git a/mm/shmem.c b/mm/shmem.c index ecc7de7878212..ec6c01378e9d2 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -4038,6 +4038,7 @@ static int shmem_rename2(struct mnt_idmap *idmap, { struct inode *inode = d_inode(old_dentry); int they_are_dirs = S_ISDIR(inode->i_mode); + bool had_offset = false; int error; if (flags & ~(RENAME_NOREPLACE | RENAME_EXCHANGE | RENAME_WHITEOUT)) @@ -4050,16 +4051,23 @@ static int shmem_rename2(struct mnt_idmap *idmap, if (!simple_empty(new_dentry)) return -ENOTEMPTY; + error = simple_offset_add(shmem_get_offset_ctx(new_dir), new_dentry); + if (error == -EBUSY) + had_offset = true; + else if (unlikely(error)) + return error; + if (flags & RENAME_WHITEOUT) { error = shmem_whiteout(idmap, old_dir, old_dentry); - if (error) + if (error) { + if (!had_offset) + simple_offset_remove(shmem_get_offset_ctx(new_dir), + new_dentry); return error; + } } - error = simple_offset_rename(old_dir, old_dentry, new_dir, new_dentry); - if (error) - return error; - + simple_offset_rename(old_dir, old_dentry, new_dir, new_dentry); if (d_really_is_positive(new_dentry)) { (void) shmem_unlink(new_dir, new_dentry); if (they_are_dirs) { From 1e603e1fbdacfc3910f1646a0eab19ca9d2c3eba Mon Sep 17 00:00:00 2001 From: Sven Schnelle Date: Fri, 5 Dec 2025 10:58:57 +0100 Subject: [PATCH 0025/1321] s390/ipl: Clear SBP flag when bootprog is set With z16 a new flag 'search boot program' was introduced for list-directed IPL (SCSI, NVMe, ECKD DASD). If this flag is set, e.g. via selecting the "Automatic" value for the "Boot program selector" control on an HMC load panel, it is copied to the reipl structure from the initial ipl structure. When a user now sets a boot prog via sysfs, the flag is not cleared and the bootloader will again automatically select the boot program, ignoring user configuration. To avoid that, clear the SBP flag when a bootprog sysfs file is written. Cc: stable@vger.kernel.org Reviewed-by: Peter Oberparleiter Reviewed-by: Heiko Carstens Signed-off-by: Sven Schnelle Signed-off-by: Heiko Carstens --- arch/s390/include/uapi/asm/ipl.h | 1 + arch/s390/kernel/ipl.c | 48 ++++++++++++++++++++++++-------- 2 files changed, 37 insertions(+), 12 deletions(-) diff --git a/arch/s390/include/uapi/asm/ipl.h b/arch/s390/include/uapi/asm/ipl.h index 2cd28af50dd43..3d64a22516994 100644 --- a/arch/s390/include/uapi/asm/ipl.h +++ b/arch/s390/include/uapi/asm/ipl.h @@ -15,6 +15,7 @@ struct ipl_pl_hdr { #define IPL_PL_FLAG_IPLPS 0x80 #define IPL_PL_FLAG_SIPL 0x40 #define IPL_PL_FLAG_IPLSR 0x20 +#define IPL_PL_FLAG_SBP 0x10 /* IPL Parameter Block header */ struct ipl_pb_hdr { diff --git a/arch/s390/kernel/ipl.c b/arch/s390/kernel/ipl.c index 961a3d60a4ddd..dcdc7e2748486 100644 --- a/arch/s390/kernel/ipl.c +++ b/arch/s390/kernel/ipl.c @@ -262,6 +262,24 @@ static struct kobj_attribute sys_##_prefix##_##_name##_attr = \ sys_##_prefix##_##_name##_show, \ sys_##_prefix##_##_name##_store) +#define DEFINE_IPL_ATTR_BOOTPROG_RW(_prefix, _name, _fmt_out, _fmt_in, _hdr, _value) \ + IPL_ATTR_SHOW_FN(_prefix, _name, _fmt_out, (unsigned long long) _value) \ +static ssize_t sys_##_prefix##_##_name##_store(struct kobject *kobj, \ + struct kobj_attribute *attr, \ + const char *buf, size_t len) \ +{ \ + unsigned long long value; \ + if (sscanf(buf, _fmt_in, &value) != 1) \ + return -EINVAL; \ + (_value) = value; \ + (_hdr).flags &= ~IPL_PL_FLAG_SBP; \ + return len; \ +} \ +static struct kobj_attribute sys_##_prefix##_##_name##_attr = \ + __ATTR(_name, 0644, \ + sys_##_prefix##_##_name##_show, \ + sys_##_prefix##_##_name##_store) + #define DEFINE_IPL_ATTR_STR_RW(_prefix, _name, _fmt_out, _fmt_in, _value)\ IPL_ATTR_SHOW_FN(_prefix, _name, _fmt_out, _value) \ static ssize_t sys_##_prefix##_##_name##_store(struct kobject *kobj, \ @@ -818,12 +836,13 @@ DEFINE_IPL_ATTR_RW(reipl_fcp, wwpn, "0x%016llx\n", "%llx\n", reipl_block_fcp->fcp.wwpn); DEFINE_IPL_ATTR_RW(reipl_fcp, lun, "0x%016llx\n", "%llx\n", reipl_block_fcp->fcp.lun); -DEFINE_IPL_ATTR_RW(reipl_fcp, bootprog, "%lld\n", "%lld\n", - reipl_block_fcp->fcp.bootprog); DEFINE_IPL_ATTR_RW(reipl_fcp, br_lba, "%lld\n", "%lld\n", reipl_block_fcp->fcp.br_lba); DEFINE_IPL_ATTR_RW(reipl_fcp, device, "0.0.%04llx\n", "0.0.%llx\n", reipl_block_fcp->fcp.devno); +DEFINE_IPL_ATTR_BOOTPROG_RW(reipl_fcp, bootprog, "%lld\n", "%lld\n", + reipl_block_fcp->hdr, + reipl_block_fcp->fcp.bootprog); static void reipl_get_ascii_loadparm(char *loadparm, struct ipl_parameter_block *ibp) @@ -942,10 +961,11 @@ DEFINE_IPL_ATTR_RW(reipl_nvme, fid, "0x%08llx\n", "%llx\n", reipl_block_nvme->nvme.fid); DEFINE_IPL_ATTR_RW(reipl_nvme, nsid, "0x%08llx\n", "%llx\n", reipl_block_nvme->nvme.nsid); -DEFINE_IPL_ATTR_RW(reipl_nvme, bootprog, "%lld\n", "%lld\n", - reipl_block_nvme->nvme.bootprog); DEFINE_IPL_ATTR_RW(reipl_nvme, br_lba, "%lld\n", "%lld\n", reipl_block_nvme->nvme.br_lba); +DEFINE_IPL_ATTR_BOOTPROG_RW(reipl_nvme, bootprog, "%lld\n", "%lld\n", + reipl_block_nvme->hdr, + reipl_block_nvme->nvme.bootprog); static struct attribute *reipl_nvme_attrs[] = { &sys_reipl_nvme_fid_attr.attr, @@ -1038,8 +1058,9 @@ static const struct bin_attribute *const reipl_eckd_bin_attrs[] = { }; DEFINE_IPL_CCW_ATTR_RW(reipl_eckd, device, reipl_block_eckd->eckd); -DEFINE_IPL_ATTR_RW(reipl_eckd, bootprog, "%lld\n", "%lld\n", - reipl_block_eckd->eckd.bootprog); +DEFINE_IPL_ATTR_BOOTPROG_RW(reipl_eckd, bootprog, "%lld\n", "%lld\n", + reipl_block_eckd->hdr, + reipl_block_eckd->eckd.bootprog); static struct attribute *reipl_eckd_attrs[] = { &sys_reipl_eckd_device_attr.attr, @@ -1567,12 +1588,13 @@ DEFINE_IPL_ATTR_RW(dump_fcp, wwpn, "0x%016llx\n", "%llx\n", dump_block_fcp->fcp.wwpn); DEFINE_IPL_ATTR_RW(dump_fcp, lun, "0x%016llx\n", "%llx\n", dump_block_fcp->fcp.lun); -DEFINE_IPL_ATTR_RW(dump_fcp, bootprog, "%lld\n", "%lld\n", - dump_block_fcp->fcp.bootprog); DEFINE_IPL_ATTR_RW(dump_fcp, br_lba, "%lld\n", "%lld\n", dump_block_fcp->fcp.br_lba); DEFINE_IPL_ATTR_RW(dump_fcp, device, "0.0.%04llx\n", "0.0.%llx\n", dump_block_fcp->fcp.devno); +DEFINE_IPL_ATTR_BOOTPROG_RW(dump_fcp, bootprog, "%lld\n", "%lld\n", + dump_block_fcp->hdr, + dump_block_fcp->fcp.bootprog); DEFINE_IPL_ATTR_SCP_DATA_RW(dump_fcp, dump_block_fcp->hdr, dump_block_fcp->fcp, @@ -1604,10 +1626,11 @@ DEFINE_IPL_ATTR_RW(dump_nvme, fid, "0x%08llx\n", "%llx\n", dump_block_nvme->nvme.fid); DEFINE_IPL_ATTR_RW(dump_nvme, nsid, "0x%08llx\n", "%llx\n", dump_block_nvme->nvme.nsid); -DEFINE_IPL_ATTR_RW(dump_nvme, bootprog, "%lld\n", "%llx\n", - dump_block_nvme->nvme.bootprog); DEFINE_IPL_ATTR_RW(dump_nvme, br_lba, "%lld\n", "%llx\n", dump_block_nvme->nvme.br_lba); +DEFINE_IPL_ATTR_BOOTPROG_RW(dump_nvme, bootprog, "%lld\n", "%llx\n", + dump_block_nvme->hdr, + dump_block_nvme->nvme.bootprog); DEFINE_IPL_ATTR_SCP_DATA_RW(dump_nvme, dump_block_nvme->hdr, dump_block_nvme->nvme, @@ -1635,8 +1658,9 @@ static const struct attribute_group dump_nvme_attr_group = { /* ECKD dump device attributes */ DEFINE_IPL_CCW_ATTR_RW(dump_eckd, device, dump_block_eckd->eckd); -DEFINE_IPL_ATTR_RW(dump_eckd, bootprog, "%lld\n", "%llx\n", - dump_block_eckd->eckd.bootprog); +DEFINE_IPL_ATTR_BOOTPROG_RW(dump_eckd, bootprog, "%lld\n", "%llx\n", + dump_block_eckd->hdr, + dump_block_eckd->eckd.bootprog); IPL_ATTR_BR_CHR_SHOW_FN(dump, dump_block_eckd->eckd); IPL_ATTR_BR_CHR_STORE_FN(dump, dump_block_eckd->eckd); From 72843f56d3d48eb6076740ff878f875365dbaf67 Mon Sep 17 00:00:00 2001 From: Benjamin Block Date: Fri, 5 Dec 2025 16:47:17 +0100 Subject: [PATCH 0026/1321] s390/pci: Fix cyclic dead-lock in zpci_zdev_put() and zpci_scan_devices() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit When triggering PCI device recovery by writing into the SysFS attribute `recover` of a Physical Function with existing child SR-IOV Virtual Functions, lockdep is reporting a possible deadlock between three threads: Thread (A) Thread (B) Thread (C) | | | recover_store() zpci_scan_devices() zpci_scan_devices() lock(pci_rescan_remove_lock) | | | | | | | zpci_bus_scan_busses() | | lock(zbus_list_lock) | zpci_add_device() | | lock(zpci_add_remove_lock) | | | ┴ | | zpci_bus_scan_bus() | | lock(pci_rescan_remove_lock) ┴ | zpci_zdev_put() | lock(zpci_add_remove_lock) | ┴ zpci_bus_get() lock(zbus_list_lock) In zpci_bus_scan_busses() the `zbus_list_lock` is taken for the whole duration of the function, which also includes taking `pci_rescan_remove_lock`, among other things. But `zbus_list_lock` only really needs to protect the modification of the global registration `zbus_list`, it can be dropped while the functions within the list iteration run; this way we break the cycle above. Break up zpci_bus_scan_busses() into an "iterator" zpci_bus_get_next() that iterates over `zbus_list` element by element, and acquires and releases `zbus_list_lock` as necessary, but never keep holding it. References to `zpci_bus` objects are also acquired and released. The reference counting on `zpci_bus` objects is also changed so that all put() and get() operations are done under the protection of `zbus_list_lock`, and if the operation results in a modification of `zpci_bus_list`, this modification is done in the same critical section (apart the very first initialization). This way objects are never seen on the list that are about to be released and/or half-initialized. Fixes: 14c87ba8123a ("s390/pci: separate zbus registration from scanning") Suggested-by: Niklas Schnelle Signed-off-by: Benjamin Block Reviewed-by: Niklas Schnelle Reviewed-by: Gerd Bayer Signed-off-by: Heiko Carstens --- .clang-format | 1 + arch/s390/pci/pci.c | 6 ++- arch/s390/pci/pci_bus.c | 98 +++++++++++++++++++++++++++++------------ arch/s390/pci/pci_bus.h | 15 ++++++- 4 files changed, 91 insertions(+), 29 deletions(-) diff --git a/.clang-format b/.clang-format index 2ceca764122f8..c7060124a47aa 100644 --- a/.clang-format +++ b/.clang-format @@ -748,6 +748,7 @@ ForEachMacros: - 'ynl_attr_for_each_nested' - 'ynl_attr_for_each_payload' - 'zorro_for_each_dev' + - 'zpci_bus_for_each' IncludeBlocks: Preserve IncludeCategories: diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c index 5a6ace9d875a2..8fd14d0430085 100644 --- a/arch/s390/pci/pci.c +++ b/arch/s390/pci/pci.c @@ -1148,6 +1148,7 @@ static void zpci_add_devices(struct list_head *scan_list) int zpci_scan_devices(void) { + struct zpci_bus *zbus; LIST_HEAD(scan_list); int rc; @@ -1156,7 +1157,10 @@ int zpci_scan_devices(void) return rc; zpci_add_devices(&scan_list); - zpci_bus_scan_busses(); + zpci_bus_for_each(zbus) { + zpci_bus_scan_bus(zbus); + cond_resched(); + } return 0; } diff --git a/arch/s390/pci/pci_bus.c b/arch/s390/pci/pci_bus.c index 66c4bd888b293..42a13e451f649 100644 --- a/arch/s390/pci/pci_bus.c +++ b/arch/s390/pci/pci_bus.c @@ -153,23 +153,6 @@ int zpci_bus_scan_bus(struct zpci_bus *zbus) return ret; } -/* zpci_bus_scan_busses - Scan all registered busses - * - * Scan all available zbusses - * - */ -void zpci_bus_scan_busses(void) -{ - struct zpci_bus *zbus = NULL; - - mutex_lock(&zbus_list_lock); - list_for_each_entry(zbus, &zbus_list, bus_next) { - zpci_bus_scan_bus(zbus); - cond_resched(); - } - mutex_unlock(&zbus_list_lock); -} - static bool zpci_bus_is_multifunction_root(struct zpci_dev *zdev) { return !s390_pci_no_rid && zdev->rid_available && @@ -222,10 +205,29 @@ static int zpci_bus_create_pci_bus(struct zpci_bus *zbus, struct zpci_dev *fr, s return -ENOMEM; } -static void zpci_bus_release(struct kref *kref) +/** + * zpci_bus_release - Un-initialize resources associated with the zbus and + * free memory + * @kref: refcount * that is part of struct zpci_bus + * + * MUST be called with `zbus_list_lock` held, but the lock is released during + * run of the function. + */ +static inline void zpci_bus_release(struct kref *kref) + __releases(&zbus_list_lock) { struct zpci_bus *zbus = container_of(kref, struct zpci_bus, kref); + lockdep_assert_held(&zbus_list_lock); + + list_del(&zbus->bus_next); + mutex_unlock(&zbus_list_lock); + + /* + * At this point no-one should see this object, or be able to get a new + * reference to it. + */ + if (zbus->bus) { pci_lock_rescan_remove(); pci_stop_root_bus(zbus->bus); @@ -237,16 +239,19 @@ static void zpci_bus_release(struct kref *kref) pci_unlock_rescan_remove(); } - mutex_lock(&zbus_list_lock); - list_del(&zbus->bus_next); - mutex_unlock(&zbus_list_lock); zpci_remove_parent_msi_domain(zbus); kfree(zbus); } -static void zpci_bus_put(struct zpci_bus *zbus) +static inline void __zpci_bus_get(struct zpci_bus *zbus) +{ + lockdep_assert_held(&zbus_list_lock); + kref_get(&zbus->kref); +} + +static inline void zpci_bus_put(struct zpci_bus *zbus) { - kref_put(&zbus->kref, zpci_bus_release); + kref_put_mutex(&zbus->kref, zpci_bus_release, &zbus_list_lock); } static struct zpci_bus *zpci_bus_get(int topo, bool topo_is_tid) @@ -258,7 +263,7 @@ static struct zpci_bus *zpci_bus_get(int topo, bool topo_is_tid) if (!zbus->multifunction) continue; if (topo_is_tid == zbus->topo_is_tid && topo == zbus->topo) { - kref_get(&zbus->kref); + __zpci_bus_get(zbus); goto out_unlock; } } @@ -268,6 +273,44 @@ static struct zpci_bus *zpci_bus_get(int topo, bool topo_is_tid) return zbus; } +/** + * zpci_bus_get_next - get the next zbus object from given position in the list + * @pos: current position/cursor in the global zbus list + * + * Acquires and releases references as the cursor iterates (might also free/ + * release the cursor). Is tolerant of concurrent operations on the list. + * + * To begin the iteration, set *@pos to %NULL before calling the function. + * + * *@pos is set to %NULL in cases where either the list is empty, or *@pos is + * the last element in the list. + * + * Context: Process context. May sleep. + */ +void zpci_bus_get_next(struct zpci_bus **pos) +{ + struct zpci_bus *curp = *pos, *next = NULL; + + mutex_lock(&zbus_list_lock); + if (curp) + next = list_next_entry(curp, bus_next); + else + next = list_first_entry(&zbus_list, typeof(*curp), bus_next); + + if (list_entry_is_head(next, &zbus_list, bus_next)) + next = NULL; + + if (next) + __zpci_bus_get(next); + + *pos = next; + mutex_unlock(&zbus_list_lock); + + /* zpci_bus_put() might drop refcount to 0 and locks zbus_list_lock */ + if (curp) + zpci_bus_put(curp); +} + static struct zpci_bus *zpci_bus_alloc(int topo, bool topo_is_tid) { struct zpci_bus *zbus; @@ -279,9 +322,6 @@ static struct zpci_bus *zpci_bus_alloc(int topo, bool topo_is_tid) zbus->topo = topo; zbus->topo_is_tid = topo_is_tid; INIT_LIST_HEAD(&zbus->bus_next); - mutex_lock(&zbus_list_lock); - list_add_tail(&zbus->bus_next, &zbus_list); - mutex_unlock(&zbus_list_lock); kref_init(&zbus->kref); INIT_LIST_HEAD(&zbus->resources); @@ -291,6 +331,10 @@ static struct zpci_bus *zpci_bus_alloc(int topo, bool topo_is_tid) zbus->bus_resource.flags = IORESOURCE_BUS; pci_add_resource(&zbus->resources, &zbus->bus_resource); + mutex_lock(&zbus_list_lock); + list_add_tail(&zbus->bus_next, &zbus_list); + mutex_unlock(&zbus_list_lock); + return zbus; } diff --git a/arch/s390/pci/pci_bus.h b/arch/s390/pci/pci_bus.h index ae3d7a9159bde..e440742e3145f 100644 --- a/arch/s390/pci/pci_bus.h +++ b/arch/s390/pci/pci_bus.h @@ -15,7 +15,20 @@ int zpci_bus_device_register(struct zpci_dev *zdev, struct pci_ops *ops); void zpci_bus_device_unregister(struct zpci_dev *zdev); int zpci_bus_scan_bus(struct zpci_bus *zbus); -void zpci_bus_scan_busses(void); +void zpci_bus_get_next(struct zpci_bus **pos); + +/** + * zpci_bus_for_each - iterate over all the registered zbus objects + * @pos: a struct zpci_bus * as cursor + * + * Acquires and releases references as the cursor iterates over the registered + * objects. Is tolerant against concurrent removals of objects. + * + * Context: Process context. May sleep. + */ +#define zpci_bus_for_each(pos) \ + for ((pos) = NULL, zpci_bus_get_next(&(pos)); (pos) != NULL; \ + zpci_bus_get_next(&(pos))) int zpci_bus_scan_device(struct zpci_dev *zdev); void zpci_bus_remove_device(struct zpci_dev *zdev, bool set_error); From a11e430f955cd1e932a4b726f1573fb0fba56b4c Mon Sep 17 00:00:00 2001 From: Benjamin Block Date: Fri, 5 Dec 2025 16:47:18 +0100 Subject: [PATCH 0027/1321] s390/pci: Annotate lock context imbalance in zpci_release_device() When checking `arch/s390/pci/pci.c` with `sparse` during build, the following complaint is reported: arch/s390/pci/pci.c: note: in included file (through include/linux/smp.h, include/linux/lockdep.h, include/linux/spinlock.h, include/linux/mmzone.h, include/linux/gfp.h, include/linux/slab.h): ./include/linux/list.h:237:25: warning: context imbalance in 'zpci_release_device' - unexpected unlock But this is expected, as zpci_release_device() is expected to be called with `zpci_list_lock` held, as part of `kref_put_lock()` or similar. Reflect this by annotating the function with the appropriate __releases(). Signed-off-by: Benjamin Block Reviewed-by: Farhan Ali Reviewed-by: Niklas Schnelle Reviewed-by: Gerd Bayer Signed-off-by: Heiko Carstens --- arch/s390/pci/pci.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c index 8fd14d0430085..57f3980b98a92 100644 --- a/arch/s390/pci/pci.c +++ b/arch/s390/pci/pci.c @@ -961,6 +961,7 @@ void zpci_device_reserved(struct zpci_dev *zdev) } void zpci_release_device(struct kref *kref) + __releases(&zpci_list_lock) { struct zpci_dev *zdev = container_of(kref, struct zpci_dev, kref); From 3176bc81671a053b8ac633290b837248aa309b1f Mon Sep 17 00:00:00 2001 From: Jens Remus Date: Thu, 11 Dec 2025 12:24:50 +0100 Subject: [PATCH 0028/1321] s390/stacktrace: Do not fallback to RA register The logic to fallback to the return address (RA) register value in the topmost frame when stack tracing using back chain is broken in multiple ways: When assuming the RA register 14 has not been saved yet one must assume that a new user stack frame has not been allocated either. Therefore the back chain would not contain the stack pointer (SP) at entry, but the caller's SP at its entry instead. Therefore when falling back to the RA register 14 value it would also be necessary to fallback to the SP register 15 value. Otherwise an invalid combination of RA register 14 and caller's SP at its entry (from the back chain) is used. In the topmost frame the back chain contains either the caller's SP at its entry (before having allocated a new stack frame in the prologue), the SP at entry (after having allocated a new stack frame), or an uninitialized value (during static/dynamic stack allocation). In both cases where the back chain is valid either the caller or prologue must have saved its respective RA to the respective frame. Therefore, if the RA obtained from the frame pointed to by the back chain is invalid, this does not indicate that the IP in the topmost frame is still early in the prologue and the RA has not been saved. Reviewed-by: Heiko Carstens Signed-off-by: Jens Remus Signed-off-by: Heiko Carstens --- arch/s390/kernel/stacktrace.c | 18 ++---------------- 1 file changed, 2 insertions(+), 16 deletions(-) diff --git a/arch/s390/kernel/stacktrace.c b/arch/s390/kernel/stacktrace.c index 3aae7f70e6ab1..18520d3330581 100644 --- a/arch/s390/kernel/stacktrace.c +++ b/arch/s390/kernel/stacktrace.c @@ -104,7 +104,6 @@ void arch_stack_walk_user_common(stack_trace_consume_fn consume_entry, void *coo struct stack_frame_vdso_wrapper __user *sf_vdso; struct stack_frame_user __user *sf; unsigned long ip, sp; - bool first = true; if (!current->mm) return; @@ -133,24 +132,11 @@ void arch_stack_walk_user_common(stack_trace_consume_fn consume_entry, void *coo if (__get_user(ip, &sf->gprs[8])) break; } - /* Sanity check: ABI requires SP to be 8 byte aligned. */ - if (sp & 0x7) + /* Validate SP and RA (ABI requires SP to be 8 byte aligned). */ + if (sp & 0x7 || ip_invalid(ip)) break; - if (ip_invalid(ip)) { - /* - * If the instruction address is invalid, and this - * is the first stack frame, assume r14 has not - * been written to the stack yet. Otherwise exit. - */ - if (!first) - break; - ip = regs->gprs[14]; - if (ip_invalid(ip)) - break; - } if (!store_ip(consume_entry, cookie, entry, perf, ip)) break; - first = false; } pagefault_enable(); } From 23fcbfddbda4cf9cb748e2ed04e9a747de8dfbde Mon Sep 17 00:00:00 2001 From: Alexei Starovoitov Date: Wed, 3 Dec 2025 20:14:30 -0800 Subject: [PATCH 0029/1321] selftests/bpf: Add -fms-extensions to bpf build flags The kernel is now built with -fms-extensions, therefore generated vmlinux.h contains types like: struct slab { .. struct freelist_counters; }; Use -fms-extensions and -Wno-microsoft-anon-tag flags to build bpf programs that #include "vmlinux.h" Signed-off-by: Alexei Starovoitov --- tools/testing/selftests/bpf/Makefile | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index b7030a6e2e763..4aa60e83ff191 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -437,6 +437,8 @@ BPF_CFLAGS = -g -Wall -Werror -D__TARGET_ARCH_$(SRCARCH) $(MENDIAN) \ -I$(abspath $(OUTPUT)/../usr/include) \ -std=gnu11 \ -fno-strict-aliasing \ + -Wno-microsoft-anon-tag \ + -fms-extensions \ -Wno-compare-distinct-pointer-types \ -Wno-initializer-overrides \ # From 223f668c06701dee9cb9473228fb926a90054cfe Mon Sep 17 00:00:00 2001 From: Geert Uytterhoeven Date: Thu, 4 Dec 2025 11:29:16 +0100 Subject: [PATCH 0030/1321] net: smc: SMC_HS_CTRL_BPF should depend on BPF_JIT MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit If CONFIG_BPF_SYSCALL=y, but CONFIG_BPF_JIT=n: net/smc/smc_hs_bpf.c: In function ‘bpf_smc_hs_ctrl_init’: include/linux/bpf.h:2068:50: error: statement with no effect [-Werror=unused-value] 2068 | #define register_bpf_struct_ops(st_ops, type) ({ (void *)(st_ops); 0; }) | ^~~~~~~~~~~~~~~~ net/smc/smc_hs_bpf.c:139:16: note: in expansion of macro ‘register_bpf_struct_ops’ 139 | return register_bpf_struct_ops(&bpf_smc_hs_ctrl_ops, smc_hs_ctrl); | ^~~~~~~~~~~~~~~~~~~~~~~ While this compile error is caused by a bug in , none of the code in net/smc/smc_hs_bpf.c becomes effective if CONFIG_BPF_JIT is not enabled. Hence add a dependency on BPF_JIT. While at it, add the missing newline at the end of the file. Fixes: 15f295f55656658e ("net/smc: bpf: Introduce generic hook for handshake flow") Signed-off-by: Geert Uytterhoeven Signed-off-by: Martin KaFai Lau Link: https://patch.msgid.link/988c61e5fea280872d81b3640f1f34d0619cfbbf.1764843951.git.geert@linux-m68k.org --- net/smc/Kconfig | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/net/smc/Kconfig b/net/smc/Kconfig index 325addf83cc69..277ef504bc26e 100644 --- a/net/smc/Kconfig +++ b/net/smc/Kconfig @@ -22,10 +22,10 @@ config SMC_DIAG config SMC_HS_CTRL_BPF bool "Generic eBPF hook for SMC handshake flow" - depends on SMC && BPF_SYSCALL + depends on SMC && BPF_JIT && BPF_SYSCALL default y help SMC_HS_CTRL_BPF enables support to register generic eBPF hook for SMC handshake flow, which offer much greater flexibility in modifying the behavior of the SMC protocol stack compared to a complete kernel-based approach. Select - this option if you want filtring the handshake process via eBPF programs. \ No newline at end of file + this option if you want filtring the handshake process via eBPF programs. From 59ff3ddb8b0223900e44db7a40bb1f7db6a4911c Mon Sep 17 00:00:00 2001 From: Quentin Monnet Date: Mon, 8 Dec 2025 13:07:48 +0000 Subject: [PATCH 0031/1321] bpftool: Fix build warnings due to MS extensions The kernel is now built with -fms-extensions. Anonymous structs or unions permitted by these extensions have been used in several places, and can end up in the generated vmlinux.h file, for example: struct ns_tree { [...] }; [...] struct ns_common { [...] union { struct ns_tree; struct callback_head ns_rcu; }; }; Trying to include this header for compiling a tool may result in build warnings, if the compiler does not expect these extensions. This is the case, for example, with bpftool: In file included from skeleton/pid_iter.bpf.c:3: .../tools/testing/selftests/bpf/tools/build/bpftool/vmlinux.h:64057:3: warning: declaration does not declare anything [-Wmissing-declarations] 64057 | struct ns_tree; | ^~~~~~~~~~~~~~ Fix these build warnings in bpftool by turning on Microsoft extensions when compiling the two BPF programs that rely on vmlinux.h. Reported-by: Alexei Starovoitov Closes: https://lore.kernel.org/bpf/CAADnVQK9ZkPC7+R5VXKHVdtj8tumpMXm7BTp0u9CoiFLz_aPTg@mail.gmail.com/ Signed-off-by: Quentin Monnet Link: https://lore.kernel.org/r/20251208130748.68371-1-qmo@kernel.org Signed-off-by: Alexei Starovoitov --- tools/bpf/bpftool/Makefile | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tools/bpf/bpftool/Makefile b/tools/bpf/bpftool/Makefile index 586d1b2595d16..5442073a2e428 100644 --- a/tools/bpf/bpftool/Makefile +++ b/tools/bpf/bpftool/Makefile @@ -224,6 +224,8 @@ endif $(OUTPUT)%.bpf.o: skeleton/%.bpf.c $(OUTPUT)vmlinux.h $(LIBBPF_BOOTSTRAP) $(QUIET_CLANG)$(CLANG) \ + -Wno-microsoft-anon-tag \ + -fms-extensions \ -I$(or $(OUTPUT),.) \ -I$(srctree)/tools/include/uapi/ \ -I$(LIBBPF_BOOTSTRAP_INCLUDE) \ From 45da1a625d5660b669e0b0978b07f082b55ac217 Mon Sep 17 00:00:00 2001 From: Mikhail Gavrilov Date: Sat, 6 Dec 2025 14:28:25 +0500 Subject: [PATCH 0032/1321] libbpf: Fix -Wdiscarded-qualifiers under C23 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit glibc ≥ 2.42 (GCC 15) defaults to -std=gnu23, which promotes -Wdiscarded-qualifiers to an error. In C23, strstr() and strchr() return "const char *". Change variable types to const char * where the pointers are never modified (res, sym_sfx, next_path). Suggested-by: Florian Weimer Suggested-by: Andrii Nakryiko Signed-off-by: Mikhail Gavrilov Link: https://lore.kernel.org/r/20251206092825.1471385-1-mikhail.v.gavrilov@gmail.com Signed-off-by: Alexei Starovoitov --- tools/lib/bpf/libbpf.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 3dc8a80788155..f4dfd23148a55 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -8484,7 +8484,7 @@ static int kallsyms_cb(unsigned long long sym_addr, char sym_type, struct bpf_object *obj = ctx; const struct btf_type *t; struct extern_desc *ext; - char *res; + const char *res; res = strstr(sym_name, ".llvm."); if (sym_type == 'd' && res) @@ -11818,7 +11818,8 @@ static int avail_kallsyms_cb(unsigned long long sym_addr, char sym_type, * * [0] fb6a421fb615 ("kallsyms: Match symbols exactly with CONFIG_LTO_CLANG") */ - char sym_trim[256], *psym_trim = sym_trim, *sym_sfx; + char sym_trim[256], *psym_trim = sym_trim; + const char *sym_sfx; if (!(sym_sfx = strstr(sym_name, ".llvm."))) return 0; @@ -12401,7 +12402,7 @@ static int resolve_full_path(const char *file, char *result, size_t result_sz) if (!search_paths[i]) continue; for (s = search_paths[i]; s != NULL; s = strchr(s, ':')) { - char *next_path; + const char *next_path; int seg_len; if (s[0] == ':') From c1661d4e696e6577afdcbfc20ed3dc3d6a8955cc Mon Sep 17 00:00:00 2001 From: Ondrej Mosnacek Date: Thu, 4 Dec 2025 13:59:16 +0100 Subject: [PATCH 0033/1321] bpf, arm64: Do not audit capability check in do_jit() Analogically to the x86 commit 881a9c9cb785 ("bpf: Do not audit capability check in do_jit()"), change the capable() call to ns_capable_noaudit() in order to avoid spurious SELinux denials in audit log. The commit log from that commit applies here as well: """ The failure of this check only results in a security mitigation being applied, slightly affecting performance of the compiled BPF program. It doesn't result in a failed syscall, an thus auditing a failed LSM permission check for it is unwanted. For example with SELinux, it causes a denial to be reported for confined processes running as root, which tends to be flagged as a problem to be fixed in the policy. Yet dontauditing or allowing CAP_SYS_ADMIN to the domain may not be desirable, as it would allow/silence also other checks - either going against the principle of least privilege or making debugging potentially harder. Fix it by changing it from capable() to ns_capable_noaudit(), which instructs the LSMs to not audit the resulting denials. """ Fixes: f300769ead03 ("arm64: bpf: Only mitigate cBPF programs loaded by unprivileged users") Signed-off-by: Ondrej Mosnacek Link: https://lore.kernel.org/r/20251204125916.441021-1-omosnace@redhat.com Signed-off-by: Alexei Starovoitov --- arch/arm64/net/bpf_jit_comp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c index 74dd29816f36a..b6eb7a465ad24 100644 --- a/arch/arm64/net/bpf_jit_comp.c +++ b/arch/arm64/net/bpf_jit_comp.c @@ -1004,7 +1004,7 @@ static void __maybe_unused build_bhb_mitigation(struct jit_ctx *ctx) arm64_get_spectre_v2_state() == SPECTRE_VULNERABLE) return; - if (capable(CAP_SYS_ADMIN)) + if (ns_capable_noaudit(&init_user_ns, CAP_SYS_ADMIN)) return; if (supports_clearbhb(SCOPE_SYSTEM)) { From 05152fbff5c2647f764451443a3f2b0886e078b9 Mon Sep 17 00:00:00 2001 From: Josh Poimboeuf Date: Wed, 3 Dec 2025 19:32:15 -0800 Subject: [PATCH 0034/1321] bpf: Add bpf_has_frame_pointer() Introduce a bpf_has_frame_pointer() helper that unwinders can call to determine whether a given instruction pointer is within the valid frame pointer region of a BPF JIT program or trampoline (i.e., after the prologue, before the epilogue). This will enable livepatch (with the ORC unwinder) to reliably unwind through BPF JIT frames. Acked-by: Song Liu Acked-and-tested-by: Andrey Grodzovsky Signed-off-by: Josh Poimboeuf Link: https://lore.kernel.org/r/fd2bc5b4e261a680774b28f6100509fd5ebad2f0.1764818927.git.jpoimboe@kernel.org Signed-off-by: Alexei Starovoitov Reviewed-by: Jiri Olsa --- arch/x86/net/bpf_jit_comp.c | 12 ++++++++++++ include/linux/bpf.h | 3 +++ kernel/bpf/core.c | 16 ++++++++++++++++ 3 files changed, 31 insertions(+) diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c index b69dc7194e2c0..b0bac2a66eff3 100644 --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -1678,6 +1678,9 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image emit_prologue(&prog, image, stack_depth, bpf_prog_was_classic(bpf_prog), tail_call_reachable, bpf_is_subprog(bpf_prog), bpf_prog->aux->exception_cb); + + bpf_prog->aux->ksym.fp_start = prog - temp; + /* Exception callback will clobber callee regs for its own use, and * restore the original callee regs from main prog's stack frame. */ @@ -2736,6 +2739,8 @@ st: if (is_imm8(insn->off)) pop_r12(&prog); } EMIT1(0xC9); /* leave */ + bpf_prog->aux->ksym.fp_end = prog - temp; + emit_return(&prog, image + addrs[i - 1] + (prog - temp)); break; @@ -3325,6 +3330,9 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *rw_im } EMIT1(0x55); /* push rbp */ EMIT3(0x48, 0x89, 0xE5); /* mov rbp, rsp */ + if (im) + im->ksym.fp_start = prog - (u8 *)rw_image; + if (!is_imm8(stack_size)) { /* sub rsp, stack_size */ EMIT3_off32(0x48, 0x81, 0xEC, stack_size); @@ -3462,7 +3470,11 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *rw_im emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, -8); emit_ldx(&prog, BPF_DW, BPF_REG_6, BPF_REG_FP, -rbx_off); + EMIT1(0xC9); /* leave */ + if (im) + im->ksym.fp_end = prog - (u8 *)rw_image; + if (flags & BPF_TRAMP_F_SKIP_FRAME) { /* skip our return address and return to parent */ EMIT4(0x48, 0x83, 0xC4, 8); /* add rsp, 8 */ diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 6498be4c44f8c..e5be698256d15 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1283,6 +1283,8 @@ struct bpf_ksym { struct list_head lnode; struct latch_tree_node tnode; bool prog; + u32 fp_start; + u32 fp_end; }; enum bpf_tramp_prog_type { @@ -1511,6 +1513,7 @@ void bpf_image_ksym_add(struct bpf_ksym *ksym); void bpf_image_ksym_del(struct bpf_ksym *ksym); void bpf_ksym_add(struct bpf_ksym *ksym); void bpf_ksym_del(struct bpf_ksym *ksym); +bool bpf_has_frame_pointer(unsigned long ip); int bpf_jit_charge_modmem(u32 size); void bpf_jit_uncharge_modmem(u32 size); bool bpf_prog_has_trampoline(const struct bpf_prog *prog); diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index c8ae6ab316510..1b9b18e5b03cb 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -760,6 +760,22 @@ struct bpf_prog *bpf_prog_ksym_find(unsigned long addr) NULL; } +bool bpf_has_frame_pointer(unsigned long ip) +{ + struct bpf_ksym *ksym; + unsigned long offset; + + guard(rcu)(); + + ksym = bpf_ksym_find(ip); + if (!ksym || !ksym->fp_start || !ksym->fp_end) + return false; + + offset = ip - ksym->start; + + return offset >= ksym->fp_start && offset < ksym->fp_end; +} + const struct exception_table_entry *search_bpf_extables(unsigned long addr) { const struct exception_table_entry *e = NULL; From 821ef50662bda0327be8f6018b5a92bc2e356cbf Mon Sep 17 00:00:00 2001 From: Josh Poimboeuf Date: Wed, 3 Dec 2025 19:32:16 -0800 Subject: [PATCH 0035/1321] x86/unwind/orc: Support reliable unwinding through BPF stack frames BPF JIT programs and trampolines use a frame pointer, so the current ORC unwinder strategy of falling back to frame pointers (when an ORC entry is missing) usually works in practice when unwinding through BPF JIT stack frames. However, that frame pointer fallback is just a guess, so the unwind gets marked unreliable for live patching, which can cause livepatch transition stalls. Make the common case reliable by calling the bpf_has_frame_pointer() helper to detect the valid frame pointer region of BPF JIT programs and trampolines. Fixes: ee9f8fce9964 ("x86/unwind: Add the ORC unwinder") Reported-by: Andrey Grodzovsky Closes: https://lore.kernel.org/0e555733-c670-4e84-b2e6-abb8b84ade38@crowdstrike.com Acked-by: Song Liu Acked-and-tested-by: Andrey Grodzovsky Signed-off-by: Josh Poimboeuf Link: https://lore.kernel.org/r/a18505975662328c8ffb1090dded890c6f8c1004.1764818927.git.jpoimboe@kernel.org Signed-off-by: Alexei Starovoitov Reviewed-by: Jiri Olsa --- arch/x86/kernel/unwind_orc.c | 39 +++++++++++++++++++++++++----------- 1 file changed, 27 insertions(+), 12 deletions(-) diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c index 977ee75e047c8..f610fde2d5c4b 100644 --- a/arch/x86/kernel/unwind_orc.c +++ b/arch/x86/kernel/unwind_orc.c @@ -2,6 +2,7 @@ #include #include #include +#include #include #include #include @@ -172,6 +173,25 @@ static struct orc_entry *orc_ftrace_find(unsigned long ip) } #endif +/* Fake frame pointer entry -- used as a fallback for generated code */ +static struct orc_entry orc_fp_entry = { + .type = ORC_TYPE_CALL, + .sp_reg = ORC_REG_BP, + .sp_offset = 16, + .bp_reg = ORC_REG_PREV_SP, + .bp_offset = -16, +}; + +static struct orc_entry *orc_bpf_find(unsigned long ip) +{ +#ifdef CONFIG_BPF_JIT + if (bpf_has_frame_pointer(ip)) + return &orc_fp_entry; +#endif + + return NULL; +} + /* * If we crash with IP==0, the last successfully executed instruction * was probably an indirect function call with a NULL function pointer, @@ -186,15 +206,6 @@ static struct orc_entry null_orc_entry = { .type = ORC_TYPE_CALL }; -/* Fake frame pointer entry -- used as a fallback for generated code */ -static struct orc_entry orc_fp_entry = { - .type = ORC_TYPE_CALL, - .sp_reg = ORC_REG_BP, - .sp_offset = 16, - .bp_reg = ORC_REG_PREV_SP, - .bp_offset = -16, -}; - static struct orc_entry *orc_find(unsigned long ip) { static struct orc_entry *orc; @@ -238,6 +249,11 @@ static struct orc_entry *orc_find(unsigned long ip) if (orc) return orc; + /* BPF lookup: */ + orc = orc_bpf_find(ip); + if (orc) + return orc; + return orc_ftrace_find(ip); } @@ -495,9 +511,8 @@ bool unwind_next_frame(struct unwind_state *state) if (!orc) { /* * As a fallback, try to assume this code uses a frame pointer. - * This is useful for generated code, like BPF, which ORC - * doesn't know about. This is just a guess, so the rest of - * the unwind is no longer considered reliable. + * This is just a guess, so the rest of the unwind is no longer + * considered reliable. */ orc = &orc_fp_entry; state->error = true; From e5b616ddf0973d03d70705102bbe521a23e20585 Mon Sep 17 00:00:00 2001 From: "T.J. Mercier" Date: Wed, 3 Dec 2025 16:03:47 -0800 Subject: [PATCH 0036/1321] bpf: Fix truncated dmabuf iterator reads If there is a large number (hundreds) of dmabufs allocated, the text output generated from dmabuf_iter_seq_show can exceed common user buffer sizes (e.g. PAGE_SIZE) necessitating multiple start/stop cycles to iterate through all dmabufs. However the dmabuf iterator currently returns NULL in dmabuf_iter_seq_start for all non-zero pos values, which results in the truncation of the output before all dmabufs are handled. After dma_buf_iter_begin / dma_buf_iter_next, the refcount of the buffer is elevated so that the BPF iterator program can run without holding any locks. When a stop occurs, instead of immediately dropping the reference on the buffer, stash a pointer to the buffer in seq->priv until either start is called or the iterator is released. This also enables the resumption of iteration without first walking through the list of dmabufs based on the pos value. Fixes: 76ea95534995 ("bpf: Add dmabuf iterator") Signed-off-by: T.J. Mercier Link: https://lore.kernel.org/r/20251204000348.1413593-1-tjmercier@google.com Signed-off-by: Alexei Starovoitov --- kernel/bpf/dmabuf_iter.c | 56 +++++++++++++++++++++++++++++++++++----- 1 file changed, 49 insertions(+), 7 deletions(-) diff --git a/kernel/bpf/dmabuf_iter.c b/kernel/bpf/dmabuf_iter.c index 4dd7ef7c145ca..cd500248abd95 100644 --- a/kernel/bpf/dmabuf_iter.c +++ b/kernel/bpf/dmabuf_iter.c @@ -6,10 +6,33 @@ #include #include +struct dmabuf_iter_priv { + /* + * If this pointer is non-NULL, the buffer's refcount is elevated to + * prevent destruction between stop/start. If reading is not resumed and + * start is never called again, then dmabuf_iter_seq_fini drops the + * reference when the iterator is released. + */ + struct dma_buf *dmabuf; +}; + static void *dmabuf_iter_seq_start(struct seq_file *seq, loff_t *pos) { - if (*pos) - return NULL; + struct dmabuf_iter_priv *p = seq->private; + + if (*pos) { + struct dma_buf *dmabuf = p->dmabuf; + + if (!dmabuf) + return NULL; + + /* + * Always resume from where we stopped, regardless of the value + * of pos. + */ + p->dmabuf = NULL; + return dmabuf; + } return dma_buf_iter_begin(); } @@ -54,8 +77,11 @@ static void dmabuf_iter_seq_stop(struct seq_file *seq, void *v) { struct dma_buf *dmabuf = v; - if (dmabuf) - dma_buf_put(dmabuf); + if (dmabuf) { + struct dmabuf_iter_priv *p = seq->private; + + p->dmabuf = dmabuf; + } } static const struct seq_operations dmabuf_iter_seq_ops = { @@ -71,11 +97,27 @@ static void bpf_iter_dmabuf_show_fdinfo(const struct bpf_iter_aux_info *aux, seq_puts(seq, "dmabuf iter\n"); } +static int dmabuf_iter_seq_init(void *priv, struct bpf_iter_aux_info *aux) +{ + struct dmabuf_iter_priv *p = (struct dmabuf_iter_priv *)priv; + + p->dmabuf = NULL; + return 0; +} + +static void dmabuf_iter_seq_fini(void *priv) +{ + struct dmabuf_iter_priv *p = (struct dmabuf_iter_priv *)priv; + + if (p->dmabuf) + dma_buf_put(p->dmabuf); +} + static const struct bpf_iter_seq_info dmabuf_iter_seq_info = { .seq_ops = &dmabuf_iter_seq_ops, - .init_seq_private = NULL, - .fini_seq_private = NULL, - .seq_priv_size = 0, + .init_seq_private = dmabuf_iter_seq_init, + .fini_seq_private = dmabuf_iter_seq_fini, + .seq_priv_size = sizeof(struct dmabuf_iter_priv), }; static struct bpf_iter_reg bpf_dmabuf_reg_info = { From de6642c936f4bf201503513b4a3c4d122edbef44 Mon Sep 17 00:00:00 2001 From: "T.J. Mercier" Date: Wed, 3 Dec 2025 16:03:48 -0800 Subject: [PATCH 0037/1321] selftests/bpf: Add test for truncated dmabuf_iter reads If many dmabufs are present, reads of the dmabuf iterator can be truncated at PAGE_SIZE or user buffer size boundaries before the fix in "bpf: Fix truncated dmabuf iterator reads". Add a test to confirm truncation does not occur. Signed-off-by: T.J. Mercier Link: https://lore.kernel.org/r/20251204000348.1413593-2-tjmercier@google.com Signed-off-by: Alexei Starovoitov --- .../selftests/bpf/prog_tests/dmabuf_iter.c | 47 +++++++++++++++++-- 1 file changed, 42 insertions(+), 5 deletions(-) diff --git a/tools/testing/selftests/bpf/prog_tests/dmabuf_iter.c b/tools/testing/selftests/bpf/prog_tests/dmabuf_iter.c index 6c2b0c3dbcd86..e442be9dde7e1 100644 --- a/tools/testing/selftests/bpf/prog_tests/dmabuf_iter.c +++ b/tools/testing/selftests/bpf/prog_tests/dmabuf_iter.c @@ -73,12 +73,10 @@ static int create_udmabuf(void) return -1; } -static int create_sys_heap_dmabuf(void) +static int create_sys_heap_dmabuf(size_t bytes) { - sysheap_test_buffer_size = 20 * getpagesize(); - struct dma_heap_allocation_data data = { - .len = sysheap_test_buffer_size, + .len = bytes, .fd = 0, .fd_flags = O_RDWR | O_CLOEXEC, .heap_flags = 0, @@ -110,7 +108,9 @@ static int create_sys_heap_dmabuf(void) static int create_test_buffers(void) { udmabuf = create_udmabuf(); - sysheap_dmabuf = create_sys_heap_dmabuf(); + + sysheap_test_buffer_size = 20 * getpagesize(); + sysheap_dmabuf = create_sys_heap_dmabuf(sysheap_test_buffer_size); if (udmabuf < 0 || sysheap_dmabuf < 0) return -1; @@ -219,6 +219,26 @@ static void subtest_dmabuf_iter_check_default_iter(struct dmabuf_iter *skel) close(iter_fd); } +static void subtest_dmabuf_iter_check_lots_of_buffers(struct dmabuf_iter *skel) +{ + int iter_fd; + char buf[1024]; + size_t total_bytes_read = 0; + ssize_t bytes_read; + + iter_fd = bpf_iter_create(bpf_link__fd(skel->links.dmabuf_collector)); + if (!ASSERT_OK_FD(iter_fd, "iter_create")) + return; + + while ((bytes_read = read(iter_fd, buf, sizeof(buf))) > 0) + total_bytes_read += bytes_read; + + ASSERT_GT(total_bytes_read, getpagesize(), "total_bytes_read"); + + close(iter_fd); +} + + static void subtest_dmabuf_iter_check_open_coded(struct dmabuf_iter *skel, int map_fd) { LIBBPF_OPTS(bpf_test_run_opts, topts); @@ -275,6 +295,23 @@ void test_dmabuf_iter(void) subtest_dmabuf_iter_check_no_infinite_reads(skel); if (test__start_subtest("default_iter")) subtest_dmabuf_iter_check_default_iter(skel); + if (test__start_subtest("lots_of_buffers")) { + size_t NUM_BUFS = 100; + int buffers[NUM_BUFS]; + int i; + + for (i = 0; i < NUM_BUFS; ++i) { + buffers[i] = create_sys_heap_dmabuf(getpagesize()); + if (!ASSERT_OK_FD(buffers[i], "dmabuf_fd")) + goto cleanup_bufs; + } + + subtest_dmabuf_iter_check_lots_of_buffers(skel); + +cleanup_bufs: + for (--i; i >= 0; --i) + close(buffers[i]); + } if (test__start_subtest("open_coded")) subtest_dmabuf_iter_check_open_coded(skel, map_fd); From 342b8b1f0382cc34c471126f6ccaf77f0bdcf63b Mon Sep 17 00:00:00 2001 From: Shuran Liu Date: Sat, 6 Dec 2025 22:12:09 +0800 Subject: [PATCH 0038/1321] bpf: Fix verifier assumptions of bpf_d_path's output buffer Commit 37cce22dbd51 ("bpf: verifier: Refactor helper access type tracking") started distinguishing read vs write accesses performed by helpers. The second argument of bpf_d_path() is a pointer to a buffer that the helper fills with the resulting path. However, its prototype currently uses ARG_PTR_TO_MEM without MEM_WRITE. Before 37cce22dbd51, helper accesses were conservatively treated as potential writes, so this mismatch did not cause issues. Since that commit, the verifier may incorrectly assume that the buffer contents are unchanged across the helper call and base its optimizations on this wrong assumption. This can lead to misbehaviour in BPF programs that read back the buffer, such as prefix comparisons on the returned path. Fix this by marking the second argument of bpf_d_path() as ARG_PTR_TO_MEM | MEM_WRITE so that the verifier correctly models the write to the caller-provided buffer. Fixes: 37cce22dbd51 ("bpf: verifier: Refactor helper access type tracking") Co-developed-by: Zesen Liu Signed-off-by: Zesen Liu Co-developed-by: Peili Gao Signed-off-by: Peili Gao Co-developed-by: Haoran Ni Signed-off-by: Haoran Ni Signed-off-by: Shuran Liu Reviewed-by: Matt Bobrowski Link: https://lore.kernel.org/r/20251206141210.3148-2-electronlsr@gmail.com Signed-off-by: Alexei Starovoitov --- kernel/trace/bpf_trace.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index d57727abaade7..fe28d86f7c357 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -965,7 +965,7 @@ static const struct bpf_func_proto bpf_d_path_proto = { .ret_type = RET_INTEGER, .arg1_type = ARG_PTR_TO_BTF_ID, .arg1_btf_id = &bpf_d_path_btf_ids[0], - .arg2_type = ARG_PTR_TO_MEM, + .arg2_type = ARG_PTR_TO_MEM | MEM_WRITE, .arg3_type = ARG_CONST_SIZE_OR_ZERO, .allowed = bpf_d_path_allowed, }; From 466bc5af9ce442253a8d24d74fe1545301b0f9e9 Mon Sep 17 00:00:00 2001 From: Shuran Liu Date: Sat, 6 Dec 2025 22:12:10 +0800 Subject: [PATCH 0039/1321] selftests/bpf: add regression test for bpf_d_path() Add a regression test for bpf_d_path() to cover incorrect verifier assumptions caused by an incorrect function prototype. The test attaches to the fallocate hook, calls bpf_d_path() and verifies that a simple prefix comparison on the returned pathname behaves correctly after the fix in patch 1. It ensures the verifier does not assume the buffer remains unwritten. Co-developed-by: Zesen Liu Signed-off-by: Zesen Liu Co-developed-by: Peili Gao Signed-off-by: Peili Gao Co-developed-by: Haoran Ni Signed-off-by: Haoran Ni Signed-off-by: Shuran Liu Link: https://lore.kernel.org/r/20251206141210.3148-3-electronlsr@gmail.com Signed-off-by: Alexei Starovoitov --- .../testing/selftests/bpf/prog_tests/d_path.c | 89 +++++++++++++++---- .../testing/selftests/bpf/progs/test_d_path.c | 23 +++++ 2 files changed, 95 insertions(+), 17 deletions(-) diff --git a/tools/testing/selftests/bpf/prog_tests/d_path.c b/tools/testing/selftests/bpf/prog_tests/d_path.c index ccc768592e66a..1a2a2f1abf033 100644 --- a/tools/testing/selftests/bpf/prog_tests/d_path.c +++ b/tools/testing/selftests/bpf/prog_tests/d_path.c @@ -38,6 +38,14 @@ static int set_pathname(int fd, pid_t pid) return readlink(buf, src.paths[src.cnt++], MAX_PATH_LEN); } +static inline long syscall_close(int fd) +{ + return syscall(__NR_close_range, + (unsigned int)fd, + (unsigned int)fd, + 0u); +} + static int trigger_fstat_events(pid_t pid) { int sockfd = -1, procfd = -1, devfd = -1; @@ -104,18 +112,34 @@ static int trigger_fstat_events(pid_t pid) /* sys_close no longer triggers filp_close, but we can * call sys_close_range instead which still does */ -#define close(fd) syscall(__NR_close_range, fd, fd, 0) + syscall_close(pipefd[0]); + syscall_close(pipefd[1]); + syscall_close(sockfd); + syscall_close(procfd); + syscall_close(devfd); + syscall_close(localfd); + syscall_close(indicatorfd); + return ret; +} - close(pipefd[0]); - close(pipefd[1]); - close(sockfd); - close(procfd); - close(devfd); - close(localfd); - close(indicatorfd); +static void attach_and_load(struct test_d_path **skel) +{ + int err; -#undef close - return ret; + *skel = test_d_path__open_and_load(); + if (CHECK(!*skel, "setup", "d_path skeleton failed\n")) + goto cleanup; + + err = test_d_path__attach(*skel); + if (CHECK(err, "setup", "attach failed: %d\n", err)) + goto cleanup; + + (*skel)->bss->my_pid = getpid(); + return; + +cleanup: + test_d_path__destroy(*skel); + *skel = NULL; } static void test_d_path_basic(void) @@ -124,16 +148,11 @@ static void test_d_path_basic(void) struct test_d_path *skel; int err; - skel = test_d_path__open_and_load(); - if (CHECK(!skel, "setup", "d_path skeleton failed\n")) - goto cleanup; - - err = test_d_path__attach(skel); - if (CHECK(err, "setup", "attach failed: %d\n", err)) + attach_and_load(&skel); + if (!skel) goto cleanup; bss = skel->bss; - bss->my_pid = getpid(); err = trigger_fstat_events(bss->my_pid); if (err < 0) @@ -195,6 +214,39 @@ static void test_d_path_check_types(void) test_d_path_check_types__destroy(skel); } +/* Check if the verifier correctly generates code for + * accessing the memory modified by d_path helper. + */ +static void test_d_path_mem_access(void) +{ + int localfd = -1; + char path_template[] = "/dev/shm/d_path_loadgen.XXXXXX"; + struct test_d_path__bss *bss; + struct test_d_path *skel; + + attach_and_load(&skel); + if (!skel) + goto cleanup; + + bss = skel->bss; + + localfd = mkstemp(path_template); + if (CHECK(localfd < 0, "trigger", "mkstemp failed\n")) + goto cleanup; + + if (CHECK(fallocate(localfd, 0, 0, 1024) < 0, "trigger", "fallocate failed\n")) + goto cleanup; + remove(path_template); + + if (CHECK(!bss->path_match_fallocate, "check", + "failed to read fallocate path")) + goto cleanup; + +cleanup: + syscall_close(localfd); + test_d_path__destroy(skel); +} + void test_d_path(void) { if (test__start_subtest("basic")) @@ -205,4 +257,7 @@ void test_d_path(void) if (test__start_subtest("check_alloc_mem")) test_d_path_check_types(); + + if (test__start_subtest("check_mem_access")) + test_d_path_mem_access(); } diff --git a/tools/testing/selftests/bpf/progs/test_d_path.c b/tools/testing/selftests/bpf/progs/test_d_path.c index 84e1f883f97bc..561b2f861808e 100644 --- a/tools/testing/selftests/bpf/progs/test_d_path.c +++ b/tools/testing/selftests/bpf/progs/test_d_path.c @@ -17,6 +17,7 @@ int rets_close[MAX_FILES] = {}; int called_stat = 0; int called_close = 0; +int path_match_fallocate = 0; SEC("fentry/security_inode_getattr") int BPF_PROG(prog_stat, struct path *path, struct kstat *stat, @@ -62,4 +63,26 @@ int BPF_PROG(prog_close, struct file *file, void *id) return 0; } +SEC("fentry/vfs_fallocate") +int BPF_PROG(prog_fallocate, struct file *file, int mode, loff_t offset, loff_t len) +{ + pid_t pid = bpf_get_current_pid_tgid() >> 32; + int ret = 0; + char path_fallocate[MAX_PATH_LEN] = {}; + + if (pid != my_pid) + return 0; + + ret = bpf_d_path(&file->f_path, + path_fallocate, MAX_PATH_LEN); + if (ret < 0) + return 0; + + if (!path_fallocate[0]) + return 0; + + path_match_fallocate = 1; + return 0; +} + char _license[] SEC("license") = "GPL"; From 5fb4c7232fc3a9250e69d926552f5f9f2791b9c7 Mon Sep 17 00:00:00 2001 From: Amir Goldstein Date: Sun, 7 Dec 2025 11:44:55 +0100 Subject: [PATCH 0040/1321] fsnotify: do not generate ACCESS/MODIFY events on child for special files inotify/fanotify do not allow users with no read access to a file to subscribe to events (e.g. IN_ACCESS/IN_MODIFY), but they do allow the same user to subscribe for watching events on children when the user has access to the parent directory (e.g. /dev). Users with no read access to a file but with read access to its parent directory can still stat the file and see if it was accessed/modified via atime/mtime change. The same is not true for special files (e.g. /dev/null). Users will not generally observe atime/mtime changes when other users read/write to special files, only when someone sets atime/mtime via utimensat(). Align fsnotify events with this stat behavior and do not generate ACCESS/MODIFY events to parent watchers on read/write of special files. The events are still generated to parent watchers on utimensat(). This closes some side-channels that could be possibly used for information exfiltration [1]. [1] https://snee.la/pdf/pubs/file-notification-attacks.pdf Reported-by: Sudheendra Raghav Neela CC: stable@vger.kernel.org Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara --- fs/notify/fsnotify.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c index d27ff5e5f165b..71bd44e5ab6da 100644 --- a/fs/notify/fsnotify.c +++ b/fs/notify/fsnotify.c @@ -270,8 +270,15 @@ int __fsnotify_parent(struct dentry *dentry, __u32 mask, const void *data, /* * Include parent/name in notification either if some notification * groups require parent info or the parent is interested in this event. + * The parent interest in ACCESS/MODIFY events does not apply to special + * files, where read/write are not on the filesystem of the parent and + * events can provide an undesirable side-channel for information + * exfiltration. */ - parent_interested = mask & p_mask & ALL_FSNOTIFY_EVENTS; + parent_interested = mask & p_mask & ALL_FSNOTIFY_EVENTS && + !(data_type == FSNOTIFY_EVENT_PATH && + d_is_special(dentry) && + (mask & (FS_ACCESS | FS_MODIFY))); if (parent_needed || parent_interested) { /* When notifying parent, child should be passed as data */ WARN_ON_ONCE(inode != fsnotify_data_inode(data, data_type)); From 69473c1540313d8f0ff4733fed906bbee0b9390e Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ahelenia=20Ziemia=C5=84ska?= Date: Mon, 8 Dec 2025 23:20:24 +0100 Subject: [PATCH 0041/1321] fs: send fsnotify_xattr()/IN_ATTRIB from vfs_fileattr_set()/chattr(1) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Currently it seems impossible to observe these changes to the file's attributes. It's useful to be able to do this to see when the file becomes immutable, for example, so emit IN_ATTRIB via fsnotify_xattr(), like when changing other inode attributes. Signed-off-by: Ahelenia Ziemiańska Link: https://patch.msgid.link/iyvn6qjotpu6cei5jdtsoibfcp6l6rgvn47cwgaucgtucpfy2s@tarta.nabijaczleweli.xyz Signed-off-by: Jan Kara --- fs/file_attr.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/fs/file_attr.c b/fs/file_attr.c index 4c4916632f11d..13cdb31a3e947 100644 --- a/fs/file_attr.c +++ b/fs/file_attr.c @@ -2,6 +2,7 @@ #include #include #include +#include #include #include #include @@ -298,6 +299,7 @@ int vfs_fileattr_set(struct mnt_idmap *idmap, struct dentry *dentry, err = inode->i_op->fileattr_set(idmap, dentry, fa); if (err) goto out; + fsnotify_xattr(dentry); } out: From 381d1896c03205de75be979f74b5dd8fc21f745a Mon Sep 17 00:00:00 2001 From: Bharath SM Date: Tue, 16 Dec 2025 21:26:05 +0530 Subject: [PATCH 0042/1321] smb: align durable reconnect v2 context to 8 byte boundary Add a 4-byte Pad to create_durable_handle_reconnect_v2 so the DH2C create context is 8 byte aligned. This avoids malformed CREATE contexts on reconnect. Recent change removed this Padding, adding it back. Fixes: 81a45de432c6 ("smb: move create_durable_handle_reconnect_v2 to common/smb2pdu.h") Signed-off-by: Bharath SM Reviewed-by: Paulo Alcantara (Red Hat) Signed-off-by: Steve French --- fs/smb/common/smb2pdu.h | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/smb/common/smb2pdu.h b/fs/smb/common/smb2pdu.h index 3c8d8a4e74393..95323df7274b6 100644 --- a/fs/smb/common/smb2pdu.h +++ b/fs/smb/common/smb2pdu.h @@ -1293,6 +1293,7 @@ struct create_durable_handle_reconnect_v2 { struct create_context_hdr ccontext; __u8 Name[8]; struct durable_reconnect_context_v2 dcontext; + __u8 Pad[4]; } __packed; /* See MS-SMB2 2.2.14.2.12 */ From c961dd89f9203353498cec2f714efbce7e704694 Mon Sep 17 00:00:00 2001 From: ZhangGuoDong Date: Tue, 2 Dec 2025 15:14:17 +0800 Subject: [PATCH 0043/1321] smb: move some SMB1 definitions into common/smb1pdu.h These definitions are only used by SMB1, so move them into the new common/smb1pdu.h. KSMBD only implements SMB_COM_NEGOTIATE, see MS-SMB2 3.3.5.2. Co-developed-by: ChenXiaoSong Signed-off-by: ChenXiaoSong Signed-off-by: ZhangGuoDong Signed-off-by: Steve French --- fs/smb/client/cifspdu.h | 2 +- fs/smb/common/smb1pdu.h | 56 ++++++++++++++++++++++++++++++++++++++ fs/smb/common/smb2pdu.h | 40 --------------------------- fs/smb/common/smbglob.h | 2 -- fs/smb/server/smb_common.h | 1 + 5 files changed, 58 insertions(+), 43 deletions(-) create mode 100644 fs/smb/common/smb1pdu.h diff --git a/fs/smb/client/cifspdu.h b/fs/smb/client/cifspdu.h index eeb4011cb217d..fdd84369e7b8b 100644 --- a/fs/smb/client/cifspdu.h +++ b/fs/smb/client/cifspdu.h @@ -12,7 +12,7 @@ #include #include #include "../common/smbfsctl.h" -#include "../common/smb2pdu.h" +#include "../common/smb1pdu.h" #define CIFS_PROT 0 #define POSIX_PROT (CIFS_PROT+1) diff --git a/fs/smb/common/smb1pdu.h b/fs/smb/common/smb1pdu.h new file mode 100644 index 0000000000000..df6d4e11ae929 --- /dev/null +++ b/fs/smb/common/smb1pdu.h @@ -0,0 +1,56 @@ +/* SPDX-License-Identifier: LGPL-2.1 */ +/* + * + * Copyright (C) International Business Machines Corp., 2002,2009 + * 2018 Samsung Electronics Co., Ltd. + * Author(s): Steve French + * Namjae Jeon + * + */ + +#ifndef _COMMON_SMB1_PDU_H +#define _COMMON_SMB1_PDU_H + +#define SMB1_PROTO_NUMBER cpu_to_le32(0x424d53ff) + +/* + * See MS-CIFS 2.2.3.1 + * MS-SMB 2.2.3.1 + */ +struct smb_hdr { + __u8 Protocol[4]; + __u8 Command; + union { + struct { + __u8 ErrorClass; + __u8 Reserved; + __le16 Error; + } __packed DosError; + __le32 CifsError; + } __packed Status; + __u8 Flags; + __le16 Flags2; /* note: le */ + __le16 PidHigh; + union { + struct { + __le32 SequenceNumber; /* le */ + __u32 Reserved; /* zero */ + } __packed Sequence; + __u8 SecuritySignature[8]; /* le */ + } __packed Signature; + __u8 pad[2]; + __u16 Tid; + __le16 Pid; + __u16 Uid; + __le16 Mid; + __u8 WordCount; +} __packed; + +/* See MS-CIFS 2.2.4.52.1 */ +typedef struct smb_negotiate_req { + struct smb_hdr hdr; /* wct = 0 */ + __le16 ByteCount; + unsigned char DialectsArray[]; +} __packed SMB_NEGOTIATE_REQ; + +#endif /* _COMMON_SMB1_PDU_H */ diff --git a/fs/smb/common/smb2pdu.h b/fs/smb/common/smb2pdu.h index 95323df7274b6..f5ebbe31384ae 100644 --- a/fs/smb/common/smb2pdu.h +++ b/fs/smb/common/smb2pdu.h @@ -1986,39 +1986,6 @@ struct smb2_lease_ack { __le64 LeaseDuration; } __packed; -/* - * See MS-CIFS 2.2.3.1 - * MS-SMB 2.2.3.1 - */ -struct smb_hdr { - __u8 Protocol[4]; - __u8 Command; - union { - struct { - __u8 ErrorClass; - __u8 Reserved; - __le16 Error; - } __packed DosError; - __le32 CifsError; - } __packed Status; - __u8 Flags; - __le16 Flags2; /* note: le */ - __le16 PidHigh; - union { - struct { - __le32 SequenceNumber; /* le */ - __u32 Reserved; /* zero */ - } __packed Sequence; - __u8 SecuritySignature[8]; /* le */ - } __packed Signature; - __u8 pad[2]; - __u16 Tid; - __le16 Pid; - __u16 Uid; - __le16 Mid; - __u8 WordCount; -} __packed; - #define OP_BREAK_STRUCT_SIZE_20 24 #define OP_BREAK_STRUCT_SIZE_21 36 @@ -2123,11 +2090,4 @@ struct smb_hdr { #define SET_MINIMUM_RIGHTS (FILE_READ_EA | FILE_READ_ATTRIBUTES \ | READ_CONTROL | SYNCHRONIZE) -/* See MS-CIFS 2.2.4.52.1 */ -typedef struct smb_negotiate_req { - struct smb_hdr hdr; /* wct = 0 */ - __le16 ByteCount; - unsigned char DialectsArray[]; -} __packed SMB_NEGOTIATE_REQ; - #endif /* _COMMON_SMB2PDU_H */ diff --git a/fs/smb/common/smbglob.h b/fs/smb/common/smbglob.h index 9562845a56175..4e33d91cdc9db 100644 --- a/fs/smb/common/smbglob.h +++ b/fs/smb/common/smbglob.h @@ -11,8 +11,6 @@ #ifndef _COMMON_SMB_GLOB_H #define _COMMON_SMB_GLOB_H -#define SMB1_PROTO_NUMBER cpu_to_le32(0x424d53ff) - struct smb_version_values { char *version_string; __u16 protocol_id; diff --git a/fs/smb/server/smb_common.h b/fs/smb/server/smb_common.h index 067b45048c732..95bf1465387b9 100644 --- a/fs/smb/server/smb_common.h +++ b/fs/smb/server/smb_common.h @@ -10,6 +10,7 @@ #include "glob.h" #include "../common/smbglob.h" +#include "../common/smb1pdu.h" #include "../common/smb2pdu.h" #include "../common/fscc.h" #include "smb2pdu.h" From 4bcf590436fa24ee962e491a325564833afe7efc Mon Sep 17 00:00:00 2001 From: Steve French Date: Sat, 13 Dec 2025 12:48:49 -0600 Subject: [PATCH 0044/1321] cifs: update internal module version number to 2.58 Signed-off-by: Steve French --- fs/smb/client/cifsfs.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/smb/client/cifsfs.h b/fs/smb/client/cifsfs.h index e9534258d1efd..75d372ceb6553 100644 --- a/fs/smb/client/cifsfs.h +++ b/fs/smb/client/cifsfs.h @@ -145,6 +145,6 @@ extern const struct export_operations cifs_export_ops; #endif /* CONFIG_CIFS_NFSD_EXPORT */ /* when changing internal version - update following two lines at same time */ -#define SMB3_PRODUCT_BUILD 57 -#define CIFS_VERSION "2.57" +#define SMB3_PRODUCT_BUILD 58 +#define CIFS_VERSION "2.58" #endif /* _CIFSFS_H */ From 33e33d2721f06d58332f48bfb750012f6fcaae37 Mon Sep 17 00:00:00 2001 From: Wei Fang Date: Fri, 28 Nov 2025 10:59:15 +0800 Subject: [PATCH 0045/1321] net: fec: ERR007885 Workaround for XDP TX path The ERR007885 will lead to a TDAR race condition for mutliQ when the driver sets TDAR and the UDMA clears TDAR simultaneously or in a small window (2-4 cycles). And it will cause the udma_tx and udma_tx_arbiter state machines to hang. Therefore, the commit 53bb20d1faba ("net: fec: add variable reg_desc_active to speed things up") and the commit a179aad12bad ("net: fec: ERR007885 Workaround for conventional TX") have added the workaround to fix the potential issue for the conventional TX path. Similarly, the XDP TX path should also have the potential hang issue, so add the workaround for XDP TX path. Fixes: 6d6b39f180b8 ("net: fec: add initial XDP support") Signed-off-by: Wei Fang Link: https://patch.msgid.link/20251128025915.2486943-1-wei.fang@nxp.com Signed-off-by: Paolo Abeni --- drivers/net/ethernet/freescale/fec_main.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c index c685a5c0cc51a..a753265961af5 100644 --- a/drivers/net/ethernet/freescale/fec_main.c +++ b/drivers/net/ethernet/freescale/fec_main.c @@ -3933,7 +3933,12 @@ static int fec_enet_txq_xmit_frame(struct fec_enet_private *fep, txq->bd.cur = bdp; /* Trigger transmission start */ - writel(0, txq->bd.reg_desc_active); + if (!(fep->quirks & FEC_QUIRK_ERR007885) || + !readl(txq->bd.reg_desc_active) || + !readl(txq->bd.reg_desc_active) || + !readl(txq->bd.reg_desc_active) || + !readl(txq->bd.reg_desc_active)) + writel(0, txq->bd.reg_desc_active); return 0; } From bfb3f204c70a81e0b239f2950398aea7864b13b3 Mon Sep 17 00:00:00 2001 From: Wang Liang Date: Sat, 29 Nov 2025 12:13:15 +0800 Subject: [PATCH 0046/1321] netrom: Fix memory leak in nr_sendmsg() syzbot reported a memory leak [1]. When function sock_alloc_send_skb() return NULL in nr_output(), the original skb is not freed, which was allocated in nr_sendmsg(). Fix this by freeing it before return. [1] BUG: memory leak unreferenced object 0xffff888129f35500 (size 240): comm "syz.0.17", pid 6119, jiffies 4294944652 hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 10 52 28 81 88 ff ff ..........R(.... backtrace (crc 1456a3e4): kmemleak_alloc_recursive include/linux/kmemleak.h:44 [inline] slab_post_alloc_hook mm/slub.c:4983 [inline] slab_alloc_node mm/slub.c:5288 [inline] kmem_cache_alloc_node_noprof+0x36f/0x5e0 mm/slub.c:5340 __alloc_skb+0x203/0x240 net/core/skbuff.c:660 alloc_skb include/linux/skbuff.h:1383 [inline] alloc_skb_with_frags+0x69/0x3f0 net/core/skbuff.c:6671 sock_alloc_send_pskb+0x379/0x3e0 net/core/sock.c:2965 sock_alloc_send_skb include/net/sock.h:1859 [inline] nr_sendmsg+0x287/0x450 net/netrom/af_netrom.c:1105 sock_sendmsg_nosec net/socket.c:727 [inline] __sock_sendmsg net/socket.c:742 [inline] sock_write_iter+0x293/0x2a0 net/socket.c:1195 new_sync_write fs/read_write.c:593 [inline] vfs_write+0x45d/0x710 fs/read_write.c:686 ksys_write+0x143/0x170 fs/read_write.c:738 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xa4/0xfa0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f Reported-by: syzbot+d7abc36bbbb6d7d40b58@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=d7abc36bbbb6d7d40b58 Tested-by: syzbot+d7abc36bbbb6d7d40b58@syzkaller.appspotmail.com Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Wang Liang Link: https://patch.msgid.link/20251129041315.1550766-1-wangliang74@huawei.com Signed-off-by: Paolo Abeni --- net/netrom/nr_out.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/net/netrom/nr_out.c b/net/netrom/nr_out.c index 5e531394a724b..2b3cbceb0b52d 100644 --- a/net/netrom/nr_out.c +++ b/net/netrom/nr_out.c @@ -43,8 +43,10 @@ void nr_output(struct sock *sk, struct sk_buff *skb) frontlen = skb_headroom(skb); while (skb->len > 0) { - if ((skbn = sock_alloc_send_skb(sk, frontlen + NR_MAX_PACKET_SIZE, 0, &err)) == NULL) + if ((skbn = sock_alloc_send_skb(sk, frontlen + NR_MAX_PACKET_SIZE, 0, &err)) == NULL) { + kfree_skb(skb); return; + } skb_reserve(skbn, frontlen); From b519e0d7d4a70b48b6c926cdf9e4a04cf1a02d78 Mon Sep 17 00:00:00 2001 From: Shaurya Rane Date: Sat, 29 Nov 2025 15:07:18 +0530 Subject: [PATCH 0047/1321] net/hsr: fix NULL pointer dereference in prp_get_untagged_frame() prp_get_untagged_frame() calls __pskb_copy() to create frame->skb_std but doesn't check if the allocation failed. If __pskb_copy() returns NULL, skb_clone() is called with a NULL pointer, causing a crash: Oops: general protection fault, probably for non-canonical address 0xdffffc000000000f: 0000 [#1] SMP KASAN NOPTI KASAN: null-ptr-deref in range [0x0000000000000078-0x000000000000007f] CPU: 0 UID: 0 PID: 5625 Comm: syz.1.18 Not tainted syzkaller #0 PREEMPT(full) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 RIP: 0010:skb_clone+0xd7/0x3a0 net/core/skbuff.c:2041 Code: 03 42 80 3c 20 00 74 08 4c 89 f7 e8 23 29 05 f9 49 83 3e 00 0f 85 a0 01 00 00 e8 94 dd 9d f8 48 8d 6b 7e 49 89 ee 49 c1 ee 03 <43> 0f b6 04 26 84 c0 0f 85 d1 01 00 00 44 0f b6 7d 00 41 83 e7 0c RSP: 0018:ffffc9000d00f200 EFLAGS: 00010207 RAX: ffffffff892235a1 RBX: 0000000000000000 RCX: ffff88803372a480 RDX: 0000000000000000 RSI: 0000000000000820 RDI: 0000000000000000 RBP: 000000000000007e R08: ffffffff8f7d0f77 R09: 1ffffffff1efa1ee R10: dffffc0000000000 R11: fffffbfff1efa1ef R12: dffffc0000000000 R13: 0000000000000820 R14: 000000000000000f R15: ffff88805144cc00 FS: 0000555557f6d500(0000) GS:ffff88808d72f000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000555581d35808 CR3: 000000005040e000 CR4: 0000000000352ef0 Call Trace: hsr_forward_do net/hsr/hsr_forward.c:-1 [inline] hsr_forward_skb+0x1013/0x2860 net/hsr/hsr_forward.c:741 hsr_handle_frame+0x6ce/0xa70 net/hsr/hsr_slave.c:84 __netif_receive_skb_core+0x10b9/0x4380 net/core/dev.c:5966 __netif_receive_skb_one_core net/core/dev.c:6077 [inline] __netif_receive_skb+0x72/0x380 net/core/dev.c:6192 netif_receive_skb_internal net/core/dev.c:6278 [inline] netif_receive_skb+0x1cb/0x790 net/core/dev.c:6337 tun_rx_batched+0x1b9/0x730 drivers/net/tun.c:1485 tun_get_user+0x2b65/0x3e90 drivers/net/tun.c:1953 tun_chr_write_iter+0x113/0x200 drivers/net/tun.c:1999 new_sync_write fs/read_write.c:593 [inline] vfs_write+0x5c9/0xb30 fs/read_write.c:686 ksys_write+0x145/0x250 fs/read_write.c:738 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xfa/0xfa0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f0449f8e1ff Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 f9 92 02 00 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 44 24 08 e8 4c 93 02 00 48 RSP: 002b:00007ffd7ad94c90 EFLAGS: 00000293 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 00007f044a1e5fa0 RCX: 00007f0449f8e1ff RDX: 000000000000003e RSI: 0000200000000500 RDI: 00000000000000c8 RBP: 00007ffd7ad94d20 R08: 0000000000000000 R09: 0000000000000000 R10: 000000000000003e R11: 0000000000000293 R12: 0000000000000001 R13: 00007f044a1e5fa0 R14: 00007f044a1e5fa0 R15: 0000000000000003 Add a NULL check immediately after __pskb_copy() to handle allocation failures gracefully. Reported-by: syzbot+2fa344348a579b779e05@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=2fa344348a579b779e05 Fixes: f266a683a480 ("net/hsr: Better frame dispatch") Cc: stable@vger.kernel.org Signed-off-by: Shaurya Rane Reviewed-by: Felix Maurer Tested-by: Felix Maurer Link: https://patch.msgid.link/20251129093718.25320-1-ssrane_b23@ee.vjti.ac.in Signed-off-by: Paolo Abeni --- net/hsr/hsr_forward.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/net/hsr/hsr_forward.c b/net/hsr/hsr_forward.c index 339f0d2202129..aefc9b6936ba0 100644 --- a/net/hsr/hsr_forward.c +++ b/net/hsr/hsr_forward.c @@ -205,6 +205,8 @@ struct sk_buff *prp_get_untagged_frame(struct hsr_frame_info *frame, __pskb_copy(frame->skb_prp, skb_headroom(frame->skb_prp), GFP_ATOMIC); + if (!frame->skb_std) + return NULL; } else { /* Unexpected */ WARN_ONCE(1, "%s:%d: Unexpected frame received (port_src %s)\n", From a617009c8ee5ac1a73448c937f3f720eb7f778c9 Mon Sep 17 00:00:00 2001 From: Jamal Hadi Salim Date: Fri, 28 Nov 2025 10:19:19 -0500 Subject: [PATCH 0048/1321] net/sched: ets: Always remove class from active list before deleting in ets_qdisc_change zdi-disclosures@trendmicro.com says: The vulnerability is a race condition between `ets_qdisc_dequeue` and `ets_qdisc_change`. It leads to UAF on `struct Qdisc` object. Attacker requires the capability to create new user and network namespace in order to trigger the bug. See my additional commentary at the end of the analysis. Analysis: static int ets_qdisc_change(struct Qdisc *sch, struct nlattr *opt, struct netlink_ext_ack *extack) { ... // (1) this lock is preventing .change handler (`ets_qdisc_change`) //to race with .dequeue handler (`ets_qdisc_dequeue`) sch_tree_lock(sch); for (i = nbands; i < oldbands; i++) { if (i >= q->nstrict && q->classes[i].qdisc->q.qlen) list_del_init(&q->classes[i].alist); qdisc_purge_queue(q->classes[i].qdisc); } WRITE_ONCE(q->nbands, nbands); for (i = nstrict; i < q->nstrict; i++) { if (q->classes[i].qdisc->q.qlen) { // (2) the class is added to the q->active list_add_tail(&q->classes[i].alist, &q->active); q->classes[i].deficit = quanta[i]; } } WRITE_ONCE(q->nstrict, nstrict); memcpy(q->prio2band, priomap, sizeof(priomap)); for (i = 0; i < q->nbands; i++) WRITE_ONCE(q->classes[i].quantum, quanta[i]); for (i = oldbands; i < q->nbands; i++) { q->classes[i].qdisc = queues[i]; if (q->classes[i].qdisc != &noop_qdisc) qdisc_hash_add(q->classes[i].qdisc, true); } // (3) the qdisc is unlocked, now dequeue can be called in parallel // to the rest of .change handler sch_tree_unlock(sch); ets_offload_change(sch); for (i = q->nbands; i < oldbands; i++) { // (4) we're reducing the refcount for our class's qdisc and // freeing it qdisc_put(q->classes[i].qdisc); // (5) If we call .dequeue between (4) and (5), we will have // a strong UAF and we can control RIP q->classes[i].qdisc = NULL; WRITE_ONCE(q->classes[i].quantum, 0); q->classes[i].deficit = 0; gnet_stats_basic_sync_init(&q->classes[i].bstats); memset(&q->classes[i].qstats, 0, sizeof(q->classes[i].qstats)); } return 0; } Comment: This happens because some of the classes have their qdiscs assigned to NULL, but remain in the active list. This commit fixes this issue by always removing the class from the active list before deleting and freeing its associated qdisc Reproducer Steps (trimmed version of what was sent by zdi-disclosures@trendmicro.com) ``` DEV="${DEV:-lo}" ROOT_HANDLE="${ROOT_HANDLE:-1:}" BAND2_HANDLE="${BAND2_HANDLE:-20:}" # child under 1:2 PING_BYTES="${PING_BYTES:-48}" PING_COUNT="${PING_COUNT:-200000}" PING_DST="${PING_DST:-127.0.0.1}" SLOW_TBF_RATE="${SLOW_TBF_RATE:-8bit}" SLOW_TBF_BURST="${SLOW_TBF_BURST:-100b}" SLOW_TBF_LAT="${SLOW_TBF_LAT:-1s}" cleanup() { tc qdisc del dev "$DEV" root 2>/dev/null } trap cleanup EXIT ip link set "$DEV" up tc qdisc del dev "$DEV" root 2>/dev/null || true tc qdisc add dev "$DEV" root handle "$ROOT_HANDLE" ets bands 2 strict 2 tc qdisc add dev "$DEV" parent 1:2 handle "$BAND2_HANDLE" \ tbf rate "$SLOW_TBF_RATE" burst "$SLOW_TBF_BURST" latency "$SLOW_TBF_LAT" tc filter add dev "$DEV" parent 1: protocol all prio 1 u32 match u32 0 0 flowid 1:2 tc -s qdisc ls dev $DEV ping -I "$DEV" -f -c "$PING_COUNT" -s "$PING_BYTES" -W 0.001 "$PING_DST" \ >/dev/null 2>&1 & tc qdisc change dev "$DEV" root handle "$ROOT_HANDLE" ets bands 2 strict 0 tc qdisc change dev "$DEV" root handle "$ROOT_HANDLE" ets bands 2 strict 2 tc -s qdisc ls dev $DEV tc qdisc del dev "$DEV" parent 1:2 || true tc -s qdisc ls dev $DEV tc qdisc change dev "$DEV" root handle "$ROOT_HANDLE" ets bands 1 strict 1 ``` KASAN report ``` ================================================================== BUG: KASAN: slab-use-after-free in ets_qdisc_dequeue+0x1071/0x11b0 kernel/net/sched/sch_ets.c:481 Read of size 8 at addr ffff8880502fc018 by task ping/12308 > CPU: 0 UID: 0 PID: 12308 Comm: ping Not tainted 6.18.0-rc4-dirty #1 PREEMPT(full) Hardware name: QEMU Ubuntu 25.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 Call Trace: __dump_stack kernel/lib/dump_stack.c:94 dump_stack_lvl+0x100/0x190 kernel/lib/dump_stack.c:120 print_address_description kernel/mm/kasan/report.c:378 print_report+0x156/0x4c9 kernel/mm/kasan/report.c:482 kasan_report+0xdf/0x110 kernel/mm/kasan/report.c:595 ets_qdisc_dequeue+0x1071/0x11b0 kernel/net/sched/sch_ets.c:481 dequeue_skb kernel/net/sched/sch_generic.c:294 qdisc_restart kernel/net/sched/sch_generic.c:399 __qdisc_run+0x1c9/0x1b00 kernel/net/sched/sch_generic.c:417 __dev_xmit_skb kernel/net/core/dev.c:4221 __dev_queue_xmit+0x2848/0x4410 kernel/net/core/dev.c:4729 dev_queue_xmit kernel/./include/linux/netdevice.h:3365 [...] Allocated by task 17115: kasan_save_stack+0x30/0x50 kernel/mm/kasan/common.c:56 kasan_save_track+0x14/0x30 kernel/mm/kasan/common.c:77 poison_kmalloc_redzone kernel/mm/kasan/common.c:400 __kasan_kmalloc+0xaa/0xb0 kernel/mm/kasan/common.c:417 kasan_kmalloc kernel/./include/linux/kasan.h:262 __do_kmalloc_node kernel/mm/slub.c:5642 __kmalloc_node_noprof+0x34e/0x990 kernel/mm/slub.c:5648 kmalloc_node_noprof kernel/./include/linux/slab.h:987 qdisc_alloc+0xb8/0xc30 kernel/net/sched/sch_generic.c:950 qdisc_create_dflt+0x93/0x490 kernel/net/sched/sch_generic.c:1012 ets_class_graft+0x4fd/0x800 kernel/net/sched/sch_ets.c:261 qdisc_graft+0x3e4/0x1780 kernel/net/sched/sch_api.c:1196 [...] Freed by task 9905: kasan_save_stack+0x30/0x50 kernel/mm/kasan/common.c:56 kasan_save_track+0x14/0x30 kernel/mm/kasan/common.c:77 __kasan_save_free_info+0x3b/0x70 kernel/mm/kasan/generic.c:587 kasan_save_free_info kernel/mm/kasan/kasan.h:406 poison_slab_object kernel/mm/kasan/common.c:252 __kasan_slab_free+0x5f/0x80 kernel/mm/kasan/common.c:284 kasan_slab_free kernel/./include/linux/kasan.h:234 slab_free_hook kernel/mm/slub.c:2539 slab_free kernel/mm/slub.c:6630 kfree+0x144/0x700 kernel/mm/slub.c:6837 rcu_do_batch kernel/kernel/rcu/tree.c:2605 rcu_core+0x7c0/0x1500 kernel/kernel/rcu/tree.c:2861 handle_softirqs+0x1ea/0x8a0 kernel/kernel/softirq.c:622 __do_softirq kernel/kernel/softirq.c:656 [...] Commentary: 1. Maher Azzouzi working with Trend Micro Zero Day Initiative was reported as the person who found the issue. I requested to get a proper email to add to the reported-by tag but got no response. For this reason i will credit the person i exchanged emails with i.e zdi-disclosures@trendmicro.com 2. Neither i nor Victor who did a much more thorough testing was able to reproduce a UAF with the PoC or other approaches we tried. We were both able to reproduce a null ptr deref. After exchange with zdi-disclosures@trendmicro.com they sent a small change to be made to the code to add an extra delay which was able to simulate the UAF. i.e, this: qdisc_put(q->classes[i].qdisc); mdelay(90); q->classes[i].qdisc = NULL; I was informed by Thomas Gleixner(tglx@linutronix.de) that adding delays was acceptable approach for demonstrating the bug, quote: "Adding such delays is common exploit validation practice" The equivalent delay could happen "by virt scheduling the vCPU out, SMIs, NMIs, PREEMPT_RT enabled kernel" 3. I asked the OP to test and report back but got no response and after a few days gave up and proceeded to submit this fix. Fixes: de6d25924c2a ("net/sched: sch_ets: don't peek at classes beyond 'nbands'") Reported-by: zdi-disclosures@trendmicro.com Tested-by: Victor Nogueira Signed-off-by: Jamal Hadi Salim Reviewed-by: Davide Caratti Link: https://patch.msgid.link/20251128151919.576920-1-jhs@mojatatu.com Signed-off-by: Paolo Abeni --- net/sched/sch_ets.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/sched/sch_ets.c b/net/sched/sch_ets.c index 82635dd2cfa59..ae46643e596d3 100644 --- a/net/sched/sch_ets.c +++ b/net/sched/sch_ets.c @@ -652,7 +652,7 @@ static int ets_qdisc_change(struct Qdisc *sch, struct nlattr *opt, sch_tree_lock(sch); for (i = nbands; i < oldbands; i++) { - if (i >= q->nstrict && q->classes[i].qdisc->q.qlen) + if (cl_is_active(&q->classes[i])) list_del_init(&q->classes[i].alist); qdisc_purge_queue(q->classes[i].qdisc); } From 66c71b7e0f80cdfb63198bdd38b6eaeb1f7f575b Mon Sep 17 00:00:00 2001 From: Moshe Shemesh Date: Mon, 1 Dec 2025 17:13:27 +0200 Subject: [PATCH 0049/1321] net/mlx5: make enable_mpesw idempotent The enable_mpesw() function returns -EINVAL if ldev->mode is not MLX5_LAG_MODE_NONE. This means attempting to enable MPESW mode when it's already enabled will fail. In contrast, disable_mpesw() properly checks if the mode is MLX5_LAG_MODE_MPESW before proceeding, making it naturally idempotent and safe to call multiple times. Fix enable_mpesw() to return success if mpesw is already enabled. Fixes: a32327a3a02c ("net/mlx5: Lag, Control MultiPort E-Switch single FDB mode") Signed-off-by: Moshe Shemesh Reviewed-by: Shay Drori Signed-off-by: Tariq Toukan Reviewed-by: Simon Horman Link: https://patch.msgid.link/1764602008-1334866-2-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni --- drivers/net/ethernet/mellanox/mlx5/core/lag/mpesw.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/mpesw.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/mpesw.c index aad52d3a90e68..2d86af8f0d9b8 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lag/mpesw.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/mpesw.c @@ -67,12 +67,19 @@ static int mlx5_mpesw_metadata_set(struct mlx5_lag *ldev) static int enable_mpesw(struct mlx5_lag *ldev) { - int idx = mlx5_lag_get_dev_index_by_seq(ldev, MLX5_LAG_P1); struct mlx5_core_dev *dev0; int err; + int idx; int i; - if (idx < 0 || ldev->mode != MLX5_LAG_MODE_NONE) + if (ldev->mode == MLX5_LAG_MODE_MPESW) + return 0; + + if (ldev->mode != MLX5_LAG_MODE_NONE) + return -EINVAL; + + idx = mlx5_lag_get_dev_index_by_seq(ldev, MLX5_LAG_P1); + if (idx < 0) return -EINVAL; dev0 = ldev->pf[idx].dev; From 43a08b372b7d3518677c8d17e5211fe2da9464a9 Mon Sep 17 00:00:00 2001 From: Cosmin Ratiu Date: Mon, 1 Dec 2025 17:13:28 +0200 Subject: [PATCH 0050/1321] net/mlx5e: Avoid unregistering PSP twice PSP is unregistered twice in: _mlx5e_remove -> mlx5e_psp_unregister mlx5e_nic_cleanup -> mlx5e_psp_unregister This leads to a refcount underflow in some conditions: ------------[ cut here ]------------ refcount_t: underflow; use-after-free. WARNING: CPU: 2 PID: 1694 at lib/refcount.c:28 refcount_warn_saturate+0xd8/0xe0 [...] mlx5e_psp_unregister+0x26/0x50 [mlx5_core] mlx5e_nic_cleanup+0x26/0x90 [mlx5_core] mlx5e_remove+0xe6/0x1f0 [mlx5_core] auxiliary_bus_remove+0x18/0x30 device_release_driver_internal+0x194/0x1f0 bus_remove_device+0xc6/0x130 device_del+0x159/0x3c0 mlx5_rescan_drivers_locked+0xbc/0x2a0 [mlx5_core] [...] Do not directly remove psp from the _mlx5e_remove path, the PSP cleanup happens as part of profile cleanup. Fixes: 89ee2d92f66c ("net/mlx5e: Support PSP offload functionality") Signed-off-by: Cosmin Ratiu Reviewed-by: Dragos Tatulea Signed-off-by: Tariq Toukan Reviewed-by: Simon Horman Link: https://patch.msgid.link/1764602008-1334866-3-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni --- drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index 6168f08144148..07fc4d2c8fadd 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -6825,7 +6825,6 @@ static void _mlx5e_remove(struct auxiliary_device *adev) * is already unregistered before changing to NIC profile. */ if (priv->netdev->reg_state == NETREG_REGISTERED) { - mlx5e_psp_unregister(priv); unregister_netdev(priv->netdev); _mlx5e_suspend(adev, false); } else { From 4804a4aa4dc20f062436728fcab853e11590414f Mon Sep 17 00:00:00 2001 From: Ivan Galkin Date: Tue, 2 Dec 2025 10:07:42 +0100 Subject: [PATCH 0051/1321] net: phy: RTL8211FVD: Restore disabling of PHY-mode EEE When support for RTL8211F(D)(I)-VD-CG was introduced in commit bb726b753f75 ("net: phy: realtek: add support for RTL8211F(D)(I)-VD-CG") the implementation assumed that this PHY model doesn't have the control register PHYCR2 (Page 0xa43 Address 0x19). This assumption was based on the differences in CLKOUT configurations between RTL8211FVD and the remaining RTL8211F PHYs. In the latter commit 2c67301584f2 ("net: phy: realtek: Avoid PHYCR2 access if PHYCR2 not present") this assumption was expanded to the PHY-mode EEE. I performed tests on RTL8211FI-VD-CG and confirmed that disabling PHY-mode EEE works correctly and is uniform with other PHYs supported by the driver. To validate the correctness, I contacted Realtek support. Realtek confirmed that PHY-mode EEE on RTL8211F(D)(I)-VD-CG is configured via Page 0xa43 Address 0x19 bit 5. Moreover, Realtek informed me that the most recent datasheet for RTL8211F(D)(I)-VD-CG v1.1 is incomplete and the naming of control registers is partly inconsistent. The errata I received from Realtek corrects the naming as follows: | Register | Datasheet v1.1 | Errata | |-------------------------|----------------|--------| | Page 0xa44 Address 0x11 | PHYCR2 | PHYCR3 | | Page 0xa43 Address 0x19 | N/A | PHYCR2 | This information confirms that the supposedly missing control register, PHYCR2, exists in the RTL8211F(D)(I)-VD-CG under the same address and the same name. It controls widely the same configs as other PHYs from the RTL8211F series (e.g. PHY-mode EEE). Clock out configuration is an exception. Given all this information, restore disabling of the PHY-mode EEE. Fixes: 2c67301584f2 ("net: phy: realtek: Avoid PHYCR2 access if PHYCR2 not present") Signed-off-by: Ivan Galkin Reviewed-by: Vladimir Oltean Link: https://patch.msgid.link/20251202-phy_eee-v1-1-fe0bf6ab3df0@axis.com Signed-off-by: Paolo Abeni --- drivers/net/phy/realtek/realtek_main.c | 4 ---- 1 file changed, 4 deletions(-) diff --git a/drivers/net/phy/realtek/realtek_main.c b/drivers/net/phy/realtek/realtek_main.c index 67ecf3d4af2b1..6ff0385201a57 100644 --- a/drivers/net/phy/realtek/realtek_main.c +++ b/drivers/net/phy/realtek/realtek_main.c @@ -691,10 +691,6 @@ static int rtl8211f_config_aldps(struct phy_device *phydev) static int rtl8211f_config_phy_eee(struct phy_device *phydev) { - /* RTL8211FVD has no PHYCR2 register */ - if (phydev->drv->phy_id == RTL_8211FVD_PHYID) - return 0; - /* Disable PHY-mode EEE so LPI is passed to the MAC */ return phy_modify_paged(phydev, RTL8211F_PHYCR_PAGE, RTL8211F_PHYCR2, RTL8211F_PHYCR2_PHY_EEE_ENABLE, 0); From 83ba06c60742a05934c3031a1915021e4290484e Mon Sep 17 00:00:00 2001 From: Daniel Golle Date: Tue, 2 Dec 2025 09:57:21 +0000 Subject: [PATCH 0052/1321] net: dsa: mxl-gsw1xx: fix SerDes RX polarity According to MaxLinear engineer Benny Weng the RX lane of the SerDes port of the GSW1xx switches is inverted in hardware, and the SGMII_PHY_RX0_CFG2_INVERT bit is set by default in order to compensate for that. Hence also set the SGMII_PHY_RX0_CFG2_INVERT bit by default in gsw1xx_pcs_reset(). Fixes: 22335939ec90 ("net: dsa: add driver for MaxLinear GSW1xx switch family") Reported-by: Rasmus Villemoes Signed-off-by: Daniel Golle Reviewed-by: Vladimir Oltean Link: https://patch.msgid.link/ca10e9f780c0152ecf9ae8cbac5bf975802e8f99.1764668951.git.daniel@makrotopia.org Signed-off-by: Paolo Abeni --- drivers/net/dsa/lantiq/mxl-gsw1xx.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/net/dsa/lantiq/mxl-gsw1xx.c b/drivers/net/dsa/lantiq/mxl-gsw1xx.c index 0816c61a47f12..cf33a16fd183b 100644 --- a/drivers/net/dsa/lantiq/mxl-gsw1xx.c +++ b/drivers/net/dsa/lantiq/mxl-gsw1xx.c @@ -255,10 +255,16 @@ static int gsw1xx_pcs_reset(struct gsw1xx_priv *priv) FIELD_PREP(GSW1XX_SGMII_PHY_RX0_CFG2_FILT_CNT, GSW1XX_SGMII_PHY_RX0_CFG2_FILT_CNT_DEF); - /* TODO: Take care of inverted RX pair once generic property is + /* RX lane seems to be inverted internally, so bit + * GSW1XX_SGMII_PHY_RX0_CFG2_INVERT needs to be set for normal + * (ie. non-inverted) operation. + * + * TODO: Take care of inverted RX pair once generic property is * available */ + val |= GSW1XX_SGMII_PHY_RX0_CFG2_INVERT; + ret = regmap_write(priv->sgmii, GSW1XX_SGMII_PHY_RX0_CFG2, val); if (ret < 0) return ret; From 12e2a1d1fae792652717567554f96a65e2ab0955 Mon Sep 17 00:00:00 2001 From: Dmitry Skorodumov Date: Tue, 2 Dec 2025 13:39:03 +0300 Subject: [PATCH 0053/1321] ipvlan: Ignore PACKET_LOOPBACK in handle_mode_l2() Packets with pkt_type == PACKET_LOOPBACK are captured by handle_frame() function, but they don't have L2 header. We should not process them in handle_mode_l2(). This doesn't affect old L2 functionality, since handling was anyway incorrect. Handle them the same way as in br_handle_frame(): just pass the skb. To observe invalid behaviour, just start "ping -b" on bcast address of port-interface. Fixes: 2ad7bf363841 ("ipvlan: Initial check-in of the IPVLAN driver.") Signed-off-by: Dmitry Skorodumov Link: https://patch.msgid.link/20251202103906.4087675-1-skorodumov.dmitry@huawei.com Signed-off-by: Paolo Abeni --- drivers/net/ipvlan/ipvlan_core.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/ipvlan/ipvlan_core.c b/drivers/net/ipvlan/ipvlan_core.c index dea411e132dba..2efa3ba148aa7 100644 --- a/drivers/net/ipvlan/ipvlan_core.c +++ b/drivers/net/ipvlan/ipvlan_core.c @@ -737,6 +737,9 @@ static rx_handler_result_t ipvlan_handle_mode_l2(struct sk_buff **pskb, struct ethhdr *eth = eth_hdr(skb); rx_handler_result_t ret = RX_HANDLER_PASS; + if (unlikely(skb->pkt_type == PACKET_LOOPBACK)) + return RX_HANDLER_PASS; + if (is_multicast_ether_addr(eth->h_dest)) { if (ipvlan_external_frame(skb, port)) { struct sk_buff *nskb = skb_clone(skb, GFP_ATOMIC); From 05ca33e228e0eab42d19a53d13716bedac9603e3 Mon Sep 17 00:00:00 2001 From: Gerd Bayer Date: Tue, 2 Dec 2025 12:12:57 +0100 Subject: [PATCH 0054/1321] net/mlx5: Fix double unregister of HCA_PORTS component Clear hca_devcom_comp in device's private data after unregistering it in LAG teardown. Otherwise a slightly lagging second pass through mlx5_unload_one() might try to unregister it again and trip over use-after-free. On s390 almost all PCI level recovery events trigger two passes through mxl5_unload_one() - one through the poll_health() method and one through mlx5_pci_err_detected() as callback from generic PCI error recovery. While testing PCI error recovery paths with more kernel debug features enabled, this issue reproducibly led to kernel panics with the following call chain: Unable to handle kernel pointer dereference in virtual kernel address space Failing address: 6b6b6b6b6b6b6000 TEID: 6b6b6b6b6b6b6803 ESOP-2 FSI Fault in home space mode while using kernel ASCE. AS:00000000705c4007 R3:0000000000000024 Oops: 0038 ilc:3 [#1]SMP CPU: 14 UID: 0 PID: 156 Comm: kmcheck Kdump: loaded Not tainted 6.18.0-20251130.rc7.git0.16131a59cab1.300.fc43.s390x+debug #1 PREEMPT Krnl PSW : 0404e00180000000 0000020fc86aa1dc (__lock_acquire+0x5c/0x15f0) R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3 Krnl GPRS: 0000000000000000 0000020f00000001 6b6b6b6b6b6b6c33 0000000000000000 0000000000000000 0000000000000000 0000000000000001 0000000000000000 0000000000000000 0000020fca28b820 0000000000000000 0000010a1ced8100 0000010a1ced8100 0000020fc9775068 0000018fce14f8b8 0000018fce14f7f8 Krnl Code: 0000020fc86aa1cc: e3b003400004 lg %r11,832 0000020fc86aa1d2: a7840211 brc 8,0000020fc86aa5f4 *0000020fc86aa1d6: c09000df0b25 larl %r9,0000020fca28b820 >0000020fc86aa1dc: d50790002000 clc 0(8,%r9),0(%r2) 0000020fc86aa1e2: a7840209 brc 8,0000020fc86aa5f4 0000020fc86aa1e6: c0e001100401 larl %r14,0000020fca8aa9e8 0000020fc86aa1ec: c01000e25a00 larl %r1,0000020fca2f55ec 0000020fc86aa1f2: a7eb00e8 aghi %r14,232 Call Trace: __lock_acquire+0x5c/0x15f0 lock_acquire.part.0+0xf8/0x270 lock_acquire+0xb0/0x1b0 down_write+0x5a/0x250 mlx5_detach_device+0x42/0x110 [mlx5_core] mlx5_unload_one_devl_locked+0x50/0xc0 [mlx5_core] mlx5_unload_one+0x42/0x60 [mlx5_core] mlx5_pci_err_detected+0x94/0x150 [mlx5_core] zpci_event_attempt_error_recovery+0xcc/0x388 Fixes: 5a977b5833b7 ("net/mlx5: Lag, move devcom registration to LAG layer") Signed-off-by: Gerd Bayer Reviewed-by: Moshe Shemesh Acked-by: Tariq Toukan Link: https://patch.msgid.link/20251202-fix_lag-v1-1-59e8177ffce0@linux.ibm.com Signed-off-by: Paolo Abeni --- drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c index 1ac933cd8f02b..a459a30f36cae 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c @@ -1413,6 +1413,7 @@ static int __mlx5_lag_dev_add_mdev(struct mlx5_core_dev *dev) static void mlx5_lag_unregister_hca_devcom_comp(struct mlx5_core_dev *dev) { mlx5_devcom_unregister_component(dev->priv.hca_devcom_comp); + dev->priv.hca_devcom_comp = NULL; } static int mlx5_lag_register_hca_devcom_comp(struct mlx5_core_dev *dev) From e6cc74438ff40f45c8edf9e33d44f3ba51f0eb95 Mon Sep 17 00:00:00 2001 From: Thorsten Blum Date: Tue, 2 Dec 2025 18:27:44 +0100 Subject: [PATCH 0055/1321] net: phy: marvell-88q2xxx: Fix clamped value in mv88q2xxx_hwmon_write The local variable 'val' was never clamped to -75000 or 180000 because the return value of clamp_val() was not used. Fix this by assigning the clamped value back to 'val', and use clamp() instead of clamp_val(). Cc: stable@vger.kernel.org Fixes: a557a92e6881 ("net: phy: marvell-88q2xxx: add support for temperature sensor") Signed-off-by: Thorsten Blum Reviewed-by: Dimitri Fedrau Reviewed-by: Andrew Lunn Link: https://patch.msgid.link/20251202172743.453055-3-thorsten.blum@linux.dev Signed-off-by: Jakub Kicinski --- drivers/net/phy/marvell-88q2xxx.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/phy/marvell-88q2xxx.c b/drivers/net/phy/marvell-88q2xxx.c index f3d83b04c9535..201dee1a16985 100644 --- a/drivers/net/phy/marvell-88q2xxx.c +++ b/drivers/net/phy/marvell-88q2xxx.c @@ -698,7 +698,7 @@ static int mv88q2xxx_hwmon_write(struct device *dev, switch (attr) { case hwmon_temp_max: - clamp_val(val, -75000, 180000); + val = clamp(val, -75000, 180000); val = (val / 1000) + 75; val = FIELD_PREP(MDIO_MMD_PCS_MV_TEMP_SENSOR3_INT_THRESH_MASK, val); From 346cc4cf3c7427b6182358eee38e87165c2c41d2 Mon Sep 17 00:00:00 2001 From: Ido Schimmel Date: Tue, 2 Dec 2025 18:44:11 +0100 Subject: [PATCH 0056/1321] mlxsw: spectrum_router: Fix possible neighbour reference count leak mlxsw_sp_router_schedule_work() takes a reference on a neighbour, expecting a work item to release it later on. However, we might fail to schedule the work item, in which case the neighbour reference count will be leaked. Fix by taking the reference just before scheduling the work item. Note that mlxsw_sp_router_schedule_work() can receive a NULL neighbour pointer, but neigh_clone() handles that correctly. Spotted during code review, did not actually observe the reference count leak. Fixes: 151b89f6025a ("mlxsw: spectrum_router: Reuse work neighbor initialization in work scheduler") Reviewed-by: Petr Machata Signed-off-by: Ido Schimmel Signed-off-by: Petr Machata Reviewed-by: Simon Horman Link: https://patch.msgid.link/ec2934ae4aca187a8d8c9329a08ce93cca411378.1764695650.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c index a2033837182e8..f4e9ecaeb104f 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c @@ -2858,6 +2858,11 @@ static int mlxsw_sp_router_schedule_work(struct net *net, if (!net_work) return NOTIFY_BAD; + /* Take a reference to ensure the neighbour won't be destructed until + * we drop the reference in the work item. + */ + neigh_clone(n); + INIT_WORK(&net_work->work, cb); net_work->mlxsw_sp = router->mlxsw_sp; net_work->n = n; @@ -2881,11 +2886,6 @@ static int mlxsw_sp_router_schedule_neigh_work(struct mlxsw_sp_router *router, struct net *net; net = neigh_parms_net(n->parms); - - /* Take a reference to ensure the neighbour won't be destructed until we - * drop the reference in delayed work. - */ - neigh_clone(n); return mlxsw_sp_router_schedule_work(net, router, n, mlxsw_sp_router_neigh_event_work); } From a874ec5a99736890000ca87273b053880799dc5d Mon Sep 17 00:00:00 2001 From: Ido Schimmel Date: Tue, 2 Dec 2025 18:44:12 +0100 Subject: [PATCH 0057/1321] mlxsw: spectrum_router: Fix neighbour use-after-free We sometimes observe use-after-free when dereferencing a neighbour [1]. The problem seems to be that the driver stores a pointer to the neighbour, but without holding a reference on it. A reference is only taken when the neighbour is used by a nexthop. Fix by simplifying the reference counting scheme. Always take a reference when storing a neighbour pointer in a neighbour entry. Avoid taking a referencing when the neighbour is used by a nexthop as the neighbour entry associated with the nexthop already holds a reference. Tested by running the test that uncovered the problem over 300 times. Without this patch the problem was reproduced after a handful of iterations. [1] BUG: KASAN: slab-use-after-free in mlxsw_sp_neigh_entry_update+0x2d4/0x310 Read of size 8 at addr ffff88817f8e3420 by task ip/3929 CPU: 3 UID: 0 PID: 3929 Comm: ip Not tainted 6.18.0-rc4-virtme-g36b21a067510 #3 PREEMPT(full) Hardware name: Nvidia SN5600/VMOD0013, BIOS 5.13 05/31/2023 Call Trace: dump_stack_lvl+0x6f/0xa0 print_address_description.constprop.0+0x6e/0x300 print_report+0xfc/0x1fb kasan_report+0xe4/0x110 mlxsw_sp_neigh_entry_update+0x2d4/0x310 mlxsw_sp_router_rif_gone_sync+0x35f/0x510 mlxsw_sp_rif_destroy+0x1ea/0x730 mlxsw_sp_inetaddr_port_vlan_event+0xa1/0x1b0 __mlxsw_sp_inetaddr_lag_event+0xcc/0x130 __mlxsw_sp_inetaddr_event+0xf5/0x3c0 mlxsw_sp_router_netdevice_event+0x1015/0x1580 notifier_call_chain+0xcc/0x150 call_netdevice_notifiers_info+0x7e/0x100 __netdev_upper_dev_unlink+0x10b/0x210 netdev_upper_dev_unlink+0x79/0xa0 vrf_del_slave+0x18/0x50 do_set_master+0x146/0x7d0 do_setlink.isra.0+0x9a0/0x2880 rtnl_newlink+0x637/0xb20 rtnetlink_rcv_msg+0x6fe/0xb90 netlink_rcv_skb+0x123/0x380 netlink_unicast+0x4a3/0x770 netlink_sendmsg+0x75b/0xc90 __sock_sendmsg+0xbe/0x160 ____sys_sendmsg+0x5b2/0x7d0 ___sys_sendmsg+0xfd/0x180 __sys_sendmsg+0x124/0x1c0 do_syscall_64+0xbb/0xfd0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 [...] Allocated by task 109: kasan_save_stack+0x30/0x50 kasan_save_track+0x14/0x30 __kasan_kmalloc+0x7b/0x90 __kmalloc_noprof+0x2c1/0x790 neigh_alloc+0x6af/0x8f0 ___neigh_create+0x63/0xe90 mlxsw_sp_nexthop_neigh_init+0x430/0x7e0 mlxsw_sp_nexthop_type_init+0x212/0x960 mlxsw_sp_nexthop6_group_info_init.constprop.0+0x81f/0x1280 mlxsw_sp_nexthop6_group_get+0x392/0x6a0 mlxsw_sp_fib6_entry_create+0x46a/0xfd0 mlxsw_sp_router_fib6_replace+0x1ed/0x5f0 mlxsw_sp_router_fib6_event_work+0x10a/0x2a0 process_one_work+0xd57/0x1390 worker_thread+0x4d6/0xd40 kthread+0x355/0x5b0 ret_from_fork+0x1d4/0x270 ret_from_fork_asm+0x11/0x20 Freed by task 154: kasan_save_stack+0x30/0x50 kasan_save_track+0x14/0x30 __kasan_save_free_info+0x3b/0x60 __kasan_slab_free+0x43/0x70 kmem_cache_free_bulk.part.0+0x1eb/0x5e0 kvfree_rcu_bulk+0x1f2/0x260 kfree_rcu_work+0x130/0x1b0 process_one_work+0xd57/0x1390 worker_thread+0x4d6/0xd40 kthread+0x355/0x5b0 ret_from_fork+0x1d4/0x270 ret_from_fork_asm+0x11/0x20 Last potentially related work creation: kasan_save_stack+0x30/0x50 kasan_record_aux_stack+0x8c/0xa0 kvfree_call_rcu+0x93/0x5b0 mlxsw_sp_router_neigh_event_work+0x67d/0x860 process_one_work+0xd57/0x1390 worker_thread+0x4d6/0xd40 kthread+0x355/0x5b0 ret_from_fork+0x1d4/0x270 ret_from_fork_asm+0x11/0x20 Fixes: 6cf3c971dc84 ("mlxsw: spectrum_router: Add private neigh table") Signed-off-by: Ido Schimmel Reviewed-by: Petr Machata Signed-off-by: Petr Machata Reviewed-by: Simon Horman Link: https://patch.msgid.link/92d75e21d95d163a41b5cea67a15cd33f547cba6.1764695650.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski --- .../ethernet/mellanox/mlxsw/spectrum_router.c | 17 +++++++++-------- 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c index f4e9ecaeb104f..2d0e89bd2fb9c 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c @@ -2265,6 +2265,7 @@ mlxsw_sp_neigh_entry_alloc(struct mlxsw_sp *mlxsw_sp, struct neighbour *n, if (!neigh_entry) return NULL; + neigh_hold(n); neigh_entry->key.n = n; neigh_entry->rif = rif; INIT_LIST_HEAD(&neigh_entry->nexthop_list); @@ -2274,6 +2275,7 @@ mlxsw_sp_neigh_entry_alloc(struct mlxsw_sp *mlxsw_sp, struct neighbour *n, static void mlxsw_sp_neigh_entry_free(struct mlxsw_sp_neigh_entry *neigh_entry) { + neigh_release(neigh_entry->key.n); kfree(neigh_entry); } @@ -4320,6 +4322,8 @@ mlxsw_sp_nexthop_dead_neigh_replace(struct mlxsw_sp *mlxsw_sp, if (err) goto err_neigh_entry_insert; + neigh_release(old_n); + read_lock_bh(&n->lock); nud_state = n->nud_state; dead = n->dead; @@ -4328,14 +4332,10 @@ mlxsw_sp_nexthop_dead_neigh_replace(struct mlxsw_sp *mlxsw_sp, list_for_each_entry(nh, &neigh_entry->nexthop_list, neigh_list_node) { - neigh_release(old_n); - neigh_clone(n); __mlxsw_sp_nexthop_neigh_update(nh, !entry_connected); mlxsw_sp_nexthop_group_refresh(mlxsw_sp, nh->nhgi->nh_grp); } - neigh_release(n); - return 0; err_neigh_entry_insert: @@ -4428,6 +4428,11 @@ static int mlxsw_sp_nexthop_neigh_init(struct mlxsw_sp *mlxsw_sp, } } + /* Release the reference taken by neigh_lookup() / neigh_create() since + * neigh_entry already holds one. + */ + neigh_release(n); + /* If that is the first nexthop connected to that neigh, add to * nexthop_neighs_list */ @@ -4454,11 +4459,9 @@ static void mlxsw_sp_nexthop_neigh_fini(struct mlxsw_sp *mlxsw_sp, struct mlxsw_sp_nexthop *nh) { struct mlxsw_sp_neigh_entry *neigh_entry = nh->neigh_entry; - struct neighbour *n; if (!neigh_entry) return; - n = neigh_entry->key.n; __mlxsw_sp_nexthop_neigh_update(nh, true); list_del(&nh->neigh_list_node); @@ -4472,8 +4475,6 @@ static void mlxsw_sp_nexthop_neigh_fini(struct mlxsw_sp *mlxsw_sp, if (!neigh_entry->connected && list_empty(&neigh_entry->nexthop_list)) mlxsw_sp_neigh_entry_destroy(mlxsw_sp, neigh_entry); - - neigh_release(n); } static bool mlxsw_sp_ipip_netdev_ul_up(struct net_device *ol_dev) From b04ebe3233007452e752b77e3e4f3d225bd7a920 Mon Sep 17 00:00:00 2001 From: Ido Schimmel Date: Tue, 2 Dec 2025 18:44:13 +0100 Subject: [PATCH 0058/1321] mlxsw: spectrum_mr: Fix use-after-free when updating multicast route stats Cited commit added a dedicated mutex (instead of RTNL) to protect the multicast route list, so that it will not change while the driver periodically traverses it in order to update the kernel about multicast route stats that were queried from the device. One instance of list entry deletion (during route replace) was missed and it can result in a use-after-free [1]. Fix by acquiring the mutex before deleting the entry from the list and releasing it afterwards. [1] BUG: KASAN: slab-use-after-free in mlxsw_sp_mr_stats_update+0x4a5/0x540 drivers/net/ethernet/mellanox/mlxsw/spectrum_mr.c:1006 [mlxsw_spectrum] Read of size 8 at addr ffff8881523c2fa8 by task kworker/2:5/22043 CPU: 2 UID: 0 PID: 22043 Comm: kworker/2:5 Not tainted 6.18.0-rc1-custom-g1a3d6d7cd014 #1 PREEMPT(full) Hardware name: Mellanox Technologies Ltd. MSN2010/SA002610, BIOS 5.6.5 08/24/2017 Workqueue: mlxsw_core mlxsw_sp_mr_stats_update [mlxsw_spectrum] Call Trace: dump_stack_lvl+0xba/0x110 print_report+0x174/0x4f5 kasan_report+0xdf/0x110 mlxsw_sp_mr_stats_update+0x4a5/0x540 drivers/net/ethernet/mellanox/mlxsw/spectrum_mr.c:1006 [mlxsw_spectrum] process_one_work+0x9cc/0x18e0 worker_thread+0x5df/0xe40 kthread+0x3b8/0x730 ret_from_fork+0x3e9/0x560 ret_from_fork_asm+0x1a/0x30 Allocated by task 29933: kasan_save_stack+0x30/0x50 kasan_save_track+0x14/0x30 __kasan_kmalloc+0x8f/0xa0 mlxsw_sp_mr_route_add+0xd8/0x4770 [mlxsw_spectrum] mlxsw_sp_router_fibmr_event_work+0x371/0xad0 drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:7965 [mlxsw_spectrum] process_one_work+0x9cc/0x18e0 worker_thread+0x5df/0xe40 kthread+0x3b8/0x730 ret_from_fork+0x3e9/0x560 ret_from_fork_asm+0x1a/0x30 Freed by task 29933: kasan_save_stack+0x30/0x50 kasan_save_track+0x14/0x30 __kasan_save_free_info+0x3b/0x70 __kasan_slab_free+0x43/0x70 kfree+0x14e/0x700 mlxsw_sp_mr_route_add+0x2dea/0x4770 drivers/net/ethernet/mellanox/mlxsw/spectrum_mr.c:444 [mlxsw_spectrum] mlxsw_sp_router_fibmr_event_work+0x371/0xad0 drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:7965 [mlxsw_spectrum] process_one_work+0x9cc/0x18e0 worker_thread+0x5df/0xe40 kthread+0x3b8/0x730 ret_from_fork+0x3e9/0x560 ret_from_fork_asm+0x1a/0x30 Fixes: f38656d06725 ("mlxsw: spectrum_mr: Protect multicast route list with a lock") Signed-off-by: Ido Schimmel Reviewed-by: Petr Machata Signed-off-by: Petr Machata Reviewed-by: Simon Horman Link: https://patch.msgid.link/f996feecfd59fde297964bfc85040b6d83ec6089.1764695650.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/mellanox/mlxsw/spectrum_mr.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_mr.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_mr.c index 5afe6b155ef0d..81935f87bfcd7 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_mr.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_mr.c @@ -440,7 +440,9 @@ int mlxsw_sp_mr_route_add(struct mlxsw_sp_mr_table *mr_table, rhashtable_remove_fast(&mr_table->route_ht, &mr_orig_route->ht_node, mlxsw_sp_mr_route_ht_params); + mutex_lock(&mr_table->route_list_lock); list_del(&mr_orig_route->node); + mutex_unlock(&mr_table->route_list_lock); mlxsw_sp_mr_route_destroy(mr_table, mr_orig_route); } From d7c52258c8967c2a9829836085134c0cb4495e67 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ren=C3=A9=20Rebe?= Date: Tue, 2 Dec 2025 19:41:37 +0100 Subject: [PATCH 0059/1321] r8169: fix RTL8117 Wake-on-Lan in DASH mode MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Wake-on-Lan does currently not work for r8169 in DASH mode, e.g. the ASUS Pro WS X570-ACE with RTL8168fp/RTL8117. Fix by not returning early in rtl_prepare_power_down when dash_enabled. While this fixes WoL, it still kills the OOB RTL8117 remote management BMC connection. Fix by not calling rtl8168_driver_stop if WoL is enabled. Fixes: 065c27c184d6 ("r8169: phy power ops") Signed-off-by: René Rebe Cc: stable@vger.kernel.org Reviewed-by: Heiner Kallweit Link: https://patch.msgid.link/20251202.194137.1647877804487085954.rene@exactco.de Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/realtek/r8169_main.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c index 405e91eb3141f..755083852eef2 100644 --- a/drivers/net/ethernet/realtek/r8169_main.c +++ b/drivers/net/ethernet/realtek/r8169_main.c @@ -2655,9 +2655,6 @@ static void rtl_wol_enable_rx(struct rtl8169_private *tp) static void rtl_prepare_power_down(struct rtl8169_private *tp) { - if (tp->dash_enabled) - return; - if (tp->mac_version == RTL_GIGA_MAC_VER_32 || tp->mac_version == RTL_GIGA_MAC_VER_33) rtl_ephy_write(tp, 0x19, 0xff64); @@ -4812,7 +4809,7 @@ static void rtl8169_down(struct rtl8169_private *tp) rtl_disable_exit_l1(tp); rtl_prepare_power_down(tp); - if (tp->dash_type != RTL_DASH_NONE) + if (tp->dash_type != RTL_DASH_NONE && !tp->saved_wolopts) rtl8168_driver_stop(tp); } From b357f166a8252aa1ea3f437b3cbe03849116e8e6 Mon Sep 17 00:00:00 2001 From: Tim Hostetler Date: Tue, 2 Dec 2025 20:02:07 +0000 Subject: [PATCH 0060/1321] gve: Move gve_init_clock to after AQ CONFIGURE_DEVICE_RESOURCES call commit 46e7860ef941 ("gve: Move ptp_schedule_worker to gve_init_clock") moved the first invocation of the AQ command REPORT_NIC_TIMESTAMP to gve_probe(). However, gve_init_clock() invoking REPORT_NIC_TIMESTAMP is not valid until after gve_probe() invokes the AQ command CONFIGURE_DEVICE_RESOURCES. Failure to do so results in the following error: gve 0000:00:07.0: failed to read NIC clock -11 This was missed earlier because the driver under test was loaded at runtime instead of boot-time. The boot-time driver had already initialized the device, causing the runtime driver to successfully call gve_init_clock() incorrectly. Fixes: 46e7860ef941 ("gve: Move ptp_schedule_worker to gve_init_clock") Reviewed-by: Ankit Garg Signed-off-by: Tim Hostetler Signed-off-by: Harshitha Ramamurthy Reviewed-by: Simon Horman Link: https://patch.msgid.link/20251202200207.1434749-1-hramamurthy@google.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/google/gve/gve_main.c | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/drivers/net/ethernet/google/gve/gve_main.c b/drivers/net/ethernet/google/gve/gve_main.c index a5a2b18d309b8..a7a088a77f378 100644 --- a/drivers/net/ethernet/google/gve/gve_main.c +++ b/drivers/net/ethernet/google/gve/gve_main.c @@ -647,12 +647,9 @@ static int gve_setup_device_resources(struct gve_priv *priv) err = gve_alloc_counter_array(priv); if (err) goto abort_with_rss_config_cache; - err = gve_init_clock(priv); - if (err) - goto abort_with_counter; err = gve_alloc_notify_blocks(priv); if (err) - goto abort_with_clock; + goto abort_with_counter; err = gve_alloc_stats_report(priv); if (err) goto abort_with_ntfy_blocks; @@ -683,10 +680,16 @@ static int gve_setup_device_resources(struct gve_priv *priv) } } + err = gve_init_clock(priv); + if (err) { + dev_err(&priv->pdev->dev, "Failed to init clock"); + goto abort_with_ptype_lut; + } + err = gve_init_rss_config(priv, priv->rx_cfg.num_queues); if (err) { dev_err(&priv->pdev->dev, "Failed to init RSS config"); - goto abort_with_ptype_lut; + goto abort_with_clock; } err = gve_adminq_report_stats(priv, priv->stats_report_len, @@ -698,6 +701,8 @@ static int gve_setup_device_resources(struct gve_priv *priv) gve_set_device_resources_ok(priv); return 0; +abort_with_clock: + gve_teardown_clock(priv); abort_with_ptype_lut: kvfree(priv->ptype_lut_dqo); priv->ptype_lut_dqo = NULL; @@ -705,8 +710,6 @@ static int gve_setup_device_resources(struct gve_priv *priv) gve_free_stats_report(priv); abort_with_ntfy_blocks: gve_free_notify_blocks(priv); -abort_with_clock: - gve_teardown_clock(priv); abort_with_counter: gve_free_counter_array(priv); abort_with_rss_config_cache: From 2a5c96e19cd041cfc01766c09c37225635413406 Mon Sep 17 00:00:00 2001 From: Michael Chan Date: Tue, 2 Dec 2025 16:30:24 -0800 Subject: [PATCH 0061/1321] bnxt_en: Fix XDP_TX path For XDP_TX action in bnxt_rx_xdp(), clearing of the event flags is not correct. __bnxt_poll_work() -> bnxt_rx_pkt() -> bnxt_rx_xdp() may be looping within NAPI and some event flags may be set in earlier iterations. In particular, if BNXT_TX_EVENT is set earlier indicating some XDP_TX packets are ready and pending, it will be cleared if it is XDP_TX action again. Normally, we will set BNXT_TX_EVENT again when we successfully call __bnxt_xmit_xdp(). But if the TX ring has no more room, the flag will not be set. This will cause the TX producer to be ahead but the driver will not hit the TX doorbell. For multi-buf XDP_TX, there is no need to clear the event flags and set BNXT_AGG_EVENT. The BNXT_AGG_EVENT flag should have been set earlier in bnxt_rx_pkt(). The visible symptom of this is that the RX ring associated with the TX XDP ring will eventually become empty and all packets will be dropped. Because this condition will cause the driver to not refill the RX ring seeing that the TX ring has forever pending XDP_TX packets. The fix is to only clear BNXT_RX_EVENT when we have successfully called __bnxt_xmit_xdp(). Fixes: 7f0a168b0441 ("bnxt_en: Add completion ring pointer in TX and RX ring structures") Reported-by: Pavel Dubovitsky Reviewed-by: Andy Gospodarek Reviewed-by: Pavan Chebbi Reviewed-by: Kalesh AP Signed-off-by: Michael Chan Reviewed-by: Jacob Keller Link: https://patch.msgid.link/20251203003024.2246699-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c index 3e77a96e5a3e3..c94a391b1ba5b 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c @@ -268,13 +268,11 @@ bool bnxt_rx_xdp(struct bnxt *bp, struct bnxt_rx_ring_info *rxr, u16 cons, case XDP_TX: rx_buf = &rxr->rx_buf_ring[cons]; mapping = rx_buf->mapping - bp->rx_dma_offset; - *event &= BNXT_TX_CMP_EVENT; if (unlikely(xdp_buff_has_frags(xdp))) { struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp); tx_needed += sinfo->nr_frags; - *event = BNXT_AGG_EVENT; } if (tx_avail < tx_needed) { @@ -287,6 +285,7 @@ bool bnxt_rx_xdp(struct bnxt *bp, struct bnxt_rx_ring_info *rxr, u16 cons, dma_sync_single_for_device(&pdev->dev, mapping + offset, *len, bp->rx_dir); + *event &= ~BNXT_RX_EVENT; *event |= BNXT_TX_EVENT; __bnxt_xmit_xdp(bp, txr, mapping + offset, *len, NEXT_RX(rxr->rx_prod), xdp); From 343083a83906efac4c4ef104478e30f977fb5cb5 Mon Sep 17 00:00:00 2001 From: Mateusz Guzik Date: Wed, 3 Dec 2025 11:01:22 +0100 Subject: [PATCH 0062/1321] af_unix: annotate unix_gc_lock with __cacheline_aligned_in_smp Otherwise the lock is susceptible to ever-changing false-sharing due to unrelated changes. This in particular popped up here where an unrelated change improved performance: https://lore.kernel.org/oe-lkp/202511281306.51105b46-lkp@intel.com/ Stabilize it with an explicit annotation which also has a side effect of furher improving scalability: > in our oiginal report, 284922f4c5 has a 6.1% performance improvement comparing > to parent 17d85f33a8. > we applied your patch directly upon 284922f4c5. as below, now by > "284922f4c5 + your patch" > we observe a 12.8% performance improvements (still comparing to 17d85f33a8). Note nothing was done for the other fields, so some fluctuation is still possible. Tested-by: kernel test robot Signed-off-by: Mateusz Guzik Reviewed-by: Kuniyuki Iwashima Link: https://patch.msgid.link/20251203100122.291550-1-mjguzik@gmail.com Signed-off-by: Jakub Kicinski --- net/unix/garbage.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/unix/garbage.c b/net/unix/garbage.c index 78323d43e63ed..25f65817faab9 100644 --- a/net/unix/garbage.c +++ b/net/unix/garbage.c @@ -199,7 +199,7 @@ static void unix_free_vertices(struct scm_fp_list *fpl) } } -static DEFINE_SPINLOCK(unix_gc_lock); +static __cacheline_aligned_in_smp DEFINE_SPINLOCK(unix_gc_lock); void unix_add_edges(struct scm_fp_list *fpl, struct unix_sock *receiver) { From 6cd3bb7fcf8a83a18d1cbf53b5a1938ef36ea0d6 Mon Sep 17 00:00:00 2001 From: Eric Biggers Date: Wed, 3 Dec 2025 21:44:17 -0800 Subject: [PATCH 0063/1321] mptcp: select CRYPTO_LIB_UTILS instead of CRYPTO Since the only crypto functions used by the mptcp code are the SHA-256 library functions and crypto_memneq(), select only the options needed for those: CRYPTO_LIB_SHA256 and CRYPTO_LIB_UTILS. Previously, CRYPTO was selected instead of CRYPTO_LIB_UTILS. That does pull in CRYPTO_LIB_UTILS as well, but it's unnecessarily broad. Years ago, the CRYPTO_LIB_* options were visible only when CRYPTO. That may be another reason why CRYPTO is selected here. However, that was fixed years ago, and the libraries can now be selected directly. Signed-off-by: Eric Biggers Reviewed-by: Mat Martineau Link: https://patch.msgid.link/20251204054417.491439-1-ebiggers@kernel.org Signed-off-by: Jakub Kicinski --- net/mptcp/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/mptcp/Kconfig b/net/mptcp/Kconfig index 20328920f6ed1..be71fc9b46381 100644 --- a/net/mptcp/Kconfig +++ b/net/mptcp/Kconfig @@ -4,7 +4,7 @@ config MPTCP depends on INET select SKB_EXTENSIONS select CRYPTO_LIB_SHA256 - select CRYPTO + select CRYPTO_LIB_UTILS help Multipath TCP (MPTCP) connections send and receive data over multiple subflows in order to utilize multiple network paths. Each subflow From 69884ccdfbb167a9c46c5acf1e21385addd0021a Mon Sep 17 00:00:00 2001 From: Jakub Kicinski Date: Sat, 6 Dec 2025 16:47:40 -0800 Subject: [PATCH 0064/1321] ynl: add regen hint to new headers Recent commit 68e83f347266 ("tools: ynl-gen: add regeneration comment") added a hint how to regenerate the code to the headers. Update the new headers from this release cycle to also include it. Reviewed-by: Simon Horman Link: https://patch.msgid.link/20251207004740.1657799-1-kuba@kernel.org Signed-off-by: Jakub Kicinski --- include/uapi/linux/energy_model.h | 1 + kernel/power/em_netlink_autogen.c | 1 + kernel/power/em_netlink_autogen.h | 1 + 3 files changed, 3 insertions(+) diff --git a/include/uapi/linux/energy_model.h b/include/uapi/linux/energy_model.h index 4ec4c0eabbbbc..0bcad967854ff 100644 --- a/include/uapi/linux/energy_model.h +++ b/include/uapi/linux/energy_model.h @@ -2,6 +2,7 @@ /* Do not edit directly, auto-generated from: */ /* Documentation/netlink/specs/em.yaml */ /* YNL-GEN uapi header */ +/* To regenerate run: tools/net/ynl/ynl-regen.sh */ #ifndef _UAPI_LINUX_ENERGY_MODEL_H #define _UAPI_LINUX_ENERGY_MODEL_H diff --git a/kernel/power/em_netlink_autogen.c b/kernel/power/em_netlink_autogen.c index a7a09ab1d1c21..ceb3b2bb6ebe0 100644 --- a/kernel/power/em_netlink_autogen.c +++ b/kernel/power/em_netlink_autogen.c @@ -2,6 +2,7 @@ /* Do not edit directly, auto-generated from: */ /* Documentation/netlink/specs/em.yaml */ /* YNL-GEN kernel source */ +/* To regenerate run: tools/net/ynl/ynl-regen.sh */ #include #include diff --git a/kernel/power/em_netlink_autogen.h b/kernel/power/em_netlink_autogen.h index 78ce609641f11..140ab548103ce 100644 --- a/kernel/power/em_netlink_autogen.h +++ b/kernel/power/em_netlink_autogen.h @@ -2,6 +2,7 @@ /* Do not edit directly, auto-generated from: */ /* Documentation/netlink/specs/em.yaml */ /* YNL-GEN kernel header */ +/* To regenerate run: tools/net/ynl/ynl-regen.sh */ #ifndef _LINUX_EM_GEN_H #define _LINUX_EM_GEN_H From 9dbad4708696853a749d38a99d8f01a40facf62f Mon Sep 17 00:00:00 2001 From: Jakub Kicinski Date: Sat, 6 Dec 2025 17:38:48 -0800 Subject: [PATCH 0065/1321] tools: ynl: fix build on systems with old kernel headers MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The wireguard YNL conversion was missing the customary .deps entry. NIPA doesn't catch this but my CentOS 9 system complains: wireguard-user.c:72:10: error: ‘WGALLOWEDIP_A_FLAGS’ undeclared here wireguard-user.c:58:67: error: parameter 1 (‘value’) has incomplete type 58 | const char *wireguard_wgallowedip_flags_str(enum wgallowedip_flag value) | ~~~~~~~~~~~~~~~~~~~~~~^~~~~ And similarly does Ubuntu 22.04. One extra complication here is that we renamed the header guard, so we need to compat with both old and new guard define. Reviewed-by: Asbjørn Sloth Tønnesen Link: https://patch.msgid.link/20251207013848.1692990-1-kuba@kernel.org Signed-off-by: Jakub Kicinski --- tools/net/ynl/Makefile.deps | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tools/net/ynl/Makefile.deps b/tools/net/ynl/Makefile.deps index 865fd2e8519ed..08205f9fc5257 100644 --- a/tools/net/ynl/Makefile.deps +++ b/tools/net/ynl/Makefile.deps @@ -13,6 +13,7 @@ UAPI_PATH:=../../../../include/uapi/ # need the explicit -D matching what's in /usr, to avoid multiple definitions. get_hdr_inc=-D$(1) -include $(UAPI_PATH)/linux/$(2) +get_hdr_inc2=-D$(1) -D$(2) -include $(UAPI_PATH)/linux/$(3) CFLAGS_devlink:=$(call get_hdr_inc,_LINUX_DEVLINK_H_,devlink.h) CFLAGS_dpll:=$(call get_hdr_inc,_LINUX_DPLL_H,dpll.h) @@ -48,3 +49,4 @@ CFLAGS_tc:= $(call get_hdr_inc,__LINUX_RTNETLINK_H,rtnetlink.h) \ $(call get_hdr_inc,_TC_SKBEDIT_H,tc_act/tc_skbedit.h) \ $(call get_hdr_inc,_TC_TUNNEL_KEY_H,tc_act/tc_tunnel_key.h) CFLAGS_tcp_metrics:=$(call get_hdr_inc,_LINUX_TCP_METRICS_H,tcp_metrics.h) +CFLAGS_wireguard:=$(call get_hdr_inc2,_LINUX_WIREGUARD_H,_WG_UAPI_WIREGUARD_H,wireguard.h) From 4428def700f660c5f20b2fc9dfeecdfe9df5859b Mon Sep 17 00:00:00 2001 From: "Matthieu Baerts (NGI0)" Date: Fri, 5 Dec 2025 19:55:14 +0100 Subject: [PATCH 0066/1321] mptcp: pm: ignore unknown endpoint flags Before this patch, the kernel was saving any flags set by the userspace, even unknown ones. This doesn't cause critical issues because the kernel is only looking at specific ones. But on the other hand, endpoints dumps could tell the userspace some recent flags seem to be supported on older kernel versions. Instead, ignore all unknown flags when parsing them. By doing that, the userspace can continue to set unsupported flags, but it has a way to verify what is supported by the kernel. Note that it sounds better to continue accepting unsupported flags not to change the behaviour, but also that eases things on the userspace side by adding "optional" endpoint types only supported by newer kernel versions without having to deal with the different kernel versions. A note for the backports: there will be conflicts in mptcp.h on older versions not having the mentioned flags, the new line should still be added last, and the '5' needs to be adapted to have the same value as the last entry. Fixes: 01cacb00b35c ("mptcp: add netlink-based PM") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau Signed-off-by: Matthieu Baerts (NGI0) Link: https://patch.msgid.link/20251205-net-mptcp-misc-fixes-6-19-rc1-v1-1-9e4781a6c1b8@kernel.org Signed-off-by: Jakub Kicinski --- include/uapi/linux/mptcp.h | 1 + net/mptcp/pm_netlink.c | 3 ++- 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/include/uapi/linux/mptcp.h b/include/uapi/linux/mptcp.h index 04eea6d1d0a9b..72a5d030154e1 100644 --- a/include/uapi/linux/mptcp.h +++ b/include/uapi/linux/mptcp.h @@ -40,6 +40,7 @@ #define MPTCP_PM_ADDR_FLAG_FULLMESH _BITUL(3) #define MPTCP_PM_ADDR_FLAG_IMPLICIT _BITUL(4) #define MPTCP_PM_ADDR_FLAG_LAMINAR _BITUL(5) +#define MPTCP_PM_ADDR_FLAGS_MASK GENMASK(5, 0) struct mptcp_info { __u8 mptcpi_subflows; diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index d5b383870f799..7aa42de9c47b5 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -119,7 +119,8 @@ int mptcp_pm_parse_entry(struct nlattr *attr, struct genl_info *info, } if (tb[MPTCP_PM_ADDR_ATTR_FLAGS]) - entry->flags = nla_get_u32(tb[MPTCP_PM_ADDR_ATTR_FLAGS]); + entry->flags = nla_get_u32(tb[MPTCP_PM_ADDR_ATTR_FLAGS]) & + MPTCP_PM_ADDR_FLAGS_MASK; if (tb[MPTCP_PM_ADDR_ATTR_PORT]) entry->addr.port = htons(nla_get_u16(tb[MPTCP_PM_ADDR_ATTR_PORT])); From 7e7d053d3ca89e164bc7024ad33dd24630500263 Mon Sep 17 00:00:00 2001 From: "Matthieu Baerts (NGI0)" Date: Fri, 5 Dec 2025 19:55:15 +0100 Subject: [PATCH 0067/1321] selftests: mptcp: pm: ensure unknown flags are ignored This validates the previous commit: the userspace can set unknown flags -- the 7th bit is currently unused -- without errors, but only the supported ones are printed in the endpoints dumps. The 'Fixes' tag here below is the same as the one from the previous commit: this patch here is not fixing anything wrong in the selftests, but it validates the previous fix for an issue introduced by this commit ID. Fixes: 01cacb00b35c ("mptcp: add netlink-based PM") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau Signed-off-by: Matthieu Baerts (NGI0) Link: https://patch.msgid.link/20251205-net-mptcp-misc-fixes-6-19-rc1-v1-2-9e4781a6c1b8@kernel.org Signed-off-by: Jakub Kicinski --- tools/testing/selftests/net/mptcp/pm_netlink.sh | 4 ++++ tools/testing/selftests/net/mptcp/pm_nl_ctl.c | 11 +++++++++++ 2 files changed, 15 insertions(+) diff --git a/tools/testing/selftests/net/mptcp/pm_netlink.sh b/tools/testing/selftests/net/mptcp/pm_netlink.sh index ec6a875881919..123d9d7a0278c 100755 --- a/tools/testing/selftests/net/mptcp/pm_netlink.sh +++ b/tools/testing/selftests/net/mptcp/pm_netlink.sh @@ -192,6 +192,10 @@ check "show_endpoints" \ flush_endpoint check "show_endpoints" "" "flush addrs" +add_endpoint 10.0.1.1 flags unknown +check "show_endpoints" "$(format_endpoints "1,10.0.1.1")" "ignore unknown flags" +flush_endpoint + set_limits 9 1 2>/dev/null check "get_limits" "${default_limits}" "rcv addrs above hard limit" diff --git a/tools/testing/selftests/net/mptcp/pm_nl_ctl.c b/tools/testing/selftests/net/mptcp/pm_nl_ctl.c index 65b374232ff5a..99eecccbf0c87 100644 --- a/tools/testing/selftests/net/mptcp/pm_nl_ctl.c +++ b/tools/testing/selftests/net/mptcp/pm_nl_ctl.c @@ -24,6 +24,8 @@ #define IPPROTO_MPTCP 262 #endif +#define MPTCP_PM_ADDR_FLAG_UNKNOWN _BITUL(7) + static void syntax(char *argv[]) { fprintf(stderr, "%s add|ann|rem|csf|dsf|get|set|del|flush|dump|events|listen|accept []\n", argv[0]); @@ -836,6 +838,8 @@ int add_addr(int fd, int pm_family, int argc, char *argv[]) flags |= MPTCP_PM_ADDR_FLAG_BACKUP; else if (!strcmp(tok, "fullmesh")) flags |= MPTCP_PM_ADDR_FLAG_FULLMESH; + else if (!strcmp(tok, "unknown")) + flags |= MPTCP_PM_ADDR_FLAG_UNKNOWN; else error(1, errno, "unknown flag %s", argv[arg]); @@ -1048,6 +1052,13 @@ static void print_addr(struct rtattr *attrs, int len) printf(","); } + if (flags & MPTCP_PM_ADDR_FLAG_UNKNOWN) { + printf("unknown"); + flags &= ~MPTCP_PM_ADDR_FLAG_UNKNOWN; + if (flags) + printf(","); + } + /* bump unknown flags, if any */ if (flags) printf("0x%x", flags); From b7a6544a6700ce604f48be95771678a34d4c80c2 Mon Sep 17 00:00:00 2001 From: Paolo Abeni Date: Fri, 5 Dec 2025 19:55:16 +0100 Subject: [PATCH 0068/1321] mptcp: schedule rtx timer only after pushing data The MPTCP protocol usually schedule the retransmission timer only when there is some chances for such retransmissions to happen. With a notable exception: __mptcp_push_pending() currently schedule such timer unconditionally, potentially leading to unnecessary rtx timer expiration. The issue is present since the blamed commit below but become easily reproducible after commit 27b0e701d387 ("mptcp: drop bogus optimization in __mptcp_check_push()") Fixes: 33d41c9cd74c ("mptcp: more accurate timeout") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni Reviewed-by: Matthieu Baerts (NGI0) Signed-off-by: Matthieu Baerts (NGI0) Link: https://patch.msgid.link/20251205-net-mptcp-misc-fixes-6-19-rc1-v1-3-9e4781a6c1b8@kernel.org Signed-off-by: Jakub Kicinski --- net/mptcp/protocol.c | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index e212c1374bd04..d8a7f70291645 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -1623,7 +1623,7 @@ void __mptcp_push_pending(struct sock *sk, unsigned int flags) struct mptcp_sendmsg_info info = { .flags = flags, }; - bool do_check_data_fin = false; + bool copied = false; int push_count = 1; while (mptcp_send_head(sk) && (push_count > 0)) { @@ -1665,7 +1665,7 @@ void __mptcp_push_pending(struct sock *sk, unsigned int flags) push_count--; continue; } - do_check_data_fin = true; + copied = true; } } } @@ -1674,11 +1674,14 @@ void __mptcp_push_pending(struct sock *sk, unsigned int flags) if (ssk) mptcp_push_release(ssk, &info); - /* ensure the rtx timer is running */ - if (!mptcp_rtx_timer_pending(sk)) - mptcp_reset_rtx_timer(sk); - if (do_check_data_fin) + /* Avoid scheduling the rtx timer if no data has been pushed; the timer + * will be updated on positive acks by __mptcp_cleanup_una(). + */ + if (copied) { + if (!mptcp_rtx_timer_pending(sk)) + mptcp_reset_rtx_timer(sk); mptcp_check_send_data_fin(sk); + } } static void __mptcp_subflow_push_pending(struct sock *sk, struct sock *ssk, bool first) From a872105dd35b3409efd1fc1ca195d7ea4b411556 Mon Sep 17 00:00:00 2001 From: Paolo Abeni Date: Fri, 5 Dec 2025 19:55:17 +0100 Subject: [PATCH 0069/1321] mptcp: avoid deadlock on fallback while reinjecting Jakub reported an MPTCP deadlock at fallback time: WARNING: possible recursive locking detected 6.18.0-rc7-virtme #1 Not tainted -------------------------------------------- mptcp_connect/20858 is trying to acquire lock: ff1100001da18b60 (&msk->fallback_lock){+.-.}-{3:3}, at: __mptcp_try_fallback+0xd8/0x280 but task is already holding lock: ff1100001da18b60 (&msk->fallback_lock){+.-.}-{3:3}, at: __mptcp_retrans+0x352/0xaa0 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&msk->fallback_lock); lock(&msk->fallback_lock); *** DEADLOCK *** May be due to missing lock nesting notation 3 locks held by mptcp_connect/20858: #0: ff1100001da18290 (sk_lock-AF_INET){+.+.}-{0:0}, at: mptcp_sendmsg+0x114/0x1bc0 #1: ff1100001db40fd0 (k-sk_lock-AF_INET#2){+.+.}-{0:0}, at: __mptcp_retrans+0x2cb/0xaa0 #2: ff1100001da18b60 (&msk->fallback_lock){+.-.}-{3:3}, at: __mptcp_retrans+0x352/0xaa0 stack backtrace: CPU: 0 UID: 0 PID: 20858 Comm: mptcp_connect Not tainted 6.18.0-rc7-virtme #1 PREEMPT(full) Hardware name: Bochs, BIOS Bochs 01/01/2011 Call Trace: dump_stack_lvl+0x6f/0xa0 print_deadlock_bug.cold+0xc0/0xcd validate_chain+0x2ff/0x5f0 __lock_acquire+0x34c/0x740 lock_acquire.part.0+0xbc/0x260 _raw_spin_lock_bh+0x38/0x50 __mptcp_try_fallback+0xd8/0x280 mptcp_sendmsg_frag+0x16c2/0x3050 __mptcp_retrans+0x421/0xaa0 mptcp_release_cb+0x5aa/0xa70 release_sock+0xab/0x1d0 mptcp_sendmsg+0xd5b/0x1bc0 sock_write_iter+0x281/0x4d0 new_sync_write+0x3c5/0x6f0 vfs_write+0x65e/0xbb0 ksys_write+0x17e/0x200 do_syscall_64+0xbb/0xfd0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 RIP: 0033:0x7fa5627cbc5e Code: 4d 89 d8 e8 14 bd 00 00 4c 8b 5d f8 41 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 11 c9 c3 0f 1f 80 00 00 00 00 48 8b 45 10 0f 05 c3 83 e2 39 83 fa 08 75 e7 e8 13 ff ff ff 0f 1f 00 f3 0f 1e fa RSP: 002b:00007fff1fe14700 EFLAGS: 00000202 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007fa5627cbc5e RDX: 0000000000001f9c RSI: 00007fff1fe16984 RDI: 0000000000000005 RBP: 00007fff1fe14710 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000202 R12: 00007fff1fe16920 R13: 0000000000002000 R14: 0000000000001f9c R15: 0000000000001f9c The packet scheduler could attempt a reinjection after receiving an MP_FAIL and before the infinite map has been transmitted, causing a deadlock since MPTCP needs to do the reinjection atomically from WRT fallback. Address the issue explicitly avoiding the reinjection in the critical scenario. Note that this is the only fallback critical section that could potentially send packets and hit the double-lock. Reported-by: Jakub Kicinski Closes: https://netdev-ctrl.bots.linux.dev/logs/vmksft/mptcp-dbg/results/412720/1-mptcp-join-sh/stderr Fixes: f8a1d9b18c5e ("mptcp: make fallback action and fallback decision atomic") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni Reviewed-by: Matthieu Baerts (NGI0) Signed-off-by: Matthieu Baerts (NGI0) Link: https://patch.msgid.link/20251205-net-mptcp-misc-fixes-6-19-rc1-v1-4-9e4781a6c1b8@kernel.org Signed-off-by: Jakub Kicinski --- net/mptcp/protocol.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index d8a7f70291645..9b1fafd87cb94 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -2769,10 +2769,13 @@ static void __mptcp_retrans(struct sock *sk) /* * make the whole retrans decision, xmit, disallow - * fallback atomic + * fallback atomic, note that we can't retrans even + * when an infinite fallback is in progress, i.e. new + * subflows are disallowed. */ spin_lock_bh(&msk->fallback_lock); - if (__mptcp_check_fallback(msk)) { + if (__mptcp_check_fallback(msk) || + !msk->allow_subflows) { spin_unlock_bh(&msk->fallback_lock); release_sock(ssk); goto clear_scheduled; From ad8eb86a125e5ca6ab3e1c90f65e49777c12a936 Mon Sep 17 00:00:00 2001 From: Ilya Maximets Date: Thu, 4 Dec 2025 11:53:32 +0100 Subject: [PATCH 0070/1321] net: openvswitch: fix middle attribute validation in push_nsh() action The push_nsh() action structure looks like this: OVS_ACTION_ATTR_PUSH_NSH(OVS_KEY_ATTR_NSH(OVS_NSH_KEY_ATTR_BASE,...)) The outermost OVS_ACTION_ATTR_PUSH_NSH attribute is OK'ed by the nla_for_each_nested() inside __ovs_nla_copy_actions(). The innermost OVS_NSH_KEY_ATTR_BASE/MD1/MD2 are OK'ed by the nla_for_each_nested() inside nsh_key_put_from_nlattr(). But nothing checks if the attribute in the middle is OK. We don't even check that this attribute is the OVS_KEY_ATTR_NSH. We just do a double unwrap with a pair of nla_data() calls - first time directly while calling validate_push_nsh() and the second time as part of the nla_for_each_nested() macro, which isn't safe, potentially causing invalid memory access if the size of this attribute is incorrect. The failure may not be noticed during validation due to larger netlink buffer, but cause trouble later during action execution where the buffer is allocated exactly to the size: BUG: KASAN: slab-out-of-bounds in nsh_hdr_from_nlattr+0x1dd/0x6a0 [openvswitch] Read of size 184 at addr ffff88816459a634 by task a.out/22624 CPU: 8 UID: 0 PID: 22624 6.18.0-rc7+ #115 PREEMPT(voluntary) Call Trace: dump_stack_lvl+0x51/0x70 print_address_description.constprop.0+0x2c/0x390 kasan_report+0xdd/0x110 kasan_check_range+0x35/0x1b0 __asan_memcpy+0x20/0x60 nsh_hdr_from_nlattr+0x1dd/0x6a0 [openvswitch] push_nsh+0x82/0x120 [openvswitch] do_execute_actions+0x1405/0x2840 [openvswitch] ovs_execute_actions+0xd5/0x3b0 [openvswitch] ovs_packet_cmd_execute+0x949/0xdb0 [openvswitch] genl_family_rcv_msg_doit+0x1d6/0x2b0 genl_family_rcv_msg+0x336/0x580 genl_rcv_msg+0x9f/0x130 netlink_rcv_skb+0x11f/0x370 genl_rcv+0x24/0x40 netlink_unicast+0x73e/0xaa0 netlink_sendmsg+0x744/0xbf0 __sys_sendto+0x3d6/0x450 do_syscall_64+0x79/0x2c0 entry_SYSCALL_64_after_hwframe+0x76/0x7e Let's add some checks that the attribute is properly sized and it's the only one attribute inside the action. Technically, there is no real reason for OVS_KEY_ATTR_NSH to be there, as we know that we're pushing an NSH header already, it just creates extra nesting, but that's how uAPI works today. So, keeping as it is. Fixes: b2d0f5d5dc53 ("openvswitch: enable NSH support") Reported-by: Junvy Yang Signed-off-by: Ilya Maximets Acked-by: Eelco Chaudron echaudro@redhat.com Reviewed-by: Aaron Conole Link: https://patch.msgid.link/20251204105334.900379-1-i.maximets@ovn.org Signed-off-by: Jakub Kicinski --- net/openvswitch/flow_netlink.c | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c index 1cb4f97335d87..2d536901309ea 100644 --- a/net/openvswitch/flow_netlink.c +++ b/net/openvswitch/flow_netlink.c @@ -2802,13 +2802,20 @@ static int validate_and_copy_set_tun(const struct nlattr *attr, return err; } -static bool validate_push_nsh(const struct nlattr *attr, bool log) +static bool validate_push_nsh(const struct nlattr *a, bool log) { + struct nlattr *nsh_key = nla_data(a); struct sw_flow_match match; struct sw_flow_key key; + /* There must be one and only one NSH header. */ + if (!nla_ok(nsh_key, nla_len(a)) || + nla_total_size(nla_len(nsh_key)) != nla_len(a) || + nla_type(nsh_key) != OVS_KEY_ATTR_NSH) + return false; + ovs_match_init(&match, &key, true, NULL); - return !nsh_key_put_from_nlattr(attr, &match, false, true, log); + return !nsh_key_put_from_nlattr(nsh_key, &match, false, true, log); } /* Return false if there are any non-masked bits set. @@ -3389,7 +3396,7 @@ static int __ovs_nla_copy_actions(struct net *net, const struct nlattr *attr, return -EINVAL; } mac_proto = MAC_PROTO_NONE; - if (!validate_push_nsh(nla_data(a), log)) + if (!validate_push_nsh(a, log)) return -EINVAL; break; From 304ee7ca943b1d3b743af140c36374cfe5532704 Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Thu, 4 Dec 2025 11:01:28 +0100 Subject: [PATCH 0071/1321] net: ti: icssg-prueth: add PTP_1588_CLOCK_OPTIONAL dependency The new icssg-prueth driver needs the same dependency as the other parts that use the ptp-1588: WARNING: unmet direct dependencies detected for TI_ICSS_IEP Depends on [m]: NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_TI [=y] && PTP_1588_CLOCK_OPTIONAL [=m] && TI_PRUSS [=y] Selected by [y]: - TI_PRUETH [=y] && NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_TI [=y] && PRU_REMOTEPROC [=y] && NET_SWITCHDEV [=y] Add the correct dependency on the two drivers missing it, and remove the pointless 'imply' in the process. Fixes: e654b85a693e ("net: ti: icssg-prueth: Add ICSSG Ethernet driver for AM65x SR1.0 platforms") Fixes: 511f6c1ae093 ("net: ti: icssm-prueth: Adds ICSSM Ethernet driver") Signed-off-by: Arnd Bergmann Reviewed-by: Vadim Fedorenko Link: https://patch.msgid.link/20251204100138.1034175-1-arnd@kernel.org Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/ti/Kconfig | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/ti/Kconfig b/drivers/net/ethernet/ti/Kconfig index a54d71155263c..fe5b2926d8ab0 100644 --- a/drivers/net/ethernet/ti/Kconfig +++ b/drivers/net/ethernet/ti/Kconfig @@ -209,6 +209,7 @@ config TI_ICSSG_PRUETH_SR1 depends on PRU_REMOTEPROC depends on NET_SWITCHDEV depends on ARCH_K3 && OF && TI_K3_UDMA_GLUE_LAYER + depends on PTP_1588_CLOCK_OPTIONAL help Support dual Gigabit Ethernet ports over the ICSSG PRU Subsystem. This subsystem is available on the AM65 SR1.0 platform. @@ -234,7 +235,7 @@ config TI_PRUETH depends on PRU_REMOTEPROC depends on NET_SWITCHDEV select TI_ICSS_IEP - imply PTP_1588_CLOCK + depends on PTP_1588_CLOCK_OPTIONAL help Some TI SoCs has Programmable Realtime Unit (PRU) cores which can support Single or Dual Ethernet ports with the help of firmware code From 8507f12042471fd74fb2820f2dc99466cf35d5f0 Mon Sep 17 00:00:00 2001 From: caoping Date: Thu, 4 Dec 2025 01:10:58 -0800 Subject: [PATCH 0072/1321] net/handshake: restore destructor on submit failure handshake_req_submit() replaces sk->sk_destruct but never restores it when submission fails before the request is hashed. handshake_sk_destruct() then returns early and the original destructor never runs, leaking the socket. Restore sk_destruct on the error path. Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests") Reviewed-by: Chuck Lever Cc: stable@vger.kernel.org Signed-off-by: caoping Link: https://patch.msgid.link/20251204091058.1545151-1-caoping@cmss.chinamobile.com Signed-off-by: Jakub Kicinski --- net/handshake/request.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/net/handshake/request.c b/net/handshake/request.c index 274d2c89b6b20..89435ed755cd0 100644 --- a/net/handshake/request.c +++ b/net/handshake/request.c @@ -276,6 +276,8 @@ int handshake_req_submit(struct socket *sock, struct handshake_req *req, out_unlock: spin_unlock(&hn->hn_lock); out_err: + /* Restore original destructor so socket teardown still runs on failure */ + req->hr_sk->sk_destruct = req->hr_odestruct; trace_handshake_submit_err(net, req, req->hr_sk, ret); handshake_req_destroy(req); return ret; From bc84d7e30acbe92437d5b56601c8cf17b602424d Mon Sep 17 00:00:00 2001 From: Alexey Simakov Date: Fri, 5 Dec 2025 18:58:16 +0300 Subject: [PATCH 0073/1321] broadcom: b44: prevent uninitialized value usage On execution path with raised B44_FLAG_EXTERNAL_PHY, b44_readphy() leaves bmcr value uninitialized and it is used later in the code. Add check of this flag at the beginning of the b44_nway_reset() and exit early of the function with restarting autonegotiation if an external PHY is used. Fixes: 753f492093da ("[B44]: port to native ssb support") Reviewed-by: Jonas Gorski Reviewed-by: Andrew Lunn Signed-off-by: Alexey Simakov Reviewed-by: Michael Chan Link: https://patch.msgid.link/20251205155815.4348-1-bigalex934@gmail.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/broadcom/b44.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/ethernet/broadcom/b44.c b/drivers/net/ethernet/broadcom/b44.c index 888f28f11406f..90df02e0039cb 100644 --- a/drivers/net/ethernet/broadcom/b44.c +++ b/drivers/net/ethernet/broadcom/b44.c @@ -1790,6 +1790,9 @@ static int b44_nway_reset(struct net_device *dev) u32 bmcr; int r; + if (bp->flags & B44_FLAG_EXTERNAL_PHY) + return phy_ethtool_nway_reset(dev); + spin_lock_irq(&bp->lock); b44_readphy(bp, MII_BMCR, &bmcr); b44_readphy(bp, MII_BMCR, &bmcr); From 3856291311e5a9968107ac4929503d08a91c674c Mon Sep 17 00:00:00 2001 From: Ankit Khushwaha Date: Fri, 5 Dec 2025 22:02:42 +0530 Subject: [PATCH 0074/1321] selftests: tls: fix warning of uninitialized variable In 'poll_partial_rec_async' a uninitialized char variable 'token' with is used for write/read instruction to synchronize between threads via a pipe. tls.c:2833:26: warning: variable 'token' is uninitialized when passed as a const pointer argument Initialize 'token' to '\0' to silence compiler warning. Signed-off-by: Ankit Khushwaha Link: https://patch.msgid.link/20251205163242.14615-1-ankitkhushwaha.linux@gmail.com Signed-off-by: Jakub Kicinski --- tools/testing/selftests/net/tls.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/testing/selftests/net/tls.c b/tools/testing/selftests/net/tls.c index a3ef4b57eb5f8..a4d16a460fbe4 100644 --- a/tools/testing/selftests/net/tls.c +++ b/tools/testing/selftests/net/tls.c @@ -2786,10 +2786,10 @@ TEST_F(tls_err, epoll_partial_rec) TEST_F(tls_err, poll_partial_rec_async) { struct pollfd pfd = { }; + char token = '\0'; ssize_t rec_len; char rec[256]; char buf[128]; - char token; int p[2]; int ret; From a4abf8d04f483746c11a6a6fd5980d969191dc68 Mon Sep 17 00:00:00 2001 From: Guenter Roeck Date: Fri, 5 Dec 2025 09:10:00 -0800 Subject: [PATCH 0075/1321] selftest: af_unix: Support compilers without flex-array-member-not-at-end support MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Fix: gcc: error: unrecognized command-line option ‘-Wflex-array-member-not-at-end’ by making the compiler option dependent on its support. Fixes: 1838731f1072c ("selftest: af_unix: Add -Wall and -Wflex-array-member-not-at-end to CFLAGS.") Cc: Kuniyuki Iwashima Signed-off-by: Guenter Roeck Link: https://patch.msgid.link/20251205171010.515236-7-linux@roeck-us.net Signed-off-by: Jakub Kicinski --- tools/testing/selftests/net/af_unix/Makefile | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/tools/testing/selftests/net/af_unix/Makefile b/tools/testing/selftests/net/af_unix/Makefile index 3cd677b720728..4c0375e28bbee 100644 --- a/tools/testing/selftests/net/af_unix/Makefile +++ b/tools/testing/selftests/net/af_unix/Makefile @@ -1,4 +1,9 @@ -CFLAGS += $(KHDR_INCLUDES) -Wall -Wflex-array-member-not-at-end +top_srcdir := ../../../../.. +include $(top_srcdir)/scripts/Makefile.compiler + +cc-option = $(call __cc-option, $(CC),,$(1),$(2)) + +CFLAGS += $(KHDR_INCLUDES) -Wall $(call cc-option,-Wflex-array-member-not-at-end) TEST_GEN_PROGS := \ diag_uid \ From 881ba2b7454abebcf20898169a8e13001b5752f9 Mon Sep 17 00:00:00 2001 From: Guenter Roeck Date: Fri, 5 Dec 2025 09:10:04 -0800 Subject: [PATCH 0076/1321] selftests: net: Fix build warnings MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Fix ksft.h: In function ‘ksft_ready’: ksft.h:27:9: warning: ignoring return value of ‘write’ declared with attribute ‘warn_unused_result’ ksft.h: In function ‘ksft_wait’: ksft.h:51:9: warning: ignoring return value of ‘read’ declared with attribute ‘warn_unused_result’ by checking the return value of the affected functions and displaying an error message if an error is seen. Fixes: 2b6d490b82668 ("selftests: drv-net: Factor out ksft C helpers") Cc: Joe Damato Signed-off-by: Guenter Roeck Link: https://patch.msgid.link/20251205171010.515236-11-linux@roeck-us.net Signed-off-by: Jakub Kicinski --- tools/testing/selftests/net/lib/ksft.h | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/net/lib/ksft.h b/tools/testing/selftests/net/lib/ksft.h index 17dc34a612c64..03912902a6d30 100644 --- a/tools/testing/selftests/net/lib/ksft.h +++ b/tools/testing/selftests/net/lib/ksft.h @@ -24,7 +24,8 @@ static inline void ksft_ready(void) fd = STDOUT_FILENO; } - write(fd, msg, sizeof(msg)); + if (write(fd, msg, sizeof(msg)) < 0) + perror("write()"); if (fd != STDOUT_FILENO) close(fd); } @@ -48,7 +49,8 @@ static inline void ksft_wait(void) fd = STDIN_FILENO; } - read(fd, &byte, sizeof(byte)); + if (read(fd, &byte, sizeof(byte)) < 0) + perror("read()"); if (fd != STDIN_FILENO) close(fd); } From 770692418cd7f96fcf93212a5f716191241af229 Mon Sep 17 00:00:00 2001 From: Guenter Roeck Date: Fri, 5 Dec 2025 09:10:07 -0800 Subject: [PATCH 0077/1321] selftests: net: tfo: Fix build warning MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Fix tfo.c: In function ‘run_server’: tfo.c:84:9: warning: ignoring return value of ‘read’ declared with attribute ‘warn_unused_result’ by evaluating the return value from read() and displaying an error message if it reports an error. Fixes: c65b5bb2329e3 ("selftests: net: add passive TFO test binary") Cc: David Wei Signed-off-by: Guenter Roeck Link: https://patch.msgid.link/20251205171010.515236-14-linux@roeck-us.net Signed-off-by: Jakub Kicinski --- tools/testing/selftests/net/tfo.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/tools/testing/selftests/net/tfo.c b/tools/testing/selftests/net/tfo.c index eb3cac5e583c9..8d82140f0f767 100644 --- a/tools/testing/selftests/net/tfo.c +++ b/tools/testing/selftests/net/tfo.c @@ -81,7 +81,8 @@ static void run_server(void) if (getsockopt(connfd, SOL_SOCKET, SO_INCOMING_NAPI_ID, &opt, &len) < 0) error(1, errno, "getsockopt(SO_INCOMING_NAPI_ID)"); - read(connfd, buf, 64); + if (read(connfd, buf, 64) < 0) + perror("read()"); fprintf(outfile, "%d\n", opt); fclose(outfile); From aced8ef9201922d6ec8cd5534edf001d478587ca Mon Sep 17 00:00:00 2001 From: Jakub Kicinski Date: Sat, 6 Dec 2025 17:09:39 -0800 Subject: [PATCH 0078/1321] inet: frags: avoid theoretical race in ip_frag_reinit() In ip_frag_reinit() we want to move the frag timeout timer into the future. If the timer fires in the meantime we inadvertently scheduled it again, and since the timer assumes a ref on frag_queue we need to acquire one to balance things out. This is technically racy, we should have acquired the reference _before_ we touch the timer, it may fire again before we take the ref. Avoid this entire dance by using mod_timer_pending() which only modifies the timer if its pending (and which exists since Linux v2.6.30) Note that this was the only place we ever took a ref on frag_queue since Eric's conversion to RCU. So we could potentially replace the whole refcnt field with an atomic flag and a bit more RCU. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reviewed-by: Eric Dumazet Link: https://patch.msgid.link/20251207010942.1672972-2-kuba@kernel.org Signed-off-by: Jakub Kicinski --- net/ipv4/inet_fragment.c | 4 +++- net/ipv4/ip_fragment.c | 4 +--- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/net/ipv4/inet_fragment.c b/net/ipv4/inet_fragment.c index 025895eb6ec59..30f4fa50ee2d7 100644 --- a/net/ipv4/inet_fragment.c +++ b/net/ipv4/inet_fragment.c @@ -327,7 +327,9 @@ static struct inet_frag_queue *inet_frag_alloc(struct fqdir *fqdir, timer_setup(&q->timer, f->frag_expire, 0); spin_lock_init(&q->lock); - /* One reference for the timer, one for the hash table. */ + /* One reference for the timer, one for the hash table. + * We never take any extra references, only decrement this field. + */ refcount_set(&q->refcnt, 2); return q; diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c index f7012479713ba..d7bccdc9dc693 100644 --- a/net/ipv4/ip_fragment.c +++ b/net/ipv4/ip_fragment.c @@ -242,10 +242,8 @@ static int ip_frag_reinit(struct ipq *qp) { unsigned int sum_truesize = 0; - if (!mod_timer(&qp->q.timer, jiffies + qp->q.fqdir->timeout)) { - refcount_inc(&qp->q.refcnt); + if (!mod_timer_pending(&qp->q.timer, jiffies + qp->q.fqdir->timeout)) return -ETIMEDOUT; - } sum_truesize = inet_frag_rbtree_purge(&qp->q.rb_fragments, SKB_DROP_REASON_FRAG_TOO_FAR); From 0620ce325a2785390369ae0cdf617ac31c5d4d42 Mon Sep 17 00:00:00 2001 From: Jakub Kicinski Date: Sat, 6 Dec 2025 17:09:40 -0800 Subject: [PATCH 0079/1321] inet: frags: add inet_frag_queue_flush() Instead of exporting inet_frag_rbtree_purge() which requires that caller takes care of memory accounting, add a new helper. We will need to call it from a few places in the next patch. Reviewed-by: Eric Dumazet Link: https://patch.msgid.link/20251207010942.1672972-3-kuba@kernel.org Signed-off-by: Jakub Kicinski --- include/net/inet_frag.h | 5 ++--- net/ipv4/inet_fragment.c | 15 ++++++++++++--- net/ipv4/ip_fragment.c | 6 +----- 3 files changed, 15 insertions(+), 11 deletions(-) diff --git a/include/net/inet_frag.h b/include/net/inet_frag.h index 0eccd9c3a883f..3ffaceee7bbc0 100644 --- a/include/net/inet_frag.h +++ b/include/net/inet_frag.h @@ -141,9 +141,8 @@ void inet_frag_kill(struct inet_frag_queue *q, int *refs); void inet_frag_destroy(struct inet_frag_queue *q); struct inet_frag_queue *inet_frag_find(struct fqdir *fqdir, void *key); -/* Free all skbs in the queue; return the sum of their truesizes. */ -unsigned int inet_frag_rbtree_purge(struct rb_root *root, - enum skb_drop_reason reason); +void inet_frag_queue_flush(struct inet_frag_queue *q, + enum skb_drop_reason reason); static inline void inet_frag_putn(struct inet_frag_queue *q, int refs) { diff --git a/net/ipv4/inet_fragment.c b/net/ipv4/inet_fragment.c index 30f4fa50ee2d7..1bf969b5a1cb5 100644 --- a/net/ipv4/inet_fragment.c +++ b/net/ipv4/inet_fragment.c @@ -263,8 +263,8 @@ static void inet_frag_destroy_rcu(struct rcu_head *head) kmem_cache_free(f->frags_cachep, q); } -unsigned int inet_frag_rbtree_purge(struct rb_root *root, - enum skb_drop_reason reason) +static unsigned int +inet_frag_rbtree_purge(struct rb_root *root, enum skb_drop_reason reason) { struct rb_node *p = rb_first(root); unsigned int sum = 0; @@ -284,7 +284,16 @@ unsigned int inet_frag_rbtree_purge(struct rb_root *root, } return sum; } -EXPORT_SYMBOL(inet_frag_rbtree_purge); + +void inet_frag_queue_flush(struct inet_frag_queue *q, + enum skb_drop_reason reason) +{ + unsigned int sum; + + sum = inet_frag_rbtree_purge(&q->rb_fragments, reason); + sub_frag_mem_limit(q->fqdir, sum); +} +EXPORT_SYMBOL(inet_frag_queue_flush); void inet_frag_destroy(struct inet_frag_queue *q) { diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c index d7bccdc9dc693..32f1c1a46ba72 100644 --- a/net/ipv4/ip_fragment.c +++ b/net/ipv4/ip_fragment.c @@ -240,14 +240,10 @@ static int ip_frag_too_far(struct ipq *qp) static int ip_frag_reinit(struct ipq *qp) { - unsigned int sum_truesize = 0; - if (!mod_timer_pending(&qp->q.timer, jiffies + qp->q.fqdir->timeout)) return -ETIMEDOUT; - sum_truesize = inet_frag_rbtree_purge(&qp->q.rb_fragments, - SKB_DROP_REASON_FRAG_TOO_FAR); - sub_frag_mem_limit(qp->q.fqdir, sum_truesize); + inet_frag_queue_flush(&qp->q, SKB_DROP_REASON_FRAG_TOO_FAR); qp->q.flags = 0; qp->q.len = 0; From 00425f3e9dffce224e9d5c1c562995d0a468d0e7 Mon Sep 17 00:00:00 2001 From: Jakub Kicinski Date: Sat, 6 Dec 2025 17:09:41 -0800 Subject: [PATCH 0080/1321] inet: frags: flush pending skbs in fqdir_pre_exit() We have been seeing occasional deadlocks on pernet_ops_rwsem since September in NIPA. The stuck task was usually modprobe (often loading a driver like ipvlan), trying to take the lock as a Writer. lockdep does not track readers for rwsems so the read wasn't obvious from the reports. On closer inspection the Reader holding the lock was conntrack looping forever in nf_conntrack_cleanup_net_list(). Based on past experience with occasional NIPA crashes I looked thru the tests which run before the crash and noticed that the crash follows ip_defrag.sh. An immediate red flag. Scouring thru (de)fragmentation queues reveals skbs sitting around, holding conntrack references. The problem is that since conntrack depends on nf_defrag_ipv6, nf_defrag_ipv6 will load first. Since nf_defrag_ipv6 loads first its netns exit hooks run _after_ conntrack's netns exit hook. Flush all fragment queue SKBs during fqdir_pre_exit() to release conntrack references before conntrack cleanup runs. Also flush the queues in timer expiry handlers when they discover fqdir->dead is set, in case packet sneaks in while we're running the pre_exit flush. The commit under Fixes is not exactly the culprit, but I think previously the timer firing would eventually unblock the spinning conntrack. Fixes: d5dd88794a13 ("inet: fix various use-after-free in defrags units") Reviewed-by: Eric Dumazet Link: https://patch.msgid.link/20251207010942.1672972-4-kuba@kernel.org Signed-off-by: Jakub Kicinski --- include/net/inet_frag.h | 13 +------------ include/net/ipv6_frag.h | 9 ++++++--- net/ipv4/inet_fragment.c | 36 ++++++++++++++++++++++++++++++++++++ net/ipv4/ip_fragment.c | 12 +++++++----- 4 files changed, 50 insertions(+), 20 deletions(-) diff --git a/include/net/inet_frag.h b/include/net/inet_frag.h index 3ffaceee7bbc0..365925c9d2628 100644 --- a/include/net/inet_frag.h +++ b/include/net/inet_frag.h @@ -123,18 +123,7 @@ void inet_frags_fini(struct inet_frags *); int fqdir_init(struct fqdir **fqdirp, struct inet_frags *f, struct net *net); -static inline void fqdir_pre_exit(struct fqdir *fqdir) -{ - /* Prevent creation of new frags. - * Pairs with READ_ONCE() in inet_frag_find(). - */ - WRITE_ONCE(fqdir->high_thresh, 0); - - /* Pairs with READ_ONCE() in inet_frag_kill(), ip_expire() - * and ip6frag_expire_frag_queue(). - */ - WRITE_ONCE(fqdir->dead, true); -} +void fqdir_pre_exit(struct fqdir *fqdir); void fqdir_exit(struct fqdir *fqdir); void inet_frag_kill(struct inet_frag_queue *q, int *refs); diff --git a/include/net/ipv6_frag.h b/include/net/ipv6_frag.h index 38ef66826939e..41d9fc6965f9a 100644 --- a/include/net/ipv6_frag.h +++ b/include/net/ipv6_frag.h @@ -69,9 +69,6 @@ ip6frag_expire_frag_queue(struct net *net, struct frag_queue *fq) int refs = 1; rcu_read_lock(); - /* Paired with the WRITE_ONCE() in fqdir_pre_exit(). */ - if (READ_ONCE(fq->q.fqdir->dead)) - goto out_rcu_unlock; spin_lock(&fq->q.lock); if (fq->q.flags & INET_FRAG_COMPLETE) @@ -80,6 +77,12 @@ ip6frag_expire_frag_queue(struct net *net, struct frag_queue *fq) fq->q.flags |= INET_FRAG_DROP; inet_frag_kill(&fq->q, &refs); + /* Paired with the WRITE_ONCE() in fqdir_pre_exit(). */ + if (READ_ONCE(fq->q.fqdir->dead)) { + inet_frag_queue_flush(&fq->q, 0); + goto out; + } + dev = dev_get_by_index_rcu(net, fq->iif); if (!dev) goto out; diff --git a/net/ipv4/inet_fragment.c b/net/ipv4/inet_fragment.c index 1bf969b5a1cb5..001ee5c4d962e 100644 --- a/net/ipv4/inet_fragment.c +++ b/net/ipv4/inet_fragment.c @@ -218,6 +218,41 @@ static int __init inet_frag_wq_init(void) pure_initcall(inet_frag_wq_init); +void fqdir_pre_exit(struct fqdir *fqdir) +{ + struct inet_frag_queue *fq; + struct rhashtable_iter hti; + + /* Prevent creation of new frags. + * Pairs with READ_ONCE() in inet_frag_find(). + */ + WRITE_ONCE(fqdir->high_thresh, 0); + + /* Pairs with READ_ONCE() in inet_frag_kill(), ip_expire() + * and ip6frag_expire_frag_queue(). + */ + WRITE_ONCE(fqdir->dead, true); + + rhashtable_walk_enter(&fqdir->rhashtable, &hti); + rhashtable_walk_start(&hti); + + while ((fq = rhashtable_walk_next(&hti))) { + if (IS_ERR(fq)) { + if (PTR_ERR(fq) != -EAGAIN) + break; + continue; + } + spin_lock_bh(&fq->lock); + if (!(fq->flags & INET_FRAG_COMPLETE)) + inet_frag_queue_flush(fq, 0); + spin_unlock_bh(&fq->lock); + } + + rhashtable_walk_stop(&hti); + rhashtable_walk_exit(&hti); +} +EXPORT_SYMBOL(fqdir_pre_exit); + void fqdir_exit(struct fqdir *fqdir) { INIT_WORK(&fqdir->destroy_work, fqdir_work_fn); @@ -290,6 +325,7 @@ void inet_frag_queue_flush(struct inet_frag_queue *q, { unsigned int sum; + reason = reason ?: SKB_DROP_REASON_FRAG_REASM_TIMEOUT; sum = inet_frag_rbtree_purge(&q->rb_fragments, reason); sub_frag_mem_limit(q->fqdir, sum); } diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c index 32f1c1a46ba72..56b0f738d2f27 100644 --- a/net/ipv4/ip_fragment.c +++ b/net/ipv4/ip_fragment.c @@ -134,11 +134,6 @@ static void ip_expire(struct timer_list *t) net = qp->q.fqdir->net; rcu_read_lock(); - - /* Paired with WRITE_ONCE() in fqdir_pre_exit(). */ - if (READ_ONCE(qp->q.fqdir->dead)) - goto out_rcu_unlock; - spin_lock(&qp->q.lock); if (qp->q.flags & INET_FRAG_COMPLETE) @@ -146,6 +141,13 @@ static void ip_expire(struct timer_list *t) qp->q.flags |= INET_FRAG_DROP; inet_frag_kill(&qp->q, &refs); + + /* Paired with WRITE_ONCE() in fqdir_pre_exit(). */ + if (READ_ONCE(qp->q.fqdir->dead)) { + inet_frag_queue_flush(&qp->q, 0); + goto out; + } + __IP_INC_STATS(net, IPSTATS_MIB_REASMFAILS); __IP_INC_STATS(net, IPSTATS_MIB_REASMTIMEOUT); From 2e26d8ec52ee89a7fda2127baeaee86b5657c481 Mon Sep 17 00:00:00 2001 From: Jakub Kicinski Date: Sat, 6 Dec 2025 17:09:42 -0800 Subject: [PATCH 0081/1321] netfilter: conntrack: warn when cleanup is stuck nf_conntrack_cleanup_net_list() calls schedule() so it does not show up as a hung task. Add an explicit check to make debugging leaked skbs/conntack references more obvious. Acked-by: Florian Westphal Reviewed-by: Eric Dumazet Link: https://patch.msgid.link/20251207010942.1672972-5-kuba@kernel.org Signed-off-by: Jakub Kicinski --- net/netfilter/nf_conntrack_core.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c index 0b95f226f2111..d1f8eb725d422 100644 --- a/net/netfilter/nf_conntrack_core.c +++ b/net/netfilter/nf_conntrack_core.c @@ -2487,6 +2487,7 @@ void nf_conntrack_cleanup_net(struct net *net) void nf_conntrack_cleanup_net_list(struct list_head *net_exit_list) { struct nf_ct_iter_data iter_data = {}; + unsigned long start = jiffies; struct net *net; int busy; @@ -2507,6 +2508,8 @@ void nf_conntrack_cleanup_net_list(struct list_head *net_exit_list) busy = 1; } if (busy) { + DEBUG_NET_WARN_ONCE(time_after(jiffies, start + 60 * HZ), + "conntrack cleanup blocked for 60s"); schedule(); goto i_see_dead_people; } From 515dad885b5aae08952e3fe4d5ca2f8938c84dc3 Mon Sep 17 00:00:00 2001 From: Petr Machata Date: Tue, 9 Dec 2025 16:29:01 +0100 Subject: [PATCH 0082/1321] selftests: net: lib: tc_rule_stats_get(): Don't hard-code array index Flower is commonly used to match on packets in many bash-based selftests. A dump of a flower filter including statistics looks something like this: [ { "protocol": "all", "pref": 49152, "kind": "flower", "chain": 0 }, { ... "options": { ... "actions": [ { ... "stats": { "bytes": 0, "packets": 0, "drops": 0, "overlimits": 0, "requeues": 0, "backlog": 0, "qlen": 0 } } ] } } ] The JQ query in the helper function tc_rule_stats_get() assumes this form and looks for the second element of the array. However, a dump of a u32 filter looks like this: [ { "protocol": "all", "pref": 49151, "kind": "u32", "chain": 0 }, { "protocol": "all", "pref": 49151, "kind": "u32", "chain": 0, "options": { "fh": "800:", "ht_divisor": 1 } }, { ... "options": { ... "actions": [ { ... "stats": { "bytes": 0, "packets": 0, "drops": 0, "overlimits": 0, "requeues": 0, "backlog": 0, "qlen": 0 } } ] } }, ] There's an extra element which the JQ query ends up choosing. Instead of hard-coding a particular index, look for the entry on which a selector .options.actions yields anything. Signed-off-by: Petr Machata Reviewed-by: Ido Schimmel Link: https://patch.msgid.link/12982a44471c834511a0ee6c1e8f57e3a5307105.1765289566.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski --- tools/testing/selftests/net/lib.sh | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/tools/testing/selftests/net/lib.sh b/tools/testing/selftests/net/lib.sh index f448bafb3f208..0ec131b339bc4 100644 --- a/tools/testing/selftests/net/lib.sh +++ b/tools/testing/selftests/net/lib.sh @@ -280,7 +280,8 @@ tc_rule_stats_get() local selector=${1:-.packets}; shift tc -j -s filter show dev $dev $dir pref $pref \ - | jq ".[1].options.actions[].stats$selector" + | jq ".[] | select(.options.actions) | + .options.actions[].stats$selector" } tc_rule_handle_stats_get() From 85589432a3dfbb26b85c5a4e7308df6c6d3fb03e Mon Sep 17 00:00:00 2001 From: Petr Machata Date: Tue, 9 Dec 2025 16:29:02 +0100 Subject: [PATCH 0083/1321] selftests: forwarding: vxlan_bridge_1q_mc_ul: Fix flakiness This test runs an overlay traffic, forwarded over a multicast-routed VXLAN underlay. In order to determine whether packets reach their intended destination, it uses a TC match. For convenience, it uses a flower match, which however does not allow matching on the encapsulated packet. So various service traffic ends up being indistinguishable from the test packets, and ends up confusing the test. To alleviate the problem, the test uses sleep to allow the necessary service traffic to run and clear the channel, before running the test traffic. This worked for a while, but lately we have nevertheless seen flakiness of the test in the CI. Fix the issue by using u32 to match the encapsulated packet as well. The confusing packets seem to always be IPv6 multicast listener reports. Realistically they could be ARP or other ICMP6 traffic as well. Therefore look for ethertype IPv4 in the IPv4 traffic test, and for IPv6 / UDP combination in the IPv6 traffic test. Signed-off-by: Petr Machata Reviewed-by: Ido Schimmel Link: https://patch.msgid.link/6438cb1613a2a667d3ff64089eb5994778f247af.1765289566.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski --- tools/testing/selftests/net/forwarding/config | 1 + .../net/forwarding/vxlan_bridge_1q_mc_ul.sh | 13 +++++++++---- 2 files changed, 10 insertions(+), 4 deletions(-) diff --git a/tools/testing/selftests/net/forwarding/config b/tools/testing/selftests/net/forwarding/config index ce64518aaa111..75a6c3d3c1da3 100644 --- a/tools/testing/selftests/net/forwarding/config +++ b/tools/testing/selftests/net/forwarding/config @@ -29,6 +29,7 @@ CONFIG_NET_ACT_VLAN=m CONFIG_NET_CLS_BASIC=m CONFIG_NET_CLS_FLOWER=m CONFIG_NET_CLS_MATCHALL=m +CONFIG_NET_CLS_U32=m CONFIG_NET_EMATCH=y CONFIG_NET_EMATCH_META=m CONFIG_NETFILTER=y diff --git a/tools/testing/selftests/net/forwarding/vxlan_bridge_1q_mc_ul.sh b/tools/testing/selftests/net/forwarding/vxlan_bridge_1q_mc_ul.sh index 6a570d256e07b..5ce19ca088461 100755 --- a/tools/testing/selftests/net/forwarding/vxlan_bridge_1q_mc_ul.sh +++ b/tools/testing/selftests/net/forwarding/vxlan_bridge_1q_mc_ul.sh @@ -138,13 +138,18 @@ install_capture() defer tc qdisc del dev "$dev" clsact tc filter add dev "$dev" ingress proto ip pref 104 \ - flower skip_hw ip_proto udp dst_port "$VXPORT" \ - action pass + u32 match ip protocol 0x11 0xff \ + match u16 "$VXPORT" 0xffff at 0x16 \ + match u16 0x0800 0xffff at 0x30 \ + action pass defer tc filter del dev "$dev" ingress proto ip pref 104 tc filter add dev "$dev" ingress proto ipv6 pref 106 \ - flower skip_hw ip_proto udp dst_port "$VXPORT" \ - action pass + u32 match ip6 protocol 0x11 0xff \ + match u16 "$VXPORT" 0xffff at 0x2a \ + match u16 0x86dd 0xffff at 0x44 \ + match u8 0x11 0xff at 0x4c \ + action pass defer tc filter del dev "$dev" ingress proto ipv6 pref 106 } From 03412e3b2a3944930b9407d3b780784147313643 Mon Sep 17 00:00:00 2001 From: Petr Machata Date: Tue, 9 Dec 2025 16:29:03 +0100 Subject: [PATCH 0084/1321] selftests: forwarding: vxlan_bridge_1q_mc_ul: Drop useless sleeping After fixing traffic matching in the previous patch, the test does not need to use the sleep anymore. So drop vx_wait() altogether, migrate all callers of vx{10,20}_create_wait() to the corresponding _create(), and drop the now unused _create_wait() helpers. Signed-off-by: Petr Machata Reviewed-by: Ido Schimmel Link: https://patch.msgid.link/eabfe4fa12ae788cf3b8c5c876a989de81dfc3d3.1765289566.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski --- .../net/forwarding/vxlan_bridge_1q_mc_ul.sh | 63 +++++++------------ 1 file changed, 22 insertions(+), 41 deletions(-) diff --git a/tools/testing/selftests/net/forwarding/vxlan_bridge_1q_mc_ul.sh b/tools/testing/selftests/net/forwarding/vxlan_bridge_1q_mc_ul.sh index 5ce19ca088461..2cf4c6d9245ba 100755 --- a/tools/testing/selftests/net/forwarding/vxlan_bridge_1q_mc_ul.sh +++ b/tools/testing/selftests/net/forwarding/vxlan_bridge_1q_mc_ul.sh @@ -253,13 +253,6 @@ vx_create() } export -f vx_create -vx_wait() -{ - # Wait for all the ARP, IGMP etc. noise to settle down so that the - # tunnel is clear for measurements. - sleep 10 -} - vx10_create() { vx_create vx10 10 id 1000 "$@" @@ -272,18 +265,6 @@ vx20_create() } export -f vx20_create -vx10_create_wait() -{ - vx10_create "$@" - vx_wait -} - -vx20_create_wait() -{ - vx20_create "$@" - vx_wait -} - ns_init_common() { local ns=$1; shift @@ -559,7 +540,7 @@ ipv4_nomcroute() # Install a misleading (S,G) rule to attempt to trick the system into # pushing the packets elsewhere. adf_install_broken_sg - vx10_create_wait local 192.0.2.100 group "$GROUP4" dev "$swp2" + vx10_create local 192.0.2.100 group "$GROUP4" dev "$swp2" do_test 4 10 0 "IPv4 nomcroute" } @@ -567,7 +548,7 @@ ipv6_nomcroute() { # Like for IPv4, install a misleading (S,G). adf_install_broken_sg - vx20_create_wait local 2001:db8:4::1 group "$GROUP6" dev "$swp2" + vx20_create local 2001:db8:4::1 group "$GROUP6" dev "$swp2" do_test 6 10 0 "IPv6 nomcroute" } @@ -586,35 +567,35 @@ ipv6_nomcroute_rx() ipv4_mcroute() { adf_install_sg - vx10_create_wait local 192.0.2.100 group "$GROUP4" dev "$IPMR" mcroute + vx10_create local 192.0.2.100 group "$GROUP4" dev "$IPMR" mcroute do_test 4 10 10 "IPv4 mcroute" } ipv6_mcroute() { adf_install_sg - vx20_create_wait local 2001:db8:4::1 group "$GROUP6" dev "$IPMR" mcroute + vx20_create local 2001:db8:4::1 group "$GROUP6" dev "$IPMR" mcroute do_test 6 10 10 "IPv6 mcroute" } ipv4_mcroute_rx() { adf_install_sg - vx10_create_wait local 192.0.2.100 group "$GROUP4" dev "$IPMR" mcroute + vx10_create local 192.0.2.100 group "$GROUP4" dev "$IPMR" mcroute ipv4_do_test_rx 0 "IPv4 mcroute ping" } ipv6_mcroute_rx() { adf_install_sg - vx20_create_wait local 2001:db8:4::1 group "$GROUP6" dev "$IPMR" mcroute + vx20_create local 2001:db8:4::1 group "$GROUP6" dev "$IPMR" mcroute ipv6_do_test_rx 0 "IPv6 mcroute ping" } ipv4_mcroute_changelink() { adf_install_sg - vx10_create_wait local 192.0.2.100 group "$GROUP4" dev "$IPMR" + vx10_create local 192.0.2.100 group "$GROUP4" dev "$IPMR" ip link set dev vx10 type vxlan mcroute sleep 1 do_test 4 10 10 "IPv4 mcroute changelink" @@ -623,7 +604,7 @@ ipv4_mcroute_changelink() ipv6_mcroute_changelink() { adf_install_sg - vx20_create_wait local 2001:db8:4::1 group "$GROUP6" dev "$IPMR" mcroute + vx20_create local 2001:db8:4::1 group "$GROUP6" dev "$IPMR" mcroute ip link set dev vx20 type vxlan mcroute sleep 1 do_test 6 10 10 "IPv6 mcroute changelink" @@ -632,47 +613,47 @@ ipv6_mcroute_changelink() ipv4_mcroute_starg() { adf_install_starg - vx10_create_wait local 192.0.2.100 group "$GROUP4" dev "$IPMR" mcroute + vx10_create local 192.0.2.100 group "$GROUP4" dev "$IPMR" mcroute do_test 4 10 10 "IPv4 mcroute (*,G)" } ipv6_mcroute_starg() { adf_install_starg - vx20_create_wait local 2001:db8:4::1 group "$GROUP6" dev "$IPMR" mcroute + vx20_create local 2001:db8:4::1 group "$GROUP6" dev "$IPMR" mcroute do_test 6 10 10 "IPv6 mcroute (*,G)" } ipv4_mcroute_starg_rx() { adf_install_starg - vx10_create_wait local 192.0.2.100 group "$GROUP4" dev "$IPMR" mcroute + vx10_create local 192.0.2.100 group "$GROUP4" dev "$IPMR" mcroute ipv4_do_test_rx 0 "IPv4 mcroute (*,G) ping" } ipv6_mcroute_starg_rx() { adf_install_starg - vx20_create_wait local 2001:db8:4::1 group "$GROUP6" dev "$IPMR" mcroute + vx20_create local 2001:db8:4::1 group "$GROUP6" dev "$IPMR" mcroute ipv6_do_test_rx 0 "IPv6 mcroute (*,G) ping" } ipv4_mcroute_noroute() { - vx10_create_wait local 192.0.2.100 group "$GROUP4" dev "$IPMR" mcroute + vx10_create local 192.0.2.100 group "$GROUP4" dev "$IPMR" mcroute do_test 4 0 0 "IPv4 mcroute, no route" } ipv6_mcroute_noroute() { - vx20_create_wait local 2001:db8:4::1 group "$GROUP6" dev "$IPMR" mcroute + vx20_create local 2001:db8:4::1 group "$GROUP6" dev "$IPMR" mcroute do_test 6 0 0 "IPv6 mcroute, no route" } ipv4_mcroute_fdb() { adf_install_sg - vx10_create_wait local 192.0.2.100 dev "$IPMR" mcroute + vx10_create local 192.0.2.100 dev "$IPMR" mcroute bridge fdb add dev vx10 \ 00:00:00:00:00:00 self static dst "$GROUP4" via "$IPMR" do_test 4 10 10 "IPv4 mcroute FDB" @@ -681,7 +662,7 @@ ipv4_mcroute_fdb() ipv6_mcroute_fdb() { adf_install_sg - vx20_create_wait local 2001:db8:4::1 dev "$IPMR" mcroute + vx20_create local 2001:db8:4::1 dev "$IPMR" mcroute bridge -6 fdb add dev vx20 \ 00:00:00:00:00:00 self static dst "$GROUP6" via "$IPMR" do_test 6 10 10 "IPv6 mcroute FDB" @@ -691,7 +672,7 @@ ipv6_mcroute_fdb() ipv4_mcroute_fdb_oif0() { adf_install_sg - vx10_create_wait local 192.0.2.100 group "$GROUP4" dev "$IPMR" mcroute + vx10_create local 192.0.2.100 group "$GROUP4" dev "$IPMR" mcroute bridge fdb del dev vx10 00:00:00:00:00:00 bridge fdb add dev vx10 00:00:00:00:00:00 self static dst "$GROUP4" do_test 4 10 10 "IPv4 mcroute oif=0" @@ -708,7 +689,7 @@ ipv6_mcroute_fdb_oif0() defer ip -6 route del table local multicast "$GROUP6/128" dev "$IPMR" adf_install_sg - vx20_create_wait local 2001:db8:4::1 group "$GROUP6" dev "$IPMR" mcroute + vx20_create local 2001:db8:4::1 group "$GROUP6" dev "$IPMR" mcroute bridge -6 fdb del dev vx20 00:00:00:00:00:00 bridge -6 fdb add dev vx20 00:00:00:00:00:00 self static dst "$GROUP6" do_test 6 10 10 "IPv6 mcroute oif=0" @@ -721,7 +702,7 @@ ipv4_mcroute_fdb_oif0_sep() adf_install_sg_sep adf_ip_addr_add lo 192.0.2.120/28 - vx10_create_wait local 192.0.2.120 group "$GROUP4" dev "$IPMR" mcroute + vx10_create local 192.0.2.120 group "$GROUP4" dev "$IPMR" mcroute bridge fdb del dev vx10 00:00:00:00:00:00 bridge fdb add dev vx10 00:00:00:00:00:00 self static dst "$GROUP4" do_test 4 10 10 "IPv4 mcroute TX!=RX oif=0" @@ -732,7 +713,7 @@ ipv4_mcroute_fdb_oif0_sep_rx() adf_install_sg_sep_rx lo adf_ip_addr_add lo 192.0.2.120/28 - vx10_create_wait local 192.0.2.120 group "$GROUP4" dev "$IPMR" mcroute + vx10_create local 192.0.2.120 group "$GROUP4" dev "$IPMR" mcroute bridge fdb del dev vx10 00:00:00:00:00:00 bridge fdb add dev vx10 00:00:00:00:00:00 self static dst "$GROUP4" ipv4_do_test_rx 0 "IPv4 mcroute TX!=RX oif=0 ping" @@ -743,7 +724,7 @@ ipv4_mcroute_fdb_sep_rx() adf_install_sg_sep_rx lo adf_ip_addr_add lo 192.0.2.120/28 - vx10_create_wait local 192.0.2.120 group "$GROUP4" dev "$IPMR" mcroute + vx10_create local 192.0.2.120 group "$GROUP4" dev "$IPMR" mcroute bridge fdb del dev vx10 00:00:00:00:00:00 bridge fdb add \ dev vx10 00:00:00:00:00:00 self static dst "$GROUP4" via lo @@ -755,7 +736,7 @@ ipv6_mcroute_fdb_sep_rx() adf_install_sg_sep_rx "X$IPMR" adf_ip_addr_add "X$IPMR" 2001:db8:5::1/64 - vx20_create_wait local 2001:db8:5::1 group "$GROUP6" dev "$IPMR" mcroute + vx20_create local 2001:db8:5::1 group "$GROUP6" dev "$IPMR" mcroute bridge -6 fdb del dev vx20 00:00:00:00:00:00 bridge -6 fdb add dev vx20 00:00:00:00:00:00 \ self static dst "$GROUP6" via "X$IPMR" From 0786aa60259fb24807c4c1e034c92a7484ea5f9b Mon Sep 17 00:00:00 2001 From: Fernando Fernandez Mancera Date: Fri, 5 Dec 2025 12:58:01 +0100 Subject: [PATCH 0085/1321] netfilter: nf_conncount: fix leaked ct in error paths There are some situations where ct might be leaked as error paths are skipping the refcounted check and return immediately. In order to solve it make sure that the check is always called. Fixes: be102eb6a0e7 ("netfilter: nf_conncount: rework API to use sk_buff directly") Signed-off-by: Fernando Fernandez Mancera Signed-off-by: Florian Westphal --- net/netfilter/nf_conncount.c | 25 ++++++++++++++----------- 1 file changed, 14 insertions(+), 11 deletions(-) diff --git a/net/netfilter/nf_conncount.c b/net/netfilter/nf_conncount.c index f1be4dd5cf85f..3654f1e8976c9 100644 --- a/net/netfilter/nf_conncount.c +++ b/net/netfilter/nf_conncount.c @@ -172,14 +172,14 @@ static int __nf_conncount_add(struct net *net, struct nf_conn *found_ct; unsigned int collect = 0; bool refcounted = false; + int err = 0; if (!get_ct_or_tuple_from_skb(net, skb, l3num, &ct, &tuple, &zone, &refcounted)) return -ENOENT; if (ct && nf_ct_is_confirmed(ct)) { - if (refcounted) - nf_ct_put(ct); - return -EEXIST; + err = -EEXIST; + goto out_put; } if ((u32)jiffies == list->last_gc) @@ -231,12 +231,16 @@ static int __nf_conncount_add(struct net *net, } add_new_node: - if (WARN_ON_ONCE(list->count > INT_MAX)) - return -EOVERFLOW; + if (WARN_ON_ONCE(list->count > INT_MAX)) { + err = -EOVERFLOW; + goto out_put; + } conn = kmem_cache_alloc(conncount_conn_cachep, GFP_ATOMIC); - if (conn == NULL) - return -ENOMEM; + if (conn == NULL) { + err = -ENOMEM; + goto out_put; + } conn->tuple = tuple; conn->zone = *zone; @@ -249,7 +253,7 @@ static int __nf_conncount_add(struct net *net, out_put: if (refcounted) nf_ct_put(ct); - return 0; + return err; } int nf_conncount_add_skb(struct net *net, @@ -456,11 +460,10 @@ insert_tree(struct net *net, rb_link_node_rcu(&rbconn->node, parent, rbnode); rb_insert_color(&rbconn->node, root); - - if (refcounted) - nf_ct_put(ct); } out_unlock: + if (refcounted) + nf_ct_put(ct); spin_unlock_bh(&nf_conncount_locks[hash]); return count; } From a91d70f50022bd2a975c933009392f31d801f0f0 Mon Sep 17 00:00:00 2001 From: Slavin Liu Date: Fri, 21 Nov 2025 16:52:13 +0800 Subject: [PATCH 0086/1321] ipvs: fix ipv4 null-ptr-deref in route error path MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The IPv4 code path in __ip_vs_get_out_rt() calls dst_link_failure() without ensuring skb->dev is set, leading to a NULL pointer dereference in fib_compute_spec_dst() when ipv4_link_failure() attempts to send ICMP destination unreachable messages. The issue emerged after commit ed0de45a1008 ("ipv4: recompile ip options in ipv4_link_failure") started calling __ip_options_compile() from ipv4_link_failure(). This code path eventually calls fib_compute_spec_dst() which dereferences skb->dev. An attempt was made to fix the NULL skb->dev dereference in commit 0113d9c9d1cc ("ipv4: fix null-deref in ipv4_link_failure"), but it only addressed the immediate dev_net(skb->dev) dereference by using a fallback device. The fix was incomplete because fib_compute_spec_dst() later in the call chain still accesses skb->dev directly, which remains NULL when IPVS calls dst_link_failure(). The crash occurs when: 1. IPVS processes a packet in NAT mode with a misconfigured destination 2. Route lookup fails in __ip_vs_get_out_rt() before establishing a route 3. The error path calls dst_link_failure(skb) with skb->dev == NULL 4. ipv4_link_failure() → ipv4_send_dest_unreach() → __ip_options_compile() → fib_compute_spec_dst() 5. fib_compute_spec_dst() dereferences NULL skb->dev Apply the same fix used for IPv6 in commit 326bf17ea5d4 ("ipvs: fix ipv6 route unreach panic"): set skb->dev from skb_dst(skb)->dev before calling dst_link_failure(). KASAN: null-ptr-deref in range [0x0000000000000328-0x000000000000032f] CPU: 1 PID: 12732 Comm: syz.1.3469 Not tainted 6.6.114 #2 RIP: 0010:__in_dev_get_rcu include/linux/inetdevice.h:233 RIP: 0010:fib_compute_spec_dst+0x17a/0x9f0 net/ipv4/fib_frontend.c:285 Call Trace: spec_dst_fill net/ipv4/ip_options.c:232 spec_dst_fill net/ipv4/ip_options.c:229 __ip_options_compile+0x13a1/0x17d0 net/ipv4/ip_options.c:330 ipv4_send_dest_unreach net/ipv4/route.c:1252 ipv4_link_failure+0x702/0xb80 net/ipv4/route.c:1265 dst_link_failure include/net/dst.h:437 __ip_vs_get_out_rt+0x15fd/0x19e0 net/netfilter/ipvs/ip_vs_xmit.c:412 ip_vs_nat_xmit+0x1d8/0xc80 net/netfilter/ipvs/ip_vs_xmit.c:764 Fixes: ed0de45a1008 ("ipv4: recompile ip options in ipv4_link_failure") Signed-off-by: Slavin Liu Acked-by: Julian Anastasov Signed-off-by: Florian Westphal --- net/netfilter/ipvs/ip_vs_xmit.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/net/netfilter/ipvs/ip_vs_xmit.c b/net/netfilter/ipvs/ip_vs_xmit.c index 3162ce3c26404..64c697212578a 100644 --- a/net/netfilter/ipvs/ip_vs_xmit.c +++ b/net/netfilter/ipvs/ip_vs_xmit.c @@ -408,6 +408,9 @@ __ip_vs_get_out_rt(struct netns_ipvs *ipvs, int skb_af, struct sk_buff *skb, return -1; err_unreach: + if (!skb->dev) + skb->dev = skb_dst(skb)->dev; + dst_link_failure(skb); return -1; } From ed02979988581f06fd338e2f34f2eada0eeda24b Mon Sep 17 00:00:00 2001 From: Lorenzo Bianconi Date: Mon, 1 Dec 2025 11:22:45 +0100 Subject: [PATCH 0087/1321] netfilter: always set route tuple out ifindex Always set nf_flow_route tuple out ifindex even if the indev is not one of the flowtable configured devices since otherwise the outdev lookup in nf_flow_offload_ip_hook() or nf_flow_offload_ipv6_hook() for FLOW_OFFLOAD_XMIT_NEIGH flowtable entries will fail. The above issue occurs in the following configuration since IP6IP6 tunnel does not support flowtable acceleration yet: $ip addr show 5: eth0: mtu 1500 qdisc noqueue state UP group default qlen 1000 link/ether 00:11:22:33:22:55 brd ff:ff:ff:ff:ff:ff link-netns ns1 inet6 2001:db8:1::2/64 scope global nodad valid_lft forever preferred_lft forever inet6 fe80::211:22ff:fe33:2255/64 scope link tentative proto kernel_ll valid_lft forever preferred_lft forever 6: eth1: mtu 1500 qdisc noqueue state UP group default qlen 1000 link/ether 00:22:22:33:22:55 brd ff:ff:ff:ff:ff:ff link-netns ns3 inet6 2001:db8:2::1/64 scope global nodad valid_lft forever preferred_lft forever inet6 fe80::222:22ff:fe33:2255/64 scope link tentative proto kernel_ll valid_lft forever preferred_lft forever 7: tun0@NONE: mtu 1452 qdisc noqueue state UNKNOWN group default qlen 1000 link/tunnel6 2001:db8:2::1 peer 2001:db8:2::2 permaddr a85:e732:2c37:: inet6 2002:db8:1::1/64 scope global nodad valid_lft forever preferred_lft forever inet6 fe80::885:e7ff:fe32:2c37/64 scope link proto kernel_ll valid_lft forever preferred_lft forever $ip -6 route show 2001:db8:1::/64 dev eth0 proto kernel metric 256 pref medium 2001:db8:2::/64 dev eth1 proto kernel metric 256 pref medium 2002:db8:1::/64 dev tun0 proto kernel metric 256 pref medium default via 2002:db8:1::2 dev tun0 metric 1024 pref medium $nft list ruleset table inet filter { flowtable ft { hook ingress priority filter devices = { eth0, eth1 } } chain forward { type filter hook forward priority filter; policy accept; meta l4proto { tcp, udp } flow add @ft } } Fixes: b5964aac51e0 ("netfilter: flowtable: consolidate xmit path") Signed-off-by: Lorenzo Bianconi Signed-off-by: Florian Westphal --- net/netfilter/nf_flow_table_path.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/net/netfilter/nf_flow_table_path.c b/net/netfilter/nf_flow_table_path.c index f0984cf69a09b..eb24fe2715dcd 100644 --- a/net/netfilter/nf_flow_table_path.c +++ b/net/netfilter/nf_flow_table_path.c @@ -250,6 +250,9 @@ static void nft_dev_forward_path(const struct nft_pktinfo *pkt, if (nft_dev_fill_forward_path(route, dst, ct, dir, ha, &stack) >= 0) nft_dev_path_info(&stack, &info, ha, &ft->data); + if (info.outdev) + route->tuple[dir].out.ifindex = info.outdev->ifindex; + if (!info.indev || !nft_flowtable_find_dev(info.indev, ft)) return; @@ -269,7 +272,6 @@ static void nft_dev_forward_path(const struct nft_pktinfo *pkt, route->tuple[!dir].in.num_encaps = info.num_encaps; route->tuple[!dir].in.ingress_vlans = info.ingress_vlans; - route->tuple[dir].out.ifindex = info.outdev->ifindex; if (info.xmit_type == FLOW_OFFLOAD_XMIT_DIRECT) { memcpy(route->tuple[dir].out.h_source, info.h_source, ETH_ALEN); From b8c6a7314f1866e1cc62553daef558e747f7a842 Mon Sep 17 00:00:00 2001 From: Florian Westphal Date: Tue, 9 Dec 2025 00:03:36 +0100 Subject: [PATCH 0088/1321] selftests: netfilter: prefer xfail in case race wasn't triggered Jakub says: "We try to reserve SKIP for tests skipped because tool is missing in env, something isn't built into the kernel etc." use xfail, we can't force the race condition to appear at will so its expected that the test 'fails' occasionally. Fixes: 78a588363587 ("selftests: netfilter: add conntrack clash resolution test case") Reported-by: Jakub Kicinski Closes: https://lore.kernel.org/netdev/20251206175647.5c32f419@kernel.org/ Signed-off-by: Florian Westphal --- tools/testing/selftests/net/netfilter/conntrack_clash.sh | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/tools/testing/selftests/net/netfilter/conntrack_clash.sh b/tools/testing/selftests/net/netfilter/conntrack_clash.sh index 7fc6c5dbd5516..84b8eb12143ae 100755 --- a/tools/testing/selftests/net/netfilter/conntrack_clash.sh +++ b/tools/testing/selftests/net/netfilter/conntrack_clash.sh @@ -116,7 +116,7 @@ run_one_clash_test() # not a failure: clash resolution logic did not trigger. # With right timing, xmit completed sequentially and # no parallel insertion occurs. - return $ksft_skip + return $ksft_xfail } run_clash_test() @@ -133,12 +133,12 @@ run_clash_test() if [ $rv -eq 0 ];then echo "PASS: clash resolution test for $daddr:$dport on attempt $i" return 0 - elif [ $rv -eq $ksft_skip ]; then + elif [ $rv -eq $ksft_xfail ]; then softerr=1 fi done - [ $softerr -eq 1 ] && echo "SKIP: clash resolution for $daddr:$dport did not trigger" + [ $softerr -eq 1 ] && echo "XFAIL: clash resolution for $daddr:$dport did not trigger" } ip link add veth0 netns "$nsclient1" type veth peer name veth0 netns "$nsrouter" @@ -167,8 +167,7 @@ load_simple_ruleset "$nsclient2" run_clash_test "$nsclient2" "$nsclient2" 127.0.0.1 9001 if [ $clash_resolution_active -eq 0 ];then - [ "$ret" -eq 0 ] && ret=$ksft_skip - echo "SKIP: Clash resolution did not trigger" + [ "$ret" -eq 0 ] && ret=$ksft_xfail fi exit $ret From bbec844b6918acdd0b283e35bc52cf357734c89c Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Thu, 4 Dec 2025 11:00:09 +0100 Subject: [PATCH 0089/1321] can: fix build dependency A recent bugfix introduced a new problem with Kconfig dependencies: WARNING: unmet direct dependencies detected for CAN_DEV Depends on [n]: NETDEVICES [=n] && CAN [=m] Selected by [m]: - CAN [=m] && NET [=y] Since the CAN core code now links into the CAN device code, that particular function needs to be available, though the rest of it does not. Revert the incomplete fix and instead use Makefile logic to avoid the link failure. Fixes: cb2dc6d2869a ("can: Kconfig: select CAN driver infrastructure by default") Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202512091523.zty3CLmc-lkp@intel.com/ Signed-off-by: Arnd Bergmann Tested-by: Oliver Hartkopp Acked-by: Oliver Hartkopp Link: https://patch.msgid.link/20251204100015.1033688-1-arnd@kernel.org [mkl: removed module option from CAN_DEV help text (thanks Vincent)] [mkl: removed '&& CAN' from Kconfig dependency (thanks Vincent)] Signed-off-by: Marc Kleine-Budde --- drivers/net/can/Kconfig | 5 +---- drivers/net/can/Makefile | 2 +- drivers/net/can/dev/Makefile | 5 ++--- net/can/Kconfig | 1 - 4 files changed, 4 insertions(+), 9 deletions(-) diff --git a/drivers/net/can/Kconfig b/drivers/net/can/Kconfig index e15e320db4763..460a74ae69233 100644 --- a/drivers/net/can/Kconfig +++ b/drivers/net/can/Kconfig @@ -1,7 +1,7 @@ # SPDX-License-Identifier: GPL-2.0-only menuconfig CAN_DEV - tristate "CAN Device Drivers" + bool "CAN Device Drivers" default y depends on CAN help @@ -17,9 +17,6 @@ menuconfig CAN_DEV virtual ones. If you own such devices or plan to use the virtual CAN interfaces to develop applications, say Y here. - To compile as a module, choose M here: the module will be called - can-dev. - if CAN_DEV config CAN_VCAN diff --git a/drivers/net/can/Makefile b/drivers/net/can/Makefile index d7bc10a6b8eae..37e2f1a2faecd 100644 --- a/drivers/net/can/Makefile +++ b/drivers/net/can/Makefile @@ -7,7 +7,7 @@ obj-$(CONFIG_CAN_VCAN) += vcan.o obj-$(CONFIG_CAN_VXCAN) += vxcan.o obj-$(CONFIG_CAN_SLCAN) += slcan/ -obj-y += dev/ +obj-$(CONFIG_CAN_DEV) += dev/ obj-y += esd/ obj-y += rcar/ obj-y += rockchip/ diff --git a/drivers/net/can/dev/Makefile b/drivers/net/can/dev/Makefile index 633687d6b6c0c..64226acf0f3d4 100644 --- a/drivers/net/can/dev/Makefile +++ b/drivers/net/can/dev/Makefile @@ -1,9 +1,8 @@ # SPDX-License-Identifier: GPL-2.0 -obj-$(CONFIG_CAN_DEV) += can-dev.o - -can-dev-y += skb.o +obj-$(CONFIG_CAN) += can-dev.o +can-dev-$(CONFIG_CAN_DEV) += skb.o can-dev-$(CONFIG_CAN_CALC_BITTIMING) += calc_bittiming.o can-dev-$(CONFIG_CAN_NETLINK) += bittiming.o can-dev-$(CONFIG_CAN_NETLINK) += dev.o diff --git a/net/can/Kconfig b/net/can/Kconfig index e4ccf731a24ce..af64a6f764588 100644 --- a/net/can/Kconfig +++ b/net/can/Kconfig @@ -5,7 +5,6 @@ menuconfig CAN tristate "CAN bus subsystem support" - select CAN_DEV help Controller Area Network (CAN) is a slow (up to 1Mbit/s) serial communications protocol. Development of the CAN bus started in From baedb7e249874f0a764cf0bfb62a7ac4681df503 Mon Sep 17 00:00:00 2001 From: Marc Kleine-Budde Date: Mon, 1 Dec 2025 19:26:38 +0100 Subject: [PATCH 0090/1321] can: gs_usb: gs_can_open(): fix error handling Commit 2603be9e8167 ("can: gs_usb: gs_can_open(): improve error handling") added missing error handling to the gs_can_open() function. The driver uses 2 USB anchors to track the allocated URBs: the TX URBs in struct gs_can::tx_submitted for each netdev and the RX URBs in struct gs_usb::rx_submitted for the USB device. gs_can_open() allocates the RX URBs, while TX URBs are allocated during gs_can_start_xmit(). The cleanup in gs_can_open() kills all anchored dev->tx_submitted URBs (which is not necessary since the netdev is not yet registered), but misses the parent->rx_submitted URBs. Fix the problem by killing the rx_submitted instead of the tx_submitted. Fixes: 2603be9e8167 ("can: gs_usb: gs_can_open(): improve error handling") Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251210-gs_usb-fix-error-handling-v1-1-d6a5a03f10bb@pengutronix.de Signed-off-by: Marc Kleine-Budde --- drivers/net/can/usb/gs_usb.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/can/usb/gs_usb.c b/drivers/net/can/usb/gs_usb.c index e29e85b67fd40..a0233e550a5ad 100644 --- a/drivers/net/can/usb/gs_usb.c +++ b/drivers/net/can/usb/gs_usb.c @@ -1074,7 +1074,7 @@ static int gs_can_open(struct net_device *netdev) usb_free_urb(urb); out_usb_kill_anchored_urbs: if (!parent->active_channels) { - usb_kill_anchored_urbs(&dev->tx_submitted); + usb_kill_anchored_urbs(&parent->rx_submitted); if (dev->feature & GS_CAN_FEATURE_HW_TIMESTAMP) gs_usb_timestamp_stop(parent); From f34252622e4eb621076dd05b5524e17ae701b506 Mon Sep 17 00:00:00 2001 From: Marcus Hughes Date: Sun, 7 Dec 2025 21:03:55 +0000 Subject: [PATCH 0091/1321] net: sfp: extend Potron XGSPON quirk to cover additional EEPROM variant Some Potron SFP+ XGSPON ONU sticks are shipped with different EEPROM vendor ID and vendor name strings, but are otherwise functionally identical to the existing "Potron SFP+ XGSPON ONU Stick" handled by sfp_quirk_potron(). These modules, including units distributed under the "Better Internet" branding, use the same UART pin assignment and require the same TX_FAULT/LOS behaviour and boot delay. Re-use the existing Potron quirk for this EEPROM variant. Signed-off-by: Marcus Hughes Link: https://patch.msgid.link/20251207210355.333451-1-marcus.hughes@betterinternet.ltd Signed-off-by: Jakub Kicinski --- drivers/net/phy/sfp.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c index 0401fa6b24d25..6166e91963644 100644 --- a/drivers/net/phy/sfp.c +++ b/drivers/net/phy/sfp.c @@ -497,6 +497,8 @@ static const struct sfp_quirk sfp_quirks[] = { SFP_QUIRK("ALCATELLUCENT", "3FE46541AA", sfp_quirk_2500basex, sfp_fixup_nokia), + SFP_QUIRK_F("BIDB", "X-ONU-SFPP", sfp_fixup_potron), + // FLYPRO SFP-10GT-CS-30M uses Rollball protocol to talk to the PHY. SFP_QUIRK_F("FLYPRO", "SFP-10GT-CS-30M", sfp_fixup_rollball), From c8c9c8383dd052c696ecf7fd62c4812986de1426 Mon Sep 17 00:00:00 2001 From: Junrui Luo Date: Thu, 4 Dec 2025 21:30:47 +0800 Subject: [PATCH 0092/1321] caif: fix integer underflow in cffrml_receive() The cffrml_receive() function extracts a length field from the packet header and, when FCS is disabled, subtracts 2 from this length without validating that len >= 2. If an attacker sends a malicious packet with a length field of 0 or 1 to an interface with FCS disabled, the subtraction causes an integer underflow. This can lead to memory exhaustion and kernel instability, potential information disclosure if padding contains uninitialized kernel memory. Fix this by validating that len >= 2 before performing the subtraction. Reported-by: Yuhao Jiang Reported-by: Junrui Luo Fixes: b482cd2053e3 ("net-caif: add CAIF core protocol stack") Signed-off-by: Junrui Luo Reviewed-by: Simon Horman Link: https://patch.msgid.link/SYBPR01MB7881511122BAFEA8212A1608AFA6A@SYBPR01MB7881.ausprd01.prod.outlook.com Signed-off-by: Jakub Kicinski --- net/caif/cffrml.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/net/caif/cffrml.c b/net/caif/cffrml.c index 6651a8dc62e04..d4d63586053ad 100644 --- a/net/caif/cffrml.c +++ b/net/caif/cffrml.c @@ -92,8 +92,15 @@ static int cffrml_receive(struct cflayer *layr, struct cfpkt *pkt) len = le16_to_cpu(tmp); /* Subtract for FCS on length if FCS is not used. */ - if (!this->dofcs) + if (!this->dofcs) { + if (len < 2) { + ++cffrml_rcv_error; + pr_err("Invalid frame length (%d)\n", len); + cfpkt_destroy(pkt); + return -EPROTO; + } len -= 2; + } if (cfpkt_setlen(pkt, len) < 0) { ++cffrml_rcv_error; From 15d3aec9795139ece90b51bbb3c14d75cd838176 Mon Sep 17 00:00:00 2001 From: Victor Nogueira Date: Mon, 8 Dec 2025 16:01:24 -0300 Subject: [PATCH 0093/1321] net/sched: ets: Remove drr class from the active list if it changes to strict Whenever a user issues an ets qdisc change command, transforming a drr class into a strict one, the ets code isn't checking whether that class was in the active list and removing it. This means that, if a user changes a strict class (which was in the active list) back to a drr one, that class will be added twice to the active list [1]. Doing so with the following commands: tc qdisc add dev lo root handle 1: ets bands 2 strict 1 tc qdisc add dev lo parent 1:2 handle 20: \ tbf rate 8bit burst 100b latency 1s tc filter add dev lo parent 1: basic classid 1:2 ping -c1 -W0.01 -s 56 127.0.0.1 tc qdisc change dev lo root handle 1: ets bands 2 strict 2 tc qdisc change dev lo root handle 1: ets bands 2 strict 1 ping -c1 -W0.01 -s 56 127.0.0.1 Will trigger the following splat with list debug turned on: [ 59.279014][ T365] ------------[ cut here ]------------ [ 59.279452][ T365] list_add double add: new=ffff88801d60e350, prev=ffff88801d60e350, next=ffff88801d60e2c0. [ 59.280153][ T365] WARNING: CPU: 3 PID: 365 at lib/list_debug.c:35 __list_add_valid_or_report+0x17f/0x220 [ 59.280860][ T365] Modules linked in: [ 59.281165][ T365] CPU: 3 UID: 0 PID: 365 Comm: tc Not tainted 6.18.0-rc7-00105-g7e9f13163c13-dirty #239 PREEMPT(voluntary) [ 59.281977][ T365] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 59.282391][ T365] RIP: 0010:__list_add_valid_or_report+0x17f/0x220 [ 59.282842][ T365] Code: 89 c6 e8 d4 b7 0d ff 90 0f 0b 90 90 31 c0 e9 31 ff ff ff 90 48 c7 c7 e0 a0 22 9f 48 89 f2 48 89 c1 4c 89 c6 e8 b2 b7 0d ff 90 <0f> 0b 90 90 31 c0 e9 0f ff ff ff 48 89 f7 48 89 44 24 10 4c 89 44 ... [ 59.288812][ T365] Call Trace: [ 59.289056][ T365] [ 59.289224][ T365] ? srso_alias_return_thunk+0x5/0xfbef5 [ 59.289546][ T365] ets_qdisc_change+0xd2b/0x1e80 [ 59.289891][ T365] ? __lock_acquire+0x7e7/0x1be0 [ 59.290223][ T365] ? __pfx_ets_qdisc_change+0x10/0x10 [ 59.290546][ T365] ? srso_alias_return_thunk+0x5/0xfbef5 [ 59.290898][ T365] ? __mutex_trylock_common+0xda/0x240 [ 59.291228][ T365] ? __pfx___mutex_trylock_common+0x10/0x10 [ 59.291655][ T365] ? srso_alias_return_thunk+0x5/0xfbef5 [ 59.291993][ T365] ? srso_alias_return_thunk+0x5/0xfbef5 [ 59.292313][ T365] ? trace_contention_end+0xc8/0x110 [ 59.292656][ T365] ? srso_alias_return_thunk+0x5/0xfbef5 [ 59.293022][ T365] ? srso_alias_return_thunk+0x5/0xfbef5 [ 59.293351][ T365] tc_modify_qdisc+0x63a/0x1cf0 Fix this by always checking and removing an ets class from the active list when changing it to strict. [1] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/tree/net/sched/sch_ets.c?id=ce052b9402e461a9aded599f5b47e76bc727f7de#n663 Fixes: cd9b50adc6bb9 ("net/sched: ets: fix crash when flipping from 'strict' to 'quantum'") Acked-by: Jamal Hadi Salim Signed-off-by: Victor Nogueira Reviewed-by: Petr Machata Link: https://patch.msgid.link/20251208190125.1868423-1-victor@mojatatu.com Signed-off-by: Jakub Kicinski --- net/sched/sch_ets.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/net/sched/sch_ets.c b/net/sched/sch_ets.c index ae46643e596d3..306e046276d46 100644 --- a/net/sched/sch_ets.c +++ b/net/sched/sch_ets.c @@ -664,6 +664,10 @@ static int ets_qdisc_change(struct Qdisc *sch, struct nlattr *opt, q->classes[i].deficit = quanta[i]; } } + for (i = q->nstrict; i < nstrict; i++) { + if (cl_is_active(&q->classes[i])) + list_del_init(&q->classes[i].alist); + } WRITE_ONCE(q->nstrict, nstrict); memcpy(q->prio2band, priomap, sizeof(priomap)); From 7d0682dcbf6eb47abd85bcda0b6c530dd05c71dd Mon Sep 17 00:00:00 2001 From: Victor Nogueira Date: Mon, 8 Dec 2025 16:01:25 -0300 Subject: [PATCH 0094/1321] selftests/tc-testing: Create tests to exercise ets classes active list misplacements Add a test case for a bug fixed by Jamal [1] and for scenario where an ets drr class is inserted into the active list twice. - Try to delete ets drr class' qdisc while still keeping it in the active list - Try to add ets class to the active list twice [1] https://lore.kernel.org/netdev/20251128151919.576920-1-jhs@mojatatu.com/ Acked-by: Jamal Hadi Salim Signed-off-by: Victor Nogueira Reviewed-by: Petr Machata Link: https://patch.msgid.link/20251208190125.1868423-2-victor@mojatatu.com Signed-off-by: Jakub Kicinski --- .../tc-testing/tc-tests/infra/qdiscs.json | 78 +++++++++++++++++++ 1 file changed, 78 insertions(+) diff --git a/tools/testing/selftests/tc-testing/tc-tests/infra/qdiscs.json b/tools/testing/selftests/tc-testing/tc-tests/infra/qdiscs.json index 47de27fd4f90f..6a39640aa2a86 100644 --- a/tools/testing/selftests/tc-testing/tc-tests/infra/qdiscs.json +++ b/tools/testing/selftests/tc-testing/tc-tests/infra/qdiscs.json @@ -1033,5 +1033,83 @@ "teardown": [ "$TC qdisc del dev $DUMMY handle 1: root" ] + }, + { + "id": "6e4f", + "name": "Try to delete ets drr class' qdisc while still keeping it in the active list", + "category": [ + "qdisc", + "ets", + "tbf" + ], + "plugins": { + "requires": [ + "nsPlugin", + "scapyPlugin" + ] + }, + "setup": [ + "$IP link set dev $DUMMY up || true", + "$IP addr add 10.10.11.10/24 dev $DUMMY || true", + "$TC qdisc add dev $DUMMY root handle 1: ets bands 2 strict 1", + "$TC qdisc add dev $DUMMY parent 1:2 handle 20: tbf rate 8bit burst 100b latency 1s", + "$TC filter add dev $DUMMY parent 1: basic classid 1:2", + "ping -c2 -W0.01 -s 56 -I $DUMMY 10.10.11.11 || true", + "$TC qdisc change dev $DUMMY root handle 1: ets bands 2 strict 2", + "$TC qdisc change dev $DUMMY root handle 1: ets bands 1 strict 1" + ], + "cmdUnderTest": "ping -c1 -W0.01 -s 56 -I $DUMMY 10.10.11.11", + "expExitCode": "1", + "verifyCmd": "$TC -s -j qdisc ls dev $DUMMY root", + "matchJSON": [ + { + "kind": "ets", + "handle": "1:", + "bytes": 196, + "packets": 2 + } + ], + "teardown": [ + "$TC qdisc del dev $DUMMY root handle 1:" + ] + }, + { + "id": "0b8f", + "name": "Try to add ets class to the active list twice", + "category": [ + "qdisc", + "ets", + "tbf" + ], + "plugins": { + "requires": [ + "nsPlugin", + "scapyPlugin" + ] + }, + "setup": [ + "$IP link set dev $DUMMY up || true", + "$IP addr add 10.10.11.10/24 dev $DUMMY || true", + "$TC qdisc add dev $DUMMY root handle 1: ets bands 2 strict 1", + "$TC qdisc add dev $DUMMY parent 1:2 handle 20: tbf rate 8bit burst 100b latency 1s", + "$TC filter add dev $DUMMY parent 1: basic classid 1:2", + "ping -c2 -W0.01 -s 56 -I $DUMMY 10.10.11.11 || true", + "$TC qdisc change dev $DUMMY root handle 1: ets bands 2 strict 2", + "$TC qdisc change dev $DUMMY root handle 1: ets bands 2 strict 1" + ], + "cmdUnderTest": "ping -c1 -W0.01 -s 56 -I $DUMMY 10.10.11.11", + "expExitCode": "1", + "verifyCmd": "$TC -s -j qdisc ls dev $DUMMY root", + "matchJSON": [ + { + "kind": "ets", + "handle": "1:", + "bytes": 98, + "packets": 1 + } + ], + "teardown": [ + "$TC qdisc del dev $DUMMY root handle 1:" + ] } ] From bebc3c92f6c65bb08810cee68d95073573716c76 Mon Sep 17 00:00:00 2001 From: Dan Carpenter Date: Tue, 9 Dec 2025 09:56:39 +0300 Subject: [PATCH 0095/1321] nfc: pn533: Fix error code in pn533_acr122_poweron_rdr() Set the error code if "transferred != sizeof(cmd)" instead of returning success. Fixes: dbafc28955fa ("NFC: pn533: don't send USB data off of the stack") Signed-off-by: Dan Carpenter Link: https://patch.msgid.link/aTfIJ9tZPmeUF4W1@stanley.mountain Signed-off-by: Jakub Kicinski --- drivers/nfc/pn533/usb.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/nfc/pn533/usb.c b/drivers/nfc/pn533/usb.c index ffd7367ce1194..018a80674f06e 100644 --- a/drivers/nfc/pn533/usb.c +++ b/drivers/nfc/pn533/usb.c @@ -406,7 +406,7 @@ static int pn533_acr122_poweron_rdr(struct pn533_usb_phy *phy) if (rc || (transferred != sizeof(cmd))) { nfc_err(&phy->udev->dev, "Reader power on cmd error %d\n", rc); - return rc; + return rc ?: -EINVAL; } rc = usb_submit_urb(phy->in_urb, GFP_KERNEL); From 5a6836d0fac3e49fab4128b9659d668fb6095f43 Mon Sep 17 00:00:00 2001 From: Gal Pressman Date: Mon, 8 Dec 2025 14:19:01 +0200 Subject: [PATCH 0096/1321] ethtool: Avoid overflowing userspace buffer on stats query The ethtool -S command operates across three ioctl calls: ETHTOOL_GSSET_INFO for the size, ETHTOOL_GSTRINGS for the names, and ETHTOOL_GSTATS for the values. If the number of stats changes between these calls (e.g., due to device reconfiguration), userspace's buffer allocation will be incorrect, potentially leading to buffer overflow. Drivers are generally expected to maintain stable stat counts, but some drivers (e.g., mlx5, bnx2x, bna, ksz884x) use dynamic counters, making this scenario possible. Some drivers try to handle this internally: - bnad_get_ethtool_stats() returns early in case stats.n_stats is not equal to the driver's stats count. - micrel/ksz884x also makes sure not to write anything beyond stats.n_stats and overflow the buffer. However, both use stats.n_stats which is already assigned with the value returned from get_sset_count(), hence won't solve the issue described here. Change ethtool_get_strings(), ethtool_get_stats(), ethtool_get_phy_stats() to not return anything in case of a mismatch between userspace's size and get_sset_size(), to prevent buffer overflow. The returned n_stats value will be equal to zero, to reflect that nothing has been returned. This could result in one of two cases when using upstream ethtool, depending on when the size change is detected: 1. When detected in ethtool_get_strings(): # ethtool -S eth2 no stats available 2. When detected in get stats, all stats will be reported as zero. Both cases are presumably transient, and a subsequent ethtool call should succeed. Other than the overflow avoidance, these two cases are very evident (no output/cleared stats), which is arguably better than presenting incorrect/shifted stats. I also considered returning an error instead of a "silent" response, but that seems more destructive towards userspace apps. Notes: - This patch does not claim to fix the inherent race, it only makes sure that we do not overflow the userspace buffer, and makes for a more predictable behavior. - RTNL lock is held during each ioctl, the race window exists between the separate ioctl calls when the lock is released. - Userspace ethtool always fills stats.n_stats, but it is likely that these stats ioctls are implemented in other userspace applications which might not fill it. The added code checks that it's not zero, to prevent any regressions. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reviewed-by: Dragos Tatulea Reviewed-by: Tariq Toukan Signed-off-by: Gal Pressman Link: https://patch.msgid.link/20251208121901.3203692-1-gal@nvidia.com Signed-off-by: Paolo Abeni --- net/ethtool/ioctl.c | 30 ++++++++++++++++++++++++------ 1 file changed, 24 insertions(+), 6 deletions(-) diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c index fa83ddade4f81..9431e305b2333 100644 --- a/net/ethtool/ioctl.c +++ b/net/ethtool/ioctl.c @@ -2383,7 +2383,10 @@ static int ethtool_get_strings(struct net_device *dev, void __user *useraddr) return -ENOMEM; WARN_ON_ONCE(!ret); - gstrings.len = ret; + if (gstrings.len && gstrings.len != ret) + gstrings.len = 0; + else + gstrings.len = ret; if (gstrings.len) { data = vzalloc(array_size(gstrings.len, ETH_GSTRING_LEN)); @@ -2509,10 +2512,13 @@ static int ethtool_get_stats(struct net_device *dev, void __user *useraddr) if (copy_from_user(&stats, useraddr, sizeof(stats))) return -EFAULT; - stats.n_stats = n_stats; + if (stats.n_stats && stats.n_stats != n_stats) + stats.n_stats = 0; + else + stats.n_stats = n_stats; - if (n_stats) { - data = vzalloc(array_size(n_stats, sizeof(u64))); + if (stats.n_stats) { + data = vzalloc(array_size(stats.n_stats, sizeof(u64))); if (!data) return -ENOMEM; ops->get_ethtool_stats(dev, &stats, data); @@ -2524,7 +2530,9 @@ static int ethtool_get_stats(struct net_device *dev, void __user *useraddr) if (copy_to_user(useraddr, &stats, sizeof(stats))) goto out; useraddr += sizeof(stats); - if (n_stats && copy_to_user(useraddr, data, array_size(n_stats, sizeof(u64)))) + if (stats.n_stats && + copy_to_user(useraddr, data, + array_size(stats.n_stats, sizeof(u64)))) goto out; ret = 0; @@ -2560,6 +2568,10 @@ static int ethtool_get_phy_stats_phydev(struct phy_device *phydev, return -EOPNOTSUPP; n_stats = phy_ops->get_sset_count(phydev); + if (stats->n_stats && stats->n_stats != n_stats) { + stats->n_stats = 0; + return 0; + } ret = ethtool_vzalloc_stats_array(n_stats, data); if (ret) @@ -2580,6 +2592,10 @@ static int ethtool_get_phy_stats_ethtool(struct net_device *dev, return -EOPNOTSUPP; n_stats = ops->get_sset_count(dev, ETH_SS_PHY_STATS); + if (stats->n_stats && stats->n_stats != n_stats) { + stats->n_stats = 0; + return 0; + } ret = ethtool_vzalloc_stats_array(n_stats, data); if (ret) @@ -2616,7 +2632,9 @@ static int ethtool_get_phy_stats(struct net_device *dev, void __user *useraddr) } useraddr += sizeof(stats); - if (copy_to_user(useraddr, data, array_size(stats.n_stats, sizeof(u64)))) + if (stats.n_stats && + copy_to_user(useraddr, data, + array_size(stats.n_stats, sizeof(u64)))) ret = -EFAULT; out: From 4da2dd423b84e7d514a6ad36a1502fa467eb1d3d Mon Sep 17 00:00:00 2001 From: Daniel Golle Date: Tue, 9 Dec 2025 01:28:20 +0000 Subject: [PATCH 0097/1321] net: dsa: lantiq_gswip: fix order in .remove operation Russell King pointed out that disabling the switch by clearing GSWIP_MDIO_GLOB_ENABLE before calling dsa_unregister_switch() is problematic, as it violates a Golden Rule of driver development to always first unpublish userspace interfaces and then disable the hardware. Fix this, and also simplify the probe() function, by introducing a dsa_switch_ops teardown() operation which takes care of clearing the GSWIP_MDIO_GLOB_ENABLE bit. Fixes: 14fceff4771e5 ("net: dsa: Add Lantiq / Intel DSA driver for vrx200") Suggested-by: "Russell King (Oracle)" Signed-off-by: Daniel Golle Link: https://patch.msgid.link/4ebd72a29edc1e4059b9666a26a0bb5d906a829a.1765241054.git.daniel@makrotopia.org Reviewed-by: Vladimir Oltean Signed-off-by: Paolo Abeni --- drivers/net/dsa/lantiq/lantiq_gswip.c | 3 --- drivers/net/dsa/lantiq/lantiq_gswip_common.c | 13 ++++++++++--- 2 files changed, 10 insertions(+), 6 deletions(-) diff --git a/drivers/net/dsa/lantiq/lantiq_gswip.c b/drivers/net/dsa/lantiq/lantiq_gswip.c index 57dd063c07403..b094001a7c805 100644 --- a/drivers/net/dsa/lantiq/lantiq_gswip.c +++ b/drivers/net/dsa/lantiq/lantiq_gswip.c @@ -444,9 +444,6 @@ static void gswip_remove(struct platform_device *pdev) if (!priv) return; - /* disable the switch */ - gswip_disable_switch(priv); - dsa_unregister_switch(priv->ds); for (i = 0; i < priv->num_gphy_fw; i++) diff --git a/drivers/net/dsa/lantiq/lantiq_gswip_common.c b/drivers/net/dsa/lantiq/lantiq_gswip_common.c index 9da39edf8f574..6b171d58e1862 100644 --- a/drivers/net/dsa/lantiq/lantiq_gswip_common.c +++ b/drivers/net/dsa/lantiq/lantiq_gswip_common.c @@ -752,6 +752,13 @@ static int gswip_setup(struct dsa_switch *ds) return 0; } +static void gswip_teardown(struct dsa_switch *ds) +{ + struct gswip_priv *priv = ds->priv; + + regmap_clear_bits(priv->mdio, GSWIP_MDIO_GLOB, GSWIP_MDIO_GLOB_ENABLE); +} + static enum dsa_tag_protocol gswip_get_tag_protocol(struct dsa_switch *ds, int port, enum dsa_tag_protocol mp) @@ -1629,6 +1636,7 @@ static const struct phylink_mac_ops gswip_phylink_mac_ops = { static const struct dsa_switch_ops gswip_switch_ops = { .get_tag_protocol = gswip_get_tag_protocol, .setup = gswip_setup, + .teardown = gswip_teardown, .port_setup = gswip_port_setup, .port_enable = gswip_port_enable, .port_disable = gswip_port_disable, @@ -1718,15 +1726,14 @@ int gswip_probe_common(struct gswip_priv *priv, u32 version) err = gswip_validate_cpu_port(priv->ds); if (err) - goto disable_switch; + goto unregister_switch; dev_info(priv->dev, "probed GSWIP version %lx mod %lx\n", GSWIP_VERSION_REV(version), GSWIP_VERSION_MOD(version)); return 0; -disable_switch: - gswip_disable_switch(priv); +unregister_switch: dsa_unregister_switch(priv->ds); return err; From 864f68219827ef22df38eea9be141c4bb1b86799 Mon Sep 17 00:00:00 2001 From: Daniel Golle Date: Tue, 9 Dec 2025 01:28:49 +0000 Subject: [PATCH 0098/1321] net: dsa: mxl-gsw1xx: fix order in .remove operation The driver's .remove operation was calling gswip_disable_switch() which clears the GSWIP_MDIO_GLOB_ENABLE bit before calling dsa_unregister_switch() and thereby violating a Golden Rule of driver development to always unpublish userspace interfaces before disabling hardware, as pointed out by Russell King. Fix this by relying in GSWIP_MDIO_GLOB_ENABLE being cleared by the .teardown operation introduced by the previous commit ("net: dsa: lantiq_gswip: fix teardown order"). Fixes: 22335939ec907 ("net: dsa: add driver for MaxLinear GSW1xx switch family") Suggested-by: "Russell King (Oracle)" Signed-off-by: Daniel Golle Link: https://patch.msgid.link/63f882eeb910cf24503c35a443b541cc54a930f2.1765241054.git.daniel@makrotopia.org Reviewed-by: Vladimir Oltean Signed-off-by: Paolo Abeni --- drivers/net/dsa/lantiq/mxl-gsw1xx.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/net/dsa/lantiq/mxl-gsw1xx.c b/drivers/net/dsa/lantiq/mxl-gsw1xx.c index cf33a16fd183b..cda966d71e889 100644 --- a/drivers/net/dsa/lantiq/mxl-gsw1xx.c +++ b/drivers/net/dsa/lantiq/mxl-gsw1xx.c @@ -652,8 +652,6 @@ static void gsw1xx_remove(struct mdio_device *mdiodev) if (!priv) return; - gswip_disable_switch(priv); - dsa_unregister_switch(priv->ds); } From 8fc43918758a73c63a4df9fd7f0cfbdd24cda05b Mon Sep 17 00:00:00 2001 From: Daniel Golle Date: Tue, 9 Dec 2025 01:29:05 +0000 Subject: [PATCH 0099/1321] net: dsa: mxl-gsw1xx: fix .shutdown driver operation The .shutdown operation should call dsa_switch_shutdown() just like it is done also by the sibling lantiq_gswip driver. Not doing that results in shutdown or reboot hanging and waiting for the CPU port becoming free, which introduces a longer delay and a WARNING before shutdown or reboot in case the driver is built-into the kernel. Fix this by calling dsa_switch_shutdown() in the driver's shutdown operation, harmonizing it with what is done in the lantiq_gswip driver. As a side-effect this now allows to remove the previously exported gswip_disable_switch() function which no longer got any users. Fixes: 22335939ec907 ("net: dsa: add driver for MaxLinear GSW1xx switch family") Signed-off-by: Daniel Golle Link: https://patch.msgid.link/77ed91a5206e5dbf5d3e83d7e364ebfda90d31fd.1765241054.git.daniel@makrotopia.org Reviewed-by: Vladimir Oltean Signed-off-by: Paolo Abeni --- drivers/net/dsa/lantiq/lantiq_gswip.h | 2 -- drivers/net/dsa/lantiq/lantiq_gswip_common.c | 6 ------ drivers/net/dsa/lantiq/mxl-gsw1xx.c | 4 ++-- 3 files changed, 2 insertions(+), 10 deletions(-) diff --git a/drivers/net/dsa/lantiq/lantiq_gswip.h b/drivers/net/dsa/lantiq/lantiq_gswip.h index 9c38e51a75e80..2e0f2afbadbbc 100644 --- a/drivers/net/dsa/lantiq/lantiq_gswip.h +++ b/drivers/net/dsa/lantiq/lantiq_gswip.h @@ -294,8 +294,6 @@ struct gswip_priv { u16 version; }; -void gswip_disable_switch(struct gswip_priv *priv); - int gswip_probe_common(struct gswip_priv *priv, u32 version); #endif /* __LANTIQ_GSWIP_H */ diff --git a/drivers/net/dsa/lantiq/lantiq_gswip_common.c b/drivers/net/dsa/lantiq/lantiq_gswip_common.c index 6b171d58e1862..e790f2ef75884 100644 --- a/drivers/net/dsa/lantiq/lantiq_gswip_common.c +++ b/drivers/net/dsa/lantiq/lantiq_gswip_common.c @@ -1664,12 +1664,6 @@ static const struct dsa_switch_ops gswip_switch_ops = { .port_hsr_leave = dsa_port_simple_hsr_leave, }; -void gswip_disable_switch(struct gswip_priv *priv) -{ - regmap_clear_bits(priv->mdio, GSWIP_MDIO_GLOB, GSWIP_MDIO_GLOB_ENABLE); -} -EXPORT_SYMBOL_GPL(gswip_disable_switch); - static int gswip_validate_cpu_port(struct dsa_switch *ds) { struct gswip_priv *priv = ds->priv; diff --git a/drivers/net/dsa/lantiq/mxl-gsw1xx.c b/drivers/net/dsa/lantiq/mxl-gsw1xx.c index cda966d71e889..4dc287ad141e1 100644 --- a/drivers/net/dsa/lantiq/mxl-gsw1xx.c +++ b/drivers/net/dsa/lantiq/mxl-gsw1xx.c @@ -662,9 +662,9 @@ static void gsw1xx_shutdown(struct mdio_device *mdiodev) if (!priv) return; - dev_set_drvdata(&mdiodev->dev, NULL); + dsa_switch_shutdown(priv->ds); - gswip_disable_switch(priv); + dev_set_drvdata(&mdiodev->dev, NULL); } static const struct gswip_hw_info gsw12x_data = { From 7a3a6530d3e83e24e26aff6b8914461e81670dc1 Mon Sep 17 00:00:00 2001 From: Daniel Golle Date: Tue, 9 Dec 2025 01:29:34 +0000 Subject: [PATCH 0100/1321] net: dsa: mxl-gsw1xx: manually clear RANEG bit Despite being documented as self-clearing, the RANEG bit sometimes remains set, preventing auto-negotiation from happening. Manually clear the RANEG bit after 10ms as advised by MaxLinear. In order to not hold RTNL during the 10ms of waiting schedule delayed work to take care of clearing the bit asynchronously, which is similar to the self-clearing behavior. Fixes: 22335939ec90 ("net: dsa: add driver for MaxLinear GSW1xx switch family") Reported-by: Rasmus Villemoes Signed-off-by: Daniel Golle Link: https://patch.msgid.link/76745fceb5a3f53088110fb7a96acf88434088ca.1765241054.git.daniel@makrotopia.org Reviewed-by: Vladimir Oltean Signed-off-by: Paolo Abeni --- drivers/net/dsa/lantiq/mxl-gsw1xx.c | 34 ++++++++++++++++++++++++++++- 1 file changed, 33 insertions(+), 1 deletion(-) diff --git a/drivers/net/dsa/lantiq/mxl-gsw1xx.c b/drivers/net/dsa/lantiq/mxl-gsw1xx.c index 4dc287ad141e1..f8ff8a604bf53 100644 --- a/drivers/net/dsa/lantiq/mxl-gsw1xx.c +++ b/drivers/net/dsa/lantiq/mxl-gsw1xx.c @@ -11,10 +11,12 @@ #include #include +#include #include #include #include #include +#include #include #include "lantiq_gswip.h" @@ -29,6 +31,7 @@ struct gsw1xx_priv { struct regmap *clk; struct regmap *shell; struct phylink_pcs pcs; + struct delayed_work clear_raneg; phy_interface_t tbi_interface; struct gswip_priv gswip; }; @@ -145,7 +148,9 @@ static void gsw1xx_pcs_disable(struct phylink_pcs *pcs) { struct gsw1xx_priv *priv = pcs_to_gsw1xx(pcs); - /* Assert SGMII shell reset */ + cancel_delayed_work_sync(&priv->clear_raneg); + + /* Assert SGMII shell reset (will also clear RANEG bit) */ regmap_set_bits(priv->shell, GSW1XX_SHELL_RST_REQ, GSW1XX_RST_REQ_SGMII_SHELL); @@ -428,12 +433,29 @@ static int gsw1xx_pcs_config(struct phylink_pcs *pcs, unsigned int neg_mode, return 0; } +static void gsw1xx_pcs_clear_raneg(struct work_struct *work) +{ + struct gsw1xx_priv *priv = + container_of(work, struct gsw1xx_priv, clear_raneg.work); + + regmap_clear_bits(priv->sgmii, GSW1XX_SGMII_TBI_ANEGCTL, + GSW1XX_SGMII_TBI_ANEGCTL_RANEG); +} + static void gsw1xx_pcs_an_restart(struct phylink_pcs *pcs) { struct gsw1xx_priv *priv = pcs_to_gsw1xx(pcs); + cancel_delayed_work_sync(&priv->clear_raneg); + regmap_set_bits(priv->sgmii, GSW1XX_SGMII_TBI_ANEGCTL, GSW1XX_SGMII_TBI_ANEGCTL_RANEG); + + /* despite being documented as self-clearing, the RANEG bit + * sometimes remains set, preventing auto-negotiation from happening. + * MaxLinear advises to manually clear the bit after 10ms. + */ + schedule_delayed_work(&priv->clear_raneg, msecs_to_jiffies(10)); } static void gsw1xx_pcs_link_up(struct phylink_pcs *pcs, @@ -636,6 +658,8 @@ static int gsw1xx_probe(struct mdio_device *mdiodev) if (ret) return ret; + INIT_DELAYED_WORK(&priv->clear_raneg, gsw1xx_pcs_clear_raneg); + ret = gswip_probe_common(&priv->gswip, version); if (ret) return ret; @@ -648,16 +672,21 @@ static int gsw1xx_probe(struct mdio_device *mdiodev) static void gsw1xx_remove(struct mdio_device *mdiodev) { struct gswip_priv *priv = dev_get_drvdata(&mdiodev->dev); + struct gsw1xx_priv *gsw1xx_priv; if (!priv) return; dsa_unregister_switch(priv->ds); + + gsw1xx_priv = container_of(priv, struct gsw1xx_priv, gswip); + cancel_delayed_work_sync(&gsw1xx_priv->clear_raneg); } static void gsw1xx_shutdown(struct mdio_device *mdiodev) { struct gswip_priv *priv = dev_get_drvdata(&mdiodev->dev); + struct gsw1xx_priv *gsw1xx_priv; if (!priv) return; @@ -665,6 +694,9 @@ static void gsw1xx_shutdown(struct mdio_device *mdiodev) dsa_switch_shutdown(priv->ds); dev_set_drvdata(&mdiodev->dev, NULL); + + gsw1xx_priv = container_of(priv, struct gsw1xx_priv, gswip); + cancel_delayed_work_sync(&gsw1xx_priv->clear_raneg); } static const struct gswip_hw_info gsw12x_data = { From d48689e39250843c5a59b5e12e28d0af9be41643 Mon Sep 17 00:00:00 2001 From: Moshe Shemesh Date: Tue, 9 Dec 2025 14:56:09 +0200 Subject: [PATCH 0101/1321] net/mlx5: fw reset, clear reset requested on drain_fw_reset drain_fw_reset() waits for ongoing firmware reset events and blocks new event handling, but does not clear the reset requested flag, and may keep sync reset polling. To fix it, call mlx5_sync_reset_clear_reset_requested() to clear the flag, stop sync reset polling, and resume health polling, ensuring health issues are still detected after the firmware reset drain. Fixes: 16d42d313350 ("net/mlx5: Drain fw_reset when removing device") Signed-off-by: Moshe Shemesh Reviewed-by: Shay Drori Signed-off-by: Tariq Toukan Link: https://patch.msgid.link/1765284977-1363052-2-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni --- drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c index 2bceb42c98cc2..b81de792c181a 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c @@ -844,7 +844,8 @@ void mlx5_drain_fw_reset(struct mlx5_core_dev *dev) cancel_work_sync(&fw_reset->reset_reload_work); cancel_work_sync(&fw_reset->reset_now_work); cancel_work_sync(&fw_reset->reset_abort_work); - cancel_delayed_work(&fw_reset->reset_timeout_work); + if (test_bit(MLX5_FW_RESET_FLAGS_RESET_REQUESTED, &fw_reset->reset_flags)) + mlx5_sync_reset_clear_reset_requested(dev, true); } static const struct devlink_param mlx5_fw_reset_devlink_params[] = { From a54d8ce8260050f3673076964ec5016d4d2cb703 Mon Sep 17 00:00:00 2001 From: Moshe Shemesh Date: Tue, 9 Dec 2025 14:56:10 +0200 Subject: [PATCH 0102/1321] net/mlx5: Drain firmware reset in shutdown callback Invoke drain_fw_reset() in the shutdown callback to ensure all firmware reset handling is completed before shutdown proceeds. Fixes: 16d42d313350 ("net/mlx5: Drain fw_reset when removing device") Signed-off-by: Moshe Shemesh Reviewed-by: Shay Drori Signed-off-by: Tariq Toukan Link: https://patch.msgid.link/1765284977-1363052-3-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni --- drivers/net/ethernet/mellanox/mlx5/core/main.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c index 1ab569ce3fcfd..4209da722f9a6 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c @@ -2231,6 +2231,7 @@ static void shutdown(struct pci_dev *pdev) mlx5_core_info(dev, "Shutdown was called\n"); set_bit(MLX5_BREAK_FW_WAIT, &dev->intf_state); + mlx5_drain_fw_reset(dev); mlx5_drain_health_wq(dev); err = mlx5_try_fast_unload(dev); if (err) From 433ec0f30c33d047b28eeb5c4534b7893fd83195 Mon Sep 17 00:00:00 2001 From: Shay Drory Date: Tue, 9 Dec 2025 14:56:11 +0200 Subject: [PATCH 0103/1321] net/mlx5: fw_tracer, Validate format string parameters Add validation for format string parameters in the firmware tracer to prevent potential security vulnerabilities and crashes from malformed format strings received from firmware. The firmware tracer receives format strings from the device firmware and uses them to format trace messages. Without proper validation, bad firmware could provide format strings with invalid format specifiers (e.g., %s, %p, %n) that could lead to crashes, or other undefined behavior. Add mlx5_tracer_validate_params() to validate that all format specifiers in trace strings are limited to safe integer/hex formats (%x, %d, %i, %u, %llx, %lx, etc.). Reject strings containing other format types that could be used to access arbitrary memory or cause crashes. Invalid format strings are added to the trace output for visibility with "BAD_FORMAT: " prefix. Fixes: 70dd6fdb8987 ("net/mlx5: FW tracer, parse traces and kernel tracing support") Signed-off-by: Shay Drory Reviewed-by: Moshe Shemesh Reported-by: Breno Leitao Closes: https://lore.kernel.org/netdev/hanz6rzrb2bqbplryjrakvkbmv4y5jlmtthnvi3thg5slqvelp@t3s3erottr6s/ Signed-off-by: Tariq Toukan Link: https://patch.msgid.link/1765284977-1363052-4-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni --- .../mellanox/mlx5/core/diag/fw_tracer.c | 83 ++++++++++++++++--- .../mellanox/mlx5/core/diag/fw_tracer.h | 1 + 2 files changed, 74 insertions(+), 10 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c b/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c index 7bcf822a89f9f..b415dfe5de45f 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c @@ -33,6 +33,7 @@ #include "lib/eq.h" #include "fw_tracer.h" #include "fw_tracer_tracepoint.h" +#include static int mlx5_query_mtrc_caps(struct mlx5_fw_tracer *tracer) { @@ -358,6 +359,43 @@ static const char *VAL_PARM = "%llx"; static const char *REPLACE_64_VAL_PARM = "%x%x"; static const char *PARAM_CHAR = "%"; +static bool mlx5_is_valid_spec(const char *str) +{ + /* Parse format specifiers to find the actual type. + * Structure: %[flags][width][.precision][length]type + * Skip flags, width, precision & length. + */ + while (isdigit(*str) || *str == '#' || *str == '.' || *str == 'l') + str++; + + /* Check if it's a valid integer/hex specifier: + * Valid formats: %x, %d, %i, %u, etc. + */ + if (*str != 'x' && *str != 'X' && *str != 'd' && *str != 'i' && + *str != 'u' && *str != 'c') + return false; + + return true; +} + +static bool mlx5_tracer_validate_params(const char *str) +{ + const char *substr = str; + + if (!str) + return false; + + substr = strstr(substr, PARAM_CHAR); + while (substr) { + if (!mlx5_is_valid_spec(substr + 1)) + return false; + + substr = strstr(substr + 1, PARAM_CHAR); + } + + return true; +} + static int mlx5_tracer_message_hash(u32 message_id) { return jhash_1word(message_id, 0) & (MESSAGE_HASH_SIZE - 1); @@ -419,6 +457,10 @@ static int mlx5_tracer_get_num_of_params(char *str) char *substr, *pstr = str; int num_of_params = 0; + /* Validate that all parameters are valid before processing */ + if (!mlx5_tracer_validate_params(str)) + return -EINVAL; + /* replace %llx with %x%x */ substr = strstr(pstr, VAL_PARM); while (substr) { @@ -570,14 +612,17 @@ void mlx5_tracer_print_trace(struct tracer_string_format *str_frmt, { char tmp[512]; - snprintf(tmp, sizeof(tmp), str_frmt->string, - str_frmt->params[0], - str_frmt->params[1], - str_frmt->params[2], - str_frmt->params[3], - str_frmt->params[4], - str_frmt->params[5], - str_frmt->params[6]); + if (str_frmt->invalid_string) + snprintf(tmp, sizeof(tmp), "BAD_FORMAT: %s", str_frmt->string); + else + snprintf(tmp, sizeof(tmp), str_frmt->string, + str_frmt->params[0], + str_frmt->params[1], + str_frmt->params[2], + str_frmt->params[3], + str_frmt->params[4], + str_frmt->params[5], + str_frmt->params[6]); trace_mlx5_fw(dev->tracer, trace_timestamp, str_frmt->lost, str_frmt->event_id, tmp); @@ -609,6 +654,13 @@ static int mlx5_tracer_handle_raw_string(struct mlx5_fw_tracer *tracer, return 0; } +static void mlx5_tracer_handle_bad_format_string(struct mlx5_fw_tracer *tracer, + struct tracer_string_format *cur_string) +{ + cur_string->invalid_string = true; + list_add_tail(&cur_string->list, &tracer->ready_strings_list); +} + static int mlx5_tracer_handle_string_trace(struct mlx5_fw_tracer *tracer, struct tracer_event *tracer_event) { @@ -619,12 +671,18 @@ static int mlx5_tracer_handle_string_trace(struct mlx5_fw_tracer *tracer, if (!cur_string) return mlx5_tracer_handle_raw_string(tracer, tracer_event); - cur_string->num_of_params = mlx5_tracer_get_num_of_params(cur_string->string); - cur_string->last_param_num = 0; cur_string->event_id = tracer_event->event_id; cur_string->tmsn = tracer_event->string_event.tmsn; cur_string->timestamp = tracer_event->string_event.timestamp; cur_string->lost = tracer_event->lost_event; + cur_string->last_param_num = 0; + cur_string->num_of_params = mlx5_tracer_get_num_of_params(cur_string->string); + if (cur_string->num_of_params < 0) { + pr_debug("%s Invalid format string parameters\n", + __func__); + mlx5_tracer_handle_bad_format_string(tracer, cur_string); + return 0; + } if (cur_string->num_of_params == 0) /* trace with no params */ list_add_tail(&cur_string->list, &tracer->ready_strings_list); } else { @@ -634,6 +692,11 @@ static int mlx5_tracer_handle_string_trace(struct mlx5_fw_tracer *tracer, __func__, tracer_event->string_event.tmsn); return mlx5_tracer_handle_raw_string(tracer, tracer_event); } + if (cur_string->num_of_params < 0) { + pr_debug("%s string parameter of invalid string, dumping\n", + __func__); + return 0; + } cur_string->last_param_num += 1; if (cur_string->last_param_num > TRACER_MAX_PARAMS) { pr_debug("%s Number of params exceeds the max (%d)\n", diff --git a/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.h b/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.h index 5c548bb74f07b..30d0bcba88479 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.h @@ -125,6 +125,7 @@ struct tracer_string_format { struct list_head list; u32 timestamp; bool lost; + bool invalid_string; }; enum mlx5_fw_tracer_ownership_state { From ed04ac2e262e21a7f7cf86a09144a89f1681fb2f Mon Sep 17 00:00:00 2001 From: Shay Drory Date: Tue, 9 Dec 2025 14:56:12 +0200 Subject: [PATCH 0104/1321] net/mlx5: fw_tracer, Handle escaped percent properly The firmware tracer's format string validation and parameter counting did not properly handle escaped percent signs (%%). This caused fw_tracer to count more parameters when trace format strings contained literal percent characters. To fix it, allow %% to pass string validation and skip %% sequences when counting parameters since they represent literal percent signs rather than format specifiers. Fixes: 70dd6fdb8987 ("net/mlx5: FW tracer, parse traces and kernel tracing support") Signed-off-by: Shay Drory Reported-by: Breno Leitao Reviewed-by: Moshe Shemesh Closes: https://lore.kernel.org/netdev/hanz6rzrb2bqbplryjrakvkbmv4y5jlmtthnvi3thg5slqvelp@t3s3erottr6s/ Signed-off-by: Tariq Toukan Link: https://patch.msgid.link/1765284977-1363052-5-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni --- .../mellanox/mlx5/core/diag/fw_tracer.c | 20 +++++++++++++------ 1 file changed, 14 insertions(+), 6 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c b/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c index b415dfe5de45f..6b4ec457ce227 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c @@ -368,11 +368,11 @@ static bool mlx5_is_valid_spec(const char *str) while (isdigit(*str) || *str == '#' || *str == '.' || *str == 'l') str++; - /* Check if it's a valid integer/hex specifier: + /* Check if it's a valid integer/hex specifier or %%: * Valid formats: %x, %d, %i, %u, etc. */ if (*str != 'x' && *str != 'X' && *str != 'd' && *str != 'i' && - *str != 'u' && *str != 'c') + *str != 'u' && *str != 'c' && *str != '%') return false; return true; @@ -390,7 +390,11 @@ static bool mlx5_tracer_validate_params(const char *str) if (!mlx5_is_valid_spec(substr + 1)) return false; - substr = strstr(substr + 1, PARAM_CHAR); + if (*(substr + 1) == '%') + substr = strstr(substr + 2, PARAM_CHAR); + else + substr = strstr(substr + 1, PARAM_CHAR); + } return true; @@ -469,11 +473,15 @@ static int mlx5_tracer_get_num_of_params(char *str) substr = strstr(pstr, VAL_PARM); } - /* count all the % characters */ + /* count all the % characters, but skip %% (escaped percent) */ substr = strstr(str, PARAM_CHAR); while (substr) { - num_of_params += 1; - str = substr + 1; + if (*(substr + 1) != '%') { + num_of_params += 1; + str = substr + 1; + } else { + str = substr + 2; + } substr = strstr(str, PARAM_CHAR); } From 144a25c2053a59f9d3ff6a03bba29e8765e2dd7d Mon Sep 17 00:00:00 2001 From: Shay Drory Date: Tue, 9 Dec 2025 14:56:13 +0200 Subject: [PATCH 0105/1321] net/mlx5: Serialize firmware reset with devlink The firmware reset mechanism can be triggered by asynchronous events, which may race with other devlink operations like devlink reload or devlink dev eswitch set, potentially leading to inconsistent states. This patch addresses the race by using the devl_lock to serialize the firmware reset against other devlink operations. When a reset is requested, the driver attempts to acquire the lock. If successful, it sets a flag to block devlink reload or eswitch changes, ACKs the reset to firmware and then releases the lock. If the lock is already held by another operation, the driver NACKs the firmware reset request, indicating that the reset cannot proceed. Firmware reset does not keep the devl_lock and instead uses an internal firmware reset bit. This is because firmware resets can be triggered by asynchronous events, and processed in different threads. It is illegal and unsafe to acquire a lock in one thread and attempt to release it in another, as lock ownership is intrinsically thread-specific. This change ensures that firmware resets and other devlink operations are mutually exclusive during the critical reset request phase, preventing race conditions. Fixes: 38b9f903f22b ("net/mlx5: Handle sync reset request event") Signed-off-by: Shay Drory Reviewed-by: Mateusz Berezecki Reviewed-by: Moshe Shemesh Signed-off-by: Tariq Toukan Link: https://patch.msgid.link/1765284977-1363052-6-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni --- .../net/ethernet/mellanox/mlx5/core/devlink.c | 5 +++ .../mellanox/mlx5/core/eswitch_offloads.c | 6 +++ .../ethernet/mellanox/mlx5/core/fw_reset.c | 45 +++++++++++++++++-- .../ethernet/mellanox/mlx5/core/fw_reset.h | 1 + 4 files changed, 53 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c index 887adf4807d16..ea77fbd98396a 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c @@ -197,6 +197,11 @@ static int mlx5_devlink_reload_down(struct devlink *devlink, bool netns_change, struct pci_dev *pdev = dev->pdev; int ret = 0; + if (mlx5_fw_reset_in_progress(dev)) { + NL_SET_ERR_MSG_MOD(extack, "Can't reload during firmware reset"); + return -EBUSY; + } + if (mlx5_dev_is_lightweight(dev)) { if (action != DEVLINK_RELOAD_ACTION_DRIVER_REINIT) return -EOPNOTSUPP; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c index 8de6c7f6c2944..ea94a727633f1 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c @@ -52,6 +52,7 @@ #include "devlink.h" #include "lag/lag.h" #include "en/tc/post_meter.h" +#include "fw_reset.h" /* There are two match-all miss flows, one for unicast dst mac and * one for multicast. @@ -3991,6 +3992,11 @@ int mlx5_devlink_eswitch_mode_set(struct devlink *devlink, u16 mode, if (IS_ERR(esw)) return PTR_ERR(esw); + if (mlx5_fw_reset_in_progress(esw->dev)) { + NL_SET_ERR_MSG_MOD(extack, "Can't change eswitch mode during firmware reset"); + return -EBUSY; + } + if (esw_mode_from_devlink(mode, &mlx5_mode)) return -EINVAL; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c index b81de792c181a..ae10665c53f32 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c @@ -15,6 +15,7 @@ enum { MLX5_FW_RESET_FLAGS_DROP_NEW_REQUESTS, MLX5_FW_RESET_FLAGS_RELOAD_REQUIRED, MLX5_FW_RESET_FLAGS_UNLOAD_EVENT, + MLX5_FW_RESET_FLAGS_RESET_IN_PROGRESS, }; struct mlx5_fw_reset { @@ -128,6 +129,16 @@ int mlx5_fw_reset_query(struct mlx5_core_dev *dev, u8 *reset_level, u8 *reset_ty return mlx5_reg_mfrl_query(dev, reset_level, reset_type, NULL, NULL); } +bool mlx5_fw_reset_in_progress(struct mlx5_core_dev *dev) +{ + struct mlx5_fw_reset *fw_reset = dev->priv.fw_reset; + + if (!fw_reset) + return false; + + return test_bit(MLX5_FW_RESET_FLAGS_RESET_IN_PROGRESS, &fw_reset->reset_flags); +} + static int mlx5_fw_reset_get_reset_method(struct mlx5_core_dev *dev, u8 *reset_method) { @@ -243,6 +254,8 @@ static void mlx5_fw_reset_complete_reload(struct mlx5_core_dev *dev) BIT(DEVLINK_RELOAD_ACTION_FW_ACTIVATE)); devl_unlock(devlink); } + + clear_bit(MLX5_FW_RESET_FLAGS_RESET_IN_PROGRESS, &fw_reset->reset_flags); } static void mlx5_stop_sync_reset_poll(struct mlx5_core_dev *dev) @@ -462,27 +475,48 @@ static void mlx5_sync_reset_request_event(struct work_struct *work) struct mlx5_fw_reset *fw_reset = container_of(work, struct mlx5_fw_reset, reset_request_work); struct mlx5_core_dev *dev = fw_reset->dev; + bool nack_request = false; + struct devlink *devlink; int err; err = mlx5_fw_reset_get_reset_method(dev, &fw_reset->reset_method); - if (err) + if (err) { + nack_request = true; mlx5_core_warn(dev, "Failed reading MFRL, err %d\n", err); + } else if (!mlx5_is_reset_now_capable(dev, fw_reset->reset_method) || + test_bit(MLX5_FW_RESET_FLAGS_NACK_RESET_REQUEST, + &fw_reset->reset_flags)) { + nack_request = true; + } - if (err || test_bit(MLX5_FW_RESET_FLAGS_NACK_RESET_REQUEST, &fw_reset->reset_flags) || - !mlx5_is_reset_now_capable(dev, fw_reset->reset_method)) { + devlink = priv_to_devlink(dev); + /* For external resets, try to acquire devl_lock. Skip if devlink reset is + * pending (lock already held) + */ + if (nack_request || + (!test_bit(MLX5_FW_RESET_FLAGS_PENDING_COMP, + &fw_reset->reset_flags) && + !devl_trylock(devlink))) { err = mlx5_fw_reset_set_reset_sync_nack(dev); mlx5_core_warn(dev, "PCI Sync FW Update Reset Nack %s", err ? "Failed" : "Sent"); return; } + if (mlx5_sync_reset_set_reset_requested(dev)) - return; + goto unlock; + + set_bit(MLX5_FW_RESET_FLAGS_RESET_IN_PROGRESS, &fw_reset->reset_flags); err = mlx5_fw_reset_set_reset_sync_ack(dev); if (err) mlx5_core_warn(dev, "PCI Sync FW Update Reset Ack Failed. Error code: %d\n", err); else mlx5_core_warn(dev, "PCI Sync FW Update Reset Ack. Device reset is expected.\n"); + +unlock: + if (!test_bit(MLX5_FW_RESET_FLAGS_PENDING_COMP, &fw_reset->reset_flags)) + devl_unlock(devlink); } static int mlx5_pci_link_toggle(struct mlx5_core_dev *dev, u16 dev_id) @@ -722,6 +756,8 @@ static void mlx5_sync_reset_abort_event(struct work_struct *work) if (mlx5_sync_reset_clear_reset_requested(dev, true)) return; + + clear_bit(MLX5_FW_RESET_FLAGS_RESET_IN_PROGRESS, &fw_reset->reset_flags); mlx5_core_warn(dev, "PCI Sync FW Update Reset Aborted.\n"); } @@ -758,6 +794,7 @@ static void mlx5_sync_reset_timeout_work(struct work_struct *work) if (mlx5_sync_reset_clear_reset_requested(dev, true)) return; + clear_bit(MLX5_FW_RESET_FLAGS_RESET_IN_PROGRESS, &fw_reset->reset_flags); mlx5_core_warn(dev, "PCI Sync FW Update Reset Timeout.\n"); } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h index d5b28525c960d..2d96b2adc1cdf 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h @@ -10,6 +10,7 @@ int mlx5_fw_reset_query(struct mlx5_core_dev *dev, u8 *reset_level, u8 *reset_ty int mlx5_fw_reset_set_reset_sync(struct mlx5_core_dev *dev, u8 reset_type_sel, struct netlink_ext_ack *extack); int mlx5_fw_reset_set_live_patch(struct mlx5_core_dev *dev); +bool mlx5_fw_reset_in_progress(struct mlx5_core_dev *dev); int mlx5_fw_reset_wait_reset_done(struct mlx5_core_dev *dev); void mlx5_sync_reset_unload_flow(struct mlx5_core_dev *dev, bool locked); From 430b977d567544418dec0f918f5572483668807e Mon Sep 17 00:00:00 2001 From: Jianbo Liu Date: Tue, 9 Dec 2025 14:56:14 +0200 Subject: [PATCH 0106/1321] net/mlx5e: Use ip6_dst_lookup instead of ipv6_dst_lookup_flow for MAC init Replace ipv6_stub->ipv6_dst_lookup_flow() with ip6_dst_lookup() in mlx5e_ipsec_init_macs() since IPsec transformations are not needed during Security Association setup - only basic routing information is required for nexthop MAC address resolution. This resolves an issue where XfrmOutNoStates error counter would be incremented when xfrm policy is configured before xfrm state, as the IPsec-aware routing function would attempt policy checks during SA initialization. Fixes: 71670f766b8f ("net/mlx5e: Support routed networks during IPsec MACs initialization") Signed-off-by: Jianbo Liu Reviewed-by: Leon Romanovsky Signed-off-by: Tariq Toukan Link: https://patch.msgid.link/1765284977-1363052-7-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni --- drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c index 35d9530037a65..6c79b9cea2efb 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c @@ -342,9 +342,8 @@ static void mlx5e_ipsec_init_macs(struct mlx5e_ipsec_sa_entry *sa_entry, rt_dst_entry = &rt->dst; break; case AF_INET6: - rt_dst_entry = ipv6_stub->ipv6_dst_lookup_flow( - dev_net(netdev), NULL, &fl6, NULL); - if (IS_ERR(rt_dst_entry)) + if (!IS_ENABLED(CONFIG_IPV6) || + ip6_dst_lookup(dev_net(netdev), NULL, &rt_dst_entry, &fl6)) goto neigh; break; default: From 1c26c8ba11199c7a4ee1fad8b2e0a0076e351012 Mon Sep 17 00:00:00 2001 From: Jianbo Liu Date: Tue, 9 Dec 2025 14:56:15 +0200 Subject: [PATCH 0107/1321] net/mlx5e: Trigger neighbor resolution for unresolved destinations When initializing the MAC addresses for an outbound IPsec packet offload rule in mlx5e_ipsec_init_macs, the call to dst_neigh_lookup is used to find the next-hop neighbor (typically the gateway in tunnel mode). This call might create a new neighbor entry if one doesn't already exist. This newly created entry starts in the INCOMPLETE state, as the kernel hasn't yet sent an ARP or NDISC probe to resolve the MAC address. In this case, neigh_ha_snapshot will correctly return an all-zero MAC address. IPsec packet offload requires the actual next-hop MAC address to program the rule correctly. If the neighbor state is INCOMPLETE when the rule is created, the hardware rule is programmed with an all-zero destination MAC address. Packets sent using this rule will be subsequently dropped by the receiving network infrastructure or host. This patch adds a check specifically for the outbound offload path. If neigh_ha_snapshot returns an all-zero MAC address, it proactively calls neigh_event_send(n, NULL). This ensures the kernel immediately sends the initial ARP or NDISC probe if one isn't already pending, accelerating the resolution process. This helps prevent the hardware rule from being programmed with an invalid MAC address and avoids packet drops due to unresolved neighbors. Fixes: 71670f766b8f ("net/mlx5e: Support routed networks during IPsec MACs initialization") Signed-off-by: Jianbo Liu Reviewed-by: Leon Romanovsky Signed-off-by: Tariq Toukan Link: https://patch.msgid.link/1765284977-1363052-8-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni --- drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c index 6c79b9cea2efb..a8fb4bec369cf 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c @@ -358,6 +358,9 @@ static void mlx5e_ipsec_init_macs(struct mlx5e_ipsec_sa_entry *sa_entry, neigh_ha_snapshot(addr, n, netdev); ether_addr_copy(dst, addr); + if (attrs->dir == XFRM_DEV_OFFLOAD_OUT && + is_zero_ether_addr(addr)) + neigh_event_send(n, NULL); dst_release(rt_dst_entry); neigh_release(n); return; From 3acddab27b6a98732e034ba39915d248e8928891 Mon Sep 17 00:00:00 2001 From: Tariq Toukan Date: Tue, 9 Dec 2025 14:56:16 +0200 Subject: [PATCH 0108/1321] net/mlx5e: Do not update BQL of old txqs during channel reconfiguration During channel reconfiguration (e.g., ethtool private flags changes), the driver can trigger a kernel BUG_ON in dql_completed() with the error "kernel BUG at lib/dynamic_queue_limits.c:99". The issue occurs in the following sequence: During mlx5e_safe_switch_params(), old channels are deactivated via mlx5e_deactivate_txqsq(). New channels are created and activated, taking ownership of the netdev_queues and their BQL state. When old channels are closed via mlx5e_close_txqsq(), there may be pending TX descriptors (sq->cc != sq->pc) that were in-flight during the deactivation. mlx5e_free_txqsq_descs() frees these pending descriptors and attempts to complete them via netdev_tx_completed_queue(). However, the BQL state (dql->num_queued and dql->num_completed) have been reset in mlx5e_activate_txqsq and belong to the new queue owner, leading to dql->num_queued - dql->num_completed < nbytes. This triggers BUG_ON(count > num_queued - num_completed) in dql_completed(). Fixes: 3b88a535a8e1 ("net/mlx5e: Defer channels closure to reduce interface down time") Signed-off-by: Tariq Toukan Signed-off-by: William Tu Reviewed-by: Dragos Tatulea Link: https://patch.msgid.link/1765284977-1363052-9-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni --- drivers/net/ethernet/mellanox/mlx5/core/en_tx.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c index 14884b9ea7f39..a01ee656a1e7f 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c @@ -939,7 +939,11 @@ void mlx5e_free_txqsq_descs(struct mlx5e_txqsq *sq) sq->dma_fifo_cc = dma_fifo_cc; sq->cc = sqcc; - netdev_tx_completed_queue(sq->txq, npkts, nbytes); + /* Do not update BQL for TXQs that got replaced by new active ones, as + * netdev_tx_reset_queue() is called for them in mlx5e_activate_txqsq(). + */ + if (sq == sq->priv->txq2sq[sq->txq_ix]) + netdev_tx_completed_queue(sq->txq, npkts, nbytes); } #ifdef CONFIG_MLX5_CORE_IPOIB From 5382814355276111afacfdc17a2a864c2306a097 Mon Sep 17 00:00:00 2001 From: Cosmin Ratiu Date: Tue, 9 Dec 2025 14:56:17 +0200 Subject: [PATCH 0109/1321] net/mlx5e: Don't include PSP in the hard MTU calculations Commit [1] added the 40 bytes required by the PSP header+trailer and the UDP header to MLX5E_ETH_HARD_MTU, which limits the device-wide max software MTU that could be set. This is not okay, because most packets are not PSP packets and it doesn't make sense to always reserve space for headers which won't get added in most cases. As it turns out, for TCP connections, PSP overhead is already taken into account in the TCP MSS calculations via inet_csk(sk)->icsk_ext_hdr_len. This was added in commit [2]. This means that the extra space reserved in the hard MTU for mlx5 ends up unused and wasted. Remove the unnecessary 40 byte reservation from hard MTU. [1] commit e5a1861a298e ("net/mlx5e: Implement PSP Tx data path") [2] commit e97269257fe4 ("net: psp: update the TCP MSS to reflect PSP packet overhead") Fixes: e5a1861a298e ("net/mlx5e: Implement PSP Tx data path") Signed-off-by: Cosmin Ratiu Reviewed-by: Shahar Shitrit Signed-off-by: Tariq Toukan Link: https://patch.msgid.link/1765284977-1363052-10-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni --- drivers/net/ethernet/mellanox/mlx5/core/en.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h index 811178d8976cf..262dc032e276a 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h @@ -69,7 +69,7 @@ struct page_pool; #define MLX5E_METADATA_ETHER_TYPE (0x8CE4) #define MLX5E_METADATA_ETHER_LEN 8 -#define MLX5E_ETH_HARD_MTU (ETH_HLEN + PSP_ENCAP_HLEN + PSP_TRL_SIZE + VLAN_HLEN + ETH_FCS_LEN) +#define MLX5E_ETH_HARD_MTU (ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN) #define MLX5E_HW2SW_MTU(params, hwmtu) ((hwmtu) - ((params)->hard_mtu)) #define MLX5E_SW2HW_MTU(params, swmtu) ((swmtu) + ((params)->hard_mtu)) From 760b811d7ce0e0eb5240dd45984f125eca188b46 Mon Sep 17 00:00:00 2001 From: Jozsef Kadlecsik Date: Tue, 2 Dec 2025 15:20:15 +0100 Subject: [PATCH 0110/1321] MAINTAINERS: Remove Jozsef Kadlecsik from MAINTAINERS file I'm retiring from maintaining netfilter. I'll still keep an eye on ipset and respond to anything related to it. Thank you! Signed-off-by: Jozsef Kadlecsik Signed-off-by: Florian Westphal --- CREDITS | 1 + MAINTAINERS | 1 - 2 files changed, 1 insertion(+), 1 deletion(-) diff --git a/CREDITS b/CREDITS index b735b8c58102a..ca75f110edb6f 100644 --- a/CREDITS +++ b/CREDITS @@ -1987,6 +1987,7 @@ D: netfilter: TCP window tracking code D: netfilter: raw table D: netfilter: iprange match D: netfilter: new logging interfaces +D: netfilter: ipset D: netfilter: various other hacks S: Tata S: Hungary diff --git a/MAINTAINERS b/MAINTAINERS index 5b11839cba9de..454b8ed119e9e 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -18028,7 +18028,6 @@ F: drivers/net/ethernet/neterion/ NETFILTER M: Pablo Neira Ayuso -M: Jozsef Kadlecsik M: Florian Westphal R: Phil Sutter L: netfilter-devel@vger.kernel.org From af37f264eb67fa5e65284b79eb9d2d837bafa2a0 Mon Sep 17 00:00:00 2001 From: Florian Westphal Date: Mon, 8 Dec 2025 16:00:34 +0100 Subject: [PATCH 0111/1321] netfilter: nf_nat: remove bogus direction check Jakub reports spurious failures of the 'conntrack_reverse_clash.sh' selftest. A bogus test makes nat core resort to port rewrite even though there is no need for this. When the test is made, nf_nat_used_tuple() would already have caused us to return if no other CPU had added a colliding entry. Moreover, nf_nat_used_tuple() would have ignored the colliding entry if their origin tuples had been the same. All that is left to check is if the colliding entry in the hash table is subject to NAT, and, if its not, if our entry matches in the reverse direction, e.g. hash table has addr1:1234 -> addr2:80, and we want to commit addr2:80 -> addr1:1234. Because we already checked that neither the new nor the committed entry is subject to NAT we only have to check origin vs. reply tuple: for non-nat entries, the reply tuple is always the inverted original. Just in case there are more problems extend the error reporting in the selftest while at it and dump conntrack table/stats on error. Reported-by: Jakub Kicinski Closes: https://lore.kernel.org/netdev/20251206175135.4a56591b@kernel.org/ Fixes: d8f84a9bc7c4 ("netfilter: nf_nat: don't try nat source port reallocation for reverse dir clash") Signed-off-by: Florian Westphal --- net/netfilter/nf_nat_core.c | 14 +------------- .../net/netfilter/conntrack_reverse_clash.c | 13 +++++++++---- .../net/netfilter/conntrack_reverse_clash.sh | 2 ++ 3 files changed, 12 insertions(+), 17 deletions(-) diff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c index 78a61dac4ade8..e6b24586d2fed 100644 --- a/net/netfilter/nf_nat_core.c +++ b/net/netfilter/nf_nat_core.c @@ -294,25 +294,13 @@ nf_nat_used_tuple_new(const struct nf_conntrack_tuple *tuple, ct = nf_ct_tuplehash_to_ctrack(thash); - /* NB: IP_CT_DIR_ORIGINAL should be impossible because - * nf_nat_used_tuple() handles origin collisions. - * - * Handle remote chance other CPU confirmed its ct right after. - */ - if (thash->tuple.dst.dir != IP_CT_DIR_REPLY) - goto out; - /* clashing connection subject to NAT? Retry with new tuple. */ if (READ_ONCE(ct->status) & uses_nat) goto out; if (nf_ct_tuple_equal(&ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple, - &ignored_ct->tuplehash[IP_CT_DIR_REPLY].tuple) && - nf_ct_tuple_equal(&ct->tuplehash[IP_CT_DIR_REPLY].tuple, - &ignored_ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple)) { + &ignored_ct->tuplehash[IP_CT_DIR_REPLY].tuple)) taken = false; - goto out; - } out: nf_ct_put(ct); return taken; diff --git a/tools/testing/selftests/net/netfilter/conntrack_reverse_clash.c b/tools/testing/selftests/net/netfilter/conntrack_reverse_clash.c index 507930cee8cb6..462d628cc3bdb 100644 --- a/tools/testing/selftests/net/netfilter/conntrack_reverse_clash.c +++ b/tools/testing/selftests/net/netfilter/conntrack_reverse_clash.c @@ -33,9 +33,14 @@ static void die(const char *e) exit(111); } -static void die_port(uint16_t got, uint16_t want) +static void die_port(const struct sockaddr_in *sin, uint16_t want) { - fprintf(stderr, "Port number changed, wanted %d got %d\n", want, ntohs(got)); + uint16_t got = ntohs(sin->sin_port); + char str[INET_ADDRSTRLEN]; + + inet_ntop(AF_INET, &sin->sin_addr, str, sizeof(str)); + + fprintf(stderr, "Port number changed, wanted %d got %d from %s\n", want, got, str); exit(1); } @@ -100,7 +105,7 @@ int main(int argc, char *argv[]) die("child recvfrom"); if (peer.sin_port != htons(PORT)) - die_port(peer.sin_port, PORT); + die_port(&peer, PORT); } else { if (sendto(s2, buf, LEN, 0, (struct sockaddr *)&sa1, sizeof(sa1)) != LEN) continue; @@ -109,7 +114,7 @@ int main(int argc, char *argv[]) die("parent recvfrom"); if (peer.sin_port != htons((PORT + 1))) - die_port(peer.sin_port, PORT + 1); + die_port(&peer, PORT + 1); } } diff --git a/tools/testing/selftests/net/netfilter/conntrack_reverse_clash.sh b/tools/testing/selftests/net/netfilter/conntrack_reverse_clash.sh index a24c896347a88..dc7e9d6da0624 100755 --- a/tools/testing/selftests/net/netfilter/conntrack_reverse_clash.sh +++ b/tools/testing/selftests/net/netfilter/conntrack_reverse_clash.sh @@ -45,6 +45,8 @@ if ip netns exec "$ns0" ./conntrack_reverse_clash; then echo "PASS: No SNAT performed for null bindings" else echo "ERROR: SNAT performed without any matching snat rule" + ip netns exec "$ns0" conntrack -L + ip netns exec "$ns0" conntrack -S exit 1 fi From eb1fbd11c64f1d68f205e46e72a41e8d71963ea9 Mon Sep 17 00:00:00 2001 From: Pablo Neira Ayuso Date: Wed, 19 Nov 2025 13:42:05 +0100 Subject: [PATCH 0112/1321] netfilter: nf_tables: remove redundant chain validation on register store This validation predates the introduction of the state machine that determines when to enter slow path validation for error reporting. Currently, table validation is perform when: - new rule contains expressions that need validation. - new set element with jump/goto verdict. Validation on register store skips most checks with no basechains, still this walks the graph searching for loops and ensuring expressions are called from the right hook. Remove this. Fixes: a654de8fdc18 ("netfilter: nf_tables: fix chain dependency validation") Signed-off-by: Pablo Neira Ayuso Signed-off-by: Florian Westphal --- net/netfilter/nf_tables_api.c | 11 ----------- 1 file changed, 11 deletions(-) diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index f3de2f9bbebf1..c46b1bb0efe0f 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -11676,21 +11676,10 @@ static int nft_validate_register_store(const struct nft_ctx *ctx, enum nft_data_types type, unsigned int len) { - int err; - switch (reg) { case NFT_REG_VERDICT: if (type != NFT_DATA_VERDICT) return -EINVAL; - - if (data != NULL && - (data->verdict.code == NFT_GOTO || - data->verdict.code == NFT_JUMP)) { - err = nft_chain_validate(ctx, data->verdict.chain); - if (err < 0) - return err; - } - break; default: if (type != NFT_DATA_VALUE) From 267712f3847b6a698958166001009875790a4a83 Mon Sep 17 00:00:00 2001 From: Florian Westphal Date: Sun, 7 Jul 2024 01:18:25 +0200 Subject: [PATCH 0113/1321] netfilter: nf_tables: avoid chain re-validation if possible Hamza Mahfooz reports cpu soft lock-ups in nft_chain_validate(): watchdog: BUG: soft lockup - CPU#1 stuck for 27s! [iptables-nft-re:37547] [..] RIP: 0010:nft_chain_validate+0xcb/0x110 [nf_tables] [..] nft_immediate_validate+0x36/0x50 [nf_tables] nft_chain_validate+0xc9/0x110 [nf_tables] nft_immediate_validate+0x36/0x50 [nf_tables] nft_chain_validate+0xc9/0x110 [nf_tables] nft_immediate_validate+0x36/0x50 [nf_tables] nft_chain_validate+0xc9/0x110 [nf_tables] nft_immediate_validate+0x36/0x50 [nf_tables] nft_chain_validate+0xc9/0x110 [nf_tables] nft_immediate_validate+0x36/0x50 [nf_tables] nft_chain_validate+0xc9/0x110 [nf_tables] nft_immediate_validate+0x36/0x50 [nf_tables] nft_chain_validate+0xc9/0x110 [nf_tables] nft_table_validate+0x6b/0xb0 [nf_tables] nf_tables_validate+0x8b/0xa0 [nf_tables] nf_tables_commit+0x1df/0x1eb0 [nf_tables] [..] Currently nf_tables will traverse the entire table (chain graph), starting from the entry points (base chains), exploring all possible paths (chain jumps). But there are cases where we could avoid revalidation. Consider: 1 input -> j2 -> j3 2 input -> j2 -> j3 3 input -> j1 -> j2 -> j3 Then the second rule does not need to revalidate j2, and, by extension j3, because this was already checked during validation of the first rule. We need to validate it only for rule 3. This is needed because chain loop detection also ensures we do not exceed the jump stack: Just because we know that j2 is cycle free, its last jump might now exceed the allowed stack size. We also need to update all reachable chains with the new largest observed call depth. Care has to be taken to revalidate even if the chain depth won't be an issue: chain validation also ensures that expressions are not called from invalid base chains. For example, the masquerade expression can only be called from NAT postrouting base chains. Therefore we also need to keep record of the base chain context (type, hooknum) and revalidate if the chain becomes reachable from a different hook location. Reported-by: Hamza Mahfooz Closes: https://lore.kernel.org/netfilter-devel/20251118221735.GA5477@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net/ Tested-by: Hamza Mahfooz Signed-off-by: Florian Westphal --- include/net/netfilter/nf_tables.h | 34 +++++++++++---- net/netfilter/nf_tables_api.c | 69 +++++++++++++++++++++++++++++-- 2 files changed, 91 insertions(+), 12 deletions(-) diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h index fab7dc73f738c..0e266c2d0e7f0 100644 --- a/include/net/netfilter/nf_tables.h +++ b/include/net/netfilter/nf_tables.h @@ -1091,6 +1091,29 @@ struct nft_rule_blob { __attribute__((aligned(__alignof__(struct nft_rule_dp)))); }; +enum nft_chain_types { + NFT_CHAIN_T_DEFAULT = 0, + NFT_CHAIN_T_ROUTE, + NFT_CHAIN_T_NAT, + NFT_CHAIN_T_MAX +}; + +/** + * struct nft_chain_validate_state - validation state + * + * If a chain is encountered again during table validation it is + * possible to avoid revalidation provided the calling context is + * compatible. This structure stores relevant calling context of + * previous validations. + * + * @hook_mask: the hook numbers and locations the chain is linked to + * @depth: the deepest call chain level the chain is linked to + */ +struct nft_chain_validate_state { + u8 hook_mask[NFT_CHAIN_T_MAX]; + u8 depth; +}; + /** * struct nft_chain - nf_tables chain * @@ -1109,6 +1132,7 @@ struct nft_rule_blob { * @udlen: user data length * @udata: user data in the chain * @blob_next: rule blob pointer to the next in the chain + * @vstate: validation state */ struct nft_chain { struct nft_rule_blob __rcu *blob_gen_0; @@ -1128,9 +1152,10 @@ struct nft_chain { /* Only used during control plane commit phase: */ struct nft_rule_blob *blob_next; + struct nft_chain_validate_state vstate; }; -int nft_chain_validate(const struct nft_ctx *ctx, const struct nft_chain *chain); +int nft_chain_validate(const struct nft_ctx *ctx, struct nft_chain *chain); int nft_setelem_validate(const struct nft_ctx *ctx, struct nft_set *set, const struct nft_set_iter *iter, struct nft_elem_priv *elem_priv); @@ -1138,13 +1163,6 @@ int nft_set_catchall_validate(const struct nft_ctx *ctx, struct nft_set *set); int nf_tables_bind_chain(const struct nft_ctx *ctx, struct nft_chain *chain); void nf_tables_unbind_chain(const struct nft_ctx *ctx, struct nft_chain *chain); -enum nft_chain_types { - NFT_CHAIN_T_DEFAULT = 0, - NFT_CHAIN_T_ROUTE, - NFT_CHAIN_T_NAT, - NFT_CHAIN_T_MAX -}; - /** * struct nft_chain_type - nf_tables chain type info * diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index c46b1bb0efe0f..a9f6babcc781b 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -123,6 +123,29 @@ static void nft_validate_state_update(struct nft_table *table, u8 new_validate_s table->validate_state = new_validate_state; } + +static bool nft_chain_vstate_valid(const struct nft_ctx *ctx, + const struct nft_chain *chain) +{ + const struct nft_base_chain *base_chain; + enum nft_chain_types type; + u8 hooknum; + + if (WARN_ON_ONCE(!nft_is_base_chain(ctx->chain))) + return false; + + base_chain = nft_base_chain(ctx->chain); + hooknum = base_chain->ops.hooknum; + type = base_chain->type->type; + + /* chain is already validated for this call depth */ + if (chain->vstate.depth >= ctx->level && + chain->vstate.hook_mask[type] & BIT(hooknum)) + return true; + + return false; +} + static void nf_tables_trans_destroy_work(struct work_struct *w); static void nft_trans_gc_work(struct work_struct *work); @@ -4079,6 +4102,29 @@ static void nf_tables_rule_release(const struct nft_ctx *ctx, struct nft_rule *r nf_tables_rule_destroy(ctx, rule); } +static void nft_chain_vstate_update(const struct nft_ctx *ctx, struct nft_chain *chain) +{ + const struct nft_base_chain *base_chain; + enum nft_chain_types type; + u8 hooknum; + + /* ctx->chain must hold the calling base chain. */ + if (WARN_ON_ONCE(!nft_is_base_chain(ctx->chain))) { + memset(&chain->vstate, 0, sizeof(chain->vstate)); + return; + } + + base_chain = nft_base_chain(ctx->chain); + hooknum = base_chain->ops.hooknum; + type = base_chain->type->type; + + BUILD_BUG_ON(BIT(NF_INET_NUMHOOKS) > U8_MAX); + + chain->vstate.hook_mask[type] |= BIT(hooknum); + if (chain->vstate.depth < ctx->level) + chain->vstate.depth = ctx->level; +} + /** nft_chain_validate - loop detection and hook validation * * @ctx: context containing call depth and base chain @@ -4088,15 +4134,25 @@ static void nf_tables_rule_release(const struct nft_ctx *ctx, struct nft_rule *r * and set lookups until either the jump limit is hit or all reachable * chains have been validated. */ -int nft_chain_validate(const struct nft_ctx *ctx, const struct nft_chain *chain) +int nft_chain_validate(const struct nft_ctx *ctx, struct nft_chain *chain) { struct nft_expr *expr, *last; struct nft_rule *rule; int err; + BUILD_BUG_ON(NFT_JUMP_STACK_SIZE > 255); if (ctx->level == NFT_JUMP_STACK_SIZE) return -EMLINK; + if (ctx->level > 0) { + /* jumps to base chains are not allowed. */ + if (nft_is_base_chain(chain)) + return -ELOOP; + + if (nft_chain_vstate_valid(ctx, chain)) + return 0; + } + list_for_each_entry(rule, &chain->rules, list) { if (fatal_signal_pending(current)) return -EINTR; @@ -4117,6 +4173,7 @@ int nft_chain_validate(const struct nft_ctx *ctx, const struct nft_chain *chain) } } + nft_chain_vstate_update(ctx, chain); return 0; } EXPORT_SYMBOL_GPL(nft_chain_validate); @@ -4128,7 +4185,7 @@ static int nft_table_validate(struct net *net, const struct nft_table *table) .net = net, .family = table->family, }; - int err; + int err = 0; list_for_each_entry(chain, &table->chains, list) { if (!nft_is_base_chain(chain)) @@ -4137,12 +4194,16 @@ static int nft_table_validate(struct net *net, const struct nft_table *table) ctx.chain = chain; err = nft_chain_validate(&ctx, chain); if (err < 0) - return err; + goto err; cond_resched(); } - return 0; +err: + list_for_each_entry(chain, &table->chains, list) + memset(&chain->vstate, 0, sizeof(chain->vstate)); + + return err; } int nft_setelem_validate(const struct nft_ctx *ctx, struct nft_set *set, From 3a626c5fe913173c563d82c8291bac0319d181b2 Mon Sep 17 00:00:00 2001 From: Florian Westphal Date: Thu, 11 Dec 2025 12:55:19 +0100 Subject: [PATCH 0114/1321] netfilter: nf_tables: avoid softlockup warnings in nft_chain_validate This reverts commit 314c82841602 ("netfilter: nf_tables: can't schedule in nft_chain_validate"): Since commit a60a5abe19d6 ("netfilter: nf_tables: allow iter callbacks to sleep") the iterator callback is invoked without rcu read lock held, so this cond_resched() is now valid. Signed-off-by: Florian Westphal --- net/netfilter/nf_tables_api.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index a9f6babcc781b..618af6e90773f 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -4171,6 +4171,8 @@ int nft_chain_validate(const struct nft_ctx *ctx, struct nft_chain *chain) if (err < 0) return err; } + + cond_resched(); } nft_chain_vstate_update(ctx, chain); @@ -4195,8 +4197,6 @@ static int nft_table_validate(struct net *net, const struct nft_table *table) err = nft_chain_validate(&ctx, chain); if (err < 0) goto err; - - cond_resched(); } err: From 48216c6e1d6ff63bfe13fe98a5f41c8c3ead86b6 Mon Sep 17 00:00:00 2001 From: Florian Westphal Date: Thu, 11 Dec 2025 13:16:49 +0100 Subject: [PATCH 0115/1321] selftests: netfilter: packetdrill: avoid failure on HZ=100 kernel packetdrill --ip_version=ipv4 --mtu=1500 --tolerance_usecs=1000000 --non_fatal packet conntrack_syn_challenge_ack.pkt conntrack v1.4.8 (conntrack-tools): 1 flow entries have been shown. conntrack_syn_challenge_ack.pkt:32: error executing `conntrack -f $NFCT_IP_VERSION \ -L -p tcp --dport 8080 | grep UNREPLIED | grep -q SYN_SENT` command: non-zero status 1 Affected kernel had CONFIG_HZ=100; reset packet was still sitting in backlog. Reported-by: Yi Chen Fixes: a8a388c2aae4 ("selftests: netfilter: add packetdrill based conntrack tests") Signed-off-by: Florian Westphal --- .../net/netfilter/packetdrill/conntrack_syn_challenge_ack.pkt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/testing/selftests/net/netfilter/packetdrill/conntrack_syn_challenge_ack.pkt b/tools/testing/selftests/net/netfilter/packetdrill/conntrack_syn_challenge_ack.pkt index 3442cd29bc932..cdb3910af95b4 100644 --- a/tools/testing/selftests/net/netfilter/packetdrill/conntrack_syn_challenge_ack.pkt +++ b/tools/testing/selftests/net/netfilter/packetdrill/conntrack_syn_challenge_ack.pkt @@ -26,7 +26,7 @@ +0.01 > R 643160523:643160523(0) win 0 -+0.01 `conntrack -f $NFCT_IP_VERSION -L -p tcp --dport 8080 2>/dev/null | grep UNREPLIED | grep -q SYN_SENT` ++0.1 `conntrack -f $NFCT_IP_VERSION -L -p tcp --dport 8080 2>/dev/null | grep UNREPLIED | grep -q SYN_SENT` // Must go through. +0.01 > S 0:0(0) win 65535 From 9d63dea6ca46b80ef2a92c4806679e3de6f5784b Mon Sep 17 00:00:00 2001 From: Scott Mayhew Date: Tue, 9 Dec 2025 14:30:15 -0500 Subject: [PATCH 0116/1321] net/handshake: duplicate handshake cancellations leak socket When a handshake request is cancelled it is removed from the handshake_net->hn_requests list, but it is still present in the handshake_rhashtbl until it is destroyed. If a second cancellation request arrives for the same handshake request, then remove_pending() will return false... and assuming HANDSHAKE_F_REQ_COMPLETED isn't set in req->hr_flags, we'll continue processing through the out_true label, where we put another reference on the sock and a refcount underflow occurs. This can happen for example if a handshake times out - particularly if the SUNRPC client sends the AUTH_TLS probe to the server but doesn't follow it up with the ClientHello due to a problem with tlshd. When the timeout is hit on the server, the server will send a FIN, which triggers a cancellation request via xs_reset_transport(). When the timeout is hit on the client, another cancellation request happens via xs_tls_handshake_sync(). Add a test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED) in the pending cancel path so duplicate cancels can be detected. Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests") Suggested-by: Chuck Lever Signed-off-by: Scott Mayhew Reviewed-by: Chuck Lever Link: https://patch.msgid.link/20251209193015.3032058-1-smayhew@redhat.com Signed-off-by: Paolo Abeni --- net/handshake/request.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/net/handshake/request.c b/net/handshake/request.c index 89435ed755cd0..6b7e3e0bf3996 100644 --- a/net/handshake/request.c +++ b/net/handshake/request.c @@ -326,7 +326,11 @@ bool handshake_req_cancel(struct sock *sk) hn = handshake_pernet(net); if (hn && remove_pending(hn, req)) { - /* Request hadn't been accepted */ + /* Request hadn't been accepted - mark cancelled */ + if (test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED, &req->hr_flags)) { + trace_handshake_cancel_busy(net, req, sk); + return false; + } goto out_true; } if (test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED, &req->hr_flags)) { From 750284a68d59e7007b57ab0860e092b5e487e25a Mon Sep 17 00:00:00 2001 From: Kuniyuki Iwashima Date: Wed, 10 Dec 2025 08:11:12 +0000 Subject: [PATCH 0117/1321] sctp: Fetch inet6_sk() after setting ->pinet6 in sctp_clone_sock(). syzbot reported the lockdep splat below. [0] sctp_clone_sock() sets the child socket's ipv6_mc_list to NULL, but somehow sock_release() in an error path finally acquires lock_sock() in ipv6_sock_mc_close(). The root cause is that sctp_clone_sock() fetches inet6_sk(newsk) before setting newinet->pinet6, meaning that the parent's ipv6_mc_list was actually cleared. Also, sctp_v6_copy_ip_options() uses inet6_sk() but is called before newinet->pinet6 is set. Let's use inet6_sk() only after setting newinet->pinet6. [0]: WARNING: possible recursive locking detected syzkaller #0 Not tainted syz.0.17/5996 is trying to acquire lock: ffff888031af4c60 (sk_lock-AF_INET6){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1700 [inline] ffff888031af4c60 (sk_lock-AF_INET6){+.+.}-{0:0}, at: ipv6_sock_mc_close+0xd3/0x140 net/ipv6/mcast.c:348 but task is already holding lock: ffff888031af4320 (sk_lock-AF_INET6){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1700 [inline] ffff888031af4320 (sk_lock-AF_INET6){+.+.}-{0:0}, at: sctp_getsockopt+0x135/0xb60 net/sctp/socket.c:8131 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(sk_lock-AF_INET6); lock(sk_lock-AF_INET6); *** DEADLOCK *** May be due to missing lock nesting notation 1 lock held by syz.0.17/5996: #0: ffff888031af4320 (sk_lock-AF_INET6){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1700 [inline] #0: ffff888031af4320 (sk_lock-AF_INET6){+.+.}-{0:0}, at: sctp_getsockopt+0x135/0xb60 net/sctp/socket.c:8131 stack backtrace: CPU: 0 UID: 0 PID: 5996 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025 Call Trace: dump_stack_lvl+0x189/0x250 lib/dump_stack.c:120 print_deadlock_bug+0x279/0x290 kernel/locking/lockdep.c:3041 check_deadlock kernel/locking/lockdep.c:3093 [inline] validate_chain kernel/locking/lockdep.c:3895 [inline] __lock_acquire+0x2540/0x2cf0 kernel/locking/lockdep.c:5237 lock_acquire+0x117/0x340 kernel/locking/lockdep.c:5868 lock_sock_nested+0x48/0x100 net/core/sock.c:3780 lock_sock include/net/sock.h:1700 [inline] ipv6_sock_mc_close+0xd3/0x140 net/ipv6/mcast.c:348 inet6_release+0x47/0x70 net/ipv6/af_inet6.c:482 __sock_release net/socket.c:653 [inline] sock_release+0x85/0x150 net/socket.c:681 sctp_getsockopt_peeloff_common+0x56b/0x770 net/sctp/socket.c:5732 sctp_getsockopt_peeloff_flags+0x13b/0x230 net/sctp/socket.c:5801 sctp_getsockopt+0x3ab/0xb60 net/sctp/socket.c:8151 do_sock_getsockopt+0x2b4/0x3d0 net/socket.c:2399 __sys_getsockopt net/socket.c:2428 [inline] __do_sys_getsockopt net/socket.c:2435 [inline] __se_sys_getsockopt net/socket.c:2432 [inline] __x64_sys_getsockopt+0x1a5/0x250 net/socket.c:2432 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xfa/0xf80 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f8f8c38f749 Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffcfdade018 EFLAGS: 00000246 ORIG_RAX: 0000000000000037 RAX: ffffffffffffffda RBX: 00007f8f8c5e5fa0 RCX: 00007f8f8c38f749 RDX: 000000000000007a RSI: 0000000000000084 RDI: 0000000000000003 RBP: 00007f8f8c413f91 R08: 0000200000000040 R09: 0000000000000000 R10: 0000200000000340 R11: 0000000000000246 R12: 0000000000000000 R13: 00007f8f8c5e5fa0 R14: 00007f8f8c5e5fa0 R15: 0000000000000005 Fixes: 16942cf4d3e31 ("sctp: Use sk_clone() in sctp_accept().") Reported-by: syzbot+c59e6bb54e7620495725@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/6936d112.a70a0220.38f243.00a7.GAE@google.com/ Signed-off-by: Kuniyuki Iwashima Link: https://patch.msgid.link/20251210081206.1141086-2-kuniyu@google.com Acked-by: Xin Long Signed-off-by: Paolo Abeni --- net/sctp/socket.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/net/sctp/socket.c b/net/sctp/socket.c index d808096f5ab17..2493a5b1fa3ca 100644 --- a/net/sctp/socket.c +++ b/net/sctp/socket.c @@ -4863,8 +4863,6 @@ static struct sock *sctp_clone_sock(struct sock *sk, newsp->pf->to_sk_daddr(&asoc->peer.primary_addr, newsk); newinet->inet_dport = htons(asoc->peer.port); - - newsp->pf->copy_ip_options(sk, newsk); atomic_set(&newinet->inet_id, get_random_u16()); inet_set_bit(MC_LOOP, newsk); @@ -4874,17 +4872,20 @@ static struct sock *sctp_clone_sock(struct sock *sk, #if IS_ENABLED(CONFIG_IPV6) if (sk->sk_family == AF_INET6) { - struct ipv6_pinfo *newnp = inet6_sk(newsk); + struct ipv6_pinfo *newnp; newinet->pinet6 = &((struct sctp6_sock *)newsk)->inet6; newinet->ipv6_fl_list = NULL; + newnp = inet6_sk(newsk); memcpy(newnp, inet6_sk(sk), sizeof(struct ipv6_pinfo)); newnp->ipv6_mc_list = NULL; newnp->ipv6_ac_list = NULL; } #endif + newsp->pf->copy_ip_options(sk, newsk); + newsp->do_auto_asconf = 0; skb_queue_head_init(&newsp->pd_lobby); From a67da3f3655b7ea0e5e0b093e68d108ac1927180 Mon Sep 17 00:00:00 2001 From: Kuniyuki Iwashima Date: Wed, 10 Dec 2025 08:11:13 +0000 Subject: [PATCH 0118/1321] sctp: Clear inet_opt in sctp_v6_copy_ip_options(). syzbot reported the splat below. [0] Since the cited commit, the child socket inherits all fields of its parent socket unless explicitly cleared. syzbot set IP_OPTIONS to AF_INET6 socket and created a child socket inheriting inet_sk(sk)->inet_opt. sctp_v6_copy_ip_options() only clones np->opt, and leaving inet_opt results in double-free. Let's clear inet_opt in sctp_v6_copy_ip_options(). [0]: BUG: KASAN: double-free in inet_sock_destruct+0x538/0x740 net/ipv4/af_inet.c:159 Free of addr ffff8880304b6d40 by task ksoftirqd/0/15 CPU: 0 UID: 0 PID: 15 Comm: ksoftirqd/0 Not tainted syzkaller #0 PREEMPT(full) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/02/2025 Call Trace: dump_stack_lvl+0x189/0x250 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:378 [inline] print_report+0xca/0x240 mm/kasan/report.c:482 kasan_report_invalid_free+0xea/0x110 mm/kasan/report.c:557 check_slab_allocation+0xe1/0x130 include/linux/page-flags.h:-1 kasan_slab_pre_free include/linux/kasan.h:198 [inline] slab_free_hook mm/slub.c:2484 [inline] slab_free mm/slub.c:6630 [inline] kfree+0x148/0x6d0 mm/slub.c:6837 inet_sock_destruct+0x538/0x740 net/ipv4/af_inet.c:159 __sk_destruct+0x89/0x660 net/core/sock.c:2350 sock_put include/net/sock.h:1991 [inline] sctp_endpoint_destroy_rcu+0xa1/0xf0 net/sctp/endpointola.c:197 rcu_do_batch kernel/rcu/tree.c:2605 [inline] rcu_core+0xcab/0x1770 kernel/rcu/tree.c:2861 handle_softirqs+0x286/0x870 kernel/softirq.c:622 run_ksoftirqd+0x9b/0x100 kernel/softirq.c:1063 smpboot_thread_fn+0x542/0xa60 kernel/smpboot.c:160 kthread+0x711/0x8a0 kernel/kthread.c:463 ret_from_fork+0x4bc/0x870 arch/x86/kernel/process.c:158 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245 Allocated by task 6003: kasan_save_stack mm/kasan/common.c:56 [inline] kasan_save_track+0x3e/0x80 mm/kasan/common.c:77 poison_kmalloc_redzone mm/kasan/common.c:400 [inline] __kasan_kmalloc+0x93/0xb0 mm/kasan/common.c:417 kasan_kmalloc include/linux/kasan.h:262 [inline] __do_kmalloc_node mm/slub.c:5642 [inline] __kmalloc_noprof+0x411/0x7f0 mm/slub.c:5654 kmalloc_noprof include/linux/slab.h:961 [inline] kzalloc_noprof include/linux/slab.h:1094 [inline] ip_options_get+0x51/0x4c0 net/ipv4/ip_options.c:517 do_ip_setsockopt+0x1d9b/0x2d00 net/ipv4/ip_sockglue.c:1087 ip_setsockopt+0x66/0x110 net/ipv4/ip_sockglue.c:1417 do_sock_setsockopt+0x17c/0x1b0 net/socket.c:2360 __sys_setsockopt net/socket.c:2385 [inline] __do_sys_setsockopt net/socket.c:2391 [inline] __se_sys_setsockopt net/socket.c:2388 [inline] __x64_sys_setsockopt+0x13f/0x1b0 net/socket.c:2388 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xfa/0xfa0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f Freed by task 15: kasan_save_stack mm/kasan/common.c:56 [inline] kasan_save_track+0x3e/0x80 mm/kasan/common.c:77 __kasan_save_free_info+0x46/0x50 mm/kasan/generic.c:587 kasan_save_free_info mm/kasan/kasan.h:406 [inline] poison_slab_object mm/kasan/common.c:252 [inline] __kasan_slab_free+0x5c/0x80 mm/kasan/common.c:284 kasan_slab_free include/linux/kasan.h:234 [inline] slab_free_hook mm/slub.c:2539 [inline] slab_free mm/slub.c:6630 [inline] kfree+0x19a/0x6d0 mm/slub.c:6837 inet_sock_destruct+0x538/0x740 net/ipv4/af_inet.c:159 __sk_destruct+0x89/0x660 net/core/sock.c:2350 sock_put include/net/sock.h:1991 [inline] sctp_endpoint_destroy_rcu+0xa1/0xf0 net/sctp/endpointola.c:197 rcu_do_batch kernel/rcu/tree.c:2605 [inline] rcu_core+0xcab/0x1770 kernel/rcu/tree.c:2861 handle_softirqs+0x286/0x870 kernel/softirq.c:622 run_ksoftirqd+0x9b/0x100 kernel/softirq.c:1063 smpboot_thread_fn+0x542/0xa60 kernel/smpboot.c:160 kthread+0x711/0x8a0 kernel/kthread.c:463 ret_from_fork+0x4bc/0x870 arch/x86/kernel/process.c:158 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245 Fixes: 16942cf4d3e31 ("sctp: Use sk_clone() in sctp_accept().") Reported-by: syzbot+ec33a1a006ed5abe7309@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/6936d112.a70a0220.38f243.00a8.GAE@google.com/ Signed-off-by: Kuniyuki Iwashima Link: https://patch.msgid.link/20251210081206.1141086-3-kuniyu@google.com Acked-by: Xin Long Signed-off-by: Paolo Abeni --- net/sctp/ipv6.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c index 069b7e45d8bda..531cb0690007a 100644 --- a/net/sctp/ipv6.c +++ b/net/sctp/ipv6.c @@ -492,6 +492,8 @@ static void sctp_v6_copy_ip_options(struct sock *sk, struct sock *newsk) struct ipv6_pinfo *newnp, *np = inet6_sk(sk); struct ipv6_txoptions *opt; + inet_sk(newsk)->inet_opt = NULL; + newnp = inet6_sk(newsk); rcu_read_lock(); From dfee6bde73b6077a2c00dae185fa1355a31fcae3 Mon Sep 17 00:00:00 2001 From: Jamal Hadi Salim Date: Wed, 10 Dec 2025 11:22:54 -0500 Subject: [PATCH 0119/1321] net/sched: act_mirred: fix loop detection Fix a loop scenario of ethx:egress->ethx:egress Example setup to reproduce: tc qdisc add dev ethx root handle 1: drr tc filter add dev ethx parent 1: protocol ip prio 1 matchall \ action mirred egress redirect dev ethx Now ping out of ethx and you get a deadlock: [ 116.892898][ T307] ============================================ [ 116.893182][ T307] WARNING: possible recursive locking detected [ 116.893418][ T307] 6.18.0-rc6-01205-ge05021a829b8-dirty #204 Not tainted [ 116.893682][ T307] -------------------------------------------- [ 116.893926][ T307] ping/307 is trying to acquire lock: [ 116.894133][ T307] ffff88800c122908 (&sch->root_lock_key){+...}-{3:3}, at: __dev_queue_xmit+0x2210/0x3b50 [ 116.894517][ T307] [ 116.894517][ T307] but task is already holding lock: [ 116.894836][ T307] ffff88800c122908 (&sch->root_lock_key){+...}-{3:3}, at: __dev_queue_xmit+0x2210/0x3b50 [ 116.895252][ T307] [ 116.895252][ T307] other info that might help us debug this: [ 116.895608][ T307] Possible unsafe locking scenario: [ 116.895608][ T307] [ 116.895901][ T307] CPU0 [ 116.896057][ T307] ---- [ 116.896200][ T307] lock(&sch->root_lock_key); [ 116.896392][ T307] lock(&sch->root_lock_key); [ 116.896605][ T307] [ 116.896605][ T307] *** DEADLOCK *** [ 116.896605][ T307] [ 116.896864][ T307] May be due to missing lock nesting notation [ 116.896864][ T307] [ 116.897123][ T307] 6 locks held by ping/307: [ 116.897302][ T307] #0: ffff88800b4b0250 (sk_lock-AF_INET){+.+.}-{0:0}, at: raw_sendmsg+0xb20/0x2cf0 [ 116.897808][ T307] #1: ffffffff88c839c0 (rcu_read_lock){....}-{1:3}, at: ip_output+0xa9/0x600 [ 116.898138][ T307] #2: ffffffff88c839c0 (rcu_read_lock){....}-{1:3}, at: ip_finish_output2+0x2c6/0x1ee0 [ 116.898459][ T307] #3: ffffffff88c83960 (rcu_read_lock_bh){....}-{1:3}, at: __dev_queue_xmit+0x200/0x3b50 [ 116.898782][ T307] #4: ffff88800c122908 (&sch->root_lock_key){+...}-{3:3}, at: __dev_queue_xmit+0x2210/0x3b50 [ 116.899132][ T307] #5: ffffffff88c83960 (rcu_read_lock_bh){....}-{1:3}, at: __dev_queue_xmit+0x200/0x3b50 [ 116.899442][ T307] [ 116.899442][ T307] stack backtrace: [ 116.899667][ T307] CPU: 2 UID: 0 PID: 307 Comm: ping Not tainted 6.18.0-rc6-01205-ge05021a829b8-dirty #204 PREEMPT(voluntary) [ 116.899672][ T307] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 116.899675][ T307] Call Trace: [ 116.899678][ T307] [ 116.899680][ T307] dump_stack_lvl+0x6f/0xb0 [ 116.899688][ T307] print_deadlock_bug.cold+0xc0/0xdc [ 116.899695][ T307] __lock_acquire+0x11f7/0x1be0 [ 116.899704][ T307] lock_acquire+0x162/0x300 [ 116.899707][ T307] ? __dev_queue_xmit+0x2210/0x3b50 [ 116.899713][ T307] ? srso_alias_return_thunk+0x5/0xfbef5 [ 116.899717][ T307] ? stack_trace_save+0x93/0xd0 [ 116.899723][ T307] _raw_spin_lock+0x30/0x40 [ 116.899728][ T307] ? __dev_queue_xmit+0x2210/0x3b50 [ 116.899731][ T307] __dev_queue_xmit+0x2210/0x3b50 Fixes: 178ca30889a1 ("Revert "net/sched: Fix mirred deadlock on device recursion"") Tested-by: Victor Nogueira Signed-off-by: Jamal Hadi Salim Link: https://patch.msgid.link/20251210162255.1057663-1-jhs@mojatatu.com Signed-off-by: Paolo Abeni --- net/sched/act_mirred.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c index f27b583def78e..91c96cc625bd6 100644 --- a/net/sched/act_mirred.c +++ b/net/sched/act_mirred.c @@ -281,6 +281,15 @@ static int tcf_mirred_to_dev(struct sk_buff *skb, struct tcf_mirred *m, want_ingress = tcf_mirred_act_wants_ingress(m_eaction); + if (dev == skb->dev && want_ingress == at_ingress) { + pr_notice_once("tc mirred: Loop (%s:%s --> %s:%s)\n", + netdev_name(skb->dev), + at_ingress ? "ingress" : "egress", + netdev_name(dev), + want_ingress ? "ingress" : "egress"); + goto err_cant_do; + } + /* All mirred/redirected skbs should clear previous ct info */ nf_reset_ct(skb_to_send); if (want_ingress && !at_ingress) /* drop dst for egress -> ingress */ From 8d505602e4b7cfc8922ac4ef630961b0df986f9e Mon Sep 17 00:00:00 2001 From: Victor Nogueira Date: Wed, 10 Dec 2025 11:22:55 -0500 Subject: [PATCH 0120/1321] selftests/tc-testing: Test case exercising potential mirred redirect deadlock Add a test case that reproduces deadlock scenario where the user has a drr qdisc attached to root and has a mirred action that redirects to self on egress Signed-off-by: Victor Nogueira Acked-by: Jamal Hadi Salim Link: https://patch.msgid.link/20251210162255.1057663-2-jhs@mojatatu.com Signed-off-by: Paolo Abeni --- .../tc-testing/tc-tests/actions/mirred.json | 46 +++++++++++++++++++ 1 file changed, 46 insertions(+) diff --git a/tools/testing/selftests/tc-testing/tc-tests/actions/mirred.json b/tools/testing/selftests/tc-testing/tc-tests/actions/mirred.json index b73bd255ea36f..da156feabcbff 100644 --- a/tools/testing/selftests/tc-testing/tc-tests/actions/mirred.json +++ b/tools/testing/selftests/tc-testing/tc-tests/actions/mirred.json @@ -1052,5 +1052,51 @@ "$TC qdisc del dev $DEV1 ingress_block 21 clsact", "$TC actions flush action mirred" ] + }, + { + "id": "7eba", + "name": "Redirect multiport: dummy egress -> dummy egress (Loop)", + "category": [ + "filter", + "mirred" + ], + "plugins": { + "requires": [ + "nsPlugin" + ] + }, + "setup": [ + "$IP link set dev $DUMMY up || true", + "$IP addr add 10.10.10.10/24 dev $DUMMY || true", + "$TC qdisc add dev $DUMMY handle 1: root drr", + "$TC filter add dev $DUMMY parent 1: protocol ip prio 10 matchall action mirred egress redirect dev $DUMMY index 1" + ], + "cmdUnderTest": "ping -c1 -W0.01 -I $DUMMY 10.10.10.1", + "expExitCode": "1", + "verifyCmd": "$TC -j -s actions get action mirred index 1", + "matchJSON": [ + { + "total acts": 0 + }, + { + "actions": [ + { + "order": 1, + "kind": "mirred", + "mirred_action": "redirect", + "direction": "egress", + "index": 1, + "stats": { + "packets": 1, + "overlimits": 1 + }, + "not_in_hw": true + } + ] + } + ], + "teardown": [ + "$TC qdisc del dev $DUMMY root" + ] } ] From e6ea55efc830792eb6d603014db45dec7451aabd Mon Sep 17 00:00:00 2001 From: Wei Fang Date: Thu, 11 Dec 2025 10:09:19 +0800 Subject: [PATCH 0121/1321] net: enetc: do not transmit redirected XDP frames when the link is down In the current implementation, the enetc_xdp_xmit() always transmits redirected XDP frames even if the link is down, but the frames cannot be transmitted from TX BD rings when the link is down, so the frames are still kept in the TX BD rings. If the XDP program is uninstalled, users will see the following warning logs. fsl_enetc 0000:00:00.0 eno0: timeout for tx ring #6 clear More worse, the TX BD ring cannot work properly anymore, because the HW PIR and CIR are not equal after the re-initialization of the TX BD ring. At this point, the BDs between CIR and PIR are invalid, which will cause a hardware malfunction. Another reason is that there is internal context in the ring prefetch logic that will retain the state from the first incarnation of the ring and continue prefetching from the stale location when we re-initialize the ring. The internal context is only reset by an FLR. That is to say, for LS1028A ENETC, software cannot set the HW CIR and PIR when initializing the TX BD ring. It does not make sense to transmit redirected XDP frames when the link is down. Add a link status check to prevent transmission in this condition. This fixes part of the issue, but more complex cases remain. For example, the TX BD ring may still contain unsent frames when the link goes down. Those situations require additional patches, which will build on this one. Fixes: 9d2b68cc108d ("net: enetc: add support for XDP_REDIRECT") Signed-off-by: Wei Fang Reviewed-by: Frank Li Reviewed-by: Hariprasad Kelam Reviewed-by: Vladimir Oltean Link: https://patch.msgid.link/20251211020919.121113-1-wei.fang@nxp.com Signed-off-by: Paolo Abeni --- drivers/net/ethernet/freescale/enetc/enetc.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c index d5e5800b84eff..53b26cece16a8 100644 --- a/drivers/net/ethernet/freescale/enetc/enetc.c +++ b/drivers/net/ethernet/freescale/enetc/enetc.c @@ -1787,7 +1787,8 @@ int enetc_xdp_xmit(struct net_device *ndev, int num_frames, int xdp_tx_bd_cnt, i, k; int xdp_tx_frm_cnt = 0; - if (unlikely(test_bit(ENETC_TX_DOWN, &priv->flags))) + if (unlikely(test_bit(ENETC_TX_DOWN, &priv->flags) || + !netif_carrier_ok(ndev))) return -ENETDOWN; enetc_lock_mdio(); From 248923bd2cd31c1970a5ffdeb3656f808fa34e7f Mon Sep 17 00:00:00 2001 From: Jian Shen Date: Thu, 11 Dec 2025 10:37:35 +0800 Subject: [PATCH 0122/1321] net: hns3: using the num_tqps in the vf driver to apply for resources Currently, hdev->htqp is allocated using hdev->num_tqps, and kinfo->tqp is allocated using kinfo->num_tqps. However, kinfo->num_tqps is set to min(new_tqps, hdev->num_tqps); Therefore, kinfo->num_tqps may be smaller than hdev->num_tqps, which causes some hdev->htqp[i] to remain uninitialized in hclgevf_knic_setup(). Thus, this patch allocates hdev->htqp and kinfo->tqp using hdev->num_tqps, ensuring that the lengths of hdev->htqp and kinfo->tqp are consistent and that all elements are properly initialized. Fixes: e2cb1dec9779 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support") Signed-off-by: Jian Shen Signed-off-by: Jijie Shao Reviewed-by: Simon Horman Link: https://patch.msgid.link/20251211023737.2327018-2-shaojijie@huawei.com Signed-off-by: Paolo Abeni --- drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c index 8fcf220a120d2..70327a73dee32 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c @@ -368,12 +368,12 @@ static int hclgevf_knic_setup(struct hclgevf_dev *hdev) new_tqps = kinfo->rss_size * num_tc; kinfo->num_tqps = min(new_tqps, hdev->num_tqps); - kinfo->tqp = devm_kcalloc(&hdev->pdev->dev, kinfo->num_tqps, + kinfo->tqp = devm_kcalloc(&hdev->pdev->dev, hdev->num_tqps, sizeof(struct hnae3_queue *), GFP_KERNEL); if (!kinfo->tqp) return -ENOMEM; - for (i = 0; i < kinfo->num_tqps; i++) { + for (i = 0; i < hdev->num_tqps; i++) { hdev->htqp[i].q.handle = &hdev->nic; hdev->htqp[i].q.tqp_index = i; kinfo->tqp[i] = &hdev->htqp[i].q; From caf76be55342e8b340e858050f1a45b77e8482b5 Mon Sep 17 00:00:00 2001 From: Jian Shen Date: Thu, 11 Dec 2025 10:37:36 +0800 Subject: [PATCH 0123/1321] net: hns3: using the num_tqps to check whether tqp_index is out of range when vf get ring info from mbx Currently, rss_size = num_tqps / tc_num. If tc_num is 1, then num_tqps equals rss_size. However, if the tc_num is greater than 1, then rss_size will be less than num_tqps, causing the tqp_index check for subsequent TCs using rss_size to always fail. This patch uses the num_tqps to check whether tqp_index is out of range, instead of rss_size. Fixes: 326334aad024 ("net: hns3: add a check for tqp_index in hclge_get_ring_chain_from_mbx()") Signed-off-by: Jian Shen Signed-off-by: Jijie Shao Reviewed-by: Simon Horman Link: https://patch.msgid.link/20251211023737.2327018-3-shaojijie@huawei.com Signed-off-by: Paolo Abeni --- drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c index c7ff12a6c0764..b7d4e06a55d40 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c @@ -193,10 +193,10 @@ static int hclge_get_ring_chain_from_mbx( return -EINVAL; for (i = 0; i < ring_num; i++) { - if (req->msg.param[i].tqp_index >= vport->nic.kinfo.rss_size) { + if (req->msg.param[i].tqp_index >= vport->nic.kinfo.num_tqps) { dev_err(&hdev->pdev->dev, "tqp index(%u) is out of range(0-%u)\n", req->msg.param[i].tqp_index, - vport->nic.kinfo.rss_size - 1U); + vport->nic.kinfo.num_tqps - 1U); return -EINVAL; } } From 98ff67b44f9ef4de3171d665a88e22904d074068 Mon Sep 17 00:00:00 2001 From: Jian Shen Date: Thu, 11 Dec 2025 10:37:37 +0800 Subject: [PATCH 0124/1321] net: hns3: add VLAN id validation before using Currently, the VLAN id may be used without validation when receive a VLAN configuration mailbox from VF. The length of vlan_del_fail_bmap is BITS_TO_LONGS(VLAN_N_VID). It may cause out-of-bounds memory access once the VLAN id is bigger than or equal to VLAN_N_VID. Therefore, VLAN id needs to be checked to ensure it is within the range of VLAN_N_VID. Fixes: fe4144d47eef ("net: hns3: sync VLAN filter entries when kill VLAN ID failed") Signed-off-by: Jian Shen Signed-off-by: Jijie Shao Reviewed-by: Simon Horman Link: https://patch.msgid.link/20251211023737.2327018-4-shaojijie@huawei.com Signed-off-by: Paolo Abeni --- drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c index cf8abbe018402..c589baea7c775 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c @@ -10555,6 +10555,9 @@ int hclge_set_vlan_filter(struct hnae3_handle *handle, __be16 proto, bool writen_to_tbl = false; int ret = 0; + if (vlan_id >= VLAN_N_VID) + return -EINVAL; + /* When device is resetting or reset failed, firmware is unable to * handle mailbox. Just record the vlan id, and remove it after * reset finished. From 5faebd8e64d13a4299fab793fd92f9faa63fc83e Mon Sep 17 00:00:00 2001 From: Tetsuo Handa Date: Tue, 25 Nov 2025 22:39:59 +0900 Subject: [PATCH 0125/1321] can: j1939: make j1939_session_activate() fail if device is no longer registered syzbot is still reporting unregister_netdevice: waiting for vcan0 to become free. Usage count = 2 even after commit 93a27b5891b8 ("can: j1939: add missing calls in NETDEV_UNREGISTER notification handler") was added. A debug printk() patch found that j1939_session_activate() can succeed even after j1939_cancel_active_session() from j1939_netdev_notify(NETDEV_UNREGISTER) has completed. Since j1939_cancel_active_session() is processed with the session list lock held, checking ndev->reg_state in j1939_session_activate() with the session list lock held can reliably close the race window. Reported-by: syzbot Closes: https://syzkaller.appspot.com/bug?extid=881d65229ca4f9ae8c84 Signed-off-by: Tetsuo Handa Acked-by: Oleksij Rempel Link: https://patch.msgid.link/b9653191-d479-4c8b-8536-1326d028db5c@I-love.SAKURA.ne.jp Signed-off-by: Marc Kleine-Budde --- net/can/j1939/transport.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/net/can/j1939/transport.c b/net/can/j1939/transport.c index fbf5c8001c9df..613a911dda100 100644 --- a/net/can/j1939/transport.c +++ b/net/can/j1939/transport.c @@ -1567,6 +1567,8 @@ int j1939_session_activate(struct j1939_session *session) if (active) { j1939_session_put(active); ret = -EAGAIN; + } else if (priv->ndev->reg_state != NETREG_REGISTERED) { + ret = -ENODEV; } else { WARN_ON_ONCE(session->state != J1939_SESSION_NEW); list_add_tail(&session->active_session_list_entry, From 168c82e3815a9c06784b58fa1c643253a45486fa Mon Sep 17 00:00:00 2001 From: Tetsuo Handa Date: Tue, 25 Nov 2025 22:43:12 +0900 Subject: [PATCH 0126/1321] can: j1939: make j1939_sk_bind() fail if device is no longer registered There is a theoretical race window in j1939_sk_netdev_event_unregister() where two j1939_sk_bind() calls jump in between read_unlock_bh() and lock_sock(). The assumption jsk->priv == priv can fail if the first j1939_sk_bind() call once made jsk->priv == NULL due to failed j1939_local_ecu_get() call and the second j1939_sk_bind() call again made jsk->priv != NULL due to successful j1939_local_ecu_get() call. Since the socket lock is held by both j1939_sk_netdev_event_unregister() and j1939_sk_bind(), checking ndev->reg_state with the socket lock held can reliably make the second j1939_sk_bind() call fail (and close this race window). Fixes: 7fcbe5b2c6a4 ("can: j1939: implement NETDEV_UNREGISTER notification handler") Signed-off-by: Tetsuo Handa Acked-by: Oleksij Rempel Link: https://patch.msgid.link/5732921e-247e-4957-a364-da74bd7031d7@I-love.SAKURA.ne.jp Signed-off-by: Marc Kleine-Budde --- net/can/j1939/socket.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/net/can/j1939/socket.c b/net/can/j1939/socket.c index 6272326dd614a..ff9c4fd7b4337 100644 --- a/net/can/j1939/socket.c +++ b/net/can/j1939/socket.c @@ -482,6 +482,12 @@ static int j1939_sk_bind(struct socket *sock, struct sockaddr_unsized *uaddr, in goto out_release_sock; } + if (ndev->reg_state != NETREG_REGISTERED) { + dev_put(ndev); + ret = -ENODEV; + goto out_release_sock; + } + can_ml = can_get_ml_priv(ndev); if (!can_ml) { dev_put(ndev); From f57d2c034d60cb3e9bff675e883f9beeed75f69a Mon Sep 17 00:00:00 2001 From: Marc Kleine-Budde Date: Wed, 17 Dec 2025 10:45:53 +0100 Subject: [PATCH 0127/1321] can: fix build dependency Arnd Bergmann's patch [1] fixed the build dependency problem introduced by bugfix commit cb2dc6d2869a ("can: Kconfig: select CAN driver infrastructure by default"). This ended up as commit 6abd4577bccc ("can: fix build dependency"), but I broke Arnd's fix by removing a dependency that we thought was superfluous. [1] https://lore.kernel.org/all/20251204100015.1033688-1-arnd@kernel.org/ Meanwhile the problem was also found by intel's kernel test robot, complaining about undefined symbols: | ERROR: modpost: "m_can_class_unregister" [drivers/net/can/m_can/m_can_platform.ko] undefined! | ERROR: modpost: "m_can_class_free_dev" [drivers/net/can/m_can/m_can_platform.ko] undefined! | ERROR: modpost: "m_can_class_allocate_dev" [drivers/net/can/m_can/m_can_platform.ko] undefined! | ERROR: modpost: "m_can_class_get_clocks" [drivers/net/can/m_can/m_can_platform.ko] undefined! | ERROR: modpost: "m_can_class_register" [drivers/net/can/m_can/m_can_platform.ko] undefined! To fix this problem, add the missing dependency again. Cc: Vincent Mailhol Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202512132253.vO9WFDJK-lkp@intel.com/ Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202512180808.fTAUQ2XN-lkp@intel.com/ Reported-by: Arnd Bergmann Closes: https://lore.kernel.org/all/7427949a-ea7d-4854-9fe4-e01db7d878c7@app.fastmail.com/ Fixes: 6abd4577bccc ("can: fix build dependency") Fixes: cb2dc6d2869a ("can: Kconfig: select CAN driver infrastructure by default") Acked-by: Vincent Mailhol Link: https://patch.msgid.link/20251217-can-fix-dependency-v1-1-fd2d4f2a2bf5@pengutronix.de Signed-off-by: Marc Kleine-Budde --- drivers/net/can/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/can/Kconfig b/drivers/net/can/Kconfig index 460a74ae69233..cfaea6178a719 100644 --- a/drivers/net/can/Kconfig +++ b/drivers/net/can/Kconfig @@ -17,7 +17,7 @@ menuconfig CAN_DEV virtual ones. If you own such devices or plan to use the virtual CAN interfaces to develop applications, say Y here. -if CAN_DEV +if CAN_DEV && CAN config CAN_VCAN tristate "Virtual Local CAN Interface (vcan)" From 1181979cf9d64d1afbdadf7ed678856c7bf44b84 Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Thu, 4 Dec 2025 11:03:29 +0100 Subject: [PATCH 0128/1321] iommufd: Fix building without dmabuf When DMABUF is disabled, trying to use it causes a link failure: x86_64-linux-ld: drivers/iommu/iommufd/io_pagetable.o: in function `iopt_map_file_pages': io_pagetable.c:(.text+0x1735): undefined reference to `dma_buf_get' x86_64-linux-ld: io_pagetable.c:(.text+0x1775): undefined reference to `dma_buf_put' Fixes: 44ebaa1744fd ("iommufd: Accept a DMABUF through IOMMU_IOAS_MAP_FILE") Link: https://patch.msgid.link/r/20251204100333.1034767-1-arnd@kernel.org Signed-off-by: Arnd Bergmann Signed-off-by: Jason Gunthorpe --- drivers/iommu/iommufd/io_pagetable.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/iommufd/io_pagetable.c b/drivers/iommu/iommufd/io_pagetable.c index 54cf4d856179b..436992331111c 100644 --- a/drivers/iommu/iommufd/io_pagetable.c +++ b/drivers/iommu/iommufd/io_pagetable.c @@ -495,7 +495,11 @@ int iopt_map_file_pages(struct iommufd_ctx *ictx, struct io_pagetable *iopt, return -EOVERFLOW; start_byte = start - ALIGN_DOWN(start, PAGE_SIZE); - dmabuf = dma_buf_get(fd); + if (IS_ENABLED(CONFIG_DMA_SHARED_BUFFER)) + dmabuf = dma_buf_get(fd); + else + dmabuf = ERR_PTR(-ENXIO); + if (!IS_ERR(dmabuf)) { pages = iopt_alloc_dmabuf_pages(ictx, dmabuf, start_byte, start, length, From 5b2619a25b427da5a926e9db7806ab1e107170c1 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Fri, 5 Dec 2025 14:56:12 -0400 Subject: [PATCH 0129/1321] iommufd/selftest: Make it clearer to gcc that the access is not out of bounds GCC gets a bit confused and reports: In function '_test_cmd_get_hw_info', inlined from 'iommufd_ioas_get_hw_info' at iommufd.c:779:3, inlined from 'wrapper_iommufd_ioas_get_hw_info' at iommufd.c:752:1: >> iommufd_utils.h:804:37: warning: array subscript 'struct iommu_test_hw_info[0]' is partly outside array bounds of 'struct iommu_test_hw_info_buffer_smaller[1]' [-Warray-bounds=] 804 | assert(!info->flags); | ~~~~^~~~~~~ iommufd.c: In function 'wrapper_iommufd_ioas_get_hw_info': iommufd.c:761:11: note: object 'buffer_smaller' of size 4 761 | } buffer_smaller; | ^~~~~~~~~~~~~~ While it is true that "struct iommu_test_hw_info[0]" is partly out of bounds of the input pointer, it is not true that info->flags is out of bounds. Unclear why it warns on this. Reuse an existing properly sized stack buffer and pass a truncated length instead to test the same thing. Fixes: af4fde93c319 ("iommufd/selftest: Add coverage for IOMMU_GET_HW_INFO ioctl") Link: https://patch.msgid.link/r/0-v1-63a2cffb09da+4486-iommufd_gcc_bounds_jgg@nvidia.com Reviewed-by: Kevin Tian Reviewed-by: Nicolin Chen Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202512032344.kaAcKFIM-lkp@intel.com/ Signed-off-by: Jason Gunthorpe --- tools/testing/selftests/iommu/iommufd.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c index 10e051b6f592d..dadad277f4eb2 100644 --- a/tools/testing/selftests/iommu/iommufd.c +++ b/tools/testing/selftests/iommu/iommufd.c @@ -755,9 +755,6 @@ TEST_F(iommufd_ioas, get_hw_info) struct iommu_test_hw_info info; uint64_t trailing_bytes; } buffer_larger; - struct iommu_test_hw_info_buffer_smaller { - __u32 flags; - } buffer_smaller; if (self->device_id) { uint8_t max_pasid = 0; @@ -789,8 +786,9 @@ TEST_F(iommufd_ioas, get_hw_info) * the fields within the size range still gets updated. */ test_cmd_get_hw_info(self->device_id, - IOMMU_HW_INFO_TYPE_DEFAULT, - &buffer_smaller, sizeof(buffer_smaller)); + IOMMU_HW_INFO_TYPE_DEFAULT, &buffer_exact, + offsetofend(struct iommu_test_hw_info, + flags)); test_cmd_get_hw_info_pasid(self->device_id, &max_pasid); ASSERT_EQ(0, max_pasid); if (variant->pasid_capable) { From d6d4f7d7532acfc4c1c45c19cfd01b62447e19ef Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Fri, 5 Dec 2025 15:42:47 -0400 Subject: [PATCH 0130/1321] iommufd/selftest: Do not leak the hwpt if IOMMU_TEST_OP_MD_CHECK_MAP fails If the input validation fails it returned without freeing the hwpt refcount causing a leak. This triggers a WARN_ON when closing the fd: WARNING: drivers/iommu/iommufd/main.c:369 at iommufd_fops_release+0x385/0x430, CPU#1: repro/724 Found by szykaller. Fixes: e93d5945ed5b ("iommufd: Change the selftest to use iommupt instead of xarray") Link: https://patch.msgid.link/r/0-v1-c8ed57e24380+44ae-iommufd_selftest_hwpt_leak_jgg@nvidia.com Reviewed-by: Kevin Tian Reviewed-by: Pasha Tatashin Reported-by: "Lai, Yi" Closes: https://lore.kernel.org/r/aTJGMaqwQK0ASj0G@ly-workstation Signed-off-by: Jason Gunthorpe --- drivers/iommu/iommufd/selftest.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c index c4322fd26f93e..86446e1537949 100644 --- a/drivers/iommu/iommufd/selftest.c +++ b/drivers/iommu/iommufd/selftest.c @@ -1215,8 +1215,10 @@ static int iommufd_test_md_check_pa(struct iommufd_ucmd *ucmd, page_size = 1 << __ffs(mock->domain.pgsize_bitmap); if (iova % page_size || length % page_size || (uintptr_t)uptr % page_size || - check_add_overflow((uintptr_t)uptr, (uintptr_t)length, &end)) - return -EINVAL; + check_add_overflow((uintptr_t)uptr, (uintptr_t)length, &end)) { + rc = -EINVAL; + goto out_put; + } for (; length; length -= page_size) { struct page *pages[1]; From 4735d8471a2e6c38217d31426f99068ae900c0e6 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Tue, 16 Dec 2025 11:53:40 -0400 Subject: [PATCH 0131/1321] iommufd/selftest: Check for overflow in IOMMU_TEST_OP_ADD_RESERVED syzkaller found it could overflow math in the test infrastructure and cause a WARN_ON by corrupting the reserved interval tree. This only effects test kernels with CONFIG_IOMMUFD_TEST. Validate the user input length in the test ioctl. Fixes: f4b20bb34c83 ("iommufd: Add kernel support for testing iommufd") Link: https://patch.msgid.link/r/0-v1-cd99f6049ba5+51-iommufd_syz_add_resv_jgg@nvidia.com Reviewed-by: Samiullah Khawaja Reviewed-by: Kevin Tian Tested-by: Yi Liu Reported-by: syzbot+57fdb0cf6a0c5d1f15a2@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/69368129.a70a0220.38f243.008f.GAE@google.com Signed-off-by: Jason Gunthorpe --- drivers/iommu/iommufd/selftest.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c index 86446e1537949..550ff36dec3a3 100644 --- a/drivers/iommu/iommufd/selftest.c +++ b/drivers/iommu/iommufd/selftest.c @@ -1184,14 +1184,20 @@ static int iommufd_test_add_reserved(struct iommufd_ucmd *ucmd, unsigned int mockpt_id, unsigned long start, size_t length) { + unsigned long last; struct iommufd_ioas *ioas; int rc; + if (!length) + return -EINVAL; + if (check_add_overflow(start, length - 1, &last)) + return -EOVERFLOW; + ioas = iommufd_get_ioas(ucmd->ictx, mockpt_id); if (IS_ERR(ioas)) return PTR_ERR(ioas); down_write(&ioas->iopt.iova_rwsem); - rc = iopt_reserve_iova(&ioas->iopt, start, start + length - 1, NULL); + rc = iopt_reserve_iova(&ioas->iopt, start, last, NULL); up_write(&ioas->iopt.iova_rwsem); iommufd_put_object(ucmd->ictx, &ioas->obj); return rc; From 4b5876e722b88aa8990d6b38258aeb33829fc346 Mon Sep 17 00:00:00 2001 From: Srinivas Pandruvada Date: Fri, 5 Dec 2025 15:00:07 -0800 Subject: [PATCH 0132/1321] thermal: intel: int340x: Enable power slider interface for Wildcat Lake Set the PROC_THERMAL_FEATURE_SOC_POWER_SLIDER feature flag in proc_thermal_pci_ids[] for Wildcat Lake to enable power slider interface. Signed-off-by: Srinivas Pandruvada Link: https://patch.msgid.link/20251205230007.2218533-1-srinivas.pandruvada@linux.intel.com Signed-off-by: Rafael J. Wysocki --- .../intel/int340x_thermal/processor_thermal_device_pci.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/thermal/intel/int340x_thermal/processor_thermal_device_pci.c b/drivers/thermal/intel/int340x_thermal/processor_thermal_device_pci.c index 0d4dcc66e097e..c693d934103af 100644 --- a/drivers/thermal/intel/int340x_thermal/processor_thermal_device_pci.c +++ b/drivers/thermal/intel/int340x_thermal/processor_thermal_device_pci.c @@ -503,7 +503,8 @@ static const struct pci_device_id proc_thermal_pci_ids[] = { { PCI_DEVICE_DATA(INTEL, WCL_THERMAL, PROC_THERMAL_FEATURE_MSI_SUPPORT | PROC_THERMAL_FEATURE_RAPL | PROC_THERMAL_FEATURE_DLVR | PROC_THERMAL_FEATURE_DVFS | PROC_THERMAL_FEATURE_WT_HINT | - PROC_THERMAL_FEATURE_POWER_FLOOR | PROC_THERMAL_FEATURE_PTC) }, + PROC_THERMAL_FEATURE_POWER_FLOOR | PROC_THERMAL_FEATURE_PTC | + PROC_THERMAL_FEATURE_SOC_POWER_SLIDER) }, { PCI_DEVICE_DATA(INTEL, NVL_H_THERMAL, PROC_THERMAL_FEATURE_RAPL | PROC_THERMAL_FEATURE_DLVR | PROC_THERMAL_FEATURE_DVFS | PROC_THERMAL_FEATURE_MSI_SUPPORT | PROC_THERMAL_FEATURE_WT_HINT | From ed2c3c412e03f387ac7079def6ec1253a122a700 Mon Sep 17 00:00:00 2001 From: Thorsten Blum Date: Sat, 6 Dec 2025 18:42:45 +0100 Subject: [PATCH 0133/1321] thermal: core: Fix typo and indentation in comments s/tmperature/temperature/ and adjust the indentation of the @ops parameter description to improve readability. Signed-off-by: Thorsten Blum Link: https://patch.msgid.link/20251206174245.116391-2-thorsten.blum@linux.dev Signed-off-by: Rafael J. Wysocki --- drivers/thermal/thermal_core.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c index 17ca5c0826435..89758c9934ec6 100644 --- a/drivers/thermal/thermal_core.c +++ b/drivers/thermal/thermal_core.c @@ -500,7 +500,7 @@ void thermal_zone_set_trip_hyst(struct thermal_zone_device *tz, WRITE_ONCE(trip->hysteresis, hyst); thermal_notify_tz_trip_change(tz, trip); /* - * If the zone temperature is above or at the trip tmperature, the trip + * If the zone temperature is above or at the trip temperature, the trip * is in the trips_reached list and its threshold is equal to its low * temperature. It needs to stay in that list, but its threshold needs * to be updated and the list ordering may need to be restored. @@ -1043,7 +1043,7 @@ static void thermal_cooling_device_init_complete(struct thermal_cooling_device * * @np: a pointer to a device tree node. * @type: the thermal cooling device type. * @devdata: device private data. - * @ops: standard thermal cooling devices callbacks. + * @ops: standard thermal cooling devices callbacks. * * This interface function adds a new thermal cooling device (fan/processor/...) * to /sys/class/thermal/ folder as cooling_device[0-*]. It tries to bind itself From 8c3a2a58fbd64c621db903253e6dc1e7a1122d1d Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Mon, 15 Dec 2025 15:21:34 +0100 Subject: [PATCH 0134/1321] PM: runtime: Do not clear needs_force_resume with enabled runtime PM Commit 89d9cec3b1e9 ("PM: runtime: Clear power.needs_force_resume in pm_runtime_reinit()") added provisional clearing of power.needs_force_resume to pm_runtime_reinit(), but it is done unconditionally which is a mistake because pm_runtime_reinit() may race with driver probing and removal [1]. To address this, notice that power.needs_force_resume should never be set when runtime PM is enabled and so it only needs to be cleared when runtime PM is disabled, and update pm_runtime_init() to only clear that flag when runtime PM is disabled. Fixes: 89d9cec3b1e9 ("PM: runtime: Clear power.needs_force_resume in pm_runtime_reinit()") Reported-by: Ed Tsai Closes: https://lore.kernel.org/linux-pm/20251215122154.3180001-1-ed.tsai@mediatek.com/ [1] Signed-off-by: Rafael J. Wysocki Cc: 6.17+ # 6.17+ Reviewed-by: Ulf Hansson Link: https://patch.msgid.link/12807571.O9o76ZdvQC@rafael.j.wysocki --- drivers/base/power/runtime.c | 22 ++++++++++++---------- 1 file changed, 12 insertions(+), 10 deletions(-) diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c index 84676cc242214..0ee8ea971aa46 100644 --- a/drivers/base/power/runtime.c +++ b/drivers/base/power/runtime.c @@ -1868,16 +1868,18 @@ void pm_runtime_init(struct device *dev) */ void pm_runtime_reinit(struct device *dev) { - if (!pm_runtime_enabled(dev)) { - if (dev->power.runtime_status == RPM_ACTIVE) - pm_runtime_set_suspended(dev); - if (dev->power.irq_safe) { - spin_lock_irq(&dev->power.lock); - dev->power.irq_safe = 0; - spin_unlock_irq(&dev->power.lock); - if (dev->parent) - pm_runtime_put(dev->parent); - } + if (pm_runtime_enabled(dev)) + return; + + if (dev->power.runtime_status == RPM_ACTIVE) + pm_runtime_set_suspended(dev); + + if (dev->power.irq_safe) { + spin_lock_irq(&dev->power.lock); + dev->power.irq_safe = 0; + spin_unlock_irq(&dev->power.lock); + if (dev->parent) + pm_runtime_put(dev->parent); } /* * Clear power.needs_force_resume in case it has been set by From 61c0328b64a2bc6358abc15ff6bada8cff78f724 Mon Sep 17 00:00:00 2001 From: Sumeet Pawnikar Date: Sat, 6 Dec 2025 00:32:16 +0530 Subject: [PATCH 0135/1321] powercap: fix race condition in register_control_type() The device becomes visible to userspace via device_register() even before it fully initialized by idr_init(). If userspace or another thread tries to register a zone immediately after device_register(), the control_type_valid() will fail because the control_type is not yet in the list. The IDR is not yet initialized, so this race condition causes zone registration failure. Move idr_init() and list addition before device_register() fix the race condition. Signed-off-by: Sumeet Pawnikar [ rjw: Subject adjustment, empty line added ] Link: https://patch.msgid.link/20251205190216.5032-1-sumeet4linux@gmail.com Signed-off-by: Rafael J. Wysocki --- drivers/powercap/powercap_sys.c | 16 +++++++++++----- 1 file changed, 11 insertions(+), 5 deletions(-) diff --git a/drivers/powercap/powercap_sys.c b/drivers/powercap/powercap_sys.c index 4112a00973382..d14b36b75189d 100644 --- a/drivers/powercap/powercap_sys.c +++ b/drivers/powercap/powercap_sys.c @@ -625,17 +625,23 @@ struct powercap_control_type *powercap_register_control_type( INIT_LIST_HEAD(&control_type->node); control_type->dev.class = &powercap_class; dev_set_name(&control_type->dev, "%s", name); - result = device_register(&control_type->dev); - if (result) { - put_device(&control_type->dev); - return ERR_PTR(result); - } idr_init(&control_type->idr); mutex_lock(&powercap_cntrl_list_lock); list_add_tail(&control_type->node, &powercap_cntrl_list); mutex_unlock(&powercap_cntrl_list_lock); + result = device_register(&control_type->dev); + if (result) { + mutex_lock(&powercap_cntrl_list_lock); + list_del(&control_type->node); + mutex_unlock(&powercap_cntrl_list_lock); + + idr_destroy(&control_type->idr); + put_device(&control_type->dev); + return ERR_PTR(result); + } + return control_type; } EXPORT_SYMBOL_GPL(powercap_register_control_type); From 3be3eba279fb361ddaf69cdbe774a080a92de598 Mon Sep 17 00:00:00 2001 From: Sumeet Pawnikar Date: Sun, 7 Dec 2025 20:45:48 +0530 Subject: [PATCH 0136/1321] powercap: fix sscanf() error return value handling Fix inconsistent error handling for sscanf() return value check. Implicit boolean conversion is used instead of explicit return value checks. The code checks if (!sscanf(...)) which is incorrect because: 1. sscanf returns the number of successfully parsed items 2. On success, it returns 1 (one item passed) 3. On failure, it returns 0 or EOF 4. The check 'if (!sscanf(...))' is wrong because it treats success (1) as failure All occurrences of sscanf() now uses explicit return value check. With this behavior it returns '-EINVAL' when parsing fails (returns 0 or EOF), and continues when parsing succeeds (returns 1). Signed-off-by: Sumeet Pawnikar [ rjw: Subject and changelog edits ] Link: https://patch.msgid.link/20251207151549.202452-1-sumeet4linux@gmail.com Signed-off-by: Rafael J. Wysocki --- drivers/powercap/powercap_sys.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/powercap/powercap_sys.c b/drivers/powercap/powercap_sys.c index d14b36b75189d..1ff369880beb2 100644 --- a/drivers/powercap/powercap_sys.c +++ b/drivers/powercap/powercap_sys.c @@ -68,7 +68,7 @@ static ssize_t show_constraint_##_attr(struct device *dev, \ int id; \ struct powercap_zone_constraint *pconst;\ \ - if (!sscanf(dev_attr->attr.name, "constraint_%d_", &id)) \ + if (sscanf(dev_attr->attr.name, "constraint_%d_", &id) != 1) \ return -EINVAL; \ if (id >= power_zone->const_id_cnt) \ return -EINVAL; \ @@ -93,7 +93,7 @@ static ssize_t store_constraint_##_attr(struct device *dev,\ int id; \ struct powercap_zone_constraint *pconst;\ \ - if (!sscanf(dev_attr->attr.name, "constraint_%d_", &id)) \ + if (sscanf(dev_attr->attr.name, "constraint_%d_", &id) != 1) \ return -EINVAL; \ if (id >= power_zone->const_id_cnt) \ return -EINVAL; \ @@ -162,7 +162,7 @@ static ssize_t show_constraint_name(struct device *dev, ssize_t len = -ENODATA; struct powercap_zone_constraint *pconst; - if (!sscanf(dev_attr->attr.name, "constraint_%d_", &id)) + if (sscanf(dev_attr->attr.name, "constraint_%d_", &id) != 1) return -EINVAL; if (id >= power_zone->const_id_cnt) return -EINVAL; From ad4f7cef7275ee2cd6328ad80b391cce6f31457f Mon Sep 17 00:00:00 2001 From: Srinivas Pandruvada Date: Wed, 17 Dec 2025 07:34:55 -0800 Subject: [PATCH 0137/1321] powercap: intel_rapl: Fix possible recursive lock warning With the RAPL PMU addition, there is a recursive locking when CPU online callback function calls rapl_package_add_pmu(). Here cpu_hotplug_lock is already acquired by cpuhp_thread_fun() and rapl_package_add_pmu() tries to acquire again. <4>[ 8.197433] ============================================ <4>[ 8.197437] WARNING: possible recursive locking detected <4>[ 8.197440] 6.19.0-rc1-lgci-xe-xe-4242-05b7c58b3367dca84+ #1 Not tainted <4>[ 8.197444] -------------------------------------------- <4>[ 8.197447] cpuhp/0/20 is trying to acquire lock: <4>[ 8.197450] ffffffff83487870 (cpu_hotplug_lock){++++}-{0:0}, at: rapl_package_add_pmu+0x37/0x370 [intel_rapl_common] <4>[ 8.197463] but task is already holding lock: <4>[ 8.197466] ffffffff83487870 (cpu_hotplug_lock){++++}-{0:0}, at: cpuhp_thread_fun+0x6d/0x290 <4>[ 8.197477] other info that might help us debug this: <4>[ 8.197480] Possible unsafe locking scenario: <4>[ 8.197483] CPU0 <4>[ 8.197485] ---- <4>[ 8.197487] lock(cpu_hotplug_lock); <4>[ 8.197490] lock(cpu_hotplug_lock); <4>[ 8.197493] *** DEADLOCK *** .. .. <4>[ 8.197542] __lock_acquire+0x146e/0x2790 <4>[ 8.197548] lock_acquire+0xc4/0x2c0 <4>[ 8.197550] ? rapl_package_add_pmu+0x37/0x370 [intel_rapl_common] <4>[ 8.197556] cpus_read_lock+0x41/0x110 <4>[ 8.197558] ? rapl_package_add_pmu+0x37/0x370 [intel_rapl_common] <4>[ 8.197561] rapl_package_add_pmu+0x37/0x370 [intel_rapl_common] <4>[ 8.197565] rapl_cpu_online+0x85/0x87 [intel_rapl_msr] <4>[ 8.197568] ? __pfx_rapl_cpu_online+0x10/0x10 [intel_rapl_msr] <4>[ 8.197570] cpuhp_invoke_callback+0x41f/0x6c0 <4>[ 8.197573] ? cpuhp_thread_fun+0x6d/0x290 <4>[ 8.197575] cpuhp_thread_fun+0x1e2/0x290 <4>[ 8.197578] ? smpboot_thread_fn+0x26/0x290 <4>[ 8.197581] smpboot_thread_fn+0x12f/0x290 <4>[ 8.197584] ? __pfx_smpboot_thread_fn+0x10/0x10 <4>[ 8.197586] kthread+0x11f/0x250 <4>[ 8.197589] ? __pfx_kthread+0x10/0x10 <4>[ 8.197592] ret_from_fork+0x344/0x3a0 <4>[ 8.197595] ? __pfx_kthread+0x10/0x10 <4>[ 8.197597] ret_from_fork_asm+0x1a/0x30 <4>[ 8.197604] Fix this issue in the same way as rapl powercap package domain is added from the same CPU online callback by introducing another interface which doesn't call cpus_read_lock(). Add rapl_package_add_pmu_locked() and rapl_package_remove_pmu_locked() which don't call cpus_read_lock(). Fixes: 748d6ba43afd ("powercap: intel_rapl: Enable MSR-based RAPL PMU support") Reported-by: Borah, Chaitanya Kumar Closes: https://lore.kernel.org/linux-pm/5427ede1-57a0-43d1-99f3-8ca4b0643e82@intel.com/T/#u Tested-by: Kuppuswamy Sathyanarayanan Tested-by: RavitejaX Veesam Signed-off-by: Srinivas Pandruvada Link: https://patch.msgid.link/20251217153455.3560176-1-srinivas.pandruvada@linux.intel.com Signed-off-by: Rafael J. Wysocki --- drivers/powercap/intel_rapl_common.c | 24 ++++++++++++++++++------ drivers/powercap/intel_rapl_msr.c | 4 ++-- include/linux/intel_rapl.h | 4 ++++ 3 files changed, 24 insertions(+), 8 deletions(-) diff --git a/drivers/powercap/intel_rapl_common.c b/drivers/powercap/intel_rapl_common.c index b9d87e56cbbc8..3ff6da3bf4e63 100644 --- a/drivers/powercap/intel_rapl_common.c +++ b/drivers/powercap/intel_rapl_common.c @@ -2032,7 +2032,7 @@ static int rapl_pmu_update(struct rapl_package *rp) return ret; } -int rapl_package_add_pmu(struct rapl_package *rp) +int rapl_package_add_pmu_locked(struct rapl_package *rp) { struct rapl_package_pmu_data *data = &rp->pmu_data; int idx; @@ -2040,8 +2040,6 @@ int rapl_package_add_pmu(struct rapl_package *rp) if (rp->has_pmu) return -EEXIST; - guard(cpus_read_lock)(); - for (idx = 0; idx < rp->nr_domains; idx++) { struct rapl_domain *rd = &rp->domains[idx]; int domain = rd->id; @@ -2091,17 +2089,23 @@ int rapl_package_add_pmu(struct rapl_package *rp) return rapl_pmu_update(rp); } +EXPORT_SYMBOL_GPL(rapl_package_add_pmu_locked); + +int rapl_package_add_pmu(struct rapl_package *rp) +{ + guard(cpus_read_lock)(); + + return rapl_package_add_pmu_locked(rp); +} EXPORT_SYMBOL_GPL(rapl_package_add_pmu); -void rapl_package_remove_pmu(struct rapl_package *rp) +void rapl_package_remove_pmu_locked(struct rapl_package *rp) { struct rapl_package *pos; if (!rp->has_pmu) return; - guard(cpus_read_lock)(); - list_for_each_entry(pos, &rapl_packages, plist) { /* PMU is still needed */ if (pos->has_pmu && pos != rp) @@ -2111,6 +2115,14 @@ void rapl_package_remove_pmu(struct rapl_package *rp) perf_pmu_unregister(&rapl_pmu.pmu); memset(&rapl_pmu, 0, sizeof(struct rapl_pmu)); } +EXPORT_SYMBOL_GPL(rapl_package_remove_pmu_locked); + +void rapl_package_remove_pmu(struct rapl_package *rp) +{ + guard(cpus_read_lock)(); + + rapl_package_remove_pmu_locked(rp); +} EXPORT_SYMBOL_GPL(rapl_package_remove_pmu); #endif diff --git a/drivers/powercap/intel_rapl_msr.c b/drivers/powercap/intel_rapl_msr.c index 0ce1096b63145..9a7e150b3536b 100644 --- a/drivers/powercap/intel_rapl_msr.c +++ b/drivers/powercap/intel_rapl_msr.c @@ -82,7 +82,7 @@ static int rapl_cpu_online(unsigned int cpu) if (IS_ERR(rp)) return PTR_ERR(rp); if (rapl_msr_pmu) - rapl_package_add_pmu(rp); + rapl_package_add_pmu_locked(rp); } cpumask_set_cpu(cpu, &rp->cpumask); return 0; @@ -101,7 +101,7 @@ static int rapl_cpu_down_prep(unsigned int cpu) lead_cpu = cpumask_first(&rp->cpumask); if (lead_cpu >= nr_cpu_ids) { if (rapl_msr_pmu) - rapl_package_remove_pmu(rp); + rapl_package_remove_pmu_locked(rp); rapl_remove_package_cpuslocked(rp); } else if (rp->lead_cpu == cpu) { rp->lead_cpu = lead_cpu; diff --git a/include/linux/intel_rapl.h b/include/linux/intel_rapl.h index e9ade2ff4af66..f479ef5b3341c 100644 --- a/include/linux/intel_rapl.h +++ b/include/linux/intel_rapl.h @@ -214,10 +214,14 @@ void rapl_remove_package(struct rapl_package *rp); #ifdef CONFIG_PERF_EVENTS int rapl_package_add_pmu(struct rapl_package *rp); +int rapl_package_add_pmu_locked(struct rapl_package *rp); void rapl_package_remove_pmu(struct rapl_package *rp); +void rapl_package_remove_pmu_locked(struct rapl_package *rp); #else static inline int rapl_package_add_pmu(struct rapl_package *rp) { return 0; } +static inline int rapl_package_add_pmu_locked(struct rapl_package *rp) { return 0; } static inline void rapl_package_remove_pmu(struct rapl_package *rp) { } +static inline void rapl_package_remove_pmu_locked(struct rapl_package *rp) { } #endif #endif /* __INTEL_RAPL_H__ */ From 074c52d17335be37ee83c01b4097438ccc27ae68 Mon Sep 17 00:00:00 2001 From: Pengjie Zhang Date: Wed, 10 Dec 2025 21:22:27 +0800 Subject: [PATCH 0138/1321] ACPI: CPPC: Fix missing PCC check for guaranteed_perf The current implementation overlooks the 'guaranteed_perf' register in this check. If the Guaranteed Performance register is located in the PCC subspace, the function currently attempts to read it without acquiring the lock and without sending the CMD_READ doorbell to the firmware. This can result in reading stale data. Fixes: 29523f095397 ("ACPI / CPPC: Add support for guaranteed performance") Signed-off-by: Pengjie Zhang Cc: 4.20+ # 4.20+ [ rjw: Subject and changelog edits ] Link: https://patch.msgid.link/20251210132227.1988380-1-zhangpengjie2@huawei.com Signed-off-by: Rafael J. Wysocki --- drivers/acpi/cppc_acpi.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c index 3bdeeee3414e6..e66e20d1f31b7 100644 --- a/drivers/acpi/cppc_acpi.c +++ b/drivers/acpi/cppc_acpi.c @@ -1366,7 +1366,8 @@ int cppc_get_perf_caps(int cpunum, struct cppc_perf_caps *perf_caps) /* Are any of the regs PCC ?*/ if (CPC_IN_PCC(highest_reg) || CPC_IN_PCC(lowest_reg) || CPC_IN_PCC(lowest_non_linear_reg) || CPC_IN_PCC(nominal_reg) || - CPC_IN_PCC(low_freq_reg) || CPC_IN_PCC(nom_freq_reg)) { + CPC_IN_PCC(low_freq_reg) || CPC_IN_PCC(nom_freq_reg) || + CPC_IN_PCC(guaranteed_reg)) { if (pcc_ss_id < 0) { pr_debug("Invalid pcc_ss_id\n"); return -ENODEV; From 0a0d83f12181f1b591d98cfef05b2f910af50e0e Mon Sep 17 00:00:00 2001 From: Pengjie Zhang Date: Wed, 10 Dec 2025 21:26:34 +0800 Subject: [PATCH 0139/1321] ACPI: PCC: Fix race condition by removing static qualifier Local variable 'ret' in acpi_pcc_address_space_setup() is currently declared as 'static'. This can lead to race conditions in a multithreaded environment. Remove the 'static' qualifier to ensure that 'ret' will be allocated directly on the stack as a local variable. Fixes: a10b1c99e2dc ("ACPI: PCC: Setup PCC Opregion handler only if platform interrupt is available") Signed-off-by: Pengjie Zhang Reviewed-by: Sudeep Holla Acked-by: lihuisong@huawei.com Cc: 6.2+ # 6.2+ [ rjw: Changelog edits ] Link: https://patch.msgid.link/20251210132634.2050033-1-zhangpengjie2@huawei.com Signed-off-by: Rafael J. Wysocki --- drivers/acpi/acpi_pcc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/acpi/acpi_pcc.c b/drivers/acpi/acpi_pcc.c index 97064e943768a..e3f302b9dee5f 100644 --- a/drivers/acpi/acpi_pcc.c +++ b/drivers/acpi/acpi_pcc.c @@ -52,7 +52,7 @@ acpi_pcc_address_space_setup(acpi_handle region_handle, u32 function, struct pcc_data *data; struct acpi_pcc_info *ctx = handler_context; struct pcc_mbox_chan *pcc_chan; - static acpi_status ret; + acpi_status ret; data = kzalloc(sizeof(*data), GFP_KERNEL); if (!data) From ec86421e119e99fd096a909052a4b2224440e1ed Mon Sep 17 00:00:00 2001 From: Ard Biesheuvel Date: Tue, 9 Dec 2025 06:48:49 +0100 Subject: [PATCH 0140/1321] arm64/simd: Avoid pointless clearing of FP/SIMD buffer The buffer provided to kernel_neon_begin() is only used if the task is scheduled out while the FP/SIMD is in use by the kernel, or when such a section is interrupted by a softirq that also uses the FP/SIMD. IOW, this happens rarely, and even if it happened often, there is still no reason for this buffer to be cleared beforehand, which happens unconditionally, due to the use of a compound literal expression. So define that buffer variable explicitly, and mark it as __uninitialized so that it will not get cleared, even when -ftrivial-auto-var-init is in effect. This requires some preprocessor gymnastics, due to the fact that the variable must be defined throughout the entire guarded scope, and the expression ({ struct user_fpsimd_state __uninitialized st; &st; }) is problematic in that regard, even though the compilers seem to permit it. So instead, repeat the 'for ()' trick that is also used in the implementation of the guarded scope helpers. Cc: Will Deacon Cc: Catalin Marinas Cc: Kees Cook Cc: Eric Biggers Signed-off-by: Ard Biesheuvel Fixes: 4fa617cc6851 ("arm64/fpsimd: Allocate kernel mode FP/SIMD buffers on the stack") Link: https://lore.kernel.org/r/20251209054848.998878-2-ardb@kernel.org Signed-off-by: Eric Biggers --- arch/arm64/include/asm/simd.h | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/arch/arm64/include/asm/simd.h b/arch/arm64/include/asm/simd.h index 0941f6f58a146..69ecbd69ca8cc 100644 --- a/arch/arm64/include/asm/simd.h +++ b/arch/arm64/include/asm/simd.h @@ -48,6 +48,13 @@ DEFINE_LOCK_GUARD_1(ksimd, kernel_neon_begin(_T->lock), kernel_neon_end(_T->lock)) -#define scoped_ksimd() scoped_guard(ksimd, &(struct user_fpsimd_state){}) +#define __scoped_ksimd(_label) \ + for (struct user_fpsimd_state __uninitialized __st; \ + true; ({ goto _label; })) \ + if (0) { \ +_label: break; \ + } else scoped_guard(ksimd, &__st) + +#define scoped_ksimd() __scoped_ksimd(__UNIQUE_ID(label)) #endif From d4774a7899273709ad35896e3acab6c7e607d61b Mon Sep 17 00:00:00 2001 From: Charles Mirabile Date: Fri, 12 Dec 2025 13:47:17 -0500 Subject: [PATCH 0141/1321] lib/crypto: riscv: Add poly1305-core.S to .gitignore poly1305-core.S is an auto-generated file, so it should be ignored. Fixes: bef9c7559869 ("lib/crypto: riscv/poly1305: Import OpenSSL/CRYPTOGAMS implementation") Cc: stable@vger.kernel.org Signed-off-by: Charles Mirabile Link: https://lore.kernel.org/r/20251212184717.133701-1-cmirabil@redhat.com Signed-off-by: Eric Biggers --- lib/crypto/riscv/.gitignore | 2 ++ 1 file changed, 2 insertions(+) create mode 100644 lib/crypto/riscv/.gitignore diff --git a/lib/crypto/riscv/.gitignore b/lib/crypto/riscv/.gitignore new file mode 100644 index 0000000000000..0d47d4f21c6de --- /dev/null +++ b/lib/crypto/riscv/.gitignore @@ -0,0 +1,2 @@ +# SPDX-License-Identifier: GPL-2.0-only +poly1305-core.S From 798c583448ace306e7feb5d61cfeefc84b298a22 Mon Sep 17 00:00:00 2001 From: Lukas Bulwahn Date: Wed, 5 Nov 2025 10:24:28 +0100 Subject: [PATCH 0142/1321] MAINTAINERS: add tracepoint core-api doc files to TRACING The files in Documentation/core-api/ are by virtue of their top-level directory part of the Documentation section in MAINTAINERS. Each file in Documentation/core-api/ should however also have a further section in MAINTAINERS it belongs to, which fits to the technical area of the documented API in that file. The tracepoint.rst provides some explanation to tracepoints defined in selected files under include/trace/events/, which itself is part of the TRACING section. So, add this core-api document to TRACING. Cc: Mathieu Desnoyers Link: https://patch.msgid.link/20251105092428.153378-1-lukas.bulwahn@redhat.com Signed-off-by: Lukas Bulwahn Acked-by: Masami Hiramatsu (Google) Signed-off-by: Steven Rostedt (Google) --- MAINTAINERS | 1 + 1 file changed, 1 insertion(+) diff --git a/MAINTAINERS b/MAINTAINERS index 454b8ed119e9e..dc731d37c8fee 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -26463,6 +26463,7 @@ L: linux-trace-kernel@vger.kernel.org S: Maintained Q: https://patchwork.kernel.org/project/linux-trace-kernel/list/ T: git git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git +F: Documentation/core-api/tracepoint.rst F: Documentation/trace/* F: fs/tracefs/ F: include/linux/trace*.h From 41457b9f70d8ef75552a6540ed31a9e059b2cede Mon Sep 17 00:00:00 2001 From: Steven Rostedt Date: Tue, 16 Dec 2025 18:24:40 -0500 Subject: [PATCH 0143/1321] tracing: Do not register unsupported perf events Synthetic events currently do not have a function to register perf events. This leads to calling the tracepoint register functions with a NULL function pointer which triggers: ------------[ cut here ]------------ WARNING: kernel/tracepoint.c:175 at tracepoint_add_func+0x357/0x370, CPU#2: perf/2272 Modules linked in: kvm_intel kvm irqbypass CPU: 2 UID: 0 PID: 2272 Comm: perf Not tainted 6.18.0-ftest-11964-ge022764176fc-dirty #323 PREEMPTLAZY Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.17.0-debian-1.17.0-1 04/01/2014 RIP: 0010:tracepoint_add_func+0x357/0x370 Code: 28 9c e8 4c 0b f5 ff eb 0f 4c 89 f7 48 c7 c6 80 4d 28 9c e8 ab 89 f4 ff 31 c0 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc cc <0f> 0b 49 c7 c6 ea ff ff ff e9 ee fe ff ff 0f 0b e9 f9 fe ff ff 0f RSP: 0018:ffffabc0c44d3c40 EFLAGS: 00010246 RAX: 0000000000000001 RBX: ffff9380aa9e4060 RCX: 0000000000000000 RDX: 000000000000000a RSI: ffffffff9e1d4a98 RDI: ffff937fcf5fd6c8 RBP: 0000000000000001 R08: 0000000000000007 R09: ffff937fcf5fc780 R10: 0000000000000003 R11: ffffffff9c193910 R12: 000000000000000a R13: ffffffff9e1e5888 R14: 0000000000000000 R15: ffffabc0c44d3c78 FS: 00007f6202f5f340(0000) GS:ffff93819f00f000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055d3162281a8 CR3: 0000000106a56003 CR4: 0000000000172ef0 Call Trace: tracepoint_probe_register+0x5d/0x90 synth_event_reg+0x3c/0x60 perf_trace_event_init+0x204/0x340 perf_trace_init+0x85/0xd0 perf_tp_event_init+0x2e/0x50 perf_try_init_event+0x6f/0x230 ? perf_event_alloc+0x4bb/0xdc0 perf_event_alloc+0x65a/0xdc0 __se_sys_perf_event_open+0x290/0x9f0 do_syscall_64+0x93/0x7b0 ? entry_SYSCALL_64_after_hwframe+0x76/0x7e ? trace_hardirqs_off+0x53/0xc0 entry_SYSCALL_64_after_hwframe+0x76/0x7e Instead, have the code return -ENODEV, which doesn't warn and has perf error out with: # perf record -e synthetic:futex_wait Error: The sys_perf_event_open() syscall returned with 19 (No such device) for event (synthetic:futex_wait). "dmesg | grep -i perf" may provide additional information. Ideally perf should support synthetic events, but for now just fix the warning. The support can come later. Cc: stable@vger.kernel.org Cc: Masami Hiramatsu Cc: Mathieu Desnoyers Cc: Arnaldo Carvalho de Melo Cc: Jiri Olsa Cc: Namhyung Kim Link: https://patch.msgid.link/20251216182440.147e4453@gandalf.local.home Fixes: 4b147936fa509 ("tracing: Add support for 'synthetic' events") Reported-by: Ian Rogers Signed-off-by: Steven Rostedt (Google) --- kernel/trace/trace_events.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c index b16a5a1580401..76067529db61b 100644 --- a/kernel/trace/trace_events.c +++ b/kernel/trace/trace_events.c @@ -700,6 +700,8 @@ int trace_event_reg(struct trace_event_call *call, #ifdef CONFIG_PERF_EVENTS case TRACE_REG_PERF_REGISTER: + if (!call->class->perf_probe) + return -ENODEV; return tracepoint_probe_register(call->tp, call->class->perf_probe, call); From 5702b20fad8ef24f04db435cb056f310da9c18e6 Mon Sep 17 00:00:00 2001 From: "Darrick J. Wong" Date: Tue, 16 Dec 2025 09:49:50 -0800 Subject: [PATCH 0144/1321] tracing: Fix UBSAN warning in __remove_instance() xfs/558 triggers the following UBSAN warning: ------------[ cut here ]------------ UBSAN: shift-out-of-bounds in kernel/trace/trace.c:10510:10 shift exponent 32 is too large for 32-bit type 'int' CPU: 1 UID: 0 PID: 888674 Comm: rmdir Not tainted 6.19.0-rc1-xfsx #rc1 PREEMPT(lazy) dbf607ef4c142c563f76d706e71af9731d7b9c90 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-4.module+el8.8.0+21164+ed375313 04/01/2014 Call Trace: dump_stack_lvl+0x4a/0x70 ubsan_epilogue+0x5/0x2b __ubsan_handle_shift_out_of_bounds.cold+0x5e/0x113 __remove_instance.part.0.constprop.0.cold+0x18/0x26f instance_rmdir+0xf3/0x110 tracefs_syscall_rmdir+0x4d/0x90 vfs_rmdir+0x139/0x230 do_rmdir+0x143/0x230 __x64_sys_rmdir+0x1d/0x20 do_syscall_64+0x44/0x230 entry_SYSCALL_64_after_hwframe+0x4b/0x53 RIP: 0033:0x7f7ae8e51f17 Code: f0 ff ff 73 01 c3 48 8b 0d de 2e 0e 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 b8 54 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 01 c3 48 8b 15 b1 2e 0e 00 f7 d8 64 89 02 b8 RSP: 002b:00007ffd90743f08 EFLAGS: 00000246 ORIG_RAX: 0000000000000054 RAX: ffffffffffffffda RBX: 00007ffd907440f8 RCX: 00007f7ae8e51f17 RDX: 00007f7ae8f3c5c0 RSI: 00007ffd90744a21 RDI: 00007ffd90744a21 RBP: 0000000000000002 R08: 0000000000000000 R09: 0000000000000000 R10: 00007f7ae8f35ac0 R11: 0000000000000246 R12: 00007ffd90744a21 R13: 0000000000000001 R14: 00007f7ae8f8b000 R15: 000055e5283e6a98 ---[ end trace ]--- whilst tearing down an ftrace instance. TRACE_FLAGS_MAX_SIZE is now 64bit, so the mask comparison expression must be typecast to a u64 value to avoid an overflow. AFAICT, ZEROED_TRACE_FLAGS is already cast to ULL so this is ok. Link: https://patch.msgid.link/20251216174950.GA7705@frogsfrogsfrogs Fixes: bbec8e28cac592 ("tracing: Allow tracer to add more than 32 options") Signed-off-by: "Darrick J. Wong" Signed-off-by: Steven Rostedt (Google) --- kernel/trace/trace.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index e575956ef9b5a..6f2148df14d96 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -10507,7 +10507,7 @@ static int __remove_instance(struct trace_array *tr) /* Disable all the flags that were enabled coming in */ for (i = 0; i < TRACE_FLAGS_MAX_SIZE; i++) { - if ((1 << i) & ZEROED_TRACE_FLAGS) + if ((1ULL << i) & ZEROED_TRACE_FLAGS) set_tracer_flag(tr, 1ULL << i, 0); } From 1399ceec9974da800f5897a4ef3a1cc435ab161e Mon Sep 17 00:00:00 2001 From: Menglong Dong Date: Wed, 17 Dec 2025 11:00:53 +0800 Subject: [PATCH 0145/1321] ftrace: Fix address for jmp mode in t_show() The address from ftrace_find_rec_direct() is printed directly in t_show(). This can mislead symbol offsets if it has the "jmp" bit in the last bit. Fix this by printing the address that returned by ftrace_jmp_get(). Link: https://patch.msgid.link/20251217030053.80343-1-dongml2@chinatelecom.cn Fixes: 25e4e3565d45 ("ftrace: Introduce FTRACE_OPS_FL_JMP") Signed-off-by: Menglong Dong Signed-off-by: Steven Rostedt (Google) --- kernel/trace/ftrace.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c index 3ec2033c07743..ef2d5dca6f70c 100644 --- a/kernel/trace/ftrace.c +++ b/kernel/trace/ftrace.c @@ -4518,8 +4518,11 @@ static int t_show(struct seq_file *m, void *v) unsigned long direct; direct = ftrace_find_rec_direct(rec->ip); - if (direct) - seq_printf(m, "\n\tdirect-->%pS", (void *)direct); + if (direct) { + seq_printf(m, "\n\tdirect%s-->%pS", + ftrace_is_jmp(direct) ? "(jmp)" : "", + (void *)ftrace_jmp_get(direct)); + } } } From cd6bd3aa04d5cdb0803e9c8e77f1cf567474d4d8 Mon Sep 17 00:00:00 2001 From: huang-jl Date: Wed, 17 Dec 2025 14:26:32 +0800 Subject: [PATCH 0146/1321] io_uring: fix nr_segs calculation in io_import_kbuf io_import_kbuf() calculates nr_segs incorrectly when iov_offset is non-zero after iov_iter_advance(). It doesn't account for the partial consumption of the first bvec. The problem comes when meet the following conditions: 1. Use UBLK_F_AUTO_BUF_REG feature of ublk. 2. The kernel will help to register the buffer, into the io uring. 3. Later, the ublk server try to send IO request using the registered buffer in the io uring, to read/write to fuse-based filesystem, with O_DIRECT. >From a userspace perspective, the ublk server thread is blocked in the kernel, and will see "soft lockup" in the kernel dmesg. When ublk registers a buffer with mixed-size bvecs like [4K]*6 + [12K] and a request partially consumes a bvec, the next request's nr_segs calculation uses bvec->bv_len instead of (bv_len - iov_offset). This causes fuse_get_user_pages() to loop forever because nr_segs indicates fewer pages than actually needed. Specifically, the infinite loop happens at: fuse_get_user_pages() -> iov_iter_extract_pages() -> iov_iter_extract_bvec_pages() Since the nr_segs is miscalculated, the iov_iter_extract_bvec_pages returns when finding that i->nr_segs is zero. Then iov_iter_extract_pages returns zero. However, fuse_get_user_pages does still not get enough data/pages, causing infinite loop. Example: - Bvecs: [4K, 4K, 4K, 4K, 4K, 4K, 12K, ...] - Request 1: 32K at offset 0, uses 6*4K + 8K of the 12K bvec - Request 2: 32K at offset 32K - iov_offset = 8K (8K already consumed from 12K bvec) - Bug: calculates using 12K, not (12K - 8K) = 4K - Result: nr_segs too small, infinite loop in fuse_get_user_pages. Fix by accounting for iov_offset when calculating the first segment's available length. Fixes: b419bed4f0a6 ("io_uring/rsrc: ensure segments counts are correct on kbuf buffers") Signed-off-by: huang-jl Reviewed-by: Ming Lei Signed-off-by: Jens Axboe --- io_uring/rsrc.c | 1 + 1 file changed, 1 insertion(+) diff --git a/io_uring/rsrc.c b/io_uring/rsrc.c index a63474b331bf8..41c89f5c616da 100644 --- a/io_uring/rsrc.c +++ b/io_uring/rsrc.c @@ -1059,6 +1059,7 @@ static int io_import_kbuf(int ddir, struct iov_iter *iter, if (count < imu->len) { const struct bio_vec *bvec = iter->bvec; + len += iter->iov_offset; while (len > bvec->bv_len) { len -= bvec->bv_len; bvec++; From 21bc6959b1b39adae76d7701e419ce052a0b92ec Mon Sep 17 00:00:00 2001 From: Pavel Begunkov Date: Sun, 23 Nov 2025 22:51:23 +0000 Subject: [PATCH 0147/1321] block: move around bio flagging helpers We'll need bio_flagged() earlier in bio.h for later patches, move it together with all related helpers, and mark the bio_flagged()'s bio argument as const. Signed-off-by: Pavel Begunkov Reviewed-by: Christoph Hellwig Signed-off-by: Jens Axboe --- include/linux/bio.h | 30 +++++++++++++++--------------- 1 file changed, 15 insertions(+), 15 deletions(-) diff --git a/include/linux/bio.h b/include/linux/bio.h index ad2d57908c1c0..c75a9b3672aa4 100644 --- a/include/linux/bio.h +++ b/include/linux/bio.h @@ -46,6 +46,21 @@ static inline unsigned int bio_max_segs(unsigned int nr_segs) #define bio_data_dir(bio) \ (op_is_write(bio_op(bio)) ? WRITE : READ) +static inline bool bio_flagged(const struct bio *bio, unsigned int bit) +{ + return bio->bi_flags & (1U << bit); +} + +static inline void bio_set_flag(struct bio *bio, unsigned int bit) +{ + bio->bi_flags |= (1U << bit); +} + +static inline void bio_clear_flag(struct bio *bio, unsigned int bit) +{ + bio->bi_flags &= ~(1U << bit); +} + /* * Check whether this bio carries any data or not. A NULL bio is allowed. */ @@ -225,21 +240,6 @@ static inline void bio_cnt_set(struct bio *bio, unsigned int count) atomic_set(&bio->__bi_cnt, count); } -static inline bool bio_flagged(struct bio *bio, unsigned int bit) -{ - return bio->bi_flags & (1U << bit); -} - -static inline void bio_set_flag(struct bio *bio, unsigned int bit) -{ - bio->bi_flags |= (1U << bit); -} - -static inline void bio_clear_flag(struct bio *bio, unsigned int bit) -{ - bio->bi_flags &= ~(1U << bit); -} - static inline struct bio_vec *bio_first_bvec_all(struct bio *bio) { WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED)); From 540191cf0cc66c560dde29baa393893bd52dbf3c Mon Sep 17 00:00:00 2001 From: Ming Lei Date: Fri, 12 Dec 2025 10:16:59 -0700 Subject: [PATCH 0148/1321] selftests: ublk: fix overflow in ublk_queue_auto_zc_fallback() The functions ublk_queue_use_zc(), ublk_queue_use_auto_zc(), and ublk_queue_auto_zc_fallback() were returning int, but performing bitwise AND on q->flags which is __u64. When a flag bit is set in the upper 32 bits (beyond INT_MAX), the result of the bitwise AND operation could overflow when cast to int, leading to incorrect boolean evaluation. For example, if UBLKS_Q_AUTO_BUF_REG_FALLBACK is 0x8000000000000000: - (u64)flags & 0x8000000000000000 = 0x8000000000000000 - Cast to int: undefined behavior / incorrect value - Used in if(): may evaluate incorrectly Fix by: 1. Changing return type from int to bool for semantic correctness 2. Using !! to explicitly convert to boolean (0 or 1) This ensures the functions return proper boolean values regardless of which bit position the flags occupy in the 64-bit field. Fixes: c3a6d48f86da ("selftests: ublk: remove ublk queue self-defined flags") Signed-off-by: Ming Lei Signed-off-by: Jens Axboe --- tools/testing/selftests/ublk/kublk.h | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/tools/testing/selftests/ublk/kublk.h b/tools/testing/selftests/ublk/kublk.h index fe42705c6d42d..6e8f381f34810 100644 --- a/tools/testing/selftests/ublk/kublk.h +++ b/tools/testing/selftests/ublk/kublk.h @@ -390,19 +390,19 @@ static inline int ublk_completed_tgt_io(struct ublk_thread *t, return --io->tgt_ios == 0; } -static inline int ublk_queue_use_zc(const struct ublk_queue *q) +static inline bool ublk_queue_use_zc(const struct ublk_queue *q) { - return q->flags & UBLK_F_SUPPORT_ZERO_COPY; + return !!(q->flags & UBLK_F_SUPPORT_ZERO_COPY); } -static inline int ublk_queue_use_auto_zc(const struct ublk_queue *q) +static inline bool ublk_queue_use_auto_zc(const struct ublk_queue *q) { - return q->flags & UBLK_F_AUTO_BUF_REG; + return !!(q->flags & UBLK_F_AUTO_BUF_REG); } -static inline int ublk_queue_auto_zc_fallback(const struct ublk_queue *q) +static inline bool ublk_queue_auto_zc_fallback(const struct ublk_queue *q) { - return q->flags & UBLKS_Q_AUTO_BUF_REG_FALLBACK; + return !!(q->flags & UBLKS_Q_AUTO_BUF_REG_FALLBACK); } static inline int ublk_queue_no_buf(const struct ublk_queue *q) From 90678d623a14e1e822a3f5c236718907b26a8108 Mon Sep 17 00:00:00 2001 From: Caleb Sander Mateos Date: Fri, 12 Dec 2025 10:17:00 -0700 Subject: [PATCH 0149/1321] selftests: ublk: correct last_rw map type in seq_io.bt The last_rw map is initialized with a value of 0 but later assigned the value args.sector + args.nr_sector, which has type sector_t = u64. bpftrace complains about the type mismatch between int64 and uint64: trace/seq_io.bt:18:3-59: ERROR: Type mismatch for @last_rw: trying to assign value of type 'uint64' when map already contains a value of type 'int64' @last_rw[$dev, str($2)] = (args.sector + args.nr_sector); Cast the initial value to uint64 so bpftrace will load the program. Signed-off-by: Caleb Sander Mateos Reviewed-by: Ming Lei Signed-off-by: Jens Axboe --- tools/testing/selftests/ublk/trace/seq_io.bt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/testing/selftests/ublk/trace/seq_io.bt b/tools/testing/selftests/ublk/trace/seq_io.bt index 272ac54c9d5fa..507a3ca05abfc 100644 --- a/tools/testing/selftests/ublk/trace/seq_io.bt +++ b/tools/testing/selftests/ublk/trace/seq_io.bt @@ -4,7 +4,7 @@ $3: strlen($2) */ BEGIN { - @last_rw[$1, str($2)] = 0; + @last_rw[$1, str($2)] = (uint64)0; } tracepoint:block:block_rq_complete { From 1634bf7ec37d8d8ce74507144cca9bdc79fa3cd1 Mon Sep 17 00:00:00 2001 From: Caleb Sander Mateos Date: Fri, 12 Dec 2025 10:17:01 -0700 Subject: [PATCH 0150/1321] selftests: ublk: remove unused ios map in seq_io.bt The ios map populated by seq_io.bt is never read, so remove it. Signed-off-by: Caleb Sander Mateos Reviewed-by: Ming Lei Signed-off-by: Jens Axboe --- tools/testing/selftests/ublk/trace/seq_io.bt | 1 - 1 file changed, 1 deletion(-) diff --git a/tools/testing/selftests/ublk/trace/seq_io.bt b/tools/testing/selftests/ublk/trace/seq_io.bt index 507a3ca05abfc..b2f60a92b118b 100644 --- a/tools/testing/selftests/ublk/trace/seq_io.bt +++ b/tools/testing/selftests/ublk/trace/seq_io.bt @@ -17,7 +17,6 @@ tracepoint:block:block_rq_complete } @last_rw[$dev, str($2)] = (args.sector + args.nr_sector); } - @ios = count(); } END { From b6d84624551696f0d36d1d95e55df2b7a8c79335 Mon Sep 17 00:00:00 2001 From: Caleb Sander Mateos Date: Fri, 12 Dec 2025 10:17:02 -0700 Subject: [PATCH 0151/1321] selftests: ublk: fix fio arguments in run_io_and_recover() run_io_and_recover() invokes fio with --size="${size}", but the variable size doesn't exist. Thus, the argument expands to --size=, which causes fio to exit immediately with an error without issuing any I/O. Pass the value for size as the first argument to the function. Signed-off-by: Caleb Sander Mateos Reviewed-by: Ming Lei Signed-off-by: Jens Axboe --- tools/testing/selftests/ublk/test_common.sh | 5 +++-- tools/testing/selftests/ublk/test_generic_04.sh | 2 +- tools/testing/selftests/ublk/test_generic_05.sh | 2 +- tools/testing/selftests/ublk/test_generic_11.sh | 2 +- 4 files changed, 6 insertions(+), 5 deletions(-) diff --git a/tools/testing/selftests/ublk/test_common.sh b/tools/testing/selftests/ublk/test_common.sh index 8a4dbd09feb0a..6f1c042de40e7 100755 --- a/tools/testing/selftests/ublk/test_common.sh +++ b/tools/testing/selftests/ublk/test_common.sh @@ -333,11 +333,12 @@ run_io_and_kill_daemon() run_io_and_recover() { - local action=$1 + local size=$1 + local action=$2 local state local dev_id - shift 1 + shift 2 dev_id=$(_add_ublk_dev "$@") _check_add_dev "$TID" $? diff --git a/tools/testing/selftests/ublk/test_generic_04.sh b/tools/testing/selftests/ublk/test_generic_04.sh index 8b533217d4a17..baf5b156193de 100755 --- a/tools/testing/selftests/ublk/test_generic_04.sh +++ b/tools/testing/selftests/ublk/test_generic_04.sh @@ -8,7 +8,7 @@ ERR_CODE=0 ublk_run_recover_test() { - run_io_and_recover "kill_daemon" "$@" + run_io_and_recover 256M "kill_daemon" "$@" ERR_CODE=$? if [ ${ERR_CODE} -ne 0 ]; then echo "$TID failure: $*" diff --git a/tools/testing/selftests/ublk/test_generic_05.sh b/tools/testing/selftests/ublk/test_generic_05.sh index 398e9e2b58e15..7b5083afc02ab 100755 --- a/tools/testing/selftests/ublk/test_generic_05.sh +++ b/tools/testing/selftests/ublk/test_generic_05.sh @@ -8,7 +8,7 @@ ERR_CODE=0 ublk_run_recover_test() { - run_io_and_recover "kill_daemon" "$@" + run_io_and_recover 256M "kill_daemon" "$@" ERR_CODE=$? if [ ${ERR_CODE} -ne 0 ]; then echo "$TID failure: $*" diff --git a/tools/testing/selftests/ublk/test_generic_11.sh b/tools/testing/selftests/ublk/test_generic_11.sh index a00357a5ec6b8..d1f973c8c6459 100755 --- a/tools/testing/selftests/ublk/test_generic_11.sh +++ b/tools/testing/selftests/ublk/test_generic_11.sh @@ -8,7 +8,7 @@ ERR_CODE=0 ublk_run_quiesce_recover() { - run_io_and_recover "quiesce_dev" "$@" + run_io_and_recover 256M "quiesce_dev" "$@" ERR_CODE=$? if [ ${ERR_CODE} -ne 0 ]; then echo "$TID failure: $*" From fdf130ad0c982ccdd0ae90d1f3b2204c2e074b15 Mon Sep 17 00:00:00 2001 From: Caleb Sander Mateos Date: Fri, 12 Dec 2025 10:17:03 -0700 Subject: [PATCH 0152/1321] selftests: ublk: use auto_zc for PER_IO_DAEMON tests in stress_04 stress_04 is described as "run IO and kill ublk server(zero copy)" but the --per_io_tasks tests cases don't use zero copy. Plus, one of the test cases is duplicated. Add --auto_zc to these test cases and --auto_zc_fallback to one of the duplicated ones. This matches the test cases in stress_03. Signed-off-by: Caleb Sander Mateos Reviewed-by: Ming Lei Signed-off-by: Jens Axboe --- tools/testing/selftests/ublk/test_stress_04.sh | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/tools/testing/selftests/ublk/test_stress_04.sh b/tools/testing/selftests/ublk/test_stress_04.sh index 3f901db4d09dc..c0c926ce05391 100755 --- a/tools/testing/selftests/ublk/test_stress_04.sh +++ b/tools/testing/selftests/ublk/test_stress_04.sh @@ -40,10 +40,10 @@ if _have_feature "AUTO_BUF_REG"; then fi if _have_feature "PER_IO_DAEMON"; then - ublk_io_and_kill_daemon 8G -t null -q 4 --nthreads 8 --per_io_tasks & - ublk_io_and_kill_daemon 256M -t loop -q 4 --nthreads 8 --per_io_tasks "${UBLK_BACKFILES[0]}" & - ublk_io_and_kill_daemon 256M -t stripe -q 4 --nthreads 8 --per_io_tasks "${UBLK_BACKFILES[1]}" "${UBLK_BACKFILES[2]}" & - ublk_io_and_kill_daemon 8G -t null -q 4 --nthreads 8 --per_io_tasks & + ublk_io_and_kill_daemon 8G -t null -q 4 --auto_zc --nthreads 8 --per_io_tasks & + ublk_io_and_kill_daemon 256M -t loop -q 4 --auto_zc --nthreads 8 --per_io_tasks "${UBLK_BACKFILES[0]}" & + ublk_io_and_kill_daemon 256M -t stripe -q 4 --auto_zc --nthreads 8 --per_io_tasks "${UBLK_BACKFILES[1]}" "${UBLK_BACKFILES[2]}" & + ublk_io_and_kill_daemon 8G -t null -q 4 -z --auto_zc --auto_zc_fallback --nthreads 8 --per_io_tasks & fi wait From 2ea483b3da1681b80b72d56acde1112edb4af3c8 Mon Sep 17 00:00:00 2001 From: Caleb Sander Mateos Date: Fri, 12 Dec 2025 10:17:04 -0700 Subject: [PATCH 0153/1321] selftests: ublk: don't share backing files between ublk servers stress_04 is missing a wait between blocks of tests, meaning multiple ublk servers will be running in parallel using the same backing files. Add a wait after each section to ensure each backing file is in use by a single ublk server at a time. Signed-off-by: Caleb Sander Mateos Reviewed-by: Ming Lei Signed-off-by: Jens Axboe --- tools/testing/selftests/ublk/test_stress_04.sh | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/tools/testing/selftests/ublk/test_stress_04.sh b/tools/testing/selftests/ublk/test_stress_04.sh index c0c926ce05391..efa8dc33234b5 100755 --- a/tools/testing/selftests/ublk/test_stress_04.sh +++ b/tools/testing/selftests/ublk/test_stress_04.sh @@ -31,12 +31,14 @@ _create_backfile 2 128M ublk_io_and_kill_daemon 8G -t null -q 4 -z --no_ublk_fixed_fd & ublk_io_and_kill_daemon 256M -t loop -q 4 -z --no_ublk_fixed_fd "${UBLK_BACKFILES[0]}" & ublk_io_and_kill_daemon 256M -t stripe -q 4 -z "${UBLK_BACKFILES[1]}" "${UBLK_BACKFILES[2]}" & +wait if _have_feature "AUTO_BUF_REG"; then ublk_io_and_kill_daemon 8G -t null -q 4 --auto_zc & ublk_io_and_kill_daemon 256M -t loop -q 4 --auto_zc "${UBLK_BACKFILES[0]}" & ublk_io_and_kill_daemon 256M -t stripe -q 4 --auto_zc --no_ublk_fixed_fd "${UBLK_BACKFILES[1]}" "${UBLK_BACKFILES[2]}" & ublk_io_and_kill_daemon 8G -t null -q 4 -z --auto_zc --auto_zc_fallback & + wait fi if _have_feature "PER_IO_DAEMON"; then @@ -44,8 +46,8 @@ if _have_feature "PER_IO_DAEMON"; then ublk_io_and_kill_daemon 256M -t loop -q 4 --auto_zc --nthreads 8 --per_io_tasks "${UBLK_BACKFILES[0]}" & ublk_io_and_kill_daemon 256M -t stripe -q 4 --auto_zc --nthreads 8 --per_io_tasks "${UBLK_BACKFILES[1]}" "${UBLK_BACKFILES[2]}" & ublk_io_and_kill_daemon 8G -t null -q 4 -z --auto_zc --auto_zc_fallback --nthreads 8 --per_io_tasks & + wait fi -wait _cleanup_test "stress" _show_result $TID $ERR_CODE From 47d814502540c1fe16051700350230ca4399bcf9 Mon Sep 17 00:00:00 2001 From: Caleb Sander Mateos Date: Fri, 12 Dec 2025 10:17:05 -0700 Subject: [PATCH 0154/1321] selftests: ublk: forbid multiple data copy modes The kublk mock ublk server allows multiple data copy mode arguments to be passed on the command line (--zero_copy, --get_data, and --auto_zc). The ublk device will be created with all the requested feature flags, however kublk will only use one of the modes to interact with request data (arbitrarily preferring auto_zc over zero_copy over get_data). To clarify the intent of the test, don't allow multiple data copy modes to be specified. --zero_copy and --auto_zc are allowed together for --auto_zc_fallback, which uses both copy modes. Don't set UBLK_F_USER_COPY for zero_copy, as it's a separate feature. Fix the test cases in test_stress_05 passing --get_data along with --zero_copy or --auto_zc. Signed-off-by: Caleb Sander Mateos Reviewed-by: Ming Lei Signed-off-by: Jens Axboe --- tools/testing/selftests/ublk/kublk.c | 11 ++++++++++- tools/testing/selftests/ublk/test_stress_05.sh | 10 +++++----- 2 files changed, 15 insertions(+), 6 deletions(-) diff --git a/tools/testing/selftests/ublk/kublk.c b/tools/testing/selftests/ublk/kublk.c index f8fa102a627fd..4dd02cb083baa 100644 --- a/tools/testing/selftests/ublk/kublk.c +++ b/tools/testing/selftests/ublk/kublk.c @@ -1613,7 +1613,7 @@ int main(int argc, char *argv[]) ctx.queue_depth = strtol(optarg, NULL, 10); break; case 'z': - ctx.flags |= UBLK_F_SUPPORT_ZERO_COPY | UBLK_F_USER_COPY; + ctx.flags |= UBLK_F_SUPPORT_ZERO_COPY; break; case 'r': value = strtol(optarg, NULL, 10); @@ -1686,6 +1686,15 @@ int main(int argc, char *argv[]) return -EINVAL; } + if (!!(ctx.flags & UBLK_F_NEED_GET_DATA) + + !!(ctx.flags & UBLK_F_USER_COPY) + + (ctx.flags & UBLK_F_SUPPORT_ZERO_COPY && !ctx.auto_zc_fallback) + + (ctx.flags & UBLK_F_AUTO_BUF_REG && !ctx.auto_zc_fallback) + + ctx.auto_zc_fallback > 1) { + fprintf(stderr, "too many data copy modes specified\n"); + return -EINVAL; + } + i = optind; while (i < argc && ctx.nr_files < MAX_BACK_FILES) { ctx.files[ctx.nr_files++] = argv[i++]; diff --git a/tools/testing/selftests/ublk/test_stress_05.sh b/tools/testing/selftests/ublk/test_stress_05.sh index 274295061042e..68a1941443025 100755 --- a/tools/testing/selftests/ublk/test_stress_05.sh +++ b/tools/testing/selftests/ublk/test_stress_05.sh @@ -58,17 +58,17 @@ done if _have_feature "ZERO_COPY"; then for reissue in $(seq 0 1); do - ublk_io_and_remove 8G -t null -q 4 -g -z -r 1 -i "$reissue" & - ublk_io_and_remove 256M -t loop -q 4 -g -z -r 1 -i "$reissue" "${UBLK_BACKFILES[1]}" & + ublk_io_and_remove 8G -t null -q 4 -z -r 1 -i "$reissue" & + ublk_io_and_remove 256M -t loop -q 4 -z -r 1 -i "$reissue" "${UBLK_BACKFILES[1]}" & wait done fi if _have_feature "AUTO_BUF_REG"; then for reissue in $(seq 0 1); do - ublk_io_and_remove 8G -t null -q 4 -g --auto_zc -r 1 -i "$reissue" & - ublk_io_and_remove 256M -t loop -q 4 -g --auto_zc -r 1 -i "$reissue" "${UBLK_BACKFILES[1]}" & - ublk_io_and_remove 8G -t null -q 4 -g -z --auto_zc --auto_zc_fallback -r 1 -i "$reissue" & + ublk_io_and_remove 8G -t null -q 4 --auto_zc -r 1 -i "$reissue" & + ublk_io_and_remove 256M -t loop -q 4 --auto_zc -r 1 -i "$reissue" "${UBLK_BACKFILES[1]}" & + ublk_io_and_remove 8G -t null -q 4 -z --auto_zc --auto_zc_fallback -r 1 -i "$reissue" & wait done fi From 5f1b9bbc74fff0e0d3c1f25975ad824481c2238d Mon Sep 17 00:00:00 2001 From: Caleb Sander Mateos Date: Fri, 12 Dec 2025 10:17:06 -0700 Subject: [PATCH 0155/1321] selftests: ublk: add support for user copy to kublk The ublk selftests mock ublk server kublk supports every data copy mode except user copy. Add support for user copy to kublk, enabled via the --user_copy (-u) command line argument. On writes, issue pread() calls to copy the write data into the ublk_io's buffer before dispatching the write to the target implementation. On reads, issue pwrite() calls to copy read data from the ublk_io's buffer before committing the request. Copy in 2 KB chunks to provide some coverage of the offseting logic. Signed-off-by: Caleb Sander Mateos Reviewed-by: Ming Lei Signed-off-by: Jens Axboe --- tools/testing/selftests/ublk/file_backed.c | 7 +-- tools/testing/selftests/ublk/kublk.c | 53 ++++++++++++++++++++-- tools/testing/selftests/ublk/kublk.h | 11 +++++ tools/testing/selftests/ublk/stripe.c | 2 +- 4 files changed, 64 insertions(+), 9 deletions(-) diff --git a/tools/testing/selftests/ublk/file_backed.c b/tools/testing/selftests/ublk/file_backed.c index cd9fe69ecce20..269d5f124e06a 100644 --- a/tools/testing/selftests/ublk/file_backed.c +++ b/tools/testing/selftests/ublk/file_backed.c @@ -34,8 +34,9 @@ static int loop_queue_tgt_rw_io(struct ublk_thread *t, struct ublk_queue *q, unsigned zc = ublk_queue_use_zc(q); unsigned auto_zc = ublk_queue_use_auto_zc(q); enum io_uring_op op = ublk_to_uring_op(iod, zc | auto_zc); + struct ublk_io *io = ublk_get_io(q, tag); struct io_uring_sqe *sqe[3]; - void *addr = (zc | auto_zc) ? NULL : (void *)iod->addr; + void *addr = io->buf_addr; if (!zc || auto_zc) { ublk_io_alloc_sqes(t, sqe, 1); @@ -56,7 +57,7 @@ static int loop_queue_tgt_rw_io(struct ublk_thread *t, struct ublk_queue *q, ublk_io_alloc_sqes(t, sqe, 3); - io_uring_prep_buf_register(sqe[0], q, tag, q->q_id, ublk_get_io(q, tag)->buf_index); + io_uring_prep_buf_register(sqe[0], q, tag, q->q_id, io->buf_index); sqe[0]->flags |= IOSQE_CQE_SKIP_SUCCESS | IOSQE_IO_HARDLINK; sqe[0]->user_data = build_user_data(tag, ublk_cmd_op_nr(sqe[0]->cmd_op), 0, q->q_id, 1); @@ -68,7 +69,7 @@ static int loop_queue_tgt_rw_io(struct ublk_thread *t, struct ublk_queue *q, sqe[1]->flags |= IOSQE_FIXED_FILE | IOSQE_IO_HARDLINK; sqe[1]->user_data = build_user_data(tag, ublk_op, 0, q->q_id, 1); - io_uring_prep_buf_unregister(sqe[2], q, tag, q->q_id, ublk_get_io(q, tag)->buf_index); + io_uring_prep_buf_unregister(sqe[2], q, tag, q->q_id, io->buf_index); sqe[2]->user_data = build_user_data(tag, ublk_cmd_op_nr(sqe[2]->cmd_op), 0, q->q_id, 1); return 2; diff --git a/tools/testing/selftests/ublk/kublk.c b/tools/testing/selftests/ublk/kublk.c index 4dd02cb083baa..185ba553686ab 100644 --- a/tools/testing/selftests/ublk/kublk.c +++ b/tools/testing/selftests/ublk/kublk.c @@ -596,6 +596,38 @@ static void ublk_set_auto_buf_reg(const struct ublk_queue *q, sqe->addr = ublk_auto_buf_reg_to_sqe_addr(&buf); } +/* Copy in pieces to test the buffer offset logic */ +#define UBLK_USER_COPY_LEN 2048 + +static void ublk_user_copy(const struct ublk_io *io, __u8 match_ublk_op) +{ + const struct ublk_queue *q = ublk_io_to_queue(io); + const struct ublksrv_io_desc *iod = ublk_get_iod(q, io->tag); + __u64 off = ublk_user_copy_offset(q->q_id, io->tag); + __u8 ublk_op = ublksrv_get_op(iod); + __u32 len = iod->nr_sectors << 9; + void *addr = io->buf_addr; + + if (ublk_op != match_ublk_op) + return; + + while (len) { + __u32 copy_len = min(len, UBLK_USER_COPY_LEN); + ssize_t copied; + + if (ublk_op == UBLK_IO_OP_WRITE) + copied = pread(q->ublk_fd, addr, copy_len, off); + else if (ublk_op == UBLK_IO_OP_READ) + copied = pwrite(q->ublk_fd, addr, copy_len, off); + else + assert(0); + assert(copied == (ssize_t)copy_len); + addr += copy_len; + off += copy_len; + len -= copy_len; + } +} + int ublk_queue_io_cmd(struct ublk_thread *t, struct ublk_io *io) { struct ublk_queue *q = ublk_io_to_queue(io); @@ -618,9 +650,12 @@ int ublk_queue_io_cmd(struct ublk_thread *t, struct ublk_io *io) if (io->flags & UBLKS_IO_NEED_GET_DATA) cmd_op = UBLK_U_IO_NEED_GET_DATA; - else if (io->flags & UBLKS_IO_NEED_COMMIT_RQ_COMP) + else if (io->flags & UBLKS_IO_NEED_COMMIT_RQ_COMP) { + if (ublk_queue_use_user_copy(q)) + ublk_user_copy(io, UBLK_IO_OP_READ); + cmd_op = UBLK_U_IO_COMMIT_AND_FETCH_REQ; - else if (io->flags & UBLKS_IO_NEED_FETCH_RQ) + } else if (io->flags & UBLKS_IO_NEED_FETCH_RQ) cmd_op = UBLK_U_IO_FETCH_REQ; if (io_uring_sq_space_left(&t->ring) < 1) @@ -649,7 +684,7 @@ int ublk_queue_io_cmd(struct ublk_thread *t, struct ublk_io *io) sqe[0]->rw_flags = 0; cmd->tag = io->tag; cmd->q_id = q->q_id; - if (!ublk_queue_no_buf(q)) + if (!ublk_queue_no_buf(q) && !ublk_queue_use_user_copy(q)) cmd->addr = (__u64) (uintptr_t) io->buf_addr; else cmd->addr = 0; @@ -751,6 +786,10 @@ static void ublk_handle_uring_cmd(struct ublk_thread *t, if (cqe->res == UBLK_IO_RES_OK) { assert(tag < q->q_depth); + + if (ublk_queue_use_user_copy(q)) + ublk_user_copy(io, UBLK_IO_OP_WRITE); + if (q->tgt_ops->queue_io) q->tgt_ops->queue_io(t, q, tag); } else if (cqe->res == UBLK_IO_RES_NEED_GET_DATA) { @@ -1507,7 +1546,7 @@ static void __cmd_create_help(char *exe, bool recovery) printf("%s %s -t [null|loop|stripe|fault_inject] [-q nr_queues] [-d depth] [-n dev_id]\n", exe, recovery ? "recover" : "add"); - printf("\t[--foreground] [--quiet] [-z] [--auto_zc] [--auto_zc_fallback] [--debug_mask mask] [-r 0|1 ] [-g]\n"); + printf("\t[--foreground] [--quiet] [-z] [--auto_zc] [--auto_zc_fallback] [--debug_mask mask] [-r 0|1] [-g] [-u]\n"); printf("\t[-e 0|1 ] [-i 0|1] [--no_ublk_fixed_fd]\n"); printf("\t[--nthreads threads] [--per_io_tasks]\n"); printf("\t[target options] [backfile1] [backfile2] ...\n"); @@ -1568,6 +1607,7 @@ int main(int argc, char *argv[]) { "get_data", 1, NULL, 'g'}, { "auto_zc", 0, NULL, 0 }, { "auto_zc_fallback", 0, NULL, 0 }, + { "user_copy", 0, NULL, 'u'}, { "size", 1, NULL, 's'}, { "nthreads", 1, NULL, 0 }, { "per_io_tasks", 0, NULL, 0 }, @@ -1593,7 +1633,7 @@ int main(int argc, char *argv[]) opterr = 0; optind = 2; - while ((opt = getopt_long(argc, argv, "t:n:d:q:r:e:i:s:gaz", + while ((opt = getopt_long(argc, argv, "t:n:d:q:r:e:i:s:gazu", longopts, &option_idx)) != -1) { switch (opt) { case 'a': @@ -1633,6 +1673,9 @@ int main(int argc, char *argv[]) case 'g': ctx.flags |= UBLK_F_NEED_GET_DATA; break; + case 'u': + ctx.flags |= UBLK_F_USER_COPY; + break; case 's': ctx.size = strtoull(optarg, NULL, 10); break; diff --git a/tools/testing/selftests/ublk/kublk.h b/tools/testing/selftests/ublk/kublk.h index 6e8f381f34810..8a83b90ec603a 100644 --- a/tools/testing/selftests/ublk/kublk.h +++ b/tools/testing/selftests/ublk/kublk.h @@ -208,6 +208,12 @@ static inline int ublk_io_auto_zc_fallback(const struct ublksrv_io_desc *iod) return !!(iod->op_flags & UBLK_IO_F_NEED_REG_BUF); } +static inline __u64 ublk_user_copy_offset(unsigned q_id, unsigned tag) +{ + return UBLKSRV_IO_BUF_OFFSET + + ((__u64)q_id << UBLK_QID_OFF | (__u64)tag << UBLK_TAG_OFF); +} + static inline int is_target_io(__u64 user_data) { return (user_data & (1ULL << 63)) != 0; @@ -405,6 +411,11 @@ static inline bool ublk_queue_auto_zc_fallback(const struct ublk_queue *q) return !!(q->flags & UBLKS_Q_AUTO_BUF_REG_FALLBACK); } +static inline bool ublk_queue_use_user_copy(const struct ublk_queue *q) +{ + return !!(q->flags & UBLK_F_USER_COPY); +} + static inline int ublk_queue_no_buf(const struct ublk_queue *q) { return ublk_queue_use_zc(q) || ublk_queue_use_auto_zc(q); diff --git a/tools/testing/selftests/ublk/stripe.c b/tools/testing/selftests/ublk/stripe.c index 791fa8dc16510..fd412e1f01c0e 100644 --- a/tools/testing/selftests/ublk/stripe.c +++ b/tools/testing/selftests/ublk/stripe.c @@ -134,7 +134,7 @@ static int stripe_queue_tgt_rw_io(struct ublk_thread *t, struct ublk_queue *q, struct stripe_array *s = alloc_stripe_array(conf, iod); struct ublk_io *io = ublk_get_io(q, tag); int i, extra = zc ? 2 : 0; - void *base = (zc | auto_zc) ? NULL : (void *)iod->addr; + void *base = io->buf_addr; io->private_data = s; calculate_stripe_array(conf, iod, s, base); From 04a5685ddea807c033a712f7b447fa6b7ce460cf Mon Sep 17 00:00:00 2001 From: Caleb Sander Mateos Date: Fri, 12 Dec 2025 10:17:07 -0700 Subject: [PATCH 0156/1321] selftests: ublk: add user copy test cases The ublk selftests cover every data copy mode except user copy. Add tests for user copy based on the existing test suite: - generic_14 ("basic recover function verification (user copy)") based on generic_04 and generic_05 - null_03 ("basic IO test with user copy") based on null_01 and null_02 - loop_06 ("write and verify over user copy") based on loop_01 and loop_03 - loop_07 ("mkfs & mount & umount with user copy") based on loop_02 and loop_04 - stripe_05 ("write and verify test on user copy") based on stripe_03 - stripe_06 ("mkfs & mount & umount on user copy") based on stripe_02 and stripe_04 - stress_06 ("run IO and remove device (user copy)") based on stress_01 and stress_03 - stress_07 ("run IO and kill ublk server (user copy)") based on stress_02 and stress_04 Signed-off-by: Caleb Sander Mateos Reviewed-by: Ming Lei Signed-off-by: Jens Axboe --- tools/testing/selftests/ublk/Makefile | 8 ++++ .../testing/selftests/ublk/test_generic_14.sh | 40 +++++++++++++++++++ tools/testing/selftests/ublk/test_loop_06.sh | 25 ++++++++++++ tools/testing/selftests/ublk/test_loop_07.sh | 21 ++++++++++ tools/testing/selftests/ublk/test_null_03.sh | 24 +++++++++++ .../testing/selftests/ublk/test_stress_06.sh | 39 ++++++++++++++++++ .../testing/selftests/ublk/test_stress_07.sh | 39 ++++++++++++++++++ .../testing/selftests/ublk/test_stripe_05.sh | 26 ++++++++++++ .../testing/selftests/ublk/test_stripe_06.sh | 21 ++++++++++ 9 files changed, 243 insertions(+) create mode 100755 tools/testing/selftests/ublk/test_generic_14.sh create mode 100755 tools/testing/selftests/ublk/test_loop_06.sh create mode 100755 tools/testing/selftests/ublk/test_loop_07.sh create mode 100755 tools/testing/selftests/ublk/test_null_03.sh create mode 100755 tools/testing/selftests/ublk/test_stress_06.sh create mode 100755 tools/testing/selftests/ublk/test_stress_07.sh create mode 100755 tools/testing/selftests/ublk/test_stripe_05.sh create mode 100755 tools/testing/selftests/ublk/test_stripe_06.sh diff --git a/tools/testing/selftests/ublk/Makefile b/tools/testing/selftests/ublk/Makefile index 770269efe42ab..837977b624171 100644 --- a/tools/testing/selftests/ublk/Makefile +++ b/tools/testing/selftests/ublk/Makefile @@ -21,24 +21,32 @@ TEST_PROGS += test_generic_10.sh TEST_PROGS += test_generic_11.sh TEST_PROGS += test_generic_12.sh TEST_PROGS += test_generic_13.sh +TEST_PROGS += test_generic_14.sh TEST_PROGS += test_null_01.sh TEST_PROGS += test_null_02.sh +TEST_PROGS += test_null_03.sh TEST_PROGS += test_loop_01.sh TEST_PROGS += test_loop_02.sh TEST_PROGS += test_loop_03.sh TEST_PROGS += test_loop_04.sh TEST_PROGS += test_loop_05.sh +TEST_PROGS += test_loop_06.sh +TEST_PROGS += test_loop_07.sh TEST_PROGS += test_stripe_01.sh TEST_PROGS += test_stripe_02.sh TEST_PROGS += test_stripe_03.sh TEST_PROGS += test_stripe_04.sh +TEST_PROGS += test_stripe_05.sh +TEST_PROGS += test_stripe_06.sh TEST_PROGS += test_stress_01.sh TEST_PROGS += test_stress_02.sh TEST_PROGS += test_stress_03.sh TEST_PROGS += test_stress_04.sh TEST_PROGS += test_stress_05.sh +TEST_PROGS += test_stress_06.sh +TEST_PROGS += test_stress_07.sh TEST_GEN_PROGS_EXTENDED = kublk diff --git a/tools/testing/selftests/ublk/test_generic_14.sh b/tools/testing/selftests/ublk/test_generic_14.sh new file mode 100755 index 0000000000000..cd9b44b97c24e --- /dev/null +++ b/tools/testing/selftests/ublk/test_generic_14.sh @@ -0,0 +1,40 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 + +. "$(cd "$(dirname "$0")" && pwd)"/test_common.sh + +TID="generic_14" +ERR_CODE=0 + +ublk_run_recover_test() +{ + run_io_and_recover 256M "kill_daemon" "$@" + ERR_CODE=$? + if [ ${ERR_CODE} -ne 0 ]; then + echo "$TID failure: $*" + _show_result $TID $ERR_CODE + fi +} + +if ! _have_program fio; then + exit "$UBLK_SKIP_CODE" +fi + +_prep_test "recover" "basic recover function verification (user copy)" + +_create_backfile 0 256M +_create_backfile 1 128M +_create_backfile 2 128M + +ublk_run_recover_test -t null -q 2 -r 1 -u & +ublk_run_recover_test -t loop -q 2 -r 1 -u "${UBLK_BACKFILES[0]}" & +ublk_run_recover_test -t stripe -q 2 -r 1 -u "${UBLK_BACKFILES[1]}" "${UBLK_BACKFILES[2]}" & +wait + +ublk_run_recover_test -t null -q 2 -r 1 -u -i 1 & +ublk_run_recover_test -t loop -q 2 -r 1 -u -i 1 "${UBLK_BACKFILES[0]}" & +ublk_run_recover_test -t stripe -q 2 -r 1 -u -i 1 "${UBLK_BACKFILES[1]}" "${UBLK_BACKFILES[2]}" & +wait + +_cleanup_test "recover" +_show_result $TID $ERR_CODE diff --git a/tools/testing/selftests/ublk/test_loop_06.sh b/tools/testing/selftests/ublk/test_loop_06.sh new file mode 100755 index 0000000000000..1d1a8a7255023 --- /dev/null +++ b/tools/testing/selftests/ublk/test_loop_06.sh @@ -0,0 +1,25 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 + +. "$(cd "$(dirname "$0")" && pwd)"/test_common.sh + +TID="loop_06" +ERR_CODE=0 + +if ! _have_program fio; then + exit "$UBLK_SKIP_CODE" +fi + +_prep_test "loop" "write and verify over user copy" + +_create_backfile 0 256M +dev_id=$(_add_ublk_dev -t loop -u "${UBLK_BACKFILES[0]}") +_check_add_dev $TID $? + +# run fio over the ublk disk +_run_fio_verify_io --filename=/dev/ublkb"${dev_id}" --size=256M +ERR_CODE=$? + +_cleanup_test "loop" + +_show_result $TID $ERR_CODE diff --git a/tools/testing/selftests/ublk/test_loop_07.sh b/tools/testing/selftests/ublk/test_loop_07.sh new file mode 100755 index 0000000000000..493f3fb611a5a --- /dev/null +++ b/tools/testing/selftests/ublk/test_loop_07.sh @@ -0,0 +1,21 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 + +. "$(cd "$(dirname "$0")" && pwd)"/test_common.sh + +TID="loop_07" +ERR_CODE=0 + +_prep_test "loop" "mkfs & mount & umount with user copy" + +_create_backfile 0 256M + +dev_id=$(_add_ublk_dev -t loop -u "${UBLK_BACKFILES[0]}") +_check_add_dev $TID $? + +_mkfs_mount_test /dev/ublkb"${dev_id}" +ERR_CODE=$? + +_cleanup_test "loop" + +_show_result $TID $ERR_CODE diff --git a/tools/testing/selftests/ublk/test_null_03.sh b/tools/testing/selftests/ublk/test_null_03.sh new file mode 100755 index 0000000000000..0051067b46869 --- /dev/null +++ b/tools/testing/selftests/ublk/test_null_03.sh @@ -0,0 +1,24 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 + +. "$(cd "$(dirname "$0")" && pwd)"/test_common.sh + +TID="null_03" +ERR_CODE=0 + +if ! _have_program fio; then + exit "$UBLK_SKIP_CODE" +fi + +_prep_test "null" "basic IO test with user copy" + +dev_id=$(_add_ublk_dev -t null -u) +_check_add_dev $TID $? + +# run fio over the two disks +fio --name=job1 --filename=/dev/ublkb"${dev_id}" --ioengine=libaio --rw=readwrite --iodepth=32 --size=256M > /dev/null 2>&1 +ERR_CODE=$? + +_cleanup_test "null" + +_show_result $TID $ERR_CODE diff --git a/tools/testing/selftests/ublk/test_stress_06.sh b/tools/testing/selftests/ublk/test_stress_06.sh new file mode 100755 index 0000000000000..37188ec2e1f70 --- /dev/null +++ b/tools/testing/selftests/ublk/test_stress_06.sh @@ -0,0 +1,39 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 + +. "$(cd "$(dirname "$0")" && pwd)"/test_common.sh +TID="stress_06" +ERR_CODE=0 + +ublk_io_and_remove() +{ + run_io_and_remove "$@" + ERR_CODE=$? + if [ ${ERR_CODE} -ne 0 ]; then + echo "$TID failure: $*" + _show_result $TID $ERR_CODE + fi +} + +if ! _have_program fio; then + exit "$UBLK_SKIP_CODE" +fi + +_prep_test "stress" "run IO and remove device (user copy)" + +_create_backfile 0 256M +_create_backfile 1 128M +_create_backfile 2 128M + +ublk_io_and_remove 8G -t null -q 4 -u & +ublk_io_and_remove 256M -t loop -q 4 -u "${UBLK_BACKFILES[0]}" & +ublk_io_and_remove 256M -t stripe -q 4 -u "${UBLK_BACKFILES[1]}" "${UBLK_BACKFILES[2]}" & +wait + +ublk_io_and_remove 8G -t null -q 4 -u --nthreads 8 --per_io_tasks & +ublk_io_and_remove 256M -t loop -q 4 -u --nthreads 8 --per_io_tasks "${UBLK_BACKFILES[0]}" & +ublk_io_and_remove 256M -t stripe -q 4 -u --nthreads 8 --per_io_tasks "${UBLK_BACKFILES[1]}" "${UBLK_BACKFILES[2]}" & +wait + +_cleanup_test "stress" +_show_result $TID $ERR_CODE diff --git a/tools/testing/selftests/ublk/test_stress_07.sh b/tools/testing/selftests/ublk/test_stress_07.sh new file mode 100755 index 0000000000000..fb061fc26d362 --- /dev/null +++ b/tools/testing/selftests/ublk/test_stress_07.sh @@ -0,0 +1,39 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 + +. "$(cd "$(dirname "$0")" && pwd)"/test_common.sh +TID="stress_07" +ERR_CODE=0 + +ublk_io_and_kill_daemon() +{ + run_io_and_kill_daemon "$@" + ERR_CODE=$? + if [ ${ERR_CODE} -ne 0 ]; then + echo "$TID failure: $*" + _show_result $TID $ERR_CODE + fi +} + +if ! _have_program fio; then + exit "$UBLK_SKIP_CODE" +fi + +_prep_test "stress" "run IO and kill ublk server (user copy)" + +_create_backfile 0 256M +_create_backfile 1 128M +_create_backfile 2 128M + +ublk_io_and_kill_daemon 8G -t null -q 4 -u --no_ublk_fixed_fd & +ublk_io_and_kill_daemon 256M -t loop -q 4 -u --no_ublk_fixed_fd "${UBLK_BACKFILES[0]}" & +ublk_io_and_kill_daemon 256M -t stripe -q 4 -u "${UBLK_BACKFILES[1]}" "${UBLK_BACKFILES[2]}" & +wait + +ublk_io_and_kill_daemon 8G -t null -q 4 -u --nthreads 8 --per_io_tasks & +ublk_io_and_kill_daemon 256M -t loop -q 4 -u --nthreads 8 --per_io_tasks "${UBLK_BACKFILES[0]}" & +ublk_io_and_kill_daemon 256M -t stripe -q 4 -u --nthreads 8 --per_io_tasks "${UBLK_BACKFILES[1]}" "${UBLK_BACKFILES[2]}" & +wait + +_cleanup_test "stress" +_show_result $TID $ERR_CODE diff --git a/tools/testing/selftests/ublk/test_stripe_05.sh b/tools/testing/selftests/ublk/test_stripe_05.sh new file mode 100755 index 0000000000000..05d71951d710c --- /dev/null +++ b/tools/testing/selftests/ublk/test_stripe_05.sh @@ -0,0 +1,26 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 + +. "$(cd "$(dirname "$0")" && pwd)"/test_common.sh + +TID="stripe_05" +ERR_CODE=0 + +if ! _have_program fio; then + exit "$UBLK_SKIP_CODE" +fi + +_prep_test "stripe" "write and verify test on user copy" + +_create_backfile 0 256M +_create_backfile 1 256M + +dev_id=$(_add_ublk_dev -t stripe -q 2 -u "${UBLK_BACKFILES[0]}" "${UBLK_BACKFILES[1]}") +_check_add_dev $TID $? + +# run fio over the ublk disk +_run_fio_verify_io --filename=/dev/ublkb"${dev_id}" --size=512M +ERR_CODE=$? + +_cleanup_test "stripe" +_show_result $TID $ERR_CODE diff --git a/tools/testing/selftests/ublk/test_stripe_06.sh b/tools/testing/selftests/ublk/test_stripe_06.sh new file mode 100755 index 0000000000000..d06cac7626e21 --- /dev/null +++ b/tools/testing/selftests/ublk/test_stripe_06.sh @@ -0,0 +1,21 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 + +. "$(cd "$(dirname "$0")" && pwd)"/test_common.sh + +TID="stripe_06" +ERR_CODE=0 + +_prep_test "stripe" "mkfs & mount & umount on user copy" + +_create_backfile 0 256M +_create_backfile 1 256M + +dev_id=$(_add_ublk_dev -t stripe -u -q 2 "${UBLK_BACKFILES[0]}" "${UBLK_BACKFILES[1]}") +_check_add_dev $TID $? + +_mkfs_mount_test /dev/ublkb"${dev_id}" +ERR_CODE=$? + +_cleanup_test "stripe" +_show_result $TID $ERR_CODE From 8c62b0827ad34406747c3d600b948e1c81b66dfc Mon Sep 17 00:00:00 2001 From: Ming Lei Date: Fri, 12 Dec 2025 22:35:00 +0800 Subject: [PATCH 0157/1321] block: fix race between wbt_enable_default and IO submission When wbt_enable_default() is moved out of queue freezing in elevator_change(), it can cause the wbt inflight counter to become negative (-1), leading to hung tasks in the writeback path. Tasks get stuck in wbt_wait() because the counter is in an inconsistent state. The issue occurs because wbt_enable_default() could race with IO submission, allowing the counter to be decremented before proper initialization. This manifests as: rq_wait[0]: inflight: -1 has_waiters: True rwb_enabled() checks the state, which can be updated exactly between wbt_wait() (rq_qos_throttle()) and wbt_track()(rq_qos_track()), then the inflight counter will become negative. And results in hung task warnings like: task:kworker/u24:39 state:D stack:0 pid:14767 Call Trace: rq_qos_wait+0xb4/0x150 wbt_wait+0xa9/0x100 __rq_qos_throttle+0x24/0x40 blk_mq_submit_bio+0x672/0x7b0 ... Fix this by: 1. Splitting wbt_enable_default() into: - __wbt_enable_default(): Returns true if wbt_init() should be called - wbt_enable_default(): Wrapper for existing callers (no init) - wbt_init_enable_default(): New function that checks and inits WBT 2. Using wbt_init_enable_default() in blk_register_queue() to ensure proper initialization during queue registration 3. Move wbt_init() out of wbt_enable_default() which is only for enabling disabled wbt from bfq and iocost, and wbt_init() isn't needed. Then the original lock warning can be avoided. 4. Removing the ELEVATOR_FLAG_ENABLE_WBT_ON_EXIT flag and its handling code since it's no longer needed This ensures WBT is properly initialized before any IO can be submitted, preventing the counter from going negative. Cc: Nilay Shroff Cc: Yu Kuai Cc: Guangwu Zhang Fixes: 78c271344b6f ("block: move wbt_enable_default() out of queue freezing from sched ->exit()") Signed-off-by: Ming Lei Reviewed-by: Nilay Shroff Signed-off-by: Jens Axboe --- block/bfq-iosched.c | 2 +- block/blk-sysfs.c | 2 +- block/blk-wbt.c | 20 ++++++++++++++++---- block/blk-wbt.h | 5 +++++ block/elevator.c | 4 ---- block/elevator.h | 1 - 6 files changed, 23 insertions(+), 11 deletions(-) diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c index 4a8d3d96bfe49..6e54b1d3d8bc2 100644 --- a/block/bfq-iosched.c +++ b/block/bfq-iosched.c @@ -7181,7 +7181,7 @@ static void bfq_exit_queue(struct elevator_queue *e) blk_stat_disable_accounting(bfqd->queue); blk_queue_flag_clear(QUEUE_FLAG_DISABLE_WBT_DEF, bfqd->queue); - set_bit(ELEVATOR_FLAG_ENABLE_WBT_ON_EXIT, &e->flags); + wbt_enable_default(bfqd->queue->disk); kfree(bfqd); } diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c index 8684c57498cc1..e0a70d26972b3 100644 --- a/block/blk-sysfs.c +++ b/block/blk-sysfs.c @@ -932,7 +932,7 @@ int blk_register_queue(struct gendisk *disk) elevator_set_default(q); blk_queue_flag_set(QUEUE_FLAG_REGISTERED, q); - wbt_enable_default(disk); + wbt_init_enable_default(disk); /* Now everything is ready and send out KOBJ_ADD uevent */ kobject_uevent(&disk->queue_kobj, KOBJ_ADD); diff --git a/block/blk-wbt.c b/block/blk-wbt.c index eb8037bae0bda..0974875f77bda 100644 --- a/block/blk-wbt.c +++ b/block/blk-wbt.c @@ -699,7 +699,7 @@ static void wbt_requeue(struct rq_qos *rqos, struct request *rq) /* * Enable wbt if defaults are configured that way */ -void wbt_enable_default(struct gendisk *disk) +static bool __wbt_enable_default(struct gendisk *disk) { struct request_queue *q = disk->queue; struct rq_qos *rqos; @@ -716,19 +716,31 @@ void wbt_enable_default(struct gendisk *disk) if (enable && RQWB(rqos)->enable_state == WBT_STATE_OFF_DEFAULT) RQWB(rqos)->enable_state = WBT_STATE_ON_DEFAULT; mutex_unlock(&disk->rqos_state_mutex); - return; + return false; } mutex_unlock(&disk->rqos_state_mutex); /* Queue not registered? Maybe shutting down... */ if (!blk_queue_registered(q)) - return; + return false; if (queue_is_mq(q) && enable) - wbt_init(disk); + return true; + return false; +} + +void wbt_enable_default(struct gendisk *disk) +{ + __wbt_enable_default(disk); } EXPORT_SYMBOL_GPL(wbt_enable_default); +void wbt_init_enable_default(struct gendisk *disk) +{ + if (__wbt_enable_default(disk)) + WARN_ON_ONCE(wbt_init(disk)); +} + u64 wbt_default_latency_nsec(struct request_queue *q) { /* diff --git a/block/blk-wbt.h b/block/blk-wbt.h index e5fc653b9b76f..925f224757383 100644 --- a/block/blk-wbt.h +++ b/block/blk-wbt.h @@ -5,6 +5,7 @@ #ifdef CONFIG_BLK_WBT int wbt_init(struct gendisk *disk); +void wbt_init_enable_default(struct gendisk *disk); void wbt_disable_default(struct gendisk *disk); void wbt_enable_default(struct gendisk *disk); @@ -16,6 +17,10 @@ u64 wbt_default_latency_nsec(struct request_queue *); #else +static inline void wbt_init_enable_default(struct gendisk *disk) +{ +} + static inline void wbt_disable_default(struct gendisk *disk) { } diff --git a/block/elevator.c b/block/elevator.c index 5b37ef44f52d7..a2f8b2251dc6e 100644 --- a/block/elevator.c +++ b/block/elevator.c @@ -633,14 +633,10 @@ static int elevator_change_done(struct request_queue *q, .et = ctx->old->et, .data = ctx->old->elevator_data }; - bool enable_wbt = test_bit(ELEVATOR_FLAG_ENABLE_WBT_ON_EXIT, - &ctx->old->flags); elv_unregister_queue(q, ctx->old); blk_mq_free_sched_res(&res, ctx->old->type, q->tag_set); kobject_put(&ctx->old->kobj); - if (enable_wbt) - wbt_enable_default(q->disk); } if (ctx->new) { ret = elv_register_queue(q, ctx->new, !ctx->no_uevent); diff --git a/block/elevator.h b/block/elevator.h index a9d092c5a9e85..3eb32516be0b1 100644 --- a/block/elevator.h +++ b/block/elevator.h @@ -156,7 +156,6 @@ struct elevator_queue #define ELEVATOR_FLAG_REGISTERED 0 #define ELEVATOR_FLAG_DYING 1 -#define ELEVATOR_FLAG_ENABLE_WBT_ON_EXIT 2 /* * block elevator interface From 29e5b401a89b0db4b56114207456fd8ab154076a Mon Sep 17 00:00:00 2001 From: Yongpeng Yang Date: Mon, 15 Dec 2025 23:21:04 +0800 Subject: [PATCH 0158/1321] loop: use READ_ONCE() to read lo->lo_state without locking When lo->lo_mutex is not held, direct access may read stale data. This patch uses READ_ONCE() to read lo->lo_state and data_race() to silence code checkers, and changes all assignments to use WRITE_ONCE(). Reviewed-by: Damien Le Moal Signed-off-by: Yongpeng Yang Reviewed-by: Christoph Hellwig Signed-off-by: Jens Axboe --- drivers/block/loop.c | 22 +++++++++++++--------- 1 file changed, 13 insertions(+), 9 deletions(-) diff --git a/drivers/block/loop.c b/drivers/block/loop.c index 272bc608e5282..32a3a5b138029 100644 --- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -1082,7 +1082,7 @@ static int loop_configure(struct loop_device *lo, blk_mode_t mode, /* Order wrt reading lo_state in loop_validate_file(). */ wmb(); - lo->lo_state = Lo_bound; + WRITE_ONCE(lo->lo_state, Lo_bound); if (part_shift) lo->lo_flags |= LO_FLAGS_PARTSCAN; partscan = lo->lo_flags & LO_FLAGS_PARTSCAN; @@ -1179,7 +1179,7 @@ static void __loop_clr_fd(struct loop_device *lo) if (!part_shift) set_bit(GD_SUPPRESS_PART_SCAN, &lo->lo_disk->state); mutex_lock(&lo->lo_mutex); - lo->lo_state = Lo_unbound; + WRITE_ONCE(lo->lo_state, Lo_unbound); mutex_unlock(&lo->lo_mutex); /* @@ -1218,7 +1218,7 @@ static int loop_clr_fd(struct loop_device *lo) lo->lo_flags |= LO_FLAGS_AUTOCLEAR; if (disk_openers(lo->lo_disk) == 1) - lo->lo_state = Lo_rundown; + WRITE_ONCE(lo->lo_state, Lo_rundown); loop_global_unlock(lo, true); return 0; @@ -1743,7 +1743,7 @@ static void lo_release(struct gendisk *disk) mutex_lock(&lo->lo_mutex); if (lo->lo_state == Lo_bound && (lo->lo_flags & LO_FLAGS_AUTOCLEAR)) - lo->lo_state = Lo_rundown; + WRITE_ONCE(lo->lo_state, Lo_rundown); need_clear = (lo->lo_state == Lo_rundown); mutex_unlock(&lo->lo_mutex); @@ -1858,7 +1858,7 @@ static blk_status_t loop_queue_rq(struct blk_mq_hw_ctx *hctx, blk_mq_start_request(rq); - if (lo->lo_state != Lo_bound) + if (data_race(READ_ONCE(lo->lo_state)) != Lo_bound) return BLK_STS_IOERR; switch (req_op(rq)) { @@ -2016,7 +2016,7 @@ static int loop_add(int i) lo->worker_tree = RB_ROOT; INIT_LIST_HEAD(&lo->idle_worker_list); timer_setup(&lo->timer, loop_free_idle_workers_timer, TIMER_DEFERRABLE); - lo->lo_state = Lo_unbound; + WRITE_ONCE(lo->lo_state, Lo_unbound); err = mutex_lock_killable(&loop_ctl_mutex); if (err) @@ -2174,7 +2174,7 @@ static int loop_control_remove(int idx) goto mark_visible; } /* Mark this loop device as no more bound, but not quite unbound yet */ - lo->lo_state = Lo_deleting; + WRITE_ONCE(lo->lo_state, Lo_deleting); mutex_unlock(&lo->lo_mutex); loop_remove(lo); @@ -2197,8 +2197,12 @@ static int loop_control_get_free(int idx) if (ret) return ret; idr_for_each_entry(&loop_index_idr, lo, id) { - /* Hitting a race results in creating a new loop device which is harmless. */ - if (lo->idr_visible && data_race(lo->lo_state) == Lo_unbound) + /* + * Hitting a race results in creating a new loop device + * which is harmless. + */ + if (lo->idr_visible && + data_race(READ_ONCE(lo->lo_state)) == Lo_unbound) goto found; } mutex_unlock(&loop_ctl_mutex); From d904fc1b5b4d4fdc2be573a4971cad7d99fff04a Mon Sep 17 00:00:00 2001 From: Yongpeng Yang Date: Mon, 15 Dec 2025 23:21:06 +0800 Subject: [PATCH 0159/1321] zloop: use READ_ONCE() to read lo->lo_state in queue_rq path In the queue_rq path, zlo->state is accessed without locking, and direct access may read stale data. This patch uses READ_ONCE() to read zlo->state and data_race() to silence code checkers, and changes all assignments to use WRITE_ONCE(). Reviewed-by: Damien Le Moal Reviewed-by: Christoph Hellwig Signed-off-by: Yongpeng Yang Signed-off-by: Jens Axboe --- drivers/block/zloop.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/block/zloop.c b/drivers/block/zloop.c index 77bd6081b2445..8e334f5025fc0 100644 --- a/drivers/block/zloop.c +++ b/drivers/block/zloop.c @@ -697,7 +697,7 @@ static blk_status_t zloop_queue_rq(struct blk_mq_hw_ctx *hctx, struct zloop_cmd *cmd = blk_mq_rq_to_pdu(rq); struct zloop_device *zlo = rq->q->queuedata; - if (zlo->state == Zlo_deleting) + if (data_race(READ_ONCE(zlo->state)) == Zlo_deleting) return BLK_STS_IOERR; /* @@ -1002,7 +1002,7 @@ static int zloop_ctl_add(struct zloop_options *opts) ret = -ENOMEM; goto out; } - zlo->state = Zlo_creating; + WRITE_ONCE(zlo->state, Zlo_creating); ret = mutex_lock_killable(&zloop_ctl_mutex); if (ret) @@ -1113,7 +1113,7 @@ static int zloop_ctl_add(struct zloop_options *opts) } mutex_lock(&zloop_ctl_mutex); - zlo->state = Zlo_live; + WRITE_ONCE(zlo->state, Zlo_live); mutex_unlock(&zloop_ctl_mutex); pr_info("zloop: device %d, %u zones of %llu MiB, %u B block size\n", @@ -1177,7 +1177,7 @@ static int zloop_ctl_remove(struct zloop_options *opts) ret = -EINVAL; } else { idr_remove(&zloop_index_idr, zlo->id); - zlo->state = Zlo_deleting; + WRITE_ONCE(zlo->state, Zlo_deleting); } mutex_unlock(&zloop_ctl_mutex); From 4066a783b59bd550629d4b27449cd52f02a33c25 Mon Sep 17 00:00:00 2001 From: Yongpeng Yang Date: Mon, 15 Dec 2025 17:58:17 +0800 Subject: [PATCH 0160/1321] Documentation: admin-guide: blockdev: replace zone_capacity with zone_capacity_mb when creating devices The "zone_capacity=%umb" option is no longer used. The effective option is now "zone_capacity_mb=%u", so update the documentation accordingly. Signed-off-by: Yongpeng Yang Reviewed-by: Damien Le Moal Signed-off-by: Jens Axboe --- Documentation/admin-guide/blockdev/zoned_loop.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/admin-guide/blockdev/zoned_loop.rst b/Documentation/admin-guide/blockdev/zoned_loop.rst index 806adde664dbf..6aa865424ac38 100644 --- a/Documentation/admin-guide/blockdev/zoned_loop.rst +++ b/Documentation/admin-guide/blockdev/zoned_loop.rst @@ -134,7 +134,7 @@ MB and a zone capacity of 63 MB:: $ modprobe zloop $ mkdir -p /var/local/zloop/0 - $ echo "add capacity_mb=2048,zone_size_mb=64,zone_capacity=63MB" > /dev/zloop-control + $ echo "add capacity_mb=2048,zone_size_mb=64,zone_capacity_mb=63" > /dev/zloop-control For the device created (/dev/zloop0), the zone backing files are all created under the default base directory (/var/local/zloop):: From 9b80ea168950ccbc4bd4f70ca0b0bf171f0abf56 Mon Sep 17 00:00:00 2001 From: Deepanshu Kartikey Date: Wed, 17 Dec 2025 07:17:12 +0530 Subject: [PATCH 0161/1321] block: add allocation size check in blkdev_pr_read_keys() blkdev_pr_read_keys() takes num_keys from userspace and uses it to calculate the allocation size for keys_info via struct_size(). While there is a check for SIZE_MAX (integer overflow), there is no upper bound validation on the allocation size itself. A malicious or buggy userspace can pass a large num_keys value that doesn't trigger overflow but still results in an excessive allocation attempt, causing a warning in the page allocator when the order exceeds MAX_PAGE_ORDER. Fix this by introducing PR_KEYS_MAX to limit the number of keys to a sane value. This makes the SIZE_MAX check redundant, so remove it. Also switch to kvzalloc/kvfree to handle larger allocations gracefully. Fixes: 22a1ffea5f80 ("block: add IOC_PR_READ_KEYS ioctl") Tested-by: syzbot+660d079d90f8a1baf54d@syzkaller.appspotmail.com Reported-by: syzbot+660d079d90f8a1baf54d@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=660d079d90f8a1baf54d Link: https://lore.kernel.org/all/20251212013510.3576091-1-kartikey406@gmail.com/T/ [v1] Signed-off-by: Deepanshu Kartikey Reviewed-by: Martin K. Petersen Reviewed-by: Stefan Hajnoczi Signed-off-by: Jens Axboe --- block/ioctl.c | 9 +++++---- include/uapi/linux/pr.h | 2 ++ 2 files changed, 7 insertions(+), 4 deletions(-) diff --git a/block/ioctl.c b/block/ioctl.c index 61feed686418a..344478348a54e 100644 --- a/block/ioctl.c +++ b/block/ioctl.c @@ -442,11 +442,12 @@ static int blkdev_pr_read_keys(struct block_device *bdev, blk_mode_t mode, if (copy_from_user(&read_keys, arg, sizeof(read_keys))) return -EFAULT; - keys_info_len = struct_size(keys_info, keys, read_keys.num_keys); - if (keys_info_len == SIZE_MAX) + if (read_keys.num_keys > PR_KEYS_MAX) return -EINVAL; - keys_info = kzalloc(keys_info_len, GFP_KERNEL); + keys_info_len = struct_size(keys_info, keys, read_keys.num_keys); + + keys_info = kvzalloc(keys_info_len, GFP_KERNEL); if (!keys_info) return -ENOMEM; @@ -473,7 +474,7 @@ static int blkdev_pr_read_keys(struct block_device *bdev, blk_mode_t mode, if (copy_to_user(arg, &read_keys, sizeof(read_keys))) ret = -EFAULT; out: - kfree(keys_info); + kvfree(keys_info); return ret; } diff --git a/include/uapi/linux/pr.h b/include/uapi/linux/pr.h index 847f3051057af..f0ecb1677317d 100644 --- a/include/uapi/linux/pr.h +++ b/include/uapi/linux/pr.h @@ -79,4 +79,6 @@ struct pr_read_reservation { #define IOC_PR_READ_KEYS _IOWR('p', 206, struct pr_read_keys) #define IOC_PR_READ_RESERVATION _IOR('p', 207, struct pr_read_reservation) +#define PR_KEYS_MAX (1u << 16) + #endif /* _UAPI_PR_H */ From f0e898ef345c92b89735a8d0df7d6fbd1d47d94a Mon Sep 17 00:00:00 2001 From: Ming Lei Date: Fri, 12 Dec 2025 22:34:15 +0800 Subject: [PATCH 0162/1321] ublk: fix deadlock when reading partition table When one process(such as udev) opens ublk block device (e.g., to read the partition table via bdev_open()), a deadlock[1] can occur: 1. bdev_open() grabs disk->open_mutex 2. The process issues read I/O to ublk backend to read partition table 3. In __ublk_complete_rq(), blk_update_request() or blk_mq_end_request() runs bio->bi_end_io() callbacks 4. If this triggers fput() on file descriptor of ublk block device, the work may be deferred to current task's task work (see fput() implementation) 5. This eventually calls blkdev_release() from the same context 6. blkdev_release() tries to grab disk->open_mutex again 7. Deadlock: same task waiting for a mutex it already holds The fix is to run blk_update_request() and blk_mq_end_request() with bottom halves disabled. This forces blkdev_release() to run in kernel work-queue context instead of current task work context, and allows ublk server to make forward progress, and avoids the deadlock. Fixes: 71f28f3136af ("ublk_drv: add io_uring based userspace block driver") Link: https://github.com/ublk-org/ublksrv/issues/170 [1] Signed-off-by: Ming Lei Reviewed-by: Caleb Sander Mateos [axboe: rewrite comment in ublk] Signed-off-by: Jens Axboe --- drivers/block/ublk_drv.c | 32 ++++++++++++++++++++++++++++---- 1 file changed, 28 insertions(+), 4 deletions(-) diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c index df9831783a133..cfd2132410dd7 100644 --- a/drivers/block/ublk_drv.c +++ b/drivers/block/ublk_drv.c @@ -1080,12 +1080,20 @@ static inline struct ublk_uring_cmd_pdu *ublk_get_uring_cmd_pdu( return io_uring_cmd_to_pdu(ioucmd, struct ublk_uring_cmd_pdu); } +static void ublk_end_request(struct request *req, blk_status_t error) +{ + local_bh_disable(); + blk_mq_end_request(req, error); + local_bh_enable(); +} + /* todo: handle partial completion */ static inline void __ublk_complete_rq(struct request *req, struct ublk_io *io, bool need_map) { unsigned int unmapped_bytes; blk_status_t res = BLK_STS_OK; + bool requeue; /* failed read IO if nothing is read */ if (!io->res && req_op(req) == REQ_OP_READ) @@ -1117,14 +1125,30 @@ static inline void __ublk_complete_rq(struct request *req, struct ublk_io *io, if (unlikely(unmapped_bytes < io->res)) io->res = unmapped_bytes; - if (blk_update_request(req, BLK_STS_OK, io->res)) + /* + * Run bio->bi_end_io() with softirqs disabled. If the final fput + * happens off this path, then that will prevent ublk's blkdev_release() + * from being called on current's task work, see fput() implementation. + * + * Otherwise, ublk server may not provide forward progress in case of + * reading the partition table from bdev_open() with disk->open_mutex + * held, and causes dead lock as we could already be holding + * disk->open_mutex here. + * + * Preferably we would not be doing IO with a mutex held that is also + * used for release, but this work-around will suffice for now. + */ + local_bh_disable(); + requeue = blk_update_request(req, BLK_STS_OK, io->res); + local_bh_enable(); + if (requeue) blk_mq_requeue_request(req, true); else if (likely(!blk_should_fake_timeout(req->q))) __blk_mq_end_request(req, BLK_STS_OK); return; exit: - blk_mq_end_request(req, res); + ublk_end_request(req, res); } static struct io_uring_cmd *__ublk_prep_compl_io_cmd(struct ublk_io *io, @@ -1164,7 +1188,7 @@ static inline void __ublk_abort_rq(struct ublk_queue *ubq, if (ublk_nosrv_dev_should_queue_io(ubq->dev)) blk_mq_requeue_request(rq, false); else - blk_mq_end_request(rq, BLK_STS_IOERR); + ublk_end_request(rq, BLK_STS_IOERR); } static void @@ -1209,7 +1233,7 @@ __ublk_do_auto_buf_reg(const struct ublk_queue *ubq, struct request *req, ublk_auto_buf_reg_fallback(ubq, req->tag); return AUTO_BUF_REG_FALLBACK; } - blk_mq_end_request(req, BLK_STS_IOERR); + ublk_end_request(req, BLK_STS_IOERR); return AUTO_BUF_REG_FAIL; } From 636cefc08c0f3ca24981dd6234af6536aba4e732 Mon Sep 17 00:00:00 2001 From: Thomas Fourier Date: Wed, 17 Dec 2025 10:36:48 +0100 Subject: [PATCH 0163/1321] block: rnbd-clt: Fix leaked ID in init_dev() If kstrdup() fails in init_dev(), then the newly allocated ID is lost. Fixes: 64e8a6ece1a5 ("block/rnbd-clt: Dynamically alloc buffer for pathname & blk_symlink_name") Signed-off-by: Thomas Fourier Acked-by: Jack Wang Signed-off-by: Jens Axboe --- drivers/block/rnbd/rnbd-clt.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/drivers/block/rnbd/rnbd-clt.c b/drivers/block/rnbd/rnbd-clt.c index f1409e54010a6..d1c354636315d 100644 --- a/drivers/block/rnbd/rnbd-clt.c +++ b/drivers/block/rnbd/rnbd-clt.c @@ -1423,9 +1423,11 @@ static struct rnbd_clt_dev *init_dev(struct rnbd_clt_session *sess, goto out_alloc; } - ret = ida_alloc_max(&index_ida, (1 << (MINORBITS - RNBD_PART_BITS)) - 1, - GFP_KERNEL); - if (ret < 0) { + dev->clt_device_id = ida_alloc_max(&index_ida, + (1 << (MINORBITS - RNBD_PART_BITS)) - 1, + GFP_KERNEL); + if (dev->clt_device_id < 0) { + ret = dev->clt_device_id; pr_err("Failed to initialize device '%s' from session %s, allocating idr failed, err: %d\n", pathname, sess->sessname, ret); goto out_queues; @@ -1434,10 +1436,9 @@ static struct rnbd_clt_dev *init_dev(struct rnbd_clt_session *sess, dev->pathname = kstrdup(pathname, GFP_KERNEL); if (!dev->pathname) { ret = -ENOMEM; - goto out_queues; + goto out_ida; } - dev->clt_device_id = ret; dev->sess = sess; dev->access_mode = access_mode; dev->nr_poll_queues = nr_poll_queues; @@ -1453,6 +1454,8 @@ static struct rnbd_clt_dev *init_dev(struct rnbd_clt_session *sess, return dev; +out_ida: + ida_free(&index_ida, dev->clt_device_id); out_queues: kfree(dev->hw_queues); out_alloc: From 86644c359c7fa76e5f68bd08cb2b7143cfd5d9ad Mon Sep 17 00:00:00 2001 From: Caleb Sander Mateos Date: Tue, 16 Dec 2025 22:34:35 -0700 Subject: [PATCH 0164/1321] block: validate pi_offset integrity limit The PI tuple must be contained within the metadata value, so validate that pi_offset + pi_tuple_size <= metadata_size. This guards against block drivers that report invalid pi_offset values. Signed-off-by: Caleb Sander Mateos Reviewed-by: Christoph Hellwig Signed-off-by: Jens Axboe --- block/blk-settings.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/block/blk-settings.c b/block/blk-settings.c index 51401f08ce05b..d138abc973bba 100644 --- a/block/blk-settings.c +++ b/block/blk-settings.c @@ -161,10 +161,9 @@ static int blk_validate_integrity_limits(struct queue_limits *lim) return -EINVAL; } - if (bi->pi_tuple_size > bi->metadata_size) { - pr_warn("pi_tuple_size (%u) exceeds metadata_size (%u)\n", - bi->pi_tuple_size, - bi->metadata_size); + if (bi->pi_offset + bi->pi_tuple_size > bi->metadata_size) { + pr_warn("pi_offset (%u) + pi_tuple_size (%u) exceeds metadata_size (%u)\n", + bi->pi_offset, bi->pi_tuple_size, bi->metadata_size); return -EINVAL; } From 0a6534877db64658b891d2657c144ddfa87d4bc6 Mon Sep 17 00:00:00 2001 From: Caleb Sander Mateos Date: Tue, 16 Dec 2025 22:34:36 -0700 Subject: [PATCH 0165/1321] block: validate interval_exp integrity limit Various code assumes that the integrity interval is at least 1 sector and evenly divides the logical block size. Add these checks to blk_validate_integrity_limits(). This guards against block drivers that report invalid interval_exp values. Signed-off-by: Caleb Sander Mateos Reviewed-by: Christoph Hellwig Signed-off-by: Jens Axboe --- block/blk-settings.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/block/blk-settings.c b/block/blk-settings.c index d138abc973bba..a9e65dc090dae 100644 --- a/block/blk-settings.c +++ b/block/blk-settings.c @@ -193,8 +193,13 @@ static int blk_validate_integrity_limits(struct queue_limits *lim) break; } - if (!bi->interval_exp) + if (!bi->interval_exp) { bi->interval_exp = ilog2(lim->logical_block_size); + } else if (bi->interval_exp < SECTOR_SHIFT || + bi->interval_exp > ilog2(lim->logical_block_size)) { + pr_warn("invalid interval_exp %u\n", bi->interval_exp); + return -EINVAL; + } /* * The PI generation / validation helpers do not expect intervals to From f9f0735dd9b9fdcf68d967d4f2299d1bf0960b64 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 17 Dec 2025 13:43:04 -0400 Subject: [PATCH 0166/1321] iommupt: Return ERR_PTR from _table_alloc() syzkaller noticed that with fault injection a failure inside iommu_alloc_pages_node_sz() oops's in PT_FEAT_DMA_INCOHERENT because it goes on to make NULL incoherent. Closer inspection shows the return value has become confused, the alloc routines on the iommupt side expect ERR_PTR while iommu_alloc_pages_node_sz() returns NULL. Error out early to fix both issues. Fixes: aefd967dab64 ("iommupt: Use the incoherent start/stop functions for PT_FEAT_DMA_INCOHERENT") Fixes: dcd6a011a8d5 ("iommupt: Add map_pages op") Fixes: cdb39d918579 ("iommupt: Add the basic structure of the iommu implementation") Reported-by: syzbot+e06bb7478e687f235ad7@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/693a39de.050a0220.4004e.02ce.GAE@google.com/ Signed-off-by: Jason Gunthorpe Reviewed-by: Kevin Tian Reviewed-by: Lu Baolu Signed-off-by: Joerg Roedel --- drivers/iommu/generic_pt/iommu_pt.h | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/iommu/generic_pt/iommu_pt.h b/drivers/iommu/generic_pt/iommu_pt.h index 97aeda1ad01cc..3327116a441ca 100644 --- a/drivers/iommu/generic_pt/iommu_pt.h +++ b/drivers/iommu/generic_pt/iommu_pt.h @@ -372,6 +372,9 @@ static inline struct pt_table_p *_table_alloc(struct pt_common *common, table_mem = iommu_alloc_pages_node_sz(iommu_table->nid, gfp, log2_to_int(lg2sz)); + if (!table_mem) + return ERR_PTR(-ENOMEM); + if (pt_feature(common, PT_FEAT_DMA_INCOHERENT) && mode == ALLOC_NORMAL) { int ret = iommu_pages_start_incoherent( From b9fff56703e53b1a0f80bdf4c473ad2f41b2686d Mon Sep 17 00:00:00 2001 From: Sairaj Kodilkar Date: Fri, 21 Nov 2025 14:41:15 +0530 Subject: [PATCH 0167/1321] amd/iommu: Preserve domain ids inside the kdump kernel Currently AMD IOMMU driver does not reserve domain ids programmed in the DTE while reusing the device table inside kdump kernel. This can cause reallocation of these domain ids for newer domains that are created by the kdump kernel, which can lead to potential IO_PAGE_FAULTs Hence reserve these ids inside pdom_ids. Fixes: 38e5f33ee359 ("iommu/amd: Reuse device table for kdump") Signed-off-by: Sairaj Kodilkar Reported-by: Jason Gunthorpe Reviewed-by: Vasant Hegde Reviewed-by: Jason Gunthorpe Signed-off-by: Joerg Roedel --- drivers/iommu/amd/init.c | 23 ++++++++++++++++++++++- 1 file changed, 22 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index 4b29534189770..106ee3cf30388 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -1136,9 +1136,13 @@ static void set_dte_bit(struct dev_table_entry *dte, u8 bit) static bool __reuse_device_table(struct amd_iommu *iommu) { struct amd_iommu_pci_seg *pci_seg = iommu->pci_seg; - u32 lo, hi, old_devtb_size; + struct dev_table_entry *old_dev_tbl_entry; + u32 lo, hi, old_devtb_size, devid; phys_addr_t old_devtb_phys; + u16 dom_id; + bool dte_v; u64 entry; + int ret; /* Each IOMMU use separate device table with the same size */ lo = readl(iommu->mmio_base + MMIO_DEV_TABLE_OFFSET); @@ -1173,6 +1177,23 @@ static bool __reuse_device_table(struct amd_iommu *iommu) return false; } + for (devid = 0; devid <= pci_seg->last_bdf; devid++) { + old_dev_tbl_entry = &pci_seg->old_dev_tbl_cpy[devid]; + dte_v = FIELD_GET(DTE_FLAG_V, old_dev_tbl_entry->data[0]); + dom_id = FIELD_GET(DEV_DOMID_MASK, old_dev_tbl_entry->data[1]); + + if (!dte_v || !dom_id) + continue; + /* + * ID reservation can fail with -ENOSPC when there + * are multiple devices present in the same domain, + * hence check only for -ENOMEM. + */ + ret = ida_alloc_range(&pdom_ids, dom_id, dom_id, GFP_KERNEL); + if (ret == -ENOMEM) + return false; + } + return true; } From 26194600eb89549bf8585bda73c4c2860fdd224e Mon Sep 17 00:00:00 2001 From: Sairaj Kodilkar Date: Fri, 21 Nov 2025 14:41:16 +0530 Subject: [PATCH 0168/1321] amd/iommu: Make protection domain ID functions non-static So that both iommu.c and init.c can utilize them. Also define a new function 'pdom_id_destroy()' to destroy 'pdom_ids' instead of directly calling ida functions. Signed-off-by: Sairaj Kodilkar Reviewed-by: Vasant Hegde Signed-off-by: Joerg Roedel --- drivers/iommu/amd/amd_iommu.h | 5 +++++ drivers/iommu/amd/init.c | 7 ++----- drivers/iommu/amd/iommu.c | 27 ++++++++++++++++++--------- 3 files changed, 25 insertions(+), 14 deletions(-) diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h index 25044d28f28a8..b742ef1adb352 100644 --- a/drivers/iommu/amd/amd_iommu.h +++ b/drivers/iommu/amd/amd_iommu.h @@ -173,6 +173,11 @@ static inline struct protection_domain *to_pdomain(struct iommu_domain *dom) bool translation_pre_enabled(struct amd_iommu *iommu); int __init add_special_device(u8 type, u8 id, u32 *devid, bool cmd_line); +int amd_iommu_pdom_id_alloc(void); +int amd_iommu_pdom_id_reserve(u16 id, gfp_t gfp); +void amd_iommu_pdom_id_free(int id); +void amd_iommu_pdom_id_destroy(void); + #ifdef CONFIG_DMI void amd_iommu_apply_ivrs_quirks(void); #else diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index 106ee3cf30388..384c90b4f90a0 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -1142,7 +1142,6 @@ static bool __reuse_device_table(struct amd_iommu *iommu) u16 dom_id; bool dte_v; u64 entry; - int ret; /* Each IOMMU use separate device table with the same size */ lo = readl(iommu->mmio_base + MMIO_DEV_TABLE_OFFSET); @@ -1189,8 +1188,7 @@ static bool __reuse_device_table(struct amd_iommu *iommu) * are multiple devices present in the same domain, * hence check only for -ENOMEM. */ - ret = ida_alloc_range(&pdom_ids, dom_id, dom_id, GFP_KERNEL); - if (ret == -ENOMEM) + if (amd_iommu_pdom_id_reserve(dom_id, GFP_KERNEL) == -ENOMEM) return false; } @@ -3148,8 +3146,7 @@ static bool __init check_ioapic_information(void) static void __init free_dma_resources(void) { - ida_destroy(&pdom_ids); - + amd_iommu_pdom_id_destroy(); free_unity_maps(); } diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index 9f1d56a5e145f..5d45795c367a6 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -1811,17 +1811,26 @@ int amd_iommu_complete_ppr(struct device *dev, u32 pasid, int status, int tag) * contain. * ****************************************************************************/ - -static int pdom_id_alloc(void) +int amd_iommu_pdom_id_alloc(void) { return ida_alloc_range(&pdom_ids, 1, MAX_DOMAIN_ID - 1, GFP_ATOMIC); } -static void pdom_id_free(int id) +int amd_iommu_pdom_id_reserve(u16 id, gfp_t gfp) +{ + return ida_alloc_range(&pdom_ids, id, id, gfp); +} + +void amd_iommu_pdom_id_free(int id) { ida_free(&pdom_ids, id); } +void amd_iommu_pdom_id_destroy(void) +{ + ida_destroy(&pdom_ids); +} + static void free_gcr3_tbl_level1(u64 *tbl) { u64 *ptr; @@ -1864,7 +1873,7 @@ static void free_gcr3_table(struct gcr3_tbl_info *gcr3_info) gcr3_info->glx = 0; /* Free per device domain ID */ - pdom_id_free(gcr3_info->domid); + amd_iommu_pdom_id_free(gcr3_info->domid); iommu_free_pages(gcr3_info->gcr3_tbl); gcr3_info->gcr3_tbl = NULL; @@ -1900,14 +1909,14 @@ static int setup_gcr3_table(struct gcr3_tbl_info *gcr3_info, return -EBUSY; /* Allocate per device domain ID */ - domid = pdom_id_alloc(); + domid = amd_iommu_pdom_id_alloc(); if (domid <= 0) return -ENOSPC; gcr3_info->domid = domid; gcr3_info->gcr3_tbl = iommu_alloc_pages_node_sz(nid, GFP_ATOMIC, SZ_4K); if (gcr3_info->gcr3_tbl == NULL) { - pdom_id_free(domid); + amd_iommu_pdom_id_free(domid); return -ENOMEM; } @@ -2503,7 +2512,7 @@ struct protection_domain *protection_domain_alloc(void) if (!domain) return NULL; - domid = pdom_id_alloc(); + domid = amd_iommu_pdom_id_alloc(); if (domid <= 0) { kfree(domain); return NULL; @@ -2802,7 +2811,7 @@ void amd_iommu_domain_free(struct iommu_domain *dom) WARN_ON(!list_empty(&domain->dev_list)); pt_iommu_deinit(&domain->iommu); - pdom_id_free(domain->id); + amd_iommu_pdom_id_free(domain->id); kfree(domain); } @@ -2853,7 +2862,7 @@ void amd_iommu_init_identity_domain(void) domain->ops = &identity_domain_ops; domain->owner = &amd_iommu_ops; - identity_domain.id = pdom_id_alloc(); + identity_domain.id = amd_iommu_pdom_id_alloc(); protection_domain_init(&identity_domain); } From 67d008c7435345b7ac36a6f5df87532b9bc91c4e Mon Sep 17 00:00:00 2001 From: Deepanshu Kartikey Date: Wed, 10 Dec 2025 07:50:24 +0530 Subject: [PATCH 0169/1321] mm/slub: reset KASAN tag in defer_free() before accessing freed memory When CONFIG_SLUB_TINY is enabled, kfree_nolock() calls kasan_slab_free() before defer_free(). On ARM64 with MTE (Memory Tagging Extension), kasan_slab_free() poisons the memory and changes the tag from the original (e.g., 0xf3) to a poison tag (0xfe). When defer_free() then tries to write to the freed object to build the deferred free list via llist_add(), the pointer still has the old tag, causing a tag mismatch and triggering a KASAN use-after-free report: BUG: KASAN: slab-use-after-free in defer_free+0x3c/0xbc mm/slub.c:6537 Write at addr f3f000000854f020 by task kworker/u8:6/983 Pointer tag: [f3], memory tag: [fe] Fix this by calling kasan_reset_tag() before accessing the freed memory. This is safe because defer_free() is part of the allocator itself and is expected to manipulate freed memory for bookkeeping purposes. Fixes: af92793e52c3 ("slab: Introduce kmalloc_nolock() and kfree_nolock().") Cc: stable@vger.kernel.org Reported-by: syzbot+7a25305a76d872abcfa1@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=7a25305a76d872abcfa1 Tested-by: syzbot+7a25305a76d872abcfa1@syzkaller.appspotmail.com Signed-off-by: Deepanshu Kartikey Acked-by: Alexei Starovoitov Link: https://patch.msgid.link/20251210022024.3255826-1-kartikey406@gmail.com Signed-off-by: Vlastimil Babka --- mm/slub.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/mm/slub.c b/mm/slub.c index f21b2f0c6f5ae..861592ac54257 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -6539,6 +6539,8 @@ static void defer_free(struct kmem_cache *s, void *head) guard(preempt)(); + head = kasan_reset_tag(head); + df = this_cpu_ptr(&defer_free_objects); if (llist_add(head + s->offset, &df->objects)) irq_work_queue(&df->work); From 3ea5abad7d96d7f9f1e65ad9bb5b4da2c5e74a99 Mon Sep 17 00:00:00 2001 From: Juergen Gross Date: Mon, 15 Dec 2025 12:51:12 +0100 Subject: [PATCH 0170/1321] x86/xen: Fix sparse warning in enlighten_pv.c The sparse tool issues a warning for arch/x76/xen/enlighten_pv.c: arch/x86/xen/enlighten_pv.c:120:9: sparse: sparse: incorrect type in initializer (different address spaces) expected void const [noderef] __percpu *__vpp_verify got bool * This is due to the percpu variable xen_in_preemptible_hcall being exported via EXPORT_SYMBOL_GPL() instead of EXPORT_PER_CPU_SYMBOL_GPL(). Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202512140856.Ic6FetG6-lkp@intel.com/ Fixes: fdfd811ddde3 ("x86/xen: allow privcmd hypercalls to be preempted") Reviewed-by: Boris Ostrovsky Signed-off-by: Juergen Gross Message-ID: <20251215115112.15072-1-jgross@suse.com> --- arch/x86/xen/enlighten_pv.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c index 4806cc28d7ca7..b74ff8bc7f2a8 100644 --- a/arch/x86/xen/enlighten_pv.c +++ b/arch/x86/xen/enlighten_pv.c @@ -108,7 +108,7 @@ static int xen_cpu_dead_pv(unsigned int cpu); * calls. */ DEFINE_PER_CPU(bool, xen_in_preemptible_hcall); -EXPORT_SYMBOL_GPL(xen_in_preemptible_hcall); +EXPORT_PER_CPU_SYMBOL_GPL(xen_in_preemptible_hcall); /* * In case of scheduling the flag must be cleared and restored after From 806e046594dc606add3966477231d1cff2d0a7e2 Mon Sep 17 00:00:00 2001 From: Gavin Shan Date: Mon, 24 Nov 2025 15:04:27 +1000 Subject: [PATCH 0171/1321] KVM: selftests: Add missing "break" in rseq_test's param parsing In commit 0297cdc12a87 ("KVM: selftests: Add option to rseq test to override /dev/cpu_dma_latency"), a 'break' is missed before the option 'l' in the argument parsing loop, which leads to an unexpected core dump in atoi_paranoid(). It tries to get the latency from non-existent argument. host$ ./rseq_test -u Random seed: 0x6b8b4567 Segmentation fault (core dumped) Add a 'break' before the option 'l' in the argument parsing loop to avoid the unexpected core dump. Fixes: 0297cdc12a87 ("KVM: selftests: Add option to rseq test to override /dev/cpu_dma_latency") Cc: stable@vger.kernel.org # v6.15+ Signed-off-by: Gavin Shan Link: https://patch.msgid.link/20251124050427.1924591-1-gshan@redhat.com [sean: describe code change in shortlog] Signed-off-by: Sean Christopherson --- tools/testing/selftests/kvm/rseq_test.c | 1 + 1 file changed, 1 insertion(+) diff --git a/tools/testing/selftests/kvm/rseq_test.c b/tools/testing/selftests/kvm/rseq_test.c index 1375fca80bcdb..f80ad6b47d16b 100644 --- a/tools/testing/selftests/kvm/rseq_test.c +++ b/tools/testing/selftests/kvm/rseq_test.c @@ -215,6 +215,7 @@ int main(int argc, char *argv[]) switch (opt) { case 'u': skip_sanity_check = true; + break; case 'l': latency = atoi_paranoid(optarg); break; From 875edcc0375374f9801cb7935ed795c6423bde1b Mon Sep 17 00:00:00 2001 From: Sean Christopherson Date: Mon, 1 Dec 2025 17:50:48 -0800 Subject: [PATCH 0172/1321] KVM: x86: Apply runtime updates to current CPUID during KVM_SET_CPUID{,2} When handling KVM_SET_CPUID{,2}, do runtime CPUID updates on the vCPU's current CPUID (and caps) prior to swapping in the incoming CPUID state so that KVM doesn't lose pending updates if the incoming CPUID is rejected, and to prevent a false failure on the equality check. Note, runtime updates are unconditionally performed on the incoming/new CPUID (and associated caps), i.e. clearing the dirty flag won't negatively affect the new CPUID. Fixes: 93da6af3ae56 ("KVM: x86: Defer runtime updates of dynamic CPUID bits until CPUID emulation") Reported-by: Igor Mammedov Closes: https://lore.kernel.org/all/20251128123202.68424a95@imammedo Cc: stable@vger.kernel.org Acked-by: Igor Mammedov Tested-by: Igor Mammedov Link: https://patch.msgid.link/20251202015049.1167490-2-seanjc@google.com Signed-off-by: Sean Christopherson --- arch/x86/kvm/cpuid.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index d563a948318b7..88a5426674a10 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -509,11 +509,18 @@ static int kvm_set_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2, u32 vcpu_caps[NR_KVM_CPU_CAPS]; int r; + /* + * Apply pending runtime CPUID updates to the current CPUID entries to + * avoid false positives due to mismatches on KVM-owned feature flags. + */ + if (vcpu->arch.cpuid_dynamic_bits_dirty) + kvm_update_cpuid_runtime(vcpu); + /* * Swap the existing (old) entries with the incoming (new) entries in * order to massage the new entries, e.g. to account for dynamic bits - * that KVM controls, without clobbering the current guest CPUID, which - * KVM needs to preserve in order to unwind on failure. + * that KVM controls, without losing the current guest CPUID, which KVM + * needs to preserve in order to unwind on failure. * * Similarly, save the vCPU's current cpu_caps so that the capabilities * can be updated alongside the CPUID entries when performing runtime From 7a53e9599ce6587636fc09a13c565caf4d00cea5 Mon Sep 17 00:00:00 2001 From: Sean Christopherson Date: Mon, 1 Dec 2025 17:50:49 -0800 Subject: [PATCH 0173/1321] KVM: selftests: Add a CPUID testcase for KVM_SET_CPUID2 with runtime updates Add a CPUID testcase to verify that KVM allows KVM_SET_CPUID2 after (or in conjunction with) runtime updates. This is a regression test for the bug introduced by commit 93da6af3ae56 ("KVM: x86: Defer runtime updates of dynamic CPUID bits until CPUID emulation"), where KVM would incorrectly reject KVM_SET_CPUID due to a not handling a pending runtime update on the current CPUID, resulting in a false mismatch between the "old" and "new" CPUID entries. Link: https://lore.kernel.org/all/20251128123202.68424a95@imammedo Link: https://patch.msgid.link/20251202015049.1167490-3-seanjc@google.com Signed-off-by: Sean Christopherson --- tools/testing/selftests/kvm/x86/cpuid_test.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/tools/testing/selftests/kvm/x86/cpuid_test.c b/tools/testing/selftests/kvm/x86/cpuid_test.c index 7b3fda6842bce..f9ed14996977a 100644 --- a/tools/testing/selftests/kvm/x86/cpuid_test.c +++ b/tools/testing/selftests/kvm/x86/cpuid_test.c @@ -155,6 +155,7 @@ struct kvm_cpuid2 *vcpu_alloc_cpuid(struct kvm_vm *vm, vm_vaddr_t *p_gva, struct static void set_cpuid_after_run(struct kvm_vcpu *vcpu) { struct kvm_cpuid_entry2 *ent; + struct kvm_sregs sregs; int rc; u32 eax, ebx, x; @@ -162,6 +163,20 @@ static void set_cpuid_after_run(struct kvm_vcpu *vcpu) rc = __vcpu_set_cpuid(vcpu); TEST_ASSERT(!rc, "Setting unmodified CPUID after KVM_RUN failed: %d", rc); + /* + * Toggle CR4 bits that affect dynamic CPUID feature flags to verify + * setting unmodified CPUID succeeds with runtime CPUID updates. + */ + vcpu_sregs_get(vcpu, &sregs); + if (kvm_cpu_has(X86_FEATURE_XSAVE)) + sregs.cr4 ^= X86_CR4_OSXSAVE; + if (kvm_cpu_has(X86_FEATURE_PKU)) + sregs.cr4 ^= X86_CR4_PKE; + vcpu_sregs_set(vcpu, &sregs); + + rc = __vcpu_set_cpuid(vcpu); + TEST_ASSERT(!rc, "Setting unmodified CPUID after KVM_RUN failed: %d", rc); + /* Changing CPU features is forbidden */ ent = vcpu_get_cpuid_entry(vcpu, 0x7); ebx = ent->ebx; From eaca8cc1a408cf1607f5e5d42a0015ac4deee97a Mon Sep 17 00:00:00 2001 From: Sean Christopherson Date: Mon, 1 Dec 2025 18:03:33 -0800 Subject: [PATCH 0174/1321] KVM: Disallow toggling KVM_MEM_GUEST_MEMFD on an existing memslot Reject attempts to disable KVM_MEM_GUEST_MEMFD on a memslot that was initially created with a guest_memfd binding, as KVM doesn't support toggling KVM_MEM_GUEST_MEMFD on existing memslots. KVM prevents enabling KVM_MEM_GUEST_MEMFD, but doesn't prevent clearing the flag. Failure to reject the new memslot results in a use-after-free due to KVM not unbinding from the guest_memfd instance. Unbinding on a FLAGS_ONLY change is easy enough, and can/will be done as a hardening measure (in anticipation of KVM supporting dirty logging on guest_memfd at some point), but fixing the use-after-free would only address the immediate symptom. ================================================================== BUG: KASAN: slab-use-after-free in kvm_gmem_release+0x362/0x400 [kvm] Write of size 8 at addr ffff8881111ae908 by task repro/745 CPU: 7 UID: 1000 PID: 745 Comm: repro Not tainted 6.18.0-rc6-115d5de2eef3-next-kasan #3 NONE Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 Call Trace: dump_stack_lvl+0x51/0x60 print_report+0xcb/0x5c0 kasan_report+0xb4/0xe0 kvm_gmem_release+0x362/0x400 [kvm] __fput+0x2fa/0x9d0 task_work_run+0x12c/0x200 do_exit+0x6ae/0x2100 do_group_exit+0xa8/0x230 __x64_sys_exit_group+0x3a/0x50 x64_sys_call+0x737/0x740 do_syscall_64+0x5b/0x900 entry_SYSCALL_64_after_hwframe+0x4b/0x53 RIP: 0033:0x7f581f2eac31 Allocated by task 745 on cpu 6 at 9.746971s: kasan_save_stack+0x20/0x40 kasan_save_track+0x13/0x50 __kasan_kmalloc+0x77/0x90 kvm_set_memory_region.part.0+0x652/0x1110 [kvm] kvm_vm_ioctl+0x14b0/0x3290 [kvm] __x64_sys_ioctl+0x129/0x1a0 do_syscall_64+0x5b/0x900 entry_SYSCALL_64_after_hwframe+0x4b/0x53 Freed by task 745 on cpu 6 at 9.747467s: kasan_save_stack+0x20/0x40 kasan_save_track+0x13/0x50 __kasan_save_free_info+0x37/0x50 __kasan_slab_free+0x3b/0x60 kfree+0xf5/0x440 kvm_set_memslot+0x3c2/0x1160 [kvm] kvm_set_memory_region.part.0+0x86a/0x1110 [kvm] kvm_vm_ioctl+0x14b0/0x3290 [kvm] __x64_sys_ioctl+0x129/0x1a0 do_syscall_64+0x5b/0x900 entry_SYSCALL_64_after_hwframe+0x4b/0x53 Reported-by: Alexander Potapenko Fixes: a7800aa80ea4 ("KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory") Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251202020334.1171351-2-seanjc@google.com Signed-off-by: Sean Christopherson --- virt/kvm/kvm_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 5fcd401a58973..b38bae53fd31a 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -2086,7 +2086,7 @@ static int kvm_set_memory_region(struct kvm *kvm, return -EINVAL; if ((mem->userspace_addr != old->userspace_addr) || (npages != old->npages) || - ((mem->flags ^ old->flags) & KVM_MEM_READONLY)) + ((mem->flags ^ old->flags) & (KVM_MEM_READONLY | KVM_MEM_GUEST_MEMFD))) return -EINVAL; if (base_gfn != old->base_gfn) From 2d5817a99d84e9fbe96e42f88038a104b556a833 Mon Sep 17 00:00:00 2001 From: Sean Christopherson Date: Mon, 1 Dec 2025 18:03:34 -0800 Subject: [PATCH 0175/1321] KVM: Harden and prepare for modifying existing guest_memfd memslots Unbind guest_memfd memslots if KVM commits a MOVE or FLAGS_ONLY memslot change to harden against use-after-free, and to prepare for eventually supporting dirty logging on guest_memfd memslots, at which point FLAGS_ONLY changes will be expected/supported. Add two separate WARNs, once to yell if a guest_memfd memslot is moved (which KVM is never expected to allow/support), and again if the unbind() is triggered, to help detect uAPI goofs prior to deliberately allowing FLAGS_ONLY changes. Link: https://patch.msgid.link/20251202020334.1171351-3-seanjc@google.com Signed-off-by: Sean Christopherson --- virt/kvm/kvm_main.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index b38bae53fd31a..5b5b69c97665e 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -1749,6 +1749,12 @@ static void kvm_commit_memory_region(struct kvm *kvm, kvm_free_memslot(kvm, old); break; case KVM_MR_MOVE: + /* + * Moving a guest_memfd memslot isn't supported, and will never + * be supported. + */ + WARN_ON_ONCE(old->flags & KVM_MEM_GUEST_MEMFD); + fallthrough; case KVM_MR_FLAGS_ONLY: /* * Free the dirty bitmap as needed; the below check encompasses @@ -1757,6 +1763,15 @@ static void kvm_commit_memory_region(struct kvm *kvm, if (old->dirty_bitmap && !new->dirty_bitmap) kvm_destroy_dirty_bitmap(old); + /* + * Unbind the guest_memfd instance as needed; the @new slot has + * already created its own binding. TODO: Drop the WARN when + * dirty logging guest_memfd memslots is supported. Until then, + * flags-only changes on guest_memfd slots should be impossible. + */ + if (WARN_ON_ONCE(old->flags & KVM_MEM_GUEST_MEMFD)) + kvm_gmem_unbind(old); + /* * The final quirk. Free the detached, old slot, but only its * memory, not any metadata. Metadata, including arch specific From a2ea2467979d7a0d2e94d19cf438599c5dc8997c Mon Sep 17 00:00:00 2001 From: Sean Christopherson Date: Thu, 13 Nov 2025 14:56:13 -0800 Subject: [PATCH 0176/1321] KVM: nSVM: Clear exit_code_hi in VMCB when synthesizing nested VM-Exits Explicitly clear exit_code_hi in the VMCB when synthesizing "normal" nested VM-Exits, as the full exit code is a 64-bit value (spoiler alert), and all exit codes for non-failing VMRUN use only bits 31:0. Cc: Jim Mattson Cc: Yosry Ahmed Cc: stable@vger.kernel.org Reviewed-by: Yosry Ahmed Link: https://patch.msgid.link/20251113225621.1688428-2-seanjc@google.com Signed-off-by: Sean Christopherson --- arch/x86/kvm/svm/svm.c | 2 ++ arch/x86/kvm/svm/svm.h | 7 ++++--- 2 files changed, 6 insertions(+), 3 deletions(-) diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index f56c2d895011c..24d59ccfa40d9 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -2443,6 +2443,7 @@ static bool check_selective_cr0_intercepted(struct kvm_vcpu *vcpu, if (cr0 ^ val) { svm->vmcb->control.exit_code = SVM_EXIT_CR0_SEL_WRITE; + svm->vmcb->control.exit_code_hi = 0; ret = (nested_svm_exit_handled(svm) == NESTED_EXIT_DONE); } @@ -4617,6 +4618,7 @@ static int svm_check_intercept(struct kvm_vcpu *vcpu, if (static_cpu_has(X86_FEATURE_NRIPS)) vmcb->control.next_rip = info->next_rip; vmcb->control.exit_code = icpt_info.exit_code; + vmcb->control.exit_code_hi = 0; vmexit = nested_svm_exit_handled(svm); ret = (vmexit == NESTED_EXIT_DONE) ? X86EMUL_INTERCEPTED diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h index 9e151dbdef25d..01be93a53d077 100644 --- a/arch/x86/kvm/svm/svm.h +++ b/arch/x86/kvm/svm/svm.h @@ -761,9 +761,10 @@ int nested_svm_vmexit(struct vcpu_svm *svm); static inline int nested_svm_simple_vmexit(struct vcpu_svm *svm, u32 exit_code) { - svm->vmcb->control.exit_code = exit_code; - svm->vmcb->control.exit_info_1 = 0; - svm->vmcb->control.exit_info_2 = 0; + svm->vmcb->control.exit_code = exit_code; + svm->vmcb->control.exit_code_hi = 0; + svm->vmcb->control.exit_info_1 = 0; + svm->vmcb->control.exit_info_2 = 0; return nested_svm_vmexit(svm); } From 5e19fb1deb3c04148ddccb46a98ed7fcf65209e8 Mon Sep 17 00:00:00 2001 From: Sean Christopherson Date: Thu, 13 Nov 2025 14:56:14 -0800 Subject: [PATCH 0177/1321] KVM: nSVM: Set exit_code_hi to -1 when synthesizing SVM_EXIT_ERR (failed VMRUN) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Set exit_code_hi to -1u as a temporary band-aid to fix a long-standing (effectively since KVM's inception) bug where KVM treats the exit code as a 32-bit value, when in reality it's a 64-bit value. Per the APM, offset 0x70 is a single 64-bit value: 070h 63:0 EXITCODE And a sane reading of the error values defined in "Table C-1. SVM Intercept Codes" is that negative values use the full 64 bits: –1 VMEXIT_INVALID Invalid guest state in VMCB. –2 VMEXIT_BUSYBUSY bit was set in the VMSA –3 VMEXIT_IDLE_REQUIREDThe sibling thread is not in an idle state -4 VMEXIT_INVALID_PMC Invalid PMC state And that interpretation is confirmed by testing on Milan and Turin (by setting bits in CR0[63:32] to generate VMEXIT_INVALID on VMRUN). Furthermore, Xen has treated exitcode as a 64-bit value since HVM support was adding in 2006 (see Xen commit d1bd157fbc ("Big merge the HVM full-virtualisation abstractions.")). Cc: Jim Mattson Cc: Yosry Ahmed Cc: stable@vger.kernel.org Reviewed-by: Yosry Ahmed Link: https://patch.msgid.link/20251113225621.1688428-3-seanjc@google.com Signed-off-by: Sean Christopherson --- arch/x86/kvm/svm/nested.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index c81005b245222..ba0f11c68372b 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -985,7 +985,7 @@ int nested_svm_vmrun(struct kvm_vcpu *vcpu) if (!nested_vmcb_check_save(vcpu) || !nested_vmcb_check_controls(vcpu)) { vmcb12->control.exit_code = SVM_EXIT_ERR; - vmcb12->control.exit_code_hi = 0; + vmcb12->control.exit_code_hi = -1u; vmcb12->control.exit_info_1 = 0; vmcb12->control.exit_info_2 = 0; goto out; @@ -1018,7 +1018,7 @@ int nested_svm_vmrun(struct kvm_vcpu *vcpu) svm->soft_int_injected = false; svm->vmcb->control.exit_code = SVM_EXIT_ERR; - svm->vmcb->control.exit_code_hi = 0; + svm->vmcb->control.exit_code_hi = -1u; svm->vmcb->control.exit_info_1 = 0; svm->vmcb->control.exit_info_2 = 0; From 4782863c7f1f1f7c0506c6218d482d5fa520d545 Mon Sep 17 00:00:00 2001 From: Dongli Zhang Date: Fri, 5 Dec 2025 15:19:04 -0800 Subject: [PATCH 0178/1321] KVM: VMX: Update SVI during runtime APICv activation The APICv (apic->apicv_active) can be activated or deactivated at runtime, for instance, because of APICv inhibit reasons. Intel VMX employs different mechanisms to virtualize LAPIC based on whether APICv is active. When APICv is activated at runtime, GUEST_INTR_STATUS is used to configure and report the current pending IRR and ISR states. Unless a specific vector is explicitly included in EOI_EXIT_BITMAP, its EOI will not be trapped to KVM. Intel VMX automatically clears the corresponding ISR bit based on the GUEST_INTR_STATUS.SVI field. When APICv is deactivated at runtime, the VM_ENTRY_INTR_INFO_FIELD is used to specify the next interrupt vector to invoke upon VM-entry. The VMX IDT_VECTORING_INFO_FIELD is used to report un-invoked vectors on VM-exit. EOIs are always trapped to KVM, so the software can manually clear pending ISR bits. There are scenarios where, with APICv activated at runtime, a guest-issued EOI may not be able to clear the pending ISR bit. Taking vector 236 as an example, here is one scenario. 1. Suppose APICv is inactive. Vector 236 is pending in the IRR. 2. To handle KVM_REQ_EVENT, KVM moves vector 236 from the IRR to the ISR, and configures the VM_ENTRY_INTR_INFO_FIELD via vmx_inject_irq(). 3. After VM-entry, vector 236 is invoked through the guest IDT. At this point, the data in VM_ENTRY_INTR_INFO_FIELD is no longer valid. The guest interrupt handler for vector 236 is invoked. 4. Suppose a VM exit occurs very early in the guest interrupt handler, before the EOI is issued. 5. Nothing is reported through the IDT_VECTORING_INFO_FIELD because vector 236 has already been invoked in the guest. 6. Now, suppose APICv is activated. Before the next VM-entry, KVM calls kvm_vcpu_update_apicv() to activate APICv. 7. Unfortunately, GUEST_INTR_STATUS.SVI is not configured, although vector 236 is still pending in the ISR. 8. After VM-entry, the guest finally issues the EOI for vector 236. However, because SVI is not configured, vector 236 is not cleared. 9. ISR is stalled forever on vector 236. Here is another scenario. 1. Suppose APICv is inactive. Vector 236 is pending in the IRR. 2. To handle KVM_REQ_EVENT, KVM moves vector 236 from the IRR to the ISR, and configures the VM_ENTRY_INTR_INFO_FIELD via vmx_inject_irq(). 3. VM-exit occurs immediately after the next VM-entry. The vector 236 is not invoked through the guest IDT. Instead, it is saved to the IDT_VECTORING_INFO_FIELD during the VM-exit. 4. KVM calls kvm_queue_interrupt() to re-queue the un-invoked vector 236 into vcpu->arch.interrupt. A KVM_REQ_EVENT is requested. 5. Now, suppose APICv is activated. Before the next VM-entry, KVM calls kvm_vcpu_update_apicv() to activate APICv. 6. Although APICv is now active, KVM still uses the legacy VM_ENTRY_INTR_INFO_FIELD to re-inject vector 236. GUEST_INTR_STATUS.SVI is not configured. 7. After the next VM-entry, vector 236 is invoked through the guest IDT. Finally, an EOI occurs. However, due to the lack of GUEST_INTR_STATUS.SVI configuration, vector 236 is not cleared from the ISR. 8. ISR is stalled forever on vector 236. Using QEMU as an example, vector 236 is stuck in ISR forever. (qemu) info lapic 1 dumping local APIC state for CPU 1 LVT0 0x00010700 active-hi edge masked ExtINT (vec 0) LVT1 0x00010400 active-hi edge masked NMI LVTPC 0x00000400 active-hi edge NMI LVTERR 0x000000fe active-hi edge Fixed (vec 254) LVTTHMR 0x00010000 active-hi edge masked Fixed (vec 0) LVTT 0x000400ec active-hi edge tsc-deadline Fixed (vec 236) Timer DCR=0x0 (divide by 2) initial_count = 0 current_count = 0 SPIV 0x000001ff APIC enabled, focus=off, spurious vec 255 ICR 0x000000fd physical edge de-assert no-shorthand ICR2 0x00000000 cpu 0 (X2APIC ID) ESR 0x00000000 ISR 236 IRR 37(level) 236 The issue isn't applicable to AMD SVM as KVM simply writes vmcb01 directly irrespective of whether L1 (vmcs01) or L2 (vmcb02) is active (unlike VMX, there is no need/cost to switch between VMCBs). In addition, APICV_INHIBIT_REASON_IRQWIN ensures AMD SVM AVIC is not activated until the last interrupt is EOI'd. Fix the bug by configuring Intel VMX GUEST_INTR_STATUS.SVI if APICv is activated at runtime. Signed-off-by: Dongli Zhang Reviewed-by: Chao Gao Link: https://patch.msgid.link/20251110063212.34902-1-dongli.zhang@oracle.com [sean: call out that SVM writes vmcb01 directly, tweak comment] Link: https://patch.msgid.link/20251205231913.441872-2-seanjc@google.com Signed-off-by: Sean Christopherson --- arch/x86/kvm/vmx/vmx.c | 9 --------- arch/x86/kvm/x86.c | 7 +++++++ 2 files changed, 7 insertions(+), 9 deletions(-) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 4cbe8c84b6366..6b96f7aea20bd 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -6937,15 +6937,6 @@ void vmx_hwapic_isr_update(struct kvm_vcpu *vcpu, int max_isr) * VM-Exit, otherwise L1 with run with a stale SVI. */ if (is_guest_mode(vcpu)) { - /* - * KVM is supposed to forward intercepted L2 EOIs to L1 if VID - * is enabled in vmcs12; as above, the EOIs affect L2's vAPIC. - * Note, userspace can stuff state while L2 is active; assert - * that VID is disabled if and only if the vCPU is in KVM_RUN - * to avoid false positives if userspace is setting APIC state. - */ - WARN_ON_ONCE(vcpu->wants_to_run && - nested_cpu_has_vid(get_vmcs12(vcpu))); to_vmx(vcpu)->nested.update_vmcs01_hwapic_isr = true; return; } diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 0c6d899d53ddc..ff8812f3a1293 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -10886,9 +10886,16 @@ void __kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu) * pending. At the same time, KVM_REQ_EVENT may not be set as APICv was * still active when the interrupt got accepted. Make sure * kvm_check_and_inject_events() is called to check for that. + * + * Update SVI when APICv gets enabled, otherwise SVI won't reflect the + * highest bit in vISR and the next accelerated EOI in the guest won't + * be virtualized correctly (the CPU uses SVI to determine which vISR + * vector to clear). */ if (!apic->apicv_active) kvm_make_request(KVM_REQ_EVENT, vcpu); + else + kvm_apic_update_hwapic_isr(vcpu); out: preempt_enable(); From 0fb2b68db34be61786c06d7b8c18c9d00afae381 Mon Sep 17 00:00:00 2001 From: Dongli Zhang Date: Fri, 5 Dec 2025 15:19:05 -0800 Subject: [PATCH 0179/1321] KVM: nVMX: Immediately refresh APICv controls as needed on nested VM-Exit If an APICv status updated was pended while L2 was active, immediately refresh vmcs01's controls instead of pending KVM_REQ_APICV_UPDATE as kvm_vcpu_update_apicv() only calls into vendor code if a change is necessary. E.g. if APICv is inhibited, and then activated while L2 is running: kvm_vcpu_update_apicv() | -> __kvm_vcpu_update_apicv() | -> apic->apicv_active = true | -> vmx_refresh_apicv_exec_ctrl() | -> vmx->nested.update_vmcs01_apicv_status = true | -> return Then L2 exits to L1: __nested_vmx_vmexit() | -> kvm_make_request(KVM_REQ_APICV_UPDATE) vcpu_enter_guest(): KVM_REQ_APICV_UPDATE -> kvm_vcpu_update_apicv() | -> __kvm_vcpu_update_apicv() | -> return // because if (apic->apicv_active == activate) Reported-by: Chao Gao Closes: https://lore.kernel.org/all/aQ2jmnN8wUYVEawF@intel.com Fixes: 7c69661e225c ("KVM: nVMX: Defer APICv updates while L2 is active until L1 is active") Cc: stable@vger.kernel.org Signed-off-by: Dongli Zhang [sean: write changelog] Link: https://patch.msgid.link/20251205231913.441872-3-seanjc@google.com Signed-off-by: Sean Christopherson --- arch/x86/kvm/vmx/nested.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index 40777278eabbf..6137e5307d0f6 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -19,6 +19,7 @@ #include "trace.h" #include "vmx.h" #include "smm.h" +#include "x86_ops.h" static bool __read_mostly enable_shadow_vmcs = 1; module_param_named(enable_shadow_vmcs, enable_shadow_vmcs, bool, S_IRUGO); @@ -5165,7 +5166,7 @@ void __nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason, if (vmx->nested.update_vmcs01_apicv_status) { vmx->nested.update_vmcs01_apicv_status = false; - kvm_make_request(KVM_REQ_APICV_UPDATE, vcpu); + vmx_refresh_apicv_exec_ctrl(vcpu); } if (vmx->nested.update_vmcs01_hwapic_isr) { From 7b5bcd168a499138673ce562d528db8f0ab13c6b Mon Sep 17 00:00:00 2001 From: Kevin Brodsky Date: Wed, 19 Nov 2025 13:00:16 +0000 Subject: [PATCH 0180/1321] arm64: mm: Simplify check in arch_kfence_init_pool() TL;DR: checking force_pte_mapping() in arch_kfence_init_pool() is sufficient Commit ce2b3a50ad92 ("arm64: mm: Don't sleep in split_kernel_leaf_mapping() when in atomic context") recently added an arm64 implementation of arch_kfence_init_pool() to ensure that the KFENCE pool is PTE-mapped. Assuming that the pool was not initialised early, block splitting is necessary if the linear mapping is not fully PTE-mapped, in other words if force_pte_mapping() is false. arch_kfence_init_pool() currently makes another check: whether BBML2-noabort is supported, i.e. whether we are *able* to split block mappings. This check is however unnecessary, because force_pte_mapping() is always true if KFENCE is enabled and BBML2-noabort is not supported. This must be the case by design, since KFENCE requires PTE-mapped pages in all cases. We can therefore remove that check. The situation is different in split_kernel_leaf_mapping(), as that function is called unconditionally regardless of the configuration. If BBML2-noabort is not supported, it cannot do anything and bails out. If force_pte_mapping() is true, there is nothing to do and it also bails out, but these are independent checks. Commit 53357f14f924 ("arm64: mm: Tidy up force_pte_mapping()") grouped these checks into a helper, split_leaf_mapping_possible(). This isn't so helpful as only split_kernel_leaf_mapping() should check both. Revert the parts of that commit that introduced the helper, reintroducing the more accurate comments in split_kernel_leaf_mapping(). Signed-off-by: Kevin Brodsky Reviewed-by: Ryan Roberts Signed-off-by: Catalin Marinas --- arch/arm64/mm/mmu.c | 33 ++++++++++++++++----------------- 1 file changed, 16 insertions(+), 17 deletions(-) diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index 9ae7ce00a7ef2..8e1d80a7033e3 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -767,18 +767,6 @@ static inline bool force_pte_mapping(void) return rodata_full || arm64_kfence_can_set_direct_map() || is_realm_world(); } -static inline bool split_leaf_mapping_possible(void) -{ - /* - * !BBML2_NOABORT systems should never run into scenarios where we would - * have to split. So exit early and let calling code detect it and raise - * a warning. - */ - if (!system_supports_bbml2_noabort()) - return false; - return !force_pte_mapping(); -} - static DEFINE_MUTEX(pgtable_split_lock); int split_kernel_leaf_mapping(unsigned long start, unsigned long end) @@ -786,11 +774,22 @@ int split_kernel_leaf_mapping(unsigned long start, unsigned long end) int ret; /* - * Exit early if the region is within a pte-mapped area or if we can't - * split. For the latter case, the permission change code will raise a - * warning if not already pte-mapped. + * !BBML2_NOABORT systems should not be trying to change permissions on + * anything that is not pte-mapped in the first place. Just return early + * and let the permission change code raise a warning if not already + * pte-mapped. */ - if (!split_leaf_mapping_possible() || is_kfence_address((void *)start)) + if (!system_supports_bbml2_noabort()) + return 0; + + /* + * If the region is within a pte-mapped area, there is no need to try to + * split. Additionally, CONFIG_DEBUG_PAGEALLOC and CONFIG_KFENCE may + * change permissions from atomic context so for those cases (which are + * always pte-mapped), we must not go any further because taking the + * mutex below may sleep. + */ + if (force_pte_mapping() || is_kfence_address((void *)start)) return 0; /* @@ -1089,7 +1088,7 @@ bool arch_kfence_init_pool(void) int ret; /* Exit early if we know the linear map is already pte-mapped. */ - if (!split_leaf_mapping_possible()) + if (force_pte_mapping()) return true; /* Kfence pool is already pte-mapped for the early init case. */ From 885a130bc4518a66ab2459b27298e8899fc31d41 Mon Sep 17 00:00:00 2001 From: Ard Biesheuvel Date: Sat, 6 Dec 2025 20:01:16 +0100 Subject: [PATCH 0181/1321] lkdtm/bugs: Add cases for BUG and PANIC occurring in hardirq context Add lkdtm cases to trigger a BUG() or panic() from hardirq context. This is useful for testing pstore behavior being invoked from such contexts. Reviewed-by: Kees Cook Signed-off-by: Ard Biesheuvel Signed-off-by: Catalin Marinas --- drivers/misc/lkdtm/bugs.c | 53 +++++++++++++++++++++++++ tools/testing/selftests/lkdtm/tests.txt | 2 + 2 files changed, 55 insertions(+) diff --git a/drivers/misc/lkdtm/bugs.c b/drivers/misc/lkdtm/bugs.c index 376047beea3d6..fa05d77acb558 100644 --- a/drivers/misc/lkdtm/bugs.c +++ b/drivers/misc/lkdtm/bugs.c @@ -8,6 +8,7 @@ #include "lkdtm.h" #include #include +#include #include #include #include @@ -100,11 +101,61 @@ static void lkdtm_PANIC_STOP_IRQOFF(void) stop_machine(panic_stop_irqoff_fn, &v, cpu_online_mask); } +static bool wait_for_panic; + +static enum hrtimer_restart panic_in_hardirq(struct hrtimer *timer) +{ + panic("from hard IRQ context"); + + wait_for_panic = false; + return HRTIMER_NORESTART; +} + +static void lkdtm_PANIC_IN_HARDIRQ(void) +{ + struct hrtimer timer; + + wait_for_panic = true; + hrtimer_setup_on_stack(&timer, panic_in_hardirq, + CLOCK_MONOTONIC, HRTIMER_MODE_REL_HARD); + hrtimer_start(&timer, us_to_ktime(100), HRTIMER_MODE_REL_HARD); + + while (wait_for_panic) + ; + + hrtimer_cancel(&timer); +} + static void lkdtm_BUG(void) { BUG(); } +static bool wait_for_bug; + +static enum hrtimer_restart bug_in_hardirq(struct hrtimer *timer) +{ + BUG(); + + wait_for_bug = false; + return HRTIMER_NORESTART; +} + +static void lkdtm_BUG_IN_HARDIRQ(void) +{ + struct hrtimer timer; + + wait_for_bug = true; + hrtimer_setup_on_stack(&timer, bug_in_hardirq, + CLOCK_MONOTONIC, HRTIMER_MODE_REL_HARD); + hrtimer_start(&timer, us_to_ktime(100), HRTIMER_MODE_REL_HARD); + + while (wait_for_bug) + ; + + hrtimer_cancel(&timer); +} + static int warn_counter; static void lkdtm_WARNING(void) @@ -696,7 +747,9 @@ static noinline void lkdtm_CORRUPT_PAC(void) static struct crashtype crashtypes[] = { CRASHTYPE(PANIC), CRASHTYPE(PANIC_STOP_IRQOFF), + CRASHTYPE(PANIC_IN_HARDIRQ), CRASHTYPE(BUG), + CRASHTYPE(BUG_IN_HARDIRQ), CRASHTYPE(WARNING), CRASHTYPE(WARNING_MESSAGE), CRASHTYPE(EXCEPTION), diff --git a/tools/testing/selftests/lkdtm/tests.txt b/tools/testing/selftests/lkdtm/tests.txt index cff124c1eddd3..67cd53715d932 100644 --- a/tools/testing/selftests/lkdtm/tests.txt +++ b/tools/testing/selftests/lkdtm/tests.txt @@ -1,6 +1,8 @@ #PANIC #PANIC_STOP_IRQOFF Crashes entire system +#PANIC_IN_HARDIRQ Crashes entire system BUG kernel BUG at +#BUG_IN_HARDIRQ Crashes entire system WARNING WARNING: WARNING_MESSAGE message trigger EXCEPTION From f0e472ef76f48c7dd33f75c8a368551eec4f0696 Mon Sep 17 00:00:00 2001 From: Ard Biesheuvel Date: Sat, 6 Dec 2025 20:01:17 +0100 Subject: [PATCH 0182/1321] arm64/efi: Remove unneeded SVE/SME fallback preserve/store handling Since commit 7137a203b251 ("arm64/fpsimd: Permit kernel mode NEON with IRQs off"), the only condition under which the fallback path is taken for FP/SIMD preserve/restore across a EFI runtime call is when it is called from hardirq or NMI context. In practice, this only happens when the EFI pstore driver is called to dump the kernel log buffer into a EFI variable under a panic, oops or emergency_restart() condition, and none of these can be expected to result in a return to user space for the task in question. This means that the existing EFI-specific logic for preserving and restoring SVE/SME state is pointless, and can be removed. Instead, kill the task, so that an exceedingly unlikely inadvertent return to user space does not proceed with a corrupted FP/SIMD state. Also, retain the preserve and restore of the base FP/SIMD state, as that might belong to kernel mode use of FP/SIMD. (Note that EFI runtime calls are never invoked reentrantly, even in this case, and so any interrupted kernel mode FP/SIMD usage will be unrelated to EFI) Signed-off-by: Ard Biesheuvel Signed-off-by: Catalin Marinas --- arch/arm64/kernel/fpsimd.c | 130 ++++++------------------------------- 1 file changed, 20 insertions(+), 110 deletions(-) diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c index c154f72634e02..9de1d8a604cbf 100644 --- a/arch/arm64/kernel/fpsimd.c +++ b/arch/arm64/kernel/fpsimd.c @@ -180,13 +180,6 @@ static inline void set_sve_default_vl(int val) set_default_vl(ARM64_VEC_SVE, val); } -static u8 *efi_sve_state; - -#else /* ! CONFIG_ARM64_SVE */ - -/* Dummy declaration for code that will be optimised out: */ -extern u8 *efi_sve_state; - #endif /* ! CONFIG_ARM64_SVE */ #ifdef CONFIG_ARM64_SME @@ -1095,36 +1088,6 @@ int vec_verify_vq_map(enum vec_type type) return 0; } -static void __init sve_efi_setup(void) -{ - int max_vl = 0; - int i; - - if (!IS_ENABLED(CONFIG_EFI)) - return; - - for (i = 0; i < ARRAY_SIZE(vl_info); i++) - max_vl = max(vl_info[i].max_vl, max_vl); - - /* - * alloc_percpu() warns and prints a backtrace if this goes wrong. - * This is evidence of a crippled system and we are returning void, - * so no attempt is made to handle this situation here. - */ - if (!sve_vl_valid(max_vl)) - goto fail; - - efi_sve_state = kmalloc(SVE_SIG_REGS_SIZE(sve_vq_from_vl(max_vl)), - GFP_KERNEL); - if (!efi_sve_state) - goto fail; - - return; - -fail: - panic("Cannot allocate memory for EFI SVE save/restore"); -} - void cpu_enable_sve(const struct arm64_cpu_capabilities *__always_unused p) { write_sysreg(read_sysreg(CPACR_EL1) | CPACR_EL1_ZEN_EL1EN, CPACR_EL1); @@ -1185,8 +1148,6 @@ void __init sve_setup(void) if (sve_max_virtualisable_vl() < sve_max_vl()) pr_warn("%s: unvirtualisable vector lengths present\n", info->name); - - sve_efi_setup(); } /* @@ -1947,9 +1908,6 @@ EXPORT_SYMBOL_GPL(kernel_neon_end); #ifdef CONFIG_EFI static struct user_fpsimd_state efi_fpsimd_state; -static bool efi_fpsimd_state_used; -static bool efi_sve_state_used; -static bool efi_sm_state; /* * EFI runtime services support functions @@ -1976,43 +1934,26 @@ void __efi_fpsimd_begin(void) if (may_use_simd()) { kernel_neon_begin(&efi_fpsimd_state); } else { - WARN_ON(preemptible()); - /* - * If !efi_sve_state, SVE can't be in use yet and doesn't need - * preserving: + * We are running in hardirq or NMI context, and the only + * legitimate case where this might happen is when EFI pstore + * is attempting to record the system's dying gasps into EFI + * variables. This could be due to an oops, a panic or a call + * to emergency_restart(), and in none of those cases, we can + * expect the current task to ever return to user space again, + * or for the kernel to resume any normal execution, for that + * matter (an oops in hardirq context triggers a panic too). + * + * Therefore, there is no point in attempting to preserve any + * SVE/SME state here. On the off chance that we might have + * ended up here for a different reason inadvertently, kill the + * task and preserve/restore the base FP/SIMD state, which + * might belong to kernel mode FP/SIMD. */ - if (system_supports_sve() && efi_sve_state != NULL) { - bool ffr = true; - u64 svcr; - - efi_sve_state_used = true; - - if (system_supports_sme()) { - svcr = read_sysreg_s(SYS_SVCR); - - efi_sm_state = svcr & SVCR_SM_MASK; - - /* - * Unless we have FA64 FFR does not - * exist in streaming mode. - */ - if (!system_supports_fa64()) - ffr = !(svcr & SVCR_SM_MASK); - } - - sve_save_state(efi_sve_state + sve_ffr_offset(sve_max_vl()), - &efi_fpsimd_state.fpsr, ffr); - - if (system_supports_sme()) - sysreg_clear_set_s(SYS_SVCR, - SVCR_SM_MASK, 0); - - } else { - fpsimd_save_state(&efi_fpsimd_state); - } - - efi_fpsimd_state_used = true; + pr_warn_ratelimited("Calling EFI runtime from %s context\n", + in_nmi() ? "NMI" : "hardirq"); + force_signal_inject(SIGKILL, SI_KERNEL, 0, 0); + fpsimd_save_state(&efi_fpsimd_state); } } @@ -2024,41 +1965,10 @@ void __efi_fpsimd_end(void) if (!system_supports_fpsimd()) return; - if (!efi_fpsimd_state_used) { + if (may_use_simd()) { kernel_neon_end(&efi_fpsimd_state); } else { - if (system_supports_sve() && efi_sve_state_used) { - bool ffr = true; - - /* - * Restore streaming mode; EFI calls are - * normal function calls so should not return in - * streaming mode. - */ - if (system_supports_sme()) { - if (efi_sm_state) { - sysreg_clear_set_s(SYS_SVCR, - 0, - SVCR_SM_MASK); - - /* - * Unless we have FA64 FFR does not - * exist in streaming mode. - */ - if (!system_supports_fa64()) - ffr = false; - } - } - - sve_load_state(efi_sve_state + sve_ffr_offset(sve_max_vl()), - &efi_fpsimd_state.fpsr, ffr); - - efi_sve_state_used = false; - } else { - fpsimd_load_state(&efi_fpsimd_state); - } - - efi_fpsimd_state_used = false; + fpsimd_load_state(&efi_fpsimd_state); } } From 7dea9aca604c0a01fac83a4df8fc4bf84145239d Mon Sep 17 00:00:00 2001 From: Mark Brown Date: Sat, 29 Nov 2025 00:48:45 +0000 Subject: [PATCH 0183/1321] arm64/gcs: Flush the GCS locking state on exec When we exec a new task we forget to flush the set of locked GCS mode bits. Since we do flush the rest of the state this means that if GCS is locked the new task will be unable to enable GCS, it will be locked as being disabled. Add the expected flush. Fixes: fc84bc5378a8 ("arm64/gcs: Context switch GCS state for EL0") Cc: # 6.13.x Reported-by: Yury Khrustalev Signed-off-by: Mark Brown Tested-by: Yury Khrustalev Signed-off-by: Catalin Marinas --- arch/arm64/kernel/process.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c index fba7ca102a8c4..489554931231e 100644 --- a/arch/arm64/kernel/process.c +++ b/arch/arm64/kernel/process.c @@ -292,6 +292,7 @@ static void flush_gcs(void) current->thread.gcs_base = 0; current->thread.gcs_size = 0; current->thread.gcs_el0_mode = 0; + current->thread.gcs_el0_locked = 0; write_sysreg_s(GCSCRE0_EL1_nTR, SYS_GCSCRE0_EL1); write_sysreg_s(0, SYS_GCSPR_EL0); } From ce24f49d1c23dc7e3eee4a08457600cfa2c0c822 Mon Sep 17 00:00:00 2001 From: Catalin Marinas Date: Fri, 19 Dec 2025 15:09:09 +0000 Subject: [PATCH 0184/1321] lkdtm/bugs: Do not confuse the clang/objtool with busy wait loop Since commit eb972eab0794 ("lkdtm/bugs: Add cases for BUG and PANIC occurring in hardirq context"), building with clang for x86_64 results in the following warnings: vmlinux.o: warning: objtool: lkdtm_PANIC_IN_HARDIRQ(): unexpected end of section .text.lkdtm_PANIC_IN_HARDIRQ vmlinux.o: warning: objtool: lkdtm_BUG_IN_HARDIRQ(): unexpected end of section .text.lkdtm_BUG_IN_HARDIRQ caused by busy "while (wait_for_...);" loops. Add READ_ONCE() and cpu_relax() to better indicate the intention and avoid any unwanted compiler optimisations. Fixes: eb972eab0794 ("lkdtm/bugs: Add cases for BUG and PANIC occurring in hardirq context") Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202512190111.jxFSqxUH-lkp@intel.com/ Signed-off-by: Catalin Marinas --- drivers/misc/lkdtm/bugs.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/misc/lkdtm/bugs.c b/drivers/misc/lkdtm/bugs.c index fa05d77acb558..502059078b456 100644 --- a/drivers/misc/lkdtm/bugs.c +++ b/drivers/misc/lkdtm/bugs.c @@ -120,8 +120,8 @@ static void lkdtm_PANIC_IN_HARDIRQ(void) CLOCK_MONOTONIC, HRTIMER_MODE_REL_HARD); hrtimer_start(&timer, us_to_ktime(100), HRTIMER_MODE_REL_HARD); - while (wait_for_panic) - ; + while (READ_ONCE(wait_for_panic)) + cpu_relax(); hrtimer_cancel(&timer); } @@ -150,8 +150,8 @@ static void lkdtm_BUG_IN_HARDIRQ(void) CLOCK_MONOTONIC, HRTIMER_MODE_REL_HARD); hrtimer_start(&timer, us_to_ktime(100), HRTIMER_MODE_REL_HARD); - while (wait_for_bug) - ; + while (READ_ONCE(wait_for_bug)) + cpu_relax(); hrtimer_cancel(&timer); } From 3d401b8233dca76e028725897d99a0f9cbc15fea Mon Sep 17 00:00:00 2001 From: Thierry Reding Date: Wed, 29 Oct 2025 16:03:16 +0100 Subject: [PATCH 0185/1321] MIPS: Alchemy: Remove bogus static/inline specifiers The recent io_remap_pfn_range() rework applied the static and inline specifiers to the implementation of io_remap_pfn_range_pfn() on MIPS Alchemy, mirroring the same change on other platforms. However, this function is defined in a source file and that definition causes a conflict with its declaration. Fix this by dropping the specifiers. Fixes: c707a68f9468 ("mm: abstract io_remap_pfn_range() based on PFN") Signed-off-by: Thierry Reding Acked-by: Thomas Bogendoerfer Tested-by: Florian Fainelli Reviewed-by: Florian Fainelli Signed-off-by: Thomas Bogendoerfer --- arch/mips/alchemy/common/setup.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/arch/mips/alchemy/common/setup.c b/arch/mips/alchemy/common/setup.c index c35b4f809d512..992134a8c23ae 100644 --- a/arch/mips/alchemy/common/setup.c +++ b/arch/mips/alchemy/common/setup.c @@ -94,8 +94,7 @@ phys_addr_t fixup_bigphys_addr(phys_addr_t phys_addr, phys_addr_t size) return phys_addr; } -static inline unsigned long io_remap_pfn_range_pfn(unsigned long pfn, - unsigned long size) +unsigned long io_remap_pfn_range_pfn(unsigned long pfn, unsigned long size) { phys_addr_t phys_addr = fixup_bigphys_addr(pfn << PAGE_SHIFT, size); From a963cb2ec1891a7152fdbcd48ae71a6187d24e1d Mon Sep 17 00:00:00 2001 From: Haoxiang Li Date: Thu, 4 Dec 2025 18:36:18 +0800 Subject: [PATCH 0186/1321] MIPS: Fix a reference leak bug in ip22_check_gio() If gio_device_register fails, gio_dev_put() is required to drop the gio_dev device reference. Fixes: e84de0c61905 ("MIPS: GIO bus support for SGI IP22/28") Signed-off-by: Haoxiang Li Signed-off-by: Thomas Bogendoerfer --- arch/mips/sgi-ip22/ip22-gio.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/mips/sgi-ip22/ip22-gio.c b/arch/mips/sgi-ip22/ip22-gio.c index 5893ea4e382ca..19b70928d6dc3 100644 --- a/arch/mips/sgi-ip22/ip22-gio.c +++ b/arch/mips/sgi-ip22/ip22-gio.c @@ -372,7 +372,8 @@ static void ip22_check_gio(int slotno, unsigned long addr, int irq) gio_dev->resource.flags = IORESOURCE_MEM; gio_dev->irq = irq; dev_set_name(&gio_dev->dev, "%d", slotno); - gio_device_register(gio_dev); + if (gio_device_register(gio_dev)) + gio_dev_put(gio_dev); } else printk(KERN_INFO "GIO: slot %d : Empty\n", slotno); } From 2ad1d1a071a9a4fd50cd407623301aebbe8d9bd8 Mon Sep 17 00:00:00 2001 From: Jianpeng Chang Date: Fri, 5 Dec 2025 09:59:34 +0800 Subject: [PATCH 0187/1321] arm64: kdump: Fix elfcorehdr overlap caused by reserved memory processing reorder Commit 8a6e02d0c00e ("of: reserved_mem: Restructure how the reserved memory regions are processed") changed the processing order of reserved memory regions, causing elfcorehdr to overlap with dynamically allocated reserved memory regions during kdump kernel boot. The issue occurs because: 1. kexec-tools allocates elfcorehdr in the last crashkernel reserved memory region and passes it to the second kernel 2. The problematic commit moved dynamic reserved memory allocation (like bman-fbpr) to occur during fdt_scan_reserved_mem(), before elfcorehdr reservation in fdt_reserve_elfcorehdr() 3. bman-fbpr with 16MB alignment requirement can get allocated at addresses that overlap with the elfcorehdr location 4. When fdt_reserve_elfcorehdr() tries to reserve elfcorehdr memory, overlap detection identifies the conflict and skips reservation 5. kdump kernel fails with "Unable to handle kernel paging request" because elfcorehdr memory is not properly reserved The boot log: Before 8a6e02d0c00e: OF: fdt: Reserving 1 KiB of memory at 0xf4fff000 for elfcorehdr OF: reserved mem: 0xf3000000..0xf3ffffff bman-fbpr After 8a6e02d0c00e: OF: reserved mem: 0xf4000000..0xf4ffffff bman-fbpr OF: fdt: elfcorehdr is overlapped Fix this by ensuring elfcorehdr reservation occurs before dynamic reserved memory allocation. Fixes: 8a6e02d0c00e ("of: reserved_mem: Restructure how the reserved memory regions are processed") Signed-off-by: Jianpeng Chang Link: https://patch.msgid.link/20251205015934.700016-1-jianpeng.chang.cn@windriver.com Signed-off-by: Rob Herring (Arm) --- drivers/of/fdt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c index d378d4b4109f5..331646d667b9b 100644 --- a/drivers/of/fdt.c +++ b/drivers/of/fdt.c @@ -503,8 +503,8 @@ void __init early_init_fdt_scan_reserved_mem(void) if (!initial_boot_params) return; - fdt_scan_reserved_mem(); fdt_reserve_elfcorehdr(); + fdt_scan_reserved_mem(); /* Process header /memreserve/ fields */ for (n = 0; ; n++) { From 3dec5c31e9e4ca0a790a7e76b2893afb635e06fa Mon Sep 17 00:00:00 2001 From: Krzysztof Kozlowski Date: Sat, 15 Nov 2025 13:21:21 +0100 Subject: [PATCH 0188/1321] dt-bindings: display/ti: Simplify dma-coherent property Common boolean properties need to be only allowed in the binding (":true"), because their type is already defined by core DT schema. Simplify dma-coherent property to match common syntax. Signed-off-by: Krzysztof Kozlowski Link: https://patch.msgid.link/20251115122120.35315-4-krzk@kernel.org Signed-off-by: Rob Herring (Arm) --- Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml | 3 +-- Documentation/devicetree/bindings/display/ti/ti,j721e-dss.yaml | 3 +-- 2 files changed, 2 insertions(+), 4 deletions(-) diff --git a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml index 361e9cae6896c..38fcee91211e8 100644 --- a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml +++ b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml @@ -84,8 +84,7 @@ properties: maxItems: 1 description: phandle to the associated power domain - dma-coherent: - type: boolean + dma-coherent: true ports: $ref: /schemas/graph.yaml#/properties/ports diff --git a/Documentation/devicetree/bindings/display/ti/ti,j721e-dss.yaml b/Documentation/devicetree/bindings/display/ti/ti,j721e-dss.yaml index fad7cba58d39a..65ae8a1c39986 100644 --- a/Documentation/devicetree/bindings/display/ti/ti,j721e-dss.yaml +++ b/Documentation/devicetree/bindings/display/ti/ti,j721e-dss.yaml @@ -103,8 +103,7 @@ properties: maxItems: 1 description: phandle to the associated power domain - dma-coherent: - type: boolean + dma-coherent: true ports: $ref: /schemas/graph.yaml#/properties/ports From 4b64e9eae3dab2122f785603476597448f7edccc Mon Sep 17 00:00:00 2001 From: "Rob Herring (Arm)" Date: Wed, 29 Oct 2025 10:56:13 -0500 Subject: [PATCH 0189/1321] dt-bindings: clock: sprd,sc9860-clk: Allow "reg" for gate clocks The gate bindings have an artificial split between a "syscon" and clock provider node. Allow "reg" properties so this split can be removed. Reviewed-by: Chunyan Zhang Link: https://patch.msgid.link/20251029155615.1167903-1-robh@kernel.org Signed-off-by: Rob Herring (Arm) --- .../bindings/clock/sprd,sc9860-clk.yaml | 26 ------------------- 1 file changed, 26 deletions(-) diff --git a/Documentation/devicetree/bindings/clock/sprd,sc9860-clk.yaml b/Documentation/devicetree/bindings/clock/sprd,sc9860-clk.yaml index 502cd723511fa..b131390207d6d 100644 --- a/Documentation/devicetree/bindings/clock/sprd,sc9860-clk.yaml +++ b/Documentation/devicetree/bindings/clock/sprd,sc9860-clk.yaml @@ -114,25 +114,6 @@ allOf: - reg properties: sprd,syscon: false - - if: - properties: - compatible: - contains: - enum: - - sprd,sc9860-agcp-gate - - sprd,sc9860-aon-gate - - sprd,sc9860-apahb-gate - - sprd,sc9860-apapb-gate - - sprd,sc9860-cam-gate - - sprd,sc9860-disp-gate - - sprd,sc9860-pll - - sprd,sc9860-pmu-gate - - sprd,sc9860-vsp-gate - then: - required: - - sprd,syscon - properties: - reg: false additionalProperties: false @@ -142,13 +123,6 @@ examples: #address-cells = <2>; #size-cells = <2>; - pmu-gate { - compatible = "sprd,sc9860-pmu-gate"; - clocks = <&ext_26m>; - #clock-cells = <1>; - sprd,syscon = <&pmu_regs>; - }; - clock-controller@20000000 { compatible = "sprd,sc9860-ap-clk"; reg = <0 0x20000000 0 0x400>; From 5313ba825de04a674897638e44b0e6afe73132e3 Mon Sep 17 00:00:00 2001 From: Krzysztof Kozlowski Date: Wed, 10 Dec 2025 06:17:19 +0100 Subject: [PATCH 0190/1321] cpufreq: dt-platdev: Fix creating device on OPPv1 platforms Commit 6ea891a6dd37 ("cpufreq: dt-platdev: Simplify with of_machine_get_match_data()") broke several platforms which did not have OPPv2 proprety, because it incorrectly checked for device match data after first matching from "allowlist". Almost all of "allowlist" match entries do not have match data and it is expected to create platform device for them with empty data. Fix this by first checking if platform is on the allowlist with of_machine_device_match() and only then taking the match data. This duplicates the number of checks (we match against the allowlist twice), but makes the code here much smaller. Reported-by: Geert Uytterhoeven Closes: https://lore.kernel.org/all/CAMuHMdVJD4+J9QpUUs-sX0feKfuPD72CO0dcqN7shvF_UYpZ3Q@mail.gmail.com/ Reported-by: Pavel Pisa Closes: https://lore.kernel.org/all/6hnk7llbwdezh74h74fhvofbx4t4jihel5kvr6qwx2xuxxbjys@rmwbd7lkhrdz/ Fixes: 6ea891a6dd37 ("cpufreq: dt-platdev: Simplify with of_machine_get_match_data()") Signed-off-by: Krzysztof Kozlowski Tested-by: Pavel Pisa Acked-by: Viresh Kumar Link: https://patch.msgid.link/20251210051718.132795-2-krzysztof.kozlowski@oss.qualcomm.com Signed-off-by: Rob Herring (Arm) --- drivers/cpufreq/cpufreq-dt-platdev.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/cpufreq/cpufreq-dt-platdev.c b/drivers/cpufreq/cpufreq-dt-platdev.c index a1d11ecd1ac86..b06a43143d23c 100644 --- a/drivers/cpufreq/cpufreq-dt-platdev.c +++ b/drivers/cpufreq/cpufreq-dt-platdev.c @@ -219,11 +219,12 @@ static bool __init cpu0_node_has_opp_v2_prop(void) static int __init cpufreq_dt_platdev_init(void) { - const void *data; + const void *data = NULL; - data = of_machine_get_match_data(allowlist); - if (data) + if (of_machine_device_match(allowlist)) { + data = of_machine_get_match_data(allowlist); goto create_pdev; + } if (cpu0_node_has_opp_v2_prop() && !of_machine_device_match(blocklist)) goto create_pdev; From ea74da38c24fba647a66cc42a14d55cd5597f0d8 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Niklas=20S=C3=B6derlund?= Date: Thu, 6 Nov 2025 22:23:41 +0100 Subject: [PATCH 0191/1321] dt-bindings: gpu: img,powervr-rogue: Document GE7800 GPU in Renesas R-Car V3U MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Document Imagination Technologies PowerVR Rogue GE7800 BNVC 15.5.1.64 present in Renesas R-Car R8A779A0 V3U SoC. Signed-off-by: Niklas Söderlund Reviewed-by: Marek Vasut Reviewed-by: Matt Coster Reviewed-by: Geert Uytterhoeven Acked-by: Conor Dooley Link: https://patch.msgid.link/20251106212342.2771579-2-niklas.soderlund+renesas@ragnatech.se Signed-off-by: Rob Herring (Arm) --- Documentation/devicetree/bindings/gpu/img,powervr-rogue.yaml | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/Documentation/devicetree/bindings/gpu/img,powervr-rogue.yaml b/Documentation/devicetree/bindings/gpu/img,powervr-rogue.yaml index 225a6e1b7fcd3..86ef689853177 100644 --- a/Documentation/devicetree/bindings/gpu/img,powervr-rogue.yaml +++ b/Documentation/devicetree/bindings/gpu/img,powervr-rogue.yaml @@ -20,7 +20,9 @@ properties: - const: img,img-gx6250 - const: img,img-rogue - items: - - const: renesas,r8a77965-gpu + - enum: + - renesas,r8a77965-gpu + - renesas,r8a779a0-gpu - const: img,img-ge7800 - const: img,img-rogue - items: From d40f41b46e26a5a52e313aafd1c97d5a370bbcc3 Mon Sep 17 00:00:00 2001 From: Linus Walleij Date: Tue, 16 Dec 2025 10:52:10 +0100 Subject: [PATCH 0192/1321] dt-bindings: Updates Linus Walleij's mail address My name is stamped into maintainership for a big slew of DT bindings. Now that it is changing, switch it over to my kernel.org mail address, which will hopefully be stable for the rest of my life. Signed-off-by: Linus Walleij Link: https://patch.msgid.link/20251216-maintainers-dt-v1-1-0b5ab102c9bb@kernel.org Signed-off-by: Rob Herring (Arm) --- Documentation/devicetree/bindings/arm/arm,integrator.yaml | 2 +- Documentation/devicetree/bindings/arm/arm,realview.yaml | 2 +- Documentation/devicetree/bindings/arm/arm,scu.yaml | 2 +- Documentation/devicetree/bindings/arm/arm,versatile-sysreg.yaml | 2 +- Documentation/devicetree/bindings/arm/arm,versatile.yaml | 2 +- Documentation/devicetree/bindings/arm/arm,vexpress-juno.yaml | 2 +- Documentation/devicetree/bindings/arm/gemini.yaml | 2 +- Documentation/devicetree/bindings/arm/intel-ixp4xx.yaml | 2 +- Documentation/devicetree/bindings/arm/ux500.yaml | 2 +- Documentation/devicetree/bindings/ata/ata-generic.yaml | 2 +- .../devicetree/bindings/ata/cortina,gemini-sata-bridge.yaml | 2 +- Documentation/devicetree/bindings/ata/faraday,ftide010.yaml | 2 +- .../devicetree/bindings/ata/intel,ixp4xx-compact-flash.yaml | 2 +- Documentation/devicetree/bindings/ata/pata-common.yaml | 2 +- Documentation/devicetree/bindings/ata/sata-common.yaml | 2 +- .../devicetree/bindings/auxdisplay/arm,versatile-lcd.yaml | 2 +- .../devicetree/bindings/clock/stericsson,u8500-clks.yaml | 2 +- .../devicetree/bindings/crypto/intel,ixp4xx-crypto.yaml | 2 +- Documentation/devicetree/bindings/display/dsi-controller.yaml | 2 +- Documentation/devicetree/bindings/display/faraday,tve200.yaml | 2 +- .../devicetree/bindings/display/panel/arm,rtsm-display.yaml | 2 +- .../bindings/display/panel/arm,versatile-tft-panel.yaml | 2 +- .../devicetree/bindings/display/panel/ilitek,ili9322.yaml | 2 +- .../devicetree/bindings/display/panel/novatek,nt35510.yaml | 2 +- .../devicetree/bindings/display/panel/samsung,lms380kf01.yaml | 2 +- .../devicetree/bindings/display/panel/samsung,lms397kf04.yaml | 2 +- .../devicetree/bindings/display/panel/samsung,s6d16d0.yaml | 2 +- .../devicetree/bindings/display/panel/sony,acx424akp.yaml | 2 +- Documentation/devicetree/bindings/display/panel/ti,nspire.yaml | 2 +- Documentation/devicetree/bindings/display/panel/tpo,tpg110.yaml | 2 +- Documentation/devicetree/bindings/display/ste,mcde.yaml | 2 +- Documentation/devicetree/bindings/dma/stericsson,dma40.yaml | 2 +- Documentation/devicetree/bindings/extcon/fcs,fsa880.yaml | 2 +- .../firmware/intel,ixp4xx-network-processing-engine.yaml | 2 +- Documentation/devicetree/bindings/gnss/brcm,bcm4751.yaml | 2 +- Documentation/devicetree/bindings/gpio/faraday,ftgpio010.yaml | 2 +- .../devicetree/bindings/gpio/gpio-consumer-common.yaml | 2 +- Documentation/devicetree/bindings/gpio/gpio-ep9301.yaml | 2 +- Documentation/devicetree/bindings/gpio/gpio-mmio.yaml | 2 +- Documentation/devicetree/bindings/gpio/intel,ixp4xx-gpio.yaml | 2 +- Documentation/devicetree/bindings/gpio/mrvl-gpio.yaml | 2 +- Documentation/devicetree/bindings/gpio/pl061-gpio.yaml | 2 +- Documentation/devicetree/bindings/gpio/st,nomadik-gpio.yaml | 2 +- Documentation/devicetree/bindings/gpio/st,stmpe-gpio.yaml | 2 +- Documentation/devicetree/bindings/hwmon/ntc-thermistor.yaml | 2 +- Documentation/devicetree/bindings/hwmon/winbond,w83781d.yaml | 2 +- Documentation/devicetree/bindings/i2c/arm,i2c-versatile.yaml | 2 +- Documentation/devicetree/bindings/i2c/st,nomadik-i2c.yaml | 2 +- Documentation/devicetree/bindings/iio/accel/bosch,bma255.yaml | 2 +- Documentation/devicetree/bindings/iio/adc/qcom,pm8018-adc.yaml | 2 +- .../devicetree/bindings/iio/gyroscope/invensense,mpu3050.yaml | 2 +- Documentation/devicetree/bindings/iio/light/capella,cm3605.yaml | 2 +- Documentation/devicetree/bindings/iio/light/sharp,gp2ap002.yaml | 2 +- .../bindings/iio/magnetometer/asahi-kasei,ak8974.yaml | 2 +- .../devicetree/bindings/iio/magnetometer/yamaha,yas530.yaml | 2 +- Documentation/devicetree/bindings/iio/st,st-sensors.yaml | 2 +- Documentation/devicetree/bindings/input/atmel,maxtouch.yaml | 2 +- .../bindings/input/touchscreen/cypress,cy8ctma140.yaml | 2 +- .../bindings/input/touchscreen/cypress,cy8ctma340.yaml | 2 +- .../devicetree/bindings/input/touchscreen/melfas,mms114.yaml | 2 +- .../devicetree/bindings/input/touchscreen/zinitix,bt400.yaml | 2 +- .../bindings/interrupt-controller/arm,versatile-fpga-irq.yaml | 2 +- .../bindings/interrupt-controller/faraday,ftintc010.yaml | 2 +- .../bindings/interrupt-controller/intel,ixp4xx-interrupt.yaml | 2 +- .../devicetree/bindings/leds/backlight/kinetic,ktd253.yaml | 2 +- Documentation/devicetree/bindings/leds/register-bit-led.yaml | 2 +- Documentation/devicetree/bindings/leds/regulator-led.yaml | 2 +- Documentation/devicetree/bindings/leds/richtek,rt8515.yaml | 2 +- .../intel,ixp4xx-expansion-bus-controller.yaml | 2 +- .../intel,ixp4xx-expansion-peripheral-props.yaml | 2 +- .../devicetree/bindings/mfd/arm,dev-platforms-syscon.yaml | 2 +- Documentation/devicetree/bindings/mfd/st,stmpe.yaml | 2 +- Documentation/devicetree/bindings/mfd/stericsson,ab8500.yaml | 2 +- .../devicetree/bindings/mfd/stericsson,db8500-prcmu.yaml | 2 +- .../bindings/misc/intel,ixp4xx-ahb-queue-manager.yaml | 2 +- Documentation/devicetree/bindings/mmc/arm,pl18x.yaml | 2 +- .../bindings/mtd/partitions/arm,arm-firmware-suite.yaml | 2 +- .../devicetree/bindings/mtd/partitions/redboot-fis.yaml | 2 +- Documentation/devicetree/bindings/mtd/partitions/seama.yaml | 2 +- .../devicetree/bindings/net/bluetooth/brcm,bluetooth.yaml | 2 +- .../devicetree/bindings/net/cortina,gemini-ethernet.yaml | 2 +- Documentation/devicetree/bindings/net/dsa/micrel,ks8995.yaml | 2 +- Documentation/devicetree/bindings/net/dsa/realtek.yaml | 2 +- Documentation/devicetree/bindings/net/dsa/vitesse,vsc73xx.yaml | 2 +- .../devicetree/bindings/net/intel,ixp46x-ptp-timer.yaml | 2 +- .../devicetree/bindings/net/intel,ixp4xx-ethernet.yaml | 2 +- Documentation/devicetree/bindings/net/intel,ixp4xx-hss.yaml | 2 +- Documentation/devicetree/bindings/pci/faraday,ftpci100.yaml | 2 +- Documentation/devicetree/bindings/pci/intel,ixp4xx-pci.yaml | 2 +- Documentation/devicetree/bindings/pci/v3,v360epc-pci.yaml | 2 +- Documentation/devicetree/bindings/pinctrl/pincfg-node.yaml | 2 +- Documentation/devicetree/bindings/pinctrl/pinctrl.yaml | 2 +- Documentation/devicetree/bindings/pinctrl/pinmux-node.yaml | 2 +- .../devicetree/bindings/power/supply/samsung,battery.yaml | 2 +- Documentation/devicetree/bindings/rng/intel,ixp46x-rng.yaml | 2 +- Documentation/devicetree/bindings/rtc/faraday,ftrtc010.yaml | 2 +- .../devicetree/bindings/spi/arm,pl022-peripheral-props.yaml | 2 +- Documentation/devicetree/bindings/spi/spi-pl022.yaml | 2 +- Documentation/devicetree/bindings/timer/faraday,fttmr010.yaml | 2 +- Documentation/devicetree/bindings/timer/intel,ixp4xx-timer.yaml | 2 +- Documentation/devicetree/bindings/timer/st,nomadik-mtu.yaml | 2 +- Documentation/devicetree/bindings/usb/faraday,fotg210.yaml | 2 +- Documentation/devicetree/bindings/usb/intel,ixp4xx-udc.yaml | 2 +- .../devicetree/bindings/watchdog/faraday,ftwdt010.yaml | 2 +- Documentation/devicetree/bindings/watchdog/maxim,max63xx.yaml | 2 +- 105 files changed, 105 insertions(+), 105 deletions(-) diff --git a/Documentation/devicetree/bindings/arm/arm,integrator.yaml b/Documentation/devicetree/bindings/arm/arm,integrator.yaml index 1bdbd1b7ee381..8fe22185a3376 100644 --- a/Documentation/devicetree/bindings/arm/arm,integrator.yaml +++ b/Documentation/devicetree/bindings/arm/arm,integrator.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: ARM Integrator Boards maintainers: - - Linus Walleij + - Linus Walleij description: |+ These were the first ARM platforms officially supported by ARM Ltd. diff --git a/Documentation/devicetree/bindings/arm/arm,realview.yaml b/Documentation/devicetree/bindings/arm/arm,realview.yaml index 3c5f1688dbd78..0b3133ecddac1 100644 --- a/Documentation/devicetree/bindings/arm/arm,realview.yaml +++ b/Documentation/devicetree/bindings/arm/arm,realview.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: ARM RealView Boards maintainers: - - Linus Walleij + - Linus Walleij description: |+ The ARM RealView series of reference designs were built to explore the Arm11, diff --git a/Documentation/devicetree/bindings/arm/arm,scu.yaml b/Documentation/devicetree/bindings/arm/arm,scu.yaml index dae2aa27e641b..f735b7fb8e1cc 100644 --- a/Documentation/devicetree/bindings/arm/arm,scu.yaml +++ b/Documentation/devicetree/bindings/arm/arm,scu.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: ARM Snoop Control Unit (SCU) maintainers: - - Linus Walleij + - Linus Walleij description: | As part of the MPCore complex, Cortex-A5 and Cortex-A9 are provided diff --git a/Documentation/devicetree/bindings/arm/arm,versatile-sysreg.yaml b/Documentation/devicetree/bindings/arm/arm,versatile-sysreg.yaml index 3b060c36b90cd..e72dc45c1afa7 100644 --- a/Documentation/devicetree/bindings/arm/arm,versatile-sysreg.yaml +++ b/Documentation/devicetree/bindings/arm/arm,versatile-sysreg.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Arm Versatile system registers maintainers: - - Linus Walleij + - Linus Walleij description: This is a system control registers block, providing multiple low level diff --git a/Documentation/devicetree/bindings/arm/arm,versatile.yaml b/Documentation/devicetree/bindings/arm/arm,versatile.yaml index 7a3caf6af200a..c777e455d0388 100644 --- a/Documentation/devicetree/bindings/arm/arm,versatile.yaml +++ b/Documentation/devicetree/bindings/arm/arm,versatile.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: ARM Versatile Boards maintainers: - - Linus Walleij + - Linus Walleij description: |+ The ARM Versatile boards are two variants of ARM926EJ-S evaluation boards diff --git a/Documentation/devicetree/bindings/arm/arm,vexpress-juno.yaml b/Documentation/devicetree/bindings/arm/arm,vexpress-juno.yaml index 4cdca53205444..6430218ba1cea 100644 --- a/Documentation/devicetree/bindings/arm/arm,vexpress-juno.yaml +++ b/Documentation/devicetree/bindings/arm/arm,vexpress-juno.yaml @@ -8,7 +8,7 @@ title: ARM Versatile Express and Juno Boards maintainers: - Sudeep Holla - - Linus Walleij + - Linus Walleij description: |+ ARM's Versatile Express platform were built as reference designs for exploring diff --git a/Documentation/devicetree/bindings/arm/gemini.yaml b/Documentation/devicetree/bindings/arm/gemini.yaml index f6a0b675830fb..fc092962ab565 100644 --- a/Documentation/devicetree/bindings/arm/gemini.yaml +++ b/Documentation/devicetree/bindings/arm/gemini.yaml @@ -20,7 +20,7 @@ description: | Many of the IP blocks used in the SoC comes from Faraday Technology. maintainers: - - Linus Walleij + - Linus Walleij properties: $nodename: diff --git a/Documentation/devicetree/bindings/arm/intel-ixp4xx.yaml b/Documentation/devicetree/bindings/arm/intel-ixp4xx.yaml index b7b430896596a..0f1bf634a98a1 100644 --- a/Documentation/devicetree/bindings/arm/intel-ixp4xx.yaml +++ b/Documentation/devicetree/bindings/arm/intel-ixp4xx.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Intel IXP4xx maintainers: - - Linus Walleij + - Linus Walleij properties: $nodename: diff --git a/Documentation/devicetree/bindings/arm/ux500.yaml b/Documentation/devicetree/bindings/arm/ux500.yaml index b42d20fa43596..3a8611e5786e5 100644 --- a/Documentation/devicetree/bindings/arm/ux500.yaml +++ b/Documentation/devicetree/bindings/arm/ux500.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Ux500 platforms maintainers: - - Linus Walleij + - Linus Walleij properties: $nodename: diff --git a/Documentation/devicetree/bindings/ata/ata-generic.yaml b/Documentation/devicetree/bindings/ata/ata-generic.yaml index 0697927f3d7e6..9da341ea091e2 100644 --- a/Documentation/devicetree/bindings/ata/ata-generic.yaml +++ b/Documentation/devicetree/bindings/ata/ata-generic.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Generic Parallel ATA Controller maintainers: - - Linus Walleij + - Linus Walleij description: Generic Parallel ATA controllers supporting PIO modes only. diff --git a/Documentation/devicetree/bindings/ata/cortina,gemini-sata-bridge.yaml b/Documentation/devicetree/bindings/ata/cortina,gemini-sata-bridge.yaml index 5290936665084..66de6d4769c12 100644 --- a/Documentation/devicetree/bindings/ata/cortina,gemini-sata-bridge.yaml +++ b/Documentation/devicetree/bindings/ata/cortina,gemini-sata-bridge.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Cortina Systems Gemini SATA Bridge maintainers: - - Linus Walleij + - Linus Walleij description: | The Gemini SATA bridge in a SoC-internal PATA to SATA bridge that diff --git a/Documentation/devicetree/bindings/ata/faraday,ftide010.yaml b/Documentation/devicetree/bindings/ata/faraday,ftide010.yaml index fa16f3767c6a5..32e11d8a0a3b0 100644 --- a/Documentation/devicetree/bindings/ata/faraday,ftide010.yaml +++ b/Documentation/devicetree/bindings/ata/faraday,ftide010.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Faraday Technology FTIDE010 PATA controller maintainers: - - Linus Walleij + - Linus Walleij description: | This controller is the first Faraday IDE interface block, used in the diff --git a/Documentation/devicetree/bindings/ata/intel,ixp4xx-compact-flash.yaml b/Documentation/devicetree/bindings/ata/intel,ixp4xx-compact-flash.yaml index 378692010c561..894a8b9eb910b 100644 --- a/Documentation/devicetree/bindings/ata/intel,ixp4xx-compact-flash.yaml +++ b/Documentation/devicetree/bindings/ata/intel,ixp4xx-compact-flash.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Intel IXP4xx CompactFlash Card Controller maintainers: - - Linus Walleij + - Linus Walleij description: | The IXP4xx network processors have a CompactFlash interface that presents diff --git a/Documentation/devicetree/bindings/ata/pata-common.yaml b/Documentation/devicetree/bindings/ata/pata-common.yaml index 4e867dd4d402b..cee4bb7eb0b9c 100644 --- a/Documentation/devicetree/bindings/ata/pata-common.yaml +++ b/Documentation/devicetree/bindings/ata/pata-common.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Common Properties for Parallel AT attachment (PATA) controllers maintainers: - - Linus Walleij + - Linus Walleij description: | This document defines device tree properties common to most Parallel diff --git a/Documentation/devicetree/bindings/ata/sata-common.yaml b/Documentation/devicetree/bindings/ata/sata-common.yaml index 58c9342b99255..667f48c331959 100644 --- a/Documentation/devicetree/bindings/ata/sata-common.yaml +++ b/Documentation/devicetree/bindings/ata/sata-common.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Common Properties for Serial AT attachment (SATA) controllers maintainers: - - Linus Walleij + - Linus Walleij description: | This document defines device tree properties common to most Serial diff --git a/Documentation/devicetree/bindings/auxdisplay/arm,versatile-lcd.yaml b/Documentation/devicetree/bindings/auxdisplay/arm,versatile-lcd.yaml index 439f7b811a94a..51d68a778b5cb 100644 --- a/Documentation/devicetree/bindings/auxdisplay/arm,versatile-lcd.yaml +++ b/Documentation/devicetree/bindings/auxdisplay/arm,versatile-lcd.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: ARM Versatile Character LCD maintainers: - - Linus Walleij + - Linus Walleij - Rob Herring description: diff --git a/Documentation/devicetree/bindings/clock/stericsson,u8500-clks.yaml b/Documentation/devicetree/bindings/clock/stericsson,u8500-clks.yaml index 2150307219a0c..4ebfa5a8d5242 100644 --- a/Documentation/devicetree/bindings/clock/stericsson,u8500-clks.yaml +++ b/Documentation/devicetree/bindings/clock/stericsson,u8500-clks.yaml @@ -8,7 +8,7 @@ title: ST-Ericsson DB8500 (U8500) clocks maintainers: - Ulf Hansson - - Linus Walleij + - Linus Walleij description: While named "U8500 clocks" these clocks are inside the DB8500 digital baseband system-on-chip and its siblings such as diff --git a/Documentation/devicetree/bindings/crypto/intel,ixp4xx-crypto.yaml b/Documentation/devicetree/bindings/crypto/intel,ixp4xx-crypto.yaml index a4006237aa89f..fd20b8197207a 100644 --- a/Documentation/devicetree/bindings/crypto/intel,ixp4xx-crypto.yaml +++ b/Documentation/devicetree/bindings/crypto/intel,ixp4xx-crypto.yaml @@ -8,7 +8,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Intel IXP4xx cryptographic engine maintainers: - - Linus Walleij + - Linus Walleij description: | The Intel IXP4xx cryptographic engine makes use of the IXP4xx NPE diff --git a/Documentation/devicetree/bindings/display/dsi-controller.yaml b/Documentation/devicetree/bindings/display/dsi-controller.yaml index bb4d6e9e7d0ca..850b86fe03ccb 100644 --- a/Documentation/devicetree/bindings/display/dsi-controller.yaml +++ b/Documentation/devicetree/bindings/display/dsi-controller.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Common Properties for DSI Display Panels maintainers: - - Linus Walleij + - Linus Walleij description: | This document defines device tree properties common to DSI, Display diff --git a/Documentation/devicetree/bindings/display/faraday,tve200.yaml b/Documentation/devicetree/bindings/display/faraday,tve200.yaml index e2ee777673211..b09628b69177c 100644 --- a/Documentation/devicetree/bindings/display/faraday,tve200.yaml +++ b/Documentation/devicetree/bindings/display/faraday,tve200.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Faraday TV Encoder TVE200 maintainers: - - Linus Walleij + - Linus Walleij properties: compatible: diff --git a/Documentation/devicetree/bindings/display/panel/arm,rtsm-display.yaml b/Documentation/devicetree/bindings/display/panel/arm,rtsm-display.yaml index 4ad484f09ba3a..fc04558fcc8dd 100644 --- a/Documentation/devicetree/bindings/display/panel/arm,rtsm-display.yaml +++ b/Documentation/devicetree/bindings/display/panel/arm,rtsm-display.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Arm RTSM Virtual Platforms Display maintainers: - - Linus Walleij + - Linus Walleij allOf: - $ref: panel-common.yaml# diff --git a/Documentation/devicetree/bindings/display/panel/arm,versatile-tft-panel.yaml b/Documentation/devicetree/bindings/display/panel/arm,versatile-tft-panel.yaml index c9958f824d9ab..b6c18e7283cd9 100644 --- a/Documentation/devicetree/bindings/display/panel/arm,versatile-tft-panel.yaml +++ b/Documentation/devicetree/bindings/display/panel/arm,versatile-tft-panel.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: ARM Versatile TFT Panels maintainers: - - Linus Walleij + - Linus Walleij description: | These panels are connected to the daughterboards found on the diff --git a/Documentation/devicetree/bindings/display/panel/ilitek,ili9322.yaml b/Documentation/devicetree/bindings/display/panel/ilitek,ili9322.yaml index 44423465f6e35..4bdc33d12306b 100644 --- a/Documentation/devicetree/bindings/display/panel/ilitek,ili9322.yaml +++ b/Documentation/devicetree/bindings/display/panel/ilitek,ili9322.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Ilitek ILI9322 TFT panel driver with SPI control bus maintainers: - - Linus Walleij + - Linus Walleij description: | This is a driver for 320x240 TFT panels, accepting a variety of input diff --git a/Documentation/devicetree/bindings/display/panel/novatek,nt35510.yaml b/Documentation/devicetree/bindings/display/panel/novatek,nt35510.yaml index bb50fd5506c3d..b39fd0c5a48ad 100644 --- a/Documentation/devicetree/bindings/display/panel/novatek,nt35510.yaml +++ b/Documentation/devicetree/bindings/display/panel/novatek,nt35510.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Novatek NT35510-based display panels maintainers: - - Linus Walleij + - Linus Walleij allOf: - $ref: panel-common.yaml# diff --git a/Documentation/devicetree/bindings/display/panel/samsung,lms380kf01.yaml b/Documentation/devicetree/bindings/display/panel/samsung,lms380kf01.yaml index 7ce8540551f9e..74c2a617c2ff3 100644 --- a/Documentation/devicetree/bindings/display/panel/samsung,lms380kf01.yaml +++ b/Documentation/devicetree/bindings/display/panel/samsung,lms380kf01.yaml @@ -11,7 +11,7 @@ description: The LMS380KF01 is a 480x800 DPI display panel from Samsung Mobile used with internal or external backlight control. maintainers: - - Linus Walleij + - Linus Walleij allOf: - $ref: panel-common.yaml# diff --git a/Documentation/devicetree/bindings/display/panel/samsung,lms397kf04.yaml b/Documentation/devicetree/bindings/display/panel/samsung,lms397kf04.yaml index 9363032883de4..4cecf502a1506 100644 --- a/Documentation/devicetree/bindings/display/panel/samsung,lms397kf04.yaml +++ b/Documentation/devicetree/bindings/display/panel/samsung,lms397kf04.yaml @@ -10,7 +10,7 @@ description: The datasheet claims this is based around a display controller named DB7430 with a separate backlight controller. maintainers: - - Linus Walleij + - Linus Walleij allOf: - $ref: panel-common.yaml# diff --git a/Documentation/devicetree/bindings/display/panel/samsung,s6d16d0.yaml b/Documentation/devicetree/bindings/display/panel/samsung,s6d16d0.yaml index 2af5bc47323f5..0872476a8ac9a 100644 --- a/Documentation/devicetree/bindings/display/panel/samsung,s6d16d0.yaml +++ b/Documentation/devicetree/bindings/display/panel/samsung,s6d16d0.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Samsung S6D16D0 4" 864x480 AMOLED panel maintainers: - - Linus Walleij + - Linus Walleij allOf: - $ref: panel-common.yaml# diff --git a/Documentation/devicetree/bindings/display/panel/sony,acx424akp.yaml b/Documentation/devicetree/bindings/display/panel/sony,acx424akp.yaml index fd778a20f7609..64fa086730b05 100644 --- a/Documentation/devicetree/bindings/display/panel/sony,acx424akp.yaml +++ b/Documentation/devicetree/bindings/display/panel/sony,acx424akp.yaml @@ -12,7 +12,7 @@ description: The Sony ACX424AKP and ACX424AKM are panels built around AKP. maintainers: - - Linus Walleij + - Linus Walleij allOf: - $ref: panel-common.yaml# diff --git a/Documentation/devicetree/bindings/display/panel/ti,nspire.yaml b/Documentation/devicetree/bindings/display/panel/ti,nspire.yaml index 5c5a3b519e314..fc722f706ad71 100644 --- a/Documentation/devicetree/bindings/display/panel/ti,nspire.yaml +++ b/Documentation/devicetree/bindings/display/panel/ti,nspire.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Texas Instruments NSPIRE Display Panels maintainers: - - Linus Walleij + - Linus Walleij allOf: - $ref: panel-common.yaml# diff --git a/Documentation/devicetree/bindings/display/panel/tpo,tpg110.yaml b/Documentation/devicetree/bindings/display/panel/tpo,tpg110.yaml index 99db268eb9b3a..e5f3108cde5a6 100644 --- a/Documentation/devicetree/bindings/display/panel/tpo,tpg110.yaml +++ b/Documentation/devicetree/bindings/display/panel/tpo,tpg110.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: TPO TPG110 Panel maintainers: - - Linus Walleij + - Linus Walleij - Thierry Reding description: |+ diff --git a/Documentation/devicetree/bindings/display/ste,mcde.yaml b/Documentation/devicetree/bindings/display/ste,mcde.yaml index 564ea845c82e0..7a12d0b817e68 100644 --- a/Documentation/devicetree/bindings/display/ste,mcde.yaml +++ b/Documentation/devicetree/bindings/display/ste,mcde.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: ST-Ericsson Multi Channel Display Engine MCDE maintainers: - - Linus Walleij + - Linus Walleij properties: compatible: diff --git a/Documentation/devicetree/bindings/dma/stericsson,dma40.yaml b/Documentation/devicetree/bindings/dma/stericsson,dma40.yaml index 8b42d98804003..607da11e7baa9 100644 --- a/Documentation/devicetree/bindings/dma/stericsson,dma40.yaml +++ b/Documentation/devicetree/bindings/dma/stericsson,dma40.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: ST-Ericsson DMA40 DMA Engine maintainers: - - Linus Walleij + - Linus Walleij allOf: - $ref: dma-controller.yaml# diff --git a/Documentation/devicetree/bindings/extcon/fcs,fsa880.yaml b/Documentation/devicetree/bindings/extcon/fcs,fsa880.yaml index ef6a246a13378..bff3fd5f7f4eb 100644 --- a/Documentation/devicetree/bindings/extcon/fcs,fsa880.yaml +++ b/Documentation/devicetree/bindings/extcon/fcs,fsa880.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Fairchild Semiconductor FSA880, FSA9480 and compatibles maintainers: - - Linus Walleij + - Linus Walleij description: The FSA880 and FSA9480 are USB port accessory detectors and switches. diff --git a/Documentation/devicetree/bindings/firmware/intel,ixp4xx-network-processing-engine.yaml b/Documentation/devicetree/bindings/firmware/intel,ixp4xx-network-processing-engine.yaml index 50f1f08744a1d..4d66ef4835223 100644 --- a/Documentation/devicetree/bindings/firmware/intel,ixp4xx-network-processing-engine.yaml +++ b/Documentation/devicetree/bindings/firmware/intel,ixp4xx-network-processing-engine.yaml @@ -8,7 +8,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Intel IXP4xx Network Processing Engine maintainers: - - Linus Walleij + - Linus Walleij description: | On the IXP4xx SoCs, the Network Processing Engine (NPE) is a small diff --git a/Documentation/devicetree/bindings/gnss/brcm,bcm4751.yaml b/Documentation/devicetree/bindings/gnss/brcm,bcm4751.yaml index 089166089498d..c34b86bb7f6ff 100644 --- a/Documentation/devicetree/bindings/gnss/brcm,bcm4751.yaml +++ b/Documentation/devicetree/bindings/gnss/brcm,bcm4751.yaml @@ -8,7 +8,7 @@ title: Broadcom BCM4751 family GNSS Receiver maintainers: - Johan Hovold - - Linus Walleij + - Linus Walleij description: Broadcom GPS chips can be used over the UART or I2C bus. The UART diff --git a/Documentation/devicetree/bindings/gpio/faraday,ftgpio010.yaml b/Documentation/devicetree/bindings/gpio/faraday,ftgpio010.yaml index 640da5b9b0cc1..3a6a47f12982c 100644 --- a/Documentation/devicetree/bindings/gpio/faraday,ftgpio010.yaml +++ b/Documentation/devicetree/bindings/gpio/faraday,ftgpio010.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Faraday Technology FTGPIO010 GPIO Controller maintainers: - - Linus Walleij + - Linus Walleij properties: compatible: diff --git a/Documentation/devicetree/bindings/gpio/gpio-consumer-common.yaml b/Documentation/devicetree/bindings/gpio/gpio-consumer-common.yaml index 40d0be31e2000..fa0148758b4b2 100644 --- a/Documentation/devicetree/bindings/gpio/gpio-consumer-common.yaml +++ b/Documentation/devicetree/bindings/gpio/gpio-consumer-common.yaml @@ -8,7 +8,7 @@ title: Common GPIO lines maintainers: - Bartosz Golaszewski - - Linus Walleij + - Linus Walleij description: Pay attention to using proper GPIO flag (e.g. GPIO_ACTIVE_LOW) for the GPIOs diff --git a/Documentation/devicetree/bindings/gpio/gpio-ep9301.yaml b/Documentation/devicetree/bindings/gpio/gpio-ep9301.yaml index 3a1079d6ee200..ebdb7ee5b790d 100644 --- a/Documentation/devicetree/bindings/gpio/gpio-ep9301.yaml +++ b/Documentation/devicetree/bindings/gpio/gpio-ep9301.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: EP93xx GPIO controller maintainers: - - Linus Walleij + - Linus Walleij - Bartosz Golaszewski - Nikita Shubin diff --git a/Documentation/devicetree/bindings/gpio/gpio-mmio.yaml b/Documentation/devicetree/bindings/gpio/gpio-mmio.yaml index b4d55bf6a2854..ee5d5d25ae82f 100644 --- a/Documentation/devicetree/bindings/gpio/gpio-mmio.yaml +++ b/Documentation/devicetree/bindings/gpio/gpio-mmio.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Generic MMIO GPIO maintainers: - - Linus Walleij + - Linus Walleij - Bartosz Golaszewski description: diff --git a/Documentation/devicetree/bindings/gpio/intel,ixp4xx-gpio.yaml b/Documentation/devicetree/bindings/gpio/intel,ixp4xx-gpio.yaml index bfcb1f364c3aa..2a980c0ed86f6 100644 --- a/Documentation/devicetree/bindings/gpio/intel,ixp4xx-gpio.yaml +++ b/Documentation/devicetree/bindings/gpio/intel,ixp4xx-gpio.yaml @@ -22,7 +22,7 @@ description: | and this can be enabled by a special flag. maintainers: - - Linus Walleij + - Linus Walleij properties: compatible: diff --git a/Documentation/devicetree/bindings/gpio/mrvl-gpio.yaml b/Documentation/devicetree/bindings/gpio/mrvl-gpio.yaml index 65155bb701a9f..7f420b9c04808 100644 --- a/Documentation/devicetree/bindings/gpio/mrvl-gpio.yaml +++ b/Documentation/devicetree/bindings/gpio/mrvl-gpio.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Marvell PXA GPIO controller maintainers: - - Linus Walleij + - Linus Walleij - Bartosz Golaszewski - Rob Herring diff --git a/Documentation/devicetree/bindings/gpio/pl061-gpio.yaml b/Documentation/devicetree/bindings/gpio/pl061-gpio.yaml index c51e10680c0a5..4d970e55104bb 100644 --- a/Documentation/devicetree/bindings/gpio/pl061-gpio.yaml +++ b/Documentation/devicetree/bindings/gpio/pl061-gpio.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: ARM PL061 GPIO controller maintainers: - - Linus Walleij + - Linus Walleij - Rob Herring # We need a select here so we don't match all nodes with 'arm,primecell' diff --git a/Documentation/devicetree/bindings/gpio/st,nomadik-gpio.yaml b/Documentation/devicetree/bindings/gpio/st,nomadik-gpio.yaml index b3e8951959b52..40b4a75514496 100644 --- a/Documentation/devicetree/bindings/gpio/st,nomadik-gpio.yaml +++ b/Documentation/devicetree/bindings/gpio/st,nomadik-gpio.yaml @@ -12,7 +12,7 @@ description: with pinctrl-nomadik. maintainers: - - Linus Walleij + - Linus Walleij properties: $nodename: diff --git a/Documentation/devicetree/bindings/gpio/st,stmpe-gpio.yaml b/Documentation/devicetree/bindings/gpio/st,stmpe-gpio.yaml index 4555f1644a4df..66dd602e797dd 100644 --- a/Documentation/devicetree/bindings/gpio/st,stmpe-gpio.yaml +++ b/Documentation/devicetree/bindings/gpio/st,stmpe-gpio.yaml @@ -14,7 +14,7 @@ description: GPIO portions of these expanders. maintainers: - - Linus Walleij + - Linus Walleij properties: compatible: diff --git a/Documentation/devicetree/bindings/hwmon/ntc-thermistor.yaml b/Documentation/devicetree/bindings/hwmon/ntc-thermistor.yaml index dc8bc4c6df34d..efd10bcfb0820 100644 --- a/Documentation/devicetree/bindings/hwmon/ntc-thermistor.yaml +++ b/Documentation/devicetree/bindings/hwmon/ntc-thermistor.yaml @@ -6,7 +6,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: NTC thermistor temperature sensors maintainers: - - Linus Walleij + - Linus Walleij description: | Thermistors with negative temperature coefficient (NTC) are resistors that diff --git a/Documentation/devicetree/bindings/hwmon/winbond,w83781d.yaml b/Documentation/devicetree/bindings/hwmon/winbond,w83781d.yaml index 6971ecb314ebc..d97b0e6984776 100644 --- a/Documentation/devicetree/bindings/hwmon/winbond,w83781d.yaml +++ b/Documentation/devicetree/bindings/hwmon/winbond,w83781d.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Winbond W83781 and compatible hardware monitor IC maintainers: - - Linus Walleij + - Linus Walleij properties: compatible: diff --git a/Documentation/devicetree/bindings/i2c/arm,i2c-versatile.yaml b/Documentation/devicetree/bindings/i2c/arm,i2c-versatile.yaml index e58465d1b0c88..26026dfd788a4 100644 --- a/Documentation/devicetree/bindings/i2c/arm,i2c-versatile.yaml +++ b/Documentation/devicetree/bindings/i2c/arm,i2c-versatile.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: I2C Controller on ARM Ltd development platforms maintainers: - - Linus Walleij + - Linus Walleij allOf: - $ref: /schemas/i2c/i2c-controller.yaml# diff --git a/Documentation/devicetree/bindings/i2c/st,nomadik-i2c.yaml b/Documentation/devicetree/bindings/i2c/st,nomadik-i2c.yaml index 012402debfeb2..63a459c63f6a7 100644 --- a/Documentation/devicetree/bindings/i2c/st,nomadik-i2c.yaml +++ b/Documentation/devicetree/bindings/i2c/st,nomadik-i2c.yaml @@ -12,7 +12,7 @@ description: The Nomadik I2C host controller began its life in the ST DB8500 after the merge of these two companies wireless divisions. maintainers: - - Linus Walleij + - Linus Walleij # Need a custom select here or 'arm,primecell' will match on lots of nodes select: diff --git a/Documentation/devicetree/bindings/iio/accel/bosch,bma255.yaml b/Documentation/devicetree/bindings/iio/accel/bosch,bma255.yaml index 85c9537f1f029..c1387e02eb826 100644 --- a/Documentation/devicetree/bindings/iio/accel/bosch,bma255.yaml +++ b/Documentation/devicetree/bindings/iio/accel/bosch,bma255.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Bosch BMA255 and Similar Accelerometers maintainers: - - Linus Walleij + - Linus Walleij - Stephan Gerhold description: diff --git a/Documentation/devicetree/bindings/iio/adc/qcom,pm8018-adc.yaml b/Documentation/devicetree/bindings/iio/adc/qcom,pm8018-adc.yaml index 58ea1ca4a5ee8..c978c3a3e31af 100644 --- a/Documentation/devicetree/bindings/iio/adc/qcom,pm8018-adc.yaml +++ b/Documentation/devicetree/bindings/iio/adc/qcom,pm8018-adc.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Qualcomm's PM8xxx voltage XOADC maintainers: - - Linus Walleij + - Linus Walleij description: | The Qualcomm PM8xxx PMICs contain a HK/XO ADC (Housekeeping/Crystal diff --git a/Documentation/devicetree/bindings/iio/gyroscope/invensense,mpu3050.yaml b/Documentation/devicetree/bindings/iio/gyroscope/invensense,mpu3050.yaml index f3242dc0e7e64..3a307ac50aa7f 100644 --- a/Documentation/devicetree/bindings/iio/gyroscope/invensense,mpu3050.yaml +++ b/Documentation/devicetree/bindings/iio/gyroscope/invensense,mpu3050.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Invensense MPU-3050 Gyroscope maintainers: - - Linus Walleij + - Linus Walleij properties: compatible: diff --git a/Documentation/devicetree/bindings/iio/light/capella,cm3605.yaml b/Documentation/devicetree/bindings/iio/light/capella,cm3605.yaml index c63b79c3351bf..01376c386a034 100644 --- a/Documentation/devicetree/bindings/iio/light/capella,cm3605.yaml +++ b/Documentation/devicetree/bindings/iio/light/capella,cm3605.yaml @@ -8,7 +8,7 @@ title: Capella Microsystems CM3605 Ambient Light and Short Distance Proximity Sensor maintainers: - - Linus Walleij + - Linus Walleij - Kevin Tsai description: | diff --git a/Documentation/devicetree/bindings/iio/light/sharp,gp2ap002.yaml b/Documentation/devicetree/bindings/iio/light/sharp,gp2ap002.yaml index f8a932be0d103..99bddf31cbed2 100644 --- a/Documentation/devicetree/bindings/iio/light/sharp,gp2ap002.yaml +++ b/Documentation/devicetree/bindings/iio/light/sharp,gp2ap002.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Sharp GP2AP002A00F and GP2AP002S00F proximity and ambient light sensors maintainers: - - Linus Walleij + - Linus Walleij description: | Proximity and ambient light sensor with IR LED for the proximity diff --git a/Documentation/devicetree/bindings/iio/magnetometer/asahi-kasei,ak8974.yaml b/Documentation/devicetree/bindings/iio/magnetometer/asahi-kasei,ak8974.yaml index cefb70def1886..f6b4d98741904 100644 --- a/Documentation/devicetree/bindings/iio/magnetometer/asahi-kasei,ak8974.yaml +++ b/Documentation/devicetree/bindings/iio/magnetometer/asahi-kasei,ak8974.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Asahi Kasei AK8974 magnetometer sensor maintainers: - - Linus Walleij + - Linus Walleij properties: compatible: diff --git a/Documentation/devicetree/bindings/iio/magnetometer/yamaha,yas530.yaml b/Documentation/devicetree/bindings/iio/magnetometer/yamaha,yas530.yaml index 877226e9219ba..5cbf60f3b08b5 100644 --- a/Documentation/devicetree/bindings/iio/magnetometer/yamaha,yas530.yaml +++ b/Documentation/devicetree/bindings/iio/magnetometer/yamaha,yas530.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Yamaha YAS530 family of magnetometer sensors maintainers: - - Linus Walleij + - Linus Walleij description: The Yamaha YAS530 magnetometers is a line of 3-axis magnetometers diff --git a/Documentation/devicetree/bindings/iio/st,st-sensors.yaml b/Documentation/devicetree/bindings/iio/st,st-sensors.yaml index e955eb8e87979..a1a958215cdb7 100644 --- a/Documentation/devicetree/bindings/iio/st,st-sensors.yaml +++ b/Documentation/devicetree/bindings/iio/st,st-sensors.yaml @@ -14,7 +14,7 @@ description: The STMicroelectronics sensor devices are pretty straight-forward maintainers: - Denis Ciocca - - Linus Walleij + - Linus Walleij properties: compatible: diff --git a/Documentation/devicetree/bindings/input/atmel,maxtouch.yaml b/Documentation/devicetree/bindings/input/atmel,maxtouch.yaml index d79b254f1cde4..9bf07acea5999 100644 --- a/Documentation/devicetree/bindings/input/atmel,maxtouch.yaml +++ b/Documentation/devicetree/bindings/input/atmel,maxtouch.yaml @@ -8,7 +8,7 @@ title: Atmel maXTouch touchscreen/touchpad maintainers: - Nick Dyer - - Linus Walleij + - Linus Walleij description: | Atmel maXTouch touchscreen or touchpads such as the mXT244 diff --git a/Documentation/devicetree/bindings/input/touchscreen/cypress,cy8ctma140.yaml b/Documentation/devicetree/bindings/input/touchscreen/cypress,cy8ctma140.yaml index 86a6d18f952a0..afeab49a9544f 100644 --- a/Documentation/devicetree/bindings/input/touchscreen/cypress,cy8ctma140.yaml +++ b/Documentation/devicetree/bindings/input/touchscreen/cypress,cy8ctma140.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Cypress CY8CTMA140 series touchscreen controller maintainers: - - Linus Walleij + - Linus Walleij allOf: - $ref: touchscreen.yaml# diff --git a/Documentation/devicetree/bindings/input/touchscreen/cypress,cy8ctma340.yaml b/Documentation/devicetree/bindings/input/touchscreen/cypress,cy8ctma340.yaml index 4dfbb93678b56..a0b8c12977a19 100644 --- a/Documentation/devicetree/bindings/input/touchscreen/cypress,cy8ctma340.yaml +++ b/Documentation/devicetree/bindings/input/touchscreen/cypress,cy8ctma340.yaml @@ -12,7 +12,7 @@ description: The Cypress CY8CTMA340 series (also known as "CYTTSP" after maintainers: - Javier Martinez Canillas - - Linus Walleij + - Linus Walleij allOf: - $ref: touchscreen.yaml# diff --git a/Documentation/devicetree/bindings/input/touchscreen/melfas,mms114.yaml b/Documentation/devicetree/bindings/input/touchscreen/melfas,mms114.yaml index 90ebd4f8354c2..a8a93f755458b 100644 --- a/Documentation/devicetree/bindings/input/touchscreen/melfas,mms114.yaml +++ b/Documentation/devicetree/bindings/input/touchscreen/melfas,mms114.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Melfas MMS114 family touchscreen controller maintainers: - - Linus Walleij + - Linus Walleij allOf: - $ref: touchscreen.yaml# diff --git a/Documentation/devicetree/bindings/input/touchscreen/zinitix,bt400.yaml b/Documentation/devicetree/bindings/input/touchscreen/zinitix,bt400.yaml index 3f663ce3e44ec..f1ce837b16dfa 100644 --- a/Documentation/devicetree/bindings/input/touchscreen/zinitix,bt400.yaml +++ b/Documentation/devicetree/bindings/input/touchscreen/zinitix,bt400.yaml @@ -12,7 +12,7 @@ description: The Zinitix BT4xx and BT5xx series of touchscreen controllers maintainers: - Michael Srba - - Linus Walleij + - Linus Walleij allOf: - $ref: touchscreen.yaml# diff --git a/Documentation/devicetree/bindings/interrupt-controller/arm,versatile-fpga-irq.yaml b/Documentation/devicetree/bindings/interrupt-controller/arm,versatile-fpga-irq.yaml index 8d581b3aac3a1..42ab873665e1e 100644 --- a/Documentation/devicetree/bindings/interrupt-controller/arm,versatile-fpga-irq.yaml +++ b/Documentation/devicetree/bindings/interrupt-controller/arm,versatile-fpga-irq.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: ARM Versatile FPGA IRQ Controller maintainers: - - Linus Walleij + - Linus Walleij description: One or more FPGA IRQ controllers can be synthesized in an ARM reference board diff --git a/Documentation/devicetree/bindings/interrupt-controller/faraday,ftintc010.yaml b/Documentation/devicetree/bindings/interrupt-controller/faraday,ftintc010.yaml index 980e5c45f25b1..e6495acea038f 100644 --- a/Documentation/devicetree/bindings/interrupt-controller/faraday,ftintc010.yaml +++ b/Documentation/devicetree/bindings/interrupt-controller/faraday,ftintc010.yaml @@ -6,7 +6,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Faraday Technology FTINTC010 interrupt controller maintainers: - - Linus Walleij + - Linus Walleij description: This interrupt controller is a stock IP block from Faraday Technology found diff --git a/Documentation/devicetree/bindings/interrupt-controller/intel,ixp4xx-interrupt.yaml b/Documentation/devicetree/bindings/interrupt-controller/intel,ixp4xx-interrupt.yaml index a02a6b5af2056..c375e08ba4104 100644 --- a/Documentation/devicetree/bindings/interrupt-controller/intel,ixp4xx-interrupt.yaml +++ b/Documentation/devicetree/bindings/interrupt-controller/intel,ixp4xx-interrupt.yaml @@ -8,7 +8,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Intel IXP4xx XScale Networking Processors Interrupt Controller maintainers: - - Linus Walleij + - Linus Walleij description: | This interrupt controller is found in the Intel IXP4xx processors. diff --git a/Documentation/devicetree/bindings/leds/backlight/kinetic,ktd253.yaml b/Documentation/devicetree/bindings/leds/backlight/kinetic,ktd253.yaml index 73fa59e621816..e7207eb265842 100644 --- a/Documentation/devicetree/bindings/leds/backlight/kinetic,ktd253.yaml +++ b/Documentation/devicetree/bindings/leds/backlight/kinetic,ktd253.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Kinetic Technologies KTD253 and KTD259 one-wire backlight maintainers: - - Linus Walleij + - Linus Walleij description: | The Kinetic Technologies KTD253 and KTD259 are white LED backlights diff --git a/Documentation/devicetree/bindings/leds/register-bit-led.yaml b/Documentation/devicetree/bindings/leds/register-bit-led.yaml index 20930d327ae99..a6bafc96bd0c3 100644 --- a/Documentation/devicetree/bindings/leds/register-bit-led.yaml +++ b/Documentation/devicetree/bindings/leds/register-bit-led.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Register Bit LEDs maintainers: - - Linus Walleij + - Linus Walleij description: |+ Register bit leds are used with syscon multifunctional devices where single diff --git a/Documentation/devicetree/bindings/leds/regulator-led.yaml b/Documentation/devicetree/bindings/leds/regulator-led.yaml index 4ef7b96e9a086..75ee87d4a7869 100644 --- a/Documentation/devicetree/bindings/leds/regulator-led.yaml +++ b/Documentation/devicetree/bindings/leds/regulator-led.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Regulator LEDs maintainers: - - Linus Walleij + - Linus Walleij description: | Regulator LEDs are powered by a single regulator such that they can diff --git a/Documentation/devicetree/bindings/leds/richtek,rt8515.yaml b/Documentation/devicetree/bindings/leds/richtek,rt8515.yaml index 68c328eec03be..0356371a6b014 100644 --- a/Documentation/devicetree/bindings/leds/richtek,rt8515.yaml +++ b/Documentation/devicetree/bindings/leds/richtek,rt8515.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Richtek RT8515 1.5A dual channel LED driver maintainers: - - Linus Walleij + - Linus Walleij description: | The Richtek RT8515 is a dual channel (two mode) LED driver that diff --git a/Documentation/devicetree/bindings/memory-controllers/intel,ixp4xx-expansion-bus-controller.yaml b/Documentation/devicetree/bindings/memory-controllers/intel,ixp4xx-expansion-bus-controller.yaml index 3049d6bb0b1fe..2a4bf905a3698 100644 --- a/Documentation/devicetree/bindings/memory-controllers/intel,ixp4xx-expansion-bus-controller.yaml +++ b/Documentation/devicetree/bindings/memory-controllers/intel,ixp4xx-expansion-bus-controller.yaml @@ -12,7 +12,7 @@ description: | including IXP42x, IXP43x, IXP45x and IXP46x. maintainers: - - Linus Walleij + - Linus Walleij properties: $nodename: diff --git a/Documentation/devicetree/bindings/memory-controllers/intel,ixp4xx-expansion-peripheral-props.yaml b/Documentation/devicetree/bindings/memory-controllers/intel,ixp4xx-expansion-peripheral-props.yaml index d1479a7b9c8df..020fa49c34544 100644 --- a/Documentation/devicetree/bindings/memory-controllers/intel,ixp4xx-expansion-peripheral-props.yaml +++ b/Documentation/devicetree/bindings/memory-controllers/intel,ixp4xx-expansion-peripheral-props.yaml @@ -12,7 +12,7 @@ description: including IXP42x, IXP43x, IXP45x and IXP46x. maintainers: - - Linus Walleij + - Linus Walleij properties: intel,ixp4xx-eb-t1: diff --git a/Documentation/devicetree/bindings/mfd/arm,dev-platforms-syscon.yaml b/Documentation/devicetree/bindings/mfd/arm,dev-platforms-syscon.yaml index 46b164ae08315..7f3b1b77293c5 100644 --- a/Documentation/devicetree/bindings/mfd/arm,dev-platforms-syscon.yaml +++ b/Documentation/devicetree/bindings/mfd/arm,dev-platforms-syscon.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Arm Ltd Developer Platforms System Controllers maintainers: - - Linus Walleij + - Linus Walleij description: The Arm Ltd Integrator, Realview, and Versatile families of developer diff --git a/Documentation/devicetree/bindings/mfd/st,stmpe.yaml b/Documentation/devicetree/bindings/mfd/st,stmpe.yaml index b77cc3f3075d7..df43878fbe18f 100644 --- a/Documentation/devicetree/bindings/mfd/st,stmpe.yaml +++ b/Documentation/devicetree/bindings/mfd/st,stmpe.yaml @@ -12,7 +12,7 @@ description: STMicroelectronics Port Expander (STMPE) is a series of slow peripherals connected to SPI or I2C. maintainers: - - Linus Walleij + - Linus Walleij allOf: - $ref: /schemas/spi/spi-peripheral-props.yaml# diff --git a/Documentation/devicetree/bindings/mfd/stericsson,ab8500.yaml b/Documentation/devicetree/bindings/mfd/stericsson,ab8500.yaml index ce5e845ab5c52..0fdfbfdfe88a3 100644 --- a/Documentation/devicetree/bindings/mfd/stericsson,ab8500.yaml +++ b/Documentation/devicetree/bindings/mfd/stericsson,ab8500.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: ST-Ericsson Analog Baseband AB8500 and AB8505 maintainers: - - Linus Walleij + - Linus Walleij description: the AB8500 "Analog Baseband" is the mixed-signals integrated circuit diff --git a/Documentation/devicetree/bindings/mfd/stericsson,db8500-prcmu.yaml b/Documentation/devicetree/bindings/mfd/stericsson,db8500-prcmu.yaml index d6c13779d44e9..4edd4a3bab880 100644 --- a/Documentation/devicetree/bindings/mfd/stericsson,db8500-prcmu.yaml +++ b/Documentation/devicetree/bindings/mfd/stericsson,db8500-prcmu.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: ST-Ericsson DB8500 PRCMU - Power Reset and Control Management Unit maintainers: - - Linus Walleij + - Linus Walleij description: The DB8500 Power Reset and Control Management Unit is an XP70 8-bit diff --git a/Documentation/devicetree/bindings/misc/intel,ixp4xx-ahb-queue-manager.yaml b/Documentation/devicetree/bindings/misc/intel,ixp4xx-ahb-queue-manager.yaml index aab89946b04fb..1198d87d0ab67 100644 --- a/Documentation/devicetree/bindings/misc/intel,ixp4xx-ahb-queue-manager.yaml +++ b/Documentation/devicetree/bindings/misc/intel,ixp4xx-ahb-queue-manager.yaml @@ -8,7 +8,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Intel IXP4xx AHB Queue Manager maintainers: - - Linus Walleij + - Linus Walleij description: | The IXP4xx AHB Queue Manager maintains queues as circular buffers in diff --git a/Documentation/devicetree/bindings/mmc/arm,pl18x.yaml b/Documentation/devicetree/bindings/mmc/arm,pl18x.yaml index 8f62e2c7fa641..f90fd73904a24 100644 --- a/Documentation/devicetree/bindings/mmc/arm,pl18x.yaml +++ b/Documentation/devicetree/bindings/mmc/arm,pl18x.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: ARM PrimeCell MultiMedia Card Interface (MMCI) PL180 and PL181 maintainers: - - Linus Walleij + - Linus Walleij - Ulf Hansson description: diff --git a/Documentation/devicetree/bindings/mtd/partitions/arm,arm-firmware-suite.yaml b/Documentation/devicetree/bindings/mtd/partitions/arm,arm-firmware-suite.yaml index 97618847ee354..e9b1a6869910c 100644 --- a/Documentation/devicetree/bindings/mtd/partitions/arm,arm-firmware-suite.yaml +++ b/Documentation/devicetree/bindings/mtd/partitions/arm,arm-firmware-suite.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: ARM Firmware Suite (AFS) Partitions maintainers: - - Linus Walleij + - Linus Walleij select: false diff --git a/Documentation/devicetree/bindings/mtd/partitions/redboot-fis.yaml b/Documentation/devicetree/bindings/mtd/partitions/redboot-fis.yaml index ba7445cd69e8f..e3978d2bc056f 100644 --- a/Documentation/devicetree/bindings/mtd/partitions/redboot-fis.yaml +++ b/Documentation/devicetree/bindings/mtd/partitions/redboot-fis.yaml @@ -14,7 +14,7 @@ description: The FLASH Image System (FIS) directory is a flash description 32 KB in size. maintainers: - - Linus Walleij + - Linus Walleij select: false diff --git a/Documentation/devicetree/bindings/mtd/partitions/seama.yaml b/Documentation/devicetree/bindings/mtd/partitions/seama.yaml index 4c1cbf43e81a6..4af185204b4b9 100644 --- a/Documentation/devicetree/bindings/mtd/partitions/seama.yaml +++ b/Documentation/devicetree/bindings/mtd/partitions/seama.yaml @@ -18,7 +18,7 @@ allOf: - $ref: partition.yaml# maintainers: - - Linus Walleij + - Linus Walleij properties: compatible: diff --git a/Documentation/devicetree/bindings/net/bluetooth/brcm,bluetooth.yaml b/Documentation/devicetree/bindings/net/bluetooth/brcm,bluetooth.yaml index 3c410cadff230..95501e858e6f8 100644 --- a/Documentation/devicetree/bindings/net/bluetooth/brcm,bluetooth.yaml +++ b/Documentation/devicetree/bindings/net/bluetooth/brcm,bluetooth.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Broadcom Bluetooth Chips maintainers: - - Linus Walleij + - Linus Walleij description: This binding describes Broadcom UART-attached bluetooth chips. diff --git a/Documentation/devicetree/bindings/net/cortina,gemini-ethernet.yaml b/Documentation/devicetree/bindings/net/cortina,gemini-ethernet.yaml index a930358f6a66b..f0b5bea2458d5 100644 --- a/Documentation/devicetree/bindings/net/cortina,gemini-ethernet.yaml +++ b/Documentation/devicetree/bindings/net/cortina,gemini-ethernet.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Cortina Systems Gemini Ethernet Controller maintainers: - - Linus Walleij + - Linus Walleij description: | This ethernet controller is found in the Gemini SoC family: diff --git a/Documentation/devicetree/bindings/net/dsa/micrel,ks8995.yaml b/Documentation/devicetree/bindings/net/dsa/micrel,ks8995.yaml index 854808ff5ad5d..e9ce360670331 100644 --- a/Documentation/devicetree/bindings/net/dsa/micrel,ks8995.yaml +++ b/Documentation/devicetree/bindings/net/dsa/micrel,ks8995.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Micrel KS8995 Family DSA Switches maintainers: - - Linus Walleij + - Linus Walleij description: The Micrel KS8995 DSA Switches are 100 Mbit switches that were produced in diff --git a/Documentation/devicetree/bindings/net/dsa/realtek.yaml b/Documentation/devicetree/bindings/net/dsa/realtek.yaml index f348e66fb5158..473facd87a622 100644 --- a/Documentation/devicetree/bindings/net/dsa/realtek.yaml +++ b/Documentation/devicetree/bindings/net/dsa/realtek.yaml @@ -10,7 +10,7 @@ allOf: - $ref: dsa.yaml#/$defs/ethernet-ports maintainers: - - Linus Walleij + - Linus Walleij description: Realtek advertises these chips as fast/gigabit switches or unmanaged diff --git a/Documentation/devicetree/bindings/net/dsa/vitesse,vsc73xx.yaml b/Documentation/devicetree/bindings/net/dsa/vitesse,vsc73xx.yaml index 51cf574249bec..c41f479bdee94 100644 --- a/Documentation/devicetree/bindings/net/dsa/vitesse,vsc73xx.yaml +++ b/Documentation/devicetree/bindings/net/dsa/vitesse,vsc73xx.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Vitesse VSC73xx DSA Switches maintainers: - - Linus Walleij + - Linus Walleij description: The Vitesse DSA Switches were produced in the early-to-mid 2000s. diff --git a/Documentation/devicetree/bindings/net/intel,ixp46x-ptp-timer.yaml b/Documentation/devicetree/bindings/net/intel,ixp46x-ptp-timer.yaml index f92730b1d2fad..80336b7e64ecc 100644 --- a/Documentation/devicetree/bindings/net/intel,ixp46x-ptp-timer.yaml +++ b/Documentation/devicetree/bindings/net/intel,ixp46x-ptp-timer.yaml @@ -8,7 +8,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Intel IXP46x PTP Timer (TSYNC) maintainers: - - Linus Walleij + - Linus Walleij description: | The Intel IXP46x PTP timer is known in the manual as IEEE1588 Hardware diff --git a/Documentation/devicetree/bindings/net/intel,ixp4xx-ethernet.yaml b/Documentation/devicetree/bindings/net/intel,ixp4xx-ethernet.yaml index 8689de1aaea15..3b8f83b7099d3 100644 --- a/Documentation/devicetree/bindings/net/intel,ixp4xx-ethernet.yaml +++ b/Documentation/devicetree/bindings/net/intel,ixp4xx-ethernet.yaml @@ -11,7 +11,7 @@ allOf: - $ref: ethernet-controller.yaml# maintainers: - - Linus Walleij + - Linus Walleij description: | The Intel IXP4xx ethernet makes use of the IXP4xx NPE (Network diff --git a/Documentation/devicetree/bindings/net/intel,ixp4xx-hss.yaml b/Documentation/devicetree/bindings/net/intel,ixp4xx-hss.yaml index 7a405e9b37b2c..1d952735c81b6 100644 --- a/Documentation/devicetree/bindings/net/intel,ixp4xx-hss.yaml +++ b/Documentation/devicetree/bindings/net/intel,ixp4xx-hss.yaml @@ -8,7 +8,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Intel IXP4xx V.35 WAN High Speed Serial Link (HSS) maintainers: - - Linus Walleij + - Linus Walleij description: | The Intel IXP4xx HSS makes use of the IXP4xx NPE (Network diff --git a/Documentation/devicetree/bindings/pci/faraday,ftpci100.yaml b/Documentation/devicetree/bindings/pci/faraday,ftpci100.yaml index 378dd1c8e2ee2..fed393a895633 100644 --- a/Documentation/devicetree/bindings/pci/faraday,ftpci100.yaml +++ b/Documentation/devicetree/bindings/pci/faraday,ftpci100.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Faraday Technology FTPCI100 PCI Host Bridge maintainers: - - Linus Walleij + - Linus Walleij description: | This PCI bridge is found inside that Cortina Systems Gemini SoC platform and diff --git a/Documentation/devicetree/bindings/pci/intel,ixp4xx-pci.yaml b/Documentation/devicetree/bindings/pci/intel,ixp4xx-pci.yaml index 3cae2e0f7f5e2..c1806aef7bac4 100644 --- a/Documentation/devicetree/bindings/pci/intel,ixp4xx-pci.yaml +++ b/Documentation/devicetree/bindings/pci/intel,ixp4xx-pci.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Intel IXP4xx PCI controller maintainers: - - Linus Walleij + - Linus Walleij description: PCI host controller found in the Intel IXP4xx SoC series. diff --git a/Documentation/devicetree/bindings/pci/v3,v360epc-pci.yaml b/Documentation/devicetree/bindings/pci/v3,v360epc-pci.yaml index 38cac88f17bfd..0e2ac2f8faed3 100644 --- a/Documentation/devicetree/bindings/pci/v3,v360epc-pci.yaml +++ b/Documentation/devicetree/bindings/pci/v3,v360epc-pci.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: V3 Semiconductor V360 EPC PCI bridge maintainers: - - Linus Walleij + - Linus Walleij description: This bridge is found in the ARM Integrator/AP (Application Platform) diff --git a/Documentation/devicetree/bindings/pinctrl/pincfg-node.yaml b/Documentation/devicetree/bindings/pinctrl/pincfg-node.yaml index d1bc389e0a6d1..a916d0fc79a99 100644 --- a/Documentation/devicetree/bindings/pinctrl/pincfg-node.yaml +++ b/Documentation/devicetree/bindings/pinctrl/pincfg-node.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Generic Pin Configuration Node maintainers: - - Linus Walleij + - Linus Walleij description: Many data items that are represented in a pin configuration node are common diff --git a/Documentation/devicetree/bindings/pinctrl/pinctrl.yaml b/Documentation/devicetree/bindings/pinctrl/pinctrl.yaml index d471563119a98..290438826c507 100644 --- a/Documentation/devicetree/bindings/pinctrl/pinctrl.yaml +++ b/Documentation/devicetree/bindings/pinctrl/pinctrl.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Pin controller device maintainers: - - Linus Walleij + - Linus Walleij - Rafał Miłecki description: | diff --git a/Documentation/devicetree/bindings/pinctrl/pinmux-node.yaml b/Documentation/devicetree/bindings/pinctrl/pinmux-node.yaml index ca9d246d46fe4..7ba26271c4d65 100644 --- a/Documentation/devicetree/bindings/pinctrl/pinmux-node.yaml +++ b/Documentation/devicetree/bindings/pinctrl/pinmux-node.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Generic Pin Multiplexing Node maintainers: - - Linus Walleij + - Linus Walleij description: | The contents of the pin configuration child nodes are defined by the binding diff --git a/Documentation/devicetree/bindings/power/supply/samsung,battery.yaml b/Documentation/devicetree/bindings/power/supply/samsung,battery.yaml index 40292d581b105..fa1ccff043bed 100644 --- a/Documentation/devicetree/bindings/power/supply/samsung,battery.yaml +++ b/Documentation/devicetree/bindings/power/supply/samsung,battery.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Samsung SDI Batteries maintainers: - - Linus Walleij + - Linus Walleij description: | Samsung SDI (Samsung Digital Interface) batteries are all different versions diff --git a/Documentation/devicetree/bindings/rng/intel,ixp46x-rng.yaml b/Documentation/devicetree/bindings/rng/intel,ixp46x-rng.yaml index 9f7590ce6b3d6..146593a669d66 100644 --- a/Documentation/devicetree/bindings/rng/intel,ixp46x-rng.yaml +++ b/Documentation/devicetree/bindings/rng/intel,ixp46x-rng.yaml @@ -12,7 +12,7 @@ description: | 32 bit random number. maintainers: - - Linus Walleij + - Linus Walleij properties: compatible: diff --git a/Documentation/devicetree/bindings/rtc/faraday,ftrtc010.yaml b/Documentation/devicetree/bindings/rtc/faraday,ftrtc010.yaml index b1c1a0e213188..2b1215b495807 100644 --- a/Documentation/devicetree/bindings/rtc/faraday,ftrtc010.yaml +++ b/Documentation/devicetree/bindings/rtc/faraday,ftrtc010.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Faraday Technology FTRTC010 Real Time Clock maintainers: - - Linus Walleij + - Linus Walleij description: | This RTC appears in for example the Storlink Gemini family of SoCs. diff --git a/Documentation/devicetree/bindings/spi/arm,pl022-peripheral-props.yaml b/Documentation/devicetree/bindings/spi/arm,pl022-peripheral-props.yaml index bb8b6863b1090..f976e416395b7 100644 --- a/Documentation/devicetree/bindings/spi/arm,pl022-peripheral-props.yaml +++ b/Documentation/devicetree/bindings/spi/arm,pl022-peripheral-props.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Peripheral-specific properties for Arm PL022 SPI controller maintainers: - - Linus Walleij + - Linus Walleij select: false diff --git a/Documentation/devicetree/bindings/spi/spi-pl022.yaml b/Documentation/devicetree/bindings/spi/spi-pl022.yaml index 7f174b7d0a26f..680fdfa184d0c 100644 --- a/Documentation/devicetree/bindings/spi/spi-pl022.yaml +++ b/Documentation/devicetree/bindings/spi/spi-pl022.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: ARM PL022 SPI controller maintainers: - - Linus Walleij + - Linus Walleij allOf: - $ref: spi-controller.yaml# diff --git a/Documentation/devicetree/bindings/timer/faraday,fttmr010.yaml b/Documentation/devicetree/bindings/timer/faraday,fttmr010.yaml index 39506323556c5..e93c20243dba1 100644 --- a/Documentation/devicetree/bindings/timer/faraday,fttmr010.yaml +++ b/Documentation/devicetree/bindings/timer/faraday,fttmr010.yaml @@ -8,7 +8,7 @@ title: Faraday FTTMR010 timer maintainers: - Joel Stanley - - Linus Walleij + - Linus Walleij description: This timer is a generic IP block from Faraday Technology, embedded in the diff --git a/Documentation/devicetree/bindings/timer/intel,ixp4xx-timer.yaml b/Documentation/devicetree/bindings/timer/intel,ixp4xx-timer.yaml index 526b8db4d5759..c92e6b9cd5e2f 100644 --- a/Documentation/devicetree/bindings/timer/intel,ixp4xx-timer.yaml +++ b/Documentation/devicetree/bindings/timer/intel,ixp4xx-timer.yaml @@ -8,7 +8,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Intel IXP4xx XScale Networking Processors Timers maintainers: - - Linus Walleij + - Linus Walleij description: This timer is found in the Intel IXP4xx processors. diff --git a/Documentation/devicetree/bindings/timer/st,nomadik-mtu.yaml b/Documentation/devicetree/bindings/timer/st,nomadik-mtu.yaml index fa65878b35719..873a01c287f47 100644 --- a/Documentation/devicetree/bindings/timer/st,nomadik-mtu.yaml +++ b/Documentation/devicetree/bindings/timer/st,nomadik-mtu.yaml @@ -8,7 +8,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: ST Microelectronics Nomadik Multi-Timer Unit MTU Timer maintainers: - - Linus Walleij + - Linus Walleij description: This timer is found in the ST Microelectronics Nomadik SoCs STn8800, STn8810 and STn8815 as well as in ST-Ericsson DB8500. diff --git a/Documentation/devicetree/bindings/usb/faraday,fotg210.yaml b/Documentation/devicetree/bindings/usb/faraday,fotg210.yaml index 3fe4d1564dfed..b97ba535087c9 100644 --- a/Documentation/devicetree/bindings/usb/faraday,fotg210.yaml +++ b/Documentation/devicetree/bindings/usb/faraday,fotg210.yaml @@ -8,7 +8,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Faraday Technology FOTG200 series HS OTG USB 2.0 controller maintainers: - - Linus Walleij + - Linus Walleij allOf: - $ref: usb-drd.yaml# diff --git a/Documentation/devicetree/bindings/usb/intel,ixp4xx-udc.yaml b/Documentation/devicetree/bindings/usb/intel,ixp4xx-udc.yaml index 4ed6027468972..91a149ad3ad6e 100644 --- a/Documentation/devicetree/bindings/usb/intel,ixp4xx-udc.yaml +++ b/Documentation/devicetree/bindings/usb/intel,ixp4xx-udc.yaml @@ -10,7 +10,7 @@ description: The IXP4xx SoCs has a full-speed USB Device Controller with 16 endpoints and a built-in transceiver. maintainers: - - Linus Walleij + - Linus Walleij properties: compatible: diff --git a/Documentation/devicetree/bindings/watchdog/faraday,ftwdt010.yaml b/Documentation/devicetree/bindings/watchdog/faraday,ftwdt010.yaml index 726dc872ad02d..3eb35f325f4c5 100644 --- a/Documentation/devicetree/bindings/watchdog/faraday,ftwdt010.yaml +++ b/Documentation/devicetree/bindings/watchdog/faraday,ftwdt010.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Faraday Technology FTWDT010 watchdog maintainers: - - Linus Walleij + - Linus Walleij - Corentin Labbe description: | diff --git a/Documentation/devicetree/bindings/watchdog/maxim,max63xx.yaml b/Documentation/devicetree/bindings/watchdog/maxim,max63xx.yaml index 442c21f12a3b2..defe0401ded0b 100644 --- a/Documentation/devicetree/bindings/watchdog/maxim,max63xx.yaml +++ b/Documentation/devicetree/bindings/watchdog/maxim,max63xx.yaml @@ -8,7 +8,7 @@ title: Maxim 63xx Watchdog Timers maintainers: - Marc Zyngier - - Linus Walleij + - Linus Walleij allOf: - $ref: watchdog.yaml# From 0f77b2ef6970800761203fed885f4bffbb8b3c12 Mon Sep 17 00:00:00 2001 From: Frank Wunderlich Date: Wed, 19 Nov 2025 18:51:22 +0100 Subject: [PATCH 0193/1321] arm64: dts: mediatek: mt7986: add dtbs with applied overlays for bpi-r3 Build devicetree binaries for testing overlays and providing users full dtb without using overlays. Suggested-by: Rob Herring Signed-off-by: Frank Wunderlich Fixes: a58c36806741 ("arm64: dts: mediatek: mt7988a-bpi-r4pro: Add mmc overlays") Fixes: dec929e61a42 ("arm64: dts: mediatek: mt7988a-bpi-r4-pro: Add PCIe overlays") Fixes: 714a80ced07a ("arm64: dts: mediatek: mt7988a-bpi-r4: Add dt overlays for sd + emmc") Fixes: 312189ebb802 ("arm64: dts: mt7986: add overlay for SATA power socket on BPI-R3") Fixes: 8e01fb15b815 ("arm64: dts: mt7986: add Bananapi R3") Acked-by: AngeloGioacchino Del Regno Acked-by: Rob Herring (Arm) Link: https://patch.msgid.link/20251119175124.48947-2-linux@fw-web.de Signed-off-by: Rob Herring (Arm) --- arch/arm64/boot/dts/mediatek/Makefile | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/arch/arm64/boot/dts/mediatek/Makefile b/arch/arm64/boot/dts/mediatek/Makefile index c5fd6191a925a..77d76730d61b9 100644 --- a/arch/arm64/boot/dts/mediatek/Makefile +++ b/arch/arm64/boot/dts/mediatek/Makefile @@ -19,6 +19,27 @@ dtb-$(CONFIG_ARCH_MEDIATEK) += mt7986a-bananapi-bpi-r3-nand.dtbo dtb-$(CONFIG_ARCH_MEDIATEK) += mt7986a-bananapi-bpi-r3-nor.dtbo dtb-$(CONFIG_ARCH_MEDIATEK) += mt7986a-bananapi-bpi-r3-sata.dtbo dtb-$(CONFIG_ARCH_MEDIATEK) += mt7986a-bananapi-bpi-r3-sd.dtbo +mt7986a-bananapi-bpi-r3-emmc-nand-dtbs := \ + mt7986a-bananapi-bpi-r3.dtb \ + mt7986a-bananapi-bpi-r3-emmc.dtbo \ + mt7986a-bananapi-bpi-r3-nand.dtbo +dtb-$(CONFIG_ARCH_MEDIATEK) += mt7986a-bananapi-bpi-r3-emmc-nand.dtb +mt7986a-bananapi-bpi-r3-emmc-nor-dtbs := \ + mt7986a-bananapi-bpi-r3.dtb \ + mt7986a-bananapi-bpi-r3-emmc.dtbo \ + mt7986a-bananapi-bpi-r3-nor.dtbo +dtb-$(CONFIG_ARCH_MEDIATEK) += mt7986a-bananapi-bpi-r3-emmc-nor.dtb +mt7986a-bananapi-bpi-r3-sd-nand-dtbs := \ + mt7986a-bananapi-bpi-r3.dtb \ + mt7986a-bananapi-bpi-r3-sd.dtbo \ + mt7986a-bananapi-bpi-r3-nand.dtbo \ + mt7986a-bananapi-bpi-r3-sata.dtbo +dtb-$(CONFIG_ARCH_MEDIATEK) += mt7986a-bananapi-bpi-r3-sd-nand.dtb +mt7986a-bananapi-bpi-r3-sd-nor-dtbs := \ + mt7986a-bananapi-bpi-r3.dtb \ + mt7986a-bananapi-bpi-r3-sd.dtbo \ + mt7986a-bananapi-bpi-r3-nor.dtbo +dtb-$(CONFIG_ARCH_MEDIATEK) += mt7986a-bananapi-bpi-r3-sd-nor.dtb dtb-$(CONFIG_ARCH_MEDIATEK) += mt7986a-rfb.dtb dtb-$(CONFIG_ARCH_MEDIATEK) += mt7986b-rfb.dtb dtb-$(CONFIG_ARCH_MEDIATEK) += mt7988a-bananapi-bpi-r4.dtb From 705064e22e5c81605c3e5bf6b533a6e142c4a13f Mon Sep 17 00:00:00 2001 From: Frank Wunderlich Date: Wed, 19 Nov 2025 18:51:23 +0100 Subject: [PATCH 0194/1321] arm64: dts: mediatek: mt7988: add dtbs with applied overlays for bpi-r4 (pro) Build devicetree binaries for testing overlays and providing users full dtb without using overlays for Bananapi R4 (pro) variants. Signed-off-by: Frank Wunderlich Link: https://patch.msgid.link/20251119175124.48947-3-linux@fw-web.de Signed-off-by: Rob Herring (Arm) --- arch/arm64/boot/dts/mediatek/Makefile | 32 +++++++++++++++++++++++++++ 1 file changed, 32 insertions(+) diff --git a/arch/arm64/boot/dts/mediatek/Makefile b/arch/arm64/boot/dts/mediatek/Makefile index 77d76730d61b9..cac8f4c6d76f1 100644 --- a/arch/arm64/boot/dts/mediatek/Makefile +++ b/arch/arm64/boot/dts/mediatek/Makefile @@ -52,6 +52,38 @@ dtb-$(CONFIG_ARCH_MEDIATEK) += mt7988a-bananapi-bpi-r4-pro-cn18.dtbo dtb-$(CONFIG_ARCH_MEDIATEK) += mt7988a-bananapi-bpi-r4-pro-emmc.dtbo dtb-$(CONFIG_ARCH_MEDIATEK) += mt7988a-bananapi-bpi-r4-pro-sd.dtbo dtb-$(CONFIG_ARCH_MEDIATEK) += mt7988a-bananapi-bpi-r4-sd.dtbo +mt7988a-bananapi-bpi-r4-emmc-dtbs := \ + mt7988a-bananapi-bpi-r4.dtb \ + mt7988a-bananapi-bpi-r4-emmc.dtbo +dtb-$(CONFIG_ARCH_MEDIATEK) += mt7988a-bananapi-bpi-r4-emmc.dtb +mt7988a-bananapi-bpi-r4-sd-dtbs := \ + mt7988a-bananapi-bpi-r4.dtb \ + mt7988a-bananapi-bpi-r4-sd.dtbo +dtb-$(CONFIG_ARCH_MEDIATEK) += mt7988a-bananapi-bpi-r4-sd.dtb +mt7988a-bananapi-bpi-r4-2g5-emmc-dtbs := \ + mt7988a-bananapi-bpi-r4-2g5.dtb \ + mt7988a-bananapi-bpi-r4-emmc.dtbo +dtb-$(CONFIG_ARCH_MEDIATEK) += mt7988a-bananapi-bpi-r4-2g5-emmc.dtb +mt7988a-bananapi-bpi-r4-2g5-sd-dtbs := \ + mt7988a-bananapi-bpi-r4-2g5.dtb \ + mt7988a-bananapi-bpi-r4-sd.dtbo +dtb-$(CONFIG_ARCH_MEDIATEK) += mt7988a-bananapi-bpi-r4-2g5-sd.dtb +mt7988a-bananapi-bpi-r4-pro-8x-emmc-dtbs := \ + mt7988a-bananapi-bpi-r4-pro-8x.dtb \ + mt7988a-bananapi-bpi-r4-pro-emmc.dtbo +dtb-$(CONFIG_ARCH_MEDIATEK) += mt7988a-bananapi-bpi-r4-pro-8x-emmc.dtb +mt7988a-bananapi-bpi-r4-pro-8x-sd-dtbs := \ + mt7988a-bananapi-bpi-r4-pro-8x.dtb \ + mt7988a-bananapi-bpi-r4-pro-sd.dtbo +dtb-$(CONFIG_ARCH_MEDIATEK) += mt7988a-bananapi-bpi-r4-pro-8x-sd.dtb +mt7988a-bananapi-bpi-r4-pro-8x-sd-cn15-dtbs := \ + mt7988a-bananapi-bpi-r4-pro-8x-sd.dtb \ + mt7988a-bananapi-bpi-r4-pro-cn15.dtbo +dtb-$(CONFIG_ARCH_MEDIATEK) += mt7988a-bananapi-bpi-r4-pro-8x-sd-cn15.dtb +mt7988a-bananapi-bpi-r4-pro-8x-sd-cn18-dtbs := \ + mt7988a-bananapi-bpi-r4-pro-8x-sd.dtb \ + mt7988a-bananapi-bpi-r4-pro-cn18.dtbo +dtb-$(CONFIG_ARCH_MEDIATEK) += mt7988a-bananapi-bpi-r4-pro-8x-sd-cn18.dtb dtb-$(CONFIG_ARCH_MEDIATEK) += mt8167-pumpkin.dtb dtb-$(CONFIG_ARCH_MEDIATEK) += mt8173-elm.dtb dtb-$(CONFIG_ARCH_MEDIATEK) += mt8173-elm-hana.dtb From 58fb236cd747489c697c738bf7c596c658b9c8bb Mon Sep 17 00:00:00 2001 From: "Rob Herring (Arm)" Date: Fri, 5 Dec 2025 22:59:38 +0100 Subject: [PATCH 0195/1321] arm64: dts: mediatek: Apply mt8395-radxa DT overlay at build time It's a requirement that DT overlays be applied at build time in order to validate them as overlays are not validated on their own. Add missing target for mt8395-radxa hd panel overlay. Fixes: 4c8ff61199a7 ("arm64: dts: mediatek: mt8395-radxa-nio-12l: Add Radxa 8 HD panel") Signed-off-by: Frank Wunderlich Acked-by: AngeloGioacchino Del Regno Link: https://patch.msgid.link/20251205215940.19287-1-linux@fw-web.de Signed-off-by: Rob Herring (Arm) --- arch/arm64/boot/dts/mediatek/Makefile | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/arm64/boot/dts/mediatek/Makefile b/arch/arm64/boot/dts/mediatek/Makefile index cac8f4c6d76f1..3f76d9ce98797 100644 --- a/arch/arm64/boot/dts/mediatek/Makefile +++ b/arch/arm64/boot/dts/mediatek/Makefile @@ -166,6 +166,8 @@ dtb-$(CONFIG_ARCH_MEDIATEK) += mt8390-grinn-genio-700-sbc.dtb dtb-$(CONFIG_ARCH_MEDIATEK) += mt8395-kontron-3-5-sbc-i1200.dtb dtb-$(CONFIG_ARCH_MEDIATEK) += mt8395-radxa-nio-12l.dtb dtb-$(CONFIG_ARCH_MEDIATEK) += mt8395-radxa-nio-12l-8-hd-panel.dtbo +mt8395-radxa-nio-12l-8-hd-panel-dtbs := mt8395-radxa-nio-12l.dtb mt8395-radxa-nio-12l-8-hd-panel.dtbo +dtb-$(CONFIG_ARCH_MEDIATEK) += mt8395-radxa-nio-12l-8-hd-panel.dtb dtb-$(CONFIG_ARCH_MEDIATEK) += mt8516-pumpkin.dtb # Device tree overlays support From df7a2855f3f14376d963e8c70eb19df175d1e246 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Uwe=20Kleine-K=C3=B6nig?= Date: Wed, 10 Dec 2025 07:58:39 +0100 Subject: [PATCH 0196/1321] kunit: Drop unused parameter from kunit_device_register_internal MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The passed driver isn't used, so just drop this parameter. Link: https://lore.kernel.org/r/20251210065839.482608-2-u.kleine-koenig@baylibre.com Signed-off-by: Uwe Kleine-König Reviewed-by: David Gow Signed-off-by: Shuah Khan --- lib/kunit/device.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/lib/kunit/device.c b/lib/kunit/device.c index 520c1fccee8a5..f201aaacd4cf4 100644 --- a/lib/kunit/device.c +++ b/lib/kunit/device.c @@ -106,8 +106,7 @@ EXPORT_SYMBOL_GPL(kunit_driver_create); /* Helper which creates a kunit_device, attaches it to the kunit_bus*/ static struct kunit_device *kunit_device_register_internal(struct kunit *test, - const char *name, - const struct device_driver *drv) + const char *name) { struct kunit_device *kunit_dev; int err = -ENOMEM; @@ -150,7 +149,7 @@ struct device *kunit_device_register_with_driver(struct kunit *test, const char *name, const struct device_driver *drv) { - struct kunit_device *kunit_dev = kunit_device_register_internal(test, name, drv); + struct kunit_device *kunit_dev = kunit_device_register_internal(test, name); if (IS_ERR_OR_NULL(kunit_dev)) return ERR_CAST(kunit_dev); @@ -172,7 +171,7 @@ struct device *kunit_device_register(struct kunit *test, const char *name) if (IS_ERR(drv)) return ERR_CAST(drv); - dev = kunit_device_register_internal(test, name, drv); + dev = kunit_device_register_internal(test, name); if (IS_ERR(dev)) { kunit_release_action(test, driver_unregister_wrapper, (void *)drv); return ERR_CAST(dev); From 3eec08196184921fe87651c27b15351faf64b54b Mon Sep 17 00:00:00 2001 From: Brendan Jackman Date: Sun, 7 Dec 2025 02:17:10 +0000 Subject: [PATCH 0197/1321] kunit: make FAULT_TEST default to n when PANIC_ON_OOPS As describe in the help string, the user might want to disable these tests if they don't like to see stacktraces/BUG etc in their kernel log. However, if they enable PANIC_ON_OOPS, these tests also crash the machine, which it's safe to assume _almost_ nobody wants. One might argue that _absolutely_ nobody ever wants their kernel to crash so this should just be a hard dependency instead of a default. However, since this is rather special code that's anyway concerned with deliberately doing "bad" things, the normal rules don't seem to apply, hence prefer flexibility and allow users to set up a crashing Kconfig if they so choose. Link: https://lore.kernel.org/r/20251207-kunit-fault-no-panic-v1-1-2ac932f26864@google.com Signed-off-by: Brendan Jackman Reviewed-by: David Gow Signed-off-by: Shuah Khan --- lib/kunit/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/kunit/Kconfig b/lib/kunit/Kconfig index 50ecf55d2b9c8..498cc51e493dc 100644 --- a/lib/kunit/Kconfig +++ b/lib/kunit/Kconfig @@ -28,7 +28,7 @@ config KUNIT_FAULT_TEST bool "Enable KUnit tests which print BUG stacktraces" depends on KUNIT_TEST depends on !UML - default y + default !PANIC_ON_OOPS help Enables fault handling tests for the KUnit framework. These tests may trigger a kernel BUG(), and the associated stack trace, even when they From 4a397babc2184b74a901cb6c20c6cf6a588a39d0 Mon Sep 17 00:00:00 2001 From: "Mario Limonciello (AMD)" Date: Tue, 9 Dec 2025 16:00:29 -0600 Subject: [PATCH 0198/1321] drm/amd: Resume the device in thaw() callback when console suspend is disabled If console suspend has been disabled using `no_console_suspend` also wake up during thaw() so that some messages can be seen for debugging. Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/4191 Signed-off-by: Mario Limonciello (AMD) Reviewed-by: Alex Deucher Signed-off-by: Alex Deucher (cherry picked from commit 63387cbbb714d9f0d179d9d4560de1408d0906de) --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index 2dfbddcef9ab3..848e6b7db482d 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -33,6 +33,7 @@ #include #include +#include #include #include #include @@ -2704,7 +2705,9 @@ static int amdgpu_pmops_thaw(struct device *dev) struct drm_device *drm_dev = dev_get_drvdata(dev); /* do not resume device if it's normal hibernation */ - if (!pm_hibernate_is_recovering() && !pm_hibernation_mode_is_suspend()) + if (console_suspend_enabled && + !pm_hibernate_is_recovering() && + !pm_hibernation_mode_is_suspend()) return 0; return amdgpu_device_resume(drm_dev, true); From 37cbafa852f1a089495b030f4be629d5021216c5 Mon Sep 17 00:00:00 2001 From: Ray Wu Date: Fri, 28 Nov 2025 08:58:13 +0800 Subject: [PATCH 0199/1321] drm/amd/display: Fix scratch registers offsets for DCN35 [Why] Different platforms use differnet NBIO header files, causing display code to use differnt offset and read wrong accelerated status. [How] - Unified NBIO offset header file across platform. - Correct scratch registers offsets to proper locations. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4667 Cc: Mario Limonciello Cc: Alex Deucher Reviewed-by: Mario Limonciello Signed-off-by: Ray Wu Signed-off-by: Chenyu Chen Tested-by: Daniel Wheeler Signed-off-by: Alex Deucher (cherry picked from commit 49a63bc8eda0304ba307f5ba68305f936174f72d) Cc: stable@vger.kernel.org --- .../drm/amd/display/dc/resource/dcn35/dcn35_resource.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn35/dcn35_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn35/dcn35_resource.c index ef69898d2cc5d..d056e5fd54587 100644 --- a/drivers/gpu/drm/amd/display/dc/resource/dcn35/dcn35_resource.c +++ b/drivers/gpu/drm/amd/display/dc/resource/dcn35/dcn35_resource.c @@ -203,12 +203,12 @@ enum dcn35_clk_src_array_id { NBIO_BASE_INNER(seg) #define NBIO_SR(reg_name)\ - REG_STRUCT.reg_name = NBIO_BASE(regBIF_BX2_ ## reg_name ## _BASE_IDX) + \ - regBIF_BX2_ ## reg_name + REG_STRUCT.reg_name = NBIO_BASE(regBIF_BX1_ ## reg_name ## _BASE_IDX) + \ + regBIF_BX1_ ## reg_name #define NBIO_SR_ARR(reg_name, id)\ - REG_STRUCT[id].reg_name = NBIO_BASE(regBIF_BX2_ ## reg_name ## _BASE_IDX) + \ - regBIF_BX2_ ## reg_name + REG_STRUCT[id].reg_name = NBIO_BASE(regBIF_BX1_ ## reg_name ## _BASE_IDX) + \ + regBIF_BX1_ ## reg_name #define bios_regs_init() \ ( \ From 2ff5b0efb7d061a2081006afbe734079e64671ef Mon Sep 17 00:00:00 2001 From: Ray Wu Date: Fri, 28 Nov 2025 09:14:09 +0800 Subject: [PATCH 0200/1321] drm/amd/display: Fix scratch registers offsets for DCN351 [Why] Different platforms use different NBIO header files, causing display code to use differnt offset and read wrong accelerated status. [How] - Unified NBIO offset header file across platform. - Correct scratch registers offsets to proper locations. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4667 Cc: Mario Limonciello Cc: Alex Deucher Reviewed-by: Mario Limonciello Signed-off-by: Ray Wu Signed-off-by: Chenyu Chen Tested-by: Daniel Wheeler Signed-off-by: Alex Deucher (cherry picked from commit 576e032e909c8a6bb3d907b4ef5f6abe0f644199) Cc: stable@vger.kernel.org --- .../drm/amd/display/dc/resource/dcn351/dcn351_resource.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn351/dcn351_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn351/dcn351_resource.c index f3c614c4490ce..9fab3169069c4 100644 --- a/drivers/gpu/drm/amd/display/dc/resource/dcn351/dcn351_resource.c +++ b/drivers/gpu/drm/amd/display/dc/resource/dcn351/dcn351_resource.c @@ -183,12 +183,12 @@ enum dcn351_clk_src_array_id { NBIO_BASE_INNER(seg) #define NBIO_SR(reg_name)\ - REG_STRUCT.reg_name = NBIO_BASE(regBIF_BX2_ ## reg_name ## _BASE_IDX) + \ - regBIF_BX2_ ## reg_name + REG_STRUCT.reg_name = NBIO_BASE(regBIF_BX1_ ## reg_name ## _BASE_IDX) + \ + regBIF_BX1_ ## reg_name #define NBIO_SR_ARR(reg_name, id)\ - REG_STRUCT[id].reg_name = NBIO_BASE(regBIF_BX2_ ## reg_name ## _BASE_IDX) + \ - regBIF_BX2_ ## reg_name + REG_STRUCT[id].reg_name = NBIO_BASE(regBIF_BX1_ ## reg_name ## _BASE_IDX) + \ + regBIF_BX1_ ## reg_name #define bios_regs_init() \ ( \ From 49e522b058c219b5a5eb916807b0408703c8e9d0 Mon Sep 17 00:00:00 2001 From: Charlene Liu Date: Fri, 28 Nov 2025 19:38:31 -0500 Subject: [PATCH 0201/1321] drm/amd/display: Fix DP no audio issue [why] need to enable APG_CLOCK_ENABLE enable first also need to wake up az from D3 before access az block Reviewed-by: Swapnil Patel Signed-off-by: Charlene Liu Signed-off-by: Chenyu Chen Tested-by: Daniel Wheeler Signed-off-by: Alex Deucher (cherry picked from commit bf5e396957acafd46003318965500914d5f4edfa) --- drivers/gpu/drm/amd/display/dc/hwss/dce110/dce110_hwseq.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dce110/dce110_hwseq.c b/drivers/gpu/drm/amd/display/dc/hwss/dce110/dce110_hwseq.c index 4986f12dc9dfd..0cdd8c74abdfa 100644 --- a/drivers/gpu/drm/amd/display/dc/hwss/dce110/dce110_hwseq.c +++ b/drivers/gpu/drm/amd/display/dc/hwss/dce110/dce110_hwseq.c @@ -1118,13 +1118,13 @@ void dce110_enable_audio_stream(struct pipe_ctx *pipe_ctx) if (dc->current_state->res_ctx.pipe_ctx[i].stream_res.audio != NULL) num_audio++; } + if (num_audio >= 1 && clk_mgr->funcs->enable_pme_wa) { + /*wake AZ from D3 first before access az endpoint*/ + clk_mgr->funcs->enable_pme_wa(clk_mgr); + } pipe_ctx->stream_res.audio->funcs->az_enable(pipe_ctx->stream_res.audio); - if (num_audio >= 1 && clk_mgr->funcs->enable_pme_wa) - /*this is the first audio. apply the PME w/a in order to wake AZ from D3*/ - clk_mgr->funcs->enable_pme_wa(clk_mgr); - link_hwss->enable_audio_packet(pipe_ctx); if (pipe_ctx->stream_res.audio) From 3526fbf2c95c422a9a9a131a9afe5b57587e6903 Mon Sep 17 00:00:00 2001 From: Alex Deucher Date: Wed, 10 Dec 2025 11:02:30 -0500 Subject: [PATCH 0202/1321] drm/amdgpu: fix a job->pasid access race in gpu recovery MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Avoid a possible UAF in GPU recovery due to a race between the sched timeout callback and the tdr work queue. The gpu recovery function calls drm_sched_stop() and later drm_sched_start(). drm_sched_start() restarts the tdr queue which will eventually free the job. If the tdr queue frees the job before time out callback completes, the job will be freed and we'll get a UAF when accessing the pasid. Cache it early to avoid the UAF. Example KASAN trace: [ 493.058141] BUG: KASAN: slab-use-after-free in amdgpu_device_gpu_recover+0x968/0x990 [amdgpu] [ 493.067530] Read of size 4 at addr ffff88b0ce3f794c by task kworker/u128:1/323 [ 493.074892] [ 493.076485] CPU: 9 UID: 0 PID: 323 Comm: kworker/u128:1 Tainted: G E 6.16.0-1289896.2.zuul.bf4f11df81c1410bbe901c4373305a31 #1 PREEMPT(voluntary) [ 493.076493] Tainted: [E]=UNSIGNED_MODULE [ 493.076495] Hardware name: TYAN B8021G88V2HR-2T/S8021GM2NR-2T, BIOS V1.03.B10 04/01/2019 [ 493.076500] Workqueue: amdgpu-reset-dev drm_sched_job_timedout [gpu_sched] [ 493.076512] Call Trace: [ 493.076515] [ 493.076518] dump_stack_lvl+0x64/0x80 [ 493.076529] print_report+0xce/0x630 [ 493.076536] ? _raw_spin_lock_irqsave+0x86/0xd0 [ 493.076541] ? __pfx__raw_spin_lock_irqsave+0x10/0x10 [ 493.076545] ? amdgpu_device_gpu_recover+0x968/0x990 [amdgpu] [ 493.077253] kasan_report+0xb8/0xf0 [ 493.077258] ? amdgpu_device_gpu_recover+0x968/0x990 [amdgpu] [ 493.077965] amdgpu_device_gpu_recover+0x968/0x990 [amdgpu] [ 493.078672] ? __pfx_amdgpu_device_gpu_recover+0x10/0x10 [amdgpu] [ 493.079378] ? amdgpu_coredump+0x1fd/0x4c0 [amdgpu] [ 493.080111] amdgpu_job_timedout+0x642/0x1400 [amdgpu] [ 493.080903] ? pick_task_fair+0x24e/0x330 [ 493.080910] ? __pfx_amdgpu_job_timedout+0x10/0x10 [amdgpu] [ 493.081702] ? _raw_spin_lock+0x75/0xc0 [ 493.081708] ? __pfx__raw_spin_lock+0x10/0x10 [ 493.081712] drm_sched_job_timedout+0x1b0/0x4b0 [gpu_sched] [ 493.081721] ? __pfx__raw_spin_lock_irq+0x10/0x10 [ 493.081725] process_one_work+0x679/0xff0 [ 493.081732] worker_thread+0x6ce/0xfd0 [ 493.081736] ? __pfx_worker_thread+0x10/0x10 [ 493.081739] kthread+0x376/0x730 [ 493.081744] ? __pfx_kthread+0x10/0x10 [ 493.081748] ? __pfx__raw_spin_lock_irq+0x10/0x10 [ 493.081751] ? __pfx_kthread+0x10/0x10 [ 493.081755] ret_from_fork+0x247/0x330 [ 493.081761] ? __pfx_kthread+0x10/0x10 [ 493.081764] ret_from_fork_asm+0x1a/0x30 [ 493.081771] Fixes: a72002cb181f ("drm/amdgpu: Make use of drm_wedge_task_info") Link: https://github.com/HansKristian-Work/vkd3d-proton/pull/2670 Cc: SRINIVASAN.SHANMUGAM@amd.com Cc: vitaly.prosyak@amd.com Cc: christian.koenig@amd.com Suggested-by: Matthew Brost Reviewed-by: Srinivasan Shanmugam Reviewed-by: Lijo Lazar Reviewed-by: Christian König Signed-off-by: Alex Deucher (cherry picked from commit 20880a3fd5dd7bca1a079534cf6596bda92e107d) --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 58c3ffe707d1d..12201b8e99b3f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -6613,6 +6613,8 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev, struct amdgpu_hive_info *hive = NULL; int r = 0; bool need_emergency_restart = false; + /* save the pasid here as the job may be freed before the end of the reset */ + int pasid = job ? job->pasid : -EINVAL; /* * If it reaches here because of hang/timeout and a RAS error is @@ -6713,8 +6715,12 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev, if (!r) { struct amdgpu_task_info *ti = NULL; - if (job) - ti = amdgpu_vm_get_task_info_pasid(adev, job->pasid); + /* + * The job may already be freed at this point via the sched tdr workqueue so + * use the cached pasid. + */ + if (pasid >= 0) + ti = amdgpu_vm_get_task_info_pasid(adev, pasid); drm_dev_wedged_event(adev_to_drm(adev), DRM_WEDGE_RECOVERY_NONE, ti ? &ti->task : NULL); From 87d949aa515bd39c76cb368e0a671402b00764af Mon Sep 17 00:00:00 2001 From: mythilam Date: Thu, 4 Dec 2025 11:04:12 +0530 Subject: [PATCH 0203/1321] drm/amd/pm: restore SCLK settings after S0ix resume User-configured SCLK(GPU core clock)frequencies were not persisting across S0ix suspend/resume cycles on smu v14 hardware. The issue occurred because of the code resetting clock frequency to zero during resume. This patch addresses the problem by: - Preserving user-configured values in driver and sets the clock frequency across resume - Preserved settings are sent to the hardware during resume Signed-off-by: mythilam Acked-by: Alex Deucher Reviewed-by: Yang Wang Signed-off-by: Alex Deucher (cherry picked from commit 20ba98326f4c69e6bf8d1f42942ece485a675b27) --- .../gpu/drm/amd/pm/swsmu/smu14/smu_v14_0.c | 5 +++ .../drm/amd/pm/swsmu/smu14/smu_v14_0_0_ppt.c | 37 ++++++++++++++++--- 2 files changed, 37 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0.c b/drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0.c index f9b0938c57ea7..f2a16dfee5998 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0.c @@ -1939,6 +1939,11 @@ int smu_v14_0_od_edit_dpm_table(struct smu_context *smu, dev_err(smu->adev->dev, "Set soft max sclk failed!"); return ret; } + if (smu->gfx_actual_hard_min_freq != smu->gfx_default_hard_min_freq || + smu->gfx_actual_soft_max_freq != smu->gfx_default_soft_max_freq) + smu->user_dpm_profile.user_od = true; + else + smu->user_dpm_profile.user_od = false; break; default: return -ENOSYS; diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0_0_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0_0_ppt.c index b1bd946d8e309..97414bc397641 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0_0_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0_0_ppt.c @@ -1514,9 +1514,10 @@ static int smu_v14_0_1_set_fine_grain_gfx_freq_parameters(struct smu_context *sm smu->gfx_default_hard_min_freq = clk_table->MinGfxClk; smu->gfx_default_soft_max_freq = clk_table->MaxGfxClk; - smu->gfx_actual_hard_min_freq = 0; - smu->gfx_actual_soft_max_freq = 0; - + if (smu->gfx_actual_hard_min_freq == 0) + smu->gfx_actual_hard_min_freq = smu->gfx_default_hard_min_freq; + if (smu->gfx_actual_soft_max_freq == 0) + smu->gfx_actual_soft_max_freq = smu->gfx_default_soft_max_freq; return 0; } @@ -1526,8 +1527,10 @@ static int smu_v14_0_0_set_fine_grain_gfx_freq_parameters(struct smu_context *sm smu->gfx_default_hard_min_freq = clk_table->MinGfxClk; smu->gfx_default_soft_max_freq = clk_table->MaxGfxClk; - smu->gfx_actual_hard_min_freq = 0; - smu->gfx_actual_soft_max_freq = 0; + if (smu->gfx_actual_hard_min_freq == 0) + smu->gfx_actual_hard_min_freq = smu->gfx_default_hard_min_freq; + if (smu->gfx_actual_soft_max_freq == 0) + smu->gfx_actual_soft_max_freq = smu->gfx_default_soft_max_freq; return 0; } @@ -1665,6 +1668,29 @@ static int smu_v14_0_common_set_mall_enable(struct smu_context *smu) return ret; } +static int smu_v14_0_0_restore_user_od_settings(struct smu_context *smu) +{ + int ret; + + ret = smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_SetHardMinGfxClk, + smu->gfx_actual_hard_min_freq, + NULL); + if (ret) { + dev_err(smu->adev->dev, "Failed to restore hard min sclk!\n"); + return ret; + } + + ret = smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_SetSoftMaxGfxClk, + smu->gfx_actual_soft_max_freq, + NULL); + if (ret) { + dev_err(smu->adev->dev, "Failed to restore soft max sclk!\n"); + return ret; + } + + return 0; +} + static const struct pptable_funcs smu_v14_0_0_ppt_funcs = { .check_fw_status = smu_v14_0_check_fw_status, .check_fw_version = smu_v14_0_check_fw_version, @@ -1688,6 +1714,7 @@ static const struct pptable_funcs smu_v14_0_0_ppt_funcs = { .mode2_reset = smu_v14_0_0_mode2_reset, .get_dpm_ultimate_freq = smu_v14_0_common_get_dpm_ultimate_freq, .set_soft_freq_limited_range = smu_v14_0_0_set_soft_freq_limited_range, + .restore_user_od_settings = smu_v14_0_0_restore_user_od_settings, .od_edit_dpm_table = smu_v14_0_od_edit_dpm_table, .print_clk_levels = smu_v14_0_0_print_clk_levels, .force_clk_levels = smu_v14_0_0_force_clk_levels, From 61089dcd717c55172c2f1ea680008ceafde79a71 Mon Sep 17 00:00:00 2001 From: Brian Kocoloski Date: Thu, 20 Nov 2025 13:57:19 -0500 Subject: [PATCH 0204/1321] drm/amdkfd: Fix improper NULL termination of queue restore SMI event string Pass character "0" rather than NULL terminator to properly format queue restoration SMI events. Currently, the NULL terminator precedes the newline character that is intended to delineate separate events in the SMI event buffer, which can break userspace parsers. Signed-off-by: Brian Kocoloski Reviewed-by: Philip Yang Signed-off-by: Alex Deucher (cherry picked from commit 6e7143e5e6e21f9d5572e0390f7089e6d53edf3c) --- drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c index a499449fcb068..d2bc169e84b0b 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c @@ -312,7 +312,7 @@ void kfd_smi_event_queue_restore(struct kfd_node *node, pid_t pid) { kfd_smi_event_add(pid, node, KFD_SMI_EVENT_QUEUE_RESTORE, KFD_EVENT_FMT_QUEUE_RESTORE(ktime_get_boottime_ns(), pid, - node->id, 0)); + node->id, '0')); } void kfd_smi_event_queue_restore_rescheduled(struct mm_struct *mm) From 7ed3ee73e2ebe757b9895f98feefb9ebc69e1892 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Jos=C3=A9=20Exp=C3=B3sito?= Date: Tue, 4 Nov 2025 11:22:35 +0100 Subject: [PATCH 0205/1321] drm/tests: hdmi: Handle drm_kunit_helper_enable_crtc_connector() returning EDEADLK MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Fedora/CentOS/RHEL CI is reporting intermittent failures while running the KUnit tests present in drm_hdmi_state_helper_test.c [1]. While the specific test causing the failure change between runs, all of them are caused by drm_kunit_helper_enable_crtc_connector() returning -EDEADLK. The error trace always follow this structure: # : ASSERTION FAILED at # drivers/gpu/drm/tests/drm_hdmi_state_helper_test.c: Expected ret == 0, but ret == -35 (0xffffffffffffffdd) As documented, if the drm_kunit_helper_enable_crtc_connector() function returns -EDEADLK (-35), the entire atomic sequence must be restarted. Handle this error code for all function calls. Closes: https://datawarehouse.cki-project.org/issue/4039 [1] Fixes: 6a5c0ad7e08e ("drm/tests: hdmi_state_helpers: Switch to new helper") Reviewed-by: Maxime Ripard Signed-off-by: José Expósito Link: https://patch.msgid.link/20251104102258.10026-1-jose.exposito89@gmail.com --- .../drm/tests/drm_hdmi_state_helper_test.c | 143 ++++++++++++++++++ 1 file changed, 143 insertions(+) diff --git a/drivers/gpu/drm/tests/drm_hdmi_state_helper_test.c b/drivers/gpu/drm/tests/drm_hdmi_state_helper_test.c index 8bd412735000c..70f9aa7021430 100644 --- a/drivers/gpu/drm/tests/drm_hdmi_state_helper_test.c +++ b/drivers/gpu/drm/tests/drm_hdmi_state_helper_test.c @@ -257,10 +257,16 @@ static void drm_test_check_broadcast_rgb_crtc_mode_changed(struct kunit *test) drm_modeset_acquire_init(&ctx, 0); +retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, preferred, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_ASSERT_EQ(test, ret, 0); state = drm_kunit_helper_atomic_state_alloc(test, drm, &ctx); @@ -326,10 +332,16 @@ static void drm_test_check_broadcast_rgb_crtc_mode_not_changed(struct kunit *tes drm_modeset_acquire_init(&ctx, 0); +retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, preferred, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_ASSERT_EQ(test, ret, 0); state = drm_kunit_helper_atomic_state_alloc(test, drm, &ctx); @@ -397,10 +409,16 @@ static void drm_test_check_broadcast_rgb_auto_cea_mode(struct kunit *test) drm_modeset_acquire_init(&ctx, 0); +retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, preferred, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_ASSERT_EQ(test, ret, 0); state = drm_kunit_helper_atomic_state_alloc(test, drm, &ctx); @@ -457,10 +475,17 @@ static void drm_test_check_broadcast_rgb_auto_cea_mode_vic_1(struct kunit *test) KUNIT_ASSERT_NOT_NULL(test, mode); crtc = priv->crtc; + +retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, mode, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_ASSERT_EQ(test, ret, 0); state = drm_kunit_helper_atomic_state_alloc(test, drm, &ctx); @@ -518,10 +543,16 @@ static void drm_test_check_broadcast_rgb_full_cea_mode(struct kunit *test) drm_modeset_acquire_init(&ctx, 0); +retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, preferred, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_ASSERT_EQ(test, ret, 0); state = drm_kunit_helper_atomic_state_alloc(test, drm, &ctx); @@ -580,10 +611,17 @@ static void drm_test_check_broadcast_rgb_full_cea_mode_vic_1(struct kunit *test) KUNIT_ASSERT_NOT_NULL(test, mode); crtc = priv->crtc; + +retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, mode, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_ASSERT_EQ(test, ret, 0); state = drm_kunit_helper_atomic_state_alloc(test, drm, &ctx); @@ -643,10 +681,16 @@ static void drm_test_check_broadcast_rgb_limited_cea_mode(struct kunit *test) drm_modeset_acquire_init(&ctx, 0); +retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, preferred, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_ASSERT_EQ(test, ret, 0); state = drm_kunit_helper_atomic_state_alloc(test, drm, &ctx); @@ -705,10 +749,17 @@ static void drm_test_check_broadcast_rgb_limited_cea_mode_vic_1(struct kunit *te KUNIT_ASSERT_NOT_NULL(test, mode); crtc = priv->crtc; + +retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, mode, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_ASSERT_EQ(test, ret, 0); state = drm_kunit_helper_atomic_state_alloc(test, drm, &ctx); @@ -870,10 +921,16 @@ static void drm_test_check_output_bpc_crtc_mode_changed(struct kunit *test) drm_modeset_acquire_init(&ctx, 0); +retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, preferred, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_ASSERT_EQ(test, ret, 0); state = drm_kunit_helper_atomic_state_alloc(test, drm, &ctx); @@ -946,10 +1003,16 @@ static void drm_test_check_output_bpc_crtc_mode_not_changed(struct kunit *test) drm_modeset_acquire_init(&ctx, 0); +retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, preferred, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_ASSERT_EQ(test, ret, 0); state = drm_kunit_helper_atomic_state_alloc(test, drm, &ctx); @@ -1022,10 +1085,16 @@ static void drm_test_check_output_bpc_dvi(struct kunit *test) drm_modeset_acquire_init(&ctx, 0); +retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, preferred, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_ASSERT_EQ(test, ret, 0); conn_state = conn->state; @@ -1069,10 +1138,16 @@ static void drm_test_check_tmds_char_rate_rgb_8bpc(struct kunit *test) drm_modeset_acquire_init(&ctx, 0); +retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, preferred, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_ASSERT_EQ(test, ret, 0); conn_state = conn->state; @@ -1118,10 +1193,16 @@ static void drm_test_check_tmds_char_rate_rgb_10bpc(struct kunit *test) drm_modeset_acquire_init(&ctx, 0); +retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, preferred, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_ASSERT_EQ(test, ret, 0); conn_state = conn->state; @@ -1167,10 +1248,16 @@ static void drm_test_check_tmds_char_rate_rgb_12bpc(struct kunit *test) drm_modeset_acquire_init(&ctx, 0); +retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, preferred, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_ASSERT_EQ(test, ret, 0); conn_state = conn->state; @@ -1218,10 +1305,16 @@ static void drm_test_check_hdmi_funcs_reject_rate(struct kunit *test) drm_modeset_acquire_init(&ctx, 0); +retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, preferred, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_ASSERT_EQ(test, ret, 0); /* You shouldn't be doing that at home. */ @@ -1292,10 +1385,16 @@ static void drm_test_check_max_tmds_rate_bpc_fallback_rgb(struct kunit *test) drm_modeset_acquire_init(&ctx, 0); +retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, preferred, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_EXPECT_EQ(test, ret, 0); conn_state = conn->state; @@ -1440,10 +1539,16 @@ static void drm_test_check_max_tmds_rate_bpc_fallback_ignore_yuv422(struct kunit drm_modeset_acquire_init(&ctx, 0); +retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, preferred, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_EXPECT_EQ(test, ret, 0); conn_state = conn->state; @@ -1669,10 +1774,17 @@ static void drm_test_check_output_bpc_format_vic_1(struct kunit *test) drm_modeset_acquire_init(&ctx, 0); crtc = priv->crtc; + +retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, mode, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_EXPECT_EQ(test, ret, 0); conn_state = conn->state; @@ -1736,10 +1848,16 @@ static void drm_test_check_output_bpc_format_driver_rgb_only(struct kunit *test) drm_modeset_acquire_init(&ctx, 0); +retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, preferred, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_EXPECT_EQ(test, ret, 0); conn_state = conn->state; @@ -1805,10 +1923,16 @@ static void drm_test_check_output_bpc_format_display_rgb_only(struct kunit *test drm_modeset_acquire_init(&ctx, 0); +retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, preferred, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_EXPECT_EQ(test, ret, 0); conn_state = conn->state; @@ -1865,10 +1989,16 @@ static void drm_test_check_output_bpc_format_driver_8bpc_only(struct kunit *test drm_modeset_acquire_init(&ctx, 0); +retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, preferred, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_EXPECT_EQ(test, ret, 0); conn_state = conn->state; @@ -1927,10 +2057,16 @@ static void drm_test_check_output_bpc_format_display_8bpc_only(struct kunit *tes drm_modeset_acquire_init(&ctx, 0); +retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, preferred, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_EXPECT_EQ(test, ret, 0); conn_state = conn->state; @@ -1970,10 +2106,17 @@ static void drm_test_check_disable_connector(struct kunit *test) drm = &priv->drm; crtc = priv->crtc; + +retry_conn_enable: ret = drm_kunit_helper_enable_crtc_connector(test, drm, crtc, conn, preferred, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_conn_enable; + } KUNIT_ASSERT_EQ(test, ret, 0); state = drm_kunit_helper_atomic_state_alloc(test, drm, &ctx); From 31e24076444c11939a62cbbd5c0dcfbe81f6537a Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Jos=C3=A9=20Exp=C3=B3sito?= Date: Tue, 4 Nov 2025 11:25:21 +0100 Subject: [PATCH 0206/1321] drm/tests: Handle EDEADLK in drm_test_check_valid_clones() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Fedora/CentOS/RHEL CI is reporting intermittent failures while running the drm_test_check_valid_clones() KUnit test. The error log can be either [1]: # drm_test_check_valid_clones: ASSERTION FAILED at # drivers/gpu/drm/tests/drm_atomic_state_test.c:295 Expected ret == param->expected_result, but ret == -35 (0xffffffffffffffdd) param->expected_result == 0 (0x0) Or [2] depending on the test case: # drm_test_check_valid_clones: ASSERTION FAILED at # drivers/gpu/drm/tests/drm_atomic_state_test.c:295 Expected ret == param->expected_result, but ret == -35 (0xffffffffffffffdd) param->expected_result == -22 (0xffffffffffffffea) Restart the atomic sequence when EDEADLK is returned. [1] https://s3.amazonaws.com/arr-cki-prod-trusted-artifacts/trusted-artifacts/2113057246/test_x86_64/11802139999/artifacts/jobwatch/logs/recipes/19824965/tasks/204347800/results/946112713/logs/dmesg.log [2] https://s3.amazonaws.com/arr-cki-prod-trusted-artifacts/trusted-artifacts/2106744297/test_aarch64/11762450907/artifacts/jobwatch/logs/recipes/19797942/tasks/204139727/results/945094561/logs/dmesg.log Fixes: 88849f24e2ab ("drm/tests: Add test for drm_atomic_helper_check_modeset()") Closes: https://datawarehouse.cki-project.org/issue/4004 Reviewed-by: Maxime Ripard Signed-off-by: José Expósito Link: https://patch.msgid.link/20251104102535.12212-1-jose.exposito89@gmail.com --- drivers/gpu/drm/tests/drm_atomic_state_test.c | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/drivers/gpu/drm/tests/drm_atomic_state_test.c b/drivers/gpu/drm/tests/drm_atomic_state_test.c index 2f6ac7a09f445..1e857d86574cc 100644 --- a/drivers/gpu/drm/tests/drm_atomic_state_test.c +++ b/drivers/gpu/drm/tests/drm_atomic_state_test.c @@ -283,7 +283,14 @@ static void drm_test_check_valid_clones(struct kunit *test) state = drm_kunit_helper_atomic_state_alloc(test, drm, &ctx); KUNIT_ASSERT_NOT_ERR_OR_NULL(test, state); +retry: crtc_state = drm_atomic_get_crtc_state(state, priv->crtc); + if (PTR_ERR(crtc_state) == -EDEADLK) { + drm_atomic_state_clear(state); + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry; + } KUNIT_ASSERT_NOT_ERR_OR_NULL(test, crtc_state); crtc_state->encoder_mask = param->encoder_mask; @@ -292,6 +299,12 @@ static void drm_test_check_valid_clones(struct kunit *test) crtc_state->mode_changed = true; ret = drm_atomic_helper_check_modeset(drm, state); + if (ret == -EDEADLK) { + drm_atomic_state_clear(state); + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry; + } KUNIT_ASSERT_EQ(test, ret, param->expected_result); drm_modeset_drop_locks(&ctx); From 65eb5a10f1f678753e781c6af4705a10d514922f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Jos=C3=A9=20Exp=C3=B3sito?= Date: Tue, 4 Nov 2025 11:25:22 +0100 Subject: [PATCH 0207/1321] drm/tests: Handle EDEADLK in set_up_atomic_state() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Fedora/CentOS/RHEL CI is reporting intermittent failures while running the drm_validate_modeset test [1]: # drm_test_check_connector_changed_modeset: EXPECTATION FAILED at # drivers/gpu/drm/tests/drm_atomic_state_test.c:162 Expected ret == 0, but ret == -35 (0xffffffffffffffdd) Change the set_up_atomic_state() helper function to return on error and restart the atomic sequence when the returned error is EDEADLK. [1] https://s3.amazonaws.com/arr-cki-prod-trusted-artifacts/trusted-artifacts/2106744096/test_x86_64/11762450343/artifacts/jobwatch/logs/recipes/19797909/tasks/204139142/results/945095586/logs/dmesg.log Fixes: 73d934d7b6e3 ("drm/tests: Add test for drm_atomic_helper_commit_modeset_disables()") Closes: https://datawarehouse.cki-project.org/issue/4004 Reviewed-by: Maxime Ripard Signed-off-by: José Expósito Link: https://patch.msgid.link/20251104102535.12212-2-jose.exposito89@gmail.com --- drivers/gpu/drm/tests/drm_atomic_state_test.c | 27 +++++++++++++++---- 1 file changed, 22 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/tests/drm_atomic_state_test.c b/drivers/gpu/drm/tests/drm_atomic_state_test.c index 1e857d86574cc..bc27f65b28233 100644 --- a/drivers/gpu/drm/tests/drm_atomic_state_test.c +++ b/drivers/gpu/drm/tests/drm_atomic_state_test.c @@ -156,24 +156,29 @@ static int set_up_atomic_state(struct kunit *test, if (connector) { conn_state = drm_atomic_get_connector_state(state, connector); - KUNIT_ASSERT_NOT_ERR_OR_NULL(test, conn_state); + if (IS_ERR(conn_state)) + return PTR_ERR(conn_state); ret = drm_atomic_set_crtc_for_connector(conn_state, crtc); - KUNIT_EXPECT_EQ(test, ret, 0); + if (ret) + return ret; } crtc_state = drm_atomic_get_crtc_state(state, crtc); - KUNIT_ASSERT_NOT_ERR_OR_NULL(test, crtc_state); + if (IS_ERR(crtc_state)) + return PTR_ERR(crtc_state); ret = drm_atomic_set_mode_for_crtc(crtc_state, &drm_atomic_test_mode); - KUNIT_EXPECT_EQ(test, ret, 0); + if (ret) + return ret; crtc_state->enable = true; crtc_state->active = true; if (connector) { ret = drm_atomic_commit(state); - KUNIT_ASSERT_EQ(test, ret, 0); + if (ret) + return ret; } else { // dummy connector mask crtc_state->connector_mask = DRM_TEST_CONN_0; @@ -206,7 +211,13 @@ static void drm_test_check_connector_changed_modeset(struct kunit *test) drm_modeset_acquire_init(&ctx, 0); // first modeset to enable +retry_set_up: ret = set_up_atomic_state(test, priv, old_conn, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_set_up; + } KUNIT_ASSERT_EQ(test, ret, 0); state = drm_kunit_helper_atomic_state_alloc(test, drm, &ctx); @@ -277,7 +288,13 @@ static void drm_test_check_valid_clones(struct kunit *test) drm_modeset_acquire_init(&ctx, 0); +retry_set_up: ret = set_up_atomic_state(test, priv, NULL, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) + goto retry_set_up; + } KUNIT_ASSERT_EQ(test, ret, 0); state = drm_kunit_helper_atomic_state_alloc(test, drm, &ctx); From ca3aa5e401ebf415f715729055d449a86e4f4e24 Mon Sep 17 00:00:00 2001 From: Karol Wachowski Date: Fri, 12 Dec 2025 14:41:33 +0100 Subject: [PATCH 0208/1321] drm: Fix object leak in DRM_IOCTL_GEM_CHANGE_HANDLE MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add missing drm_gem_object_put() call when drm_gem_object_lookup() successfully returns an object. This fixes a GEM object reference leak that can prevent driver modules from unloading when using prime buffers. Fixes: 53096728b891 ("drm: Add DRM prime interface to reassign GEM handle") Cc: # v6.18+ Signed-off-by: Karol Wachowski Reviewed-by: Christian König Reviewed-by: Maciej Falkowski Signed-off-by: Christian König Link: https://lore.kernel.org/r/20251212134133.475218-1-karol.wachowski@linux.intel.com --- drivers/gpu/drm/drm_gem.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c index efc79bbf3c73d..e4df434273945 100644 --- a/drivers/gpu/drm/drm_gem.c +++ b/drivers/gpu/drm/drm_gem.c @@ -969,8 +969,10 @@ int drm_gem_change_handle_ioctl(struct drm_device *dev, void *data, if (!obj) return -ENOENT; - if (args->handle == args->new_handle) - return 0; + if (args->handle == args->new_handle) { + ret = 0; + goto out; + } mutex_lock(&file_priv->prime.lock); @@ -1002,6 +1004,8 @@ int drm_gem_change_handle_ioctl(struct drm_device *dev, void *data, out_unlock: mutex_unlock(&file_priv->prime.lock); +out: + drm_gem_object_put(obj); return ret; } From 1e765663c6e61ec23164020858dac232b8f6020c Mon Sep 17 00:00:00 2001 From: Marijn Suijten Date: Sun, 30 Nov 2025 23:40:05 +0100 Subject: [PATCH 0209/1321] drm/panel: sony-td4353-jdi: Enable prepare_prev_first The DSI host must be enabled before our prepare function can run, which has to send its init sequence over DSI. Without enabling the host first the panel will not probe. Fixes: 9e15123eca79 ("drm/msm/dsi: Stop unconditionally powering up DSI hosts at modeset") Signed-off-by: Marijn Suijten Reviewed-by: Douglas Anderson Reviewed-by: AngeloGioacchino Del Regno Reviewed-by: Martin Botka Signed-off-by: Douglas Anderson Link: https://patch.msgid.link/20251130-sony-akari-fix-panel-v1-1-1d27c60a55f5@somainline.org --- drivers/gpu/drm/panel/panel-sony-td4353-jdi.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/panel/panel-sony-td4353-jdi.c b/drivers/gpu/drm/panel/panel-sony-td4353-jdi.c index 7c989b70ab513..a14c86c60d19d 100644 --- a/drivers/gpu/drm/panel/panel-sony-td4353-jdi.c +++ b/drivers/gpu/drm/panel/panel-sony-td4353-jdi.c @@ -212,6 +212,8 @@ static int sony_td4353_jdi_probe(struct mipi_dsi_device *dsi) if (ret) return dev_err_probe(dev, ret, "Failed to get backlight\n"); + ctx->panel.prepare_prev_first = true; + drm_panel_add(&ctx->panel); ret = mipi_dsi_attach(dsi); From c28e43c6ca81996c2a921eaccd60507452d5ee16 Mon Sep 17 00:00:00 2001 From: "Mario Limonciello (AMD)" Date: Fri, 12 Dec 2025 23:44:47 -0600 Subject: [PATCH 0210/1321] accel/amdxdna: Block running under a hypervisor SVA support is required, which isn't configured by hypervisor solutions. Closes: https://github.com/QubesOS/qubes-issues/issues/10275 Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4656 Reviewed-by: Lizhi Hou Link: https://patch.msgid.link/20251213054513.87925-1-superm1@kernel.org Signed-off-by: Mario Limonciello (AMD) --- drivers/accel/amdxdna/aie2_pci.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/drivers/accel/amdxdna/aie2_pci.c b/drivers/accel/amdxdna/aie2_pci.c index ceef1c502e9e2..8141d8e516360 100644 --- a/drivers/accel/amdxdna/aie2_pci.c +++ b/drivers/accel/amdxdna/aie2_pci.c @@ -17,6 +17,7 @@ #include #include #include +#include #include "aie2_msg_priv.h" #include "aie2_pci.h" @@ -508,6 +509,11 @@ static int aie2_init(struct amdxdna_dev *xdna) unsigned long bars = 0; int i, nvec, ret; + if (!hypervisor_is_type(X86_HYPER_NATIVE)) { + XDNA_ERR(xdna, "Running under hypervisor not supported"); + return -EINVAL; + } + ndev = drmm_kzalloc(&xdna->ddev, sizeof(*ndev), GFP_KERNEL); if (!ndev) return -ENOMEM; From 3a19cdd40cfe94b5638539a93f85d43d4d49c465 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Guido=20G=C3=BCnther?= Date: Fri, 17 Oct 2025 10:27:59 +0200 Subject: [PATCH 0211/1321] drm/panel: visionox-rm69299: Depend on BACKLIGHT_CLASS_DEVICE MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit We handle backlight so need that dependency. Fixes: 7911d8cab554 ("drm/panel: visionox-rm69299: Add backlight support") Reported-by: kernelci.org bot Signed-off-by: Guido Günther Reviewed-by: Neil Armstrong Reviewed-by: Randy Dunlap Tested-by: Randy Dunlap Reviewed-by: David Heidelberg Signed-off-by: Neil Armstrong Link: https://patch.msgid.link/20251017-visionox-rm69299-bl-v2-1-9dfa06606754@sigxcpu.org --- drivers/gpu/drm/panel/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/panel/Kconfig b/drivers/gpu/drm/panel/Kconfig index 76f6af8190376..7a83804fedca1 100644 --- a/drivers/gpu/drm/panel/Kconfig +++ b/drivers/gpu/drm/panel/Kconfig @@ -1165,6 +1165,7 @@ config DRM_PANEL_VISIONOX_RM69299 tristate "Visionox RM69299" depends on OF depends on DRM_MIPI_DSI + depends on BACKLIGHT_CLASS_DEVICE help Say Y here if you want to enable support for Visionox RM69299 DSI Video Mode panel. From 62819aaeb3a318b79b6b4ca994370b8e2bd3fc52 Mon Sep 17 00:00:00 2001 From: Shuicheng Lin Date: Fri, 14 Nov 2025 20:56:39 +0000 Subject: [PATCH 0212/1321] drm/xe: Fix freq kobject leak on sysfs_create_files failure MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Ensure gt->freq is released when sysfs_create_files() fails in xe_gt_freq_init(). Without this, the kobject would leak. Add kobject_put() before returning the error. Fixes: fdc81c43f0c1 ("drm/xe: use devm_add_action_or_reset() helper") Signed-off-by: Shuicheng Lin Reviewed-by: Alex Zuo Reviewed-by: Xin Wang Link: https://patch.msgid.link/20251114205638.2184529-2-shuicheng.lin@intel.com Signed-off-by: Matt Roper (cherry picked from commit 251be5fb4982ebb0f5a81b62d975bd770f3ad5c2) Signed-off-by: Thomas Hellström --- drivers/gpu/drm/xe/xe_gt_freq.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xe/xe_gt_freq.c b/drivers/gpu/drm/xe/xe_gt_freq.c index 849ea6c86e8e2..ce3c7810469f7 100644 --- a/drivers/gpu/drm/xe/xe_gt_freq.c +++ b/drivers/gpu/drm/xe/xe_gt_freq.c @@ -293,8 +293,10 @@ int xe_gt_freq_init(struct xe_gt *gt) return -ENOMEM; err = sysfs_create_files(gt->freq, freq_attrs); - if (err) + if (err) { + kobject_put(gt->freq); return err; + } err = devm_add_action_or_reset(xe->drm.dev, freq_fini, gt->freq); if (err) From 4e50cf25da87ff1e7f531576bd4ece5423e25fb5 Mon Sep 17 00:00:00 2001 From: Vinay Belgaumkar Date: Fri, 28 Nov 2025 21:25:48 -0800 Subject: [PATCH 0213/1321] drm/xe: Apply Wa_14020316580 in xe_gt_idle_enable_pg() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Wa_14020316580 was getting clobbered by power gating init code later in the driver load sequence. Move the Wa so that it applies correctly. Fixes: 7cd05ef89c9d ("drm/xe/xe2hpm: Add initial set of workarounds") Suggested-by: Matt Roper Signed-off-by: Vinay Belgaumkar Reviewed-by: Riana Tauro Reviewed-by: Matt Roper Link: https://patch.msgid.link/20251129052548.70766-1-vinay.belgaumkar@intel.com Signed-off-by: Matt Roper (cherry picked from commit 8b5502145351bde87f522df082b9e41356898ba3) Signed-off-by: Thomas Hellström --- drivers/gpu/drm/xe/xe_gt_idle.c | 8 ++++++++ drivers/gpu/drm/xe/xe_wa.c | 8 -------- drivers/gpu/drm/xe/xe_wa_oob.rules | 1 + 3 files changed, 9 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_gt_idle.c b/drivers/gpu/drm/xe/xe_gt_idle.c index bdc9d9877ec49..3e3d1d52f6302 100644 --- a/drivers/gpu/drm/xe/xe_gt_idle.c +++ b/drivers/gpu/drm/xe/xe_gt_idle.c @@ -5,6 +5,7 @@ #include +#include #include "xe_force_wake.h" #include "xe_device.h" #include "xe_gt.h" @@ -16,6 +17,7 @@ #include "xe_mmio.h" #include "xe_pm.h" #include "xe_sriov.h" +#include "xe_wa.h" /** * DOC: Xe GT Idle @@ -145,6 +147,12 @@ void xe_gt_idle_enable_pg(struct xe_gt *gt) xe_mmio_write32(mmio, RENDER_POWERGATE_IDLE_HYSTERESIS, 25); } + if (XE_GT_WA(gt, 14020316580)) + gtidle->powergate_enable &= ~(VDN_HCP_POWERGATE_ENABLE(0) | + VDN_MFXVDENC_POWERGATE_ENABLE(0) | + VDN_HCP_POWERGATE_ENABLE(2) | + VDN_MFXVDENC_POWERGATE_ENABLE(2)); + xe_mmio_write32(mmio, POWERGATE_ENABLE, gtidle->powergate_enable); xe_force_wake_put(gt_to_fw(gt), fw_ref); } diff --git a/drivers/gpu/drm/xe/xe_wa.c b/drivers/gpu/drm/xe/xe_wa.c index 3764abca3d4f2..e32dd2fde6f1c 100644 --- a/drivers/gpu/drm/xe/xe_wa.c +++ b/drivers/gpu/drm/xe/xe_wa.c @@ -270,14 +270,6 @@ static const struct xe_rtp_entry_sr gt_was[] = { XE_RTP_ACTIONS(SET(VDBOX_CGCTL3F1C(0), MFXPIPE_CLKGATE_DIS)), XE_RTP_ENTRY_FLAG(FOREACH_ENGINE), }, - { XE_RTP_NAME("14020316580"), - XE_RTP_RULES(MEDIA_VERSION(1301)), - XE_RTP_ACTIONS(CLR(POWERGATE_ENABLE, - VDN_HCP_POWERGATE_ENABLE(0) | - VDN_MFXVDENC_POWERGATE_ENABLE(0) | - VDN_HCP_POWERGATE_ENABLE(2) | - VDN_MFXVDENC_POWERGATE_ENABLE(2))), - }, { XE_RTP_NAME("14019449301"), XE_RTP_RULES(MEDIA_VERSION(1301), ENGINE_CLASS(VIDEO_DECODE)), XE_RTP_ACTIONS(SET(VDBOX_CGCTL3F08(0), CG3DDISHRS_CLKGATE_DIS)), diff --git a/drivers/gpu/drm/xe/xe_wa_oob.rules b/drivers/gpu/drm/xe/xe_wa_oob.rules index fb38eb3d6e9a3..7ca7258eb5d82 100644 --- a/drivers/gpu/drm/xe/xe_wa_oob.rules +++ b/drivers/gpu/drm/xe/xe_wa_oob.rules @@ -76,3 +76,4 @@ 15015404425_disable PLATFORM(PANTHERLAKE), MEDIA_STEP(B0, FOREVER) 16026007364 MEDIA_VERSION(3000) +14020316580 MEDIA_VERSION(1301) From a8f87ee337ad1ca0702b3e44adfdaed8c79e02f3 Mon Sep 17 00:00:00 2001 From: Matthew Brost Date: Tue, 2 Dec 2025 17:18:09 -0800 Subject: [PATCH 0214/1321] drm/xe: Do not reference loop variable directly MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Do not reference the loop variable job after the loop has exited. Instead, save the job from the last iteration of the loop. Fixes: 3d98a7164da6 ("drm/xe/vf: Start re-emission from first unsignaled job during VF migration") Reported-by: kernel test robot Reported-by: Dan Carpenter Closes: https://lore.kernel.org/r/202511291102.jnnKP6IB-lkp@intel.com/ Signed-off-by: Matthew Brost Reviewed-by: Dnyaneshwar Bhadane Link: https://patch.msgid.link/20251203011809.968893-1-matthew.brost@intel.com (cherry picked from commit 76ce2313709f13a6adbcaa1a43a8539c8f509f6a) Signed-off-by: Thomas Hellström --- drivers/gpu/drm/xe/xe_guc_submit.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c index ed7be50b2f720..c0819377ce6e4 100644 --- a/drivers/gpu/drm/xe/xe_guc_submit.c +++ b/drivers/gpu/drm/xe/xe_guc_submit.c @@ -2253,10 +2253,11 @@ static void guc_exec_queue_unpause_prepare(struct xe_guc *guc, struct xe_exec_queue *q) { struct xe_gpu_scheduler *sched = &q->guc->sched; - struct xe_sched_job *job = NULL; + struct xe_sched_job *job = NULL, *__job; bool restore_replay = false; - list_for_each_entry(job, &sched->base.pending_list, drm.list) { + list_for_each_entry(__job, &sched->base.pending_list, drm.list) { + job = __job; restore_replay |= job->restore_replay; if (restore_replay) { xe_gt_dbg(guc_to_gt(guc), "Replay JOB - guc_id=%d, seqno=%d", From 245c0aeaebc559c55d211c78f678136dd9aefbcd Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Thu, 4 Dec 2025 10:46:58 +0100 Subject: [PATCH 0215/1321] drm/xe: fix drm_gpusvm_init() arguments MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The Xe driver fails to build when CONFIG_DRM_XE_GPUSVM is disabled but CONFIG_DRM_GPUSVM is turned on, due to the clash of two commits: In file included from drivers/gpu/drm/xe/xe_vm_madvise.c:8: drivers/gpu/drm/xe/xe_svm.h: In function 'xe_svm_init': include/linux/stddef.h:8:14: error: passing argument 5 of 'drm_gpusvm_init' makes integer from pointer without a cast [-Wint-conversion] drivers/gpu/drm/xe/xe_svm.h:217:38: note: in expansion of macro 'NULL' 217 | NULL, NULL, 0, 0, 0, NULL, NULL, 0); | ^~~~ In file included from drivers/gpu/drm/xe/xe_bo_types.h:11, from drivers/gpu/drm/xe/xe_bo.h:11, from drivers/gpu/drm/xe/xe_vm_madvise.c:11: include/drm/drm_gpusvm.h:254:35: note: expected 'long unsigned int' but argument is of type 'void *' 254 | unsigned long mm_start, unsigned long mm_range, | ~~~~~~~~~~~~~~^~~~~~~~ In file included from drivers/gpu/drm/xe/xe_vm_madvise.c:14: drivers/gpu/drm/xe/xe_svm.h:216:16: error: too many arguments to function 'drm_gpusvm_init'; expected 10, have 11 216 | return drm_gpusvm_init(&vm->svm.gpusvm, "Xe SVM (simple)", &vm->xe->drm, | ^~~~~~~~~~~~~~~ 217 | NULL, NULL, 0, 0, 0, NULL, NULL, 0); | ~ include/drm/drm_gpusvm.h:251:5: note: declared here Adapt the caller to the new argument list by removing the extraneous NULL argument. Fixes: 9e9787414882 ("drm/xe/userptr: replace xe_hmm with gpusvm") Fixes: 10aa5c806030 ("drm/gpusvm, drm/xe: Fix userptr to not allow device private pages") Signed-off-by: Arnd Bergmann Reviewed-by: Thomas Hellström Signed-off-by: Thomas Hellström Link: https://patch.msgid.link/20251204094704.1030933-1-arnd@kernel.org (cherry picked from commit 29bce9c8b41d5c378263a927acb9a9074d0e7a0e) Signed-off-by: Thomas Hellström --- drivers/gpu/drm/xe/xe_svm.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xe/xe_svm.h b/drivers/gpu/drm/xe/xe_svm.h index 0955d2ac8d744..fa757dd07954d 100644 --- a/drivers/gpu/drm/xe/xe_svm.h +++ b/drivers/gpu/drm/xe/xe_svm.h @@ -214,7 +214,7 @@ int xe_svm_init(struct xe_vm *vm) { #if IS_ENABLED(CONFIG_DRM_GPUSVM) return drm_gpusvm_init(&vm->svm.gpusvm, "Xe SVM (simple)", &vm->xe->drm, - NULL, NULL, 0, 0, 0, NULL, NULL, 0); + NULL, 0, 0, 0, NULL, NULL, 0); #else return 0; #endif From 364d02452fae4674d7a7b24a1f36d72724a3723e Mon Sep 17 00:00:00 2001 From: Raag Jadav Date: Wed, 3 Dec 2025 18:03:55 +0530 Subject: [PATCH 0216/1321] drm/xe/throttle: Skip reason prefix while emitting array MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The newly introduced "reasons" attribute already signifies possible reasons for throttling and makes the prefix in individual attribute names redundant while emitting them as an array. Skip the prefix. Fixes: 83ccde67a3f7 ("drm/xe/gt_throttle: Avoid TOCTOU when monitoring reasons") Signed-off-by: Raag Jadav Reviewed-by: Sk Anirban Link: https://patch.msgid.link/20251203123355.571606-1-raag.jadav@intel.com Signed-off-by: Rodrigo Vivi (cherry picked from commit b64a14334ef3ebbcf70d11bc67d0934bdc0e390d) Signed-off-by: Thomas Hellström --- drivers/gpu/drm/xe/xe_gt_throttle.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xe/xe_gt_throttle.c b/drivers/gpu/drm/xe/xe_gt_throttle.c index 82c5fbcdfbe3e..01477fc7b37b9 100644 --- a/drivers/gpu/drm/xe/xe_gt_throttle.c +++ b/drivers/gpu/drm/xe/xe_gt_throttle.c @@ -140,7 +140,7 @@ static ssize_t reasons_show(struct kobject *kobj, struct throttle_attribute *other_ta = kobj_attribute_to_throttle(kattr); if (other_ta->mask != U32_MAX && reasons & other_ta->mask) - ret += sysfs_emit_at(buff, ret, "%s ", (*pother)->name); + ret += sysfs_emit_at(buff, ret, "%s ", (*pother)->name + strlen("reason_")); } if (drm_WARN_ONCE(&xe->drm, !ret, "Unknown reason: %#x\n", reasons)) From e89876310f4dde4ecbc8e544db7eef143d5068f9 Mon Sep 17 00:00:00 2001 From: Tomasz Lis Date: Thu, 4 Dec 2025 21:08:20 +0100 Subject: [PATCH 0217/1321] drm/xe/vf: Stop waiting for ring space on VF post migration recovery MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit If wait for ring space started just before migration, it can delay the recovery process, by waiting without bailout path for up to 2 seconds. Two second wait for recovery is not acceptable, and if the ring was completely filled even without the migration temporarily stopping execution, then such a wait will result in up to a thousand new jobs (assuming constant flow) being added while the wait is happening. While this will not cause data corruption, it will lead to warning messages getting logged due to reset being scheduled on a GT under recovery. Also several seconds of unresponsiveness, as the backlog of jobs gets progressively executed. Add a bailout condition, to make sure the recovery starts without much delay. The recovery is expected to finish in about 100 ms when under moderate stress, so the condition verification period needs to be below that - settling at 64 ms. The theoretical max time which the recovery can take depends on how many requests can be emitted to engine rings and be pending execution. While stress testing, it was possible to reach 10k pending requests on rings when a platform with two GTs was used. This resulted in max recovery time of 5 seconds. But in real life situations, it is very unlikely that the amount of pending requests will ever exceed 100, and for that the recovery time will be around 50 ms - well within our claimed limit of 100ms. Fixes: a4dae94aad6a ("drm/xe/vf: Wakeup in GuC backend on VF post migration recovery") Signed-off-by: Tomasz Lis Reviewed-by: Matthew Brost Signed-off-by: Michal Wajdeczko Link: https://patch.msgid.link/20251204200820.2206168-1-tomasz.lis@intel.com (cherry picked from commit a00e305fba02a915cf2745bf6ef3f55537e65d57) Signed-off-by: Thomas Hellström --- drivers/gpu/drm/xe/xe_guc_submit.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c index c0819377ce6e4..311cd047911a4 100644 --- a/drivers/gpu/drm/xe/xe_guc_submit.c +++ b/drivers/gpu/drm/xe/xe_guc_submit.c @@ -722,21 +722,23 @@ static int wq_wait_for_space(struct xe_exec_queue *q, u32 wqi_size) struct xe_guc *guc = exec_queue_to_guc(q); struct xe_device *xe = guc_to_xe(guc); struct iosys_map map = xe_lrc_parallel_map(q->lrc[0]); - unsigned int sleep_period_ms = 1; + unsigned int sleep_period_ms = 1, sleep_total_ms = 0; #define AVAILABLE_SPACE \ CIRC_SPACE(q->guc->wqi_tail, q->guc->wqi_head, WQ_SIZE) if (wqi_size > AVAILABLE_SPACE && !vf_recovery(guc)) { try_again: q->guc->wqi_head = parallel_read(xe, map, wq_desc.head); - if (wqi_size > AVAILABLE_SPACE) { - if (sleep_period_ms == 1024) { + if (wqi_size > AVAILABLE_SPACE && !vf_recovery(guc)) { + if (sleep_total_ms > 2000) { xe_gt_reset_async(q->gt); return -ENODEV; } msleep(sleep_period_ms); - sleep_period_ms <<= 1; + sleep_total_ms += sleep_period_ms; + if (sleep_period_ms < 64) + sleep_period_ms <<= 1; goto try_again; } } From fc09a1e1f2440a72f4e3c1667b0e2daa191f7177 Mon Sep 17 00:00:00 2001 From: Junxiao Chang Date: Fri, 7 Nov 2025 11:31:52 +0800 Subject: [PATCH 0218/1321] drm/me/gsc: mei interrupt top half should be in irq disabled context MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit MEI GSC interrupt comes from i915 or xe driver. It has top half and bottom half. Top half is called from i915/xe interrupt handler. It should be in irq disabled context. With RT kernel(PREEMPT_RT enabled), by default IRQ handler is in threaded IRQ. MEI GSC top half might be in threaded IRQ context. generic_handle_irq_safe API could be called from either IRQ or process context, it disables local IRQ then calls MEI GSC interrupt top half. This change fixes B580 GPU boot issue with RT enabled. Fixes: e02cea83d32d ("drm/xe/gsc: add Battlemage support") Tested-by: Baoli Zhang Signed-off-by: Junxiao Chang Reviewed-by: Sebastian Andrzej Siewior Reviewed-by: Matthew Brost Link: https://patch.msgid.link/20251107033152.834960-1-junxiao.chang@intel.com Signed-off-by: Maarten Lankhorst (cherry picked from commit 3efadf028783a49ab2941294187c8b6dd86bf7da) Signed-off-by: Thomas Hellström --- drivers/gpu/drm/xe/xe_heci_gsc.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_heci_gsc.c b/drivers/gpu/drm/xe/xe_heci_gsc.c index 2b3d49dd394c0..495cdd4f948d5 100644 --- a/drivers/gpu/drm/xe/xe_heci_gsc.c +++ b/drivers/gpu/drm/xe/xe_heci_gsc.c @@ -223,7 +223,7 @@ void xe_heci_gsc_irq_handler(struct xe_device *xe, u32 iir) if (xe->heci_gsc.irq < 0) return; - ret = generic_handle_irq(xe->heci_gsc.irq); + ret = generic_handle_irq_safe(xe->heci_gsc.irq); if (ret) drm_err_ratelimited(&xe->drm, "error handling GSC irq: %d\n", ret); } @@ -243,7 +243,7 @@ void xe_heci_csc_irq_handler(struct xe_device *xe, u32 iir) if (xe->heci_gsc.irq < 0) return; - ret = generic_handle_irq(xe->heci_gsc.irq); + ret = generic_handle_irq_safe(xe->heci_gsc.irq); if (ret) drm_err_ratelimited(&xe->drm, "error handling GSC irq: %d\n", ret); } From c12b778c72b197d3a9d2c291861ab63213d7f9a2 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Thomas=20Hellstr=C3=B6m?= Date: Tue, 9 Dec 2025 21:49:20 +0100 Subject: [PATCH 0219/1321] drm/xe/bo: Don't include the CCS metadata in the dma-buf sg-table MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Some Xe bos are allocated with extra backing-store for the CCS metadata. It's never been the intention to share the CCS metadata when exporting such bos as dma-buf. Don't include it in the dma-buf sg-table. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: Rodrigo Vivi Cc: Matthew Brost Cc: Maarten Lankhorst Cc: # v6.8+ Signed-off-by: Thomas Hellström Reviewed-by: Matthew Brost Reviewed-by: Karol Wachowski Link: https://patch.msgid.link/20251209204920.224374-1-thomas.hellstrom@linux.intel.com (cherry picked from commit a4ebfb9d95d78a12512b435a698ee6886d712571) Signed-off-by: Thomas Hellström --- drivers/gpu/drm/xe/xe_dma_buf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xe/xe_dma_buf.c b/drivers/gpu/drm/xe/xe_dma_buf.c index 54e42960daadc..7c74a31d44860 100644 --- a/drivers/gpu/drm/xe/xe_dma_buf.c +++ b/drivers/gpu/drm/xe/xe_dma_buf.c @@ -124,7 +124,7 @@ static struct sg_table *xe_dma_buf_map(struct dma_buf_attachment *attach, case XE_PL_TT: sgt = drm_prime_pages_to_sg(obj->dev, bo->ttm.ttm->pages, - bo->ttm.ttm->num_pages); + obj->size >> PAGE_SHIFT); if (IS_ERR(sgt)) return sgt; From 262371ca4893c07d04e02169d9e88580137bb3f8 Mon Sep 17 00:00:00 2001 From: Satyanarayana K V P Date: Wed, 10 Dec 2025 05:25:48 +0000 Subject: [PATCH 0220/1321] drm/xe/vf: Fix queuing of recovery work MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Ensure VF migration recovery work is only queued when no recovery is already queued and teardown is not in progress. Fixes: b47c0c07c350 ("drm/xe/vf: Teardown VF post migration worker on driver unload") Signed-off-by: Satyanarayana K V P Cc: Michal Wajdeczko Cc: Matthew Brost Cc: Tomasz Lis Reviewed-by: Michal Wajdeczko Reviewed-by: Matthew Brost Signed-off-by: Michal Wajdeczko Link: https://patch.msgid.link/20251210052546.622809-5-satyanarayana.k.v.p@intel.com (cherry picked from commit 8d8cf42b03f149dcb545b547906306f3b474565e) Signed-off-by: Thomas Hellström --- drivers/gpu/drm/xe/xe_gt_sriov_vf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_vf.c b/drivers/gpu/drm/xe/xe_gt_sriov_vf.c index 4c73a077d314f..033eae2d03d33 100644 --- a/drivers/gpu/drm/xe/xe_gt_sriov_vf.c +++ b/drivers/gpu/drm/xe/xe_gt_sriov_vf.c @@ -733,7 +733,7 @@ static void vf_start_migration_recovery(struct xe_gt *gt) spin_lock(>->sriov.vf.migration.lock); - if (!gt->sriov.vf.migration.recovery_queued || + if (!gt->sriov.vf.migration.recovery_queued && !gt->sriov.vf.migration.recovery_teardown) { gt->sriov.vf.migration.recovery_queued = true; WRITE_ONCE(gt->sriov.vf.migration.recovery_inprogress, true); From 24619bcc404c3dbdbc62bdd3e87daa832f4884dd Mon Sep 17 00:00:00 2001 From: Jagmeet Randhawa Date: Fri, 12 Dec 2025 05:21:46 +0800 Subject: [PATCH 0221/1321] drm/xe: Increase TDF timeout MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit There are some corner cases where flushing transient data may take slightly longer than the 150us timeout we currently allow. Update the driver to use a 300us timeout instead based on the latest guidance from the hardware team. An update to the bspec to formally document this is expected to arrive soon. Fixes: c01c6066e6fa ("drm/xe/device: implement transient flush") Signed-off-by: Jagmeet Randhawa Reviewed-by: Jonathan Cavitt Reviewed-by: Matt Roper Link: https://patch.msgid.link/0201b1d6ec64d3651fcbff1ea21026efa915126a.1765487866.git.jagmeet.randhawa@intel.com Signed-off-by: Matt Roper (cherry picked from commit d69d3636f5f7a84bae7cd43473b3701ad9b7d544) Signed-off-by: Thomas Hellström --- drivers/gpu/drm/xe/xe_device.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c index c7d373c70f0fb..cf29e259861f9 100644 --- a/drivers/gpu/drm/xe/xe_device.c +++ b/drivers/gpu/drm/xe/xe_device.c @@ -1056,7 +1056,7 @@ static void tdf_request_sync(struct xe_device *xe) * transient and need to be flushed.. */ if (xe_mmio_wait32(>->mmio, XE2_TDF_CTRL, TRANSIENT_FLUSH_REQUEST, 0, - 150, NULL, false)) + 300, NULL, false)) xe_gt_err_once(gt, "TD flush timeout\n"); xe_force_wake_put(gt_to_fw(gt), fw_ref); From 693bdcaa75a4df865f066236a2e9258713fffe0d Mon Sep 17 00:00:00 2001 From: Jan Maslak Date: Wed, 10 Dec 2025 15:56:18 +0100 Subject: [PATCH 0222/1321] drm/xe: Restore engine registers before restarting schedulers after GT reset MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit During GT reset recovery in do_gt_restart(), xe_uc_start() was called before xe_reg_sr_apply_mmio() restored engine-specific registers. This created a race window where the scheduler could run jobs before hardware state was fully restored. This caused failures in eudebug tests (xe_exec_sip_eudebug@breakpoint- waitsip-*) where TD_CTL register (containing TD_CTL_GLOBAL_DEBUG_ENABLE) wasn't restored before jobs started executing. Breakpoints would fail to trigger SIP entry because the debug enable bit wasn't set yet. Fix by moving xe_uc_start() after all MMIO register restoration, including engine registers and CCS mode configuration, ensuring all hardware state is fully restored before any jobs can be scheduled. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Jan Maslak Reviewed-by: Jonathan Cavitt Reviewed-by: Matthew Brost Signed-off-by: Matthew Brost Link: https://patch.msgid.link/20251210145618.169625-2-jan.maslak@intel.com (cherry picked from commit 825aed0328588b2837636c1c5a0c48795d724617) Signed-off-by: Thomas Hellström --- drivers/gpu/drm/xe/xe_gt.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c index dbb5e7a9bc6a9..cdce210e36f25 100644 --- a/drivers/gpu/drm/xe/xe_gt.c +++ b/drivers/gpu/drm/xe/xe_gt.c @@ -797,9 +797,6 @@ static int do_gt_restart(struct xe_gt *gt) xe_gt_sriov_pf_init_hw(gt); xe_mocs_init(gt); - err = xe_uc_start(>->uc); - if (err) - return err; for_each_hw_engine(hwe, gt, id) xe_reg_sr_apply_mmio(&hwe->reg_sr, gt); @@ -807,6 +804,10 @@ static int do_gt_restart(struct xe_gt *gt) /* Get CCS mode in sync between sw/hw */ xe_gt_apply_ccs_mode(gt); + err = xe_uc_start(>->uc); + if (err) + return err; + /* Restore GT freq to expected values */ xe_gt_sanitize_freq(gt); From f6dd40b152fb452bcebb1d6526a95afaf542d2b1 Mon Sep 17 00:00:00 2001 From: Shuicheng Lin Date: Fri, 5 Dec 2025 23:47:17 +0000 Subject: [PATCH 0223/1321] drm/xe: Limit num_syncs to prevent oversized allocations MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The exec and vm_bind ioctl allow userspace to specify an arbitrary num_syncs value. Without bounds checking, a very large num_syncs can force an excessively large allocation, leading to kernel warnings from the page allocator as below. Introduce DRM_XE_MAX_SYNCS (set to 1024) and reject any request exceeding this limit. " ------------[ cut here ]------------ WARNING: CPU: 0 PID: 1217 at mm/page_alloc.c:5124 __alloc_frozen_pages_noprof+0x2f8/0x2180 mm/page_alloc.c:5124 ... Call Trace: alloc_pages_mpol+0xe4/0x330 mm/mempolicy.c:2416 ___kmalloc_large_node+0xd8/0x110 mm/slub.c:4317 __kmalloc_large_node_noprof+0x18/0xe0 mm/slub.c:4348 __do_kmalloc_node mm/slub.c:4364 [inline] __kmalloc_noprof+0x3d4/0x4b0 mm/slub.c:4388 kmalloc_noprof include/linux/slab.h:909 [inline] kmalloc_array_noprof include/linux/slab.h:948 [inline] xe_exec_ioctl+0xa47/0x1e70 drivers/gpu/drm/xe/xe_exec.c:158 drm_ioctl_kernel+0x1f1/0x3e0 drivers/gpu/drm/drm_ioctl.c:797 drm_ioctl+0x5e7/0xc50 drivers/gpu/drm/drm_ioctl.c:894 xe_drm_ioctl+0x10b/0x170 drivers/gpu/drm/xe/xe_device.c:224 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:598 [inline] __se_sys_ioctl fs/ioctl.c:584 [inline] __x64_sys_ioctl+0x18b/0x210 fs/ioctl.c:584 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xbb/0x380 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f ... " v2: Add "Reported-by" and Cc stable kernels. v3: Change XE_MAX_SYNCS from 64 to 1024. (Matt & Ashutosh) v4: s/XE_MAX_SYNCS/DRM_XE_MAX_SYNCS/ (Matt) v5: Do the check at the top of the exec func. (Matt) Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Reported-by: Koen Koning Reported-by: Peter Senna Tschudin Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6450 Cc: # v6.12+ Cc: Matthew Brost Cc: Michal Mrozek Cc: Carl Zhang Cc: José Roberto de Souza Cc: Lionel Landwerlin Cc: Ivan Briano Cc: Thomas Hellström Cc: Ashutosh Dixit Signed-off-by: Shuicheng Lin Reviewed-by: Matthew Brost Signed-off-by: Matthew Brost Link: https://patch.msgid.link/20251205234715.2476561-5-shuicheng.lin@intel.com (cherry picked from commit b07bac9bd708ec468cd1b8a5fe70ae2ac9b0a11c) Signed-off-by: Thomas Hellström --- drivers/gpu/drm/xe/xe_exec.c | 3 ++- drivers/gpu/drm/xe/xe_vm.c | 3 +++ include/uapi/drm/xe_drm.h | 1 + 3 files changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xe/xe_exec.c b/drivers/gpu/drm/xe/xe_exec.c index 4d81210e41f52..fd94800317506 100644 --- a/drivers/gpu/drm/xe/xe_exec.c +++ b/drivers/gpu/drm/xe/xe_exec.c @@ -132,7 +132,8 @@ int xe_exec_ioctl(struct drm_device *dev, void *data, struct drm_file *file) if (XE_IOCTL_DBG(xe, args->extensions) || XE_IOCTL_DBG(xe, args->pad[0] || args->pad[1] || args->pad[2]) || - XE_IOCTL_DBG(xe, args->reserved[0] || args->reserved[1])) + XE_IOCTL_DBG(xe, args->reserved[0] || args->reserved[1]) || + XE_IOCTL_DBG(xe, args->num_syncs > DRM_XE_MAX_SYNCS)) return -EINVAL; q = xe_exec_queue_lookup(xef, args->exec_queue_id); diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c index 7cac646bdf1c0..c93155c6c6272 100644 --- a/drivers/gpu/drm/xe/xe_vm.c +++ b/drivers/gpu/drm/xe/xe_vm.c @@ -3324,6 +3324,9 @@ static int vm_bind_ioctl_check_args(struct xe_device *xe, struct xe_vm *vm, if (XE_IOCTL_DBG(xe, args->extensions)) return -EINVAL; + if (XE_IOCTL_DBG(xe, args->num_syncs > DRM_XE_MAX_SYNCS)) + return -EINVAL; + if (args->num_binds > 1) { u64 __user *bind_user = u64_to_user_ptr(args->vector_of_binds); diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h index 47853659a705e..f64dc0eff0e67 100644 --- a/include/uapi/drm/xe_drm.h +++ b/include/uapi/drm/xe_drm.h @@ -1463,6 +1463,7 @@ struct drm_xe_exec { /** @exec_queue_id: Exec queue ID for the batch buffer */ __u32 exec_queue_id; +#define DRM_XE_MAX_SYNCS 1024 /** @num_syncs: Amount of struct drm_xe_sync in array. */ __u32 num_syncs; From 5563a4f21b3fe4ce8895462f0d411f99436c36c5 Mon Sep 17 00:00:00 2001 From: Shuicheng Lin Date: Fri, 5 Dec 2025 23:47:18 +0000 Subject: [PATCH 0224/1321] drm/xe/oa: Limit num_syncs to prevent oversized allocations MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The OA open parameters did not validate num_syncs, allowing userspace to pass arbitrarily large values, potentially leading to excessive allocations. Add check to ensure that num_syncs does not exceed DRM_XE_MAX_SYNCS, returning -EINVAL when the limit is violated. v2: use XE_IOCTL_DBG() and drop duplicated check. (Ashutosh) Fixes: c8507a25cebd ("drm/xe/oa/uapi: Define and parse OA sync properties") Cc: Matthew Brost Cc: Ashutosh Dixit Signed-off-by: Shuicheng Lin Reviewed-by: Ashutosh Dixit Signed-off-by: Matthew Brost Link: https://patch.msgid.link/20251205234715.2476561-6-shuicheng.lin@intel.com (cherry picked from commit e057b2d2b8d815df3858a87dffafa2af37e5945b) Signed-off-by: Thomas Hellström --- drivers/gpu/drm/xe/xe_oa.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/xe/xe_oa.c b/drivers/gpu/drm/xe/xe_oa.c index 890c363282ae6..1dd8ebeb41d0c 100644 --- a/drivers/gpu/drm/xe/xe_oa.c +++ b/drivers/gpu/drm/xe/xe_oa.c @@ -1254,6 +1254,9 @@ static int xe_oa_set_no_preempt(struct xe_oa *oa, u64 value, static int xe_oa_set_prop_num_syncs(struct xe_oa *oa, u64 value, struct xe_oa_open_param *param) { + if (XE_IOCTL_DBG(oa->xe, value > DRM_XE_MAX_SYNCS)) + return -EINVAL; + param->num_syncs = value; return 0; } From f538b2118bd3ae7c9075d823e20a1668c0fb9b78 Mon Sep 17 00:00:00 2001 From: Matthew Brost Date: Fri, 12 Dec 2025 10:28:41 -0800 Subject: [PATCH 0225/1321] drm/xe: Adjust long-running workload timeslices to reasonable values MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit A 10ms timeslice for long-running workloads is far too long and causes significant jitter in benchmarks when the system is shared. Adjust the value to 5ms for preempt-fencing VMs, as the resume step there is quite costly as memory is moved around, and set it to zero for pagefault VMs, since switching back to pagefault mode after dma-fence mode is relatively fast. Also change min_run_period_ms to 'unsiged int' type rather than 's64' as only positive values make sense. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: stable@vger.kernel.org Signed-off-by: Matthew Brost Reviewed-by: Thomas Hellström Link: https://patch.msgid.link/20251212182847.1683222-2-matthew.brost@intel.com (cherry picked from commit 33a5abd9a68394aa67f9618b20eee65ee8702ff4) Signed-off-by: Thomas Hellström --- drivers/gpu/drm/xe/xe_vm.c | 5 ++++- drivers/gpu/drm/xe/xe_vm_types.h | 2 +- 2 files changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c index c93155c6c6272..79ab6c512d3e0 100644 --- a/drivers/gpu/drm/xe/xe_vm.c +++ b/drivers/gpu/drm/xe/xe_vm.c @@ -1508,7 +1508,10 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags, struct xe_file *xef) INIT_WORK(&vm->destroy_work, vm_destroy_work_func); INIT_LIST_HEAD(&vm->preempt.exec_queues); - vm->preempt.min_run_period_ms = 10; /* FIXME: Wire up to uAPI */ + if (flags & XE_VM_FLAG_FAULT_MODE) + vm->preempt.min_run_period_ms = 0; + else + vm->preempt.min_run_period_ms = 5; for_each_tile(tile, xe, id) xe_range_fence_tree_init(&vm->rftree[id]); diff --git a/drivers/gpu/drm/xe/xe_vm_types.h b/drivers/gpu/drm/xe/xe_vm_types.h index ccd6cc090309f..2168ef052499e 100644 --- a/drivers/gpu/drm/xe/xe_vm_types.h +++ b/drivers/gpu/drm/xe/xe_vm_types.h @@ -263,7 +263,7 @@ struct xe_vm { * @min_run_period_ms: The minimum run period before preempting * an engine again */ - s64 min_run_period_ms; + unsigned int min_run_period_ms; /** @exec_queues: list of exec queues attached to this VM */ struct list_head exec_queues; /** @num_exec_queues: number exec queues attached to this VM */ From 122b5b7112f980ba5df352d99f61a222e3e962b1 Mon Sep 17 00:00:00 2001 From: Ashutosh Dixit Date: Fri, 5 Dec 2025 13:26:13 -0800 Subject: [PATCH 0226/1321] drm/xe/oa: Always set OAG_OAGLBCTXCTRL_COUNTER_RESUME MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Reports can be written out to the OA buffer using ways other than periodic sampling. These include mmio trigger and context switches. To support these use cases, when periodic sampling is not enabled, OAG_OAGLBCTXCTRL_COUNTER_RESUME must be set. Fixes: 1db9a9dc90ae ("drm/xe/oa: OA stream initialization (OAG)") Signed-off-by: Ashutosh Dixit Reviewed-by: Umesh Nerlige Ramappa Link: https://patch.msgid.link/20251205212613.826224-4-ashutosh.dixit@intel.com (cherry picked from commit 88d98e74adf3e20f678bb89581a5c3149fdbdeaa) Signed-off-by: Thomas Hellström --- drivers/gpu/drm/xe/xe_oa.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_oa.c b/drivers/gpu/drm/xe/xe_oa.c index 1dd8ebeb41d0c..8f3da6885e6c9 100644 --- a/drivers/gpu/drm/xe/xe_oa.c +++ b/drivers/gpu/drm/xe/xe_oa.c @@ -1105,11 +1105,12 @@ static int xe_oa_enable_metric_set(struct xe_oa_stream *stream) oag_buf_size_select(stream) | oag_configure_mmio_trigger(stream, true)); - xe_mmio_write32(mmio, __oa_regs(stream)->oa_ctx_ctrl, stream->periodic ? - (OAG_OAGLBCTXCTRL_COUNTER_RESUME | + xe_mmio_write32(mmio, __oa_regs(stream)->oa_ctx_ctrl, + OAG_OAGLBCTXCTRL_COUNTER_RESUME | + (stream->periodic ? OAG_OAGLBCTXCTRL_TIMER_ENABLE | REG_FIELD_PREP(OAG_OAGLBCTXCTRL_TIMER_PERIOD_MASK, - stream->period_exponent)) : 0); + stream->period_exponent) : 0)); /* * Initialize Super Queue Internal Cnt Register From 87523d4859f993570865f69c02c9c30177a075c8 Mon Sep 17 00:00:00 2001 From: Dan Carpenter Date: Fri, 5 Dec 2025 14:39:19 +0300 Subject: [PATCH 0227/1321] drm/xe/xe_sriov_vfio: Fix return value in xe_sriov_vfio_migration_supported() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The xe_sriov_vfio_migration_supported() function is type bool so returning -EPERM means returning true. Return false instead. Fixes: bd45d46ffc8f ("drm/xe/pf: Export helpers for VFIO") Signed-off-by: Dan Carpenter Reviewed-by: Michal Wajdeczko Link: https://patch.msgid.link/aTLEZ4g-FD-iMQ2V@stanley.mountain Signed-off-by: Michał Winiarski (cherry picked from commit 0a2404c8f6a3a120f79c57ef8a3302c8e8bc34d9) Signed-off-by: Thomas Hellström --- drivers/gpu/drm/xe/xe_sriov_vfio.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xe/xe_sriov_vfio.c b/drivers/gpu/drm/xe/xe_sriov_vfio.c index e9a7615bb5c51..3da81af97b8bb 100644 --- a/drivers/gpu/drm/xe/xe_sriov_vfio.c +++ b/drivers/gpu/drm/xe/xe_sriov_vfio.c @@ -21,7 +21,7 @@ EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_get_pf, "xe-vfio-pci"); bool xe_sriov_vfio_migration_supported(struct xe_device *xe) { if (!IS_SRIOV_PF(xe)) - return -EPERM; + return false; return xe_sriov_pf_migration_supported(xe); } From bf1588573a857e5c7a1fd284e118790faa1c2aaa Mon Sep 17 00:00:00 2001 From: Ashutosh Dixit Date: Thu, 11 Dec 2025 22:18:49 -0800 Subject: [PATCH 0228/1321] drm/xe/oa: Disallow 0 OA property values MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit An OA property value of 0 is invalid and will cause a NPD. Reported-by: Peter Senna Tschudin Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6452 Fixes: cc4e6994d5a2 ("drm/xe/oa: Move functions up so they can be reused for config ioctl") Cc: stable@vger.kernel.org Signed-off-by: Ashutosh Dixit Reviewed-by: Harish Chegondi Link: https://patch.msgid.link/20251212061850.1565459-3-ashutosh.dixit@intel.com (cherry picked from commit 7a100e6ddcc47c1f6ba7a19402de86ce24790621) Signed-off-by: Thomas Hellström --- drivers/gpu/drm/xe/xe_oa.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xe/xe_oa.c b/drivers/gpu/drm/xe/xe_oa.c index 8f3da6885e6c9..f8bb28ab81248 100644 --- a/drivers/gpu/drm/xe/xe_oa.c +++ b/drivers/gpu/drm/xe/xe_oa.c @@ -1347,7 +1347,7 @@ static int xe_oa_user_ext_set_property(struct xe_oa *oa, enum xe_oa_user_extn_fr ARRAY_SIZE(xe_oa_set_property_funcs_config)); if (XE_IOCTL_DBG(oa->xe, ext.property >= ARRAY_SIZE(xe_oa_set_property_funcs_open)) || - XE_IOCTL_DBG(oa->xe, ext.pad)) + XE_IOCTL_DBG(oa->xe, !ext.property) || XE_IOCTL_DBG(oa->xe, ext.pad)) return -EINVAL; idx = array_index_nospec(ext.property, ARRAY_SIZE(xe_oa_set_property_funcs_open)); From 12be01538420ea6d11e0cc1013b0ffa08134e903 Mon Sep 17 00:00:00 2001 From: Ashutosh Dixit Date: Thu, 11 Dec 2025 22:18:50 -0800 Subject: [PATCH 0229/1321] drm/xe/eustall: Disallow 0 EU stall property values MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit An EU stall property value of 0 is invalid and will cause a NPD. Reported-by: Peter Senna Tschudin Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6453 Fixes: 1537ec85ebd7 ("drm/xe/uapi: Introduce API for EU stall sampling") Cc: stable@vger.kernel.org Signed-off-by: Ashutosh Dixit Reviewed-by: Harish Chegondi Link: https://patch.msgid.link/20251212061850.1565459-4-ashutosh.dixit@intel.com (cherry picked from commit 5bf763e908bf795da4ad538d21c1ec41f8021f76) Signed-off-by: Thomas Hellström --- drivers/gpu/drm/xe/xe_eu_stall.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xe/xe_eu_stall.c b/drivers/gpu/drm/xe/xe_eu_stall.c index 97dfb7945b7ac..a5c36a317a707 100644 --- a/drivers/gpu/drm/xe/xe_eu_stall.c +++ b/drivers/gpu/drm/xe/xe_eu_stall.c @@ -315,7 +315,7 @@ static int xe_eu_stall_user_ext_set_property(struct xe_device *xe, u64 extension return -EFAULT; if (XE_IOCTL_DBG(xe, ext.property >= ARRAY_SIZE(xe_set_eu_stall_property_funcs)) || - XE_IOCTL_DBG(xe, ext.pad)) + XE_IOCTL_DBG(xe, !ext.property) || XE_IOCTL_DBG(xe, ext.pad)) return -EINVAL; idx = array_index_nospec(ext.property, ARRAY_SIZE(xe_set_eu_stall_property_funcs)); From f910774f19616d6bdf91c7832205d4ae0e501059 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Thomas=20Hellstr=C3=B6m?= Date: Wed, 17 Dec 2025 10:34:41 +0100 Subject: [PATCH 0230/1321] drm/xe: Drop preempt-fences when destroying imported dma-bufs. MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit When imported dma-bufs are destroyed, TTM is not fully individualizing the dma-resv, but it *is* copying the fences that need to be waited for before declaring idle. So in the case where the bo->resv != bo->_resv we can still drop the preempt-fences, but make sure we do that on bo->_resv which contains the fence-pointer copy. In the case where the copying fails, bo->_resv will typically not contain any fences pointers at all, so there will be nothing to drop. In that case, TTM would have ensured all fences that would have been copied are signaled, including any remaining preempt fences. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Fixes: fa0af721bd1f ("drm/ttm: test private resv obj on release/destroy") Cc: Matthew Brost Cc: # v6.16+ Signed-off-by: Thomas Hellström Tested-by: Matthew Brost Reviewed-by: Matthew Brost Link: https://patch.msgid.link/20251217093441.5073-1-thomas.hellstrom@linux.intel.com (cherry picked from commit 425fe550fb513b567bd6d01f397d274092a9c274) Signed-off-by: Thomas Hellström --- drivers/gpu/drm/xe/xe_bo.c | 15 ++++----------- 1 file changed, 4 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c index b0bd31d14bb97..bf4ee976b6805 100644 --- a/drivers/gpu/drm/xe/xe_bo.c +++ b/drivers/gpu/drm/xe/xe_bo.c @@ -1527,7 +1527,7 @@ static bool xe_ttm_bo_lock_in_destructor(struct ttm_buffer_object *ttm_bo) * always succeed here, as long as we hold the lru lock. */ spin_lock(&ttm_bo->bdev->lru_lock); - locked = dma_resv_trylock(ttm_bo->base.resv); + locked = dma_resv_trylock(&ttm_bo->base._resv); spin_unlock(&ttm_bo->bdev->lru_lock); xe_assert(xe, locked); @@ -1547,13 +1547,6 @@ static void xe_ttm_bo_release_notify(struct ttm_buffer_object *ttm_bo) bo = ttm_to_xe_bo(ttm_bo); xe_assert(xe_bo_device(bo), !(bo->created && kref_read(&ttm_bo->base.refcount))); - /* - * Corner case where TTM fails to allocate memory and this BOs resv - * still points the VMs resv - */ - if (ttm_bo->base.resv != &ttm_bo->base._resv) - return; - if (!xe_ttm_bo_lock_in_destructor(ttm_bo)) return; @@ -1563,14 +1556,14 @@ static void xe_ttm_bo_release_notify(struct ttm_buffer_object *ttm_bo) * TODO: Don't do this for external bos once we scrub them after * unbind. */ - dma_resv_for_each_fence(&cursor, ttm_bo->base.resv, + dma_resv_for_each_fence(&cursor, &ttm_bo->base._resv, DMA_RESV_USAGE_BOOKKEEP, fence) { if (xe_fence_is_xe_preempt(fence) && !dma_fence_is_signaled(fence)) { if (!replacement) replacement = dma_fence_get_stub(); - dma_resv_replace_fences(ttm_bo->base.resv, + dma_resv_replace_fences(&ttm_bo->base._resv, fence->context, replacement, DMA_RESV_USAGE_BOOKKEEP); @@ -1578,7 +1571,7 @@ static void xe_ttm_bo_release_notify(struct ttm_buffer_object *ttm_bo) } dma_fence_put(replacement); - dma_resv_unlock(ttm_bo->base.resv); + dma_resv_unlock(&ttm_bo->base._resv); } static void xe_ttm_bo_delete_mem_notify(struct ttm_buffer_object *ttm_bo) From 51f3c64a889a3c5658a4aca743234b48d06c6b2a Mon Sep 17 00:00:00 2001 From: Matthew Brost Date: Fri, 12 Dec 2025 10:28:42 -0800 Subject: [PATCH 0231/1321] drm/xe: Use usleep_range for accurate long-running workload timeslicing MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit msleep is not very accurate in terms of how long it actually sleeps, whereas usleep_range is precise. Replace the timeslice sleep for long-running workloads with the more accurate usleep_range to avoid jitter if the sleep period is less than 20ms. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: stable@vger.kernel.org Signed-off-by: Matthew Brost Reviewed-by: Thomas Hellström Link: https://patch.msgid.link/20251212182847.1683222-3-matthew.brost@intel.com (cherry picked from commit ca415c4d4c17ad676a2c8981e1fcc432221dce79) Signed-off-by: Thomas Hellström --- drivers/gpu/drm/xe/xe_guc_submit.c | 20 +++++++++++++++++++- 1 file changed, 19 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c index 311cd047911a4..f6ba2b0f074d2 100644 --- a/drivers/gpu/drm/xe/xe_guc_submit.c +++ b/drivers/gpu/drm/xe/xe_guc_submit.c @@ -717,6 +717,24 @@ static bool vf_recovery(struct xe_guc *guc) return xe_gt_recovery_pending(guc_to_gt(guc)); } +static inline void relaxed_ms_sleep(unsigned int delay_ms) +{ + unsigned long min_us, max_us; + + if (!delay_ms) + return; + + if (delay_ms > 20) { + msleep(delay_ms); + return; + } + + min_us = mul_u32_u32(delay_ms, 1000); + max_us = min_us + 500; + + usleep_range(min_us, max_us); +} + static int wq_wait_for_space(struct xe_exec_queue *q, u32 wqi_size) { struct xe_guc *guc = exec_queue_to_guc(q); @@ -1587,7 +1605,7 @@ static void __guc_exec_queue_process_msg_suspend(struct xe_sched_msg *msg) since_resume_ms; if (wait_ms > 0 && q->guc->resume_time) - msleep(wait_ms); + relaxed_ms_sleep(wait_ms); set_exec_queue_suspended(q); disable_scheduling(q, false); From 6ee07041554e7b429acbe861026ff0d9a41fc58b Mon Sep 17 00:00:00 2001 From: Andrew Jeffery Date: Thu, 11 Dec 2025 17:45:48 +0900 Subject: [PATCH 0232/1321] dt-bindings: mmc: sdhci-of-aspeed: Switch ref to sdhci-common.yaml Enable use of common SDHCI-related properties such as sdhci-caps-mask as found in the AST2600 EVB DTS. Cc: stable@vger.kernel.org # v6.2+ Signed-off-by: Andrew Jeffery Signed-off-by: Ulf Hansson --- Documentation/devicetree/bindings/mmc/aspeed,sdhci.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/devicetree/bindings/mmc/aspeed,sdhci.yaml b/Documentation/devicetree/bindings/mmc/aspeed,sdhci.yaml index 9fce8cd7b0b62..d24950ccea952 100644 --- a/Documentation/devicetree/bindings/mmc/aspeed,sdhci.yaml +++ b/Documentation/devicetree/bindings/mmc/aspeed,sdhci.yaml @@ -41,7 +41,7 @@ properties: patternProperties: "^sdhci@[0-9a-f]+$": type: object - $ref: mmc-controller.yaml + $ref: sdhci-common.yaml unevaluatedProperties: false properties: From 8ee05eb057938bd4716ce15e0094f9e96dcaeef2 Mon Sep 17 00:00:00 2001 From: Sai Krishna Potthuri Date: Fri, 12 Dec 2025 12:05:09 +0530 Subject: [PATCH 0233/1321] mmc: sdhci-of-arasan: Increase CD stable timeout to 2 seconds On Xilinx/AMD platforms, the CD stable bit take slightly longer than one second(about an additional 100ms) to assert after a host controller reset. Although no functional failure observed with the existing one second delay but to ensure reliable initialization, increase the CD stable timeout to 2 seconds. Fixes: e251709aaddb ("mmc: sdhci-of-arasan: Ensure CD logic stabilization before power-up") Cc: stable@vger.kernel.org Signed-off-by: Sai Krishna Potthuri Acked-by: Adrian Hunter Signed-off-by: Ulf Hansson --- drivers/mmc/host/sdhci-of-arasan.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/mmc/host/sdhci-of-arasan.c b/drivers/mmc/host/sdhci-of-arasan.c index b97d042897add..ab7f0ffe7b4f0 100644 --- a/drivers/mmc/host/sdhci-of-arasan.c +++ b/drivers/mmc/host/sdhci-of-arasan.c @@ -99,7 +99,7 @@ #define HIWORD_UPDATE(val, mask, shift) \ ((val) << (shift) | (mask) << ((shift) + 16)) -#define CD_STABLE_TIMEOUT_US 1000000 +#define CD_STABLE_TIMEOUT_US 2000000 #define CD_STABLE_MAX_SLEEP_US 10 /** From d5a1a1640635a6f3ed10b468eeba308f7cfa2eb1 Mon Sep 17 00:00:00 2001 From: Jared Kangas Date: Fri, 12 Dec 2025 07:03:17 -0800 Subject: [PATCH 0234/1321] mmc: sdhci-esdhc-imx: add alternate ARCH_S32 dependency to Kconfig MMC_SDHCI_ESDHC_IMX requires ARCH_MXC despite also being used on ARCH_S32, which results in unmet dependencies when compiling strictly for ARCH_S32. Resolve this by adding ARCH_S32 as an alternative to ARCH_MXC in the driver's dependencies. Fixes: 5c4f00627c9a ("mmc: sdhci-esdhc-imx: add NXP S32G2 support") Cc: stable@bvger.kernel.org Signed-off-by: Jared Kangas Reviewed-by: Haibo Chen Signed-off-by: Ulf Hansson --- drivers/mmc/host/Kconfig | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/mmc/host/Kconfig b/drivers/mmc/host/Kconfig index 24f07df32a1a5..6d79cc9a79e22 100644 --- a/drivers/mmc/host/Kconfig +++ b/drivers/mmc/host/Kconfig @@ -315,14 +315,14 @@ config MMC_SDHCI_ESDHC_MCF config MMC_SDHCI_ESDHC_IMX tristate "SDHCI support for the Freescale eSDHC/uSDHC i.MX controller" - depends on ARCH_MXC || COMPILE_TEST + depends on ARCH_MXC || ARCH_S32 || COMPILE_TEST depends on MMC_SDHCI_PLTFM depends on OF select MMC_SDHCI_IO_ACCESSORS select MMC_CQHCI help This selects the Freescale eSDHC/uSDHC controller support - found on i.MX25, i.MX35 i.MX5x and i.MX6x. + found on i.MX25, i.MX35, i.MX5x, i.MX6x, and S32G. If you have a controller with this interface, say Y or M here. From 927c4f167a82facaba49439c04908b4ec999a58e Mon Sep 17 00:00:00 2001 From: Denis Sergeev Date: Tue, 9 Dec 2025 09:37:06 +0300 Subject: [PATCH 0235/1321] hwmon: (dell-smm) Limit fan multiplier to avoid overflow The fan nominal speed returned by SMM is limited to 16 bits, but the driver allows the fan multiplier to be set via a module parameter. Clamp the computed fan multiplier so that fan_nominal_speed * i8k_fan_mult always fits into a signed 32-bit integer and refuse to initialize the driver if the value is too large. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 20bdeebc88269 ("hwmon: (dell-smm) Introduce helper function for data init") Signed-off-by: Denis Sergeev Link: https://lore.kernel.org/r/20251209063706.49008-1-denserg.edu@gmail.com Signed-off-by: Guenter Roeck --- drivers/hwmon/dell-smm-hwmon.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/drivers/hwmon/dell-smm-hwmon.c b/drivers/hwmon/dell-smm-hwmon.c index a34753fc29733..6040a89406743 100644 --- a/drivers/hwmon/dell-smm-hwmon.c +++ b/drivers/hwmon/dell-smm-hwmon.c @@ -76,6 +76,9 @@ #define DELL_SMM_NO_TEMP 10 #define DELL_SMM_NO_FANS 4 +/* limit fan multiplier to avoid overflow */ +#define DELL_SMM_MAX_FAN_MULT (INT_MAX / U16_MAX) + struct smm_regs { unsigned int eax; unsigned int ebx; @@ -1253,6 +1256,12 @@ static int dell_smm_init_data(struct device *dev, const struct dell_smm_ops *ops data->ops = ops; /* All options must not be 0 */ data->i8k_fan_mult = fan_mult ? : I8K_FAN_MULT; + if (data->i8k_fan_mult > DELL_SMM_MAX_FAN_MULT) { + dev_err(dev, + "fan multiplier %u is too large (max %u)\n", + data->i8k_fan_mult, DELL_SMM_MAX_FAN_MULT); + return -EINVAL; + } data->i8k_fan_max = fan_max ? : I8K_FAN_HIGH; data->i8k_pwm_mult = DIV_ROUND_UP(255, data->i8k_fan_max); From 99b5c6159eea3374eab85e2ce2e6a600e0b52e2e Mon Sep 17 00:00:00 2001 From: Junrui Luo Date: Wed, 10 Dec 2025 17:48:08 +0800 Subject: [PATCH 0236/1321] hwmon: (ibmpex) fix use-after-free in high/low store The ibmpex_high_low_store() function retrieves driver data using dev_get_drvdata() and uses it without validation. This creates a race condition where the sysfs callback can be invoked after the data structure is freed, leading to use-after-free. Fix by adding a NULL check after dev_get_drvdata(), and reordering operations in the deletion path to prevent TOCTOU. Reported-by: Yuhao Jiang Reported-by: Junrui Luo Fixes: 57c7c3a0fdea ("hwmon: IBM power meter driver") Signed-off-by: Junrui Luo Link: https://lore.kernel.org/r/MEYPR01MB7886BE2F51BFE41875B74B60AFA0A@MEYPR01MB7886.ausprd01.prod.outlook.com Signed-off-by: Guenter Roeck --- drivers/hwmon/ibmpex.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/drivers/hwmon/ibmpex.c b/drivers/hwmon/ibmpex.c index 228c5f6c6f383..129f3a9e8fe96 100644 --- a/drivers/hwmon/ibmpex.c +++ b/drivers/hwmon/ibmpex.c @@ -277,6 +277,9 @@ static ssize_t ibmpex_high_low_store(struct device *dev, { struct ibmpex_bmc_data *data = dev_get_drvdata(dev); + if (!data) + return -ENODEV; + ibmpex_reset_high_low_data(data); return count; @@ -508,6 +511,9 @@ static void ibmpex_bmc_delete(struct ibmpex_bmc_data *data) { int i, j; + hwmon_device_unregister(data->hwmon_dev); + dev_set_drvdata(data->bmc_device, NULL); + device_remove_file(data->bmc_device, &sensor_dev_attr_reset_high_low.dev_attr); device_remove_file(data->bmc_device, &dev_attr_name.attr); @@ -521,8 +527,7 @@ static void ibmpex_bmc_delete(struct ibmpex_bmc_data *data) } list_del(&data->list); - dev_set_drvdata(data->bmc_device, NULL); - hwmon_device_unregister(data->hwmon_dev); + ipmi_destroy_user(data->user); kfree(data->sensors); kfree(data); From e16395af71458ea8d8b75b3e10aacc140261d864 Mon Sep 17 00:00:00 2001 From: Alexey Simakov Date: Thu, 11 Dec 2025 19:43:43 +0300 Subject: [PATCH 0237/1321] hwmon: (tmp401) fix overflow caused by default conversion rate value The driver computes conversion intervals using the formula: interval = (1 << (7 - rate)) * 125ms where 'rate' is the sensor's conversion rate register value. According to the datasheet, the power-on reset value of this register is 0x8, which could be assigned to the register, after handling i2c general call. Using this default value causes a result greater than the bit width of left operand and an undefined behaviour in the calculation above, since shifting by values larger than the bit width is undefined behaviour as per C language standard. Limit the maximum usable 'rate' value to 7 to prevent undefined behaviour in calculations. Found by Linux Verification Center (linuxtesting.org) with Svace. Note (groeck): This does not matter in practice unless someone overwrites the chip configuration from outside the driver while the driver is loaded. The conversion time register is initialized with a value of 5 (500ms) when the driver is loaded, and the driver never writes a bad value. Fixes: ca53e7640de7 ("hwmon: (tmp401) Convert to _info API") Signed-off-by: Alexey Simakov Link: https://lore.kernel.org/r/20251211164342.6291-1-bigalex934@gmail.com Signed-off-by: Guenter Roeck --- drivers/hwmon/tmp401.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/hwmon/tmp401.c b/drivers/hwmon/tmp401.c index fbaa34973694f..07f596581c6eb 100644 --- a/drivers/hwmon/tmp401.c +++ b/drivers/hwmon/tmp401.c @@ -397,7 +397,7 @@ static int tmp401_chip_read(struct device *dev, u32 attr, int channel, long *val ret = regmap_read(data->regmap, TMP401_CONVERSION_RATE, ®val); if (ret < 0) return ret; - *val = (1 << (7 - regval)) * 125; + *val = (1 << (7 - min(regval, 7))) * 125; break; case hwmon_chip_temp_reset_history: *val = 0; From e144c37ac07caf307556d90f85004b830a324efd Mon Sep 17 00:00:00 2001 From: Okan Akyuz Date: Mon, 15 Dec 2025 20:44:22 +0000 Subject: [PATCH 0238/1321] hwmon: (DS620) Update broken Datasheet URL in driver documentation The URL for the DS620 datasheet has changed. Update it to reflect the current location. Signed-off-by: Okan Akyuz Link: https://lore.kernel.org/r/20251215204423.80242-1-okan.akyuz.linux@gmail.com Signed-off-by: Guenter Roeck --- Documentation/hwmon/ds620.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/hwmon/ds620.rst b/Documentation/hwmon/ds620.rst index 2d686b17b547a..e2d915a988a24 100644 --- a/Documentation/hwmon/ds620.rst +++ b/Documentation/hwmon/ds620.rst @@ -7,9 +7,9 @@ Supported chips: Prefix: 'ds620' - Datasheet: Publicly available at the Dallas Semiconductor website + Datasheet: Publicly available at the Analog Devices website - http://www.dalsemi.com/ + https://www.analog.com/media/en/technical-documentation/data-sheets/DS620.pdf Authors: Roland Stigge From 1270327270a3b006480129574006c320c1c5ea44 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Nuno=20S=C3=A1?= Date: Fri, 19 Dec 2025 16:11:05 +0000 Subject: [PATCH 0239/1321] hwmon: (ltc4282): Fix reset_history file permissions MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The reset_history attributes are write only. Hence don't report them as readable just to return -EOPNOTSUPP later on. Fixes: cbc29538dbf7 ("hwmon: Add driver for LTC4282") Signed-off-by: Nuno Sá Link: https://lore.kernel.org/r/20251219-ltc4282-fix-reset-history-v1-1-8eab974c124b@analog.com Signed-off-by: Guenter Roeck --- drivers/hwmon/ltc4282.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/drivers/hwmon/ltc4282.c b/drivers/hwmon/ltc4282.c index b9cad89f2cd9a..db6534e679911 100644 --- a/drivers/hwmon/ltc4282.c +++ b/drivers/hwmon/ltc4282.c @@ -1000,8 +1000,9 @@ static umode_t ltc4282_in_is_visible(const struct ltc4282_state *st, u32 attr) case hwmon_in_max: case hwmon_in_min: case hwmon_in_enable: - case hwmon_in_reset_history: return 0644; + case hwmon_in_reset_history: + return 0200; default: return 0; } @@ -1020,8 +1021,9 @@ static umode_t ltc4282_curr_is_visible(u32 attr) return 0444; case hwmon_curr_max: case hwmon_curr_min: - case hwmon_curr_reset_history: return 0644; + case hwmon_curr_reset_history: + return 0200; default: return 0; } @@ -1039,8 +1041,9 @@ static umode_t ltc4282_power_is_visible(u32 attr) return 0444; case hwmon_power_max: case hwmon_power_min: - case hwmon_power_reset_history: return 0644; + case hwmon_power_reset_history: + return 0200; default: return 0; } From 8af44ec28024b301f34c3afbd0d85fda5d6159a6 Mon Sep 17 00:00:00 2001 From: Chaitanya Kulkarni Date: Mon, 24 Nov 2025 15:48:06 -0800 Subject: [PATCH 0240/1321] xfs: ignore discard return value __blkdev_issue_discard() always returns 0, making all error checking in XFS discard functions dead code. Change xfs_discard_extents() return type to void, remove error variable, error checking, and error logging for the __blkdev_issue_discard() call in same function. Update xfs_trim_perag_extents() and xfs_trim_rtgroup_extents() to ignore the xfs_discard_extents() return value and error checking code. Update xfs_discard_rtdev_extents() to ignore __blkdev_issue_discard() return value and error checking code. Reviewed-by: Johannes Thumshirn Reviewed-by: Christoph Hellwig Signed-off-by: Chaitanya Kulkarni Signed-off-by: Carlos Maiolino --- fs/xfs/xfs_discard.c | 27 +++++---------------------- fs/xfs/xfs_discard.h | 2 +- 2 files changed, 6 insertions(+), 23 deletions(-) diff --git a/fs/xfs/xfs_discard.c b/fs/xfs/xfs_discard.c index 6917de8321915..b6ffe4807a111 100644 --- a/fs/xfs/xfs_discard.c +++ b/fs/xfs/xfs_discard.c @@ -108,7 +108,7 @@ xfs_discard_endio( * list. We plug and chain the bios so that we only need a single completion * call to clear all the busy extents once the discards are complete. */ -int +void xfs_discard_extents( struct xfs_mount *mp, struct xfs_busy_extents *extents) @@ -116,7 +116,6 @@ xfs_discard_extents( struct xfs_extent_busy *busyp; struct bio *bio = NULL; struct blk_plug plug; - int error = 0; blk_start_plug(&plug); list_for_each_entry(busyp, &extents->extent_list, list) { @@ -126,18 +125,10 @@ xfs_discard_extents( trace_xfs_discard_extent(xg, busyp->bno, busyp->length); - error = __blkdev_issue_discard(btp->bt_bdev, + __blkdev_issue_discard(btp->bt_bdev, xfs_gbno_to_daddr(xg, busyp->bno), XFS_FSB_TO_BB(mp, busyp->length), GFP_KERNEL, &bio); - if (error && error != -EOPNOTSUPP) { - xfs_info(mp, - "discard failed for extent [0x%llx,%u], error %d", - (unsigned long long)busyp->bno, - busyp->length, - error); - break; - } } if (bio) { @@ -148,8 +139,6 @@ xfs_discard_extents( xfs_discard_endio_work(&extents->endio_work); } blk_finish_plug(&plug); - - return error; } /* @@ -385,9 +374,7 @@ xfs_trim_perag_extents( * list after this function call, as it may have been freed by * the time control returns to us. */ - error = xfs_discard_extents(pag_mount(pag), extents); - if (error) - break; + xfs_discard_extents(pag_mount(pag), extents); if (xfs_trim_should_stop()) break; @@ -496,12 +483,10 @@ xfs_discard_rtdev_extents( trace_xfs_discard_rtextent(mp, busyp->bno, busyp->length); - error = __blkdev_issue_discard(bdev, + __blkdev_issue_discard(bdev, xfs_rtb_to_daddr(mp, busyp->bno), XFS_FSB_TO_BB(mp, busyp->length), GFP_NOFS, &bio); - if (error) - break; } xfs_discard_free_rtdev_extents(tr); @@ -741,9 +726,7 @@ xfs_trim_rtgroup_extents( * list after this function call, as it may have been freed by * the time control returns to us. */ - error = xfs_discard_extents(rtg_mount(rtg), tr.extents); - if (error) - break; + xfs_discard_extents(rtg_mount(rtg), tr.extents); low = tr.restart_rtx; } while (!xfs_trim_should_stop() && low <= high); diff --git a/fs/xfs/xfs_discard.h b/fs/xfs/xfs_discard.h index 2b1a85223a56c..8c5cc4af6a078 100644 --- a/fs/xfs/xfs_discard.h +++ b/fs/xfs/xfs_discard.h @@ -6,7 +6,7 @@ struct fstrim_range; struct xfs_mount; struct xfs_busy_extents; -int xfs_discard_extents(struct xfs_mount *mp, struct xfs_busy_extents *busy); +void xfs_discard_extents(struct xfs_mount *mp, struct xfs_busy_extents *busy); int xfs_ioc_trim(struct xfs_mount *mp, struct fstrim_range __user *fstrim); #endif /* XFS_DISCARD_H */ From f1fca8378d802a3002216e1d168653b12a6cbe20 Mon Sep 17 00:00:00 2001 From: "Darrick J. Wong" Date: Thu, 4 Dec 2025 13:43:50 -0800 Subject: [PATCH 0241/1321] xfs: fix a UAF problem in xattr repair The xchk_setup_xattr_buf function can allocate a new value buffer, which means that any reference to ab->value before the call could become a dangling pointer. Fix this by moving an assignment to after the buffer setup. Cc: stable@vger.kernel.org # v6.10 Fixes: e47dcf113ae348 ("xfs: repair extended attributes") Signed-off-by: Darrick J. Wong Reviewed-by: Christoph Hellwig Signed-off-by: Carlos Maiolino --- fs/xfs/scrub/attr_repair.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/xfs/scrub/attr_repair.c b/fs/xfs/scrub/attr_repair.c index c7eb94069cafc..09d63aa10314b 100644 --- a/fs/xfs/scrub/attr_repair.c +++ b/fs/xfs/scrub/attr_repair.c @@ -333,7 +333,6 @@ xrep_xattr_salvage_remote_attr( .attr_filter = ent->flags & XFS_ATTR_NSP_ONDISK_MASK, .namelen = rentry->namelen, .name = rentry->name, - .value = ab->value, .valuelen = be32_to_cpu(rentry->valuelen), }; unsigned int namesize; @@ -363,6 +362,7 @@ xrep_xattr_salvage_remote_attr( error = -EDEADLOCK; if (error) return error; + args.value = ab->value; /* Look up the remote value and stash it for reconstruction. */ error = xfs_attr3_leaf_getvalue(leaf_bp, &args); From ce1fe2dd076e9f7386b6eff7e2cafc16136e7e8c Mon Sep 17 00:00:00 2001 From: "Darrick J. Wong" Date: Thu, 4 Dec 2025 13:44:15 -0800 Subject: [PATCH 0242/1321] xfs: fix stupid compiler warning MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit gcc 14.2 warns about: xfs_attr_item.c: In function ‘xfs_attr_recover_work’: xfs_attr_item.c:785:9: warning: ‘ip’ may be used uninitialized [-Wmaybe-uninitialized] 785 | xfs_trans_ijoin(tp, ip, 0); | ^~~~~~~~~~~~~~~~~~~~~~~~~~ xfs_attr_item.c:740:42: note: ‘ip’ was declared here 740 | struct xfs_inode *ip; | ^~ I think this is bogus since xfs_attri_recover_work either returns a real pointer having initialized ip or an ERR_PTR having not touched it, but the tools are smarter than me so let's just null-init the variable anyway. Cc: stable@vger.kernel.org # v6.8 Fixes: e70fb328d52772 ("xfs: recreate work items when recovering intent items") Signed-off-by: Darrick J. Wong Reviewed-by: Carlos Maiolino Reviewed-by: Christoph Hellwig Signed-off-by: Carlos Maiolino --- fs/xfs/xfs_attr_item.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c index c3a593319bee7..e8fa326ac995b 100644 --- a/fs/xfs/xfs_attr_item.c +++ b/fs/xfs/xfs_attr_item.c @@ -737,7 +737,7 @@ xfs_attr_recover_work( struct xfs_attri_log_item *attrip = ATTRI_ITEM(lip); struct xfs_attr_intent *attr; struct xfs_mount *mp = lip->li_log->l_mp; - struct xfs_inode *ip; + struct xfs_inode *ip = NULL; struct xfs_da_args *args; struct xfs_trans *tp; struct xfs_trans_res resv; From cacdc180b0718831dd5b67258ea1e084c46712bb Mon Sep 17 00:00:00 2001 From: Haoxiang Li Date: Wed, 10 Dec 2025 17:06:01 +0800 Subject: [PATCH 0243/1321] xfs: fix a memory leak in xfs_buf_item_init() xfs_buf_item_get_format() may allocate memory for bip->bli_formats, free the memory in the error path. Fixes: c3d5f0c2fb85 ("xfs: complain if anyone tries to create a too-large buffer log item") Cc: stable@vger.kernel.org Signed-off-by: Haoxiang Li Reviewed-by: Christoph Hellwig Reviewed-by: Carlos Maiolino Signed-off-by: Carlos Maiolino --- fs/xfs/xfs_buf_item.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c index 8d85b5eee4444..f4c5be67826e2 100644 --- a/fs/xfs/xfs_buf_item.c +++ b/fs/xfs/xfs_buf_item.c @@ -896,6 +896,7 @@ xfs_buf_item_init( map_size = DIV_ROUND_UP(chunks, NBWORD); if (map_size > XFS_BLF_DATAMAP_SIZE) { + xfs_buf_item_free_format(bip); kmem_cache_free(xfs_buf_item_cache, bip); xfs_err(mp, "buffer item dirty bitmap (%u uints) too small to reflect %u bytes!", From 53d190940262fa5a293c46bf1d2e50567c851646 Mon Sep 17 00:00:00 2001 From: Christoph Hellwig Date: Mon, 15 Dec 2025 07:05:46 +0100 Subject: [PATCH 0244/1321] xfs: fix XFS_ERRTAG_FORCE_ZERO_RANGE for zoned file system The new XFS_ERRTAG_FORCE_ZERO_RANGE error tag added by commit ea9989668081 ("xfs: error tag to force zeroing on debug kernels") fails to account for the zoned space reservation rules and this reliably fails xfs/131 because the zeroing operation returns -EIO. Fix this by reserving enough space to zero the entire range, which requires a bit of (fairly ugly) reshuffling to do the error injection early enough to affect the space reservation. Fixes: ea9989668081 ("xfs: error tag to force zeroing on debug kernels") Signed-off-by: Christoph Hellwig Reviewed-by: Brian Foster Reviewed-by: Carlos Maiolino Signed-off-by: Carlos Maiolino --- fs/xfs/xfs_file.c | 58 +++++++++++++++++++++++++++++++++++++++-------- 1 file changed, 48 insertions(+), 10 deletions(-) diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index 6108612182e2f..7874cf745af37 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -1240,6 +1240,38 @@ xfs_falloc_insert_range( return xfs_insert_file_space(XFS_I(inode), offset, len); } +/* + * For various operations we need to zero up to one block at each end of + * the affected range. For zoned file systems this will require a space + * allocation, for which we need a reservation ahead of time. + */ +#define XFS_ZONED_ZERO_EDGE_SPACE_RES 2 + +/* + * Zero range implements a full zeroing mechanism but is only used in limited + * situations. It is more efficient to allocate unwritten extents than to + * perform zeroing here, so use an errortag to randomly force zeroing on DEBUG + * kernels for added test coverage. + * + * On zoned file systems, the error is already injected by + * xfs_file_zoned_fallocate, which then reserves the additional space needed. + * We only check for this extra space reservation here. + */ +static inline bool +xfs_falloc_force_zero( + struct xfs_inode *ip, + struct xfs_zone_alloc_ctx *ac) +{ + if (xfs_is_zoned_inode(ip)) { + if (ac->reserved_blocks > XFS_ZONED_ZERO_EDGE_SPACE_RES) { + ASSERT(IS_ENABLED(CONFIG_XFS_DEBUG)); + return true; + } + return false; + } + return XFS_TEST_ERROR(ip->i_mount, XFS_ERRTAG_FORCE_ZERO_RANGE); +} + /* * Punch a hole and prealloc the range. We use a hole punch rather than * unwritten extent conversion for two reasons: @@ -1268,14 +1300,7 @@ xfs_falloc_zero_range( if (error) return error; - /* - * Zero range implements a full zeroing mechanism but is only used in - * limited situations. It is more efficient to allocate unwritten - * extents than to perform zeroing here, so use an errortag to randomly - * force zeroing on DEBUG kernels for added test coverage. - */ - if (XFS_TEST_ERROR(ip->i_mount, - XFS_ERRTAG_FORCE_ZERO_RANGE)) { + if (xfs_falloc_force_zero(ip, ac)) { error = xfs_zero_range(ip, offset, len, ac, NULL); } else { error = xfs_free_file_space(ip, offset, len, ac); @@ -1423,13 +1448,26 @@ xfs_file_zoned_fallocate( { struct xfs_zone_alloc_ctx ac = { }; struct xfs_inode *ip = XFS_I(file_inode(file)); + struct xfs_mount *mp = ip->i_mount; + xfs_filblks_t count_fsb; int error; - error = xfs_zoned_space_reserve(ip->i_mount, 2, XFS_ZR_RESERVED, &ac); + /* + * If full zeroing is forced by the error injection knob, we need a + * space reservation that covers the entire range. See the comment in + * xfs_zoned_write_space_reserve for the rationale for the calculation. + * Otherwise just reserve space for the two boundary blocks. + */ + count_fsb = XFS_ZONED_ZERO_EDGE_SPACE_RES; + if ((mode & FALLOC_FL_MODE_MASK) == FALLOC_FL_ZERO_RANGE && + XFS_TEST_ERROR(mp, XFS_ERRTAG_FORCE_ZERO_RANGE)) + count_fsb += XFS_B_TO_FSB(mp, len) + 1; + + error = xfs_zoned_space_reserve(mp, count_fsb, XFS_ZR_RESERVED, &ac); if (error) return error; error = __xfs_file_fallocate(file, mode, offset, len, &ac); - xfs_zoned_space_unreserve(ip->i_mount, &ac); + xfs_zoned_space_unreserve(mp, &ac); return error; } From c7d608ffc9851fba9f92f8ae312f85b14eba8b5e Mon Sep 17 00:00:00 2001 From: Christoph Hellwig Date: Tue, 16 Dec 2025 18:30:08 +0100 Subject: [PATCH 0245/1321] xfs: validate that zoned RT devices are zone aligned Garbage collection assumes all zones contain the full amount of blocks. Mkfs already ensures this happens, but make the kernel check it as well to avoid getting into trouble due to fuzzers or mkfs bugs. Fixes: 2167eaabe2fa ("xfs: define the zoned on-disk format") Signed-off-by: Christoph Hellwig Reviewed-by: Darrick J. Wong Cc: stable@vger.kernel.org # v6.15 Signed-off-by: Carlos Maiolino --- fs/xfs/libxfs/xfs_sb.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c index cdd16dd805d77..94c272a2ae262 100644 --- a/fs/xfs/libxfs/xfs_sb.c +++ b/fs/xfs/libxfs/xfs_sb.c @@ -301,6 +301,21 @@ xfs_validate_rt_geometry( sbp->sb_rbmblocks != xfs_expected_rbmblocks(sbp)) return false; + if (xfs_sb_is_v5(sbp) && + (sbp->sb_features_incompat & XFS_SB_FEAT_INCOMPAT_ZONED)) { + uint32_t mod; + + /* + * Zoned RT devices must be aligned to the RT group size, + * because garbage collection assumes that all zones have the + * same size to avoid insane complexity if that weren't the + * case. + */ + div_u64_rem(sbp->sb_rextents, sbp->sb_rgextents, &mod); + if (mod) + return false; + } + return true; } From 129aecd6817e9d7fc69f20dd4349d0b843f9327f Mon Sep 17 00:00:00 2001 From: Christoph Hellwig Date: Tue, 16 Dec 2025 18:30:09 +0100 Subject: [PATCH 0246/1321] xfs: fix the zoned RT growfs check for zone alignment The grofs code for zoned RT subvolums already tries to check for zone alignment, but gets it wrong by using the old instead of the new mount structure. Fixes: 01b71e64bb87 ("xfs: support growfs on zoned file systems") Signed-off-by: Christoph Hellwig Reviewed-by: Darrick J. Wong Cc: stable@vger.kernel.org # v6.15 Signed-off-by: Carlos Maiolino --- fs/xfs/xfs_rtalloc.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c index 6907e871fa151..e063f4f2f2e61 100644 --- a/fs/xfs/xfs_rtalloc.c +++ b/fs/xfs/xfs_rtalloc.c @@ -1255,12 +1255,10 @@ xfs_growfs_check_rtgeom( min_logfsbs = min_t(xfs_extlen_t, xfs_log_calc_minimum_size(nmp), nmp->m_rsumblocks * 2); - kfree(nmp); - trace_xfs_growfs_check_rtgeom(mp, min_logfsbs); if (min_logfsbs > mp->m_sb.sb_logblocks) - return -EINVAL; + goto out_inval; if (xfs_has_zoned(mp)) { uint32_t gblocks = mp->m_groups[XG_TYPE_RTG].blocks; @@ -1268,16 +1266,20 @@ xfs_growfs_check_rtgeom( if (rextsize != 1) return -EINVAL; - div_u64_rem(mp->m_sb.sb_rblocks, gblocks, &rem); + div_u64_rem(nmp->m_sb.sb_rblocks, gblocks, &rem); if (rem) { xfs_warn(mp, "new RT volume size (%lld) not aligned to RT group size (%d)", - mp->m_sb.sb_rblocks, gblocks); - return -EINVAL; + nmp->m_sb.sb_rblocks, gblocks); + goto out_inval; } } + kfree(nmp); return 0; +out_inval: + kfree(nmp); + return -EINVAL; } /* From 58ca269fc72407c35b4ebe0d64d51c6b0954a025 Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Fri, 19 Dec 2025 11:20:06 +0000 Subject: [PATCH 0247/1321] clang: work around asm output constraint problems Work around clang problems with "=rm" asm constraint. clang seems to always chose the memory output, while it is almost always the worst choice. Add ASM_OUTPUT_RM so that we can replace "=rm" constraint where it matters for clang, while not penalizing gcc. Signed-off-by: Eric Dumazet Suggested-by: Uros Bizjak Signed-off-by: Linus Torvalds --- include/linux/compiler-clang.h | 1 + include/linux/compiler_types.h | 3 ++- 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h index 107ce05bd16eb..7edf1a07b5350 100644 --- a/include/linux/compiler-clang.h +++ b/include/linux/compiler-clang.h @@ -145,6 +145,7 @@ */ #define ASM_INPUT_G "ir" #define ASM_INPUT_RM "r" +#define ASM_OUTPUT_RM "=r" /* * Declare compiler support for __typeof_unqual__() operator. diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h index 1280693766b9d..d3318a3c25777 100644 --- a/include/linux/compiler_types.h +++ b/include/linux/compiler_types.h @@ -548,11 +548,12 @@ struct ftrace_likely_data { /* * Clang has trouble with constraints with multiple - * alternative behaviors (mainly "g" and "rm"). + * alternative behaviors ("g" , "rm" and "=rm"). */ #ifndef ASM_INPUT_G #define ASM_INPUT_G "g" #define ASM_INPUT_RM "rm" + #define ASM_OUTPUT_RM "=rm" #endif #ifdef CONFIG_CC_HAS_ASM_INLINE From fc247fbd8fd66f8bcfe0fa66ce2c183dab29fc57 Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Fri, 19 Dec 2025 11:20:07 +0000 Subject: [PATCH 0248/1321] x86/irqflags: Use ASM_OUTPUT_RM in native_save_fl() clang is generating very inefficient code for native_save_fl() which is used for local_irq_save() in critical spots. Allowing the "pop %0" to use memory: 1) forces the compiler to add annoying stack canaries when CONFIG_STACKPROTECTOR_STRONG=y in many places. 2) Almost always is followed by an immediate "move memory,register" One good example is _raw_spin_lock_irqsave, with 8 extra instructions ffffffff82067a30 <_raw_spin_lock_irqsave>: ffffffff82067a30: ... ffffffff82067a39: 53 push %rbx // Three instructions to ajust the stack, read the per-cpu canary // and copy it to 8(%rsp) ffffffff82067a3a: 48 83 ec 10 sub $0x10,%rsp ffffffff82067a3e: 65 48 8b 05 da 15 45 02 mov %gs:0x24515da(%rip),%rax # <__stack_chk_guard> ffffffff82067a46: 48 89 44 24 08 mov %rax,0x8(%rsp) ffffffff82067a4b: 9c pushf // instead of pop %rbx, compiler uses 2 instructions. ffffffff82067a4c: 8f 04 24 pop (%rsp) ffffffff82067a4f: 48 8b 1c 24 mov (%rsp),%rbx ffffffff82067a53: fa cli ffffffff82067a54: b9 01 00 00 00 mov $0x1,%ecx ffffffff82067a59: 31 c0 xor %eax,%eax ffffffff82067a5b: f0 0f b1 0f lock cmpxchg %ecx,(%rdi) ffffffff82067a5f: 75 1d jne ffffffff82067a7e <_raw_spin_lock_irqsave+0x4e> // three instructions to check the stack canary ffffffff82067a61: 65 48 8b 05 b7 15 45 02 mov %gs:0x24515b7(%rip),%rax # <__stack_chk_guard> ffffffff82067a69: 48 3b 44 24 08 cmp 0x8(%rsp),%rax ffffffff82067a6e: 75 17 jne ffffffff82067a87 ... // One extra instruction to adjust the stack. ffffffff82067a73: 48 83 c4 10 add $0x10,%rsp ... // One more instruction in case the stack was mangled. ffffffff82067a87: e8 a4 35 ff ff call ffffffff8205b030 <__stack_chk_fail> This patch changes nothing for gcc, but for clang saves ~20000 bytes of text even though more functions are inlined. $ size vmlinux.gcc.before vmlinux.gcc.after vmlinux.clang.before vmlinux.clang.after text data bss dec hex filename 45565821 25005462 4704800 75276083 47c9f33 vmlinux.gcc.before 45565821 25005462 4704800 75276083 47c9f33 vmlinux.gcc.after 45121072 24638617 5533040 75292729 47ce039 vmlinux.clang.before 45093887 24638633 5536808 75269328 47c84d0 vmlinux.clang.after $ scripts/bloat-o-meter -t vmlinux.clang.before vmlinux.clang.after add/remove: 1/2 grow/shrink: 21/533 up/down: 2250/-22112 (-19862) Signed-off-by: Eric Dumazet Cc: Uros Bizjak Signed-off-by: Linus Torvalds --- arch/x86/include/asm/irqflags.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/include/asm/irqflags.h b/arch/x86/include/asm/irqflags.h index b30e5474c18e1..a1193e9d65f20 100644 --- a/arch/x86/include/asm/irqflags.h +++ b/arch/x86/include/asm/irqflags.h @@ -25,7 +25,7 @@ extern __always_inline unsigned long native_save_fl(void) */ asm volatile("# __raw_save_flags\n\t" "pushf ; pop %0" - : "=rm" (flags) + : ASM_OUTPUT_RM (flags) : /* no input */ : "memory"); From 2d9f85e9e579205c7dafab44e210a25c714dcffa Mon Sep 17 00:00:00 2001 From: Christophe JAILLET Date: Sat, 13 Dec 2025 08:48:51 +0100 Subject: [PATCH 0249/1321] spi: mpfs: Fix an error handling path in mpfs_spi_probe() mpfs_spi_init() calls mpfs_spi_enable_ints(), so mpfs_spi_disable_ints() should be called if an error occurs after calling mpfs_spi_init(), as already done in the remove function. Fixes: 9ac8d17694b6 ("spi: add support for microchip fpga spi controllers") Signed-off-by: Christophe JAILLET Link: https://patch.msgid.link/eb35f168517cc402ef7e78f26da02863e2f45c03.1765612110.git.christophe.jaillet@wanadoo.fr Signed-off-by: Mark Brown --- drivers/spi/spi-mpfs.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/spi/spi-mpfs.c b/drivers/spi/spi-mpfs.c index 9a14d1732a159..7e9e64d8e6c81 100644 --- a/drivers/spi/spi-mpfs.c +++ b/drivers/spi/spi-mpfs.c @@ -577,6 +577,7 @@ static int mpfs_spi_probe(struct platform_device *pdev) ret = devm_spi_register_controller(&pdev->dev, host); if (ret) { + mpfs_spi_disable_ints(spi); mpfs_spi_disable(spi); return dev_err_probe(&pdev->dev, ret, "unable to register host for SPI controller\n"); From ef57087b6309d003d98d62dea56189b938b6ef96 Mon Sep 17 00:00:00 2001 From: Christophe Leroy Date: Thu, 20 Nov 2025 09:34:49 +0100 Subject: [PATCH 0250/1321] spi: fsl-cpm: Check length parity before switching to 16 bit mode Commit fc96ec826bce ("spi: fsl-cpm: Use 16 bit mode for large transfers with even size") failed to make sure that the size is really even before switching to 16 bit mode. Until recently the problem went unnoticed because kernfs uses a pre-allocated bounce buffer of size PAGE_SIZE for reading EEPROM. But commit 8ad6249c51d0 ("eeprom: at25: convert to spi-mem API") introduced an additional dynamically allocated bounce buffer whose size is exactly the size of the transfer, leading to a buffer overrun in the fsl-cpm driver when that size is odd. Add the missing length parity verification and remain in 8 bit mode when the length is not even. Fixes: fc96ec826bce ("spi: fsl-cpm: Use 16 bit mode for large transfers with even size") Cc: stable@vger.kernel.org Closes: https://lore.kernel.org/all/638496dd-ec60-4e53-bad7-eb657f67d580@csgroup.eu/ Signed-off-by: Christophe Leroy Reviewed-by: Sverdlin Alexander Link: https://patch.msgid.link/3c4d81c3923c93f95ec56702a454744a4bad3cfc.1763627618.git.christophe.leroy@csgroup.eu Signed-off-by: Mark Brown --- drivers/spi/spi-fsl-spi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/spi/spi-fsl-spi.c b/drivers/spi/spi-fsl-spi.c index 2f2082652a1a2..481a7b28aacd3 100644 --- a/drivers/spi/spi-fsl-spi.c +++ b/drivers/spi/spi-fsl-spi.c @@ -335,7 +335,7 @@ static int fsl_spi_prepare_message(struct spi_controller *ctlr, if (t->bits_per_word == 16 || t->bits_per_word == 32) t->bits_per_word = 8; /* pretend its 8 bits */ if (t->bits_per_word == 8 && t->len >= 256 && - (mpc8xxx_spi->flags & SPI_CPM1)) + !(t->len & 1) && (mpc8xxx_spi->flags & SPI_CPM1)) t->bits_per_word = 16; } } From 34254181487aef529f04112a1365ba3f5c648750 Mon Sep 17 00:00:00 2001 From: "Rob Herring (Arm)" Date: Mon, 15 Dec 2025 17:03:22 -0600 Subject: [PATCH 0251/1321] spi: dt-bindings: snps,dw-abp-ssi: Allow up to 16 chip-selects At least the Microchip Sparx5 supports up to 16 chip-selects, so increase the maximum. The pattern for the child unit-address was unconstrained, so update it to match the maximum number of chip-selects. Signed-off-by: Rob Herring (Arm) Link: https://patch.msgid.link/20251215230323.3634112-1-robh@kernel.org Signed-off-by: Mark Brown --- Documentation/devicetree/bindings/spi/snps,dw-apb-ssi.yaml | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/Documentation/devicetree/bindings/spi/snps,dw-apb-ssi.yaml b/Documentation/devicetree/bindings/spi/snps,dw-apb-ssi.yaml index 5c87fc8a845df..81838577cf9cd 100644 --- a/Documentation/devicetree/bindings/spi/snps,dw-apb-ssi.yaml +++ b/Documentation/devicetree/bindings/spi/snps,dw-apb-ssi.yaml @@ -121,7 +121,7 @@ properties: num-cs: default: 4 minimum: 1 - maximum: 4 + maximum: 16 dmas: items: @@ -153,14 +153,14 @@ properties: provides an interface to override the native DWC SSI CS control. patternProperties: - "@[0-9a-f]+$": + "@[0-9a-f]$": type: object additionalProperties: true properties: reg: minimum: 0 - maximum: 3 + maximum: 0xf unevaluatedProperties: false From ae56a0cf0bcda2b566004a259d66f39182f80394 Mon Sep 17 00:00:00 2001 From: Fei Shao Date: Wed, 17 Dec 2025 18:10:47 +0800 Subject: [PATCH 0252/1321] spi: mt65xx: Use IRQF_ONESHOT with threaded IRQ This driver is migrated to use threaded IRQ since commit 5972eb05ca32 ("spi: spi-mt65xx: Use threaded interrupt for non-SPIMEM transfer"), and we almost always want to disable the interrupt line to avoid excess interrupts while the threaded handler is processing SPI transfer. Use IRQF_ONESHOT for that purpose. In practice, we see MediaTek devices show SPI transfer timeout errors when communicating with ChromeOS EC in certain scenarios, and with IRQF_ONESHOT, the issue goes away. Signed-off-by: Fei Shao Link: https://patch.msgid.link/20251217101131.1975131-1-fshao@chromium.org Signed-off-by: Mark Brown --- drivers/spi/spi-mt65xx.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/spi/spi-mt65xx.c b/drivers/spi/spi-mt65xx.c index 4b40985af1eaf..90e5813cfdc33 100644 --- a/drivers/spi/spi-mt65xx.c +++ b/drivers/spi/spi-mt65xx.c @@ -1320,7 +1320,7 @@ static int mtk_spi_probe(struct platform_device *pdev) ret = devm_request_threaded_irq(dev, irq, mtk_spi_interrupt, mtk_spi_interrupt_thread, - IRQF_TRIGGER_NONE, dev_name(dev), host); + IRQF_ONESHOT, dev_name(dev), host); if (ret) return dev_err_probe(dev, ret, "failed to register irq\n"); From 1b6a3777c0857262f016f5d141a2e81b321ba1f6 Mon Sep 17 00:00:00 2001 From: Anurag Dutta Date: Fri, 12 Dec 2025 12:53:11 +0530 Subject: [PATCH 0253/1321] spi: cadence-quadspi: Add error logging for DMA request failure Add dev_err_probe() to log DMA request failures. This properly handles -EPROBE_DEFER at debug level, reducing log spam during deferred probing. Signed-off-by: Anurag Dutta Link: https://patch.msgid.link/20251212072312.2711806-2-a-dutta@ti.com Signed-off-by: Mark Brown --- drivers/spi/spi-cadence-quadspi.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c index af6d050da1c8a..7c1f742d95a68 100644 --- a/drivers/spi/spi-cadence-quadspi.c +++ b/drivers/spi/spi-cadence-quadspi.c @@ -2001,8 +2001,10 @@ static int cqspi_probe(struct platform_device *pdev) if (cqspi->use_direct_mode) { ret = cqspi_request_mmap_dma(cqspi); - if (ret == -EPROBE_DEFER) + if (ret == -EPROBE_DEFER) { + dev_err_probe(&pdev->dev, ret, "Failed to request mmap DMA\n"); goto probe_setup_failed; + } } ret = spi_register_controller(host); From 7c9e557e9aab92b924a3080544b7b9b118ba2259 Mon Sep 17 00:00:00 2001 From: Anurag Dutta Date: Fri, 12 Dec 2025 12:53:12 +0530 Subject: [PATCH 0254/1321] spi: cadence-quadspi: Fix clock disable on probe failure path When cqspi_request_mmap_dma() returns -EPROBE_DEFER after runtime PM is enabled, the error path calls clk_disable_unprepare() on an already disabled clock, causing an imbalance. Use pm_runtime_get_sync() to increment the usage counter and resume the device. This prevents runtime_suspend() from being invoked and causing a double clock disable. Fixes: 140623410536 ("mtd: spi-nor: Add driver for Cadence Quad SPI Flash Controller") Signed-off-by: Anurag Dutta Tested-by: Nishanth Menon Link: https://patch.msgid.link/20251212072312.2711806-3-a-dutta@ti.com Signed-off-by: Mark Brown --- drivers/spi/spi-cadence-quadspi.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c index 7c1f742d95a68..f8823e83a6226 100644 --- a/drivers/spi/spi-cadence-quadspi.c +++ b/drivers/spi/spi-cadence-quadspi.c @@ -2026,7 +2026,9 @@ static int cqspi_probe(struct platform_device *pdev) probe_reset_failed: if (cqspi->is_jh7110) cqspi_jh7110_disable_clk(pdev, cqspi); - clk_disable_unprepare(cqspi->clk); + + if (pm_runtime_get_sync(&pdev->dev) >= 0) + clk_disable_unprepare(cqspi->clk); probe_clk_failed: return ret; } From e7cf9287700f482328d6d123aa75dcb0c9311046 Mon Sep 17 00:00:00 2001 From: Niklas Cassel Date: Tue, 9 Dec 2025 05:24:00 +0100 Subject: [PATCH 0255/1321] ata: libata-core: Disable LPM on ST2000DM008-2FR102 According to a user report, the ST2000DM008-2FR102 has problems with LPM. Reported-by: Emerson Pinter Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220693 Signed-off-by: Niklas Cassel Signed-off-by: Damien Le Moal --- drivers/ata/libata-core.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c index 0b24bd169d61d..09d8c035fcdf9 100644 --- a/drivers/ata/libata-core.c +++ b/drivers/ata/libata-core.c @@ -4143,6 +4143,9 @@ static const struct ata_dev_quirks_entry __ata_dev_quirks[] = { { "ST3320[68]13AS", "SD1[5-9]", ATA_QUIRK_NONCQ | ATA_QUIRK_FIRMWARE_WARN }, + /* Seagate disks with LPM issues */ + { "ST2000DM008-2FR102", NULL, ATA_QUIRK_NOLPM }, + /* drives which fail FPDMA_AA activation (some may freeze afterwards) the ST disks also have LPM issues */ { "ST1000LM024 HN-M101MBB", NULL, ATA_QUIRK_BROKEN_FPDMA_AA | From 92e80c91bb2cedae3617b17386154ac7ff3e2ba2 Mon Sep 17 00:00:00 2001 From: Thomas Gleixner Date: Fri, 12 Dec 2025 13:01:04 +0900 Subject: [PATCH 0256/1321] genirq: Don't overwrite interrupt thread flags on setup Chris reported that the recent affinity management changes result in overwriting the already initialized thread flags. Use set_bit() to set the affinity bit instead of assigning the bit value to the flags. Fixes: 801afdfbfcd9 ("genirq: Fix interrupt threads affinity vs. cpuset isolated partitions") Reported-by: Chris Mason Signed-off-by: Thomas Gleixner Acked-by: Frederic Weisbecker Link: https://patch.msgid.link/87ecp0e4cf.ffs@tglx Closes: https://lore.kernel.org/all/20251212014848.3509622-1-clm@meta.com --- kernel/irq/manage.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c index 8b1b4c8a4f54c..349ae7979da0e 100644 --- a/kernel/irq/manage.c +++ b/kernel/irq/manage.c @@ -1414,7 +1414,7 @@ setup_irq_thread(struct irqaction *new, unsigned int irq, bool secondary) * Ensure the thread adjusts the affinity once it reaches the * thread function. */ - new->thread_flags = BIT(IRQTF_AFFINITY); + set_bit(IRQTF_AFFINITY, &new->thread_flags); return 0; } From 5004ca8fc8a751c02041b7b5f224662327aba244 Mon Sep 17 00:00:00 2001 From: Swaraj Gaikwad Date: Wed, 10 Dec 2025 09:28:14 +0000 Subject: [PATCH 0257/1321] x86/boot/Documentation: Fix htmldocs build warning due to malformed table in boot.rst Sphinx reports htmldocs warnings: Documentation/arch/x86/boot.rst:437: ERROR: Malformed table. Text in column margin in table line 2. The table header defined the first column width as 2 characters ("=="), which is too narrow for entries like "0x10" and "0x13". This caused the text to spill into the margin, triggering a docutils parsing failure. Fix it by extending the first column of assigned boot loader ID to 4 characters ("====") to fit the widest entries. Fixes: 1c3377bee212 ("x86/boot/Documentation: Prefix hexadecimal literals with 0x") Tested-by: Randy Dunlap Signed-off-by: Swaraj Gaikwad Signed-off-by: Ingo Molnar Reviewed-by: Randy Dunlap Reviewed-by: Bagas Sanjaya Link: https://patch.msgid.link/20251210092814.9986-1-swarajgaikwad1925@gmail.com --- Documentation/arch/x86/boot.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/arch/x86/boot.rst b/Documentation/arch/x86/boot.rst index 6d36ce86fd8ec..18574f010d46c 100644 --- a/Documentation/arch/x86/boot.rst +++ b/Documentation/arch/x86/boot.rst @@ -433,7 +433,7 @@ Protocol: 2.00+ Assigned boot loader IDs: - == ======================================= + ==== ======================================= 0x0 LILO (0x00 reserved for pre-2.00 bootloader) 0x1 Loadlin @@ -456,7 +456,7 @@ Protocol: 2.00+ 0x12 OVMF UEFI virtualization stack 0x13 barebox - == ======================================= + ==== ======================================= Please contact if you need a bootloader ID value assigned. From fe5144302bc7f37cd90007ec17b87e4a8fdef79b Mon Sep 17 00:00:00 2001 From: Yongxin Liu Date: Wed, 10 Dec 2025 08:02:20 +0800 Subject: [PATCH 0258/1321] x86/fpu: Fix FPU state core dump truncation on CPUs with no extended xfeatures Zero can be a valid value of num_records. For example, on Intel Atom x6425RE, only x87 and SSE are supported (features 0, 1), and fpu_user_cfg.max_features is 3. The for_each_extended_xfeature() loop only iterates feature 2, which is not enabled, so num_records = 0. This is valid and should not cause core dump failure. The issue is that dump_xsave_layout_desc() returns 0 for both genuine errors (dump_emit() failure) and valid cases (no extended features). Use negative return values for errors and only abort on genuine failures. Fixes: ba386777a30b ("x86/elf: Add a new FPU buffer layout info to x86 core files") Signed-off-by: Yongxin Liu Signed-off-by: Ingo Molnar Link: https://patch.msgid.link/20251210000219.4094353-2-yongxin.liu@windriver.com --- arch/x86/kernel/fpu/xstate.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c index 48113c5193aa3..76153dfb58c9d 100644 --- a/arch/x86/kernel/fpu/xstate.c +++ b/arch/x86/kernel/fpu/xstate.c @@ -1946,7 +1946,7 @@ static int dump_xsave_layout_desc(struct coredump_params *cprm) }; if (!dump_emit(cprm, &xc, sizeof(xc))) - return 0; + return -1; num_records++; } @@ -1984,7 +1984,7 @@ int elf_coredump_extra_notes_write(struct coredump_params *cprm) return 1; num_records = dump_xsave_layout_desc(cprm); - if (!num_records) + if (num_records < 0) return 1; /* Total size should be equal to the number of records */ From 94d2c5216251a62d2cdf40139b3f4e69c6aa5233 Mon Sep 17 00:00:00 2001 From: Ingo Molnar Date: Wed, 10 Dec 2025 08:36:18 +0100 Subject: [PATCH 0259/1321] x86/boot/Documentation: Fix whitespace noise in boot.rst There's a lot of unnecessary whitespace damage in this file: space before tabs, etc., that has no formatting or readability effect or advantages. Fix them. Signed-off-by: Ingo Molnar Link: https://patch.msgid.link/176535283007.498.16442167388418039352.tip-bot2@tip-bot2 --- Documentation/arch/x86/boot.rst | 194 ++++++++++++++++---------------- 1 file changed, 97 insertions(+), 97 deletions(-) diff --git a/Documentation/arch/x86/boot.rst b/Documentation/arch/x86/boot.rst index 18574f010d46c..dca3875a24351 100644 --- a/Documentation/arch/x86/boot.rst +++ b/Documentation/arch/x86/boot.rst @@ -95,26 +95,26 @@ Memory Layout The traditional memory map for the kernel loader, used for Image or zImage kernels, typically looks like:: - | | + | | 0A0000 +------------------------+ - | Reserved for BIOS | Do not use. Reserved for BIOS EBDA. + | Reserved for BIOS | Do not use. Reserved for BIOS EBDA. 09A000 +------------------------+ - | Command line | - | Stack/heap | For use by the kernel real-mode code. + | Command line | + | Stack/heap | For use by the kernel real-mode code. 098000 +------------------------+ - | Kernel setup | The kernel real-mode code. + | Kernel setup | The kernel real-mode code. 090200 +------------------------+ - | Kernel boot sector | The kernel legacy boot sector. + | Kernel boot sector | The kernel legacy boot sector. 090000 +------------------------+ - | Protected-mode kernel | The bulk of the kernel image. + | Protected-mode kernel | The bulk of the kernel image. 010000 +------------------------+ - | Boot loader | <- Boot sector entry point 0000:7C00 + | Boot loader | <- Boot sector entry point 0000:7C00 001000 +------------------------+ - | Reserved for MBR/BIOS | + | Reserved for MBR/BIOS | 000800 +------------------------+ - | Typically used by MBR | + | Typically used by MBR | 000600 +------------------------+ - | BIOS use only | + | BIOS use only | 000000 +------------------------+ When using bzImage, the protected-mode kernel was relocated to @@ -142,27 +142,27 @@ above the 0x9A000 point; too many BIOSes will break above that point. For a modern bzImage kernel with boot protocol version >= 2.02, a memory layout like the following is suggested:: - ~ ~ - | Protected-mode kernel | + ~ ~ + | Protected-mode kernel | 100000 +------------------------+ - | I/O memory hole | + | I/O memory hole | 0A0000 +------------------------+ - | Reserved for BIOS | Leave as much as possible unused - ~ ~ - | Command line | (Can also be below the X+10000 mark) + | Reserved for BIOS | Leave as much as possible unused + ~ ~ + | Command line | (Can also be below the X+10000 mark) X+10000 +------------------------+ - | Stack/heap | For use by the kernel real-mode code. + | Stack/heap | For use by the kernel real-mode code. X+08000 +------------------------+ - | Kernel setup | The kernel real-mode code. - | Kernel boot sector | The kernel legacy boot sector. + | Kernel setup | The kernel real-mode code. + | Kernel boot sector | The kernel legacy boot sector. X +------------------------+ - | Boot loader | <- Boot sector entry point 0000:7C00 + | Boot loader | <- Boot sector entry point 0000:7C00 001000 +------------------------+ - | Reserved for MBR/BIOS | + | Reserved for MBR/BIOS | 000800 +------------------------+ - | Typically used by MBR | + | Typically used by MBR | 000600 +------------------------+ - | BIOS use only | + | BIOS use only | 000000 +------------------------+ ... where the address X is as low as the design of the boot loader permits. @@ -809,12 +809,12 @@ Protocol: 2.09+ as follow:: struct setup_data { - __u64 next; - __u32 type; - __u32 len; - __u8 data[]; + __u64 next; + __u32 type; + __u32 len; + __u8 data[]; } - + Where, the next is a 64-bit physical pointer to the next node of linked list, the next field of the last node is 0; the type is used to identify the contents of data; the len is the length of data @@ -835,10 +835,10 @@ Protocol: 2.09+ protocol 2.15:: struct setup_indirect { - __u32 type; - __u32 reserved; /* Reserved, must be set to zero. */ - __u64 len; - __u64 addr; + __u32 type; + __u32 reserved; /* Reserved, must be set to zero. */ + __u64 len; + __u64 addr; }; The type member is a SETUP_INDIRECT | SETUP_* type. However, it cannot be @@ -850,15 +850,15 @@ Protocol: 2.09+ In this case setup_data and setup_indirect will look like this:: struct setup_data { - .next = 0, /* or */ - .type = SETUP_INDIRECT, - .len = sizeof(setup_indirect), - .data[sizeof(setup_indirect)] = (struct setup_indirect) { - .type = SETUP_INDIRECT | SETUP_E820_EXT, - .reserved = 0, - .len = , - .addr = , - }, + .next = 0, /* or */ + .type = SETUP_INDIRECT, + .len = sizeof(setup_indirect), + .data[sizeof(setup_indirect)] = (struct setup_indirect) { + .type = SETUP_INDIRECT | SETUP_E820_EXT, + .reserved = 0, + .len = , + .addr = , + }, } .. note:: @@ -897,11 +897,11 @@ Offset/size: 0x260/4 The kernel runtime start address is determined by the following algorithm:: if (relocatable_kernel) { - if (load_address < pref_address) - load_address = pref_address; - runtime_start = align_up(load_address, kernel_alignment); + if (load_address < pref_address) + load_address = pref_address; + runtime_start = align_up(load_address, kernel_alignment); } else { - runtime_start = pref_address; + runtime_start = pref_address; } Hence the necessary memory window location and size can be estimated by @@ -975,22 +975,22 @@ after kernel_info_var_len_data label. Each chunk of variable size data has to be prefixed with header/magic and its size, e.g.:: kernel_info: - .ascii "LToP" /* Header, Linux top (structure). */ - .long kernel_info_var_len_data - kernel_info - .long kernel_info_end - kernel_info - .long 0x01234567 /* Some fixed size data for the bootloaders. */ + .ascii "LToP" /* Header, Linux top (structure). */ + .long kernel_info_var_len_data - kernel_info + .long kernel_info_end - kernel_info + .long 0x01234567 /* Some fixed size data for the bootloaders. */ kernel_info_var_len_data: example_struct: /* Some variable size data for the bootloaders. */ - .ascii "0123" /* Header/Magic. */ - .long example_struct_end - example_struct - .ascii "Struct" - .long 0x89012345 + .ascii "0123" /* Header/Magic. */ + .long example_struct_end - example_struct + .ascii "Struct" + .long 0x89012345 example_struct_end: example_strings: /* Some variable size data for the bootloaders. */ - .ascii "ABCD" /* Header/Magic. */ - .long example_strings_end - example_strings - .asciz "String_0" - .asciz "String_1" + .ascii "ABCD" /* Header/Magic. */ + .long example_strings_end - example_strings + .asciz "String_0" + .asciz "String_1" example_strings_end: kernel_info_end: @@ -1132,53 +1132,53 @@ Such a boot loader should enter the following fields in the header:: unsigned long base_ptr; /* base address for real-mode segment */ if (setup_sects == 0) - setup_sects = 4; + setup_sects = 4; if (protocol >= 0x0200) { - type_of_loader = ; - if (loading_initrd) { - ramdisk_image = ; - ramdisk_size = ; - } - - if (protocol >= 0x0202 && loadflags & 0x01) - heap_end = 0xe000; - else - heap_end = 0x9800; - - if (protocol >= 0x0201) { - heap_end_ptr = heap_end - 0x200; - loadflags |= 0x80; /* CAN_USE_HEAP */ - } - - if (protocol >= 0x0202) { - cmd_line_ptr = base_ptr + heap_end; - strcpy(cmd_line_ptr, cmdline); - } else { - cmd_line_magic = 0xA33F; - cmd_line_offset = heap_end; - setup_move_size = heap_end + strlen(cmdline) + 1; - strcpy(base_ptr + cmd_line_offset, cmdline); - } + type_of_loader = ; + if (loading_initrd) { + ramdisk_image = ; + ramdisk_size = ; + } + + if (protocol >= 0x0202 && loadflags & 0x01) + heap_end = 0xe000; + else + heap_end = 0x9800; + + if (protocol >= 0x0201) { + heap_end_ptr = heap_end - 0x200; + loadflags |= 0x80; /* CAN_USE_HEAP */ + } + + if (protocol >= 0x0202) { + cmd_line_ptr = base_ptr + heap_end; + strcpy(cmd_line_ptr, cmdline); + } else { + cmd_line_magic = 0xA33F; + cmd_line_offset = heap_end; + setup_move_size = heap_end + strlen(cmdline) + 1; + strcpy(base_ptr + cmd_line_offset, cmdline); + } } else { - /* Very old kernel */ + /* Very old kernel */ - heap_end = 0x9800; + heap_end = 0x9800; - cmd_line_magic = 0xA33F; - cmd_line_offset = heap_end; + cmd_line_magic = 0xA33F; + cmd_line_offset = heap_end; - /* A very old kernel MUST have its real-mode code loaded at 0x90000 */ - if (base_ptr != 0x90000) { - /* Copy the real-mode kernel */ - memcpy(0x90000, base_ptr, (setup_sects + 1) * 512); - base_ptr = 0x90000; /* Relocated */ - } + /* A very old kernel MUST have its real-mode code loaded at 0x90000 */ + if (base_ptr != 0x90000) { + /* Copy the real-mode kernel */ + memcpy(0x90000, base_ptr, (setup_sects + 1) * 512); + base_ptr = 0x90000; /* Relocated */ + } - strcpy(0x90000 + cmd_line_offset, cmdline); + strcpy(0x90000 + cmd_line_offset, cmdline); - /* It is recommended to clear memory up to the 32K mark */ - memset(0x90000 + (setup_sects + 1) * 512, 0, (64 - (setup_sects + 1)) * 512); + /* It is recommended to clear memory up to the 32K mark */ + memset(0x90000 + (setup_sects + 1) * 512, 0, (64 - (setup_sects + 1)) * 512); } From 3a9580a308813c2546dcb121ef908232e378949a Mon Sep 17 00:00:00 2001 From: Thorsten Blum Date: Wed, 10 Dec 2025 13:56:28 +0100 Subject: [PATCH 0260/1321] x86/sgx: Remove unmatched quote in __sgx_encl_extend function comment There is no opening quote. Remove the unmatched closing quote. Signed-off-by: Thorsten Blum Signed-off-by: Ingo Molnar Reviewed-by: Kai Huang Link: https://patch.msgid.link/20251210125628.544916-1-thorsten.blum@linux.dev --- arch/x86/kernel/cpu/sgx/ioctl.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c index 66f1efa16fbb7..9322a9287dc7f 100644 --- a/arch/x86/kernel/cpu/sgx/ioctl.c +++ b/arch/x86/kernel/cpu/sgx/ioctl.c @@ -242,7 +242,7 @@ static int __sgx_encl_add_page(struct sgx_encl *encl, /* * If the caller requires measurement of the page as a proof for the content, * use EEXTEND to add a measurement for 256 bytes of the page. Repeat this - * operation until the entire page is measured." + * operation until the entire page is measured. */ static int __sgx_encl_extend(struct sgx_encl *encl, struct sgx_epc_page *epc_page) From a891a4a01fe19846634157461441c482a59b28b8 Mon Sep 17 00:00:00 2001 From: Tal Zussman Date: Fri, 12 Dec 2025 04:08:07 -0500 Subject: [PATCH 0261/1321] x86/mm/tlb/trace: Export the TLB_REMOTE_WRONG_CPU enum in When the TLB_REMOTE_WRONG_CPU enum was introduced for the tlb_flush tracepoint, the enum was not exported to user-space. Add it to the appropriate macro definition to enable parsing by userspace tools, as per: Link: https://lore.kernel.org/all/20150403013802.220157513@goodmis.org [ mingo: Capitalize IPI, etc. ] Fixes: 2815a56e4b72 ("x86/mm/tlb: Add tracepoint for TLB flush IPI to stale CPU") Signed-off-by: Tal Zussman Signed-off-by: Ingo Molnar Reviewed-by: Steven Rostedt (Google) Reviewed-by: David Hildenbrand Reviewed-by: Rik van Riel Link: https://patch.msgid.link/20251212-tlb-trace-fix-v2-1-d322e0ad9b69@columbia.edu --- include/trace/events/tlb.h | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/include/trace/events/tlb.h b/include/trace/events/tlb.h index b4d8e7dc38f88..fb83695116856 100644 --- a/include/trace/events/tlb.h +++ b/include/trace/events/tlb.h @@ -12,8 +12,9 @@ EM( TLB_FLUSH_ON_TASK_SWITCH, "flush on task switch" ) \ EM( TLB_REMOTE_SHOOTDOWN, "remote shootdown" ) \ EM( TLB_LOCAL_SHOOTDOWN, "local shootdown" ) \ - EM( TLB_LOCAL_MM_SHOOTDOWN, "local mm shootdown" ) \ - EMe( TLB_REMOTE_SEND_IPI, "remote ipi send" ) + EM( TLB_LOCAL_MM_SHOOTDOWN, "local MM shootdown" ) \ + EM( TLB_REMOTE_SEND_IPI, "remote IPI send" ) \ + EMe( TLB_REMOTE_WRONG_CPU, "remote wrong CPU" ) /* * First define the enums in TLB_FLUSH_REASON to be exported to userspace From eca82fab0f6aefa845b9d843abd98d0b575e5bf5 Mon Sep 17 00:00:00 2001 From: Tal Zussman Date: Fri, 12 Dec 2025 04:08:08 -0500 Subject: [PATCH 0262/1321] mm: Remove tlb_flush_reason::NR_TLB_FLUSH_REASONS from This has been unused since it was added 11 years ago in: d17d8f9dedb9 ("x86/mm: Add tracepoints for TLB flushes") Signed-off-by: Tal Zussman Signed-off-by: Ingo Molnar Reviewed-by: Rik van Riel Acked-by: David Hildenbrand Link: https://patch.msgid.link/20251212-tlb-trace-fix-v2-2-d322e0ad9b69@columbia.edu --- include/linux/mm_types.h | 1 - 1 file changed, 1 deletion(-) diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 9f6de068295d3..42af2292951d4 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -1631,7 +1631,6 @@ enum tlb_flush_reason { TLB_LOCAL_MM_SHOOTDOWN, TLB_REMOTE_SEND_IPI, TLB_REMOTE_WRONG_CPU, - NR_TLB_FLUSH_REASONS, }; /** From 330581ffe0ea4c078dafb6b010d649e810a117f4 Mon Sep 17 00:00:00 2001 From: Kyle Meyer Date: Fri, 12 Dec 2025 12:53:36 -0600 Subject: [PATCH 0263/1321] x86/platform/uv: Fix UBSAN array-index-out-of-bounds When UBSAN is enabled, multiple array-index-out-of-bounds messages are printed: [ 0.000000] [ T0] UBSAN: array-index-out-of-bounds in arch/x86/kernel/apic/x2apic_uv_x.c:276:23 [ 0.000000] [ T0] index 1 is out of range for type ' [1]' ... [ 0.000000] [ T0] UBSAN: array-index-out-of-bounds in arch/x86/kernel/apic/x2apic_uv_x.c:277:32 [ 0.000000] [ T0] index 1 is out of range for type ' [1]' ... [ 0.000000] [ T0] UBSAN: array-index-out-of-bounds in arch/x86/kernel/apic/x2apic_uv_x.c:282:16 [ 0.000000] [ T0] index 1 is out of range for type ' [1]' ... [ 0.515850] [ T1] UBSAN: array-index-out-of-bounds in arch/x86/kernel/apic/x2apic_uv_x.c:1344:23 [ 0.519851] [ T1] index 1 is out of range for type ' [1]' ... [ 0.603850] [ T1] UBSAN: array-index-out-of-bounds in arch/x86/kernel/apic/x2apic_uv_x.c:1345:32 [ 0.607850] [ T1] index 1 is out of range for type ' [1]' ... [ 0.691850] [ T1] UBSAN: array-index-out-of-bounds in arch/x86/kernel/apic/x2apic_uv_x.c:1353:20 [ 0.695850] [ T1] index 1 is out of range for type ' [1]' One-element arrays have been deprecated: https://docs.kernel.org/process/deprecated.html#zero-length-and-one-element-arrays Switch entry in struct uv_systab to a flexible array member to fix UBSAN array-index-out-of-bounds messages. sizeof(struct uv_systab) is passed to early_memremap() and ioremap(). The flexible array member is not accessed until the UV system table size is used to remap the entire UV system table, so changes to sizeof(struct uv_systab) have no impact. Signed-off-by: Kyle Meyer Signed-off-by: Ingo Molnar Link: https://patch.msgid.link/aTxksN-3otY41WvQ@hpe.com --- arch/x86/include/asm/uv/bios.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/include/asm/uv/bios.h b/arch/x86/include/asm/uv/bios.h index 6989b824fd321..d0b62e2552902 100644 --- a/arch/x86/include/asm/uv/bios.h +++ b/arch/x86/include/asm/uv/bios.h @@ -122,7 +122,7 @@ struct uv_systab { struct { u32 type:8; /* type of entry */ u32 offset:24; /* byte offset from struct start to entry */ - } entry[1]; /* additional entries follow */ + } entry[]; /* additional entries follow */ }; extern struct uv_systab *uv_systab; From bac45491d96f38fc72c5ce24e335d885165d3044 Mon Sep 17 00:00:00 2001 From: Thomas Gleixner Date: Tue, 25 Nov 2025 22:50:45 +0100 Subject: [PATCH 0264/1321] x86/msi: Make irq_retrigger() functional for posted MSI Luigi reported that retriggering a posted MSI interrupt does not work correctly. The reason is that the retrigger happens at the vector domain by sending an IPI to the actual vector on the target CPU. That works correctly exactly once because the posted MSI interrupt chip does not issue an EOI as that's only required for the posted MSI notification vector itself. As a consequence the vector becomes stale in the ISR, which not only affects this vector but also any lower priority vector in the affected APIC because the ISR bit is not cleared. Luigi proposed to set the vector in the remap PIR bitmap and raise the posted MSI notification vector. That works, but that still does not cure a related problem: If there is ever a stray interrupt on such a vector, then the related APIC ISR bit becomes stale due to the lack of EOI as described above. Unlikely to happen, but if it happens it's not debuggable at all. So instead of playing games with the PIR, this can be actually solved for both cases by: 1) Keeping track of the posted interrupt vector handler state 2) Implementing a posted MSI specific irq_ack() callback which checks that state. If the posted vector handler is inactive it issues an EOI, otherwise it delegates that to the posted handler. This is correct versus affinity changes and concurrent events on the posted vector as the actual handler invocation is serialized through the interrupt descriptor lock. Fixes: ed1e48ea4370 ("iommu/vt-d: Enable posted mode for device MSIs") Reported-by: Luigi Rizzo Signed-off-by: Thomas Gleixner Tested-by: Luigi Rizzo Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251125214631.044440658@linutronix.de Closes: https://lore.kernel.org/lkml/20251124104836.3685533-1-lrizzo@google.com --- arch/x86/include/asm/irq_remapping.h | 7 +++++++ arch/x86/kernel/irq.c | 23 +++++++++++++++++++++++ drivers/iommu/intel/irq_remapping.c | 8 ++++---- 3 files changed, 34 insertions(+), 4 deletions(-) diff --git a/arch/x86/include/asm/irq_remapping.h b/arch/x86/include/asm/irq_remapping.h index 5a0d42464d442..4e55d17558465 100644 --- a/arch/x86/include/asm/irq_remapping.h +++ b/arch/x86/include/asm/irq_remapping.h @@ -87,4 +87,11 @@ static inline void panic_if_irq_remap(const char *msg) } #endif /* CONFIG_IRQ_REMAP */ + +#ifdef CONFIG_X86_POSTED_MSI +void intel_ack_posted_msi_irq(struct irq_data *irqd); +#else +#define intel_ack_posted_msi_irq NULL +#endif + #endif /* __X86_IRQ_REMAPPING_H */ diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c index 86f4e574de026..b2fe6181960c3 100644 --- a/arch/x86/kernel/irq.c +++ b/arch/x86/kernel/irq.c @@ -397,6 +397,7 @@ DEFINE_IDTENTRY_SYSVEC_SIMPLE(sysvec_kvm_posted_intr_nested_ipi) /* Posted Interrupt Descriptors for coalesced MSIs to be posted */ DEFINE_PER_CPU_ALIGNED(struct pi_desc, posted_msi_pi_desc); +static DEFINE_PER_CPU_CACHE_HOT(bool, posted_msi_handler_active); void intel_posted_msi_init(void) { @@ -414,6 +415,25 @@ void intel_posted_msi_init(void) this_cpu_write(posted_msi_pi_desc.ndst, destination); } +void intel_ack_posted_msi_irq(struct irq_data *irqd) +{ + irq_move_irq(irqd); + + /* + * Handle the rare case that irq_retrigger() raised the actual + * assigned vector on the target CPU, which means that it was not + * invoked via the posted MSI handler below. In that case APIC EOI + * is required as otherwise the ISR entry becomes stale and lower + * priority interrupts are never going to be delivered after that. + * + * If the posted handler invoked the device interrupt handler then + * the EOI would be premature because it would acknowledge the + * posted vector. + */ + if (unlikely(!__this_cpu_read(posted_msi_handler_active))) + apic_eoi(); +} + static __always_inline bool handle_pending_pir(unsigned long *pir, struct pt_regs *regs) { unsigned long pir_copy[NR_PIR_WORDS]; @@ -446,6 +466,8 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_posted_msi_notification) pid = this_cpu_ptr(&posted_msi_pi_desc); + /* Mark the handler active for intel_ack_posted_msi_irq() */ + __this_cpu_write(posted_msi_handler_active, true); inc_irq_stat(posted_msi_notification_count); irq_enter(); @@ -474,6 +496,7 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_posted_msi_notification) apic_eoi(); irq_exit(); + __this_cpu_write(posted_msi_handler_active, false); set_irq_regs(old_regs); } #endif /* X86_POSTED_MSI */ diff --git a/drivers/iommu/intel/irq_remapping.c b/drivers/iommu/intel/irq_remapping.c index 4f9b01dc91e86..8bcbfe3d9c722 100644 --- a/drivers/iommu/intel/irq_remapping.c +++ b/drivers/iommu/intel/irq_remapping.c @@ -1303,17 +1303,17 @@ static struct irq_chip intel_ir_chip = { * irq_enter(); * handle_edge_irq() * irq_chip_ack_parent() - * irq_move_irq(); // No EOI + * intel_ack_posted_msi_irq(); // No EOI * handle_irq_event() * driver_handler() * handle_edge_irq() * irq_chip_ack_parent() - * irq_move_irq(); // No EOI + * intel_ack_posted_msi_irq(); // No EOI * handle_irq_event() * driver_handler() * handle_edge_irq() * irq_chip_ack_parent() - * irq_move_irq(); // No EOI + * intel_ack_posted_msi_irq(); // No EOI * handle_irq_event() * driver_handler() * apic_eoi() @@ -1322,7 +1322,7 @@ static struct irq_chip intel_ir_chip = { */ static struct irq_chip intel_ir_chip_post_msi = { .name = "INTEL-IR-POST", - .irq_ack = irq_move_irq, + .irq_ack = intel_ack_posted_msi_irq, .irq_set_affinity = intel_ir_set_affinity, .irq_compose_msi_msg = intel_ir_compose_msi_msg, .irq_set_vcpu_affinity = intel_ir_set_vcpu_affinity, From 753441edc30dff1a5b3d98ee767e4ea6c2cc87b0 Mon Sep 17 00:00:00 2001 From: Peter Zijlstra Date: Thu, 18 Dec 2025 11:47:38 +0100 Subject: [PATCH 0265/1321] x86/bug: Fix old GCC compile fails For some mysterious reasons the GCC 8 and 9 preprocessor manages to sporadically fumble _ASM_BYTES(0x0f, 0x0b): $ grep ".byte[ ]*0x0f" defconfig-build/drivers/net/wireless/realtek/rtlwifi/base.s 1: .byte0x0f,0x0b ; 1: .byte 0x0f,0x0b ; which makes the assembler upset and all that. While there are more _ASM_BYTES() users (notably the NOP instructions), those don't seem affected. Therefore replace the offending ASM_UD2 with one using the ud2 mnemonic. Reported-by: Jean Delvare Suggested-by: Uros Bizjak Fixes: 85a2d4a890dc ("x86,ibt: Use UDB instead of 0xEA") Cc: stable@kernel.org Signed-off-by: Peter Zijlstra (Intel) Link: https://patch.msgid.link/20251218104659.GT3911114@noisy.programming.kicks-ass.net --- arch/x86/include/asm/bug.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/include/asm/bug.h b/arch/x86/include/asm/bug.h index d561a8443c137..9b4e04690e1a2 100644 --- a/arch/x86/include/asm/bug.h +++ b/arch/x86/include/asm/bug.h @@ -15,7 +15,7 @@ extern void __WARN_trap(struct bug_entry *bug, ...); /* * Despite that some emulators terminate on UD2, we use it for WARN(). */ -#define ASM_UD2 _ASM_BYTES(0x0f, 0x0b) +#define ASM_UD2 __ASM_FORM(ud2) #define INSN_UD2 0x0b0f #define LEN_UD2 2 From 2dc2122cbd6fa51e5330b0033cbd52c1b3e0d333 Mon Sep 17 00:00:00 2001 From: Krzysztof Kozlowski Date: Wed, 26 Nov 2025 19:22:58 +0100 Subject: [PATCH 0266/1321] i2c: bcm-iproc: Fix Wvoid-pointer-to-enum-cast warning 'type' is an enum, thus cast of pointer on 64-bit compile test with clang and W=1 causes: i2c-bcm-iproc.c:1102:3: error: cast to smaller integer type 'enum bcm_iproc_i2c_type' from 'const void *' [-Werror,-Wvoid-pointer-to-enum-cast] One of the discussions in 2023 on LKML suggested warning is not suitable for kernel. Nothing changed in this regard since that time, so assume the warning will stay and we want to have warnings-free builds. Signed-off-by: Krzysztof Kozlowski Signed-off-by: Andi Shyti Link: https://lore.kernel.org/r/20251126182257.157439-4-krzysztof.kozlowski@oss.qualcomm.com --- drivers/i2c/busses/i2c-bcm-iproc.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/i2c/busses/i2c-bcm-iproc.c b/drivers/i2c/busses/i2c-bcm-iproc.c index e418a4f23f156..b5629cffe99b5 100644 --- a/drivers/i2c/busses/i2c-bcm-iproc.c +++ b/drivers/i2c/busses/i2c-bcm-iproc.c @@ -1098,8 +1098,7 @@ static int bcm_iproc_i2c_probe(struct platform_device *pdev) platform_set_drvdata(pdev, iproc_i2c); iproc_i2c->device = &pdev->dev; - iproc_i2c->type = - (enum bcm_iproc_i2c_type)of_device_get_match_data(&pdev->dev); + iproc_i2c->type = (kernel_ulong_t)of_device_get_match_data(&pdev->dev); init_completion(&iproc_i2c->done); iproc_i2c->base = devm_platform_ioremap_resource(pdev, 0); From aff281926ff358270193f56d544aeab189e3e311 Mon Sep 17 00:00:00 2001 From: Krzysztof Kozlowski Date: Wed, 26 Nov 2025 19:22:59 +0100 Subject: [PATCH 0267/1321] i2c: pxa: Fix Wvoid-pointer-to-enum-cast warning 'i2c_types' is an enum, thus cast of pointer on 64-bit compile test with clang and W=1 causes: i2c-pxa.c:1269:15: error: cast to smaller integer type 'enum pxa_i2c_types' from 'const void *' [-Werror,-Wvoid-pointer-to-enum-cast] One of the discussions in 2023 on LKML suggested warning is not suitable for kernel. Nothing changed in this regard since that time, so assume the warning will stay and we want to have warnings-free builds. Signed-off-by: Krzysztof Kozlowski Signed-off-by: Andi Shyti Link: https://lore.kernel.org/r/20251126182257.157439-5-krzysztof.kozlowski@oss.qualcomm.com --- drivers/i2c/busses/i2c-pxa.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/i2c/busses/i2c-pxa.c b/drivers/i2c/busses/i2c-pxa.c index 968a8b8794dac..09af3b3625f11 100644 --- a/drivers/i2c/busses/i2c-pxa.c +++ b/drivers/i2c/busses/i2c-pxa.c @@ -1266,7 +1266,7 @@ static int i2c_pxa_probe_dt(struct platform_device *pdev, struct pxa_i2c *i2c, i2c->use_pio = of_property_read_bool(np, "mrvl,i2c-polling"); i2c->fast_mode = of_property_read_bool(np, "mrvl,i2c-fast-mode"); - *i2c_types = (enum pxa_i2c_types)device_get_match_data(&pdev->dev); + *i2c_types = (kernel_ulong_t)device_get_match_data(&pdev->dev); return 0; } From bb9aa3a21b5f5350e6049a9766960ef7dcd5366b Mon Sep 17 00:00:00 2001 From: Krzysztof Kozlowski Date: Wed, 26 Nov 2025 19:23:00 +0100 Subject: [PATCH 0268/1321] i2c: rcar: Fix Wvoid-pointer-to-enum-cast warning 'i2c_types' is an enum, thus cast of pointer on 64-bit compile test with clang and W=1 causes: i2c-rcar.c:1144:18: error: cast to smaller integer type 'enum rcar_i2c_type' from 'const void *' [-Werror,-Wvoid-pointer-to-enum-cast] One of the discussions in 2023 on LKML suggested warning is not suitable for kernel. Nothing changed in this regard since that time, so assume the warning will stay and we want to have warnings-free builds. Signed-off-by: Krzysztof Kozlowski Reviewed-by: Geert Uytterhoeven Signed-off-by: Andi Shyti Link: https://lore.kernel.org/r/20251126182257.157439-6-krzysztof.kozlowski@oss.qualcomm.com --- drivers/i2c/busses/i2c-rcar.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/i2c/busses/i2c-rcar.c b/drivers/i2c/busses/i2c-rcar.c index d51884ab99f4d..5ce8f8e4856fb 100644 --- a/drivers/i2c/busses/i2c-rcar.c +++ b/drivers/i2c/busses/i2c-rcar.c @@ -1141,7 +1141,7 @@ static int rcar_i2c_probe(struct platform_device *pdev) if (IS_ERR(priv->io)) return PTR_ERR(priv->io); - priv->devtype = (enum rcar_i2c_type)of_device_get_match_data(dev); + priv->devtype = (kernel_ulong_t)of_device_get_match_data(dev); init_waitqueue_head(&priv->wait); adap = &priv->adap; From 360917b11d66d5c70a9b9cbc8a30189f3cb7838a Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Beno=C3=AEt=20Monin?= Date: Wed, 26 Nov 2025 11:46:24 +0100 Subject: [PATCH 0269/1321] dt-bindings: i2c: dw: Add Mobileye I2C controllers MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add compatible string for the DesignWare-based I2C controllers present in Mobileye Eyeq6Lplus SoC, with a fallback to the default compatible. The same controllers are also present in the EyeQ7H, so add a compatible for those with a fallback to the Eyeq6Lplus compatible. Reviewed-by: Krzysztof Kozlowski Signed-off-by: Benoît Monin Acked-by: Mika Westerberg Signed-off-by: Andi Shyti Link: https://lore.kernel.org/r/20251126-i2c-dw-v4-1-b0654598e7c5@bootlin.com --- .../devicetree/bindings/i2c/snps,designware-i2c.yaml | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/Documentation/devicetree/bindings/i2c/snps,designware-i2c.yaml b/Documentation/devicetree/bindings/i2c/snps,designware-i2c.yaml index d904191bb0c6e..9142001888095 100644 --- a/Documentation/devicetree/bindings/i2c/snps,designware-i2c.yaml +++ b/Documentation/devicetree/bindings/i2c/snps,designware-i2c.yaml @@ -34,8 +34,15 @@ properties: - const: snps,designware-i2c - description: Baikal-T1 SoC System I2C controller const: baikal,bt1-sys-i2c + - description: Mobileye EyeQ DesignWare I2C controller + items: + - enum: + - mobileye,eyeq7h-i2c + - const: mobileye,eyeq6lplus-i2c + - const: snps,designware-i2c - items: - enum: + - mobileye,eyeq6lplus-i2c - mscc,ocelot-i2c - sophgo,sg2044-i2c - thead,th1520-i2c From 61ac46fcc083bcf8307ee40dc0ba70ea5730753e Mon Sep 17 00:00:00 2001 From: Jarkko Nikula Date: Mon, 24 Nov 2025 14:28:15 +0100 Subject: [PATCH 0270/1321] i2c: i801: Add support for Intel Nova Lake-S Add SMBus PCI IDs on Intel Nova Lake-S. Signed-off-by: Jarkko Nikula Signed-off-by: Heikki Krogerus Signed-off-by: Andi Shyti Link: https://lore.kernel.org/r/20251124132816.470599-1-heikki.krogerus@linux.intel.com --- Documentation/i2c/busses/i2c-i801.rst | 1 + drivers/i2c/busses/Kconfig | 1 + drivers/i2c/busses/i2c-i801.c | 3 +++ 3 files changed, 5 insertions(+) diff --git a/Documentation/i2c/busses/i2c-i801.rst b/Documentation/i2c/busses/i2c-i801.rst index c939a5bfc8d00..bbbce90eb7d84 100644 --- a/Documentation/i2c/busses/i2c-i801.rst +++ b/Documentation/i2c/busses/i2c-i801.rst @@ -52,6 +52,7 @@ Supported adapters: * Intel Panther Lake (SOC) * Intel Wildcat Lake (SOC) * Intel Diamond Rapids (SOC) + * Intel Nova Lake (PCH) Datasheets: Publicly available at the Intel website diff --git a/drivers/i2c/busses/Kconfig b/drivers/i2c/busses/Kconfig index cea87fcb4a1a9..09ba55bae1fac 100644 --- a/drivers/i2c/busses/Kconfig +++ b/drivers/i2c/busses/Kconfig @@ -167,6 +167,7 @@ config I2C_I801 Panther Lake (SOC) Wildcat Lake (SOC) Diamond Rapids (SOC) + Nova Lake (PCH) This driver can also be built as a module. If so, the module will be called i2c-i801. diff --git a/drivers/i2c/busses/i2c-i801.c b/drivers/i2c/busses/i2c-i801.c index 81e6e2d7ad3dc..9e1789725edf7 100644 --- a/drivers/i2c/busses/i2c-i801.c +++ b/drivers/i2c/busses/i2c-i801.c @@ -85,6 +85,7 @@ * Panther Lake-P (SOC) 0xe422 32 hard yes yes yes * Wildcat Lake-U (SOC) 0x4d22 32 hard yes yes yes * Diamond Rapids (SOC) 0x5827 32 hard yes yes yes + * Nova Lake-S (PCH) 0x6e23 32 hard yes yes yes * * Features supported by this driver: * Software PEC no @@ -245,6 +246,7 @@ #define PCI_DEVICE_ID_INTEL_BIRCH_STREAM_SMBUS 0x5796 #define PCI_DEVICE_ID_INTEL_DIAMOND_RAPIDS_SMBUS 0x5827 #define PCI_DEVICE_ID_INTEL_BROXTON_SMBUS 0x5ad4 +#define PCI_DEVICE_ID_INTEL_NOVA_LAKE_S_SMBUS 0x6e23 #define PCI_DEVICE_ID_INTEL_ARROW_LAKE_H_SMBUS 0x7722 #define PCI_DEVICE_ID_INTEL_RAPTOR_LAKE_S_SMBUS 0x7a23 #define PCI_DEVICE_ID_INTEL_ALDER_LAKE_S_SMBUS 0x7aa3 @@ -1061,6 +1063,7 @@ static const struct pci_device_id i801_ids[] = { { PCI_DEVICE_DATA(INTEL, PANTHER_LAKE_H_SMBUS, FEATURES_ICH5 | FEATURE_TCO_CNL) }, { PCI_DEVICE_DATA(INTEL, PANTHER_LAKE_P_SMBUS, FEATURES_ICH5 | FEATURE_TCO_CNL) }, { PCI_DEVICE_DATA(INTEL, WILDCAT_LAKE_U_SMBUS, FEATURES_ICH5 | FEATURE_TCO_CNL) }, + { PCI_DEVICE_DATA(INTEL, NOVA_LAKE_S_SMBUS, FEATURES_ICH5 | FEATURE_TCO_CNL) }, { 0, } }; From 76699d5b4b5407cd7fb34d4d37c7e11a60fd8aa0 Mon Sep 17 00:00:00 2001 From: Hangxiang Ma Date: Wed, 26 Nov 2025 01:38:34 -0800 Subject: [PATCH 0271/1321] dt-bindings: i2c: qcom-cci: Document SM8750 compatible Add SM8750 compatible consistent with CAMSS CCI interfaces. Signed-off-by: Hangxiang Ma Reviewed-by: Bryan O'Donoghue Reviewed-by: Krzysztof Kozlowski Link: https://lore.kernel.org/r/20251126-add-support-for-camss-on-sm8750-v1-1-646fee2eb720@oss.qualcomm.com Signed-off-by: Andi Shyti --- Documentation/devicetree/bindings/i2c/qcom,i2c-cci.yaml | 2 ++ 1 file changed, 2 insertions(+) diff --git a/Documentation/devicetree/bindings/i2c/qcom,i2c-cci.yaml b/Documentation/devicetree/bindings/i2c/qcom,i2c-cci.yaml index 33852a5ffca8f..a3fe1eea6aece 100644 --- a/Documentation/devicetree/bindings/i2c/qcom,i2c-cci.yaml +++ b/Documentation/devicetree/bindings/i2c/qcom,i2c-cci.yaml @@ -38,6 +38,7 @@ properties: - qcom,sm8450-cci - qcom,sm8550-cci - qcom,sm8650-cci + - qcom,sm8750-cci - qcom,x1e80100-cci - const: qcom,msm8996-cci # CCI v2 @@ -132,6 +133,7 @@ allOf: enum: - qcom,kaanapali-cci - qcom,qcm2290-cci + - qcom,sm8750-cci then: properties: clocks: From 6203f9c96b6e108fa6bdb89b46c28976bc937515 Mon Sep 17 00:00:00 2001 From: Minseong Kim Date: Fri, 12 Dec 2025 00:29:23 -0800 Subject: [PATCH 0272/1321] Input: lkkbd - disable pending work before freeing device lkkbd_interrupt() schedules lk->tq via schedule_work(), and the work handler lkkbd_reinit() dereferences the lkkbd structure and its serio/input_dev fields. lkkbd_disconnect() and error paths in lkkbd_connect() free the lkkbd structure without preventing the reinit work from being queued again until serio_close() returns. This can allow the work handler to run after the structure has been freed, leading to a potential use-after-free. Use disable_work_sync() instead of cancel_work_sync() to ensure the reinit work cannot be re-queued, and call it both in lkkbd_disconnect() and in lkkbd_connect() error paths after serio_open(). Signed-off-by: Minseong Kim Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251212052314.16139-1-ii4gsp@gmail.com Signed-off-by: Dmitry Torokhov --- drivers/input/keyboard/lkkbd.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/input/keyboard/lkkbd.c b/drivers/input/keyboard/lkkbd.c index c035216dd27c1..2f130f819363c 100644 --- a/drivers/input/keyboard/lkkbd.c +++ b/drivers/input/keyboard/lkkbd.c @@ -670,7 +670,8 @@ static int lkkbd_connect(struct serio *serio, struct serio_driver *drv) return 0; - fail3: serio_close(serio); + fail3: disable_work_sync(&lk->tq); + serio_close(serio); fail2: serio_set_drvdata(serio, NULL); fail1: input_free_device(input_dev); kfree(lk); @@ -684,6 +685,8 @@ static void lkkbd_disconnect(struct serio *serio) { struct lkkbd *lk = serio_get_drvdata(serio); + disable_work_sync(&lk->tq); + input_get_device(lk->dev); input_unregister_device(lk->dev); serio_close(serio); From 043e3ccc1a837774b3457e3ed9e2e43e9ec207c3 Mon Sep 17 00:00:00 2001 From: Cryolitia PukNgae Date: Thu, 23 Oct 2025 12:42:25 -0700 Subject: [PATCH 0273/1321] Input: atkbd - skip deactivate for HONOR FMB-P's internal keyboard After commit 9cf6e24c9fbf17e52de9fff07f12be7565ea6d61 ("Input: atkbd - do not skip atkbd_deactivate() when skipping ATKBD_CMD_GETID"), HONOR FMB-P, aka HONOR MagicBook Pro 14 2025's internal keyboard stops working. Adding the atkbd_deactivate_fixup quirk fixes it. DMI: HONOR FMB-P/FMB-P-PCB, BIOS 1.13 05/08/2025 Fixes: 9cf6e24c9fbf17e52de9fff07f12be7565ea6d61 ("Input: atkbd - do not skip atkbd_deactivate() when skipping ATKBD_CMD_GETID") Reported-by: Mikura Kyouka Reported-by: foad.elkhattabi Signed-off-by: Cryolitia PukNgae Reviewed-by: Hans de Goede Link: https://patch.msgid.link/20251022-honor-v1-1-ff894ed271a9@linux.dev Signed-off-by: Dmitry Torokhov --- drivers/input/keyboard/atkbd.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/drivers/input/keyboard/atkbd.c b/drivers/input/keyboard/atkbd.c index 6c999d89ee4bd..422e28ad1e8e2 100644 --- a/drivers/input/keyboard/atkbd.c +++ b/drivers/input/keyboard/atkbd.c @@ -1937,6 +1937,13 @@ static const struct dmi_system_id atkbd_dmi_quirk_table[] __initconst = { }, .callback = atkbd_deactivate_fixup, }, + { + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "HONOR"), + DMI_MATCH(DMI_PRODUCT_NAME, "FMB-P"), + }, + .callback = atkbd_deactivate_fixup, + }, { } }; From f02eae17051c857b2d40948505c003a79216de37 Mon Sep 17 00:00:00 2001 From: Christoffer Sandberg Date: Mon, 24 Nov 2025 21:31:34 +0100 Subject: [PATCH 0274/1321] Input: i8042 - add TUXEDO InfinityBook Max Gen10 AMD to i8042 quirk table The device occasionally wakes up from suspend with missing input on the internal keyboard and the following suspend attempt results in an instant wake-up. The quirks fix both issues for this device. Signed-off-by: Christoffer Sandberg Signed-off-by: Werner Sembach Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251124203336.64072-1-wse@tuxedocomputers.com Signed-off-by: Dmitry Torokhov --- drivers/input/serio/i8042-acpipnpio.h | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/drivers/input/serio/i8042-acpipnpio.h b/drivers/input/serio/i8042-acpipnpio.h index 1caa6c4ca435c..654771275ce87 100644 --- a/drivers/input/serio/i8042-acpipnpio.h +++ b/drivers/input/serio/i8042-acpipnpio.h @@ -1169,6 +1169,13 @@ static const struct dmi_system_id i8042_dmi_quirk_table[] __initconst = { .driver_data = (void *)(SERIO_QUIRK_NOMUX | SERIO_QUIRK_RESET_ALWAYS | SERIO_QUIRK_NOLOOP | SERIO_QUIRK_NOPNP) }, + { + .matches = { + DMI_MATCH(DMI_BOARD_NAME, "X5KK45xS_X5SP45xS"), + }, + .driver_data = (void *)(SERIO_QUIRK_NOMUX | SERIO_QUIRK_RESET_ALWAYS | + SERIO_QUIRK_NOLOOP | SERIO_QUIRK_NOPNP) + }, /* * A lot of modern Clevo barebones have touchpad and/or keyboard issues * after suspend fixable with the forcenorestore quirk. From d9cec75b1c886e754c9cec944db32cd50c3c900c Mon Sep 17 00:00:00 2001 From: Duoming Zhou Date: Wed, 17 Dec 2025 11:00:17 +0800 Subject: [PATCH 0275/1321] Input: alps - fix use-after-free bugs caused by dev3_register_work The dev3_register_work delayed work item is initialized within alps_reconnect() and scheduled upon receipt of the first bare PS/2 packet from an external PS/2 device connected to the ALPS touchpad. During device detachment, the original implementation calls flush_workqueue() in psmouse_disconnect() to ensure completion of dev3_register_work. However, the flush_workqueue() in psmouse_disconnect() only blocks and waits for work items that were already queued to the workqueue prior to its invocation. Any work items submitted after flush_workqueue() is called are not included in the set of tasks that the flush operation awaits. This means that after flush_workqueue() has finished executing, the dev3_register_work could still be scheduled. Although the psmouse state is set to PSMOUSE_CMD_MODE in psmouse_disconnect(), the scheduling of dev3_register_work remains unaffected. The race condition can occur as follows: CPU 0 (cleanup path) | CPU 1 (delayed work) psmouse_disconnect() | psmouse_set_state() | flush_workqueue() | alps_report_bare_ps2_packet() alps_disconnect() | psmouse_queue_work() kfree(priv); // FREE | alps_register_bare_ps2_mouse() | priv = container_of(work...); // USE | priv->dev3 // USE Add disable_delayed_work_sync() in alps_disconnect() to ensure that dev3_register_work is properly canceled and prevented from executing after the alps_data structure has been deallocated. This bug is identified by static analysis. Fixes: 04aae283ba6a ("Input: ALPS - do not mix trackstick and external PS/2 mouse data") Cc: stable@kernel.org Signed-off-by: Duoming Zhou Link: https://patch.msgid.link/b57b0a9ccca51a3f06be141bfc02b9ffe69d1845.1765939397.git.duoming@zju.edu.cn Signed-off-by: Dmitry Torokhov --- drivers/input/mouse/alps.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/input/mouse/alps.c b/drivers/input/mouse/alps.c index d0cb9fb948218..df8953a5196e1 100644 --- a/drivers/input/mouse/alps.c +++ b/drivers/input/mouse/alps.c @@ -2975,6 +2975,7 @@ static void alps_disconnect(struct psmouse *psmouse) psmouse_reset(psmouse); timer_shutdown_sync(&priv->timer); + disable_delayed_work_sync(&priv->dev3_register_work); if (priv->dev2) input_unregister_device(priv->dev2); if (!IS_ERR_OR_NULL(priv->dev3)) From a76469aec406dd8fd38d41bbd5ecd82c9723b671 Mon Sep 17 00:00:00 2001 From: Sasha Finkelstein Date: Thu, 18 Dec 2025 10:15:23 -0800 Subject: [PATCH 0276/1321] Input: apple_z2 - fix reading incorrect reports after exiting sleep Under certain conditions (more prevalent after a suspend/resume cycle), the touchscreen controller can send the "boot complete" interrupt before it actually finished booting. In those cases, attempting to read touch data resuls in a stream of "not ready" messages being read and interpreted as a touch report. Check that the response is in fact a touch report and discard it otherwise. Reported-by: pitust Closes: https://oftc.catirclogs.org/asahi/2025-12-17#34878715; Fixes: 471a92f8a21a ("Input: apple_z2 - add a driver for Apple Z2 touchscreens") Signed-off-by: Sasha Finkelstein Link: https://patch.msgid.link/20251218-z2-init-fix-v1-1-48e3aa239caf@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov --- drivers/input/touchscreen/apple_z2.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/input/touchscreen/apple_z2.c b/drivers/input/touchscreen/apple_z2.c index 0de161eae59a0..271ababf0ad55 100644 --- a/drivers/input/touchscreen/apple_z2.c +++ b/drivers/input/touchscreen/apple_z2.c @@ -21,6 +21,7 @@ #define APPLE_Z2_TOUCH_STARTED 3 #define APPLE_Z2_TOUCH_MOVED 4 #define APPLE_Z2_CMD_READ_INTERRUPT_DATA 0xEB +#define APPLE_Z2_REPLY_INTERRUPT_DATA 0xE1 #define APPLE_Z2_HBPP_CMD_BLOB 0x3001 #define APPLE_Z2_FW_MAGIC 0x5746325A #define LOAD_COMMAND_INIT_PAYLOAD 0 @@ -142,6 +143,9 @@ static int apple_z2_read_packet(struct apple_z2 *z2) if (error) return error; + if (z2->rx_buf[0] != APPLE_Z2_REPLY_INTERRUPT_DATA) + return 0; + pkt_len = (get_unaligned_le16(z2->rx_buf + 1) + 8) & 0xfffffffc; error = spi_read(z2->spidev, z2->rx_buf, pkt_len); From 6a73b475e26624ca34dc6b8cc7b45dc40f5c1b00 Mon Sep 17 00:00:00 2001 From: Gergo Koteles Date: Thu, 13 Nov 2025 17:02:58 +0100 Subject: [PATCH 0277/1321] Input: add ABS_SND_PROFILE MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit ABS_SND_PROFILE used to describe the state of a multi-value sound profile switch. This will be used for the alert-slider on OnePlus phones or other phones. Profile values added as SND_PROFLE_(SILENT|VIBRATE|RING) identifiers to input-event-codes.h so they can be used from DTS. Signed-off-by: Gergo Koteles Reviewed-by: Bjorn Andersson Tested-by: Guido Günther # oneplus,fajita & oneplus,enchilada Reviewed-by: Guido Günther Signed-off-by: David Heidelberg Reviewed-by: Pavel Machek Link: https://patch.msgid.link/20251113-op6-tri-state-v8-1-54073f3874bc@ixit.cz Signed-off-by: Dmitry Torokhov --- Documentation/input/event-codes.rst | 6 ++++++ drivers/hid/hid-debug.c | 1 + include/uapi/linux/input-event-codes.h | 9 +++++++++ 3 files changed, 16 insertions(+) diff --git a/Documentation/input/event-codes.rst b/Documentation/input/event-codes.rst index 4424cbff251f8..77a6c9b3956d5 100644 --- a/Documentation/input/event-codes.rst +++ b/Documentation/input/event-codes.rst @@ -241,6 +241,12 @@ A few EV_ABS codes have special meanings: emitted only when the selected profile changes, indicating the newly selected profile value. +* ABS_SND_PROFILE: + + - Used to describe the state of a multi-value sound profile switch. + An event is emitted only when the selected profile changes, + indicating the newly selected profile value. + * ABS_MT_: - Used to describe multitouch input events. Please see diff --git a/drivers/hid/hid-debug.c b/drivers/hid/hid-debug.c index 337d2dc81b4ca..c5865b0d2aaaf 100644 --- a/drivers/hid/hid-debug.c +++ b/drivers/hid/hid-debug.c @@ -3513,6 +3513,7 @@ static const char *absolutes[ABS_CNT] = { [ABS_DISTANCE] = "Distance", [ABS_TILT_X] = "XTilt", [ABS_TILT_Y] = "YTilt", [ABS_TOOL_WIDTH] = "ToolWidth", [ABS_VOLUME] = "Volume", [ABS_PROFILE] = "Profile", + [ABS_SND_PROFILE] = "SoundProfile", [ABS_MISC] = "Misc", [ABS_MT_SLOT] = "MTSlot", [ABS_MT_TOUCH_MAJOR] = "MTMajor", diff --git a/include/uapi/linux/input-event-codes.h b/include/uapi/linux/input-event-codes.h index 30f3c9eaafaad..4bdb6a1659873 100644 --- a/include/uapi/linux/input-event-codes.h +++ b/include/uapi/linux/input-event-codes.h @@ -891,6 +891,7 @@ #define ABS_VOLUME 0x20 #define ABS_PROFILE 0x21 +#define ABS_SND_PROFILE 0x22 #define ABS_MISC 0x28 @@ -1000,4 +1001,12 @@ #define SND_MAX 0x07 #define SND_CNT (SND_MAX+1) +/* + * ABS_SND_PROFILE values + */ + +#define SND_PROFILE_SILENT 0x00 +#define SND_PROFILE_VIBRATE 0x01 +#define SND_PROFILE_RING 0x02 + #endif From 955bcf1047a64030cd38237cf55576f96b3ec1f1 Mon Sep 17 00:00:00 2001 From: Sanjay Govind Date: Sat, 29 Nov 2025 20:37:11 +1300 Subject: [PATCH 0278/1321] Input: xpad - add support for CRKD Guitars Add support for various CRKD Guitar Controllers. Signed-off-by: Sanjay Govind Link: https://patch.msgid.link/20251129073720.2750-2-sanjay.govind9@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov --- drivers/input/joystick/xpad.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/input/joystick/xpad.c b/drivers/input/joystick/xpad.c index d72e89c25e503..363d509493866 100644 --- a/drivers/input/joystick/xpad.c +++ b/drivers/input/joystick/xpad.c @@ -133,6 +133,8 @@ static const struct xpad_device { } xpad_device[] = { /* Please keep this list sorted by vendor and product ID. */ { 0x0079, 0x18d4, "GPD Win 2 X-Box Controller", 0, XTYPE_XBOX360 }, + { 0x0351, 0x1000, "CRKD LP Blueberry Burst Pro Edition (Xbox)", 0, XTYPE_XBOX360 }, + { 0x0351, 0x2000, "CRKD LP Black Tribal Edition (Xbox) ", 0, XTYPE_XBOX360 }, { 0x03eb, 0xff01, "Wooting One (Legacy)", 0, XTYPE_XBOX360 }, { 0x03eb, 0xff02, "Wooting Two (Legacy)", 0, XTYPE_XBOX360 }, { 0x03f0, 0x038D, "HyperX Clutch", 0, XTYPE_XBOX360 }, /* wired */ @@ -420,6 +422,7 @@ static const struct xpad_device { { 0x3285, 0x0663, "Nacon Evol-X", 0, XTYPE_XBOXONE }, { 0x3537, 0x1004, "GameSir T4 Kaleid", 0, XTYPE_XBOX360 }, { 0x3537, 0x1010, "GameSir G7 SE", 0, XTYPE_XBOXONE }, + { 0x3651, 0x1000, "CRKD SG", 0, XTYPE_XBOX360 }, { 0x366c, 0x0005, "ByoWave Proteus Controller", MAP_SHARE_BUTTON, XTYPE_XBOXONE, FLAG_DELAY_INIT }, { 0x3767, 0x0101, "Fanatec Speedster 3 Forceshock Wheel", 0, XTYPE_XBOX }, { 0x37d7, 0x2501, "Flydigi Apex 5", 0, XTYPE_XBOX360 }, @@ -518,6 +521,7 @@ static const struct usb_device_id xpad_table[] = { */ { USB_INTERFACE_INFO('X', 'B', 0) }, /* Xbox USB-IF not-approved class */ XPAD_XBOX360_VENDOR(0x0079), /* GPD Win 2 controller */ + XPAD_XBOX360_VENDOR(0x0351), /* CRKD Controllers */ XPAD_XBOX360_VENDOR(0x03eb), /* Wooting Keyboards (Legacy) */ XPAD_XBOX360_VENDOR(0x03f0), /* HP HyperX Xbox 360 controllers */ XPAD_XBOXONE_VENDOR(0x03f0), /* HP HyperX Xbox One controllers */ @@ -578,6 +582,7 @@ static const struct usb_device_id xpad_table[] = { XPAD_XBOXONE_VENDOR(0x3285), /* Nacon Evol-X */ XPAD_XBOX360_VENDOR(0x3537), /* GameSir Controllers */ XPAD_XBOXONE_VENDOR(0x3537), /* GameSir Controllers */ + XPAD_XBOX360_VENDOR(0x3651), /* CRKD Controllers */ XPAD_XBOXONE_VENDOR(0x366c), /* ByoWave controllers */ XPAD_XBOX360_VENDOR(0x37d7), /* Flydigi Controllers */ XPAD_XBOX360_VENDOR(0x413d), /* Black Shark Green Ghost Controller */ From f5219b08361bd3aa1a2752f09c4389b128a7fa88 Mon Sep 17 00:00:00 2001 From: Junjie Cao Date: Thu, 18 Dec 2025 21:56:59 -0800 Subject: [PATCH 0279/1321] Input: ti_am335x_tsc - fix off-by-one error in wire_order validation The current validation 'wire_order[i] > ARRAY_SIZE(config_pins)' allows wire_order[i] to equal ARRAY_SIZE(config_pins), which causes out-of-bounds access when used as index in 'config_pins[wire_order[i]]'. Since config_pins has 4 elements (indices 0-3), the valid range for wire_order should be 0-3. Fix the off-by-one error by using >= instead of > in the validation check. Signed-off-by: Junjie Cao Link: https://patch.msgid.link/20251114062817.852698-1-junjie.cao@intel.com Fixes: bb76dc09ddfc ("input: ti_am33x_tsc: Order of TSC wires, made configurable") Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov --- drivers/input/touchscreen/ti_am335x_tsc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/input/touchscreen/ti_am335x_tsc.c b/drivers/input/touchscreen/ti_am335x_tsc.c index d6edfab167704..0534b2ba650bb 100644 --- a/drivers/input/touchscreen/ti_am335x_tsc.c +++ b/drivers/input/touchscreen/ti_am335x_tsc.c @@ -85,7 +85,7 @@ static int titsc_config_wires(struct titsc *ts_dev) wire_order[i] = ts_dev->config_inp[i] & 0x0F; if (WARN_ON(analog_line[i] > 7)) return -EINVAL; - if (WARN_ON(wire_order[i] > ARRAY_SIZE(config_pins))) + if (WARN_ON(wire_order[i] >= ARRAY_SIZE(config_pins))) return -EINVAL; } From 66dda149ae35a15ba6bbbd38f66803713e16d049 Mon Sep 17 00:00:00 2001 From: Songwei Chai Date: Fri, 6 Jun 2025 14:09:36 +0800 Subject: [PATCH 0280/1321] scripts: coccicheck: filter *.cocci files by MODE Enhance the coccicheck script to filter *.cocci files based on the specified MODE (e.g., report, patch). This ensures that only compatible semantic patch files are executed, preventing errors such as: "virtual rule report not supported" This error occurs when a .cocci file does not define a 'virtual ' rule, yet is executed in that mode. For example: make coccicheck M=drivers/hwtracing/coresight/ MODE=report In this case, running "secs_to_jiffies.cocci" would trigger the error because it lacks support for 'report' mode. With this change, such files are skipped automatically, improving robustness and developer experience. Signed-off-by: Songwei Chai Reviewed-by: Julia Lawall --- scripts/coccicheck | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/scripts/coccicheck b/scripts/coccicheck index 0e6bc5a10320c..89d591af5f3e7 100755 --- a/scripts/coccicheck +++ b/scripts/coccicheck @@ -270,7 +270,11 @@ fi if [ "$COCCI" = "" ] ; then for f in `find $srctree/scripts/coccinelle/ -name '*.cocci' -type f | sort`; do - coccinelle $f + if grep -q "virtual[[:space:]]\+$MODE" "$f"; then + coccinelle $f + else + echo "warning: Skipping $f as it does not match mode '$MODE'" + fi done else coccinelle $COCCI From 9fb4c053fa9bd26e66dcaf54ec43018432ac66de Mon Sep 17 00:00:00 2001 From: Thorsten Blum Date: Sat, 22 Nov 2025 12:48:04 +0100 Subject: [PATCH 0281/1321] Coccinelle: pm_runtime: Fix typo in report message s/Unecessary/Unnecessary/ Reviewed-by: Julia Lawall Signed-off-by: Thorsten Blum --- scripts/coccinelle/api/pm_runtime.cocci | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripts/coccinelle/api/pm_runtime.cocci b/scripts/coccinelle/api/pm_runtime.cocci index bf128ccae9210..b720489418fa4 100644 --- a/scripts/coccinelle/api/pm_runtime.cocci +++ b/scripts/coccinelle/api/pm_runtime.cocci @@ -109,5 +109,5 @@ p2 << r.p2; pm_runtime_api << r.pm_runtime_api; @@ -msg = "%s returns < 0 as error. Unecessary IS_ERR_VALUE at line %s" % (pm_runtime_api, p2[0].line) +msg = "%s returns < 0 as error. Unnecessary IS_ERR_VALUE at line %s" % (pm_runtime_api, p2[0].line) coccilib.report.print_report(p1[0],msg) From 679fdbc33b31c880d28556e5829e12f4dae355a0 Mon Sep 17 00:00:00 2001 From: Linus Torvalds Date: Sun, 21 Dec 2025 15:52:04 -0800 Subject: [PATCH 0282/1321] Linux 6.19-rc2 --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index e404e4767944e..3cd00b62cde99 100644 --- a/Makefile +++ b/Makefile @@ -2,7 +2,7 @@ VERSION = 6 PATCHLEVEL = 19 SUBLEVEL = 0 -EXTRAVERSION = -rc1 +EXTRAVERSION = -rc2 NAME = Baby Opossum Posse # *DOCUMENTATION* From 5d135536e1bdf2c562f4ceccfa334c81edb81958 Mon Sep 17 00:00:00 2001 From: Leon Romanovsky Date: Thu, 18 Dec 2025 14:08:08 +0200 Subject: [PATCH 0283/1321] =?UTF-8?q?parisc:=20Set=20valid=20bit=20in=20hi?= =?UTF-8?q?gh=20byte=20of=2064=E2=80=91bit=20physical=20address?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit On 32‑bit systems, phys_addr_t is defined as u32. However, parisc expects physical addresses to be 64‑bit values so it can store a validity bit in the upper byte. Resolve this mismatch by casting the physical address to unsigned long, ensuring it is treated as a 64‑bit value where required. This fixes the failure to start block device drivers on the C3700 platform, as reported by Guenter. QEMU command line to reproduce the issue (with Debian SID as rootfs): qemu-system-hppa -machine C3700 \ -kernel arch/parisc/boot/bzImage \ -append "console=ttyS0 \ root=/dev/sda rw rootwait panic=-1" \ -nographic \ -device lsi53c895a \ -drive file=rootfs-hppa.img,if=none,format=raw,id=hd0 \ -device scsi-hd,drive=hd0 Fixes: 96ddf2ef58ec ("parisc: Convert DMA map_page to map_phys interface") Reported-by: Guenter Roeck Closes: https://lore.kernel.org/all/b184f1bf-96dc-4546-8512-9cba5ecb58f7@roeck-us.net/ Signed-off-by: Leon Romanovsky Tested-by: Guenter Roeck [mszyprow: dropped the lpa() macro removal] Signed-off-by: Marek Szyprowski Link: https://lore.kernel.org/r/20251218-fix-parisc-conversion-v1-1-4a04d26b0168@nvidia.com --- drivers/parisc/sba_iommu.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/parisc/sba_iommu.c b/drivers/parisc/sba_iommu.c index a6eb6bffa5ea2..eefb2bac8443f 100644 --- a/drivers/parisc/sba_iommu.c +++ b/drivers/parisc/sba_iommu.c @@ -578,8 +578,8 @@ sba_io_pdir_entry(__le64 *pdir_ptr, space_t sid, phys_addr_t pba, pba &= IOVP_MASK; pba |= (ci >> PAGE_SHIFT) & 0xff; /* move CI (8 bits) into lowest byte */ - pba |= SBA_PDIR_VALID_BIT; /* set "valid" bit */ - *pdir_ptr = cpu_to_le64(pba); /* swap and store into I/O Pdir */ + /* set "valid" bit, swap and store into I/O Pdir */ + *pdir_ptr = cpu_to_le64((unsigned long)pba | SBA_PDIR_VALID_BIT); /* * If the PDC_MODEL capabilities has Non-coherent IO-PDIR bit set From f9f71f075f47338ba9e98c3373eeacc9e15eafba Mon Sep 17 00:00:00 2001 From: Jiapeng Chong Date: Fri, 12 Dec 2025 14:24:10 +0800 Subject: [PATCH 0284/1321] ALSA: hda: Remove unnecessary print function dev_err() The print function dev_err() is redundant because platform_get_irq() already prints an error. ./sound/hda/controllers/cix-ipbloq.c:119:2-9: line 119 is redundant because platform_get_irq() already prints an error. Reported-by: Abaci Robot Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=28045 Signed-off-by: Jiapeng Chong Link: https://patch.msgid.link/20251212062410.3706839-1-jiapeng.chong@linux.alibaba.com Signed-off-by: Takashi Iwai --- sound/hda/controllers/cix-ipbloq.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/sound/hda/controllers/cix-ipbloq.c b/sound/hda/controllers/cix-ipbloq.c index 99f9f48e91d4b..c1084a915c136 100644 --- a/sound/hda/controllers/cix-ipbloq.c +++ b/sound/hda/controllers/cix-ipbloq.c @@ -115,10 +115,8 @@ static int cix_ipbloq_hda_init(struct cix_ipbloq_hda *hda, bus->addr = res->start; irq_id = platform_get_irq(pdev, 0); - if (irq_id < 0) { - dev_err(hda->dev, "failed to get the irq, err = %d\n", irq_id); + if (irq_id < 0) return irq_id; - } err = devm_request_irq(hda->dev, irq_id, azx_interrupt, 0, KBUILD_MODNAME, chip); From 618bcedcd28a090b61d48c857a8d16ab66bd8770 Mon Sep 17 00:00:00 2001 From: Jussi Laako Date: Thu, 11 Dec 2025 17:22:21 +0200 Subject: [PATCH 0285/1321] ALSA: usb-audio: Update for native DSD support quirks Maintenance patch for native DSD support. Add set of missing device and vendor quirks; TEAC, Esoteric, Luxman and Musical Fidelity. Signed-off-by: Jussi Laako Signed-off-by: Takashi Iwai Link: https://patch.msgid.link/20251211152224.1780782-1-jussi@sonarnerd.net --- sound/usb/quirks.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/sound/usb/quirks.c b/sound/usb/quirks.c index 61bd61ffb1b23..94a8fdc9c6d3c 100644 --- a/sound/usb/quirks.c +++ b/sound/usb/quirks.c @@ -2230,6 +2230,12 @@ static const struct usb_audio_quirk_flags_table quirk_flags_table[] = { DEVICE_FLG(0x0644, 0x806b, /* TEAC UD-701 */ QUIRK_FLAG_ITF_USB_DSD_DAC | QUIRK_FLAG_CTL_MSG_DELAY | QUIRK_FLAG_IFACE_DELAY), + DEVICE_FLG(0x0644, 0x807d, /* TEAC UD-507 */ + QUIRK_FLAG_ITF_USB_DSD_DAC | QUIRK_FLAG_CTL_MSG_DELAY | + QUIRK_FLAG_IFACE_DELAY), + DEVICE_FLG(0x0644, 0x806c, /* Esoteric XD */ + QUIRK_FLAG_ITF_USB_DSD_DAC | QUIRK_FLAG_CTL_MSG_DELAY | + QUIRK_FLAG_IFACE_DELAY), DEVICE_FLG(0x06f8, 0xb000, /* Hercules DJ Console (Windows Edition) */ QUIRK_FLAG_IGNORE_CTL_ERROR), DEVICE_FLG(0x06f8, 0xd002, /* Hercules DJ Console (Macintosh Edition) */ @@ -2388,6 +2394,8 @@ static const struct usb_audio_quirk_flags_table quirk_flags_table[] = { QUIRK_FLAG_CTL_MSG_DELAY_1M), DEVICE_FLG(0x30be, 0x0101, /* Schiit Hel */ QUIRK_FLAG_IGNORE_CTL_ERROR), + DEVICE_FLG(0x3255, 0x0000, /* Luxman D-10X */ + QUIRK_FLAG_ITF_USB_DSD_DAC | QUIRK_FLAG_CTL_MSG_DELAY), DEVICE_FLG(0x339b, 0x3a07, /* Synaptics HONOR USB-C HEADSET */ QUIRK_FLAG_MIXER_PLAYBACK_MIN_MUTE), DEVICE_FLG(0x413c, 0xa506, /* Dell AE515 sound bar */ @@ -2431,6 +2439,8 @@ static const struct usb_audio_quirk_flags_table quirk_flags_table[] = { QUIRK_FLAG_DSD_RAW), VENDOR_FLG(0x2622, /* IAG Limited devices */ QUIRK_FLAG_DSD_RAW), + VENDOR_FLG(0x2772, /* Musical Fidelity devices */ + QUIRK_FLAG_DSD_RAW), VENDOR_FLG(0x278b, /* Rotel? */ QUIRK_FLAG_DSD_RAW), VENDOR_FLG(0x292b, /* Gustard/Ess based devices */ From 65fa1507f3a39373d1dd494f7e2c1e5c5c13af90 Mon Sep 17 00:00:00 2001 From: Jussi Laako Date: Thu, 11 Dec 2025 17:22:22 +0200 Subject: [PATCH 0286/1321] ALSA: usb-audio: Reorder USB mode selection quirk When using mode selection quirk, apply the quirk before rate setting. Also apply this quirk on certain newer ITF interface devices. Signed-off-by: Jussi Laako Signed-off-by: Takashi Iwai Link: https://patch.msgid.link/20251211152224.1780782-2-jussi@sonarnerd.net --- sound/usb/endpoint.c | 6 +++--- sound/usb/quirks.c | 8 ++++---- 2 files changed, 7 insertions(+), 7 deletions(-) diff --git a/sound/usb/endpoint.c b/sound/usb/endpoint.c index cc15624ecaffe..8f9313857ee9d 100644 --- a/sound/usb/endpoint.c +++ b/sound/usb/endpoint.c @@ -1481,15 +1481,15 @@ int snd_usb_endpoint_prepare(struct snd_usb_audio *chip, return err; } - err = snd_usb_init_pitch(chip, ep->cur_audiofmt); + err = snd_usb_select_mode_quirk(chip, ep->cur_audiofmt); if (err < 0) return err; - err = init_sample_rate(chip, ep); + err = snd_usb_init_pitch(chip, ep->cur_audiofmt); if (err < 0) return err; - err = snd_usb_select_mode_quirk(chip, ep->cur_audiofmt); + err = init_sample_rate(chip, ep); if (err < 0) return err; diff --git a/sound/usb/quirks.c b/sound/usb/quirks.c index 94a8fdc9c6d3c..f38330b095e93 100644 --- a/sound/usb/quirks.c +++ b/sound/usb/quirks.c @@ -2221,7 +2221,7 @@ static const struct usb_audio_quirk_flags_table quirk_flags_table[] = { QUIRK_FLAG_IFACE_DELAY), DEVICE_FLG(0x0644, 0x8044, /* Esoteric D-05X */ QUIRK_FLAG_ITF_USB_DSD_DAC | QUIRK_FLAG_CTL_MSG_DELAY | - QUIRK_FLAG_IFACE_DELAY), + QUIRK_FLAG_IFACE_DELAY | QUIRK_FLAG_FORCE_IFACE_RESET), DEVICE_FLG(0x0644, 0x804a, /* TEAC UD-301 */ QUIRK_FLAG_ITF_USB_DSD_DAC | QUIRK_FLAG_CTL_MSG_DELAY | QUIRK_FLAG_IFACE_DELAY), @@ -2229,13 +2229,13 @@ static const struct usb_audio_quirk_flags_table quirk_flags_table[] = { QUIRK_FLAG_FORCE_IFACE_RESET), DEVICE_FLG(0x0644, 0x806b, /* TEAC UD-701 */ QUIRK_FLAG_ITF_USB_DSD_DAC | QUIRK_FLAG_CTL_MSG_DELAY | - QUIRK_FLAG_IFACE_DELAY), + QUIRK_FLAG_IFACE_DELAY | QUIRK_FLAG_FORCE_IFACE_RESET), DEVICE_FLG(0x0644, 0x807d, /* TEAC UD-507 */ QUIRK_FLAG_ITF_USB_DSD_DAC | QUIRK_FLAG_CTL_MSG_DELAY | - QUIRK_FLAG_IFACE_DELAY), + QUIRK_FLAG_IFACE_DELAY | QUIRK_FLAG_FORCE_IFACE_RESET), DEVICE_FLG(0x0644, 0x806c, /* Esoteric XD */ QUIRK_FLAG_ITF_USB_DSD_DAC | QUIRK_FLAG_CTL_MSG_DELAY | - QUIRK_FLAG_IFACE_DELAY), + QUIRK_FLAG_IFACE_DELAY | QUIRK_FLAG_FORCE_IFACE_RESET), DEVICE_FLG(0x06f8, 0xb000, /* Hercules DJ Console (Windows Edition) */ QUIRK_FLAG_IGNORE_CTL_ERROR), DEVICE_FLG(0x06f8, 0xd002, /* Hercules DJ Console (Macintosh Edition) */ From 59d541e5063c95446988831a4f85c9a076a6615d Mon Sep 17 00:00:00 2001 From: Jussi Laako Date: Thu, 11 Dec 2025 17:22:23 +0200 Subject: [PATCH 0287/1321] ALSA: usb-audio: Do not expose PCM and DSD on same altsetting unless DoP Do not expose DSD altsetting as a PCM one, even if the descriptor claims it to be PCM instead of special format. Signed-off-by: Jussi Laako Signed-off-by: Takashi Iwai Link: https://patch.msgid.link/20251211152224.1780782-3-jussi@sonarnerd.net --- sound/usb/format.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/sound/usb/format.c b/sound/usb/format.c index ec95a063beb10..64cfe4a9d8cdf 100644 --- a/sound/usb/format.c +++ b/sound/usb/format.c @@ -34,6 +34,7 @@ static u64 parse_audio_format_i_type(struct snd_usb_audio *chip, { int sample_width, sample_bytes; u64 pcm_formats = 0; + u64 dsd_formats = 0; switch (fp->protocol) { case UAC_VERSION_1: @@ -154,7 +155,9 @@ static u64 parse_audio_format_i_type(struct snd_usb_audio *chip, fp->iface, fp->altsetting, format); } - pcm_formats |= snd_usb_interface_dsd_format_quirks(chip, fp, sample_bytes); + dsd_formats |= snd_usb_interface_dsd_format_quirks(chip, fp, sample_bytes); + if (dsd_formats && !fp->dsd_dop) + pcm_formats = dsd_formats; return pcm_formats; } From 81fc62b068f66fbbf50b2991f00e7dcb237df87c Mon Sep 17 00:00:00 2001 From: Kai Vehmanen Date: Fri, 12 Dec 2025 19:46:58 +0200 Subject: [PATCH 0288/1321] ALSA: hda/realtek: enable woofer speakers on Medion NM14LNL The ALC233 codec on these Medion NM14LNL (SPRCHRGD 14 S2) systems requires a quirk to enable all speakers. Tested-by: davplsm Link: https://github.com/thesofproject/linux/issues/5611 Signed-off-by: Kai Vehmanen Link: https://patch.msgid.link/20251212174658.752641-1-kai.vehmanen@linux.intel.com Signed-off-by: Takashi Iwai --- sound/hda/codecs/realtek/alc269.c | 1 + 1 file changed, 1 insertion(+) diff --git a/sound/hda/codecs/realtek/alc269.c b/sound/hda/codecs/realtek/alc269.c index 171a71457ec3b..c8a9b9b15cb49 100644 --- a/sound/hda/codecs/realtek/alc269.c +++ b/sound/hda/codecs/realtek/alc269.c @@ -7296,6 +7296,7 @@ static const struct hda_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x1d72, 0x1901, "RedmiBook 14", ALC256_FIXUP_ASUS_HEADSET_MIC), SND_PCI_QUIRK(0x1d72, 0x1945, "Redmi G", ALC256_FIXUP_ASUS_HEADSET_MIC), SND_PCI_QUIRK(0x1d72, 0x1947, "RedmiBook Air", ALC255_FIXUP_XIAOMI_HEADSET_MIC), + SND_PCI_QUIRK(0x1e39, 0xca14, "MEDION NM14LNL", ALC233_FIXUP_MEDION_MTL_SPK), SND_PCI_QUIRK(0x1ee7, 0x2078, "HONOR BRB-X M1010", ALC2XX_FIXUP_HEADSET_MIC), SND_PCI_QUIRK(0x1f66, 0x0105, "Ayaneo Portable Game Player", ALC287_FIXUP_CS35L41_I2C_2), SND_PCI_QUIRK(0x2014, 0x800a, "Positivo ARN50", ALC269_FIXUP_LIMIT_INT_MIC_BOOST), From b539c36711deb5ac75e858b613b55d2219c5b1e2 Mon Sep 17 00:00:00 2001 From: Haotian Zhang Date: Mon, 15 Dec 2025 12:26:52 +0800 Subject: [PATCH 0289/1321] ALSA: vxpocket: Fix resource leak in vxpocket_probe error path When vxpocket_config() fails, vxpocket_probe() returns the error code directly without freeing the sound card resources allocated by snd_card_new(), which leads to a memory leak. Add proper error handling to free the sound card and clear the allocation bit when vxpocket_config() fails. Fixes: 15b99ac17295 ("[PATCH] pcmcia: add return value to _config() functions") Signed-off-by: Haotian Zhang Link: https://patch.msgid.link/20251215042652.695-1-vulab@iscas.ac.cn Signed-off-by: Takashi Iwai --- sound/pcmcia/vx/vxpocket.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/sound/pcmcia/vx/vxpocket.c b/sound/pcmcia/vx/vxpocket.c index 2e09f2a513a6a..9a5c9aa8eec4a 100644 --- a/sound/pcmcia/vx/vxpocket.c +++ b/sound/pcmcia/vx/vxpocket.c @@ -284,7 +284,13 @@ static int vxpocket_probe(struct pcmcia_device *p_dev) vxp->p_dev = p_dev; - return vxpocket_config(p_dev); + err = vxpocket_config(p_dev); + if (err < 0) { + card_alloc &= ~(1 << i); + snd_card_free(card); + return err; + } + return 0; } static void vxpocket_detach(struct pcmcia_device *link) From 6ebc72f490c992ad8678b36c156e33dc48c5d6e9 Mon Sep 17 00:00:00 2001 From: Haotian Zhang Date: Mon, 15 Dec 2025 17:04:33 +0800 Subject: [PATCH 0290/1321] ALSA: pcmcia: Fix resource leak in snd_pdacf_probe error path When pdacf_config() fails, snd_pdacf_probe() returns the error code directly without freeing the sound card resources allocated by snd_card_new(), which leads to a memory leak. Add proper error handling to free the sound card and clear the card list entry when pdacf_config() fails. Fixes: 15b99ac17295 ("[PATCH] pcmcia: add return value to _config() functions") Suggested-by: Takashi Iwai Signed-off-by: Haotian Zhang Link: https://patch.msgid.link/20251215090433.211-1-vulab@iscas.ac.cn Signed-off-by: Takashi Iwai --- sound/pcmcia/pdaudiocf/pdaudiocf.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/sound/pcmcia/pdaudiocf/pdaudiocf.c b/sound/pcmcia/pdaudiocf/pdaudiocf.c index 13419837dfb7c..a3291e626440e 100644 --- a/sound/pcmcia/pdaudiocf/pdaudiocf.c +++ b/sound/pcmcia/pdaudiocf/pdaudiocf.c @@ -131,7 +131,13 @@ static int snd_pdacf_probe(struct pcmcia_device *link) link->config_index = 1; link->config_regs = PRESENT_OPTION; - return pdacf_config(link); + err = pdacf_config(link); + if (err < 0) { + card_list[i] = NULL; + snd_card_free(card); + return err; + } + return 0; } From faba6283cc759e87baecb711b20b7d6c7aa01219 Mon Sep 17 00:00:00 2001 From: Shipei Qu Date: Wed, 17 Dec 2025 10:46:30 +0800 Subject: [PATCH 0291/1321] ALSA: usb-mixer: us16x08: validate meter packet indices get_meter_levels_from_urb() parses the 64-byte meter packets sent by the device and fills the per-channel arrays meter_level[], comp_level[] and master_level[] in struct snd_us16x08_meter_store. Currently the function derives the channel index directly from the meter packet (MUB2(meter_urb, s) - 1) and uses it to index those arrays without validating the range. If the packet contains a negative or out-of-range channel number, the driver may write past the end of these arrays. Introduce a local channel variable and validate it before updating the arrays. We reject negative indices, limit meter_level[] and comp_level[] to SND_US16X08_MAX_CHANNELS, and guard master_level[] updates with ARRAY_SIZE(master_level). Fixes: d2bb390a2081 ("ALSA: usb-audio: Tascam US-16x08 DSP mixer quirk") Reported-by: DARKNAVY (@DarkNavyOrg) Closes: https://lore.kernel.org/tencent_21C112743C44C1A2517FF219@qq.com Signed-off-by: Shipei Qu Link: https://patch.msgid.link/20251217024630.59576-1-qu@darknavy.com Signed-off-by: Takashi Iwai --- sound/usb/mixer_us16x08.c | 20 ++++++++++++++------ 1 file changed, 14 insertions(+), 6 deletions(-) diff --git a/sound/usb/mixer_us16x08.c b/sound/usb/mixer_us16x08.c index 1c5712c31f5e2..f9df40730effd 100644 --- a/sound/usb/mixer_us16x08.c +++ b/sound/usb/mixer_us16x08.c @@ -655,17 +655,25 @@ static void get_meter_levels_from_urb(int s, u8 *meter_urb) { int val = MUC2(meter_urb, s) + (MUC3(meter_urb, s) << 8); + int ch = MUB2(meter_urb, s) - 1; + + if (ch < 0) + return; if (MUA0(meter_urb, s) == 0x61 && MUA1(meter_urb, s) == 0x02 && MUA2(meter_urb, s) == 0x04 && MUB0(meter_urb, s) == 0x62) { - if (MUC0(meter_urb, s) == 0x72) - store->meter_level[MUB2(meter_urb, s) - 1] = val; - if (MUC0(meter_urb, s) == 0xb2) - store->comp_level[MUB2(meter_urb, s) - 1] = val; + if (ch < SND_US16X08_MAX_CHANNELS) { + if (MUC0(meter_urb, s) == 0x72) + store->meter_level[ch] = val; + if (MUC0(meter_urb, s) == 0xb2) + store->comp_level[ch] = val; + } } if (MUA0(meter_urb, s) == 0x61 && MUA1(meter_urb, s) == 0x02 && - MUA2(meter_urb, s) == 0x02 && MUB0(meter_urb, s) == 0x62) - store->master_level[MUB2(meter_urb, s) - 1] = val; + MUA2(meter_urb, s) == 0x02 && MUB0(meter_urb, s) == 0x62) { + if (ch < ARRAY_SIZE(store->master_level)) + store->master_level[ch] = val; + } } /* Function to retrieve current meter values from the device. From ec4501becd952044814dd6b651bb5916b6c92acc Mon Sep 17 00:00:00 2001 From: Stefan Binding Date: Tue, 16 Dec 2025 16:48:12 +0000 Subject: [PATCH 0292/1321] ALSA: hda/realtek: Add support for HP Trekker Laptop Laptops use 2 CS35L41 Amps with HDA, using Internal boost, with I2C Signed-off-by: Stefan Binding Link: https://patch.msgid.link/20251216164830.832148-2-sbinding@opensource.cirrus.com Signed-off-by: Takashi Iwai --- sound/hda/codecs/realtek/alc269.c | 1 + 1 file changed, 1 insertion(+) diff --git a/sound/hda/codecs/realtek/alc269.c b/sound/hda/codecs/realtek/alc269.c index c8a9b9b15cb49..ec57c075757cd 100644 --- a/sound/hda/codecs/realtek/alc269.c +++ b/sound/hda/codecs/realtek/alc269.c @@ -6795,6 +6795,7 @@ static const struct hda_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x103c, 0x8f40, "HP ZBook 8 G2a 14", ALC245_FIXUP_HP_TAS2781_I2C_MUTE_LED), SND_PCI_QUIRK(0x103c, 0x8f41, "HP ZBook 8 G2a 16", ALC245_FIXUP_HP_TAS2781_I2C_MUTE_LED), SND_PCI_QUIRK(0x103c, 0x8f42, "HP ZBook 8 G2a 14W", ALC245_FIXUP_HP_TAS2781_I2C_MUTE_LED), + SND_PCI_QUIRK(0x103c, 0x8f57, "HP Trekker G7JC", ALC287_FIXUP_CS35L41_I2C_2), SND_PCI_QUIRK(0x103c, 0x8f62, "HP ZBook 8 G2a 16W", ALC245_FIXUP_HP_TAS2781_I2C_MUTE_LED), SND_PCI_QUIRK(0x1043, 0x1032, "ASUS VivoBook X513EA", ALC256_FIXUP_ASUS_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1043, 0x1034, "ASUS GU605C", ALC285_FIXUP_ASUS_GU605_SPI_SPEAKER2_TO_DAC1), From e66b5784fe77ecc14e5a951ca2bbc803d9865a38 Mon Sep 17 00:00:00 2001 From: Stefan Binding Date: Tue, 16 Dec 2025 16:48:13 +0000 Subject: [PATCH 0293/1321] ALSA: hda/realtek: Add support for HP Clipper Laptop Laptops use 2 CS35L41 Amps with HDA, using Internal boost, with I2C Signed-off-by: Stefan Binding Link: https://patch.msgid.link/20251216164830.832148-3-sbinding@opensource.cirrus.com Signed-off-by: Takashi Iwai --- sound/hda/codecs/realtek/alc269.c | 1 + 1 file changed, 1 insertion(+) diff --git a/sound/hda/codecs/realtek/alc269.c b/sound/hda/codecs/realtek/alc269.c index ec57c075757cd..e8f3cdcff0f3a 100644 --- a/sound/hda/codecs/realtek/alc269.c +++ b/sound/hda/codecs/realtek/alc269.c @@ -6771,6 +6771,7 @@ static const struct hda_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x103c, 0x8e61, "HP Trekker ", ALC287_FIXUP_CS35L41_I2C_2), SND_PCI_QUIRK(0x103c, 0x8e62, "HP Trekker ", ALC287_FIXUP_CS35L41_I2C_2), SND_PCI_QUIRK(0x103c, 0x8e8a, "HP NexusX", ALC245_FIXUP_HP_TAS2781_I2C_MUTE_LED), + SND_PCI_QUIRK(0x103c, 0x8e9c, "HP 16 Clipper OmniBook X X360", ALC287_FIXUP_CS35L41_I2C_2), SND_PCI_QUIRK(0x103c, 0x8e9d, "HP 17 Turbine OmniBook X UMA", ALC287_FIXUP_CS35L41_I2C_2), SND_PCI_QUIRK(0x103c, 0x8e9e, "HP 17 Turbine OmniBook X UMA", ALC287_FIXUP_CS35L41_I2C_2), SND_PCI_QUIRK(0x103c, 0x8eb6, "HP Abe A6U", ALC236_FIXUP_HP_MUTE_LED_MICMUTE_GPIO), From 7200abc7ab6ef810c5bd4e5c3ba2d6a32df9eb26 Mon Sep 17 00:00:00 2001 From: Dirk Su Date: Wed, 17 Dec 2025 10:52:44 +0800 Subject: [PATCH 0294/1321] ALSA: hda/realtek: fix micmute LED reversed on HP Abe and Bantie Quirk ALC236_FIXUP_HP_MUTE_LED_MICMUTE_GPIO make mute/micmute LEDs on HP Abe and Bantie functional. But the micmute LED's function is reversed, LED will be on when Mic enabled and off when Mic disabled. Create a new function to fix the micmute LED reversed issue. Fixes: b72a6ddf6af2 ("ALSA: hda/realtek: fix mute/micmute LEDs don't work for HP 200 G2i") Signed-off-by: Dirk Su Link: https://patch.msgid.link/20251217025257.44600-1-dirk.su@canonical.com Signed-off-by: Takashi Iwai --- sound/hda/codecs/realtek/alc269.c | 16 +++++++++++++--- 1 file changed, 13 insertions(+), 3 deletions(-) diff --git a/sound/hda/codecs/realtek/alc269.c b/sound/hda/codecs/realtek/alc269.c index e8f3cdcff0f3a..2bc99a8755c98 100644 --- a/sound/hda/codecs/realtek/alc269.c +++ b/sound/hda/codecs/realtek/alc269.c @@ -1656,6 +1656,18 @@ static void alc236_fixup_hp_mute_led_micmute_vref(struct hda_codec *codec, alc236_fixup_hp_micmute_led_vref(codec, fix, action); } +static void alc236_fixup_hp_mute_led_micmute_gpio(struct hda_codec *codec, + const struct hda_fixup *fix, int action) +{ + struct alc_spec *spec = codec->spec; + + if (action == HDA_FIXUP_ACT_PRE_PROBE) + spec->micmute_led_polarity = 1; + + alc236_fixup_hp_mute_led_coefbit2(codec, fix, action); + alc_fixup_hp_gpio_led(codec, action, 0x00, 0x01); +} + static inline void alc298_samsung_write_coef_pack(struct hda_codec *codec, const unsigned short coefs[2]) { @@ -5326,9 +5338,7 @@ static const struct hda_fixup alc269_fixups[] = { }, [ALC236_FIXUP_HP_MUTE_LED_MICMUTE_GPIO] = { .type = HDA_FIXUP_FUNC, - .v.func = alc236_fixup_hp_mute_led_coefbit2, - .chained = true, - .chain_id = ALC236_FIXUP_HP_GPIO_LED, + .v.func = alc236_fixup_hp_mute_led_micmute_gpio, }, [ALC236_FIXUP_LENOVO_INV_DMIC] = { .type = HDA_FIXUP_FUNC, From 9353848451fda8998df9ba6e000ad68704e278e3 Mon Sep 17 00:00:00 2001 From: Antheas Kapenekakis Date: Tue, 16 Dec 2025 22:17:14 +0100 Subject: [PATCH 0295/1321] ALSA: hda/realtek: Add Asus quirk for TAS amplifiers By default, these devices use the quirk ALC294_FIXUP_ASUS_SPK. Not using it causes the headphone jack to stop working. Therefore, introduce a new quirk ALC287_FIXUP_TXNW2781_I2C_ASUS that binds to the TAS amplifier while using that quirk. Cc: stable@kernel.org Fixes: 18a4895370a7 ("ALSA: hda/realtek: Add match for ASUS Xbox Ally projects") Signed-off-by: Antheas Kapenekakis Link: https://patch.msgid.link/20251216211714.1116898-1-lkml@antheas.dev Signed-off-by: Takashi Iwai --- sound/hda/codecs/realtek/alc269.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/sound/hda/codecs/realtek/alc269.c b/sound/hda/codecs/realtek/alc269.c index 2bc99a8755c98..355f118275318 100644 --- a/sound/hda/codecs/realtek/alc269.c +++ b/sound/hda/codecs/realtek/alc269.c @@ -3765,6 +3765,7 @@ enum { ALC295_FIXUP_DELL_TAS2781_I2C, ALC245_FIXUP_TAS2781_SPI_2, ALC287_FIXUP_TXNW2781_I2C, + ALC287_FIXUP_TXNW2781_I2C_ASUS, ALC287_FIXUP_YOGA7_14ARB7_I2C, ALC245_FIXUP_HP_MUTE_LED_COEFBIT, ALC245_FIXUP_HP_MUTE_LED_V1_COEFBIT, @@ -6063,6 +6064,12 @@ static const struct hda_fixup alc269_fixups[] = { .chained = true, .chain_id = ALC285_FIXUP_THINKPAD_HEADSET_JACK, }, + [ALC287_FIXUP_TXNW2781_I2C_ASUS] = { + .type = HDA_FIXUP_FUNC, + .v.func = tas2781_fixup_txnw_i2c, + .chained = true, + .chain_id = ALC294_FIXUP_ASUS_SPK, + }, [ALC287_FIXUP_YOGA7_14ARB7_I2C] = { .type = HDA_FIXUP_FUNC, .v.func = yoga7_14arb7_fixup_i2c, @@ -6839,8 +6846,8 @@ static const struct hda_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x1043, 0x12f0, "ASUS X541UV", ALC256_FIXUP_ASUS_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1043, 0x1313, "Asus K42JZ", ALC269VB_FIXUP_ASUS_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1043, 0x1314, "ASUS GA605K", ALC285_FIXUP_ASUS_GA605K_HEADSET_MIC), - SND_PCI_QUIRK(0x1043, 0x1384, "ASUS RC73XA", ALC287_FIXUP_TXNW2781_I2C), - SND_PCI_QUIRK(0x1043, 0x1394, "ASUS RC73YA", ALC287_FIXUP_TXNW2781_I2C), + SND_PCI_QUIRK(0x1043, 0x1384, "ASUS RC73XA", ALC287_FIXUP_TXNW2781_I2C_ASUS), + SND_PCI_QUIRK(0x1043, 0x1394, "ASUS RC73YA", ALC287_FIXUP_TXNW2781_I2C_ASUS), SND_PCI_QUIRK(0x1043, 0x13b0, "ASUS Z550SA", ALC256_FIXUP_ASUS_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1043, 0x1427, "Asus Zenbook UX31E", ALC269VB_FIXUP_ASUS_ZENBOOK), SND_PCI_QUIRK(0x1043, 0x1433, "ASUS GX650PY/PZ/PV/PU/PYV/PZV/PIV/PVV", ALC285_FIXUP_ASUS_I2C_HEADSET_MIC), From a1260c73776520cd8d31538e3e3f99c6bee390eb Mon Sep 17 00:00:00 2001 From: sheetal Date: Mon, 8 Dec 2025 10:50:40 +0530 Subject: [PATCH 0296/1321] ASoC: tegra: Fix uninitialized flat cache warning in tegra210_ahub The tegra210_ahub driver started triggering a warning after commit e062bdfdd6ad ("regmap: warn users about uninitialized flat cache"), which flags drivers using REGCACHE_FLAT without register defaults. Since the driver omits default definitions because its registers are zero initialized, the following warning is shown: WARNING KERN tegra210-ahub 2900800.ahub: using zero-initialized flat cache, this may cause unexpected behavior Switch to REGCACHE_FLAT_S which is the recommended cache type for sparse register maps without defaults. This cache type initializes entries on-demand from hardware, eliminating the warning while using memory efficiently. Signed-off-by: sheetal Link: https://patch.msgid.link/20251208052040.4025612-1-sheetal@nvidia.com Signed-off-by: Mark Brown --- sound/soc/tegra/tegra210_ahub.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/sound/soc/tegra/tegra210_ahub.c b/sound/soc/tegra/tegra210_ahub.c index e795907a3963a..261d9067d27b6 100644 --- a/sound/soc/tegra/tegra210_ahub.c +++ b/sound/soc/tegra/tegra210_ahub.c @@ -2077,7 +2077,7 @@ static const struct regmap_config tegra210_ahub_regmap_config = { .val_bits = 32, .reg_stride = 4, .max_register = TEGRA210_MAX_REGISTER_ADDR, - .cache_type = REGCACHE_FLAT, + .cache_type = REGCACHE_FLAT_S, }; static const struct regmap_config tegra186_ahub_regmap_config = { @@ -2085,7 +2085,7 @@ static const struct regmap_config tegra186_ahub_regmap_config = { .val_bits = 32, .reg_stride = 4, .max_register = TEGRA186_MAX_REGISTER_ADDR, - .cache_type = REGCACHE_FLAT, + .cache_type = REGCACHE_FLAT_S, }; static const struct regmap_config tegra264_ahub_regmap_config = { @@ -2094,7 +2094,7 @@ static const struct regmap_config tegra264_ahub_regmap_config = { .reg_stride = 4, .writeable_reg = tegra264_ahub_wr_reg, .max_register = TEGRA264_MAX_REGISTER_ADDR, - .cache_type = REGCACHE_FLAT, + .cache_type = REGCACHE_FLAT_S, }; static const struct tegra_ahub_soc_data soc_data_tegra210 = { From d999cb7406fa4483c1fc3ff8451ea67ecf49e56d Mon Sep 17 00:00:00 2001 From: Andrew Elantsev Date: Wed, 10 Dec 2025 23:38:00 +0300 Subject: [PATCH 0297/1321] ASoC: amd: yc: Add quirk for Honor MagicBook X16 2025 Add a DMI quirk for the Honor MagicBook X16 2025 laptop fixing the issue where the internal microphone was not detected. Signed-off-by: Andrew Elantsev Link: https://patch.msgid.link/20251210203800.142822-1-elantsew.andrew@gmail.com Signed-off-by: Mark Brown --- sound/soc/amd/yc/acp6x-mach.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/sound/soc/amd/yc/acp6x-mach.c b/sound/soc/amd/yc/acp6x-mach.c index f210a253da9f5..bf4d9d3365617 100644 --- a/sound/soc/amd/yc/acp6x-mach.c +++ b/sound/soc/amd/yc/acp6x-mach.c @@ -661,6 +661,13 @@ static const struct dmi_system_id yc_acp_quirk_table[] = { DMI_MATCH(DMI_PRODUCT_NAME, "Bravo 15 C7UCX"), } }, + { + .driver_data = &acp6x_card, + .matches = { + DMI_MATCH(DMI_BOARD_VENDOR, "HONOR"), + DMI_MATCH(DMI_PRODUCT_NAME, "GOH-X"), + } + }, {} }; From c173c7a0e7cfb4affef00f49bccf8c31dce27a55 Mon Sep 17 00:00:00 2001 From: Robert Oscilowski Date: Sat, 15 Nov 2025 19:43:58 +0100 Subject: [PATCH 0298/1321] ASoC: qcom: sdm845: set quaternary MI2S codec DAI to I2S format We configure the codec DAI format for primary and secondary but not the quaternery MI2S path. Add the missing configuration to enable speaker codecs on the quaternary MI2S like the MAX9827 found on the OnePlus 6. Signed-off-by: Robert Oscilowski Signed-off-by: Casey Connolly Signed-off-by: David Heidelberg Reviewed-by: Alexey Klimov Reviewed-by: Dmitry Baryshkov Link: https://patch.msgid.link/20251115-sdm845-quaternary-v3-1-c16bf19128ac@ixit.cz Signed-off-by: Mark Brown --- sound/soc/qcom/sdm845.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/sound/soc/qcom/sdm845.c b/sound/soc/qcom/sdm845.c index e18a8e44f2db5..0ce9dff4dc525 100644 --- a/sound/soc/qcom/sdm845.c +++ b/sound/soc/qcom/sdm845.c @@ -365,10 +365,12 @@ static int sdm845_snd_startup(struct snd_pcm_substream *substream) snd_soc_dai_set_fmt(codec_dai, codec_dai_fmt); break; case QUATERNARY_MI2S_RX: + codec_dai_fmt |= SND_SOC_DAIFMT_NB_NF | SND_SOC_DAIFMT_I2S; snd_soc_dai_set_sysclk(cpu_dai, Q6AFE_LPASS_CLK_ID_QUAD_MI2S_IBIT, MI2S_BCLK_RATE, SNDRV_PCM_STREAM_PLAYBACK); snd_soc_dai_set_fmt(cpu_dai, fmt); + snd_soc_dai_set_fmt(codec_dai, codec_dai_fmt); break; case QUATERNARY_TDM_RX_0: From 189d1b684d9e1c902661ef8f91fdb17eb0f39bf3 Mon Sep 17 00:00:00 2001 From: Chancel Liu Date: Wed, 10 Dec 2025 15:21:09 +0900 Subject: [PATCH 0299/1321] ASoC: fsl_sai: Constrain sample rates from audio PLLs only in master mode If SAI works in master mode it will generate clocks for external codec from audio PLLs. Thus sample rates should be constrained according to audio PLL clocks. While SAI works in slave mode which means clocks are generated externally then constraints are independent of audio PLLs. Fixes: 4edc98598be4 ("ASoC: fsl_sai: Add sample rate constraint") Signed-off-by: Chancel Liu Link: https://patch.msgid.link/20251210062109.2577735-1-chancel.liu@nxp.com Signed-off-by: Mark Brown --- sound/soc/fsl/fsl_sai.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/sound/soc/fsl/fsl_sai.c b/sound/soc/fsl/fsl_sai.c index 72bfc91e21b9b..86730c2149146 100644 --- a/sound/soc/fsl/fsl_sai.c +++ b/sound/soc/fsl/fsl_sai.c @@ -917,8 +917,14 @@ static int fsl_sai_startup(struct snd_pcm_substream *substream, tx ? sai->dma_params_tx.maxburst : sai->dma_params_rx.maxburst); - ret = snd_pcm_hw_constraint_list(substream->runtime, 0, - SNDRV_PCM_HW_PARAM_RATE, &sai->constraint_rates); + if (sai->is_consumer_mode[tx]) + ret = snd_pcm_hw_constraint_list(substream->runtime, 0, + SNDRV_PCM_HW_PARAM_RATE, + &fsl_sai_rate_constraints); + else + ret = snd_pcm_hw_constraint_list(substream->runtime, 0, + SNDRV_PCM_HW_PARAM_RATE, + &sai->constraint_rates); return ret; } From cea72fbd7bd1d13c60946b93ad06b77607295485 Mon Sep 17 00:00:00 2001 From: Bard Liao Date: Fri, 12 Dec 2025 20:11:12 +0800 Subject: [PATCH 0300/1321] ASoC: sdw_utils: subtract the endpoint that is not present MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit When asoc_sdw_count_sdw_endpoints() count the num_ends, it doesn't skip the unpresented endpoints. But, asoc_sdw_parse_sdw_endpoints() will skip the unpresented endpoints either by quirk or the SDCA function doesn't show up the endpoint. The endpoint number mismatches between count and parse and the machine driver will show up a warning about it. Fixes: 26ee34d2f5c7 ("ASoC: sdw_utils: Add codec_conf for every DAI") Closes: https://github.com/thesofproject/linux/issues/5620 Signed-off-by: Bard Liao Reviewed-by: Péter Ujfalusi Reviewed-by: Vijendar Mukunda Reviewed-by: Charles Keepax Link: https://patch.msgid.link/20251212121112.3313017-1-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown --- sound/soc/sdw_utils/soc_sdw_utils.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/sound/soc/sdw_utils/soc_sdw_utils.c b/sound/soc/sdw_utils/soc_sdw_utils.c index 6c656b2f7f3ae..f169d95895ea2 100644 --- a/sound/soc/sdw_utils/soc_sdw_utils.c +++ b/sound/soc/sdw_utils/soc_sdw_utils.c @@ -1534,8 +1534,10 @@ int asoc_sdw_parse_sdw_endpoints(struct snd_soc_card *card, * endpoint check is not necessary */ if (dai_info->quirk && - !(dai_info->quirk_exclude ^ !!(dai_info->quirk & ctx->mc_quirk))) + !(dai_info->quirk_exclude ^ !!(dai_info->quirk & ctx->mc_quirk))) { + (*num_devs)--; continue; + } } else { /* Check SDCA codec endpoint if there is no matching quirk */ ret = is_sdca_endpoint_present(dev, codec_info, adr_link, i, j); @@ -1543,8 +1545,10 @@ int asoc_sdw_parse_sdw_endpoints(struct snd_soc_card *card, return ret; /* The endpoint is not present, skip */ - if (!ret) + if (!ret) { + (*num_devs)--; continue; + } } dev_dbg(dev, From e0f8b41d5c0e58e28cf0f6dd2e637a05e25bc644 Mon Sep 17 00:00:00 2001 From: Peter Ujfalusi Date: Mon, 15 Dec 2025 15:07:41 +0200 Subject: [PATCH 0301/1321] ASoC: SOF: topology: Add context when sink or source widget is missing Add some context to the error prints when sink or source widget is not found by printing the name of the other side of the connection. Signed-off-by: Peter Ujfalusi Reviewed-by: Bard Liao Link: https://patch.msgid.link/20251215130741.31106-1-peter.ujfalusi@linux.intel.com Signed-off-by: Mark Brown --- sound/soc/sof/topology.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/sound/soc/sof/topology.c b/sound/soc/sof/topology.c index c1083ea4624ad..6b09b8cdf1cb6 100644 --- a/sound/soc/sof/topology.c +++ b/sound/soc/sof/topology.c @@ -2106,8 +2106,8 @@ static int sof_route_load(struct snd_soc_component *scomp, int index, /* source component */ source_swidget = snd_sof_find_swidget(scomp, (char *)route->source); if (!source_swidget) { - dev_err(scomp->dev, "error: source %s not found\n", - route->source); + dev_err(scomp->dev, "source %s for sink %s is not found\n", + route->source, route->sink); ret = -EINVAL; goto err; } @@ -2125,8 +2125,8 @@ static int sof_route_load(struct snd_soc_component *scomp, int index, /* sink component */ sink_swidget = snd_sof_find_swidget(scomp, (char *)route->sink); if (!sink_swidget) { - dev_err(scomp->dev, "error: sink %s not found\n", - route->sink); + dev_err(scomp->dev, "sink %s for source %s is not found\n", + route->sink, route->source); ret = -EINVAL; goto err; } From 99529e9b146069d097650b3c782c8c69494ebc83 Mon Sep 17 00:00:00 2001 From: Bard Liao Date: Mon, 15 Dec 2025 15:07:23 +0200 Subject: [PATCH 0302/1321] ASoC: SOF: ipc4-topology: set playback channel mask MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Currently, we send all channels to all amps and copy the channel_mask to all ALH DMAs in playback. However, the amp may not have the capability to run any process and SOF may need to split the channels and send specific data channel to each amp. In that case, we need to split the channel_mask in ALH DMA. Copy the channel mask only if the widget channel count is the same the FE channels for playback, otherwise, split the channels among the aggregated DAIs. Like what we did in capture. Signed-off-by: Bard Liao Reviewed-by: Ranjani Sridharan Reviewed-by: Péter Ujfalusi Reviewed-by: Liam Girdwood Signed-off-by: Peter Ujfalusi Link: https://patch.msgid.link/20251215130723.31081-1-peter.ujfalusi@linux.intel.com Signed-off-by: Mark Brown --- sound/soc/sof/ipc4-topology.c | 25 +++++++++++++++++-------- 1 file changed, 17 insertions(+), 8 deletions(-) diff --git a/sound/soc/sof/ipc4-topology.c b/sound/soc/sof/ipc4-topology.c index 221e9d4052b8f..588defd3eec99 100644 --- a/sound/soc/sof/ipc4-topology.c +++ b/sound/soc/sof/ipc4-topology.c @@ -2280,8 +2280,19 @@ sof_ipc4_prepare_copier_module(struct snd_sof_widget *swidget, ch_map >>= 4; } - step = ch_count / blob->alh_cfg.device_count; - mask = GENMASK(step - 1, 0); + if (swidget->id == snd_soc_dapm_dai_in && ch_count == out_ref_channels) { + /* + * For playback DAI widgets where the channel number is equal to + * the output reference channels, set the step = 0 to ensure all + * the ch_mask is applied to all alh mappings. + */ + mask = ch_mask; + step = 0; + } else { + step = ch_count / blob->alh_cfg.device_count; + mask = GENMASK(step - 1, 0); + } + /* * Set each gtw_cfg.node_id to blob->alh_cfg.mapping[] * for all widgets with the same stream name @@ -2316,8 +2327,9 @@ sof_ipc4_prepare_copier_module(struct snd_sof_widget *swidget, } /* - * Set the same channel mask for playback as the audio data is - * duplicated for all speakers. For capture, split the channels + * Set the same channel mask if the widget channel count is the same + * as the FE channels for playback as the audio data is duplicated + * for all speakers in this case. Otherwise, split the channels * among the aggregated DAIs. For example, with 4 channels on 2 * aggregated DAIs, the channel_mask should be 0x3 and 0xc for the * two DAI's. @@ -2326,10 +2338,7 @@ sof_ipc4_prepare_copier_module(struct snd_sof_widget *swidget, * the tables in soc_acpi files depending on the _ADR and devID * registers for each codec. */ - if (w->id == snd_soc_dapm_dai_in) - blob->alh_cfg.mapping[i].channel_mask = ch_mask; - else - blob->alh_cfg.mapping[i].channel_mask = mask << (step * i); + blob->alh_cfg.mapping[i].channel_mask = mask << (step * i); i++; } From 55e41f37f36d4985dbbb2a4ad5852649a456a37c Mon Sep 17 00:00:00 2001 From: Peter Ujfalusi Date: Mon, 15 Dec 2025 15:08:05 +0200 Subject: [PATCH 0303/1321] ASoC: SOF: Intel: pci-mtl: Change the topology path to intel/sof-ipc4-tplg The default topology path for IPC4 is intel/sof-ipc4-tplg with a symlink to it as intel/sof-ace-tplg to support old kernels. sof-bin has been released in this manner for almost two years now, it is time to change the default path for MTL family. Link: https://thesofproject.github.io/latest/getting_started/intel_debug/introduction.html#topology-file Signed-off-by: Peter Ujfalusi Reviewed-by: Bard Liao Reviewed-by: Ranjani Sridharan Reviewed-by: Kai Vehmanen Link: https://patch.msgid.link/20251215130805.31146-1-peter.ujfalusi@linux.intel.com Signed-off-by: Mark Brown --- sound/soc/sof/intel/pci-mtl.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/sound/soc/sof/intel/pci-mtl.c b/sound/soc/sof/intel/pci-mtl.c index 7b25339991952..23adc5d765b47 100644 --- a/sound/soc/sof/intel/pci-mtl.c +++ b/sound/soc/sof/intel/pci-mtl.c @@ -47,7 +47,7 @@ static const struct sof_dev_desc mtl_desc = { [SOF_IPC_TYPE_4] = "intel/sof-ipc4-lib/mtl", }, .default_tplg_path = { - [SOF_IPC_TYPE_4] = "intel/sof-ace-tplg", + [SOF_IPC_TYPE_4] = "intel/sof-ipc4-tplg", }, .default_fw_filename = { [SOF_IPC_TYPE_4] = "sof-mtl.ri", @@ -77,7 +77,7 @@ static const struct sof_dev_desc arl_desc = { [SOF_IPC_TYPE_4] = "intel/sof-ipc4-lib/arl", }, .default_tplg_path = { - [SOF_IPC_TYPE_4] = "intel/sof-ace-tplg", + [SOF_IPC_TYPE_4] = "intel/sof-ipc4-tplg", }, .default_fw_filename = { [SOF_IPC_TYPE_4] = "sof-arl.ri", @@ -107,7 +107,7 @@ static const struct sof_dev_desc arl_s_desc = { [SOF_IPC_TYPE_4] = "intel/sof-ipc4-lib/arl-s", }, .default_tplg_path = { - [SOF_IPC_TYPE_4] = "intel/sof-ace-tplg", + [SOF_IPC_TYPE_4] = "intel/sof-ipc4-tplg", }, .default_fw_filename = { [SOF_IPC_TYPE_4] = "sof-arl-s.ri", From 7ff6bfaaafe0a64ceea1c129f52c6ab3ef4eed97 Mon Sep 17 00:00:00 2001 From: Peter Ujfalusi Date: Mon, 15 Dec 2025 12:10:35 +0200 Subject: [PATCH 0304/1321] ASoC: soc-acpi / SOF: Add best_effort flag to get_function_tplg_files op When there is no fallback possibility available for the function topology use it is better to try to create a profile for the card in best effort manner, leaving out non supported links for example. As an example: some laptops present SSPx-BT link but we don't have fragment yet to support this. If we only have support for functional topology without monolithic fallback then we would fail the card creation. The reason why the monolithic topology works on the same device is that it does not have the SSPx-BT link handled, it is ignored. In case when there is no fallback possibility we should try to create the card with links that we support as best effort instead of failing and leaving the user without a card. Signed-off-by: Peter Ujfalusi Reviewed-by: Kai Vehmanen Reviewed-by: Bard Liao Link: https://patch.msgid.link/20251215101036.9370-2-peter.ujfalusi@linux.intel.com Signed-off-by: Mark Brown --- include/sound/soc-acpi.h | 5 ++++- .../intel/common/sof-function-topology-lib.c | 5 ++++- .../intel/common/sof-function-topology-lib.h | 2 +- sound/soc/sof/topology.c | 18 +++++++++++++++++- 4 files changed, 26 insertions(+), 4 deletions(-) diff --git a/include/sound/soc-acpi.h b/include/sound/soc-acpi.h index 90d73b9bddabe..0519afd7217f1 100644 --- a/include/sound/soc-acpi.h +++ b/include/sound/soc-acpi.h @@ -203,6 +203,8 @@ struct snd_soc_acpi_link_adr { * @mach: the pointer of the machine driver * @prefix: the prefix of the topology file name. Typically, it is the path. * @tplg_files: the pointer of the array of the topology file names. + * @best_effort: ignore non supported links and try to build the card in best effort + * with supported links */ /* Descriptor for SST ASoC machine driver */ struct snd_soc_acpi_mach { @@ -224,7 +226,8 @@ struct snd_soc_acpi_mach { const u32 tplg_quirk_mask; int (*get_function_tplg_files)(struct snd_soc_card *card, const struct snd_soc_acpi_mach *mach, - const char *prefix, const char ***tplg_files); + const char *prefix, const char ***tplg_files, + bool best_effort); }; #define SND_SOC_ACPI_MAX_CODECS 3 diff --git a/sound/soc/intel/common/sof-function-topology-lib.c b/sound/soc/intel/common/sof-function-topology-lib.c index b10d4794159a4..0daa7d83808be 100644 --- a/sound/soc/intel/common/sof-function-topology-lib.c +++ b/sound/soc/intel/common/sof-function-topology-lib.c @@ -28,7 +28,7 @@ enum tplg_device_id { #define SOF_INTEL_PLATFORM_NAME_MAX 4 int sof_sdw_get_tplg_files(struct snd_soc_card *card, const struct snd_soc_acpi_mach *mach, - const char *prefix, const char ***tplg_files) + const char *prefix, const char ***tplg_files, bool best_effort) { struct snd_soc_acpi_mach_params mach_params = mach->mach_params; struct snd_soc_dai_link *dai_link; @@ -87,6 +87,9 @@ int sof_sdw_get_tplg_files(struct snd_soc_card *card, const struct snd_soc_acpi_ dev_dbg(card->dev, "dai_link %s is not supported by separated tplg yet\n", dai_link->name); + if (best_effort) + continue; + return 0; } if (tplg_mask & BIT(tplg_dev)) diff --git a/sound/soc/intel/common/sof-function-topology-lib.h b/sound/soc/intel/common/sof-function-topology-lib.h index e7d0c39d07883..f358f8c52d785 100644 --- a/sound/soc/intel/common/sof-function-topology-lib.h +++ b/sound/soc/intel/common/sof-function-topology-lib.h @@ -10,6 +10,6 @@ #define _SND_SOC_ACPI_INTEL_GET_TPLG_H int sof_sdw_get_tplg_files(struct snd_soc_card *card, const struct snd_soc_acpi_mach *mach, - const char *prefix, const char ***tplg_files); + const char *prefix, const char ***tplg_files, bool best_effort); #endif diff --git a/sound/soc/sof/topology.c b/sound/soc/sof/topology.c index 6b09b8cdf1cb6..9bf8ab610a7ea 100644 --- a/sound/soc/sof/topology.c +++ b/sound/soc/sof/topology.c @@ -2506,12 +2506,28 @@ int snd_sof_load_topology(struct snd_soc_component *scomp, const char *file) if (!tplg_files) return -ENOMEM; + /* Try to use function topologies if possible */ if (!sof_pdata->disable_function_topology && !disable_function_topology && sof_pdata->machine && sof_pdata->machine->get_function_tplg_files) { + /* + * When the topology name contains 'dummy' word, it means that + * there is no fallback option to monolithic topology in case + * any of the function topologies might be missing. + * In this case we should use best effort to form the card, + * ignoring functionalities that we are missing a fragment for. + * + * Note: monolithic topologies also ignore these possibly + * missing functions, so the functionality of the card would be + * identical to the case if there would be a fallback monolithic + * topology created for the configuration. + */ + bool no_fallback = strstr(file, "dummy"); + tplg_cnt = sof_pdata->machine->get_function_tplg_files(scomp->card, sof_pdata->machine, tplg_filename_prefix, - &tplg_files); + &tplg_files, + no_fallback); if (tplg_cnt < 0) { kfree(tplg_files); return tplg_cnt; From 16a6d424b49749b5d3a6bb441862fc2ac9f55ce1 Mon Sep 17 00:00:00 2001 From: Peter Ujfalusi Date: Mon, 15 Dec 2025 12:10:36 +0200 Subject: [PATCH 0305/1321] ASoC: Intel: soc-acpi-intel-nvl-match: Drop rt722 l3 from the match table Revert "ASoC: Intel: soc-acpi-intel-nvl-match: add rt722 l3 support" NVL should be only using functional topologies for products, no monolithic topologies are planned to be released. In parallel a feature has been landed [1] which allows to remove the entries from the match table for sdca codecs to rely solely on function fragments. This reverts commit 41566e3de40616375e8dfe5455344558b79f9354. Link: https://lore.kernel.org/linux-sound/20251014071335.3844631-1-yung-chuan.liao@linux.intel.com/ Signed-off-by: Peter Ujfalusi Reviewed-by: Kai Vehmanen Reviewed-by: Bard Liao Link: https://patch.msgid.link/20251215101036.9370-3-peter.ujfalusi@linux.intel.com Signed-off-by: Mark Brown --- .../intel/common/soc-acpi-intel-nvl-match.c | 49 ------------------- 1 file changed, 49 deletions(-) diff --git a/sound/soc/intel/common/soc-acpi-intel-nvl-match.c b/sound/soc/intel/common/soc-acpi-intel-nvl-match.c index 2768dd10aaa08..b8695d47e55b3 100644 --- a/sound/soc/intel/common/soc-acpi-intel-nvl-match.c +++ b/sound/soc/intel/common/soc-acpi-intel-nvl-match.c @@ -15,49 +15,6 @@ struct snd_soc_acpi_mach snd_soc_acpi_intel_nvl_machines[] = { }; EXPORT_SYMBOL_GPL(snd_soc_acpi_intel_nvl_machines); -/* - * Multi-function codecs with three endpoints created for - * headset, amp and dmic functions. - */ -static const struct snd_soc_acpi_endpoint rt_mf_endpoints[] = { - { - .num = 0, - .aggregated = 0, - .group_position = 0, - .group_id = 0, - }, - { - .num = 1, - .aggregated = 0, - .group_position = 0, - .group_id = 0, - }, - { - .num = 2, - .aggregated = 0, - .group_position = 0, - .group_id = 0, - }, -}; - -static const struct snd_soc_acpi_adr_device rt722_3_single_adr[] = { - { - .adr = 0x000330025d072201ull, - .num_endpoints = ARRAY_SIZE(rt_mf_endpoints), - .endpoints = rt_mf_endpoints, - .name_prefix = "rt722" - } -}; - -static const struct snd_soc_acpi_link_adr nvl_rt722_l3[] = { - { - .mask = BIT(3), - .num_adr = ARRAY_SIZE(rt722_3_single_adr), - .adr_d = rt722_3_single_adr, - }, - {} -}; - /* this table is used when there is no I2S codec present */ struct snd_soc_acpi_mach snd_soc_acpi_intel_nvl_sdw_machines[] = { /* mockup tests need to be first */ @@ -79,12 +36,6 @@ struct snd_soc_acpi_mach snd_soc_acpi_intel_nvl_sdw_machines[] = { .drv_name = "sof_sdw", .sof_tplg_filename = "sof-nvl-rt715-rt711-rt1308-mono.tplg", }, - { - .link_mask = BIT(3), - .links = nvl_rt722_l3, - .drv_name = "sof_sdw", - .sof_tplg_filename = "sof-nvl-rt722.tplg", - }, {}, }; EXPORT_SYMBOL_GPL(snd_soc_acpi_intel_nvl_sdw_machines); From e76036dcaf0bb971bb3b6722a7054264acb34289 Mon Sep 17 00:00:00 2001 From: Peter Ujfalusi Date: Mon, 15 Dec 2025 14:06:47 +0200 Subject: [PATCH 0306/1321] ASoC: SOF: ipc4-topology: Prefer 32-bit DMIC blobs for 8-bit formats as well With the introduction of 8-bit formats the DMIC blob lookup also needs to be modified to prefer the 32-bit blob when 8-bit format is used on FE. At the same time we also need to make sure that in case 8-bit format is used, but only 16-bit blob is available for DMIC then we will not try to look for 8-bit blob (which is invalid) as fallback, but for a 16-bit one. Fixes: c04c2e829649 ("ASoC: SOF: ipc4-topology: Add support for 8-bit formats") Cc: stable@vger.kernel.org Signed-off-by: Peter Ujfalusi Reviewed-by: Bard Liao Reviewed-by: Seppo Ingalsuo Reviewed-by: Kai Vehmanen Reviewed-by: Ranjani Sridharan Link: https://patch.msgid.link/20251215120648.4827-2-peter.ujfalusi@linux.intel.com Signed-off-by: Mark Brown --- sound/soc/sof/ipc4-topology.c | 22 ++++++++++++++-------- 1 file changed, 14 insertions(+), 8 deletions(-) diff --git a/sound/soc/sof/ipc4-topology.c b/sound/soc/sof/ipc4-topology.c index 588defd3eec99..b2438835f2a30 100644 --- a/sound/soc/sof/ipc4-topology.c +++ b/sound/soc/sof/ipc4-topology.c @@ -1752,11 +1752,9 @@ snd_sof_get_nhlt_endpoint_data(struct snd_sof_dev *sdev, struct snd_sof_dai *dai channel_count = params_channels(params); sample_rate = params_rate(params); bit_depth = params_width(params); - /* - * Look for 32-bit blob first instead of 16-bit if copier - * supports multiple formats - */ - if (bit_depth == 16 && !single_bitdepth) { + + /* Prefer 32-bit blob if copier supports multiple formats */ + if (bit_depth <= 16 && !single_bitdepth) { dev_dbg(sdev->dev, "Looking for 32-bit blob first for DMIC\n"); format_change = true; bit_depth = 32; @@ -1799,10 +1797,18 @@ snd_sof_get_nhlt_endpoint_data(struct snd_sof_dev *sdev, struct snd_sof_dai *dai if (format_change) { /* * The 32-bit blob was not found in NHLT table, try to - * look for one based on the params + * look for 16-bit for DMIC or based on the params for + * SSP */ - bit_depth = params_width(params); - format_change = false; + if (linktype == SOF_DAI_INTEL_DMIC) { + bit_depth = 16; + if (params_width(params) == 16) + format_change = false; + } else { + bit_depth = params_width(params); + format_change = false; + } + get_new_blob = true; } else if (linktype == SOF_DAI_INTEL_DMIC && !single_bitdepth) { /* From 96c0f6812ad7791f1c75c54aee4e30f45c5284f0 Mon Sep 17 00:00:00 2001 From: Peter Ujfalusi Date: Mon, 15 Dec 2025 14:06:48 +0200 Subject: [PATCH 0307/1321] ASoC: SOF: ipc4-topology: Convert FLOAT to S32 during blob selection SSP/DMIC blobs have no support for FLOAT type, they are using S32 on data bus. Convert the format from FLOAT_LE to S32_LE to make sure that the correct format is used within the path. FLOAT conversion will be done on the host side (or within the path). Fixes: f7c41911ad74 ("ASoC: SOF: ipc4-topology: Add support for float sample type") Cc: stable@vger.kernel.org Signed-off-by: Peter Ujfalusi Reviewed-by: Bard Liao Reviewed-by: Seppo Ingalsuo Reviewed-by: Kai Vehmanen Reviewed-by: Ranjani Sridharan Link: https://patch.msgid.link/20251215120648.4827-3-peter.ujfalusi@linux.intel.com Signed-off-by: Mark Brown --- sound/soc/sof/ipc4-topology.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sound/soc/sof/ipc4-topology.c b/sound/soc/sof/ipc4-topology.c index b2438835f2a30..d621e7914a73c 100644 --- a/sound/soc/sof/ipc4-topology.c +++ b/sound/soc/sof/ipc4-topology.c @@ -1843,7 +1843,7 @@ snd_sof_get_nhlt_endpoint_data(struct snd_sof_dev *sdev, struct snd_sof_dai *dai *len = cfg->size >> 2; *dst = (u32 *)cfg->caps; - if (format_change) { + if (format_change || params_format(params) == SNDRV_PCM_FORMAT_FLOAT_LE) { /* * Update the params to reflect that different blob was loaded * instead of the requested bit depth (16 -> 32 or 32 -> 16). From b85ded195154172ceb1e3194589401fb146960e8 Mon Sep 17 00:00:00 2001 From: Shengjiu Wang Date: Tue, 16 Dec 2025 15:02:01 +0800 Subject: [PATCH 0308/1321] ASoC: ak4458: remove the reset operation in probe and remove The reset_control handler has the reference count for usage, as there is reset operation in runtime suspend and resume, then reset operation in probe() would cause the reference count of reset not balanced. Previously add reset operation in probe and remove is to fix the compile issue with !CONFIG_PM, as the driver has been update to use RUNTIME_PM_OPS(), so that change can be reverted. Fixes: 1e0dff741b0a ("ASoC: ak4458: remove "reset-gpios" property handler") Signed-off-by: Shengjiu Wang Link: https://patch.msgid.link/20251216070201.358477-1-shengjiu.wang@nxp.com Signed-off-by: Mark Brown --- sound/soc/codecs/ak4458.c | 4 ---- 1 file changed, 4 deletions(-) diff --git a/sound/soc/codecs/ak4458.c b/sound/soc/codecs/ak4458.c index 783d2ef21c11c..f81cd8cebdd89 100644 --- a/sound/soc/codecs/ak4458.c +++ b/sound/soc/codecs/ak4458.c @@ -783,16 +783,12 @@ static int ak4458_i2c_probe(struct i2c_client *i2c) pm_runtime_enable(&i2c->dev); regcache_cache_only(ak4458->regmap, true); - ak4458_reset(ak4458, false); return 0; } static void ak4458_i2c_remove(struct i2c_client *i2c) { - struct ak4458_priv *ak4458 = i2c_get_clientdata(i2c); - - ak4458_reset(ak4458, true); pm_runtime_disable(&i2c->dev); } From 829bca96c2bf1dc07b7e9805d55e594ad8d981b7 Mon Sep 17 00:00:00 2001 From: Alexander Stein Date: Tue, 16 Dec 2025 11:22:45 +0100 Subject: [PATCH 0309/1321] ASoC: fsl_sai: Add missing registers to cache default Drivers does cache sync during runtime resume, setting all writable registers. Not all writable registers are set in cache default, resulting in the erorr message: fsl-sai 30c30000.sai: using zero-initialized flat cache, this may cause unexpected behavior Fix this by adding missing writable register defaults. Signed-off-by: Alexander Stein Link: https://patch.msgid.link/20251216102246.676181-1-alexander.stein@ew.tq-group.com Signed-off-by: Mark Brown --- sound/soc/fsl/fsl_sai.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/sound/soc/fsl/fsl_sai.c b/sound/soc/fsl/fsl_sai.c index 86730c2149146..2fa14fbdfe1a8 100644 --- a/sound/soc/fsl/fsl_sai.c +++ b/sound/soc/fsl/fsl_sai.c @@ -1081,6 +1081,7 @@ static const struct reg_default fsl_sai_reg_defaults_ofs0[] = { {FSL_SAI_TDR6, 0}, {FSL_SAI_TDR7, 0}, {FSL_SAI_TMR, 0}, + {FSL_SAI_TTCTL, 0}, {FSL_SAI_RCR1(0), 0}, {FSL_SAI_RCR2(0), 0}, {FSL_SAI_RCR3(0), 0}, @@ -1104,12 +1105,14 @@ static const struct reg_default fsl_sai_reg_defaults_ofs8[] = { {FSL_SAI_TDR6, 0}, {FSL_SAI_TDR7, 0}, {FSL_SAI_TMR, 0}, + {FSL_SAI_TTCTL, 0}, {FSL_SAI_RCR1(8), 0}, {FSL_SAI_RCR2(8), 0}, {FSL_SAI_RCR3(8), 0}, {FSL_SAI_RCR4(8), 0}, {FSL_SAI_RCR5(8), 0}, {FSL_SAI_RMR, 0}, + {FSL_SAI_RTCTL, 0}, {FSL_SAI_MCTL, 0}, {FSL_SAI_MDIV, 0}, }; From df837ef82c9e181cb107e796723192583ba0e647 Mon Sep 17 00:00:00 2001 From: Alexander Stein Date: Tue, 16 Dec 2025 09:49:30 +0100 Subject: [PATCH 0310/1321] ASoC: fsl_xcvr: provide regmap names This driver uses multiple regmaps, which will causes name conflicts in debugfs like: debugfs: '30cc0000.xcvr' already exists in 'regmap' Fix this by adding a name for the non-core regmap configurations. Signed-off-by: Alexander Stein Link: https://patch.msgid.link/20251216084931.553328-1-alexander.stein@ew.tq-group.com Signed-off-by: Mark Brown --- sound/soc/fsl/fsl_xcvr.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/sound/soc/fsl/fsl_xcvr.c b/sound/soc/fsl/fsl_xcvr.c index 06434b2c9a0fb..a268fb81a2f86 100644 --- a/sound/soc/fsl/fsl_xcvr.c +++ b/sound/soc/fsl/fsl_xcvr.c @@ -1323,6 +1323,7 @@ static const struct reg_default fsl_xcvr_phy_reg_defaults[] = { }; static const struct regmap_config fsl_xcvr_regmap_phy_cfg = { + .name = "phy", .reg_bits = 8, .reg_stride = 4, .val_bits = 32, @@ -1335,6 +1336,7 @@ static const struct regmap_config fsl_xcvr_regmap_phy_cfg = { }; static const struct regmap_config fsl_xcvr_regmap_pllv0_cfg = { + .name = "pllv0", .reg_bits = 8, .reg_stride = 4, .val_bits = 32, @@ -1345,6 +1347,7 @@ static const struct regmap_config fsl_xcvr_regmap_pllv0_cfg = { }; static const struct regmap_config fsl_xcvr_regmap_pllv1_cfg = { + .name = "pllv1", .reg_bits = 8, .reg_stride = 4, .val_bits = 32, From 52f1e53d557ae19fb4fca90ac00b9ed21e20a1e2 Mon Sep 17 00:00:00 2001 From: Shuming Fan Date: Tue, 16 Dec 2025 17:06:01 +0800 Subject: [PATCH 0311/1321] ASoC: rt1320: update VC blind write settings This patch updates blind write settings for VC version. Signed-off-by: Shuming Fan Link: https://patch.msgid.link/20251216090601.3955252-1-shumingf@realtek.com Signed-off-by: Mark Brown --- sound/soc/codecs/rt1320-sdw.c | 16 +++++++--------- 1 file changed, 7 insertions(+), 9 deletions(-) diff --git a/sound/soc/codecs/rt1320-sdw.c b/sound/soc/codecs/rt1320-sdw.c index e3f9b03df3aae..feecef258b650 100644 --- a/sound/soc/codecs/rt1320-sdw.c +++ b/sound/soc/codecs/rt1320-sdw.c @@ -115,7 +115,8 @@ static const struct reg_sequence rt1320_blind_write[] = { static const struct reg_sequence rt1320_vc_blind_write[] = { { 0xc003, 0xe0 }, { 0xe80a, 0x01 }, - { 0xc5c3, 0xf3 }, + { 0xc5c3, 0xf2 }, + { 0xc5c8, 0x03 }, { 0xc057, 0x51 }, { 0xc054, 0x35 }, { 0xca05, 0xd6 }, @@ -126,8 +127,6 @@ static const struct reg_sequence rt1320_vc_blind_write[] = { { 0xc609, 0x40 }, { 0xc046, 0xff }, { 0xc045, 0xff }, - { 0xda81, 0x14 }, - { 0xda8d, 0x14 }, { 0xc044, 0xff }, { 0xc043, 0xff }, { 0xc042, 0xff }, @@ -136,8 +135,8 @@ static const struct reg_sequence rt1320_vc_blind_write[] = { { 0xcc10, 0x01 }, { 0xc700, 0xf0 }, { 0xc701, 0x13 }, - { 0xc901, 0x09 }, - { 0xc900, 0xd0 }, + { 0xc901, 0x04 }, + { 0xc900, 0x73 }, { 0xde03, 0x05 }, { 0xdd0b, 0x0d }, { 0xdd0a, 0xff }, @@ -153,6 +152,7 @@ static const struct reg_sequence rt1320_vc_blind_write[] = { { 0xf082, 0xff }, { 0xf081, 0xff }, { 0xf080, 0xff }, + { 0xe801, 0x01 }, { 0xe802, 0xf8 }, { 0xe803, 0xbe }, { 0xc003, 0xc0 }, @@ -202,7 +202,7 @@ static const struct reg_sequence rt1320_vc_blind_write[] = { { 0x3fc2bfc3, 0x00 }, { 0x3fc2bfc2, 0x00 }, { 0x3fc2bfc1, 0x00 }, - { 0x3fc2bfc0, 0x03 }, + { 0x3fc2bfc0, 0x07 }, { 0x0000d486, 0x43 }, { SDW_SDCA_CTL(FUNC_NUM_AMP, RT1320_SDCA_ENT_PDE23, RT1320_SDCA_CTL_REQ_POWER_STATE, 0), 0x00 }, { 0x1000db00, 0x07 }, @@ -241,9 +241,7 @@ static const struct reg_sequence rt1320_vc_blind_write[] = { { 0x1000db21, 0x00 }, { 0x1000db22, 0x00 }, { 0x1000db23, 0x00 }, - { 0x0000d540, 0x01 }, - { 0x0000c081, 0xfc }, - { 0x0000f01e, 0x80 }, + { 0x0000d540, 0x21 }, { 0xc01b, 0xfc }, { 0xc5d1, 0x89 }, { 0xc5d8, 0x0a }, From 68760d7cea88cf2e73238085edb73ff6bb0dc1d1 Mon Sep 17 00:00:00 2001 From: Chancel Liu Date: Tue, 16 Dec 2025 16:16:56 +0900 Subject: [PATCH 0312/1321] ASoC: fsl-asoc-card: Use of_property_present() for non-boolean properties The use of of_property_read_bool() for non-boolean properties is deprecated in favor of of_property_present() when testing for property presence. Otherwise there'll be kernel warning: [ 29.018081] OF: /sound-wm8962: Read of boolean property 'hp-det-gpios' with a value. Signed-off-by: Chancel Liu Link: https://patch.msgid.link/20251216071656.648412-1-chancel.liu@nxp.com Signed-off-by: Mark Brown --- sound/soc/fsl/fsl-asoc-card.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/sound/soc/fsl/fsl-asoc-card.c b/sound/soc/fsl/fsl-asoc-card.c index 2c7eb0baa0f36..70a6159430ed3 100644 --- a/sound/soc/fsl/fsl-asoc-card.c +++ b/sound/soc/fsl/fsl-asoc-card.c @@ -1045,8 +1045,8 @@ static int fsl_asoc_card_probe(struct platform_device *pdev) * The notifier is initialized in snd_soc_card_jack_new(), then * snd_soc_jack_notifier_register can be called. */ - if (of_property_read_bool(np, "hp-det-gpios") || - of_property_read_bool(np, "hp-det-gpio") /* deprecated */) { + if (of_property_present(np, "hp-det-gpios") || + of_property_present(np, "hp-det-gpio") /* deprecated */) { ret = simple_util_init_jack(&priv->card, &priv->hp_jack, 1, NULL, "Headphone Jack"); if (ret) @@ -1055,8 +1055,8 @@ static int fsl_asoc_card_probe(struct platform_device *pdev) snd_soc_jack_notifier_register(&priv->hp_jack.jack, &hp_jack_nb); } - if (of_property_read_bool(np, "mic-det-gpios") || - of_property_read_bool(np, "mic-det-gpio") /* deprecated */) { + if (of_property_present(np, "mic-det-gpios") || + of_property_present(np, "mic-det-gpio") /* deprecated */) { ret = simple_util_init_jack(&priv->card, &priv->mic_jack, 0, NULL, "Mic Jack"); if (ret) From 19f50ff941b193c7995fd95e00f232c2f4b87037 Mon Sep 17 00:00:00 2001 From: Stefan Binding Date: Tue, 16 Dec 2025 13:49:20 +0000 Subject: [PATCH 0313/1321] ASoC: ops: fix snd_soc_get_volsw for sx controls SX controls are currently broken, since the clamp introduced in commit a0ce874cfaaa ("ASoC: ops: improve snd_soc_get_volsw") does not handle SX controls, for example where the min value in the clamp is greater than the max value in the clamp. Add clamp parameter to prevent clamping in SX controls. The nature of SX controls mean that it wraps around 0, with a variable number of bits, therefore clamping the value becomes complicated and prone to error. Fixes 35 kunit tests for soc_ops_test_access. Fixes: a0ce874cfaaa ("ASoC: ops: improve snd_soc_get_volsw") Co-developed-by: Charles Keepax Signed-off-by: Stefan Binding Tested-by: Peter Ujfalusi Link: https://patch.msgid.link/20251216134938.788625-1-sbinding@opensource.cirrus.com Signed-off-by: Mark Brown --- sound/soc/soc-ops.c | 32 ++++++++++++++++++++------------ 1 file changed, 20 insertions(+), 12 deletions(-) diff --git a/sound/soc/soc-ops.c b/sound/soc/soc-ops.c index ce86978c158d6..624e9269fc25b 100644 --- a/sound/soc/soc-ops.c +++ b/sound/soc/soc-ops.c @@ -111,7 +111,8 @@ int snd_soc_put_enum_double(struct snd_kcontrol *kcontrol, EXPORT_SYMBOL_GPL(snd_soc_put_enum_double); static int sdca_soc_q78_reg_to_ctl(struct soc_mixer_control *mc, unsigned int reg_val, - unsigned int mask, unsigned int shift, int max) + unsigned int mask, unsigned int shift, int max, + bool sx) { int val = reg_val; @@ -141,20 +142,26 @@ static unsigned int sdca_soc_q78_ctl_to_reg(struct soc_mixer_control *mc, int va } static int soc_mixer_reg_to_ctl(struct soc_mixer_control *mc, unsigned int reg_val, - unsigned int mask, unsigned int shift, int max) + unsigned int mask, unsigned int shift, int max, + bool sx) { int val = (reg_val >> shift) & mask; if (mc->sign_bit) val = sign_extend32(val, mc->sign_bit); - val = clamp(val, mc->min, mc->max); - val -= mc->min; + if (sx) { + val -= mc->min; // SX controls intentionally can overflow here + val = min_t(unsigned int, val & mask, max); + } else { + val = clamp(val, mc->min, mc->max); + val -= mc->min; + } if (mc->invert) val = max - val; - return val & mask; + return val; } static unsigned int soc_mixer_ctl_to_reg(struct soc_mixer_control *mc, int val, @@ -280,9 +287,10 @@ static int soc_put_volsw(struct snd_kcontrol *kcontrol, static int soc_get_volsw(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_value *ucontrol, - struct soc_mixer_control *mc, int mask, int max) + struct soc_mixer_control *mc, int mask, int max, bool sx) { - int (*reg_to_ctl)(struct soc_mixer_control *, unsigned int, unsigned int, unsigned int, int); + int (*reg_to_ctl)(struct soc_mixer_control *, unsigned int, unsigned int, + unsigned int, int, bool); struct snd_soc_component *component = snd_kcontrol_chip(kcontrol); unsigned int reg_val; int val; @@ -293,16 +301,16 @@ static int soc_get_volsw(struct snd_kcontrol *kcontrol, reg_to_ctl = soc_mixer_reg_to_ctl; reg_val = snd_soc_component_read(component, mc->reg); - val = reg_to_ctl(mc, reg_val, mask, mc->shift, max); + val = reg_to_ctl(mc, reg_val, mask, mc->shift, max, sx); ucontrol->value.integer.value[0] = val; if (snd_soc_volsw_is_stereo(mc)) { if (mc->reg == mc->rreg) { - val = reg_to_ctl(mc, reg_val, mask, mc->rshift, max); + val = reg_to_ctl(mc, reg_val, mask, mc->rshift, max, sx); } else { reg_val = snd_soc_component_read(component, mc->rreg); - val = reg_to_ctl(mc, reg_val, mask, mc->shift, max); + val = reg_to_ctl(mc, reg_val, mask, mc->shift, max, sx); } ucontrol->value.integer.value[1] = val; @@ -371,7 +379,7 @@ int snd_soc_get_volsw(struct snd_kcontrol *kcontrol, (struct soc_mixer_control *)kcontrol->private_value; unsigned int mask = soc_mixer_mask(mc); - return soc_get_volsw(kcontrol, ucontrol, mc, mask, mc->max - mc->min); + return soc_get_volsw(kcontrol, ucontrol, mc, mask, mc->max - mc->min, false); } EXPORT_SYMBOL_GPL(snd_soc_get_volsw); @@ -413,7 +421,7 @@ int snd_soc_get_volsw_sx(struct snd_kcontrol *kcontrol, (struct soc_mixer_control *)kcontrol->private_value; unsigned int mask = soc_mixer_sx_mask(mc); - return soc_get_volsw(kcontrol, ucontrol, mc, mask, mc->max); + return soc_get_volsw(kcontrol, ucontrol, mc, mask, mc->max, true); } EXPORT_SYMBOL_GPL(snd_soc_get_volsw_sx); From 302c7b52f6d538f44102e474efdc58858fe55d79 Mon Sep 17 00:00:00 2001 From: Alexander Stein Date: Tue, 16 Dec 2025 10:40:42 +0100 Subject: [PATCH 0314/1321] ASoC: fsl_easrc: fix duplicate debugfs directory error This driver registers two components: asrc and easrc, both attached using the device name as component name. Eventually debugfs directories with identical name are created in soc_init_component_debugfs(), leading to error message: debugfs: '30c90000.easrc' already exists in 'tqm-tlv320aic32' Fix this by adding the debugfs_prefix. Signed-off-by: Alexander Stein Reviewed-by: Fabio Estevam Link: https://patch.msgid.link/20251216094045.623184-2-alexander.stein@ew.tq-group.com Signed-off-by: Mark Brown --- sound/soc/fsl/fsl_easrc.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/sound/soc/fsl/fsl_easrc.c b/sound/soc/fsl/fsl_easrc.c index f404a39009e1a..e64a0d97afd0c 100644 --- a/sound/soc/fsl/fsl_easrc.c +++ b/sound/soc/fsl/fsl_easrc.c @@ -1577,6 +1577,9 @@ static const struct snd_soc_component_driver fsl_easrc_component = { .controls = fsl_easrc_snd_controls, .num_controls = ARRAY_SIZE(fsl_easrc_snd_controls), .legacy_dai_naming = 1, +#ifdef CONFIG_DEBUG_FS + .debugfs_prefix = "easrc", +#endif }; static const struct reg_default fsl_easrc_reg_defaults[] = { From 58ee95db7f644c3832f45271b5b3150ccd5911ec Mon Sep 17 00:00:00 2001 From: Alexander Stein Date: Tue, 16 Dec 2025 10:40:43 +0100 Subject: [PATCH 0315/1321] ASoC: fsl_asrc_dma: fix duplicate debugfs directory error This driver registers a component for asrc. This is also used together with easrc, both attached using the device name as component name. Eventually debugfs directories with identical name are created in soc_init_component_debugfs(), leading to error message: debugfs: '30c90000.easrc' already exists in 'tqm-tlv320aic32' Fix this by adding the debugfs_prefix. Signed-off-by: Alexander Stein Reviewed-by: Fabio Estevam Link: https://patch.msgid.link/20251216094045.623184-3-alexander.stein@ew.tq-group.com Signed-off-by: Mark Brown --- sound/soc/fsl/fsl_asrc_dma.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/sound/soc/fsl/fsl_asrc_dma.c b/sound/soc/fsl/fsl_asrc_dma.c index 1bba48318e2dd..7dacc06b2f02e 100644 --- a/sound/soc/fsl/fsl_asrc_dma.c +++ b/sound/soc/fsl/fsl_asrc_dma.c @@ -473,5 +473,8 @@ struct snd_soc_component_driver fsl_asrc_component = { .pointer = fsl_asrc_dma_pcm_pointer, .pcm_construct = fsl_asrc_dma_pcm_new, .legacy_dai_naming = 1, +#ifdef CONFIG_DEBUG_FS + .debugfs_prefix = "asrc", +#endif }; EXPORT_SYMBOL_GPL(fsl_asrc_component); From aca3b7396167463b45d78945f875ce693960401c Mon Sep 17 00:00:00 2001 From: Stefan Binding Date: Wed, 17 Dec 2025 16:32:26 +0000 Subject: [PATCH 0316/1321] ASoC: Intel: soc-acpi-intel-mtl-match: Add 6 amp CS35L56 with feedback Add a match for 6x CS35L56, 3x on link 0 and 3x on link 1. To support the CDB35L56-EIGHT-C board using 6 amps. This is the same as the existing 8-amp configuration mtl_cs35l56_x8_link0_link1_fb, but reduced to 6 amps. Signed-off-by: Stefan Binding Signed-off-by: Richard Fitzgerald Link: https://patch.msgid.link/20251217163227.1186373-2-rf@opensource.cirrus.com Signed-off-by: Mark Brown --- .../intel/common/soc-acpi-intel-mtl-match.c | 42 +++++++++++++++++++ 1 file changed, 42 insertions(+) diff --git a/sound/soc/intel/common/soc-acpi-intel-mtl-match.c b/sound/soc/intel/common/soc-acpi-intel-mtl-match.c index ec9fd8486c053..f0cf956ffb82f 100644 --- a/sound/soc/intel/common/soc-acpi-intel-mtl-match.c +++ b/sound/soc/intel/common/soc-acpi-intel-mtl-match.c @@ -699,6 +699,27 @@ static const struct snd_soc_acpi_adr_device cs35l56_1_fb_adr[] = { }, }; +static const struct snd_soc_acpi_adr_device cs35l56_6amp_1_fb_adr[] = { + { + .adr = 0x00013701FA355601ull, + .num_endpoints = ARRAY_SIZE(cs35l56_r_fb_endpoints), + .endpoints = cs35l56_r_fb_endpoints, + .name_prefix = "AMP6" + }, + { + .adr = 0x00013601FA355601ull, + .num_endpoints = ARRAY_SIZE(cs35l56_3_fb_endpoints), + .endpoints = cs35l56_3_fb_endpoints, + .name_prefix = "AMP5" + }, + { + .adr = 0x00013501FA355601ull, + .num_endpoints = ARRAY_SIZE(cs35l56_5_fb_endpoints), + .endpoints = cs35l56_5_fb_endpoints, + .name_prefix = "AMP4" + }, +}; + static const struct snd_soc_acpi_adr_device cs35l56_2_r_adr[] = { { .adr = 0x00023201FA355601ull, @@ -1069,6 +1090,21 @@ static const struct snd_soc_acpi_link_adr mtl_cs35l56_x8_link0_link1_fb[] = { {} }; +static const struct snd_soc_acpi_link_adr mtl_cs35l56_x6_link0_link1_fb[] = { + { + .mask = BIT(1), + .num_adr = ARRAY_SIZE(cs35l56_6amp_1_fb_adr), + .adr_d = cs35l56_6amp_1_fb_adr, + }, + { + .mask = BIT(0), + /* First 3 amps in cs35l56_0_fb_adr */ + .num_adr = 3, + .adr_d = cs35l56_0_fb_adr, + }, + {} +}; + static const struct snd_soc_acpi_link_adr mtl_cs35l63_x2_link1_link3_fb[] = { { .mask = BIT(3), @@ -1189,6 +1225,12 @@ struct snd_soc_acpi_mach snd_soc_acpi_intel_mtl_sdw_machines[] = { .sof_tplg_filename = "sof-mtl-cs35l56-l01-fb8.tplg", .get_function_tplg_files = sof_sdw_get_tplg_files, }, + { + .link_mask = BIT(0) | BIT(1), + .links = mtl_cs35l56_x6_link0_link1_fb, + .drv_name = "sof_sdw", + .sof_tplg_filename = "sof-mtl-cs35l56-l01-fb6.tplg" + }, { .link_mask = BIT(0), .links = mtl_cs42l43_l0, From 77c234fbc35667aaf21b04d4ca7850e981b93f1c Mon Sep 17 00:00:00 2001 From: Stefan Binding Date: Wed, 17 Dec 2025 16:32:27 +0000 Subject: [PATCH 0317/1321] ASoC: Intel: soc-acpi-intel-mtl-match: Add 6 amp CS35L63 with feedback Add match for 6x CS35L63, 3x on link 2 and 3x on link 3. This is to support 6 amps on the CDB35L63-CB8 board. Signed-off-by: Stefan Binding Signed-off-by: Richard Fitzgerald Link: https://patch.msgid.link/20251217163227.1186373-3-rf@opensource.cirrus.com Signed-off-by: Mark Brown --- .../intel/common/soc-acpi-intel-mtl-match.c | 62 +++++++++++++++++++ 1 file changed, 62 insertions(+) diff --git a/sound/soc/intel/common/soc-acpi-intel-mtl-match.c b/sound/soc/intel/common/soc-acpi-intel-mtl-match.c index f0cf956ffb82f..1270ee21ee721 100644 --- a/sound/soc/intel/common/soc-acpi-intel-mtl-match.c +++ b/sound/soc/intel/common/soc-acpi-intel-mtl-match.c @@ -720,6 +720,48 @@ static const struct snd_soc_acpi_adr_device cs35l56_6amp_1_fb_adr[] = { }, }; +static const struct snd_soc_acpi_adr_device cs35l63_6amp_3_fb_adr[] = { + { + .adr = 0x00033001FA356301ull, + .num_endpoints = ARRAY_SIZE(cs35l56_l_fb_endpoints), + .endpoints = cs35l56_l_fb_endpoints, + .name_prefix = "AMP1" + }, + { + .adr = 0x00033201FA356301ull, + .num_endpoints = ARRAY_SIZE(cs35l56_2_fb_endpoints), + .endpoints = cs35l56_2_fb_endpoints, + .name_prefix = "AMP3" + }, + { + .adr = 0x00033401FA356301ull, + .num_endpoints = ARRAY_SIZE(cs35l56_4_fb_endpoints), + .endpoints = cs35l56_4_fb_endpoints, + .name_prefix = "AMP5" + }, +}; + +static const struct snd_soc_acpi_adr_device cs35l63_6amp_2_fb_adr[] = { + { + .adr = 0x00023101FA356301ull, + .num_endpoints = ARRAY_SIZE(cs35l56_r_fb_endpoints), + .endpoints = cs35l56_r_fb_endpoints, + .name_prefix = "AMP2" + }, + { + .adr = 0x00023301FA356301ull, + .num_endpoints = ARRAY_SIZE(cs35l56_3_fb_endpoints), + .endpoints = cs35l56_3_fb_endpoints, + .name_prefix = "AMP4" + }, + { + .adr = 0x00023501FA356301ull, + .num_endpoints = ARRAY_SIZE(cs35l56_5_fb_endpoints), + .endpoints = cs35l56_5_fb_endpoints, + .name_prefix = "AMP6" + }, +}; + static const struct snd_soc_acpi_adr_device cs35l56_2_r_adr[] = { { .adr = 0x00023201FA355601ull, @@ -1105,6 +1147,20 @@ static const struct snd_soc_acpi_link_adr mtl_cs35l56_x6_link0_link1_fb[] = { {} }; +static const struct snd_soc_acpi_link_adr mtl_cs35l63_x6_link2_link3_fb[] = { + { + .mask = BIT(3), + .num_adr = ARRAY_SIZE(cs35l63_6amp_3_fb_adr), + .adr_d = cs35l63_6amp_3_fb_adr, + }, + { + .mask = BIT(2), + .num_adr = ARRAY_SIZE(cs35l63_6amp_2_fb_adr), + .adr_d = cs35l63_6amp_2_fb_adr, + }, + {} +}; + static const struct snd_soc_acpi_link_adr mtl_cs35l63_x2_link1_link3_fb[] = { { .mask = BIT(3), @@ -1244,6 +1300,12 @@ struct snd_soc_acpi_mach snd_soc_acpi_intel_mtl_sdw_machines[] = { .drv_name = "sof_sdw", .sof_tplg_filename = "sof-mtl-cs35l56-l01-fb8.tplg", }, + { + .link_mask = BIT(2) | BIT(3), + .links = mtl_cs35l63_x6_link2_link3_fb, + .drv_name = "sof_sdw", + .sof_tplg_filename = "sof-mtl-cs35l56-l01-fb6.tplg", + }, { .link_mask = GENMASK(3, 0), .links = mtl_3_in_1_sdca, From 58e7e8a8b3ce9f6a4673f5daa61aa73a25b9ac81 Mon Sep 17 00:00:00 2001 From: Chris Chiu Date: Thu, 18 Dec 2025 06:22:51 +0000 Subject: [PATCH 0318/1321] ALSA: hda/realtek: fix PCI SSID for one of the HP 200 G2i laptop The PCI subsystem ID of the HP machine Abe A6U should be 0x8ee7 instead of 0x8eb7. Fixes: a30fa8122222 ("ALSA: hda/realtek: fix mute/micmute LEDs don't work for more HP laptops") Signed-off-by: Chris Chiu Link: https://patch.msgid.link/20251218062251.2039592-1-chris.chiu@canonical.com Signed-off-by: Takashi Iwai --- sound/hda/codecs/realtek/alc269.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sound/hda/codecs/realtek/alc269.c b/sound/hda/codecs/realtek/alc269.c index 355f118275318..1de46c06f8c2a 100644 --- a/sound/hda/codecs/realtek/alc269.c +++ b/sound/hda/codecs/realtek/alc269.c @@ -6792,7 +6792,6 @@ static const struct hda_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x103c, 0x8e9d, "HP 17 Turbine OmniBook X UMA", ALC287_FIXUP_CS35L41_I2C_2), SND_PCI_QUIRK(0x103c, 0x8e9e, "HP 17 Turbine OmniBook X UMA", ALC287_FIXUP_CS35L41_I2C_2), SND_PCI_QUIRK(0x103c, 0x8eb6, "HP Abe A6U", ALC236_FIXUP_HP_MUTE_LED_MICMUTE_GPIO), - SND_PCI_QUIRK(0x103c, 0x8eb7, "HP Abe A6U", ALC236_FIXUP_HP_MUTE_LED_MICMUTE_GPIO), SND_PCI_QUIRK(0x103c, 0x8eb8, "HP Abe A6U", ALC236_FIXUP_HP_MUTE_LED_MICMUTE_GPIO), SND_PCI_QUIRK(0x103c, 0x8ec1, "HP 200 G2i", ALC236_FIXUP_HP_MUTE_LED_MICMUTE_GPIO), SND_PCI_QUIRK(0x103c, 0x8ec4, "HP Bantie I6U", ALC236_FIXUP_HP_MUTE_LED_MICMUTE_GPIO), @@ -6808,6 +6807,7 @@ static const struct hda_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x103c, 0x8eda, "HP ZBook Firefly 16W", ALC245_FIXUP_HP_TAS2781_SPI_MUTE_LED), SND_PCI_QUIRK(0x103c, 0x8ee4, "HP Bantie A6U", ALC236_FIXUP_HP_MUTE_LED_MICMUTE_GPIO), SND_PCI_QUIRK(0x103c, 0x8ee5, "HP Bantie A6U", ALC236_FIXUP_HP_MUTE_LED_MICMUTE_GPIO), + SND_PCI_QUIRK(0x103c, 0x8ee7, "HP Abe A6U", ALC236_FIXUP_HP_MUTE_LED_MICMUTE_GPIO), SND_PCI_QUIRK(0x103c, 0x8f0c, "HP ZBook X G2i 16W", ALC236_FIXUP_HP_GPIO_LED), SND_PCI_QUIRK(0x103c, 0x8f0e, "HP ZBook X G2i 16W", ALC236_FIXUP_HP_GPIO_LED), SND_PCI_QUIRK(0x103c, 0x8f40, "HP ZBook 8 G2a 14", ALC245_FIXUP_HP_TAS2781_I2C_MUTE_LED), From 54f9cb0713740c4918499f0a06f4ec49ab8f61b1 Mon Sep 17 00:00:00 2001 From: Junbeom Yeom Date: Fri, 19 Dec 2025 21:40:31 +0900 Subject: [PATCH 0319/1321] erofs: fix unexpected EIO under memory pressure erofs readahead could fail with ENOMEM under the memory pressure because it tries to alloc_page with GFP_NOWAIT | GFP_NORETRY, while GFP_KERNEL for a regular read. And if readahead fails (with non-uptodate folios), the original request will then fall back to synchronous read, and `.read_folio()` should return appropriate errnos. However, in scenarios where readahead and read operations compete, read operation could return an unintended EIO because of an incorrect error propagation. To resolve this, this patch modifies the behavior so that, when the PCL is for read(which means pcl.besteffort is true), it attempts actual decompression instead of propagating the privios error except initial EIO. - Page size: 4K - The original size of FileA: 16K - Compress-ratio per PCL: 50% (Uncompressed 8K -> Compressed 4K) [page0, page1] [page2, page3] [PCL0]---------[PCL1] - functions declaration: . pread(fd, buf, count, offset) . readahead(fd, offset, count) - Thread A tries to read the last 4K - Thread B tries to do readahead 8K from 4K - RA, besteffort == false - R, besteffort == true pread(FileA, buf, 4K, 12K) do readahead(page3) // failed with ENOMEM wait_lock(page3) if (!uptodate(page3)) goto do_read readahead(FileA, 4K, 8K) // Here create PCL-chain like below: // [null, page1] [page2, null] // [PCL0:RA]-----[PCL1:RA] ... do read(page3) // found [PCL1:RA] and add page3 into it, // and then, change PCL1 from RA to R ... // Now, PCL-chain is as below: // [null, page1] [page2, page3] // [PCL0:RA]-----[PCL1:R] // try to decompress PCL-chain... z_erofs_decompress_queue err = 0; // failed with ENOMEM, so page 1 // only for RA will not be uptodated. // it's okay. err = decompress([PCL0:RA], err) // However, ENOMEM propagated to next // PCL, even though PCL is not only // for RA but also for R. As a result, // it just failed with ENOMEM without // trying any decompression, so page2 // and page3 will not be uptodated. ** BUG HERE ** --> err = decompress([PCL1:R], err) return err as ENOMEM ... wait_lock(page3) if (!uptodate(page3)) return EIO <-- Return an unexpected EIO! ... Fixes: 2349d2fa02db ("erofs: sunset unneeded NOFAILs") Cc: stable@vger.kernel.org Reviewed-by: Jaewook Kim Reviewed-by: Sungjong Seo Signed-off-by: Junbeom Yeom Reviewed-by: Gao Xiang Signed-off-by: Gao Xiang --- fs/erofs/zdata.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c index 65da215046320..3d31f7840ca04 100644 --- a/fs/erofs/zdata.c +++ b/fs/erofs/zdata.c @@ -1262,7 +1262,7 @@ static int z_erofs_parse_in_bvecs(struct z_erofs_backend *be, bool *overlapped) return err; } -static int z_erofs_decompress_pcluster(struct z_erofs_backend *be, int err) +static int z_erofs_decompress_pcluster(struct z_erofs_backend *be, bool eio) { struct erofs_sb_info *const sbi = EROFS_SB(be->sb); struct z_erofs_pcluster *pcl = be->pcl; @@ -1270,7 +1270,7 @@ static int z_erofs_decompress_pcluster(struct z_erofs_backend *be, int err) const struct z_erofs_decompressor *alg = z_erofs_decomp[pcl->algorithmformat]; bool try_free = true; - int i, j, jtop, err2; + int i, j, jtop, err2, err = eio ? -EIO : 0; struct page *page; bool overlapped; const char *reason; @@ -1413,12 +1413,12 @@ static int z_erofs_decompress_queue(const struct z_erofs_decompressqueue *io, .pcl = io->head, }; struct z_erofs_pcluster *next; - int err = io->eio ? -EIO : 0; + int err = 0; for (; be.pcl != Z_EROFS_PCLUSTER_TAIL; be.pcl = next) { DBG_BUGON(!be.pcl); next = READ_ONCE(be.pcl->next); - err = z_erofs_decompress_pcluster(&be, err) ?: err; + err = z_erofs_decompress_pcluster(&be, io->eio) ?: err; } return err; } From c1d969c040060a1fb44818712f18b68e082c120c Mon Sep 17 00:00:00 2001 From: Joshua Rogers Date: Fri, 7 Nov 2025 10:05:33 -0500 Subject: [PATCH 0320/1321] SUNRPC: svcauth_gss: avoid NULL deref on zero length gss_token in gss_read_proxy_verf A zero length gss_token results in pages == 0 and in_token->pages[0] is NULL. The code unconditionally evaluates page_address(in_token->pages[0]) for the initial memcpy, which can dereference NULL even when the copy length is 0. Guard the first memcpy so it only runs when length > 0. Fixes: 5866efa8cbfb ("SUNRPC: Fix svcauth_gss_proxy_init()") Cc: stable@vger.kernel.org Signed-off-by: Joshua Rogers Signed-off-by: Chuck Lever --- net/sunrpc/auth_gss/svcauth_gss.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c index a8ec30759a184..e2f0df8cdaa6a 100644 --- a/net/sunrpc/auth_gss/svcauth_gss.c +++ b/net/sunrpc/auth_gss/svcauth_gss.c @@ -1083,7 +1083,8 @@ static int gss_read_proxy_verf(struct svc_rqst *rqstp, } length = min_t(unsigned int, inlen, (char *)xdr->end - (char *)xdr->p); - memcpy(page_address(in_token->pages[0]), xdr->p, length); + if (length) + memcpy(page_address(in_token->pages[0]), xdr->p, length); inlen -= length; to_offs = length; From 8a00ea748b8ac918ee485e4af026fbc293fa0c2b Mon Sep 17 00:00:00 2001 From: Joshua Rogers Date: Fri, 7 Nov 2025 10:09:47 -0500 Subject: [PATCH 0321/1321] svcrdma: use rc_pageoff for memcpy byte offset svc_rdma_copy_inline_range added rc_curpage (page index) to the page base instead of the byte offset rc_pageoff. Use rc_pageoff so copies land within the current page. Found by ZeroPath (https://zeropath.com) Fixes: 8e122582680c ("svcrdma: Move svc_rdma_read_info::ri_pageno to struct svc_rdma_recv_ctxt") Cc: stable@vger.kernel.org Signed-off-by: Joshua Rogers Signed-off-by: Chuck Lever --- net/sunrpc/xprtrdma/svc_rdma_rw.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/sunrpc/xprtrdma/svc_rdma_rw.c b/net/sunrpc/xprtrdma/svc_rdma_rw.c index 661b3fe2779f0..945fbb374331c 100644 --- a/net/sunrpc/xprtrdma/svc_rdma_rw.c +++ b/net/sunrpc/xprtrdma/svc_rdma_rw.c @@ -848,7 +848,7 @@ static int svc_rdma_copy_inline_range(struct svc_rqst *rqstp, head->rc_page_count++; dst = page_address(rqstp->rq_pages[head->rc_curpage]); - memcpy(dst + head->rc_curpage, src + offset, page_len); + memcpy((unsigned char *)dst + head->rc_pageoff, src + offset, page_len); head->rc_readbytes += page_len; head->rc_pageoff += page_len; From f0a9d1a80966a1b5ab5e5ec55b509250c731d188 Mon Sep 17 00:00:00 2001 From: Joshua Rogers Date: Fri, 7 Nov 2025 10:09:48 -0500 Subject: [PATCH 0322/1321] svcrdma: return 0 on success from svc_rdma_copy_inline_range The function comment specifies 0 on success and -EINVAL on invalid parameters. Make the tail return 0 after a successful copy loop. Fixes: d7cc73972661 ("svcrdma: support multiple Read chunks per RPC") Cc: stable@vger.kernel.org Signed-off-by: Joshua Rogers Signed-off-by: Chuck Lever --- net/sunrpc/xprtrdma/svc_rdma_rw.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/sunrpc/xprtrdma/svc_rdma_rw.c b/net/sunrpc/xprtrdma/svc_rdma_rw.c index 945fbb374331c..e813e54633521 100644 --- a/net/sunrpc/xprtrdma/svc_rdma_rw.c +++ b/net/sunrpc/xprtrdma/svc_rdma_rw.c @@ -860,7 +860,7 @@ static int svc_rdma_copy_inline_range(struct svc_rqst *rqstp, offset += page_len; } - return -EINVAL; + return 0; } /** From dc6e9d0e265b3dac3edcb6178920ebff674b3a19 Mon Sep 17 00:00:00 2001 From: Joshua Rogers Date: Fri, 7 Nov 2025 10:09:49 -0500 Subject: [PATCH 0323/1321] svcrdma: bound check rq_pages index in inline path svc_rdma_copy_inline_range indexed rqstp->rq_pages[rc_curpage] without verifying rc_curpage stays within the allocated page array. Add guards before the first use and after advancing to a new page. Fixes: d7cc73972661 ("svcrdma: support multiple Read chunks per RPC") Cc: stable@vger.kernel.org Signed-off-by: Joshua Rogers Signed-off-by: Chuck Lever --- net/sunrpc/xprtrdma/svc_rdma_rw.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/net/sunrpc/xprtrdma/svc_rdma_rw.c b/net/sunrpc/xprtrdma/svc_rdma_rw.c index e813e54633521..310de7a80be52 100644 --- a/net/sunrpc/xprtrdma/svc_rdma_rw.c +++ b/net/sunrpc/xprtrdma/svc_rdma_rw.c @@ -841,6 +841,9 @@ static int svc_rdma_copy_inline_range(struct svc_rqst *rqstp, for (page_no = 0; page_no < numpages; page_no++) { unsigned int page_len; + if (head->rc_curpage >= rqstp->rq_maxpages) + return -EINVAL; + page_len = min_t(unsigned int, remaining, PAGE_SIZE - head->rc_pageoff); From 4ada0d48134521f1fc6d1d013a5787ee796c0b39 Mon Sep 17 00:00:00 2001 From: Andy Shevchenko Date: Thu, 13 Nov 2025 09:31:31 +0100 Subject: [PATCH 0324/1321] nfsd: Mark variable __maybe_unused to avoid W=1 build break Clang is not happy about set but (in some cases) unused variable: fs/nfsd/export.c:1027:17: error: variable 'inode' set but not used [-Werror,-Wunused-but-set-variable] since it's used as a parameter to dprintk() which might be configured a no-op. To avoid uglifying code with the specific ifdeffery just mark the variable __maybe_unused. The commit [1], which introduced this behaviour, is quite old and hence the Fixes tag points to the first of the Git era. Link: https://git.kernel.org/pub/scm/linux/kernel/git/history/history.git/commit/?id=0431923fb7a1 [1] Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Signed-off-by: Andy Shevchenko Signed-off-by: Chuck Lever --- fs/nfsd/export.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/nfsd/export.c b/fs/nfsd/export.c index 9d55512d0cc97..2a1499f2ad196 100644 --- a/fs/nfsd/export.c +++ b/fs/nfsd/export.c @@ -1024,7 +1024,7 @@ exp_rootfh(struct net *net, struct auth_domain *clp, char *name, { struct svc_export *exp; struct path path; - struct inode *inode; + struct inode *inode __maybe_unused; struct svc_fh fh; int err; struct nfsd_net *nn = net_generic(net, nfsd_net_id); From 2929aceeb5f698afa6dc292448277bdea13acaa9 Mon Sep 17 00:00:00 2001 From: Shardul Bankar Date: Mon, 17 Nov 2025 17:41:21 +0530 Subject: [PATCH 0325/1321] nfsd: fix memory leak in nfsd_create_serv error paths When nfsd_create_serv() calls percpu_ref_init() to initialize nn->nfsd_net_ref, it allocates both a percpu reference counter and a percpu_ref_data structure (64 bytes). However, if the function fails later due to svc_create_pooled() returning NULL or svc_bind() returning an error, these allocations are not cleaned up, resulting in a memory leak. The leak manifests as: - Unreferenced percpu allocation (8 bytes per CPU) - Unreferenced percpu_ref_data structure (64 bytes) Fix this by adding percpu_ref_exit() calls in both error paths to properly clean up the percpu_ref_init() allocations. This patch fixes the percpu_ref leak in nfsd_create_serv() seen as an auxiliary leak in syzbot report 099461f8558eb0a1f4f3; the prepare_creds() and vsock-related leaks in the same report remain to be addressed separately. Reported-by: syzbot+099461f8558eb0a1f4f3@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?extid=099461f8558eb0a1f4f3 Fixes: 47e988147f40 ("nfsd: add nfsd_serv_try_get and nfsd_serv_put") Signed-off-by: Shardul Bankar Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever --- fs/nfsd/nfssvc.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c index b08ae85d53ef5..f6cae4430ba44 100644 --- a/fs/nfsd/nfssvc.c +++ b/fs/nfsd/nfssvc.c @@ -615,12 +615,15 @@ int nfsd_create_serv(struct net *net) serv = svc_create_pooled(nfsd_programs, ARRAY_SIZE(nfsd_programs), &nn->nfsd_svcstats, nfsd_max_blksize, nfsd); - if (serv == NULL) + if (serv == NULL) { + percpu_ref_exit(&nn->nfsd_net_ref); return -ENOMEM; + } error = svc_bind(serv, net); if (error < 0) { svc_destroy(&serv); + percpu_ref_exit(&nn->nfsd_net_ref); return error; } spin_lock(&nfsd_notifier_lock); From 829775166d44c52dc18f1b08ca79d63b3169d9c1 Mon Sep 17 00:00:00 2001 From: Chuck Lever Date: Mon, 17 Nov 2025 11:00:49 -0500 Subject: [PATCH 0326/1321] NFSD: Clear SECLABEL in the suppattr_exclcreat bitmap >From RFC 8881: 5.8.1.14. Attribute 75: suppattr_exclcreat > The bit vector that would set all REQUIRED and RECOMMENDED > attributes that are supported by the EXCLUSIVE4_1 method of file > creation via the OPEN operation. The scope of this attribute > applies to all objects with a matching fsid. There's nothing in RFC 8881 that states that suppattr_exclcreat is or is not allowed to contain bits for attributes that are clear in the reported supported_attrs bitmask. But it doesn't make sense for an NFS server to indicate that it /doesn't/ implement an attribute, but then also indicate that clients /are/ allowed to set that attribute using OPEN(create) with EXCLUSIVE4_1. Ensure that the SECURITY_LABEL and ACL bits are not set in the suppattr_exclcreat bitmask when they are also not set in the supported_attrs bitmask. Fixes: 8c18f2052e75 ("nfsd41: SUPPATTR_EXCLCREAT attribute") Cc: stable@vger.kernel.org Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever --- fs/nfsd/nfs4xdr.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c index 30ce5851fe4c4..51ef97c254568 100644 --- a/fs/nfsd/nfs4xdr.c +++ b/fs/nfsd/nfs4xdr.c @@ -3375,6 +3375,11 @@ static __be32 nfsd4_encode_fattr4_suppattr_exclcreat(struct xdr_stream *xdr, u32 supp[3]; memcpy(supp, nfsd_suppattrs[resp->cstate.minorversion], sizeof(supp)); + if (!IS_POSIXACL(d_inode(args->dentry))) + supp[0] &= ~FATTR4_WORD0_ACL; + if (!args->contextsupport) + supp[2] &= ~FATTR4_WORD2_SECURITY_LABEL; + supp[0] &= NFSD_SUPPATTR_EXCLCREAT_WORD0; supp[1] &= NFSD_SUPPATTR_EXCLCREAT_WORD1; supp[2] &= NFSD_SUPPATTR_EXCLCREAT_WORD2; From 3499d588de49de0b23cb7ba271fe66648cc32f6a Mon Sep 17 00:00:00 2001 From: Chuck Lever Date: Mon, 17 Nov 2025 11:00:50 -0500 Subject: [PATCH 0327/1321] NFSD: Clear TIME_DELEG in the suppattr_exclcreat bitmap >From RFC 8881: 5.8.1.14. Attribute 75: suppattr_exclcreat > The bit vector that would set all REQUIRED and RECOMMENDED > attributes that are supported by the EXCLUSIVE4_1 method of file > creation via the OPEN operation. The scope of this attribute > applies to all objects with a matching fsid. There's nothing in RFC 8881 that states that suppattr_exclcreat is or is not allowed to contain bits for attributes that are clear in the reported supported_attrs bitmask. But it doesn't make sense for an NFS server to indicate that it /doesn't/ implement an attribute, but then also indicate that clients /are/ allowed to set that attribute using OPEN(create) with EXCLUSIVE4_1. The FATTR4_WORD2_TIME_DELEG attributes are also not to be allowed for OPEN(create) with EXCLUSIVE4_1. It doesn't make sense to set a delegated timestamp on a new file. Fixes: 7e13f4f8d27d ("nfsd: handle delegated timestamps in SETATTR") Cc: stable@vger.kernel.org Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever --- fs/nfsd/nfsd.h | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h index e4263326ca4ac..50be785f1d2ce 100644 --- a/fs/nfsd/nfsd.h +++ b/fs/nfsd/nfsd.h @@ -547,8 +547,14 @@ static inline bool nfsd_attrs_supported(u32 minorversion, const u32 *bmval) #define NFSD_SUPPATTR_EXCLCREAT_WORD1 \ (NFSD_WRITEABLE_ATTRS_WORD1 & \ ~(FATTR4_WORD1_TIME_ACCESS_SET | FATTR4_WORD1_TIME_MODIFY_SET)) +/* + * The FATTR4_WORD2_TIME_DELEG attributes are not to be allowed for + * OPEN(create) with EXCLUSIVE4_1. It doesn't make sense to set a + * delegated timestamp on a new file. + */ #define NFSD_SUPPATTR_EXCLCREAT_WORD2 \ - NFSD_WRITEABLE_ATTRS_WORD2 + (NFSD_WRITEABLE_ATTRS_WORD2 & \ + ~(FATTR4_WORD2_TIME_DELEG_ACCESS | FATTR4_WORD2_TIME_DELEG_MODIFY)) extern int nfsd4_is_junction(struct dentry *dentry); extern int register_cld_notifier(void); From aa10ba217c902bb0dcd9fa07dd633425520d134c Mon Sep 17 00:00:00 2001 From: Chuck Lever Date: Tue, 18 Nov 2025 19:51:19 -0500 Subject: [PATCH 0328/1321] NFSD: NFSv4 file creation neglects setting ACL MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit An NFSv4 client that sets an ACL with a named principal during file creation retrieves the ACL afterwards, and finds that it is only a default ACL (based on the mode bits) and not the ACL that was requested during file creation. This violates RFC 8881 section 6.4.1.3: "the ACL attribute is set as given". The issue occurs in nfsd_create_setattr(), which calls nfsd_attrs_valid() to determine whether to call nfsd_setattr(). However, nfsd_attrs_valid() checks only for iattr changes and security labels, but not POSIX ACLs. When only an ACL is present, the function returns false, nfsd_setattr() is skipped, and the POSIX ACL is never applied to the inode. Subsequently, when the client retrieves the ACL, the server finds no POSIX ACL on the inode and returns one generated from the file's mode bits rather than returning the originally-specified ACL. Reported-by: Aurélien Couderc Fixes: c0cbe70742f4 ("NFSD: add posix ACLs to struct nfsd_attrs") Cc: Roland Mainz Cc: stable@vger.kernel.org Signed-off-by: Chuck Lever --- fs/nfsd/vfs.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h index ded2900d423f8..e192dca4a6791 100644 --- a/fs/nfsd/vfs.h +++ b/fs/nfsd/vfs.h @@ -67,7 +67,8 @@ static inline bool nfsd_attrs_valid(struct nfsd_attrs *attrs) struct iattr *iap = attrs->na_iattr; return (iap->ia_valid || (attrs->na_seclabel && - attrs->na_seclabel->len)); + attrs->na_seclabel->len) || + attrs->na_pacl || attrs->na_dpacl); } __be32 nfserrno (int errno); From a442334fb40a1d948f23676f54459cb0c8db3404 Mon Sep 17 00:00:00 2001 From: David Howells Date: Fri, 19 Dec 2025 01:20:34 +0900 Subject: [PATCH 0329/1321] ksmbd: Fix to handle removal of rfc1002 header from smb_hdr The commit that removed the RFC1002 header from struct smb_hdr didn't also fix the places in ksmbd that use it in order to provide graceful rejection of SMB1 protocol requests. Fixes: 83bfbd0bb902 ("cifs: Remove the RFC1002 header from smb_hdr") Reported-by: Namjae Jeon Link: https://lore.kernel.org/r/CAKYAXd9Ju4MFkkH5Jxfi1mO0AWEr=R35M3vQ_Xa7Yw34JoNZ0A@mail.gmail.com/ Cc: ChenXiaoSong Signed-off-by: David Howells Signed-off-by: Namjae Jeon Signed-off-by: Steve French --- fs/smb/server/server.c | 2 +- fs/smb/server/smb_common.c | 20 ++++++++++---------- 2 files changed, 11 insertions(+), 11 deletions(-) diff --git a/fs/smb/server/server.c b/fs/smb/server/server.c index 3cea16050e4f7..bedc8390b6dbd 100644 --- a/fs/smb/server/server.c +++ b/fs/smb/server/server.c @@ -95,7 +95,7 @@ static inline int check_conn_state(struct ksmbd_work *work) if (ksmbd_conn_exiting(work->conn) || ksmbd_conn_need_reconnect(work->conn)) { - rsp_hdr = work->response_buf; + rsp_hdr = smb2_get_msg(work->response_buf); rsp_hdr->Status.CifsError = STATUS_CONNECTION_DISCONNECTED; return 1; } diff --git a/fs/smb/server/smb_common.c b/fs/smb/server/smb_common.c index b23203a1c2865..6d7b4449276bc 100644 --- a/fs/smb/server/smb_common.c +++ b/fs/smb/server/smb_common.c @@ -140,7 +140,7 @@ int ksmbd_verify_smb_message(struct ksmbd_work *work) if (smb2_hdr->ProtocolId == SMB2_PROTO_NUMBER) return ksmbd_smb2_check_message(work); - hdr = work->request_buf; + hdr = smb2_get_msg(work->request_buf); if (*(__le32 *)hdr->Protocol == SMB1_PROTO_NUMBER && hdr->Command == SMB_COM_NEGOTIATE) { work->conn->outstanding_credits++; @@ -278,15 +278,14 @@ static int ksmbd_negotiate_smb_dialect(void *buf) req->DialectCount); } - proto = *(__le32 *)((struct smb_hdr *)buf)->Protocol; if (proto == SMB1_PROTO_NUMBER) { struct smb_negotiate_req *req; - req = (struct smb_negotiate_req *)buf; + req = (struct smb_negotiate_req *)smb2_get_msg(buf); if (le16_to_cpu(req->ByteCount) < 2) goto err_out; - if (offsetof(struct smb_negotiate_req, DialectsArray) - 4 + + if (offsetof(struct smb_negotiate_req, DialectsArray) + le16_to_cpu(req->ByteCount) > smb_buf_length) { goto err_out; } @@ -320,8 +319,8 @@ static u16 get_smb1_cmd_val(struct ksmbd_work *work) */ static int init_smb1_rsp_hdr(struct ksmbd_work *work) { - struct smb_hdr *rsp_hdr = (struct smb_hdr *)work->response_buf; - struct smb_hdr *rcv_hdr = (struct smb_hdr *)work->request_buf; + struct smb_hdr *rsp_hdr = (struct smb_hdr *)smb2_get_msg(work->response_buf); + struct smb_hdr *rcv_hdr = (struct smb_hdr *)smb2_get_msg(work->request_buf); rsp_hdr->Command = SMB_COM_NEGOTIATE; *(__le32 *)rsp_hdr->Protocol = SMB1_PROTO_NUMBER; @@ -412,9 +411,10 @@ static int init_smb1_server(struct ksmbd_conn *conn) int ksmbd_init_smb_server(struct ksmbd_conn *conn) { + struct smb_hdr *rcv_hdr = (struct smb_hdr *)smb2_get_msg(conn->request_buf); __le32 proto; - proto = *(__le32 *)((struct smb_hdr *)conn->request_buf)->Protocol; + proto = *(__le32 *)rcv_hdr->Protocol; if (conn->need_neg == false) { if (proto == SMB1_PROTO_NUMBER) return -EINVAL; @@ -572,12 +572,12 @@ static int __smb2_negotiate(struct ksmbd_conn *conn) static int smb_handle_negotiate(struct ksmbd_work *work) { - struct smb_negotiate_rsp *neg_rsp = work->response_buf; + struct smb_negotiate_rsp *neg_rsp = smb2_get_msg(work->response_buf); ksmbd_debug(SMB, "Unsupported SMB1 protocol\n"); - if (ksmbd_iov_pin_rsp(work, (void *)neg_rsp + 4, - sizeof(struct smb_negotiate_rsp) - 4)) + if (ksmbd_iov_pin_rsp(work, (void *)neg_rsp, + sizeof(struct smb_negotiate_rsp))) return -ENOMEM; neg_rsp->hdr.Status.CifsError = STATUS_SUCCESS; From c91f23b39f45fdc0cd462d74638dbb6a354b55e7 Mon Sep 17 00:00:00 2001 From: Namjae Jeon Date: Fri, 19 Dec 2025 10:04:25 +0900 Subject: [PATCH 0330/1321] ksmbd: rename smb2_get_msg to smb_get_msg With the removal of the RFC1002 length field from the SMB header, smb2_get_msg is now used to get the smb1 request from the request buffer. Since this function is no longer exclusive to smb2 and now supports smb1 as well, This patch rename it to smb_get_msg to better reflect its usage. Signed-off-by: Namjae Jeon Signed-off-by: Steve French --- fs/smb/server/auth.c | 4 +-- fs/smb/server/connection.c | 2 +- fs/smb/server/oplock.c | 8 ++--- fs/smb/server/server.c | 2 +- fs/smb/server/smb2pdu.c | 70 +++++++++++++++++++------------------- fs/smb/server/smb2pdu.h | 9 ----- fs/smb/server/smb_common.c | 18 +++++----- fs/smb/server/smb_common.h | 9 +++++ 8 files changed, 61 insertions(+), 61 deletions(-) diff --git a/fs/smb/server/auth.c b/fs/smb/server/auth.c index f2767c4b51326..09af55b71153e 100644 --- a/fs/smb/server/auth.c +++ b/fs/smb/server/auth.c @@ -714,7 +714,7 @@ void ksmbd_gen_smb311_encryptionkey(struct ksmbd_conn *conn, int ksmbd_gen_preauth_integrity_hash(struct ksmbd_conn *conn, char *buf, __u8 *pi_hash) { - struct smb2_hdr *rcv_hdr = smb2_get_msg(buf); + struct smb2_hdr *rcv_hdr = smb_get_msg(buf); char *all_bytes_msg = (char *)&rcv_hdr->ProtocolId; int msg_size = get_rfc1002_len(buf); struct sha512_ctx sha_ctx; @@ -841,7 +841,7 @@ int ksmbd_crypt_message(struct ksmbd_work *work, struct kvec *iov, unsigned int nvec, int enc) { struct ksmbd_conn *conn = work->conn; - struct smb2_transform_hdr *tr_hdr = smb2_get_msg(iov[0].iov_base); + struct smb2_transform_hdr *tr_hdr = smb_get_msg(iov[0].iov_base); unsigned int assoc_data_len = sizeof(struct smb2_transform_hdr) - 20; int rc; struct scatterlist *sg; diff --git a/fs/smb/server/connection.c b/fs/smb/server/connection.c index b6b4f1286b9cf..b2afd8a43b0a0 100644 --- a/fs/smb/server/connection.c +++ b/fs/smb/server/connection.c @@ -394,7 +394,7 @@ int ksmbd_conn_handler_loop(void *p) if (!ksmbd_smb_request(conn)) break; - if (((struct smb2_hdr *)smb2_get_msg(conn->request_buf))->ProtocolId == + if (((struct smb2_hdr *)smb_get_msg(conn->request_buf))->ProtocolId == SMB2_PROTO_NUMBER) { if (pdu_size < SMB2_MIN_SUPPORTED_HEADER_SIZE) break; diff --git a/fs/smb/server/oplock.c b/fs/smb/server/oplock.c index 1f07ebf431d7b..a5967ac466049 100644 --- a/fs/smb/server/oplock.c +++ b/fs/smb/server/oplock.c @@ -637,7 +637,7 @@ static void __smb2_oplock_break_noti(struct work_struct *wk) goto out; } - rsp_hdr = smb2_get_msg(work->response_buf); + rsp_hdr = smb_get_msg(work->response_buf); memset(rsp_hdr, 0, sizeof(struct smb2_hdr) + 2); rsp_hdr->ProtocolId = SMB2_PROTO_NUMBER; rsp_hdr->StructureSize = SMB2_HEADER_STRUCTURE_SIZE; @@ -651,7 +651,7 @@ static void __smb2_oplock_break_noti(struct work_struct *wk) rsp_hdr->SessionId = 0; memset(rsp_hdr->Signature, 0, 16); - rsp = smb2_get_msg(work->response_buf); + rsp = smb_get_msg(work->response_buf); rsp->StructureSize = cpu_to_le16(24); if (!br_info->open_trunc && @@ -744,7 +744,7 @@ static void __smb2_lease_break_noti(struct work_struct *wk) goto out; } - rsp_hdr = smb2_get_msg(work->response_buf); + rsp_hdr = smb_get_msg(work->response_buf); memset(rsp_hdr, 0, sizeof(struct smb2_hdr) + 2); rsp_hdr->ProtocolId = SMB2_PROTO_NUMBER; rsp_hdr->StructureSize = SMB2_HEADER_STRUCTURE_SIZE; @@ -758,7 +758,7 @@ static void __smb2_lease_break_noti(struct work_struct *wk) rsp_hdr->SessionId = 0; memset(rsp_hdr->Signature, 0, 16); - rsp = smb2_get_msg(work->response_buf); + rsp = smb_get_msg(work->response_buf); rsp->StructureSize = cpu_to_le16(44); rsp->Epoch = br_info->epoch; rsp->Flags = 0; diff --git a/fs/smb/server/server.c b/fs/smb/server/server.c index bedc8390b6dbd..554ae90df906d 100644 --- a/fs/smb/server/server.c +++ b/fs/smb/server/server.c @@ -95,7 +95,7 @@ static inline int check_conn_state(struct ksmbd_work *work) if (ksmbd_conn_exiting(work->conn) || ksmbd_conn_need_reconnect(work->conn)) { - rsp_hdr = smb2_get_msg(work->response_buf); + rsp_hdr = smb_get_msg(work->response_buf); rsp_hdr->Status.CifsError = STATUS_CONNECTION_DISCONNECTED; return 1; } diff --git a/fs/smb/server/smb2pdu.c b/fs/smb/server/smb2pdu.c index 8aa483800014d..469b70757dba6 100644 --- a/fs/smb/server/smb2pdu.c +++ b/fs/smb/server/smb2pdu.c @@ -47,8 +47,8 @@ static void __wbuf(struct ksmbd_work *work, void **req, void **rsp) *req = ksmbd_req_buf_next(work); *rsp = ksmbd_resp_buf_next(work); } else { - *req = smb2_get_msg(work->request_buf); - *rsp = smb2_get_msg(work->response_buf); + *req = smb_get_msg(work->request_buf); + *rsp = smb_get_msg(work->response_buf); } } @@ -146,7 +146,7 @@ void smb2_set_err_rsp(struct ksmbd_work *work) if (work->next_smb2_rcv_hdr_off) err_rsp = ksmbd_resp_buf_next(work); else - err_rsp = smb2_get_msg(work->response_buf); + err_rsp = smb_get_msg(work->response_buf); if (err_rsp->hdr.Status != STATUS_STOPPED_ON_SYMLINK) { int err; @@ -172,7 +172,7 @@ void smb2_set_err_rsp(struct ksmbd_work *work) */ bool is_smb2_neg_cmd(struct ksmbd_work *work) { - struct smb2_hdr *hdr = smb2_get_msg(work->request_buf); + struct smb2_hdr *hdr = smb_get_msg(work->request_buf); /* is it SMB2 header ? */ if (hdr->ProtocolId != SMB2_PROTO_NUMBER) @@ -196,7 +196,7 @@ bool is_smb2_neg_cmd(struct ksmbd_work *work) */ bool is_smb2_rsp(struct ksmbd_work *work) { - struct smb2_hdr *hdr = smb2_get_msg(work->response_buf); + struct smb2_hdr *hdr = smb_get_msg(work->response_buf); /* is it SMB2 header ? */ if (hdr->ProtocolId != SMB2_PROTO_NUMBER) @@ -222,7 +222,7 @@ u16 get_smb2_cmd_val(struct ksmbd_work *work) if (work->next_smb2_rcv_hdr_off) rcv_hdr = ksmbd_req_buf_next(work); else - rcv_hdr = smb2_get_msg(work->request_buf); + rcv_hdr = smb_get_msg(work->request_buf); return le16_to_cpu(rcv_hdr->Command); } @@ -235,7 +235,7 @@ void set_smb2_rsp_status(struct ksmbd_work *work, __le32 err) { struct smb2_hdr *rsp_hdr; - rsp_hdr = smb2_get_msg(work->response_buf); + rsp_hdr = smb_get_msg(work->response_buf); rsp_hdr->Status = err; work->iov_idx = 0; @@ -258,7 +258,7 @@ int init_smb2_neg_rsp(struct ksmbd_work *work) struct ksmbd_conn *conn = work->conn; int err; - rsp_hdr = smb2_get_msg(work->response_buf); + rsp_hdr = smb_get_msg(work->response_buf); memset(rsp_hdr, 0, sizeof(struct smb2_hdr) + 2); rsp_hdr->ProtocolId = SMB2_PROTO_NUMBER; rsp_hdr->StructureSize = SMB2_HEADER_STRUCTURE_SIZE; @@ -272,7 +272,7 @@ int init_smb2_neg_rsp(struct ksmbd_work *work) rsp_hdr->SessionId = 0; memset(rsp_hdr->Signature, 0, 16); - rsp = smb2_get_msg(work->response_buf); + rsp = smb_get_msg(work->response_buf); WARN_ON(ksmbd_conn_good(conn)); @@ -446,7 +446,7 @@ static void init_chained_smb2_rsp(struct ksmbd_work *work) */ bool is_chained_smb2_message(struct ksmbd_work *work) { - struct smb2_hdr *hdr = smb2_get_msg(work->request_buf); + struct smb2_hdr *hdr = smb_get_msg(work->request_buf); unsigned int len, next_cmd; if (hdr->ProtocolId != SMB2_PROTO_NUMBER) @@ -497,8 +497,8 @@ bool is_chained_smb2_message(struct ksmbd_work *work) */ int init_smb2_rsp_hdr(struct ksmbd_work *work) { - struct smb2_hdr *rsp_hdr = smb2_get_msg(work->response_buf); - struct smb2_hdr *rcv_hdr = smb2_get_msg(work->request_buf); + struct smb2_hdr *rsp_hdr = smb_get_msg(work->response_buf); + struct smb2_hdr *rcv_hdr = smb_get_msg(work->request_buf); memset(rsp_hdr, 0, sizeof(struct smb2_hdr) + 2); rsp_hdr->ProtocolId = rcv_hdr->ProtocolId; @@ -527,7 +527,7 @@ int init_smb2_rsp_hdr(struct ksmbd_work *work) */ int smb2_allocate_rsp_buf(struct ksmbd_work *work) { - struct smb2_hdr *hdr = smb2_get_msg(work->request_buf); + struct smb2_hdr *hdr = smb_get_msg(work->request_buf); size_t small_sz = MAX_CIFS_SMALL_BUFFER_SIZE; size_t large_sz = small_sz + work->conn->vals->max_trans_size; size_t sz = small_sz; @@ -543,7 +543,7 @@ int smb2_allocate_rsp_buf(struct ksmbd_work *work) offsetof(struct smb2_query_info_req, OutputBufferLength)) return -EINVAL; - req = smb2_get_msg(work->request_buf); + req = smb_get_msg(work->request_buf); if ((req->InfoType == SMB2_O_INFO_FILE && (req->FileInfoClass == FILE_FULL_EA_INFORMATION || req->FileInfoClass == FILE_ALL_INFORMATION)) || @@ -712,10 +712,10 @@ void smb2_send_interim_resp(struct ksmbd_work *work, __le32 status) } in_work->conn = work->conn; - memcpy(smb2_get_msg(in_work->response_buf), ksmbd_resp_buf_next(work), + memcpy(smb_get_msg(in_work->response_buf), ksmbd_resp_buf_next(work), __SMB2_HEADER_STRUCTURE_SIZE); - rsp_hdr = smb2_get_msg(in_work->response_buf); + rsp_hdr = smb_get_msg(in_work->response_buf); rsp_hdr->Flags |= SMB2_FLAGS_ASYNC_COMMAND; rsp_hdr->Id.AsyncId = cpu_to_le64(work->async_id); smb2_set_err_rsp(in_work); @@ -1093,8 +1093,8 @@ static __le32 deassemble_neg_contexts(struct ksmbd_conn *conn, int smb2_handle_negotiate(struct ksmbd_work *work) { struct ksmbd_conn *conn = work->conn; - struct smb2_negotiate_req *req = smb2_get_msg(work->request_buf); - struct smb2_negotiate_rsp *rsp = smb2_get_msg(work->response_buf); + struct smb2_negotiate_req *req = smb_get_msg(work->request_buf); + struct smb2_negotiate_rsp *rsp = smb_get_msg(work->response_buf); int rc = 0; unsigned int smb2_buf_len, smb2_neg_size, neg_ctxt_len = 0; __le32 status; @@ -5967,7 +5967,7 @@ int smb2_close(struct ksmbd_work *work) */ int smb2_echo(struct ksmbd_work *work) { - struct smb2_echo_rsp *rsp = smb2_get_msg(work->response_buf); + struct smb2_echo_rsp *rsp = smb_get_msg(work->response_buf); ksmbd_debug(SMB, "Received smb2 echo request\n"); @@ -6520,8 +6520,8 @@ int smb2_set_info(struct ksmbd_work *work) pid = work->compound_pfid; } } else { - req = smb2_get_msg(work->request_buf); - rsp = smb2_get_msg(work->response_buf); + req = smb_get_msg(work->request_buf); + rsp = smb_get_msg(work->response_buf); } if (!test_tree_conn_flag(work->tcon, KSMBD_TREE_CONN_FLAG_WRITABLE)) { @@ -6754,8 +6754,8 @@ int smb2_read(struct ksmbd_work *work) pid = work->compound_pfid; } } else { - req = smb2_get_msg(work->request_buf); - rsp = smb2_get_msg(work->response_buf); + req = smb_get_msg(work->request_buf); + rsp = smb_get_msg(work->response_buf); } if (!has_file_id(id)) { @@ -7183,7 +7183,7 @@ int smb2_flush(struct ksmbd_work *work) int smb2_cancel(struct ksmbd_work *work) { struct ksmbd_conn *conn = work->conn; - struct smb2_hdr *hdr = smb2_get_msg(work->request_buf); + struct smb2_hdr *hdr = smb_get_msg(work->request_buf); struct smb2_hdr *chdr; struct ksmbd_work *iter; struct list_head *command_list; @@ -7200,7 +7200,7 @@ int smb2_cancel(struct ksmbd_work *work) spin_lock(&conn->request_lock); list_for_each_entry(iter, command_list, async_request_entry) { - chdr = smb2_get_msg(iter->request_buf); + chdr = smb_get_msg(iter->request_buf); if (iter->async_id != le64_to_cpu(hdr->Id.AsyncId)) @@ -7221,7 +7221,7 @@ int smb2_cancel(struct ksmbd_work *work) spin_lock(&conn->request_lock); list_for_each_entry(iter, command_list, request_entry) { - chdr = smb2_get_msg(iter->request_buf); + chdr = smb_get_msg(iter->request_buf); if (chdr->MessageId != hdr->MessageId || iter == work) @@ -8151,8 +8151,8 @@ int smb2_ioctl(struct ksmbd_work *work) id = work->compound_fid; } } else { - req = smb2_get_msg(work->request_buf); - rsp = smb2_get_msg(work->response_buf); + req = smb_get_msg(work->request_buf); + rsp = smb_get_msg(work->response_buf); } if (!has_file_id(id)) @@ -8817,7 +8817,7 @@ int smb2_notify(struct ksmbd_work *work) */ bool smb2_is_sign_req(struct ksmbd_work *work, unsigned int command) { - struct smb2_hdr *rcv_hdr2 = smb2_get_msg(work->request_buf); + struct smb2_hdr *rcv_hdr2 = smb_get_msg(work->request_buf); if ((rcv_hdr2->Flags & SMB2_FLAGS_SIGNED) && command != SMB2_NEGOTIATE_HE && @@ -8842,7 +8842,7 @@ int smb2_check_sign_req(struct ksmbd_work *work) struct kvec iov[1]; size_t len; - hdr = smb2_get_msg(work->request_buf); + hdr = smb_get_msg(work->request_buf); if (work->next_smb2_rcv_hdr_off) hdr = ksmbd_req_buf_next(work); @@ -8916,7 +8916,7 @@ int smb3_check_sign_req(struct ksmbd_work *work) struct kvec iov[1]; size_t len; - hdr = smb2_get_msg(work->request_buf); + hdr = smb_get_msg(work->request_buf); if (work->next_smb2_rcv_hdr_off) hdr = ksmbd_req_buf_next(work); @@ -9049,7 +9049,7 @@ void smb3_preauth_hash_rsp(struct ksmbd_work *work) static void fill_transform_hdr(void *tr_buf, char *old_buf, __le16 cipher_type) { struct smb2_transform_hdr *tr_hdr = tr_buf + 4; - struct smb2_hdr *hdr = smb2_get_msg(old_buf); + struct smb2_hdr *hdr = smb_get_msg(old_buf); unsigned int orig_len = get_rfc1002_len(old_buf); /* tr_buf must be cleared by the caller */ @@ -9088,7 +9088,7 @@ int smb3_encrypt_resp(struct ksmbd_work *work) bool smb3_is_transform_hdr(void *buf) { - struct smb2_transform_hdr *trhdr = smb2_get_msg(buf); + struct smb2_transform_hdr *trhdr = smb_get_msg(buf); return trhdr->ProtocolId == SMB2_TRANSFORM_PROTO_NUM; } @@ -9100,7 +9100,7 @@ int smb3_decrypt_req(struct ksmbd_work *work) unsigned int pdu_length = get_rfc1002_len(buf); struct kvec iov[2]; int buf_data_size = pdu_length - sizeof(struct smb2_transform_hdr); - struct smb2_transform_hdr *tr_hdr = smb2_get_msg(buf); + struct smb2_transform_hdr *tr_hdr = smb_get_msg(buf); int rc = 0; if (pdu_length < sizeof(struct smb2_transform_hdr) || @@ -9141,7 +9141,7 @@ bool smb3_11_final_sess_setup_resp(struct ksmbd_work *work) { struct ksmbd_conn *conn = work->conn; struct ksmbd_session *sess = work->sess; - struct smb2_hdr *rsp = smb2_get_msg(work->response_buf); + struct smb2_hdr *rsp = smb_get_msg(work->response_buf); if (conn->dialect < SMB30_PROT_ID) return false; diff --git a/fs/smb/server/smb2pdu.h b/fs/smb/server/smb2pdu.h index 66cdc8e4a6488..257c6d26df264 100644 --- a/fs/smb/server/smb2pdu.h +++ b/fs/smb/server/smb2pdu.h @@ -383,15 +383,6 @@ int smb2_ioctl(struct ksmbd_work *work); int smb2_oplock_break(struct ksmbd_work *work); int smb2_notify(struct ksmbd_work *ksmbd_work); -/* - * Get the body of the smb2 message excluding the 4 byte rfc1002 headers - * from request/response buffer. - */ -static inline void *smb2_get_msg(void *buf) -{ - return buf + 4; -} - #define POSIX_TYPE_FILE 0 #define POSIX_TYPE_DIR 1 #define POSIX_TYPE_SYMLINK 2 diff --git a/fs/smb/server/smb_common.c b/fs/smb/server/smb_common.c index 6d7b4449276bc..1cd7e738434d7 100644 --- a/fs/smb/server/smb_common.c +++ b/fs/smb/server/smb_common.c @@ -140,7 +140,7 @@ int ksmbd_verify_smb_message(struct ksmbd_work *work) if (smb2_hdr->ProtocolId == SMB2_PROTO_NUMBER) return ksmbd_smb2_check_message(work); - hdr = smb2_get_msg(work->request_buf); + hdr = smb_get_msg(work->request_buf); if (*(__le32 *)hdr->Protocol == SMB1_PROTO_NUMBER && hdr->Command == SMB_COM_NEGOTIATE) { work->conn->outstanding_credits++; @@ -163,7 +163,7 @@ bool ksmbd_smb_request(struct ksmbd_conn *conn) if (conn->request_buf[0] != 0) return false; - proto = (__le32 *)smb2_get_msg(conn->request_buf); + proto = (__le32 *)smb_get_msg(conn->request_buf); if (*proto == SMB2_COMPRESSION_TRANSFORM_ID) { pr_err_ratelimited("smb2 compression not support yet"); return false; @@ -259,14 +259,14 @@ int ksmbd_lookup_dialect_by_id(__le16 *cli_dialects, __le16 dialects_count) static int ksmbd_negotiate_smb_dialect(void *buf) { int smb_buf_length = get_rfc1002_len(buf); - __le32 proto = ((struct smb2_hdr *)smb2_get_msg(buf))->ProtocolId; + __le32 proto = ((struct smb2_hdr *)smb_get_msg(buf))->ProtocolId; if (proto == SMB2_PROTO_NUMBER) { struct smb2_negotiate_req *req; int smb2_neg_size = offsetof(struct smb2_negotiate_req, Dialects); - req = (struct smb2_negotiate_req *)smb2_get_msg(buf); + req = (struct smb2_negotiate_req *)smb_get_msg(buf); if (smb2_neg_size > smb_buf_length) goto err_out; @@ -281,7 +281,7 @@ static int ksmbd_negotiate_smb_dialect(void *buf) if (proto == SMB1_PROTO_NUMBER) { struct smb_negotiate_req *req; - req = (struct smb_negotiate_req *)smb2_get_msg(buf); + req = (struct smb_negotiate_req *)smb_get_msg(buf); if (le16_to_cpu(req->ByteCount) < 2) goto err_out; @@ -319,8 +319,8 @@ static u16 get_smb1_cmd_val(struct ksmbd_work *work) */ static int init_smb1_rsp_hdr(struct ksmbd_work *work) { - struct smb_hdr *rsp_hdr = (struct smb_hdr *)smb2_get_msg(work->response_buf); - struct smb_hdr *rcv_hdr = (struct smb_hdr *)smb2_get_msg(work->request_buf); + struct smb_hdr *rsp_hdr = (struct smb_hdr *)smb_get_msg(work->response_buf); + struct smb_hdr *rcv_hdr = (struct smb_hdr *)smb_get_msg(work->request_buf); rsp_hdr->Command = SMB_COM_NEGOTIATE; *(__le32 *)rsp_hdr->Protocol = SMB1_PROTO_NUMBER; @@ -411,7 +411,7 @@ static int init_smb1_server(struct ksmbd_conn *conn) int ksmbd_init_smb_server(struct ksmbd_conn *conn) { - struct smb_hdr *rcv_hdr = (struct smb_hdr *)smb2_get_msg(conn->request_buf); + struct smb_hdr *rcv_hdr = (struct smb_hdr *)smb_get_msg(conn->request_buf); __le32 proto; proto = *(__le32 *)rcv_hdr->Protocol; @@ -572,7 +572,7 @@ static int __smb2_negotiate(struct ksmbd_conn *conn) static int smb_handle_negotiate(struct ksmbd_work *work) { - struct smb_negotiate_rsp *neg_rsp = smb2_get_msg(work->response_buf); + struct smb_negotiate_rsp *neg_rsp = smb_get_msg(work->response_buf); ksmbd_debug(SMB, "Unsupported SMB1 protocol\n"); diff --git a/fs/smb/server/smb_common.h b/fs/smb/server/smb_common.h index 95bf1465387b9..ddd6867c50b2e 100644 --- a/fs/smb/server/smb_common.h +++ b/fs/smb/server/smb_common.h @@ -203,4 +203,13 @@ unsigned int ksmbd_server_side_copy_max_chunk_size(void); unsigned int ksmbd_server_side_copy_max_total_size(void); bool is_asterisk(char *p); __le32 smb_map_generic_desired_access(__le32 daccess); + +/* + * Get the body of the smb message excluding the 4 byte rfc1002 headers + * from request/response buffer. + */ +static inline void *smb_get_msg(void *buf) +{ + return buf + 4; +} #endif /* __SMB_SERVER_COMMON_H__ */ From 8fa94fcbb5821f12e06ecbb7455822f60b7656e4 Mon Sep 17 00:00:00 2001 From: ChenXiaoSong Date: Sat, 20 Dec 2025 21:25:50 +0800 Subject: [PATCH 0331/1321] smb/server: fix minimum SMB1 PDU size Since the RFC1002 header has been removed from `struct smb_hdr`, the minimum SMB1 PDU size should be updated as well. Fixes: 83bfbd0bb902 ("cifs: Remove the RFC1002 header from smb_hdr") Suggested-by: David Howells Suggested-by: Namjae Jeon Signed-off-by: ChenXiaoSong Reviewed-by: David Howells Acked-by: Namjae Jeon Signed-off-by: Steve French --- fs/smb/server/connection.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/fs/smb/server/connection.c b/fs/smb/server/connection.c index b2afd8a43b0a0..487b5562c7711 100644 --- a/fs/smb/server/connection.c +++ b/fs/smb/server/connection.c @@ -295,7 +295,8 @@ bool ksmbd_conn_alive(struct ksmbd_conn *conn) return true; } -#define SMB1_MIN_SUPPORTED_HEADER_SIZE (sizeof(struct smb_hdr)) +/* "+2" for BCC field (ByteCount, 2 bytes) */ +#define SMB1_MIN_SUPPORTED_PDU_SIZE (sizeof(struct smb_hdr) + 2) #define SMB2_MIN_SUPPORTED_HEADER_SIZE (sizeof(struct smb2_hdr) + 4) /** @@ -363,7 +364,7 @@ int ksmbd_conn_handler_loop(void *p) if (pdu_size > MAX_STREAM_PROT_LEN) break; - if (pdu_size < SMB1_MIN_SUPPORTED_HEADER_SIZE) + if (pdu_size < SMB1_MIN_SUPPORTED_PDU_SIZE) break; /* 4 for rfc1002 length field */ From 281f92db38c01b012374bb085079be3f0e61ccbd Mon Sep 17 00:00:00 2001 From: ChenXiaoSong Date: Sat, 20 Dec 2025 21:25:51 +0800 Subject: [PATCH 0332/1321] smb/server: fix minimum SMB2 PDU size The minimum SMB2 PDU size should be updated to the size of `struct smb2_pdu` (that is, the size of `struct smb2_hdr` + 2). Suggested-by: David Howells Suggested-by: Namjae Jeon Signed-off-by: ChenXiaoSong Reviewed-by: David Howells Acked-by: Namjae Jeon Signed-off-by: Steve French --- fs/smb/server/connection.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/smb/server/connection.c b/fs/smb/server/connection.c index 487b5562c7711..6cac48c8fbe8e 100644 --- a/fs/smb/server/connection.c +++ b/fs/smb/server/connection.c @@ -297,7 +297,7 @@ bool ksmbd_conn_alive(struct ksmbd_conn *conn) /* "+2" for BCC field (ByteCount, 2 bytes) */ #define SMB1_MIN_SUPPORTED_PDU_SIZE (sizeof(struct smb_hdr) + 2) -#define SMB2_MIN_SUPPORTED_HEADER_SIZE (sizeof(struct smb2_hdr) + 4) +#define SMB2_MIN_SUPPORTED_PDU_SIZE (sizeof(struct smb2_pdu)) /** * ksmbd_conn_handler_loop() - session thread to listen on new smb requests @@ -397,7 +397,7 @@ int ksmbd_conn_handler_loop(void *p) if (((struct smb2_hdr *)smb_get_msg(conn->request_buf))->ProtocolId == SMB2_PROTO_NUMBER) { - if (pdu_size < SMB2_MIN_SUPPORTED_HEADER_SIZE) + if (pdu_size < SMB2_MIN_SUPPORTED_PDU_SIZE) break; } From c6a2f00a5c83c615282d791761dfeef49ebe1d73 Mon Sep 17 00:00:00 2001 From: "Michael S. Tsirkin" Date: Thu, 4 Dec 2025 12:22:28 -0500 Subject: [PATCH 0333/1321] tools/virtio: fix up compiler.h stub Add #undef __user before and after including compiler_types.h to avoid redefinition warnings when compiling with system headers that also define __user. This allows tools/virtio to build without warnings. Additionally, stub out __must_check Created using Cursor CLI. Message-ID: <56424ce95c72cb4957070a7cd3c3c40ad5addaee.1764873799.git.mst@redhat.com> Acked-by: Jason Wang Signed-off-by: Michael S. Tsirkin --- tools/virtio/linux/compiler.h | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/tools/virtio/linux/compiler.h b/tools/virtio/linux/compiler.h index 725b93bfeee17..0f25db473f55c 100644 --- a/tools/virtio/linux/compiler.h +++ b/tools/virtio/linux/compiler.h @@ -2,7 +2,11 @@ #ifndef LINUX_COMPILER_H #define LINUX_COMPILER_H +/* Avoid redefinition warnings */ +#undef __user #include "../../../include/linux/compiler_types.h" +#undef __user +#define __user #define WRITE_ONCE(var, val) \ (*((volatile typeof(val) *)(&(var))) = (val)) @@ -35,4 +39,6 @@ __v; \ }) +#define __must_check + #endif From 1bc23e7621a624bd5c37235cdf318479ff932041 Mon Sep 17 00:00:00 2001 From: "Michael S. Tsirkin" Date: Thu, 4 Dec 2025 13:31:52 -0500 Subject: [PATCH 0334/1321] virtio: make it self-contained virtio.h uses struct module, add a forward declaration to make the header self-contained. Message-ID: <9171b5cac60793eb59ab044c96ee038bf1363bee.1764873799.git.mst@redhat.com> Acked-by: Jason Wang Signed-off-by: Michael S. Tsirkin --- include/linux/virtio.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/include/linux/virtio.h b/include/linux/virtio.h index 132a474e59140..3626eb6947282 100644 --- a/include/linux/virtio.h +++ b/include/linux/virtio.h @@ -13,6 +13,8 @@ #include #include +struct module; + /** * struct virtqueue - a queue to register buffers for sending or receiving. * @list: the chain of virtqueues for this device From fbd087de4506f9aa5773ee6d9a4a71299cfe939e Mon Sep 17 00:00:00 2001 From: "Michael S. Tsirkin" Date: Thu, 4 Dec 2025 12:22:31 -0500 Subject: [PATCH 0335/1321] tools/virtio: use kernel's virtio.h Replace virtio stubs with an include of the kernel header. Message-ID: <33daf1033fc447eb8e3e54d21013ccfd99550e37.1764873799.git.mst@redhat.com> Acked-by: Jason Wang Signed-off-by: Michael S. Tsirkin --- tools/virtio/linux/virtio.h | 73 +------------------------------------ 1 file changed, 1 insertion(+), 72 deletions(-) diff --git a/tools/virtio/linux/virtio.h b/tools/virtio/linux/virtio.h index 5d3440f474dd3..d3029c9445896 100644 --- a/tools/virtio/linux/virtio.h +++ b/tools/virtio/linux/virtio.h @@ -1,72 +1 @@ -/* SPDX-License-Identifier: GPL-2.0 */ -#ifndef LINUX_VIRTIO_H -#define LINUX_VIRTIO_H -#include -#include -#include - -struct device { - void *parent; -}; - -struct virtio_device { - struct device dev; - u64 features; - struct list_head vqs; - spinlock_t vqs_list_lock; - const struct virtio_config_ops *config; -}; - -struct virtqueue { - struct list_head list; - void (*callback)(struct virtqueue *vq); - const char *name; - struct virtio_device *vdev; - unsigned int index; - unsigned int num_free; - unsigned int num_max; - void *priv; - bool reset; -}; - -/* Interfaces exported by virtio_ring. */ -int virtqueue_add_sgs(struct virtqueue *vq, - struct scatterlist *sgs[], - unsigned int out_sgs, - unsigned int in_sgs, - void *data, - gfp_t gfp); - -int virtqueue_add_outbuf(struct virtqueue *vq, - struct scatterlist sg[], unsigned int num, - void *data, - gfp_t gfp); - -int virtqueue_add_inbuf(struct virtqueue *vq, - struct scatterlist sg[], unsigned int num, - void *data, - gfp_t gfp); - -bool virtqueue_kick(struct virtqueue *vq); - -void *virtqueue_get_buf(struct virtqueue *vq, unsigned int *len); - -void virtqueue_disable_cb(struct virtqueue *vq); - -bool virtqueue_enable_cb(struct virtqueue *vq); -bool virtqueue_enable_cb_delayed(struct virtqueue *vq); - -void *virtqueue_detach_unused_buf(struct virtqueue *vq); -struct virtqueue *vring_new_virtqueue(unsigned int index, - unsigned int num, - unsigned int vring_align, - struct virtio_device *vdev, - bool weak_barriers, - bool ctx, - void *pages, - bool (*notify)(struct virtqueue *vq), - void (*callback)(struct virtqueue *vq), - const char *name); -void vring_del_virtqueue(struct virtqueue *vq); - -#endif +#include <../../include/linux/virtio.h> From b29804c0049085b2dda19d9d902ce5f866eff0b4 Mon Sep 17 00:00:00 2001 From: "Michael S. Tsirkin" Date: Thu, 4 Dec 2025 12:22:32 -0500 Subject: [PATCH 0336/1321] tools/virtio: add struct module forward declaration Declarate struct module in our linux/module.h stub. Created using Cursor CLI. Message-ID: Acked-by: Jason Wang Signed-off-by: Michael S. Tsirkin --- tools/virtio/linux/module.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tools/virtio/linux/module.h b/tools/virtio/linux/module.h index b91681fc15718..b842ae9d870ce 100644 --- a/tools/virtio/linux/module.h +++ b/tools/virtio/linux/module.h @@ -1,6 +1,8 @@ /* SPDX-License-Identifier: GPL-2.0 */ #include +struct module; + #define MODULE_LICENSE(__MODULE_LICENSE_value) \ static __attribute__((unused)) const char *__MODULE_LICENSE_name = \ __MODULE_LICENSE_value From 1f2fe26106cba72ba8773b20046d382816237558 Mon Sep 17 00:00:00 2001 From: "Michael S. Tsirkin" Date: Thu, 4 Dec 2025 12:22:34 -0500 Subject: [PATCH 0337/1321] tools/virtio: stub DMA mapping functions Add dma_map_page_attrs and dma_unmap_page_attrs stubs. Follow the same pattern as existing DMA mapping stubs. Created using Cursor CLI. Message-ID: <3512df1fe0e2129ea493434a21c940c50381cc93.1764873799.git.mst@redhat.com> Acked-by: Jason Wang Signed-off-by: Michael S. Tsirkin --- tools/virtio/linux/dma-mapping.h | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/tools/virtio/linux/dma-mapping.h b/tools/virtio/linux/dma-mapping.h index 0959584617886..fddfa2fbb276a 100644 --- a/tools/virtio/linux/dma-mapping.h +++ b/tools/virtio/linux/dma-mapping.h @@ -22,6 +22,7 @@ enum dma_data_direction { #define dma_free_coherent(d, s, p, h) kfree(p) #define dma_map_page(d, p, o, s, dir) (page_to_phys(p) + (o)) +#define dma_map_page_attrs(d, p, o, s, dir, a) (page_to_phys(p) + (o)) #define dma_map_single(d, p, s, dir) (virt_to_phys(p)) #define dma_map_single_attrs(d, p, s, dir, a) (virt_to_phys(p)) @@ -29,6 +30,9 @@ enum dma_data_direction { #define dma_unmap_single(d, a, s, r) do { (void)(d); (void)(a); (void)(s); (void)(r); } while (0) #define dma_unmap_page(d, a, s, r) do { (void)(d); (void)(a); (void)(s); (void)(r); } while (0) +#define dma_unmap_page_attrs(d, a, s, r, t) do { \ + (void)(d); (void)(a); (void)(s); (void)(r); (void)(t); \ +} while (0) #define sg_dma_address(sg) (0) #define sg_dma_len(sg) (0) From 4a891b9a9be1f2b472443997e8111f514475a0a2 Mon Sep 17 00:00:00 2001 From: "Michael S. Tsirkin" Date: Thu, 4 Dec 2025 12:22:36 -0500 Subject: [PATCH 0338/1321] tools/virtio: add dev_WARN_ONCE and is_vmalloc_addr stubs Add dev_WARN_ONCE and is_vmalloc_addr stubs needed by virtio_ring.c. is_vmalloc_addr stub always returns false - that's fine since it's merely a sanity check. Created using Cursor CLI. Message-ID: <749e7a03b7cd56baf50a27efc3b05e50cf8f36b6.1764873799.git.mst@redhat.com> Acked-by: Jason Wang Signed-off-by: Michael S. Tsirkin --- tools/virtio/linux/kernel.h | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/tools/virtio/linux/kernel.h b/tools/virtio/linux/kernel.h index 6702008f7f5c6..d7fc70b68a2b7 100644 --- a/tools/virtio/linux/kernel.h +++ b/tools/virtio/linux/kernel.h @@ -14,6 +14,7 @@ #include #include #include +#include #include #include #include @@ -135,6 +136,14 @@ static inline void *krealloc_array(void *p, size_t new_n, size_t new_size, gfp_t #define dev_warn(dev, format, ...) fprintf (stderr, format, ## __VA_ARGS__) #define dev_warn_once(dev, format, ...) fprintf (stderr, format, ## __VA_ARGS__) +#define dev_WARN_ONCE(dev, condition, format...) \ + WARN_ONCE(condition, format) + +static inline bool is_vmalloc_addr(const void *x) +{ + return false; +} + #define min(x, y) ({ \ typeof(x) _min1 = (x); \ typeof(y) _min2 = (y); \ From 6b7a94faa3d87cc5f2dbdc6d9b569185982ea95d Mon Sep 17 00:00:00 2001 From: "Michael S. Tsirkin" Date: Thu, 4 Dec 2025 12:22:40 -0500 Subject: [PATCH 0339/1321] tools/virtio: add ucopysize.h stub Add ucopysize.h with stub implementations of check_object_size, copy_overflow, and check_copy_size. Created using Cursor CLI. Message-ID: <5046df90002bb744609248404b81d33b559fe813.1764873799.git.mst@redhat.com> Acked-by: Jason Wang Signed-off-by: Michael S. Tsirkin --- tools/virtio/linux/ucopysize.h | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) create mode 100644 tools/virtio/linux/ucopysize.h diff --git a/tools/virtio/linux/ucopysize.h b/tools/virtio/linux/ucopysize.h new file mode 100644 index 0000000000000..8beb7755d0601 --- /dev/null +++ b/tools/virtio/linux/ucopysize.h @@ -0,0 +1,21 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __LINUX_UCOPYSIZE_H__ +#define __LINUX_UCOPYSIZE_H__ + +#include + +static inline void check_object_size(const void *ptr, unsigned long n, + bool to_user) +{ } + +static inline void copy_overflow(int size, unsigned long count) +{ +} + +static __always_inline __must_check bool +check_copy_size(const void *addr, size_t bytes, bool is_source) +{ + return true; +} + +#endif /* __LINUX_UCOPYSIZE_H__ */ From da011ceac58c4e653e55a0a224c9f94fe3a90f0b Mon Sep 17 00:00:00 2001 From: "Michael S. Tsirkin" Date: Thu, 4 Dec 2025 12:22:43 -0500 Subject: [PATCH 0340/1321] tools/virtio: pass KCFLAGS to module build Update the mod target to pass KCFLAGS with the in-tree vhost driver include path. This way vhost_test can find vhost headers. Created using Cursor CLI. Message-ID: <5473e5a5dfd2fcd261a778f2017cac669c031f23.1764873799.git.mst@redhat.com> Acked-by: Jason Wang Signed-off-by: Michael S. Tsirkin --- tools/virtio/Makefile | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/tools/virtio/Makefile b/tools/virtio/Makefile index e25e99c1c3b7b..a60316211df6b 100644 --- a/tools/virtio/Makefile +++ b/tools/virtio/Makefile @@ -20,8 +20,9 @@ CFLAGS += -g -O2 -Werror -Wno-maybe-uninitialized -Wall -I. -I../include/ -I ../ CFLAGS += -pthread LDFLAGS += -pthread vpath %.c ../../drivers/virtio ../../drivers/vhost +BUILD=KCFLAGS="-I "`pwd`/../../drivers/vhost ${MAKE} -C `pwd`/../.. V=${V} mod: - ${MAKE} -C `pwd`/../.. M=`pwd`/vhost_test V=${V} + ${BUILD} M=`pwd`/vhost_test #oot: build vhost as an out of tree module for a distro kernel #no effort is taken to make it actually build or work, but tends to mostly work From 2dc424267184422043d08e3e81f3b17eb47fb8e9 Mon Sep 17 00:00:00 2001 From: "Michael S. Tsirkin" Date: Thu, 4 Dec 2025 12:25:15 -0500 Subject: [PATCH 0341/1321] tools/virtio: add struct cpumask to cpumask.h Add struct cpumask stub used by virtio_config.h. Created using Cursor CLI. Message-ID: Acked-by: Jason Wang Signed-off-by: Michael S. Tsirkin --- tools/virtio/linux/cpumask.h | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/tools/virtio/linux/cpumask.h b/tools/virtio/linux/cpumask.h index 307da69d6b26c..38ffc00e149d3 100644 --- a/tools/virtio/linux/cpumask.h +++ b/tools/virtio/linux/cpumask.h @@ -4,4 +4,8 @@ #include +struct cpumask { + unsigned long bits[1]; +}; + #endif /* _LINUX_CPUMASK_H */ From 63b9c3fce5800821f454a2b5bf43947b4a5e7d9e Mon Sep 17 00:00:00 2001 From: "Michael S. Tsirkin" Date: Thu, 4 Dec 2025 12:25:17 -0500 Subject: [PATCH 0342/1321] tools/virtio: stub might_sleep and synchronize_rcu Add might_sleep() and synchronize_rcu() stubs needed by virtio_config.h. might_sleep() is a no-op, synchronize_rcu doesn't work but we don't need it to. Created using Cursor CLI. Message-ID: <5557e026335d808acd7b890693ee1382e73dd33a.1764873799.git.mst@redhat.com> Acked-by: Jason Wang Signed-off-by: Michael S. Tsirkin --- tools/virtio/linux/kernel.h | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/tools/virtio/linux/kernel.h b/tools/virtio/linux/kernel.h index d7fc70b68a2b7..416d02703f614 100644 --- a/tools/virtio/linux/kernel.h +++ b/tools/virtio/linux/kernel.h @@ -144,6 +144,13 @@ static inline bool is_vmalloc_addr(const void *x) return false; } +#define might_sleep() do { } while (0) + +static inline void synchronize_rcu(void) +{ + assert(0); +} + #define min(x, y) ({ \ typeof(x) _min1 = (x); \ typeof(y) _min2 = (y); \ From c8f310817e23b10a51ea913b6e8dcca003c160b1 Mon Sep 17 00:00:00 2001 From: "Michael S. Tsirkin" Date: Thu, 4 Dec 2025 12:22:38 -0500 Subject: [PATCH 0343/1321] tools/virtio: switch to kernel's virtio_config.h Drops stubs in virtio_config.h, use the kernel's version instead - we are now activly developing it, so the stub became too hard to maintain. Message-ID: <8e5c85dc8aad001f161f7e2d8799ffbccfc31381.1764873799.git.mst@redhat.com> Acked-by: Jason Wang Signed-off-by: Michael S. Tsirkin --- tools/virtio/linux/virtio_config.h | 102 +---------------------------- 1 file changed, 1 insertion(+), 101 deletions(-) diff --git a/tools/virtio/linux/virtio_config.h b/tools/virtio/linux/virtio_config.h index 42a564f22f2d1..a0cd3f9a31114 100644 --- a/tools/virtio/linux/virtio_config.h +++ b/tools/virtio/linux/virtio_config.h @@ -1,101 +1 @@ -/* SPDX-License-Identifier: GPL-2.0 */ -#ifndef LINUX_VIRTIO_CONFIG_H -#define LINUX_VIRTIO_CONFIG_H -#include -#include -#include - -struct virtio_config_ops { - int (*disable_vq_and_reset)(struct virtqueue *vq); - int (*enable_vq_after_reset)(struct virtqueue *vq); -}; - -/* - * __virtio_test_bit - helper to test feature bits. For use by transports. - * Devices should normally use virtio_has_feature, - * which includes more checks. - * @vdev: the device - * @fbit: the feature bit - */ -static inline bool __virtio_test_bit(const struct virtio_device *vdev, - unsigned int fbit) -{ - return vdev->features & (1ULL << fbit); -} - -/** - * __virtio_set_bit - helper to set feature bits. For use by transports. - * @vdev: the device - * @fbit: the feature bit - */ -static inline void __virtio_set_bit(struct virtio_device *vdev, - unsigned int fbit) -{ - vdev->features |= (1ULL << fbit); -} - -/** - * __virtio_clear_bit - helper to clear feature bits. For use by transports. - * @vdev: the device - * @fbit: the feature bit - */ -static inline void __virtio_clear_bit(struct virtio_device *vdev, - unsigned int fbit) -{ - vdev->features &= ~(1ULL << fbit); -} - -#define virtio_has_feature(dev, feature) \ - (__virtio_test_bit((dev), feature)) - -/** - * virtio_has_dma_quirk - determine whether this device has the DMA quirk - * @vdev: the device - */ -static inline bool virtio_has_dma_quirk(const struct virtio_device *vdev) -{ - /* - * Note the reverse polarity of the quirk feature (compared to most - * other features), this is for compatibility with legacy systems. - */ - return !virtio_has_feature(vdev, VIRTIO_F_ACCESS_PLATFORM); -} - -static inline bool virtio_is_little_endian(struct virtio_device *vdev) -{ - return virtio_has_feature(vdev, VIRTIO_F_VERSION_1) || - virtio_legacy_is_little_endian(); -} - -/* Memory accessors */ -static inline u16 virtio16_to_cpu(struct virtio_device *vdev, __virtio16 val) -{ - return __virtio16_to_cpu(virtio_is_little_endian(vdev), val); -} - -static inline __virtio16 cpu_to_virtio16(struct virtio_device *vdev, u16 val) -{ - return __cpu_to_virtio16(virtio_is_little_endian(vdev), val); -} - -static inline u32 virtio32_to_cpu(struct virtio_device *vdev, __virtio32 val) -{ - return __virtio32_to_cpu(virtio_is_little_endian(vdev), val); -} - -static inline __virtio32 cpu_to_virtio32(struct virtio_device *vdev, u32 val) -{ - return __cpu_to_virtio32(virtio_is_little_endian(vdev), val); -} - -static inline u64 virtio64_to_cpu(struct virtio_device *vdev, __virtio64 val) -{ - return __virtio64_to_cpu(virtio_is_little_endian(vdev), val); -} - -static inline __virtio64 cpu_to_virtio64(struct virtio_device *vdev, u64 val) -{ - return __cpu_to_virtio64(virtio_is_little_endian(vdev), val); -} - -#endif +#include "../../include/linux/virtio_config.h" From 89fe8e06a820d0c7c5456e136bb7d2c2c6b0ca85 Mon Sep 17 00:00:00 2001 From: "Michael S. Tsirkin" Date: Thu, 4 Dec 2025 12:49:34 -0500 Subject: [PATCH 0344/1321] virtio_features: make it self-contained virtio_features.h uses WARN_ON_ONCE and memset so it must include linux/bug.h and linux/string.h Message-ID: <579986aa9b8d023844990d2a0e267382f8ad85d5.1764873799.git.mst@redhat.com> Acked-by: Jason Wang Signed-off-by: Michael S. Tsirkin --- include/linux/virtio_features.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/include/linux/virtio_features.h b/include/linux/virtio_features.h index ea2ad8717882e..ce59ea91f474a 100644 --- a/include/linux/virtio_features.h +++ b/include/linux/virtio_features.h @@ -3,6 +3,8 @@ #define _LINUX_VIRTIO_FEATURES_H #include +#include +#include #define VIRTIO_FEATURES_U64S 2 #define VIRTIO_FEATURES_BITS (VIRTIO_FEATURES_U64S * 64) From b3dbe2fc993180f1a4ea9b4b109b771d5f9c4504 Mon Sep 17 00:00:00 2001 From: "Michael S. Tsirkin" Date: Thu, 4 Dec 2025 12:55:11 -0500 Subject: [PATCH 0345/1321] tools/virtio: fix up oot build oot build tends to help uncover bugs so it's worth keeping around, as long as it's low effort. add stubs for a couple of macros virtio gained recently, and disable vdpa in the test build. Message-ID: <33968faa7994b86d1f78057358a50b8f460c7a23.1764873799.git.mst@redhat.com> Acked-by: Jason Wang Signed-off-by: Michael S. Tsirkin --- tools/virtio/Makefile | 5 +++-- tools/virtio/oot-stubs.h | 10 ++++++++++ 2 files changed, 13 insertions(+), 2 deletions(-) create mode 100644 tools/virtio/oot-stubs.h diff --git a/tools/virtio/Makefile b/tools/virtio/Makefile index a60316211df6b..2cac5fd4b9794 100644 --- a/tools/virtio/Makefile +++ b/tools/virtio/Makefile @@ -38,8 +38,9 @@ OOT_CONFIGS=\ CONFIG_VHOST_NET=n \ CONFIG_VHOST_SCSI=n \ CONFIG_VHOST_VSOCK=n \ - CONFIG_VHOST_RING=n -OOT_BUILD=KCFLAGS="-I "${OOT_VHOST} ${MAKE} -C ${OOT_KSRC} V=${V} + CONFIG_VHOST_RING=n \ + CONFIG_VHOST_VDPA=n +OOT_BUILD=KCFLAGS="-include "`pwd`"/oot-stubs.h -I "${OOT_VHOST} ${MAKE} -C ${OOT_KSRC} V=${V} oot-build: echo "UNSUPPORTED! Don't use the resulting modules in production!" ${OOT_BUILD} M=`pwd`/vhost_test diff --git a/tools/virtio/oot-stubs.h b/tools/virtio/oot-stubs.h new file mode 100644 index 0000000000000..69e059cd14d68 --- /dev/null +++ b/tools/virtio/oot-stubs.h @@ -0,0 +1,10 @@ +#include +#include +#include + +#ifndef VIRTIO_FEATURES_BITS +#define VIRTIO_FEATURES_BITS 128 +#endif +#ifndef VIRTIO_U64 +#define VIRTIO_U64(b) ((b) >> 6) +#endif From cbcb7eb6add41ed33395e98a748121c383e0b488 Mon Sep 17 00:00:00 2001 From: "Michael S. Tsirkin" Date: Thu, 4 Dec 2025 13:37:07 -0500 Subject: [PATCH 0346/1321] tools/virtio: add device, device_driver stubs Add stubs needed by virtio.h Message-ID: <0fabf13f6ea812ebc73b1c919fb17d4dec1545db.1764873799.git.mst@redhat.com> Acked-by: Jason Wang Signed-off-by: Michael S. Tsirkin --- tools/virtio/linux/device.h | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/tools/virtio/linux/device.h b/tools/virtio/linux/device.h index 4ad7e1df0db5a..075c2140d975a 100644 --- a/tools/virtio/linux/device.h +++ b/tools/virtio/linux/device.h @@ -1,2 +1,10 @@ #ifndef LINUX_DEVICE_H + +struct device { + void *parent; +}; + +struct device_driver { + const char *name; +}; #endif From ec8752bdf1114c8fdb1c6f3ba599859c7130b7e2 Mon Sep 17 00:00:00 2001 From: Stefano Garzarella Date: Wed, 26 Nov 2025 14:38:26 +0100 Subject: [PATCH 0347/1321] vhost/vsock: improve RCU read sections around vhost_vsock_get() vhost_vsock_get() uses hash_for_each_possible_rcu() to find the `vhost_vsock` associated with the `guest_cid`. hash_for_each_possible_rcu() should only be called within an RCU read section, as mentioned in the following comment in include/linux/rculist.h: /** * hlist_for_each_entry_rcu - iterate over rcu list of given type * @pos: the type * to use as a loop cursor. * @head: the head for your list. * @member: the name of the hlist_node within the struct. * @cond: optional lockdep expression if called from non-RCU protection. * * This list-traversal primitive may safely run concurrently with * the _rcu list-mutation primitives such as hlist_add_head_rcu() * as long as the traversal is guarded by rcu_read_lock(). */ Currently, all calls to vhost_vsock_get() are between rcu_read_lock() and rcu_read_unlock() except for calls in vhost_vsock_set_cid() and vhost_vsock_reset_orphans(). In both cases, the current code is safe, but we can make improvements to make it more robust. About vhost_vsock_set_cid(), when building the kernel with CONFIG_PROVE_RCU_LIST enabled, we get the following RCU warning when the user space issues `ioctl(dev, VHOST_VSOCK_SET_GUEST_CID, ...)` : WARNING: suspicious RCU usage 6.18.0-rc7 #62 Not tainted ----------------------------- drivers/vhost/vsock.c:74 RCU-list traversed in non-reader section!! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 1 lock held by rpc-libvirtd/3443: #0: ffffffffc05032a8 (vhost_vsock_mutex){+.+.}-{4:4}, at: vhost_vsock_dev_ioctl+0x2ff/0x530 [vhost_vsock] stack backtrace: CPU: 2 UID: 0 PID: 3443 Comm: rpc-libvirtd Not tainted 6.18.0-rc7 #62 PREEMPT(none) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-7.fc42 06/10/2025 Call Trace: dump_stack_lvl+0x75/0xb0 dump_stack+0x14/0x1a lockdep_rcu_suspicious.cold+0x4e/0x97 vhost_vsock_get+0x8f/0xa0 [vhost_vsock] vhost_vsock_dev_ioctl+0x307/0x530 [vhost_vsock] __x64_sys_ioctl+0x4f2/0xa00 x64_sys_call+0xed0/0x1da0 do_syscall_64+0x73/0xfa0 entry_SYSCALL_64_after_hwframe+0x76/0x7e ... This is not a real problem, because the vhost_vsock_get() caller, i.e. vhost_vsock_set_cid(), holds the `vhost_vsock_mutex` used by the hash table writers. Anyway, to prevent that warning, add lockdep_is_held() condition to hash_for_each_possible_rcu() to verify that either the caller is in an RCU read section or `vhost_vsock_mutex` is held when CONFIG_PROVE_RCU_LIST is enabled; and also clarify the comment for vhost_vsock_get() to better describe the locking requirements and the scope of the returned pointer validity. About vhost_vsock_reset_orphans(), currently this function is only called via vsock_for_each_connected_socket(), which holds the `vsock_table_lock` spinlock (which is also an RCU read-side critical section). However, add an explicit RCU read lock there to make the code more robust and explicit about the RCU requirements, and to prevent issues if the calling context changes in the future or if vhost_vsock_reset_orphans() is called from other contexts. Fixes: 834e772c8db0 ("vhost/vsock: fix use-after-free in network stack callers") Cc: stefanha@redhat.com Signed-off-by: Stefano Garzarella Reviewed-by: Stefan Hajnoczi Message-Id: <20251126133826.142496-1-sgarzare@redhat.com> Message-ID: <20251126210313.GA499503@fedora> Acked-by: Jason Wang Signed-off-by: Michael S. Tsirkin --- drivers/vhost/vsock.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c index 0298ddc348242..552cfb53498ad 100644 --- a/drivers/vhost/vsock.c +++ b/drivers/vhost/vsock.c @@ -66,14 +66,15 @@ static u32 vhost_transport_get_local_cid(void) return VHOST_VSOCK_DEFAULT_HOST_CID; } -/* Callers that dereference the return value must hold vhost_vsock_mutex or the - * RCU read lock. +/* Callers must be in an RCU read section or hold the vhost_vsock_mutex. + * The return value can only be dereferenced while within the section. */ static struct vhost_vsock *vhost_vsock_get(u32 guest_cid) { struct vhost_vsock *vsock; - hash_for_each_possible_rcu(vhost_vsock_hash, vsock, hash, guest_cid) { + hash_for_each_possible_rcu(vhost_vsock_hash, vsock, hash, guest_cid, + lockdep_is_held(&vhost_vsock_mutex)) { u32 other_cid = vsock->guest_cid; /* Skip instances that have no CID yet */ @@ -709,9 +710,15 @@ static void vhost_vsock_reset_orphans(struct sock *sk) * executing. */ + rcu_read_lock(); + /* If the peer is still valid, no need to reset connection */ - if (vhost_vsock_get(vsk->remote_addr.svm_cid)) + if (vhost_vsock_get(vsk->remote_addr.svm_cid)) { + rcu_read_unlock(); return; + } + + rcu_read_unlock(); /* If the close timeout is pending, let it expire. This avoids races * with the timeout callback. From 06a0f9d953e9bc92d54cbfa7de875aed3a1f8a47 Mon Sep 17 00:00:00 2001 From: Prithvi Tambewagh Date: Thu, 25 Dec 2025 12:58:29 +0530 Subject: [PATCH 0348/1321] io_uring: fix filename leak in __io_openat_prep() __io_openat_prep() allocates a struct filename using getname(). However, for the condition of the file being installed in the fixed file table as well as having O_CLOEXEC flag set, the function returns early. At that point, the request doesn't have REQ_F_NEED_CLEANUP flag set. Due to this, the memory for the newly allocated struct filename is not cleaned up, causing a memory leak. Fix this by setting the REQ_F_NEED_CLEANUP for the request just after the successful getname() call, so that when the request is torn down, the filename will be cleaned up, along with other resources needing cleanup. Reported-by: syzbot+00e61c43eb5e4740438f@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=00e61c43eb5e4740438f Tested-by: syzbot+00e61c43eb5e4740438f@syzkaller.appspotmail.com Cc: stable@vger.kernel.org Signed-off-by: Prithvi Tambewagh Fixes: b9445598d8c6 ("io_uring: openat directly into fixed fd table") Signed-off-by: Jens Axboe --- io_uring/openclose.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/io_uring/openclose.c b/io_uring/openclose.c index bfeb91b31bba5..15dde9bd6ff67 100644 --- a/io_uring/openclose.c +++ b/io_uring/openclose.c @@ -73,13 +73,13 @@ static int __io_openat_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe open->filename = NULL; return ret; } + req->flags |= REQ_F_NEED_CLEANUP; open->file_slot = READ_ONCE(sqe->file_index); if (open->file_slot && (open->how.flags & O_CLOEXEC)) return -EINVAL; open->nofile = rlimit(RLIMIT_NOFILE); - req->flags |= REQ_F_NEED_CLEANUP; if (io_openat_force_async(open)) req->flags |= REQ_F_FORCE_ASYNC; return 0; From 2b0c0ca95415dd9883b6d7185abc1a1028584ed3 Mon Sep 17 00:00:00 2001 From: Caleb Sander Mateos Date: Fri, 12 Dec 2025 17:19:49 -0700 Subject: [PATCH 0349/1321] ublk: clean up user copy references on ublk server exit If a ublk server process releases a ublk char device file, any requests dispatched to the ublk server but not yet completed will retain a ref value of UBLK_REFCOUNT_INIT. Before commit e63d2228ef83 ("ublk: simplify aborting ublk request"), __ublk_fail_req() would decrement the reference count before completing the failed request. However, that commit optimized __ublk_fail_req() to call __ublk_complete_rq() directly without decrementing the request reference count. The leaked reference count incorrectly allows user copy and zero copy operations on the completed ublk request. It also triggers the WARN_ON_ONCE(refcount_read(&io->ref)) warnings in ublk_queue_reinit() and ublk_deinit_queue(). Commit c5c5eb24ed61 ("ublk: avoid ublk_io_release() called after ublk char dev is closed") already fixed the issue for ublk devices using UBLK_F_SUPPORT_ZERO_COPY or UBLK_F_AUTO_BUF_REG. However, the reference count leak also affects UBLK_F_USER_COPY, the other reference-counted data copy mode. Fix the condition in ublk_check_and_reset_active_ref() to include all reference-counted data copy modes. This ensures that any ublk requests still owned by the ublk server when it exits have their reference counts reset to 0. Signed-off-by: Caleb Sander Mateos Fixes: e63d2228ef83 ("ublk: simplify aborting ublk request") Reviewed-by: Ming Lei Signed-off-by: Jens Axboe --- drivers/block/ublk_drv.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c index cfd2132410dd7..49c2084571981 100644 --- a/drivers/block/ublk_drv.c +++ b/drivers/block/ublk_drv.c @@ -1607,8 +1607,7 @@ static bool ublk_check_and_reset_active_ref(struct ublk_device *ub) { int i, j; - if (!(ub->dev_info.flags & (UBLK_F_SUPPORT_ZERO_COPY | - UBLK_F_AUTO_BUF_REG))) + if (!ublk_dev_need_req_ref(ub)) return false; for (i = 0; i < ub->dev_info.nr_hw_queues; i++) { From 0894a7a77ab6e4c31284100b2f98db344bc3cdfe Mon Sep 17 00:00:00 2001 From: Dan Carpenter Date: Sat, 20 Dec 2025 11:46:10 +0300 Subject: [PATCH 0350/1321] block: rnbd-clt: Fix signedness bug in init_dev() The "dev->clt_device_id" variable is set using ida_alloc_max() which returns an int and in particular it returns negative error codes. Change the type from u32 to int to fix the error checking. Fixes: c9b5645fd8ca ("block: rnbd-clt: Fix leaked ID in init_dev()") Signed-off-by: Dan Carpenter Signed-off-by: Jens Axboe --- drivers/block/rnbd/rnbd-clt.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/block/rnbd/rnbd-clt.h b/drivers/block/rnbd/rnbd-clt.h index a48e040abe639..fbc1ed766025c 100644 --- a/drivers/block/rnbd/rnbd-clt.h +++ b/drivers/block/rnbd/rnbd-clt.h @@ -112,7 +112,7 @@ struct rnbd_clt_dev { struct rnbd_queue *hw_queues; u32 device_id; /* local Idr index - used to track minor number allocations. */ - u32 clt_device_id; + int clt_device_id; struct mutex lock; enum rnbd_clt_dev_state dev_state; refcount_t refcount; From 194cecaaf0af89902db2c81b1f0713d95cfb649c Mon Sep 17 00:00:00 2001 From: Hans de Goede Date: Tue, 23 Dec 2025 11:10:46 +0100 Subject: [PATCH 0351/1321] efi/libstub: gop: Fix EDID support in mixed-mode The efi_edid_discovered_protocol and efi_edid_active_protocol have mixed mode fields. So all their attributes should be accessed through the efi_table_attr() helper. Doing so fixes the upper 32 bits of the 64 bit gop_edid pointer getting set to random values (followed by a crash at boot) when booting a x86_64 kernel on a machine with 32 bit UEFI like the Asus T100TA. Fixes: 17029cdd8f9d ("efi/libstub: gop: Add support for reading EDID") Cc: Thomas Zimmermann Cc: Javier Martinez Canillas Signed-off-by: Hans de Goede Signed-off-by: Ard Biesheuvel --- drivers/firmware/efi/libstub/gop.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/firmware/efi/libstub/gop.c b/drivers/firmware/efi/libstub/gop.c index 72d74436a7a44..80dc8cfeb33e9 100644 --- a/drivers/firmware/efi/libstub/gop.c +++ b/drivers/firmware/efi/libstub/gop.c @@ -513,15 +513,15 @@ efi_status_t efi_setup_graphics(struct screen_info *si, struct edid_info *edid) status = efi_bs_call(handle_protocol, handle, &EFI_EDID_ACTIVE_PROTOCOL_GUID, (void **)&active_edid); if (status == EFI_SUCCESS) { - gop_size_of_edid = active_edid->size_of_edid; - gop_edid = active_edid->edid; + gop_size_of_edid = efi_table_attr(active_edid, size_of_edid); + gop_edid = efi_table_attr(active_edid, edid); } else { status = efi_bs_call(handle_protocol, handle, &EFI_EDID_DISCOVERED_PROTOCOL_GUID, (void **)&discovered_edid); if (status == EFI_SUCCESS) { - gop_size_of_edid = discovered_edid->size_of_edid; - gop_edid = discovered_edid->edid; + gop_size_of_edid = efi_table_attr(discovered_edid, size_of_edid); + gop_edid = efi_table_attr(discovered_edid, edid); } } From 7237407d985d9ec4f95c730a04a303334abc12a3 Mon Sep 17 00:00:00 2001 From: Breno Leitao Date: Tue, 23 Dec 2025 02:55:43 -0800 Subject: [PATCH 0352/1321] arm64: efi: Fix NULL pointer dereference by initializing user_ns Linux 6.19-rc2 (9448598b22c5 ("Linux 6.19-rc2")) is crashing with a NULL pointer dereference on arm64 hosts: Unable to handle kernel NULL pointer dereference at virtual address 00000000000000c8 pc : cap_capable (security/commoncap.c:82 security/commoncap.c:128) Call trace: cap_capable (security/commoncap.c:82 security/commoncap.c:128) (P) security_capable (security/security.c:?) ns_capable_noaudit (kernel/capability.c:342 kernel/capability.c:381) __ptrace_may_access (./include/linux/rcupdate.h:895 kernel/ptrace.c:326) ptrace_may_access (kernel/ptrace.c:353) do_task_stat (fs/proc/array.c:467) proc_tgid_stat (fs/proc/array.c:673) proc_single_show (fs/proc/base.c:803) I've bissected the problem to commit a5baf582f4c0 ("arm64/efi: Call EFI runtime services without disabling preemption"). >From my analyzes, the crash occurs because efi_mm lacks a user_ns field initialization. This was previously harmless, but commit a5baf582f4c0 ("arm64/efi: Call EFI runtime services without disabling preemption") changed the EFI runtime call path to use kthread_use_mm(&efi_mm), which temporarily adopts efi_mm as the current mm for the calling kthread. When a thread has an active mm, LSM hooks like cap_capable() expect mm->user_ns to be valid for credential checks. With efi_mm.user_ns being NULL, capability checks during possible /proc access dereference the NULL pointer and crash. Fix by initializing efi_mm.user_ns to &init_user_ns. Fixes: a5baf582f4c0 ("arm64/efi: Call EFI runtime services without disabling preemption") Signed-off-by: Breno Leitao Acked-by: Rik van Riel Signed-off-by: Ard Biesheuvel --- drivers/firmware/efi/efi.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c index a9070d00b833f..55452e61af31d 100644 --- a/drivers/firmware/efi/efi.c +++ b/drivers/firmware/efi/efi.c @@ -73,6 +73,7 @@ struct mm_struct efi_mm = { MMAP_LOCK_INITIALIZER(efi_mm) .page_table_lock = __SPIN_LOCK_UNLOCKED(efi_mm.page_table_lock), .mmlist = LIST_HEAD_INIT(efi_mm.mmlist), + .user_ns = &init_user_ns, .cpu_bitmap = { [BITS_TO_LONGS(NR_CPUS)] = 0}, #ifdef CONFIG_SCHED_MM_CID .mm_cid.lock = __RAW_SPIN_LOCK_UNLOCKED(efi_mm.mm_cid.lock), From db1eb7450332b5bbcaee667e9bafbb3cfc5405ff Mon Sep 17 00:00:00 2001 From: Breno Leitao Date: Tue, 23 Dec 2025 02:55:44 -0800 Subject: [PATCH 0353/1321] kthread: Warn if mm_struct lacks user_ns in kthread_use_mm() Add a WARN_ON_ONCE() check to detect mm_struct instances that are missing user_ns initialization when passed to kthread_use_mm(). When a kthread adopts an mm via kthread_use_mm(), LSM hooks and capability checks may access current->mm->user_ns for credential validation. If user_ns is NULL, this leads to a NULL pointer dereference crash. This was observed with efi_mm on arm64, where commit a5baf582f4c0 ("arm64/efi: Call EFI runtime services without disabling preemption") introduced kthread_use_mm(&efi_mm), but efi_mm lacked user_ns initialization, causing crashes during /proc access. Adding this warning helps catch similar bugs early during development rather than waiting for hard-to-debug NULL pointer crashes in production. Signed-off-by: Breno Leitao Acked-by: Rik van Riel Signed-off-by: Ard Biesheuvel --- kernel/kthread.c | 1 + 1 file changed, 1 insertion(+) diff --git a/kernel/kthread.c b/kernel/kthread.c index 99a3808d086f0..39511dd2abc97 100644 --- a/kernel/kthread.c +++ b/kernel/kthread.c @@ -1599,6 +1599,7 @@ void kthread_use_mm(struct mm_struct *mm) WARN_ON_ONCE(!(tsk->flags & PF_KTHREAD)); WARN_ON_ONCE(tsk->mm); + WARN_ON_ONCE(!mm->user_ns); /* * It is possible for mm to be the same as tsk->active_mm, but From 63b26af12292cd1c270a7ff9fa8dc52afdc424aa Mon Sep 17 00:00:00 2001 From: FUJITA Tomonori Date: Fri, 5 Dec 2025 01:06:39 +0900 Subject: [PATCH 0354/1321] rust: dma: add helpers for architectures without CONFIG_HAS_DMA Add dma_set_mask(), dma_set_coherent_mask(), dma_map_sgtable(), and dma_max_mapping_size() helpers to fix a build error when CONFIG_HAS_DMA is not enabled. Note that when CONFIG_HAS_DMA is enabled, they are included in both bindings_generated.rs and bindings_helpers_generated.rs. The former takes precedence so behavior remains unchanged in that case. This fixes the following build error on UML: error[E0425]: cannot find function `dma_set_mask` in crate `bindings` --> rust/kernel/dma.rs:46:38 | 46 | to_result(unsafe { bindings::dma_set_mask(self.as_ref().as_raw(), mask.value()) }) | ^^^^^^^^^^^^ help: a function with a similar name exists: `xa_set_mark` | ::: rust/bindings/bindings_generated.rs:24690:5 | 24690 | pub fn xa_set_mark(arg1: *mut xarray, index: ffi::c_ulong, arg2: xa_mark_t); | ---------------------------------------------------------------------------- similarly named function `xa_set_mark` defined here error[E0425]: cannot find function `dma_set_coherent_mask` in crate `bindings` --> rust/kernel/dma.rs:63:38 | 63 | to_result(unsafe { bindings::dma_set_coherent_mask(self.as_ref().as_raw(), mask.value()) }) | ^^^^^^^^^^^^^^^^^^^^^ help: a function with a similar name exists: `dma_coherent_ok` | ::: rust/bindings/bindings_generated.rs:52745:5 | 52745 | pub fn dma_coherent_ok(dev: *mut device, phys: phys_addr_t, size: usize) -> bool_; | ---------------------------------------------------------------------------------- similarly named function `dma_coherent_ok` defined here error[E0425]: cannot find function `dma_map_sgtable` in crate `bindings` --> rust/kernel/scatterlist.rs:212:23 | 212 | bindings::dma_map_sgtable(dev.as_raw(), sgt.as_ptr(), dir.into(), 0) | ^^^^^^^^^^^^^^^ help: a function with a similar name exists: `dma_unmap_sgtable` | ::: rust/bindings/bindings_helpers_generated.rs:1351:5 | 1351 | / pub fn dma_unmap_sgtable( 1352 | | dev: *mut device, 1353 | | sgt: *mut sg_table, 1354 | | dir: dma_data_direction, 1355 | | attrs: ffi::c_ulong, 1356 | | ); | |______- similarly named function `dma_unmap_sgtable` defined here error[E0425]: cannot find function `dma_max_mapping_size` in crate `bindings` --> rust/kernel/scatterlist.rs:356:52 | 356 | let max_segment = match unsafe { bindings::dma_max_mapping_size(dev.as_raw()) } { | ^^^^^^^^^^^^^^^^^^^^ not found in `bindings` error: aborting due to 4 previous errors Cc: stable@vger.kernel.org # v6.17+ Fixes: 101d66828a4ee ("rust: dma: add DMA addressing capabilities") Signed-off-by: FUJITA Tomonori Reviewed-by: David Gow Reviewed-by: Alice Ryhl Link: https://patch.msgid.link/20251204160639.364936-1-fujita.tomonori@gmail.com [ Use relative paths in the error splat; add 'dma' prefix. - Danilo ] Signed-off-by: Danilo Krummrich --- rust/helpers/dma.c | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/rust/helpers/dma.c b/rust/helpers/dma.c index 6e741c1972425..2afa32c21c946 100644 --- a/rust/helpers/dma.c +++ b/rust/helpers/dma.c @@ -19,3 +19,24 @@ int rust_helper_dma_set_mask_and_coherent(struct device *dev, u64 mask) { return dma_set_mask_and_coherent(dev, mask); } + +int rust_helper_dma_set_mask(struct device *dev, u64 mask) +{ + return dma_set_mask(dev, mask); +} + +int rust_helper_dma_set_coherent_mask(struct device *dev, u64 mask) +{ + return dma_set_coherent_mask(dev, mask); +} + +int rust_helper_dma_map_sgtable(struct device *dev, struct sg_table *sgt, + enum dma_data_direction dir, unsigned long attrs) +{ + return dma_map_sgtable(dev, sgt, dir, attrs); +} + +size_t rust_helper_dma_max_mapping_size(struct device *dev) +{ + return dma_max_mapping_size(dev); +} From a59114681c39d474bd26b61a57f6fd86994a124b Mon Sep 17 00:00:00 2001 From: Marko Turk Date: Wed, 10 Dec 2025 12:25:51 +0100 Subject: [PATCH 0355/1321] samples: rust: fix endianness issue in rust_driver_pci MMIO backend of PCI Bar always assumes little-endian devices and will convert to CPU endianness automatically. Remove the u32::from_le conversion which would cause a bug on big-endian machines. Cc: stable@vger.kernel.org Reviewed-by: Dirk Behme Signed-off-by: Marko Turk Fixes: 685376d18e9a ("samples: rust: add Rust PCI sample driver") Link: https://patch.msgid.link/20251210112503.62925-2-mt@markoturk.info Signed-off-by: Danilo Krummrich --- samples/rust/rust_driver_pci.rs | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/samples/rust/rust_driver_pci.rs b/samples/rust/rust_driver_pci.rs index 5823787bea8ec..fa677991a5c45 100644 --- a/samples/rust/rust_driver_pci.rs +++ b/samples/rust/rust_driver_pci.rs @@ -48,7 +48,7 @@ impl SampleDriver { // Select the test. bar.write8(index.0, Regs::TEST); - let offset = u32::from_le(bar.read32(Regs::OFFSET)) as usize; + let offset = bar.read32(Regs::OFFSET) as usize; let data = bar.read8(Regs::DATA); // Write `data` to `offset` to increase `count` by one. From 1e462db0ba60ea9838b2e320a35516cfe6978a2f Mon Sep 17 00:00:00 2001 From: Kuniyuki Iwashima Date: Mon, 8 Dec 2025 09:45:45 +0000 Subject: [PATCH 0356/1321] debugfs: Fix memleak in debugfs_change_name(). syzbot reported memleak in debugfs_change_name(). [0] When lookup_noperm_unlocked() fails, new_name is leaked. Let's fix it by reusing kfree_const() at the end of debugfs_change_name(). [0]: BUG: memory leak unreferenced object 0xffff8881110bb308 (size 8): comm "syz.0.17", pid 6090, jiffies 4294942958 hex dump (first 8 bytes): 2e 00 00 00 00 00 00 00 ........ backtrace (crc ecfc7064): kmemleak_alloc_recursive include/linux/kmemleak.h:44 [inline] slab_post_alloc_hook mm/slub.c:4953 [inline] slab_alloc_node mm/slub.c:5258 [inline] __do_kmalloc_node mm/slub.c:5651 [inline] __kmalloc_node_track_caller_noprof+0x3b2/0x670 mm/slub.c:5759 __kmemdup_nul mm/util.c:64 [inline] kstrdup+0x3c/0x80 mm/util.c:84 kstrdup_const+0x63/0x80 mm/util.c:104 kvasprintf_const+0xca/0x110 lib/kasprintf.c:48 debugfs_change_name+0xf6/0x5d0 fs/debugfs/inode.c:854 cfg80211_dev_rename+0xd8/0x110 net/wireless/core.c:149 nl80211_set_wiphy+0x102/0x1770 net/wireless/nl80211.c:3844 genl_family_rcv_msg_doit+0x11e/0x190 net/netlink/genetlink.c:1115 genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline] genl_rcv_msg+0x2fd/0x440 net/netlink/genetlink.c:1210 netlink_rcv_skb+0x93/0x1d0 net/netlink/af_netlink.c:2550 genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219 netlink_unicast_kernel net/netlink/af_netlink.c:1318 [inline] netlink_unicast+0x3a3/0x4f0 net/netlink/af_netlink.c:1344 netlink_sendmsg+0x335/0x6b0 net/netlink/af_netlink.c:1894 sock_sendmsg_nosec net/socket.c:718 [inline] __sock_sendmsg net/socket.c:733 [inline] ____sys_sendmsg+0x562/0x5a0 net/socket.c:2608 ___sys_sendmsg+0xc8/0x130 net/socket.c:2662 __sys_sendmsg+0xc7/0x140 net/socket.c:2694 Fixes: 833d2b3a072f7 ("Add start_renaming_two_dentries()") Reported-by: syzbot+3d7ca9c802c547f8550a@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/69369d82.a70a0220.38f243.009f.GAE@google.com/ Signed-off-by: Kuniyuki Iwashima Link: https://patch.msgid.link/20251208094551.46184-1-kuniyu@google.com [ Fix minor typo in commit message. - Danilo ] Signed-off-by: Danilo Krummrich --- fs/debugfs/inode.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/fs/debugfs/inode.c b/fs/debugfs/inode.c index 4b263c328ed29..4005d21cf009c 100644 --- a/fs/debugfs/inode.c +++ b/fs/debugfs/inode.c @@ -841,8 +841,10 @@ int __printf(2, 3) debugfs_change_name(struct dentry *dentry, const char *fmt, . rd.new_parent = rd.old_parent; rd.flags = RENAME_NOREPLACE; target = lookup_noperm_unlocked(&QSTR(new_name), rd.new_parent); - if (IS_ERR(target)) - return PTR_ERR(target); + if (IS_ERR(target)) { + error = PTR_ERR(target); + goto out_free; + } error = start_renaming_two_dentries(&rd, dentry, target); if (error) { @@ -862,6 +864,7 @@ int __printf(2, 3) debugfs_change_name(struct dentry *dentry, const char *fmt, . out: dput(rd.old_parent); dput(target); +out_free: kfree_const(new_name); return error; } From 3a1c524d50037c890cc2d462f158a1951ebc19d6 Mon Sep 17 00:00:00 2001 From: Sakari Ailus Date: Fri, 19 Dec 2025 10:36:38 +0200 Subject: [PATCH 0357/1321] software node: Also support referencing non-constant software nodes Fwnode references are be implemented differently if referenced node is a software node. _Generic() is used to differentiate between the two cases but only const software nodes were present in the selection. Also add non-const software nodes. Reported-by: Kenneth Crudup Closes: https://lore.kernel.org/all/af773b82-bef2-4209-baaf-526d4661b7fc@panix.com/ Fixes: d7cdbbc93c56 ("software node: allow referencing firmware nodes") Signed-off-by: Sakari Ailus Tested-By: Kenneth R. Crudup Tested-by: Mehdi Djait # Dell XPS 9315 Reviewed-by: Mehdi Djait Link: https://patch.msgid.link/20251219083638.2454138-1-sakari.ailus@linux.intel.com Signed-off-by: Danilo Krummrich --- include/linux/property.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/linux/property.h b/include/linux/property.h index 272bfbdea7bf4..e30ef23a9af33 100644 --- a/include/linux/property.h +++ b/include/linux/property.h @@ -371,6 +371,7 @@ struct software_node_ref_args { (const struct software_node_ref_args) { \ .swnode = _Generic(_ref_, \ const struct software_node *: _ref_, \ + struct software_node *: _ref_, \ default: NULL), \ .fwnode = _Generic(_ref_, \ struct fwnode_handle *: _ref_, \ From ade99df39dbb552eadcdb4d6fdc8d87ada66bec2 Mon Sep 17 00:00:00 2001 From: Will Rosenberg Date: Tue, 16 Dec 2025 23:01:07 -0700 Subject: [PATCH 0358/1321] fs/kernfs: null-ptr deref in simple_xattrs_free() There exists a null pointer dereference in simple_xattrs_free() as part of the __kernfs_new_node() routine. Within __kernfs_new_node(), err_out4 calls simple_xattr_free(), but kn->iattr may be NULL if __kernfs_setattr() was never called. As a result, the first argument to simple_xattrs_free() may be NULL + 0x38, and no NULL check is done internally, causing an incorrect pointer dereference. Add a check to ensure kn->iattr is not NULL, meaning __kernfs_setattr() has been called and kn->iattr is allocated. Note that struct kernfs_node kn is allocated with kmem_cache_zalloc, so we can assume kn->iattr will be NULL if not allocated. An alternative fix could be to not call simple_xattrs_free() at all. As was previously discussed during the initial patch, simple_xattrs_free() is not strictly needed and is included to be consistent with kernfs_free_rcu(), which also helps the function maintain correctness if changes are made in __kernfs_new_node(). Reported-by: syzbot+6aaf7f48ae034ab0ea97@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=6aaf7f48ae034ab0ea97 Fixes: 382b1e8f30f7 ("kernfs: fix memory leak of kernfs_iattrs in __kernfs_new_node") Signed-off-by: Will Rosenberg Link: https://patch.msgid.link/20251217060107.4171558-1-whrosenb@asu.edu Signed-off-by: Greg Kroah-Hartman --- fs/kernfs/dir.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c index 5c0efd6b239f6..29baeeb978713 100644 --- a/fs/kernfs/dir.c +++ b/fs/kernfs/dir.c @@ -681,8 +681,10 @@ static struct kernfs_node *__kernfs_new_node(struct kernfs_root *root, return kn; err_out4: - simple_xattrs_free(&kn->iattr->xattrs, NULL); - kmem_cache_free(kernfs_iattrs_cache, kn->iattr); + if (kn->iattr) { + simple_xattrs_free(&kn->iattr->xattrs, NULL); + kmem_cache_free(kernfs_iattrs_cache, kn->iattr); + } err_out3: spin_lock(&root->kernfs_idr_lock); idr_remove(&root->ino_idr, (u32)kernfs_ino(kn)); From 3ee48909d8b3aae6b1b34a4d54e1a440782fc7ad Mon Sep 17 00:00:00 2001 From: Zilin Guan Date: Wed, 24 Dec 2025 15:21:42 +0000 Subject: [PATCH 0359/1321] cifs: Fix memory and information leak in smb3_reconfigure() In smb3_reconfigure(), if smb3_sync_session_ctx_passwords() fails, the function returns immediately without freeing and erasing the newly allocated new_password and new_password2. This causes both a memory leak and a potential information leak. Fix this by calling kfree_sensitive() on both password buffers before returning in this error case. Fixes: 0f0e357902957 ("cifs: during remount, make sure passwords are in sync") Signed-off-by: Zilin Guan Reviewed-by: ChenXiaoSong Signed-off-by: Steve French --- fs/smb/client/fs_context.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/fs/smb/client/fs_context.c b/fs/smb/client/fs_context.c index c2de97e4ad59d..d4291d3a9a485 100644 --- a/fs/smb/client/fs_context.c +++ b/fs/smb/client/fs_context.c @@ -1139,6 +1139,8 @@ static int smb3_reconfigure(struct fs_context *fc) rc = smb3_sync_session_ctx_passwords(cifs_sb, ses); if (rc) { mutex_unlock(&ses->session_mutex); + kfree_sensitive(new_password); + kfree_sensitive(new_password2); return rc; } From eb832061c6fe9a73c3612da5f3598082e69f6b26 Mon Sep 17 00:00:00 2001 From: John Garry Date: Thu, 11 Dec 2025 10:06:51 +0000 Subject: [PATCH 0360/1321] scsi: scsi_debug: Fix atomic write enable module param description The atomic write enable module param is "atomic_wr", and not "atomic_write", so fix the module param description. Fixes: 84f3a3c01d70 ("scsi: scsi_debug: Atomic write support") Signed-off-by: John Garry Reviewed-by: Bart Van Assche Link: https://patch.msgid.link/20251211100651.9056-1-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen --- drivers/scsi/scsi_debug.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c index 1f2a53ba5dd98..c5085e6d2e75a 100644 --- a/drivers/scsi/scsi_debug.c +++ b/drivers/scsi/scsi_debug.c @@ -7459,7 +7459,7 @@ MODULE_PARM_DESC(lbprz, MODULE_PARM_DESC(lbpu, "enable LBP, support UNMAP command (def=0)"); MODULE_PARM_DESC(lbpws, "enable LBP, support WRITE SAME(16) with UNMAP bit (def=0)"); MODULE_PARM_DESC(lbpws10, "enable LBP, support WRITE SAME(10) with UNMAP bit (def=0)"); -MODULE_PARM_DESC(atomic_write, "enable ATOMIC WRITE support, support WRITE ATOMIC(16) (def=0)"); +MODULE_PARM_DESC(atomic_wr, "enable ATOMIC WRITE support, support WRITE ATOMIC(16) (def=0)"); MODULE_PARM_DESC(lowest_aligned, "lowest aligned lba (def=0)"); MODULE_PARM_DESC(lun_format, "LUN format: 0->peripheral (def); 1 --> flat address method"); MODULE_PARM_DESC(max_luns, "number of LUNs per target to simulate(def=1)"); From 1395bd85c47f2adfb91e87099416acd88452662d Mon Sep 17 00:00:00 2001 From: Chandrakanth Patil Date: Thu, 11 Dec 2025 05:59:29 +0530 Subject: [PATCH 0361/1321] scsi: mpi3mr: Read missing IOCFacts flag for reply queue full overflow The driver was not reading the MAX_REQ_PER_REPLY_QUEUE_LIMIT IOCFacts flag, so the reply-queue-full handling was never enabled, even on firmware that supports it. Reading this flag enables the feature and prevents reply queue overflow. Fixes: f08b24d82749 ("scsi: mpi3mr: Avoid reply queue full condition") Cc: stable@vger.kernel.org Signed-off-by: Chandrakanth Patil Link: https://patch.msgid.link/20251211002929.22071-1-chandrakanth.patil@broadcom.com Signed-off-by: Martin K. Petersen --- drivers/scsi/mpi3mr/mpi/mpi30_ioc.h | 1 + drivers/scsi/mpi3mr/mpi3mr_fw.c | 2 ++ 2 files changed, 3 insertions(+) diff --git a/drivers/scsi/mpi3mr/mpi/mpi30_ioc.h b/drivers/scsi/mpi3mr/mpi/mpi30_ioc.h index b42933fcd4233..6561f98c3cb2d 100644 --- a/drivers/scsi/mpi3mr/mpi/mpi30_ioc.h +++ b/drivers/scsi/mpi3mr/mpi/mpi30_ioc.h @@ -166,6 +166,7 @@ struct mpi3_ioc_facts_data { #define MPI3_IOCFACTS_FLAGS_SIGNED_NVDATA_REQUIRED (0x00010000) #define MPI3_IOCFACTS_FLAGS_DMA_ADDRESS_WIDTH_MASK (0x0000ff00) #define MPI3_IOCFACTS_FLAGS_DMA_ADDRESS_WIDTH_SHIFT (8) +#define MPI3_IOCFACTS_FLAGS_MAX_REQ_PER_REPLY_QUEUE_LIMIT (0x00000040) #define MPI3_IOCFACTS_FLAGS_INITIAL_PORT_ENABLE_MASK (0x00000030) #define MPI3_IOCFACTS_FLAGS_INITIAL_PORT_ENABLE_SHIFT (4) #define MPI3_IOCFACTS_FLAGS_INITIAL_PORT_ENABLE_NOT_STARTED (0x00000000) diff --git a/drivers/scsi/mpi3mr/mpi3mr_fw.c b/drivers/scsi/mpi3mr/mpi3mr_fw.c index 8fe6e0bf342e2..8c4bb7169a87c 100644 --- a/drivers/scsi/mpi3mr/mpi3mr_fw.c +++ b/drivers/scsi/mpi3mr/mpi3mr_fw.c @@ -3158,6 +3158,8 @@ static void mpi3mr_process_factsdata(struct mpi3mr_ioc *mrioc, mrioc->facts.dma_mask = (facts_flags & MPI3_IOCFACTS_FLAGS_DMA_ADDRESS_WIDTH_MASK) >> MPI3_IOCFACTS_FLAGS_DMA_ADDRESS_WIDTH_SHIFT; + mrioc->facts.max_req_limit = (facts_flags & + MPI3_IOCFACTS_FLAGS_MAX_REQ_PER_REPLY_QUEUE_LIMIT); mrioc->facts.protocol_flags = facts_data->protocol_flags; mrioc->facts.mpi_version = le32_to_cpu(facts_data->mpi_version.word); mrioc->facts.max_reqs = le16_to_cpu(facts_data->max_outstanding_requests); From db64191d0b70c37eb849ac6505ef2ce7e9a5917b Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Michal=20R=C3=A1bek?= Date: Fri, 12 Dec 2025 17:08:23 +0100 Subject: [PATCH 0362/1321] scsi: sg: Fix occasional bogus elapsed time that exceeds timeout MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit A race condition was found in sg_proc_debug_helper(). It was observed on a system using an IBM LTO-9 SAS Tape Drive (ULTRIUM-TD9) and monitoring /proc/scsi/sg/debug every second. A very large elapsed time would sometimes appear. This is caused by two race conditions. We reproduced the issue with an IBM ULTRIUM-HH9 tape drive on an x86_64 architecture. A patched kernel was built, and the race condition could not be observed anymore after the application of this patch. A reproducer C program utilising the scsi_debug module was also built by Changhui Zhong and can be viewed here: https://github.com/MichaelRabek/linux-tests/blob/master/drivers/scsi/sg/sg_race_trigger.c The first race happens between the reading of hp->duration in sg_proc_debug_helper() and request completion in sg_rq_end_io(). The hp->duration member variable may hold either of two types of information: #1 - The start time of the request. This value is present while the request is not yet finished. #2 - The total execution time of the request (end_time - start_time). If sg_proc_debug_helper() executes *after* the value of hp->duration was changed from #1 to #2, but *before* srp->done is set to 1 in sg_rq_end_io(), a fresh timestamp is taken in the else branch, and the elapsed time (value type #2) is subtracted from a timestamp, which cannot yield a valid elapsed time (which is a type #2 value as well). To fix this issue, the value of hp->duration must change under the protection of the sfp->rq_list_lock in sg_rq_end_io(). Since sg_proc_debug_helper() takes this read lock, the change to srp->done and srp->header.duration will happen atomically from the perspective of sg_proc_debug_helper() and the race condition is thus eliminated. The second race condition happens between sg_proc_debug_helper() and sg_new_write(). Even though hp->duration is set to the current time stamp in sg_add_request() under the write lock's protection, it gets overwritten by a call to get_sg_io_hdr(), which calls copy_from_user() to copy struct sg_io_hdr from userspace into kernel space. hp->duration is set to the start time again in sg_common_write(). If sg_proc_debug_helper() is called between these two calls, an arbitrary value set by userspace (usually zero) is used to compute the elapsed time. To fix this issue, hp->duration must be set to the current timestamp again after get_sg_io_hdr() returns successfully. A small race window still exists between get_sg_io_hdr() and setting hp->duration, but this window is only a few instructions wide and does not result in observable issues in practice, as confirmed by testing. Additionally, we fix the format specifier from %d to %u for printing unsigned int values in sg_proc_debug_helper(). Signed-off-by: Michal Rábek Suggested-by: Tomas Henzl Tested-by: Changhui Zhong Reviewed-by: Ewan D. Milne Reviewed-by: John Meneghini Reviewed-by: Tomas Henzl Link: https://patch.msgid.link/20251212160900.64924-1-mrabek@redhat.com Signed-off-by: Martin K. Petersen --- drivers/scsi/sg.c | 20 +++++++++++++------- 1 file changed, 13 insertions(+), 7 deletions(-) diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index b3af9b78fa123..57fba34832ad1 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -731,6 +731,8 @@ sg_new_write(Sg_fd *sfp, struct file *file, const char __user *buf, sg_remove_request(sfp, srp); return -EFAULT; } + hp->duration = jiffies_to_msecs(jiffies); + if (hp->interface_id != 'S') { sg_remove_request(sfp, srp); return -ENOSYS; @@ -815,7 +817,6 @@ sg_common_write(Sg_fd * sfp, Sg_request * srp, return -ENODEV; } - hp->duration = jiffies_to_msecs(jiffies); if (hp->interface_id != '\0' && /* v3 (or later) interface */ (SG_FLAG_Q_AT_TAIL & hp->flags)) at_head = 0; @@ -1338,9 +1339,6 @@ sg_rq_end_io(struct request *rq, blk_status_t status) "sg_cmd_done: pack_id=%d, res=0x%x\n", srp->header.pack_id, result)); srp->header.resid = resid; - ms = jiffies_to_msecs(jiffies); - srp->header.duration = (ms > srp->header.duration) ? - (ms - srp->header.duration) : 0; if (0 != result) { struct scsi_sense_hdr sshdr; @@ -1389,6 +1387,9 @@ sg_rq_end_io(struct request *rq, blk_status_t status) done = 0; } srp->done = done; + ms = jiffies_to_msecs(jiffies); + srp->header.duration = (ms > srp->header.duration) ? + (ms - srp->header.duration) : 0; write_unlock_irqrestore(&sfp->rq_list_lock, iflags); if (likely(done)) { @@ -2533,6 +2534,7 @@ static void sg_proc_debug_helper(struct seq_file *s, Sg_device * sdp) const sg_io_hdr_t *hp; const char * cp; unsigned int ms; + unsigned int duration; k = 0; list_for_each_entry(fp, &sdp->sfds, sfd_siblings) { @@ -2570,13 +2572,17 @@ static void sg_proc_debug_helper(struct seq_file *s, Sg_device * sdp) seq_printf(s, " id=%d blen=%d", srp->header.pack_id, blen); if (srp->done) - seq_printf(s, " dur=%d", hp->duration); + seq_printf(s, " dur=%u", hp->duration); else { ms = jiffies_to_msecs(jiffies); - seq_printf(s, " t_o/elap=%d/%d", + duration = READ_ONCE(hp->duration); + if (duration) + duration = (ms > duration ? + ms - duration : 0); + seq_printf(s, " t_o/elap=%u/%u", (new_interface ? hp->timeout : jiffies_to_msecs(fp->timeout)), - (ms > hp->duration ? ms - hp->duration : 0)); + duration); } seq_printf(s, "ms sgat=%d op=0x%02x\n", usg, (int) srp->data.cmd_opcode); From fe447d6dc74489f8a96153686b6dbd434fb7960e Mon Sep 17 00:00:00 2001 From: Seunghwan Baek Date: Wed, 10 Dec 2025 15:38:54 +0900 Subject: [PATCH 0363/1321] scsi: ufs: core: Add ufshcd_update_evt_hist() for UFS suspend error If UFS resume fails, the event history is updated in ufshcd_resume(), but there is no code anywhere to record UFS suspend. Therefore, add code to record UFS suspend error event history. Fixes: dd11376b9f1b ("scsi: ufs: Split the drivers/scsi/ufs directory") Cc: stable@vger.kernel.org Signed-off-by: Seunghwan Baek Reviewed-by: Peter Wang Link: https://patch.msgid.link/20251210063854.1483899-2-sh8267.baek@samsung.com Signed-off-by: Martin K. Petersen --- drivers/ufs/core/ufshcd.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c index 80c0b49f30b01..0babb7035200f 100644 --- a/drivers/ufs/core/ufshcd.c +++ b/drivers/ufs/core/ufshcd.c @@ -10359,7 +10359,7 @@ static int ufshcd_suspend(struct ufs_hba *hba) ret = ufshcd_setup_clocks(hba, false); if (ret) { ufshcd_enable_irq(hba); - return ret; + goto out; } if (ufshcd_is_clkgating_allowed(hba)) { hba->clk_gating.state = CLKS_OFF; @@ -10371,6 +10371,9 @@ static int ufshcd_suspend(struct ufs_hba *hba) /* Put the host controller in low power mode if possible */ ufshcd_hba_vreg_set_lpm(hba); ufshcd_pm_qos_update(hba, false); +out: + if (ret) + ufshcd_update_evt_hist(hba, UFS_EVT_SUSPEND_ERR, (u32)ret); return ret; } From 9a9bd9b459be0fe6f646a5d61a3f7112942d36c1 Mon Sep 17 00:00:00 2001 From: Neil Armstrong Date: Mon, 17 Nov 2025 15:51:35 +0100 Subject: [PATCH 0364/1321] drm/msm: adreno: fix deferencing ifpc_reglist when not declared On plaforms with an a7xx GPU not supporting IFPC, the ifpc_reglist if still deferenced in a7xx_patch_pwrup_reglist() which causes a kernel crash: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008 ... pc : a6xx_hw_init+0x155c/0x1e4c [msm] lr : a6xx_hw_init+0x9a8/0x1e4c [msm] ... Call trace: a6xx_hw_init+0x155c/0x1e4c [msm] (P) msm_gpu_hw_init+0x58/0x88 [msm] adreno_load_gpu+0x94/0x1fc [msm] msm_open+0xe4/0xf4 [msm] drm_file_alloc+0x1a0/0x2e4 [drm] drm_client_init+0x7c/0x104 [drm] drm_fbdev_client_setup+0x94/0xcf0 [drm_client_lib] drm_client_setup+0xb4/0xd8 [drm_client_lib] msm_drm_kms_post_init+0x2c/0x3c [msm] msm_drm_init+0x1a4/0x228 [msm] msm_drm_bind+0x30/0x3c [msm] ... Check the validity of ifpc_reglist before deferencing the table to setup the register values. Fixes: a6a0157cc68e ("drm/msm/a6xx: Enable IFPC on Adreno X1-85") Signed-off-by: Neil Armstrong Reviewed-by: Akhil P Oommen Patchwork: https://patchwork.freedesktop.org/patch/688944/ Message-ID: <20251117-topic-sm8x50-fix-a6xx-non-ifpc-v1-1-e4473cbf5903@linaro.org> Signed-off-by: Rob Clark --- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 18 ++++++++++-------- 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c index 0200a7e71cdf5..7e71f6bb5283b 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c @@ -873,15 +873,17 @@ static void a7xx_patch_pwrup_reglist(struct msm_gpu *gpu) lock->gpu_req = lock->cpu_req = lock->turn = 0; reglist = adreno_gpu->info->a6xx->ifpc_reglist; - lock->ifpc_list_len = reglist->count; + if (reglist) { + lock->ifpc_list_len = reglist->count; - /* - * For each entry in each of the lists, write the offset and the current - * register value into the GPU buffer - */ - for (i = 0; i < reglist->count; i++) { - *dest++ = reglist->regs[i]; - *dest++ = gpu_read(gpu, reglist->regs[i]); + /* + * For each entry in each of the lists, write the offset and the current + * register value into the GPU buffer + */ + for (i = 0; i < reglist->count; i++) { + *dest++ = reglist->regs[i]; + *dest++ = gpu_read(gpu, reglist->regs[i]); + } } reglist = adreno_gpu->info->a6xx->pwrup_reglist; From 94bbe423bb09a705ff4b5ea5b5a055d0f83d45ee Mon Sep 17 00:00:00 2001 From: Alok Tiwari Date: Thu, 13 Nov 2025 00:28:31 -0800 Subject: [PATCH 0365/1321] drm/msm/a6xx: move preempt_prepare_postamble after error check Move the call to preempt_prepare_postamble() after verifying that preempt_postamble_ptr is valid. If preempt_postamble_ptr is NULL, dereferencing it in preempt_prepare_postamble() would lead to a crash. This change avoids calling the preparation function when the postamble allocation has failed, preventing potential NULL pointer dereference and ensuring proper error handling. Fixes: 50117cad0c50 ("drm/msm/a6xx: Use posamble to reset counters on preemption") Signed-off-by: Alok Tiwari Patchwork: https://patchwork.freedesktop.org/patch/687659/ Message-ID: <20251113082839.3821867-1-alok.a.tiwari@oracle.com> Signed-off-by: Rob Clark --- drivers/gpu/drm/msm/adreno/a6xx_preempt.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_preempt.c b/drivers/gpu/drm/msm/adreno/a6xx_preempt.c index afc5f4aa3b173..747a22afad9f6 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_preempt.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_preempt.c @@ -454,11 +454,11 @@ void a6xx_preempt_init(struct msm_gpu *gpu) gpu->vm, &a6xx_gpu->preempt_postamble_bo, &a6xx_gpu->preempt_postamble_iova); - preempt_prepare_postamble(a6xx_gpu); - if (IS_ERR(a6xx_gpu->preempt_postamble_ptr)) goto fail; + preempt_prepare_postamble(a6xx_gpu); + timer_setup(&a6xx_gpu->preempt_timer, a6xx_preempt_timer, 0); return; From 606d619046099bc98ff9bd77a341040a12ac67de Mon Sep 17 00:00:00 2001 From: Anna Maniscalco Date: Thu, 27 Nov 2025 19:22:35 +0100 Subject: [PATCH 0366/1321] drm/msm: add PERFCTR_CNTL to ifpc_reglist Previously this register would become 0 after IFPC took place which broke all usages of counters. Fixes: a6a0157cc68e ("drm/msm/a6xx: Enable IFPC on Adreno X1-85") Cc: stable@vger.kernel.org Signed-off-by: Anna Maniscalco Reviewed-by: Akhil P Oommen Reviewed-by: Dmitry Baryshkov Patchwork: https://patchwork.freedesktop.org/patch/690960/ Message-ID: <20251127-ifpc_counters-v3-1-fac0a126bc88@gmail.com> Signed-off-by: Rob Clark --- drivers/gpu/drm/msm/adreno/a6xx_catalog.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c index 29107b3623464..b731491dc5225 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c @@ -1392,6 +1392,7 @@ static const u32 a750_ifpc_reglist_regs[] = { REG_A6XX_TPL1_BICUBIC_WEIGHTS_TABLE(2), REG_A6XX_TPL1_BICUBIC_WEIGHTS_TABLE(3), REG_A6XX_TPL1_BICUBIC_WEIGHTS_TABLE(4), + REG_A6XX_RBBM_PERFCTR_CNTL, REG_A6XX_TPL1_NC_MODE_CNTL, REG_A6XX_SP_NC_MODE_CNTL, REG_A6XX_CP_DBG_ECO_CNTL, From 65caa15a42af95b98bee94cc5c4e6ce9aa8cdadd Mon Sep 17 00:00:00 2001 From: Anna Maniscalco Date: Mon, 1 Dec 2025 19:14:36 +0100 Subject: [PATCH 0367/1321] drm/msm: Fix a7xx per pipe register programming GEN7_GRAS_NC_MODE_CNTL was only programmed for BR and not for BV pipe but it needs to be programmed for both. Program both pipes in hw_init and introducea separate reglist for it in order to add this register to the dynamic reglist which supports restoring registers per pipe. Fixes: 91389b4e3263 ("drm/msm/a6xx: Add a pwrup_list field to a6xx_info") Cc: stable@vger.kernel.org Reviewed-by: Akhil P Oommen Signed-off-by: Anna Maniscalco Patchwork: https://patchwork.freedesktop.org/patch/691553/ Message-ID: <20251201-gras_nc_mode_fix-v3-1-92a8a10d91d0@gmail.com> Signed-off-by: Rob Clark --- drivers/gpu/drm/msm/adreno/a6xx_catalog.c | 12 +++++++- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 34 ++++++++++++++++++++--- drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 1 + drivers/gpu/drm/msm/adreno/adreno_gpu.h | 13 +++++++++ 4 files changed, 55 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c index b731491dc5225..ac9a95aab2fb5 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c @@ -1376,7 +1376,6 @@ static const uint32_t a7xx_pwrup_reglist_regs[] = { REG_A6XX_UCHE_MODE_CNTL, REG_A6XX_RB_NC_MODE_CNTL, REG_A6XX_RB_CMP_DBG_ECO_CNTL, - REG_A7XX_GRAS_NC_MODE_CNTL, REG_A6XX_RB_CONTEXT_SWITCH_GMEM_SAVE_RESTORE_ENABLE, REG_A6XX_UCHE_GBIF_GX_CONFIG, REG_A6XX_UCHE_CLIENT_PF, @@ -1449,6 +1448,12 @@ static const u32 a750_ifpc_reglist_regs[] = { DECLARE_ADRENO_REGLIST_LIST(a750_ifpc_reglist); +static const struct adreno_reglist_pipe a7xx_dyn_pwrup_reglist_regs[] = { + { REG_A7XX_GRAS_NC_MODE_CNTL, 0, BIT(PIPE_BV) | BIT(PIPE_BR) }, +}; + +DECLARE_ADRENO_REGLIST_PIPE_LIST(a7xx_dyn_pwrup_reglist); + static const struct adreno_info a7xx_gpus[] = { { .chip_ids = ADRENO_CHIP_IDS(0x07000200), @@ -1492,6 +1497,7 @@ static const struct adreno_info a7xx_gpus[] = { .hwcg = a730_hwcg, .protect = &a730_protect, .pwrup_reglist = &a7xx_pwrup_reglist, + .dyn_pwrup_reglist = &a7xx_dyn_pwrup_reglist, .gbif_cx = a640_gbif, .gmu_cgc_mode = 0x00020000, }, @@ -1514,6 +1520,7 @@ static const struct adreno_info a7xx_gpus[] = { .hwcg = a740_hwcg, .protect = &a730_protect, .pwrup_reglist = &a7xx_pwrup_reglist, + .dyn_pwrup_reglist = &a7xx_dyn_pwrup_reglist, .gbif_cx = a640_gbif, .gmu_chipid = 0x7020100, .gmu_cgc_mode = 0x00020202, @@ -1548,6 +1555,7 @@ static const struct adreno_info a7xx_gpus[] = { .hwcg = a740_hwcg, .protect = &a730_protect, .pwrup_reglist = &a7xx_pwrup_reglist, + .dyn_pwrup_reglist = &a7xx_dyn_pwrup_reglist, .ifpc_reglist = &a750_ifpc_reglist, .gbif_cx = a640_gbif, .gmu_chipid = 0x7050001, @@ -1590,6 +1598,7 @@ static const struct adreno_info a7xx_gpus[] = { .a6xx = &(const struct a6xx_info) { .protect = &a730_protect, .pwrup_reglist = &a7xx_pwrup_reglist, + .dyn_pwrup_reglist = &a7xx_dyn_pwrup_reglist, .ifpc_reglist = &a750_ifpc_reglist, .gbif_cx = a640_gbif, .gmu_chipid = 0x7090100, @@ -1624,6 +1633,7 @@ static const struct adreno_info a7xx_gpus[] = { .hwcg = a740_hwcg, .protect = &a730_protect, .pwrup_reglist = &a7xx_pwrup_reglist, + .dyn_pwrup_reglist = &a7xx_dyn_pwrup_reglist, .gbif_cx = a640_gbif, .gmu_chipid = 0x70f0000, .gmu_cgc_mode = 0x00020222, diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c index 7e71f6bb5283b..2129d230a92b4 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c @@ -849,9 +849,16 @@ static void a6xx_set_ubwc_config(struct msm_gpu *gpu) min_acc_len_64b << 3 | hbb_lo << 1 | ubwc_mode); - if (adreno_is_a7xx(adreno_gpu)) - gpu_write(gpu, REG_A7XX_GRAS_NC_MODE_CNTL, - FIELD_PREP(GENMASK(8, 5), hbb_lo)); + if (adreno_is_a7xx(adreno_gpu)) { + for (u32 pipe_id = PIPE_BR; pipe_id <= PIPE_BV; pipe_id++) { + gpu_write(gpu, REG_A7XX_CP_APERTURE_CNTL_HOST, + A7XX_CP_APERTURE_CNTL_HOST_PIPE(pipe_id)); + gpu_write(gpu, REG_A7XX_GRAS_NC_MODE_CNTL, + FIELD_PREP(GENMASK(8, 5), hbb_lo)); + } + gpu_write(gpu, REG_A7XX_CP_APERTURE_CNTL_HOST, + A7XX_CP_APERTURE_CNTL_HOST_PIPE(PIPE_NONE)); + } gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, min_acc_len_64b << 23 | hbb_lo << 21); @@ -865,9 +872,11 @@ static void a7xx_patch_pwrup_reglist(struct msm_gpu *gpu) struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu); struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu); const struct adreno_reglist_list *reglist; + const struct adreno_reglist_pipe_list *dyn_pwrup_reglist; void *ptr = a6xx_gpu->pwrup_reglist_ptr; struct cpu_gpu_lock *lock = ptr; u32 *dest = (u32 *)&lock->regs[0]; + u32 dyn_pwrup_reglist_count = 0; int i; lock->gpu_req = lock->cpu_req = lock->turn = 0; @@ -909,7 +918,24 @@ static void a7xx_patch_pwrup_reglist(struct msm_gpu *gpu) * (
), and the length is * stored as number for triplets in dynamic_list_len. */ - lock->dynamic_list_len = 0; + dyn_pwrup_reglist = adreno_gpu->info->a6xx->dyn_pwrup_reglist; + if (dyn_pwrup_reglist) { + for (u32 pipe_id = PIPE_BR; pipe_id <= PIPE_BV; pipe_id++) { + gpu_write(gpu, REG_A7XX_CP_APERTURE_CNTL_HOST, + A7XX_CP_APERTURE_CNTL_HOST_PIPE(pipe_id)); + for (i = 0; i < dyn_pwrup_reglist->count; i++) { + if ((dyn_pwrup_reglist->regs[i].pipe & BIT(pipe_id)) == 0) + continue; + *dest++ = A7XX_CP_APERTURE_CNTL_HOST_PIPE(pipe_id); + *dest++ = dyn_pwrup_reglist->regs[i].offset; + *dest++ = gpu_read(gpu, dyn_pwrup_reglist->regs[i].offset); + dyn_pwrup_reglist_count++; + } + } + gpu_write(gpu, REG_A7XX_CP_APERTURE_CNTL_HOST, + A7XX_CP_APERTURE_CNTL_HOST_PIPE(PIPE_NONE)); + } + lock->dynamic_list_len = dyn_pwrup_reglist_count; } static int a7xx_preempt_start(struct msm_gpu *gpu) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h index 6820216ec5fc9..4eaa047112460 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h @@ -45,6 +45,7 @@ struct a6xx_info { const struct adreno_reglist *hwcg; const struct adreno_protect *protect; const struct adreno_reglist_list *pwrup_reglist; + const struct adreno_reglist_pipe_list *dyn_pwrup_reglist; const struct adreno_reglist_list *ifpc_reglist; const struct adreno_reglist *gbif_cx; const struct adreno_reglist_pipe *nonctxt_reglist; diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h index 0f8d3de97636c..1d0145f8b3ecb 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h @@ -188,6 +188,19 @@ static const struct adreno_reglist_list name = { \ .count = ARRAY_SIZE(name ## _regs), \ }; +struct adreno_reglist_pipe_list { + /** @reg: List of register **/ + const struct adreno_reglist_pipe *regs; + /** @count: Number of registers in the list **/ + u32 count; +}; + +#define DECLARE_ADRENO_REGLIST_PIPE_LIST(name) \ +static const struct adreno_reglist_pipe_list name = { \ + .regs = name ## _regs, \ + .count = ARRAY_SIZE(name ## _regs), \ +}; + struct adreno_gpu { struct msm_gpu base; const struct adreno_info *info; From ba27f1fbbb655cd420f0f8a1cd643190e5589bba Mon Sep 17 00:00:00 2001 From: Randy Dunlap Date: Fri, 19 Dec 2025 10:46:20 -0800 Subject: [PATCH 0368/1321] drm/msm/disp: mdp_format: fix all kernel-doc warnings Correct and add kernel-doc comments to eliminate all warnings: Warning: ../drivers/gpu/drm/msm/disp/mdp_format.h:27 This comment starts with '/**', but isn't a kernel-doc comment. Warning: ../drivers/gpu/drm/msm/disp/mdp_format.h:64 struct member 'bpc_a' not described in 'msm_format' Warning: ../drivers/gpu/drm/msm/disp/mdp_format.h:64 struct member 'bpc_b_cb' not described in 'msm_format' Warning: ../drivers/gpu/drm/msm/disp/mdp_format.h:64 struct member 'bpc_g_y' not described in 'msm_format' Warning: ../drivers/gpu/drm/msm/disp/mdp_format.h:64 struct member 'bpc_r_cr' not described in 'msm_format' Signed-off-by: Randy Dunlap Reviewed-by: Dmitry Baryshkov Patchwork: https://patchwork.freedesktop.org/patch/695650/ Link: https://lore.kernel.org/r/20251219184638.1813181-2-rdunlap@infradead.org Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/disp/mdp_format.h | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/msm/disp/mdp_format.h b/drivers/gpu/drm/msm/disp/mdp_format.h index a00d646ff4d47..915954bf5dc76 100644 --- a/drivers/gpu/drm/msm/disp/mdp_format.h +++ b/drivers/gpu/drm/msm/disp/mdp_format.h @@ -24,7 +24,7 @@ enum msm_format_flags { #define MSM_FORMAT_FLAG_UNPACK_TIGHT BIT(MSM_FORMAT_FLAG_UNPACK_TIGHT_BIT) #define MSM_FORMAT_FLAG_UNPACK_ALIGN_MSB BIT(MSM_FORMAT_FLAG_UNPACK_ALIGN_MSB_BIT) -/** +/* * DPU HW,Component order color map */ enum { @@ -37,6 +37,10 @@ enum { /** * struct msm_format: defines the format configuration * @pixel_format: format fourcc + * @bpc_g_y: element bit widths: BPC for G or Y + * @bpc_b_cb: element bit widths: BPC for B or Cb + * @bpc_r_cr: element bit widths: BPC for R or Cr + * @bpc_a: element bit widths: BPC for the alpha channel * @element: element color ordering * @fetch_type: how the color components are packed in pixel format * @chroma_sample: chroma sub-samplng type From b70de7d29d0fc4e76acd9afc1b76c618eeb9c79b Mon Sep 17 00:00:00 2001 From: Randy Dunlap Date: Fri, 19 Dec 2025 10:46:21 -0800 Subject: [PATCH 0369/1321] drm/msm/dp: fix all kernel-doc warnings Correct and add kernel-doc comments to eliminate all warnings: Warning: ../drivers/gpu/drm/msm/dp/dp_debug.h:31 expecting prototype for msm_dp_debug_get(). Prototype was for msm_dp_debug_init() instead Warning: ../drivers/gpu/drm/msm/dp/dp_drm.c:24 function parameter 'connector' not described in 'msm_dp_bridge_detect' Warning: ../drivers/gpu/drm/msm/dp/dp_link.h:90 expecting prototype for mdss_dp_test_bit_depth_to_bpp(). Prototype was for msm_dp_link_bit_depth_to_bpp() instead Warning: ../drivers/gpu/drm/msm/dp/dp_link.h:126 function parameter 'aux' not described in 'msm_dp_link_get' Warning: ../drivers/gpu/drm/msm/dp/dp_link.h:126 function parameter 'dev' not described in 'msm_dp_link_get' Warning: ../drivers/gpu/drm/msm/dp/dp_panel.h:70 function parameter 'bw_code' not described in 'is_link_rate_valid' Warning: ../drivers/gpu/drm/msm/dp/dp_panel.h:84 expecting prototype for msm_dp_link_is_lane_count_valid(). Prototype was for is_lane_count_valid() instead Signed-off-by: Randy Dunlap Reviewed-by: Dmitry Baryshkov Patchwork: https://patchwork.freedesktop.org/patch/695647/ Link: https://lore.kernel.org/r/20251219184638.1813181-3-rdunlap@infradead.org Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/dp/dp_debug.h | 2 +- drivers/gpu/drm/msm/dp/dp_drm.c | 1 + drivers/gpu/drm/msm/dp/dp_link.h | 9 +++++---- drivers/gpu/drm/msm/dp/dp_panel.h | 8 ++++---- 4 files changed, 11 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/msm/dp/dp_debug.h b/drivers/gpu/drm/msm/dp/dp_debug.h index 6dc0ff4f0f650..a90083fec856d 100644 --- a/drivers/gpu/drm/msm/dp/dp_debug.h +++ b/drivers/gpu/drm/msm/dp/dp_debug.h @@ -12,7 +12,7 @@ #if defined(CONFIG_DEBUG_FS) /** - * msm_dp_debug_get() - configure and get the DisplayPlot debug module data + * msm_dp_debug_init() - configure and get the DisplayPlot debug module data * * @dev: device instance of the caller * @panel: instance of panel module diff --git a/drivers/gpu/drm/msm/dp/dp_drm.c b/drivers/gpu/drm/msm/dp/dp_drm.c index 9a461ab2f32fc..fd6443d2b6cea 100644 --- a/drivers/gpu/drm/msm/dp/dp_drm.c +++ b/drivers/gpu/drm/msm/dp/dp_drm.c @@ -18,6 +18,7 @@ /** * msm_dp_bridge_detect - callback to determine if connector is connected * @bridge: Pointer to drm bridge structure + * @connector: Pointer to drm connector structure * Returns: Bridge's 'is connected' status */ static enum drm_connector_status diff --git a/drivers/gpu/drm/msm/dp/dp_link.h b/drivers/gpu/drm/msm/dp/dp_link.h index b1eb2de6d2a76..8460e4ad2d353 100644 --- a/drivers/gpu/drm/msm/dp/dp_link.h +++ b/drivers/gpu/drm/msm/dp/dp_link.h @@ -80,11 +80,11 @@ struct msm_dp_link { }; /** - * mdss_dp_test_bit_depth_to_bpp() - convert test bit depth to bpp + * msm_dp_link_bit_depth_to_bpp() - convert test bit depth to bpp * @tbd: test bit depth * - * Returns the bits per pixel (bpp) to be used corresponding to the - * git bit depth value. This function assumes that bit depth has + * Returns: the bits per pixel (bpp) to be used corresponding to the + * bit depth value. This function assumes that bit depth has * already been validated. */ static inline u32 msm_dp_link_bit_depth_to_bpp(u32 tbd) @@ -120,7 +120,8 @@ bool msm_dp_link_send_edid_checksum(struct msm_dp_link *msm_dp_link, u8 checksum /** * msm_dp_link_get() - get the functionalities of dp test module - * + * @dev: kernel device structure + * @aux: DisplayPort AUX channel * * return: a pointer to msm_dp_link struct */ diff --git a/drivers/gpu/drm/msm/dp/dp_panel.h b/drivers/gpu/drm/msm/dp/dp_panel.h index 921a296852d4d..177c1328fd997 100644 --- a/drivers/gpu/drm/msm/dp/dp_panel.h +++ b/drivers/gpu/drm/msm/dp/dp_panel.h @@ -63,9 +63,9 @@ void msm_dp_panel_disable_vsc_sdp(struct msm_dp_panel *msm_dp_panel); /** * is_link_rate_valid() - validates the link rate - * @lane_rate: link rate requested by the sink + * @bw_code: link rate requested by the sink * - * Returns true if the requested link rate is supported. + * Returns: true if the requested link rate is supported. */ static inline bool is_link_rate_valid(u32 bw_code) { @@ -76,10 +76,10 @@ static inline bool is_link_rate_valid(u32 bw_code) } /** - * msm_dp_link_is_lane_count_valid() - validates the lane count + * is_lane_count_valid() - validates the lane count * @lane_count: lane count requested by the sink * - * Returns true if the requested lane count is supported. + * Returns: true if the requested lane count is supported. */ static inline bool is_lane_count_valid(u32 lane_count) { From 9d7803e8cda0554111169e1356c7daf487d197e4 Mon Sep 17 00:00:00 2001 From: Randy Dunlap Date: Fri, 19 Dec 2025 10:46:22 -0800 Subject: [PATCH 0370/1321] drm/msm/dpu: dpu_hw_cdm.h: fix all kernel-doc warnings Correct and add kernel-doc comments to eliminate all warnings: Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_cdm.h:91 Incorrect use of kernel-doc format: * Enable the CDM module Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_cdm.h:97 Incorrect use of kernel-doc format: * Enable/disable the connection with pingpong Signed-off-by: Randy Dunlap Reviewed-by: Dmitry Baryshkov Patchwork: https://patchwork.freedesktop.org/patch/695651/ Link: https://lore.kernel.org/r/20251219184638.1813181-4-rdunlap@infradead.org Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/disp/dpu1/dpu_hw_cdm.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_cdm.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_cdm.h index 6bb3476a05f80..75e6dae0fcd9b 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_cdm.h +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_cdm.h @@ -89,13 +89,13 @@ enum dpu_hw_cdwn_op_mode_method_h_v { */ struct dpu_hw_cdm_ops { /** - * Enable the CDM module + * @enable: Enable the CDM module * @cdm Pointer to chroma down context */ int (*enable)(struct dpu_hw_cdm *cdm, struct dpu_hw_cdm_cfg *cfg); /** - * Enable/disable the connection with pingpong + * @bind_pingpong_blk: Enable/disable the connection with pingpong * @cdm Pointer to chroma down context * @pp pingpong block id. */ From 97bdedd6951de8a12113096f1e5fb3d7422b26cb Mon Sep 17 00:00:00 2001 From: Randy Dunlap Date: Fri, 19 Dec 2025 10:46:23 -0800 Subject: [PATCH 0371/1321] drm/msm/dpu: dpu_hw_ctl.h: fix all kernel-doc warnings Correct and add kernel-doc comments to eliminate all warnings: Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h:18 cannot understand function prototype: 'enum dpu_ctl_mode_sel' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h:58 struct member 'wb' not described in 'dpu_hw_intf_cfg' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h:66 Incorrect use of kernel-doc format: * kickoff hw operation for Sw controlled interfaces Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h:73 Incorrect use of kernel-doc format: * check if the ctl is started Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h:80 Incorrect use of kernel-doc format: * kickoff prepare is in progress hw operation for sw Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h:88 Incorrect use of kernel-doc format: * Clear the value of the cached pending_flush_mask Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h:96 Incorrect use of kernel-doc format: * Query the value of the cached pending_flush_mask Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h:103 Incorrect use of kernel-doc format: * OR in the given flushbits to the cached pending_flush_mask Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h:112 Incorrect use of kernel-doc format: * OR in the given flushbits to the cached pending_(wb_)flush_mask Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h:121 Incorrect use of kernel-doc format: * OR in the given flushbits to the cached pending_(cwb_)flush_mask Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h:130 Incorrect use of kernel-doc format: * OR in the given flushbits to the cached pending_(intf_)flush_mask Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h:139 Incorrect use of kernel-doc format: * OR in the given flushbits to the cached pending_(periph_)flush_mask Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h:148 Incorrect use of kernel-doc format: * OR in the given flushbits to the cached pending_(merge_3d_)flush_mask Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h:157 Incorrect use of kernel-doc format: * OR in the given flushbits to the cached pending_flush_mask Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h:166 Incorrect use of kernel-doc format: * OR in the given flushbits to the cached pending_flush_mask Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h:175 Incorrect use of kernel-doc format: * OR in the given flushbits to the cached pending_flush_mask Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h:185 Incorrect use of kernel-doc format: * OR in the given flushbits to the cached pending_(dsc_)flush_mask Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h:194 Incorrect use of kernel-doc format: * OR in the given flushbits to the cached pending_(cdm_)flush_mask Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h:202 Incorrect use of kernel-doc format: * Write the value of the pending_flush_mask to hardware Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h:208 Incorrect use of kernel-doc format: * Read the value of the flush register Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h:215 Incorrect use of kernel-doc format: * Setup ctl_path interface config Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h:223 Incorrect use of kernel-doc format: * reset ctl_path interface config Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h:244 Incorrect use of kernel-doc format: * Set all blend stages to disabled Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h:250 Incorrect use of kernel-doc format: * Configure layer mixer to pipe configuration Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h:262 Incorrect use of kernel-doc format: * Set active pipes attached to this CTL Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h:270 Incorrect use of kernel-doc format: * Set active layer mixers attached to this CTL Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h:277 struct member 'trigger_start' not described in 'dpu_hw_ctl_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h:277 struct member 'is_started' not described in 'dpu_hw_ctl_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h:277 struct member 'trigger_pending' not described in 'dpu_hw_ctl_ops' [many here] Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h:319 struct member 'pending_periph_flush_mask' not described in 'dpu_hw_ctl' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h:319 struct member 'pending_merge_3d_flush_mask' not described in 'dpu_hw_ctl' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h:319 struct member 'pending_dspp_flush_mask' not described in 'dpu_hw_ctl' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h:327 expecting prototype for dpu_hw_ctl(). Prototype was for to_dpu_hw_ctl() instead Signed-off-by: Randy Dunlap Reviewed-by: Dmitry Baryshkov Patchwork: https://patchwork.freedesktop.org/patch/695649/ Link: https://lore.kernel.org/r/20251219184638.1813181-5-rdunlap@infradead.org Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h | 84 ++++++++++++++-------- 1 file changed, 53 insertions(+), 31 deletions(-) diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h index 15931b22ec941..e535bf013825b 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h @@ -12,9 +12,9 @@ #include "dpu_hw_sspp.h" /** - * dpu_ctl_mode_sel: Interface mode selection - * DPU_CTL_MODE_SEL_VID: Video mode interface - * DPU_CTL_MODE_SEL_CMD: Command mode interface + * enum dpu_ctl_mode_sel: Interface mode selection + * @DPU_CTL_MODE_SEL_VID: Video mode interface + * @DPU_CTL_MODE_SEL_CMD: Command mode interface */ enum dpu_ctl_mode_sel { DPU_CTL_MODE_SEL_VID = 0, @@ -37,6 +37,7 @@ struct dpu_hw_stage_cfg { * struct dpu_hw_intf_cfg :Describes how the DPU writes data to output interface * @intf : Interface id * @intf_master: Master interface id in the dual pipe topology + * @wb: Writeback mode * @mode_3d: 3d mux configuration * @merge_3d: 3d merge block used * @intf_mode_sel: Interface mode, cmd / vid @@ -64,21 +65,21 @@ struct dpu_hw_intf_cfg { */ struct dpu_hw_ctl_ops { /** - * kickoff hw operation for Sw controlled interfaces + * @trigger_start: kickoff hw operation for Sw controlled interfaces * DSI cmd mode and WB interface are SW controlled * @ctx : ctl path ctx pointer */ void (*trigger_start)(struct dpu_hw_ctl *ctx); /** - * check if the ctl is started + * @is_started: check if the ctl is started * @ctx : ctl path ctx pointer * @Return: true if started, false if stopped */ bool (*is_started)(struct dpu_hw_ctl *ctx); /** - * kickoff prepare is in progress hw operation for sw + * @trigger_pending: kickoff prepare is in progress hw operation for sw * controlled interfaces: DSI cmd mode and WB interface * are SW controlled * @ctx : ctl path ctx pointer @@ -86,7 +87,7 @@ struct dpu_hw_ctl_ops { void (*trigger_pending)(struct dpu_hw_ctl *ctx); /** - * Clear the value of the cached pending_flush_mask + * @clear_pending_flush: Clear the value of the cached pending_flush_mask * No effect on hardware. * Required to be implemented. * @ctx : ctl path ctx pointer @@ -94,14 +95,15 @@ struct dpu_hw_ctl_ops { void (*clear_pending_flush)(struct dpu_hw_ctl *ctx); /** - * Query the value of the cached pending_flush_mask + * @get_pending_flush: Query the value of the cached pending_flush_mask * No effect on hardware * @ctx : ctl path ctx pointer */ u32 (*get_pending_flush)(struct dpu_hw_ctl *ctx); /** - * OR in the given flushbits to the cached pending_flush_mask + * @update_pending_flush: OR in the given flushbits to the cached + * pending_flush_mask. * No effect on hardware * @ctx : ctl path ctx pointer * @flushbits : module flushmask @@ -110,7 +112,8 @@ struct dpu_hw_ctl_ops { u32 flushbits); /** - * OR in the given flushbits to the cached pending_(wb_)flush_mask + * @update_pending_flush_wb: OR in the given flushbits to the + * cached pending_(wb_)flush_mask. * No effect on hardware * @ctx : ctl path ctx pointer * @blk : writeback block index @@ -119,7 +122,8 @@ struct dpu_hw_ctl_ops { enum dpu_wb blk); /** - * OR in the given flushbits to the cached pending_(cwb_)flush_mask + * @update_pending_flush_cwb: OR in the given flushbits to the + * cached pending_(cwb_)flush_mask. * No effect on hardware * @ctx : ctl path ctx pointer * @blk : concurrent writeback block index @@ -128,7 +132,8 @@ struct dpu_hw_ctl_ops { enum dpu_cwb blk); /** - * OR in the given flushbits to the cached pending_(intf_)flush_mask + * @update_pending_flush_intf: OR in the given flushbits to the + * cached pending_(intf_)flush_mask. * No effect on hardware * @ctx : ctl path ctx pointer * @blk : interface block index @@ -137,7 +142,8 @@ struct dpu_hw_ctl_ops { enum dpu_intf blk); /** - * OR in the given flushbits to the cached pending_(periph_)flush_mask + * @update_pending_flush_periph: OR in the given flushbits to the + * cached pending_(periph_)flush_mask. * No effect on hardware * @ctx : ctl path ctx pointer * @blk : interface block index @@ -146,7 +152,8 @@ struct dpu_hw_ctl_ops { enum dpu_intf blk); /** - * OR in the given flushbits to the cached pending_(merge_3d_)flush_mask + * @update_pending_flush_merge_3d: OR in the given flushbits to the + * cached pending_(merge_3d_)flush_mask. * No effect on hardware * @ctx : ctl path ctx pointer * @blk : interface block index @@ -155,7 +162,8 @@ struct dpu_hw_ctl_ops { enum dpu_merge_3d blk); /** - * OR in the given flushbits to the cached pending_flush_mask + * @update_pending_flush_sspp: OR in the given flushbits to the + * cached pending_flush_mask. * No effect on hardware * @ctx : ctl path ctx pointer * @blk : SSPP block index @@ -164,7 +172,8 @@ struct dpu_hw_ctl_ops { enum dpu_sspp blk); /** - * OR in the given flushbits to the cached pending_flush_mask + * @update_pending_flush_mixer: OR in the given flushbits to the + * cached pending_flush_mask. * No effect on hardware * @ctx : ctl path ctx pointer * @blk : LM block index @@ -173,7 +182,8 @@ struct dpu_hw_ctl_ops { enum dpu_lm blk); /** - * OR in the given flushbits to the cached pending_flush_mask + * @update_pending_flush_dspp: OR in the given flushbits to the + * cached pending_flush_mask. * No effect on hardware * @ctx : ctl path ctx pointer * @blk : DSPP block index @@ -183,7 +193,8 @@ struct dpu_hw_ctl_ops { enum dpu_dspp blk, u32 dspp_sub_blk); /** - * OR in the given flushbits to the cached pending_(dsc_)flush_mask + * @update_pending_flush_dsc: OR in the given flushbits to the + * cached pending_(dsc_)flush_mask. * No effect on hardware * @ctx: ctl path ctx pointer * @blk: interface block index @@ -192,7 +203,8 @@ struct dpu_hw_ctl_ops { enum dpu_dsc blk); /** - * OR in the given flushbits to the cached pending_(cdm_)flush_mask + * @update_pending_flush_cdm: OR in the given flushbits to the + * cached pending_(cdm_)flush_mask. * No effect on hardware * @ctx: ctl path ctx pointer * @cdm_num: idx of cdm to be flushed @@ -200,20 +212,20 @@ struct dpu_hw_ctl_ops { void (*update_pending_flush_cdm)(struct dpu_hw_ctl *ctx, enum dpu_cdm cdm_num); /** - * Write the value of the pending_flush_mask to hardware + * @trigger_flush: Write the value of the pending_flush_mask to hardware * @ctx : ctl path ctx pointer */ void (*trigger_flush)(struct dpu_hw_ctl *ctx); /** - * Read the value of the flush register + * @get_flush_register: Read the value of the flush register * @ctx : ctl path ctx pointer * @Return: value of the ctl flush register. */ u32 (*get_flush_register)(struct dpu_hw_ctl *ctx); /** - * Setup ctl_path interface config + * @setup_intf_cfg: Setup ctl_path interface config * @ctx * @cfg : interface config structure pointer */ @@ -221,17 +233,20 @@ struct dpu_hw_ctl_ops { struct dpu_hw_intf_cfg *cfg); /** - * reset ctl_path interface config + * @reset_intf_cfg: reset ctl_path interface config * @ctx : ctl path ctx pointer * @cfg : interface config structure pointer */ void (*reset_intf_cfg)(struct dpu_hw_ctl *ctx, struct dpu_hw_intf_cfg *cfg); + /** + * @reset: reset function for this ctl type + */ int (*reset)(struct dpu_hw_ctl *c); - /* - * wait_reset_status - checks ctl reset status + /** + * @wait_reset_status: checks ctl reset status * @ctx : ctl path ctx pointer * * This function checks the ctl reset status bit. @@ -242,13 +257,13 @@ struct dpu_hw_ctl_ops { int (*wait_reset_status)(struct dpu_hw_ctl *ctx); /** - * Set all blend stages to disabled + * @clear_all_blendstages: Set all blend stages to disabled * @ctx : ctl path ctx pointer */ void (*clear_all_blendstages)(struct dpu_hw_ctl *ctx); /** - * Configure layer mixer to pipe configuration + * @setup_blendstage: Configure layer mixer to pipe configuration * @ctx : ctl path ctx pointer * @lm : layer mixer enumeration * @cfg : blend stage configuration @@ -256,11 +271,16 @@ struct dpu_hw_ctl_ops { void (*setup_blendstage)(struct dpu_hw_ctl *ctx, enum dpu_lm lm, struct dpu_hw_stage_cfg *cfg); + /** + * @set_active_fetch_pipes: Set active pipes attached to this CTL + * @ctx: ctl path ctx pointer + * @active_pipes: bitmap of enum dpu_sspp + */ void (*set_active_fetch_pipes)(struct dpu_hw_ctl *ctx, unsigned long *fetch_active); /** - * Set active pipes attached to this CTL + * @set_active_pipes: Set active pipes attached to this CTL * @ctx: ctl path ctx pointer * @active_pipes: bitmap of enum dpu_sspp */ @@ -268,13 +288,12 @@ struct dpu_hw_ctl_ops { unsigned long *active_pipes); /** - * Set active layer mixers attached to this CTL + * @set_active_lms: Set active layer mixers attached to this CTL * @ctx: ctl path ctx pointer * @active_lms: bitmap of enum dpu_lm */ void (*set_active_lms)(struct dpu_hw_ctl *ctx, unsigned long *active_lms); - }; /** @@ -289,6 +308,9 @@ struct dpu_hw_ctl_ops { * @pending_intf_flush_mask: pending INTF flush * @pending_wb_flush_mask: pending WB flush * @pending_cwb_flush_mask: pending CWB flush + * @pending_periph_flush_mask: pending PERIPH flush + * @pending_merge_3d_flush_mask: pending MERGE 3D flush + * @pending_dspp_flush_mask: pending DSPP flush * @pending_dsc_flush_mask: pending DSC flush * @pending_cdm_flush_mask: pending CDM flush * @mdss_ver: MDSS revision information @@ -320,7 +342,7 @@ struct dpu_hw_ctl { }; /** - * dpu_hw_ctl - convert base object dpu_hw_base to container + * to_dpu_hw_ctl - convert base object dpu_hw_base to container * @hw: Pointer to base hardware block * return: Pointer to hardware block container */ From a14a80c3dc2c7170cf5f7afdb1a21c642bb18d0a Mon Sep 17 00:00:00 2001 From: Randy Dunlap Date: Fri, 19 Dec 2025 10:46:24 -0800 Subject: [PATCH 0372/1321] drm/msm/dpu: dpu_hw_cwb.h: fix all kernel-doc warnings Correct or add kernel-doc comments to eliminate all warnings: Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_cwb.h:30 Cannot find identifier on line: * Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_cwb.h:61 expecting prototype for dpu_hw_cwb(). Prototype was for to_dpu_hw_cwb() instead Signed-off-by: Randy Dunlap Reviewed-by: Dmitry Baryshkov Patchwork: https://patchwork.freedesktop.org/patch/695664/ Link: https://lore.kernel.org/r/20251219184638.1813181-6-rdunlap@infradead.org Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/disp/dpu1/dpu_hw_cwb.h | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_cwb.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_cwb.h index 96b6edf6b2bbf..ed7bfcee7f1cc 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_cwb.h +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_cwb.h @@ -28,7 +28,6 @@ struct dpu_hw_cwb_setup_cfg { }; /** - * * struct dpu_hw_cwb_ops : Interface to the cwb hw driver functions * @config_cwb: configure CWB mux */ @@ -54,7 +53,7 @@ struct dpu_hw_cwb { }; /** - * dpu_hw_cwb - convert base object dpu_hw_base to container + * to_dpu_hw_cwb - convert base object dpu_hw_base to container * @hw: Pointer to base hardware block * return: Pointer to hardware block container */ From 33d02b4ffcccdaf765a06c4749d38fc4b554dcda Mon Sep 17 00:00:00 2001 From: Randy Dunlap Date: Fri, 19 Dec 2025 10:46:25 -0800 Subject: [PATCH 0373/1321] drm/msm/dpu: dpu_hw_dsc.h: fix all kernel-doc warnings Correct or add kernel-doc comments to eliminate all warnings: Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h:23 Incorrect use of kernel-doc format: * dsc_disable - disable dsc Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h:29 Incorrect use of kernel-doc format: * dsc_config - configures dsc encoder Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h:41 Incorrect use of kernel-doc format: * dsc_config_thresh - programs panel thresholds Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h:50 struct member 'dsc_disable' not described in 'dpu_hw_dsc_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h:50 struct member 'dsc_config' not described in 'dpu_hw_dsc_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h:50 struct member 'dsc_config_thresh' not described in 'dpu_hw_dsc_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h:50 struct member 'dsc_bind_pingpong_blk' not described in 'dpu_hw_dsc_ops' Signed-off-by: Randy Dunlap Reviewed-by: Dmitry Baryshkov Patchwork: https://patchwork.freedesktop.org/patch/695658/ Link: https://lore.kernel.org/r/20251219184638.1813181-7-rdunlap@infradead.org Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h index cc7cc6f6f7cda..39d93b9df0515 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h @@ -21,13 +21,13 @@ struct dpu_hw_dsc; */ struct dpu_hw_dsc_ops { /** - * dsc_disable - disable dsc + * @dsc_disable: disable dsc * @hw_dsc: Pointer to dsc context */ void (*dsc_disable)(struct dpu_hw_dsc *hw_dsc); /** - * dsc_config - configures dsc encoder + * @dsc_config: configures dsc encoder * @hw_dsc: Pointer to dsc context * @dsc: panel dsc parameters * @mode: dsc topology mode to be set @@ -39,13 +39,17 @@ struct dpu_hw_dsc_ops { u32 initial_lines); /** - * dsc_config_thresh - programs panel thresholds + * @dsc_config_thresh: programs panel thresholds * @hw_dsc: Pointer to dsc context * @dsc: panel dsc parameters */ void (*dsc_config_thresh)(struct dpu_hw_dsc *hw_dsc, struct drm_dsc_config *dsc); + /** + * @dsc_bind_pingpong_blk: binds pixel output from a DSC block + * to a pingpong block + */ void (*dsc_bind_pingpong_blk)(struct dpu_hw_dsc *hw_dsc, enum dpu_pingpong pp); }; From 3ba4463cc6c446a219a4bebcdd94beab05e5e4bc Mon Sep 17 00:00:00 2001 From: Randy Dunlap Date: Fri, 19 Dec 2025 10:46:26 -0800 Subject: [PATCH 0374/1321] drm/msm/dpu: dpu_hw_dspp.h: fix all kernel-doc warnings Correct or add kernel-doc comments to eliminate all warnings: Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dspp.h:33 expecting prototype for struct dpu_hw_pcc. Prototype was for struct dpu_hw_pcc_cfg instead Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dspp.h:42 Incorrect use of kernel-doc format: * setup_pcc - setup dspp pcc Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dspp.h:48 struct member 'setup_pcc' not described in 'dpu_hw_dspp_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dspp.h:76 expecting prototype for dpu_hw_dspp(). Prototype was for to_dpu_hw_dspp() instead Signed-off-by: Randy Dunlap Reviewed-by: Dmitry Baryshkov Patchwork: https://patchwork.freedesktop.org/patch/695652/ Link: https://lore.kernel.org/r/20251219184638.1813181-8-rdunlap@infradead.org Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dspp.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dspp.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dspp.h index 45c26cd49fa3e..722b0f482e9b6 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dspp.h +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dspp.h @@ -22,7 +22,7 @@ struct dpu_hw_pcc_coeff { }; /** - * struct dpu_hw_pcc - pcc feature structure + * struct dpu_hw_pcc_cfg - pcc feature structure * @r: red coefficients. * @g: green coefficients. * @b: blue coefficients. @@ -40,7 +40,7 @@ struct dpu_hw_pcc_cfg { */ struct dpu_hw_dspp_ops { /** - * setup_pcc - setup dspp pcc + * @setup_pcc: setup_pcc - setup dspp pcc * @ctx: Pointer to dspp context * @cfg: Pointer to configuration */ @@ -69,7 +69,7 @@ struct dpu_hw_dspp { }; /** - * dpu_hw_dspp - convert base object dpu_hw_base to container + * to_dpu_hw_dspp - convert base object dpu_hw_base to container * @hw: Pointer to base hardware block * return: Pointer to hardware block container */ From d049cd858c2d024182ba2060d9749f1ad26f62a3 Mon Sep 17 00:00:00 2001 From: Randy Dunlap Date: Fri, 19 Dec 2025 10:46:27 -0800 Subject: [PATCH 0375/1321] drm/msm/dpu: dpu_hw_intf.h: fix all kernel-doc warnings Correct or add kernel-doc comments to eliminate all warnings: Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.h:76 duplicate section name 'Return' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.h:112 Incorrect use of kernel-doc format: * Disable autorefresh if enabled Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.h:118 struct member 'setup_timing_gen' not described in 'dpu_hw_intf_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.h:118 struct member 'setup_prg_fetch' not described in 'dpu_hw_intf_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.h:118 struct member 'enable_timing' not described in 'dpu_hw_intf_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.h:118 struct member 'get_status' not described in 'dpu_hw_intf_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.h:118 struct member 'get_line_count' not described in 'dpu_hw_intf_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.h:118 struct member 'disable_autorefresh' not described in 'dpu_hw_intf_ops' dpu_hw_intf.h:119: warning: Excess struct member 'get_vsync_info' description in 'dpu_hw_intf_ops' dpu_hw_intf.h:119: warning: Excess struct member 'setup_autorefresh' description in 'dpu_hw_intf_ops' dpu_hw_intf.h:119: warning: Excess struct member 'get_autorefresh' description in 'dpu_hw_intf_ops' Signed-off-by: Randy Dunlap Reviewed-by: Dmitry Baryshkov Patchwork: https://patchwork.freedesktop.org/patch/695646/ Link: https://lore.kernel.org/r/20251219184638.1813181-9-rdunlap@infradead.org Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.h | 20 +++++++------------- 1 file changed, 7 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.h index f31067a9aaf1d..5a19cd74fa947 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.h +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.h @@ -57,11 +57,11 @@ struct dpu_hw_intf_cmd_mode_cfg { /** * struct dpu_hw_intf_ops : Interface to the interface Hw driver functions * Assumption is these functions will be called after clocks are enabled - * @ setup_timing_gen : programs the timing engine - * @ setup_prog_fetch : enables/disables the programmable fetch logic - * @ enable_timing: enable/disable timing engine - * @ get_status: returns if timing engine is enabled or not - * @ get_line_count: reads current vertical line counter + * @setup_timing_gen : programs the timing engine + * @setup_prg_fetch : enables/disables the programmable fetch logic + * @enable_timing: enable/disable timing engine + * @get_status: returns if timing engine is enabled or not + * @get_line_count: reads current vertical line counter * @bind_pingpong_blk: enable/disable the connection with pingpong which will * feed pixels to this interface * @setup_misr: enable/disable MISR @@ -70,12 +70,9 @@ struct dpu_hw_intf_cmd_mode_cfg { * pointer and programs the tear check configuration * @disable_tearcheck: Disables tearcheck block * @connect_external_te: Read, modify, write to either set or clear listening to external TE - * Return: 1 if TE was originally connected, 0 if not, or -ERROR - * @get_vsync_info: Provides the programmed and current line_count - * @setup_autorefresh: Configure and enable the autorefresh config - * @get_autorefresh: Retrieve autorefresh config from hardware - * Return: 0 on success, -ETIMEDOUT on timeout + * Returns 1 if TE was originally connected, 0 if not, or -ERROR * @vsync_sel: Select vsync signal for tear-effect configuration + * @disable_autorefresh: Disable autorefresh if enabled * @program_intf_cmd_cfg: Program the DPU to interface datapath for command mode */ struct dpu_hw_intf_ops { @@ -109,9 +106,6 @@ struct dpu_hw_intf_ops { void (*vsync_sel)(struct dpu_hw_intf *intf, enum dpu_vsync_source vsync_source); - /** - * Disable autorefresh if enabled - */ void (*disable_autorefresh)(struct dpu_hw_intf *intf, uint32_t encoder_id, u16 vdisplay); void (*program_intf_cmd_cfg)(struct dpu_hw_intf *intf, From 51fc909d67884f9634233307d96a78b6b30a06d6 Mon Sep 17 00:00:00 2001 From: Randy Dunlap Date: Fri, 19 Dec 2025 10:46:28 -0800 Subject: [PATCH 0376/1321] drm/msm/dpu: dpu_hw_lm.h: fix all kernel-doc warnings Correct or add kernel-doc comments to eliminate all warnings: Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_lm.h:27 Cannot find identifier on line: * Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_lm.h:52 Incorrect use of kernel-doc format: * Clear layer mixer to pipe configuration Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_lm.h:59 Incorrect use of kernel-doc format: * Configure layer mixer to pipe configuration Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_lm.h:69 Incorrect use of kernel-doc format: * setup_border_color : enable/disable border color Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_lm.h:76 Incorrect use of kernel-doc format: * setup_misr: Enable/disable MISR Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_lm.h:81 Incorrect use of kernel-doc format: * collect_misr: Read MISR signature Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_lm.h:84 struct member 'setup_mixer_out' not described in 'dpu_hw_lm_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_lm.h:84 struct member 'setup_blend_config' not described in 'dpu_hw_lm_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_lm.h:84 struct member 'setup_alpha_out' not described in 'dpu_hw_lm_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_lm.h:84 struct member 'clear_all_blendstages' not described in 'dpu_hw_lm_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_lm.h:84 struct member 'setup_blendstage' not described in 'dpu_hw_lm_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_lm.h:84 struct member 'setup_border_color' not described in 'dpu_hw_lm_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_lm.h:84 struct member 'setup_misr' not described in 'dpu_hw_lm_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_lm.h:84 struct member 'collect_misr' not described in 'dpu_hw_lm_ops' Signed-off-by: Randy Dunlap Reviewed-by: Dmitry Baryshkov Patchwork: https://patchwork.freedesktop.org/patch/695648/ Link: https://lore.kernel.org/r/20251219184638.1813181-10-rdunlap@infradead.org Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/disp/dpu1/dpu_hw_lm.h | 23 +++++++++++------------ 1 file changed, 11 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_lm.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_lm.h index 1b9ecd082d7fd..ecbb77711d83f 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_lm.h +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_lm.h @@ -25,39 +25,38 @@ struct dpu_hw_color3_cfg { }; /** - * * struct dpu_hw_lm_ops : Interface to the mixer Hw driver functions * Assumption is these functions will be called after clocks are enabled */ struct dpu_hw_lm_ops { - /* - * Sets up mixer output width and height + /** + * @setup_mixer_out: Sets up mixer output width and height * and border color if enabled */ void (*setup_mixer_out)(struct dpu_hw_mixer *ctx, struct dpu_hw_mixer_cfg *cfg); - /* - * Alpha blending configuration + /** + * @setup_blend_config: Alpha blending configuration * for the specified stage */ void (*setup_blend_config)(struct dpu_hw_mixer *ctx, uint32_t stage, uint32_t fg_alpha, uint32_t bg_alpha, uint32_t blend_op); - /* - * Alpha color component selection from either fg or bg + /** + * @setup_alpha_out: Alpha color component selection from either fg or bg */ void (*setup_alpha_out)(struct dpu_hw_mixer *ctx, uint32_t mixer_op); /** - * Clear layer mixer to pipe configuration + * @clear_all_blendstages: Clear layer mixer to pipe configuration * @ctx : mixer ctx pointer * Returns: 0 on success or -error */ int (*clear_all_blendstages)(struct dpu_hw_mixer *ctx); /** - * Configure layer mixer to pipe configuration + * @setup_blendstage: Configure layer mixer to pipe configuration * @ctx : mixer ctx pointer * @lm : layer mixer enumeration * @stage_cfg : blend stage configuration @@ -67,19 +66,19 @@ struct dpu_hw_lm_ops { struct dpu_hw_stage_cfg *stage_cfg); /** - * setup_border_color : enable/disable border color + * @setup_border_color : enable/disable border color */ void (*setup_border_color)(struct dpu_hw_mixer *ctx, struct dpu_mdss_color *color, u8 border_en); /** - * setup_misr: Enable/disable MISR + * @setup_misr: Enable/disable MISR */ void (*setup_misr)(struct dpu_hw_mixer *ctx); /** - * collect_misr: Read MISR signature + * @collect_misr: Read MISR signature */ int (*collect_misr)(struct dpu_hw_mixer *ctx, u32 *misr_value); }; From e8cb706b8ece1f826a406b804cf062903a49dcaf Mon Sep 17 00:00:00 2001 From: Randy Dunlap Date: Fri, 19 Dec 2025 10:46:29 -0800 Subject: [PATCH 0377/1321] drm/msm/dpu: dpu_hw_merge3d.h: fix all kernel-doc warnings Delete one "empty" kernel-doc line to eliminate a warning: Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_merge3d.h:14 Cannot find identifier on line: * Signed-off-by: Randy Dunlap Reviewed-by: Dmitry Baryshkov Patchwork: https://patchwork.freedesktop.org/patch/695673/ Link: https://lore.kernel.org/r/20251219184638.1813181-11-rdunlap@infradead.org Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/disp/dpu1/dpu_hw_merge3d.h | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_merge3d.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_merge3d.h index 6833c02075236..b57f88187148b 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_merge3d.h +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_merge3d.h @@ -12,7 +12,6 @@ struct dpu_hw_merge_3d; /** - * * struct dpu_hw_merge_3d_ops : Interface to the merge_3d Hw driver functions * Assumption is these functions will be called after clocks are enabled * @setup_3d_mode : enable 3D merge From 6a5c3ae079eec314b05a384a790be465cc2c0cbb Mon Sep 17 00:00:00 2001 From: Randy Dunlap Date: Fri, 19 Dec 2025 10:46:30 -0800 Subject: [PATCH 0378/1321] drm/msm/dpu: dpu_hw_pingpong.h: fix all kernel-doc warnings Correct or add kernel-doc comments to eliminate all warnings: Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.h:36 Cannot find identifier on line: * Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.h:46 Incorrect use of kernel-doc format: * enables vysnc generation and sets up init value of Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.h:53 Incorrect use of kernel-doc format: * disables tear check block Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.h:58 Incorrect use of kernel-doc format: * read, modify, write to either set or clear listening to external TE Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.h:65 Incorrect use of kernel-doc format: * Obtain current vertical line counter Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.h:70 Incorrect use of kernel-doc format: * Disable autorefresh if enabled Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.h:75 Incorrect use of kernel-doc format: * Setup dither matix for pingpong block Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.h:80 Incorrect use of kernel-doc format: * Enable DSC Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.h:85 Incorrect use of kernel-doc format: * Disable DSC Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.h:90 Incorrect use of kernel-doc format: * Setup DSC Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.h:93 struct member 'connect_external_te' not described in 'dpu_hw_pingpong_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.h:93 struct member 'disable_autorefresh' not described in 'dpu_hw_pingpong_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.h:93 struct member 'enable_dsc' not described in 'dpu_hw_pingpong_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.h:93 struct member 'disable_dsc' not described in 'dpu_hw_pingpong_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.h:93 struct member 'setup_dsc' not described in 'dpu_hw_pingpong_ops' Signed-off-by: Randy Dunlap Reviewed-by: Dmitry Baryshkov Patchwork: https://patchwork.freedesktop.org/patch/695659/ Link: https://lore.kernel.org/r/20251219184638.1813181-12-rdunlap@infradead.org Signed-off-by: Dmitry Baryshkov --- .../gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.h | 20 +++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.h index dd99e1c21a1ee..effd012d864af 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.h +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.h @@ -34,7 +34,6 @@ struct dpu_hw_dither_cfg { }; /** - * * struct dpu_hw_pingpong_ops : Interface to the pingpong Hw driver functions * Assumption is these functions will be called after clocks are enabled * @enable_tearcheck: program and enable tear check block @@ -44,51 +43,52 @@ struct dpu_hw_dither_cfg { */ struct dpu_hw_pingpong_ops { /** - * enables vysnc generation and sets up init value of + * @enable_tearcheck: enables vysnc generation and sets up init value of * read pointer and programs the tear check cofiguration */ int (*enable_tearcheck)(struct dpu_hw_pingpong *pp, struct dpu_hw_tear_check *cfg); /** - * disables tear check block + * @disable_tearcheck: disables tear check block */ int (*disable_tearcheck)(struct dpu_hw_pingpong *pp); /** - * read, modify, write to either set or clear listening to external TE + * @connect_external_te: read, modify, write to either set or clear + * listening to external TE * @Return: 1 if TE was originally connected, 0 if not, or -ERROR */ int (*connect_external_te)(struct dpu_hw_pingpong *pp, bool enable_external_te); /** - * Obtain current vertical line counter + * @get_line_count: Obtain current vertical line counter */ u32 (*get_line_count)(struct dpu_hw_pingpong *pp); /** - * Disable autorefresh if enabled + * @disable_autorefresh: Disable autorefresh if enabled */ void (*disable_autorefresh)(struct dpu_hw_pingpong *pp, uint32_t encoder_id, u16 vdisplay); /** - * Setup dither matix for pingpong block + * @setup_dither: Setup dither matix for pingpong block */ void (*setup_dither)(struct dpu_hw_pingpong *pp, struct dpu_hw_dither_cfg *cfg); /** - * Enable DSC + * @enable_dsc: Enable DSC */ int (*enable_dsc)(struct dpu_hw_pingpong *pp); /** - * Disable DSC + * @disable_dsc: Disable DSC */ void (*disable_dsc)(struct dpu_hw_pingpong *pp); /** - * Setup DSC + * @setup_dsc: Setup DSC */ int (*setup_dsc)(struct dpu_hw_pingpong *pp); }; From 79abd347f1a62bfafe9f63799513e9b34867b457 Mon Sep 17 00:00:00 2001 From: Randy Dunlap Date: Fri, 19 Dec 2025 10:46:31 -0800 Subject: [PATCH 0379/1321] drm/msm/dpu: dpu_hw_sspp.h: fix all kernel-doc warnings Modify non-kernel-doc comments to begin with "/*" instead of "/**". Correct or add kernel-doc comments to eliminate all warnings: Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h:17 missing initial short description on line: * Flags Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h:19 expecting prototype for Flags(). Prototype was for DPU_SSPP_FLIP_LR() instead Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h:26 This comment starts with '/**', but isn't a kernel-doc comment. * Component indices Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h:47 cannot understand function prototype: 'enum dpu_sspp_multirect_index' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h:152 struct member 'dst_rect' not described in 'dpu_sw_pipe_cfg' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h:174 struct member 'multirect_index' not described in 'dpu_sw_pipe' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h:174 struct member 'multirect_mode' not described in 'dpu_sw_pipe' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h:183 Incorrect use of kernel-doc format: * setup_format - setup pixel format cropping rectangle, flip Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h:192 Incorrect use of kernel-doc format: * setup_rects - setup pipe ROI rectangles Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h:200 Incorrect use of kernel-doc format: * setup_pe - setup pipe pixel extension Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h:208 Incorrect use of kernel-doc format: * setup_sourceaddress - setup pipe source addresses Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h:216 Incorrect use of kernel-doc format: * setup_csc - setup color space coversion Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h:223 Incorrect use of kernel-doc format: * setup_solidfill - enable/disable colorfill Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h:231 Incorrect use of kernel-doc format: * setup_multirect - setup multirect configuration Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h:238 Incorrect use of kernel-doc format: * setup_sharpening - setup sharpening Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h:247 Incorrect use of kernel-doc format: * setup_qos_lut - setup QoS LUTs Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h:255 Incorrect use of kernel-doc format: * setup_qos_ctrl - setup QoS control Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h:263 Incorrect use of kernel-doc format: * setup_clk_force_ctrl - setup clock force control Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h:271 Incorrect use of kernel-doc format: * setup_histogram - setup histograms Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h:279 Incorrect use of kernel-doc format: * setup_scaler - setup scaler Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h:288 Incorrect use of kernel-doc format: * setup_cdp - setup client driven prefetch Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h:296 struct member 'setup_format' not described in 'dpu_hw_sspp_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h:296 struct member 'setup_rects' not described in 'dpu_hw_sspp_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h:296 struct member 'setup_pe' not described in 'dpu_hw_sspp_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h:296 struct member 'setup_sourceaddress' not described in 'dpu_hw_sspp_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h:296 struct member 'setup_csc' not described in 'dpu_hw_sspp_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h:296 struct member 'setup_solidfill' not described in 'dpu_hw_sspp_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h:296 struct member 'setup_multirect' not described in 'dpu_hw_sspp_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h:296 struct member 'setup_sharpening' not described in 'dpu_hw_sspp_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h:296 struct member 'setup_qos_lut' not described in 'dpu_hw_sspp_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h:296 struct member 'setup_qos_ctrl' not described in 'dpu_hw_sspp_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h:296 struct member 'setup_clk_force_ctrl' not described in 'dpu_hw_sspp_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h:296 struct member 'setup_histogram' not described in 'dpu_hw_sspp_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h:296 struct member 'setup_scaler' not described in 'dpu_hw_sspp_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h:296 struct member 'setup_cdp' not described in 'dpu_hw_sspp_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h:320 struct member 'mdss_ver' not described in 'dpu_hw_sspp' Signed-off-by: Randy Dunlap Reviewed-by: Dmitry Baryshkov Patchwork: https://patchwork.freedesktop.org/patch/695661/ Link: https://lore.kernel.org/r/20251219184638.1813181-13-rdunlap@infradead.org Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h | 47 +++++++++++---------- 1 file changed, 24 insertions(+), 23 deletions(-) diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h index bdac5c04bf790..3822094f85bc5 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h @@ -14,7 +14,7 @@ struct dpu_hw_sspp; #define DPU_SSPP_MAX_PITCH_SIZE 0xffff -/** +/* * Flags */ #define DPU_SSPP_FLIP_LR BIT(0) @@ -23,7 +23,7 @@ struct dpu_hw_sspp; #define DPU_SSPP_ROT_90 BIT(3) #define DPU_SSPP_SOLID_FILL BIT(4) -/** +/* * Component indices */ enum { @@ -36,9 +36,10 @@ enum { }; /** - * DPU_SSPP_RECT_SOLO - multirect disabled - * DPU_SSPP_RECT_0 - rect0 of a multirect pipe - * DPU_SSPP_RECT_1 - rect1 of a multirect pipe + * enum dpu_sspp_multirect_index - multirect mode + * @DPU_SSPP_RECT_SOLO: multirect disabled + * @DPU_SSPP_RECT_0: rect0 of a multirect pipe + * @DPU_SSPP_RECT_1: rect1 of a multirect pipe * * Note: HW supports multirect with either RECT0 or * RECT1. Considering no benefit of such configs over @@ -143,7 +144,7 @@ struct dpu_hw_pixel_ext { * struct dpu_sw_pipe_cfg : software pipe configuration * @src_rect: src ROI, caller takes into account the different operations * such as decimation, flip etc to program this field - * @dest_rect: destination ROI. + * @dst_rect: destination ROI. * @rotation: simplified drm rotation hint */ struct dpu_sw_pipe_cfg { @@ -165,8 +166,8 @@ struct dpu_hw_pipe_ts_cfg { /** * struct dpu_sw_pipe - software pipe description * @sspp: backing SSPP pipe - * @index: index of the rectangle of SSPP - * @mode: parallel or time multiplex multirect mode + * @multirect_index: index of the rectangle of SSPP + * @multirect_mode: parallel or time multiplex multirect mode */ struct dpu_sw_pipe { struct dpu_hw_sspp *sspp; @@ -181,7 +182,7 @@ struct dpu_sw_pipe { */ struct dpu_hw_sspp_ops { /** - * setup_format - setup pixel format cropping rectangle, flip + * @setup_format: setup pixel format cropping rectangle, flip * @pipe: Pointer to software pipe context * @cfg: Pointer to pipe config structure * @flags: Extra flags for format config @@ -190,7 +191,7 @@ struct dpu_hw_sspp_ops { const struct msm_format *fmt, u32 flags); /** - * setup_rects - setup pipe ROI rectangles + * @setup_rects: setup pipe ROI rectangles * @pipe: Pointer to software pipe context * @cfg: Pointer to pipe config structure */ @@ -198,7 +199,7 @@ struct dpu_hw_sspp_ops { struct dpu_sw_pipe_cfg *cfg); /** - * setup_pe - setup pipe pixel extension + * @setup_pe: setup pipe pixel extension * @ctx: Pointer to pipe context * @pe_ext: Pointer to pixel ext settings */ @@ -206,7 +207,7 @@ struct dpu_hw_sspp_ops { struct dpu_hw_pixel_ext *pe_ext); /** - * setup_sourceaddress - setup pipe source addresses + * @setup_sourceaddress: setup pipe source addresses * @pipe: Pointer to software pipe context * @layout: format layout information for programming buffer to hardware */ @@ -214,14 +215,14 @@ struct dpu_hw_sspp_ops { struct dpu_hw_fmt_layout *layout); /** - * setup_csc - setup color space coversion + * @setup_csc: setup color space coversion * @ctx: Pointer to pipe context * @data: Pointer to config structure */ void (*setup_csc)(struct dpu_hw_sspp *ctx, const struct dpu_csc_cfg *data); /** - * setup_solidfill - enable/disable colorfill + * @setup_solidfill: enable/disable colorfill * @pipe: Pointer to software pipe context * @const_color: Fill color value * @flags: Pipe flags @@ -229,23 +230,22 @@ struct dpu_hw_sspp_ops { void (*setup_solidfill)(struct dpu_sw_pipe *pipe, u32 color); /** - * setup_multirect - setup multirect configuration + * @setup_multirect: setup multirect configuration * @pipe: Pointer to software pipe context */ void (*setup_multirect)(struct dpu_sw_pipe *pipe); /** - * setup_sharpening - setup sharpening + * @setup_sharpening: setup sharpening * @ctx: Pointer to pipe context * @cfg: Pointer to config structure */ void (*setup_sharpening)(struct dpu_hw_sspp *ctx, struct dpu_hw_sharp_cfg *cfg); - /** - * setup_qos_lut - setup QoS LUTs + * @setup_qos_lut: setup QoS LUTs * @ctx: Pointer to pipe context * @cfg: LUT configuration */ @@ -253,7 +253,7 @@ struct dpu_hw_sspp_ops { struct dpu_hw_qos_cfg *cfg); /** - * setup_qos_ctrl - setup QoS control + * @setup_qos_ctrl: setup QoS control * @ctx: Pointer to pipe context * @danger_safe_en: flags controlling enabling of danger/safe QoS/LUT */ @@ -261,7 +261,7 @@ struct dpu_hw_sspp_ops { bool danger_safe_en); /** - * setup_clk_force_ctrl - setup clock force control + * @setup_clk_force_ctrl: setup clock force control * @ctx: Pointer to pipe context * @enable: enable clock force if true */ @@ -269,7 +269,7 @@ struct dpu_hw_sspp_ops { bool enable); /** - * setup_histogram - setup histograms + * @setup_histogram: setup histograms * @ctx: Pointer to pipe context * @cfg: Pointer to histogram configuration */ @@ -277,7 +277,7 @@ struct dpu_hw_sspp_ops { void *cfg); /** - * setup_scaler - setup scaler + * @setup_scaler: setup scaler * @scaler3_cfg: Pointer to scaler configuration * @format: pixel format parameters */ @@ -286,7 +286,7 @@ struct dpu_hw_sspp_ops { const struct msm_format *format); /** - * setup_cdp - setup client driven prefetch + * @setup_cdp: setup client driven prefetch * @pipe: Pointer to software pipe context * @fmt: format used by the sw pipe * @enable: whether the CDP should be enabled for this pipe @@ -303,6 +303,7 @@ struct dpu_hw_sspp_ops { * @ubwc: UBWC configuration data * @idx: pipe index * @cap: pointer to layer_cfg + * @mdss_ver: MDSS version info to use for feature checks * @ops: pointer to operations possible for this pipe */ struct dpu_hw_sspp { From 8d3ad0d5ea62b5979c103648c47fabea4604a5dc Mon Sep 17 00:00:00 2001 From: Randy Dunlap Date: Fri, 19 Dec 2025 10:46:32 -0800 Subject: [PATCH 0380/1321] drm/msm/dpu: dpu_hw_top.h: fix all kernel-doc warnings Correct or add kernel-doc comments to eliminate all warnings: Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_top.h:93 Incorrect use of kernel-doc format: * setup_traffic_shaper() : Setup traffic shaper control Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_top.h:101 Incorrect use of kernel-doc format: * setup_clk_force_ctrl - set clock force control Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_top.h:111 Incorrect use of kernel-doc format: * get_danger_status - get danger status Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_top.h:119 Incorrect use of kernel-doc format: * setup_vsync_source - setup vsync source configuration details Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_top.h:127 Incorrect use of kernel-doc format: * get_safe_status - get safe status Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_top.h:135 Incorrect use of kernel-doc format: * dp_phy_intf_sel - configure intf to phy mapping Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_top.h:142 Incorrect use of kernel-doc format: * intf_audio_select - select the external interface for audio Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_top.h:146 struct member 'setup_clk_force_ctrl' not described in 'dpu_hw_mdp_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_top.h:146 struct member 'get_danger_status' not described in 'dpu_hw_mdp_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_top.h:146 struct member 'setup_vsync_source' not described in 'dpu_hw_mdp_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_top.h:146 struct member 'get_safe_status' not described in 'dpu_hw_mdp_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_top.h:146 struct member 'dp_phy_intf_sel' not described in 'dpu_hw_mdp_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_top.h:146 struct member 'intf_audio_select' not described in 'dpu_hw_mdp_ops' Signed-off-by: Randy Dunlap Reviewed-by: Dmitry Baryshkov Patchwork: https://patchwork.freedesktop.org/patch/695669/ Link: https://lore.kernel.org/r/20251219184638.1813181-14-rdunlap@infradead.org Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/disp/dpu1/dpu_hw_top.h | 21 ++++++++++----------- 1 file changed, 10 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_top.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_top.h index 04efdcd21ceb0..80279d87c2cde 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_top.h +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_top.h @@ -77,12 +77,11 @@ enum dpu_dp_phy_sel { /** * struct dpu_hw_mdp_ops - interface to the MDP TOP Hw driver functions * Assumption is these functions will be called after clocks are enabled. - * @setup_split_pipe : Programs the pipe control registers - * @setup_pp_split : Programs the pp split control registers - * @setup_traffic_shaper : programs traffic shaper control */ struct dpu_hw_mdp_ops { - /** setup_split_pipe() : Registers are not double buffered, thisk + /** + * @setup_split_pipe : Programs the pipe control registers. + * Registers are not double buffered, this * function should be called before timing control enable * @mdp : mdp top context driver * @cfg : upper and lower part of pipe configuration @@ -91,7 +90,7 @@ struct dpu_hw_mdp_ops { struct split_pipe_cfg *p); /** - * setup_traffic_shaper() : Setup traffic shaper control + * @setup_traffic_shaper : programs traffic shaper control. * @mdp : mdp top context driver * @cfg : traffic shaper configuration */ @@ -99,7 +98,7 @@ struct dpu_hw_mdp_ops { struct traffic_shaper_cfg *cfg); /** - * setup_clk_force_ctrl - set clock force control + * @setup_clk_force_ctrl: set clock force control * @mdp: mdp top context driver * @clk_ctrl: clock to be controlled * @enable: force on enable @@ -109,7 +108,7 @@ struct dpu_hw_mdp_ops { enum dpu_clk_ctrl_type clk_ctrl, bool enable); /** - * get_danger_status - get danger status + * @get_danger_status: get danger status * @mdp: mdp top context driver * @status: Pointer to danger safe status */ @@ -117,7 +116,7 @@ struct dpu_hw_mdp_ops { struct dpu_danger_safe_status *status); /** - * setup_vsync_source - setup vsync source configuration details + * @setup_vsync_source: setup vsync source configuration details * @mdp: mdp top context driver * @cfg: vsync source selection configuration */ @@ -125,7 +124,7 @@ struct dpu_hw_mdp_ops { struct dpu_vsync_source_cfg *cfg); /** - * get_safe_status - get safe status + * @get_safe_status: get safe status * @mdp: mdp top context driver * @status: Pointer to danger safe status */ @@ -133,14 +132,14 @@ struct dpu_hw_mdp_ops { struct dpu_danger_safe_status *status); /** - * dp_phy_intf_sel - configure intf to phy mapping + * @dp_phy_intf_sel: configure intf to phy mapping * @mdp: mdp top context driver * @phys: list of phys the DP interfaces should be connected to. 0 disables the INTF. */ void (*dp_phy_intf_sel)(struct dpu_hw_mdp *mdp, enum dpu_dp_phy_sel phys[2]); /** - * intf_audio_select - select the external interface for audio + * @intf_audio_select: select the external interface for audio * @mdp: mdp top context driver */ void (*intf_audio_select)(struct dpu_hw_mdp *mdp); From ed2449da1f70ac4fb15a32d11a1f38634ff2dd1a Mon Sep 17 00:00:00 2001 From: Randy Dunlap Date: Fri, 19 Dec 2025 10:46:33 -0800 Subject: [PATCH 0381/1321] drm/msm/dpu: dpu_hw_vbif.h: fix all kernel-doc warnings Correct or add kernel-doc comments to eliminate all warnings: Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_vbif.h:19 Incorrect use of kernel-doc format: * set_limit_conf - set transaction limit config Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_vbif.h:29 Incorrect use of kernel-doc format: * get_limit_conf - get transaction limit config Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_vbif.h:39 Incorrect use of kernel-doc format: * set_halt_ctrl - set halt control Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_vbif.h:48 Incorrect use of kernel-doc format: * get_halt_ctrl - get halt control Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_vbif.h:57 Incorrect use of kernel-doc format: * set_qos_remap - set QoS priority remap Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_vbif.h:67 Incorrect use of kernel-doc format: * set_mem_type - set memory type Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_vbif.h:76 Incorrect use of kernel-doc format: * clear_errors - clear any vbif errors Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_vbif.h:88 Incorrect use of kernel-doc format: * set_write_gather_en - set write_gather enable Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_vbif.h:93 struct member 'limit' not described in 'dpu_hw_vbif_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_vbif.h:93 struct member 'set_limit_conf' not described in 'dpu_hw_vbif_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_vbif.h:93 struct member 'get_limit_conf' not described in 'dpu_hw_vbif_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_vbif.h:93 struct member 'set_halt_ctrl' not described in 'dpu_hw_vbif_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_vbif.h:93 struct member 'get_halt_ctrl' not described in 'dpu_hw_vbif_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_vbif.h:93 struct member 'set_qos_remap' not described in 'dpu_hw_vbif_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_vbif.h:93 struct member 'set_mem_type' not described in 'dpu_hw_vbif_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_vbif.h:93 struct member 's' not described in 'dpu_hw_vbif_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_vbif.h:93 struct member 'set_write_gather_en' not described in 'dpu_hw_vbif_ops' Signed-off-by: Randy Dunlap Reviewed-by: Dmitry Baryshkov Patchwork: https://patchwork.freedesktop.org/patch/695680/ Link: https://lore.kernel.org/r/20251219184638.1813181-15-rdunlap@infradead.org Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/disp/dpu1/dpu_hw_vbif.h | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_vbif.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_vbif.h index 285121ec804cc..9ac49448e4325 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_vbif.h +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_vbif.h @@ -17,7 +17,7 @@ struct dpu_hw_vbif; */ struct dpu_hw_vbif_ops { /** - * set_limit_conf - set transaction limit config + * @set_limit_conf: set transaction limit config * @vbif: vbif context driver * @xin_id: client interface identifier * @rd: true for read limit; false for write limit @@ -27,7 +27,7 @@ struct dpu_hw_vbif_ops { u32 xin_id, bool rd, u32 limit); /** - * get_limit_conf - get transaction limit config + * @get_limit_conf: get transaction limit config * @vbif: vbif context driver * @xin_id: client interface identifier * @rd: true for read limit; false for write limit @@ -37,7 +37,7 @@ struct dpu_hw_vbif_ops { u32 xin_id, bool rd); /** - * set_halt_ctrl - set halt control + * @set_halt_ctrl: set halt control * @vbif: vbif context driver * @xin_id: client interface identifier * @enable: halt control enable @@ -46,7 +46,7 @@ struct dpu_hw_vbif_ops { u32 xin_id, bool enable); /** - * get_halt_ctrl - get halt control + * @get_halt_ctrl: get halt control * @vbif: vbif context driver * @xin_id: client interface identifier * @return: halt control enable @@ -55,7 +55,7 @@ struct dpu_hw_vbif_ops { u32 xin_id); /** - * set_qos_remap - set QoS priority remap + * @set_qos_remap: set QoS priority remap * @vbif: vbif context driver * @xin_id: client interface identifier * @level: priority level @@ -65,7 +65,7 @@ struct dpu_hw_vbif_ops { u32 xin_id, u32 level, u32 remap_level); /** - * set_mem_type - set memory type + * @set_mem_type: set memory type * @vbif: vbif context driver * @xin_id: client interface identifier * @value: memory type value @@ -74,7 +74,7 @@ struct dpu_hw_vbif_ops { u32 xin_id, u32 value); /** - * clear_errors - clear any vbif errors + * @clear_errors: clear any vbif errors * This function clears any detected pending/source errors * on the VBIF interface, and optionally returns the detected * error mask(s). @@ -86,7 +86,7 @@ struct dpu_hw_vbif_ops { u32 *pnd_errors, u32 *src_errors); /** - * set_write_gather_en - set write_gather enable + * @set_write_gather_en: set write_gather enable * @vbif: vbif context driver * @xin_id: client interface identifier */ From 11b07a65587ca3bee5eb6bc7a463cfd25062e298 Mon Sep 17 00:00:00 2001 From: Randy Dunlap Date: Fri, 19 Dec 2025 10:46:34 -0800 Subject: [PATCH 0382/1321] drm/msm/dpu: dpu_hw_wb.h: fix all kernel-doc warnings Correct or add kernel-doc comments to eliminate all warnings: Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_wb.h:24 Cannot find identifier on line: * Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_wb.h:57 struct member 'setup_roi' not described in 'dpu_hw_wb_ops' Warning: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_wb.h:75 struct member 'caps' not described in 'dpu_hw_wb' Signed-off-by: Randy Dunlap Reviewed-by: Dmitry Baryshkov Patchwork: https://patchwork.freedesktop.org/patch/695672/ Link: https://lore.kernel.org/r/20251219184638.1813181-16-rdunlap@infradead.org Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/disp/dpu1/dpu_hw_wb.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_wb.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_wb.h index ee5e5ab786e1b..cfdbb5bb2a0f3 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_wb.h +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_wb.h @@ -22,11 +22,11 @@ struct dpu_hw_wb_cfg { }; /** - * * struct dpu_hw_wb_ops : Interface to the wb hw driver functions * Assumption is these functions will be called after clocks are enabled * @setup_outaddress: setup output address from the writeback job * @setup_outformat: setup output format of writeback block from writeback job + * @setup_roi: setup ROI (Region of Interest) parameters * @setup_qos_lut: setup qos LUT for writeback block based on input * @setup_cdp: setup chroma down prefetch block for writeback block * @setup_clk_force_ctrl: setup clock force control @@ -61,7 +61,7 @@ struct dpu_hw_wb_ops { * struct dpu_hw_wb : WB driver object * @hw: block hardware details * @idx: hardware index number within type - * @wb_hw_caps: hardware capabilities + * @caps: hardware capabilities * @ops: function pointers */ struct dpu_hw_wb { From 87b2543805f838768a85417d51d7829857ca50c4 Mon Sep 17 00:00:00 2001 From: Randy Dunlap Date: Fri, 19 Dec 2025 10:46:35 -0800 Subject: [PATCH 0383/1321] drm/msm: msm_fence.h: fix all kernel-doc warnings Correct or add kernel-doc comments to eliminate all warnings: Warning: drivers/gpu/drm/msm/msm_fence.h:27 Incorrect use of kernel-doc format: * last_fence: Warning: drivers/gpu/drm/msm/msm_fence.h:36 Incorrect use of kernel-doc format: * completed_fence: Warning: drivers/gpu/drm/msm/msm_fence.h:44 Incorrect use of kernel-doc format: * fenceptr: Warning: drivers/gpu/drm/msm/msm_fence.h:65 Incorrect use of kernel-doc format: * next_deadline_fence: Warning: drivers/gpu/drm/msm/msm_fence.h:74 struct member 'dev' not described in 'msm_fence_context' Warning: drivers/gpu/drm/msm/msm_fence.h:74 struct member 'name' not described in 'msm_fence_context' Warning: drivers/gpu/drm/msm/msm_fence.h:74 struct member 'context' not described in 'msm_fence_context' Warning: drivers/gpu/drm/msm/msm_fence.h:74 struct member 'index' not described in 'msm_fence_context' Warning: drivers/gpu/drm/msm/msm_fence.h:74 struct member 'fence' not described in 'msm_fence_context' Warning: drivers/gpu/drm/msm/msm_fence.h:74 struct member 'there is no remaining pending work */ uint32_t last_fence' not described in 'msm_fence_context' Warning: drivers/gpu/drm/msm/msm_fence.h:74 struct member 'updated from the CPU after interrupt * from GPU */ uint32_t completed_fence' not described in 'msm_fence_context' Warning: drivers/gpu/drm/msm/msm_fence.h:74 struct member 'fenceptr' not described in 'msm_fence_context' Warning: drivers/gpu/drm/msm/msm_fence.h:74 struct member 'spinlock' not described in 'msm_fence_context' Warning: drivers/gpu/drm/msm/msm_fence.h:74 struct member 'next_deadline' not described in 'msm_fence_context' Warning: drivers/gpu/drm/msm/msm_fence.h:74 struct member 'next_deadline_fence' not described in 'msm_fence_context' Warning: drivers/gpu/drm/msm/msm_fence.h:74 struct member 'deadline_timer' not described in 'msm_fence_context' Warning: drivers/gpu/drm/msm/msm_fence.h:74 struct member 'deadline_work' not described in 'msm_fence_context' Signed-off-by: Randy Dunlap Reviewed-by: Dmitry Baryshkov Patchwork: https://patchwork.freedesktop.org/patch/695667/ Link: https://lore.kernel.org/r/20251219184638.1813181-17-rdunlap@infradead.org Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/msm_fence.h | 36 +++++++++++++++++---------------- 1 file changed, 19 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/msm/msm_fence.h b/drivers/gpu/drm/msm/msm_fence.h index 148196375a0b6..3317a485beef5 100644 --- a/drivers/gpu/drm/msm/msm_fence.h +++ b/drivers/gpu/drm/msm/msm_fence.h @@ -16,34 +16,29 @@ * incrementing fence seqno at the end of each submit */ struct msm_fence_context { + /** @dev: the drm device */ struct drm_device *dev; - /** name: human readable name for fence timeline */ + /** @name: human readable name for fence timeline */ char name[32]; - /** context: see dma_fence_context_alloc() */ + /** @context: see dma_fence_context_alloc() */ unsigned context; - /** index: similar to context, but local to msm_fence_context's */ + /** @index: similar to context, but local to msm_fence_context's */ unsigned index; - /** - * last_fence: - * + * @last_fence: * Last assigned fence, incremented each time a fence is created * on this fence context. If last_fence == completed_fence, * there is no remaining pending work */ uint32_t last_fence; - /** - * completed_fence: - * + * @completed_fence: * The last completed fence, updated from the CPU after interrupt * from GPU */ uint32_t completed_fence; - /** - * fenceptr: - * + * @fenceptr: * The address that the GPU directly writes with completed fence * seqno. This can be ahead of completed_fence. We can peek at * this to see if a fence has already signaled but the CPU hasn't @@ -51,6 +46,9 @@ struct msm_fence_context { */ volatile uint32_t *fenceptr; + /** + * @spinlock: fence context spinlock + */ spinlock_t spinlock; /* @@ -59,18 +57,22 @@ struct msm_fence_context { * don't queue, so maybe that is ok */ - /** next_deadline: Time of next deadline */ + /** @next_deadline: Time of next deadline */ ktime_t next_deadline; - /** - * next_deadline_fence: - * + * @next_deadline_fence: * Fence value for next pending deadline. The deadline timer is * canceled when this fence is signaled. */ uint32_t next_deadline_fence; - + /** + * @deadline_timer: tracks nearest deadline of a fence timeline and + * expires just before it. + */ struct hrtimer deadline_timer; + /** + * @deadline_work: work to do after deadline_timer expires + */ struct kthread_work deadline_work; }; From 57b597ff0c4f6d39e0367ab8a0706db9779ae341 Mon Sep 17 00:00:00 2001 From: Randy Dunlap Date: Fri, 19 Dec 2025 10:46:36 -0800 Subject: [PATCH 0384/1321] drm/msm: msm_gem_vma.c: fix all kernel-doc warnings Correct or add kernel-doc comments to eliminate all warnings: Warning: ../drivers/gpu/drm/msm/msm_gem_vma.c:96 expecting prototype for struct msm_vma_op. Prototype was for struct msm_vm_op instead Signed-off-by: Randy Dunlap Reviewed-by: Dmitry Baryshkov Patchwork: https://patchwork.freedesktop.org/patch/695679/ Link: https://lore.kernel.org/r/20251219184638.1813181-18-rdunlap@infradead.org Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/msm_gem_vma.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/msm/msm_gem_vma.c b/drivers/gpu/drm/msm/msm_gem_vma.c index 71d5238437eb8..8f7c90167447e 100644 --- a/drivers/gpu/drm/msm/msm_gem_vma.c +++ b/drivers/gpu/drm/msm/msm_gem_vma.c @@ -65,7 +65,7 @@ struct msm_vm_unmap_op { }; /** - * struct msm_vma_op - A MAP or UNMAP operation + * struct msm_vm_op - A MAP or UNMAP operation */ struct msm_vm_op { /** @op: The operation type */ @@ -798,6 +798,9 @@ static const struct drm_sched_backend_ops msm_vm_bind_ops = { * synchronous operations are supported. In a user managed VM, userspace * handles virtual address allocation, and both async and sync operations * are supported. + * + * Returns: pointer to the created &struct drm_gpuvm on success + * or an ERR_PTR(-errno) on failure. */ struct drm_gpuvm * msm_gem_vm_create(struct drm_device *drm, struct msm_mmu *mmu, const char *name, From 145202daf9c87a1aff74744f87e65485c5378cd4 Mon Sep 17 00:00:00 2001 From: Randy Dunlap Date: Fri, 19 Dec 2025 10:46:37 -0800 Subject: [PATCH 0385/1321] drm/msm: msm_gpu.h: fix all kernel-doc warnings Correct or add kernel-doc comments to eliminate all warnings: Warning: drivers/gpu/drm/msm/msm_gpu.h:119 Incorrect use of kernel-doc format: * devfreq: devfreq instance Warning: drivers/gpu/drm/msm/msm_gpu.h:125 Incorrect use of kernel-doc format: * idle_freq: Warning: drivers/gpu/drm/msm/msm_gpu.h:136 Incorrect use of kernel-doc format: * boost_constraint: Warning: drivers/gpu/drm/msm/msm_gpu.h:144 Incorrect use of kernel-doc format: * busy_cycles: Last busy counter value, for calculating elapsed busy Warning: drivers/gpu/drm/msm/msm_gpu.h:156 Incorrect use of kernel-doc format: * idle_work: Warning: drivers/gpu/drm/msm/msm_gpu.h:163 Incorrect use of kernel-doc format: * boost_work: Warning: drivers/gpu/drm/msm/msm_gpu.h:170 struct member 'devfreq' not described in 'msm_gpu_devfreq' Warning: drivers/gpu/drm/msm/msm_gpu.h:170 struct member 'boost_freq' not described in 'msm_gpu_devfreq' Warning: drivers/gpu/drm/msm/msm_gpu.h:172 struct member 'devfreq' not described in 'msm_gpu_devfreq' Warning: drivers/gpu/drm/msm/msm_gpu.h:172 struct member 'lock' not described in 'msm_gpu_devfreq' Warning: drivers/gpu/drm/msm/msm_gpu.h:172 struct member 'governor' not described in 'msm_gpu_devfreq' Warning: drivers/gpu/drm/msm/msm_gpu.h:172 struct member 'we are continuing to sample busyness and * adjust frequency while the GPU is idle' not described in 'msm_gpu_devfreq' Warning: drivers/gpu/drm/msm/msm_gpu.h:172 struct member 'boost_freq' not described in 'msm_gpu_devfreq' Warning: drivers/gpu/drm/msm/msm_gpu.h:172 struct member 'busy_cycles' not described in 'msm_gpu_devfreq' Warning: drivers/gpu/drm/msm/msm_gpu.h:172 struct member 'time' not described in 'msm_gpu_devfreq' Warning: drivers/gpu/drm/msm/msm_gpu.h:172 struct member 'idle_time' not described in 'msm_gpu_devfreq' Warning: drivers/gpu/drm/msm/msm_gpu.h:172 struct member 'idle_work' not described in 'msm_gpu_devfreq' Warning: drivers/gpu/drm/msm/msm_gpu.h:172 struct member 'boost_work' not described in 'msm_gpu_devfreq' Warning: drivers/gpu/drm/msm/msm_gpu.h:172 struct member 'suspended' not described in 'msm_gpu_devfreq' Warning: drivers/gpu/drm/msm/msm_gpu.h:472 No description found for return value of 'msm_context_is_vmbind' Warning: drivers/gpu/drm/msm/msm_gpu.h:476 struct member 'ref' not described in 'msm_context' Warning: drivers/gpu/drm/msm/msm_gpu.h:476 struct member 'elapsed_ns' not described in 'msm_context' Warning: drivers/gpu/drm/msm/msm_gpu.h:492 expecting prototype for msm_context_is_vm_bind(). Prototype was for msm_context_is_vmbind() instead Warning: drivers/gpu/drm/msm/msm_gpu.h:523 No description found for return value of 'msm_gpu_convert_priority' Warning: drivers/gpu/drm/msm/msm_gpu.h:583 expecting prototype for struct msm_gpu_submitqueues. Prototype was for struct msm_gpu_submitqueue instead Signed-off-by: Randy Dunlap Reviewed-by: Dmitry Baryshkov Patchwork: https://patchwork.freedesktop.org/patch/695671/ Link: https://lore.kernel.org/r/20251219184638.1813181-19-rdunlap@infradead.org Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/msm_gpu.h | 68 ++++++++++------------------------- 1 file changed, 18 insertions(+), 50 deletions(-) diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h index 2894fc118485f..666cf499b7ec5 100644 --- a/drivers/gpu/drm/msm/msm_gpu.h +++ b/drivers/gpu/drm/msm/msm_gpu.h @@ -116,15 +116,12 @@ struct msm_gpu_fault_info { * struct msm_gpu_devfreq - devfreq related state */ struct msm_gpu_devfreq { - /** devfreq: devfreq instance */ + /** @devfreq: devfreq instance */ struct devfreq *devfreq; - - /** lock: lock for "suspended", "busy_cycles", and "time" */ + /** @lock: lock for "suspended", "busy_cycles", and "time" */ struct mutex lock; - /** - * idle_freq: - * + * @idle_freq: * Shadow frequency used while the GPU is idle. From the PoV of * the devfreq governor, we are continuing to sample busyness and * adjust frequency while the GPU is idle, but we use this shadow @@ -132,43 +129,34 @@ struct msm_gpu_devfreq { * it is inactive. */ unsigned long idle_freq; - /** - * boost_constraint: - * + * @boost_freq: * A PM QoS constraint to boost min freq for a period of time * until the boost expires. */ struct dev_pm_qos_request boost_freq; - /** - * busy_cycles: Last busy counter value, for calculating elapsed busy + * @busy_cycles: Last busy counter value, for calculating elapsed busy * cycles since last sampling period. */ u64 busy_cycles; - - /** time: Time of last sampling period. */ + /** @time: Time of last sampling period. */ ktime_t time; - - /** idle_time: Time of last transition to idle: */ + /** @idle_time: Time of last transition to idle. */ ktime_t idle_time; - /** - * idle_work: - * + * @idle_work: * Used to delay clamping to idle freq on active->idle transition. */ struct msm_hrtimer_work idle_work; - /** - * boost_work: - * + * @boost_work: * Used to reset the boost_constraint after the boost period has * elapsed */ struct msm_hrtimer_work boost_work; - /** suspended: tracks if we're suspended */ + /** @suspended: tracks if we're suspended */ bool suspended; }; @@ -358,57 +346,43 @@ struct msm_gpu_perfcntr { struct msm_context { /** @queuelock: synchronizes access to submitqueues list */ rwlock_t queuelock; - /** @submitqueues: list of &msm_gpu_submitqueue created by userspace */ struct list_head submitqueues; - /** * @queueid: - * * Counter incremented each time a submitqueue is created, used to * assign &msm_gpu_submitqueue.id */ int queueid; - /** * @closed: The device file associated with this context has been closed. - * * Once the device is closed, any submits that have not been written * to the ring buffer are no-op'd. */ bool closed; - /** * @userspace_managed_vm: - * * Has userspace opted-in to userspace managed VM (ie. VM_BIND) via * MSM_PARAM_EN_VM_BIND? */ bool userspace_managed_vm; - /** * @vm: - * * The per-process GPU address-space. Do not access directly, use * msm_context_vm(). */ struct drm_gpuvm *vm; - - /** @kref: the reference count */ + /** @ref: the reference count */ struct kref ref; - /** * @seqno: - * * A unique per-process sequence number. Used to detect context * switches, without relying on keeping a, potentially dangling, * pointer to the previous context. */ int seqno; - /** * @sysprof: - * * The value of MSM_PARAM_SYSPROF set by userspace. This is * intended to be used by system profiling tools like Mesa's * pps-producer (perfetto), and restricted to CAP_SYS_ADMIN. @@ -423,40 +397,32 @@ struct msm_context { * file is closed. */ int sysprof; - /** * @comm: Overridden task comm, see MSM_PARAM_COMM * * Accessed under msm_gpu::lock */ char *comm; - /** * @cmdline: Overridden task cmdline, see MSM_PARAM_CMDLINE * * Accessed under msm_gpu::lock */ char *cmdline; - /** - * @elapsed: - * + * @elapsed_ns: * The total (cumulative) elapsed time GPU was busy with rendering * from this context in ns. */ uint64_t elapsed_ns; - /** * @cycles: - * * The total (cumulative) GPU cycles elapsed attributed to this * context. */ uint64_t cycles; - /** * @entities: - * * Table of per-priority-level sched entities used by submitqueues * associated with this &drm_file. Because some userspace apps * make assumptions about rendering from multiple gl contexts @@ -466,10 +432,8 @@ struct msm_context { * level. */ struct drm_sched_entity *entities[NR_SCHED_PRIORITIES * MSM_GPU_MAX_RINGS]; - /** * @ctx_mem: - * * Total amount of memory of GEM buffers with handles attached for * this context. */ @@ -479,7 +443,7 @@ struct msm_context { struct drm_gpuvm *msm_context_vm(struct drm_device *dev, struct msm_context *ctx); /** - * msm_context_is_vm_bind() - has userspace opted in to VM_BIND? + * msm_context_is_vmbind() - has userspace opted in to VM_BIND? * * @ctx: the drm_file context * @@ -487,6 +451,8 @@ struct drm_gpuvm *msm_context_vm(struct drm_device *dev, struct msm_context *ctx * do sparse binding including having multiple, potentially partial, * mappings in the VM. Therefore certain legacy uabi (ie. GET_IOVA, * SET_IOVA) are rejected because they don't have a sensible meaning. + * + * Returns: %true if userspace is managing the VM, %false otherwise. */ static inline bool msm_context_is_vmbind(struct msm_context *ctx) @@ -518,6 +484,8 @@ msm_context_is_vmbind(struct msm_context *ctx) * This allows generations without preemption (nr_rings==1) to have some * amount of prioritization, and provides more priority levels for gens * that do have preemption. + * + * Returns: %0 on success, %-errno on error. */ static inline int msm_gpu_convert_priority(struct msm_gpu *gpu, int prio, unsigned *ring_nr, enum drm_sched_priority *sched_prio) @@ -541,7 +509,7 @@ static inline int msm_gpu_convert_priority(struct msm_gpu *gpu, int prio, } /** - * struct msm_gpu_submitqueues - Userspace created context. + * struct msm_gpu_submitqueue - Userspace created context. * * A submitqueue is associated with a gl context or vk queue (or equiv) * in userspace. From cb987e0a9a10e93d5885f34fa359490aeebb17ce Mon Sep 17 00:00:00 2001 From: Randy Dunlap Date: Fri, 19 Dec 2025 10:46:38 -0800 Subject: [PATCH 0386/1321] drm/msm: msm_iommu.c: fix all kernel-doc warnings Correct or add kernel-doc comments to eliminate all warnings: Warning: ../drivers/gpu/drm/msm/msm_iommu.c:381 expecting prototype for alloc_pt(). Prototype was for msm_iommu_pagetable_alloc_pt() instead Warning: ../drivers/gpu/drm/msm/msm_iommu.c:426 expecting prototype for free_pt(). Prototype was for msm_iommu_pagetable_free_pt() instead Signed-off-by: Randy Dunlap Reviewed-by: Dmitry Baryshkov Patchwork: https://patchwork.freedesktop.org/patch/695675/ Link: https://lore.kernel.org/r/20251219184638.1813181-20-rdunlap@infradead.org Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/msm_iommu.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c index a188617653e85..d5dede4ff7619 100644 --- a/drivers/gpu/drm/msm/msm_iommu.c +++ b/drivers/gpu/drm/msm/msm_iommu.c @@ -364,7 +364,7 @@ msm_iommu_pagetable_prealloc_cleanup(struct msm_mmu *mmu, struct msm_mmu_preallo } /** - * alloc_pt() - Custom page table allocator + * msm_iommu_pagetable_alloc_pt() - Custom page table allocator * @cookie: Cookie passed at page table allocation time. * @size: Size of the page table. This size should be fixed, * and determined at creation time based on the granule size. @@ -416,7 +416,7 @@ msm_iommu_pagetable_alloc_pt(void *cookie, size_t size, gfp_t gfp) /** - * free_pt() - Custom page table free function + * msm_iommu_pagetable_free_pt() - Custom page table free function * @cookie: Cookie passed at page table allocation time. * @data: Page table to free. * @size: Size of the page table. This size should be fixed, From 08244b3e4e130bb5c28ca750d30483e99a70fd8e Mon Sep 17 00:00:00 2001 From: Abel Vesa Date: Fri, 19 Dec 2025 12:39:01 +0200 Subject: [PATCH 0387/1321] Revert "drm/msm/dpu: support plane splitting in quad-pipe case" This reverts commit 5978864e34b66bdae4d7613834c03dd5d0a0c891. At least on Hamoa based devices, there are IOMMU faults: arm-smmu 15000000.iommu: Unhandled context fault: fsr=0x402, iova=0x00000000, fsynr=0x3d0023, cbfrsynra=0x1c00, cb=13 arm-smmu 15000000.iommu: FSR = 00000402 [Format=2 TF], SID=0x1c00 arm-smmu 15000000.iommu: FSYNR0 = 003d0023 [S1CBNDX=61 PNU PLVL=3] While on some of these devices, there are also all sorts of artifacts on eDP. Reverting this fixes these issues. Closes: https://lore.kernel.org/r/z75wnahrp7lrl5yhfdysr3np3qrs6xti2i4otkng4ex3blfgrx@xyiucge3xykb/ Signed-off-by: Abel Vesa Reviewed-by: Marijn Suijten Fixes: 5978864e34b6 ("drm/msm/dpu: support plane splitting in quad-pipe case") Reviewed-by: Dmitry Baryshkov Patchwork: https://patchwork.freedesktop.org/patch/695549/ Link: https://lore.kernel.org/r/20251219-drm-msm-dpu-revert-quad-pipe-broken-v1-1-654b46505f84@oss.qualcomm.com Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c | 11 -- drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.h | 2 - drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 137 +++++++--------------- 3 files changed, 40 insertions(+), 110 deletions(-) diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c index c39f1908ea654..011946bbf5a29 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c @@ -1620,17 +1620,6 @@ int dpu_crtc_vblank(struct drm_crtc *crtc, bool en) return 0; } -/** - * dpu_crtc_get_num_lm - Get mixer number in this CRTC pipeline - * @state: Pointer to drm crtc state object - */ -unsigned int dpu_crtc_get_num_lm(const struct drm_crtc_state *state) -{ - struct dpu_crtc_state *cstate = to_dpu_crtc_state(state); - - return cstate->num_mixers; -} - #ifdef CONFIG_DEBUG_FS static int _dpu_debugfs_status_show(struct seq_file *s, void *data) { diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.h index 455073c7025b0..2c83f1578fc39 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.h +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.h @@ -267,6 +267,4 @@ static inline enum dpu_crtc_client_type dpu_crtc_get_client_type( void dpu_crtc_frame_event_cb(struct drm_crtc *crtc, u32 event); -unsigned int dpu_crtc_get_num_lm(const struct drm_crtc_state *state); - #endif /* _DPU_CRTC_H_ */ diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c index d07a6ab6e7ee1..9b7a8b46bfa91 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c @@ -826,12 +826,8 @@ static int dpu_plane_atomic_check_nosspp(struct drm_plane *plane, struct dpu_plane_state *pstate = to_dpu_plane_state(new_plane_state); struct dpu_sw_pipe_cfg *pipe_cfg; struct dpu_sw_pipe_cfg *r_pipe_cfg; - struct dpu_sw_pipe_cfg init_pipe_cfg; struct drm_rect fb_rect = { 0 }; - const struct drm_display_mode *mode = &crtc_state->adjusted_mode; uint32_t max_linewidth; - u32 num_lm; - int stage_id, num_stages; min_scale = FRAC_16_16(1, MAX_UPSCALE_RATIO); max_scale = MAX_DOWNSCALE_RATIO << 16; @@ -854,10 +850,13 @@ static int dpu_plane_atomic_check_nosspp(struct drm_plane *plane, return -EINVAL; } - num_lm = dpu_crtc_get_num_lm(crtc_state); - + /* move the assignment here, to ease handling to another pairs later */ + pipe_cfg = &pstate->pipe_cfg[0]; + r_pipe_cfg = &pstate->pipe_cfg[1]; /* state->src is 16.16, src_rect is not */ - drm_rect_fp_to_int(&init_pipe_cfg.src_rect, &new_plane_state->src); + drm_rect_fp_to_int(&pipe_cfg->src_rect, &new_plane_state->src); + + pipe_cfg->dst_rect = new_plane_state->dst; fb_rect.x2 = new_plane_state->fb->width; fb_rect.y2 = new_plane_state->fb->height; @@ -882,94 +881,35 @@ static int dpu_plane_atomic_check_nosspp(struct drm_plane *plane, max_linewidth = pdpu->catalog->caps->max_linewidth; - drm_rect_rotate(&init_pipe_cfg.src_rect, + drm_rect_rotate(&pipe_cfg->src_rect, new_plane_state->fb->width, new_plane_state->fb->height, new_plane_state->rotation); - /* - * We have 1 mixer pair cfg for 1:1:1 and 2:2:1 topology, 2 mixer pair - * configs for left and right half screen in case of 4:4:2 topology. - * But we may have 2 rect to split wide plane that exceeds limit with 1 - * config for 2:2:1. So need to handle both wide plane splitting, and - * two halves of screen splitting for quad-pipe case. Check dest - * rectangle left/right clipping first, then check wide rectangle - * splitting in every half next. - */ - num_stages = (num_lm + 1) / 2; - /* iterate mixer configs for this plane, to separate left/right with the id */ - for (stage_id = 0; stage_id < num_stages; stage_id++) { - struct drm_rect mixer_rect = { - .x1 = stage_id * mode->hdisplay / num_stages, - .y1 = 0, - .x2 = (stage_id + 1) * mode->hdisplay / num_stages, - .y2 = mode->vdisplay - }; - int cfg_idx = stage_id * PIPES_PER_STAGE; - - pipe_cfg = &pstate->pipe_cfg[cfg_idx]; - r_pipe_cfg = &pstate->pipe_cfg[cfg_idx + 1]; - - drm_rect_fp_to_int(&pipe_cfg->src_rect, &new_plane_state->src); - pipe_cfg->dst_rect = new_plane_state->dst; - - DPU_DEBUG_PLANE(pdpu, "checking src " DRM_RECT_FMT - " vs clip window " DRM_RECT_FMT "\n", - DRM_RECT_ARG(&pipe_cfg->src_rect), - DRM_RECT_ARG(&mixer_rect)); - - /* - * If this plane does not fall into mixer rect, check next - * mixer rect. - */ - if (!drm_rect_clip_scaled(&pipe_cfg->src_rect, - &pipe_cfg->dst_rect, - &mixer_rect)) { - memset(pipe_cfg, 0, 2 * sizeof(struct dpu_sw_pipe_cfg)); - - continue; + if ((drm_rect_width(&pipe_cfg->src_rect) > max_linewidth) || + _dpu_plane_calc_clk(&crtc_state->adjusted_mode, pipe_cfg) > max_mdp_clk_rate) { + if (drm_rect_width(&pipe_cfg->src_rect) > 2 * max_linewidth) { + DPU_DEBUG_PLANE(pdpu, "invalid src " DRM_RECT_FMT " line:%u\n", + DRM_RECT_ARG(&pipe_cfg->src_rect), max_linewidth); + return -E2BIG; } - pipe_cfg->dst_rect.x1 -= mixer_rect.x1; - pipe_cfg->dst_rect.x2 -= mixer_rect.x1; - - DPU_DEBUG_PLANE(pdpu, "Got clip src:" DRM_RECT_FMT " dst: " DRM_RECT_FMT "\n", - DRM_RECT_ARG(&pipe_cfg->src_rect), DRM_RECT_ARG(&pipe_cfg->dst_rect)); - - /* Split wide rect into 2 rect */ - if ((drm_rect_width(&pipe_cfg->src_rect) > max_linewidth) || - _dpu_plane_calc_clk(mode, pipe_cfg) > max_mdp_clk_rate) { - - if (drm_rect_width(&pipe_cfg->src_rect) > 2 * max_linewidth) { - DPU_DEBUG_PLANE(pdpu, "invalid src " DRM_RECT_FMT " line:%u\n", - DRM_RECT_ARG(&pipe_cfg->src_rect), max_linewidth); - return -E2BIG; - } - - memcpy(r_pipe_cfg, pipe_cfg, sizeof(struct dpu_sw_pipe_cfg)); - pipe_cfg->src_rect.x2 = (pipe_cfg->src_rect.x1 + pipe_cfg->src_rect.x2) >> 1; - pipe_cfg->dst_rect.x2 = (pipe_cfg->dst_rect.x1 + pipe_cfg->dst_rect.x2) >> 1; - r_pipe_cfg->src_rect.x1 = pipe_cfg->src_rect.x2; - r_pipe_cfg->dst_rect.x1 = pipe_cfg->dst_rect.x2; - DPU_DEBUG_PLANE(pdpu, "Split wide plane into:" - DRM_RECT_FMT " and " DRM_RECT_FMT "\n", - DRM_RECT_ARG(&pipe_cfg->src_rect), - DRM_RECT_ARG(&r_pipe_cfg->src_rect)); - } else { - memset(r_pipe_cfg, 0, sizeof(struct dpu_sw_pipe_cfg)); - } + *r_pipe_cfg = *pipe_cfg; + pipe_cfg->src_rect.x2 = (pipe_cfg->src_rect.x1 + pipe_cfg->src_rect.x2) >> 1; + pipe_cfg->dst_rect.x2 = (pipe_cfg->dst_rect.x1 + pipe_cfg->dst_rect.x2) >> 1; + r_pipe_cfg->src_rect.x1 = pipe_cfg->src_rect.x2; + r_pipe_cfg->dst_rect.x1 = pipe_cfg->dst_rect.x2; + } else { + memset(r_pipe_cfg, 0, sizeof(*r_pipe_cfg)); + } - drm_rect_rotate_inv(&pipe_cfg->src_rect, - new_plane_state->fb->width, - new_plane_state->fb->height, + drm_rect_rotate_inv(&pipe_cfg->src_rect, + new_plane_state->fb->width, new_plane_state->fb->height, + new_plane_state->rotation); + if (drm_rect_width(&r_pipe_cfg->src_rect) != 0) + drm_rect_rotate_inv(&r_pipe_cfg->src_rect, + new_plane_state->fb->width, new_plane_state->fb->height, new_plane_state->rotation); - if (drm_rect_width(&r_pipe_cfg->src_rect) != 0) - drm_rect_rotate_inv(&r_pipe_cfg->src_rect, - new_plane_state->fb->width, - new_plane_state->fb->height, - new_plane_state->rotation); - } - pstate->needs_qos_remap = drm_atomic_crtc_needs_modeset(crtc_state); return 0; @@ -1045,17 +985,20 @@ static int dpu_plane_atomic_check_sspp(struct drm_plane *plane, drm_atomic_get_new_plane_state(state, plane); struct dpu_plane *pdpu = to_dpu_plane(plane); struct dpu_plane_state *pstate = to_dpu_plane_state(new_plane_state); - struct dpu_sw_pipe *pipe; - struct dpu_sw_pipe_cfg *pipe_cfg; - int ret = 0, i; + struct dpu_sw_pipe *pipe = &pstate->pipe[0]; + struct dpu_sw_pipe *r_pipe = &pstate->pipe[1]; + struct dpu_sw_pipe_cfg *pipe_cfg = &pstate->pipe_cfg[0]; + struct dpu_sw_pipe_cfg *r_pipe_cfg = &pstate->pipe_cfg[1]; + int ret = 0; - for (i = 0; i < PIPES_PER_PLANE; i++) { - pipe = &pstate->pipe[i]; - pipe_cfg = &pstate->pipe_cfg[i]; - if (!drm_rect_width(&pipe_cfg->src_rect)) - continue; - DPU_DEBUG_PLANE(pdpu, "pipe %d is in use, validate it\n", i); - ret = dpu_plane_atomic_check_pipe(pdpu, pipe, pipe_cfg, + ret = dpu_plane_atomic_check_pipe(pdpu, pipe, pipe_cfg, + &crtc_state->adjusted_mode, + new_plane_state); + if (ret) + return ret; + + if (drm_rect_width(&r_pipe_cfg->src_rect) != 0) { + ret = dpu_plane_atomic_check_pipe(pdpu, r_pipe, r_pipe_cfg, &crtc_state->adjusted_mode, new_plane_state); if (ret) From 90aa521a42de31b1cc4a782adf6512b9f01427ae Mon Sep 17 00:00:00 2001 From: Abel Vesa Date: Fri, 19 Dec 2025 12:39:02 +0200 Subject: [PATCH 0388/1321] Revert "drm/msm/dpu: Enable quad-pipe for DSC and dual-DSI case" This reverts commit d7ec9366b15cd04508fa015cb94d546b1c01edfb. The dual-DSI dual-DSC scenario seems to be broken by this commit. Reported-by: Marijn Suijten Closes: https://lore.kernel.org/r/aUR2b3FOSisTfDFj@SoMainline.org Signed-off-by: Abel Vesa Fixes: d7ec9366b15c ("drm/msm/dpu: Enable quad-pipe for DSC and dual-DSI case") Reviewed-by: Dmitry Baryshkov Patchwork: https://patchwork.freedesktop.org/patch/695550/ Link: https://lore.kernel.org/r/20251219-drm-msm-dpu-revert-quad-pipe-broken-v1-2-654b46505f84@oss.qualcomm.com Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c | 27 +++++------------ drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.h | 6 ++-- drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 29 +++++++++++++------ .../gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h | 2 +- .../gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h | 2 +- drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h | 2 +- 6 files changed, 33 insertions(+), 35 deletions(-) diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c index 011946bbf5a29..2d06c950e8143 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c @@ -200,7 +200,7 @@ static int dpu_crtc_get_lm_crc(struct drm_crtc *crtc, struct dpu_crtc_state *crtc_state) { struct dpu_crtc_mixer *m; - u32 crcs[CRTC_QUAD_MIXERS]; + u32 crcs[CRTC_DUAL_MIXERS]; int rc = 0; int i; @@ -1328,7 +1328,6 @@ static struct msm_display_topology dpu_crtc_get_topology( struct drm_display_mode *mode = &crtc_state->adjusted_mode; struct msm_display_topology topology = {0}; struct drm_encoder *drm_enc; - u32 num_rt_intf; drm_for_each_encoder_mask(drm_enc, crtc->dev, crtc_state->encoder_mask) dpu_encoder_update_topology(drm_enc, &topology, crtc_state->state, @@ -1342,14 +1341,11 @@ static struct msm_display_topology dpu_crtc_get_topology( * Dual display * 2 LM, 2 INTF ( Split display using 2 interfaces) * - * If DSC is enabled, try to use 4:4:2 topology if there is enough - * resource. Otherwise, use 2:2:2 topology. - * * Single display * 1 LM, 1 INTF * 2 LM, 1 INTF (stream merge to support high resolution interfaces) * - * If DSC is enabled, use 2:2:1 topology + * If DSC is enabled, use 2 LMs for 2:2:1 topology * * Add dspps to the reservation requirements if ctm is requested * @@ -1361,23 +1357,14 @@ static struct msm_display_topology dpu_crtc_get_topology( * (mode->hdisplay > MAX_HDISPLAY_SPLIT) check. */ - num_rt_intf = topology.num_intf; - if (topology.cwb_enabled) - num_rt_intf--; - - if (topology.num_dsc) { - if (dpu_kms->catalog->dsc_count >= num_rt_intf * 2) - topology.num_dsc = num_rt_intf * 2; - else - topology.num_dsc = num_rt_intf; - topology.num_lm = topology.num_dsc; - } else if (num_rt_intf == 2) { + if (topology.num_intf == 2 && !topology.cwb_enabled) + topology.num_lm = 2; + else if (topology.num_dsc == 2) topology.num_lm = 2; - } else if (dpu_kms->catalog->caps->has_3d_merge) { + else if (dpu_kms->catalog->caps->has_3d_merge) topology.num_lm = (mode->hdisplay > MAX_HDISPLAY_SPLIT) ? 2 : 1; - } else { + else topology.num_lm = 1; - } if (crtc_state->ctm) topology.num_dspp = topology.num_lm; diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.h index 2c83f1578fc39..94392b9b92454 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.h +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.h @@ -210,7 +210,7 @@ struct dpu_crtc_state { bool bw_control; bool bw_split_vote; - struct drm_rect lm_bounds[CRTC_QUAD_MIXERS]; + struct drm_rect lm_bounds[CRTC_DUAL_MIXERS]; uint64_t input_fence_timeout_ns; @@ -218,10 +218,10 @@ struct dpu_crtc_state { /* HW Resources reserved for the crtc */ u32 num_mixers; - struct dpu_crtc_mixer mixers[CRTC_QUAD_MIXERS]; + struct dpu_crtc_mixer mixers[CRTC_DUAL_MIXERS]; u32 num_ctls; - struct dpu_hw_ctl *hw_ctls[CRTC_QUAD_MIXERS]; + struct dpu_hw_ctl *hw_ctls[CRTC_DUAL_MIXERS]; enum dpu_crtc_crc_source crc_source; int crc_frame_skip_count; diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c index d1cfe81a33737..9f3957f24c6a3 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c @@ -55,7 +55,7 @@ #define MAX_PHYS_ENCODERS_PER_VIRTUAL \ (MAX_H_TILES_PER_DISPLAY * NUM_PHYS_ENCODER_TYPES) -#define MAX_CHANNELS_PER_ENC 4 +#define MAX_CHANNELS_PER_ENC 2 #define MAX_CWB_PER_ENC 2 #define IDLE_SHORT_TIMEOUT 1 @@ -661,6 +661,7 @@ void dpu_encoder_update_topology(struct drm_encoder *drm_enc, struct dpu_encoder_virt *dpu_enc = to_dpu_encoder_virt(drm_enc); struct msm_drm_private *priv = dpu_enc->base.dev->dev_private; struct msm_display_info *disp_info = &dpu_enc->disp_info; + struct dpu_kms *dpu_kms = to_dpu_kms(priv->kms); struct drm_connector *connector; struct drm_connector_state *conn_state; struct drm_framebuffer *fb; @@ -674,12 +675,22 @@ void dpu_encoder_update_topology(struct drm_encoder *drm_enc, dsc = dpu_encoder_get_dsc_config(drm_enc); - /* - * Set DSC number as 1 to mark the enabled status, will be adjusted - * in dpu_crtc_get_topology() - */ - if (dsc) - topology->num_dsc = 1; + /* We only support 2 DSC mode (with 2 LM and 1 INTF) */ + if (dsc) { + /* + * Use 2 DSC encoders, 2 layer mixers and 1 or 2 interfaces + * when Display Stream Compression (DSC) is enabled, + * and when enough DSC blocks are available. + * This is power-optimal and can drive up to (including) 4k + * screens. + */ + WARN(topology->num_intf > 2, + "DSC topology cannot support more than 2 interfaces\n"); + if (topology->num_intf >= 2 || dpu_kms->catalog->dsc_count >= 2) + topology->num_dsc = 2; + else + topology->num_dsc = 1; + } connector = drm_atomic_get_new_connector_for_encoder(state, drm_enc); if (!connector) @@ -2169,8 +2180,8 @@ static void dpu_encoder_helper_reset_mixers(struct dpu_encoder_phys *phys_enc) { int i, num_lm; struct dpu_global_state *global_state; - struct dpu_hw_blk *hw_lm[MAX_CHANNELS_PER_ENC]; - struct dpu_hw_mixer *hw_mixer[MAX_CHANNELS_PER_ENC]; + struct dpu_hw_blk *hw_lm[2]; + struct dpu_hw_mixer *hw_mixer[2]; struct dpu_hw_ctl *ctl = phys_enc->hw_ctl; /* reset all mixers for this encoder */ diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h index 09395d7910ac8..61b22d9494546 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h @@ -302,7 +302,7 @@ static inline enum dpu_3d_blend_mode dpu_encoder_helper_get_3d_blend_mode( /* Use merge_3d unless DSC MERGE topology is used */ if (phys_enc->split_role == ENC_ROLE_SOLO && - (dpu_cstate->num_mixers != 1) && + dpu_cstate->num_mixers == CRTC_DUAL_MIXERS && !dpu_encoder_use_dsc_merge(phys_enc->parent)) return BLEND_3D_H_ROW_INT; diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h index 336757103b5af..4964e70610d1b 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h @@ -24,7 +24,7 @@ #define DPU_MAX_IMG_WIDTH 0x3fff #define DPU_MAX_IMG_HEIGHT 0x3fff -#define CRTC_QUAD_MIXERS 4 +#define CRTC_DUAL_MIXERS 2 #define MAX_XIN_COUNT 16 diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h index 31451241f0839..046b683d4c66d 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h @@ -34,7 +34,7 @@ #define DPU_MAX_PLANES 4 #endif -#define STAGES_PER_PLANE 2 +#define STAGES_PER_PLANE 1 #define PIPES_PER_STAGE 2 #define PIPES_PER_PLANE (PIPES_PER_STAGE * STAGES_PER_PLANE) #ifndef DPU_MAX_DE_CURVES From 7676a66906eac39ac89c378961d837e7990928e3 Mon Sep 17 00:00:00 2001 From: Nikolay Kuratov Date: Thu, 11 Dec 2025 12:36:30 +0300 Subject: [PATCH 0389/1321] drm/msm/dpu: Add missing NULL pointer check for pingpong interface It is checked almost always in dpu_encoder_phys_wb_setup_ctl(), but in a single place the check is missing. Also use convenient locals instead of phys_enc->* where available. Cc: stable@vger.kernel.org Fixes: d7d0e73f7de33 ("drm/msm/dpu: introduce the dpu_encoder_phys_* for writeback") Signed-off-by: Nikolay Kuratov Reviewed-by: Dmitry Baryshkov Patchwork: https://patchwork.freedesktop.org/patch/693860/ Link: https://lore.kernel.org/r/20251211093630.171014-1-kniv@yandex-team.ru Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c index 46f348972a975..6d28f2281c765 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c @@ -247,14 +247,12 @@ static void dpu_encoder_phys_wb_setup_ctl(struct dpu_encoder_phys *phys_enc) if (hw_cdm) intf_cfg.cdm = hw_cdm->idx; - if (phys_enc->hw_pp->merge_3d && phys_enc->hw_pp->merge_3d->ops.setup_3d_mode) - phys_enc->hw_pp->merge_3d->ops.setup_3d_mode(phys_enc->hw_pp->merge_3d, - mode_3d); + if (hw_pp && hw_pp->merge_3d && hw_pp->merge_3d->ops.setup_3d_mode) + hw_pp->merge_3d->ops.setup_3d_mode(hw_pp->merge_3d, mode_3d); /* setup which pp blk will connect to this wb */ - if (hw_pp && phys_enc->hw_wb->ops.bind_pingpong_blk) - phys_enc->hw_wb->ops.bind_pingpong_blk(phys_enc->hw_wb, - phys_enc->hw_pp->idx); + if (hw_pp && hw_wb->ops.bind_pingpong_blk) + hw_wb->ops.bind_pingpong_blk(hw_wb, hw_pp->idx); phys_enc->hw_ctl->ops.setup_intf_cfg(phys_enc->hw_ctl, &intf_cfg); } else if (phys_enc->hw_ctl && phys_enc->hw_ctl->ops.setup_intf_cfg) { From bde07e2563abba522189f0958fafe3cacdcca4e7 Mon Sep 17 00:00:00 2001 From: Evan Lambert Date: Wed, 24 Dec 2025 12:44:22 +0000 Subject: [PATCH 0390/1321] drm/msm: Replace unsafe snprintf usage with scnprintf The refill_buf function uses snprintf to append to a fixed-size buffer. snprintf returns the length that would have been written, which can exceed the remaining buffer size. If this happens, ptr advances beyond the buffer and rem becomes negative. In the 2nd iteration, rem is treated as a large unsigned integer, causing snprintf to write oob. While this behavior is technically mitigated by num_perfcntrs being locked at 5, it's still unsafe if num_perfcntrs were ever to change/a second source was added. Signed-off-by: Evan Lambert Reviewed-by: Dmitry Baryshkov Patchwork: https://patchwork.freedesktop.org/patch/696358/ Link: https://lore.kernel.org/r/20251224124254.17920-3-veyga@veygax.dev Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/msm_perf.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/msm/msm_perf.c b/drivers/gpu/drm/msm/msm_perf.c index d3c7889aaf267..c369d4acc3781 100644 --- a/drivers/gpu/drm/msm/msm_perf.c +++ b/drivers/gpu/drm/msm/msm_perf.c @@ -65,13 +65,13 @@ static int refill_buf(struct msm_perf_state *perf) if ((perf->cnt++ % 32) == 0) { /* Header line: */ - n = snprintf(ptr, rem, "%%BUSY"); + n = scnprintf(ptr, rem, "%%BUSY"); ptr += n; rem -= n; for (i = 0; i < gpu->num_perfcntrs; i++) { const struct msm_gpu_perfcntr *perfcntr = &gpu->perfcntrs[i]; - n = snprintf(ptr, rem, "\t%s", perfcntr->name); + n = scnprintf(ptr, rem, "\t%s", perfcntr->name); ptr += n; rem -= n; } @@ -93,21 +93,21 @@ static int refill_buf(struct msm_perf_state *perf) return ret; val = totaltime ? 1000 * activetime / totaltime : 0; - n = snprintf(ptr, rem, "%3d.%d%%", val / 10, val % 10); + n = scnprintf(ptr, rem, "%3d.%d%%", val / 10, val % 10); ptr += n; rem -= n; for (i = 0; i < ret; i++) { /* cycle counters (I think).. convert to MHz.. */ val = cntrs[i] / 10000; - n = snprintf(ptr, rem, "\t%5d.%02d", + n = scnprintf(ptr, rem, "\t%5d.%02d", val / 100, val % 100); ptr += n; rem -= n; } } - n = snprintf(ptr, rem, "\n"); + n = scnprintf(ptr, rem, "\n"); ptr += n; rem -= n; From fa80fee54e425762b775ab9cd1bf6897c52ebfe7 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Thomas=20Wei=C3=9Fschuh?= Date: Mon, 22 Dec 2025 08:45:48 +0100 Subject: [PATCH 0391/1321] regulator: uapi: Use UAPI integer type MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Using libc types and headers from the UAPI headers is problematic as it introduces a dependency on a full C toolchain. Use the fixed-width integer type provided by the UAPI headers instead. Signed-off-by: Thomas Weißschuh Link: https://patch.msgid.link/20251222-uapi-regulator-v1-1-a71c66eb1a94@linutronix.de Signed-off-by: Mark Brown --- include/uapi/regulator/regulator.h | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/include/uapi/regulator/regulator.h b/include/uapi/regulator/regulator.h index 71bf71a22e7fb..c4f2d1c198280 100644 --- a/include/uapi/regulator/regulator.h +++ b/include/uapi/regulator/regulator.h @@ -8,11 +8,7 @@ #ifndef _UAPI_REGULATOR_H #define _UAPI_REGULATOR_H -#ifdef __KERNEL__ #include -#else -#include -#endif /* * Regulator notifier events. @@ -62,7 +58,7 @@ struct reg_genl_event { char reg_name[32]; - uint64_t event; + __u64 event; }; /* attributes of reg_genl_family */ From 65ca6c14ab46521e59ab219aad248abc66a0fd74 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Thomas=20Wei=C3=9Fschuh?= Date: Mon, 22 Dec 2025 08:49:13 +0100 Subject: [PATCH 0392/1321] regulator: Add UAPI headers to MAINTAINERS MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The regulator UAPI headers were missing an entry in MAINTAINERS, add it. Signed-off-by: Thomas Weißschuh Link: https://patch.msgid.link/20251222-maintainers-regulator-v1-1-7572390fdf1b@linutronix.de Signed-off-by: Mark Brown --- MAINTAINERS | 1 + 1 file changed, 1 insertion(+) diff --git a/MAINTAINERS b/MAINTAINERS index dc731d37c8fee..12f49de7fe036 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -27920,6 +27920,7 @@ F: drivers/regulator/ F: rust/kernel/regulator.rs F: include/dt-bindings/regulator/ F: include/linux/regulator/ +F: include/uapi/regulator/ K: regulator_get_optional VOLTAGE AND CURRENT REGULATOR IRQ HELPERS From 601ac83e7c4dddbe7032c15c83ac5279bf3433a7 Mon Sep 17 00:00:00 2001 From: Andreas Kemnade Date: Tue, 23 Dec 2025 22:51:31 +0100 Subject: [PATCH 0393/1321] regulator: fp9931: fix regulator node pointer Sync the driver with the binding. During review process a regulators subnode was requested but neither driver nor test setup was updated. Fixes: 12d821bd13d4 ("regulator: Add FP9931/JD9930 driver") Signed-off-by: Andreas Kemnade Link: https://patch.msgid.link/20251223-fp9931-fix-v1-1-b19b4c1e7056@kemnade.info Signed-off-by: Mark Brown --- drivers/regulator/fp9931.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/regulator/fp9931.c b/drivers/regulator/fp9931.c index fef0bb07fd5d8..69b3c712e5d58 100644 --- a/drivers/regulator/fp9931.c +++ b/drivers/regulator/fp9931.c @@ -391,6 +391,7 @@ static const struct regulator_desc regulators[] = { { .name = "v3p3", .of_match = of_match_ptr("v3p3"), + .regulators_node = of_match_ptr("regulators"), .id = 0, .ops = &fp9931_v3p3ops, .type = REGULATOR_VOLTAGE, @@ -403,6 +404,7 @@ static const struct regulator_desc regulators[] = { { .name = "vposneg", .of_match = of_match_ptr("vposneg"), + .regulators_node = of_match_ptr("regulators"), .id = 1, .ops = &fp9931_vposneg_ops, .type = REGULATOR_VOLTAGE, @@ -415,6 +417,7 @@ static const struct regulator_desc regulators[] = { { .name = "vcom", .of_match = of_match_ptr("vcom"), + .regulators_node = of_match_ptr("regulators"), .id = 2, .ops = &fp9931_vcom_ops, .type = REGULATOR_VOLTAGE, From 1e9344b1588c44ffee0f1460beeb9ea2e83aa189 Mon Sep 17 00:00:00 2001 From: Chen-Yu Tsai Date: Sun, 21 Dec 2025 19:05:08 +0800 Subject: [PATCH 0394/1321] spi: dt-bindings: sun6i: Add compatibles for A523's SPI controllers The A523 has four SPI controllers. One of them supports MIPI DBI mode in addition to standard SPI. Compared to older generations, this newer controller now has a combined counter for the RX FIFO ad buffer levels. In older generations, the RX buffer level was a separate bitfield in the FIFO status register. In practice this difference is negligible. The buffer is mostly invisible to the implementation. If programmed I/O transfers are limited to the FIFO size, then the contents of the buffer seem to always be flushed over to the FIFO. For DMA, the DRQ trigger levels are only tied to the FIFO levels. In all other aspects, the controller is the same as the one in the R329. Add new compatible strings for the new controllers. Signed-off-by: Chen-Yu Tsai Acked-by: Krzysztof Kozlowski Link: https://patch.msgid.link/20251221110513.1850535-2-wens@kernel.org Signed-off-by: Mark Brown --- .../devicetree/bindings/spi/allwinner,sun6i-a31-spi.yaml | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/Documentation/devicetree/bindings/spi/allwinner,sun6i-a31-spi.yaml b/Documentation/devicetree/bindings/spi/allwinner,sun6i-a31-spi.yaml index 3b47b68b92cb8..1b91d1566c953 100644 --- a/Documentation/devicetree/bindings/spi/allwinner,sun6i-a31-spi.yaml +++ b/Documentation/devicetree/bindings/spi/allwinner,sun6i-a31-spi.yaml @@ -17,6 +17,7 @@ properties: compatible: oneOf: - const: allwinner,sun50i-r329-spi + - const: allwinner,sun55i-a523-spi - const: allwinner,sun6i-a31-spi - const: allwinner,sun8i-h3-spi - items: @@ -35,6 +36,9 @@ properties: - const: allwinner,sun20i-d1-spi-dbi - const: allwinner,sun50i-r329-spi-dbi - const: allwinner,sun50i-r329-spi + - items: + - const: allwinner,sun55i-a523-spi-dbi + - const: allwinner,sun55i-a523-spi reg: maxItems: 1 From 4d2422d00a35b9321d66ddaedc881a02df1f5aad Mon Sep 17 00:00:00 2001 From: Chen-Yu Tsai Date: Sun, 21 Dec 2025 19:05:09 +0800 Subject: [PATCH 0395/1321] spi: sun6i: Support A523's SPI controllers The A523 has four SPI controllers. One of them supports MIPI DBI mode in addition to standard SPI. Compared to older generations, this newer controller now has a combined counter for the RX FIFO ad buffer levels. In older generations, the RX buffer level was a separate bitfield in the FIFO status register. In practice this difference is negligible. The buffer is mostly invisible to the implementation. If programmed I/O transfers are limited to the FIFO size, then the contents of the buffer seem to always be flushed over to the FIFO. For DMA, the DRQ trigger levels are only tied to the FIFO levels. In all other aspects, the controller is the same as the one in the R329. Support the standard SPI mode controllers using the settings for R329. DBI is left out as there currently is no infrastructure for enabling a DBI host controller, as was the case for the R329. Also fold the entry for the R329 to make the style consistent. Signed-off-by: Chen-Yu Tsai Reviewed-by: Jernej Skrabec Link: https://patch.msgid.link/20251221110513.1850535-3-wens@kernel.org Signed-off-by: Mark Brown --- drivers/spi/spi-sun6i.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/drivers/spi/spi-sun6i.c b/drivers/spi/spi-sun6i.c index 871dfd3e77be2..d1de6c99e7622 100644 --- a/drivers/spi/spi-sun6i.c +++ b/drivers/spi/spi-sun6i.c @@ -795,10 +795,13 @@ static const struct sun6i_spi_cfg sun50i_r329_spi_cfg = { static const struct of_device_id sun6i_spi_match[] = { { .compatible = "allwinner,sun6i-a31-spi", .data = &sun6i_a31_spi_cfg }, { .compatible = "allwinner,sun8i-h3-spi", .data = &sun8i_h3_spi_cfg }, - { - .compatible = "allwinner,sun50i-r329-spi", - .data = &sun50i_r329_spi_cfg - }, + { .compatible = "allwinner,sun50i-r329-spi", .data = &sun50i_r329_spi_cfg }, + /* + * A523's SPI controller has a combined RX buffer + FIFO counter + * at offset 0x400, instead of split buffer count in FIFO status + * register. But in practice we only care about the FIFO level. + */ + { .compatible = "allwinner,sun55i-a523-spi", .data = &sun50i_r329_spi_cfg }, {} }; MODULE_DEVICE_TABLE(of, sun6i_spi_match); From 5af02c71836839c21f6f295217ec6588dbd2fdd7 Mon Sep 17 00:00:00 2001 From: Mateusz Litwin Date: Thu, 18 Dec 2025 22:33:04 +0100 Subject: [PATCH 0396/1321] spi: cadence-quadspi: Prevent lost complete() call during indirect read A race condition exists between the read loop and IRQ `complete()` call. An interrupt could call the complete() between the inner loop and reinit_completion(), potentially losing the completion event and causing an unnecessary timeout. Moving reinit_completion() before the loop prevents this. A premature signal will only result in a spurious wakeup and another wait cycle, which is preferable to waiting for a timeout. Signed-off-by: Mateusz Litwin Link: https://patch.msgid.link/20251218-cqspi_indirect_read_improve-v2-1-396079972f2a@nokia.com Signed-off-by: Mark Brown --- drivers/spi/spi-cadence-quadspi.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c index f8823e83a6226..837dd646481f6 100644 --- a/drivers/spi/spi-cadence-quadspi.c +++ b/drivers/spi/spi-cadence-quadspi.c @@ -769,6 +769,7 @@ static int cqspi_indirect_read_execute(struct cqspi_flash_pdata *f_pdata, readl(reg_base + CQSPI_REG_INDIRECTRD); /* Flush posted write. */ while (remaining > 0) { + ret = 0; if (use_irq && !wait_for_completion_timeout(&cqspi->transfer_complete, msecs_to_jiffies(CQSPI_READ_TIMEOUT_MS))) @@ -781,6 +782,14 @@ static int cqspi_indirect_read_execute(struct cqspi_flash_pdata *f_pdata, if (cqspi->slow_sram) writel(0x0, reg_base + CQSPI_REG_IRQMASK); + /* + * Prevent lost interrupt and race condition by reinitializing early. + * A spurious wakeup and another wait cycle can occur here, + * which is preferable to waiting until timeout if interrupt is lost. + */ + if (use_irq) + reinit_completion(&cqspi->transfer_complete); + bytes_to_read = cqspi_get_rd_sram_level(cqspi); if (ret && bytes_to_read == 0) { @@ -813,7 +822,6 @@ static int cqspi_indirect_read_execute(struct cqspi_flash_pdata *f_pdata, } if (use_irq && remaining > 0) { - reinit_completion(&cqspi->transfer_complete); if (cqspi->slow_sram) writel(CQSPI_REG_IRQ_WATERMARK, reg_base + CQSPI_REG_IRQMASK); } From 3e0c3291c7e2bfcfb8f676db1086d4355cfa4ffb Mon Sep 17 00:00:00 2001 From: Mateusz Litwin Date: Thu, 18 Dec 2025 22:33:05 +0100 Subject: [PATCH 0397/1321] spi: cadence-quadspi: Improve CQSPI_SLOW_SRAM quirk if flash is slow CQSPI_SLOW_SRAM quirk on the Stratix10 platform causes fewer interrupts, but also causes timeouts if a small block is used or if flash devices are slower than or equal in speed to SRAM's read operations. Adding the CQSPI_REG_IRQ_IND_COMP interrupt would resolve the problem for small reads, and removing the disabling of interrupts would resolve the issue with lost interrupts. This marginally increases IRQ count. Tests show that this will cause only a few percent more interrupts. Test: $ dd if=/dev/mtd0 of=/dev/null bs=1M count=64 Results from the Stratix10 platform with mt25qu02g flash. FIFO size in all tests: 128 Serviced interrupt call counts: Without CQSPI_SLOW_SRAM quirk: 16 668 850 With CQSPI_SLOW_SRAM quirk: 204 176 With CQSPI_SLOW_SRAM and this commit: 224 528 Signed-off-by: Mateusz Litwin Link: https://patch.msgid.link/20251218-cqspi_indirect_read_improve-v2-2-396079972f2a@nokia.com Signed-off-by: Mark Brown --- drivers/spi/spi-cadence-quadspi.c | 19 +++++-------------- 1 file changed, 5 insertions(+), 14 deletions(-) diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c index 837dd646481f6..965b4cea3388a 100644 --- a/drivers/spi/spi-cadence-quadspi.c +++ b/drivers/spi/spi-cadence-quadspi.c @@ -300,6 +300,9 @@ struct cqspi_driver_platdata { CQSPI_REG_IRQ_IND_SRAM_FULL | \ CQSPI_REG_IRQ_IND_COMP) +#define CQSPI_IRQ_MASK_RD_SLOW_SRAM (CQSPI_REG_IRQ_WATERMARK | \ + CQSPI_REG_IRQ_IND_COMP) + #define CQSPI_IRQ_MASK_WR (CQSPI_REG_IRQ_IND_COMP | \ CQSPI_REG_IRQ_WATERMARK | \ CQSPI_REG_IRQ_UNDERFLOW) @@ -381,7 +384,7 @@ static irqreturn_t cqspi_irq_handler(int this_irq, void *dev) else if (!cqspi->slow_sram) irq_status &= CQSPI_IRQ_MASK_RD | CQSPI_IRQ_MASK_WR; else - irq_status &= CQSPI_REG_IRQ_WATERMARK | CQSPI_IRQ_MASK_WR; + irq_status &= CQSPI_IRQ_MASK_RD_SLOW_SRAM | CQSPI_IRQ_MASK_WR; if (irq_status) complete(&cqspi->transfer_complete); @@ -757,7 +760,7 @@ static int cqspi_indirect_read_execute(struct cqspi_flash_pdata *f_pdata, */ if (use_irq && cqspi->slow_sram) - writel(CQSPI_REG_IRQ_WATERMARK, reg_base + CQSPI_REG_IRQMASK); + writel(CQSPI_IRQ_MASK_RD_SLOW_SRAM, reg_base + CQSPI_REG_IRQMASK); else if (use_irq) writel(CQSPI_IRQ_MASK_RD, reg_base + CQSPI_REG_IRQMASK); else @@ -775,13 +778,6 @@ static int cqspi_indirect_read_execute(struct cqspi_flash_pdata *f_pdata, msecs_to_jiffies(CQSPI_READ_TIMEOUT_MS))) ret = -ETIMEDOUT; - /* - * Disable all read interrupts until - * we are out of "bytes to read" - */ - if (cqspi->slow_sram) - writel(0x0, reg_base + CQSPI_REG_IRQMASK); - /* * Prevent lost interrupt and race condition by reinitializing early. * A spurious wakeup and another wait cycle can occur here, @@ -820,11 +816,6 @@ static int cqspi_indirect_read_execute(struct cqspi_flash_pdata *f_pdata, remaining -= bytes_to_read; bytes_to_read = cqspi_get_rd_sram_level(cqspi); } - - if (use_irq && remaining > 0) { - if (cqspi->slow_sram) - writel(CQSPI_REG_IRQ_WATERMARK, reg_base + CQSPI_REG_IRQMASK); - } } /* Check indirect done status */ From 4b0b83f12bf204e28ad78223ce777a6130c20244 Mon Sep 17 00:00:00 2001 From: "Nysal Jan K.A." Date: Tue, 28 Oct 2025 16:25:12 +0530 Subject: [PATCH 0398/1321] powerpc/kexec: Enable SMT before waking offline CPUs If SMT is disabled or a partial SMT state is enabled, when a new kernel image is loaded for kexec, on reboot the following warning is observed: kexec: Waking offline cpu 228. WARNING: CPU: 0 PID: 9062 at arch/powerpc/kexec/core_64.c:223 kexec_prepare_cpus+0x1b0/0x1bc [snip] NIP kexec_prepare_cpus+0x1b0/0x1bc LR kexec_prepare_cpus+0x1a0/0x1bc Call Trace: kexec_prepare_cpus+0x1a0/0x1bc (unreliable) default_machine_kexec+0x160/0x19c machine_kexec+0x80/0x88 kernel_kexec+0xd0/0x118 __do_sys_reboot+0x210/0x2c4 system_call_exception+0x124/0x320 system_call_vectored_common+0x15c/0x2ec This occurs as add_cpu() fails due to cpu_bootable() returning false for CPUs that fail the cpu_smt_thread_allowed() check or non primary threads if SMT is disabled. Fix the issue by enabling SMT and resetting the number of SMT threads to the number of threads per core, before attempting to wake up all present CPUs. Fixes: 38253464bc82 ("cpu/SMT: Create topology_smt_thread_allowed()") Reported-by: Sachin P Bappalige Cc: stable@vger.kernel.org # v6.6+ Reviewed-by: Srikar Dronamraju Signed-off-by: Nysal Jan K.A. Tested-by: Samir M Reviewed-by: Sourabh Jain Signed-off-by: Madhavan Srinivasan Link: https://patch.msgid.link/20251028105516.26258-1-nysal@linux.ibm.com --- arch/powerpc/kexec/core_64.c | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c index 222aa326dacee..825ab8a88f18e 100644 --- a/arch/powerpc/kexec/core_64.c +++ b/arch/powerpc/kexec/core_64.c @@ -202,6 +202,23 @@ static void kexec_prepare_cpus_wait(int wait_state) mb(); } + +/* + * The add_cpu() call in wake_offline_cpus() can fail as cpu_bootable() + * returns false for CPUs that fail the cpu_smt_thread_allowed() check + * or non primary threads if SMT is disabled. Re-enable SMT and set the + * number of SMT threads to threads per core. + */ +static void kexec_smt_reenable(void) +{ +#if defined(CONFIG_SMP) && defined(CONFIG_HOTPLUG_SMT) + lock_device_hotplug(); + cpu_smt_num_threads = threads_per_core; + cpu_smt_control = CPU_SMT_ENABLED; + unlock_device_hotplug(); +#endif +} + /* * We need to make sure each present CPU is online. The next kernel will scan * the device tree and assume primary threads are online and query secondary @@ -216,6 +233,8 @@ static void wake_offline_cpus(void) { int cpu = 0; + kexec_smt_reenable(); + for_each_present_cpu(cpu) { if (!cpu_online(cpu)) { printk(KERN_INFO "kexec: Waking offline cpu %d.\n", From 12a35b280934edb1585f4ae8fe4949fb48e370fa Mon Sep 17 00:00:00 2001 From: Gopi Krishna Menon Date: Mon, 22 Sep 2025 06:11:23 +0530 Subject: [PATCH 0399/1321] selftests/powerpc/pmu/: Add check_extended_reg_test to .gitignore Add the check_extended_reg_test binary to .gitignore to avoid accidentally staging the build artifact. Signed-off-by: Gopi Krishna Menon Tested-by: Aditya Bodkhe Signed-off-by: Madhavan Srinivasan Link: https://patch.msgid.link/20250922004439.2395-1-krishnagopi487@gmail.com --- tools/testing/selftests/powerpc/pmu/sampling_tests/.gitignore | 1 + 1 file changed, 1 insertion(+) diff --git a/tools/testing/selftests/powerpc/pmu/sampling_tests/.gitignore b/tools/testing/selftests/powerpc/pmu/sampling_tests/.gitignore index f93b4c7c3a8ad..ea29228334e8e 100644 --- a/tools/testing/selftests/powerpc/pmu/sampling_tests/.gitignore +++ b/tools/testing/selftests/powerpc/pmu/sampling_tests/.gitignore @@ -1,5 +1,6 @@ bhrb_filter_map_test bhrb_no_crash_wo_pmu_test +check_extended_reg_test intr_regs_no_crash_wo_pmu_test mmcr0_cc56run_test mmcr0_exceptionbits_test From a449d4290d39538412de3ab69d13a7c15fabff48 Mon Sep 17 00:00:00 2001 From: Jan Stancek Date: Tue, 23 Sep 2025 17:32:16 +0200 Subject: [PATCH 0400/1321] powerpc/tools: drop `-o pipefail` in gcc check scripts Fixes: 0f71dcfb4aef ("powerpc/ftrace: Add support for -fpatchable-function-entry") Fixes: b71c9ffb1405 ("powerpc: Add arch/powerpc/tools directory") Reported-by: Joe Lawrence Acked-by: Joe Lawrence Signed-off-by: Jan Stancek Fixes: 8c50b72a3b4f ("powerpc/ftrace: Add Kconfig & Make glue for mprofile-kernel") Fixes: abba759796f9 ("powerpc/kbuild: move -mprofile-kernel check to Kconfig") Tested-by: Justin M. Forbes Reviewed-by: Naveen N Rao (AMD) Reviewed-by: Josh Poimboeuf Signed-off-by: Madhavan Srinivasan Link: https://patch.msgid.link/cc6cdd116c3ad9d990df21f13c6d8e8a83815bbd.1758641374.git.jstancek@redhat.com --- arch/powerpc/tools/gcc-check-fpatchable-function-entry.sh | 1 - arch/powerpc/tools/gcc-check-mprofile-kernel.sh | 1 - 2 files changed, 2 deletions(-) diff --git a/arch/powerpc/tools/gcc-check-fpatchable-function-entry.sh b/arch/powerpc/tools/gcc-check-fpatchable-function-entry.sh index 06706903503b6..baed467a016b3 100755 --- a/arch/powerpc/tools/gcc-check-fpatchable-function-entry.sh +++ b/arch/powerpc/tools/gcc-check-fpatchable-function-entry.sh @@ -2,7 +2,6 @@ # SPDX-License-Identifier: GPL-2.0 set -e -set -o pipefail # To debug, uncomment the following line # set -x diff --git a/arch/powerpc/tools/gcc-check-mprofile-kernel.sh b/arch/powerpc/tools/gcc-check-mprofile-kernel.sh index 73e331e7660ef..6193b0ed0c775 100755 --- a/arch/powerpc/tools/gcc-check-mprofile-kernel.sh +++ b/arch/powerpc/tools/gcc-check-mprofile-kernel.sh @@ -2,7 +2,6 @@ # SPDX-License-Identifier: GPL-2.0 set -e -set -o pipefail # To debug, uncomment the following line # set -x From 8fae332f02ce8eac2e9db85b0a55f729c1b7c7e5 Mon Sep 17 00:00:00 2001 From: Finn Thain Date: Mon, 10 Nov 2025 10:30:22 +1100 Subject: [PATCH 0401/1321] powerpc: Add reloc_offset() to font bitmap pointer used for bootx_printf() Since Linux v6.7, booting using BootX on an Old World PowerMac produces an early crash. Stan Johnson writes, "the symptoms are that the screen goes blank and the backlight stays on, and the system freezes (Linux doesn't boot)." Further testing revealed that the failure can be avoided by disabling CONFIG_BOOTX_TEXT. Bisection revealed that the regression was caused by a change to the font bitmap pointer that's used when btext_init() begins painting characters on the display, early in the boot process. Christophe Leroy explains, "before kernel text is relocated to its final location ... data is addressed with an offset which is added to the Global Offset Table (GOT) entries at the start of bootx_init() by function reloc_got2(). But the pointers that are located inside a structure are not referenced in the GOT and are therefore not updated by reloc_got2(). It is therefore needed to apply the offset manually by using PTRRELOC() macro." Cc: stable@vger.kernel.org Link: https://lists.debian.org/debian-powerpc/2025/10/msg00111.html Link: https://lore.kernel.org/linuxppc-dev/d81ddca8-c5ee-d583-d579-02b19ed95301@yahoo.com/ Reported-by: Cedar Maxwell Closes: https://lists.debian.org/debian-powerpc/2025/09/msg00031.html Bisected-by: Stan Johnson Tested-by: Stan Johnson Fixes: 0ebc7feae79a ("powerpc: Use shared font data") Suggested-by: Christophe Leroy Signed-off-by: Finn Thain Reviewed-by: Christophe Leroy Signed-off-by: Madhavan Srinivasan Link: https://patch.msgid.link/22b3b247425a052b079ab84da926706b3702c2c7.1762731022.git.fthain@linux-m68k.org --- arch/powerpc/kernel/btext.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/btext.c b/arch/powerpc/kernel/btext.c index 7f63f1cdc6c39..ca00c4824e313 100644 --- a/arch/powerpc/kernel/btext.c +++ b/arch/powerpc/kernel/btext.c @@ -20,6 +20,7 @@ #include #include #include +#include #define NO_SCROLL @@ -463,7 +464,7 @@ static noinline void draw_byte(unsigned char c, long locX, long locY) { unsigned char *base = calc_base(locX << 3, locY << 4); unsigned int font_index = c * 16; - const unsigned char *font = font_sun_8x16.data + font_index; + const unsigned char *font = PTRRELOC(font_sun_8x16.data) + font_index; int rb = dispDeviceRowBytes; rmci_maybe_on(); From ee78b04317e0b19237cceb1ba96c89f1679f0892 Mon Sep 17 00:00:00 2001 From: Aboorva Devarajan Date: Mon, 8 Sep 2025 14:21:23 +0530 Subject: [PATCH 0402/1321] powerpc/powernv: Enable cpuidle state detection for POWER11 Extend cpuidle state detection to POWER11 by updating the PVR check. This ensures POWER11 correctly recognizes supported stop states, similar to POWER9 and POWER10. Without Patch: (Power11 - PowerNV systems) CPUidle driver: powernv_idle CPUidle governor: menu analyzing CPU 927: Number of idle states: 1 Available idle states: snooze snooze: Flags/Description: snooze Latency: 0 Usage: 251631 Duration: 207497715900 -- With Patch: (Power11 - PowerNV systems) CPUidle driver: powernv_idle CPUidle governor: menu analyzing CPU 959: Number of idle states: 4 Available idle states: snooze stop0_lite stop0 stop3 snooze: Flags/Description: snooze Latency: 0 Usage: 2 Duration: 33 stop0_lite: Flags/Description: stop0_lite Latency: 1 Usage: 1 Duration: 52 stop0: Flags/Description: stop0 Latency: 10 Usage: 13 Duration: 1920 stop3: Flags/Description: stop3 Latency: 45 Usage: 381 Duration: 21638478 Signed-off-by: Aboorva Devarajan Tested-by: Madadi Vineeth Reddy Reviewed-by: Madadi Vineeth Reddy Signed-off-by: Madhavan Srinivasan Link: https://patch.msgid.link/20250908085123.216780-1-aboorvad@linux.ibm.com --- arch/powerpc/platforms/powernv/idle.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c index d98b933e4984c..e4f4e907f6e36 100644 --- a/arch/powerpc/platforms/powernv/idle.c +++ b/arch/powerpc/platforms/powernv/idle.c @@ -1171,8 +1171,9 @@ static void __init pnv_arch300_idle_init(void) u64 max_residency_ns = 0; int i; - /* stop is not really architected, we only have p9,p10 drivers */ - if (!pvr_version_is(PVR_POWER10) && !pvr_version_is(PVR_POWER9)) + /* stop is not really architected, we only have p9,p10 and p11 drivers */ + if (!pvr_version_is(PVR_POWER9) && !pvr_version_is(PVR_POWER10) && + !pvr_version_is(PVR_POWER11)) return; /* @@ -1189,8 +1190,8 @@ static void __init pnv_arch300_idle_init(void) struct pnv_idle_states_t *state = &pnv_idle_states[i]; u64 psscr_rl = state->psscr_val & PSSCR_RL_MASK; - /* No deep loss driver implemented for POWER10 yet */ - if (pvr_version_is(PVR_POWER10) && + /* No deep loss driver implemented for POWER10 and POWER11 yet */ + if ((pvr_version_is(PVR_POWER10) || pvr_version_is(PVR_POWER11)) && state->flags & (OPAL_PM_TIMEBASE_STOP|OPAL_PM_LOSE_FULL_CONTEXT)) continue; From 560a502baa6b2ed8e1cf71e94dc484acde133651 Mon Sep 17 00:00:00 2001 From: "Christophe Leroy (CS GROUP)" Date: Fri, 19 Dec 2025 13:23:52 +0100 Subject: [PATCH 0403/1321] powerpc/32: Restore disabling of interrupts at interrupt/syscall exit Commit 2997876c4a1a ("powerpc/32: Restore clearing of MSR[RI] at interrupt/syscall exit") delayed clearing of MSR[RI], but missed that both MSR[RI] and MSR[EE] are cleared at the same time, so the commit also delayed the disabling of interrupts, leading to unexpected behaviour. To fix that, mostly revert the blamed commit and restore the clearing of MSR[RI] in interrupt_exit_kernel_prepare() instead. For 8xx it implies adding a synchronising instruction after the mtspr in order to make sure no instruction counter interrupt (used for perf events) will fire just after clearing MSR[RI]. Reported-by: Christian Zigotzky Closes: https://lore.kernel.org/all/4d0bd05d-6158-1323-3509-744d3fbe8fc7@xenosoft.de/ Reported-by: Guenter Roeck Closes: https://lore.kernel.org/all/6b05eb1c-fdef-44e0-91a7-8286825e68f1@roeck-us.net/ Fixes: 2997876c4a1a ("powerpc/32: Restore clearing of MSR[RI] at interrupt/syscall exit") Signed-off-by: Christophe Leroy (CS GROUP) Signed-off-by: Madhavan Srinivasan Link: https://patch.msgid.link/585ea521b2be99d293b539bbfae148366cfb3687.1766146895.git.chleroy@kernel.org --- arch/powerpc/include/asm/hw_irq.h | 2 +- arch/powerpc/include/asm/reg.h | 1 + arch/powerpc/kernel/entry_32.S | 15 --------------- arch/powerpc/kernel/interrupt.c | 5 ++++- 4 files changed, 6 insertions(+), 17 deletions(-) diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h index 1078ba88efaf4..9cd945f2acafa 100644 --- a/arch/powerpc/include/asm/hw_irq.h +++ b/arch/powerpc/include/asm/hw_irq.h @@ -90,7 +90,7 @@ static inline void __hard_EE_RI_disable(void) if (IS_ENABLED(CONFIG_BOOKE)) wrtee(0); else if (IS_ENABLED(CONFIG_PPC_8xx)) - wrtspr(SPRN_NRI); + wrtspr_sync(SPRN_NRI); else if (IS_ENABLED(CONFIG_PPC_BOOK3S_64)) __mtmsrd(0, 1); else diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h index 3fe1866354323..3449dd2b577d4 100644 --- a/arch/powerpc/include/asm/reg.h +++ b/arch/powerpc/include/asm/reg.h @@ -1400,6 +1400,7 @@ static inline void mtmsr_isync(unsigned long val) : "r" ((unsigned long)(v)) \ : "memory") #define wrtspr(rn) asm volatile("mtspr " __stringify(rn) ",2" : : : "memory") +#define wrtspr_sync(rn) asm volatile("mtspr " __stringify(rn) ",2; sync" : : : "memory") static inline void wrtee(unsigned long val) { diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S index 16f8ee6cb2cd6..d8426251b1cda 100644 --- a/arch/powerpc/kernel/entry_32.S +++ b/arch/powerpc/kernel/entry_32.S @@ -101,17 +101,6 @@ SYM_FUNC_END(__kuep_unlock) .endm #endif -.macro clr_ri trash -#ifndef CONFIG_BOOKE -#ifdef CONFIG_PPC_8xx - mtspr SPRN_NRI, \trash -#else - li \trash, MSR_KERNEL & ~MSR_RI - mtmsr \trash -#endif -#endif -.endm - .globl transfer_to_syscall transfer_to_syscall: stw r3, ORIG_GPR3(r1) @@ -160,7 +149,6 @@ ret_from_syscall: cmpwi r3,0 REST_GPR(3, r1) syscall_exit_finish: - clr_ri r4 mtspr SPRN_SRR0,r7 mtspr SPRN_SRR1,r8 @@ -237,7 +225,6 @@ fast_exception_return: /* Clear the exception marker on the stack to avoid confusing stacktrace */ li r10, 0 stw r10, 8(r11) - clr_ri r10 mtspr SPRN_SRR1,r9 mtspr SPRN_SRR0,r12 REST_GPR(9, r11) @@ -270,7 +257,6 @@ interrupt_return: .Lfast_user_interrupt_return: lwz r11,_NIP(r1) lwz r12,_MSR(r1) - clr_ri r4 mtspr SPRN_SRR0,r11 mtspr SPRN_SRR1,r12 @@ -313,7 +299,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_NEED_PAIRED_STWCX) cmpwi cr1,r3,0 lwz r11,_NIP(r1) lwz r12,_MSR(r1) - clr_ri r4 mtspr SPRN_SRR0,r11 mtspr SPRN_SRR1,r12 diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c index aea6f7e8e9c67..e63bfde13e031 100644 --- a/arch/powerpc/kernel/interrupt.c +++ b/arch/powerpc/kernel/interrupt.c @@ -38,7 +38,7 @@ static inline bool exit_must_hard_disable(void) #else static inline bool exit_must_hard_disable(void) { - return false; + return true; } #endif @@ -443,6 +443,9 @@ notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs) if (unlikely(stack_store)) __hard_EE_RI_disable(); +#else + } else { + __hard_EE_RI_disable(); #endif /* CONFIG_PPC64 */ } From 7630724b77ee646c308460466317b3cdbb57e29a Mon Sep 17 00:00:00 2001 From: Andy Chiu Date: Wed, 12 Nov 2025 16:43:14 -0800 Subject: [PATCH 0404/1321] riscv: signal: abstract header saving for setup_sigcontext The function save_v_state() served two purposes. First, it saved extension context into the signal stack. Then, it constructed the extension header if there was no fault. The second part is independent of the extension itself. As a result, we can pull that part out, so future extensions may reuse it. This patch adds arch_ext_list and makes setup_sigcontext() go through all possible extensions' save() callback. The callback returns a positive value indicating the size of the successfully saved extension. Then the kernel proceeds to construct the header for that extension. The kernel skips an extension if it does not exist, or if the saving fails for some reasons. The error code is propagated out on the later case. This patch does not introduce any functional changes. Signed-off-by: Andy Chiu Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-16-b55691eacf4f@rivosinc.com Signed-off-by: Paul Walmsley --- arch/riscv/include/asm/vector.h | 3 ++ arch/riscv/kernel/signal.c | 62 ++++++++++++++++++++++----------- 2 files changed, 44 insertions(+), 21 deletions(-) diff --git a/arch/riscv/include/asm/vector.h b/arch/riscv/include/asm/vector.h index e7aa449368ad7..00cb9c0982b1a 100644 --- a/arch/riscv/include/asm/vector.h +++ b/arch/riscv/include/asm/vector.h @@ -424,6 +424,9 @@ static inline bool riscv_v_vstate_ctrl_user_allowed(void) { return false; } #define riscv_v_thread_free(tsk) do {} while (0) #define riscv_v_setup_ctx_cache() do {} while (0) #define riscv_v_thread_alloc(tsk) do {} while (0) +#define get_cpu_vector_context() do {} while (0) +#define put_cpu_vector_context() do {} while (0) +#define riscv_v_vstate_set_restore(task, regs) do {} while (0) #endif /* CONFIG_RISCV_ISA_V */ diff --git a/arch/riscv/kernel/signal.c b/arch/riscv/kernel/signal.c index 08378fea3a111..5a956108b1eaf 100644 --- a/arch/riscv/kernel/signal.c +++ b/arch/riscv/kernel/signal.c @@ -68,18 +68,19 @@ static long save_fp_state(struct pt_regs *regs, #define restore_fp_state(task, regs) (0) #endif -#ifdef CONFIG_RISCV_ISA_V - -static long save_v_state(struct pt_regs *regs, void __user **sc_vec) +static long save_v_state(struct pt_regs *regs, void __user *sc_vec) { - struct __riscv_ctx_hdr __user *hdr; struct __sc_riscv_v_state __user *state; void __user *datap; long err; - hdr = *sc_vec; - /* Place state to the user's signal context space after the hdr */ - state = (struct __sc_riscv_v_state __user *)(hdr + 1); + if (!IS_ENABLED(CONFIG_RISCV_ISA_V) || + !((has_vector() || has_xtheadvector()) && + riscv_v_vstate_query(regs))) + return 0; + + /* Place state to the user's signal context space */ + state = (struct __sc_riscv_v_state __user *)sc_vec; /* Point datap right after the end of __sc_riscv_v_state */ datap = state + 1; @@ -97,15 +98,11 @@ static long save_v_state(struct pt_regs *regs, void __user **sc_vec) err |= __put_user((__force void *)datap, &state->v_state.datap); /* Copy the whole vector content to user space datap. */ err |= __copy_to_user(datap, current->thread.vstate.datap, riscv_v_vsize); - /* Copy magic to the user space after saving all vector conetext */ - err |= __put_user(RISCV_V_MAGIC, &hdr->magic); - err |= __put_user(riscv_v_sc_size, &hdr->size); if (unlikely(err)) - return err; + return -EFAULT; - /* Only progress the sv_vec if everything has done successfully */ - *sc_vec += riscv_v_sc_size; - return 0; + /* Only return the size if everything has done successfully */ + return riscv_v_sc_size; } /* @@ -142,10 +139,20 @@ static long __restore_v_state(struct pt_regs *regs, void __user *sc_vec) */ return copy_from_user(current->thread.vstate.datap, datap, riscv_v_vsize); } -#else -#define save_v_state(task, regs) (0) -#define __restore_v_state(task, regs) (0) -#endif + +struct arch_ext_priv { + __u32 magic; + long (*save)(struct pt_regs *regs, void __user *sc_vec); +}; + +struct arch_ext_priv arch_ext_list[] = { + { + .magic = RISCV_V_MAGIC, + .save = &save_v_state, + }, +}; + +const size_t nr_arch_exts = ARRAY_SIZE(arch_ext_list); static long restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *sc) @@ -270,7 +277,8 @@ static long setup_sigcontext(struct rt_sigframe __user *frame, { struct sigcontext __user *sc = &frame->uc.uc_mcontext; struct __riscv_ctx_hdr __user *sc_ext_ptr = &sc->sc_extdesc.hdr; - long err; + struct arch_ext_priv *arch_ext; + long err, i, ext_size; /* sc_regs is structured the same as the start of pt_regs */ err = __copy_to_user(&sc->sc_regs, regs, sizeof(sc->sc_regs)); @@ -278,8 +286,20 @@ static long setup_sigcontext(struct rt_sigframe __user *frame, if (has_fpu()) err |= save_fp_state(regs, &sc->sc_fpregs); /* Save the vector state. */ - if ((has_vector() || has_xtheadvector()) && riscv_v_vstate_query(regs)) - err |= save_v_state(regs, (void __user **)&sc_ext_ptr); + for (i = 0; i < nr_arch_exts; i++) { + arch_ext = &arch_ext_list[i]; + if (!arch_ext->save) + continue; + + ext_size = arch_ext->save(regs, sc_ext_ptr + 1); + if (ext_size <= 0) { + err |= ext_size; + } else { + err |= __put_user(arch_ext->magic, &sc_ext_ptr->magic); + err |= __put_user(ext_size, &sc_ext_ptr->size); + sc_ext_ptr = (void *)sc_ext_ptr + ext_size; + } + } /* Write zero to fp-reserved space and check it on restore_sigcontext */ err |= __put_user(0, &sc->sc_extdesc.reserved); /* And put END __riscv_ctx_hdr at the end. */ From 2fb879dde861a9aae85ae745c4e3dc5ce8bbe7ab Mon Sep 17 00:00:00 2001 From: Paul Walmsley Date: Mon, 17 Nov 2025 21:19:27 -0700 Subject: [PATCH 0405/1321] riscv: mm: pmdp_huge_get_and_clear(): avoid atomic ops when !CONFIG_SMP When !CONFIG_SMP, there's no need for atomic operations in pmdp_huge_get_and_clear(), so, similar to what x86 does, let's not use atomics in this case. See also commit 546e42c8c6d94 ("riscv: Use an atomic xchg in pudp_huge_get_and_clear()"). Cc: Alexandre Ghiti Signed-off-by: Paul Walmsley --- arch/riscv/include/asm/pgtable.h | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h index 8bd36ac842eba..1df8a6adb407e 100644 --- a/arch/riscv/include/asm/pgtable.h +++ b/arch/riscv/include/asm/pgtable.h @@ -997,7 +997,13 @@ static inline int pmdp_test_and_clear_young(struct vm_area_struct *vma, static inline pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm, unsigned long address, pmd_t *pmdp) { +#ifdef CONFIG_SMP pmd_t pmd = __pmd(atomic_long_xchg((atomic_long_t *)pmdp, 0)); +#else + pmd_t pmd = *pmdp; + + pmd_clear(pmdp); +#endif page_table_check_pmd_clear(mm, pmd); From 71387cc2ef883b910c62e39073e32830423375d5 Mon Sep 17 00:00:00 2001 From: Paul Walmsley Date: Mon, 17 Nov 2025 21:19:27 -0700 Subject: [PATCH 0406/1321] riscv: mm: ptep_get_and_clear(): avoid atomic ops when !CONFIG_SMP When !CONFIG_SMP, there's no need for atomic operations in ptep_get_and_clear(), so, similar to x86, let's not use atomics in this case. Cc: Alexandre Ghiti Signed-off-by: Paul Walmsley --- arch/riscv/include/asm/pgtable.h | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h index 1df8a6adb407e..ebab8ecd78b20 100644 --- a/arch/riscv/include/asm/pgtable.h +++ b/arch/riscv/include/asm/pgtable.h @@ -660,7 +660,13 @@ extern int ptep_test_and_clear_young(struct vm_area_struct *vma, unsigned long a static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long address, pte_t *ptep) { +#ifdef CONFIG_SMP pte_t pte = __pte(atomic_long_xchg((atomic_long_t *)ptep, 0)); +#else + pte_t pte = *ptep; + + set_pte(ptep, __pte(0)); +#endif page_table_check_pte_clear(mm, pte); From 2ec33dc2160385dea3bd5a6cfa4d113ca1be1bed Mon Sep 17 00:00:00 2001 From: Paul Walmsley Date: Mon, 17 Nov 2025 21:19:28 -0700 Subject: [PATCH 0407/1321] riscv: mm: use xchg() on non-atomic_long_t variables, not atomic_long_xchg() Let's not call atomic_long_xchg() on something that's not an atomic_long_t, and just use xchg() instead. Continues the cleanup from commit 546e42c8c6d94 ("riscv: Use an atomic xchg in pudp_huge_get_and_clear()"), Cc: Alexandre Ghiti Signed-off-by: Paul Walmsley --- arch/riscv/include/asm/pgtable.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h index ebab8ecd78b20..6bb1f5bdc5d26 100644 --- a/arch/riscv/include/asm/pgtable.h +++ b/arch/riscv/include/asm/pgtable.h @@ -661,7 +661,7 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long address, pte_t *ptep) { #ifdef CONFIG_SMP - pte_t pte = __pte(atomic_long_xchg((atomic_long_t *)ptep, 0)); + pte_t pte = __pte(xchg(&ptep->pte, 0)); #else pte_t pte = *ptep; @@ -1004,7 +1004,7 @@ static inline pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm, unsigned long address, pmd_t *pmdp) { #ifdef CONFIG_SMP - pmd_t pmd = __pmd(atomic_long_xchg((atomic_long_t *)pmdp, 0)); + pmd_t pmd = __pmd(xchg(&pmdp->pmd, 0)); #else pmd_t pmd = *pmdp; From de6a4f13815cc314628d27ab78378a875a5a4c98 Mon Sep 17 00:00:00 2001 From: Pincheng Wang Date: Wed, 27 Aug 2025 00:29:35 +0800 Subject: [PATCH 0408/1321] dt-bindings: riscv: add Zilsd and Zclsd extension descriptions Add descriptions for the Zilsd (Load/Store pair instructions) and Zclsd (Compressed Load/Store pair instructions) ISA extensions which were ratified in commit f88abf1 ("Integrating load/store pair for RV32 with the main manual") of the riscv-isa-manual. Signed-off-by: Pincheng Wang Reviewed-by: Nutty Liu Acked-by: Conor Dooley Link: https://patch.msgid.link/20250826162939.1494021-2-pincheng.plct@isrc.iscas.ac.cn Signed-off-by: Paul Walmsley --- .../devicetree/bindings/riscv/extensions.yaml | 36 +++++++++++++++++++ 1 file changed, 36 insertions(+) diff --git a/Documentation/devicetree/bindings/riscv/extensions.yaml b/Documentation/devicetree/bindings/riscv/extensions.yaml index 565cb2cbb49b5..5bab356addc84 100644 --- a/Documentation/devicetree/bindings/riscv/extensions.yaml +++ b/Documentation/devicetree/bindings/riscv/extensions.yaml @@ -377,6 +377,20 @@ properties: guarantee on LR/SC sequences, as ratified in commit b1d806605f87 ("Updated to ratified state.") of the riscv profiles specification. + - const: zilsd + description: + The standard Zilsd extension which provides support for aligned + register-pair load and store operations in 32-bit instruction + encodings, as ratified in commit f88abf1 ("Integrating + load/store pair for RV32 with the main manual") of riscv-isa-manual. + + - const: zclsd + description: + The Zclsd extension implements the compressed (16-bit) version of the + Load/Store Pair for RV32. As with Zilsd, this extension was ratified + in commit f88abf1 ("Integrating load/store pair for RV32 with the + main manual") of riscv-isa-manual. + - const: zk description: The standard Zk Standard Scalar cryptography extension as ratified @@ -882,6 +896,16 @@ properties: anyOf: - const: v - const: zve32x + # Zclsd depends on Zilsd and Zca + - if: + contains: + anyOf: + - const: zclsd + then: + contains: + allOf: + - const: zilsd + - const: zca allOf: # Zcf extension does not exist on rv64 @@ -899,6 +923,18 @@ allOf: not: contains: const: zcf + # Zilsd extension does not exist on rv64 + - if: + properties: + riscv,isa-base: + contains: + const: rv64i + then: + properties: + riscv,isa-extensions: + not: + contains: + const: zilsd additionalProperties: true ... From d7fd4e9ab588a4668694937877169fb8452c6aef Mon Sep 17 00:00:00 2001 From: Pincheng Wang Date: Wed, 27 Aug 2025 00:29:36 +0800 Subject: [PATCH 0409/1321] riscv: add ISA extension parsing for Zilsd and Zclsd Add parsing for Zilsd and Zclsd ISA extensions which were ratified in commit f88abf1 ("Integrating load/store pair for RV32 with the main manual") of the riscv-isa-manual. Signed-off-by: Pincheng Wang Reviewed-by: Nutty Liu Link: https://patch.msgid.link/20250826162939.1494021-3-pincheng.plct@isrc.iscas.ac.cn [pjw@kernel.org: cleaned up checkpatch issues, whitespace; updated to apply] Signed-off-by: Paul Walmsley --- arch/riscv/include/asm/hwcap.h | 2 ++ arch/riscv/kernel/cpufeature.c | 24 ++++++++++++++++++++++++ 2 files changed, 26 insertions(+) diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h index dfe57b215e6c9..4369a23385413 100644 --- a/arch/riscv/include/asm/hwcap.h +++ b/arch/riscv/include/asm/hwcap.h @@ -108,6 +108,8 @@ #define RISCV_ISA_EXT_ZICBOP 99 #define RISCV_ISA_EXT_SVRSW60T59B 100 #define RISCV_ISA_EXT_ZALASR 101 +#define RISCV_ISA_EXT_ZILSD 102 +#define RISCV_ISA_EXT_ZCLSD 103 #define RISCV_ISA_EXT_XLINUXENVCFG 127 diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c index b057362f8fb5f..c05b11596c190 100644 --- a/arch/riscv/kernel/cpufeature.c +++ b/arch/riscv/kernel/cpufeature.c @@ -242,6 +242,28 @@ static int riscv_ext_zcf_validate(const struct riscv_isa_ext_data *data, return -EPROBE_DEFER; } +static int riscv_ext_zilsd_validate(const struct riscv_isa_ext_data *data, + const unsigned long *isa_bitmap) +{ + if (IS_ENABLED(CONFIG_64BIT)) + return -EINVAL; + + return 0; +} + +static int riscv_ext_zclsd_validate(const struct riscv_isa_ext_data *data, + const unsigned long *isa_bitmap) +{ + if (IS_ENABLED(CONFIG_64BIT)) + return -EINVAL; + + if (__riscv_isa_extension_available(isa_bitmap, RISCV_ISA_EXT_ZILSD) && + __riscv_isa_extension_available(isa_bitmap, RISCV_ISA_EXT_ZCA)) + return 0; + + return -EPROBE_DEFER; +} + static int riscv_vector_f_validate(const struct riscv_isa_ext_data *data, const unsigned long *isa_bitmap) { @@ -484,6 +506,8 @@ const struct riscv_isa_ext_data riscv_isa_ext[] = { __RISCV_ISA_EXT_DATA_VALIDATE(zcd, RISCV_ISA_EXT_ZCD, riscv_ext_zcd_validate), __RISCV_ISA_EXT_DATA_VALIDATE(zcf, RISCV_ISA_EXT_ZCF, riscv_ext_zcf_validate), __RISCV_ISA_EXT_DATA_VALIDATE(zcmop, RISCV_ISA_EXT_ZCMOP, riscv_ext_zca_depends), + __RISCV_ISA_EXT_DATA_VALIDATE(zclsd, RISCV_ISA_EXT_ZCLSD, riscv_ext_zclsd_validate), + __RISCV_ISA_EXT_DATA_VALIDATE(zilsd, RISCV_ISA_EXT_ZILSD, riscv_ext_zilsd_validate), __RISCV_ISA_EXT_DATA(zba, RISCV_ISA_EXT_ZBA), __RISCV_ISA_EXT_DATA(zbb, RISCV_ISA_EXT_ZBB), __RISCV_ISA_EXT_DATA(zbc, RISCV_ISA_EXT_ZBC), From 5f7793e6ae8e53847e39006ce54e341e47ca78b0 Mon Sep 17 00:00:00 2001 From: Pincheng Wang Date: Wed, 27 Aug 2025 00:29:37 +0800 Subject: [PATCH 0410/1321] riscv: hwprobe: export Zilsd and Zclsd ISA extensions Export Zilsd and Zclsd ISA extensions through hwprobe. Signed-off-by: Pincheng Wang Reviewed-by: Nutty Liu Link: https://patch.msgid.link/20250826162939.1494021-4-pincheng.plct@isrc.iscas.ac.cn [pjw@kernel.org: fixed whitespace; updated to apply] Signed-off-by: Paul Walmsley --- Documentation/arch/riscv/hwprobe.rst | 8 ++++++++ arch/riscv/include/uapi/asm/hwprobe.h | 3 +++ arch/riscv/kernel/sys_hwprobe.c | 2 ++ 3 files changed, 13 insertions(+) diff --git a/Documentation/arch/riscv/hwprobe.rst b/Documentation/arch/riscv/hwprobe.rst index 06c5280b728a2..641ec4abb9062 100644 --- a/Documentation/arch/riscv/hwprobe.rst +++ b/Documentation/arch/riscv/hwprobe.rst @@ -281,6 +281,14 @@ The following keys are defined: * :c:macro:`RISCV_HWPROBE_EXT_ZICBOP`: The Zicbop extension is supported, as ratified in commit 3dd606f ("Create cmobase-v1.0.pdf") of riscv-CMOs. + * :c:macro:`RISCV_HWPROBE_EXT_ZILSD`: The Zilsd extension is supported as + defined in the RISC-V ISA manual starting from commit f88abf1 ("Integrating + load/store pair for RV32 with the main manual") of the riscv-isa-manual. + + * :c:macro:`RISCV_HWPROBE_EXT_ZCLSD`: The Zclsd extension is supported as + defined in the RISC-V ISA manual starting from commit f88abf1 ("Integrating + load/store pair for RV32 with the main manual") of the riscv-isa-manual. + * :c:macro:`RISCV_HWPROBE_KEY_CPUPERF_0`: Deprecated. Returns similar values to :c:macro:`RISCV_HWPROBE_KEY_MISALIGNED_SCALAR_PERF`, but the key was mistakenly classified as a bitmask rather than a value. diff --git a/arch/riscv/include/uapi/asm/hwprobe.h b/arch/riscv/include/uapi/asm/hwprobe.h index 1edea2331b8bd..cd3c126730c33 100644 --- a/arch/riscv/include/uapi/asm/hwprobe.h +++ b/arch/riscv/include/uapi/asm/hwprobe.h @@ -84,6 +84,9 @@ struct riscv_hwprobe { #define RISCV_HWPROBE_EXT_ZABHA (1ULL << 58) #define RISCV_HWPROBE_EXT_ZALASR (1ULL << 59) #define RISCV_HWPROBE_EXT_ZICBOP (1ULL << 60) +#define RISCV_HWPROBE_EXT_ZILSD (1ULL << 61) +#define RISCV_HWPROBE_EXT_ZCLSD (1ULL << 62) + #define RISCV_HWPROBE_KEY_CPUPERF_0 5 #define RISCV_HWPROBE_MISALIGNED_UNKNOWN (0 << 0) #define RISCV_HWPROBE_MISALIGNED_EMULATED (1 << 0) diff --git a/arch/riscv/kernel/sys_hwprobe.c b/arch/riscv/kernel/sys_hwprobe.c index 0f701ace3bb9a..e6787ba7f2fc4 100644 --- a/arch/riscv/kernel/sys_hwprobe.c +++ b/arch/riscv/kernel/sys_hwprobe.c @@ -121,6 +121,7 @@ static void hwprobe_isa_ext0(struct riscv_hwprobe *pair, EXT_KEY(ZBS); EXT_KEY(ZCA); EXT_KEY(ZCB); + EXT_KEY(ZCLSD); EXT_KEY(ZCMOP); EXT_KEY(ZICBOM); EXT_KEY(ZICBOP); @@ -130,6 +131,7 @@ static void hwprobe_isa_ext0(struct riscv_hwprobe *pair, EXT_KEY(ZIHINTNTL); EXT_KEY(ZIHINTPAUSE); EXT_KEY(ZIHPM); + EXT_KEY(ZILSD); EXT_KEY(ZIMOP); EXT_KEY(ZKND); EXT_KEY(ZKNE); From ef38b753ce16cc25dcfd66ff31b1d1817b78a8ff Mon Sep 17 00:00:00 2001 From: Zongmin Zhou Date: Thu, 20 Nov 2025 17:58:31 +0800 Subject: [PATCH 0411/1321] riscv/atomic.h: use RISCV_FULL_BARRIER in _arch_atomic* function. Replace the same code with the pre-defined macro RISCV_FULL_BARRIER to simplify the code. Signed-off-by: Zongmin Zhou Link: https://patch.msgid.link/20251120095831.64211-1-min_halo@163.com Signed-off-by: Paul Walmsley --- arch/riscv/include/asm/atomic.h | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/riscv/include/asm/atomic.h b/arch/riscv/include/asm/atomic.h index 5b96c2f61adb5..3f33dc54f94b2 100644 --- a/arch/riscv/include/asm/atomic.h +++ b/arch/riscv/include/asm/atomic.h @@ -203,7 +203,7 @@ ATOMIC_OPS(xor, xor, i) " add %[rc], %[p], %[a]\n" \ " sc." sfx ".rl %[rc], %[rc], %[c]\n" \ " bnez %[rc], 0b\n" \ - " fence rw, rw\n" \ + RISCV_FULL_BARRIER \ "1:\n" \ : [p]"=&r" (_prev), [rc]"=&r" (_rc), [c]"+A" (counter) \ : [a]"r" (_a), [u]"r" (_u) \ @@ -242,7 +242,7 @@ static __always_inline s64 arch_atomic64_fetch_add_unless(atomic64_t *v, s64 a, " addi %[rc], %[p], 1\n" \ " sc." sfx ".rl %[rc], %[rc], %[c]\n" \ " bnez %[rc], 0b\n" \ - " fence rw, rw\n" \ + RISCV_FULL_BARRIER \ "1:\n" \ : [p]"=&r" (_prev), [rc]"=&r" (_rc), [c]"+A" (counter) \ : \ @@ -268,7 +268,7 @@ static __always_inline bool arch_atomic_inc_unless_negative(atomic_t *v) " addi %[rc], %[p], -1\n" \ " sc." sfx ".rl %[rc], %[rc], %[c]\n" \ " bnez %[rc], 0b\n" \ - " fence rw, rw\n" \ + RISCV_FULL_BARRIER \ "1:\n" \ : [p]"=&r" (_prev), [rc]"=&r" (_rc), [c]"+A" (counter) \ : \ @@ -294,7 +294,7 @@ static __always_inline bool arch_atomic_dec_unless_positive(atomic_t *v) " bltz %[rc], 1f\n" \ " sc." sfx ".rl %[rc], %[rc], %[c]\n" \ " bnez %[rc], 0b\n" \ - " fence rw, rw\n" \ + RISCV_FULL_BARRIER \ "1:\n" \ : [p]"=&r" (_prev), [rc]"=&r" (_rc), [c]"+A" (counter) \ : \ From a3e4532cf0a8b814484ff40b9d500279eb649e13 Mon Sep 17 00:00:00 2001 From: Himanshu Chauhan Date: Thu, 10 Jul 2025 18:22:30 +0530 Subject: [PATCH 0412/1321] riscv: Add SBI debug trigger extension and function ids Debug trigger extension is an SBI extension to support native debugging in S-mode and VS-mode. This patch adds the extension and the function IDs defined by the extension. Signed-off-by: Himanshu Chauhan Link: https://patch.msgid.link/20250710125231.653967-2-hchauhan@ventanamicro.com [pjw@kernel.org: updated to apply] Signed-off-by: Paul Walmsley --- arch/riscv/include/asm/sbi.h | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h index ccc77a89b1e22..5725e0ca4dda3 100644 --- a/arch/riscv/include/asm/sbi.h +++ b/arch/riscv/include/asm/sbi.h @@ -37,6 +37,7 @@ enum sbi_ext_id { SBI_EXT_NACL = 0x4E41434C, SBI_EXT_FWFT = 0x46574654, SBI_EXT_MPXY = 0x4D505859, + SBI_EXT_DBTR = 0x44425452, /* Experimentals extensions must lie within this range */ SBI_EXT_EXPERIMENTAL_START = 0x08000000, @@ -505,6 +506,34 @@ enum sbi_mpxy_rpmi_attribute_id { #define SBI_MPXY_CHAN_CAP_SEND_WITHOUT_RESP BIT(4) #define SBI_MPXY_CHAN_CAP_GET_NOTIFICATIONS BIT(5) +/* SBI debug triggers function IDs */ +enum sbi_ext_dbtr_fid { + SBI_EXT_DBTR_NUM_TRIGGERS = 0, + SBI_EXT_DBTR_SETUP_SHMEM, + SBI_EXT_DBTR_TRIG_READ, + SBI_EXT_DBTR_TRIG_INSTALL, + SBI_EXT_DBTR_TRIG_UPDATE, + SBI_EXT_DBTR_TRIG_UNINSTALL, + SBI_EXT_DBTR_TRIG_ENABLE, + SBI_EXT_DBTR_TRIG_DISABLE, +}; + +struct sbi_dbtr_data_msg { + unsigned long tstate; + unsigned long tdata1; + unsigned long tdata2; + unsigned long tdata3; +}; + +struct sbi_dbtr_id_msg { + unsigned long idx; +}; + +union sbi_dbtr_shmem_entry { + struct sbi_dbtr_data_msg data; + struct sbi_dbtr_id_msg id; +}; + /* SBI spec version fields */ #define SBI_SPEC_VERSION_DEFAULT 0x1 #define SBI_SPEC_VERSION_MAJOR_SHIFT 24 From f63cf0bad7f51b4398f9141fd2e26d6944e254eb Mon Sep 17 00:00:00 2001 From: Thomas Fourier Date: Tue, 16 Dec 2025 17:54:18 +0100 Subject: [PATCH 0413/1321] firewire: nosy: Fix dma_free_coherent() size It looks like the buffer allocated and mapped in add_card() is done with size RCV_BUFFER_SIZE which is 16 KB and 4KB. Fixes: 286468210d83 ("firewire: new driver: nosy - IEEE 1394 traffic sniffer") Co-developed-by: Thomas Fourier Signed-off-by: Thomas Fourier Co-developed-by: Christophe JAILLET Signed-off-by: Christophe JAILLET Link: https://lore.kernel.org/r/20251216165420.38355-2-fourier.thomas@gmail.com Signed-off-by: Takashi Sakamoto --- drivers/firewire/nosy.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/firewire/nosy.c b/drivers/firewire/nosy.c index ea31ac7ac1ca9..e59053738a432 100644 --- a/drivers/firewire/nosy.c +++ b/drivers/firewire/nosy.c @@ -36,6 +36,8 @@ static char driver_name[] = KBUILD_MODNAME; +#define RCV_BUFFER_SIZE (16 * 1024) + /* this is the physical layout of a PCL, its size is 128 bytes */ struct pcl { __le32 next; @@ -517,16 +519,14 @@ remove_card(struct pci_dev *dev) lynx->rcv_start_pcl, lynx->rcv_start_pcl_bus); dma_free_coherent(&lynx->pci_device->dev, sizeof(struct pcl), lynx->rcv_pcl, lynx->rcv_pcl_bus); - dma_free_coherent(&lynx->pci_device->dev, PAGE_SIZE, lynx->rcv_buffer, - lynx->rcv_buffer_bus); + dma_free_coherent(&lynx->pci_device->dev, RCV_BUFFER_SIZE, + lynx->rcv_buffer, lynx->rcv_buffer_bus); iounmap(lynx->registers); pci_disable_device(dev); lynx_put(lynx); } -#define RCV_BUFFER_SIZE (16 * 1024) - static int add_card(struct pci_dev *dev, const struct pci_device_id *unused) { @@ -680,7 +680,7 @@ add_card(struct pci_dev *dev, const struct pci_device_id *unused) dma_free_coherent(&lynx->pci_device->dev, sizeof(struct pcl), lynx->rcv_pcl, lynx->rcv_pcl_bus); if (lynx->rcv_buffer) - dma_free_coherent(&lynx->pci_device->dev, PAGE_SIZE, + dma_free_coherent(&lynx->pci_device->dev, RCV_BUFFER_SIZE, lynx->rcv_buffer, lynx->rcv_buffer_bus); iounmap(lynx->registers); From 8bbe9dfc0f5c1bc8c3bee221bd3a9d0f5932ca29 Mon Sep 17 00:00:00 2001 From: Andy Shevchenko Date: Thu, 27 Nov 2025 17:36:50 +0100 Subject: [PATCH 0414/1321] serial: core: Restore sysfs fwnode information The change that restores sysfs fwnode information does it only for OF cases. Update the fix to cover all possible types of fwnodes. Fixes: d36f0e9a0002 ("serial: core: restore of_node information in sysfs") Cc: stable Signed-off-by: Andy Shevchenko Link: https://patch.msgid.link/20251127163650.2942075-1-andriy.shevchenko@linux.intel.com Signed-off-by: Greg Kroah-Hartman --- drivers/tty/serial/serial_base_bus.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/drivers/tty/serial/serial_base_bus.c b/drivers/tty/serial/serial_base_bus.c index 22749ab0428a7..8e891984cdc0d 100644 --- a/drivers/tty/serial/serial_base_bus.c +++ b/drivers/tty/serial/serial_base_bus.c @@ -13,7 +13,7 @@ #include #include #include -#include +#include #include #include #include @@ -60,6 +60,7 @@ void serial_base_driver_unregister(struct device_driver *driver) driver_unregister(driver); } +/* On failure the caller must put device @dev with put_device() */ static int serial_base_device_init(struct uart_port *port, struct device *dev, struct device *parent_dev, @@ -73,7 +74,8 @@ static int serial_base_device_init(struct uart_port *port, dev->parent = parent_dev; dev->bus = &serial_base_bus_type; dev->release = release; - device_set_of_node_from_dev(dev, parent_dev); + + device_set_node(dev, fwnode_handle_get(dev_fwnode(parent_dev))); if (!serial_base_initialized) { dev_dbg(port->dev, "uart_add_one_port() called before arch_initcall()?\n"); @@ -94,7 +96,7 @@ static void serial_base_ctrl_release(struct device *dev) { struct serial_ctrl_device *ctrl_dev = to_serial_base_ctrl_device(dev); - of_node_put(dev->of_node); + fwnode_handle_put(dev_fwnode(dev)); kfree(ctrl_dev); } @@ -142,7 +144,7 @@ static void serial_base_port_release(struct device *dev) { struct serial_port_device *port_dev = to_serial_base_port_device(dev); - of_node_put(dev->of_node); + fwnode_handle_put(dev_fwnode(dev)); kfree(port_dev); } From be7ad3c63aa04194a2430a9c3b5cba4fb419e875 Mon Sep 17 00:00:00 2001 From: Dan Carpenter Date: Sat, 29 Nov 2025 17:51:23 +0300 Subject: [PATCH 0415/1321] serial: 8250: longson: Fix NULL vs IS_ERR() bug in probe The devm_platform_get_and_ioremap_resource() function never returns NULL, it returns error pointers. Fix the error checking to match. Fixes: 25e95d763176 ("serial: 8250: Add Loongson uart driver support") Signed-off-by: Dan Carpenter Link: https://patch.msgid.link/aSsIa3KdAlXh5uQC@stanley.mountain Signed-off-by: Greg Kroah-Hartman --- drivers/tty/serial/8250/8250_loongson.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/tty/serial/8250/8250_loongson.c b/drivers/tty/serial/8250/8250_loongson.c index 53153a116c011..47df3c4c9d21f 100644 --- a/drivers/tty/serial/8250/8250_loongson.c +++ b/drivers/tty/serial/8250/8250_loongson.c @@ -128,8 +128,8 @@ static int loongson_uart_probe(struct platform_device *pdev) port->private_data = priv; port->membase = devm_platform_get_and_ioremap_resource(pdev, 0, &priv->res); - if (!port->membase) - return -ENOMEM; + if (IS_ERR(port->membase)) + return PTR_ERR(port->membase); port->mapbase = priv->res->start; port->mapsize = resource_size(priv->res); From d9ae6c464eca50197f060a97a2f8a101abb0950b Mon Sep 17 00:00:00 2001 From: Alexander Stein Date: Fri, 19 Dec 2025 16:28:12 +0100 Subject: [PATCH 0416/1321] serial: core: Fix serial device initialization During restoring sysfs fwnode information the information of_node_reused was dropped. This was previously set by device_set_of_node_from_dev(). Add it back manually Fixes: 24ec03cc5512 ("serial: core: Restore sysfs fwnode information") Cc: stable Suggested-by: Cosmin Tanislav Signed-off-by: Alexander Stein Tested-by: Michael Walle Tested-by: Marek Szyprowski Tested-by: Cosmin Tanislav Link: https://patch.msgid.link/20251219152813.1893982-1-alexander.stein@ew.tq-group.com Signed-off-by: Greg Kroah-Hartman --- drivers/tty/serial/serial_base_bus.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/tty/serial/serial_base_bus.c b/drivers/tty/serial/serial_base_bus.c index 8e891984cdc0d..1e1ad28d83fcf 100644 --- a/drivers/tty/serial/serial_base_bus.c +++ b/drivers/tty/serial/serial_base_bus.c @@ -74,6 +74,7 @@ static int serial_base_device_init(struct uart_port *port, dev->parent = parent_dev; dev->bus = &serial_base_bus_type; dev->release = release; + dev->of_node_reused = true; device_set_node(dev, fwnode_handle_get(dev_fwnode(parent_dev))); From a698faaa4afe627b6f60abac7e0af002b48326e3 Mon Sep 17 00:00:00 2001 From: Claudiu Beznea Date: Wed, 17 Dec 2025 15:57:59 +0200 Subject: [PATCH 0417/1321] serial: sh-sci: Check that the DMA cookie is valid The driver updates struct sci_port::tx_cookie to zero right before the TX work is scheduled, or to -EINVAL when DMA is disabled. dma_async_is_complete(), called through dma_cookie_status() (and possibly through dmaengine_tx_status()), considers cookies valid only if they have values greater than or equal to 1. Passing zero or -EINVAL to dmaengine_tx_status() before any TX DMA transfer has started leads to an incorrect TX status being reported, as the cookie is invalid for the DMA subsystem. This may cause long wait times when the serial device is opened for configuration before any TX activity has occurred. Check that the TX cookie is valid before passing it to dmaengine_tx_status(). Fixes: 7cc0e0a43a91 ("serial: sh-sci: Check if TX data was written to device in .tx_empty()") Cc: stable Signed-off-by: Claudiu Beznea Link: https://patch.msgid.link/20251217135759.402015-1-claudiu.beznea.uj@bp.renesas.com Signed-off-by: Greg Kroah-Hartman --- drivers/tty/serial/sh-sci.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/tty/serial/sh-sci.c b/drivers/tty/serial/sh-sci.c index 53edbf1d89633..fbfe5575bd3c1 100644 --- a/drivers/tty/serial/sh-sci.c +++ b/drivers/tty/serial/sh-sci.c @@ -1914,7 +1914,7 @@ static void sci_dma_check_tx_occurred(struct sci_port *s) struct dma_tx_state state; enum dma_status status; - if (!s->chan_tx) + if (!s->chan_tx || s->cookie_tx <= 0) return; status = dmaengine_tx_status(s->chan_tx, s->cookie_tx, &state); From 2c74e9416f2130f9f81338eb852322a33e1a315a Mon Sep 17 00:00:00 2001 From: "j.turek" Date: Sun, 21 Dec 2025 11:32:21 +0100 Subject: [PATCH 0418/1321] serial: xilinx_uartps: fix rs485 delay_rts_after_send RTS line control with delay should be triggered when there is no more bytes in kfifo and hardware buffer is empty. Without this patch RTS control is scheduled right after feeding hardware buffer and this is too early. RTS line may change state before hardware buffer is empty. With this patch delayed RTS state change is triggered when function cdns_uart_handle_tx is called from cdns_uart_isr on CDNS_UART_IXR_TXEMPTY exactly when hardware completed transmission Fixes: fccc9d9233f9 ("tty: serial: uartps: Add rs485 support to uartps driver") Cc: stable Link: https://patch.msgid.link/20251221103221.1971125-1-jakub.turek@elsta.tech Signed-off-by: Jakub Turek Signed-off-by: Greg Kroah-Hartman --- drivers/tty/serial/xilinx_uartps.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/tty/serial/xilinx_uartps.c b/drivers/tty/serial/xilinx_uartps.c index c793fc74c26be..c593d20a1b5bd 100644 --- a/drivers/tty/serial/xilinx_uartps.c +++ b/drivers/tty/serial/xilinx_uartps.c @@ -428,10 +428,17 @@ static void cdns_uart_handle_tx(void *dev_id) struct tty_port *tport = &port->state->port; unsigned int numbytes; unsigned char ch; + ktime_t rts_delay; if (kfifo_is_empty(&tport->xmit_fifo) || uart_tx_stopped(port)) { /* Disable the TX Empty interrupt */ writel(CDNS_UART_IXR_TXEMPTY, port->membase + CDNS_UART_IDR); + /* Set RTS line after delay */ + if (cdns_uart->port->rs485.flags & SER_RS485_ENABLED) { + cdns_uart->tx_timer.function = &cdns_rs485_rx_callback; + rts_delay = ns_to_ktime(cdns_calc_after_tx_delay(cdns_uart)); + hrtimer_start(&cdns_uart->tx_timer, rts_delay, HRTIMER_MODE_REL); + } return; } @@ -448,13 +455,6 @@ static void cdns_uart_handle_tx(void *dev_id) /* Enable the TX Empty interrupt */ writel(CDNS_UART_IXR_TXEMPTY, cdns_uart->port->membase + CDNS_UART_IER); - - if (cdns_uart->port->rs485.flags & SER_RS485_ENABLED && - (kfifo_is_empty(&tport->xmit_fifo) || uart_tx_stopped(port))) { - hrtimer_update_function(&cdns_uart->tx_timer, cdns_rs485_rx_callback); - hrtimer_start(&cdns_uart->tx_timer, - ns_to_ktime(cdns_calc_after_tx_delay(cdns_uart)), HRTIMER_MODE_REL); - } } /** From 2ff5f7b449b61213a5636c30d016f346eeb1603a Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=C5=81ukasz=20Bartosik?= Date: Thu, 27 Nov 2025 11:16:44 +0000 Subject: [PATCH 0419/1321] xhci: dbgtty: fix device unregister: fixup MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit This fixup replaces tty_vhangup() call with call to tty_port_tty_vhangup(). Both calls hangup tty device synchronously however tty_port_tty_vhangup() increases reference count during the hangup operation using scoped_guard(tty_port_tty). Cc: stable Fixes: 1f73b8b56cf3 ("xhci: dbgtty: fix device unregister") Signed-off-by: Łukasz Bartosik Link: https://patch.msgid.link/20251127111644.3161386-1-ukaszb@google.com Signed-off-by: Greg Kroah-Hartman --- drivers/usb/host/xhci-dbgtty.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/usb/host/xhci-dbgtty.c b/drivers/usb/host/xhci-dbgtty.c index 57cdda4e09c8e..90282e51e23ec 100644 --- a/drivers/usb/host/xhci-dbgtty.c +++ b/drivers/usb/host/xhci-dbgtty.c @@ -554,7 +554,7 @@ static void xhci_dbc_tty_unregister_device(struct xhci_dbc *dbc) * Hang up the TTY. This wakes up any blocked * writers and causes subsequent writes to fail. */ - tty_vhangup(port->port.tty); + tty_port_tty_vhangup(&port->port); tty_unregister_device(dbc_tty_driver, port->minor); xhci_dbc_tty_exit_port(port); From 199cf0df93e55165fc43d22c1b1f9a2459f4503f Mon Sep 17 00:00:00 2001 From: Udipto Goswami Date: Wed, 26 Nov 2025 11:12:21 +0530 Subject: [PATCH 0420/1321] usb: dwc3: keep susphy enabled during exit to avoid controller faults On some platforms, switching USB roles from host to device can trigger controller faults due to premature PHY power-down. This occurs when the PHY is disabled too early during teardown, causing synchronization issues between the PHY and controller. Keep susphy enabled during dwc3_host_exit() and dwc3_gadget_exit() ensures the PHY remains in a low-power state capable of handling required commands during role switch. Cc: stable Fixes: 6d735722063a ("usb: dwc3: core: Prevent phy suspend during init") Suggested-by: Thinh Nguyen Signed-off-by: Udipto Goswami Acked-by: Thinh Nguyen Link: https://patch.msgid.link/20251126054221.120638-1-udipto.goswami@oss.qualcomm.com Signed-off-by: Greg Kroah-Hartman --- drivers/usb/dwc3/gadget.c | 2 +- drivers/usb/dwc3/host.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c index bc3fe31638b9a..8a35a6901db7d 100644 --- a/drivers/usb/dwc3/gadget.c +++ b/drivers/usb/dwc3/gadget.c @@ -4826,7 +4826,7 @@ void dwc3_gadget_exit(struct dwc3 *dwc) if (!dwc->gadget) return; - dwc3_enable_susphy(dwc, false); + dwc3_enable_susphy(dwc, true); usb_del_gadget(dwc->gadget); dwc3_gadget_free_endpoints(dwc); usb_put_gadget(dwc->gadget); diff --git a/drivers/usb/dwc3/host.c b/drivers/usb/dwc3/host.c index cf6512ed17a69..96b588bd08cdc 100644 --- a/drivers/usb/dwc3/host.c +++ b/drivers/usb/dwc3/host.c @@ -227,7 +227,7 @@ void dwc3_host_exit(struct dwc3 *dwc) if (dwc->sys_wakeup) device_init_wakeup(&dwc->xhci->dev, false); - dwc3_enable_susphy(dwc, false); + dwc3_enable_susphy(dwc, true); platform_device_unregister(dwc->xhci); dwc->xhci = NULL; } From 9dfa76edb1afe561ed1b1e2fa243adf552eee6c5 Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Thu, 4 Dec 2025 11:11:07 +0100 Subject: [PATCH 0421/1321] usb: typec: ucsi: huawei-gaokin: add DRM dependency Selecting DRM_AUX_HPD_BRIDGE is not possible from a built-in driver when CONFIG_DRM=m: WARNING: unmet direct dependencies detected for DRM_AUX_HPD_BRIDGE Depends on [m]: HAS_IOMEM [=y] && DRM [=m] && DRM_BRIDGE [=y] && OF [=y] Selected by [y]: - UCSI_HUAWEI_GAOKUN [=y] && USB_SUPPORT [=y] && TYPEC [=y] && TYPEC_UCSI [=y] && EC_HUAWEI_GAOKUN [=y] && DRM_BRIDGE [=y] && OF [=y] Add the same dependency we have in similar drivers to work around this. Fixes: 00327d7f2c8c ("usb: typec: ucsi: add Huawei Matebook E Go ucsi driver") Cc: stable Signed-off-by: Arnd Bergmann Reviewed-by: Heikki Krogerus Link: https://patch.msgid.link/20251204101111.1035975-1-arnd@kernel.org Signed-off-by: Greg Kroah-Hartman --- drivers/usb/typec/ucsi/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/usb/typec/ucsi/Kconfig b/drivers/usb/typec/ucsi/Kconfig index 7fcb1e1de5d6d..b812be4d0e674 100644 --- a/drivers/usb/typec/ucsi/Kconfig +++ b/drivers/usb/typec/ucsi/Kconfig @@ -96,6 +96,7 @@ config UCSI_LENOVO_YOGA_C630 config UCSI_HUAWEI_GAOKUN tristate "UCSI Interface Driver for Huawei Matebook E Go" depends on EC_HUAWEI_GAOKUN + depends on DRM || !DRM select DRM_AUX_HPD_BRIDGE if DRM_BRIDGE && OF help This driver enables UCSI support on the Huawei Matebook E Go tablet, From 84024f697da4afa10c290ccdc9089b2a300a5ca4 Mon Sep 17 00:00:00 2001 From: Haoxiang Li Date: Thu, 4 Dec 2025 21:21:29 +0800 Subject: [PATCH 0422/1321] usb: renesas_usbhs: Fix a resource leak in usbhs_pipe_malloc() usbhsp_get_pipe() set pipe's flags to IS_USED. In error paths, usbhsp_put_pipe() is required to clear pipe's flags to prevent pipe exhaustion. Fixes: f1407d5c6624 ("usb: renesas_usbhs: Add Renesas USBHS common code") Cc: stable Signed-off-by: Haoxiang Li Link: https://patch.msgid.link/20251204132129.109234-1-haoxiang_li2024@163.com Signed-off-by: Greg Kroah-Hartman --- drivers/usb/renesas_usbhs/pipe.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/usb/renesas_usbhs/pipe.c b/drivers/usb/renesas_usbhs/pipe.c index 75fff2e4cbc65..56fc3ff5016fc 100644 --- a/drivers/usb/renesas_usbhs/pipe.c +++ b/drivers/usb/renesas_usbhs/pipe.c @@ -713,11 +713,13 @@ struct usbhs_pipe *usbhs_pipe_malloc(struct usbhs_priv *priv, /* make sure pipe is not busy */ ret = usbhsp_pipe_barrier(pipe); if (ret < 0) { + usbhsp_put_pipe(pipe); dev_err(dev, "pipe setup failed %d\n", usbhs_pipe_number(pipe)); return NULL; } if (usbhsp_setup_pipecfg(pipe, is_host, dir_in, &pipecfg)) { + usbhsp_put_pipe(pipe); dev_err(dev, "can't setup pipe\n"); return NULL; } From 66d648b66bb3375b6757824c77a29403dfab0f4e Mon Sep 17 00:00:00 2001 From: Duoming Zhou Date: Fri, 5 Dec 2025 11:48:31 +0800 Subject: [PATCH 0423/1321] usb: phy: fsl-usb: Fix use-after-free in delayed work during device removal The delayed work item otg_event is initialized in fsl_otg_conf() and scheduled under two conditions: 1. When a host controller binds to the OTG controller. 2. When the USB ID pin state changes (cable insertion/removal). A race condition occurs when the device is removed via fsl_otg_remove(): the fsl_otg instance may be freed while the delayed work is still pending or executing. This leads to use-after-free when the work function fsl_otg_event() accesses the already freed memory. The problematic scenario: (detach thread) | (delayed work) fsl_otg_remove() | kfree(fsl_otg_dev) //FREE| fsl_otg_event() | og = container_of(...) //USE | og-> //USE Fix this by calling disable_delayed_work_sync() in fsl_otg_remove() before deallocating the fsl_otg structure. This ensures the delayed work is properly canceled and completes execution prior to memory deallocation. This bug was identified through static analysis. Fixes: 0807c500a1a6 ("USB: add Freescale USB OTG Transceiver driver") Cc: stable Signed-off-by: Duoming Zhou Link: https://patch.msgid.link/20251205034831.12846-1-duoming@zju.edu.cn Signed-off-by: Greg Kroah-Hartman --- drivers/usb/phy/phy-fsl-usb.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/usb/phy/phy-fsl-usb.c b/drivers/usb/phy/phy-fsl-usb.c index 40ac68e52cee7..e266a47c4d483 100644 --- a/drivers/usb/phy/phy-fsl-usb.c +++ b/drivers/usb/phy/phy-fsl-usb.c @@ -988,6 +988,7 @@ static void fsl_otg_remove(struct platform_device *pdev) { struct fsl_usb2_platform_data *pdata = dev_get_platdata(&pdev->dev); + disable_delayed_work_sync(&fsl_otg_dev->otg_event); usb_remove_phy(&fsl_otg_dev->phy); free_irq(fsl_otg_dev->irq, fsl_otg_dev); From d203bb61c24cca0fc0b58af1922fe6b3d48aafa1 Mon Sep 17 00:00:00 2001 From: Haoxiang Li Date: Sat, 6 Dec 2025 15:04:45 +0800 Subject: [PATCH 0424/1321] usb: typec: altmodes/displayport: Drop the device reference in dp_altmode_probe() In error paths, call typec_altmode_put_plug() to drop the device reference obtained by typec_altmode_get_plug(). Fixes: 71ba4fe56656 ("usb: typec: altmodes/displayport: add SOP' support") Cc: stable Signed-off-by: Haoxiang Li Reviewed-by: Heikki Krogerus Link: https://patch.msgid.link/20251206070445.190770-1-lihaoxiang@isrc.iscas.ac.cn Signed-off-by: Greg Kroah-Hartman --- drivers/usb/typec/altmodes/displayport.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/usb/typec/altmodes/displayport.c b/drivers/usb/typec/altmodes/displayport.c index 8d111ad3b71b8..d96ab106a980b 100644 --- a/drivers/usb/typec/altmodes/displayport.c +++ b/drivers/usb/typec/altmodes/displayport.c @@ -766,12 +766,16 @@ int dp_altmode_probe(struct typec_altmode *alt) if (!(DP_CAP_PIN_ASSIGN_DFP_D(port->vdo) & DP_CAP_PIN_ASSIGN_UFP_D(alt->vdo)) && !(DP_CAP_PIN_ASSIGN_UFP_D(port->vdo) & - DP_CAP_PIN_ASSIGN_DFP_D(alt->vdo))) + DP_CAP_PIN_ASSIGN_DFP_D(alt->vdo))) { + typec_altmode_put_plug(plug); return -ENODEV; + } dp = devm_kzalloc(&alt->dev, sizeof(*dp), GFP_KERNEL); - if (!dp) + if (!dp) { + typec_altmode_put_plug(plug); return -ENOMEM; + } INIT_WORK(&dp->work, dp_altmode_work); mutex_init(&dp->lock); From 0a443f4280bbd8f67ed40c94455cb4bcada9d3aa Mon Sep 17 00:00:00 2001 From: Ma Ke Date: Mon, 15 Dec 2025 10:09:31 +0800 Subject: [PATCH 0425/1321] USB: lpc32xx_udc: Fix error handling in probe lpc32xx_udc_probe() acquires an i2c_client reference through isp1301_get_client() but fails to release it in both error handling paths and the normal removal path. This could result in a reference count leak for the I2C device, preventing proper cleanup and potentially leading to resource exhaustion. Add put_device() to release the reference in the probe failure path and in the remove function. Calling path: isp1301_get_client() -> of_find_i2c_device_by_node() -> i2c_find_device_by_fwnode(). As comments of i2c_find_device_by_fwnode() says, 'The user must call put_device(&client->dev) once done with the i2c client.' Found by code review. Cc: stable Fixes: 24a28e428351 ("USB: gadget driver for LPC32xx") Signed-off-by: Ma Ke Link: https://patch.msgid.link/20251215020931.15324-1-make24@iscas.ac.cn Signed-off-by: Greg Kroah-Hartman --- drivers/usb/gadget/udc/lpc32xx_udc.c | 21 +++++++++++++++------ 1 file changed, 15 insertions(+), 6 deletions(-) diff --git a/drivers/usb/gadget/udc/lpc32xx_udc.c b/drivers/usb/gadget/udc/lpc32xx_udc.c index 1a7d3c4f652fe..73c0f28a85852 100644 --- a/drivers/usb/gadget/udc/lpc32xx_udc.c +++ b/drivers/usb/gadget/udc/lpc32xx_udc.c @@ -3020,7 +3020,7 @@ static int lpc32xx_udc_probe(struct platform_device *pdev) pdev->dev.dma_mask = &lpc32xx_usbd_dmamask; retval = dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(32)); if (retval) - return retval; + goto i2c_fail; udc->board = &lpc32xx_usbddata; @@ -3038,28 +3038,32 @@ static int lpc32xx_udc_probe(struct platform_device *pdev) /* Get IRQs */ for (i = 0; i < 4; i++) { udc->udp_irq[i] = platform_get_irq(pdev, i); - if (udc->udp_irq[i] < 0) - return udc->udp_irq[i]; + if (udc->udp_irq[i] < 0) { + retval = udc->udp_irq[i]; + goto i2c_fail; + } } udc->udp_baseaddr = devm_platform_ioremap_resource(pdev, 0); if (IS_ERR(udc->udp_baseaddr)) { dev_err(udc->dev, "IO map failure\n"); - return PTR_ERR(udc->udp_baseaddr); + retval = PTR_ERR(udc->udp_baseaddr); + goto i2c_fail; } /* Get USB device clock */ udc->usb_slv_clk = devm_clk_get(&pdev->dev, NULL); if (IS_ERR(udc->usb_slv_clk)) { dev_err(udc->dev, "failed to acquire USB device clock\n"); - return PTR_ERR(udc->usb_slv_clk); + retval = PTR_ERR(udc->usb_slv_clk); + goto i2c_fail; } /* Enable USB device clock */ retval = clk_prepare_enable(udc->usb_slv_clk); if (retval < 0) { dev_err(udc->dev, "failed to start USB device clock\n"); - return retval; + goto i2c_fail; } /* Setup deferred workqueue data */ @@ -3161,6 +3165,8 @@ static int lpc32xx_udc_probe(struct platform_device *pdev) dma_free_coherent(&pdev->dev, UDCA_BUFF_SIZE, udc->udca_v_base, udc->udca_p_base); i2c_fail: + if (udc->isp1301_i2c_client) + put_device(&udc->isp1301_i2c_client->dev); clk_disable_unprepare(udc->usb_slv_clk); dev_err(udc->dev, "%s probe failed, %d\n", driver_name, retval); @@ -3189,6 +3195,9 @@ static void lpc32xx_udc_remove(struct platform_device *pdev) dma_free_coherent(&pdev->dev, UDCA_BUFF_SIZE, udc->udca_v_base, udc->udca_p_base); + if (udc->isp1301_i2c_client) + put_device(&udc->isp1301_i2c_client->dev); + clk_disable_unprepare(udc->usb_slv_clk); } From 45b667f53e94010ac79131d4b77ba395946a4ea6 Mon Sep 17 00:00:00 2001 From: "Mario Limonciello (AMD)" Date: Tue, 16 Dec 2025 06:22:02 -0600 Subject: [PATCH 0426/1321] usb: typec: ucsi: Fix null pointer dereference in ucsi_sync_control_common Add missing null check for cci parameter before dereferencing it in ucsi_sync_control_common(). The function can be called with cci=NULL from ucsi_acknowledge(), which leads to a null pointer dereference when accessing *cci in the condition check. The crash occurs because the code checks if cci is not null before calling ucsi->ops->read_cci(ucsi, cci), but then immediately dereferences cci without a null check in the following condition: (*cci & UCSI_CCI_COMMAND_COMPLETE). KASAN trace: KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] RIP: 0010:ucsi_sync_control_common+0x2ae/0x4e0 [typec_ucsi] Cc: stable Fixes: 667ecac55861 ("usb: typec: ucsi: return CCI and message from sync_control callback") Reviewed-by: Heikki Krogerus Signed-off-by: Mario Limonciello (AMD) Link: https://patch.msgid.link/20251216122210.5457-1-superm1@kernel.org Signed-off-by: Greg Kroah-Hartman --- drivers/usb/typec/ucsi/ucsi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c index 9b3df776137a1..7129973f19e7e 100644 --- a/drivers/usb/typec/ucsi/ucsi.c +++ b/drivers/usb/typec/ucsi/ucsi.c @@ -97,7 +97,7 @@ int ucsi_sync_control_common(struct ucsi *ucsi, u64 command, u32 *cci) if (!ret && cci) ret = ucsi->ops->read_cci(ucsi, cci); - if (!ret && ucsi->message_in_size > 0 && + if (!ret && cci && ucsi->message_in_size > 0 && (*cci & UCSI_CCI_COMMAND_COMPLETE)) ret = ucsi->ops->read_message_in(ucsi, ucsi->message_in, ucsi->message_in_size); From 68a7674441eae89d09b5cbda7b382bddbc3941e3 Mon Sep 17 00:00:00 2001 From: Miaoqian Lin Date: Thu, 11 Dec 2025 10:49:36 +0400 Subject: [PATCH 0427/1321] usb: dwc3: of-simple: fix clock resource leak in dwc3_of_simple_probe When clk_bulk_prepare_enable() fails, the error path jumps to err_resetc_assert, skipping clk_bulk_put_all() and leaking the clock references acquired by clk_bulk_get_all(). Add err_clk_put_all label to properly release clock resources in all error paths. Found via static analysis and code review. Fixes: c0c61471ef86 ("usb: dwc3: of-simple: Convert to bulk clk API") Cc: stable Signed-off-by: Miaoqian Lin Acked-by: Thinh Nguyen Link: https://patch.msgid.link/20251211064937.2360510-1-linmq006@gmail.com Signed-off-by: Greg Kroah-Hartman --- drivers/usb/dwc3/dwc3-of-simple.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/usb/dwc3/dwc3-of-simple.c b/drivers/usb/dwc3/dwc3-of-simple.c index a4954a21be930..c116143335d9f 100644 --- a/drivers/usb/dwc3/dwc3-of-simple.c +++ b/drivers/usb/dwc3/dwc3-of-simple.c @@ -70,11 +70,11 @@ static int dwc3_of_simple_probe(struct platform_device *pdev) simple->num_clocks = ret; ret = clk_bulk_prepare_enable(simple->num_clocks, simple->clks); if (ret) - goto err_resetc_assert; + goto err_clk_put_all; ret = of_platform_populate(np, NULL, NULL, dev); if (ret) - goto err_clk_put; + goto err_clk_disable; pm_runtime_set_active(dev); pm_runtime_enable(dev); @@ -82,8 +82,9 @@ static int dwc3_of_simple_probe(struct platform_device *pdev) return 0; -err_clk_put: +err_clk_disable: clk_bulk_disable_unprepare(simple->num_clocks, simple->clks); +err_clk_put_all: clk_bulk_put_all(simple->num_clocks, simple->clks); err_resetc_assert: From 0ff5ed063d5b32842249e4b0ce1ff7a0d28eafc8 Mon Sep 17 00:00:00 2001 From: Chen Changcheng Date: Thu, 18 Dec 2025 09:23:18 +0800 Subject: [PATCH 0428/1321] usb: usb-storage: Maintain minimal modifications to the bcdDevice range. We cannot determine which models require the NO_ATA_1X and IGNORE_RESIDUE quirks aside from the EL-R12 optical drive device. Fixes: 955a48a5353f ("usb: usb-storage: No additional quirks need to be added to the EL-R12 optical drive.") Signed-off-by: Chen Changcheng Link: https://patch.msgid.link/20251218012318.15978-1-chenchangcheng@kylinos.cn Signed-off-by: Greg Kroah-Hartman --- drivers/usb/storage/unusual_uas.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/usb/storage/unusual_uas.h b/drivers/usb/storage/unusual_uas.h index b695f5ba9a409..939a98c2d3f74 100644 --- a/drivers/usb/storage/unusual_uas.h +++ b/drivers/usb/storage/unusual_uas.h @@ -98,7 +98,7 @@ UNUSUAL_DEV(0x125f, 0xa94a, 0x0160, 0x0160, US_FL_NO_ATA_1X), /* Reported-by: Benjamin Tissoires */ -UNUSUAL_DEV(0x13fd, 0x3940, 0x0309, 0x0309, +UNUSUAL_DEV(0x13fd, 0x3940, 0x0000, 0x0309, "Initio Corporation", "INIC-3069", USB_SC_DEVICE, USB_PR_DEVICE, NULL, From 5832a6966215030eec5e802577c57fd28762d23f Mon Sep 17 00:00:00 2001 From: Hsin-Te Yuan Date: Thu, 18 Dec 2025 15:37:57 +0800 Subject: [PATCH 0429/1321] usb: typec: ucsi: Get connector status after enable notifications Originally, the notification for connector change will be enabled after the first read of the connector status. Therefore, if the event happens during this window, it will be missing and make the status unsynced. Get the connector status only after enabling the notification for connector change to ensure the status is synced. Fixes: c1b0bc2dabfa ("usb: typec: Add support for UCSI interface") Cc: stable Tested-by: Kenneth R. Crudup Reviewed-by: Heikki Krogerus Signed-off-by: Hsin-Te Yuan Link: https://patch.msgid.link/20251218-ucsi-v7-1-aea83e83fb12@chromium.org Signed-off-by: Greg Kroah-Hartman --- drivers/usb/typec/ucsi/ucsi.c | 133 +++++++++++++++++++--------------- 1 file changed, 74 insertions(+), 59 deletions(-) diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c index 7129973f19e7e..b153ed5bffb1d 100644 --- a/drivers/usb/typec/ucsi/ucsi.c +++ b/drivers/usb/typec/ucsi/ucsi.c @@ -1624,11 +1624,71 @@ static struct fwnode_handle *ucsi_find_fwnode(struct ucsi_connector *con) return NULL; } +static void ucsi_init_port(struct ucsi *ucsi, struct ucsi_connector *con) +{ + enum usb_role u_role = USB_ROLE_NONE; + int ret; + + /* Get the status */ + ret = ucsi_get_connector_status(con, false); + if (ret) { + dev_err(ucsi->dev, "con%d: failed to get status\n", con->num); + return; + } + + if (ucsi->ops->connector_status) + ucsi->ops->connector_status(con); + + switch (UCSI_CONSTAT(con, PARTNER_TYPE)) { + case UCSI_CONSTAT_PARTNER_TYPE_UFP: + case UCSI_CONSTAT_PARTNER_TYPE_CABLE_AND_UFP: + u_role = USB_ROLE_HOST; + fallthrough; + case UCSI_CONSTAT_PARTNER_TYPE_CABLE: + typec_set_data_role(con->port, TYPEC_HOST); + break; + case UCSI_CONSTAT_PARTNER_TYPE_DFP: + u_role = USB_ROLE_DEVICE; + typec_set_data_role(con->port, TYPEC_DEVICE); + break; + default: + break; + } + + /* Check if there is already something connected */ + if (UCSI_CONSTAT(con, CONNECTED)) { + typec_set_pwr_role(con->port, UCSI_CONSTAT(con, PWR_DIR)); + ucsi_register_partner(con); + ucsi_pwr_opmode_change(con); + ucsi_orientation(con); + ucsi_port_psy_changed(con); + if (con->ucsi->cap.features & UCSI_CAP_GET_PD_MESSAGE) + ucsi_get_partner_identity(con); + if (con->ucsi->cap.features & UCSI_CAP_CABLE_DETAILS) + ucsi_check_cable(con); + } + + /* Only notify USB controller if partner supports USB data */ + if (!(UCSI_CONSTAT(con, PARTNER_FLAG_USB))) + u_role = USB_ROLE_NONE; + + ret = usb_role_switch_set_role(con->usb_role_sw, u_role); + if (ret) + dev_err(ucsi->dev, "con:%d: failed to set usb role:%d\n", + con->num, u_role); + + if (con->partner && UCSI_CONSTAT(con, PWR_OPMODE) == UCSI_CONSTAT_PWR_OPMODE_PD) { + ucsi_register_device_pdos(con); + ucsi_get_src_pdos(con); + ucsi_check_altmodes(con); + ucsi_check_connector_capability(con); + } +} + static int ucsi_register_port(struct ucsi *ucsi, struct ucsi_connector *con) { struct typec_capability *cap = &con->typec_cap; enum typec_accessory *accessory = cap->accessory; - enum usb_role u_role = USB_ROLE_NONE; u64 command; char *name; int ret; @@ -1729,63 +1789,6 @@ static int ucsi_register_port(struct ucsi *ucsi, struct ucsi_connector *con) goto out; } - /* Get the status */ - ret = ucsi_get_connector_status(con, false); - if (ret) { - dev_err(ucsi->dev, "con%d: failed to get status\n", con->num); - goto out; - } - - if (ucsi->ops->connector_status) - ucsi->ops->connector_status(con); - - switch (UCSI_CONSTAT(con, PARTNER_TYPE)) { - case UCSI_CONSTAT_PARTNER_TYPE_UFP: - case UCSI_CONSTAT_PARTNER_TYPE_CABLE_AND_UFP: - u_role = USB_ROLE_HOST; - fallthrough; - case UCSI_CONSTAT_PARTNER_TYPE_CABLE: - typec_set_data_role(con->port, TYPEC_HOST); - break; - case UCSI_CONSTAT_PARTNER_TYPE_DFP: - u_role = USB_ROLE_DEVICE; - typec_set_data_role(con->port, TYPEC_DEVICE); - break; - default: - break; - } - - /* Check if there is already something connected */ - if (UCSI_CONSTAT(con, CONNECTED)) { - typec_set_pwr_role(con->port, UCSI_CONSTAT(con, PWR_DIR)); - ucsi_register_partner(con); - ucsi_pwr_opmode_change(con); - ucsi_orientation(con); - ucsi_port_psy_changed(con); - if (con->ucsi->cap.features & UCSI_CAP_GET_PD_MESSAGE) - ucsi_get_partner_identity(con); - if (con->ucsi->cap.features & UCSI_CAP_CABLE_DETAILS) - ucsi_check_cable(con); - } - - /* Only notify USB controller if partner supports USB data */ - if (!(UCSI_CONSTAT(con, PARTNER_FLAG_USB))) - u_role = USB_ROLE_NONE; - - ret = usb_role_switch_set_role(con->usb_role_sw, u_role); - if (ret) { - dev_err(ucsi->dev, "con:%d: failed to set usb role:%d\n", - con->num, u_role); - ret = 0; - } - - if (con->partner && UCSI_CONSTAT(con, PWR_OPMODE) == UCSI_CONSTAT_PWR_OPMODE_PD) { - ucsi_register_device_pdos(con); - ucsi_get_src_pdos(con); - ucsi_check_altmodes(con); - ucsi_check_connector_capability(con); - } - trace_ucsi_register_port(con->num, con); out: @@ -1903,17 +1906,29 @@ static int ucsi_init(struct ucsi *ucsi) goto err_unregister; } + /* Delay other interactions with each connector until ucsi_init_port is done */ + for (i = 0; i < ucsi->cap.num_connectors; i++) + mutex_lock(&connector[i].lock); + /* Enable all supported notifications */ ntfy = ucsi_get_supported_notifications(ucsi); command = UCSI_SET_NOTIFICATION_ENABLE | ntfy; ucsi->message_in_size = 0; ret = ucsi_send_command(ucsi, command); - if (ret < 0) + if (ret < 0) { + for (i = 0; i < ucsi->cap.num_connectors; i++) + mutex_unlock(&connector[i].lock); goto err_unregister; + } ucsi->connector = connector; ucsi->ntfy = ntfy; + for (i = 0; i < ucsi->cap.num_connectors; i++) { + ucsi_init_port(ucsi, &connector[i]); + mutex_unlock(&connector[i].lock); + } + mutex_lock(&ucsi->ppm_lock); ret = ucsi->ops->read_cci(ucsi, &cci); mutex_unlock(&ucsi->ppm_lock); From e8d4c00e8967b51675ffa753a4bc8b499a6261b8 Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Thu, 18 Dec 2025 16:35:15 +0100 Subject: [PATCH 0430/1321] usb: gadget: lpc32xx_udc: fix clock imbalance in error path A recent change fixing a device reference leak introduced a clock imbalance by reusing an error path so that the clock may be disabled before having been enabled. Note that the clock framework allows for passing in NULL clocks so there is no risk for a NULL pointer dereference. Also drop the bogus I2C client NULL check added by the offending commit as the pointer has already been verified to be non-NULL. Fixes: c84117912bdd ("USB: lpc32xx_udc: Fix error handling in probe") Cc: stable@vger.kernel.org Cc: Ma Ke Signed-off-by: Johan Hovold Reviewed-by: Vladimir Zapolskiy Link: https://patch.msgid.link/20251218153519.19453-2-johan@kernel.org Signed-off-by: Greg Kroah-Hartman --- drivers/usb/gadget/udc/lpc32xx_udc.c | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/drivers/usb/gadget/udc/lpc32xx_udc.c b/drivers/usb/gadget/udc/lpc32xx_udc.c index 73c0f28a85852..a962d4294fbec 100644 --- a/drivers/usb/gadget/udc/lpc32xx_udc.c +++ b/drivers/usb/gadget/udc/lpc32xx_udc.c @@ -3020,7 +3020,7 @@ static int lpc32xx_udc_probe(struct platform_device *pdev) pdev->dev.dma_mask = &lpc32xx_usbd_dmamask; retval = dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(32)); if (retval) - goto i2c_fail; + goto err_put_client; udc->board = &lpc32xx_usbddata; @@ -3040,7 +3040,7 @@ static int lpc32xx_udc_probe(struct platform_device *pdev) udc->udp_irq[i] = platform_get_irq(pdev, i); if (udc->udp_irq[i] < 0) { retval = udc->udp_irq[i]; - goto i2c_fail; + goto err_put_client; } } @@ -3048,7 +3048,7 @@ static int lpc32xx_udc_probe(struct platform_device *pdev) if (IS_ERR(udc->udp_baseaddr)) { dev_err(udc->dev, "IO map failure\n"); retval = PTR_ERR(udc->udp_baseaddr); - goto i2c_fail; + goto err_put_client; } /* Get USB device clock */ @@ -3056,14 +3056,14 @@ static int lpc32xx_udc_probe(struct platform_device *pdev) if (IS_ERR(udc->usb_slv_clk)) { dev_err(udc->dev, "failed to acquire USB device clock\n"); retval = PTR_ERR(udc->usb_slv_clk); - goto i2c_fail; + goto err_put_client; } /* Enable USB device clock */ retval = clk_prepare_enable(udc->usb_slv_clk); if (retval < 0) { dev_err(udc->dev, "failed to start USB device clock\n"); - goto i2c_fail; + goto err_put_client; } /* Setup deferred workqueue data */ @@ -3165,9 +3165,10 @@ static int lpc32xx_udc_probe(struct platform_device *pdev) dma_free_coherent(&pdev->dev, UDCA_BUFF_SIZE, udc->udca_v_base, udc->udca_p_base); i2c_fail: - if (udc->isp1301_i2c_client) - put_device(&udc->isp1301_i2c_client->dev); clk_disable_unprepare(udc->usb_slv_clk); +err_put_client: + put_device(&udc->isp1301_i2c_client->dev); + dev_err(udc->dev, "%s probe failed, %d\n", driver_name, retval); return retval; @@ -3195,10 +3196,9 @@ static void lpc32xx_udc_remove(struct platform_device *pdev) dma_free_coherent(&pdev->dev, UDCA_BUFF_SIZE, udc->udca_v_base, udc->udca_p_base); - if (udc->isp1301_i2c_client) - put_device(&udc->isp1301_i2c_client->dev); - clk_disable_unprepare(udc->usb_slv_clk); + + put_device(&udc->isp1301_i2c_client->dev); } #ifdef CONFIG_PM From 86c8da760f3c1dee053b34d2d8843153c9725a81 Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Thu, 18 Dec 2025 16:35:16 +0100 Subject: [PATCH 0431/1321] usb: phy: isp1301: fix non-OF device reference imbalance A recent change fixing a device reference leak in a UDC driver introduced a potential use-after-free in the non-OF case as the isp1301_get_client() helper only increases the reference count for the returned I2C device in the OF case. Increment the reference count also for non-OF so that the caller can decrement it unconditionally. Note that this is inherently racy just as using the returned I2C device is since nothing is preventing the PHY driver from being unbound while in use. Fixes: c84117912bdd ("USB: lpc32xx_udc: Fix error handling in probe") Cc: stable@vger.kernel.org Cc: Ma Ke Signed-off-by: Johan Hovold Reviewed-by: Vladimir Zapolskiy Link: https://patch.msgid.link/20251218153519.19453-3-johan@kernel.org Signed-off-by: Greg Kroah-Hartman --- drivers/usb/phy/phy-isp1301.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/usb/phy/phy-isp1301.c b/drivers/usb/phy/phy-isp1301.c index f9b5c411aee4e..2940f0c84e1b7 100644 --- a/drivers/usb/phy/phy-isp1301.c +++ b/drivers/usb/phy/phy-isp1301.c @@ -149,7 +149,12 @@ struct i2c_client *isp1301_get_client(struct device_node *node) return client; /* non-DT: only one ISP1301 chip supported */ - return isp1301_i2c_client; + if (isp1301_i2c_client) { + get_device(&isp1301_i2c_client->dev); + return isp1301_i2c_client; + } + + return NULL; } EXPORT_SYMBOL_GPL(isp1301_get_client); From 0d3b264dcceb513fdecc834ba13d9c982704045c Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Thu, 18 Dec 2025 16:35:17 +0100 Subject: [PATCH 0432/1321] usb: ohci-nxp: fix device leak on probe failure Make sure to drop the reference taken when looking up the PHY I2C device during probe on probe failure (e.g. probe deferral) and on driver unbind. Fixes: 73108aa90cbf ("USB: ohci-nxp: Use isp1301 driver") Cc: stable@vger.kernel.org # 3.5 Reported-by: Ma Ke Link: https://lore.kernel.org/lkml/20251117013428.21840-1-make24@iscas.ac.cn/ Signed-off-by: Johan Hovold Acked-by: Alan Stern Reviewed-by: Vladimir Zapolskiy Link: https://patch.msgid.link/20251218153519.19453-4-johan@kernel.org Signed-off-by: Greg Kroah-Hartman --- drivers/usb/host/ohci-nxp.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/usb/host/ohci-nxp.c b/drivers/usb/host/ohci-nxp.c index 24d5a1dc50560..509ca7d8d5138 100644 --- a/drivers/usb/host/ohci-nxp.c +++ b/drivers/usb/host/ohci-nxp.c @@ -223,6 +223,7 @@ static int ohci_hcd_nxp_probe(struct platform_device *pdev) fail_resource: usb_put_hcd(hcd); fail_disable: + put_device(&isp1301_i2c_client->dev); isp1301_i2c_client = NULL; return ret; } @@ -234,6 +235,7 @@ static void ohci_hcd_nxp_remove(struct platform_device *pdev) usb_remove_hcd(hcd); ohci_nxp_stop_hc(); usb_put_hcd(hcd); + put_device(&isp1301_i2c_client->dev); isp1301_i2c_client = NULL; } From 71e74cb4d8690dd6f76e39e3e4cd35d654378a74 Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Thu, 18 Dec 2025 16:35:18 +0100 Subject: [PATCH 0433/1321] usb: gadget: lpc32xx_udc: clean up probe error labels Error labels should be named after what they do rather than after from where they are jumped to. Rename the probe error labels for consistency and to improve readability. Signed-off-by: Johan Hovold Reviewed-by: Vladimir Zapolskiy Link: https://patch.msgid.link/20251218153519.19453-5-johan@kernel.org Signed-off-by: Greg Kroah-Hartman --- drivers/usb/gadget/udc/lpc32xx_udc.c | 21 ++++++++++----------- 1 file changed, 10 insertions(+), 11 deletions(-) diff --git a/drivers/usb/gadget/udc/lpc32xx_udc.c b/drivers/usb/gadget/udc/lpc32xx_udc.c index a962d4294fbec..83c7e243dcf9d 100644 --- a/drivers/usb/gadget/udc/lpc32xx_udc.c +++ b/drivers/usb/gadget/udc/lpc32xx_udc.c @@ -3084,7 +3084,7 @@ static int lpc32xx_udc_probe(struct platform_device *pdev) if (!udc->udca_v_base) { dev_err(udc->dev, "error getting UDCA region\n"); retval = -ENOMEM; - goto i2c_fail; + goto err_disable_clk; } udc->udca_p_base = dma_handle; dev_dbg(udc->dev, "DMA buffer(0x%x bytes), P:0x%08x, V:0x%p\n", @@ -3097,7 +3097,7 @@ static int lpc32xx_udc_probe(struct platform_device *pdev) if (!udc->dd_cache) { dev_err(udc->dev, "error getting DD DMA region\n"); retval = -ENOMEM; - goto dma_alloc_fail; + goto err_free_dma; } /* Clear USB peripheral and initialize gadget endpoints */ @@ -3111,14 +3111,14 @@ static int lpc32xx_udc_probe(struct platform_device *pdev) if (retval < 0) { dev_err(udc->dev, "LP request irq %d failed\n", udc->udp_irq[IRQ_USB_LP]); - goto irq_req_fail; + goto err_destroy_pool; } retval = devm_request_irq(dev, udc->udp_irq[IRQ_USB_HP], lpc32xx_usb_hp_irq, 0, "udc_hp", udc); if (retval < 0) { dev_err(udc->dev, "HP request irq %d failed\n", udc->udp_irq[IRQ_USB_HP]); - goto irq_req_fail; + goto err_destroy_pool; } retval = devm_request_irq(dev, udc->udp_irq[IRQ_USB_DEVDMA], @@ -3126,7 +3126,7 @@ static int lpc32xx_udc_probe(struct platform_device *pdev) if (retval < 0) { dev_err(udc->dev, "DEV request irq %d failed\n", udc->udp_irq[IRQ_USB_DEVDMA]); - goto irq_req_fail; + goto err_destroy_pool; } /* The transceiver interrupt is used for VBUS detection and will @@ -3137,7 +3137,7 @@ static int lpc32xx_udc_probe(struct platform_device *pdev) if (retval < 0) { dev_err(udc->dev, "VBUS request irq %d failed\n", udc->udp_irq[IRQ_USB_ATX]); - goto irq_req_fail; + goto err_destroy_pool; } /* Initialize wait queue */ @@ -3146,7 +3146,7 @@ static int lpc32xx_udc_probe(struct platform_device *pdev) retval = usb_add_gadget_udc(dev, &udc->gadget); if (retval < 0) - goto add_gadget_fail; + goto err_destroy_pool; dev_set_drvdata(dev, udc); device_init_wakeup(dev, 1); @@ -3158,13 +3158,12 @@ static int lpc32xx_udc_probe(struct platform_device *pdev) dev_info(udc->dev, "%s version %s\n", driver_name, DRIVER_VERSION); return 0; -add_gadget_fail: -irq_req_fail: +err_destroy_pool: dma_pool_destroy(udc->dd_cache); -dma_alloc_fail: +err_free_dma: dma_free_coherent(&pdev->dev, UDCA_BUFF_SIZE, udc->udca_v_base, udc->udca_p_base); -i2c_fail: +err_disable_clk: clk_disable_unprepare(udc->usb_slv_clk); err_put_client: put_device(&udc->isp1301_i2c_client->dev); From 877216e17287374cabf197bde68826aa4373bcea Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Thu, 18 Dec 2025 16:35:19 +0100 Subject: [PATCH 0434/1321] usb: ohci-nxp: clean up probe error labels Error labels should be named after what they do rather than after from where they are jumped to. Rename the probe error labels for consistency and to improve readability. Signed-off-by: Johan Hovold Reviewed-by: Vladimir Zapolskiy Link: https://patch.msgid.link/20251218153519.19453-6-johan@kernel.org Signed-off-by: Greg Kroah-Hartman --- drivers/usb/host/ohci-nxp.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/drivers/usb/host/ohci-nxp.c b/drivers/usb/host/ohci-nxp.c index 509ca7d8d5138..7663f2aa35e92 100644 --- a/drivers/usb/host/ohci-nxp.c +++ b/drivers/usb/host/ohci-nxp.c @@ -169,13 +169,13 @@ static int ohci_hcd_nxp_probe(struct platform_device *pdev) ret = dma_coerce_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32)); if (ret) - goto fail_disable; + goto err_put_client; dev_dbg(&pdev->dev, "%s: " DRIVER_DESC " (nxp)\n", hcd_name); if (usb_disabled()) { dev_err(&pdev->dev, "USB is disabled\n"); ret = -ENODEV; - goto fail_disable; + goto err_put_client; } /* Enable USB host clock */ @@ -183,7 +183,7 @@ static int ohci_hcd_nxp_probe(struct platform_device *pdev) if (IS_ERR(usb_host_clk)) { dev_err(&pdev->dev, "failed to acquire and start USB OHCI clock\n"); ret = PTR_ERR(usb_host_clk); - goto fail_disable; + goto err_put_client; } isp1301_configure(); @@ -192,13 +192,13 @@ static int ohci_hcd_nxp_probe(struct platform_device *pdev) if (!hcd) { dev_err(&pdev->dev, "Failed to allocate HC buffer\n"); ret = -ENOMEM; - goto fail_disable; + goto err_put_client; } hcd->regs = devm_platform_get_and_ioremap_resource(pdev, 0, &res); if (IS_ERR(hcd->regs)) { ret = PTR_ERR(hcd->regs); - goto fail_resource; + goto err_put_hcd; } hcd->rsrc_start = res->start; hcd->rsrc_len = resource_size(res); @@ -206,7 +206,7 @@ static int ohci_hcd_nxp_probe(struct platform_device *pdev) irq = platform_get_irq(pdev, 0); if (irq < 0) { ret = -ENXIO; - goto fail_resource; + goto err_put_hcd; } ohci_nxp_start_hc(); @@ -220,9 +220,9 @@ static int ohci_hcd_nxp_probe(struct platform_device *pdev) } ohci_nxp_stop_hc(); -fail_resource: +err_put_hcd: usb_put_hcd(hcd); -fail_disable: +err_put_client: put_device(&isp1301_i2c_client->dev); isp1301_i2c_client = NULL; return ret; From 703fccabf69075bc771caa53d1999cad3dc36fbc Mon Sep 17 00:00:00 2001 From: Greg Kroah-Hartman Date: Tue, 23 Dec 2025 15:54:06 +0100 Subject: [PATCH 0435/1321] Revert "usb: typec: ucsi: Get connector status after enable notifications" This reverts commit 5106dbab44fba8ec6dede3f4e75d17f5aa777ec8. There are reported issues in this file, so revert the commit for now so that the original offending changes can be reverted and working systems can be restored. This can come back at a later time if it is rebased yet-again (sorry.) Cc: stable Cc: Johan Hovold Link: https://lore.kernel.org/r/20251222152204.2846-1-johan@kernel.org Fixes: 5106dbab44fb ("usb: typec: ucsi: Get connector status after enable notifications") Cc: Kenneth R. Crudup Cc: Heikki Krogerus Cc: Hsin-Te Yuan Signed-off-by: Greg Kroah-Hartman --- drivers/usb/typec/ucsi/ucsi.c | 133 +++++++++++++++------------------- 1 file changed, 59 insertions(+), 74 deletions(-) diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c index b153ed5bffb1d..7129973f19e7e 100644 --- a/drivers/usb/typec/ucsi/ucsi.c +++ b/drivers/usb/typec/ucsi/ucsi.c @@ -1624,71 +1624,11 @@ static struct fwnode_handle *ucsi_find_fwnode(struct ucsi_connector *con) return NULL; } -static void ucsi_init_port(struct ucsi *ucsi, struct ucsi_connector *con) -{ - enum usb_role u_role = USB_ROLE_NONE; - int ret; - - /* Get the status */ - ret = ucsi_get_connector_status(con, false); - if (ret) { - dev_err(ucsi->dev, "con%d: failed to get status\n", con->num); - return; - } - - if (ucsi->ops->connector_status) - ucsi->ops->connector_status(con); - - switch (UCSI_CONSTAT(con, PARTNER_TYPE)) { - case UCSI_CONSTAT_PARTNER_TYPE_UFP: - case UCSI_CONSTAT_PARTNER_TYPE_CABLE_AND_UFP: - u_role = USB_ROLE_HOST; - fallthrough; - case UCSI_CONSTAT_PARTNER_TYPE_CABLE: - typec_set_data_role(con->port, TYPEC_HOST); - break; - case UCSI_CONSTAT_PARTNER_TYPE_DFP: - u_role = USB_ROLE_DEVICE; - typec_set_data_role(con->port, TYPEC_DEVICE); - break; - default: - break; - } - - /* Check if there is already something connected */ - if (UCSI_CONSTAT(con, CONNECTED)) { - typec_set_pwr_role(con->port, UCSI_CONSTAT(con, PWR_DIR)); - ucsi_register_partner(con); - ucsi_pwr_opmode_change(con); - ucsi_orientation(con); - ucsi_port_psy_changed(con); - if (con->ucsi->cap.features & UCSI_CAP_GET_PD_MESSAGE) - ucsi_get_partner_identity(con); - if (con->ucsi->cap.features & UCSI_CAP_CABLE_DETAILS) - ucsi_check_cable(con); - } - - /* Only notify USB controller if partner supports USB data */ - if (!(UCSI_CONSTAT(con, PARTNER_FLAG_USB))) - u_role = USB_ROLE_NONE; - - ret = usb_role_switch_set_role(con->usb_role_sw, u_role); - if (ret) - dev_err(ucsi->dev, "con:%d: failed to set usb role:%d\n", - con->num, u_role); - - if (con->partner && UCSI_CONSTAT(con, PWR_OPMODE) == UCSI_CONSTAT_PWR_OPMODE_PD) { - ucsi_register_device_pdos(con); - ucsi_get_src_pdos(con); - ucsi_check_altmodes(con); - ucsi_check_connector_capability(con); - } -} - static int ucsi_register_port(struct ucsi *ucsi, struct ucsi_connector *con) { struct typec_capability *cap = &con->typec_cap; enum typec_accessory *accessory = cap->accessory; + enum usb_role u_role = USB_ROLE_NONE; u64 command; char *name; int ret; @@ -1789,6 +1729,63 @@ static int ucsi_register_port(struct ucsi *ucsi, struct ucsi_connector *con) goto out; } + /* Get the status */ + ret = ucsi_get_connector_status(con, false); + if (ret) { + dev_err(ucsi->dev, "con%d: failed to get status\n", con->num); + goto out; + } + + if (ucsi->ops->connector_status) + ucsi->ops->connector_status(con); + + switch (UCSI_CONSTAT(con, PARTNER_TYPE)) { + case UCSI_CONSTAT_PARTNER_TYPE_UFP: + case UCSI_CONSTAT_PARTNER_TYPE_CABLE_AND_UFP: + u_role = USB_ROLE_HOST; + fallthrough; + case UCSI_CONSTAT_PARTNER_TYPE_CABLE: + typec_set_data_role(con->port, TYPEC_HOST); + break; + case UCSI_CONSTAT_PARTNER_TYPE_DFP: + u_role = USB_ROLE_DEVICE; + typec_set_data_role(con->port, TYPEC_DEVICE); + break; + default: + break; + } + + /* Check if there is already something connected */ + if (UCSI_CONSTAT(con, CONNECTED)) { + typec_set_pwr_role(con->port, UCSI_CONSTAT(con, PWR_DIR)); + ucsi_register_partner(con); + ucsi_pwr_opmode_change(con); + ucsi_orientation(con); + ucsi_port_psy_changed(con); + if (con->ucsi->cap.features & UCSI_CAP_GET_PD_MESSAGE) + ucsi_get_partner_identity(con); + if (con->ucsi->cap.features & UCSI_CAP_CABLE_DETAILS) + ucsi_check_cable(con); + } + + /* Only notify USB controller if partner supports USB data */ + if (!(UCSI_CONSTAT(con, PARTNER_FLAG_USB))) + u_role = USB_ROLE_NONE; + + ret = usb_role_switch_set_role(con->usb_role_sw, u_role); + if (ret) { + dev_err(ucsi->dev, "con:%d: failed to set usb role:%d\n", + con->num, u_role); + ret = 0; + } + + if (con->partner && UCSI_CONSTAT(con, PWR_OPMODE) == UCSI_CONSTAT_PWR_OPMODE_PD) { + ucsi_register_device_pdos(con); + ucsi_get_src_pdos(con); + ucsi_check_altmodes(con); + ucsi_check_connector_capability(con); + } + trace_ucsi_register_port(con->num, con); out: @@ -1906,29 +1903,17 @@ static int ucsi_init(struct ucsi *ucsi) goto err_unregister; } - /* Delay other interactions with each connector until ucsi_init_port is done */ - for (i = 0; i < ucsi->cap.num_connectors; i++) - mutex_lock(&connector[i].lock); - /* Enable all supported notifications */ ntfy = ucsi_get_supported_notifications(ucsi); command = UCSI_SET_NOTIFICATION_ENABLE | ntfy; ucsi->message_in_size = 0; ret = ucsi_send_command(ucsi, command); - if (ret < 0) { - for (i = 0; i < ucsi->cap.num_connectors; i++) - mutex_unlock(&connector[i].lock); + if (ret < 0) goto err_unregister; - } ucsi->connector = connector; ucsi->ntfy = ntfy; - for (i = 0; i < ucsi->cap.num_connectors; i++) { - ucsi_init_port(ucsi, &connector[i]); - mutex_unlock(&connector[i].lock); - } - mutex_lock(&ucsi->ppm_lock); ret = ucsi->ops->read_cci(ucsi, &cci); mutex_unlock(&ucsi->ppm_lock); From 007c3279c8c951dffd2a1219ff9d002832839f4a Mon Sep 17 00:00:00 2001 From: Greg Kroah-Hartman Date: Tue, 23 Dec 2025 15:57:16 +0100 Subject: [PATCH 0436/1321] Revert "usb: typec: ucsi: Fix null pointer dereference in ucsi_sync_control_common" This reverts commit 14ad4c10d5bdd413ff9a914260e89b5f54b7a2c7. The originally offending commit will be reverted instead of this fix up at this point in time, so revert this fix. Cc: Heikki Krogerus Cc: Mario Limonciello (AMD) Cc: stable Cc: Johan Hovold Fixes: 14ad4c10d5bd ("usb: typec: ucsi: Fix null pointer dereference in ucsi_sync_control_common") Link: https://lore.kernel.org/r/20251222152204.2846-1-johan@kernel.org Signed-off-by: Greg Kroah-Hartman --- drivers/usb/typec/ucsi/ucsi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c index 7129973f19e7e..9b3df776137a1 100644 --- a/drivers/usb/typec/ucsi/ucsi.c +++ b/drivers/usb/typec/ucsi/ucsi.c @@ -97,7 +97,7 @@ int ucsi_sync_control_common(struct ucsi *ucsi, u64 command, u32 *cci) if (!ret && cci) ret = ucsi->ops->read_cci(ucsi, cci); - if (!ret && cci && ucsi->message_in_size > 0 && + if (!ret && ucsi->message_in_size > 0 && (*cci & UCSI_CCI_COMMAND_COMPLETE)) ret = ucsi->ops->read_message_in(ucsi, ucsi->message_in, ucsi->message_in_size); From 6c09d8a1c424fa314c547ec33252880d583276b1 Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Mon, 22 Dec 2025 16:22:01 +0100 Subject: [PATCH 0437/1321] Revert "usb: typec: ucsi: Add support for SET_PDOS command" This reverts commit 1b474ee01fbb73b1365adbf9b3067f7375e471ee. The new buffer management code that this feature relies on is broken so revert for now. The interface for writing data and support for UCSI_SET_PDOS looks like it could use some more thought as well. Signed-off-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman Link: https://patch.msgid.link/20251222152204.2846-2-johan@kernel.org --- drivers/usb/typec/ucsi/debugfs.c | 1 - drivers/usb/typec/ucsi/ucsi.h | 1 - 2 files changed, 2 deletions(-) diff --git a/drivers/usb/typec/ucsi/debugfs.c b/drivers/usb/typec/ucsi/debugfs.c index 174f4d53b7771..90d11b79d2c07 100644 --- a/drivers/usb/typec/ucsi/debugfs.c +++ b/drivers/usb/typec/ucsi/debugfs.c @@ -37,7 +37,6 @@ static int ucsi_cmd(void *data, u64 val) case UCSI_SET_USB: case UCSI_SET_POWER_LEVEL: case UCSI_READ_POWER_LEVEL: - case UCSI_SET_PDOS: ucsi->message_in_size = 0; ret = ucsi_send_command(ucsi, val); break; diff --git a/drivers/usb/typec/ucsi/ucsi.h b/drivers/usb/typec/ucsi/ucsi.h index f946b728c373d..d01b796a8d23a 100644 --- a/drivers/usb/typec/ucsi/ucsi.h +++ b/drivers/usb/typec/ucsi/ucsi.h @@ -137,7 +137,6 @@ void ucsi_connector_change(struct ucsi *ucsi, u8 num); #define UCSI_GET_PD_MESSAGE 0x15 #define UCSI_GET_CAM_CS 0x18 #define UCSI_SET_SINK_PATH 0x1c -#define UCSI_SET_PDOS 0x1d #define UCSI_READ_POWER_LEVEL 0x1e #define UCSI_SET_USB 0x21 #define UCSI_GET_LPM_PPM_INFO 0x22 From a2f7f36ce29455e61a45156a0f24eb53f8648069 Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Mon, 22 Dec 2025 16:22:02 +0100 Subject: [PATCH 0438/1321] Revert "usb: typec: ucsi: Enable debugfs for message_out data structure" This reverts commit 775fae520e6ae62c393a8daf42dc534f09692f3f. The new buffer management code that this relies on is broken so revert for now. It also looks like the error handling needs some more thought as the message out size is not reset on errors. Signed-off-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman Link: https://patch.msgid.link/20251222152204.2846-3-johan@kernel.org --- drivers/usb/typec/ucsi/debugfs.c | 26 -------------------------- 1 file changed, 26 deletions(-) diff --git a/drivers/usb/typec/ucsi/debugfs.c b/drivers/usb/typec/ucsi/debugfs.c index 90d11b79d2c07..924f930275534 100644 --- a/drivers/usb/typec/ucsi/debugfs.c +++ b/drivers/usb/typec/ucsi/debugfs.c @@ -110,30 +110,6 @@ static int ucsi_vbus_volt_show(struct seq_file *m, void *v) } DEFINE_SHOW_ATTRIBUTE(ucsi_vbus_volt); -static ssize_t ucsi_message_out_write(struct file *file, - const char __user *data, size_t count, loff_t *ppos) -{ - struct ucsi *ucsi = file->private_data; - int ret; - - char *buf __free(kfree) = memdup_user_nul(data, count); - if (IS_ERR(buf)) - return PTR_ERR(buf); - - ucsi->message_out_size = min(count / 2, UCSI_MAX_MESSAGE_OUT_LENGTH); - ret = hex2bin(ucsi->message_out, buf, ucsi->message_out_size); - if (ret) - return ret; - - return count; -} - -static const struct file_operations ucsi_message_out_fops = { - .open = simple_open, - .write = ucsi_message_out_write, - .llseek = generic_file_llseek, -}; - void ucsi_debugfs_register(struct ucsi *ucsi) { ucsi->debugfs = kzalloc(sizeof(*ucsi->debugfs), GFP_KERNEL); @@ -146,8 +122,6 @@ void ucsi_debugfs_register(struct ucsi *ucsi) debugfs_create_file("peak_current", 0400, ucsi->debugfs->dentry, ucsi, &ucsi_peak_curr_fops); debugfs_create_file("avg_current", 0400, ucsi->debugfs->dentry, ucsi, &ucsi_avg_curr_fops); debugfs_create_file("vbus_voltage", 0400, ucsi->debugfs->dentry, ucsi, &ucsi_vbus_volt_fops); - debugfs_create_file("message_out", 0200, ucsi->debugfs->dentry, ucsi, - &ucsi_message_out_fops); } void ucsi_debugfs_unregister(struct ucsi *ucsi) From 885fc1761e2c65e287344a7ebe6d36311124cd7d Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Mon, 22 Dec 2025 16:22:03 +0100 Subject: [PATCH 0439/1321] Revert "usb: typec: ucsi: Add support for message out data structure" This reverts commit db0028637cc832add6d87564fcc2ebb12781b046. The new buffer management code that this feature relies on is broken so revert for now. As for the in buffer, nothing prevents the out message size and buffer from being modified while the message is being processed due to lack of serialisation. Signed-off-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman Link: https://patch.msgid.link/20251222152204.2846-4-johan@kernel.org --- drivers/usb/typec/ucsi/ucsi.c | 14 -------------- drivers/usb/typec/ucsi/ucsi.h | 2 -- drivers/usb/typec/ucsi/ucsi_acpi.c | 16 ---------------- 3 files changed, 32 deletions(-) diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c index 9b3df776137a1..8195407131501 100644 --- a/drivers/usb/typec/ucsi/ucsi.c +++ b/drivers/usb/typec/ucsi/ucsi.c @@ -67,20 +67,6 @@ int ucsi_sync_control_common(struct ucsi *ucsi, u64 command, u32 *cci) reinit_completion(&ucsi->complete); - if (ucsi->message_out_size > 0) { - if (!ucsi->ops->write_message_out) { - ucsi->message_out_size = 0; - ret = -EOPNOTSUPP; - goto out_clear_bit; - } - - ret = ucsi->ops->write_message_out(ucsi, ucsi->message_out, - ucsi->message_out_size); - ucsi->message_out_size = 0; - if (ret) - goto out_clear_bit; - } - ret = ucsi->ops->async_control(ucsi, command); if (ret) goto out_clear_bit; diff --git a/drivers/usb/typec/ucsi/ucsi.h b/drivers/usb/typec/ucsi/ucsi.h index d01b796a8d23a..479bf1f69c72b 100644 --- a/drivers/usb/typec/ucsi/ucsi.h +++ b/drivers/usb/typec/ucsi/ucsi.h @@ -69,7 +69,6 @@ struct dentry; * @read_cci: Read CCI register * @poll_cci: Read CCI register while polling with notifications disabled * @read_message_in: Read message data from UCSI - * @write_message_out: Write message data to UCSI * @sync_control: Blocking control operation * @async_control: Non-blocking control operation * @update_altmodes: Squashes duplicate DP altmodes @@ -85,7 +84,6 @@ struct ucsi_operations { int (*read_cci)(struct ucsi *ucsi, u32 *cci); int (*poll_cci)(struct ucsi *ucsi, u32 *cci); int (*read_message_in)(struct ucsi *ucsi, void *val, size_t val_len); - int (*write_message_out)(struct ucsi *ucsi, void *data, size_t data_len); int (*sync_control)(struct ucsi *ucsi, u64 command, u32 *cci); int (*async_control)(struct ucsi *ucsi, u64 command); bool (*update_altmodes)(struct ucsi *ucsi, u8 recipient, diff --git a/drivers/usb/typec/ucsi/ucsi_acpi.c b/drivers/usb/typec/ucsi/ucsi_acpi.c index f9beeb8352382..f1d1f6917b098 100644 --- a/drivers/usb/typec/ucsi/ucsi_acpi.c +++ b/drivers/usb/typec/ucsi/ucsi_acpi.c @@ -86,21 +86,6 @@ static int ucsi_acpi_read_message_in(struct ucsi *ucsi, void *val, size_t val_le return 0; } -static int ucsi_acpi_write_message_out(struct ucsi *ucsi, void *data, size_t data_len) -{ - struct ucsi_acpi *ua = ucsi_get_drvdata(ucsi); - - if (!data || !data_len) - return -EINVAL; - - if (ucsi->version <= UCSI_VERSION_1_2) - memcpy(ua->base + UCSI_MESSAGE_OUT, data, data_len); - else - memcpy(ua->base + UCSIv2_MESSAGE_OUT, data, data_len); - - return 0; -} - static int ucsi_acpi_async_control(struct ucsi *ucsi, u64 command) { struct ucsi_acpi *ua = ucsi_get_drvdata(ucsi); @@ -116,7 +101,6 @@ static const struct ucsi_operations ucsi_acpi_ops = { .read_cci = ucsi_acpi_read_cci, .poll_cci = ucsi_acpi_poll_cci, .read_message_in = ucsi_acpi_read_message_in, - .write_message_out = ucsi_acpi_write_message_out, .sync_control = ucsi_sync_control_common, .async_control = ucsi_acpi_async_control }; From 2dd554b9f5d12fb8adbad42d2a93779c7cd3df06 Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Mon, 22 Dec 2025 16:22:04 +0100 Subject: [PATCH 0440/1321] Revert "usb: typec: ucsi: Update UCSI structure to have message in and message out fields" This reverts commit 3e082978c33151d576694deac8abde021ea669a8. The new buffer management code has not been tested or reviewed properly and breaks boot of machines like the Lenovo ThinkPad X13s: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 CPU: 0 UID: 0 PID: 813 Comm: kworker/0:3 Not tainted 6.19.0-rc2 #26 PREEMPT Hardware name: LENOVO 21BYZ9SRUS/21BYZ9SRUS, BIOS N3HET87W (1.59 ) 12/05/2023 Workqueue: events ucsi_handle_connector_change [typec_ucsi] Call trace: ucsi_sync_control_common+0xe4/0x1ec [typec_ucsi] (P) ucsi_run_command+0xcc/0x194 [typec_ucsi] ucsi_send_command_common+0x84/0x2a0 [typec_ucsi] ucsi_get_connector_status+0x48/0x78 [typec_ucsi] ucsi_handle_connector_change+0x5c/0x4f4 [typec_ucsi] process_one_work+0x208/0x60c worker_thread+0x244/0x388 The new code completely ignores concurrency so that the message length can be updated while a transaction is ongoing. In the above case, the length ends up being modified by another thread while processing an ack so that the NULL cci pointer is dereferenced. Fixing this will require designing a proper interface for managing these transactions, something which most likely involves reverting most of the offending commit anyway. Revert the broken code to fix the regression and let Intel come up with a properly tested implementation for a later kernel. Fixes: 3e082978c331 ("usb: typec: ucsi: Update UCSI structure to have message in and message out fields") Signed-off-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman Link: https://patch.msgid.link/20251222152204.2846-5-johan@kernel.org --- drivers/usb/typec/ucsi/cros_ec_ucsi.c | 5 +- drivers/usb/typec/ucsi/debugfs.c | 9 +- drivers/usb/typec/ucsi/displayport.c | 11 +-- drivers/usb/typec/ucsi/ucsi.c | 104 ++++++++---------------- drivers/usb/typec/ucsi/ucsi.h | 19 ++--- drivers/usb/typec/ucsi/ucsi_acpi.c | 9 +- drivers/usb/typec/ucsi/ucsi_ccg.c | 11 +-- drivers/usb/typec/ucsi/ucsi_yoga_c630.c | 15 ++-- 8 files changed, 71 insertions(+), 112 deletions(-) diff --git a/drivers/usb/typec/ucsi/cros_ec_ucsi.c b/drivers/usb/typec/ucsi/cros_ec_ucsi.c index d753f2188e257..eed2a7d0ebc63 100644 --- a/drivers/usb/typec/ucsi/cros_ec_ucsi.c +++ b/drivers/usb/typec/ucsi/cros_ec_ucsi.c @@ -105,12 +105,13 @@ static int cros_ucsi_async_control(struct ucsi *ucsi, u64 cmd) return 0; } -static int cros_ucsi_sync_control(struct ucsi *ucsi, u64 cmd, u32 *cci) +static int cros_ucsi_sync_control(struct ucsi *ucsi, u64 cmd, u32 *cci, + void *data, size_t size) { struct cros_ucsi_data *udata = ucsi_get_drvdata(ucsi); int ret; - ret = ucsi_sync_control_common(ucsi, cmd, cci); + ret = ucsi_sync_control_common(ucsi, cmd, cci, data, size); switch (ret) { case -EBUSY: /* EC may return -EBUSY if CCI.busy is set. diff --git a/drivers/usb/typec/ucsi/debugfs.c b/drivers/usb/typec/ucsi/debugfs.c index 924f930275534..f3684ab787fe6 100644 --- a/drivers/usb/typec/ucsi/debugfs.c +++ b/drivers/usb/typec/ucsi/debugfs.c @@ -37,8 +37,7 @@ static int ucsi_cmd(void *data, u64 val) case UCSI_SET_USB: case UCSI_SET_POWER_LEVEL: case UCSI_READ_POWER_LEVEL: - ucsi->message_in_size = 0; - ret = ucsi_send_command(ucsi, val); + ret = ucsi_send_command(ucsi, val, NULL, 0); break; case UCSI_GET_CAPABILITY: case UCSI_GET_CONNECTOR_CAPABILITY: @@ -53,9 +52,9 @@ static int ucsi_cmd(void *data, u64 val) case UCSI_GET_ATTENTION_VDO: case UCSI_GET_CAM_CS: case UCSI_GET_LPM_PPM_INFO: - ucsi->message_in_size = sizeof(ucsi->debugfs->response); - ret = ucsi_send_command(ucsi, val); - memcpy(&ucsi->debugfs->response, ucsi->message_in, sizeof(ucsi->debugfs->response)); + ret = ucsi_send_command(ucsi, val, + &ucsi->debugfs->response, + sizeof(ucsi->debugfs->response)); break; default: ret = -EOPNOTSUPP; diff --git a/drivers/usb/typec/ucsi/displayport.c b/drivers/usb/typec/ucsi/displayport.c index a09b4900ec764..8aae80b457d74 100644 --- a/drivers/usb/typec/ucsi/displayport.c +++ b/drivers/usb/typec/ucsi/displayport.c @@ -67,14 +67,11 @@ static int ucsi_displayport_enter(struct typec_altmode *alt, u32 *vdo) } command = UCSI_GET_CURRENT_CAM | UCSI_CONNECTOR_NUMBER(dp->con->num); - ucsi->message_in_size = sizeof(cur); - ret = ucsi_send_command(ucsi, command); + ret = ucsi_send_command(ucsi, command, &cur, sizeof(cur)); if (ret < 0) { if (ucsi->version > 0x0100) goto err_unlock; cur = 0xff; - } else { - memcpy(&cur, ucsi->message_in, ucsi->message_in_size); } if (cur != 0xff) { @@ -129,8 +126,7 @@ static int ucsi_displayport_exit(struct typec_altmode *alt) } command = UCSI_CMD_SET_NEW_CAM(dp->con->num, 0, dp->offset, 0); - dp->con->ucsi->message_in_size = 0; - ret = ucsi_send_command(dp->con->ucsi, command); + ret = ucsi_send_command(dp->con->ucsi, command, NULL, 0); if (ret < 0) goto out_unlock; @@ -197,8 +193,7 @@ static int ucsi_displayport_configure(struct ucsi_dp *dp) command = UCSI_CMD_SET_NEW_CAM(dp->con->num, 1, dp->offset, pins); - dp->con->ucsi->message_in_size = 0; - return ucsi_send_command(dp->con->ucsi, command); + return ucsi_send_command(dp->con->ucsi, command, NULL, 0); } static int ucsi_displayport_vdm(struct typec_altmode *alt, diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c index 8195407131501..a7b388dc7fa0f 100644 --- a/drivers/usb/typec/ucsi/ucsi.c +++ b/drivers/usb/typec/ucsi/ucsi.c @@ -55,7 +55,8 @@ void ucsi_notify_common(struct ucsi *ucsi, u32 cci) } EXPORT_SYMBOL_GPL(ucsi_notify_common); -int ucsi_sync_control_common(struct ucsi *ucsi, u64 command, u32 *cci) +int ucsi_sync_control_common(struct ucsi *ucsi, u64 command, u32 *cci, + void *data, size_t size) { bool ack = UCSI_COMMAND(command) == UCSI_ACK_CC_CI; int ret; @@ -83,10 +84,9 @@ int ucsi_sync_control_common(struct ucsi *ucsi, u64 command, u32 *cci) if (!ret && cci) ret = ucsi->ops->read_cci(ucsi, cci); - if (!ret && ucsi->message_in_size > 0 && + if (!ret && data && (*cci & UCSI_CCI_COMMAND_COMPLETE)) - ret = ucsi->ops->read_message_in(ucsi, ucsi->message_in, - ucsi->message_in_size); + ret = ucsi->ops->read_message_in(ucsi, data, size); return ret; } @@ -103,25 +103,23 @@ static int ucsi_acknowledge(struct ucsi *ucsi, bool conn_ack) ctrl |= UCSI_ACK_CONNECTOR_CHANGE; } - ucsi->message_in_size = 0; - return ucsi->ops->sync_control(ucsi, ctrl, NULL); + return ucsi->ops->sync_control(ucsi, ctrl, NULL, NULL, 0); } -static int ucsi_run_command(struct ucsi *ucsi, u64 command, u32 *cci, bool conn_ack) +static int ucsi_run_command(struct ucsi *ucsi, u64 command, u32 *cci, + void *data, size_t size, bool conn_ack) { int ret, err; *cci = 0; - if (ucsi->message_in_size > UCSI_MAX_DATA_LENGTH(ucsi)) + if (size > UCSI_MAX_DATA_LENGTH(ucsi)) return -EINVAL; - ret = ucsi->ops->sync_control(ucsi, command, cci); + ret = ucsi->ops->sync_control(ucsi, command, cci, data, size); - if (*cci & UCSI_CCI_BUSY) { - ucsi->message_in_size = 0; - return ucsi_run_command(ucsi, UCSI_CANCEL, cci, false) ?: -EBUSY; - } + if (*cci & UCSI_CCI_BUSY) + return ucsi_run_command(ucsi, UCSI_CANCEL, cci, NULL, 0, false) ?: -EBUSY; if (ret) return ret; @@ -153,13 +151,10 @@ static int ucsi_read_error(struct ucsi *ucsi, u8 connector_num) int ret; command = UCSI_GET_ERROR_STATUS | UCSI_CONNECTOR_NUMBER(connector_num); - ucsi->message_in_size = sizeof(error); - ret = ucsi_run_command(ucsi, command, &cci, false); + ret = ucsi_run_command(ucsi, command, &cci, &error, sizeof(error), false); if (ret < 0) return ret; - memcpy(&error, ucsi->message_in, sizeof(error)); - switch (error) { case UCSI_ERROR_INCOMPATIBLE_PARTNER: return -EOPNOTSUPP; @@ -205,7 +200,8 @@ static int ucsi_read_error(struct ucsi *ucsi, u8 connector_num) return -EIO; } -static int ucsi_send_command_common(struct ucsi *ucsi, u64 cmd, bool conn_ack) +static int ucsi_send_command_common(struct ucsi *ucsi, u64 cmd, + void *data, size_t size, bool conn_ack) { u8 connector_num; u32 cci; @@ -233,7 +229,7 @@ static int ucsi_send_command_common(struct ucsi *ucsi, u64 cmd, bool conn_ack) mutex_lock(&ucsi->ppm_lock); - ret = ucsi_run_command(ucsi, cmd, &cci, conn_ack); + ret = ucsi_run_command(ucsi, cmd, &cci, data, size, conn_ack); if (cci & UCSI_CCI_ERROR) ret = ucsi_read_error(ucsi, connector_num); @@ -242,9 +238,10 @@ static int ucsi_send_command_common(struct ucsi *ucsi, u64 cmd, bool conn_ack) return ret; } -int ucsi_send_command(struct ucsi *ucsi, u64 command) +int ucsi_send_command(struct ucsi *ucsi, u64 command, + void *data, size_t size) { - return ucsi_send_command_common(ucsi, command, false); + return ucsi_send_command_common(ucsi, command, data, size, false); } EXPORT_SYMBOL_GPL(ucsi_send_command); @@ -322,8 +319,7 @@ void ucsi_altmode_update_active(struct ucsi_connector *con) int i; command = UCSI_GET_CURRENT_CAM | UCSI_CONNECTOR_NUMBER(con->num); - con->ucsi->message_in_size = sizeof(cur); - ret = ucsi_send_command(con->ucsi, command); + ret = ucsi_send_command(con->ucsi, command, &cur, sizeof(cur)); if (ret < 0) { if (con->ucsi->version > 0x0100) { dev_err(con->ucsi->dev, @@ -331,8 +327,6 @@ void ucsi_altmode_update_active(struct ucsi_connector *con) return; } cur = 0xff; - } else { - memcpy(&cur, con->ucsi->message_in, sizeof(cur)); } if (cur < UCSI_MAX_ALTMODES) @@ -516,8 +510,7 @@ ucsi_register_altmodes_nvidia(struct ucsi_connector *con, u8 recipient) command |= UCSI_GET_ALTMODE_RECIPIENT(recipient); command |= UCSI_GET_ALTMODE_CONNECTOR_NUMBER(con->num); command |= UCSI_GET_ALTMODE_OFFSET(i); - ucsi->message_in_size = sizeof(alt); - len = ucsi_send_command(con->ucsi, command); + len = ucsi_send_command(con->ucsi, command, &alt, sizeof(alt)); /* * We are collecting all altmodes first and then registering. * Some type-C device will return zero length data beyond last @@ -526,8 +519,6 @@ ucsi_register_altmodes_nvidia(struct ucsi_connector *con, u8 recipient) if (len < 0) return len; - memcpy(&alt, ucsi->message_in, sizeof(alt)); - /* We got all altmodes, now break out and register them */ if (!len || !alt.svid) break; @@ -595,15 +586,12 @@ static int ucsi_register_altmodes(struct ucsi_connector *con, u8 recipient) command |= UCSI_GET_ALTMODE_RECIPIENT(recipient); command |= UCSI_GET_ALTMODE_CONNECTOR_NUMBER(con->num); command |= UCSI_GET_ALTMODE_OFFSET(i); - con->ucsi->message_in_size = sizeof(alt); - len = ucsi_send_command(con->ucsi, command); + len = ucsi_send_command(con->ucsi, command, alt, sizeof(alt)); if (len == -EBUSY) continue; if (len <= 0) return len; - memcpy(&alt, con->ucsi->message_in, sizeof(alt)); - /* * This code is requesting one alt mode at a time, but some PPMs * may still return two. If that happens both alt modes need be @@ -671,9 +659,7 @@ static int ucsi_get_connector_status(struct ucsi_connector *con, bool conn_ack) UCSI_MAX_DATA_LENGTH(con->ucsi)); int ret; - con->ucsi->message_in_size = size; - ret = ucsi_send_command_common(con->ucsi, command, conn_ack); - memcpy(&con->status, con->ucsi->message_in, size); + ret = ucsi_send_command_common(con->ucsi, command, &con->status, size, conn_ack); return ret < 0 ? ret : 0; } @@ -696,9 +682,8 @@ static int ucsi_read_pdos(struct ucsi_connector *con, command |= UCSI_GET_PDOS_PDO_OFFSET(offset); command |= UCSI_GET_PDOS_NUM_PDOS(num_pdos - 1); command |= is_source(role) ? UCSI_GET_PDOS_SRC_PDOS : 0; - ucsi->message_in_size = num_pdos * sizeof(u32); - ret = ucsi_send_command(ucsi, command); - memcpy(pdos + offset, ucsi->message_in, num_pdos * sizeof(u32)); + ret = ucsi_send_command(ucsi, command, pdos + offset, + num_pdos * sizeof(u32)); if (ret < 0 && ret != -ETIMEDOUT) dev_err(ucsi->dev, "UCSI_GET_PDOS failed (%d)\n", ret); @@ -785,9 +770,7 @@ static int ucsi_get_pd_message(struct ucsi_connector *con, u8 recipient, command |= UCSI_GET_PD_MESSAGE_BYTES(len); command |= UCSI_GET_PD_MESSAGE_TYPE(type); - con->ucsi->message_in_size = len; - ret = ucsi_send_command(con->ucsi, command); - memcpy(data + offset, con->ucsi->message_in, len); + ret = ucsi_send_command(con->ucsi, command, data + offset, len); if (ret < 0) return ret; } @@ -952,9 +935,7 @@ static int ucsi_register_cable(struct ucsi_connector *con) int ret; command = UCSI_GET_CABLE_PROPERTY | UCSI_CONNECTOR_NUMBER(con->num); - con->ucsi->message_in_size = sizeof(cable_prop); - ret = ucsi_send_command(con->ucsi, command); - memcpy(&cable_prop, con->ucsi->message_in, sizeof(cable_prop)); + ret = ucsi_send_command(con->ucsi, command, &cable_prop, sizeof(cable_prop)); if (ret < 0) { dev_err(con->ucsi->dev, "GET_CABLE_PROPERTY failed (%d)\n", ret); return ret; @@ -1015,9 +996,7 @@ static int ucsi_check_connector_capability(struct ucsi_connector *con) return 0; command = UCSI_GET_CONNECTOR_CAPABILITY | UCSI_CONNECTOR_NUMBER(con->num); - con->ucsi->message_in_size = sizeof(con->cap); - ret = ucsi_send_command(con->ucsi, command); - memcpy(&con->cap, con->ucsi->message_in, sizeof(con->cap)); + ret = ucsi_send_command(con->ucsi, command, &con->cap, sizeof(con->cap)); if (ret < 0) { dev_err(con->ucsi->dev, "GET_CONNECTOR_CAPABILITY failed (%d)\n", ret); return ret; @@ -1401,8 +1380,7 @@ static int ucsi_reset_connector(struct ucsi_connector *con, bool hard) else if (con->ucsi->version >= UCSI_VERSION_2_0) command |= hard ? 0 : UCSI_CONNECTOR_RESET_DATA_VER_2_0; - con->ucsi->message_in_size = 0; - return ucsi_send_command(con->ucsi, command); + return ucsi_send_command(con->ucsi, command, NULL, 0); } static int ucsi_reset_ppm(struct ucsi *ucsi) @@ -1483,8 +1461,7 @@ static int ucsi_role_cmd(struct ucsi_connector *con, u64 command) { int ret; - con->ucsi->message_in_size = 0; - ret = ucsi_send_command(con->ucsi, command); + ret = ucsi_send_command(con->ucsi, command, NULL, 0); if (ret == -ETIMEDOUT) { u64 c; @@ -1492,8 +1469,7 @@ static int ucsi_role_cmd(struct ucsi_connector *con, u64 command) ucsi_reset_ppm(con->ucsi); c = UCSI_SET_NOTIFICATION_ENABLE | con->ucsi->ntfy; - con->ucsi->message_in_size = 0; - ucsi_send_command(con->ucsi, c); + ucsi_send_command(con->ucsi, c, NULL, 0); ucsi_reset_connector(con, true); } @@ -1646,13 +1622,10 @@ static int ucsi_register_port(struct ucsi *ucsi, struct ucsi_connector *con) /* Get connector capability */ command = UCSI_GET_CONNECTOR_CAPABILITY; command |= UCSI_CONNECTOR_NUMBER(con->num); - ucsi->message_in_size = sizeof(con->cap); - ret = ucsi_send_command(ucsi, command); + ret = ucsi_send_command(ucsi, command, &con->cap, sizeof(con->cap)); if (ret < 0) goto out_unlock; - memcpy(&con->cap, ucsi->message_in, sizeof(con->cap)); - if (UCSI_CONCAP(con, OPMODE_DRP)) cap->data = TYPEC_PORT_DRD; else if (UCSI_CONCAP(con, OPMODE_DFP)) @@ -1849,20 +1822,17 @@ static int ucsi_init(struct ucsi *ucsi) /* Enable basic notifications */ ntfy = UCSI_ENABLE_NTFY_CMD_COMPLETE | UCSI_ENABLE_NTFY_ERROR; command = UCSI_SET_NOTIFICATION_ENABLE | ntfy; - ucsi->message_in_size = 0; - ret = ucsi_send_command(ucsi, command); + ret = ucsi_send_command(ucsi, command, NULL, 0); if (ret < 0) goto err_reset; /* Get PPM capabilities */ command = UCSI_GET_CAPABILITY; - ucsi->message_in_size = BITS_TO_BYTES(UCSI_GET_CAPABILITY_SIZE); - ret = ucsi_send_command(ucsi, command); + ret = ucsi_send_command(ucsi, command, &ucsi->cap, + BITS_TO_BYTES(UCSI_GET_CAPABILITY_SIZE)); if (ret < 0) goto err_reset; - memcpy(&ucsi->cap, ucsi->message_in, BITS_TO_BYTES(UCSI_GET_CAPABILITY_SIZE)); - if (!ucsi->cap.num_connectors) { ret = -ENODEV; goto err_reset; @@ -1892,8 +1862,7 @@ static int ucsi_init(struct ucsi *ucsi) /* Enable all supported notifications */ ntfy = ucsi_get_supported_notifications(ucsi); command = UCSI_SET_NOTIFICATION_ENABLE | ntfy; - ucsi->message_in_size = 0; - ret = ucsi_send_command(ucsi, command); + ret = ucsi_send_command(ucsi, command, NULL, 0); if (ret < 0) goto err_unregister; @@ -1944,8 +1913,7 @@ static void ucsi_resume_work(struct work_struct *work) /* Restore UCSI notification enable mask after system resume */ command = UCSI_SET_NOTIFICATION_ENABLE | ucsi->ntfy; - ucsi->message_in_size = 0; - ret = ucsi_send_command(ucsi, command); + ret = ucsi_send_command(ucsi, command, NULL, 0); if (ret < 0) { dev_err(ucsi->dev, "failed to re-enable notifications (%d)\n", ret); return; diff --git a/drivers/usb/typec/ucsi/ucsi.h b/drivers/usb/typec/ucsi/ucsi.h index 479bf1f69c72b..410389ef173ab 100644 --- a/drivers/usb/typec/ucsi/ucsi.h +++ b/drivers/usb/typec/ucsi/ucsi.h @@ -29,10 +29,6 @@ struct dentry; #define UCSI_MESSAGE_OUT 32 #define UCSIv2_MESSAGE_OUT 272 -/* Define maximum lengths for message buffers */ -#define UCSI_MAX_MESSAGE_IN_LENGTH 256 -#define UCSI_MAX_MESSAGE_OUT_LENGTH 256 - /* UCSI versions */ #define UCSI_VERSION_1_0 0x0100 #define UCSI_VERSION_1_1 0x0110 @@ -84,7 +80,8 @@ struct ucsi_operations { int (*read_cci)(struct ucsi *ucsi, u32 *cci); int (*poll_cci)(struct ucsi *ucsi, u32 *cci); int (*read_message_in)(struct ucsi *ucsi, void *val, size_t val_len); - int (*sync_control)(struct ucsi *ucsi, u64 command, u32 *cci); + int (*sync_control)(struct ucsi *ucsi, u64 command, u32 *cci, + void *data, size_t size); int (*async_control)(struct ucsi *ucsi, u64 command); bool (*update_altmodes)(struct ucsi *ucsi, u8 recipient, struct ucsi_altmode *orig, @@ -496,12 +493,6 @@ struct ucsi { unsigned long quirks; #define UCSI_NO_PARTNER_PDOS BIT(0) /* Don't read partner's PDOs */ #define UCSI_DELAY_DEVICE_PDOS BIT(1) /* Reading PDOs fails until the parter is in PD mode */ - - /* Fixed-size buffers for incoming and outgoing messages */ - u8 message_in[UCSI_MAX_MESSAGE_IN_LENGTH]; - size_t message_in_size; - u8 message_out[UCSI_MAX_MESSAGE_OUT_LENGTH]; - size_t message_out_size; }; #define UCSI_MAX_DATA_LENGTH(u) (((u)->version < UCSI_VERSION_2_0) ? 0x10 : 0xff) @@ -564,13 +555,15 @@ struct ucsi_connector { struct usb_pd_identity cable_identity; }; -int ucsi_send_command(struct ucsi *ucsi, u64 command); +int ucsi_send_command(struct ucsi *ucsi, u64 command, + void *retval, size_t size); void ucsi_altmode_update_active(struct ucsi_connector *con); int ucsi_resume(struct ucsi *ucsi); void ucsi_notify_common(struct ucsi *ucsi, u32 cci); -int ucsi_sync_control_common(struct ucsi *ucsi, u64 command, u32 *cci); +int ucsi_sync_control_common(struct ucsi *ucsi, u64 command, u32 *cci, + void *data, size_t size); #if IS_ENABLED(CONFIG_POWER_SUPPLY) int ucsi_register_port_psy(struct ucsi_connector *con); diff --git a/drivers/usb/typec/ucsi/ucsi_acpi.c b/drivers/usb/typec/ucsi/ucsi_acpi.c index f1d1f6917b098..6b92f296e9850 100644 --- a/drivers/usb/typec/ucsi/ucsi_acpi.c +++ b/drivers/usb/typec/ucsi/ucsi_acpi.c @@ -105,14 +105,15 @@ static const struct ucsi_operations ucsi_acpi_ops = { .async_control = ucsi_acpi_async_control }; -static int ucsi_gram_sync_control(struct ucsi *ucsi, u64 command, u32 *cci) +static int ucsi_gram_sync_control(struct ucsi *ucsi, u64 command, u32 *cci, + void *val, size_t len) { u16 bogus_change = UCSI_CONSTAT_POWER_LEVEL_CHANGE | UCSI_CONSTAT_PDOS_CHANGE; struct ucsi_acpi *ua = ucsi_get_drvdata(ucsi); int ret; - ret = ucsi_sync_control_common(ucsi, command, cci); + ret = ucsi_sync_control_common(ucsi, command, cci, val, len); if (ret < 0) return ret; @@ -124,8 +125,8 @@ static int ucsi_gram_sync_control(struct ucsi *ucsi, u64 command, u32 *cci) if (UCSI_COMMAND(ua->cmd) == UCSI_GET_CONNECTOR_STATUS && ua->check_bogus_event) { /* Clear the bogus change */ - if (*(u16 *)ucsi->message_in == bogus_change) - *(u16 *)ucsi->message_in = 0; + if (*(u16 *)val == bogus_change) + *(u16 *)val = 0; ua->check_bogus_event = false; } diff --git a/drivers/usb/typec/ucsi/ucsi_ccg.c b/drivers/usb/typec/ucsi/ucsi_ccg.c index ead1b2a25c791..d83a0051c7373 100644 --- a/drivers/usb/typec/ucsi/ucsi_ccg.c +++ b/drivers/usb/typec/ucsi/ucsi_ccg.c @@ -606,7 +606,8 @@ static int ucsi_ccg_async_control(struct ucsi *ucsi, u64 command) return ccg_write(uc, reg, (u8 *)&command, sizeof(command)); } -static int ucsi_ccg_sync_control(struct ucsi *ucsi, u64 command, u32 *cci) +static int ucsi_ccg_sync_control(struct ucsi *ucsi, u64 command, u32 *cci, + void *data, size_t size) { struct ucsi_ccg *uc = ucsi_get_drvdata(ucsi); struct ucsi_connector *con; @@ -628,16 +629,16 @@ static int ucsi_ccg_sync_control(struct ucsi *ucsi, u64 command, u32 *cci) ucsi_ccg_update_set_new_cam_cmd(uc, con, &command); } - ret = ucsi_sync_control_common(ucsi, command, cci); + ret = ucsi_sync_control_common(ucsi, command, cci, data, size); switch (UCSI_COMMAND(command)) { case UCSI_GET_CURRENT_CAM: if (uc->has_multiple_dp) - ucsi_ccg_update_get_current_cam_cmd(uc, (u8 *)ucsi->message_in); + ucsi_ccg_update_get_current_cam_cmd(uc, (u8 *)data); break; case UCSI_GET_ALTERNATE_MODES: if (UCSI_ALTMODE_RECIPIENT(command) == UCSI_RECIPIENT_SOP) { - struct ucsi_altmode *alt = (struct ucsi_altmode *)ucsi->message_in; + struct ucsi_altmode *alt = data; if (alt[0].svid == USB_TYPEC_NVIDIA_VLINK_SID) ucsi_ccg_nvidia_altmode(uc, alt, command); @@ -645,7 +646,7 @@ static int ucsi_ccg_sync_control(struct ucsi *ucsi, u64 command, u32 *cci) break; case UCSI_GET_CAPABILITY: if (uc->fw_build == CCG_FW_BUILD_NVIDIA_TEGRA) { - struct ucsi_capability *cap = (struct ucsi_capability *)ucsi->message_in; + struct ucsi_capability *cap = data; cap->features &= ~UCSI_CAP_ALT_MODE_DETAILS; } diff --git a/drivers/usb/typec/ucsi/ucsi_yoga_c630.c b/drivers/usb/typec/ucsi/ucsi_yoga_c630.c index 299081444caa9..0187c1c4b21ab 100644 --- a/drivers/usb/typec/ucsi/ucsi_yoga_c630.c +++ b/drivers/usb/typec/ucsi/ucsi_yoga_c630.c @@ -88,7 +88,8 @@ static int yoga_c630_ucsi_async_control(struct ucsi *ucsi, u64 command) static int yoga_c630_ucsi_sync_control(struct ucsi *ucsi, u64 command, - u32 *cci) + u32 *cci, + void *data, size_t size) { int ret; @@ -106,8 +107,8 @@ static int yoga_c630_ucsi_sync_control(struct ucsi *ucsi, }; dev_dbg(ucsi->dev, "faking DP altmode for con1\n"); - memset(ucsi->message_in, 0, ucsi->message_in_size); - memcpy(ucsi->message_in, &alt, min(sizeof(alt), ucsi->message_in_size)); + memset(data, 0, size); + memcpy(data, &alt, min(sizeof(alt), size)); *cci = UCSI_CCI_COMMAND_COMPLETE | UCSI_SET_CCI_LENGTH(sizeof(alt)); return 0; } @@ -120,18 +121,18 @@ static int yoga_c630_ucsi_sync_control(struct ucsi *ucsi, if (UCSI_COMMAND(command) == UCSI_GET_ALTERNATE_MODES && UCSI_GET_ALTMODE_GET_CONNECTOR_NUMBER(command) == 2) { dev_dbg(ucsi->dev, "ignoring altmodes for con2\n"); - memset(ucsi->message_in, 0, ucsi->message_in_size); + memset(data, 0, size); *cci = UCSI_CCI_COMMAND_COMPLETE; return 0; } - ret = ucsi_sync_control_common(ucsi, command, cci); + ret = ucsi_sync_control_common(ucsi, command, cci, data, size); if (ret < 0) return ret; /* UCSI_GET_CURRENT_CAM is off-by-one on all ports */ - if (UCSI_COMMAND(command) == UCSI_GET_CURRENT_CAM && ucsi->message_in_size > 0) - ucsi->message_in[0]--; + if (UCSI_COMMAND(command) == UCSI_GET_CURRENT_CAM && data) + ((u8 *)data)[0]--; return ret; } From b72456dacd238807331f36c07f33c5d1879c89af Mon Sep 17 00:00:00 2001 From: Linus Torvalds Date: Sun, 28 Dec 2025 13:24:26 -0800 Subject: [PATCH 0441/1321] Linux 6.19-rc3 --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 3cd00b62cde99..27ce077520fe1 100644 --- a/Makefile +++ b/Makefile @@ -2,7 +2,7 @@ VERSION = 6 PATCHLEVEL = 19 SUBLEVEL = 0 -EXTRAVERSION = -rc2 +EXTRAVERSION = -rc3 NAME = Baby Opossum Posse # *DOCUMENTATION* From e39dd8d095a887bdc81a554d40d2da0ffebc3d05 Mon Sep 17 00:00:00 2001 From: Herbert Xu Date: Wed, 17 Dec 2025 14:15:41 +0800 Subject: [PATCH 0442/1321] crypto: seqiv - Do not use req->iv after crypto_aead_encrypt As soon as crypto_aead_encrypt is called, the underlying request may be freed by an asynchronous completion. Thus dereferencing req->iv after it returns is invalid. Instead of checking req->iv against info, create a new variable unaligned_info and use it for that purpose instead. Fixes: 0a270321dbf9 ("[CRYPTO] seqiv: Add Sequence Number IV Generator") Reported-by: Xiumei Mu Reported-by: Xin Long Signed-off-by: Herbert Xu --- crypto/seqiv.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/crypto/seqiv.c b/crypto/seqiv.c index 2bae99e335268..678bb4145d783 100644 --- a/crypto/seqiv.c +++ b/crypto/seqiv.c @@ -50,6 +50,7 @@ static int seqiv_aead_encrypt(struct aead_request *req) struct aead_geniv_ctx *ctx = crypto_aead_ctx(geniv); struct aead_request *subreq = aead_request_ctx(req); crypto_completion_t compl; + bool unaligned_info; void *data; u8 *info; unsigned int ivsize = 8; @@ -68,8 +69,9 @@ static int seqiv_aead_encrypt(struct aead_request *req) memcpy_sglist(req->dst, req->src, req->assoclen + req->cryptlen); - if (unlikely(!IS_ALIGNED((unsigned long)info, - crypto_aead_alignmask(geniv) + 1))) { + unaligned_info = !IS_ALIGNED((unsigned long)info, + crypto_aead_alignmask(geniv) + 1); + if (unlikely(unaligned_info)) { info = kmemdup(req->iv, ivsize, req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP ? GFP_KERNEL : GFP_ATOMIC); @@ -89,7 +91,7 @@ static int seqiv_aead_encrypt(struct aead_request *req) scatterwalk_map_and_copy(info, req->dst, req->assoclen, ivsize, 1); err = crypto_aead_encrypt(subreq); - if (unlikely(info != req->iv)) + if (unlikely(unaligned_info)) seqiv_aead_encrypt_complete2(req, err); return err; } From d9cecde81b6dfcc10a008a35aa98617fe5e5e7cc Mon Sep 17 00:00:00 2001 From: Chenghai Huang Date: Thu, 20 Nov 2025 21:21:24 +0800 Subject: [PATCH 0443/1321] crypto: hisilicon/qm - fix incorrect judgment in qm_get_complete_eqe_num() In qm_get_complete_eqe_num(), the function entry has already checked whether the interrupt is valid, so the interrupt event can be processed directly. Currently, the interrupt valid bit is being checked again redundantly, and no interrupt processing is performed. Therefore, the loop condition should be modified to directly process the interrupt event, and use do while instead of the current while loop, because the condition is always satisfied on the first iteration. Fixes: f5a332980a68 ("crypto: hisilicon/qm - add the save operation of eqe and aeqe") Signed-off-by: Chenghai Huang Signed-off-by: Herbert Xu --- drivers/crypto/hisilicon/qm.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/drivers/crypto/hisilicon/qm.c b/drivers/crypto/hisilicon/qm.c index f8bfff5dd0bde..d47bf06a90f7d 100644 --- a/drivers/crypto/hisilicon/qm.c +++ b/drivers/crypto/hisilicon/qm.c @@ -991,7 +991,7 @@ static void qm_get_complete_eqe_num(struct hisi_qm *qm) return; poll_data = &qm->poll_data[cqn]; - while (QM_EQE_PHASE(dw0) != qm->status.eqc_phase) { + do { poll_data->qp_finish_id[eqe_num] = dw0 & QM_EQE_CQN_MASK; eqe_num++; @@ -1004,11 +1004,10 @@ static void qm_get_complete_eqe_num(struct hisi_qm *qm) qm->status.eq_head++; } - if (eqe_num == (eq_depth >> 1) - 1) - break; - dw0 = le32_to_cpu(eqe->dw0); - } + if (QM_EQE_PHASE(dw0) != qm->status.eqc_phase) + break; + } while (eqe_num < (eq_depth >> 1) - 1); poll_data->eqe_num = eqe_num; queue_work(qm->wq, &poll_data->work); From 96db7d5d6702313d60e7670e6905c44cfe3419de Mon Sep 17 00:00:00 2001 From: Chen Ridong Date: Thu, 18 Dec 2025 01:59:50 +0000 Subject: [PATCH 0444/1321] cpuset: fix warning when disabling remote partition A warning was triggered as follows: WARNING: kernel/cgroup/cpuset.c:1651 at remote_partition_disable+0xf7/0x110 RIP: 0010:remote_partition_disable+0xf7/0x110 RSP: 0018:ffffc90001947d88 EFLAGS: 00000206 RAX: 0000000000007fff RBX: ffff888103b6e000 RCX: 0000000000006f40 RDX: 0000000000006f00 RSI: ffffc90001947da8 RDI: ffff888103b6e000 RBP: ffff888103b6e000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000001 R11: ffff88810b2e2728 R12: ffffc90001947da8 R13: 0000000000000000 R14: ffffc90001947da8 R15: ffff8881081f1c00 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f55c8bbe0b2 CR3: 000000010b14c000 CR4: 00000000000006f0 Call Trace: update_prstate+0x2d3/0x580 cpuset_partition_write+0x94/0xf0 kernfs_fop_write_iter+0x147/0x200 vfs_write+0x35d/0x500 ksys_write+0x66/0xe0 do_syscall_64+0x6b/0x390 entry_SYSCALL_64_after_hwframe+0x4b/0x53 RIP: 0033:0x7f55c8cd4887 Reproduction steps (on a 16-CPU machine): # cd /sys/fs/cgroup/ # mkdir A1 # echo +cpuset > A1/cgroup.subtree_control # echo "0-14" > A1/cpuset.cpus.exclusive # mkdir A1/A2 # echo "0-14" > A1/A2/cpuset.cpus.exclusive # echo "root" > A1/A2/cpuset.cpus.partition # echo 0 > /sys/devices/system/cpu/cpu15/online # echo member > A1/A2/cpuset.cpus.partition When CPU 15 is offlined, subpartitions_cpus gets cleared because no CPUs remain available for the top_cpuset, forcing partitions to share CPUs with the top_cpuset. In this scenario, disabling the remote partition triggers a warning stating that effective_xcpus is not a subset of subpartitions_cpus. Partitions should be invalidated in this case to inform users that the partition is now invalid(cpus are shared with top_cpuset). To fix this issue: 1. Only emit the warning only if subpartitions_cpus is not empty and the effective_xcpus is not a subset of subpartitions_cpus. 2. During the CPU hotplug process, invalidate partitions if subpartitions_cpus is empty. Fixes: f62a5d39368e ("cgroup/cpuset: Remove remote_partition_check() & make update_cpumasks_hier() handle remote partition") Signed-off-by: Chen Ridong Reviewed-by: Waiman Long Signed-off-by: Tejun Heo --- kernel/cgroup/cpuset.c | 21 ++++++++++++++++----- 1 file changed, 16 insertions(+), 5 deletions(-) diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c index 6e6eb09b8db68..3e8cc34d8d502 100644 --- a/kernel/cgroup/cpuset.c +++ b/kernel/cgroup/cpuset.c @@ -1668,7 +1668,14 @@ static int remote_partition_enable(struct cpuset *cs, int new_prs, static void remote_partition_disable(struct cpuset *cs, struct tmpmasks *tmp) { WARN_ON_ONCE(!is_remote_partition(cs)); - WARN_ON_ONCE(!cpumask_subset(cs->effective_xcpus, subpartitions_cpus)); + /* + * When a CPU is offlined, top_cpuset may end up with no available CPUs, + * which should clear subpartitions_cpus. We should not emit a warning for this + * scenario: the hierarchy is updated from top to bottom, so subpartitions_cpus + * may already be cleared when disabling the partition. + */ + WARN_ON_ONCE(!cpumask_subset(cs->effective_xcpus, subpartitions_cpus) && + !cpumask_empty(subpartitions_cpus)); spin_lock_irq(&callback_lock); cs->remote_partition = false; @@ -3976,8 +3983,9 @@ static void cpuset_hotplug_update_tasks(struct cpuset *cs, struct tmpmasks *tmp) if (remote || (is_partition_valid(cs) && is_partition_valid(parent))) compute_partition_effective_cpumask(cs, &new_cpus); - if (remote && cpumask_empty(&new_cpus) && - partition_is_populated(cs, NULL)) { + if (remote && (cpumask_empty(subpartitions_cpus) || + (cpumask_empty(&new_cpus) && + partition_is_populated(cs, NULL)))) { cs->prs_err = PERR_HOTPLUG; remote_partition_disable(cs, tmp); compute_effective_cpumask(&new_cpus, cs, parent); @@ -3990,9 +3998,12 @@ static void cpuset_hotplug_update_tasks(struct cpuset *cs, struct tmpmasks *tmp) * 1) empty effective cpus but not valid empty partition. * 2) parent is invalid or doesn't grant any cpus to child * partitions. + * 3) subpartitions_cpus is empty. */ - if (is_local_partition(cs) && (!is_partition_valid(parent) || - tasks_nocpu_error(parent, cs, &new_cpus))) + if (is_local_partition(cs) && + (!is_partition_valid(parent) || + tasks_nocpu_error(parent, cs, &new_cpus) || + cpumask_empty(subpartitions_cpus))) partcmd = partcmd_invalidate; /* * On the other hand, an invalid partition root may be transitioned From 93ea0c7faa6183ddee03d2bcb268bb4d8ae0d37a Mon Sep 17 00:00:00 2001 From: Liang Jie Date: Tue, 16 Dec 2025 17:39:55 +0800 Subject: [PATCH 0445/1321] sched_ext: fix uninitialized ret on alloc_percpu() failure Smatch reported: kernel/sched/ext.c:5332 scx_alloc_and_add_sched() warn: passing zero to 'ERR_PTR' In scx_alloc_and_add_sched(), the alloc_percpu() failure path jumps to err_free_gdsqs without initializing @ret. That can lead to returning ERR_PTR(0), which violates the ERR_PTR() convention and confuses callers. Set @ret to -ENOMEM before jumping to the error path when alloc_percpu() fails. Reported-by: kernel test robot Closes: https://lore.kernel.org/r/202512141601.yAXDAeA9-lkp@intel.com/ Reported-by: Dan Carpenter Fixes: c201ea1578d3 ("sched_ext: Move event_stats_cpu into scx_sched") Signed-off-by: Liang Jie Reviewed-by: Emil Tsalapatis Reviewed-by: Andrea Righi Signed-off-by: Tejun Heo --- kernel/sched/ext.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c index 94164f2dec6dc..7a53d1cf8e82c 100644 --- a/kernel/sched/ext.c +++ b/kernel/sched/ext.c @@ -4783,8 +4783,10 @@ static struct scx_sched *scx_alloc_and_add_sched(struct sched_ext_ops *ops) } sch->pcpu = alloc_percpu(struct scx_sched_pcpu); - if (!sch->pcpu) + if (!sch->pcpu) { + ret = -ENOMEM; goto err_free_gdsqs; + } sch->helper = kthread_run_worker(0, "sched_ext_helper"); if (IS_ERR(sch->helper)) { From c2963a3f44191aa416cfcdea818d1f1386f755fa Mon Sep 17 00:00:00 2001 From: Zqiang Date: Fri, 19 Dec 2025 17:34:04 +0800 Subject: [PATCH 0446/1321] sched_ext: Fix some comments in ext.c This commit update balance_scx() in the comments to balance_one(). Signed-off-by: Zqiang Reviewed-by: Andrea Righi Signed-off-by: Tejun Heo --- kernel/sched/ext.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c index 7a53d1cf8e82c..5ebf8a7408478 100644 --- a/kernel/sched/ext.c +++ b/kernel/sched/ext.c @@ -1577,7 +1577,7 @@ static bool dequeue_task_scx(struct rq *rq, struct task_struct *p, int deq_flags * * @p may go through multiple stopping <-> running transitions between * here and put_prev_task_scx() if task attribute changes occur while - * balance_scx() leaves @rq unlocked. However, they don't contain any + * balance_one() leaves @rq unlocked. However, they don't contain any * information meaningful to the BPF scheduler and can be suppressed by * skipping the callbacks if the task is !QUEUED. */ @@ -2372,7 +2372,7 @@ static void switch_class(struct rq *rq, struct task_struct *next) * preempted, and it regaining control of the CPU. * * ->cpu_release() complements ->cpu_acquire(), which is emitted the - * next time that balance_scx() is invoked. + * next time that balance_one() is invoked. */ if (!rq->scx.cpu_released) { if (SCX_HAS_OP(sch, cpu_release)) { @@ -2478,7 +2478,7 @@ do_pick_task_scx(struct rq *rq, struct rq_flags *rf, bool force_scx) } /* - * If balance_scx() is telling us to keep running @prev, replenish slice + * If balance_one() is telling us to keep running @prev, replenish slice * if necessary and keep running @prev. Otherwise, pop the first one * from the local DSQ. */ @@ -4025,7 +4025,7 @@ static DEFINE_TIMER(scx_bypass_lb_timer, scx_bypass_lb_timerfn); * * - ops.dispatch() is ignored. * - * - balance_scx() does not set %SCX_RQ_BAL_KEEP on non-zero slice as slice + * - balance_one() does not set %SCX_RQ_BAL_KEEP on non-zero slice as slice * can't be trusted. Whenever a tick triggers, the running task is rotated to * the tail of the queue with core_sched_at touched. * @@ -6069,7 +6069,7 @@ __bpf_kfunc bool scx_bpf_dsq_move_to_local(u64 dsq_id) /* * A successfully consumed task can be dequeued before it starts * running while the CPU is trying to migrate other dispatched - * tasks. Bump nr_tasks to tell balance_scx() to retry on empty + * tasks. Bump nr_tasks to tell balance_one() to retry on empty * local DSQ. */ dspc->nr_tasks++; From 84af4e5ee1ec73947f6123f113283911a3755aba Mon Sep 17 00:00:00 2001 From: Zqiang Date: Mon, 22 Dec 2025 19:53:17 +0800 Subject: [PATCH 0447/1321] sched_ext: Use the resched_cpu() to replace resched_curr() in the bypass_lb_node() For the PREEMPT_RT kernels, the scx_bypass_lb_timerfn() running in the preemptible per-CPU ktimer kthread context, this means that the following scenarios will occur(for x86 platform): cpu1 cpu2 ktimer kthread: ->scx_bypass_lb_timerfn ->bypass_lb_node ->for_each_cpu(cpu, resched_mask) migration/1: by preempt by migration/2: multi_cpu_stop() multi_cpu_stop() ->take_cpu_down() ->__cpu_disable() ->set cpu1 offline ->rq1 = cpu_rq(cpu1) ->resched_curr(rq1) ->smp_send_reschedule(cpu1) ->native_smp_send_reschedule(cpu1) ->if(unlikely(cpu_is_offline(cpu))) { WARN(1, "sched: Unexpected reschedule of offline CPU#%d!\n", cpu); return; } This commit therefore use the resched_cpu() to replace resched_curr() in the bypass_lb_node() to avoid send-ipi to offline CPUs. Signed-off-by: Zqiang Reviewed-by: Andrea Righi Signed-off-by: Tejun Heo --- kernel/sched/ext.c | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c index 5ebf8a7408478..8f6d8d7f895cc 100644 --- a/kernel/sched/ext.c +++ b/kernel/sched/ext.c @@ -3956,13 +3956,8 @@ static void bypass_lb_node(struct scx_sched *sch, int node) nr_donor_target, nr_target); } - for_each_cpu(cpu, resched_mask) { - struct rq *rq = cpu_rq(cpu); - - raw_spin_rq_lock_irq(rq); - resched_curr(rq); - raw_spin_rq_unlock_irq(rq); - } + for_each_cpu(cpu, resched_mask) + resched_cpu(cpu); for_each_cpu_and(cpu, cpu_online_mask, node_mask) { u32 nr = READ_ONCE(cpu_rq(cpu)->scx.bypass_dsq.nr); From 8db3e09bf921d7fbad3d703bb01b49fd1f846f66 Mon Sep 17 00:00:00 2001 From: Kohei Enju Date: Fri, 26 Dec 2025 17:46:49 +0900 Subject: [PATCH 0448/1321] tools/sched_ext: fix scx_show_state.py for scx_root change Commit 48e126777386 ("sched_ext: Introduce scx_sched") introduced scx_root and removed scx_ops, causing scx_show_state.py to fail when searching for the 'scx_ops' object. [1] Fix by using 'scx_root' instead, with NULL pointer handling. [1] # drgn -s vmlinux ./tools/sched_ext/scx_show_state.py Traceback (most recent call last): File "/root/.venv/bin/drgn", line 8, in sys.exit(_main()) ~~~~~^^ File "/root/.venv/lib64/python3.14/site-packages/drgn/cli.py", line 625, in _main runpy.run_path( ~~~~~~~~~~~~~~^ script_path, init_globals={"prog": prog}, run_name="__main__" ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ) ^ File "", line 287, in run_path File "", line 98, in _run_module_code File "", line 88, in _run_code File "./tools/sched_ext/scx_show_state.py", line 30, in ops = prog['scx_ops'] ~~~~^^^^^^^^^^^ _drgn.ObjectNotFoundError: could not find 'scx_ops' Fixes: 48e126777386 ("sched_ext: Introduce scx_sched") Signed-off-by: Kohei Enju Reviewed-by: Emil Tsalapatis Signed-off-by: Tejun Heo --- tools/sched_ext/scx_show_state.py | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/tools/sched_ext/scx_show_state.py b/tools/sched_ext/scx_show_state.py index 7cdcc6729ea4e..aec4a4498140f 100644 --- a/tools/sched_ext/scx_show_state.py +++ b/tools/sched_ext/scx_show_state.py @@ -27,10 +27,13 @@ def read_static_key(name): def state_str(state): return prog['scx_enable_state_str'][state].string_().decode() -ops = prog['scx_ops'] +root = prog['scx_root'] enable_state = read_atomic("scx_enable_state_var") -print(f'ops : {ops.name.string_().decode()}') +if root: + print(f'ops : {root.ops.name.string_().decode()}') +else: + print('ops : ') print(f'enabled : {read_static_key("__scx_enabled")}') print(f'switching_all : {read_int("scx_switching_all")}') print(f'switched_all : {read_static_key("__scx_switched_all")}') From feae967b6439533f0119a66619bdbbe5a4210bd7 Mon Sep 17 00:00:00 2001 From: Kohei Enju Date: Fri, 26 Dec 2025 17:46:50 +0900 Subject: [PATCH 0449/1321] tools/sched_ext: update scx_show_state.py for scx_aborting change Commit a69040ed57f5 ("sched_ext: Simplify breather mechanism with scx_aborting flag") removed scx_in_softlockup and scx_breather_depth, replacing them with scx_aborting. Update the script accordingly. Fixes: a69040ed57f5 ("sched_ext: Simplify breather mechanism with scx_aborting flag") Signed-off-by: Kohei Enju Reviewed-by: Emil Tsalapatis Signed-off-by: Tejun Heo --- tools/sched_ext/scx_show_state.py | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/tools/sched_ext/scx_show_state.py b/tools/sched_ext/scx_show_state.py index aec4a4498140f..02e43c184d438 100644 --- a/tools/sched_ext/scx_show_state.py +++ b/tools/sched_ext/scx_show_state.py @@ -38,8 +38,7 @@ def state_str(state): print(f'switching_all : {read_int("scx_switching_all")}') print(f'switched_all : {read_static_key("__scx_switched_all")}') print(f'enable_state : {state_str(enable_state)} ({enable_state})') -print(f'in_softlockup : {prog["scx_in_softlockup"].value_()}') -print(f'breather_depth: {read_atomic("scx_breather_depth")}') +print(f'aborting : {prog["scx_aborting"].value_()}') print(f'bypass_depth : {prog["scx_bypass_depth"].value_()}') print(f'nr_rejected : {read_atomic("scx_nr_rejected")}') print(f'enable_seq : {read_atomic("scx_enable_seq")}') From cef747b3b665a9d8e6b9e6730ca4036d592f429a Mon Sep 17 00:00:00 2001 From: "Mike Rapoport (Microsoft)" Date: Tue, 25 Nov 2025 15:29:08 +0200 Subject: [PATCH 0450/1321] MAINTAINERS: add Mike Rapoport as maintainer for userfaultfd Link: https://lkml.kernel.org/r/20251125132908.847055-1-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) Acked-by: David Hildenbrand (Red Hat) Acked-by: Vlastimil Babka Acked-by: Peter Xu Acked-by: Lorenzo Stoakes Cc: Liam Howlett Cc: Michal Hocko Cc: Mike Rapoport Cc: Suren Baghdasaryan Signed-off-by: Andrew Morton --- MAINTAINERS | 1 + 1 file changed, 1 insertion(+) diff --git a/MAINTAINERS b/MAINTAINERS index 12f49de7fe036..82fafbe0d4339 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -16751,6 +16751,7 @@ F: tools/testing/selftests/mm/transhuge-stress.c MEMORY MANAGEMENT - USERFAULTFD M: Andrew Morton +M: Mike Rapoport R: Peter Xu L: linux-mm@kvack.org S: Maintained From 39a8462521eed671c7d66c35018e45baeadef9ae Mon Sep 17 00:00:00 2001 From: Andrew Morton Date: Thu, 27 Nov 2025 10:39:24 -0800 Subject: [PATCH 0451/1321] genalloc.h: fix htmldocs warning WARNING: include/linux/genalloc.h:52 function parameter 'start_addr' not described in 'genpool_algo_t' Fixes: 52fbf1134d47 ("lib/genalloc.c: fix allocation of aligned buffer from non-aligned chunk") Reported-by: Stephen Rothwell Closes: https://lkml.kernel.org/r/20251127130624.563597e3@canb.auug.org.au Acked-by: Randy Dunlap Tested-by: Randy Dunlap Cc: Alexey Skidanov Signed-off-by: Andrew Morton --- include/linux/genalloc.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/linux/genalloc.h b/include/linux/genalloc.h index 0bd581003cd5d..60de63e46b33d 100644 --- a/include/linux/genalloc.h +++ b/include/linux/genalloc.h @@ -44,6 +44,7 @@ struct gen_pool; * @nr: The number of zeroed bits we're looking for * @data: optional additional data used by the callback * @pool: the pool being allocated from + * @start_addr: start address of memory chunk */ typedef unsigned long (*genpool_algo_t)(unsigned long *map, unsigned long size, From b14f59afc1cf68d3f6a9e40d32ec3f743662e9fe Mon Sep 17 00:00:00 2001 From: Bartosz Golaszewski Date: Fri, 28 Nov 2025 14:33:18 +0100 Subject: [PATCH 0452/1321] mailmap: update entry for Bartosz Golaszewski My linaro address will stop working tonight. Update my mailmap entry. Link: https://lkml.kernel.org/r/20251128133318.44912-1-brgl@bgdev.pl Signed-off-by: Bartosz Golaszewski Cc: Hans Verkuil Signed-off-by: Andrew Morton --- .mailmap | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/.mailmap b/.mailmap index 84309a39d329c..e6431290e849b 100644 --- a/.mailmap +++ b/.mailmap @@ -127,7 +127,8 @@ Barry Song Barry Song Bart Van Assche Bart Van Assche -Bartosz Golaszewski +Bartosz Golaszewski +Bartosz Golaszewski Ben Dooks Ben Dooks Ben Gardner From 147a3c5b869faa7b5bbd6b2535a5d6de4ef50ffe Mon Sep 17 00:00:00 2001 From: "Matthew Wilcox (Oracle)" Date: Fri, 28 Nov 2025 16:18:32 +0000 Subject: [PATCH 0453/1321] idr: fix idr_alloc() returning an ID out of range MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit If you use an IDR with a non-zero base, and specify a range that lies entirely below the base, 'max - base' becomes very large and idr_get_free() can return an ID that lies outside of the requested range. Link: https://lkml.kernel.org/r/20251128161853.3200058-1-willy@infradead.org Fixes: 6ce711f27500 ("idr: Make 1-based IDRs more efficient") Signed-off-by: Matthew Wilcox (Oracle) Reported-by: Jan Sokolowski Reported-by: Koen Koning Reported-by: Peter Senna Tschudin Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6449 Reviewed-by: Christian König Cc: Signed-off-by: Andrew Morton --- lib/idr.c | 2 ++ tools/testing/radix-tree/idr-test.c | 21 +++++++++++++++++++++ 2 files changed, 23 insertions(+) diff --git a/lib/idr.c b/lib/idr.c index e2adc457abb4b..457430cff8c5e 100644 --- a/lib/idr.c +++ b/lib/idr.c @@ -40,6 +40,8 @@ int idr_alloc_u32(struct idr *idr, void *ptr, u32 *nextid, if (WARN_ON_ONCE(!(idr->idr_rt.xa_flags & ROOT_IS_IDR))) idr->idr_rt.xa_flags |= IDR_RT_MARKER; + if (max < base) + return -ENOSPC; id = (id < base) ? 0 : id - base; radix_tree_iter_init(&iter, id); diff --git a/tools/testing/radix-tree/idr-test.c b/tools/testing/radix-tree/idr-test.c index 2f830ff8396cc..945144e985072 100644 --- a/tools/testing/radix-tree/idr-test.c +++ b/tools/testing/radix-tree/idr-test.c @@ -57,6 +57,26 @@ void idr_alloc_test(void) idr_destroy(&idr); } +void idr_alloc2_test(void) +{ + int id; + struct idr idr = IDR_INIT_BASE(idr, 1); + + id = idr_alloc(&idr, idr_alloc2_test, 0, 1, GFP_KERNEL); + assert(id == -ENOSPC); + + id = idr_alloc(&idr, idr_alloc2_test, 1, 2, GFP_KERNEL); + assert(id == 1); + + id = idr_alloc(&idr, idr_alloc2_test, 0, 1, GFP_KERNEL); + assert(id == -ENOSPC); + + id = idr_alloc(&idr, idr_alloc2_test, 0, 2, GFP_KERNEL); + assert(id == -ENOSPC); + + idr_destroy(&idr); +} + void idr_replace_test(void) { DEFINE_IDR(idr); @@ -409,6 +429,7 @@ void idr_checks(void) idr_replace_test(); idr_alloc_test(); + idr_alloc2_test(); idr_null_test(); idr_nowait_test(); idr_get_next_test(0); From be7e995140f4917779ac133318466cf0062d8fbc Mon Sep 17 00:00:00 2001 From: Jiayuan Chen Date: Thu, 4 Dec 2025 18:59:55 +0000 Subject: [PATCH 0454/1321] mm/kasan: fix incorrect unpoisoning in vrealloc for KASAN Patch series "kasan: vmalloc: Fixes for the percpu allocator and vrealloc", v3. Patches fix two issues related to KASAN and vmalloc. The first one, a KASAN tag mismatch, possibly resulting in a kernel panic, can be observed on systems with a tag-based KASAN enabled and with multiple NUMA nodes. Initially it was only noticed on x86 [1] but later a similar issue was also reported on arm64 [2]. Specifically the problem is related to how vm_structs interact with pcpu_chunks - both when they are allocated, assigned and when pcpu_chunk addresses are derived. When vm_structs are allocated they are unpoisoned, each with a different random tag, if vmalloc support is enabled along the KASAN mode. Later when first pcpu chunk is allocated it gets its 'base_addr' field set to the first allocated vm_struct. With that it inherits that vm_struct's tag. When pcpu_chunk addresses are later derived (by pcpu_chunk_addr(), for example in pcpu_alloc_noprof()) the base_addr field is used and offsets are added to it. If the initial conditions are satisfied then some of the offsets will point into memory allocated with a different vm_struct. So while the lower bits will get accurately derived the tag bits in the top of the pointer won't match the shadow memory contents. The solution (proposed at v2 of the x86 KASAN series [3]) is to unpoison the vm_structs with the same tag when allocating them for the per cpu allocator (in pcpu_get_vm_areas()). The second one reported by syzkaller [4] is related to vrealloc and happens because of random tag generation when unpoisoning memory without allocating new pages. This breaks shadow memory tracking and needs to reuse the existing tag instead of generating a new one. At the same time an inconsistency in used flags is corrected. This patch (of 3): Syzkaller reported a memory out-of-bounds bug [4]. This patch fixes two issues: 1. In vrealloc the KASAN_VMALLOC_VM_ALLOC flag is missing when unpoisoning the extended region. This flag is required to correctly associate the allocation with KASAN's vmalloc tracking. Note: In contrast, vzalloc (via __vmalloc_node_range_noprof) explicitly sets KASAN_VMALLOC_VM_ALLOC and calls kasan_unpoison_vmalloc() with it. vrealloc must behave consistently -- especially when reusing existing vmalloc regions -- to ensure KASAN can track allocations correctly. 2. When vrealloc reuses an existing vmalloc region (without allocating new pages) KASAN generates a new tag, which breaks tag-based memory access tracking. Introduce KASAN_VMALLOC_KEEP_TAG, a new KASAN flag that allows reusing the tag already attached to the pointer, ensuring consistent tag behavior during reallocation. Pass KASAN_VMALLOC_KEEP_TAG and KASAN_VMALLOC_VM_ALLOC to the kasan_unpoison_vmalloc inside vrealloc_node_align_noprof(). Link: https://lkml.kernel.org/r/cover.1765978969.git.m.wieczorretman@pm.me Link: https://lkml.kernel.org/r/38dece0a4074c43e48150d1e242f8242c73bf1a5.1764874575.git.m.wieczorretman@pm.me Link: https://lore.kernel.org/all/e7e04692866d02e6d3b32bb43b998e5d17092ba4.1738686764.git.maciej.wieczor-retman@intel.com/ [1] Link: https://lore.kernel.org/all/aMUrW1Znp1GEj7St@MiWiFi-R3L-srv/ [2] Link: https://lore.kernel.org/all/CAPAsAGxDRv_uFeMYu9TwhBVWHCCtkSxoWY4xmFB_vowMbi8raw@mail.gmail.com/ [3] Link: https://syzkaller.appspot.com/bug?extid=997752115a851cb0cf36 [4] Fixes: a0309faf1cb0 ("mm: vmalloc: support more granular vrealloc() sizing") Signed-off-by: Jiayuan Chen Co-developed-by: Maciej Wieczor-Retman Signed-off-by: Maciej Wieczor-Retman Reported-by: syzbot+997752115a851cb0cf36@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/68e243a2.050a0220.1696c6.007d.GAE@google.com/T/ Reviewed-by: Andrey Konovalov Cc: Alexander Potapenko Cc: Andrey Ryabinin Cc: Danilo Krummrich Cc: Dmitriy Vyukov Cc: Kees Cook Cc: Marco Elver Cc: "Uladzislau Rezki (Sony)" Cc: Vincenzo Frascino Cc: Signed-off-by: Andrew Morton --- include/linux/kasan.h | 1 + mm/kasan/hw_tags.c | 2 +- mm/kasan/shadow.c | 4 +++- mm/vmalloc.c | 4 +++- 4 files changed, 8 insertions(+), 3 deletions(-) diff --git a/include/linux/kasan.h b/include/linux/kasan.h index f335c1d7b61d3..df3d8567dde93 100644 --- a/include/linux/kasan.h +++ b/include/linux/kasan.h @@ -28,6 +28,7 @@ typedef unsigned int __bitwise kasan_vmalloc_flags_t; #define KASAN_VMALLOC_INIT ((__force kasan_vmalloc_flags_t)0x01u) #define KASAN_VMALLOC_VM_ALLOC ((__force kasan_vmalloc_flags_t)0x02u) #define KASAN_VMALLOC_PROT_NORMAL ((__force kasan_vmalloc_flags_t)0x04u) +#define KASAN_VMALLOC_KEEP_TAG ((__force kasan_vmalloc_flags_t)0x08u) #define KASAN_VMALLOC_PAGE_RANGE 0x1 /* Apply exsiting page range */ #define KASAN_VMALLOC_TLB_FLUSH 0x2 /* TLB flush */ diff --git a/mm/kasan/hw_tags.c b/mm/kasan/hw_tags.c index 1c373cc4b3fa5..cbef5e450954e 100644 --- a/mm/kasan/hw_tags.c +++ b/mm/kasan/hw_tags.c @@ -361,7 +361,7 @@ void *__kasan_unpoison_vmalloc(const void *start, unsigned long size, return (void *)start; } - tag = kasan_random_tag(); + tag = (flags & KASAN_VMALLOC_KEEP_TAG) ? get_tag(start) : kasan_random_tag(); start = set_tag(start, tag); /* Unpoison and initialize memory up to size. */ diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c index 29a751a8a08d9..32fbdf759ea20 100644 --- a/mm/kasan/shadow.c +++ b/mm/kasan/shadow.c @@ -631,7 +631,9 @@ void *__kasan_unpoison_vmalloc(const void *start, unsigned long size, !(flags & KASAN_VMALLOC_PROT_NORMAL)) return (void *)start; - start = set_tag(start, kasan_random_tag()); + if (unlikely(!(flags & KASAN_VMALLOC_KEEP_TAG))) + start = set_tag(start, kasan_random_tag()); + kasan_unpoison(start, size, false); return (void *)start; } diff --git a/mm/vmalloc.c b/mm/vmalloc.c index ecbac900c35f9..94c0a9262a467 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -4331,7 +4331,9 @@ void *vrealloc_node_align_noprof(const void *p, size_t size, unsigned long align */ if (size <= alloced_size) { kasan_unpoison_vmalloc(p + old_size, size - old_size, - KASAN_VMALLOC_PROT_NORMAL); + KASAN_VMALLOC_PROT_NORMAL | + KASAN_VMALLOC_VM_ALLOC | + KASAN_VMALLOC_KEEP_TAG); /* * No need to zero memory here, as unused memory will have * already been zeroed at initial allocation time or during From 70001c6c3054b0f7fb373c04e5ecc42eaddaf17b Mon Sep 17 00:00:00 2001 From: Maciej Wieczor-Retman Date: Thu, 4 Dec 2025 19:00:04 +0000 Subject: [PATCH 0455/1321] kasan: refactor pcpu kasan vmalloc unpoison A KASAN tag mismatch, possibly causing a kernel panic, can be observed on systems with a tag-based KASAN enabled and with multiple NUMA nodes. It was reported on arm64 and reproduced on x86. It can be explained in the following points: 1. There can be more than one virtual memory chunk. 2. Chunk's base address has a tag. 3. The base address points at the first chunk and thus inherits the tag of the first chunk. 4. The subsequent chunks will be accessed with the tag from the first chunk. 5. Thus, the subsequent chunks need to have their tag set to match that of the first chunk. Refactor code by reusing __kasan_unpoison_vmalloc in a new helper in preparation for the actual fix. Link: https://lkml.kernel.org/r/eb61d93b907e262eefcaa130261a08bcb6c5ce51.1764874575.git.m.wieczorretman@pm.me Fixes: 1d96320f8d53 ("kasan, vmalloc: add vmalloc tagging for SW_TAGS") Signed-off-by: Maciej Wieczor-Retman Reviewed-by: Andrey Konovalov Cc: Alexander Potapenko Cc: Andrey Ryabinin Cc: Danilo Krummrich Cc: Dmitriy Vyukov Cc: Jiayuan Chen Cc: Kees Cook Cc: Marco Elver Cc: "Uladzislau Rezki (Sony)" Cc: Vincenzo Frascino Cc: [6.1+] Signed-off-by: Andrew Morton --- include/linux/kasan.h | 15 +++++++++++++++ mm/kasan/common.c | 17 +++++++++++++++++ mm/vmalloc.c | 4 +--- 3 files changed, 33 insertions(+), 3 deletions(-) diff --git a/include/linux/kasan.h b/include/linux/kasan.h index df3d8567dde93..9c6ac4b62eb99 100644 --- a/include/linux/kasan.h +++ b/include/linux/kasan.h @@ -631,6 +631,16 @@ static __always_inline void kasan_poison_vmalloc(const void *start, __kasan_poison_vmalloc(start, size); } +void __kasan_unpoison_vmap_areas(struct vm_struct **vms, int nr_vms, + kasan_vmalloc_flags_t flags); +static __always_inline void +kasan_unpoison_vmap_areas(struct vm_struct **vms, int nr_vms, + kasan_vmalloc_flags_t flags) +{ + if (kasan_enabled()) + __kasan_unpoison_vmap_areas(vms, nr_vms, flags); +} + #else /* CONFIG_KASAN_VMALLOC */ static inline void kasan_populate_early_vm_area_shadow(void *start, @@ -655,6 +665,11 @@ static inline void *kasan_unpoison_vmalloc(const void *start, static inline void kasan_poison_vmalloc(const void *start, unsigned long size) { } +static __always_inline void +kasan_unpoison_vmap_areas(struct vm_struct **vms, int nr_vms, + kasan_vmalloc_flags_t flags) +{ } + #endif /* CONFIG_KASAN_VMALLOC */ #if (defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS)) && \ diff --git a/mm/kasan/common.c b/mm/kasan/common.c index 1d27f1bd260b3..b2b40c59ce18b 100644 --- a/mm/kasan/common.c +++ b/mm/kasan/common.c @@ -28,6 +28,7 @@ #include #include #include +#include #include "kasan.h" #include "../slab.h" @@ -575,3 +576,19 @@ bool __kasan_check_byte(const void *address, unsigned long ip) } return true; } + +#ifdef CONFIG_KASAN_VMALLOC +void __kasan_unpoison_vmap_areas(struct vm_struct **vms, int nr_vms, + kasan_vmalloc_flags_t flags) +{ + unsigned long size; + void *addr; + int area; + + for (area = 0 ; area < nr_vms ; area++) { + size = vms[area]->size; + addr = vms[area]->addr; + vms[area]->addr = __kasan_unpoison_vmalloc(addr, size, flags); + } +} +#endif diff --git a/mm/vmalloc.c b/mm/vmalloc.c index 94c0a9262a467..41dd01e8430c5 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -5027,9 +5027,7 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets, * With hardware tag-based KASAN, marking is skipped for * non-VM_ALLOC mappings, see __kasan_unpoison_vmalloc(). */ - for (area = 0; area < nr_vms; area++) - vms[area]->addr = kasan_unpoison_vmalloc(vms[area]->addr, - vms[area]->size, KASAN_VMALLOC_PROT_NORMAL); + kasan_unpoison_vmap_areas(vms, nr_vms, KASAN_VMALLOC_PROT_NORMAL); kfree(vas); return vms; From 85ef29657266194820348c98b198bb4ac3ef7d47 Mon Sep 17 00:00:00 2001 From: Maciej Wieczor-Retman Date: Thu, 4 Dec 2025 19:00:11 +0000 Subject: [PATCH 0456/1321] kasan: unpoison vms[area] addresses with a common tag A KASAN tag mismatch, possibly causing a kernel panic, can be observed on systems with a tag-based KASAN enabled and with multiple NUMA nodes. It was reported on arm64 and reproduced on x86. It can be explained in the following points: 1. There can be more than one virtual memory chunk. 2. Chunk's base address has a tag. 3. The base address points at the first chunk and thus inherits the tag of the first chunk. 4. The subsequent chunks will be accessed with the tag from the first chunk. 5. Thus, the subsequent chunks need to have their tag set to match that of the first chunk. Use the new vmalloc flag that disables random tag assignment in __kasan_unpoison_vmalloc() - pass the same random tag to all the vm_structs by tagging the pointers before they go inside __kasan_unpoison_vmalloc(). Assigning a common tag resolves the pcpu chunk address mismatch. [akpm@linux-foundation.org: use WARN_ON_ONCE(), per Andrey] Link: https://lkml.kernel.org/r/CA+fCnZeuGdKSEm11oGT6FS71_vGq1vjq-xY36kxVdFvwmag2ZQ@mail.gmail.com [maciej.wieczor-retman@intel.com: remove unneeded pr_warn()] Link: https://lkml.kernel.org/r/919897daaaa3c982a27762a2ee038769ad033991.1764945396.git.m.wieczorretman@pm.me Link: https://lkml.kernel.org/r/873821114a9f722ffb5d6702b94782e902883fdf.1764874575.git.m.wieczorretman@pm.me Fixes: 1d96320f8d53 ("kasan, vmalloc: add vmalloc tagging for SW_TAGS") Signed-off-by: Maciej Wieczor-Retman Reviewed-by: Andrey Konovalov Cc: Alexander Potapenko Cc: Andrey Ryabinin Cc: Danilo Krummrich Cc: Dmitriy Vyukov Cc: Jiayuan Chen Cc: Kees Cook Cc: Marco Elver Cc: "Uladzislau Rezki (Sony)" Cc: Vincenzo Frascino Cc: [6.1+] Signed-off-by: Andrew Morton --- mm/kasan/common.c | 21 ++++++++++++++++++--- 1 file changed, 18 insertions(+), 3 deletions(-) diff --git a/mm/kasan/common.c b/mm/kasan/common.c index b2b40c59ce18b..ed489a14dddf7 100644 --- a/mm/kasan/common.c +++ b/mm/kasan/common.c @@ -584,11 +584,26 @@ void __kasan_unpoison_vmap_areas(struct vm_struct **vms, int nr_vms, unsigned long size; void *addr; int area; + u8 tag; + + /* + * If KASAN_VMALLOC_KEEP_TAG was set at this point, all vms[] pointers + * would be unpoisoned with the KASAN_TAG_KERNEL which would disable + * KASAN checks down the line. + */ + if (WARN_ON_ONCE(flags & KASAN_VMALLOC_KEEP_TAG)) + return; + + size = vms[0]->size; + addr = vms[0]->addr; + vms[0]->addr = __kasan_unpoison_vmalloc(addr, size, flags); + tag = get_tag(vms[0]->addr); - for (area = 0 ; area < nr_vms ; area++) { + for (area = 1 ; area < nr_vms ; area++) { size = vms[area]->size; - addr = vms[area]->addr; - vms[area]->addr = __kasan_unpoison_vmalloc(addr, size, flags); + addr = set_tag(vms[area]->addr, tag); + vms[area]->addr = + __kasan_unpoison_vmalloc(addr, size, flags | KASAN_VMALLOC_KEEP_TAG); } } #endif From 59ece11ee5db7b5175685e1932b1ca0c43f1d73d Mon Sep 17 00:00:00 2001 From: Randy Dunlap Date: Sun, 14 Dec 2025 12:15:17 -0800 Subject: [PATCH 0457/1321] mm: leafops.h: correct kernel-doc function param. names Modify the kernel-doc function parameter names to prevent kernel-doc warnings: Warning: include/linux/leafops.h:135 function parameter 'entry' not described in 'leafent_type' Warning: include/linux/leafops.h:540 function parameter 'pte' not described in 'pte_is_uffd_marker' Link: https://lkml.kernel.org/r/20251214201517.2187051-1-rdunlap@infradead.org Signed-off-by: Randy Dunlap Reviewed-by: Lorenzo Stoakes Cc: Liam Howlett Cc: Michal Hocko Cc: Mike Rapoport Cc: Suren Baghdasaryan Cc: Vlastimil Babka Signed-off-by: Andrew Morton --- include/linux/leafops.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/include/linux/leafops.h b/include/linux/leafops.h index cfafe7a5e7b13..a9ff94b744f22 100644 --- a/include/linux/leafops.h +++ b/include/linux/leafops.h @@ -133,7 +133,7 @@ static inline bool softleaf_is_none(softleaf_t entry) /** * softleaf_type() - Identify the type of leaf entry. - * @enntry: Leaf entry. + * @entry: Leaf entry. * * Returns: the leaf entry type associated with @entry. */ @@ -534,7 +534,7 @@ static inline bool pte_is_uffd_wp_marker(pte_t pte) /** * pte_is_uffd_marker() - Does this PTE entry encode a userfault-specific marker * leaf entry? - * @entry: Leaf entry. + * @pte: PTE entry. * * It's useful to be able to determine which leaf entries encode UFFD-specific * markers so we can handle these correctly. From fb5213707528ffabc7784660fbc973ac0995368a Mon Sep 17 00:00:00 2001 From: Alexander Gordeev Date: Fri, 12 Dec 2025 16:14:57 +0100 Subject: [PATCH 0458/1321] mm/page_alloc: change all pageblocks migrate type on coalescing When a page is freed it coalesces with a buddy into a higher order page while possible. When the buddy page migrate type differs, it is expected to be updated to match the one of the page being freed. However, only the first pageblock of the buddy page is updated, while the rest of the pageblocks are left unchanged. That causes warnings in later expand() and other code paths (like below), since an inconsistency between migration type of the list containing the page and the page-owned pageblocks migration types is introduced. [ 308.986589] ------------[ cut here ]------------ [ 308.987227] page type is 0, passed migratetype is 1 (nr=256) [ 308.987275] WARNING: CPU: 1 PID: 5224 at mm/page_alloc.c:812 expand+0x23c/0x270 [ 308.987293] Modules linked in: algif_hash(E) af_alg(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) nf_tables(E) s390_trng(E) vfio_ccw(E) mdev(E) vfio_iommu_type1(E) vfio(E) sch_fq_codel(E) drm(E) i2c_core(E) drm_panel_orientation_quirks(E) loop(E) nfnetlink(E) vsock_loopback(E) vmw_vsock_virtio_transport_common(E) vsock(E) ctcm(E) fsm(E) diag288_wdt(E) watchdog(E) zfcp(E) scsi_transport_fc(E) ghash_s390(E) prng(E) aes_s390(E) des_generic(E) des_s390(E) libdes(E) sha3_512_s390(E) sha3_256_s390(E) sha_common(E) paes_s390(E) crypto_engine(E) pkey_cca(E) pkey_ep11(E) zcrypt(E) rng_core(E) pkey_pckmo(E) pkey(E) autofs4(E) [ 308.987439] Unloaded tainted modules: hmac_s390(E):2 [ 308.987650] CPU: 1 UID: 0 PID: 5224 Comm: mempig_verify Kdump: loaded Tainted: G E 6.18.0-gcc-bpf-debug #431 PREEMPT [ 308.987657] Tainted: [E]=UNSIGNED_MODULE [ 308.987661] Hardware name: IBM 3906 M04 704 (z/VM 7.3.0) [ 308.987666] Krnl PSW : 0404f00180000000 00000349976fa600 (expand+0x240/0x270) [ 308.987676] R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:3 PM:0 RI:0 EA:3 [ 308.987682] Krnl GPRS: 0000034980000004 0000000000000005 0000000000000030 000003499a0e6d88 [ 308.987688] 0000000000000005 0000034980000005 000002be803ac000 0000023efe6c8300 [ 308.987692] 0000000000000008 0000034998d57290 000002be00000100 0000023e00000008 [ 308.987696] 0000000000000000 0000000000000000 00000349976fa5fc 000002c99b1eb6f0 [ 308.987708] Krnl Code: 00000349976fa5f0: c020008a02f2 larl %r2,000003499883abd4 00000349976fa5f6: c0e5ffe3f4b5 brasl %r14,0000034997378f60 #00000349976fa5fc: af000000 mc 0,0 >00000349976fa600: a7f4ff4c brc 15,00000349976fa498 00000349976fa604: b9040026 lgr %r2,%r6 00000349976fa608: c0300088317f larl %r3,0000034998800906 00000349976fa60e: c0e5fffdb6e1 brasl %r14,00000349976b13d0 00000349976fa614: af000000 mc 0,0 [ 308.987734] Call Trace: [ 308.987738] [<00000349976fa600>] expand+0x240/0x270 [ 308.987744] ([<00000349976fa5fc>] expand+0x23c/0x270) [ 308.987749] [<00000349976ff95e>] rmqueue_bulk+0x71e/0x940 [ 308.987754] [<00000349976ffd7e>] __rmqueue_pcplist+0x1fe/0x2a0 [ 308.987759] [<0000034997700966>] rmqueue.isra.0+0xb46/0xf40 [ 308.987763] [<0000034997703ec8>] get_page_from_freelist+0x198/0x8d0 [ 308.987768] [<0000034997706fa8>] __alloc_frozen_pages_noprof+0x198/0x400 [ 308.987774] [<00000349977536f8>] alloc_pages_mpol+0xb8/0x220 [ 308.987781] [<0000034997753bf6>] folio_alloc_mpol_noprof+0x26/0xc0 [ 308.987786] [<0000034997753e4c>] vma_alloc_folio_noprof+0x6c/0xa0 [ 308.987791] [<0000034997775b22>] vma_alloc_anon_folio_pmd+0x42/0x240 [ 308.987799] [<000003499777bfea>] __do_huge_pmd_anonymous_page+0x3a/0x210 [ 308.987804] [<00000349976cb08e>] __handle_mm_fault+0x4de/0x500 [ 308.987809] [<00000349976cb14c>] handle_mm_fault+0x9c/0x3a0 [ 308.987813] [<000003499734d70e>] do_exception+0x1de/0x540 [ 308.987822] [<0000034998387390>] __do_pgm_check+0x130/0x220 [ 308.987830] [<000003499839a934>] pgm_check_handler+0x114/0x160 [ 308.987838] 3 locks held by mempig_verify/5224: [ 308.987842] #0: 0000023ea44c1e08 (vm_lock){++++}-{0:0}, at: lock_vma_under_rcu+0xb2/0x2a0 [ 308.987859] #1: 0000023ee4d41b18 (&pcp->lock){+.+.}-{2:2}, at: rmqueue.isra.0+0xad6/0xf40 [ 308.987871] #2: 0000023efe6c8998 (&zone->lock){..-.}-{2:2}, at: rmqueue_bulk+0x5a/0x940 [ 308.987886] Last Breaking-Event-Address: [ 308.987890] [<0000034997379096>] __warn_printk+0x136/0x140 [ 308.987897] irq event stamp: 52330356 [ 308.987901] hardirqs last enabled at (52330355): [<000003499838742e>] __do_pgm_check+0x1ce/0x220 [ 308.987907] hardirqs last disabled at (52330356): [<000003499839932e>] _raw_spin_lock_irqsave+0x9e/0xe0 [ 308.987913] softirqs last enabled at (52329882): [<0000034997383786>] handle_softirqs+0x2c6/0x530 [ 308.987922] softirqs last disabled at (52329859): [<0000034997382f86>] __irq_exit_rcu+0x126/0x140 [ 308.987929] ---[ end trace 0000000000000000 ]--- [ 308.987936] ------------[ cut here ]------------ [ 308.987940] page type is 0, passed migratetype is 1 (nr=256) [ 308.987951] WARNING: CPU: 1 PID: 5224 at mm/page_alloc.c:860 __del_page_from_free_list+0x1be/0x1e0 [ 308.987960] Modules linked in: algif_hash(E) af_alg(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) nf_tables(E) s390_trng(E) vfio_ccw(E) mdev(E) vfio_iommu_type1(E) vfio(E) sch_fq_codel(E) drm(E) i2c_core(E) drm_panel_orientation_quirks(E) loop(E) nfnetlink(E) vsock_loopback(E) vmw_vsock_virtio_transport_common(E) vsock(E) ctcm(E) fsm(E) diag288_wdt(E) watchdog(E) zfcp(E) scsi_transport_fc(E) ghash_s390(E) prng(E) aes_s390(E) des_generic(E) des_s390(E) libdes(E) sha3_512_s390(E) sha3_256_s390(E) sha_common(E) paes_s390(E) crypto_engine(E) pkey_cca(E) pkey_ep11(E) zcrypt(E) rng_core(E) pkey_pckmo(E) pkey(E) autofs4(E) [ 308.988070] Unloaded tainted modules: hmac_s390(E):2 [ 308.988087] CPU: 1 UID: 0 PID: 5224 Comm: mempig_verify Kdump: loaded Tainted: G W E 6.18.0-gcc-bpf-debug #431 PREEMPT [ 308.988095] Tainted: [W]=WARN, [E]=UNSIGNED_MODULE [ 308.988100] Hardware name: IBM 3906 M04 704 (z/VM 7.3.0) [ 308.988105] Krnl PSW : 0404f00180000000 00000349976f9e32 (__del_page_from_free_list+0x1c2/0x1e0) [ 308.988118] R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:3 PM:0 RI:0 EA:3 [ 308.988127] Krnl GPRS: 0000034980000004 0000000000000005 0000000000000030 000003499a0e6d88 [ 308.988133] 0000000000000005 0000034980000005 0000034998d57290 0000023efe6c8300 [ 308.988139] 0000000000000001 0000000000000008 000002be00000100 000002be803ac000 [ 308.988144] 0000000000000000 0000000000000001 00000349976f9e2e 000002c99b1eb728 [ 308.988153] Krnl Code: 00000349976f9e22: c020008a06d9 larl %r2,000003499883abd4 00000349976f9e28: c0e5ffe3f89c brasl %r14,0000034997378f60 #00000349976f9e2e: af000000 mc 0,0 >00000349976f9e32: a7f4ff4e brc 15,00000349976f9cce 00000349976f9e36: b904002b lgr %r2,%r11 00000349976f9e3a: c030008a06e7 larl %r3,000003499883ac08 00000349976f9e40: c0e5fffdbac8 brasl %r14,00000349976b13d0 00000349976f9e46: af000000 mc 0,0 [ 308.988184] Call Trace: [ 308.988188] [<00000349976f9e32>] __del_page_from_free_list+0x1c2/0x1e0 [ 308.988195] ([<00000349976f9e2e>] __del_page_from_free_list+0x1be/0x1e0) [ 308.988202] [<00000349976ff946>] rmqueue_bulk+0x706/0x940 [ 308.988208] [<00000349976ffd7e>] __rmqueue_pcplist+0x1fe/0x2a0 [ 308.988214] [<0000034997700966>] rmqueue.isra.0+0xb46/0xf40 [ 308.988221] [<0000034997703ec8>] get_page_from_freelist+0x198/0x8d0 [ 308.988227] [<0000034997706fa8>] __alloc_frozen_pages_noprof+0x198/0x400 [ 308.988233] [<00000349977536f8>] alloc_pages_mpol+0xb8/0x220 [ 308.988240] [<0000034997753bf6>] folio_alloc_mpol_noprof+0x26/0xc0 [ 308.988247] [<0000034997753e4c>] vma_alloc_folio_noprof+0x6c/0xa0 [ 308.988253] [<0000034997775b22>] vma_alloc_anon_folio_pmd+0x42/0x240 [ 308.988260] [<000003499777bfea>] __do_huge_pmd_anonymous_page+0x3a/0x210 [ 308.988267] [<00000349976cb08e>] __handle_mm_fault+0x4de/0x500 [ 308.988273] [<00000349976cb14c>] handle_mm_fault+0x9c/0x3a0 [ 308.988279] [<000003499734d70e>] do_exception+0x1de/0x540 [ 308.988286] [<0000034998387390>] __do_pgm_check+0x130/0x220 [ 308.988293] [<000003499839a934>] pgm_check_handler+0x114/0x160 [ 308.988300] 3 locks held by mempig_verify/5224: [ 308.988305] #0: 0000023ea44c1e08 (vm_lock){++++}-{0:0}, at: lock_vma_under_rcu+0xb2/0x2a0 [ 308.988322] #1: 0000023ee4d41b18 (&pcp->lock){+.+.}-{2:2}, at: rmqueue.isra.0+0xad6/0xf40 [ 308.988334] #2: 0000023efe6c8998 (&zone->lock){..-.}-{2:2}, at: rmqueue_bulk+0x5a/0x940 [ 308.988346] Last Breaking-Event-Address: [ 308.988350] [<0000034997379096>] __warn_printk+0x136/0x140 [ 308.988356] irq event stamp: 52330356 [ 308.988360] hardirqs last enabled at (52330355): [<000003499838742e>] __do_pgm_check+0x1ce/0x220 [ 308.988366] hardirqs last disabled at (52330356): [<000003499839932e>] _raw_spin_lock_irqsave+0x9e/0xe0 [ 308.988373] softirqs last enabled at (52329882): [<0000034997383786>] handle_softirqs+0x2c6/0x530 [ 308.988380] softirqs last disabled at (52329859): [<0000034997382f86>] __irq_exit_rcu+0x126/0x140 [ 308.988388] ---[ end trace 0000000000000000 ]--- Link: https://lkml.kernel.org/r/20251215081002.3353900A9c-agordeev@linux.ibm.com Link: https://lkml.kernel.org/r/20251212151457.3898073Add-agordeev@linux.ibm.com Fixes: e6cf9e1c4cde ("mm: page_alloc: fix up block types when merging compatible blocks") Signed-off-by: Alexander Gordeev Reported-by: Marc Hartmayer Closes: https://lore.kernel.org/linux-mm/87wmalyktd.fsf@linux.ibm.com/ Acked-by: Vlastimil Babka Acked-by: Johannes Weiner Reviewed-by: Wei Yang Cc: Marc Hartmayer Cc: Signed-off-by: Andrew Morton --- mm/page_alloc.c | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 822e05f1a9646..f6586f165b893 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -914,6 +914,17 @@ buddy_merge_likely(unsigned long pfn, unsigned long buddy_pfn, NULL) != NULL; } +static void change_pageblock_range(struct page *pageblock_page, + int start_order, int migratetype) +{ + int nr_pageblocks = 1 << (start_order - pageblock_order); + + while (nr_pageblocks--) { + set_pageblock_migratetype(pageblock_page, migratetype); + pageblock_page += pageblock_nr_pages; + } +} + /* * Freeing function for a buddy system allocator. * @@ -1000,7 +1011,7 @@ static inline void __free_one_page(struct page *page, * expand() down the line puts the sub-blocks * on the right freelists. */ - set_pageblock_migratetype(buddy, migratetype); + change_pageblock_range(buddy, order, migratetype); } combined_pfn = buddy_pfn & pfn; @@ -2147,17 +2158,6 @@ bool pageblock_unisolate_and_move_free_pages(struct zone *zone, struct page *pag #endif /* CONFIG_MEMORY_ISOLATION */ -static void change_pageblock_range(struct page *pageblock_page, - int start_order, int migratetype) -{ - int nr_pageblocks = 1 << (start_order - pageblock_order); - - while (nr_pageblocks--) { - set_pageblock_migratetype(pageblock_page, migratetype); - pageblock_page += pageblock_nr_pages; - } -} - static inline bool boost_watermark(struct zone *zone) { unsigned long max_boost; From 5c08956d2a71004aa593cb0194ebfdf655d74429 Mon Sep 17 00:00:00 2001 From: Bartosz Golaszewski Date: Thu, 4 Dec 2025 11:45:31 +0100 Subject: [PATCH 0459/1321] MAINTAINERS: update one straggling entry for Bartosz Golaszewski The entry for the Qualcomm bluetooth driver only now got sent upstream and still has my old address. Update it to use the kernel.org one. Link: https://lkml.kernel.org/r/20251204104531.22045-1-bartosz.golaszewski@oss.qualcomm.com Signed-off-by: Bartosz Golaszewski Cc: Hans Verkuil Signed-off-by: Andrew Morton --- MAINTAINERS | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index 82fafbe0d4339..bc651d25c2916 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -21346,7 +21346,7 @@ F: Documentation/devicetree/bindings/net/qcom,bam-dmux.yaml F: drivers/net/wwan/qcom_bam_dmux.c QUALCOMM BLUETOOTH DRIVER -M: Bartosz Golaszewski +M: Bartosz Golaszewski L: linux-arm-msm@vger.kernel.org S: Maintained F: drivers/bluetooth/btqca.[ch] From 160bcabbb5554143557c5edddfd1510ad8c6ec17 Mon Sep 17 00:00:00 2001 From: Akinobu Mita Date: Wed, 10 Dec 2025 00:10:34 +0900 Subject: [PATCH 0460/1321] mm/damon/vaddr: fix missing pte_unmap_unlock in damos_va_migrate_pmd_entry() If the PTE page table lock is acquired by pte_offset_map_lock(), the lock must be released via pte_unmap_unlock(). However, in damos_va_migrate_pmd_entry(), if damos_va_filter_out() returns true, it immediately returns without releasing the lock. This fixes the issue by not stopping page table traversal when damos_va_filter_out() returns true and ensuring that the lock is released. Link: https://lkml.kernel.org/r/20251209151034.77221-1-akinobu.mita@gmail.com Fixes: 09efc56a3b1c ("mm/damon/vaddr: consistently use only pmd_entry for damos_migrate") Signed-off-by: Akinobu Mita Reviewed-by: SeongJae Park Signed-off-by: Andrew Morton --- mm/damon/vaddr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/damon/vaddr.c b/mm/damon/vaddr.c index 2750c88e72252..23ed738a0bd6f 100644 --- a/mm/damon/vaddr.c +++ b/mm/damon/vaddr.c @@ -743,7 +743,7 @@ static int damos_va_migrate_pmd_entry(pmd_t *pmd, unsigned long addr, if (!folio) continue; if (damos_va_filter_out(s, folio, walk->vma, addr, pte, NULL)) - return 0; + continue; damos_va_migrate_dests_add(folio, walk->vma, addr, dests, migration_lists); nr = folio_nr_pages(folio); From add2932194001b968e7fa8e3fde6bccc89c0c088 Mon Sep 17 00:00:00 2001 From: WangYuli Date: Mon, 8 Dec 2025 10:57:30 +0800 Subject: [PATCH 0461/1321] .mailmap: remove one of the entries for WangYuli MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Since commit 01ef0296d2eb (".mailmap: add entry for WangYuli") was merged into mainline, I've received feedback from former colleagues: They believe the change to .mailmap affects git log based statistics, which in turn reduces the reported “contributions from uniontech” in the Linux commit tree, and they think it's difficult to explain to everyone that future statistics must be generated with the --no-use-mailmap option. I don't have a strong opinion either way, but since my commit has caused them trouble, I'm now requesting that this line be removed to bring a little more LOVE AND PEACE to the world :-) Link: https://lkml.kernel.org/r/20251208025730.33881-1-wangyuli@aosc.io Signed-off-by: WangYuli Cc: Carlos Bilbao Cc: Hans Verkuil Cc: Martin Kepplinger Cc: Shannon Nelson Signed-off-by: Andrew Morton --- .mailmap | 1 - 1 file changed, 1 deletion(-) diff --git a/.mailmap b/.mailmap index e6431290e849b..7a6110d0e46d5 100644 --- a/.mailmap +++ b/.mailmap @@ -858,7 +858,6 @@ Vladimir Davydov Vladimir Davydov WangYuli WangYuli -WangYuli Weiwen Hu WeiXiong Liao Wen Gong From 2b14e53d161f6b24e23aeade3af0126337672aa4 Mon Sep 17 00:00:00 2001 From: Pratyush Yadav Date: Fri, 12 Dec 2025 16:12:02 +0900 Subject: [PATCH 0462/1321] MAINTAINERS: add ABI headers to KHO and LIVE UPDATE include/linux/kho is supposed to hold KHO headers. Add it to KHO's MAINTAINERS entry so the right people can get patches to it. include/linux/kho/abi contains the live update ABI headers for LUO core and memfd. It will also hold ABI headers for other upcoming file types as well. Add it to live update entry so live update maintainers can get changes for it (currently they happen to be the same people). Link: https://lkml.kernel.org/r/20251212071204.398788-1-pratyush@kernel.org Signed-off-by: Pratyush Yadav Reviewed-by: Pasha Tatashin Cc: Alexander Graf Cc: Mike Rapoport Signed-off-by: Andrew Morton --- MAINTAINERS | 2 ++ 1 file changed, 2 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index bc651d25c2916..5d652c9e4ba5f 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -13959,6 +13959,7 @@ S: Maintained F: Documentation/admin-guide/mm/kho.rst F: Documentation/core-api/kho/* F: include/linux/kexec_handover.h +F: include/linux/kho/ F: kernel/liveupdate/kexec_handover* F: lib/test_kho.c F: tools/testing/selftests/kho/ @@ -14637,6 +14638,7 @@ S: Maintained F: Documentation/core-api/liveupdate.rst F: Documentation/mm/memfd_preservation.rst F: Documentation/userspace-api/liveupdate.rst +F: include/linux/kho/abi/ F: include/linux/liveupdate.h F: include/linux/liveupdate/ F: include/uapi/linux/liveupdate.h From feda8e519014a7a29d7c76b1a265ad1d3c0d77b4 Mon Sep 17 00:00:00 2001 From: Pingfan Liu Date: Tue, 16 Dec 2025 09:48:51 +0800 Subject: [PATCH 0463/1321] kernel/kexec: change the prototype of kimage_map_segment() The kexec segment index will be required to extract the corresponding information for that segment in kimage_map_segment(). Additionally, kexec_segment already holds the kexec relocation destination address and size. Therefore, the prototype of kimage_map_segment() can be changed. Link: https://lkml.kernel.org/r/20251216014852.8737-1-piliu@redhat.com Fixes: 07d24902977e ("kexec: enable CMA based contiguous allocation") Signed-off-by: Pingfan Liu Acked-by: Baoquan He Cc: Mimi Zohar Cc: Roberto Sassu Cc: Alexander Graf Cc: Steven Chen Cc: Signed-off-by: Andrew Morton --- include/linux/kexec.h | 4 ++-- kernel/kexec_core.c | 9 ++++++--- security/integrity/ima/ima_kexec.c | 4 +--- 3 files changed, 9 insertions(+), 8 deletions(-) diff --git a/include/linux/kexec.h b/include/linux/kexec.h index ff7e231b0485a..8a22bc9b8c6c8 100644 --- a/include/linux/kexec.h +++ b/include/linux/kexec.h @@ -530,7 +530,7 @@ extern bool kexec_file_dbg_print; #define kexec_dprintk(fmt, arg...) \ do { if (kexec_file_dbg_print) pr_info(fmt, ##arg); } while (0) -extern void *kimage_map_segment(struct kimage *image, unsigned long addr, unsigned long size); +extern void *kimage_map_segment(struct kimage *image, int idx); extern void kimage_unmap_segment(void *buffer); #else /* !CONFIG_KEXEC_CORE */ struct pt_regs; @@ -540,7 +540,7 @@ static inline void __crash_kexec(struct pt_regs *regs) { } static inline void crash_kexec(struct pt_regs *regs) { } static inline int kexec_should_crash(struct task_struct *p) { return 0; } static inline int kexec_crash_loaded(void) { return 0; } -static inline void *kimage_map_segment(struct kimage *image, unsigned long addr, unsigned long size) +static inline void *kimage_map_segment(struct kimage *image, int idx) { return NULL; } static inline void kimage_unmap_segment(void *buffer) { } #define kexec_in_progress false diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c index 0f92acdd354da..1a79c5b18d8fd 100644 --- a/kernel/kexec_core.c +++ b/kernel/kexec_core.c @@ -953,17 +953,20 @@ int kimage_load_segment(struct kimage *image, int idx) return result; } -void *kimage_map_segment(struct kimage *image, - unsigned long addr, unsigned long size) +void *kimage_map_segment(struct kimage *image, int idx) { + unsigned long addr, size, eaddr; unsigned long src_page_addr, dest_page_addr = 0; - unsigned long eaddr = addr + size; kimage_entry_t *ptr, entry; struct page **src_pages; unsigned int npages; void *vaddr = NULL; int i; + addr = image->segment[idx].mem; + size = image->segment[idx].memsz; + eaddr = addr + size; + /* * Collect the source pages and map them in a contiguous VA range. */ diff --git a/security/integrity/ima/ima_kexec.c b/security/integrity/ima/ima_kexec.c index 7362f68f2d8b1..5beb69edd12fd 100644 --- a/security/integrity/ima/ima_kexec.c +++ b/security/integrity/ima/ima_kexec.c @@ -250,9 +250,7 @@ void ima_kexec_post_load(struct kimage *image) if (!image->ima_buffer_addr) return; - ima_kexec_buffer = kimage_map_segment(image, - image->ima_buffer_addr, - image->ima_buffer_size); + ima_kexec_buffer = kimage_map_segment(image, image->ima_segment_index); if (!ima_kexec_buffer) { pr_err("Could not map measurements buffer.\n"); return; From 228c9575afc178fb6829a986577bc89eebbc6dc8 Mon Sep 17 00:00:00 2001 From: Pingfan Liu Date: Tue, 16 Dec 2025 09:48:52 +0800 Subject: [PATCH 0464/1321] kernel/kexec: fix IMA when allocation happens in CMA area *** Bug description *** When I tested kexec with the latest kernel, I ran into the following warning: [ 40.712410] ------------[ cut here ]------------ [ 40.712576] WARNING: CPU: 2 PID: 1562 at kernel/kexec_core.c:1001 kimage_map_segment+0x144/0x198 [...] [ 40.816047] Call trace: [ 40.818498] kimage_map_segment+0x144/0x198 (P) [ 40.823221] ima_kexec_post_load+0x58/0xc0 [ 40.827246] __do_sys_kexec_file_load+0x29c/0x368 [...] [ 40.855423] ---[ end trace 0000000000000000 ]--- *** How to reproduce *** This bug is only triggered when the kexec target address is allocated in the CMA area. If no CMA area is reserved in the kernel, use the "cma=" option in the kernel command line to reserve one. *** Root cause *** The commit 07d24902977e ("kexec: enable CMA based contiguous allocation") allocates the kexec target address directly on the CMA area to avoid copying during the jump. In this case, there is no IND_SOURCE for the kexec segment. But the current implementation of kimage_map_segment() assumes that IND_SOURCE pages exist and map them into a contiguous virtual address by vmap(). *** Solution *** If IMA segment is allocated in the CMA area, use its page_address() directly. Link: https://lkml.kernel.org/r/20251216014852.8737-2-piliu@redhat.com Fixes: 07d24902977e ("kexec: enable CMA based contiguous allocation") Signed-off-by: Pingfan Liu Acked-by: Baoquan He Cc: Alexander Graf Cc: Steven Chen Cc: Mimi Zohar Cc: Roberto Sassu Cc: Signed-off-by: Andrew Morton --- kernel/kexec_core.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c index 1a79c5b18d8fd..95c585c6ddc33 100644 --- a/kernel/kexec_core.c +++ b/kernel/kexec_core.c @@ -960,13 +960,17 @@ void *kimage_map_segment(struct kimage *image, int idx) kimage_entry_t *ptr, entry; struct page **src_pages; unsigned int npages; + struct page *cma; void *vaddr = NULL; int i; + cma = image->segment_cma[idx]; + if (cma) + return page_address(cma); + addr = image->segment[idx].mem; size = image->segment[idx].memsz; eaddr = addr + size; - /* * Collect the source pages and map them in a contiguous VA range. */ @@ -1007,7 +1011,8 @@ void *kimage_map_segment(struct kimage *image, int idx) void kimage_unmap_segment(void *segment_buffer) { - vunmap(segment_buffer); + if (is_vmalloc_addr(segment_buffer)) + vunmap(segment_buffer); } struct kexec_load_limit { From 694922304db13cf048f78bb2fdc722336f159a12 Mon Sep 17 00:00:00 2001 From: Wake Liu Date: Wed, 10 Dec 2025 17:14:08 +0800 Subject: [PATCH 0465/1321] selftests/mm: fix thread state check in uffd-unit-tests In the thread_state_get() function, the logic to find the thread's state character was using `sizeof(header) - 1` to calculate the offset from the "State:\t" string. The `header` variable is a `const char *` pointer. `sizeof()` on a pointer returns the size of the pointer itself, not the length of the string literal it points to. This makes the code's behavior dependent on the architecture's pointer size. This bug was identified on a 32-bit ARM build (`gsi_tv_arm`) for Android, running on an ARMv8-based device, compiled with Clang 19.0.1. On this 32-bit architecture, `sizeof(char *)` is 4. The expression `sizeof(header) - 1` resulted in an incorrect offset of 3, causing the test to read the wrong character from `/proc/[tid]/status` and fail. On 64-bit architectures, `sizeof(char *)` is 8, so the expression coincidentally evaluates to 7, which matches the length of "State:\t". This is why the bug likely remained hidden on 64-bit builds. To fix this and make the code portable and correct across all architectures, this patch replaces `sizeof(header) - 1` with `strlen(header)`. The `strlen()` function correctly calculates the string's length, ensuring the correct offset is always used. Link: https://lkml.kernel.org/r/20251210091408.3781445-1-wakel@google.com Fixes: f60b6634cd88 ("mm/selftests: add a test to verify mmap_changing race with -EAGAIN") Signed-off-by: Wake Liu Acked-by: Peter Xu Reviewed-by: Mike Rapoport (Microsoft) Cc: Bill Wendling Cc: Justin Stitt Cc: Liam Howlett Cc: Lorenzo Stoakes Cc: Michal Hocko Cc: Nathan Chancellor Cc: Shuah Khan Cc: Suren Baghdasaryan Cc: Vlastimil Babka Cc: Signed-off-by: Andrew Morton --- tools/testing/selftests/mm/uffd-unit-tests.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/testing/selftests/mm/uffd-unit-tests.c b/tools/testing/selftests/mm/uffd-unit-tests.c index f4807242c5b2b..6f5e404a446c3 100644 --- a/tools/testing/selftests/mm/uffd-unit-tests.c +++ b/tools/testing/selftests/mm/uffd-unit-tests.c @@ -1317,7 +1317,7 @@ static thread_state thread_state_get(pid_t tid) p = strstr(tmp, header); if (p) { /* For example, "State:\tD (disk sleep)" */ - c = *(p + sizeof(header) - 1); + c = *(p + strlen(header)); return c == 'D' ? THR_STATE_UNINTERRUPTIBLE : THR_STATE_UNKNOWN; } From ba7549a716116a6b4048c3fe152bf71e2db8b865 Mon Sep 17 00:00:00 2001 From: Kaushlendra Kumar Date: Tue, 9 Dec 2025 10:15:52 +0530 Subject: [PATCH 0466/1321] tools/mm/page_owner_sort: fix timestamp comparison for stable sorting The ternary operator in compare_ts() returns 1 when timestamps are equal, causing unstable sorting behavior. Replace with explicit three-way comparison that returns 0 for equal timestamps, ensuring stable qsort ordering and consistent output. Link: https://lkml.kernel.org/r/20251209044552.3396468-1-kaushlendra.kumar@intel.com Fixes: 8f9c447e2e2b ("tools/vm/page_owner_sort.c: support sorting pid and time") Signed-off-by: Kaushlendra Kumar Cc: Chongxi Zhao Cc: Signed-off-by: Andrew Morton --- tools/mm/page_owner_sort.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/tools/mm/page_owner_sort.c b/tools/mm/page_owner_sort.c index 14c67e9e84c42..e6954909401c8 100644 --- a/tools/mm/page_owner_sort.c +++ b/tools/mm/page_owner_sort.c @@ -181,7 +181,11 @@ static int compare_ts(const void *p1, const void *p2) { const struct block_list *l1 = p1, *l2 = p2; - return l1->ts_nsec < l2->ts_nsec ? -1 : 1; + if (l1->ts_nsec < l2->ts_nsec) + return -1; + if (l1->ts_nsec > l2->ts_nsec) + return 1; + return 0; } static int compare_cull_condition(const void *p1, const void *p2) From 3f319d418681b34f83b7b73813d83ff8ab461552 Mon Sep 17 00:00:00 2001 From: Ankit Agrawal Date: Thu, 11 Dec 2025 07:06:01 +0000 Subject: [PATCH 0467/1321] mm: fixup pfnmap memory failure handling to use pgoff The memory failure handling implementation for the PFNMAP memory with no struct pages is faulty. The VA of the mapping is determined based on the the PFN. It should instead be based on the file mapping offset. At the occurrence of poison, the memory_failure_pfn is triggered on the poisoned PFN. Introduce a callback function that allows mm to translate the PFN to the corresponding file page offset. The kernel module using the registration API must implement the callback function and provide the translation. The translated value is then used to determine the VA information and sending the SIGBUS to the usermode process mapped to the poisoned PFN. The callback is also useful for the driver to be notified of the poisoned PFN, which may then track it. Link: https://lkml.kernel.org/r/20251211070603.338701-2-ankita@nvidia.com Fixes: 2ec41967189c ("mm: handle poisoning of pfn without struct pages") Signed-off-by: Ankit Agrawal Suggested-by: Jason Gunthorpe Cc: Kevin Tian Cc: Matthew R. Ochs Cc: Miaohe Lin Cc: Naoya Horiguchi Cc: Neo Jia Cc: Vikram Sethi Cc: Yishai Hadas Cc: Zhi Wang Signed-off-by: Andrew Morton --- include/linux/memory-failure.h | 2 ++ mm/memory-failure.c | 29 ++++++++++++++++++----------- 2 files changed, 20 insertions(+), 11 deletions(-) diff --git a/include/linux/memory-failure.h b/include/linux/memory-failure.h index bc326503d2d25..7b5e11cf905ff 100644 --- a/include/linux/memory-failure.h +++ b/include/linux/memory-failure.h @@ -9,6 +9,8 @@ struct pfn_address_space; struct pfn_address_space { struct interval_tree_node node; struct address_space *mapping; + int (*pfn_to_vma_pgoff)(struct vm_area_struct *vma, + unsigned long pfn, pgoff_t *pgoff); }; int register_pfn_address_space(struct pfn_address_space *pfn_space); diff --git a/mm/memory-failure.c b/mm/memory-failure.c index fbc5a01260c89..c80c2907da333 100644 --- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -2161,6 +2161,9 @@ int register_pfn_address_space(struct pfn_address_space *pfn_space) { guard(mutex)(&pfn_space_lock); + if (!pfn_space->pfn_to_vma_pgoff) + return -EINVAL; + if (interval_tree_iter_first(&pfn_space_itree, pfn_space->node.start, pfn_space->node.last)) @@ -2183,10 +2186,10 @@ void unregister_pfn_address_space(struct pfn_address_space *pfn_space) } EXPORT_SYMBOL_GPL(unregister_pfn_address_space); -static void add_to_kill_pfn(struct task_struct *tsk, - struct vm_area_struct *vma, - struct list_head *to_kill, - unsigned long pfn) +static void add_to_kill_pgoff(struct task_struct *tsk, + struct vm_area_struct *vma, + struct list_head *to_kill, + pgoff_t pgoff) { struct to_kill *tk; @@ -2197,12 +2200,12 @@ static void add_to_kill_pfn(struct task_struct *tsk, } /* Check for pgoff not backed by struct page */ - tk->addr = vma_address(vma, pfn, 1); + tk->addr = vma_address(vma, pgoff, 1); tk->size_shift = PAGE_SHIFT; if (tk->addr == -EFAULT) pr_info("Unable to find address %lx in %s\n", - pfn, tsk->comm); + pgoff, tsk->comm); get_task_struct(tsk); tk->tsk = tsk; @@ -2212,11 +2215,12 @@ static void add_to_kill_pfn(struct task_struct *tsk, /* * Collect processes when the error hit a PFN not backed by struct page. */ -static void collect_procs_pfn(struct address_space *mapping, +static void collect_procs_pfn(struct pfn_address_space *pfn_space, unsigned long pfn, struct list_head *to_kill) { struct vm_area_struct *vma; struct task_struct *tsk; + struct address_space *mapping = pfn_space->mapping; i_mmap_lock_read(mapping); rcu_read_lock(); @@ -2226,9 +2230,12 @@ static void collect_procs_pfn(struct address_space *mapping, t = task_early_kill(tsk, true); if (!t) continue; - vma_interval_tree_foreach(vma, &mapping->i_mmap, pfn, pfn) { - if (vma->vm_mm == t->mm) - add_to_kill_pfn(t, vma, to_kill, pfn); + vma_interval_tree_foreach(vma, &mapping->i_mmap, 0, ULONG_MAX) { + pgoff_t pgoff; + + if (vma->vm_mm == t->mm && + !pfn_space->pfn_to_vma_pgoff(vma, pfn, &pgoff)) + add_to_kill_pgoff(t, vma, to_kill, pgoff); } } rcu_read_unlock(); @@ -2264,7 +2271,7 @@ static int memory_failure_pfn(unsigned long pfn, int flags) struct pfn_address_space *pfn_space = container_of(node, struct pfn_address_space, node); - collect_procs_pfn(pfn_space->mapping, pfn, &tokill); + collect_procs_pfn(pfn_space, pfn, &tokill); mf_handled = true; } From 00e4ea2a932d739264144065c6667feec4b0eddc Mon Sep 17 00:00:00 2001 From: Shakeel Butt Date: Tue, 16 Dec 2025 13:20:54 -0800 Subject: [PATCH 0468/1321] mm: memcg: fix unit conversion for K() macro in OOM log The commit bc8e51c05ad5 ("mm: memcg: dump memcg protection info on oom or alloc failures") added functionality to dump memcg protections on OOM or allocation failures. It uses K() macro to dump the information and passes bytes to the macro. However the macro take number of pages instead of bytes. It is defined as: #define K(x) ((x) << (PAGE_SHIFT-10)) Let's fix this. Link: https://lkml.kernel.org/r/20251216212054.484079-1-shakeel.butt@linux.dev Fixes: bc8e51c05ad5 ("mm: memcg: dump memcg protection info on oom or alloc failures") Signed-off-by: Shakeel Butt Reported-by: Chris Mason Acked-by: Michal Hocko Acked-by: Vlastimil Babka Reviewed-by: Muchun Song Cc: Johannes Weiner Cc: Roman Gushchin Signed-off-by: Andrew Morton --- mm/memcontrol.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index be810c1fbfc3e..86f43b7e5f710 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -5638,6 +5638,6 @@ void mem_cgroup_show_protected_memory(struct mem_cgroup *memcg) memcg = root_mem_cgroup; pr_warn("Memory cgroup min protection %lukB -- low protection %lukB", - K(atomic_long_read(&memcg->memory.children_min_usage)*PAGE_SIZE), - K(atomic_long_read(&memcg->memory.children_low_usage)*PAGE_SIZE)); + K(atomic_long_read(&memcg->memory.children_min_usage)), + K(atomic_long_read(&memcg->memory.children_low_usage))); } From 3c5a27e921d2e98286d3e9604129794cba47e304 Mon Sep 17 00:00:00 2001 From: Alice Ryhl Date: Wed, 17 Dec 2025 13:10:37 +0000 Subject: [PATCH 0469/1321] rust: maple_tree: rcu_read_lock() in destructor to silence lockdep MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit When running the Rust maple tree kunit tests with lockdep, you may trigger a warning that looks like this: lib/maple_tree.c:780 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 no locks held by kunit_try_catch/344. stack backtrace: CPU: 3 UID: 0 PID: 344 Comm: kunit_try_catch Tainted: G N 6.19.0-rc1+ #2 NONE Tainted: [N]=TEST Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014 Call Trace: dump_stack_lvl+0x71/0x90 lockdep_rcu_suspicious+0x150/0x190 mas_start+0x104/0x150 mas_find+0x179/0x240 _RINvNtCs5QSdWC790r4_4core3ptr13drop_in_placeINtNtCs1cdwasc6FUb_6kernel10maple_tree9MapleTreeINtNtNtBL_5alloc4kbox3BoxlNtNtB1x_9allocator7KmallocEEECsgxAQYCfdR72_25doctests_kernel_generated+0xaf/0x130 rust_doctest_kernel_maple_tree_rs_0+0x600/0x6b0 ? lock_release+0xeb/0x2a0 ? kunit_try_catch_run+0x210/0x210 kunit_try_run_case+0x74/0x160 ? kunit_try_catch_run+0x210/0x210 kunit_generic_run_threadfn_adapter+0x12/0x30 kthread+0x21c/0x230 ? __do_trace_sched_kthread_stop_ret+0x40/0x40 ret_from_fork+0x16c/0x270 ? __do_trace_sched_kthread_stop_ret+0x40/0x40 ret_from_fork_asm+0x11/0x20 This is because the destructor of maple tree calls mas_find() without taking rcu_read_lock() or the spinlock. Doing that is actually ok in this case since the destructor has exclusive access to the entire maple tree, but it triggers a lockdep warning. To fix that, take the rcu read lock. In the future, it's possible that memory reclaim could gain a feature where it reallocates entries in maple trees even if no user-code is touching it. If that feature is added, then this use of rcu read lock would become load-bearing, so I did not make it conditional on lockdep. We have to repeatedly take and release rcu because the destructor of T might perform operations that sleep. Link: https://lkml.kernel.org/r/20251217-maple-drop-rcu-v1-1-702af063573f@google.com Fixes: da939ef4c494 ("rust: maple_tree: add MapleTree") Signed-off-by: Alice Ryhl Reported-by: Andreas Hindborg Closes: https://rust-for-linux.zulipchat.com/#narrow/channel/x/topic/x/near/564215108 Reviewed-by: Gary Guo Reviewed-by: Daniel Almeida Cc: Andrew Ballance Cc: Björn Roy Baron Cc: Boqun Feng Cc: Danilo Krummrich Cc: Liam Howlett Cc: Matthew Wilcox (Oracle) Cc: Miguel Ojeda Cc: Trevor Gross Cc: Signed-off-by: Andrew Morton --- rust/kernel/maple_tree.rs | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/rust/kernel/maple_tree.rs b/rust/kernel/maple_tree.rs index e72eec56bf577..265d6396a78a1 100644 --- a/rust/kernel/maple_tree.rs +++ b/rust/kernel/maple_tree.rs @@ -265,7 +265,16 @@ impl MapleTree { loop { // This uses the raw accessor because we're destroying pointers without removing them // from the maple tree, which is only valid because this is the destructor. - let ptr = ma_state.mas_find_raw(usize::MAX); + // + // Take the rcu lock because mas_find_raw() requires that you hold either the spinlock + // or the rcu read lock. This is only really required if memory reclaim might + // reallocate entries in the tree, as we otherwise have exclusive access. That feature + // doesn't exist yet, so for now, taking the rcu lock only serves the purpose of + // silencing lockdep. + let ptr = { + let _rcu = kernel::sync::rcu::Guard::new(); + ma_state.mas_find_raw(usize::MAX) + }; if ptr.is_null() { break; } From 0b5a0acee796650d6a20e767355f99ed4f8d27c8 Mon Sep 17 00:00:00 2001 From: Bijan Tabatabai Date: Tue, 16 Dec 2025 14:07:27 -0600 Subject: [PATCH 0470/1321] mm: consider non-anon swap cache folios in folio_expected_ref_count() Currently, folio_expected_ref_count() only adds references for the swap cache if the folio is anonymous. However, according to the comment above the definition of PG_swapcache in enum pageflags, shmem folios can also have PG_swapcache set. This patch makes sure references for the swap cache are added if folio_test_swapcache(folio) is true. This issue was found when trying to hot-unplug memory in a QEMU/KVM virtual machine. When initiating hot-unplug when most of the guest memory is allocated, hot-unplug hangs partway through removal due to migration failures. The following message would be printed several times, and would be printed again about every five seconds: [ 49.641309] migrating pfn b12f25 failed ret:7 [ 49.641310] page: refcount:2 mapcount:0 mapping:0000000033bd8fe2 index:0x7f404d925 pfn:0xb12f25 [ 49.641311] aops:swap_aops [ 49.641313] flags: 0x300000000030508(uptodate|active|owner_priv_1|reclaim|swapbacked|node=0|zone=3) [ 49.641314] raw: 0300000000030508 ffffed312c4bc908 ffffed312c4bc9c8 0000000000000000 [ 49.641315] raw: 00000007f404d925 00000000000c823b 00000002ffffffff 0000000000000000 [ 49.641315] page dumped because: migration failure When debugging this, I found that these migration failures were due to __migrate_folio() returning -EAGAIN for a small set of folios because the expected reference count it calculates via folio_expected_ref_count() is one less than the actual reference count of the folios. Furthermore, all of the affected folios were not anonymous, but had the PG_swapcache flag set, inspiring this patch. After applying this patch, the memory hot-unplug behaves as expected. I tested this on a machine running Ubuntu 24.04 with kernel version 6.8.0-90-generic and 64GB of memory. The guest VM is managed by libvirt and runs Ubuntu 24.04 with kernel version 6.18 (though the head of the mm-unstable branch as a Dec 16, 2025 was also tested and behaves the same) and 48GB of memory. The libvirt XML definition for the VM can be found at [1]. CONFIG_MHP_DEFAULT_ONLINE_TYPE_ONLINE_MOVABLE is set in the guest kernel so the hot-pluggable memory is automatically onlined. Below are the steps to reproduce this behavior: 1) Define and start and virtual machine host$ virsh -c qemu:///system define ./test_vm.xml # test_vm.xml from [1] host$ virsh -c qemu:///system start test_vm 2) Setup swap in the guest guest$ sudo fallocate -l 32G /swapfile guest$ sudo chmod 0600 /swapfile guest$ sudo mkswap /swapfile guest$ sudo swapon /swapfile 3) Use alloc_data [2] to allocate most of the remaining guest memory guest$ ./alloc_data 45 4) In a separate guest terminal, monitor the amount of used memory guest$ watch -n1 free -h 5) When alloc_data has finished allocating, initiate the memory hot-unplug using the provided xml file [3] host$ virsh -c qemu:///system detach-device test_vm ./remove.xml --live After initiating the memory hot-unplug, you should see the amount of available memory in the guest decrease, and the amount of used swap data increase. If everything works as expected, when all of the memory is unplugged, there should be around 8.5-9GB of data in swap. If the unplugging is unsuccessful, the amount of used swap data will settle below that. If that happens, you should be able to see log messages in dmesg similar to the one posted above. Link: https://lkml.kernel.org/r/20251216200727.2360228-1-bijan311@gmail.com Link: https://github.com/BijanT/linux_patch_files/blob/main/test_vm.xml [1] Link: https://github.com/BijanT/linux_patch_files/blob/main/alloc_data.c [2] Link: https://github.com/BijanT/linux_patch_files/blob/main/remove.xml [3] Fixes: 86ebd50224c0 ("mm: add folio_expected_ref_count() for reference count calculation") Signed-off-by: Bijan Tabatabai Acked-by: David Hildenbrand (Red Hat) Acked-by: Zi Yan Reviewed-by: Baolin Wang Cc: Liam Howlett Cc: Lorenzo Stoakes Cc: Michal Hocko Cc: Mike Rapoport Cc: Shivank Garg Cc: Suren Baghdasaryan Cc: Vlastimil Babka Cc: Kairui Song Cc: Signed-off-by: Andrew Morton --- include/linux/mm.h | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 15076261d0c2e..6f959d8ca4b42 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2459,10 +2459,10 @@ static inline int folio_expected_ref_count(const struct folio *folio) if (WARN_ON_ONCE(page_has_type(&folio->page) && !folio_test_hugetlb(folio))) return 0; - if (folio_test_anon(folio)) { - /* One reference per page from the swapcache. */ - ref_count += folio_test_swapcache(folio) << order; - } else { + /* One reference per page from the swapcache. */ + ref_count += folio_test_swapcache(folio) << order; + + if (!folio_test_anon(folio)) { /* One reference per page from the pagecache. */ ref_count += !!folio->mapping << order; /* One reference from PG_private. */ From f29e6a10dbfc70290289da92509f7ca2fe76ff09 Mon Sep 17 00:00:00 2001 From: Joshua Hahn Date: Thu, 18 Dec 2025 00:31:59 -0800 Subject: [PATCH 0471/1321] mm/page_alloc: report 1 as zone_batchsize for !CONFIG_MMU Commit 2783088ef24e ("mm/page_alloc: prevent reporting pcp->batch = 0") moved the error handling (0-handling) of zone_batchsize from its callers to inside the function. However, the commit left out the error handling for the NOMMU case, leading to deadlocks on NOMMU systems. For NOMMU systems, return 1 instead of 0 for zone_batchsize, which restores the previous deadlock-free behavior. There is no functional difference expected with this patch before commit 2783088ef24e, other than the pr_debug in zone_pcp_init now printing out 1 instead of 0 for zones in NOMMU systems. Not only is this a pr_debug, the difference is purely semantic anyways. Link: https://lkml.kernel.org/r/20251218083200.2435789-1-joshua.hahnjy@gmail.com Fixes: 2783088ef24e ("mm/page_alloc: prevent reporting pcp->batch = 0") Signed-off-by: Joshua Hahn Reported-by: Daniel Palmer Closes: https://lore.kernel.org/linux-mm/CAFr9PX=_HaM3_xPtTiBn5Gw5-0xcRpawpJ02NStfdr0khF2k7g@mail.gmail.com/ Reported-by: Guenter Roeck Closes: https://lore.kernel.org/all/42143500-c380-41fe-815c-696c17241506@roeck-us.net/ Reviewed-by: Vlastimil Babka Tested-by: Daniel Palmer Tested-by: Guenter Roeck Acked-by: SeongJae Park Tested-by: Hajime Tazaki Cc: Brendan Jackman Cc: Johannes Weiner Cc: Michal Hocko Cc: Suren Baghdasaryan Cc: Zi Yan Signed-off-by: Andrew Morton --- mm/page_alloc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index f6586f165b893..c380f063e8b7b 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -5924,7 +5924,7 @@ static int zone_batchsize(struct zone *zone) * recycled, this leads to the once large chunks of space being * fragmented and becoming unavailable for high-order allocations. */ - return 0; + return 1; #endif } From e5f6dfc62b835c0fd879298431dde8db797ad4b8 Mon Sep 17 00:00:00 2001 From: Randy Dunlap Date: Wed, 17 Dec 2025 22:09:21 -0800 Subject: [PATCH 0472/1321] sparse: update MAINTAINERS info Chris Li is back as sparse maintainer. See https://git.kernel.org/pub/scm/devel/sparse/sparse.git/commit/?id=67f0a03cee4637e495151c48a02be642a158cbbb Link: https://lkml.kernel.org/r/20251218060921.995516-1-rdunlap@infradead.org Signed-off-by: Randy Dunlap Cc: Christopher Li Signed-off-by: Andrew Morton --- MAINTAINERS | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index 5d652c9e4ba5f..baa56ba548be8 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -24574,7 +24574,7 @@ F: drivers/tty/vcc.c F: include/linux/sunserialcore.h SPARSE CHECKER -M: "Luc Van Oostenryck" +M: Chris Li L: linux-sparse@vger.kernel.org S: Maintained W: https://sparse.docs.kernel.org/ From 3b7270ee20ff9ffcdb6747d260d7cac3281ad572 Mon Sep 17 00:00:00 2001 From: Dan Williams Date: Fri, 19 Dec 2025 16:03:27 -0800 Subject: [PATCH 0473/1321] MAINTAINERS: notify the "Device Memory" community of memory hotplug changes The recent episode of a warning regression in memremap_pages() [1] highlights that relevant updates are being missed by folks that care about core ZONE_DEVICE changes. Yes, CXL folks should pay more attention to linux-mm@, but it also would not hurt to copy linux-cxl@, where most Device Memory folks hang out, on memory hotplug changes by default. Link: http://lore.kernel.org/20251219123717.39330-1-john@groves.net [1] Link: https://lkml.kernel.org/r/20251220000327.3502994-1-dan.j.williams@intel.com Signed-off-by: Dan Williams Acked-by: Jonathan Cameron Acked-by: John Groves Cc: David Hildenbrand Cc: Oscar Salvador Cc: Davidlohr Bueso Cc: Dave Jiang Cc: Alison Schofield Cc: Vishal Verma Cc: Ira Weiny Signed-off-by: Andrew Morton --- MAINTAINERS | 1 + 1 file changed, 1 insertion(+) diff --git a/MAINTAINERS b/MAINTAINERS index baa56ba548be8..765ad2daa2183 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -16428,6 +16428,7 @@ MEMORY HOT(UN)PLUG M: David Hildenbrand M: Oscar Salvador L: linux-mm@kvack.org +L: linux-cxl@vger.kernel.org S: Maintained F: Documentation/admin-guide/mm/memory-hotplug.rst F: Documentation/core-api/memory-hotplug.rst From c8cdabfeaf2351faf40eaa05575707442cb96b81 Mon Sep 17 00:00:00 2001 From: John Groves Date: Fri, 19 Dec 2025 06:37:17 -0600 Subject: [PATCH 0474/1321] mm/memremap: fix spurious large folio warning for FS-DAX This patch addresses a warning that I discovered while working on famfs, which is an fs-dax file system that virtually always does PMD faults (next famfs patch series coming after the holidays). However, XFS also does PMD faults in fs-dax mode, and it also triggers the warning. It takes some effort to get XFS to do a PMD fault, but instructions to reproduce it are below. The VM_WARN_ON_ONCE(folio_test_large(folio)) check in free_zone_device_folio() incorrectly triggers for MEMORY_DEVICE_FS_DAX when PMD (2MB) mappings are used. FS-DAX legitimately creates large file-backed folios when handling PMD faults. This is a core feature of FS-DAX that provides significant performance benefits by mapping 2MB regions directly to persistent memory. When these mappings are unmapped, the large folios are freed through free_zone_device_folio(), which triggers the spurious warning. The warning was introduced by commit that added support for large zone device private folios. However, that commit did not account for FS-DAX file-backed folios, which have always supported large (PMD-sized) mappings. The check distinguishes between anonymous folios (which clear AnonExclusive flags for each sub-page) and file-backed folios. For file-backed folios, it assumes large folios are unexpected - but this assumption is incorrect for FS-DAX. The fix is to exempt MEMORY_DEVICE_FS_DAX from the large folio warning, allowing FS-DAX to continue using PMD mappings without triggering false warnings. Link: https://lkml.kernel.org/r/20251219123717.39330-1-john@groves.net Fixes: d245f9b4ab80 ("mm/zone_device: support large zone device private folios") Signed-off-by: John Groves Acked-by: David Hildenbrand (Red Hat) Reviewed-by: Dan Williams Tested-by: Alison Schofield Cc: Alistair Popple Cc: Balbir Singh Cc: "Darrick J. Wong" Cc: Gregory Price Cc: Oscar Salvador Signed-off-by: Andrew Morton --- mm/memremap.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/mm/memremap.c b/mm/memremap.c index 4c2e0d68eb279..63c6ab4fdf082 100644 --- a/mm/memremap.c +++ b/mm/memremap.c @@ -427,8 +427,6 @@ void free_zone_device_folio(struct folio *folio) if (folio_test_anon(folio)) { for (i = 0; i < nr; i++) __ClearPageAnonExclusive(folio_page(folio, i)); - } else { - VM_WARN_ON_ONCE(folio_test_large(folio)); } /* From 2bcb89de0a253ef43ceaad4976042305f06559f8 Mon Sep 17 00:00:00 2001 From: Ran Xiaokai Date: Fri, 19 Dec 2025 07:42:32 +0000 Subject: [PATCH 0475/1321] mm/page_owner: fix memory leak in page_owner_stack_fops->release() The page_owner_stack_fops->open() callback invokes seq_open_private(), therefore its corresponding ->release() callback must call seq_release_private(). Otherwise it will cause a memory leak of struct stack_print_ctx. Link: https://lkml.kernel.org/r/20251219074232.136482-1-ranxiaokai627@163.com Fixes: 765973a09803 ("mm,page_owner: display all stacks and their count") Signed-off-by: Ran Xiaokai Acked-by: Michal Hocko Acked-by: Vlastimil Babka Cc: Andrey Konovalov Cc: Brendan Jackman Cc: Johannes Weiner Cc: Marco Elver Cc: Suren Baghdasaryan Cc: Zi Yan Cc: Signed-off-by: Andrew Morton --- mm/page_owner.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/page_owner.c b/mm/page_owner.c index a702456842061..b3260f0c17ba4 100644 --- a/mm/page_owner.c +++ b/mm/page_owner.c @@ -952,7 +952,7 @@ static const struct file_operations page_owner_stack_fops = { .open = page_owner_stack_open, .read = seq_read, .llseek = seq_lseek, - .release = seq_release, + .release = seq_release_private, }; static int page_owner_threshold_get(void *data, u64 *val) From 6226a013bd37986d86f891add9ad9fe23188b91e Mon Sep 17 00:00:00 2001 From: Sasha Levin Date: Sat, 20 Dec 2025 15:29:26 -0500 Subject: [PATCH 0476/1321] mm/ksm: fix pte_unmap_unlock of wrong address in break_ksm_pmd_entry On ARM32 with HIGHMEM/HIGHPTE, break_ksm_pmd_entry() triggers a BUG during KSM unmerging because pte_unmap_unlock() is passed a pointer that may be beyond the mapped PTE page. The issue occurs when the PTE iteration loop completes without finding a KSM page. After the loop, 'ptep' has been incremented past the last PTE entry. On ARM32 LPAE with 512 PTEs per page (512 * 8 = 4096 bytes), this means ptep points to the next page, outside the kmap'd region. When pte_unmap_unlock(ptep, ptl) calls kunmap_local(ptep), it unmaps the wrong page address, leaving the original kmap slot still mapped. The next kmap_local then finds this slot unexpectedly occupied: WARNING: mm/highmem.c:622 kunmap_local_indexed (address mismatch) kernel BUG at mm/highmem.c:564 __kmap_local_pfn_prot (slot not empty) Fix this by passing start_ptep to pte_unmap_unlock(), which always points within the originally mapped PTE page. Reproducer: Run LTP ksm03 test on ARM32 with HIGHMEM enabled. The test triggers KSM merging followed by unmerging (writing 0 then 2 to /sys/kernel/mm/ksm/run), which exercises break_ksm_pmd_entry(). Link: https://lkml.kernel.org/r/20251220202926.318366-1-sashal@kernel.org Fixes: 5d4939fc2258 ("ksm: perform a range-walk in break_ksm") Signed-off-by: Sasha Levin Assisted-by: claude-opus-4-5-20251101 Acked-by: David Hildenbrand (Red Hat) Reviewed-by: Chengming Zhou Cc: Pedro Demarchi Gomes Cc: xu xin Signed-off-by: Andrew Morton --- mm/ksm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/ksm.c b/mm/ksm.c index cfc182255c7ba..2d89a7c8b4ebc 100644 --- a/mm/ksm.c +++ b/mm/ksm.c @@ -650,7 +650,7 @@ static int break_ksm_pmd_entry(pmd_t *pmdp, unsigned long addr, unsigned long en } } out_unlock: - pte_unmap_unlock(ptep, ptl); + pte_unmap_unlock(start_ptep, ptl); return found; } From a670b40e3bbab942e2bd0f675ee549eb54d60333 Mon Sep 17 00:00:00 2001 From: Nicolas Schier Date: Wed, 17 Dec 2025 20:13:43 +0100 Subject: [PATCH 0477/1321] Revert "scripts/clang-tools: Handle included .c files in gen_compile_commands" This reverts commit 9362d34acf91a706c543d919ade3e651b9bd2d6f. Dmitry Vyukov reported that commit 9362d34acf91 ("scripts/clang-tools: Handle included .c files in gen_compile_commands") generates false entries in some cases for C files that are included in other C files but not meant for standalone compilation. For properly supporting clangd, including .c files is discouraged. Reported-by: Dmitry Vyukov Closes: https://lore.kernel.org/r/CACT4Y+Z8aCz0XcoJx9XXPHZSZHxGF8Kx9iUbFarhpTSEPDhMfg@mail.gmail.com Acked-by: Nathan Chancellor Acked-by: Dmitry Vyukov Fixes: 9362d34acf91 ("scripts/clang-tools: Handle included .c files in gen_compile_commands") Link: https://patch.msgid.link/20251217-revert-scripts-clang-rools-handle-included-c-files-v1-1-def5651446da@kernel.org Signed-off-by: Nicolas Schier --- scripts/clang-tools/gen_compile_commands.py | 135 +------------------- 1 file changed, 7 insertions(+), 128 deletions(-) diff --git a/scripts/clang-tools/gen_compile_commands.py b/scripts/clang-tools/gen_compile_commands.py index 6f4afa92a4665..96e6e46ad1a70 100755 --- a/scripts/clang-tools/gen_compile_commands.py +++ b/scripts/clang-tools/gen_compile_commands.py @@ -21,12 +21,6 @@ _FILENAME_PATTERN = r'^\..*\.cmd$' _LINE_PATTERN = r'^(saved)?cmd_[^ ]*\.o := (?P.* )(?P[^ ]*\.[cS]) *(;|$)' _VALID_LOG_LEVELS = ['DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL'] - -# Pre-compiled regexes for better performance -_INCLUDE_PATTERN = re.compile(r'^\s*#\s*include\s*[<"]([^>"]*)[>"]') -_C_INCLUDE_PATTERN = re.compile(r'^\s*#\s*include\s*"([^"]*\.c)"\s*$') -_FILENAME_MATCHER = re.compile(_FILENAME_PATTERN) - # The tools/ directory adopts a different build system, and produces .cmd # files in a different format. Do not support it. _EXCLUDE_DIRS = ['.git', 'Documentation', 'include', 'tools'] @@ -88,6 +82,7 @@ def cmdfiles_in_dir(directory): The path to a .cmd file. """ + filename_matcher = re.compile(_FILENAME_PATTERN) exclude_dirs = [ os.path.join(directory, d) for d in _EXCLUDE_DIRS ] for dirpath, dirnames, filenames in os.walk(directory, topdown=True): @@ -97,7 +92,7 @@ def cmdfiles_in_dir(directory): continue for filename in filenames: - if _FILENAME_MATCHER.match(filename): + if filename_matcher.match(filename): yield os.path.join(dirpath, filename) @@ -154,87 +149,8 @@ def cmdfiles_for_modorder(modorder): yield to_cmdfile(mod_line.rstrip()) -def extract_includes_from_file(source_file, root_directory): - """Extract #include statements from a C file. - - Args: - source_file: Path to the source .c file to analyze - root_directory: Root directory for resolving relative paths - - Returns: - List of header files that should be included (without quotes/brackets) - """ - includes = [] - if not os.path.exists(source_file): - return includes - - try: - with open(source_file, 'r') as f: - for line in f: - line = line.strip() - # Look for #include statements. - # Match both #include "header.h" and #include . - match = _INCLUDE_PATTERN.match(line) - if match: - header = match.group(1) - # Skip including other .c files to avoid circular includes. - if not header.endswith('.c'): - # For relative includes (quoted), resolve path relative to source file. - if '"' in line: - src_dir = os.path.dirname(source_file) - header_path = os.path.join(src_dir, header) - if os.path.exists(header_path): - rel_header = os.path.relpath(header_path, root_directory) - includes.append(rel_header) - else: - includes.append(header) - else: - # System include like . - includes.append(header) - except IOError: - pass - - return includes - - -def find_included_c_files(source_file, root_directory): - """Find .c files that are included by the given source file. - - Args: - source_file: Path to the source .c file - root_directory: Root directory for resolving relative paths - - Yields: - Full paths to included .c files - """ - if not os.path.exists(source_file): - return - - try: - with open(source_file, 'r') as f: - for line in f: - line = line.strip() - # Look for #include "*.c" patterns. - match = _C_INCLUDE_PATTERN.match(line) - if match: - included_file = match.group(1) - # Handle relative paths. - if not os.path.isabs(included_file): - src_dir = os.path.dirname(source_file) - included_file = os.path.join(src_dir, included_file) - - # Normalize the path. - included_file = os.path.normpath(included_file) - - # Check if the file exists. - if os.path.exists(included_file): - yield included_file - except IOError: - pass - - def process_line(root_directory, command_prefix, file_path): - """Extracts information from a .cmd line and creates entries from it. + """Extracts information from a .cmd line and creates an entry from it. Args: root_directory: The directory that was searched for .cmd files. Usually @@ -244,8 +160,7 @@ def process_line(root_directory, command_prefix, file_path): Usually relative to root_directory, but sometimes absolute. Returns: - A list of entries to append to compile_commands (may include multiple - entries if the source file includes other .c files). + An entry to append to compile_commands. Raises: ValueError: Could not find the extracted file based on file_path and @@ -261,47 +176,11 @@ def process_line(root_directory, command_prefix, file_path): abs_path = os.path.realpath(os.path.join(root_directory, file_path)) if not os.path.exists(abs_path): raise ValueError('File %s not found' % abs_path) - - entries = [] - - # Create entry for the main source file. - main_entry = { + return { 'directory': root_directory, 'file': abs_path, 'command': prefix + file_path, } - entries.append(main_entry) - - # Find and create entries for included .c files. - for included_c_file in find_included_c_files(abs_path, root_directory): - # For included .c files, create a compilation command that: - # 1. Uses the same compilation flags as the parent file - # 2. But compiles the included file directly (not the parent) - # 3. Includes necessary headers from the parent file for proper macro resolution - - # Convert absolute path to relative for the command. - rel_path = os.path.relpath(included_c_file, root_directory) - - # Extract includes from the parent file to provide proper compilation context. - extra_includes = '' - try: - parent_includes = extract_includes_from_file(abs_path, root_directory) - if parent_includes: - extra_includes = ' ' + ' '.join('-include ' + inc for inc in parent_includes) - except IOError: - pass - - included_entry = { - 'directory': root_directory, - 'file': included_c_file, - # Use the same compilation prefix but target the included file directly. - # Add extra headers for proper macro resolution. - 'command': prefix + extra_includes + ' ' + rel_path, - } - entries.append(included_entry) - logging.debug('Added entry for included file: %s', included_c_file) - - return entries def main(): @@ -334,9 +213,9 @@ def main(): result = line_matcher.match(f.readline()) if result: try: - entries = process_line(directory, result.group('command_prefix'), + entry = process_line(directory, result.group('command_prefix'), result.group('file_path')) - compile_commands.extend(entries) + compile_commands.append(entry) except ValueError as err: logging.info('Could not add line from %s: %s', cmdfile, err) From a877eec45cc74bb3abfa0c9c109801336b11b9b6 Mon Sep 17 00:00:00 2001 From: Thomas De Schampheleire Date: Wed, 26 Nov 2025 11:00:16 +0100 Subject: [PATCH 0478/1321] kbuild: fix compilation of dtb specified on command-line without make rule Since commit e7e2941300d2 ("kbuild: split device tree build rules into scripts/Makefile.dtbs"), it is no longer possible to compile a device tree blob that is not specified in a make rule like: dtb-$(CONFIG_FOO) += foo.dtb Before the mentioned commit, one could copy a dts file to e.g. arch/arm64/boot/dts/ (or a new subdirectory) and then convert it to a dtb file using: make ARCH=arm64 foo.dtb In this scenario, both 'dtb-y' and 'dtb-' are empty, and the inclusion of scripts/Makefile.dtbs relies on 'targets' to contain the MAKECMDGOALS. The value of 'targets', however, is only final later in the code. Move the conditional include of scripts/Makefile.dtbs down to where the value of 'targets' is final. Since Makefile.dtbs updates 'always-y' which is used as a prerequisite in the build rule, the build rule also needs to move down. Fixes: e7e2941300d2 ("kbuild: split device tree build rules into scripts/Makefile.dtbs") Signed-off-by: Thomas De Schampheleire Reviewed-by: Nathan Chancellor Tested-by: Nathan Chancellor Acked-by: Rob Herring (Arm) Link: https://patch.msgid.link/20251126100017.1162330-1-thomas.de_schampheleire@nokia.com Signed-off-by: Nicolas Schier --- scripts/Makefile.build | 26 ++++++++++++++------------ 1 file changed, 14 insertions(+), 12 deletions(-) diff --git a/scripts/Makefile.build b/scripts/Makefile.build index 52c08c4eb0b9a..5037f4715d749 100644 --- a/scripts/Makefile.build +++ b/scripts/Makefile.build @@ -527,18 +527,6 @@ ifneq ($(userprogs),) include $(srctree)/scripts/Makefile.userprogs endif -ifneq ($(need-dtbslist)$(dtb-y)$(dtb-)$(filter %.dtb %.dtb.o %.dtbo.o,$(targets)),) -include $(srctree)/scripts/Makefile.dtbs -endif - -# Build -# --------------------------------------------------------------------------- - -$(obj)/: $(if $(KBUILD_BUILTIN), $(targets-for-builtin)) \ - $(if $(KBUILD_MODULES), $(targets-for-modules)) \ - $(subdir-ym) $(always-y) - @: - # Single targets # --------------------------------------------------------------------------- @@ -568,6 +556,20 @@ FORCE: targets += $(filter-out $(single-subdir-goals), $(MAKECMDGOALS)) targets := $(filter-out $(PHONY), $(targets)) +# Now that targets is fully known, include dtb rules if needed +ifneq ($(need-dtbslist)$(dtb-y)$(dtb-)$(filter %.dtb %.dtb.o %.dtbo.o,$(targets)),) +include $(srctree)/scripts/Makefile.dtbs +endif + +# Build +# Needs to be after the include of Makefile.dtbs, which updates always-y +# --------------------------------------------------------------------------- + +$(obj)/: $(if $(KBUILD_BUILTIN), $(targets-for-builtin)) \ + $(if $(KBUILD_MODULES), $(targets-for-modules)) \ + $(subdir-ym) $(always-y) + @: + # Read all saved command lines and dependencies for the $(targets) we # may be building above, using $(if_changed{,_dep}). As an # optimization, we don't need to read them if the target does not From d932dd21b9bc2d732af31cd43c640c1152bf1a28 Mon Sep 17 00:00:00 2001 From: Jose Javier Rodriguez Barbarin Date: Tue, 2 Dec 2025 09:42:00 +0100 Subject: [PATCH 0479/1321] mcb: Add missing modpost build support mcb bus is not prepared to autoload client drivers with the data defined on the drivers' MODULE_DEVICE_TABLE. modpost cannot access to mcb_table_id inside MODULE_DEVICE_TABLE so the data declared inside is ignored. Add modpost build support for accessing to the mcb_table_id coded on device drivers' MODULE_DEVICE_TABLE. Fixes: 3764e82e5150 ("drivers: Introduce MEN Chameleon Bus") Reviewed-by: Jorge Sanjuan Garcia Signed-off-by: Jose Javier Rodriguez Barbarin Acked-by: Nathan Chancellor Reviewed-by: Andy Shevchenko Link: https://patch.msgid.link/20251202084200.10410-1-dev-josejavier.rodriguez@duagon.com Signed-off-by: Nicolas Schier --- scripts/mod/devicetable-offsets.c | 3 +++ scripts/mod/file2alias.c | 9 +++++++++ 2 files changed, 12 insertions(+) diff --git a/scripts/mod/devicetable-offsets.c b/scripts/mod/devicetable-offsets.c index ef2ffb68f69d1..b4178c42d08f5 100644 --- a/scripts/mod/devicetable-offsets.c +++ b/scripts/mod/devicetable-offsets.c @@ -199,6 +199,9 @@ int main(void) DEVID(cpu_feature); DEVID_FIELD(cpu_feature, feature); + DEVID(mcb_device_id); + DEVID_FIELD(mcb_device_id, device); + DEVID(mei_cl_device_id); DEVID_FIELD(mei_cl_device_id, name); DEVID_FIELD(mei_cl_device_id, uuid); diff --git a/scripts/mod/file2alias.c b/scripts/mod/file2alias.c index b3333560b95ee..4e99393a35f15 100644 --- a/scripts/mod/file2alias.c +++ b/scripts/mod/file2alias.c @@ -1110,6 +1110,14 @@ static void do_cpu_entry(struct module *mod, void *symval) module_alias_printf(mod, false, "cpu:type:*:feature:*%04X*", feature); } +/* Looks like: mcb:16zN */ +static void do_mcb_entry(struct module *mod, void *symval) +{ + DEF_FIELD(symval, mcb_device_id, device); + + module_alias_printf(mod, false, "mcb:16z%03d", device); +} + /* Looks like: mei:S:uuid:N:* */ static void do_mei_entry(struct module *mod, void *symval) { @@ -1444,6 +1452,7 @@ static const struct devtable devtable[] = { {"mipscdmm", SIZE_mips_cdmm_device_id, do_mips_cdmm_entry}, {"x86cpu", SIZE_x86_cpu_id, do_x86cpu_entry}, {"cpu", SIZE_cpu_feature, do_cpu_entry}, + {"mcb", SIZE_mcb_device_id, do_mcb_entry}, {"mei", SIZE_mei_cl_device_id, do_mei_entry}, {"rapidio", SIZE_rio_device_id, do_rio_entry}, {"ulpi", SIZE_ulpi_device_id, do_ulpi_entry}, From 35f3e3842833ff7d661303f5a180942bbd54e3d2 Mon Sep 17 00:00:00 2001 From: Ethan Nelson-Moore Date: Wed, 10 Dec 2025 22:24:51 -0800 Subject: [PATCH 0480/1321] net: usb: sr9700: support devices with virtual driver CD Some SR9700 devices have an SPI flash chip containing a virtual driver CD, in which case they appear as a device with two interfaces and product ID 0x9702. Interface 0 is the driver CD and interface 1 is the Ethernet device. Link: https://github.com/name-kurniawan/usb-lan Link: https://www.draisberghof.de/usb_modeswitch/bb/viewtopic.php?t=2185 Signed-off-by: Ethan Nelson-Moore Link: https://patch.msgid.link/20251211062451.139036-1-enelsonmoore@gmail.com [pabeni@redhat.com: fixes link tags] Signed-off-by: Paolo Abeni --- drivers/net/usb/sr9700.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/net/usb/sr9700.c b/drivers/net/usb/sr9700.c index 091bc2aca7e8e..d8ffb59eaf348 100644 --- a/drivers/net/usb/sr9700.c +++ b/drivers/net/usb/sr9700.c @@ -539,6 +539,11 @@ static const struct usb_device_id products[] = { USB_DEVICE(0x0fe6, 0x9700), /* SR9700 device */ .driver_info = (unsigned long)&sr9700_driver_info, }, + { + /* SR9700 with virtual driver CD-ROM - interface 0 is the CD-ROM device */ + USB_DEVICE_INTERFACE_NUMBER(0x0fe6, 0x9702, 1), + .driver_info = (unsigned long)&sr9700_driver_info, + }, {}, /* END */ }; From a3c930bbed187893e17e5d3cdfc251936341e732 Mon Sep 17 00:00:00 2001 From: Jacky Chou Date: Thu, 11 Dec 2025 14:24:58 +0800 Subject: [PATCH 0481/1321] net: mdio: aspeed: add dummy read to avoid read-after-write issue The Aspeed MDIO controller may return incorrect data when a read operation follows immediately after a write. Due to a controller bug, the subsequent read can latch stale data, causing the polling logic to terminate earlier than expected. To work around this hardware issue, insert a dummy read after each write operation. This ensures that the next actual read returns the correct data and prevents premature polling exit. This workaround has been verified to stabilize MDIO transactions on affected Aspeed platforms. Fixes: f160e99462c6 ("net: phy: Add mdio-aspeed") Signed-off-by: Jacky Chou Reviewed-by: Andrew Lunn Link: https://patch.msgid.link/20251211-aspeed_mdio_add_dummy_read-v3-1-382868869004@aspeedtech.com Signed-off-by: Paolo Abeni --- drivers/net/mdio/mdio-aspeed.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/drivers/net/mdio/mdio-aspeed.c b/drivers/net/mdio/mdio-aspeed.c index e55be6dc9ae70..d6b9004c61dc1 100644 --- a/drivers/net/mdio/mdio-aspeed.c +++ b/drivers/net/mdio/mdio-aspeed.c @@ -63,6 +63,13 @@ static int aspeed_mdio_op(struct mii_bus *bus, u8 st, u8 op, u8 phyad, u8 regad, iowrite32(ctrl, ctx->base + ASPEED_MDIO_CTRL); + /* Workaround for read-after-write issue. + * The controller may return stale data if a read follows immediately + * after a write. A dummy read forces the hardware to update its + * internal state, ensuring that the next real read returns correct data. + */ + ioread32(ctx->base + ASPEED_MDIO_CTRL); + return readl_poll_timeout(ctx->base + ASPEED_MDIO_CTRL, ctrl, !(ctrl & ASPEED_MDIO_CTRL_FIRE), ASPEED_MDIO_INTERVAL_US, From 050fb685bb038e1ebc4abce4981883b5bcbb6cfd Mon Sep 17 00:00:00 2001 From: Haoxiang Li Date: Thu, 11 Dec 2025 15:37:56 +0800 Subject: [PATCH 0482/1321] fjes: Add missing iounmap in fjes_hw_init() In error paths, add fjes_hw_iounmap() to release the resource acquired by fjes_hw_iomap(). Add a goto label to do so. Fixes: 8cdc3f6c5d22 ("fjes: Hardware initialization routine") Cc: stable@vger.kernel.org Signed-off-by: Haoxiang Li Signed-off-by: Simon Horman Reviewed-by: Simon Horman Link: https://patch.msgid.link/20251211073756.101824-1-lihaoxiang@isrc.iscas.ac.cn Signed-off-by: Paolo Abeni --- drivers/net/fjes/fjes_hw.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/drivers/net/fjes/fjes_hw.c b/drivers/net/fjes/fjes_hw.c index b9b5554ea8620..5ad2673f213d6 100644 --- a/drivers/net/fjes/fjes_hw.c +++ b/drivers/net/fjes/fjes_hw.c @@ -334,7 +334,7 @@ int fjes_hw_init(struct fjes_hw *hw) ret = fjes_hw_reset(hw); if (ret) - return ret; + goto err_iounmap; fjes_hw_set_irqmask(hw, REG_ICTL_MASK_ALL, true); @@ -347,8 +347,10 @@ int fjes_hw_init(struct fjes_hw *hw) hw->max_epid = fjes_hw_get_max_epid(hw); hw->my_epid = fjes_hw_get_my_epid(hw); - if ((hw->max_epid == 0) || (hw->my_epid >= hw->max_epid)) - return -ENXIO; + if ((hw->max_epid == 0) || (hw->my_epid >= hw->max_epid)) { + ret = -ENXIO; + goto err_iounmap; + } ret = fjes_hw_setup(hw); @@ -356,6 +358,10 @@ int fjes_hw_init(struct fjes_hw *hw) hw->hw_info.trace_size = FJES_DEBUG_BUFFER_SIZE; return ret; + +err_iounmap: + fjes_hw_iounmap(hw); + return ret; } void fjes_hw_exit(struct fjes_hw *hw) From abcc6104a63eb2890450ba1d7a28e49329e3f033 Mon Sep 17 00:00:00 2001 From: Miaoqian Lin Date: Thu, 11 Dec 2025 12:13:13 +0400 Subject: [PATCH 0483/1321] net: phy: mediatek: fix nvmem cell reference leak in mt798x_phy_calibration When nvmem_cell_read() fails in mt798x_phy_calibration(), the function returns without calling nvmem_cell_put(), leaking the cell reference. Move nvmem_cell_put() right after nvmem_cell_read() to ensure the cell reference is always released regardless of the read result. Found via static analysis and code review. Fixes: 98c485eaf509 ("net: phy: add driver for MediaTek SoC built-in GE PHYs") Cc: stable@vger.kernel.org Signed-off-by: Miaoqian Lin Reviewed-by: Daniel Golle Reviewed-by: Andrew Lunn Link: https://patch.msgid.link/20251211081313.2368460-1-linmq006@gmail.com Signed-off-by: Paolo Abeni --- drivers/net/phy/mediatek/mtk-ge-soc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/phy/mediatek/mtk-ge-soc.c b/drivers/net/phy/mediatek/mtk-ge-soc.c index cd09fbf92ef23..2c4bbc236202b 100644 --- a/drivers/net/phy/mediatek/mtk-ge-soc.c +++ b/drivers/net/phy/mediatek/mtk-ge-soc.c @@ -1167,9 +1167,9 @@ static int mt798x_phy_calibration(struct phy_device *phydev) } buf = (u32 *)nvmem_cell_read(cell, &len); + nvmem_cell_put(cell); if (IS_ERR(buf)) return PTR_ERR(buf); - nvmem_cell_put(cell); if (!buf[0] || !buf[1] || !buf[2] || !buf[3] || len < 4 * sizeof(u32)) { phydev_err(phydev, "invalid efuse data\n"); From a9b6f30bff82e4fe961b9a754db1d0afffea7754 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Toke=20H=C3=B8iland-J=C3=B8rgensen?= Date: Thu, 11 Dec 2025 12:50:05 +0100 Subject: [PATCH 0484/1321] net: openvswitch: Avoid needlessly taking the RTNL on vport destroy MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The openvswitch teardown code will immediately call ovs_netdev_detach_dev() in response to a NETDEV_UNREGISTER notification. It will then start the dp_notify_work workqueue, which will later end up calling the vport destroy() callback. This callback takes the RTNL to do another ovs_netdev_detach_port(), which in this case is unnecessary. This causes extra pressure on the RTNL, in some cases leading to "unregister_netdevice: waiting for XX to become free" warnings on teardown. We can straight-forwardly avoid the extra RTNL lock acquisition by checking the device flags before taking the lock, and skip the locking altogether if the IFF_OVS_DATAPATH flag has already been unset. Fixes: b07c26511e94 ("openvswitch: fix vport-netdev unregister") Tested-by: Adrian Moreno Signed-off-by: Toke Høiland-Jørgensen Acked-by: Eelco Chaudron Acked-by: Aaron Conole Link: https://patch.msgid.link/20251211115006.228876-1-toke@redhat.com Signed-off-by: Paolo Abeni --- net/openvswitch/vport-netdev.c | 17 +++++++++++++---- 1 file changed, 13 insertions(+), 4 deletions(-) diff --git a/net/openvswitch/vport-netdev.c b/net/openvswitch/vport-netdev.c index 91a11067e4588..6574f9bcdc026 100644 --- a/net/openvswitch/vport-netdev.c +++ b/net/openvswitch/vport-netdev.c @@ -160,10 +160,19 @@ void ovs_netdev_detach_dev(struct vport *vport) static void netdev_destroy(struct vport *vport) { - rtnl_lock(); - if (netif_is_ovs_port(vport->dev)) - ovs_netdev_detach_dev(vport); - rtnl_unlock(); + /* When called from ovs_db_notify_wq() after a dp_device_event(), the + * port has already been detached, so we can avoid taking the RTNL by + * checking this first. + */ + if (netif_is_ovs_port(vport->dev)) { + rtnl_lock(); + /* Check again while holding the lock to ensure we don't race + * with the netdev notifier and detach twice. + */ + if (netif_is_ovs_port(vport->dev)) + ovs_netdev_detach_dev(vport); + rtnl_unlock(); + } call_rcu(&vport->rcu, vport_netdev_free); } From 8d7b38961d87fa86a792d39c44ced6036d17e90a Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Thu, 11 Dec 2025 17:35:50 +0000 Subject: [PATCH 0485/1321] ip6_gre: make ip6gre_header() robust Over the years, syzbot found many ways to crash the kernel in ip6gre_header() [1]. This involves team or bonding drivers ability to dynamically change their dev->needed_headroom and/or dev->hard_header_len In this particular crash mld_newpack() allocated an skb with a too small reserve/headroom, and by the time mld_sendpack() was called, syzbot managed to attach an ip6gre device. [1] skbuff: skb_under_panic: text:ffffffff8a1d69a8 len:136 put:40 head:ffff888059bc7000 data:ffff888059bc6fe8 tail:0x70 end:0x6c0 dev:team0 ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:213 ! skb_under_panic net/core/skbuff.c:223 [inline] skb_push+0xc3/0xe0 net/core/skbuff.c:2641 ip6gre_header+0xc8/0x790 net/ipv6/ip6_gre.c:1371 dev_hard_header include/linux/netdevice.h:3436 [inline] neigh_connected_output+0x286/0x460 net/core/neighbour.c:1618 neigh_output include/net/neighbour.h:556 [inline] ip6_finish_output2+0xfb3/0x1480 net/ipv6/ip6_output.c:136 __ip6_finish_output net/ipv6/ip6_output.c:-1 [inline] ip6_finish_output+0x234/0x7d0 net/ipv6/ip6_output.c:220 NF_HOOK_COND include/linux/netfilter.h:307 [inline] ip6_output+0x340/0x550 net/ipv6/ip6_output.c:247 NF_HOOK+0x9e/0x380 include/linux/netfilter.h:318 mld_sendpack+0x8d4/0xe60 net/ipv6/mcast.c:1855 mld_send_cr net/ipv6/mcast.c:2154 [inline] mld_ifc_work+0x83e/0xd60 net/ipv6/mcast.c:2693 Fixes: c12b395a4664 ("gre: Support GRE over IPv6") Reported-by: syzbot+43a2ebcf2a64b1102d64@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/693b002c.a70a0220.33cd7b.0033.GAE@google.com/T/#u Signed-off-by: Eric Dumazet Link: https://patch.msgid.link/20251211173550.2032674-1-edumazet@google.com Signed-off-by: Paolo Abeni --- net/ipv6/ip6_gre.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c index c82a75510c0e2..8bc3f05f594ed 100644 --- a/net/ipv6/ip6_gre.c +++ b/net/ipv6/ip6_gre.c @@ -1366,9 +1366,16 @@ static int ip6gre_header(struct sk_buff *skb, struct net_device *dev, { struct ip6_tnl *t = netdev_priv(dev); struct ipv6hdr *ipv6h; + int needed; __be16 *p; - ipv6h = skb_push(skb, t->hlen + sizeof(*ipv6h)); + needed = t->hlen + sizeof(*ipv6h); + if (skb_headroom(skb) < needed && + pskb_expand_head(skb, HH_DATA_ALIGN(needed - skb_headroom(skb)), + 0, GFP_ATOMIC)) + return -needed; + + ipv6h = skb_push(skb, needed); ip6_flow_hdr(ipv6h, 0, ip6_make_flowlabel(dev_net(dev), skb, t->fl.u.ip6.flowlabel, true, &t->fl.u.ip6)); From bf5de07c0d8d6fea3eb900da9330a10cdd365a0b Mon Sep 17 00:00:00 2001 From: Wang Liang Date: Fri, 12 Dec 2025 09:27:23 +0800 Subject: [PATCH 0486/1321] net/handshake: Fix null-ptr-deref in handshake_complete() A null pointer dereference in handshake_complete() was observed [1]. When handshake_req_next() return NULL in handshake_nl_accept_doit(), function handshake_complete() will be called unexpectedly which triggers this crash. Fix it by goto out_status when req is NULL. [1] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000005: 0000 [#1] SMP KASAN PTI RIP: 0010:handshake_complete+0x36/0x2b0 net/handshake/request.c:288 Call Trace: handshake_nl_accept_doit+0x32d/0x7e0 net/handshake/netlink.c:129 genl_family_rcv_msg_doit+0x204/0x300 net/netlink/genetlink.c:1115 genl_family_rcv_msg+0x436/0x670 net/netlink/genetlink.c:1195 genl_rcv_msg+0xcc/0x170 net/netlink/genetlink.c:1210 netlink_rcv_skb+0x14c/0x430 net/netlink/af_netlink.c:2550 genl_rcv+0x2d/0x40 net/netlink/genetlink.c:1219 netlink_unicast_kernel net/netlink/af_netlink.c:1318 [inline] netlink_unicast+0x878/0xb20 net/netlink/af_netlink.c:1344 netlink_sendmsg+0x897/0xd70 net/netlink/af_netlink.c:1894 sock_sendmsg_nosec net/socket.c:727 [inline] __sock_sendmsg net/socket.c:742 [inline] ____sys_sendmsg+0xa39/0xbf0 net/socket.c:2592 ___sys_sendmsg+0x121/0x1c0 net/socket.c:2646 __sys_sendmsg+0x155/0x200 net/socket.c:2678 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0x5f/0x350 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x76/0x7e Fixes: fe67b063f687 ("net/handshake: convert handshake_nl_accept_doit() to FD_PREPARE()") Reviewed-by: Chuck Lever Reported-by: Dan Carpenter Closes: https://lore.kernel.org/kernel-tls-handshake/aScekpuOYHRM9uOd@morisot.1015granger.net/T/#m7cfa5c11efc626d77622b2981591197a2acdd65e Signed-off-by: Wang Liang Reviewed-by: Simon Horman Link: https://patch.msgid.link/20251212012723.4111831-1-wangliang74@huawei.com Signed-off-by: Paolo Abeni --- net/handshake/netlink.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/net/handshake/netlink.c b/net/handshake/netlink.c index 1d33a4675a483..b989456fc4c5f 100644 --- a/net/handshake/netlink.c +++ b/net/handshake/netlink.c @@ -126,7 +126,8 @@ int handshake_nl_accept_doit(struct sk_buff *skb, struct genl_info *info) } out_complete: - handshake_complete(req, -EIO, NULL); + if (req) + handshake_complete(req, -EIO, NULL); out_status: trace_handshake_cmd_accept_err(net, req, NULL, err); return err; From e4299f1f78779679be31baf3eb8caf47cacdac59 Mon Sep 17 00:00:00 2001 From: Jiri Pirko Date: Fri, 12 Dec 2025 11:29:53 +0100 Subject: [PATCH 0487/1321] team: fix check for port enabled in team_queue_override_port_prio_changed() There has been a syzkaller bug reported recently with the following trace: list_del corruption, ffff888058bea080->prev is LIST_POISON2 (dead000000000122) ------------[ cut here ]------------ kernel BUG at lib/list_debug.c:59! Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI CPU: 3 UID: 0 PID: 21246 Comm: syz.0.2928 Not tainted syzkaller #0 PREEMPT(full) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 RIP: 0010:__list_del_entry_valid_or_report+0x13e/0x200 lib/list_debug.c:59 Code: 48 c7 c7 e0 71 f0 8b e8 30 08 ef fc 90 0f 0b 48 89 ef e8 a5 02 55 fd 48 89 ea 48 89 de 48 c7 c7 40 72 f0 8b e8 13 08 ef fc 90 <0f> 0b 48 89 ef e8 88 02 55 fd 48 89 ea 48 b8 00 00 00 00 00 fc ff RSP: 0018:ffffc9000d49f370 EFLAGS: 00010286 RAX: 000000000000004e RBX: ffff888058bea080 RCX: ffffc9002817d000 RDX: 0000000000000000 RSI: ffffffff819becc6 RDI: 0000000000000005 RBP: dead000000000122 R08: 0000000000000005 R09: 0000000000000000 R10: 0000000080000000 R11: 0000000000000001 R12: ffff888039e9c230 R13: ffff888058bea088 R14: ffff888058bea080 R15: ffff888055461480 FS: 00007fbbcfe6f6c0(0000) GS:ffff8880d6d0a000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000110c3afcb0 CR3: 00000000382c7000 CR4: 0000000000352ef0 Call Trace: __list_del_entry_valid include/linux/list.h:132 [inline] __list_del_entry include/linux/list.h:223 [inline] list_del_rcu include/linux/rculist.h:178 [inline] __team_queue_override_port_del drivers/net/team/team_core.c:826 [inline] __team_queue_override_port_del drivers/net/team/team_core.c:821 [inline] team_queue_override_port_prio_changed drivers/net/team/team_core.c:883 [inline] team_priority_option_set+0x171/0x2f0 drivers/net/team/team_core.c:1534 team_option_set drivers/net/team/team_core.c:376 [inline] team_nl_options_set_doit+0x8ae/0xe60 drivers/net/team/team_core.c:2653 genl_family_rcv_msg_doit+0x209/0x2f0 net/netlink/genetlink.c:1115 genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline] genl_rcv_msg+0x55c/0x800 net/netlink/genetlink.c:1210 netlink_rcv_skb+0x158/0x420 net/netlink/af_netlink.c:2552 genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219 netlink_unicast_kernel net/netlink/af_netlink.c:1320 [inline] netlink_unicast+0x5aa/0x870 net/netlink/af_netlink.c:1346 netlink_sendmsg+0x8c8/0xdd0 net/netlink/af_netlink.c:1896 sock_sendmsg_nosec net/socket.c:727 [inline] __sock_sendmsg net/socket.c:742 [inline] ____sys_sendmsg+0xa98/0xc70 net/socket.c:2630 ___sys_sendmsg+0x134/0x1d0 net/socket.c:2684 __sys_sendmsg+0x16d/0x220 net/socket.c:2716 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xcd/0xfa0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f The problem is in this flow: 1) Port is enabled, queue_id != 0, in qom_list 2) Port gets disabled -> team_port_disable() -> team_queue_override_port_del() -> del (removed from list) 3) Port is disabled, queue_id != 0, not in any list 4) Priority changes -> team_queue_override_port_prio_changed() -> checks: port disabled && queue_id != 0 -> calls del - hits the BUG as it is removed already To fix this, change the check in team_queue_override_port_prio_changed() so it returns early if port is not enabled. Reported-by: syzbot+422806e5f4cce722a71f@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=422806e5f4cce722a71f Fixes: 6c31ff366c11 ("team: remove synchronize_rcu() called during queue override change") Signed-off-by: Jiri Pirko Reviewed-by: Simon Horman Link: https://patch.msgid.link/20251212102953.167287-1-jiri@resnulli.us Signed-off-by: Paolo Abeni --- drivers/net/team/team_core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c index 4d5c9ae8f2219..c08a5c1bd6e4d 100644 --- a/drivers/net/team/team_core.c +++ b/drivers/net/team/team_core.c @@ -878,7 +878,7 @@ static void __team_queue_override_enabled_check(struct team *team) static void team_queue_override_port_prio_changed(struct team *team, struct team_port *port) { - if (!port->queue_id || team_port_enabled(port)) + if (!port->queue_id || !team_port_enabled(port)) return; __team_queue_override_port_del(team, port); __team_queue_override_port_add(team, port); From 338fda8244c6a1cbaec93108a9f422985a075692 Mon Sep 17 00:00:00 2001 From: Paolo Abeni Date: Fri, 12 Dec 2025 13:54:03 +0100 Subject: [PATCH 0488/1321] mptcp: fallback earlier on simult connection Syzkaller reports a simult-connect race leading to inconsistent fallback status: WARNING: CPU: 3 PID: 33 at net/mptcp/subflow.c:1515 subflow_data_ready+0x40b/0x7c0 net/mptcp/subflow.c:1515 Modules linked in: CPU: 3 UID: 0 PID: 33 Comm: ksoftirqd/3 Not tainted syzkaller #0 PREEMPT(full) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 RIP: 0010:subflow_data_ready+0x40b/0x7c0 net/mptcp/subflow.c:1515 Code: 89 ee e8 78 61 3c f6 40 84 ed 75 21 e8 8e 66 3c f6 44 89 fe bf 07 00 00 00 e8 c1 61 3c f6 41 83 ff 07 74 09 e8 76 66 3c f6 90 <0f> 0b 90 e8 6d 66 3c f6 48 89 df e8 e5 ad ff ff 31 ff 89 c5 89 c6 RSP: 0018:ffffc900006cf338 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff888031acd100 RCX: ffffffff8b7f2abf RDX: ffff88801e6ea440 RSI: ffffffff8b7f2aca RDI: 0000000000000005 RBP: 0000000000000000 R08: 0000000000000005 R09: 0000000000000007 R10: 0000000000000004 R11: 0000000000002c10 R12: ffff88802ba69900 R13: 1ffff920000d9e67 R14: ffff888046f81800 R15: 0000000000000004 FS: 0000000000000000(0000) GS:ffff8880d69bc000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000560fc0ca1670 CR3: 0000000032c3a000 CR4: 0000000000352ef0 Call Trace: tcp_data_queue+0x13b0/0x4f90 net/ipv4/tcp_input.c:5197 tcp_rcv_state_process+0xfdf/0x4ec0 net/ipv4/tcp_input.c:6922 tcp_v6_do_rcv+0x492/0x1740 net/ipv6/tcp_ipv6.c:1672 tcp_v6_rcv+0x2976/0x41e0 net/ipv6/tcp_ipv6.c:1918 ip6_protocol_deliver_rcu+0x188/0x1520 net/ipv6/ip6_input.c:438 ip6_input_finish+0x1e4/0x4b0 net/ipv6/ip6_input.c:489 NF_HOOK include/linux/netfilter.h:318 [inline] NF_HOOK include/linux/netfilter.h:312 [inline] ip6_input+0x105/0x2f0 net/ipv6/ip6_input.c:500 dst_input include/net/dst.h:471 [inline] ip6_rcv_finish net/ipv6/ip6_input.c:79 [inline] NF_HOOK include/linux/netfilter.h:318 [inline] NF_HOOK include/linux/netfilter.h:312 [inline] ipv6_rcv+0x264/0x650 net/ipv6/ip6_input.c:311 __netif_receive_skb_one_core+0x12d/0x1e0 net/core/dev.c:5979 __netif_receive_skb+0x1d/0x160 net/core/dev.c:6092 process_backlog+0x442/0x15e0 net/core/dev.c:6444 __napi_poll.constprop.0+0xba/0x550 net/core/dev.c:7494 napi_poll net/core/dev.c:7557 [inline] net_rx_action+0xa9f/0xfe0 net/core/dev.c:7684 handle_softirqs+0x216/0x8e0 kernel/softirq.c:579 run_ksoftirqd kernel/softirq.c:968 [inline] run_ksoftirqd+0x3a/0x60 kernel/softirq.c:960 smpboot_thread_fn+0x3f7/0xae0 kernel/smpboot.c:160 kthread+0x3c2/0x780 kernel/kthread.c:463 ret_from_fork+0x5d7/0x6f0 arch/x86/kernel/process.c:148 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245 The TCP subflow can process the simult-connect syn-ack packet after transitioning to TCP_FIN1 state, bypassing the MPTCP fallback check, as the sk_state_change() callback is not invoked for * -> FIN_WAIT1 transitions. That will move the msk socket to an inconsistent status and the next incoming data will hit the reported splat. Close the race moving the simult-fallback check at the earliest possible stage - that is at syn-ack generation time. About the fixes tags: [2] was supposed to also fix this issue introduced by [3]. [1] is required as a dependence: it was not explicitly marked as a fix, but it is one and it has already been backported before [3]. In other words, this commit should be backported up to [3], including [2] and [1] if that's not already there. Fixes: 23e89e8ee7be ("tcp: Don't drop SYN+ACK for simultaneous connect().") [1] Fixes: 4fd19a307016 ("mptcp: fix inconsistent state on fastopen race") [2] Fixes: 1e777f39b4d7 ("mptcp: add MSG_FASTOPEN sendmsg flag support") [3] Cc: stable@vger.kernel.org Reported-by: syzbot+0ff6b771b4f7a5bce83b@syzkaller.appspotmail.com Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/586 Signed-off-by: Paolo Abeni Reviewed-by: Matthieu Baerts (NGI0) Signed-off-by: Matthieu Baerts (NGI0) Link: https://patch.msgid.link/20251212-net-mptcp-subflow_data_ready-warn-v1-1-d1f9fd1c36c8@kernel.org Signed-off-by: Paolo Abeni --- net/mptcp/options.c | 10 ++++++++++ net/mptcp/protocol.h | 6 ++---- net/mptcp/subflow.c | 6 ------ 3 files changed, 12 insertions(+), 10 deletions(-) diff --git a/net/mptcp/options.c b/net/mptcp/options.c index f24ae7d40e883..43df4293f58bf 100644 --- a/net/mptcp/options.c +++ b/net/mptcp/options.c @@ -408,6 +408,16 @@ bool mptcp_syn_options(struct sock *sk, const struct sk_buff *skb, */ subflow->snd_isn = TCP_SKB_CB(skb)->end_seq; if (subflow->request_mptcp) { + if (unlikely(subflow_simultaneous_connect(sk))) { + WARN_ON_ONCE(!mptcp_try_fallback(sk, MPTCP_MIB_SIMULTCONNFALLBACK)); + + /* Ensure mptcp_finish_connect() will not process the + * MPC handshake. + */ + subflow->request_mptcp = 0; + return false; + } + opts->suboptions = OPTION_MPTCP_MPC_SYN; opts->csum_reqd = mptcp_is_checksum_enabled(sock_net(sk)); opts->allow_join_id0 = mptcp_allow_join_id0(sock_net(sk)); diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h index 9c0d17876b22f..bed0c9aa28b61 100644 --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -1337,10 +1337,8 @@ static inline bool subflow_simultaneous_connect(struct sock *sk) { struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk); - return (1 << sk->sk_state) & - (TCPF_ESTABLISHED | TCPF_FIN_WAIT1 | TCPF_FIN_WAIT2 | TCPF_CLOSING) && - is_active_ssk(subflow) && - !subflow->conn_finished; + /* Note that the sk state implies !subflow->conn_finished. */ + return sk->sk_state == TCP_SYN_RECV && is_active_ssk(subflow); } #ifdef CONFIG_SYN_COOKIES diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index 86ce58ae533df..96d54cb2cd93f 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -1878,12 +1878,6 @@ static void subflow_state_change(struct sock *sk) __subflow_state_change(sk); - if (subflow_simultaneous_connect(sk)) { - WARN_ON_ONCE(!mptcp_try_fallback(sk, MPTCP_MIB_SIMULTCONNFALLBACK)); - subflow->conn_finished = 1; - mptcp_propagate_state(parent, sk, subflow, NULL); - } - /* as recvmsg() does not acquire the subflow socket for ssk selection * a fin packet carrying a DSS can be unnoticed if we don't trigger * the data available machinery here. From 88ae91d08feb1f1289ea838abda8434fb60cb753 Mon Sep 17 00:00:00 2001 From: Paolo Abeni Date: Fri, 12 Dec 2025 13:54:04 +0100 Subject: [PATCH 0489/1321] mptcp: ensure context reset on disconnect() After the blamed commit below, if the MPC subflow is already in TCP_CLOSE status or has fallback to TCP at mptcp_disconnect() time, mptcp_do_fastclose() skips setting the `send_fastclose flag` and the later __mptcp_close_ssk() does not reset anymore the related subflow context. Any later connection will be created with both the `request_mptcp` flag and the msk-level fallback status off (it is unconditionally cleared at MPTCP disconnect time), leading to a warning in subflow_data_ready(): WARNING: CPU: 26 PID: 8996 at net/mptcp/subflow.c:1519 subflow_data_ready (net/mptcp/subflow.c:1519 (discriminator 13)) Modules linked in: CPU: 26 UID: 0 PID: 8996 Comm: syz.22.39 Not tainted 6.18.0-rc7-05427-g11fc074f6c36 #1 PREEMPT(voluntary) Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 RIP: 0010:subflow_data_ready (net/mptcp/subflow.c:1519 (discriminator 13)) Code: 90 0f 0b 90 90 e9 04 fe ff ff e8 b7 1e f5 fe 89 ee bf 07 00 00 00 e8 db 19 f5 fe 83 fd 07 0f 84 35 ff ff ff e8 9d 1e f5 fe 90 <0f> 0b 90 e9 27 ff ff ff e8 8f 1e f5 fe 4c 89 e7 48 89 de e8 14 09 RSP: 0018:ffffc9002646fb30 EFLAGS: 00010293 RAX: 0000000000000000 RBX: ffff88813b218000 RCX: ffffffff825c8435 RDX: ffff8881300b3580 RSI: ffffffff825c8443 RDI: 0000000000000005 RBP: 000000000000000b R08: ffffffff825c8435 R09: 000000000000000b R10: 0000000000000005 R11: 0000000000000007 R12: ffff888131ac0000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 FS: 00007f88330af6c0(0000) GS:ffff888a93dd2000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f88330aefe8 CR3: 000000010ff59000 CR4: 0000000000350ef0 Call Trace: tcp_data_ready (net/ipv4/tcp_input.c:5356) tcp_data_queue (net/ipv4/tcp_input.c:5445) tcp_rcv_state_process (net/ipv4/tcp_input.c:7165) tcp_v4_do_rcv (net/ipv4/tcp_ipv4.c:1955) __release_sock (include/net/sock.h:1158 (discriminator 6) net/core/sock.c:3180 (discriminator 6)) release_sock (net/core/sock.c:3737) mptcp_sendmsg (net/mptcp/protocol.c:1763 net/mptcp/protocol.c:1857) inet_sendmsg (net/ipv4/af_inet.c:853 (discriminator 7)) __sys_sendto (net/socket.c:727 (discriminator 15) net/socket.c:742 (discriminator 15) net/socket.c:2244 (discriminator 15)) __x64_sys_sendto (net/socket.c:2247) do_syscall_64 (arch/x86/entry/syscall_64.c:63 (discriminator 1) arch/x86/entry/syscall_64.c:94 (discriminator 1)) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) RIP: 0033:0x7f883326702d Address the issue setting an explicit `fastclosing` flag at fastclose time, and checking such flag after mptcp_do_fastclose(). Fixes: ae155060247b ("mptcp: fix duplicate reset on fastclose") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni Reviewed-by: Matthieu Baerts (NGI0) Signed-off-by: Matthieu Baerts (NGI0) Link: https://patch.msgid.link/20251212-net-mptcp-subflow_data_ready-warn-v1-2-d1f9fd1c36c8@kernel.org Signed-off-by: Paolo Abeni --- net/mptcp/protocol.c | 8 +++++--- net/mptcp/protocol.h | 3 ++- 2 files changed, 7 insertions(+), 4 deletions(-) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 9b1fafd87cb94..f505b780f7139 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -2467,10 +2467,10 @@ bool __mptcp_retransmit_pending_data(struct sock *sk) */ static void __mptcp_subflow_disconnect(struct sock *ssk, struct mptcp_subflow_context *subflow, - unsigned int flags) + bool fastclosing) { if (((1 << ssk->sk_state) & (TCPF_CLOSE | TCPF_LISTEN)) || - subflow->send_fastclose) { + fastclosing) { /* The MPTCP code never wait on the subflow sockets, TCP-level * disconnect should never fail */ @@ -2538,7 +2538,7 @@ static void __mptcp_close_ssk(struct sock *sk, struct sock *ssk, need_push = (flags & MPTCP_CF_PUSH) && __mptcp_retransmit_pending_data(sk); if (!dispose_it) { - __mptcp_subflow_disconnect(ssk, subflow, flags); + __mptcp_subflow_disconnect(ssk, subflow, msk->fastclosing); release_sock(ssk); goto out; @@ -2884,6 +2884,7 @@ static void mptcp_do_fastclose(struct sock *sk) mptcp_set_state(sk, TCP_CLOSE); mptcp_backlog_purge(sk); + msk->fastclosing = 1; /* Explicitly send the fastclose reset as need */ if (__mptcp_check_fallback(msk)) @@ -3418,6 +3419,7 @@ static int mptcp_disconnect(struct sock *sk, int flags) msk->bytes_sent = 0; msk->bytes_retrans = 0; msk->rcvspace_init = 0; + msk->fastclosing = 0; /* for fallback's sake */ WRITE_ONCE(msk->ack_seq, 0); diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h index bed0c9aa28b61..66e9735007912 100644 --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -320,7 +320,8 @@ struct mptcp_sock { fastopening:1, in_accept_queue:1, free_first:1, - rcvspace_init:1; + rcvspace_init:1, + fastclosing:1; u32 notsent_lowat; int keepalive_cnt; int keepalive_idle; From aee12ce9e689c01bc5cca29bf509f3a2681f9ccb Mon Sep 17 00:00:00 2001 From: Frode Nordahl Date: Sat, 13 Dec 2025 10:13:36 +0000 Subject: [PATCH 0490/1321] erspan: Initialize options_len before referencing options. The struct ip_tunnel_info has a flexible array member named options that is protected by a counted_by(options_len) attribute. The compiler will use this information to enforce runtime bounds checking deployed by FORTIFY_SOURCE string helpers. As laid out in the GCC documentation, the counter must be initialized before the first reference to the flexible array member. After scanning through the files that use struct ip_tunnel_info and also refer to options or options_len, it appears the normal case is to use the ip_tunnel_info_opts_set() helper. Said helper would initialize options_len properly before copying data into options, however in the GRE ERSPAN code a partial update is done, preventing the use of the helper function. Before this change the handling of ERSPAN traffic in GRE tunnels would cause a kernel panic when the kernel is compiled with GCC 15+ and having FORTIFY_SOURCE configured: memcpy: detected buffer overflow: 4 byte write of buffer size 0 Call Trace: __fortify_panic+0xd/0xf erspan_rcv.cold+0x68/0x83 ? ip_route_input_slow+0x816/0x9d0 gre_rcv+0x1b2/0x1c0 gre_rcv+0x8e/0x100 ? raw_v4_input+0x2a0/0x2b0 ip_protocol_deliver_rcu+0x1ea/0x210 ip_local_deliver_finish+0x86/0x110 ip_local_deliver+0x65/0x110 ? ip_rcv_finish_core+0xd6/0x360 ip_rcv+0x186/0x1a0 Cc: stable@vger.kernel.org Link: https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attributes.html#index-counted_005fby-variable-attribute Reported-at: https://launchpad.net/bugs/2129580 Fixes: bb5e62f2d547 ("net: Add options as a flexible array to struct ip_tunnel_info") Signed-off-by: Frode Nordahl Reviewed-by: Simon Horman Link: https://patch.msgid.link/20251213101338.4693-1-fnordahl@ubuntu.com Signed-off-by: Paolo Abeni --- net/ipv4/ip_gre.c | 6 ++++-- net/ipv6/ip6_gre.c | 6 ++++-- 2 files changed, 8 insertions(+), 4 deletions(-) diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c index 761a53c6a89a6..8178c44a3cdd4 100644 --- a/net/ipv4/ip_gre.c +++ b/net/ipv4/ip_gre.c @@ -330,6 +330,10 @@ static int erspan_rcv(struct sk_buff *skb, struct tnl_ptk_info *tpi, if (!tun_dst) return PACKET_REJECT; + /* MUST set options_len before referencing options */ + info = &tun_dst->u.tun_info; + info->options_len = sizeof(*md); + /* skb can be uncloned in __iptunnel_pull_header, so * old pkt_md is no longer valid and we need to reset * it @@ -344,10 +348,8 @@ static int erspan_rcv(struct sk_buff *skb, struct tnl_ptk_info *tpi, memcpy(md2, pkt_md, ver == 1 ? ERSPAN_V1_MDSIZE : ERSPAN_V2_MDSIZE); - info = &tun_dst->u.tun_info; __set_bit(IP_TUNNEL_ERSPAN_OPT_BIT, info->key.tun_flags); - info->options_len = sizeof(*md); } skb_reset_mac_header(skb); diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c index 8bc3f05f594ed..d19d86ed43766 100644 --- a/net/ipv6/ip6_gre.c +++ b/net/ipv6/ip6_gre.c @@ -535,6 +535,10 @@ static int ip6erspan_rcv(struct sk_buff *skb, if (!tun_dst) return PACKET_REJECT; + /* MUST set options_len before referencing options */ + info = &tun_dst->u.tun_info; + info->options_len = sizeof(*md); + /* skb can be uncloned in __iptunnel_pull_header, so * old pkt_md is no longer valid and we need to reset * it @@ -543,7 +547,6 @@ static int ip6erspan_rcv(struct sk_buff *skb, skb_network_header_len(skb); pkt_md = (struct erspan_metadata *)(gh + gre_hdr_len + sizeof(*ershdr)); - info = &tun_dst->u.tun_info; md = ip_tunnel_info_opts(info); md->version = ver; md2 = &md->u.md2; @@ -551,7 +554,6 @@ static int ip6erspan_rcv(struct sk_buff *skb, ERSPAN_V2_MDSIZE); __set_bit(IP_TUNNEL_ERSPAN_OPT_BIT, info->key.tun_flags); - info->options_len = sizeof(*md); ip6_tnl_rcv(tunnel, skb, tpi, tun_dst, log_ecn_error); From 67e2a2cae3a6354f7a57a51aeee6dfde1b1df74e Mon Sep 17 00:00:00 2001 From: Lorenzo Bianconi Date: Sun, 14 Dec 2025 10:30:07 +0100 Subject: [PATCH 0491/1321] net: airoha: Move net_devs registration in a dedicated routine Since airoha_probe() is not executed under rtnl lock, there is small race where a given device is configured by user-space while the remaining ones are not completely loaded from the dts yet. This condition will allow a hw device misconfiguration since there are some conditions (e.g. GDM2 check in airoha_dev_init()) that require all device are properly loaded from the device tree. Fix the issue moving net_devices registration at the end of the airoha_probe routine. Fixes: 9cd451d414f6e ("net: airoha: Add loopback support for GDM2") Signed-off-by: Lorenzo Bianconi Reviewed-by: Simon Horman Link: https://patch.msgid.link/20251214-airoha-fix-dev-registration-v1-1-860e027ad4c6@kernel.org Signed-off-by: Paolo Abeni --- drivers/net/ethernet/airoha/airoha_eth.c | 39 ++++++++++++++++-------- 1 file changed, 26 insertions(+), 13 deletions(-) diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c index 75893c90a0a17..315d97036ac1d 100644 --- a/drivers/net/ethernet/airoha/airoha_eth.c +++ b/drivers/net/ethernet/airoha/airoha_eth.c @@ -2924,19 +2924,26 @@ static int airoha_alloc_gdm_port(struct airoha_eth *eth, port->id = id; eth->ports[p] = port; - err = airoha_metadata_dst_alloc(port); - if (err) - return err; + return airoha_metadata_dst_alloc(port); +} - err = register_netdev(dev); - if (err) - goto free_metadata_dst; +static int airoha_register_gdm_devices(struct airoha_eth *eth) +{ + int i; - return 0; + for (i = 0; i < ARRAY_SIZE(eth->ports); i++) { + struct airoha_gdm_port *port = eth->ports[i]; + int err; -free_metadata_dst: - airoha_metadata_dst_free(port); - return err; + if (!port) + continue; + + err = register_netdev(port->dev); + if (err) + return err; + } + + return 0; } static int airoha_probe(struct platform_device *pdev) @@ -3027,6 +3034,10 @@ static int airoha_probe(struct platform_device *pdev) } } + err = airoha_register_gdm_devices(eth); + if (err) + goto error_napi_stop; + return 0; error_napi_stop: @@ -3040,10 +3051,12 @@ static int airoha_probe(struct platform_device *pdev) for (i = 0; i < ARRAY_SIZE(eth->ports); i++) { struct airoha_gdm_port *port = eth->ports[i]; - if (port && port->dev->reg_state == NETREG_REGISTERED) { + if (!port) + continue; + + if (port->dev->reg_state == NETREG_REGISTERED) unregister_netdev(port->dev); - airoha_metadata_dst_free(port); - } + airoha_metadata_dst_free(port); } free_netdev(eth->napi_dev); platform_set_drvdata(pdev, NULL); From b93d095d558d791a71aa2f0f0bdebfb83fb9273e Mon Sep 17 00:00:00 2001 From: Vladimir Oltean Date: Mon, 15 Dec 2025 17:02:35 +0200 Subject: [PATCH 0492/1321] net: dsa: properly keep track of conduit reference Problem description ------------------- DSA has a mumbo-jumbo of reference handling of the conduit net device and its kobject which, sadly, is just wrong and doesn't make sense. There are two distinct problems. 1. The OF path, which uses of_find_net_device_by_node(), never releases the elevated refcount on the conduit's kobject. Nominally, the OF and non-OF paths should result in objects having identical reference counts taken, and it is already suspicious that dsa_dev_to_net_device() has a put_device() call which is missing in dsa_port_parse_of(), but we can actually even verify that an issue exists. With CONFIG_DEBUG_KOBJECT_RELEASE=y, if we run this command "before" and "after" applying this patch: (unbind the conduit driver for net device eno2) echo 0000:00:00.2 > /sys/bus/pci/drivers/fsl_enetc/unbind we see these lines in the output diff which appear only with the patch applied: kobject: 'eno2' (ffff002009a3a6b8): kobject_release, parent 0000000000000000 (delayed 1000) kobject: '109' (ffff0020099d59a0): kobject_release, parent 0000000000000000 (delayed 1000) 2. After we find the conduit interface one way (OF) or another (non-OF), it can get unregistered at any time, and DSA remains with a long-lived, but in this case stale, cpu_dp->conduit pointer. Holding the net device's underlying kobject isn't actually of much help, it just prevents it from being freed (but we never need that kobject directly). What helps us to prevent the net device from being unregistered is the parallel netdev reference mechanism (dev_hold() and dev_put()). Actually we actually use that netdev tracker mechanism implicitly on user ports since commit 2f1e8ea726e9 ("net: dsa: link interfaces with the DSA master to get rid of lockdep warnings"), via netdev_upper_dev_link(). But time still passes at DSA switch probe time between the initial of_find_net_device_by_node() code and the user port creation time, time during which the conduit could unregister itself and DSA wouldn't know about it. So we have to run of_find_net_device_by_node() under rtnl_lock() to prevent that from happening, and release the lock only with the netdev tracker having acquired the reference. Do we need to keep the reference until dsa_unregister_switch() / dsa_switch_shutdown()? 1: Maybe yes. A switch device will still be registered even if all user ports failed to probe, see commit 86f8b1c01a0a ("net: dsa: Do not make user port errors fatal"), and the cpu_dp->conduit pointers remain valid. I haven't audited all call paths to see whether they will actually use the conduit in lack of any user port, but if they do, it seems safer to not rely on user ports for that reference. 2. Definitely yes. We support changing the conduit which a user port is associated to, and we can get into a situation where we've moved all user ports away from a conduit, thus no longer hold any reference to it via the net device tracker. But we shouldn't let it go nonetheless - see the next change in relation to dsa_tree_find_first_conduit() and LAG conduits which disappear. We have to be prepared to return to the physical conduit, so the CPU port must explicitly keep another reference to it. This is also to say: the user ports and their CPU ports may not always keep a reference to the same conduit net device, and both are needed. As for the conduit's kobject for the /sys/class/net/ entry, we don't care about it, we can release it as soon as we hold the net device object itself. History and blame attribution ----------------------------- The code has been refactored so many times, it is very difficult to follow and properly attribute a blame, but I'll try to make a short history which I hope to be correct. We have two distinct probing paths: - one for OF, introduced in 2016 in commit 83c0afaec7b7 ("net: dsa: Add new binding implementation") - one for non-OF, introduced in 2017 in commit 71e0bbde0d88 ("net: dsa: Add support for platform data") These are both complete rewrites of the original probing paths (which used struct dsa_switch_driver and other weird stuff, instead of regular devices on their respective buses for register access, like MDIO, SPI, I2C etc): - one for OF, introduced in 2013 in commit 5e95329b701c ("dsa: add device tree bindings to register DSA switches") - one for non-OF, introduced in 2008 in commit 91da11f870f0 ("net: Distributed Switch Architecture protocol support") except for tiny bits and pieces like dsa_dev_to_net_device() which were seemingly carried over since the original commit, and used to this day. The point is that the original probing paths received a fix in 2015 in the form of commit 679fb46c5785 ("net: dsa: Add missing master netdev dev_put() calls"), but the fix never made it into the "new" (dsa2) probing paths that can still be traced to today, and the fixed probing path was later deleted in 2019 in commit 93e86b3bc842 ("net: dsa: Remove legacy probing support"). That is to say, the new probing paths were never quite correct in this area. The existence of the legacy probing support which was deleted in 2019 explains why dsa_dev_to_net_device() returns a conduit with elevated refcount (because it was supposed to be released during dsa_remove_dst()). After the removal of the legacy code, the only user of dsa_dev_to_net_device() calls dev_put(conduit) immediately after this function returns. This pattern makes no sense today, and can only be interpreted historically to understand why dev_hold() was there in the first place. Change details -------------- Today we have a better netdev tracking infrastructure which we should use. Logically netdev_hold() belongs in common code (dsa_port_parse_cpu(), where dp->conduit is assigned), but there is a tradeoff to be made with the rtnl_lock() section which would become a bit too long if we did that - dsa_port_parse_cpu() also calls request_module(). So we duplicate a bit of logic in order for the callers of dsa_port_parse_cpu() to be the ones responsible of holding the conduit reference and releasing it on error. This shortens the rtnl_lock() section significantly. In the dsa_switch_probe() error path, dsa_switch_release_ports() will be called in a number of situations, one being where dsa_port_parse_cpu() maybe didn't get the chance to run at all (a different port failed earlier, etc). So we have to test for the conduit being NULL prior to calling netdev_put(). There have still been so many transformations to the code since the blamed commits (rename master -> conduit, commit 0650bf52b31f ("net: dsa: be compatible with masters which unregister on shutdown")), that it only makes sense to fix the code using the best methods available today and see how it can be backported to stable later. I suspect the fix cannot even be backported to kernels which lack dsa_switch_shutdown(), and I suspect this is also maybe why the long-lived conduit reference didn't make it into the new DSA probing paths at the time (problems during shutdown). Because dsa_dev_to_net_device() has a single call site and has to be changed anyway, the logic was just absorbed into the non-OF dsa_port_parse(). Tested on the ocelot/felix switch and on dsa_loop, both on the NXP LS1028A with CONFIG_DEBUG_KOBJECT_RELEASE=y. Reported-by: Ma Ke Closes: https://lore.kernel.org/netdev/20251214131204.4684-1-make24@iscas.ac.cn/ Fixes: 83c0afaec7b7 ("net: dsa: Add new binding implementation") Fixes: 71e0bbde0d88 ("net: dsa: Add support for platform data") Reviewed-by: Jonas Gorski Signed-off-by: Vladimir Oltean Link: https://patch.msgid.link/20251215150236.3931670-1-vladimir.oltean@nxp.com Signed-off-by: Paolo Abeni --- include/net/dsa.h | 1 + net/dsa/dsa.c | 59 +++++++++++++++++++++++++++-------------------- 2 files changed, 35 insertions(+), 25 deletions(-) diff --git a/include/net/dsa.h b/include/net/dsa.h index cced1a8667578..6b2b5ed64ea4c 100644 --- a/include/net/dsa.h +++ b/include/net/dsa.h @@ -302,6 +302,7 @@ struct dsa_port { struct devlink_port devlink_port; struct phylink *pl; struct phylink_config pl_config; + netdevice_tracker conduit_tracker; struct dsa_lag *lag; struct net_device *hsr_dev; diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c index a20efabe778fc..50b3fceb5c04d 100644 --- a/net/dsa/dsa.c +++ b/net/dsa/dsa.c @@ -1253,14 +1253,25 @@ static int dsa_port_parse_of(struct dsa_port *dp, struct device_node *dn) if (ethernet) { struct net_device *conduit; const char *user_protocol; + int err; + rtnl_lock(); conduit = of_find_net_device_by_node(ethernet); of_node_put(ethernet); - if (!conduit) + if (!conduit) { + rtnl_unlock(); return -EPROBE_DEFER; + } + + netdev_hold(conduit, &dp->conduit_tracker, GFP_KERNEL); + put_device(&conduit->dev); + rtnl_unlock(); user_protocol = of_get_property(dn, "dsa-tag-protocol", NULL); - return dsa_port_parse_cpu(dp, conduit, user_protocol); + err = dsa_port_parse_cpu(dp, conduit, user_protocol); + if (err) + netdev_put(conduit, &dp->conduit_tracker); + return err; } if (link) @@ -1393,37 +1404,30 @@ static struct device *dev_find_class(struct device *parent, char *class) return device_find_child(parent, class, dev_is_class); } -static struct net_device *dsa_dev_to_net_device(struct device *dev) -{ - struct device *d; - - d = dev_find_class(dev, "net"); - if (d != NULL) { - struct net_device *nd; - - nd = to_net_dev(d); - dev_hold(nd); - put_device(d); - - return nd; - } - - return NULL; -} - static int dsa_port_parse(struct dsa_port *dp, const char *name, struct device *dev) { if (!strcmp(name, "cpu")) { struct net_device *conduit; + struct device *d; + int err; - conduit = dsa_dev_to_net_device(dev); - if (!conduit) + rtnl_lock(); + d = dev_find_class(dev, "net"); + if (!d) { + rtnl_unlock(); return -EPROBE_DEFER; + } - dev_put(conduit); + conduit = to_net_dev(d); + netdev_hold(conduit, &dp->conduit_tracker, GFP_KERNEL); + put_device(d); + rtnl_unlock(); - return dsa_port_parse_cpu(dp, conduit, NULL); + err = dsa_port_parse_cpu(dp, conduit, NULL); + if (err) + netdev_put(conduit, &dp->conduit_tracker); + return err; } if (!strcmp(name, "dsa")) @@ -1491,6 +1495,9 @@ static void dsa_switch_release_ports(struct dsa_switch *ds) struct dsa_vlan *v, *n; dsa_switch_for_each_port_safe(dp, next, ds) { + if (dsa_port_is_cpu(dp) && dp->conduit) + netdev_put(dp->conduit, &dp->conduit_tracker); + /* These are either entries that upper layers lost track of * (probably due to bugs), or installed through interfaces * where one does not necessarily have to remove them, like @@ -1635,8 +1642,10 @@ void dsa_switch_shutdown(struct dsa_switch *ds) /* Disconnect from further netdevice notifiers on the conduit, * since netdev_uses_dsa() will now return false. */ - dsa_switch_for_each_cpu_port(dp, ds) + dsa_switch_for_each_cpu_port(dp, ds) { dp->conduit->dsa_ptr = NULL; + netdev_put(dp->conduit, &dp->conduit_tracker); + } rtnl_unlock(); out: From b9662da162634887764d2aa723626494bc71cc5c Mon Sep 17 00:00:00 2001 From: Vladimir Oltean Date: Mon, 15 Dec 2025 17:02:36 +0200 Subject: [PATCH 0493/1321] net: dsa: fix missing put_device() in dsa_tree_find_first_conduit() of_find_net_device_by_node() searches net devices by their /sys/class/net/, entry. It is documented in its kernel-doc that: * If successful, returns a pointer to the net_device with the embedded * struct device refcount incremented by one, or NULL on failure. The * refcount must be dropped when done with the net_device. We are missing a put_device(&conduit->dev) which we could place at the end of dsa_tree_find_first_conduit(). But to explain why calling put_device() right away is safe is the same as to explain why the chosen solution is different. The code is very poorly split: dsa_tree_find_first_conduit() was first introduced in commit 95f510d0b792 ("net: dsa: allow the DSA master to be seen and changed through rtnetlink") but was first used several commits later, in commit acc43b7bf52a ("net: dsa: allow masters to join a LAG"). Assume there is a switch with 2 CPU ports and 2 conduits, eno2 and eno3. When we create a LAG (bonding or team device) and place eno2 and eno3 beneath it, we create a 3rd conduit (the LAG device itself), but this is slightly different than the first two. Namely, the cpu_dp->conduit pointer of the CPU ports does not change, and remains pointing towards the physical Ethernet controllers which are now LAG ports. Only 2 things change: - the LAG device has a dev->dsa_ptr which marks it as a DSA conduit - dsa_port_to_conduit(user port) finds the LAG and not the physical conduit, because of the dp->cpu_port_in_lag bit being set. When the LAG device is destroyed, dsa_tree_migrate_ports_from_lag_conduit() is called and this is where dsa_tree_find_first_conduit() kicks in. This is the logical mistake and the reason why introducing code in one patch and using it from another is bad practice. I didn't realize that I don't have to call of_find_net_device_by_node() again; the cpu_dp->conduit association was never undone, and is still available for direct (re)use. There's only one concern - maybe the conduit disappeared in the meantime, but the netdev_hold() call we made during dsa_port_parse_cpu() (see previous change) ensures that this was not the case. Therefore, fixing the code means reimplementing it in the simplest way. I am blaming the time of use, since this is what "git blame" would show if we were to monitor for the conduit's kobject's refcount remaining elevated instead of being freed. Tested on the NXP LS1028A, using the steps from Documentation/networking/dsa/configuration.rst section "Affinity of user ports to CPU ports", followed by (extra prints added by me): $ ip link del bond0 mscc_felix 0000:00:00.5 swp3: Link is Down bond0 (unregistering): (slave eno2): Releasing backup interface fsl_enetc 0000:00:00.2 eno2: Link is Down mscc_felix 0000:00:00.5 swp0: bond0 disappeared, migrating to eno2 mscc_felix 0000:00:00.5 swp1: bond0 disappeared, migrating to eno2 mscc_felix 0000:00:00.5 swp2: bond0 disappeared, migrating to eno2 mscc_felix 0000:00:00.5 swp3: bond0 disappeared, migrating to eno2 Fixes: acc43b7bf52a ("net: dsa: allow masters to join a LAG") Signed-off-by: Vladimir Oltean Link: https://patch.msgid.link/20251215150236.3931670-2-vladimir.oltean@nxp.com Signed-off-by: Paolo Abeni --- net/dsa/dsa.c | 8 +------- 1 file changed, 1 insertion(+), 7 deletions(-) diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c index 50b3fceb5c04d..99ede37698ac5 100644 --- a/net/dsa/dsa.c +++ b/net/dsa/dsa.c @@ -367,16 +367,10 @@ static struct dsa_port *dsa_tree_find_first_cpu(struct dsa_switch_tree *dst) struct net_device *dsa_tree_find_first_conduit(struct dsa_switch_tree *dst) { - struct device_node *ethernet; - struct net_device *conduit; struct dsa_port *cpu_dp; cpu_dp = dsa_tree_find_first_cpu(dst); - ethernet = of_parse_phandle(cpu_dp->dn, "ethernet", 0); - conduit = of_find_net_device_by_node(ethernet); - of_node_put(ethernet); - - return conduit; + return cpu_dp->conduit; } /* Assign the default CPU port (the first one in the tree) to all ports of the From 116abe7f13510b4592b63490fa0d252d22f60289 Mon Sep 17 00:00:00 2001 From: Raju Rangoju Date: Mon, 15 Dec 2025 20:47:28 +0530 Subject: [PATCH 0494/1321] amd-xgbe: reset retries and mode on RX adapt failures During the stress tests, early RX adaptation handshakes can fail, such as missing the RX_ADAPT ACK or not receiving a coefficient update before block lock is established. Continuing to retry RX adaptation in this state is often ineffective if the current mode selection is not viable. Resetting the RX adaptation retry counter when an RX_ADAPT request fails to receive ACK or a coefficient update prior to block lock, and clearing mode_set so the next bring-up performs a fresh mode selection rather than looping on a likely invalid configuration. Fixes: 4f3b20bfbb75 ("amd-xgbe: add support for rx-adaptation") Signed-off-by: Raju Rangoju Reviewed-by: Simon Horman Reviewed-by: Shyam Sundar S K Link: https://patch.msgid.link/20251215151728.311713-1-Raju.Rangoju@amd.com Signed-off-by: Paolo Abeni --- drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c b/drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c index a68757e8fd22c..c63ddb12237ea 100644 --- a/drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c +++ b/drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c @@ -1928,6 +1928,7 @@ static void xgbe_set_rx_adap_mode(struct xgbe_prv_data *pdata, { if (pdata->rx_adapt_retries++ >= MAX_RX_ADAPT_RETRIES) { pdata->rx_adapt_retries = 0; + pdata->mode_set = false; return; } @@ -1974,6 +1975,7 @@ static void xgbe_rx_adaptation(struct xgbe_prv_data *pdata) */ netif_dbg(pdata, link, pdata->netdev, "Block_lock done"); pdata->rx_adapt_done = true; + pdata->rx_adapt_retries = 0; pdata->mode_set = false; return; } From b2be178797ecc641796580ade2cfc77b4f382538 Mon Sep 17 00:00:00 2001 From: Daniel Zahka Date: Tue, 16 Dec 2025 06:21:35 -0800 Subject: [PATCH 0495/1321] selftests: drv-net: psp: fix templated test names in psp_ip_ver_test_builder() test_case will only take on its formatted name after it is called by the test runner. Move the assignment to test_case.__name__ to when the test_case is constructed, not called. Fixes: 8f90dc6e417a ("selftests: drv-net: psp: add basic data transfer and key rotation tests") Signed-off-by: Daniel Zahka Link: https://patch.msgid.link/20251216-psp-test-fix-v1-1-3b5a6dde186f@gmail.com Signed-off-by: Paolo Abeni --- tools/testing/selftests/drivers/net/psp.py | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/tools/testing/selftests/drivers/net/psp.py b/tools/testing/selftests/drivers/net/psp.py index 06559ef49b9a5..56dee824bb4cc 100755 --- a/tools/testing/selftests/drivers/net/psp.py +++ b/tools/testing/selftests/drivers/net/psp.py @@ -573,8 +573,9 @@ def psp_ip_ver_test_builder(name, test_func, psp_ver, ipver): """Build test cases for each combo of PSP version and IP version""" def test_case(cfg): cfg.require_ipver(ipver) - test_case.__name__ = f"{name}_v{psp_ver}_ip{ipver}" test_func(cfg, psp_ver, ipver) + + test_case.__name__ = f"{name}_v{psp_ver}_ip{ipver}" return test_case From 2d69de8af564dd0e3b367da3bcc090046fd057f9 Mon Sep 17 00:00:00 2001 From: Daniel Zahka Date: Tue, 16 Dec 2025 06:21:36 -0800 Subject: [PATCH 0496/1321] selftests: drv-net: psp: fix test names in ipver_test_builder() test_case will only take on the formatted name after being called. This does not work with the way ksft_run() currently works. Assign the name after the test_case is created. Fixes: 81236c74dba6 ("selftests: drv-net: psp: add test for auto-adjusting TCP MSS") Signed-off-by: Daniel Zahka Link: https://patch.msgid.link/20251216-psp-test-fix-v1-2-3b5a6dde186f@gmail.com Signed-off-by: Paolo Abeni --- tools/testing/selftests/drivers/net/psp.py | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/tools/testing/selftests/drivers/net/psp.py b/tools/testing/selftests/drivers/net/psp.py index 56dee824bb4cc..52523bdad2407 100755 --- a/tools/testing/selftests/drivers/net/psp.py +++ b/tools/testing/selftests/drivers/net/psp.py @@ -583,8 +583,9 @@ def ipver_test_builder(name, test_func, ipver): """Build test cases for each IP version""" def test_case(cfg): cfg.require_ipver(ipver) - test_case.__name__ = f"{name}_ip{ipver}" test_func(cfg, ipver) + + test_case.__name__ = f"{name}_ip{ipver}" return test_case From b65d15508810af84d0be8890db8136a7cbc79ac4 Mon Sep 17 00:00:00 2001 From: Deepakkumar Karn Date: Tue, 16 Dec 2025 20:43:05 +0530 Subject: [PATCH 0497/1321] net: usb: rtl8150: fix memory leak on usb_submit_urb() failure In async_set_registers(), when usb_submit_urb() fails, the allocated async_req structure and URB are not freed, causing a memory leak. The completion callback async_set_reg_cb() is responsible for freeing these allocations, but it is only called after the URB is successfully submitted and completes (successfully or with error). If submission fails, the callback never runs and the memory is leaked. Fix this by freeing both the URB and the request structure in the error path when usb_submit_urb() fails. Reported-by: syzbot+8dd915c7cb0490fc8c52@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=8dd915c7cb0490fc8c52 Fixes: 4d12997a9bb3 ("drivers: net: usb: rtl8150: concurrent URB bugfix") Signed-off-by: Deepakkumar Karn Link: https://patch.msgid.link/20251216151304.59865-2-dkarn@redhat.com Signed-off-by: Paolo Abeni --- drivers/net/usb/rtl8150.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/net/usb/rtl8150.c b/drivers/net/usb/rtl8150.c index 278e6cb6f4d99..e40b0669d9f4b 100644 --- a/drivers/net/usb/rtl8150.c +++ b/drivers/net/usb/rtl8150.c @@ -211,6 +211,8 @@ static int async_set_registers(rtl8150_t *dev, u16 indx, u16 size, u16 reg) if (res == -ENODEV) netif_device_detach(dev->netdev); dev_err(&dev->udev->dev, "%s failed with %d\n", __func__, res); + kfree(req); + usb_free_urb(async_urb); } return res; } From 7f5e4575f76eefba4611628d6a3c2e9466d8fc30 Mon Sep 17 00:00:00 2001 From: "Alice C. Munduruca" Date: Tue, 16 Dec 2025 12:06:41 -0500 Subject: [PATCH 0498/1321] selftests: net: fix "buffer overflow detected" for tap.c When the selftest 'tap.c' is compiled with '-D_FORTIFY_SOURCE=3', the strcpy() in rtattr_add_strsz() is replaced with a checked version which causes the test to consistently fail when compiled with toolchains for which this option is enabled by default. TAP version 13 1..3 # Starting 3 tests from 1 test cases. # RUN tap.test_packet_valid_udp_gso ... *** buffer overflow detected ***: terminated # test_packet_valid_udp_gso: Test terminated by assertion # FAIL tap.test_packet_valid_udp_gso not ok 1 tap.test_packet_valid_udp_gso # RUN tap.test_packet_valid_udp_csum ... *** buffer overflow detected ***: terminated # test_packet_valid_udp_csum: Test terminated by assertion # FAIL tap.test_packet_valid_udp_csum not ok 2 tap.test_packet_valid_udp_csum # RUN tap.test_packet_crash_tap_invalid_eth_proto ... *** buffer overflow detected ***: terminated # test_packet_crash_tap_invalid_eth_proto: Test terminated by assertion # FAIL tap.test_packet_crash_tap_invalid_eth_proto not ok 3 tap.test_packet_crash_tap_invalid_eth_proto # FAILED: 0 / 3 tests passed. # Totals: pass:0 fail:3 xfail:0 xpass:0 skip:0 error:0 A buffer overflow is detected by the fortified glibc __strcpy_chk() since the __builtin_object_size() of `RTA_DATA(rta)` is incorrectly reported as 1, even though there is ample space in its bounding buffer `req`. Additionally, given that IFLA_IFNAME also expects a null-terminated string, callers of rtaddr_add_str{,sz}() could simply use the rtaddr_add_strsz() variant. (which has been renamed to remove the trailing `sz`) memset() has been used for this function since it is unchecked and thus circumvents the issue discussed in the previous paragraph. Fixes: 2e64fe4624d1 ("selftests: add few test cases for tap driver") Signed-off-by: Alice C. Munduruca Reviewed-by: Cengiz Can Reviewed-by: Willem de Bruijn Link: https://patch.msgid.link/20251216170641.250494-1-alice.munduruca@canonical.com Signed-off-by: Paolo Abeni --- tools/testing/selftests/net/tap.c | 16 +++++----------- 1 file changed, 5 insertions(+), 11 deletions(-) diff --git a/tools/testing/selftests/net/tap.c b/tools/testing/selftests/net/tap.c index 9ec1c9b50e772..a0c9418132c82 100644 --- a/tools/testing/selftests/net/tap.c +++ b/tools/testing/selftests/net/tap.c @@ -56,18 +56,12 @@ static void rtattr_end(struct nlmsghdr *nh, struct rtattr *attr) static struct rtattr *rtattr_add_str(struct nlmsghdr *nh, unsigned short type, const char *s) { - struct rtattr *rta = rtattr_add(nh, type, strlen(s)); + unsigned int strsz = strlen(s) + 1; + struct rtattr *rta; - memcpy(RTA_DATA(rta), s, strlen(s)); - return rta; -} - -static struct rtattr *rtattr_add_strsz(struct nlmsghdr *nh, unsigned short type, - const char *s) -{ - struct rtattr *rta = rtattr_add(nh, type, strlen(s) + 1); + rta = rtattr_add(nh, type, strsz); - strcpy(RTA_DATA(rta), s); + memcpy(RTA_DATA(rta), s, strsz); return rta; } @@ -119,7 +113,7 @@ static int dev_create(const char *dev, const char *link_type, link_info = rtattr_begin(&req.nh, IFLA_LINKINFO); - rtattr_add_strsz(&req.nh, IFLA_INFO_KIND, link_type); + rtattr_add_str(&req.nh, IFLA_INFO_KIND, link_type); if (fill_info_data) { info_data = rtattr_begin(&req.nh, IFLA_INFO_DATA); From 8454e1cc04fb84c0373bfb64e476d738d67c727f Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Tue, 16 Dec 2025 22:35:42 +0100 Subject: [PATCH 0499/1321] net: wangxun: move PHYLINK dependency The LIBWX library code is what calls into phylink, so any user of it has to select CONFIG_PHYLINK at the moment, with NGBEVF missing this: x86_64-linux-ld: drivers/net/ethernet/wangxun/libwx/wx_ethtool.o: in function `wx_nway_reset': wx_ethtool.c:(.text+0x613): undefined reference to `phylink_ethtool_nway_reset' x86_64-linux-ld: drivers/net/ethernet/wangxun/libwx/wx_ethtool.o: in function `wx_get_link_ksettings': wx_ethtool.c:(.text+0x62b): undefined reference to `phylink_ethtool_ksettings_get' x86_64-linux-ld: drivers/net/ethernet/wangxun/libwx/wx_ethtool.o: in function `wx_set_link_ksettings': wx_ethtool.c:(.text+0x643): undefined reference to `phylink_ethtool_ksettings_set' x86_64-linux-ld: drivers/net/ethernet/wangxun/libwx/wx_ethtool.o: in function `wx_get_pauseparam': wx_ethtool.c:(.text+0x65b): undefined reference to `phylink_ethtool_get_pauseparam' x86_64-linux-ld: drivers/net/ethernet/wangxun/libwx/wx_ethtool.o: in function `wx_set_pauseparam': wx_ethtool.c:(.text+0x677): undefined reference to `phylink_ethtool_set_pauseparam' Add the 'select PHYLINK' line in the libwx option directly so this will always be enabled for all current and future wangxun drivers, and remove the now duplicate lines. Fixes: a0008a3658a3 ("net: wangxun: add ngbevf build") Signed-off-by: Arnd Bergmann Reviewed-by: Vadim Fedorenko Link: https://patch.msgid.link/20251216213547.115026-1-arnd@kernel.org Signed-off-by: Paolo Abeni --- drivers/net/ethernet/wangxun/Kconfig | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/net/ethernet/wangxun/Kconfig b/drivers/net/ethernet/wangxun/Kconfig index d138dea7d208d..ec278f99d2955 100644 --- a/drivers/net/ethernet/wangxun/Kconfig +++ b/drivers/net/ethernet/wangxun/Kconfig @@ -21,6 +21,7 @@ config LIBWX depends on PTP_1588_CLOCK_OPTIONAL select PAGE_POOL select DIMLIB + select PHYLINK help Common library for Wangxun(R) Ethernet drivers. @@ -29,7 +30,6 @@ config NGBE depends on PCI depends on PTP_1588_CLOCK_OPTIONAL select LIBWX - select PHYLINK help This driver supports Wangxun(R) GbE PCI Express family of adapters. @@ -48,7 +48,6 @@ config TXGBE depends on PTP_1588_CLOCK_OPTIONAL select MARVELL_10G_PHY select REGMAP - select PHYLINK select HWMON if TXGBE=y select SFP select GPIOLIB @@ -71,7 +70,6 @@ config TXGBEVF depends on PCI_MSI depends on PTP_1588_CLOCK_OPTIONAL select LIBWX - select PHYLINK help This driver supports virtual functions for SP1000A, WX1820AL, WX5XXX, WX5XXXAL. From 092438c24972a6e015b95b2a5441f1a8da95dc84 Mon Sep 17 00:00:00 2001 From: Pauli Virtanen Date: Thu, 4 Dec 2025 22:40:20 +0200 Subject: [PATCH 0500/1321] Bluetooth: MGMT: report BIS capability flags in supported settings MGMT_SETTING_ISO_BROADCASTER and MGMT_SETTING_ISO_RECEIVER flags are missing from supported_settings although they are in current_settings. Report them also in supported_settings to be consistent. Fixes: ae7533613133 ("Bluetooth: Check for ISO support in controller") Signed-off-by: Pauli Virtanen Signed-off-by: Luiz Augusto von Dentz --- net/bluetooth/mgmt.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/net/bluetooth/mgmt.c b/net/bluetooth/mgmt.c index c11cdef42b6f6..5be9b8c919490 100644 --- a/net/bluetooth/mgmt.c +++ b/net/bluetooth/mgmt.c @@ -849,6 +849,12 @@ static u32 get_supported_settings(struct hci_dev *hdev) if (cis_peripheral_capable(hdev)) settings |= MGMT_SETTING_CIS_PERIPHERAL; + if (bis_capable(hdev)) + settings |= MGMT_SETTING_ISO_BROADCASTER; + + if (sync_recv_capable(hdev)) + settings |= MGMT_SETTING_ISO_SYNC_RECEIVER; + if (ll_privacy_capable(hdev)) settings |= MGMT_SETTING_LL_PRIVACY; From fa1132236e12bb8248c032aa01ca56c6b889b73a Mon Sep 17 00:00:00 2001 From: Raphael Pinsonneault-Thibeault Date: Wed, 10 Dec 2025 11:02:28 -0500 Subject: [PATCH 0501/1321] Bluetooth: btusb: revert use of devm_kzalloc in btusb This reverts commit 98921dbd00c4e ("Bluetooth: Use devm_kzalloc in btusb.c file"). In btusb_probe(), we use devm_kzalloc() to allocate the btusb data. This ties the lifetime of all the btusb data to the binding of a driver to one interface, INTF. In a driver that binds to other interfaces, ISOC and DIAG, this is an accident waiting to happen. The issue is revealed in btusb_disconnect(), where calling usb_driver_release_interface(&btusb_driver, data->intf) will have devm free the data that is also being used by the other interfaces of the driver that may not be released yet. To fix this, revert the use of devm and go back to freeing memory explicitly. Fixes: 98921dbd00c4e ("Bluetooth: Use devm_kzalloc in btusb.c file") Signed-off-by: Raphael Pinsonneault-Thibeault Signed-off-by: Luiz Augusto von Dentz --- drivers/bluetooth/btusb.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c index 8ed3883ab8eef..ded09e94d296d 100644 --- a/drivers/bluetooth/btusb.c +++ b/drivers/bluetooth/btusb.c @@ -4052,7 +4052,7 @@ static int btusb_probe(struct usb_interface *intf, return -ENODEV; } - data = devm_kzalloc(&intf->dev, sizeof(*data), GFP_KERNEL); + data = kzalloc(sizeof(*data), GFP_KERNEL); if (!data) return -ENOMEM; @@ -4075,8 +4075,10 @@ static int btusb_probe(struct usb_interface *intf, } } - if (!data->intr_ep || !data->bulk_tx_ep || !data->bulk_rx_ep) + if (!data->intr_ep || !data->bulk_tx_ep || !data->bulk_rx_ep) { + kfree(data); return -ENODEV; + } if (id->driver_info & BTUSB_AMP) { data->cmdreq_type = USB_TYPE_CLASS | 0x01; @@ -4131,8 +4133,10 @@ static int btusb_probe(struct usb_interface *intf, data->recv_acl = hci_recv_frame; hdev = hci_alloc_dev_priv(priv_size); - if (!hdev) + if (!hdev) { + kfree(data); return -ENOMEM; + } hdev->bus = HCI_USB; hci_set_drvdata(hdev, data); @@ -4406,6 +4410,7 @@ static int btusb_probe(struct usb_interface *intf, if (data->reset_gpio) gpiod_put(data->reset_gpio); hci_free_dev(hdev); + kfree(data); return err; } @@ -4454,6 +4459,7 @@ static void btusb_disconnect(struct usb_interface *intf) } hci_free_dev(hdev); + kfree(data); } #ifdef CONFIG_PM From af7580e26eb6156cb61a2fc1be6fb975c3fe6041 Mon Sep 17 00:00:00 2001 From: Yeoreum Yun Date: Wed, 17 Dec 2025 08:51:15 +0000 Subject: [PATCH 0502/1321] smc91x: fix broken irq-context in PREEMPT_RT When smc91x.c is built with PREEMPT_RT, the following splat occurs in FVP_RevC: [ 13.055000] smc91x LNRO0003:00 eth0: link up, 10Mbps, half-duplex, lpa 0x0000 [ 13.062137] BUG: workqueue leaked atomic, lock or RCU: kworker/2:1[106] [ 13.062137] preempt=0x00000000 lock=0->0 RCU=0->1 workfn=mld_ifc_work [ 13.062266] C ** replaying previous printk message ** [ 13.062266] CPU: 2 UID: 0 PID: 106 Comm: kworker/2:1 Not tainted 6.18.0-dirty #179 PREEMPT_{RT,(full)} [ 13.062353] Hardware name: , BIOS [ 13.062382] Workqueue: mld mld_ifc_work [ 13.062469] Call trace: [ 13.062494] show_stack+0x24/0x40 (C) [ 13.062602] __dump_stack+0x28/0x48 [ 13.062710] dump_stack_lvl+0x7c/0xb0 [ 13.062818] dump_stack+0x18/0x34 [ 13.062926] process_scheduled_works+0x294/0x450 [ 13.063043] worker_thread+0x260/0x3d8 [ 13.063124] kthread+0x1c4/0x228 [ 13.063235] ret_from_fork+0x10/0x20 This happens because smc_special_trylock() disables IRQs even on PREEMPT_RT, but smc_special_unlock() does not restore IRQs on PREEMPT_RT. The reason is that smc_special_unlock() calls spin_unlock_irqrestore(), and rcu_read_unlock_bh() in __dev_queue_xmit() cannot invoke rcu_read_unlock() through __local_bh_enable_ip() when current->softirq_disable_cnt becomes zero. To address this issue, replace smc_special_trylock() with spin_trylock_irqsave(). Fixes: 342a93247e08 ("locking/spinlock: Provide RT variant header: ") Signed-off-by: Yeoreum Yun Reviewed-by: Simon Horman Link: https://patch.msgid.link/20251217085115.1730036-1-yeoreum.yun@arm.com Signed-off-by: Paolo Abeni --- drivers/net/ethernet/smsc/smc91x.c | 10 +--------- 1 file changed, 1 insertion(+), 9 deletions(-) diff --git a/drivers/net/ethernet/smsc/smc91x.c b/drivers/net/ethernet/smsc/smc91x.c index 9d1a83a5fa7e5..d16c178d10344 100644 --- a/drivers/net/ethernet/smsc/smc91x.c +++ b/drivers/net/ethernet/smsc/smc91x.c @@ -516,15 +516,7 @@ static inline void smc_rcv(struct net_device *dev) * any other concurrent access and C would always interrupt B. But life * isn't that easy in a SMP world... */ -#define smc_special_trylock(lock, flags) \ -({ \ - int __ret; \ - local_irq_save(flags); \ - __ret = spin_trylock(lock); \ - if (!__ret) \ - local_irq_restore(flags); \ - __ret; \ -}) +#define smc_special_trylock(lock, flags) spin_trylock_irqsave(lock, flags) #define smc_special_lock(lock, flags) spin_lock_irqsave(lock, flags) #define smc_special_unlock(lock, flags) spin_unlock_irqrestore(lock, flags) #else From f5d02514f8795cb7d480d7f0bc02cd99d3942862 Mon Sep 17 00:00:00 2001 From: Rajashekar Hudumula Date: Wed, 17 Dec 2025 02:47:48 -0800 Subject: [PATCH 0503/1321] bng_en: update module description The Broadcom BCM57708/800G NIC family is branded as ThorUltra. Update the driver description accordingly. Fixes: 74715c4ab0fa0 ("bng_en: Add PCI interface") Signed-off-by: Rajashekar Hudumula Reviewed-by: Vikas Gupta Reviewed-by: Bhargava Chenna Marreddy Link: https://patch.msgid.link/20251217104748.3004706-1-rajashekar.hudumula@broadcom.com Signed-off-by: Paolo Abeni --- drivers/net/ethernet/broadcom/Kconfig | 8 ++++---- drivers/net/ethernet/broadcom/bnge/bnge.h | 2 +- drivers/net/ethernet/broadcom/bnge/bnge_core.c | 2 +- 3 files changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/net/ethernet/broadcom/Kconfig b/drivers/net/ethernet/broadcom/Kconfig index 666522d647751..ca565ace6e6ad 100644 --- a/drivers/net/ethernet/broadcom/Kconfig +++ b/drivers/net/ethernet/broadcom/Kconfig @@ -255,14 +255,14 @@ config BNXT_HWMON devices, via the hwmon sysfs interface. config BNGE - tristate "Broadcom Ethernet device support" + tristate "Broadcom ThorUltra Ethernet device support" depends on PCI select NET_DEVLINK select PAGE_POOL help - This driver supports Broadcom 50/100/200/400/800 gigabit Ethernet cards. - The module will be called bng_en. To compile this driver as a module, - choose M here. + This driver supports Broadcom ThorUltra 50/100/200/400/800 gigabit + Ethernet cards. The module will be called bng_en. To compile this + driver as a module, choose M here. config BCMASP tristate "Broadcom ASP 2.0 Ethernet support" diff --git a/drivers/net/ethernet/broadcom/bnge/bnge.h b/drivers/net/ethernet/broadcom/bnge/bnge.h index 411744894349f..32fc16a37d02a 100644 --- a/drivers/net/ethernet/broadcom/bnge/bnge.h +++ b/drivers/net/ethernet/broadcom/bnge/bnge.h @@ -5,7 +5,7 @@ #define _BNGE_H_ #define DRV_NAME "bng_en" -#define DRV_SUMMARY "Broadcom 800G Ethernet Linux Driver" +#define DRV_SUMMARY "Broadcom ThorUltra NIC Ethernet Driver" #include #include diff --git a/drivers/net/ethernet/broadcom/bnge/bnge_core.c b/drivers/net/ethernet/broadcom/bnge/bnge_core.c index c94e132bebc80..b4090283df0f2 100644 --- a/drivers/net/ethernet/broadcom/bnge/bnge_core.c +++ b/drivers/net/ethernet/broadcom/bnge/bnge_core.c @@ -19,7 +19,7 @@ char bnge_driver_name[] = DRV_NAME; static const struct { char *name; } board_info[] = { - [BCM57708] = { "Broadcom BCM57708 50Gb/100Gb/200Gb/400Gb/800Gb Ethernet" }, + [BCM57708] = { "Broadcom BCM57708 ThorUltra 50Gb/100Gb/200Gb/400Gb/800Gb Ethernet" }, }; static const struct pci_device_id bnge_pci_tbl[] = { From 09fcbf1cc9800737d508646f28fcbd3d357d4aac Mon Sep 17 00:00:00 2001 From: Przemyslaw Korba Date: Thu, 20 Nov 2025 13:07:28 +0100 Subject: [PATCH 0504/1321] i40e: fix scheduling in set_rx_mode Add service task schedule to set_rx_mode. In some cases there are error messages printed out in PTP application (ptp4l): ptp4l[13848.762]: port 1 (ens2f3np3): received SYNC without timestamp ptp4l[13848.825]: port 1 (ens2f3np3): received SYNC without timestamp ptp4l[13848.887]: port 1 (ens2f3np3): received SYNC without timestamp This happens when service task would not run immediately after set_rx_mode, and we need it for setup tasks. This service task checks, if PTP RX packets are hung in firmware, and propagate correct settings such as multicast address for IEEE 1588 Precision Time Protocol. RX timestamping depends on some of these filters set. Bug happens only with high PTP packets frequency incoming, and not every run since sometimes service task is being ran from a different place immediately after starting ptp4l. Fixes: 0e4425ed641f ("i40e: fix: do not sleep in netdev_ops") Reviewed-by: Grzegorz Nitka Reviewed-by: Jacob Keller Reviewed-by: Aleksandr Loktionov Signed-off-by: Przemyslaw Korba Tested-by: Rinitha S (A Contingent worker at Intel) Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/i40e/i40e_main.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c index d8192aa232548..0b1cc0481027a 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_main.c +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c @@ -2234,6 +2234,7 @@ static void i40e_set_rx_mode(struct net_device *netdev) vsi->flags |= I40E_VSI_FLAG_FILTER_CHANGED; set_bit(__I40E_MACVLAN_SYNC_PENDING, vsi->back->state); } + i40e_service_event_schedule(vsi->back); } /** From cf81c32dbbd611120cfc66fece05c271740ca7dc Mon Sep 17 00:00:00 2001 From: Gregory Herrero Date: Fri, 12 Dec 2025 22:06:43 +0100 Subject: [PATCH 0505/1321] i40e: validate ring_len parameter against hardware-specific values The maximum number of descriptors supported by the hardware is hardware-dependent and can be retrieved using i40e_get_max_num_descriptors(). Move this function to a shared header and use it when checking for valid ring_len parameter rather than using hardcoded value. By fixing an over-acceptance issue, behavior change could be seen where ring_len could now be rejected while configuring rx and tx queues if its size is larger than the hardware-dependent maximum number of descriptors. Fixes: 55d225670def ("i40e: add validation for ring_len param") Signed-off-by: Gregory Herrero Tested-by: Rafal Romanowski Reviewed-by: Aleksandr Loktionov Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/i40e/i40e.h | 11 +++++++++++ drivers/net/ethernet/intel/i40e/i40e_ethtool.c | 12 ------------ drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 4 ++-- 3 files changed, 13 insertions(+), 14 deletions(-) diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h index d2d03db2acec6..dcb50c2e1aa27 100644 --- a/drivers/net/ethernet/intel/i40e/i40e.h +++ b/drivers/net/ethernet/intel/i40e/i40e.h @@ -1422,4 +1422,15 @@ static inline struct i40e_veb *i40e_pf_get_main_veb(struct i40e_pf *pf) return (pf->lan_veb != I40E_NO_VEB) ? pf->veb[pf->lan_veb] : NULL; } +static inline u32 i40e_get_max_num_descriptors(const struct i40e_pf *pf) +{ + const struct i40e_hw *hw = &pf->hw; + + switch (hw->mac.type) { + case I40E_MAC_XL710: + return I40E_MAX_NUM_DESCRIPTORS_XL710; + default: + return I40E_MAX_NUM_DESCRIPTORS; + } +} #endif /* _I40E_H_ */ diff --git a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c index f2c2646ea2989..6a47ea0927e96 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c +++ b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c @@ -2013,18 +2013,6 @@ static void i40e_get_drvinfo(struct net_device *netdev, drvinfo->n_priv_flags += I40E_GL_PRIV_FLAGS_STR_LEN; } -static u32 i40e_get_max_num_descriptors(struct i40e_pf *pf) -{ - struct i40e_hw *hw = &pf->hw; - - switch (hw->mac.type) { - case I40E_MAC_XL710: - return I40E_MAX_NUM_DESCRIPTORS_XL710; - default: - return I40E_MAX_NUM_DESCRIPTORS; - } -} - static void i40e_get_ringparam(struct net_device *netdev, struct ethtool_ringparam *ring, struct kernel_ethtool_ringparam *kernel_ring, diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c index 8b30a3accd310..1fa877b52f618 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c +++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c @@ -656,7 +656,7 @@ static int i40e_config_vsi_tx_queue(struct i40e_vf *vf, u16 vsi_id, /* ring_len has to be multiple of 8 */ if (!IS_ALIGNED(info->ring_len, 8) || - info->ring_len > I40E_MAX_NUM_DESCRIPTORS_XL710) { + info->ring_len > i40e_get_max_num_descriptors(pf)) { ret = -EINVAL; goto error_context; } @@ -726,7 +726,7 @@ static int i40e_config_vsi_rx_queue(struct i40e_vf *vf, u16 vsi_id, /* ring_len has to be multiple of 32 */ if (!IS_ALIGNED(info->ring_len, 32) || - info->ring_len > I40E_MAX_NUM_DESCRIPTORS_XL710) { + info->ring_len > i40e_get_max_num_descriptors(pf)) { ret = -EINVAL; goto error_param; } From 1157b7fc7d7c98bf8144f0fc4ff7852a4ed1eab5 Mon Sep 17 00:00:00 2001 From: Kohei Enju Date: Sun, 26 Oct 2025 01:58:50 +0900 Subject: [PATCH 0506/1321] iavf: fix off-by-one issues in iavf_config_rss_reg() There are off-by-one bugs when configuring RSS hash key and lookup table, causing out-of-bounds reads to memory [1] and out-of-bounds writes to device registers. Before commit 43a3d9ba34c9 ("i40evf: Allow PF driver to configure RSS"), the loop upper bounds were: i <= I40E_VFQF_{HKEY,HLUT}_MAX_INDEX which is safe since the value is the last valid index. That commit changed the bounds to: i <= adapter->rss_{key,lut}_size / 4 where `rss_{key,lut}_size / 4` is the number of dwords, so the last valid index is `(rss_{key,lut}_size / 4) - 1`. Therefore, using `<=` accesses one element past the end. Fix the issues by using `<` instead of `<=`, ensuring we do not exceed the bounds. [1] KASAN splat about rss_key_size off-by-one BUG: KASAN: slab-out-of-bounds in iavf_config_rss+0x619/0x800 Read of size 4 at addr ffff888102c50134 by task kworker/u8:6/63 CPU: 0 UID: 0 PID: 63 Comm: kworker/u8:6 Not tainted 6.18.0-rc2-enjuk-tnguy-00378-g3005f5b77652-dirty #156 PREEMPT(voluntary) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 Workqueue: iavf iavf_watchdog_task Call Trace: dump_stack_lvl+0x6f/0xb0 print_report+0x170/0x4f3 kasan_report+0xe1/0x1a0 iavf_config_rss+0x619/0x800 iavf_watchdog_task+0x2be7/0x3230 process_one_work+0x7fd/0x1420 worker_thread+0x4d1/0xd40 kthread+0x344/0x660 ret_from_fork+0x249/0x320 ret_from_fork_asm+0x1a/0x30 Allocated by task 63: kasan_save_stack+0x30/0x50 kasan_save_track+0x14/0x30 __kasan_kmalloc+0x7f/0x90 __kmalloc_noprof+0x246/0x6f0 iavf_watchdog_task+0x28fc/0x3230 process_one_work+0x7fd/0x1420 worker_thread+0x4d1/0xd40 kthread+0x344/0x660 ret_from_fork+0x249/0x320 ret_from_fork_asm+0x1a/0x30 The buggy address belongs to the object at ffff888102c50100 which belongs to the cache kmalloc-64 of size 64 The buggy address is located 0 bytes to the right of allocated 52-byte region [ffff888102c50100, ffff888102c50134) The buggy address belongs to the physical page: page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x102c50 flags: 0x200000000000000(node=0|zone=2) page_type: f5(slab) raw: 0200000000000000 ffff8881000418c0 dead000000000122 0000000000000000 raw: 0000000000000000 0000000080200020 00000000f5000000 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff888102c50000: 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc ffff888102c50080: 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc >ffff888102c50100: 00 00 00 00 00 00 04 fc fc fc fc fc fc fc fc fc ^ ffff888102c50180: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc ffff888102c50200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc Fixes: 43a3d9ba34c9 ("i40evf: Allow PF driver to configure RSS") Signed-off-by: Kohei Enju Reviewed-by: Aleksandr Loktionov Reviewed-by: Przemek Kitszel Tested-by: Rafal Romanowski Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/iavf/iavf_main.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c index c2fbe443ef853..4b0fc8f354bc9 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_main.c +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c @@ -1726,11 +1726,11 @@ static int iavf_config_rss_reg(struct iavf_adapter *adapter) u16 i; dw = (u32 *)adapter->rss_key; - for (i = 0; i <= adapter->rss_key_size / 4; i++) + for (i = 0; i < adapter->rss_key_size / 4; i++) wr32(hw, IAVF_VFQF_HKEY(i), dw[i]); dw = (u32 *)adapter->rss_lut; - for (i = 0; i <= adapter->rss_lut_size / 4; i++) + for (i = 0; i < adapter->rss_lut_size / 4; i++) wr32(hw, IAVF_VFQF_HLUT(i), dw[i]); iavf_flush(hw); From 5e9b65f3609daab4fc1a083aa2a4e2ce691d4d4c Mon Sep 17 00:00:00 2001 From: Larysa Zaremba Date: Tue, 7 Oct 2025 13:46:22 +0200 Subject: [PATCH 0507/1321] idpf: fix LAN memory regions command on some NVMs IPU SDK versions 1.9 through 2.0.5 require send buffer to contain a single empty memory region. Set number of regions to 1 and use appropriate send buffer size to satisfy this requirement. Fixes: 6aa53e861c1a ("idpf: implement get LAN MMIO memory regions") Suggested-by: Michal Swiatkowski Reviewed-by: Aleksandr Loktionov Signed-off-by: Larysa Zaremba Tested-by: Krishneil Singh Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/idpf/idpf_virtchnl.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c index 44cd4b466c481..5bbe7d9294c14 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c +++ b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c @@ -1016,6 +1016,9 @@ static int idpf_send_get_lan_memory_regions(struct idpf_adapter *adapter) struct idpf_vc_xn_params xn_params = { .vc_op = VIRTCHNL2_OP_GET_LAN_MEMORY_REGIONS, .recv_buf.iov_len = IDPF_CTLQ_MAX_BUF_LEN, + .send_buf.iov_len = + sizeof(struct virtchnl2_get_lan_memory_regions) + + sizeof(struct virtchnl2_mem_region), .timeout_ms = IDPF_VC_XN_DEFAULT_TIMEOUT_MSEC, }; int num_regions, size; @@ -1028,6 +1031,8 @@ static int idpf_send_get_lan_memory_regions(struct idpf_adapter *adapter) return -ENOMEM; xn_params.recv_buf.iov_base = rcvd_regions; + rcvd_regions->num_memory_regions = cpu_to_le16(1); + xn_params.send_buf.iov_base = rcvd_regions; reply_sz = idpf_vc_xn_exec(adapter, &xn_params); if (reply_sz < 0) return reply_sz; From 5ed6cfee95142ad641cc9d731522f449b19349ef Mon Sep 17 00:00:00 2001 From: Brian Vazquez Date: Mon, 10 Nov 2025 20:58:37 +0000 Subject: [PATCH 0508/1321] idpf: reduce mbx_task schedule delay to 300us During the IDPF init phase, the mailbox runs in poll mode until it is configured to properly handle interrupts. The previous delay of 300ms is excessively long for the mailbox polling mechanism, which causes a slow initialization of ~2s: echo 0000:06:12.4 > /sys/bus/pci/drivers/idpf/bind [ 52.444239] idpf 0000:06:12.4: enabling device (0000 -> 0002) [ 52.485005] idpf 0000:06:12.4: Device HW Reset initiated [ 54.177181] idpf 0000:06:12.4: PTP init failed, err=-EOPNOTSUPP [ 54.206177] idpf 0000:06:12.4: Minimum RX descriptor support not provided, using the default [ 54.206182] idpf 0000:06:12.4: Minimum TX descriptor support not provided, using the default Changing the delay to 300us avoids the delays during the initial mailbox transactions, making the init phase much faster: [ 83.342590] idpf 0000:06:12.4: enabling device (0000 -> 0002) [ 83.384402] idpf 0000:06:12.4: Device HW Reset initiated [ 83.518323] idpf 0000:06:12.4: PTP init failed, err=-EOPNOTSUPP [ 83.547430] idpf 0000:06:12.4: Minimum RX descriptor support not provided, using the default [ 83.547435] idpf 0000:06:12.4: Minimum TX descriptor support not provided, using the default Fixes: 4930fbf419a7 ("idpf: add core init and interrupt request") Signed-off-by: Brian Vazquez Reviewed-by: Aleksandr Loktionov Tested-by: Samuel Salin Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/idpf/idpf_lib.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/intel/idpf/idpf_lib.c b/drivers/net/ethernet/intel/idpf/idpf_lib.c index 7a7e101afeb68..7ce4eb71a433c 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_lib.c +++ b/drivers/net/ethernet/intel/idpf/idpf_lib.c @@ -1271,7 +1271,7 @@ void idpf_mbx_task(struct work_struct *work) idpf_mb_irq_enable(adapter); else queue_delayed_work(adapter->mbx_wq, &adapter->mbx_task, - msecs_to_jiffies(300)); + usecs_to_jiffies(300)); idpf_recv_mb_msg(adapter); } From 9ad8f2a07ff42af8ece4f0c4c4cb3fb760dd9f4f Mon Sep 17 00:00:00 2001 From: Guangshuo Li Date: Mon, 1 Dec 2025 11:40:58 +0800 Subject: [PATCH 0509/1321] e1000: fix OOB in e1000_tbi_should_accept() In e1000_tbi_should_accept() we read the last byte of the frame via 'data[length - 1]' to evaluate the TBI workaround. If the descriptor- reported length is zero or larger than the actual RX buffer size, this read goes out of bounds and can hit unrelated slab objects. The issue is observed from the NAPI receive path (e1000_clean_rx_irq): ================================================================== BUG: KASAN: slab-out-of-bounds in e1000_tbi_should_accept+0x610/0x790 Read of size 1 at addr ffff888014114e54 by task sshd/363 CPU: 0 PID: 363 Comm: sshd Not tainted 5.18.0-rc1 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 Call Trace: dump_stack_lvl+0x5a/0x74 print_address_description+0x7b/0x440 print_report+0x101/0x200 kasan_report+0xc1/0xf0 e1000_tbi_should_accept+0x610/0x790 e1000_clean_rx_irq+0xa8c/0x1110 e1000_clean+0xde2/0x3c10 __napi_poll+0x98/0x380 net_rx_action+0x491/0xa20 __do_softirq+0x2c9/0x61d do_softirq+0xd1/0x120 __local_bh_enable_ip+0xfe/0x130 ip_finish_output2+0x7d5/0xb00 __ip_queue_xmit+0xe24/0x1ab0 __tcp_transmit_skb+0x1bcb/0x3340 tcp_write_xmit+0x175d/0x6bd0 __tcp_push_pending_frames+0x7b/0x280 tcp_sendmsg_locked+0x2e4f/0x32d0 tcp_sendmsg+0x24/0x40 sock_write_iter+0x322/0x430 vfs_write+0x56c/0xa60 ksys_write+0xd1/0x190 do_syscall_64+0x43/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f511b476b10 Code: 73 01 c3 48 8b 0d 88 d3 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d f9 2b 2c 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 8e 9b 01 00 48 89 04 24 RSP: 002b:00007ffc9211d4e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000004024 RCX: 00007f511b476b10 RDX: 0000000000004024 RSI: 0000559a9385962c RDI: 0000000000000003 RBP: 0000559a9383a400 R08: fffffffffffffff0 R09: 0000000000004f00 R10: 0000000000000070 R11: 0000000000000246 R12: 0000000000000000 R13: 00007ffc9211d57f R14: 0000559a9347bde7 R15: 0000000000000003 Allocated by task 1: __kasan_krealloc+0x131/0x1c0 krealloc+0x90/0xc0 add_sysfs_param+0xcb/0x8a0 kernel_add_sysfs_param+0x81/0xd4 param_sysfs_builtin+0x138/0x1a6 param_sysfs_init+0x57/0x5b do_one_initcall+0x104/0x250 do_initcall_level+0x102/0x132 do_initcalls+0x46/0x74 kernel_init_freeable+0x28f/0x393 kernel_init+0x14/0x1a0 ret_from_fork+0x22/0x30 The buggy address belongs to the object at ffff888014114000 which belongs to the cache kmalloc-2k of size 2048 The buggy address is located 1620 bytes to the right of 2048-byte region [ffff888014114000, ffff888014114800] The buggy address belongs to the physical page: page:ffffea0000504400 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x14110 head:ffffea0000504400 order:3 compound_mapcount:0 compound_pincount:0 flags: 0x100000000010200(slab|head|node=0|zone=1) raw: 0100000000010200 0000000000000000 dead000000000001 ffff888013442000 raw: 0000000000000000 0000000000080008 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected ================================================================== This happens because the TBI check unconditionally dereferences the last byte without validating the reported length first: u8 last_byte = *(data + length - 1); Fix by rejecting the frame early if the length is zero, or if it exceeds adapter->rx_buffer_len. This preserves the TBI workaround semantics for valid frames and prevents touching memory beyond the RX buffer. Fixes: 2037110c96d5 ("e1000: move tbi workaround code into helper function") Cc: stable@vger.kernel.org Signed-off-by: Guangshuo Li Reviewed-by: Simon Horman Reviewed-by: Aleksandr Loktionov Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/e1000/e1000_main.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c index 292389aceb2d4..7f078ec9c14c5 100644 --- a/drivers/net/ethernet/intel/e1000/e1000_main.c +++ b/drivers/net/ethernet/intel/e1000/e1000_main.c @@ -4094,7 +4094,15 @@ static bool e1000_tbi_should_accept(struct e1000_adapter *adapter, u32 length, const u8 *data) { struct e1000_hw *hw = &adapter->hw; - u8 last_byte = *(data + length - 1); + u8 last_byte; + + /* Guard against OOB on data[length - 1] */ + if (unlikely(!length)) + return false; + /* Upper bound: length must not exceed rx_buffer_len */ + if (unlikely(length > adapter->rx_buffer_len)) + return false; + last_byte = *(data + length - 1); if (TBI_ACCEPT(hw, status, errors, length, last_byte)) { unsigned long irq_flags; From be5dae88a9a5fa42ea19c4317e4b765320c1d0f9 Mon Sep 17 00:00:00 2001 From: Jonas Gorski Date: Wed, 17 Dec 2025 21:57:56 +0100 Subject: [PATCH 0510/1321] net: dsa: b53: skip multicast entries for fdb_dump() port_fdb_dump() is supposed to only add fdb entries, but we iterate over the full ARL table, which also includes multicast entries. So check if the entry is a multicast entry before passing it on to the callback(). Additionally, the port of those entries is a bitmask, not a port number, so any included entries would have even be for the wrong port. Fixes: 1da6df85c6fb ("net: dsa: b53: Implement ARL add/del/dump operations") Signed-off-by: Jonas Gorski Reviewed-by: Florian Fainelli Link: https://patch.msgid.link/20251217205756.172123-1-jonas.gorski@gmail.com Signed-off-by: Paolo Abeni --- drivers/net/dsa/b53/b53_common.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c index a1a177713d99d..2c4131ed7e30b 100644 --- a/drivers/net/dsa/b53/b53_common.c +++ b/drivers/net/dsa/b53/b53_common.c @@ -2169,6 +2169,9 @@ static int b53_fdb_copy(int port, const struct b53_arl_entry *ent, if (!ent->is_valid) return 0; + if (is_multicast_ether_addr(ent->mac)) + return 0; + if (port != ent->port) return 0; From 4b9a04b4a9b62974bd72469cdd5890675c38b161 Mon Sep 17 00:00:00 2001 From: Rosen Penev Date: Wed, 17 Dec 2025 13:01:53 -0800 Subject: [PATCH 0511/1321] net: mdio: rtl9300: use scoped for loops Currently in the return path, fwnode_handle_put calls are missing. Just use _scoped to avoid the issue. Fixes: 24e31e474769 ("net: mdio: Add RTL9300 MDIO driver") Signed-off-by: Rosen Penev Link: https://patch.msgid.link/20251217210153.14641-1-rosenp@gmail.com Signed-off-by: Paolo Abeni --- drivers/net/mdio/mdio-realtek-rtl9300.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/net/mdio/mdio-realtek-rtl9300.c b/drivers/net/mdio/mdio-realtek-rtl9300.c index 33694c3ff9a71..405a07075dd11 100644 --- a/drivers/net/mdio/mdio-realtek-rtl9300.c +++ b/drivers/net/mdio/mdio-realtek-rtl9300.c @@ -354,7 +354,6 @@ static int rtl9300_mdiobus_probe_one(struct device *dev, struct rtl9300_mdio_pri struct fwnode_handle *node) { struct rtl9300_mdio_chan *chan; - struct fwnode_handle *child; struct mii_bus *bus; u32 mdio_bus; int err; @@ -371,7 +370,7 @@ static int rtl9300_mdiobus_probe_one(struct device *dev, struct rtl9300_mdio_pri * compatible = "ethernet-phy-ieee802.3-c45". This does mean we can't * support both c45 and c22 on the same MDIO bus. */ - fwnode_for_each_child_node(node, child) + fwnode_for_each_child_node_scoped(node, child) if (fwnode_device_is_compatible(child, "ethernet-phy-ieee802.3-c45")) priv->smi_bus_is_c45[mdio_bus] = true; @@ -409,7 +408,6 @@ static int rtl9300_mdiobus_map_ports(struct device *dev) { struct rtl9300_mdio_priv *priv = dev_get_drvdata(dev); struct device *parent = dev->parent; - struct fwnode_handle *port; int err; struct fwnode_handle *ports __free(fwnode_handle) = @@ -418,7 +416,7 @@ static int rtl9300_mdiobus_map_ports(struct device *dev) return dev_err_probe(dev, -EINVAL, "%pfwP missing ethernet-ports\n", dev_fwnode(parent)); - fwnode_for_each_child_node(ports, port) { + fwnode_for_each_child_node_scoped(ports, port) { struct device_node *mdio_dn; u32 addr; u32 bus; From f4b0956650532c1dcf27972a2bd40571744a16dc Mon Sep 17 00:00:00 2001 From: Deepanshu Kartikey Date: Thu, 18 Dec 2025 06:41:56 +0530 Subject: [PATCH 0512/1321] net: usb: asix: validate PHY address before use The ASIX driver reads the PHY address from the USB device via asix_read_phy_addr(). A malicious or faulty device can return an invalid address (>= PHY_MAX_ADDR), which causes a warning in mdiobus_get_phy(): addr 207 out of range WARNING: drivers/net/phy/mdio_bus.c:76 Validate the PHY address in asix_read_phy_addr() and remove the now-redundant check in ax88172a.c. Reported-by: syzbot+3d43c9066a5b54902232@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=3d43c9066a5b54902232 Tested-by: syzbot+3d43c9066a5b54902232@syzkaller.appspotmail.com Fixes: 7e88b11a862a ("net: usb: asix: refactor asix_read_phy_addr() and handle errors on return") Link: https://lore.kernel.org/all/20251217085057.270704-1-kartikey406@gmail.com/T/ [v1] Signed-off-by: Deepanshu Kartikey Reviewed-by: Andrew Lunn Link: https://patch.msgid.link/20251218011156.276824-1-kartikey406@gmail.com Signed-off-by: Paolo Abeni --- drivers/net/usb/asix_common.c | 5 +++++ drivers/net/usb/ax88172a.c | 6 +----- 2 files changed, 6 insertions(+), 5 deletions(-) diff --git a/drivers/net/usb/asix_common.c b/drivers/net/usb/asix_common.c index 7fd763917ae2c..6ab3486072cb0 100644 --- a/drivers/net/usb/asix_common.c +++ b/drivers/net/usb/asix_common.c @@ -335,6 +335,11 @@ int asix_read_phy_addr(struct usbnet *dev, bool internal) offset = (internal ? 1 : 0); ret = buf[offset]; + if (ret >= PHY_MAX_ADDR) { + netdev_err(dev->net, "invalid PHY address: %d\n", ret); + return -ENODEV; + } + netdev_dbg(dev->net, "%s PHY address 0x%x\n", internal ? "internal" : "external", ret); diff --git a/drivers/net/usb/ax88172a.c b/drivers/net/usb/ax88172a.c index f613e4bc68c85..758a423a459b8 100644 --- a/drivers/net/usb/ax88172a.c +++ b/drivers/net/usb/ax88172a.c @@ -210,11 +210,7 @@ static int ax88172a_bind(struct usbnet *dev, struct usb_interface *intf) ret = asix_read_phy_addr(dev, priv->use_embdphy); if (ret < 0) goto free; - if (ret >= PHY_MAX_ADDR) { - netdev_err(dev->net, "Invalid PHY address %#x\n", ret); - ret = -ENODEV; - goto free; - } + priv->phy_addr = ret; ax88172a_reset_phy(dev, priv->use_embdphy); From 783b0f77fb1513585d2389e7e0d3a2545eabd450 Mon Sep 17 00:00:00 2001 From: Deepanshu Kartikey Date: Thu, 18 Dec 2025 06:53:54 +0530 Subject: [PATCH 0513/1321] net: nfc: fix deadlock between nfc_unregister_device and rfkill_fop_write A deadlock can occur between nfc_unregister_device() and rfkill_fop_write() due to lock ordering inversion between device_lock and rfkill_global_mutex. The problematic lock order is: Thread A (rfkill_fop_write): rfkill_fop_write() mutex_lock(&rfkill_global_mutex) rfkill_set_block() nfc_rfkill_set_block() nfc_dev_down() device_lock(&dev->dev) <- waits for device_lock Thread B (nfc_unregister_device): nfc_unregister_device() device_lock(&dev->dev) rfkill_unregister() mutex_lock(&rfkill_global_mutex) <- waits for rfkill_global_mutex This creates a classic ABBA deadlock scenario. Fix this by moving rfkill_unregister() and rfkill_destroy() outside the device_lock critical section. Store the rfkill pointer in a local variable before releasing the lock, then call rfkill_unregister() after releasing device_lock. This change is safe because rfkill_fop_write() holds rfkill_global_mutex while calling the rfkill callbacks, and rfkill_unregister() also acquires rfkill_global_mutex before cleanup. Therefore, rfkill_unregister() will wait for any ongoing callback to complete before proceeding, and device_del() is only called after rfkill_unregister() returns, preventing any use-after-free. The similar lock ordering in nfc_register_device() (device_lock -> rfkill_global_mutex via rfkill_register) is safe because during registration the device is not yet in rfkill_list, so no concurrent rfkill operations can occur on this device. Fixes: 3e3b5dfcd16a ("NFC: reorder the logic in nfc_{un,}register_device") Cc: stable@vger.kernel.org Reported-by: syzbot+4ef89409a235d804c6c2@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=4ef89409a235d804c6c2 Link: https://lore.kernel.org/all/20251217054908.178907-1-kartikey406@gmail.com/T/ [v1] Signed-off-by: Deepanshu Kartikey Reviewed-by: Krzysztof Kozlowski Link: https://patch.msgid.link/20251218012355.279940-1-kartikey406@gmail.com Signed-off-by: Paolo Abeni --- net/nfc/core.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/net/nfc/core.c b/net/nfc/core.c index ae1c842f9c642..82f023f377541 100644 --- a/net/nfc/core.c +++ b/net/nfc/core.c @@ -1154,6 +1154,7 @@ EXPORT_SYMBOL(nfc_register_device); void nfc_unregister_device(struct nfc_dev *dev) { int rc; + struct rfkill *rfk = NULL; pr_debug("dev_name=%s\n", dev_name(&dev->dev)); @@ -1164,13 +1165,17 @@ void nfc_unregister_device(struct nfc_dev *dev) device_lock(&dev->dev); if (dev->rfkill) { - rfkill_unregister(dev->rfkill); - rfkill_destroy(dev->rfkill); + rfk = dev->rfkill; dev->rfkill = NULL; } dev->shutting_down = true; device_unlock(&dev->dev); + if (rfk) { + rfkill_unregister(rfk); + rfkill_destroy(rfk); + } + if (dev->ops->check_presence) { timer_delete_sync(&dev->check_pres_timer); cancel_work_sync(&dev->check_pres_work); From 522e1a70a8c76d81e6d1db79ea43fde7d99629d0 Mon Sep 17 00:00:00 2001 From: Bagas Sanjaya Date: Thu, 18 Dec 2025 11:29:37 +0700 Subject: [PATCH 0514/1321] net: bridge: Describe @tunnel_hash member in net_bridge_vlan_group struct Sphinx reports kernel-doc warning: WARNING: ./net/bridge/br_private.h:267 struct member 'tunnel_hash' not described in 'net_bridge_vlan_group' Fix it by describing @tunnel_hash member. Fixes: efa5356b0d9753 ("bridge: per vlan dst_metadata netlink support") Signed-off-by: Bagas Sanjaya Acked-by: Nikolay Aleksandrov Reviewed-by: Ido Schimmel Link: https://patch.msgid.link/20251218042936.24175-2-bagasdotme@gmail.com Signed-off-by: Paolo Abeni --- net/bridge/br_private.h | 1 + 1 file changed, 1 insertion(+) diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h index 7280c4e9305f3..b9b2981c48414 100644 --- a/net/bridge/br_private.h +++ b/net/bridge/br_private.h @@ -247,6 +247,7 @@ struct net_bridge_vlan { * struct net_bridge_vlan_group * * @vlan_hash: VLAN entry rhashtable + * @tunnel_hash: Hash table to map from tunnel key ID (e.g. VXLAN VNI) to VLAN * @vlan_list: sorted VLAN entry list * @num_vlans: number of total VLAN entries * @pvid: PVID VLAN id From fdf41b705f9c638f115258cf1e6f38fd8a71b5f7 Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Thu, 18 Dec 2025 08:18:44 +0000 Subject: [PATCH 0515/1321] net: avoid prefetching NULL pointers Aditya Gupta reported PowerPC crashes bisected to the blamed commit. Apparently some platforms do not allow prefetch() on arbitrary pointers. prefetch(next); prefetch(&next->priority); // CRASH when next == NULL Only NULL seems to be supported, with specific handling in prefetch(). Add a conditional to avoid the two prefetches and the skb->next clearing for the last skb in the list. Fixes: b2e9821cff6c ("net: prefech skb->priority in __dev_xmit_skb()") Reported-by: Aditya Gupta Closes: https://lore.kernel.org/netdev/e9f4abee-b132-440f-a50e-bced0868b5a7@linux.ibm.com/T/#mddc372b64ec5a3b181acc9ee3909110c391cc18a Signed-off-by: Eric Dumazet Tested-by: Aditya Gupta Link: https://patch.msgid.link/20251218081844.809008-1-edumazet@google.com Signed-off-by: Paolo Abeni --- net/core/dev.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/net/core/dev.c b/net/core/dev.c index 9094c0fb8c689..36dc5199037ed 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -4241,9 +4241,11 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q, int count = 0; llist_for_each_entry_safe(skb, next, ll_list, ll_node) { - prefetch(next); - prefetch(&next->priority); - skb_mark_not_on_list(skb); + if (next) { + prefetch(next); + prefetch(&next->priority); + skb_mark_not_on_list(skb); + } rc = dev_qdisc_enqueue(skb, q, &to_free, txq); count++; } From 5f237d3ac697a2184289d54d8c8927dbf33354fc Mon Sep 17 00:00:00 2001 From: Dipayaan Roy Date: Thu, 18 Dec 2025 05:10:54 -0800 Subject: [PATCH 0516/1321] net: mana: Fix use-after-free in reset service rescan path When mana_serv_reset() encounters -ETIMEDOUT or -EPROTO from mana_gd_resume(), it performs a PCI rescan via mana_serv_rescan(). mana_serv_rescan() calls pci_stop_and_remove_bus_device(), which can invoke the driver's remove path and free the gdma_context associated with the device. After returning, mana_serv_reset() currently jumps to the out label and attempts to clear gc->in_service, dereferencing a freed gdma_context. The issue was observed with the following call logs: [ 698.942636] BUG: unable to handle page fault for address: ff6c2b638088508d [ 698.943121] #PF: supervisor write access in kernel mode [ 698.943423] #PF: error_code(0x0002) - not-present page [S[ 698.943793] Pat Dec 6 07:GD5 100000067 P4D 1002f7067 PUD 1002f8067 PMD 101bef067 PTE 0 0:56 2025] hv_[n e 698.944283] Oops: Oops: 0002 [#1] SMP NOPTI tvsc f8615163-00[ 698.944611] CPU: 28 UID: 0 PID: 249 Comm: kworker/28:1 ... [Sat Dec 6 07:50:56 2025] R10: [ 699.121594] mana 7870:00:00.0 enP30832s1: Configured vPort 0 PD 18 DB 16 000000000000001b R11: 0000000000000000 R12: ff44cf3f40270000 [Sat Dec 6 07:50:56 2025] R13: 0000000000000001 R14: ff44cf3f402700c8 R15: ff44cf3f4021b405 [Sat Dec 6 07:50:56 2025] FS: 0000000000000000(0000) GS:ff44cf7e9fcf9000(0000) knlGS:0000000000000000 [Sat Dec 6 07:50:56 2025] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [Sat Dec 6 07:50:56 2025] CR2: ff6c2b638088508d CR3: 000000011fe43001 CR4: 0000000000b73ef0 [Sat Dec 6 07:50:56 2025] Call Trace: [Sat Dec 6 07:50:56 2025] [Sat Dec 6 07:50:56 2025] mana_serv_func+0x24/0x50 [mana] [Sat Dec 6 07:50:56 2025] process_one_work+0x190/0x350 [Sat Dec 6 07:50:56 2025] worker_thread+0x2b7/0x3d0 [Sat Dec 6 07:50:56 2025] kthread+0xf3/0x200 [Sat Dec 6 07:50:56 2025] ? __pfx_worker_thread+0x10/0x10 [Sat Dec 6 07:50:56 2025] ? __pfx_kthread+0x10/0x10 [Sat Dec 6 07:50:56 2025] ret_from_fork+0x21a/0x250 [Sat Dec 6 07:50:56 2025] ? __pfx_kthread+0x10/0x10 [Sat Dec 6 07:50:56 2025] ret_from_fork_asm+0x1a/0x30 [Sat Dec 6 07:50:56 2025] Fix this by returning immediately after mana_serv_rescan() to avoid accessing GC state that may no longer be valid. Fixes: 9bf66036d686 ("net: mana: Handle hardware recovery events when probing the device") Reviewed-by: Simon Horman Reviewed-by: Long Li Signed-off-by: Dipayaan Roy Link: https://patch.msgid.link/20251218131054.GA3173@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net Signed-off-by: Paolo Abeni --- drivers/net/ethernet/microsoft/mana/gdma_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c index efb4e412ec7e4..0055c231acf6d 100644 --- a/drivers/net/ethernet/microsoft/mana/gdma_main.c +++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c @@ -481,7 +481,7 @@ static void mana_serv_reset(struct pci_dev *pdev) /* Perform PCI rescan on device if we failed on HWC */ dev_err(&pdev->dev, "MANA service: resume failed, rescanning\n"); mana_serv_rescan(pdev); - goto out; + return; } if (ret) From fb86456e1f3db8d02817bdc069834b8788f91eb1 Mon Sep 17 00:00:00 2001 From: Jens Axboe Date: Thu, 18 Dec 2025 15:21:28 -0700 Subject: [PATCH 0517/1321] af_unix: don't post cmsg for SO_INQ unless explicitly asked for A previous commit added SO_INQ support for AF_UNIX (SOCK_STREAM), but it posts a SCM_INQ cmsg even if just msg->msg_get_inq is set. This is incorrect, as ->msg_get_inq is just the caller asking for the remainder to be passed back in msg->msg_inq, it has nothing to do with cmsg. The original commit states that this is done to make sockets io_uring-friendly", but it's actually incorrect as io_uring doesn't use cmsg headers internally at all, and it's actively wrong as this means that cmsg's are always posted if someone does recvmsg via io_uring. Fix that up by only posting a cmsg if u->recvmsg_inq is set. Additionally, mirror how TCP handles inquiry handling in that it should only be done for a successful return. This makes the logic for the two identical. Cc: stable@vger.kernel.org Fixes: df30285b3670 ("af_unix: Introduce SO_INQ.") Reported-by: Julian Orth Link: https://github.com/axboe/liburing/issues/1509 Signed-off-by: Jens Axboe Reviewed-by: Willem de Bruijn Link: https://patch.msgid.link/07adc0c2-2c3b-4d08-8af1-1c466a40b6a8@kernel.dk Signed-off-by: Paolo Abeni --- net/unix/af_unix.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 55cdebfa0da02..a7ca74653d946 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -2904,6 +2904,7 @@ static int unix_stream_read_generic(struct unix_stream_read_state *state, unsigned int last_len; struct unix_sock *u; int copied = 0; + bool do_cmsg; int err = 0; long timeo; int target; @@ -2929,6 +2930,9 @@ static int unix_stream_read_generic(struct unix_stream_read_state *state, u = unix_sk(sk); + do_cmsg = READ_ONCE(u->recvmsg_inq); + if (do_cmsg) + msg->msg_get_inq = 1; redo: /* Lock the socket to prevent queue disordering * while sleeps in memcpy_tomsg @@ -3088,10 +3092,11 @@ static int unix_stream_read_generic(struct unix_stream_read_state *state, if (msg) { scm_recv_unix(sock, msg, &scm, flags); - if (READ_ONCE(u->recvmsg_inq) || msg->msg_get_inq) { + if (msg->msg_get_inq && (copied ?: err) >= 0) { msg->msg_inq = READ_ONCE(u->inq_len); - put_cmsg(msg, SOL_SOCKET, SCM_INQ, - sizeof(msg->msg_inq), &msg->msg_inq); + if (do_cmsg) + put_cmsg(msg, SOL_SOCKET, SCM_INQ, + sizeof(msg->msg_inq), &msg->msg_inq); } } else { scm_destroy(&scm); From cbccf52d70ba579dac8bea8e7ec297edafd5aa65 Mon Sep 17 00:00:00 2001 From: Anshumali Gaur Date: Fri, 19 Dec 2025 11:52:26 +0530 Subject: [PATCH 0518/1321] octeontx2-pf: fix "UBSAN: shift-out-of-bounds error" This patch ensures that the RX ring size (rx_pending) is not set below the permitted length. This avoids UBSAN shift-out-of-bounds errors when users passes small or zero ring sizes via ethtool -G. Fixes: d45d8979840d ("octeontx2-pf: Add basic ethtool support") Signed-off-by: Anshumali Gaur Link: https://patch.msgid.link/20251219062226.524844-1-agaur@marvell.com Signed-off-by: Paolo Abeni --- drivers/net/ethernet/marvell/octeontx2/nic/otx2_ethtool.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_ethtool.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_ethtool.c index b90e23dc49de9..b6449f0a9e7dd 100644 --- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_ethtool.c +++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_ethtool.c @@ -418,6 +418,14 @@ static int otx2_set_ringparam(struct net_device *netdev, */ if (rx_count < pfvf->hw.rq_skid) rx_count = pfvf->hw.rq_skid; + + if (ring->rx_pending < 16) { + netdev_err(netdev, + "rx ring size %u invalid, min is 16\n", + ring->rx_pending); + return -EINVAL; + } + rx_count = Q_COUNT(Q_SIZE(rx_count, 3)); /* Due pipelining impact minimum 2000 unused SQ CQE's From 942a2a695b2b9be235f861c3ac3f60c528260eb7 Mon Sep 17 00:00:00 2001 From: Jouni Malinen Date: Mon, 15 Dec 2025 17:11:34 +0200 Subject: [PATCH 0519/1321] wifi: mac80211: Discard Beacon frames to non-broadcast address Beacon frames are required to be sent to the broadcast address, see IEEE Std 802.11-2020, 11.1.3.1 ("The Address 1 field of the Beacon .. frame shall be set to the broadcast address"). A unicast Beacon frame might be used as a targeted attack to get one of the associated STAs to do something (e.g., using CSA to move it to another channel). As such, it is better have strict filtering for this on the received side and discard all Beacon frames that are sent to an unexpected address. This is even more important for cases where beacon protection is used. The current implementation in mac80211 is correctly discarding unicast Beacon frames if the Protected Frame bit in the Frame Control field is set to 0. However, if that bit is set to 1, the logic used for checking for configured BIGTK(s) does not actually work. If the driver does not have logic for dropping unicast Beacon frames with Protected Frame bit 1, these frames would be accepted in mac80211 processing as valid Beacon frames even though they are not protected. This would allow beacon protection to be bypassed. While the logic for checking beacon protection could be extended to cover this corner case, a more generic check for discard all Beacon frames based on A1=unicast address covers this without needing additional changes. Address all these issues by dropping received Beacon frames if they are sent to a non-broadcast address. Cc: stable@vger.kernel.org Fixes: af2d14b01c32 ("mac80211: Beacon protection using the new BIGTK (STA)") Signed-off-by: Jouni Malinen Link: https://patch.msgid.link/20251215151134.104501-1-jouni.malinen@oss.qualcomm.com Signed-off-by: Johannes Berg --- net/mac80211/rx.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c index 6a1899512d078..e0ccd97498536 100644 --- a/net/mac80211/rx.c +++ b/net/mac80211/rx.c @@ -3511,6 +3511,11 @@ ieee80211_rx_h_mgmt_check(struct ieee80211_rx_data *rx) rx->skb->len < IEEE80211_MIN_ACTION_SIZE) return RX_DROP_U_RUNT_ACTION; + /* Drop non-broadcast Beacon frames */ + if (ieee80211_is_beacon(mgmt->frame_control) && + !is_broadcast_ether_addr(mgmt->da)) + return RX_DROP; + if (rx->sdata->vif.type == NL80211_IFTYPE_AP && ieee80211_is_beacon(mgmt->frame_control) && !(rx->flags & IEEE80211_RX_BEACON_REPORTED)) { From acbb51dafd38604241a3ab080cfeed45bcc1866d Mon Sep 17 00:00:00 2001 From: Ping-Ke Shih Date: Tue, 25 Nov 2025 09:38:49 +0800 Subject: [PATCH 0520/1321] wifi: rtw88: limit indirect IO under powered off for RTL8822CS The indirect IO is necessary for RTL8822CS, but not necessary for other chips. Otherwiese, it throws errors and becomes unusable. rtw88_8723cs mmc1:0001:1: WOW Firmware version 11.0.0, H2C version 0 rtw88_8723cs mmc1:0001:1: Firmware version 11.0.0, H2C version 0 rtw88_8723cs mmc1:0001:1: sdio read32 failed (0xf0): -110 rtw88_8723cs mmc1:0001:1: sdio write8 failed (0x1c): -110 rtw88_8723cs mmc1:0001:1: sdio read32 failed (0xf0): -110 By vendor driver, only RTL8822CS and RTL8822ES need indirect IO, but RTL8822ES isn't supported yet. Therefore, limit it to RTL8822CS only. Reported-by: Andrey Skvortsov Closes: https://lore.kernel.org/linux-wireless/07a32e2d6c764eb1bd9415b5a921a652@realtek.com/T/#m997b4522f7209ba629561c776bfd1d13ab24c1d4 Fixes: 58de1f91e033 ("wifi: rtw88: sdio: use indirect IO for device registers before power-on") Signed-off-by: Ping-Ke Shih Tested-by: Andrey Skvortsov Link: https://patch.msgid.link/1764034729-1251-1-git-send-email-pkshih@realtek.com --- drivers/net/wireless/realtek/rtw88/sdio.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/net/wireless/realtek/rtw88/sdio.c b/drivers/net/wireless/realtek/rtw88/sdio.c index 99d7c629eac6f..e35de52d8eb43 100644 --- a/drivers/net/wireless/realtek/rtw88/sdio.c +++ b/drivers/net/wireless/realtek/rtw88/sdio.c @@ -144,8 +144,10 @@ static u32 rtw_sdio_to_io_address(struct rtw_dev *rtwdev, u32 addr, static bool rtw_sdio_use_direct_io(struct rtw_dev *rtwdev, u32 addr) { + bool might_indirect_under_power_off = rtwdev->chip->id == RTW_CHIP_TYPE_8822C; + if (!test_bit(RTW_FLAG_POWERON, rtwdev->flags) && - !rtw_sdio_is_bus_addr(addr)) + !rtw_sdio_is_bus_addr(addr) && might_indirect_under_power_off) return false; return !rtw_sdio_is_sdio30_supported(rtwdev) || From 8daba66eb981abdea0919b8e34b48dfdaeb745ad Mon Sep 17 00:00:00 2001 From: Morning Star Date: Thu, 27 Nov 2025 16:37:08 +0800 Subject: [PATCH 0521/1321] wifi: rtlwifi: 8192cu: fix tid out of range in rtl92cu_tx_fill_desc() TID getting from ieee80211_get_tid() might be out of range of array size of sta_entry->tids[], so check TID is less than MAX_TID_COUNT. Othwerwise, UBSAN warn: UBSAN: array-index-out-of-bounds in drivers/net/wireless/realtek/rtlwifi/rtl8192cu/trx.c:514:30 index 10 is out of range for type 'rtl_tid_data [9]' Fixes: 8ca4cdef9329 ("wifi: rtlwifi: rtl8192cu: Fix TX aggregation") Signed-off-by: Morning Star Signed-off-by: Ping-Ke Shih Link: https://patch.msgid.link/1764232628-13625-1-git-send-email-pkshih@realtek.com --- drivers/net/wireless/realtek/rtlwifi/rtl8192cu/trx.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8192cu/trx.c b/drivers/net/wireless/realtek/rtlwifi/rtl8192cu/trx.c index aa702ba7c9f54..d6c35e8d02a58 100644 --- a/drivers/net/wireless/realtek/rtlwifi/rtl8192cu/trx.c +++ b/drivers/net/wireless/realtek/rtlwifi/rtl8192cu/trx.c @@ -511,7 +511,8 @@ void rtl92cu_tx_fill_desc(struct ieee80211_hw *hw, if (sta) { sta_entry = (struct rtl_sta_info *)sta->drv_priv; tid = ieee80211_get_tid(hdr); - agg_state = sta_entry->tids[tid].agg.agg_state; + if (tid < MAX_TID_COUNT) + agg_state = sta_entry->tids[tid].agg.agg_state; ampdu_density = sta->deflink.ht_cap.ampdu_density; } From 20f740210eb5f450b912eedd6fdbdc5c927a0c6e Mon Sep 17 00:00:00 2001 From: Bitterblue Smith Date: Sat, 6 Dec 2025 20:32:43 +0200 Subject: [PATCH 0522/1321] Revert "wifi: rtw88: add WQ_UNBOUND to alloc_workqueue users" This reverts commit 9c194fe4625db18f93d5abcfb7f7997557a0b29d. This commit breaks all USB wifi adapters supported by rtw88: usb 1-2: new high-speed USB device number 6 using xhci_hcd usb 1-2: New USB device found, idVendor=2357, idProduct=0138, bcdDevice= 2.10 usb 1-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3 usb 1-2: Product: 802.11ac NIC usb 1-2: Manufacturer: Realtek usb 1-2: SerialNumber: 123456 ------------[ cut here ]------------ WARNING: CPU: 3 PID: 152 at kernel/workqueue.c:5667 alloc_workqueue_noprof+0x676/0x770 [...] Call Trace: ? rtw_usb_probe+0x30e/0xa5c [rtw88_usb 4af3cb64eedafeecbfb08f80c1e9e2893e2ee7a6] rtw_usb_probe+0x3eb/0xa5c [rtw88_usb 4af3cb64eedafeecbfb08f80c1e9e2893e2ee7a6] usb_probe_interface+0xdd/0x2c0 really_probe+0xdb/0x340 ? pm_runtime_barrier+0x55/0x90 ? __pfx___device_attach_driver+0x10/0x10 __driver_probe_device+0x78/0x140 driver_probe_device+0x1f/0xa0 __device_attach_driver+0x89/0x110 bus_for_each_drv+0x8f/0xe0 __device_attach+0xb0/0x1c0 bus_probe_device+0x90/0xa0 device_add+0x663/0x880 usb_set_configuration+0x5a5/0x870 usb_generic_driver_probe+0x4a/0x70 usb_probe_device+0x3d/0x140 ? driver_sysfs_add+0x59/0xd0 really_probe+0xdb/0x340 ? pm_runtime_barrier+0x55/0x90 ? __pfx___device_attach_driver+0x10/0x10 __driver_probe_device+0x78/0x140 driver_probe_device+0x1f/0xa0 __device_attach_driver+0x89/0x110 bus_for_each_drv+0x8f/0xe0 __device_attach+0xb0/0x1c0 bus_probe_device+0x90/0xa0 device_add+0x663/0x880 usb_new_device.cold+0x141/0x3b5 hub_event+0x1132/0x1900 ? page_counter_uncharge+0x4a/0x90 process_one_work+0x190/0x350 worker_thread+0x2d7/0x410 ? __pfx_worker_thread+0x10/0x10 kthread+0xf9/0x240 ? __pfx_kthread+0x10/0x10 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x1c1/0x1f0 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 ---[ end trace 0000000000000000 ]--- rtw88_8822bu 1-2:1.0: failed to create RX work queue rtw88_8822bu 1-2:1.0: failed to init USB RX rtw88_8822bu 1-2:1.0: Firmware version 27.2.0, H2C version 13 rtw88_8822bu 1-2:1.0: probe with driver rtw88_8822bu failed with error -12 WQ_UNBOUND is not compatible with WQ_BH. Comment in enum wq_flags in workqueue.h says: /* BH wq only allows the following flags */ __WQ_BH_ALLOWS = WQ_BH | WQ_HIGHPRI | WQ_PERCPU, Signed-off-by: Bitterblue Smith Acked-by: Ping-Ke Shih Signed-off-by: Ping-Ke Shih Link: https://patch.msgid.link/d57efe48-b8ff-4bf1-942c-7e808535eda6@gmail.com --- drivers/net/wireless/realtek/rtw88/usb.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/net/wireless/realtek/rtw88/usb.c b/drivers/net/wireless/realtek/rtw88/usb.c index 009202c627d25..3b5126ffc81a1 100644 --- a/drivers/net/wireless/realtek/rtw88/usb.c +++ b/drivers/net/wireless/realtek/rtw88/usb.c @@ -965,8 +965,7 @@ static int rtw_usb_init_rx(struct rtw_dev *rtwdev) struct sk_buff *rx_skb; int i; - rtwusb->rxwq = alloc_workqueue("rtw88_usb: rx wq", WQ_BH | WQ_UNBOUND, - 0); + rtwusb->rxwq = alloc_workqueue("rtw88_usb: rx wq", WQ_BH, 0); if (!rtwusb->rxwq) { rtw_err(rtwdev, "failed to create RX work queue\n"); return -ENOMEM; From 86a808d7465b644da05208f21c62443862820bba Mon Sep 17 00:00:00 2001 From: Dmitry Antipov Date: Thu, 4 Dec 2025 16:05:33 +0300 Subject: [PATCH 0523/1321] wifi: mac80211: fix list iteration in ieee80211_add_virtual_monitor() Since 'mon_list' of 'struct ieee80211_local' is RCU-protected and an instances of 'struct ieee80211_sub_if_data' are linked there via 'u.mntr.list' member, adjust the corresponding list iteration in 'ieee80211_add_virtual_monitor()' accordingly. Reported-by: syzbot+bc1aabf52d0a31e91f96@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=bc1aabf52d0a31e91f96 Fixes: a5aa46f1ac4f ("wifi: mac80211: track MU-MIMO configuration on disabled interfaces") Signed-off-by: Dmitry Antipov Link: https://patch.msgid.link/20251204130533.340069-1-dmantipov@yandex.ru Signed-off-by: Johannes Berg --- net/mac80211/iface.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c index 4f04d95c19d49..7b0aa24c1f97c 100644 --- a/net/mac80211/iface.c +++ b/net/mac80211/iface.c @@ -1251,7 +1251,7 @@ int ieee80211_add_virtual_monitor(struct ieee80211_local *local, if (!creator_sdata) { struct ieee80211_sub_if_data *other; - list_for_each_entry(other, &local->mon_list, list) { + list_for_each_entry_rcu(other, &local->mon_list, u.mntr.list) { if (!other->vif.bss_conf.mu_mimo_owner) continue; From 4489ea97ea8f14840c9672cfa667843f9dbb6216 Mon Sep 17 00:00:00 2001 From: Dan Carpenter Date: Wed, 3 Dec 2025 14:14:47 +0300 Subject: [PATCH 0524/1321] wifi: cfg80211: sme: store capped length in __cfg80211_connect_result() The QGenie AI code review tool says we should store the capped length to wdev->u.client.ssid_len. The AI is correct. Fixes: 62b635dcd69c ("wifi: cfg80211: sme: cap SSID length in __cfg80211_connect_result()") Signed-off-by: Dan Carpenter Link: https://patch.msgid.link/aTAbp5RleyH_lnZE@stanley.mountain Signed-off-by: Johannes Berg --- net/wireless/sme.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/wireless/sme.c b/net/wireless/sme.c index 3a028ff287fbb..4e629ca305bcc 100644 --- a/net/wireless/sme.c +++ b/net/wireless/sme.c @@ -910,7 +910,7 @@ void __cfg80211_connect_result(struct net_device *dev, ssid_len = min(ssid->datalen, IEEE80211_MAX_SSID_LEN); memcpy(wdev->u.client.ssid, ssid->data, ssid_len); - wdev->u.client.ssid_len = ssid->datalen; + wdev->u.client.ssid_len = ssid_len; break; } rcu_read_unlock(); From 8398ecbe3431f4261250518836b452c8717cb6ab Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Peter=20=C3=85strand?= Date: Wed, 3 Dec 2025 08:57:08 +0100 Subject: [PATCH 0525/1321] wifi: wlcore: ensure skb headroom before skb_push This avoids occasional skb_under_panic Oops from wl1271_tx_work. In this case, headroom is less than needed (typically 110 - 94 = 16 bytes). Signed-off-by: Peter Astrand Link: https://patch.msgid.link/097bd417-e1d7-acd4-be05-47b199075013@lysator.liu.se Signed-off-by: Johannes Berg --- drivers/net/wireless/ti/wlcore/tx.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/net/wireless/ti/wlcore/tx.c b/drivers/net/wireless/ti/wlcore/tx.c index f76087be2f758..6241866d39df6 100644 --- a/drivers/net/wireless/ti/wlcore/tx.c +++ b/drivers/net/wireless/ti/wlcore/tx.c @@ -207,6 +207,11 @@ static int wl1271_tx_allocate(struct wl1271 *wl, struct wl12xx_vif *wlvif, total_blocks = wlcore_hw_calc_tx_blocks(wl, total_len, spare_blocks); if (total_blocks <= wl->tx_blocks_available) { + if (skb_headroom(skb) < (total_len - skb->len) && + pskb_expand_head(skb, (total_len - skb->len), 0, GFP_ATOMIC)) { + wl1271_free_tx_id(wl, id); + return -EAGAIN; + } desc = skb_push(skb, total_len - skb->len); wlcore_hw_set_tx_desc_blocks(wl, desc, total_blocks, From 51e5812b0b215b1aafb647039ce224e912a19f50 Mon Sep 17 00:00:00 2001 From: Johannes Berg Date: Tue, 2 Dec 2025 10:25:11 +0100 Subject: [PATCH 0526/1321] wifi: mac80211: don't WARN for connections on invalid channels It's not clear (to me) how exactly syzbot managed to hit this, but it seems conceivable that e.g. regulatory changed and has disabled a channel between scanning (channel is checked to be usable by cfg80211_get_ies_channel_number) and connecting on the channel later. With one scenario that isn't covered elsewhere described above, the warning isn't good, replace it with a (more informative) error message. Reported-by: syzbot+639af5aa411f2581ad38@syzkaller.appspotmail.com Link: https://patch.msgid.link/20251202102511.5a8fb5184fa3.I961ee41b8f10538a54b8565dbf03ec1696e80e03@changeid Signed-off-by: Johannes Berg --- net/mac80211/mlme.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c index e56ad4b9330f2..ad53dedd929c2 100644 --- a/net/mac80211/mlme.c +++ b/net/mac80211/mlme.c @@ -1126,7 +1126,10 @@ ieee80211_determine_chan_mode(struct ieee80211_sub_if_data *sdata, while (!ieee80211_chandef_usable(sdata, &chanreq->oper, IEEE80211_CHAN_DISABLED)) { - if (WARN_ON(chanreq->oper.width == NL80211_CHAN_WIDTH_20_NOHT)) { + if (chanreq->oper.width == NL80211_CHAN_WIDTH_20_NOHT) { + link_id_info(sdata, link_id, + "unusable channel (%d MHz) for connection\n", + chanreq->oper.chan->center_freq); ret = -EINVAL; goto free; } From 66ffe18329b86dfa432b81c7c12de8840c43b5d9 Mon Sep 17 00:00:00 2001 From: Aloka Dixit Date: Mon, 15 Dec 2025 09:46:56 -0800 Subject: [PATCH 0527/1321] wifi: mac80211: do not use old MBSSID elements When userspace brings down and deletes a non-transmitted profile, it is expected to send a new updated Beacon template for the transmitted profile of that multiple BSSID (MBSSID) group which does not include the removed profile in MBSSID element. This update comes via NL80211_CMD_SET_BEACON. Such updates work well as long as the group continues to have at least one non-transmitted profile as NL80211_ATTR_MBSSID_ELEMS is included in the new Beacon template. But when the last non-trasmitted profile is removed, it still gets included in Beacon templates sent to driver. This happens because when no MBSSID elements are sent by the userspace, ieee80211_assign_beacon() ends up using the element stored from earlier Beacon template. Do not copy old MBSSID elements, instead userspace should always include these when applicable. Fixes: 2b3171c6fe0a ("mac80211: MBSSID beacon handling in AP mode") Signed-off-by: Aloka Dixit Link: https://patch.msgid.link/20251215174656.2866319-2-aloka.dixit@oss.qualcomm.com Signed-off-by: Johannes Berg --- net/mac80211/cfg.c | 10 ---------- 1 file changed, 10 deletions(-) diff --git a/net/mac80211/cfg.c b/net/mac80211/cfg.c index b51c2c8584ae0..c81091a5cc3a3 100644 --- a/net/mac80211/cfg.c +++ b/net/mac80211/cfg.c @@ -1345,7 +1345,6 @@ ieee80211_assign_beacon(struct ieee80211_sub_if_data *sdata, size = sizeof(*new) + new_head_len + new_tail_len; - /* new or old multiple BSSID elements? */ if (params->mbssid_ies) { mbssid = params->mbssid_ies; size += struct_size(new->mbssid_ies, elem, mbssid->cnt); @@ -1355,15 +1354,6 @@ ieee80211_assign_beacon(struct ieee80211_sub_if_data *sdata, } size += ieee80211_get_mbssid_beacon_len(mbssid, rnr, mbssid->cnt); - } else if (old && old->mbssid_ies) { - mbssid = old->mbssid_ies; - size += struct_size(new->mbssid_ies, elem, mbssid->cnt); - if (old && old->rnr_ies) { - rnr = old->rnr_ies; - size += struct_size(new->rnr_ies, elem, rnr->cnt); - } - size += ieee80211_get_mbssid_beacon_len(mbssid, rnr, - mbssid->cnt); } new = kzalloc(size, GFP_KERNEL); From 896635304f67b8ea2e914498be399c400eb6104f Mon Sep 17 00:00:00 2001 From: Moon Hee Lee Date: Mon, 15 Dec 2025 19:59:32 -0800 Subject: [PATCH 0528/1321] wifi: mac80211: ocb: skip rx_no_sta when interface is not joined ieee80211_ocb_rx_no_sta() assumes a valid channel context, which is only present after JOIN_OCB. RX may run before JOIN_OCB is executed, in which case the OCB interface is not operational. Skip RX peer handling when the interface is not joined to avoid warnings in the RX path. Reported-by: syzbot+b364457b2d1d4e4a3054@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=b364457b2d1d4e4a3054 Tested-by: syzbot+b364457b2d1d4e4a3054@syzkaller.appspotmail.com Signed-off-by: Moon Hee Lee Link: https://patch.msgid.link/20251216035932.18332-1-moonhee.lee.ca@gmail.com Signed-off-by: Johannes Berg --- net/mac80211/ocb.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/net/mac80211/ocb.c b/net/mac80211/ocb.c index a5d4358f122ae..ebb4f4d88c237 100644 --- a/net/mac80211/ocb.c +++ b/net/mac80211/ocb.c @@ -47,6 +47,9 @@ void ieee80211_ocb_rx_no_sta(struct ieee80211_sub_if_data *sdata, struct sta_info *sta; int band; + if (!ifocb->joined) + return; + /* XXX: Consider removing the least recently used entry and * allow new one to be added. */ From 00b1f3eb33957db0e2c76cc86fc5eec9307ab500 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ville=20Syrj=C3=A4l=C3=A4?= Date: Fri, 14 Nov 2025 00:28:52 +0200 Subject: [PATCH 0529/1321] wifi: iwlwifi: Fix firmware version handling MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit On my system the arithmetic done on the firmware numbers results in a negative number, but since the types are unsigned it gets interpreted as a large positive number. The end result is that the firmware gets rejected and wifi is defunct. Switch to signed types to handle this case correctly. iwlwifi 0000:0c:00.0: Driver unable to support your firmware API. Driver supports FW core 4294967294..2, firmware is 2. iwlwifi 0000:0c:00.0: Direct firmware load for iwlwifi-5000-4.ucode failed with error -2 iwlwifi 0000:0c:00.0: Direct firmware load for iwlwifi-5000-3.ucode failed with error -2 iwlwifi 0000:0c:00.0: Direct firmware load for iwlwifi-5000-2.ucode failed with error -2 iwlwifi 0000:0c:00.0: Direct firmware load for iwlwifi-5000-1.ucode failed with error -2 iwlwifi 0000:0c:00.0: no suitable firmware found! iwlwifi 0000:0c:00.0: minimum version required: iwlwifi-5000-1 iwlwifi 0000:0c:00.0: maximum version supported: iwlwifi-5000-5 iwlwifi 0000:0c:00.0: check git://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git Cc: stable@vger.kernel.org Fixes: 5f708cccde9d ("wifi: iwlwifi: add a new FW file numbering scheme") Signed-off-by: Ville Syrjälä Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220805 Link: https://patch.msgid.link/20251113222852.15896-1-ville.syrjala@linux.intel.com Signed-off-by: Johannes Berg --- drivers/net/wireless/intel/iwlwifi/iwl-drv.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/wireless/intel/iwlwifi/iwl-drv.c b/drivers/net/wireless/intel/iwlwifi/iwl-drv.c index 3391f07b01de3..f8fc6f30fbe5f 100644 --- a/drivers/net/wireless/intel/iwlwifi/iwl-drv.c +++ b/drivers/net/wireless/intel/iwlwifi/iwl-drv.c @@ -1597,7 +1597,7 @@ static void _iwl_op_mode_stop(struct iwl_drv *drv) */ static void iwl_req_fw_callback(const struct firmware *ucode_raw, void *context) { - unsigned int min_core, max_core, loaded_core; + int min_core, max_core, loaded_core; struct iwl_drv *drv = context; struct iwl_fw *fw = &drv->fw; const struct iwl_ucode_header *ucode; @@ -1676,7 +1676,7 @@ static void iwl_req_fw_callback(const struct firmware *ucode_raw, void *context) if (loaded_core < min_core || loaded_core > max_core) { IWL_ERR(drv, "Driver unable to support your firmware API. " - "Driver supports FW core %u..%u, firmware is %u.\n", + "Driver supports FW core %d..%d, firmware is %d.\n", min_core, max_core, loaded_core); goto try_again; } From 8e410d8cbd9643af316cfc027dbc28a03c3da274 Mon Sep 17 00:00:00 2001 From: Yao Zi Date: Thu, 4 Dec 2025 12:32:04 +0000 Subject: [PATCH 0530/1321] wifi: iwlwifi: Implement settime64 as stub for MVM/MLD PTP Since commit dfb073d32cac ("ptp: Return -EINVAL on ptp_clock_register if required ops are NULL"), PTP clock registered through ptp_clock_register is required to have ptp_clock_info.settime64 set, however, neither MVM nor MLD's PTP clock implementation sets it, resulting in warnings when the interface starts up, like WARNING: drivers/ptp/ptp_clock.c:325 at ptp_clock_register+0x2c8/0x6b8, CPU#1: wpa_supplicant/469 CPU: 1 UID: 0 PID: 469 Comm: wpa_supplicant Not tainted 6.18.0+ #101 PREEMPT(full) ra: ffff800002732cd4 iwl_mvm_ptp_init+0x114/0x188 [iwlmvm] ERA: 9000000002fdc468 ptp_clock_register+0x2c8/0x6b8 iwlwifi 0000:01:00.0: Failed to register PHC clock (-22) I don't find an appropriate firmware interface to implement settime64() for iwlwifi MLD/MVM, thus instead create a stub that returns -EOPTNOTSUPP only, suppressing the warning and allowing the PTP clock to be registered. Reported-by: Nathan Chancellor Closes: https://lore.kernel.org/all/20251108044822.GA3262936@ax162/ Signed-off-by: Yao Zi Tested-by: Nathan Chancellor Reviewed-by: Simon Horman tested-by: damian Tometzki damian@riscv-rocks.de Tested-by: Oliver Hartkopp Acked-by: Miri Korenblit Link: https://patch.msgid.link/20251204123204.9316-1-ziyao@disroot.org Signed-off-by: Johannes Berg --- drivers/net/wireless/intel/iwlwifi/mld/ptp.c | 7 +++++++ drivers/net/wireless/intel/iwlwifi/mvm/ptp.c | 7 +++++++ 2 files changed, 14 insertions(+) diff --git a/drivers/net/wireless/intel/iwlwifi/mld/ptp.c b/drivers/net/wireless/intel/iwlwifi/mld/ptp.c index ffeb37a7f830e..231920425c066 100644 --- a/drivers/net/wireless/intel/iwlwifi/mld/ptp.c +++ b/drivers/net/wireless/intel/iwlwifi/mld/ptp.c @@ -121,6 +121,12 @@ static int iwl_mld_ptp_gettime(struct ptp_clock_info *ptp, return 0; } +static int iwl_mld_ptp_settime(struct ptp_clock_info *ptp, + const struct timespec64 *ts) +{ + return -EOPNOTSUPP; +} + static int iwl_mld_ptp_adjtime(struct ptp_clock_info *ptp, s64 delta) { struct iwl_mld *mld = container_of(ptp, struct iwl_mld, @@ -279,6 +285,7 @@ void iwl_mld_ptp_init(struct iwl_mld *mld) mld->ptp_data.ptp_clock_info.owner = THIS_MODULE; mld->ptp_data.ptp_clock_info.gettime64 = iwl_mld_ptp_gettime; + mld->ptp_data.ptp_clock_info.settime64 = iwl_mld_ptp_settime; mld->ptp_data.ptp_clock_info.max_adj = 0x7fffffff; mld->ptp_data.ptp_clock_info.adjtime = iwl_mld_ptp_adjtime; mld->ptp_data.ptp_clock_info.adjfine = iwl_mld_ptp_adjfine; diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/ptp.c b/drivers/net/wireless/intel/iwlwifi/mvm/ptp.c index 06a4c9f74797a..ad156b82eaa94 100644 --- a/drivers/net/wireless/intel/iwlwifi/mvm/ptp.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/ptp.c @@ -220,6 +220,12 @@ static int iwl_mvm_ptp_gettime(struct ptp_clock_info *ptp, return 0; } +static int iwl_mvm_ptp_settime(struct ptp_clock_info *ptp, + const struct timespec64 *ts) +{ + return -EOPNOTSUPP; +} + static int iwl_mvm_ptp_adjtime(struct ptp_clock_info *ptp, s64 delta) { struct iwl_mvm *mvm = container_of(ptp, struct iwl_mvm, @@ -281,6 +287,7 @@ void iwl_mvm_ptp_init(struct iwl_mvm *mvm) mvm->ptp_data.ptp_clock_info.adjfine = iwl_mvm_ptp_adjfine; mvm->ptp_data.ptp_clock_info.adjtime = iwl_mvm_ptp_adjtime; mvm->ptp_data.ptp_clock_info.gettime64 = iwl_mvm_ptp_gettime; + mvm->ptp_data.ptp_clock_info.settime64 = iwl_mvm_ptp_settime; mvm->ptp_data.scaled_freq = SCALE_FACTOR; /* Give a short 'friendly name' to identify the PHC clock */ From 340e9f65fb66c237e28673f4e92a2445eb658ca6 Mon Sep 17 00:00:00 2001 From: Wei Fang Date: Thu, 4 Dec 2025 15:13:32 +0800 Subject: [PATCH 0531/1321] net: stmmac: fix the crash issue for zero copy XDP_TX action There is a crash issue when running zero copy XDP_TX action, the crash log is shown below. [ 216.122464] Unable to handle kernel paging request at virtual address fffeffff80000000 [ 216.187524] Internal error: Oops: 0000000096000144 [#1] SMP [ 216.301694] Call trace: [ 216.304130] dcache_clean_poc+0x20/0x38 (P) [ 216.308308] __dma_sync_single_for_device+0x1bc/0x1e0 [ 216.313351] stmmac_xdp_xmit_xdpf+0x354/0x400 [ 216.317701] __stmmac_xdp_run_prog+0x164/0x368 [ 216.322139] stmmac_napi_poll_rxtx+0xba8/0xf00 [ 216.326576] __napi_poll+0x40/0x218 [ 216.408054] Kernel panic - not syncing: Oops: Fatal exception in interrupt For XDP_TX action, the xdp_buff is converted to xdp_frame by xdp_convert_buff_to_frame(). The memory type of the resulting xdp_frame depends on the memory type of the xdp_buff. For page pool based xdp_buff it produces xdp_frame with memory type MEM_TYPE_PAGE_POOL. For zero copy XSK pool based xdp_buff it produces xdp_frame with memory type MEM_TYPE_PAGE_ORDER0. However, stmmac_xdp_xmit_back() does not check the memory type and always uses the page pool type, this leads to invalid mappings and causes the crash. Therefore, check the xdp_buff memory type in stmmac_xdp_xmit_back() to fix this issue. Fixes: bba2556efad6 ("net: stmmac: Enable RX via AF_XDP zero-copy") Signed-off-by: Wei Fang Reviewed-by: Hariprasad Kelam Link: https://patch.msgid.link/20251204071332.1907111-1-wei.fang@nxp.com Signed-off-by: Paolo Abeni --- .../net/ethernet/stmicro/stmmac/stmmac_main.c | 17 +++++++++++++++-- 1 file changed, 15 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index da206b24aaed9..b3730312aeed4 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -89,6 +89,7 @@ MODULE_PARM_DESC(phyaddr, "Physical device address"); #define STMMAC_XDP_CONSUMED BIT(0) #define STMMAC_XDP_TX BIT(1) #define STMMAC_XDP_REDIRECT BIT(2) +#define STMMAC_XSK_CONSUMED BIT(3) static int flow_ctrl = 0xdead; module_param(flow_ctrl, int, 0644); @@ -5126,6 +5127,7 @@ static int stmmac_xdp_get_tx_queue(struct stmmac_priv *priv, static int stmmac_xdp_xmit_back(struct stmmac_priv *priv, struct xdp_buff *xdp) { + bool zc = !!(xdp->rxq->mem.type == MEM_TYPE_XSK_BUFF_POOL); struct xdp_frame *xdpf = xdp_convert_buff_to_frame(xdp); int cpu = smp_processor_id(); struct netdev_queue *nq; @@ -5142,9 +5144,18 @@ static int stmmac_xdp_xmit_back(struct stmmac_priv *priv, /* Avoids TX time-out as we are sharing with slow path */ txq_trans_cond_update(nq); - res = stmmac_xdp_xmit_xdpf(priv, queue, xdpf, false); - if (res == STMMAC_XDP_TX) + /* For zero copy XDP_TX action, dma_map is true */ + res = stmmac_xdp_xmit_xdpf(priv, queue, xdpf, zc); + if (res == STMMAC_XDP_TX) { stmmac_flush_tx_descriptors(priv, queue); + } else if (res == STMMAC_XDP_CONSUMED && zc) { + /* xdp has been freed by xdp_convert_buff_to_frame(), + * no need to call xsk_buff_free() again, so return + * STMMAC_XSK_CONSUMED. + */ + res = STMMAC_XSK_CONSUMED; + xdp_return_frame(xdpf); + } __netif_tx_unlock(nq); @@ -5494,6 +5505,8 @@ static int stmmac_rx_zc(struct stmmac_priv *priv, int limit, u32 queue) break; case STMMAC_XDP_CONSUMED: xsk_buff_free(buf->xdp); + fallthrough; + case STMMAC_XSK_CONSUMED: rx_dropped++; break; case STMMAC_XDP_TX: From e0eeb02c8b34521ce9a0952e1a6762f6998a3cab Mon Sep 17 00:00:00 2001 From: Ankit Garg Date: Fri, 19 Dec 2025 10:29:45 +0000 Subject: [PATCH 0532/1321] gve: defer interrupt enabling until NAPI registration Currently, interrupts are automatically enabled immediately upon request. This allows interrupt to fire before the associated NAPI context is fully initialized and cause failures like below: [ 0.946369] Call Trace: [ 0.946369] [ 0.946369] __napi_poll+0x2a/0x1e0 [ 0.946369] net_rx_action+0x2f9/0x3f0 [ 0.946369] handle_softirqs+0xd6/0x2c0 [ 0.946369] ? handle_edge_irq+0xc1/0x1b0 [ 0.946369] __irq_exit_rcu+0xc3/0xe0 [ 0.946369] common_interrupt+0x81/0xa0 [ 0.946369] [ 0.946369] [ 0.946369] asm_common_interrupt+0x22/0x40 [ 0.946369] RIP: 0010:pv_native_safe_halt+0xb/0x10 Use the `IRQF_NO_AUTOEN` flag when requesting interrupts to prevent auto enablement and explicitly enable the interrupt in NAPI initialization path (and disable it during NAPI teardown). This ensures that interrupt lifecycle is strictly coupled with readiness of NAPI context. Cc: stable@vger.kernel.org Fixes: 1dfc2e46117e ("gve: Refactor napi add and remove functions") Signed-off-by: Ankit Garg Reviewed-by: Jordan Rhee Reviewed-by: Joshua Washington Signed-off-by: Harshitha Ramamurthy Link: https://patch.msgid.link/20251219102945.2193617-1-hramamurthy@google.com Signed-off-by: Paolo Abeni --- drivers/net/ethernet/google/gve/gve_main.c | 2 +- drivers/net/ethernet/google/gve/gve_utils.c | 2 ++ 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/google/gve/gve_main.c b/drivers/net/ethernet/google/gve/gve_main.c index a7a088a77f378..7eb64e1e4d858 100644 --- a/drivers/net/ethernet/google/gve/gve_main.c +++ b/drivers/net/ethernet/google/gve/gve_main.c @@ -558,7 +558,7 @@ static int gve_alloc_notify_blocks(struct gve_priv *priv) block->priv = priv; err = request_irq(priv->msix_vectors[msix_idx].vector, gve_is_gqi(priv) ? gve_intr : gve_intr_dqo, - 0, block->name, block); + IRQF_NO_AUTOEN, block->name, block); if (err) { dev_err(&priv->pdev->dev, "Failed to receive msix vector %d\n", i); diff --git a/drivers/net/ethernet/google/gve/gve_utils.c b/drivers/net/ethernet/google/gve/gve_utils.c index ace9b8698021f..b53b7fcdcdaf1 100644 --- a/drivers/net/ethernet/google/gve/gve_utils.c +++ b/drivers/net/ethernet/google/gve/gve_utils.c @@ -112,11 +112,13 @@ void gve_add_napi(struct gve_priv *priv, int ntfy_idx, netif_napi_add_locked(priv->dev, &block->napi, gve_poll); netif_napi_set_irq_locked(&block->napi, block->irq); + enable_irq(block->irq); } void gve_remove_napi(struct gve_priv *priv, int ntfy_idx) { struct gve_notify_block *block = &priv->ntfy_blocks[ntfy_idx]; + disable_irq(block->irq); netif_napi_del_locked(&block->napi); } From 882c28b6e5f667483c66e7539e42e08cff5fa8e1 Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Fri, 19 Dec 2025 14:44:59 +0000 Subject: [PATCH 0533/1321] usbnet: avoid a possible crash in dql_completed() syzbot reported a crash [1] in dql_completed() after recent usbnet BQL adoption. The reason for the crash is that netdev_reset_queue() is called too soon. It should be called after cancel_work_sync(&dev->bh_work) to make sure no more TX completion can happen. [1] kernel BUG at lib/dynamic_queue_limits.c:99 ! Oops: invalid opcode: 0000 [#1] SMP KASAN PTI CPU: 1 UID: 0 PID: 5197 Comm: udevd Tainted: G L syzkaller #0 PREEMPT(full) Tainted: [L]=SOFTLOCKUP Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025 RIP: 0010:dql_completed+0xbe1/0xbf0 lib/dynamic_queue_limits.c:99 Call Trace: netdev_tx_completed_queue include/linux/netdevice.h:3864 [inline] netdev_completed_queue include/linux/netdevice.h:3894 [inline] usbnet_bh+0x793/0x1020 drivers/net/usb/usbnet.c:1601 process_one_work kernel/workqueue.c:3257 [inline] process_scheduled_works+0xad1/0x1770 kernel/workqueue.c:3340 bh_worker+0x2b1/0x600 kernel/workqueue.c:3611 tasklet_action+0xc/0x70 kernel/softirq.c:952 handle_softirqs+0x27d/0x850 kernel/softirq.c:622 __do_softirq kernel/softirq.c:656 [inline] invoke_softirq kernel/softirq.c:496 [inline] __irq_exit_rcu+0xca/0x1f0 kernel/softirq.c:723 irq_exit_rcu+0x9/0x30 kernel/softirq.c:739 Fixes: 7ff14c52049e ("usbnet: Add support for Byte Queue Limits (BQL)") Reported-by: syzbot+5b55e49f8bbd84631a9c@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/6945644f.a70a0220.207337.0113.GAE@google.com/T/#u Signed-off-by: Eric Dumazet Cc: Simon Schippers Link: https://patch.msgid.link/20251219144459.692715-1-edumazet@google.com Signed-off-by: Paolo Abeni --- drivers/net/usb/usbnet.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c index 1d9faa70ba3b7..36742e64cff75 100644 --- a/drivers/net/usb/usbnet.c +++ b/drivers/net/usb/usbnet.c @@ -831,7 +831,6 @@ int usbnet_stop(struct net_device *net) clear_bit(EVENT_DEV_OPEN, &dev->flags); netif_stop_queue(net); - netdev_reset_queue(net); netif_info(dev, ifdown, dev->net, "stop stats: rx/tx %lu/%lu, errs %lu/%lu\n", @@ -875,6 +874,8 @@ int usbnet_stop(struct net_device *net) timer_delete_sync(&dev->delay); cancel_work_sync(&dev->kevent); + netdev_reset_queue(net); + if (!pm) usb_autopm_put_interface(dev->intf); From 426d74b00abbe9ab6f826d44ba108fccc0714ba5 Mon Sep 17 00:00:00 2001 From: Will Rosenberg Date: Fri, 19 Dec 2025 10:36:37 -0700 Subject: [PATCH 0534/1321] ipv6: BUG() in pskb_expand_head() as part of calipso_skbuff_setattr() There exists a kernel oops caused by a BUG_ON(nhead < 0) at net/core/skbuff.c:2232 in pskb_expand_head(). This bug is triggered as part of the calipso_skbuff_setattr() routine when skb_cow() is passed headroom > INT_MAX (i.e. (int)(skb_headroom(skb) + len_delta) < 0). The root cause of the bug is due to an implicit integer cast in __skb_cow(). The check (headroom > skb_headroom(skb)) is meant to ensure that delta = headroom - skb_headroom(skb) is never negative, otherwise we will trigger a BUG_ON in pskb_expand_head(). However, if headroom > INT_MAX and delta <= -NET_SKB_PAD, the check passes, delta becomes negative, and pskb_expand_head() is passed a negative value for nhead. Fix the trigger condition in calipso_skbuff_setattr(). Avoid passing "negative" headroom sizes to skb_cow() within calipso_skbuff_setattr() by only using skb_cow() to grow headroom. PoC: Using `netlabelctl` tool: netlabelctl map del default netlabelctl calipso add pass doi:7 netlabelctl map add default address:0::1/128 protocol:calipso,7 Then run the following PoC: int fd = socket(AF_INET6, SOCK_DGRAM, IPPROTO_UDP); // setup msghdr int cmsg_size = 2; int cmsg_len = 0x60; struct msghdr msg; struct sockaddr_in6 dest_addr; struct cmsghdr * cmsg = (struct cmsghdr *) calloc(1, sizeof(struct cmsghdr) + cmsg_len); msg.msg_name = &dest_addr; msg.msg_namelen = sizeof(dest_addr); msg.msg_iov = NULL; msg.msg_iovlen = 0; msg.msg_control = cmsg; msg.msg_controllen = cmsg_len; msg.msg_flags = 0; // setup sockaddr dest_addr.sin6_family = AF_INET6; dest_addr.sin6_port = htons(31337); dest_addr.sin6_flowinfo = htonl(31337); dest_addr.sin6_addr = in6addr_loopback; dest_addr.sin6_scope_id = 31337; // setup cmsghdr cmsg->cmsg_len = cmsg_len; cmsg->cmsg_level = IPPROTO_IPV6; cmsg->cmsg_type = IPV6_HOPOPTS; char * hop_hdr = (char *)cmsg + sizeof(struct cmsghdr); hop_hdr[1] = 0x9; //set hop size - (0x9 + 1) * 8 = 80 sendmsg(fd, &msg, 0); Fixes: 2917f57b6bc1 ("calipso: Allow the lsm to label the skbuff directly.") Suggested-by: Paul Moore Signed-off-by: Will Rosenberg Acked-by: Paul Moore Link: https://patch.msgid.link/20251219173637.797418-1-whrosenb@asu.edu Signed-off-by: Paolo Abeni --- net/ipv6/calipso.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/net/ipv6/calipso.c b/net/ipv6/calipso.c index df1986973430c..21f6ed126253a 100644 --- a/net/ipv6/calipso.c +++ b/net/ipv6/calipso.c @@ -1342,7 +1342,8 @@ static int calipso_skbuff_setattr(struct sk_buff *skb, /* At this point new_end aligns to 4n, so (new_end & 4) pads to 8n */ pad = ((new_end & 4) + (end & 7)) & 7; len_delta = new_end - (int)end + pad; - ret_val = skb_cow(skb, skb_headroom(skb) + len_delta); + ret_val = skb_cow(skb, + skb_headroom(skb) + (len_delta > 0 ? len_delta : 0)); if (ret_val < 0) return ret_val; From 900acf6783c2161fc4070c8e38a1c8b668623cb5 Mon Sep 17 00:00:00 2001 From: Ethan Nelson-Moore Date: Sun, 21 Dec 2025 00:24:00 -0800 Subject: [PATCH 0535/1321] net: usb: sr9700: fix incorrect command used to write single register This fixes the device failing to initialize with "error reading MAC address" for me, probably because the incorrect write of NCR_RST to SR_NCR is not actually resetting the device. Fixes: c9b37458e95629b1d1171457afdcc1bf1eb7881d ("USB2NET : SR9700 : One chip USB 1.1 USB2NET SR9700Device Driver Support") Cc: stable@vger.kernel.org Signed-off-by: Ethan Nelson-Moore Link: https://patch.msgid.link/20251221082400.50688-1-enelsonmoore@gmail.com Signed-off-by: Paolo Abeni --- drivers/net/usb/sr9700.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/usb/sr9700.c b/drivers/net/usb/sr9700.c index d8ffb59eaf348..820c4c5069792 100644 --- a/drivers/net/usb/sr9700.c +++ b/drivers/net/usb/sr9700.c @@ -52,7 +52,7 @@ static int sr_read_reg(struct usbnet *dev, u8 reg, u8 *value) static int sr_write_reg(struct usbnet *dev, u8 reg, u8 value) { - return usbnet_write_cmd(dev, SR_WR_REGS, SR_REQ_WR_REG, + return usbnet_write_cmd(dev, SR_WR_REG, SR_REQ_WR_REG, value, reg, NULL, 0); } @@ -65,7 +65,7 @@ static void sr_write_async(struct usbnet *dev, u8 reg, u16 length, static void sr_write_reg_async(struct usbnet *dev, u8 reg, u8 value) { - usbnet_write_cmd_async(dev, SR_WR_REGS, SR_REQ_WR_REG, + usbnet_write_cmd_async(dev, SR_WR_REG, SR_REQ_WR_REG, value, reg, NULL, 0); } From 34475cb3f37a90a4cf40d4a478a319a5de3f3972 Mon Sep 17 00:00:00 2001 From: Ido Schimmel Date: Sun, 21 Dec 2025 16:48:28 +0200 Subject: [PATCH 0536/1321] ipv4: Fix reference count leak when using error routes with nexthop objects When a nexthop object is deleted, it is marked as dead and then fib_table_flush() is called to flush all the routes that are using the dead nexthop. The current logic in fib_table_flush() is to only flush error routes (e.g., blackhole) when it is called as part of network namespace dismantle (i.e., with flush_all=true). Therefore, error routes are not flushed when their nexthop object is deleted: # ip link add name dummy1 up type dummy # ip nexthop add id 1 dev dummy1 # ip route add 198.51.100.1/32 nhid 1 # ip route add blackhole 198.51.100.2/32 nhid 1 # ip nexthop del id 1 # ip route show blackhole 198.51.100.2 nhid 1 dev dummy1 As such, they keep holding a reference on the nexthop object which in turn holds a reference on the nexthop device, resulting in a reference count leak: # ip link del dev dummy1 [ 70.516258] unregister_netdevice: waiting for dummy1 to become free. Usage count = 2 Fix by flushing error routes when their nexthop is marked as dead. IPv6 does not suffer from this problem. Fixes: 493ced1ac47c ("ipv4: Allow routes to use nexthop objects") Reported-by: Tetsuo Handa Closes: https://lore.kernel.org/netdev/d943f806-4da6-4970-ac28-b9373b0e63ac@I-love.SAKURA.ne.jp/ Reported-by: syzbot+881d65229ca4f9ae8c84@syzkaller.appspotmail.com Signed-off-by: Ido Schimmel Reviewed-by: David Ahern Link: https://patch.msgid.link/20251221144829.197694-1-idosch@nvidia.com Signed-off-by: Paolo Abeni --- net/ipv4/fib_trie.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c index 59a6f0a9638f9..7e2c17fec3fc4 100644 --- a/net/ipv4/fib_trie.c +++ b/net/ipv4/fib_trie.c @@ -2053,10 +2053,11 @@ int fib_table_flush(struct net *net, struct fib_table *tb, bool flush_all) continue; } - /* Do not flush error routes if network namespace is - * not being dismantled + /* When not flushing the entire table, skip error + * routes that are not marked for deletion. */ - if (!flush_all && fib_props[fa->fa_type].error) { + if (!flush_all && fib_props[fa->fa_type].error && + !(fi->fib_flags & RTNH_F_DEAD)) { slen = fa->fa_slen; continue; } From 1584d8ae5273144929c1d6b3ef8e4a185618bf42 Mon Sep 17 00:00:00 2001 From: Ido Schimmel Date: Sun, 21 Dec 2025 16:48:29 +0200 Subject: [PATCH 0537/1321] selftests: fib_nexthops: Add test cases for error routes deletion Add test cases that check that error routes (e.g., blackhole) are deleted when their nexthop is deleted. Output without "ipv4: Fix reference count leak when using error routes with nexthop objects": # ./fib_nexthops.sh -t "ipv4_fcnal ipv6_fcnal" IPv4 functional ---------------------- [...] WARNING: Unexpected route entry TEST: Error route removed on nexthop deletion [FAIL] IPv6 ---------------------- [...] TEST: Error route removed on nexthop deletion [ OK ] Tests passed: 20 Tests failed: 1 Tests skipped: 0 Output with "ipv4: Fix reference count leak when using error routes with nexthop objects": # ./fib_nexthops.sh -t "ipv4_fcnal ipv6_fcnal" IPv4 functional ---------------------- [...] TEST: Error route removed on nexthop deletion [ OK ] IPv6 ---------------------- [...] TEST: Error route removed on nexthop deletion [ OK ] Tests passed: 21 Tests failed: 0 Tests skipped: 0 Signed-off-by: Ido Schimmel Reviewed-by: David Ahern Link: https://patch.msgid.link/20251221144829.197694-2-idosch@nvidia.com Signed-off-by: Paolo Abeni --- tools/testing/selftests/net/fib_nexthops.sh | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/tools/testing/selftests/net/fib_nexthops.sh b/tools/testing/selftests/net/fib_nexthops.sh index 2b0a90581e2f1..21026b6676670 100755 --- a/tools/testing/selftests/net/fib_nexthops.sh +++ b/tools/testing/selftests/net/fib_nexthops.sh @@ -800,6 +800,14 @@ ipv6_fcnal() set +e check_nexthop "dev veth1" "" log_test $? 0 "Nexthops removed on admin down" + + # error routes should be deleted when their nexthop is deleted + run_cmd "$IP li set dev veth1 up" + run_cmd "$IP -6 nexthop add id 58 dev veth1" + run_cmd "$IP ro add blackhole 2001:db8:101::1/128 nhid 58" + run_cmd "$IP nexthop del id 58" + check_route6 "2001:db8:101::1" "" + log_test $? 0 "Error route removed on nexthop deletion" } ipv6_grp_refs() @@ -1459,6 +1467,13 @@ ipv4_fcnal() run_cmd "$IP ro del 172.16.102.0/24" log_test $? 0 "Delete route when not specifying nexthop attributes" + + # error routes should be deleted when their nexthop is deleted + run_cmd "$IP nexthop add id 23 dev veth1" + run_cmd "$IP ro add blackhole 172.16.102.100/32 nhid 23" + run_cmd "$IP nexthop del id 23" + check_route "172.16.102.100" "" + log_test $? 0 "Error route removed on nexthop deletion" } ipv4_grp_fcnal() From c55f51b6926b390d089d6eddec3aaf8900fbb11d Mon Sep 17 00:00:00 2001 From: Vadim Fedorenko Date: Sun, 21 Dec 2025 19:26:38 +0000 Subject: [PATCH 0538/1321] net: fib: restore ECMP balance from loopback Preference of nexthop with source address broke ECMP for packets with source addresses which are not in the broadcast domain, but rather added to loopback/dummy interfaces. Original behaviour was to balance over nexthops while now it uses the latest nexthop from the group. To fix the issue introduce next hop scoring system where next hops with source address equal to requested will always have higher priority. For the case with 198.51.100.1/32 assigned to dummy0 and routed using 192.0.2.0/24 and 203.0.113.0/24 networks: 2: dummy0: mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000 link/ether d6:54:8a:ff:78:f5 brd ff:ff:ff:ff:ff:ff inet 198.51.100.1/32 scope global dummy0 valid_lft forever preferred_lft forever 7: veth1@if6: mtu 1500 qdisc noqueue state UP group default qlen 1000 link/ether 06:ed:98:87:6d:8a brd ff:ff:ff:ff:ff:ff link-netnsid 0 inet 192.0.2.2/24 scope global veth1 valid_lft forever preferred_lft forever inet6 fe80::4ed:98ff:fe87:6d8a/64 scope link proto kernel_ll valid_lft forever preferred_lft forever 9: veth3@if8: mtu 1500 qdisc noqueue state UP group default qlen 1000 link/ether ae:75:23:38:a0:d2 brd ff:ff:ff:ff:ff:ff link-netnsid 0 inet 203.0.113.2/24 scope global veth3 valid_lft forever preferred_lft forever inet6 fe80::ac75:23ff:fe38:a0d2/64 scope link proto kernel_ll valid_lft forever preferred_lft forever ~ ip ro list: default nexthop via 192.0.2.1 dev veth1 weight 1 nexthop via 203.0.113.1 dev veth3 weight 1 192.0.2.0/24 dev veth1 proto kernel scope link src 192.0.2.2 203.0.113.0/24 dev veth3 proto kernel scope link src 203.0.113.2 before: for i in {1..255} ; do ip ro get 10.0.0.$i; done | grep veth | awk ' {print $(NF-2)}' | sort | uniq -c: 255 veth3 after: for i in {1..255} ; do ip ro get 10.0.0.$i; done | grep veth | awk ' {print $(NF-2)}' | sort | uniq -c: 122 veth1 133 veth3 Fixes: 32607a332cfe ("ipv4: prefer multipath nexthop that matches source address") Signed-off-by: Vadim Fedorenko Reviewed-by: Ido Schimmel Reviewed-by: Willem de Bruijn Link: https://patch.msgid.link/20251221192639.3911901-1-vadim.fedorenko@linux.dev Signed-off-by: Paolo Abeni --- net/ipv4/fib_semantics.c | 26 ++++++++++---------------- 1 file changed, 10 insertions(+), 16 deletions(-) diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c index a5f3c8459758f..0caf38e44c738 100644 --- a/net/ipv4/fib_semantics.c +++ b/net/ipv4/fib_semantics.c @@ -2167,8 +2167,8 @@ void fib_select_multipath(struct fib_result *res, int hash, { struct fib_info *fi = res->fi; struct net *net = fi->fib_net; - bool found = false; bool use_neigh; + int score = -1; __be32 saddr; if (unlikely(res->fi->nh)) { @@ -2180,7 +2180,7 @@ void fib_select_multipath(struct fib_result *res, int hash, saddr = fl4 ? fl4->saddr : 0; change_nexthops(fi) { - int nh_upper_bound; + int nh_upper_bound, nh_score = 0; /* Nexthops without a carrier are assigned an upper bound of * minus one when "ignore_routes_with_linkdown" is set. @@ -2190,24 +2190,18 @@ void fib_select_multipath(struct fib_result *res, int hash, (use_neigh && !fib_good_nh(nexthop_nh))) continue; - if (!found) { + if (saddr && nexthop_nh->nh_saddr == saddr) + nh_score += 2; + if (hash <= nh_upper_bound) + nh_score++; + if (score < nh_score) { res->nh_sel = nhsel; res->nhc = &nexthop_nh->nh_common; - found = !saddr || nexthop_nh->nh_saddr == saddr; + if (nh_score == 3 || (!saddr && nh_score == 1)) + return; + score = nh_score; } - if (hash > nh_upper_bound) - continue; - - if (!saddr || nexthop_nh->nh_saddr == saddr) { - res->nh_sel = nhsel; - res->nhc = &nexthop_nh->nh_common; - return; - } - - if (found) - return; - } endfor_nexthops(fi); } #endif From ddb777a9729e656b0dbbc44a56346a65b08004c0 Mon Sep 17 00:00:00 2001 From: Vadim Fedorenko Date: Sun, 21 Dec 2025 19:26:39 +0000 Subject: [PATCH 0539/1321] selftests: fib_test: Add test case for ipv4 multi nexthops The test checks that with multi nexthops route the preferred route is the one which matches source ip. In case when source ip is on dummy interface, it checks that the routes are balanced. Reviewed-by: Willem de Bruijn Signed-off-by: Vadim Fedorenko Link: https://patch.msgid.link/20251221192639.3911901-2-vadim.fedorenko@linux.dev Signed-off-by: Paolo Abeni --- tools/testing/selftests/net/fib_tests.sh | 70 +++++++++++++++++++++++- 1 file changed, 69 insertions(+), 1 deletion(-) diff --git a/tools/testing/selftests/net/fib_tests.sh b/tools/testing/selftests/net/fib_tests.sh index a88f797c549a7..c5694cc4ddd26 100755 --- a/tools/testing/selftests/net/fib_tests.sh +++ b/tools/testing/selftests/net/fib_tests.sh @@ -12,7 +12,7 @@ TESTS="unregister down carrier nexthop suppress ipv6_notify ipv4_notify \ ipv4_route_metrics ipv4_route_v6_gw rp_filter ipv4_del_addr \ ipv6_del_addr ipv4_mangle ipv6_mangle ipv4_bcast_neigh fib6_gc_test \ ipv4_mpath_list ipv6_mpath_list ipv4_mpath_balance ipv6_mpath_balance \ - fib6_ra_to_static" + ipv4_mpath_balance_preferred fib6_ra_to_static" VERBOSE=0 PAUSE_ON_FAIL=no @@ -2751,6 +2751,73 @@ ipv4_mpath_balance_test() forwarding_cleanup } +get_route_dev_src() +{ + local pfx="$1" + local src="$2" + local out + + if out=$($IP -j route get "$pfx" from "$src" | jq -re ".[0].dev"); then + echo "$out" + fi +} + +ipv4_mpath_preferred() +{ + local src_ip=$1 + local pref_dev=$2 + local dev routes + local route0=0 + local route1=0 + local pref_route=0 + num_routes=254 + + for i in $(seq 1 $num_routes) ; do + dev=$(get_route_dev_src 172.16.105.$i $src_ip) + if [ "$dev" = "$pref_dev" ]; then + pref_route=$((pref_route+1)) + elif [ "$dev" = "veth1" ]; then + route0=$((route0+1)) + elif [ "$dev" = "veth3" ]; then + route1=$((route1+1)) + fi + done + + routes=$((route0+route1)) + + [ "$VERBOSE" = "1" ] && echo "multipath: routes seen: ($route0,$route1,$pref_route)" + + if [ x"$pref_dev" = x"" ]; then + [[ $routes -ge $num_routes ]] && [[ $route0 -gt 0 ]] && [[ $route1 -gt 0 ]] + else + [[ $pref_route -ge $num_routes ]] + fi + +} + +ipv4_mpath_balance_preferred_test() +{ + echo + echo "IPv4 multipath load balance preferred route" + + forwarding_setup + + $IP route add 172.16.105.0/24 \ + nexthop via 172.16.101.2 \ + nexthop via 172.16.103.2 + + ipv4_mpath_preferred 172.16.101.1 veth1 + log_test $? 0 "IPv4 multipath loadbalance from veth1" + + ipv4_mpath_preferred 172.16.103.1 veth3 + log_test $? 0 "IPv4 multipath loadbalance from veth3" + + ipv4_mpath_preferred 198.51.100.1 + log_test $? 0 "IPv4 multipath loadbalance from dummy" + + forwarding_cleanup +} + ipv6_mpath_balance_test() { echo @@ -2861,6 +2928,7 @@ do ipv6_mpath_list) ipv6_mpath_list_test;; ipv4_mpath_balance) ipv4_mpath_balance_test;; ipv6_mpath_balance) ipv6_mpath_balance_test;; + ipv4_mpath_balance_preferred) ipv4_mpath_balance_preferred_test;; fib6_ra_to_static) fib6_ra_to_static;; help) echo "Test names: $TESTS"; exit 0;; From 96486a7c5e53a51d4e9e09873c0db5b7857c5c53 Mon Sep 17 00:00:00 2001 From: Xiaolei Wang Date: Mon, 22 Dec 2025 09:56:24 +0800 Subject: [PATCH 0540/1321] net: macb: Relocate mog_init_rings() callback from macb_mac_link_up() to macb_open() In the non-RT kernel, local_bh_disable() merely disables preemption, whereas it maps to an actual spin lock in the RT kernel. Consequently, when attempting to refill RX buffers via netdev_alloc_skb() in macb_mac_link_up(), a deadlock scenario arises as follows: WARNING: possible circular locking dependency detected 6.18.0-08691-g2061f18ad76e #39 Not tainted ------------------------------------------------------ kworker/0:0/8 is trying to acquire lock: ffff00080369bbe0 (&bp->lock){+.+.}-{3:3}, at: macb_start_xmit+0x808/0xb7c but task is already holding lock: ffff000803698e58 (&queue->tx_ptr_lock){+...}-{3:3}, at: macb_start_xmit +0x148/0xb7c which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #3 (&queue->tx_ptr_lock){+...}-{3:3}: rt_spin_lock+0x50/0x1f0 macb_start_xmit+0x148/0xb7c dev_hard_start_xmit+0x94/0x284 sch_direct_xmit+0x8c/0x37c __dev_queue_xmit+0x708/0x1120 neigh_resolve_output+0x148/0x28c ip6_finish_output2+0x2c0/0xb2c __ip6_finish_output+0x114/0x308 ip6_output+0xc4/0x4a4 mld_sendpack+0x220/0x68c mld_ifc_work+0x2a8/0x4f4 process_one_work+0x20c/0x5f8 worker_thread+0x1b0/0x35c kthread+0x144/0x200 ret_from_fork+0x10/0x20 -> #2 (_xmit_ETHER#2){+...}-{3:3}: rt_spin_lock+0x50/0x1f0 sch_direct_xmit+0x11c/0x37c __dev_queue_xmit+0x708/0x1120 neigh_resolve_output+0x148/0x28c ip6_finish_output2+0x2c0/0xb2c __ip6_finish_output+0x114/0x308 ip6_output+0xc4/0x4a4 mld_sendpack+0x220/0x68c mld_ifc_work+0x2a8/0x4f4 process_one_work+0x20c/0x5f8 worker_thread+0x1b0/0x35c kthread+0x144/0x200 ret_from_fork+0x10/0x20 -> #1 ((softirq_ctrl.lock)){+.+.}-{3:3}: lock_release+0x250/0x348 __local_bh_enable_ip+0x7c/0x240 __netdev_alloc_skb+0x1b4/0x1d8 gem_rx_refill+0xdc/0x240 gem_init_rings+0xb4/0x108 macb_mac_link_up+0x9c/0x2b4 phylink_resolve+0x170/0x614 process_one_work+0x20c/0x5f8 worker_thread+0x1b0/0x35c kthread+0x144/0x200 ret_from_fork+0x10/0x20 -> #0 (&bp->lock){+.+.}-{3:3}: __lock_acquire+0x15a8/0x2084 lock_acquire+0x1cc/0x350 rt_spin_lock+0x50/0x1f0 macb_start_xmit+0x808/0xb7c dev_hard_start_xmit+0x94/0x284 sch_direct_xmit+0x8c/0x37c __dev_queue_xmit+0x708/0x1120 neigh_resolve_output+0x148/0x28c ip6_finish_output2+0x2c0/0xb2c __ip6_finish_output+0x114/0x308 ip6_output+0xc4/0x4a4 mld_sendpack+0x220/0x68c mld_ifc_work+0x2a8/0x4f4 process_one_work+0x20c/0x5f8 worker_thread+0x1b0/0x35c kthread+0x144/0x200 ret_from_fork+0x10/0x20 other info that might help us debug this: Chain exists of: &bp->lock --> _xmit_ETHER#2 --> &queue->tx_ptr_lock Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&queue->tx_ptr_lock); lock(_xmit_ETHER#2); lock(&queue->tx_ptr_lock); lock(&bp->lock); *** DEADLOCK *** Call trace: show_stack+0x18/0x24 (C) dump_stack_lvl+0xa0/0xf0 dump_stack+0x18/0x24 print_circular_bug+0x28c/0x370 check_noncircular+0x198/0x1ac __lock_acquire+0x15a8/0x2084 lock_acquire+0x1cc/0x350 rt_spin_lock+0x50/0x1f0 macb_start_xmit+0x808/0xb7c dev_hard_start_xmit+0x94/0x284 sch_direct_xmit+0x8c/0x37c __dev_queue_xmit+0x708/0x1120 neigh_resolve_output+0x148/0x28c ip6_finish_output2+0x2c0/0xb2c __ip6_finish_output+0x114/0x308 ip6_output+0xc4/0x4a4 mld_sendpack+0x220/0x68c mld_ifc_work+0x2a8/0x4f4 process_one_work+0x20c/0x5f8 worker_thread+0x1b0/0x35c kthread+0x144/0x200 ret_from_fork+0x10/0x20 Notably, invoking the mog_init_rings() callback upon link establishment is unnecessary. Instead, we can exclusively call mog_init_rings() within the ndo_open() callback. This adjustment resolves the deadlock issue. Furthermore, since MACB_CAPS_MACB_IS_EMAC cases do not use mog_init_rings() when opening the network interface via at91ether_open(), moving mog_init_rings() to macb_open() also eliminates the MACB_CAPS_MACB_IS_EMAC check. Fixes: 633e98a711ac ("net: macb: use resolved link config in mac_link_up()") Cc: stable@vger.kernel.org Suggested-by: Kevin Hao Signed-off-by: Xiaolei Wang Link: https://patch.msgid.link/20251222015624.1994551-1-xiaolei.wang@windriver.com Signed-off-by: Paolo Abeni --- drivers/net/ethernet/cadence/macb_main.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c index e461f5072884e..6511ecd5856bd 100644 --- a/drivers/net/ethernet/cadence/macb_main.c +++ b/drivers/net/ethernet/cadence/macb_main.c @@ -708,7 +708,6 @@ static void macb_mac_link_up(struct phylink_config *config, /* Initialize rings & buffers as clearing MACB_BIT(TE) in link down * cleared the pipeline and control registers. */ - bp->macbgem_ops.mog_init_rings(bp); macb_init_buffers(bp); for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) @@ -2954,6 +2953,8 @@ static int macb_open(struct net_device *dev) goto pm_exit; } + bp->macbgem_ops.mog_init_rings(bp); + for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) { napi_enable(&queue->napi_rx); napi_enable(&queue->napi_tx); From 32cf1a10afce1b8b364f44e5b2142e5da5e99b25 Mon Sep 17 00:00:00 2001 From: Wei Fang Date: Mon, 22 Dec 2025 10:26:28 +0800 Subject: [PATCH 0541/1321] net: enetc: do not print error log if addr is 0 A value of 0 for addr indicates that the IEB_LBCR register does not need to be configured, as its default value is 0. However, the driver will print an error log if addr is 0, so this issue needs to be fixed. Fixes: 50bfd9c06f0f ("net: enetc: set external PHY address in IERB for i.MX94 ENETC") Signed-off-by: Wei Fang Reviewed-by: Vladimir Oltean Link: https://patch.msgid.link/20251222022628.4016403-1-wei.fang@nxp.com Signed-off-by: Paolo Abeni --- drivers/net/ethernet/freescale/enetc/netc_blk_ctrl.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/freescale/enetc/netc_blk_ctrl.c b/drivers/net/ethernet/freescale/enetc/netc_blk_ctrl.c index 443983fdecd95..7fd39f8952901 100644 --- a/drivers/net/ethernet/freescale/enetc/netc_blk_ctrl.c +++ b/drivers/net/ethernet/freescale/enetc/netc_blk_ctrl.c @@ -577,11 +577,17 @@ static int imx94_enetc_mdio_phyaddr_config(struct netc_blk_ctrl *priv, } addr = netc_get_phy_addr(np); - if (addr <= 0) { + if (addr < 0) { dev_err(dev, "Failed to get PHY address\n"); return addr; } + /* The default value of LaBCR[MDIO_PHYAD_PRTAD] is 0, + * so no need to set the register. + */ + if (!addr) + return 0; + if (phy_mask & BIT(addr)) { dev_err(dev, "Find same PHY address in EMDIO and ENETC node\n"); From 7c16124705116619718c16e11d57bb7ed5678e6c Mon Sep 17 00:00:00 2001 From: Pwnverse Date: Mon, 22 Dec 2025 21:22:27 +0000 Subject: [PATCH 0542/1321] net: rose: fix invalid array index in rose_kill_by_device() rose_kill_by_device() collects sockets into a local array[] and then iterates over them to disconnect sockets bound to a device being brought down. The loop mistakenly indexes array[cnt] instead of array[i]. For cnt < ARRAY_SIZE(array), this reads an uninitialized entry; for cnt == ARRAY_SIZE(array), it is an out-of-bounds read. Either case can lead to an invalid socket pointer dereference and also leaks references taken via sock_hold(). Fix the index to use i. Fixes: 64b8bc7d5f143 ("net/rose: fix races in rose_kill_by_device()") Co-developed-by: Fatma Alwasmi Signed-off-by: Fatma Alwasmi Signed-off-by: Pwnverse Link: https://patch.msgid.link/20251222212227.4116041-1-ritviktanksalkar@gmail.com Signed-off-by: Paolo Abeni --- net/rose/af_rose.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/rose/af_rose.c b/net/rose/af_rose.c index fd67494f2815e..c0f5a515a8ce5 100644 --- a/net/rose/af_rose.c +++ b/net/rose/af_rose.c @@ -205,7 +205,7 @@ static void rose_kill_by_device(struct net_device *dev) spin_unlock_bh(&rose_list_lock); for (i = 0; i < cnt; i++) { - sk = array[cnt]; + sk = array[i]; rose = rose_sk(sk); lock_sock(sk); spin_lock_bh(&rose_list_lock); From d5f81ca2e04036a72c9e16406ceaf3c4d7682ce5 Mon Sep 17 00:00:00 2001 From: Jiayuan Chen Date: Tue, 23 Dec 2025 13:14:12 +0800 Subject: [PATCH 0543/1321] ipv6: fix a BUG in rt6_get_pcpu_route() under PREEMPT_RT On PREEMPT_RT kernels, after rt6_get_pcpu_route() returns NULL, the current task can be preempted. Another task running on the same CPU may then execute rt6_make_pcpu_route() and successfully install a pcpu_rt entry. When the first task resumes execution, its cmpxchg() in rt6_make_pcpu_route() will fail because rt6i_pcpu is no longer NULL, triggering the BUG_ON(prev). It's easy to reproduce it by adding mdelay() after rt6_get_pcpu_route(). Using preempt_disable/enable is not appropriate here because ip6_rt_pcpu_alloc() may sleep. Fix this by handling the cmpxchg() failure gracefully on PREEMPT_RT: free our allocation and return the existing pcpu_rt installed by another task. The BUG_ON is replaced by WARN_ON_ONCE for non-PREEMPT_RT kernels where such races should not occur. Link: https://syzkaller.appspot.com/bug?extid=9b35e9bc0951140d13e6 Fixes: d2d6422f8bd1 ("x86: Allow to enable PREEMPT_RT.") Reported-by: syzbot+9b35e9bc0951140d13e6@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/6918cd88.050a0220.1c914e.0045.GAE@google.com/T/ Signed-off-by: Jiayuan Chen Link: https://patch.msgid.link/20251223051413.124687-1-jiayuan.chen@linux.dev Signed-off-by: Paolo Abeni --- net/ipv6/route.c | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/net/ipv6/route.c b/net/ipv6/route.c index aee6a10b112aa..a3e051dc66ee0 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -1470,7 +1470,18 @@ static struct rt6_info *rt6_make_pcpu_route(struct net *net, p = this_cpu_ptr(res->nh->rt6i_pcpu); prev = cmpxchg(p, NULL, pcpu_rt); - BUG_ON(prev); + if (unlikely(prev)) { + /* + * Another task on this CPU already installed a pcpu_rt. + * This can happen on PREEMPT_RT where preemption is possible. + * Free our allocation and return the existing one. + */ + WARN_ON_ONCE(!IS_ENABLED(CONFIG_PREEMPT_RT)); + + dst_dev_put(&pcpu_rt->dst); + dst_release(&pcpu_rt->dst); + return prev; + } if (res->f6i->fib6_destroying) { struct fib6_info *from; From 872971f1e6ca2d8d27a97fa89fe5c881cf4ffc2e Mon Sep 17 00:00:00 2001 From: NeilBrown Date: Sat, 22 Nov 2025 12:00:36 +1100 Subject: [PATCH 0544/1321] lockd: fix vfs_test_lock() calls Usage of vfs_test_lock() is somewhat confused. Documentation suggests it is given a "lock" but this is not the case. It is given a struct file_lock which contains some details of the sort of lock it should be looking for. In particular passing a "file_lock" containing fl_lmops or fl_ops is meaningless and possibly confusing. This is particularly problematic in lockd. nlmsvc_testlock() receives an initialised "file_lock" from xdr-decode, including manager ops and an owner. It then mistakenly passes this to vfs_test_lock() which might replace the owner and the ops. This can lead to confusion when freeing the lock. The primary role of the 'struct file_lock' passed to vfs_test_lock() is to report a conflicting lock that was found, so it makes more sense for nlmsvc_testlock() to pass "conflock", which it uses for returning the conflicting lock. With this change, freeing of the lock is not confused and code in __nlm4svc_proc_test() and __nlmsvc_proc_test() can be simplified. Documentation for vfs_test_lock() is improved to reflect its real purpose, and a WARN_ON_ONCE() is added to avoid a similar problem in the future. Reported-by: Olga Kornievskaia Closes: https://lore.kernel.org/all/20251021130506.45065-1-okorniev@redhat.com Signed-off-by: NeilBrown Fixes: 20fa19027286 ("nfs: add export operations") Cc: stable@vger.kernel.org Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever --- fs/lockd/svc4proc.c | 4 +--- fs/lockd/svclock.c | 21 ++++++++++++--------- fs/lockd/svcproc.c | 5 +---- fs/locks.c | 12 ++++++++++-- 4 files changed, 24 insertions(+), 18 deletions(-) diff --git a/fs/lockd/svc4proc.c b/fs/lockd/svc4proc.c index 109e5caae8c70..4b6f18d977343 100644 --- a/fs/lockd/svc4proc.c +++ b/fs/lockd/svc4proc.c @@ -97,7 +97,6 @@ __nlm4svc_proc_test(struct svc_rqst *rqstp, struct nlm_res *resp) struct nlm_args *argp = rqstp->rq_argp; struct nlm_host *host; struct nlm_file *file; - struct nlm_lockowner *test_owner; __be32 rc = rpc_success; dprintk("lockd: TEST4 called\n"); @@ -107,7 +106,6 @@ __nlm4svc_proc_test(struct svc_rqst *rqstp, struct nlm_res *resp) if ((resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file))) return resp->status == nlm_drop_reply ? rpc_drop_reply :rpc_success; - test_owner = argp->lock.fl.c.flc_owner; /* Now check for conflicting locks */ resp->status = nlmsvc_testlock(rqstp, file, host, &argp->lock, &resp->lock); @@ -116,7 +114,7 @@ __nlm4svc_proc_test(struct svc_rqst *rqstp, struct nlm_res *resp) else dprintk("lockd: TEST4 status %d\n", ntohl(resp->status)); - nlmsvc_put_lockowner(test_owner); + nlmsvc_release_lockowner(&argp->lock); nlmsvc_release_host(host); nlm_release_file(file); return rc; diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c index 3a3d05cfe09ad..6bce19fd024c5 100644 --- a/fs/lockd/svclock.c +++ b/fs/lockd/svclock.c @@ -633,7 +633,13 @@ nlmsvc_testlock(struct svc_rqst *rqstp, struct nlm_file *file, } mode = lock_to_openmode(&lock->fl); - error = vfs_test_lock(file->f_file[mode], &lock->fl); + locks_init_lock(&conflock->fl); + /* vfs_test_lock only uses start, end, and owner, but tests flc_file */ + conflock->fl.c.flc_file = lock->fl.c.flc_file; + conflock->fl.fl_start = lock->fl.fl_start; + conflock->fl.fl_end = lock->fl.fl_end; + conflock->fl.c.flc_owner = lock->fl.c.flc_owner; + error = vfs_test_lock(file->f_file[mode], &conflock->fl); if (error) { /* We can't currently deal with deferred test requests */ if (error == FILE_LOCK_DEFERRED) @@ -643,22 +649,19 @@ nlmsvc_testlock(struct svc_rqst *rqstp, struct nlm_file *file, goto out; } - if (lock->fl.c.flc_type == F_UNLCK) { + if (conflock->fl.c.flc_type == F_UNLCK) { ret = nlm_granted; goto out; } dprintk("lockd: conflicting lock(ty=%d, %Ld-%Ld)\n", - lock->fl.c.flc_type, (long long)lock->fl.fl_start, - (long long)lock->fl.fl_end); + conflock->fl.c.flc_type, (long long)conflock->fl.fl_start, + (long long)conflock->fl.fl_end); conflock->caller = "somehost"; /* FIXME */ conflock->len = strlen(conflock->caller); conflock->oh.len = 0; /* don't return OH info */ - conflock->svid = lock->fl.c.flc_pid; - conflock->fl.c.flc_type = lock->fl.c.flc_type; - conflock->fl.fl_start = lock->fl.fl_start; - conflock->fl.fl_end = lock->fl.fl_end; - locks_release_private(&lock->fl); + conflock->svid = conflock->fl.c.flc_pid; + locks_release_private(&conflock->fl); ret = nlm_lck_denied; out: diff --git a/fs/lockd/svcproc.c b/fs/lockd/svcproc.c index f53d5177f2673..5817ef272332d 100644 --- a/fs/lockd/svcproc.c +++ b/fs/lockd/svcproc.c @@ -117,7 +117,6 @@ __nlmsvc_proc_test(struct svc_rqst *rqstp, struct nlm_res *resp) struct nlm_args *argp = rqstp->rq_argp; struct nlm_host *host; struct nlm_file *file; - struct nlm_lockowner *test_owner; __be32 rc = rpc_success; dprintk("lockd: TEST called\n"); @@ -127,8 +126,6 @@ __nlmsvc_proc_test(struct svc_rqst *rqstp, struct nlm_res *resp) if ((resp->status = nlmsvc_retrieve_args(rqstp, argp, &host, &file))) return resp->status == nlm_drop_reply ? rpc_drop_reply :rpc_success; - test_owner = argp->lock.fl.c.flc_owner; - /* Now check for conflicting locks */ resp->status = cast_status(nlmsvc_testlock(rqstp, file, host, &argp->lock, &resp->lock)); @@ -138,7 +135,7 @@ __nlmsvc_proc_test(struct svc_rqst *rqstp, struct nlm_res *resp) dprintk("lockd: TEST status %d vers %d\n", ntohl(resp->status), rqstp->rq_vers); - nlmsvc_put_lockowner(test_owner); + nlmsvc_release_lockowner(&argp->lock); nlmsvc_release_host(host); nlm_release_file(file); return rc; diff --git a/fs/locks.c b/fs/locks.c index 9f565802a88cd..e2036aa4bd373 100644 --- a/fs/locks.c +++ b/fs/locks.c @@ -2236,13 +2236,21 @@ SYSCALL_DEFINE2(flock, unsigned int, fd, unsigned int, cmd) /** * vfs_test_lock - test file byte range lock * @filp: The file to test lock for - * @fl: The lock to test; also used to hold result + * @fl: The byte-range in the file to test; also used to hold result * + * On entry, @fl does not contain a lock, but identifies a range (fl_start, fl_end) + * in the file (c.flc_file), and an owner (c.flc_owner) for whom existing locks + * should be ignored. c.flc_type and c.flc_flags are ignored. + * Both fl_lmops and fl_ops in @fl must be NULL. * Returns -ERRNO on failure. Indicates presence of conflicting lock by - * setting conf->fl_type to something other than F_UNLCK. + * setting fl->fl_type to something other than F_UNLCK. + * + * If vfs_test_lock() does find a lock and return it, the caller must + * use locks_free_lock() or locks_release_private() on the returned lock. */ int vfs_test_lock(struct file *filp, struct file_lock *fl) { + WARN_ON_ONCE(fl->fl_ops || fl->fl_lmops); WARN_ON_ONCE(filp != fl->c.flc_file); if (filp->f_op->lock) return filp->f_op->lock(filp, F_GETLK, fl); From dabd2b3310bbd0651e9f3f2426b222b259ec5115 Mon Sep 17 00:00:00 2001 From: Chuck Lever Date: Mon, 1 Dec 2025 17:09:55 -0500 Subject: [PATCH 0545/1321] nfsd: fix nfsd_file reference leak in nfsd4_add_rdaccess_to_wrdeleg() nfsd4_add_rdaccess_to_wrdeleg() unconditionally overwrites fp->fi_fds[O_RDONLY] with a newly acquired nfsd_file. However, if the client already has a SHARE_ACCESS_READ open from a previous OPEN operation, this action overwrites the existing pointer without releasing its reference, orphaning the previous reference. Additionally, the function originally stored the same nfsd_file pointer in both fp->fi_fds[O_RDONLY] and fp->fi_rdeleg_file with only a single reference. When put_deleg_file() runs, it clears fi_rdeleg_file and calls nfs4_file_put_access() to release the file. However, nfs4_file_put_access() only releases fi_fds[O_RDONLY] when the fi_access[O_RDONLY] counter drops to zero. If another READ open exists on the file, the counter remains elevated and the nfsd_file reference from the delegation is never released. This potentially causes open conflicts on that file. Then, on server shutdown, these leaks cause __nfsd_file_cache_purge() to encounter files with an elevated reference count that cannot be cleaned up, ultimately triggering a BUG() in kmem_cache_destroy() because there are still nfsd_file objects allocated in that cache. Fixes: e7a8ebc305f2 ("NFSD: Offer write delegation for OPEN with OPEN4_SHARE_ACCESS_WRITE") Cc: stable@vger.kernel.org Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever --- fs/nfsd/nfs4state.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c index 808c24fb5c9a0..1ef560be2d714 100644 --- a/fs/nfsd/nfs4state.c +++ b/fs/nfsd/nfs4state.c @@ -1218,8 +1218,10 @@ static void put_deleg_file(struct nfs4_file *fp) if (nf) nfsd_file_put(nf); - if (rnf) + if (rnf) { + nfsd_file_put(rnf); nfs4_file_put_access(fp, NFS4_SHARE_ACCESS_READ); + } } static void nfsd4_finalize_deleg_timestamps(struct nfs4_delegation *dp, struct file *f) @@ -6231,10 +6233,14 @@ nfsd4_add_rdaccess_to_wrdeleg(struct svc_rqst *rqstp, struct nfsd4_open *open, fp = stp->st_stid.sc_file; spin_lock(&fp->fi_lock); __nfs4_file_get_access(fp, NFS4_SHARE_ACCESS_READ); - fp = stp->st_stid.sc_file; - fp->fi_fds[O_RDONLY] = nf; - fp->fi_rdeleg_file = nf; + if (!fp->fi_fds[O_RDONLY]) { + fp->fi_fds[O_RDONLY] = nf; + nf = NULL; + } + fp->fi_rdeleg_file = nfsd_file_get(fp->fi_fds[O_RDONLY]); spin_unlock(&fp->fi_lock); + if (nf) + nfsd_file_put(nf); } return true; } From 2c1972c336acd09b17ed43c370360dd479297368 Mon Sep 17 00:00:00 2001 From: Jeff Layton Date: Wed, 3 Dec 2025 10:52:15 -0500 Subject: [PATCH 0546/1321] nfsd: use ATTR_DELEG in nfsd4_finalize_deleg_timestamps() When finalizing timestamps that have never been updated and preparing to release the delegation lease, the notify_change() call can trigger a delegation break, and fail to update the timestamps. When this happens, there will be messages like this in dmesg: [ 2709.375785] Unable to update timestamps on inode 00:39:263: -11 Since this code is going to release the lease just after updating the timestamps, breaking the delegation is undesirable. Fix this by setting ATTR_DELEG in ia_valid, in order to avoid the delegation break. Fixes: e5e9b24ab8fa ("nfsd: freeze c/mtime updates with outstanding WRITE_ATTRS delegation") Cc: stable@vger.kernel.org Signed-off-by: Jeff Layton Signed-off-by: Chuck Lever --- fs/nfsd/nfs4state.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c index 1ef560be2d714..b0111104e9690 100644 --- a/fs/nfsd/nfs4state.c +++ b/fs/nfsd/nfs4state.c @@ -1226,7 +1226,7 @@ static void put_deleg_file(struct nfs4_file *fp) static void nfsd4_finalize_deleg_timestamps(struct nfs4_delegation *dp, struct file *f) { - struct iattr ia = { .ia_valid = ATTR_ATIME | ATTR_CTIME | ATTR_MTIME }; + struct iattr ia = { .ia_valid = ATTR_ATIME | ATTR_CTIME | ATTR_MTIME | ATTR_DELEG }; struct inode *inode = file_inode(f); int ret; From 5dc8b9086c2a3f8c5d39e29966309df21224c2b0 Mon Sep 17 00:00:00 2001 From: Haoxiang Li Date: Sat, 6 Dec 2025 15:38:42 +0800 Subject: [PATCH 0547/1321] nfsd: Drop the client reference in client_states_open() In error path, call drop_client() to drop the reference obtained by get_nfsdfs_clp(). Fixes: 78599c42ae3c ("nfsd4: add file to display list of client's opens") Cc: stable@vger.kernel.org Reviewed-by: Jeff Layton Signed-off-by: Haoxiang Li Signed-off-by: Chuck Lever --- fs/nfsd/nfs4state.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c index b0111104e9690..a6e8a1f27133c 100644 --- a/fs/nfsd/nfs4state.c +++ b/fs/nfsd/nfs4state.c @@ -3099,8 +3099,10 @@ static int client_states_open(struct inode *inode, struct file *file) return -ENXIO; ret = seq_open(file, &states_seq_ops); - if (ret) + if (ret) { + drop_client(clp); return ret; + } s = file->private_data; s->private = clp; return 0; From f10617bef098da3d08c58791579822e660e76681 Mon Sep 17 00:00:00 2001 From: Kevin Tian Date: Thu, 18 Dec 2025 08:16:49 +0000 Subject: [PATCH 0548/1321] vfio/pci: Disable qword access to the PCI ROM bar Commit 2b938e3db335 ("vfio/pci: Enable iowrite64 and ioread64 for vfio pci") enables qword access to the PCI bar resources. However certain devices (e.g. Intel X710) are observed with problem upon qword accesses to the rom bar, e.g. triggering PCI aer errors. This is triggered by Qemu which caches the rom content by simply does a pread() of the remaining size until it gets the full contents. The other bars would only perform operations at the same access width as their guest drivers. Instead of trying to identify all broken devices, universally disable qword access to the rom bar i.e. going back to the old way which worked reliably for years. Reported-by: Farrah Chen Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220740 Fixes: 2b938e3db335 ("vfio/pci: Enable iowrite64 and ioread64 for vfio pci") Cc: stable@vger.kernel.org Signed-off-by: Kevin Tian Tested-by: Farrah Chen Link: https://lore.kernel.org/r/20251218081650.555015-2-kevin.tian@intel.com Signed-off-by: Alex Williamson --- drivers/vfio/pci/nvgrace-gpu/main.c | 4 ++-- drivers/vfio/pci/vfio_pci_rdwr.c | 25 ++++++++++++++++++------- include/linux/vfio_pci_core.h | 10 +++++++++- 3 files changed, 29 insertions(+), 10 deletions(-) diff --git a/drivers/vfio/pci/nvgrace-gpu/main.c b/drivers/vfio/pci/nvgrace-gpu/main.c index 84d142a47ec67..b45a24d003877 100644 --- a/drivers/vfio/pci/nvgrace-gpu/main.c +++ b/drivers/vfio/pci/nvgrace-gpu/main.c @@ -561,7 +561,7 @@ nvgrace_gpu_map_and_read(struct nvgrace_gpu_pci_core_device *nvdev, ret = vfio_pci_core_do_io_rw(&nvdev->core_device, false, nvdev->resmem.ioaddr, buf, offset, mem_count, - 0, 0, false); + 0, 0, false, VFIO_PCI_IO_WIDTH_8); } return ret; @@ -693,7 +693,7 @@ nvgrace_gpu_map_and_write(struct nvgrace_gpu_pci_core_device *nvdev, ret = vfio_pci_core_do_io_rw(&nvdev->core_device, false, nvdev->resmem.ioaddr, (char __user *)buf, pos, mem_count, - 0, 0, true); + 0, 0, true, VFIO_PCI_IO_WIDTH_8); } return ret; diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c index 6192788c8ba39..25380b7dfe18a 100644 --- a/drivers/vfio/pci/vfio_pci_rdwr.c +++ b/drivers/vfio/pci/vfio_pci_rdwr.c @@ -135,7 +135,8 @@ VFIO_IORDWR(64) ssize_t vfio_pci_core_do_io_rw(struct vfio_pci_core_device *vdev, bool test_mem, void __iomem *io, char __user *buf, loff_t off, size_t count, size_t x_start, - size_t x_end, bool iswrite) + size_t x_end, bool iswrite, + enum vfio_pci_io_width max_width) { ssize_t done = 0; int ret; @@ -150,20 +151,19 @@ ssize_t vfio_pci_core_do_io_rw(struct vfio_pci_core_device *vdev, bool test_mem, else fillable = 0; - if (fillable >= 8 && !(off % 8)) { + if (fillable >= 8 && !(off % 8) && max_width >= 8) { ret = vfio_pci_iordwr64(vdev, iswrite, test_mem, io, buf, off, &filled); if (ret) return ret; - } else - if (fillable >= 4 && !(off % 4)) { + } else if (fillable >= 4 && !(off % 4) && max_width >= 4) { ret = vfio_pci_iordwr32(vdev, iswrite, test_mem, io, buf, off, &filled); if (ret) return ret; - } else if (fillable >= 2 && !(off % 2)) { + } else if (fillable >= 2 && !(off % 2) && max_width >= 2) { ret = vfio_pci_iordwr16(vdev, iswrite, test_mem, io, buf, off, &filled); if (ret) @@ -234,6 +234,7 @@ ssize_t vfio_pci_bar_rw(struct vfio_pci_core_device *vdev, char __user *buf, void __iomem *io; struct resource *res = &vdev->pdev->resource[bar]; ssize_t done; + enum vfio_pci_io_width max_width = VFIO_PCI_IO_WIDTH_8; if (pci_resource_start(pdev, bar)) end = pci_resource_len(pdev, bar); @@ -262,6 +263,16 @@ ssize_t vfio_pci_bar_rw(struct vfio_pci_core_device *vdev, char __user *buf, if (!io) return -ENOMEM; x_end = end; + + /* + * Certain devices (e.g. Intel X710) don't support qword + * access to the ROM bar. Otherwise PCI AER errors might be + * triggered. + * + * Disable qword access to the ROM bar universally, which + * worked reliably for years before qword access is enabled. + */ + max_width = VFIO_PCI_IO_WIDTH_4; } else { int ret = vfio_pci_core_setup_barmap(vdev, bar); if (ret) { @@ -278,7 +289,7 @@ ssize_t vfio_pci_bar_rw(struct vfio_pci_core_device *vdev, char __user *buf, } done = vfio_pci_core_do_io_rw(vdev, res->flags & IORESOURCE_MEM, io, buf, pos, - count, x_start, x_end, iswrite); + count, x_start, x_end, iswrite, max_width); if (done >= 0) *ppos += done; @@ -352,7 +363,7 @@ ssize_t vfio_pci_vga_rw(struct vfio_pci_core_device *vdev, char __user *buf, * to the memory enable bit in the command register. */ done = vfio_pci_core_do_io_rw(vdev, false, iomem, buf, off, count, - 0, 0, iswrite); + 0, 0, iswrite, VFIO_PCI_IO_WIDTH_8); vga_put(vdev->pdev, rsrc); diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h index 706877f998ff3..1ac86896875cf 100644 --- a/include/linux/vfio_pci_core.h +++ b/include/linux/vfio_pci_core.h @@ -145,6 +145,13 @@ struct vfio_pci_core_device { struct list_head dmabufs; }; +enum vfio_pci_io_width { + VFIO_PCI_IO_WIDTH_1 = 1, + VFIO_PCI_IO_WIDTH_2 = 2, + VFIO_PCI_IO_WIDTH_4 = 4, + VFIO_PCI_IO_WIDTH_8 = 8, +}; + /* Will be exported for vfio pci drivers usage */ int vfio_pci_core_register_dev_region(struct vfio_pci_core_device *vdev, unsigned int type, unsigned int subtype, @@ -188,7 +195,8 @@ pci_ers_result_t vfio_pci_core_aer_err_detected(struct pci_dev *pdev, ssize_t vfio_pci_core_do_io_rw(struct vfio_pci_core_device *vdev, bool test_mem, void __iomem *io, char __user *buf, loff_t off, size_t count, size_t x_start, - size_t x_end, bool iswrite); + size_t x_end, bool iswrite, + enum vfio_pci_io_width max_width); bool __vfio_pci_memory_enabled(struct vfio_pci_core_device *vdev); bool vfio_pci_core_range_intersect_range(loff_t buf_start, size_t buf_cnt, loff_t reg_start, size_t reg_cnt, From a2571cbf282b1fa84b9ec5b6b6561369fce2d82f Mon Sep 17 00:00:00 2001 From: Kevin Tian Date: Thu, 18 Dec 2025 08:16:50 +0000 Subject: [PATCH 0549/1321] vfio/pci: Disable qword access to the VGA region Seems no reason to allow qword access to the old VGA resource. Better restrict it to dword access as before. Suggested-by: Alex Williamson Signed-off-by: Kevin Tian Link: https://lore.kernel.org/r/20251218081650.555015-3-kevin.tian@intel.com Signed-off-by: Alex Williamson --- drivers/vfio/pci/vfio_pci_rdwr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c index 25380b7dfe18a..b38627b35c35d 100644 --- a/drivers/vfio/pci/vfio_pci_rdwr.c +++ b/drivers/vfio/pci/vfio_pci_rdwr.c @@ -363,7 +363,7 @@ ssize_t vfio_pci_vga_rw(struct vfio_pci_core_device *vdev, char __user *buf, * to the memory enable bit in the command register. */ done = vfio_pci_core_do_io_rw(vdev, false, iomem, buf, off, count, - 0, 0, iswrite, VFIO_PCI_IO_WIDTH_8); + 0, 0, iswrite, VFIO_PCI_IO_WIDTH_4); vga_put(vdev->pdev, rsrc); From 7ef3593fae8bfbc67ebadb21f3f05ebe4d0bef36 Mon Sep 17 00:00:00 2001 From: Michal Wajdeczko Date: Thu, 18 Dec 2025 21:51:06 +0100 Subject: [PATCH 0550/1321] vfio/xe: Add default handler for .get_region_info_caps New requirement for the vfio drivers was added by the commit f97859503859 ("vfio: Require drivers to implement get_region_info") followed by commit 1b0ecb5baf4a ("vfio/pci: Convert all PCI drivers to get_region_info_caps") that was missed by the new vfio/xe driver. Add handler for .get_region_info_caps to avoid -EINVAL errors. Fixes: 2e38c50ae492 ("vfio/xe: Add device specific vfio_pci driver variant for Intel graphics") Signed-off-by: Michal Wajdeczko Reviewed-by: Marcin Bernatowicz Tested-by: Marcin Bernatowicz Link: https://lore.kernel.org/r/20251218205106.4578-1-michal.wajdeczko@intel.com Signed-off-by: Alex Williamson --- drivers/vfio/pci/xe/main.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/vfio/pci/xe/main.c b/drivers/vfio/pci/xe/main.c index 0156b53c678b7..719ab46600856 100644 --- a/drivers/vfio/pci/xe/main.c +++ b/drivers/vfio/pci/xe/main.c @@ -504,6 +504,7 @@ static const struct vfio_device_ops xe_vfio_pci_ops = { .open_device = xe_vfio_pci_open_device, .close_device = xe_vfio_pci_close_device, .ioctl = vfio_pci_core_ioctl, + .get_region_info_caps = vfio_pci_ioctl_get_region_info, .device_feature = vfio_pci_core_ioctl_feature, .read = vfio_pci_core_read, .write = vfio_pci_core_write, From c3db077413f48c7f66c85f877171d774bf648de3 Mon Sep 17 00:00:00 2001 From: David Matlack Date: Fri, 19 Dec 2025 23:38:17 +0000 Subject: [PATCH 0551/1321] tools include: Add definitions for __aligned_{l,b}e64 Add definitions for the missing __aligned_le64 and __aligned_be64 to tools/include/linux/types.h. The former is needed by for builds where tools/include/ is on the include path ahead of usr/include/. Signed-off-by: David Matlack Link: https://lore.kernel.org/r/20251219233818.1965306-2-dmatlack@google.com Signed-off-by: Alex Williamson --- tools/include/linux/types.h | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/tools/include/linux/types.h b/tools/include/linux/types.h index 4928e33d44ac4..d41f8a261bce8 100644 --- a/tools/include/linux/types.h +++ b/tools/include/linux/types.h @@ -88,6 +88,14 @@ typedef struct { # define __aligned_u64 __u64 __attribute__((aligned(8))) #endif +#ifndef __aligned_be64 +# define __aligned_be64 __be64 __attribute__((aligned(8))) +#endif + +#ifndef __aligned_le64 +# define __aligned_le64 __le64 __attribute__((aligned(8))) +#endif + struct list_head { struct list_head *next, *prev; }; From e230114ee49fcda94c66f9ee5dcfa972403da601 Mon Sep 17 00:00:00 2001 From: David Matlack Date: Fri, 19 Dec 2025 23:38:18 +0000 Subject: [PATCH 0552/1321] vfio: selftests: Drop includes Drop the includes now that (tools/include/linux/types.h) has a definition for __aligned_le64, which is needed by . Including is harmless but causes benign typedef redefinitions. This is not a problem for VFIO selftests but becomes an issue when the VFIO selftests library is built into KVM selftests, since they are built with -std=gnu99 which does not allow typedef redifitions. No functional change intended. Signed-off-by: David Matlack Link: https://lore.kernel.org/r/20251219233818.1965306-3-dmatlack@google.com Signed-off-by: Alex Williamson --- .../testing/selftests/vfio/lib/include/libvfio/iova_allocator.h | 1 - tools/testing/selftests/vfio/lib/iommu.c | 1 - tools/testing/selftests/vfio/lib/iova_allocator.c | 1 - tools/testing/selftests/vfio/lib/vfio_pci_device.c | 1 - tools/testing/selftests/vfio/vfio_dma_mapping_test.c | 1 - tools/testing/selftests/vfio/vfio_iommufd_setup_test.c | 1 - 6 files changed, 6 deletions(-) diff --git a/tools/testing/selftests/vfio/lib/include/libvfio/iova_allocator.h b/tools/testing/selftests/vfio/lib/include/libvfio/iova_allocator.h index 8f1d994e9ea28..c7c0796a757f2 100644 --- a/tools/testing/selftests/vfio/lib/include/libvfio/iova_allocator.h +++ b/tools/testing/selftests/vfio/lib/include/libvfio/iova_allocator.h @@ -2,7 +2,6 @@ #ifndef SELFTESTS_VFIO_LIB_INCLUDE_LIBVFIO_IOVA_ALLOCATOR_H #define SELFTESTS_VFIO_LIB_INCLUDE_LIBVFIO_IOVA_ALLOCATOR_H -#include #include #include #include diff --git a/tools/testing/selftests/vfio/lib/iommu.c b/tools/testing/selftests/vfio/lib/iommu.c index 8079d43523f32..58b7fb7430d4f 100644 --- a/tools/testing/selftests/vfio/lib/iommu.c +++ b/tools/testing/selftests/vfio/lib/iommu.c @@ -11,7 +11,6 @@ #include #include -#include #include #include #include diff --git a/tools/testing/selftests/vfio/lib/iova_allocator.c b/tools/testing/selftests/vfio/lib/iova_allocator.c index a12b0a51e9e6f..8c1cc86b70cd7 100644 --- a/tools/testing/selftests/vfio/lib/iova_allocator.c +++ b/tools/testing/selftests/vfio/lib/iova_allocator.c @@ -11,7 +11,6 @@ #include #include -#include #include #include #include diff --git a/tools/testing/selftests/vfio/lib/vfio_pci_device.c b/tools/testing/selftests/vfio/lib/vfio_pci_device.c index 8e34b9bfc96be..fac4c0ecadef8 100644 --- a/tools/testing/selftests/vfio/lib/vfio_pci_device.c +++ b/tools/testing/selftests/vfio/lib/vfio_pci_device.c @@ -11,7 +11,6 @@ #include #include -#include #include #include #include diff --git a/tools/testing/selftests/vfio/vfio_dma_mapping_test.c b/tools/testing/selftests/vfio/vfio_dma_mapping_test.c index 16eba2ecca474..3bf984b337ac9 100644 --- a/tools/testing/selftests/vfio/vfio_dma_mapping_test.c +++ b/tools/testing/selftests/vfio/vfio_dma_mapping_test.c @@ -3,7 +3,6 @@ #include #include -#include #include #include #include diff --git a/tools/testing/selftests/vfio/vfio_iommufd_setup_test.c b/tools/testing/selftests/vfio/vfio_iommufd_setup_test.c index 17017ed3beac5..ec1e5633e0800 100644 --- a/tools/testing/selftests/vfio/vfio_iommufd_setup_test.c +++ b/tools/testing/selftests/vfio/vfio_iommufd_setup_test.c @@ -1,5 +1,4 @@ // SPDX-License-Identifier: GPL-2.0 -#include #include #include #include From 145b3953124a38a093cb0e414c0da28b9c6b5334 Mon Sep 17 00:00:00 2001 From: Zilin Guan Date: Thu, 25 Dec 2025 14:31:50 +0000 Subject: [PATCH 0553/1321] vfio/pds: Fix memory leak in pds_vfio_dirty_enable() pds_vfio_dirty_enable() allocates memory for region_info. If interval_tree_iter_first() returns NULL, the function returns -EINVAL immediately without freeing the allocated memory, causing a memory leak. Fix this by jumping to the out_free_region_info label to ensure region_info is freed. Fixes: 2e7c6feb4ef52 ("vfio/pds: Add multi-region support") Signed-off-by: Zilin Guan Link: https://lore.kernel.org/r/20251225143150.1117366-1-zilin@seu.edu.cn Signed-off-by: Alex Williamson --- drivers/vfio/pci/pds/dirty.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/vfio/pci/pds/dirty.c b/drivers/vfio/pci/pds/dirty.c index 481992142f790..4915a7c1c4916 100644 --- a/drivers/vfio/pci/pds/dirty.c +++ b/drivers/vfio/pci/pds/dirty.c @@ -292,8 +292,11 @@ static int pds_vfio_dirty_enable(struct pds_vfio_pci_device *pds_vfio, len = num_ranges * sizeof(*region_info); node = interval_tree_iter_first(ranges, 0, ULONG_MAX); - if (!node) - return -EINVAL; + if (!node) { + err = -EINVAL; + goto out_free_region_info; + } + for (int i = 0; i < num_ranges; i++) { struct pds_lm_dirty_region_info *ri = ®ion_info[i]; u64 region_size = node->last - node->start + 1; From 4b4d72497fa04b567474087e59953ed83d919ad2 Mon Sep 17 00:00:00 2001 From: Alper Ak Date: Thu, 25 Dec 2025 18:13:49 +0300 Subject: [PATCH 0554/1321] vfio/xe: Fix use-after-free in xe_vfio_pci_alloc_file() migf->filp is accessed after migf has been freed. Save the error value before calling kfree() to prevent use-after-free. Fixes: 1f5556ec8b9e ("vfio/xe: Add device specific vfio_pci driver variant for Intel graphics") Signed-off-by: Alper Ak Link: https://lore.kernel.org/r/20251225151349.360870-1-alperyasinak1@gmail.com Signed-off-by: Alex Williamson --- drivers/vfio/pci/xe/main.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/vfio/pci/xe/main.c b/drivers/vfio/pci/xe/main.c index 719ab46600856..2a5eb9260ec7b 100644 --- a/drivers/vfio/pci/xe/main.c +++ b/drivers/vfio/pci/xe/main.c @@ -250,6 +250,7 @@ xe_vfio_pci_alloc_file(struct xe_vfio_pci_core_device *xe_vdev, struct xe_vfio_pci_migration_file *migf; const struct file_operations *fops; int flags; + int ret; migf = kzalloc(sizeof(*migf), GFP_KERNEL_ACCOUNT); if (!migf) @@ -259,8 +260,9 @@ xe_vfio_pci_alloc_file(struct xe_vfio_pci_core_device *xe_vdev, flags = type == XE_VFIO_FILE_SAVE ? O_RDONLY : O_WRONLY; migf->filp = anon_inode_getfile("xe_vfio_mig", fops, migf, flags); if (IS_ERR(migf->filp)) { + ret = PTR_ERR(migf->filp); kfree(migf); - return ERR_CAST(migf->filp); + return ERR_PTR(ret); } mutex_init(&migf->lock); From 84ec2d8fed6339827f4d73898eedcf984089689d Mon Sep 17 00:00:00 2001 From: Kurt Borja Date: Fri, 5 Dec 2025 13:50:10 -0500 Subject: [PATCH 0555/1321] platform/x86: alienware-wmi-wmax: Add support for new Area-51 laptops MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add AWCC support for new Alienware Area-51 laptops. Cc: stable@vger.kernel.org Signed-off-by: Kurt Borja Reviewed-by: Ilpo Järvinen Link: https://patch.msgid.link/20251205-area-51-v1-1-d2cb13530851@gmail.com Signed-off-by: Ilpo Järvinen --- drivers/platform/x86/dell/alienware-wmi-wmax.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/drivers/platform/x86/dell/alienware-wmi-wmax.c b/drivers/platform/x86/dell/alienware-wmi-wmax.c index 1418bd326edf2..fd8e69432a802 100644 --- a/drivers/platform/x86/dell/alienware-wmi-wmax.c +++ b/drivers/platform/x86/dell/alienware-wmi-wmax.c @@ -89,6 +89,22 @@ static struct awcc_quirks generic_quirks = { static struct awcc_quirks empty_quirks; static const struct dmi_system_id awcc_dmi_table[] __initconst = { + { + .ident = "Alienware 16 Area-51", + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Alienware"), + DMI_MATCH(DMI_PRODUCT_NAME, "Alienware 16 Area-51"), + }, + .driver_data = &g_series_quirks, + }, + { + .ident = "Alienware 18 Area-51", + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Alienware"), + DMI_MATCH(DMI_PRODUCT_NAME, "Alienware 18 Area-51"), + }, + .driver_data = &g_series_quirks, + }, { .ident = "Alienware 16 Aurora", .matches = { From 13603e7c65eccf01d3a145098ae75a534c5efaf8 Mon Sep 17 00:00:00 2001 From: Kurt Borja Date: Fri, 5 Dec 2025 13:50:11 -0500 Subject: [PATCH 0556/1321] platform/x86: alienware-wmi-wmax: Add AWCC support for Alienware x16 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add AWCC support for Alienware x16 laptops. Cc: stable@vger.kernel.org Signed-off-by: Kurt Borja Reviewed-by: Ilpo Järvinen Link: https://patch.msgid.link/20251205-area-51-v1-2-d2cb13530851@gmail.com Signed-off-by: Ilpo Järvinen --- drivers/platform/x86/dell/alienware-wmi-wmax.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/drivers/platform/x86/dell/alienware-wmi-wmax.c b/drivers/platform/x86/dell/alienware-wmi-wmax.c index fd8e69432a802..7fb7a795a9bb5 100644 --- a/drivers/platform/x86/dell/alienware-wmi-wmax.c +++ b/drivers/platform/x86/dell/alienware-wmi-wmax.c @@ -177,6 +177,14 @@ static const struct dmi_system_id awcc_dmi_table[] __initconst = { }, .driver_data = &generic_quirks, }, + { + .ident = "Alienware x16", + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Alienware"), + DMI_MATCH(DMI_PRODUCT_NAME, "Alienware x16"), + }, + .driver_data = &g_series_quirks, + }, { .ident = "Alienware x17", .matches = { From c66bbf1ec47333e4165766e022b245fc862ebb6f Mon Sep 17 00:00:00 2001 From: Kurt Borja Date: Fri, 5 Dec 2025 13:50:12 -0500 Subject: [PATCH 0557/1321] platform/x86: alienware-wmi-wmax: Add support for Alienware 16X Aurora MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add AWCC support for Alienware 16X Aurora laptops. Cc: stable@vger.kernel.org Signed-off-by: Kurt Borja Reviewed-by: Ilpo Järvinen Link: https://patch.msgid.link/20251205-area-51-v1-3-d2cb13530851@gmail.com Signed-off-by: Ilpo Järvinen --- drivers/platform/x86/dell/alienware-wmi-wmax.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/drivers/platform/x86/dell/alienware-wmi-wmax.c b/drivers/platform/x86/dell/alienware-wmi-wmax.c index 7fb7a795a9bb5..e69b50162bb1b 100644 --- a/drivers/platform/x86/dell/alienware-wmi-wmax.c +++ b/drivers/platform/x86/dell/alienware-wmi-wmax.c @@ -97,6 +97,14 @@ static const struct dmi_system_id awcc_dmi_table[] __initconst = { }, .driver_data = &g_series_quirks, }, + { + .ident = "Alienware 16X Aurora", + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Alienware"), + DMI_MATCH(DMI_PRODUCT_NAME, "Alienware 16X Aurora"), + }, + .driver_data = &g_series_quirks, + }, { .ident = "Alienware 18 Area-51", .matches = { From 8424cbc6d101d550265613b8f1a778fff6484cb8 Mon Sep 17 00:00:00 2001 From: Werner Sembach Date: Fri, 12 Dec 2025 19:02:22 +0100 Subject: [PATCH 0558/1321] platform/x86/uniwill: Add TUXEDO Book BA15 Gen10 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add TUXEDO Book BA15 Gen10 to the list of supported devices of the Uniwill driver. Signed-off-by: Werner Sembach Link: https://patch.msgid.link/20251212180319.712913-1-wse@tuxedocomputers.com Reviewed-by: Ilpo Järvinen Signed-off-by: Ilpo Järvinen --- drivers/platform/x86/uniwill/uniwill-acpi.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/drivers/platform/x86/uniwill/uniwill-acpi.c b/drivers/platform/x86/uniwill/uniwill-acpi.c index bd7e63dd51810..0f935532f2504 100644 --- a/drivers/platform/x86/uniwill/uniwill-acpi.c +++ b/drivers/platform/x86/uniwill/uniwill-acpi.c @@ -1844,6 +1844,13 @@ static const struct dmi_system_id uniwill_dmi_table[] __initconst = { DMI_EXACT_MATCH(DMI_BOARD_NAME, "X6AR5xxY_mLED"), }, }, + { + .ident = "TUXEDO Book BA15 Gen10 AMD", + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "TUXEDO"), + DMI_EXACT_MATCH(DMI_BOARD_NAME, "PF5PU1G"), + }, + }, { .ident = "TUXEDO Pulse 14 Gen1 AMD", .matches = { From b3684959c460ad33862c3a3b41dd41368e2e9694 Mon Sep 17 00:00:00 2001 From: Tim Wassink Date: Sun, 21 Dec 2025 19:17:14 +0100 Subject: [PATCH 0559/1321] platform/x86: asus-nb-wmi: Add keymap for display toggle MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit On the Asus Zenbook 14 (UX3405MA), the Fn+F7 key combination emits WMI code 0x2d, which was previously unmapped. Map this code to KEY_DISPLAYTOGGLE. This matches the behavior of the display toggle/projector mode key found on other Asus laptops, allowing userspace to handle multi-monitor switching or screen toggling. Tested on ASUS Zenbook 14 UX3405MA. Signed-off-by: Tim Wassink Reviewed-by: Denis Benato Link: https://patch.msgid.link/20251221181724.19927-1-timwassink.dev@gmail.com Reviewed-by: Ilpo Järvinen Signed-off-by: Ilpo Järvinen --- drivers/platform/x86/asus-nb-wmi.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/platform/x86/asus-nb-wmi.c b/drivers/platform/x86/asus-nb-wmi.c index 6a62bc5b02fda..a38a65f5c550d 100644 --- a/drivers/platform/x86/asus-nb-wmi.c +++ b/drivers/platform/x86/asus-nb-wmi.c @@ -580,6 +580,7 @@ static const struct key_entry asus_nb_wmi_keymap[] = { { KE_KEY, 0x2a, { KEY_SELECTIVE_SCREENSHOT } }, { KE_IGNORE, 0x2b, }, /* PrintScreen (also send via PS/2) on newer models */ { KE_IGNORE, 0x2c, }, /* CapsLock (also send via PS/2) on newer models */ + { KE_KEY, 0x2d, { KEY_DISPLAYTOGGLE } }, { KE_KEY, 0x30, { KEY_VOLUMEUP } }, { KE_KEY, 0x31, { KEY_VOLUMEDOWN } }, { KE_KEY, 0x32, { KEY_MUTE } }, From 1a5f2c4eafa4d6c73d6c59409453164ada47f352 Mon Sep 17 00:00:00 2001 From: Shravan Kumar Ramani Date: Thu, 18 Dec 2025 12:18:13 +0000 Subject: [PATCH 0560/1321] platform/mellanox: mlxbf-pmc: Remove trailing whitespaces from event names MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Some event names have trailing whitespaces at the end which causes programming of counters using the name for these specific events to fail and hence need to be removed. Fixes: 423c3361855c ("platform/mellanox: mlxbf-pmc: Add support for BlueField-3") Signed-off-by: Shravan Kumar Ramani Reviewed-by: David Thompson Link: https://patch.msgid.link/065cbae0717dcc1169681c4dbb1a6e050b8574b3.1766059953.git.shravankr@nvidia.com Reviewed-by: Ilpo Järvinen Signed-off-by: Ilpo Järvinen --- drivers/platform/mellanox/mlxbf-pmc.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/platform/mellanox/mlxbf-pmc.c b/drivers/platform/mellanox/mlxbf-pmc.c index 16a2fd9fdd9b8..5ec1ad4716967 100644 --- a/drivers/platform/mellanox/mlxbf-pmc.c +++ b/drivers/platform/mellanox/mlxbf-pmc.c @@ -801,18 +801,18 @@ static const struct mlxbf_pmc_events mlxbf_pmc_llt_miss_events[] = { {11, "GDC_MISS_MACHINE_CHI_TXDAT"}, {12, "GDC_MISS_MACHINE_CHI_RXDAT"}, {13, "GDC_MISS_MACHINE_G_FIFO_FF_EXEC0_0"}, - {14, "GDC_MISS_MACHINE_G_FIFO_FF_EXEC0_1 "}, + {14, "GDC_MISS_MACHINE_G_FIFO_FF_EXEC0_1"}, {15, "GDC_MISS_MACHINE_G_FIFO_FF_EXEC0_2"}, - {16, "GDC_MISS_MACHINE_G_FIFO_FF_EXEC0_3 "}, - {17, "GDC_MISS_MACHINE_G_FIFO_FF_EXEC1_0 "}, - {18, "GDC_MISS_MACHINE_G_FIFO_FF_EXEC1_1 "}, - {19, "GDC_MISS_MACHINE_G_FIFO_FF_EXEC1_2 "}, - {20, "GDC_MISS_MACHINE_G_FIFO_FF_EXEC1_3 "}, + {16, "GDC_MISS_MACHINE_G_FIFO_FF_EXEC0_3"}, + {17, "GDC_MISS_MACHINE_G_FIFO_FF_EXEC1_0"}, + {18, "GDC_MISS_MACHINE_G_FIFO_FF_EXEC1_1"}, + {19, "GDC_MISS_MACHINE_G_FIFO_FF_EXEC1_2"}, + {20, "GDC_MISS_MACHINE_G_FIFO_FF_EXEC1_3"}, {21, "GDC_MISS_MACHINE_G_FIFO_FF_EXEC_DONE0_0"}, {22, "GDC_MISS_MACHINE_G_FIFO_FF_EXEC_DONE0_1"}, {23, "GDC_MISS_MACHINE_G_FIFO_FF_EXEC_DONE0_2"}, {24, "GDC_MISS_MACHINE_G_FIFO_FF_EXEC_DONE0_3"}, - {25, "GDC_MISS_MACHINE_G_FIFO_FF_EXEC_DONE1_0 "}, + {25, "GDC_MISS_MACHINE_G_FIFO_FF_EXEC_DONE1_0"}, {26, "GDC_MISS_MACHINE_G_FIFO_FF_EXEC_DONE1_1"}, {27, "GDC_MISS_MACHINE_G_FIFO_FF_EXEC_DONE1_2"}, {28, "GDC_MISS_MACHINE_G_FIFO_FF_EXEC_DONE1_3"}, From c54ff8d4c8272cea34f1ee654ccce3d131313065 Mon Sep 17 00:00:00 2001 From: Dmytro Bagrii Date: Fri, 28 Nov 2025 18:15:23 +0200 Subject: [PATCH 0561/1321] platform/x86: dell-lis3lv02d: Add Latitude 5400 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add accelerometer address 0x29 for Dell Latitude 5400. The address is verified as below: $ cat /sys/class/dmi/id/product_name Latitude 5400 $ grep -H '' /sys/bus/pci/drivers/i801_smbus/0000\:00*/i2c-*/name /sys/bus/pci/drivers/i801_smbus/0000:00:1f.4/i2c-10/name:SMBus I801 adapter at 0000:00:1f.4 $ i2cdetect 10 WARNING! This program can confuse your I2C bus, cause data loss and worse! I will probe file /dev/i2c-10. I will probe address range 0x08-0x77. Continue? [Y/n] Y 0 1 2 3 4 5 6 7 8 9 a b c d e f 00: 08 -- -- -- -- -- -- -- 10: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 20: -- -- -- -- -- -- -- -- -- UU -- -- -- -- -- -- 30: 30 -- -- -- -- 35 UU UU -- -- -- -- -- -- -- -- 40: -- -- -- -- 44 -- -- -- -- -- -- -- -- -- -- -- 50: UU -- 52 -- -- -- -- -- -- -- -- -- -- -- -- -- 60: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 70: -- -- -- -- -- -- -- -- $ xargs -n1 -a /proc/cmdline | grep ^dell_lis3lv02d dell_lis3lv02d.probe_i2c_addr=1 $ dmesg | grep lis3lv02d ... [ 206.012411] i2c i2c-10: Probing for lis3lv02d on address 0x29 [ 206.013727] i2c i2c-10: Detected lis3lv02d on address 0x29, please report this upstream to platform-driver-x86@vger.kernel.org so that a quirk can be added [ 206.240841] lis3lv02d_i2c 10-0029: supply Vdd not found, using dummy regulator [ 206.240868] lis3lv02d_i2c 10-0029: supply Vdd_IO not found, using dummy regulator [ 206.261258] lis3lv02d: 8 bits 3DC sensor found [ 206.346722] input: ST LIS3LV02DL Accelerometer as /devices/faux/lis3lv02d/input/input17 $ cat /sys/class/input/input17/name ST LIS3LV02DL Accelerometer Signed-off-by: Dmytro Bagrii Reviewed-by: Hans de Goede Link: https://patch.msgid.link/20251128161523.6224-1-dimich.dmb@gmail.com Reviewed-by: Ilpo Järvinen Signed-off-by: Ilpo Järvinen --- drivers/platform/x86/dell/dell-lis3lv02d.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/platform/x86/dell/dell-lis3lv02d.c b/drivers/platform/x86/dell/dell-lis3lv02d.c index 77905a9ddde9d..fe52bcd896f78 100644 --- a/drivers/platform/x86/dell/dell-lis3lv02d.c +++ b/drivers/platform/x86/dell/dell-lis3lv02d.c @@ -44,6 +44,7 @@ static const struct dmi_system_id lis3lv02d_devices[] __initconst = { /* * Additional individual entries were added after verification. */ + DELL_LIS3LV02D_DMI_ENTRY("Latitude 5400", 0x29), DELL_LIS3LV02D_DMI_ENTRY("Latitude 5480", 0x29), DELL_LIS3LV02D_DMI_ENTRY("Latitude 5500", 0x29), DELL_LIS3LV02D_DMI_ENTRY("Latitude E6330", 0x29), From 92215290e0458946283ecc9837f29b5b71bfee59 Mon Sep 17 00:00:00 2001 From: Mark Pearson Date: Thu, 27 Nov 2025 15:29:48 -0500 Subject: [PATCH 0562/1321] platform/x86: think-lmi: Add WMI certificate thumbprint support for ThinkCenter MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The ThinkCenter team are adding WMI certificate thumbprint support. Updating the driver to enable it. They are using the same GUID as Thinkpad/ThinkStation. Tested on M75q Gen 5. Signed-off-by: Mark Pearson Link: https://patch.msgid.link/20251127202959.399040-1-mpearson-lenovo@squebb.ca Reviewed-by: Ilpo Järvinen Signed-off-by: Ilpo Järvinen --- drivers/platform/x86/lenovo/think-lmi.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/platform/x86/lenovo/think-lmi.c b/drivers/platform/x86/lenovo/think-lmi.c index 540b472b1bf35..c45f0206b4ab6 100644 --- a/drivers/platform/x86/lenovo/think-lmi.c +++ b/drivers/platform/x86/lenovo/think-lmi.c @@ -195,7 +195,7 @@ static const struct tlmi_cert_guids thinkpad_cert_guid = { }; static const struct tlmi_cert_guids thinkcenter_cert_guid = { - .thumbprint = NULL, + .thumbprint = LENOVO_CERT_THUMBPRINT_GUID, /* Same GUID as TP */ .set_bios_setting = LENOVO_TC_SET_BIOS_SETTING_CERT_GUID, .save_bios_setting = LENOVO_TC_SAVE_BIOS_SETTING_CERT_GUID, .cert_to_password = LENOVO_TC_CERT_TO_PASSWORD_GUID, @@ -709,6 +709,10 @@ static ssize_t cert_thumbprint(char *buf, const char *arg, int count) if (!tlmi_priv.cert_guid->thumbprint) return -EOPNOTSUPP; + /* Older ThinkCenter BIOS may not have support */ + if (!wmi_has_guid(tlmi_priv.cert_guid->thumbprint)) + return -EOPNOTSUPP; + status = wmi_evaluate_method(tlmi_priv.cert_guid->thumbprint, 0, 0, &input, &output); if (ACPI_FAILURE(status)) { kfree(output.pointer); From 74d04ca11446d543692e23952d31c2a822532d41 Mon Sep 17 00:00:00 2001 From: Thomas Fourier Date: Wed, 17 Dec 2025 11:36:13 +0100 Subject: [PATCH 0563/1321] platform/x86: msi-laptop: add missing sysfs_remove_group() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit A sysfs group is created in msi_init() when old_ec_model is enabled, but never removed. Remove the msipf_old_attribute_group in that case. Fixes: 03696e51d75a ("msi-laptop: Disable brightness control for new EC") Signed-off-by: Thomas Fourier Link: https://patch.msgid.link/20251217103617.27668-2-fourier.thomas@gmail.com Reviewed-by: Ilpo Järvinen Signed-off-by: Ilpo Järvinen --- drivers/platform/x86/msi-laptop.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/platform/x86/msi-laptop.c b/drivers/platform/x86/msi-laptop.c index c4b150fa093fe..ddef6b78d2fa9 100644 --- a/drivers/platform/x86/msi-laptop.c +++ b/drivers/platform/x86/msi-laptop.c @@ -1130,6 +1130,9 @@ static void __exit msi_cleanup(void) sysfs_remove_group(&msipf_device->dev.kobj, &msipf_attribute_group); if (!quirks->old_ec_model && threeg_exists) device_remove_file(&msipf_device->dev, &dev_attr_threeg); + if (quirks->old_ec_model) + sysfs_remove_group(&msipf_device->dev.kobj, + &msipf_old_attribute_group); platform_device_unregister(msipf_device); platform_driver_unregister(&msipf_driver); backlight_device_unregister(msibl_device); From c01f1341dccf1d2e68b5f8d1390fab739999547c Mon Sep 17 00:00:00 2001 From: Junrui Luo Date: Fri, 19 Dec 2025 16:30:29 +0800 Subject: [PATCH 0564/1321] platform/x86: ibm_rtl: fix EBDA signature search pointer arithmetic MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The ibm_rtl_init() function searches for the signature but has a pointer arithmetic error. The loop counter suggests searching at 4-byte intervals but the implementation only advances by 1 byte per iteration. Fix by properly advancing the pointer by sizeof(unsigned int) bytes each iteration. Reported-by: Yuhao Jiang Reported-by: Junrui Luo Fixes: 35f0ce032b0f ("IBM Real-Time "SMI Free" mode driver -v7") Signed-off-by: Junrui Luo Link: https://patch.msgid.link/SYBPR01MB78812D887A92DE3802D0D06EAFA9A@SYBPR01MB7881.ausprd01.prod.outlook.com Reviewed-by: Ilpo Järvinen Signed-off-by: Ilpo Järvinen --- drivers/platform/x86/ibm_rtl.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/platform/x86/ibm_rtl.c b/drivers/platform/x86/ibm_rtl.c index 231b379098017..139956168cf94 100644 --- a/drivers/platform/x86/ibm_rtl.c +++ b/drivers/platform/x86/ibm_rtl.c @@ -273,7 +273,7 @@ static int __init ibm_rtl_init(void) { /* search for the _RTL_ signature at the start of the table */ for (i = 0 ; i < ebda_size/sizeof(unsigned int); i++) { struct ibm_rtl_table __iomem * tmp; - tmp = (struct ibm_rtl_table __iomem *) (ebda_map+i); + tmp = (struct ibm_rtl_table __iomem *) (ebda_map + i*sizeof(unsigned int)); if ((readq(&tmp->signature) & RTL_MASK) == RTL_SIGNATURE) { phys_addr_t addr; unsigned int plen; From cbba540169b6606453b194f49e79ed1224150036 Mon Sep 17 00:00:00 2001 From: Randy Dunlap Date: Mon, 15 Dec 2025 22:38:00 -0800 Subject: [PATCH 0565/1321] platform/x86/intel/vsec: correct kernel-doc comments MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Fix kernel-doc warnings in intel_vsec.h to eliminate all kernel-doc warnings: Warning: include/linux/intel_vsec.h:92 struct member 'read_telem' not described in 'pmt_callbacks' Warning: include/linux/intel_vsec.h:146 expecting prototype for struct intel_sec_device. Prototype was for struct intel_vsec_device instead Warning: include/linux/intel_vsec.h:146 struct member 'priv_data_size' not described in 'intel_vsec_device' In struct pmt_callbacks, correct the kernel-doc for @read_telem. kernel-doc doesn't support documenting callback function parameters, so drop the '@' signs on those and use "* *" to make them somewhat readable in the produced documentation output. Signed-off-by: Randy Dunlap Link: https://patch.msgid.link/20251216063801.2896495-1-rdunlap@infradead.org Reviewed-by: Ilpo Järvinen Signed-off-by: Ilpo Järvinen --- include/linux/intel_vsec.h | 17 +++++++++-------- 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/include/linux/intel_vsec.h b/include/linux/intel_vsec.h index 53f6fe88e369e..1a0f357c24271 100644 --- a/include/linux/intel_vsec.h +++ b/include/linux/intel_vsec.h @@ -80,13 +80,13 @@ enum intel_vsec_quirks { /** * struct pmt_callbacks - Callback infrastructure for PMT devices - * ->read_telem() when specified, called by client driver to access PMT data (instead - * of direct copy). - * @pdev: PCI device reference for the callback's use - * @guid: ID of data to acccss - * @data: buffer for the data to be copied - * @off: offset into the requested buffer - * @count: size of buffer + * @read_telem: when specified, called by client driver to access PMT + * data (instead of direct copy). + * * pdev: PCI device reference for the callback's use + * * guid: ID of data to acccss + * * data: buffer for the data to be copied + * * off: offset into the requested buffer + * * count: size of buffer */ struct pmt_callbacks { int (*read_telem)(struct pci_dev *pdev, u32 guid, u64 *data, loff_t off, u32 count); @@ -120,7 +120,7 @@ struct intel_vsec_platform_info { }; /** - * struct intel_sec_device - Auxbus specific device information + * struct intel_vsec_device - Auxbus specific device information * @auxdev: auxbus device struct for auxbus access * @pcidev: pci device associated with the device * @resource: any resources shared by the parent @@ -128,6 +128,7 @@ struct intel_vsec_platform_info { * @num_resources: number of resources * @id: xarray id * @priv_data: any private data needed + * @priv_data_size: size of private data area * @quirks: specified quirks * @base_addr: base address of entries (if specified) * @cap_id: the enumerated id of the vsec feature From 4f84d3edad825443c4158ae136a92f9cef14343b Mon Sep 17 00:00:00 2001 From: Kaushlendra Kumar Date: Tue, 23 Dec 2025 14:10:41 +0530 Subject: [PATCH 0566/1321] platform/x86/intel/pmt: Fix kobject memory leak on init failure MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit When kobject_init_and_add() fails in pmt_features_discovery(), the function returns without calling kobject_put(). This violates the kobject API contract where kobject_put() must be called even on initialization failure to properly release allocated resources. Fixes: d9a078809356 ("platform/x86/intel/pmt: Add PMT Discovery driver") Signed-off-by: Kaushlendra Kumar Link: https://patch.msgid.link/20251223084041.3832933-1-kaushlendra.kumar@intel.com Reviewed-by: Ilpo Järvinen Signed-off-by: Ilpo Järvinen --- drivers/platform/x86/intel/pmt/discovery.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/platform/x86/intel/pmt/discovery.c b/drivers/platform/x86/intel/pmt/discovery.c index 32713a194a550..9c5b4d0e1fae6 100644 --- a/drivers/platform/x86/intel/pmt/discovery.c +++ b/drivers/platform/x86/intel/pmt/discovery.c @@ -503,8 +503,10 @@ static int pmt_features_discovery(struct pmt_features_priv *priv, ret = kobject_init_and_add(&feature->kobj, ktype, &priv->dev->kobj, "%s", pmt_feature_names[feature->id]); - if (ret) + if (ret) { + kobject_put(&feature->kobj); return ret; + } kobject_uevent(&feature->kobj, KOBJ_ADD); pmt_features_add_feat(feature); From 252066229b2c182cecc8432c64de32ac1c1c0817 Mon Sep 17 00:00:00 2001 From: Armin Wolf Date: Sun, 28 Dec 2025 22:41:31 +0100 Subject: [PATCH 0567/1321] platform/x86: samsung-galaxybook: Fix problematic pointer cast MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit A user reported that reading the charge threshold on his device results in very strange values (like 78497792) being returned. The reason for this seems to be the fact that the driver casts the int pointer to an u8 pointer, leaving the last 3 bytes of the destination uninitialized. Fix this by using a temporary variable instead. Cc: stable@vger.kernel.org Fixes: 56f529ce4370 ("platform/x86: samsung-galaxybook: Add samsung-galaxybook driver") Reported-by: Gianni Ceccarelli Closes: https://lore.kernel.org/platform-driver-x86/20251228115556.14362d66@thenautilus.net/ Tested-by: Gianni Ceccarelli Signed-off-by: Armin Wolf Link: https://patch.msgid.link/20251228214217.35972-1-W_Armin@gmx.de Reviewed-by: Ilpo Järvinen Signed-off-by: Ilpo Järvinen --- drivers/platform/x86/samsung-galaxybook.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/drivers/platform/x86/samsung-galaxybook.c b/drivers/platform/x86/samsung-galaxybook.c index 3c13e13d48858..755cb82bdb606 100644 --- a/drivers/platform/x86/samsung-galaxybook.c +++ b/drivers/platform/x86/samsung-galaxybook.c @@ -442,12 +442,13 @@ static int galaxybook_battery_ext_property_get(struct power_supply *psy, union power_supply_propval *val) { struct samsung_galaxybook *galaxybook = ext_data; + u8 value; int err; if (psp != POWER_SUPPLY_PROP_CHARGE_CONTROL_END_THRESHOLD) return -EINVAL; - err = charge_control_end_threshold_acpi_get(galaxybook, (u8 *)&val->intval); + err = charge_control_end_threshold_acpi_get(galaxybook, &value); if (err) return err; @@ -455,8 +456,10 @@ static int galaxybook_battery_ext_property_get(struct power_supply *psy, * device stores "no end threshold" as 0 instead of 100; * if device has 0, report 100 */ - if (val->intval == 0) - val->intval = 100; + if (value == 0) + value = 100; + + val->intval = value; return 0; } From 472f052c573f3730a196572636cf5ca2364ef500 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Tom=C3=A1=C5=A1=20Hnyk?= Date: Fri, 26 Dec 2025 21:34:54 +0100 Subject: [PATCH 0568/1321] platform/x86: ideapad-laptop: Reassign KEY_CUT to KEY_SELECTIVE_SCREENSHOT MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit As per Lenovo documentation, Fn+Print-Screen should "Open the Snipping tool" which corresponds to KEY_SELECTIVE_SCREENSHOT (keycode 0x27a). It is currently assigned to KEY_CUT because keycodes under 248 were preferred due to X11 limitations. Reassign Fn+Print-Screen from KEY_CUT to KEY_SELECTIVE_SCREENSHOT. Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220566 Signed-off-by: Tomáš Hnyk Link: https://patch.msgid.link/20251226203454.405520-1-tomashnyk@gmail.com Reviewed-by: Ilpo Järvinen Signed-off-by: Ilpo Järvinen --- drivers/platform/x86/lenovo/ideapad-laptop.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/platform/x86/lenovo/ideapad-laptop.c b/drivers/platform/x86/lenovo/ideapad-laptop.c index 5171a077f62c3..7d5f7a2f65647 100644 --- a/drivers/platform/x86/lenovo/ideapad-laptop.c +++ b/drivers/platform/x86/lenovo/ideapad-laptop.c @@ -1367,7 +1367,7 @@ static const struct key_entry ideapad_keymap[] = { /* Performance toggle also Fn+Q, handled inside ideapad_wmi_notify() */ { KE_KEY, 0x3d | IDEAPAD_WMI_KEY, { KEY_PROG4 } }, /* shift + prtsc */ - { KE_KEY, 0x2d | IDEAPAD_WMI_KEY, { KEY_CUT } }, + { KE_KEY, 0x2d | IDEAPAD_WMI_KEY, { KEY_SELECTIVE_SCREENSHOT } }, { KE_KEY, 0x29 | IDEAPAD_WMI_KEY, { KEY_TOUCHPAD_TOGGLE } }, { KE_KEY, 0x2a | IDEAPAD_WMI_KEY, { KEY_ROOT_MENU } }, From 4772683695ee5aaa719bba25ad328464d905b128 Mon Sep 17 00:00:00 2001 From: Denis Benato Date: Thu, 25 Dec 2025 03:38:41 +0100 Subject: [PATCH 0569/1321] platform/x86: asus-armoury: add support for GU605CR MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add TDP data for laptop model GU605CR. Signed-off-by: Denis Benato Link: https://patch.msgid.link/20251225023841.1970513-1-denis.benato@linux.dev Reviewed-by: Ilpo Järvinen Signed-off-by: Ilpo Järvinen --- drivers/platform/x86/asus-armoury.h | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) diff --git a/drivers/platform/x86/asus-armoury.h b/drivers/platform/x86/asus-armoury.h index a1bb2005c3f35..d8814165d480a 100644 --- a/drivers/platform/x86/asus-armoury.h +++ b/drivers/platform/x86/asus-armoury.h @@ -950,6 +950,35 @@ static const struct dmi_system_id power_limits[] = { }, }, }, + { + .matches = { + DMI_MATCH(DMI_BOARD_NAME, "GU605CR"), + }, + .driver_data = &(struct power_data) { + .ac_data = &(struct power_limits) { + .ppt_pl1_spl_min = 30, + .ppt_pl1_spl_max = 85, + .ppt_pl2_sppt_min = 38, + .ppt_pl2_sppt_max = 110, + .nv_dynamic_boost_min = 5, + .nv_dynamic_boost_max = 20, + .nv_temp_target_min = 75, + .nv_temp_target_max = 87, + .nv_tgp_min = 80, + .nv_tgp_def = 90, + .nv_tgp_max = 105, + }, + .dc_data = &(struct power_limits) { + .ppt_pl1_spl_min = 30, + .ppt_pl1_spl_max = 85, + .ppt_pl2_sppt_min = 38, + .ppt_pl2_sppt_max = 110, + .nv_temp_target_min = 75, + .nv_temp_target_max = 87, + }, + .requires_fan_curve = true, + }, + }, { .matches = { DMI_MATCH(DMI_BOARD_NAME, "GU605CW"), From 04fd71922a163cb170360cd42a5aebf749aae397 Mon Sep 17 00:00:00 2001 From: Denis Benato Date: Thu, 25 Dec 2025 03:53:01 +0100 Subject: [PATCH 0570/1321] platform/x86: asus-armoury: add support for GA403WR MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add TDP data for laptop model GA403WR. Signed-off-by: Denis Benato Link: https://patch.msgid.link/20251225025301.1980627-1-denis.benato@linux.dev Reviewed-by: Ilpo Järvinen Signed-off-by: Ilpo Järvinen --- drivers/platform/x86/asus-armoury.h | 32 +++++++++++++++++++++++++++++ 1 file changed, 32 insertions(+) diff --git a/drivers/platform/x86/asus-armoury.h b/drivers/platform/x86/asus-armoury.h index d8814165d480a..1d689e561509e 100644 --- a/drivers/platform/x86/asus-armoury.h +++ b/drivers/platform/x86/asus-armoury.h @@ -822,6 +822,38 @@ static const struct dmi_system_id power_limits[] = { .requires_fan_curve = true, }, }, + { + .matches = { + DMI_MATCH(DMI_BOARD_NAME, "GA403WR"), + }, + .driver_data = &(struct power_data) { + .ac_data = &(struct power_limits) { + .ppt_pl1_spl_min = 15, + .ppt_pl1_spl_max = 80, + .ppt_pl2_sppt_min = 25, + .ppt_pl2_sppt_max = 80, + .ppt_pl3_fppt_min = 35, + .ppt_pl3_fppt_max = 80, + .nv_dynamic_boost_min = 0, + .nv_dynamic_boost_max = 25, + .nv_temp_target_min = 75, + .nv_temp_target_max = 87, + .nv_tgp_min = 80, + .nv_tgp_max = 95, + }, + .dc_data = &(struct power_limits) { + .ppt_pl1_spl_min = 15, + .ppt_pl1_spl_max = 35, + .ppt_pl2_sppt_min = 25, + .ppt_pl2_sppt_max = 35, + .ppt_pl3_fppt_min = 35, + .ppt_pl3_fppt_max = 65, + .nv_temp_target_min = 75, + .nv_temp_target_max = 87, + }, + .requires_fan_curve = true, + }, + }, { .matches = { DMI_MATCH(DMI_BOARD_NAME, "GA503QR"), From 0816730d051a6240f8934276bee0b9cec6ab2d8e Mon Sep 17 00:00:00 2001 From: Denis Benato Date: Thu, 25 Dec 2025 04:03:54 +0100 Subject: [PATCH 0571/1321] platform/x86: asus-armoury: add support for FA608UM MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add TDP data for laptop model FA608UM. Signed-off-by: Denis Benato Link: https://patch.msgid.link/20251225030354.2315874-1-denis.benato@linux.dev Reviewed-by: Ilpo Järvinen Signed-off-by: Ilpo Järvinen --- drivers/platform/x86/asus-armoury.h | 36 +++++++++++++++++++++++++++++ 1 file changed, 36 insertions(+) diff --git a/drivers/platform/x86/asus-armoury.h b/drivers/platform/x86/asus-armoury.h index 1d689e561509e..0b7cdae8b442e 100644 --- a/drivers/platform/x86/asus-armoury.h +++ b/drivers/platform/x86/asus-armoury.h @@ -552,6 +552,42 @@ static const struct dmi_system_id power_limits[] = { }, }, }, + { + .matches = { + DMI_MATCH(DMI_BOARD_NAME, "FA608UM"), + }, + .driver_data = &(struct power_data) { + .ac_data = &(struct power_limits) { + .ppt_pl1_spl_min = 15, + .ppt_pl1_spl_def = 45, + .ppt_pl1_spl_max = 90, + .ppt_pl2_sppt_min = 35, + .ppt_pl2_sppt_def = 54, + .ppt_pl2_sppt_max = 90, + .ppt_pl3_fppt_min = 35, + .ppt_pl3_fppt_def = 90, + .ppt_pl3_fppt_max = 65, + .nv_dynamic_boost_min = 10, + .nv_dynamic_boost_max = 15, + .nv_temp_target_min = 75, + .nv_temp_target_max = 87, + .nv_tgp_min = 55, + .nv_tgp_max = 100, + }, + .dc_data = &(struct power_limits) { + .ppt_pl1_spl_min = 15, + .ppt_pl1_spl_def = 45, + .ppt_pl1_spl_max = 65, + .ppt_pl2_sppt_min = 35, + .ppt_pl2_sppt_def = 54, + .ppt_pl2_sppt_max = 65, + .ppt_pl3_fppt_min = 35, + .ppt_pl3_fppt_max = 65, + .nv_temp_target_min = 75, + .nv_temp_target_max = 87, + }, + }, + }, { .matches = { DMI_MATCH(DMI_BOARD_NAME, "FA608WI"), From b08265650656fd2a3146451a4ca74d3d57f4ec80 Mon Sep 17 00:00:00 2001 From: Denis Benato Date: Thu, 25 Dec 2025 04:10:41 +0100 Subject: [PATCH 0572/1321] platform/x86: asus-armoury: add support for G615LR MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add TDP data for laptop model G615LR. Signed-off-by: Denis Benato Link: https://patch.msgid.link/20251225031041.2321249-1-denis.benato@linux.dev Reviewed-by: Ilpo Järvinen Signed-off-by: Ilpo Järvinen --- drivers/platform/x86/asus-armoury.h | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) diff --git a/drivers/platform/x86/asus-armoury.h b/drivers/platform/x86/asus-armoury.h index 0b7cdae8b442e..24977a9da2299 100644 --- a/drivers/platform/x86/asus-armoury.h +++ b/drivers/platform/x86/asus-armoury.h @@ -1357,6 +1357,35 @@ static const struct dmi_system_id power_limits[] = { .requires_fan_curve = true, }, }, + { + .matches = { + DMI_MATCH(DMI_BOARD_NAME, "G615LR"), + }, + .driver_data = &(struct power_data) { + .ac_data = &(struct power_limits) { + .ppt_pl1_spl_min = 28, + .ppt_pl1_spl_def = 140, + .ppt_pl1_spl_max = 175, + .ppt_pl2_sppt_min = 28, + .ppt_pl2_sppt_max = 175, + .nv_temp_target_min = 75, + .nv_temp_target_max = 87, + .nv_dynamic_boost_min = 5, + .nv_dynamic_boost_max = 25, + .nv_tgp_min = 65, + .nv_tgp_max = 115, + }, + .dc_data = &(struct power_limits) { + .ppt_pl1_spl_min = 25, + .ppt_pl1_spl_max = 55, + .ppt_pl2_sppt_min = 25, + .ppt_pl2_sppt_max = 70, + .nv_temp_target_min = 75, + .nv_temp_target_max = 87, + }, + .requires_fan_curve = true, + }, + }, { .matches = { DMI_MATCH(DMI_BOARD_NAME, "G634J"), From fde4392b11f3310df833c200972fbd1da25b6f9e Mon Sep 17 00:00:00 2001 From: Junrui Luo Date: Fri, 26 Dec 2025 19:42:05 +0800 Subject: [PATCH 0573/1321] platform/x86: hp-bioscfg: Fix out-of-bounds array access in ACPI package parsing MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The hp_populate_*_elements_from_package() functions in the hp-bioscfg driver contain out-of-bounds array access vulnerabilities. These functions parse ACPI packages into internal data structures using a for loop with index variable 'elem' that iterates through enum_obj/integer_obj/order_obj/password_obj/string_obj arrays. When processing multi-element fields like PREREQUISITES and ENUM_POSSIBLE_VALUES, these functions read multiple consecutive array elements using expressions like 'enum_obj[elem + reqs]' and 'enum_obj[elem + pos_values]' within nested loops. The bug is that the bounds check only validated elem, but did not consider the additional offset when accessing elem + reqs or elem + pos_values. The fix changes the bounds check to validate the actual accessed index. Reported-by: Yuhao Jiang Reported-by: Junrui Luo Fixes: e6c7b3e15559 ("platform/x86: hp-bioscfg: string-attributes") Signed-off-by: Junrui Luo Link: https://patch.msgid.link/SYBPR01MB788173D7DD4EA2CB6383683DAFB0A@SYBPR01MB7881.ausprd01.prod.outlook.com Reviewed-by: Ilpo Järvinen Signed-off-by: Ilpo Järvinen --- drivers/platform/x86/hp/hp-bioscfg/enum-attributes.c | 4 ++-- drivers/platform/x86/hp/hp-bioscfg/int-attributes.c | 2 +- drivers/platform/x86/hp/hp-bioscfg/order-list-attributes.c | 5 +++++ drivers/platform/x86/hp/hp-bioscfg/passwdobj-attributes.c | 5 +++++ drivers/platform/x86/hp/hp-bioscfg/string-attributes.c | 2 +- 5 files changed, 14 insertions(+), 4 deletions(-) diff --git a/drivers/platform/x86/hp/hp-bioscfg/enum-attributes.c b/drivers/platform/x86/hp/hp-bioscfg/enum-attributes.c index c50ad58805038..f346aad8e9d89 100644 --- a/drivers/platform/x86/hp/hp-bioscfg/enum-attributes.c +++ b/drivers/platform/x86/hp/hp-bioscfg/enum-attributes.c @@ -207,7 +207,7 @@ static int hp_populate_enumeration_elements_from_package(union acpi_object *enum case PREREQUISITES: size = min_t(u32, enum_data->common.prerequisites_size, MAX_PREREQUISITES_SIZE); for (reqs = 0; reqs < size; reqs++) { - if (elem >= enum_obj_count) { + if (elem + reqs >= enum_obj_count) { pr_err("Error enum-objects package is too small\n"); return -EINVAL; } @@ -255,7 +255,7 @@ static int hp_populate_enumeration_elements_from_package(union acpi_object *enum for (pos_values = 0; pos_values < size && pos_values < MAX_VALUES_SIZE; pos_values++) { - if (elem >= enum_obj_count) { + if (elem + pos_values >= enum_obj_count) { pr_err("Error enum-objects package is too small\n"); return -EINVAL; } diff --git a/drivers/platform/x86/hp/hp-bioscfg/int-attributes.c b/drivers/platform/x86/hp/hp-bioscfg/int-attributes.c index 6c7f4d5fa9cb9..63b1fda2be4e2 100644 --- a/drivers/platform/x86/hp/hp-bioscfg/int-attributes.c +++ b/drivers/platform/x86/hp/hp-bioscfg/int-attributes.c @@ -227,7 +227,7 @@ static int hp_populate_integer_elements_from_package(union acpi_object *integer_ size = min_t(u32, integer_data->common.prerequisites_size, MAX_PREREQUISITES_SIZE); for (reqs = 0; reqs < size; reqs++) { - if (elem >= integer_obj_count) { + if (elem + reqs >= integer_obj_count) { pr_err("Error elem-objects package is too small\n"); return -EINVAL; } diff --git a/drivers/platform/x86/hp/hp-bioscfg/order-list-attributes.c b/drivers/platform/x86/hp/hp-bioscfg/order-list-attributes.c index c6e57bb9d8b74..6a31f47ce3f5b 100644 --- a/drivers/platform/x86/hp/hp-bioscfg/order-list-attributes.c +++ b/drivers/platform/x86/hp/hp-bioscfg/order-list-attributes.c @@ -216,6 +216,11 @@ static int hp_populate_ordered_list_elements_from_package(union acpi_object *ord size = min_t(u32, ordered_list_data->common.prerequisites_size, MAX_PREREQUISITES_SIZE); for (reqs = 0; reqs < size; reqs++) { + if (elem + reqs >= order_obj_count) { + pr_err("Error elem-objects package is too small\n"); + return -EINVAL; + } + ret = hp_convert_hexstr_to_str(order_obj[elem + reqs].string.pointer, order_obj[elem + reqs].string.length, &str_value, &value_len); diff --git a/drivers/platform/x86/hp/hp-bioscfg/passwdobj-attributes.c b/drivers/platform/x86/hp/hp-bioscfg/passwdobj-attributes.c index 187b372123ed3..ec79d9d50377a 100644 --- a/drivers/platform/x86/hp/hp-bioscfg/passwdobj-attributes.c +++ b/drivers/platform/x86/hp/hp-bioscfg/passwdobj-attributes.c @@ -303,6 +303,11 @@ static int hp_populate_password_elements_from_package(union acpi_object *passwor MAX_PREREQUISITES_SIZE); for (reqs = 0; reqs < size; reqs++) { + if (elem + reqs >= password_obj_count) { + pr_err("Error elem-objects package is too small\n"); + return -EINVAL; + } + ret = hp_convert_hexstr_to_str(password_obj[elem + reqs].string.pointer, password_obj[elem + reqs].string.length, &str_value, &value_len); diff --git a/drivers/platform/x86/hp/hp-bioscfg/string-attributes.c b/drivers/platform/x86/hp/hp-bioscfg/string-attributes.c index 27758b779b2d3..7b885d25650c5 100644 --- a/drivers/platform/x86/hp/hp-bioscfg/string-attributes.c +++ b/drivers/platform/x86/hp/hp-bioscfg/string-attributes.c @@ -217,7 +217,7 @@ static int hp_populate_string_elements_from_package(union acpi_object *string_ob MAX_PREREQUISITES_SIZE); for (reqs = 0; reqs < size; reqs++) { - if (elem >= string_obj_count) { + if (elem + reqs >= string_obj_count) { pr_err("Error elem-objects package is too small\n"); return -EINVAL; } From 42ffc8fe8341517a2382632f1e8662fd9cc0f6a8 Mon Sep 17 00:00:00 2001 From: Alok Tiwari Date: Wed, 24 Dec 2025 01:51:09 -0800 Subject: [PATCH 0574/1321] platform/x86/intel/pmt/discovery: use valid device pointer in dev_err_probe MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The PMT feature probe creates a child device with device_create(). If device creation fail, the code pass priv->dev (which is an ERR_PTR) to dev_err_probe(), which is not a valid device pointer. This patch change the dev_err_probe() call to use the parent auxiliary device (&auxdev->dev) and update the error message to reference the parent device name. It ensure correct error reporting and avoid passing an invalid device pointer. Fixes: d9a078809356 ("platform/x86/intel/pmt: Add PMT Discovery driver") Signed-off-by: Alok Tiwari Link: https://patch.msgid.link/20251224095133.115678-1-alok.a.tiwari@oracle.com Reviewed-by: Ilpo Järvinen Signed-off-by: Ilpo Järvinen --- drivers/platform/x86/intel/pmt/discovery.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/platform/x86/intel/pmt/discovery.c b/drivers/platform/x86/intel/pmt/discovery.c index 9c5b4d0e1fae6..e500aa327d237 100644 --- a/drivers/platform/x86/intel/pmt/discovery.c +++ b/drivers/platform/x86/intel/pmt/discovery.c @@ -548,9 +548,9 @@ static int pmt_features_probe(struct auxiliary_device *auxdev, const struct auxi priv->dev = device_create(&intel_pmt_class, &auxdev->dev, MKDEV(0, 0), priv, "%s-%s", "features", dev_name(priv->parent)); if (IS_ERR(priv->dev)) - return dev_err_probe(priv->dev, PTR_ERR(priv->dev), + return dev_err_probe(&auxdev->dev, PTR_ERR(priv->dev), "Could not create %s-%s device node\n", - "features", dev_name(priv->dev)); + "features", dev_name(priv->parent)); /* Initialize each feature */ for (i = 0; i < ivdev->num_resources; i++) { From d2d43d44a94c2a034b69d2633b7fd70e47e74850 Mon Sep 17 00:00:00 2001 From: Denis Benato Date: Mon, 29 Dec 2025 16:07:55 +0100 Subject: [PATCH 0575/1321] platform/x86: asus-armoury: fix ppt data for FA507R MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit PPT data for FA507R was reported to be wrong by a user: change limits to make them equal to Armoury Crate limits. Fixes: 39ae6c50e599 ("platform/x86: asus-armoury: add ppt_* and nv_* tuning knobs") Signed-off-by: Denis Benato Link: https://patch.msgid.link/20251229150755.1351495-1-denis.benato@linux.dev Reviewed-by: Ilpo Järvinen Signed-off-by: Ilpo Järvinen --- drivers/platform/x86/asus-armoury.h | 21 ++++++++++++++++++--- 1 file changed, 18 insertions(+), 3 deletions(-) diff --git a/drivers/platform/x86/asus-armoury.h b/drivers/platform/x86/asus-armoury.h index 24977a9da2299..68b174b70a66e 100644 --- a/drivers/platform/x86/asus-armoury.h +++ b/drivers/platform/x86/asus-armoury.h @@ -449,12 +449,27 @@ static const struct dmi_system_id power_limits[] = { .ac_data = &(struct power_limits) { .ppt_pl1_spl_min = 15, .ppt_pl1_spl_max = 80, - .ppt_pl2_sppt_min = 25, + .ppt_pl2_sppt_min = 35, .ppt_pl2_sppt_max = 80, .ppt_pl3_fppt_min = 35, - .ppt_pl3_fppt_max = 80 + .ppt_pl3_fppt_max = 80, + .nv_dynamic_boost_min = 5, + .nv_dynamic_boost_max = 25, + .nv_temp_target_min = 75, + .nv_temp_target_max = 87, + }, + .dc_data = &(struct power_limits) { + .ppt_pl1_spl_min = 15, + .ppt_pl1_spl_def = 45, + .ppt_pl1_spl_max = 65, + .ppt_pl2_sppt_min = 35, + .ppt_pl2_sppt_def = 54, + .ppt_pl2_sppt_max = 65, + .ppt_pl3_fppt_min = 35, + .ppt_pl3_fppt_max = 65, + .nv_temp_target_min = 75, + .nv_temp_target_max = 87, }, - .dc_data = NULL, }, }, { From 5d331cf137fddc22b9f02c4a9db8730f3cb05ca5 Mon Sep 17 00:00:00 2001 From: Denis Benato Date: Mon, 29 Dec 2025 21:44:58 +0100 Subject: [PATCH 0576/1321] platform/x86: asus-armoury: add support for G835LW MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add TDP data for laptop model G835LW. Signed-off-by: Denis Benato Link: https://patch.msgid.link/20251229204458.2658777-1-denis.benato@linux.dev Reviewed-by: Ilpo Järvinen Signed-off-by: Ilpo Järvinen --- drivers/platform/x86/asus-armoury.h | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) diff --git a/drivers/platform/x86/asus-armoury.h b/drivers/platform/x86/asus-armoury.h index 68b174b70a66e..3ac7aea378384 100644 --- a/drivers/platform/x86/asus-armoury.h +++ b/drivers/platform/x86/asus-armoury.h @@ -1567,6 +1567,35 @@ static const struct dmi_system_id power_limits[] = { .requires_fan_curve = true, }, }, + { + .matches = { + DMI_MATCH(DMI_BOARD_NAME, "G835LW"), + }, + .driver_data = &(struct power_data) { + .ac_data = &(struct power_limits) { + .ppt_pl1_spl_min = 28, + .ppt_pl1_spl_def = 140, + .ppt_pl1_spl_max = 175, + .ppt_pl2_sppt_min = 28, + .ppt_pl2_sppt_max = 175, + .nv_dynamic_boost_min = 5, + .nv_dynamic_boost_max = 25, + .nv_temp_target_min = 75, + .nv_temp_target_max = 87, + .nv_tgp_min = 80, + .nv_tgp_max = 150, + }, + .dc_data = &(struct power_limits) { + .ppt_pl1_spl_min = 25, + .ppt_pl1_spl_max = 55, + .ppt_pl2_sppt_min = 25, + .ppt_pl2_sppt_max = 70, + .nv_temp_target_min = 75, + .nv_temp_target_max = 87, + }, + .requires_fan_curve = true, + }, + }, { .matches = { DMI_MATCH(DMI_BOARD_NAME, "H7606W"), From 0e991179b0a17a2547599c6a32e8bf6085a3c375 Mon Sep 17 00:00:00 2001 From: Shuah Khan Date: Tue, 30 Dec 2025 20:56:42 -0700 Subject: [PATCH 0577/1321] Revert "wifi: mt76: Strip whitespace from build ddate" This reverts commit f804a5895ebad2b2d4fb8a3688d2115926e993d5. This change introduced the following panic, and mt792x_load_firmware() fails. wifi is dead on systems with mt792x wireless. kern :crit : kernel BUG at lib/string_helpers.c:1043! kern :warn : Oops: invalid opcode: 0000 [#1] SMP NOPTI kern :warn : CPU: 14 UID: 0 PID: 61 Comm: kworker/14:0 Tainted: G W 6.19.0-rc1 #1 PREEMPT(voluntary) kern :warn : Tainted: [W]=WARN kern :warn : Hardware name: Framework Laptop 13 (AMD Ryzen 7040Series)/FRANMDCP07, BIOS 03.16 07/25/2025 kern :warn : Workqueue: events mt7921_init_work [mt7921_common] kern :warn : RIP: 0010:__fortify_panic+0xd/0xf kern :warn : Code: 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 40 0f b6 ff e8 c3 55 71 00 <0f> 0b 48 8b 54 24 10 48 8b 74 24 08 4c 89 e9 48 c7 c7 00 a2 d5 a0 kern :warn : RSP: 0018:ffffa7a5c03a3d10 EFLAGS: 00010246 kern :warn : RAX: ffffffffa0d7aaf2 RBX: 0000000000000000 RCX: ffffffffa0d7aaf2 kern :warn : RDX: 0000000000000011 RSI: ffffffffa0d5a170 RDI: ffffffffa128db10 kern :warn : RBP: ffff91650ae52060 R08: 0000000000000010 R09: ffffa7a5c31b2000 kern :warn : R10: ffffa7a5c03a3bf0 R11: 00000000ffffffff R12: 0000000000000000 kern :warn : R13: ffffa7a5c31b2000 R14: 0000000000001000 R15: 0000000000000000 kern :warn : FS: 0000000000000000(0000) GS:ffff91743e664000(0000) knlGS:0000000000000000 kern :warn : CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 kern :warn : CR2: 00007f10786c241c CR3: 00000003eca24000 CR4: 0000000000f50ef0 kern :warn : PKRU: 55555554 kern :warn : Call Trace: kern :warn : kern :warn : mt76_connac2_load_patch.cold+0x2b/0xa41 [mt76_connac_lib] kern :warn : ? srso_alias_return_thunk+0x5/0xfbef5 kern :warn : mt792x_load_firmware+0x36/0x150 [mt792x_lib] kern :warn : mt7921_run_firmware+0x2c/0x4a0 [mt7921_common] kern :warn : ? srso_alias_return_thunk+0x5/0xfbef5 kern :warn : ? mt7921_rr+0x12/0x30 [mt7921e] kern :warn : ? srso_alias_return_thunk+0x5/0xfbef5 kern :warn : ? ____mt76_poll_msec+0x75/0xb0 [mt76] kern :warn : mt7921e_mcu_init+0x4c/0x7a [mt7921e] kern :warn : mt7921_init_work+0x51/0x190 [mt7921_common] kern :warn : process_one_work+0x18b/0x340 kern :warn : worker_thread+0x256/0x3a0 kern :warn : ? __pfx_worker_thread+0x10/0x10 kern :warn : kthread+0xfc/0x240 kern :warn : ? __pfx_kthread+0x10/0x10 kern :warn : ? __pfx_kthread+0x10/0x10 kern :warn : ret_from_fork+0x254/0x290 kern :warn : ? __pfx_kthread+0x10/0x10 kern :warn : ret_from_fork_asm+0x1a/0x30 kern :warn : Signed-off-by: Shuah Khan Signed-off-by: Linus Torvalds --- drivers/net/wireless/mediatek/mt76/mt76_connac_mcu.c | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt76_connac_mcu.c b/drivers/net/wireless/mediatek/mt76/mt76_connac_mcu.c index ea99167765b0c..fba7025ffd3f3 100644 --- a/drivers/net/wireless/mediatek/mt76/mt76_connac_mcu.c +++ b/drivers/net/wireless/mediatek/mt76/mt76_connac_mcu.c @@ -3101,7 +3101,6 @@ int mt76_connac2_load_patch(struct mt76_dev *dev, const char *fw_name) int i, ret, sem, max_len = mt76_is_sdio(dev) ? 2048 : 4096; const struct mt76_connac2_patch_hdr *hdr; const struct firmware *fw = NULL; - char build_date[17]; sem = mt76_connac_mcu_patch_sem_ctrl(dev, true); switch (sem) { @@ -3125,11 +3124,8 @@ int mt76_connac2_load_patch(struct mt76_dev *dev, const char *fw_name) } hdr = (const void *)fw->data; - strscpy(build_date, hdr->build_date, sizeof(build_date)); - build_date[16] = '\0'; - strim(build_date); dev_info(dev->dev, "HW/SW Version: 0x%x, Build Time: %.16s\n", - be32_to_cpu(hdr->hw_sw_ver), build_date); + be32_to_cpu(hdr->hw_sw_ver), hdr->build_date); for (i = 0; i < be32_to_cpu(hdr->desc.n_region); i++) { struct mt76_connac2_patch_sec *sec; From cddd5f57047a684f898f241757bd29648d56effd Mon Sep 17 00:00:00 2001 From: Shuah Khan Date: Wed, 31 Dec 2025 16:46:26 -0700 Subject: [PATCH 0578/1321] wifi: mt76: Remove blank line after mt792x firmware version dmesg An extra blank line gets printed after printing firmware version because the build date is null terminated. Remove the "\n" from dev_info() calls to print firmware version and build date to fix the problem. Reported-by: Mario Limonciello Signed-off-by: Shuah Khan Signed-off-by: Linus Torvalds --- drivers/net/wireless/mediatek/mt76/mt76_connac_mcu.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt76_connac_mcu.c b/drivers/net/wireless/mediatek/mt76/mt76_connac_mcu.c index fba7025ffd3f3..0457712286d55 100644 --- a/drivers/net/wireless/mediatek/mt76/mt76_connac_mcu.c +++ b/drivers/net/wireless/mediatek/mt76/mt76_connac_mcu.c @@ -3019,7 +3019,7 @@ int mt76_connac2_load_ram(struct mt76_dev *dev, const char *fw_wm, } hdr = (const void *)(fw->data + fw->size - sizeof(*hdr)); - dev_info(dev->dev, "WM Firmware Version: %.10s, Build Time: %.15s\n", + dev_info(dev->dev, "WM Firmware Version: %.10s, Build Time: %.15s", hdr->fw_ver, hdr->build_date); ret = mt76_connac_mcu_send_ram_firmware(dev, hdr, fw->data, false); @@ -3048,7 +3048,7 @@ int mt76_connac2_load_ram(struct mt76_dev *dev, const char *fw_wm, } hdr = (const void *)(fw->data + fw->size - sizeof(*hdr)); - dev_info(dev->dev, "WA Firmware Version: %.10s, Build Time: %.15s\n", + dev_info(dev->dev, "WA Firmware Version: %.10s, Build Time: %.15s", hdr->fw_ver, hdr->build_date); ret = mt76_connac_mcu_send_ram_firmware(dev, hdr, fw->data, true); @@ -3124,7 +3124,7 @@ int mt76_connac2_load_patch(struct mt76_dev *dev, const char *fw_name) } hdr = (const void *)fw->data; - dev_info(dev->dev, "HW/SW Version: 0x%x, Build Time: %.16s\n", + dev_info(dev->dev, "HW/SW Version: 0x%x, Build Time: %.16s", be32_to_cpu(hdr->hw_sw_ver), hdr->build_date); for (i = 0; i < be32_to_cpu(hdr->desc.n_region); i++) { From ba788346ae2776472bfe19f412bad3c8bb95aca0 Mon Sep 17 00:00:00 2001 From: Steve French Date: Mon, 29 Dec 2025 10:23:12 -0600 Subject: [PATCH 0579/1321] smb3 client: add missing tracepoint for unsupported ioctls In debugging a recent problem with an xfstest, noticed that we weren't tracing cases where the ioctl was not supported. Add dynamic tracepoint: "trace-cmd record -e smb3_unsupported_ioctl" and then after running an app which calls unsupported ioctl, "trace-cmd show"would display e.g. xfs_io-7289 [012] ..... 1205.137765: smb3_unsupported_ioctl: xid=19 fid=0x4535bb84 ioctl cmd=0x801c581f Acked-by: Bharath SM Signed-off-by: Steve French --- fs/smb/client/ioctl.c | 3 +++ fs/smb/client/trace.h | 1 + 2 files changed, 4 insertions(+) diff --git a/fs/smb/client/ioctl.c b/fs/smb/client/ioctl.c index 0a9935ce05a5a..d1b1532094240 100644 --- a/fs/smb/client/ioctl.c +++ b/fs/smb/client/ioctl.c @@ -588,6 +588,9 @@ long cifs_ioctl(struct file *filep, unsigned int command, unsigned long arg) break; default: cifs_dbg(FYI, "unsupported ioctl\n"); + trace_smb3_unsupported_ioctl(xid, + pSMBFile ? pSMBFile->fid.persistent_fid : 0, + command); break; } cifs_ioc_exit: diff --git a/fs/smb/client/trace.h b/fs/smb/client/trace.h index b0fbc2df642e9..a584a77431132 100644 --- a/fs/smb/client/trace.h +++ b/fs/smb/client/trace.h @@ -1579,6 +1579,7 @@ DEFINE_EVENT(smb3_ioctl_class, smb3_##name, \ TP_ARGS(xid, fid, command)) DEFINE_SMB3_IOCTL_EVENT(ioctl); +DEFINE_SMB3_IOCTL_EVENT(unsupported_ioctl); DECLARE_EVENT_CLASS(smb3_shutdown_class, TP_PROTO(__u32 flags, From 6fc77eb9423c2c5fc9f58bac2432687f0a17b279 Mon Sep 17 00:00:00 2001 From: Henrique Carvalho Date: Mon, 29 Dec 2025 14:49:43 -0300 Subject: [PATCH 0580/1321] smb: client: fix UBSAN array-index-out-of-bounds in smb2_copychunk_range struct copychunk_ioctl_req::ChunkCount is annotated with __counted_by_le() as the number of elements in Chunks[]. smb2_copychunk_range reuses ChunkCount to store the number of chunks sent in the current iteration. If a later iteration populates more chunks than a previous one, the stale smaller value trips UBSAN. Set ChunkCount to chunk_count (allocated capacity) before populating Chunks[]. Fixes: cc26f593dc19 ("smb: move copychunk definitions to common/smb2pdu.h") Link: https://lore.kernel.org/linux-cifs/CAH2r5ms9AWLy8WZ04Cpq5XOeVK64tcrUQ6__iMW+yk1VPzo1BA@mail.gmail.com Tested-by: Youling Tang Acked-by: ChenXiaoSong Signed-off-by: Henrique Carvalho Signed-off-by: Steve French --- fs/smb/client/smb2ops.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/fs/smb/client/smb2ops.c b/fs/smb/client/smb2ops.c index a16ded46b5a26..c1aaf77e187b6 100644 --- a/fs/smb/client/smb2ops.c +++ b/fs/smb/client/smb2ops.c @@ -1905,6 +1905,12 @@ smb2_copychunk_range(const unsigned int xid, src_off_prev = src_off; dst_off_prev = dst_off; + /* + * __counted_by_le(ChunkCount): set to allocated chunks before + * populating Chunks[] + */ + cc_req->ChunkCount = cpu_to_le32(chunk_count); + chunks = 0; copy_bytes = 0; copy_bytes_left = umin(total_bytes_left, tcon->max_bytes_copy); From 3ec111d5e2ab8028022990faa3b9d5893478d001 Mon Sep 17 00:00:00 2001 From: Zilin Guan Date: Wed, 24 Dec 2025 14:20:16 +0000 Subject: [PATCH 0581/1321] ksmbd: Fix memory leak in get_file_all_info() In get_file_all_info(), if vfs_getattr() fails, the function returns immediately without freeing the allocated filename, leading to a memory leak. Fix this by freeing the filename before returning in this error case. Fixes: 5614c8c487f6a ("ksmbd: replace generic_fillattr with vfs_getattr") Signed-off-by: Zilin Guan Acked-by: Namjae Jeon Signed-off-by: Steve French --- fs/smb/server/smb2pdu.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/fs/smb/server/smb2pdu.c b/fs/smb/server/smb2pdu.c index 469b70757dba6..a607e072a3701 100644 --- a/fs/smb/server/smb2pdu.c +++ b/fs/smb/server/smb2pdu.c @@ -4923,8 +4923,10 @@ static int get_file_all_info(struct ksmbd_work *work, ret = vfs_getattr(&fp->filp->f_path, &stat, STATX_BASIC_STATS, AT_STATX_SYNC_AS_STAT); - if (ret) + if (ret) { + kfree(filename); return ret; + } ksmbd_debug(SMB, "filename = %s\n", filename); delete_pending = ksmbd_inode_pending_delete(fp); From 924e83d17ae4ff3da42f7b46f49f2c8d386d40eb Mon Sep 17 00:00:00 2001 From: ZhangGuoDong Date: Sun, 28 Dec 2025 22:51:01 +0800 Subject: [PATCH 0582/1321] smb/server: call ksmbd_session_rpc_close() on error path in create_smb2_pipe() When ksmbd_iov_pin_rsp() fails, we should call ksmbd_session_rpc_close(). Signed-off-by: ZhangGuoDong Signed-off-by: ChenXiaoSong Acked-by: Namjae Jeon Signed-off-by: Steve French --- fs/smb/server/smb2pdu.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/fs/smb/server/smb2pdu.c b/fs/smb/server/smb2pdu.c index a607e072a3701..8a7c48adb87e6 100644 --- a/fs/smb/server/smb2pdu.c +++ b/fs/smb/server/smb2pdu.c @@ -2281,7 +2281,7 @@ static noinline int create_smb2_pipe(struct ksmbd_work *work) { struct smb2_create_rsp *rsp; struct smb2_create_req *req; - int id; + int id = -1; int err; char *name; @@ -2338,6 +2338,9 @@ static noinline int create_smb2_pipe(struct ksmbd_work *work) break; } + if (id >= 0) + ksmbd_session_rpc_close(work->sess, id); + if (!IS_ERR(name)) kfree(name); From 142d9c22cb24fa082603f68f26f9a18f6d56efa1 Mon Sep 17 00:00:00 2001 From: ZhangGuoDong Date: Mon, 29 Dec 2025 10:13:29 +0800 Subject: [PATCH 0583/1321] smb/server: fix refcount leak in parse_durable_handle_context() When the command is a replay operation and -ENOEXEC is returned, the refcount of ksmbd_file must be released. Signed-off-by: ZhangGuoDong Signed-off-by: ChenXiaoSong Acked-by: Namjae Jeon Signed-off-by: Steve French --- fs/smb/server/smb2pdu.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/smb/server/smb2pdu.c b/fs/smb/server/smb2pdu.c index 8a7c48adb87e6..ec9e4cd24c4cb 100644 --- a/fs/smb/server/smb2pdu.c +++ b/fs/smb/server/smb2pdu.c @@ -2812,6 +2812,7 @@ static int parse_durable_handle_context(struct ksmbd_work *work, SMB2_CLIENT_GUID_SIZE)) { if (!(req->hdr.Flags & SMB2_FLAGS_REPLAY_OPERATION)) { err = -ENOEXEC; + ksmbd_put_durable_fd(dh_info->fp); goto out; } From 9d00bc03a174960558a690593a2d0029502ee1fb Mon Sep 17 00:00:00 2001 From: ZhangGuoDong Date: Mon, 29 Dec 2025 11:15:18 +0800 Subject: [PATCH 0584/1321] smb/server: fix refcount leak in smb2_open() When ksmbd_vfs_getattr() fails, the reference count of ksmbd_file must be released. Suggested-by: Namjae Jeon Signed-off-by: ZhangGuoDong Signed-off-by: ChenXiaoSong Acked-by: Namjae Jeon Signed-off-by: Steve French --- fs/smb/server/smb2pdu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/smb/server/smb2pdu.c b/fs/smb/server/smb2pdu.c index ec9e4cd24c4cb..2fcd0d4d1fb0d 100644 --- a/fs/smb/server/smb2pdu.c +++ b/fs/smb/server/smb2pdu.c @@ -3010,10 +3010,10 @@ int smb2_open(struct ksmbd_work *work) file_info = FILE_OPENED; rc = ksmbd_vfs_getattr(&fp->filp->f_path, &stat); + ksmbd_put_durable_fd(fp); if (rc) goto err_out2; - ksmbd_put_durable_fd(fp); goto reconnected_fp; } } else if (req_op_level == SMB2_OPLOCK_LEVEL_LEASE) From be0894d36fa94aaba46e81261298e4a094d5c177 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Thomas=20Hellstr=C3=B6m?= Date: Fri, 19 Dec 2025 12:32:57 +0100 Subject: [PATCH 0585/1321] drm/xe/svm: Fix a debug printout MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Avoid spamming the log with drm_info(). Use drm_dbg() instead. Fixes: cc795e041034 ("drm/xe/svm: Make xe_svm_range_needs_migrate_to_vram() public") Cc: Matthew Brost Cc: Himal Prasad Ghimiray Cc: # v6.17+ Signed-off-by: Thomas Hellström Reviewed-by: Himal Prasad Ghimiray Link: https://patch.msgid.link/20251219113320.183860-2-thomas.hellstrom@linux.intel.com (cherry picked from commit 72aee5f70ba47b939345a0d3414b51b0639c5b88) Signed-off-by: Thomas Hellström --- drivers/gpu/drm/xe/xe_svm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c index 55c5a0eb82e12..894e8f092e3f7 100644 --- a/drivers/gpu/drm/xe/xe_svm.c +++ b/drivers/gpu/drm/xe/xe_svm.c @@ -941,7 +941,7 @@ bool xe_svm_range_needs_migrate_to_vram(struct xe_svm_range *range, struct xe_vm xe_assert(vm->xe, IS_DGFX(vm->xe)); if (xe_svm_range_in_vram(range)) { - drm_info(&vm->xe->drm, "Range is already in VRAM\n"); + drm_dbg(&vm->xe->drm, "Range is already in VRAM\n"); return false; } From a3772afe8143a6f14b36781848c79c19817e843d Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Thomas=20Hellstr=C3=B6m?= Date: Fri, 19 Dec 2025 12:32:59 +0100 Subject: [PATCH 0586/1321] drm/pagemap, drm/xe: Ensure that the devmem allocation is idle before use MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit In situations where no system memory is migrated to devmem, and in upcoming patches where another GPU is performing the migration to the newly allocated devmem buffer, there is nothing to ensure any ongoing clear to the devmem allocation or async eviction from the devmem allocation is complete. Address that by passing a struct dma_fence down to the copy functions, and ensure it is waited for before migration is marked complete. v3: - New patch. v4: - Update the logic used for determining when to wait for the pre_migrate_fence. - Update the logic used for determining when to warn for the pre_migrate_fence since the scheduler fences apparently can signal out-of-order. v5: - Fix a UAF (CI) - Remove references to source P2P migration (Himal) - Put the pre_migrate_fence after migration. v6: - Pipeline the pre_migrate_fence dependency (Matt Brost) Fixes: c5b3eb5a906c ("drm/xe: Add GPUSVM device memory copy vfunc functions") Cc: Matthew Brost Cc: # v6.15+ Signed-off-by: Thomas Hellström Reviewed-by: Matthew Brost Acked-by: Maarten Lankhorst # For merging through drm-xe. Link: https://patch.msgid.link/20251219113320.183860-4-thomas.hellstrom@linux.intel.com (cherry picked from commit 16b5ad31952476fb925c401897fc171cd37f536b) Signed-off-by: Thomas Hellström --- drivers/gpu/drm/drm_pagemap.c | 17 +++++++++--- drivers/gpu/drm/xe/xe_migrate.c | 25 +++++++++++++---- drivers/gpu/drm/xe/xe_migrate.h | 6 ++-- drivers/gpu/drm/xe/xe_svm.c | 49 +++++++++++++++++++++++++-------- include/drm/drm_pagemap.h | 17 ++++++++++-- 5 files changed, 88 insertions(+), 26 deletions(-) diff --git a/drivers/gpu/drm/drm_pagemap.c b/drivers/gpu/drm/drm_pagemap.c index 37d7cfbbb3e8a..06c1bd8fc4d17 100644 --- a/drivers/gpu/drm/drm_pagemap.c +++ b/drivers/gpu/drm/drm_pagemap.c @@ -3,6 +3,7 @@ * Copyright © 2024-2025 Intel Corporation */ +#include #include #include #include @@ -408,10 +409,14 @@ int drm_pagemap_migrate_to_devmem(struct drm_pagemap_devmem *devmem_allocation, drm_pagemap_get_devmem_page(page, zdd); } - err = ops->copy_to_devmem(pages, pagemap_addr, npages); + err = ops->copy_to_devmem(pages, pagemap_addr, npages, + devmem_allocation->pre_migrate_fence); if (err) goto err_finalize; + dma_fence_put(devmem_allocation->pre_migrate_fence); + devmem_allocation->pre_migrate_fence = NULL; + /* Upon success bind devmem allocation to range and zdd */ devmem_allocation->timeslice_expiration = get_jiffies_64() + msecs_to_jiffies(timeslice_ms); @@ -596,7 +601,7 @@ int drm_pagemap_evict_to_ram(struct drm_pagemap_devmem *devmem_allocation) for (i = 0; i < npages; ++i) pages[i] = migrate_pfn_to_page(src[i]); - err = ops->copy_to_ram(pages, pagemap_addr, npages); + err = ops->copy_to_ram(pages, pagemap_addr, npages, NULL); if (err) goto err_finalize; @@ -732,7 +737,7 @@ static int __drm_pagemap_migrate_to_ram(struct vm_area_struct *vas, for (i = 0; i < npages; ++i) pages[i] = migrate_pfn_to_page(migrate.src[i]); - err = ops->copy_to_ram(pages, pagemap_addr, npages); + err = ops->copy_to_ram(pages, pagemap_addr, npages, NULL); if (err) goto err_finalize; @@ -813,11 +818,14 @@ EXPORT_SYMBOL_GPL(drm_pagemap_pagemap_ops_get); * @ops: Pointer to the operations structure for GPU SVM device memory * @dpagemap: The struct drm_pagemap we're allocating from. * @size: Size of device memory allocation + * @pre_migrate_fence: Fence to wait for or pipeline behind before migration starts. + * (May be NULL). */ void drm_pagemap_devmem_init(struct drm_pagemap_devmem *devmem_allocation, struct device *dev, struct mm_struct *mm, const struct drm_pagemap_devmem_ops *ops, - struct drm_pagemap *dpagemap, size_t size) + struct drm_pagemap *dpagemap, size_t size, + struct dma_fence *pre_migrate_fence) { init_completion(&devmem_allocation->detached); devmem_allocation->dev = dev; @@ -825,6 +833,7 @@ void drm_pagemap_devmem_init(struct drm_pagemap_devmem *devmem_allocation, devmem_allocation->ops = ops; devmem_allocation->dpagemap = dpagemap; devmem_allocation->size = size; + devmem_allocation->pre_migrate_fence = pre_migrate_fence; } EXPORT_SYMBOL_GPL(drm_pagemap_devmem_init); diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c index 2184af413b912..5a95b08a4723a 100644 --- a/drivers/gpu/drm/xe/xe_migrate.c +++ b/drivers/gpu/drm/xe/xe_migrate.c @@ -2062,6 +2062,7 @@ static struct dma_fence *xe_migrate_vram(struct xe_migrate *m, unsigned long sram_offset, struct drm_pagemap_addr *sram_addr, u64 vram_addr, + struct dma_fence *deps, const enum xe_migrate_copy_dir dir) { struct xe_gt *gt = m->tile->primary_gt; @@ -2150,6 +2151,14 @@ static struct dma_fence *xe_migrate_vram(struct xe_migrate *m, xe_sched_job_add_migrate_flush(job, MI_INVALIDATE_TLB); + if (deps && !dma_fence_is_signaled(deps)) { + dma_fence_get(deps); + err = drm_sched_job_add_dependency(&job->drm, deps); + if (err) + dma_fence_wait(deps, false); + err = 0; + } + mutex_lock(&m->job_mutex); xe_sched_job_arm(job); fence = dma_fence_get(&job->drm.s_fence->finished); @@ -2175,6 +2184,8 @@ static struct dma_fence *xe_migrate_vram(struct xe_migrate *m, * @npages: Number of pages to migrate. * @src_addr: Array of DMA information (source of migrate) * @dst_addr: Device physical address of VRAM (destination of migrate) + * @deps: struct dma_fence representing the dependencies that need + * to be signaled before migration. * * Copy from an array dma addresses to a VRAM device physical address * @@ -2184,10 +2195,11 @@ static struct dma_fence *xe_migrate_vram(struct xe_migrate *m, struct dma_fence *xe_migrate_to_vram(struct xe_migrate *m, unsigned long npages, struct drm_pagemap_addr *src_addr, - u64 dst_addr) + u64 dst_addr, + struct dma_fence *deps) { return xe_migrate_vram(m, npages * PAGE_SIZE, 0, src_addr, dst_addr, - XE_MIGRATE_COPY_TO_VRAM); + deps, XE_MIGRATE_COPY_TO_VRAM); } /** @@ -2196,6 +2208,8 @@ struct dma_fence *xe_migrate_to_vram(struct xe_migrate *m, * @npages: Number of pages to migrate. * @src_addr: Device physical address of VRAM (source of migrate) * @dst_addr: Array of DMA information (destination of migrate) + * @deps: struct dma_fence representing the dependencies that need + * to be signaled before migration. * * Copy from a VRAM device physical address to an array dma addresses * @@ -2205,10 +2219,11 @@ struct dma_fence *xe_migrate_to_vram(struct xe_migrate *m, struct dma_fence *xe_migrate_from_vram(struct xe_migrate *m, unsigned long npages, u64 src_addr, - struct drm_pagemap_addr *dst_addr) + struct drm_pagemap_addr *dst_addr, + struct dma_fence *deps) { return xe_migrate_vram(m, npages * PAGE_SIZE, 0, dst_addr, src_addr, - XE_MIGRATE_COPY_TO_SRAM); + deps, XE_MIGRATE_COPY_TO_SRAM); } static void xe_migrate_dma_unmap(struct xe_device *xe, @@ -2384,7 +2399,7 @@ int xe_migrate_access_memory(struct xe_migrate *m, struct xe_bo *bo, __fence = xe_migrate_vram(m, current_bytes, (unsigned long)buf & ~PAGE_MASK, &pagemap_addr[current_page], - vram_addr, write ? + vram_addr, NULL, write ? XE_MIGRATE_COPY_TO_VRAM : XE_MIGRATE_COPY_TO_SRAM); if (IS_ERR(__fence)) { diff --git a/drivers/gpu/drm/xe/xe_migrate.h b/drivers/gpu/drm/xe/xe_migrate.h index 260e298e5dd7f..b76441f062b4f 100644 --- a/drivers/gpu/drm/xe/xe_migrate.h +++ b/drivers/gpu/drm/xe/xe_migrate.h @@ -116,12 +116,14 @@ int xe_migrate_init(struct xe_migrate *m); struct dma_fence *xe_migrate_to_vram(struct xe_migrate *m, unsigned long npages, struct drm_pagemap_addr *src_addr, - u64 dst_addr); + u64 dst_addr, + struct dma_fence *deps); struct dma_fence *xe_migrate_from_vram(struct xe_migrate *m, unsigned long npages, u64 src_addr, - struct drm_pagemap_addr *dst_addr); + struct drm_pagemap_addr *dst_addr, + struct dma_fence *deps); struct dma_fence *xe_migrate_copy(struct xe_migrate *m, struct xe_bo *src_bo, diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c index 894e8f092e3f7..f97e0af6a9b01 100644 --- a/drivers/gpu/drm/xe/xe_svm.c +++ b/drivers/gpu/drm/xe/xe_svm.c @@ -476,7 +476,8 @@ static void xe_svm_copy_us_stats_incr(struct xe_gt *gt, static int xe_svm_copy(struct page **pages, struct drm_pagemap_addr *pagemap_addr, - unsigned long npages, const enum xe_svm_copy_dir dir) + unsigned long npages, const enum xe_svm_copy_dir dir, + struct dma_fence *pre_migrate_fence) { struct xe_vram_region *vr = NULL; struct xe_gt *gt = NULL; @@ -565,7 +566,8 @@ static int xe_svm_copy(struct page **pages, __fence = xe_migrate_from_vram(vr->migrate, i - pos + incr, vram_addr, - &pagemap_addr[pos]); + &pagemap_addr[pos], + pre_migrate_fence); } else { vm_dbg(&xe->drm, "COPY TO VRAM - 0x%016llx -> 0x%016llx, NPAGES=%ld", @@ -574,13 +576,14 @@ static int xe_svm_copy(struct page **pages, __fence = xe_migrate_to_vram(vr->migrate, i - pos + incr, &pagemap_addr[pos], - vram_addr); + vram_addr, + pre_migrate_fence); } if (IS_ERR(__fence)) { err = PTR_ERR(__fence); goto err_out; } - + pre_migrate_fence = NULL; dma_fence_put(fence); fence = __fence; } @@ -603,20 +606,22 @@ static int xe_svm_copy(struct page **pages, vram_addr, (u64)pagemap_addr[pos].addr, 1); __fence = xe_migrate_from_vram(vr->migrate, 1, vram_addr, - &pagemap_addr[pos]); + &pagemap_addr[pos], + pre_migrate_fence); } else { vm_dbg(&xe->drm, "COPY TO VRAM - 0x%016llx -> 0x%016llx, NPAGES=%d", (u64)pagemap_addr[pos].addr, vram_addr, 1); __fence = xe_migrate_to_vram(vr->migrate, 1, &pagemap_addr[pos], - vram_addr); + vram_addr, + pre_migrate_fence); } if (IS_ERR(__fence)) { err = PTR_ERR(__fence); goto err_out; } - + pre_migrate_fence = NULL; dma_fence_put(fence); fence = __fence; } @@ -629,6 +634,8 @@ static int xe_svm_copy(struct page **pages, dma_fence_wait(fence, false); dma_fence_put(fence); } + if (pre_migrate_fence) + dma_fence_wait(pre_migrate_fence, false); /* * XXX: We can't derive the GT here (or anywhere in this functions, but @@ -645,16 +652,20 @@ static int xe_svm_copy(struct page **pages, static int xe_svm_copy_to_devmem(struct page **pages, struct drm_pagemap_addr *pagemap_addr, - unsigned long npages) + unsigned long npages, + struct dma_fence *pre_migrate_fence) { - return xe_svm_copy(pages, pagemap_addr, npages, XE_SVM_COPY_TO_VRAM); + return xe_svm_copy(pages, pagemap_addr, npages, XE_SVM_COPY_TO_VRAM, + pre_migrate_fence); } static int xe_svm_copy_to_ram(struct page **pages, struct drm_pagemap_addr *pagemap_addr, - unsigned long npages) + unsigned long npages, + struct dma_fence *pre_migrate_fence) { - return xe_svm_copy(pages, pagemap_addr, npages, XE_SVM_COPY_TO_SRAM); + return xe_svm_copy(pages, pagemap_addr, npages, XE_SVM_COPY_TO_SRAM, + pre_migrate_fence); } static struct xe_bo *to_xe_bo(struct drm_pagemap_devmem *devmem_allocation) @@ -667,6 +678,7 @@ static void xe_svm_devmem_release(struct drm_pagemap_devmem *devmem_allocation) struct xe_bo *bo = to_xe_bo(devmem_allocation); struct xe_device *xe = xe_bo_device(bo); + dma_fence_put(devmem_allocation->pre_migrate_fence); xe_bo_put_async(bo); xe_pm_runtime_put(xe); } @@ -861,6 +873,7 @@ static int xe_drm_pagemap_populate_mm(struct drm_pagemap *dpagemap, unsigned long timeslice_ms) { struct xe_vram_region *vr = container_of(dpagemap, typeof(*vr), dpagemap); + struct dma_fence *pre_migrate_fence = NULL; struct xe_device *xe = vr->xe; struct device *dev = xe->drm.dev; struct drm_buddy_block *block; @@ -887,8 +900,20 @@ static int xe_drm_pagemap_populate_mm(struct drm_pagemap *dpagemap, break; } + /* Ensure that any clearing or async eviction will complete before migration. */ + if (!dma_resv_test_signaled(bo->ttm.base.resv, DMA_RESV_USAGE_KERNEL)) { + err = dma_resv_get_singleton(bo->ttm.base.resv, DMA_RESV_USAGE_KERNEL, + &pre_migrate_fence); + if (err) + dma_resv_wait_timeout(bo->ttm.base.resv, DMA_RESV_USAGE_KERNEL, + false, MAX_SCHEDULE_TIMEOUT); + else if (pre_migrate_fence) + dma_fence_enable_sw_signaling(pre_migrate_fence); + } + drm_pagemap_devmem_init(&bo->devmem_allocation, dev, mm, - &dpagemap_devmem_ops, dpagemap, end - start); + &dpagemap_devmem_ops, dpagemap, end - start, + pre_migrate_fence); blocks = &to_xe_ttm_vram_mgr_resource(bo->ttm.resource)->blocks; list_for_each_entry(block, blocks, link) diff --git a/include/drm/drm_pagemap.h b/include/drm/drm_pagemap.h index f6e7e234c0892..70a7991f784f9 100644 --- a/include/drm/drm_pagemap.h +++ b/include/drm/drm_pagemap.h @@ -8,6 +8,7 @@ #define NR_PAGES(order) (1U << (order)) +struct dma_fence; struct drm_pagemap; struct drm_pagemap_zdd; struct device; @@ -174,6 +175,8 @@ struct drm_pagemap_devmem_ops { * @pages: Pointer to array of device memory pages (destination) * @pagemap_addr: Pointer to array of DMA information (source) * @npages: Number of pages to copy + * @pre_migrate_fence: dma-fence to wait for before migration start. + * May be NULL. * * Copy pages to device memory. If the order of a @pagemap_addr entry * is greater than 0, the entry is populated but subsequent entries @@ -183,13 +186,16 @@ struct drm_pagemap_devmem_ops { */ int (*copy_to_devmem)(struct page **pages, struct drm_pagemap_addr *pagemap_addr, - unsigned long npages); + unsigned long npages, + struct dma_fence *pre_migrate_fence); /** * @copy_to_ram: Copy to system RAM (required for migration) * @pages: Pointer to array of device memory pages (source) * @pagemap_addr: Pointer to array of DMA information (destination) * @npages: Number of pages to copy + * @pre_migrate_fence: dma-fence to wait for before migration start. + * May be NULL. * * Copy pages to system RAM. If the order of a @pagemap_addr entry * is greater than 0, the entry is populated but subsequent entries @@ -199,7 +205,8 @@ struct drm_pagemap_devmem_ops { */ int (*copy_to_ram)(struct page **pages, struct drm_pagemap_addr *pagemap_addr, - unsigned long npages); + unsigned long npages, + struct dma_fence *pre_migrate_fence); }; /** @@ -212,6 +219,8 @@ struct drm_pagemap_devmem_ops { * @dpagemap: The struct drm_pagemap of the pages this allocation belongs to. * @size: Size of device memory allocation * @timeslice_expiration: Timeslice expiration in jiffies + * @pre_migrate_fence: Fence to wait for or pipeline behind before migration starts. + * (May be NULL). */ struct drm_pagemap_devmem { struct device *dev; @@ -221,6 +230,7 @@ struct drm_pagemap_devmem { struct drm_pagemap *dpagemap; size_t size; u64 timeslice_expiration; + struct dma_fence *pre_migrate_fence; }; int drm_pagemap_migrate_to_devmem(struct drm_pagemap_devmem *devmem_allocation, @@ -238,7 +248,8 @@ struct drm_pagemap *drm_pagemap_page_to_dpagemap(struct page *page); void drm_pagemap_devmem_init(struct drm_pagemap_devmem *devmem_allocation, struct device *dev, struct mm_struct *mm, const struct drm_pagemap_devmem_ops *ops, - struct drm_pagemap *dpagemap, size_t size); + struct drm_pagemap *dpagemap, size_t size, + struct dma_fence *pre_migrate_fence); int drm_pagemap_populate_mm(struct drm_pagemap *dpagemap, unsigned long start, unsigned long end, From 563087428cbae67deb02e7b6a23fd02db0ac28d6 Mon Sep 17 00:00:00 2001 From: Jonathan Cavitt Date: Mon, 22 Dec 2025 20:19:59 +0000 Subject: [PATCH 0587/1321] drm/xe/guc: READ/WRITE_ONCE g2h_fence->done MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Use READ_ONCE and WRITE_ONCE when operating on g2h_fence->done to prevent the compiler from ignoring important modifications to its value. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Suggested-by: Matthew Brost Signed-off-by: Jonathan Cavitt Cc: Rodrigo Vivi Reviewed-by: Matthew Brost Link: https://patch.msgid.link/20251222201957.63245-5-jonathan.cavitt@intel.com Signed-off-by: Rodrigo Vivi (cherry picked from commit b5179dbd1c14743ae80f0aaa28eaaf35c361608f) Signed-off-by: Thomas Hellström --- drivers/gpu/drm/xe/xe_guc_ct.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_guc_ct.c b/drivers/gpu/drm/xe/xe_guc_ct.c index 4ac434ad216f9..a5019d1e741b3 100644 --- a/drivers/gpu/drm/xe/xe_guc_ct.c +++ b/drivers/gpu/drm/xe/xe_guc_ct.c @@ -104,7 +104,9 @@ static void g2h_fence_cancel(struct g2h_fence *g2h_fence) { g2h_fence->cancel = true; g2h_fence->fail = true; - g2h_fence->done = true; + + /* WRITE_ONCE pairs with READ_ONCEs in guc_ct_send_recv. */ + WRITE_ONCE(g2h_fence->done, true); } static bool g2h_fence_needs_alloc(struct g2h_fence *g2h_fence) @@ -1203,10 +1205,13 @@ static int guc_ct_send_recv(struct xe_guc_ct *ct, const u32 *action, u32 len, return ret; } - ret = wait_event_timeout(ct->g2h_fence_wq, g2h_fence.done, HZ); + /* READ_ONCEs pairs with WRITE_ONCEs in parse_g2h_response + * and g2h_fence_cancel. + */ + ret = wait_event_timeout(ct->g2h_fence_wq, READ_ONCE(g2h_fence.done), HZ); if (!ret) { LNL_FLUSH_WORK(&ct->g2h_worker); - if (g2h_fence.done) { + if (READ_ONCE(g2h_fence.done)) { xe_gt_warn(gt, "G2H fence %u, action %04x, done\n", g2h_fence.seqno, action[0]); ret = 1; @@ -1454,7 +1459,8 @@ static int parse_g2h_response(struct xe_guc_ct *ct, u32 *msg, u32 len) g2h_release_space(ct, GUC_CTB_HXG_MSG_MAX_LEN); - g2h_fence->done = true; + /* WRITE_ONCE pairs with READ_ONCEs in guc_ct_send_recv. */ + WRITE_ONCE(g2h_fence->done, true); smp_mb(); wake_up_all(&ct->g2h_fence_wq); From 1d7e4da15f296f06ed326a9b60f4a87b1f5b73c7 Mon Sep 17 00:00:00 2001 From: Alessio Belle Date: Mon, 8 Dec 2025 09:11:00 +0000 Subject: [PATCH 0588/1321] drm/imagination: Disallow exporting of PM/FW protected objects These objects are meant to be used by the GPU firmware or by the PM unit within the GPU, in which case they may contain physical addresses. This adds a layer of protection against exposing potentially exploitable information outside of the driver. Fixes: ff5f643de0bf ("drm/imagination: Add GEM and VM related code") Signed-off-by: Alessio Belle Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251208-no-export-pm-fw-obj-v1-1-83ab12c61693@imgtec.com Signed-off-by: Matt Coster --- drivers/gpu/drm/imagination/pvr_gem.c | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/drivers/gpu/drm/imagination/pvr_gem.c b/drivers/gpu/drm/imagination/pvr_gem.c index a66cf082af244..c07c9a9151903 100644 --- a/drivers/gpu/drm/imagination/pvr_gem.c +++ b/drivers/gpu/drm/imagination/pvr_gem.c @@ -28,6 +28,16 @@ static void pvr_gem_object_free(struct drm_gem_object *obj) drm_gem_shmem_object_free(obj); } +static struct dma_buf *pvr_gem_export(struct drm_gem_object *obj, int flags) +{ + struct pvr_gem_object *pvr_obj = gem_to_pvr_gem(obj); + + if (pvr_obj->flags & DRM_PVR_BO_PM_FW_PROTECT) + return ERR_PTR(-EPERM); + + return drm_gem_prime_export(obj, flags); +} + static int pvr_gem_mmap(struct drm_gem_object *gem_obj, struct vm_area_struct *vma) { struct pvr_gem_object *pvr_obj = gem_to_pvr_gem(gem_obj); @@ -42,6 +52,7 @@ static int pvr_gem_mmap(struct drm_gem_object *gem_obj, struct vm_area_struct *v static const struct drm_gem_object_funcs pvr_gem_object_funcs = { .free = pvr_gem_object_free, .print_info = drm_gem_shmem_object_print_info, + .export = pvr_gem_export, .pin = drm_gem_shmem_object_pin, .unpin = drm_gem_shmem_object_unpin, .get_sg_table = drm_gem_shmem_object_get_sg_table, From 226e60ee4bd22fefbbe6ff81a9e9c0a1def29762 Mon Sep 17 00:00:00 2001 From: Lyude Paul Date: Thu, 11 Dec 2025 14:02:54 -0500 Subject: [PATCH 0589/1321] drm/nouveau/dispnv50: Don't call drm_atomic_get_crtc_state() in prepare_fb Since we recently started warning about uses of this function after the atomic check phase completes, we've started getting warnings about this in nouveau. It appears a misplaced drm_atomic_get_crtc_state() call has been hiding in our .prepare_fb callback for a while. So, fix this by adding a new nv50_head_atom_get_new() function and use that in our .prepare_fb callback instead. Signed-off-by: Lyude Paul Reviewed-by: Dave Airlie Fixes: 1590700d94ac ("drm/nouveau/kms/nv50-: split each resource type into their own source files") Cc: # v4.18+ Link: https://patch.msgid.link/20251211190256.396742-1-lyude@redhat.com --- drivers/gpu/drm/nouveau/dispnv50/atom.h | 13 +++++++++++++ drivers/gpu/drm/nouveau/dispnv50/wndw.c | 2 +- 2 files changed, 14 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/nouveau/dispnv50/atom.h b/drivers/gpu/drm/nouveau/dispnv50/atom.h index 93f8f4f645784..b43c4f9bbcdf5 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/atom.h +++ b/drivers/gpu/drm/nouveau/dispnv50/atom.h @@ -152,8 +152,21 @@ static inline struct nv50_head_atom * nv50_head_atom_get(struct drm_atomic_state *state, struct drm_crtc *crtc) { struct drm_crtc_state *statec = drm_atomic_get_crtc_state(state, crtc); + if (IS_ERR(statec)) return (void *)statec; + + return nv50_head_atom(statec); +} + +static inline struct nv50_head_atom * +nv50_head_atom_get_new(struct drm_atomic_state *state, struct drm_crtc *crtc) +{ + struct drm_crtc_state *statec = drm_atomic_get_new_crtc_state(state, crtc); + + if (!statec) + return NULL; + return nv50_head_atom(statec); } diff --git a/drivers/gpu/drm/nouveau/dispnv50/wndw.c b/drivers/gpu/drm/nouveau/dispnv50/wndw.c index ef9e410babbfb..9a2c20fce0f3e 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/wndw.c +++ b/drivers/gpu/drm/nouveau/dispnv50/wndw.c @@ -583,7 +583,7 @@ nv50_wndw_prepare_fb(struct drm_plane *plane, struct drm_plane_state *state) asyw->image.offset[0] = nvbo->offset; if (wndw->func->prepare) { - asyh = nv50_head_atom_get(asyw->state.state, asyw->state.crtc); + asyh = nv50_head_atom_get_new(asyw->state.state, asyw->state.crtc); if (IS_ERR(asyh)) return PTR_ERR(asyh); From 8b797bfac306944047fa064094ffa6cbb6d73b9a Mon Sep 17 00:00:00 2001 From: Thomas Zimmermann Date: Tue, 9 Dec 2025 14:41:58 +0100 Subject: [PATCH 0590/1321] drm/gem-shmem: Fix typos in documentation Fix the compile-time warnings Warning: drm_gem_shmem_helper.c:104 function parameter 'shmem' not described in 'drm_gem_shmem_init' Warning: drm_gem_shmem_helper.c:104 function parameter 'size' not described in 'drm_gem_shmem_init' Signed-off-by: Thomas Zimmermann Reviewed-by: Boris Brezillon Fixes: e3f4bdaf2c5b ("drm/gem/shmem: Extract drm_gem_shmem_init() from drm_gem_shmem_create()") Link: https://patch.msgid.link/20251209140141.94407-2-tzimmermann@suse.de --- drivers/gpu/drm/drm_gem_shmem_helper.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c index 93b9cff89080f..9cd52f8930fa2 100644 --- a/drivers/gpu/drm/drm_gem_shmem_helper.c +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c @@ -96,7 +96,8 @@ static int __drm_gem_shmem_init(struct drm_device *dev, struct drm_gem_shmem_obj /** * drm_gem_shmem_init - Initialize an allocated object. * @dev: DRM device - * @obj: The allocated shmem GEM object. + * @shmem: The allocated shmem GEM object. + * @size: Buffer size in bytes * * Returns: * 0 on success, or a negative error code on failure. From 97d87aa056c8813efd3951798764dee7960ae197 Mon Sep 17 00:00:00 2001 From: Thomas Zimmermann Date: Tue, 9 Dec 2025 14:41:59 +0100 Subject: [PATCH 0591/1321] drm/gem-shmem: Fix the MODULE_LICENSE() string Replace the bogus "GPL v2" with "GPL" as MODULE_LICNSE() string. The value does not declare the module's exact license, but only lets the module loader test whether the module is Free Software or not. See commit bf7fbeeae6db ("module: Cure the MODULE_LICENSE "GPL" vs. "GPL v2" bogosity") in the details of the issue. The fix is to use "GPL" for all modules under any variant of the GPL. Signed-off-by: Thomas Zimmermann Reviewed-by: Boris Brezillon Fixes: 4b2b5e142ff4 ("drm: Move GEM memory managers into modules") Link: https://patch.msgid.link/20251209140141.94407-3-tzimmermann@suse.de --- drivers/gpu/drm/drm_gem_shmem_helper.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c index 9cd52f8930fa2..f13eb5f36e8a9 100644 --- a/drivers/gpu/drm/drm_gem_shmem_helper.c +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c @@ -896,4 +896,4 @@ EXPORT_SYMBOL_GPL(drm_gem_shmem_prime_import_no_map); MODULE_DESCRIPTION("DRM SHMEM memory-management helpers"); MODULE_IMPORT_NS("DMA_BUF"); -MODULE_LICENSE("GPL v2"); +MODULE_LICENSE("GPL"); From 4a12c787db6121c8c5b2b37dd65101b75b823f43 Mon Sep 17 00:00:00 2001 From: Krzysztof Niemiec Date: Tue, 16 Dec 2025 19:09:01 +0100 Subject: [PATCH 0592/1321] drm/i915/gem: Zero-initialize the eb.vma array in i915_gem_do_execbuffer Initialize the eb.vma array with values of 0 when the eb structure is first set up. In particular, this sets the eb->vma[i].vma pointers to NULL, simplifying cleanup and getting rid of the bug described below. During the execution of eb_lookup_vmas(), the eb->vma array is successively filled up with struct eb_vma objects. This process includes calling eb_add_vma(), which might fail; however, even in the event of failure, eb->vma[i].vma is set for the currently processed buffer. If eb_add_vma() fails, eb_lookup_vmas() returns with an error, which prompts a call to eb_release_vmas() to clean up the mess. Since eb_lookup_vmas() might fail during processing any (possibly not first) buffer, eb_release_vmas() checks whether a buffer's vma is NULL to know at what point did the lookup function fail. In eb_lookup_vmas(), eb->vma[i].vma is set to NULL if either the helper function eb_lookup_vma() or eb_validate_vma() fails. eb->vma[i+1].vma is set to NULL in case i915_gem_object_userptr_submit_init() fails; the current one needs to be cleaned up by eb_release_vmas() at this point, so the next one is set. If eb_add_vma() fails, neither the current nor the next vma is set to NULL, which is a source of a NULL deref bug described in the issue linked in the Closes tag. When entering eb_lookup_vmas(), the vma pointers are set to the slab poison value, instead of NULL. This doesn't matter for the actual lookup, since it gets overwritten anyway, however the eb_release_vmas() function only recognizes NULL as the stopping value, hence the pointers are being set to NULL as they go in case of intermediate failure. This patch changes the approach to filling them all with NULL at the start instead, rather than handling that manually during failure. Reported-by: Gangmin Kim Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/15062 Fixes: 544460c33821 ("drm/i915: Multi-BB execbuf") Cc: stable@vger.kernel.org # 5.16.x Signed-off-by: Krzysztof Niemiec Reviewed-by: Janusz Krzysztofik Reviewed-by: Krzysztof Karas Reviewed-by: Andi Shyti Signed-off-by: Andi Shyti Link: https://lore.kernel.org/r/20251216180900.54294-2-krzysztof.niemiec@intel.com (cherry picked from commit 08889b706d4f0b8d2352b7ca29c2d8df4d0787cd) Signed-off-by: Jani Nikula --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 37 +++++++++---------- 1 file changed, 17 insertions(+), 20 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index b057c2fa03a45..d49e96f9be510 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -951,13 +951,13 @@ static int eb_lookup_vmas(struct i915_execbuffer *eb) vma = eb_lookup_vma(eb, eb->exec[i].handle); if (IS_ERR(vma)) { err = PTR_ERR(vma); - goto err; + return err; } err = eb_validate_vma(eb, &eb->exec[i], vma); if (unlikely(err)) { i915_vma_put(vma); - goto err; + return err; } err = eb_add_vma(eb, ¤t_batch, i, vma); @@ -966,19 +966,8 @@ static int eb_lookup_vmas(struct i915_execbuffer *eb) if (i915_gem_object_is_userptr(vma->obj)) { err = i915_gem_object_userptr_submit_init(vma->obj); - if (err) { - if (i + 1 < eb->buffer_count) { - /* - * Execbuffer code expects last vma entry to be NULL, - * since we already initialized this entry, - * set the next value to NULL or we mess up - * cleanup handling. - */ - eb->vma[i + 1].vma = NULL; - } - + if (err) return err; - } eb->vma[i].flags |= __EXEC_OBJECT_USERPTR_INIT; eb->args->flags |= __EXEC_USERPTR_USED; @@ -986,10 +975,6 @@ static int eb_lookup_vmas(struct i915_execbuffer *eb) } return 0; - -err: - eb->vma[i].vma = NULL; - return err; } static int eb_lock_vmas(struct i915_execbuffer *eb) @@ -3375,7 +3360,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, eb.exec = exec; eb.vma = (struct eb_vma *)(exec + args->buffer_count + 1); - eb.vma[0].vma = NULL; + memset(eb.vma, 0, (args->buffer_count + 1) * sizeof(struct eb_vma)); + eb.batch_pool = NULL; eb.invalid_flags = __EXEC_OBJECT_UNKNOWN_FLAGS; @@ -3584,7 +3570,18 @@ i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data, if (err) return err; - /* Allocate extra slots for use by the command parser */ + /* + * Allocate extra slots for use by the command parser. + * + * Note that this allocation handles two different arrays (the + * exec2_list array, and the eventual eb.vma array introduced in + * i915_gem_do_execbuffer()), that reside in virtually contiguous + * memory. Also note that the allocation intentionally doesn't fill the + * area with zeros, because the exec2_list part doesn't need to be, as + * it's immediately overwritten by user data a few lines below. + * However, the eb.vma part is explicitly zeroed later in + * i915_gem_do_execbuffer(). + */ exec2_list = kvmalloc_array(count + 2, eb_element_size(), __GFP_NOWARN | GFP_KERNEL); if (exec2_list == NULL) { From 3e9ad2c1a3150abbe0a1a630e4b220efe7af8d5c Mon Sep 17 00:00:00 2001 From: Huacai Chen Date: Wed, 31 Dec 2025 15:19:10 +0800 Subject: [PATCH 0593/1321] LoongArch: Complete CPUCFG registers definition According to the "LoongArch Reference Manual Volume 1: Basic Architecture", begin with LA664 CPU core there are more features supported which are indicated in CPUCFG2 and CPUCFG3. This patch completes the definitions of them so as to match the architecture specification. Signed-off-by: Bibo Mao Signed-off-by: Huacai Chen --- arch/loongarch/include/asm/loongarch.h | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/arch/loongarch/include/asm/loongarch.h b/arch/loongarch/include/asm/loongarch.h index e6b8ff61c8cc6..553c4dc7a156e 100644 --- a/arch/loongarch/include/asm/loongarch.h +++ b/arch/loongarch/include/asm/loongarch.h @@ -94,6 +94,12 @@ #define CPUCFG2_LSPW BIT(21) #define CPUCFG2_LAM BIT(22) #define CPUCFG2_PTW BIT(24) +#define CPUCFG2_FRECIPE BIT(25) +#define CPUCFG2_DIV32 BIT(26) +#define CPUCFG2_LAM_BH BIT(27) +#define CPUCFG2_LAMCAS BIT(28) +#define CPUCFG2_LLACQ_SCREL BIT(29) +#define CPUCFG2_SCQ BIT(30) #define LOONGARCH_CPUCFG3 0x3 #define CPUCFG3_CCDMA BIT(0) @@ -108,6 +114,7 @@ #define CPUCFG3_SPW_HG_HF BIT(11) #define CPUCFG3_RVA BIT(12) #define CPUCFG3_RVAMAX GENMASK(16, 13) +#define CPUCFG3_DBAR_HINTS BIT(17) #define CPUCFG3_ALDORDER_CAP BIT(18) /* All address load ordered, capability */ #define CPUCFG3_ASTORDER_CAP BIT(19) /* All address store ordered, capability */ #define CPUCFG3_ALDORDER_STA BIT(20) /* All address load ordered, status */ From 591ba42e401a7892a454cf4ec22154cf42cd2b75 Mon Sep 17 00:00:00 2001 From: Huacai Chen Date: Wed, 31 Dec 2025 15:19:10 +0800 Subject: [PATCH 0594/1321] LoongArch: Set correct protection_map[] for VM_NONE/VM_SHARED For 32BIT platform _PAGE_PROTNONE is 0, so set a VMA to be VM_NONE or VM_SHARED will make pages non-present, then cause Oops with kernel page fault. Fix it by set correct protection_map[] for VM_NONE/VM_SHARED, replacing _PAGE_PROTNONE with _PAGE_PRESENT. Signed-off-by: Huacai Chen --- arch/loongarch/mm/cache.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/loongarch/mm/cache.c b/arch/loongarch/mm/cache.c index 6be04d36ca076..496916845ff76 100644 --- a/arch/loongarch/mm/cache.c +++ b/arch/loongarch/mm/cache.c @@ -160,8 +160,8 @@ void cpu_cache_init(void) static const pgprot_t protection_map[16] = { [VM_NONE] = __pgprot(_CACHE_CC | _PAGE_USER | - _PAGE_PROTNONE | _PAGE_NO_EXEC | - _PAGE_NO_READ), + _PAGE_NO_EXEC | _PAGE_NO_READ | + (_PAGE_PROTNONE ? : _PAGE_PRESENT)), [VM_READ] = __pgprot(_CACHE_CC | _PAGE_VALID | _PAGE_USER | _PAGE_PRESENT | _PAGE_NO_EXEC), @@ -180,8 +180,8 @@ static const pgprot_t protection_map[16] = { [VM_EXEC | VM_WRITE | VM_READ] = __pgprot(_CACHE_CC | _PAGE_VALID | _PAGE_USER | _PAGE_PRESENT), [VM_SHARED] = __pgprot(_CACHE_CC | _PAGE_USER | - _PAGE_PROTNONE | _PAGE_NO_EXEC | - _PAGE_NO_READ), + _PAGE_NO_EXEC | _PAGE_NO_READ | + (_PAGE_PROTNONE ? : _PAGE_PRESENT)), [VM_SHARED | VM_READ] = __pgprot(_CACHE_CC | _PAGE_VALID | _PAGE_USER | _PAGE_PRESENT | _PAGE_NO_EXEC), From 3685ee7e8f527b0552a0a753c2148b9d72eda21a Mon Sep 17 00:00:00 2001 From: Tiezhu Yang Date: Wed, 31 Dec 2025 15:19:10 +0800 Subject: [PATCH 0595/1321] LoongArch: Use UNWIND_HINT_END_OF_STACK for entry points kernel_entry() and smpboot_entry() are the last frames for ORC unwinder, so it is proper to use the annotation UNWIND_HINT_END_OF_STACK for them. Link: https://lore.kernel.org/lkml/ots6w2ntyudj5ucs5eowncta2vmfssatpcqwzpar3ekk577hxi@j45dd4dmwx6x/ Suggested-by: Josh Poimboeuf Signed-off-by: Tiezhu Yang Signed-off-by: Huacai Chen --- arch/loongarch/kernel/head.S | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/loongarch/kernel/head.S b/arch/loongarch/kernel/head.S index aba548db24460..ce7f6c04f4ab4 100644 --- a/arch/loongarch/kernel/head.S +++ b/arch/loongarch/kernel/head.S @@ -42,6 +42,7 @@ SYM_DATA(kernel_fsize, .long _kernel_fsize); .align 12 SYM_CODE_START(kernel_entry) # kernel entry point + UNWIND_HINT_END_OF_STACK SETUP_TWINS SETUP_MODES t0 @@ -113,6 +114,7 @@ SYM_CODE_END(kernel_entry) * function after setting up the stack and tp registers. */ SYM_CODE_START(smpboot_entry) + UNWIND_HINT_END_OF_STACK SETUP_TWINS SETUP_MODES t0 From 9d525a61872740a41c93b114627e93b3ddf2b9f7 Mon Sep 17 00:00:00 2001 From: Tiezhu Yang Date: Wed, 31 Dec 2025 15:19:10 +0800 Subject: [PATCH 0596/1321] LoongArch: Remove is_entry_func() and kernel_entry_end For now, the related code of is_entry_func() is useless, so they can be removed. Then the symbol kernel_entry_end is not used any more, so it can be removed too. Link: https://lore.kernel.org/lkml/kjiyla6qj3l7ezspitulrdoc5laj2e6hoecvd254hssnpddczm@g6nkaombh6va/ Suggested-by: Josh Poimboeuf Signed-off-by: Tiezhu Yang --- arch/loongarch/kernel/head.S | 2 -- arch/loongarch/kernel/unwind_orc.c | 11 ----------- 2 files changed, 13 deletions(-) diff --git a/arch/loongarch/kernel/head.S b/arch/loongarch/kernel/head.S index ce7f6c04f4ab4..7f288e89573b7 100644 --- a/arch/loongarch/kernel/head.S +++ b/arch/loongarch/kernel/head.S @@ -144,5 +144,3 @@ SYM_CODE_START(smpboot_entry) SYM_CODE_END(smpboot_entry) #endif /* CONFIG_SMP */ - -SYM_ENTRY(kernel_entry_end, SYM_L_GLOBAL, SYM_A_NONE) diff --git a/arch/loongarch/kernel/unwind_orc.c b/arch/loongarch/kernel/unwind_orc.c index 0d5fa64a22252..710f82d73797b 100644 --- a/arch/loongarch/kernel/unwind_orc.c +++ b/arch/loongarch/kernel/unwind_orc.c @@ -348,14 +348,6 @@ void unwind_start(struct unwind_state *state, struct task_struct *task, } EXPORT_SYMBOL_GPL(unwind_start); -static bool is_entry_func(unsigned long addr) -{ - extern u32 kernel_entry; - extern u32 kernel_entry_end; - - return addr >= (unsigned long)&kernel_entry && addr < (unsigned long)&kernel_entry_end; -} - static inline unsigned long bt_address(unsigned long ra) { extern unsigned long eentry; @@ -402,9 +394,6 @@ bool unwind_next_frame(struct unwind_state *state) /* Don't let modules unload while we're reading their ORC data. */ guard(rcu)(); - if (is_entry_func(state->pc)) - goto end; - orc = orc_find(state->pc); if (!orc) { /* From 73e3ed8b5e68474abd01fbc7a4f08403b385679e Mon Sep 17 00:00:00 2001 From: Tiezhu Yang Date: Wed, 31 Dec 2025 15:19:19 +0800 Subject: [PATCH 0597/1321] LoongArch: Remove unnecessary checks for ORC unwinder According to the following function definitions, __kernel_text_address() already checks __module_text_address(), so it should remove the check of __module_text_address() in bt_address() at least. int __kernel_text_address(unsigned long addr) { if (kernel_text_address(addr)) return 1; ... return 0; } int kernel_text_address(unsigned long addr) { bool no_rcu; int ret = 1; ... if (is_module_text_address(addr)) goto out; ... return ret; } bool is_module_text_address(unsigned long addr) { guard(rcu)(); return __module_text_address(addr) != NULL; } Furthermore, there are two checks of __kernel_text_address(), one is in bt_address() and the other is after calling bt_address(), it looks like redundant. Handle the exception address first and then use __kernel_text_address() to validate the calculated address for exception or the normal address in bt_address(), then it can remove the check of __kernel_text_address() after calling bt_address(). Just remove unnecessary checks, no functional changes intended. Signed-off-by: Tiezhu Yang Signed-off-by: Huacai Chen --- arch/loongarch/kernel/unwind_orc.c | 16 +++++----------- 1 file changed, 5 insertions(+), 11 deletions(-) diff --git a/arch/loongarch/kernel/unwind_orc.c b/arch/loongarch/kernel/unwind_orc.c index 710f82d73797b..8a6e3429a860e 100644 --- a/arch/loongarch/kernel/unwind_orc.c +++ b/arch/loongarch/kernel/unwind_orc.c @@ -352,12 +352,6 @@ static inline unsigned long bt_address(unsigned long ra) { extern unsigned long eentry; - if (__kernel_text_address(ra)) - return ra; - - if (__module_text_address(ra)) - return ra; - if (ra >= eentry && ra < eentry + EXCCODE_INT_END * VECSIZE) { unsigned long func; unsigned long type = (ra - eentry) / VECSIZE; @@ -375,10 +369,13 @@ static inline unsigned long bt_address(unsigned long ra) break; } - return func + offset; + ra = func + offset; } - return ra; + if (__kernel_text_address(ra)) + return ra; + + return 0; } bool unwind_next_frame(struct unwind_state *state) @@ -501,9 +498,6 @@ bool unwind_next_frame(struct unwind_state *state) goto err; } - if (!__kernel_text_address(state->pc)) - goto err; - return true; err: From 585437d6f31126271271018d7607c2f4da71ce5b Mon Sep 17 00:00:00 2001 From: Chenghao Duan Date: Wed, 31 Dec 2025 15:19:20 +0800 Subject: [PATCH 0598/1321] LoongArch: Enable exception fixup for specific ADE subcode MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit This patch allows the LoongArch BPF JIT to handle recoverable memory access errors generated by BPF_PROBE_MEM* instructions. When a BPF program performs memory access operations, the instructions it executes may trigger ADEM exceptions. The kernel’s built-in BPF exception table mechanism (EX_TYPE_BPF) will generate corresponding exception fixup entries in the JIT compilation phase; however, the architecture-specific trap handling function needs to proactively call the common fixup routine to achieve exception recovery. do_ade(): fix EX_TYPE_BPF memory access exceptions for BPF programs, ensure safe execution. Relevant test cases: illegal address access tests in module_attach and subprogs_extable of selftests/bpf. Signed-off-by: Chenghao Duan Signed-off-by: Huacai Chen --- arch/loongarch/kernel/traps.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/arch/loongarch/kernel/traps.c b/arch/loongarch/kernel/traps.c index 004b8ebf00512..5d49b742e3bf6 100644 --- a/arch/loongarch/kernel/traps.c +++ b/arch/loongarch/kernel/traps.c @@ -535,10 +535,15 @@ asmlinkage void noinstr do_fpe(struct pt_regs *regs, unsigned long fcsr) asmlinkage void noinstr do_ade(struct pt_regs *regs) { irqentry_state_t state = irqentry_enter(regs); + unsigned int esubcode = FIELD_GET(CSR_ESTAT_ESUBCODE, regs->csr_estat); + + if ((esubcode == EXSUBCODE_ADEM) && fixup_exception(regs)) + goto out; die_if_kernel("Kernel ade access", regs); force_sig_fault(SIGBUS, BUS_ADRERR, (void __user *)regs->csr_badvaddr); +out: irqentry_exit(regs, state); } From 573ff2b1db1e6a6cc80548af8d1994017a924963 Mon Sep 17 00:00:00 2001 From: Chenghao Duan Date: Wed, 31 Dec 2025 15:19:20 +0800 Subject: [PATCH 0599/1321] LoongArch: Refactor register restoration in ftrace_common_return Refactor the register restoration sequence in the ftrace_common_return function to clearly distinguish between the logic of normal returns and direct call returns in function tracing scenarios. The logic is as follows: 1. In the case of a normal return, the execution flow returns to the traced function, and ftrace must ensure that the register data is consistent with the state when the function was entered. ra = parent return address; t0 = traced function return address. 2. In the case of a direct call return, the execution flow jumps to the custom trampoline function, and ftrace must ensure that the register data is consistent with the state when ftrace was entered. ra = traced function return address; t0 = parent return address. Cc: stable@vger.kernel.org Fixes: 9cdc3b6a299c ("LoongArch: ftrace: Add direct call support") Signed-off-by: Chenghao Duan Signed-off-by: Huacai Chen --- arch/loongarch/kernel/mcount_dyn.S | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/arch/loongarch/kernel/mcount_dyn.S b/arch/loongarch/kernel/mcount_dyn.S index d6b474ad1d5e5..5729c20e5b8b0 100644 --- a/arch/loongarch/kernel/mcount_dyn.S +++ b/arch/loongarch/kernel/mcount_dyn.S @@ -94,7 +94,6 @@ SYM_INNER_LABEL(ftrace_graph_call, SYM_L_GLOBAL) * at the callsite, so there is no need to restore the T series regs. */ ftrace_common_return: - PTR_L ra, sp, PT_R1 PTR_L a0, sp, PT_R4 PTR_L a1, sp, PT_R5 PTR_L a2, sp, PT_R6 @@ -104,12 +103,17 @@ ftrace_common_return: PTR_L a6, sp, PT_R10 PTR_L a7, sp, PT_R11 PTR_L fp, sp, PT_R22 - PTR_L t0, sp, PT_ERA PTR_L t1, sp, PT_R13 - PTR_ADDI sp, sp, PT_SIZE bnez t1, .Ldirect + + PTR_L ra, sp, PT_R1 + PTR_L t0, sp, PT_ERA + PTR_ADDI sp, sp, PT_SIZE jr t0 .Ldirect: + PTR_L t0, sp, PT_R1 + PTR_L ra, sp, PT_ERA + PTR_ADDI sp, sp, PT_SIZE jr t1 SYM_CODE_END(ftrace_common) @@ -161,6 +165,8 @@ SYM_CODE_END(return_to_handler) #ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS SYM_CODE_START(ftrace_stub_direct_tramp) UNWIND_HINT_UNDEFINED - jr t0 + move t1, ra + move ra, t0 + jr t1 SYM_CODE_END(ftrace_stub_direct_tramp) #endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */ From f616ad48d1bd4e1fb7690a15611e91974de83b76 Mon Sep 17 00:00:00 2001 From: Hengqi Chen Date: Wed, 31 Dec 2025 15:19:20 +0800 Subject: [PATCH 0600/1321] LoongArch: BPF: Sign extend kfunc call arguments The kfunc calls are native calls so they should follow LoongArch calling conventions. Sign extend its arguments properly to avoid kernel panic. This is done by adding a new emit_abi_ext() helper. The emit_abi_ext() helper performs extension in place meaning a value already store in the target register (Note: this is different from the existing sign_extend() helper and thus we can't reuse it). Cc: stable@vger.kernel.org Fixes: 5dc615520c4d ("LoongArch: Add BPF JIT support") Signed-off-by: Hengqi Chen Signed-off-by: Huacai Chen --- arch/loongarch/net/bpf_jit.c | 16 ++++++++++++++++ arch/loongarch/net/bpf_jit.h | 26 ++++++++++++++++++++++++++ 2 files changed, 42 insertions(+) diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c index 8dc58781b8eb7..5352d0c30fb28 100644 --- a/arch/loongarch/net/bpf_jit.c +++ b/arch/loongarch/net/bpf_jit.c @@ -950,6 +950,22 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx, bool ext emit_insn(ctx, ldd, REG_TCC, LOONGARCH_GPR_SP, tcc_ptr_off); } + if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) { + const struct btf_func_model *m; + int i; + + m = bpf_jit_find_kfunc_model(ctx->prog, insn); + if (!m) + return -EINVAL; + + for (i = 0; i < m->nr_args; i++) { + u8 reg = regmap[BPF_REG_1 + i]; + bool sign = m->arg_flags[i] & BTF_FMODEL_SIGNED_ARG; + + emit_abi_ext(ctx, reg, m->arg_size[i], sign); + } + } + move_addr(ctx, t1, func_addr); emit_insn(ctx, jirl, LOONGARCH_GPR_RA, t1, 0); diff --git a/arch/loongarch/net/bpf_jit.h b/arch/loongarch/net/bpf_jit.h index 5697158fd1645..75b6330030a9d 100644 --- a/arch/loongarch/net/bpf_jit.h +++ b/arch/loongarch/net/bpf_jit.h @@ -88,6 +88,32 @@ static inline void emit_sext_32(struct jit_ctx *ctx, enum loongarch_gpr reg, boo emit_insn(ctx, addiw, reg, reg, 0); } +/* Emit proper extension according to ABI requirements. + * Note that it requires a value of size `size` already resides in register `reg`. + */ +static inline void emit_abi_ext(struct jit_ctx *ctx, int reg, u8 size, bool sign) +{ + /* ABI requires unsigned char/short to be zero-extended */ + if (!sign && (size == 1 || size == 2)) + return; + + switch (size) { + case 1: + emit_insn(ctx, extwb, reg, reg); + break; + case 2: + emit_insn(ctx, extwh, reg, reg); + break; + case 4: + emit_insn(ctx, addiw, reg, reg, 0); + break; + case 8: + break; + default: + pr_warn("bpf_jit: invalid size %d for extension\n", size); + } +} + static inline void move_addr(struct jit_ctx *ctx, enum loongarch_gpr rd, u64 addr) { u64 imm_11_0, imm_31_12, imm_51_32, imm_63_52; From 36dc8c3ee00e340de53e7722804a2f3a12089ab8 Mon Sep 17 00:00:00 2001 From: Hengqi Chen Date: Wed, 31 Dec 2025 15:19:20 +0800 Subject: [PATCH 0601/1321] LoongArch: BPF: Zero-extend bpf_tail_call() index The bpf_tail_call() index should be treated as a u32 value. Let's zero-extend it to avoid calling wrong BPF progs. See similar fixes for x86 [1]) and arm64 ([2]) for more details. [1]: https://github.com/torvalds/linux/commit/90caccdd8cc0215705f18b92771b449b01e2474a [2]: https://github.com/torvalds/linux/commit/16338a9b3ac30740d49f5dfed81bac0ffa53b9c7 Cc: stable@vger.kernel.org Fixes: 5dc615520c4d ("LoongArch: Add BPF JIT support") Signed-off-by: Hengqi Chen Signed-off-by: Huacai Chen --- arch/loongarch/net/bpf_jit.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c index 5352d0c30fb28..766ded335fd8b 100644 --- a/arch/loongarch/net/bpf_jit.c +++ b/arch/loongarch/net/bpf_jit.c @@ -280,6 +280,8 @@ static int emit_bpf_tail_call(struct jit_ctx *ctx, int insn) * goto out; */ tc_ninsn = insn ? ctx->offset[insn+1] - ctx->offset[insn] : ctx->offset[0]; + emit_zext_32(ctx, a2, true); + off = offsetof(struct bpf_array, map.max_entries); emit_insn(ctx, ldwu, t1, a1, off); /* bgeu $a2, $t1, jmp_offset */ From 78c82e7ae7922222caf09f02d6db12d68b63b31a Mon Sep 17 00:00:00 2001 From: Chenghao Duan Date: Wed, 31 Dec 2025 15:19:20 +0800 Subject: [PATCH 0602/1321] LoongArch: BPF: Save return address register ra to t0 before trampoline Modify the build_prologue() function to ensure the return address register ra is saved to t0 before entering trampoline operations. This change ensures the accurate return address handling when a BPF program calls another BPF program, preventing errors in the BPF-to-BPF call chain. Cc: stable@vger.kernel.org Fixes: 677e6123e3d2 ("LoongArch: BPF: Disable trampoline for kernel module function trace") Signed-off-by: Chenghao Duan Signed-off-by: Huacai Chen --- arch/loongarch/net/bpf_jit.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c index 766ded335fd8b..9729c0ff7bfcc 100644 --- a/arch/loongarch/net/bpf_jit.c +++ b/arch/loongarch/net/bpf_jit.c @@ -139,6 +139,7 @@ static void build_prologue(struct jit_ctx *ctx) stack_adjust = round_up(stack_adjust, 16); stack_adjust += bpf_stack_adjust; + move_reg(ctx, LOONGARCH_GPR_T0, LOONGARCH_GPR_RA); /* Reserve space for the move_imm + jirl instruction */ for (i = 0; i < LOONGARCH_LONG_JUMP_NINSNS; i++) emit_insn(ctx, nop); From 2a5b79a652013a0cefda293aa6766e20e2a04c8c Mon Sep 17 00:00:00 2001 From: Chenghao Duan Date: Wed, 31 Dec 2025 15:19:21 +0800 Subject: [PATCH 0603/1321] LoongArch: BPF: Adjust the jump offset of tail calls Call the next bpf prog and skip the first instruction of TCC initialization. A total of 7 instructions are skipped: 'move t0, ra' 1 inst 'move_imm + jirl' 5 inst 'addid REG_TCC, zero, 0' 1 inst Relevant test cases: the tailcalls test item in selftests/bpf. Cc: stable@vger.kernel.org Fixes: 677e6123e3d2 ("LoongArch: BPF: Disable trampoline for kernel module function trace") Signed-off-by: Chenghao Duan Signed-off-by: Huacai Chen --- arch/loongarch/net/bpf_jit.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c index 9729c0ff7bfcc..e6aeaec46969d 100644 --- a/arch/loongarch/net/bpf_jit.c +++ b/arch/loongarch/net/bpf_jit.c @@ -239,7 +239,7 @@ static void __build_epilogue(struct jit_ctx *ctx, bool is_tail_call) * Call the next bpf prog and skip the first instruction * of TCC initialization. */ - emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_T3, 6); + emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_T3, 7); } } From a0d38d05f6a2bfed24331efabf531dbfdb11e5b3 Mon Sep 17 00:00:00 2001 From: Chenghao Duan Date: Wed, 31 Dec 2025 15:19:21 +0800 Subject: [PATCH 0604/1321] LoongArch: BPF: Enable trampoline-based tracing for module functions Remove the previous restrictions that blocked the tracing of kernel module functions. Fix the issue that previously caused kernel lockups when attempting to trace module functions. Before entering the trampoline code, the return address register ra shall store the address of the next assembly instruction after the 'bl trampoline' instruction, which is the traced function address, and the register t0 shall store the parent function return address. Refine the trampoline return logic to ensure that register data remains correct when returning to both the traced function and the parent function. Before this patch was applied, the module_attach test in selftests/bpf encountered a deadlock issue. This was caused by an incorrect jump address after the trampoline execution, which resulted in an infinite loop within the module function. Cc: stable@vger.kernel.org Fixes: 677e6123e3d2 ("LoongArch: BPF: Disable trampoline for kernel module function trace") Signed-off-by: Chenghao Duan Signed-off-by: Huacai Chen --- arch/loongarch/net/bpf_jit.c | 20 +++++++++++--------- 1 file changed, 11 insertions(+), 9 deletions(-) diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c index e6aeaec46969d..9f6e93343b6e3 100644 --- a/arch/loongarch/net/bpf_jit.c +++ b/arch/loongarch/net/bpf_jit.c @@ -1284,7 +1284,7 @@ static int emit_jump_or_nops(void *target, void *ip, u32 *insns, bool is_call) return 0; } - return emit_jump_and_link(&ctx, is_call ? LOONGARCH_GPR_T0 : LOONGARCH_GPR_ZERO, (u64)target); + return emit_jump_and_link(&ctx, is_call ? LOONGARCH_GPR_RA : LOONGARCH_GPR_ZERO, (u64)target); } static int emit_call(struct jit_ctx *ctx, u64 addr) @@ -1641,14 +1641,12 @@ static int __arch_prepare_bpf_trampoline(struct jit_ctx *ctx, struct bpf_tramp_i /* To traced function */ /* Ftrace jump skips 2 NOP instructions */ - if (is_kernel_text((unsigned long)orig_call)) + if (is_kernel_text((unsigned long)orig_call) || + is_module_text_address((unsigned long)orig_call)) orig_call += LOONGARCH_FENTRY_NBYTES; /* Direct jump skips 5 NOP instructions */ else if (is_bpf_text_address((unsigned long)orig_call)) orig_call += LOONGARCH_BPF_FENTRY_NBYTES; - /* Module tracing not supported - cause kernel lockups */ - else if (is_module_text_address((unsigned long)orig_call)) - return -ENOTSUPP; if (flags & BPF_TRAMP_F_CALL_ORIG) { move_addr(ctx, LOONGARCH_GPR_A0, (const u64)im); @@ -1741,12 +1739,16 @@ static int __arch_prepare_bpf_trampoline(struct jit_ctx *ctx, struct bpf_tramp_i emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0); emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, 16); - if (flags & BPF_TRAMP_F_SKIP_FRAME) + if (flags & BPF_TRAMP_F_SKIP_FRAME) { /* return to parent function */ - emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_RA, 0); - else - /* return to traced function */ + move_reg(ctx, LOONGARCH_GPR_RA, LOONGARCH_GPR_T0); emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_T0, 0); + } else { + /* return to traced function */ + move_reg(ctx, LOONGARCH_GPR_T1, LOONGARCH_GPR_RA); + move_reg(ctx, LOONGARCH_GPR_RA, LOONGARCH_GPR_T0); + emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_T1, 0); + } } ret = ctx->idx; From 15ee3593012814416d38dbc8cc289ad5e0b2b924 Mon Sep 17 00:00:00 2001 From: Chenghao Duan Date: Wed, 31 Dec 2025 15:19:21 +0800 Subject: [PATCH 0605/1321] LoongArch: BPF: Enhance the bpf_arch_text_poke() function Enhance the bpf_arch_text_poke() function to enable accurate location of BPF program entry points. When modifying the entry point of a BPF program, skip the "move t0, ra" instruction to ensure the correct logic and copy of the jump address. Cc: stable@vger.kernel.org Fixes: 677e6123e3d2 ("LoongArch: BPF: Disable trampoline for kernel module function trace") Signed-off-by: Chenghao Duan Signed-off-by: Huacai Chen --- arch/loongarch/net/bpf_jit.c | 17 ++++++++++++++++- 1 file changed, 16 insertions(+), 1 deletion(-) diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c index 9f6e93343b6e3..d1d5a65308b9e 100644 --- a/arch/loongarch/net/bpf_jit.c +++ b/arch/loongarch/net/bpf_jit.c @@ -1309,15 +1309,30 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type old_t, { int ret; bool is_call; + unsigned long size = 0; + unsigned long offset = 0; + void *image = NULL; + char namebuf[KSYM_NAME_LEN]; u32 old_insns[LOONGARCH_LONG_JUMP_NINSNS] = {[0 ... 4] = INSN_NOP}; u32 new_insns[LOONGARCH_LONG_JUMP_NINSNS] = {[0 ... 4] = INSN_NOP}; /* Only poking bpf text is supported. Since kernel function entry * is set up by ftrace, we rely on ftrace to poke kernel functions. */ - if (!is_bpf_text_address((unsigned long)ip)) + if (!__bpf_address_lookup((unsigned long)ip, &size, &offset, namebuf)) return -ENOTSUPP; + image = ip - offset; + + /* zero offset means we're poking bpf prog entry */ + if (offset == 0) { + /* skip to the nop instruction in bpf prog entry: + * move t0, ra + * nop + */ + ip = image + LOONGARCH_INSN_SIZE; + } + is_call = old_t == BPF_MOD_CALL; ret = emit_jump_or_nops(old_addr, ip, old_insns, is_call); if (ret) From ac519fd9ecab80de7fd1d840d41aa69827a5777f Mon Sep 17 00:00:00 2001 From: Chenghao Duan Date: Wed, 31 Dec 2025 15:19:25 +0800 Subject: [PATCH 0606/1321] samples/ftrace: Adjust LoongArch register restore order in direct calls Ensure that in the ftrace direct call logic, the CPU register state (with ra = parent return address) is restored to the correct state after the execution of the custom trampoline function and before returning to the traced function. Additionally, guarantee the correctness of the jump logic for jr t0 (traced function address). Cc: stable@vger.kernel.org Fixes: 9cdc3b6a299c ("LoongArch: ftrace: Add direct call support") Reported-by: Youling Tang Acked-by: Steven Rostedt (Google) Signed-off-by: Chenghao Duan Signed-off-by: Huacai Chen --- samples/ftrace/ftrace-direct-modify.c | 8 ++++---- samples/ftrace/ftrace-direct-multi-modify.c | 8 ++++---- samples/ftrace/ftrace-direct-multi.c | 4 ++-- samples/ftrace/ftrace-direct-too.c | 4 ++-- samples/ftrace/ftrace-direct.c | 4 ++-- 5 files changed, 14 insertions(+), 14 deletions(-) diff --git a/samples/ftrace/ftrace-direct-modify.c b/samples/ftrace/ftrace-direct-modify.c index da3a9f2091f55..1ba1927b548ee 100644 --- a/samples/ftrace/ftrace-direct-modify.c +++ b/samples/ftrace/ftrace-direct-modify.c @@ -176,8 +176,8 @@ asm ( " st.d $t0, $sp, 0\n" " st.d $ra, $sp, 8\n" " bl my_direct_func1\n" -" ld.d $t0, $sp, 0\n" -" ld.d $ra, $sp, 8\n" +" ld.d $ra, $sp, 0\n" +" ld.d $t0, $sp, 8\n" " addi.d $sp, $sp, 16\n" " jr $t0\n" " .size my_tramp1, .-my_tramp1\n" @@ -189,8 +189,8 @@ asm ( " st.d $t0, $sp, 0\n" " st.d $ra, $sp, 8\n" " bl my_direct_func2\n" -" ld.d $t0, $sp, 0\n" -" ld.d $ra, $sp, 8\n" +" ld.d $ra, $sp, 0\n" +" ld.d $t0, $sp, 8\n" " addi.d $sp, $sp, 16\n" " jr $t0\n" " .size my_tramp2, .-my_tramp2\n" diff --git a/samples/ftrace/ftrace-direct-multi-modify.c b/samples/ftrace/ftrace-direct-multi-modify.c index 8f7986d698d87..7a7822dfeb50a 100644 --- a/samples/ftrace/ftrace-direct-multi-modify.c +++ b/samples/ftrace/ftrace-direct-multi-modify.c @@ -199,8 +199,8 @@ asm ( " move $a0, $t0\n" " bl my_direct_func1\n" " ld.d $a0, $sp, 0\n" -" ld.d $t0, $sp, 8\n" -" ld.d $ra, $sp, 16\n" +" ld.d $ra, $sp, 8\n" +" ld.d $t0, $sp, 16\n" " addi.d $sp, $sp, 32\n" " jr $t0\n" " .size my_tramp1, .-my_tramp1\n" @@ -215,8 +215,8 @@ asm ( " move $a0, $t0\n" " bl my_direct_func2\n" " ld.d $a0, $sp, 0\n" -" ld.d $t0, $sp, 8\n" -" ld.d $ra, $sp, 16\n" +" ld.d $ra, $sp, 8\n" +" ld.d $t0, $sp, 16\n" " addi.d $sp, $sp, 32\n" " jr $t0\n" " .size my_tramp2, .-my_tramp2\n" diff --git a/samples/ftrace/ftrace-direct-multi.c b/samples/ftrace/ftrace-direct-multi.c index db326c81a27dd..3fe6ddaf0b69f 100644 --- a/samples/ftrace/ftrace-direct-multi.c +++ b/samples/ftrace/ftrace-direct-multi.c @@ -131,8 +131,8 @@ asm ( " move $a0, $t0\n" " bl my_direct_func\n" " ld.d $a0, $sp, 0\n" -" ld.d $t0, $sp, 8\n" -" ld.d $ra, $sp, 16\n" +" ld.d $ra, $sp, 8\n" +" ld.d $t0, $sp, 16\n" " addi.d $sp, $sp, 32\n" " jr $t0\n" " .size my_tramp, .-my_tramp\n" diff --git a/samples/ftrace/ftrace-direct-too.c b/samples/ftrace/ftrace-direct-too.c index 3d0fa260332d4..bf2411aa6fd7a 100644 --- a/samples/ftrace/ftrace-direct-too.c +++ b/samples/ftrace/ftrace-direct-too.c @@ -143,8 +143,8 @@ asm ( " ld.d $a0, $sp, 0\n" " ld.d $a1, $sp, 8\n" " ld.d $a2, $sp, 16\n" -" ld.d $t0, $sp, 24\n" -" ld.d $ra, $sp, 32\n" +" ld.d $ra, $sp, 24\n" +" ld.d $t0, $sp, 32\n" " addi.d $sp, $sp, 48\n" " jr $t0\n" " .size my_tramp, .-my_tramp\n" diff --git a/samples/ftrace/ftrace-direct.c b/samples/ftrace/ftrace-direct.c index 956834b0d19ac..5368c8c39cbb4 100644 --- a/samples/ftrace/ftrace-direct.c +++ b/samples/ftrace/ftrace-direct.c @@ -124,8 +124,8 @@ asm ( " st.d $ra, $sp, 16\n" " bl my_direct_func\n" " ld.d $a0, $sp, 0\n" -" ld.d $t0, $sp, 8\n" -" ld.d $ra, $sp, 16\n" +" ld.d $ra, $sp, 8\n" +" ld.d $t0, $sp, 16\n" " addi.d $sp, $sp, 32\n" " jr $t0\n" " .size my_tramp, .-my_tramp\n" From c09cefd1954d53f09d3f196a3cffbaa86d550e94 Mon Sep 17 00:00:00 2001 From: Rong Zhang Date: Tue, 30 Dec 2025 02:22:21 +0800 Subject: [PATCH 0607/1321] x86/microcode/AMD: Fix Entrysign revision check for Zen5/Strix Halo Zen5 also contains family 1Ah, models 70h-7Fh, which are mistakenly missing from cpu_has_entrysign(). Add the missing range. Fixes: 8a9fb5129e8e ("x86/microcode/AMD: Limit Entrysign signature checking to known generations") Signed-off-by: Rong Zhang Signed-off-by: Borislav Petkov (AMD) Cc: stable@kernel.org Link: https://patch.msgid.link/20251229182245.152747-1-i@rong.moe --- arch/x86/kernel/cpu/microcode/amd.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kernel/cpu/microcode/amd.c b/arch/x86/kernel/cpu/microcode/amd.c index 3821a985f4ffe..46673530bc6f0 100644 --- a/arch/x86/kernel/cpu/microcode/amd.c +++ b/arch/x86/kernel/cpu/microcode/amd.c @@ -258,7 +258,7 @@ static bool cpu_has_entrysign(void) if (fam == 0x1a) { if (model <= 0x2f || (0x40 <= model && model <= 0x4f) || - (0x60 <= model && model <= 0x6f)) + (0x60 <= model && model <= 0x7f)) return true; } From bbffc664c56aeb14949377c4b42b3a81c4dd8873 Mon Sep 17 00:00:00 2001 From: Alexandre Negrel Date: Tue, 30 Dec 2025 19:57:28 +0100 Subject: [PATCH 0608/1321] io_uring: use GFP_NOWAIT for overflow CQEs on legacy rings Allocate the overflowing CQE with GFP_NOWAIT instead of GFP_ATOMIC. This changes causes allocations to fail earlier in out-of-memory situations, rather than being deferred. Using GFP_ATOMIC allows a process to exceed memory limits. Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220794 Signed-off-by: Alexandre Negrel Link: https://lore.kernel.org/io-uring/20251229201933.515797-1-alexandre@negrel.dev/ Signed-off-by: Jens Axboe --- io_uring/io_uring.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c index 6cb24cdf8e684..709943fedaf40 100644 --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -864,7 +864,7 @@ static __cold bool io_cqe_overflow_locked(struct io_ring_ctx *ctx, { struct io_overflow_cqe *ocqe; - ocqe = io_alloc_ocqe(ctx, cqe, big_cqe, GFP_ATOMIC); + ocqe = io_alloc_ocqe(ctx, cqe, big_cqe, GFP_NOWAIT); return io_cqring_add_overflow(ctx, ocqe); } From e3169bafab0c73ab68e377c457c0ea2a039f5d32 Mon Sep 17 00:00:00 2001 From: Jens Axboe Date: Wed, 31 Dec 2025 08:12:46 -0700 Subject: [PATCH 0609/1321] io_uring/tctx: add separate lock for list of tctx's in ctx ctx->tcxt_list holds the tasks using this ring, and it's currently protected by the normal ctx->uring_lock. However, this can cause a circular locking issue, as reported by syzbot, where cancelations off exec end up needing to remove an entry from this list: ====================================================== WARNING: possible circular locking dependency detected syzkaller #0 Tainted: G L ------------------------------------------------------ syz.0.9999/12287 is trying to acquire lock: ffff88805851c0a8 (&ctx->uring_lock){+.+.}-{4:4}, at: io_uring_del_tctx_node+0xf0/0x2c0 io_uring/tctx.c:179 but task is already holding lock: ffff88802db5a2e0 (&sig->cred_guard_mutex){+.+.}-{4:4}, at: prepare_bprm_creds fs/exec.c:1360 [inline] ffff88802db5a2e0 (&sig->cred_guard_mutex){+.+.}-{4:4}, at: bprm_execve+0xb9/0x1400 fs/exec.c:1733 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (&sig->cred_guard_mutex){+.+.}-{4:4}: __mutex_lock_common kernel/locking/mutex.c:614 [inline] __mutex_lock+0x187/0x1350 kernel/locking/mutex.c:776 proc_pid_attr_write+0x547/0x630 fs/proc/base.c:2837 vfs_write+0x27e/0xb30 fs/read_write.c:684 ksys_write+0x145/0x250 fs/read_write.c:738 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xec/0xf80 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f -> #1 (sb_writers#3){.+.+}-{0:0}: percpu_down_read_internal include/linux/percpu-rwsem.h:53 [inline] percpu_down_read_freezable include/linux/percpu-rwsem.h:83 [inline] __sb_start_write include/linux/fs/super.h:19 [inline] sb_start_write+0x4d/0x1c0 include/linux/fs/super.h:125 mnt_want_write+0x41/0x90 fs/namespace.c:499 open_last_lookups fs/namei.c:4529 [inline] path_openat+0xadd/0x3dd0 fs/namei.c:4784 do_filp_open+0x1fa/0x410 fs/namei.c:4814 io_openat2+0x3e0/0x5c0 io_uring/openclose.c:143 __io_issue_sqe+0x181/0x4b0 io_uring/io_uring.c:1792 io_issue_sqe+0x165/0x1060 io_uring/io_uring.c:1815 io_queue_sqe io_uring/io_uring.c:2042 [inline] io_submit_sqe io_uring/io_uring.c:2320 [inline] io_submit_sqes+0xbf4/0x2140 io_uring/io_uring.c:2434 __do_sys_io_uring_enter io_uring/io_uring.c:3280 [inline] __se_sys_io_uring_enter+0x2e0/0x2b60 io_uring/io_uring.c:3219 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xec/0xf80 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f -> #0 (&ctx->uring_lock){+.+.}-{4:4}: check_prev_add kernel/locking/lockdep.c:3165 [inline] check_prevs_add kernel/locking/lockdep.c:3284 [inline] validate_chain kernel/locking/lockdep.c:3908 [inline] __lock_acquire+0x15a6/0x2cf0 kernel/locking/lockdep.c:5237 lock_acquire+0x107/0x340 kernel/locking/lockdep.c:5868 __mutex_lock_common kernel/locking/mutex.c:614 [inline] __mutex_lock+0x187/0x1350 kernel/locking/mutex.c:776 io_uring_del_tctx_node+0xf0/0x2c0 io_uring/tctx.c:179 io_uring_clean_tctx+0xd4/0x1a0 io_uring/tctx.c:195 io_uring_cancel_generic+0x6ca/0x7d0 io_uring/cancel.c:646 io_uring_task_cancel include/linux/io_uring.h:24 [inline] begin_new_exec+0x10ed/0x2440 fs/exec.c:1131 load_elf_binary+0x9f8/0x2d70 fs/binfmt_elf.c:1010 search_binary_handler fs/exec.c:1669 [inline] exec_binprm fs/exec.c:1701 [inline] bprm_execve+0x92e/0x1400 fs/exec.c:1753 do_execveat_common+0x510/0x6a0 fs/exec.c:1859 do_execve fs/exec.c:1933 [inline] __do_sys_execve fs/exec.c:2009 [inline] __se_sys_execve fs/exec.c:2004 [inline] __x64_sys_execve+0x94/0xb0 fs/exec.c:2004 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xec/0xf80 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f other info that might help us debug this: Chain exists of: &ctx->uring_lock --> sb_writers#3 --> &sig->cred_guard_mutex Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&sig->cred_guard_mutex); lock(sb_writers#3); lock(&sig->cred_guard_mutex); lock(&ctx->uring_lock); *** DEADLOCK *** 1 lock held by syz.0.9999/12287: #0: ffff88802db5a2e0 (&sig->cred_guard_mutex){+.+.}-{4:4}, at: prepare_bprm_creds fs/exec.c:1360 [inline] #0: ffff88802db5a2e0 (&sig->cred_guard_mutex){+.+.}-{4:4}, at: bprm_execve+0xb9/0x1400 fs/exec.c:1733 stack backtrace: CPU: 0 UID: 0 PID: 12287 Comm: syz.0.9999 Tainted: G L syzkaller #0 PREEMPT(full) Tainted: [L]=SOFTLOCKUP Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025 Call Trace: dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120 print_circular_bug+0x2e2/0x300 kernel/locking/lockdep.c:2043 check_noncircular+0x12e/0x150 kernel/locking/lockdep.c:2175 check_prev_add kernel/locking/lockdep.c:3165 [inline] check_prevs_add kernel/locking/lockdep.c:3284 [inline] validate_chain kernel/locking/lockdep.c:3908 [inline] __lock_acquire+0x15a6/0x2cf0 kernel/locking/lockdep.c:5237 lock_acquire+0x107/0x340 kernel/locking/lockdep.c:5868 __mutex_lock_common kernel/locking/mutex.c:614 [inline] __mutex_lock+0x187/0x1350 kernel/locking/mutex.c:776 io_uring_del_tctx_node+0xf0/0x2c0 io_uring/tctx.c:179 io_uring_clean_tctx+0xd4/0x1a0 io_uring/tctx.c:195 io_uring_cancel_generic+0x6ca/0x7d0 io_uring/cancel.c:646 io_uring_task_cancel include/linux/io_uring.h:24 [inline] begin_new_exec+0x10ed/0x2440 fs/exec.c:1131 load_elf_binary+0x9f8/0x2d70 fs/binfmt_elf.c:1010 search_binary_handler fs/exec.c:1669 [inline] exec_binprm fs/exec.c:1701 [inline] bprm_execve+0x92e/0x1400 fs/exec.c:1753 do_execveat_common+0x510/0x6a0 fs/exec.c:1859 do_execve fs/exec.c:1933 [inline] __do_sys_execve fs/exec.c:2009 [inline] __se_sys_execve fs/exec.c:2004 [inline] __x64_sys_execve+0x94/0xb0 fs/exec.c:2004 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xec/0xf80 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7ff3a8b8f749 Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ff3a9a97038 EFLAGS: 00000246 ORIG_RAX: 000000000000003b RAX: ffffffffffffffda RBX: 00007ff3a8de5fa0 RCX: 00007ff3a8b8f749 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000200000000400 RBP: 00007ff3a8c13f91 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 00007ff3a8de6038 R14: 00007ff3a8de5fa0 R15: 00007ff3a8f0fa28 Add a separate lock just for the tctx_list, tctx_lock. This can nest under ->uring_lock, where necessary, and be used separately for list manipulation. For the cancelation off exec side, this removes the need to grab ->uring_lock, hence fixing the circular locking dependency. Reported-by: syzbot+b0e3b77ffaa8a4067ce5@syzkaller.appspotmail.com Signed-off-by: Jens Axboe --- include/linux/io_uring_types.h | 8 +++++++- io_uring/cancel.c | 5 +++++ io_uring/io_uring.c | 5 +++++ io_uring/register.c | 2 ++ io_uring/tctx.c | 8 ++++---- 5 files changed, 23 insertions(+), 5 deletions(-) diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h index e1adb0d20a0af..a3e8ddc9b380f 100644 --- a/include/linux/io_uring_types.h +++ b/include/linux/io_uring_types.h @@ -424,11 +424,17 @@ struct io_ring_ctx { struct user_struct *user; struct mm_struct *mm_account; + /* + * List of tctx nodes for this ctx, protected by tctx_lock. For + * cancelation purposes, nests under uring_lock. + */ + struct list_head tctx_list; + struct mutex tctx_lock; + /* ctx exit and cancelation */ struct llist_head fallback_llist; struct delayed_work fallback_work; struct work_struct exit_work; - struct list_head tctx_list; struct completion ref_comp; /* io-wq management, e.g. thread count */ diff --git a/io_uring/cancel.c b/io_uring/cancel.c index ca12ac10c0ae9..07b8d852218b1 100644 --- a/io_uring/cancel.c +++ b/io_uring/cancel.c @@ -184,7 +184,9 @@ static int __io_async_cancel(struct io_cancel_data *cd, } while (1); /* slow path, try all io-wq's */ + __set_current_state(TASK_RUNNING); io_ring_submit_lock(ctx, issue_flags); + mutex_lock(&ctx->tctx_lock); ret = -ENOENT; list_for_each_entry(node, &ctx->tctx_list, ctx_node) { ret = io_async_cancel_one(node->task->io_uring, cd); @@ -194,6 +196,7 @@ static int __io_async_cancel(struct io_cancel_data *cd, nr++; } } + mutex_unlock(&ctx->tctx_lock); io_ring_submit_unlock(ctx, issue_flags); return all ? nr : ret; } @@ -484,6 +487,7 @@ static __cold bool io_uring_try_cancel_iowq(struct io_ring_ctx *ctx) bool ret = false; mutex_lock(&ctx->uring_lock); + mutex_lock(&ctx->tctx_lock); list_for_each_entry(node, &ctx->tctx_list, ctx_node) { struct io_uring_task *tctx = node->task->io_uring; @@ -496,6 +500,7 @@ static __cold bool io_uring_try_cancel_iowq(struct io_ring_ctx *ctx) cret = io_wq_cancel_cb(tctx->io_wq, io_cancel_ctx_cb, ctx, true); ret |= (cret != IO_WQ_CANCEL_NOTFOUND); } + mutex_unlock(&ctx->tctx_lock); mutex_unlock(&ctx->uring_lock); return ret; diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c index 709943fedaf40..87a87396e9409 100644 --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -340,6 +340,7 @@ static __cold struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p) INIT_LIST_HEAD(&ctx->ltimeout_list); init_llist_head(&ctx->work_llist); INIT_LIST_HEAD(&ctx->tctx_list); + mutex_init(&ctx->tctx_lock); ctx->submit_state.free_list.next = NULL; INIT_HLIST_HEAD(&ctx->waitid_list); xa_init_flags(&ctx->zcrx_ctxs, XA_FLAGS_ALLOC); @@ -3045,6 +3046,7 @@ static __cold void io_ring_exit_work(struct work_struct *work) exit.ctx = ctx; mutex_lock(&ctx->uring_lock); + mutex_lock(&ctx->tctx_lock); while (!list_empty(&ctx->tctx_list)) { WARN_ON_ONCE(time_after(jiffies, timeout)); @@ -3056,6 +3058,7 @@ static __cold void io_ring_exit_work(struct work_struct *work) if (WARN_ON_ONCE(ret)) continue; + mutex_unlock(&ctx->tctx_lock); mutex_unlock(&ctx->uring_lock); /* * See comment above for @@ -3064,7 +3067,9 @@ static __cold void io_ring_exit_work(struct work_struct *work) */ wait_for_completion_interruptible(&exit.completion); mutex_lock(&ctx->uring_lock); + mutex_lock(&ctx->tctx_lock); } + mutex_unlock(&ctx->tctx_lock); mutex_unlock(&ctx->uring_lock); spin_lock(&ctx->completion_lock); spin_unlock(&ctx->completion_lock); diff --git a/io_uring/register.c b/io_uring/register.c index 62d39b3ff317e..3d3822ff3fd9e 100644 --- a/io_uring/register.c +++ b/io_uring/register.c @@ -320,6 +320,7 @@ static __cold int io_register_iowq_max_workers(struct io_ring_ctx *ctx, return 0; /* now propagate the restriction to all registered users */ + mutex_lock(&ctx->tctx_lock); list_for_each_entry(node, &ctx->tctx_list, ctx_node) { tctx = node->task->io_uring; if (WARN_ON_ONCE(!tctx->io_wq)) @@ -330,6 +331,7 @@ static __cold int io_register_iowq_max_workers(struct io_ring_ctx *ctx, /* ignore errors, it always returns zero anyway */ (void)io_wq_max_workers(tctx->io_wq, new_count); } + mutex_unlock(&ctx->tctx_lock); return 0; err: if (sqd) { diff --git a/io_uring/tctx.c b/io_uring/tctx.c index 5b66755579c08..6d6f44215ec80 100644 --- a/io_uring/tctx.c +++ b/io_uring/tctx.c @@ -136,9 +136,9 @@ int __io_uring_add_tctx_node(struct io_ring_ctx *ctx) return ret; } - mutex_lock(&ctx->uring_lock); + mutex_lock(&ctx->tctx_lock); list_add(&node->ctx_node, &ctx->tctx_list); - mutex_unlock(&ctx->uring_lock); + mutex_unlock(&ctx->tctx_lock); } return 0; } @@ -176,9 +176,9 @@ __cold void io_uring_del_tctx_node(unsigned long index) WARN_ON_ONCE(current != node->task); WARN_ON_ONCE(list_empty(&node->ctx_node)); - mutex_lock(&node->ctx->uring_lock); + mutex_lock(&node->ctx->tctx_lock); list_del(&node->ctx_node); - mutex_unlock(&node->ctx->uring_lock); + mutex_unlock(&node->ctx->tctx_lock); if (tctx->last == node->ctx) tctx->last = NULL; From 182dc56df25f48a7495d4cbd69f841c655f37fc2 Mon Sep 17 00:00:00 2001 From: Caleb Sander Mateos Date: Wed, 31 Dec 2025 11:19:06 -0700 Subject: [PATCH 0610/1321] io_uring/memmap: drop unused sz param in io_uring_validate_mmap_request() io_uring_validate_mmap_request() doesn't use its size_t sz argument, so remove it. Signed-off-by: Caleb Sander Mateos Signed-off-by: Jens Axboe --- io_uring/memmap.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/io_uring/memmap.c b/io_uring/memmap.c index 18e574776ef65..7d3c5eb584801 100644 --- a/io_uring/memmap.c +++ b/io_uring/memmap.c @@ -268,8 +268,7 @@ static void *io_region_validate_mmap(struct io_ring_ctx *ctx, return io_region_get_ptr(mr); } -static void *io_uring_validate_mmap_request(struct file *file, loff_t pgoff, - size_t sz) +static void *io_uring_validate_mmap_request(struct file *file, loff_t pgoff) { struct io_ring_ctx *ctx = file->private_data; struct io_mapped_region *region; @@ -304,7 +303,7 @@ __cold int io_uring_mmap(struct file *file, struct vm_area_struct *vma) guard(mutex)(&ctx->mmap_lock); - ptr = io_uring_validate_mmap_request(file, vma->vm_pgoff, sz); + ptr = io_uring_validate_mmap_request(file, vma->vm_pgoff); if (IS_ERR(ptr)) return PTR_ERR(ptr); @@ -336,7 +335,7 @@ unsigned long io_uring_get_unmapped_area(struct file *filp, unsigned long addr, guard(mutex)(&ctx->mmap_lock); - ptr = io_uring_validate_mmap_request(filp, pgoff, len); + ptr = io_uring_validate_mmap_request(filp, pgoff); if (IS_ERR(ptr)) return -ENOMEM; @@ -386,7 +385,7 @@ unsigned long io_uring_get_unmapped_area(struct file *file, unsigned long addr, guard(mutex)(&ctx->mmap_lock); - ptr = io_uring_validate_mmap_request(file, pgoff, len); + ptr = io_uring_validate_mmap_request(file, pgoff); if (IS_ERR(ptr)) return PTR_ERR(ptr); From 0d1714feb46818ffd8b18202bfe6b92b491991a5 Mon Sep 17 00:00:00 2001 From: shechenglong Date: Sun, 28 Dec 2025 21:04:26 +0800 Subject: [PATCH 0611/1321] block,bfq: fix aux stat accumulation destination Route bfqg_stats_add_aux() time accumulation into the destination stats object instead of the source, aligning with other stat fields. Reviewed-by: Yu Kuai Signed-off-by: shechenglong Signed-off-by: Jens Axboe --- block/bfq-cgroup.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c index 9fb9f35331502..6a75fe1c7a5c0 100644 --- a/block/bfq-cgroup.c +++ b/block/bfq-cgroup.c @@ -380,7 +380,7 @@ static void bfqg_stats_add_aux(struct bfqg_stats *to, struct bfqg_stats *from) blkg_rwstat_add_aux(&to->merged, &from->merged); blkg_rwstat_add_aux(&to->service_time, &from->service_time); blkg_rwstat_add_aux(&to->wait_time, &from->wait_time); - bfq_stat_add_aux(&from->time, &from->time); + bfq_stat_add_aux(&to->time, &from->time); bfq_stat_add_aux(&to->avg_queue_size_sum, &from->avg_queue_size_sum); bfq_stat_add_aux(&to->avg_queue_size_samples, &from->avg_queue_size_samples); From 917ff203a992706684127494948a43c82599c8ca Mon Sep 17 00:00:00 2001 From: Ming Lei Date: Tue, 23 Dec 2025 11:27:40 +0800 Subject: [PATCH 0612/1321] ublk: scan partition in async way Implement async partition scan to avoid IO hang when reading partition tables. Similar to nvme_partition_scan_work(), partition scanning is deferred to a work queue to prevent deadlocks. When partition scan happens synchronously during add_disk(), IO errors can cause the partition scan to wait while holding ub->mutex, which can deadlock with other operations that need the mutex. Changes: - Add partition_scan_work to ublk_device structure - Implement ublk_partition_scan_work() to perform async scan - Always suppress sync partition scan during add_disk() - Schedule async work after add_disk() for trusted daemons - Add flush_work() in ublk_stop_dev() before grabbing ub->mutex Reviewed-by: Caleb Sander Mateos Reported-by: Yoav Cohen Closes: https://lore.kernel.org/linux-block/DM4PR12MB63280C5637917C071C2F0D65A9A8A@DM4PR12MB6328.namprd12.prod.outlook.com/ Fixes: 71f28f3136af ("ublk_drv: add io_uring based userspace block driver") Signed-off-by: Ming Lei Signed-off-by: Jens Axboe --- drivers/block/ublk_drv.c | 35 ++++++++++++++++++++++++++++++++--- 1 file changed, 32 insertions(+), 3 deletions(-) diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c index 49c2084571981..837fedb02e0d5 100644 --- a/drivers/block/ublk_drv.c +++ b/drivers/block/ublk_drv.c @@ -237,6 +237,7 @@ struct ublk_device { bool canceling; pid_t ublksrv_tgid; struct delayed_work exit_work; + struct work_struct partition_scan_work; struct ublk_queue *queues[]; }; @@ -254,6 +255,20 @@ static inline struct request *__ublk_check_and_get_req(struct ublk_device *ub, u16 q_id, u16 tag, struct ublk_io *io, size_t offset); static inline unsigned int ublk_req_build_flags(struct request *req); +static void ublk_partition_scan_work(struct work_struct *work) +{ + struct ublk_device *ub = + container_of(work, struct ublk_device, partition_scan_work); + + if (WARN_ON_ONCE(!test_and_clear_bit(GD_SUPPRESS_PART_SCAN, + &ub->ub_disk->state))) + return; + + mutex_lock(&ub->ub_disk->open_mutex); + bdev_disk_changed(ub->ub_disk, false); + mutex_unlock(&ub->ub_disk->open_mutex); +} + static inline struct ublksrv_io_desc * ublk_get_iod(const struct ublk_queue *ubq, unsigned tag) { @@ -2026,6 +2041,7 @@ static void ublk_stop_dev(struct ublk_device *ub) mutex_lock(&ub->mutex); ublk_stop_dev_unlocked(ub); mutex_unlock(&ub->mutex); + flush_work(&ub->partition_scan_work); ublk_cancel_dev(ub); } @@ -2954,9 +2970,17 @@ static int ublk_ctrl_start_dev(struct ublk_device *ub, ublk_apply_params(ub); - /* don't probe partitions if any daemon task is un-trusted */ - if (ub->unprivileged_daemons) - set_bit(GD_SUPPRESS_PART_SCAN, &disk->state); + /* + * Suppress partition scan to avoid potential IO hang. + * + * If ublk server error occurs during partition scan, the IO may + * wait while holding ub->mutex, which can deadlock with other + * operations that need the mutex. Defer partition scan to async + * work. + * For unprivileged daemons, keep GD_SUPPRESS_PART_SCAN set + * permanently. + */ + set_bit(GD_SUPPRESS_PART_SCAN, &disk->state); ublk_get_device(ub); ub->dev_info.state = UBLK_S_DEV_LIVE; @@ -2973,6 +2997,10 @@ static int ublk_ctrl_start_dev(struct ublk_device *ub, set_bit(UB_STATE_USED, &ub->state); + /* Schedule async partition scan for trusted daemons */ + if (!ub->unprivileged_daemons) + schedule_work(&ub->partition_scan_work); + out_put_cdev: if (ret) { ublk_detach_disk(ub); @@ -3138,6 +3166,7 @@ static int ublk_ctrl_add_dev(const struct ublksrv_ctrl_cmd *header) mutex_init(&ub->mutex); spin_lock_init(&ub->lock); mutex_init(&ub->cancel_mutex); + INIT_WORK(&ub->partition_scan_work, ublk_partition_scan_work); ret = ublk_alloc_dev_number(ub, header->dev_id); if (ret < 0) From d475693c46a81c96ba71d6c5e56058d31ea5efa9 Mon Sep 17 00:00:00 2001 From: Ming Lei Date: Tue, 23 Dec 2025 11:27:41 +0800 Subject: [PATCH 0613/1321] selftests/ublk: add test for async partition scan Add test_generic_15.sh to verify that async partition scan prevents IO hang when reading partition tables. The test creates ublk devices with fault_inject target and very large delay (60s) to simulate blocked partition table reads, then kills the daemon to verify proper state transitions without hanging: 1. Without recovery support: - Create device with fault_inject and 60s delay - Kill daemon while partition scan may be blocked - Verify device transitions to DEAD state 2. With recovery support (-r 1): - Create device with fault_inject, 60s delay, and recovery - Kill daemon while partition scan may be blocked - Verify device transitions to QUIESCED state Before the async partition scan fix, killing the daemon during partition scan would cause deadlock as partition scan held ub->mutex while waiting for IO. With the async fix, partition scan happens in a work function and flush_work() ensures proper synchronization. Add _add_ublk_dev_no_settle() helper function to skip udevadm settle, which would otherwise hang waiting for partition scan events to complete when partition table read is delayed. Signed-off-by: Ming Lei Signed-off-by: Jens Axboe --- tools/testing/selftests/ublk/Makefile | 1 + tools/testing/selftests/ublk/test_common.sh | 16 +++-- .../testing/selftests/ublk/test_generic_15.sh | 68 +++++++++++++++++++ 3 files changed, 81 insertions(+), 4 deletions(-) create mode 100755 tools/testing/selftests/ublk/test_generic_15.sh diff --git a/tools/testing/selftests/ublk/Makefile b/tools/testing/selftests/ublk/Makefile index 837977b624171..eb0e6cfb00ad3 100644 --- a/tools/testing/selftests/ublk/Makefile +++ b/tools/testing/selftests/ublk/Makefile @@ -22,6 +22,7 @@ TEST_PROGS += test_generic_11.sh TEST_PROGS += test_generic_12.sh TEST_PROGS += test_generic_13.sh TEST_PROGS += test_generic_14.sh +TEST_PROGS += test_generic_15.sh TEST_PROGS += test_null_01.sh TEST_PROGS += test_null_02.sh diff --git a/tools/testing/selftests/ublk/test_common.sh b/tools/testing/selftests/ublk/test_common.sh index 6f1c042de40e7..ea9a5f3eb70ab 100755 --- a/tools/testing/selftests/ublk/test_common.sh +++ b/tools/testing/selftests/ublk/test_common.sh @@ -178,8 +178,9 @@ _have_feature() _create_ublk_dev() { local dev_id; local cmd=$1 + local settle=$2 - shift 1 + shift 2 if [ ! -c /dev/ublk-control ]; then return ${UBLK_SKIP_CODE} @@ -194,7 +195,10 @@ _create_ublk_dev() { echo "fail to add ublk dev $*" return 255 fi - udevadm settle + + if [ "$settle" = "yes" ]; then + udevadm settle + fi if [[ "$dev_id" =~ ^[0-9]+$ ]]; then echo "${dev_id}" @@ -204,14 +208,18 @@ _create_ublk_dev() { } _add_ublk_dev() { - _create_ublk_dev "add" "$@" + _create_ublk_dev "add" "yes" "$@" +} + +_add_ublk_dev_no_settle() { + _create_ublk_dev "add" "no" "$@" } _recover_ublk_dev() { local dev_id local state - dev_id=$(_create_ublk_dev "recover" "$@") + dev_id=$(_create_ublk_dev "recover" "yes" "$@") for ((j=0;j<20;j++)); do state=$(_get_ublk_dev_state "${dev_id}") [ "$state" == "LIVE" ] && break diff --git a/tools/testing/selftests/ublk/test_generic_15.sh b/tools/testing/selftests/ublk/test_generic_15.sh new file mode 100755 index 0000000000000..76379362e0a28 --- /dev/null +++ b/tools/testing/selftests/ublk/test_generic_15.sh @@ -0,0 +1,68 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 + +. "$(cd "$(dirname "$0")" && pwd)"/test_common.sh + +TID="generic_15" +ERR_CODE=0 + +_test_partition_scan_no_hang() +{ + local recovery_flag=$1 + local expected_state=$2 + local dev_id + local state + local daemon_pid + local start_time + local elapsed + + # Create ublk device with fault_inject target and very large delay + # to simulate hang during partition table read + # --delay_us 60000000 = 60 seconds delay + # Use _add_ublk_dev_no_settle to avoid udevadm settle hang waiting + # for partition scan events to complete + if [ "$recovery_flag" = "yes" ]; then + echo "Testing partition scan with recovery support..." + dev_id=$(_add_ublk_dev_no_settle -t fault_inject -q 1 -d 1 --delay_us 60000000 -r 1) + else + echo "Testing partition scan without recovery..." + dev_id=$(_add_ublk_dev_no_settle -t fault_inject -q 1 -d 1 --delay_us 60000000) + fi + + _check_add_dev "$TID" $? + + # The add command should return quickly because partition scan is async. + # Now sleep briefly to let the async partition scan work start and hit + # the delay in the fault_inject handler. + sleep 1 + + # Kill the ublk daemon while partition scan is potentially blocked + # And check state transitions properly + start_time=${SECONDS} + daemon_pid=$(_get_ublk_daemon_pid "${dev_id}") + state=$(__ublk_kill_daemon "${dev_id}" "${expected_state}") + elapsed=$((SECONDS - start_time)) + + # Verify the device transitioned to expected state + if [ "$state" != "${expected_state}" ]; then + echo "FAIL: Device state is $state, expected ${expected_state}" + ERR_CODE=255 + ${UBLK_PROG} del -n "${dev_id}" > /dev/null 2>&1 + return + fi + echo "PASS: Device transitioned to ${expected_state} in ${elapsed}s without hanging" + + # Clean up the device + ${UBLK_PROG} del -n "${dev_id}" > /dev/null 2>&1 +} + +_prep_test "partition_scan" "verify async partition scan prevents IO hang" + +# Test 1: Without recovery support - should transition to DEAD +_test_partition_scan_no_hang "no" "DEAD" + +# Test 2: With recovery support - should transition to QUIESCED +_test_partition_scan_no_hang "yes" "QUIESCED" + +_cleanup_test "partition_scan" +_show_result $TID $ERR_CODE From a9eaacd6dcd2f4ef963230f9e9d4149fa6d2d4d0 Mon Sep 17 00:00:00 2001 From: Ming Lei Date: Tue, 23 Dec 2025 11:27:42 +0800 Subject: [PATCH 0614/1321] selftests/ublk: fix Makefile to rebuild on header changes Add header dependencies to kublk build rule so that changes to kublk.h, ublk_dep.h, or utils.h trigger a rebuild. Signed-off-by: Ming Lei Signed-off-by: Jens Axboe --- tools/testing/selftests/ublk/Makefile | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/ublk/Makefile b/tools/testing/selftests/ublk/Makefile index eb0e6cfb00ad3..06ba6fde098de 100644 --- a/tools/testing/selftests/ublk/Makefile +++ b/tools/testing/selftests/ublk/Makefile @@ -51,10 +51,10 @@ TEST_PROGS += test_stress_07.sh TEST_GEN_PROGS_EXTENDED = kublk +LOCAL_HDRS += $(wildcard *.h) include ../lib.mk -$(TEST_GEN_PROGS_EXTENDED): kublk.c null.c file_backed.c common.c stripe.c \ - fault_inject.c +$(TEST_GEN_PROGS_EXTENDED): $(wildcard *.c) check: shellcheck -x -f gcc *.sh From 109e6deb3934f636ed8fbba73e744846f43185a2 Mon Sep 17 00:00:00 2001 From: Cong Zhang Date: Tue, 30 Dec 2025 17:17:05 +0800 Subject: [PATCH 0615/1321] blk-mq: skip CPU offline notify on unmapped hctx If an hctx has no software ctx mapped, blk_mq_map_swqueue() never allocates tags and leaves hctx->tags NULL. The CPU hotplug offline notifier can still run for that hctx, return early since hctx cannot hold any requests. Signed-off-by: Cong Zhang Fixes: bf0beec0607d ("blk-mq: drain I/O when all CPUs in a hctx are offline") Reviewed-by: Ming Lei Signed-off-by: Jens Axboe --- block/blk-mq.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 1978eef95dca3..eff4f72ce83be 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -3721,7 +3721,7 @@ static int blk_mq_hctx_notify_offline(unsigned int cpu, struct hlist_node *node) struct blk_mq_hw_ctx, cpuhp_online); int ret = 0; - if (blk_mq_hctx_has_online_cpu(hctx, cpu)) + if (!hctx->nr_ctx || blk_mq_hctx_has_online_cpu(hctx, cpu)) return 0; /* From bdd20a70e6c5d2c9b07d406dd4567b99984b778d Mon Sep 17 00:00:00 2001 From: Li Nan Date: Mon, 15 Dec 2025 20:44:12 +0800 Subject: [PATCH 0616/1321] md: Fix static checker warning in analyze_sbs The following warn is reported: drivers/md/md.c:3912 analyze_sbs() warn: iterator 'i' not incremented Fixes: d8730f0cf4ef ("md: Remove deprecated CONFIG_MD_MULTIPATH") Reported-by: Dan Carpenter Closes: https://lore.kernel.org/linux-raid/7e2e95ce-3740-09d8-a561-af6bfb767f18@huaweicloud.com/T/#t Signed-off-by: Li Nan Link: https://lore.kernel.org/linux-raid/20251215124412.4015572-1-linan666@huaweicloud.com Signed-off-by: Yu Kuai --- drivers/md/md.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/drivers/md/md.c b/drivers/md/md.c index e5922a6829532..03433c88fb54b 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -3882,7 +3882,6 @@ static struct md_rdev *md_import_device(dev_t newdev, int super_format, int supe static int analyze_sbs(struct mddev *mddev) { - int i; struct md_rdev *rdev, *freshest, *tmp; freshest = NULL; @@ -3909,11 +3908,9 @@ static int analyze_sbs(struct mddev *mddev) super_types[mddev->major_version]. validate_super(mddev, NULL/*freshest*/, freshest); - i = 0; rdev_for_each_safe(rdev, tmp, mddev) { if (mddev->max_disks && - (rdev->desc_nr >= mddev->max_disks || - i > mddev->max_disks)) { + rdev->desc_nr >= mddev->max_disks) { pr_warn("md: %s: %pg: only %d devices permitted\n", mdname(mddev), rdev->bdev, mddev->max_disks); From cf805f9c63c9209e8f27589ccbda2bcd48f21bec Mon Sep 17 00:00:00 2001 From: Tuo Li Date: Thu, 25 Dec 2025 21:03:26 +0800 Subject: [PATCH 0617/1321] md/raid5: fix possible null-pointer dereferences in raid5_store_group_thread_cnt() The variable mddev->private is first assigned to conf and then checked: conf = mddev->private; if (!conf) ... If conf is NULL, then mddev->private is also NULL. In this case, null-pointer dereferences can occur when calling raid5_quiesce(): raid5_quiesce(mddev, true); raid5_quiesce(mddev, false); since mddev->private is assigned to conf again in raid5_quiesce(), and conf is dereferenced in several places, for example: conf->quiesce = 0; wake_up(&conf->wait_for_quiescent); To fix this issue, the function should unlock mddev and return before invoking raid5_quiesce() when conf is NULL, following the existing pattern in raid5_change_consistency_policy(). Fixes: fa1944bbe622 ("md/raid5: Wait sync io to finish before changing group cnt") Signed-off-by: Tuo Li Reviewed-by: Xiao Ni Reviewed-by: Paul Menzel Link: https://lore.kernel.org/linux-raid/20251225130326.67780-1-islituo@gmail.com Signed-off-by: Yu Kuai --- drivers/md/raid5.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index e57ce3295292b..8dc98f545969f 100644 --- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c @@ -7187,12 +7187,14 @@ raid5_store_group_thread_cnt(struct mddev *mddev, const char *page, size_t len) err = mddev_suspend_and_lock(mddev); if (err) return err; + conf = mddev->private; + if (!conf) { + mddev_unlock_and_resume(mddev); + return -ENODEV; + } raid5_quiesce(mddev, true); - conf = mddev->private; - if (!conf) - err = -ENODEV; - else if (new != conf->worker_cnt_per_group) { + if (new != conf->worker_cnt_per_group) { old_groups = conf->worker_groups; if (old_groups) flush_workqueue(raid5_wq); From b3ddf6302b217369fad61f9140d9378932f00221 Mon Sep 17 00:00:00 2001 From: FengWei Shih Date: Fri, 26 Dec 2025 18:18:16 +0800 Subject: [PATCH 0618/1321] md: suspend array while updating raid_disks via sysfs In raid1_reshape(), freeze_array() is called before modifying the r1bio memory pool (conf->r1bio_pool) and conf->raid_disks, and unfreeze_array() is called after the update is completed. However, freeze_array() only waits until nr_sync_pending and (nr_pending - nr_queued) of all buckets reaches zero. When an I/O error occurs, nr_queued is increased and the corresponding r1bio is queued to either retry_list or bio_end_io_list. As a result, freeze_array() may unblock before these r1bios are released. This can lead to a situation where conf->raid_disks and the mempool have already been updated while queued r1bios, allocated with the old raid_disks value, are later released. Consequently, free_r1bio() may access memory out of bounds in put_all_bios() and release r1bios of the wrong size to the new mempool, potentially causing issues with the mempool as well. Since only normal I/O might increase nr_queued while an I/O error occurs, suspending the array avoids this issue. Note: Updating raid_disks via ioctl SET_ARRAY_INFO already suspends the array. Therefore, we suspend the array when updating raid_disks via sysfs to avoid this issue too. Signed-off-by: FengWei Shih Link: https://lore.kernel.org/linux-raid/20251226101816.4506-1-dannyshih@synology.com Signed-off-by: Yu Kuai --- drivers/md/md.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/md/md.c b/drivers/md/md.c index 03433c88fb54b..dda272e87a1b9 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -4404,7 +4404,7 @@ raid_disks_store(struct mddev *mddev, const char *buf, size_t len) if (err < 0) return err; - err = mddev_lock(mddev); + err = mddev_suspend_and_lock(mddev); if (err) return err; if (mddev->pers) @@ -4429,7 +4429,7 @@ raid_disks_store(struct mddev *mddev, const char *buf, size_t len) } else mddev->raid_disks = n; out_unlock: - mddev_unlock(mddev); + mddev_unlock_and_resume(mddev); return err ? err : len; } static struct md_sysfs_entry md_raid_disks = From 31c6cf9776ffb2b3f73389e6f3fb51dab72eda09 Mon Sep 17 00:00:00 2001 From: Li Nan Date: Fri, 26 Dec 2025 10:42:20 +0800 Subject: [PATCH 0619/1321] md: Fix logical_block_size configuration being overwritten In super_1_validate(), mddev->logical_block_size is directly overwritten with the value from metadata. This causes the previously configured lbs to be lost, making the configuration ineffective. Fix it. Fixes: 62ed1b582246 ("md: allow configuring logical block size") Signed-off-by: Li Nan Reviewed-by: Yu Kuai Reviewed-by: Xiao Ni Link: https://lore.kernel.org/linux-raid/20251226024221.724201-1-linan666@huaweicloud.com Signed-off-by: Yu Kuai --- drivers/md/md.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/md/md.c b/drivers/md/md.c index dda272e87a1b9..a13f92a64df6b 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -1999,7 +1999,6 @@ static int super_1_validate(struct mddev *mddev, struct md_rdev *freshest, struc mddev->layout = le32_to_cpu(sb->layout); mddev->raid_disks = le32_to_cpu(sb->raid_disks); mddev->dev_sectors = le64_to_cpu(sb->size); - mddev->logical_block_size = le32_to_cpu(sb->logical_block_size); mddev->events = ev1; mddev->bitmap_info.offset = 0; mddev->bitmap_info.space = 0; @@ -2015,6 +2014,9 @@ static int super_1_validate(struct mddev *mddev, struct md_rdev *freshest, struc mddev->max_disks = (4096-256)/2; + if (!mddev->logical_block_size) + mddev->logical_block_size = le32_to_cpu(sb->logical_block_size); + if ((le32_to_cpu(sb->feature_map) & MD_FEATURE_BITMAP_OFFSET) && mddev->bitmap_info.file == NULL) { mddev->bitmap_info.offset = From b6b26a406ad4cc7dfbd05aea80395b03201eb7a3 Mon Sep 17 00:00:00 2001 From: Li Nan Date: Fri, 26 Dec 2025 10:42:21 +0800 Subject: [PATCH 0620/1321] md: Fix forward incompatibility from configurable logical block size Commit 62ed1b582246 ("md: allow configuring logical block size") used reserved pad to add 'logical_block_size' to metadata. RAID rejects non-zero reserved pad, so arrays fail when rolling back to old kernels after booting new ones. Set 'logical_block_size' only for newly created arrays to support rollback to old kernels. Importantly new arrays still won't work on old kernels to prevent data loss issue from LBS changes. For arrays created on old kernels which confirmed not to rollback, configure LBS by echo current LBS (queue/logical_block_size) to md/logical_block_size. Fixes: 62ed1b582246 ("md: allow configuring logical block size") Reported-by: BugReports Closes: https://lore.kernel.org/linux-raid/825e532d-d1e1-44bb-5581-692b7c091796@huaweicloud.com/T/#t Signed-off-by: Li Nan Link: https://lore.kernel.org/linux-raid/20251226024221.724201-2-linan666@huaweicloud.com Signed-off-by: Yu Kuai --- drivers/md/md.c | 48 ++++++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 44 insertions(+), 4 deletions(-) diff --git a/drivers/md/md.c b/drivers/md/md.c index a13f92a64df6b..6d73f6e196a9f 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -5980,13 +5980,33 @@ lbs_store(struct mddev *mddev, const char *buf, size_t len) if (mddev->major_version == 0) return -EINVAL; - if (mddev->pers) - return -EBUSY; - err = kstrtouint(buf, 10, &lbs); if (err < 0) return -EINVAL; + if (mddev->pers) { + unsigned int curr_lbs; + + if (mddev->logical_block_size) + return -EBUSY; + /* + * To fix forward compatibility issues, LBS is not + * configured for arrays from old kernels (<=6.18) by default. + * If the user confirms no rollback to old kernels, + * enable LBS by writing current LBS — to prevent data + * loss from LBS changes. + */ + curr_lbs = queue_logical_block_size(mddev->gendisk->queue); + if (lbs != curr_lbs) + return -EINVAL; + + mddev->logical_block_size = curr_lbs; + set_bit(MD_SB_CHANGE_DEVS, &mddev->sb_flags); + pr_info("%s: logical block size configured successfully, array will not be assembled in old kernels (<= 6.18)\n", + mdname(mddev)); + return len; + } + err = mddev_lock(mddev); if (err) goto unlock; @@ -6162,7 +6182,27 @@ int mddev_stack_rdev_limits(struct mddev *mddev, struct queue_limits *lim, mdname(mddev)); return -EINVAL; } - mddev->logical_block_size = lim->logical_block_size; + + /* Only 1.x meta needs to set logical block size */ + if (mddev->major_version == 0) + return 0; + + /* + * Fix forward compatibility issue. Only set LBS by default for + * new arrays, mddev->events == 0 indicates the array was just + * created. When assembling an array, read LBS from the superblock + * instead — LBS is 0 in superblocks created by old kernels. + */ + if (!mddev->events) { + pr_info("%s: array will not be assembled in old kernels that lack configurable LBS support (<= 6.18)\n", + mdname(mddev)); + mddev->logical_block_size = lim->logical_block_size; + } + + if (!mddev->logical_block_size) + pr_warn("%s: echo current LBS to md/logical_block_size to prevent data loss issues from LBS changes.\n" + "\tNote: After setting, array will not be assembled in old kernels (<= 6.18)\n", + mdname(mddev)); return 0; } From 8e941dd46774345d221423093073c2da9d392f13 Mon Sep 17 00:00:00 2001 From: Julia Lawall Date: Wed, 31 Dec 2025 18:22:07 +0100 Subject: [PATCH 0621/1321] block, bfq: update outdated comment The function bfq_bfqq_may_idle() was renamed as bfq_better_to_idle() in commit 277a4a9b56cd ("block, bfq: give a better name to bfq_bfqq_may_idle"). Update the comment accordingly. Signed-off-by: Julia Lawall Signed-off-by: Jens Axboe --- block/bfq-iosched.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block/bfq-iosched.h b/block/bfq-iosched.h index 34a498e6b2a51..355a731e2c049 100644 --- a/block/bfq-iosched.h +++ b/block/bfq-iosched.h @@ -984,7 +984,7 @@ struct bfq_group_data { * unused for the root group. Used to know whether there * are groups with more than one active @bfq_entity * (see the comments to the function - * bfq_bfqq_may_idle()). + * bfq_better_to_idle()). * @rq_pos_tree: rbtree sorted by next_request position, used when * determining if two or more queues have interleaving * requests (see bfq_find_close_cooperator()). From 1ed49479e4714e7a185386f5a9db52b21ed50680 Mon Sep 17 00:00:00 2001 From: Yipeng Zou Date: Fri, 18 Aug 2023 09:32:26 +0800 Subject: [PATCH 0622/1321] selftests/ftrace: traceonoff_triggers: strip off names The func_traceonoff_triggers.tc sometimes goes to fail on my board, Kunpeng-920. [root@localhost]# ./ftracetest ./test.d/ftrace/func_traceonoff_triggers.tc -l fail.log === Ftrace unit tests === [1] ftrace - test for function traceon/off triggers [FAIL] [2] (instance) ftrace - test for function traceon/off triggers [UNSUPPORTED] I look up the log, and it shows that the md5sum is different between csum1 and csum2. ++ cnt=611 ++ sleep .1 +++ cnt_trace +++ grep -v '^#' trace +++ wc -l ++ cnt2=611 ++ '[' 611 -ne 611 ']' +++ cat tracing_on ++ on=0 ++ '[' 0 '!=' 0 ']' +++ md5sum trace ++ csum1='76896aa74362fff66a6a5f3cf8a8a500 trace' ++ sleep .1 +++ md5sum trace ++ csum2='ee8625a21c058818fc26e45c1ed3f6de trace' ++ '[' '76896aa74362fff66a6a5f3cf8a8a500 trace' '!=' 'ee8625a21c058818fc26e45c1ed3f6de trace' ']' ++ fail 'Tracing file is still changing' ++ echo Tracing file is still changing Tracing file is still changing ++ exit_fail ++ exit 1 So I directly dump the trace file before md5sum, the diff shows that: [root@localhost]# diff trace_1.log trace_2.log -y --suppress-common-lines dockerd-12285 [036] d.... 18385.510290: sched_stat | <...>-12285 [036] d.... 18385.510290: sched_stat dockerd-12285 [036] d.... 18385.510291: sched_swit | <...>-12285 [036] d.... 18385.510291: sched_swit <...>-740 [044] d.... 18385.602859: sched_stat | kworker/44:1-740 [044] d.... 18385.602859: sched_stat <...>-740 [044] d.... 18385.602860: sched_swit | kworker/44:1-740 [044] d.... 18385.602860: sched_swit And we can see that <...> filed be filled with names. We can strip off the names there to fix that. After strip off the names: kworker/u257:0-12 [019] d..2. 2528.758910: sched_stat | -12 [019] d..2. 2528.758910: sched_stat_runtime: comm=k kworker/u257:0-12 [019] d..2. 2528.758912: sched_swit | -12 [019] d..2. 2528.758912: sched_switch: prev_comm=kw -0 [000] d.s5. 2528.762318: sched_waki | -0 [000] d.s5. 2528.762318: sched_waking: comm=sshd pi -0 [037] dNh2. 2528.762326: sched_wake | -0 [037] dNh2. 2528.762326: sched_wakeup: comm=sshd pi -0 [037] d..2. 2528.762334: sched_swit | -0 [037] d..2. 2528.762334: sched_switch: prev_comm=sw Link: https://lore.kernel.org/r/20230818013226.2182299-1-zouyipeng@huawei.com Fixes: d87b29179aa0 ("selftests: ftrace: Use md5sum to take less time of checking logs") Suggested-by: Steven Rostedt (Google) Signed-off-by: Yipeng Zou Acked-by: Masami Hiramatsu (Google) Reviewed-by: Steven Rostedt (Google) Signed-off-by: Shuah Khan --- .../ftrace/test.d/ftrace/func_traceonoff_triggers.tc | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/ftrace/test.d/ftrace/func_traceonoff_triggers.tc b/tools/testing/selftests/ftrace/test.d/ftrace/func_traceonoff_triggers.tc index aee22289536b1..1b57771dbfdf0 100644 --- a/tools/testing/selftests/ftrace/test.d/ftrace/func_traceonoff_triggers.tc +++ b/tools/testing/selftests/ftrace/test.d/ftrace/func_traceonoff_triggers.tc @@ -90,9 +90,10 @@ if [ $on != "0" ]; then fail "Tracing is not off" fi -csum1=`md5sum trace` +# Cannot rely on names being around as they are only cached, strip them +csum1=`cat trace | sed -e 's/^ *[^ ]*\(-[0-9][0-9]*\)/\1/' | md5sum` sleep $SLEEP_TIME -csum2=`md5sum trace` +csum2=`cat trace | sed -e 's/^ *[^ ]*\(-[0-9][0-9]*\)/\1/' | md5sum` if [ "$csum1" != "$csum2" ]; then fail "Tracing file is still changing" From 4c7ce3acb594b6bf65a8e6cc31b24a7ddf6b37d2 Mon Sep 17 00:00:00 2001 From: Zheng Yejian Date: Wed, 10 May 2023 04:36:59 +0800 Subject: [PATCH 0623/1321] selftests/ftrace: Test toplevel-enable for instance 'available_events' is actually not required by 'test.d/event/toplevel-enable.tc' and its Existence has been tested in 'test.d/00basic/basic4.tc'. So the require of 'available_events' can be dropped and then we can add 'instance' flag to test 'test.d/event/toplevel-enable.tc' for instance. Test result show as below: # ./ftracetest test.d/event/toplevel-enable.tc === Ftrace unit tests === [1] event tracing - enable/disable with top level files [PASS] [2] (instance) event tracing - enable/disable with top level files [PASS] # of passed: 2 # of failed: 0 # of unresolved: 0 # of untested: 0 # of unsupported: 0 # of xfailed: 0 # of undefined(test bug): 0 Link: https://lore.kernel.org/r/20230509203659.1173917-1-zhengyejian1@huawei.com Signed-off-by: Zheng Yejian Acked-by: Steven Rostedt (Google) Signed-off-by: Shuah Khan --- tools/testing/selftests/ftrace/test.d/event/toplevel-enable.tc | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/tools/testing/selftests/ftrace/test.d/event/toplevel-enable.tc b/tools/testing/selftests/ftrace/test.d/event/toplevel-enable.tc index 93c10ea42a686..8b8e1aea985bc 100644 --- a/tools/testing/selftests/ftrace/test.d/event/toplevel-enable.tc +++ b/tools/testing/selftests/ftrace/test.d/event/toplevel-enable.tc @@ -1,7 +1,8 @@ #!/bin/sh # SPDX-License-Identifier: GPL-2.0 # description: event tracing - enable/disable with top level files -# requires: available_events set_event events/enable +# requires: set_event events/enable +# flags: instance do_reset() { echo > set_event From 67364de6f049ed6577e0609bf26918bb5cc3c7dd Mon Sep 17 00:00:00 2001 From: Wake Liu Date: Wed, 24 Dec 2025 16:41:20 +0800 Subject: [PATCH 0624/1321] kselftest/harness: Use helper to avoid zero-size memset warning When building kselftests with a toolchain that enables source fortification (e.g., Android's build environment, which uses -D_FORTIFY_SOURCE=3), a build failure occurs in tests that use an empty FIXTURE(). The root cause is that an empty fixture struct results in `sizeof(self_private)` evaluating to 0. The compiler's fortification checks then detect the `memset()` call with a compile-time constant size of 0, issuing a `-Wuser-defined-warnings` which is promoted to an error by `-Werror`. An initial attempt to guard the call with `if (sizeof(self_private) > 0)` was insufficient. The compiler's static analysis is aggressive enough to flag the `memset(..., 0)` pattern before evaluating the conditional, thus still triggering the error. To resolve this robustly, this change introduces a `static inline` helper function, `__kselftest_memset_safe()`. This function wraps the size check and the `memset()` call. By replacing the direct `memset()` in the `__TEST_F_IMPL` macro with a call to this helper, we create an abstraction boundary. This prevents the compiler's static analyzer from "seeing" the problematic pattern at the macro expansion site, resolving the build failure. Build Context: Compiler: Android (14488419, +pgo, +bolt, +lto, +mlgo, based on r584948) clang version 22.0.0 (https://android.googlesource.com/toolchain/llvm-project 2d65e4108033380e6fe8e08b1f1826cd2bfb0c99) Relevant Options: -O2 -Wall -Werror -D_FORTIFY_SOURCE=3 -target i686-linux-android10000 Test: m kselftest_futex_futex_requeue_pi Removed Gerrit Change-Id Shuah Khan Link: https://lore.kernel.org/r/20251224084120.249417-1-wakel@google.com Signed-off-by: Wake Liu Signed-off-by: Shuah Khan --- tools/testing/selftests/kselftest_harness.h | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/tools/testing/selftests/kselftest_harness.h b/tools/testing/selftests/kselftest_harness.h index baae6b7ded416..16a119a4656c7 100644 --- a/tools/testing/selftests/kselftest_harness.h +++ b/tools/testing/selftests/kselftest_harness.h @@ -70,6 +70,12 @@ #include "kselftest.h" +static inline void __kselftest_memset_safe(void *s, int c, size_t n) +{ + if (n > 0) + memset(s, c, n); +} + #define TEST_TIMEOUT_DEFAULT 30 /* Utilities exposed to the test definitions */ @@ -416,7 +422,7 @@ self = mmap(NULL, sizeof(*self), PROT_READ | PROT_WRITE, \ MAP_SHARED | MAP_ANONYMOUS, -1, 0); \ } else { \ - memset(&self_private, 0, sizeof(self_private)); \ + __kselftest_memset_safe(&self_private, 0, sizeof(self_private)); \ self = &self_private; \ } \ } \ From 6aa2c76f685781cd45011de781781bd2aba733f0 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Fri, 28 Nov 2025 13:37:28 -0400 Subject: [PATCH 0625/1321] RDMA/core: Check for the presence of LS_NLA_TYPE_DGID correctly The netlink response for RDMA_NL_LS_OP_IP_RESOLVE should always have a LS_NLA_TYPE_DGID attribute, it is invalid if it does not. Use the nl parsing logic properly and call nla_parse_deprecated() to fill the nlattrs array and then directly index that array to get the data for the DGID. Just fail if it is NULL. Remove the for loop searching for the nla, and squash the validation and parsing into one function. Fixes an uninitialized read from the stack triggered by userspace if it does not provide the DGID to a kernel initiated RDMA_NL_LS_OP_IP_RESOLVE query. BUG: KMSAN: uninit-value in hex_byte_pack include/linux/hex.h:13 [inline] BUG: KMSAN: uninit-value in ip6_string+0xef4/0x13a0 lib/vsprintf.c:1490 hex_byte_pack include/linux/hex.h:13 [inline] ip6_string+0xef4/0x13a0 lib/vsprintf.c:1490 ip6_addr_string+0x18a/0x3e0 lib/vsprintf.c:1509 ip_addr_string+0x245/0xee0 lib/vsprintf.c:1633 pointer+0xc09/0x1bd0 lib/vsprintf.c:2542 vsnprintf+0xf8a/0x1bd0 lib/vsprintf.c:2930 vprintk_store+0x3ae/0x1530 kernel/printk/printk.c:2279 vprintk_emit+0x307/0xcd0 kernel/printk/printk.c:2426 vprintk_default+0x3f/0x50 kernel/printk/printk.c:2465 vprintk+0x36/0x50 kernel/printk/printk_safe.c:82 _printk+0x17e/0x1b0 kernel/printk/printk.c:2475 ib_nl_process_good_ip_rsep drivers/infiniband/core/addr.c:128 [inline] ib_nl_handle_ip_res_resp+0x963/0x9d0 drivers/infiniband/core/addr.c:141 rdma_nl_rcv_msg drivers/infiniband/core/netlink.c:-1 [inline] rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline] rdma_nl_rcv+0xefa/0x11c0 drivers/infiniband/core/netlink.c:259 netlink_unicast_kernel net/netlink/af_netlink.c:1320 [inline] netlink_unicast+0xf04/0x12b0 net/netlink/af_netlink.c:1346 netlink_sendmsg+0x10b3/0x1250 net/netlink/af_netlink.c:1896 sock_sendmsg_nosec net/socket.c:714 [inline] __sock_sendmsg+0x333/0x3d0 net/socket.c:729 ____sys_sendmsg+0x7e0/0xd80 net/socket.c:2617 ___sys_sendmsg+0x271/0x3b0 net/socket.c:2671 __sys_sendmsg+0x1aa/0x300 net/socket.c:2703 __compat_sys_sendmsg net/compat.c:346 [inline] __do_compat_sys_sendmsg net/compat.c:353 [inline] __se_compat_sys_sendmsg net/compat.c:350 [inline] __ia32_compat_sys_sendmsg+0xa4/0x100 net/compat.c:350 ia32_sys_call+0x3f6c/0x4310 arch/x86/include/generated/asm/syscalls_32.h:371 do_syscall_32_irqs_on arch/x86/entry/syscall_32.c:83 [inline] __do_fast_syscall_32+0xb0/0x150 arch/x86/entry/syscall_32.c:306 do_fast_syscall_32+0x38/0x80 arch/x86/entry/syscall_32.c:331 do_SYSENTER_32+0x1f/0x30 arch/x86/entry/syscall_32.c:3 Link: https://patch.msgid.link/r/0-v1-3fbaef094271+2cf-rdma_op_ip_rslv_syz_jgg@nvidia.com Cc: stable@vger.kernel.org Fixes: ae43f8286730 ("IB/core: Add IP to GID netlink offload") Reported-by: syzbot+938fcd548c303fe33c1a@syzkaller.appspotmail.com Closes: https://lore.kernel.org/r/68dc3dac.a00a0220.102ee.004f.GAE@google.com Signed-off-by: Jason Gunthorpe --- drivers/infiniband/core/addr.c | 33 ++++++++++----------------------- 1 file changed, 10 insertions(+), 23 deletions(-) diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c index 61596cda2b65f..35ba852a172aa 100644 --- a/drivers/infiniband/core/addr.c +++ b/drivers/infiniband/core/addr.c @@ -80,37 +80,25 @@ static const struct nla_policy ib_nl_addr_policy[LS_NLA_TYPE_MAX] = { .min = sizeof(struct rdma_nla_ls_gid)}, }; -static inline bool ib_nl_is_good_ip_resp(const struct nlmsghdr *nlh) +static void ib_nl_process_ip_rsep(const struct nlmsghdr *nlh) { struct nlattr *tb[LS_NLA_TYPE_MAX] = {}; + union ib_gid gid; + struct addr_req *req; + int found = 0; int ret; if (nlh->nlmsg_flags & RDMA_NL_LS_F_ERR) - return false; + return; ret = nla_parse_deprecated(tb, LS_NLA_TYPE_MAX - 1, nlmsg_data(nlh), nlmsg_len(nlh), ib_nl_addr_policy, NULL); if (ret) - return false; - - return true; -} - -static void ib_nl_process_good_ip_rsep(const struct nlmsghdr *nlh) -{ - const struct nlattr *head, *curr; - union ib_gid gid; - struct addr_req *req; - int len, rem; - int found = 0; - - head = (const struct nlattr *)nlmsg_data(nlh); - len = nlmsg_len(nlh); + return; - nla_for_each_attr(curr, head, len, rem) { - if (curr->nla_type == LS_NLA_TYPE_DGID) - memcpy(&gid, nla_data(curr), nla_len(curr)); - } + if (!tb[LS_NLA_TYPE_DGID]) + return; + memcpy(&gid, nla_data(tb[LS_NLA_TYPE_DGID]), sizeof(gid)); spin_lock_bh(&lock); list_for_each_entry(req, &req_list, list) { @@ -137,8 +125,7 @@ int ib_nl_handle_ip_res_resp(struct sk_buff *skb, !(NETLINK_CB(skb).sk)) return -EPERM; - if (ib_nl_is_good_ip_resp(nlh)) - ib_nl_process_good_ip_rsep(nlh); + ib_nl_process_ip_rsep(nlh); return 0; } From 6bcfa0192b0516c216660131fcc38929ab767984 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Fri, 28 Nov 2025 20:53:21 -0400 Subject: [PATCH 0626/1321] RDMA/cm: Fix leaking the multicast GID table reference If the CM ID is destroyed while the CM event for multicast creating is still queued the cancel_work_sync() will prevent the work from running which also prevents destroying the ah_attr. This leaks a refcount and triggers a WARN: GID entry ref leak for dev syz1 index 2 ref=573 WARNING: CPU: 1 PID: 655 at drivers/infiniband/core/cache.c:809 release_gid_table drivers/infiniband/core/cache.c:806 [inline] WARNING: CPU: 1 PID: 655 at drivers/infiniband/core/cache.c:809 gid_table_release_one+0x284/0x3cc drivers/infiniband/core/cache.c:886 Destroy the ah_attr after canceling the work, it is safe to call this twice. Link: https://patch.msgid.link/r/0-v1-4285d070a6b2+20a-rdma_mc_gid_leak_syz_jgg@nvidia.com Cc: stable@vger.kernel.org Fixes: fe454dc31e84 ("RDMA/ucma: Fix use-after-free bug in ucma_create_uevent") Reported-by: syzbot+b0da83a6c0e2e2bddbd4@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/68232e7b.050a0220.f2294.09f6.GAE@google.com Signed-off-by: Jason Gunthorpe --- drivers/infiniband/core/cma.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c index 95e89f5c147c2..f00f1d3fbd9c5 100644 --- a/drivers/infiniband/core/cma.c +++ b/drivers/infiniband/core/cma.c @@ -2009,6 +2009,7 @@ static void destroy_mc(struct rdma_id_private *id_priv, ib_sa_free_multicast(mc->sa_mc); if (rdma_protocol_roce(id_priv->id.device, id_priv->id.port_num)) { + struct rdma_cm_event *event = &mc->iboe_join.event; struct rdma_dev_addr *dev_addr = &id_priv->id.route.addr.dev_addr; struct net_device *ndev = NULL; @@ -2031,6 +2032,8 @@ static void destroy_mc(struct rdma_id_private *id_priv, dev_put(ndev); cancel_work_sync(&mc->iboe_join.work); + if (event->event == RDMA_CM_EVENT_MULTICAST_JOIN) + rdma_destroy_ah_attr(&event->param.ud.ah_attr); } kfree(mc); } From af098dd47741e05de4bd24041eb888c267f17c11 Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Mon, 8 Dec 2025 14:33:05 +0100 Subject: [PATCH 0627/1321] RDMA/ucma: Fix rdma_ucm_query_ib_service_resp struct padding On a few 32-bit architectures, the newly added ib_user_service_rec structure is not 64-bit aligned the way it is on most regular ones. Add explicit padding into the rdma_ucm_query_ib_service_resp and rdma_ucm_resolve_ib_service structures that embed it, so that the layout is compatible across all of them. This is an ABI change on i386, aligning it with x86_64 and the other 64-bit architectures to avoid having to use a compat ioctl handler. Fixes: 810f874eda8e ("RDMA/ucma: Support query resolved service records") Link: https://patch.msgid.link/r/20251208133311.313977-1-arnd@kernel.org Signed-off-by: Arnd Bergmann Signed-off-by: Jason Gunthorpe --- include/uapi/rdma/rdma_user_cm.h | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/include/uapi/rdma/rdma_user_cm.h b/include/uapi/rdma/rdma_user_cm.h index 5ded174687ee0..838f8d4602560 100644 --- a/include/uapi/rdma/rdma_user_cm.h +++ b/include/uapi/rdma/rdma_user_cm.h @@ -192,6 +192,7 @@ struct rdma_ucm_query_path_resp { struct rdma_ucm_query_ib_service_resp { __u32 num_service_recs; + __u32 reserved; struct ib_user_service_rec recs[]; }; @@ -354,7 +355,7 @@ enum { #define RDMA_USER_CM_IB_SERVICE_NAME_SIZE 64 struct rdma_ucm_ib_service { - __u64 service_id; + __aligned_u64 service_id; __u8 service_name[RDMA_USER_CM_IB_SERVICE_NAME_SIZE]; __u32 flags; __u32 reserved; @@ -362,6 +363,7 @@ struct rdma_ucm_ib_service { struct rdma_ucm_resolve_ib_service { __u32 id; + __u32 reserved; struct rdma_ucm_ib_service ibs; }; From d91037060852312922242a8ed84388cb67b33f62 Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Mon, 8 Dec 2025 14:38:44 +0100 Subject: [PATCH 0628/1321] RDMA/irdma: Fix irdma_alloc_ucontext_resp padding A recent commit modified struct irdma_alloc_ucontext_resp by adding a member with implicit padding in front of it, though this does not change the offset of the data members other than m68k. Reported by scripts/check-uapi.sh: ==== ABI differences detected in include/rdma/irdma-abi.h from 1dd7bde2e91c -> HEAD ==== [C] 'struct irdma_alloc_ucontext_resp' changed: type size changed from 704 to 640 (in bits) 1 data member deletion: '__u8 rsvd3[2]', at offset 640 (in bits) at irdma-abi.h:61:1 1 data member insertion: '__u8 revd3[2]', at offset 592 (in bits) at irdma-abi.h:60:1 Change the size back to the previous version, and remove the implicit padding by making it explicit and matching what x86-64 would do by placing max_hw_srq_quanta member into a naturally aligned location. Fixes: 563e1feb5f6e ("RDMA/irdma: Add SRQ support") Link: https://patch.msgid.link/r/20251208133849.315451-1-arnd@kernel.org Signed-off-by: Arnd Bergmann Reviewed-by: Geert Uytterhoeven Tested-by: Jacob Moroni Signed-off-by: Jason Gunthorpe --- include/uapi/rdma/irdma-abi.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/uapi/rdma/irdma-abi.h b/include/uapi/rdma/irdma-abi.h index f7788d33376b8..36f20802bcc84 100644 --- a/include/uapi/rdma/irdma-abi.h +++ b/include/uapi/rdma/irdma-abi.h @@ -57,8 +57,8 @@ struct irdma_alloc_ucontext_resp { __u8 rsvd2; __aligned_u64 comp_mask; __u16 min_hw_wq_size; + __u8 revd3[2]; __u32 max_hw_srq_quanta; - __u8 rsvd3[2]; }; struct irdma_alloc_pd_resp { From aab1f377bb85bdbbbc8bf2b55735b08cde505863 Mon Sep 17 00:00:00 2001 From: Konstantin Taranov Date: Thu, 23 Oct 2025 03:03:00 -0700 Subject: [PATCH 0629/1321] RDMA/mana_ib: check cqe length for kernel CQs Check queue size during kernel CQ creation to prevent overflow of u32. Fixes: bec127e45d9f ("RDMA/mana_ib: create kernel-level CQs") Link: https://patch.msgid.link/r/1761213780-5457-1-git-send-email-kotaranov@linux.microsoft.com Signed-off-by: Konstantin Taranov Reviewed-by: Long Li Signed-off-by: Jason Gunthorpe --- drivers/infiniband/hw/mana/cq.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/infiniband/hw/mana/cq.c b/drivers/infiniband/hw/mana/cq.c index 1becc87791235..7600412b0739f 100644 --- a/drivers/infiniband/hw/mana/cq.c +++ b/drivers/infiniband/hw/mana/cq.c @@ -56,6 +56,10 @@ int mana_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr, doorbell = mana_ucontext->doorbell; } else { is_rnic_cq = true; + if (attr->cqe > U32_MAX / COMP_ENTRY_SIZE / 2 + 1) { + ibdev_dbg(ibdev, "CQE %d exceeding limit\n", attr->cqe); + return -EINVAL; + } buf_size = MANA_PAGE_ALIGN(roundup_pow_of_two(attr->cqe * COMP_ENTRY_SIZE)); cq->cqe = buf_size / COMP_ENTRY_SIZE; err = mana_ib_create_kernel_queue(mdev, buf_size, GDMA_CQ, &cq->queue); From 03f99f0302be73c12d4d3159cf0e4b2e38e21b64 Mon Sep 17 00:00:00 2001 From: Michal Schmidt Date: Thu, 27 Nov 2025 15:31:50 +0100 Subject: [PATCH 0630/1321] RDMA/irdma: avoid invalid read in irdma_net_event irdma_net_event() should not dereference anything from "neigh" (alias "ptr") until it has checked that the event is NETEVENT_NEIGH_UPDATE. Other events come with different structures pointed to by "ptr" and they may be smaller than struct neighbour. Move the read of neigh->dev under the NETEVENT_NEIGH_UPDATE case. The bug is mostly harmless, but it triggers KASAN on debug kernels: BUG: KASAN: stack-out-of-bounds in irdma_net_event+0x32e/0x3b0 [irdma] Read of size 8 at addr ffffc900075e07f0 by task kworker/27:2/542554 CPU: 27 PID: 542554 Comm: kworker/27:2 Kdump: loaded Not tainted 5.14.0-630.el9.x86_64+debug #1 Hardware name: [...] Workqueue: events rt6_probe_deferred Call Trace: dump_stack_lvl+0x60/0xb0 print_address_description.constprop.0+0x2c/0x3f0 print_report+0xb4/0x270 kasan_report+0x92/0xc0 irdma_net_event+0x32e/0x3b0 [irdma] notifier_call_chain+0x9e/0x180 atomic_notifier_call_chain+0x5c/0x110 rt6_do_redirect+0xb91/0x1080 tcp_v6_err+0xe9b/0x13e0 icmpv6_notify+0x2b2/0x630 ndisc_redirect_rcv+0x328/0x530 icmpv6_rcv+0xc16/0x1360 ip6_protocol_deliver_rcu+0xb84/0x12e0 ip6_input_finish+0x117/0x240 ip6_input+0xc4/0x370 ipv6_rcv+0x420/0x7d0 __netif_receive_skb_one_core+0x118/0x1b0 process_backlog+0xd1/0x5d0 __napi_poll.constprop.0+0xa3/0x440 net_rx_action+0x78a/0xba0 handle_softirqs+0x2d4/0x9c0 do_softirq+0xad/0xe0 Fixes: 915cc7ac0f8e ("RDMA/irdma: Add miscellaneous utility definitions") Link: https://patch.msgid.link/r/20251127143150.121099-1-mschmidt@redhat.com Signed-off-by: Michal Schmidt Signed-off-by: Jason Gunthorpe --- drivers/infiniband/hw/irdma/utils.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/infiniband/hw/irdma/utils.c b/drivers/infiniband/hw/irdma/utils.c index cc2a12f735d37..13d7499131d48 100644 --- a/drivers/infiniband/hw/irdma/utils.c +++ b/drivers/infiniband/hw/irdma/utils.c @@ -251,7 +251,7 @@ int irdma_net_event(struct notifier_block *notifier, unsigned long event, void *ptr) { struct neighbour *neigh = ptr; - struct net_device *real_dev, *netdev = (struct net_device *)neigh->dev; + struct net_device *real_dev, *netdev; struct irdma_device *iwdev; struct ib_device *ibdev; __be32 *p; @@ -260,6 +260,7 @@ int irdma_net_event(struct notifier_block *notifier, unsigned long event, switch (event) { case NETEVENT_NEIGH_UPDATE: + netdev = neigh->dev; real_dev = rdma_vlan_dev_real_dev(netdev); if (!real_dev) real_dev = netdev; From b5aeb38b754ac99b6da2c439931727f1434f55c8 Mon Sep 17 00:00:00 2001 From: Randy Dunlap Date: Fri, 28 Nov 2025 18:21:46 -0800 Subject: [PATCH 0631/1321] RTRS/rtrs: clean up rtrs headers kernel-doc Fix all (30+) kernel-doc warnings in rtrs.h and rtrs-pri.h. The changes are: - add ending ':' to enum member names - change enum description separators from '-' to ':' - add "struct" keyword to kernel-doc for structs where missing - fix enum names in enum rtrs_clt_con_type - add a '-' separator and drop the "()" in enum rtrs_clt_con_type - convert struct rtrs_addr to kernel-doc format - add missing struct member descriptions for struct rtrs_attrs Link: https://patch.msgid.link/r/20251129022146.1498273-1-rdunlap@infradead.org Signed-off-by: Randy Dunlap Acked-by: Jack Wang Signed-off-by: Jason Gunthorpe --- drivers/infiniband/ulp/rtrs/rtrs-pri.h | 32 +++++++++++++++++--------- drivers/infiniband/ulp/rtrs/rtrs.h | 24 +++++++++++-------- 2 files changed, 36 insertions(+), 20 deletions(-) diff --git a/drivers/infiniband/ulp/rtrs/rtrs-pri.h b/drivers/infiniband/ulp/rtrs/rtrs-pri.h index ef29bd483b5ad..59529d5938698 100644 --- a/drivers/infiniband/ulp/rtrs/rtrs-pri.h +++ b/drivers/infiniband/ulp/rtrs/rtrs-pri.h @@ -150,7 +150,7 @@ enum rtrs_msg_types { /** * enum rtrs_msg_flags - RTRS message flags. - * @RTRS_NEED_INVAL: Send invalidation in response. + * @RTRS_MSG_NEED_INVAL_F: Send invalidation in response. * @RTRS_MSG_NEW_RKEY_F: Send refreshed rkey in response. */ enum rtrs_msg_flags { @@ -179,16 +179,19 @@ struct rtrs_sg_desc { * @recon_cnt: Reconnections counter * @sess_uuid: UUID of a session (path) * @paths_uuid: UUID of a group of sessions (paths) - * + * @first_conn: %1 if the connection request is the first for that session, + * otherwise %0 * NOTE: max size 56 bytes, see man rdma_connect(). */ struct rtrs_msg_conn_req { - /* Is set to 0 by cma.c in case of AF_IB, do not touch that. - * see https://www.spinics.net/lists/linux-rdma/msg22397.html + /** + * @__cma_version: Is set to 0 by cma.c in case of AF_IB, do not touch + * that. See https://www.spinics.net/lists/linux-rdma/msg22397.html */ u8 __cma_version; - /* On sender side that should be set to 0, or cma_save_ip_info() - * extract garbage and will fail. + /** + * @__ip_version: On sender side that should be set to 0, or + * cma_save_ip_info() extract garbage and will fail. */ u8 __ip_version; __le16 magic; @@ -199,6 +202,7 @@ struct rtrs_msg_conn_req { uuid_t sess_uuid; uuid_t paths_uuid; u8 first_conn : 1; + /* private: */ u8 reserved_bits : 7; u8 reserved[11]; }; @@ -211,6 +215,7 @@ struct rtrs_msg_conn_req { * @queue_depth: max inflight messages (queue-depth) in this session * @max_io_size: max io size server supports * @max_hdr_size: max msg header size server supports + * @flags: RTRS message flags for this message * * NOTE: size is 56 bytes, max possible is 136 bytes, see man rdma_accept(). */ @@ -222,22 +227,24 @@ struct rtrs_msg_conn_rsp { __le32 max_io_size; __le32 max_hdr_size; __le32 flags; + /* private: */ u8 reserved[36]; }; /** - * struct rtrs_msg_info_req + * struct rtrs_msg_info_req - client additional info request * @type: @RTRS_MSG_INFO_REQ * @pathname: Path name chosen by client */ struct rtrs_msg_info_req { __le16 type; u8 pathname[NAME_MAX]; + /* private: */ u8 reserved[15]; }; /** - * struct rtrs_msg_info_rsp + * struct rtrs_msg_info_rsp - server additional info response * @type: @RTRS_MSG_INFO_RSP * @sg_cnt: Number of @desc entries * @desc: RDMA buffers where the client can write to server @@ -245,12 +252,14 @@ struct rtrs_msg_info_req { struct rtrs_msg_info_rsp { __le16 type; __le16 sg_cnt; + /* private: */ u8 reserved[4]; + /* public: */ struct rtrs_sg_desc desc[]; }; /** - * struct rtrs_msg_rkey_rsp + * struct rtrs_msg_rkey_rsp - server refreshed rkey response * @type: @RTRS_MSG_RKEY_RSP * @buf_id: RDMA buf_id of the new rkey * @rkey: new remote key for RDMA buffers id from server @@ -264,6 +273,7 @@ struct rtrs_msg_rkey_rsp { /** * struct rtrs_msg_rdma_read - RDMA data transfer request from client * @type: always @RTRS_MSG_READ + * @flags: RTRS message flags (enum rtrs_msg_flags) * @usr_len: length of user payload * @sg_cnt: number of @desc entries * @desc: RDMA buffers where the server can write the result to @@ -277,7 +287,7 @@ struct rtrs_msg_rdma_read { }; /** - * struct_msg_rdma_write - Message transferred to server with RDMA-Write + * struct rtrs_msg_rdma_write - Message transferred to server with RDMA-Write * @type: always @RTRS_MSG_WRITE * @usr_len: length of user payload */ @@ -287,7 +297,7 @@ struct rtrs_msg_rdma_write { }; /** - * struct_msg_rdma_hdr - header for read or write request + * struct rtrs_msg_rdma_hdr - header for read or write request * @type: @RTRS_MSG_WRITE | @RTRS_MSG_READ */ struct rtrs_msg_rdma_hdr { diff --git a/drivers/infiniband/ulp/rtrs/rtrs.h b/drivers/infiniband/ulp/rtrs/rtrs.h index b48b53a7c1435..b5bd35712de0b 100644 --- a/drivers/infiniband/ulp/rtrs/rtrs.h +++ b/drivers/infiniband/ulp/rtrs/rtrs.h @@ -24,8 +24,8 @@ struct rtrs_srv_op; /** * enum rtrs_clt_link_ev - Events about connectivity state of a client - * @RTRS_CLT_LINK_EV_RECONNECTED Client was reconnected. - * @RTRS_CLT_LINK_EV_DISCONNECTED Client was disconnected. + * @RTRS_CLT_LINK_EV_RECONNECTED: Client was reconnected. + * @RTRS_CLT_LINK_EV_DISCONNECTED: Client was disconnected. */ enum rtrs_clt_link_ev { RTRS_CLT_LINK_EV_RECONNECTED, @@ -33,7 +33,9 @@ enum rtrs_clt_link_ev { }; /** - * Source and destination address of a path to be established + * struct rtrs_addr - Source and destination address of a path to be established + * @src: source address + * @dst: destination address */ struct rtrs_addr { struct sockaddr_storage *src; @@ -41,7 +43,7 @@ struct rtrs_addr { }; /** - * rtrs_clt_ops - it holds the link event callback and private pointer. + * struct rtrs_clt_ops - it holds the link event callback and private pointer. * @priv: User supplied private data. * @link_ev: Event notification callback function for connection state changes * @priv: User supplied data that was passed to rtrs_clt_open() @@ -67,10 +69,10 @@ enum wait_type { }; /** - * enum rtrs_clt_con_type() type of ib connection to use with a given + * enum rtrs_clt_con_type - type of ib connection to use with a given * rtrs_permit - * @ADMIN_CON - use connection reserved for "service" messages - * @IO_CON - use a connection reserved for IO + * @RTRS_ADMIN_CON: use connection reserved for "service" messages + * @RTRS_IO_CON: use a connection reserved for IO */ enum rtrs_clt_con_type { RTRS_ADMIN_CON, @@ -85,7 +87,7 @@ void rtrs_clt_put_permit(struct rtrs_clt_sess *sess, struct rtrs_permit *permit); /** - * rtrs_clt_req_ops - it holds the request confirmation callback + * struct rtrs_clt_req_ops - it holds the request confirmation callback * and a private pointer. * @priv: User supplied private data. * @conf_fn: callback function to be called as confirmation @@ -105,7 +107,11 @@ int rtrs_clt_request(int dir, struct rtrs_clt_req_ops *ops, int rtrs_clt_rdma_cq_direct(struct rtrs_clt_sess *clt, unsigned int index); /** - * rtrs_attrs - RTRS session attributes + * struct rtrs_attrs - RTRS session attributes + * @queue_depth: queue_depth saved from rtrs_clt_sess message + * @max_io_size: max_io_size from rtrs_clt_sess message, capped to + * @max_segments * %SZ_4K + * @max_segments: max_segments saved from rtrs_clt_sess message */ struct rtrs_attrs { u32 queue_depth; From 7d7db2e9b54ccf6b669db0bf521ec1a9e7d9d74c Mon Sep 17 00:00:00 2001 From: Michael Margolin Date: Wed, 10 Dec 2025 17:36:56 +0000 Subject: [PATCH 0632/1321] RDMA/efa: Remove possible negative shift The page size used for device might in some cases be smaller than PAGE_SIZE what results in a negative shift when calculating the number of host pages in PAGE_SIZE for a debug log. Remove the debug line together with the calculation. Fixes: 40909f664d27 ("RDMA/efa: Add EFA verbs implementation") Link: https://patch.msgid.link/r/20251210173656.8180-1-mrgolin@amazon.com Reviewed-by: Tom Sela Reviewed-by: Yonatan Nachum Signed-off-by: Michael Margolin Reviewed-by: Gal Pressman Signed-off-by: Jason Gunthorpe --- drivers/infiniband/hw/efa/efa_verbs.c | 4 ---- 1 file changed, 4 deletions(-) diff --git a/drivers/infiniband/hw/efa/efa_verbs.c b/drivers/infiniband/hw/efa/efa_verbs.c index 22d3e25c3b9d1..755bba8d58bbc 100644 --- a/drivers/infiniband/hw/efa/efa_verbs.c +++ b/drivers/infiniband/hw/efa/efa_verbs.c @@ -1320,13 +1320,9 @@ static int umem_to_page_list(struct efa_dev *dev, u32 hp_cnt, u8 hp_shift) { - u32 pages_in_hp = BIT(hp_shift - PAGE_SHIFT); struct ib_block_iter biter; unsigned int hp_idx = 0; - ibdev_dbg(&dev->ibdev, "hp_cnt[%u], pages_in_hp[%u]\n", - hp_cnt, pages_in_hp); - rdma_umem_for_each_dma_block(umem, &biter, BIT(hp_shift)) page_list[hp_idx++] = rdma_block_iter_dma_address(&biter); From 07aacdf599037e7527e3c061353a592df7212aa5 Mon Sep 17 00:00:00 2001 From: Jang Ingyu Date: Fri, 19 Dec 2025 13:15:08 +0900 Subject: [PATCH 0633/1321] RDMA/core: Fix logic error in ib_get_gids_from_rdma_hdr() Fix missing comparison operator for RDMA_NETWORK_ROCE_V1 in the conditional statement. The constant was used directly instead of being compared with net_type, causing the condition to always evaluate to true. Fixes: 1c15b4f2a42f ("RDMA/core: Modify enum ib_gid_type and enum rdma_network_type") Signed-off-by: Jang Ingyu Link: https://patch.msgid.link/20251219041508.1725947-1-ingyujang25@korea.ac.kr Signed-off-by: Leon Romanovsky --- drivers/infiniband/core/verbs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c index 11b1a194de443..ee390928511ae 100644 --- a/drivers/infiniband/core/verbs.c +++ b/drivers/infiniband/core/verbs.c @@ -738,7 +738,7 @@ int ib_get_gids_from_rdma_hdr(const union rdma_network_hdr *hdr, (struct in6_addr *)dgid); return 0; } else if (net_type == RDMA_NETWORK_IPV6 || - net_type == RDMA_NETWORK_IB || RDMA_NETWORK_ROCE_V1) { + net_type == RDMA_NETWORK_IB || net_type == RDMA_NETWORK_ROCE_V1) { *dgid = hdr->ibgrh.dgid; *sgid = hdr->ibgrh.sgid; return 0; From 76a8f77c6f74aa88ec1202148cf6403ebfdb58b3 Mon Sep 17 00:00:00 2001 From: Alok Tiwari Date: Wed, 17 Dec 2025 02:01:41 -0800 Subject: [PATCH 0634/1321] RDMA/bnxt_re: Fix incorrect BAR check in bnxt_qplib_map_creq_db() RCFW_COMM_CONS_PCI_BAR_REGION is defined as BAR 2, so checking !creq_db->reg.bar_id is incorrect and always false. pci_resource_start() returns the BAR base address, and a value of 0 indicates that the BAR is unassigned. Update the condition to test bar_base == 0 instead. This ensures the driver detects and logs an error for an unassigned RCFW communication BAR. Fixes: cee0c7bba486 ("RDMA/bnxt_re: Refactor command queue management code") Signed-off-by: Alok Tiwari Link: https://patch.msgid.link/20251217100158.752504-1-alok.a.tiwari@oracle.com Reviewed-by: Kalesh AP Signed-off-by: Leon Romanovsky --- drivers/infiniband/hw/bnxt_re/qplib_rcfw.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c index 295a9610f3e67..4dad0cfcfa98d 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c +++ b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c @@ -1112,7 +1112,7 @@ static int bnxt_qplib_map_creq_db(struct bnxt_qplib_rcfw *rcfw, u32 reg_offt) creq_db->dbinfo.flags = 0; creq_db->reg.bar_id = RCFW_COMM_CONS_PCI_BAR_REGION; creq_db->reg.bar_base = pci_resource_start(pdev, creq_db->reg.bar_id); - if (!creq_db->reg.bar_id) + if (!creq_db->reg.bar_base) dev_err(&pdev->dev, "QPLIB: CREQ BAR region %d resc start is 0!", creq_db->reg.bar_id); From dc17c0b12827cb819a8cb9de3c22406b038436c7 Mon Sep 17 00:00:00 2001 From: Stefan Metzmacher Date: Fri, 19 Dec 2025 15:04:08 +0100 Subject: [PATCH 0635/1321] RDMA/rxe: let rxe_reclassify_recv_socket() call sk_owner_put() On kernels build with CONFIG_PROVE_LOCKING, CONFIG_MODULES and CONFIG_DEBUG_LOCK_ALLOC 'rmmod rdma_rxe' is no longer possible. For the global recv sockets rxe_net_exit() is where we call rxe_release_udp_tunnel-> udp_tunnel_sock_release(), which means the sockets are destroyed before 'rmmod rdma_rxe' finishes, so there's no need to protect against rxe_recv_slock_key and rxe_recv_sk_key disappearing while the sockets are still alive. Fixes: 80a85a771deb ("RDMA/rxe: reclassify sockets in order to avoid false positives from lockdep") Cc: Zhu Yanjun Cc: Jason Gunthorpe Cc: Leon Romanovsky Cc: Shinichiro Kawasaki Cc: linux-rdma@vger.kernel.org Cc: netdev@vger.kernel.org Cc: linux-cifs@vger.kernel.org Signed-off-by: Stefan Metzmacher Link: https://patch.msgid.link/20251219140408.2300163-1-metze@samba.org Reviewed-by: Zhu Yanjun Tested-by: Shin'ichiro Kawasaki Signed-off-by: Leon Romanovsky --- drivers/infiniband/sw/rxe/rxe_net.c | 32 +++++++++++++++++++++++++++++ 1 file changed, 32 insertions(+) diff --git a/drivers/infiniband/sw/rxe/rxe_net.c b/drivers/infiniband/sw/rxe/rxe_net.c index 0195d361e5e35..0bd0902b11f73 100644 --- a/drivers/infiniband/sw/rxe/rxe_net.c +++ b/drivers/infiniband/sw/rxe/rxe_net.c @@ -64,7 +64,39 @@ static inline void rxe_reclassify_recv_socket(struct socket *sock) break; default: WARN_ON_ONCE(1); + return; } + /* + * sock_lock_init_class_and_name() calls + * sk_owner_set(sk, THIS_MODULE); in order + * to make sure the referenced global + * variables rxe_recv_slock_key and + * rxe_recv_sk_key are not removed + * before the socket is closed. + * + * However this prevents rxe_net_exit() + * from being called and 'rmmod rdma_rxe' + * is refused because of the references. + * + * For the global sockets in recv_sockets, + * we are sure that rxe_net_exit() will call + * rxe_release_udp_tunnel -> udp_tunnel_sock_release. + * + * So we don't need the additional reference to + * our own (THIS_MODULE). + */ + sk_owner_put(sk); + /* + * We also call sk_owner_clear() otherwise + * sk_owner_put(sk) in sk_prot_free will + * fail, which is called via + * sk_free -> __sk_free -> sk_destruct + * and sk_destruct calls __sk_destruct + * directly or via call_rcu() + * so sk_prot_free() might be called + * after rxe_net_exit(). + */ + sk_owner_clear(sk); #endif /* CONFIG_DEBUG_LOCK_ALLOC */ } From b64f6892b66351eb6dc21c46aa97a65fa5d92014 Mon Sep 17 00:00:00 2001 From: Tetsuo Handa Date: Sat, 20 Dec 2025 11:11:33 +0900 Subject: [PATCH 0636/1321] RDMA/core: always drop device refcount in ib_del_sub_device_and_put() Since nldev_deldev() (introduced by commit 060c642b2ab8 ("RDMA/nldev: Add support to add/delete a sub IB device through netlink") grabs a reference using ib_device_get_by_index() before calling ib_del_sub_device_and_put(), we need to drop that reference before returning -EOPNOTSUPP error. Reported-by: syzbot+881d65229ca4f9ae8c84@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=881d65229ca4f9ae8c84 Fixes: bca51197620a ("RDMA/core: Support IB sub device with type "SMI"") Signed-off-by: Tetsuo Handa Link: https://patch.msgid.link/80749a85-cbe2-460c-8451-42516013f9fa@I-love.SAKURA.ne.jp Reviewed-by: Parav Pandit Signed-off-by: Leon Romanovsky --- drivers/infiniband/core/device.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c index 13e8a1714bbd7..1174ab7da6295 100644 --- a/drivers/infiniband/core/device.c +++ b/drivers/infiniband/core/device.c @@ -2881,8 +2881,10 @@ int ib_del_sub_device_and_put(struct ib_device *sub) { struct ib_device *parent = sub->parent; - if (!parent) + if (!parent) { + ib_device_put(sub); return -EOPNOTSUPP; + } mutex_lock(&parent->subdev_lock); list_del(&sub->subdev_list); From d74aaf11fde627964da75f320d07d0d14034db64 Mon Sep 17 00:00:00 2001 From: Alok Tiwari Date: Fri, 19 Dec 2025 01:32:57 -0800 Subject: [PATCH 0637/1321] RDMA/bnxt_re: Fix IB_SEND_IP_CSUM handling in post_send The bnxt_re SEND path checks wr->send_flags to enable features such as IP checksum offload. However, send_flags is a bitmask and may contain multiple flags (e.g. IB_SEND_SIGNALED | IB_SEND_IP_CSUM), while the existing code uses a switch() statement that only matches when send_flags is exactly IB_SEND_IP_CSUM. As a result, checksum offload is not enabled when additional SEND flags are present. Replace the switch() with a bitmask test: if (wr->send_flags & IB_SEND_IP_CSUM) This ensures IP checksum offload is enabled correctly when multiple SEND flags are used. Fixes: 1ac5a4047975 ("RDMA/bnxt_re: Add bnxt_re RoCE driver") Signed-off-by: Alok Tiwari Link: https://patch.msgid.link/20251219093308.2415620-1-alok.a.tiwari@oracle.com Reviewed-by: Kalesh AP Signed-off-by: Leon Romanovsky --- drivers/infiniband/hw/bnxt_re/ib_verbs.c | 7 +------ 1 file changed, 1 insertion(+), 6 deletions(-) diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c index f19b55c13d580..ff91511bd3389 100644 --- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c +++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c @@ -2919,14 +2919,9 @@ int bnxt_re_post_send(struct ib_qp *ib_qp, const struct ib_send_wr *wr, wqe.rawqp1.lflags |= SQ_SEND_RAWETH_QP1_LFLAGS_ROCE_CRC; } - switch (wr->send_flags) { - case IB_SEND_IP_CSUM: + if (wr->send_flags & IB_SEND_IP_CSUM) wqe.rawqp1.lflags |= SQ_SEND_RAWETH_QP1_LFLAGS_IP_CHKSUM; - break; - default: - break; - } fallthrough; case IB_WR_SEND_WITH_INV: rc = bnxt_re_build_send_wqe(qp, wr, &wqe); From 228f569983762953fef3811298fac4db7f916c26 Mon Sep 17 00:00:00 2001 From: Ding Hui Date: Mon, 8 Dec 2025 15:21:10 +0800 Subject: [PATCH 0638/1321] RDMA/bnxt_re: Fix OOB write in bnxt_re_copy_err_stats() Commit ef56081d1864 ("RDMA/bnxt_re: RoCE related hardware counters update") added three new counters and placed them after BNXT_RE_OUT_OF_SEQ_ERR. BNXT_RE_OUT_OF_SEQ_ERR acts as a boundary marker for allocating hardware statistics with different num_counters values on chip_gen_p5_p7 devices. As a result, BNXT_RE_NUM_STD_COUNTERS are used when allocating hw_stats, which leads to an out-of-bounds write in bnxt_re_copy_err_stats(). The counters BNXT_RE_REQ_CQE_ERROR, BNXT_RE_RESP_CQE_ERROR, and BNXT_RE_RESP_REMOTE_ACCESS_ERRS are applicable to generic hardware, not only p5/p7 devices. Fix this by moving these counters before BNXT_RE_OUT_OF_SEQ_ERR so they are included in the generic counter set. Fixes: ef56081d1864 ("RDMA/bnxt_re: RoCE related hardware counters update") Reported-by: Yingying Zheng Signed-off-by: Ding Hui Link: https://patch.msgid.link/20251208072110.28874-1-dinghui@sangfor.com.cn Reviewed-by: Kalesh AP Tested-by: Kalesh AP Signed-off-by: Leon Romanovsky --- drivers/infiniband/hw/bnxt_re/hw_counters.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/infiniband/hw/bnxt_re/hw_counters.h b/drivers/infiniband/hw/bnxt_re/hw_counters.h index 09d371d442aa7..cebec033f4a01 100644 --- a/drivers/infiniband/hw/bnxt_re/hw_counters.h +++ b/drivers/infiniband/hw/bnxt_re/hw_counters.h @@ -89,6 +89,9 @@ enum bnxt_re_hw_stats { BNXT_RE_RES_SRQ_LOAD_ERR, BNXT_RE_RES_TX_PCI_ERR, BNXT_RE_RES_RX_PCI_ERR, + BNXT_RE_REQ_CQE_ERROR, + BNXT_RE_RESP_CQE_ERROR, + BNXT_RE_RESP_REMOTE_ACCESS_ERRS, BNXT_RE_OUT_OF_SEQ_ERR, BNXT_RE_TX_ATOMIC_REQ, BNXT_RE_TX_READ_REQ, @@ -110,9 +113,6 @@ enum bnxt_re_hw_stats { BNXT_RE_TX_CNP, BNXT_RE_RX_CNP, BNXT_RE_RX_ECN, - BNXT_RE_REQ_CQE_ERROR, - BNXT_RE_RESP_CQE_ERROR, - BNXT_RE_RESP_REMOTE_ACCESS_ERRS, BNXT_RE_NUM_EXT_COUNTERS }; From a00ecc09e6ad5925b06f02093664731fbbedaa35 Mon Sep 17 00:00:00 2001 From: Kalesh AP Date: Tue, 23 Dec 2025 18:48:55 +0530 Subject: [PATCH 0639/1321] RDMA/bnxt_re: Fix to use correct page size for PDE table In bnxt_qplib_alloc_init_hwq(), while allocating memory for PDE table driver incorrectly is using the "pg_size" value passed to the function. Fixed to use the right value 4K. Also, fixed the allocation size for PBL table. Fixes: 0c4dcd602817 ("RDMA/bnxt_re: Refactor hardware queue memory allocation") Signed-off-by: Damodharam Ammepalli Signed-off-by: Kalesh AP Link: https://patch.msgid.link/20251223131855.145955-1-kalesh-anakkur.purayil@broadcom.com Reviewed-by: Selvin Xavier Signed-off-by: Leon Romanovsky --- drivers/infiniband/hw/bnxt_re/qplib_res.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/hw/bnxt_re/qplib_res.c b/drivers/infiniband/hw/bnxt_re/qplib_res.c index 875d7b52c06ab..d5c12a51aa438 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_res.c +++ b/drivers/infiniband/hw/bnxt_re/qplib_res.c @@ -237,7 +237,7 @@ int bnxt_qplib_alloc_init_hwq(struct bnxt_qplib_hwq *hwq, if (npbl % BIT(MAX_PDL_LVL_SHIFT)) npde++; /* Alloc PDE pages */ - sginfo.pgsize = npde * pg_size; + sginfo.pgsize = npde * ROCE_PG_SIZE_4K; sginfo.npages = 1; rc = __alloc_pbl(res, &hwq->pbl[PBL_LVL_0], &sginfo); if (rc) @@ -245,7 +245,7 @@ int bnxt_qplib_alloc_init_hwq(struct bnxt_qplib_hwq *hwq, /* Alloc PBL pages */ sginfo.npages = npbl; - sginfo.pgsize = PAGE_SIZE; + sginfo.pgsize = ROCE_PG_SIZE_4K; rc = __alloc_pbl(res, &hwq->pbl[PBL_LVL_1], &sginfo); if (rc) goto fail; From d9caefb96009f8c9ba6f09c906284a6555bd04e1 Mon Sep 17 00:00:00 2001 From: Li Zhijian Date: Fri, 26 Dec 2025 17:41:12 +0800 Subject: [PATCH 0640/1321] IB/rxe: Fix missing umem_odp->umem_mutex unlock on error path rxe_odp_map_range_and_lock() must release umem_odp->umem_mutex when an error occurs, including cases where rxe_check_pagefault() fails. Fixes: 2fae67ab63db ("RDMA/rxe: Add support for Send/Recv/Write/Read with ODP") Signed-off-by: Li Zhijian Link: https://patch.msgid.link/20251226094112.3042583-1-lizhijian@fujitsu.com Reviewed-by: Zhu Yanjun Signed-off-by: Leon Romanovsky --- drivers/infiniband/sw/rxe/rxe_odp.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/infiniband/sw/rxe/rxe_odp.c b/drivers/infiniband/sw/rxe/rxe_odp.c index ae71812bea82c..c928cbf2e35f8 100644 --- a/drivers/infiniband/sw/rxe/rxe_odp.c +++ b/drivers/infiniband/sw/rxe/rxe_odp.c @@ -179,8 +179,10 @@ static int rxe_odp_map_range_and_lock(struct rxe_mr *mr, u64 iova, int length, u return err; need_fault = rxe_check_pagefault(umem_odp, iova, length); - if (need_fault) + if (need_fault) { + mutex_unlock(&umem_odp->umem_mutex); return -EFAULT; + } } return 0; From 7f204bd71f23d6b226f3203282e028f828e763c7 Mon Sep 17 00:00:00 2001 From: Honggang LI Date: Mon, 29 Dec 2025 10:56:17 +0800 Subject: [PATCH 0641/1321] RDMA/rtrs: Fix clt_path::max_pages_per_mr calculation If device max_mr_size bits in the range [mr_page_shift+31:mr_page_shift] are zero, the `min3` function will set clt_path::max_pages_per_mr to zero. `alloc_path_reqs` will pass zero, which is invalid, as the third parameter to `ib_alloc_mr`. Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality") Signed-off-by: Honggang LI Link: https://patch.msgid.link/20251229025617.13241-1-honggangli@163.com Signed-off-by: Leon Romanovsky --- drivers/infiniband/ulp/rtrs/rtrs-clt.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c index 71387811b2815..2b397a544cb93 100644 --- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c +++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c @@ -1464,6 +1464,7 @@ static void query_fast_reg_mode(struct rtrs_clt_path *clt_path) mr_page_shift = max(12, ffs(ib_dev->attrs.page_size_cap) - 1); max_pages_per_mr = ib_dev->attrs.max_mr_size; do_div(max_pages_per_mr, (1ull << mr_page_shift)); + max_pages_per_mr = min_not_zero((u32)max_pages_per_mr, U32_MAX); clt_path->max_pages_per_mr = min3(clt_path->max_pages_per_mr, (u32)max_pages_per_mr, ib_dev->attrs.max_fast_reg_page_list_len); From fe8c96affca8441a7f65a83be6e75c36eb7c486e Mon Sep 17 00:00:00 2001 From: Thomas Fourier Date: Tue, 30 Dec 2025 09:51:21 +0100 Subject: [PATCH 0642/1321] RDMA/bnxt_re: fix dma_free_coherent() pointer The dma_alloc_coherent() allocates a dma-mapped buffer, pbl->pg_arr[i]. The dma_free_coherent() should pass the same buffer to dma_free_coherent() and not page-aligned. Fixes: 1ac5a4047975 ("RDMA/bnxt_re: Add bnxt_re RoCE driver") Signed-off-by: Thomas Fourier Link: https://patch.msgid.link/20251230085121.8023-2-fourier.thomas@gmail.com Signed-off-by: Leon Romanovsky --- drivers/infiniband/hw/bnxt_re/qplib_res.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/infiniband/hw/bnxt_re/qplib_res.c b/drivers/infiniband/hw/bnxt_re/qplib_res.c index d5c12a51aa438..4d674a3aee1aa 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_res.c +++ b/drivers/infiniband/hw/bnxt_re/qplib_res.c @@ -64,9 +64,7 @@ static void __free_pbl(struct bnxt_qplib_res *res, struct bnxt_qplib_pbl *pbl, for (i = 0; i < pbl->pg_count; i++) { if (pbl->pg_arr[i]) dma_free_coherent(&pdev->dev, pbl->pg_size, - (void *)((unsigned long) - pbl->pg_arr[i] & - PAGE_MASK), + pbl->pg_arr[i], pbl->pg_map_arr[i]); else dev_warn(&pdev->dev, From d94c195165ef4b85e3b2382f49585b941a174a79 Mon Sep 17 00:00:00 2001 From: David Gow Date: Fri, 19 Dec 2025 16:52:58 +0800 Subject: [PATCH 0643/1321] kunit: Enforce task execution in {soft,hard}irq contexts The kunit_run_irq_test() helper allows a function to be run in hardirq and softirq contexts (in addition to the task context). It does this by running the user-provided function concurrently in the three contexts, until either a timeout has expired or a number of iterations have completed in the normal task context. However, on setups where the initialisation of the hardirq and softirq contexts (or, indeed, the scheduling of those tasks) is significantly slower than the function execution, it's possible for that number of iterations to be exceeded before any runs in irq contexts actually occur. This occurs with the polyval.test_polyval_preparekey_in_irqs test, which runs 20000 iterations of the relatively fast preparekey function, and therefore fails often under many UML, 32-bit arm, m68k and other environments. Instead, ensure that the max_iterations limit counts executions in all three contexts, and requires at least one of each. This will cause the test to continue iterating until at least the irq contexts have been tested, or the 1s wall-clock limit has been exceeded. This causes the test to pass in all of my environments. In so doing, we also update the task counters to atomic ints, to better match both the 'int' max_iterations input, and to ensure they are correctly updated across contexts. Finally, we also fix a few potential assertion messages to be less-specific to the original crypto usecases. Fixes: 950a81224e8b ("lib/crypto: tests: Add hash-test-template.h and gen-hash-testvecs.py") Signed-off-by: David Gow Link: https://lore.kernel.org/r/20251219085259.1163048-1-davidgow@google.com Signed-off-by: Eric Biggers --- include/kunit/run-in-irq-context.h | 53 +++++++++++++++++++----------- 1 file changed, 33 insertions(+), 20 deletions(-) diff --git a/include/kunit/run-in-irq-context.h b/include/kunit/run-in-irq-context.h index 108e96433ea45..c89b1b1b12dd5 100644 --- a/include/kunit/run-in-irq-context.h +++ b/include/kunit/run-in-irq-context.h @@ -20,8 +20,8 @@ struct kunit_irq_test_state { bool task_func_reported_failure; bool hardirq_func_reported_failure; bool softirq_func_reported_failure; - unsigned long hardirq_func_calls; - unsigned long softirq_func_calls; + atomic_t hardirq_func_calls; + atomic_t softirq_func_calls; struct hrtimer timer; struct work_struct bh_work; }; @@ -32,7 +32,7 @@ static enum hrtimer_restart kunit_irq_test_timer_func(struct hrtimer *timer) container_of(timer, typeof(*state), timer); WARN_ON_ONCE(!in_hardirq()); - state->hardirq_func_calls++; + atomic_inc(&state->hardirq_func_calls); if (!state->func(state->test_specific_state)) state->hardirq_func_reported_failure = true; @@ -48,7 +48,7 @@ static void kunit_irq_test_bh_work_func(struct work_struct *work) container_of(work, typeof(*state), bh_work); WARN_ON_ONCE(!in_serving_softirq()); - state->softirq_func_calls++; + atomic_inc(&state->softirq_func_calls); if (!state->func(state->test_specific_state)) state->softirq_func_reported_failure = true; @@ -59,7 +59,10 @@ static void kunit_irq_test_bh_work_func(struct work_struct *work) * hardirq context concurrently, and reports a failure to KUnit if any * invocation of @func in any context returns false. @func is passed * @test_specific_state as its argument. At most 3 invocations of @func will - * run concurrently: one in each of task, softirq, and hardirq context. + * run concurrently: one in each of task, softirq, and hardirq context. @func + * will continue running until either @max_iterations calls have been made (so + * long as at least one each runs in task, softirq, and hardirq contexts), or + * one second has passed. * * The main purpose of this interrupt context testing is to validate fallback * code paths that run in contexts where the normal code path cannot be used, @@ -85,6 +88,8 @@ static inline void kunit_run_irq_test(struct kunit *test, bool (*func)(void *), .test_specific_state = test_specific_state, }; unsigned long end_jiffies; + int hardirq_calls, softirq_calls; + bool allctx = false; /* * Set up a hrtimer (the way we access hardirq context) and a work @@ -94,14 +99,25 @@ static inline void kunit_run_irq_test(struct kunit *test, bool (*func)(void *), CLOCK_MONOTONIC, HRTIMER_MODE_REL_HARD); INIT_WORK_ONSTACK(&state.bh_work, kunit_irq_test_bh_work_func); - /* Run for up to max_iterations or 1 second, whichever comes first. */ + /* + * Run for up to max_iterations (including at least one task, softirq, + * and hardirq), or 1 second, whichever comes first. + */ end_jiffies = jiffies + HZ; hrtimer_start(&state.timer, KUNIT_IRQ_TEST_HRTIMER_INTERVAL, HRTIMER_MODE_REL_HARD); - for (int i = 0; i < max_iterations && !time_after(jiffies, end_jiffies); - i++) { + for (int task_calls = 0, calls = 0; + ((calls < max_iterations) || !allctx) && + !time_after(jiffies, end_jiffies); + task_calls++) { if (!func(test_specific_state)) state.task_func_reported_failure = true; + + hardirq_calls = atomic_read(&state.hardirq_func_calls); + softirq_calls = atomic_read(&state.softirq_func_calls); + calls = task_calls + hardirq_calls + softirq_calls; + allctx = (task_calls > 0) && (hardirq_calls > 0) && + (softirq_calls > 0); } /* Cancel the timer and work. */ @@ -109,21 +125,18 @@ static inline void kunit_run_irq_test(struct kunit *test, bool (*func)(void *), flush_work(&state.bh_work); /* Sanity check: the timer and BH functions should have been run. */ - KUNIT_EXPECT_GT_MSG(test, state.hardirq_func_calls, 0, + KUNIT_EXPECT_GT_MSG(test, atomic_read(&state.hardirq_func_calls), 0, "Timer function was not called"); - KUNIT_EXPECT_GT_MSG(test, state.softirq_func_calls, 0, + KUNIT_EXPECT_GT_MSG(test, atomic_read(&state.softirq_func_calls), 0, "BH work function was not called"); - /* Check for incorrect hash values reported from any context. */ - KUNIT_EXPECT_FALSE_MSG( - test, state.task_func_reported_failure, - "Incorrect hash values reported from task context"); - KUNIT_EXPECT_FALSE_MSG( - test, state.hardirq_func_reported_failure, - "Incorrect hash values reported from hardirq context"); - KUNIT_EXPECT_FALSE_MSG( - test, state.softirq_func_reported_failure, - "Incorrect hash values reported from softirq context"); + /* Check for failure reported from any context. */ + KUNIT_EXPECT_FALSE_MSG(test, state.task_func_reported_failure, + "Failure reported from task context"); + KUNIT_EXPECT_FALSE_MSG(test, state.hardirq_func_reported_failure, + "Failure reported from hardirq context"); + KUNIT_EXPECT_FALSE_MSG(test, state.softirq_func_reported_failure, + "Failure reported from softirq context"); } #endif /* _KUNIT_RUN_IN_IRQ_CONTEXT_H */ From 5b7c02f6fbbb5ea398c7bedc7471927412fcc3f0 Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Fri, 26 Dec 2025 14:50:57 +0100 Subject: [PATCH 0644/1321] PM: sleep: Fix suspend_test() at the TEST_CORE level Commit a10ad1b10402 ("PM: suspend: Make pm_test delay interruptible by wakeup events") replaced mdelay() in suspend_test() with msleep() which does not work at the TEST_CORE test level that calls suspend_test() while running on one CPU with interrupts off. Address this by making suspend_test() check if the test level is suitable for using msleep() and use mdelay() otherwise. Fixes: a10ad1b10402 ("PM: suspend: Make pm_test delay interruptible by wakeup events") Reported-by: Sebastian Reichel Closes: https://lore.kernel.org/linux-pm/aUsAk0k1N9hw8IkY@venus/ Signed-off-by: Rafael J. Wysocki Tested-by: Sebastian Reichel Link: https://patch.msgid.link/6251576.lOV4Wx5bFT@rafael.j.wysocki --- kernel/power/suspend.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c index 2da4482bb6eb8..57c44268698f7 100644 --- a/kernel/power/suspend.c +++ b/kernel/power/suspend.c @@ -349,9 +349,12 @@ static int suspend_test(int level) if (pm_test_level == level) { pr_info("suspend debug: Waiting for %d second(s).\n", pm_test_delay); - for (i = 0; i < pm_test_delay && !pm_wakeup_pending(); i++) - msleep(1000); - + for (i = 0; i < pm_test_delay && !pm_wakeup_pending(); i++) { + if (level > TEST_CORE) + msleep(1000); + else + mdelay(1000); + } return 1; } #endif /* !CONFIG_PM_DEBUG */ From 53c9bc5020e512b57916882b717852d20f95af42 Mon Sep 17 00:00:00 2001 From: Namhyung Kim Date: Wed, 3 Dec 2025 15:29:23 -0800 Subject: [PATCH 0645/1321] tools/build: Add a feature test for libopenssl It's used by bpftool and the kernel build. Let's add a feature test so that perf can decide what to do based on the availability. Signed-off-by: Namhyung Kim --- tools/build/Makefile.feature | 6 ++++-- tools/build/feature/Makefile | 8 ++++++-- tools/build/feature/test-all.c | 5 +++++ tools/build/feature/test-libopenssl.c | 7 +++++++ 4 files changed, 22 insertions(+), 4 deletions(-) create mode 100644 tools/build/feature/test-libopenssl.c diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature index a7f030fc5e835..362cf8f4a0a02 100644 --- a/tools/build/Makefile.feature +++ b/tools/build/Makefile.feature @@ -99,7 +99,8 @@ FEATURE_TESTS_BASIC := \ libzstd \ disassembler-four-args \ disassembler-init-styled \ - file-handle + file-handle \ + libopenssl # FEATURE_TESTS_BASIC + FEATURE_TESTS_EXTRA is the complete list # of all feature tests @@ -147,7 +148,8 @@ FEATURE_DISPLAY ?= \ lzma \ bpf \ libaio \ - libzstd + libzstd \ + libopenssl # # Declare group members of a feature to display the logical OR of the detection diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile index 87a5a908d6fae..c699d4f4c6d93 100644 --- a/tools/build/feature/Makefile +++ b/tools/build/feature/Makefile @@ -67,12 +67,13 @@ FILES= \ test-libopencsd.bin \ test-clang.bin \ test-llvm.bin \ - test-llvm-perf.bin \ + test-llvm-perf.bin \ test-libaio.bin \ test-libzstd.bin \ test-clang-bpf-co-re.bin \ test-file-handle.bin \ - test-libpfm4.bin + test-libpfm4.bin \ + test-libopenssl.bin FILES := $(addprefix $(OUTPUT),$(FILES)) @@ -381,6 +382,9 @@ $(OUTPUT)test-file-handle.bin: $(OUTPUT)test-libpfm4.bin: $(BUILD) -lpfm +$(OUTPUT)test-libopenssl.bin: + $(BUILD) -lssl + $(OUTPUT)test-bpftool-skeletons.bin: $(SYSTEM_BPFTOOL) version | grep '^features:.*skeletons' \ > $(@:.bin=.make.output) 2>&1 diff --git a/tools/build/feature/test-all.c b/tools/build/feature/test-all.c index eb346160d0ba0..1488bf6e60783 100644 --- a/tools/build/feature/test-all.c +++ b/tools/build/feature/test-all.c @@ -142,6 +142,10 @@ # include "test-libtraceevent.c" #undef main +#define main main_test_libopenssl +# include "test-libopenssl.c" +#undef main + int main(int argc, char *argv[]) { main_test_libpython(); @@ -173,6 +177,7 @@ int main(int argc, char *argv[]) main_test_reallocarray(); main_test_libzstd(); main_test_libtraceevent(); + main_test_libopenssl(); return 0; } diff --git a/tools/build/feature/test-libopenssl.c b/tools/build/feature/test-libopenssl.c new file mode 100644 index 0000000000000..168c45894e8be --- /dev/null +++ b/tools/build/feature/test-libopenssl.c @@ -0,0 +1,7 @@ +#include +#include + +int main(void) +{ + return SSL_library_init(); +} From 0b589921f08b37a241b09a3a2579ea6f2ea2f55d Mon Sep 17 00:00:00 2001 From: Namhyung Kim Date: Wed, 3 Dec 2025 15:29:24 -0800 Subject: [PATCH 0646/1321] perf tools: Disable BPF skeleton if no libopenssl found The libopenssl is required by bpftool which is needed to generate BPF skeleton. Disable it by setting BUILD_BPF_SKEL to 0 otherwise it'll see build errors like below: CC /build/util/bpf_skel/.tmp/bootstrap/sign.o sign.c:16:10: fatal error: openssl/opensslv.h: No such file or directory 16 | #include | ^~~~~~~~~~~~~~~~~~~~ compilation terminated. make[3]: *** [Makefile:256: /build/util/bpf_skel/.tmp/bootstrap/sign.o] Error 1 make[3]: *** Waiting for unfinished jobs.... make[2]: *** [Makefile.perf:1211: /build/util/bpf_skel/.tmp/bootstrap/bpftool] Error 2 make[1]: *** [Makefile.perf:287: sub-make] Error 2 make: *** [Makefile:76: all] Error 2 Now it'll skip the build with the following message: Makefile.config:729: Warning: Disabled BPF skeletons as libopenssl is required Closes: https://lore.kernel.org/r/aP7uq6eVieG8v_v4@google.com Signed-off-by: Namhyung Kim --- tools/perf/Makefile.config | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config index bd9f4804d56ba..d8d25f62aaad6 100644 --- a/tools/perf/Makefile.config +++ b/tools/perf/Makefile.config @@ -701,6 +701,11 @@ ifndef NO_LIBBPF endif endif +ifeq ($(feature-libopenssl), 1) + $(call detected,CONFIG_LIBOPENSSL) + CFLAGS += -DHAVE_LIBOPENSSL_SUPPORT +endif + ifndef BUILD_BPF_SKEL # BPF skeletons control a large number of perf features, by default # they are enabled. @@ -717,6 +722,9 @@ ifeq ($(BUILD_BPF_SKEL),1) else ifeq ($(filter -DHAVE_LIBBPF_SUPPORT, $(CFLAGS)),) $(warning Warning: Disabled BPF skeletons as libbpf is required) BUILD_BPF_SKEL := 0 + else ifeq ($(filter -DHAVE_LIBOPENSSL_SUPPORT, $(CFLAGS)),) + $(warning Warning: Disabled BPF skeletons as libopenssl is required) + BUILD_BPF_SKEL := 0 else ifeq ($(call get-executable,$(CLANG)),) $(warning Warning: Disabled BPF skeletons as clang ($(CLANG)) is missing) BUILD_BPF_SKEL := 0 From 234e41f96d0575a3b8a676dbe6c710efd488773a Mon Sep 17 00:00:00 2001 From: Ian Rogers Date: Sat, 6 Dec 2025 18:23:45 -0800 Subject: [PATCH 0647/1321] perf symbol: Fix ENOENT case for filename__read_build_id Some callers of filename__read_build_id assume the error value must be -1, fix by making them handle all < 0 values. If is_regular_file fails in filename__read_build_id then it could be the file is missing (ENOENT) and it would be wrong to return -EWOULDBLOCK in that case. Fix the logic so -EWOULDBLOCK is only reported if other errors with stat haven't occurred. Fixes: 834ebb5678d7 ("perf tools: Don't read build-ids from non-regular files") Signed-off-by: Ian Rogers Reviewed-by: James Clark Signed-off-by: Namhyung Kim --- tools/perf/builtin-buildid-cache.c | 6 ++++-- tools/perf/util/libbfd.c | 4 +++- tools/perf/util/symbol-elf.c | 4 +++- tools/perf/util/symbol-minimal.c | 4 +++- 4 files changed, 13 insertions(+), 5 deletions(-) diff --git a/tools/perf/builtin-buildid-cache.c b/tools/perf/builtin-buildid-cache.c index c98104481c8a1..539e779e32682 100644 --- a/tools/perf/builtin-buildid-cache.c +++ b/tools/perf/builtin-buildid-cache.c @@ -276,12 +276,14 @@ static bool dso__missing_buildid_cache(struct dso *dso, int parm __maybe_unused) { char filename[PATH_MAX]; struct build_id bid = { .size = 0, }; + int err; if (!dso__build_id_filename(dso, filename, sizeof(filename), false)) return true; - if (filename__read_build_id(filename, &bid) == -1) { - if (errno == ENOENT) + err = filename__read_build_id(filename, &bid); + if (err < 0) { + if (err == -ENOENT) return false; pr_warning("Problems with %s file, consider removing it from the cache\n", diff --git a/tools/perf/util/libbfd.c b/tools/perf/util/libbfd.c index cc0c474cbfaa8..79f4528234a9d 100644 --- a/tools/perf/util/libbfd.c +++ b/tools/perf/util/libbfd.c @@ -426,8 +426,10 @@ int libbfd__read_build_id(const char *filename, struct build_id *bid) if (!filename) return -EFAULT; + + errno = 0; if (!is_regular_file(filename)) - return -EWOULDBLOCK; + return errno == 0 ? -EWOULDBLOCK : -errno; fd = open(filename, O_RDONLY); if (fd < 0) diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c index 957143fbf8a07..d1dcafa4b3b80 100644 --- a/tools/perf/util/symbol-elf.c +++ b/tools/perf/util/symbol-elf.c @@ -902,8 +902,10 @@ int filename__read_build_id(const char *filename, struct build_id *bid) if (!filename) return -EFAULT; + + errno = 0; if (!is_regular_file(filename)) - return -EWOULDBLOCK; + return errno == 0 ? -EWOULDBLOCK : -errno; err = kmod_path__parse(&m, filename); if (err) diff --git a/tools/perf/util/symbol-minimal.c b/tools/perf/util/symbol-minimal.c index c6b17c14a2e99..8221dc9868f7c 100644 --- a/tools/perf/util/symbol-minimal.c +++ b/tools/perf/util/symbol-minimal.c @@ -104,8 +104,10 @@ int filename__read_build_id(const char *filename, struct build_id *bid) if (!filename) return -EFAULT; + + errno = 0; if (!is_regular_file(filename)) - return -EWOULDBLOCK; + return errno == 0 ? -EWOULDBLOCK : -errno; fd = open(filename, O_RDONLY); if (fd < 0) From f9579a5ff155b1389de4fbd4359bfbf5c81f53d6 Mon Sep 17 00:00:00 2001 From: Ian Rogers Date: Thu, 4 Dec 2025 14:55:21 -0800 Subject: [PATCH 0648/1321] perf tests kvm: Avoid leaving perf.data.guest file around Ensure the perf.data output when checking permissions is written to /dev/null so that it isn't left in the directory the test is run. Fixes: b58261584d2f ("perf test kvm: Add some basic perf kvm test coverage") Signed-off-by: Ian Rogers Signed-off-by: Namhyung Kim --- tools/perf/tests/shell/kvm.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/perf/tests/shell/kvm.sh b/tools/perf/tests/shell/kvm.sh index 2fafde1a29cca..2a399b83fe808 100755 --- a/tools/perf/tests/shell/kvm.sh +++ b/tools/perf/tests/shell/kvm.sh @@ -118,7 +118,7 @@ setup_qemu() { skip "/dev/kvm not accessible" fi - if ! perf kvm stat record -a sleep 0.01 >/dev/null 2>&1; then + if ! perf kvm stat record -o /dev/null -a sleep 0.01 >/dev/null 2>&1; then skip "No permission to record kvm events" fi From 6ac3327e551408bb51bfdd2f5de92fc58d036dca Mon Sep 17 00:00:00 2001 From: Ian Rogers Date: Thu, 4 Dec 2025 14:55:22 -0800 Subject: [PATCH 0649/1321] perf tests top: Make the test exclusive With sufficient tests running the load causes the top test fails with: ``` 123: perf top tests : FAILED! --- start --- test child forked, pid 629856 Basic perf top test Basic perf top test [Failed: no sample percentage found] ---- end(-1) ---- ``` Mark the test exclusive to avoid flakes. Fixes: 75e961730b9e ("perf tests top: Add basic perf top coverage test") Signed-off-by: Ian Rogers Signed-off-by: Namhyung Kim --- tools/perf/tests/shell/top.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/perf/tests/shell/top.sh b/tools/perf/tests/shell/top.sh index 768ebcf7a89cb..ad7fccd09025d 100755 --- a/tools/perf/tests/shell/top.sh +++ b/tools/perf/tests/shell/top.sh @@ -1,5 +1,5 @@ #!/bin/bash -# perf top tests +# perf top tests (exclusive) # SPDX-License-Identifier: GPL-2.0 set -e From c4d7fad1e6c4d70bec5f6bcbaddb85a22b26dd10 Mon Sep 17 00:00:00 2001 From: Besar Wicaksono Date: Fri, 19 Dec 2025 23:13:24 +0000 Subject: [PATCH 0650/1321] tools headers arm64: Add NVIDIA Olympus part Add the part number and MIDR for NVIDIA Olympus. Signed-off-by: Besar Wicaksono Reviewed-by: Leo Yan Signed-off-by: Namhyung Kim --- tools/arch/arm64/include/asm/cputype.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tools/arch/arm64/include/asm/cputype.h b/tools/arch/arm64/include/asm/cputype.h index f898c47e551f6..54ae78d23f7d4 100644 --- a/tools/arch/arm64/include/asm/cputype.h +++ b/tools/arch/arm64/include/asm/cputype.h @@ -130,6 +130,7 @@ #define NVIDIA_CPU_PART_DENVER 0x003 #define NVIDIA_CPU_PART_CARMEL 0x004 +#define NVIDIA_CPU_PART_OLYMPUS 0x010 #define FUJITSU_CPU_PART_A64FX 0x001 @@ -222,6 +223,7 @@ #define MIDR_NVIDIA_DENVER MIDR_CPU_MODEL(ARM_CPU_IMP_NVIDIA, NVIDIA_CPU_PART_DENVER) #define MIDR_NVIDIA_CARMEL MIDR_CPU_MODEL(ARM_CPU_IMP_NVIDIA, NVIDIA_CPU_PART_CARMEL) +#define MIDR_NVIDIA_OLYMPUS MIDR_CPU_MODEL(ARM_CPU_IMP_NVIDIA, NVIDIA_CPU_PART_OLYMPUS) #define MIDR_FUJITSU_A64FX MIDR_CPU_MODEL(ARM_CPU_IMP_FUJITSU, FUJITSU_CPU_PART_A64FX) #define MIDR_HISI_TSV110 MIDR_CPU_MODEL(ARM_CPU_IMP_HISI, HISI_CPU_PART_TSV110) #define MIDR_HISI_HIP09 MIDR_CPU_MODEL(ARM_CPU_IMP_HISI, HISI_CPU_PART_HIP09) From 85ecc7be8215b384db95b5a8afeac2e6ba5fe388 Mon Sep 17 00:00:00 2001 From: Besar Wicaksono Date: Fri, 19 Dec 2025 23:13:25 +0000 Subject: [PATCH 0651/1321] perf arm-spe: Add NVIDIA Olympus to neoverse list Add NVIDIA Olympus MIDR to neoverse_spe range list. Signed-off-by: Besar Wicaksono Reviewed-by: Leo Yan Signed-off-by: Namhyung Kim --- tools/perf/util/arm-spe.c | 1 + 1 file changed, 1 insertion(+) diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c index dc19e72258f30..70dd9bee47c75 100644 --- a/tools/perf/util/arm-spe.c +++ b/tools/perf/util/arm-spe.c @@ -587,6 +587,7 @@ static const struct midr_range common_ds_encoding_cpus[] = { MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V1), MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V2), MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V3), + MIDR_ALL_VERSIONS(MIDR_NVIDIA_OLYMPUS), {}, }; From 1920e1096e93adc466e276d2d6f04b80ba01079b Mon Sep 17 00:00:00 2001 From: Namhyung Kim Date: Mon, 22 Dec 2025 14:57:06 -0800 Subject: [PATCH 0652/1321] tools headers: Sync UAPI drm/drm.h with kernel sources To pick up changes from: 179ab8e7d7b378f1 ("drm/colorop: Introduce DRM_CLIENT_CAP_PLANE_COLOR_PIPELINE") This should be used to beautify DRM syscall arguments and it addresses these tools/perf build warnings: Warning: Kernel ABI header differences: diff -u tools/include/uapi/drm/drm.h include/uapi/drm/drm.h Please see tools/include/uapi/README. Cc: dri-devel@lists.freedesktop.org Signed-off-by: Namhyung Kim --- tools/include/uapi/drm/drm.h | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/tools/include/uapi/drm/drm.h b/tools/include/uapi/drm/drm.h index 3cd5cf15e3c9c..27cc159c1d275 100644 --- a/tools/include/uapi/drm/drm.h +++ b/tools/include/uapi/drm/drm.h @@ -906,6 +906,21 @@ struct drm_get_cap { */ #define DRM_CLIENT_CAP_CURSOR_PLANE_HOTSPOT 6 +/** + * DRM_CLIENT_CAP_PLANE_COLOR_PIPELINE + * + * If set to 1 the DRM core will allow setting the COLOR_PIPELINE + * property on a &drm_plane, as well as drm_colorop properties. + * + * Setting of these plane properties will be rejected when this client + * cap is set: + * - COLOR_ENCODING + * - COLOR_RANGE + * + * The client must enable &DRM_CLIENT_CAP_ATOMIC first. + */ +#define DRM_CLIENT_CAP_PLANE_COLOR_PIPELINE 7 + /* DRM_IOCTL_SET_CLIENT_CAP ioctl argument type */ struct drm_set_client_cap { __u64 capability; From 83201d960935e3e1b13ac6b3f3db46fba935ac47 Mon Sep 17 00:00:00 2001 From: Namhyung Kim Date: Mon, 22 Dec 2025 14:57:07 -0800 Subject: [PATCH 0653/1321] tools headers: Sync UAPI KVM headers with kernel sources To pick up changes from: ad9c62bd8946621e ("KVM: arm64: VM exit to userspace to handle SEA") 8e8678e740ecde2a ("KVM: s390: Add capability that forwards operation exceptions") e0c26d47def7382d ("Merge tag 'kvm-s390-next-6.19-1' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD") 7a61d61396b97fd6 ("KVM: SEV: Publish supported SEV-SNP policy bits") This should be used to beautify DRM syscall arguments and it addresses these tools/perf build warnings: Warning: Kernel ABI header differences: diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h Please see tools/include/uapi/README. Cc: kvm@vger.kernel.org Signed-off-by: Namhyung Kim --- tools/arch/x86/include/uapi/asm/kvm.h | 1 + tools/include/uapi/linux/kvm.h | 11 +++++++++++ 2 files changed, 12 insertions(+) diff --git a/tools/arch/x86/include/uapi/asm/kvm.h b/tools/arch/x86/include/uapi/asm/kvm.h index d420c9c066d48..7ceff65836525 100644 --- a/tools/arch/x86/include/uapi/asm/kvm.h +++ b/tools/arch/x86/include/uapi/asm/kvm.h @@ -502,6 +502,7 @@ struct kvm_sync_regs { /* vendor-specific groups and attributes for system fd */ #define KVM_X86_GRP_SEV 1 # define KVM_X86_SEV_VMSA_FEATURES 0 +# define KVM_X86_SNP_POLICY_BITS 1 struct kvm_vmx_nested_state_data { __u8 vmcs12[KVM_STATE_NESTED_VMX_VMCS_SIZE]; diff --git a/tools/include/uapi/linux/kvm.h b/tools/include/uapi/linux/kvm.h index 52f6000ab0208..dddb781b0507d 100644 --- a/tools/include/uapi/linux/kvm.h +++ b/tools/include/uapi/linux/kvm.h @@ -179,6 +179,7 @@ struct kvm_xen_exit { #define KVM_EXIT_LOONGARCH_IOCSR 38 #define KVM_EXIT_MEMORY_FAULT 39 #define KVM_EXIT_TDX 40 +#define KVM_EXIT_ARM_SEA 41 /* For KVM_EXIT_INTERNAL_ERROR */ /* Emulate instruction failed. */ @@ -473,6 +474,14 @@ struct kvm_run { } setup_event_notify; }; } tdx; + /* KVM_EXIT_ARM_SEA */ + struct { +#define KVM_EXIT_ARM_SEA_FLAG_GPA_VALID (1ULL << 0) + __u64 flags; + __u64 esr; + __u64 gva; + __u64 gpa; + } arm_sea; /* Fix the size of the union. */ char padding[256]; }; @@ -963,6 +972,8 @@ struct kvm_enable_cap { #define KVM_CAP_RISCV_MP_STATE_RESET 242 #define KVM_CAP_ARM_CACHEABLE_PFNMAP_SUPPORTED 243 #define KVM_CAP_GUEST_MEMFD_FLAGS 244 +#define KVM_CAP_ARM_SEA_TO_USER 245 +#define KVM_CAP_S390_USER_OPEREXEC 246 struct kvm_irq_routing_irqchip { __u32 irqchip; From 08405e9088dbfaddddceea522bdd79a653ffe473 Mon Sep 17 00:00:00 2001 From: Namhyung Kim Date: Mon, 22 Dec 2025 14:57:08 -0800 Subject: [PATCH 0654/1321] tools headers: Sync UAPI linux/fcntl.h with kernel sources To pick up changes from: fe93446b5ebdaa89 ("vfs: use UAPI types for new struct delegation definition") 4be9e04ebf75a5c4 ("vfs: add needed headers for new struct delegation definition") 1602bad16d7df82f ("vfs: expose delegation support to userland") This should be used to beautify fcntl syscall arguments and it addresses these tools/perf build warnings: Warning: Kernel ABI header differences: diff -u tools/perf/trace/beauty/include/uapi/linux/fcntl.h include/uapi/linux/fcntl.h Please see tools/include/uapi/README. Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Namhyung Kim --- tools/perf/trace/beauty/include/uapi/linux/fcntl.h | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/tools/perf/trace/beauty/include/uapi/linux/fcntl.h b/tools/perf/trace/beauty/include/uapi/linux/fcntl.h index 3741ea1b73d85..aadfbf6e0cb3a 100644 --- a/tools/perf/trace/beauty/include/uapi/linux/fcntl.h +++ b/tools/perf/trace/beauty/include/uapi/linux/fcntl.h @@ -4,6 +4,7 @@ #include #include +#include #define F_SETLEASE (F_LINUX_SPECIFIC_BASE + 0) #define F_GETLEASE (F_LINUX_SPECIFIC_BASE + 1) @@ -79,6 +80,17 @@ */ #define RWF_WRITE_LIFE_NOT_SET RWH_WRITE_LIFE_NOT_SET +/* Set/Get delegations */ +#define F_GETDELEG (F_LINUX_SPECIFIC_BASE + 15) +#define F_SETDELEG (F_LINUX_SPECIFIC_BASE + 16) + +/* Argument structure for F_GETDELEG and F_SETDELEG */ +struct delegation { + __u32 d_flags; /* Must be 0 */ + __u16 d_type; /* F_RDLCK, F_WRLCK, F_UNLCK */ + __u16 __pad; /* Must be 0 */ +}; + /* * Types of directory notifications that may be requested. */ From 34c6888ad277f2882b7e52240c656db743fda550 Mon Sep 17 00:00:00 2001 From: Namhyung Kim Date: Mon, 22 Dec 2025 14:57:09 -0800 Subject: [PATCH 0655/1321] tools headers: Sync UAPI linux/fs.h with kernel sources To pick up changes from: b30ffcdc0c15a88f ("block: introduce BLKREPORTZONESV2 ioctl") 0d8627cc936de8ea ("blktrace: add definitions for blk_user_trace_setup2") This should be used to beautify ioctl syscall arguments and it addresses these tools/perf build warnings: Warning: Kernel ABI header differences: diff -u tools/perf/trace/beauty/include/uapi/linux/fs.h include/uapi/linux/fs.h Please see tools/include/uapi/README. Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Namhyung Kim --- tools/perf/trace/beauty/include/uapi/linux/fs.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/tools/perf/trace/beauty/include/uapi/linux/fs.h b/tools/perf/trace/beauty/include/uapi/linux/fs.h index beb4c2d1e41cb..66ca526cf786c 100644 --- a/tools/perf/trace/beauty/include/uapi/linux/fs.h +++ b/tools/perf/trace/beauty/include/uapi/linux/fs.h @@ -298,8 +298,9 @@ struct file_attr { #define BLKROTATIONAL _IO(0x12,126) #define BLKZEROOUT _IO(0x12,127) #define BLKGETDISKSEQ _IOR(0x12,128,__u64) -/* 130-136 are used by zoned block device ioctls (uapi/linux/blkzoned.h) */ +/* 130-136 and 142 are used by zoned block device ioctls (uapi/linux/blkzoned.h) */ /* 137-141 are used by blk-crypto ioctls (uapi/linux/blk-crypto.h) */ +#define BLKTRACESETUP2 _IOWR(0x12, 142, struct blk_user_trace_setup2) #define BMAP_IOCTL 1 /* obsolete - kept for compatibility */ #define FIBMAP _IO(0x00,1) /* bmap access */ From 5a9000e20b8308069e76e28d2de1993596863137 Mon Sep 17 00:00:00 2001 From: Namhyung Kim Date: Mon, 22 Dec 2025 14:57:10 -0800 Subject: [PATCH 0656/1321] tools headers: Sync UAPI linux/mount.h with kernel sources To pick up changes from: 78f0e33cd6c939a5 ("fs/namespace: correctly handle errors returned by grab_requested_mnt_ns") This should be used to beautify mount syscall arguments and it addresses these tools/perf build warnings: Warning: Kernel ABI header differences: diff -u tools/perf/trace/beauty/include/uapi/linux/mount.h include/uapi/linux/mount.h Please see tools/include/uapi/README. Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Namhyung Kim --- tools/perf/trace/beauty/include/uapi/linux/mount.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/perf/trace/beauty/include/uapi/linux/mount.h b/tools/perf/trace/beauty/include/uapi/linux/mount.h index 7fa67c2031a5d..5d3f8c9e3a625 100644 --- a/tools/perf/trace/beauty/include/uapi/linux/mount.h +++ b/tools/perf/trace/beauty/include/uapi/linux/mount.h @@ -197,7 +197,7 @@ struct statmount { */ struct mnt_id_req { __u32 size; - __u32 spare; + __u32 mnt_ns_fd; __u64 mnt_id; __u64 param; __u64 mnt_ns_id; From d2abd71534727542127d86ca50f9740e6a845bce Mon Sep 17 00:00:00 2001 From: Namhyung Kim Date: Mon, 22 Dec 2025 14:57:11 -0800 Subject: [PATCH 0657/1321] tools headers: Sync UAPI sound/asound.h with kernel sources To pick up changes from: 9a97857db0c5655b ("ALSA: uapi: Fix typo in asound.h comment") This should address these tools/perf build warnings: Warning: Kernel ABI header differences: diff -u tools/perf/trace/beauty/include/uapi/sound/asound.h include/uapi/sound/asound.h Please see tools/include/uapi/README. Cc: linux-sound@vger.kernel.org Signed-off-by: Namhyung Kim --- tools/perf/trace/beauty/include/uapi/sound/asound.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/perf/trace/beauty/include/uapi/sound/asound.h b/tools/perf/trace/beauty/include/uapi/sound/asound.h index 5a049eeaeccea..d3ce75ba938a8 100644 --- a/tools/perf/trace/beauty/include/uapi/sound/asound.h +++ b/tools/perf/trace/beauty/include/uapi/sound/asound.h @@ -60,7 +60,7 @@ struct snd_cea_861_aud_if { unsigned char db2_sf_ss; /* sample frequency and size */ unsigned char db3; /* not used, all zeros */ unsigned char db4_ca; /* channel allocation code */ - unsigned char db5_dminh_lsv; /* downmix inhibit & level-shit values */ + unsigned char db5_dminh_lsv; /* downmix inhibit & level-shift values */ }; /**************************************************************************** From 9fe5dc0ba28816740b117b1a0fe67eaa8a52f874 Mon Sep 17 00:00:00 2001 From: Namhyung Kim Date: Mon, 22 Dec 2025 14:57:12 -0800 Subject: [PATCH 0658/1321] tools headers: Sync x86 headers with kernel sources To pick up changes from: 54de197c9a5e8f52 ("Merge tag 'x86_sgx_for_6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip") 679fcce0028bf101 ("Merge tag 'kvm-x86-svm-6.19' of https://github.com/kvm-x86/linux into HEAD") 3767def18f4cc394 ("x86/cpufeatures: Add support for L3 Smart Data Cache Injection Allocation Enforcement") f6106d41ec84e552 ("x86/bugs: Use an x86 feature to track the MMIO Stale Data mitigation") 7baadd463e147fdc ("x86/cpufeatures: Enumerate the LASS feature bits") 47955b58cf9b97fe ("x86/cpufeatures: Correct LKGS feature flag description") 5d0316e25defee47 ("x86/cpufeatures: Add X86_FEATURE_X2AVIC_EXT") 6ffdb49101f02313 ("x86/cpufeatures: Add X86_FEATURE_SGX_EUPDATESVN feature flag") 4793f990ea152330 ("KVM: x86: Advertise EferLmsleUnsupported to userspace") bb5f13df3c455110 ("perf/x86/intel: Add counter group support for arch-PEBS") 52448a0a739002ec ("perf/x86/intel: Setup PEBS data configuration and enable legacy groups") d21954c8a0ffbc94 ("perf/x86/intel: Process arch-PEBS records or record fragments") bffeb2fd0b9c99d8 ("x86/microcode/intel: Enable staging when available") 740144bc6bde9d44 ("x86/microcode/intel: Establish staging control logic") This should address these tools/perf build warnings: Warning: Kernel ABI header differences: diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h Please see tools/include/uapi/README. Cc: x86@kernel.org Signed-off-by: Namhyung Kim --- tools/arch/x86/include/asm/cpufeatures.h | 11 +++++++++ tools/arch/x86/include/asm/msr-index.h | 30 ++++++++++++++++++++++++ 2 files changed, 41 insertions(+) diff --git a/tools/arch/x86/include/asm/cpufeatures.h b/tools/arch/x86/include/asm/cpufeatures.h index ccc01ad6ff7c9..c3b53beb13007 100644 --- a/tools/arch/x86/include/asm/cpufeatures.h +++ b/tools/arch/x86/include/asm/cpufeatures.h @@ -314,6 +314,7 @@ #define X86_FEATURE_SM4 (12*32+ 2) /* SM4 instructions */ #define X86_FEATURE_AVX_VNNI (12*32+ 4) /* "avx_vnni" AVX VNNI instructions */ #define X86_FEATURE_AVX512_BF16 (12*32+ 5) /* "avx512_bf16" AVX512 BFLOAT16 instructions */ +#define X86_FEATURE_LASS (12*32+ 6) /* "lass" Linear Address Space Separation */ #define X86_FEATURE_CMPCCXADD (12*32+ 7) /* CMPccXADD instructions */ #define X86_FEATURE_ARCH_PERFMON_EXT (12*32+ 8) /* Intel Architectural PerfMon Extension */ #define X86_FEATURE_FZRM (12*32+10) /* Fast zero-length REP MOVSB */ @@ -338,6 +339,7 @@ #define X86_FEATURE_AMD_STIBP (13*32+15) /* Single Thread Indirect Branch Predictors */ #define X86_FEATURE_AMD_STIBP_ALWAYS_ON (13*32+17) /* Single Thread Indirect Branch Predictors always-on preferred */ #define X86_FEATURE_AMD_IBRS_SAME_MODE (13*32+19) /* Indirect Branch Restricted Speculation same mode protection*/ +#define X86_FEATURE_EFER_LMSLE_MBZ (13*32+20) /* EFER.LMSLE must be zero */ #define X86_FEATURE_AMD_PPIN (13*32+23) /* "amd_ppin" Protected Processor Inventory Number */ #define X86_FEATURE_AMD_SSBD (13*32+24) /* Speculative Store Bypass Disable */ #define X86_FEATURE_VIRT_SSBD (13*32+25) /* "virt_ssbd" Virtualized Speculative Store Bypass Disable */ @@ -502,6 +504,15 @@ #define X86_FEATURE_IBPB_EXIT_TO_USER (21*32+14) /* Use IBPB on exit-to-userspace, see VMSCAPE bug */ #define X86_FEATURE_ABMC (21*32+15) /* Assignable Bandwidth Monitoring Counters */ #define X86_FEATURE_MSR_IMM (21*32+16) /* MSR immediate form instructions */ +#define X86_FEATURE_SGX_EUPDATESVN (21*32+17) /* Support for ENCLS[EUPDATESVN] instruction */ + +#define X86_FEATURE_SDCIAE (21*32+18) /* L3 Smart Data Cache Injection Allocation Enforcement */ +#define X86_FEATURE_CLEAR_CPU_BUF_VM_MMIO (21*32+19) /* + * Clear CPU buffers before VM-Enter if the vCPU + * can access host MMIO (ignored for all intents + * and purposes if CLEAR_CPU_BUF_VM is set). + */ +#define X86_FEATURE_X2AVIC_EXT (21*32+20) /* AMD SVM x2AVIC support for 4k vCPUs */ /* * BUG word(s) diff --git a/tools/arch/x86/include/asm/msr-index.h b/tools/arch/x86/include/asm/msr-index.h index 9e1720d73244f..3d0a0950d20a1 100644 --- a/tools/arch/x86/include/asm/msr-index.h +++ b/tools/arch/x86/include/asm/msr-index.h @@ -166,6 +166,10 @@ * Processor MMIO stale data * vulnerabilities. */ +#define ARCH_CAP_MCU_ENUM BIT(16) /* + * Indicates the presence of microcode update + * feature enumeration and status information. + */ #define ARCH_CAP_FB_CLEAR BIT(17) /* * VERW clears CPU fill buffer * even on MDS_NO CPUs. @@ -327,6 +331,26 @@ PERF_CAP_PEBS_FORMAT | PERF_CAP_PEBS_BASELINE | \ PERF_CAP_PEBS_TIMING_INFO) +/* Arch PEBS */ +#define MSR_IA32_PEBS_BASE 0x000003f4 +#define MSR_IA32_PEBS_INDEX 0x000003f5 +#define ARCH_PEBS_OFFSET_MASK 0x7fffff +#define ARCH_PEBS_INDEX_WR_SHIFT 4 + +#define ARCH_PEBS_RELOAD 0xffffffff +#define ARCH_PEBS_CNTR_ALLOW BIT_ULL(35) +#define ARCH_PEBS_CNTR_GP BIT_ULL(36) +#define ARCH_PEBS_CNTR_FIXED BIT_ULL(37) +#define ARCH_PEBS_CNTR_METRICS BIT_ULL(38) +#define ARCH_PEBS_LBR_SHIFT 40 +#define ARCH_PEBS_LBR (0x3ull << ARCH_PEBS_LBR_SHIFT) +#define ARCH_PEBS_VECR_XMM BIT_ULL(49) +#define ARCH_PEBS_GPR BIT_ULL(61) +#define ARCH_PEBS_AUX BIT_ULL(62) +#define ARCH_PEBS_EN BIT_ULL(63) +#define ARCH_PEBS_CNTR_MASK (ARCH_PEBS_CNTR_GP | ARCH_PEBS_CNTR_FIXED | \ + ARCH_PEBS_CNTR_METRICS) + #define MSR_IA32_RTIT_CTL 0x00000570 #define RTIT_CTL_TRACEEN BIT(0) #define RTIT_CTL_CYCLEACC BIT(1) @@ -929,6 +953,10 @@ #define MSR_IA32_APICBASE_BASE (0xfffff<<12) #define MSR_IA32_UCODE_WRITE 0x00000079 + +#define MSR_IA32_MCU_ENUMERATION 0x0000007b +#define MCU_STAGING BIT(4) + #define MSR_IA32_UCODE_REV 0x0000008b /* Intel SGX Launch Enclave Public Key Hash MSRs */ @@ -1226,6 +1254,8 @@ #define MSR_IA32_VMX_VMFUNC 0x00000491 #define MSR_IA32_VMX_PROCBASED_CTLS3 0x00000492 +#define MSR_IA32_MCU_STAGING_MBOX_ADDR 0x000007a5 + /* Resctrl MSRs: */ /* - Intel: */ #define MSR_IA32_L3_QOS_CFG 0xc81 From d8397a62208ff8f0e4090c62fb09dcefd53384d6 Mon Sep 17 00:00:00 2001 From: Namhyung Kim Date: Mon, 22 Dec 2025 14:57:13 -0800 Subject: [PATCH 0659/1321] tools headers: Sync arm64 headers with kernel sources To pick up changes from: b0a3f0e894f34e01 ("arm64/sysreg: Replace TCR_EL1 field macros") 3bbf004c4808e2c3 ("arm64: cputype: Add Neoverse-V3AE definitions") e185c8a0d84236d1 ("arm64: cputype: Add NVIDIA Olympus definitions") 52b49bd6de29a89a ("arm64: cputype: Remove duplicate Cortex-X1C definitions") This should address these tools/perf build warnings: Warning: Kernel ABI header differences: diff -u tools/arch/arm64/include/asm/cputype.h arch/arm64/include/asm/cputype.h Please see tools/include/uapi/README. Note that this is still out of sync due to is_midr_in_range_list(). Reviewed-by: Leo Yan Cc: linux-arm-kernel@lists.infradead.org Signed-off-by: Namhyung Kim --- tools/arch/arm64/include/asm/cputype.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/tools/arch/arm64/include/asm/cputype.h b/tools/arch/arm64/include/asm/cputype.h index 54ae78d23f7d4..9b73c1aa3ad74 100644 --- a/tools/arch/arm64/include/asm/cputype.h +++ b/tools/arch/arm64/include/asm/cputype.h @@ -81,7 +81,6 @@ #define ARM_CPU_PART_CORTEX_A78AE 0xD42 #define ARM_CPU_PART_CORTEX_X1 0xD44 #define ARM_CPU_PART_CORTEX_A510 0xD46 -#define ARM_CPU_PART_CORTEX_X1C 0xD4C #define ARM_CPU_PART_CORTEX_A520 0xD80 #define ARM_CPU_PART_CORTEX_A710 0xD47 #define ARM_CPU_PART_CORTEX_A715 0xD4D @@ -93,6 +92,7 @@ #define ARM_CPU_PART_NEOVERSE_V2 0xD4F #define ARM_CPU_PART_CORTEX_A720 0xD81 #define ARM_CPU_PART_CORTEX_X4 0xD82 +#define ARM_CPU_PART_NEOVERSE_V3AE 0xD83 #define ARM_CPU_PART_NEOVERSE_V3 0xD84 #define ARM_CPU_PART_CORTEX_X925 0xD85 #define ARM_CPU_PART_CORTEX_A725 0xD87 @@ -172,7 +172,6 @@ #define MIDR_CORTEX_A78AE MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A78AE) #define MIDR_CORTEX_X1 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X1) #define MIDR_CORTEX_A510 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A510) -#define MIDR_CORTEX_X1C MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X1C) #define MIDR_CORTEX_A520 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A520) #define MIDR_CORTEX_A710 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A710) #define MIDR_CORTEX_A715 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A715) @@ -184,6 +183,7 @@ #define MIDR_NEOVERSE_V2 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V2) #define MIDR_CORTEX_A720 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A720) #define MIDR_CORTEX_X4 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X4) +#define MIDR_NEOVERSE_V3AE MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V3AE) #define MIDR_NEOVERSE_V3 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V3) #define MIDR_CORTEX_X925 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X925) #define MIDR_CORTEX_A725 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A725) @@ -247,7 +247,7 @@ /* Fujitsu Erratum 010001 affects A64FX 1.0 and 1.1, (v0r0 and v1r0) */ #define MIDR_FUJITSU_ERRATUM_010001 MIDR_FUJITSU_A64FX #define MIDR_FUJITSU_ERRATUM_010001_MASK (~MIDR_CPU_VAR_REV(1, 0)) -#define TCR_CLEAR_FUJITSU_ERRATUM_010001 (TCR_NFD1 | TCR_NFD0) +#define TCR_CLEAR_FUJITSU_ERRATUM_010001 (TCR_EL1_NFD1 | TCR_EL1_NFD0) #ifndef __ASSEMBLER__ From 3dfd6d2ec8cc0f1559741737afbd09e8d9e56757 Mon Sep 17 00:00:00 2001 From: Namhyung Kim Date: Mon, 22 Dec 2025 14:57:14 -0800 Subject: [PATCH 0660/1321] tools headers: Sync linux/gfp_types.h with kernel sources To pick up changes from: 4c0a17e28340e458 ("slab: prevent recursive kmalloc() in alloc_empty_sheaf()") This would be used to handle GFP masks in the perf code and address these tools/perf build warnings: Warning: Kernel ABI header differences: diff -u tools/include/linux/gfp_types.h include/linux/gfp_types.h Please see tools/include/uapi/README. Acked-by: SeongJae Park Cc: linux-mm@kvack.org Signed-off-by: Namhyung Kim --- tools/include/linux/gfp_types.h | 6 ------ 1 file changed, 6 deletions(-) diff --git a/tools/include/linux/gfp_types.h b/tools/include/linux/gfp_types.h index 65db9349f9053..3de43b12209ee 100644 --- a/tools/include/linux/gfp_types.h +++ b/tools/include/linux/gfp_types.h @@ -55,9 +55,7 @@ enum { #ifdef CONFIG_LOCKDEP ___GFP_NOLOCKDEP_BIT, #endif -#ifdef CONFIG_SLAB_OBJ_EXT ___GFP_NO_OBJ_EXT_BIT, -#endif ___GFP_LAST_BIT }; @@ -98,11 +96,7 @@ enum { #else #define ___GFP_NOLOCKDEP 0 #endif -#ifdef CONFIG_SLAB_OBJ_EXT #define ___GFP_NO_OBJ_EXT BIT(___GFP_NO_OBJ_EXT_BIT) -#else -#define ___GFP_NO_OBJ_EXT 0 -#endif /* * Physical address zone modifiers (see linux/mmzone.h - low four bits) From c28d3bc783b2e268121f537348826e712362f826 Mon Sep 17 00:00:00 2001 From: Namhyung Kim Date: Mon, 22 Dec 2025 14:57:15 -0800 Subject: [PATCH 0661/1321] tools headers: Sync linux/socket.h with kernel sources To pick up changes from: d73c167708739137 ("socket: Split out a getsockname helper for io_uring") 4677e78800bbde62 ("socket: Unify getsockname and getpeername implementation") bf33247a90d3e85d ("net: Add struct sockaddr_unsized for sockaddr of unknown length") This should be used to beautify socket syscall arguments and it addresses these tools/perf build warnings: Warning: Kernel ABI header differences: diff -u tools/perf/trace/beauty/include/linux/socket.h include/linux/socket.h Please see tools/include/uapi/README. Cc: netdev@vger.kernel.org Signed-off-by: Namhyung Kim --- .../perf/trace/beauty/include/linux/socket.h | 24 ++++++++++++++++--- 1 file changed, 21 insertions(+), 3 deletions(-) diff --git a/tools/perf/trace/beauty/include/linux/socket.h b/tools/perf/trace/beauty/include/linux/socket.h index 77d7c59f5d8b1..ec715ad4bf25f 100644 --- a/tools/perf/trace/beauty/include/linux/socket.h +++ b/tools/perf/trace/beauty/include/linux/socket.h @@ -32,11 +32,29 @@ typedef __kernel_sa_family_t sa_family_t; * 1003.1g requires sa_family_t and that sa_data is char. */ +/* Deprecated for in-kernel use. Use struct sockaddr_unsized instead. */ struct sockaddr { sa_family_t sa_family; /* address family, AF_xxx */ char sa_data[14]; /* 14 bytes of protocol address */ }; +/** + * struct sockaddr_unsized - Unspecified size sockaddr for callbacks + * @sa_family: Address family (AF_UNIX, AF_INET, AF_INET6, etc.) + * @sa_data: Flexible array for address data + * + * This structure is designed for callback interfaces where the + * total size is known via the sockaddr_len parameter. Unlike struct + * sockaddr which has a fixed 14-byte sa_data limit or struct + * sockaddr_storage which has a fixed 128-byte sa_data limit, this + * structure can accommodate addresses of any size, but must be used + * carefully. + */ +struct sockaddr_unsized { + __kernel_sa_family_t sa_family; /* address family, AF_xxx */ + char sa_data[]; /* flexible address data */ +}; + struct linger { int l_onoff; /* Linger active */ int l_linger; /* How long to linger for */ @@ -450,10 +468,10 @@ extern int __sys_connect(int fd, struct sockaddr __user *uservaddr, int addrlen); extern int __sys_listen(int fd, int backlog); extern int __sys_listen_socket(struct socket *sock, int backlog); +extern int do_getsockname(struct socket *sock, int peer, + struct sockaddr __user *usockaddr, int __user *usockaddr_len); extern int __sys_getsockname(int fd, struct sockaddr __user *usockaddr, - int __user *usockaddr_len); -extern int __sys_getpeername(int fd, struct sockaddr __user *usockaddr, - int __user *usockaddr_len); + int __user *usockaddr_len, int peer); extern int __sys_socketpair(int family, int type, int protocol, int __user *usockvec); extern int __sys_shutdown_sock(struct socket *sock, int how); From fb2d1a3babdd3a54104da55d4e6f93d092c7cf4e Mon Sep 17 00:00:00 2001 From: Namhyung Kim Date: Mon, 22 Dec 2025 14:57:16 -0800 Subject: [PATCH 0662/1321] tools headers: Sync syscall table with kernel sources To pick up changes from: b36d4b6aa88ef039 ("arch: hookup listns() system call") This should be used to beautify the syscall arguments and it addresses these tools/perf build warnings: Warning: Kernel ABI header differences: diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h diff -u tools/scripts/syscall.tbl scripts/syscall.tbl diff -u tools/perf/arch/x86/entry/syscalls/syscall_32.tbl arch/x86/entry/syscalls/syscall_32.tbl diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl diff -u tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl arch/mips/kernel/syscalls/syscall_n64.tbl diff -u tools/perf/arch/arm/entry/syscalls/syscall.tbl arch/arm/tools/syscall.tbl diff -u tools/perf/arch/sh/entry/syscalls/syscall.tbl arch/sh/kernel/syscalls/syscall.tbl diff -u tools/perf/arch/sparc/entry/syscalls/syscall.tbl arch/sparc/kernel/syscalls/syscall.tbl diff -u tools/perf/arch/xtensa/entry/syscalls/syscall.tbl arch/xtensa/kernel/syscalls/syscall.tbl Please see tools/include/uapi/README. Note that s390 syscall table is still out of sync as it switches to use the generic table. But I'd like to minimize the change in this commit. Cc: linux-arch@vger.kernel.org Signed-off-by: Namhyung Kim --- tools/include/uapi/asm-generic/unistd.h | 4 +++- tools/perf/arch/arm/entry/syscalls/syscall.tbl | 1 + tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl | 1 + tools/perf/arch/powerpc/entry/syscalls/syscall.tbl | 1 + tools/perf/arch/s390/entry/syscalls/syscall.tbl | 1 + tools/perf/arch/sh/entry/syscalls/syscall.tbl | 1 + tools/perf/arch/sparc/entry/syscalls/syscall.tbl | 1 + tools/perf/arch/x86/entry/syscalls/syscall_32.tbl | 1 + tools/perf/arch/x86/entry/syscalls/syscall_64.tbl | 1 + tools/perf/arch/xtensa/entry/syscalls/syscall.tbl | 1 + tools/scripts/syscall.tbl | 1 + 11 files changed, 13 insertions(+), 1 deletion(-) diff --git a/tools/include/uapi/asm-generic/unistd.h b/tools/include/uapi/asm-generic/unistd.h index 04e0077fb4c97..942370b3f5d25 100644 --- a/tools/include/uapi/asm-generic/unistd.h +++ b/tools/include/uapi/asm-generic/unistd.h @@ -857,9 +857,11 @@ __SYSCALL(__NR_open_tree_attr, sys_open_tree_attr) __SYSCALL(__NR_file_getattr, sys_file_getattr) #define __NR_file_setattr 469 __SYSCALL(__NR_file_setattr, sys_file_setattr) +#define __NR_listns 470 +__SYSCALL(__NR_listns, sys_listns) #undef __NR_syscalls -#define __NR_syscalls 470 +#define __NR_syscalls 471 /* * 32 bit systems traditionally used different diff --git a/tools/perf/arch/arm/entry/syscalls/syscall.tbl b/tools/perf/arch/arm/entry/syscalls/syscall.tbl index b07e699aaa3c2..fd09afae72a24 100644 --- a/tools/perf/arch/arm/entry/syscalls/syscall.tbl +++ b/tools/perf/arch/arm/entry/syscalls/syscall.tbl @@ -484,3 +484,4 @@ 467 common open_tree_attr sys_open_tree_attr 468 common file_getattr sys_file_getattr 469 common file_setattr sys_file_setattr +470 common listns sys_listns diff --git a/tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl b/tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl index 7a7049c2c3078..9b92bddf06b57 100644 --- a/tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl +++ b/tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl @@ -384,3 +384,4 @@ 467 n64 open_tree_attr sys_open_tree_attr 468 n64 file_getattr sys_file_getattr 469 n64 file_setattr sys_file_setattr +470 n64 listns sys_listns diff --git a/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl b/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl index b453e80dfc003..ec4458cdb97b6 100644 --- a/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl +++ b/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl @@ -560,3 +560,4 @@ 467 common open_tree_attr sys_open_tree_attr 468 common file_getattr sys_file_getattr 469 common file_setattr sys_file_setattr +470 common listns sys_listns diff --git a/tools/perf/arch/s390/entry/syscalls/syscall.tbl b/tools/perf/arch/s390/entry/syscalls/syscall.tbl index 8a6744d658db3..5863787ab0363 100644 --- a/tools/perf/arch/s390/entry/syscalls/syscall.tbl +++ b/tools/perf/arch/s390/entry/syscalls/syscall.tbl @@ -472,3 +472,4 @@ 467 common open_tree_attr sys_open_tree_attr sys_open_tree_attr 468 common file_getattr sys_file_getattr sys_file_getattr 469 common file_setattr sys_file_setattr sys_file_setattr +470 common listns sys_listns sys_listns diff --git a/tools/perf/arch/sh/entry/syscalls/syscall.tbl b/tools/perf/arch/sh/entry/syscalls/syscall.tbl index 5e9c9eff5539e..969c11325adeb 100644 --- a/tools/perf/arch/sh/entry/syscalls/syscall.tbl +++ b/tools/perf/arch/sh/entry/syscalls/syscall.tbl @@ -473,3 +473,4 @@ 467 common open_tree_attr sys_open_tree_attr 468 common file_getattr sys_file_getattr 469 common file_setattr sys_file_setattr +470 common listns sys_listns diff --git a/tools/perf/arch/sparc/entry/syscalls/syscall.tbl b/tools/perf/arch/sparc/entry/syscalls/syscall.tbl index ebb7d06d1044f..39aa26b6a50be 100644 --- a/tools/perf/arch/sparc/entry/syscalls/syscall.tbl +++ b/tools/perf/arch/sparc/entry/syscalls/syscall.tbl @@ -515,3 +515,4 @@ 467 common open_tree_attr sys_open_tree_attr 468 common file_getattr sys_file_getattr 469 common file_setattr sys_file_setattr +470 common listns sys_listns diff --git a/tools/perf/arch/x86/entry/syscalls/syscall_32.tbl b/tools/perf/arch/x86/entry/syscalls/syscall_32.tbl index 4877e16da69a5..e979a3eac7a35 100644 --- a/tools/perf/arch/x86/entry/syscalls/syscall_32.tbl +++ b/tools/perf/arch/x86/entry/syscalls/syscall_32.tbl @@ -475,3 +475,4 @@ 467 i386 open_tree_attr sys_open_tree_attr 468 i386 file_getattr sys_file_getattr 469 i386 file_setattr sys_file_setattr +470 i386 listns sys_listns diff --git a/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl b/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl index ced2a1deecd7c..8a4ac4841be6e 100644 --- a/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl +++ b/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl @@ -394,6 +394,7 @@ 467 common open_tree_attr sys_open_tree_attr 468 common file_getattr sys_file_getattr 469 common file_setattr sys_file_setattr +470 common listns sys_listns # # Due to a historical design error, certain syscalls are numbered differently diff --git a/tools/perf/arch/xtensa/entry/syscalls/syscall.tbl b/tools/perf/arch/xtensa/entry/syscalls/syscall.tbl index 374e4cb788d8a..438a3b1704022 100644 --- a/tools/perf/arch/xtensa/entry/syscalls/syscall.tbl +++ b/tools/perf/arch/xtensa/entry/syscalls/syscall.tbl @@ -440,3 +440,4 @@ 467 common open_tree_attr sys_open_tree_attr 468 common file_getattr sys_file_getattr 469 common file_setattr sys_file_setattr +470 common listns sys_listns diff --git a/tools/scripts/syscall.tbl b/tools/scripts/syscall.tbl index d1ae5e92c615b..e74868be513cf 100644 --- a/tools/scripts/syscall.tbl +++ b/tools/scripts/syscall.tbl @@ -410,3 +410,4 @@ 467 common open_tree_attr sys_open_tree_attr 468 common file_getattr sys_file_getattr 469 common file_setattr sys_file_setattr +470 common listns sys_listns From df205122239e00deb87e63e22cb08ad83b3759a3 Mon Sep 17 00:00:00 2001 From: Arnaldo Carvalho de Melo Date: Fri, 26 Dec 2025 16:09:34 -0300 Subject: [PATCH 0663/1321] tools build: Fix the common set of features test wrt libopenssl The recent introduction of the libopenssl feature test forgot to add the -lssl to the test-all.o target, which made it always fail, fix it. Noticed by looking at this file after building: $ cat /tmp/build/perf-tools/feature/test-all.make.output /usr/bin/ld: /tmp/ccBhO8WH.ltrans0.ltrans.o: in function `main': /home/acme/git/perf-tools/tools/build/feature/test-libopenssl.c:6:(.text.startup+0x2ed): undefined reference to `OPENSSL_init_ssl' collect2: error: ld returned 1 exit status $ It was added only to the individual ssl test, that works: $ cat /tmp/build/perf-tools/feature/test-libopenssl.make.output $ ldd /tmp/build/perf-tools/feature/test-libopenssl.bin | grep ssl libssl.so.3 => /usr/lib64/libssl.so.3 (0x00007fb81eda8000) $ Fixes: 7678523109d1d9ee ("tools/build: Add a feature test for libopenssl") Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: Namhyung Kim --- tools/build/feature/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile index c699d4f4c6d93..0d5a15654b17b 100644 --- a/tools/build/feature/Makefile +++ b/tools/build/feature/Makefile @@ -107,7 +107,7 @@ all: $(FILES) __BUILD = $(CC) $(CFLAGS) -MD -Wall -Werror -o $@ $(patsubst %.bin,%.c,$(@F)) $(LDFLAGS) BUILD = $(__BUILD) > $(@:.bin=.make.output) 2>&1 BUILD_BFD = $(BUILD) -DPACKAGE='"perf"' -lbfd -ldl - BUILD_ALL = $(BUILD) -fstack-protector-all -O2 -D_FORTIFY_SOURCE=2 -ldw -lelf -lnuma -lelf -lslang $(FLAGS_PERL_EMBED) $(FLAGS_PYTHON_EMBED) -ldl -lz -llzma -lzstd + BUILD_ALL = $(BUILD) -fstack-protector-all -O2 -D_FORTIFY_SOURCE=2 -ldw -lelf -lnuma -lelf -lslang $(FLAGS_PERL_EMBED) $(FLAGS_PYTHON_EMBED) -ldl -lz -llzma -lzstd -lssl __BUILDXX = $(CXX) $(CXXFLAGS) -MD -Wall -Werror -o $@ $(patsubst %.bin,%.cpp,$(@F)) $(LDFLAGS) BUILDXX = $(__BUILDXX) > $(@:.bin=.make.output) 2>&1 From 8858f2ff7dfed3b04177b89bbcf2779b449c9534 Mon Sep 17 00:00:00 2001 From: Macpaul Lin Date: Fri, 28 Nov 2025 12:17:22 +0800 Subject: [PATCH 0664/1321] pmdomain: mtk-pm-domains: Fix spinlock recursion fix in probe Remove scpsys_get_legacy_regmap(), replacing its usage with of_find_node_with_property(). Explicitly call of_node_get(np) before each of_find_node_with_property() to maintain correct node reference counting. The of_find_node_with_property() function "consumes" its input by calling of_node_put() internally, whether or not it finds a match. Currently, dev->of_node (np) is passed multiple times in sequence without incrementing its reference count, causing it to be decremented multiple times and risking early memory release. Adding of_node_get(np) before each call balances the reference count, preventing premature node release. Fixes: c1bac49fe91f ("pmdomains: mtk-pm-domains: Fix spinlock recursion in probe") Cc: stable@vger.kernel.org Signed-off-by: Macpaul Lin Tested-by: Louis-Alexis Eyraud Signed-off-by: Ulf Hansson --- drivers/pmdomain/mediatek/mtk-pm-domains.c | 21 ++++++--------------- 1 file changed, 6 insertions(+), 15 deletions(-) diff --git a/drivers/pmdomain/mediatek/mtk-pm-domains.c b/drivers/pmdomain/mediatek/mtk-pm-domains.c index 80561d27f2b23..f64f24d520ddd 100644 --- a/drivers/pmdomain/mediatek/mtk-pm-domains.c +++ b/drivers/pmdomain/mediatek/mtk-pm-domains.c @@ -984,18 +984,6 @@ static void scpsys_domain_cleanup(struct scpsys *scpsys) } } -static struct device_node *scpsys_get_legacy_regmap(struct device_node *np, const char *pn) -{ - struct device_node *local_node; - - for_each_child_of_node(np, local_node) { - if (of_property_present(local_node, pn)) - return local_node; - } - - return NULL; -} - static int scpsys_get_bus_protection_legacy(struct device *dev, struct scpsys *scpsys) { const u8 bp_blocks[3] = { @@ -1017,7 +1005,8 @@ static int scpsys_get_bus_protection_legacy(struct device *dev, struct scpsys *s * this makes it then possible to allocate the array of bus_prot * regmaps and convert all to the new style handling. */ - node = scpsys_get_legacy_regmap(np, "mediatek,infracfg"); + of_node_get(np); + node = of_find_node_with_property(np, "mediatek,infracfg"); if (node) { regmap[0] = syscon_regmap_lookup_by_phandle(node, "mediatek,infracfg"); of_node_put(node); @@ -1030,7 +1019,8 @@ static int scpsys_get_bus_protection_legacy(struct device *dev, struct scpsys *s regmap[0] = NULL; } - node = scpsys_get_legacy_regmap(np, "mediatek,smi"); + of_node_get(np); + node = of_find_node_with_property(np, "mediatek,smi"); if (node) { smi_np = of_parse_phandle(node, "mediatek,smi", 0); of_node_put(node); @@ -1048,7 +1038,8 @@ static int scpsys_get_bus_protection_legacy(struct device *dev, struct scpsys *s regmap[1] = NULL; } - node = scpsys_get_legacy_regmap(np, "mediatek,infracfg-nao"); + of_node_get(np); + node = of_find_node_with_property(np, "mediatek,infracfg-nao"); if (node) { regmap[2] = syscon_regmap_lookup_by_phandle(node, "mediatek,infracfg-nao"); num_regmaps++; From 1c5e8373ee5b1ac412eefe2306c3663f1d7ef102 Mon Sep 17 00:00:00 2001 From: Wentao Liang Date: Thu, 11 Dec 2025 04:02:52 +0000 Subject: [PATCH 0665/1321] pmdomain: imx: Fix reference count leak in imx_gpc_probe() of_get_child_by_name() returns a node pointer with refcount incremented. Use the __free() attribute to manage the pgc_node reference, ensuring automatic of_node_put() cleanup when pgc_node goes out of scope. This eliminates the need for explicit error handling paths and avoids reference count leaks. Fixes: 721cabf6c660 ("soc: imx: move PGC handling to a new GPC driver") Cc: stable@vger.kernel.org Signed-off-by: Wentao Liang Reviewed-by: Frank Li Signed-off-by: Ulf Hansson --- drivers/pmdomain/imx/gpc.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/pmdomain/imx/gpc.c b/drivers/pmdomain/imx/gpc.c index a34b260274f7b..de695f1944ab3 100644 --- a/drivers/pmdomain/imx/gpc.c +++ b/drivers/pmdomain/imx/gpc.c @@ -402,13 +402,12 @@ static int imx_gpc_old_dt_init(struct device *dev, struct regmap *regmap, static int imx_gpc_probe(struct platform_device *pdev) { const struct imx_gpc_dt_data *of_id_data = device_get_match_data(&pdev->dev); - struct device_node *pgc_node; + struct device_node *pgc_node __free(device_node) + = of_get_child_by_name(pdev->dev.of_node, "pgc"); struct regmap *regmap; void __iomem *base; int ret; - pgc_node = of_get_child_by_name(pdev->dev.of_node, "pgc"); - /* bail out if DT too old and doesn't provide the necessary info */ if (!of_property_present(pdev->dev.of_node, "#power-domain-cells") && !pgc_node) From c8c91f1ab428d99a83c9540047d749fd208db139 Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Thu, 4 Dec 2025 15:31:27 +0000 Subject: [PATCH 0666/1321] entry: Always inline local_irq_{enable,disable}_exit_to_user() clang needs __always_inline instead of inline, even for tiny helpers. This saves some cycles in system call fast path, and saves 195 bytes on x86_64 build: $ size vmlinux.before vmlinux.after text data bss dec hex filename 34652814 22291961 5875180 62819955 3be8e73 vmlinux.before 34652619 22291961 5875180 62819760 3be8db0 vmlinux.after Signed-off-by: Eric Dumazet Signed-off-by: Peter Zijlstra (Intel) Link: https://patch.msgid.link/20251204153127.1321824-1-edumazet@google.com --- include/linux/irq-entry-common.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/include/linux/irq-entry-common.h b/include/linux/irq-entry-common.h index 6ab913e57da0a..d26d1b1bcbfb9 100644 --- a/include/linux/irq-entry-common.h +++ b/include/linux/irq-entry-common.h @@ -110,7 +110,7 @@ static __always_inline void enter_from_user_mode(struct pt_regs *regs) static inline void local_irq_enable_exit_to_user(unsigned long ti_work); #ifndef local_irq_enable_exit_to_user -static inline void local_irq_enable_exit_to_user(unsigned long ti_work) +static __always_inline void local_irq_enable_exit_to_user(unsigned long ti_work) { local_irq_enable(); } @@ -125,7 +125,7 @@ static inline void local_irq_enable_exit_to_user(unsigned long ti_work) static inline void local_irq_disable_exit_to_user(void); #ifndef local_irq_disable_exit_to_user -static inline void local_irq_disable_exit_to_user(void) +static __always_inline void local_irq_disable_exit_to_user(void) { local_irq_disable(); } From 719ccc80105f3cc6ade5e1f6f8d4c2182853a60e Mon Sep 17 00:00:00 2001 From: Linus Torvalds Date: Sun, 4 Jan 2026 14:41:55 -0800 Subject: [PATCH 0667/1321] Linux 6.19-rc4 --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 27ce077520fe1..665b79aa21b8d 100644 --- a/Makefile +++ b/Makefile @@ -2,7 +2,7 @@ VERSION = 6 PATCHLEVEL = 19 SUBLEVEL = 0 -EXTRAVERSION = -rc3 +EXTRAVERSION = -rc4 NAME = Baby Opossum Posse # *DOCUMENTATION* From 4514dc682cf3dbe4428645336a8a68cfee0eae62 Mon Sep 17 00:00:00 2001 From: Saravana Kannan Date: Sun, 28 Dec 2025 22:03:49 -0800 Subject: [PATCH 0668/1321] MAINTAINERS: Update Saravana Kannan's email address I've switched employers. Updating email address to be employer-agnostic and remain reachable. Also updating mailmap so all my previous emails map to this new one. Signed-off-by: Saravana Kannan Link: https://patch.msgid.link/20251229060350.852-1-saravanak@kernel.org Signed-off-by: Rob Herring (Arm) --- .mailmap | 2 ++ MAINTAINERS | 6 +++--- 2 files changed, 5 insertions(+), 3 deletions(-) diff --git a/.mailmap b/.mailmap index 7a6110d0e46d5..1873cfa159378 100644 --- a/.mailmap +++ b/.mailmap @@ -705,6 +705,8 @@ Sankeerth Billakanti Santosh Shilimkar Santosh Shilimkar Sarangdhar Joshi +Saravana Kannan +Saravana Kannan Sascha Hauer Sahitya Tummala Sathishkumar Muruganandam diff --git a/MAINTAINERS b/MAINTAINERS index 765ad2daa2183..a0dd762f5648b 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -6532,7 +6532,7 @@ F: rust/kernel/cpufreq.rs F: tools/testing/selftests/cpufreq/ CPU FREQUENCY DRIVERS - VIRTUAL MACHINE CPUFREQ -M: Saravana Kannan +M: Saravana Kannan L: linux-pm@vger.kernel.org S: Maintained F: drivers/cpufreq/virtual-cpufreq.c @@ -7170,7 +7170,7 @@ F: drivers/base/devcoredump.c F: include/linux/devcoredump.h DEVICE DEPENDENCY HELPER SCRIPT -M: Saravana Kannan +M: Saravana Kannan L: linux-kernel@vger.kernel.org S: Maintained F: scripts/dev-needs.sh @@ -19547,7 +19547,7 @@ F: include/linux/oa_tc6.h OPEN FIRMWARE AND FLATTENED DEVICE TREE M: Rob Herring -M: Saravana Kannan +M: Saravana Kannan L: devicetree@vger.kernel.org S: Maintained Q: http://patchwork.kernel.org/project/devicetree/list/ From 224308bff2e99dce96b90735389e35dafe58be76 Mon Sep 17 00:00:00 2001 From: Zilin Guan Date: Wed, 31 Dec 2025 11:49:15 +0000 Subject: [PATCH 0669/1321] of: unittest: Fix memory leak in unittest_data_add() In unittest_data_add(), if of_resolve_phandles() fails, the allocated unittest_data is not freed, leading to a memory leak. Fix this by using scope-based cleanup helper __free(kfree) for automatic resource cleanup. This ensures unittest_data is automatically freed when it goes out of scope in error paths. For the success path, use retain_and_null_ptr() to transfer ownership of the memory to the device tree and prevent double freeing. Fixes: 2eb46da2a760 ("of/selftest: Use the resolver to fixup phandles") Suggested-by: Rob Herring Co-developed-by: Jianhao Xu Signed-off-by: Jianhao Xu Signed-off-by: Zilin Guan Link: https://patch.msgid.link/20251231114915.234638-1-zilin@seu.edu.cn Signed-off-by: Rob Herring (Arm) --- drivers/of/unittest.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/drivers/of/unittest.c b/drivers/of/unittest.c index 388e9ec2cccf8..3b773aaf9d050 100644 --- a/drivers/of/unittest.c +++ b/drivers/of/unittest.c @@ -1985,7 +1985,6 @@ static void attach_node_and_children(struct device_node *np) */ static int __init unittest_data_add(void) { - void *unittest_data; void *unittest_data_align; struct device_node *unittest_data_node = NULL, *np; /* @@ -2004,7 +2003,7 @@ static int __init unittest_data_add(void) } /* creating copy */ - unittest_data = kmalloc(size + FDT_ALIGN_SIZE, GFP_KERNEL); + void *unittest_data __free(kfree) = kmalloc(size + FDT_ALIGN_SIZE, GFP_KERNEL); if (!unittest_data) return -ENOMEM; @@ -2014,12 +2013,10 @@ static int __init unittest_data_add(void) ret = of_fdt_unflatten_tree(unittest_data_align, NULL, &unittest_data_node); if (!ret) { pr_warn("%s: unflatten testcases tree failed\n", __func__); - kfree(unittest_data); return -ENODATA; } if (!unittest_data_node) { pr_warn("%s: testcases tree is empty\n", __func__); - kfree(unittest_data); return -ENODATA; } @@ -2038,7 +2035,6 @@ static int __init unittest_data_add(void) /* attach the sub-tree to live tree */ if (!of_root) { pr_warn("%s: no live tree to attach sub-tree\n", __func__); - kfree(unittest_data); rc = -ENODEV; goto unlock; } @@ -2059,6 +2055,8 @@ static int __init unittest_data_add(void) EXPECT_END(KERN_INFO, "Duplicate name in testcase-data, renamed to \"duplicate-name#1\""); + retain_and_null_ptr(unittest_data); + unlock: of_overlay_mutex_unlock(); From 84922dc4639a9044ff4edf97551ca681a54b04bb Mon Sep 17 00:00:00 2001 From: Boris Burkov Date: Mon, 1 Dec 2025 12:47:14 -0800 Subject: [PATCH 0670/1321] btrfs: fix qgroup_snapshot_quick_inherit() squota bug qgroup_snapshot_quick_inherit() detects conditions where the snapshot destination would land in the same parent qgroup as the snapshot source subvolume. In this case we can avoid costly qgroup calculations and just add the nodesize of the new snapshot to the parent. However, in the case of squotas this is actually a double count, and also an undercount for deeper qgroup nestings. The following annotated script shows the issue: btrfs quota enable --simple "$mnt" # Create 2-level qgroup hierarchy btrfs qgroup create 2/100 "$mnt" # Q2 (level 2) btrfs qgroup create 1/100 "$mnt" # Q1 (level 1) btrfs qgroup assign 1/100 2/100 "$mnt" # Create base subvolume btrfs subvolume create "$mnt/base" >/dev/null base_id=$(btrfs subvolume show "$mnt/base" | grep 'Subvolume ID:' | awk '{print $3}') # Create intermediate snapshot and add to Q1 btrfs subvolume snapshot "$mnt/base" "$mnt/intermediate" >/dev/null inter_id=$(btrfs subvolume show "$mnt/intermediate" | grep 'Subvolume ID:' | awk '{print $3}') btrfs qgroup assign "0/$inter_id" 1/100 "$mnt" # Create working snapshot with --inherit (auto-adds to Q1) # src=intermediate (in only Q1) # dst=snap (inheriting only into Q1) # This double counts the 16k nodesize of the snapshot in Q1, and # undercounts it in Q2. btrfs subvolume snapshot -i 1/100 "$mnt/intermediate" "$mnt/snap" >/dev/null snap_id=$(btrfs subvolume show "$mnt/snap" | grep 'Subvolume ID:' | awk '{print $3}') # Fully complete snapshot creation sync # Delete working snapshot # Q1 and Q2 will lose the full snap usage btrfs subvolume delete "$mnt/snap" >/dev/null # Delete intermediate and remove from Q1 # Q1 and Q2 will lose the full intermediate usage btrfs qgroup remove "0/$inter_id" 1/100 "$mnt" btrfs subvolume delete "$mnt/intermediate" >/dev/null # Q1 should be at 0, but still has 16k. Q2 is "correct" at 0 (for now...) # Trigger cleaner, wait for deletions mount -o remount,sync=1 "$mnt" btrfs subvolume sync "$mnt" "$snap_id" btrfs subvolume sync "$mnt" "$inter_id" # Remove Q1 from Q2 # Frees 16k more from Q2, underflowing it to 16EiB btrfs qgroup remove 1/100 2/100 "$mnt" # And show the bad state: btrfs qgroup show -pc "$mnt" Qgroupid Referenced Exclusive Parent Child Path -------- ---------- --------- ------ ----- ---- 0/5 16.00KiB 16.00KiB - - 0/256 16.00KiB 16.00KiB - - base 1/100 16.00KiB 16.00KiB - - <0 member qgroups> 2/100 16.00EiB 16.00EiB - - <0 member qgroups> Fix this by simply not doing this quick inheritance with squotas. I suspect that it is also wrong in normal qgroups to not recurse up the qgroup tree in the quick inherit case, though other consistency checks will likely fix it anyway. Fixes: b20fe56cd285 ("btrfs: qgroup: allow quick inherit if snapshot is created and added to the same parent") Reviewed-by: Qu Wenruo Signed-off-by: Boris Burkov Signed-off-by: David Sterba --- fs/btrfs/qgroup.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c index d9d8d9968a582..904d2a05e63ab 100644 --- a/fs/btrfs/qgroup.c +++ b/fs/btrfs/qgroup.c @@ -3211,6 +3211,9 @@ static int qgroup_snapshot_quick_inherit(struct btrfs_fs_info *fs_info, struct btrfs_qgroup_list *list; int nr_parents = 0; + if (btrfs_qgroup_mode(fs_info) != BTRFS_QGROUP_MODE_FULL) + return 0; + src = find_qgroup_rb(fs_info, srcid); if (!src) return -ENOENT; From bfda48da84ff379b62e9f041f8983bdd8fc9ffce Mon Sep 17 00:00:00 2001 From: Qu Wenruo Date: Thu, 4 Dec 2025 14:38:23 +1030 Subject: [PATCH 0671/1321] btrfs: qgroup: update all parent qgroups when doing quick inherit [BUG] There is a bug that if a subvolume has multi-level parent qgroups, and is able to do a quick inherit, only the direct parent qgroup got updated: mkfs.btrfs -f -O quota $dev mount $dev $mnt btrfs subv create $mnt/subv1 btrfs qgroup create 1/100 $mnt btrfs qgroup create 2/100 $mnt btrfs qgroup assign 1/100 2/100 $mnt btrfs qgroup assign 0/256 1/100 $mnt btrfs qgroup show -p --sync $mnt Qgroupid Referenced Exclusive Parent Path -------- ---------- --------- ------ ---- 0/5 16.00KiB 16.00KiB - 0/256 16.00KiB 16.00KiB 1/100 subv1 1/100 16.00KiB 16.00KiB 2/100 2/100<1 member qgroup> 2/100 16.00KiB 16.00KiB - <0 member qgroups> btrfs subv snap -i 1/100 $mnt/subv1 $mnt/snap1 btrfs qgroup show -p --sync $mnt Qgroupid Referenced Exclusive Parent Path -------- ---------- --------- ------ ---- 0/5 16.00KiB 16.00KiB - 0/256 16.00KiB 16.00KiB 1/100 subv1 0/257 16.00KiB 16.00KiB 1/100 snap1 1/100 32.00KiB 32.00KiB 2/100 2/100<1 member qgroup> 2/100 16.00KiB 16.00KiB - <0 member qgroups> # Note that 2/100 is not updated, and qgroup numbers are inconsistent umount $mnt [CAUSE] If the snapshot source subvolume belongs to a parent qgroup, and the new snapshot target is also added to the new same parent qgroup, we allow a quick update without marking qgroup inconsistent. But that quick update only update the parent qgroup, without checking if there is any more parent qgroups. [FIX] Iterate through all parent qgroups during the quick inherit. Reported-by: Boris Burkov Fixes: b20fe56cd285 ("btrfs: qgroup: allow quick inherit if snapshot is created and added to the same parent") Reviewed-by: Boris Burkov Signed-off-by: Qu Wenruo Signed-off-by: David Sterba --- fs/btrfs/qgroup.c | 18 ++++++++++++++++-- 1 file changed, 16 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c index 904d2a05e63ab..206587820fec0 100644 --- a/fs/btrfs/qgroup.c +++ b/fs/btrfs/qgroup.c @@ -3208,7 +3208,10 @@ static int qgroup_snapshot_quick_inherit(struct btrfs_fs_info *fs_info, { struct btrfs_qgroup *src; struct btrfs_qgroup *parent; + struct btrfs_qgroup *qgroup; struct btrfs_qgroup_list *list; + LIST_HEAD(qgroup_list); + const u32 nodesize = fs_info->nodesize; int nr_parents = 0; if (btrfs_qgroup_mode(fs_info) != BTRFS_QGROUP_MODE_FULL) @@ -3248,8 +3251,19 @@ static int qgroup_snapshot_quick_inherit(struct btrfs_fs_info *fs_info, if (parent->excl != parent->rfer) return 1; - parent->excl += fs_info->nodesize; - parent->rfer += fs_info->nodesize; + qgroup_iterator_add(&qgroup_list, parent); + list_for_each_entry(qgroup, &qgroup_list, iterator) { + qgroup->rfer += nodesize; + qgroup->rfer_cmpr += nodesize; + qgroup->excl += nodesize; + qgroup->excl_cmpr += nodesize; + qgroup_dirty(fs_info, qgroup); + + /* Append parent qgroups to @qgroup_list. */ + list_for_each_entry(list, &qgroup->groups, next_group) + qgroup_iterator_add(&qgroup_list, list->group); + } + qgroup_iterator_clean(&qgroup_list); return 0; } From f152cdb71a37af796babcc892548c75a763af667 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Miquel=20Sabat=C3=A9=20Sol=C3=A0?= Date: Tue, 21 Oct 2025 11:11:25 +0200 Subject: [PATCH 0672/1321] btrfs: fix NULL dereference on root when tracing inode eviction MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit When evicting an inode the first thing we do is to setup tracing for it, which implies fetching the root's id. But in btrfs_evict_inode() the root might be NULL, as implied in the next check that we do in btrfs_evict_inode(). Hence, we either should set the ->root_objectid to 0 in case the root is NULL, or we move tracing setup after checking that the root is not NULL. Setting the rootid to 0 at least gives us the possibility to trace this call even in the case when the root is NULL, so that's the solution taken here. Fixes: 1abe9b8a138c ("Btrfs: add initial tracepoint support for btrfs") Reported-by: syzbot+d991fea1b4b23b1f6bf8@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=d991fea1b4b23b1f6bf8 Signed-off-by: Miquel Sabaté Solà Reviewed-by: David Sterba Signed-off-by: David Sterba --- include/trace/events/btrfs.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/include/trace/events/btrfs.h b/include/trace/events/btrfs.h index 7e418f065b945..125bdc166bfed 100644 --- a/include/trace/events/btrfs.h +++ b/include/trace/events/btrfs.h @@ -224,7 +224,8 @@ DECLARE_EVENT_CLASS(btrfs__inode, __entry->generation = BTRFS_I(inode)->generation; __entry->last_trans = BTRFS_I(inode)->last_trans; __entry->logged_trans = BTRFS_I(inode)->logged_trans; - __entry->root_objectid = btrfs_root_id(BTRFS_I(inode)->root); + __entry->root_objectid = BTRFS_I(inode)->root ? + btrfs_root_id(BTRFS_I(inode)->root) : 0; ), TP_printk_btrfs("root=%llu(%s) gen=%llu ino=%llu blocks=%llu " From 4a6bb9d50e4e16fb2fe69425a29b528ac746aec5 Mon Sep 17 00:00:00 2001 From: Robbie Ko Date: Thu, 11 Dec 2025 13:30:33 +0800 Subject: [PATCH 0673/1321] btrfs: fix deadlock in wait_current_trans() due to ignored transaction type When wait_current_trans() is called during start_transaction(), it currently waits for a blocked transaction without considering whether the given transaction type actually needs to wait for that particular transaction state. The btrfs_blocked_trans_types[] array already defines which transaction types should wait for which transaction states, but this check was missing in wait_current_trans(). This can lead to a deadlock scenario involving two transactions and pending ordered extents: 1. Transaction A is in TRANS_STATE_COMMIT_DOING state 2. A worker processing an ordered extent calls start_transaction() with TRANS_JOIN 3. join_transaction() returns -EBUSY because Transaction A is in TRANS_STATE_COMMIT_DOING 4. Transaction A moves to TRANS_STATE_UNBLOCKED and completes 5. A new Transaction B is created (TRANS_STATE_RUNNING) 6. The ordered extent from step 2 is added to Transaction B's pending ordered extents 7. Transaction B immediately starts commit by another task and enters TRANS_STATE_COMMIT_START 8. The worker finally reaches wait_current_trans(), sees Transaction B in TRANS_STATE_COMMIT_START (a blocked state), and waits unconditionally 9. However, TRANS_JOIN should NOT wait for TRANS_STATE_COMMIT_START according to btrfs_blocked_trans_types[] 10. Transaction B is waiting for pending ordered extents to complete 11. Deadlock: Transaction B waits for ordered extent, ordered extent waits for Transaction B This can be illustrated by the following call stacks: CPU0 CPU1 btrfs_finish_ordered_io() start_transaction(TRANS_JOIN) join_transaction() # -EBUSY (Transaction A is # TRANS_STATE_COMMIT_DOING) # Transaction A completes # Transaction B created # ordered extent added to # Transaction B's pending list btrfs_commit_transaction() # Transaction B enters # TRANS_STATE_COMMIT_START # waiting for pending ordered # extents wait_current_trans() # waits for Transaction B # (should not wait!) Task bstore_kv_sync in btrfs_commit_transaction waiting for ordered extents: __schedule+0x2e7/0x8a0 schedule+0x64/0xe0 btrfs_commit_transaction+0xbf7/0xda0 [btrfs] btrfs_sync_file+0x342/0x4d0 [btrfs] __x64_sys_fdatasync+0x4b/0x80 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Task kworker in wait_current_trans waiting for transaction commit: Workqueue: btrfs-syno_nocow btrfs_work_helper [btrfs] __schedule+0x2e7/0x8a0 schedule+0x64/0xe0 wait_current_trans+0xb0/0x110 [btrfs] start_transaction+0x346/0x5b0 [btrfs] btrfs_finish_ordered_io.isra.0+0x49b/0x9c0 [btrfs] btrfs_work_helper+0xe8/0x350 [btrfs] process_one_work+0x1d3/0x3c0 worker_thread+0x4d/0x3e0 kthread+0x12d/0x150 ret_from_fork+0x1f/0x30 Fix this by passing the transaction type to wait_current_trans() and checking btrfs_blocked_trans_types[cur_trans->state] against the given type before deciding to wait. This ensures that transaction types which are allowed to join during certain blocked states will not unnecessarily wait and cause deadlocks. Reviewed-by: Filipe Manana Signed-off-by: Robbie Ko Signed-off-by: Filipe Manana Reviewed-by: David Sterba Signed-off-by: David Sterba --- fs/btrfs/transaction.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c index 05ee4391c83a3..bd03f465e2d3e 100644 --- a/fs/btrfs/transaction.c +++ b/fs/btrfs/transaction.c @@ -520,13 +520,14 @@ static inline int is_transaction_blocked(struct btrfs_transaction *trans) * when this is done, it is safe to start a new transaction, but the current * transaction might not be fully on disk. */ -static void wait_current_trans(struct btrfs_fs_info *fs_info) +static void wait_current_trans(struct btrfs_fs_info *fs_info, unsigned int type) { struct btrfs_transaction *cur_trans; spin_lock(&fs_info->trans_lock); cur_trans = fs_info->running_transaction; - if (cur_trans && is_transaction_blocked(cur_trans)) { + if (cur_trans && is_transaction_blocked(cur_trans) && + (btrfs_blocked_trans_types[cur_trans->state] & type)) { refcount_inc(&cur_trans->use_count); spin_unlock(&fs_info->trans_lock); @@ -701,12 +702,12 @@ start_transaction(struct btrfs_root *root, unsigned int num_items, sb_start_intwrite(fs_info->sb); if (may_wait_transaction(fs_info, type)) - wait_current_trans(fs_info); + wait_current_trans(fs_info, type); do { ret = join_transaction(fs_info, type); if (ret == -EBUSY) { - wait_current_trans(fs_info); + wait_current_trans(fs_info, type); if (unlikely(type == TRANS_ATTACH || type == TRANS_JOIN_NOSTART)) ret = -ENOENT; @@ -1003,7 +1004,7 @@ int btrfs_wait_for_commit(struct btrfs_fs_info *fs_info, u64 transid) void btrfs_throttle(struct btrfs_fs_info *fs_info) { - wait_current_trans(fs_info); + wait_current_trans(fs_info, TRANS_START); } bool btrfs_should_end_transaction(struct btrfs_trans_handle *trans) From 17fe4801773da70d5635eafc8f8a056b1740f96f Mon Sep 17 00:00:00 2001 From: Qu Wenruo Date: Thu, 11 Dec 2025 12:45:17 +1030 Subject: [PATCH 0674/1321] btrfs: fix beyond-EOF write handling [BUG] For the following write sequence with 64K page size and 4K fs block size, it will lead to file extent items to be inserted without any data checksum: mkfs.btrfs -s 4k -f $dev > /dev/null mount $dev $mnt xfs_io -f -c "pwrite 0 16k" -c "pwrite 32k 4k" -c pwrite "60k 64K" \ -c "truncate 16k" $mnt/foobar umount $mnt This will result the following 2 file extent items to be inserted (extra trace point added to insert_ordered_extent_file_extent()): btrfs_finish_one_ordered: root=5 ino=257 file_off=61440 num_bytes=4096 csum_bytes=0 btrfs_finish_one_ordered: root=5 ino=257 file_off=0 num_bytes=16384 csum_bytes=16384 Note for file offset 60K, we're inserting a file extent without any data checksum. Also note that range [32K, 36K) didn't reach insert_ordered_extent_file_extent(), which is the correct behavior as that OE is fully truncated, should not result any file extent. Although file extent at 60K will be later dropped by btrfs_truncate(), if the transaction got committed after file extent inserted but before the file extent dropping, we will have a small window where we have a file extent beyond EOF and without any data checksum. That will cause "btrfs check" to report error. [CAUSE] The sequence happens like this: - Buffered write dirtied the page cache and updated isize Now the inode size is 64K, with the following page cache layout: 0 16K 32K 48K 64K |/////////////| |//| |//| - Truncate the inode to 16K Which will trigger writeback through: btrfs_setsize() |- truncate_setsize() | Now the inode size is set to 16K | |- btrfs_truncate() |- btrfs_wait_ordered_range() for [16K, u64(-1)] |- btrfs_fdatawrite_range() for [16K, u64(-1)} |- extent_writepage() for folio 0 |- writepage_delalloc() | Generated OE for [0, 16K), [32K, 36K] and [60K, 64K) | |- extent_writepage_io() Then inside extent_writepage_io(), the dirty fs blocks are handled differently: - Submit write for range [0, 16K) As they are still inside the inode size (16K). - Mark OE [32K, 36K) as truncated Since we only call btrfs_lookup_first_ordered_range() once, which returned the first OE after file offset 16K. - Mark all OEs inside range [16K, 64K) as finished Which will mark OE ranges [32K, 36K) and [60K, 64K) as finished. For OE [32K, 36K) since it's already marked as truncated, and its truncated length is 0, no file extent will be inserted. For OE [60K, 64K) it has never been submitted thus has no data checksum, and we insert the file extent as usual. This is the root cause of file extent at 60K to be inserted without any data checksum. - Clear dirty flags for range [16K, 64K) It is the function btrfs_folio_clear_dirty() which searches and clears any dirty blocks inside that range. [FIX] The bug itself was introduced a long time ago, way before subpage and large folio support. At that time, fs block size must match page size, thus the range [cur, end) is just one fs block. But later with subpage and large folios, the same range [cur, end) can have multiple blocks and ordered extents. Later commit 18de34daa7c6 ("btrfs: truncate ordered extent when skipping writeback past i_size") was fixing a bug related to subpage/large folios, but it's still utilizing the old range [cur, end), meaning only the first OE will be marked as truncated. The proper fix here is to make EOF handling block-by-block, not trying to handle the whole range to @end. By this we always locate and truncate the OE for every dirty block. CC: stable@vger.kernel.org # 5.15+ Reviewed-by: Filipe Manana Signed-off-by: Qu Wenruo Signed-off-by: David Sterba --- fs/btrfs/extent_io.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 629fd5af42860..a4b74023618d3 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -1728,7 +1728,7 @@ static noinline_for_stack int extent_writepage_io(struct btrfs_inode *inode, struct btrfs_ordered_extent *ordered; ordered = btrfs_lookup_first_ordered_range(inode, cur, - folio_end - cur); + fs_info->sectorsize); /* * We have just run delalloc before getting here, so * there must be an ordered extent. @@ -1742,7 +1742,7 @@ static noinline_for_stack int extent_writepage_io(struct btrfs_inode *inode, btrfs_put_ordered_extent(ordered); btrfs_mark_ordered_io_finished(inode, folio, cur, - end - cur, true); + fs_info->sectorsize, true); /* * This range is beyond i_size, thus we don't need to * bother writing back. @@ -1751,8 +1751,8 @@ static noinline_for_stack int extent_writepage_io(struct btrfs_inode *inode, * writeback the sectors with subpage dirty bits, * causing writeback without ordered extent. */ - btrfs_folio_clear_dirty(fs_info, folio, cur, end - cur); - break; + btrfs_folio_clear_dirty(fs_info, folio, cur, fs_info->sectorsize); + continue; } ret = submit_one_sector(inode, folio, cur, bio_ctrl, i_size); if (unlikely(ret < 0)) { From ad3137639ffd74c8acb409c1d8ea209f4d956c11 Mon Sep 17 00:00:00 2001 From: Filipe Manana Date: Thu, 11 Dec 2025 15:06:26 +0000 Subject: [PATCH 0675/1321] btrfs: always detect conflicting inodes when logging inode refs After rename exchanging (either with the rename exchange operation or regular renames in multiple non-atomic steps) two inodes and at least one of them is a directory, we can end up with a log tree that contains only of the inodes and after a power failure that can result in an attempt to delete the other inode when it should not because it was not deleted before the power failure. In some case that delete attempt fails when the target inode is a directory that contains a subvolume inside it, since the log replay code is not prepared to deal with directory entries that point to root items (only inode items). 1) We have directories "dir1" (inode A) and "dir2" (inode B) under the same parent directory; 2) We have a file (inode C) under directory "dir1" (inode A); 3) We have a subvolume inside directory "dir2" (inode B); 4) All these inodes were persisted in a past transaction and we are currently at transaction N; 5) We rename the file (inode C), so at btrfs_log_new_name() we update inode C's last_unlink_trans to N; 6) We get a rename exchange for "dir1" (inode A) and "dir2" (inode B), so after the exchange "dir1" is inode B and "dir2" is inode A. During the rename exchange we call btrfs_log_new_name() for inodes A and B, but because they are directories, we don't update their last_unlink_trans to N; 7) An fsync against the file (inode C) is done, and because its inode has a last_unlink_trans with a value of N we log its parent directory (inode A) (through btrfs_log_all_parents(), called from btrfs_log_inode_parent()). 8) So we end up with inode B not logged, which now has the old name of inode A. At copy_inode_items_to_log(), when logging inode A, we did not check if we had any conflicting inode to log because inode A has a generation lower than the current transaction (created in a past transaction); 9) After a power failure, when replaying the log tree, since we find that inode A has a new name that conflicts with the name of inode B in the fs tree, we attempt to delete inode B... this is wrong since that directory was never deleted before the power failure, and because there is a subvolume inside that directory, attempting to delete it will fail since replay_dir_deletes() and btrfs_unlink_inode() are not prepared to deal with dir items that point to roots instead of inodes. When that happens the mount fails and we get a stack trace like the following: [87.2314] BTRFS info (device dm-0): start tree-log replay [87.2318] BTRFS critical (device dm-0): failed to delete reference to subvol, root 5 inode 256 parent 259 [87.2332] ------------[ cut here ]------------ [87.2338] BTRFS: Transaction aborted (error -2) [87.2346] WARNING: CPU: 1 PID: 638968 at fs/btrfs/inode.c:4345 __btrfs_unlink_inode+0x416/0x440 [btrfs] [87.2368] Modules linked in: btrfs loop dm_thin_pool (...) [87.2470] CPU: 1 UID: 0 PID: 638968 Comm: mount Tainted: G W 6.18.0-rc7-btrfs-next-218+ #2 PREEMPT(full) [87.2489] Tainted: [W]=WARN [87.2494] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014 [87.2514] RIP: 0010:__btrfs_unlink_inode+0x416/0x440 [btrfs] [87.2538] Code: c0 89 04 24 (...) [87.2568] RSP: 0018:ffffc0e741f4b9b8 EFLAGS: 00010286 [87.2574] RAX: 0000000000000000 RBX: ffff9d3ec8a6cf60 RCX: 0000000000000000 [87.2582] RDX: 0000000000000002 RSI: ffffffff84ab45a1 RDI: 00000000ffffffff [87.2591] RBP: ffff9d3ec8a6ef20 R08: 0000000000000000 R09: ffffc0e741f4b840 [87.2599] R10: ffff9d45dc1fffa8 R11: 0000000000000003 R12: ffff9d3ee26d77e0 [87.2608] R13: ffffc0e741f4ba98 R14: ffff9d4458040800 R15: ffff9d44b6b7ca10 [87.2618] FS: 00007f7b9603a840(0000) GS:ffff9d4658982000(0000) knlGS:0000000000000000 [87.2629] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [87.2637] CR2: 00007ffc9ec33b98 CR3: 000000011273e003 CR4: 0000000000370ef0 [87.2648] Call Trace: [87.2651] [87.2654] btrfs_unlink_inode+0x15/0x40 [btrfs] [87.2661] unlink_inode_for_log_replay+0x27/0xf0 [btrfs] [87.2669] check_item_in_log+0x1ea/0x2c0 [btrfs] [87.2676] replay_dir_deletes+0x16b/0x380 [btrfs] [87.2684] fixup_inode_link_count+0x34b/0x370 [btrfs] [87.2696] fixup_inode_link_counts+0x41/0x160 [btrfs] [87.2703] btrfs_recover_log_trees+0x1ff/0x7c0 [btrfs] [87.2711] ? __pfx_replay_one_buffer+0x10/0x10 [btrfs] [87.2719] open_ctree+0x10bb/0x15f0 [btrfs] [87.2726] btrfs_get_tree.cold+0xb/0x16c [btrfs] [87.2734] ? fscontext_read+0x15c/0x180 [87.2740] ? rw_verify_area+0x50/0x180 [87.2746] vfs_get_tree+0x25/0xd0 [87.2750] vfs_cmd_create+0x59/0xe0 [87.2755] __do_sys_fsconfig+0x4f6/0x6b0 [87.2760] do_syscall_64+0x50/0x1220 [87.2764] entry_SYSCALL_64_after_hwframe+0x76/0x7e [87.2770] RIP: 0033:0x7f7b9625f4aa [87.2775] Code: 73 01 c3 48 (...) [87.2803] RSP: 002b:00007ffc9ec35b08 EFLAGS: 00000246 ORIG_RAX: 00000000000001af [87.2817] RAX: ffffffffffffffda RBX: 0000558bfa91ac20 RCX: 00007f7b9625f4aa [87.2829] RDX: 0000000000000000 RSI: 0000000000000006 RDI: 0000000000000003 [87.2842] RBP: 0000558bfa91b120 R08: 0000000000000000 R09: 0000000000000000 [87.2854] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 [87.2864] R13: 00007f7b963f1580 R14: 00007f7b963f326c R15: 00007f7b963d8a23 [87.2877] [87.2882] ---[ end trace 0000000000000000 ]--- [87.2891] BTRFS: error (device dm-0 state A) in __btrfs_unlink_inode:4345: errno=-2 No such entry [87.2904] BTRFS: error (device dm-0 state EAO) in do_abort_log_replay:191: errno=-2 No such entry [87.2915] BTRFS critical (device dm-0 state EAO): log tree (for root 5) leaf currently being processed (slot 7 key (258 12 257)): [87.2929] BTRFS info (device dm-0 state EAO): leaf 30736384 gen 10 total ptrs 7 free space 15712 owner 18446744073709551610 [87.2929] BTRFS info (device dm-0 state EAO): refs 3 lock_owner 0 current 638968 [87.2929] item 0 key (257 INODE_ITEM 0) itemoff 16123 itemsize 160 [87.2929] inode generation 9 transid 10 size 0 nbytes 0 [87.2929] block group 0 mode 40755 links 1 uid 0 gid 0 [87.2929] rdev 0 sequence 7 flags 0x0 [87.2929] atime 1765464494.678070921 [87.2929] ctime 1765464494.686606513 [87.2929] mtime 1765464494.686606513 [87.2929] otime 1765464494.678070921 [87.2929] item 1 key (257 INODE_REF 256) itemoff 16109 itemsize 14 [87.2929] index 4 name_len 4 [87.2929] item 2 key (257 DIR_LOG_INDEX 2) itemoff 16101 itemsize 8 [87.2929] dir log end 2 [87.2929] item 3 key (257 DIR_LOG_INDEX 3) itemoff 16093 itemsize 8 [87.2929] dir log end 18446744073709551615 [87.2930] item 4 key (257 DIR_INDEX 3) itemoff 16060 itemsize 33 [87.2930] location key (258 1 0) type 1 [87.2930] transid 10 data_len 0 name_len 3 [87.2930] item 5 key (258 INODE_ITEM 0) itemoff 15900 itemsize 160 [87.2930] inode generation 9 transid 10 size 0 nbytes 0 [87.2930] block group 0 mode 100644 links 1 uid 0 gid 0 [87.2930] rdev 0 sequence 2 flags 0x0 [87.2930] atime 1765464494.678456467 [87.2930] ctime 1765464494.686606513 [87.2930] mtime 1765464494.678456467 [87.2930] otime 1765464494.678456467 [87.2930] item 6 key (258 INODE_REF 257) itemoff 15887 itemsize 13 [87.2930] index 3 name_len 3 [87.2930] BTRFS critical (device dm-0 state EAO): log replay failed in unlink_inode_for_log_replay:1045 for root 5, stage 3, with error -2: failed to unlink inode 256 parent dir 259 name subvol root 5 [87.2963] BTRFS: error (device dm-0 state EAO) in btrfs_recover_log_trees:7743: errno=-2 No such entry [87.2981] BTRFS: error (device dm-0 state EAO) in btrfs_replay_log:2083: errno=-2 No such entry (Failed to recover log tr So fix this by changing copy_inode_items_to_log() to always detect if there are conflicting inodes for the ref/extref of the inode being logged even if the inode was created in a past transaction. A test case for fstests will follow soon. CC: stable@vger.kernel.org # 6.1+ Signed-off-by: Filipe Manana Signed-off-by: David Sterba --- fs/btrfs/tree-log.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c index 31edc93a383e2..5831754bb01c4 100644 --- a/fs/btrfs/tree-log.c +++ b/fs/btrfs/tree-log.c @@ -6341,10 +6341,8 @@ static int copy_inode_items_to_log(struct btrfs_trans_handle *trans, * and no keys greater than that, so bail out. */ break; - } else if ((min_key->type == BTRFS_INODE_REF_KEY || - min_key->type == BTRFS_INODE_EXTREF_KEY) && - (inode->generation == trans->transid || - ctx->logging_conflict_inodes)) { + } else if (min_key->type == BTRFS_INODE_REF_KEY || + min_key->type == BTRFS_INODE_EXTREF_KEY) { u64 other_ino = 0; u64 other_parent = 0; From 3c0806f45b2ede801f65724d902d7a8a840f8cd1 Mon Sep 17 00:00:00 2001 From: Leo Martins Date: Fri, 12 Dec 2025 17:26:26 -0800 Subject: [PATCH 0676/1321] btrfs: fix use-after-free warning in btrfs_get_or_create_delayed_node() Previously, btrfs_get_or_create_delayed_node() set the delayed_node's refcount before acquiring the root->delayed_nodes lock. Commit e8513c012de7 ("btrfs: implement ref_tracker for delayed_nodes") moved refcount_set inside the critical section, which means there is no longer a memory barrier between setting the refcount and setting btrfs_inode->delayed_node. Without that barrier, the stores to node->refs and btrfs_inode->delayed_node may become visible out of order. Another thread can then read btrfs_inode->delayed_node and attempt to increment a refcount that hasn't been set yet, leading to a refcounting bug and a use-after-free warning. The fix is to move refcount_set back to where it was to take advantage of the implicit memory barrier provided by lock acquisition. Because the allocations now happen outside of the lock's critical section, they can use GFP_NOFS instead of GFP_ATOMIC. Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-lkp/202511262228.6dda231e-lkp@intel.com Fixes: e8513c012de7 ("btrfs: implement ref_tracker for delayed_nodes") Tested-by: kernel test robot Reviewed-by: Filipe Manana Signed-off-by: Leo Martins Signed-off-by: David Sterba --- fs/btrfs/delayed-inode.c | 32 +++++++++++++++++--------------- 1 file changed, 17 insertions(+), 15 deletions(-) diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c index ce6e9f8812e06..4b7d9015e0dad 100644 --- a/fs/btrfs/delayed-inode.c +++ b/fs/btrfs/delayed-inode.c @@ -152,37 +152,39 @@ static struct btrfs_delayed_node *btrfs_get_or_create_delayed_node( return ERR_PTR(-ENOMEM); btrfs_init_delayed_node(node, root, ino); + /* Cached in the inode and can be accessed. */ + refcount_set(&node->refs, 2); + btrfs_delayed_node_ref_tracker_alloc(node, tracker, GFP_NOFS); + btrfs_delayed_node_ref_tracker_alloc(node, &node->inode_cache_tracker, GFP_NOFS); + /* Allocate and reserve the slot, from now it can return a NULL from xa_load(). */ ret = xa_reserve(&root->delayed_nodes, ino, GFP_NOFS); - if (ret == -ENOMEM) { - btrfs_delayed_node_ref_tracker_dir_exit(node); - kmem_cache_free(delayed_node_cache, node); - return ERR_PTR(-ENOMEM); - } + if (ret == -ENOMEM) + goto cleanup; + xa_lock(&root->delayed_nodes); ptr = xa_load(&root->delayed_nodes, ino); if (ptr) { /* Somebody inserted it, go back and read it. */ xa_unlock(&root->delayed_nodes); - btrfs_delayed_node_ref_tracker_dir_exit(node); - kmem_cache_free(delayed_node_cache, node); - node = NULL; - goto again; + goto cleanup; } ptr = __xa_store(&root->delayed_nodes, ino, node, GFP_ATOMIC); ASSERT(xa_err(ptr) != -EINVAL); ASSERT(xa_err(ptr) != -ENOMEM); ASSERT(ptr == NULL); - - /* Cached in the inode and can be accessed. */ - refcount_set(&node->refs, 2); - btrfs_delayed_node_ref_tracker_alloc(node, tracker, GFP_ATOMIC); - btrfs_delayed_node_ref_tracker_alloc(node, &node->inode_cache_tracker, GFP_ATOMIC); - btrfs_inode->delayed_node = node; xa_unlock(&root->delayed_nodes); return node; +cleanup: + btrfs_delayed_node_ref_tracker_free(node, tracker); + btrfs_delayed_node_ref_tracker_free(node, &node->inode_cache_tracker); + btrfs_delayed_node_ref_tracker_dir_exit(node); + kmem_cache_free(delayed_node_cache, node); + if (ret) + return ERR_PTR(ret); + goto again; } /* From ad7abda156aea4dba544f17244cc59d742b85dee Mon Sep 17 00:00:00 2001 From: Filipe Manana Date: Fri, 12 Dec 2025 17:10:10 +0000 Subject: [PATCH 0677/1321] btrfs: do not free data reservation in fallback from inline due to -ENOSPC If we fail to create an inline extent due to -ENOSPC, we will attempt to go through the normal COW path, reserve an extent, create an ordered extent, etc. However we were always freeing the reserved qgroup data, which is wrong since we will use data. Fix this by freeing the reserved qgroup data in __cow_file_range_inline() only if we are not doing the fallback (ret is <= 0). Reviewed-by: Qu Wenruo Signed-off-by: Filipe Manana Reviewed-by: David Sterba Signed-off-by: David Sterba --- fs/btrfs/inode.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 317db7d10a21d..d0b0113fbea20 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -674,8 +674,12 @@ static noinline int __cow_file_range_inline(struct btrfs_inode *inode, * it won't count as data extent, free them directly here. * And at reserve time, it's always aligned to page size, so * just free one page here. + * + * If we fallback to non-inline (ret == 1) due to -ENOSPC, then we need + * to keep the data reservation. */ - btrfs_qgroup_free_data(inode, NULL, 0, fs_info->sectorsize, NULL); + if (ret <= 0) + btrfs_qgroup_free_data(inode, NULL, 0, fs_info->sectorsize, NULL); btrfs_free_path(path); btrfs_end_transaction(trans); return ret; From 0ee85cda5dbcd457f09229baec6eb18afd421a1a Mon Sep 17 00:00:00 2001 From: Filipe Manana Date: Fri, 12 Dec 2025 17:18:25 +0000 Subject: [PATCH 0678/1321] btrfs: fix reservation leak in some error paths when inserting inline extent If we fail to allocate a path or join a transaction, we return from __cow_file_range_inline() without freeing the reserved qgroup data, resulting in a leak. Fix this by ensuring we call btrfs_qgroup_free_data() in such cases. Signed-off-by: Filipe Manana Reviewed-by: David Sterba Signed-off-by: David Sterba --- fs/btrfs/inode.c | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index d0b0113fbea20..599c03a1c5737 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -618,19 +618,22 @@ static noinline int __cow_file_range_inline(struct btrfs_inode *inode, struct btrfs_drop_extents_args drop_args = { 0 }; struct btrfs_root *root = inode->root; struct btrfs_fs_info *fs_info = root->fs_info; - struct btrfs_trans_handle *trans; + struct btrfs_trans_handle *trans = NULL; u64 data_len = (compressed_size ?: size); int ret; struct btrfs_path *path; path = btrfs_alloc_path(); - if (!path) - return -ENOMEM; + if (!path) { + ret = -ENOMEM; + goto out; + } trans = btrfs_join_transaction(root); if (IS_ERR(trans)) { - btrfs_free_path(path); - return PTR_ERR(trans); + ret = PTR_ERR(trans); + trans = NULL; + goto out; } trans->block_rsv = &inode->block_rsv; @@ -681,7 +684,8 @@ static noinline int __cow_file_range_inline(struct btrfs_inode *inode, if (ret <= 0) btrfs_qgroup_free_data(inode, NULL, 0, fs_info->sectorsize, NULL); btrfs_free_path(path); - btrfs_end_transaction(trans); + if (trans) + btrfs_end_transaction(trans); return ret; } From beca344fa8205d637949abed23858e3589b30126 Mon Sep 17 00:00:00 2001 From: Chuck Lever Date: Tue, 9 Dec 2025 19:28:49 -0500 Subject: [PATCH 0679/1321] NFSD: Remove NFSERR_EAGAIN I haven't found an NFSERR_EAGAIN in RFCs 1094, 1813, 7530, or 8881. None of these RFCs have an NFS status code that match the numeric value "11". Based on the meaning of the EAGAIN errno, I presume the use of this status in NFSD means NFS4ERR_DELAY. So replace the one usage of nfserr_eagain, and remove it from NFSD's NFS status conversion tables. As far as I can tell, NFSERR_EAGAIN has existed since the pre-git era, but was not actually used by any code until commit f4e44b393389 ("NFSD: delay unmount source's export after inter-server copy completed."), at which time it become possible for NFSD to return a status code of 11 (which is not valid NFS protocol). Fixes: f4e44b393389 ("NFSD: delay unmount source's export after inter-server copy completed.") Cc: stable@vger.kernel.org Reviewed-by: NeilBrown Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever --- fs/nfs_common/common.c | 1 - fs/nfsd/nfs4proc.c | 2 +- fs/nfsd/nfsd.h | 1 - include/trace/misc/nfs.h | 2 -- include/uapi/linux/nfs.h | 1 - 5 files changed, 1 insertion(+), 6 deletions(-) diff --git a/fs/nfs_common/common.c b/fs/nfs_common/common.c index af09aed09fd27..0778743ae2c2c 100644 --- a/fs/nfs_common/common.c +++ b/fs/nfs_common/common.c @@ -17,7 +17,6 @@ static const struct { { NFSERR_NOENT, -ENOENT }, { NFSERR_IO, -EIO }, { NFSERR_NXIO, -ENXIO }, -/* { NFSERR_EAGAIN, -EAGAIN }, */ { NFSERR_ACCES, -EACCES }, { NFSERR_EXIST, -EEXIST }, { NFSERR_XDEV, -EXDEV }, diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c index b748009175837..9ec08dd4fe823 100644 --- a/fs/nfsd/nfs4proc.c +++ b/fs/nfsd/nfs4proc.c @@ -1502,7 +1502,7 @@ static __be32 nfsd4_ssc_setup_dul(struct nfsd_net *nn, char *ipaddr, (schedule_timeout(20*HZ) == 0)) { finish_wait(&nn->nfsd_ssc_waitq, &wait); kfree(work); - return nfserr_eagain; + return nfserr_jukebox; } finish_wait(&nn->nfsd_ssc_waitq, &wait); goto try_again; diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h index 50be785f1d2ce..b0283213a8f54 100644 --- a/fs/nfsd/nfsd.h +++ b/fs/nfsd/nfsd.h @@ -233,7 +233,6 @@ void nfsd_lockd_shutdown(void); #define nfserr_noent cpu_to_be32(NFSERR_NOENT) #define nfserr_io cpu_to_be32(NFSERR_IO) #define nfserr_nxio cpu_to_be32(NFSERR_NXIO) -#define nfserr_eagain cpu_to_be32(NFSERR_EAGAIN) #define nfserr_acces cpu_to_be32(NFSERR_ACCES) #define nfserr_exist cpu_to_be32(NFSERR_EXIST) #define nfserr_xdev cpu_to_be32(NFSERR_XDEV) diff --git a/include/trace/misc/nfs.h b/include/trace/misc/nfs.h index c82233e950ac0..a394b4d38e18f 100644 --- a/include/trace/misc/nfs.h +++ b/include/trace/misc/nfs.h @@ -16,7 +16,6 @@ TRACE_DEFINE_ENUM(NFSERR_PERM); TRACE_DEFINE_ENUM(NFSERR_NOENT); TRACE_DEFINE_ENUM(NFSERR_IO); TRACE_DEFINE_ENUM(NFSERR_NXIO); -TRACE_DEFINE_ENUM(NFSERR_EAGAIN); TRACE_DEFINE_ENUM(NFSERR_ACCES); TRACE_DEFINE_ENUM(NFSERR_EXIST); TRACE_DEFINE_ENUM(NFSERR_XDEV); @@ -52,7 +51,6 @@ TRACE_DEFINE_ENUM(NFSERR_JUKEBOX); { NFSERR_NXIO, "NXIO" }, \ { ECHILD, "CHILD" }, \ { ETIMEDOUT, "TIMEDOUT" }, \ - { NFSERR_EAGAIN, "AGAIN" }, \ { NFSERR_ACCES, "ACCES" }, \ { NFSERR_EXIST, "EXIST" }, \ { NFSERR_XDEV, "XDEV" }, \ diff --git a/include/uapi/linux/nfs.h b/include/uapi/linux/nfs.h index f356f2ba38142..71c7196d32817 100644 --- a/include/uapi/linux/nfs.h +++ b/include/uapi/linux/nfs.h @@ -49,7 +49,6 @@ NFSERR_NOENT = 2, /* v2 v3 v4 */ NFSERR_IO = 5, /* v2 v3 v4 */ NFSERR_NXIO = 6, /* v2 v3 v4 */ - NFSERR_EAGAIN = 11, /* v2 v3 */ NFSERR_ACCES = 13, /* v2 v3 v4 */ NFSERR_EXIST = 17, /* v2 v3 v4 */ NFSERR_XDEV = 18, /* v3 v4 */ From d5fbd89f7a955bcc2f2fbe2baf79cd07b828b8f6 Mon Sep 17 00:00:00 2001 From: Scott Mayhew Date: Thu, 11 Dec 2025 07:34:34 -0500 Subject: [PATCH 0680/1321] NFSD: Fix permission check for read access to executable-only files Commit abc02e5602f7 ("NFSD: Support write delegations in LAYOUTGET") added NFSD_MAY_OWNER_OVERRIDE to the access flags passed from nfsd4_layoutget() to fh_verify(). This causes LAYOUTGET to fail for executable-only files, and causes xfstests generic/126 to fail on pNFS SCSI. To allow read access to executable-only files, what we really want is: 1. The "permissions" portion of the access flags (the lower 6 bits) must be exactly NFSD_MAY_READ 2. The "hints" portion of the access flags (the upper 26 bits) can contain any combination of NFSD_MAY_OWNER_OVERRIDE and NFSD_MAY_READ_IF_EXEC Fixes: abc02e5602f7 ("NFSD: Support write delegations in LAYOUTGET") Cc: stable@vger.kernel.org # v6.6+ Signed-off-by: Scott Mayhew Reviewed-by: Jeff Layton Reviewed-by: NeilBrown Signed-off-by: Chuck Lever --- fs/nfsd/vfs.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c index 964cf922ad838..168d3ccc81557 100644 --- a/fs/nfsd/vfs.c +++ b/fs/nfsd/vfs.c @@ -2865,8 +2865,8 @@ nfsd_permission(struct svc_cred *cred, struct svc_export *exp, /* Allow read access to binaries even when mode 111 */ if (err == -EACCES && S_ISREG(inode->i_mode) && - (acc == (NFSD_MAY_READ | NFSD_MAY_OWNER_OVERRIDE) || - acc == (NFSD_MAY_READ | NFSD_MAY_READ_IF_EXEC))) + (((acc & NFSD_MAY_MASK) == NFSD_MAY_READ) && + (acc & (NFSD_MAY_OWNER_OVERRIDE | NFSD_MAY_READ_IF_EXEC)))) err = inode_permission(&nop_mnt_idmap, inode, MAY_EXEC); return err? nfserrno(err) : 0; From df3b9d17a73800cd8574c590748a27a78714a4da Mon Sep 17 00:00:00 2001 From: NeilBrown Date: Sat, 13 Dec 2025 13:41:59 -0500 Subject: [PATCH 0681/1321] nfsd: provide locking for v4_end_grace Writing to v4_end_grace can race with server shutdown and result in memory being accessed after it was freed - reclaim_str_hashtbl in particularly. We cannot hold nfsd_mutex across the nfsd4_end_grace() call as that is held while client_tracking_op->init() is called and that can wait for an upcall to nfsdcltrack which can write to v4_end_grace, resulting in a deadlock. nfsd4_end_grace() is also called by the landromat work queue and this doesn't require locking as server shutdown will stop the work and wait for it before freeing anything that nfsd4_end_grace() might access. However, we must be sure that writing to v4_end_grace doesn't restart the work item after shutdown has already waited for it. For this we add a new flag protected with nn->client_lock. It is set only while it is safe to make client tracking calls, and v4_end_grace only schedules work while the flag is set with the spinlock held. So this patch adds a nfsd_net field "client_tracking_active" which is set as described. Another field "grace_end_forced", is set when v4_end_grace is written. After this is set, and providing client_tracking_active is set, the laundromat is scheduled. This "grace_end_forced" field bypasses other checks for whether the grace period has finished. This resolves a race which can result in use-after-free. Reported-by: Li Lingfeng Closes: https://lore.kernel.org/linux-nfs/20250623030015.2353515-1-neil@brown.name/T/#t Fixes: 7f5ef2e900d9 ("nfsd: add a v4_end_grace file to /proc/fs/nfsd") Cc: stable@vger.kernel.org Signed-off-by: NeilBrown Tested-by: Li Lingfeng Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever --- fs/nfsd/netns.h | 2 ++ fs/nfsd/nfs4state.c | 42 ++++++++++++++++++++++++++++++++++++++++-- fs/nfsd/nfsctl.c | 3 +-- fs/nfsd/state.h | 2 +- 4 files changed, 44 insertions(+), 5 deletions(-) diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h index 3e2d0fde80a7c..fe8338735e7cc 100644 --- a/fs/nfsd/netns.h +++ b/fs/nfsd/netns.h @@ -66,6 +66,8 @@ struct nfsd_net { struct lock_manager nfsd4_manager; bool grace_ended; + bool grace_end_forced; + bool client_tracking_active; time64_t boot_time; struct dentry *nfsd_client_dir; diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c index a6e8a1f27133c..43b638642e7c3 100644 --- a/fs/nfsd/nfs4state.c +++ b/fs/nfsd/nfs4state.c @@ -84,7 +84,7 @@ static u64 current_sessionid = 1; /* forward declarations */ static bool check_for_locks(struct nfs4_file *fp, struct nfs4_lockowner *lowner); static void nfs4_free_ol_stateid(struct nfs4_stid *stid); -void nfsd4_end_grace(struct nfsd_net *nn); +static void nfsd4_end_grace(struct nfsd_net *nn); static void _free_cpntf_state_locked(struct nfsd_net *nn, struct nfs4_cpntf_state *cps); static void nfsd4_file_hash_remove(struct nfs4_file *fi); static void deleg_reaper(struct nfsd_net *nn); @@ -6570,7 +6570,7 @@ nfsd4_renew(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, return nfs_ok; } -void +static void nfsd4_end_grace(struct nfsd_net *nn) { /* do nothing if grace period already ended */ @@ -6603,6 +6603,33 @@ nfsd4_end_grace(struct nfsd_net *nn) */ } +/** + * nfsd4_force_end_grace - forcibly end the NFSv4 grace period + * @nn: network namespace for the server instance to be updated + * + * Forces bypass of normal grace period completion, then schedules + * the laundromat to end the grace period immediately. Does not wait + * for the grace period to fully terminate before returning. + * + * Return values: + * %true: Grace termination schedule + * %false: No action was taken + */ +bool nfsd4_force_end_grace(struct nfsd_net *nn) +{ + if (!nn->client_tracking_ops) + return false; + spin_lock(&nn->client_lock); + if (nn->grace_ended || !nn->client_tracking_active) { + spin_unlock(&nn->client_lock); + return false; + } + WRITE_ONCE(nn->grace_end_forced, true); + mod_delayed_work(laundry_wq, &nn->laundromat_work, 0); + spin_unlock(&nn->client_lock); + return true; +} + /* * If we've waited a lease period but there are still clients trying to * reclaim, wait a little longer to give them a chance to finish. @@ -6612,6 +6639,8 @@ static bool clients_still_reclaiming(struct nfsd_net *nn) time64_t double_grace_period_end = nn->boot_time + 2 * nn->nfsd4_lease; + if (READ_ONCE(nn->grace_end_forced)) + return false; if (nn->track_reclaim_completes && atomic_read(&nn->nr_reclaim_complete) == nn->reclaim_str_hashtbl_size) @@ -8932,6 +8961,8 @@ static int nfs4_state_create_net(struct net *net) nn->unconf_name_tree = RB_ROOT; nn->boot_time = ktime_get_real_seconds(); nn->grace_ended = false; + nn->grace_end_forced = false; + nn->client_tracking_active = false; nn->nfsd4_manager.block_opens = true; INIT_LIST_HEAD(&nn->nfsd4_manager.list); INIT_LIST_HEAD(&nn->client_lru); @@ -9012,6 +9043,10 @@ nfs4_state_start_net(struct net *net) return ret; locks_start_grace(net, &nn->nfsd4_manager); nfsd4_client_tracking_init(net); + /* safe for laundromat to run now */ + spin_lock(&nn->client_lock); + nn->client_tracking_active = true; + spin_unlock(&nn->client_lock); if (nn->track_reclaim_completes && nn->reclaim_str_hashtbl_size == 0) goto skip_grace; printk(KERN_INFO "NFSD: starting %lld-second grace period (net %x)\n", @@ -9060,6 +9095,9 @@ nfs4_state_shutdown_net(struct net *net) shrinker_free(nn->nfsd_client_shrinker); cancel_work_sync(&nn->nfsd_shrinker_work); + spin_lock(&nn->client_lock); + nn->client_tracking_active = false; + spin_unlock(&nn->client_lock); cancel_delayed_work_sync(&nn->laundromat_work); locks_end_grace(&nn->nfsd4_manager); diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c index 5ce9a49e76ba4..242fcbd958f12 100644 --- a/fs/nfsd/nfsctl.c +++ b/fs/nfsd/nfsctl.c @@ -1082,10 +1082,9 @@ static ssize_t write_v4_end_grace(struct file *file, char *buf, size_t size) case 'Y': case 'y': case '1': - if (!nn->nfsd_serv) + if (!nfsd4_force_end_grace(nn)) return -EBUSY; trace_nfsd_end_grace(netns(file)); - nfsd4_end_grace(nn); break; default: return -EINVAL; diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h index b052c1effdc53..848c5383d7820 100644 --- a/fs/nfsd/state.h +++ b/fs/nfsd/state.h @@ -849,7 +849,7 @@ static inline void nfsd4_revoke_states(struct net *net, struct super_block *sb) #endif /* grace period management */ -void nfsd4_end_grace(struct nfsd_net *nn); +bool nfsd4_force_end_grace(struct nfsd_net *nn); /* nfs4recover operations */ extern int nfsd4_client_tracking_init(struct net *net); From 776cb1551b807d0d252318496d7313f5f445f3ef Mon Sep 17 00:00:00 2001 From: NeilBrown Date: Mon, 15 Dec 2025 08:07:28 +1100 Subject: [PATCH 0682/1321] nfsd: use correct loop termination in nfsd4_revoke_states() The loop in nfsd4_revoke_states() stops one too early because the end value given is CLIENT_HASH_MASK where it should be CLIENT_HASH_SIZE. This means that an admin request to drop all locks for a filesystem will miss locks held by clients which hash to the maximum possible hash value. Fixes: 1ac3629bf012 ("nfsd: prepare for supporting admin-revocation of state") Cc: stable@vger.kernel.org Signed-off-by: NeilBrown Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever --- fs/nfsd/nfs4state.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c index 43b638642e7c3..f2e6a97a4f2d2 100644 --- a/fs/nfsd/nfs4state.c +++ b/fs/nfsd/nfs4state.c @@ -1780,7 +1780,7 @@ void nfsd4_revoke_states(struct net *net, struct super_block *sb) sc_types = SC_TYPE_OPEN | SC_TYPE_LOCK | SC_TYPE_DELEG | SC_TYPE_LAYOUT; spin_lock(&nn->client_lock); - for (idhashval = 0; idhashval < CLIENT_HASH_MASK; idhashval++) { + for (idhashval = 0; idhashval < CLIENT_HASH_SIZE; idhashval++) { struct list_head *head = &nn->conf_id_hashtbl[idhashval]; struct nfs4_client *clp; retry: From 3292af0941b51fe8791b8b95949d29f9307256cc Mon Sep 17 00:00:00 2001 From: Olga Kornievskaia Date: Mon, 15 Dec 2025 14:10:36 -0500 Subject: [PATCH 0683/1321] nfsd: check that server is running in unlock_filesystem If we are trying to unlock the filesystem via an administrative interface and nfsd isn't running, it crashes the server. This happens currently because nfsd4_revoke_states() access state structures (eg., conf_id_hashtbl) that has been freed as a part of the server shutdown. [ 59.465072] Call trace: [ 59.465308] nfsd4_revoke_states+0x1b4/0x898 [nfsd] (P) [ 59.465830] write_unlock_fs+0x258/0x440 [nfsd] [ 59.466278] nfsctl_transaction_write+0xb0/0x120 [nfsd] [ 59.466780] vfs_write+0x1f0/0x938 [ 59.467088] ksys_write+0xfc/0x1f8 [ 59.467395] __arm64_sys_write+0x74/0xb8 [ 59.467746] invoke_syscall.constprop.0+0xdc/0x1e8 [ 59.468177] do_el0_svc+0x154/0x1d8 [ 59.468489] el0_svc+0x40/0xe0 [ 59.468767] el0t_64_sync_handler+0xa0/0xe8 [ 59.469138] el0t_64_sync+0x1ac/0x1b0 Ensure this can't happen by taking the nfsd_mutex and checking that the server is still up, and then holding the mutex across the call to nfsd4_revoke_states(). Reviewed-by: NeilBrown Reviewed-by: Jeff Layton Fixes: 1ac3629bf0125 ("nfsd: prepare for supporting admin-revocation of state") Cc: stable@vger.kernel.org Signed-off-by: Olga Kornievskaia Signed-off-by: Chuck Lever --- fs/nfsd/nfs4state.c | 5 ++--- fs/nfsd/nfsctl.c | 9 ++++++++- fs/nfsd/state.h | 4 ++-- 3 files changed, 12 insertions(+), 6 deletions(-) diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c index f2e6a97a4f2d2..a2361ec36ab84 100644 --- a/fs/nfsd/nfs4state.c +++ b/fs/nfsd/nfs4state.c @@ -1759,7 +1759,7 @@ static struct nfs4_stid *find_one_sb_stid(struct nfs4_client *clp, /** * nfsd4_revoke_states - revoke all nfsv4 states associated with given filesystem - * @net: used to identify instance of nfsd (there is one per net namespace) + * @nn: used to identify instance of nfsd (there is one per net namespace) * @sb: super_block used to identify target filesystem * * All nfs4 states (open, lock, delegation, layout) held by the server instance @@ -1771,9 +1771,8 @@ static struct nfs4_stid *find_one_sb_stid(struct nfs4_client *clp, * The clients which own the states will subsequently being notified that the * states have been "admin-revoked". */ -void nfsd4_revoke_states(struct net *net, struct super_block *sb) +void nfsd4_revoke_states(struct nfsd_net *nn, struct super_block *sb) { - struct nfsd_net *nn = net_generic(net, nfsd_net_id); unsigned int idhashval; unsigned int sc_types; diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c index 242fcbd958f12..084fc517e9e16 100644 --- a/fs/nfsd/nfsctl.c +++ b/fs/nfsd/nfsctl.c @@ -259,6 +259,7 @@ static ssize_t write_unlock_fs(struct file *file, char *buf, size_t size) struct path path; char *fo_path; int error; + struct nfsd_net *nn; /* sanity check */ if (size == 0) @@ -285,7 +286,13 @@ static ssize_t write_unlock_fs(struct file *file, char *buf, size_t size) * 3. Is that directory the root of an exported file system? */ error = nlmsvc_unlock_all_by_sb(path.dentry->d_sb); - nfsd4_revoke_states(netns(file), path.dentry->d_sb); + mutex_lock(&nfsd_mutex); + nn = net_generic(netns(file), nfsd_net_id); + if (nn->nfsd_serv) + nfsd4_revoke_states(nn, path.dentry->d_sb); + else + error = -EINVAL; + mutex_unlock(&nfsd_mutex); path_put(&path); return error; diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h index 848c5383d7820..508b7e36d846d 100644 --- a/fs/nfsd/state.h +++ b/fs/nfsd/state.h @@ -841,9 +841,9 @@ static inline void get_nfs4_file(struct nfs4_file *fi) struct nfsd_file *find_any_file(struct nfs4_file *f); #ifdef CONFIG_NFSD_V4 -void nfsd4_revoke_states(struct net *net, struct super_block *sb); +void nfsd4_revoke_states(struct nfsd_net *nn, struct super_block *sb); #else -static inline void nfsd4_revoke_states(struct net *net, struct super_block *sb) +static inline void nfsd4_revoke_states(struct nfsd_net *nn, struct super_block *sb) { } #endif From cdbdedab665811ff5058c41164fb3007d05061a6 Mon Sep 17 00:00:00 2001 From: Edward Adam Davis Date: Tue, 16 Dec 2025 18:27:37 +0800 Subject: [PATCH 0684/1321] NFSD: net ref data still needs to be freed even if net hasn't startup When the NFSD instance doesn't to startup, the net ref data memory is not properly reclaimed, which triggers the memory leak issue reported by syzbot [1]. To avoid the problem reported in [1], the net ref data memory reclamation action is moved outside of nfsd_net_up when the net is shutdown. [1] unreferenced object 0xffff88812a39dfc0 (size 64): backtrace (crc a2262fc6): percpu_ref_init+0x94/0x1e0 lib/percpu-refcount.c:76 nfsd_create_serv+0xbe/0x260 fs/nfsd/nfssvc.c:605 nfsd_nl_listener_set_doit+0x62/0xb00 fs/nfsd/nfsctl.c:1882 genl_family_rcv_msg_doit+0x11e/0x190 net/netlink/genetlink.c:1115 genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline] genl_rcv_msg+0x2fd/0x440 net/netlink/genetlink.c:1210 BUG: memory leak Reported-by: syzbot+6ee3b889bdeada0a6226@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=6ee3b889bdeada0a6226 Fixes: 39972494e318 ("nfsd: update percpu_ref to manage references on nfsd_net") Cc: stable@vger.kernel.org Signed-off-by: Edward Adam Davis Signed-off-by: Chuck Lever --- fs/nfsd/nfssvc.c | 30 +++++++++++++++--------------- 1 file changed, 15 insertions(+), 15 deletions(-) diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c index f6cae4430ba44..f1cc223ecee2f 100644 --- a/fs/nfsd/nfssvc.c +++ b/fs/nfsd/nfssvc.c @@ -406,26 +406,26 @@ static void nfsd_shutdown_net(struct net *net) { struct nfsd_net *nn = net_generic(net, nfsd_net_id); - if (!nn->nfsd_net_up) - return; - - percpu_ref_kill_and_confirm(&nn->nfsd_net_ref, nfsd_net_done); - wait_for_completion(&nn->nfsd_net_confirm_done); - - nfsd_export_flush(net); - nfs4_state_shutdown_net(net); - nfsd_reply_cache_shutdown(nn); - nfsd_file_cache_shutdown_net(net); - if (nn->lockd_up) { - lockd_down(net); - nn->lockd_up = false; + if (nn->nfsd_net_up) { + percpu_ref_kill_and_confirm(&nn->nfsd_net_ref, nfsd_net_done); + wait_for_completion(&nn->nfsd_net_confirm_done); + + nfsd_export_flush(net); + nfs4_state_shutdown_net(net); + nfsd_reply_cache_shutdown(nn); + nfsd_file_cache_shutdown_net(net); + if (nn->lockd_up) { + lockd_down(net); + nn->lockd_up = false; + } + wait_for_completion(&nn->nfsd_net_free_done); } - wait_for_completion(&nn->nfsd_net_free_done); percpu_ref_exit(&nn->nfsd_net_ref); + if (nn->nfsd_net_up) + nfsd_shutdown_generic(); nn->nfsd_net_up = false; - nfsd_shutdown_generic(); } static DEFINE_SPINLOCK(nfsd_notifier_lock); From 568c7734002b465cece9fc3bce29d0f3c925d386 Mon Sep 17 00:00:00 2001 From: Haoxiang Li Date: Sat, 20 Dec 2025 00:28:45 +0800 Subject: [PATCH 0685/1321] ALSA: ac97: fix a double free in snd_ac97_controller_register() If ac97_add_adapter() fails, put_device() is the correct way to drop the device reference. kfree() is not required. Add kfree() if idr_alloc() fails and in ac97_adapter_release() to do the cleanup. Found by code review. Fixes: 74426fbff66e ("ALSA: ac97: add an ac97 bus") Cc: stable@vger.kernel.org Signed-off-by: Haoxiang Li Link: https://patch.msgid.link/20251219162845.657525-1-lihaoxiang@isrc.iscas.ac.cn Signed-off-by: Takashi Iwai --- sound/ac97/bus.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/sound/ac97/bus.c b/sound/ac97/bus.c index f4254703d29f7..bb9b795e02262 100644 --- a/sound/ac97/bus.c +++ b/sound/ac97/bus.c @@ -298,6 +298,7 @@ static void ac97_adapter_release(struct device *dev) idr_remove(&ac97_adapter_idr, ac97_ctrl->nr); dev_dbg(&ac97_ctrl->adap, "adapter unregistered by %s\n", dev_name(ac97_ctrl->parent)); + kfree(ac97_ctrl); } static const struct device_type ac97_adapter_type = { @@ -319,7 +320,9 @@ static int ac97_add_adapter(struct ac97_controller *ac97_ctrl) ret = device_register(&ac97_ctrl->adap); if (ret) put_device(&ac97_ctrl->adap); - } + } else + kfree(ac97_ctrl); + if (!ret) { list_add(&ac97_ctrl->controllers, &ac97_controllers); dev_dbg(&ac97_ctrl->adap, "adapter registered by %s\n", @@ -361,14 +364,11 @@ struct ac97_controller *snd_ac97_controller_register( ret = ac97_add_adapter(ac97_ctrl); if (ret) - goto err; + return ERR_PTR(ret); ac97_bus_reset(ac97_ctrl); ac97_bus_scan(ac97_ctrl); return ac97_ctrl; -err: - kfree(ac97_ctrl); - return ERR_PTR(ret); } EXPORT_SYMBOL_GPL(snd_ac97_controller_register); From e38df8d87ed937f2572ae4ff5fabd1ba51f462cf Mon Sep 17 00:00:00 2001 From: August Wikerfors Date: Mon, 22 Dec 2025 20:47:04 +0100 Subject: [PATCH 0686/1321] ALSA: hda/tas2781: properly initialize speaker_id for TAS2563 After speaker id retrieval was refactored to happen in tas2781_read_acpi, devices that do not use a speaker id need a negative speaker_id value instead of NULL, but no initialization was added to the TAS2563 code path. This causes the driver to attempt to load a non-existent firmware file name with a speaker id of 0 ("TAS2XXX38700.bin") instead of the correct file name without a speaker id ("TAS2XXX3870.bin"), resulting in low volume and these dmesg errors: tas2781-hda i2c-INT8866:00: Direct firmware load for TAS2XXX38700.bin failed with error -2 tas2781-hda i2c-INT8866:00: tasdevice_dsp_parser: load TAS2XXX38700.bin error tas2781-hda i2c-INT8866:00: dspfw load TAS2XXX38700.bin error [...] tas2781-hda i2c-INT8866:00: tasdevice_prmg_load: Firmware is NULL Fix this by setting speaker_id to -1 as is done for other models. Fixes: 945865a0ddf3 ("ALSA: hda/tas2781: fix speaker id retrieval for multiple probes") Cc: stable@vger.kernel.org Signed-off-by: August Wikerfors Link: https://patch.msgid.link/20251222194704.87232-1-git@augustwikerfors.se Signed-off-by: Takashi Iwai --- sound/hda/codecs/side-codecs/tas2781_hda_i2c.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/sound/hda/codecs/side-codecs/tas2781_hda_i2c.c b/sound/hda/codecs/side-codecs/tas2781_hda_i2c.c index c8619995b1d78..f7a7f216d5865 100644 --- a/sound/hda/codecs/side-codecs/tas2781_hda_i2c.c +++ b/sound/hda/codecs/side-codecs/tas2781_hda_i2c.c @@ -111,8 +111,10 @@ static int tas2781_read_acpi(struct tasdevice_priv *p, const char *hid) sub = acpi_get_subsystem_id(ACPI_HANDLE(physdev)); if (IS_ERR(sub)) { /* No subsys id in older tas2563 projects. */ - if (!strncmp(hid, "INT8866", sizeof("INT8866"))) + if (!strncmp(hid, "INT8866", sizeof("INT8866"))) { + p->speaker_id = -1; goto end_2563; + } dev_err(p->dev, "Failed to get SUBSYS ID.\n"); ret = PTR_ERR(sub); goto err; From 2a32fa87beebe05d2cd4d5b5cc4728de31a307cf Mon Sep 17 00:00:00 2001 From: Mac Chiang Date: Fri, 19 Dec 2025 11:49:02 +0800 Subject: [PATCH 0687/1321] ASoC: Intel: sof_sdw: shift SSP BT mask bits. MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The SSP BT mask bits overlapped with SOC_SDW_CODEC_SPKR, SOC_SDW_SIDECAR_AMPS, and SOC_SDW_CODEC_MIC BIT[15–17] in sdw_utils.h. Shift the SSP BT mask bits to a higher range to eliminate the conflict. Signed-off-by: Mac Chiang Signed-off-by: Bard Liao Link: https://patch.msgid.link/20251219034902.3630537-1-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown --- sound/soc/intel/boards/sof_sdw_common.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/sound/soc/intel/boards/sof_sdw_common.h b/sound/soc/intel/boards/sof_sdw_common.h index 3aa1dcec5172c..5390f0a749d6d 100644 --- a/sound/soc/intel/boards/sof_sdw_common.h +++ b/sound/soc/intel/boards/sof_sdw_common.h @@ -46,11 +46,11 @@ enum { #define SOC_SDW_NO_AGGREGATION BIT(14) /* BT audio offload: reserve 3 bits for future */ -#define SOF_BT_OFFLOAD_SSP_SHIFT 15 -#define SOF_BT_OFFLOAD_SSP_MASK (GENMASK(17, 15)) +#define SOF_BT_OFFLOAD_SSP_SHIFT 18 +#define SOF_BT_OFFLOAD_SSP_MASK (GENMASK(20, 18)) #define SOF_BT_OFFLOAD_SSP(quirk) \ (((quirk) << SOF_BT_OFFLOAD_SSP_SHIFT) & SOF_BT_OFFLOAD_SSP_MASK) -#define SOF_SSP_BT_OFFLOAD_PRESENT BIT(18) +#define SOF_SSP_BT_OFFLOAD_PRESENT BIT(21) struct intel_mc_ctx { struct sof_hdmi_private hdmi; From 0633a6348ef4149a1f4efccda54bc939ef650ce0 Mon Sep 17 00:00:00 2001 From: Bard Liao Date: Fri, 19 Dec 2025 11:49:37 +0800 Subject: [PATCH 0688/1321] ASoC: SOF: Intel: add -bt tplg suffix if BT is present MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit We need to distinguish the topologies with and without BT PCM. Signed-off-by: Bard Liao Reviewed-by: Kai Vehmanen Reviewed-by: Péter Ujfalusi Link: https://patch.msgid.link/20251219034937.3630569-1-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown --- sound/soc/sof/intel/hda.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/sound/soc/sof/intel/hda.c b/sound/soc/sof/intel/hda.c index c1518dbee1b7a..0bb85f92e1069 100644 --- a/sound/soc/sof/intel/hda.c +++ b/sound/soc/sof/intel/hda.c @@ -1549,6 +1549,7 @@ struct snd_soc_acpi_mach *hda_machine_select(struct snd_sof_dev *sdev) * name string if quirk flag is set. */ if (mach) { + const struct sof_intel_dsp_desc *chip = get_chip_info(sdev->pdata); bool tplg_fixup = false; bool dmic_fixup = false; @@ -1598,6 +1599,18 @@ struct snd_soc_acpi_mach *hda_machine_select(struct snd_sof_dev *sdev) sof_pdata->tplg_filename = tplg_filename; } + if (tplg_fixup && mach->mach_params.bt_link_mask && + chip->hw_ip_version >= SOF_INTEL_ACE_4_0) { + int bt_port = fls(mach->mach_params.bt_link_mask) - 1; + + tplg_filename = devm_kasprintf(sdev->dev, GFP_KERNEL, "%s-ssp%d-bt", + sof_pdata->tplg_filename, bt_port); + if (!tplg_filename) + return NULL; + + sof_pdata->tplg_filename = tplg_filename; + } + if (mach->link_mask) { mach->mach_params.links = mach->links; mach->mach_params.link_mask = mach->link_mask; @@ -1609,7 +1622,6 @@ struct snd_soc_acpi_mach *hda_machine_select(struct snd_sof_dev *sdev) if (tplg_fixup && mach->tplg_quirk_mask & SND_SOC_ACPI_TPLG_INTEL_SSP_NUMBER && mach->mach_params.i2s_link_mask) { - const struct sof_intel_dsp_desc *chip = get_chip_info(sdev->pdata); int ssp_num; int mclk_mask; From 109846ceaa7161eaea65057bdb6069fa675fb6ef Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Fri, 19 Dec 2025 15:24:12 +0100 Subject: [PATCH 0689/1321] ASoC: codecs: pm4125: clean up bind() device reference handling A recent change fixed a couple of device leaks on component bind failure and on unbind but did so in a confusing way by adding misleading initialisations at bind() and bogus NULL checks at unbind(). Cc: Ma Ke Signed-off-by: Johan Hovold Reviewed-by: Dmitry Baryshkov Link: https://patch.msgid.link/20251219142412.19043-1-johan@kernel.org Signed-off-by: Mark Brown --- sound/soc/codecs/pm4125.c | 11 ++--------- 1 file changed, 2 insertions(+), 9 deletions(-) diff --git a/sound/soc/codecs/pm4125.c b/sound/soc/codecs/pm4125.c index 8bc3b99940197..43dcafff6c771 100644 --- a/sound/soc/codecs/pm4125.c +++ b/sound/soc/codecs/pm4125.c @@ -1505,10 +1505,6 @@ static int pm4125_bind(struct device *dev) struct device_link *devlink; int ret; - /* Initialize device pointers to NULL for safe cleanup */ - pm4125->rxdev = NULL; - pm4125->txdev = NULL; - /* Give the soundwire subdevices some more time to settle */ usleep_range(15000, 15010); @@ -1624,11 +1620,8 @@ static void pm4125_unbind(struct device *dev) device_link_remove(dev, pm4125->rxdev); device_link_remove(pm4125->rxdev, pm4125->txdev); - /* Release device references acquired in bind */ - if (pm4125->txdev) - put_device(pm4125->txdev); - if (pm4125->rxdev) - put_device(pm4125->rxdev); + put_device(pm4125->txdev); + put_device(pm4125->rxdev); component_unbind_all(dev, pm4125); } From 5597f7a46716f1df3459d21bd87ea0988fc09335 Mon Sep 17 00:00:00 2001 From: Chen-Yu Tsai Date: Sun, 21 Dec 2025 11:57:13 +0800 Subject: [PATCH 0690/1321] ASoC: sun4i-spdif: Add missing kerneldoc fields for sun4i_spdif_quirks When sun4i_spdif_quirks was recently expanded, the kerneldoc covering the structure was not expanded to match. This ends up causing a warning when the documents are built. Add the missing fields. Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202501311953.0Ox9CW5w-lkp@intel.com/ Closes: https://lore.kernel.org/oe-kbuild-all/202503060947.QKUUR62l-lkp@intel.com/ Fixes: 0a2319308de8 ("ASoC: sun4i-spdif: Add clock multiplier settings") Fixes: 4a5ac6cd05a7 ("ASoC: sun4i-spdif: Support SPDIF output on A523 family") Signed-off-by: Chen-Yu Tsai Reviewed-by: Marcus Cooper Acked-by: Jernej Skrabec Link: https://patch.msgid.link/20251221035715.1722584-1-wens@kernel.org Signed-off-by: Mark Brown --- sound/soc/sunxi/sun4i-spdif.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/sound/soc/sunxi/sun4i-spdif.c b/sound/soc/sunxi/sun4i-spdif.c index 2e7ac8ab71bb1..1e755a716c632 100644 --- a/sound/soc/sunxi/sun4i-spdif.c +++ b/sound/soc/sunxi/sun4i-spdif.c @@ -171,6 +171,8 @@ * @reg_dac_txdata: TX FIFO offset for DMA config. * @has_reset: SoC needs reset deasserted. * @val_fctl_ftx: TX FIFO flush bitmask. + * @mclk_multiplier: ratio of internal MCLK divider + * @tx_clk_name: name of TX module clock if split clock design */ struct sun4i_spdif_quirks { unsigned int reg_dac_txdata; From 380f1ff445377966784c601ff41ec0a4af17135a Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Fri, 19 Dec 2025 15:27:13 +0100 Subject: [PATCH 0691/1321] ASoC: codecs: pm4125: drop bogus container_of() error handling The dev_to_sdw_dev() helper uses container_of() to return the containing soundwire device structure of its pointer argument and will never return NULL. Fixes: 8ad529484937 ("ASoC: codecs: add new pm4125 audio codec driver") Cc: Alexey Klimov Signed-off-by: Johan Hovold Acked-by: Alexey Klimov Reviewed-by: Dmitry Baryshkov Link: https://patch.msgid.link/20251219142715.19254-2-johan@kernel.org Signed-off-by: Mark Brown --- sound/soc/codecs/pm4125.c | 6 ------ 1 file changed, 6 deletions(-) diff --git a/sound/soc/codecs/pm4125.c b/sound/soc/codecs/pm4125.c index 43dcafff6c771..1f0a3f5389f1b 100644 --- a/sound/soc/codecs/pm4125.c +++ b/sound/soc/codecs/pm4125.c @@ -1533,13 +1533,7 @@ static int pm4125_bind(struct device *dev) pm4125->sdw_priv[AIF1_CAP] = dev_get_drvdata(pm4125->txdev); pm4125->sdw_priv[AIF1_CAP]->pm4125 = pm4125; - pm4125->tx_sdw_dev = dev_to_sdw_dev(pm4125->txdev); - if (!pm4125->tx_sdw_dev) { - dev_err(dev, "could not get txslave with matching of dev\n"); - ret = -EINVAL; - goto error_put_tx; - } /* * As TX is the main CSR reg interface, which should not be suspended first. From 6944d85c1071e7feb2f0458c449d5f5f1f4f9dc6 Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Fri, 19 Dec 2025 15:27:14 +0100 Subject: [PATCH 0692/1321] ASoC: codecs: wcd937x: drop bogus container_of() error handling The dev_to_sdw_dev() helper uses container_of() to return the containing soundwire device structure of its pointer argument and will never return NULL. Fixes: 9be3ec196da4 ("ASoC: codecs: wcd937x: add wcd937x codec driver") Cc: Prasad Kumpatla Signed-off-by: Johan Hovold Reviewed-by: Dmitry Baryshkov Link: https://patch.msgid.link/20251219142715.19254-3-johan@kernel.org Signed-off-by: Mark Brown --- sound/soc/codecs/wcd937x.c | 5 ----- 1 file changed, 5 deletions(-) diff --git a/sound/soc/codecs/wcd937x.c b/sound/soc/codecs/wcd937x.c index f4dbcf04be492..10a2d598caa71 100644 --- a/sound/soc/codecs/wcd937x.c +++ b/sound/soc/codecs/wcd937x.c @@ -2763,11 +2763,6 @@ static int wcd937x_bind(struct device *dev) wcd937x->sdw_priv[AIF1_CAP] = dev_get_drvdata(wcd937x->txdev); wcd937x->sdw_priv[AIF1_CAP]->wcd937x = wcd937x; wcd937x->tx_sdw_dev = dev_to_sdw_dev(wcd937x->txdev); - if (!wcd937x->tx_sdw_dev) { - dev_err(dev, "could not get txslave with matching of dev\n"); - ret = -EINVAL; - goto err_put_txdev; - } /* * As TX is the main CSR reg interface, which should not be suspended first. From 41f0f21fd5b7b2baa7aa673c9d937a16d7d945d0 Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Fri, 19 Dec 2025 15:27:15 +0100 Subject: [PATCH 0693/1321] ASoC: soc_sdw_utils: drop bogus container_of() error handling The dev_to_sdw_dev() helper uses container_of() to return the containing soundwire device structure of its pointer argument and will never return NULL. Fixes: 4f8ef33dd44a ("ASoC: soc_sdw_utils: skip the endpoint that doesn't present") Cc: Bard Liao Signed-off-by: Johan Hovold Reviewed-by: Konrad Dybcio Reviewed-by: Dmitry Baryshkov Link: https://patch.msgid.link/20251219142715.19254-4-johan@kernel.org Signed-off-by: Mark Brown --- sound/soc/sdw_utils/soc_sdw_utils.c | 4 ---- 1 file changed, 4 deletions(-) diff --git a/sound/soc/sdw_utils/soc_sdw_utils.c b/sound/soc/sdw_utils/soc_sdw_utils.c index f169d95895ea2..bf382aa07e92f 100644 --- a/sound/soc/sdw_utils/soc_sdw_utils.c +++ b/sound/soc/sdw_utils/soc_sdw_utils.c @@ -1414,10 +1414,6 @@ static int is_sdca_endpoint_present(struct device *dev, } slave = dev_to_sdw_dev(sdw_dev); - if (!slave) { - ret = -EINVAL; - goto put_device; - } /* Make sure BIOS provides SDCA properties */ if (!slave->sdca_data.interface_revision) { From fa384455c4c7f2acd6c0ee21445de9d3f41b58cb Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Matou=C5=A1=20L=C3=A1nsk=C3=BD?= Date: Wed, 31 Dec 2025 18:12:07 +0100 Subject: [PATCH 0694/1321] ALSA: hda/realtek: Add quirk for Acer Nitro AN517-55 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add headset mic quirk for Acer Nitro AN517-55. This laptop uses the same audio configuration as the AN515-58 model. Signed-off-by: Matouš Lánský Link: https://patch.msgid.link/20251231171207.76943-1-matouslansky@post.cz Signed-off-by: Takashi Iwai --- sound/hda/codecs/realtek/alc269.c | 1 + 1 file changed, 1 insertion(+) diff --git a/sound/hda/codecs/realtek/alc269.c b/sound/hda/codecs/realtek/alc269.c index 1de46c06f8c2a..67baf04551cb8 100644 --- a/sound/hda/codecs/realtek/alc269.c +++ b/sound/hda/codecs/realtek/alc269.c @@ -6321,6 +6321,7 @@ static const struct hda_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x1025, 0x1466, "Acer Aspire A515-56", ALC255_FIXUP_ACER_HEADPHONE_AND_MIC), SND_PCI_QUIRK(0x1025, 0x1534, "Acer Predator PH315-54", ALC255_FIXUP_ACER_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1025, 0x159c, "Acer Nitro 5 AN515-58", ALC2XX_FIXUP_HEADSET_MIC), + SND_PCI_QUIRK(0x1025, 0x1597, "Acer Nitro 5 AN517-55", ALC2XX_FIXUP_HEADSET_MIC), SND_PCI_QUIRK(0x1025, 0x169a, "Acer Swift SFG16", ALC256_FIXUP_ACER_SFG16_MICMUTE_LED), SND_PCI_QUIRK(0x1025, 0x1826, "Acer Helios ZPC", ALC287_FIXUP_PREDATOR_SPK_CS35L41_I2C_2), SND_PCI_QUIRK(0x1025, 0x182c, "Acer Helios ZPD", ALC287_FIXUP_PREDATOR_SPK_CS35L41_I2C_2), From 17a1b196dcc218b14f00cc81dbe1edeedfd71f43 Mon Sep 17 00:00:00 2001 From: Ruslan Krupitsa Date: Fri, 2 Jan 2026 02:53:36 +0300 Subject: [PATCH 0695/1321] ALSA: hda/realtek: add HP Laptop 15s-eq1xxx mute LED quirk HP Laptop 15s-eq1xxx with ALC236 codec does not enable the mute LED automatically. This patch adds a quirk entry for subsystem ID 0x8706 using the ALC236_FIXUP_HP_MUTE_LED_COEFBIT2 fixup, enabling correct mute LED behavior. Signed-off-by: Ruslan Krupitsa Link: https://patch.msgid.link/AS8P194MB112895B8EC2D87D53A876085BBBAA@AS8P194MB1128.EURP194.PROD.OUTLOOK.COM Signed-off-by: Takashi Iwai --- sound/hda/codecs/realtek/alc269.c | 1 + 1 file changed, 1 insertion(+) diff --git a/sound/hda/codecs/realtek/alc269.c b/sound/hda/codecs/realtek/alc269.c index 67baf04551cb8..61c7372e6307a 100644 --- a/sound/hda/codecs/realtek/alc269.c +++ b/sound/hda/codecs/realtek/alc269.c @@ -6509,6 +6509,7 @@ static const struct hda_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x103c, 0x863e, "HP Spectre x360 15-df1xxx", ALC285_FIXUP_HP_SPECTRE_X360_DF1), SND_PCI_QUIRK(0x103c, 0x86e8, "HP Spectre x360 15-eb0xxx", ALC285_FIXUP_HP_SPECTRE_X360_EB1), SND_PCI_QUIRK(0x103c, 0x86f9, "HP Spectre x360 13-aw0xxx", ALC285_FIXUP_HP_SPECTRE_X360_MUTE_LED), + SND_PCI_QUIRK(0x103c, 0x8706, "HP Laptop 15s-eq1xxx", ALC236_FIXUP_HP_MUTE_LED_COEFBIT2), SND_PCI_QUIRK(0x103c, 0x8716, "HP Elite Dragonfly G2 Notebook PC", ALC285_FIXUP_HP_GPIO_AMP_INIT), SND_PCI_QUIRK(0x103c, 0x8720, "HP EliteBook x360 1040 G8 Notebook PC", ALC285_FIXUP_HP_GPIO_AMP_INIT), SND_PCI_QUIRK(0x103c, 0x8724, "HP EliteBook 850 G7", ALC285_FIXUP_HP_GPIO_LED), From 6bcf9ac9ce86a1f1a04e601de149ccd275b40ecd Mon Sep 17 00:00:00 2001 From: Benjamin Tissoires Date: Mon, 15 Dec 2025 12:57:21 +0100 Subject: [PATCH 0696/1321] HID: usbhid: paper over wrong bNumDescriptor field Some faulty devices (ZWO EFWmini) have a wrong optional HID class descriptor count compared to the provided length. Given that we plainly ignore those optional descriptor, we can attempt to fix the provided number so we do not lock out those devices. Signed-off-by: Benjamin Tissoires --- drivers/hid/usbhid/hid-core.c | 17 ++++++++++++++++- 1 file changed, 16 insertions(+), 1 deletion(-) diff --git a/drivers/hid/usbhid/hid-core.c b/drivers/hid/usbhid/hid-core.c index aac0051a2cf65..758eb21430cda 100644 --- a/drivers/hid/usbhid/hid-core.c +++ b/drivers/hid/usbhid/hid-core.c @@ -985,6 +985,7 @@ static int usbhid_parse(struct hid_device *hid) struct usb_device *dev = interface_to_usbdev (intf); struct hid_descriptor *hdesc; struct hid_class_descriptor *hcdesc; + __u8 fixed_opt_descriptors_size; u32 quirks = 0; unsigned int rsize = 0; char *rdesc; @@ -1015,7 +1016,21 @@ static int usbhid_parse(struct hid_device *hid) (hdesc->bNumDescriptors - 1) * sizeof(*hcdesc)) { dbg_hid("hid descriptor invalid, bLen=%hhu bNum=%hhu\n", hdesc->bLength, hdesc->bNumDescriptors); - return -EINVAL; + + /* + * Some devices may expose a wrong number of descriptors compared + * to the provided length. + * However, we ignore the optional hid class descriptors entirely + * so we can safely recompute the proper field. + */ + if (hdesc->bLength >= sizeof(*hdesc)) { + fixed_opt_descriptors_size = hdesc->bLength - sizeof(*hdesc); + + hid_warn(intf, "fixing wrong optional hid class descriptors count\n"); + hdesc->bNumDescriptors = fixed_opt_descriptors_size / sizeof(*hcdesc) + 1; + } else { + return -EINVAL; + } } hid->version = le16_to_cpu(hdesc->bcdHID); From 07d4035a2a87efd3e7616458c09687fa2f3f4d39 Mon Sep 17 00:00:00 2001 From: Siarhei Vishniakou Date: Tue, 11 Nov 2025 15:45:19 -0800 Subject: [PATCH 0697/1321] HID: playstation: Center initial joystick axes to prevent spurious events When a new PlayStation gamepad (DualShock 4 or DualSense) is initialized, the input subsystem sets the default value for its absolute axes (e.g., ABS_X, ABS_Y) to 0. However, the hardware's actual neutral/resting state for these joysticks is 128 (0x80). This creates a mismatch. When the first HID report arrives from the device, the driver sees the resting value of 128. The kernel compares this to its initial state of 0 and incorrectly interprets this as a delta (0 -> 128). Consequently, it generates EV_ABS events for this initial, non-existent movement. This behavior can fail userspace 'sanity check' tests (e.g., in Android CTS) that correctly assert no motion events should be generated from a device that is already at rest. This patch fixes the issue by explicitly setting the initial value of the main joystick axes (e.g., ABS_X, ABS_Y, ABS_RX, ABS_RY) to 128 (0x80) in the common ps_gamepad_create() function. This aligns the kernel's initial state with the hardware's expected neutral state, ensuring that the first report (at 128) produces no delta and thus, no spurious event. Signed-off-by: Siarhei Vishniakou Reviewed-by: Benjamin Tissoires Signed-off-by: Benjamin Tissoires --- drivers/hid/hid-playstation.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/hid/hid-playstation.c b/drivers/hid/hid-playstation.c index 128aa6abd10be..e4dfcf26b04e7 100644 --- a/drivers/hid/hid-playstation.c +++ b/drivers/hid/hid-playstation.c @@ -753,11 +753,16 @@ ps_gamepad_create(struct hid_device *hdev, if (IS_ERR(gamepad)) return ERR_CAST(gamepad); + /* Set initial resting state for joysticks to 128 (center) */ input_set_abs_params(gamepad, ABS_X, 0, 255, 0, 0); + gamepad->absinfo[ABS_X].value = 128; input_set_abs_params(gamepad, ABS_Y, 0, 255, 0, 0); + gamepad->absinfo[ABS_Y].value = 128; input_set_abs_params(gamepad, ABS_Z, 0, 255, 0, 0); input_set_abs_params(gamepad, ABS_RX, 0, 255, 0, 0); + gamepad->absinfo[ABS_RX].value = 128; input_set_abs_params(gamepad, ABS_RY, 0, 255, 0, 0); + gamepad->absinfo[ABS_RY].value = 128; input_set_abs_params(gamepad, ABS_RZ, 0, 255, 0, 0); input_set_abs_params(gamepad, ABS_HAT0X, -1, 1, 0, 0); From de3ef8a8a295b1c8186ada8ca0cb14fd527673d8 Mon Sep 17 00:00:00 2001 From: Thomas Fourier Date: Wed, 3 Dec 2025 17:56:35 +0100 Subject: [PATCH 0698/1321] HID: Intel-thc-hid: Intel-thc: fix dma_unmap_sg() nents value The `dma_unmap_sg()` functions should be called with the same nents as the `dma_map_sg()`, not the value the map function returned. Save the number of entries in struct thc_dma_configuration. Fixes: a688404b2e20 ("HID: intel-thc-hid: intel-thc: Add THC DMA interfaces") Signed-off-by: Thomas Fourier Reviewed-by: Even Xu Reviewed-by: Andy Shevchenko Signed-off-by: Benjamin Tissoires --- drivers/hid/intel-thc-hid/intel-thc/intel-thc-dma.c | 4 +++- drivers/hid/intel-thc-hid/intel-thc/intel-thc-dma.h | 2 ++ 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/hid/intel-thc-hid/intel-thc/intel-thc-dma.c b/drivers/hid/intel-thc-hid/intel-thc/intel-thc-dma.c index 82b8854843e05..a0c368aa7979c 100644 --- a/drivers/hid/intel-thc-hid/intel-thc/intel-thc-dma.c +++ b/drivers/hid/intel-thc-hid/intel-thc/intel-thc-dma.c @@ -232,6 +232,7 @@ static int setup_dma_buffers(struct thc_device *dev, return 0; memset(config->sgls, 0, sizeof(config->sgls)); + memset(config->sgls_nent_pages, 0, sizeof(config->sgls_nent_pages)); memset(config->sgls_nent, 0, sizeof(config->sgls_nent)); cpu_addr = dma_alloc_coherent(dev->dev, prd_tbls_size, @@ -254,6 +255,7 @@ static int setup_dma_buffers(struct thc_device *dev, } count = dma_map_sg(dev->dev, config->sgls[i], nent, dir); + config->sgls_nent_pages[i] = nent; config->sgls_nent[i] = count; } @@ -299,7 +301,7 @@ static void release_dma_buffers(struct thc_device *dev, continue; dma_unmap_sg(dev->dev, config->sgls[i], - config->sgls_nent[i], + config->sgls_nent_pages[i], config->dir); sgl_free(config->sgls[i]); diff --git a/drivers/hid/intel-thc-hid/intel-thc/intel-thc-dma.h b/drivers/hid/intel-thc-hid/intel-thc/intel-thc-dma.h index 78917400492ca..541d33995baf3 100644 --- a/drivers/hid/intel-thc-hid/intel-thc/intel-thc-dma.h +++ b/drivers/hid/intel-thc-hid/intel-thc/intel-thc-dma.h @@ -91,6 +91,7 @@ struct thc_prd_table { * @dir: Direction of DMA for this config * @prd_tbls: PRD tables for current DMA * @sgls: Array of pointers to scatter-gather lists + * @sgls_nent_pages: Number of pages per scatter-gather list * @sgls_nent: Actual number of entries per scatter-gather list * @prd_tbl_num: Actual number of PRD tables * @max_packet_size: Size of the buffer needed for 1 DMA message (1 PRD table) @@ -107,6 +108,7 @@ struct thc_dma_configuration { struct thc_prd_table *prd_tbls; struct scatterlist *sgls[PRD_TABLES_NUM]; + u8 sgls_nent_pages[PRD_TABLES_NUM]; u8 sgls_nent[PRD_TABLES_NUM]; u8 prd_tbl_num; From 329eee5cbef1467da165ac9a9065375a3f4dc8bc Mon Sep 17 00:00:00 2001 From: Zhang Lixu Date: Wed, 10 Dec 2025 10:53:28 +0800 Subject: [PATCH 0699/1321] HID: intel-ish-hid: Update ishtp bus match to support device ID table The ishtp_cl_bus_match() function previously only checked the first entry in the driver's device ID table. Update it to iterate over the entire table, allowing proper matching for drivers with multiple supported protocol GUIDs. Signed-off-by: Zhang Lixu Acked-by: Srinivas Pandruvada Signed-off-by: Benjamin Tissoires --- drivers/hid/intel-ish-hid/ishtp/bus.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/drivers/hid/intel-ish-hid/ishtp/bus.c b/drivers/hid/intel-ish-hid/ishtp/bus.c index c6ce37244e497..c3915f3a060ea 100644 --- a/drivers/hid/intel-ish-hid/ishtp/bus.c +++ b/drivers/hid/intel-ish-hid/ishtp/bus.c @@ -240,9 +240,17 @@ static int ishtp_cl_bus_match(struct device *dev, const struct device_driver *dr { struct ishtp_cl_device *device = to_ishtp_cl_device(dev); struct ishtp_cl_driver *driver = to_ishtp_cl_driver(drv); + struct ishtp_fw_client *client = device->fw_client; + const struct ishtp_device_id *id; - return(device->fw_client ? guid_equal(&driver->id[0].guid, - &device->fw_client->props.protocol_name) : 0); + if (client) { + for (id = driver->id; !guid_is_null(&id->guid); id++) { + if (guid_equal(&id->guid, &client->props.protocol_name)) + return 1; + } + } + + return 0; } /** From 4769d195fbce5f8c1438761239cc7d89a582f8ff Mon Sep 17 00:00:00 2001 From: Zhang Lixu Date: Fri, 12 Dec 2025 10:51:50 +0800 Subject: [PATCH 0700/1321] HID: intel-ish-hid: Reset enum_devices_done before enumeration Some systems have enabled ISH without any sensors. In this case sending HOSTIF_DM_ENUM_DEVICES results in 0 sensors. This triggers ISH hardware reset on subsequent enumeration after S3/S4 resume. The enum_devices_done flag was not reset before sending the HOSTIF_DM_ENUM_DEVICES command. On subsequent enumeration calls (such as after S3/S4 resume), this flag retains its previous true value, causing the wait loop to be skipped and returning prematurely to hid_ishtp_cl_init(). If 0 HID devices are found, hid_ishtp_cl_init() skips getting HID device descriptors and sets init_done to true. When the delayed enumeration response arrives with init_done already true, the driver treats it as a bad packet and triggers an ISH hardware reset. Set enum_devices_done to false before sending the enumeration command, consistent with similar functions like ishtp_get_hid_descriptor() and ishtp_get_report_descriptor() which reset their respective flags. Signed-off-by: Zhang Lixu Acked-by: Srinivas Pandruvada Signed-off-by: Benjamin Tissoires --- drivers/hid/intel-ish-hid/ishtp-hid-client.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/hid/intel-ish-hid/ishtp-hid-client.c b/drivers/hid/intel-ish-hid/ishtp-hid-client.c index f37b3bc2bb7d1..6d64008f2ce05 100644 --- a/drivers/hid/intel-ish-hid/ishtp-hid-client.c +++ b/drivers/hid/intel-ish-hid/ishtp-hid-client.c @@ -495,6 +495,7 @@ static int ishtp_enum_enum_devices(struct ishtp_cl *hid_ishtp_cl) int rv; /* Send HOSTIF_DM_ENUM_DEVICES */ + client_data->enum_devices_done = false; memset(&msg, 0, sizeof(struct hostif_msg)); msg.hdr.command = HOSTIF_DM_ENUM_DEVICES; rv = ishtp_cl_send(hid_ishtp_cl, (unsigned char *)&msg, From 37dd99df0bd22290f97180e669fae3c585bbb744 Mon Sep 17 00:00:00 2001 From: DaytonCL Date: Sun, 14 Dec 2025 14:34:36 +0100 Subject: [PATCH 0701/1321] HID: multitouch: add MT_QUIRK_STICKY_FINGERS to MT_CLS_VTL Some VTL-class touchpads (e.g. TOPS0102:00 35CC:0104) intermittently fail to release a finger contact. A previous slot remains logically active, accompanied by stale BTN_TOOL_DOUBLETAP state, causing gestures to stay latched and resulting in stuck two-finger scrolling and false right-clicks. Apply MT_QUIRK_STICKY_FINGERS to handle the unreleased contact correctly. Link: https://gitlab.freedesktop.org/libinput/libinput/-/issues/1225 Suggested-by: Benjamin Tissoires Tested-by: DaytonCL Signed-off-by: DaytonCL Signed-off-by: Benjamin Tissoires --- drivers/hid/hid-multitouch.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/hid/hid-multitouch.c b/drivers/hid/hid-multitouch.c index 179dc316b4b51..a0c1ad5acb670 100644 --- a/drivers/hid/hid-multitouch.c +++ b/drivers/hid/hid-multitouch.c @@ -393,6 +393,7 @@ static const struct mt_class mt_classes[] = { { .name = MT_CLS_VTL, .quirks = MT_QUIRK_ALWAYS_VALID | MT_QUIRK_CONTACT_CNT_ACCURATE | + MT_QUIRK_STICKY_FINGERS | MT_QUIRK_FORCE_GET_FEATURE, }, { .name = MT_CLS_GOOGLE, From c4e9ecd13f9a33c73ec8657080e0e55ea560c21c Mon Sep 17 00:00:00 2001 From: Even Xu Date: Fri, 19 Dec 2025 09:14:38 +0800 Subject: [PATCH 0702/1321] HID: Intel-thc-hid: Intel-thc: Fix wrong register reading Correct the read register for the setting of max input size and interrupt delay. Fixes: 22da60f0304b ("HID: Intel-thc-hid: Intel-thc: Introduce interrupt delay control") Fixes: 45e92a093099 ("HID: Intel-thc-hid: Intel-thc: Introduce max input size control") Signed-off-by: Even Xu Signed-off-by: Benjamin Tissoires --- drivers/hid/intel-thc-hid/intel-thc/intel-thc-dev.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/hid/intel-thc-hid/intel-thc/intel-thc-dev.c b/drivers/hid/intel-thc-hid/intel-thc/intel-thc-dev.c index 636a683065015..7e220a4c5ded7 100644 --- a/drivers/hid/intel-thc-hid/intel-thc/intel-thc-dev.c +++ b/drivers/hid/intel-thc-hid/intel-thc/intel-thc-dev.c @@ -1593,7 +1593,7 @@ int thc_i2c_set_rx_max_size(struct thc_device *dev, u32 max_rx_size) if (!max_rx_size) return -EOPNOTSUPP; - ret = regmap_read(dev->thc_regmap, THC_M_PRT_SW_SEQ_STS_OFFSET, &val); + ret = regmap_read(dev->thc_regmap, THC_M_PRT_SPI_ICRRD_OPCODE_OFFSET, &val); if (ret) return ret; @@ -1662,7 +1662,7 @@ int thc_i2c_set_rx_int_delay(struct thc_device *dev, u32 delay_us) if (!delay_us) return -EOPNOTSUPP; - ret = regmap_read(dev->thc_regmap, THC_M_PRT_SW_SEQ_STS_OFFSET, &val); + ret = regmap_read(dev->thc_regmap, THC_M_PRT_SPI_ICRRD_OPCODE_OFFSET, &val); if (ret) return ret; From 6d567a08c590ca950853cf7628fbea9c62b34e2e Mon Sep 17 00:00:00 2001 From: Benjamin Tissoires Date: Tue, 6 Jan 2026 16:30:54 +0100 Subject: [PATCH 0703/1321] HID: bpf: fix bpf compilation with -fms-extensions Similar to commit 835a50753579 ("selftests/bpf: Add -fms-extensions to bpf build flags") and commit 639f58a0f480 ("bpftool: Fix build warnings due to MS extensions") The kernel is now built with -fms-extensions, therefore generated vmlinux.h contains types like: struct slab { .. struct freelist_counters; }; Use -fms-extensions and -Wno-microsoft-anon-tag flags to build bpf programs that #include "vmlinux.h" Signed-off-by: Benjamin Tissoires --- drivers/hid/bpf/progs/Makefile | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/hid/bpf/progs/Makefile b/drivers/hid/bpf/progs/Makefile index ec1fc642fd635..66b8f38e591dd 100644 --- a/drivers/hid/bpf/progs/Makefile +++ b/drivers/hid/bpf/progs/Makefile @@ -56,8 +56,10 @@ clean: %.bpf.o: %.bpf.c vmlinux.h $(BPFOBJ) | $(OUTPUT) $(call msg,BPF,$@) - $(Q)$(CLANG) -g -O2 --target=bpf -Wall -Werror $(INCLUDES) \ - -c $(filter %.c,$^) -o $@ && \ + $(Q)$(CLANG) -g -O2 --target=bpf -Wall -Werror $(INCLUDES) \ + -Wno-microsoft-anon-tag \ + -fms-extensions \ + -c $(filter %.c,$^) -o $@ && \ $(LLVM_STRIP) -g $@ vmlinux.h: $(VMLINUX_BTF) $(BPFTOOL) | $(INCLUDE_DIR) From 87b86ade916720143c7a8525a93c04436fd58acc Mon Sep 17 00:00:00 2001 From: Benjamin Tissoires Date: Tue, 6 Jan 2026 16:30:55 +0100 Subject: [PATCH 0704/1321] selftests/hid: fix bpf compilations due to -fms-extensions Similar to commit 835a50753579 ("selftests/bpf: Add -fms-extensions to bpf build flags") and commit 639f58a0f480 ("bpftool: Fix build warnings due to MS extensions") The kernel is now built with -fms-extensions, therefore generated vmlinux.h contains types like: struct slab { .. struct freelist_counters; }; Use -fms-extensions and -Wno-microsoft-anon-tag flags to build bpf programs that #include "vmlinux.h" Signed-off-by: Benjamin Tissoires --- tools/testing/selftests/hid/Makefile | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tools/testing/selftests/hid/Makefile b/tools/testing/selftests/hid/Makefile index 2839d2612ce3a..50ec9e0406aba 100644 --- a/tools/testing/selftests/hid/Makefile +++ b/tools/testing/selftests/hid/Makefile @@ -184,6 +184,8 @@ MENDIAN=$(if $(IS_LITTLE_ENDIAN),-mlittle-endian,-mbig-endian) CLANG_SYS_INCLUDES = $(call get_sys_includes,$(CLANG)) BPF_CFLAGS = -g -Werror -D__TARGET_ARCH_$(SRCARCH) $(MENDIAN) \ + -Wno-microsoft-anon-tag \ + -fms-extensions \ -I$(INCLUDE_DIR) CLANG_CFLAGS = $(CLANG_SYS_INCLUDES) \ From 7e3770dcce53a8459efc0fec011a9fa5812fe908 Mon Sep 17 00:00:00 2001 From: Tim Zimmermann Date: Fri, 28 Nov 2025 08:54:22 +0100 Subject: [PATCH 0705/1321] hid: intel-thc-hid: Select SGL_ALLOC intel-thc-dma.c uses sgl_alloc() resulting in a build failure if CONFIG_SGL_ALLOC is not enabled Signed-off-by: Tim Zimmermann Reviewed-by: Even Xu Signed-off-by: Benjamin Tissoires --- drivers/hid/intel-thc-hid/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/hid/intel-thc-hid/Kconfig b/drivers/hid/intel-thc-hid/Kconfig index 0351d11376072..9d74e53b8c62b 100644 --- a/drivers/hid/intel-thc-hid/Kconfig +++ b/drivers/hid/intel-thc-hid/Kconfig @@ -7,6 +7,7 @@ menu "Intel THC HID Support" config INTEL_THC_HID tristate "Intel Touch Host Controller" depends on ACPI + select SGL_ALLOC help THC (Touch Host Controller) is the name of the IP block in PCH that interfaces with Touch Devices (ex: touchscreen, touchpad etc.). It From e7aa7400936ecdfc25fa84156d04ae00a9156d3b Mon Sep 17 00:00:00 2001 From: Even Xu Date: Fri, 26 Dec 2025 11:39:53 +0800 Subject: [PATCH 0706/1321] HID: Intel-thc-hid: Intel-thc: Add safety check for reading DMA buffer Add DMA buffer readiness check before reading DMA buffer to avoid unexpected NULL pointer accessing. Signed-off-by: Even Xu Tested-by: Rui Zhang Signed-off-by: Benjamin Tissoires --- drivers/hid/intel-thc-hid/intel-thc/intel-thc-dma.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/hid/intel-thc-hid/intel-thc/intel-thc-dma.c b/drivers/hid/intel-thc-hid/intel-thc/intel-thc-dma.c index a0c368aa7979c..6ee675e0a7384 100644 --- a/drivers/hid/intel-thc-hid/intel-thc/intel-thc-dma.c +++ b/drivers/hid/intel-thc-hid/intel-thc/intel-thc-dma.c @@ -575,6 +575,11 @@ static int read_dma_buffer(struct thc_device *dev, return -EINVAL; } + if (!read_config->prd_tbls || !read_config->sgls[prd_table_index]) { + dev_err_once(dev->dev, "PRD tables are not ready yet\n"); + return -EINVAL; + } + prd_tbl = &read_config->prd_tbls[prd_table_index]; mes_len = calc_message_len(prd_tbl, &nent); if (mes_len > read_config->max_packet_size) { From e7c97308405a1b3e15cca5572413b595c7c9e21d Mon Sep 17 00:00:00 2001 From: Chris Chiu Date: Fri, 2 Jan 2026 06:56:43 +0000 Subject: [PATCH 0707/1321] HID: quirks: Add another Chicony HP 5MP Cameras to hid_ignore_list Another Chicony Electronics HP 5MP Camera with USB ID 04F2:B882 reports a HID sensor interface that is not actually implemented. Add the device to the HID ignore list so the bogus sensor is never exposed to userspace. Then the system won't hang when runtime PM tries to wake the unresponsive device. Signed-off-by: Chris Chiu Signed-off-by: Benjamin Tissoires --- drivers/hid/hid-ids.h | 1 + drivers/hid/hid-quirks.c | 1 + 2 files changed, 2 insertions(+) diff --git a/drivers/hid/hid-ids.h b/drivers/hid/hid-ids.h index d31711f1aaecc..e8a1a86313b74 100644 --- a/drivers/hid/hid-ids.h +++ b/drivers/hid/hid-ids.h @@ -317,6 +317,7 @@ #define USB_DEVICE_ID_CHICONY_ACER_SWITCH12 0x1421 #define USB_DEVICE_ID_CHICONY_HP_5MP_CAMERA 0xb824 #define USB_DEVICE_ID_CHICONY_HP_5MP_CAMERA2 0xb82c +#define USB_DEVICE_ID_CHICONY_HP_5MP_CAMERA3 0xb882 #define USB_VENDOR_ID_CHUNGHWAT 0x2247 #define USB_DEVICE_ID_CHUNGHWAT_MULTITOUCH 0x0001 diff --git a/drivers/hid/hid-quirks.c b/drivers/hid/hid-quirks.c index c89a015686c07..3cf7971d49743 100644 --- a/drivers/hid/hid-quirks.c +++ b/drivers/hid/hid-quirks.c @@ -769,6 +769,7 @@ static const struct hid_device_id hid_ignore_list[] = { { HID_USB_DEVICE(USB_VENDOR_ID_BERKSHIRE, USB_DEVICE_ID_BERKSHIRE_PCWD) }, { HID_USB_DEVICE(USB_VENDOR_ID_CHICONY, USB_DEVICE_ID_CHICONY_HP_5MP_CAMERA) }, { HID_USB_DEVICE(USB_VENDOR_ID_CHICONY, USB_DEVICE_ID_CHICONY_HP_5MP_CAMERA2) }, + { HID_USB_DEVICE(USB_VENDOR_ID_CHICONY, USB_DEVICE_ID_CHICONY_HP_5MP_CAMERA3) }, { HID_USB_DEVICE(USB_VENDOR_ID_CIDC, 0x0103) }, { HID_USB_DEVICE(USB_VENDOR_ID_CYGNAL, USB_DEVICE_ID_CYGNAL_RADIO_SI470X) }, { HID_USB_DEVICE(USB_VENDOR_ID_CYGNAL, USB_DEVICE_ID_CYGNAL_RADIO_SI4713) }, From 604e1f57e211365fbf3916e9664eb8d8e3009f72 Mon Sep 17 00:00:00 2001 From: Peter Hutterer Date: Mon, 22 Dec 2025 09:43:34 +1000 Subject: [PATCH 0708/1321] HID: multitouch: set INPUT_PROP_PRESSUREPAD based on Digitizer/Button Type A Digitizer/Button Type value of 1 indicates the device is a pressurepad, see https://learn.microsoft.com/en-us/windows-hardware/design/component-guidelines/touchpad-windows-precision-touchpad-collection#device-capabilities-feature-report Signed-off-by: Peter Hutterer Signed-off-by: Benjamin Tissoires --- drivers/hid/hid-multitouch.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/drivers/hid/hid-multitouch.c b/drivers/hid/hid-multitouch.c index a0c1ad5acb670..b1c3ef1290587 100644 --- a/drivers/hid/hid-multitouch.c +++ b/drivers/hid/hid-multitouch.c @@ -81,6 +81,7 @@ MODULE_LICENSE("GPL"); #define MT_INPUTMODE_TOUCHPAD 0x03 #define MT_BUTTONTYPE_CLICKPAD 0 +#define MT_BUTTONTYPE_PRESSUREPAD 1 enum latency_mode { HID_LATENCY_NORMAL = 0, @@ -179,6 +180,7 @@ struct mt_device { __u8 inputmode_value; /* InputMode HID feature value */ __u8 maxcontacts; bool is_buttonpad; /* is this device a button pad? */ + bool is_pressurepad; /* is this device a pressurepad? */ bool is_haptic_touchpad; /* is this device a haptic touchpad? */ bool serial_maybe; /* need to check for serial protocol */ @@ -531,8 +533,14 @@ static void mt_feature_mapping(struct hid_device *hdev, } mt_get_feature(hdev, field->report); - if (field->value[usage->usage_index] == MT_BUTTONTYPE_CLICKPAD) + switch (field->value[usage->usage_index]) { + case MT_BUTTONTYPE_CLICKPAD: td->is_buttonpad = true; + break; + case MT_BUTTONTYPE_PRESSUREPAD: + td->is_pressurepad = true; + break; + } break; case 0xff0000c5: @@ -1394,6 +1402,8 @@ static int mt_touch_input_configured(struct hid_device *hdev, if (td->is_buttonpad) __set_bit(INPUT_PROP_BUTTONPAD, input->propbit); + if (td->is_pressurepad) + __set_bit(INPUT_PROP_PRESSUREPAD, input->propbit); app->pending_palm_slots = devm_kcalloc(&hi->input->dev, BITS_TO_LONGS(td->maxcontacts), From dab99c72f1b15395ebbf6bab294ddd63f54d45b0 Mon Sep 17 00:00:00 2001 From: Peter Hutterer Date: Mon, 22 Dec 2025 09:43:35 +1000 Subject: [PATCH 0709/1321] selftests/hid: require hidtools 0.12 Not all our tests really require it but since it's likely pip-installed anyway it's trivial to require the new version, just in case we want to start cleaning up other bits. Signed-off-by: Peter Hutterer Signed-off-by: Benjamin Tissoires --- tools/testing/selftests/hid/tests/conftest.py | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/tools/testing/selftests/hid/tests/conftest.py b/tools/testing/selftests/hid/tests/conftest.py index 1361ec981db6f..985a535324b2f 100644 --- a/tools/testing/selftests/hid/tests/conftest.py +++ b/tools/testing/selftests/hid/tests/conftest.py @@ -5,6 +5,7 @@ # Copyright (c) 2017 Benjamin Tissoires # Copyright (c) 2017 Red Hat, Inc. +from packaging.version import Version import platform import pytest import re @@ -14,6 +15,19 @@ from pathlib import Path +@pytest.fixture(autouse=True) +def hidtools_version_check(): + HIDTOOLS_VERSION = "0.12" + try: + import hidtools + + version = hidtools.__version__ # type: ignore + if Version(version) < Version(HIDTOOLS_VERSION): + pytest.skip(reason=f"have hidtools {version}, require >={HIDTOOLS_VERSION}") + except Exception: + pytest.skip(reason=f"hidtools >={HIDTOOLS_VERSION} required") + + # See the comment in HIDTestUdevRule, this doesn't set up but it will clean # up once the last test exited. @pytest.fixture(autouse=True, scope="session") From 8199eec50dc4556a7371e7c4b04a79529d9448b0 Mon Sep 17 00:00:00 2001 From: Peter Hutterer Date: Mon, 22 Dec 2025 09:43:36 +1000 Subject: [PATCH 0710/1321] selftests/hid: use a enum class for the different button types Instead of multiple spellings of a string-provided argument, let's make this a tad more type-safe and use an enum here. And while we do this fix the two wrong devices: - elan_04f3_313a (HP ZBook Fury 15) is discrete button pad - dell_044e_1220 (Dell Precision 7740) is a discrete button pad Equivalent hid-tools commit https://gitlab.freedesktop.org/libevdev/hid-tools/-/commit/8300a55bf4213c6a252cab8cb5b34c9ddb191625 Signed-off-by: Peter Hutterer Signed-off-by: Benjamin Tissoires --- .../selftests/hid/tests/test_multitouch.py | 24 +++++++++++-------- 1 file changed, 14 insertions(+), 10 deletions(-) diff --git a/tools/testing/selftests/hid/tests/test_multitouch.py b/tools/testing/selftests/hid/tests/test_multitouch.py index ece0ba8e7d34b..a06a087f00b69 100644 --- a/tools/testing/selftests/hid/tests/test_multitouch.py +++ b/tools/testing/selftests/hid/tests/test_multitouch.py @@ -9,6 +9,7 @@ from . import base from hidtools.hut import HUT from hidtools.util import BusType +import enum import libevdev import logging import pytest @@ -232,11 +233,17 @@ def set_report(self, req, rnum, rtype, data): return 0 +class HIDButtonType(enum.IntEnum): + CLICKPAD = 0 + PRESSUREPAD = 1 + DISCRETE_BUTTONS = 2 + + class PTP(Digitizer): def __init__( self, name, - type="Click Pad", + buttontype=HIDButtonType.CLICKPAD, rdesc_str=None, rdesc=None, application="Touch Pad", @@ -244,11 +251,8 @@ def __init__( max_contacts=None, input_info=None, ): - self.type = type.lower().replace(" ", "") - if self.type == "clickpad": - self.buttontype = 0 - else: # pressurepad - self.buttontype = 1 + self.buttontype = buttontype + self.clickpad_state = False self.left_state = False self.right_state = False @@ -983,7 +987,7 @@ def test_ptp_buttons(self): uhdev = self.uhdev evdev = uhdev.get_evdev() - if uhdev.type == "clickpad": + if uhdev.buttontype == HIDButtonType.CLICKPAD: r = uhdev.event(click=True) events = uhdev.next_sync_events() self.debug_reports(r, uhdev, events) @@ -1918,7 +1922,7 @@ class Testdell_044e_1220(BaseTest.TestPTP): def create_device(self): return PTP( "uhid test dell_044e_1220", - type="pressurepad", + buttontype=HIDButtonType.DISCRETE_BUTTONS, rdesc="05 01 09 02 a1 01 85 01 09 01 a1 00 05 09 19 01 29 03 15 00 25 01 75 01 95 03 81 02 95 05 81 01 05 01 09 30 09 31 15 81 25 7f 75 08 95 02 81 06 09 38 95 01 81 06 05 0c 0a 38 02 81 06 c0 c0 05 0d 09 05 a1 01 85 08 09 22 a1 02 15 00 25 01 09 47 09 42 95 02 75 01 81 02 95 01 75 03 25 05 09 51 81 02 75 01 95 03 81 03 05 01 15 00 26 af 04 75 10 55 0e 65 11 09 30 35 00 46 e8 03 95 01 81 02 26 7b 02 46 12 02 09 31 81 02 c0 55 0c 66 01 10 47 ff ff 00 00 27 ff ff 00 00 75 10 95 01 05 0d 09 56 81 02 09 54 25 05 95 01 75 08 81 02 05 09 19 01 29 03 25 01 75 01 95 03 81 02 95 05 81 03 05 0d 85 09 09 55 75 08 95 01 25 05 b1 02 06 00 ff 85 0a 09 c5 15 00 26 ff 00 75 08 96 00 01 b1 02 c0 06 01 ff 09 01 a1 01 85 03 09 01 15 00 26 ff 00 95 1b 81 02 85 04 09 02 95 50 81 02 85 05 09 03 95 07 b1 02 85 06 09 04 81 02 c0 06 02 ff 09 01 a1 01 85 07 09 02 95 86 75 08 b1 02 c0 05 0d 09 0e a1 01 85 0b 09 22 a1 02 09 52 15 00 25 0a 75 08 95 01 b1 02 c0 09 22 a1 00 85 0c 09 57 09 58 75 01 95 02 25 01 b1 02 95 06 b1 03 c0 c0", ) @@ -2018,7 +2022,7 @@ class Testelan_04f3_313a(BaseTest.TestPTP): def create_device(self): return PTP( "uhid test elan_04f3_313a", - type="touchpad", + buttontype=HIDButtonType.DISCRETE_BUTTONS, input_info=(BusType.I2C, 0x04F3, 0x313A), rdesc="05 01 09 02 a1 01 85 01 09 01 a1 00 05 09 19 01 29 03 15 00 25 01 75 01 95 03 81 02 95 05 81 03 05 01 09 30 09 31 15 81 25 7f 75 08 95 02 81 06 75 08 95 05 81 03 c0 06 00 ff 09 01 85 0e 09 c5 15 00 26 ff 00 75 08 95 04 b1 02 85 0a 09 c6 15 00 26 ff 00 75 08 95 04 b1 02 c0 06 00 ff 09 01 a1 01 85 5c 09 01 95 0b 75 08 81 06 85 0d 09 c5 15 00 26 ff 00 75 08 95 04 b1 02 85 0c 09 c6 96 80 03 75 08 b1 02 85 0b 09 c7 95 82 75 08 b1 02 c0 05 0d 09 05 a1 01 85 04 09 22 a1 02 15 00 25 01 09 47 09 42 95 02 75 01 81 02 05 09 09 02 09 03 15 00 25 01 75 01 95 02 81 02 05 0d 95 01 75 04 25 0f 09 51 81 02 05 01 15 00 26 d7 0e 75 10 55 0d 65 11 09 30 35 00 46 44 2f 95 01 81 02 46 12 16 26 eb 06 26 eb 06 09 31 81 02 05 0d 15 00 25 64 95 03 c0 55 0c 66 01 10 47 ff ff 00 00 27 ff ff 00 00 75 10 95 01 09 56 81 02 09 54 25 7f 95 01 75 08 81 02 25 01 75 01 95 08 81 03 09 c5 75 08 95 02 81 03 05 0d 85 02 09 55 09 59 75 04 95 02 25 0f b1 02 85 07 09 60 75 01 95 01 15 00 25 01 b1 02 95 0f b1 03 06 00 ff 06 00 ff 85 06 09 c5 15 00 26 ff 00 75 08 96 00 01 b1 02 c0 05 0d 09 0e a1 01 85 03 09 22 a1 00 09 52 15 00 25 0a 75 10 95 01 b1 02 c0 09 22 a1 00 85 05 09 57 09 58 75 01 95 02 25 01 b1 02 95 0e b1 03 c0 c0 05 01 09 02 a1 01 85 2a 09 01 a1 00 05 09 19 01 29 03 15 00 25 01 75 01 95 03 81 02 95 05 81 03 05 01 09 30 09 31 15 81 25 7f 35 81 45 7f 55 00 65 13 75 08 95 02 81 06 75 08 95 05 81 03 c0 c0", ) @@ -2110,7 +2114,7 @@ class Testsipodev_0603_0002(BaseTest.TestPTP): def create_device(self): return PTP( "uhid test sipodev_0603_0002", - type="clickpad", + buttontype=HIDButtonType.CLICKPAD, rdesc="05 01 09 02 a1 01 85 03 09 01 a1 00 05 09 19 01 29 02 25 01 75 01 95 02 81 02 95 06 81 03 05 01 09 30 09 31 15 80 25 7f 75 08 95 02 81 06 c0 c0 05 0d 09 05 a1 01 85 04 09 22 a1 02 15 00 25 01 09 47 09 42 95 02 75 01 81 02 75 01 95 02 81 03 95 01 75 04 25 05 09 51 81 02 05 01 15 00 26 44 0a 75 0c 55 0e 65 11 09 30 35 00 46 ac 03 95 01 81 02 46 fe 01 26 34 05 75 0c 09 31 81 02 05 0d c0 55 0c 66 01 10 47 ff ff 00 00 27 ff ff 00 00 75 10 95 01 09 56 81 02 09 54 25 0a 95 01 75 04 81 02 75 01 95 03 81 03 05 09 09 01 25 01 75 01 95 01 81 02 05 0d 85 0a 09 55 09 59 75 04 95 02 25 0f b1 02 85 0b 09 60 75 01 95 01 15 00 25 01 b1 02 95 07 b1 03 85 09 06 00 ff 09 c5 15 00 26 ff 00 75 08 96 00 01 b1 02 c0 05 0d 09 0e a1 01 85 06 09 22 a1 02 09 52 15 00 25 0a 75 08 95 01 b1 02 c0 09 22 a1 00 85 07 09 57 09 58 75 01 95 02 25 01 b1 02 95 06 b1 03 c0 c0 05 01 09 0c a1 01 85 08 15 00 25 01 09 c6 75 01 95 01 81 06 75 07 81 03 c0 05 01 09 80 a1 01 85 01 15 00 25 01 75 01 0a 81 00 0a 82 00 0a 83 00 95 03 81 06 95 05 81 01 c0 06 0c 00 09 01 a1 01 85 02 25 01 15 00 75 01 0a b5 00 0a b6 00 0a b7 00 0a cd 00 0a e2 00 0a a2 00 0a e9 00 0a ea 00 95 08 81 02 0a 83 01 0a 6f 00 0a 70 00 0a 88 01 0a 8a 01 0a 92 01 0a a8 02 0a 24 02 95 08 81 02 0a 21 02 0a 23 02 0a 96 01 0a 25 02 0a 26 02 0a 27 02 0a 23 02 0a b1 02 95 08 81 02 c0 06 00 ff 09 01 a1 01 85 05 15 00 26 ff 00 19 01 29 02 75 08 95 05 b1 02 c0", ) From 697ff6c3c07e9bbbd0d9139ca45913af3560011c Mon Sep 17 00:00:00 2001 From: Peter Hutterer Date: Mon, 22 Dec 2025 09:43:37 +1000 Subject: [PATCH 0711/1321] selftests/hid: add a test for the Digitizer/Button Type pressurepad We have to resort to a bit of a hack: python-libevdev gets the properties from libevdev at module init time. If libevdev hasn't been rebuilt with the new property it won't be automatically populated. So we hack around this by constructing the property manually. Signed-off-by: Peter Hutterer Signed-off-by: Benjamin Tissoires --- .../selftests/hid/tests/test_multitouch.py | 39 +++++++++++++++++-- 1 file changed, 35 insertions(+), 4 deletions(-) diff --git a/tools/testing/selftests/hid/tests/test_multitouch.py b/tools/testing/selftests/hid/tests/test_multitouch.py index a06a087f00b69..fa4fb2054bd4f 100644 --- a/tools/testing/selftests/hid/tests/test_multitouch.py +++ b/tools/testing/selftests/hid/tests/test_multitouch.py @@ -979,15 +979,36 @@ def test_mt_azimuth(self): assert libevdev.InputEvent(libevdev.EV_ABS.ABS_MT_ORIENTATION, 90) in events class TestPTP(TestWin8Multitouch): + def test_buttontype(self): + """Check for the right ButtonType.""" + uhdev = self.uhdev + assert uhdev is not None + evdev = uhdev.get_evdev() + + # If libevdev.so is not yet compiled with INPUT_PROP_PRESSUREPAD + # python-libevdev won't have it either, let's fake it + if not getattr(libevdev, "INPUT_PROP_PRESSUREPAD", None): + prop = libevdev.InputProperty(name="INPUT_PROP_PRESSUREPAD", value=0x7) + libevdev.INPUT_PROP_PRESSUREPAD = prop + libevdev.props.append(prop) + + if uhdev.buttontype == HIDButtonType.CLICKPAD: + assert libevdev.INPUT_PROP_BUTTONPAD in evdev.properties + elif uhdev.buttontype == HIDButtonType.PRESSUREPAD: + assert libevdev.INPUT_PROP_PRESSUREPAD in evdev.properties + else: + assert libevdev.INPUT_PROP_PRESSUREPAD not in evdev.properties + assert libevdev.INPUT_PROP_BUTTONPAD not in evdev.properties + def test_ptp_buttons(self): """check for button reliability. - There are 2 types of touchpads: the click pads and the pressure pads. - Each should reliably report the BTN_LEFT events. + There are 3 types of touchpads: click pads + pressure pads and + those with discrete buttons. Each should reliably report the BTN_LEFT events. """ uhdev = self.uhdev evdev = uhdev.get_evdev() - if uhdev.buttontype == HIDButtonType.CLICKPAD: + if uhdev.buttontype in [HIDButtonType.CLICKPAD, HIDButtonType.PRESSUREPAD]: r = uhdev.event(click=True) events = uhdev.next_sync_events() self.debug_reports(r, uhdev, events) @@ -999,7 +1020,7 @@ def test_ptp_buttons(self): self.debug_reports(r, uhdev, events) assert libevdev.InputEvent(libevdev.EV_KEY.BTN_LEFT, 0) in events assert evdev.value[libevdev.EV_KEY.BTN_LEFT] == 0 - else: + elif uhdev.buttontype == HIDButtonType.DISCRETE_BUTTONS: r = uhdev.event(left=True) events = uhdev.next_sync_events() self.debug_reports(r, uhdev, events) @@ -2062,6 +2083,16 @@ def create_device(self): ) +class Testven_0488_108c(BaseTest.TestPTP): + def create_device(self): + return PTP( + "uhid test ven_0488_108c", + rdesc="05 01 09 02 a1 01 85 06 09 01 a1 00 05 09 19 01 29 03 15 00 25 01 95 03 75 01 81 02 95 01 75 05 81 03 05 01 09 30 09 31 09 38 15 81 25 7f 75 08 95 03 81 06 c0 c0 05 0d 09 05 a1 01 85 01 05 0d 09 22 a1 02 15 00 25 01 09 47 09 42 95 02 75 01 81 02 95 01 75 03 25 05 09 51 81 02 81 03 05 01 15 00 26 ba 0d 75 10 55 0e 65 11 09 30 35 00 46 d0 05 95 01 81 02 26 d0 06 46 bb 02 09 31 81 02 05 0d 95 01 75 10 26 ff 7f 46 ff 7f 09 30 81 02 c0 05 0d 09 22 a1 02 15 00 25 01 09 47 09 42 95 02 75 01 81 02 95 01 75 03 25 05 09 51 81 02 81 03 05 01 15 00 26 ba 0d 75 10 55 0e 65 11 09 30 35 00 46 d0 05 95 01 81 02 26 d0 06 46 bb 02 09 31 81 02 05 0d 95 01 75 10 26 ff 7f 46 ff 7f 09 30 81 02 c0 05 0d 09 22 a1 02 15 00 25 01 09 47 09 42 95 02 75 01 81 02 95 01 75 03 25 05 09 51 81 02 81 03 05 01 15 00 26 ba 0d 75 10 55 0e 65 11 09 30 35 00 46 d0 05 95 01 81 02 26 d0 06 46 bb 02 09 31 81 02 05 0d 95 01 75 10 26 ff 7f 46 ff 7f 09 30 81 02 c0 55 0c 66 01 10 47 ff ff 00 00 27 ff ff 00 00 75 10 95 01 05 0d 09 56 81 02 09 54 25 05 95 01 75 08 81 02 05 09 09 01 25 01 75 01 95 01 81 02 95 07 81 03 05 0d 85 02 09 55 75 08 95 01 25 05 b1 02 09 59 b1 02 06 00 ff 85 03 09 c5 15 00 26 ff 00 75 08 96 00 01 b1 02 05 0e 09 01 a1 02 85 13 09 23 15 00 25 64 75 08 95 01 b1 02 c0 c0 05 0d 09 0e a1 01 85 04 09 22 a1 02 09 52 15 00 25 0a 75 08 95 01 b1 02 c0 09 22 a1 00 85 05 09 57 09 58 75 01 95 02 25 01 b1 02 95 06 b1 03 c0 c0 06 01 ff 09 02 a1 01 09 00 85 07 15 00 26 ff 00 75 08 96 12 02 b1 02 c0 06 00 ff 09 01 a1 01 85 0d 15 00 26 ff 00 75 08 95 11 09 01 81 02 09 01 91 02 c0 05 0e 09 01 a1 01 85 11 09 35 15 00 26 ff 00 75 08 95 17 b1 02 c0 06 81 ff 09 01 a1 01 09 20 85 17 15 00 26 ff 00 75 08 95 3f 09 01 81 02 09 01 91 02 c0", + input_info=(0x18, 0x0488, 0x108C), + buttontype=HIDButtonType.PRESSUREPAD, + ) + + class Testn_trig_1b96_0c01(BaseTest.TestWin8Multitouch): def create_device(self): return Digitizer( From ec734ac5bf88df8ff1ee75a4d8f53f1054a072de Mon Sep 17 00:00:00 2001 From: Kwok Kin Ming Date: Thu, 1 Jan 2026 02:18:26 +0800 Subject: [PATCH 0712/1321] HID: i2c-hid: fix potential buffer overflow in i2c_hid_get_report() `i2c_hid_xfer` is used to read `recv_len + sizeof(__le16)` bytes of data into `ihid->rawbuf`. The former can come from the userspace in the hidraw driver and is only bounded by HID_MAX_BUFFER_SIZE(16384) by default (unless we also set `max_buffer_size` field of `struct hid_ll_driver` which we do not). The latter has size determined at runtime by the maximum size of different report types you could receive on any particular device and can be a much smaller value. Fix this by truncating `recv_len` to `ihid->bufsize - sizeof(__le16)`. The impact is low since access to hidraw devices requires root. Signed-off-by: Kwok Kin Ming Signed-off-by: Benjamin Tissoires --- drivers/hid/i2c-hid/i2c-hid-core.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/hid/i2c-hid/i2c-hid-core.c b/drivers/hid/i2c-hid/i2c-hid-core.c index 63f46a2e57882..5a183af3d5c6a 100644 --- a/drivers/hid/i2c-hid/i2c-hid-core.c +++ b/drivers/hid/i2c-hid/i2c-hid-core.c @@ -286,6 +286,7 @@ static int i2c_hid_get_report(struct i2c_hid *ihid, * In addition to report data device will supply data length * in the first 2 bytes of the response, so adjust . */ + recv_len = min(recv_len, ihid->bufsize - sizeof(__le16)); error = i2c_hid_xfer(ihid, ihid->cmdbuf, length, ihid->rawbuf, recv_len + sizeof(__le16)); if (error) { From 4a1a63ed070f217dd5d420109a5ae52847c65f77 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Rodrigo=20Lugathe=20da=20Concei=C3=A7=C3=A3o=20Alves?= Date: Thu, 27 Nov 2025 19:03:57 -0300 Subject: [PATCH 0713/1321] HID: Apply quirk HID_QUIRK_ALWAYS_POLL to Edifier QR30 (2d99:a101) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The USB speaker has a bug that causes it to reboot when changing the brightness using the physical knob. Add a new vendor and product ID entry in hid-ids.h, and register the corresponding device in hid-quirks.c with the required quirk. Signed-off-by: Rodrigo Lugathe da Conceição Alves Reviewed-by: Terry Junge Signed-off-by: Jiri Kosina --- drivers/hid/hid-ids.h | 3 +++ drivers/hid/hid-quirks.c | 1 + 2 files changed, 4 insertions(+) diff --git a/drivers/hid/hid-ids.h b/drivers/hid/hid-ids.h index e8a1a86313b74..10cb1bc717c1d 100644 --- a/drivers/hid/hid-ids.h +++ b/drivers/hid/hid-ids.h @@ -439,6 +439,9 @@ #define USB_DEVICE_ID_DWAV_EGALAX_MULTITOUCH_A001 0xa001 #define USB_DEVICE_ID_DWAV_EGALAX_MULTITOUCH_C002 0xc002 +#define USB_VENDOR_ID_EDIFIER 0x2d99 +#define USB_DEVICE_ID_EDIFIER_QR30 0xa101 /* EDIFIER Hal0 2.0 SE */ + #define USB_VENDOR_ID_ELAN 0x04f3 #define USB_DEVICE_ID_TOSHIBA_CLICK_L9W 0x0401 #define USB_DEVICE_ID_HP_X2 0x074d diff --git a/drivers/hid/hid-quirks.c b/drivers/hid/hid-quirks.c index 3cf7971d49743..d6e42125d9189 100644 --- a/drivers/hid/hid-quirks.c +++ b/drivers/hid/hid-quirks.c @@ -81,6 +81,7 @@ static const struct hid_device_id hid_quirks[] = { { HID_USB_DEVICE(USB_VENDOR_ID_DRAGONRISE, USB_DEVICE_ID_DRAGONRISE_PS3), HID_QUIRK_MULTI_INPUT }, { HID_USB_DEVICE(USB_VENDOR_ID_DRAGONRISE, USB_DEVICE_ID_DRAGONRISE_WIIU), HID_QUIRK_MULTI_INPUT }, { HID_USB_DEVICE(USB_VENDOR_ID_DWAV, USB_DEVICE_ID_EGALAX_TOUCHCONTROLLER), HID_QUIRK_MULTI_INPUT | HID_QUIRK_NOGET }, + { HID_USB_DEVICE(USB_VENDOR_ID_EDIFIER, USB_DEVICE_ID_EDIFIER_QR30), HID_QUIRK_ALWAYS_POLL }, { HID_USB_DEVICE(USB_VENDOR_ID_ELAN, HID_ANY_ID), HID_QUIRK_ALWAYS_POLL }, { HID_USB_DEVICE(USB_VENDOR_ID_ELO, USB_DEVICE_ID_ELO_TS2700), HID_QUIRK_NOGET }, { HID_USB_DEVICE(USB_VENDOR_ID_EMS, USB_DEVICE_ID_EMS_TRIO_LINKER_PLUS_II), HID_QUIRK_MULTI_INPUT }, From e5e1d22a68b0d4b697661d81e9734f7ada6c7152 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ren=C3=A9=20Rebe?= Date: Fri, 28 Nov 2025 13:46:41 +0100 Subject: [PATCH 0714/1321] HID: quirks: work around VID/PID conflict for appledisplay MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit For years I wondered why the Apple Cinema Display driver would not just work for me. Turns out the hidraw driver instantly takes it over. Fix by adding appledisplay VID/PIDs to hid_have_special_driver. Fixes: 069e8a65cd79 ("Driver for Apple Cinema Display") Signed-off-by: René Rebe Signed-off-by: Jiri Kosina --- drivers/hid/hid-quirks.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/drivers/hid/hid-quirks.c b/drivers/hid/hid-quirks.c index d6e42125d9189..31b2a5d1cd98f 100644 --- a/drivers/hid/hid-quirks.c +++ b/drivers/hid/hid-quirks.c @@ -233,6 +233,15 @@ static const struct hid_device_id hid_quirks[] = { * used as a driver. See hid_scan_report(). */ static const struct hid_device_id hid_have_special_driver[] = { +#if IS_ENABLED(CONFIG_APPLEDISPLAY) + { HID_USB_DEVICE(USB_VENDOR_ID_APPLE, 0x9218) }, + { HID_USB_DEVICE(USB_VENDOR_ID_APPLE, 0x9219) }, + { HID_USB_DEVICE(USB_VENDOR_ID_APPLE, 0x921c) }, + { HID_USB_DEVICE(USB_VENDOR_ID_APPLE, 0x921d) }, + { HID_USB_DEVICE(USB_VENDOR_ID_APPLE, 0x9222) }, + { HID_USB_DEVICE(USB_VENDOR_ID_APPLE, 0x9226) }, + { HID_USB_DEVICE(USB_VENDOR_ID_APPLE, 0x9236) }, +#endif #if IS_ENABLED(CONFIG_HID_A4TECH) { HID_USB_DEVICE(USB_VENDOR_ID_A4TECH, USB_DEVICE_ID_A4TECH_WCP32PU) }, { HID_USB_DEVICE(USB_VENDOR_ID_A4TECH, USB_DEVICE_ID_A4TECH_X5_005D) }, From cc65c161fbb9036869bf7d31460cb0f0693e2fd9 Mon Sep 17 00:00:00 2001 From: Arnoud Willemsen Date: Sun, 7 Dec 2025 03:43:19 +0100 Subject: [PATCH 0715/1321] HID: Elecom: Add support for ELECOM M-XT3DRBK (018C) Wireless/new version of the Elecom trackball mouse M-XT3DRBK has a product id that differs from the existing M-XT3DRBK. The report descriptor format also seems to have changed and matches other (newer?) models instead (except for six buttons instead of eight). This patch follows the same format as the patch for the M-XT3URBK (018F) by Naoki Ueki (Nov 3rd 2025) to enable the sixth mouse button. dmesg output: [ 292.074664] usb 1-2: new full-speed USB device number 7 using xhci_hcd [ 292.218667] usb 1-2: New USB device found, idVendor=056e, idProduct=018c, bcdDevice= 1.00 [ 292.218676] usb 1-2: New USB device strings: Mfr=1, Product=2, SerialNumber=0 [ 292.218679] usb 1-2: Product: ELECOM TrackBall Mouse [ 292.218681] usb 1-2: Manufacturer: ELECOM usbhid-dump output: 001:006:000:DESCRIPTOR 1765072638.050578 05 01 09 02 A1 01 09 01 A1 00 85 01 05 09 19 01 29 05 15 00 25 01 95 08 75 01 81 02 95 01 75 00 81 01 05 01 09 30 09 31 16 00 80 26 FF 7F 75 10 95 02 81 06 C0 A1 00 05 01 09 38 15 81 25 7F 75 08 95 01 81 06 C0 A1 00 05 0C 0A 38 02 95 01 75 08 15 81 25 7F 81 06 C0 C0 06 01 FF 09 00 A1 01 85 02 09 00 15 00 26 FF 00 75 08 95 07 81 02 C0 05 0C 09 01 A1 01 85 05 15 00 26 3C 02 19 00 2A 3C 02 75 10 95 01 81 00 C0 05 01 09 80 A1 01 85 03 19 81 29 83 15 00 25 01 95 03 75 01 81 02 95 01 75 05 81 01 C0 06 BC FF 09 88 A1 01 85 04 95 01 75 08 15 00 26 FF 00 19 00 2A FF 00 81 00 C0 06 02 FF 09 02 A1 01 85 06 09 02 15 00 26 FF 00 75 08 95 07 B1 02 C0 Signed-off-by: Arnoud Willemsen Signed-off-by: Jiri Kosina --- drivers/hid/hid-elecom.c | 15 +++++++++++++-- drivers/hid/hid-ids.h | 3 ++- drivers/hid/hid-quirks.c | 3 ++- 3 files changed, 17 insertions(+), 4 deletions(-) diff --git a/drivers/hid/hid-elecom.c b/drivers/hid/hid-elecom.c index 981d1b6e96589..2003d2dcda7cc 100644 --- a/drivers/hid/hid-elecom.c +++ b/drivers/hid/hid-elecom.c @@ -77,7 +77,7 @@ static const __u8 *elecom_report_fixup(struct hid_device *hdev, __u8 *rdesc, break; case USB_DEVICE_ID_ELECOM_M_XT3URBK_00FB: case USB_DEVICE_ID_ELECOM_M_XT3URBK_018F: - case USB_DEVICE_ID_ELECOM_M_XT3DRBK: + case USB_DEVICE_ID_ELECOM_M_XT3DRBK_00FC: case USB_DEVICE_ID_ELECOM_M_XT4DRBK: /* * Report descriptor format: @@ -102,6 +102,16 @@ static const __u8 *elecom_report_fixup(struct hid_device *hdev, __u8 *rdesc, */ mouse_button_fixup(hdev, rdesc, *rsize, 12, 30, 14, 20, 8); break; + case USB_DEVICE_ID_ELECOM_M_XT3DRBK_018C: + /* + * Report descriptor format: + * 22: button bit count + * 30: padding bit count + * 24: button report size + * 16: button usage maximum + */ + mouse_button_fixup(hdev, rdesc, *rsize, 22, 30, 24, 16, 6); + break; case USB_DEVICE_ID_ELECOM_M_DT2DRBK: case USB_DEVICE_ID_ELECOM_M_HT1DRBK_011C: /* @@ -122,7 +132,8 @@ static const struct hid_device_id elecom_devices[] = { { HID_USB_DEVICE(USB_VENDOR_ID_ELECOM, USB_DEVICE_ID_ELECOM_M_XGL20DLBK) }, { HID_USB_DEVICE(USB_VENDOR_ID_ELECOM, USB_DEVICE_ID_ELECOM_M_XT3URBK_00FB) }, { HID_USB_DEVICE(USB_VENDOR_ID_ELECOM, USB_DEVICE_ID_ELECOM_M_XT3URBK_018F) }, - { HID_USB_DEVICE(USB_VENDOR_ID_ELECOM, USB_DEVICE_ID_ELECOM_M_XT3DRBK) }, + { HID_USB_DEVICE(USB_VENDOR_ID_ELECOM, USB_DEVICE_ID_ELECOM_M_XT3DRBK_00FC) }, + { HID_USB_DEVICE(USB_VENDOR_ID_ELECOM, USB_DEVICE_ID_ELECOM_M_XT3DRBK_018C) }, { HID_USB_DEVICE(USB_VENDOR_ID_ELECOM, USB_DEVICE_ID_ELECOM_M_XT4DRBK) }, { HID_USB_DEVICE(USB_VENDOR_ID_ELECOM, USB_DEVICE_ID_ELECOM_M_DT1URBK) }, { HID_USB_DEVICE(USB_VENDOR_ID_ELECOM, USB_DEVICE_ID_ELECOM_M_DT1DRBK) }, diff --git a/drivers/hid/hid-ids.h b/drivers/hid/hid-ids.h index 10cb1bc717c1d..9c2bf584d9f6f 100644 --- a/drivers/hid/hid-ids.h +++ b/drivers/hid/hid-ids.h @@ -455,7 +455,8 @@ #define USB_DEVICE_ID_ELECOM_M_XGL20DLBK 0x00e6 #define USB_DEVICE_ID_ELECOM_M_XT3URBK_00FB 0x00fb #define USB_DEVICE_ID_ELECOM_M_XT3URBK_018F 0x018f -#define USB_DEVICE_ID_ELECOM_M_XT3DRBK 0x00fc +#define USB_DEVICE_ID_ELECOM_M_XT3DRBK_00FC 0x00fc +#define USB_DEVICE_ID_ELECOM_M_XT3DRBK_018C 0x018c #define USB_DEVICE_ID_ELECOM_M_XT4DRBK 0x00fd #define USB_DEVICE_ID_ELECOM_M_DT1URBK 0x00fe #define USB_DEVICE_ID_ELECOM_M_DT1DRBK 0x00ff diff --git a/drivers/hid/hid-quirks.c b/drivers/hid/hid-quirks.c index 31b2a5d1cd98f..11438039cdb7f 100644 --- a/drivers/hid/hid-quirks.c +++ b/drivers/hid/hid-quirks.c @@ -422,7 +422,8 @@ static const struct hid_device_id hid_have_special_driver[] = { { HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_ELECOM, USB_DEVICE_ID_ELECOM_M_XGL20DLBK) }, { HID_USB_DEVICE(USB_VENDOR_ID_ELECOM, USB_DEVICE_ID_ELECOM_M_XT3URBK_00FB) }, { HID_USB_DEVICE(USB_VENDOR_ID_ELECOM, USB_DEVICE_ID_ELECOM_M_XT3URBK_018F) }, - { HID_USB_DEVICE(USB_VENDOR_ID_ELECOM, USB_DEVICE_ID_ELECOM_M_XT3DRBK) }, + { HID_USB_DEVICE(USB_VENDOR_ID_ELECOM, USB_DEVICE_ID_ELECOM_M_XT3DRBK_00FC) }, + { HID_USB_DEVICE(USB_VENDOR_ID_ELECOM, USB_DEVICE_ID_ELECOM_M_XT3DRBK_018C) }, { HID_USB_DEVICE(USB_VENDOR_ID_ELECOM, USB_DEVICE_ID_ELECOM_M_XT4DRBK) }, { HID_USB_DEVICE(USB_VENDOR_ID_ELECOM, USB_DEVICE_ID_ELECOM_M_DT1URBK) }, { HID_USB_DEVICE(USB_VENDOR_ID_ELECOM, USB_DEVICE_ID_ELECOM_M_DT1DRBK) }, From 13c3e1f9bd279f47ea0a4fe9375fb316e5b21c60 Mon Sep 17 00:00:00 2001 From: Dennis Marttinen Date: Sun, 4 Jan 2026 13:00:51 +0000 Subject: [PATCH 0716/1321] HID: logitech: add HID++ support for Logitech MX Anywhere 3S I've acquired a Logitech MX Anywhere 3S mouse, which supports HID++ over Bluetooth. Adding its PID 0xb037 to the allowlist enables the additional features, such as high-resolution scrolling. Tested working across multiple machines, with a mix of Intel and Mediatek Bluetooth chips. [jkosina@suse.com: standardize shortlog] Signed-off-by: Dennis Marttinen Signed-off-by: Jiri Kosina --- drivers/hid/hid-logitech-hidpp.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/hid/hid-logitech-hidpp.c b/drivers/hid/hid-logitech-hidpp.c index d5011a5d08902..e871f1729d4b3 100644 --- a/drivers/hid/hid-logitech-hidpp.c +++ b/drivers/hid/hid-logitech-hidpp.c @@ -4662,6 +4662,8 @@ static const struct hid_device_id hidpp_devices[] = { HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_LOGITECH, 0xb025) }, { /* MX Master 3S mouse over Bluetooth */ HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_LOGITECH, 0xb034) }, + { /* MX Anywhere 3S mouse over Bluetooth */ + HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_LOGITECH, 0xb037) }, { /* MX Anywhere 3SB mouse over Bluetooth */ HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_LOGITECH, 0xb038) }, {} From 3aa77bbe0f9823033c2a2769f22bb6ad38f41482 Mon Sep 17 00:00:00 2001 From: Alok Tiwari Date: Mon, 29 Dec 2025 21:21:18 -0800 Subject: [PATCH 0717/1321] net: marvell: prestera: fix NULL dereference on devlink_alloc() failure devlink_alloc() may return NULL on allocation failure, but prestera_devlink_alloc() unconditionally calls devlink_priv() on the returned pointer. This leads to a NULL pointer dereference if devlink allocation fails. Add a check for a NULL devlink pointer and return NULL early to avoid the crash. Fixes: 34dd1710f5a3 ("net: marvell: prestera: Add basic devlink support") Signed-off-by: Alok Tiwari Acked-by: Elad Nachman Link: https://patch.msgid.link/20251230052124.897012-1-alok.a.tiwari@oracle.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/marvell/prestera/prestera_devlink.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/net/ethernet/marvell/prestera/prestera_devlink.c b/drivers/net/ethernet/marvell/prestera/prestera_devlink.c index 2a4c9df4eb797..e63d95c1842f3 100644 --- a/drivers/net/ethernet/marvell/prestera/prestera_devlink.c +++ b/drivers/net/ethernet/marvell/prestera/prestera_devlink.c @@ -387,6 +387,8 @@ struct prestera_switch *prestera_devlink_alloc(struct prestera_device *dev) dl = devlink_alloc(&prestera_dl_ops, sizeof(struct prestera_switch), dev->dev); + if (!dl) + return NULL; return devlink_priv(dl); } From 20272a496d2eb72800196593093648ecf037aaca Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Markus=20Bl=C3=B6chl?= Date: Sun, 28 Dec 2025 16:52:59 +0100 Subject: [PATCH 0718/1321] net: bnge: add AUXILIARY_BUS to Kconfig dependencies MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The build can currently fail with ld: drivers/net/ethernet/broadcom/bnge/bnge_auxr.o: in function `bnge_rdma_aux_device_add': bnge_auxr.c:(.text+0x366): undefined reference to `__auxiliary_device_add' ld: drivers/net/ethernet/broadcom/bnge/bnge_auxr.o: in function `bnge_rdma_aux_device_init': bnge_auxr.c:(.text+0x43c): undefined reference to `auxiliary_device_init' if BNGE is enabled but no other driver pulls in AUXILIARY_BUS. Select AUXILIARY_BUS in BNGE like in all other drivers which create an auxiliary_device. Fixes: 8ac050ec3b1c ("bng_en: Add RoCE aux device support") Signed-off-by: Markus Blöchl Reviewed-by: Vikas Gupta Link: https://patch.msgid.link/20251228-bnge_aux_bus-v1-1-82e273ebfdac@blochl.de Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/broadcom/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/net/ethernet/broadcom/Kconfig b/drivers/net/ethernet/broadcom/Kconfig index ca565ace6e6ad..cd7dddeb91dd6 100644 --- a/drivers/net/ethernet/broadcom/Kconfig +++ b/drivers/net/ethernet/broadcom/Kconfig @@ -259,6 +259,7 @@ config BNGE depends on PCI select NET_DEVLINK select PAGE_POOL + select AUXILIARY_BUS help This driver supports Broadcom ThorUltra 50/100/200/400/800 gigabit Ethernet cards. The module will be called bng_en. To compile this From 18c19659ebb8ea702f1f11e7155480bae86a77c5 Mon Sep 17 00:00:00 2001 From: Alexandre Knecht Date: Sun, 28 Dec 2025 03:00:57 +0100 Subject: [PATCH 0719/1321] bridge: fix C-VLAN preservation in 802.1ad vlan_tunnel egress When using an 802.1ad bridge with vlan_tunnel, the C-VLAN tag is incorrectly stripped from frames during egress processing. br_handle_egress_vlan_tunnel() uses skb_vlan_pop() to remove the S-VLAN from hwaccel before VXLAN encapsulation. However, skb_vlan_pop() also moves any "next" VLAN from the payload into hwaccel: /* move next vlan tag to hw accel tag */ __skb_vlan_pop(skb, &vlan_tci); __vlan_hwaccel_put_tag(skb, vlan_proto, vlan_tci); For QinQ frames where the C-VLAN sits in the payload, this moves it to hwaccel where it gets lost during VXLAN encapsulation. Fix by calling __vlan_hwaccel_clear_tag() directly, which clears only the hwaccel S-VLAN and leaves the payload untouched. This path is only taken when vlan_tunnel is enabled and tunnel_info is configured, so 802.1Q bridges are unaffected. Tested with 802.1ad bridge + VXLAN vlan_tunnel, verified C-VLAN preserved in VXLAN payload via tcpdump. Fixes: 11538d039ac6 ("bridge: vlan dst_metadata hooks in ingress and egress paths") Signed-off-by: Alexandre Knecht Reviewed-by: Ido Schimmel Acked-by: Nikolay Aleksandrov Link: https://patch.msgid.link/20251228020057.2788865-1-knecht.alexandre@gmail.com Signed-off-by: Jakub Kicinski --- net/bridge/br_vlan_tunnel.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/net/bridge/br_vlan_tunnel.c b/net/bridge/br_vlan_tunnel.c index a966a6ec82634..257cae9f15698 100644 --- a/net/bridge/br_vlan_tunnel.c +++ b/net/bridge/br_vlan_tunnel.c @@ -189,7 +189,6 @@ int br_handle_egress_vlan_tunnel(struct sk_buff *skb, IP_TUNNEL_DECLARE_FLAGS(flags) = { }; struct metadata_dst *tunnel_dst; __be64 tunnel_id; - int err; if (!vlan) return 0; @@ -199,9 +198,13 @@ int br_handle_egress_vlan_tunnel(struct sk_buff *skb, return 0; skb_dst_drop(skb); - err = skb_vlan_pop(skb); - if (err) - return err; + /* For 802.1ad (QinQ), skb_vlan_pop() incorrectly moves the C-VLAN + * from payload to hwaccel after clearing S-VLAN. We only need to + * clear the hwaccel S-VLAN; the C-VLAN must stay in payload for + * correct VXLAN encapsulation. This is also correct for 802.1Q + * where no C-VLAN exists in payload. + */ + __vlan_hwaccel_clear_tag(skb); if (BR_INPUT_SKB_CB(skb)->backup_nhid) { __set_bit(IP_TUNNEL_KEY_BIT, flags); From ae4be8b573ec67e00b1007ef1425ac39eb470faa Mon Sep 17 00:00:00 2001 From: Jerry Wu Date: Thu, 25 Dec 2025 20:36:17 +0000 Subject: [PATCH 0720/1321] net: mscc: ocelot: Fix crash when adding interface under a lag Commit 15faa1f67ab4 ("lan966x: Fix crash when adding interface under a lag") fixed a similar issue in the lan966x driver caused by a NULL pointer dereference. The ocelot_set_aggr_pgids() function in the ocelot driver has similar logic and is susceptible to the same crash. This issue specifically affects the ocelot_vsc7514.c frontend, which leaves unused ports as NULL pointers. The felix_vsc9959.c frontend is unaffected as it uses the DSA framework which registers all ports. Fix this by checking if the port pointer is valid before accessing it. Fixes: 528d3f190c98 ("net: mscc: ocelot: drop the use of the "lags" array") Signed-off-by: Jerry Wu Reviewed-by: Vladimir Oltean Link: https://patch.msgid.link/tencent_75EF812B305E26B0869C673DD1160866C90A@qq.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/mscc/ocelot.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/mscc/ocelot.c b/drivers/net/ethernet/mscc/ocelot.c index 08bee56aea35f..c345d9b17c892 100644 --- a/drivers/net/ethernet/mscc/ocelot.c +++ b/drivers/net/ethernet/mscc/ocelot.c @@ -2307,14 +2307,16 @@ static void ocelot_set_aggr_pgids(struct ocelot *ocelot) /* Now, set PGIDs for each active LAG */ for (lag = 0; lag < ocelot->num_phys_ports; lag++) { - struct net_device *bond = ocelot->ports[lag]->bond; + struct ocelot_port *ocelot_port = ocelot->ports[lag]; int num_active_ports = 0; + struct net_device *bond; unsigned long bond_mask; u8 aggr_idx[16]; - if (!bond || (visited & BIT(lag))) + if (!ocelot_port || !ocelot_port->bond || (visited & BIT(lag))) continue; + bond = ocelot_port->bond; bond_mask = ocelot_get_bond_mask(ocelot, bond); for_each_set_bit(port, &bond_mask, ocelot->num_phys_ports) { From dec1873d9d5fb29342d91c3ac367dea94f2c5a0b Mon Sep 17 00:00:00 2001 From: "yuan.gao" Date: Wed, 24 Dec 2025 14:31:45 +0800 Subject: [PATCH 0721/1321] inet: ping: Fix icmp out counting When the ping program uses an IPPROTO_ICMP socket to send ICMP_ECHO messages, ICMP_MIB_OUTMSGS is counted twice. ping_v4_sendmsg ping_v4_push_pending_frames ip_push_pending_frames ip_finish_skb __ip_make_skb icmp_out_count(net, icmp_type); // first count icmp_out_count(sock_net(sk), user_icmph.type); // second count However, when the ping program uses an IPPROTO_RAW socket, ICMP_MIB_OUTMSGS is counted correctly only once. Therefore, the first count should be removed. Fixes: c319b4d76b9e ("net: ipv4: add IPPROTO_ICMP socket kind") Signed-off-by: yuan.gao Reviewed-by: Ido Schimmel Tested-by: Ido Schimmel Link: https://patch.msgid.link/20251224063145.3615282-1-yuan.gao@ucloud.cn Signed-off-by: Jakub Kicinski --- net/ipv4/ping.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/net/ipv4/ping.c b/net/ipv4/ping.c index ad56588107cc8..cfbd563498e85 100644 --- a/net/ipv4/ping.c +++ b/net/ipv4/ping.c @@ -828,10 +828,8 @@ static int ping_v4_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) out_free: if (free) kfree(ipc.opt); - if (!err) { - icmp_out_count(sock_net(sk), user_icmph.type); + if (!err) return len; - } return err; do_confirm: From 3a8483b321cecb17a8641d5993377043d3159d13 Mon Sep 17 00:00:00 2001 From: Stefano Radaelli Date: Tue, 23 Dec 2025 13:09:39 +0100 Subject: [PATCH 0722/1321] net: phy: mxl-86110: Add power management and soft reset support Implement soft_reset, suspend, and resume callbacks using genphy_soft_reset(), genphy_suspend(), and genphy_resume() to fix PHY initialization and power management issues. The soft_reset callback is needed to properly recover the PHY after an ifconfig down/up cycle. Without it, the PHY can remain in power-down state, causing MDIO register access failures during config_init(). The soft reset ensures the PHY is operational before configuration. The suspend/resume callbacks enable proper power management during system suspend/resume cycles. Fixes: b2908a989c59 ("net: phy: add driver for MaxLinear MxL86110 PHY") Signed-off-by: Stefano Radaelli Link: https://patch.msgid.link/20251223120940.407195-1-stefano.r@variscite.com Signed-off-by: Jakub Kicinski --- drivers/net/phy/mxl-86110.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/phy/mxl-86110.c b/drivers/net/phy/mxl-86110.c index e5d137a37a1d4..42a5fe3f115f4 100644 --- a/drivers/net/phy/mxl-86110.c +++ b/drivers/net/phy/mxl-86110.c @@ -938,6 +938,9 @@ static struct phy_driver mxl_phy_drvs[] = { PHY_ID_MATCH_EXACT(PHY_ID_MXL86110), .name = "MXL86110 Gigabit Ethernet", .config_init = mxl86110_config_init, + .suspend = genphy_suspend, + .resume = genphy_resume, + .soft_reset = genphy_soft_reset, .get_wol = mxl86110_get_wol, .set_wol = mxl86110_set_wol, .led_brightness_set = mxl86110_led_brightness_set, From 32372b0ac0d07b475b3ce9dd25d1ed5466eff44f Mon Sep 17 00:00:00 2001 From: Weiming Shi Date: Wed, 24 Dec 2025 04:35:35 +0800 Subject: [PATCH 0723/1321] net: sock: fix hardened usercopy panic in sock_recv_errqueue skbuff_fclone_cache was created without defining a usercopy region, [1] unlike skbuff_head_cache which properly whitelists the cb[] field. [2] This causes a usercopy BUG() when CONFIG_HARDENED_USERCOPY is enabled and the kernel attempts to copy sk_buff.cb data to userspace via sock_recv_errqueue() -> put_cmsg(). The crash occurs when: 1. TCP allocates an skb using alloc_skb_fclone() (from skbuff_fclone_cache) [1] 2. The skb is cloned via skb_clone() using the pre-allocated fclone [3] 3. The cloned skb is queued to sk_error_queue for timestamp reporting 4. Userspace reads the error queue via recvmsg(MSG_ERRQUEUE) 5. sock_recv_errqueue() calls put_cmsg() to copy serr->ee from skb->cb [4] 6. __check_heap_object() fails because skbuff_fclone_cache has no usercopy whitelist [5] When cloned skbs allocated from skbuff_fclone_cache are used in the socket error queue, accessing the sock_exterr_skb structure in skb->cb via put_cmsg() triggers a usercopy hardening violation: [ 5.379589] usercopy: Kernel memory exposure attempt detected from SLUB object 'skbuff_fclone_cache' (offset 296, size 16)! [ 5.382796] kernel BUG at mm/usercopy.c:102! [ 5.383923] Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI [ 5.384903] CPU: 1 UID: 0 PID: 138 Comm: poc_put_cmsg Not tainted 6.12.57 #7 [ 5.384903] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014 [ 5.384903] RIP: 0010:usercopy_abort+0x6c/0x80 [ 5.384903] Code: 1a 86 51 48 c7 c2 40 15 1a 86 41 52 48 c7 c7 c0 15 1a 86 48 0f 45 d6 48 c7 c6 80 15 1a 86 48 89 c1 49 0f 45 f3 e8 84 27 88 ff <0f> 0b 490 [ 5.384903] RSP: 0018:ffffc900006f77a8 EFLAGS: 00010246 [ 5.384903] RAX: 000000000000006f RBX: ffff88800f0ad2a8 RCX: 1ffffffff0f72e74 [ 5.384903] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffff87b973a0 [ 5.384903] RBP: 0000000000000010 R08: 0000000000000000 R09: fffffbfff0f72e74 [ 5.384903] R10: 0000000000000003 R11: 79706f6372657375 R12: 0000000000000001 [ 5.384903] R13: ffff88800f0ad2b8 R14: ffffea00003c2b40 R15: ffffea00003c2b00 [ 5.384903] FS: 0000000011bc4380(0000) GS:ffff8880bf100000(0000) knlGS:0000000000000000 [ 5.384903] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 5.384903] CR2: 000056aa3b8e5fe4 CR3: 000000000ea26004 CR4: 0000000000770ef0 [ 5.384903] PKRU: 55555554 [ 5.384903] Call Trace: [ 5.384903] [ 5.384903] __check_heap_object+0x9a/0xd0 [ 5.384903] __check_object_size+0x46c/0x690 [ 5.384903] put_cmsg+0x129/0x5e0 [ 5.384903] sock_recv_errqueue+0x22f/0x380 [ 5.384903] tls_sw_recvmsg+0x7ed/0x1960 [ 5.384903] ? srso_alias_return_thunk+0x5/0xfbef5 [ 5.384903] ? schedule+0x6d/0x270 [ 5.384903] ? srso_alias_return_thunk+0x5/0xfbef5 [ 5.384903] ? mutex_unlock+0x81/0xd0 [ 5.384903] ? __pfx_mutex_unlock+0x10/0x10 [ 5.384903] ? __pfx_tls_sw_recvmsg+0x10/0x10 [ 5.384903] ? _raw_spin_lock_irqsave+0x8f/0xf0 [ 5.384903] ? _raw_read_unlock_irqrestore+0x20/0x40 [ 5.384903] ? srso_alias_return_thunk+0x5/0xfbef5 The crash offset 296 corresponds to skb2->cb within skbuff_fclones: - sizeof(struct sk_buff) = 232 - offsetof(struct sk_buff, cb) = 40 - offset of skb2.cb in fclones = 232 + 40 = 272 - crash offset 296 = 272 + 24 (inside sock_exterr_skb.ee) This patch uses a local stack variable as a bounce buffer to avoid the hardened usercopy check failure. [1] https://elixir.bootlin.com/linux/v6.12.62/source/net/ipv4/tcp.c#L885 [2] https://elixir.bootlin.com/linux/v6.12.62/source/net/core/skbuff.c#L5104 [3] https://elixir.bootlin.com/linux/v6.12.62/source/net/core/skbuff.c#L5566 [4] https://elixir.bootlin.com/linux/v6.12.62/source/net/core/skbuff.c#L5491 [5] https://elixir.bootlin.com/linux/v6.12.62/source/mm/slub.c#L5719 Fixes: 6d07d1cd300f ("usercopy: Restrict non-usercopy caches to size 0") Reported-by: Xiang Mei Signed-off-by: Weiming Shi Reviewed-by: Eric Dumazet Link: https://patch.msgid.link/20251223203534.1392218-2-bestswngs@gmail.com Signed-off-by: Jakub Kicinski --- net/core/sock.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/net/core/sock.c b/net/core/sock.c index 45c98bf524b28..a1c8b47b0d566 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -3896,7 +3896,7 @@ void sock_enable_timestamp(struct sock *sk, enum sock_flags flag) int sock_recv_errqueue(struct sock *sk, struct msghdr *msg, int len, int level, int type) { - struct sock_exterr_skb *serr; + struct sock_extended_err ee; struct sk_buff *skb; int copied, err; @@ -3916,8 +3916,9 @@ int sock_recv_errqueue(struct sock *sk, struct msghdr *msg, int len, sock_recv_timestamp(msg, sk, skb); - serr = SKB_EXT_ERR(skb); - put_cmsg(msg, level, type, sizeof(serr->ee), &serr->ee); + /* We must use a bounce buffer for CONFIG_HARDENED_USERCOPY=y */ + ee = SKB_EXT_ERR(skb)->ee; + put_cmsg(msg, level, type, sizeof(ee), &ee); msg->msg_flags |= MSG_ERRQUEUE; err = copied; From abfaf10a21fc4879958966dff9a3a6cbb83193a4 Mon Sep 17 00:00:00 2001 From: Di Zhu Date: Wed, 24 Dec 2025 09:22:24 +0800 Subject: [PATCH 0724/1321] netdev: preserve NETIF_F_ALL_FOR_ALL across TSO updates Directly increment the TSO features incurs a side effect: it will also directly clear the flags in NETIF_F_ALL_FOR_ALL on the master device, which can cause issues such as the inability to enable the nocache copy feature on the bonding driver. The fix is to include NETIF_F_ALL_FOR_ALL in the update mask, thereby preventing it from being cleared. Fixes: b0ce3508b25e ("bonding: allow TSO being set on bonding master") Signed-off-by: Di Zhu Link: https://patch.msgid.link/20251224012224.56185-1-zhud@hygon.cn Signed-off-by: Jakub Kicinski --- include/linux/netdevice.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 5870a9e514a51..d99b0fbc1942a 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -5323,7 +5323,8 @@ netdev_features_t netdev_increment_features(netdev_features_t all, static inline netdev_features_t netdev_add_tso_features(netdev_features_t features, netdev_features_t mask) { - return netdev_increment_features(features, NETIF_F_ALL_TSO, mask); + return netdev_increment_features(features, NETIF_F_ALL_TSO | + NETIF_F_ALL_FOR_ALL, mask); } int __netdev_update_features(struct net_device *dev); From 2b5b44cc5d1ff2597e84b81545e42d2eeda288dc Mon Sep 17 00:00:00 2001 From: Patrisious Haddad Date: Thu, 25 Dec 2025 15:27:13 +0200 Subject: [PATCH 0725/1321] net/mlx5: Lag, multipath, give priority for routes with smaller network prefix Today multipath offload is controlled by a single route and the route controlling is selected if it meets one of the following criteria: 1. No controlling route is set. 2. New route destination is the same as old one. 3. New route metric is lower than old route metric. This can cause unwanted behaviour in case a new route is added with a smaller network prefix which should get the priority. Fix this by adding a new criteria to give priority to new route with a smaller network prefix. Fixes: ad11c4f1d8fd ("net/mlx5e: Lag, Only handle events from highest priority multipath entry") Signed-off-by: Patrisious Haddad Signed-off-by: Mark Bloch Link: https://patch.msgid.link/20251225132717.358820-2-mbloch@nvidia.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/mellanox/mlx5/core/lag/mp.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/mp.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/mp.c index aee17fcf3b36c..cdc99fe5c9568 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lag/mp.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/mp.c @@ -173,10 +173,15 @@ static void mlx5_lag_fib_route_event(struct mlx5_lag *ldev, unsigned long event, } /* Handle multipath entry with lower priority value */ - if (mp->fib.mfi && mp->fib.mfi != fi && + if (mp->fib.mfi && (mp->fib.dst != fen_info->dst || mp->fib.dst_len != fen_info->dst_len) && - fi->fib_priority >= mp->fib.priority) + mp->fib.dst_len <= fen_info->dst_len && + !(mp->fib.dst_len == fen_info->dst_len && + fi->fib_priority < mp->fib.priority)) { + mlx5_core_dbg(ldev->pf[idx].dev, + "Multipath entry with lower priority was rejected\n"); return; + } nh_dev0 = mlx5_lag_get_next_fib_dev(ldev, fi, NULL); nh_dev1 = mlx5_lag_get_next_fib_dev(ldev, fi, nh_dev0); From f63dce2d62d1267e71d2a1dfba741224ab0d41e2 Mon Sep 17 00:00:00 2001 From: Alexei Lazar Date: Thu, 25 Dec 2025 15:27:14 +0200 Subject: [PATCH 0726/1321] net/mlx5e: Don't gate FEC histograms on ppcnt_statistical_group Currently, the ppcnt_statistical_group capability check incorrectly gates access to FEC histogram statistics. This capability applies only to statistical and physical counter groups, not for histogram data. Restrict the ppcnt_statistical_group check to the Physical_Layer_Counters and Physical_Layer_Statistical_Counters groups. Histogram statistics access remains gated by the pphcr capability. The issue is harmless as of today, as it happens that ppcnt_statistical_group is set on all existing devices that have pphcr set. Fixes: 6b81b8a0b197 ("net/mlx5e: Don't query FEC statistics when FEC is disabled") Signed-off-by: Alexei Lazar Reviewed-by: Tariq Toukan Signed-off-by: Mark Bloch Link: https://patch.msgid.link/20251225132717.358820-3-mbloch@nvidia.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/mellanox/mlx5/core/en_stats.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c index a2802cfc9b989..a8af84fc97638 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c @@ -1608,12 +1608,13 @@ void mlx5e_stats_fec_get(struct mlx5e_priv *priv, { int mode = fec_active_mode(priv->mdev); - if (mode == MLX5E_FEC_NOFEC || - !MLX5_CAP_PCAM_FEATURE(priv->mdev, ppcnt_statistical_group)) + if (mode == MLX5E_FEC_NOFEC) return; - fec_set_corrected_bits_total(priv, fec_stats); - fec_set_block_stats(priv, mode, fec_stats); + if (MLX5_CAP_PCAM_FEATURE(priv->mdev, ppcnt_statistical_group)) { + fec_set_corrected_bits_total(priv, fec_stats); + fec_set_block_stats(priv, mode, fec_stats); + } if (MLX5_CAP_PCAM_REG(priv->mdev, pphcr)) fec_set_histograms_stats(priv, mode, hist); From 7baf67f05ee2c437a6c76544cf19a23716b8592b Mon Sep 17 00:00:00 2001 From: Gal Pressman Date: Thu, 25 Dec 2025 15:27:15 +0200 Subject: [PATCH 0727/1321] net/mlx5e: Fix NULL pointer dereference in ioctl module EEPROM query The mlx5_query_mcia() function unconditionally dereferences the status pointer to store the MCIA register status value. However, mlx5e_get_module_id() passes NULL since it doesn't need the status value. Add a NULL check before dereferencing the status pointer to prevent a NULL pointer dereference. Fixes: 2e4c44b12f4d ("net/mlx5: Refactor EEPROM query error handling to return status separately") Signed-off-by: Gal Pressman Reviewed-by: Tariq Toukan Reviewed-by: Dragos Tatulea Signed-off-by: Mark Bloch Link: https://patch.msgid.link/20251225132717.358820-4-mbloch@nvidia.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/mellanox/mlx5/core/port.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/port.c b/drivers/net/ethernet/mellanox/mlx5/core/port.c index 85a9e534f4427..8f36454dd1960 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/port.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/port.c @@ -393,9 +393,11 @@ static int mlx5_query_mcia(struct mlx5_core_dev *dev, if (err) return err; - *status = MLX5_GET(mcia_reg, out, status); - if (*status) + if (MLX5_GET(mcia_reg, out, status)) { + if (status) + *status = MLX5_GET(mcia_reg, out, status); return -EIO; + } ptr = MLX5_ADDR_OF(mcia_reg, out, dword_0); memcpy(data, ptr, size); From 80acfec5aca1f9263788a8b871c0d9519ddb86a5 Mon Sep 17 00:00:00 2001 From: Gal Pressman Date: Thu, 25 Dec 2025 15:27:16 +0200 Subject: [PATCH 0728/1321] net/mlx5e: Don't print error message due to invalid module Dumping module EEPROM on newer modules is supported through the netlink interface only. Querying with old userspace ethtool (or other tools, such as 'lshw') which still uses the ioctl interface results in an error message that could flood dmesg (in addition to the expected error return value). The original message was added under the assumption that the driver should be able to handle all module types, but now that such flows are easily triggered from userspace, it doesn't serve its purpose. Change the log level of the print in mlx5_query_module_eeprom() to debug. Fixes: bb64143eee8c ("net/mlx5e: Add ethtool support for dump module EEPROM") Signed-off-by: Gal Pressman Reviewed-by: Tariq Toukan Signed-off-by: Mark Bloch Link: https://patch.msgid.link/20251225132717.358820-5-mbloch@nvidia.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/mellanox/mlx5/core/port.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/port.c b/drivers/net/ethernet/mellanox/mlx5/core/port.c index 8f36454dd1960..7f8bed353e675 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/port.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/port.c @@ -431,7 +431,8 @@ int mlx5_query_module_eeprom(struct mlx5_core_dev *dev, mlx5_qsfp_eeprom_params_set(&query.i2c_address, &query.page, &offset); break; default: - mlx5_core_err(dev, "Module ID not recognized: 0x%x\n", module_id); + mlx5_core_dbg(dev, "Module ID not recognized: 0x%x\n", + module_id); return -EINVAL; } From 754db1549635dd974e82b8a40609271a8ba9a02a Mon Sep 17 00:00:00 2001 From: Cosmin Ratiu Date: Thu, 25 Dec 2025 15:27:17 +0200 Subject: [PATCH 0729/1321] net/mlx5e: Dealloc forgotten PSP RX modify header The commit which added RX steering rules for PSP forgot to free a modify header HW object on the cleanup path, which lead to health errors when reloading the driver and uninitializing the device: mlx5_core 0000:08:00.0: poll_health:803:(pid 3021): Fatal error 3 detected Fix that by saving the modify header pointer in the PSP steering struct and deallocating it after freeing the rule which references it. Fixes: 9536fbe10c9d ("net/mlx5e: Add PSP steering in local NIC RX") Signed-off-by: Cosmin Ratiu Reviewed-by: Dragos Tatulea Reviewed-by: Tariq Toukan Signed-off-by: Mark Bloch Link: https://patch.msgid.link/20251225132717.358820-6-mbloch@nvidia.com Signed-off-by: Jakub Kicinski --- .../net/ethernet/mellanox/mlx5/core/en_accel/psp.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/psp.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/psp.c index 38e7c77cc8514..9a74438ce10aa 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/psp.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/psp.c @@ -44,6 +44,7 @@ struct mlx5e_accel_fs_psp_prot { struct mlx5_flow_table *ft; struct mlx5_flow_group *miss_group; struct mlx5_flow_handle *miss_rule; + struct mlx5_modify_hdr *rx_modify_hdr; struct mlx5_flow_destination default_dest; struct mlx5e_psp_rx_err rx_err; u32 refcnt; @@ -286,13 +287,19 @@ static int accel_psp_fs_rx_err_create_ft(struct mlx5e_psp_fs *fs, return err; } -static void accel_psp_fs_rx_fs_destroy(struct mlx5e_accel_fs_psp_prot *fs_prot) +static void accel_psp_fs_rx_fs_destroy(struct mlx5e_psp_fs *fs, + struct mlx5e_accel_fs_psp_prot *fs_prot) { if (fs_prot->def_rule) { mlx5_del_flow_rules(fs_prot->def_rule); fs_prot->def_rule = NULL; } + if (fs_prot->rx_modify_hdr) { + mlx5_modify_header_dealloc(fs->mdev, fs_prot->rx_modify_hdr); + fs_prot->rx_modify_hdr = NULL; + } + if (fs_prot->miss_rule) { mlx5_del_flow_rules(fs_prot->miss_rule); fs_prot->miss_rule = NULL; @@ -396,6 +403,7 @@ static int accel_psp_fs_rx_create_ft(struct mlx5e_psp_fs *fs, modify_hdr = NULL; goto out_err; } + fs_prot->rx_modify_hdr = modify_hdr; flow_act.action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST | MLX5_FLOW_CONTEXT_ACTION_CRYPTO_DECRYPT | @@ -416,7 +424,7 @@ static int accel_psp_fs_rx_create_ft(struct mlx5e_psp_fs *fs, goto out; out_err: - accel_psp_fs_rx_fs_destroy(fs_prot); + accel_psp_fs_rx_fs_destroy(fs, fs_prot); out: kvfree(flow_group_in); kvfree(spec); @@ -433,7 +441,7 @@ static int accel_psp_fs_rx_destroy(struct mlx5e_psp_fs *fs, enum accel_fs_psp_ty /* The netdev unreg already happened, so all offloaded rule are already removed */ fs_prot = &accel_psp->fs_prot[type]; - accel_psp_fs_rx_fs_destroy(fs_prot); + accel_psp_fs_rx_fs_destroy(fs, fs_prot); accel_psp_fs_rx_err_destroy_ft(fs, &fs_prot->rx_err); From 62593bc81d22e3ea6599f6848dca8bca1c786ad5 Mon Sep 17 00:00:00 2001 From: Frank Liang Date: Wed, 31 Dec 2025 22:58:08 +0800 Subject: [PATCH 0730/1321] net/ena: fix missing lock when update devlink params Fix assert lock warning while calling devl_param_driverinit_value_set() in ena. WARNING: net/devlink/core.c:261 at devl_assert_locked+0x62/0x90, CPU#0: kworker/0:0/9 CPU: 0 UID: 0 PID: 9 Comm: kworker/0:0 Not tainted 6.19.0-rc2+ #1 PREEMPT(lazy) Hardware name: Amazon EC2 m8i-flex.4xlarge/, BIOS 1.0 10/16/2017 Workqueue: events work_for_cpu_fn RIP: 0010:devl_assert_locked+0x62/0x90 Call Trace: devl_param_driverinit_value_set+0x15/0x1c0 ena_devlink_alloc+0x18c/0x220 [ena] ? __pfx_ena_devlink_alloc+0x10/0x10 [ena] ? trace_hardirqs_on+0x18/0x140 ? lockdep_hardirqs_on+0x8c/0x130 ? __raw_spin_unlock_irqrestore+0x5d/0x80 ? __raw_spin_unlock_irqrestore+0x46/0x80 ? devm_ioremap_wc+0x9a/0xd0 ena_probe+0x4d2/0x1b20 [ena] ? __lock_acquire+0x56a/0xbd0 ? __pfx_ena_probe+0x10/0x10 [ena] ? local_clock+0x15/0x30 ? __lock_release.isra.0+0x1c9/0x340 ? mark_held_locks+0x40/0x70 ? lockdep_hardirqs_on_prepare.part.0+0x92/0x170 ? trace_hardirqs_on+0x18/0x140 ? lockdep_hardirqs_on+0x8c/0x130 ? __raw_spin_unlock_irqrestore+0x5d/0x80 ? __raw_spin_unlock_irqrestore+0x46/0x80 ? __pfx_ena_probe+0x10/0x10 [ena] ...... Fixes: 816b52624cf6 ("net: ena: Control PHC enable through devlink") Signed-off-by: Frank Liang Reviewed-by: David Arinzon Reviewed-by: Jiri Pirko Link: https://patch.msgid.link/20251231145808.6103-1-xiliang@redhat.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/amazon/ena/ena_devlink.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/net/ethernet/amazon/ena/ena_devlink.c b/drivers/net/ethernet/amazon/ena/ena_devlink.c index ac81c24016dd4..4772185e669d2 100644 --- a/drivers/net/ethernet/amazon/ena/ena_devlink.c +++ b/drivers/net/ethernet/amazon/ena/ena_devlink.c @@ -53,10 +53,12 @@ void ena_devlink_disable_phc_param(struct devlink *devlink) { union devlink_param_value value; + devl_lock(devlink); value.vbool = false; devl_param_driverinit_value_set(devlink, DEVLINK_PARAM_GENERIC_ID_ENABLE_PHC, value); + devl_unlock(devlink); } static void ena_devlink_port_register(struct devlink *devlink) @@ -145,10 +147,12 @@ static int ena_devlink_configure_params(struct devlink *devlink) return rc; } + devl_lock(devlink); value.vbool = ena_phc_is_enabled(adapter); devl_param_driverinit_value_set(devlink, DEVLINK_PARAM_GENERIC_ID_ENABLE_PHC, value); + devl_unlock(devlink); return 0; } From 089a7749d56c36f2b26090ac3fd082efc1e5130a Mon Sep 17 00:00:00 2001 From: Zilin Guan Date: Tue, 30 Dec 2025 07:18:53 +0000 Subject: [PATCH 0731/1321] net: wwan: iosm: Fix memory leak in ipc_mux_deinit() Commit 1f52d7b62285 ("net: wwan: iosm: Enable M.2 7360 WWAN card support") allocated memory for pp_qlt in ipc_mux_init() but did not free it in ipc_mux_deinit(). This results in a memory leak when the driver is unloaded. Free the allocated memory in ipc_mux_deinit() to fix the leak. Fixes: 1f52d7b62285 ("net: wwan: iosm: Enable M.2 7360 WWAN card support") Co-developed-by: Jianhao Xu Signed-off-by: Jianhao Xu Signed-off-by: Zilin Guan Reviewed-by: Loic Poulain Link: https://patch.msgid.link/20251230071853.1062223-1-zilin@seu.edu.cn Signed-off-by: Jakub Kicinski --- drivers/net/wwan/iosm/iosm_ipc_mux.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/drivers/net/wwan/iosm/iosm_ipc_mux.c b/drivers/net/wwan/iosm/iosm_ipc_mux.c index fc928b298a984..b846889fcb099 100644 --- a/drivers/net/wwan/iosm/iosm_ipc_mux.c +++ b/drivers/net/wwan/iosm/iosm_ipc_mux.c @@ -456,6 +456,7 @@ void ipc_mux_deinit(struct iosm_mux *ipc_mux) struct sk_buff_head *free_list; union mux_msg mux_msg; struct sk_buff *skb; + int i; if (!ipc_mux->initialized) return; @@ -479,5 +480,10 @@ void ipc_mux_deinit(struct iosm_mux *ipc_mux) ipc_mux->channel->dl_pipe.is_open = false; } + if (ipc_mux->protocol != MUX_LITE) { + for (i = 0; i < IPC_MEM_MUX_IP_SESSION_ENTRIES; i++) + kfree(ipc_mux->ul_adb.pp_qlt[i]); + } + kfree(ipc_mux); } From 549f7023ccc9452170064fc099c656a17c1b3957 Mon Sep 17 00:00:00 2001 From: Srijit Bose Date: Wed, 31 Dec 2025 00:36:25 -0800 Subject: [PATCH 0732/1321] bnxt_en: Fix potential data corruption with HW GRO/LRO Fix the max number of bits passed to find_first_zero_bit() in bnxt_alloc_agg_idx(). We were incorrectly passing the number of long words. find_first_zero_bit() may fail to find a zero bit and cause a wrong ID to be used. If the wrong ID is already in use, this can cause data corruption. Sometimes an error like this can also be seen: bnxt_en 0000:83:00.0 enp131s0np0: TPA end agg_buf 2 != expected agg_bufs 1 Fix it by passing the correct number of bits MAX_TPA_P5. Use DECLARE_BITMAP() to more cleanly define the bitmap. Add a sanity check to warn if a bit cannot be found and reset the ring [MChan]. Fixes: ec4d8e7cf024 ("bnxt_en: Add TPA ID mapping logic for 57500 chips.") Reviewed-by: Ray Jui Signed-off-by: Srijit Bose Signed-off-by: Michael Chan Reviewed-by: Vadim Fedorenko Link: https://patch.msgid.link/20251231083625.3911652-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 15 ++++++++++++--- drivers/net/ethernet/broadcom/bnxt/bnxt.h | 4 +--- 2 files changed, 13 insertions(+), 6 deletions(-) diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index d17d0ea89c364..d160e54ac121d 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -1482,9 +1482,11 @@ static u16 bnxt_alloc_agg_idx(struct bnxt_rx_ring_info *rxr, u16 agg_id) struct bnxt_tpa_idx_map *map = rxr->rx_tpa_idx_map; u16 idx = agg_id & MAX_TPA_P5_MASK; - if (test_bit(idx, map->agg_idx_bmap)) - idx = find_first_zero_bit(map->agg_idx_bmap, - BNXT_AGG_IDX_BMAP_SIZE); + if (test_bit(idx, map->agg_idx_bmap)) { + idx = find_first_zero_bit(map->agg_idx_bmap, MAX_TPA_P5); + if (idx >= MAX_TPA_P5) + return INVALID_HW_RING_ID; + } __set_bit(idx, map->agg_idx_bmap); map->agg_id_tbl[agg_id] = idx; return idx; @@ -1548,6 +1550,13 @@ static void bnxt_tpa_start(struct bnxt *bp, struct bnxt_rx_ring_info *rxr, if (bp->flags & BNXT_FLAG_CHIP_P5_PLUS) { agg_id = TPA_START_AGG_ID_P5(tpa_start); agg_id = bnxt_alloc_agg_idx(rxr, agg_id); + if (unlikely(agg_id == INVALID_HW_RING_ID)) { + netdev_warn(bp->dev, "Unable to allocate agg ID for ring %d, agg 0x%x\n", + rxr->bnapi->index, + TPA_START_AGG_ID_P5(tpa_start)); + bnxt_sched_reset_rxr(bp, rxr); + return; + } } else { agg_id = TPA_START_AGG_ID(tpa_start); } diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h index f5f07a7e6b297..f88e7769a838a 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h @@ -1080,11 +1080,9 @@ struct bnxt_tpa_info { struct rx_agg_cmp *agg_arr; }; -#define BNXT_AGG_IDX_BMAP_SIZE (MAX_TPA_P5 / BITS_PER_LONG) - struct bnxt_tpa_idx_map { u16 agg_id_tbl[1024]; - unsigned long agg_idx_bmap[BNXT_AGG_IDX_BMAP_SIZE]; + DECLARE_BITMAP(agg_idx_bmap, MAX_TPA_P5); }; struct bnxt_rx_ring_info { From a1a5800feb91933855cc75e58fc7b6474a2aab55 Mon Sep 17 00:00:00 2001 From: Kommula Shiva Shankar Date: Fri, 2 Jan 2026 15:49:00 +0530 Subject: [PATCH 0733/1321] virtio_net: fix device mismatch in devm_kzalloc/devm_kfree Initial rss_hdr allocation uses virtio_device->device, but virtnet_set_queues() frees using net_device->device. This device mismatch causing below devres warning [ 3788.514041] ------------[ cut here ]------------ [ 3788.514044] WARNING: drivers/base/devres.c:1095 at devm_kfree+0x84/0x98, CPU#16: vdpa/1463 [ 3788.514054] Modules linked in: octep_vdpa virtio_net virtio_vdpa [last unloaded: virtio_vdpa] [ 3788.514064] CPU: 16 UID: 0 PID: 1463 Comm: vdpa Tainted: G W 6.18.0 #10 PREEMPT [ 3788.514067] Tainted: [W]=WARN [ 3788.514069] Hardware name: Marvell CN106XX board (DT) [ 3788.514071] pstate: 63400009 (nZCv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--) [ 3788.514074] pc : devm_kfree+0x84/0x98 [ 3788.514076] lr : devm_kfree+0x54/0x98 [ 3788.514079] sp : ffff800084e2f220 [ 3788.514080] x29: ffff800084e2f220 x28: ffff0003b2366000 x27: 000000000000003f [ 3788.514085] x26: 000000000000003f x25: ffff000106f17c10 x24: 0000000000000080 [ 3788.514089] x23: ffff00045bb8ab08 x22: ffff00045bb8a000 x21: 0000000000000018 [ 3788.514093] x20: ffff0004355c3080 x19: ffff00045bb8aa00 x18: 0000000000080000 [ 3788.514098] x17: 0000000000000040 x16: 000000000000001f x15: 000000000007ffff [ 3788.514102] x14: 0000000000000488 x13: 0000000000000005 x12: 00000000000fffff [ 3788.514106] x11: ffffffffffffffff x10: 0000000000000005 x9 : ffff800080c8c05c [ 3788.514110] x8 : ffff800084e2eeb8 x7 : 0000000000000000 x6 : 000000000000003f [ 3788.514115] x5 : ffff8000831bafe0 x4 : ffff800080c8b010 x3 : ffff0004355c3080 [ 3788.514119] x2 : ffff0004355c3080 x1 : 0000000000000000 x0 : 0000000000000000 [ 3788.514123] Call trace: [ 3788.514125] devm_kfree+0x84/0x98 (P) [ 3788.514129] virtnet_set_queues+0x134/0x2e8 [virtio_net] [ 3788.514135] virtnet_probe+0x9c0/0xe00 [virtio_net] [ 3788.514139] virtio_dev_probe+0x1e0/0x338 [ 3788.514144] really_probe+0xc8/0x3a0 [ 3788.514149] __driver_probe_device+0x84/0x170 [ 3788.514152] driver_probe_device+0x44/0x120 [ 3788.514155] __device_attach_driver+0xc4/0x168 [ 3788.514158] bus_for_each_drv+0x8c/0xf0 [ 3788.514161] __device_attach+0xa4/0x1c0 [ 3788.514164] device_initial_probe+0x1c/0x30 [ 3788.514168] bus_probe_device+0xb4/0xc0 [ 3788.514170] device_add+0x614/0x828 [ 3788.514173] register_virtio_device+0x214/0x258 [ 3788.514175] virtio_vdpa_probe+0xa0/0x110 [virtio_vdpa] [ 3788.514179] vdpa_dev_probe+0xa8/0xd8 [ 3788.514183] really_probe+0xc8/0x3a0 [ 3788.514186] __driver_probe_device+0x84/0x170 [ 3788.514189] driver_probe_device+0x44/0x120 [ 3788.514192] __device_attach_driver+0xc4/0x168 [ 3788.514195] bus_for_each_drv+0x8c/0xf0 [ 3788.514197] __device_attach+0xa4/0x1c0 [ 3788.514200] device_initial_probe+0x1c/0x30 [ 3788.514203] bus_probe_device+0xb4/0xc0 [ 3788.514206] device_add+0x614/0x828 [ 3788.514209] _vdpa_register_device+0x58/0x88 [ 3788.514211] octep_vdpa_dev_add+0x104/0x228 [octep_vdpa] [ 3788.514215] vdpa_nl_cmd_dev_add_set_doit+0x2d0/0x3c0 [ 3788.514218] genl_family_rcv_msg_doit+0xe4/0x158 [ 3788.514222] genl_rcv_msg+0x218/0x298 [ 3788.514225] netlink_rcv_skb+0x64/0x138 [ 3788.514229] genl_rcv+0x40/0x60 [ 3788.514233] netlink_unicast+0x32c/0x3b0 [ 3788.514237] netlink_sendmsg+0x170/0x3b8 [ 3788.514241] __sys_sendto+0x12c/0x1c0 [ 3788.514246] __arm64_sys_sendto+0x30/0x48 [ 3788.514249] invoke_syscall.constprop.0+0x58/0xf8 [ 3788.514255] do_el0_svc+0x48/0xd0 [ 3788.514259] el0_svc+0x48/0x210 [ 3788.514264] el0t_64_sync_handler+0xa0/0xe8 [ 3788.514268] el0t_64_sync+0x198/0x1a0 [ 3788.514271] ---[ end trace 0000000000000000 ]--- Fix by using virtio_device->device consistently for allocation and deallocation Fixes: 4944be2f5ad8c ("virtio_net: Allocate rss_hdr with devres") Signed-off-by: Kommula Shiva Shankar Acked-by: Michael S. Tsirkin Acked-by: Jason Wang Reviewed-by: Xuan Zhuo Link: https://patch.msgid.link/20260102101900.692770-1-kshankar@marvell.com Signed-off-by: Jakub Kicinski --- drivers/net/virtio_net.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 1bb3aeca66c6e..22d894101c018 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -3791,7 +3791,7 @@ static int virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) if (vi->has_rss && !netif_is_rxfh_configured(dev)) { old_rss_hdr = vi->rss_hdr; old_rss_trailer = vi->rss_trailer; - vi->rss_hdr = devm_kzalloc(&dev->dev, virtnet_rss_hdr_size(vi), GFP_KERNEL); + vi->rss_hdr = devm_kzalloc(&vi->vdev->dev, virtnet_rss_hdr_size(vi), GFP_KERNEL); if (!vi->rss_hdr) { vi->rss_hdr = old_rss_hdr; return -ENOMEM; @@ -3802,7 +3802,7 @@ static int virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) if (!virtnet_commit_rss_command(vi)) { /* restore ctrl_rss if commit_rss_command failed */ - devm_kfree(&dev->dev, vi->rss_hdr); + devm_kfree(&vi->vdev->dev, vi->rss_hdr); vi->rss_hdr = old_rss_hdr; vi->rss_trailer = old_rss_trailer; @@ -3810,7 +3810,7 @@ static int virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) queue_pairs); return -EINVAL; } - devm_kfree(&dev->dev, old_rss_hdr); + devm_kfree(&vi->vdev->dev, old_rss_hdr); goto succ; } From 3ea8d2c8f3234302646b920d93efdc57a3676341 Mon Sep 17 00:00:00 2001 From: Florian Westphal Date: Fri, 2 Jan 2026 15:00:07 +0100 Subject: [PATCH 0734/1321] inet: frags: drop fraglist conntrack references Jakub added a warning in nf_conntrack_cleanup_net_list() to make debugging leaked skbs/conntrack references more obvious. syzbot reports this as triggering, and I can also reproduce this via ip_defrag.sh selftest: conntrack cleanup blocked for 60s WARNING: net/netfilter/nf_conntrack_core.c:2512 [..] conntrack clenups gets stuck because there are skbs with still hold nf_conn references via their frag_list. net.core.skb_defer_max=0 makes the hang disappear. Eric Dumazet points out that skb_release_head_state() doesn't follow the fraglist. ip_defrag.sh can only reproduce this problem since commit 6471658dc66c ("udp: use skb_attempt_defer_free()"), but AFAICS this problem could happen with TCP as well if pmtu discovery is off. The relevant problem path for udp is: 1. netns emits fragmented packets 2. nf_defrag_v6_hook reassembles them (in output hook) 3. reassembled skb is tracked (skb owns nf_conn reference) 4. ip6_output refragments 5. refragmented packets also own nf_conn reference (ip6_fragment calls ip6_copy_metadata()) 6. on input path, nf_defrag_v6_hook skips defragmentation: the fragments already have skb->nf_conn attached 7. skbs are reassembled via ipv6_frag_rcv() 8. skb_consume_udp -> skb_attempt_defer_free() -> skb ends up in pcpu freelist, but still has nf_conn reference. Possible solutions: 1 let defrag engine drop nf_conn entry, OR 2 export kick_defer_list_purge() and call it from the conntrack netns exit callback, OR 3 add skb_has_frag_list() check to skb_attempt_defer_free() 2 & 3 also solve ip_defrag.sh hang but share same drawback: Such reassembled skbs, queued to socket, can prevent conntrack module removal until userspace has consumed the packet. While both tcp and udp stack do call nf_reset_ct() before placing skb on socket queue, that function doesn't iterate frag_list skbs. Therefore drop nf_conn entries when they are placed in defrag queue. Keep the nf_conn entry of the first (offset 0) skb so that reassembled skb retains nf_conn entry for sake of TX path. Note that fixes tag is incorrect; it points to the commit introducing the 'ip_defrag.sh reproducible problem': no need to backport this patch to every stable kernel. Reported-by: syzbot+4393c47753b7808dac7d@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/693b0fa7.050a0220.4004e.040d.GAE@google.com/ Fixes: 6471658dc66c ("udp: use skb_attempt_defer_free()") Signed-off-by: Florian Westphal Reviewed-by: Eric Dumazet Link: https://patch.msgid.link/20260102140030.32367-1-fw@strlen.de Signed-off-by: Jakub Kicinski --- net/ipv4/inet_fragment.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/net/ipv4/inet_fragment.c b/net/ipv4/inet_fragment.c index 001ee5c4d962e..4e6d7467ed444 100644 --- a/net/ipv4/inet_fragment.c +++ b/net/ipv4/inet_fragment.c @@ -488,6 +488,8 @@ int inet_frag_queue_insert(struct inet_frag_queue *q, struct sk_buff *skb, } FRAG_CB(skb)->ip_defrag_offset = offset; + if (offset) + nf_reset_ct(skb); return IPFRAG_OK; } From 2c1f3ee1675e26a25a32fd3c3ee14b5bc574c839 Mon Sep 17 00:00:00 2001 From: Florian Westphal Date: Thu, 4 Dec 2025 12:20:35 +0100 Subject: [PATCH 0735/1321] netfilter: nft_set_pipapo: fix range overlap detection set->klen has to be used, not sizeof(). The latter only compares a single register but a full check of the entire key is needed. Example: table ip t { map s { typeof iifname . ip saddr : verdict flags interval } } nft add element t s '{ "lo" . 10.0.0.0/24 : drop }' # no error, expected nft add element t s '{ "lo" . 10.0.0.0/24 : drop }' # no error, expected nft add element t s '{ "lo" . 10.0.0.0/8 : drop }' # bug: no error The 3rd 'add element' should be rejected via -ENOTEMPTY, not -EEXIST, so userspace / nft can report an error to the user. The latter is only correct for the 2nd case (re-add of existing element). As-is, userspace is told that the command was successful, but no elements were added. After this patch, 3rd command gives: Error: Could not process rule: File exists add element t s { "lo" . 127.0.0.0/8 . "lo" : drop } ^^^^^^^^^^^^^^^^^^^^^^^^^ Fixes: 0eb4b5ee33f2 ("netfilter: nft_set_pipapo: Separate partial and complete overlap cases on insertion") Signed-off-by: Florian Westphal --- net/netfilter/nft_set_pipapo.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/net/netfilter/nft_set_pipapo.c b/net/netfilter/nft_set_pipapo.c index 112fe46788b6f..6d77a5f0088ad 100644 --- a/net/netfilter/nft_set_pipapo.c +++ b/net/netfilter/nft_set_pipapo.c @@ -1317,8 +1317,8 @@ static int nft_pipapo_insert(const struct net *net, const struct nft_set *set, else dup_end = dup_key; - if (!memcmp(start, dup_key->data, sizeof(*dup_key->data)) && - !memcmp(end, dup_end->data, sizeof(*dup_end->data))) { + if (!memcmp(start, dup_key->data, set->klen) && + !memcmp(end, dup_end->data, set->klen)) { *elem_priv = &dup->priv; return -EEXIST; } From 4b22c52e2f1a0ab93309731e34b60b9672abf674 Mon Sep 17 00:00:00 2001 From: Florian Westphal Date: Thu, 4 Dec 2025 12:20:36 +0100 Subject: [PATCH 0736/1321] selftests: netfilter: nft_concat_range.sh: add check for overlap detection bug without 'netfilter: nft_set_pipapo: fix range overlap detection': reject overlapping range on add 0s [FAIL] Returned success for add { 1.2.3.4 . 1.2.4.1-1.2.4.2 } given set: table inet filter { [..] elements = { 1.2.3.4 . 1.2.4.1 counter packets 0 bytes 0, 1.2.3.0-1.2.3.4 . 1.2.4.2 counter packets 0 bytes 0 } } The element collides with existing ones and was not added, but kernel returned success to userspace. Signed-off-by: Florian Westphal --- .../net/netfilter/nft_concat_range.sh | 45 ++++++++++++++++++- 1 file changed, 44 insertions(+), 1 deletion(-) diff --git a/tools/testing/selftests/net/netfilter/nft_concat_range.sh b/tools/testing/selftests/net/netfilter/nft_concat_range.sh index ad97c6227f351..394166f224a4b 100755 --- a/tools/testing/selftests/net/netfilter/nft_concat_range.sh +++ b/tools/testing/selftests/net/netfilter/nft_concat_range.sh @@ -29,7 +29,7 @@ TYPES="net_port port_net net6_port port_proto net6_port_mac net6_port_mac_proto net6_port_net6_port net_port_mac_proto_net" # Reported bugs, also described by TYPE_ variables below -BUGS="flush_remove_add reload net_port_proto_match avx2_mismatch doublecreate" +BUGS="flush_remove_add reload net_port_proto_match avx2_mismatch doublecreate insert_overlap" # List of possible paths to pktgen script from kernel tree for performance tests PKTGEN_SCRIPT_PATHS=" @@ -420,6 +420,18 @@ race_repeat 0 perf_duration 0 " +TYPE_insert_overlap=" +display reject overlapping range on add +type_spec ipv4_addr . ipv4_addr +chain_spec ip saddr . ip daddr +dst addr4 +proto icmp + +race_repeat 0 + +perf_duration 0 +" + # Set template for all tests, types and rules are filled in depending on test set_template=' flush ruleset @@ -1954,6 +1966,37 @@ EOF return 0 } +add_fail() +{ + if nft add element inet filter test "$1" 2>/dev/null ; then + err "Returned success for add ${1} given set:" + err "$(nft -a list set inet filter test )" + return 1 + fi + + return 0 +} + +test_bug_insert_overlap() +{ + local elements="1.2.3.4 . 1.2.4.1" + + setup veth send_"${proto}" set || return ${ksft_skip} + + add "{ $elements }" || return 1 + + elements="1.2.3.0-1.2.3.4 . 1.2.4.1" + add_fail "{ $elements }" || return 1 + + elements="1.2.3.0-1.2.3.4 . 1.2.4.2" + add "{ $elements }" || return 1 + + elements="1.2.3.4 . 1.2.4.1-1.2.4.2" + add_fail "{ $elements }" || return 1 + + return 0 +} + test_reported_issues() { eval test_bug_"${subtest}" } From fd49baac3703f76acc911f5c14e51e11a0926e38 Mon Sep 17 00:00:00 2001 From: Fernando Fernandez Mancera Date: Wed, 17 Dec 2025 21:21:59 +0100 Subject: [PATCH 0737/1321] netfilter: nft_synproxy: avoid possible data-race on update operation During nft_synproxy eval we are reading nf_synproxy_info struct which can be modified on update operation concurrently. As nf_synproxy_info struct fits in 32 bits, use READ_ONCE/WRITE_ONCE annotations. Fixes: ee394f96ad75 ("netfilter: nft_synproxy: add synproxy stateful object support") Signed-off-by: Fernando Fernandez Mancera Signed-off-by: Florian Westphal --- net/netfilter/nft_synproxy.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/net/netfilter/nft_synproxy.c b/net/netfilter/nft_synproxy.c index 5d3e518259859..4d3e5a31b4125 100644 --- a/net/netfilter/nft_synproxy.c +++ b/net/netfilter/nft_synproxy.c @@ -48,7 +48,7 @@ static void nft_synproxy_eval_v4(const struct nft_synproxy *priv, struct tcphdr *_tcph, struct synproxy_options *opts) { - struct nf_synproxy_info info = priv->info; + struct nf_synproxy_info info = READ_ONCE(priv->info); struct net *net = nft_net(pkt); struct synproxy_net *snet = synproxy_pernet(net); struct sk_buff *skb = pkt->skb; @@ -79,7 +79,7 @@ static void nft_synproxy_eval_v6(const struct nft_synproxy *priv, struct tcphdr *_tcph, struct synproxy_options *opts) { - struct nf_synproxy_info info = priv->info; + struct nf_synproxy_info info = READ_ONCE(priv->info); struct net *net = nft_net(pkt); struct synproxy_net *snet = synproxy_pernet(net); struct sk_buff *skb = pkt->skb; @@ -340,7 +340,7 @@ static void nft_synproxy_obj_update(struct nft_object *obj, struct nft_synproxy *newpriv = nft_obj_data(newobj); struct nft_synproxy *priv = nft_obj_data(obj); - priv->info = newpriv->info; + WRITE_ONCE(priv->info, newpriv->info); } static struct nft_object_type nft_synproxy_obj_type; From 869ea1dd3516c064b308892182778ce478fc6c57 Mon Sep 17 00:00:00 2001 From: Daniel Gomez Date: Fri, 19 Dec 2025 06:13:20 +0100 Subject: [PATCH 0738/1321] netfilter: replace -EEXIST with -EBUSY The -EEXIST error code is reserved by the module loading infrastructure to indicate that a module is already loaded. When a module's init function returns -EEXIST, userspace tools like kmod interpret this as "module already loaded" and treat the operation as successful, returning 0 to the user even though the module initialization actually failed. Replace -EEXIST with -EBUSY to ensure correct error reporting in the module initialization path. Affected modules: * ebtable_broute ebtable_filter ebtable_nat arptable_filter * ip6table_filter ip6table_mangle ip6table_nat ip6table_raw * ip6table_security iptable_filter iptable_mangle iptable_nat * iptable_raw iptable_security Signed-off-by: Daniel Gomez Signed-off-by: Florian Westphal --- net/bridge/netfilter/ebtables.c | 2 +- net/netfilter/nf_log.c | 4 ++-- net/netfilter/x_tables.c | 2 +- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c index 5697e3949a365..a04fc17575289 100644 --- a/net/bridge/netfilter/ebtables.c +++ b/net/bridge/netfilter/ebtables.c @@ -1299,7 +1299,7 @@ int ebt_register_template(const struct ebt_table *t, int (*table_init)(struct ne list_for_each_entry(tmpl, &template_tables, list) { if (WARN_ON_ONCE(strcmp(t->name, tmpl->name) == 0)) { mutex_unlock(&ebt_mutex); - return -EEXIST; + return -EBUSY; } } diff --git a/net/netfilter/nf_log.c b/net/netfilter/nf_log.c index 74cef8bf554c5..62cf6a30875e3 100644 --- a/net/netfilter/nf_log.c +++ b/net/netfilter/nf_log.c @@ -89,7 +89,7 @@ int nf_log_register(u_int8_t pf, struct nf_logger *logger) if (pf == NFPROTO_UNSPEC) { for (i = NFPROTO_UNSPEC; i < NFPROTO_NUMPROTO; i++) { if (rcu_access_pointer(loggers[i][logger->type])) { - ret = -EEXIST; + ret = -EBUSY; goto unlock; } } @@ -97,7 +97,7 @@ int nf_log_register(u_int8_t pf, struct nf_logger *logger) rcu_assign_pointer(loggers[i][logger->type], logger); } else { if (rcu_access_pointer(loggers[pf][logger->type])) { - ret = -EEXIST; + ret = -EBUSY; goto unlock; } rcu_assign_pointer(loggers[pf][logger->type], logger); diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c index 90b7630421c44..48105ea3df152 100644 --- a/net/netfilter/x_tables.c +++ b/net/netfilter/x_tables.c @@ -1764,7 +1764,7 @@ EXPORT_SYMBOL_GPL(xt_hook_ops_alloc); int xt_register_template(const struct xt_table *table, int (*table_init)(struct net *net)) { - int ret = -EEXIST, af = table->af; + int ret = -EBUSY, af = table->af; struct xt_template *t; mutex_lock(&xt[af].mutex); From ab4db1cd96502bbf80e51088895471d39c9faa74 Mon Sep 17 00:00:00 2001 From: Zilin Guan Date: Wed, 24 Dec 2025 12:48:26 +0000 Subject: [PATCH 0739/1321] netfilter: nf_tables: fix memory leak in nf_tables_newrule() In nf_tables_newrule(), if nft_use_inc() fails, the function jumps to the err_release_rule label without freeing the allocated flow, leading to a memory leak. Fix this by adding a new label err_destroy_flow and jumping to it when nft_use_inc() fails. This ensures that the flow is properly released in this error case. Fixes: 1689f25924ada ("netfilter: nf_tables: report use refcount overflow") Signed-off-by: Zilin Guan Signed-off-by: Florian Westphal --- net/netfilter/nf_tables_api.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index 618af6e90773f..729a92781a1a4 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -4439,7 +4439,7 @@ static int nf_tables_newrule(struct sk_buff *skb, const struct nfnl_info *info, if (!nft_use_inc(&chain->use)) { err = -EMFILE; - goto err_release_rule; + goto err_destroy_flow; } if (info->nlh->nlmsg_flags & NLM_F_REPLACE) { @@ -4489,6 +4489,7 @@ static int nf_tables_newrule(struct sk_buff *skb, const struct nfnl_info *info, err_destroy_flow_rule: nft_use_dec_restore(&chain->use); +err_destroy_flow: if (flow) nft_flow_rule_destroy(flow); err_release_rule: From 396764cb237518be93e60dcb31a39331dce044c1 Mon Sep 17 00:00:00 2001 From: Fernando Fernandez Mancera Date: Wed, 17 Dec 2025 15:46:40 +0100 Subject: [PATCH 0740/1321] netfilter: nf_conncount: update last_gc only when GC has been performed Currently last_gc is being updated everytime a new connection is tracked, that means that it is updated even if a GC wasn't performed. With a sufficiently high packet rate, it is possible to always bypass the GC, causing the list to grow infinitely. Update the last_gc value only when a GC has been actually performed. Fixes: d265929930e2 ("netfilter: nf_conncount: reduce unnecessary GC") Signed-off-by: Fernando Fernandez Mancera Signed-off-by: Florian Westphal --- net/netfilter/nf_conncount.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/netfilter/nf_conncount.c b/net/netfilter/nf_conncount.c index 3654f1e8976c9..8487808c87614 100644 --- a/net/netfilter/nf_conncount.c +++ b/net/netfilter/nf_conncount.c @@ -229,6 +229,7 @@ static int __nf_conncount_add(struct net *net, nf_ct_put(found_ct); } + list->last_gc = (u32)jiffies; add_new_node: if (WARN_ON_ONCE(list->count > INT_MAX)) { @@ -248,7 +249,6 @@ static int __nf_conncount_add(struct net *net, conn->jiffies32 = (u32)jiffies; list_add_tail(&conn->node, &list->head); list->count++; - list->last_gc = (u32)jiffies; out_put: if (refcounted) From 1adce9017fa465a1ef296d9d731ab6055dfdaf6e Mon Sep 17 00:00:00 2001 From: Justin Iurman Date: Sat, 3 Jan 2026 17:53:31 +0100 Subject: [PATCH 0741/1321] MAINTAINERS: Update email address for Justin Iurman Due to a change of employer, I'll be using a permanent and personal email address. Signed-off-by: Justin Iurman Link: https://patch.msgid.link/20260103165331.20120-1-justin.iurman@gmail.com Signed-off-by: Jakub Kicinski --- .mailmap | 1 + MAINTAINERS | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/.mailmap b/.mailmap index 1873cfa159378..e071d0a08c933 100644 --- a/.mailmap +++ b/.mailmap @@ -416,6 +416,7 @@ Juha Yrjola Juha Yrjola Juha Yrjola Julien Thierry +Justin Iurman Iskren Chernev Kalle Valo Kalle Valo diff --git a/MAINTAINERS b/MAINTAINERS index a0dd762f5648b..7bc0723ae5415 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -18283,7 +18283,7 @@ X: net/wireless/ X: tools/testing/selftests/net/can/ NETWORKING [IOAM] -M: Justin Iurman +M: Justin Iurman S: Maintained F: Documentation/networking/ioam6* F: include/linux/ioam6* From a8df9aa2a074ab68255282712dea774c0ddfd311 Mon Sep 17 00:00:00 2001 From: Michal Luczaj Date: Mon, 29 Dec 2025 20:43:10 +0100 Subject: [PATCH 0742/1321] vsock: Make accept()ed sockets use custom setsockopt() SO_ZEROCOPY handling in vsock_connectible_setsockopt() does not get called on accept()ed sockets due to a missing flag. Flip it. Fixes: e0718bd82e27 ("vsock: enable setting SO_ZEROCOPY") Signed-off-by: Michal Luczaj Link: https://patch.msgid.link/20251229-vsock-child-sock-custom-sockopt-v2-1-64778d6c4f88@rbox.co Signed-off-by: Jakub Kicinski --- net/vmw_vsock/af_vsock.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index adcba1b7bf74a..a3505a4dcee0b 100644 --- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -1787,6 +1787,10 @@ static int vsock_accept(struct socket *sock, struct socket *newsock, } else { newsock->state = SS_CONNECTED; sock_graft(connected, newsock); + + set_bit(SOCK_CUSTOM_SOCKOPT, + &connected->sk_socket->flags); + if (vsock_msgzerocopy_allow(vconnected->transport)) set_bit(SOCK_SUPPORT_ZC, &connected->sk_socket->flags); From 882e7fddadaa88a754ae38285abf69887716ab98 Mon Sep 17 00:00:00 2001 From: Michal Luczaj Date: Mon, 29 Dec 2025 20:43:11 +0100 Subject: [PATCH 0743/1321] vsock/test: Test setting SO_ZEROCOPY on accept()ed socket Make sure setsockopt(SOL_SOCKET, SO_ZEROCOPY) on an accept()ed socket is handled by vsock's implementation. Reviewed-by: Stefano Garzarella Signed-off-by: Michal Luczaj Link: https://patch.msgid.link/20251229-vsock-child-sock-custom-sockopt-v2-2-64778d6c4f88@rbox.co Signed-off-by: Jakub Kicinski --- tools/testing/vsock/vsock_test.c | 32 ++++++++++++++++++++++++++++++++ 1 file changed, 32 insertions(+) diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c index 9e1250790f33d..bbe3723babdc4 100644 --- a/tools/testing/vsock/vsock_test.c +++ b/tools/testing/vsock/vsock_test.c @@ -2192,6 +2192,33 @@ static void test_stream_nolinger_server(const struct test_opts *opts) close(fd); } +static void test_stream_accepted_setsockopt_client(const struct test_opts *opts) +{ + int fd; + + fd = vsock_stream_connect(opts->peer_cid, opts->peer_port); + if (fd < 0) { + perror("connect"); + exit(EXIT_FAILURE); + } + + close(fd); +} + +static void test_stream_accepted_setsockopt_server(const struct test_opts *opts) +{ + int fd; + + fd = vsock_stream_accept(VMADDR_CID_ANY, opts->peer_port, NULL); + if (fd < 0) { + perror("accept"); + exit(EXIT_FAILURE); + } + + enable_so_zerocopy_check(fd); + close(fd); +} + static struct test_case test_cases[] = { { .name = "SOCK_STREAM connection reset", @@ -2371,6 +2398,11 @@ static struct test_case test_cases[] = { .run_client = test_seqpacket_unread_bytes_client, .run_server = test_seqpacket_unread_bytes_server, }, + { + .name = "SOCK_STREAM accept()ed socket custom setsockopt()", + .run_client = test_stream_accepted_setsockopt_client, + .run_server = test_stream_accepted_setsockopt_server, + }, {}, }; From 3938c539db7960909f46293a2cbee8eb36a6ac05 Mon Sep 17 00:00:00 2001 From: Jamal Hadi Salim Date: Thu, 1 Jan 2026 08:56:07 -0500 Subject: [PATCH 0744/1321] net/sched: act_mirred: Fix leak when redirecting to self on egress Whenever a mirred redirect to self on egress happens, mirred allocates a new skb (skb_to_send). The loop to self check was done after that allocation, but was not freeing the newly allocated skb, causing a leak. Fix this by moving the if-statement to before the allocation of the new skb. The issue was found by running the accompanying tdc test in 2/2 with config kmemleak enabled. After a few minutes the kmemleak thread ran and reported the leak coming from mirred. Fixes: 1d856251a009 ("net/sched: act_mirred: fix loop detection") Signed-off-by: Jamal Hadi Salim Link: https://patch.msgid.link/20260101135608.253079-2-jhs@mojatatu.com Signed-off-by: Jakub Kicinski --- net/sched/act_mirred.c | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c index 91c96cc625bd6..05e0b14b57731 100644 --- a/net/sched/act_mirred.c +++ b/net/sched/act_mirred.c @@ -266,11 +266,22 @@ static int tcf_mirred_to_dev(struct sk_buff *skb, struct tcf_mirred *m, goto err_cant_do; } + want_ingress = tcf_mirred_act_wants_ingress(m_eaction); + + at_ingress = skb_at_tc_ingress(skb); + if (dev == skb->dev && want_ingress == at_ingress) { + pr_notice_once("tc mirred: Loop (%s:%s --> %s:%s)\n", + netdev_name(skb->dev), + at_ingress ? "ingress" : "egress", + netdev_name(dev), + want_ingress ? "ingress" : "egress"); + goto err_cant_do; + } + /* we could easily avoid the clone only if called by ingress and clsact; * since we can't easily detect the clsact caller, skip clone only for * ingress - that covers the TC S/W datapath. */ - at_ingress = skb_at_tc_ingress(skb); dont_clone = skb_at_tc_ingress(skb) && is_redirect && tcf_mirred_can_reinsert(retval); if (!dont_clone) { @@ -279,17 +290,6 @@ static int tcf_mirred_to_dev(struct sk_buff *skb, struct tcf_mirred *m, goto err_cant_do; } - want_ingress = tcf_mirred_act_wants_ingress(m_eaction); - - if (dev == skb->dev && want_ingress == at_ingress) { - pr_notice_once("tc mirred: Loop (%s:%s --> %s:%s)\n", - netdev_name(skb->dev), - at_ingress ? "ingress" : "egress", - netdev_name(dev), - want_ingress ? "ingress" : "egress"); - goto err_cant_do; - } - /* All mirred/redirected skbs should clear previous ct info */ nf_reset_ct(skb_to_send); if (want_ingress && !at_ingress) /* drop dst for egress -> ingress */ From c03506db28e99a371ffdc6b3dbc8a34447ba782c Mon Sep 17 00:00:00 2001 From: Victor Nogueira Date: Thu, 1 Jan 2026 08:56:08 -0500 Subject: [PATCH 0745/1321] selftests/tc-testing: Add test case redirecting to self on egress Add single mirred test case that attempts to redirect to self on egress using clsact Signed-off-by: Victor Nogueira Link: https://patch.msgid.link/20260101135608.253079-3-jhs@mojatatu.com Signed-off-by: Jakub Kicinski --- .../tc-testing/tc-tests/actions/mirred.json | 47 +++++++++++++++++++ 1 file changed, 47 insertions(+) diff --git a/tools/testing/selftests/tc-testing/tc-tests/actions/mirred.json b/tools/testing/selftests/tc-testing/tc-tests/actions/mirred.json index da156feabcbff..b056eb9668718 100644 --- a/tools/testing/selftests/tc-testing/tc-tests/actions/mirred.json +++ b/tools/testing/selftests/tc-testing/tc-tests/actions/mirred.json @@ -1098,5 +1098,52 @@ "teardown": [ "$TC qdisc del dev $DUMMY root" ] + }, + { + "id": "4ed9", + "name": "Try to redirect to self on egress with clsact", + "category": [ + "filter", + "mirred" + ], + "plugins": { + "requires": [ + "nsPlugin" + ] + }, + "setup": [ + "$IP link set dev $DUMMY up || true", + "$IP addr add 10.10.10.10/24 dev $DUMMY || true", + "$TC qdisc add dev $DUMMY clsact", + "$TC filter add dev $DUMMY egress protocol ip prio 10 matchall action mirred egress redirect dev $DUMMY index 1" + ], + "cmdUnderTest": "ping -c1 -W0.01 -I $DUMMY 10.10.10.1", + "expExitCode": "1", + "verifyCmd": "$TC -j -s actions get action mirred index 1", + "matchJSON": [ + { + "total acts": 0 + }, + { + "actions": [ + { + "order": 1, + "kind": "mirred", + "mirred_action": "redirect", + "direction": "egress", + "index": 1, + "stats": { + "packets": 1, + "overlimits": 1 + }, + "not_in_hw": true + } + ] + } + ], + "teardown": [ + "$TC qdisc del dev $DUMMY clsact" + ] } + ] From e5e2ac2a85fd65e6a81ee7502d7ad5704d24cbb4 Mon Sep 17 00:00:00 2001 From: Ankit Khushwaha Date: Thu, 1 Jan 2026 22:58:40 +0530 Subject: [PATCH 0746/1321] selftests: mptcp: Mark xerror and die_perror __noreturn Compiler reports potential uses of uninitialized variables in mptcp_connect.c when xerror() is called from failure paths. mptcp_connect.c:1262:11: warning: variable 'raw_addr' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized] xerror() terminates execution by calling exit(), but it is not visible to the compiler & assumes control flow may continue past the call. Annotate xerror() with __noreturn so the compiler can correctly reason about control flow and avoid false-positive uninitialized variable warnings. Signed-off-by: Ankit Khushwaha Link: https://patch.msgid.link/20260101172840.90186-1-ankitkhushwaha.linux@gmail.com Signed-off-by: Jakub Kicinski --- tools/testing/selftests/net/mptcp/Makefile | 1 + tools/testing/selftests/net/mptcp/mptcp_connect.c | 3 ++- tools/testing/selftests/net/mptcp/mptcp_diag.c | 3 ++- tools/testing/selftests/net/mptcp/mptcp_inq.c | 5 +++-- tools/testing/selftests/net/mptcp/mptcp_sockopt.c | 5 +++-- 5 files changed, 11 insertions(+), 6 deletions(-) diff --git a/tools/testing/selftests/net/mptcp/Makefile b/tools/testing/selftests/net/mptcp/Makefile index 15d144a25d823..4dd6278cd3dd2 100644 --- a/tools/testing/selftests/net/mptcp/Makefile +++ b/tools/testing/selftests/net/mptcp/Makefile @@ -3,6 +3,7 @@ top_srcdir = ../../../../.. CFLAGS += -Wall -Wl,--no-as-needed -O2 -g -I$(top_srcdir)/usr/include $(KHDR_INCLUDES) +CFLAGS += -I$(top_srcdir)/tools/include TEST_PROGS := \ diag.sh \ diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.c b/tools/testing/selftests/net/mptcp/mptcp_connect.c index 404a77bf366a8..10f6f99cfd4e0 100644 --- a/tools/testing/selftests/net/mptcp/mptcp_connect.c +++ b/tools/testing/selftests/net/mptcp/mptcp_connect.c @@ -33,6 +33,7 @@ #include #include #include +#include extern int optind; @@ -140,7 +141,7 @@ static void die_usage(void) exit(1); } -static void xerror(const char *fmt, ...) +static void __noreturn xerror(const char *fmt, ...) { va_list ap; diff --git a/tools/testing/selftests/net/mptcp/mptcp_diag.c b/tools/testing/selftests/net/mptcp/mptcp_diag.c index e084796e804d2..8e0b1b8d84b67 100644 --- a/tools/testing/selftests/net/mptcp/mptcp_diag.c +++ b/tools/testing/selftests/net/mptcp/mptcp_diag.c @@ -5,6 +5,7 @@ #include #include #include +#include #include #include #include @@ -87,7 +88,7 @@ enum { #define rta_getattr(type, value) (*(type *)RTA_DATA(value)) -static void die_perror(const char *msg) +static void __noreturn die_perror(const char *msg) { perror(msg); exit(1); diff --git a/tools/testing/selftests/net/mptcp/mptcp_inq.c b/tools/testing/selftests/net/mptcp/mptcp_inq.c index 8e8f6441ad8b0..5716998da192a 100644 --- a/tools/testing/selftests/net/mptcp/mptcp_inq.c +++ b/tools/testing/selftests/net/mptcp/mptcp_inq.c @@ -28,6 +28,7 @@ #include #include +#include #ifndef IPPROTO_MPTCP #define IPPROTO_MPTCP 262 @@ -40,7 +41,7 @@ static int pf = AF_INET; static int proto_tx = IPPROTO_MPTCP; static int proto_rx = IPPROTO_MPTCP; -static void die_perror(const char *msg) +static void __noreturn die_perror(const char *msg) { perror(msg); exit(1); @@ -52,7 +53,7 @@ static void die_usage(int r) exit(r); } -static void xerror(const char *fmt, ...) +static void __noreturn xerror(const char *fmt, ...) { va_list ap; diff --git a/tools/testing/selftests/net/mptcp/mptcp_sockopt.c b/tools/testing/selftests/net/mptcp/mptcp_sockopt.c index 286164f7246ed..b6e58d936ebe5 100644 --- a/tools/testing/selftests/net/mptcp/mptcp_sockopt.c +++ b/tools/testing/selftests/net/mptcp/mptcp_sockopt.c @@ -25,6 +25,7 @@ #include #include +#include static int pf = AF_INET; @@ -127,7 +128,7 @@ struct so_state { #define MIN(a, b) ((a) < (b) ? (a) : (b)) #endif -static void die_perror(const char *msg) +static void __noreturn die_perror(const char *msg) { perror(msg); exit(1); @@ -139,7 +140,7 @@ static void die_usage(int r) exit(r); } -static void xerror(const char *fmt, ...) +static void __noreturn xerror(const char *fmt, ...) { va_list ap; From d04787073b70b012796dd498667fec1aa6c7ffde Mon Sep 17 00:00:00 2001 From: Lorenzo Bianconi Date: Fri, 2 Jan 2026 12:29:38 +0100 Subject: [PATCH 0747/1321] net: airoha: Fix npu rx DMA definitions Fix typos in npu rx DMA descriptor definitions. Fixes: b3ef7bdec66fb ("net: airoha: Add airoha_offload.h header") Signed-off-by: Lorenzo Bianconi Link: https://patch.msgid.link/20260102-airoha-npu-dma-rx-def-fixes-v1-1-205fc6bf7d94@kernel.org Signed-off-by: Jakub Kicinski --- include/linux/soc/airoha/airoha_offload.h | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/include/linux/soc/airoha/airoha_offload.h b/include/linux/soc/airoha/airoha_offload.h index 4d23cbb7d4075..ab64ecdf39a06 100644 --- a/include/linux/soc/airoha/airoha_offload.h +++ b/include/linux/soc/airoha/airoha_offload.h @@ -71,12 +71,12 @@ static inline void airoha_ppe_dev_check_skb(struct airoha_ppe_dev *dev, #define NPU_RX1_DESC_NUM 512 /* CTRL */ -#define NPU_RX_DMA_DESC_LAST_MASK BIT(29) -#define NPU_RX_DMA_DESC_LEN_MASK GENMASK(28, 15) -#define NPU_RX_DMA_DESC_CUR_LEN_MASK GENMASK(14, 1) +#define NPU_RX_DMA_DESC_LAST_MASK BIT(27) +#define NPU_RX_DMA_DESC_LEN_MASK GENMASK(26, 14) +#define NPU_RX_DMA_DESC_CUR_LEN_MASK GENMASK(13, 1) #define NPU_RX_DMA_DESC_DONE_MASK BIT(0) /* INFO */ -#define NPU_RX_DMA_PKT_COUNT_MASK GENMASK(31, 28) +#define NPU_RX_DMA_PKT_COUNT_MASK GENMASK(31, 29) #define NPU_RX_DMA_PKT_ID_MASK GENMASK(28, 26) #define NPU_RX_DMA_SRC_PORT_MASK GENMASK(25, 21) #define NPU_RX_DMA_CRSN_MASK GENMASK(20, 16) From 892d529491273991679aa5faa89b2737acbb167b Mon Sep 17 00:00:00 2001 From: Jakub Kicinski Date: Sun, 4 Jan 2026 08:52:32 -0800 Subject: [PATCH 0748/1321] netlink: specs: netdev: clarify the page pool API a little The phrasing of the page-pool-get doc is very confusing. It's supposed to highlight that support depends on the driver doing its part but it sounds like orphaned page pools won't be visible. The description of the ifindex is completely wrong. We move the page pool to loopback and skip the attribute if ifindex is loopback. Link: https://lore.kernel.org/20260104084347.5de3a537@kernel.org Reviewed-by: Donald Hunter Acked-by: Jesper Dangaard Brouer Link: https://patch.msgid.link/20260104165232.710460-1-kuba@kernel.org Signed-off-by: Jakub Kicinski --- Documentation/netlink/specs/netdev.yaml | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml index 82bf5cb2617d1..596c306ce52b8 100644 --- a/Documentation/netlink/specs/netdev.yaml +++ b/Documentation/netlink/specs/netdev.yaml @@ -142,7 +142,7 @@ attribute-sets: name: ifindex doc: | ifindex of the netdev to which the pool belongs. - May be reported as 0 if the page pool was allocated for a netdev + May not be reported if the page pool was allocated for a netdev which got destroyed already (page pools may outlast their netdevs because they wait for all memory to be returned). type: u32 @@ -601,7 +601,9 @@ operations: name: page-pool-get doc: | Get / dump information about Page Pools. - (Only Page Pools associated with a net_device can be listed.) + Only Page Pools associated by the driver with a net_device + can be listed. ifindex will not be reported if the net_device + no longer exists. attribute-set: page-pool do: request: From afbd7ea8a7de884bc347a1af192e664ecb4e4ea9 Mon Sep 17 00:00:00 2001 From: Mohammad Heib Date: Sun, 4 Jan 2026 23:31:01 +0200 Subject: [PATCH 0749/1321] net: fix memory leak in skb_segment_list for GRO packets When skb_segment_list() is called during packet forwarding, it handles packets that were aggregated by the GRO engine. Historically, the segmentation logic in skb_segment_list assumes that individual segments are split from a parent SKB and may need to carry their own socket memory accounting. Accordingly, the code transfers truesize from the parent to the newly created segments. Prior to commit ed4cccef64c1 ("gro: fix ownership transfer"), this truesize subtraction in skb_segment_list() was valid because fragments still carry a reference to the original socket. However, commit ed4cccef64c1 ("gro: fix ownership transfer") changed this behavior by ensuring that fraglist entries are explicitly orphaned (skb->sk = NULL) to prevent illegal orphaning later in the stack. This change meant that the entire socket memory charge remained with the head SKB, but the corresponding accounting logic in skb_segment_list() was never updated. As a result, the current code unconditionally adds each fragment's truesize to delta_truesize and subtracts it from the parent SKB. Since the fragments are no longer charged to the socket, this subtraction results in an effective under-count of memory when the head is freed. This causes sk_wmem_alloc to remain non-zero, preventing socket destruction and leading to a persistent memory leak. The leak can be observed via KMEMLEAK when tearing down the networking environment: unreferenced object 0xffff8881e6eb9100 (size 2048): comm "ping", pid 6720, jiffies 4295492526 backtrace: kmem_cache_alloc_noprof+0x5c6/0x800 sk_prot_alloc+0x5b/0x220 sk_alloc+0x35/0xa00 inet6_create.part.0+0x303/0x10d0 __sock_create+0x248/0x640 __sys_socket+0x11b/0x1d0 Since skb_segment_list() is exclusively used for SKB_GSO_FRAGLIST packets constructed by GRO, the truesize adjustment is removed. The call to skb_release_head_state() must be preserved. As documented in commit cf673ed0e057 ("net: fix fraglist segmentation reference count leak"), it is still required to correctly drop references to SKB extensions that may be overwritten during __copy_skb_header(). Fixes: ed4cccef64c1 ("gro: fix ownership transfer") Signed-off-by: Mohammad Heib Reviewed-by: Willem de Bruijn Link: https://patch.msgid.link/20260104213101.352887-1-mheib@redhat.com Signed-off-by: Jakub Kicinski --- net/core/skbuff.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/net/core/skbuff.c b/net/core/skbuff.c index a00808f7be6a1..a56133902c0d9 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -4636,12 +4636,14 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb, { struct sk_buff *list_skb = skb_shinfo(skb)->frag_list; unsigned int tnl_hlen = skb_tnl_header_len(skb); - unsigned int delta_truesize = 0; unsigned int delta_len = 0; struct sk_buff *tail = NULL; struct sk_buff *nskb, *tmp; int len_diff, err; + /* Only skb_gro_receive_list generated skbs arrive here */ + DEBUG_NET_WARN_ON_ONCE(!(skb_shinfo(skb)->gso_type & SKB_GSO_FRAGLIST)); + skb_push(skb, -skb_network_offset(skb) + offset); /* Ensure the head is writeable before touching the shared info */ @@ -4655,8 +4657,9 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb, nskb = list_skb; list_skb = list_skb->next; + DEBUG_NET_WARN_ON_ONCE(nskb->sk); + err = 0; - delta_truesize += nskb->truesize; if (skb_shared(nskb)) { tmp = skb_clone(nskb, GFP_ATOMIC); if (tmp) { @@ -4699,7 +4702,6 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb, goto err_linearize; } - skb->truesize = skb->truesize - delta_truesize; skb->data_len = skb->data_len - delta_len; skb->len = skb->len - delta_len; From b7223a827896b638fa12de7d7905c3f6ad80e5a7 Mon Sep 17 00:00:00 2001 From: Shyam Sundar S K Date: Thu, 11 Dec 2025 16:58:31 +0530 Subject: [PATCH 0750/1321] MAINTAINERS: Add an additional maintainer to the AMD XGBE driver Add Raju Rangoju as an additional maintainer to support the AMD XGBE network device driver. Signed-off-by: Shyam Sundar S K Acked-by: Raju Rangoju Link: https://patch.msgid.link/20251211112831.1781030-1-Shyam-sundar.S-k@amd.com Signed-off-by: Jakub Kicinski --- MAINTAINERS | 1 + 1 file changed, 1 insertion(+) diff --git a/MAINTAINERS b/MAINTAINERS index 7bc0723ae5415..6737aad729d63 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1283,6 +1283,7 @@ F: include/uapi/drm/amdxdna_accel.h AMD XGBE DRIVER M: "Shyam Sundar S K" +M: Raju Rangoju L: netdev@vger.kernel.org S: Maintained F: arch/arm64/boot/dts/amd/amd-seattle-xgbe*.dtsi From 6df2c70a563e3f7d38f829f5a0d1b84d03cd01da Mon Sep 17 00:00:00 2001 From: Vladimir Oltean Date: Sun, 4 Jan 2026 11:39:52 +0200 Subject: [PATCH 0751/1321] Revert "dsa: mv88e6xxx: make serdes SGMII/Fiber tx amplitude configurable" This reverts commit 926eae604403acfa27ba5b072af458e87e634a50, which never could have produced the intended effect: https://lore.kernel.org/netdev/AM0PR06MB10396BBF8B568D77556FC46F8F7DEA@AM0PR06MB10396.eurprd06.prod.outlook.com/ The reason why it is broken beyond repair in this form is that the mv88e6xxx driver outsources its "tx-p2p-microvolt" property to the OF node of an external Ethernet PHY. This: (a) does not work if there is no external PHY (chip-to-chip connection, or SFP module) (b) pollutes the OF property namespace / bindings of said external PHY ("tx-p2p-microvolt" could have meaning for the Ethernet PHY's SerDes interface as well) We can revisit the idea of making SerDes amplitude configurable once we have proper bindings for the mv88e6xxx SerDes. Until then, remove the code that leaves us with unnecessary baggage. Fixes: 926eae604403 ("dsa: mv88e6xxx: make serdes SGMII/Fiber tx amplitude configurable") Cc: Holger Brunck Signed-off-by: Vladimir Oltean Reviewed-by: Andrew Lunn Link: https://patch.msgid.link/20260104093952.486606-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski --- drivers/net/dsa/mv88e6xxx/chip.c | 23 --------------- drivers/net/dsa/mv88e6xxx/chip.h | 4 --- drivers/net/dsa/mv88e6xxx/serdes.c | 46 ------------------------------ drivers/net/dsa/mv88e6xxx/serdes.h | 5 ---- 4 files changed, 78 deletions(-) diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index b4d48997bf467..09002c853b78e 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -3364,13 +3364,10 @@ static int mv88e6xxx_setup_upstream_port(struct mv88e6xxx_chip *chip, int port) static int mv88e6xxx_setup_port(struct mv88e6xxx_chip *chip, int port) { - struct device_node *phy_handle = NULL; struct fwnode_handle *ports_fwnode; struct fwnode_handle *port_fwnode; struct dsa_switch *ds = chip->ds; struct mv88e6xxx_port *p; - struct dsa_port *dp; - int tx_amp; int err; u16 reg; u32 val; @@ -3582,23 +3579,6 @@ static int mv88e6xxx_setup_port(struct mv88e6xxx_chip *chip, int port) return err; } - if (chip->info->ops->serdes_set_tx_amplitude) { - dp = dsa_to_port(ds, port); - if (dp) - phy_handle = of_parse_phandle(dp->dn, "phy-handle", 0); - - if (phy_handle && !of_property_read_u32(phy_handle, - "tx-p2p-microvolt", - &tx_amp)) - err = chip->info->ops->serdes_set_tx_amplitude(chip, - port, tx_amp); - if (phy_handle) { - of_node_put(phy_handle); - if (err) - return err; - } - } - /* Port based VLAN map: give each port the same default address * database, and allow bidirectional communication between the * CPU and DSA port(s), and the other ports. @@ -4768,7 +4748,6 @@ static const struct mv88e6xxx_ops mv88e6176_ops = { .serdes_irq_mapping = mv88e6352_serdes_irq_mapping, .serdes_get_regs_len = mv88e6352_serdes_get_regs_len, .serdes_get_regs = mv88e6352_serdes_get_regs, - .serdes_set_tx_amplitude = mv88e6352_serdes_set_tx_amplitude, .gpio_ops = &mv88e6352_gpio_ops, .phylink_get_caps = mv88e6352_phylink_get_caps, .pcs_ops = &mv88e6352_pcs_ops, @@ -5044,7 +5023,6 @@ static const struct mv88e6xxx_ops mv88e6240_ops = { .serdes_irq_mapping = mv88e6352_serdes_irq_mapping, .serdes_get_regs_len = mv88e6352_serdes_get_regs_len, .serdes_get_regs = mv88e6352_serdes_get_regs, - .serdes_set_tx_amplitude = mv88e6352_serdes_set_tx_amplitude, .gpio_ops = &mv88e6352_gpio_ops, .avb_ops = &mv88e6352_avb_ops, .ptp_ops = &mv88e6352_ptp_ops, @@ -5481,7 +5459,6 @@ static const struct mv88e6xxx_ops mv88e6352_ops = { .serdes_get_stats = mv88e6352_serdes_get_stats, .serdes_get_regs_len = mv88e6352_serdes_get_regs_len, .serdes_get_regs = mv88e6352_serdes_get_regs, - .serdes_set_tx_amplitude = mv88e6352_serdes_set_tx_amplitude, .phylink_get_caps = mv88e6352_phylink_get_caps, .pcs_ops = &mv88e6352_pcs_ops, }; diff --git a/drivers/net/dsa/mv88e6xxx/chip.h b/drivers/net/dsa/mv88e6xxx/chip.h index 2f211e55cb47b..e073446ee7d02 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.h +++ b/drivers/net/dsa/mv88e6xxx/chip.h @@ -642,10 +642,6 @@ struct mv88e6xxx_ops { void (*serdes_get_regs)(struct mv88e6xxx_chip *chip, int port, void *_p); - /* SERDES SGMII/Fiber Output Amplitude */ - int (*serdes_set_tx_amplitude)(struct mv88e6xxx_chip *chip, int port, - int val); - /* Address Translation Unit operations */ int (*atu_get_hash)(struct mv88e6xxx_chip *chip, u8 *hash); int (*atu_set_hash)(struct mv88e6xxx_chip *chip, u8 hash); diff --git a/drivers/net/dsa/mv88e6xxx/serdes.c b/drivers/net/dsa/mv88e6xxx/serdes.c index b3330211edbca..a936ee80ce006 100644 --- a/drivers/net/dsa/mv88e6xxx/serdes.c +++ b/drivers/net/dsa/mv88e6xxx/serdes.c @@ -25,14 +25,6 @@ static int mv88e6352_serdes_read(struct mv88e6xxx_chip *chip, int reg, reg, val); } -static int mv88e6352_serdes_write(struct mv88e6xxx_chip *chip, int reg, - u16 val) -{ - return mv88e6xxx_phy_page_write(chip, MV88E6352_ADDR_SERDES, - MV88E6352_SERDES_PAGE_FIBER, - reg, val); -} - static int mv88e6390_serdes_read(struct mv88e6xxx_chip *chip, int lane, int device, int reg, u16 *val) { @@ -506,41 +498,3 @@ void mv88e6390_serdes_get_regs(struct mv88e6xxx_chip *chip, int port, void *_p) p[i] = reg; } } - -static const int mv88e6352_serdes_p2p_to_reg[] = { - /* Index of value in microvolts corresponds to the register value */ - 14000, 112000, 210000, 308000, 406000, 504000, 602000, 700000, -}; - -int mv88e6352_serdes_set_tx_amplitude(struct mv88e6xxx_chip *chip, int port, - int val) -{ - bool found = false; - u16 ctrl, reg; - int err; - int i; - - err = mv88e6352_g2_scratch_port_has_serdes(chip, port); - if (err <= 0) - return err; - - for (i = 0; i < ARRAY_SIZE(mv88e6352_serdes_p2p_to_reg); ++i) { - if (mv88e6352_serdes_p2p_to_reg[i] == val) { - reg = i; - found = true; - break; - } - } - - if (!found) - return -EINVAL; - - err = mv88e6352_serdes_read(chip, MV88E6352_SERDES_SPEC_CTRL2, &ctrl); - if (err) - return err; - - ctrl &= ~MV88E6352_SERDES_OUT_AMP_MASK; - ctrl |= reg; - - return mv88e6352_serdes_write(chip, MV88E6352_SERDES_SPEC_CTRL2, ctrl); -} diff --git a/drivers/net/dsa/mv88e6xxx/serdes.h b/drivers/net/dsa/mv88e6xxx/serdes.h index ad887d8601bcb..17a3e85fabaa3 100644 --- a/drivers/net/dsa/mv88e6xxx/serdes.h +++ b/drivers/net/dsa/mv88e6xxx/serdes.h @@ -29,8 +29,6 @@ struct phylink_link_state; #define MV88E6352_SERDES_INT_FIBRE_ENERGY BIT(4) #define MV88E6352_SERDES_INT_STATUS 0x13 -#define MV88E6352_SERDES_SPEC_CTRL2 0x1a -#define MV88E6352_SERDES_OUT_AMP_MASK 0x0007 #define MV88E6341_PORT5_LANE 0x15 @@ -140,9 +138,6 @@ void mv88e6352_serdes_get_regs(struct mv88e6xxx_chip *chip, int port, void *_p); int mv88e6390_serdes_get_regs_len(struct mv88e6xxx_chip *chip, int port); void mv88e6390_serdes_get_regs(struct mv88e6xxx_chip *chip, int port, void *_p); -int mv88e6352_serdes_set_tx_amplitude(struct mv88e6xxx_chip *chip, int port, - int val); - /* Return the (first) SERDES lane address a port is using, -errno otherwise. */ static inline int mv88e6xxx_serdes_get_lane(struct mv88e6xxx_chip *chip, int port) From cdcaba941f2f660f486b368aaf753b777c4472be Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Mon, 5 Jan 2026 09:36:30 +0000 Subject: [PATCH 0752/1321] udp: call skb_orphan() before skb_attempt_defer_free() Standard UDP receive path does not use skb->destructor. But skmsg layer does use it, since it calls skb_set_owner_sk_safe() from udp_read_skb(). This then triggers this warning in skb_attempt_defer_free(): DEBUG_NET_WARN_ON_ONCE(skb->destructor); We must call skb_orphan() to fix this issue. Fixes: 6471658dc66c ("udp: use skb_attempt_defer_free()") Reported-by: syzbot+3e68572cf2286ce5ebe9@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/695b83bd.050a0220.1c9965.002b.GAE@google.com/T/#u Signed-off-by: Eric Dumazet Link: https://patch.msgid.link/20260105093630.1976085-1-edumazet@google.com Signed-off-by: Jakub Kicinski --- net/ipv4/udp.c | 1 + 1 file changed, 1 insertion(+) diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index ffe074cb58658..ee63af0ef42cc 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -1851,6 +1851,7 @@ void skb_consume_udp(struct sock *sk, struct sk_buff *skb, int len) sk_peek_offset_bwd(sk, len); if (!skb_shared(skb)) { + skb_orphan(skb); skb_attempt_defer_free(skb); return; } From 40704417e8e11872183fa10278342b9b328e919a Mon Sep 17 00:00:00 2001 From: Maxime Chevallier Date: Mon, 5 Jan 2026 16:18:39 +0100 Subject: [PATCH 0753/1321] net: sfp: return the number of written bytes for smbus single byte access We expect the SFP write accessors to return the number of written bytes. We fail to do so for single-byte smbus accesses, which may cause errors when setting a module's high-power state and for some cotsworks modules. Let's return the amount of written bytes, as expected. Fixes: 7662abf4db94 ("net: phy: sfp: Add support for SMBus module access") Signed-off-by: Maxime Chevallier Reviewed-by: Andrew Lunn Link: https://patch.msgid.link/20260105151840.144552-1-maxime.chevallier@bootlin.com Signed-off-by: Jakub Kicinski --- drivers/net/phy/sfp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c index 6166e91963644..84bef5099dda6 100644 --- a/drivers/net/phy/sfp.c +++ b/drivers/net/phy/sfp.c @@ -765,7 +765,7 @@ static int sfp_smbus_byte_write(struct sfp *sfp, bool a2, u8 dev_addr, dev_addr++; } - return 0; + return data - (u8 *)buf; } static int sfp_i2c_configure(struct sfp *sfp, struct i2c_adapter *i2c) From 2049b9ccc7724f8264771adc36fdffc1cfcab43e Mon Sep 17 00:00:00 2001 From: Shivani Gupta Date: Mon, 5 Jan 2026 00:59:05 +0000 Subject: [PATCH 0754/1321] net/sched: act_api: avoid dereferencing ERR_PTR in tcf_idrinfo_destroy syzbot reported a crash in tc_act_in_hw() during netns teardown where tcf_idrinfo_destroy() passed an ERR_PTR(-EBUSY) value as a tc_action pointer, leading to an invalid dereference. Guard against ERR_PTR entries when iterating the action IDR so teardown does not call tc_act_in_hw() on an error pointer. Fixes: 84a7d6797e6a ("net/sched: acp_api: no longer acquire RTNL in tc_action_net_exit()") Link: https://syzkaller.appspot.com/bug?extid=8f1c492ffa4644ff3826 Reported-by: syzbot+8f1c492ffa4644ff3826@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=8f1c492ffa4644ff3826 Signed-off-by: Shivani Gupta Link: https://patch.msgid.link/20260105005905.243423-1-shivani07g@gmail.com Signed-off-by: Jakub Kicinski --- net/sched/act_api.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/net/sched/act_api.c b/net/sched/act_api.c index ff6be5cfe2b05..e1ab0faeb8113 100644 --- a/net/sched/act_api.c +++ b/net/sched/act_api.c @@ -940,6 +940,8 @@ void tcf_idrinfo_destroy(const struct tc_action_ops *ops, int ret; idr_for_each_entry_ul(idr, p, tmp, id) { + if (IS_ERR(p)) + continue; if (tc_act_in_hw(p) && !mutex_taken) { rtnl_lock(); mutex_taken = true; From 063f3dc9ab980206851f4a0346fed3f70cf80fff Mon Sep 17 00:00:00 2001 From: Gal Pressman Date: Mon, 5 Jan 2026 18:33:19 +0200 Subject: [PATCH 0755/1321] selftests: drv-net: Bring back tool() to driver __init__s The pp_alloc_fail.py test (which doesn't run in NIPA CI?) uses tool, add back the import. Resolves: ImportError: cannot import name 'tool' from 'lib.py' Fixes: 68a052239fc4 ("selftests: drv-net: update remaining Python init files") Reviewed-by: Nimrod Oren Signed-off-by: Gal Pressman Link: https://patch.msgid.link/20260105163319.47619-1-gal@nvidia.com Signed-off-by: Jakub Kicinski --- tools/testing/selftests/drivers/net/hw/lib/py/__init__.py | 4 ++-- tools/testing/selftests/net/lib/py/__init__.py | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/tools/testing/selftests/drivers/net/hw/lib/py/__init__.py b/tools/testing/selftests/drivers/net/hw/lib/py/__init__.py index 766bfc4ad8428..d5d247eca6b7c 100644 --- a/tools/testing/selftests/drivers/net/hw/lib/py/__init__.py +++ b/tools/testing/selftests/drivers/net/hw/lib/py/__init__.py @@ -22,7 +22,7 @@ NlError, RtnlFamily, DevlinkFamily, PSPFamily from net.lib.py import CmdExitFailure from net.lib.py import bkg, cmd, bpftool, bpftrace, defer, ethtool, \ - fd_read_timeout, ip, rand_port, wait_port_listen, wait_file + fd_read_timeout, ip, rand_port, wait_port_listen, wait_file, tool from net.lib.py import KsftSkipEx, KsftFailEx, KsftXfailEx from net.lib.py import ksft_disruptive, ksft_exit, ksft_pr, ksft_run, \ ksft_setup, ksft_variants, KsftNamedVariant @@ -37,7 +37,7 @@ "CmdExitFailure", "bkg", "cmd", "bpftool", "bpftrace", "defer", "ethtool", "fd_read_timeout", "ip", "rand_port", - "wait_port_listen", "wait_file", + "wait_port_listen", "wait_file", "tool", "KsftSkipEx", "KsftFailEx", "KsftXfailEx", "ksft_disruptive", "ksft_exit", "ksft_pr", "ksft_run", "ksft_setup", "ksft_variants", "KsftNamedVariant", diff --git a/tools/testing/selftests/net/lib/py/__init__.py b/tools/testing/selftests/net/lib/py/__init__.py index 40f9ce307dd1e..f528b67639de0 100644 --- a/tools/testing/selftests/net/lib/py/__init__.py +++ b/tools/testing/selftests/net/lib/py/__init__.py @@ -13,7 +13,7 @@ from .netns import NetNS, NetNSEnter from .nsim import NetdevSim, NetdevSimDev from .utils import CmdExitFailure, fd_read_timeout, cmd, bkg, defer, \ - bpftool, ip, ethtool, bpftrace, rand_port, wait_port_listen, wait_file + bpftool, ip, ethtool, bpftrace, rand_port, wait_port_listen, wait_file, tool from .ynl import NlError, YnlFamily, EthtoolFamily, NetdevFamily, RtnlFamily, RtnlAddrFamily from .ynl import NetshaperFamily, DevlinkFamily, PSPFamily @@ -26,7 +26,7 @@ "NetNS", "NetNSEnter", "CmdExitFailure", "fd_read_timeout", "cmd", "bkg", "defer", "bpftool", "ip", "ethtool", "bpftrace", "rand_port", - "wait_port_listen", "wait_file", + "wait_port_listen", "wait_file", "tool", "NetdevSim", "NetdevSimDev", "NetshaperFamily", "DevlinkFamily", "PSPFamily", "NlError", "YnlFamily", "EthtoolFamily", "NetdevFamily", "RtnlFamily", From 0f871f46516ae31db552ab543c6a3fab6ffbd9f5 Mon Sep 17 00:00:00 2001 From: Yohei Kojima Date: Tue, 6 Jan 2026 00:17:32 +0900 Subject: [PATCH 0756/1321] net: netdevsim: fix inconsistent carrier state after link/unlink This patch fixes the edge case behavior on ifup/ifdown and linking/unlinking two netdevsim interfaces: 1. unlink two interfaces netdevsim1 and netdevsim2 2. ifdown netdevsim1 3. ifup netdevsim1 4. link two interfaces netdevsim1 and netdevsim2 5. (Now two interfaces are linked in terms of netdevsim peer, but carrier state of the two interfaces remains DOWN.) This inconsistent behavior is caused by the current implementation, which only cares about the "link, then ifup" order, not "ifup, then link" order. This patch fixes the inconsistency by calling netif_carrier_on() when two netdevsim interfaces are linked. This patch fixes buggy behavior on NetworkManager-based systems which causes the netdevsim test to fail with the following error: # timeout set to 600 # selftests: drivers/net/netdevsim: peer.sh # 2025/12/25 00:54:03 socat[9115] W address is opened in read-write mode but only supports read-only # 2025/12/25 00:56:17 socat[9115] W connect(7, AF=2 192.168.1.1:1234, 16): Connection timed out # 2025/12/25 00:56:17 socat[9115] E TCP:192.168.1.1:1234: Connection timed out # expected 3 bytes, got 0 # 2025/12/25 00:56:17 socat[9109] W exiting on signal 15 not ok 13 selftests: drivers/net/netdevsim: peer.sh # exit=1 This patch also solves timeout on TCP Fast Open (TFO) test in NetworkManager-based systems because it also depends on netdevsim's carrier consistency. Fixes: 1a8fed52f7be ("netdevsim: set the carrier when the device goes up") Signed-off-by: Yohei Kojima Reviewed-by: Breno Leitao Link: https://patch.msgid.link/602c9e1ba5bb2ee1997bb38b1d866c9c3b807ae9.1767624906.git.yk@y-koj.net Signed-off-by: Jakub Kicinski --- drivers/net/netdevsim/bus.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/drivers/net/netdevsim/bus.c b/drivers/net/netdevsim/bus.c index 70e8c38ddad6b..d16b95304aa7e 100644 --- a/drivers/net/netdevsim/bus.c +++ b/drivers/net/netdevsim/bus.c @@ -332,6 +332,11 @@ static ssize_t link_device_store(const struct bus_type *bus, const char *buf, si rcu_assign_pointer(nsim_a->peer, nsim_b); rcu_assign_pointer(nsim_b->peer, nsim_a); + if (netif_running(dev_a) && netif_running(dev_b)) { + netif_carrier_on(dev_a); + netif_carrier_on(dev_b); + } + out_err: put_net(ns_b); put_net(ns_a); @@ -381,6 +386,9 @@ static ssize_t unlink_device_store(const struct bus_type *bus, const char *buf, if (!peer) goto out_put_netns; + netif_carrier_off(dev); + netif_carrier_off(peer->netdev); + err = 0; RCU_INIT_POINTER(nsim->peer, NULL); RCU_INIT_POINTER(peer->peer, NULL); From ac9d05278a3507f15278be38c6225940919f59b2 Mon Sep 17 00:00:00 2001 From: Yohei Kojima Date: Tue, 6 Jan 2026 00:17:33 +0900 Subject: [PATCH 0757/1321] selftests: netdevsim: add carrier state consistency test This commit adds a test case for netdevsim carrier state consistency. Specifically, the added test verifies the carrier state during the following operations: 1. Unlink two netdevsims 2. ifdown one netdevsim, then ifup again 3. Link the netdevsims again 4. ifdown one netdevsim, then ifup again These steps verifies that the carrier is UP iff two netdevsims are linked and ifuped. Signed-off-by: Yohei Kojima Link: https://patch.msgid.link/481e2729e53b6074ebfc0ad85764d8feb244de8c.1767624906.git.yk@y-koj.net Signed-off-by: Jakub Kicinski --- .../selftests/drivers/net/netdevsim/peer.sh | 59 +++++++++++++++++++ 1 file changed, 59 insertions(+) diff --git a/tools/testing/selftests/drivers/net/netdevsim/peer.sh b/tools/testing/selftests/drivers/net/netdevsim/peer.sh index 7f32b56009257..f4721f7636dd6 100755 --- a/tools/testing/selftests/drivers/net/netdevsim/peer.sh +++ b/tools/testing/selftests/drivers/net/netdevsim/peer.sh @@ -52,6 +52,39 @@ cleanup_ns() ip netns del nssv } +is_carrier_up() +{ + local netns="$1" + local nsim_dev="$2" + + test "$(ip netns exec "$netns" \ + cat /sys/class/net/"$nsim_dev"/carrier 2>/dev/null)" -eq 1 +} + +assert_carrier_up() +{ + local netns="$1" + local nsim_dev="$2" + + if ! is_carrier_up "$netns" "$nsim_dev"; then + echo "$nsim_dev's carrier should be UP, but it isn't" + cleanup_ns + exit 1 + fi +} + +assert_carrier_down() +{ + local netns="$1" + local nsim_dev="$2" + + if is_carrier_up "$netns" "$nsim_dev"; then + echo "$nsim_dev's carrier should be DOWN, but it isn't" + cleanup_ns + exit 1 + fi +} + ### ### Code start ### @@ -113,6 +146,32 @@ if [ $? -eq 0 ]; then exit 1 fi +# netdevsim carrier state consistency checking +assert_carrier_up nssv "$NSIM_DEV_1_NAME" +assert_carrier_up nscl "$NSIM_DEV_2_NAME" + +echo "$NSIM_DEV_1_FD:$NSIM_DEV_1_IFIDX" > "$NSIM_DEV_SYS_UNLINK" + +assert_carrier_down nssv "$NSIM_DEV_1_NAME" +assert_carrier_down nscl "$NSIM_DEV_2_NAME" + +ip netns exec nssv ip link set dev "$NSIM_DEV_1_NAME" down +ip netns exec nssv ip link set dev "$NSIM_DEV_1_NAME" up + +assert_carrier_down nssv "$NSIM_DEV_1_NAME" +assert_carrier_down nscl "$NSIM_DEV_2_NAME" + +echo "$NSIM_DEV_1_FD:$NSIM_DEV_1_IFIDX $NSIM_DEV_2_FD:$NSIM_DEV_2_IFIDX" > $NSIM_DEV_SYS_LINK + +assert_carrier_up nssv "$NSIM_DEV_1_NAME" +assert_carrier_up nscl "$NSIM_DEV_2_NAME" + +ip netns exec nssv ip link set dev "$NSIM_DEV_1_NAME" down +ip netns exec nssv ip link set dev "$NSIM_DEV_1_NAME" up + +assert_carrier_up nssv "$NSIM_DEV_1_NAME" +assert_carrier_up nscl "$NSIM_DEV_2_NAME" + # send/recv packets tmp_file=$(mktemp) From 0d6dcff88c479e40283b07139dfccb4c654f7f42 Mon Sep 17 00:00:00 2001 From: Lorenzo Bianconi Date: Mon, 5 Jan 2026 09:43:31 +0100 Subject: [PATCH 0758/1321] net: airoha: Fix schedule while atomic in airoha_ppe_deinit() airoha_ppe_deinit() runs airoha_npu_ppe_deinit() in atomic context. airoha_npu_ppe_deinit routine allocates ppe_data buffer with GFP_KERNEL flag. Rely on rcu_replace_pointer in airoha_ppe_deinit routine in order to fix schedule while atomic issue in airoha_npu_ppe_deinit() since we do not need atomic context there. Fixes: 00a7678310fe3 ("net: airoha: Introduce flowtable offload support") Signed-off-by: Lorenzo Bianconi Link: https://patch.msgid.link/20260105-airoha-fw-ethtool-v2-1-3b32b158cc31@kernel.org Signed-off-by: Paolo Abeni --- drivers/net/ethernet/airoha/airoha_ppe.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/airoha/airoha_ppe.c b/drivers/net/ethernet/airoha/airoha_ppe.c index 0caabb0c3aa06..2221bafaf7c9f 100644 --- a/drivers/net/ethernet/airoha/airoha_ppe.c +++ b/drivers/net/ethernet/airoha/airoha_ppe.c @@ -1547,13 +1547,16 @@ void airoha_ppe_deinit(struct airoha_eth *eth) { struct airoha_npu *npu; - rcu_read_lock(); - npu = rcu_dereference(eth->npu); + mutex_lock(&flow_offload_mutex); + + npu = rcu_replace_pointer(eth->npu, NULL, + lockdep_is_held(&flow_offload_mutex)); if (npu) { npu->ops.ppe_deinit(npu); airoha_npu_put(npu); } - rcu_read_unlock(); + + mutex_unlock(&flow_offload_mutex); rhashtable_destroy(ð->ppe->l2_flows); rhashtable_destroy(ð->flow_table); From 4d3ad423717b026ecd0387fffaa3c51cb4be2c4d Mon Sep 17 00:00:00 2001 From: Xiang Mei Date: Mon, 5 Jan 2026 20:41:00 -0700 Subject: [PATCH 0759/1321] net/sched: sch_qfq: Fix NULL deref when deactivating inactive aggregate in qfq_reset `qfq_class->leaf_qdisc->q.qlen > 0` does not imply that the class itself is active. Two qfq_class objects may point to the same leaf_qdisc. This happens when: 1. one QFQ qdisc is attached to the dev as the root qdisc, and 2. another QFQ qdisc is temporarily referenced (e.g., via qdisc_get() / qdisc_put()) and is pending to be destroyed, as in function tc_new_tfilter. When packets are enqueued through the root QFQ qdisc, the shared leaf_qdisc->q.qlen increases. At the same time, the second QFQ qdisc triggers qdisc_put and qdisc_destroy: the qdisc enters qfq_reset() with its own q->q.qlen == 0, but its class's leaf qdisc->q.qlen > 0. Therefore, the qfq_reset would wrongly deactivate an inactive aggregate and trigger a null-deref in qfq_deactivate_agg: [ 0.903172] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 0.903571] #PF: supervisor write access in kernel mode [ 0.903860] #PF: error_code(0x0002) - not-present page [ 0.904177] PGD 10299b067 P4D 10299b067 PUD 10299c067 PMD 0 [ 0.904502] Oops: Oops: 0002 [#1] SMP NOPTI [ 0.904737] CPU: 0 UID: 0 PID: 135 Comm: exploit Not tainted 6.19.0-rc3+ #2 NONE [ 0.905157] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014 [ 0.905754] RIP: 0010:qfq_deactivate_agg (include/linux/list.h:992 (discriminator 2) include/linux/list.h:1006 (discriminator 2) net/sched/sch_qfq.c:1367 (discriminator 2) net/sched/sch_qfq.c:1393 (discriminator 2)) [ 0.906046] Code: 0f 84 4d 01 00 00 48 89 70 18 8b 4b 10 48 c7 c2 ff ff ff ff 48 8b 78 08 48 d3 e2 48 21 f2 48 2b 13 48 8b 30 48 d3 ea 8b 4b 18 0 Code starting with the faulting instruction =========================================== 0: 0f 84 4d 01 00 00 je 0x153 6: 48 89 70 18 mov %rsi,0x18(%rax) a: 8b 4b 10 mov 0x10(%rbx),%ecx d: 48 c7 c2 ff ff ff ff mov $0xffffffffffffffff,%rdx 14: 48 8b 78 08 mov 0x8(%rax),%rdi 18: 48 d3 e2 shl %cl,%rdx 1b: 48 21 f2 and %rsi,%rdx 1e: 48 2b 13 sub (%rbx),%rdx 21: 48 8b 30 mov (%rax),%rsi 24: 48 d3 ea shr %cl,%rdx 27: 8b 4b 18 mov 0x18(%rbx),%ecx ... [ 0.907095] RSP: 0018:ffffc900004a39a0 EFLAGS: 00010246 [ 0.907368] RAX: ffff8881043a0880 RBX: ffff888102953340 RCX: 0000000000000000 [ 0.907723] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ 0.908100] RBP: ffff888102952180 R08: 0000000000000000 R09: 0000000000000000 [ 0.908451] R10: ffff8881043a0000 R11: 0000000000000000 R12: ffff888102952000 [ 0.908804] R13: ffff888102952180 R14: ffff8881043a0ad8 R15: ffff8881043a0880 [ 0.909179] FS: 000000002a1a0380(0000) GS:ffff888196d8d000(0000) knlGS:0000000000000000 [ 0.909572] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 0.909857] CR2: 0000000000000000 CR3: 0000000102993002 CR4: 0000000000772ef0 [ 0.910247] PKRU: 55555554 [ 0.910391] Call Trace: [ 0.910527] [ 0.910638] qfq_reset_qdisc (net/sched/sch_qfq.c:357 net/sched/sch_qfq.c:1485) [ 0.910826] qdisc_reset (include/linux/skbuff.h:2195 include/linux/skbuff.h:2501 include/linux/skbuff.h:3424 include/linux/skbuff.h:3430 net/sched/sch_generic.c:1036) [ 0.911040] __qdisc_destroy (net/sched/sch_generic.c:1076) [ 0.911236] tc_new_tfilter (net/sched/cls_api.c:2447) [ 0.911447] rtnetlink_rcv_msg (net/core/rtnetlink.c:6958) [ 0.911663] ? __pfx_rtnetlink_rcv_msg (net/core/rtnetlink.c:6861) [ 0.911894] netlink_rcv_skb (net/netlink/af_netlink.c:2550) [ 0.912100] netlink_unicast (net/netlink/af_netlink.c:1319 net/netlink/af_netlink.c:1344) [ 0.912296] ? __alloc_skb (net/core/skbuff.c:706) [ 0.912484] netlink_sendmsg (net/netlink/af_netlink.c:1894) [ 0.912682] sock_write_iter (net/socket.c:727 (discriminator 1) net/socket.c:742 (discriminator 1) net/socket.c:1195 (discriminator 1)) [ 0.912880] vfs_write (fs/read_write.c:593 fs/read_write.c:686) [ 0.913077] ksys_write (fs/read_write.c:738) [ 0.913252] do_syscall_64 (arch/x86/entry/syscall_64.c:63 (discriminator 1) arch/x86/entry/syscall_64.c:94 (discriminator 1)) [ 0.913438] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:131) [ 0.913687] RIP: 0033:0x424c34 [ 0.913844] Code: 89 02 48 c7 c0 ff ff ff ff eb bd 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 80 3d 2d 44 09 00 00 74 13 b8 01 00 00 00 0f 05 9 Code starting with the faulting instruction =========================================== 0: 89 02 mov %eax,(%rdx) 2: 48 c7 c0 ff ff ff ff mov $0xffffffffffffffff,%rax 9: eb bd jmp 0xffffffffffffffc8 b: 66 2e 0f 1f 84 00 00 cs nopw 0x0(%rax,%rax,1) 12: 00 00 00 15: 90 nop 16: f3 0f 1e fa endbr64 1a: 80 3d 2d 44 09 00 00 cmpb $0x0,0x9442d(%rip) # 0x9444e 21: 74 13 je 0x36 23: b8 01 00 00 00 mov $0x1,%eax 28: 0f 05 syscall 2a: 09 .byte 0x9 [ 0.914807] RSP: 002b:00007ffea1938b78 EFLAGS: 00000202 ORIG_RAX: 0000000000000001 [ 0.915197] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 0000000000424c34 [ 0.915556] RDX: 000000000000003c RSI: 000000002af378c0 RDI: 0000000000000003 [ 0.915912] RBP: 00007ffea1938bc0 R08: 00000000004b8820 R09: 0000000000000000 [ 0.916297] R10: 0000000000000001 R11: 0000000000000202 R12: 00007ffea1938d28 [ 0.916652] R13: 00007ffea1938d38 R14: 00000000004b3828 R15: 0000000000000001 [ 0.917039] [ 0.917158] Modules linked in: [ 0.917316] CR2: 0000000000000000 [ 0.917484] ---[ end trace 0000000000000000 ]--- [ 0.917717] RIP: 0010:qfq_deactivate_agg (include/linux/list.h:992 (discriminator 2) include/linux/list.h:1006 (discriminator 2) net/sched/sch_qfq.c:1367 (discriminator 2) net/sched/sch_qfq.c:1393 (discriminator 2)) [ 0.917978] Code: 0f 84 4d 01 00 00 48 89 70 18 8b 4b 10 48 c7 c2 ff ff ff ff 48 8b 78 08 48 d3 e2 48 21 f2 48 2b 13 48 8b 30 48 d3 ea 8b 4b 18 0 Code starting with the faulting instruction =========================================== 0: 0f 84 4d 01 00 00 je 0x153 6: 48 89 70 18 mov %rsi,0x18(%rax) a: 8b 4b 10 mov 0x10(%rbx),%ecx d: 48 c7 c2 ff ff ff ff mov $0xffffffffffffffff,%rdx 14: 48 8b 78 08 mov 0x8(%rax),%rdi 18: 48 d3 e2 shl %cl,%rdx 1b: 48 21 f2 and %rsi,%rdx 1e: 48 2b 13 sub (%rbx),%rdx 21: 48 8b 30 mov (%rax),%rsi 24: 48 d3 ea shr %cl,%rdx 27: 8b 4b 18 mov 0x18(%rbx),%ecx ... [ 0.918902] RSP: 0018:ffffc900004a39a0 EFLAGS: 00010246 [ 0.919198] RAX: ffff8881043a0880 RBX: ffff888102953340 RCX: 0000000000000000 [ 0.919559] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ 0.919908] RBP: ffff888102952180 R08: 0000000000000000 R09: 0000000000000000 [ 0.920289] R10: ffff8881043a0000 R11: 0000000000000000 R12: ffff888102952000 [ 0.920648] R13: ffff888102952180 R14: ffff8881043a0ad8 R15: ffff8881043a0880 [ 0.921014] FS: 000000002a1a0380(0000) GS:ffff888196d8d000(0000) knlGS:0000000000000000 [ 0.921424] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 0.921710] CR2: 0000000000000000 CR3: 0000000102993002 CR4: 0000000000772ef0 [ 0.922097] PKRU: 55555554 [ 0.922240] Kernel panic - not syncing: Fatal exception [ 0.922590] Kernel Offset: disabled Fixes: 0545a3037773 ("pkt_sched: QFQ - quick fair queue scheduler") Signed-off-by: Xiang Mei Link: https://patch.msgid.link/20260106034100.1780779-1-xmei5@asu.edu Signed-off-by: Jakub Kicinski --- net/sched/sch_qfq.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/sched/sch_qfq.c b/net/sched/sch_qfq.c index d920f57dc6d76..f4013b547438f 100644 --- a/net/sched/sch_qfq.c +++ b/net/sched/sch_qfq.c @@ -1481,7 +1481,7 @@ static void qfq_reset_qdisc(struct Qdisc *sch) for (i = 0; i < q->clhash.hashsize; i++) { hlist_for_each_entry(cl, &q->clhash.hash[i], common.hnode) { - if (cl->qdisc->q.qlen > 0) + if (cl_is_active(cl)) qfq_deactivate_class(q, cl); qdisc_reset(cl->qdisc); From d1524e58353de8804a440ebbf9db77883b93cd44 Mon Sep 17 00:00:00 2001 From: Thomas Fourier Date: Tue, 6 Jan 2026 10:47:21 +0100 Subject: [PATCH 0760/1321] net: 3com: 3c59x: fix possible null dereference in vortex_probe1() pdev can be null and free_ring: can be called in 1297 with a null pdev. Fixes: 55c82617c3e8 ("3c59x: convert to generic DMA API") Cc: Signed-off-by: Thomas Fourier Link: https://patch.msgid.link/20260106094731.25819-2-fourier.thomas@gmail.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/3com/3c59x.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/3com/3c59x.c b/drivers/net/ethernet/3com/3c59x.c index 8c9cc97efd4ee..4fe4efdb3737f 100644 --- a/drivers/net/ethernet/3com/3c59x.c +++ b/drivers/net/ethernet/3com/3c59x.c @@ -1473,7 +1473,7 @@ static int vortex_probe1(struct device *gendev, void __iomem *ioaddr, int irq, return 0; free_ring: - dma_free_coherent(&pdev->dev, + dma_free_coherent(gendev, sizeof(struct boom_rx_desc) * RX_RING_SIZE + sizeof(struct boom_tx_desc) * TX_RING_SIZE, vp->rx_ring, vp->rx_ring_dma); From 65fb34608eb3677de6cc601bbc402234c81bee84 Mon Sep 17 00:00:00 2001 From: Petko Manolov Date: Tue, 6 Jan 2026 10:48:21 +0200 Subject: [PATCH 0761/1321] net: usb: pegasus: fix memory leak in update_eth_regs_async() When asynchronously writing to the device registers and if usb_submit_urb() fail, the code fail to release allocated to this point resources. Fixes: 323b34963d11 ("drivers: net: usb: pegasus: fix control urb submission") Signed-off-by: Petko Manolov Link: https://patch.msgid.link/20260106084821.3746677-1-petko.manolov@konsulko.com Signed-off-by: Jakub Kicinski --- drivers/net/usb/pegasus.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/net/usb/pegasus.c b/drivers/net/usb/pegasus.c index 81ca64debc5b9..c514483134f05 100644 --- a/drivers/net/usb/pegasus.c +++ b/drivers/net/usb/pegasus.c @@ -168,6 +168,8 @@ static int update_eth_regs_async(pegasus_t *pegasus) netif_device_detach(pegasus->net); netif_err(pegasus, drv, pegasus->net, "%s returned %d\n", __func__, ret); + usb_free_urb(async_urb); + kfree(req); } return ret; } From eb3788c3ecd81c911456fc14c78ed4ab9ab6d5fe Mon Sep 17 00:00:00 2001 From: Breno Leitao Date: Tue, 6 Jan 2026 06:31:14 -0800 Subject: [PATCH 0762/1321] bnxt_en: Fix NULL pointer crash in bnxt_ptp_enable during error cleanup When bnxt_init_one() fails during initialization (e.g., bnxt_init_int_mode returns -ENODEV), the error path calls bnxt_free_hwrm_resources() which destroys the DMA pool and sets bp->hwrm_dma_pool to NULL. Subsequently, bnxt_ptp_clear() is called, which invokes ptp_clock_unregister(). Since commit a60fc3294a37 ("ptp: rework ptp_clock_unregister() to disable events"), ptp_clock_unregister() now calls ptp_disable_all_events(), which in turn invokes the driver's .enable() callback (bnxt_ptp_enable()) to disable PTP events before completing the unregistration. bnxt_ptp_enable() attempts to send HWRM commands via bnxt_ptp_cfg_pin() and bnxt_ptp_cfg_event(), both of which call hwrm_req_init(). This function tries to allocate from bp->hwrm_dma_pool, causing a NULL pointer dereference: bnxt_en 0000:01:00.0 (unnamed net_device) (uninitialized): bnxt_init_int_mode err: ffffffed KASAN: null-ptr-deref in range [0x0000000000000028-0x000000000000002f] Call Trace: __hwrm_req_init (drivers/net/ethernet/broadcom/bnxt/bnxt_hwrm.c:72) bnxt_ptp_enable (drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c:323 drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c:517) ptp_disable_all_events (drivers/ptp/ptp_chardev.c:66) ptp_clock_unregister (drivers/ptp/ptp_clock.c:518) bnxt_ptp_clear (drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c:1134) bnxt_init_one (drivers/net/ethernet/broadcom/bnxt/bnxt.c:16889) Lines are against commit f8f9c1f4d0c7 ("Linux 6.19-rc3") Fix this by clearing and unregistering ptp (bnxt_ptp_clear()) before freeing HWRM resources. Suggested-by: Pavan Chebbi Signed-off-by: Breno Leitao Fixes: a60fc3294a37 ("ptp: rework ptp_clock_unregister() to disable events") Cc: stable@vger.kernel.org Reviewed-by: Pavan Chebbi Link: https://patch.msgid.link/20260106-bnxt-v3-1-71f37e11446a@debian.org Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index d160e54ac121d..8419d1eb4035d 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -16891,12 +16891,12 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent) init_err_pci_clean: bnxt_hwrm_func_drv_unrgtr(bp); - bnxt_free_hwrm_resources(bp); - bnxt_hwmon_uninit(bp); - bnxt_ethtool_free(bp); bnxt_ptp_clear(bp); kfree(bp->ptp_cfg); bp->ptp_cfg = NULL; + bnxt_free_hwrm_resources(bp); + bnxt_hwmon_uninit(bp); + bnxt_ethtool_free(bp); kfree(bp->fw_health); bp->fw_health = NULL; bnxt_cleanup_pci(bp); From f9f679974ba931613ae38f496254eb7f57cfba56 Mon Sep 17 00:00:00 2001 From: Willem de Bruijn Date: Tue, 6 Jan 2026 10:05:46 -0500 Subject: [PATCH 0763/1321] net: do not write to msg_get_inq in callee NULL pointer dereference fix. msg_get_inq is an input field from caller to callee. Don't set it in the callee, as the caller may not clear it on struct reuse. This is a kernel-internal variant of msghdr only, and the only user does reinitialize the field. So this is not critical for that reason. But it is more robust to avoid the write, and slightly simpler code. And it fixes a bug, see below. Callers set msg_get_inq to request the input queue length to be returned in msg_inq. This is equivalent to but independent from the SO_INQ request to return that same info as a cmsg (tp->recvmsg_inq). To reduce branching in the hot path the second also sets the msg_inq. That is WAI. This is a fix to commit 4d1442979e4a ("af_unix: don't post cmsg for SO_INQ unless explicitly asked for"), which fixed the inverse. Also avoid NULL pointer dereference in unix_stream_read_generic if state->msg is NULL and msg->msg_get_inq is written. A NULL state->msg can happen when splicing as of commit 2b514574f7e8 ("net: af_unix: implement splice for stream af_unix sockets"). Also collapse two branches using a bitwise or. Cc: stable@vger.kernel.org Fixes: 4d1442979e4a ("af_unix: don't post cmsg for SO_INQ unless explicitly asked for") Link: https://lore.kernel.org/netdev/willemdebruijn.kernel.24d8030f7a3de@gmail.com/ Signed-off-by: Willem de Bruijn Reviewed-by: Jens Axboe Reviewed-by: Eric Dumazet Reviewed-by: Kuniyuki Iwashima Link: https://patch.msgid.link/20260106150626.3944363-1-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski --- net/ipv4/tcp.c | 8 +++----- net/unix/af_unix.c | 8 +++----- 2 files changed, 6 insertions(+), 10 deletions(-) diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index f035440c475a9..d5319ebe24525 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -2652,10 +2652,8 @@ static int tcp_recvmsg_locked(struct sock *sk, struct msghdr *msg, size_t len, if (sk->sk_state == TCP_LISTEN) goto out; - if (tp->recvmsg_inq) { + if (tp->recvmsg_inq) *cmsg_flags = TCP_CMSG_INQ; - msg->msg_get_inq = 1; - } timeo = sock_rcvtimeo(sk, flags & MSG_DONTWAIT); /* Urgent data needs to be handled specially. */ @@ -2929,10 +2927,10 @@ int tcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int flags, ret = tcp_recvmsg_locked(sk, msg, len, flags, &tss, &cmsg_flags); release_sock(sk); - if ((cmsg_flags || msg->msg_get_inq) && ret >= 0) { + if ((cmsg_flags | msg->msg_get_inq) && ret >= 0) { if (cmsg_flags & TCP_CMSG_TS) tcp_recv_timestamp(msg, sk, &tss); - if (msg->msg_get_inq) { + if ((cmsg_flags & TCP_CMSG_INQ) | msg->msg_get_inq) { msg->msg_inq = tcp_inq_hint(sk); if (cmsg_flags & TCP_CMSG_INQ) put_cmsg(msg, SOL_TCP, TCP_CM_INQ, diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index a7ca74653d946..d0511225799ba 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -2904,7 +2904,6 @@ static int unix_stream_read_generic(struct unix_stream_read_state *state, unsigned int last_len; struct unix_sock *u; int copied = 0; - bool do_cmsg; int err = 0; long timeo; int target; @@ -2930,9 +2929,6 @@ static int unix_stream_read_generic(struct unix_stream_read_state *state, u = unix_sk(sk); - do_cmsg = READ_ONCE(u->recvmsg_inq); - if (do_cmsg) - msg->msg_get_inq = 1; redo: /* Lock the socket to prevent queue disordering * while sleeps in memcpy_tomsg @@ -3090,9 +3086,11 @@ static int unix_stream_read_generic(struct unix_stream_read_state *state, mutex_unlock(&u->iolock); if (msg) { + bool do_cmsg = READ_ONCE(u->recvmsg_inq); + scm_recv_unix(sock, msg, &scm, flags); - if (msg->msg_get_inq && (copied ?: err) >= 0) { + if ((do_cmsg | msg->msg_get_inq) && (copied ?: err) >= 0) { msg->msg_inq = READ_ONCE(u->inq_len); if (do_cmsg) put_cmsg(msg, SOL_SOCKET, SCM_INQ, From 947848ad61f01be2d163d06229f7ae98c214535d Mon Sep 17 00:00:00 2001 From: Jakub Kicinski Date: Tue, 6 Jan 2026 08:34:26 -0800 Subject: [PATCH 0764/1321] tools: ynl: don't install tests make's install target is meant for installing the production artifacts, AFAIU. Don't install test_ynl_cli and test_ynl_ethtool from under the main YNL install target. The install target under tests/ is retained in case someone wants the tests to be installed. Fixes: 308b7dee3e5c ("tools: ynl: add YNL test framework") Reviewed-by: Hangbin Liu Reviewed-by: Donald Hunter Link: https://patch.msgid.link/20260106163426.1468943-1-kuba@kernel.org Signed-off-by: Jakub Kicinski --- tools/net/ynl/Makefile | 1 - 1 file changed, 1 deletion(-) diff --git a/tools/net/ynl/Makefile b/tools/net/ynl/Makefile index 7736b492f559c..c2f3e8b3f2ac2 100644 --- a/tools/net/ynl/Makefile +++ b/tools/net/ynl/Makefile @@ -51,7 +51,6 @@ install: libynl.a lib/*.h @echo -e "\tINSTALL pyynl" @pip install --prefix=$(DESTDIR)$(prefix) . @make -C generated install - @make -C tests install run_tests: @$(MAKE) -C tests run_tests From cb97246e8de6cd974678b78731a178ca9c59e46c Mon Sep 17 00:00:00 2001 From: Thomas Fourier Date: Wed, 7 Jan 2026 10:01:36 +0100 Subject: [PATCH 0765/1321] atm: Fix dma_free_coherent() size The size of the buffer is not the same when alloc'd with dma_alloc_coherent() in he_init_tpdrq() and freed. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: Signed-off-by: Thomas Fourier Link: https://patch.msgid.link/20260107090141.80900-2-fourier.thomas@gmail.com Signed-off-by: Jakub Kicinski --- drivers/atm/he.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/atm/he.c b/drivers/atm/he.c index ad91cc6a34fc5..92a041d5387bd 100644 --- a/drivers/atm/he.c +++ b/drivers/atm/he.c @@ -1587,7 +1587,8 @@ he_stop(struct he_dev *he_dev) he_dev->tbrq_base, he_dev->tbrq_phys); if (he_dev->tpdrq_base) - dma_free_coherent(&he_dev->pci_dev->dev, CONFIG_TBRQ_SIZE * sizeof(struct he_tbrq), + dma_free_coherent(&he_dev->pci_dev->dev, + CONFIG_TPDRQ_SIZE * sizeof(struct he_tpdrq), he_dev->tpdrq_base, he_dev->tpdrq_phys); dma_pool_destroy(he_dev->tpd_pool); From 36cd2c394c3cdc258338eca8afb5cb8e3993a3ea Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Thu, 8 Jan 2026 10:19:27 +0000 Subject: [PATCH 0766/1321] wifi: avoid kernel-infoleak from struct iw_point struct iw_point has a 32bit hole on 64bit arches. struct iw_point { void __user *pointer; /* Pointer to the data (in user space) */ __u16 length; /* number of fields or size in bytes */ __u16 flags; /* Optional params */ }; Make sure to zero the structure to avoid disclosing 32bits of kernel data to user space. Fixes: 87de87d5e47f ("wext: Dispatch and handle compat ioctls entirely in net/wireless/wext.c") Reported-by: syzbot+bfc7323743ca6dbcc3d3@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/695f83f3.050a0220.1c677c.0392.GAE@google.com/T/#u Signed-off-by: Eric Dumazet Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20260108101927.857582-1-edumazet@google.com Signed-off-by: Johannes Berg --- net/wireless/wext-core.c | 4 ++++ net/wireless/wext-priv.c | 4 ++++ 2 files changed, 8 insertions(+) diff --git a/net/wireless/wext-core.c b/net/wireless/wext-core.c index c32a7c6903d53..7b8e94214b072 100644 --- a/net/wireless/wext-core.c +++ b/net/wireless/wext-core.c @@ -1101,6 +1101,10 @@ static int compat_standard_call(struct net_device *dev, return ioctl_standard_call(dev, iwr, cmd, info, handler); iwp_compat = (struct compat_iw_point *) &iwr->u.data; + + /* struct iw_point has a 32bit hole on 64bit arches. */ + memset(&iwp, 0, sizeof(iwp)); + iwp.pointer = compat_ptr(iwp_compat->pointer); iwp.length = iwp_compat->length; iwp.flags = iwp_compat->flags; diff --git a/net/wireless/wext-priv.c b/net/wireless/wext-priv.c index 674d426a9d24f..37d1147019c2b 100644 --- a/net/wireless/wext-priv.c +++ b/net/wireless/wext-priv.c @@ -228,6 +228,10 @@ int compat_private_call(struct net_device *dev, struct iwreq *iwr, struct iw_point iwp; iwp_compat = (struct compat_iw_point *) &iwr->u.data; + + /* struct iw_point has a 32bit hole on 64bit arches. */ + memset(&iwp, 0, sizeof(iwp)); + iwp.pointer = compat_ptr(iwp_compat->pointer); iwp.length = iwp_compat->length; iwp.flags = iwp_compat->flags; From e09568401ecee6aff4a8cd8efef446d5e104c224 Mon Sep 17 00:00:00 2001 From: Benjamin Berg Date: Wed, 7 Jan 2026 14:36:51 +0100 Subject: [PATCH 0767/1321] wifi: mac80211_hwsim: fix typo in frequency notification The NAN notification is for 5745 MHz which corresponds to channel 149 and not 5475 which is not actually a valid channel. This could result in a NULL pointer dereference in cfg80211_next_nan_dw_notif. Fixes: a37a6f54439b ("wifi: mac80211_hwsim: Add simulation support for NAN device") Signed-off-by: Benjamin Berg Reviewed-by: Ilan Peer Reviewed-by: Miriam Rachel Korenblit Link: https://patch.msgid.link/20260107143652.7dab2035836f.Iacbaf7bb94ed5c14a0928a625827e4137d8bfede@changeid Signed-off-by: Johannes Berg --- drivers/net/wireless/virtual/mac80211_hwsim.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/wireless/virtual/mac80211_hwsim.c b/drivers/net/wireless/virtual/mac80211_hwsim.c index 551f5eb4e7477..92427f5272865 100644 --- a/drivers/net/wireless/virtual/mac80211_hwsim.c +++ b/drivers/net/wireless/virtual/mac80211_hwsim.c @@ -4040,7 +4040,7 @@ mac80211_hwsim_nan_dw_start(struct hrtimer *timer) ieee80211_vif_to_wdev(data->nan_device_vif); if (data->nan_curr_dw_band == NL80211_BAND_5GHZ) - ch = ieee80211_get_channel(hw->wiphy, 5475); + ch = ieee80211_get_channel(hw->wiphy, 5745); else ch = ieee80211_get_channel(hw->wiphy, 2437); From a80f7cbce57627cefd10a8bddcb2ab268df0542b Mon Sep 17 00:00:00 2001 From: Miri Korenblit Date: Wed, 7 Jan 2026 14:37:36 +0100 Subject: [PATCH 0768/1321] wifi: mac80211: don't iterate not running interfaces for_each_chanctx_user_* was introdcued as a replacement for for_each_sdata_link, which visits also other chanctx users that are not link. for_each_sdata_link skips not running interfaces, do the same for for_each_chanctx_user_* Fixes: 1ce954c98b89 ("wifi: mac80211: add and use chanctx usage iteration") Signed-off-by: Miri Korenblit Link: https://patch.msgid.link/20260107143736.55c084e2a976.I38b7b904a135dadca339321923b501b2c2c5c8c0@changeid Signed-off-by: Johannes Berg --- net/mac80211/chan.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/net/mac80211/chan.c b/net/mac80211/chan.c index d0bfb12164016..d8c5f11afc157 100644 --- a/net/mac80211/chan.c +++ b/net/mac80211/chan.c @@ -90,6 +90,9 @@ static void ieee80211_chanctx_user_iter_next(struct ieee80211_local *local, /* next (or first) interface */ iter->sdata = list_prepare_entry(iter->sdata, &local->interfaces, list); list_for_each_entry_continue(iter->sdata, &local->interfaces, list) { + if (!ieee80211_sdata_running(iter->sdata)) + continue; + /* AP_VLAN has a chanctx pointer but follows AP */ if (iter->sdata->vif.type == NL80211_IFTYPE_AP_VLAN) continue; From 8d25c2d2b7828b82df1c0202ce9ed11edbea69ad Mon Sep 17 00:00:00 2001 From: Benjamin Berg Date: Wed, 7 Jan 2026 14:38:05 +0100 Subject: [PATCH 0769/1321] wifi: mac80211_hwsim: disable BHs for hwsim_radio_lock The hwsim_radio_lock spinlock expects bottom-half to be disabled, fix the call in mac80211_hwsim_nan_stop to ensure BHs are disabled. Signed-off-by: Benjamin Berg Link: https://patch.msgid.link/20260107143805.ce7406511608.I688f8b19346e94c1f8de0cdadde072054d4b861c@changeid Signed-off-by: Johannes Berg --- drivers/net/wireless/virtual/mac80211_hwsim.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/wireless/virtual/mac80211_hwsim.c b/drivers/net/wireless/virtual/mac80211_hwsim.c index 92427f5272865..79cc63272134d 100644 --- a/drivers/net/wireless/virtual/mac80211_hwsim.c +++ b/drivers/net/wireless/virtual/mac80211_hwsim.c @@ -4112,14 +4112,14 @@ static int mac80211_hwsim_stop_nan(struct ieee80211_hw *hw, hrtimer_cancel(&data->nan_timer); data->nan_device_vif = NULL; - spin_lock(&hwsim_radio_lock); + spin_lock_bh(&hwsim_radio_lock); list_for_each_entry(data2, &hwsim_radios, list) { if (data2->nan_device_vif) { nan_cluster_running = true; break; } } - spin_unlock(&hwsim_radio_lock); + spin_unlock_bh(&hwsim_radio_lock); if (!nan_cluster_running) memset(hwsim_nan_cluster_id, 0, ETH_ALEN); From 3dee14c5feaf5a18839e2b75acbab635e6d1b9bc Mon Sep 17 00:00:00 2001 From: Johannes Berg Date: Tue, 16 Dec 2025 11:52:42 +0100 Subject: [PATCH 0770/1321] wifi: mac80211: restore non-chanctx injection behaviour During the transition to use channel contexts throughout, the ability to do injection while in monitor mode concurrent with another interface was lost, since the (virtual) monitor won't have a chanctx assigned in this scenario. It's harder to fix drivers that actually transitioned to using channel contexts themselves, such as mt76, but it's easy to do those that are (still) just using the emulation. Do that. Cc: stable@vger.kernel.org Link: https://bugzilla.kernel.org/show_bug.cgi?id=218763 Reported-and-tested-by: Oscar Alfonso Diaz Fixes: 0a44dfc07074 ("wifi: mac80211: simplify non-chanctx drivers") Link: https://patch.msgid.link/20251216105242.18366-2-johannes@sipsolutions.net Signed-off-by: Johannes Berg --- net/mac80211/tx.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c index 9d8b0a25f73c3..1b55e83404135 100644 --- a/net/mac80211/tx.c +++ b/net/mac80211/tx.c @@ -2397,6 +2397,8 @@ netdev_tx_t ieee80211_monitor_start_xmit(struct sk_buff *skb, if (chanctx_conf) chandef = &chanctx_conf->def; + else if (local->emulate_chanctx) + chandef = &local->hw.conf.chandef; else goto fail_rcu; From 3d4f58df536f0651cbd030cf3129722a20968aa5 Mon Sep 17 00:00:00 2001 From: Baochen Qiang Date: Mon, 22 Dec 2025 10:29:07 +0800 Subject: [PATCH 0771/1321] wifi: mac80211: collect station statistics earlier when disconnect In __sta_info_destroy_part2(), station statistics are requested after the IEEE80211_STA_NONE -> IEEE80211_STA_NOTEXIST transition. This is problematic because the driver may be unable to handle the request due to the STA being in the NOTEXIST state (i.e. if the driver destroys the underlying data when transitioning to NOTEXIST). Move the statistics collection to before the state transition to avoid this issue. Signed-off-by: Baochen Qiang Link: https://patch.msgid.link/20251222-mac80211-move-station-stats-collection-earlier-v1-1-12cd4e42c633@oss.qualcomm.com Signed-off-by: Johannes Berg --- net/mac80211/sta_info.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c index f4d3b67fda062..1a995bc301b19 100644 --- a/net/mac80211/sta_info.c +++ b/net/mac80211/sta_info.c @@ -1533,6 +1533,10 @@ static void __sta_info_destroy_part2(struct sta_info *sta, bool recalc) } } + sinfo = kzalloc(sizeof(*sinfo), GFP_KERNEL); + if (sinfo) + sta_set_sinfo(sta, sinfo, true); + if (sta->uploaded) { ret = drv_sta_state(local, sdata, sta, IEEE80211_STA_NONE, IEEE80211_STA_NOTEXIST); @@ -1541,9 +1545,6 @@ static void __sta_info_destroy_part2(struct sta_info *sta, bool recalc) sta_dbg(sdata, "Removed STA %pM\n", sta->sta.addr); - sinfo = kzalloc(sizeof(*sinfo), GFP_KERNEL); - if (sinfo) - sta_set_sinfo(sta, sinfo, true); cfg80211_del_sta_sinfo(sdata->dev, sta->sta.addr, sinfo, GFP_KERNEL); kfree(sinfo); From 448df46e3417334ef0292e56a84e1ac3cf1f07d5 Mon Sep 17 00:00:00 2001 From: Wei Fang Date: Wed, 7 Jan 2026 17:12:04 +0800 Subject: [PATCH 0772/1321] net: enetc: fix build warning when PAGE_SIZE is greater than 128K The max buffer size of ENETC RX BD is 0xFFFF bytes, so if the PAGE_SIZE is greater than 128K, ENETC_RXB_DMA_SIZE and ENETC_RXB_DMA_SIZE_XDP will be greater than 0xFFFF, thus causing a build warning. This will not cause any practical issues because ENETC is currently only used on the ARM64 platform, and the max PAGE_SIZE is 64K. So this patch is only for fixing the build warning that occurs when compiling ENETC drivers for other platforms. Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202601050637.kHEKKOG7-lkp@intel.com/ Fixes: e59bc32df2e9 ("net: enetc: correct the value of ENETC_RXB_TRUESIZE") Signed-off-by: Wei Fang Reviewed-by: Frank Li Link: https://patch.msgid.link/20260107091204.1980222-1-wei.fang@nxp.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/freescale/enetc/enetc.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/freescale/enetc/enetc.h b/drivers/net/ethernet/freescale/enetc/enetc.h index dce27bd67a7d1..aecd40aeef9c4 100644 --- a/drivers/net/ethernet/freescale/enetc/enetc.h +++ b/drivers/net/ethernet/freescale/enetc/enetc.h @@ -79,9 +79,9 @@ struct enetc_lso_t { #define ENETC_RXB_TRUESIZE (PAGE_SIZE >> 1) #define ENETC_RXB_PAD NET_SKB_PAD /* add extra space if needed */ #define ENETC_RXB_DMA_SIZE \ - (SKB_WITH_OVERHEAD(ENETC_RXB_TRUESIZE) - ENETC_RXB_PAD) + min(SKB_WITH_OVERHEAD(ENETC_RXB_TRUESIZE) - ENETC_RXB_PAD, 0xffff) #define ENETC_RXB_DMA_SIZE_XDP \ - (SKB_WITH_OVERHEAD(ENETC_RXB_TRUESIZE) - XDP_PACKET_HEADROOM) + min(SKB_WITH_OVERHEAD(ENETC_RXB_TRUESIZE) - XDP_PACKET_HEADROOM, 0xffff) struct enetc_rx_swbd { dma_addr_t dma; From 8b3a4ab42a1d88aa06192efdc921b41f06c127a8 Mon Sep 17 00:00:00 2001 From: Emil Tantilov Date: Thu, 20 Nov 2025 16:12:14 -0800 Subject: [PATCH 0773/1321] idpf: keep the netdev when a reset fails During a successful reset the driver would re-allocate vport resources while keeping the netdevs intact. However, in case of an error in the init task, the netdev of the failing vport will be unregistered, effectively removing the network interface: [ 121.211076] idpf 0000:83:00.0: enabling device (0100 -> 0102) [ 121.221976] idpf 0000:83:00.0: Device HW Reset initiated [ 124.161229] idpf 0000:83:00.0 ens801f0: renamed from eth0 [ 124.163364] idpf 0000:83:00.0 ens801f0d1: renamed from eth1 [ 125.934656] idpf 0000:83:00.0 ens801f0d2: renamed from eth2 [ 128.218429] idpf 0000:83:00.0 ens801f0d3: renamed from eth3 ip -br a ens801f0 UP ens801f0d1 UP ens801f0d2 UP ens801f0d3 UP echo 1 > /sys/class/net/ens801f0/device/reset [ 145.885537] idpf 0000:83:00.0: resetting [ 145.990280] idpf 0000:83:00.0: reset done [ 146.284766] idpf 0000:83:00.0: HW reset detected [ 146.296610] idpf 0000:83:00.0: Device HW Reset initiated [ 211.556719] idpf 0000:83:00.0: Transaction timed-out (op:526 cookie:7700 vc_op:526 salt:77 timeout:60000ms) [ 272.996705] idpf 0000:83:00.0: Transaction timed-out (op:502 cookie:7800 vc_op:502 salt:78 timeout:60000ms) ip -br a ens801f0d1 DOWN ens801f0d2 DOWN ens801f0d3 DOWN Re-shuffle the logic in the error path of the init task to make sure the netdevs remain intact. This will allow the driver to attempt recovery via subsequent resets, provided the FW is still functional. The main change is to make sure that idpf_decfg_netdev() is not called should the init task fail during a reset. The error handling is consolidated under unwind_vports, as the removed labels had the same cleanup logic split depending on the point of failure. Fixes: ce1b75d0635c ("idpf: add ptypes and MAC filter support") Signed-off-by: Emil Tantilov Reviewed-by: Aleksandr Loktionov Tested-by: Samuel Salin Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/idpf/idpf_lib.c | 17 ++++++----------- 1 file changed, 6 insertions(+), 11 deletions(-) diff --git a/drivers/net/ethernet/intel/idpf/idpf_lib.c b/drivers/net/ethernet/intel/idpf/idpf_lib.c index 7ce4eb71a433c..313803c088478 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_lib.c +++ b/drivers/net/ethernet/intel/idpf/idpf_lib.c @@ -1579,6 +1579,10 @@ void idpf_init_task(struct work_struct *work) goto unwind_vports; } + err = idpf_send_get_rx_ptype_msg(vport); + if (err) + goto unwind_vports; + index = vport->idx; vport_config = adapter->vport_config[index]; @@ -1590,15 +1594,11 @@ void idpf_init_task(struct work_struct *work) err = idpf_check_supported_desc_ids(vport); if (err) { dev_err(&pdev->dev, "failed to get required descriptor ids\n"); - goto cfg_netdev_err; + goto unwind_vports; } if (idpf_cfg_netdev(vport)) - goto cfg_netdev_err; - - err = idpf_send_get_rx_ptype_msg(vport); - if (err) - goto handle_err; + goto unwind_vports; /* Once state is put into DOWN, driver is ready for dev_open */ np = netdev_priv(vport->netdev); @@ -1645,11 +1645,6 @@ void idpf_init_task(struct work_struct *work) return; -handle_err: - idpf_decfg_netdev(vport); -cfg_netdev_err: - idpf_vport_rel(vport); - adapter->vports[index] = NULL; unwind_vports: if (default_vport) { for (index = 0; index < adapter->max_vports; index++) { From 068ace90e3c0dbeb623a7c29feaf177a4a3fd0e4 Mon Sep 17 00:00:00 2001 From: Emil Tantilov Date: Thu, 20 Nov 2025 16:12:15 -0800 Subject: [PATCH 0774/1321] idpf: detach and close netdevs while handling a reset Protect the reset path from callbacks by setting the netdevs to detached state and close any netdevs in UP state until the reset handling has completed. During a reset, the driver will de-allocate resources for the vport, and there is no guarantee that those will recover, which is why the existing vport_ctrl_lock does not provide sufficient protection. idpf_detach_and_close() is called right before reset handling. If the reset handling succeeds, the netdevs state is recovered via call to idpf_attach_and_open(). If the reset handling fails the netdevs remain down. The detach/down calls are protected with RTNL lock to avoid racing with callbacks. On the recovery side the attach can be done without holding the RTNL lock as there are no callbacks expected at that point, due to detach/close always being done first in that flow. The previous logic restoring the netdevs state based on the IDPF_VPORT_UP_REQUESTED flag in the init task is not needed anymore, hence the removal of idpf_set_vport_state(). The IDPF_VPORT_UP_REQUESTED is still being used to restore the state of the netdevs following the reset, but has no use outside of the reset handling flow. idpf_init_hard_reset() is converted to void, since it was used as such and there is no error handling being done based on its return value. Before this change, invoking hard and soft resets simultaneously will cause the driver to lose the vport state: ip -br a UP echo 1 > /sys/class/net/ens801f0/device/reset& \ ethtool -L ens801f0 combined 8 ip -br a DOWN ip link set up ip -br a DOWN Also in case of a failure in the reset path, the netdev is left exposed to external callbacks, while vport resources are not initialized, leading to a crash on subsequent ifup/down: [408471.398966] idpf 0000:83:00.0: HW reset detected [408471.411744] idpf 0000:83:00.0: Device HW Reset initiated [408472.277901] idpf 0000:83:00.0: The driver was unable to contact the device's firmware. Check that the FW is running. Driver state= 0x2 [408508.125551] BUG: kernel NULL pointer dereference, address: 0000000000000078 [408508.126112] #PF: supervisor read access in kernel mode [408508.126687] #PF: error_code(0x0000) - not-present page [408508.127256] PGD 2aae2f067 P4D 0 [408508.127824] Oops: Oops: 0000 [#1] SMP NOPTI ... [408508.130871] RIP: 0010:idpf_stop+0x39/0x70 [idpf] ... [408508.139193] Call Trace: [408508.139637] [408508.140077] __dev_close_many+0xbb/0x260 [408508.140533] __dev_change_flags+0x1cf/0x280 [408508.140987] netif_change_flags+0x26/0x70 [408508.141434] dev_change_flags+0x3d/0xb0 [408508.141878] devinet_ioctl+0x460/0x890 [408508.142321] inet_ioctl+0x18e/0x1d0 [408508.142762] ? _copy_to_user+0x22/0x70 [408508.143207] sock_do_ioctl+0x3d/0xe0 [408508.143652] sock_ioctl+0x10e/0x330 [408508.144091] ? find_held_lock+0x2b/0x80 [408508.144537] __x64_sys_ioctl+0x96/0xe0 [408508.144979] do_syscall_64+0x79/0x3d0 [408508.145415] entry_SYSCALL_64_after_hwframe+0x76/0x7e [408508.145860] RIP: 0033:0x7f3e0bb4caff Fixes: 0fe45467a104 ("idpf: add create vport and netdev configuration") Signed-off-by: Emil Tantilov Reviewed-by: Madhu Chittim Tested-by: Samuel Salin Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/idpf/idpf_lib.c | 121 ++++++++++++--------- 1 file changed, 72 insertions(+), 49 deletions(-) diff --git a/drivers/net/ethernet/intel/idpf/idpf_lib.c b/drivers/net/ethernet/intel/idpf/idpf_lib.c index 313803c088478..a964e0f5891eb 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_lib.c +++ b/drivers/net/ethernet/intel/idpf/idpf_lib.c @@ -729,6 +729,65 @@ static int idpf_init_mac_addr(struct idpf_vport *vport, return 0; } +static void idpf_detach_and_close(struct idpf_adapter *adapter) +{ + int max_vports = adapter->max_vports; + + for (int i = 0; i < max_vports; i++) { + struct net_device *netdev = adapter->netdevs[i]; + + /* If the interface is in detached state, that means the + * previous reset was not handled successfully for this + * vport. + */ + if (!netif_device_present(netdev)) + continue; + + /* Hold RTNL to protect racing with callbacks */ + rtnl_lock(); + netif_device_detach(netdev); + if (netif_running(netdev)) { + set_bit(IDPF_VPORT_UP_REQUESTED, + adapter->vport_config[i]->flags); + dev_close(netdev); + } + rtnl_unlock(); + } +} + +static void idpf_attach_and_open(struct idpf_adapter *adapter) +{ + int max_vports = adapter->max_vports; + + for (int i = 0; i < max_vports; i++) { + struct idpf_vport *vport = adapter->vports[i]; + struct idpf_vport_config *vport_config; + struct net_device *netdev; + + /* In case of a critical error in the init task, the vport + * will be freed. Only continue to restore the netdevs + * if the vport is allocated. + */ + if (!vport) + continue; + + /* No need for RTNL on attach as this function is called + * following detach and dev_close(). We do take RTNL for + * dev_open() below as it can race with external callbacks + * following the call to netif_device_attach(). + */ + netdev = adapter->netdevs[i]; + netif_device_attach(netdev); + vport_config = adapter->vport_config[vport->idx]; + if (test_and_clear_bit(IDPF_VPORT_UP_REQUESTED, + vport_config->flags)) { + rtnl_lock(); + dev_open(netdev, NULL); + rtnl_unlock(); + } + } +} + /** * idpf_cfg_netdev - Allocate, configure and register a netdev * @vport: main vport structure @@ -1041,10 +1100,11 @@ static void idpf_vport_dealloc(struct idpf_vport *vport) idpf_idc_deinit_vport_aux_device(vport->vdev_info); idpf_deinit_mac_addr(vport); - idpf_vport_stop(vport, true); - if (!test_bit(IDPF_HR_RESET_IN_PROG, adapter->flags)) + if (!test_bit(IDPF_HR_RESET_IN_PROG, adapter->flags)) { + idpf_vport_stop(vport, true); idpf_decfg_netdev(vport); + } if (test_bit(IDPF_REMOVE_IN_PROG, adapter->flags)) idpf_del_all_mac_filters(vport); @@ -1544,7 +1604,6 @@ void idpf_init_task(struct work_struct *work) struct idpf_vport_config *vport_config; struct idpf_vport_max_q max_q; struct idpf_adapter *adapter; - struct idpf_netdev_priv *np; struct idpf_vport *vport; u16 num_default_vports; struct pci_dev *pdev; @@ -1600,12 +1659,6 @@ void idpf_init_task(struct work_struct *work) if (idpf_cfg_netdev(vport)) goto unwind_vports; - /* Once state is put into DOWN, driver is ready for dev_open */ - np = netdev_priv(vport->netdev); - clear_bit(IDPF_VPORT_UP, np->state); - if (test_and_clear_bit(IDPF_VPORT_UP_REQUESTED, vport_config->flags)) - idpf_vport_open(vport, true); - /* Spawn and return 'idpf_init_task' work queue until all the * default vports are created */ @@ -1781,27 +1834,6 @@ static int idpf_check_reset_complete(struct idpf_hw *hw, return -EBUSY; } -/** - * idpf_set_vport_state - Set the vport state to be after the reset - * @adapter: Driver specific private structure - */ -static void idpf_set_vport_state(struct idpf_adapter *adapter) -{ - u16 i; - - for (i = 0; i < adapter->max_vports; i++) { - struct idpf_netdev_priv *np; - - if (!adapter->netdevs[i]) - continue; - - np = netdev_priv(adapter->netdevs[i]); - if (test_bit(IDPF_VPORT_UP, np->state)) - set_bit(IDPF_VPORT_UP_REQUESTED, - adapter->vport_config[i]->flags); - } -} - /** * idpf_init_hard_reset - Initiate a hardware reset * @adapter: Driver specific private structure @@ -1810,28 +1842,17 @@ static void idpf_set_vport_state(struct idpf_adapter *adapter) * reallocate. Also reinitialize the mailbox. Return 0 on success, * negative on failure. */ -static int idpf_init_hard_reset(struct idpf_adapter *adapter) +static void idpf_init_hard_reset(struct idpf_adapter *adapter) { struct idpf_reg_ops *reg_ops = &adapter->dev_ops.reg_ops; struct device *dev = &adapter->pdev->dev; - struct net_device *netdev; int err; - u16 i; + idpf_detach_and_close(adapter); mutex_lock(&adapter->vport_ctrl_lock); dev_info(dev, "Device HW Reset initiated\n"); - /* Avoid TX hangs on reset */ - for (i = 0; i < adapter->max_vports; i++) { - netdev = adapter->netdevs[i]; - if (!netdev) - continue; - - netif_carrier_off(netdev); - netif_tx_disable(netdev); - } - /* Prepare for reset */ if (test_and_clear_bit(IDPF_HR_DRV_LOAD, adapter->flags)) { reg_ops->trigger_reset(adapter, IDPF_HR_DRV_LOAD); @@ -1840,7 +1861,6 @@ static int idpf_init_hard_reset(struct idpf_adapter *adapter) idpf_idc_issue_reset_event(adapter->cdev_info); - idpf_set_vport_state(adapter); idpf_vc_core_deinit(adapter); if (!is_reset) reg_ops->trigger_reset(adapter, IDPF_HR_FUNC_RESET); @@ -1887,11 +1907,14 @@ static int idpf_init_hard_reset(struct idpf_adapter *adapter) unlock_mutex: mutex_unlock(&adapter->vport_ctrl_lock); - /* Wait until all vports are created to init RDMA CORE AUX */ - if (!err) - err = idpf_idc_init(adapter); - - return err; + /* Attempt to restore netdevs and initialize RDMA CORE AUX device, + * provided vc_core_init succeeded. It is still possible that + * vports are not allocated at this point if the init task failed. + */ + if (!err) { + idpf_attach_and_open(adapter); + idpf_idc_init(adapter); + } } /** From 20a6ccd5cd0bd9380207e0b9c598ed1a2c218ed8 Mon Sep 17 00:00:00 2001 From: Emil Tantilov Date: Thu, 20 Nov 2025 16:12:16 -0800 Subject: [PATCH 0775/1321] idpf: fix memory leak in idpf_vport_rel() Free vport->rx_ptype_lkup in idpf_vport_rel() to avoid leaking memory during a reset. Reported by kmemleak: unreferenced object 0xff450acac838a000 (size 4096): comm "kworker/u258:5", pid 7732, jiffies 4296830044 hex dump (first 32 bytes): 00 00 00 00 00 10 00 00 00 10 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 10 00 00 00 00 00 00 ................ backtrace (crc 3da81902): __kmalloc_cache_noprof+0x469/0x7a0 idpf_send_get_rx_ptype_msg+0x90/0x570 [idpf] idpf_init_task+0x1ec/0x8d0 [idpf] process_one_work+0x226/0x6d0 worker_thread+0x19e/0x340 kthread+0x10f/0x250 ret_from_fork+0x251/0x2b0 ret_from_fork_asm+0x1a/0x30 Fixes: 0fe45467a104 ("idpf: add create vport and netdev configuration") Signed-off-by: Emil Tantilov Reviewed-by: Aleksandr Loktionov Reviewed-by: Madhu Chittim Tested-by: Samuel Salin Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/idpf/idpf_lib.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/net/ethernet/intel/idpf/idpf_lib.c b/drivers/net/ethernet/intel/idpf/idpf_lib.c index a964e0f5891eb..04af10cfaa8cb 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_lib.c +++ b/drivers/net/ethernet/intel/idpf/idpf_lib.c @@ -1082,6 +1082,8 @@ static void idpf_vport_rel(struct idpf_vport *vport) kfree(adapter->vport_config[idx]->req_qs_chunks); adapter->vport_config[idx]->req_qs_chunks = NULL; } + kfree(vport->rx_ptype_lkup); + vport->rx_ptype_lkup = NULL; kfree(vport); adapter->num_alloc_vports--; } From 7d157e36cc35e8b6c5238cb2dedce900229d9980 Mon Sep 17 00:00:00 2001 From: Emil Tantilov Date: Thu, 20 Nov 2025 16:12:17 -0800 Subject: [PATCH 0776/1321] idpf: fix memory leak in idpf_vc_core_deinit() Make sure to free hw->lan_regs. Reported by kmemleak during reset: unreferenced object 0xff1b913d02a936c0 (size 96): comm "kworker/u258:14", pid 2174, jiffies 4294958305 hex dump (first 32 bytes): 00 00 00 c0 a8 ba 2d ff 00 00 00 00 00 00 00 00 ......-......... 00 00 40 08 00 00 00 00 00 00 25 b3 a8 ba 2d ff ..@.......%...-. backtrace (crc 36063c4f): __kmalloc_noprof+0x48f/0x890 idpf_vc_core_init+0x6ce/0x9b0 [idpf] idpf_vc_event_task+0x1fb/0x350 [idpf] process_one_work+0x226/0x6d0 worker_thread+0x19e/0x340 kthread+0x10f/0x250 ret_from_fork+0x251/0x2b0 ret_from_fork_asm+0x1a/0x30 Fixes: 6aa53e861c1a ("idpf: implement get LAN MMIO memory regions") Signed-off-by: Emil Tantilov Reviewed-by: Aleksandr Loktionov Reviewed-by: Joshua Hay Tested-by: Samuel Salin Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/idpf/idpf_virtchnl.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c index 5bbe7d9294c14..01bbd12a642a0 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c +++ b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c @@ -3570,6 +3570,7 @@ int idpf_vc_core_init(struct idpf_adapter *adapter) */ void idpf_vc_core_deinit(struct idpf_adapter *adapter) { + struct idpf_hw *hw = &adapter->hw; bool remove_in_prog; if (!test_bit(IDPF_VC_CORE_INIT, adapter->flags)) @@ -3593,6 +3594,9 @@ void idpf_vc_core_deinit(struct idpf_adapter *adapter) idpf_vport_params_buf_rel(adapter); + kfree(hw->lan_regs); + hw->lan_regs = NULL; + kfree(adapter->vports); adapter->vports = NULL; From 651a8d8b2c887aa36db5a4c3ae2650d4adcf3013 Mon Sep 17 00:00:00 2001 From: Emil Tantilov Date: Thu, 20 Nov 2025 16:12:18 -0800 Subject: [PATCH 0777/1321] idpf: fix error handling in the init_task on load If the init_task fails during a driver load, we end up without vports and netdevs, effectively failing the entire process. In that state a subsequent reset will result in a crash as the service task attempts to access uninitialized resources. Following trace is from an error in the init_task where the CREATE_VPORT (op 501) is rejected by the FW: [40922.763136] idpf 0000:83:00.0: Device HW Reset initiated [40924.449797] idpf 0000:83:00.0: Transaction failed (op 501) [40958.148190] idpf 0000:83:00.0: HW reset detected [40958.161202] BUG: kernel NULL pointer dereference, address: 00000000000000a8 ... [40958.168094] Workqueue: idpf-0000:83:00.0-vc_event idpf_vc_event_task [idpf] [40958.168865] RIP: 0010:idpf_vc_event_task+0x9b/0x350 [idpf] ... [40958.177932] Call Trace: [40958.178491] [40958.179040] process_one_work+0x226/0x6d0 [40958.179609] worker_thread+0x19e/0x340 [40958.180158] ? __pfx_worker_thread+0x10/0x10 [40958.180702] kthread+0x10f/0x250 [40958.181238] ? __pfx_kthread+0x10/0x10 [40958.181774] ret_from_fork+0x251/0x2b0 [40958.182307] ? __pfx_kthread+0x10/0x10 [40958.182834] ret_from_fork_asm+0x1a/0x30 [40958.183370] Fix the error handling in the init_task to make sure the service and mailbox tasks are disabled if the error happens during load. These are started in idpf_vc_core_init(), which spawns the init_task and has no way of knowing if it failed. If the error happens on reset, following successful driver load, the tasks can still run, as that will allow the netdevs to attempt recovery through another reset. Stop the PTP callbacks either way as those will be restarted by the call to idpf_vc_core_init() during a successful reset. Fixes: 0fe45467a104 ("idpf: add create vport and netdev configuration") Reported-by: Vivek Kumar Signed-off-by: Emil Tantilov Reviewed-by: Madhu Chittim Tested-by: Samuel Salin Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/idpf/idpf_lib.c | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/intel/idpf/idpf_lib.c b/drivers/net/ethernet/intel/idpf/idpf_lib.c index 04af10cfaa8cb..e2ee8b137421f 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_lib.c +++ b/drivers/net/ethernet/intel/idpf/idpf_lib.c @@ -1690,10 +1690,9 @@ void idpf_init_task(struct work_struct *work) set_bit(IDPF_VPORT_REG_NETDEV, vport_config->flags); } - /* As all the required vports are created, clear the reset flag - * unconditionally here in case we were in reset and the link was down. - */ + /* Clear the reset and load bits as all vports are created */ clear_bit(IDPF_HR_RESET_IN_PROG, adapter->flags); + clear_bit(IDPF_HR_DRV_LOAD, adapter->flags); /* Start the statistics task now */ queue_delayed_work(adapter->stats_wq, &adapter->stats_task, msecs_to_jiffies(10 * (pdev->devfn & 0x07))); @@ -1707,6 +1706,15 @@ void idpf_init_task(struct work_struct *work) idpf_vport_dealloc(adapter->vports[index]); } } + /* Cleanup after vc_core_init, which has no way of knowing the + * init task failed on driver load. + */ + if (test_and_clear_bit(IDPF_HR_DRV_LOAD, adapter->flags)) { + cancel_delayed_work_sync(&adapter->serv_task); + cancel_delayed_work_sync(&adapter->mbx_task); + } + idpf_ptp_release(adapter); + clear_bit(IDPF_HR_RESET_IN_PROG, adapter->flags); } @@ -1856,7 +1864,7 @@ static void idpf_init_hard_reset(struct idpf_adapter *adapter) dev_info(dev, "Device HW Reset initiated\n"); /* Prepare for reset */ - if (test_and_clear_bit(IDPF_HR_DRV_LOAD, adapter->flags)) { + if (test_bit(IDPF_HR_DRV_LOAD, adapter->flags)) { reg_ops->trigger_reset(adapter, IDPF_HR_DRV_LOAD); } else if (test_and_clear_bit(IDPF_HR_FUNC_RESET, adapter->flags)) { bool is_reset = idpf_is_reset_detected(adapter); From 92710c6211deb2de87f95f23fcd6c8f4053daf43 Mon Sep 17 00:00:00 2001 From: Sreedevi Joshi Date: Tue, 30 Sep 2025 16:23:51 -0500 Subject: [PATCH 0778/1321] idpf: fix memory leak of flow steer list on rmmod The flow steering list maintains entries that are added and removed as ethtool creates and deletes flow steering rules. Module removal with active entries causes memory leak as the list is not properly cleaned up. Prevent this by iterating through the remaining entries in the list and freeing the associated memory during module removal. Add a spinlock (flow_steer_list_lock) to protect the list access from multiple threads. Fixes: ada3e24b84a0 ("idpf: add flow steering support") Reviewed-by: Przemek Kitszel Reviewed-by: Aleksandr Loktionov Signed-off-by: Sreedevi Joshi Reviewed-by: Simon Horman Tested-by: Mina Almasry Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/idpf/idpf.h | 2 ++ .../net/ethernet/intel/idpf/idpf_ethtool.c | 15 ++++++++-- drivers/net/ethernet/intel/idpf/idpf_lib.c | 28 ++++++++++++++++++- 3 files changed, 42 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/intel/idpf/idpf.h b/drivers/net/ethernet/intel/idpf/idpf.h index 8cfc68cbfa06d..a61821333f5df 100644 --- a/drivers/net/ethernet/intel/idpf/idpf.h +++ b/drivers/net/ethernet/intel/idpf/idpf.h @@ -558,6 +558,7 @@ struct idpf_vector_lifo { * @max_q: Maximum possible queues * @req_qs_chunks: Queue chunk data for requested queues * @mac_filter_list_lock: Lock to protect mac filters + * @flow_steer_list_lock: Lock to protect fsteer filters * @flags: See enum idpf_vport_config_flags */ struct idpf_vport_config { @@ -565,6 +566,7 @@ struct idpf_vport_config { struct idpf_vport_max_q max_q; struct virtchnl2_add_queues *req_qs_chunks; spinlock_t mac_filter_list_lock; + spinlock_t flow_steer_list_lock; DECLARE_BITMAP(flags, IDPF_VPORT_CONFIG_FLAGS_NBITS); }; diff --git a/drivers/net/ethernet/intel/idpf/idpf_ethtool.c b/drivers/net/ethernet/intel/idpf/idpf_ethtool.c index 2589e124e41cd..00481fec81793 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_ethtool.c +++ b/drivers/net/ethernet/intel/idpf/idpf_ethtool.c @@ -37,6 +37,7 @@ static int idpf_get_rxnfc(struct net_device *netdev, struct ethtool_rxnfc *cmd, { struct idpf_netdev_priv *np = netdev_priv(netdev); struct idpf_vport_user_config_data *user_config; + struct idpf_vport_config *vport_config; struct idpf_fsteer_fltr *f; struct idpf_vport *vport; unsigned int cnt = 0; @@ -44,7 +45,8 @@ static int idpf_get_rxnfc(struct net_device *netdev, struct ethtool_rxnfc *cmd, idpf_vport_ctrl_lock(netdev); vport = idpf_netdev_to_vport(netdev); - user_config = &np->adapter->vport_config[np->vport_idx]->user_config; + vport_config = np->adapter->vport_config[np->vport_idx]; + user_config = &vport_config->user_config; switch (cmd->cmd) { case ETHTOOL_GRXCLSRLCNT: @@ -53,15 +55,18 @@ static int idpf_get_rxnfc(struct net_device *netdev, struct ethtool_rxnfc *cmd, break; case ETHTOOL_GRXCLSRULE: err = -EINVAL; + spin_lock_bh(&vport_config->flow_steer_list_lock); list_for_each_entry(f, &user_config->flow_steer_list, list) if (f->loc == cmd->fs.location) { cmd->fs.ring_cookie = f->q_index; err = 0; break; } + spin_unlock_bh(&vport_config->flow_steer_list_lock); break; case ETHTOOL_GRXCLSRLALL: cmd->data = idpf_fsteer_max_rules(vport); + spin_lock_bh(&vport_config->flow_steer_list_lock); list_for_each_entry(f, &user_config->flow_steer_list, list) { if (cnt == cmd->rule_cnt) { err = -EMSGSIZE; @@ -72,6 +77,7 @@ static int idpf_get_rxnfc(struct net_device *netdev, struct ethtool_rxnfc *cmd, } if (!err) cmd->rule_cnt = user_config->num_fsteer_fltrs; + spin_unlock_bh(&vport_config->flow_steer_list_lock); break; default: break; @@ -240,6 +246,7 @@ static int idpf_add_flow_steer(struct net_device *netdev, fltr->loc = fsp->location; fltr->q_index = q_index; + spin_lock_bh(&vport_config->flow_steer_list_lock); list_for_each_entry(f, &user_config->flow_steer_list, list) { if (f->loc >= fltr->loc) break; @@ -250,6 +257,7 @@ static int idpf_add_flow_steer(struct net_device *netdev, list_add(&fltr->list, &user_config->flow_steer_list); user_config->num_fsteer_fltrs++; + spin_unlock_bh(&vport_config->flow_steer_list_lock); out: kfree(rule); @@ -302,17 +310,20 @@ static int idpf_del_flow_steer(struct net_device *netdev, goto out; } + spin_lock_bh(&vport_config->flow_steer_list_lock); list_for_each_entry_safe(f, iter, &user_config->flow_steer_list, list) { if (f->loc == fsp->location) { list_del(&f->list); kfree(f); user_config->num_fsteer_fltrs--; - goto out; + goto out_unlock; } } err = -EINVAL; +out_unlock: + spin_unlock_bh(&vport_config->flow_steer_list_lock); out: kfree(rule); return err; diff --git a/drivers/net/ethernet/intel/idpf/idpf_lib.c b/drivers/net/ethernet/intel/idpf/idpf_lib.c index e2ee8b137421f..d56366e676cf7 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_lib.c +++ b/drivers/net/ethernet/intel/idpf/idpf_lib.c @@ -442,6 +442,29 @@ int idpf_intr_req(struct idpf_adapter *adapter) return err; } +/** + * idpf_del_all_flow_steer_filters - Delete all flow steer filters in list + * @vport: main vport struct + * + * Takes flow_steer_list_lock spinlock. Deletes all filters + */ +static void idpf_del_all_flow_steer_filters(struct idpf_vport *vport) +{ + struct idpf_vport_config *vport_config; + struct idpf_fsteer_fltr *f, *ftmp; + + vport_config = vport->adapter->vport_config[vport->idx]; + + spin_lock_bh(&vport_config->flow_steer_list_lock); + list_for_each_entry_safe(f, ftmp, &vport_config->user_config.flow_steer_list, + list) { + list_del(&f->list); + kfree(f); + } + vport_config->user_config.num_fsteer_fltrs = 0; + spin_unlock_bh(&vport_config->flow_steer_list_lock); +} + /** * idpf_find_mac_filter - Search filter list for specific mac filter * @vconfig: Vport config structure @@ -1107,8 +1130,10 @@ static void idpf_vport_dealloc(struct idpf_vport *vport) idpf_vport_stop(vport, true); idpf_decfg_netdev(vport); } - if (test_bit(IDPF_REMOVE_IN_PROG, adapter->flags)) + if (test_bit(IDPF_REMOVE_IN_PROG, adapter->flags)) { idpf_del_all_mac_filters(vport); + idpf_del_all_flow_steer_filters(vport); + } if (adapter->netdevs[i]) { struct idpf_netdev_priv *np = netdev_priv(adapter->netdevs[i]); @@ -1648,6 +1673,7 @@ void idpf_init_task(struct work_struct *work) vport_config = adapter->vport_config[index]; spin_lock_init(&vport_config->mac_filter_list_lock); + spin_lock_init(&vport_config->flow_steer_list_lock); INIT_LIST_HEAD(&vport_config->user_config.mac_filter_list); INIT_LIST_HEAD(&vport_config->user_config.flow_steer_list); From 27e71b26d9a24feee98ed846ed7b168db3646644 Mon Sep 17 00:00:00 2001 From: Erik Gabriel Carrillo Date: Tue, 30 Sep 2025 16:23:52 -0500 Subject: [PATCH 0779/1321] idpf: fix issue with ethtool -n command display When ethtool -n is executed on an interface to display the flow steering rules, "rxclass: Unknown flow type" error is generated. The flow steering list maintained in the driver currently stores only the location and q_index but other fields of the ethtool_rx_flow_spec are not stored. This may be enough for the virtchnl command to delete the entry. However, when the ethtool -n command is used to query the flow steering rules, the ethtool_rx_flow_spec returned is not complete causing the error below. Resolve this by storing the flow spec (fsp) when rules are added and returning the complete flow spec when rules are queried. Also, change the return value from EINVAL to ENOENT when flow steering entry is not found during query by location or when deleting an entry. Add logic to detect and reject duplicate filter entries at the same location and change logic to perform upfront validation of all error conditions before adding flow rules through virtchnl. This avoids the need for additional virtchnl delete messages when subsequent operations fail, which was missing in the original upstream code. Example: Before the fix: ethtool -n eth1 2 RX rings available Total 2 rules rxclass: Unknown flow type rxclass: Unknown flow type After the fix: ethtool -n eth1 2 RX rings available Total 2 rules Filter: 0 Rule Type: TCP over IPv4 Src IP addr: 10.0.0.1 mask: 0.0.0.0 Dest IP addr: 0.0.0.0 mask: 255.255.255.255 TOS: 0x0 mask: 0xff Src port: 0 mask: 0xffff Dest port: 0 mask: 0xffff Action: Direct to queue 0 Filter: 1 Rule Type: UDP over IPv4 Src IP addr: 10.0.0.1 mask: 0.0.0.0 Dest IP addr: 0.0.0.0 mask: 255.255.255.255 TOS: 0x0 mask: 0xff Src port: 0 mask: 0xffff Dest port: 0 mask: 0xffff Action: Direct to queue 0 Fixes: ada3e24b84a0 ("idpf: add flow steering support") Signed-off-by: Erik Gabriel Carrillo Co-developed-by: Sreedevi Joshi Signed-off-by: Sreedevi Joshi Reviewed-by: Przemek Kitszel Reviewed-by: Aleksandr Loktionov Reviewed-by: Simon Horman Tested-by: Mina Almasry Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/idpf/idpf.h | 3 +- .../net/ethernet/intel/idpf/idpf_ethtool.c | 59 ++++++++++++------- 2 files changed, 40 insertions(+), 22 deletions(-) diff --git a/drivers/net/ethernet/intel/idpf/idpf.h b/drivers/net/ethernet/intel/idpf/idpf.h index a61821333f5df..dab36c0c3cdc3 100644 --- a/drivers/net/ethernet/intel/idpf/idpf.h +++ b/drivers/net/ethernet/intel/idpf/idpf.h @@ -284,8 +284,7 @@ struct idpf_port_stats { struct idpf_fsteer_fltr { struct list_head list; - u32 loc; - u32 q_index; + struct ethtool_rx_flow_spec fs; }; /** diff --git a/drivers/net/ethernet/intel/idpf/idpf_ethtool.c b/drivers/net/ethernet/intel/idpf/idpf_ethtool.c index 00481fec81793..7000f6283a335 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_ethtool.c +++ b/drivers/net/ethernet/intel/idpf/idpf_ethtool.c @@ -54,11 +54,15 @@ static int idpf_get_rxnfc(struct net_device *netdev, struct ethtool_rxnfc *cmd, cmd->data = idpf_fsteer_max_rules(vport); break; case ETHTOOL_GRXCLSRULE: - err = -EINVAL; + err = -ENOENT; spin_lock_bh(&vport_config->flow_steer_list_lock); list_for_each_entry(f, &user_config->flow_steer_list, list) - if (f->loc == cmd->fs.location) { - cmd->fs.ring_cookie = f->q_index; + if (f->fs.location == cmd->fs.location) { + /* Avoid infoleak from padding: zero first, + * then assign fields + */ + memset(&cmd->fs, 0, sizeof(cmd->fs)); + cmd->fs = f->fs; err = 0; break; } @@ -72,7 +76,7 @@ static int idpf_get_rxnfc(struct net_device *netdev, struct ethtool_rxnfc *cmd, err = -EMSGSIZE; break; } - rule_locs[cnt] = f->loc; + rule_locs[cnt] = f->fs.location; cnt++; } if (!err) @@ -174,7 +178,7 @@ static int idpf_add_flow_steer(struct net_device *netdev, struct idpf_vport *vport; u32 flow_type, q_index; u16 num_rxq; - int err; + int err = 0; vport = idpf_netdev_to_vport(netdev); vport_config = vport->adapter->vport_config[np->vport_idx]; @@ -200,6 +204,29 @@ static int idpf_add_flow_steer(struct net_device *netdev, if (!rule) return -ENOMEM; + fltr = kzalloc(sizeof(*fltr), GFP_KERNEL); + if (!fltr) { + err = -ENOMEM; + goto out_free_rule; + } + + /* detect duplicate entry and reject before adding rules */ + spin_lock_bh(&vport_config->flow_steer_list_lock); + list_for_each_entry(f, &user_config->flow_steer_list, list) { + if (f->fs.location == fsp->location) { + err = -EEXIST; + break; + } + + if (f->fs.location > fsp->location) + break; + parent = f; + } + spin_unlock_bh(&vport_config->flow_steer_list_lock); + + if (err) + goto out; + rule->vport_id = cpu_to_le32(vport->vport_id); rule->count = cpu_to_le32(1); info = &rule->rule_info[0]; @@ -238,28 +265,20 @@ static int idpf_add_flow_steer(struct net_device *netdev, goto out; } - fltr = kzalloc(sizeof(*fltr), GFP_KERNEL); - if (!fltr) { - err = -ENOMEM; - goto out; - } + /* Save a copy of the user's flow spec so ethtool can later retrieve it */ + fltr->fs = *fsp; - fltr->loc = fsp->location; - fltr->q_index = q_index; spin_lock_bh(&vport_config->flow_steer_list_lock); - list_for_each_entry(f, &user_config->flow_steer_list, list) { - if (f->loc >= fltr->loc) - break; - parent = f; - } - parent ? list_add(&fltr->list, &parent->list) : list_add(&fltr->list, &user_config->flow_steer_list); user_config->num_fsteer_fltrs++; spin_unlock_bh(&vport_config->flow_steer_list_lock); + goto out_free_rule; out: + kfree(fltr); +out_free_rule: kfree(rule); return err; } @@ -313,14 +332,14 @@ static int idpf_del_flow_steer(struct net_device *netdev, spin_lock_bh(&vport_config->flow_steer_list_lock); list_for_each_entry_safe(f, iter, &user_config->flow_steer_list, list) { - if (f->loc == fsp->location) { + if (f->fs.location == fsp->location) { list_del(&f->list); kfree(f); user_config->num_fsteer_fltrs--; goto out_unlock; } } - err = -EINVAL; + err = -ENOENT; out_unlock: spin_unlock_bh(&vport_config->flow_steer_list_lock); From 29b41ee4dbc23120b1821aa643d6bc875b10d51d Mon Sep 17 00:00:00 2001 From: Sreedevi Joshi Date: Mon, 24 Nov 2025 12:47:48 -0600 Subject: [PATCH 0780/1321] idpf: Fix RSS LUT NULL pointer crash on early ethtool operations The RSS LUT is not initialized until the interface comes up, causing the following NULL pointer crash when ethtool operations like rxhash on/off are performed before the interface is brought up for the first time. Move RSS LUT initialization from ndo_open to vport creation to ensure LUT is always available. This enables RSS configuration via ethtool before bringing the interface up. Simplify LUT management by maintaining all changes in the driver's soft copy and programming zeros to the indirection table when rxhash is disabled. Defer HW programming until the interface comes up if it is down during rxhash and LUT configuration changes. Steps to reproduce: ** Load idpf driver; interfaces will be created modprobe idpf ** Before bringing the interfaces up, turn rxhash off ethtool -K eth2 rxhash off [89408.371875] BUG: kernel NULL pointer dereference, address: 0000000000000000 [89408.371908] #PF: supervisor read access in kernel mode [89408.371924] #PF: error_code(0x0000) - not-present page [89408.371940] PGD 0 P4D 0 [89408.371953] Oops: Oops: 0000 [#1] SMP NOPTI [89408.372052] RIP: 0010:memcpy_orig+0x16/0x130 [89408.372310] Call Trace: [89408.372317] [89408.372326] ? idpf_set_features+0xfc/0x180 [idpf] [89408.372363] __netdev_update_features+0x295/0xde0 [89408.372384] ethnl_set_features+0x15e/0x460 [89408.372406] genl_family_rcv_msg_doit+0x11f/0x180 [89408.372429] genl_rcv_msg+0x1ad/0x2b0 [89408.372446] ? __pfx_ethnl_set_features+0x10/0x10 [89408.372465] ? __pfx_genl_rcv_msg+0x10/0x10 [89408.372482] netlink_rcv_skb+0x58/0x100 [89408.372502] genl_rcv+0x2c/0x50 [89408.372516] netlink_unicast+0x289/0x3e0 [89408.372533] netlink_sendmsg+0x215/0x440 [89408.372551] __sys_sendto+0x234/0x240 [89408.372571] __x64_sys_sendto+0x28/0x30 [89408.372585] x64_sys_call+0x1909/0x1da0 [89408.372604] do_syscall_64+0x7a/0xfa0 [89408.373140] ? clear_bhb_loop+0x60/0xb0 [89408.373647] entry_SYSCALL_64_after_hwframe+0x76/0x7e [89408.378887] Fixes: a251eee62133 ("idpf: add SRIOV support and other ndo_ops") Signed-off-by: Sreedevi Joshi Reviewed-by: Sridhar Samudrala Reviewed-by: Emil Tantilov Reviewed-by: Aleksandr Loktionov Reviewed-by: Paul Menzel Reviewed-by: Simon Horman Tested-by: Samuel Salin Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/idpf/idpf.h | 2 - drivers/net/ethernet/intel/idpf/idpf_lib.c | 94 +++++++++---------- drivers/net/ethernet/intel/idpf/idpf_txrx.c | 36 +++---- drivers/net/ethernet/intel/idpf/idpf_txrx.h | 4 +- .../net/ethernet/intel/idpf/idpf_virtchnl.c | 9 +- 5 files changed, 66 insertions(+), 79 deletions(-) diff --git a/drivers/net/ethernet/intel/idpf/idpf.h b/drivers/net/ethernet/intel/idpf/idpf.h index dab36c0c3cdc3..1bf7934d4e28b 100644 --- a/drivers/net/ethernet/intel/idpf/idpf.h +++ b/drivers/net/ethernet/intel/idpf/idpf.h @@ -423,14 +423,12 @@ enum idpf_user_flags { * @rss_key: RSS hash key * @rss_lut_size: Size of RSS lookup table * @rss_lut: RSS lookup table - * @cached_lut: Used to restore previously init RSS lut */ struct idpf_rss_data { u16 rss_key_size; u8 *rss_key; u16 rss_lut_size; u32 *rss_lut; - u32 *cached_lut; }; /** diff --git a/drivers/net/ethernet/intel/idpf/idpf_lib.c b/drivers/net/ethernet/intel/idpf/idpf_lib.c index d56366e676cf7..51716e5a84ef3 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_lib.c +++ b/drivers/net/ethernet/intel/idpf/idpf_lib.c @@ -1073,7 +1073,7 @@ static void idpf_vport_rel(struct idpf_vport *vport) u16 idx = vport->idx; vport_config = adapter->vport_config[vport->idx]; - idpf_deinit_rss(vport); + idpf_deinit_rss_lut(vport); rss_data = &vport_config->user_config.rss_data; kfree(rss_data->rss_key); rss_data->rss_key = NULL; @@ -1226,6 +1226,7 @@ static struct idpf_vport *idpf_vport_alloc(struct idpf_adapter *adapter, u16 idx = adapter->next_vport; struct idpf_vport *vport; u16 num_max_q; + int err; if (idx == IDPF_NO_FREE_SLOT) return NULL; @@ -1276,10 +1277,11 @@ static struct idpf_vport *idpf_vport_alloc(struct idpf_adapter *adapter, idpf_vport_init(vport, max_q); - /* This alloc is done separate from the LUT because it's not strictly - * dependent on how many queues we have. If we change number of queues - * and soft reset we'll need a new LUT but the key can remain the same - * for as long as the vport exists. + /* LUT and key are both initialized here. Key is not strictly dependent + * on how many queues we have. If we change number of queues and soft + * reset is initiated, LUT will be freed and a new LUT will be allocated + * as per the updated number of queues during vport bringup. However, + * the key remains the same for as long as the vport exists. */ rss_data = &adapter->vport_config[idx]->user_config.rss_data; rss_data->rss_key = kzalloc(rss_data->rss_key_size, GFP_KERNEL); @@ -1289,6 +1291,11 @@ static struct idpf_vport *idpf_vport_alloc(struct idpf_adapter *adapter, /* Initialize default rss key */ netdev_rss_key_fill((void *)rss_data->rss_key, rss_data->rss_key_size); + /* Initialize default rss LUT */ + err = idpf_init_rss_lut(vport); + if (err) + goto free_rss_key; + /* fill vport slot in the adapter struct */ adapter->vports[idx] = vport; adapter->vport_ids[idx] = idpf_get_vport_id(vport); @@ -1299,6 +1306,8 @@ static struct idpf_vport *idpf_vport_alloc(struct idpf_adapter *adapter, return vport; +free_rss_key: + kfree(rss_data->rss_key); free_vector_idxs: kfree(vport->q_vector_idxs); free_vport: @@ -1476,6 +1485,7 @@ static int idpf_vport_open(struct idpf_vport *vport, bool rtnl) struct idpf_netdev_priv *np = netdev_priv(vport->netdev); struct idpf_adapter *adapter = vport->adapter; struct idpf_vport_config *vport_config; + struct idpf_rss_data *rss_data; int err; if (test_bit(IDPF_VPORT_UP, np->state)) @@ -1570,12 +1580,21 @@ static int idpf_vport_open(struct idpf_vport *vport, bool rtnl) idpf_restore_features(vport); vport_config = adapter->vport_config[vport->idx]; - if (vport_config->user_config.rss_data.rss_lut) - err = idpf_config_rss(vport); - else - err = idpf_init_rss(vport); + rss_data = &vport_config->user_config.rss_data; + + if (!rss_data->rss_lut) { + err = idpf_init_rss_lut(vport); + if (err) { + dev_err(&adapter->pdev->dev, + "Failed to initialize RSS LUT for vport %u: %d\n", + vport->vport_id, err); + goto disable_vport; + } + } + + err = idpf_config_rss(vport); if (err) { - dev_err(&adapter->pdev->dev, "Failed to initialize RSS for vport %u: %d\n", + dev_err(&adapter->pdev->dev, "Failed to configure RSS for vport %u: %d\n", vport->vport_id, err); goto disable_vport; } @@ -1584,7 +1603,7 @@ static int idpf_vport_open(struct idpf_vport *vport, bool rtnl) if (err) { dev_err(&adapter->pdev->dev, "Failed to complete interface up for vport %u: %d\n", vport->vport_id, err); - goto deinit_rss; + goto disable_vport; } if (rtnl) @@ -1592,8 +1611,6 @@ static int idpf_vport_open(struct idpf_vport *vport, bool rtnl) return 0; -deinit_rss: - idpf_deinit_rss(vport); disable_vport: idpf_send_disable_vport_msg(vport); disable_queues: @@ -2051,7 +2068,7 @@ int idpf_initiate_soft_reset(struct idpf_vport *vport, idpf_vport_stop(vport, false); } - idpf_deinit_rss(vport); + idpf_deinit_rss_lut(vport); /* We're passing in vport here because we need its wait_queue * to send a message and it should be getting all the vport * config data out of the adapter but we need to be careful not @@ -2219,40 +2236,6 @@ static void idpf_set_rx_mode(struct net_device *netdev) dev_err(dev, "Failed to set promiscuous mode: %d\n", err); } -/** - * idpf_vport_manage_rss_lut - disable/enable RSS - * @vport: the vport being changed - * - * In the event of disable request for RSS, this function will zero out RSS - * LUT, while in the event of enable request for RSS, it will reconfigure RSS - * LUT with the default LUT configuration. - */ -static int idpf_vport_manage_rss_lut(struct idpf_vport *vport) -{ - bool ena = idpf_is_feature_ena(vport, NETIF_F_RXHASH); - struct idpf_rss_data *rss_data; - u16 idx = vport->idx; - int lut_size; - - rss_data = &vport->adapter->vport_config[idx]->user_config.rss_data; - lut_size = rss_data->rss_lut_size * sizeof(u32); - - if (ena) { - /* This will contain the default or user configured LUT */ - memcpy(rss_data->rss_lut, rss_data->cached_lut, lut_size); - } else { - /* Save a copy of the current LUT to be restored later if - * requested. - */ - memcpy(rss_data->cached_lut, rss_data->rss_lut, lut_size); - - /* Zero out the current LUT to disable */ - memset(rss_data->rss_lut, 0, lut_size); - } - - return idpf_config_rss(vport); -} - /** * idpf_set_features - set the netdev feature flags * @netdev: ptr to the netdev being adjusted @@ -2278,10 +2261,19 @@ static int idpf_set_features(struct net_device *netdev, } if (changed & NETIF_F_RXHASH) { + struct idpf_netdev_priv *np = netdev_priv(netdev); + netdev->features ^= NETIF_F_RXHASH; - err = idpf_vport_manage_rss_lut(vport); - if (err) - goto unlock_mutex; + + /* If the interface is not up when changing the rxhash, update + * to the HW is skipped. The updated LUT will be committed to + * the HW when the interface is brought up. + */ + if (test_bit(IDPF_VPORT_UP, np->state)) { + err = idpf_config_rss(vport); + if (err) + goto unlock_mutex; + } } if (changed & NETIF_F_GRO_HW) { diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.c b/drivers/net/ethernet/intel/idpf/idpf_txrx.c index 1d91c56f74696..8991a891a440a 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_txrx.c +++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c @@ -4650,57 +4650,47 @@ static void idpf_fill_dflt_rss_lut(struct idpf_vport *vport) rss_data = &adapter->vport_config[vport->idx]->user_config.rss_data; - for (i = 0; i < rss_data->rss_lut_size; i++) { + for (i = 0; i < rss_data->rss_lut_size; i++) rss_data->rss_lut[i] = i % num_active_rxq; - rss_data->cached_lut[i] = rss_data->rss_lut[i]; - } } /** - * idpf_init_rss - Allocate and initialize RSS resources + * idpf_init_rss_lut - Allocate and initialize RSS LUT * @vport: virtual port * - * Return 0 on success, negative on failure + * Return: 0 on success, negative on failure */ -int idpf_init_rss(struct idpf_vport *vport) +int idpf_init_rss_lut(struct idpf_vport *vport) { struct idpf_adapter *adapter = vport->adapter; struct idpf_rss_data *rss_data; - u32 lut_size; rss_data = &adapter->vport_config[vport->idx]->user_config.rss_data; + if (!rss_data->rss_lut) { + u32 lut_size; - lut_size = rss_data->rss_lut_size * sizeof(u32); - rss_data->rss_lut = kzalloc(lut_size, GFP_KERNEL); - if (!rss_data->rss_lut) - return -ENOMEM; - - rss_data->cached_lut = kzalloc(lut_size, GFP_KERNEL); - if (!rss_data->cached_lut) { - kfree(rss_data->rss_lut); - rss_data->rss_lut = NULL; - - return -ENOMEM; + lut_size = rss_data->rss_lut_size * sizeof(u32); + rss_data->rss_lut = kzalloc(lut_size, GFP_KERNEL); + if (!rss_data->rss_lut) + return -ENOMEM; } /* Fill the default RSS lut values */ idpf_fill_dflt_rss_lut(vport); - return idpf_config_rss(vport); + return 0; } /** - * idpf_deinit_rss - Release RSS resources + * idpf_deinit_rss_lut - Release RSS LUT * @vport: virtual port */ -void idpf_deinit_rss(struct idpf_vport *vport) +void idpf_deinit_rss_lut(struct idpf_vport *vport) { struct idpf_adapter *adapter = vport->adapter; struct idpf_rss_data *rss_data; rss_data = &adapter->vport_config[vport->idx]->user_config.rss_data; - kfree(rss_data->cached_lut); - rss_data->cached_lut = NULL; kfree(rss_data->rss_lut); rss_data->rss_lut = NULL; } diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.h b/drivers/net/ethernet/intel/idpf/idpf_txrx.h index 75b977094741f..7d20593bd8778 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_txrx.h +++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.h @@ -1086,8 +1086,8 @@ void idpf_vport_intr_deinit(struct idpf_vport *vport); int idpf_vport_intr_init(struct idpf_vport *vport); void idpf_vport_intr_ena(struct idpf_vport *vport); int idpf_config_rss(struct idpf_vport *vport); -int idpf_init_rss(struct idpf_vport *vport); -void idpf_deinit_rss(struct idpf_vport *vport); +int idpf_init_rss_lut(struct idpf_vport *vport); +void idpf_deinit_rss_lut(struct idpf_vport *vport); int idpf_rx_bufs_init_all(struct idpf_vport *vport); struct idpf_q_vector *idpf_find_rxq_vec(const struct idpf_vport *vport, diff --git a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c index 01bbd12a642a0..cb702eac86c80 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c +++ b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c @@ -2804,6 +2804,10 @@ int idpf_send_get_stats_msg(struct idpf_vport *vport) * @vport: virtual port data structure * @get: flag to set or get rss look up table * + * When rxhash is disabled, RSS LUT will be configured with zeros. If rxhash + * is enabled, the LUT values stored in driver's soft copy will be used to setup + * the HW. + * * Returns 0 on success, negative on failure. */ int idpf_send_get_set_rss_lut_msg(struct idpf_vport *vport, bool get) @@ -2814,10 +2818,12 @@ int idpf_send_get_set_rss_lut_msg(struct idpf_vport *vport, bool get) struct idpf_rss_data *rss_data; int buf_size, lut_buf_size; ssize_t reply_sz; + bool rxhash_ena; int i; rss_data = &vport->adapter->vport_config[vport->idx]->user_config.rss_data; + rxhash_ena = idpf_is_feature_ena(vport, NETIF_F_RXHASH); buf_size = struct_size(rl, lut, rss_data->rss_lut_size); rl = kzalloc(buf_size, GFP_KERNEL); if (!rl) @@ -2839,7 +2845,8 @@ int idpf_send_get_set_rss_lut_msg(struct idpf_vport *vport, bool get) } else { rl->lut_entries = cpu_to_le16(rss_data->rss_lut_size); for (i = 0; i < rss_data->rss_lut_size; i++) - rl->lut[i] = cpu_to_le32(rss_data->rss_lut[i]); + rl->lut[i] = rxhash_ena ? + cpu_to_le32(rss_data->rss_lut[i]) : 0; xn_params.vc_op = VIRTCHNL2_OP_SET_RSS_LUT; } From 1f7f4c77cd97ef2f6e094c6239cbd2dff45fbc73 Mon Sep 17 00:00:00 2001 From: Sreedevi Joshi Date: Mon, 24 Nov 2025 12:47:49 -0600 Subject: [PATCH 0781/1321] idpf: Fix RSS LUT configuration on down interfaces RSS LUT provisioning and queries on a down interface currently return silently without effect. Users should be able to configure RSS settings even when the interface is down. Fix by maintaining RSS configuration changes in the driver's soft copy and deferring HW programming until the interface comes up. Fixes: 02cbfba1add5 ("idpf: add ethtool callbacks") Signed-off-by: Sreedevi Joshi Reviewed-by: Aleksandr Loktionov Reviewed-by: Sridhar Samudrala Reviewed-by: Emil Tantilov Tested-by: Samuel Salin Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/idpf/idpf_ethtool.c | 18 +++++++++++------- 1 file changed, 11 insertions(+), 7 deletions(-) diff --git a/drivers/net/ethernet/intel/idpf/idpf_ethtool.c b/drivers/net/ethernet/intel/idpf/idpf_ethtool.c index 7000f6283a335..2efa3c08aba5c 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_ethtool.c +++ b/drivers/net/ethernet/intel/idpf/idpf_ethtool.c @@ -411,7 +411,10 @@ static u32 idpf_get_rxfh_indir_size(struct net_device *netdev) * @netdev: network interface device structure * @rxfh: pointer to param struct (indir, key, hfunc) * - * Reads the indirection table directly from the hardware. Always returns 0. + * RSS LUT and Key information are read from driver's cached + * copy. When rxhash is off, rss lut will be displayed as zeros. + * + * Return: 0 on success, -errno otherwise. */ static int idpf_get_rxfh(struct net_device *netdev, struct ethtool_rxfh_param *rxfh) @@ -419,10 +422,13 @@ static int idpf_get_rxfh(struct net_device *netdev, struct idpf_netdev_priv *np = netdev_priv(netdev); struct idpf_rss_data *rss_data; struct idpf_adapter *adapter; + struct idpf_vport *vport; + bool rxhash_ena; int err = 0; u16 i; idpf_vport_ctrl_lock(netdev); + vport = idpf_netdev_to_vport(netdev); adapter = np->adapter; @@ -432,9 +438,8 @@ static int idpf_get_rxfh(struct net_device *netdev, } rss_data = &adapter->vport_config[np->vport_idx]->user_config.rss_data; - if (!test_bit(IDPF_VPORT_UP, np->state)) - goto unlock_mutex; + rxhash_ena = idpf_is_feature_ena(vport, NETIF_F_RXHASH); rxfh->hfunc = ETH_RSS_HASH_TOP; if (rxfh->key) @@ -442,7 +447,7 @@ static int idpf_get_rxfh(struct net_device *netdev, if (rxfh->indir) { for (i = 0; i < rss_data->rss_lut_size; i++) - rxfh->indir[i] = rss_data->rss_lut[i]; + rxfh->indir[i] = rxhash_ena ? rss_data->rss_lut[i] : 0; } unlock_mutex: @@ -482,8 +487,6 @@ static int idpf_set_rxfh(struct net_device *netdev, } rss_data = &adapter->vport_config[vport->idx]->user_config.rss_data; - if (!test_bit(IDPF_VPORT_UP, np->state)) - goto unlock_mutex; if (rxfh->hfunc != ETH_RSS_HASH_NO_CHANGE && rxfh->hfunc != ETH_RSS_HASH_TOP) { @@ -499,7 +502,8 @@ static int idpf_set_rxfh(struct net_device *netdev, rss_data->rss_lut[lut] = rxfh->indir[lut]; } - err = idpf_config_rss(vport); + if (test_bit(IDPF_VPORT_UP, np->state)) + err = idpf_config_rss(vport); unlock_mutex: idpf_vport_ctrl_unlock(netdev); From 641283c434abcc7e10e60d685ce2e3355214b057 Mon Sep 17 00:00:00 2001 From: Sreedevi Joshi Date: Mon, 24 Nov 2025 12:47:50 -0600 Subject: [PATCH 0782/1321] idpf: Fix RSS LUT NULL ptr issue after soft reset During soft reset, the RSS LUT is freed and not restored unless the interface is up. If an ethtool command that accesses the rss lut is attempted immediately after reset, it will result in NULL ptr dereference. Also, there is no need to reset the rss lut if the soft reset does not involve queue count change. After soft reset, set the RSS LUT to default values based on the updated queue count only if the reset was a result of a queue count change and the LUT was not configured by the user. In all other cases, don't touch the LUT. Steps to reproduce: ** Bring the interface down (if up) ifconfig eth1 down ** update the queue count (eg., 27->20) ethtool -L eth1 combined 20 ** display the RSS LUT ethtool -x eth1 [82375.558338] BUG: kernel NULL pointer dereference, address: 0000000000000000 [82375.558373] #PF: supervisor read access in kernel mode [82375.558391] #PF: error_code(0x0000) - not-present page [82375.558408] PGD 0 P4D 0 [82375.558421] Oops: Oops: 0000 [#1] SMP NOPTI [82375.558516] RIP: 0010:idpf_get_rxfh+0x108/0x150 [idpf] [82375.558786] Call Trace: [82375.558793] [82375.558804] rss_prepare.isra.0+0x187/0x2a0 [82375.558827] rss_prepare_data+0x3a/0x50 [82375.558845] ethnl_default_doit+0x13d/0x3e0 [82375.558863] genl_family_rcv_msg_doit+0x11f/0x180 [82375.558886] genl_rcv_msg+0x1ad/0x2b0 [82375.558902] ? __pfx_ethnl_default_doit+0x10/0x10 [82375.558920] ? __pfx_genl_rcv_msg+0x10/0x10 [82375.558937] netlink_rcv_skb+0x58/0x100 [82375.558957] genl_rcv+0x2c/0x50 [82375.558971] netlink_unicast+0x289/0x3e0 [82375.558988] netlink_sendmsg+0x215/0x440 [82375.559005] __sys_sendto+0x234/0x240 [82375.559555] __x64_sys_sendto+0x28/0x30 [82375.560068] x64_sys_call+0x1909/0x1da0 [82375.560576] do_syscall_64+0x7a/0xfa0 [82375.561076] ? clear_bhb_loop+0x60/0xb0 [82375.561567] entry_SYSCALL_64_after_hwframe+0x76/0x7e Fixes: 02cbfba1add5 ("idpf: add ethtool callbacks") Signed-off-by: Sreedevi Joshi Reviewed-by: Aleksandr Loktionov Reviewed-by: Sridhar Samudrala Reviewed-by: Emil Tantilov Reviewed-by: Simon Horman Tested-by: Samuel Salin Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/idpf/idpf_lib.c | 20 ++++---------------- drivers/net/ethernet/intel/idpf/idpf_txrx.c | 2 +- drivers/net/ethernet/intel/idpf/idpf_txrx.h | 1 + 3 files changed, 6 insertions(+), 17 deletions(-) diff --git a/drivers/net/ethernet/intel/idpf/idpf_lib.c b/drivers/net/ethernet/intel/idpf/idpf_lib.c index 51716e5a84ef3..003bab3ce5ae6 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_lib.c +++ b/drivers/net/ethernet/intel/idpf/idpf_lib.c @@ -1484,8 +1484,6 @@ static int idpf_vport_open(struct idpf_vport *vport, bool rtnl) { struct idpf_netdev_priv *np = netdev_priv(vport->netdev); struct idpf_adapter *adapter = vport->adapter; - struct idpf_vport_config *vport_config; - struct idpf_rss_data *rss_data; int err; if (test_bit(IDPF_VPORT_UP, np->state)) @@ -1579,19 +1577,6 @@ static int idpf_vport_open(struct idpf_vport *vport, bool rtnl) idpf_restore_features(vport); - vport_config = adapter->vport_config[vport->idx]; - rss_data = &vport_config->user_config.rss_data; - - if (!rss_data->rss_lut) { - err = idpf_init_rss_lut(vport); - if (err) { - dev_err(&adapter->pdev->dev, - "Failed to initialize RSS LUT for vport %u: %d\n", - vport->vport_id, err); - goto disable_vport; - } - } - err = idpf_config_rss(vport); if (err) { dev_err(&adapter->pdev->dev, "Failed to configure RSS for vport %u: %d\n", @@ -2068,7 +2053,6 @@ int idpf_initiate_soft_reset(struct idpf_vport *vport, idpf_vport_stop(vport, false); } - idpf_deinit_rss_lut(vport); /* We're passing in vport here because we need its wait_queue * to send a message and it should be getting all the vport * config data out of the adapter but we need to be careful not @@ -2094,6 +2078,10 @@ int idpf_initiate_soft_reset(struct idpf_vport *vport, if (err) goto err_open; + if (reset_cause == IDPF_SR_Q_CHANGE && + !netif_is_rxfh_configured(vport->netdev)) + idpf_fill_dflt_rss_lut(vport); + if (vport_is_up) err = idpf_vport_open(vport, false); diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.c b/drivers/net/ethernet/intel/idpf/idpf_txrx.c index 8991a891a440a..f51d52297e1e5 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_txrx.c +++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c @@ -4641,7 +4641,7 @@ int idpf_config_rss(struct idpf_vport *vport) * idpf_fill_dflt_rss_lut - Fill the indirection table with the default values * @vport: virtual port structure */ -static void idpf_fill_dflt_rss_lut(struct idpf_vport *vport) +void idpf_fill_dflt_rss_lut(struct idpf_vport *vport) { struct idpf_adapter *adapter = vport->adapter; u16 num_active_rxq = vport->num_rxq; diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.h b/drivers/net/ethernet/intel/idpf/idpf_txrx.h index 7d20593bd8778..0472698ca1927 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_txrx.h +++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.h @@ -1085,6 +1085,7 @@ void idpf_vport_intr_update_itr_ena_irq(struct idpf_q_vector *q_vector); void idpf_vport_intr_deinit(struct idpf_vport *vport); int idpf_vport_intr_init(struct idpf_vport *vport); void idpf_vport_intr_ena(struct idpf_vport *vport); +void idpf_fill_dflt_rss_lut(struct idpf_vport *vport); int idpf_config_rss(struct idpf_vport *vport); int idpf_init_rss_lut(struct idpf_vport *vport); void idpf_deinit_rss_lut(struct idpf_vport *vport); From c0b0560aaa041444e999fdfb6051767542efed6e Mon Sep 17 00:00:00 2001 From: Sreedevi Joshi Date: Tue, 2 Dec 2025 17:12:46 -0600 Subject: [PATCH 0783/1321] idpf: Fix error handling in idpf_vport_open() Fix error handling to properly cleanup interrupts when idpf_vport_queue_ids_init() or idpf_rx_bufs_init_all() fail. Jump to 'intr_deinit' instead of 'queues_rel' to ensure interrupts are cleaned up before releasing other resources. Fixes: d4d558718266 ("idpf: initialize interrupts and enable vport") Signed-off-by: Sreedevi Joshi Reviewed-by: Madhu Chittim Reviewed-by: Aleksandr Loktionov Reviewed-by: Simon Horman Tested-by: Samuel Salin Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/idpf/idpf_lib.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/intel/idpf/idpf_lib.c b/drivers/net/ethernet/intel/idpf/idpf_lib.c index 003bab3ce5ae6..131a8121839bd 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_lib.c +++ b/drivers/net/ethernet/intel/idpf/idpf_lib.c @@ -1524,14 +1524,14 @@ static int idpf_vport_open(struct idpf_vport *vport, bool rtnl) if (err) { dev_err(&adapter->pdev->dev, "Failed to initialize queue registers for vport %u: %d\n", vport->vport_id, err); - goto queues_rel; + goto intr_deinit; } err = idpf_rx_bufs_init_all(vport); if (err) { dev_err(&adapter->pdev->dev, "Failed to initialize RX buffers for vport %u: %d\n", vport->vport_id, err); - goto queues_rel; + goto intr_deinit; } idpf_rx_init_buf_tail(vport); From 5e0703e7692f487260ed2d370a6a840f89289350 Mon Sep 17 00:00:00 2001 From: Joshua Hay Date: Mon, 3 Nov 2025 13:20:36 -0800 Subject: [PATCH 0784/1321] idpf: cap maximum Rx buffer size The HW only supports a maximum Rx buffer size of 16K-128. On systems using large pages, the libeth logic can configure the buffer size to be larger than this. The upper bound is PAGE_SIZE while the lower bound is MTU rounded up to the nearest power of 2. For example, ARM systems with a 64K page size and an mtu of 9000 will set the Rx buffer size to 16K, which will cause the config Rx queues message to fail. Initialize the bufq/fill queue buf_len field to the maximum supported size. This will trigger the libeth logic to cap the maximum Rx buffer size by reducing the upper bound. Fixes: 74d1412ac8f37 ("idpf: use libeth Rx buffer management for payload buffer") Signed-off-by: Joshua Hay Acked-by: Alexander Lobakin Reviewed-by: Madhu Chittim Reviewed-by: Jacob Keller Reviewed-by: Aleksandr Loktionov Reviewed-by: David Decotigny Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/idpf/idpf_txrx.c | 8 +++++--- drivers/net/ethernet/intel/idpf/idpf_txrx.h | 1 + 2 files changed, 6 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.c b/drivers/net/ethernet/intel/idpf/idpf_txrx.c index f51d52297e1e5..7f3933ca9edcd 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_txrx.c +++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c @@ -695,9 +695,10 @@ static int idpf_rx_buf_alloc_singleq(struct idpf_rx_queue *rxq) static int idpf_rx_bufs_init_singleq(struct idpf_rx_queue *rxq) { struct libeth_fq fq = { - .count = rxq->desc_count, - .type = LIBETH_FQE_MTU, - .nid = idpf_q_vector_to_mem(rxq->q_vector), + .count = rxq->desc_count, + .type = LIBETH_FQE_MTU, + .buf_len = IDPF_RX_MAX_BUF_SZ, + .nid = idpf_q_vector_to_mem(rxq->q_vector), }; int ret; @@ -754,6 +755,7 @@ static int idpf_rx_bufs_init(struct idpf_buf_queue *bufq, .truesize = bufq->truesize, .count = bufq->desc_count, .type = type, + .buf_len = IDPF_RX_MAX_BUF_SZ, .hsplit = idpf_queue_has(HSPLIT_EN, bufq), .xdp = idpf_xdp_enabled(bufq->q_vector->vport), .nid = idpf_q_vector_to_mem(bufq->q_vector), diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.h b/drivers/net/ethernet/intel/idpf/idpf_txrx.h index 0472698ca1927..423cc9486dce7 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_txrx.h +++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.h @@ -101,6 +101,7 @@ do { \ idx = 0; \ } while (0) +#define IDPF_RX_MAX_BUF_SZ (16384 - 128) #define IDPF_RX_BUF_STRIDE 32 #define IDPF_RX_BUF_POST_STRIDE 16 #define IDPF_LOW_WATERMARK 64 From 914dbeece70885b012326afd31ff7422ae33cd08 Mon Sep 17 00:00:00 2001 From: Larysa Zaremba Date: Mon, 17 Nov 2025 08:03:49 +0100 Subject: [PATCH 0785/1321] idpf: fix aux device unplugging when rdma is not supported by vport If vport flags do not contain VIRTCHNL2_VPORT_ENABLE_RDMA, driver does not allocate vdev_info for this vport. This leads to kernel NULL pointer dereference in idpf_idc_vport_dev_down(), which references vdev_info for every vport regardless. Check, if vdev_info was ever allocated before unplugging aux device. Fixes: be91128c579c ("idpf: implement RDMA vport auxiliary dev create, init, and destroy") Reviewed-by: Madhu Chittim Signed-off-by: Larysa Zaremba Reviewed-by: Paul Menzel Reviewed-by: Aleksandr Loktionov Tested-by: Krishneil Singh Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/idpf/idpf_idc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/intel/idpf/idpf_idc.c b/drivers/net/ethernet/intel/idpf/idpf_idc.c index 7e20a07e98e53..6dad0593f7f22 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_idc.c +++ b/drivers/net/ethernet/intel/idpf/idpf_idc.c @@ -322,7 +322,7 @@ static void idpf_idc_vport_dev_down(struct idpf_adapter *adapter) for (i = 0; i < adapter->num_alloc_vports; i++) { struct idpf_vport *vport = adapter->vports[i]; - if (!vport) + if (!vport || !vport->vdev_info) continue; idpf_unplug_aux_dev(vport->vdev_info->adev); From d1086f71f139fbb5ff4de065bc8e4ececd0afc2b Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Wed, 7 Jan 2026 21:22:50 +0000 Subject: [PATCH 0786/1321] arp: do not assume dev_hard_header() does not change skb->head arp_create() is the only dev_hard_header() caller making assumption about skb->head being unchanged. A recent commit broke this assumption. Initialize @arp pointer after dev_hard_header() call. Fixes: db5b4e39c4e6 ("ip6_gre: make ip6gre_header() robust") Reported-by: syzbot+58b44a770a1585795351@syzkaller.appspotmail.com Signed-off-by: Eric Dumazet Link: https://patch.msgid.link/20260107212250.384552-1-edumazet@google.com Signed-off-by: Jakub Kicinski --- net/ipv4/arp.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/net/ipv4/arp.c b/net/ipv4/arp.c index 7f3863daaa407..c8c3e1713c0ed 100644 --- a/net/ipv4/arp.c +++ b/net/ipv4/arp.c @@ -564,7 +564,7 @@ struct sk_buff *arp_create(int type, int ptype, __be32 dest_ip, skb_reserve(skb, hlen); skb_reset_network_header(skb); - arp = skb_put(skb, arp_hdr_len(dev)); + skb_put(skb, arp_hdr_len(dev)); skb->dev = dev; skb->protocol = htons(ETH_P_ARP); if (!src_hw) @@ -572,12 +572,13 @@ struct sk_buff *arp_create(int type, int ptype, __be32 dest_ip, if (!dest_hw) dest_hw = dev->broadcast; - /* - * Fill the device header for the ARP frame + /* Fill the device header for the ARP frame. + * Note: skb->head can be changed. */ if (dev_hard_header(skb, dev, ptype, dest_hw, src_hw, skb->len) < 0) goto out; + arp = arp_hdr(skb); /* * Fill out the arp protocol part. * From 298687111e793e002182e2b9cc1aefa251971a27 Mon Sep 17 00:00:00 2001 From: Julia Lawall Date: Fri, 26 Dec 2025 12:05:31 +0100 Subject: [PATCH 0787/1321] tracing: Drop unneeded assignment to soft_mode soft_mode is not read in the enable case, so drop the assignment. Drop also the comment text that refers to the assignment and realign the comment. Cc: "Paul E . McKenney" Cc: Gabriele Paoloni Cc: Masami Hiramatsu Cc: Mathieu Desnoyers Link: https://patch.msgid.link/20251226110531.4129794-1-Julia.Lawall@inria.fr Signed-off-by: Julia Lawall Signed-off-by: Steven Rostedt (Google) --- kernel/trace/trace_events.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c index 76067529db61b..137b4d9bb116d 100644 --- a/kernel/trace/trace_events.c +++ b/kernel/trace/trace_events.c @@ -826,16 +826,15 @@ static int __ftrace_event_enable_disable(struct trace_event_file *file, * When soft_disable is set and enable is set, we want to * register the tracepoint for the event, but leave the event * as is. That means, if the event was already enabled, we do - * nothing (but set soft_mode). If the event is disabled, we - * set SOFT_DISABLED before enabling the event tracepoint, so - * it still seems to be disabled. + * nothing. If the event is disabled, we set SOFT_DISABLED + * before enabling the event tracepoint, so it still seems + * to be disabled. */ if (!soft_disable) clear_bit(EVENT_FILE_FL_SOFT_DISABLED_BIT, &file->flags); else { if (atomic_inc_return(&file->sm_ref) > 1) break; - soft_mode = true; /* Enable use of trace_buffered_event */ trace_buffered_event_enable(); } From 31aef0b902dfd4093c0d04f5c8cbf092bcad6a0f Mon Sep 17 00:00:00 2001 From: Wupeng Ma Date: Sun, 28 Dec 2025 14:50:07 +0800 Subject: [PATCH 0788/1321] ring-buffer: Avoid softlockup in ring_buffer_resize() during memory free When user resize all trace ring buffer through file 'buffer_size_kb', then in ring_buffer_resize(), kernel allocates buffer pages for each cpu in a loop. If the kernel preemption model is PREEMPT_NONE and there are many cpus and there are many buffer pages to be freed, it may not give up cpu for a long time and finally cause a softlockup. To avoid it, call cond_resched() after each cpu buffer free as Commit f6bd2c92488c ("ring-buffer: Avoid softlockup in ring_buffer_resize()") does. Detailed call trace as follow: rcu: INFO: rcu_sched self-detected stall on CPU rcu: 24-....: (14837 ticks this GP) idle=521c/1/0x4000000000000000 softirq=230597/230597 fqs=5329 rcu: (t=15004 jiffies g=26003221 q=211022 ncpus=96) CPU: 24 UID: 0 PID: 11253 Comm: bash Kdump: loaded Tainted: G EL 6.18.2+ #278 NONE pc : arch_local_irq_restore+0x8/0x20 arch_local_irq_restore+0x8/0x20 (P) free_frozen_page_commit+0x28c/0x3b0 __free_frozen_pages+0x1c0/0x678 ___free_pages+0xc0/0xe0 free_pages+0x3c/0x50 ring_buffer_resize.part.0+0x6a8/0x880 ring_buffer_resize+0x3c/0x58 __tracing_resize_ring_buffer.part.0+0x34/0xd8 tracing_resize_ring_buffer+0x8c/0xd0 tracing_entries_write+0x74/0xd8 vfs_write+0xcc/0x288 ksys_write+0x74/0x118 __arm64_sys_write+0x24/0x38 Cc: Link: https://patch.msgid.link/20251228065008.2396573-1-mawupeng1@huawei.com Signed-off-by: Wupeng Ma Acked-by: Masami Hiramatsu (Google) Signed-off-by: Steven Rostedt (Google) --- kernel/trace/ring_buffer.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c index 41c9f5d079beb..630221b00838e 100644 --- a/kernel/trace/ring_buffer.c +++ b/kernel/trace/ring_buffer.c @@ -3137,6 +3137,8 @@ int ring_buffer_resize(struct trace_buffer *buffer, unsigned long size, list) { list_del_init(&bpage->list); free_buffer_page(bpage); + + cond_resched(); } } out_err_unlock: From 2e6b732c24da660124a8e9857d64eb4e05041a4d Mon Sep 17 00:00:00 2001 From: Steven Rostedt Date: Fri, 2 Jan 2026 14:31:48 -0500 Subject: [PATCH 0789/1321] ftrace: Make ftrace_graph_ent depth field signed The code has integrity checks to make sure that depth never goes below zero. But the depth field has recently been converted to unsigned long from "int" (for alignment reasons). As unsigned long can never be less than zero, the integrity checks no longer work. Convert depth to long from unsigned long to allow the integrity checks to work again. Cc: stable@vger.kernel.org Cc: Mathieu Desnoyers Cc: pengdonglin Link: https://patch.msgid.link/20260102143148.251c2e16@gandalf.local.home Reported-by: Dan Carpenter Closes: https://lore.kernel.org/all/aS6kGi0maWBl-MjZ@stanley.mountain/ Fixes: f83ac7544fbf7 ("function_graph: Enable funcgraph-args and funcgraph-retaddr to work simultaneously") Signed-off-by: Steven Rostedt (Google) Acked-by: Masami Hiramatsu (Google) --- include/linux/ftrace.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h index 770f0dc993ccf..a3a8989e3268d 100644 --- a/include/linux/ftrace.h +++ b/include/linux/ftrace.h @@ -1167,7 +1167,7 @@ static inline void ftrace_init(void) { } */ struct ftrace_graph_ent { unsigned long func; /* Current function */ - unsigned long depth; + long depth; /* signed to check for less than zero */ } __packed; /* From 2fd7266a07a26dc96ac2dc91a3e855bc9895f3f6 Mon Sep 17 00:00:00 2001 From: Steven Rostedt Date: Mon, 5 Jan 2026 20:31:41 -0500 Subject: [PATCH 0790/1321] tracing: Add recursion protection in kernel stack trace recording A bug was reported about an infinite recursion caused by tracing the rcu events with the kernel stack trace trigger enabled. The stack trace code called back into RCU which then called the stack trace again. Expand the ftrace recursion protection to add a set of bits to protect events from recursion. Each bit represents the context that the event is in (normal, softirq, interrupt and NMI). Have the stack trace code use the interrupt context to protect against recursion. Note, the bug showed an issue in both the RCU code as well as the tracing stacktrace code. This only handles the tracing stack trace side of the bug. The RCU fix will be handled separately. Link: https://lore.kernel.org/all/20260102122807.7025fc87@gandalf.local.home/ Cc: stable@vger.kernel.org Cc: Masami Hiramatsu Cc: Mathieu Desnoyers Cc: Joel Fernandes Cc: "Paul E. McKenney" Cc: Boqun Feng Link: https://patch.msgid.link/20260105203141.515cd49f@gandalf.local.home Reported-by: Yao Kai Tested-by: Yao Kai Fixes: 5f5fa7ea89dc ("rcu: Don't use negative nesting depth in __rcu_read_unlock()") Signed-off-by: Steven Rostedt (Google) --- include/linux/trace_recursion.h | 9 +++++++++ kernel/trace/trace.c | 6 ++++++ 2 files changed, 15 insertions(+) diff --git a/include/linux/trace_recursion.h b/include/linux/trace_recursion.h index ae04054a1be36..e6ca052b2a85a 100644 --- a/include/linux/trace_recursion.h +++ b/include/linux/trace_recursion.h @@ -34,6 +34,13 @@ enum { TRACE_INTERNAL_SIRQ_BIT, TRACE_INTERNAL_TRANSITION_BIT, + /* Internal event use recursion bits */ + TRACE_INTERNAL_EVENT_BIT, + TRACE_INTERNAL_EVENT_NMI_BIT, + TRACE_INTERNAL_EVENT_IRQ_BIT, + TRACE_INTERNAL_EVENT_SIRQ_BIT, + TRACE_INTERNAL_EVENT_TRANSITION_BIT, + TRACE_BRANCH_BIT, /* * Abuse of the trace_recursion. @@ -58,6 +65,8 @@ enum { #define TRACE_LIST_START TRACE_INTERNAL_BIT +#define TRACE_EVENT_START TRACE_INTERNAL_EVENT_BIT + #define TRACE_CONTEXT_MASK ((1 << (TRACE_LIST_START + TRACE_CONTEXT_BITS)) - 1) /* diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 6f2148df14d96..aef9058537d54 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -3012,6 +3012,11 @@ static void __ftrace_trace_stack(struct trace_array *tr, struct ftrace_stack *fstack; struct stack_entry *entry; int stackidx; + int bit; + + bit = trace_test_and_set_recursion(_THIS_IP_, _RET_IP_, TRACE_EVENT_START); + if (bit < 0) + return; /* * Add one, for this function and the call to save_stack_trace() @@ -3080,6 +3085,7 @@ static void __ftrace_trace_stack(struct trace_array *tr, /* Again, don't let gcc optimize things here */ barrier(); __this_cpu_dec(ftrace_stack_reserve); + trace_clear_recursion(bit); } static inline void ftrace_trace_stack(struct trace_array *tr, From 8fa1c799e24b610206fd7d679a925cb4b2fae6d6 Mon Sep 17 00:00:00 2001 From: Ben Dooks Date: Tue, 6 Jan 2026 23:10:54 +0000 Subject: [PATCH 0791/1321] trace: ftrace_dump_on_oops[] is not exported, make it static The ftrace_dump_on_oops string is not used outside of trace.c so make it static to avoid the export warning from sparse: kernel/trace/trace.c:141:6: warning: symbol 'ftrace_dump_on_oops' was not declared. Should it be static? Fixes: dd293df6395a2 ("tracing: Move trace sysctls into trace.c") Link: https://patch.msgid.link/20260106231054.84270-1-ben.dooks@codethink.co.uk Signed-off-by: Ben Dooks Signed-off-by: Steven Rostedt (Google) --- kernel/trace/trace.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index aef9058537d54..baec63134ab62 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -138,7 +138,7 @@ cpumask_var_t __read_mostly tracing_buffer_mask; * by commas. */ /* Set to string format zero to disable by default */ -char ftrace_dump_on_oops[MAX_TRACER_SIZE] = "0"; +static char ftrace_dump_on_oops[MAX_TRACER_SIZE] = "0"; /* When set, tracing will stop when a WARN*() is hit */ static int __disable_trace_on_warning; From e2d9d55c51a34f6418225019828ddb5db8985b28 Mon Sep 17 00:00:00 2001 From: Louis-Alexis Eyraud Date: Wed, 3 Dec 2025 12:32:42 +0100 Subject: [PATCH 0792/1321] pinctrl: mediatek: mt8189: restore previous register base name array order In mt8189-pinctrl driver, a previous commit changed the register base name array (mt8189_pinctrl_register_base_names) entry name and order to align it with the same name and order as the "mediatek,mt8189-pinctrl" devicetree bindings. The new order (by ascending register address) now causes an issue with MT8189 pinctrl configuration. MT8189 SoC has multiple base addresses for the pin configuration registers. Several constant data structures, declaring each pin configuration, are using PIN_FIELD_BASE() macro which i_base parameter indicates for a given pin the lookup index in the base register address array of the driver internal data for the configuration register read/write accesses. But in practice, this parameter is given a hardcoded numerical value that corresponds to the expected base register entry index in mt8189_pinctrl_register_base_names array. Since this array reordering, the i_base index matching is no more correct. So, in order to avoid modifying over a thousand of PIN_FIELD_BASE() calls, restore previous mt8189_pinctrl_register_base_names entry order. Fixes: 518919276c41 ("pinctrl: mediatek: mt8189: align register base names to dt-bindings ones") Signed-off-by: Louis-Alexis Eyraud Signed-off-by: Linus Walleij --- drivers/pinctrl/mediatek/pinctrl-mt8189.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/pinctrl/mediatek/pinctrl-mt8189.c b/drivers/pinctrl/mediatek/pinctrl-mt8189.c index f6a3e584588b0..cd4cdff309a12 100644 --- a/drivers/pinctrl/mediatek/pinctrl-mt8189.c +++ b/drivers/pinctrl/mediatek/pinctrl-mt8189.c @@ -1642,7 +1642,7 @@ static const struct mtk_pin_reg_calc mt8189_reg_cals[PINCTRL_PIN_REG_MAX] = { }; static const char * const mt8189_pinctrl_register_base_names[] = { - "base", "lm", "rb0", "rb1", "bm0", "bm1", "bm2", "lt0", "lt1", "rt", + "base", "bm0", "bm1", "bm2", "lm", "lt0", "lt1", "rb0", "rb1", "rt", }; static const struct mtk_eint_hw mt8189_eint_hw = { From 013821489cda4a2bf93ee7213a49c5d3e8586fa1 Mon Sep 17 00:00:00 2001 From: Linus Walleij Date: Tue, 30 Dec 2025 22:10:45 +0100 Subject: [PATCH 0793/1321] Update .mailmap for Linus Walleij Developers run into bouncing emails from my old address, so add it to .mailmap. Stuff in the rest of my old mail addresses as well while we're at it. Reported-by: Andy Shevchenko Signed-off-by: Linus Walleij --- .mailmap | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/.mailmap b/.mailmap index e071d0a08c933..b23e0853d636c 100644 --- a/.mailmap +++ b/.mailmap @@ -473,6 +473,10 @@ Linas Vepstas Linus Lüssing Linus Lüssing Linus Lüssing +Linus Walleij +Linus Walleij +Linus Walleij +Linus Walleij Li Yang Li Yang From 7a13f965e4217e4a30cf2d7fa73052e153d4702a Mon Sep 17 00:00:00 2001 From: Sander Vanheule Date: Fri, 19 Dec 2025 12:56:31 +0100 Subject: [PATCH 0794/1321] pinctrl: pic64gx-gpio2: Add REGMAP_MMIO dependency In line with other drivers depending on REGMAP_*, select the required symbol to prevent a linker error when building with COMPILE_TEST=y: ld: drivers/pinctrl/pinctrl-pic64gx-gpio2.o: in function `pic64gx_gpio2_probe': pinctrl-pic64gx-gpio2.c:315:(.text+0x198): undefined reference to `__devm_regmap_init_mmio_clk' Fixes: 38cf9d641314 ("pinctrl: add pic64gx "gpio2" pinmux driver") Signed-off-by: Sander Vanheule Signed-off-by: Linus Walleij --- drivers/pinctrl/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/pinctrl/Kconfig b/drivers/pinctrl/Kconfig index bc7f37afc48bf..7b9f792acb0e3 100644 --- a/drivers/pinctrl/Kconfig +++ b/drivers/pinctrl/Kconfig @@ -491,6 +491,7 @@ config PINCTRL_PIC64GX depends on ARCH_MICROCHIP || COMPILE_TEST depends on OF select GENERIC_PINCONF + select REGMAP_MMIO default y help This selects the pinctrl driver for gpio2 on pic64gx. From a9cebfdd26f991f19ebe8d032c81265157d328e3 Mon Sep 17 00:00:00 2001 From: Bartosz Golaszewski Date: Wed, 26 Nov 2025 13:22:19 +0100 Subject: [PATCH 0795/1321] pinctrl: qcom: lpass-lpi: mark the GPIO controller as sleeping The gpio_chip settings in this driver say the controller can't sleep but it actually uses a mutex for synchronization. This triggers the following BUG(): [ 9.233659] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:281 [ 9.233665] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 554, name: (udev-worker) [ 9.233669] preempt_count: 1, expected: 0 [ 9.233673] RCU nest depth: 0, expected: 0 [ 9.233688] Tainted: [W]=WARN [ 9.233690] Hardware name: Dell Inc. Latitude 7455/0FK7MX, BIOS 2.10.1 05/20/2025 [ 9.233694] Call trace: [ 9.233696] show_stack+0x24/0x38 (C) [ 9.233709] dump_stack_lvl+0x40/0x88 [ 9.233716] dump_stack+0x18/0x24 [ 9.233722] __might_resched+0x148/0x160 [ 9.233731] __might_sleep+0x38/0x98 [ 9.233736] mutex_lock+0x30/0xd8 [ 9.233749] lpi_config_set+0x2e8/0x3c8 [pinctrl_lpass_lpi] [ 9.233757] lpi_gpio_direction_output+0x58/0x90 [pinctrl_lpass_lpi] [ 9.233761] gpiod_direction_output_raw_commit+0x110/0x428 [ 9.233772] gpiod_direction_output_nonotify+0x234/0x358 [ 9.233779] gpiod_direction_output+0x38/0xd0 [ 9.233786] gpio_shared_proxy_direction_output+0xb8/0x2a8 [gpio_shared_proxy] [ 9.233792] gpiod_direction_output_raw_commit+0x110/0x428 [ 9.233799] gpiod_direction_output_nonotify+0x234/0x358 [ 9.233806] gpiod_configure_flags+0x2c0/0x580 [ 9.233812] gpiod_find_and_request+0x358/0x4f8 [ 9.233819] gpiod_get_index+0x7c/0x98 [ 9.233826] devm_gpiod_get+0x34/0xb0 [ 9.233829] reset_gpio_probe+0x58/0x128 [reset_gpio] [ 9.233836] auxiliary_bus_probe+0xb0/0xf0 [ 9.233845] really_probe+0x14c/0x450 [ 9.233853] __driver_probe_device+0xb0/0x188 [ 9.233858] driver_probe_device+0x4c/0x250 [ 9.233863] __driver_attach+0xf8/0x2a0 [ 9.233868] bus_for_each_dev+0xf8/0x158 [ 9.233872] driver_attach+0x30/0x48 [ 9.233876] bus_add_driver+0x158/0x2b8 [ 9.233880] driver_register+0x74/0x118 [ 9.233886] __auxiliary_driver_register+0x94/0xe8 [ 9.233893] init_module+0x34/0xfd0 [reset_gpio] [ 9.233898] do_one_initcall+0xec/0x300 [ 9.233903] do_init_module+0x64/0x260 [ 9.233910] load_module+0x16c4/0x1900 [ 9.233915] __arm64_sys_finit_module+0x24c/0x378 [ 9.233919] invoke_syscall+0x4c/0xe8 [ 9.233925] el0_svc_common+0x8c/0xf0 [ 9.233929] do_el0_svc+0x28/0x40 [ 9.233934] el0_svc+0x38/0x100 [ 9.233938] el0t_64_sync_handler+0x84/0x130 [ 9.233943] el0t_64_sync+0x17c/0x180 Mark the controller as sleeping. Fixes: 6e261d1090d6 ("pinctrl: qcom: Add sm8250 lpass lpi pinctrl driver") Cc: stable@vger.kernel.org Reported-by: Val Packett Closes: https://lore.kernel.org/all/98c0f185-b0e0-49ea-896c-f3972dd011ca@packett.cool/ Signed-off-by: Bartosz Golaszewski Reviewed-by: Dmitry Baryshkov Reviewed-by: Bjorn Andersson Signed-off-by: Linus Walleij --- drivers/pinctrl/qcom/pinctrl-lpass-lpi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/pinctrl/qcom/pinctrl-lpass-lpi.c b/drivers/pinctrl/qcom/pinctrl-lpass-lpi.c index 1c97ec44aa5ff..78212f9928430 100644 --- a/drivers/pinctrl/qcom/pinctrl-lpass-lpi.c +++ b/drivers/pinctrl/qcom/pinctrl-lpass-lpi.c @@ -498,7 +498,7 @@ int lpi_pinctrl_probe(struct platform_device *pdev) pctrl->chip.base = -1; pctrl->chip.ngpio = data->npins; pctrl->chip.label = dev_name(dev); - pctrl->chip.can_sleep = false; + pctrl->chip.can_sleep = true; mutex_init(&pctrl->lock); From dd599abf32a4f96abb4b54cd0ed8f8bc42a7173d Mon Sep 17 00:00:00 2001 From: Harshita Bhilwaria Date: Wed, 17 Dec 2025 11:16:06 +0530 Subject: [PATCH 0796/1321] crypto: qat - fix duplicate restarting msg during AER error The restarting message from PF to VF is sent twice during AER error handling: once from adf_error_detected() and again from adf_disable_sriov(). This causes userspace subservices to shutdown unexpectedly when they receive a duplicate restarting message after already being restarted. Avoid calling adf_pf2vf_notify_restarting() and adf_pf2vf_wait_for_restarting_complete() from adf_error_detected() so that the restarting msg is sent only once from PF to VF. Fixes: 9567d3dc760931 ("crypto: qat - improve aer error reset handling") Signed-off-by: Harshita Bhilwaria Reviewed-by: Giovanni Cabiddu Reviewed-by: Ahsan Atta Reviewed-by: Ravikumar PM Reviewed-by: Srikanth Thokala Signed-off-by: Herbert Xu --- drivers/crypto/intel/qat/qat_common/adf_aer.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/crypto/intel/qat/qat_common/adf_aer.c b/drivers/crypto/intel/qat/qat_common/adf_aer.c index 11728cf32653c..a5964fd8204c9 100644 --- a/drivers/crypto/intel/qat/qat_common/adf_aer.c +++ b/drivers/crypto/intel/qat/qat_common/adf_aer.c @@ -41,8 +41,6 @@ static pci_ers_result_t adf_error_detected(struct pci_dev *pdev, adf_error_notifier(accel_dev); adf_pf2vf_notify_fatal_error(accel_dev); adf_dev_restarting_notify(accel_dev); - adf_pf2vf_notify_restarting(accel_dev); - adf_pf2vf_wait_for_restarting_complete(accel_dev); pci_clear_master(pdev); adf_dev_down(accel_dev); From 361f747b69c5da1ee779c3fd90d4d6944173639f Mon Sep 17 00:00:00 2001 From: Mathias Krause Date: Tue, 9 Dec 2025 22:09:03 +0100 Subject: [PATCH 0797/1321] media: mc: fix potential use-after-free in media_request_alloc() Commit 6f504cbf108a ("media: convert media_request_alloc() to FD_PREPARE()") moved the call to fd_install() (now hidden in fd_publish()) before the snprintf(), making the later write to potentially already freed memory, as userland is free to call close() concurrently right after the call to fd_install() which may end up in the request_fops.release() handler freeing 'req'. Fixes: 6f504cbf108a ("media: convert media_request_alloc() to FD_PREPARE()") Signed-off-by: Mathias Krause Link: https://patch.msgid.link/20251209210903.603958-1-minipli@grsecurity.net Signed-off-by: Christian Brauner --- drivers/media/mc/mc-request.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/media/mc/mc-request.c b/drivers/media/mc/mc-request.c index 2ac9ac0a740bd..3cca9a0c7c973 100644 --- a/drivers/media/mc/mc-request.c +++ b/drivers/media/mc/mc-request.c @@ -315,12 +315,12 @@ int media_request_alloc(struct media_device *mdev, int *alloc_fd) fd_prepare_file(fdf)->private_data = req; - *alloc_fd = fd_publish(fdf); - snprintf(req->debug_str, sizeof(req->debug_str), "%u:%d", - atomic_inc_return(&mdev->request_id), *alloc_fd); + atomic_inc_return(&mdev->request_id), fd_prepare_fd(fdf)); dev_dbg(mdev->dev, "request: allocated %s\n", req->debug_str); + *alloc_fd = fd_publish(fdf); + return 0; err_free_req: From cd966790c7adf02245f03580080f84e029d037c8 Mon Sep 17 00:00:00 2001 From: Brian Foster Date: Mon, 8 Dec 2025 09:05:48 -0500 Subject: [PATCH 0798/1321] iomap: replace folio_batch allocation with stack allocation Zhang Yi points out that the dynamic folio_batch allocation in iomap_fill_dirty_folios() is problematic for the ext4 on iomap work that is under development because it doesn't sufficiently handle the allocation failure case (by allowing a retry, for example). We've also seen lockdep (via syzbot) complain recently about the scope of the allocation. The dynamic allocation was initially added for simplicity and to help indicate whether the batch was used or not by the calling fs. To address these issues, put the batch on the stack of iomap_zero_range() and use a flag to control whether the batch should be used in the iomap folio lookup path. This keeps things simple and eliminates allocation issues with lockdep and for ext4 on iomap. While here, also clean up the fill helper signature to be more consistent with the underlying filemap helper. Pass through the return value of the filemap helper (folio count) and update the lookup offset via an out param. Fixes: 395ed1ef0012 ("iomap: optional zero range dirty folio processing") Signed-off-by: Brian Foster Link: https://patch.msgid.link/20251208140548.373411-1-bfoster@redhat.com Acked-by: Dave Chinner Reviewed-by: Christoph Hellwig Reviewed-by: Darrick J. Wong Signed-off-by: Christian Brauner --- fs/iomap/buffered-io.c | 50 +++++++++++++++++++++++++++++------------- fs/iomap/iter.c | 6 ++--- fs/xfs/xfs_iomap.c | 11 +++++----- include/linux/iomap.h | 8 +++++-- 4 files changed, 50 insertions(+), 25 deletions(-) diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c index e5c1ca440d93b..fd9a2cf956202 100644 --- a/fs/iomap/buffered-io.c +++ b/fs/iomap/buffered-io.c @@ -832,7 +832,7 @@ static struct folio *__iomap_get_folio(struct iomap_iter *iter, if (!mapping_large_folio_support(iter->inode->i_mapping)) len = min_t(size_t, len, PAGE_SIZE - offset_in_page(pos)); - if (iter->fbatch) { + if (iter->iomap.flags & IOMAP_F_FOLIO_BATCH) { struct folio *folio = folio_batch_next(iter->fbatch); if (!folio) @@ -929,7 +929,7 @@ static int iomap_write_begin(struct iomap_iter *iter, * process so return and let the caller iterate and refill the batch. */ if (!folio) { - WARN_ON_ONCE(!iter->fbatch); + WARN_ON_ONCE(!(iter->iomap.flags & IOMAP_F_FOLIO_BATCH)); return 0; } @@ -1544,23 +1544,39 @@ static int iomap_zero_iter(struct iomap_iter *iter, bool *did_zero, return status; } -loff_t +/** + * iomap_fill_dirty_folios - fill a folio batch with dirty folios + * @iter: Iteration structure + * @start: Start offset of range. Updated based on lookup progress. + * @end: End offset of range + * @iomap_flags: Flags to set on the associated iomap to track the batch. + * + * Returns the folio count directly. Also returns the associated control flag if + * the the batch lookup is performed and the expected offset of a subsequent + * lookup via out params. The caller is responsible to set the flag on the + * associated iomap. + */ +unsigned int iomap_fill_dirty_folios( struct iomap_iter *iter, - loff_t offset, - loff_t length) + loff_t *start, + loff_t end, + unsigned int *iomap_flags) { struct address_space *mapping = iter->inode->i_mapping; - pgoff_t start = offset >> PAGE_SHIFT; - pgoff_t end = (offset + length - 1) >> PAGE_SHIFT; + pgoff_t pstart = *start >> PAGE_SHIFT; + pgoff_t pend = (end - 1) >> PAGE_SHIFT; + unsigned int count; - iter->fbatch = kmalloc(sizeof(struct folio_batch), GFP_KERNEL); - if (!iter->fbatch) - return offset + length; - folio_batch_init(iter->fbatch); + if (!iter->fbatch) { + *start = end; + return 0; + } - filemap_get_folios_dirty(mapping, &start, end, iter->fbatch); - return (start << PAGE_SHIFT); + count = filemap_get_folios_dirty(mapping, &pstart, pend, iter->fbatch); + *start = (pstart << PAGE_SHIFT); + *iomap_flags |= IOMAP_F_FOLIO_BATCH; + return count; } EXPORT_SYMBOL_GPL(iomap_fill_dirty_folios); @@ -1569,17 +1585,21 @@ iomap_zero_range(struct inode *inode, loff_t pos, loff_t len, bool *did_zero, const struct iomap_ops *ops, const struct iomap_write_ops *write_ops, void *private) { + struct folio_batch fbatch; struct iomap_iter iter = { .inode = inode, .pos = pos, .len = len, .flags = IOMAP_ZERO, .private = private, + .fbatch = &fbatch, }; struct address_space *mapping = inode->i_mapping; int ret; bool range_dirty; + folio_batch_init(&fbatch); + /* * To avoid an unconditional flush, check pagecache state and only flush * if dirty and the fs returns a mapping that might convert on @@ -1590,11 +1610,11 @@ iomap_zero_range(struct inode *inode, loff_t pos, loff_t len, bool *did_zero, while ((ret = iomap_iter(&iter, ops)) > 0) { const struct iomap *srcmap = iomap_iter_srcmap(&iter); - if (WARN_ON_ONCE(iter.fbatch && + if (WARN_ON_ONCE((iter.iomap.flags & IOMAP_F_FOLIO_BATCH) && srcmap->type != IOMAP_UNWRITTEN)) return -EIO; - if (!iter.fbatch && + if (!(iter.iomap.flags & IOMAP_F_FOLIO_BATCH) && (srcmap->type == IOMAP_HOLE || srcmap->type == IOMAP_UNWRITTEN)) { s64 status; diff --git a/fs/iomap/iter.c b/fs/iomap/iter.c index 8692e5e41c6df..c04796f6e57fa 100644 --- a/fs/iomap/iter.c +++ b/fs/iomap/iter.c @@ -8,10 +8,10 @@ static inline void iomap_iter_reset_iomap(struct iomap_iter *iter) { - if (iter->fbatch) { + if (iter->iomap.flags & IOMAP_F_FOLIO_BATCH) { folio_batch_release(iter->fbatch); - kfree(iter->fbatch); - iter->fbatch = NULL; + folio_batch_reinit(iter->fbatch); + iter->iomap.flags &= ~IOMAP_F_FOLIO_BATCH; } iter->status = 0; diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c index 04f39ea15898b..37a1b33e90450 100644 --- a/fs/xfs/xfs_iomap.c +++ b/fs/xfs/xfs_iomap.c @@ -1831,7 +1831,6 @@ xfs_buffered_write_iomap_begin( */ if (flags & IOMAP_ZERO) { xfs_fileoff_t eof_fsb = XFS_B_TO_FSB(mp, XFS_ISIZE(ip)); - u64 end; if (isnullstartblock(imap.br_startblock) && offset_fsb >= eof_fsb) @@ -1851,12 +1850,14 @@ xfs_buffered_write_iomap_begin( */ if (imap.br_state == XFS_EXT_UNWRITTEN && offset_fsb < eof_fsb) { - loff_t len = min(count, - XFS_FSB_TO_B(mp, imap.br_blockcount)); + loff_t foffset = offset, fend; - end = iomap_fill_dirty_folios(iter, offset, len); + fend = offset + + min(count, XFS_FSB_TO_B(mp, imap.br_blockcount)); + iomap_fill_dirty_folios(iter, &foffset, fend, + &iomap_flags); end_fsb = min_t(xfs_fileoff_t, end_fsb, - XFS_B_TO_FSB(mp, end)); + XFS_B_TO_FSB(mp, foffset)); } xfs_trim_extent(&imap, offset_fsb, end_fsb - offset_fsb); diff --git a/include/linux/iomap.h b/include/linux/iomap.h index 520e967cb501d..6bb941707d122 100644 --- a/include/linux/iomap.h +++ b/include/linux/iomap.h @@ -88,6 +88,9 @@ struct vm_fault; /* * Flags set by the core iomap code during operations: * + * IOMAP_F_FOLIO_BATCH indicates that the folio batch mechanism is active + * for this operation, set by iomap_fill_dirty_folios(). + * * IOMAP_F_SIZE_CHANGED indicates to the iomap_end method that the file size * has changed as the result of this write operation. * @@ -95,6 +98,7 @@ struct vm_fault; * range it covers needs to be remapped by the high level before the operation * can proceed. */ +#define IOMAP_F_FOLIO_BATCH (1U << 13) #define IOMAP_F_SIZE_CHANGED (1U << 14) #define IOMAP_F_STALE (1U << 15) @@ -352,8 +356,8 @@ bool iomap_dirty_folio(struct address_space *mapping, struct folio *folio); int iomap_file_unshare(struct inode *inode, loff_t pos, loff_t len, const struct iomap_ops *ops, const struct iomap_write_ops *write_ops); -loff_t iomap_fill_dirty_folios(struct iomap_iter *iter, loff_t offset, - loff_t length); +unsigned int iomap_fill_dirty_folios(struct iomap_iter *iter, loff_t *start, + loff_t end, unsigned int *iomap_flags); int iomap_zero_range(struct inode *inode, loff_t pos, loff_t len, bool *did_zero, const struct iomap_ops *ops, const struct iomap_write_ops *write_ops, void *private); From c0dbeb791dc9ce6b7b9bc95f163b3ac9889f1e7f Mon Sep 17 00:00:00 2001 From: Jeff Layton Date: Thu, 4 Dec 2025 08:48:32 -0500 Subject: [PATCH 0799/1321] filelock: add lease_dispose_list() helper The lease-handling code paths always know they're disposing of leases, yet locks_dispose_list() checks flags at runtime to determine whether to call locks_free_lease() or locks_free_lock(). Split out a dedicated lease_dispose_list() helper for lease code paths. This makes the type handling explicit and prepares for the upcoming lease_manager enhancements where lease-specific operations are being consolidated. Reviewed-by: Chuck Lever Signed-off-by: Jeff Layton Link: https://patch.msgid.link/20251204-dir-deleg-ro-v2-1-22d37f92ce2c@kernel.org Signed-off-by: Christian Brauner --- fs/locks.c | 29 +++++++++++++++++++---------- 1 file changed, 19 insertions(+), 10 deletions(-) diff --git a/fs/locks.c b/fs/locks.c index e2036aa4bd373..d1d3ca739b740 100644 --- a/fs/locks.c +++ b/fs/locks.c @@ -369,10 +369,19 @@ locks_dispose_list(struct list_head *dispose) while (!list_empty(dispose)) { flc = list_first_entry(dispose, struct file_lock_core, flc_list); list_del_init(&flc->flc_list); - if (flc->flc_flags & (FL_LEASE|FL_DELEG|FL_LAYOUT)) - locks_free_lease(file_lease(flc)); - else - locks_free_lock(file_lock(flc)); + locks_free_lock(file_lock(flc)); + } +} + +static void +lease_dispose_list(struct list_head *dispose) +{ + struct file_lock_core *flc; + + while (!list_empty(dispose)) { + flc = list_first_entry(dispose, struct file_lock_core, flc_list); + list_del_init(&flc->flc_list); + locks_free_lease(file_lease(flc)); } } @@ -1620,7 +1629,7 @@ int __break_lease(struct inode *inode, unsigned int flags) spin_unlock(&ctx->flc_lock); percpu_up_read(&file_rwsem); - locks_dispose_list(&dispose); + lease_dispose_list(&dispose); error = wait_event_interruptible_timeout(new_fl->c.flc_wait, list_empty(&new_fl->c.flc_blocked_member), break_time); @@ -1643,7 +1652,7 @@ int __break_lease(struct inode *inode, unsigned int flags) out: spin_unlock(&ctx->flc_lock); percpu_up_read(&file_rwsem); - locks_dispose_list(&dispose); + lease_dispose_list(&dispose); free_lock: locks_free_lease(new_fl); return error; @@ -1727,7 +1736,7 @@ static int __fcntl_getlease(struct file *filp, unsigned int flavor) spin_unlock(&ctx->flc_lock); percpu_up_read(&file_rwsem); - locks_dispose_list(&dispose); + lease_dispose_list(&dispose); } return type; } @@ -1896,7 +1905,7 @@ generic_add_lease(struct file *filp, int arg, struct file_lease **flp, void **pr out: spin_unlock(&ctx->flc_lock); percpu_up_read(&file_rwsem); - locks_dispose_list(&dispose); + lease_dispose_list(&dispose); if (is_deleg) inode_unlock(inode); if (!error && !my_fl) @@ -1932,7 +1941,7 @@ static int generic_delete_lease(struct file *filp, void *owner) error = fl->fl_lmops->lm_change(victim, F_UNLCK, &dispose); spin_unlock(&ctx->flc_lock); percpu_up_read(&file_rwsem); - locks_dispose_list(&dispose); + lease_dispose_list(&dispose); return error; } @@ -2735,7 +2744,7 @@ locks_remove_lease(struct file *filp, struct file_lock_context *ctx) spin_unlock(&ctx->flc_lock); percpu_up_read(&file_rwsem); - locks_dispose_list(&dispose); + lease_dispose_list(&dispose); } /* From 7fc1be3647cdbf87feadf42c0eb1275665a1aa77 Mon Sep 17 00:00:00 2001 From: Jeff Layton Date: Thu, 4 Dec 2025 08:48:33 -0500 Subject: [PATCH 0800/1321] filelock: allow lease_managers to dictate what qualifies as a conflict Requesting a delegation on a file from the userland fcntl() interface currently succeeds when there are conflicting opens present. This is because the lease handling code ignores conflicting opens for FL_LAYOUT and FL_DELEG leases. This was a hack put in place long ago, because nfsd already checks for conflicts in its own way. The kernel needs to perform this check for userland delegations the same way it is done for leases, however. Make this dependent on the lease_manager by adding a new ->lm_open_conflict() lease_manager operation and have generic_add_lease() call that instead of check_conflicting_open(). Morph check_conflicting_open() into a ->lm_open_conflict() op that is only called for userland leases/delegations. Set the ->lm_open_conflict() operations for nfsd to trivial functions that always return 0. Reviewed-by: Chuck Lever Signed-off-by: Jeff Layton Link: https://patch.msgid.link/20251204-dir-deleg-ro-v2-2-22d37f92ce2c@kernel.org Signed-off-by: Christian Brauner --- Documentation/filesystems/locking.rst | 1 + fs/locks.c | 90 +++++++++++++-------------- fs/nfsd/nfs4layouts.c | 23 ++++++- fs/nfsd/nfs4state.c | 19 ++++++ include/linux/filelock.h | 1 + 5 files changed, 84 insertions(+), 50 deletions(-) diff --git a/Documentation/filesystems/locking.rst b/Documentation/filesystems/locking.rst index 77704fde98457..04c7691e50e01 100644 --- a/Documentation/filesystems/locking.rst +++ b/Documentation/filesystems/locking.rst @@ -416,6 +416,7 @@ lm_change yes no no lm_breaker_owns_lease: yes no no lm_lock_expirable yes no no lm_expire_lock no no yes +lm_open_conflict yes no no ====================== ============= ================= ========= buffer_head diff --git a/fs/locks.c b/fs/locks.c index d1d3ca739b740..7ea949d7ff451 100644 --- a/fs/locks.c +++ b/fs/locks.c @@ -585,10 +585,50 @@ lease_setup(struct file_lease *fl, void **priv) __f_setown(filp, task_pid(current), PIDTYPE_TGID, 0); } +/** + * lease_open_conflict - see if the given file points to an inode that has + * an existing open that would conflict with the + * desired lease. + * @filp: file to check + * @arg: type of lease that we're trying to acquire + * + * Check to see if there's an existing open fd on this file that would + * conflict with the lease we're trying to set. + */ +static int +lease_open_conflict(struct file *filp, const int arg) +{ + struct inode *inode = file_inode(filp); + int self_wcount = 0, self_rcount = 0; + + if (arg == F_RDLCK) + return inode_is_open_for_write(inode) ? -EAGAIN : 0; + else if (arg != F_WRLCK) + return 0; + + /* + * Make sure that only read/write count is from lease requestor. + * Note that this will result in denying write leases when i_writecount + * is negative, which is what we want. (We shouldn't grant write leases + * on files open for execution.) + */ + if (filp->f_mode & FMODE_WRITE) + self_wcount = 1; + else if (filp->f_mode & FMODE_READ) + self_rcount = 1; + + if (atomic_read(&inode->i_writecount) != self_wcount || + atomic_read(&inode->i_readcount) != self_rcount) + return -EAGAIN; + + return 0; +} + static const struct lease_manager_operations lease_manager_ops = { .lm_break = lease_break_callback, .lm_change = lease_modify, .lm_setup = lease_setup, + .lm_open_conflict = lease_open_conflict, }; /* @@ -1754,52 +1794,6 @@ int fcntl_getdeleg(struct file *filp, struct delegation *deleg) return 0; } -/** - * check_conflicting_open - see if the given file points to an inode that has - * an existing open that would conflict with the - * desired lease. - * @filp: file to check - * @arg: type of lease that we're trying to acquire - * @flags: current lock flags - * - * Check to see if there's an existing open fd on this file that would - * conflict with the lease we're trying to set. - */ -static int -check_conflicting_open(struct file *filp, const int arg, int flags) -{ - struct inode *inode = file_inode(filp); - int self_wcount = 0, self_rcount = 0; - - if (flags & FL_LAYOUT) - return 0; - if (flags & FL_DELEG) - /* We leave these checks to the caller */ - return 0; - - if (arg == F_RDLCK) - return inode_is_open_for_write(inode) ? -EAGAIN : 0; - else if (arg != F_WRLCK) - return 0; - - /* - * Make sure that only read/write count is from lease requestor. - * Note that this will result in denying write leases when i_writecount - * is negative, which is what we want. (We shouldn't grant write leases - * on files open for execution.) - */ - if (filp->f_mode & FMODE_WRITE) - self_wcount = 1; - else if (filp->f_mode & FMODE_READ) - self_rcount = 1; - - if (atomic_read(&inode->i_writecount) != self_wcount || - atomic_read(&inode->i_readcount) != self_rcount) - return -EAGAIN; - - return 0; -} - static int generic_add_lease(struct file *filp, int arg, struct file_lease **flp, void **priv) { @@ -1836,7 +1830,7 @@ generic_add_lease(struct file *filp, int arg, struct file_lease **flp, void **pr percpu_down_read(&file_rwsem); spin_lock(&ctx->flc_lock); time_out_leases(inode, &dispose); - error = check_conflicting_open(filp, arg, lease->c.flc_flags); + error = lease->fl_lmops->lm_open_conflict(filp, arg); if (error) goto out; @@ -1893,7 +1887,7 @@ generic_add_lease(struct file *filp, int arg, struct file_lease **flp, void **pr * precedes these checks. */ smp_mb(); - error = check_conflicting_open(filp, arg, lease->c.flc_flags); + error = lease->fl_lmops->lm_open_conflict(filp, arg); if (error) { locks_unlink_lock_ctx(&lease->c); goto out; diff --git a/fs/nfsd/nfs4layouts.c b/fs/nfsd/nfs4layouts.c index 683bd1130afe2..ad7af8cfcf1f9 100644 --- a/fs/nfsd/nfs4layouts.c +++ b/fs/nfsd/nfs4layouts.c @@ -764,9 +764,28 @@ nfsd4_layout_lm_change(struct file_lease *onlist, int arg, return lease_modify(onlist, arg, dispose); } +/** + * nfsd4_layout_lm_open_conflict - see if the given file points to an inode that has + * an existing open that would conflict with the + * desired lease. + * @filp: file to check + * @arg: type of lease that we're trying to acquire + * + * The kernel will call into this operation to determine whether there + * are conflicting opens that may prevent the layout from being granted. + * For nfsd, that check is done at a higher level, so this trivially + * returns 0. + */ +static int +nfsd4_layout_lm_open_conflict(struct file *filp, int arg) +{ + return 0; +} + static const struct lease_manager_operations nfsd4_layouts_lm_ops = { - .lm_break = nfsd4_layout_lm_break, - .lm_change = nfsd4_layout_lm_change, + .lm_break = nfsd4_layout_lm_break, + .lm_change = nfsd4_layout_lm_change, + .lm_open_conflict = nfsd4_layout_lm_open_conflict, }; int diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c index a2361ec36ab84..d5e0f3a52d4f0 100644 --- a/fs/nfsd/nfs4state.c +++ b/fs/nfsd/nfs4state.c @@ -5555,10 +5555,29 @@ nfsd_change_deleg_cb(struct file_lease *onlist, int arg, return -EAGAIN; } +/** + * nfsd4_deleg_lm_open_conflict - see if the given file points to an inode that has + * an existing open that would conflict with the + * desired lease. + * @filp: file to check + * @arg: type of lease that we're trying to acquire + * + * The kernel will call into this operation to determine whether there + * are conflicting opens that may prevent the deleg from being granted. + * For nfsd, that check is done at a higher level, so this trivially + * returns 0. + */ +static int +nfsd4_deleg_lm_open_conflict(struct file *filp, int arg) +{ + return 0; +} + static const struct lease_manager_operations nfsd_lease_mng_ops = { .lm_breaker_owns_lease = nfsd_breaker_owns_lease, .lm_break = nfsd_break_deleg_cb, .lm_change = nfsd_change_deleg_cb, + .lm_open_conflict = nfsd4_deleg_lm_open_conflict, }; static __be32 nfsd4_check_seqid(struct nfsd4_compound_state *cstate, struct nfs4_stateowner *so, u32 seqid) diff --git a/include/linux/filelock.h b/include/linux/filelock.h index 54b824c052992..2f5e5588ee073 100644 --- a/include/linux/filelock.h +++ b/include/linux/filelock.h @@ -49,6 +49,7 @@ struct lease_manager_operations { int (*lm_change)(struct file_lease *, int, struct list_head *); void (*lm_setup)(struct file_lease *, void **); bool (*lm_breaker_owns_lease)(struct file_lease *); + int (*lm_open_conflict)(struct file *, int); }; struct lock_manager { From 591e742086457604efde68ff4897e83171703504 Mon Sep 17 00:00:00 2001 From: David Howells Date: Sat, 20 Dec 2025 12:31:40 +0000 Subject: [PATCH 0801/1321] netfs: Fix early read unlock of page with EOF in middle The read result collection for buffered reads seems to run ahead of the completion of subrequests under some circumstances, as can be seen in the following log snippet: 9p_client_res: client 18446612686390831168 response P9_TREAD tag 0 err 0 ... netfs_sreq: R=00001b55[1] DOWN TERM f=192 s=0 5fb2/5fb2 s=5 e=0 ... netfs_collect_folio: R=00001b55 ix=00004 r=4000-5000 t=4000/5fb2 netfs_folio: i=157f3 ix=00004-00004 read-done netfs_folio: i=157f3 ix=00004-00004 read-unlock netfs_collect_folio: R=00001b55 ix=00005 r=5000-5fb2 t=5000/5fb2 netfs_folio: i=157f3 ix=00005-00005 read-done netfs_folio: i=157f3 ix=00005-00005 read-unlock ... netfs_collect_stream: R=00001b55[0:] cto=5fb2 frn=ffffffff netfs_collect_state: R=00001b55 col=5fb2 cln=6000 n=c netfs_collect_stream: R=00001b55[0:] cto=5fb2 frn=ffffffff netfs_collect_state: R=00001b55 col=5fb2 cln=6000 n=8 ... netfs_sreq: R=00001b55[2] ZERO SUBMT f=000 s=5fb2 0/4e s=0 e=0 netfs_sreq: R=00001b55[2] ZERO TERM f=102 s=5fb2 4e/4e s=5 e=0 The 'cto=5fb2' indicates the collected file pos we've collected results to so far - but we still have 0x4e more bytes to go - so we shouldn't have collected folio ix=00005 yet. The 'ZERO' subreq that clears the tail happens after we unlock the folio, allowing the application to see the uncleared tail through mmap. The problem is that netfs_read_unlock_folios() will unlock a folio in which the amount of read results collected hits EOF position - but the ZERO subreq lies beyond that and so happens after. Fix this by changing the end check to always be the end of the folio and never the end of the file. In the future, I should look at clearing to the end of the folio here rather than adding a ZERO subreq to do this. On the other hand, the ZERO subreq can run in parallel with an async READ subreq. Further, the ZERO subreq may still be necessary to, say, handle extents in a ceph file that don't have any backing store and are thus implicitly all zeros. This can be reproduced by creating a file, the size of which doesn't align to a page boundary, e.g. 24998 (0x5fb2) bytes and then doing something like: xfs_io -c "mmap -r 0 0x6000" -c "madvise -d 0 0x6000" \ -c "mread -v 0 0x6000" /xfstest.test/x The last 0x4e bytes should all be 00, but if the tail hasn't been cleared yet, you may see rubbish there. This can be reproduced with kafs by modifying the kernel to disable the call to netfs_read_subreq_progress() and to stop afs_issue_read() from doing the async call for NETFS_READAHEAD. Reproduction can be made easier by inserting an mdelay(100) in netfs_issue_read() for the ZERO-subreq case. AFS and CIFS are normally unlikely to show this as they dispatch READ ops asynchronously, which allows the ZERO-subreq to finish first. 9P's READ op is completely synchronous, so the ZERO-subreq will always happen after. It isn't seen all the time, though, because the collection may be done in a worker thread. Reported-by: Christian Schoenebeck Link: https://lore.kernel.org/r/8622834.T7Z3S40VBb@weasel/ Signed-off-by: David Howells Link: https://patch.msgid.link/938162.1766233900@warthog.procyon.org.uk Fixes: e2d46f2ec332 ("netfs: Change the read result collector to only use one work item") Tested-by: Christian Schoenebeck Acked-by: Dominique Martinet Suggested-by: Dominique Martinet cc: Dominique Martinet cc: Christian Schoenebeck cc: v9fs@lists.linux.dev cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner --- fs/netfs/read_collect.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/netfs/read_collect.c b/fs/netfs/read_collect.c index a95e7aadafd07..7a0ffa675fb17 100644 --- a/fs/netfs/read_collect.c +++ b/fs/netfs/read_collect.c @@ -137,7 +137,7 @@ static void netfs_read_unlock_folios(struct netfs_io_request *rreq, rreq->front_folio_order = order; fsize = PAGE_SIZE << order; fpos = folio_pos(folio); - fend = umin(fpos + fsize, rreq->i_size); + fend = fpos + fsize; trace_netfs_collect_folio(rreq, folio, fend, collected_to); From 25ae47b2adb5e8a1a9cf4b104227135d56b0d3ec Mon Sep 17 00:00:00 2001 From: Mateusz Guzik Date: Sat, 20 Dec 2025 06:40:22 +0100 Subject: [PATCH 0802/1321] fs: make sure to fail try_to_unlazy() and try_to_unlazy() for LOOKUP_CACHED Otherwise the slowpath can be taken by the caller, defeating the flag. This regressed after calls to legitimize_links() started being conditionally elided and stems from the routine always failing after seeing the flag, regardless if there were any links. In order to address both the bug and the weird semantics make it illegal to call legitimize_links() with LOOKUP_CACHED and handle the problem at the two callsites. Fixes: 7c179096e77eca21 ("fs: add predicts based on nd->depth") Reported-by: Chris Mason Signed-off-by: Mateusz Guzik Link: https://patch.msgid.link/20251220054023.142134-1-mjguzik@gmail.com Signed-off-by: Christian Brauner --- fs/namei.c | 18 +++++++++++++----- 1 file changed, 13 insertions(+), 5 deletions(-) diff --git a/fs/namei.c b/fs/namei.c index bf0f66f0e9b92..f7a8b5b000c23 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -830,11 +830,9 @@ static inline bool legitimize_path(struct nameidata *nd, static bool legitimize_links(struct nameidata *nd) { int i; - if (unlikely(nd->flags & LOOKUP_CACHED)) { - drop_links(nd); - nd->depth = 0; - return false; - } + + VFS_BUG_ON(nd->flags & LOOKUP_CACHED); + for (i = 0; i < nd->depth; i++) { struct saved *last = nd->stack + i; if (unlikely(!legitimize_path(nd, &last->link, last->seq))) { @@ -883,6 +881,11 @@ static bool try_to_unlazy(struct nameidata *nd) BUG_ON(!(nd->flags & LOOKUP_RCU)); + if (unlikely(nd->flags & LOOKUP_CACHED)) { + drop_links(nd); + nd->depth = 0; + goto out1; + } if (unlikely(nd->depth && !legitimize_links(nd))) goto out1; if (unlikely(!legitimize_path(nd, &nd->path, nd->seq))) @@ -918,6 +921,11 @@ static bool try_to_unlazy_next(struct nameidata *nd, struct dentry *dentry) int res; BUG_ON(!(nd->flags & LOOKUP_RCU)); + if (unlikely(nd->flags & LOOKUP_CACHED)) { + drop_links(nd); + nd->depth = 0; + goto out2; + } if (unlikely(nd->depth && !legitimize_links(nd))) goto out2; res = __legitimize_mnt(nd->path.mnt, nd->m_seq); From 8c12d5cfd098e8d42dd7f5959e6205422f647a8f Mon Sep 17 00:00:00 2001 From: Bagas Sanjaya Date: Fri, 19 Dec 2025 09:46:19 +0700 Subject: [PATCH 0803/1321] fs: Describe @isnew parameter in ilookup5_nowait() Sphinx reports kernel-doc warning: WARNING: ./fs/inode.c:1607 function parameter 'isnew' not described in 'ilookup5_nowait' Describe the parameter. Fixes: a27628f4363435 ("fs: rework I_NEW handling to operate without fences") Signed-off-by: Bagas Sanjaya Link: https://patch.msgid.link/20251219024620.22880-2-bagasdotme@gmail.com Reviewed-by: Jeff Layton Reviewed-by: Jan Kara Signed-off-by: Christian Brauner --- fs/inode.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/fs/inode.c b/fs/inode.c index 521383223d8a4..379f4c19845c9 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -1593,6 +1593,9 @@ EXPORT_SYMBOL(igrab); * @hashval: hash value (usually inode number) to search for * @test: callback used for comparisons between inodes * @data: opaque data pointer to pass to @test + * @isnew: return argument telling whether I_NEW was set when + * the inode was found in hash (the caller needs to + * wait for I_NEW to clear) * * Search for the inode specified by @hashval and @data in the inode cache. * If the inode is in the cache, the inode is returned with an incremented From b41ec00476a085be4db37d1dbaddad6db279fc00 Mon Sep 17 00:00:00 2001 From: Bagas Sanjaya Date: Fri, 19 Dec 2025 09:46:20 +0700 Subject: [PATCH 0804/1321] VFS: fix __start_dirop() kernel-doc warnings Sphinx report kernel-doc warnings: WARNING: ./fs/namei.c:2853 function parameter 'state' not described in '__start_dirop' WARNING: ./fs/namei.c:2853 expecting prototype for start_dirop(). Prototype was for __start_dirop() instead Fix them up. Fixes: ff7c4ea11a05c8 ("VFS: add start_creating_killable() and start_removing_killable()") Signed-off-by: Bagas Sanjaya Link: https://patch.msgid.link/20251219024620.22880-3-bagasdotme@gmail.com Reviewed-by: Jeff Layton Reviewed-by: Jan Kara Signed-off-by: Christian Brauner --- fs/namei.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/namei.c b/fs/namei.c index f7a8b5b000c23..cf16b6822dd31 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -2844,10 +2844,11 @@ static int filename_parentat(int dfd, struct filename *name, } /** - * start_dirop - begin a create or remove dirop, performing locking and lookup + * __start_dirop - begin a create or remove dirop, performing locking and lookup * @parent: the dentry of the parent in which the operation will occur * @name: a qstr holding the name within that parent * @lookup_flags: intent and other lookup flags. + * @state: task state bitmask * * The lookup is performed and necessary locks are taken so that, on success, * the returned dentry can be operated on safely. From 7efe01123ee30650325a8146614e7033c59d853f Mon Sep 17 00:00:00 2001 From: Al Viro Date: Tue, 16 Dec 2025 08:19:39 +0000 Subject: [PATCH 0805/1321] get rid of bogus __user in struct xattr_args::value The first member of struct xattr_args is declared as __aligned_u64 __user value; which makes no sense whatsoever; __user is a qualifier and what that declaration says is "all struct xattr_args instances have .value _stored_ in user address space, no matter where the rest of the structure happens to be". Something like "int __user *p" stands for "value of p is a pointer to an instance of int that happens to live in user address space"; it says nothing about location of p itself, just as const char *p declares a pointer to unmodifiable char rather than an unmodifiable pointer to char. With xattr_args the intent clearly had been "the 64bit value represents a _pointer_ to object in user address space", but __user has nothing to do with that. All it gets us is a couple of bogus warnings in fs/xattr.c where (userland) instance of xattr_args is copied to local variable of that type (in kernel address space), followed by access to its members. Since we've told sparse that args.value must somehow be located in userland memory, we get warned that looking at that 64bit unsigned integer (in a variable already on kernel stack) is not allowed. Note that sparse has no way to express "this integer shall never be cast into a pointer to be dereferenced directly" and I don't see any way to assign a sane semantics to that. In any case, __user is not it. Signed-off-by: Al Viro Link: https://patch.msgid.link/20251216081939.GQ1712166@ZenIV Signed-off-by: Christian Brauner --- include/uapi/linux/xattr.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/uapi/linux/xattr.h b/include/uapi/linux/xattr.h index c7c85bb504bab..2e5aef48fa7ea 100644 --- a/include/uapi/linux/xattr.h +++ b/include/uapi/linux/xattr.h @@ -23,7 +23,7 @@ #define XATTR_REPLACE 0x2 /* set value, fail if attr does not exist */ struct xattr_args { - __aligned_u64 __user value; + __aligned_u64 value; __u32 size; __u32 flags; }; From 734861bd412e577319b4450e64edec5ed01047e5 Mon Sep 17 00:00:00 2001 From: Tyler Hicks Date: Tue, 23 Dec 2025 13:41:52 -0600 Subject: [PATCH 0806/1321] ecryptfs: Fix improper mknod pairing of start_creating()/end_removing() The ecryptfs_start_creating_dentry() function must be paired with the end_creating() function. Fix ecryptfs_mknod() so that end_creating() is properly called in the return path, instead of end_removing(). Fixes: f046fbb4d81d ("ecryptfs: use new start_creating/start_removing APIs") Signed-off-by: Tyler Hicks Link: https://patch.msgid.link/20251223194153.2818445-2-code@tyhicks.com Reviewed-by: Amir Goldstein Signed-off-by: Christian Brauner --- fs/ecryptfs/inode.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c index 3978248247dc2..e73d9de676a64 100644 --- a/fs/ecryptfs/inode.c +++ b/fs/ecryptfs/inode.c @@ -584,7 +584,7 @@ ecryptfs_mknod(struct mnt_idmap *idmap, struct inode *dir, fsstack_copy_attr_times(dir, lower_dir); fsstack_copy_inode_size(dir, lower_dir); out: - end_removing(lower_dentry); + end_creating(lower_dentry); if (d_really_is_negative(dentry)) d_drop(dentry); return rc; From b122d7981c37b31a4559d8ad682e9ab717d968c3 Mon Sep 17 00:00:00 2001 From: Tyler Hicks Date: Tue, 23 Dec 2025 13:41:53 -0600 Subject: [PATCH 0807/1321] ecryptfs: Release lower parent dentry after creating dir Fix a mkdir-induced usage count imbalance that tripped a umount_check() BUG while unmounting the lower filesystem. Commit f046fbb4d81d ("ecryptfs: use new start_creating/start_removing APIs") added a new dget() of the lower parent dir, in ecryptfs_mkdir(), but did not dput() the dentry before returning from that function. The BUG output as seen while running the eCryptfs test suite: $ ./run_tests.sh -b 131072 -c safe,destructive -f ext4 -K -t lp-926292.sh ... Running eCryptfs filesystem tests on ext4 lp-926292 ------------[ cut here ]------------ BUG: Dentry ffff8e6692d11988{i=c,n=ECRYPTFS_FNEK_ENCRYPTED.FXZuRGZL7QAFtER.JeA46DtdKqkkQx9H2Vpmv234J5CU8YSsrUwZJK4AbXbrN5WkZ348wnqstovKKxA-} still in use (1) [unmount of ext4 loop0] WARNING: CPU: 7 PID: 950 at fs/dcache.c:1590 umount_check+0x5e/0x80 Modules linked in: md5 libmd5 ecryptfs encrypted_keys ext4 crc16 mbcache jbd2 CPU: 7 UID: 0 PID: 950 Comm: umount Not tainted 6.18.0-rc1-00013-gf046fbb4d81d #17 PREEMPT(full) Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 RIP: 0010:umount_check+0x5e/0x80 Code: 88 38 06 00 00 48 8b 40 28 4c 8b 08 48 8b 46 68 48 85 c0 74 04 48 8b 50 38 51 48 c7 c7 60 32 9c b5 48 89 f1 e8 43 5e ca ff 90 <0f> 0b 90 90 58 31 c0 e9 46 9d 6c 00 41 83 f8 01 75 b8 eb a3 66 66 RSP: 0018:ffffa19940c4bdd0 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff8e6692fad4c0 RCX: 0000000000000000 RDX: 0000000000000004 RSI: ffffa19940c4bc70 RDI: 00000000ffffffff RBP: ffffffffb4eb5930 R08: 00000000ffffdfff R09: 0000000000000001 R10: 00000000ffffdfff R11: ffffffffb5c8a9e0 R12: ffff8e6692fad4c0 R13: ffff8e6692fad4c0 R14: ffff8e6692d11a40 R15: ffff8e6692d11988 FS: 00007f6b4b491800(0000) GS:ffff8e670506e000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f6b4b5f8d40 CR3: 0000000114eb7001 CR4: 0000000000772ef0 PKRU: 55555554 Call Trace: d_walk+0xfd/0x370 shrink_dcache_for_umount+0x4d/0x140 generic_shutdown_super+0x20/0x160 kill_block_super+0x1a/0x40 ext4_kill_sb+0x22/0x40 [ext4] deactivate_locked_super+0x33/0xa0 cleanup_mnt+0xba/0x150 task_work_run+0x5c/0xa0 exit_to_user_mode_loop+0xac/0xb0 do_syscall_64+0x2ab/0xfa0 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f6b4b6c2a2b Code: c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 f3 0f 1e fa 31 f6 e9 05 00 00 00 0f 1f 44 00 00 f3 0f 1e fa b8 a6 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 b9 83 0d 00 f7 d8 RSP: 002b:00007ffcd5b8b498 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6 RAX: 0000000000000000 RBX: 000055b84af0b9e0 RCX: 00007f6b4b6c2a2b RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000055b84af0bdf0 RBP: 00007ffcd5b8b570 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000103 R11: 0000000000000246 R12: 000055b84af0bae0 R13: 0000000000000000 R14: 000055b84af0bdf0 R15: 0000000000000000 ---[ end trace 0000000000000000 ]--- EXT4-fs (loop0): unmounting filesystem 00d9ea41-f61e-43d0-a449-6be03e7e8428. EXT4-fs (loop0): sb orphan head is 12 sb_info orphan list: inode loop0:12 at ffff8e66950e1df0: mode 40700, nlink 0, next 0 Assertion failure in ext4_put_super() at fs/ext4/super.c:1345: 'list_empty(&sbi->s_orphan)' Fixes: f046fbb4d81d ("ecryptfs: use new start_creating/start_removing APIs") Signed-off-by: Tyler Hicks Link: https://patch.msgid.link/20251223194153.2818445-3-code@tyhicks.com Reviewed-by: Amir Goldstein Signed-off-by: Christian Brauner --- fs/ecryptfs/inode.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c index e73d9de676a64..8ab014db3e03f 100644 --- a/fs/ecryptfs/inode.c +++ b/fs/ecryptfs/inode.c @@ -533,6 +533,7 @@ static struct dentry *ecryptfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, fsstack_copy_inode_size(dir, lower_dir); set_nlink(dir, lower_dir->i_nlink); out: + dput(lower_dir_dentry); end_creating(lower_dentry); if (d_really_is_negative(dentry)) d_drop(dentry); From a02251c49486e0a6bb38f853a3767c6bb9b62d2a Mon Sep 17 00:00:00 2001 From: Christian Brauner Date: Wed, 24 Dec 2025 13:00:24 +0100 Subject: [PATCH 0808/1321] pidfs: protect PIDFD_GET_* ioctls() via ifdef We originally protected PIDFD_GET__NAMESPACE ioctls() through ifdefs and recent rework made it possible to drop them. There was an oversight though. When the relevant namespace is turned off ns->ops will be NULL so even though opening a file descriptor is perfectly legitimate it would fail during inode eviction when the file was closed. The simple fix would be to check ns->ops for NULL and continue allow to retrieve namespace fds from pidfds but we don't allow retrieving them when the relevant namespace type is turned off. So keep the simplification but add the ifdefs back in. Link: https://lore.kernel.org/20251222214907.GA189632@quark Link: https://patch.msgid.link/20251224-ununterbrochen-gagen-ea949b83f8f2@brauner Fixes: a71e4f103aed ("pidfs: simplify PIDFD_GET__NAMESPACE ioctls") Tested-by: Brendan Jackman Tested-by: Eric Biggers Reported-by: Eric Biggers Signed-off-by: Christian Brauner --- fs/pidfs.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/fs/pidfs.c b/fs/pidfs.c index dba703d4ce4a8..1e20e36e0ed55 100644 --- a/fs/pidfs.c +++ b/fs/pidfs.c @@ -517,14 +517,18 @@ static long pidfd_ioctl(struct file *file, unsigned int cmd, unsigned long arg) switch (cmd) { /* Namespaces that hang of nsproxy. */ case PIDFD_GET_CGROUP_NAMESPACE: +#ifdef CONFIG_CGROUPS if (!ns_ref_get(nsp->cgroup_ns)) break; ns_common = to_ns_common(nsp->cgroup_ns); +#endif break; case PIDFD_GET_IPC_NAMESPACE: +#ifdef CONFIG_IPC_NS if (!ns_ref_get(nsp->ipc_ns)) break; ns_common = to_ns_common(nsp->ipc_ns); +#endif break; case PIDFD_GET_MNT_NAMESPACE: if (!ns_ref_get(nsp->mnt_ns)) @@ -532,32 +536,43 @@ static long pidfd_ioctl(struct file *file, unsigned int cmd, unsigned long arg) ns_common = to_ns_common(nsp->mnt_ns); break; case PIDFD_GET_NET_NAMESPACE: +#ifdef CONFIG_NET_NS if (!ns_ref_get(nsp->net_ns)) break; ns_common = to_ns_common(nsp->net_ns); +#endif break; case PIDFD_GET_PID_FOR_CHILDREN_NAMESPACE: +#ifdef CONFIG_PID_NS if (!ns_ref_get(nsp->pid_ns_for_children)) break; ns_common = to_ns_common(nsp->pid_ns_for_children); +#endif break; case PIDFD_GET_TIME_NAMESPACE: +#ifdef CONFIG_TIME_NS if (!ns_ref_get(nsp->time_ns)) break; ns_common = to_ns_common(nsp->time_ns); +#endif break; case PIDFD_GET_TIME_FOR_CHILDREN_NAMESPACE: +#ifdef CONFIG_TIME_NS if (!ns_ref_get(nsp->time_ns_for_children)) break; ns_common = to_ns_common(nsp->time_ns_for_children); +#endif break; case PIDFD_GET_UTS_NAMESPACE: +#ifdef CONFIG_UTS_NS if (!ns_ref_get(nsp->uts_ns)) break; ns_common = to_ns_common(nsp->uts_ns); +#endif break; /* Namespaces that don't hang of nsproxy. */ case PIDFD_GET_USER_NAMESPACE: +#ifdef CONFIG_USER_NS scoped_guard(rcu) { struct user_namespace *user_ns; @@ -566,8 +581,10 @@ static long pidfd_ioctl(struct file *file, unsigned int cmd, unsigned long arg) break; ns_common = to_ns_common(user_ns); } +#endif break; case PIDFD_GET_PID_NAMESPACE: +#ifdef CONFIG_PID_NS scoped_guard(rcu) { struct pid_namespace *pid_ns; @@ -576,6 +593,7 @@ static long pidfd_ioctl(struct file *file, unsigned int cmd, unsigned long arg) break; ns_common = to_ns_common(pid_ns); } +#endif break; default: return -ENOIOCTLCMD; From 618429e520f0727e84fde15cfc69b13331b30730 Mon Sep 17 00:00:00 2001 From: Philipp Stanner Date: Mon, 8 Dec 2025 15:07:15 +0100 Subject: [PATCH 0809/1321] MAINTAINERS: Update Nova GPU driver git link Nova driver development has been moved to a different git repository. Update the MAINTAINERS entry to reflect that. Reported-by: Gary Guo Signed-off-by: Philipp Stanner Link: https://patch.msgid.link/20251208140713.41330-3-phasta@kernel.org Signed-off-by: Danilo Krummrich --- MAINTAINERS | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/MAINTAINERS b/MAINTAINERS index 6737aad729d63..a4d8281a05978 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -8068,7 +8068,7 @@ W: https://rust-for-linux.com/nova-gpu-driver Q: https://patchwork.freedesktop.org/project/nouveau/ B: https://gitlab.freedesktop.org/drm/nova/-/issues C: irc://irc.oftc.net/nouveau -T: git https://gitlab.freedesktop.org/drm/nova.git nova-next +T: git https://gitlab.freedesktop.org/drm/rust/kernel.git drm-rust-next F: Documentation/gpu/nova/ F: drivers/gpu/nova-core/ @@ -8080,7 +8080,7 @@ W: https://rust-for-linux.com/nova-gpu-driver Q: https://patchwork.freedesktop.org/project/nouveau/ B: https://gitlab.freedesktop.org/drm/nova/-/issues C: irc://irc.oftc.net/nouveau -T: git https://gitlab.freedesktop.org/drm/nova.git nova-next +T: git https://gitlab.freedesktop.org/drm/rust/kernel.git drm-rust-next F: Documentation/gpu/nova/ F: drivers/gpu/drm/nova/ F: include/uapi/drm/nova_drm.h From f0ee1df2ecb7eecb6d6672e5e67d1801ae3904ec Mon Sep 17 00:00:00 2001 From: Alexandre Courbot Date: Wed, 5 Nov 2025 09:40:09 +0900 Subject: [PATCH 0810/1321] gpu: nova-core: select RUST_FW_LOADER_ABSTRACTIONS RUST_FW_LOADER_ABSTRACTIONS was depended on by NOVA_CORE, but NOVA_CORE is selected by DRM_NOVA. This creates a situation where, if DRM_NOVA is selected, NOVA_CORE gets enabled but not RUST_FW_LOADER_ABSTRACTIONS, which results in a build error. Since the firmware loader is an implementation detail of the driver, it should be enabled along with it, so change the "depends on" to a "select". Fixes: 54e6baf123fd ("gpu: nova-core: add initial driver stub") Closes: https://lore.kernel.org/oe-kbuild-all/202512061721.rxKGnt5q-lkp@intel.com/ Tested-by: Alyssa Ross Acked-by: Danilo Krummrich Link: https://patch.msgid.link/20251106-b4-select-rust-fw-v3-2-771172257755@nvidia.com Signed-off-by: Alexandre Courbot --- drivers/gpu/nova-core/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/nova-core/Kconfig b/drivers/gpu/nova-core/Kconfig index 20d3e6d0d796e..527920f9c4d39 100644 --- a/drivers/gpu/nova-core/Kconfig +++ b/drivers/gpu/nova-core/Kconfig @@ -3,7 +3,7 @@ config NOVA_CORE depends on 64BIT depends on PCI depends on RUST - depends on RUST_FW_LOADER_ABSTRACTIONS + select RUST_FW_LOADER_ABSTRACTIONS select AUXILIARY_BUS default n help From 8365c6179969a45214d56dab71d24417cf70cc46 Mon Sep 17 00:00:00 2001 From: Alexandre Courbot Date: Tue, 16 Dec 2025 11:57:07 +0900 Subject: [PATCH 0811/1321] gpu: nova-core: bindings: add missing explicit padding Explicit padding is needed in order to avoid uninitialized bytes and safely implement `AsBytes`. The `--explicit-padding` of bindgen was omitted by mistake when these bindings were generated. Fixes: 13f85988d4fa ("gpu: nova-core: gsp: Retrieve GSP static info to gather GPU information") Reviewed-by: Lyude Paul Reviewed-by: Joel Fernandes Acked-by: Danilo Krummrich Link: https://patch.msgid.link/20251216-nova-fixes-v3-1-c7469a71f7c4@nvidia.com Signed-off-by: Alexandre Courbot --- drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs b/drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs index 5bcfbcd1ad22c..5f0569dcc4a01 100644 --- a/drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs +++ b/drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs @@ -325,6 +325,7 @@ pub struct NV0080_CTRL_GPU_GET_SRIOV_CAPS_PARAMS { pub totalVFs: u32_, pub firstVfOffset: u32_, pub vfFeatureMask: u32_, + pub __bindgen_padding_0: [u8; 4usize], pub FirstVFBar0Address: u64_, pub FirstVFBar1Address: u64_, pub FirstVFBar2Address: u64_, @@ -340,6 +341,7 @@ pub struct NV0080_CTRL_GPU_GET_SRIOV_CAPS_PARAMS { pub bClientRmAllocatedCtxBuffer: u8_, pub bNonPowerOf2ChannelCountSupported: u8_, pub bVfResizableBAR1Supported: u8_, + pub __bindgen_padding_1: [u8; 7usize], } #[repr(C)] #[derive(Debug, Default, Copy, Clone)] @@ -347,11 +349,13 @@ pub struct NV2080_CTRL_BIOS_GET_SKU_INFO_PARAMS { pub BoardID: u32_, pub chipSKU: [ffi::c_char; 9usize], pub chipSKUMod: [ffi::c_char; 5usize], + pub __bindgen_padding_0: [u8; 2usize], pub skuConfigVersion: u32_, pub project: [ffi::c_char; 5usize], pub projectSKU: [ffi::c_char; 5usize], pub CDP: [ffi::c_char; 6usize], pub projectSKUMod: [ffi::c_char; 2usize], + pub __bindgen_padding_1: [u8; 2usize], pub businessCycle: u32_, } pub type NV2080_CTRL_CMD_FB_GET_FB_REGION_SURFACE_MEM_TYPE_FLAG = [u8_; 17usize]; @@ -371,6 +375,7 @@ pub struct NV2080_CTRL_CMD_FB_GET_FB_REGION_FB_REGION_INFO { #[derive(Debug, Default, Copy, Clone)] pub struct NV2080_CTRL_CMD_FB_GET_FB_REGION_INFO_PARAMS { pub numFBRegions: u32_, + pub __bindgen_padding_0: [u8; 4usize], pub fbRegion: [NV2080_CTRL_CMD_FB_GET_FB_REGION_FB_REGION_INFO; 16usize], } #[repr(C)] @@ -495,13 +500,16 @@ pub struct FW_WPR_LAYOUT_OFFSET { #[derive(Debug, Copy, Clone)] pub struct GspStaticConfigInfo_t { pub grCapsBits: [u8_; 23usize], + pub __bindgen_padding_0: u8, pub gidInfo: NV2080_CTRL_GPU_GET_GID_INFO_PARAMS, pub SKUInfo: NV2080_CTRL_BIOS_GET_SKU_INFO_PARAMS, + pub __bindgen_padding_1: [u8; 4usize], pub fbRegionInfoParams: NV2080_CTRL_CMD_FB_GET_FB_REGION_INFO_PARAMS, pub sriovCaps: NV0080_CTRL_GPU_GET_SRIOV_CAPS_PARAMS, pub sriovMaxGfid: u32_, pub engineCaps: [u32_; 3usize], pub poisonFuseEnabled: u8_, + pub __bindgen_padding_2: [u8; 7usize], pub fb_length: u64_, pub fbio_mask: u64_, pub fb_bus_width: u32_, @@ -527,16 +535,20 @@ pub struct GspStaticConfigInfo_t { pub bIsMigSupported: u8_, pub RTD3GC6TotalBoardPower: u16_, pub RTD3GC6PerstDelay: u16_, + pub __bindgen_padding_3: [u8; 2usize], pub bar1PdeBase: u64_, pub bar2PdeBase: u64_, pub bVbiosValid: u8_, + pub __bindgen_padding_4: [u8; 3usize], pub vbiosSubVendor: u32_, pub vbiosSubDevice: u32_, pub bPageRetirementSupported: u8_, pub bSplitVasBetweenServerClientRm: u8_, pub bClRootportNeedsNosnoopWAR: u8_, + pub __bindgen_padding_5: u8, pub displaylessMaxHeads: VIRTUAL_DISPLAY_GET_NUM_HEADS_PARAMS, pub displaylessMaxResolution: VIRTUAL_DISPLAY_GET_MAX_RESOLUTION_PARAMS, + pub __bindgen_padding_6: [u8; 4usize], pub displaylessMaxPixels: u64_, pub hInternalClient: u32_, pub hInternalDevice: u32_, From 7a1c2e6773410601b47e4180901304e4883d0637 Mon Sep 17 00:00:00 2001 From: Alexandre Courbot Date: Tue, 16 Dec 2025 11:57:08 +0900 Subject: [PATCH 0812/1321] gpu: nova-core: gsp: fix length of received messages The size of messages' payload is miscalculated, leading to extra data passed to the message handler. While this is not a problem with our current set of commands, others with a variable-length payload may misbehave. Fix this by introducing a method returning the payload size and using it. Fixes: 75f6b1de8133 ("gpu: nova-core: gsp: Add GSP command queue bindings and handling") Reviewed-by: Lyude Paul Reviewed-by: Joel Fernandes Reviewed-by: Alistair Popple Acked-by: Danilo Krummrich Link: https://patch.msgid.link/20251216-nova-fixes-v3-2-c7469a71f7c4@nvidia.com [acourbot@nvidia.com: update `PANIC:` comments as pointed out by Joel.] Signed-off-by: Alexandre Courbot --- drivers/gpu/nova-core/gsp/cmdq.rs | 14 ++++++++------ drivers/gpu/nova-core/gsp/fw.rs | 13 +++++++++---- 2 files changed, 17 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs index 6f946d14868ab..3991ccc0c10f1 100644 --- a/drivers/gpu/nova-core/gsp/cmdq.rs +++ b/drivers/gpu/nova-core/gsp/cmdq.rs @@ -588,21 +588,23 @@ impl Cmdq { header.length(), ); + let payload_length = header.payload_length(); + // Check that the driver read area is large enough for the message. - if slice_1.len() + slice_2.len() < header.length() { + if slice_1.len() + slice_2.len() < payload_length { return Err(EIO); } // Cut the message slices down to the actual length of the message. - let (slice_1, slice_2) = if slice_1.len() > header.length() { - // PANIC: we checked above that `slice_1` is at least as long as `msg_header.length()`. - (slice_1.split_at(header.length()).0, &slice_2[0..0]) + let (slice_1, slice_2) = if slice_1.len() > payload_length { + // PANIC: we checked above that `slice_1` is at least as long as `payload_length`. + (slice_1.split_at(payload_length).0, &slice_2[0..0]) } else { ( slice_1, // PANIC: we checked above that `slice_1.len() + slice_2.len()` is at least as - // large as `msg_header.length()`. - slice_2.split_at(header.length() - slice_1.len()).0, + // large as `payload_length`. + slice_2.split_at(payload_length - slice_1.len()).0, ) }; diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs index abffd6beec654..7b8e710b33e75 100644 --- a/drivers/gpu/nova-core/gsp/fw.rs +++ b/drivers/gpu/nova-core/gsp/fw.rs @@ -853,11 +853,16 @@ impl GspMsgElement { self.inner.checkSum = checksum; } - /// Returns the total length of the message. + /// Returns the length of the message's payload. + pub(crate) fn payload_length(&self) -> usize { + // `rpc.length` includes the length of the RPC message header. + num::u32_as_usize(self.inner.rpc.length) + .saturating_sub(size_of::()) + } + + /// Returns the total length of the message, message and RPC headers included. pub(crate) fn length(&self) -> usize { - // `rpc.length` includes the length of the GspRpcHeader but not the message header. - size_of::() - size_of::() - + num::u32_as_usize(self.inner.rpc.length) + size_of::() + self.payload_length() } // Returns the sequence number of the message. From dea8903526bcf52edf2847334d57c22702e1791a Mon Sep 17 00:00:00 2001 From: Alexandre Courbot Date: Tue, 16 Dec 2025 11:57:09 +0900 Subject: [PATCH 0813/1321] gpu: nova-core: bindings: derive `MaybeZeroable` Commit 4846300ba8f9 ("rust: derive `Zeroable` for all structs & unions generated by bindgen where possible") automatically derives `MaybeZeroable` for all bindings. This is better than selectively deriving `Zeroable` as it ensures all types that can implement `Zeroable` do. Regenerate the nova-core bindings so they benefit from this, and remove a now unneeded implementation of `Zeroable`. Fixes: 75f6b1de8133 ("gpu: nova-core: gsp: Add GSP command queue bindings and handling") Reviewed-by: Lyude Paul Reviewed-by: Joel Fernandes Acked-by: Danilo Krummrich Link: https://patch.msgid.link/20251216-nova-fixes-v3-3-c7469a71f7c4@nvidia.com Signed-off-by: Alexandre Courbot --- drivers/gpu/nova-core/gsp/fw.rs | 7 -- drivers/gpu/nova-core/gsp/fw/r570_144.rs | 11 ++- .../gpu/nova-core/gsp/fw/r570_144/bindings.rs | 93 ++++++++++--------- 3 files changed, 54 insertions(+), 57 deletions(-) diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs index 7b8e710b33e75..b754631d2d8d2 100644 --- a/drivers/gpu/nova-core/gsp/fw.rs +++ b/drivers/gpu/nova-core/gsp/fw.rs @@ -797,13 +797,6 @@ impl bindings::rpc_message_header_v { } } -// SAFETY: We can't derive the Zeroable trait for this binding because the -// procedural macro doesn't support the syntax used by bindgen to create the -// __IncompleteArrayField types. So instead we implement it here, which is safe -// because these are explicitly padded structures only containing types for -// which any bit pattern, including all zeros, is valid. -unsafe impl Zeroable for bindings::rpc_message_header_v {} - /// GSP Message Element. /// /// This is essentially a message header expected to be followed by the message data. diff --git a/drivers/gpu/nova-core/gsp/fw/r570_144.rs b/drivers/gpu/nova-core/gsp/fw/r570_144.rs index 048234d1a9d1a..e99d315ae74c1 100644 --- a/drivers/gpu/nova-core/gsp/fw/r570_144.rs +++ b/drivers/gpu/nova-core/gsp/fw/r570_144.rs @@ -24,8 +24,11 @@ unreachable_pub, unsafe_op_in_unsafe_fn )] -use kernel::{ - ffi, - prelude::Zeroable, // -}; +use kernel::ffi; +use pin_init::MaybeZeroable; + include!("r570_144/bindings.rs"); + +// SAFETY: This type has a size of zero, so its inclusion into another type should not affect their +// ability to implement `Zeroable`. +unsafe impl kernel::prelude::Zeroable for __IncompleteArrayField {} diff --git a/drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs b/drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs index 5f0569dcc4a01..6d25fe0bffa97 100644 --- a/drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs +++ b/drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs @@ -320,7 +320,7 @@ pub const NV_VGPU_MSG_EVENT_RECOVERY_ACTION: _bindgen_ty_3 = 4130; pub const NV_VGPU_MSG_EVENT_NUM_EVENTS: _bindgen_ty_3 = 4131; pub type _bindgen_ty_3 = ffi::c_uint; #[repr(C)] -#[derive(Debug, Default, Copy, Clone)] +#[derive(Debug, Default, Copy, Clone, MaybeZeroable)] pub struct NV0080_CTRL_GPU_GET_SRIOV_CAPS_PARAMS { pub totalVFs: u32_, pub firstVfOffset: u32_, @@ -344,7 +344,7 @@ pub struct NV0080_CTRL_GPU_GET_SRIOV_CAPS_PARAMS { pub __bindgen_padding_1: [u8; 7usize], } #[repr(C)] -#[derive(Debug, Default, Copy, Clone)] +#[derive(Debug, Default, Copy, Clone, MaybeZeroable)] pub struct NV2080_CTRL_BIOS_GET_SKU_INFO_PARAMS { pub BoardID: u32_, pub chipSKU: [ffi::c_char; 9usize], @@ -360,7 +360,7 @@ pub struct NV2080_CTRL_BIOS_GET_SKU_INFO_PARAMS { } pub type NV2080_CTRL_CMD_FB_GET_FB_REGION_SURFACE_MEM_TYPE_FLAG = [u8_; 17usize]; #[repr(C)] -#[derive(Debug, Default, Copy, Clone)] +#[derive(Debug, Default, Copy, Clone, MaybeZeroable)] pub struct NV2080_CTRL_CMD_FB_GET_FB_REGION_FB_REGION_INFO { pub base: u64_, pub limit: u64_, @@ -372,14 +372,14 @@ pub struct NV2080_CTRL_CMD_FB_GET_FB_REGION_FB_REGION_INFO { pub blackList: NV2080_CTRL_CMD_FB_GET_FB_REGION_SURFACE_MEM_TYPE_FLAG, } #[repr(C)] -#[derive(Debug, Default, Copy, Clone)] +#[derive(Debug, Default, Copy, Clone, MaybeZeroable)] pub struct NV2080_CTRL_CMD_FB_GET_FB_REGION_INFO_PARAMS { pub numFBRegions: u32_, pub __bindgen_padding_0: [u8; 4usize], pub fbRegion: [NV2080_CTRL_CMD_FB_GET_FB_REGION_FB_REGION_INFO; 16usize], } #[repr(C)] -#[derive(Debug, Copy, Clone)] +#[derive(Debug, Copy, Clone, MaybeZeroable)] pub struct NV2080_CTRL_GPU_GET_GID_INFO_PARAMS { pub index: u32_, pub flags: u32_, @@ -396,14 +396,14 @@ impl Default for NV2080_CTRL_GPU_GET_GID_INFO_PARAMS { } } #[repr(C)] -#[derive(Debug, Default, Copy, Clone, Zeroable)] +#[derive(Debug, Default, Copy, Clone, MaybeZeroable)] pub struct DOD_METHOD_DATA { pub status: u32_, pub acpiIdListLen: u32_, pub acpiIdList: [u32_; 16usize], } #[repr(C)] -#[derive(Debug, Default, Copy, Clone, Zeroable)] +#[derive(Debug, Default, Copy, Clone, MaybeZeroable)] pub struct JT_METHOD_DATA { pub status: u32_, pub jtCaps: u32_, @@ -412,14 +412,14 @@ pub struct JT_METHOD_DATA { pub __bindgen_padding_0: u8, } #[repr(C)] -#[derive(Debug, Default, Copy, Clone, Zeroable)] +#[derive(Debug, Default, Copy, Clone, MaybeZeroable)] pub struct MUX_METHOD_DATA_ELEMENT { pub acpiId: u32_, pub mode: u32_, pub status: u32_, } #[repr(C)] -#[derive(Debug, Default, Copy, Clone, Zeroable)] +#[derive(Debug, Default, Copy, Clone, MaybeZeroable)] pub struct MUX_METHOD_DATA { pub tableLen: u32_, pub acpiIdMuxModeTable: [MUX_METHOD_DATA_ELEMENT; 16usize], @@ -427,13 +427,13 @@ pub struct MUX_METHOD_DATA { pub acpiIdMuxStateTable: [MUX_METHOD_DATA_ELEMENT; 16usize], } #[repr(C)] -#[derive(Debug, Default, Copy, Clone, Zeroable)] +#[derive(Debug, Default, Copy, Clone, MaybeZeroable)] pub struct CAPS_METHOD_DATA { pub status: u32_, pub optimusCaps: u32_, } #[repr(C)] -#[derive(Debug, Default, Copy, Clone, Zeroable)] +#[derive(Debug, Default, Copy, Clone, MaybeZeroable)] pub struct ACPI_METHOD_DATA { pub bValid: u8_, pub __bindgen_padding_0: [u8; 3usize], @@ -443,20 +443,20 @@ pub struct ACPI_METHOD_DATA { pub capsMethodData: CAPS_METHOD_DATA, } #[repr(C)] -#[derive(Debug, Default, Copy, Clone)] +#[derive(Debug, Default, Copy, Clone, MaybeZeroable)] pub struct VIRTUAL_DISPLAY_GET_MAX_RESOLUTION_PARAMS { pub headIndex: u32_, pub maxHResolution: u32_, pub maxVResolution: u32_, } #[repr(C)] -#[derive(Debug, Default, Copy, Clone)] +#[derive(Debug, Default, Copy, Clone, MaybeZeroable)] pub struct VIRTUAL_DISPLAY_GET_NUM_HEADS_PARAMS { pub numHeads: u32_, pub maxNumHeads: u32_, } #[repr(C)] -#[derive(Debug, Default, Copy, Clone, Zeroable)] +#[derive(Debug, Default, Copy, Clone, MaybeZeroable)] pub struct BUSINFO { pub deviceID: u16_, pub vendorID: u16_, @@ -466,7 +466,7 @@ pub struct BUSINFO { pub __bindgen_padding_0: u8, } #[repr(C)] -#[derive(Debug, Default, Copy, Clone, Zeroable)] +#[derive(Debug, Default, Copy, Clone, MaybeZeroable)] pub struct GSP_VF_INFO { pub totalVFs: u32_, pub firstVFOffset: u32_, @@ -479,25 +479,25 @@ pub struct GSP_VF_INFO { pub __bindgen_padding_0: [u8; 5usize], } #[repr(C)] -#[derive(Debug, Default, Copy, Clone, Zeroable)] +#[derive(Debug, Default, Copy, Clone, MaybeZeroable)] pub struct GSP_PCIE_CONFIG_REG { pub linkCap: u32_, } #[repr(C)] -#[derive(Debug, Default, Copy, Clone)] +#[derive(Debug, Default, Copy, Clone, MaybeZeroable)] pub struct EcidManufacturingInfo { pub ecidLow: u32_, pub ecidHigh: u32_, pub ecidExtended: u32_, } #[repr(C)] -#[derive(Debug, Default, Copy, Clone)] +#[derive(Debug, Default, Copy, Clone, MaybeZeroable)] pub struct FW_WPR_LAYOUT_OFFSET { pub nonWprHeapOffset: u64_, pub frtsOffset: u64_, } #[repr(C)] -#[derive(Debug, Copy, Clone)] +#[derive(Debug, Copy, Clone, MaybeZeroable)] pub struct GspStaticConfigInfo_t { pub grCapsBits: [u8_; 23usize], pub __bindgen_padding_0: u8, @@ -570,7 +570,7 @@ impl Default for GspStaticConfigInfo_t { } } #[repr(C)] -#[derive(Debug, Default, Copy, Clone, Zeroable)] +#[derive(Debug, Default, Copy, Clone, MaybeZeroable)] pub struct GspSystemInfo { pub gpuPhysAddr: u64_, pub gpuPhysFbAddr: u64_, @@ -627,7 +627,7 @@ pub struct GspSystemInfo { pub hostPageSize: u64_, } #[repr(C)] -#[derive(Debug, Default, Copy, Clone, Zeroable)] +#[derive(Debug, Default, Copy, Clone, MaybeZeroable)] pub struct MESSAGE_QUEUE_INIT_ARGUMENTS { pub sharedMemPhysAddr: u64_, pub pageTableEntryCount: u32_, @@ -636,7 +636,7 @@ pub struct MESSAGE_QUEUE_INIT_ARGUMENTS { pub statQueueOffset: u64_, } #[repr(C)] -#[derive(Debug, Default, Copy, Clone, Zeroable)] +#[derive(Debug, Default, Copy, Clone, MaybeZeroable)] pub struct GSP_SR_INIT_ARGUMENTS { pub oldLevel: u32_, pub flags: u32_, @@ -644,7 +644,7 @@ pub struct GSP_SR_INIT_ARGUMENTS { pub __bindgen_padding_0: [u8; 3usize], } #[repr(C)] -#[derive(Debug, Default, Copy, Clone, Zeroable)] +#[derive(Debug, Default, Copy, Clone, MaybeZeroable)] pub struct GSP_ARGUMENTS_CACHED { pub messageQueueInitArguments: MESSAGE_QUEUE_INIT_ARGUMENTS, pub srInitArguments: GSP_SR_INIT_ARGUMENTS, @@ -654,13 +654,13 @@ pub struct GSP_ARGUMENTS_CACHED { pub profilerArgs: GSP_ARGUMENTS_CACHED__bindgen_ty_1, } #[repr(C)] -#[derive(Debug, Default, Copy, Clone, Zeroable)] +#[derive(Debug, Default, Copy, Clone, MaybeZeroable)] pub struct GSP_ARGUMENTS_CACHED__bindgen_ty_1 { pub pa: u64_, pub size: u64_, } #[repr(C)] -#[derive(Copy, Clone, Zeroable)] +#[derive(Copy, Clone, MaybeZeroable)] pub union rpc_message_rpc_union_field_v03_00 { pub spare: u32_, pub cpuRmGfid: u32_, @@ -676,6 +676,7 @@ impl Default for rpc_message_rpc_union_field_v03_00 { } pub type rpc_message_rpc_union_field_v = rpc_message_rpc_union_field_v03_00; #[repr(C)] +#[derive(MaybeZeroable)] pub struct rpc_message_header_v03_00 { pub header_version: u32_, pub signature: u32_, @@ -698,7 +699,7 @@ impl Default for rpc_message_header_v03_00 { } pub type rpc_message_header_v = rpc_message_header_v03_00; #[repr(C)] -#[derive(Copy, Clone, Zeroable)] +#[derive(Copy, Clone, MaybeZeroable)] pub struct GspFwWprMeta { pub magic: u64_, pub revision: u64_, @@ -733,19 +734,19 @@ pub struct GspFwWprMeta { pub verified: u64_, } #[repr(C)] -#[derive(Copy, Clone, Zeroable)] +#[derive(Copy, Clone, MaybeZeroable)] pub union GspFwWprMeta__bindgen_ty_1 { pub __bindgen_anon_1: GspFwWprMeta__bindgen_ty_1__bindgen_ty_1, pub __bindgen_anon_2: GspFwWprMeta__bindgen_ty_1__bindgen_ty_2, } #[repr(C)] -#[derive(Debug, Default, Copy, Clone, Zeroable)] +#[derive(Debug, Default, Copy, Clone, MaybeZeroable)] pub struct GspFwWprMeta__bindgen_ty_1__bindgen_ty_1 { pub sysmemAddrOfSignature: u64_, pub sizeOfSignature: u64_, } #[repr(C)] -#[derive(Debug, Default, Copy, Clone, Zeroable)] +#[derive(Debug, Default, Copy, Clone, MaybeZeroable)] pub struct GspFwWprMeta__bindgen_ty_1__bindgen_ty_2 { pub gspFwHeapFreeListWprOffset: u32_, pub unused0: u32_, @@ -761,13 +762,13 @@ impl Default for GspFwWprMeta__bindgen_ty_1 { } } #[repr(C)] -#[derive(Copy, Clone, Zeroable)] +#[derive(Copy, Clone, MaybeZeroable)] pub union GspFwWprMeta__bindgen_ty_2 { pub __bindgen_anon_1: GspFwWprMeta__bindgen_ty_2__bindgen_ty_1, pub __bindgen_anon_2: GspFwWprMeta__bindgen_ty_2__bindgen_ty_2, } #[repr(C)] -#[derive(Debug, Default, Copy, Clone, Zeroable)] +#[derive(Debug, Default, Copy, Clone, MaybeZeroable)] pub struct GspFwWprMeta__bindgen_ty_2__bindgen_ty_1 { pub partitionRpcAddr: u64_, pub partitionRpcRequestOffset: u16_, @@ -779,7 +780,7 @@ pub struct GspFwWprMeta__bindgen_ty_2__bindgen_ty_1 { pub lsUcodeVersion: u32_, } #[repr(C)] -#[derive(Debug, Default, Copy, Clone, Zeroable)] +#[derive(Debug, Default, Copy, Clone, MaybeZeroable)] pub struct GspFwWprMeta__bindgen_ty_2__bindgen_ty_2 { pub partitionRpcPadding: [u32_; 4usize], pub sysmemAddrOfCrashReportQueue: u64_, @@ -814,7 +815,7 @@ pub const LibosMemoryRegionLoc_LIBOS_MEMORY_REGION_LOC_SYSMEM: LibosMemoryRegion pub const LibosMemoryRegionLoc_LIBOS_MEMORY_REGION_LOC_FB: LibosMemoryRegionLoc = 2; pub type LibosMemoryRegionLoc = ffi::c_uint; #[repr(C)] -#[derive(Debug, Default, Copy, Clone, Zeroable)] +#[derive(Debug, Default, Copy, Clone, MaybeZeroable)] pub struct LibosMemoryRegionInitArgument { pub id8: LibosAddress, pub pa: LibosAddress, @@ -824,7 +825,7 @@ pub struct LibosMemoryRegionInitArgument { pub __bindgen_padding_0: [u8; 6usize], } #[repr(C)] -#[derive(Debug, Default, Copy, Clone)] +#[derive(Debug, Default, Copy, Clone, MaybeZeroable)] pub struct PACKED_REGISTRY_ENTRY { pub nameOffset: u32_, pub type_: u8_, @@ -833,14 +834,14 @@ pub struct PACKED_REGISTRY_ENTRY { pub length: u32_, } #[repr(C)] -#[derive(Debug, Default)] +#[derive(Debug, Default, MaybeZeroable)] pub struct PACKED_REGISTRY_TABLE { pub size: u32_, pub numEntries: u32_, pub entries: __IncompleteArrayField, } #[repr(C)] -#[derive(Debug, Default, Copy, Clone, Zeroable)] +#[derive(Debug, Default, Copy, Clone, MaybeZeroable)] pub struct msgqTxHeader { pub version: u32_, pub size: u32_, @@ -852,13 +853,13 @@ pub struct msgqTxHeader { pub entryOff: u32_, } #[repr(C)] -#[derive(Debug, Default, Copy, Clone, Zeroable)] +#[derive(Debug, Default, Copy, Clone, MaybeZeroable)] pub struct msgqRxHeader { pub readPtr: u32_, } #[repr(C)] #[repr(align(8))] -#[derive(Zeroable)] +#[derive(MaybeZeroable)] pub struct GSP_MSG_QUEUE_ELEMENT { pub authTagBuffer: [u8_; 16usize], pub aadBuffer: [u8_; 16usize], @@ -878,7 +879,7 @@ impl Default for GSP_MSG_QUEUE_ELEMENT { } } #[repr(C)] -#[derive(Debug, Default)] +#[derive(Debug, Default, MaybeZeroable)] pub struct rpc_run_cpu_sequencer_v17_00 { pub bufferSizeDWord: u32_, pub cmdIndex: u32_, @@ -896,20 +897,20 @@ pub const GSP_SEQ_BUF_OPCODE_GSP_SEQ_BUF_OPCODE_CORE_WAIT_FOR_HALT: GSP_SEQ_BUF_ pub const GSP_SEQ_BUF_OPCODE_GSP_SEQ_BUF_OPCODE_CORE_RESUME: GSP_SEQ_BUF_OPCODE = 8; pub type GSP_SEQ_BUF_OPCODE = ffi::c_uint; #[repr(C)] -#[derive(Debug, Default, Copy, Clone)] +#[derive(Debug, Default, Copy, Clone, MaybeZeroable)] pub struct GSP_SEQ_BUF_PAYLOAD_REG_WRITE { pub addr: u32_, pub val: u32_, } #[repr(C)] -#[derive(Debug, Default, Copy, Clone)] +#[derive(Debug, Default, Copy, Clone, MaybeZeroable)] pub struct GSP_SEQ_BUF_PAYLOAD_REG_MODIFY { pub addr: u32_, pub mask: u32_, pub val: u32_, } #[repr(C)] -#[derive(Debug, Default, Copy, Clone)] +#[derive(Debug, Default, Copy, Clone, MaybeZeroable)] pub struct GSP_SEQ_BUF_PAYLOAD_REG_POLL { pub addr: u32_, pub mask: u32_, @@ -918,24 +919,24 @@ pub struct GSP_SEQ_BUF_PAYLOAD_REG_POLL { pub error: u32_, } #[repr(C)] -#[derive(Debug, Default, Copy, Clone)] +#[derive(Debug, Default, Copy, Clone, MaybeZeroable)] pub struct GSP_SEQ_BUF_PAYLOAD_DELAY_US { pub val: u32_, } #[repr(C)] -#[derive(Debug, Default, Copy, Clone)] +#[derive(Debug, Default, Copy, Clone, MaybeZeroable)] pub struct GSP_SEQ_BUF_PAYLOAD_REG_STORE { pub addr: u32_, pub index: u32_, } #[repr(C)] -#[derive(Copy, Clone)] +#[derive(Copy, Clone, MaybeZeroable)] pub struct GSP_SEQUENCER_BUFFER_CMD { pub opCode: GSP_SEQ_BUF_OPCODE, pub payload: GSP_SEQUENCER_BUFFER_CMD__bindgen_ty_1, } #[repr(C)] -#[derive(Copy, Clone)] +#[derive(Copy, Clone, MaybeZeroable)] pub union GSP_SEQUENCER_BUFFER_CMD__bindgen_ty_1 { pub regWrite: GSP_SEQ_BUF_PAYLOAD_REG_WRITE, pub regModify: GSP_SEQ_BUF_PAYLOAD_REG_MODIFY, From cd97c5d6b1923ccdf7617ea0aa12ac31ad1e3c46 Mon Sep 17 00:00:00 2001 From: Alexandre Courbot Date: Tue, 16 Dec 2025 11:57:10 +0900 Subject: [PATCH 0814/1321] gpu: nova-core: gsp: replace firmware version with "bindings" alias We have an "bindings" alias to avoid having to mention the firmware version again and again, and limit the diff when upgrading the firmware. Use it where we neglected to. Fixes: eaf0989c77e4 ("gpu: nova-core: Add bindings required by GSP sequencer") Reviewed-by: Lyude Paul Reviewed-by: Joel Fernandes Acked-by: Danilo Krummrich Link: https://patch.msgid.link/20251216-nova-fixes-v3-4-c7469a71f7c4@nvidia.com Signed-off-by: Alexandre Courbot --- drivers/gpu/nova-core/gsp/fw.rs | 58 ++++++++++++++++----------------- 1 file changed, 29 insertions(+), 29 deletions(-) diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs index b754631d2d8d2..caeb0d251fe5f 100644 --- a/drivers/gpu/nova-core/gsp/fw.rs +++ b/drivers/gpu/nova-core/gsp/fw.rs @@ -141,8 +141,8 @@ unsafe impl AsBytes for GspFwWprMeta {} // are valid. unsafe impl FromBytes for GspFwWprMeta {} -type GspFwWprMetaBootResumeInfo = r570_144::GspFwWprMeta__bindgen_ty_1; -type GspFwWprMetaBootInfo = r570_144::GspFwWprMeta__bindgen_ty_1__bindgen_ty_1; +type GspFwWprMetaBootResumeInfo = bindings::GspFwWprMeta__bindgen_ty_1; +type GspFwWprMetaBootInfo = bindings::GspFwWprMeta__bindgen_ty_1__bindgen_ty_1; impl GspFwWprMeta { /// Fill in and return a `GspFwWprMeta` suitable for booting `gsp_firmware` using the @@ -150,8 +150,8 @@ impl GspFwWprMeta { pub(crate) fn new(gsp_firmware: &GspFirmware, fb_layout: &FbLayout) -> Self { Self(bindings::GspFwWprMeta { // CAST: we want to store the bits of `GSP_FW_WPR_META_MAGIC` unmodified. - magic: r570_144::GSP_FW_WPR_META_MAGIC as u64, - revision: u64::from(r570_144::GSP_FW_WPR_META_REVISION), + magic: bindings::GSP_FW_WPR_META_MAGIC as u64, + revision: u64::from(bindings::GSP_FW_WPR_META_REVISION), sysmemAddrOfRadix3Elf: gsp_firmware.radix3_dma_handle(), sizeOfRadix3Elf: u64::from_safe_cast(gsp_firmware.size), sysmemAddrOfBootloader: gsp_firmware.bootloader.ucode.dma_handle(), @@ -315,19 +315,19 @@ impl From for u32 { #[repr(u32)] pub(crate) enum SeqBufOpcode { // Core operation opcodes - CoreReset = r570_144::GSP_SEQ_BUF_OPCODE_GSP_SEQ_BUF_OPCODE_CORE_RESET, - CoreResume = r570_144::GSP_SEQ_BUF_OPCODE_GSP_SEQ_BUF_OPCODE_CORE_RESUME, - CoreStart = r570_144::GSP_SEQ_BUF_OPCODE_GSP_SEQ_BUF_OPCODE_CORE_START, - CoreWaitForHalt = r570_144::GSP_SEQ_BUF_OPCODE_GSP_SEQ_BUF_OPCODE_CORE_WAIT_FOR_HALT, + CoreReset = bindings::GSP_SEQ_BUF_OPCODE_GSP_SEQ_BUF_OPCODE_CORE_RESET, + CoreResume = bindings::GSP_SEQ_BUF_OPCODE_GSP_SEQ_BUF_OPCODE_CORE_RESUME, + CoreStart = bindings::GSP_SEQ_BUF_OPCODE_GSP_SEQ_BUF_OPCODE_CORE_START, + CoreWaitForHalt = bindings::GSP_SEQ_BUF_OPCODE_GSP_SEQ_BUF_OPCODE_CORE_WAIT_FOR_HALT, // Delay opcode - DelayUs = r570_144::GSP_SEQ_BUF_OPCODE_GSP_SEQ_BUF_OPCODE_DELAY_US, + DelayUs = bindings::GSP_SEQ_BUF_OPCODE_GSP_SEQ_BUF_OPCODE_DELAY_US, // Register operation opcodes - RegModify = r570_144::GSP_SEQ_BUF_OPCODE_GSP_SEQ_BUF_OPCODE_REG_MODIFY, - RegPoll = r570_144::GSP_SEQ_BUF_OPCODE_GSP_SEQ_BUF_OPCODE_REG_POLL, - RegStore = r570_144::GSP_SEQ_BUF_OPCODE_GSP_SEQ_BUF_OPCODE_REG_STORE, - RegWrite = r570_144::GSP_SEQ_BUF_OPCODE_GSP_SEQ_BUF_OPCODE_REG_WRITE, + RegModify = bindings::GSP_SEQ_BUF_OPCODE_GSP_SEQ_BUF_OPCODE_REG_MODIFY, + RegPoll = bindings::GSP_SEQ_BUF_OPCODE_GSP_SEQ_BUF_OPCODE_REG_POLL, + RegStore = bindings::GSP_SEQ_BUF_OPCODE_GSP_SEQ_BUF_OPCODE_REG_STORE, + RegWrite = bindings::GSP_SEQ_BUF_OPCODE_GSP_SEQ_BUF_OPCODE_REG_WRITE, } impl fmt::Display for SeqBufOpcode { @@ -351,25 +351,25 @@ impl TryFrom for SeqBufOpcode { fn try_from(value: u32) -> Result { match value { - r570_144::GSP_SEQ_BUF_OPCODE_GSP_SEQ_BUF_OPCODE_CORE_RESET => { + bindings::GSP_SEQ_BUF_OPCODE_GSP_SEQ_BUF_OPCODE_CORE_RESET => { Ok(SeqBufOpcode::CoreReset) } - r570_144::GSP_SEQ_BUF_OPCODE_GSP_SEQ_BUF_OPCODE_CORE_RESUME => { + bindings::GSP_SEQ_BUF_OPCODE_GSP_SEQ_BUF_OPCODE_CORE_RESUME => { Ok(SeqBufOpcode::CoreResume) } - r570_144::GSP_SEQ_BUF_OPCODE_GSP_SEQ_BUF_OPCODE_CORE_START => { + bindings::GSP_SEQ_BUF_OPCODE_GSP_SEQ_BUF_OPCODE_CORE_START => { Ok(SeqBufOpcode::CoreStart) } - r570_144::GSP_SEQ_BUF_OPCODE_GSP_SEQ_BUF_OPCODE_CORE_WAIT_FOR_HALT => { + bindings::GSP_SEQ_BUF_OPCODE_GSP_SEQ_BUF_OPCODE_CORE_WAIT_FOR_HALT => { Ok(SeqBufOpcode::CoreWaitForHalt) } - r570_144::GSP_SEQ_BUF_OPCODE_GSP_SEQ_BUF_OPCODE_DELAY_US => Ok(SeqBufOpcode::DelayUs), - r570_144::GSP_SEQ_BUF_OPCODE_GSP_SEQ_BUF_OPCODE_REG_MODIFY => { + bindings::GSP_SEQ_BUF_OPCODE_GSP_SEQ_BUF_OPCODE_DELAY_US => Ok(SeqBufOpcode::DelayUs), + bindings::GSP_SEQ_BUF_OPCODE_GSP_SEQ_BUF_OPCODE_REG_MODIFY => { Ok(SeqBufOpcode::RegModify) } - r570_144::GSP_SEQ_BUF_OPCODE_GSP_SEQ_BUF_OPCODE_REG_POLL => Ok(SeqBufOpcode::RegPoll), - r570_144::GSP_SEQ_BUF_OPCODE_GSP_SEQ_BUF_OPCODE_REG_STORE => Ok(SeqBufOpcode::RegStore), - r570_144::GSP_SEQ_BUF_OPCODE_GSP_SEQ_BUF_OPCODE_REG_WRITE => Ok(SeqBufOpcode::RegWrite), + bindings::GSP_SEQ_BUF_OPCODE_GSP_SEQ_BUF_OPCODE_REG_POLL => Ok(SeqBufOpcode::RegPoll), + bindings::GSP_SEQ_BUF_OPCODE_GSP_SEQ_BUF_OPCODE_REG_STORE => Ok(SeqBufOpcode::RegStore), + bindings::GSP_SEQ_BUF_OPCODE_GSP_SEQ_BUF_OPCODE_REG_WRITE => Ok(SeqBufOpcode::RegWrite), _ => Err(EINVAL), } } @@ -385,7 +385,7 @@ impl From for u32 { /// Wrapper for GSP sequencer register write payload. #[repr(transparent)] #[derive(Copy, Clone)] -pub(crate) struct RegWritePayload(r570_144::GSP_SEQ_BUF_PAYLOAD_REG_WRITE); +pub(crate) struct RegWritePayload(bindings::GSP_SEQ_BUF_PAYLOAD_REG_WRITE); impl RegWritePayload { /// Returns the register address. @@ -408,7 +408,7 @@ unsafe impl AsBytes for RegWritePayload {} /// Wrapper for GSP sequencer register modify payload. #[repr(transparent)] #[derive(Copy, Clone)] -pub(crate) struct RegModifyPayload(r570_144::GSP_SEQ_BUF_PAYLOAD_REG_MODIFY); +pub(crate) struct RegModifyPayload(bindings::GSP_SEQ_BUF_PAYLOAD_REG_MODIFY); impl RegModifyPayload { /// Returns the register address. @@ -436,7 +436,7 @@ unsafe impl AsBytes for RegModifyPayload {} /// Wrapper for GSP sequencer register poll payload. #[repr(transparent)] #[derive(Copy, Clone)] -pub(crate) struct RegPollPayload(r570_144::GSP_SEQ_BUF_PAYLOAD_REG_POLL); +pub(crate) struct RegPollPayload(bindings::GSP_SEQ_BUF_PAYLOAD_REG_POLL); impl RegPollPayload { /// Returns the register address. @@ -469,7 +469,7 @@ unsafe impl AsBytes for RegPollPayload {} /// Wrapper for GSP sequencer delay payload. #[repr(transparent)] #[derive(Copy, Clone)] -pub(crate) struct DelayUsPayload(r570_144::GSP_SEQ_BUF_PAYLOAD_DELAY_US); +pub(crate) struct DelayUsPayload(bindings::GSP_SEQ_BUF_PAYLOAD_DELAY_US); impl DelayUsPayload { /// Returns the delay value in microseconds. @@ -487,7 +487,7 @@ unsafe impl AsBytes for DelayUsPayload {} /// Wrapper for GSP sequencer register store payload. #[repr(transparent)] #[derive(Copy, Clone)] -pub(crate) struct RegStorePayload(r570_144::GSP_SEQ_BUF_PAYLOAD_REG_STORE); +pub(crate) struct RegStorePayload(bindings::GSP_SEQ_BUF_PAYLOAD_REG_STORE); impl RegStorePayload { /// Returns the register address. @@ -510,7 +510,7 @@ unsafe impl AsBytes for RegStorePayload {} /// Wrapper for GSP sequencer buffer command. #[repr(transparent)] -pub(crate) struct SequencerBufferCmd(r570_144::GSP_SEQUENCER_BUFFER_CMD); +pub(crate) struct SequencerBufferCmd(bindings::GSP_SEQUENCER_BUFFER_CMD); impl SequencerBufferCmd { /// Returns the opcode as a `SeqBufOpcode` enum, or error if invalid. @@ -612,7 +612,7 @@ unsafe impl AsBytes for SequencerBufferCmd {} /// Wrapper for GSP run CPU sequencer RPC. #[repr(transparent)] -pub(crate) struct RunCpuSequencer(r570_144::rpc_run_cpu_sequencer_v17_00); +pub(crate) struct RunCpuSequencer(bindings::rpc_run_cpu_sequencer_v17_00); impl RunCpuSequencer { /// Returns the command index. From c00cead508dc6a49c893bb792c7e6569f580fe78 Mon Sep 17 00:00:00 2001 From: Danilo Krummrich Date: Tue, 23 Dec 2025 12:59:47 +0100 Subject: [PATCH 0815/1321] MAINTAINERS: fix typo in TYR DRM driver entry Fix a missing ':' in the ARM MALI TYR DRM DRIVER entry, which does prevent script/get_maintainer.pl to properly work and pick up the corresponding maintainers. Fixes: cf4fd52e3236 ("rust: drm: Introduce the Tyr driver for Arm Mali GPUs") Reported-by: Tamir Duberstein Closes: https://lore.kernel.org/lkml/CAJ-ks9mrZtnPUjp5tD03hW+TyS0M9i-KRF_ramNY-oh-0X+ayA@mail.gmail.com/ Signed-off-by: Danilo Krummrich Reviewed-by: Tamir Duberstein Link: https://patch.msgid.link/20251223115949.32531-1-dakr@kernel.org Signed-off-by: Alice Ryhl --- MAINTAINERS | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index a4d8281a05978..28370a81b61eb 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -2159,7 +2159,7 @@ M: Alice Ryhl L: dri-devel@lists.freedesktop.org S: Supported W: https://rust-for-linux.com/tyr-gpu-driver -W https://drm.pages.freedesktop.org/maintainer-tools/drm-rust.html +W: https://drm.pages.freedesktop.org/maintainer-tools/drm-rust.html B: https://gitlab.freedesktop.org/panfrost/linux/-/issues T: git https://gitlab.freedesktop.org/drm/rust/kernel.git F: Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml From ebab6d945c07b9e01743c1258c1566e4dcf5cd89 Mon Sep 17 00:00:00 2001 From: Danilo Krummrich Date: Tue, 23 Dec 2025 13:04:34 +0100 Subject: [PATCH 0816/1321] MAINTAINERS: exclude the tyr driver from DRM MISC The ARM MALI TYR DRM DRIVER is already maintained through the drm-rust tree, hence exclude it from drm-misc. Fixes: cf4fd52e3236 ("rust: drm: Introduce the Tyr driver for Arm Mali GPUs") Signed-off-by: Danilo Krummrich Link: https://patch.msgid.link/20251223120436.33233-1-dakr@kernel.org Signed-off-by: Alice Ryhl --- MAINTAINERS | 1 + 1 file changed, 1 insertion(+) diff --git a/MAINTAINERS b/MAINTAINERS index 28370a81b61eb..d4afbc1c32a7d 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -8358,6 +8358,7 @@ X: drivers/gpu/drm/msm/ X: drivers/gpu/drm/nova/ X: drivers/gpu/drm/radeon/ X: drivers/gpu/drm/tegra/ +X: drivers/gpu/drm/tyr/ X: drivers/gpu/drm/xe/ DRM DRIVERS AND COMMON INFRASTRUCTURE [RUST] From 9c78b1aeb6a20d08430cc37b9ad0e2f5fd9edba3 Mon Sep 17 00:00:00 2001 From: Marco Crivellari Date: Tue, 4 Nov 2025 12:29:23 +0100 Subject: [PATCH 0817/1321] drm/exynos: hdmi: replace use of system_wq with system_percpu_wq Currently if a user enqueue a work item using schedule_delayed_work() the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to schedule_work() that is using system_wq and queue_work(), that makes use again of WORK_CPU_UNBOUND. This lack of consistentcy cannot be addressed without refactoring the API. This patch continues the effort to refactor worqueue APIs, which has begun with the change introducing new workqueues and a new alloc_workqueue flag: commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq") commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag") system_wq should be the per-cpu workqueue, yet in this name nothing makes that clear, so replace system_wq with system_percpu_wq. The old wq (system_wq) will be kept for a few release cycles. Suggested-by: Tejun Heo Signed-off-by: Marco Crivellari Signed-off-by: Inki Dae --- drivers/gpu/drm/exynos/exynos_hdmi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/exynos/exynos_hdmi.c b/drivers/gpu/drm/exynos/exynos_hdmi.c index 01813e11e6c6c..8e76ac8ee4e24 100644 --- a/drivers/gpu/drm/exynos/exynos_hdmi.c +++ b/drivers/gpu/drm/exynos/exynos_hdmi.c @@ -1692,7 +1692,7 @@ static irqreturn_t hdmi_irq_thread(int irq, void *arg) { struct hdmi_context *hdata = arg; - mod_delayed_work(system_wq, &hdata->hotplug_work, + mod_delayed_work(system_percpu_wq, &hdata->hotplug_work, msecs_to_jiffies(HOTPLUG_DEBOUNCE_MS)); return IRQ_HANDLED; From cf6f5a5cb6781258957c359e2ab07b2c23e148cc Mon Sep 17 00:00:00 2001 From: Miaoqian Lin Date: Thu, 11 Dec 2025 16:33:44 +0400 Subject: [PATCH 0818/1321] drm/pl111: Fix error handling in pl111_amba_probe Jump to the existing dev_put label when devm_request_irq() fails so drm_dev_put() and of_reserved_mem_device_release() run instead of returning early and leaking resources. Found via static analysis and code review. Fixes: bed41005e617 ("drm/pl111: Initial drm/kms driver for pl111") Cc: stable@vger.kernel.org Signed-off-by: Miaoqian Lin Reviewed-by: Javier Martinez Canillas Signed-off-by: Linus Walleij Link: https://patch.msgid.link/20251211123345.2392065-1-linmq006@gmail.com --- drivers/gpu/drm/pl111/pl111_drv.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/pl111/pl111_drv.c b/drivers/gpu/drm/pl111/pl111_drv.c index 56ff6a3fb4834..d7dc83cf7b00a 100644 --- a/drivers/gpu/drm/pl111/pl111_drv.c +++ b/drivers/gpu/drm/pl111/pl111_drv.c @@ -295,7 +295,7 @@ static int pl111_amba_probe(struct amba_device *amba_dev, variant->name, priv); if (ret != 0) { dev_err(dev, "%s failed irq %d\n", __func__, ret); - return ret; + goto dev_put; } ret = pl111_modeset_init(drm); From 4379938baaf367e42cff764e7935916da452cad3 Mon Sep 17 00:00:00 2001 From: Tomi Valkeinen Date: Fri, 5 Dec 2025 11:51:48 +0200 Subject: [PATCH 0819/1321] Revert "drm/atomic-helper: Re-order bridge chain pre-enable and post-disable" This reverts commit c9b1150a68d9362a0827609fc0dc1664c0d8bfe1. Changing the enable/disable sequence has caused regressions on multiple platforms: R-Car, MCDE, Rockchip. A series (see link below) was sent to fix these, but it was decided that it's better to revert the original patch and change the enable/disable sequence only in the tidss driver. Reverting this commit breaks tidss's DSI and OLDI outputs, which will be fixed in the following commits. Signed-off-by: Tomi Valkeinen Link: https://lore.kernel.org/all/20251202-mcde-drm-regression-thirdfix-v6-0-f1bffd4ec0fa%40kernel.org/ Fixes: c9b1150a68d9 ("drm/atomic-helper: Re-order bridge chain pre-enable and post-disable") Cc: stable@vger.kernel.org # v6.17+ Reviewed-by: Aradhya Bhatia Reviewed-by: Maxime Ripard Reviewed-by: Linus Walleij Tested-by: Linus Walleij Signed-off-by: Linus Walleij Link: https://patch.msgid.link/20251205-drm-seq-fix-v1-1-fda68fa1b3de@ideasonboard.com --- drivers/gpu/drm/drm_atomic_helper.c | 8 +- include/drm/drm_bridge.h | 249 ++++++++-------------------- 2 files changed, 70 insertions(+), 187 deletions(-) diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c index 10adac9397cfb..ef97f37560b25 100644 --- a/drivers/gpu/drm/drm_atomic_helper.c +++ b/drivers/gpu/drm/drm_atomic_helper.c @@ -1341,9 +1341,9 @@ disable_outputs(struct drm_device *dev, struct drm_atomic_state *state) { encoder_bridge_disable(dev, state); - crtc_disable(dev, state); - encoder_bridge_post_disable(dev, state); + + crtc_disable(dev, state); } /** @@ -1682,10 +1682,10 @@ encoder_bridge_enable(struct drm_device *dev, struct drm_atomic_state *state) void drm_atomic_helper_commit_modeset_enables(struct drm_device *dev, struct drm_atomic_state *state) { - encoder_bridge_pre_enable(dev, state); - crtc_enable(dev, state); + encoder_bridge_pre_enable(dev, state); + encoder_bridge_enable(dev, state); drm_atomic_helper_commit_writebacks(dev, state); diff --git a/include/drm/drm_bridge.h b/include/drm/drm_bridge.h index 0ff7ab4aa8689..dbafe136833f8 100644 --- a/include/drm/drm_bridge.h +++ b/include/drm/drm_bridge.h @@ -176,33 +176,17 @@ struct drm_bridge_funcs { /** * @disable: * - * The @disable callback should disable the bridge. + * This callback should disable the bridge. It is called right before + * the preceding element in the display pipe is disabled. If the + * preceding element is a bridge this means it's called before that + * bridge's @disable vfunc. If the preceding element is a &drm_encoder + * it's called right before the &drm_encoder_helper_funcs.disable, + * &drm_encoder_helper_funcs.prepare or &drm_encoder_helper_funcs.dpms + * hook. * * The bridge can assume that the display pipe (i.e. clocks and timing * signals) feeding it is still running when this callback is called. * - * - * If the preceding element is a &drm_bridge, then this is called before - * that bridge is disabled via one of: - * - * - &drm_bridge_funcs.disable - * - &drm_bridge_funcs.atomic_disable - * - * If the preceding element of the bridge is a display controller, then - * this callback is called before the encoder is disabled via one of: - * - * - &drm_encoder_helper_funcs.atomic_disable - * - &drm_encoder_helper_funcs.prepare - * - &drm_encoder_helper_funcs.disable - * - &drm_encoder_helper_funcs.dpms - * - * and the CRTC is disabled via one of: - * - * - &drm_crtc_helper_funcs.prepare - * - &drm_crtc_helper_funcs.atomic_disable - * - &drm_crtc_helper_funcs.disable - * - &drm_crtc_helper_funcs.dpms. - * * The @disable callback is optional. * * NOTE: @@ -215,34 +199,17 @@ struct drm_bridge_funcs { /** * @post_disable: * - * The bridge must assume that the display pipe (i.e. clocks and timing - * signals) feeding this bridge is no longer running when the - * @post_disable is called. + * This callback should disable the bridge. It is called right after the + * preceding element in the display pipe is disabled. If the preceding + * element is a bridge this means it's called after that bridge's + * @post_disable function. If the preceding element is a &drm_encoder + * it's called right after the encoder's + * &drm_encoder_helper_funcs.disable, &drm_encoder_helper_funcs.prepare + * or &drm_encoder_helper_funcs.dpms hook. * - * This callback should perform all the actions required by the hardware - * after it has stopped receiving signals from the preceding element. - * - * If the preceding element is a &drm_bridge, then this is called after - * that bridge is post-disabled (unless marked otherwise by the - * @pre_enable_prev_first flag) via one of: - * - * - &drm_bridge_funcs.post_disable - * - &drm_bridge_funcs.atomic_post_disable - * - * If the preceding element of the bridge is a display controller, then - * this callback is called after the encoder is disabled via one of: - * - * - &drm_encoder_helper_funcs.atomic_disable - * - &drm_encoder_helper_funcs.prepare - * - &drm_encoder_helper_funcs.disable - * - &drm_encoder_helper_funcs.dpms - * - * and the CRTC is disabled via one of: - * - * - &drm_crtc_helper_funcs.prepare - * - &drm_crtc_helper_funcs.atomic_disable - * - &drm_crtc_helper_funcs.disable - * - &drm_crtc_helper_funcs.dpms + * The bridge must assume that the display pipe (i.e. clocks and timing + * signals) feeding it is no longer running when this callback is + * called. * * The @post_disable callback is optional. * @@ -285,30 +252,18 @@ struct drm_bridge_funcs { /** * @pre_enable: * - * The display pipe (i.e. clocks and timing signals) feeding this bridge - * will not yet be running when the @pre_enable is called. - * - * This callback should perform all the necessary actions to prepare the - * bridge to accept signals from the preceding element. - * - * If the preceding element is a &drm_bridge, then this is called before - * that bridge is pre-enabled (unless marked otherwise by - * @pre_enable_prev_first flag) via one of: - * - * - &drm_bridge_funcs.pre_enable - * - &drm_bridge_funcs.atomic_pre_enable - * - * If the preceding element of the bridge is a display controller, then - * this callback is called before the CRTC is enabled via one of: - * - * - &drm_crtc_helper_funcs.atomic_enable - * - &drm_crtc_helper_funcs.commit - * - * and the encoder is enabled via one of: + * This callback should enable the bridge. It is called right before + * the preceding element in the display pipe is enabled. If the + * preceding element is a bridge this means it's called before that + * bridge's @pre_enable function. If the preceding element is a + * &drm_encoder it's called right before the encoder's + * &drm_encoder_helper_funcs.enable, &drm_encoder_helper_funcs.commit or + * &drm_encoder_helper_funcs.dpms hook. * - * - &drm_encoder_helper_funcs.atomic_enable - * - &drm_encoder_helper_funcs.enable - * - &drm_encoder_helper_funcs.commit + * The display pipe (i.e. clocks and timing signals) feeding this bridge + * will not yet be running when this callback is called. The bridge must + * not enable the display link feeding the next bridge in the chain (if + * there is one) when this callback is called. * * The @pre_enable callback is optional. * @@ -322,31 +277,19 @@ struct drm_bridge_funcs { /** * @enable: * - * The @enable callback should enable the bridge. + * This callback should enable the bridge. It is called right after + * the preceding element in the display pipe is enabled. If the + * preceding element is a bridge this means it's called after that + * bridge's @enable function. If the preceding element is a + * &drm_encoder it's called right after the encoder's + * &drm_encoder_helper_funcs.enable, &drm_encoder_helper_funcs.commit or + * &drm_encoder_helper_funcs.dpms hook. * * The bridge can assume that the display pipe (i.e. clocks and timing * signals) feeding it is running when this callback is called. This * callback must enable the display link feeding the next bridge in the * chain if there is one. * - * If the preceding element is a &drm_bridge, then this is called after - * that bridge is enabled via one of: - * - * - &drm_bridge_funcs.enable - * - &drm_bridge_funcs.atomic_enable - * - * If the preceding element of the bridge is a display controller, then - * this callback is called after the CRTC is enabled via one of: - * - * - &drm_crtc_helper_funcs.atomic_enable - * - &drm_crtc_helper_funcs.commit - * - * and the encoder is enabled via one of: - * - * - &drm_encoder_helper_funcs.atomic_enable - * - &drm_encoder_helper_funcs.enable - * - drm_encoder_helper_funcs.commit - * * The @enable callback is optional. * * NOTE: @@ -359,30 +302,17 @@ struct drm_bridge_funcs { /** * @atomic_pre_enable: * - * The display pipe (i.e. clocks and timing signals) feeding this bridge - * will not yet be running when the @atomic_pre_enable is called. - * - * This callback should perform all the necessary actions to prepare the - * bridge to accept signals from the preceding element. - * - * If the preceding element is a &drm_bridge, then this is called before - * that bridge is pre-enabled (unless marked otherwise by - * @pre_enable_prev_first flag) via one of: - * - * - &drm_bridge_funcs.pre_enable - * - &drm_bridge_funcs.atomic_pre_enable + * This callback should enable the bridge. It is called right before + * the preceding element in the display pipe is enabled. If the + * preceding element is a bridge this means it's called before that + * bridge's @atomic_pre_enable or @pre_enable function. If the preceding + * element is a &drm_encoder it's called right before the encoder's + * &drm_encoder_helper_funcs.atomic_enable hook. * - * If the preceding element of the bridge is a display controller, then - * this callback is called before the CRTC is enabled via one of: - * - * - &drm_crtc_helper_funcs.atomic_enable - * - &drm_crtc_helper_funcs.commit - * - * and the encoder is enabled via one of: - * - * - &drm_encoder_helper_funcs.atomic_enable - * - &drm_encoder_helper_funcs.enable - * - &drm_encoder_helper_funcs.commit + * The display pipe (i.e. clocks and timing signals) feeding this bridge + * will not yet be running when this callback is called. The bridge must + * not enable the display link feeding the next bridge in the chain (if + * there is one) when this callback is called. * * The @atomic_pre_enable callback is optional. */ @@ -392,31 +322,18 @@ struct drm_bridge_funcs { /** * @atomic_enable: * - * The @atomic_enable callback should enable the bridge. + * This callback should enable the bridge. It is called right after + * the preceding element in the display pipe is enabled. If the + * preceding element is a bridge this means it's called after that + * bridge's @atomic_enable or @enable function. If the preceding element + * is a &drm_encoder it's called right after the encoder's + * &drm_encoder_helper_funcs.atomic_enable hook. * * The bridge can assume that the display pipe (i.e. clocks and timing * signals) feeding it is running when this callback is called. This * callback must enable the display link feeding the next bridge in the * chain if there is one. * - * If the preceding element is a &drm_bridge, then this is called after - * that bridge is enabled via one of: - * - * - &drm_bridge_funcs.enable - * - &drm_bridge_funcs.atomic_enable - * - * If the preceding element of the bridge is a display controller, then - * this callback is called after the CRTC is enabled via one of: - * - * - &drm_crtc_helper_funcs.atomic_enable - * - &drm_crtc_helper_funcs.commit - * - * and the encoder is enabled via one of: - * - * - &drm_encoder_helper_funcs.atomic_enable - * - &drm_encoder_helper_funcs.enable - * - drm_encoder_helper_funcs.commit - * * The @atomic_enable callback is optional. */ void (*atomic_enable)(struct drm_bridge *bridge, @@ -424,32 +341,16 @@ struct drm_bridge_funcs { /** * @atomic_disable: * - * The @atomic_disable callback should disable the bridge. + * This callback should disable the bridge. It is called right before + * the preceding element in the display pipe is disabled. If the + * preceding element is a bridge this means it's called before that + * bridge's @atomic_disable or @disable vfunc. If the preceding element + * is a &drm_encoder it's called right before the + * &drm_encoder_helper_funcs.atomic_disable hook. * * The bridge can assume that the display pipe (i.e. clocks and timing * signals) feeding it is still running when this callback is called. * - * If the preceding element is a &drm_bridge, then this is called before - * that bridge is disabled via one of: - * - * - &drm_bridge_funcs.disable - * - &drm_bridge_funcs.atomic_disable - * - * If the preceding element of the bridge is a display controller, then - * this callback is called before the encoder is disabled via one of: - * - * - &drm_encoder_helper_funcs.atomic_disable - * - &drm_encoder_helper_funcs.prepare - * - &drm_encoder_helper_funcs.disable - * - &drm_encoder_helper_funcs.dpms - * - * and the CRTC is disabled via one of: - * - * - &drm_crtc_helper_funcs.prepare - * - &drm_crtc_helper_funcs.atomic_disable - * - &drm_crtc_helper_funcs.disable - * - &drm_crtc_helper_funcs.dpms. - * * The @atomic_disable callback is optional. */ void (*atomic_disable)(struct drm_bridge *bridge, @@ -458,34 +359,16 @@ struct drm_bridge_funcs { /** * @atomic_post_disable: * - * The bridge must assume that the display pipe (i.e. clocks and timing - * signals) feeding this bridge is no longer running when the - * @atomic_post_disable is called. - * - * This callback should perform all the actions required by the hardware - * after it has stopped receiving signals from the preceding element. + * This callback should disable the bridge. It is called right after the + * preceding element in the display pipe is disabled. If the preceding + * element is a bridge this means it's called after that bridge's + * @atomic_post_disable or @post_disable function. If the preceding + * element is a &drm_encoder it's called right after the encoder's + * &drm_encoder_helper_funcs.atomic_disable hook. * - * If the preceding element is a &drm_bridge, then this is called after - * that bridge is post-disabled (unless marked otherwise by the - * @pre_enable_prev_first flag) via one of: - * - * - &drm_bridge_funcs.post_disable - * - &drm_bridge_funcs.atomic_post_disable - * - * If the preceding element of the bridge is a display controller, then - * this callback is called after the encoder is disabled via one of: - * - * - &drm_encoder_helper_funcs.atomic_disable - * - &drm_encoder_helper_funcs.prepare - * - &drm_encoder_helper_funcs.disable - * - &drm_encoder_helper_funcs.dpms - * - * and the CRTC is disabled via one of: - * - * - &drm_crtc_helper_funcs.prepare - * - &drm_crtc_helper_funcs.atomic_disable - * - &drm_crtc_helper_funcs.disable - * - &drm_crtc_helper_funcs.dpms + * The bridge must assume that the display pipe (i.e. clocks and timing + * signals) feeding it is no longer running when this callback is + * called. * * The @atomic_post_disable callback is optional. */ From 6b5bec0aee9a1c9f5e9d40c392eb5981fdbf8357 Mon Sep 17 00:00:00 2001 From: Tomi Valkeinen Date: Fri, 5 Dec 2025 11:51:49 +0200 Subject: [PATCH 0820/1321] Revert "drm/mediatek: dsi: Fix DSI host and panel bridge pre-enable order" This reverts commit f5b1819193667bf62c3c99d3921b9429997a14b2. As the original commit (c9b1150a68d9 ("drm/atomic-helper: Re-order bridge chain pre-enable and post-disable")) causing the issue has been reverted, let's revert the fix for mediatek. Signed-off-by: Tomi Valkeinen Cc: stable@vger.kernel.org # v6.17+ Fixes: c9b1150a68d9 ("drm/atomic-helper: Re-order bridge chain pre-enable and post-disable") Reviewed-by: Maxime Ripard Reviewed-by: Linus Walleij Tested-by: Linus Walleij Signed-off-by: Linus Walleij Link: https://patch.msgid.link/20251205-drm-seq-fix-v1-2-fda68fa1b3de@ideasonboard.com --- drivers/gpu/drm/mediatek/mtk_dsi.c | 6 ------ 1 file changed, 6 deletions(-) diff --git a/drivers/gpu/drm/mediatek/mtk_dsi.c b/drivers/gpu/drm/mediatek/mtk_dsi.c index 0e2bcd5f67b76..d7726091819c4 100644 --- a/drivers/gpu/drm/mediatek/mtk_dsi.c +++ b/drivers/gpu/drm/mediatek/mtk_dsi.c @@ -1002,12 +1002,6 @@ static int mtk_dsi_host_attach(struct mipi_dsi_host *host, return PTR_ERR(dsi->next_bridge); } - /* - * set flag to request the DSI host bridge be pre-enabled before device bridge - * in the chain, so the DSI host is ready when the device bridge is pre-enabled - */ - dsi->next_bridge->pre_enable_prev_first = true; - drm_bridge_add(&dsi->bridge); ret = component_add(host->dev, &mtk_dsi_component_ops); From 08f8ffa992ea22764b329e900196e1cd9929199a Mon Sep 17 00:00:00 2001 From: Linus Walleij Date: Fri, 5 Dec 2025 11:51:50 +0200 Subject: [PATCH 0821/1321] drm/atomic-helper: Export and namespace some functions Export and namespace those not prefixed with drm_* so it becomes possible to write custom commit tail functions in individual drivers using the helper infrastructure. Tested-by: Marek Vasut Reviewed-by: Maxime Ripard Signed-off-by: Tomi Valkeinen Cc: stable@vger.kernel.org # v6.17+ Fixes: c9b1150a68d9 ("drm/atomic-helper: Re-order bridge chain pre-enable and post-disable") Reviewed-by: Aradhya Bhatia Reviewed-by: Linus Walleij Tested-by: Linus Walleij Signed-off-by: Linus Walleij Link: https://patch.msgid.link/20251205-drm-seq-fix-v1-3-fda68fa1b3de@ideasonboard.com --- drivers/gpu/drm/drm_atomic_helper.c | 122 ++++++++++++++++++++++------ include/drm/drm_atomic_helper.h | 22 +++++ 2 files changed, 121 insertions(+), 23 deletions(-) diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c index ef97f37560b25..5beea645035f2 100644 --- a/drivers/gpu/drm/drm_atomic_helper.c +++ b/drivers/gpu/drm/drm_atomic_helper.c @@ -1162,8 +1162,18 @@ crtc_needs_disable(struct drm_crtc_state *old_state, new_state->self_refresh_active; } -static void -encoder_bridge_disable(struct drm_device *dev, struct drm_atomic_state *state) +/** + * drm_atomic_helper_commit_encoder_bridge_disable - disable bridges and encoder + * @dev: DRM device + * @state: the driver state object + * + * Loops over all connectors in the current state and if the CRTC needs + * it, disables the bridge chain all the way, then disables the encoder + * afterwards. + */ +void +drm_atomic_helper_commit_encoder_bridge_disable(struct drm_device *dev, + struct drm_atomic_state *state) { struct drm_connector *connector; struct drm_connector_state *old_conn_state, *new_conn_state; @@ -1229,9 +1239,18 @@ encoder_bridge_disable(struct drm_device *dev, struct drm_atomic_state *state) } } } +EXPORT_SYMBOL(drm_atomic_helper_commit_encoder_bridge_disable); -static void -crtc_disable(struct drm_device *dev, struct drm_atomic_state *state) +/** + * drm_atomic_helper_commit_crtc_disable - disable CRTSs + * @dev: DRM device + * @state: the driver state object + * + * Loops over all CRTCs in the current state and if the CRTC needs + * it, disables it. + */ +void +drm_atomic_helper_commit_crtc_disable(struct drm_device *dev, struct drm_atomic_state *state) { struct drm_crtc *crtc; struct drm_crtc_state *old_crtc_state, *new_crtc_state; @@ -1282,9 +1301,18 @@ crtc_disable(struct drm_device *dev, struct drm_atomic_state *state) drm_crtc_vblank_put(crtc); } } +EXPORT_SYMBOL(drm_atomic_helper_commit_crtc_disable); -static void -encoder_bridge_post_disable(struct drm_device *dev, struct drm_atomic_state *state) +/** + * drm_atomic_helper_commit_encoder_bridge_post_disable - post-disable encoder bridges + * @dev: DRM device + * @state: the driver state object + * + * Loops over all connectors in the current state and if the CRTC needs + * it, post-disables all encoder bridges. + */ +void +drm_atomic_helper_commit_encoder_bridge_post_disable(struct drm_device *dev, struct drm_atomic_state *state) { struct drm_connector *connector; struct drm_connector_state *old_conn_state, *new_conn_state; @@ -1335,15 +1363,16 @@ encoder_bridge_post_disable(struct drm_device *dev, struct drm_atomic_state *sta drm_bridge_put(bridge); } } +EXPORT_SYMBOL(drm_atomic_helper_commit_encoder_bridge_post_disable); static void disable_outputs(struct drm_device *dev, struct drm_atomic_state *state) { - encoder_bridge_disable(dev, state); + drm_atomic_helper_commit_encoder_bridge_disable(dev, state); - encoder_bridge_post_disable(dev, state); + drm_atomic_helper_commit_encoder_bridge_post_disable(dev, state); - crtc_disable(dev, state); + drm_atomic_helper_commit_crtc_disable(dev, state); } /** @@ -1446,8 +1475,17 @@ void drm_atomic_helper_calc_timestamping_constants(struct drm_atomic_state *stat } EXPORT_SYMBOL(drm_atomic_helper_calc_timestamping_constants); -static void -crtc_set_mode(struct drm_device *dev, struct drm_atomic_state *state) +/** + * drm_atomic_helper_commit_crtc_set_mode - set the new mode + * @dev: DRM device + * @state: the driver state object + * + * Loops over all connectors in the current state and if the mode has + * changed, change the mode of the CRTC, then call down the bridge + * chain and change the mode in all bridges as well. + */ +void +drm_atomic_helper_commit_crtc_set_mode(struct drm_device *dev, struct drm_atomic_state *state) { struct drm_crtc *crtc; struct drm_crtc_state *new_crtc_state; @@ -1508,6 +1546,7 @@ crtc_set_mode(struct drm_device *dev, struct drm_atomic_state *state) drm_bridge_put(bridge); } } +EXPORT_SYMBOL(drm_atomic_helper_commit_crtc_set_mode); /** * drm_atomic_helper_commit_modeset_disables - modeset commit to disable outputs @@ -1531,12 +1570,21 @@ void drm_atomic_helper_commit_modeset_disables(struct drm_device *dev, drm_atomic_helper_update_legacy_modeset_state(dev, state); drm_atomic_helper_calc_timestamping_constants(state); - crtc_set_mode(dev, state); + drm_atomic_helper_commit_crtc_set_mode(dev, state); } EXPORT_SYMBOL(drm_atomic_helper_commit_modeset_disables); -static void drm_atomic_helper_commit_writebacks(struct drm_device *dev, - struct drm_atomic_state *state) +/** + * drm_atomic_helper_commit_writebacks - issue writebacks + * @dev: DRM device + * @state: atomic state object being committed + * + * This loops over the connectors, checks if the new state requires + * a writeback job to be issued and in that case issues an atomic + * commit on each connector. + */ +void drm_atomic_helper_commit_writebacks(struct drm_device *dev, + struct drm_atomic_state *state) { struct drm_connector *connector; struct drm_connector_state *new_conn_state; @@ -1555,9 +1603,18 @@ static void drm_atomic_helper_commit_writebacks(struct drm_device *dev, } } } +EXPORT_SYMBOL(drm_atomic_helper_commit_writebacks); -static void -encoder_bridge_pre_enable(struct drm_device *dev, struct drm_atomic_state *state) +/** + * drm_atomic_helper_commit_encoder_bridge_pre_enable - pre-enable bridges + * @dev: DRM device + * @state: atomic state object being committed + * + * This loops over the connectors and if the CRTC needs it, pre-enables + * the entire bridge chain. + */ +void +drm_atomic_helper_commit_encoder_bridge_pre_enable(struct drm_device *dev, struct drm_atomic_state *state) { struct drm_connector *connector; struct drm_connector_state *new_conn_state; @@ -1588,9 +1645,18 @@ encoder_bridge_pre_enable(struct drm_device *dev, struct drm_atomic_state *state drm_bridge_put(bridge); } } +EXPORT_SYMBOL(drm_atomic_helper_commit_encoder_bridge_pre_enable); -static void -crtc_enable(struct drm_device *dev, struct drm_atomic_state *state) +/** + * drm_atomic_helper_commit_crtc_enable - enables the CRTCs + * @dev: DRM device + * @state: atomic state object being committed + * + * This loops over CRTCs in the new state, and of the CRTC needs + * it, enables it. + */ +void +drm_atomic_helper_commit_crtc_enable(struct drm_device *dev, struct drm_atomic_state *state) { struct drm_crtc *crtc; struct drm_crtc_state *old_crtc_state; @@ -1619,9 +1685,18 @@ crtc_enable(struct drm_device *dev, struct drm_atomic_state *state) } } } +EXPORT_SYMBOL(drm_atomic_helper_commit_crtc_enable); -static void -encoder_bridge_enable(struct drm_device *dev, struct drm_atomic_state *state) +/** + * drm_atomic_helper_commit_encoder_bridge_enable - enables the bridges + * @dev: DRM device + * @state: atomic state object being committed + * + * This loops over all connectors in the new state, and of the CRTC needs + * it, enables the entire bridge chain. + */ +void +drm_atomic_helper_commit_encoder_bridge_enable(struct drm_device *dev, struct drm_atomic_state *state) { struct drm_connector *connector; struct drm_connector_state *new_conn_state; @@ -1664,6 +1739,7 @@ encoder_bridge_enable(struct drm_device *dev, struct drm_atomic_state *state) drm_bridge_put(bridge); } } +EXPORT_SYMBOL(drm_atomic_helper_commit_encoder_bridge_enable); /** * drm_atomic_helper_commit_modeset_enables - modeset commit to enable outputs @@ -1682,11 +1758,11 @@ encoder_bridge_enable(struct drm_device *dev, struct drm_atomic_state *state) void drm_atomic_helper_commit_modeset_enables(struct drm_device *dev, struct drm_atomic_state *state) { - crtc_enable(dev, state); + drm_atomic_helper_commit_crtc_enable(dev, state); - encoder_bridge_pre_enable(dev, state); + drm_atomic_helper_commit_encoder_bridge_pre_enable(dev, state); - encoder_bridge_enable(dev, state); + drm_atomic_helper_commit_encoder_bridge_enable(dev, state); drm_atomic_helper_commit_writebacks(dev, state); } diff --git a/include/drm/drm_atomic_helper.h b/include/drm/drm_atomic_helper.h index 53382fe93537b..e154ee4f0696c 100644 --- a/include/drm/drm_atomic_helper.h +++ b/include/drm/drm_atomic_helper.h @@ -60,6 +60,12 @@ int drm_atomic_helper_check_plane_state(struct drm_plane_state *plane_state, int drm_atomic_helper_check_planes(struct drm_device *dev, struct drm_atomic_state *state); int drm_atomic_helper_check_crtc_primary_plane(struct drm_crtc_state *crtc_state); +void drm_atomic_helper_commit_encoder_bridge_disable(struct drm_device *dev, + struct drm_atomic_state *state); +void drm_atomic_helper_commit_crtc_disable(struct drm_device *dev, + struct drm_atomic_state *state); +void drm_atomic_helper_commit_encoder_bridge_post_disable(struct drm_device *dev, + struct drm_atomic_state *state); int drm_atomic_helper_check(struct drm_device *dev, struct drm_atomic_state *state); void drm_atomic_helper_commit_tail(struct drm_atomic_state *state); @@ -89,8 +95,24 @@ drm_atomic_helper_update_legacy_modeset_state(struct drm_device *dev, void drm_atomic_helper_calc_timestamping_constants(struct drm_atomic_state *state); +void drm_atomic_helper_commit_crtc_set_mode(struct drm_device *dev, + struct drm_atomic_state *state); + void drm_atomic_helper_commit_modeset_disables(struct drm_device *dev, struct drm_atomic_state *state); + +void drm_atomic_helper_commit_writebacks(struct drm_device *dev, + struct drm_atomic_state *state); + +void drm_atomic_helper_commit_encoder_bridge_pre_enable(struct drm_device *dev, + struct drm_atomic_state *state); + +void drm_atomic_helper_commit_crtc_enable(struct drm_device *dev, + struct drm_atomic_state *state); + +void drm_atomic_helper_commit_encoder_bridge_enable(struct drm_device *dev, + struct drm_atomic_state *state); + void drm_atomic_helper_commit_modeset_enables(struct drm_device *dev, struct drm_atomic_state *old_state); From b3892c272e5484a4ef157d9ec7d74934df111ba9 Mon Sep 17 00:00:00 2001 From: Tomi Valkeinen Date: Fri, 5 Dec 2025 11:51:51 +0200 Subject: [PATCH 0822/1321] drm/tidss: Fix enable/disable order TI's OLDI and DSI encoders need to be set up before the crtc is enabled, but the DRM helpers will enable the crtc first. This causes various issues on TI platforms, like visual artifacts or crtc sync lost warnings. Thus drm_atomic_helper_commit_modeset_enables() and drm_atomic_helper_commit_modeset_disables() cannot be used, as they enable the crtc before bridges' pre-enable, and disable the crtc after bridges' post-disable. Open code the drm_atomic_helper_commit_modeset_enables() and drm_atomic_helper_commit_modeset_disables(), and first call the bridges' pre-enables, then crtc enable, then bridges' post-enable (and vice versa for disable). Signed-off-by: Tomi Valkeinen Cc: stable@vger.kernel.org # v6.17+ Fixes: c9b1150a68d9 ("drm/atomic-helper: Re-order bridge chain pre-enable and post-disable") Reviewed-by: Aradhya Bhatia Reviewed-by: Maxime Ripard Reviewed-by: Linus Walleij Tested-by: Linus Walleij Signed-off-by: Linus Walleij Link: https://patch.msgid.link/20251205-drm-seq-fix-v1-4-fda68fa1b3de@ideasonboard.com --- drivers/gpu/drm/tidss/tidss_kms.c | 30 +++++++++++++++++++++++++++--- 1 file changed, 27 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/tidss/tidss_kms.c b/drivers/gpu/drm/tidss/tidss_kms.c index 86eb5d97410be..8bb93194e5ac6 100644 --- a/drivers/gpu/drm/tidss/tidss_kms.c +++ b/drivers/gpu/drm/tidss/tidss_kms.c @@ -26,9 +26,33 @@ static void tidss_atomic_commit_tail(struct drm_atomic_state *old_state) tidss_runtime_get(tidss); - drm_atomic_helper_commit_modeset_disables(ddev, old_state); - drm_atomic_helper_commit_planes(ddev, old_state, DRM_PLANE_COMMIT_ACTIVE_ONLY); - drm_atomic_helper_commit_modeset_enables(ddev, old_state); + /* + * TI's OLDI and DSI encoders need to be set up before the crtc is + * enabled. Thus drm_atomic_helper_commit_modeset_enables() and + * drm_atomic_helper_commit_modeset_disables() cannot be used here, as + * they enable the crtc before bridges' pre-enable, and disable the crtc + * after bridges' post-disable. + * + * Open code the functions here and first call the bridges' pre-enables, + * then crtc enable, then bridges' post-enable (and vice versa for + * disable). + */ + + drm_atomic_helper_commit_encoder_bridge_disable(ddev, old_state); + drm_atomic_helper_commit_crtc_disable(ddev, old_state); + drm_atomic_helper_commit_encoder_bridge_post_disable(ddev, old_state); + + drm_atomic_helper_update_legacy_modeset_state(ddev, old_state); + drm_atomic_helper_calc_timestamping_constants(old_state); + drm_atomic_helper_commit_crtc_set_mode(ddev, old_state); + + drm_atomic_helper_commit_planes(ddev, old_state, + DRM_PLANE_COMMIT_ACTIVE_ONLY); + + drm_atomic_helper_commit_encoder_bridge_pre_enable(ddev, old_state); + drm_atomic_helper_commit_crtc_enable(ddev, old_state); + drm_atomic_helper_commit_encoder_bridge_enable(ddev, old_state); + drm_atomic_helper_commit_writebacks(ddev, old_state); drm_atomic_helper_commit_hw_done(old_state); drm_atomic_helper_wait_for_flip_done(ddev, old_state); From c157be46035641be6d6eab87b35c6fca993a7c2c Mon Sep 17 00:00:00 2001 From: Dave Airlie Date: Fri, 2 Jan 2026 14:18:29 +1000 Subject: [PATCH 0823/1321] nouveau: don't attempt fwsec on sb on newer platforms. The changes to always loads fwsec sb causes problems on newer GPUs which don't use this path. Add hooks and pass through the device specific layers. Fixes: da67179e5538 ("drm/nouveau/gsp: Allocate fwsec-sb at boot") Cc: # v6.16+ Cc: Lyude Paul Cc: Timur Tabi Tested-by: Matthew Schwartz Tested-by: Christopher Snowhill Reviewed-by: Lyude Paul Signed-off-by: Dave Airlie Link: https://patch.msgid.link/20260102041829.2748009-1-airlied@gmail.com --- .../gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c | 3 +++ .../gpu/drm/nouveau/nvkm/subdev/gsp/fwsec.c | 8 +------ .../gpu/drm/nouveau/nvkm/subdev/gsp/ga100.c | 3 +++ .../gpu/drm/nouveau/nvkm/subdev/gsp/ga102.c | 3 +++ .../gpu/drm/nouveau/nvkm/subdev/gsp/priv.h | 23 +++++++++++++++++-- .../gpu/drm/nouveau/nvkm/subdev/gsp/tu102.c | 15 ++++++++++++ .../gpu/drm/nouveau/nvkm/subdev/gsp/tu116.c | 3 +++ 7 files changed, 49 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c index 35d1fcef520bf..c456a96268231 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c @@ -30,6 +30,9 @@ ad102_gsp = { .booter.ctor = ga102_gsp_booter_ctor, + .fwsec_sb.ctor = tu102_gsp_fwsec_sb_ctor, + .fwsec_sb.dtor = tu102_gsp_fwsec_sb_dtor, + .dtor = r535_gsp_dtor, .oneinit = tu102_gsp_oneinit, .init = tu102_gsp_init, diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/fwsec.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/fwsec.c index 5037602466604..851140e801220 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/fwsec.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/fwsec.c @@ -337,18 +337,12 @@ nvkm_gsp_fwsec_sb(struct nvkm_gsp *gsp) } int -nvkm_gsp_fwsec_sb_ctor(struct nvkm_gsp *gsp) +nvkm_gsp_fwsec_sb_init(struct nvkm_gsp *gsp) { return nvkm_gsp_fwsec_init(gsp, &gsp->fws.falcon.sb, "fwsec-sb", NVFW_FALCON_APPIF_DMEMMAPPER_CMD_SB); } -void -nvkm_gsp_fwsec_sb_dtor(struct nvkm_gsp *gsp) -{ - nvkm_falcon_fw_dtor(&gsp->fws.falcon.sb); -} - int nvkm_gsp_fwsec_frts(struct nvkm_gsp *gsp) { diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ga100.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ga100.c index d201e8697226b..27a13aeccd3cb 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ga100.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ga100.c @@ -47,6 +47,9 @@ ga100_gsp = { .booter.ctor = tu102_gsp_booter_ctor, + .fwsec_sb.ctor = tu102_gsp_fwsec_sb_ctor, + .fwsec_sb.dtor = tu102_gsp_fwsec_sb_dtor, + .dtor = r535_gsp_dtor, .oneinit = tu102_gsp_oneinit, .init = tu102_gsp_init, diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ga102.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ga102.c index 917f7e2f6c466..b6b3eb6f4c006 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ga102.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ga102.c @@ -158,6 +158,9 @@ ga102_gsp_r535 = { .booter.ctor = ga102_gsp_booter_ctor, + .fwsec_sb.ctor = tu102_gsp_fwsec_sb_ctor, + .fwsec_sb.dtor = tu102_gsp_fwsec_sb_dtor, + .dtor = r535_gsp_dtor, .oneinit = tu102_gsp_oneinit, .init = tu102_gsp_init, diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/priv.h b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/priv.h index 86bdd203bc107..9dd66a2e38017 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/priv.h +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/priv.h @@ -7,9 +7,8 @@ enum nvkm_acr_lsf_id; int nvkm_gsp_fwsec_frts(struct nvkm_gsp *); -int nvkm_gsp_fwsec_sb_ctor(struct nvkm_gsp *); int nvkm_gsp_fwsec_sb(struct nvkm_gsp *); -void nvkm_gsp_fwsec_sb_dtor(struct nvkm_gsp *); +int nvkm_gsp_fwsec_sb_init(struct nvkm_gsp *gsp); struct nvkm_gsp_fwif { int version; @@ -52,6 +51,11 @@ struct nvkm_gsp_func { struct nvkm_falcon *, struct nvkm_falcon_fw *); } booter; + struct { + int (*ctor)(struct nvkm_gsp *); + void (*dtor)(struct nvkm_gsp *); + } fwsec_sb; + void (*dtor)(struct nvkm_gsp *); int (*oneinit)(struct nvkm_gsp *); int (*init)(struct nvkm_gsp *); @@ -67,6 +71,8 @@ extern const struct nvkm_falcon_func tu102_gsp_flcn; extern const struct nvkm_falcon_fw_func tu102_gsp_fwsec; int tu102_gsp_booter_ctor(struct nvkm_gsp *, const char *, const struct firmware *, struct nvkm_falcon *, struct nvkm_falcon_fw *); +int tu102_gsp_fwsec_sb_ctor(struct nvkm_gsp *); +void tu102_gsp_fwsec_sb_dtor(struct nvkm_gsp *); int tu102_gsp_oneinit(struct nvkm_gsp *); int tu102_gsp_init(struct nvkm_gsp *); int tu102_gsp_fini(struct nvkm_gsp *, bool suspend); @@ -91,5 +97,18 @@ int r535_gsp_fini(struct nvkm_gsp *, bool suspend); int nvkm_gsp_new_(const struct nvkm_gsp_fwif *, struct nvkm_device *, enum nvkm_subdev_type, int, struct nvkm_gsp **); +static inline int nvkm_gsp_fwsec_sb_ctor(struct nvkm_gsp *gsp) +{ + if (gsp->func->fwsec_sb.ctor) + return gsp->func->fwsec_sb.ctor(gsp); + return 0; +} + +static inline void nvkm_gsp_fwsec_sb_dtor(struct nvkm_gsp *gsp) +{ + if (gsp->func->fwsec_sb.dtor) + gsp->func->fwsec_sb.dtor(gsp); +} + extern const struct nvkm_gsp_func gv100_gsp; #endif diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/tu102.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/tu102.c index 81e56da0474a1..04b642a1f7305 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/tu102.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/tu102.c @@ -30,6 +30,18 @@ #include #include +int +tu102_gsp_fwsec_sb_ctor(struct nvkm_gsp *gsp) +{ + return nvkm_gsp_fwsec_sb_init(gsp); +} + +void +tu102_gsp_fwsec_sb_dtor(struct nvkm_gsp *gsp) +{ + nvkm_falcon_fw_dtor(&gsp->fws.falcon.sb); +} + static int tu102_gsp_booter_unload(struct nvkm_gsp *gsp, u32 mbox0, u32 mbox1) { @@ -370,6 +382,9 @@ tu102_gsp = { .booter.ctor = tu102_gsp_booter_ctor, + .fwsec_sb.ctor = tu102_gsp_fwsec_sb_ctor, + .fwsec_sb.dtor = tu102_gsp_fwsec_sb_dtor, + .dtor = r535_gsp_dtor, .oneinit = tu102_gsp_oneinit, .init = tu102_gsp_init, diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/tu116.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/tu116.c index 97eb046c25d07..58cf258424218 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/tu116.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/tu116.c @@ -30,6 +30,9 @@ tu116_gsp = { .booter.ctor = tu102_gsp_booter_ctor, + .fwsec_sb.ctor = tu102_gsp_fwsec_sb_ctor, + .fwsec_sb.dtor = tu102_gsp_fwsec_sb_dtor, + .dtor = r535_gsp_dtor, .oneinit = tu102_gsp_oneinit, .init = tu102_gsp_init, From c42aca932a947e5c46f18597113f09aaa2953ddd Mon Sep 17 00:00:00 2001 From: "Mario Limonciello (AMD)" Date: Mon, 5 Jan 2026 22:46:38 -0600 Subject: [PATCH 0824/1321] PCI/VGA: Don't assume the only VGA device on a system is `boot_vga` Some systems ship with multiple display class devices but not all of them are VGA devices. If the "only" VGA device on the system is not used for displaying the image on the screen marking it as `boot_vga` because nothing was found is totally wrong. This behavior actually leads to mistakes of the wrong device being advertised to userspace and then userspace can make incorrect decisions. As there is an accurate `boot_display` sysfs file stop lying about `boot_vga` by assuming if nothing is found it's the right device. Reported-by: Aaron Erhardt Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220712 Tested-by: Aaron Erhardt Acked-by: Thomas Zimmermann Fixes: ad90860bd10ee ("fbcon: Use screen info to find primary device") Tested-by: Luke D. Jones Signed-off-by: Mario Limonciello (AMD) Signed-off-by: Thomas Zimmermann Link: https://patch.msgid.link/20260106044638.52906-1-superm1@kernel.org --- drivers/pci/vgaarb.c | 7 ------- 1 file changed, 7 deletions(-) diff --git a/drivers/pci/vgaarb.c b/drivers/pci/vgaarb.c index 436fa7f4c3873..baa242b140993 100644 --- a/drivers/pci/vgaarb.c +++ b/drivers/pci/vgaarb.c @@ -652,13 +652,6 @@ static bool vga_is_boot_device(struct vga_device *vgadev) return true; } - /* - * Vgadev has neither IO nor MEM enabled. If we haven't found any - * other VGA devices, it is the best candidate so far. - */ - if (!boot_vga) - return true; - return false; } From 5e9adff2cb0f8a7a06c7a15956f244945a1b33f7 Mon Sep 17 00:00:00 2001 From: Chengjun Yao Date: Mon, 15 Dec 2025 16:18:21 +0800 Subject: [PATCH 0825/1321] drm/fb-helper: Fix vblank timeout during suspend/reset During GPU reset, VBlank interrupts are disabled which causes drm_fb_helper_fb_dirty() to wait for VBlank timeout. This will create call traces like (seen on an RX7900 series dGPU): [ 101.313646] ------------[ cut here ]------------ [ 101.313648] amdgpu 0000:03:00.0: [drm] vblank wait timed out on crtc 0 [ 101.313657] WARNING: CPU: 0 PID: 461 at drivers/gpu/drm/drm_vblank.c:1320 drm_wait_one_vblank+0x176/0x220 [ 101.313663] Modules linked in: amdgpu amdxcp drm_panel_backlight_quirks gpu_sched drm_buddy drm_ttm_helper ttm drm_exec drm_suballoc_helper drm_display_helper cec rc_core i2c_algo_bit nf_conntrack_netlink xt_nat xt_tcpudp veth xt_conntrack xt_MASQUERADE bridge stp llc xfrm_user xfrm_algo xt_set ip_set nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_addrtype nft_compat x_tables nf_tables overlay qrtr sunrpc snd_hda_codec_alc882 snd_hda_codec_realtek_lib snd_hda_codec_generic snd_hda_codec_atihdmi snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hda_core snd_intel_dspcfg snd_intel_sdw_acpi snd_hwdep snd_pcm amd_atl intel_rapl_msr snd_seq_midi intel_rapl_common asus_ec_sensors snd_seq_midi_event snd_rawmidi snd_seq eeepc_wmi snd_seq_device edac_mce_amd asus_wmi polyval_clmulni ghash_clmulni_intel snd_timer platform_profile aesni_intel wmi_bmof sparse_keymap joydev snd rapl input_leds i2c_piix4 soundcore ccp k10temp i2c_smbus gpio_amdpt mac_hid binfmt_misc sch_fq_codel msr parport_pc ppdev lp parport [ 101.313745] efi_pstore nfnetlink dmi_sysfs autofs4 hid_generic usbhid hid r8169 realtek ahci libahci video wmi [ 101.313760] CPU: 0 UID: 0 PID: 461 Comm: kworker/0:2 Not tainted 6.18.0-rc6-174403b3b920 #1 PREEMPT(voluntary) [ 101.313763] Hardware name: ASUS System Product Name/TUF GAMING X670E-PLUS, BIOS 0821 11/15/2022 [ 101.313765] Workqueue: events drm_fb_helper_damage_work [ 101.313769] RIP: 0010:drm_wait_one_vblank+0x176/0x220 [ 101.313772] Code: 7c 24 08 4c 8b 77 50 4d 85 f6 0f 84 a1 00 00 00 e8 2f 11 03 00 44 89 e9 4c 89 f2 48 c7 c7 d0 ad 0d a8 48 89 c6 e8 2a e0 4a ff <0f> 0b e9 f2 fe ff ff 48 85 ff 74 04 4c 8b 67 08 4d 8b 6c 24 50 4d [ 101.313774] RSP: 0018:ffffc99c00d47d68 EFLAGS: 00010246 [ 101.313777] RAX: 0000000000000000 RBX: 000000000200038a RCX: 0000000000000000 [ 101.313778] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ 101.313779] RBP: ffffc99c00d47dc0 R08: 0000000000000000 R09: 0000000000000000 [ 101.313781] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8948c4280010 [ 101.313782] R13: 0000000000000000 R14: ffff894883263a50 R15: ffff89488c384830 [ 101.313784] FS: 0000000000000000(0000) GS:ffff895424692000(0000) knlGS:0000000000000000 [ 101.313785] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 101.313787] CR2: 00007773650ee200 CR3: 0000000588e40000 CR4: 0000000000f50ef0 [ 101.313788] PKRU: 55555554 [ 101.313790] Call Trace: [ 101.313791] [ 101.313795] ? __pfx_autoremove_wake_function+0x10/0x10 [ 101.313800] drm_crtc_wait_one_vblank+0x17/0x30 [ 101.313802] drm_client_modeset_wait_for_vblank+0x61/0x80 [ 101.313805] drm_fb_helper_damage_work+0x46/0x1a0 [ 101.313808] process_one_work+0x1a1/0x3f0 [ 101.313812] worker_thread+0x2ba/0x3d0 [ 101.313816] kthread+0x107/0x220 [ 101.313818] ? __pfx_worker_thread+0x10/0x10 [ 101.313821] ? __pfx_kthread+0x10/0x10 [ 101.313823] ret_from_fork+0x202/0x230 [ 101.313826] ? __pfx_kthread+0x10/0x10 [ 101.313828] ret_from_fork_asm+0x1a/0x30 [ 101.313834] [ 101.313835] ---[ end trace 0000000000000000 ]--- Cancel pending damage work synchronously before console_lock() to ensure any in-flight framebuffer damage operations complete before suspension. Also check for FBINFO_STATE_RUNNING in drm_fb_helper_damage_work() to avoid executing damage work if it is rescheduled while the device is suspended. Fixes: d8c4bddcd8bc ("drm/fb-helper: Synchronize dirty worker with vblank") Signed-off-by: Aurabindo Pillai Signed-off-by: Chengjun Yao Signed-off-by: Thomas Zimmermann Reviewed-by: Thomas Zimmermann Link: https://patch.msgid.link/20251215081822.432005-1-Chengjun.Yao@amd.com --- drivers/gpu/drm/drm_fb_helper.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c index 4a7f72044ab8f..4b47aa0dab35a 100644 --- a/drivers/gpu/drm/drm_fb_helper.c +++ b/drivers/gpu/drm/drm_fb_helper.c @@ -366,6 +366,9 @@ static void drm_fb_helper_damage_work(struct work_struct *work) { struct drm_fb_helper *helper = container_of(work, struct drm_fb_helper, damage_work); + if (helper->info->state != FBINFO_STATE_RUNNING) + return; + drm_fb_helper_fb_dirty(helper); } @@ -732,6 +735,13 @@ void drm_fb_helper_set_suspend_unlocked(struct drm_fb_helper *fb_helper, if (fb_helper->info->state != FBINFO_STATE_RUNNING) return; + /* + * Cancel pending damage work. During GPU reset, VBlank + * interrupts are disabled and drm_fb_helper_fb_dirty() + * would wait for VBlank timeout otherwise. + */ + cancel_work_sync(&fb_helper->damage_work); + console_lock(); } else { From a5bd3c2db58974e8eae2036b4070c40554f91a0e Mon Sep 17 00:00:00 2001 From: Nathan Chancellor Date: Sat, 13 Dec 2025 15:16:43 +0900 Subject: [PATCH 0826/1321] drm/amd/display: Apply e4479aecf658 to dml After an innocuous optimization change in clang-22, allmodconfig (which enables CONFIG_KASAN and CONFIG_WERROR) breaks with: drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:1724:6: error: stack frame size (3144) exceeds limit (3072) in 'dml32_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than] 1724 | void dml32_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib) | ^ With clang-21, this function was already pretty close to the existing limit of 3072 bytes. drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:1724:6: error: stack frame size (2904) exceeds limit (2048) in 'dml32_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than] 1724 | void dml32_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib) | ^ A similar situation occurred in dml2, which was resolved by commit e4479aecf658 ("drm/amd/display: Increase sanitizer frame larger than limit when compile testing with clang") by increasing the limit for clang when compile testing with certain sanitizer enabled, so that allmodconfig (an easy testing target) continues to work. Apply that same change to the dml folder to clear up the warning for allmodconfig, unbreaking the build. Closes: https://github.com/ClangBuiltLinux/linux/issues/2135 Signed-off-by: Nathan Chancellor Signed-off-by: Alex Deucher (cherry picked from commit 25314b453cf812150e9951a32007a32bba85707e) Cc: stable@vger.kernel.org --- drivers/gpu/drm/amd/display/dc/dml/Makefile | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/display/dc/dml/Makefile b/drivers/gpu/drm/amd/display/dc/dml/Makefile index b357683b4255a..268b5fbdb48bd 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/Makefile +++ b/drivers/gpu/drm/amd/display/dc/dml/Makefile @@ -30,7 +30,11 @@ dml_rcflags := $(CC_FLAGS_NO_FPU) ifneq ($(CONFIG_FRAME_WARN),0) ifeq ($(filter y,$(CONFIG_KASAN)$(CONFIG_KCSAN)),y) - frame_warn_limit := 3072 + ifeq ($(CONFIG_CC_IS_CLANG)$(CONFIG_COMPILE_TEST),yy) + frame_warn_limit := 4096 + else + frame_warn_limit := 3072 + endif else frame_warn_limit := 2048 endif From a4c0f7aac61732d3851c45edfc2e6dbf57dd1801 Mon Sep 17 00:00:00 2001 From: Nathan Chancellor Date: Sat, 13 Dec 2025 19:58:10 +0900 Subject: [PATCH 0827/1321] drm/amd/display: Reduce number of arguments of dcn30's CalculatePrefetchSchedule() After an innocuous optimization change in clang-22, dml30_ModeSupportAndSystemConfigurationFull() is over the 2048 byte stack limit for display_mode_vba_30.c. drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn30/display_mode_vba_30.c:3529:6: warning: stack frame size (2096) exceeds limit (2048) in 'dml30_ModeSupportAndSystemConfigurationFull' [-Wframe-larger-than] 3529 | void dml30_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib) | ^ With clang-21, this function was already close to the limit: drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn30/display_mode_vba_30.c:3529:6: warning: stack frame size (1912) exceeds limit (1586) in 'dml30_ModeSupportAndSystemConfigurationFull' [-Wframe-larger-than] 3529 | void dml30_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib) | ^ CalculatePrefetchSchedule() has a large number of parameters, which must be passed on the stack. Most of the parameters between the two callsites are the same, so they can be accessed through the existing mode_lib pointer, instead of being passed as explicit arguments. Doing this reduces the stack size of dml30_ModeSupportAndSystemConfigurationFull() from 2096 bytes to 1912 bytes with clang-22. Closes: https://github.com/ClangBuiltLinux/linux/issues/2117 Signed-off-by: Nathan Chancellor Signed-off-by: Alex Deucher (cherry picked from commit b20b3fc4210f83089f835cdb91deec4b0778761a) --- .../dc/dml/dcn30/display_mode_vba_30.c | 258 +++++------------- 1 file changed, 73 insertions(+), 185 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c b/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c index 8d24763938ea6..2d19bb8de59c8 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c @@ -77,32 +77,14 @@ static unsigned int dscceComputeDelay( static unsigned int dscComputeDelay( enum output_format_class pixelFormat, enum output_encoder_class Output); -// Super monster function with some 45 argument static bool CalculatePrefetchSchedule( struct display_mode_lib *mode_lib, - double PercentOfIdealDRAMFabricAndSDPPortBWReceivedAfterUrgLatencyPixelMixedWithVMData, - double PercentOfIdealDRAMFabricAndSDPPortBWReceivedAfterUrgLatencyVMDataOnly, + unsigned int k, Pipe *myPipe, unsigned int DSCDelay, - double DPPCLKDelaySubtotalPlusCNVCFormater, - double DPPCLKDelaySCL, - double DPPCLKDelaySCLLBOnly, - double DPPCLKDelayCNVCCursor, - double DISPCLKDelaySubtotal, unsigned int DPP_RECOUT_WIDTH, - enum output_format_class OutputFormat, - unsigned int MaxInterDCNTileRepeaters, unsigned int VStartup, unsigned int MaxVStartup, - unsigned int GPUVMPageTableLevels, - bool GPUVMEnable, - bool HostVMEnable, - unsigned int HostVMMaxNonCachedPageTableLevels, - double HostVMMinPageSize, - bool DynamicMetadataEnable, - bool DynamicMetadataVMEnabled, - int DynamicMetadataLinesBeforeActiveRequired, - unsigned int DynamicMetadataTransmittedBytes, double UrgentLatency, double UrgentExtraLatency, double TCalc, @@ -116,7 +98,6 @@ static bool CalculatePrefetchSchedule( unsigned int MaxNumSwathY, double PrefetchSourceLinesC, unsigned int SwathWidthC, - int BytePerPixelC, double VInitPreFillC, unsigned int MaxNumSwathC, long swath_width_luma_ub, @@ -124,9 +105,6 @@ static bool CalculatePrefetchSchedule( unsigned int SwathHeightY, unsigned int SwathHeightC, double TWait, - bool ProgressiveToInterlaceUnitInOPP, - double *DSTXAfterScaler, - double *DSTYAfterScaler, double *DestinationLinesForPrefetch, double *PrefetchBandwidth, double *DestinationLinesToRequestVMInVBlank, @@ -135,14 +113,7 @@ static bool CalculatePrefetchSchedule( double *VRatioPrefetchC, double *RequiredPrefetchPixDataBWLuma, double *RequiredPrefetchPixDataBWChroma, - bool *NotEnoughTimeForDynamicMetadata, - double *Tno_bw, - double *prefetch_vmrow_bw, - double *Tdmdl_vm, - double *Tdmdl, - unsigned int *VUpdateOffsetPix, - double *VUpdateWidthPix, - double *VReadyOffsetPix); + bool *NotEnoughTimeForDynamicMetadata); static double RoundToDFSGranularityUp(double Clock, double VCOSpeed); static double RoundToDFSGranularityDown(double Clock, double VCOSpeed); static void CalculateDCCConfiguration( @@ -810,29 +781,12 @@ static unsigned int dscComputeDelay(enum output_format_class pixelFormat, enum o static bool CalculatePrefetchSchedule( struct display_mode_lib *mode_lib, - double PercentOfIdealDRAMFabricAndSDPPortBWReceivedAfterUrgLatencyPixelMixedWithVMData, - double PercentOfIdealDRAMFabricAndSDPPortBWReceivedAfterUrgLatencyVMDataOnly, + unsigned int k, Pipe *myPipe, unsigned int DSCDelay, - double DPPCLKDelaySubtotalPlusCNVCFormater, - double DPPCLKDelaySCL, - double DPPCLKDelaySCLLBOnly, - double DPPCLKDelayCNVCCursor, - double DISPCLKDelaySubtotal, unsigned int DPP_RECOUT_WIDTH, - enum output_format_class OutputFormat, - unsigned int MaxInterDCNTileRepeaters, unsigned int VStartup, unsigned int MaxVStartup, - unsigned int GPUVMPageTableLevels, - bool GPUVMEnable, - bool HostVMEnable, - unsigned int HostVMMaxNonCachedPageTableLevels, - double HostVMMinPageSize, - bool DynamicMetadataEnable, - bool DynamicMetadataVMEnabled, - int DynamicMetadataLinesBeforeActiveRequired, - unsigned int DynamicMetadataTransmittedBytes, double UrgentLatency, double UrgentExtraLatency, double TCalc, @@ -846,7 +800,6 @@ static bool CalculatePrefetchSchedule( unsigned int MaxNumSwathY, double PrefetchSourceLinesC, unsigned int SwathWidthC, - int BytePerPixelC, double VInitPreFillC, unsigned int MaxNumSwathC, long swath_width_luma_ub, @@ -854,9 +807,6 @@ static bool CalculatePrefetchSchedule( unsigned int SwathHeightY, unsigned int SwathHeightC, double TWait, - bool ProgressiveToInterlaceUnitInOPP, - double *DSTXAfterScaler, - double *DSTYAfterScaler, double *DestinationLinesForPrefetch, double *PrefetchBandwidth, double *DestinationLinesToRequestVMInVBlank, @@ -865,15 +815,10 @@ static bool CalculatePrefetchSchedule( double *VRatioPrefetchC, double *RequiredPrefetchPixDataBWLuma, double *RequiredPrefetchPixDataBWChroma, - bool *NotEnoughTimeForDynamicMetadata, - double *Tno_bw, - double *prefetch_vmrow_bw, - double *Tdmdl_vm, - double *Tdmdl, - unsigned int *VUpdateOffsetPix, - double *VUpdateWidthPix, - double *VReadyOffsetPix) + bool *NotEnoughTimeForDynamicMetadata) { + struct vba_vars_st *v = &mode_lib->vba; + double DPPCLKDelaySubtotalPlusCNVCFormater = v->DPPCLKDelaySubtotal + v->DPPCLKDelayCNVCFormater; bool MyError = false; unsigned int DPPCycles = 0, DISPCLKCycles = 0; double DSTTotalPixelsAfterScaler = 0; @@ -905,26 +850,26 @@ static bool CalculatePrefetchSchedule( double Tdmec = 0; double Tdmsks = 0; - if (GPUVMEnable == true && HostVMEnable == true) { - HostVMInefficiencyFactor = PercentOfIdealDRAMFabricAndSDPPortBWReceivedAfterUrgLatencyPixelMixedWithVMData / PercentOfIdealDRAMFabricAndSDPPortBWReceivedAfterUrgLatencyVMDataOnly; - HostVMDynamicLevelsTrips = HostVMMaxNonCachedPageTableLevels; + if (v->GPUVMEnable == true && v->HostVMEnable == true) { + HostVMInefficiencyFactor = v->PercentOfIdealDRAMFabricAndSDPPortBWReceivedAfterUrgLatencyPixelMixedWithVMData / v->PercentOfIdealDRAMFabricAndSDPPortBWReceivedAfterUrgLatencyVMDataOnly; + HostVMDynamicLevelsTrips = v->HostVMMaxNonCachedPageTableLevels; } else { HostVMInefficiencyFactor = 1; HostVMDynamicLevelsTrips = 0; } CalculateDynamicMetadataParameters( - MaxInterDCNTileRepeaters, + v->MaxInterDCNTileRepeaters, myPipe->DPPCLK, myPipe->DISPCLK, myPipe->DCFCLKDeepSleep, myPipe->PixelClock, myPipe->HTotal, myPipe->VBlank, - DynamicMetadataTransmittedBytes, - DynamicMetadataLinesBeforeActiveRequired, + v->DynamicMetadataTransmittedBytes[k], + v->DynamicMetadataLinesBeforeActiveRequired[k], myPipe->InterlaceEnable, - ProgressiveToInterlaceUnitInOPP, + v->ProgressiveToInterlaceUnitInOPP, &Tsetup, &Tdmbf, &Tdmec, @@ -932,16 +877,16 @@ static bool CalculatePrefetchSchedule( LineTime = myPipe->HTotal / myPipe->PixelClock; trip_to_mem = UrgentLatency; - Tvm_trips = UrgentExtraLatency + trip_to_mem * (GPUVMPageTableLevels * (HostVMDynamicLevelsTrips + 1) - 1); + Tvm_trips = UrgentExtraLatency + trip_to_mem * (v->GPUVMMaxPageTableLevels * (HostVMDynamicLevelsTrips + 1) - 1); - if (DynamicMetadataVMEnabled == true && GPUVMEnable == true) { - *Tdmdl = TWait + Tvm_trips + trip_to_mem; + if (v->DynamicMetadataVMEnabled == true && v->GPUVMEnable == true) { + v->Tdmdl[k] = TWait + Tvm_trips + trip_to_mem; } else { - *Tdmdl = TWait + UrgentExtraLatency; + v->Tdmdl[k] = TWait + UrgentExtraLatency; } - if (DynamicMetadataEnable == true) { - if (VStartup * LineTime < Tsetup + *Tdmdl + Tdmbf + Tdmec + Tdmsks) { + if (v->DynamicMetadataEnable[k] == true) { + if (VStartup * LineTime < Tsetup + v->Tdmdl[k] + Tdmbf + Tdmec + Tdmsks) { *NotEnoughTimeForDynamicMetadata = true; } else { *NotEnoughTimeForDynamicMetadata = false; @@ -949,39 +894,39 @@ static bool CalculatePrefetchSchedule( dml_print("DML: Tdmbf: %fus - time for dmd transfer from dchub to dio output buffer\n", Tdmbf); dml_print("DML: Tdmec: %fus - time dio takes to transfer dmd\n", Tdmec); dml_print("DML: Tdmsks: %fus - time before active dmd must complete transmission at dio\n", Tdmsks); - dml_print("DML: Tdmdl: %fus - time for fabric to become ready and fetch dmd \n", *Tdmdl); + dml_print("DML: Tdmdl: %fus - time for fabric to become ready and fetch dmd \n", v->Tdmdl[k]); } } else { *NotEnoughTimeForDynamicMetadata = false; } - *Tdmdl_vm = (DynamicMetadataEnable == true && DynamicMetadataVMEnabled == true && GPUVMEnable == true ? TWait + Tvm_trips : 0); + v->Tdmdl_vm[k] = (v->DynamicMetadataEnable[k] == true && v->DynamicMetadataVMEnabled == true && v->GPUVMEnable == true ? TWait + Tvm_trips : 0); if (myPipe->ScalerEnabled) - DPPCycles = DPPCLKDelaySubtotalPlusCNVCFormater + DPPCLKDelaySCL; + DPPCycles = DPPCLKDelaySubtotalPlusCNVCFormater + v->DPPCLKDelaySCL; else - DPPCycles = DPPCLKDelaySubtotalPlusCNVCFormater + DPPCLKDelaySCLLBOnly; + DPPCycles = DPPCLKDelaySubtotalPlusCNVCFormater + v->DPPCLKDelaySCLLBOnly; - DPPCycles = DPPCycles + myPipe->NumberOfCursors * DPPCLKDelayCNVCCursor; + DPPCycles = DPPCycles + myPipe->NumberOfCursors * v->DPPCLKDelayCNVCCursor; - DISPCLKCycles = DISPCLKDelaySubtotal; + DISPCLKCycles = v->DISPCLKDelaySubtotal; if (myPipe->DPPCLK == 0.0 || myPipe->DISPCLK == 0.0) return true; - *DSTXAfterScaler = DPPCycles * myPipe->PixelClock / myPipe->DPPCLK + DISPCLKCycles * myPipe->PixelClock / myPipe->DISPCLK + v->DSTXAfterScaler[k] = DPPCycles * myPipe->PixelClock / myPipe->DPPCLK + DISPCLKCycles * myPipe->PixelClock / myPipe->DISPCLK + DSCDelay; - *DSTXAfterScaler = *DSTXAfterScaler + ((myPipe->ODMCombineEnabled)?18:0) + (myPipe->DPPPerPlane - 1) * DPP_RECOUT_WIDTH; + v->DSTXAfterScaler[k] = v->DSTXAfterScaler[k] + ((myPipe->ODMCombineEnabled)?18:0) + (myPipe->DPPPerPlane - 1) * DPP_RECOUT_WIDTH; - if (OutputFormat == dm_420 || (myPipe->InterlaceEnable && ProgressiveToInterlaceUnitInOPP)) - *DSTYAfterScaler = 1; + if (v->OutputFormat[k] == dm_420 || (myPipe->InterlaceEnable && v->ProgressiveToInterlaceUnitInOPP)) + v->DSTYAfterScaler[k] = 1; else - *DSTYAfterScaler = 0; + v->DSTYAfterScaler[k] = 0; - DSTTotalPixelsAfterScaler = *DSTYAfterScaler * myPipe->HTotal + *DSTXAfterScaler; - *DSTYAfterScaler = dml_floor(DSTTotalPixelsAfterScaler / myPipe->HTotal, 1); - *DSTXAfterScaler = DSTTotalPixelsAfterScaler - ((double) (*DSTYAfterScaler * myPipe->HTotal)); + DSTTotalPixelsAfterScaler = v->DSTYAfterScaler[k] * myPipe->HTotal + v->DSTXAfterScaler[k]; + v->DSTYAfterScaler[k] = dml_floor(DSTTotalPixelsAfterScaler / myPipe->HTotal, 1); + v->DSTXAfterScaler[k] = DSTTotalPixelsAfterScaler - ((double) (v->DSTYAfterScaler[k] * myPipe->HTotal)); MyError = false; @@ -990,33 +935,33 @@ static bool CalculatePrefetchSchedule( Tvm_trips_rounded = dml_ceil(4.0 * Tvm_trips / LineTime, 1) / 4 * LineTime; Tr0_trips_rounded = dml_ceil(4.0 * Tr0_trips / LineTime, 1) / 4 * LineTime; - if (GPUVMEnable) { - if (GPUVMPageTableLevels >= 3) { - *Tno_bw = UrgentExtraLatency + trip_to_mem * ((GPUVMPageTableLevels - 2) - 1); + if (v->GPUVMEnable) { + if (v->GPUVMMaxPageTableLevels >= 3) { + v->Tno_bw[k] = UrgentExtraLatency + trip_to_mem * ((v->GPUVMMaxPageTableLevels - 2) - 1); } else - *Tno_bw = 0; + v->Tno_bw[k] = 0; } else if (!myPipe->DCCEnable) - *Tno_bw = LineTime; + v->Tno_bw[k] = LineTime; else - *Tno_bw = LineTime / 4; + v->Tno_bw[k] = LineTime / 4; - dst_y_prefetch_equ = VStartup - (Tsetup + dml_max(TWait + TCalc, *Tdmdl)) / LineTime - - (*DSTYAfterScaler + *DSTXAfterScaler / myPipe->HTotal); + dst_y_prefetch_equ = VStartup - (Tsetup + dml_max(TWait + TCalc, v->Tdmdl[k])) / LineTime + - (v->DSTYAfterScaler[k] + v->DSTXAfterScaler[k] / myPipe->HTotal); dst_y_prefetch_equ = dml_min(dst_y_prefetch_equ, 63.75); // limit to the reg limit of U6.2 for DST_Y_PREFETCH Lsw_oto = dml_max(PrefetchSourceLinesY, PrefetchSourceLinesC); Tsw_oto = Lsw_oto * LineTime; - prefetch_bw_oto = (PrefetchSourceLinesY * swath_width_luma_ub * BytePerPixelY + PrefetchSourceLinesC * swath_width_chroma_ub * BytePerPixelC) / Tsw_oto; + prefetch_bw_oto = (PrefetchSourceLinesY * swath_width_luma_ub * BytePerPixelY + PrefetchSourceLinesC * swath_width_chroma_ub * v->BytePerPixelC[k]) / Tsw_oto; - if (GPUVMEnable == true) { - Tvm_oto = dml_max3(*Tno_bw + PDEAndMetaPTEBytesFrame * HostVMInefficiencyFactor / prefetch_bw_oto, + if (v->GPUVMEnable == true) { + Tvm_oto = dml_max3(v->Tno_bw[k] + PDEAndMetaPTEBytesFrame * HostVMInefficiencyFactor / prefetch_bw_oto, Tvm_trips, LineTime / 4.0); } else Tvm_oto = LineTime / 4.0; - if ((GPUVMEnable == true || myPipe->DCCEnable == true)) { + if ((v->GPUVMEnable == true || myPipe->DCCEnable == true)) { Tr0_oto = dml_max3( (MetaRowByte + PixelPTEBytesPerRow * HostVMInefficiencyFactor) / prefetch_bw_oto, LineTime - Tvm_oto, LineTime / 4); @@ -1042,10 +987,10 @@ static bool CalculatePrefetchSchedule( dml_print("DML: Tdmbf: %fus - time for dmd transfer from dchub to dio output buffer\n", Tdmbf); dml_print("DML: Tdmec: %fus - time dio takes to transfer dmd\n", Tdmec); dml_print("DML: Tdmsks: %fus - time before active dmd must complete transmission at dio\n", Tdmsks); - dml_print("DML: Tdmdl_vm: %fus - time for vm stages of dmd \n", *Tdmdl_vm); - dml_print("DML: Tdmdl: %fus - time for fabric to become ready and fetch dmd \n", *Tdmdl); - dml_print("DML: dst_x_after_scl: %f pixels - number of pixel clocks pipeline and buffer delay after scaler \n", *DSTXAfterScaler); - dml_print("DML: dst_y_after_scl: %d lines - number of lines of pipeline and buffer delay after scaler \n", (int)*DSTYAfterScaler); + dml_print("DML: Tdmdl_vm: %fus - time for vm stages of dmd \n", v->Tdmdl_vm[k]); + dml_print("DML: Tdmdl: %fus - time for fabric to become ready and fetch dmd \n", v->Tdmdl[k]); + dml_print("DML: dst_x_after_scl: %f pixels - number of pixel clocks pipeline and buffer delay after scaler \n", v->DSTXAfterScaler[k]); + dml_print("DML: dst_y_after_scl: %d lines - number of lines of pipeline and buffer delay after scaler \n", (int)v->DSTYAfterScaler[k]); *PrefetchBandwidth = 0; *DestinationLinesToRequestVMInVBlank = 0; @@ -1059,26 +1004,26 @@ static bool CalculatePrefetchSchedule( double PrefetchBandwidth3 = 0; double PrefetchBandwidth4 = 0; - if (Tpre_rounded - *Tno_bw > 0) + if (Tpre_rounded - v->Tno_bw[k] > 0) PrefetchBandwidth1 = (PDEAndMetaPTEBytesFrame * HostVMInefficiencyFactor + 2 * MetaRowByte + 2 * PixelPTEBytesPerRow * HostVMInefficiencyFactor + PrefetchSourceLinesY * swath_width_luma_ub * BytePerPixelY - + PrefetchSourceLinesC * swath_width_chroma_ub * BytePerPixelC) - / (Tpre_rounded - *Tno_bw); + + PrefetchSourceLinesC * swath_width_chroma_ub * v->BytePerPixelC[k]) + / (Tpre_rounded - v->Tno_bw[k]); else PrefetchBandwidth1 = 0; - if (VStartup == MaxVStartup && (PrefetchBandwidth1 > 4 * prefetch_bw_oto) && (Tpre_rounded - Tsw_oto / 4 - 0.75 * LineTime - *Tno_bw) > 0) { - PrefetchBandwidth1 = (PDEAndMetaPTEBytesFrame * HostVMInefficiencyFactor + 2 * MetaRowByte + 2 * PixelPTEBytesPerRow * HostVMInefficiencyFactor) / (Tpre_rounded - Tsw_oto / 4 - 0.75 * LineTime - *Tno_bw); + if (VStartup == MaxVStartup && (PrefetchBandwidth1 > 4 * prefetch_bw_oto) && (Tpre_rounded - Tsw_oto / 4 - 0.75 * LineTime - v->Tno_bw[k]) > 0) { + PrefetchBandwidth1 = (PDEAndMetaPTEBytesFrame * HostVMInefficiencyFactor + 2 * MetaRowByte + 2 * PixelPTEBytesPerRow * HostVMInefficiencyFactor) / (Tpre_rounded - Tsw_oto / 4 - 0.75 * LineTime - v->Tno_bw[k]); } - if (Tpre_rounded - *Tno_bw - 2 * Tr0_trips_rounded > 0) + if (Tpre_rounded - v->Tno_bw[k] - 2 * Tr0_trips_rounded > 0) PrefetchBandwidth2 = (PDEAndMetaPTEBytesFrame * HostVMInefficiencyFactor + PrefetchSourceLinesY * swath_width_luma_ub * BytePerPixelY + PrefetchSourceLinesC * swath_width_chroma_ub * - BytePerPixelC) / - (Tpre_rounded - *Tno_bw - 2 * Tr0_trips_rounded); + v->BytePerPixelC[k]) / + (Tpre_rounded - v->Tno_bw[k] - 2 * Tr0_trips_rounded); else PrefetchBandwidth2 = 0; @@ -1086,7 +1031,7 @@ static bool CalculatePrefetchSchedule( PrefetchBandwidth3 = (2 * MetaRowByte + 2 * PixelPTEBytesPerRow * HostVMInefficiencyFactor + PrefetchSourceLinesY * swath_width_luma_ub * BytePerPixelY + PrefetchSourceLinesC * - swath_width_chroma_ub * BytePerPixelC) / (Tpre_rounded - + swath_width_chroma_ub * v->BytePerPixelC[k]) / (Tpre_rounded - Tvm_trips_rounded); else PrefetchBandwidth3 = 0; @@ -1096,7 +1041,7 @@ static bool CalculatePrefetchSchedule( } if (Tpre_rounded - Tvm_trips_rounded - 2 * Tr0_trips_rounded > 0) - PrefetchBandwidth4 = (PrefetchSourceLinesY * swath_width_luma_ub * BytePerPixelY + PrefetchSourceLinesC * swath_width_chroma_ub * BytePerPixelC) + PrefetchBandwidth4 = (PrefetchSourceLinesY * swath_width_luma_ub * BytePerPixelY + PrefetchSourceLinesC * swath_width_chroma_ub * v->BytePerPixelC[k]) / (Tpre_rounded - Tvm_trips_rounded - 2 * Tr0_trips_rounded); else PrefetchBandwidth4 = 0; @@ -1107,7 +1052,7 @@ static bool CalculatePrefetchSchedule( bool Case3OK; if (PrefetchBandwidth1 > 0) { - if (*Tno_bw + PDEAndMetaPTEBytesFrame * HostVMInefficiencyFactor / PrefetchBandwidth1 + if (v->Tno_bw[k] + PDEAndMetaPTEBytesFrame * HostVMInefficiencyFactor / PrefetchBandwidth1 >= Tvm_trips_rounded && (MetaRowByte + PixelPTEBytesPerRow * HostVMInefficiencyFactor) / PrefetchBandwidth1 >= Tr0_trips_rounded) { Case1OK = true; } else { @@ -1118,7 +1063,7 @@ static bool CalculatePrefetchSchedule( } if (PrefetchBandwidth2 > 0) { - if (*Tno_bw + PDEAndMetaPTEBytesFrame * HostVMInefficiencyFactor / PrefetchBandwidth2 + if (v->Tno_bw[k] + PDEAndMetaPTEBytesFrame * HostVMInefficiencyFactor / PrefetchBandwidth2 >= Tvm_trips_rounded && (MetaRowByte + PixelPTEBytesPerRow * HostVMInefficiencyFactor) / PrefetchBandwidth2 < Tr0_trips_rounded) { Case2OK = true; } else { @@ -1129,7 +1074,7 @@ static bool CalculatePrefetchSchedule( } if (PrefetchBandwidth3 > 0) { - if (*Tno_bw + PDEAndMetaPTEBytesFrame * HostVMInefficiencyFactor / PrefetchBandwidth3 + if (v->Tno_bw[k] + PDEAndMetaPTEBytesFrame * HostVMInefficiencyFactor / PrefetchBandwidth3 < Tvm_trips_rounded && (MetaRowByte + PixelPTEBytesPerRow * HostVMInefficiencyFactor) / PrefetchBandwidth3 >= Tr0_trips_rounded) { Case3OK = true; } else { @@ -1152,13 +1097,13 @@ static bool CalculatePrefetchSchedule( dml_print("DML: prefetch_bw_equ: %f\n", prefetch_bw_equ); if (prefetch_bw_equ > 0) { - if (GPUVMEnable) { - Tvm_equ = dml_max3(*Tno_bw + PDEAndMetaPTEBytesFrame * HostVMInefficiencyFactor / prefetch_bw_equ, Tvm_trips, LineTime / 4); + if (v->GPUVMEnable) { + Tvm_equ = dml_max3(v->Tno_bw[k] + PDEAndMetaPTEBytesFrame * HostVMInefficiencyFactor / prefetch_bw_equ, Tvm_trips, LineTime / 4); } else { Tvm_equ = LineTime / 4; } - if ((GPUVMEnable || myPipe->DCCEnable)) { + if ((v->GPUVMEnable || myPipe->DCCEnable)) { Tr0_equ = dml_max4( (MetaRowByte + PixelPTEBytesPerRow * HostVMInefficiencyFactor) / prefetch_bw_equ, Tr0_trips, @@ -1227,7 +1172,7 @@ static bool CalculatePrefetchSchedule( } *RequiredPrefetchPixDataBWLuma = (double) PrefetchSourceLinesY / LinesToRequestPrefetchPixelData * BytePerPixelY * swath_width_luma_ub / LineTime; - *RequiredPrefetchPixDataBWChroma = (double) PrefetchSourceLinesC / LinesToRequestPrefetchPixelData * BytePerPixelC * swath_width_chroma_ub / LineTime; + *RequiredPrefetchPixDataBWChroma = (double) PrefetchSourceLinesC / LinesToRequestPrefetchPixelData * v->BytePerPixelC[k] * swath_width_chroma_ub / LineTime; } else { MyError = true; dml_print("DML: MyErr set %s:%d\n", __FILE__, __LINE__); @@ -1243,9 +1188,9 @@ static bool CalculatePrefetchSchedule( dml_print("DML: Tr0: %fus - time to fetch first row of data pagetables and first row of meta data (done in parallel)\n", TimeForFetchingRowInVBlank); dml_print("DML: Tr1: %fus - time to fetch second row of data pagetables and second row of meta data (done in parallel)\n", TimeForFetchingRowInVBlank); dml_print("DML: Tsw: %fus = time to fetch enough pixel data and cursor data to feed the scalers init position and detile\n", (double)LinesToRequestPrefetchPixelData * LineTime); - dml_print("DML: To: %fus - time for propagation from scaler to optc\n", (*DSTYAfterScaler + ((*DSTXAfterScaler) / (double) myPipe->HTotal)) * LineTime); + dml_print("DML: To: %fus - time for propagation from scaler to optc\n", (v->DSTYAfterScaler[k] + ((v->DSTXAfterScaler[k]) / (double) myPipe->HTotal)) * LineTime); dml_print("DML: Tvstartup - Tsetup - Tcalc - Twait - Tpre - To > 0\n"); - dml_print("DML: Tslack(pre): %fus - time left over in schedule\n", VStartup * LineTime - TimeForFetchingMetaPTE - 2 * TimeForFetchingRowInVBlank - (*DSTYAfterScaler + ((*DSTXAfterScaler) / (double) myPipe->HTotal)) * LineTime - TWait - TCalc - Tsetup); + dml_print("DML: Tslack(pre): %fus - time left over in schedule\n", VStartup * LineTime - TimeForFetchingMetaPTE - 2 * TimeForFetchingRowInVBlank - (v->DSTYAfterScaler[k] + ((v->DSTXAfterScaler[k]) / (double) myPipe->HTotal)) * LineTime - TWait - TCalc - Tsetup); dml_print("DML: row_bytes = dpte_row_bytes (per_pipe) = PixelPTEBytesPerRow = : %d\n", PixelPTEBytesPerRow); } else { @@ -1276,7 +1221,7 @@ static bool CalculatePrefetchSchedule( dml_print("DML: MyErr set %s:%d\n", __FILE__, __LINE__); } - *prefetch_vmrow_bw = dml_max(prefetch_vm_bw, prefetch_row_bw); + v->prefetch_vmrow_bw[k] = dml_max(prefetch_vm_bw, prefetch_row_bw); } if (MyError) { @@ -2437,30 +2382,12 @@ static void DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerforman v->ErrorResult[k] = CalculatePrefetchSchedule( mode_lib, - v->PercentOfIdealDRAMFabricAndSDPPortBWReceivedAfterUrgLatencyPixelMixedWithVMData, - v->PercentOfIdealDRAMFabricAndSDPPortBWReceivedAfterUrgLatencyVMDataOnly, + k, &myPipe, v->DSCDelay[k], - v->DPPCLKDelaySubtotal - + v->DPPCLKDelayCNVCFormater, - v->DPPCLKDelaySCL, - v->DPPCLKDelaySCLLBOnly, - v->DPPCLKDelayCNVCCursor, - v->DISPCLKDelaySubtotal, (unsigned int) (v->SwathWidthY[k] / v->HRatio[k]), - v->OutputFormat[k], - v->MaxInterDCNTileRepeaters, dml_min(v->VStartupLines, v->MaxVStartupLines[k]), v->MaxVStartupLines[k], - v->GPUVMMaxPageTableLevels, - v->GPUVMEnable, - v->HostVMEnable, - v->HostVMMaxNonCachedPageTableLevels, - v->HostVMMinPageSize, - v->DynamicMetadataEnable[k], - v->DynamicMetadataVMEnabled, - v->DynamicMetadataLinesBeforeActiveRequired[k], - v->DynamicMetadataTransmittedBytes[k], v->UrgentLatency, v->UrgentExtraLatency, v->TCalc, @@ -2474,7 +2401,6 @@ static void DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerforman v->MaxNumSwathY[k], v->PrefetchSourceLinesC[k], v->SwathWidthC[k], - v->BytePerPixelC[k], v->VInitPreFillC[k], v->MaxNumSwathC[k], v->swath_width_luma_ub[k], @@ -2482,9 +2408,6 @@ static void DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerforman v->SwathHeightY[k], v->SwathHeightC[k], TWait, - v->ProgressiveToInterlaceUnitInOPP, - &v->DSTXAfterScaler[k], - &v->DSTYAfterScaler[k], &v->DestinationLinesForPrefetch[k], &v->PrefetchBandwidth[k], &v->DestinationLinesToRequestVMInVBlank[k], @@ -2493,14 +2416,7 @@ static void DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerforman &v->VRatioPrefetchC[k], &v->RequiredPrefetchPixDataBWLuma[k], &v->RequiredPrefetchPixDataBWChroma[k], - &v->NotEnoughTimeForDynamicMetadata[k], - &v->Tno_bw[k], - &v->prefetch_vmrow_bw[k], - &v->Tdmdl_vm[k], - &v->Tdmdl[k], - &v->VUpdateOffsetPix[k], - &v->VUpdateWidthPix[k], - &v->VReadyOffsetPix[k]); + &v->NotEnoughTimeForDynamicMetadata[k]); if (v->BlendingAndTiming[k] == k) { double TotalRepeaterDelayTime = v->MaxInterDCNTileRepeaters * (2 / v->DPPCLK[k] + 3 / v->DISPCLK); v->VUpdateWidthPix[k] = (14 / v->DCFCLKDeepSleep + 12 / v->DPPCLK[k] + TotalRepeaterDelayTime) * v->PixelClock[k]; @@ -4770,29 +4686,12 @@ void dml30_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_l v->NoTimeForPrefetch[i][j][k] = CalculatePrefetchSchedule( mode_lib, - v->PercentOfIdealDRAMFabricAndSDPPortBWReceivedAfterUrgLatencyPixelMixedWithVMData, - v->PercentOfIdealDRAMFabricAndSDPPortBWReceivedAfterUrgLatencyVMDataOnly, + k, &myPipe, v->DSCDelayPerState[i][k], - v->DPPCLKDelaySubtotal + v->DPPCLKDelayCNVCFormater, - v->DPPCLKDelaySCL, - v->DPPCLKDelaySCLLBOnly, - v->DPPCLKDelayCNVCCursor, - v->DISPCLKDelaySubtotal, v->SwathWidthYThisState[k] / v->HRatio[k], - v->OutputFormat[k], - v->MaxInterDCNTileRepeaters, dml_min(v->MaxVStartup, v->MaximumVStartup[i][j][k]), v->MaximumVStartup[i][j][k], - v->GPUVMMaxPageTableLevels, - v->GPUVMEnable, - v->HostVMEnable, - v->HostVMMaxNonCachedPageTableLevels, - v->HostVMMinPageSize, - v->DynamicMetadataEnable[k], - v->DynamicMetadataVMEnabled, - v->DynamicMetadataLinesBeforeActiveRequired[k], - v->DynamicMetadataTransmittedBytes[k], v->UrgLatency[i], v->ExtraLatency, v->TimeCalc, @@ -4806,7 +4705,6 @@ void dml30_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_l v->MaxNumSwY[k], v->PrefetchLinesC[i][j][k], v->SwathWidthCThisState[k], - v->BytePerPixelC[k], v->PrefillC[k], v->MaxNumSwC[k], v->swath_width_luma_ub_this_state[k], @@ -4814,9 +4712,6 @@ void dml30_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_l v->SwathHeightYThisState[k], v->SwathHeightCThisState[k], v->TWait, - v->ProgressiveToInterlaceUnitInOPP, - &v->DSTXAfterScaler[k], - &v->DSTYAfterScaler[k], &v->LineTimesForPrefetch[k], &v->PrefetchBW[k], &v->LinesForMetaPTE[k], @@ -4825,14 +4720,7 @@ void dml30_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_l &v->VRatioPreC[i][j][k], &v->RequiredPrefetchPixelDataBWLuma[i][j][k], &v->RequiredPrefetchPixelDataBWChroma[i][j][k], - &v->NoTimeForDynamicMetadata[i][j][k], - &v->Tno_bw[k], - &v->prefetch_vmrow_bw[k], - &v->Tdmdl_vm[k], - &v->Tdmdl[k], - &v->VUpdateOffsetPix[k], - &v->VUpdateWidthPix[k], - &v->VReadyOffsetPix[k]); + &v->NoTimeForDynamicMetadata[i][j][k]); } for (k = 0; k <= v->NumberOfActivePlanes - 1; k++) { From a55ba45fde743f100df58a4bfeeacd7f48308046 Mon Sep 17 00:00:00 2001 From: Nathan Chancellor Date: Sat, 13 Dec 2025 19:58:11 +0900 Subject: [PATCH 0828/1321] drm/amd/display: Reduce number of arguments of dcn30's CalculateWatermarksAndDRAMSpeedChangeSupport() CalculateWatermarksAndDRAMSpeedChangeSupport() has a large number of parameters, which must be passed on the stack. Most of the parameters between the two callsites are the same, so they can be accessed through the existing mode_lib pointer, instead of being passed as explicit arguments. Doing this reduces the stack size of dml30_ModeSupportAndSystemConfigurationFull() from 1912 bytes to 1840 bytes building for x86_64 with clang-22, helping stay under the 2048 byte limit for display_mode_vba_30.c. Additionally, now that there is a pointer to mode_lib->vba available, use 'v' consistently throughout the entire function. Signed-off-by: Nathan Chancellor Signed-off-by: Alex Deucher (cherry picked from commit 563dfbefdf633c8d958398ddfa3955f9f40e47d9) --- .../dc/dml/dcn30/display_mode_vba_30.c | 287 ++++-------------- 1 file changed, 66 insertions(+), 221 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c b/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c index 2d19bb8de59c8..1df3412be3465 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c @@ -265,62 +265,23 @@ static void CalculateDynamicMetadataParameters( static void CalculateWatermarksAndDRAMSpeedChangeSupport( struct display_mode_lib *mode_lib, unsigned int PrefetchMode, - unsigned int NumberOfActivePlanes, - unsigned int MaxLineBufferLines, - unsigned int LineBufferSize, - unsigned int DPPOutputBufferPixels, - unsigned int DETBufferSizeInKByte, - unsigned int WritebackInterfaceBufferSize, double DCFCLK, double ReturnBW, - bool GPUVMEnable, - unsigned int dpte_group_bytes[], - unsigned int MetaChunkSize, double UrgentLatency, double ExtraLatency, - double WritebackLatency, - double WritebackChunkSize, double SOCCLK, - double DRAMClockChangeLatency, - double SRExitTime, - double SREnterPlusExitTime, double DCFCLKDeepSleep, unsigned int DPPPerPlane[], - bool DCCEnable[], double DPPCLK[], unsigned int DETBufferSizeY[], unsigned int DETBufferSizeC[], unsigned int SwathHeightY[], unsigned int SwathHeightC[], - unsigned int LBBitPerPixel[], double SwathWidthY[], double SwathWidthC[], - double HRatio[], - double HRatioChroma[], - unsigned int vtaps[], - unsigned int VTAPsChroma[], - double VRatio[], - double VRatioChroma[], - unsigned int HTotal[], - double PixelClock[], - unsigned int BlendingAndTiming[], double BytePerPixelDETY[], double BytePerPixelDETC[], - double DSTXAfterScaler[], - double DSTYAfterScaler[], - bool WritebackEnable[], - enum source_format_class WritebackPixelFormat[], - double WritebackDestinationWidth[], - double WritebackDestinationHeight[], - double WritebackSourceHeight[], - enum clock_change_support *DRAMClockChangeSupport, - double *UrgentWatermark, - double *WritebackUrgentWatermark, - double *DRAMClockChangeWatermark, - double *WritebackDRAMClockChangeWatermark, - double *StutterExitWatermark, - double *StutterEnterPlusExitWatermark, - double *MinActiveDRAMClockChangeLatencySupported); + enum clock_change_support *DRAMClockChangeSupport); static void CalculateDCFCLKDeepSleep( struct display_mode_lib *mode_lib, unsigned int NumberOfActivePlanes, @@ -2646,62 +2607,23 @@ static void DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerforman CalculateWatermarksAndDRAMSpeedChangeSupport( mode_lib, PrefetchMode, - v->NumberOfActivePlanes, - v->MaxLineBufferLines, - v->LineBufferSize, - v->DPPOutputBufferPixels, - v->DETBufferSizeInKByte[0], - v->WritebackInterfaceBufferSize, v->DCFCLK, v->ReturnBW, - v->GPUVMEnable, - v->dpte_group_bytes, - v->MetaChunkSize, v->UrgentLatency, v->UrgentExtraLatency, - v->WritebackLatency, - v->WritebackChunkSize, v->SOCCLK, - v->FinalDRAMClockChangeLatency, - v->SRExitTime, - v->SREnterPlusExitTime, v->DCFCLKDeepSleep, v->DPPPerPlane, - v->DCCEnable, v->DPPCLK, v->DETBufferSizeY, v->DETBufferSizeC, v->SwathHeightY, v->SwathHeightC, - v->LBBitPerPixel, v->SwathWidthY, v->SwathWidthC, - v->HRatio, - v->HRatioChroma, - v->vtaps, - v->VTAPsChroma, - v->VRatio, - v->VRatioChroma, - v->HTotal, - v->PixelClock, - v->BlendingAndTiming, v->BytePerPixelDETY, v->BytePerPixelDETC, - v->DSTXAfterScaler, - v->DSTYAfterScaler, - v->WritebackEnable, - v->WritebackPixelFormat, - v->WritebackDestinationWidth, - v->WritebackDestinationHeight, - v->WritebackSourceHeight, - &DRAMClockChangeSupport, - &v->UrgentWatermark, - &v->WritebackUrgentWatermark, - &v->DRAMClockChangeWatermark, - &v->WritebackDRAMClockChangeWatermark, - &v->StutterExitWatermark, - &v->StutterEnterPlusExitWatermark, - &v->MinActiveDRAMClockChangeLatencySupported); + &DRAMClockChangeSupport); for (k = 0; k < v->NumberOfActivePlanes; ++k) { if (v->WritebackEnable[k] == true) { @@ -4895,62 +4817,23 @@ void dml30_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_l CalculateWatermarksAndDRAMSpeedChangeSupport( mode_lib, v->PrefetchModePerState[i][j], - v->NumberOfActivePlanes, - v->MaxLineBufferLines, - v->LineBufferSize, - v->DPPOutputBufferPixels, - v->DETBufferSizeInKByte[0], - v->WritebackInterfaceBufferSize, v->DCFCLKState[i][j], v->ReturnBWPerState[i][j], - v->GPUVMEnable, - v->dpte_group_bytes, - v->MetaChunkSize, v->UrgLatency[i], v->ExtraLatency, - v->WritebackLatency, - v->WritebackChunkSize, v->SOCCLKPerState[i], - v->FinalDRAMClockChangeLatency, - v->SRExitTime, - v->SREnterPlusExitTime, v->ProjectedDCFCLKDeepSleep[i][j], v->NoOfDPPThisState, - v->DCCEnable, v->RequiredDPPCLKThisState, v->DETBufferSizeYThisState, v->DETBufferSizeCThisState, v->SwathHeightYThisState, v->SwathHeightCThisState, - v->LBBitPerPixel, v->SwathWidthYThisState, v->SwathWidthCThisState, - v->HRatio, - v->HRatioChroma, - v->vtaps, - v->VTAPsChroma, - v->VRatio, - v->VRatioChroma, - v->HTotal, - v->PixelClock, - v->BlendingAndTiming, v->BytePerPixelInDETY, v->BytePerPixelInDETC, - v->DSTXAfterScaler, - v->DSTYAfterScaler, - v->WritebackEnable, - v->WritebackPixelFormat, - v->WritebackDestinationWidth, - v->WritebackDestinationHeight, - v->WritebackSourceHeight, - &v->DRAMClockChangeSupport[i][j], - &v->UrgentWatermark, - &v->WritebackUrgentWatermark, - &v->DRAMClockChangeWatermark, - &v->WritebackDRAMClockChangeWatermark, - &v->StutterExitWatermark, - &v->StutterEnterPlusExitWatermark, - &v->MinActiveDRAMClockChangeLatencySupported); + &v->DRAMClockChangeSupport[i][j]); } } @@ -5067,63 +4950,25 @@ void dml30_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_l static void CalculateWatermarksAndDRAMSpeedChangeSupport( struct display_mode_lib *mode_lib, unsigned int PrefetchMode, - unsigned int NumberOfActivePlanes, - unsigned int MaxLineBufferLines, - unsigned int LineBufferSize, - unsigned int DPPOutputBufferPixels, - unsigned int DETBufferSizeInKByte, - unsigned int WritebackInterfaceBufferSize, double DCFCLK, double ReturnBW, - bool GPUVMEnable, - unsigned int dpte_group_bytes[], - unsigned int MetaChunkSize, double UrgentLatency, double ExtraLatency, - double WritebackLatency, - double WritebackChunkSize, double SOCCLK, - double DRAMClockChangeLatency, - double SRExitTime, - double SREnterPlusExitTime, double DCFCLKDeepSleep, unsigned int DPPPerPlane[], - bool DCCEnable[], double DPPCLK[], unsigned int DETBufferSizeY[], unsigned int DETBufferSizeC[], unsigned int SwathHeightY[], unsigned int SwathHeightC[], - unsigned int LBBitPerPixel[], double SwathWidthY[], double SwathWidthC[], - double HRatio[], - double HRatioChroma[], - unsigned int vtaps[], - unsigned int VTAPsChroma[], - double VRatio[], - double VRatioChroma[], - unsigned int HTotal[], - double PixelClock[], - unsigned int BlendingAndTiming[], double BytePerPixelDETY[], double BytePerPixelDETC[], - double DSTXAfterScaler[], - double DSTYAfterScaler[], - bool WritebackEnable[], - enum source_format_class WritebackPixelFormat[], - double WritebackDestinationWidth[], - double WritebackDestinationHeight[], - double WritebackSourceHeight[], - enum clock_change_support *DRAMClockChangeSupport, - double *UrgentWatermark, - double *WritebackUrgentWatermark, - double *DRAMClockChangeWatermark, - double *WritebackDRAMClockChangeWatermark, - double *StutterExitWatermark, - double *StutterEnterPlusExitWatermark, - double *MinActiveDRAMClockChangeLatencySupported) + enum clock_change_support *DRAMClockChangeSupport) { + struct vba_vars_st *v = &mode_lib->vba; double EffectiveLBLatencyHidingY = 0; double EffectiveLBLatencyHidingC = 0; double LinesInDETY[DC__NUM_DPP__MAX] = { 0 }; @@ -5142,101 +4987,101 @@ static void CalculateWatermarksAndDRAMSpeedChangeSupport( double WritebackDRAMClockChangeLatencyHiding = 0; unsigned int k, j; - mode_lib->vba.TotalActiveDPP = 0; - mode_lib->vba.TotalDCCActiveDPP = 0; - for (k = 0; k < NumberOfActivePlanes; ++k) { - mode_lib->vba.TotalActiveDPP = mode_lib->vba.TotalActiveDPP + DPPPerPlane[k]; - if (DCCEnable[k] == true) { - mode_lib->vba.TotalDCCActiveDPP = mode_lib->vba.TotalDCCActiveDPP + DPPPerPlane[k]; + v->TotalActiveDPP = 0; + v->TotalDCCActiveDPP = 0; + for (k = 0; k < v->NumberOfActivePlanes; ++k) { + v->TotalActiveDPP = v->TotalActiveDPP + DPPPerPlane[k]; + if (v->DCCEnable[k] == true) { + v->TotalDCCActiveDPP = v->TotalDCCActiveDPP + DPPPerPlane[k]; } } - *UrgentWatermark = UrgentLatency + ExtraLatency; + v->UrgentWatermark = UrgentLatency + ExtraLatency; - *DRAMClockChangeWatermark = DRAMClockChangeLatency + *UrgentWatermark; + v->DRAMClockChangeWatermark = v->FinalDRAMClockChangeLatency + v->UrgentWatermark; - mode_lib->vba.TotalActiveWriteback = 0; - for (k = 0; k < NumberOfActivePlanes; ++k) { - if (WritebackEnable[k] == true) { - mode_lib->vba.TotalActiveWriteback = mode_lib->vba.TotalActiveWriteback + 1; + v->TotalActiveWriteback = 0; + for (k = 0; k < v->NumberOfActivePlanes; ++k) { + if (v->WritebackEnable[k] == true) { + v->TotalActiveWriteback = v->TotalActiveWriteback + 1; } } - if (mode_lib->vba.TotalActiveWriteback <= 1) { - *WritebackUrgentWatermark = WritebackLatency; + if (v->TotalActiveWriteback <= 1) { + v->WritebackUrgentWatermark = v->WritebackLatency; } else { - *WritebackUrgentWatermark = WritebackLatency + WritebackChunkSize * 1024.0 / 32.0 / SOCCLK; + v->WritebackUrgentWatermark = v->WritebackLatency + v->WritebackChunkSize * 1024.0 / 32.0 / SOCCLK; } - if (mode_lib->vba.TotalActiveWriteback <= 1) { - *WritebackDRAMClockChangeWatermark = DRAMClockChangeLatency + WritebackLatency; + if (v->TotalActiveWriteback <= 1) { + v->WritebackDRAMClockChangeWatermark = v->FinalDRAMClockChangeLatency + v->WritebackLatency; } else { - *WritebackDRAMClockChangeWatermark = DRAMClockChangeLatency + WritebackLatency + WritebackChunkSize * 1024.0 / 32.0 / SOCCLK; + v->WritebackDRAMClockChangeWatermark = v->FinalDRAMClockChangeLatency + v->WritebackLatency + v->WritebackChunkSize * 1024.0 / 32.0 / SOCCLK; } - for (k = 0; k < NumberOfActivePlanes; ++k) { + for (k = 0; k < v->NumberOfActivePlanes; ++k) { - mode_lib->vba.LBLatencyHidingSourceLinesY = dml_min((double) MaxLineBufferLines, dml_floor(LineBufferSize / LBBitPerPixel[k] / (SwathWidthY[k] / dml_max(HRatio[k], 1.0)), 1)) - (vtaps[k] - 1); + v->LBLatencyHidingSourceLinesY = dml_min((double) v->MaxLineBufferLines, dml_floor(v->LineBufferSize / v->LBBitPerPixel[k] / (SwathWidthY[k] / dml_max(v->HRatio[k], 1.0)), 1)) - (v->vtaps[k] - 1); - mode_lib->vba.LBLatencyHidingSourceLinesC = dml_min((double) MaxLineBufferLines, dml_floor(LineBufferSize / LBBitPerPixel[k] / (SwathWidthC[k] / dml_max(HRatioChroma[k], 1.0)), 1)) - (VTAPsChroma[k] - 1); + v->LBLatencyHidingSourceLinesC = dml_min((double) v->MaxLineBufferLines, dml_floor(v->LineBufferSize / v->LBBitPerPixel[k] / (SwathWidthC[k] / dml_max(v->HRatioChroma[k], 1.0)), 1)) - (v->VTAPsChroma[k] - 1); - EffectiveLBLatencyHidingY = mode_lib->vba.LBLatencyHidingSourceLinesY / VRatio[k] * (HTotal[k] / PixelClock[k]); + EffectiveLBLatencyHidingY = v->LBLatencyHidingSourceLinesY / v->VRatio[k] * (v->HTotal[k] / v->PixelClock[k]); - EffectiveLBLatencyHidingC = mode_lib->vba.LBLatencyHidingSourceLinesC / VRatioChroma[k] * (HTotal[k] / PixelClock[k]); + EffectiveLBLatencyHidingC = v->LBLatencyHidingSourceLinesC / v->VRatioChroma[k] * (v->HTotal[k] / v->PixelClock[k]); LinesInDETY[k] = (double) DETBufferSizeY[k] / BytePerPixelDETY[k] / SwathWidthY[k]; LinesInDETYRoundedDownToSwath[k] = dml_floor(LinesInDETY[k], SwathHeightY[k]); - FullDETBufferingTimeY[k] = LinesInDETYRoundedDownToSwath[k] * (HTotal[k] / PixelClock[k]) / VRatio[k]; + FullDETBufferingTimeY[k] = LinesInDETYRoundedDownToSwath[k] * (v->HTotal[k] / v->PixelClock[k]) / v->VRatio[k]; if (BytePerPixelDETC[k] > 0) { - LinesInDETC = mode_lib->vba.DETBufferSizeC[k] / BytePerPixelDETC[k] / SwathWidthC[k]; + LinesInDETC = v->DETBufferSizeC[k] / BytePerPixelDETC[k] / SwathWidthC[k]; LinesInDETCRoundedDownToSwath = dml_floor(LinesInDETC, SwathHeightC[k]); - FullDETBufferingTimeC = LinesInDETCRoundedDownToSwath * (HTotal[k] / PixelClock[k]) / VRatioChroma[k]; + FullDETBufferingTimeC = LinesInDETCRoundedDownToSwath * (v->HTotal[k] / v->PixelClock[k]) / v->VRatioChroma[k]; } else { LinesInDETC = 0; FullDETBufferingTimeC = 999999; } - ActiveDRAMClockChangeLatencyMarginY = EffectiveLBLatencyHidingY + FullDETBufferingTimeY[k] - *UrgentWatermark - (HTotal[k] / PixelClock[k]) * (DSTXAfterScaler[k] / HTotal[k] + DSTYAfterScaler[k]) - *DRAMClockChangeWatermark; + ActiveDRAMClockChangeLatencyMarginY = EffectiveLBLatencyHidingY + FullDETBufferingTimeY[k] - v->UrgentWatermark - (v->HTotal[k] / v->PixelClock[k]) * (v->DSTXAfterScaler[k] / v->HTotal[k] + v->DSTYAfterScaler[k]) - v->DRAMClockChangeWatermark; - if (NumberOfActivePlanes > 1) { - ActiveDRAMClockChangeLatencyMarginY = ActiveDRAMClockChangeLatencyMarginY - (1 - 1.0 / NumberOfActivePlanes) * SwathHeightY[k] * HTotal[k] / PixelClock[k] / VRatio[k]; + if (v->NumberOfActivePlanes > 1) { + ActiveDRAMClockChangeLatencyMarginY = ActiveDRAMClockChangeLatencyMarginY - (1 - 1.0 / v->NumberOfActivePlanes) * SwathHeightY[k] * v->HTotal[k] / v->PixelClock[k] / v->VRatio[k]; } if (BytePerPixelDETC[k] > 0) { - ActiveDRAMClockChangeLatencyMarginC = EffectiveLBLatencyHidingC + FullDETBufferingTimeC - *UrgentWatermark - (HTotal[k] / PixelClock[k]) * (DSTXAfterScaler[k] / HTotal[k] + DSTYAfterScaler[k]) - *DRAMClockChangeWatermark; + ActiveDRAMClockChangeLatencyMarginC = EffectiveLBLatencyHidingC + FullDETBufferingTimeC - v->UrgentWatermark - (v->HTotal[k] / v->PixelClock[k]) * (v->DSTXAfterScaler[k] / v->HTotal[k] + v->DSTYAfterScaler[k]) - v->DRAMClockChangeWatermark; - if (NumberOfActivePlanes > 1) { - ActiveDRAMClockChangeLatencyMarginC = ActiveDRAMClockChangeLatencyMarginC - (1 - 1.0 / NumberOfActivePlanes) * SwathHeightC[k] * HTotal[k] / PixelClock[k] / VRatioChroma[k]; + if (v->NumberOfActivePlanes > 1) { + ActiveDRAMClockChangeLatencyMarginC = ActiveDRAMClockChangeLatencyMarginC - (1 - 1.0 / v->NumberOfActivePlanes) * SwathHeightC[k] * v->HTotal[k] / v->PixelClock[k] / v->VRatioChroma[k]; } - mode_lib->vba.ActiveDRAMClockChangeLatencyMargin[k] = dml_min(ActiveDRAMClockChangeLatencyMarginY, ActiveDRAMClockChangeLatencyMarginC); + v->ActiveDRAMClockChangeLatencyMargin[k] = dml_min(ActiveDRAMClockChangeLatencyMarginY, ActiveDRAMClockChangeLatencyMarginC); } else { - mode_lib->vba.ActiveDRAMClockChangeLatencyMargin[k] = ActiveDRAMClockChangeLatencyMarginY; + v->ActiveDRAMClockChangeLatencyMargin[k] = ActiveDRAMClockChangeLatencyMarginY; } - if (WritebackEnable[k] == true) { + if (v->WritebackEnable[k] == true) { - WritebackDRAMClockChangeLatencyHiding = WritebackInterfaceBufferSize * 1024 / (WritebackDestinationWidth[k] * WritebackDestinationHeight[k] / (WritebackSourceHeight[k] * HTotal[k] / PixelClock[k]) * 4); - if (WritebackPixelFormat[k] == dm_444_64) { + WritebackDRAMClockChangeLatencyHiding = v->WritebackInterfaceBufferSize * 1024 / (v->WritebackDestinationWidth[k] * v->WritebackDestinationHeight[k] / (v->WritebackSourceHeight[k] * v->HTotal[k] / v->PixelClock[k]) * 4); + if (v->WritebackPixelFormat[k] == dm_444_64) { WritebackDRAMClockChangeLatencyHiding = WritebackDRAMClockChangeLatencyHiding / 2; } - if (mode_lib->vba.WritebackConfiguration == dm_whole_buffer_for_single_stream_interleave) { + if (v->WritebackConfiguration == dm_whole_buffer_for_single_stream_interleave) { WritebackDRAMClockChangeLatencyHiding = WritebackDRAMClockChangeLatencyHiding * 2; } - WritebackDRAMClockChangeLatencyMargin = WritebackDRAMClockChangeLatencyHiding - mode_lib->vba.WritebackDRAMClockChangeWatermark; - mode_lib->vba.ActiveDRAMClockChangeLatencyMargin[k] = dml_min(mode_lib->vba.ActiveDRAMClockChangeLatencyMargin[k], WritebackDRAMClockChangeLatencyMargin); + WritebackDRAMClockChangeLatencyMargin = WritebackDRAMClockChangeLatencyHiding - v->WritebackDRAMClockChangeWatermark; + v->ActiveDRAMClockChangeLatencyMargin[k] = dml_min(v->ActiveDRAMClockChangeLatencyMargin[k], WritebackDRAMClockChangeLatencyMargin); } } - mode_lib->vba.MinActiveDRAMClockChangeMargin = 999999; + v->MinActiveDRAMClockChangeMargin = 999999; PlaneWithMinActiveDRAMClockChangeMargin = 0; - for (k = 0; k < NumberOfActivePlanes; ++k) { - if (mode_lib->vba.ActiveDRAMClockChangeLatencyMargin[k] < mode_lib->vba.MinActiveDRAMClockChangeMargin) { - mode_lib->vba.MinActiveDRAMClockChangeMargin = mode_lib->vba.ActiveDRAMClockChangeLatencyMargin[k]; - if (BlendingAndTiming[k] == k) { + for (k = 0; k < v->NumberOfActivePlanes; ++k) { + if (v->ActiveDRAMClockChangeLatencyMargin[k] < v->MinActiveDRAMClockChangeMargin) { + v->MinActiveDRAMClockChangeMargin = v->ActiveDRAMClockChangeLatencyMargin[k]; + if (v->BlendingAndTiming[k] == k) { PlaneWithMinActiveDRAMClockChangeMargin = k; } else { - for (j = 0; j < NumberOfActivePlanes; ++j) { - if (BlendingAndTiming[k] == j) { + for (j = 0; j < v->NumberOfActivePlanes; ++j) { + if (v->BlendingAndTiming[k] == j) { PlaneWithMinActiveDRAMClockChangeMargin = j; } } @@ -5244,40 +5089,40 @@ static void CalculateWatermarksAndDRAMSpeedChangeSupport( } } - *MinActiveDRAMClockChangeLatencySupported = mode_lib->vba.MinActiveDRAMClockChangeMargin + DRAMClockChangeLatency; + v->MinActiveDRAMClockChangeLatencySupported = v->MinActiveDRAMClockChangeMargin + v->FinalDRAMClockChangeLatency; SecondMinActiveDRAMClockChangeMarginOneDisplayInVBLank = 999999; - for (k = 0; k < NumberOfActivePlanes; ++k) { - if (!((k == PlaneWithMinActiveDRAMClockChangeMargin) && (BlendingAndTiming[k] == k)) && !(BlendingAndTiming[k] == PlaneWithMinActiveDRAMClockChangeMargin) && mode_lib->vba.ActiveDRAMClockChangeLatencyMargin[k] < SecondMinActiveDRAMClockChangeMarginOneDisplayInVBLank) { - SecondMinActiveDRAMClockChangeMarginOneDisplayInVBLank = mode_lib->vba.ActiveDRAMClockChangeLatencyMargin[k]; + for (k = 0; k < v->NumberOfActivePlanes; ++k) { + if (!((k == PlaneWithMinActiveDRAMClockChangeMargin) && (v->BlendingAndTiming[k] == k)) && !(v->BlendingAndTiming[k] == PlaneWithMinActiveDRAMClockChangeMargin) && v->ActiveDRAMClockChangeLatencyMargin[k] < SecondMinActiveDRAMClockChangeMarginOneDisplayInVBLank) { + SecondMinActiveDRAMClockChangeMarginOneDisplayInVBLank = v->ActiveDRAMClockChangeLatencyMargin[k]; } } - mode_lib->vba.TotalNumberOfActiveOTG = 0; - for (k = 0; k < NumberOfActivePlanes; ++k) { - if (BlendingAndTiming[k] == k) { - mode_lib->vba.TotalNumberOfActiveOTG = mode_lib->vba.TotalNumberOfActiveOTG + 1; + v->TotalNumberOfActiveOTG = 0; + for (k = 0; k < v->NumberOfActivePlanes; ++k) { + if (v->BlendingAndTiming[k] == k) { + v->TotalNumberOfActiveOTG = v->TotalNumberOfActiveOTG + 1; } } - if (mode_lib->vba.MinActiveDRAMClockChangeMargin > 0) { + if (v->MinActiveDRAMClockChangeMargin > 0) { *DRAMClockChangeSupport = dm_dram_clock_change_vactive; - } else if (((mode_lib->vba.SynchronizedVBlank == true || mode_lib->vba.TotalNumberOfActiveOTG == 1 || SecondMinActiveDRAMClockChangeMarginOneDisplayInVBLank > 0) && PrefetchMode == 0)) { + } else if (((v->SynchronizedVBlank == true || v->TotalNumberOfActiveOTG == 1 || SecondMinActiveDRAMClockChangeMarginOneDisplayInVBLank > 0) && PrefetchMode == 0)) { *DRAMClockChangeSupport = dm_dram_clock_change_vblank; } else { *DRAMClockChangeSupport = dm_dram_clock_change_unsupported; } FullDETBufferingTimeYStutterCriticalPlane = FullDETBufferingTimeY[0]; - for (k = 0; k < NumberOfActivePlanes; ++k) { + for (k = 0; k < v->NumberOfActivePlanes; ++k) { if (FullDETBufferingTimeY[k] <= FullDETBufferingTimeYStutterCriticalPlane) { FullDETBufferingTimeYStutterCriticalPlane = FullDETBufferingTimeY[k]; - TimeToFinishSwathTransferStutterCriticalPlane = (SwathHeightY[k] - (LinesInDETY[k] - LinesInDETYRoundedDownToSwath[k])) * (HTotal[k] / PixelClock[k]) / VRatio[k]; + TimeToFinishSwathTransferStutterCriticalPlane = (SwathHeightY[k] - (LinesInDETY[k] - LinesInDETYRoundedDownToSwath[k])) * (v->HTotal[k] / v->PixelClock[k]) / v->VRatio[k]; } } - *StutterExitWatermark = SRExitTime + ExtraLatency + 10 / DCFCLKDeepSleep; - *StutterEnterPlusExitWatermark = dml_max(SREnterPlusExitTime + ExtraLatency + 10 / DCFCLKDeepSleep, TimeToFinishSwathTransferStutterCriticalPlane); + v->StutterExitWatermark = v->SRExitTime + ExtraLatency + 10 / DCFCLKDeepSleep; + v->StutterEnterPlusExitWatermark = dml_max(v->SREnterPlusExitTime + ExtraLatency + 10 / DCFCLKDeepSleep, TimeToFinishSwathTransferStutterCriticalPlane); } From 2dcb5778cb8dd5b0854c44c3139c31fb5331aac4 Mon Sep 17 00:00:00 2001 From: Alex Deucher Date: Mon, 30 Jun 2025 10:47:09 -0400 Subject: [PATCH 0829/1321] drm/radeon: Remove __counted_by from ClockInfoArray.clockInfo[] clockInfo[] is a generic uchar pointer to variable sized structures which vary from ASIC to ASIC. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4374 Reviewed-by: Lijo Lazar Signed-off-by: Alex Deucher (cherry picked from commit dc135aa73561b5acc74eadf776e48530996529a3) Cc: stable@vger.kernel.org --- drivers/gpu/drm/radeon/pptable.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/radeon/pptable.h b/drivers/gpu/drm/radeon/pptable.h index 969a8fb0ee9e0..f4e71046dc91c 100644 --- a/drivers/gpu/drm/radeon/pptable.h +++ b/drivers/gpu/drm/radeon/pptable.h @@ -450,7 +450,7 @@ typedef struct _ClockInfoArray{ //sizeof(ATOM_PPLIB_CLOCK_INFO) UCHAR ucEntrySize; - UCHAR clockInfo[] __counted_by(ucNumEntries); + UCHAR clockInfo[] /*__counted_by(ucNumEntries)*/; }ClockInfoArray; typedef struct _NonClockInfoArray{ From 3f981e6f48aedbd4dadfefe262ccd39a9fb342ae Mon Sep 17 00:00:00 2001 From: Yang Wang Date: Thu, 11 Dec 2025 10:47:18 +0800 Subject: [PATCH 0830/1321] drm/amd/pm: fix wrong pcie parameter on navi1x fix wrong pcie dpm parameter on navi1x Fixes: 1a18607c07bb ("drm/amd/pm: override pcie dpm parameters only if it is necessary") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4671 Signed-off-by: Yang Wang Co-developed-by: Kenneth Feng Signed-off-by: Kenneth Feng Acked-by: Alex Deucher Reviewed-by: Lijo Lazar Signed-off-by: Alex Deucher (cherry picked from commit 5c5189cf4b0cc0a22bac74a40743ee711cff07f8) --- drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c index 7c9f77124ab26..16e3cc10891d0 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c @@ -2464,8 +2464,8 @@ static int navi10_update_pcie_parameters(struct smu_context *smu, pptable->PcieLaneCount[i] > pcie_width_cap ? pcie_width_cap : pptable->PcieLaneCount[i]; smu_pcie_arg = i << 16; - smu_pcie_arg |= pcie_gen_cap << 8; - smu_pcie_arg |= pcie_width_cap; + smu_pcie_arg |= dpm_context->dpm_tables.pcie_table.pcie_gen[i] << 8; + smu_pcie_arg |= dpm_context->dpm_tables.pcie_table.pcie_lane[i]; ret = smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_OverridePcieParameters, smu_pcie_arg, From e9fc0864ca4f5711f412d77d028b1dbeb7dc99f2 Mon Sep 17 00:00:00 2001 From: Yang Wang Date: Mon, 15 Dec 2025 17:51:11 +0800 Subject: [PATCH 0831/1321] drm/amd/pm: force send pcie parmater on navi1x v1: the PMFW didn't initialize the PCIe DPM parameters and requires the KMD to actively provide these parameters. v2: clean & remove unused code logic (lijo) Fixes: 1a18607c07bb ("drm/amd/pm: override pcie dpm parameters only if it is necessary") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4671 Signed-off-by: Yang Wang Reviewed-by: Lijo Lazar Signed-off-by: Alex Deucher (cherry picked from commit b0dbd5db7cf1f81e4aaedd25cb5e72ce369387b2) --- .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c | 33 +++++++++---------- 1 file changed, 15 insertions(+), 18 deletions(-) diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c index 16e3cc10891d0..c4966dcc68756 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c @@ -2455,24 +2455,21 @@ static int navi10_update_pcie_parameters(struct smu_context *smu, } for (i = 0; i < NUM_LINK_LEVELS; i++) { - if (pptable->PcieGenSpeed[i] > pcie_gen_cap || - pptable->PcieLaneCount[i] > pcie_width_cap) { - dpm_context->dpm_tables.pcie_table.pcie_gen[i] = - pptable->PcieGenSpeed[i] > pcie_gen_cap ? - pcie_gen_cap : pptable->PcieGenSpeed[i]; - dpm_context->dpm_tables.pcie_table.pcie_lane[i] = - pptable->PcieLaneCount[i] > pcie_width_cap ? - pcie_width_cap : pptable->PcieLaneCount[i]; - smu_pcie_arg = i << 16; - smu_pcie_arg |= dpm_context->dpm_tables.pcie_table.pcie_gen[i] << 8; - smu_pcie_arg |= dpm_context->dpm_tables.pcie_table.pcie_lane[i]; - ret = smu_cmn_send_smc_msg_with_param(smu, - SMU_MSG_OverridePcieParameters, - smu_pcie_arg, - NULL); - if (ret) - break; - } + dpm_context->dpm_tables.pcie_table.pcie_gen[i] = + pptable->PcieGenSpeed[i] > pcie_gen_cap ? + pcie_gen_cap : pptable->PcieGenSpeed[i]; + dpm_context->dpm_tables.pcie_table.pcie_lane[i] = + pptable->PcieLaneCount[i] > pcie_width_cap ? + pcie_width_cap : pptable->PcieLaneCount[i]; + smu_pcie_arg = i << 16; + smu_pcie_arg |= dpm_context->dpm_tables.pcie_table.pcie_gen[i] << 8; + smu_pcie_arg |= dpm_context->dpm_tables.pcie_table.pcie_lane[i]; + ret = smu_cmn_send_smc_msg_with_param(smu, + SMU_MSG_OverridePcieParameters, + smu_pcie_arg, + NULL); + if (ret) + return ret; } return ret; From 226b1c6ebb369f61ffeb058b378a608279ae7ea9 Mon Sep 17 00:00:00 2001 From: Alex Deucher Date: Thu, 13 Nov 2025 13:24:10 -0500 Subject: [PATCH 0832/1321] drm/amdgpu: don't reemit ring contents more than once MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit If we cancel a bad job and reemit the ring contents, and we get another timeout, cancel everything rather than reemitting. The wptr markers are only relevant for the original emit. If we reemit, the wptr markers are no longer correct. Reviewed-by: Timur Kristóf Signed-off-by: Alex Deucher (cherry picked from commit fb62a2067ca4555a6572d911e05919a311c010aa) --- drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 22 +++++++++++++++++----- drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 2 ++ 2 files changed, 19 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c index c7843e3363109..4f74a02a9a05c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c @@ -709,6 +709,7 @@ void amdgpu_fence_driver_guilty_force_completion(struct amdgpu_fence *af) struct amdgpu_ring *ring = af->ring; unsigned long flags; u32 seq, last_seq; + bool reemitted = false; last_seq = amdgpu_fence_read(ring) & ring->fence_drv.num_fences_mask; seq = ring->fence_drv.sync_seq & ring->fence_drv.num_fences_mask; @@ -726,7 +727,9 @@ void amdgpu_fence_driver_guilty_force_completion(struct amdgpu_fence *af) if (unprocessed && !dma_fence_is_signaled_locked(unprocessed)) { fence = container_of(unprocessed, struct amdgpu_fence, base); - if (fence == af) + if (fence->reemitted > 1) + reemitted = true; + else if (fence == af) dma_fence_set_error(&fence->base, -ETIME); else if (fence->context == af->context) dma_fence_set_error(&fence->base, -ECANCELED); @@ -734,9 +737,16 @@ void amdgpu_fence_driver_guilty_force_completion(struct amdgpu_fence *af) rcu_read_unlock(); } while (last_seq != seq); spin_unlock_irqrestore(&ring->fence_drv.lock, flags); - /* signal the guilty fence */ - amdgpu_fence_write(ring, (u32)af->base.seqno); - amdgpu_fence_process(ring); + + if (reemitted) { + /* if we've already reemitted once then just cancel everything */ + amdgpu_fence_driver_force_completion(af->ring); + af->ring->ring_backup_entries_to_copy = 0; + } else { + /* signal the guilty fence */ + amdgpu_fence_write(ring, (u32)af->base.seqno); + amdgpu_fence_process(ring); + } } void amdgpu_fence_save_wptr(struct amdgpu_fence *af) @@ -784,10 +794,12 @@ void amdgpu_ring_backup_unprocessed_commands(struct amdgpu_ring *ring, /* save everything if the ring is not guilty, otherwise * just save the content from other contexts. */ - if (!guilty_fence || (fence->context != guilty_fence->context)) + if (!fence->reemitted && + (!guilty_fence || (fence->context != guilty_fence->context))) amdgpu_ring_backup_unprocessed_command(ring, wptr, fence->wptr); wptr = fence->wptr; + fence->reemitted++; } rcu_read_unlock(); } while (last_seq != seq); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h index 7a27c6c4bb440..5044cf9e45fb7 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h @@ -148,6 +148,8 @@ struct amdgpu_fence { u64 wptr; /* fence context for resets */ u64 context; + /* has this fence been reemitted */ + unsigned int reemitted; }; extern const struct drm_sched_backend_ops amdgpu_sched_ops; From d97504606bf3b1063d38b0ec8992bce5e6d045ec Mon Sep 17 00:00:00 2001 From: Alex Deucher Date: Thu, 13 Nov 2025 14:12:10 -0500 Subject: [PATCH 0833/1321] drm/amdgpu: always backup and reemit fences MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit If when we backup the ring contents for reemit before a ring reset, we skip jobs associated with the bad context, however, we need to make sure the fences are reemited as unprocessed submissions may depend on them. v2: clean up fence handling, make helpers static Reviewed-by: Timur Kristóf Signed-off-by: Alex Deucher (cherry picked from commit 155a748f14bc0b72783994dea7c5a12276730342) --- drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 24 ++++++++++++++++++----- drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 5 ++++- 2 files changed, 23 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c index 4f74a02a9a05c..06c333b2213b0 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c @@ -89,6 +89,16 @@ static u32 amdgpu_fence_read(struct amdgpu_ring *ring) return seq; } +static void amdgpu_fence_save_fence_wptr_start(struct amdgpu_fence *af) +{ + af->fence_wptr_start = af->ring->wptr; +} + +static void amdgpu_fence_save_fence_wptr_end(struct amdgpu_fence *af) +{ + af->fence_wptr_end = af->ring->wptr; +} + /** * amdgpu_fence_emit - emit a fence on the requested ring * @@ -116,8 +126,10 @@ int amdgpu_fence_emit(struct amdgpu_ring *ring, struct amdgpu_fence *af, &ring->fence_drv.lock, adev->fence_context + ring->idx, seq); + amdgpu_fence_save_fence_wptr_start(af); amdgpu_ring_emit_fence(ring, ring->fence_drv.gpu_addr, seq, flags | AMDGPU_FENCE_FLAG_INT); + amdgpu_fence_save_fence_wptr_end(af); amdgpu_fence_save_wptr(af); pm_runtime_get_noresume(adev_to_drm(adev)->dev); ptr = &ring->fence_drv.fences[seq & ring->fence_drv.num_fences_mask]; @@ -742,10 +754,6 @@ void amdgpu_fence_driver_guilty_force_completion(struct amdgpu_fence *af) /* if we've already reemitted once then just cancel everything */ amdgpu_fence_driver_force_completion(af->ring); af->ring->ring_backup_entries_to_copy = 0; - } else { - /* signal the guilty fence */ - amdgpu_fence_write(ring, (u32)af->base.seqno); - amdgpu_fence_process(ring); } } @@ -795,9 +803,15 @@ void amdgpu_ring_backup_unprocessed_commands(struct amdgpu_ring *ring, * just save the content from other contexts. */ if (!fence->reemitted && - (!guilty_fence || (fence->context != guilty_fence->context))) + (!guilty_fence || (fence->context != guilty_fence->context))) { amdgpu_ring_backup_unprocessed_command(ring, wptr, fence->wptr); + } else if (!fence->reemitted) { + /* always save the fence */ + amdgpu_ring_backup_unprocessed_command(ring, + fence->fence_wptr_start, + fence->fence_wptr_end); + } wptr = fence->wptr; fence->reemitted++; } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h index 5044cf9e45fb7..055437d4edf9f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h @@ -144,12 +144,15 @@ struct amdgpu_fence { struct amdgpu_ring *ring; ktime_t start_timestamp; - /* wptr for the fence for resets */ + /* wptr for the total submission for resets */ u64 wptr; /* fence context for resets */ u64 context; /* has this fence been reemitted */ unsigned int reemitted; + /* wptr for the fence for the submission */ + u64 fence_wptr_start; + u64 fence_wptr_end; }; extern const struct drm_sched_backend_ops amdgpu_sched_ops; From 57b4c5f0630ef0e490ddab206966b1050bd412b6 Mon Sep 17 00:00:00 2001 From: Pratap Nirujogi Date: Tue, 9 Dec 2025 20:22:15 -0500 Subject: [PATCH 0834/1321] drm/amd/amdgpu: Fix SMU warning during isp suspend-resume ISP mfd child devices are using genpd and the system suspend-resume operations between genpd and amdgpu parent device which uses only runtime suspend-resume are not in sync. Linux power manager during suspend-resume resuming the genpd devices earlier than the amdgpu parent device. This is resulting in the below warning as SMU is in suspended state when genpd attempts to resume ISP. WARNING: CPU: 13 PID: 5435 at drivers/gpu/drm/amd/amdgpu/../pm/swsmu/amdgpu_smu.c:398 smu_dpm_set_power_gate+0x36f/0x380 [amdgpu] To fix this warning isp suspend-resume is handled as part of amdgpu parent device suspend-resume instead of genpd sequence. Each ISP MFD child device is marked as dev_pm_syscore_device to skip genpd suspend-resume and use pm_runtime_force api's to suspend-resume the devices when callbacks from amdgpu are received. Co-developed-by: Gjorgji Rosikopulos Signed-off-by: Gjorgji Rosikopulos Signed-off-by: Bin Du Signed-off-by: Pratap Nirujogi Reviewed-by: Mario Limonciello (AMD) Signed-off-by: Alex Deucher (cherry picked from commit 0288a345f19b2162546352161509bb24614729e1) --- drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c | 24 +++++++++++++++ drivers/gpu/drm/amd/amdgpu/amdgpu_isp.h | 2 ++ drivers/gpu/drm/amd/amdgpu/isp_v4_1_1.c | 41 +++++++++++++++++++++++++ 3 files changed, 67 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c index 37270c4dab8dd..532f83d783d12 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c @@ -318,12 +318,36 @@ void isp_kernel_buffer_free(void **buf_obj, u64 *gpu_addr, void **cpu_addr) } EXPORT_SYMBOL(isp_kernel_buffer_free); +static int isp_resume(struct amdgpu_ip_block *ip_block) +{ + struct amdgpu_device *adev = ip_block->adev; + struct amdgpu_isp *isp = &adev->isp; + + if (isp->funcs->hw_resume) + return isp->funcs->hw_resume(isp); + + return -ENODEV; +} + +static int isp_suspend(struct amdgpu_ip_block *ip_block) +{ + struct amdgpu_device *adev = ip_block->adev; + struct amdgpu_isp *isp = &adev->isp; + + if (isp->funcs->hw_suspend) + return isp->funcs->hw_suspend(isp); + + return -ENODEV; +} + static const struct amd_ip_funcs isp_ip_funcs = { .name = "isp_ip", .early_init = isp_early_init, .hw_init = isp_hw_init, .hw_fini = isp_hw_fini, .is_idle = isp_is_idle, + .suspend = isp_suspend, + .resume = isp_resume, .set_clockgating_state = isp_set_clockgating_state, .set_powergating_state = isp_set_powergating_state, }; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.h index d6f4ffa4c97c7..9a5d2b1dff9e7 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.h @@ -38,6 +38,8 @@ struct amdgpu_isp; struct isp_funcs { int (*hw_init)(struct amdgpu_isp *isp); int (*hw_fini)(struct amdgpu_isp *isp); + int (*hw_suspend)(struct amdgpu_isp *isp); + int (*hw_resume)(struct amdgpu_isp *isp); }; struct amdgpu_isp { diff --git a/drivers/gpu/drm/amd/amdgpu/isp_v4_1_1.c b/drivers/gpu/drm/amd/amdgpu/isp_v4_1_1.c index 4258d3e0b706c..0002bcc6c4ec0 100644 --- a/drivers/gpu/drm/amd/amdgpu/isp_v4_1_1.c +++ b/drivers/gpu/drm/amd/amdgpu/isp_v4_1_1.c @@ -26,6 +26,7 @@ */ #include +#include #include "amdgpu.h" #include "isp_v4_1_1.h" @@ -145,6 +146,9 @@ static int isp_genpd_add_device(struct device *dev, void *data) return -ENODEV; } + /* The devices will be managed by the pm ops from the parent */ + dev_pm_syscore_device(dev, true); + exit: /* Continue to add */ return 0; @@ -177,12 +181,47 @@ static int isp_genpd_remove_device(struct device *dev, void *data) drm_err(&adev->ddev, "Failed to remove dev from genpd %d\n", ret); return -ENODEV; } + dev_pm_syscore_device(dev, false); exit: /* Continue to remove */ return 0; } +static int isp_suspend_device(struct device *dev, void *data) +{ + return pm_runtime_force_suspend(dev); +} + +static int isp_resume_device(struct device *dev, void *data) +{ + return pm_runtime_force_resume(dev); +} + +static int isp_v4_1_1_hw_suspend(struct amdgpu_isp *isp) +{ + int r; + + r = device_for_each_child(isp->parent, NULL, + isp_suspend_device); + if (r) + dev_err(isp->parent, "failed to suspend hw devices (%d)\n", r); + + return r; +} + +static int isp_v4_1_1_hw_resume(struct amdgpu_isp *isp) +{ + int r; + + r = device_for_each_child(isp->parent, NULL, + isp_resume_device); + if (r) + dev_err(isp->parent, "failed to resume hw device (%d)\n", r); + + return r; +} + static int isp_v4_1_1_hw_init(struct amdgpu_isp *isp) { const struct software_node *amd_camera_node, *isp4_node; @@ -369,6 +408,8 @@ static int isp_v4_1_1_hw_fini(struct amdgpu_isp *isp) static const struct isp_funcs isp_v4_1_1_funcs = { .hw_init = isp_v4_1_1_hw_init, .hw_fini = isp_v4_1_1_hw_fini, + .hw_suspend = isp_v4_1_1_hw_suspend, + .hw_resume = isp_v4_1_1_hw_resume, }; void isp_v4_1_1_set_isp_funcs(struct amdgpu_isp *isp) From 73ddd2149867572dd6ab24190ac68bcce9af1c21 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Timur=20Krist=C3=B3f?= Date: Sat, 6 Dec 2025 03:31:03 +0100 Subject: [PATCH 0835/1321] drm/amd/display: Correct color depth for SelectCRTC_Source MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Pass the correct enum values as expected by the VBIOS. Previously the actual bit depth integer value was passed, which was a mistake. Fixes: 7fb4f254c8eb ("drm/amd/display: Add SelectCRTC_Source to BIOS parser") Reviewed-by: Alex Deucher Signed-off-by: Timur Kristóf Signed-off-by: Alex Deucher (cherry picked from commit cdf6e4c0cdab129ffc4e41a8ac53a0738f805072) --- .../drm/amd/display/dc/bios/command_table.c | 25 ++++++++++++++++- .../amd/display/dc/hwss/dce110/dce110_hwseq.c | 28 +------------------ .../amd/display/include/bios_parser_types.h | 2 +- 3 files changed, 26 insertions(+), 29 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/bios/command_table.c b/drivers/gpu/drm/amd/display/dc/bios/command_table.c index 22457f417e65c..d56c0d3763ddd 100644 --- a/drivers/gpu/drm/amd/display/dc/bios/command_table.c +++ b/drivers/gpu/drm/amd/display/dc/bios/command_table.c @@ -1797,7 +1797,30 @@ static enum bp_result select_crtc_source_v3( ¶ms.ucEncodeMode)) return BP_RESULT_BADINPUT; - params.ucDstBpc = bp_params->bit_depth; + switch (bp_params->color_depth) { + case COLOR_DEPTH_UNDEFINED: + params.ucDstBpc = PANEL_BPC_UNDEFINE; + break; + case COLOR_DEPTH_666: + params.ucDstBpc = PANEL_6BIT_PER_COLOR; + break; + default: + case COLOR_DEPTH_888: + params.ucDstBpc = PANEL_8BIT_PER_COLOR; + break; + case COLOR_DEPTH_101010: + params.ucDstBpc = PANEL_10BIT_PER_COLOR; + break; + case COLOR_DEPTH_121212: + params.ucDstBpc = PANEL_12BIT_PER_COLOR; + break; + case COLOR_DEPTH_141414: + dm_error("14-bit color not supported by SelectCRTC_Source v3\n"); + break; + case COLOR_DEPTH_161616: + params.ucDstBpc = PANEL_16BIT_PER_COLOR; + break; + } if (EXEC_BIOS_CMD_TABLE(SelectCRTC_Source, params)) result = BP_RESULT_OK; diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dce110/dce110_hwseq.c b/drivers/gpu/drm/amd/display/dc/hwss/dce110/dce110_hwseq.c index 0cdd8c74abdfa..ebd74b43e935e 100644 --- a/drivers/gpu/drm/amd/display/dc/hwss/dce110/dce110_hwseq.c +++ b/drivers/gpu/drm/amd/display/dc/hwss/dce110/dce110_hwseq.c @@ -1610,38 +1610,12 @@ dce110_select_crtc_source(struct pipe_ctx *pipe_ctx) struct dc_bios *bios = link->ctx->dc_bios; struct bp_crtc_source_select crtc_source_select = {0}; enum engine_id engine_id = link->link_enc->preferred_engine; - uint8_t bit_depth; if (dc_is_rgb_signal(pipe_ctx->stream->signal)) engine_id = link->link_enc->analog_engine; - switch (pipe_ctx->stream->timing.display_color_depth) { - case COLOR_DEPTH_UNDEFINED: - bit_depth = 0; - break; - case COLOR_DEPTH_666: - bit_depth = 6; - break; - default: - case COLOR_DEPTH_888: - bit_depth = 8; - break; - case COLOR_DEPTH_101010: - bit_depth = 10; - break; - case COLOR_DEPTH_121212: - bit_depth = 12; - break; - case COLOR_DEPTH_141414: - bit_depth = 14; - break; - case COLOR_DEPTH_161616: - bit_depth = 16; - break; - } - crtc_source_select.controller_id = CONTROLLER_ID_D0 + pipe_ctx->stream_res.tg->inst; - crtc_source_select.bit_depth = bit_depth; + crtc_source_select.color_depth = pipe_ctx->stream->timing.display_color_depth; crtc_source_select.engine_id = engine_id; crtc_source_select.sink_signal = pipe_ctx->stream->signal; diff --git a/drivers/gpu/drm/amd/display/include/bios_parser_types.h b/drivers/gpu/drm/amd/display/include/bios_parser_types.h index 973b6bdbac63e..f40dc612ec73b 100644 --- a/drivers/gpu/drm/amd/display/include/bios_parser_types.h +++ b/drivers/gpu/drm/amd/display/include/bios_parser_types.h @@ -136,7 +136,7 @@ struct bp_crtc_source_select { enum engine_id engine_id; enum controller_id controller_id; enum signal_type sink_signal; - uint8_t bit_depth; + enum dc_color_depth color_depth; }; struct bp_transmitter_control { From 7f1b37d5cfcac0cabee7faefc5c3e69f6474c13e Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Timur=20Krist=C3=B3f?= Date: Sat, 6 Dec 2025 03:31:04 +0100 Subject: [PATCH 0836/1321] drm/amd/display: Add missing encoder setup to DACnEncoderControl MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Apparently the DAC encoder needs to be set up before use. The BIOS parser in DC did not support this so I assumed it was not necessary, but the DAC doesn't work without it on some GPUs. Fixes: 69b29b894660 ("drm/amd/display: Hook up DAC to bios_parser_encoder_control") Reviewed-by: Alex Deucher Signed-off-by: Timur Kristóf Signed-off-by: Alex Deucher (cherry picked from commit bb5dfe2f5630ce344c654c705d28b4e20cb9d334) --- .../gpu/drm/amd/display/dc/bios/bios_parser.c | 4 ++-- .../drm/amd/display/dc/bios/command_table.c | 19 +++++++++++-------- .../drm/amd/display/dc/bios/command_table.h | 4 ++-- 3 files changed, 15 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/bios/bios_parser.c b/drivers/gpu/drm/amd/display/dc/bios/bios_parser.c index d1471f34e4192..9f11e6ca40510 100644 --- a/drivers/gpu/drm/amd/display/dc/bios/bios_parser.c +++ b/drivers/gpu/drm/amd/display/dc/bios/bios_parser.c @@ -763,14 +763,14 @@ static enum bp_result bios_parser_encoder_control( return BP_RESULT_FAILURE; return bp->cmd_tbl.dac1_encoder_control( - bp, cntl->action == ENCODER_CONTROL_ENABLE, + bp, cntl->action, cntl->pixel_clock, ATOM_DAC1_PS2); } else if (cntl->engine_id == ENGINE_ID_DACB) { if (!bp->cmd_tbl.dac2_encoder_control) return BP_RESULT_FAILURE; return bp->cmd_tbl.dac2_encoder_control( - bp, cntl->action == ENCODER_CONTROL_ENABLE, + bp, cntl->action, cntl->pixel_clock, ATOM_DAC1_PS2); } diff --git a/drivers/gpu/drm/amd/display/dc/bios/command_table.c b/drivers/gpu/drm/amd/display/dc/bios/command_table.c index d56c0d3763ddd..76a3559f0ddc1 100644 --- a/drivers/gpu/drm/amd/display/dc/bios/command_table.c +++ b/drivers/gpu/drm/amd/display/dc/bios/command_table.c @@ -1838,12 +1838,12 @@ static enum bp_result select_crtc_source_v3( static enum bp_result dac1_encoder_control_v1( struct bios_parser *bp, - bool enable, + enum bp_encoder_control_action action, uint32_t pixel_clock, uint8_t dac_standard); static enum bp_result dac2_encoder_control_v1( struct bios_parser *bp, - bool enable, + enum bp_encoder_control_action action, uint32_t pixel_clock, uint8_t dac_standard); @@ -1869,12 +1869,15 @@ static void init_dac_encoder_control(struct bios_parser *bp) static void dac_encoder_control_prepare_params( DAC_ENCODER_CONTROL_PS_ALLOCATION *params, - bool enable, + enum bp_encoder_control_action action, uint32_t pixel_clock, uint8_t dac_standard) { params->ucDacStandard = dac_standard; - if (enable) + if (action == ENCODER_CONTROL_SETUP || + action == ENCODER_CONTROL_INIT) + params->ucAction = ATOM_ENCODER_INIT; + else if (action == ENCODER_CONTROL_ENABLE) params->ucAction = ATOM_ENABLE; else params->ucAction = ATOM_DISABLE; @@ -1887,7 +1890,7 @@ static void dac_encoder_control_prepare_params( static enum bp_result dac1_encoder_control_v1( struct bios_parser *bp, - bool enable, + enum bp_encoder_control_action action, uint32_t pixel_clock, uint8_t dac_standard) { @@ -1896,7 +1899,7 @@ static enum bp_result dac1_encoder_control_v1( dac_encoder_control_prepare_params( ¶ms, - enable, + action, pixel_clock, dac_standard); @@ -1908,7 +1911,7 @@ static enum bp_result dac1_encoder_control_v1( static enum bp_result dac2_encoder_control_v1( struct bios_parser *bp, - bool enable, + enum bp_encoder_control_action action, uint32_t pixel_clock, uint8_t dac_standard) { @@ -1917,7 +1920,7 @@ static enum bp_result dac2_encoder_control_v1( dac_encoder_control_prepare_params( ¶ms, - enable, + action, pixel_clock, dac_standard); diff --git a/drivers/gpu/drm/amd/display/dc/bios/command_table.h b/drivers/gpu/drm/amd/display/dc/bios/command_table.h index e89b1ba0048b9..78bdbcaa61c88 100644 --- a/drivers/gpu/drm/amd/display/dc/bios/command_table.h +++ b/drivers/gpu/drm/amd/display/dc/bios/command_table.h @@ -57,12 +57,12 @@ struct cmd_tbl { struct bp_crtc_source_select *bp_params); enum bp_result (*dac1_encoder_control)( struct bios_parser *bp, - bool enable, + enum bp_encoder_control_action action, uint32_t pixel_clock, uint8_t dac_standard); enum bp_result (*dac2_encoder_control)( struct bios_parser *bp, - bool enable, + enum bp_encoder_control_action action, uint32_t pixel_clock, uint8_t dac_standard); enum bp_result (*dac1_output_control)( From d097b4f0f0d57737c2c111d35720536e58a56475 Mon Sep 17 00:00:00 2001 From: Alan Liu Date: Mon, 22 Dec 2025 12:26:35 +0800 Subject: [PATCH 0837/1321] drm/amdgpu: Fix query for VPE block_type and ip_count [Why] Query for VPE block_type and ip_count is missing. [How] Add VPE case in ip_block_type and hw_ip_count query. Reviewed-by: Lang Yu Signed-off-by: Alan Liu Signed-off-by: Alex Deucher (cherry picked from commit a6ea0a430aca5932b9c75d8e38deeb45665dd2ae) Cc: stable@vger.kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c index 6ee77f431d568..f65edd80cabfa 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c @@ -201,6 +201,9 @@ static enum amd_ip_block_type type = (amdgpu_device_ip_get_ip_block(adev, AMD_IP_BLOCK_TYPE_JPEG)) ? AMD_IP_BLOCK_TYPE_JPEG : AMD_IP_BLOCK_TYPE_VCN; break; + case AMDGPU_HW_IP_VPE: + type = AMD_IP_BLOCK_TYPE_VPE; + break; default: type = AMD_IP_BLOCK_TYPE_NUM; break; @@ -721,6 +724,9 @@ int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file *filp) case AMD_IP_BLOCK_TYPE_UVD: count = adev->uvd.num_uvd_inst; break; + case AMD_IP_BLOCK_TYPE_VPE: + count = adev->vpe.num_instances; + break; /* For all other IP block types not listed in the switch statement * the ip status is valid here and the instance count is one. */ From b26ca087fee421394658bc3995cdb6b12954650e Mon Sep 17 00:00:00 2001 From: Perry Yuan Date: Thu, 25 Dec 2025 16:43:49 +0800 Subject: [PATCH 0838/1321] drm/amd/pm: Disable MMIO access during SMU Mode 1 reset During Mode 1 reset, the ASIC undergoes a reset cycle and becomes temporarily inaccessible via PCIe. Any attempt to access MMIO registers during this window (e.g., from interrupt handlers or other driver threads) can result in uncompleted PCIe transactions, leading to NMI panics or system hangs. To prevent this, set the `no_hw_access` flag to true immediately after triggering the reset. This signals other driver components to skip register accesses while the device is offline. A memory barrier `smp_mb()` is added to ensure the flag update is globally visible to all cores before the driver enters the sleep/wait state. Signed-off-by: Perry Yuan Reviewed-by: Yifan Zhang Signed-off-by: Alex Deucher (cherry picked from commit 7edb503fe4b6d67f47d8bb0dfafb8e699bb0f8a4) --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 +++ drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c | 7 ++++++- drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0_2_ppt.c | 9 +++++++-- 3 files changed, 16 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 12201b8e99b3f..25f49be4d0bd4 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -5867,6 +5867,9 @@ int amdgpu_device_mode1_reset(struct amdgpu_device *adev) if (ret) goto mode1_reset_failed; + /* enable mmio access after mode 1 reset completed */ + adev->no_hw_access = false; + amdgpu_device_load_pci_state(adev->pdev); ret = amdgpu_psp_wait_for_bootloader(adev); if (ret) diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c index 677781060246f..eaeff6a9bc50f 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c @@ -2923,8 +2923,13 @@ static int smu_v13_0_0_mode1_reset(struct smu_context *smu) break; } - if (!ret) + if (!ret) { + /* disable mmio access while doing mode 1 reset*/ + smu->adev->no_hw_access = true; + /* ensure no_hw_access is globally visible before any MMIO */ + smp_mb(); msleep(SMU13_MODE1_RESET_WAIT_TIME_IN_MS); + } return ret; } diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0_2_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0_2_ppt.c index 2cea688c604f2..33c3cd2e1e244 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0_2_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0_2_ppt.c @@ -2143,10 +2143,15 @@ static int smu_v14_0_2_mode1_reset(struct smu_context *smu) ret = smu_cmn_send_debug_smc_msg(smu, DEBUGSMC_MSG_Mode1Reset); if (!ret) { - if (amdgpu_emu_mode == 1) + if (amdgpu_emu_mode == 1) { msleep(50000); - else + } else { + /* disable mmio access while doing mode 1 reset*/ + smu->adev->no_hw_access = true; + /* ensure no_hw_access is globally visible before any MMIO */ + smp_mb(); msleep(1000); + } } return ret; From e54f3cd07e020add89c24f20ec3fa326153eefd1 Mon Sep 17 00:00:00 2001 From: Alex Hung Date: Mon, 8 Dec 2025 12:11:43 -0700 Subject: [PATCH 0839/1321] drm/amd/display: Check NULL before calling dac_load_detection dac_load_detection can be NULL in some scenario, so checking it before calling. Reviewed-by: Harry Wentland Signed-off-by: Alex Hung Signed-off-by: Chenyu Chen Tested-by: Daniel Wheeler Signed-off-by: Alex Deucher (cherry picked from commit 179176134b535246f0b368b30e8ecad50066f896) --- drivers/gpu/drm/amd/display/dc/link/link_detection.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/link/link_detection.c b/drivers/gpu/drm/amd/display/dc/link/link_detection.c index 6d31f4967f1a9..e1940b8e5bc3f 100644 --- a/drivers/gpu/drm/amd/display/dc/link/link_detection.c +++ b/drivers/gpu/drm/amd/display/dc/link/link_detection.c @@ -932,7 +932,7 @@ static bool link_detect_dac_load_detect(struct dc_link *link) struct link_encoder *link_enc = link->link_enc; enum engine_id engine_id = link_enc->preferred_engine; enum dal_device_type device_type = DEVICE_TYPE_CRT; - enum bp_result bp_result; + enum bp_result bp_result = BP_RESULT_UNSUPPORTED; uint32_t enum_id; switch (engine_id) { @@ -946,7 +946,9 @@ static bool link_detect_dac_load_detect(struct dc_link *link) break; } - bp_result = bios->funcs->dac_load_detection(bios, engine_id, device_type, enum_id); + if (bios->funcs->dac_load_detection) + bp_result = bios->funcs->dac_load_detection(bios, engine_id, device_type, enum_id); + return bp_result == BP_RESULT_OK; } From c0fff105d987392299ce1fda925f91b2dc898f02 Mon Sep 17 00:00:00 2001 From: "Mario Limonciello (AMD)" Date: Sat, 29 Nov 2025 19:46:31 -0600 Subject: [PATCH 0840/1321] Reapply "Revert "drm/amd: Skip power ungate during suspend for VPE"" Skipping power ungate exposed some scenarios that will fail like below: ``` amdgpu: Register(0) [regVPEC_QUEUE_RESET_REQ] failed to reach value 0x00000000 != 0x00000001n amdgpu 0000:c1:00.0: amdgpu: VPE queue reset failed ... amdgpu: [drm] *ERROR* wait_for_completion_timeout timeout! ``` The underlying s2idle issue that prompted this commit is going to be fixed in BIOS. This reverts commit 2a6c826cfeedd7714611ac115371a959ead55bda. This was lost in the 6.19 merge so reapply it. Fixes: 2a6c826cfeed ("drm/amd: Skip power ungate during suspend for VPE") Signed-off-by: Mario Limonciello (AMD) Acked-by: Alex Deucher Reported-by: Konstantin Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220812 Reported-by: Matthew Schwartz Signed-off-by: Alex Deucher (cherry picked from commit 3925683515e93844be204381d2d5a1df5de34f31) --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 25f49be4d0bd4..d5c44bd34d456 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -3445,11 +3445,10 @@ int amdgpu_device_set_pg_state(struct amdgpu_device *adev, (adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_GFX || adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_SDMA)) continue; - /* skip CG for VCE/UVD/VPE, it's handled specially */ + /* skip CG for VCE/UVD, it's handled specially */ if (adev->ip_blocks[i].version->type != AMD_IP_BLOCK_TYPE_UVD && adev->ip_blocks[i].version->type != AMD_IP_BLOCK_TYPE_VCE && adev->ip_blocks[i].version->type != AMD_IP_BLOCK_TYPE_VCN && - adev->ip_blocks[i].version->type != AMD_IP_BLOCK_TYPE_VPE && adev->ip_blocks[i].version->type != AMD_IP_BLOCK_TYPE_JPEG && adev->ip_blocks[i].version->funcs->set_powergating_state) { /* enable powergating to save power */ From e44b8452c5188bdc7700521df72a50b986717e10 Mon Sep 17 00:00:00 2001 From: Bartosz Golaszewski Date: Wed, 10 Dec 2025 06:50:26 +0100 Subject: [PATCH 0841/1321] gpio: it87: balance superio enter/exit calls in error path We always call superio_enter() in it87_gpio_direction_out() but only call superio_exit() if the call to it87_gpio_set() succeeds. Move the label to balance the calls in error path as well. Fixes: ef877a159072 ("gpio: it87: use new line value setter callbacks") Reported-by: Daniel Gibson Closes: https://lore.kernel.org/all/bd0a00e3-9b8c-43e8-8772-e67b91f4c71e@gibson.sh/ Link: https://lore.kernel.org/r/20251210055026.23146-1-bartosz.golaszewski@oss.qualcomm.com Signed-off-by: Bartosz Golaszewski --- drivers/gpio/gpio-it87.c | 11 +++-------- 1 file changed, 3 insertions(+), 8 deletions(-) diff --git a/drivers/gpio/gpio-it87.c b/drivers/gpio/gpio-it87.c index 5d677bcfccf26..2ad3c239367bc 100644 --- a/drivers/gpio/gpio-it87.c +++ b/drivers/gpio/gpio-it87.c @@ -12,6 +12,7 @@ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt +#include #include #include #include @@ -241,23 +242,17 @@ static int it87_gpio_direction_out(struct gpio_chip *chip, mask = 1 << (gpio_num % 8); group = (gpio_num / 8); - spin_lock(&it87_gpio->lock); + guard(spinlock)(&it87_gpio->lock); rc = superio_enter(); if (rc) - goto exit; + return rc; /* set the output enable bit */ superio_set_mask(mask, group + it87_gpio->output_base); rc = it87_gpio_set(chip, gpio_num, val); - if (rc) - goto exit; - superio_exit(); - -exit: - spin_unlock(&it87_gpio->lock); return rc; } From 75dc8b32e524e45d214bfbd77b5ebde0b02c897e Mon Sep 17 00:00:00 2001 From: Bartosz Golaszewski Date: Mon, 22 Dec 2025 11:01:26 +0100 Subject: [PATCH 0842/1321] gpiolib: allow multiple lookup tables per consumer The GPIO machine lookup mechanism dates back to old ARM board files where lookup tables would all be defined in a single place. Since then, lookup tables have also been used to address various DT and ACPI quirks like missing GPIOs and - as of recently - setting up shared GPIO proxy devices. The lookup itself stops when we find the first matching table for a consumer and - if it doesn't contain the right entry - the lookup fails. Since the tables can now be defined in multiple places and we can't know how many there are, effectively limiting a consumer to only a single lookup table is no longer viable. Rework the code to always look through all tables matching the consumer. Link: https://lore.kernel.org/r/20251222-gpio-shared-reset-gpio-proxy-v1-1-8d4bba7d8c14@oss.qualcomm.com Signed-off-by: Bartosz Golaszewski --- drivers/gpio/gpiolib.c | 91 ++++++++++++++++++++++++++---------------- 1 file changed, 56 insertions(+), 35 deletions(-) diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c index 91e0c384f34ae..6d583b3b07bb5 100644 --- a/drivers/gpio/gpiolib.c +++ b/drivers/gpio/gpiolib.c @@ -4508,45 +4508,41 @@ void gpiod_remove_hogs(struct gpiod_hog *hogs) } EXPORT_SYMBOL_GPL(gpiod_remove_hogs); -static struct gpiod_lookup_table *gpiod_find_lookup_table(struct device *dev) +static bool gpiod_match_lookup_table(struct device *dev, + const struct gpiod_lookup_table *table) { const char *dev_id = dev ? dev_name(dev) : NULL; - struct gpiod_lookup_table *table; - list_for_each_entry(table, &gpio_lookup_list, list) { - if (table->dev_id && dev_id) { - /* - * Valid strings on both ends, must be identical to have - * a match - */ - if (!strcmp(table->dev_id, dev_id)) - return table; - } else { - /* - * One of the pointers is NULL, so both must be to have - * a match - */ - if (dev_id == table->dev_id) - return table; - } + lockdep_assert_held(&gpio_lookup_lock); + + if (table->dev_id && dev_id) { + /* + * Valid strings on both ends, must be identical to have + * a match + */ + if (!strcmp(table->dev_id, dev_id)) + return true; + } else { + /* + * One of the pointers is NULL, so both must be to have + * a match + */ + if (dev_id == table->dev_id) + return true; } - return NULL; + return false; } -static struct gpio_desc *gpiod_find(struct device *dev, const char *con_id, - unsigned int idx, unsigned long *flags) +static struct gpio_desc *gpio_desc_table_match(struct device *dev, const char *con_id, + unsigned int idx, unsigned long *flags, + struct gpiod_lookup_table *table) { - struct gpio_desc *desc = ERR_PTR(-ENOENT); - struct gpiod_lookup_table *table; + struct gpio_desc *desc; struct gpiod_lookup *p; struct gpio_chip *gc; - guard(mutex)(&gpio_lookup_lock); - - table = gpiod_find_lookup_table(dev); - if (!table) - return desc; + lockdep_assert_held(&gpio_lookup_lock); for (p = &table->table[0]; p->key; p++) { /* idx must always match exactly */ @@ -4600,6 +4596,29 @@ static struct gpio_desc *gpiod_find(struct device *dev, const char *con_id, return desc; } + return NULL; +} + +static struct gpio_desc *gpiod_find(struct device *dev, const char *con_id, + unsigned int idx, unsigned long *flags) +{ + struct gpio_desc *desc = ERR_PTR(-ENOENT); + struct gpiod_lookup_table *table; + + guard(mutex)(&gpio_lookup_lock); + + list_for_each_entry(table, &gpio_lookup_list, list) { + if (!gpiod_match_lookup_table(dev, table)) + continue; + + desc = gpio_desc_table_match(dev, con_id, idx, flags, table); + if (!desc) + continue; + + /* On IS_ERR() or match. */ + return desc; + } + return desc; } @@ -4610,14 +4629,16 @@ static int platform_gpio_count(struct device *dev, const char *con_id) unsigned int count = 0; scoped_guard(mutex, &gpio_lookup_lock) { - table = gpiod_find_lookup_table(dev); - if (!table) - return -ENOENT; + list_for_each_entry(table, &gpio_lookup_list, list) { + if (!gpiod_match_lookup_table(dev, table)) + continue; - for (p = &table->table[0]; p->key; p++) { - if ((con_id && p->con_id && !strcmp(con_id, p->con_id)) || - (!con_id && !p->con_id)) - count++; + for (p = &table->table[0]; p->key; p++) { + if ((con_id && p->con_id && + !strcmp(con_id, p->con_id)) || + (!con_id && !p->con_id)) + count++; + } } } From 4c3128b278e43e74b5c68a084224c4a2378fd57d Mon Sep 17 00:00:00 2001 From: Bartosz Golaszewski Date: Mon, 22 Dec 2025 11:01:27 +0100 Subject: [PATCH 0843/1321] gpio: shared: verify con_id when adding proxy lookup When matching the firmware node with the potential consumer, we currently omit the con_id string. This can lead to false positives in the unlikely case of the consumer having been assigned more than one shared GPIO. Check the connector ID before proceeding. Fixes: a060b8c511ab ("gpiolib: implement low-level, shared GPIO support") Link: https://lore.kernel.org/r/20251222-gpio-shared-reset-gpio-proxy-v1-2-8d4bba7d8c14@oss.qualcomm.com Signed-off-by: Bartosz Golaszewski --- drivers/gpio/gpiolib-shared.c | 7 ++++++- drivers/gpio/gpiolib-shared.h | 4 +++- drivers/gpio/gpiolib.c | 3 ++- 3 files changed, 11 insertions(+), 3 deletions(-) diff --git a/drivers/gpio/gpiolib-shared.c b/drivers/gpio/gpiolib-shared.c index ba4b718d40a08..f589109590c7c 100644 --- a/drivers/gpio/gpiolib-shared.c +++ b/drivers/gpio/gpiolib-shared.c @@ -365,7 +365,8 @@ static bool gpio_shared_dev_is_reset_gpio(struct device *consumer, } #endif /* CONFIG_RESET_GPIO */ -int gpio_shared_add_proxy_lookup(struct device *consumer, unsigned long lflags) +int gpio_shared_add_proxy_lookup(struct device *consumer, const char *con_id, + unsigned long lflags) { const char *dev_id = dev_name(consumer); struct gpio_shared_entry *entry; @@ -384,6 +385,10 @@ int gpio_shared_add_proxy_lookup(struct device *consumer, unsigned long lflags) guard(mutex)(&ref->lock); + if ((!con_id && ref->con_id) || (con_id && !ref->con_id) || + (con_id && ref->con_id && strcmp(con_id, ref->con_id) != 0)) + continue; + /* We've already done that on a previous request. */ if (ref->lookup) return 0; diff --git a/drivers/gpio/gpiolib-shared.h b/drivers/gpio/gpiolib-shared.h index 667dbdff35850..40568ef7364cc 100644 --- a/drivers/gpio/gpiolib-shared.h +++ b/drivers/gpio/gpiolib-shared.h @@ -16,7 +16,8 @@ struct device; int gpio_device_setup_shared(struct gpio_device *gdev); void gpio_device_teardown_shared(struct gpio_device *gdev); -int gpio_shared_add_proxy_lookup(struct device *consumer, unsigned long lflags); +int gpio_shared_add_proxy_lookup(struct device *consumer, const char *con_id, + unsigned long lflags); #else @@ -28,6 +29,7 @@ static inline int gpio_device_setup_shared(struct gpio_device *gdev) static inline void gpio_device_teardown_shared(struct gpio_device *gdev) { } static inline int gpio_shared_add_proxy_lookup(struct device *consumer, + const char *con_id, unsigned long lflags) { return 0; diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c index 6d583b3b07bb5..9ccfb7af67cca 100644 --- a/drivers/gpio/gpiolib.c +++ b/drivers/gpio/gpiolib.c @@ -4717,7 +4717,8 @@ struct gpio_desc *gpiod_find_and_request(struct device *consumer, * lookup table for the proxy device as previously * we only knew the consumer's fwnode. */ - ret = gpio_shared_add_proxy_lookup(consumer, lookupflags); + ret = gpio_shared_add_proxy_lookup(consumer, con_id, + lookupflags); if (ret) return ERR_PTR(ret); From 31c8174310f0981046d7306e8672c8d921bba02a Mon Sep 17 00:00:00 2001 From: Bartosz Golaszewski Date: Mon, 22 Dec 2025 11:01:28 +0100 Subject: [PATCH 0844/1321] gpio: shared: allow sharing a reset-gpios pin between reset-gpio and gpiolib We currently support sharing GPIOs between multiple devices whose drivers use either the GPIOLIB API *OR* the reset control API but not both at the same time. There's an unlikely corner-case where a reset-gpios pin can be shared by one driver using the GPIOLIB API and a second using the reset API. This will currently not work as the reset-gpio consumers are not described in firmware at the time of scanning so the shared GPIO just chooses one of the proxies created for the consumers when the reset-gpio driver performs the lookup. This can of course conflict in the case described above. In order to fix it: if we deal with the "reset-gpios" pin that's shared acconding to the firmware node setup, create a proxy for each described consumer as well as another one for the potential reset-gpio device. To that end: rework the matching to take this into account. Fixes: 7b78b26757e0 ("gpio: shared: handle the reset-gpios corner case") Link: https://lore.kernel.org/r/20251222-gpio-shared-reset-gpio-proxy-v1-3-8d4bba7d8c14@oss.qualcomm.com Signed-off-by: Bartosz Golaszewski --- drivers/gpio/gpiolib-shared.c | 182 ++++++++++++++++++++++++---------- 1 file changed, 129 insertions(+), 53 deletions(-) diff --git a/drivers/gpio/gpiolib-shared.c b/drivers/gpio/gpiolib-shared.c index f589109590c7c..baf7e07a3bb88 100644 --- a/drivers/gpio/gpiolib-shared.c +++ b/drivers/gpio/gpiolib-shared.c @@ -76,6 +76,56 @@ gpio_shared_find_entry(struct fwnode_handle *controller_node, return NULL; } +static struct gpio_shared_ref *gpio_shared_make_ref(struct fwnode_handle *fwnode, + const char *con_id, + enum gpiod_flags flags) +{ + char *con_id_cpy __free(kfree) = NULL; + + struct gpio_shared_ref *ref __free(kfree) = kzalloc(sizeof(*ref), GFP_KERNEL); + if (!ref) + return NULL; + + if (con_id) { + con_id_cpy = kstrdup(con_id, GFP_KERNEL); + if (!con_id_cpy) + return NULL; + } + + ref->dev_id = ida_alloc(&gpio_shared_ida, GFP_KERNEL); + if (ref->dev_id < 0) + return NULL; + + ref->flags = flags; + ref->con_id = no_free_ptr(con_id_cpy); + ref->fwnode = fwnode; + mutex_init(&ref->lock); + + return no_free_ptr(ref); +} + +static int gpio_shared_setup_reset_proxy(struct gpio_shared_entry *entry, + enum gpiod_flags flags) +{ + struct gpio_shared_ref *ref; + + list_for_each_entry(ref, &entry->refs, list) { + if (!ref->fwnode && ref->con_id && strcmp(ref->con_id, "reset") == 0) + return 0; + } + + ref = gpio_shared_make_ref(NULL, "reset", flags); + if (!ref) + return -ENOMEM; + + list_add_tail(&ref->list, &entry->refs); + + pr_debug("Created a secondary shared GPIO reference for potential reset-gpio device for GPIO %u at %s\n", + entry->offset, fwnode_get_name(entry->fwnode)); + + return 0; +} + /* Handle all special nodes that we should ignore. */ static bool gpio_shared_of_node_ignore(struct device_node *node) { @@ -106,6 +156,7 @@ static int gpio_shared_of_traverse(struct device_node *curr) size_t con_id_len, suffix_len; struct fwnode_handle *fwnode; struct of_phandle_args args; + struct gpio_shared_ref *ref; struct property *prop; unsigned int offset; const char *suffix; @@ -138,6 +189,7 @@ static int gpio_shared_of_traverse(struct device_node *curr) for (i = 0; i < count; i++) { struct device_node *np __free(device_node) = NULL; + char *con_id __free(kfree) = NULL; ret = of_parse_phandle_with_args(curr, prop->name, "#gpio-cells", i, @@ -182,15 +234,6 @@ static int gpio_shared_of_traverse(struct device_node *curr) list_add_tail(&entry->list, &gpio_shared_list); } - struct gpio_shared_ref *ref __free(kfree) = - kzalloc(sizeof(*ref), GFP_KERNEL); - if (!ref) - return -ENOMEM; - - ref->fwnode = fwnode_handle_get(of_fwnode_handle(curr)); - ref->flags = args.args[1]; - mutex_init(&ref->lock); - if (strends(prop->name, "gpios")) suffix = "-gpios"; else if (strends(prop->name, "gpio")) @@ -202,27 +245,32 @@ static int gpio_shared_of_traverse(struct device_node *curr) /* We only set con_id if there's actually one. */ if (strcmp(prop->name, "gpios") && strcmp(prop->name, "gpio")) { - ref->con_id = kstrdup(prop->name, GFP_KERNEL); - if (!ref->con_id) + con_id = kstrdup(prop->name, GFP_KERNEL); + if (!con_id) return -ENOMEM; - con_id_len = strlen(ref->con_id); + con_id_len = strlen(con_id); suffix_len = strlen(suffix); - ref->con_id[con_id_len - suffix_len] = '\0'; + con_id[con_id_len - suffix_len] = '\0'; } - ref->dev_id = ida_alloc(&gpio_shared_ida, GFP_KERNEL); - if (ref->dev_id < 0) { - kfree(ref->con_id); + ref = gpio_shared_make_ref(fwnode_handle_get(of_fwnode_handle(curr)), + con_id, args.args[1]); + if (!ref) return -ENOMEM; - } if (!list_empty(&entry->refs)) pr_debug("GPIO %u at %s is shared by multiple firmware nodes\n", entry->offset, fwnode_get_name(entry->fwnode)); - list_add_tail(&no_free_ptr(ref)->list, &entry->refs); + list_add_tail(&ref->list, &entry->refs); + + if (strcmp(prop->name, "reset-gpios") == 0) { + ret = gpio_shared_setup_reset_proxy(entry, args.args[1]); + if (ret) + return ret; + } } } @@ -306,20 +354,16 @@ static bool gpio_shared_dev_is_reset_gpio(struct device *consumer, struct fwnode_handle *reset_fwnode = dev_fwnode(consumer); struct fwnode_reference_args ref_args, aux_args; struct device *parent = consumer->parent; + struct gpio_shared_ref *real_ref; bool match; int ret; + lockdep_assert_held(&ref->lock); + /* The reset-gpio device must have a parent AND a firmware node. */ if (!parent || !reset_fwnode) return false; - /* - * FIXME: use device_is_compatible() once the reset-gpio drivers gains - * a compatible string which it currently does not have. - */ - if (!strstarts(dev_name(consumer), "reset.gpio.")) - return false; - /* * Parent of the reset-gpio auxiliary device is the GPIO chip whose * fwnode we stored in the entry structure. @@ -328,33 +372,56 @@ static bool gpio_shared_dev_is_reset_gpio(struct device *consumer, return false; /* - * The device associated with the shared reference's firmware node is - * the consumer of the reset control exposed by the reset-gpio device. - * It must have a "reset-gpios" property that's referencing the entry's - * firmware node. - * - * The reference args must agree between the real consumer and the - * auxiliary reset-gpio device. + * Now we need to find the actual pin we want to assign to this + * reset-gpio device. To that end: iterate over the list of references + * of this entry and see if there's one, whose reset-gpios property's + * arguments match the ones from this consumer's node. */ - ret = fwnode_property_get_reference_args(ref->fwnode, "reset-gpios", - NULL, 2, 0, &ref_args); - if (ret) - return false; + list_for_each_entry(real_ref, &entry->refs, list) { + if (!real_ref->fwnode) + continue; + + /* + * The device associated with the shared reference's firmware + * node is the consumer of the reset control exposed by the + * reset-gpio device. It must have a "reset-gpios" property + * that's referencing the entry's firmware node. + * + * The reference args must agree between the real consumer and + * the auxiliary reset-gpio device. + */ + ret = fwnode_property_get_reference_args(real_ref->fwnode, + "reset-gpios", + NULL, 2, 0, &ref_args); + if (ret) + continue; + + ret = fwnode_property_get_reference_args(reset_fwnode, "reset-gpios", + NULL, 2, 0, &aux_args); + if (ret) { + fwnode_handle_put(ref_args.fwnode); + continue; + } + + match = ((ref_args.fwnode == entry->fwnode) && + (aux_args.fwnode == entry->fwnode) && + (ref_args.args[0] == aux_args.args[0])); - ret = fwnode_property_get_reference_args(reset_fwnode, "reset-gpios", - NULL, 2, 0, &aux_args); - if (ret) { fwnode_handle_put(ref_args.fwnode); - return false; - } + fwnode_handle_put(aux_args.fwnode); + + if (!match) + continue; - match = ((ref_args.fwnode == entry->fwnode) && - (aux_args.fwnode == entry->fwnode) && - (ref_args.args[0] == aux_args.args[0])); + /* + * Reuse the fwnode of the real device, next time we'll use it + * in the normal path. + */ + ref->fwnode = fwnode_handle_get(real_ref->fwnode); + return true; + } - fwnode_handle_put(ref_args.fwnode); - fwnode_handle_put(aux_args.fwnode); - return match; + return false; } #else static bool gpio_shared_dev_is_reset_gpio(struct device *consumer, @@ -379,12 +446,20 @@ int gpio_shared_add_proxy_lookup(struct device *consumer, const char *con_id, list_for_each_entry(entry, &gpio_shared_list, list) { list_for_each_entry(ref, &entry->refs, list) { - if (!device_match_fwnode(consumer, ref->fwnode) && - !gpio_shared_dev_is_reset_gpio(consumer, entry, ref)) - continue; - guard(mutex)(&ref->lock); + /* + * FIXME: use device_is_compatible() once the reset-gpio + * drivers gains a compatible string which it currently + * does not have. + */ + if (!ref->fwnode && strstarts(dev_name(consumer), "reset.gpio.")) { + if (!gpio_shared_dev_is_reset_gpio(consumer, entry, ref)) + continue; + } else if (!device_match_fwnode(consumer, ref->fwnode)) { + continue; + } + if ((!con_id && ref->con_id) || (con_id && !ref->con_id) || (con_id && ref->con_id && strcmp(con_id, ref->con_id) != 0)) continue; @@ -471,8 +546,9 @@ int gpio_device_setup_shared(struct gpio_device *gdev) entry->offset, gpio_device_get_label(gdev)); list_for_each_entry(ref, &entry->refs, list) { - pr_debug("Setting up a shared GPIO entry for %s\n", - fwnode_get_name(ref->fwnode)); + pr_debug("Setting up a shared GPIO entry for %s (con_id: '%s')\n", + fwnode_get_name(ref->fwnode) ?: "(no fwnode)", + ref->con_id ?: "(none)"); ret = gpio_shared_make_adev(gdev, entry, ref); if (ret) From a595c213cafabd2958ee155c5a7791ef387803c0 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Pawe=C5=82=20Narewski?= Date: Wed, 24 Dec 2025 09:26:40 +0100 Subject: [PATCH 0845/1321] gpiolib: fix race condition for gdev->srcu MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit If two drivers were calling gpiochip_add_data_with_key(), one may be traversing the srcu-protected list in gpio_name_to_desc(), meanwhile other has just added its gdev in gpiodev_add_to_list_unlocked(). This creates a non-mutexed and non-protected timeframe, when one instance is dereferencing and using &gdev->srcu, before the other has initialized it, resulting in crash: [ 4.935481] Unable to handle kernel paging request at virtual address ffff800272bcc000 [ 4.943396] Mem abort info: [ 4.943400] ESR = 0x0000000096000005 [ 4.943403] EC = 0x25: DABT (current EL), IL = 32 bits [ 4.943407] SET = 0, FnV = 0 [ 4.943410] EA = 0, S1PTW = 0 [ 4.943413] FSC = 0x05: level 1 translation fault [ 4.943416] Data abort info: [ 4.943418] ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000 [ 4.946220] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [ 4.955261] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 4.955268] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000038e6c000 [ 4.961449] [ffff800272bcc000] pgd=0000000000000000 [ 4.969203] , p4d=1000000039739003 [ 4.979730] , pud=0000000000000000 [ 4.980210] phandle (CPU): 0x0000005e, phandle (BE): 0x5e000000 for node "reset" [ 4.991736] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP ... [ 5.121359] pc : __srcu_read_lock+0x44/0x98 [ 5.131091] lr : gpio_name_to_desc+0x60/0x1a0 [ 5.153671] sp : ffff8000833bb430 [ 5.298440] [ 5.298443] Call trace: [ 5.298445] __srcu_read_lock+0x44/0x98 [ 5.309484] gpio_name_to_desc+0x60/0x1a0 [ 5.320692] gpiochip_add_data_with_key+0x488/0xf00 5.946419] ---[ end trace 0000000000000000 ]--- Move initialization code for gdev fields before it is added to gpio_devices, with adjacent initialization code. Adjust goto statements to reflect modified order of operations Fixes: 47d8b4c1d868 ("gpio: add SRCU infrastructure to struct gpio_device") Reviewed-by: Jakub Lewalski Signed-off-by: Paweł Narewski [Bartosz: fixed a build issue, removed stray newline] Link: https://lore.kernel.org/r/20251224082641.10769-1-bartosz.golaszewski@oss.qualcomm.com Signed-off-by: Bartosz Golaszewski --- drivers/gpio/gpiolib.c | 40 ++++++++++++++++++++-------------------- 1 file changed, 20 insertions(+), 20 deletions(-) diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c index 9ccfb7af67cca..c06152b16dbc7 100644 --- a/drivers/gpio/gpiolib.c +++ b/drivers/gpio/gpiolib.c @@ -1105,6 +1105,18 @@ int gpiochip_add_data_with_key(struct gpio_chip *gc, void *data, gdev->ngpio = gc->ngpio; gdev->can_sleep = gc->can_sleep; + rwlock_init(&gdev->line_state_lock); + RAW_INIT_NOTIFIER_HEAD(&gdev->line_state_notifier); + BLOCKING_INIT_NOTIFIER_HEAD(&gdev->device_notifier); + + ret = init_srcu_struct(&gdev->srcu); + if (ret) + goto err_free_label; + + ret = init_srcu_struct(&gdev->desc_srcu); + if (ret) + goto err_cleanup_gdev_srcu; + scoped_guard(mutex, &gpio_devices_lock) { /* * TODO: this allocates a Linux GPIO number base in the global @@ -1119,7 +1131,7 @@ int gpiochip_add_data_with_key(struct gpio_chip *gc, void *data, if (base < 0) { ret = base; base = 0; - goto err_free_label; + goto err_cleanup_desc_srcu; } /* @@ -1139,22 +1151,10 @@ int gpiochip_add_data_with_key(struct gpio_chip *gc, void *data, ret = gpiodev_add_to_list_unlocked(gdev); if (ret) { gpiochip_err(gc, "GPIO integer space overlap, cannot add chip\n"); - goto err_free_label; + goto err_cleanup_desc_srcu; } } - rwlock_init(&gdev->line_state_lock); - RAW_INIT_NOTIFIER_HEAD(&gdev->line_state_notifier); - BLOCKING_INIT_NOTIFIER_HEAD(&gdev->device_notifier); - - ret = init_srcu_struct(&gdev->srcu); - if (ret) - goto err_remove_from_list; - - ret = init_srcu_struct(&gdev->desc_srcu); - if (ret) - goto err_cleanup_gdev_srcu; - #ifdef CONFIG_PINCTRL INIT_LIST_HEAD(&gdev->pin_ranges); #endif @@ -1164,11 +1164,11 @@ int gpiochip_add_data_with_key(struct gpio_chip *gc, void *data, ret = gpiochip_set_names(gc); if (ret) - goto err_cleanup_desc_srcu; + goto err_remove_from_list; ret = gpiochip_init_valid_mask(gc); if (ret) - goto err_cleanup_desc_srcu; + goto err_remove_from_list; for (desc_index = 0; desc_index < gc->ngpio; desc_index++) { struct gpio_desc *desc = &gdev->descs[desc_index]; @@ -1248,10 +1248,6 @@ int gpiochip_add_data_with_key(struct gpio_chip *gc, void *data, of_gpiochip_remove(gc); err_free_valid_mask: gpiochip_free_valid_mask(gc); -err_cleanup_desc_srcu: - cleanup_srcu_struct(&gdev->desc_srcu); -err_cleanup_gdev_srcu: - cleanup_srcu_struct(&gdev->srcu); err_remove_from_list: scoped_guard(mutex, &gpio_devices_lock) list_del_rcu(&gdev->list); @@ -1261,6 +1257,10 @@ int gpiochip_add_data_with_key(struct gpio_chip *gc, void *data, gpio_device_put(gdev); goto err_print_message; } +err_cleanup_desc_srcu: + cleanup_srcu_struct(&gdev->desc_srcu); +err_cleanup_gdev_srcu: + cleanup_srcu_struct(&gdev->srcu); err_free_label: kfree_const(gdev->label); err_free_descs: From 902f85239d7b43b187f2633d8801ed1c9757f035 Mon Sep 17 00:00:00 2001 From: Ernest Van Hoecke Date: Wed, 17 Dec 2025 16:30:25 +0100 Subject: [PATCH 0846/1321] gpio: pca953x: handle short interrupt pulses on PCAL devices MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit GPIO drivers with latch input support may miss short pulses on input pins even when input latching is enabled. The generic interrupt logic in the pca953x driver reports interrupts by comparing the current input value against the previously sampled one and only signals an event when a level change is observed between two reads. For short pulses, the first edge is captured when the input register is read, but if the signal returns to its previous level before the read, the second edge is not observed. As a result, successive pulses can produce identical input values at read time and no level change is detected, causing interrupts to be missed. Below timing diagram shows this situation where the top signal is the input pin level and the bottom signal indicates the latched value. ─────┐ ┌──*───────────────┐ ┌──*─────────────────┐ ┌──*─── │ │ . │ │ . │ │ . │ │ │ │ │ │ │ │ │ └──*──┘ │ └──*──┘ │ └──*──┘ │ Input │ │ │ │ │ │ ▼ │ ▼ │ ▼ │ IRQ │ IRQ │ IRQ │ . . . ─────┐ .┌──────────────┐ .┌────────────────┐ .┌── │ │ │ │ │ │ │ │ │ │ │ │ └────────*┘ └────────*┘ └────────*┘ Latched │ │ │ ▼ ▼ ▼ READ 0 READ 0 READ 0 NO CHANGE NO CHANGE PCAL variants provide an interrupt status register that records which pins triggered an interrupt, but the status and input registers cannot be read atomically. The interrupt status is only cleared when the input port is read, and the input value must also be read to determine the triggering edge. If another interrupt occurs on a different line after the status register has been read but before the input register is sampled, that event will not be reflected in the earlier status snapshot, so relying solely on the interrupt status register is also insufficient. Support for input latching and interrupt status handling was previously added by [1], but the interrupt status-based logic was reverted by [2] due to these issues. This patch addresses the original problem by combining both sources of information. Events indicated by the interrupt status register are merged with events detected through the existing level-change logic. As a result: * short pulses, whose second edges are invisible, are detected via the interrupt status register, and * interrupts that occur between the status and input reads are still caught by the generic level-change logic. This significantly improves robustness on devices that signal interrupts as short pulses, while avoiding the issues that led to the earlier reversion. In practice, even if only the first edge of a pulse is observable, the interrupt is reliably detected. This fixes missed interrupts from an Ilitek touch controller with its interrupt line connected to a PCAL6416A, where active-low pulses are approximately 200 us long. [1] commit 44896beae605 ("gpio: pca953x: add PCAL9535 interrupt support for Galileo Gen2") [2] commit d6179f6c6204 ("gpio: pca953x: Improve interrupt support") Fixes: d6179f6c6204 ("gpio: pca953x: Improve interrupt support") Signed-off-by: Ernest Van Hoecke Reviewed-by: Andy Shevchenko Link: https://lore.kernel.org/r/20251217153050.142057-1-ernestvanhoecke@gmail.com Signed-off-by: Bartosz Golaszewski --- drivers/gpio/gpio-pca953x.c | 25 ++++++++++++++++++++++++- 1 file changed, 24 insertions(+), 1 deletion(-) diff --git a/drivers/gpio/gpio-pca953x.c b/drivers/gpio/gpio-pca953x.c index 0a3916cc2772a..8727ae54bc578 100644 --- a/drivers/gpio/gpio-pca953x.c +++ b/drivers/gpio/gpio-pca953x.c @@ -943,14 +943,35 @@ static bool pca953x_irq_pending(struct pca953x_chip *chip, unsigned long *pendin DECLARE_BITMAP(old_stat, MAX_LINE); DECLARE_BITMAP(cur_stat, MAX_LINE); DECLARE_BITMAP(new_stat, MAX_LINE); + DECLARE_BITMAP(int_stat, MAX_LINE); DECLARE_BITMAP(trigger, MAX_LINE); DECLARE_BITMAP(edges, MAX_LINE); int ret; + if (chip->driver_data & PCA_PCAL) { + /* Read INT_STAT before it is cleared by the input-port read. */ + ret = pca953x_read_regs(chip, PCAL953X_INT_STAT, int_stat); + if (ret) + return false; + } + ret = pca953x_read_regs(chip, chip->regs->input, cur_stat); if (ret) return false; + if (chip->driver_data & PCA_PCAL) { + /* Detect short pulses via INT_STAT. */ + bitmap_and(trigger, int_stat, chip->irq_mask, gc->ngpio); + + /* Apply filter for rising/falling edge selection. */ + bitmap_replace(new_stat, chip->irq_trig_fall, chip->irq_trig_raise, + cur_stat, gc->ngpio); + + bitmap_and(int_stat, new_stat, trigger, gc->ngpio); + } else { + bitmap_zero(int_stat, gc->ngpio); + } + /* Remove output pins from the equation */ pca953x_read_regs(chip, chip->regs->direction, reg_direction); @@ -964,7 +985,8 @@ static bool pca953x_irq_pending(struct pca953x_chip *chip, unsigned long *pendin if (bitmap_empty(chip->irq_trig_level_high, gc->ngpio) && bitmap_empty(chip->irq_trig_level_low, gc->ngpio)) { - if (bitmap_empty(trigger, gc->ngpio)) + if (bitmap_empty(trigger, gc->ngpio) && + bitmap_empty(int_stat, gc->ngpio)) return false; } @@ -972,6 +994,7 @@ static bool pca953x_irq_pending(struct pca953x_chip *chip, unsigned long *pendin bitmap_and(old_stat, chip->irq_trig_raise, new_stat, gc->ngpio); bitmap_or(edges, old_stat, cur_stat, gc->ngpio); bitmap_and(pending, edges, trigger, gc->ngpio); + bitmap_or(pending, pending, int_stat, gc->ngpio); bitmap_and(cur_stat, new_stat, chip->irq_trig_level_high, gc->ngpio); bitmap_and(cur_stat, cur_stat, chip->irq_mask, gc->ngpio); From dbf295237cf8e4603c9f2158ce2d287b67e71cf2 Mon Sep 17 00:00:00 2001 From: Abdun Nihaal Date: Fri, 26 Dec 2025 11:34:10 +0530 Subject: [PATCH 0847/1321] gpio: mpsse: fix reference leak in gpio_mpsse_probe() error paths The reference obtained by calling usb_get_dev() is not released in the gpio_mpsse_probe() error paths. Fix that by using device managed helper functions. Also remove the usb_put_dev() call in the disconnect function since now it will be released automatically. Cc: stable@vger.kernel.org Fixes: c46a74ff05c0 ("gpio: add support for FTDI's MPSSE as GPIO") Signed-off-by: Abdun Nihaal Link: https://lore.kernel.org/r/20251226060414.20785-1-nihaal@cse.iitm.ac.in Signed-off-by: Bartosz Golaszewski --- drivers/gpio/gpio-mpsse.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/drivers/gpio/gpio-mpsse.c b/drivers/gpio/gpio-mpsse.c index ace652ba4df12..12191aeb65668 100644 --- a/drivers/gpio/gpio-mpsse.c +++ b/drivers/gpio/gpio-mpsse.c @@ -548,6 +548,13 @@ static void gpio_mpsse_ida_remove(void *data) ida_free(&gpio_mpsse_ida, priv->id); } +static void gpio_mpsse_usb_put_dev(void *data) +{ + struct mpsse_priv *priv = data; + + usb_put_dev(priv->udev); +} + static int mpsse_init_valid_mask(struct gpio_chip *chip, unsigned long *valid_mask, unsigned int ngpios) @@ -592,6 +599,10 @@ static int gpio_mpsse_probe(struct usb_interface *interface, INIT_LIST_HEAD(&priv->workers); priv->udev = usb_get_dev(interface_to_usbdev(interface)); + err = devm_add_action_or_reset(dev, gpio_mpsse_usb_put_dev, priv); + if (err) + return err; + priv->intf = interface; priv->intf_id = interface->cur_altsetting->desc.bInterfaceNumber; @@ -713,7 +724,6 @@ static void gpio_mpsse_disconnect(struct usb_interface *intf) priv->intf = NULL; usb_set_intfdata(intf, NULL); - usb_put_dev(priv->udev); } static struct usb_driver gpio_mpsse_driver = { From 985a43fe53be3b6954e6a953634126da48c21955 Mon Sep 17 00:00:00 2001 From: Bartosz Golaszewski Date: Tue, 6 Jan 2026 10:00:11 +0100 Subject: [PATCH 0848/1321] gpio: rockchip: mark the GPIO controller as sleeping The GPIO controller is configured as non-sleeping but it uses generic pinctrl helpers which use a mutex for synchronization. This can cause the following lockdep splat with shared GPIOs enabled on boards which have multiple devices using the same GPIO: BUG: sleeping function called from invalid context at kernel/locking/mutex.c:591 in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 12, name: kworker/u16:0 preempt_count: 1, expected: 0 RCU nest depth: 0, expected: 0 6 locks held by kworker/u16:0/12: #0: ffff0001f0018d48 ((wq_completion)events_unbound#2){+.+.}-{0:0}, at: process_one_work+0x18c/0x604 #1: ffff8000842dbdf0 (deferred_probe_work){+.+.}-{0:0}, at: process_one_work+0x1b4/0x604 #2: ffff0001f18498f8 (&dev->mutex){....}-{4:4}, at: __device_attach+0x38/0x1b0 #3: ffff0001f75f1e90 (&gdev->srcu){.+.?}-{0:0}, at: gpiod_direction_output_raw_commit+0x0/0x360 #4: ffff0001f46e3db8 (&shared_desc->spinlock){....}-{3:3}, at: gpio_shared_proxy_direction_output+0xd0/0x144 [gpio_shared_proxy] #5: ffff0001f180ee90 (&gdev->srcu){.+.?}-{0:0}, at: gpiod_direction_output_raw_commit+0x0/0x360 irq event stamp: 81450 hardirqs last enabled at (81449): [] _raw_spin_unlock_irqrestore+0x74/0x78 hardirqs last disabled at (81450): [] _raw_spin_lock_irqsave+0x84/0x88 softirqs last enabled at (79616): [] __alloc_skb+0x17c/0x1e8 softirqs last disabled at (79614): [] __alloc_skb+0x17c/0x1e8 CPU: 2 UID: 0 PID: 12 Comm: kworker/u16:0 Not tainted 6.19.0-rc4-next-20260105+ #11975 PREEMPT Hardware name: Hardkernel ODROID-M1 (DT) Workqueue: events_unbound deferred_probe_work_func Call trace: show_stack+0x18/0x24 (C) dump_stack_lvl+0x90/0xd0 dump_stack+0x18/0x24 __might_resched+0x144/0x248 __might_sleep+0x48/0x98 __mutex_lock+0x5c/0x894 mutex_lock_nested+0x24/0x30 pinctrl_get_device_gpio_range+0x44/0x128 pinctrl_gpio_direction+0x3c/0xe0 pinctrl_gpio_direction_output+0x14/0x20 rockchip_gpio_direction_output+0xb8/0x19c gpiochip_direction_output+0x38/0x94 gpiod_direction_output_raw_commit+0x1d8/0x360 gpiod_direction_output_nonotify+0x7c/0x230 gpiod_direction_output+0x34/0xf8 gpio_shared_proxy_direction_output+0xec/0x144 [gpio_shared_proxy] gpiochip_direction_output+0x38/0x94 gpiod_direction_output_raw_commit+0x1d8/0x360 gpiod_direction_output_nonotify+0x7c/0x230 gpiod_configure_flags+0xbc/0x480 gpiod_find_and_request+0x1a0/0x574 gpiod_get_index+0x58/0x84 devm_gpiod_get_index+0x20/0xb4 devm_gpiod_get_optional+0x18/0x30 rockchip_pcie_probe+0x98/0x380 platform_probe+0x5c/0xac really_probe+0xbc/0x298 Fixes: 936ee2675eee ("gpio/rockchip: add driver for rockchip gpio") Cc: stable@vger.kernel.org Reported-by: Marek Szyprowski Closes: https://lore.kernel.org/all/d035fc29-3b03-4cd6-b8ec-001f93540bc6@samsung.com/ Acked-by: Heiko Stuebner Link: https://lore.kernel.org/r/20260106090011.21603-1-bartosz.golaszewski@oss.qualcomm.com Signed-off-by: Bartosz Golaszewski --- drivers/gpio/gpio-rockchip.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpio/gpio-rockchip.c b/drivers/gpio/gpio-rockchip.c index 47174eb3ba76f..bae2061f15fc4 100644 --- a/drivers/gpio/gpio-rockchip.c +++ b/drivers/gpio/gpio-rockchip.c @@ -593,6 +593,7 @@ static int rockchip_gpiolib_register(struct rockchip_pin_bank *bank) gc->ngpio = bank->nr_pins; gc->label = bank->name; gc->parent = bank->dev; + gc->can_sleep = true; ret = gpiochip_add_data(gc, bank); if (ret) { From 3f3b394c7136baa0503b5de8bd907072407f18e5 Mon Sep 17 00:00:00 2001 From: Bartosz Golaszewski Date: Tue, 6 Jan 2026 10:34:21 +0100 Subject: [PATCH 0849/1321] gpio: shared: assign the correct firmware node for reset-gpio use-case When we defer probe due to unlucky timing of adding the lookup table, we assign the matching firmware node to the shared reference for the future probing. However, the fwnode we assign is wrong so fix it and assign the one associated with the reset-gpio device. Fixes: 49416483a953 ("gpio: shared: allow sharing a reset-gpios pin between reset-gpio and gpiolib") Reported-by: Marek Szyprowski Closes: https://lore.kernel.org/all/00107523-7737-4b92-a785-14ce4e93b8cb@samsung.com/ Tested-by: Mark Brown Link: https://lore.kernel.org/r/20260106-gpio-shared-fixes-v2-1-c7091d2f7581@oss.qualcomm.com Signed-off-by: Bartosz Golaszewski --- drivers/gpio/gpiolib-shared.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpio/gpiolib-shared.c b/drivers/gpio/gpiolib-shared.c index baf7e07a3bb88..a68af06a6cc4e 100644 --- a/drivers/gpio/gpiolib-shared.c +++ b/drivers/gpio/gpiolib-shared.c @@ -417,7 +417,7 @@ static bool gpio_shared_dev_is_reset_gpio(struct device *consumer, * Reuse the fwnode of the real device, next time we'll use it * in the normal path. */ - ref->fwnode = fwnode_handle_get(real_ref->fwnode); + ref->fwnode = fwnode_handle_get(reset_fwnode); return true; } From fd3162bc6aba34e4a67a4a44785de7d6e0926e2b Mon Sep 17 00:00:00 2001 From: Bartosz Golaszewski Date: Tue, 6 Jan 2026 10:34:22 +0100 Subject: [PATCH 0850/1321] gpio: shared: fix a race condition When matching the reset-gpio reference with the actual firmware node consuming the GPIO, we also need to lock the structure associated with the latter as it can change while we're doing it. Due to triggering lockdep false-positives, we need to use a per-reference lockdep class but accidentally, this also allows us to remove the previous lockdep workaround for cleaner code. Fixes: 49416483a953 ("gpio: shared: allow sharing a reset-gpios pin between reset-gpio and gpiolib") Reported-by: Marek Szyprowski Closes: https://lore.kernel.org/all/00107523-7737-4b92-a785-14ce4e93b8cb@samsung.com/ Tested-by: Mark Brown Link: https://lore.kernel.org/r/20260106-gpio-shared-fixes-v2-2-c7091d2f7581@oss.qualcomm.com Signed-off-by: Bartosz Golaszewski --- drivers/gpio/gpiolib-shared.c | 20 +++++++++----------- 1 file changed, 9 insertions(+), 11 deletions(-) diff --git a/drivers/gpio/gpiolib-shared.c b/drivers/gpio/gpiolib-shared.c index a68af06a6cc4e..4c57b0928760c 100644 --- a/drivers/gpio/gpiolib-shared.c +++ b/drivers/gpio/gpiolib-shared.c @@ -38,6 +38,7 @@ struct gpio_shared_ref { int dev_id; /* Protects the auxiliary device struct and the lookup table. */ struct mutex lock; + struct lock_class_key lock_key; struct auxiliary_device adev; struct gpiod_lookup_table *lookup; }; @@ -99,7 +100,8 @@ static struct gpio_shared_ref *gpio_shared_make_ref(struct fwnode_handle *fwnode ref->flags = flags; ref->con_id = no_free_ptr(con_id_cpy); ref->fwnode = fwnode; - mutex_init(&ref->lock); + lockdep_register_key(&ref->lock_key); + mutex_init_with_key(&ref->lock, &ref->lock_key); return no_free_ptr(ref); } @@ -378,6 +380,11 @@ static bool gpio_shared_dev_is_reset_gpio(struct device *consumer, * arguments match the ones from this consumer's node. */ list_for_each_entry(real_ref, &entry->refs, list) { + if (real_ref == ref) + continue; + + guard(mutex)(&real_ref->lock); + if (!real_ref->fwnode) continue; @@ -568,15 +575,6 @@ void gpio_device_teardown_shared(struct gpio_device *gdev) if (!device_match_fwnode(&gdev->dev, entry->fwnode)) continue; - /* - * For some reason if we call synchronize_srcu() in GPIO core, - * descent here and take this mutex and then recursively call - * synchronize_srcu() again from gpiochip_remove() (which is - * totally fine) called after gpio_shared_remove_adev(), - * lockdep prints a false positive deadlock splat. Disable - * lockdep here. - */ - lockdep_off(); list_for_each_entry(ref, &entry->refs, list) { guard(mutex)(&ref->lock); @@ -589,7 +587,6 @@ void gpio_device_teardown_shared(struct gpio_device *gdev) gpio_shared_remove_adev(&ref->adev); } - lockdep_on(); } } @@ -685,6 +682,7 @@ static void gpio_shared_drop_ref(struct gpio_shared_ref *ref) { list_del(&ref->list); mutex_destroy(&ref->lock); + lockdep_unregister_key(&ref->lock_key); kfree(ref->con_id); ida_free(&gpio_shared_ida, ref->dev_id); fwnode_handle_put(ref->fwnode); From 53557c67056461a7e0b9945a5e173ad822355e2a Mon Sep 17 00:00:00 2001 From: Bartosz Golaszewski Date: Tue, 6 Jan 2026 10:34:23 +0100 Subject: [PATCH 0851/1321] gpio: shared: don't allocate the lookup table until we really need it We allocate memory for the GPIO lookup table at the top of gpio_shared_add_proxy_lookup() but we don't use it until the very end. Depending on the timing, we may return earlier. Move the allocation towards the end. Fixes: a060b8c511ab ("gpiolib: implement low-level, shared GPIO support") Tested-by: Mark Brown Link: https://lore.kernel.org/r/20260106-gpio-shared-fixes-v2-3-c7091d2f7581@oss.qualcomm.com Signed-off-by: Bartosz Golaszewski --- drivers/gpio/gpiolib-shared.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/gpio/gpiolib-shared.c b/drivers/gpio/gpiolib-shared.c index 4c57b0928760c..076d8642675c2 100644 --- a/drivers/gpio/gpiolib-shared.c +++ b/drivers/gpio/gpiolib-shared.c @@ -443,14 +443,10 @@ int gpio_shared_add_proxy_lookup(struct device *consumer, const char *con_id, unsigned long lflags) { const char *dev_id = dev_name(consumer); + struct gpiod_lookup_table *lookup; struct gpio_shared_entry *entry; struct gpio_shared_ref *ref; - struct gpiod_lookup_table *lookup __free(kfree) = - kzalloc(struct_size(lookup, table, 2), GFP_KERNEL); - if (!lookup) - return -ENOMEM; - list_for_each_entry(entry, &gpio_shared_list, list) { list_for_each_entry(ref, &entry->refs, list) { guard(mutex)(&ref->lock); @@ -482,6 +478,10 @@ int gpio_shared_add_proxy_lookup(struct device *consumer, const char *con_id, if (!key) return -ENOMEM; + lookup = kzalloc(struct_size(lookup, table, 2), GFP_KERNEL); + if (!lookup) + return -ENOMEM; + pr_debug("Adding machine lookup entry for a shared GPIO for consumer %s, with key '%s' and con_id '%s'\n", dev_id, key, ref->con_id ?: "none"); @@ -489,7 +489,7 @@ int gpio_shared_add_proxy_lookup(struct device *consumer, const char *con_id, lookup->table[0] = GPIO_LOOKUP(no_free_ptr(key), 0, ref->con_id, lflags); - ref->lookup = no_free_ptr(lookup); + ref->lookup = lookup; gpiod_add_lookup_table(ref->lookup); return 0; From e215f8b231ed2006ef9b359879a50e6563bd6373 Mon Sep 17 00:00:00 2001 From: Bartosz Golaszewski Date: Thu, 8 Jan 2026 11:23:14 +0100 Subject: [PATCH 0852/1321] gpiolib: fix lookup table matching If on any iteration in gpiod_find(), gpio_desc_table_match() returns NULL (which is normal and expected), we never reinitialize desc back to ERR_PTR(-ENOENT) and if we don't find a match later on, we will return NULL causing a NULL-pointer dereference in users not expecting it. Don't initialize desc, but return ERR_PTR(-ENOENT) explicitly at the end of the function. Fixes: 9700b0fccf38 ("gpiolib: allow multiple lookup tables per consumer") Reported-by: Marek Szyprowski Closes: https://lore.kernel.org/all/00107523-7737-4b92-a785-14ce4e93b8cb@samsung.com/ Tested-by: Marek Szyprowski Link: https://lore.kernel.org/r/20260108102314.18816-1-bartosz.golaszewski@oss.qualcomm.com Signed-off-by: Bartosz Golaszewski --- drivers/gpio/gpiolib.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c index c06152b16dbc7..dcf427d3cf437 100644 --- a/drivers/gpio/gpiolib.c +++ b/drivers/gpio/gpiolib.c @@ -4602,8 +4602,8 @@ static struct gpio_desc *gpio_desc_table_match(struct device *dev, const char *c static struct gpio_desc *gpiod_find(struct device *dev, const char *con_id, unsigned int idx, unsigned long *flags) { - struct gpio_desc *desc = ERR_PTR(-ENOENT); struct gpiod_lookup_table *table; + struct gpio_desc *desc; guard(mutex)(&gpio_lookup_lock); @@ -4619,7 +4619,7 @@ static struct gpio_desc *gpiod_find(struct device *dev, const char *con_id, return desc; } - return desc; + return ERR_PTR(-ENOENT); } static int platform_gpio_count(struct device *dev, const char *con_id) From d7cda20e3c7026b0a93c1527cba02aba7d4c857c Mon Sep 17 00:00:00 2001 From: Bartosz Golaszewski Date: Thu, 8 Jan 2026 14:39:19 +0100 Subject: [PATCH 0853/1321] gpio: shared: fix a false-positive sharing detection with reset-gpios After scanning the devicetree, we remove all entries that have only one reference, while creating GPIO shared proxies for the remaining, shared entries. However: for the reset-gpio corner-case, we will have two references for a "reset-gpios" pin that's not really shared. In this case one will come from the actual consumer fwnode and the other from the potential auxiliary reset-gpio device. This causes the GPIO core to create unnecessary GPIO shared proxy devices for pins that are not really shared. Add a function that can detect this situation and remove entries that have exactly two references but one of them is a reset-gpio. Fixes: 7b78b26757e0 ("gpio: shared: handle the reset-gpios corner case") Link: https://lore.kernel.org/r/20260108-gpio-shared-false-positive-v1-1-5dbf8d1b2f7d@oss.qualcomm.com Signed-off-by: Bartosz Golaszewski --- drivers/gpio/gpiolib-shared.c | 34 ++++++++++++++++++++++++++++++++-- 1 file changed, 32 insertions(+), 2 deletions(-) diff --git a/drivers/gpio/gpiolib-shared.c b/drivers/gpio/gpiolib-shared.c index 076d8642675c2..17343fdc97584 100644 --- a/drivers/gpio/gpiolib-shared.c +++ b/drivers/gpio/gpiolib-shared.c @@ -41,6 +41,7 @@ struct gpio_shared_ref { struct lock_class_key lock_key; struct auxiliary_device adev; struct gpiod_lookup_table *lookup; + bool is_reset_gpio; }; /* Represents a single GPIO pin. */ @@ -112,7 +113,8 @@ static int gpio_shared_setup_reset_proxy(struct gpio_shared_entry *entry, struct gpio_shared_ref *ref; list_for_each_entry(ref, &entry->refs, list) { - if (!ref->fwnode && ref->con_id && strcmp(ref->con_id, "reset") == 0) + if (ref->is_reset_gpio) + /* Already set-up. */ return 0; } @@ -120,6 +122,8 @@ static int gpio_shared_setup_reset_proxy(struct gpio_shared_entry *entry, if (!ref) return -ENOMEM; + ref->is_reset_gpio = true; + list_add_tail(&ref->list, &entry->refs); pr_debug("Created a secondary shared GPIO reference for potential reset-gpio device for GPIO %u at %s\n", @@ -714,12 +718,38 @@ static void __init gpio_shared_teardown(void) } } +static bool gpio_shared_entry_is_really_shared(struct gpio_shared_entry *entry) +{ + size_t num_nodes = list_count_nodes(&entry->refs); + struct gpio_shared_ref *ref; + + if (num_nodes <= 1) + return false; + + if (num_nodes > 2) + return true; + + /* Exactly two references: */ + list_for_each_entry(ref, &entry->refs, list) { + /* + * Corner-case: the second reference comes from the potential + * reset-gpio instance. However, this pin is not really shared + * as it would have three references in this case. Avoid + * creating unnecessary proxies. + */ + if (ref->is_reset_gpio) + return false; + } + + return true; +} + static void gpio_shared_free_exclusive(void) { struct gpio_shared_entry *entry, *epos; list_for_each_entry_safe(entry, epos, &gpio_shared_list, list) { - if (list_count_nodes(&entry->refs) > 1) + if (gpio_shared_entry_is_really_shared(entry)) continue; gpio_shared_drop_ref(list_first_entry(&entry->refs, From db44f98fb1244b687e59811ffef8ae36462b9e75 Mon Sep 17 00:00:00 2001 From: Malaya Kumar Rout Date: Tue, 30 Dec 2025 17:26:13 +0530 Subject: [PATCH 0854/1321] PM: hibernate: Fix crash when freeing invalid crypto compressor When crypto_alloc_acomp() fails, it returns an ERR_PTR value, not NULL. The cleanup code in save_compressed_image() and load_compressed_image() unconditionally calls crypto_free_acomp() without checking for ERR_PTR, which causes crypto_acomp_tfm() to dereference an invalid pointer and crash the kernel. This can be triggered when the compression algorithm is unavailable (e.g., CONFIG_CRYPTO_LZO not enabled). Fix by adding IS_ERR_OR_NULL() checks before calling crypto_free_acomp() and acomp_request_free(), similar to the existing kthread_stop() check. Fixes: b03d542c3c95 ("PM: hibernate: Use crypto_acomp interface") Signed-off-by: Malaya Kumar Rout Cc: 6.15+ # 6.15+ [ rjw: Added 2 empty code lines ] Link: https://patch.msgid.link/20251230115613.64080-1-mrout@redhat.com Signed-off-by: Rafael J. Wysocki --- kernel/power/swap.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/kernel/power/swap.c b/kernel/power/swap.c index 33a186373bef2..8050e51828351 100644 --- a/kernel/power/swap.c +++ b/kernel/power/swap.c @@ -902,8 +902,11 @@ static int save_compressed_image(struct swap_map_handle *handle, for (thr = 0; thr < nr_threads; thr++) { if (data[thr].thr) kthread_stop(data[thr].thr); - acomp_request_free(data[thr].cr); - crypto_free_acomp(data[thr].cc); + if (data[thr].cr) + acomp_request_free(data[thr].cr); + + if (!IS_ERR_OR_NULL(data[thr].cc)) + crypto_free_acomp(data[thr].cc); } vfree(data); } @@ -1499,8 +1502,11 @@ static int load_compressed_image(struct swap_map_handle *handle, for (thr = 0; thr < nr_threads; thr++) { if (data[thr].thr) kthread_stop(data[thr].thr); - acomp_request_free(data[thr].cr); - crypto_free_acomp(data[thr].cc); + if (data[thr].cr) + acomp_request_free(data[thr].cr); + + if (!IS_ERR_OR_NULL(data[thr].cc)) + crypto_free_acomp(data[thr].cc); } vfree(data); } From 9297609b2d8230416fcc67c9bb8ccffb67c36fca Mon Sep 17 00:00:00 2001 From: Lorenzo Pieralisi Date: Mon, 5 Jan 2026 11:17:05 +0100 Subject: [PATCH 0855/1321] ACPI: PCI: IRQ: Fix INTx GSIs signedness In ACPI Global System Interrupts (GSIs) are described using a 32-bit value. ACPI/PCI legacy interrupts (INTx) parsing code treats GSIs as 'int', which poses issues if the GSI interrupt value is a 32-bit value with the MSB set (as required in some interrupt configurations - eg ARM64 GICv5 systems) because acpi_pci_link_allocate_irq() treats a negative gsi return value as a failed GSI allocation (and acpi_irq_get_penalty() would trigger an out-of-bounds array dereference if the 'irq' param is a negative value). Fix ACPI/PCI legacy INTx parsing by converting variables representing GSIs from 'int' to 'u32' bringing the code in line with the ACPI specification and fixing the current parsing issue. Signed-off-by: Lorenzo Pieralisi Reviewed-by: Bjorn Helgaas Link: https://patch.msgid.link/20260105101705.36703-1-lpieralisi@kernel.org Signed-off-by: Rafael J. Wysocki --- drivers/acpi/pci_irq.c | 19 ++++++++++-------- drivers/acpi/pci_link.c | 39 ++++++++++++++++++++++++------------- drivers/xen/acpi.c | 13 +++++++------ include/acpi/acpi_drivers.h | 2 +- 4 files changed, 44 insertions(+), 29 deletions(-) diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c index ad81aa03fe2f8..c416942ff3e26 100644 --- a/drivers/acpi/pci_irq.c +++ b/drivers/acpi/pci_irq.c @@ -188,7 +188,7 @@ static int acpi_pci_irq_check_entry(acpi_handle handle, struct pci_dev *dev, * the IRQ value, which is hardwired to specific interrupt inputs on * the interrupt controller. */ - pr_debug("%04x:%02x:%02x[%c] -> %s[%d]\n", + pr_debug("%04x:%02x:%02x[%c] -> %s[%u]\n", entry->id.segment, entry->id.bus, entry->id.device, pin_name(entry->pin), prt->source, entry->index); @@ -384,7 +384,7 @@ static inline bool acpi_pci_irq_valid(struct pci_dev *dev, u8 pin) int acpi_pci_irq_enable(struct pci_dev *dev) { struct acpi_prt_entry *entry; - int gsi; + u32 gsi; u8 pin; int triggering = ACPI_LEVEL_SENSITIVE; /* @@ -422,18 +422,21 @@ int acpi_pci_irq_enable(struct pci_dev *dev) return 0; } + rc = -ENODEV; + if (entry) { if (entry->link) - gsi = acpi_pci_link_allocate_irq(entry->link, + rc = acpi_pci_link_allocate_irq(entry->link, entry->index, &triggering, &polarity, - &link); - else + &link, &gsi); + else { gsi = entry->index; - } else - gsi = -1; + rc = 0; + } + } - if (gsi < 0) { + if (rc < 0) { /* * No IRQ known to the ACPI subsystem - maybe the BIOS / * driver reported one, then use it. Exit in any case. diff --git a/drivers/acpi/pci_link.c b/drivers/acpi/pci_link.c index bed7dc85612e3..b91b039a3d20f 100644 --- a/drivers/acpi/pci_link.c +++ b/drivers/acpi/pci_link.c @@ -448,7 +448,7 @@ static int acpi_isa_irq_penalty[ACPI_MAX_ISA_IRQS] = { /* >IRQ15 */ }; -static int acpi_irq_pci_sharing_penalty(int irq) +static int acpi_irq_pci_sharing_penalty(u32 irq) { struct acpi_pci_link *link; int penalty = 0; @@ -474,7 +474,7 @@ static int acpi_irq_pci_sharing_penalty(int irq) return penalty; } -static int acpi_irq_get_penalty(int irq) +static int acpi_irq_get_penalty(u32 irq) { int penalty = 0; @@ -528,7 +528,7 @@ static int acpi_irq_balance = -1; /* 0: static, 1: balance */ static int acpi_pci_link_allocate(struct acpi_pci_link *link) { acpi_handle handle = link->device->handle; - int irq; + u32 irq; int i; if (link->irq.initialized) { @@ -598,44 +598,53 @@ static int acpi_pci_link_allocate(struct acpi_pci_link *link) return 0; } -/* - * acpi_pci_link_allocate_irq - * success: return IRQ >= 0 - * failure: return -1 +/** + * acpi_pci_link_allocate_irq(): Retrieve a link device GSI + * + * @handle: Handle for the link device + * @index: GSI index + * @triggering: pointer to store the GSI trigger + * @polarity: pointer to store GSI polarity + * @name: pointer to store link device name + * @gsi: pointer to store GSI number + * + * Returns: + * 0 on success with @triggering, @polarity, @name, @gsi initialized. + * -ENODEV on failure */ int acpi_pci_link_allocate_irq(acpi_handle handle, int index, int *triggering, - int *polarity, char **name) + int *polarity, char **name, u32 *gsi) { struct acpi_device *device = acpi_fetch_acpi_dev(handle); struct acpi_pci_link *link; if (!device) { acpi_handle_err(handle, "Invalid link device\n"); - return -1; + return -ENODEV; } link = acpi_driver_data(device); if (!link) { acpi_handle_err(handle, "Invalid link context\n"); - return -1; + return -ENODEV; } /* TBD: Support multiple index (IRQ) entries per Link Device */ if (index) { acpi_handle_err(handle, "Invalid index %d\n", index); - return -1; + return -ENODEV; } mutex_lock(&acpi_link_lock); if (acpi_pci_link_allocate(link)) { mutex_unlock(&acpi_link_lock); - return -1; + return -ENODEV; } if (!link->irq.active) { mutex_unlock(&acpi_link_lock); acpi_handle_err(handle, "Link active IRQ is 0!\n"); - return -1; + return -ENODEV; } link->refcnt++; mutex_unlock(&acpi_link_lock); @@ -647,7 +656,9 @@ int acpi_pci_link_allocate_irq(acpi_handle handle, int index, int *triggering, if (name) *name = acpi_device_bid(link->device); acpi_handle_debug(handle, "Link is referenced\n"); - return link->irq.active; + *gsi = link->irq.active; + + return 0; } /* diff --git a/drivers/xen/acpi.c b/drivers/xen/acpi.c index d2ee605c5ca1c..eab28cfe99391 100644 --- a/drivers/xen/acpi.c +++ b/drivers/xen/acpi.c @@ -89,11 +89,11 @@ int xen_acpi_get_gsi_info(struct pci_dev *dev, int *trigger_out, int *polarity_out) { - int gsi; + u32 gsi; u8 pin; struct acpi_prt_entry *entry; int trigger = ACPI_LEVEL_SENSITIVE; - int polarity = acpi_irq_model == ACPI_IRQ_MODEL_GIC ? + int ret, polarity = acpi_irq_model == ACPI_IRQ_MODEL_GIC ? ACPI_ACTIVE_HIGH : ACPI_ACTIVE_LOW; if (!dev || !gsi_out || !trigger_out || !polarity_out) @@ -105,17 +105,18 @@ int xen_acpi_get_gsi_info(struct pci_dev *dev, entry = acpi_pci_irq_lookup(dev, pin); if (entry) { + ret = 0; if (entry->link) - gsi = acpi_pci_link_allocate_irq(entry->link, + ret = acpi_pci_link_allocate_irq(entry->link, entry->index, &trigger, &polarity, - NULL); + NULL, &gsi); else gsi = entry->index; } else - gsi = -1; + ret = -ENODEV; - if (gsi < 0) + if (ret < 0) return -EINVAL; *gsi_out = gsi; diff --git a/include/acpi/acpi_drivers.h b/include/acpi/acpi_drivers.h index b14d165632e76..402b97d121381 100644 --- a/include/acpi/acpi_drivers.h +++ b/include/acpi/acpi_drivers.h @@ -51,7 +51,7 @@ int acpi_irq_penalty_init(void); int acpi_pci_link_allocate_irq(acpi_handle handle, int index, int *triggering, - int *polarity, char **name); + int *polarity, char **name, u32 *gsi); int acpi_pci_link_free_irq(acpi_handle handle); /* ACPI PCI Device Binding */ From 829c11b55fd3b2e6aa1a53bf0a7ba74bbe7d8b98 Mon Sep 17 00:00:00 2001 From: Manivannan Sadhasivam Date: Wed, 26 Nov 2025 13:47:18 +0530 Subject: [PATCH 0856/1321] PCI: qcom: Remove ASPM L0s support for MSM8996 SoC Though I couldn't confirm ASPM L0s support with the Qcom hardware team, a bug report from Dmitry suggests that L0s is broken on this legacy SoC. Hence, remove L0s support from the Root Port Link Capabilities in this SoC. Since qcom_pcie_clear_aspm_l0s() is now used by more than one SoC config, call it from qcom_pcie_host_init() instead. Reported-by: Dmitry Baryshkov Closes: https://lore.kernel.org/linux-pci/4cp5pzmlkkht2ni7us6p3edidnk25l45xrp6w3fxguqcvhq2id@wjqqrdpkypkf Signed-off-by: Manivannan Sadhasivam Signed-off-by: Manivannan Sadhasivam Signed-off-by: Bjorn Helgaas Tested-by: Dmitry Baryshkov Reviewed-by: Konrad Dybcio Link: https://patch.msgid.link/20251126081718.8239-1-mani@kernel.org --- drivers/pci/controller/dwc/pcie-qcom.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c index 7b92e7a1c0d93..5a318487b2b3f 100644 --- a/drivers/pci/controller/dwc/pcie-qcom.c +++ b/drivers/pci/controller/dwc/pcie-qcom.c @@ -1047,7 +1047,6 @@ static int qcom_pcie_post_init_2_7_0(struct qcom_pcie *pcie) writel(WR_NO_SNOOP_OVERRIDE_EN | RD_NO_SNOOP_OVERRIDE_EN, pcie->parf + PARF_NO_SNOOP_OVERRIDE); - qcom_pcie_clear_aspm_l0s(pcie->pci); qcom_pcie_clear_hpc(pcie->pci); return 0; @@ -1316,6 +1315,8 @@ static int qcom_pcie_host_init(struct dw_pcie_rp *pp) goto err_disable_phy; } + qcom_pcie_clear_aspm_l0s(pcie->pci); + qcom_ep_reset_deassert(pcie); if (pcie->cfg->ops->config_sid) { @@ -1464,6 +1465,7 @@ static const struct qcom_pcie_cfg cfg_2_1_0 = { static const struct qcom_pcie_cfg cfg_2_3_2 = { .ops = &ops_2_3_2, + .no_l0s = true, }; static const struct qcom_pcie_cfg cfg_2_3_3 = { From 29060c7f130fbfb5c3d56c4d8375e566b4e8df72 Mon Sep 17 00:00:00 2001 From: Bjorn Helgaas Date: Mon, 3 Nov 2025 16:19:26 -0600 Subject: [PATCH 0857/1321] PCI: meson: Report that link is up while in ASPM L0s and L1 states Previously meson_pcie_link_up() only returned true if the link was in the L0 state. This was incorrect because hardware autonomously manages transitions between L0, L0s, and L1 while both components on the link stay in D0. Those states should all be treated as "link is active". Returning false when the device was in L0s or L1 broke config accesses because dw_pcie_other_conf_map_bus() fails if the link is down, which caused errors like this: meson-pcie fc000000.pcie: error: wait linkup timeout pci 0000:01:00.0: BAR 0: error updating (0xfc700004 != 0xffffffff) Remove the LTSSM state check, timeout, speed check, and error message from meson_pcie_link_up(), the dw_pcie_ops.link_up() method, so it is a simple boolean check of whether the link is active. Timeouts and error messages are handled at a higher level, e.g., dw_pcie_wait_for_link(). Fixes: 9c0ef6d34fdb ("PCI: amlogic: Add the Amlogic Meson PCIe controller driver") Reported-by: Linnaea Lavia Closes: https://lore.kernel.org/r/DM4PR05MB102707B8CDF84D776C39F22F2C7F0A@DM4PR05MB10270.namprd05.prod.outlook.com [bhelgaas: squash removal of unused WAIT_LINKUP_TIMEOUT by Martin Blumenstingl : https://patch.msgid.link/20260105125625.239497-1-martin.blumenstingl@googlemail.com] Signed-off-by: Bjorn Helgaas Tested-by: Linnaea Lavia Tested-by: Neil Armstrong # on BananaPi M2S Reviewed-by: Neil Armstrong Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251103221930.1831376-1-helgaas@kernel.org Link: https://patch.msgid.link/20260105125625.239497-1-martin.blumenstingl@googlemail.com --- drivers/pci/controller/dwc/pci-meson.c | 39 +++----------------------- 1 file changed, 4 insertions(+), 35 deletions(-) diff --git a/drivers/pci/controller/dwc/pci-meson.c b/drivers/pci/controller/dwc/pci-meson.c index 54b6a4196f176..0694084f612b7 100644 --- a/drivers/pci/controller/dwc/pci-meson.c +++ b/drivers/pci/controller/dwc/pci-meson.c @@ -37,7 +37,6 @@ #define PCIE_CFG_STATUS17 0x44 #define PM_CURRENT_STATE(x) (((x) >> 7) & 0x1) -#define WAIT_LINKUP_TIMEOUT 4000 #define PORT_CLK_RATE 100000000UL #define MAX_PAYLOAD_SIZE 256 #define MAX_READ_REQ_SIZE 256 @@ -350,40 +349,10 @@ static struct pci_ops meson_pci_ops = { static bool meson_pcie_link_up(struct dw_pcie *pci) { struct meson_pcie *mp = to_meson_pcie(pci); - struct device *dev = pci->dev; - u32 speed_okay = 0; - u32 cnt = 0; - u32 state12, state17, smlh_up, ltssm_up, rdlh_up; - - do { - state12 = meson_cfg_readl(mp, PCIE_CFG_STATUS12); - state17 = meson_cfg_readl(mp, PCIE_CFG_STATUS17); - smlh_up = IS_SMLH_LINK_UP(state12); - rdlh_up = IS_RDLH_LINK_UP(state12); - ltssm_up = IS_LTSSM_UP(state12); - - if (PM_CURRENT_STATE(state17) < PCIE_GEN3) - speed_okay = 1; - - if (smlh_up) - dev_dbg(dev, "smlh_link_up is on\n"); - if (rdlh_up) - dev_dbg(dev, "rdlh_link_up is on\n"); - if (ltssm_up) - dev_dbg(dev, "ltssm_up is on\n"); - if (speed_okay) - dev_dbg(dev, "speed_okay\n"); - - if (smlh_up && rdlh_up && ltssm_up && speed_okay) - return true; - - cnt++; - - udelay(10); - } while (cnt < WAIT_LINKUP_TIMEOUT); - - dev_err(dev, "error: wait linkup timeout\n"); - return false; + u32 state12; + + state12 = meson_cfg_readl(mp, PCIE_CFG_STATUS12); + return IS_SMLH_LINK_UP(state12) && IS_RDLH_LINK_UP(state12); } static int meson_pcie_host_init(struct dw_pcie_rp *pp) From 2e3efaade68a63efbf3c56d2325bab0e93619b06 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ilpo=20J=C3=A4rvinen?= Date: Mon, 24 Nov 2025 19:04:11 +0200 Subject: [PATCH 0858/1321] sparc/PCI: Correct 64-bit non-pref -> pref BAR resources MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit SPARC T5-2 dts describes some PCI BARs as 64-bit resources without the pref(etchable) bit (0x83... vs 0xc3... in assigned-addresses) for address ranges above the 4G threshold. Such resources cannot be placed into a non-prefetchable PCI bridge window that is capable only of 32-bit addressing. As such, it looks like the platform is improperly described by the dts. The kernel detects this problem (see the IORESOURCE_PREFETCH check in pci_find_parent_resource()) and fails to assign these BAR resources to the resource tree due to lack of a compatible bridge window. Prior to 754babaaf333 ("sparc/PCI: Remove pcibios_enable_device() as they do nothing extra") SPARC arch code did not test whether device resources were successfully in the resource tree when enabling a device, effectively hiding the problem. After removing the arch-specific enable code, pci_enable_resources() refuses to enable the device when it finds not all mem resources are assigned, and therefore mpt3sas can't be enabled: pci 0001:04:00.0: reg 0x14: [mem 0x801110000000-0x80111000ffff 64bit] pci 0001:04:00.0: reg 0x1c: [mem 0x801110040000-0x80111007ffff 64bit] pci 0001:04:00.0: BAR 1 [mem 0x801110000000-0x80111000ffff 64bit]: can't claim; no compatible bridge window pci 0001:04:00.0: BAR 3 [mem 0x801110040000-0x80111007ffff 64bit]: can't claim; no compatible bridge window mpt3sas 0001:04:00.0: BAR 1 [mem size 0x00010000 64bit]: not assigned; can't enable device For clarity, this filtered log only shows failures for one mpt3sas device but other devices fail similarly. In the reported case, the end result with all the failures is an unbootable system. Things appeared to "work" before 754babaaf333 ("sparc/PCI: Remove pcibios_enable_device() as they do nothing extra") because the resource tree is agnostic to whether PCI BAR resources are properly in the tree or not. So as long as there was a parent resource (e.g. a root bus resource) that contains the address range, the resource tree code just places resource request underneath it without any consideration to the intermediate BAR resource. While it worked, it's incorrect setup still. Add an OF fixup to set the IORESOURCE_PREFETCH flag for a 64-bit PCI resource that has the end address above 4G requiring placement into the prefetchable window. Also log the issue. Fixes: 754babaaf333 ("sparc/PCI: Remove pcibios_enable_device() as they do nothing extra") Reported-by: Nathaniel Roach Closes: https://github.com/sparclinux/issues/issues/22 Signed-off-by: Ilpo Järvinen Signed-off-by: Bjorn Helgaas Tested-by: Nathaniel Roach Link: https://patch.msgid.link/20251124170411.3709-1-ilpo.jarvinen@linux.intel.com --- arch/sparc/kernel/pci.c | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/arch/sparc/kernel/pci.c b/arch/sparc/kernel/pci.c index a9448088e762e..b290107170e94 100644 --- a/arch/sparc/kernel/pci.c +++ b/arch/sparc/kernel/pci.c @@ -181,6 +181,28 @@ static int __init ofpci_debug(char *str) __setup("ofpci_debug=", ofpci_debug); +static void of_fixup_pci_pref(struct pci_dev *dev, int index, + struct resource *res) +{ + struct pci_bus_region region; + + if (!(res->flags & IORESOURCE_MEM_64)) + return; + + if (!resource_size(res)) + return; + + pcibios_resource_to_bus(dev->bus, ®ion, res); + if (region.end <= ~((u32)0)) + return; + + if (!(res->flags & IORESOURCE_PREFETCH)) { + res->flags |= IORESOURCE_PREFETCH; + pci_info(dev, "reg 0x%x: fixup: pref added to 64-bit resource\n", + index); + } +} + static unsigned long pci_parse_of_flags(u32 addr0) { unsigned long flags = 0; @@ -244,6 +266,7 @@ static void pci_parse_of_addrs(struct platform_device *op, res->end = op_res->end; res->flags = flags; res->name = pci_name(dev); + of_fixup_pci_pref(dev, i, res); pci_info(dev, "reg 0x%x: %pR\n", i, res); } From 7c20d3f070b04059b306c18968c40d2c209369f5 Mon Sep 17 00:00:00 2001 From: Qu Wenruo Date: Sun, 7 Dec 2025 14:53:20 +1030 Subject: [PATCH 0859/1321] btrfs: avoid access-beyond-folio for bs > ps encoded writes [POTENTIAL BUG] If the system page size is 4K and fs block size is 8K, and max_inline mount option is set to 6K, we can inline a 6K sized data extent. Then a encoded write submitted a compressed extent which is at file offset 0, and the compressed length is 6K, which is allowed to be inlined. Now a read beyond page boundary is triggered inside write_extent_buffer() from insert_inline_extent(). [CAUSE] Currently the function __cow_file_range_inline() can only accept a single folio. For regular compressed write path, we always allocate the compressed folios using the minimal order matching the block size, thus the @compressed_folio should always cover a full fs block thus it is fine. But for encoded writes, they allocate page size folios, this means we can hit a case where the compressed data is smaller than block size but still larger than page size, in that case __cow_file_range_inline() will be called with @compressed_size larger than a page. In that case we will trigger a read beyond the folio inside insert_inline_extent(). Thankfully this is not that common, as the default max_inline is only 2048 bytes, smaller than PAGE_SIZE, and bs > ps support is still experimental. [FIX] We need to either allow insert_inline_extent() to accept a page array to properly support such case, or reject such inline extent. The latter is a much simpler solution, and considering bs > ps will stay as a corner case and non-default max_inline will be even rarer, I don't think we really need to fulfill such niche. So just reject any inline extent that's larger than PAGE_SIZE, and add an extra ASSERT() to insert_inline_extent() to catch such beyond-boundary access. Fixes: ec20799064c8 ("btrfs: enable encoded read/write/send for bs > ps cases") Signed-off-by: Qu Wenruo Reviewed-by: David Sterba Signed-off-by: David Sterba --- fs/btrfs/inode.c | 22 ++++++++++++++++++---- 1 file changed, 18 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 599c03a1c5737..00bc30aee5215 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -481,13 +481,15 @@ static int insert_inline_extent(struct btrfs_trans_handle *trans, ASSERT(size <= sectorsize); /* - * The compressed size also needs to be no larger than a sector. - * That's also why we only need one page as the parameter. + * The compressed size also needs to be no larger than a page. + * That's also why we only need one folio as the parameter. */ - if (compressed_folio) + if (compressed_folio) { ASSERT(compressed_size <= sectorsize); - else + ASSERT(compressed_size <= PAGE_SIZE); + } else { ASSERT(compressed_size == 0); + } if (compressed_size && compressed_folio) cur_size = compressed_size; @@ -574,6 +576,18 @@ static bool can_cow_file_range_inline(struct btrfs_inode *inode, if (offset != 0) return false; + /* + * Even for bs > ps cases, cow_file_range_inline() can only accept a + * single folio. + * + * This can be problematic and cause access beyond page boundary if a + * page sized folio is passed into that function. + * And encoded write is doing exactly that. + * So here limits the inlined extent size to PAGE_SIZE. + */ + if (size > PAGE_SIZE || compressed_size > PAGE_SIZE) + return false; + /* Inline extents are limited to sectorsize. */ if (size > fs_info->sectorsize) return false; From b458c0eb131764a2b9cb45156d314c07a56ec261 Mon Sep 17 00:00:00 2001 From: Filipe Manana Date: Tue, 16 Dec 2025 14:51:52 +0000 Subject: [PATCH 0860/1321] btrfs: release path before initializing extent tree in btrfs_read_locked_inode() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit In btrfs_read_locked_inode() we are calling btrfs_init_file_extent_tree() while holding a path with a read locked leaf from a subvolume tree, and btrfs_init_file_extent_tree() may do a GFP_KERNEL allocation, which can trigger reclaim. This can create a circular lock dependency which lockdep warns about with the following splat: [6.1433] ====================================================== [6.1574] WARNING: possible circular locking dependency detected [6.1583] 6.18.0+ #4 Tainted: G U [6.1591] ------------------------------------------------------ [6.1599] kswapd0/117 is trying to acquire lock: [6.1606] ffff8d9b6333c5b8 (&delayed_node->mutex){+.+.}-{3:3}, at: __btrfs_release_delayed_node.part.0+0x39/0x2f0 [6.1625] but task is already holding lock: [6.1633] ffffffffa4ab8ce0 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x195/0xc60 [6.1646] which lock already depends on the new lock. [6.1657] the existing dependency chain (in reverse order) is: [6.1667] -> #2 (fs_reclaim){+.+.}-{0:0}: [6.1677] fs_reclaim_acquire+0x9d/0xd0 [6.1685] __kmalloc_cache_noprof+0x59/0x750 [6.1694] btrfs_init_file_extent_tree+0x90/0x100 [6.1702] btrfs_read_locked_inode+0xc3/0x6b0 [6.1710] btrfs_iget+0xbb/0xf0 [6.1716] btrfs_lookup_dentry+0x3c5/0x8e0 [6.1724] btrfs_lookup+0x12/0x30 [6.1731] lookup_open.isra.0+0x1aa/0x6a0 [6.1739] path_openat+0x5f7/0xc60 [6.1746] do_filp_open+0xd6/0x180 [6.1753] do_sys_openat2+0x8b/0xe0 [6.1760] __x64_sys_openat+0x54/0xa0 [6.1768] do_syscall_64+0x97/0x3e0 [6.1776] entry_SYSCALL_64_after_hwframe+0x76/0x7e [6.1784] -> #1 (btrfs-tree-00){++++}-{3:3}: [6.1794] lock_release+0x127/0x2a0 [6.1801] up_read+0x1b/0x30 [6.1808] btrfs_search_slot+0x8e0/0xff0 [6.1817] btrfs_lookup_inode+0x52/0xd0 [6.1825] __btrfs_update_delayed_inode+0x73/0x520 [6.1833] btrfs_commit_inode_delayed_inode+0x11a/0x120 [6.1842] btrfs_log_inode+0x608/0x1aa0 [6.1849] btrfs_log_inode_parent+0x249/0xf80 [6.1857] btrfs_log_dentry_safe+0x3e/0x60 [6.1865] btrfs_sync_file+0x431/0x690 [6.1872] do_fsync+0x39/0x80 [6.1879] __x64_sys_fsync+0x13/0x20 [6.1887] do_syscall_64+0x97/0x3e0 [6.1894] entry_SYSCALL_64_after_hwframe+0x76/0x7e [6.1903] -> #0 (&delayed_node->mutex){+.+.}-{3:3}: [6.1913] __lock_acquire+0x15e9/0x2820 [6.1920] lock_acquire+0xc9/0x2d0 [6.1927] __mutex_lock+0xcc/0x10a0 [6.1934] __btrfs_release_delayed_node.part.0+0x39/0x2f0 [6.1944] btrfs_evict_inode+0x20b/0x4b0 [6.1952] evict+0x15a/0x2f0 [6.1958] prune_icache_sb+0x91/0xd0 [6.1966] super_cache_scan+0x150/0x1d0 [6.1974] do_shrink_slab+0x155/0x6f0 [6.1981] shrink_slab+0x48e/0x890 [6.1988] shrink_one+0x11a/0x1f0 [6.1995] shrink_node+0xbfd/0x1320 [6.1002] balance_pgdat+0x67f/0xc60 [6.1321] kswapd+0x1dc/0x3e0 [6.1643] kthread+0xff/0x240 [6.1965] ret_from_fork+0x223/0x280 [6.1287] ret_from_fork_asm+0x1a/0x30 [6.1616] other info that might help us debug this: [6.1561] Chain exists of: &delayed_node->mutex --> btrfs-tree-00 --> fs_reclaim [6.1503] Possible unsafe locking scenario: [6.1110] CPU0 CPU1 [6.1411] ---- ---- [6.1707] lock(fs_reclaim); [6.1998] lock(btrfs-tree-00); [6.1291] lock(fs_reclaim); [6.1581] lock(&delayed_node->mutex); [6.1874] *** DEADLOCK *** [6.1716] 2 locks held by kswapd0/117: [6.1999] #0: ffffffffa4ab8ce0 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x195/0xc60 [6.1294] #1: ffff8d998344b0e0 (&type->s_umount_key#40){++++}- {3:3}, at: super_cache_scan+0x37/0x1d0 [6.1596] stack backtrace: [6.1183] CPU: 11 UID: 0 PID: 117 Comm: kswapd0 Tainted: G U 6.18.0+ #4 PREEMPT(lazy) [6.1185] Tainted: [U]=USER [6.1186] Hardware name: ASUS System Product Name/PRIME B560M-A AC, BIOS 2001 02/01/2023 [6.1187] Call Trace: [6.1187] [6.1189] dump_stack_lvl+0x6e/0xa0 [6.1192] print_circular_bug.cold+0x17a/0x1c0 [6.1194] check_noncircular+0x175/0x190 [6.1197] __lock_acquire+0x15e9/0x2820 [6.1200] lock_acquire+0xc9/0x2d0 [6.1201] ? __btrfs_release_delayed_node.part.0+0x39/0x2f0 [6.1204] __mutex_lock+0xcc/0x10a0 [6.1206] ? __btrfs_release_delayed_node.part.0+0x39/0x2f0 [6.1208] ? __btrfs_release_delayed_node.part.0+0x39/0x2f0 [6.1211] ? __btrfs_release_delayed_node.part.0+0x39/0x2f0 [6.1213] __btrfs_release_delayed_node.part.0+0x39/0x2f0 [6.1215] btrfs_evict_inode+0x20b/0x4b0 [6.1217] ? lock_acquire+0xc9/0x2d0 [6.1220] evict+0x15a/0x2f0 [6.1222] prune_icache_sb+0x91/0xd0 [6.1224] super_cache_scan+0x150/0x1d0 [6.1226] do_shrink_slab+0x155/0x6f0 [6.1228] shrink_slab+0x48e/0x890 [6.1229] ? shrink_slab+0x2d2/0x890 [6.1231] shrink_one+0x11a/0x1f0 [6.1234] shrink_node+0xbfd/0x1320 [6.1236] ? shrink_node+0xa2d/0x1320 [6.1236] ? shrink_node+0xbd3/0x1320 [6.1239] ? balance_pgdat+0x67f/0xc60 [6.1239] balance_pgdat+0x67f/0xc60 [6.1241] ? finish_task_switch.isra.0+0xc4/0x2a0 [6.1246] kswapd+0x1dc/0x3e0 [6.1247] ? __pfx_autoremove_wake_function+0x10/0x10 [6.1249] ? __pfx_kswapd+0x10/0x10 [6.1250] kthread+0xff/0x240 [6.1251] ? __pfx_kthread+0x10/0x10 [6.1253] ret_from_fork+0x223/0x280 [6.1255] ? __pfx_kthread+0x10/0x10 [6.1257] ret_from_fork_asm+0x1a/0x30 [6.1260] This is because: 1) The fsync task is holding an inode's delayed node mutex (for a directory) while calling __btrfs_update_delayed_inode() and that needs to do a search on the subvolume's btree (therefore read lock some extent buffers); 2) The lookup task, at btrfs_lookup(), triggered reclaim with the GFP_KERNEL allocation done by btrfs_init_file_extent_tree() while holding a read lock on a subvolume leaf; 3) The reclaim triggered kswapd which is doing inode eviction for the directory inode the fsync task is using as an argument to btrfs_commit_inode_delayed_inode() - but in that call chain we are trying to read lock the same leaf that the lookup task is holding while calling btrfs_init_file_extent_tree() and doing the GFP_KERNEL allocation. Fix this by calling btrfs_init_file_extent_tree() after we don't need the path anymore and release it in btrfs_read_locked_inode(). Reported-by: Thomas Hellström Link: https://lore.kernel.org/linux-btrfs/6e55113a22347c3925458a5d840a18401a38b276.camel@linux.intel.com/ Fixes: 8679d2687c35 ("btrfs: initialize inode::file_extent_tree after i_mode has been set") Reviewed-by: Qu Wenruo Signed-off-by: Filipe Manana Reviewed-by: David Sterba Signed-off-by: David Sterba --- fs/btrfs/inode.c | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 00bc30aee5215..d2b302ac6af96 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -4048,11 +4048,6 @@ static int btrfs_read_locked_inode(struct btrfs_inode *inode, struct btrfs_path btrfs_set_inode_mapping_order(inode); cache_index: - ret = btrfs_init_file_extent_tree(inode); - if (ret) - goto out; - btrfs_inode_set_file_extent_range(inode, 0, - round_up(i_size_read(vfs_inode), fs_info->sectorsize)); /* * If we were modified in the current generation and evicted from memory * and then re-read we need to do a full sync since we don't have any @@ -4139,6 +4134,20 @@ static int btrfs_read_locked_inode(struct btrfs_inode *inode, struct btrfs_path btrfs_ino(inode), btrfs_root_id(root), ret); } + /* + * We don't need the path anymore, so release it to avoid holding a read + * lock on a leaf while calling btrfs_init_file_extent_tree(), which can + * allocate memory that triggers reclaim (GFP_KERNEL) and cause a locking + * dependency. + */ + btrfs_release_path(path); + + ret = btrfs_init_file_extent_tree(inode); + if (ret) + goto out; + btrfs_inode_set_file_extent_range(inode, 0, + round_up(i_size_read(vfs_inode), fs_info->sectorsize)); + if (!maybe_acls) cache_no_acl(vfs_inode); From 3c902eb00296aba1666b124019b7199415eba129 Mon Sep 17 00:00:00 2001 From: Qu Wenruo Date: Thu, 18 Dec 2025 15:15:28 +1030 Subject: [PATCH 0861/1321] btrfs: only enforce free space tree if v1 cache is required for bs < ps cases [BUG] Since the introduction of btrfs bs < ps support, v1 cache was never on the plan due to its hard coded PAGE_SIZE usage, and the future plan to properly deprecate it. However for bs < ps cases, even if 'nospace_cache,clear_cache' mount option is specified, it's never respected and free space tree is always enabled: mkfs.btrfs -f -O ^bgt,fst $dev mount $dev $mnt -o clear_cache,nospace_cache umount $mnt btrfs ins dump-super $dev ... compat_ro_flags 0x3 ( FREE_SPACE_TREE | FREE_SPACE_TREE_VALID ) ... This means a different behavior compared to bs >= ps cases. [CAUSE] The forcing usage of v2 space cache is done inside btrfs_set_free_space_cache_settings(), however it never checks if we're even using space cache but always enabling v2 cache. [FIX] Instead unconditionally enable v2 cache, only forcing v2 cache if the old v1 cache is required. Now v2 space cache can be properly disabled on bs < ps cases: mkfs.btrfs -f -O ^bgt,fst $dev mount $dev $mnt -o clear_cache,nospace_cache umount $mnt btrfs ins dump-super $dev ... compat_ro_flags 0x0 ... Fixes: 9f73f1aef98b ("btrfs: force v2 space cache usage for subpage mount") Reviewed-by: Filipe Manana Signed-off-by: Qu Wenruo Reviewed-by: David Sterba Signed-off-by: David Sterba --- fs/btrfs/super.c | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-) diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c index 1999533b52bee..9f85464988183 100644 --- a/fs/btrfs/super.c +++ b/fs/btrfs/super.c @@ -736,14 +736,12 @@ bool btrfs_check_options(const struct btrfs_fs_info *info, */ void btrfs_set_free_space_cache_settings(struct btrfs_fs_info *fs_info) { - if (fs_info->sectorsize < PAGE_SIZE) { + if (fs_info->sectorsize < PAGE_SIZE && btrfs_test_opt(fs_info, SPACE_CACHE)) { + btrfs_info(fs_info, + "forcing free space tree for sector size %u with page size %lu", + fs_info->sectorsize, PAGE_SIZE); btrfs_clear_opt(fs_info->mount_opt, SPACE_CACHE); - if (!btrfs_test_opt(fs_info, FREE_SPACE_TREE)) { - btrfs_info(fs_info, - "forcing free space tree for sector size %u with page size %lu", - fs_info->sectorsize, PAGE_SIZE); - btrfs_set_opt(fs_info->mount_opt, FREE_SPACE_TREE); - } + btrfs_set_opt(fs_info->mount_opt, FREE_SPACE_TREE); } /* From 684f9aa69384ecb2aa09cfd70abc9bc4951364a0 Mon Sep 17 00:00:00 2001 From: Qu Wenruo Date: Thu, 18 Dec 2025 15:15:29 +1030 Subject: [PATCH 0862/1321] btrfs: force free space tree for bs > ps cases [BUG] Currently we only enforcing the free space tree for bs < ps cases, but with the recently added bs > ps support, we lack the free space tree enforcing, causing explicit v1 cache mount option to fail on bs > ps cases: # mount -o space_cache=v1 /dev/test/scratch1 /mnt/btrfs/ mount: /mnt/btrfs: wrong fs type, bad option, bad superblock on /dev/mapper/test-scratch1, missing codepage or helper program, or other error. dmesg(1) may have more information after failed mount system call. # dmesg -t | tail -n7 BTRFS: device fsid ac14a6fa-4ec9-449e-aec9-7d1777bfdc06 devid 1 transid 11 /dev/mapper/test-scratch1 (253:3) scanned by mount (2849) BTRFS info (device dm-3): first mount of filesystem ac14a6fa-4ec9-449e-aec9-7d1777bfdc06 BTRFS info (device dm-3): using crc32c checksum algorithm BTRFS warning (device dm-3): support for block size 8192 with page size 4096 is experimental, some features may be missing BTRFS warning (device dm-3): space cache v1 is being deprecated and will be removed in a future release, please use -o space_cache=v2 BTRFS warning (device dm-3): v1 space cache is not supported for page size 4096 with sectorsize 8192 BTRFS error (device dm-3): open_ctree failed: -22 [FIX] Just enable the same free space tree for bs > ps cases, aligning the behavior to bs < ps cases. Reviewed-by: Filipe Manana Signed-off-by: Qu Wenruo Reviewed-by: David Sterba Signed-off-by: David Sterba --- fs/btrfs/super.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c index 9f85464988183..af56fdbba65d2 100644 --- a/fs/btrfs/super.c +++ b/fs/btrfs/super.c @@ -736,7 +736,7 @@ bool btrfs_check_options(const struct btrfs_fs_info *info, */ void btrfs_set_free_space_cache_settings(struct btrfs_fs_info *fs_info) { - if (fs_info->sectorsize < PAGE_SIZE && btrfs_test_opt(fs_info, SPACE_CACHE)) { + if (fs_info->sectorsize != PAGE_SIZE && btrfs_test_opt(fs_info, SPACE_CACHE)) { btrfs_info(fs_info, "forcing free space tree for sector size %u with page size %lu", fs_info->sectorsize, PAGE_SIZE); From da4330a0b2afc6662618e89e047d71f130ef475f Mon Sep 17 00:00:00 2001 From: Suchit Karunakaran Date: Fri, 19 Dec 2025 22:44:34 +0530 Subject: [PATCH 0863/1321] btrfs: fix NULL pointer dereference in do_abort_log_replay() Coverity reported a NULL pointer dereference issue (CID 1666756) in do_abort_log_replay(). When btrfs_alloc_path() fails in replay_one_buffer(), wc->subvol_path is NULL, but btrfs_abort_log_replay() calls do_abort_log_replay() which unconditionally dereferences wc->subvol_path when attempting to print debug information. Fix this by adding a NULL check before dereferencing wc->subvol_path in do_abort_log_replay(). Fixes: 2753e4917624 ("btrfs: dump detailed info and specific messages on log replay failures") Reviewed-by: Filipe Manana Signed-off-by: Suchit Karunakaran Signed-off-by: Filipe Manana Signed-off-by: David Sterba --- fs/btrfs/tree-log.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c index 5831754bb01c4..2d9d38b82daa2 100644 --- a/fs/btrfs/tree-log.c +++ b/fs/btrfs/tree-log.c @@ -190,7 +190,7 @@ static void do_abort_log_replay(struct walk_control *wc, const char *function, btrfs_abort_transaction(wc->trans, error); - if (wc->subvol_path->nodes[0]) { + if (wc->subvol_path && wc->subvol_path->nodes[0]) { btrfs_crit(fs_info, "subvolume (root %llu) leaf currently being processed:", btrfs_root_id(wc->root)); From 15756d0e12c4f1144804ee19154c893d9dc69d0a Mon Sep 17 00:00:00 2001 From: Mark Harmstone Date: Fri, 19 Dec 2025 18:15:28 +0000 Subject: [PATCH 0864/1321] btrfs: show correct warning if can't read data reloc tree If a filesystem is missing its data reloc tree, we get something like this in dmesg: BTRFS warning (device loop11): failed to read root (objectid=4): -2 BTRFS error (device loop11): open_ctree failed: -2 objectid is BTRFS_DEV_TREE_OBJECTID, but this should actually be the value of BTRFS_DATA_RELOC_TREE_OBJECTID. btrfs_read_roots() prints location.objectid on failure, but this isn't set when reading the data reloc tree. Set location.objectid to the correct value on failure, so that the error message makes sense. Reviewed-by: Qu Wenruo Signed-off-by: Mark Harmstone Reviewed-by: David Sterba Signed-off-by: David Sterba --- fs/btrfs/disk-io.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 89149fac804c8..d8ca5b6e88e0d 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -2255,6 +2255,7 @@ static int btrfs_read_roots(struct btrfs_fs_info *fs_info) BTRFS_DATA_RELOC_TREE_OBJECTID, true); if (IS_ERR(root)) { if (!btrfs_test_opt(fs_info, IGNOREBADROOTS)) { + location.objectid = BTRFS_DATA_RELOC_TREE_OBJECTID; ret = PTR_ERR(root); goto out; } From dadb1c00024e8b1c058091f966088f58e244cde1 Mon Sep 17 00:00:00 2001 From: ziming zhang Date: Thu, 11 Dec 2025 16:52:58 +0800 Subject: [PATCH 0865/1321] libceph: prevent potential out-of-bounds reads in handle_auth_done() Perform an explicit bounds check on payload_len to avoid a possible out-of-bounds access in the callout. [ idryomov: changelog ] Cc: stable@vger.kernel.org Signed-off-by: ziming zhang Reviewed-by: Ilya Dryomov Signed-off-by: Ilya Dryomov --- net/ceph/messenger_v2.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/net/ceph/messenger_v2.c b/net/ceph/messenger_v2.c index 833e57849c1d7..c9d50c0dcd33a 100644 --- a/net/ceph/messenger_v2.c +++ b/net/ceph/messenger_v2.c @@ -2376,7 +2376,9 @@ static int process_auth_done(struct ceph_connection *con, void *p, void *end) ceph_decode_64_safe(&p, end, global_id, bad); ceph_decode_32_safe(&p, end, con->v2.con_mode, bad); + ceph_decode_32_safe(&p, end, payload_len, bad); + ceph_decode_need(&p, end, payload_len, bad); dout("%s con %p global_id %llu con_mode %d payload_len %d\n", __func__, con, global_id, con->v2.con_mode, payload_len); From bda0da504f64a5ee65bc0da8767f6c7c0e2e3d22 Mon Sep 17 00:00:00 2001 From: Ilya Dryomov Date: Mon, 15 Dec 2025 11:53:31 +0100 Subject: [PATCH 0866/1321] libceph: replace overzealous BUG_ON in osdmap_apply_incremental() If the osdmap is (maliciously) corrupted such that the incremental osdmap epoch is different from what is expected, there is no need to BUG. Instead, just declare the incremental osdmap to be invalid. Cc: stable@vger.kernel.org Reported-by: ziming zhang Signed-off-by: Ilya Dryomov --- net/ceph/osdmap.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/net/ceph/osdmap.c b/net/ceph/osdmap.c index 34b3ab59602f7..3377a22e3f6c1 100644 --- a/net/ceph/osdmap.c +++ b/net/ceph/osdmap.c @@ -1979,11 +1979,13 @@ struct ceph_osdmap *osdmap_apply_incremental(void **p, void *end, bool msgr2, sizeof(u64) + sizeof(u32), e_inval); ceph_decode_copy(p, &fsid, sizeof(fsid)); epoch = ceph_decode_32(p); - BUG_ON(epoch != map->epoch+1); ceph_decode_copy(p, &modified, sizeof(modified)); new_pool_max = ceph_decode_64(p); new_flags = ceph_decode_32(p); + if (epoch != map->epoch + 1) + goto e_inval; + /* full map? */ ceph_decode_32_safe(p, end, len, e_inval); if (len > 0) { From 2342598615ec9dd6cae2e81fe9d6c52b7153af4a Mon Sep 17 00:00:00 2001 From: Viacheslav Dubeyko Date: Tue, 16 Dec 2025 12:00:06 -0800 Subject: [PATCH 0867/1321] ceph: update co-maintainers list in MAINTAINERS Update the list of co-maintainers for Ceph file system following Xiubo's departure. Signed-off-by: Viacheslav Dubeyko Acked-by: Alex Markuze Acked-by: Xiubo Li Signed-off-by: Ilya Dryomov --- MAINTAINERS | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/MAINTAINERS b/MAINTAINERS index d4afbc1c32a7d..e73ad09807a19 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -5802,7 +5802,8 @@ F: drivers/power/supply/cw2015_battery.c CEPH COMMON CODE (LIBCEPH) M: Ilya Dryomov -M: Xiubo Li +M: Alex Markuze +M: Viacheslav Dubeyko L: ceph-devel@vger.kernel.org S: Supported W: http://ceph.com/ @@ -5813,8 +5814,9 @@ F: include/linux/crush/ F: net/ceph/ CEPH DISTRIBUTED FILE SYSTEM CLIENT (CEPH) -M: Xiubo Li M: Ilya Dryomov +M: Alex Markuze +M: Viacheslav Dubeyko L: ceph-devel@vger.kernel.org S: Supported W: http://ceph.com/ From e0e1210cbf3401855d9924d6586caacf9157f418 Mon Sep 17 00:00:00 2001 From: Tuo Li Date: Sun, 21 Dec 2025 02:11:49 +0800 Subject: [PATCH 0868/1321] libceph: make free_choose_arg_map() resilient to partial allocation free_choose_arg_map() may dereference a NULL pointer if its caller fails after a partial allocation. For example, in decode_choose_args(), if allocation of arg_map->args fails, execution jumps to the fail label and free_choose_arg_map() is called. Since arg_map->size is updated to a non-zero value before memory allocation, free_choose_arg_map() will iterate over arg_map->args and dereference a NULL pointer. To prevent this potential NULL pointer dereference and make free_choose_arg_map() more resilient, add checks for pointers before iterating. Cc: stable@vger.kernel.org Co-authored-by: Ilya Dryomov Signed-off-by: Tuo Li Reviewed-by: Viacheslav Dubeyko Signed-off-by: Ilya Dryomov --- net/ceph/osdmap.c | 20 ++++++++++++-------- 1 file changed, 12 insertions(+), 8 deletions(-) diff --git a/net/ceph/osdmap.c b/net/ceph/osdmap.c index 3377a22e3f6c1..92a44026de293 100644 --- a/net/ceph/osdmap.c +++ b/net/ceph/osdmap.c @@ -241,22 +241,26 @@ static struct crush_choose_arg_map *alloc_choose_arg_map(void) static void free_choose_arg_map(struct crush_choose_arg_map *arg_map) { - if (arg_map) { - int i, j; + int i, j; + + if (!arg_map) + return; - WARN_ON(!RB_EMPTY_NODE(&arg_map->node)); + WARN_ON(!RB_EMPTY_NODE(&arg_map->node)); + if (arg_map->args) { for (i = 0; i < arg_map->size; i++) { struct crush_choose_arg *arg = &arg_map->args[i]; - - for (j = 0; j < arg->weight_set_size; j++) - kfree(arg->weight_set[j].weights); - kfree(arg->weight_set); + if (arg->weight_set) { + for (j = 0; j < arg->weight_set_size; j++) + kfree(arg->weight_set[j].weights); + kfree(arg->weight_set); + } kfree(arg->ids); } kfree(arg_map->args); - kfree(arg_map); } + kfree(arg_map); } DEFINE_RB_FUNCS(choose_arg_map, struct crush_choose_arg_map, choose_args_index, From 07f112a03a36dc7f97ff0d43305ab5f638b377ce Mon Sep 17 00:00:00 2001 From: Ilya Dryomov Date: Mon, 29 Dec 2025 15:14:48 +0100 Subject: [PATCH 0869/1321] libceph: return the handler error from mon_handle_auth_done() Currently any error from ceph_auth_handle_reply_done() is propagated via finish_auth() but isn't returned from mon_handle_auth_done(). This results in higher layers learning that (despite the monitor considering us to be successfully authenticated) something went wrong in the authentication phase and reacting accordingly, but msgr2 still trying to proceed with establishing the session in the background. In the case of secure mode this can trigger a WARN in setup_crypto() and later lead to a NULL pointer dereference inside of prepare_auth_signature(). Cc: stable@vger.kernel.org Fixes: cd1a677cad99 ("libceph, ceph: implement msgr2.1 protocol (crc and secure modes)") Signed-off-by: Ilya Dryomov Reviewed-by: Viacheslav Dubeyko --- net/ceph/mon_client.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/ceph/mon_client.c b/net/ceph/mon_client.c index c227ececa9254..fa8dd2a20f7d2 100644 --- a/net/ceph/mon_client.c +++ b/net/ceph/mon_client.c @@ -1417,7 +1417,7 @@ static int mon_handle_auth_done(struct ceph_connection *con, if (!ret) finish_hunting(monc); mutex_unlock(&monc->mutex); - return 0; + return ret; } static int mon_handle_auth_bad_method(struct ceph_connection *con, From c3ea459234912fa62e25b4d649439be8725c225b Mon Sep 17 00:00:00 2001 From: Sam Edwards Date: Tue, 30 Dec 2025 20:05:06 -0800 Subject: [PATCH 0870/1321] libceph: reset sparse-read state in osd_fault() When a fault occurs, the connection is abandoned, reestablished, and any pending operations are retried. The OSD client tracks the progress of a sparse-read reply using a separate state machine, largely independent of the messenger's state. If a connection is lost mid-payload or the sparse-read state machine returns an error, the sparse-read state is not reset. The OSD client will then interpret the beginning of a new reply as the continuation of the old one. If this makes the sparse-read machinery enter a failure state, it may never recover, producing loops like: libceph: [0] got 0 extents libceph: data len 142248331 != extent len 0 libceph: osd0 (1)...:6801 socket error on read libceph: data len 142248331 != extent len 0 libceph: osd0 (1)...:6801 socket error on read Therefore, reset the sparse-read state in osd_fault(), ensuring retries start from a clean state. Cc: stable@vger.kernel.org Fixes: f628d7999727 ("libceph: add sparse read support to OSD client") Signed-off-by: Sam Edwards Reviewed-by: Ilya Dryomov Signed-off-by: Ilya Dryomov --- net/ceph/osd_client.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c index 3667319b949dd..1a7be2f615dc3 100644 --- a/net/ceph/osd_client.c +++ b/net/ceph/osd_client.c @@ -4281,6 +4281,9 @@ static void osd_fault(struct ceph_connection *con) goto out_unlock; } + osd->o_sparse_op_idx = -1; + ceph_init_sparse_read(&osd->o_sparse_read); + if (!reopen_osd(osd)) kick_osd_requests(osd); maybe_request_map(osdc); From 6587777c55e6971dce692bdeaed7a36da9f6cd93 Mon Sep 17 00:00:00 2001 From: Ilya Dryomov Date: Mon, 5 Jan 2026 19:23:19 +0100 Subject: [PATCH 0871/1321] libceph: make calc_target() set t->paused, not just clear it Currently calc_target() clears t->paused if the request shouldn't be paused anymore, but doesn't ever set t->paused even though it's able to determine when the request should be paused. Setting t->paused is left to __submit_request() which is fine for regular requests but doesn't work for linger requests -- since __submit_request() doesn't operate on linger requests, there is nowhere for lreq->t.paused to be set. One consequence of this is that watches don't get reestablished on paused -> unpaused transitions in cases where requests have been paused long enough for the (paused) unwatch request to time out and for the subsequent (re)watch request to enter the paused state. On top of the watch not getting reestablished, rbd_reregister_watch() gets stuck with rbd_dev->watch_mutex held: rbd_register_watch __rbd_register_watch ceph_osdc_watch linger_reg_commit_wait It's waiting for lreq->reg_commit_wait to be completed, but for that to happen the respective request needs to end up on need_resend_linger list and be kicked when requests are unpaused. There is no chance for that if the request in question is never marked paused in the first place. The fact that rbd_dev->watch_mutex remains taken out forever then prevents the image from getting unmapped -- "rbd unmap" would inevitably hang in D state on an attempt to grab the mutex. Cc: stable@vger.kernel.org Reported-by: Raphael Zimmer Signed-off-by: Ilya Dryomov Reviewed-by: Viacheslav Dubeyko --- net/ceph/osd_client.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c index 1a7be2f615dc3..610e584524d14 100644 --- a/net/ceph/osd_client.c +++ b/net/ceph/osd_client.c @@ -1586,6 +1586,7 @@ static enum calc_target_result calc_target(struct ceph_osd_client *osdc, struct ceph_pg_pool_info *pi; struct ceph_pg pgid, last_pgid; struct ceph_osds up, acting; + bool should_be_paused; bool is_read = t->flags & CEPH_OSD_FLAG_READ; bool is_write = t->flags & CEPH_OSD_FLAG_WRITE; bool force_resend = false; @@ -1654,10 +1655,16 @@ static enum calc_target_result calc_target(struct ceph_osd_client *osdc, &last_pgid)) force_resend = true; - if (t->paused && !target_should_be_paused(osdc, t, pi)) { - t->paused = false; + should_be_paused = target_should_be_paused(osdc, t, pi); + if (t->paused && !should_be_paused) { unpaused = true; } + if (t->paused != should_be_paused) { + dout("%s t %p paused %d -> %d\n", __func__, t, t->paused, + should_be_paused); + t->paused = should_be_paused; + } + legacy_change = ceph_pg_compare(&t->pgid, &pgid) || ceph_osds_changed(&t->acting, &acting, t->used_replica || any_change); From 9145a792afc08aeafbb276599454dc31d0680daf Mon Sep 17 00:00:00 2001 From: Linus Walleij Date: Thu, 11 Dec 2025 14:05:01 +0100 Subject: [PATCH 0872/1321] ARM: dts: ixp4xx: Fix up Actiontec MI424WR DTS files The KS8995 switch was unconditionally wired to EthC (eth1) on both MI424WR variants, this is wrong: the D revision has the switch connected to EthB (eth0) so pull this assingment out of the generic MI424WR DTSI file and make it a property of the respective variants instead. Signed-off-by: Linus Walleij Link: https://patch.msgid.link/20251211-ixp4xx-actiontec-dts-fix-v1-1-97af8e79d474@kernel.org Signed-off-by: Krzysztof Kozlowski --- .../intel/ixp/intel-ixp42x-actiontec-mi424wr-ac.dts | 11 +++++++++++ .../intel/ixp/intel-ixp42x-actiontec-mi424wr-d.dts | 11 +++++++++++ .../dts/intel/ixp/intel-ixp42x-actiontec-mi424wr.dtsi | 1 - 3 files changed, 22 insertions(+), 1 deletion(-) diff --git a/arch/arm/boot/dts/intel/ixp/intel-ixp42x-actiontec-mi424wr-ac.dts b/arch/arm/boot/dts/intel/ixp/intel-ixp42x-actiontec-mi424wr-ac.dts index 413b9255f9e3c..19a8d7b077588 100644 --- a/arch/arm/boot/dts/intel/ixp/intel-ixp42x-actiontec-mi424wr-ac.dts +++ b/arch/arm/boot/dts/intel/ixp/intel-ixp42x-actiontec-mi424wr-ac.dts @@ -12,6 +12,17 @@ model = "Actiontec MI424WR rev A/C"; compatible = "actiontec,mi424wr-ac", "intel,ixp42x"; + /* Connect the switch to EthC */ + spi { + ethernet-switch@0 { + ethernet-ports { + ethernet-port@4 { + ethernet = <ðc>; + }; + }; + }; + }; + soc { /* EthB used for WAN */ ethernet@c8009000 { diff --git a/arch/arm/boot/dts/intel/ixp/intel-ixp42x-actiontec-mi424wr-d.dts b/arch/arm/boot/dts/intel/ixp/intel-ixp42x-actiontec-mi424wr-d.dts index 3619c6411a5c0..244c6ea0973ff 100644 --- a/arch/arm/boot/dts/intel/ixp/intel-ixp42x-actiontec-mi424wr-d.dts +++ b/arch/arm/boot/dts/intel/ixp/intel-ixp42x-actiontec-mi424wr-d.dts @@ -12,6 +12,17 @@ model = "Actiontec MI424WR rev D"; compatible = "actiontec,mi424wr-d", "intel,ixp42x"; + /* Connect the switch to EthB */ + spi { + ethernet-switch@0 { + ethernet-ports { + ethernet-port@4 { + ethernet = <ðb>; + }; + }; + }; + }; + soc { /* EthB used for LAN */ ethernet@c8009000 { diff --git a/arch/arm/boot/dts/intel/ixp/intel-ixp42x-actiontec-mi424wr.dtsi b/arch/arm/boot/dts/intel/ixp/intel-ixp42x-actiontec-mi424wr.dtsi index 76fd97c5beb69..9b54e3c01a345 100644 --- a/arch/arm/boot/dts/intel/ixp/intel-ixp42x-actiontec-mi424wr.dtsi +++ b/arch/arm/boot/dts/intel/ixp/intel-ixp42x-actiontec-mi424wr.dtsi @@ -152,7 +152,6 @@ }; ethernet-port@4 { reg = <4>; - ethernet = <ðc>; phy-mode = "mii"; fixed-link { speed = <100>; From 33a119cb60c57ed5362711830d095e0d6cdb6c21 Mon Sep 17 00:00:00 2001 From: Wadim Egorov Date: Thu, 27 Nov 2025 13:27:31 +0100 Subject: [PATCH 0873/1321] arm64: dts: ti: k3-am642-phyboard-electra-peb-c-010: Fix icssg-prueth schema warning Reduce length of dma-names and dmas properties for icssg1-ethernet node to comply with ti,icssg-prueth schema constraints. The previous entries exceeded the allowed count and triggered dtschema warnings during validation. Fixes: e53fbf955ea7 ("arm64: dts: ti: k3-am642-phyboard-electra: Add PEB-C-010 Overlay") Signed-off-by: Wadim Egorov Link: https://patch.msgid.link/20251127122733.2523367-1-w.egorov@phytec.de Signed-off-by: Nishanth Menon --- .../boot/dts/ti/k3-am642-phyboard-electra-peb-c-010.dtso | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/arch/arm64/boot/dts/ti/k3-am642-phyboard-electra-peb-c-010.dtso b/arch/arm64/boot/dts/ti/k3-am642-phyboard-electra-peb-c-010.dtso index 7fc73cfacadb8..1176a52d560b7 100644 --- a/arch/arm64/boot/dts/ti/k3-am642-phyboard-electra-peb-c-010.dtso +++ b/arch/arm64/boot/dts/ti/k3-am642-phyboard-electra-peb-c-010.dtso @@ -30,13 +30,10 @@ <&main_pktdma 0xc206 15>, /* egress slice 1 */ <&main_pktdma 0xc207 15>, /* egress slice 1 */ <&main_pktdma 0x4200 15>, /* ingress slice 0 */ - <&main_pktdma 0x4201 15>, /* ingress slice 1 */ - <&main_pktdma 0x4202 0>, /* mgmnt rsp slice 0 */ - <&main_pktdma 0x4203 0>; /* mgmnt rsp slice 1 */ + <&main_pktdma 0x4201 15>; /* ingress slice 1 */ dma-names = "tx0-0", "tx0-1", "tx0-2", "tx0-3", "tx1-0", "tx1-1", "tx1-2", "tx1-3", - "rx0", "rx1", - "rxmgm0", "rxmgm1"; + "rx0", "rx1"; firmware-name = "ti-pruss/am65x-sr2-pru0-prueth-fw.elf", "ti-pruss/am65x-sr2-rtu0-prueth-fw.elf", From cc3a149cd878a4aeb8389863a85ae237bac22988 Mon Sep 17 00:00:00 2001 From: Wadim Egorov Date: Thu, 27 Nov 2025 13:27:32 +0100 Subject: [PATCH 0874/1321] arm64: dts: ti: k3-am642-phyboard-electra-x27-gpio1-spi1-uart3: Fix schema warnings Rename pinctrl nodes to comply with naming conventions required by pinctrl-single schema. Also, replace invalid integer assignment in SPI node with a boolean to align with omap-spi schema. Fixes: 638ab30ce4c6 ("arm64: dts: ti: am64-phyboard-electra: Add DT overlay for X27 connector") Signed-off-by: Wadim Egorov Link: https://patch.msgid.link/20251127122733.2523367-2-w.egorov@phytec.de Signed-off-by: Nishanth Menon --- .../k3-am642-phyboard-electra-x27-gpio1-spi1-uart3.dtso | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/arm64/boot/dts/ti/k3-am642-phyboard-electra-x27-gpio1-spi1-uart3.dtso b/arch/arm64/boot/dts/ti/k3-am642-phyboard-electra-x27-gpio1-spi1-uart3.dtso index 996c42ec4253e..bea8efa3e9094 100644 --- a/arch/arm64/boot/dts/ti/k3-am642-phyboard-electra-x27-gpio1-spi1-uart3.dtso +++ b/arch/arm64/boot/dts/ti/k3-am642-phyboard-electra-x27-gpio1-spi1-uart3.dtso @@ -20,13 +20,13 @@ }; &main_pmx0 { - main_gpio1_exp_header_gpio_pins_default: main-gpio1-exp-header-gpio-pins-default { + main_gpio1_exp_header_gpio_pins_default: main-gpio1-exp-header-gpio-default-pins { pinctrl-single,pins = < AM64X_IOPAD(0x0220, PIN_INPUT, 7) /* (D14) SPI1_CS1.GPIO1_48 */ >; }; - main_spi1_pins_default: main-spi1-pins-default { + main_spi1_pins_default: main-spi1-default-pins { pinctrl-single,pins = < AM64X_IOPAD(0x0224, PIN_INPUT, 0) /* (C14) SPI1_CLK */ AM64X_IOPAD(0x021C, PIN_OUTPUT, 0) /* (B14) SPI1_CS0 */ @@ -35,7 +35,7 @@ >; }; - main_uart3_pins_default: main-uart3-pins-default { + main_uart3_pins_default: main-uart3-default-pins { pinctrl-single,pins = < AM64X_IOPAD(0x0048, PIN_INPUT, 2) /* (U20) GPMC0_AD3.UART3_RXD */ AM64X_IOPAD(0x004c, PIN_OUTPUT, 2) /* (U18) GPMC0_AD4.UART3_TXD */ @@ -52,7 +52,7 @@ &main_spi1 { pinctrl-names = "default"; pinctrl-0 = <&main_spi1_pins_default>; - ti,pindir-d0-out-d1-in = <1>; + ti,pindir-d0-out-d1-in; status = "okay"; }; From f9499bf666915f9875248caa7ad164bcb8f060b7 Mon Sep 17 00:00:00 2001 From: Wadim Egorov Date: Thu, 27 Nov 2025 13:27:33 +0100 Subject: [PATCH 0875/1321] arm64: dts: ti: k3-am62-lp-sk-nand: Rename pinctrls to fix schema warnings Rename pinctrl nodes to comply with naming conventions required by pinctrl-single schema. Fixes: e569152274fec ("arm64: dts: ti: am62-lp-sk: Add overlay for NAND expansion card") Signed-off-by: Wadim Egorov Link: https://patch.msgid.link/20251127122733.2523367-3-w.egorov@phytec.de Signed-off-by: Nishanth Menon --- arch/arm64/boot/dts/ti/k3-am62-lp-sk-nand.dtso | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm64/boot/dts/ti/k3-am62-lp-sk-nand.dtso b/arch/arm64/boot/dts/ti/k3-am62-lp-sk-nand.dtso index 173ac60723b64..b4daa674eaa1e 100644 --- a/arch/arm64/boot/dts/ti/k3-am62-lp-sk-nand.dtso +++ b/arch/arm64/boot/dts/ti/k3-am62-lp-sk-nand.dtso @@ -14,7 +14,7 @@ }; &main_pmx0 { - gpmc0_pins_default: gpmc0-pins-default { + gpmc0_pins_default: gpmc0-default-pins { pinctrl-single,pins = < AM62X_IOPAD(0x003c, PIN_INPUT, 0) /* (K19) GPMC0_AD0 */ AM62X_IOPAD(0x0040, PIN_INPUT, 0) /* (L19) GPMC0_AD1 */ From db9d611a9a63f97a87f1fa71c309a3ef27a68226 Mon Sep 17 00:00:00 2001 From: Linus Walleij Date: Tue, 16 Dec 2025 10:40:42 +0100 Subject: [PATCH 0876/1321] MAINTAINERS: Fix a linusw mail address The patch adding me to the SoC maintainers was in-flight at the time I had to change my mail address. This fixes it up. Signed-off-by: Linus Walleij Link: https://patch.msgid.link/20251216-maintainers-fix-v1-1-92f11231b27e@kernel.org Signed-off-by: Krzysztof Kozlowski --- MAINTAINERS | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index e73ad09807a19..32b5e41d98493 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -2012,7 +2012,7 @@ ARM AND ARM64 SoC SUB-ARCHITECTURES (COMMON PARTS) M: Arnd Bergmann M: Krzysztof Kozlowski M: Alexandre Belloni -M: Linus Walleij +M: Linus Walleij R: Drew Fustini L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers) L: soc@lists.linux.dev From 07e9a2dae241edea00f54bf1d29947be974727c2 Mon Sep 17 00:00:00 2001 From: Andrea della Porta Date: Thu, 18 Dec 2025 20:09:06 +0100 Subject: [PATCH 0877/1321] dt-bindings: misc: pci1de4,1: add required reg property for endpoint The PCI subsystem links an endpoint Device Tree node to its corresponding pci_dev structure only if the Bus/Device/Function (BDF) encoded in the 'reg' property matches the actual hardware topology. Add the 'reg' property and mark it as required to ensure proper binding between the device_node and the pci_dev. Update the example to reflect this requirement. Signed-off-by: Andrea della Porta Reviewed-by: Rob Herring Link: https://lore.kernel.org/r/b58bfcd957b2270fcf932d463f2db534b2ae1a6d.1766077285.git.andrea.porta@suse.com Signed-off-by: Florian Fainelli --- Documentation/devicetree/bindings/misc/pci1de4,1.yaml | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/Documentation/devicetree/bindings/misc/pci1de4,1.yaml b/Documentation/devicetree/bindings/misc/pci1de4,1.yaml index 2f9a7a554ed8a..17a8c19af8cc2 100644 --- a/Documentation/devicetree/bindings/misc/pci1de4,1.yaml +++ b/Documentation/devicetree/bindings/misc/pci1de4,1.yaml @@ -25,6 +25,10 @@ properties: items: - const: pci1de4,1 + reg: + maxItems: 1 + description: The PCI Bus-Device-Function address. + '#interrupt-cells': const: 2 description: | @@ -101,6 +105,7 @@ unevaluatedProperties: false required: - compatible + - reg - '#interrupt-cells' - interrupt-controller - pci-ep-bus@1 @@ -111,8 +116,9 @@ examples: #address-cells = <3>; #size-cells = <2>; - rp1@0,0 { + dev@0,0 { compatible = "pci1de4,1"; + reg = <0x10000 0x0 0x0 0x0 0x0>; ranges = <0x01 0x00 0x00000000 0x82010000 0x00 0x00 0x00 0x400000>; #address-cells = <3>; #size-cells = <2>; From 59c8261e52e7c8216160bfa517c7f037a7fcc2dc Mon Sep 17 00:00:00 2001 From: Andrea della Porta Date: Thu, 18 Dec 2025 20:09:07 +0100 Subject: [PATCH 0878/1321] misc: rp1: drop overlay support The RP1 driver can load an overlay at runtime to describe the inner peripherals. This has led to a lot of confusion regarding the naming of nodes, their topology and the reclaiming of related node resources. Since the overlay is currently not fully functional, drop its support in the driver in favor of the fully described static DT. This also means that this driver does not depend on CONFIG_PCI_DYNAMIC_OF_NODES and no longer requires PCI quirks to dynamically create the intermediate PCI nodes. Signed-off-by: Andrea della Porta Reviewed-by: Rob Herring Link: https://lore.kernel.org/r/4b0aa7160877cf128b9bc713776bcac73c46eb24.1766077285.git.andrea.porta@suse.com Signed-off-by: Florian Fainelli --- drivers/misc/rp1/Kconfig | 6 +----- drivers/misc/rp1/Makefile | 3 +-- drivers/misc/rp1/rp1-pci.dtso | 25 ----------------------- drivers/misc/rp1/rp1_pci.c | 37 ++++------------------------------- drivers/pci/quirks.c | 1 - 5 files changed, 6 insertions(+), 66 deletions(-) delete mode 100644 drivers/misc/rp1/rp1-pci.dtso diff --git a/drivers/misc/rp1/Kconfig b/drivers/misc/rp1/Kconfig index 5232e70d3079b..2c13b3968b011 100644 --- a/drivers/misc/rp1/Kconfig +++ b/drivers/misc/rp1/Kconfig @@ -5,8 +5,7 @@ config MISC_RP1 tristate "RaspberryPi RP1 misc device" - depends on OF_IRQ && OF_OVERLAY && PCI_MSI && PCI_QUIRKS - select PCI_DYNAMIC_OF_NODES + depends on OF_IRQ && PCI_MSI help Support the RP1 peripheral chip found on Raspberry Pi 5 board. @@ -15,6 +14,3 @@ config MISC_RP1 The driver is responsible for enabling the DT node once the PCIe endpoint has been configured, and handling interrupts. - - This driver uses an overlay to load other drivers to support for - RP1 internal sub-devices. diff --git a/drivers/misc/rp1/Makefile b/drivers/misc/rp1/Makefile index 508b4cb056277..ab32b433d7ede 100644 --- a/drivers/misc/rp1/Makefile +++ b/drivers/misc/rp1/Makefile @@ -1,3 +1,2 @@ # SPDX-License-Identifier: GPL-2.0-only -obj-$(CONFIG_MISC_RP1) += rp1-pci.o -rp1-pci-objs := rp1_pci.o rp1-pci.dtbo.o +obj-$(CONFIG_MISC_RP1) += rp1_pci.o diff --git a/drivers/misc/rp1/rp1-pci.dtso b/drivers/misc/rp1/rp1-pci.dtso deleted file mode 100644 index eea826b36e029..0000000000000 --- a/drivers/misc/rp1/rp1-pci.dtso +++ /dev/null @@ -1,25 +0,0 @@ -// SPDX-License-Identifier: (GPL-2.0 OR MIT) - -/* - * The dts overlay is included from the dts directory so - * it can be possible to check it with CHECK_DTBS while - * also compile it from the driver source directory. - */ - -/dts-v1/; -/plugin/; - -/ { - fragment@0 { - target-path=""; - __overlay__ { - compatible = "pci1de4,1"; - #address-cells = <3>; - #size-cells = <2>; - interrupt-controller; - #interrupt-cells = <2>; - - #include "arm64/broadcom/rp1-common.dtsi" - }; - }; -}; diff --git a/drivers/misc/rp1/rp1_pci.c b/drivers/misc/rp1/rp1_pci.c index a342bcc6164bb..d210da84c30a2 100644 --- a/drivers/misc/rp1/rp1_pci.c +++ b/drivers/misc/rp1/rp1_pci.c @@ -34,16 +34,11 @@ /* Interrupts */ #define RP1_INT_END 61 -/* Embedded dtbo symbols created by cmd_wrap_S_dtb in scripts/Makefile.lib */ -extern char __dtbo_rp1_pci_begin[]; -extern char __dtbo_rp1_pci_end[]; - struct rp1_dev { struct pci_dev *pdev; struct irq_domain *domain; struct irq_data *pcie_irqds[64]; void __iomem *bar1; - int ovcs_id; /* overlay changeset id */ bool level_triggered_irq[RP1_INT_END]; }; @@ -184,24 +179,13 @@ static void rp1_unregister_interrupts(struct pci_dev *pdev) static int rp1_probe(struct pci_dev *pdev, const struct pci_device_id *id) { - u32 dtbo_size = __dtbo_rp1_pci_end - __dtbo_rp1_pci_begin; - void *dtbo_start = __dtbo_rp1_pci_begin; struct device *dev = &pdev->dev; struct device_node *rp1_node; - bool skip_ovl = true; struct rp1_dev *rp1; int err = 0; int i; - /* - * Either use rp1_nexus node if already present in DT, or - * set a flag to load it from overlay at runtime - */ - rp1_node = of_find_node_by_name(NULL, "rp1_nexus"); - if (!rp1_node) { - rp1_node = dev_of_node(dev); - skip_ovl = false; - } + rp1_node = dev_of_node(dev); if (!rp1_node) { dev_err(dev, "Missing of_node for device\n"); @@ -276,42 +260,29 @@ static int rp1_probe(struct pci_dev *pdev, const struct pci_device_id *id) rp1_chained_handle_irq, rp1); } - if (!skip_ovl) { - err = of_overlay_fdt_apply(dtbo_start, dtbo_size, &rp1->ovcs_id, - rp1_node); - if (err) - goto err_unregister_interrupts; - } - err = of_platform_default_populate(rp1_node, NULL, dev); if (err) { dev_err_probe(&pdev->dev, err, "Error populating devicetree\n"); - goto err_unload_overlay; + goto err_unregister_interrupts; } - if (skip_ovl) - of_node_put(rp1_node); + of_node_put(rp1_node); return 0; -err_unload_overlay: - of_overlay_remove(&rp1->ovcs_id); err_unregister_interrupts: rp1_unregister_interrupts(pdev); err_put_node: - if (skip_ovl) - of_node_put(rp1_node); + of_node_put(rp1_node); return err; } static void rp1_remove(struct pci_dev *pdev) { - struct rp1_dev *rp1 = pci_get_drvdata(pdev); struct device *dev = &pdev->dev; of_platform_depopulate(dev); - of_overlay_remove(&rp1->ovcs_id); rp1_unregister_interrupts(pdev); } diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index b9c252aa6fe08..280cd50d693bd 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -6308,7 +6308,6 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_XILINX, 0x5020, of_pci_make_dev_node); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_XILINX, 0x5021, of_pci_make_dev_node); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_REDHAT, 0x0005, of_pci_make_dev_node); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_EFAR, 0x9660, of_pci_make_dev_node); -DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_RPI, PCI_DEVICE_ID_RPI_RP1_C0, of_pci_make_dev_node); /* * Devices known to require a longer delay before first config space access From a6a50e0eaf5840d40726900a3762fa17398da784 Mon Sep 17 00:00:00 2001 From: Andrea della Porta Date: Thu, 18 Dec 2025 20:09:08 +0100 Subject: [PATCH 0879/1321] arm64: dts: broadcom: bcm2712: fix RP1 endpoint PCI topology The node describing the RP1 endpoint currently uses a specific name ('rp1_nexus') that does not correctly reflect the PCI topology. Update the DT with the correct topology and use generic node names. Additionally, since the driver dropped overlay support in favor of a fully described DT, rename '...-ovl-rp1.dts' to '...-base.dtsi' for inclusion in the board DTB, as it is no longer compiled as a standalone DTB. Signed-off-by: Andrea della Porta Reviewed-by: Rob Herring Link: https://lore.kernel.org/r/827b12ba48bb47bc77a0f5e5617aea961c8bc6b5.1766077285.git.andrea.porta@suse.com Signed-off-by: Florian Fainelli --- arch/arm64/boot/dts/broadcom/Makefile | 1 - ...-ovl-rp1.dts => bcm2712-rpi-5-b-base.dtsi} | 0 .../boot/dts/broadcom/bcm2712-rpi-5-b.dts | 39 ++++++++++++------- 3 files changed, 26 insertions(+), 14 deletions(-) rename arch/arm64/boot/dts/broadcom/{bcm2712-rpi-5-b-ovl-rp1.dts => bcm2712-rpi-5-b-base.dtsi} (100%) diff --git a/arch/arm64/boot/dts/broadcom/Makefile b/arch/arm64/boot/dts/broadcom/Makefile index 83d45afc6588e..d43901404c955 100644 --- a/arch/arm64/boot/dts/broadcom/Makefile +++ b/arch/arm64/boot/dts/broadcom/Makefile @@ -7,7 +7,6 @@ dtb-$(CONFIG_ARCH_BCM2835) += bcm2711-rpi-400.dtb \ bcm2711-rpi-4-b.dtb \ bcm2711-rpi-cm4-io.dtb \ bcm2712-rpi-5-b.dtb \ - bcm2712-rpi-5-b-ovl-rp1.dtb \ bcm2712-d-rpi-5-b.dtb \ bcm2837-rpi-2-b.dtb \ bcm2837-rpi-3-a-plus.dtb \ diff --git a/arch/arm64/boot/dts/broadcom/bcm2712-rpi-5-b-ovl-rp1.dts b/arch/arm64/boot/dts/broadcom/bcm2712-rpi-5-b-base.dtsi similarity index 100% rename from arch/arm64/boot/dts/broadcom/bcm2712-rpi-5-b-ovl-rp1.dts rename to arch/arm64/boot/dts/broadcom/bcm2712-rpi-5-b-base.dtsi diff --git a/arch/arm64/boot/dts/broadcom/bcm2712-rpi-5-b.dts b/arch/arm64/boot/dts/broadcom/bcm2712-rpi-5-b.dts index 3e0319fdb93f7..2856082814462 100644 --- a/arch/arm64/boot/dts/broadcom/bcm2712-rpi-5-b.dts +++ b/arch/arm64/boot/dts/broadcom/bcm2712-rpi-5-b.dts @@ -1,22 +1,16 @@ // SPDX-License-Identifier: (GPL-2.0 OR MIT) /* - * bcm2712-rpi-5-b-ovl-rp1.dts is the overlay-ready DT which will make - * the RP1 driver to load the RP1 dtb overlay at runtime, while - * bcm2712-rpi-5-b.dts (this file) is the fully defined one (i.e. it - * already contains RP1 node, so no overlay is loaded nor needed). - * This file is intended to host the override nodes for the RP1 peripherals, - * e.g. to declare the phy of the ethernet interface or the custom pin setup - * for several RP1 peripherals. - * This in turn is due to the fact that there's no current generic - * infrastructure to reference nodes (i.e. the nodes in rp1-common.dtsi) that - * are not yet defined in the DT since they are loaded at runtime via overlay. + * As a loose attempt to separate RP1 customizations from SoC peripherals + * definitioni, this file is intended to host the override nodes for the RP1 + * peripherals, e.g. to declare the phy of the ethernet interface or custom + * pin setup. * All other nodes that do not have anything to do with RP1 should be added - * to the included bcm2712-rpi-5-b-ovl-rp1.dts instead. + * to the included bcm2712-rpi-5-b-base.dtsi instead. */ /dts-v1/; -#include "bcm2712-rpi-5-b-ovl-rp1.dts" +#include "bcm2712-rpi-5-b-base.dtsi" / { aliases { @@ -25,7 +19,26 @@ }; &pcie2 { - #include "rp1-nexus.dtsi" + pci@0,0 { + reg = <0x0 0x0 0x0 0x0 0x0>; + ranges; + bus-range = <0 1>; + device_type = "pci"; + #address-cells = <3>; + #size-cells = <2>; + + dev@0,0 { + compatible = "pci1de4,1"; + reg = <0x10000 0x0 0x0 0x0 0x0>; + ranges = <0x1 0x0 0x0 0x82010000 0x0 0x0 0x0 0x400000>; + interrupt-controller; + #interrupt-cells = <2>; + #address-cells = <3>; + #size-cells = <2>; + + #include "rp1-common.dtsi" + }; + }; }; &rp1_eth { From 9399010cb9a47cc67a93638b7e3de5cdac00240d Mon Sep 17 00:00:00 2001 From: Andrea della Porta Date: Thu, 18 Dec 2025 20:09:09 +0100 Subject: [PATCH 0880/1321] arm64: dts: broadcom: rp1: drop RP1 overlay RP1 support loaded from overlay has been dropped from the driver and the DTB intended to be loaded with the overlay no longer exists. Drop unused include file and overlay. Signed-off-by: Andrea della Porta Reviewed-by: Rob Herring Link: https://lore.kernel.org/r/85167b815d41ed9ed690ad239a19de5cd2e8be1c.1766077285.git.andrea.porta@suse.com Signed-off-by: Florian Fainelli --- arch/arm64/boot/dts/broadcom/Makefile | 3 +-- arch/arm64/boot/dts/broadcom/rp1-nexus.dtsi | 14 -------------- arch/arm64/boot/dts/broadcom/rp1.dtso | 11 ----------- 3 files changed, 1 insertion(+), 27 deletions(-) delete mode 100644 arch/arm64/boot/dts/broadcom/rp1-nexus.dtsi delete mode 100644 arch/arm64/boot/dts/broadcom/rp1.dtso diff --git a/arch/arm64/boot/dts/broadcom/Makefile b/arch/arm64/boot/dts/broadcom/Makefile index d43901404c955..01ecfa3041845 100644 --- a/arch/arm64/boot/dts/broadcom/Makefile +++ b/arch/arm64/boot/dts/broadcom/Makefile @@ -13,8 +13,7 @@ dtb-$(CONFIG_ARCH_BCM2835) += bcm2711-rpi-400.dtb \ bcm2837-rpi-3-b.dtb \ bcm2837-rpi-3-b-plus.dtb \ bcm2837-rpi-cm3-io3.dtb \ - bcm2837-rpi-zero-2-w.dtb \ - rp1.dtbo + bcm2837-rpi-zero-2-w.dtb subdir-y += bcmbca subdir-y += northstar2 diff --git a/arch/arm64/boot/dts/broadcom/rp1-nexus.dtsi b/arch/arm64/boot/dts/broadcom/rp1-nexus.dtsi deleted file mode 100644 index 0ef30d7f1c352..0000000000000 --- a/arch/arm64/boot/dts/broadcom/rp1-nexus.dtsi +++ /dev/null @@ -1,14 +0,0 @@ -// SPDX-License-Identifier: (GPL-2.0 OR MIT) - -rp1_nexus { - compatible = "pci1de4,1"; - #address-cells = <3>; - #size-cells = <2>; - ranges = <0x01 0x00 0x00000000 - 0x02000000 0x00 0x00000000 - 0x0 0x400000>; - interrupt-controller; - #interrupt-cells = <2>; - - #include "rp1-common.dtsi" -}; diff --git a/arch/arm64/boot/dts/broadcom/rp1.dtso b/arch/arm64/boot/dts/broadcom/rp1.dtso deleted file mode 100644 index ab4f146d22c06..0000000000000 --- a/arch/arm64/boot/dts/broadcom/rp1.dtso +++ /dev/null @@ -1,11 +0,0 @@ -// SPDX-License-Identifier: (GPL-2.0 OR MIT) - -/dts-v1/; -/plugin/; - -&pcie2 { - #address-cells = <3>; - #size-cells = <2>; - - #include "rp1-nexus.dtsi" -}; From 23f81d8c61d48ce1a5311cebc0f5f4a6eae5a63e Mon Sep 17 00:00:00 2001 From: Krzysztof Kozlowski Date: Tue, 23 Dec 2025 15:27:27 +0100 Subject: [PATCH 0881/1321] Documentation/process: maintainer-soc: Be more explicit about defconfig It is already documented but people still send noticeable amount of patches ignoring the rule - get_maintainers.pl does not work on arm64/configs/defconfig or any other shared ARM defconfig. Be more explicit, that one must not rely on typical/simple approach here for getting To/Cc list. Signed-off-by: Krzysztof Kozlowski Reviewed-by: Linus Walleij Link: https://lore.kernel.org/r/20251223142726.73417-3-krzysztof.kozlowski@oss.qualcomm.com Signed-off-by: Arnd Bergmann --- Documentation/process/maintainer-soc.rst | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/Documentation/process/maintainer-soc.rst b/Documentation/process/maintainer-soc.rst index 3ba886f52a51d..014c639022b2d 100644 --- a/Documentation/process/maintainer-soc.rst +++ b/Documentation/process/maintainer-soc.rst @@ -57,8 +57,10 @@ Submitting Patches for Given SoC All typical platform related patches should be sent via SoC submaintainers (platform-specific maintainers). This includes also changes to per-platform or -shared defconfigs (scripts/get_maintainer.pl might not provide correct -addresses in such case). +shared defconfigs. Note that scripts/get_maintainer.pl might not provide +correct addresses for the shared defconfig, so ignore its output and manually +create CC-list based on MAINTAINERS file or use something like +``scripts/get_maintainer.pl -f drivers/soc/FOO/``). Submitting Patches to the Main SoC Maintainers ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ From da35b5c533e5301c6541f7a6498b42d6f2d18cfa Mon Sep 17 00:00:00 2001 From: Krzysztof Kozlowski Date: Tue, 23 Dec 2025 15:27:28 +0100 Subject: [PATCH 0882/1321] Documentation/process: maintainer-soc: Mark 'make' as commands Improve readability of the docs by marking 'make dtbs/dtbs_check' as shell commands. Signed-off-by: Krzysztof Kozlowski Reviewed-by: Linus Walleij Link: https://lore.kernel.org/r/20251223142726.73417-4-krzysztof.kozlowski@oss.qualcomm.com Signed-off-by: Arnd Bergmann --- Documentation/process/maintainer-soc.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/process/maintainer-soc.rst b/Documentation/process/maintainer-soc.rst index 014c639022b2d..7d6bad989ad88 100644 --- a/Documentation/process/maintainer-soc.rst +++ b/Documentation/process/maintainer-soc.rst @@ -116,9 +116,9 @@ coordinating how the changes get merged through different maintainer trees. Usually the branch that includes a driver change will also include the corresponding change to the devicetree binding description, to ensure they are in fact compatible. This means that the devicetree branch can end up causing -warnings in the "make dtbs_check" step. If a devicetree change depends on +warnings in the ``make dtbs_check`` step. If a devicetree change depends on missing additions to a header file in include/dt-bindings/, it will fail the -"make dtbs" step and not get merged. +``make dtbs`` step and not get merged. There are multiple ways to deal with this: From f72484686e3ee0906ef56dfaaec899ec23946dea Mon Sep 17 00:00:00 2001 From: Carlos Song Date: Tue, 18 Nov 2025 14:28:54 +0800 Subject: [PATCH 0883/1321] arm64: dts: imx95: correct I3C2 pclk to IMX95_CLK_BUSWAKEUP I3C2 is in WAKEUP domain. Its pclk should be IMX95_CLK_BUSWAKEUP. Fixes: 969497ebefcf ("arm64: dts: imx95: Add i3c1 and i3c2") Signed-off-by: Carlos Song Cc: stable@vger.kernel.org Reviewed-by: Frank Li Signed-off-by: Shawn Guo --- arch/arm64/boot/dts/freescale/imx95.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm64/boot/dts/freescale/imx95.dtsi b/arch/arm64/boot/dts/freescale/imx95.dtsi index e45014d50abef..a4d8548175594 100644 --- a/arch/arm64/boot/dts/freescale/imx95.dtsi +++ b/arch/arm64/boot/dts/freescale/imx95.dtsi @@ -828,7 +828,7 @@ interrupts = ; #address-cells = <3>; #size-cells = <0>; - clocks = <&scmi_clk IMX95_CLK_BUSAON>, + clocks = <&scmi_clk IMX95_CLK_BUSWAKEUP>, <&scmi_clk IMX95_CLK_I3C2SLOW>; clock-names = "pclk", "fast_clk"; status = "disabled"; From e5f0818bd493570f3e183680247e0f5009cd052b Mon Sep 17 00:00:00 2001 From: "Rob Herring (Arm)" Date: Tue, 18 Nov 2025 15:35:50 -0600 Subject: [PATCH 0884/1321] ARM: dts: nxp: imx: Fix mc13xxx LED node names Node names are supposed to be generic and use hexadecimal unit-addresses. Signed-off-by: Rob Herring (Arm) Reviewed-by: Krzysztof Kozlowski Signed-off-by: Shawn Guo --- arch/arm/boot/dts/nxp/imx/imx27-phytec-phycore-rdk.dts | 8 ++++---- arch/arm/boot/dts/nxp/imx/imx51-zii-rdu1.dts | 4 ++-- arch/arm/boot/dts/nxp/imx/imx51-zii-scu2-mezz.dts | 4 ++-- arch/arm/boot/dts/nxp/imx/imx51-zii-scu3-esb.dts | 4 ++-- 4 files changed, 10 insertions(+), 10 deletions(-) diff --git a/arch/arm/boot/dts/nxp/imx/imx27-phytec-phycore-rdk.dts b/arch/arm/boot/dts/nxp/imx/imx27-phytec-phycore-rdk.dts index b8048e12e3d9a..5398e9067e60f 100644 --- a/arch/arm/boot/dts/nxp/imx/imx27-phytec-phycore-rdk.dts +++ b/arch/arm/boot/dts/nxp/imx/imx27-phytec-phycore-rdk.dts @@ -248,14 +248,14 @@ linux,default-trigger = "nand-disk"; }; - ledg3: led@10 { - reg = <10>; + ledg3: led@a { + reg = <0xa>; label = "system:green3:live"; linux,default-trigger = "heartbeat"; }; - ledb3: led@11 { - reg = <11>; + ledb3: led@b { + reg = <0xb>; label = "system:blue3:cpu"; linux,default-trigger = "cpu0"; }; diff --git a/arch/arm/boot/dts/nxp/imx/imx51-zii-rdu1.dts b/arch/arm/boot/dts/nxp/imx/imx51-zii-rdu1.dts index 43ff5eafb2bb8..91c63d1f26043 100644 --- a/arch/arm/boot/dts/nxp/imx/imx51-zii-rdu1.dts +++ b/arch/arm/boot/dts/nxp/imx/imx51-zii-rdu1.dts @@ -398,13 +398,13 @@ #size-cells = <0>; led-control = <0x0 0x0 0x3f83f8 0x0>; - sysled0@3 { + led@3 { reg = <3>; label = "system:green:status"; linux,default-trigger = "default-on"; }; - sysled1@4 { + led@4 { reg = <4>; label = "system:green:act"; linux,default-trigger = "heartbeat"; diff --git a/arch/arm/boot/dts/nxp/imx/imx51-zii-scu2-mezz.dts b/arch/arm/boot/dts/nxp/imx/imx51-zii-scu2-mezz.dts index 26eb7a9506e48..1598bf4f49911 100644 --- a/arch/arm/boot/dts/nxp/imx/imx51-zii-scu2-mezz.dts +++ b/arch/arm/boot/dts/nxp/imx/imx51-zii-scu2-mezz.dts @@ -225,13 +225,13 @@ #size-cells = <0>; led-control = <0x0 0x0 0x3f83f8 0x0>; - sysled3: led3@3 { + sysled3: led@3 { reg = <3>; label = "system:red:power"; linux,default-trigger = "default-on"; }; - sysled4: led4@4 { + sysled4: led@4 { reg = <4>; label = "system:green:act"; linux,default-trigger = "heartbeat"; diff --git a/arch/arm/boot/dts/nxp/imx/imx51-zii-scu3-esb.dts b/arch/arm/boot/dts/nxp/imx/imx51-zii-scu3-esb.dts index 19a3b142c9649..c2dcfd44c445b 100644 --- a/arch/arm/boot/dts/nxp/imx/imx51-zii-scu3-esb.dts +++ b/arch/arm/boot/dts/nxp/imx/imx51-zii-scu3-esb.dts @@ -153,13 +153,13 @@ #size-cells = <0>; led-control = <0x0 0x0 0x3f83f8 0x0>; - sysled3: led3@3 { + sysled3: led@3 { reg = <3>; label = "system:red:power"; linux,default-trigger = "default-on"; }; - sysled4: led4@4 { + sysled4: led@4 { reg = <4>; label = "system:green:act"; linux,default-trigger = "heartbeat"; From b9b493500687398b65094e8314b8e5fb0614e87a Mon Sep 17 00:00:00 2001 From: Haibo Chen Date: Wed, 19 Nov 2025 11:22:39 +0800 Subject: [PATCH 0885/1321] arm64: dts: imx8qm-mek: correct the light sensor interrupt type to low level light sensor isl29023 share the interrupt with lsm303arg, but these two devices use different interrupt type. According to the datasheet of these two devides, both support low level trigger type, so correct the interrupt type here to avoid the following error log: irq: type mismatch, failed to map hwirq-11 for gpio@5d0c0000! Fixes: 9918092cbb0e ("arm64: dts: imx8qm-mek: add i2c0 and children devices") Fixes: 1d8a9f043a77 ("arm64: dts: imx8: use defines for interrupts") Signed-off-by: Haibo Chen Reviewed-by: Frank Li Signed-off-by: Shawn Guo --- arch/arm64/boot/dts/freescale/imx8qm-mek.dts | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm64/boot/dts/freescale/imx8qm-mek.dts b/arch/arm64/boot/dts/freescale/imx8qm-mek.dts index 779d9f78fb819..667ba2fea8678 100644 --- a/arch/arm64/boot/dts/freescale/imx8qm-mek.dts +++ b/arch/arm64/boot/dts/freescale/imx8qm-mek.dts @@ -576,7 +576,7 @@ compatible = "isil,isl29023"; reg = <0x44>; interrupt-parent = <&lsio_gpio4>; - interrupts = <11 IRQ_TYPE_EDGE_FALLING>; + interrupts = <11 IRQ_TYPE_LEVEL_LOW>; }; pressure-sensor@60 { From 829a14a670c841b07eed000d383f0c02acfb30bf Mon Sep 17 00:00:00 2001 From: Haibo Chen Date: Wed, 19 Nov 2025 11:22:40 +0800 Subject: [PATCH 0886/1321] arm64: dts: add off-on-delay-us for usdhc2 regulator For SD card, according to the spec requirement, for sd card power reset operation, it need sd card supply voltage to be lower than 0.5v and keep over 1ms, otherwise, next time power back the sd card supply voltage to 3.3v, sd card can't support SD3.0 mode again. To match such requirement on imx8qm-mek board, add 4.8ms delay between sd power off and power on. Fixes: 307fd14d4b14 ("arm64: dts: imx: add imx8qm mek support") Reviewed-by: Frank Li Signed-off-by: Haibo Chen Signed-off-by: Shawn Guo --- arch/arm64/boot/dts/freescale/imx8qm-mek.dts | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/arm64/boot/dts/freescale/imx8qm-mek.dts b/arch/arm64/boot/dts/freescale/imx8qm-mek.dts index 667ba2fea8678..f1b0563d3a090 100644 --- a/arch/arm64/boot/dts/freescale/imx8qm-mek.dts +++ b/arch/arm64/boot/dts/freescale/imx8qm-mek.dts @@ -263,6 +263,7 @@ regulator-max-microvolt = <3000000>; gpio = <&lsio_gpio4 7 GPIO_ACTIVE_HIGH>; enable-active-high; + off-on-delay-us = <4800>; }; reg_audio: regulator-audio { From c0c60ba3070a2754b191f0bf26aeee832552ac2f Mon Sep 17 00:00:00 2001 From: Vitor Soares Date: Fri, 28 Nov 2025 15:00:27 +0000 Subject: [PATCH 0887/1321] arm64: dts: freescale: imx95-toradex-smarc: use edge trigger for ethphy1 interrupt Change the PHY interrupt trigger type from IRQ_TYPE_LEVEL_LOW to IRQ_TYPE_EDGE_FALLING to match the PCA9745 GPIO expander hardware capabilities and avoid emulated level detection. Fixes: 90bbe88e0ea6 ("arm64: dts: freescale: add Toradex SMARC iMX95") Signed-off-by: Vitor Soares Signed-off-by: Shawn Guo --- arch/arm64/boot/dts/freescale/imx95-toradex-smarc.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm64/boot/dts/freescale/imx95-toradex-smarc.dtsi b/arch/arm64/boot/dts/freescale/imx95-toradex-smarc.dtsi index afbdadcb36863..2cbd5606cb192 100644 --- a/arch/arm64/boot/dts/freescale/imx95-toradex-smarc.dtsi +++ b/arch/arm64/boot/dts/freescale/imx95-toradex-smarc.dtsi @@ -582,7 +582,7 @@ ethphy1: ethernet-phy@1 { reg = <1>; interrupt-parent = <&som_gpio_expander_1>; - interrupts = <6 IRQ_TYPE_LEVEL_LOW>; + interrupts = <6 IRQ_TYPE_EDGE_FALLING>; ti,rx-internal-delay = ; ti,tx-internal-delay = ; }; From 7773cbb000792bac8bd265621afee161b760fbe9 Mon Sep 17 00:00:00 2001 From: Vitor Soares Date: Fri, 28 Nov 2025 15:00:28 +0000 Subject: [PATCH 0888/1321] arm64: dts: freescale: imx95-toradex-smarc: fix SMARC_SDIO_WP label position Fix the SMARC_SDIO_WP gpio-line-name position. It should be on line 15 of som_gpio_expander_1, not line 17. Fixes: 90bbe88e0ea6 ("arm64: dts: freescale: add Toradex SMARC iMX95") Signed-off-by: Vitor Soares Signed-off-by: Shawn Guo --- arch/arm64/boot/dts/freescale/imx95-toradex-smarc.dtsi | 2 -- 1 file changed, 2 deletions(-) diff --git a/arch/arm64/boot/dts/freescale/imx95-toradex-smarc.dtsi b/arch/arm64/boot/dts/freescale/imx95-toradex-smarc.dtsi index 2cbd5606cb192..115a16e44a999 100644 --- a/arch/arm64/boot/dts/freescale/imx95-toradex-smarc.dtsi +++ b/arch/arm64/boot/dts/freescale/imx95-toradex-smarc.dtsi @@ -406,8 +406,6 @@ "", "", "", - "", - "", "SMARC_SDIO_WP"; }; From 52786bb0fb669b157f898246a706e24b028eafc5 Mon Sep 17 00:00:00 2001 From: Ian Ray Date: Mon, 1 Dec 2025 11:56:05 +0200 Subject: [PATCH 0889/1321] ARM: dts: imx6q-ba16: fix RTC interrupt level RTC interrupt level should be set to "LOW". This was revealed by the introduction of commit: f181987ef477 ("rtc: m41t80: use IRQ flags obtained from fwnode") which changed the way IRQ type is obtained. Fixes: 56c27310c1b4 ("ARM: dts: imx: Add Advantech BA-16 Qseven module") Signed-off-by: Ian Ray Signed-off-by: Shawn Guo --- arch/arm/boot/dts/nxp/imx/imx6q-ba16.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm/boot/dts/nxp/imx/imx6q-ba16.dtsi b/arch/arm/boot/dts/nxp/imx/imx6q-ba16.dtsi index 53013b12c2ecb..02d66523668d2 100644 --- a/arch/arm/boot/dts/nxp/imx/imx6q-ba16.dtsi +++ b/arch/arm/boot/dts/nxp/imx/imx6q-ba16.dtsi @@ -337,7 +337,7 @@ pinctrl-0 = <&pinctrl_rtc>; reg = <0x32>; interrupt-parent = <&gpio4>; - interrupts = <10 IRQ_TYPE_LEVEL_HIGH>; + interrupts = <10 IRQ_TYPE_LEVEL_LOW>; }; }; From 32a6764632061b7abcb914f66ae49935b197824f Mon Sep 17 00:00:00 2001 From: Maud Spierings Date: Mon, 1 Dec 2025 12:56:50 +0100 Subject: [PATCH 0890/1321] dt-bindings: arm: fsl: moduline-display: fix compatible The compatibles should include the SoM compatible, this board is based on the Ka-Ro TX8P-ML81 SoM, so add it to allow using shared code in the bootloader which uses upstream Linux devicetrees as a base. Also add the hardware revision to the board compatible to handle revision specific quirks in the bootloader/userspace. This is a breaking change, but it is early enough that it can be corrected without causing any issues. Fixes: 24e67d28ef95 ("dt-bindings: arm: fsl: Add GOcontroll Moduline Display") Signed-off-by: Maud Spierings Reviewed-by: Krzysztof Kozlowski Signed-off-by: Shawn Guo --- Documentation/devicetree/bindings/arm/fsl.yaml | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/Documentation/devicetree/bindings/arm/fsl.yaml b/Documentation/devicetree/bindings/arm/fsl.yaml index 68a2d5fecc43b..336669e16d7ab 100644 --- a/Documentation/devicetree/bindings/arm/fsl.yaml +++ b/Documentation/devicetree/bindings/arm/fsl.yaml @@ -1105,7 +1105,6 @@ properties: - gateworks,imx8mp-gw74xx # i.MX8MP Gateworks Board - gateworks,imx8mp-gw75xx-2x # i.MX8MP Gateworks Board - gateworks,imx8mp-gw82xx-2x # i.MX8MP Gateworks Board - - gocontroll,moduline-display # GOcontroll Moduline Display controller - prt,prt8ml # Protonic PRT8ML - skov,imx8mp-skov-basic # SKOV i.MX8MP baseboard without frontplate - skov,imx8mp-skov-revb-hdmi # SKOV i.MX8MP climate control without panel @@ -1164,6 +1163,14 @@ properties: - const: engicam,icore-mx8mp # i.MX8MP Engicam i.Core MX8M Plus SoM - const: fsl,imx8mp + - description: Ka-Ro TX8P-ML81 SoM based boards + items: + - enum: + - gocontroll,moduline-display + - gocontroll,moduline-display-106 + - const: karo,tx8p-ml81 + - const: fsl,imx8mp + - description: Kontron i.MX8MP OSM-S SoM based Boards items: - const: kontron,imx8mp-bl-osm-s # Kontron BL i.MX8MP OSM-S Board From 5ce5aff9fa606aff7060d9545f4f09f2c3f378e9 Mon Sep 17 00:00:00 2001 From: Maud Spierings Date: Mon, 1 Dec 2025 12:56:51 +0100 Subject: [PATCH 0891/1321] arm64: dts: freescale: moduline-display: fix compatible The compatibles should include the SoM compatible, this board is based on the Ka-Ro TX8P-ML81 SoM, so add it to allow using shared code in the bootloader which uses upstream Linux devicetrees as a base. Also add the hardware revision to the board compatible to handle revision specific quirks in the bootloader/userspace. This is a breaking change, but it is early enough that it can be corrected without causing any issues. Fixes: 03f07be54cdc ("arm64: dts: freescale: Add the GOcontroll Moduline Display baseboard") Signed-off-by: Maud Spierings Signed-off-by: Shawn Guo --- .../dts/freescale/imx8mp-tx8p-ml81-moduline-display-106.dts | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm64/boot/dts/freescale/imx8mp-tx8p-ml81-moduline-display-106.dts b/arch/arm64/boot/dts/freescale/imx8mp-tx8p-ml81-moduline-display-106.dts index 88ad422c27603..399230144ce39 100644 --- a/arch/arm64/boot/dts/freescale/imx8mp-tx8p-ml81-moduline-display-106.dts +++ b/arch/arm64/boot/dts/freescale/imx8mp-tx8p-ml81-moduline-display-106.dts @@ -9,7 +9,7 @@ #include "imx8mp-tx8p-ml81.dtsi" / { - compatible = "gocontroll,moduline-display", "fsl,imx8mp"; + compatible = "gocontroll,moduline-display-106", "karo,tx8p-ml81", "fsl,imx8mp"; chassis-type = "embedded"; hardware = "Moduline Display V1.06"; model = "GOcontroll Moduline Display baseboard"; From 3dd07b9dcb625a65decd6526b5750cec91ba75b0 Mon Sep 17 00:00:00 2001 From: Maud Spierings Date: Mon, 1 Dec 2025 12:56:52 +0100 Subject: [PATCH 0892/1321] arm64: dts: freescale: tx8p-ml81: fix eqos nvmem-cells On this SoM eqos is the primary ethernet interface, Ka-Ro fuses the address for it in eth_mac1, eth_mac2 seems to be left unfused. In their downstream u-boot they fetch it from eth_mac1 [1][2], by setting alias of eqos to ethernet0, the driver then fetches the mac address based on the alias number. Set eqos to read from eth_mac1 instead of eth_mac2. Also set fec to point at eth_mac2 as it may be fused later even though it is disabled by default. With this changed barebox is now capable of loading the correct address. Link: https://github.com/karo-electronics/karo-tx-uboot/blob/380543278410bbf04264d80a3bfbe340b8e62439/drivers/net/dwc_eth_qos.c#L1167 [1] Link: https://github.com/karo-electronics/karo-tx-uboot/blob/380543278410bbf04264d80a3bfbe340b8e62439/arch/arm/dts/imx8mp-karo.dtsi#L12 [2] Fixes: bac63d7c5f46 ("arm64: dts: freescale: add Ka-Ro Electronics tx8p-ml81 COM") Signed-off-by: Maud Spierings Signed-off-by: Shawn Guo --- arch/arm64/boot/dts/freescale/imx8mp-tx8p-ml81.dtsi | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/arch/arm64/boot/dts/freescale/imx8mp-tx8p-ml81.dtsi b/arch/arm64/boot/dts/freescale/imx8mp-tx8p-ml81.dtsi index fe8ba16eb40e7..761ee046eb72e 100644 --- a/arch/arm64/boot/dts/freescale/imx8mp-tx8p-ml81.dtsi +++ b/arch/arm64/boot/dts/freescale/imx8mp-tx8p-ml81.dtsi @@ -47,6 +47,7 @@ <&clk IMX8MP_SYS_PLL2_100M>, <&clk IMX8MP_SYS_PLL2_50M>; assigned-clock-rates = <266000000>, <100000000>, <50000000>; + nvmem-cells = <ð_mac1>; phy-handle = <ðphy0>; phy-mode = "rmii"; pinctrl-0 = <&pinctrl_eqos>; @@ -75,6 +76,10 @@ }; }; +&fec { + nvmem-cells = <ð_mac2>; +}; + &gpio1 { gpio-line-names = "SODIMM_152", "SODIMM_42", From bfc378e5b93fcd50d237230e2c457af73d18c4a1 Mon Sep 17 00:00:00 2001 From: Marek Vasut Date: Tue, 2 Dec 2025 14:41:51 +0100 Subject: [PATCH 0893/1321] arm64: dts: imx8mp: Fix LAN8740Ai PHY reference clock on DH electronics i.MX8M Plus DHCOM Add missing 'clocks' property to LAN8740Ai PHY node, to allow the PHY driver to manage LAN8740Ai CLKIN reference clock supply. This fixes sporadic link bouncing caused by interruptions on the PHY reference clock, by letting the PHY driver manage the reference clock and assure there are no interruptions. This follows the matching PHY driver recommendation described in commit bedd8d78aba3 ("net: phy: smsc: LAN8710/20: add phy refclk in support") Fixes: 8d6712695bc8 ("arm64: dts: imx8mp: Add support for DH electronics i.MX8M Plus DHCOM and PDK2") Signed-off-by: Marek Vasut Tested-by: Christoph Niedermaier Signed-off-by: Shawn Guo --- arch/arm64/boot/dts/freescale/imx8mp-dhcom-som.dtsi | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/arm64/boot/dts/freescale/imx8mp-dhcom-som.dtsi b/arch/arm64/boot/dts/freescale/imx8mp-dhcom-som.dtsi index 68c2e0156a5c8..f8303b7e2bd22 100644 --- a/arch/arm64/boot/dts/freescale/imx8mp-dhcom-som.dtsi +++ b/arch/arm64/boot/dts/freescale/imx8mp-dhcom-som.dtsi @@ -113,6 +113,7 @@ ethphy0f: ethernet-phy@1 { /* SMSC LAN8740Ai */ compatible = "ethernet-phy-id0007.c110", "ethernet-phy-ieee802.3-c22"; + clocks = <&clk IMX8MP_CLK_ENET_QOS>; interrupt-parent = <&gpio3>; interrupts = <19 IRQ_TYPE_LEVEL_LOW>; pinctrl-0 = <&pinctrl_ethphy0>; From ed45d0aec08199b4aef79567055d87d4ba001fb6 Mon Sep 17 00:00:00 2001 From: Sherry Sun Date: Wed, 3 Dec 2025 09:59:56 +0800 Subject: [PATCH 0894/1321] arm64: dts: imx8qm-ss-dma: correct the dma channels of lpuart The commit 616effc0272b5 ("arm64: dts: imx8: Fix lpuart DMA channel order") swap uart rx and tx channel at common imx8-ss-dma.dtsi. But miss update imx8qm-ss-dma.dtsi. The commit 5a8e9b022e569 ("arm64: dts: imx8qm-ss-dma: Pass lpuart dma-names") just simple add dma-names as binding doc requirement. Correct lpuart0 - lpuart3 dma rx and tx channels, and use defines for the FSL_EDMA_RX flag. Fixes: 5a8e9b022e56 ("arm64: dts: imx8qm-ss-dma: Pass lpuart dma-names") Signed-off-by: Sherry Sun Reviewed-by: Frank Li Reviewed-by: Alexander Stein Signed-off-by: Shawn Guo --- arch/arm64/boot/dts/freescale/imx8qm-ss-dma.dtsi | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/arm64/boot/dts/freescale/imx8qm-ss-dma.dtsi b/arch/arm64/boot/dts/freescale/imx8qm-ss-dma.dtsi index 5f24850bf322a..974e193f8dcb9 100644 --- a/arch/arm64/boot/dts/freescale/imx8qm-ss-dma.dtsi +++ b/arch/arm64/boot/dts/freescale/imx8qm-ss-dma.dtsi @@ -172,25 +172,25 @@ &lpuart0 { compatible = "fsl,imx8qm-lpuart", "fsl,imx8qxp-lpuart"; - dmas = <&edma2 13 0 0>, <&edma2 12 0 1>; + dmas = <&edma2 12 0 FSL_EDMA_RX>, <&edma2 13 0 0>; dma-names = "rx","tx"; }; &lpuart1 { compatible = "fsl,imx8qm-lpuart", "fsl,imx8qxp-lpuart"; - dmas = <&edma2 15 0 0>, <&edma2 14 0 1>; + dmas = <&edma2 14 0 FSL_EDMA_RX>, <&edma2 15 0 0>; dma-names = "rx","tx"; }; &lpuart2 { compatible = "fsl,imx8qm-lpuart", "fsl,imx8qxp-lpuart"; - dmas = <&edma2 17 0 0>, <&edma2 16 0 1>; + dmas = <&edma2 16 0 FSL_EDMA_RX>, <&edma2 17 0 0>; dma-names = "rx","tx"; }; &lpuart3 { compatible = "fsl,imx8qm-lpuart", "fsl,imx8qxp-lpuart"; - dmas = <&edma2 19 0 0>, <&edma2 18 0 1>; + dmas = <&edma2 18 0 FSL_EDMA_RX>, <&edma2 19 0 0>; dma-names = "rx","tx"; }; From 045c1da89ce3a91ca84e01908df567d0bee913f6 Mon Sep 17 00:00:00 2001 From: Alexander Stein Date: Tue, 16 Dec 2025 14:15:28 +0100 Subject: [PATCH 0895/1321] arm64: dts: mba8mx: Fix Ethernet PHY IRQ support Ethernet PHY interrupt mode is level triggered. Adjust the mode accordingly. Signed-off-by: Alexander Stein Reviewed-by: Andrew Lunn Fixes: 70cf622bb16e ("arm64: dts: mba8mx: Add Ethernet PHY IRQ support") Signed-off-by: Shawn Guo --- arch/arm64/boot/dts/freescale/mba8mx.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm64/boot/dts/freescale/mba8mx.dtsi b/arch/arm64/boot/dts/freescale/mba8mx.dtsi index 225cd2f1220bf..10d5c211b1c9b 100644 --- a/arch/arm64/boot/dts/freescale/mba8mx.dtsi +++ b/arch/arm64/boot/dts/freescale/mba8mx.dtsi @@ -192,7 +192,7 @@ reset-assert-us = <500000>; reset-deassert-us = <500>; interrupt-parent = <&expander2>; - interrupts = <6 IRQ_TYPE_EDGE_FALLING>; + interrupts = <6 IRQ_TYPE_LEVEL_LOW>; }; }; }; From e13e6c7208ff226de224982c9d9383070f303bc4 Mon Sep 17 00:00:00 2001 From: "Rob Herring (Arm)" Date: Mon, 5 Jan 2026 11:40:02 -0600 Subject: [PATCH 0896/1321] arm64: dts: hisilicon: hikey960: Drop "snps,gctl-reset-quirk" and "snps,tx_de_emphasis*" properties "snps,tx_de_emphasis" is supposed to be a u8, not a u32. Since it is big endian, 0 will be read rather than 1. The DWC3 Linux driver simply ORs the value if "snps,tx_de_emphasis_quirk" is set, so the 2 properties have no effect. (The driver doesn't clear the field either which is another problem). "snps,gctl-reset-quirk" is not documented nor used in the driver, so drop it as well. Signed-off-by: Rob Herring (Arm) Link: https://lore.kernel.org/r/20260105174002.2997615-1-robh@kernel.org Signed-off-by: Arnd Bergmann --- arch/arm64/boot/dts/hisilicon/hi3660-hikey960.dts | 3 --- 1 file changed, 3 deletions(-) diff --git a/arch/arm64/boot/dts/hisilicon/hi3660-hikey960.dts b/arch/arm64/boot/dts/hisilicon/hi3660-hikey960.dts index 3f13a960f34ef..ed84ab92fb19c 100644 --- a/arch/arm64/boot/dts/hisilicon/hi3660-hikey960.dts +++ b/arch/arm64/boot/dts/hisilicon/hi3660-hikey960.dts @@ -675,10 +675,7 @@ snps,lfps_filter_quirk; snps,dis_u2_susphy_quirk; snps,dis_u3_susphy_quirk; - snps,tx_de_emphasis_quirk; - snps,tx_de_emphasis = <1>; snps,dis_enblslpm_quirk; - snps,gctl-reset-quirk; usb-role-switch; role-switch-default-mode = "host"; port { From 86ad28b29d523ad5a2bb3028f51c1c089c09c373 Mon Sep 17 00:00:00 2001 From: Ben Horgan Date: Mon, 5 Jan 2026 13:58:47 +0000 Subject: [PATCH 0897/1321] arm64/efi: Don't fail check current_in_efi() if preemptible As EFI runtime services can now be run without disabling preemption remove the check for non preemptible in current_in_efi(). Without this change, firmware errors that were previously recovered from by __efi_runtime_kernel_fixup_exception() will lead to a kernel oops. Fixes: a5baf582f4c0 ("arm64/efi: Call EFI runtime services without disabling preemption") Signed-off-by: Ben Horgan Reviewed-by: Yeoreum Yun Acked-by: Ard Biesheuvel Reviewed-by: Richard Lyu Signed-off-by: Catalin Marinas --- arch/arm64/include/asm/efi.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h index aa91165ca1407..e8a9783235cbe 100644 --- a/arch/arm64/include/asm/efi.h +++ b/arch/arm64/include/asm/efi.h @@ -45,7 +45,7 @@ void arch_efi_call_virt_teardown(void); * switching to the EFI runtime stack. */ #define current_in_efi() \ - (!preemptible() && efi_rt_stack_top != NULL && \ + (efi_rt_stack_top != NULL && \ on_task_stack(current, READ_ONCE(efi_rt_stack_top[-1]), 1)) #define ARCH_EFI_IRQ_FLAGS_MASK (PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT) From e5c09786effbadcbaa963a77f56392421b5577cc Mon Sep 17 00:00:00 2001 From: Ben Horgan Date: Fri, 19 Dec 2025 18:11:03 +0000 Subject: [PATCH 0898/1321] arm_mpam: Stop using uninitialized variables in __ris_msmon_read() Dan has reported two uses of uninitialized variables in __ris_msmon_read(). If an unknown monitor type is encountered then the local variable, now, is used uninitialized. Fix this by returning early on error. If a non-mbwu monitor is being read then the local variable, overflow, is not initialized but still read. Initialize it to false as overflow is not relevant for csu monitors. Fixes: 823e7c3712c5 ("arm_mpam: Add mpam_msmon_read() to read monitor value") Fixes: 9e5afb7c3283 ("arm_mpam: Use long MBWU counters if supported") Reported-by: kernel test robot Reported-by: Dan Carpenter Closes: https://lore.kernel.org/r/202512091519.RBwiJcSq-lkp@intel.com/ Closes: https://lore.kernel.org/r/202512100547.N7QPYgfb-lkp@intel.com/ Signed-off-by: Ben Horgan Reviewed-by: Jonathan Cameron Signed-off-by: Catalin Marinas --- drivers/resctrl/mpam_devices.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/resctrl/mpam_devices.c b/drivers/resctrl/mpam_devices.c index 0b5b158e1aafb..b495d52918681 100644 --- a/drivers/resctrl/mpam_devices.c +++ b/drivers/resctrl/mpam_devices.c @@ -1072,7 +1072,7 @@ static void __ris_msmon_read(void *arg) u64 now; bool nrdy = false; bool config_mismatch; - bool overflow; + bool overflow = false; struct mon_read *m = arg; struct mon_cfg *ctx = m->ctx; bool reset_on_next_read = false; @@ -1176,10 +1176,11 @@ static void __ris_msmon_read(void *arg) } mpam_mon_sel_unlock(msc); - if (nrdy) { + if (nrdy) m->err = -EBUSY; + + if (m->err) return; - } *m->val += now; } From f9d30cf8c900b4596530aad5cf05a84b4f2130db Mon Sep 17 00:00:00 2001 From: Jiayuan Chen Date: Sun, 4 Jan 2026 20:35:27 +0800 Subject: [PATCH 0899/1321] arm64: mm: Fix incomplete tag reset in change_memory_common() Running KASAN KUnit tests with {HW,SW}_TAGS mode triggers a fault in change_memory_common(): Call trace: change_memory_common+0x168/0x210 (P) set_memory_ro+0x20/0x48 vmalloc_helpers_tags+0xe8/0x338 kunit_try_run_case+0x74/0x188 kunit_generic_run_threadfn_adapter+0x30/0x70 kthread+0x11c/0x200 ret_from_fork+0x10/0x20 ---[ end trace 0000000000000000 ]--- # vmalloc_helpers_tags: try faulted not ok 67 vmalloc_helpers_tags Commit a06494adb7ef ("arm64: mm: use untagged address to calculate page index") fixed a KASAN warning in the BPF subsystem by adding kasan_reset_tag() to the index calculation. In the execmem flow: bpf_prog_pack_alloc() -> bpf_jit_alloc_exec() -> execmem_alloc() The returned address from execmem_vmalloc/execmem_cache_alloc is passed through kasan_reset_tag(), so start has no tag while area->addr still retains the original tag. The fix correctly handled this case by resetting the tag on area->addr: (start - (unsigned long)kasan_reset_tag(area->addr)) >> PAGE_SHIFT However, in normal vmalloc paths, both start and area->addr have matching tags(or no tags). Resetting only area->addr causes a mismatch when subtracting a tagged address from an untagged one, resulting in an incorrect index. Fix this by resetting tags on both addresses in the index calculation. This ensures correct results regardless of the tag state of either address. Tested with KASAN KUnit tests under CONFIG_KASAN_GENERIC, CONFIG_KASAN_SW_TAGS, and CONFIG_KASAN_HW_TAGS - all pass. Also verified the original BPF KASAN warning from [1] is still fixed. [1] https://lore.kernel.org/all/20251118164115.GA3977565@ax162/ Fixes: a06494adb7ef ("arm64: mm: use untagged address to calculate page index") Signed-off-by: Jiayuan Chen Signed-off-by: Jiayuan Chen Signed-off-by: Catalin Marinas --- arch/arm64/mm/pageattr.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c index f0e784b963e69..7176ff39cb879 100644 --- a/arch/arm64/mm/pageattr.c +++ b/arch/arm64/mm/pageattr.c @@ -171,7 +171,8 @@ static int change_memory_common(unsigned long addr, int numpages, */ area = find_vm_area((void *)addr); if (!area || - end > (unsigned long)kasan_reset_tag(area->addr) + area->size || + ((unsigned long)kasan_reset_tag((void *)end) > + (unsigned long)kasan_reset_tag(area->addr) + area->size) || ((area->flags & (VM_ALLOC | VM_ALLOW_HUGE_VMAP)) != VM_ALLOC)) return -EINVAL; @@ -184,7 +185,8 @@ static int change_memory_common(unsigned long addr, int numpages, */ if (rodata_full && (pgprot_val(set_mask) == PTE_RDONLY || pgprot_val(clear_mask) == PTE_RDONLY)) { - unsigned long idx = (start - (unsigned long)kasan_reset_tag(area->addr)) + unsigned long idx = ((unsigned long)kasan_reset_tag((void *)start) - + (unsigned long)kasan_reset_tag(area->addr)) >> PAGE_SHIFT; for (; numpages; idx++, numpages--) { ret = __change_memory_common((u64)page_address(area->pages[idx]), From 23314cb385bfa6c0fcfd0120bc7076d1abe90018 Mon Sep 17 00:00:00 2001 From: Yeoreum Yun Date: Wed, 7 Jan 2026 16:21:15 +0000 Subject: [PATCH 0900/1321] arm64: Fix cleared E0POE bit after cpu_suspend()/resume() TCR2_ELx.E0POE is set during smp_init(). However, this bit is not reprogrammed when the CPU enters suspension and later resumes via cpu_resume(), as __cpu_setup() does not re-enable E0POE and there is no save/restore logic for the TCR2_ELx system register. As a result, the E0POE feature no longer works after cpu_resume(). To address this, save and restore TCR2_EL1 in the cpu_suspend()/cpu_resume() path, rather than adding related logic to __cpu_setup(), taking into account possible future extensions of the TCR2_ELx feature. Fixes: bf83dae90fbc ("arm64: enable the Permission Overlay Extension for EL0") Cc: # 6.12.x Signed-off-by: Yeoreum Yun Reviewed-by: Anshuman Khandual Reviewed-by: Kevin Brodsky Signed-off-by: Catalin Marinas --- arch/arm64/include/asm/suspend.h | 2 +- arch/arm64/mm/proc.S | 8 ++++++++ 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/arch/arm64/include/asm/suspend.h b/arch/arm64/include/asm/suspend.h index e65f33edf9d61..e9ce68d50ba45 100644 --- a/arch/arm64/include/asm/suspend.h +++ b/arch/arm64/include/asm/suspend.h @@ -2,7 +2,7 @@ #ifndef __ASM_SUSPEND_H #define __ASM_SUSPEND_H -#define NR_CTX_REGS 13 +#define NR_CTX_REGS 14 #define NR_CALLEE_SAVED_REGS 12 /* diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S index 01e8681164488..5d907ce3b6d3f 100644 --- a/arch/arm64/mm/proc.S +++ b/arch/arm64/mm/proc.S @@ -110,6 +110,10 @@ SYM_FUNC_START(cpu_do_suspend) * call stack. */ str x18, [x0, #96] +alternative_if ARM64_HAS_TCR2 + mrs x2, REG_TCR2_EL1 + str x2, [x0, #104] +alternative_else_nop_endif ret SYM_FUNC_END(cpu_do_suspend) @@ -144,6 +148,10 @@ SYM_FUNC_START(cpu_do_resume) msr tcr_el1, x8 msr vbar_el1, x9 msr mdscr_el1, x10 +alternative_if ARM64_HAS_TCR2 + ldr x2, [x0, #104] + msr REG_TCR2_EL1, x2 +alternative_else_nop_endif msr sctlr_el1, x12 set_this_cpu_offset x13 From a7dbd2bb4fddb7b92a1ed80fd1ee4982234f1531 Mon Sep 17 00:00:00 2001 From: Jens Axboe Date: Mon, 5 Jan 2026 07:42:48 -0700 Subject: [PATCH 0901/1321] io_uring/io-wq: fix incorrect io_wq_for_each_worker() termination logic A previous commit added this helper, and had it terminate if false is returned from the handler. However, that is completely opposite, it should abort the loop if true is returned. Fix this up by having io_wq_for_each_worker() keep iterating as long as false is returned, and only abort if true is returned. Cc: stable@vger.kernel.org Fixes: 751eedc4b4b7 ("io_uring/io-wq: move worker lists to struct io_wq_acct") Reported-by: Lewis Campbell Reviewed-by: Gabriel Krisman Bertazi Signed-off-by: Jens Axboe --- io_uring/io-wq.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/io_uring/io-wq.c b/io_uring/io-wq.c index cd13d8aac3d26..6c5ef629e59ad 100644 --- a/io_uring/io-wq.c +++ b/io_uring/io-wq.c @@ -952,11 +952,11 @@ static bool io_wq_for_each_worker(struct io_wq *wq, void *data) { for (int i = 0; i < IO_WQ_ACCT_NR; i++) { - if (!io_acct_for_each_worker(&wq->acct[i], func, data)) - return false; + if (io_acct_for_each_worker(&wq->acct[i], func, data)) + return true; } - return true; + return false; } static bool io_wq_worker_wake(struct io_worker *worker, void *data) From 87dfbd2675382cc57fe768b173e6d18ef85811e2 Mon Sep 17 00:00:00 2001 From: Jens Axboe Date: Mon, 5 Jan 2026 11:42:05 -0700 Subject: [PATCH 0902/1321] io_uring/io-wq: remove io_wq_for_each_worker() return value The only use of this helper is to iterate all of the workers, and hence all callers will pass in a func that always returns false to do that. As none of the callers use the return value, get rid of it. Reviewed-by: Gabriel Krisman Bertazi Signed-off-by: Jens Axboe --- io_uring/io-wq.c | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/io_uring/io-wq.c b/io_uring/io-wq.c index 6c5ef629e59ad..9fd9f6ab722c7 100644 --- a/io_uring/io-wq.c +++ b/io_uring/io-wq.c @@ -947,16 +947,13 @@ static bool io_acct_for_each_worker(struct io_wq_acct *acct, return ret; } -static bool io_wq_for_each_worker(struct io_wq *wq, +static void io_wq_for_each_worker(struct io_wq *wq, bool (*func)(struct io_worker *, void *), void *data) { - for (int i = 0; i < IO_WQ_ACCT_NR; i++) { + for (int i = 0; i < IO_WQ_ACCT_NR; i++) if (io_acct_for_each_worker(&wq->acct[i], func, data)) - return true; - } - - return false; + break; } static bool io_wq_worker_wake(struct io_worker *worker, void *data) From 7af1dbd8144c54b5163ef8953feec015b970bcfa Mon Sep 17 00:00:00 2001 From: Raphael Pinsonneault-Thibeault Date: Wed, 17 Dec 2025 14:00:40 -0500 Subject: [PATCH 0903/1321] loop: don't change loop device under exclusive opener in loop_set_status loop_set_status() is allowed to change the loop device while there are other openers of the device, even exclusive ones. In this case, it causes a KASAN: slab-out-of-bounds Read in ext4_search_dir(), since when looking for an entry in an inlined directory, e_value_offs is changed underneath the filesystem by loop_set_status(). Fix the problem by forbidding loop_set_status() from modifying the loop device while there are exclusive openers of the device. This is similar to the fix in loop_configure() by commit 33ec3e53e7b1 ("loop: Don't change loop device under exclusive opener") alongside commit ecbe6bc0003b ("block: use bd_prepare_to_claim directly in the loop driver"). Reported-by: syzbot+3ee481e21fd75e14c397@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=3ee481e21fd75e14c397 Tested-by: syzbot+3ee481e21fd75e14c397@syzkaller.appspotmail.com Tested-by: Yongpeng Yang Signed-off-by: Raphael Pinsonneault-Thibeault Reviewed-by: Jan Kara Signed-off-by: Jens Axboe --- drivers/block/loop.c | 41 ++++++++++++++++++++++++++++++----------- 1 file changed, 30 insertions(+), 11 deletions(-) diff --git a/drivers/block/loop.c b/drivers/block/loop.c index 32a3a5b138029..ca74cc31bf07f 100644 --- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -1225,13 +1225,24 @@ static int loop_clr_fd(struct loop_device *lo) } static int -loop_set_status(struct loop_device *lo, const struct loop_info64 *info) +loop_set_status(struct loop_device *lo, blk_mode_t mode, + struct block_device *bdev, const struct loop_info64 *info) { int err; bool partscan = false; bool size_changed = false; unsigned int memflags; + /* + * If we don't hold exclusive handle for the device, upgrade to it + * here to avoid changing device under exclusive owner. + */ + if (!(mode & BLK_OPEN_EXCL)) { + err = bd_prepare_to_claim(bdev, loop_set_status, NULL); + if (err) + goto out_reread_partitions; + } + err = mutex_lock_killable(&lo->lo_mutex); if (err) return err; @@ -1273,6 +1284,9 @@ loop_set_status(struct loop_device *lo, const struct loop_info64 *info) } out_unlock: mutex_unlock(&lo->lo_mutex); + if (!(mode & BLK_OPEN_EXCL)) + bd_abort_claiming(bdev, loop_set_status); +out_reread_partitions: if (partscan) loop_reread_partitions(lo); @@ -1352,7 +1366,9 @@ loop_info64_to_old(const struct loop_info64 *info64, struct loop_info *info) } static int -loop_set_status_old(struct loop_device *lo, const struct loop_info __user *arg) +loop_set_status_old(struct loop_device *lo, blk_mode_t mode, + struct block_device *bdev, + const struct loop_info __user *arg) { struct loop_info info; struct loop_info64 info64; @@ -1360,17 +1376,19 @@ loop_set_status_old(struct loop_device *lo, const struct loop_info __user *arg) if (copy_from_user(&info, arg, sizeof (struct loop_info))) return -EFAULT; loop_info64_from_old(&info, &info64); - return loop_set_status(lo, &info64); + return loop_set_status(lo, mode, bdev, &info64); } static int -loop_set_status64(struct loop_device *lo, const struct loop_info64 __user *arg) +loop_set_status64(struct loop_device *lo, blk_mode_t mode, + struct block_device *bdev, + const struct loop_info64 __user *arg) { struct loop_info64 info64; if (copy_from_user(&info64, arg, sizeof (struct loop_info64))) return -EFAULT; - return loop_set_status(lo, &info64); + return loop_set_status(lo, mode, bdev, &info64); } static int @@ -1549,14 +1567,14 @@ static int lo_ioctl(struct block_device *bdev, blk_mode_t mode, case LOOP_SET_STATUS: err = -EPERM; if ((mode & BLK_OPEN_WRITE) || capable(CAP_SYS_ADMIN)) - err = loop_set_status_old(lo, argp); + err = loop_set_status_old(lo, mode, bdev, argp); break; case LOOP_GET_STATUS: return loop_get_status_old(lo, argp); case LOOP_SET_STATUS64: err = -EPERM; if ((mode & BLK_OPEN_WRITE) || capable(CAP_SYS_ADMIN)) - err = loop_set_status64(lo, argp); + err = loop_set_status64(lo, mode, bdev, argp); break; case LOOP_GET_STATUS64: return loop_get_status64(lo, argp); @@ -1650,8 +1668,9 @@ loop_info64_to_compat(const struct loop_info64 *info64, } static int -loop_set_status_compat(struct loop_device *lo, - const struct compat_loop_info __user *arg) +loop_set_status_compat(struct loop_device *lo, blk_mode_t mode, + struct block_device *bdev, + const struct compat_loop_info __user *arg) { struct loop_info64 info64; int ret; @@ -1659,7 +1678,7 @@ loop_set_status_compat(struct loop_device *lo, ret = loop_info64_from_compat(arg, &info64); if (ret < 0) return ret; - return loop_set_status(lo, &info64); + return loop_set_status(lo, mode, bdev, &info64); } static int @@ -1685,7 +1704,7 @@ static int lo_compat_ioctl(struct block_device *bdev, blk_mode_t mode, switch(cmd) { case LOOP_SET_STATUS: - err = loop_set_status_compat(lo, + err = loop_set_status_compat(lo, mode, bdev, (const struct compat_loop_info __user *)arg); break; case LOOP_GET_STATUS: From cf09141a4bdff63309c4a44c31e2300b813e82c4 Mon Sep 17 00:00:00 2001 From: Breno Leitao Date: Tue, 6 Jan 2026 06:26:57 -0800 Subject: [PATCH 0904/1321] blk-rq-qos: Remove unlikely() hints from QoS checks The unlikely() annotations on QUEUE_FLAG_QOS_ENABLED checks are counterproductive. Writeback throttling (WBT) might be enabled by default, mainly because CONFIG_BLK_WBT_MQ defaults to 'y'. Branch profiling on Meta servers, which have WBT enabled, confirms 100% misprediction rates on these checks. Remove the unlikely() annotations to let the CPU's branch predictor learn the actual behavior, potentially improving I/O path performance. Signed-off-by: Breno Leitao Signed-off-by: Jens Axboe --- block/blk-rq-qos.h | 25 +++++++++---------------- 1 file changed, 9 insertions(+), 16 deletions(-) diff --git a/block/blk-rq-qos.h b/block/blk-rq-qos.h index b538f2c0febc2..a747a504fe429 100644 --- a/block/blk-rq-qos.h +++ b/block/blk-rq-qos.h @@ -112,29 +112,26 @@ void __rq_qos_queue_depth_changed(struct rq_qos *rqos); static inline void rq_qos_cleanup(struct request_queue *q, struct bio *bio) { - if (unlikely(test_bit(QUEUE_FLAG_QOS_ENABLED, &q->queue_flags)) && - q->rq_qos) + if (test_bit(QUEUE_FLAG_QOS_ENABLED, &q->queue_flags) && q->rq_qos) __rq_qos_cleanup(q->rq_qos, bio); } static inline void rq_qos_done(struct request_queue *q, struct request *rq) { - if (unlikely(test_bit(QUEUE_FLAG_QOS_ENABLED, &q->queue_flags)) && - q->rq_qos && !blk_rq_is_passthrough(rq)) + if (test_bit(QUEUE_FLAG_QOS_ENABLED, &q->queue_flags) && + q->rq_qos && !blk_rq_is_passthrough(rq)) __rq_qos_done(q->rq_qos, rq); } static inline void rq_qos_issue(struct request_queue *q, struct request *rq) { - if (unlikely(test_bit(QUEUE_FLAG_QOS_ENABLED, &q->queue_flags)) && - q->rq_qos) + if (test_bit(QUEUE_FLAG_QOS_ENABLED, &q->queue_flags) && q->rq_qos) __rq_qos_issue(q->rq_qos, rq); } static inline void rq_qos_requeue(struct request_queue *q, struct request *rq) { - if (unlikely(test_bit(QUEUE_FLAG_QOS_ENABLED, &q->queue_flags)) && - q->rq_qos) + if (test_bit(QUEUE_FLAG_QOS_ENABLED, &q->queue_flags) && q->rq_qos) __rq_qos_requeue(q->rq_qos, rq); } @@ -162,8 +159,7 @@ static inline void rq_qos_done_bio(struct bio *bio) static inline void rq_qos_throttle(struct request_queue *q, struct bio *bio) { - if (unlikely(test_bit(QUEUE_FLAG_QOS_ENABLED, &q->queue_flags)) && - q->rq_qos) { + if (test_bit(QUEUE_FLAG_QOS_ENABLED, &q->queue_flags) && q->rq_qos) { bio_set_flag(bio, BIO_QOS_THROTTLED); __rq_qos_throttle(q->rq_qos, bio); } @@ -172,16 +168,14 @@ static inline void rq_qos_throttle(struct request_queue *q, struct bio *bio) static inline void rq_qos_track(struct request_queue *q, struct request *rq, struct bio *bio) { - if (unlikely(test_bit(QUEUE_FLAG_QOS_ENABLED, &q->queue_flags)) && - q->rq_qos) + if (test_bit(QUEUE_FLAG_QOS_ENABLED, &q->queue_flags) && q->rq_qos) __rq_qos_track(q->rq_qos, rq, bio); } static inline void rq_qos_merge(struct request_queue *q, struct request *rq, struct bio *bio) { - if (unlikely(test_bit(QUEUE_FLAG_QOS_ENABLED, &q->queue_flags)) && - q->rq_qos) { + if (test_bit(QUEUE_FLAG_QOS_ENABLED, &q->queue_flags) && q->rq_qos) { bio_set_flag(bio, BIO_QOS_MERGED); __rq_qos_merge(q->rq_qos, rq, bio); } @@ -189,8 +183,7 @@ static inline void rq_qos_merge(struct request_queue *q, struct request *rq, static inline void rq_qos_queue_depth_changed(struct request_queue *q) { - if (unlikely(test_bit(QUEUE_FLAG_QOS_ENABLED, &q->queue_flags)) && - q->rq_qos) + if (test_bit(QUEUE_FLAG_QOS_ENABLED, &q->queue_flags) && q->rq_qos) __rq_qos_queue_depth_changed(q->rq_qos); } From eab72686834ca4b83cc880a2bcce8e0fb5e79711 Mon Sep 17 00:00:00 2001 From: Caleb Sander Mateos Date: Tue, 6 Jan 2026 13:08:37 -0700 Subject: [PATCH 0905/1321] block: don't merge bios with different app_tags nvme_set_app_tag() uses the app_tag value from the bio_integrity_payload of the struct request's first bio. This assumes all the request's bios have the same app_tag. However, it is possible for bios with different app_tag values to be merged into a single request. Add a check in blk_integrity_merge_{bio,rq}() to prevent the merging of bios/requests with different app_tag values if BIP_CHECK_APPTAG is set. Signed-off-by: Caleb Sander Mateos Fixes: 3d8b5a22d404 ("block: add support to pass user meta buffer") Signed-off-by: Jens Axboe --- block/blk-integrity.c | 23 ++++++++++++++++++----- 1 file changed, 18 insertions(+), 5 deletions(-) diff --git a/block/blk-integrity.c b/block/blk-integrity.c index 9b27963680dc3..964eebbee14d0 100644 --- a/block/blk-integrity.c +++ b/block/blk-integrity.c @@ -140,14 +140,21 @@ EXPORT_SYMBOL_GPL(blk_rq_integrity_map_user); bool blk_integrity_merge_rq(struct request_queue *q, struct request *req, struct request *next) { + struct bio_integrity_payload *bip, *bip_next; + if (blk_integrity_rq(req) == 0 && blk_integrity_rq(next) == 0) return true; if (blk_integrity_rq(req) == 0 || blk_integrity_rq(next) == 0) return false; - if (bio_integrity(req->bio)->bip_flags != - bio_integrity(next->bio)->bip_flags) + bip = bio_integrity(req->bio); + bip_next = bio_integrity(next->bio); + if (bip->bip_flags != bip_next->bip_flags) + return false; + + if (bip->bip_flags & BIP_CHECK_APPTAG && + bip->app_tag != bip_next->app_tag) return false; if (req->nr_integrity_segments + next->nr_integrity_segments > @@ -163,15 +170,21 @@ bool blk_integrity_merge_rq(struct request_queue *q, struct request *req, bool blk_integrity_merge_bio(struct request_queue *q, struct request *req, struct bio *bio) { + struct bio_integrity_payload *bip, *bip_bio = bio_integrity(bio); int nr_integrity_segs; - if (blk_integrity_rq(req) == 0 && bio_integrity(bio) == NULL) + if (blk_integrity_rq(req) == 0 && bip_bio == NULL) return true; - if (blk_integrity_rq(req) == 0 || bio_integrity(bio) == NULL) + if (blk_integrity_rq(req) == 0 || bip_bio == NULL) + return false; + + bip = bio_integrity(req->bio); + if (bip->bip_flags != bip_bio->bip_flags) return false; - if (bio_integrity(req->bio)->bip_flags != bio_integrity(bio)->bip_flags) + if (bip->bip_flags & BIP_CHECK_APPTAG && + bip->app_tag != bip_bio->app_tag) return false; nr_integrity_segs = blk_rq_count_integrity_sg(q, bio); From 5292883ea0c392053ec05547afb182602cb2742b Mon Sep 17 00:00:00 2001 From: Tetsuo Handa Date: Wed, 7 Jan 2026 19:41:43 +0900 Subject: [PATCH 0906/1321] loop: add missing bd_abort_claiming in loop_set_status Commit 08e136ebd193 ("loop: don't change loop device under exclusive opener in loop_set_status") forgot to call bd_abort_claiming() when mutex_lock_killable() failed. Fixes: 08e136ebd193 ("loop: don't change loop device under exclusive opener in loop_set_status") Signed-off-by: Tetsuo Handa Signed-off-by: Jens Axboe --- drivers/block/loop.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/block/loop.c b/drivers/block/loop.c index ca74cc31bf07f..bd59c0e9508b7 100644 --- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -1245,7 +1245,8 @@ loop_set_status(struct loop_device *lo, blk_mode_t mode, err = mutex_lock_killable(&lo->lo_mutex); if (err) - return err; + goto out_abort_claiming; + if (lo->lo_state != Lo_bound) { err = -ENXIO; goto out_unlock; @@ -1284,6 +1285,7 @@ loop_set_status(struct loop_device *lo, blk_mode_t mode, } out_unlock: mutex_unlock(&lo->lo_mutex); +out_abort_claiming: if (!(mode & BLK_OPEN_EXCL)) bd_abort_claiming(bdev, loop_set_status); out_reread_partitions: From 28ebfcd974b48277c791d08a0ebcc9e1ea2c53b9 Mon Sep 17 00:00:00 2001 From: Mikulas Patocka Date: Tue, 6 Jan 2026 16:56:07 +0100 Subject: [PATCH 0907/1321] blk-mq: avoid stall during boot due to synchronize_rcu_expedited On the kernel 6.19-rc, I am experiencing 15-second boot stall in a virtual machine when probing a virtio-scsi disk: [ 1.011641] SCSI subsystem initialized [ 1.013972] virtio_scsi virtio6: 16/0/0 default/read/poll queues [ 1.015983] scsi host0: Virtio SCSI HBA [ 1.019578] ACPI: \_SB_.GSIA: Enabled at IRQ 16 [ 1.020225] ahci 0000:00:1f.2: AHCI vers 0001.0000, 32 command slots, 1.5 Gbps, SATA mode [ 1.020228] ahci 0000:00:1f.2: 6/6 ports implemented (port mask 0x3f) [ 1.020230] ahci 0000:00:1f.2: flags: 64bit ncq only [ 1.024688] scsi host1: ahci [ 1.025432] scsi host2: ahci [ 1.025966] scsi host3: ahci [ 1.026511] scsi host4: ahci [ 1.028371] scsi host5: ahci [ 1.028918] scsi host6: ahci [ 1.029266] ata1: SATA max UDMA/133 abar m4096@0xfea23000 port 0xfea23100 irq 16 lpm-pol 1 [ 1.029305] ata2: SATA max UDMA/133 abar m4096@0xfea23000 port 0xfea23180 irq 16 lpm-pol 1 [ 1.029316] ata3: SATA max UDMA/133 abar m4096@0xfea23000 port 0xfea23200 irq 16 lpm-pol 1 [ 1.029327] ata4: SATA max UDMA/133 abar m4096@0xfea23000 port 0xfea23280 irq 16 lpm-pol 1 [ 1.029341] ata5: SATA max UDMA/133 abar m4096@0xfea23000 port 0xfea23300 irq 16 lpm-pol 1 [ 1.029356] ata6: SATA max UDMA/133 abar m4096@0xfea23000 port 0xfea23380 irq 16 lpm-pol 1 [ 1.118111] scsi 0:0:0:0: Direct-Access QEMU QEMU HARDDISK 2.5+ PQ: 0 ANSI: 5 [ 1.348916] ata1: SATA link down (SStatus 0 SControl 300) [ 1.350713] ata2: SATA link down (SStatus 0 SControl 300) [ 1.351025] ata6: SATA link down (SStatus 0 SControl 300) [ 1.351160] ata5: SATA link down (SStatus 0 SControl 300) [ 1.351326] ata3: SATA link down (SStatus 0 SControl 300) [ 1.351536] ata4: SATA link down (SStatus 0 SControl 300) [ 1.449153] input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8042/serio1/input/input2 [ 16.483477] sd 0:0:0:0: Power-on or device reset occurred [ 16.483691] sd 0:0:0:0: [sda] 2097152 512-byte logical blocks: (1.07 GB/1.00 GiB) [ 16.483762] sd 0:0:0:0: [sda] Write Protect is off [ 16.483877] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [ 16.569225] sd 0:0:0:0: [sda] Attached SCSI disk I bisected it and it is caused by the commit 89e1fb7ceffd which introduces calls to synchronize_rcu_expedited. This commit replaces synchronize_rcu_expedited and kfree with a call to kfree_rcu_mightsleep, avoiding the 15-second delay. Signed-off-by: Mikulas Patocka Fixes: 89e1fb7ceffd ("blk-mq: fix potential uaf for 'queue_hw_ctx'") Reviewed-by: Uladzislau Rezki (Sony) Signed-off-by: Jens Axboe --- block/blk-mq.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index eff4f72ce83be..a29d8ac9d3e35 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -4553,8 +4553,7 @@ static void __blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set, * Make sure reading the old queue_hw_ctx from other * context concurrently won't trigger uaf. */ - synchronize_rcu_expedited(); - kfree(hctxs); + kfree_rcu_mightsleep(hctxs); hctxs = new_hctxs; } From 975eb029aa6f0fe78fe0d2afe40733242232aee5 Mon Sep 17 00:00:00 2001 From: Ming Lei Date: Fri, 9 Jan 2026 20:14:54 +0800 Subject: [PATCH 0908/1321] ublk: fix use-after-free in ublk_partition_scan_work A race condition exists between the async partition scan work and device teardown that can lead to a use-after-free of ub->ub_disk: 1. ublk_ctrl_start_dev() schedules partition_scan_work after add_disk() 2. ublk_stop_dev() calls ublk_stop_dev_unlocked() which does: - del_gendisk(ub->ub_disk) - ublk_detach_disk() sets ub->ub_disk = NULL - put_disk() which may free the disk 3. The worker ublk_partition_scan_work() then dereferences ub->ub_disk leading to UAF Fix this by using ublk_get_disk()/ublk_put_disk() in the worker to hold a reference to the disk during the partition scan. The spinlock in ublk_get_disk() synchronizes with ublk_detach_disk() ensuring the worker either gets a valid reference or sees NULL and exits early. Also change flush_work() to cancel_work_sync() to avoid running the partition scan work unnecessarily when the disk is already detached. Fixes: 7fc4da6a304b ("ublk: scan partition in async way") Reported-by: Ruikai Peng Signed-off-by: Ming Lei Signed-off-by: Jens Axboe --- drivers/block/ublk_drv.c | 37 ++++++++++++++++++++++--------------- 1 file changed, 22 insertions(+), 15 deletions(-) diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c index 837fedb02e0d5..f6e5a07667212 100644 --- a/drivers/block/ublk_drv.c +++ b/drivers/block/ublk_drv.c @@ -255,20 +255,6 @@ static inline struct request *__ublk_check_and_get_req(struct ublk_device *ub, u16 q_id, u16 tag, struct ublk_io *io, size_t offset); static inline unsigned int ublk_req_build_flags(struct request *req); -static void ublk_partition_scan_work(struct work_struct *work) -{ - struct ublk_device *ub = - container_of(work, struct ublk_device, partition_scan_work); - - if (WARN_ON_ONCE(!test_and_clear_bit(GD_SUPPRESS_PART_SCAN, - &ub->ub_disk->state))) - return; - - mutex_lock(&ub->ub_disk->open_mutex); - bdev_disk_changed(ub->ub_disk, false); - mutex_unlock(&ub->ub_disk->open_mutex); -} - static inline struct ublksrv_io_desc * ublk_get_iod(const struct ublk_queue *ubq, unsigned tag) { @@ -1597,6 +1583,27 @@ static void ublk_put_disk(struct gendisk *disk) put_device(disk_to_dev(disk)); } +static void ublk_partition_scan_work(struct work_struct *work) +{ + struct ublk_device *ub = + container_of(work, struct ublk_device, partition_scan_work); + /* Hold disk reference to prevent UAF during concurrent teardown */ + struct gendisk *disk = ublk_get_disk(ub); + + if (!disk) + return; + + if (WARN_ON_ONCE(!test_and_clear_bit(GD_SUPPRESS_PART_SCAN, + &disk->state))) + goto out; + + mutex_lock(&disk->open_mutex); + bdev_disk_changed(disk, false); + mutex_unlock(&disk->open_mutex); +out: + ublk_put_disk(disk); +} + /* * Use this function to ensure that ->canceling is consistently set for * the device and all queues. Do not set these flags directly. @@ -2041,7 +2048,7 @@ static void ublk_stop_dev(struct ublk_device *ub) mutex_lock(&ub->mutex); ublk_stop_dev_unlocked(ub); mutex_unlock(&ub->mutex); - flush_work(&ub->partition_scan_work); + cancel_work_sync(&ub->partition_scan_work); ublk_cancel_dev(ub); } From 40ed708607909ae522c3086da997b17bf9913a4f Mon Sep 17 00:00:00 2001 From: Gao Xiang Date: Thu, 8 Jan 2026 10:38:31 +0800 Subject: [PATCH 0909/1321] erofs: don't bother with s_stack_depth increasing for now MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Previously, commit d53cd891f0e4 ("erofs: limit the level of fs stacking for file-backed mounts") bumped `s_stack_depth` by one to avoid kernel stack overflow when stacking an unlimited number of EROFS on top of each other. This fix breaks composefs mounts, which need EROFS+ovl^2 sometimes (and such setups are already used in production for quite a long time). One way to fix this regression is to bump FILESYSTEM_MAX_STACK_DEPTH from 2 to 3, but proving that this is safe in general is a high bar. After a long discussion on GitHub issues [1] about possible solutions, one conclusion is that there is no need to support nesting file-backed EROFS mounts on stacked filesystems, because there is always the option to use loopback devices as a fallback. As a quick fix for the composefs regression for this cycle, instead of bumping `s_stack_depth` for file backed EROFS mounts, we disallow nesting file-backed EROFS over EROFS and over filesystems with `s_stack_depth` > 0. This works for all known file-backed mount use cases (composefs, containerd, and Android APEX for some Android vendors), and the fix is self-contained. Essentially, we are allowing one extra unaccounted fs stacking level of EROFS below stacking filesystems, but EROFS can only be used in the read path (i.e. overlayfs lower layers), which typically has much lower stack usage than the write path. We can consider increasing FILESYSTEM_MAX_STACK_DEPTH later, after more stack usage analysis or using alternative approaches, such as splitting the `s_stack_depth` limitation according to different combinations of stacking. Fixes: d53cd891f0e4 ("erofs: limit the level of fs stacking for file-backed mounts") Reported-and-tested-by: Dusty Mabe Reported-by: Timothée Ravier Closes: https://github.com/coreos/fedora-coreos-tracker/issues/2087 [1] Reported-by: "Alekséi Naidénov" Closes: https://lore.kernel.org/r/CAFHtUiYv4+=+JP_-JjARWjo6OwcvBj1wtYN=z0QXwCpec9sXtg@mail.gmail.com Acked-by: Amir Goldstein Acked-by: Alexander Larsson Reviewed-and-tested-by: Sheng Yong Reviewed-by: Zhiguo Niu Reviewed-by: Chao Yu Cc: Christian Brauner Cc: Miklos Szeredi Signed-off-by: Gao Xiang --- fs/erofs/super.c | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/fs/erofs/super.c b/fs/erofs/super.c index 937a215f626c1..e93264034b5db 100644 --- a/fs/erofs/super.c +++ b/fs/erofs/super.c @@ -644,14 +644,20 @@ static int erofs_fc_fill_super(struct super_block *sb, struct fs_context *fc) * fs contexts (including its own) due to self-controlled RO * accesses/contexts and no side-effect changes that need to * context save & restore so it can reuse the current thread - * context. However, it still needs to bump `s_stack_depth` to - * avoid kernel stack overflow from nested filesystems. + * context. + * However, we still need to prevent kernel stack overflow due + * to filesystem nesting: just ensure that s_stack_depth is 0 + * to disallow mounting EROFS on stacked filesystems. + * Note: s_stack_depth is not incremented here for now, since + * EROFS is the only fs supporting file-backed mounts for now. + * It MUST change if another fs plans to support them, which + * may also require adjusting FILESYSTEM_MAX_STACK_DEPTH. */ if (erofs_is_fileio_mode(sbi)) { - sb->s_stack_depth = - file_inode(sbi->dif0.file)->i_sb->s_stack_depth + 1; - if (sb->s_stack_depth > FILESYSTEM_MAX_STACK_DEPTH) { - erofs_err(sb, "maximum fs stacking depth exceeded"); + inode = file_inode(sbi->dif0.file); + if ((inode->i_sb->s_op == &erofs_sops && !sb->s_bdev) || + inode->i_sb->s_stack_depth) { + erofs_err(sb, "file-backed mounts cannot be applied to stacked fses"); return -ENOTBLK; } } From 96fd0ea84a4efdd533442e5d504215eda8644dd1 Mon Sep 17 00:00:00 2001 From: Gao Xiang Date: Sat, 10 Jan 2026 19:47:03 +0800 Subject: [PATCH 0910/1321] erofs: fix file-backed mounts no longer working on EROFS partitions Sheng Yong reported [1] that Android APEX images didn't work with commit 072a7c7cdbea ("erofs: don't bother with s_stack_depth increasing for now") because "EROFS-formatted APEX file images can be stored within an EROFS-formatted Android system partition." In response, I sent a quick fat-fingered [PATCH v3] to address the report. Unfortunately, the updated condition was incorrect: if (erofs_is_fileio_mode(sbi)) { - sb->s_stack_depth = - file_inode(sbi->dif0.file)->i_sb->s_stack_depth + 1; - if (sb->s_stack_depth > FILESYSTEM_MAX_STACK_DEPTH) { - erofs_err(sb, "maximum fs stacking depth exceeded"); + inode = file_inode(sbi->dif0.file); + if ((inode->i_sb->s_op == &erofs_sops && !sb->s_bdev) || + inode->i_sb->s_stack_depth) { The condition `!sb->s_bdev` is always true for all file-backed EROFS mounts, making the check effectively a no-op. The real fix tested and confirmed by Sheng Yong [2] at that time was [PATCH v3 RESEND], which correctly ensures the following EROFS^2 setup works: EROFS (on a block device) + EROFS (file-backed mount) But sadly I screwed it up again by upstreaming the outdated [PATCH v3]. This patch applies the same logic as the delta between the upstream [PATCH v3] and the real fix [PATCH v3 RESEND]. Reported-by: Sheng Yong Closes: https://lore.kernel.org/r/3acec686-4020-4609-aee4-5dae7b9b0093@gmail.com [1] Fixes: 072a7c7cdbea ("erofs: don't bother with s_stack_depth increasing for now") Link: https://lore.kernel.org/r/243f57b8-246f-47e7-9fb1-27a771e8e9e8@gmail.com [2] Signed-off-by: Gao Xiang Signed-off-by: Linus Torvalds --- fs/erofs/super.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/erofs/super.c b/fs/erofs/super.c index e93264034b5db..5136cda5972a9 100644 --- a/fs/erofs/super.c +++ b/fs/erofs/super.c @@ -655,7 +655,8 @@ static int erofs_fc_fill_super(struct super_block *sb, struct fs_context *fc) */ if (erofs_is_fileio_mode(sbi)) { inode = file_inode(sbi->dif0.file); - if ((inode->i_sb->s_op == &erofs_sops && !sb->s_bdev) || + if ((inode->i_sb->s_op == &erofs_sops && + !inode->i_sb->s_bdev) || inode->i_sb->s_stack_depth) { erofs_err(sb, "file-backed mounts cannot be applied to stacked fses"); return -ENOTBLK; From c0839435aa4050010fdb74e987a89ad38c3bda57 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Tue, 6 Jan 2026 15:22:10 -0400 Subject: [PATCH 0911/1321] iommupt: Fix the kunit building The kunit doesn't work since the below commit made GENERIC_PT unselectable: $ make ARCH=x86_64 O=build_kunit_x86_64 olddefconfig ERROR:root:Not all Kconfig options selected in kunitconfig were in the generated .config. This is probably due to unsatisfied dependencies. Missing: CONFIG_DEBUG_GENERIC_PT=y, CONFIG_IOMMUFD_TEST=y, CONFIG_IOMMU_PT_X86_64=y, CONFIG_GENERIC_PT=y, CONFIG_IOMMU_PT_AMDV1=y, CONFIG_IOMMU_PT_VTDSS=y, CONFIG_IOMMU_PT=y, CONFIG_IOMMU_PT_KUNIT_TEST=y Also remove the unneeded CONFIG_IOMMUFD_TEST reference as the iommupt kunit doesn't interact with iommufd, and it doesn't currently build for the kunit due problems with DMA_SHARED buffer either. Fixes: 01569c216dde ("genpt: Make GENERIC_PT invisible") Fixes: 1dd4187f53c3 ("iommupt: Add a kunit test for Generic Page Table") Signed-off-by: Jason Gunthorpe Reviewed-by: Alejandro Jimenez Signed-off-by: Joerg Roedel --- drivers/iommu/generic_pt/.kunitconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/iommu/generic_pt/.kunitconfig b/drivers/iommu/generic_pt/.kunitconfig index 52ac9e661ffd2..a78b295f264d2 100644 --- a/drivers/iommu/generic_pt/.kunitconfig +++ b/drivers/iommu/generic_pt/.kunitconfig @@ -1,4 +1,5 @@ CONFIG_KUNIT=y +CONFIG_COMPILE_TEST=y CONFIG_GENERIC_PT=y CONFIG_DEBUG_GENERIC_PT=y CONFIG_IOMMU_PT=y @@ -11,4 +12,3 @@ CONFIG_IOMMUFD=y CONFIG_DEBUG_KERNEL=y CONFIG_FAULT_INJECTION=y CONFIG_RUNTIME_TESTING_MENU=y -CONFIG_IOMMUFD_TEST=y From e880a2edc15725aca4dcb7e0688b497b8b70878e Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Tue, 6 Jan 2026 15:22:11 -0400 Subject: [PATCH 0912/1321] iommufd/selftest: Add missing kconfig for DMA_SHARED_BUFFER The test doesn't build without it, dma-buf.h does not provide stub functions if it is not enabled. Compilation can fail with: ERROR:root:ld: vmlinux.o: in function `iommufd_test': (.text+0x3b1cdd): undefined reference to `dma_buf_get' ld: (.text+0x3b1d08): undefined reference to `dma_buf_put' ld: (.text+0x3b2105): undefined reference to `dma_buf_export' ld: (.text+0x3b211f): undefined reference to `dma_buf_fd' ld: (.text+0x3b2e47): undefined reference to `dma_buf_move_notify' Add the missing select. Fixes: d2041f1f11dd ("iommufd/selftest: Add some tests for the dmabuf flow") Signed-off-by: Jason Gunthorpe Signed-off-by: Joerg Roedel --- drivers/iommu/iommufd/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/iommu/iommufd/Kconfig b/drivers/iommu/iommufd/Kconfig index eae3f03629b0c..7e41e3ccea758 100644 --- a/drivers/iommu/iommufd/Kconfig +++ b/drivers/iommu/iommufd/Kconfig @@ -42,6 +42,7 @@ config IOMMUFD_TEST depends on FAULT_INJECTION depends on RUNTIME_TESTING_MENU depends on IOMMU_PT_AMDV1 + select DMA_SHARED_BUFFER select IOMMUFD_DRIVER default n help From 4f22830879b0c389856fdbfa5383d372cd203035 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Tue, 6 Jan 2026 15:22:12 -0400 Subject: [PATCH 0913/1321] iommufd/selftest: Prevent module/builtin conflicts in kconfig The selftest now depends on the AMDv1 page table, however the selftest kconfig itself is just an sub-option of the main IOMMUFD module kconfig. This means it cannot be modular and so kconfig allowed a modular IOMMU_PT_AMDV1 with a built in IOMMUFD. This causes link failures: ld: vmlinux.o: in function `mock_domain_alloc_pgtable.isra.0': selftest.c:(.text+0x12e8ad3): undefined reference to `pt_iommu_amdv1_init' ld: vmlinux.o: in function `BSWAP_SHUFB_CTL': sha1-avx2-asm.o:(.rodata+0xaa36a8): undefined reference to `pt_iommu_amdv1_read_and_clear_dirty' ld: sha1-avx2-asm.o:(.rodata+0xaa36f0): undefined reference to `pt_iommu_amdv1_map_pages' ld: sha1-avx2-asm.o:(.rodata+0xaa36f8): undefined reference to `pt_iommu_amdv1_unmap_pages' ld: sha1-avx2-asm.o:(.rodata+0xaa3720): undefined reference to `pt_iommu_amdv1_iova_to_phys' Adjust the kconfig to disable IOMMUFD_TEST if IOMMU_PT_AMDV1 is incompatible. Fixes: e93d5945ed5b ("iommufd: Change the selftest to use iommupt instead of xarray") Suggested-by: Arnd Bergmann Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202512210135.freQWpxa-lkp@intel.com/ Signed-off-by: Jason Gunthorpe Signed-off-by: Joerg Roedel --- drivers/iommu/iommufd/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/iommu/iommufd/Kconfig b/drivers/iommu/iommufd/Kconfig index 7e41e3ccea758..455bac0351f28 100644 --- a/drivers/iommu/iommufd/Kconfig +++ b/drivers/iommu/iommufd/Kconfig @@ -41,7 +41,7 @@ config IOMMUFD_TEST depends on DEBUG_KERNEL depends on FAULT_INJECTION depends on RUNTIME_TESTING_MENU - depends on IOMMU_PT_AMDV1 + depends on IOMMU_PT_AMDV1=y || IOMMUFD=IOMMU_PT_AMDV1 select DMA_SHARED_BUFFER select IOMMUFD_DRIVER default n From 765750e083f38aa05326caffcf8c6a9a629a7524 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Fri, 9 Jan 2026 10:29:52 -0400 Subject: [PATCH 0914/1321] iommupt: Make pt_feature() always_inline gcc 8.5 on powerpc does not automatically inline these functions even though they evaluate to constants in key cases. Since the constant propagation is essential for some code elimination and built-time checks this causes a build failure: ERROR: modpost: "__pt_no_sw_bit" [drivers/iommu/generic_pt/fmt/iommu_amdv1.ko] undefined! Caused by this: if (pts_feature(&pts, PT_FEAT_DMA_INCOHERENT) && !pt_test_sw_bit_acquire(&pts, SW_BIT_CACHE_FLUSH_DONE)) flush_writes_item(&pts); Where pts_feature() evaluates to a constant false. Mark them as __always_inline to force it to evaluate to a constant and trigger the code elimination. Fixes: 7c5b184db714 ("genpt: Generic Page Table base API") Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202512230720.9y9DtWIo-lkp@intel.com/ Signed-off-by: Jason Gunthorpe --- drivers/iommu/generic_pt/pt_defs.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/generic_pt/pt_defs.h b/drivers/iommu/generic_pt/pt_defs.h index c25544d72f979..707b3b0282fad 100644 --- a/drivers/iommu/generic_pt/pt_defs.h +++ b/drivers/iommu/generic_pt/pt_defs.h @@ -202,7 +202,7 @@ static inline bool pt_table_install32(struct pt_state *pts, u32 table_entry) #define PT_SUPPORTED_FEATURE(feature_nr) (PT_SUPPORTED_FEATURES & BIT(feature_nr)) -static inline bool pt_feature(const struct pt_common *common, +static __always_inline bool pt_feature(const struct pt_common *common, unsigned int feature_nr) { if (PT_FORCE_ENABLED_FEATURES & BIT(feature_nr)) @@ -212,7 +212,7 @@ static inline bool pt_feature(const struct pt_common *common, return common->features & BIT(feature_nr); } -static inline bool pts_feature(const struct pt_state *pts, +static __always_inline bool pts_feature(const struct pt_state *pts, unsigned int feature_nr) { return pt_feature(pts->range->common, feature_nr); From dad0a25beea0b5ad3f09ac881d15976ddd93d202 Mon Sep 17 00:00:00 2001 From: Fushuai Wang Date: Fri, 9 Jan 2026 11:36:20 +0800 Subject: [PATCH 0915/1321] selftests/tracing: Fix test_multiple_writes stall When /sys/kernel/tracing/buffer_size_kb is less than 12KB, the test_multiple_writes test will stall and wait for more input due to insufficient buffer space. Check current buffer_size_kb value before the test. If it is less than 12KB, it temporarily increase the buffer to 12KB, and restore the original value after the tests are completed. Link: https://lore.kernel.org/r/20260109033620.25727-1-fushuai.wang@linux.dev Fixes: 37f46601383a ("selftests/tracing: Add basic test for trace_marker_raw file") Suggested-by: Steven Rostedt Signed-off-by: Fushuai Wang Acked-by: Steven Rostedt (Google) Signed-off-by: Shuah Khan --- .../ftrace/test.d/00basic/trace_marker_raw.tc | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-) diff --git a/tools/testing/selftests/ftrace/test.d/00basic/trace_marker_raw.tc b/tools/testing/selftests/ftrace/test.d/00basic/trace_marker_raw.tc index 7daf7292209e5..a2c42e13f614b 100644 --- a/tools/testing/selftests/ftrace/test.d/00basic/trace_marker_raw.tc +++ b/tools/testing/selftests/ftrace/test.d/00basic/trace_marker_raw.tc @@ -89,6 +89,7 @@ test_buffer() { # The id must be four bytes, test that 3 bytes fails a write if echo -n abc > ./trace_marker_raw ; then echo "Too small of write expected to fail but did not" + echo ${ORIG} > buffer_size_kb exit_fail fi @@ -99,9 +100,24 @@ test_buffer() { if write_buffer 0xdeadbeef $size ; then echo "Too big of write expected to fail but did not" + echo ${ORIG} > buffer_size_kb exit_fail fi } +ORIG=`cat buffer_size_kb` + +# test_multiple_writes test needs at least 12KB buffer +NEW_SIZE=12 + +if [ ${ORIG} -lt ${NEW_SIZE} ]; then + echo ${NEW_SIZE} > buffer_size_kb +fi + test_buffer -test_multiple_writes +if ! test_multiple_writes; then + echo ${ORIG} > buffer_size_kb + exit_fail +fi + +echo ${ORIG} > buffer_size_kb From 180bbe9c4a3a8ac5d862b2bbd0808b3c629ef8cd Mon Sep 17 00:00:00 2001 From: Matthew Maurer Date: Fri, 26 Dec 2025 20:17:08 +0000 Subject: [PATCH 0916/1321] docs: ABI: sysfs-devices-soc: Fix swapped sample values The sample values for `family` and `machine` were swapped relative to what the driver actually does, and doesn't match the field description. Fixes: da5a70f3519f ("Documentation: add information for new sysfs soc bus functionality") Reviewed-by: Lee Jones Signed-off-by: Matthew Maurer Link: https://patch.msgid.link/20251226-soc-bindings-v4-2-2c2fac08f820@google.com Signed-off-by: Danilo Krummrich --- Documentation/ABI/testing/sysfs-devices-soc | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/ABI/testing/sysfs-devices-soc b/Documentation/ABI/testing/sysfs-devices-soc index 5269808ec35f8..cb6776a4afe02 100644 --- a/Documentation/ABI/testing/sysfs-devices-soc +++ b/Documentation/ABI/testing/sysfs-devices-soc @@ -17,14 +17,14 @@ Date: January 2012 contact: Lee Jones Description: Read-only attribute common to all SoCs. Contains the SoC machine - name (e.g. Ux500). + name (e.g. DB8500). What: /sys/devices/socX/family Date: January 2012 contact: Lee Jones Description: Read-only attribute common to all SoCs. Contains SoC family name - (e.g. DB8500). + (e.g. ux500). On many of ARM based silicon with SMCCC v1.2+ compliant firmware this will contain the JEDEC JEP106 manufacturer’s identification From 19d07a808cf7258bd233f77bee315182603ce4b8 Mon Sep 17 00:00:00 2001 From: Yilin Chen <1479826151@qq.com> Date: Mon, 29 Dec 2025 00:52:34 +0800 Subject: [PATCH 0917/1321] rust: dma: remove incorrect safety documentation Removes a safety requirement that incorrectly states callers must ensure the device does not access memory while the returned slice is live, as this method doesn't return a slice. Fixes: d37a39f607c4 ("rust: dma: add as_slice/write functions for CoherentAllocation") Signed-off-by: Yilin Chen <1479826151@qq.com> Link: https://patch.msgid.link/tencent_5195C0324923A2B67DEF8AE4B8E139BCB105@qq.com Signed-off-by: Danilo Krummrich --- rust/kernel/dma.rs | 2 -- 1 file changed, 2 deletions(-) diff --git a/rust/kernel/dma.rs b/rust/kernel/dma.rs index 84d3c67269e85..2ac107d8f7b78 100644 --- a/rust/kernel/dma.rs +++ b/rust/kernel/dma.rs @@ -532,8 +532,6 @@ impl CoherentAllocation { /// /// # Safety /// - /// * Callers must ensure that the device does not read/write to/from memory while the returned - /// slice is live. /// * Callers must ensure that this call does not race with a read or write to the same region /// that overlaps with this write. /// From 9b6b318f3f4cefd71a2bac889a7aa3f901fc1d9a Mon Sep 17 00:00:00 2001 From: Yilin Chen <1479826151@qq.com> Date: Mon, 29 Dec 2025 00:53:44 +0800 Subject: [PATCH 0918/1321] rust: device_id: replace incorrect word in safety documentation The safety documentation incorrectly refers to `RawDeviceId` when transmuting to `RawType`. This fixes the documentation to correctly indicate that implementers must ensure layout compatibility with `RawType`, not `RawDeviceId`. Fixes: 9b90864bb42b ("rust: implement `IdArray`, `IdTable` and `RawDeviceId`") Signed-off-by: Yilin Chen <1479826151@qq.com> Link: https://patch.msgid.link/tencent_C18DD5047749311142ED455779C7CCCF3A08@qq.com Signed-off-by: Danilo Krummrich --- rust/kernel/device_id.rs | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/rust/kernel/device_id.rs b/rust/kernel/device_id.rs index 62c42da12e9de..8e97214460146 100644 --- a/rust/kernel/device_id.rs +++ b/rust/kernel/device_id.rs @@ -15,7 +15,7 @@ use core::mem::MaybeUninit; /// # Safety /// /// Implementers must ensure that `Self` is layout-compatible with [`RawDeviceId::RawType`]; -/// i.e. it's safe to transmute to `RawDeviceId`. +/// i.e. it's safe to transmute to `RawType`. /// /// This requirement is needed so `IdArray::new` can convert `Self` to `RawType` when building /// the ID table. From b25b5f391f85f9240eecd28c799d83e10b1e6b67 Mon Sep 17 00:00:00 2001 From: Alice Ryhl Date: Sat, 27 Dec 2025 15:47:21 +0000 Subject: [PATCH 0919/1321] rust: driver: fix broken intra-doc links to example driver types The `auxiliary` and `pci` modules are conditional on `CONFIG_AUXILIARY_BUS` and `CONFIG_PCI` respectively. When these are disabled, the intra-doc links to `auxiliary::Driver` and `pci::Driver` break, causing rustdoc warnings (or errors with `-D warnings`). error: unresolved link to `kernel::auxiliary::Driver` --> rust/kernel/driver.rs:82:28 | 82 | //! [`auxiliary::Driver`]: kernel::auxiliary::Driver | ^^^^^^^^^^^^^^^^^^^^^^^^^ no item named `auxiliary` in module `kernel` Fix this by making the documentation for these examples conditional on the corresponding configuration options. Fixes: 970a7c68788e ("driver: rust: expand documentation for driver infrastructure") Signed-off-by: Alice Ryhl Reported-by: FUJITA Tomonori Closes: https://lore.kernel.org/rust-for-linux/20251209.151817.744108529426448097.fujita.tomonori@gmail.com/ Link: https://patch.msgid.link/20251227-driver-types-v1-1-1916154fbe5e@google.com Signed-off-by: Danilo Krummrich --- rust/kernel/driver.rs | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/rust/kernel/driver.rs b/rust/kernel/driver.rs index 9beae2e3d57e7..649d06468f411 100644 --- a/rust/kernel/driver.rs +++ b/rust/kernel/driver.rs @@ -33,7 +33,14 @@ //! } //! ``` //! -//! For specific examples see [`auxiliary::Driver`], [`pci::Driver`] and [`platform::Driver`]. +//! For specific examples see: +//! +//! * [`platform::Driver`](kernel::platform::Driver) +#![cfg_attr( + CONFIG_AUXILIARY_BUS, + doc = "* [`auxiliary::Driver`](kernel::auxiliary::Driver)" +)] +#![cfg_attr(CONFIG_PCI, doc = "* [`pci::Driver`](kernel::pci::Driver)")] //! //! The `probe()` callback should return a `impl PinInit`, i.e. the driver's private //! data. The bus abstraction should store the pointer in the corresponding bus device. The generic @@ -79,7 +86,6 @@ //! //! For this purpose the generic infrastructure in [`device_id`] should be used. //! -//! [`auxiliary::Driver`]: kernel::auxiliary::Driver //! [`Core`]: device::Core //! [`Device`]: device::Device //! [`Device`]: device::Device @@ -87,8 +93,6 @@ //! [`DeviceContext`]: device::DeviceContext //! [`device_id`]: kernel::device_id //! [`module_driver`]: kernel::module_driver -//! [`pci::Driver`]: kernel::pci::Driver -//! [`platform::Driver`]: kernel::platform::Driver use crate::error::{Error, Result}; use crate::{acpi, device, of, str::CStr, try_pin_init, types::Opaque, ThisModule}; From b6f768f3b725b060bacb5b458e51244464e06e46 Mon Sep 17 00:00:00 2001 From: FUJITA Tomonori Date: Wed, 31 Dec 2025 13:57:27 +0900 Subject: [PATCH 0920/1321] rust: dma: fix broken intra-doc links The `pci` module is conditional on CONFIG_PCI. When it's disabled, the intra-doc link to `pci::Device` causes rustdoc warnings: warning: unresolved link to `::kernel::pci::Device` --> rust/kernel/dma.rs:30:70 | 30 | /// where the underlying bus is DMA capable, such as [`pci::Device`](::kernel::pci::Device) or | ^^^^^^^^^^^^^^^^^^^^^ no item named `pci` in module `kernel` Fix this by making the documentation conditional on CONFIG_PCI. Fixes: d06d5f66f549 ("rust: dma: implement `dma::Device` trait") Signed-off-by: FUJITA Tomonori Reviewed-by: Dirk Behme Link: https://patch.msgid.link/20251231045728.1912024-1-fujita.tomonori@gmail.com [ Keep the "such as" part indicating a list of examples; fix typos in commit message. - Danilo ] Signed-off-by: Danilo Krummrich --- rust/kernel/dma.rs | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/rust/kernel/dma.rs b/rust/kernel/dma.rs index 2ac107d8f7b78..acc65b1e0f245 100644 --- a/rust/kernel/dma.rs +++ b/rust/kernel/dma.rs @@ -27,8 +27,9 @@ pub type DmaAddress = bindings::dma_addr_t; /// Trait to be implemented by DMA capable bus devices. /// /// The [`dma::Device`](Device) trait should be implemented by bus specific device representations, -/// where the underlying bus is DMA capable, such as [`pci::Device`](::kernel::pci::Device) or -/// [`platform::Device`](::kernel::platform::Device). +/// where the underlying bus is DMA capable, such as: +#[cfg_attr(CONFIG_PCI, doc = "* [`pci::Device`](kernel::pci::Device)")] +/// * [`platform::Device`](::kernel::platform::Device) pub trait Device: AsRef> { /// Set up the device's DMA streaming addressing capabilities. /// From 12b371bda696a115372417173505a6c8ac7727ed Mon Sep 17 00:00:00 2001 From: FUJITA Tomonori Date: Wed, 31 Dec 2025 13:57:28 +0900 Subject: [PATCH 0921/1321] rust: device: fix broken intra-doc links The `pci` module is conditional on CONFIG_PCI. When it's disabled, the intra-doc link to `pci::Device` causes rustdoc warnings: warning: unresolved link to `kernel::pci::Device` --> rust/kernel/device.rs:163:22 | 163 | /// [`pci::Device`]: kernel::pci::Device | ^^^^^^^^^^^^^^^^^^^ no item named `pci` in module `kernel` | = note: `#[warn(rustdoc::broken_intra_doc_links)]` on by default Fix this by making the documentation conditional on CONFIG_PCI. Fixes: d6e26c1ae4a6 ("device: rust: expand documentation for Device") Signed-off-by: FUJITA Tomonori Reviewed-by: Dirk Behme Link: https://patch.msgid.link/20251231045728.1912024-2-fujita.tomonori@gmail.com [ Keep the "such as" part indicating a list of examples; fix typos in commit message. - Danilo ] Signed-off-by: Danilo Krummrich --- rust/kernel/device.rs | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/rust/kernel/device.rs b/rust/kernel/device.rs index c79be2e2bfe38..5c2e1e0369e9b 100644 --- a/rust/kernel/device.rs +++ b/rust/kernel/device.rs @@ -67,8 +67,9 @@ static_assert!(core::mem::size_of::() >= core::mem::size_ /// /// # Implementing Bus Devices /// -/// This section provides a guideline to implement bus specific devices, such as [`pci::Device`] or -/// [`platform::Device`]. +/// This section provides a guideline to implement bus specific devices, such as: +#[cfg_attr(CONFIG_PCI, doc = "* [`pci::Device`](kernel::pci::Device)")] +/// * [`platform::Device`] /// /// A bus specific device should be defined as follows. /// @@ -160,7 +161,6 @@ static_assert!(core::mem::size_of::() >= core::mem::size_ /// /// [`AlwaysRefCounted`]: kernel::types::AlwaysRefCounted /// [`impl_device_context_deref`]: kernel::impl_device_context_deref -/// [`pci::Device`]: kernel::pci::Device /// [`platform::Device`]: kernel::platform::Device #[repr(transparent)] pub struct Device(Opaque, PhantomData); From 2670dbe1c203fb13b5e7e75d299028ecfe78c53b Mon Sep 17 00:00:00 2001 From: Marko Turk Date: Mon, 5 Jan 2026 22:37:57 +0100 Subject: [PATCH 0922/1321] rust: pci: fix typos in Bar struct's comments Fix a typo in the doc-comment of the Bar structure: 'inststance -> instance'. Add also 'is' to the comment inside Bar's `new()` function (suggested by Dirk): // `pdev` is valid by the invariants of `Device`. Fixes: bf9651f84b4e ("rust: pci: implement I/O mappable `pci::Bar`") Suggested-by: Dirk Behme Signed-off-by: Marko Turk Reviewed-by: Dirk Behme Link: https://patch.msgid.link/20260105213726.73000-2-mt@markoturk.info Signed-off-by: Danilo Krummrich --- rust/kernel/pci/io.rs | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/rust/kernel/pci/io.rs b/rust/kernel/pci/io.rs index 0d55c3139b6f8..82a4f1eba2f5b 100644 --- a/rust/kernel/pci/io.rs +++ b/rust/kernel/pci/io.rs @@ -20,7 +20,7 @@ use core::ops::Deref; /// /// # Invariants /// -/// `Bar` always holds an `IoRaw` inststance that holds a valid pointer to the start of the I/O +/// `Bar` always holds an `IoRaw` instance that holds a valid pointer to the start of the I/O /// memory mapped PCI BAR and its size. pub struct Bar { pdev: ARef, @@ -54,7 +54,7 @@ impl Bar { let ioptr: usize = unsafe { bindings::pci_iomap(pdev.as_raw(), num, 0) } as usize; if ioptr == 0 { // SAFETY: - // `pdev` valid by the invariants of `Device`. + // `pdev` is valid by the invariants of `Device`. // `num` is checked for validity by a previous call to `Device::resource_len`. unsafe { bindings::pci_release_region(pdev.as_raw(), num) }; return Err(ENOMEM); From 673911c50c3f0532cd0deabdea6ad83e957fe89c Mon Sep 17 00:00:00 2001 From: FUJITA Tomonori Date: Tue, 6 Jan 2026 09:03:20 +0900 Subject: [PATCH 0923/1321] rust: device: Remove explicit import of CStrExt Remove the explicit import of CStrExt. When CONFIG_PRINTK is disabled this import causes a build error: error: unused import: `crate::str::CStrExt` --> rust/kernel/device.rs:17:5 | 17 | use crate::str::CStrExt as _; | ^^^^^^^^^^^^^^^^^^^ | = note: `-D unused-imports` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(unused_imports)]` error: aborting due to 1 previous error CStrExt is covered by prelude::* so the explicit import is redundant. Signed-off-by: FUJITA Tomonori Fixes: 3b83f5d5e78a ("rust: replace `CStr` with `core::ffi::CStr`") Link: https://patch.msgid.link/20260106000320.2593800-1-fujita.tomonori@gmail.com Signed-off-by: Danilo Krummrich --- rust/kernel/device.rs | 1 - 1 file changed, 1 deletion(-) diff --git a/rust/kernel/device.rs b/rust/kernel/device.rs index 5c2e1e0369e9b..71b200df0f400 100644 --- a/rust/kernel/device.rs +++ b/rust/kernel/device.rs @@ -14,7 +14,6 @@ use core::{any::TypeId, marker::PhantomData, ptr}; #[cfg(CONFIG_PRINTK)] use crate::c_str; -use crate::str::CStrExt as _; pub mod property; From dc051d314ff7dbc05b7a4b7635f5520e1096b58b Mon Sep 17 00:00:00 2001 From: Vivian Wang Date: Tue, 30 Dec 2025 21:39:17 +0800 Subject: [PATCH 0924/1321] riscv: boot: Always make Image from vmlinux, not vmlinux.unstripped Since commit 4b47a3aefb29 ("kbuild: Restore pattern to avoid stripping .rela.dyn from vmlinux") vmlinux has .rel*.dyn preserved. Therefore, use vmlinux to produce Image, not vmlinux.unstripped. Doing so fixes booting a RELOCATABLE=y Image with kexec. The problem is caused by this chain of events: - Since commit 3e86e4d74c04 ("kbuild: keep .modinfo section in vmlinux.unstripped"), vmlinux.unstripped gets a .modinfo section. - The .modinfo section has SHF_ALLOC, so it ends up in Image, at the end of it. - The Image header's image_size field does not expect to include .modinfo and does not account for it, since it should not be in Image. - If .modinfo is large enough, the file size of Image ends up larger than image_size, which eventually leads to it failing sanity_check_segment_list(). Using vmlinux instead of vmlinux.unstripped means that the unexpected .modinfo section is gone from Image, fixing the file size problem. Cc: stable@vger.kernel.org Fixes: 3e86e4d74c04 ("kbuild: keep .modinfo section in vmlinux.unstripped") Signed-off-by: Vivian Wang Reviewed-by: Nathan Chancellor Tested-by: Han Gao Link: https://patch.msgid.link/20251230-riscv-vmlinux-not-unstripped-v1-1-15f49df880df@iscas.ac.cn Signed-off-by: Paul Walmsley --- arch/riscv/boot/Makefile | 4 ---- 1 file changed, 4 deletions(-) diff --git a/arch/riscv/boot/Makefile b/arch/riscv/boot/Makefile index bfc3d0b75b9b2..5301adf5f3f5d 100644 --- a/arch/riscv/boot/Makefile +++ b/arch/riscv/boot/Makefile @@ -31,11 +31,7 @@ $(obj)/xipImage: vmlinux FORCE endif -ifdef CONFIG_RELOCATABLE -$(obj)/Image: vmlinux.unstripped FORCE -else $(obj)/Image: vmlinux FORCE -endif $(call if_changed,objcopy) $(obj)/Image.gz: $(obj)/Image FORCE From 93108ee2f09ba5bd9e538f57cc232a2d69c743a7 Mon Sep 17 00:00:00 2001 From: Lukas Gerlach Date: Thu, 18 Dec 2025 20:13:32 +0100 Subject: [PATCH 0925/1321] riscv: Sanitize syscall table indexing under speculation The syscall number is a user-controlled value used to index into the syscall table. Use array_index_nospec() to clamp this value after the bounds check to prevent speculative out-of-bounds access and subsequent data leakage via cache side channels. Signed-off-by: Lukas Gerlach Link: https://patch.msgid.link/20251218191332.35849-3-lukas.gerlach@cispa.de Signed-off-by: Paul Walmsley --- arch/riscv/kernel/traps.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c index 80230de167def..47afea4ff1a8d 100644 --- a/arch/riscv/kernel/traps.c +++ b/arch/riscv/kernel/traps.c @@ -339,8 +339,10 @@ void do_trap_ecall_u(struct pt_regs *regs) add_random_kstack_offset(); - if (syscall >= 0 && syscall < NR_syscalls) + if (syscall >= 0 && syscall < NR_syscalls) { + syscall = array_index_nospec(syscall, NR_syscalls); syscall_handler(regs, syscall); + } /* * Ultimately, this value will get limited by KSTACK_OFFSET_MAX(), From 299fc3c327a61229f95b4fc4981a7501c9d35113 Mon Sep 17 00:00:00 2001 From: Jiakai Xu Date: Fri, 26 Dec 2025 03:23:17 +0000 Subject: [PATCH 0926/1321] riscv: fix KUnit test_kprobes crash when building with Clang MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Clang misinterprets the placement of test_kprobes_addresses and test_kprobes_functions arrays when they are not explicitly assigned to a data section. This can lead to kmalloc_array() allocation errors and KUnit failures. When testing the Clang-compiled code in QEMU, this warning was emitted: WARNING: CPU: 1 PID: 3000 at mm/page_alloc.c:5159 __alloc_frozen_pages_noprof+0xe6/0x2fc mm/page_alloc.c:5159 Further investigation revealed that the test_kprobes_addresses array appeared to have over 100,000 elements, including invalid addresses; whereas, according to test-kprobes-asm.S, test_kprobes_addresses should only have 25 elements. When compiling the kernel with GCC, the kernel boots correctly. This patch fixes the issue by adding .section .rodata to explicitly place arrays in the read-only data segment. For detailed debug and analysis, see: https://github.com/j1akai/temp/blob/main/20251113/readme.md v1 -> v2: - Drop changes to .align, and .globl. Signed-off-by: Jiakai Xu Signed-off-by: Jiakai Xu Link: https://patch.msgid.link/738dd4e2.ff73.19a7cd7b4d5.Coremail.xujiakai2025@iscas.ac.cn Link: https://github.com/llvm/llvm-project/issues/168308 Link: https://patch.msgid.link/20251226032317.1523764-1-jiakaiPeanut@gmail.com Signed-off-by: Paul Walmsley --- arch/riscv/kernel/tests/kprobes/test-kprobes-asm.S | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/riscv/kernel/tests/kprobes/test-kprobes-asm.S b/arch/riscv/kernel/tests/kprobes/test-kprobes-asm.S index b951d0f124823..f16deee9e0915 100644 --- a/arch/riscv/kernel/tests/kprobes/test-kprobes-asm.S +++ b/arch/riscv/kernel/tests/kprobes/test-kprobes-asm.S @@ -181,6 +181,7 @@ SYM_FUNC_END(test_kprobes_c_bnez) #endif /* CONFIG_RISCV_ISA_C */ +.section .rodata SYM_DATA_START(test_kprobes_addresses) RISCV_PTR test_kprobes_add_addr1 RISCV_PTR test_kprobes_add_addr2 @@ -212,6 +213,7 @@ SYM_DATA_START(test_kprobes_addresses) RISCV_PTR 0 SYM_DATA_END(test_kprobes_addresses) +.section .rodata SYM_DATA_START(test_kprobes_functions) RISCV_PTR test_kprobes_add RISCV_PTR test_kprobes_jal From 223cb74d6d1ab73ddadd50449986bdea47d72cb9 Mon Sep 17 00:00:00 2001 From: Guodong Xu Date: Tue, 23 Dec 2025 10:44:27 +0800 Subject: [PATCH 0927/1321] riscv: cpufeature: Fix Zk bundled extension missing Zknh MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The Zk extension is a bundle consisting of Zkn, Zkr, and Zkt. The Zkn extension itself is a bundle consisting of Zbkb, Zbkc, Zbkx, Zknd, Zkne, and Zknh. The current implementation of riscv_zk_bundled_exts manually listed the dependencies but missed RISCV_ISA_EXT_ZKNH. Fix this by introducing a RISCV_ISA_EXT_ZKN macro that lists the Zkn components and using it in both riscv_zk_bundled_exts and riscv_zkn_bundled_exts. This adds the missing Zknh extension to Zk and reduces code duplication. Fixes: 0d8295ed975b ("riscv: add ISA extension parsing for scalar crypto") Link: https://patch.msgid.link/20231114141256.126749-4-cleger@rivosinc.com/ Signed-off-by: Guodong Xu Reviewed-by: Clément Léger Link: https://patch.msgid.link/20251223-zk-missing-zknh-v1-1-b627c990ee1a@riscstar.com Signed-off-by: Paul Walmsley --- arch/riscv/kernel/cpufeature.c | 23 +++++++++++------------ 1 file changed, 11 insertions(+), 12 deletions(-) diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c index c05b11596c190..fa591aff9d335 100644 --- a/arch/riscv/kernel/cpufeature.c +++ b/arch/riscv/kernel/cpufeature.c @@ -301,23 +301,22 @@ static const unsigned int riscv_a_exts[] = { RISCV_ISA_EXT_ZALRSC, }; +#define RISCV_ISA_EXT_ZKN \ + RISCV_ISA_EXT_ZBKB, \ + RISCV_ISA_EXT_ZBKC, \ + RISCV_ISA_EXT_ZBKX, \ + RISCV_ISA_EXT_ZKND, \ + RISCV_ISA_EXT_ZKNE, \ + RISCV_ISA_EXT_ZKNH + static const unsigned int riscv_zk_bundled_exts[] = { - RISCV_ISA_EXT_ZBKB, - RISCV_ISA_EXT_ZBKC, - RISCV_ISA_EXT_ZBKX, - RISCV_ISA_EXT_ZKND, - RISCV_ISA_EXT_ZKNE, + RISCV_ISA_EXT_ZKN, RISCV_ISA_EXT_ZKR, - RISCV_ISA_EXT_ZKT, + RISCV_ISA_EXT_ZKT }; static const unsigned int riscv_zkn_bundled_exts[] = { - RISCV_ISA_EXT_ZBKB, - RISCV_ISA_EXT_ZBKC, - RISCV_ISA_EXT_ZBKX, - RISCV_ISA_EXT_ZKND, - RISCV_ISA_EXT_ZKNE, - RISCV_ISA_EXT_ZKNH, + RISCV_ISA_EXT_ZKN }; static const unsigned int riscv_zks_bundled_exts[] = { From 6de2aa7bd23377ed3635ec5eca02b52788ab5e49 Mon Sep 17 00:00:00 2001 From: "Guo Ren (Alibaba DAMO Academy)" Date: Sun, 30 Nov 2025 19:58:50 -0500 Subject: [PATCH 0928/1321] riscv: pgtable: Cleanup useless VA_USER_XXX definitions These marcos are not used after commit b5b4287accd7 ("riscv: mm: Use hint address in mmap if available"). Cleanup VA_USER_XXX definitions in asm/pgtable.h. Fixes: b5b4287accd7 ("riscv: mm: Use hint address in mmap if available") Signed-off-by: Guo Ren (Alibaba DAMO Academy) Reviewed-by: Jinjie Ruan Link: https://patch.msgid.link/20251201005850.702569-1-guoren@kernel.org Signed-off-by: Paul Walmsley --- arch/riscv/include/asm/pgtable.h | 4 ---- 1 file changed, 4 deletions(-) diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h index 6bb1f5bdc5d26..9acd58a67123b 100644 --- a/arch/riscv/include/asm/pgtable.h +++ b/arch/riscv/include/asm/pgtable.h @@ -124,10 +124,6 @@ #ifdef CONFIG_64BIT #include -#define VA_USER_SV39 (UL(1) << (VA_BITS_SV39 - 1)) -#define VA_USER_SV48 (UL(1) << (VA_BITS_SV48 - 1)) -#define VA_USER_SV57 (UL(1) << (VA_BITS_SV57 - 1)) - #define MMAP_VA_BITS_64 ((VA_BITS >= VA_BITS_SV48) ? VA_BITS_SV48 : VA_BITS) #define MMAP_MIN_VA_BITS_64 (VA_BITS_SV39) #define MMAP_VA_BITS (is_compat_task() ? VA_BITS_SV32 : MMAP_VA_BITS_64) From dbabf06063185b8024c62e53e2df0ddd2279a8af Mon Sep 17 00:00:00 2001 From: Soham Metha Date: Thu, 4 Dec 2025 01:13:52 +0530 Subject: [PATCH 0929/1321] riscv: kexec_image: Fix dead link to boot-image-header.rst Fix the reference to 'boot-image-header.rst', which was moved to 'Documentation/arch/riscv/' in commit 'ed843ae947f8' ("docs: move riscv under arch"). Signed-off-by: Soham Metha Link: https://patch.msgid.link/20251203194355.63265-1-sohammetha01@gmail.com Signed-off-by: Paul Walmsley --- arch/riscv/kernel/kexec_image.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/riscv/kernel/kexec_image.c b/arch/riscv/kernel/kexec_image.c index 8f2eb900910b1..51dc89259f169 100644 --- a/arch/riscv/kernel/kexec_image.c +++ b/arch/riscv/kernel/kexec_image.c @@ -22,7 +22,7 @@ static int image_probe(const char *kernel_buf, unsigned long kernel_len) if (!h || kernel_len < sizeof(*h)) return -EINVAL; - /* According to Documentation/riscv/boot-image-header.rst, + /* According to Documentation/arch/riscv/boot-image-header.rst, * use "magic2" field to check when version >= 0.2. */ From 8827013233684c7a93537c994b0c9bb2302424a1 Mon Sep 17 00:00:00 2001 From: Lukas Bulwahn Date: Wed, 7 Jan 2026 10:24:25 +0100 Subject: [PATCH 0930/1321] riscv: configs: Clean up references to non-existing configs - Drop 'CONFIG_I2C_COMPAT is not set' (removed in commit 7e722083fcc3 ("i2c: Remove I2C_COMPAT config symbol and related code")) - Drop 'CONFIG_SCHED_DEBUG is not set' (removed in commit b52173065e0a ("sched/debug: Remove CONFIG_SCHED_DEBUG")) Signed-off-by: Lukas Bulwahn Link: https://patch.msgid.link/20260107092425.24737-1-lukas.bulwahn@redhat.com Signed-off-by: Paul Walmsley --- arch/riscv/configs/nommu_k210_defconfig | 2 -- arch/riscv/configs/nommu_k210_sdcard_defconfig | 1 - arch/riscv/configs/nommu_virt_defconfig | 1 - 3 files changed, 4 deletions(-) diff --git a/arch/riscv/configs/nommu_k210_defconfig b/arch/riscv/configs/nommu_k210_defconfig index ee18d1e333f20..544c52067dc29 100644 --- a/arch/riscv/configs/nommu_k210_defconfig +++ b/arch/riscv/configs/nommu_k210_defconfig @@ -55,7 +55,6 @@ CONFIG_DEVTMPFS_MOUNT=y # CONFIG_HW_RANDOM is not set # CONFIG_DEVMEM is not set CONFIG_I2C=y -# CONFIG_I2C_COMPAT is not set CONFIG_I2C_CHARDEV=y # CONFIG_I2C_HELPER_AUTO is not set CONFIG_I2C_DESIGNWARE_CORE=y @@ -89,7 +88,6 @@ CONFIG_PRINTK_TIME=y # CONFIG_FRAME_POINTER is not set # CONFIG_DEBUG_MISC is not set CONFIG_PANIC_ON_OOPS=y -# CONFIG_SCHED_DEBUG is not set # CONFIG_RCU_TRACE is not set # CONFIG_FTRACE is not set # CONFIG_RUNTIME_TESTING_MENU is not set diff --git a/arch/riscv/configs/nommu_k210_sdcard_defconfig b/arch/riscv/configs/nommu_k210_sdcard_defconfig index e770d81b738e1..4a826e30fa3ed 100644 --- a/arch/riscv/configs/nommu_k210_sdcard_defconfig +++ b/arch/riscv/configs/nommu_k210_sdcard_defconfig @@ -86,7 +86,6 @@ CONFIG_PRINTK_TIME=y # CONFIG_FRAME_POINTER is not set # CONFIG_DEBUG_MISC is not set CONFIG_PANIC_ON_OOPS=y -# CONFIG_SCHED_DEBUG is not set # CONFIG_RCU_TRACE is not set # CONFIG_FTRACE is not set # CONFIG_RUNTIME_TESTING_MENU is not set diff --git a/arch/riscv/configs/nommu_virt_defconfig b/arch/riscv/configs/nommu_virt_defconfig index 0da5069bfbef2..4c38049633b76 100644 --- a/arch/riscv/configs/nommu_virt_defconfig +++ b/arch/riscv/configs/nommu_virt_defconfig @@ -66,7 +66,6 @@ CONFIG_EXT2_FS=y # CONFIG_MISC_FILESYSTEMS is not set CONFIG_LSM="[]" CONFIG_PRINTK_TIME=y -# CONFIG_SCHED_DEBUG is not set # CONFIG_RCU_TRACE is not set # CONFIG_FTRACE is not set # CONFIG_RUNTIME_TESTING_MENU is not set From 8e35f28b359fd297ac1812b6fdde51ee8c739ab3 Mon Sep 17 00:00:00 2001 From: Ben Dooks Date: Fri, 2 Jan 2026 14:58:39 +0000 Subject: [PATCH 0931/1321] riscv: cpu_ops_sbi: smp_processor_id() returns int, not unsigned int The print in sbi_cpu_stop() assumes smp_processor_id() returns an unsigned int, when it is actually an int. Fix the format string to avoid mismatch type warnings in rht pr_crit(). Signed-off-by: Ben Dooks Link: https://patch.msgid.link/20260102145839.657864-1-ben.dooks@codethink.co.uk Signed-off-by: Paul Walmsley --- arch/riscv/kernel/cpu_ops_sbi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/riscv/kernel/cpu_ops_sbi.c b/arch/riscv/kernel/cpu_ops_sbi.c index 87d6559448039..00aff669f5f2f 100644 --- a/arch/riscv/kernel/cpu_ops_sbi.c +++ b/arch/riscv/kernel/cpu_ops_sbi.c @@ -85,7 +85,7 @@ static void sbi_cpu_stop(void) int ret; ret = sbi_hsm_hart_stop(); - pr_crit("Unable to stop the cpu %u (%d)\n", smp_processor_id(), ret); + pr_crit("Unable to stop the cpu %d (%d)\n", smp_processor_id(), ret); } static int sbi_cpu_is_stopped(unsigned int cpuid) From fc26acdfcd6bed99e5b7833e269998c8a231983a Mon Sep 17 00:00:00 2001 From: Yunhui Cui Date: Tue, 16 Dec 2025 09:47:19 +0800 Subject: [PATCH 0932/1321] riscv: remove irqflags.h inclusion in asm/bitops.h The arch/riscv/include/asm/bitops.h does not functionally require including /linux/irqflags.h. Additionally, adding arch/riscv/include/asm/percpu.h causes a circular inclusion: kernel/bounds.c ->include/linux/log2.h ->include/linux/bitops.h ->arch/riscv/include/asm/bitops.h ->include/linux/irqflags.h ->include/linux/find.h ->return val ? __ffs(val) : size; ->arch/riscv/include/asm/bitops.h The compilation log is as follows: CC kernel/bounds.s In file included from ./include/linux/bitmap.h:11, from ./include/linux/cpumask.h:12, from ./arch/riscv/include/asm/processor.h:55, from ./arch/riscv/include/asm/thread_info.h:42, from ./include/linux/thread_info.h:60, from ./include/asm-generic/preempt.h:5, from ./arch/riscv/include/generated/asm/preempt.h:1, from ./include/linux/preempt.h:79, from ./arch/riscv/include/asm/percpu.h:8, from ./include/linux/irqflags.h:19, from ./arch/riscv/include/asm/bitops.h:14, from ./include/linux/bitops.h:68, from ./include/linux/log2.h:12, from kernel/bounds.c:13: ./include/linux/find.h: In function 'find_next_bit': ./include/linux/find.h:66:30: error: implicit declaration of function '__ffs' [-Wimplicit-function-declaration] 66 | return val ? __ffs(val) : size; | ^~~~~ Signed-off-by: Yunhui Cui Acked-by: Yury Norov (NVIDIA) Link: https://patch.msgid.link/20251216014721.42262-2-cuiyunhui@bytedance.com Signed-off-by: Paul Walmsley --- arch/riscv/include/asm/bitops.h | 1 - 1 file changed, 1 deletion(-) diff --git a/arch/riscv/include/asm/bitops.h b/arch/riscv/include/asm/bitops.h index 238092125c118..3c1a15be54d80 100644 --- a/arch/riscv/include/asm/bitops.h +++ b/arch/riscv/include/asm/bitops.h @@ -11,7 +11,6 @@ #endif /* _LINUX_BITOPS_H */ #include -#include #include #include From 1dda8ce229c24378a6b9b62325bf268de054d074 Mon Sep 17 00:00:00 2001 From: Martin Kaiser Date: Tue, 23 Dec 2025 14:50:06 +0100 Subject: [PATCH 0933/1321] riscv: trace: fix snapshot deadlock with sbi ecall If sbi_ecall.c's functions are traceable, echo "__sbi_ecall:snapshot" > /sys/kernel/tracing/set_ftrace_filter may get the kernel into a deadlock. (Functions in sbi_ecall.c are excluded from tracing if CONFIG_RISCV_ALTERNATIVE_EARLY is set.) __sbi_ecall triggers a snapshot of the ringbuffer. The snapshot code raises an IPI interrupt, which results in another call to __sbi_ecall and another snapshot... All it takes to get into this endless loop is one initial __sbi_ecall. On RISC-V systems without SSTC extension, the clock events in timer-riscv.c issue periodic sbi ecalls, making the problem easy to trigger. Always exclude the sbi_ecall.c functions from tracing to fix the potential deadlock. sbi ecalls can easiliy be logged via trace events, excluding ecall functions from function tracing is not a big limitation. Signed-off-by: Martin Kaiser Link: https://patch.msgid.link/20251223135043.1336524-1-martin@kaiser.cx Signed-off-by: Paul Walmsley --- arch/riscv/kernel/Makefile | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile index f60fce69b7259..a01f6439d62b1 100644 --- a/arch/riscv/kernel/Makefile +++ b/arch/riscv/kernel/Makefile @@ -3,12 +3,6 @@ # Makefile for the RISC-V Linux kernel # -ifdef CONFIG_FTRACE -CFLAGS_REMOVE_ftrace.o = $(CC_FLAGS_FTRACE) -CFLAGS_REMOVE_patch.o = $(CC_FLAGS_FTRACE) -CFLAGS_REMOVE_sbi.o = $(CC_FLAGS_FTRACE) -CFLAGS_REMOVE_return_address.o = $(CC_FLAGS_FTRACE) -endif CFLAGS_syscall_table.o += $(call cc-disable-warning, override-init) CFLAGS_compat_syscall_table.o += $(call cc-disable-warning, override-init) @@ -24,7 +18,6 @@ CFLAGS_sbi_ecall.o := -mcmodel=medany ifdef CONFIG_FTRACE CFLAGS_REMOVE_alternative.o = $(CC_FLAGS_FTRACE) CFLAGS_REMOVE_cpufeature.o = $(CC_FLAGS_FTRACE) -CFLAGS_REMOVE_sbi_ecall.o = $(CC_FLAGS_FTRACE) endif ifdef CONFIG_RELOCATABLE CFLAGS_alternative.o += -fno-pie @@ -43,6 +36,14 @@ CFLAGS_sbi_ecall.o += -D__NO_FORTIFY endif endif +ifdef CONFIG_FTRACE +CFLAGS_REMOVE_ftrace.o = $(CC_FLAGS_FTRACE) +CFLAGS_REMOVE_patch.o = $(CC_FLAGS_FTRACE) +CFLAGS_REMOVE_sbi.o = $(CC_FLAGS_FTRACE) +CFLAGS_REMOVE_return_address.o = $(CC_FLAGS_FTRACE) +CFLAGS_REMOVE_sbi_ecall.o = $(CC_FLAGS_FTRACE) +endif + always-$(KBUILD_BUILTIN) += vmlinux.lds obj-y += head.o From 265ef3d58ba405dd54eae0970d266acf789ae988 Mon Sep 17 00:00:00 2001 From: Thomas Gleixner Date: Sun, 11 Jan 2026 16:53:48 +0100 Subject: [PATCH 0934/1321] treewide: Update email address In a vain attempt to consolidate the email zoo switch everything to the kernel.org account. Signed-off-by: Thomas Gleixner Signed-off-by: Linus Torvalds --- .mailmap | 1 + CREDITS | 2 +- .../ABI/stable/sysfs-kernel-time-aux-clocks | 2 +- Documentation/arch/x86/topology.rst | 2 +- Documentation/core-api/cpu_hotplug.rst | 2 +- Documentation/core-api/genericirq.rst | 2 +- Documentation/core-api/librs.rst | 2 +- .../bindings/timer/mrvl,mmp-timer.yaml | 2 +- Documentation/driver-api/mtdnand.rst | 4 +-- .../zh_CN/core-api/cpu_hotplug.rst | 2 +- .../zh_CN/core-api/genericirq.rst | 2 +- MAINTAINERS | 36 +++++++++---------- arch/sh/kernel/perf_event.c | 2 +- arch/sparc/kernel/perf_event.c | 2 +- arch/x86/events/core.c | 2 +- arch/x86/events/perf_event.h | 2 +- arch/x86/kernel/x86_init.c | 2 +- arch/x86/mm/pti.c | 2 +- drivers/mtd/nand/ecc-sw-hamming.c | 2 +- drivers/mtd/nand/raw/diskonchip.c | 2 +- drivers/mtd/nand/raw/nand_base.c | 4 +-- drivers/mtd/nand/raw/nand_bbt.c | 2 +- drivers/mtd/nand/raw/nand_ids.c | 2 +- drivers/mtd/nand/raw/nand_jedec.c | 2 +- drivers/mtd/nand/raw/nand_legacy.c | 2 +- drivers/mtd/nand/raw/nand_onfi.c | 2 +- drivers/mtd/nand/raw/ndfc.c | 2 +- drivers/uio/uio.c | 2 +- fs/jffs2/wbuf.c | 4 +-- include/linux/hrtimer.h | 2 +- include/linux/ktime.h | 2 +- include/linux/mtd/jedec.h | 2 +- include/linux/mtd/nand-ecc-sw-hamming.h | 2 +- include/linux/mtd/ndfc.h | 2 +- include/linux/mtd/onfi.h | 2 +- include/linux/mtd/platnand.h | 2 +- include/linux/mtd/rawnand.h | 2 +- include/linux/perf_event.h | 2 +- include/linux/plist.h | 2 +- include/linux/rslib.h | 2 +- include/linux/uio_driver.h | 2 +- include/uapi/linux/perf_event.h | 2 +- kernel/events/callchain.c | 2 +- kernel/events/core.c | 2 +- kernel/events/ring_buffer.c | 2 +- kernel/irq/debugfs.c | 2 +- kernel/irq/matrix.c | 2 +- kernel/sched/fair.c | 2 +- kernel/sched/pelt.c | 2 +- kernel/time/clockevents.c | 2 +- kernel/time/hrtimer.c | 2 +- kernel/time/tick-broadcast.c | 2 +- kernel/time/tick-common.c | 2 +- kernel/time/tick-oneshot.c | 2 +- kernel/time/tick-sched.c | 2 +- lib/debugobjects.c | 2 +- lib/plist.c | 2 +- lib/reed_solomon/decode_rs.c | 2 +- lib/reed_solomon/encode_rs.c | 2 +- lib/reed_solomon/reed_solomon.c | 2 +- scripts/spdxcheck.py | 2 +- tools/include/uapi/linux/perf_event.h | 2 +- tools/perf/builtin-list.c | 2 +- 63 files changed, 83 insertions(+), 82 deletions(-) diff --git a/.mailmap b/.mailmap index b23e0853d636c..fa018b5bd5339 100644 --- a/.mailmap +++ b/.mailmap @@ -801,6 +801,7 @@ Tanzir Hasan Tejun Heo Tomeu Vizoso Thomas Graf +Thomas Gleixner Thomas Körper Thomas Pedersen Thorsten Blum diff --git a/CREDITS b/CREDITS index ca75f110edb6f..383809bc4b7ac 100644 --- a/CREDITS +++ b/CREDITS @@ -1398,7 +1398,7 @@ D: SRM environment driver (for Alpha systems) P: 1024D/8399E1BB 250D 3BCF 7127 0D8C A444 A961 1DBD 5E75 8399 E1BB N: Thomas Gleixner -E: tglx@linutronix.de +E: tglx@kernel.org D: NAND flash hardware support, JFFS2 on NAND flash N: Jérôme Glisse diff --git a/Documentation/ABI/stable/sysfs-kernel-time-aux-clocks b/Documentation/ABI/stable/sysfs-kernel-time-aux-clocks index 825508f42af60..e1a894c8dd1bd 100644 --- a/Documentation/ABI/stable/sysfs-kernel-time-aux-clocks +++ b/Documentation/ABI/stable/sysfs-kernel-time-aux-clocks @@ -1,5 +1,5 @@ What: /sys/kernel/time/aux_clocks//enable Date: May 2025 -Contact: Thomas Gleixner +Contact: Thomas Gleixner Description: Controls the enablement of auxiliary clock timekeepers. diff --git a/Documentation/arch/x86/topology.rst b/Documentation/arch/x86/topology.rst index 86bec8ac2c4de..f779a68875c54 100644 --- a/Documentation/arch/x86/topology.rst +++ b/Documentation/arch/x86/topology.rst @@ -17,7 +17,7 @@ with the generic one and look at this one in parallel for the x86 specifics. Needless to say, code should use the generic functions - this file is *only* here to *document* the inner workings of x86 topology. -Started by Thomas Gleixner and Borislav Petkov . +Started by Thomas Gleixner and Borislav Petkov . The main aim of the topology facilities is to present adequate interfaces to code which needs to know/query/use the structure of the running system wrt diff --git a/Documentation/core-api/cpu_hotplug.rst b/Documentation/core-api/cpu_hotplug.rst index e1b0eeabbb5e5..9b4afca9fd09c 100644 --- a/Documentation/core-api/cpu_hotplug.rst +++ b/Documentation/core-api/cpu_hotplug.rst @@ -8,7 +8,7 @@ CPU hotplug in the Kernel Srivatsa Vaddagiri , Ashok Raj , Joel Schopp , - Thomas Gleixner + Thomas Gleixner Introduction ============ diff --git a/Documentation/core-api/genericirq.rst b/Documentation/core-api/genericirq.rst index 582bde9bf5a92..b16d751d4b98f 100644 --- a/Documentation/core-api/genericirq.rst +++ b/Documentation/core-api/genericirq.rst @@ -439,6 +439,6 @@ Credits The following people have contributed to this document: -1. Thomas Gleixner tglx@linutronix.de +1. Thomas Gleixner tglx@kernel.org 2. Ingo Molnar mingo@elte.hu diff --git a/Documentation/core-api/librs.rst b/Documentation/core-api/librs.rst index 6010f5bc5bf91..0d88893dbc03c 100644 --- a/Documentation/core-api/librs.rst +++ b/Documentation/core-api/librs.rst @@ -209,4 +209,4 @@ testing. Thanks a lot. The following people have contributed to this document: -Thomas Gleixner\ tglx@linutronix.de +Thomas Gleixner\ tglx@kernel.org diff --git a/Documentation/devicetree/bindings/timer/mrvl,mmp-timer.yaml b/Documentation/devicetree/bindings/timer/mrvl,mmp-timer.yaml index fe6bc4173789d..0643cfcc6bc73 100644 --- a/Documentation/devicetree/bindings/timer/mrvl,mmp-timer.yaml +++ b/Documentation/devicetree/bindings/timer/mrvl,mmp-timer.yaml @@ -8,7 +8,7 @@ title: Marvell MMP Timer maintainers: - Daniel Lezcano - - Thomas Gleixner + - Thomas Gleixner - Rob Herring properties: diff --git a/Documentation/driver-api/mtdnand.rst b/Documentation/driver-api/mtdnand.rst index ce77e024c4f1c..adf03983f1ba8 100644 --- a/Documentation/driver-api/mtdnand.rst +++ b/Documentation/driver-api/mtdnand.rst @@ -996,11 +996,11 @@ The following people have contributed to the NAND driver: 2. David Woodhouse\ dwmw2@infradead.org -3. Thomas Gleixner\ tglx@linutronix.de +3. Thomas Gleixner\ tglx@kernel.org A lot of users have provided bugfixes, improvements and helping hands for testing. Thanks a lot. The following people have contributed to this document: -1. Thomas Gleixner\ tglx@linutronix.de +1. Thomas Gleixner\ tglx@kernel.org diff --git a/Documentation/translations/zh_CN/core-api/cpu_hotplug.rst b/Documentation/translations/zh_CN/core-api/cpu_hotplug.rst index bc0d7ea6d834c..3447fbf0e6954 100644 --- a/Documentation/translations/zh_CN/core-api/cpu_hotplug.rst +++ b/Documentation/translations/zh_CN/core-api/cpu_hotplug.rst @@ -22,7 +22,7 @@ Srivatsa Vaddagiri , Ashok Raj , Joel Schopp , - Thomas Gleixner + Thomas Gleixner 简介 ==== diff --git a/Documentation/translations/zh_CN/core-api/genericirq.rst b/Documentation/translations/zh_CN/core-api/genericirq.rst index 05ccb954c18d0..d2c1bd94bb970 100644 --- a/Documentation/translations/zh_CN/core-api/genericirq.rst +++ b/Documentation/translations/zh_CN/core-api/genericirq.rst @@ -404,6 +404,6 @@ kernel/irq/chip.c 感谢以下人士对本文档作出的贡献: -1. Thomas Gleixner tglx@linutronix.de +1. Thomas Gleixner tglx@kernel.org 2. Ingo Molnar mingo@elte.hu diff --git a/MAINTAINERS b/MAINTAINERS index 32b5e41d98493..ee036e0a3ef6c 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -6175,7 +6175,7 @@ F: include/linux/clk.h CLOCKSOURCE, CLOCKEVENT DRIVERS M: Daniel Lezcano -M: Thomas Gleixner +M: Thomas Gleixner L: linux-kernel@vger.kernel.org S: Supported T: git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git timers/core @@ -6541,7 +6541,7 @@ S: Maintained F: drivers/cpufreq/virtual-cpufreq.c CPU HOTPLUG -M: Thomas Gleixner +M: Thomas Gleixner M: Peter Zijlstra L: linux-kernel@vger.kernel.org S: Maintained @@ -6968,7 +6968,7 @@ F: Documentation/scsi/dc395x.rst F: drivers/scsi/dc395x.* DEBUGOBJECTS: -M: Thomas Gleixner +M: Thomas Gleixner L: linux-kernel@vger.kernel.org S: Maintained T: git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git core/debugobjects @@ -10371,7 +10371,7 @@ F: include/uapi/linux/fuse.h F: tools/testing/selftests/filesystems/fuse/ FUTEX SUBSYSTEM -M: Thomas Gleixner +M: Thomas Gleixner M: Ingo Molnar R: Peter Zijlstra R: Darren Hart @@ -10515,7 +10515,7 @@ F: drivers/base/arch_topology.c F: include/linux/arch_topology.h GENERIC ENTRY CODE -M: Thomas Gleixner +M: Thomas Gleixner M: Peter Zijlstra M: Andy Lutomirski L: linux-kernel@vger.kernel.org @@ -10628,7 +10628,7 @@ F: drivers/uio/uio_pci_generic.c GENERIC VDSO LIBRARY M: Andy Lutomirski -M: Thomas Gleixner +M: Thomas Gleixner M: Vincenzo Frascino L: linux-kernel@vger.kernel.org S: Maintained @@ -11241,7 +11241,7 @@ F: drivers/hid/hid-logitech-hidpp.c HIGH-RESOLUTION TIMERS, TIMER WHEEL, CLOCKEVENTS M: Anna-Maria Behnsen M: Frederic Weisbecker -M: Thomas Gleixner +M: Thomas Gleixner L: linux-kernel@vger.kernel.org S: Maintained T: git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git timers/core @@ -11264,7 +11264,7 @@ R: Boqun Feng R: FUJITA Tomonori R: Frederic Weisbecker R: Lyude Paul -R: Thomas Gleixner +R: Thomas Gleixner R: Anna-Maria Behnsen R: John Stultz R: Stephen Boyd @@ -13334,7 +13334,7 @@ F: Documentation/devicetree/bindings/sound/irondevice,* F: sound/soc/codecs/sma* IRQ DOMAINS (IRQ NUMBER MAPPING LIBRARY) -M: Thomas Gleixner +M: Thomas Gleixner S: Maintained T: git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git irq/core F: Documentation/core-api/irq/irq-domain.rst @@ -13344,7 +13344,7 @@ F: kernel/irq/irqdomain.c F: kernel/irq/msi.c IRQ SUBSYSTEM -M: Thomas Gleixner +M: Thomas Gleixner L: linux-kernel@vger.kernel.org S: Maintained T: git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git irq/core @@ -13357,7 +13357,7 @@ F: kernel/irq/ F: lib/group_cpus.c IRQCHIP DRIVERS -M: Thomas Gleixner +M: Thomas Gleixner L: linux-kernel@vger.kernel.org S: Maintained T: git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git irq/core @@ -14451,7 +14451,7 @@ T: git git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-nonmm-unstab F: lib/* LICENSES and SPDX stuff -M: Thomas Gleixner +M: Thomas Gleixner M: Greg Kroah-Hartman L: linux-spdx@vger.kernel.org S: Maintained @@ -18576,7 +18576,7 @@ NOHZ, DYNTICKS SUPPORT M: Anna-Maria Behnsen M: Frederic Weisbecker M: Ingo Molnar -M: Thomas Gleixner +M: Thomas Gleixner L: linux-kernel@vger.kernel.org S: Maintained T: git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git timers/nohz @@ -20761,7 +20761,7 @@ F: drivers/platform/x86/portwell-ec.c POSIX CLOCKS and TIMERS M: Anna-Maria Behnsen M: Frederic Weisbecker -M: Thomas Gleixner +M: Thomas Gleixner L: linux-kernel@vger.kernel.org S: Maintained T: git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git timers/core @@ -26272,7 +26272,7 @@ F: drivers/net/wireless/ti/ TIMEKEEPING, CLOCKSOURCE CORE, NTP, ALARMTIMER M: John Stultz -M: Thomas Gleixner +M: Thomas Gleixner R: Stephen Boyd L: linux-kernel@vger.kernel.org S: Supported @@ -28203,7 +28203,7 @@ F: net/lapb/ F: net/x25/ X86 ARCHITECTURE (32-BIT AND 64-BIT) -M: Thomas Gleixner +M: Thomas Gleixner M: Ingo Molnar M: Borislav Petkov M: Dave Hansen @@ -28219,7 +28219,7 @@ F: tools/testing/selftests/x86 X86 CPUID DATABASE M: Borislav Petkov -M: Thomas Gleixner +M: Thomas Gleixner M: x86@kernel.org R: Ahmed S. Darwish L: x86-cpuid@lists.linux.dev @@ -28235,7 +28235,7 @@ T: git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/asm F: arch/x86/entry/ X86 HARDWARE VULNERABILITIES -M: Thomas Gleixner +M: Thomas Gleixner M: Borislav Petkov M: Peter Zijlstra M: Josh Poimboeuf diff --git a/arch/sh/kernel/perf_event.c b/arch/sh/kernel/perf_event.c index 1d2507f224372..1fbb7d46e4840 100644 --- a/arch/sh/kernel/perf_event.c +++ b/arch/sh/kernel/perf_event.c @@ -7,7 +7,7 @@ * Heavily based on the x86 and PowerPC implementations. * * x86: - * Copyright (C) 2008 Thomas Gleixner + * Copyright (C) 2008 Linutronix GmbH, Thomas Gleixner * Copyright (C) 2008-2009 Red Hat, Inc., Ingo Molnar * Copyright (C) 2009 Jaswinder Singh Rajput * Copyright (C) 2009 Advanced Micro Devices, Inc., Robert Richter diff --git a/arch/sparc/kernel/perf_event.c b/arch/sparc/kernel/perf_event.c index cae4d33002a54..0ce4ae343531e 100644 --- a/arch/sparc/kernel/perf_event.c +++ b/arch/sparc/kernel/perf_event.c @@ -6,7 +6,7 @@ * This code is based almost entirely upon the x86 perf event * code, which is: * - * Copyright (C) 2008 Thomas Gleixner + * Copyright (C) 2008 Linutronix GmbH, Thomas Gleixner * Copyright (C) 2008-2009 Red Hat, Inc., Ingo Molnar * Copyright (C) 2009 Jaswinder Singh Rajput * Copyright (C) 2009 Advanced Micro Devices, Inc., Robert Richter diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c index 0c38a31d5fc72..576baa9a52c5b 100644 --- a/arch/x86/events/core.c +++ b/arch/x86/events/core.c @@ -1,7 +1,7 @@ /* * Performance events x86 architecture code * - * Copyright (C) 2008 Thomas Gleixner + * Copyright (C) 2008 Linutronix GmbH, Thomas Gleixner * Copyright (C) 2008-2009 Red Hat, Inc., Ingo Molnar * Copyright (C) 2009 Jaswinder Singh Rajput * Copyright (C) 2009 Advanced Micro Devices, Inc., Robert Richter diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h index 3161ec0a3416d..62963022b517f 100644 --- a/arch/x86/events/perf_event.h +++ b/arch/x86/events/perf_event.h @@ -1,7 +1,7 @@ /* * Performance events x86 architecture header * - * Copyright (C) 2008 Thomas Gleixner + * Copyright (C) 2008 Linutronix GmbH, Thomas Gleixner * Copyright (C) 2008-2009 Red Hat, Inc., Ingo Molnar * Copyright (C) 2009 Jaswinder Singh Rajput * Copyright (C) 2009 Advanced Micro Devices, Inc., Robert Richter diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c index 0a2bbd674a6d9..ebefb77c37bbd 100644 --- a/arch/x86/kernel/x86_init.c +++ b/arch/x86/kernel/x86_init.c @@ -1,5 +1,5 @@ /* - * Copyright (C) 2009 Thomas Gleixner + * Copyright (C) 2009 Linutronix GmbH, Thomas Gleixner * * For licencing details see kernel-base/COPYING */ diff --git a/arch/x86/mm/pti.c b/arch/x86/mm/pti.c index b10d4d131dce3..f7546e9e8e896 100644 --- a/arch/x86/mm/pti.c +++ b/arch/x86/mm/pti.c @@ -15,7 +15,7 @@ * Signed-off-by: Michael Schwarz * * Major changes to the original code by: Dave Hansen - * Mostly rewritten by Thomas Gleixner and + * Mostly rewritten by Thomas Gleixner and * Andy Lutomirsky */ #include diff --git a/drivers/mtd/nand/ecc-sw-hamming.c b/drivers/mtd/nand/ecc-sw-hamming.c index f2d0effad9d29..bc62a71f9fdd4 100644 --- a/drivers/mtd/nand/ecc-sw-hamming.c +++ b/drivers/mtd/nand/ecc-sw-hamming.c @@ -8,7 +8,7 @@ * * Completely replaces the previous ECC implementation which was written by: * Steven J. Hill (sjhill@realitydiluted.com) - * Thomas Gleixner (tglx@linutronix.de) + * Thomas Gleixner (tglx@kernel.org) * * Information on how this algorithm works and how it was developed * can be found in Documentation/driver-api/mtd/nand_ecc.rst diff --git a/drivers/mtd/nand/raw/diskonchip.c b/drivers/mtd/nand/raw/diskonchip.c index 70d6c2250f32c..540b6baf8bb16 100644 --- a/drivers/mtd/nand/raw/diskonchip.c +++ b/drivers/mtd/nand/raw/diskonchip.c @@ -11,7 +11,7 @@ * Error correction code lifted from the old docecc code * Author: Fabrice Bellard (fabrice.bellard@netgem.com) * Copyright (C) 2000 Netgem S.A. - * converted to the generic Reed-Solomon library by Thomas Gleixner + * converted to the generic Reed-Solomon library by Thomas Gleixner * * Interface to generic NAND code for M-Systems DiskOnChip devices */ diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c index ad6d66309597b..f2322de93ab41 100644 --- a/drivers/mtd/nand/raw/nand_base.c +++ b/drivers/mtd/nand/raw/nand_base.c @@ -8,7 +8,7 @@ * http://www.linux-mtd.infradead.org/doc/nand.html * * Copyright (C) 2000 Steven J. Hill (sjhill@realitydiluted.com) - * 2002-2006 Thomas Gleixner (tglx@linutronix.de) + * 2002-2006 Thomas Gleixner (tglx@kernel.org) * * Credits: * David Woodhouse for adding multichip support @@ -6594,5 +6594,5 @@ EXPORT_SYMBOL_GPL(nand_cleanup); MODULE_LICENSE("GPL"); MODULE_AUTHOR("Steven J. Hill "); -MODULE_AUTHOR("Thomas Gleixner "); +MODULE_AUTHOR("Thomas Gleixner "); MODULE_DESCRIPTION("Generic NAND flash driver code"); diff --git a/drivers/mtd/nand/raw/nand_bbt.c b/drivers/mtd/nand/raw/nand_bbt.c index a8fba5f39f591..3050ab7e6eb63 100644 --- a/drivers/mtd/nand/raw/nand_bbt.c +++ b/drivers/mtd/nand/raw/nand_bbt.c @@ -3,7 +3,7 @@ * Overview: * Bad block table support for the NAND driver * - * Copyright © 2004 Thomas Gleixner (tglx@linutronix.de) + * Copyright © 2004 Thomas Gleixner (tglx@kernel.org) * * Description: * diff --git a/drivers/mtd/nand/raw/nand_ids.c b/drivers/mtd/nand/raw/nand_ids.c index 650351c62af63..62a8cf86d9e21 100644 --- a/drivers/mtd/nand/raw/nand_ids.c +++ b/drivers/mtd/nand/raw/nand_ids.c @@ -1,6 +1,6 @@ // SPDX-License-Identifier: GPL-2.0-only /* - * Copyright (C) 2002 Thomas Gleixner (tglx@linutronix.de) + * Copyright (C) 2002 Thomas Gleixner (tglx@kernel.org) */ #include diff --git a/drivers/mtd/nand/raw/nand_jedec.c b/drivers/mtd/nand/raw/nand_jedec.c index b3cc8f3605291..89e6dd8ed1a8e 100644 --- a/drivers/mtd/nand/raw/nand_jedec.c +++ b/drivers/mtd/nand/raw/nand_jedec.c @@ -1,7 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 /* * Copyright (C) 2000 Steven J. Hill (sjhill@realitydiluted.com) - * 2002-2006 Thomas Gleixner (tglx@linutronix.de) + * 2002-2006 Thomas Gleixner (tglx@kernel.org) * * Credits: * David Woodhouse for adding multichip support diff --git a/drivers/mtd/nand/raw/nand_legacy.c b/drivers/mtd/nand/raw/nand_legacy.c index 743792edf98d6..97700f80d5b8f 100644 --- a/drivers/mtd/nand/raw/nand_legacy.c +++ b/drivers/mtd/nand/raw/nand_legacy.c @@ -1,7 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 /* * Copyright (C) 2000 Steven J. Hill (sjhill@realitydiluted.com) - * 2002-2006 Thomas Gleixner (tglx@linutronix.de) + * 2002-2006 Thomas Gleixner (tglx@kernel.org) * * Credits: * David Woodhouse for adding multichip support diff --git a/drivers/mtd/nand/raw/nand_onfi.c b/drivers/mtd/nand/raw/nand_onfi.c index 861975e44b552..11954440e4de6 100644 --- a/drivers/mtd/nand/raw/nand_onfi.c +++ b/drivers/mtd/nand/raw/nand_onfi.c @@ -1,7 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 /* * Copyright (C) 2000 Steven J. Hill (sjhill@realitydiluted.com) - * 2002-2006 Thomas Gleixner (tglx@linutronix.de) + * 2002-2006 Thomas Gleixner (tglx@kernel.org) * * Credits: * David Woodhouse for adding multichip support diff --git a/drivers/mtd/nand/raw/ndfc.c b/drivers/mtd/nand/raw/ndfc.c index 13365128194d0..7ad8bc04be1a5 100644 --- a/drivers/mtd/nand/raw/ndfc.c +++ b/drivers/mtd/nand/raw/ndfc.c @@ -272,5 +272,5 @@ static struct platform_driver ndfc_driver = { module_platform_driver(ndfc_driver); MODULE_LICENSE("GPL"); -MODULE_AUTHOR("Thomas Gleixner "); +MODULE_AUTHOR("Thomas Gleixner "); MODULE_DESCRIPTION("OF Platform driver for NDFC"); diff --git a/drivers/uio/uio.c b/drivers/uio/uio.c index d93ed4e86a174..fa0d4e6aee16d 100644 --- a/drivers/uio/uio.c +++ b/drivers/uio/uio.c @@ -3,7 +3,7 @@ * drivers/uio/uio.c * * Copyright(C) 2005, Benedikt Spranger - * Copyright(C) 2005, Thomas Gleixner + * Copyright(C) 2005, Linutronix GmbH, Thomas Gleixner * Copyright(C) 2006, Hans J. Koch * Copyright(C) 2006, Greg Kroah-Hartman * diff --git a/fs/jffs2/wbuf.c b/fs/jffs2/wbuf.c index bb815a0029842..3ab3f0ff7ebb3 100644 --- a/fs/jffs2/wbuf.c +++ b/fs/jffs2/wbuf.c @@ -2,10 +2,10 @@ * JFFS2 -- Journalling Flash File System, Version 2. * * Copyright © 2001-2007 Red Hat, Inc. - * Copyright © 2004 Thomas Gleixner + * Copyright © 2004 Thomas Gleixner * * Created by David Woodhouse - * Modified debugged and enhanced by Thomas Gleixner + * Modified debugged and enhanced by Thomas Gleixner * * For licensing information, see the file 'LICENCE' in this directory. * diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h index 2cf1bf65b2257..0de12f14d6a46 100644 --- a/include/linux/hrtimer.h +++ b/include/linux/hrtimer.h @@ -2,7 +2,7 @@ /* * hrtimers - High-resolution kernel timers * - * Copyright(C) 2005, Thomas Gleixner + * Copyright(C) 2005, Linutronix GmbH, Thomas Gleixner * Copyright(C) 2005, Red Hat, Inc., Ingo Molnar * * data type definitions, declarations, prototypes diff --git a/include/linux/ktime.h b/include/linux/ktime.h index 383ed99858024..f247e564602fc 100644 --- a/include/linux/ktime.h +++ b/include/linux/ktime.h @@ -3,7 +3,7 @@ * * ktime_t - nanosecond-resolution time format. * - * Copyright(C) 2005, Thomas Gleixner + * Copyright(C) 2005, Linutronix GmbH, Thomas Gleixner * Copyright(C) 2005, Red Hat, Inc., Ingo Molnar * * data type definitions, declarations, prototypes and macros. diff --git a/include/linux/mtd/jedec.h b/include/linux/mtd/jedec.h index 56047a4e54c9c..255972f3d88dc 100644 --- a/include/linux/mtd/jedec.h +++ b/include/linux/mtd/jedec.h @@ -2,7 +2,7 @@ /* * Copyright © 2000-2010 David Woodhouse * Steven J. Hill - * Thomas Gleixner + * Thomas Gleixner * * Contains all JEDEC related definitions */ diff --git a/include/linux/mtd/nand-ecc-sw-hamming.h b/include/linux/mtd/nand-ecc-sw-hamming.h index c6c71894c575c..2aa2f8ef68d2f 100644 --- a/include/linux/mtd/nand-ecc-sw-hamming.h +++ b/include/linux/mtd/nand-ecc-sw-hamming.h @@ -2,7 +2,7 @@ /* * Copyright (C) 2000-2010 Steven J. Hill * David Woodhouse - * Thomas Gleixner + * Thomas Gleixner * * This file is the header for the NAND Hamming ECC implementation. */ diff --git a/include/linux/mtd/ndfc.h b/include/linux/mtd/ndfc.h index 98f075b869316..622891191e9c4 100644 --- a/include/linux/mtd/ndfc.h +++ b/include/linux/mtd/ndfc.h @@ -1,6 +1,6 @@ /* SPDX-License-Identifier: GPL-2.0-only */ /* - * Copyright (c) 2006 Thomas Gleixner + * Copyright (c) 2006 Linutronix GmbH, Thomas Gleixner * * Info: * Contains defines, datastructures for ndfc nand controller diff --git a/include/linux/mtd/onfi.h b/include/linux/mtd/onfi.h index 55ab2e4d62f94..09a5cbd8f2321 100644 --- a/include/linux/mtd/onfi.h +++ b/include/linux/mtd/onfi.h @@ -2,7 +2,7 @@ /* * Copyright © 2000-2010 David Woodhouse * Steven J. Hill - * Thomas Gleixner + * Thomas Gleixner * * Contains all ONFI related definitions */ diff --git a/include/linux/mtd/platnand.h b/include/linux/mtd/platnand.h index bc11eb6b593bb..2df6fba699f2d 100644 --- a/include/linux/mtd/platnand.h +++ b/include/linux/mtd/platnand.h @@ -2,7 +2,7 @@ /* * Copyright © 2000-2010 David Woodhouse * Steven J. Hill - * Thomas Gleixner + * Thomas Gleixner * * Contains all platform NAND related definitions. */ diff --git a/include/linux/mtd/rawnand.h b/include/linux/mtd/rawnand.h index d30bdc3fcfd72..5c70e7bd3ed5a 100644 --- a/include/linux/mtd/rawnand.h +++ b/include/linux/mtd/rawnand.h @@ -2,7 +2,7 @@ /* * Copyright © 2000-2010 David Woodhouse * Steven J. Hill - * Thomas Gleixner + * Thomas Gleixner * * Info: * Contains standard defines and IDs for NAND flash devices diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h index 9870d768db4cc..9ded2e582c602 100644 --- a/include/linux/perf_event.h +++ b/include/linux/perf_event.h @@ -1,7 +1,7 @@ /* * Performance events: * - * Copyright (C) 2008-2009, Thomas Gleixner + * Copyright (C) 2008-2009, Linutronix GmbH, Thomas Gleixner * Copyright (C) 2008-2011, Red Hat, Inc., Ingo Molnar * Copyright (C) 2008-2011, Red Hat, Inc., Peter Zijlstra * diff --git a/include/linux/plist.h b/include/linux/plist.h index 8c1c8adf7fe94..16cf4355b5c13 100644 --- a/include/linux/plist.h +++ b/include/linux/plist.h @@ -8,7 +8,7 @@ * 2001-2005 (c) MontaVista Software, Inc. * Daniel Walker * - * (C) 2005 Thomas Gleixner + * (C) 2005 Linutronix GmbH, Thomas Gleixner * * Simplifications of the original code by * Oleg Nesterov diff --git a/include/linux/rslib.h b/include/linux/rslib.h index a04dacbdc8ae9..a2848f6907e3d 100644 --- a/include/linux/rslib.h +++ b/include/linux/rslib.h @@ -2,7 +2,7 @@ /* * Generic Reed Solomon encoder / decoder library * - * Copyright (C) 2004 Thomas Gleixner (tglx@linutronix.de) + * Copyright (C) 2004 Thomas Gleixner (tglx@kernel.org) * * RS code lifted from reed solomon library written by Phil Karn * Copyright 2002 Phil Karn, KA9Q diff --git a/include/linux/uio_driver.h b/include/linux/uio_driver.h index 18238dc8bfd35..334641e20fb12 100644 --- a/include/linux/uio_driver.h +++ b/include/linux/uio_driver.h @@ -3,7 +3,7 @@ * include/linux/uio_driver.h * * Copyright(C) 2005, Benedikt Spranger - * Copyright(C) 2005, Thomas Gleixner + * Copyright(C) 2005, Linutronix GmbH, Thomas Gleixner * Copyright(C) 2006, Hans J. Koch * Copyright(C) 2006, Greg Kroah-Hartman * diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h index c44a8fb3e4181..72f03153dd325 100644 --- a/include/uapi/linux/perf_event.h +++ b/include/uapi/linux/perf_event.h @@ -2,7 +2,7 @@ /* * Performance events: * - * Copyright (C) 2008-2009, Thomas Gleixner + * Copyright (C) 2008-2009, Linutronix GmbH, Thomas Gleixner * Copyright (C) 2008-2011, Red Hat, Inc., Ingo Molnar * Copyright (C) 2008-2011, Red Hat, Inc., Peter Zijlstra * diff --git a/kernel/events/callchain.c b/kernel/events/callchain.c index b9c7e00725d6b..1f65895787039 100644 --- a/kernel/events/callchain.c +++ b/kernel/events/callchain.c @@ -2,7 +2,7 @@ /* * Performance events callchain code, extracted from core.c: * - * Copyright (C) 2008 Thomas Gleixner + * Copyright (C) 2008 Linutronix GmbH, Thomas Gleixner * Copyright (C) 2008-2011 Red Hat, Inc., Ingo Molnar * Copyright (C) 2008-2011 Red Hat, Inc., Peter Zijlstra * Copyright © 2009 Paul Mackerras, IBM Corp. diff --git a/kernel/events/core.c b/kernel/events/core.c index dad0d3d2e85f7..f5e9d30e4fa93 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -2,7 +2,7 @@ /* * Performance events core code: * - * Copyright (C) 2008 Thomas Gleixner + * Copyright (C) 2008 Linutronix GmbH, Thomas Gleixner * Copyright (C) 2008-2011 Red Hat, Inc., Ingo Molnar * Copyright (C) 2008-2011 Red Hat, Inc., Peter Zijlstra * Copyright © 2009 Paul Mackerras, IBM Corp. diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c index 20a9050237362..3e7de26614172 100644 --- a/kernel/events/ring_buffer.c +++ b/kernel/events/ring_buffer.c @@ -2,7 +2,7 @@ /* * Performance events ring-buffer code: * - * Copyright (C) 2008 Thomas Gleixner + * Copyright (C) 2008 Linutronix GmbH, Thomas Gleixner * Copyright (C) 2008-2011 Red Hat, Inc., Ingo Molnar * Copyright (C) 2008-2011 Red Hat, Inc., Peter Zijlstra * Copyright © 2009 Paul Mackerras, IBM Corp. diff --git a/kernel/irq/debugfs.c b/kernel/irq/debugfs.c index 3527defd2890e..5c5ebaee35f2c 100644 --- a/kernel/irq/debugfs.c +++ b/kernel/irq/debugfs.c @@ -1,5 +1,5 @@ // SPDX-License-Identifier: GPL-2.0 -// Copyright 2017 Thomas Gleixner +// Copyright 2017 Linutronix GmbH, Thomas Gleixner #include #include diff --git a/kernel/irq/matrix.c b/kernel/irq/matrix.c index 8f222d1ccceca..a50f2305a8dcd 100644 --- a/kernel/irq/matrix.c +++ b/kernel/irq/matrix.c @@ -1,5 +1,5 @@ // SPDX-License-Identifier: GPL-2.0 -// Copyright (C) 2017 Thomas Gleixner +// Copyright (C) 2017 Linutronix GmbH, Thomas Gleixner #include #include diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index da46c31645378..e71302282671c 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -15,7 +15,7 @@ * Author: Srivatsa Vaddagiri * * Scaled math optimizations by Thomas Gleixner - * Copyright (C) 2007, Thomas Gleixner + * Copyright (C) 2007, Linutronix GmbH, Thomas Gleixner * * Adaptive scheduling granularity, math enhancements by Peter Zijlstra * Copyright (C) 2007 Red Hat, Inc., Peter Zijlstra diff --git a/kernel/sched/pelt.c b/kernel/sched/pelt.c index fa83bbaf4f3e8..897790889ba38 100644 --- a/kernel/sched/pelt.c +++ b/kernel/sched/pelt.c @@ -15,7 +15,7 @@ * Author: Srivatsa Vaddagiri * * Scaled math optimizations by Thomas Gleixner - * Copyright (C) 2007, Thomas Gleixner + * Copyright (C) 2007, Linutronix GmbH, Thomas Gleixner * * Adaptive scheduling granularity, math enhancements by Peter Zijlstra * Copyright (C) 2007 Red Hat, Inc., Peter Zijlstra diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c index a59bc75ab7c5b..eaae1ce9f0600 100644 --- a/kernel/time/clockevents.c +++ b/kernel/time/clockevents.c @@ -2,7 +2,7 @@ /* * This file contains functions which manage clock event devices. * - * Copyright(C) 2005-2006, Thomas Gleixner + * Copyright(C) 2005-2006, Linutronix GmbH, Thomas Gleixner * Copyright(C) 2005-2007, Red Hat, Inc., Ingo Molnar * Copyright(C) 2006-2007, Timesys Corp., Thomas Gleixner */ diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c index f8ea8c8fc8952..bdb30cc5e873e 100644 --- a/kernel/time/hrtimer.c +++ b/kernel/time/hrtimer.c @@ -1,6 +1,6 @@ // SPDX-License-Identifier: GPL-2.0 /* - * Copyright(C) 2005-2006, Thomas Gleixner + * Copyright(C) 2005-2006, Linutronix GmbH, Thomas Gleixner * Copyright(C) 2005-2007, Red Hat, Inc., Ingo Molnar * Copyright(C) 2006-2007 Timesys Corp., Thomas Gleixner * diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c index 0207868c8b4d2..f63c65881364d 100644 --- a/kernel/time/tick-broadcast.c +++ b/kernel/time/tick-broadcast.c @@ -3,7 +3,7 @@ * This file contains functions which emulate a local clock-event * device via a broadcast event source. * - * Copyright(C) 2005-2006, Thomas Gleixner + * Copyright(C) 2005-2006, Linutronix GmbH, Thomas Gleixner * Copyright(C) 2005-2007, Red Hat, Inc., Ingo Molnar * Copyright(C) 2006-2007, Timesys Corp., Thomas Gleixner */ diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c index 7e33d3f2e889b..d305d85218961 100644 --- a/kernel/time/tick-common.c +++ b/kernel/time/tick-common.c @@ -3,7 +3,7 @@ * This file contains the base functions to manage periodic tick * related events. * - * Copyright(C) 2005-2006, Thomas Gleixner + * Copyright(C) 2005-2006, Linutronix GmbH, Thomas Gleixner * Copyright(C) 2005-2007, Red Hat, Inc., Ingo Molnar * Copyright(C) 2006-2007, Timesys Corp., Thomas Gleixner */ diff --git a/kernel/time/tick-oneshot.c b/kernel/time/tick-oneshot.c index ffee943d796d0..7472597f3225f 100644 --- a/kernel/time/tick-oneshot.c +++ b/kernel/time/tick-oneshot.c @@ -3,7 +3,7 @@ * This file contains functions which manage high resolution tick * related events. * - * Copyright(C) 2005-2006, Thomas Gleixner + * Copyright(C) 2005-2006, Linutronix GmbH, Thomas Gleixner * Copyright(C) 2005-2007, Red Hat, Inc., Ingo Molnar * Copyright(C) 2006-2007, Timesys Corp., Thomas Gleixner */ diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c index 8ddf74e705d35..2f8a7923fa279 100644 --- a/kernel/time/tick-sched.c +++ b/kernel/time/tick-sched.c @@ -1,6 +1,6 @@ // SPDX-License-Identifier: GPL-2.0 /* - * Copyright(C) 2005-2006, Thomas Gleixner + * Copyright(C) 2005-2006, Linutronix GmbH, Thomas Gleixner * Copyright(C) 2005-2007, Red Hat, Inc., Ingo Molnar * Copyright(C) 2006-2007 Timesys Corp., Thomas Gleixner * diff --git a/lib/debugobjects.c b/lib/debugobjects.c index ecf8e7f978e30..89a1d6745dc2c 100644 --- a/lib/debugobjects.c +++ b/lib/debugobjects.c @@ -2,7 +2,7 @@ /* * Generic infrastructure for lifetime debugging of objects. * - * Copyright (C) 2008, Thomas Gleixner + * Copyright (C) 2008, Linutronix GmbH, Thomas Gleixner */ #define pr_fmt(fmt) "ODEBUG: " fmt diff --git a/lib/plist.c b/lib/plist.c index ba677c31e8f3d..a5bef38add431 100644 --- a/lib/plist.c +++ b/lib/plist.c @@ -10,7 +10,7 @@ * 2001-2005 (c) MontaVista Software, Inc. * Daniel Walker * - * (C) 2005 Thomas Gleixner + * (C) 2005 Linutronix GmbH, Thomas Gleixner * * Simplifications of the original code by * Oleg Nesterov diff --git a/lib/reed_solomon/decode_rs.c b/lib/reed_solomon/decode_rs.c index 805de84ae83df..ef86ee2aec580 100644 --- a/lib/reed_solomon/decode_rs.c +++ b/lib/reed_solomon/decode_rs.c @@ -5,7 +5,7 @@ * Copyright 2002, Phil Karn, KA9Q * May be used under the terms of the GNU General Public License (GPL) * - * Adaption to the kernel by Thomas Gleixner (tglx@linutronix.de) + * Adaption to the kernel by Thomas Gleixner (tglx@kernel.org) * * Generic data width independent code which is included by the wrappers. */ diff --git a/lib/reed_solomon/encode_rs.c b/lib/reed_solomon/encode_rs.c index 9112d46e869ee..1d9e51dcc83dd 100644 --- a/lib/reed_solomon/encode_rs.c +++ b/lib/reed_solomon/encode_rs.c @@ -5,7 +5,7 @@ * Copyright 2002, Phil Karn, KA9Q * May be used under the terms of the GNU General Public License (GPL) * - * Adaption to the kernel by Thomas Gleixner (tglx@linutronix.de) + * Adaption to the kernel by Thomas Gleixner (tglx@kernel.org) * * Generic data width independent code which is included by the wrappers. */ diff --git a/lib/reed_solomon/reed_solomon.c b/lib/reed_solomon/reed_solomon.c index bbc01bad3053e..a9e2dcb6f2a7a 100644 --- a/lib/reed_solomon/reed_solomon.c +++ b/lib/reed_solomon/reed_solomon.c @@ -2,7 +2,7 @@ /* * Generic Reed Solomon encoder / decoder library * - * Copyright (C) 2004 Thomas Gleixner (tglx@linutronix.de) + * Copyright (C) 2004 Thomas Gleixner (tglx@kernel.org) * * Reed Solomon code lifted from reed solomon library written by Phil Karn * Copyright 2002 Phil Karn, KA9Q diff --git a/scripts/spdxcheck.py b/scripts/spdxcheck.py index 8d608f61bf371..908029e45ca2b 100755 --- a/scripts/spdxcheck.py +++ b/scripts/spdxcheck.py @@ -1,6 +1,6 @@ #!/usr/bin/env python3 # SPDX-License-Identifier: GPL-2.0 -# Copyright Thomas Gleixner +# Copyright Linutronix GmbH, Thomas Gleixner from argparse import ArgumentParser from ply import lex, yacc diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h index c44a8fb3e4181..72f03153dd325 100644 --- a/tools/include/uapi/linux/perf_event.h +++ b/tools/include/uapi/linux/perf_event.h @@ -2,7 +2,7 @@ /* * Performance events: * - * Copyright (C) 2008-2009, Thomas Gleixner + * Copyright (C) 2008-2009, Linutronix GmbH, Thomas Gleixner * Copyright (C) 2008-2011, Red Hat, Inc., Ingo Molnar * Copyright (C) 2008-2011, Red Hat, Inc., Peter Zijlstra * diff --git a/tools/perf/builtin-list.c b/tools/perf/builtin-list.c index 5cbca0bacd352..87a5491048ac6 100644 --- a/tools/perf/builtin-list.c +++ b/tools/perf/builtin-list.c @@ -4,7 +4,7 @@ * * Builtin list command: list all event types * - * Copyright (C) 2009, Thomas Gleixner + * Copyright (C) 2009, Linutronix GmbH, Thomas Gleixner * Copyright (C) 2008-2009, Red Hat Inc, Ingo Molnar * Copyright (C) 2011, Red Hat Inc, Arnaldo Carvalho de Melo */ From e0fbe8fb78ed3d2283b0ed6e4287c1eba9559965 Mon Sep 17 00:00:00 2001 From: Lorenzo Pieralisi Date: Mon, 22 Dec 2025 11:22:50 +0100 Subject: [PATCH 0935/1321] irqchip/gic-v5: Fix gicv5_its_map_event() ITTE read endianness Kbuild bot (through sparse) reported that the ITTE read to carry out a valid check in gicv5_its_map_event() lacks proper endianness handling. Add the missing endianess conversion. Fixes: 57d72196dfc8 ("irqchip/gic-v5: Add GICv5 ITS support") Reported-by: kernel test robot Signed-off-by: Lorenzo Pieralisi Signed-off-by: Thomas Gleixner Acked-by: Marc Zyngier Link: https://patch.msgid.link/20251222102250.435460-1-lpieralisi@kernel.org Closes: https://lore.kernel.org/oe-kbuild-all/202512131849.30ZRTBeR-lkp@intel.com/ --- drivers/irqchip/irq-gic-v5-its.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/irqchip/irq-gic-v5-its.c b/drivers/irqchip/irq-gic-v5-its.c index 554485f0be1fb..8e22134b9f486 100644 --- a/drivers/irqchip/irq-gic-v5-its.c +++ b/drivers/irqchip/irq-gic-v5-its.c @@ -849,7 +849,7 @@ static int gicv5_its_map_event(struct gicv5_its_dev *its_dev, u16 event_id, u32 itte = gicv5_its_device_get_itte_ref(its_dev, event_id); - if (FIELD_GET(GICV5_ITTL2E_VALID, *itte)) + if (FIELD_GET(GICV5_ITTL2E_VALID, le64_to_cpu(*itte))) return -EEXIST; itt_entry = FIELD_PREP(GICV5_ITTL2E_LPI_ID, lpi) | From b69fa9ae497c5695d39c7c4338111767a8397fff Mon Sep 17 00:00:00 2001 From: Anup Patel Date: Tue, 23 Dec 2025 20:05:44 +0530 Subject: [PATCH 0936/1321] Revert "irqchip/riscv-imsic: Embed the vector array in lpriv" The __alloc_percpu() fails when the number of IDs are greater than 959 because size parameter of __alloc_percpu() must be less than 32768 (aka PCPU_MIN_UNIT_SIZE). This failure is observed with KVMTOOL when AIA is trap-n-emulated by in-kernel KVM because in this case KVM guest has 2047 interrupt IDs. To address this issue, don't embed vector array in struct imsic_local_priv until __alloc_percpu() support size parameter greater than 32768. This reverts commit 79eaabc61dfb ("irqchip/riscv-imsic: Embed the vector array in lpriv"). Signed-off-by: Anup Patel Signed-off-by: Thomas Gleixner Link: https://patch.msgid.link/20251223143544.1504217-1-anup.patel@oss.qualcomm.com --- drivers/irqchip/irq-riscv-imsic-state.c | 10 ++++++++-- drivers/irqchip/irq-riscv-imsic-state.h | 2 +- 2 files changed, 9 insertions(+), 3 deletions(-) diff --git a/drivers/irqchip/irq-riscv-imsic-state.c b/drivers/irqchip/irq-riscv-imsic-state.c index 385368052d5c0..b6cebfee94611 100644 --- a/drivers/irqchip/irq-riscv-imsic-state.c +++ b/drivers/irqchip/irq-riscv-imsic-state.c @@ -477,6 +477,7 @@ static void __init imsic_local_cleanup(void) lpriv = per_cpu_ptr(imsic->lpriv, cpu); bitmap_free(lpriv->dirty_bitmap); + kfree(lpriv->vectors); } free_percpu(imsic->lpriv); @@ -490,8 +491,7 @@ static int __init imsic_local_init(void) int cpu, i; /* Allocate per-CPU private state */ - imsic->lpriv = __alloc_percpu(struct_size(imsic->lpriv, vectors, global->nr_ids + 1), - __alignof__(*imsic->lpriv)); + imsic->lpriv = alloc_percpu(typeof(*imsic->lpriv)); if (!imsic->lpriv) return -ENOMEM; @@ -511,6 +511,12 @@ static int __init imsic_local_init(void) timer_setup(&lpriv->timer, imsic_local_timer_callback, TIMER_PINNED); #endif + /* Allocate vector array */ + lpriv->vectors = kcalloc(global->nr_ids + 1, sizeof(*lpriv->vectors), + GFP_KERNEL); + if (!lpriv->vectors) + goto fail_local_cleanup; + /* Setup vector array */ for (i = 0; i <= global->nr_ids; i++) { vec = &lpriv->vectors[i]; diff --git a/drivers/irqchip/irq-riscv-imsic-state.h b/drivers/irqchip/irq-riscv-imsic-state.h index 6332501dcbd8e..c42ee180b3050 100644 --- a/drivers/irqchip/irq-riscv-imsic-state.h +++ b/drivers/irqchip/irq-riscv-imsic-state.h @@ -40,7 +40,7 @@ struct imsic_local_priv { #endif /* Local vector table */ - struct imsic_vector vectors[]; + struct imsic_vector *vectors; }; struct imsic_priv { From f958d63933c352552d24296c8c762f217c35f4f5 Mon Sep 17 00:00:00 2001 From: Peter Zijlstra Date: Sat, 20 Dec 2025 14:14:41 +0100 Subject: [PATCH 0937/1321] perf: Ensure swevent hrtimer is properly destroyed With the change to hrtimer_try_to_cancel() in perf_swevent_cancel_hrtimer() it appears possible for the hrtimer to still be active by the time the event gets freed. Make sure the event does a full hrtimer_cancel() on the free path by installing a perf_event::destroy handler. Fixes: eb3182ef0405 ("perf/core: Fix system hang caused by cpu-clock usage") Reported-by: CyberUnicorns Tested-by: CyberUnicorns Debugged-by: Thomas Gleixner Signed-off-by: Peter Zijlstra (Intel) --- kernel/events/core.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/kernel/events/core.c b/kernel/events/core.c index f5e9d30e4fa93..5b5cb620499ef 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -11906,6 +11906,11 @@ static void perf_swevent_cancel_hrtimer(struct perf_event *event) } } +static void perf_swevent_destroy_hrtimer(struct perf_event *event) +{ + hrtimer_cancel(&event->hw.hrtimer); +} + static void perf_swevent_init_hrtimer(struct perf_event *event) { struct hw_perf_event *hwc = &event->hw; @@ -11914,6 +11919,7 @@ static void perf_swevent_init_hrtimer(struct perf_event *event) return; hrtimer_setup(&hwc->hrtimer, perf_swevent_hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_HARD); + event->destroy = perf_swevent_destroy_hrtimer; /* * Since hrtimers have a fixed rate, we can do a static freq->period From 8c37fd16da85e15a5bc1a378efef86ac17227a62 Mon Sep 17 00:00:00 2001 From: Cong Wang Date: Tue, 23 Dec 2025 13:51:13 -0800 Subject: [PATCH 0938/1321] sched/mm_cid: Prevent NULL mm dereference in sched_mm_cid_after_execve() sched_mm_cid_after_execve() is called in bprm_execve()'s cleanup path even when exec_binprm() fails. For the init task's first execve(), this causes a problem: 1. current->mm is NULL (kernel threads don't have an mm) 2. sched_mm_cid_before_execve() exits early because mm is NULL 3. exec_binprm() fails (e.g., ENOENT for missing script interpreter) 4. sched_mm_cid_after_execve() is called with mm still NULL 5. sched_mm_cid_fork() is called unconditionally, triggering WARN_ON This is easily reproduced by booting with an init that is a shell script (#!/bin/sh) where the interpreter doesn't exist in the initramfs. Fix this by checking if t->mm is NULL before calling sched_mm_cid_fork(), matching the behavior of sched_mm_cid_before_execve() which already handles this case via sched_mm_cid_exit()'s early return. Fixes: b0c3d51b54f8 ("sched/mmcid: Provide precomputed maximal value") Signed-off-by: Cong Wang Signed-off-by: Thomas Gleixner Reviewed-by: Mathieu Desnoyers Acked-by: Will Deacon Link: https://patch.msgid.link/20251223215113.639686-1-xiyou.wangcong@gmail.com --- kernel/sched/core.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 41ba0be169117..60afadb6eedea 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -10694,10 +10694,11 @@ void sched_mm_cid_before_execve(struct task_struct *t) sched_mm_cid_exit(t); } -/* Reactivate MM CID after successful execve() */ +/* Reactivate MM CID after execve() */ void sched_mm_cid_after_execve(struct task_struct *t) { - sched_mm_cid_fork(t); + if (t->mm) + sched_mm_cid_fork(t); } static void mm_cid_work_fn(struct work_struct *work) From ef2224b653c929c428f1977daf27d4721a79df35 Mon Sep 17 00:00:00 2001 From: Brendan Jackman Date: Tue, 16 Dec 2025 10:16:36 +0000 Subject: [PATCH 0939/1321] x86/sev: Disable GCOV on noinstr object With Debian clang version 19.1.7 (3+build5) there are calls to kasan_check_write() from __sev_es_nmi_complete(), which violates noinstr. Fix it by disabling GCOV for the noinstr object, as has been done for previous such instrumentation issues. Note that this file already disables __SANITIZE_ADDRESS__ and __SANITIZE_THREAD__, thus calls like kasan_check_write() ought to be nops regardless of GCOV. This has been fixed in other patches. However, to avoid any other accidental instrumentation showing up, (and since, in principle GCOV is instrumentation and hence should be disabled for noinstr code anyway), disable GCOV overall as well. Signed-off-by: Brendan Jackman Signed-off-by: Borislav Petkov (AMD) Acked-by: Marco Elver Link: https://patch.msgid.link/20251216-gcov-inline-noinstr-v3-3-10244d154451@google.com --- arch/x86/coco/sev/Makefile | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/x86/coco/sev/Makefile b/arch/x86/coco/sev/Makefile index 3b8ae214a6a64..b2e9ec2f69014 100644 --- a/arch/x86/coco/sev/Makefile +++ b/arch/x86/coco/sev/Makefile @@ -8,3 +8,5 @@ UBSAN_SANITIZE_noinstr.o := n # GCC may fail to respect __no_sanitize_address or __no_kcsan when inlining KASAN_SANITIZE_noinstr.o := n KCSAN_SANITIZE_noinstr.o := n + +GCOV_PROFILE_noinstr.o := n From eae8b9f3eee3bce1ae7b7316274313d349d3ffdb Mon Sep 17 00:00:00 2001 From: Alexander Sverdlin Date: Tue, 18 Nov 2025 09:35:48 +0100 Subject: [PATCH 0940/1321] counter: interrupt-cnt: Drop IRQF_NO_THREAD flag An IRQ handler can either be IRQF_NO_THREAD or acquire spinlock_t, as CONFIG_PROVE_RAW_LOCK_NESTING warns: ============================= [ BUG: Invalid wait context ] 6.18.0-rc1+git... #1 ----------------------------- some-user-space-process/1251 is trying to lock: (&counter->events_list_lock){....}-{3:3}, at: counter_push_event [counter] other info that might help us debug this: context-{2:2} no locks held by some-user-space-process/.... stack backtrace: CPU: 0 UID: 0 PID: 1251 Comm: some-user-space-process 6.18.0-rc1+git... #1 PREEMPT Call trace: show_stack (C) dump_stack_lvl dump_stack __lock_acquire lock_acquire _raw_spin_lock_irqsave counter_push_event [counter] interrupt_cnt_isr [interrupt_cnt] __handle_irq_event_percpu handle_irq_event handle_simple_irq handle_irq_desc generic_handle_domain_irq gpio_irq_handler handle_irq_desc generic_handle_domain_irq gic_handle_irq call_on_irq_stack do_interrupt_handler el0_interrupt __el0_irq_handler_common el0t_64_irq_handler el0t_64_irq ... and Sebastian correctly points out. Remove IRQF_NO_THREAD as an alternative to switching to raw_spinlock_t, because the latter would limit all potential nested locks to raw_spinlock_t only. Cc: Sebastian Andrzej Siewior Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20251117151314.xwLAZrWY@linutronix.de/ Fixes: a55ebd47f21f ("counter: add IRQ or GPIO based counter") Signed-off-by: Alexander Sverdlin Reviewed-by: Sebastian Andrzej Siewior Reviewed-by: Oleksij Rempel Link: https://lore.kernel.org/r/20251118083603.778626-1-alexander.sverdlin@siemens.com Signed-off-by: William Breathitt Gray --- drivers/counter/interrupt-cnt.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/counter/interrupt-cnt.c b/drivers/counter/interrupt-cnt.c index 6c0c1d2d7027d..e6100b5fb082e 100644 --- a/drivers/counter/interrupt-cnt.c +++ b/drivers/counter/interrupt-cnt.c @@ -229,8 +229,7 @@ static int interrupt_cnt_probe(struct platform_device *pdev) irq_set_status_flags(priv->irq, IRQ_NOAUTOEN); ret = devm_request_irq(dev, priv->irq, interrupt_cnt_isr, - IRQF_TRIGGER_RISING | IRQF_NO_THREAD, - dev_name(dev), counter); + IRQF_TRIGGER_RISING, dev_name(dev), counter); if (ret) return ret; From 2d017661176f37a757d4ef9251a48abf0d0afe28 Mon Sep 17 00:00:00 2001 From: Haotian Zhang Date: Mon, 15 Dec 2025 10:01:14 +0800 Subject: [PATCH 0941/1321] counter: 104-quad-8: Fix incorrect return value in IRQ handler quad8_irq_handler() should return irqreturn_t enum values, but it directly returns negative errno codes from regmap operations on error. Return IRQ_NONE if the interrupt status cannot be read. If clearing the interrupt fails, return IRQ_HANDLED to prevent the kernel from disabling the IRQ line due to a spurious interrupt storm. Also, log these regmap failures with dev_WARN_ONCE. Fixes: 98ffe0252911 ("counter: 104-quad-8: Migrate to the regmap API") Suggested-by: Andy Shevchenko Signed-off-by: Haotian Zhang Link: https://lore.kernel.org/r/20251215020114.1913-1-vulab@iscas.ac.cn Cc: stable@vger.kernel.org Signed-off-by: William Breathitt Gray --- drivers/counter/104-quad-8.c | 20 ++++++++++++++------ 1 file changed, 14 insertions(+), 6 deletions(-) diff --git a/drivers/counter/104-quad-8.c b/drivers/counter/104-quad-8.c index ce81fc4e1ae76..573b2fe93253e 100644 --- a/drivers/counter/104-quad-8.c +++ b/drivers/counter/104-quad-8.c @@ -1192,6 +1192,7 @@ static irqreturn_t quad8_irq_handler(int irq, void *private) { struct counter_device *counter = private; struct quad8 *const priv = counter_priv(counter); + struct device *dev = counter->parent; unsigned int status; unsigned long irq_status; unsigned long channel; @@ -1200,8 +1201,11 @@ static irqreturn_t quad8_irq_handler(int irq, void *private) int ret; ret = regmap_read(priv->map, QUAD8_INTERRUPT_STATUS, &status); - if (ret) - return ret; + if (ret) { + dev_WARN_ONCE(dev, true, + "Attempt to read Interrupt Status Register failed: %d\n", ret); + return IRQ_NONE; + } if (!status) return IRQ_NONE; @@ -1223,8 +1227,9 @@ static irqreturn_t quad8_irq_handler(int irq, void *private) break; default: /* should never reach this path */ - WARN_ONCE(true, "invalid interrupt trigger function %u configured for channel %lu\n", - flg_pins, channel); + dev_WARN_ONCE(dev, true, + "invalid interrupt trigger function %u configured for channel %lu\n", + flg_pins, channel); continue; } @@ -1232,8 +1237,11 @@ static irqreturn_t quad8_irq_handler(int irq, void *private) } ret = regmap_write(priv->map, QUAD8_CHANNEL_OPERATION, CLEAR_PENDING_INTERRUPTS); - if (ret) - return ret; + if (ret) { + dev_WARN_ONCE(dev, true, + "Attempt to clear pending interrupts by writing to Channel Operation Register failed: %d\n", ret); + return IRQ_HANDLED; + } return IRQ_HANDLED; } From 6130e0ba65b83f4d116a8c22cff9151946805c63 Mon Sep 17 00:00:00 2001 From: Alexander Usyskin Date: Mon, 15 Dec 2025 12:59:15 +0200 Subject: [PATCH 0942/1321] mei: me: add nova lake point S DID Add Nova Lake S device id. Cc: stable Co-developed-by: Tomas Winkler Signed-off-by: Tomas Winkler Signed-off-by: Alexander Usyskin Link: https://patch.msgid.link/20251215105915.1672659-1-alexander.usyskin@intel.com Signed-off-by: Greg Kroah-Hartman --- drivers/misc/mei/hw-me-regs.h | 2 ++ drivers/misc/mei/pci-me.c | 2 ++ 2 files changed, 4 insertions(+) diff --git a/drivers/misc/mei/hw-me-regs.h b/drivers/misc/mei/hw-me-regs.h index a4f75dc369292..fa30899a5fa26 100644 --- a/drivers/misc/mei/hw-me-regs.h +++ b/drivers/misc/mei/hw-me-regs.h @@ -122,6 +122,8 @@ #define MEI_DEV_ID_WCL_P 0x4D70 /* Wildcat Lake P */ +#define MEI_DEV_ID_NVL_S 0x6E68 /* Nova Lake Point S */ + /* * MEI HW Section */ diff --git a/drivers/misc/mei/pci-me.c b/drivers/misc/mei/pci-me.c index 73cad914be9f9..2a6e569558b94 100644 --- a/drivers/misc/mei/pci-me.c +++ b/drivers/misc/mei/pci-me.c @@ -129,6 +129,8 @@ static const struct pci_device_id mei_me_pci_tbl[] = { {MEI_PCI_DEVICE(MEI_DEV_ID_WCL_P, MEI_ME_PCH15_CFG)}, + {MEI_PCI_DEVICE(MEI_DEV_ID_NVL_S, MEI_ME_PCH15_CFG)}, + /* required last entry */ {0, } }; From 0ae883c1e2de433ecc7f1553802f9bde029b1798 Mon Sep 17 00:00:00 2001 From: Alice Ryhl Date: Tue, 2 Dec 2025 11:24:24 +0000 Subject: [PATCH 0943/1321] rust_binder: remove spin_lock() in rust_shrink_free_page() When forward-porting Rust Binder to 6.18, I neglected to take commit fb56fdf8b9a2 ("mm/list_lru: split the lock to per-cgroup scope") into account, and apparently I did not end up running the shrinker callback when I sanity tested the driver before submission. This leads to crashes like the following: ============================================ WARNING: possible recursive locking detected 6.18.0-mainline-maybe-dirty #1 Tainted: G IO -------------------------------------------- kswapd0/68 is trying to acquire lock: ffff956000fa18b0 (&l->lock){+.+.}-{2:2}, at: lock_list_lru_of_memcg+0x128/0x230 but task is already holding lock: ffff956000fa18b0 (&l->lock){+.+.}-{2:2}, at: rust_helper_spin_lock+0xd/0x20 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&l->lock); lock(&l->lock); *** DEADLOCK *** May be due to missing lock nesting notation 3 locks held by kswapd0/68: #0: ffffffff90d2e260 (fs_reclaim){+.+.}-{0:0}, at: kswapd+0x597/0x1160 #1: ffff956000fa18b0 (&l->lock){+.+.}-{2:2}, at: rust_helper_spin_lock+0xd/0x20 #2: ffffffff90cf3680 (rcu_read_lock){....}-{1:2}, at: lock_list_lru_of_memcg+0x2d/0x230 To fix this, remove the spin_lock() call from rust_shrink_free_page(). Cc: stable Fixes: eafedbc7c050 ("rust_binder: add Rust Binder driver") Signed-off-by: Alice Ryhl Link: https://patch.msgid.link/20251202-binder-shrink-unspin-v1-1-263efb9ad625@google.com Signed-off-by: Greg Kroah-Hartman --- drivers/android/binder/page_range.rs | 3 --- 1 file changed, 3 deletions(-) diff --git a/drivers/android/binder/page_range.rs b/drivers/android/binder/page_range.rs index 9379038f61f51..fdd97112ef5c8 100644 --- a/drivers/android/binder/page_range.rs +++ b/drivers/android/binder/page_range.rs @@ -727,8 +727,5 @@ unsafe extern "C" fn rust_shrink_free_page( drop(mm); drop(page); - // SAFETY: We just unlocked the lru lock, but it should be locked when we return. - unsafe { bindings::spin_lock(&raw mut (*lru).lock) }; - LRU_REMOVED_ENTRY } From 5d176f9bbf2f4626681f88cd9fa3a94a69cc76b9 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Thomas=20Wei=C3=9Fschuh?= Date: Fri, 2 Jan 2026 08:32:03 +0100 Subject: [PATCH 0944/1321] lib/crypto: tests: polyval_kunit: Increase iterations for preparekey in IRQs MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit On my development machine the generic, memcpy()-only implementation of polyval_preparekey() is too fast for the IRQ workers to actually fire. The test fails. Increase the iterations to make the test more robust. The test will run for a maximum of one second in any case. [EB: This failure was already fixed by commit c31f4aa8fed0 ("kunit: Enforce task execution in {soft,hard}irq contexts"). I'm still applying this patch too, since the iteration count in this test made its running time much shorter than the other similar ones.] Fixes: b3aed551b3fc ("lib/crypto: tests: Add KUnit tests for POLYVAL") Signed-off-by: Thomas Weißschuh Link: https://lore.kernel.org/r/20260102-kunit-polyval-fix-v1-1-5313b5a65f35@linutronix.de Signed-off-by: Eric Biggers --- lib/crypto/tests/polyval_kunit.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/crypto/tests/polyval_kunit.c b/lib/crypto/tests/polyval_kunit.c index e59f598c15721..f47f41a39a416 100644 --- a/lib/crypto/tests/polyval_kunit.c +++ b/lib/crypto/tests/polyval_kunit.c @@ -183,7 +183,7 @@ static void test_polyval_preparekey_in_irqs(struct kunit *test) rand_bytes(state.raw_key, sizeof(state.raw_key)); polyval_preparekey(&state.expected_key, state.raw_key); - kunit_run_irq_test(test, polyval_irq_test_func, 20000, &state); + kunit_run_irq_test(test, polyval_irq_test_func, 200000, &state); } static int polyval_suite_init(struct kunit_suite *suite) From beb9255c1ec35685392353d9ece110c7a5ee1939 Mon Sep 17 00:00:00 2001 From: Jie Zhan Date: Wed, 7 Jan 2026 09:58:29 +0800 Subject: [PATCH 0945/1321] lib/crypto: tests: Fix syntax error for old python versions 'make binrpm-pkg' throws me this error, with Python 3.9: *** Error compiling '.../gen-hash-testvecs.py'... File ".../scripts/crypto/gen-hash-testvecs.py", line 121 return f'{alg.upper().replace('-', '_')}_DIGEST_SIZE' ^ SyntaxError: f-string: unmatched '(' Old python versions, presumably <= 3.11, can't resolve these quotes. Fix it with double quotes for compatibility. Fixes: 15c64c47e484 ("lib/crypto: tests: Add SHA3 kunit tests") Signed-off-by: Jie Zhan Link: https://lore.kernel.org/r/20260107015829.2000699-1-zhanjie9@hisilicon.com Signed-off-by: Eric Biggers --- scripts/crypto/gen-hash-testvecs.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripts/crypto/gen-hash-testvecs.py b/scripts/crypto/gen-hash-testvecs.py index c1d0517140bdb..c773294fba648 100755 --- a/scripts/crypto/gen-hash-testvecs.py +++ b/scripts/crypto/gen-hash-testvecs.py @@ -118,7 +118,7 @@ def print_c_struct_u8_array_field(name, value): def alg_digest_size_const(alg): if alg.startswith('blake2'): return f'{alg.upper()}_HASH_SIZE' - return f'{alg.upper().replace('-', '_')}_DIGEST_SIZE' + return f"{alg.upper().replace('-', '_')}_DIGEST_SIZE" def gen_unkeyed_testvecs(alg): print('') From 7957bd12175221d374ca2da5e179e4852454ff68 Mon Sep 17 00:00:00 2001 From: Eric Biggers Date: Tue, 6 Jan 2026 19:39:48 -0800 Subject: [PATCH 0946/1321] MAINTAINERS: add test vector generation scripts to "CRYPTO LIBRARY" The scripts in scripts/crypto/ are used to generate files in lib/crypto/, so they should be included in "CRYPTO LIBRARY". Acked-by: Ard Biesheuvel Link: https://lore.kernel.org/r/20260107033948.29368-1-ebiggers@kernel.org Signed-off-by: Eric Biggers --- MAINTAINERS | 1 + 1 file changed, 1 insertion(+) diff --git a/MAINTAINERS b/MAINTAINERS index ee036e0a3ef6c..0d044a58cbfe0 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -6708,6 +6708,7 @@ S: Maintained T: git https://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux.git libcrypto-next T: git https://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux.git libcrypto-fixes F: lib/crypto/ +F: scripts/crypto/ CRYPTO SPEED TEST COMPARE M: Wang Jinchao From 616b1694cf32f92fa14046b21099dfb5947d6d81 Mon Sep 17 00:00:00 2001 From: Eric Biggers Date: Tue, 6 Jan 2026 21:20:23 -0800 Subject: [PATCH 0947/1321] lib/crypto: aes: Fix missing MMU protection for AES S-box __cacheline_aligned puts the data in the ".data..cacheline_aligned" section, which isn't marked read-only i.e. it doesn't receive MMU protection. Replace it with ____cacheline_aligned which does the right thing and just aligns the data while keeping it in ".rodata". Fixes: b5e0b032b6c3 ("crypto: aes - add generic time invariant AES cipher") Cc: stable@vger.kernel.org Reported-by: Qingfang Deng Closes: https://lore.kernel.org/r/20260105074712.498-1-dqfext@gmail.com/ Acked-by: Ard Biesheuvel Link: https://lore.kernel.org/r/20260107052023.174620-1-ebiggers@kernel.org Signed-off-by: Eric Biggers --- lib/crypto/aes.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/lib/crypto/aes.c b/lib/crypto/aes.c index b57fda3460f1b..102aaa76bc8d7 100644 --- a/lib/crypto/aes.c +++ b/lib/crypto/aes.c @@ -13,7 +13,7 @@ * Emit the sbox as volatile const to prevent the compiler from doing * constant folding on sbox references involving fixed indexes. */ -static volatile const u8 __cacheline_aligned aes_sbox[] = { +static volatile const u8 ____cacheline_aligned aes_sbox[] = { 0x63, 0x7c, 0x77, 0x7b, 0xf2, 0x6b, 0x6f, 0xc5, 0x30, 0x01, 0x67, 0x2b, 0xfe, 0xd7, 0xab, 0x76, 0xca, 0x82, 0xc9, 0x7d, 0xfa, 0x59, 0x47, 0xf0, @@ -48,7 +48,7 @@ static volatile const u8 __cacheline_aligned aes_sbox[] = { 0x41, 0x99, 0x2d, 0x0f, 0xb0, 0x54, 0xbb, 0x16, }; -static volatile const u8 __cacheline_aligned aes_inv_sbox[] = { +static volatile const u8 ____cacheline_aligned aes_inv_sbox[] = { 0x52, 0x09, 0x6a, 0xd5, 0x30, 0x36, 0xa5, 0x38, 0xbf, 0x40, 0xa3, 0x9e, 0x81, 0xf3, 0xd7, 0xfb, 0x7c, 0xe3, 0x39, 0x82, 0x9b, 0x2f, 0xff, 0x87, From 32e392c0088b2054d6035f6114d02390eedfc39b Mon Sep 17 00:00:00 2001 From: Linus Torvalds Date: Sun, 11 Jan 2026 17:03:14 -1000 Subject: [PATCH 0948/1321] Linux 6.19-rc5 --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 665b79aa21b8d..9d38125263fb0 100644 --- a/Makefile +++ b/Makefile @@ -2,7 +2,7 @@ VERSION = 6 PATCHLEVEL = 19 SUBLEVEL = 0 -EXTRAVERSION = -rc4 +EXTRAVERSION = -rc5 NAME = Baby Opossum Posse # *DOCUMENTATION* From 4d40e03cf540bd4837618bad5cf9fc5bacc54d44 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Michal=20Koutn=C3=BD?= Date: Wed, 7 Jan 2026 17:59:41 +0100 Subject: [PATCH 0949/1321] cgroup: Eliminate cgrp_ancestor_storage in cgroup_root MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The cgrp_ancestor_storage has two drawbacks: - it's not guaranteed that the member immediately follows struct cgrp in cgroup_root (root cgroup's ancestors[0] might thus point to a padding and not in cgrp_ancestor_storage proper), - this idiom raises warnings with -Wflex-array-member-not-at-end. Instead of relying on the auxiliary member in cgroup_root, define the 0-th level ancestor inside struct cgroup (needed for static allocation of cgrp_dfl_root), deeper cgroups would allocate flexible _low_ancestors[]. Unionized alias through ancestors[] will transparently join the two ranges. The above change would still leave the flexible array at the end of struct cgroup inside cgroup_root, so move cgrp also towards the end of cgroup_root to resolve the -Wflex-array-member-not-at-end. Link: https://lore.kernel.org/r/5fb74444-2fbb-476e-b1bf-3f3e279d0ced@embeddedor.com/ Reported-by: Gustavo A. R. Silva Closes: https://lore.kernel.org/r/b3eb050d-9451-4b60-b06c-ace7dab57497@embeddedor.com/ Cc: David Laight Acked-by: Gustavo A. R. Silva Signed-off-by: Michal Koutný Signed-off-by: Tejun Heo --- include/linux/cgroup-defs.h | 25 ++++++++++++++----------- kernel/cgroup/cgroup.c | 2 +- 2 files changed, 15 insertions(+), 12 deletions(-) diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h index b760a3c470a56..f7cc60de00583 100644 --- a/include/linux/cgroup-defs.h +++ b/include/linux/cgroup-defs.h @@ -626,7 +626,13 @@ struct cgroup { #endif /* All ancestors including self */ - struct cgroup *ancestors[]; + union { + DECLARE_FLEX_ARRAY(struct cgroup *, ancestors); + struct { + struct cgroup *_root_ancestor; + DECLARE_FLEX_ARRAY(struct cgroup *, _low_ancestors); + }; + }; }; /* @@ -647,16 +653,6 @@ struct cgroup_root { struct list_head root_list; struct rcu_head rcu; /* Must be near the top */ - /* - * The root cgroup. The containing cgroup_root will be destroyed on its - * release. cgrp->ancestors[0] will be used overflowing into the - * following field. cgrp_ancestor_storage must immediately follow. - */ - struct cgroup cgrp; - - /* must follow cgrp for cgrp->ancestors[0], see above */ - struct cgroup *cgrp_ancestor_storage; - /* Number of cgroups in the hierarchy, used only for /proc/cgroups */ atomic_t nr_cgrps; @@ -668,6 +664,13 @@ struct cgroup_root { /* The name for this hierarchy - may be empty */ char name[MAX_CGROUP_ROOT_NAMELEN]; + + /* + * The root cgroup. The containing cgroup_root will be destroyed on its + * release. This must be embedded last due to flexible array at the end + * of struct cgroup. + */ + struct cgroup cgrp; }; /* diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c index e717208cfb185..554a02ee298ba 100644 --- a/kernel/cgroup/cgroup.c +++ b/kernel/cgroup/cgroup.c @@ -5847,7 +5847,7 @@ static struct cgroup *cgroup_create(struct cgroup *parent, const char *name, int ret; /* allocate the cgroup and its ID, 0 is reserved for the root */ - cgrp = kzalloc(struct_size(cgrp, ancestors, (level + 1)), GFP_KERNEL); + cgrp = kzalloc(struct_size(cgrp, _low_ancestors, level), GFP_KERNEL); if (!cgrp) return ERR_PTR(-ENOMEM); From 1d60cb935d518e71825b9b9d67e328e7a1cfd62f Mon Sep 17 00:00:00 2001 From: Stanislav Kinsburskii Date: Tue, 9 Dec 2025 16:37:20 +0000 Subject: [PATCH 0950/1321] mshv: Use PMD_ORDER instead of HPAGE_PMD_ORDER when processing regions Fix page order determination logic when CONFIG_PGTABLE_HAS_HUGE_LEAVES is undefined, as HPAGE_PMD_SHIFT is defined as BUILD_BUG in that case. Fixes: abceb4297bf8 ("mshv: Fix huge page handling in memory region traversal") Signed-off-by: Stanislav Kinsburskii Signed-off-by: Wei Liu --- drivers/hv/mshv_regions.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/hv/mshv_regions.c b/drivers/hv/mshv_regions.c index 202b9d551e393..dc2d7044fb913 100644 --- a/drivers/hv/mshv_regions.c +++ b/drivers/hv/mshv_regions.c @@ -58,7 +58,7 @@ static long mshv_region_process_chunk(struct mshv_mem_region *region, page_order = folio_order(page_folio(page)); /* The hypervisor only supports 4K and 2M page sizes */ - if (page_order && page_order != HPAGE_PMD_ORDER) + if (page_order && page_order != PMD_ORDER) return -EINVAL; stride = 1 << page_order; From db68e71828747784cafad28c81febc450dbfcac8 Mon Sep 17 00:00:00 2001 From: Stanislav Kinsburskii Date: Wed, 10 Dec 2025 17:55:47 +0000 Subject: [PATCH 0951/1321] mshv: Initialize local variables early upon region invalidation Ensure local variables are initialized before use so that the warning can print the right values if locking the region to invalidate fails due to inability to lock the region. Reported-by: Dan Carpenter Fixes: b9a66cd5ccbb ("mshv: Add support for movable memory regions") Signed-off-by: Stanislav Kinsburskii Signed-off-by: Wei Liu --- drivers/hv/mshv_regions.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/hv/mshv_regions.c b/drivers/hv/mshv_regions.c index dc2d7044fb913..8abf80129f9bd 100644 --- a/drivers/hv/mshv_regions.c +++ b/drivers/hv/mshv_regions.c @@ -494,13 +494,6 @@ static bool mshv_region_interval_invalidate(struct mmu_interval_notifier *mni, unsigned long mstart, mend; int ret = -EPERM; - if (mmu_notifier_range_blockable(range)) - mutex_lock(®ion->mutex); - else if (!mutex_trylock(®ion->mutex)) - goto out_fail; - - mmu_interval_set_seq(mni, cur_seq); - mstart = max(range->start, region->start_uaddr); mend = min(range->end, region->start_uaddr + (region->nr_pages << HV_HYP_PAGE_SHIFT)); @@ -508,6 +501,13 @@ static bool mshv_region_interval_invalidate(struct mmu_interval_notifier *mni, page_offset = HVPFN_DOWN(mstart - region->start_uaddr); page_count = HVPFN_DOWN(mend - mstart); + if (mmu_notifier_range_blockable(range)) + mutex_lock(®ion->mutex); + else if (!mutex_trylock(®ion->mutex)) + goto out_fail; + + mmu_interval_set_seq(mni, cur_seq); + ret = mshv_region_remap_pages(region, HV_MAP_GPA_NO_ACCESS, page_offset, page_count); if (ret) From 0e43462b52908dc7bdf361f9dceec862cde9d836 Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Tue, 16 Dec 2025 22:36:12 +0100 Subject: [PATCH 0952/1321] mshv: hide x86-specific functions on arm64 The hv_sleep_notifiers_register() and hv_machine_power_off() functions are only called and declared on x86, but cause a warning on arm64: drivers/hv/mshv_common.c:210:6: error: no previous prototype for 'hv_sleep_notifiers_register' [-Werror=missing-prototypes] 210 | void hv_sleep_notifiers_register(void) | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/hv/mshv_common.c:224:6: error: no previous prototype for 'hv_machine_power_off' [-Werror=missing-prototypes] 224 | void hv_machine_power_off(void) | ^~~~~~~~~~~~~~~~~~~~ Hide both inside of an #ifdef block. Fixes: f0be2600ac55 ("mshv: Use reboot notifier to configure sleep state") Fixes: 615a6e7d83f9 ("mshv: Cleanly shutdown root partition with MSHV") Signed-off-by: Arnd Bergmann Signed-off-by: Wei Liu --- drivers/hv/mshv_common.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/hv/mshv_common.c b/drivers/hv/mshv_common.c index 58027b23c2066..63f09bb5136e8 100644 --- a/drivers/hv/mshv_common.c +++ b/drivers/hv/mshv_common.c @@ -142,6 +142,7 @@ int hv_call_get_partition_property(u64 partition_id, } EXPORT_SYMBOL_GPL(hv_call_get_partition_property); +#ifdef CONFIG_X86 /* * Corresponding sleep states have to be initialized in order for a subsequent * HVCALL_ENTER_SLEEP_STATE call to succeed. Currently only S5 state as per @@ -237,3 +238,4 @@ void hv_machine_power_off(void) BUG(); } +#endif From 77d8ce3746dcdb93e4d7e3a70e5185d74e4ec93c Mon Sep 17 00:00:00 2001 From: "Gustavo A. R. Silva" Date: Fri, 12 Dec 2025 15:44:50 +0900 Subject: [PATCH 0953/1321] hyperv: Avoid -Wflex-array-member-not-at-end warning -Wflex-array-member-not-at-end was introduced in GCC-14, and we are getting ready to enable it, globally. Use the new __TRAILING_OVERLAP() helper to fix the following warning: include/hyperv/hvgdk_mini.h:581:25: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] This helper creates a union between a flexible-array member (FAM) and a set of MEMBERS that would otherwise follow it. This overlays the trailing MEMBER u64 gva_list[]; onto the FAM struct hv_tlb_flush_ex::hv_vp_set.bank_contents[], while keeping the FAM and the start of MEMBER aligned. The static_assert() ensures this alignment remains, and it's intentionally placed inmediately after the related structure --no blank line in between. Signed-off-by: Gustavo A. R. Silva Signed-off-by: Wei Liu --- include/hyperv/hvgdk_mini.h | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/include/hyperv/hvgdk_mini.h b/include/hyperv/hvgdk_mini.h index 04b18d0e37af3..30fbbde81c5c4 100644 --- a/include/hyperv/hvgdk_mini.h +++ b/include/hyperv/hvgdk_mini.h @@ -578,9 +578,12 @@ struct hv_tlb_flush { /* HV_INPUT_FLUSH_VIRTUAL_ADDRESS_LIST */ struct hv_tlb_flush_ex { u64 address_space; u64 flags; - struct hv_vpset hv_vp_set; - u64 gva_list[]; + __TRAILING_OVERLAP(struct hv_vpset, hv_vp_set, bank_contents, __packed, + u64 gva_list[]; + ); } __packed; +static_assert(offsetof(struct hv_tlb_flush_ex, hv_vp_set.bank_contents) == + offsetof(struct hv_tlb_flush_ex, gva_list)); struct ms_hyperv_tsc_page { /* HV_REFERENCE_TSC_PAGE */ volatile u32 tsc_sequence; From 3fed3459933e84834cd86e9959d6cca09ecd1a60 Mon Sep 17 00:00:00 2001 From: "Anirudh Rayabharam (Microsoft)" Date: Tue, 16 Dec 2025 14:20:30 +0000 Subject: [PATCH 0954/1321] mshv: release mutex on region invalidation failure In the region invalidation failure path in mshv_region_interval_invalidate(), the region mutex is not released. Fix it by releasing the mutex in the failure path. Signed-off-by: Anirudh Rayabharam (Microsoft) Fixes: b9a66cd5ccbb ("mshv: Add support for movable memory regions") Acked-by: Stanislav Kinsburskii Reviewed-by: Roman Kisel Signed-off-by: Wei Liu --- drivers/hv/mshv_regions.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/hv/mshv_regions.c b/drivers/hv/mshv_regions.c index 8abf80129f9bd..30bacba6aec39 100644 --- a/drivers/hv/mshv_regions.c +++ b/drivers/hv/mshv_regions.c @@ -511,7 +511,7 @@ static bool mshv_region_interval_invalidate(struct mmu_interval_notifier *mni, ret = mshv_region_remap_pages(region, HV_MAP_GPA_NO_ACCESS, page_offset, page_count); if (ret) - goto out_fail; + goto out_unlock; mshv_region_invalidate_pages(region, page_offset, page_count); @@ -519,6 +519,8 @@ static bool mshv_region_interval_invalidate(struct mmu_interval_notifier *mni, return true; +out_unlock: + mutex_unlock(®ion->mutex); out_fail: WARN_ONCE(ret, "Failed to invalidate region %#llx-%#llx (range %#lx-%#lx, event: %u, pages %#llx-%#llx, mm: %#llx): %d\n", From 808f886575f30aeef120dec7ae6c54f4b3cb8329 Mon Sep 17 00:00:00 2001 From: Ryosuke Yasuoka Date: Sat, 6 Dec 2025 23:09:36 +0900 Subject: [PATCH 0955/1321] x86/kvm: Avoid freeing stack-allocated node in kvm_async_pf_queue_task kvm_async_pf_queue_task() can incorrectly try to kfree() a node allocated on the stack of kvm_async_pf_task_wait_schedule(). This occurs when a task requests a PF while another task's PF request with the same token is still pending. Since the token is derived from the (u32)address in exc_page_fault(), two different tasks can generate the same token. Currently, kvm_async_pf_queue_task() assumes that any entry found in the list is a dummy entry and tries to kfree() it. To fix this, add a flag to the node structure to distinguish stack-allocated nodes, and only kfree() the node if it is a dummy entry. Signed-off-by: Ryosuke Yasuoka Message-ID: <20251206140939.144038-1-ryasuoka@redhat.com> Signed-off-by: Paolo Bonzini --- arch/x86/kernel/kvm.c | 19 ++++++++++++++++--- 1 file changed, 16 insertions(+), 3 deletions(-) diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c index df78ddee0abbb..37dc8465e0f51 100644 --- a/arch/x86/kernel/kvm.c +++ b/arch/x86/kernel/kvm.c @@ -89,6 +89,7 @@ struct kvm_task_sleep_node { struct swait_queue_head wq; u32 token; int cpu; + bool dummy; }; static struct kvm_task_sleep_head { @@ -120,15 +121,26 @@ static bool kvm_async_pf_queue_task(u32 token, struct kvm_task_sleep_node *n) raw_spin_lock(&b->lock); e = _find_apf_task(b, token); if (e) { - /* dummy entry exist -> wake up was delivered ahead of PF */ - hlist_del(&e->link); + struct kvm_task_sleep_node *dummy = NULL; + + /* + * The entry can either be a 'dummy' entry (which is put on the + * list when wake-up happens ahead of APF handling completion) + * or a token from another task which should not be touched. + */ + if (e->dummy) { + hlist_del(&e->link); + dummy = e; + } + raw_spin_unlock(&b->lock); - kfree(e); + kfree(dummy); return false; } n->token = token; n->cpu = smp_processor_id(); + n->dummy = false; init_swait_queue_head(&n->wq); hlist_add_head(&n->link, &b->list); raw_spin_unlock(&b->lock); @@ -231,6 +243,7 @@ static void kvm_async_pf_task_wake(u32 token) } dummy->token = token; dummy->cpu = smp_processor_id(); + dummy->dummy = true; init_swait_queue_head(&dummy->wq); hlist_add_head(&dummy->link, &b->list); dummy = NULL; From fce769fbfa15040805d59ac98233f57bce5dac87 Mon Sep 17 00:00:00 2001 From: Sean Christopherson Date: Wed, 31 Dec 2025 16:43:15 +0100 Subject: [PATCH 0956/1321] x86/fpu: Clear XSTATE_BV[i] in guest XSAVE state whenever XFD[i]=1 When loading guest XSAVE state via KVM_SET_XSAVE, and when updating XFD in response to a guest WRMSR, clear XFD-disabled features in the saved (or to be restored) XSTATE_BV to ensure KVM doesn't attempt to load state for features that are disabled via the guest's XFD. Because the kernel executes XRSTOR with the guest's XFD, saving XSTATE_BV[i]=1 with XFD[i]=1 will cause XRSTOR to #NM and panic the kernel. E.g. if fpu_update_guest_xfd() sets XFD without clearing XSTATE_BV: ------------[ cut here ]------------ WARNING: arch/x86/kernel/traps.c:1524 at exc_device_not_available+0x101/0x110, CPU#29: amx_test/848 Modules linked in: kvm_intel kvm irqbypass CPU: 29 UID: 1000 PID: 848 Comm: amx_test Not tainted 6.19.0-rc2-ffa07f7fd437-x86_amx_nm_xfd_non_init-vm #171 NONE Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0010:exc_device_not_available+0x101/0x110 Call Trace: asm_exc_device_not_available+0x1a/0x20 RIP: 0010:restore_fpregs_from_fpstate+0x36/0x90 switch_fpu_return+0x4a/0xb0 kvm_arch_vcpu_ioctl_run+0x1245/0x1e40 [kvm] kvm_vcpu_ioctl+0x2c3/0x8f0 [kvm] __x64_sys_ioctl+0x8f/0xd0 do_syscall_64+0x62/0x940 entry_SYSCALL_64_after_hwframe+0x4b/0x53 ---[ end trace 0000000000000000 ]--- This can happen if the guest executes WRMSR(MSR_IA32_XFD) to set XFD[18] = 1, and a host IRQ triggers kernel_fpu_begin() prior to the vmexit handler's call to fpu_update_guest_xfd(). and if userspace stuffs XSTATE_BV[i]=1 via KVM_SET_XSAVE: ------------[ cut here ]------------ WARNING: arch/x86/kernel/traps.c:1524 at exc_device_not_available+0x101/0x110, CPU#14: amx_test/867 Modules linked in: kvm_intel kvm irqbypass CPU: 14 UID: 1000 PID: 867 Comm: amx_test Not tainted 6.19.0-rc2-2dace9faccd6-x86_amx_nm_xfd_non_init-vm #168 NONE Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0010:exc_device_not_available+0x101/0x110 Call Trace: asm_exc_device_not_available+0x1a/0x20 RIP: 0010:restore_fpregs_from_fpstate+0x36/0x90 fpu_swap_kvm_fpstate+0x6b/0x120 kvm_load_guest_fpu+0x30/0x80 [kvm] kvm_arch_vcpu_ioctl_run+0x85/0x1e40 [kvm] kvm_vcpu_ioctl+0x2c3/0x8f0 [kvm] __x64_sys_ioctl+0x8f/0xd0 do_syscall_64+0x62/0x940 entry_SYSCALL_64_after_hwframe+0x4b/0x53 ---[ end trace 0000000000000000 ]--- The new behavior is consistent with the AMX architecture. Per Intel's SDM, XSAVE saves XSTATE_BV as '0' for components that are disabled via XFD (and non-compacted XSAVE saves the initial configuration of the state component): If XSAVE, XSAVEC, XSAVEOPT, or XSAVES is saving the state component i, the instruction does not generate #NM when XCR0[i] = IA32_XFD[i] = 1; instead, it operates as if XINUSE[i] = 0 (and the state component was in its initial state): it saves bit i of XSTATE_BV field of the XSAVE header as 0; in addition, XSAVE saves the initial configuration of the state component (the other instructions do not save state component i). Alternatively, KVM could always do XRSTOR with XFD=0, e.g. by using a constant XFD based on the set of enabled features when XSAVEing for a struct fpu_guest. However, having XSTATE_BV[i]=1 for XFD-disabled features can only happen in the above interrupt case, or in similar scenarios involving preemption on preemptible kernels, because fpu_swap_kvm_fpstate()'s call to save_fpregs_to_fpstate() saves the outgoing FPU state with the current XFD; and that is (on all but the first WRMSR to XFD) the guest XFD. Therefore, XFD can only go out of sync with XSTATE_BV in the above interrupt case, or in similar scenarios involving preemption on preemptible kernels, and it we can consider it (de facto) part of KVM ABI that KVM_GET_XSAVE returns XSTATE_BV[i]=0 for XFD-disabled features. Reported-by: Paolo Bonzini Cc: stable@vger.kernel.org Fixes: 820a6ee944e7 ("kvm: x86: Add emulation for IA32_XFD", 2022-01-14) Signed-off-by: Sean Christopherson [Move clearing of XSTATE_BV from fpu_copy_uabi_to_guest_fpstate to kvm_vcpu_ioctl_x86_set_xsave. - Paolo] Reviewed-by: Binbin Wu Signed-off-by: Paolo Bonzini --- arch/x86/kernel/fpu/core.c | 32 +++++++++++++++++++++++++++++--- arch/x86/kvm/x86.c | 9 +++++++++ 2 files changed, 38 insertions(+), 3 deletions(-) diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c index da233f20ae6f2..608983806fd7f 100644 --- a/arch/x86/kernel/fpu/core.c +++ b/arch/x86/kernel/fpu/core.c @@ -319,10 +319,29 @@ EXPORT_SYMBOL_FOR_KVM(fpu_enable_guest_xfd_features); #ifdef CONFIG_X86_64 void fpu_update_guest_xfd(struct fpu_guest *guest_fpu, u64 xfd) { + struct fpstate *fpstate = guest_fpu->fpstate; + fpregs_lock(); - guest_fpu->fpstate->xfd = xfd; - if (guest_fpu->fpstate->in_use) - xfd_update_state(guest_fpu->fpstate); + + /* + * KVM's guest ABI is that setting XFD[i]=1 *can* immediately revert the + * save state to its initial configuration. Likewise, KVM_GET_XSAVE does + * the same as XSAVE and returns XSTATE_BV[i]=0 whenever XFD[i]=1. + * + * If the guest's FPU state is in hardware, just update XFD: the XSAVE + * in fpu_swap_kvm_fpstate will clear XSTATE_BV[i] whenever XFD[i]=1. + * + * If however the guest's FPU state is NOT resident in hardware, clear + * disabled components in XSTATE_BV now, or a subsequent XRSTOR will + * attempt to load disabled components and generate #NM _in the host_. + */ + if (xfd && test_thread_flag(TIF_NEED_FPU_LOAD)) + fpstate->regs.xsave.header.xfeatures &= ~xfd; + + fpstate->xfd = xfd; + if (fpstate->in_use) + xfd_update_state(fpstate); + fpregs_unlock(); } EXPORT_SYMBOL_FOR_KVM(fpu_update_guest_xfd); @@ -430,6 +449,13 @@ int fpu_copy_uabi_to_guest_fpstate(struct fpu_guest *gfpu, const void *buf, if (ustate->xsave.header.xfeatures & ~xcr0) return -EINVAL; + /* + * Disabled features must be in their initial state, otherwise XRSTOR + * causes an exception. + */ + if (WARN_ON_ONCE(ustate->xsave.header.xfeatures & kstate->xfd)) + return -EINVAL; + /* * Nullify @vpkru to preserve its current value if PKRU's bit isn't set * in the header. KVM's odd ABI is to leave PKRU untouched in this diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index ff8812f3a1293..63afdb6bb078b 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -5807,9 +5807,18 @@ static int kvm_vcpu_ioctl_x86_get_xsave(struct kvm_vcpu *vcpu, static int kvm_vcpu_ioctl_x86_set_xsave(struct kvm_vcpu *vcpu, struct kvm_xsave *guest_xsave) { + union fpregs_state *xstate = (union fpregs_state *)guest_xsave->region; + if (fpstate_is_confidential(&vcpu->arch.guest_fpu)) return vcpu->kvm->arch.has_protected_state ? -EINVAL : 0; + /* + * For backwards compatibility, do not expect disabled features to be in + * their initial state. XSTATE_BV[i] must still be cleared whenever + * XFD[i]=1, or XRSTOR would cause a #NM. + */ + xstate->xsave.header.xfeatures &= ~vcpu->arch.guest_fpu.fpstate->xfd; + return fpu_copy_uabi_to_guest_fpstate(&vcpu->arch.guest_fpu, guest_xsave->region, kvm_caps.supported_xcr0, From 9bd168a6175aad2adcfac569c47d8ea8c9fb99b3 Mon Sep 17 00:00:00 2001 From: Paolo Bonzini Date: Wed, 24 Dec 2025 00:44:49 +0100 Subject: [PATCH 0957/1321] selftests: kvm: replace numbered sync points with actions Rework the guest=>host syncs in the AMX test to use named actions instead of arbitrary, incrementing numbers. The "stage" of the test has no real meaning, what matters is what action the test wants the host to perform. The incrementing numbers are somewhat helpful for triaging failures, but fully debugging failures almost always requires a much deeper dive into the test (and KVM). Using named actions not only makes it easier to extend the test without having to shift all sync point numbers, it makes the code easier to read. [Commit message by Sean Christopherson] Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini --- tools/testing/selftests/kvm/x86/amx_test.c | 88 +++++++++++----------- 1 file changed, 43 insertions(+), 45 deletions(-) diff --git a/tools/testing/selftests/kvm/x86/amx_test.c b/tools/testing/selftests/kvm/x86/amx_test.c index f4ce5a185a7dd..3de4402ac17de 100644 --- a/tools/testing/selftests/kvm/x86/amx_test.c +++ b/tools/testing/selftests/kvm/x86/amx_test.c @@ -124,6 +124,14 @@ static void set_tilecfg(struct tile_config *cfg) } } +enum { + /* Check TMM0 against tiledata */ + TEST_COMPARE_TILEDATA = 1, + + /* Full VM save/restore */ + TEST_SAVE_RESTORE = 2, +}; + static void __attribute__((__flatten__)) guest_code(struct tile_config *amx_cfg, struct tile_data *tiledata, struct xstate *xstate) @@ -131,20 +139,20 @@ static void __attribute__((__flatten__)) guest_code(struct tile_config *amx_cfg, GUEST_ASSERT(this_cpu_has(X86_FEATURE_XSAVE) && this_cpu_has(X86_FEATURE_OSXSAVE)); check_xtile_info(); - GUEST_SYNC(1); + GUEST_SYNC(TEST_SAVE_RESTORE); /* xfd=0, enable amx */ wrmsr(MSR_IA32_XFD, 0); - GUEST_SYNC(2); + GUEST_SYNC(TEST_SAVE_RESTORE); GUEST_ASSERT(rdmsr(MSR_IA32_XFD) == 0); set_tilecfg(amx_cfg); __ldtilecfg(amx_cfg); - GUEST_SYNC(3); + GUEST_SYNC(TEST_SAVE_RESTORE); /* Check save/restore when trap to userspace */ __tileloadd(tiledata); - GUEST_SYNC(4); + GUEST_SYNC(TEST_COMPARE_TILEDATA | TEST_SAVE_RESTORE); __tilerelease(); - GUEST_SYNC(5); + GUEST_SYNC(TEST_SAVE_RESTORE); /* * After XSAVEC, XTILEDATA is cleared in the xstate_bv but is set in * the xcomp_bv. @@ -154,6 +162,8 @@ static void __attribute__((__flatten__)) guest_code(struct tile_config *amx_cfg, GUEST_ASSERT(!(xstate->header.xstate_bv & XFEATURE_MASK_XTILE_DATA)); GUEST_ASSERT(xstate->header.xcomp_bv & XFEATURE_MASK_XTILE_DATA); + /* #NM test */ + /* xfd=0x40000, disable amx tiledata */ wrmsr(MSR_IA32_XFD, XFEATURE_MASK_XTILE_DATA); @@ -166,13 +176,13 @@ static void __attribute__((__flatten__)) guest_code(struct tile_config *amx_cfg, GUEST_ASSERT(!(xstate->header.xstate_bv & XFEATURE_MASK_XTILE_DATA)); GUEST_ASSERT((xstate->header.xcomp_bv & XFEATURE_MASK_XTILE_DATA)); - GUEST_SYNC(6); + GUEST_SYNC(TEST_SAVE_RESTORE); GUEST_ASSERT(rdmsr(MSR_IA32_XFD) == XFEATURE_MASK_XTILE_DATA); set_tilecfg(amx_cfg); __ldtilecfg(amx_cfg); /* Trigger #NM exception */ __tileloadd(tiledata); - GUEST_SYNC(10); + GUEST_SYNC(TEST_COMPARE_TILEDATA | TEST_SAVE_RESTORE); GUEST_DONE(); } @@ -180,18 +190,18 @@ static void __attribute__((__flatten__)) guest_code(struct tile_config *amx_cfg, void guest_nm_handler(struct ex_regs *regs) { /* Check if #NM is triggered by XFEATURE_MASK_XTILE_DATA */ - GUEST_SYNC(7); + GUEST_SYNC(TEST_SAVE_RESTORE); GUEST_ASSERT(!(get_cr0() & X86_CR0_TS)); GUEST_ASSERT(rdmsr(MSR_IA32_XFD_ERR) == XFEATURE_MASK_XTILE_DATA); GUEST_ASSERT(rdmsr(MSR_IA32_XFD) == XFEATURE_MASK_XTILE_DATA); - GUEST_SYNC(8); + GUEST_SYNC(TEST_SAVE_RESTORE); GUEST_ASSERT(rdmsr(MSR_IA32_XFD_ERR) == XFEATURE_MASK_XTILE_DATA); GUEST_ASSERT(rdmsr(MSR_IA32_XFD) == XFEATURE_MASK_XTILE_DATA); /* Clear xfd_err */ wrmsr(MSR_IA32_XFD_ERR, 0); /* xfd=0, enable amx */ wrmsr(MSR_IA32_XFD, 0); - GUEST_SYNC(9); + GUEST_SYNC(TEST_SAVE_RESTORE); } int main(int argc, char *argv[]) @@ -244,6 +254,7 @@ int main(int argc, char *argv[]) memset(addr_gva2hva(vm, xstate), 0, PAGE_SIZE * DIV_ROUND_UP(XSAVE_SIZE, PAGE_SIZE)); vcpu_args_set(vcpu, 3, amx_cfg, tiledata, xstate); + int iter = 0; for (;;) { vcpu_run(vcpu); TEST_ASSERT_KVM_EXIT_REASON(vcpu, KVM_EXIT_IO); @@ -253,20 +264,9 @@ int main(int argc, char *argv[]) REPORT_GUEST_ASSERT(uc); /* NOT REACHED */ case UCALL_SYNC: - switch (uc.args[1]) { - case 1: - case 2: - case 3: - case 5: - case 6: - case 7: - case 8: - fprintf(stderr, "GUEST_SYNC(%ld)\n", uc.args[1]); - break; - case 4: - case 10: - fprintf(stderr, - "GUEST_SYNC(%ld), check save/restore status\n", uc.args[1]); + ++iter; + if (uc.args[1] & TEST_COMPARE_TILEDATA) { + fprintf(stderr, "GUEST_SYNC #%d, check TMM0 contents\n", iter); /* Compacted mode, get amx offset by xsave area * size subtract 8K amx size. @@ -279,11 +279,25 @@ int main(int argc, char *argv[]) ret = memcmp(amx_start, tiles_data, TILE_SIZE); TEST_ASSERT(ret == 0, "memcmp failed, ret=%d", ret); kvm_x86_state_cleanup(state); - break; - case 9: - fprintf(stderr, - "GUEST_SYNC(%ld), #NM exception and enable amx\n", uc.args[1]); - break; + } + if (uc.args[1] & TEST_SAVE_RESTORE) { + fprintf(stderr, "GUEST_SYNC #%d, save/restore VM state\n", iter); + state = vcpu_save_state(vcpu); + memset(®s1, 0, sizeof(regs1)); + vcpu_regs_get(vcpu, ®s1); + + kvm_vm_release(vm); + + /* Restore state in a new VM. */ + vcpu = vm_recreate_with_one_vcpu(vm); + vcpu_load_state(vcpu, state); + kvm_x86_state_cleanup(state); + + memset(®s2, 0, sizeof(regs2)); + vcpu_regs_get(vcpu, ®s2); + TEST_ASSERT(!memcmp(®s1, ®s2, sizeof(regs2)), + "Unexpected register values after vcpu_load_state; rdi: %lx rsi: %lx", + (ulong) regs2.rdi, (ulong) regs2.rsi); } break; case UCALL_DONE: @@ -293,22 +307,6 @@ int main(int argc, char *argv[]) TEST_FAIL("Unknown ucall %lu", uc.cmd); } - state = vcpu_save_state(vcpu); - memset(®s1, 0, sizeof(regs1)); - vcpu_regs_get(vcpu, ®s1); - - kvm_vm_release(vm); - - /* Restore state in a new VM. */ - vcpu = vm_recreate_with_one_vcpu(vm); - vcpu_load_state(vcpu, state); - kvm_x86_state_cleanup(state); - - memset(®s2, 0, sizeof(regs2)); - vcpu_regs_get(vcpu, ®s2); - TEST_ASSERT(!memcmp(®s1, ®s2, sizeof(regs2)), - "Unexpected register values after vcpu_load_state; rdi: %lx rsi: %lx", - (ulong) regs2.rdi, (ulong) regs2.rsi); } done: kvm_vm_free(vm); From b72b192bd794dc6c28acf717b56265fc32e966a6 Mon Sep 17 00:00:00 2001 From: Paolo Bonzini Date: Wed, 31 Dec 2025 16:47:26 +0100 Subject: [PATCH 0958/1321] selftests: kvm: try getting XFD and XSAVE state out of sync The host is allowed to set FPU state that includes a disabled xstate component. Check that this does not cause bad effects. Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini --- tools/testing/selftests/kvm/x86/amx_test.c | 38 +++++++++++++++++----- 1 file changed, 30 insertions(+), 8 deletions(-) diff --git a/tools/testing/selftests/kvm/x86/amx_test.c b/tools/testing/selftests/kvm/x86/amx_test.c index 3de4402ac17de..bee56c1f78339 100644 --- a/tools/testing/selftests/kvm/x86/amx_test.c +++ b/tools/testing/selftests/kvm/x86/amx_test.c @@ -125,11 +125,17 @@ static void set_tilecfg(struct tile_config *cfg) } enum { + /* Retrieve TMM0 from guest, stash it for TEST_RESTORE_TILEDATA */ + TEST_SAVE_TILEDATA = 1, + /* Check TMM0 against tiledata */ - TEST_COMPARE_TILEDATA = 1, + TEST_COMPARE_TILEDATA = 2, + + /* Restore TMM0 from earlier save */ + TEST_RESTORE_TILEDATA = 4, /* Full VM save/restore */ - TEST_SAVE_RESTORE = 2, + TEST_SAVE_RESTORE = 8, }; static void __attribute__((__flatten__)) guest_code(struct tile_config *amx_cfg, @@ -150,7 +156,16 @@ static void __attribute__((__flatten__)) guest_code(struct tile_config *amx_cfg, GUEST_SYNC(TEST_SAVE_RESTORE); /* Check save/restore when trap to userspace */ __tileloadd(tiledata); - GUEST_SYNC(TEST_COMPARE_TILEDATA | TEST_SAVE_RESTORE); + GUEST_SYNC(TEST_SAVE_TILEDATA | TEST_COMPARE_TILEDATA | TEST_SAVE_RESTORE); + + /* xfd=0x40000, disable amx tiledata */ + wrmsr(MSR_IA32_XFD, XFEATURE_MASK_XTILE_DATA); + + /* host tries setting tiledata while guest XFD is set */ + GUEST_SYNC(TEST_RESTORE_TILEDATA); + GUEST_SYNC(TEST_SAVE_RESTORE); + + wrmsr(MSR_IA32_XFD, 0); __tilerelease(); GUEST_SYNC(TEST_SAVE_RESTORE); /* @@ -210,10 +225,10 @@ int main(int argc, char *argv[]) struct kvm_vcpu *vcpu; struct kvm_vm *vm; struct kvm_x86_state *state; + struct kvm_x86_state *tile_state = NULL; int xsave_restore_size; vm_vaddr_t amx_cfg, tiledata, xstate; struct ucall uc; - u32 amx_offset; int ret; /* @@ -265,20 +280,27 @@ int main(int argc, char *argv[]) /* NOT REACHED */ case UCALL_SYNC: ++iter; + if (uc.args[1] & TEST_SAVE_TILEDATA) { + fprintf(stderr, "GUEST_SYNC #%d, save tiledata\n", iter); + tile_state = vcpu_save_state(vcpu); + } if (uc.args[1] & TEST_COMPARE_TILEDATA) { fprintf(stderr, "GUEST_SYNC #%d, check TMM0 contents\n", iter); /* Compacted mode, get amx offset by xsave area * size subtract 8K amx size. */ - amx_offset = xsave_restore_size - NUM_TILES*TILE_SIZE; - state = vcpu_save_state(vcpu); - void *amx_start = (void *)state->xsave + amx_offset; + u32 amx_offset = xsave_restore_size - NUM_TILES*TILE_SIZE; + void *amx_start = (void *)tile_state->xsave + amx_offset; void *tiles_data = (void *)addr_gva2hva(vm, tiledata); /* Only check TMM0 register, 1 tile */ ret = memcmp(amx_start, tiles_data, TILE_SIZE); TEST_ASSERT(ret == 0, "memcmp failed, ret=%d", ret); - kvm_x86_state_cleanup(state); + } + if (uc.args[1] & TEST_RESTORE_TILEDATA) { + fprintf(stderr, "GUEST_SYNC #%d, before KVM_SET_XSAVE\n", iter); + vcpu_xsave_set(vcpu, tile_state->xsave); + fprintf(stderr, "GUEST_SYNC #%d, after KVM_SET_XSAVE\n", iter); } if (uc.args[1] & TEST_SAVE_RESTORE) { fprintf(stderr, "GUEST_SYNC #%d, save/restore VM state\n", iter); From 47bcf43781b162336480e8cf0f063cc13603873a Mon Sep 17 00:00:00 2001 From: Sean Christopherson Date: Mon, 29 Dec 2025 12:23:30 -0800 Subject: [PATCH 0959/1321] selftests: kvm: Verify TILELOADD actually #NM faults when XFD[18]=1 Rework the AMX test's #NM handling to use kvm_asm_safe() to verify an #NM actually occurs. As is, a completely missing #NM could go unnoticed. Signed-off-by: Sean Christopherson Signed-off-by: Paolo Bonzini --- tools/testing/selftests/kvm/x86/amx_test.c | 30 +++++++++++++--------- 1 file changed, 18 insertions(+), 12 deletions(-) diff --git a/tools/testing/selftests/kvm/x86/amx_test.c b/tools/testing/selftests/kvm/x86/amx_test.c index bee56c1f78339..37b166260ee3f 100644 --- a/tools/testing/selftests/kvm/x86/amx_test.c +++ b/tools/testing/selftests/kvm/x86/amx_test.c @@ -69,6 +69,12 @@ static inline void __tileloadd(void *tile) : : "a"(tile), "d"(0)); } +static inline int tileloadd_safe(void *tile) +{ + return kvm_asm_safe(".byte 0xc4,0xe2,0x7b,0x4b,0x04,0x10", + "a"(tile), "d"(0)); +} + static inline void __tilerelease(void) { asm volatile(".byte 0xc4, 0xe2, 0x78, 0x49, 0xc0" ::); @@ -142,6 +148,8 @@ static void __attribute__((__flatten__)) guest_code(struct tile_config *amx_cfg, struct tile_data *tiledata, struct xstate *xstate) { + int vector; + GUEST_ASSERT(this_cpu_has(X86_FEATURE_XSAVE) && this_cpu_has(X86_FEATURE_OSXSAVE)); check_xtile_info(); @@ -195,17 +203,13 @@ static void __attribute__((__flatten__)) guest_code(struct tile_config *amx_cfg, GUEST_ASSERT(rdmsr(MSR_IA32_XFD) == XFEATURE_MASK_XTILE_DATA); set_tilecfg(amx_cfg); __ldtilecfg(amx_cfg); - /* Trigger #NM exception */ - __tileloadd(tiledata); - GUEST_SYNC(TEST_COMPARE_TILEDATA | TEST_SAVE_RESTORE); - GUEST_DONE(); -} + /* Trigger #NM exception */ + vector = tileloadd_safe(tiledata); + __GUEST_ASSERT(vector == NM_VECTOR, + "Wanted #NM on tileloadd with XFD[18]=1, got %s", + ex_str(vector)); -void guest_nm_handler(struct ex_regs *regs) -{ - /* Check if #NM is triggered by XFEATURE_MASK_XTILE_DATA */ - GUEST_SYNC(TEST_SAVE_RESTORE); GUEST_ASSERT(!(get_cr0() & X86_CR0_TS)); GUEST_ASSERT(rdmsr(MSR_IA32_XFD_ERR) == XFEATURE_MASK_XTILE_DATA); GUEST_ASSERT(rdmsr(MSR_IA32_XFD) == XFEATURE_MASK_XTILE_DATA); @@ -217,6 +221,11 @@ void guest_nm_handler(struct ex_regs *regs) /* xfd=0, enable amx */ wrmsr(MSR_IA32_XFD, 0); GUEST_SYNC(TEST_SAVE_RESTORE); + + __tileloadd(tiledata); + GUEST_SYNC(TEST_COMPARE_TILEDATA | TEST_SAVE_RESTORE); + + GUEST_DONE(); } int main(int argc, char *argv[]) @@ -253,9 +262,6 @@ int main(int argc, char *argv[]) vcpu_regs_get(vcpu, ®s1); - /* Register #NM handler */ - vm_install_exception_handler(vm, NM_VECTOR, guest_nm_handler); - /* amx cfg for guest_code */ amx_cfg = vm_vaddr_alloc_page(vm); memset(addr_gva2hva(vm, amx_cfg), 0x0, getpagesize()); From 73e80a6eb63e3db378c8ba43913c63e8b8999f76 Mon Sep 17 00:00:00 2001 From: Andreas Gruenbacher Date: Mon, 12 Jan 2026 11:47:35 +0100 Subject: [PATCH 0960/1321] Revert "gfs2: Fix use of bio_chain" This reverts commit 8a157e0a0aa5143b5d94201508c0ca1bb8cfb941. That commit incorrectly assumed that the bio_chain() arguments were swapped in gfs2. However, gfs2 intentionally constructs bio chains so that the first bio's bi_end_io callback is invoked when all bios in the chain have completed, unlike bio chains where the last bio's callback is invoked. Fixes: 8a157e0a0aa5 ("gfs2: Fix use of bio_chain") Cc: stable@vger.kernel.org Signed-off-by: Andreas Gruenbacher --- fs/gfs2/lops.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/gfs2/lops.c b/fs/gfs2/lops.c index 97ebe457c00ad..d27a0b1080a97 100644 --- a/fs/gfs2/lops.c +++ b/fs/gfs2/lops.c @@ -484,7 +484,7 @@ static struct bio *gfs2_chain_bio(struct bio *prev, unsigned int nr_iovecs) new = bio_alloc(prev->bi_bdev, nr_iovecs, prev->bi_opf, GFP_NOIO); bio_clone_blkg_association(new, prev); new->bi_iter.bi_sector = bio_end_sector(prev); - bio_chain(prev, new); + bio_chain(new, prev); submit_bio(prev); return new; } From d62a3e128a46276ee272d41a20e33afab810729c Mon Sep 17 00:00:00 2001 From: Menglong Dong Date: Fri, 19 Dec 2025 22:29:48 +0800 Subject: [PATCH 0961/1321] riscv, bpf: Fix incorrect usage of BPF_TRAMP_F_ORIG_STACK The usage of BPF_TRAMP_F_ORIG_STACK in __arch_prepare_bpf_trampoline() is wrong, and it should be BPF_TRAMP_F_CALL_ORIG, which caused crash as Andreas reported: Insufficient stack space to handle exception! Task stack: [0xff20000000010000..0xff20000000014000] Overflow stack: [0xff600000ffdad070..0xff600000ffdae070] CPU: 1 UID: 0 PID: 1 Comm: systemd Not tainted 6.18.0-rc5+ #15 PREEMPT(voluntary) Hardware name: riscv-virtio qemu/qemu, BIOS 2025.10 10/01/2025 epc : copy_from_kernel_nofault+0xa/0x198 ra : bpf_probe_read_kernel+0x20/0x60 epc : ffffffff802b732a ra : ffffffff801e6070 sp : ff2000000000ffe0 gp : ffffffff82262ed0 tp : 0000000000000000 t0 : ffffffff80022320 t1 : ffffffff801e6056 t2 : 0000000000000000 s0 : ff20000000010040 s1 : 0000000000000008 a0 : ff20000000010050 a1 : ff60000083b3d320 a2 : 0000000000000008 a3 : 0000000000000097 a4 : 0000000000000000 a5 : 0000000000000000 a6 : 0000000000000021 a7 : 0000000000000003 s2 : ff20000000010050 s3 : ff6000008459fc18 s4 : ff60000083b3d340 s5 : ff20000000010060 s6 : 0000000000000000 s7 : ff20000000013aa8 s8 : 0000000000000000 s9 : 0000000000008000 s10: 000000000058dcb0 s11: 000000000058dca7 t3 : 000000006925116d t4 : ff6000008090f026 t5 : 00007fff9b0cbaa8 t6 : 0000000000000016 status: 0000000200000120 badaddr: 0000000000000000 cause: 8000000000000005 Kernel panic - not syncing: Kernel stack overflow CPU: 1 UID: 0 PID: 1 Comm: systemd Not tainted 6.18.0-rc5+ #15 PREEMPT(voluntary) Hardware name: riscv-virtio qemu/qemu, BIOS 2025.10 10/01/2025 Call Trace: [] dump_backtrace+0x28/0x38 [] show_stack+0x3a/0x50 [] dump_stack_lvl+0x56/0x80 [] dump_stack+0x18/0x22 [] vpanic+0xf6/0x328 [] panic+0x3e/0x40 [] handle_bad_stack+0x98/0xa0 [] bpf_probe_read_kernel+0x20/0x60 Just fix it. Fixes: 47c9214dcbea ("bpf: fix the usage of BPF_TRAMP_F_SKIP_FRAME") Link: https://lore.kernel.org/bpf/20251219142948.204312-1-dongml2@chinatelecom.cn Closes: https://lore.kernel.org/bpf/874ipnkfvt.fsf@igel.home/ Reported-by: Andreas Schwab Signed-off-by: Menglong Dong Signed-off-by: Andrii Nakryiko Signed-off-by: Alexei Starovoitov --- arch/riscv/net/bpf_jit_comp64.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/arch/riscv/net/bpf_jit_comp64.c b/arch/riscv/net/bpf_jit_comp64.c index 5f9457e910e87..37888abee70c3 100644 --- a/arch/riscv/net/bpf_jit_comp64.c +++ b/arch/riscv/net/bpf_jit_comp64.c @@ -1133,10 +1133,6 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, store_args(nr_arg_slots, args_off, ctx); - /* skip to actual body of traced function */ - if (flags & BPF_TRAMP_F_ORIG_STACK) - orig_call += RV_FENTRY_NINSNS * 4; - if (flags & BPF_TRAMP_F_CALL_ORIG) { emit_imm(RV_REG_A0, ctx->insns ? (const s64)im : RV_MAX_COUNT_IMM, ctx); ret = emit_call((const u64)__bpf_tramp_enter, true, ctx); @@ -1171,6 +1167,8 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, } if (flags & BPF_TRAMP_F_CALL_ORIG) { + /* skip to actual body of traced function */ + orig_call += RV_FENTRY_NINSNS * 4; restore_args(min_t(int, nr_arg_slots, RV_MAX_REG_ARGS), args_off, ctx); restore_stack_args(nr_arg_slots - RV_MAX_REG_ARGS, args_off, stk_arg_off, ctx); ret = emit_call((const u64)orig_call, true, ctx); From 62dae866a7bcda288faefa1fb47d26c427bba565 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Toke=20H=C3=B8iland-J=C3=B8rgensen?= Date: Mon, 5 Jan 2026 12:47:45 +0100 Subject: [PATCH 0962/1321] bpf, test_run: Subtract size of xdp_frame from allowed metadata size MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The xdp_frame structure takes up part of the XDP frame headroom, limiting the size of the metadata. However, in bpf_test_run, we don't take this into account, which makes it possible for userspace to supply a metadata size that is too large (taking up the entire headroom). If userspace supplies such a large metadata size in live packet mode, the xdp_update_frame_from_buff() call in xdp_test_run_init_page() call will fail, after which packet transmission proceeds with an uninitialised frame structure, leading to the usual Bad Stuff. The commit in the Fixes tag fixed a related bug where the second check in xdp_update_frame_from_buff() could fail, but did not add any additional constraints on the metadata size. Complete the fix by adding an additional check on the metadata size. Reorder the checks slightly to make the logic clearer and add a comment. Link: https://lore.kernel.org/r/fa2be179-bad7-4ee3-8668-4903d1853461@hust.edu.cn Fixes: b6f1f780b393 ("bpf, test_run: Fix packet size check for live packet mode") Reported-by: Yinhao Hu Reported-by: Kaiyan Mei Signed-off-by: Toke Høiland-Jørgensen Reviewed-by: Amery Hung Link: https://lore.kernel.org/r/20260105114747.1358750-1-toke@redhat.com Signed-off-by: Alexei Starovoitov --- net/bpf/test_run.c | 18 +++++++++++++----- 1 file changed, 13 insertions(+), 5 deletions(-) diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c index 655efac6f1334..e6c0ad204b928 100644 --- a/net/bpf/test_run.c +++ b/net/bpf/test_run.c @@ -1294,8 +1294,6 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr, batch_size = NAPI_POLL_WEIGHT; else if (batch_size > TEST_XDP_MAX_BATCH) return -E2BIG; - - headroom += sizeof(struct xdp_page_head); } else if (batch_size) { return -EINVAL; } @@ -1308,16 +1306,26 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr, /* There can't be user provided data before the meta data */ if (ctx->data_meta || ctx->data_end > kattr->test.data_size_in || ctx->data > ctx->data_end || - unlikely(xdp_metalen_invalid(ctx->data)) || (do_live && (kattr->test.data_out || kattr->test.ctx_out))) goto free_ctx; - /* Meta data is allocated from the headroom */ - headroom -= ctx->data; meta_sz = ctx->data; + if (xdp_metalen_invalid(meta_sz) || meta_sz > headroom - sizeof(struct xdp_frame)) + goto free_ctx; + + /* Meta data is allocated from the headroom */ + headroom -= meta_sz; linear_sz = ctx->data_end; } + /* The xdp_page_head structure takes up space in each page, limiting the + * size of the packet data; add the extra size to headroom here to make + * sure it's accounted in the length checks below, but not in the + * metadata size check above. + */ + if (do_live) + headroom += sizeof(struct xdp_page_head); + max_linear_sz = PAGE_SIZE - headroom - tailroom; linear_sz = min_t(u32, linear_sz, max_linear_sz); From 7b8d11cd5d814f6c982291fcb1609b3e03211f1a Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Toke=20H=C3=B8iland-J=C3=B8rgensen?= Date: Mon, 5 Jan 2026 12:47:46 +0100 Subject: [PATCH 0963/1321] selftests/bpf: Update xdp_context_test_run test to check maximum metadata size MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Update the selftest to check that the metadata size check takes the xdp_frame size into account in bpf_prog_test_run. The original check (for meta size 256) was broken because the data frame supplied was smaller than this, triggering a different EINVAL return. So supply a larger data frame for this test to make sure we actually exercise the check we think we are. Signed-off-by: Toke Høiland-Jørgensen Reviewed-by: Amery Hung Link: https://lore.kernel.org/r/20260105114747.1358750-2-toke@redhat.com Signed-off-by: Alexei Starovoitov --- .../bpf/prog_tests/xdp_context_test_run.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c b/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c index ee94c281888ae..26159e0499c76 100644 --- a/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c +++ b/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c @@ -47,6 +47,7 @@ void test_xdp_context_test_run(void) struct test_xdp_context_test_run *skel = NULL; char data[sizeof(pkt_v4) + sizeof(__u32)]; char bad_ctx[sizeof(struct xdp_md) + 1]; + char large_data[256]; struct xdp_md ctx_in, ctx_out; DECLARE_LIBBPF_OPTS(bpf_test_run_opts, opts, .data_in = &data, @@ -94,9 +95,6 @@ void test_xdp_context_test_run(void) test_xdp_context_error(prog_fd, opts, 4, sizeof(__u32), sizeof(data), 0, 0, 0); - /* Meta data must be 255 bytes or smaller */ - test_xdp_context_error(prog_fd, opts, 0, 256, sizeof(data), 0, 0, 0); - /* Total size of data must be data_end - data_meta or larger */ test_xdp_context_error(prog_fd, opts, 0, sizeof(__u32), sizeof(data) + 1, 0, 0, 0); @@ -116,6 +114,16 @@ void test_xdp_context_test_run(void) test_xdp_context_error(prog_fd, opts, 0, sizeof(__u32), sizeof(data), 0, 0, 1); + /* Meta data must be 216 bytes or smaller (256 - sizeof(struct + * xdp_frame)). Test both nearest invalid size and nearest invalid + * 4-byte-aligned size, and make sure data_in is large enough that we + * actually hit the check on metadata length + */ + opts.data_in = large_data; + opts.data_size_in = sizeof(large_data); + test_xdp_context_error(prog_fd, opts, 0, 217, sizeof(large_data), 0, 0, 0); + test_xdp_context_error(prog_fd, opts, 0, 220, sizeof(large_data), 0, 0, 0); + test_xdp_context_test_run__destroy(skel); } From 378d13c399ce575799e84195e835ae204602fd9f Mon Sep 17 00:00:00 2001 From: Deepanshu Kartikey Date: Wed, 7 Jan 2026 07:40:37 +0530 Subject: [PATCH 0964/1321] bpf: Reject BPF_MAP_TYPE_INSN_ARRAY in check_reg_const_str() BPF_MAP_TYPE_INSN_ARRAY maps store instruction pointers in their ips array, not string data. The map_direct_value_addr callback for this map type returns the address of the ips array, which is not suitable for use as a constant string argument. When a BPF program passes a pointer to an insn_array map value as ARG_PTR_TO_CONST_STR (e.g., to bpf_snprintf), the verifier's null-termination check in check_reg_const_str() operates on the wrong memory region, and at runtime bpf_bprintf_prepare() can read out of bounds searching for a null terminator. Reject BPF_MAP_TYPE_INSN_ARRAY in check_reg_const_str() since this map type is not designed to hold string data. Reported-by: syzbot+2c29addf92581b410079@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=2c29addf92581b410079 Tested-by: syzbot+2c29addf92581b410079@syzkaller.appspotmail.com Fixes: 493d9e0d6083 ("bpf, x86: add support for indirect jumps") Signed-off-by: Deepanshu Kartikey Acked-by: Anton Protopopov Link: https://lore.kernel.org/r/20260107021037.289644-1-kartikey406@gmail.com Signed-off-by: Alexei Starovoitov --- kernel/bpf/verifier.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index f0ca69f888fac..3135643d56955 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -9609,6 +9609,11 @@ static int check_reg_const_str(struct bpf_verifier_env *env, if (reg->type != PTR_TO_MAP_VALUE) return -EINVAL; + if (map->map_type == BPF_MAP_TYPE_INSN_ARRAY) { + verbose(env, "R%d points to insn_array map which cannot be used as const string\n", regno); + return -EACCES; + } + if (!bpf_map_is_rdonly(map)) { verbose(env, "R%d does not point to a readonly map'\n", regno); return -EACCES; From 4f49b95b18b2f27af101f7a2cd998e441775e318 Mon Sep 17 00:00:00 2001 From: Tetsuo Handa Date: Thu, 8 Jan 2026 21:36:48 +0900 Subject: [PATCH 0965/1321] bpf: Fix reference count leak in bpf_prog_test_run_xdp() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit syzbot is reporting unregister_netdevice: waiting for sit0 to become free. Usage count = 2 problem. A debug printk() patch found that a refcount is obtained at xdp_convert_md_to_buff() from bpf_prog_test_run_xdp(). According to commit ec94670fcb3b ("bpf: Support specifying ingress via xdp_md context in BPF_PROG_TEST_RUN"), the refcount obtained by xdp_convert_md_to_buff() will be released by xdp_convert_buff_to_md(). Therefore, we can consider that the error handling path introduced by commit 1c1949982524 ("bpf: introduce frags support to bpf_prog_test_run_xdp()") forgot to call xdp_convert_buff_to_md(). Reported-by: syzbot+881d65229ca4f9ae8c84@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=881d65229ca4f9ae8c84 Fixes: 1c1949982524 ("bpf: introduce frags support to bpf_prog_test_run_xdp()") Signed-off-by: Tetsuo Handa Reviewed-by: Toke Høiland-Jørgensen Link: https://lore.kernel.org/r/af090e53-9d9b-4412-8acb-957733b3975c@I-love.SAKURA.ne.jp Signed-off-by: Alexei Starovoitov --- net/bpf/test_run.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c index e6c0ad204b928..26cfcfdc45eb7 100644 --- a/net/bpf/test_run.c +++ b/net/bpf/test_run.c @@ -1363,13 +1363,13 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr, if (sinfo->nr_frags == MAX_SKB_FRAGS) { ret = -ENOMEM; - goto out; + goto out_put_dev; } page = alloc_page(GFP_KERNEL); if (!page) { ret = -ENOMEM; - goto out; + goto out_put_dev; } frag = &sinfo->frags[sinfo->nr_frags++]; @@ -1381,7 +1381,7 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr, if (copy_from_user(page_address(page), data_in + size, data_len)) { ret = -EFAULT; - goto out; + goto out_put_dev; } sinfo->xdp_frags_size += data_len; size += data_len; @@ -1396,6 +1396,7 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr, ret = bpf_test_run_xdp_live(prog, &xdp, repeat, batch_size, &duration); else ret = bpf_test_run(prog, &xdp, repeat, &retval, &duration, true); +out_put_dev: /* We convert the xdp_buff back to an xdp_md before checking the return * code so the reference count of any held netdevice will be decremented * even if the test run failed. From f69aa0d70b16d70c2d44ec134057edc6f383c5b3 Mon Sep 17 00:00:00 2001 From: Jacopo Mondi Date: Mon, 15 Dec 2025 13:15:57 +0100 Subject: [PATCH 0966/1321] media: Documentation: mali-c55: Use v4l2-isp version identifier The Mali C55 driver uses the v4l2-isp framework, which defines its own versioning numbers. Do not use the Mali C55 specific version identifier in the code example in the documentation of the Mali C55 stats and params metadata formats. Signed-off-by: Jacopo Mondi Reviewed-by: Daniel Scally Signed-off-by: Hans Verkuil --- Documentation/userspace-api/media/v4l/metafmt-arm-mali-c55.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/userspace-api/media/v4l/metafmt-arm-mali-c55.rst b/Documentation/userspace-api/media/v4l/metafmt-arm-mali-c55.rst index 696e0a645a7e0..f8029bcb52828 100644 --- a/Documentation/userspace-api/media/v4l/metafmt-arm-mali-c55.rst +++ b/Documentation/userspace-api/media/v4l/metafmt-arm-mali-c55.rst @@ -44,7 +44,7 @@ member and userspace must populate the type member with a value from struct v4l2_isp_params_buffer *params = (struct v4l2_isp_params_buffer *)buffer; - params->version = MALI_C55_PARAM_BUFFER_V1; + params->version = V4L2_ISP_PARAMS_VERSION_V1; params->data_size = 0; void *data = (void *)params->data; From 02dd584c45710fdc75e574a0adce332035b15256 Mon Sep 17 00:00:00 2001 From: Jacopo Mondi Date: Mon, 15 Dec 2025 13:11:06 +0100 Subject: [PATCH 0967/1321] media: mali-c55: Remove duplicated version check The Mali C55 driver uses the v4l2-isp framework, which performs validation of the parameters buffer versioning in the v4l2_isp_params_validate_buffer() function. It is not necessary to replicate the validation of the parameters buffer versioning in the platform-specific implementation. Remove it. Signed-off-by: Jacopo Mondi Reviewed-by: Daniel Scally Signed-off-by: Hans Verkuil --- drivers/media/platform/arm/mali-c55/mali-c55-params.c | 7 ------- 1 file changed, 7 deletions(-) diff --git a/drivers/media/platform/arm/mali-c55/mali-c55-params.c b/drivers/media/platform/arm/mali-c55/mali-c55-params.c index 082cda4f4f63e..be0e909bcf29f 100644 --- a/drivers/media/platform/arm/mali-c55/mali-c55-params.c +++ b/drivers/media/platform/arm/mali-c55/mali-c55-params.c @@ -582,13 +582,6 @@ static int mali_c55_params_buf_prepare(struct vb2_buffer *vb) struct mali_c55 *mali_c55 = params->mali_c55; int ret; - if (config->version != MALI_C55_PARAM_BUFFER_V1) { - dev_dbg(mali_c55->dev, - "Unsupported extensible format version: %u\n", - config->version); - return -EINVAL; - } - ret = v4l2_isp_params_validate_buffer_size(mali_c55->dev, vb, v4l2_isp_params_buffer_size(MALI_C55_PARAMS_MAX_SIZE)); if (ret) From 897f97b479d19b61c37315d13f7d3c80cbf24d8e Mon Sep 17 00:00:00 2001 From: Jacopo Mondi Date: Mon, 15 Dec 2025 13:08:12 +0100 Subject: [PATCH 0968/1321] media: uapi: mali-c55-config: Remove version identifier The Mali C55 driver uses the v4l2-isp framework, which defines its own versioning number which does not need to be defined again in each platform-specific header. Remove the definition of mali_c55_param_buffer_version enumeration from the Mali C55 uAPI header. Signed-off-by: Jacopo Mondi Reviewed-by: Daniel Scally Signed-off-by: Hans Verkuil --- include/uapi/linux/media/arm/mali-c55-config.h | 9 --------- 1 file changed, 9 deletions(-) diff --git a/include/uapi/linux/media/arm/mali-c55-config.h b/include/uapi/linux/media/arm/mali-c55-config.h index 109082c5694f6..3d335f950eeb1 100644 --- a/include/uapi/linux/media/arm/mali-c55-config.h +++ b/include/uapi/linux/media/arm/mali-c55-config.h @@ -194,15 +194,6 @@ struct mali_c55_stats_buffer { __u32 reserved3[15]; } __attribute__((packed)); -/** - * enum mali_c55_param_buffer_version - Mali-C55 parameters block versioning - * - * @MALI_C55_PARAM_BUFFER_V1: First version of Mali-C55 parameters block - */ -enum mali_c55_param_buffer_version { - MALI_C55_PARAM_BUFFER_V1, -}; - /** * enum mali_c55_param_block_type - Enumeration of Mali-C55 parameter blocks * From e6cd05c893e47205911df702d96612435b523e58 Mon Sep 17 00:00:00 2001 From: Jacopo Mondi Date: Fri, 29 Aug 2025 13:12:14 +0200 Subject: [PATCH 0969/1321] media: rzg2l-cru: csi-2: Support RZ/V2H input sizes The CRU version on the RZ/V2H SoC supports larger input sizes (4096x4096) compared to the version on the RZ/G2L (2800x4095). Store the per-SoC min/max sizes in the device match info and use them in place of the hardcoded ones. While at it, use the min sizes reported by the info structure to replace the RZG2L_CSI2_DEFAULT_WIDTH/HEIGHT macros. Signed-off-by: Jacopo Mondi Tested-by: Tommaso Merciai Reviewed-by: Lad Prabhakar Reviewed-by: Laurent Pinchart Signed-off-by: Hans Verkuil --- .../platform/renesas/rzg2l-cru/rzg2l-csi2.c | 41 ++++++++++++------- 1 file changed, 26 insertions(+), 15 deletions(-) diff --git a/drivers/media/platform/renesas/rzg2l-cru/rzg2l-csi2.c b/drivers/media/platform/renesas/rzg2l-cru/rzg2l-csi2.c index 0fbdae280fdc4..6dc4b53607b4c 100644 --- a/drivers/media/platform/renesas/rzg2l-cru/rzg2l-csi2.c +++ b/drivers/media/platform/renesas/rzg2l-cru/rzg2l-csi2.c @@ -96,13 +96,6 @@ #define VSRSTS_RETRIES 20 -#define RZG2L_CSI2_MIN_WIDTH 320 -#define RZG2L_CSI2_MIN_HEIGHT 240 -#define RZG2L_CSI2_MAX_WIDTH 2800 -#define RZG2L_CSI2_MAX_HEIGHT 4095 - -#define RZG2L_CSI2_DEFAULT_WIDTH RZG2L_CSI2_MIN_WIDTH -#define RZG2L_CSI2_DEFAULT_HEIGHT RZG2L_CSI2_MIN_HEIGHT #define RZG2L_CSI2_DEFAULT_FMT MEDIA_BUS_FMT_UYVY8_1X16 enum rzg2l_csi2_pads { @@ -137,6 +130,10 @@ struct rzg2l_csi2_info { int (*dphy_enable)(struct rzg2l_csi2 *csi2); int (*dphy_disable)(struct rzg2l_csi2 *csi2); bool has_system_clk; + unsigned int min_width; + unsigned int min_height; + unsigned int max_width; + unsigned int max_height; }; struct rzg2l_csi2_timings { @@ -418,6 +415,10 @@ static const struct rzg2l_csi2_info rzg2l_csi2_info = { .dphy_enable = rzg2l_csi2_dphy_enable, .dphy_disable = rzg2l_csi2_dphy_disable, .has_system_clk = true, + .min_width = 320, + .min_height = 240, + .max_width = 2800, + .max_height = 4095, }; static int rzg2l_csi2_dphy_setting(struct v4l2_subdev *sd, bool on) @@ -542,6 +543,10 @@ static const struct rzg2l_csi2_info rzv2h_csi2_info = { .dphy_enable = rzv2h_csi2_dphy_enable, .dphy_disable = rzv2h_csi2_dphy_disable, .has_system_clk = false, + .min_width = 320, + .min_height = 240, + .max_width = 4096, + .max_height = 4096, }; static int rzg2l_csi2_mipi_link_setting(struct v4l2_subdev *sd, bool on) @@ -631,6 +636,7 @@ static int rzg2l_csi2_set_format(struct v4l2_subdev *sd, struct v4l2_subdev_state *state, struct v4l2_subdev_format *fmt) { + struct rzg2l_csi2 *csi2 = sd_to_csi2(sd); struct v4l2_mbus_framefmt *src_format; struct v4l2_mbus_framefmt *sink_format; @@ -653,9 +659,11 @@ static int rzg2l_csi2_set_format(struct v4l2_subdev *sd, sink_format->ycbcr_enc = fmt->format.ycbcr_enc; sink_format->quantization = fmt->format.quantization; sink_format->width = clamp_t(u32, fmt->format.width, - RZG2L_CSI2_MIN_WIDTH, RZG2L_CSI2_MAX_WIDTH); + csi2->info->min_width, + csi2->info->max_width); sink_format->height = clamp_t(u32, fmt->format.height, - RZG2L_CSI2_MIN_HEIGHT, RZG2L_CSI2_MAX_HEIGHT); + csi2->info->min_height, + csi2->info->max_height); fmt->format = *sink_format; /* propagate format to source pad */ @@ -668,9 +676,10 @@ static int rzg2l_csi2_init_state(struct v4l2_subdev *sd, struct v4l2_subdev_state *sd_state) { struct v4l2_subdev_format fmt = { .pad = RZG2L_CSI2_SINK, }; + struct rzg2l_csi2 *csi2 = sd_to_csi2(sd); - fmt.format.width = RZG2L_CSI2_DEFAULT_WIDTH; - fmt.format.height = RZG2L_CSI2_DEFAULT_HEIGHT; + fmt.format.width = csi2->info->min_width; + fmt.format.height = csi2->info->min_height; fmt.format.field = V4L2_FIELD_NONE; fmt.format.code = RZG2L_CSI2_DEFAULT_FMT; fmt.format.colorspace = V4L2_COLORSPACE_SRGB; @@ -697,16 +706,18 @@ static int rzg2l_csi2_enum_frame_size(struct v4l2_subdev *sd, struct v4l2_subdev_state *sd_state, struct v4l2_subdev_frame_size_enum *fse) { + struct rzg2l_csi2 *csi2 = sd_to_csi2(sd); + if (fse->index != 0) return -EINVAL; if (!rzg2l_csi2_code_to_fmt(fse->code)) return -EINVAL; - fse->min_width = RZG2L_CSI2_MIN_WIDTH; - fse->min_height = RZG2L_CSI2_MIN_HEIGHT; - fse->max_width = RZG2L_CSI2_MAX_WIDTH; - fse->max_height = RZG2L_CSI2_MAX_HEIGHT; + fse->min_width = csi2->info->min_width; + fse->min_height = csi2->info->min_height; + fse->max_width = csi2->info->max_width; + fse->max_height = csi2->info->max_height; return 0; } From c0c3fcdccb2dbfde14c81f109f317df03a9dd037 Mon Sep 17 00:00:00 2001 From: Hans de Goede Date: Sun, 7 Dec 2025 22:56:50 +0100 Subject: [PATCH 0970/1321] media: ov02c10: Fix bayer-pattern change after default vflip change After commit d5ebe3f7d13d ("media: ov02c10: Fix default vertical flip") the reported bayer-pattern of MEDIA_BUS_FMT_SGRBG10_1X10 is no longer correct. Change the 16-bit x-win register (0x3810) value from 2 to 1 so that the sensor will generate data in GRBG bayer-order again. Fixes: d5ebe3f7d13d ("media: ov02c10: Fix default vertical flip") Cc: stable@vger.kernel.org Reviewed-by: Bryan O'Donoghue Reviewed-by: Sebastian Reichel Signed-off-by: Hans de Goede Signed-off-by: Hans Verkuil --- drivers/media/i2c/ov02c10.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/media/i2c/ov02c10.c b/drivers/media/i2c/ov02c10.c index b1e540eb83265..6369841de88b6 100644 --- a/drivers/media/i2c/ov02c10.c +++ b/drivers/media/i2c/ov02c10.c @@ -168,7 +168,7 @@ static const struct reg_sequence sensor_1928x1092_30fps_setting[] = { {0x3810, 0x00}, {0x3811, 0x02}, {0x3812, 0x00}, - {0x3813, 0x02}, + {0x3813, 0x01}, {0x3814, 0x01}, {0x3815, 0x01}, {0x3816, 0x01}, From 130a1d9d63898e4295af8405567d033fe88a1d95 Mon Sep 17 00:00:00 2001 From: Hans de Goede Date: Mon, 8 Dec 2025 15:12:58 +0100 Subject: [PATCH 0971/1321] media: ov02c10: Adjust x-win/y-win when changing flipping to preserve bayer-pattern The ov02c10 is capable of having its (crop) window shifted around with 1 pixel precision while streaming. This allows changing the x/y window coordinates when changing flipping to preserve the bayer-pattern. __v4l2_ctrl_handler_setup() will now write the window coordinates at 0x3810 and 0x3812 so these can be dropped from sensor_1928x1092_30fps_setting. Since the bayer-pattern is now unchanged, the V4L2_CTRL_FLAG_MODIFY_LAYOUT flag can be dropped from the flip controls. Note the original use of the V4L2_CTRL_FLAG_MODIFY_LAYOUT flag was incomplete, besides setting the flag the driver should also have reported a different mbus code when getting the source pad's format depending on the hflip / vflip settings see the ov2680.c driver for example. Fixes: b7cd2ba3f692 ("media: ov02c10: Support hflip and vflip") Cc: stable@vger.kernel.org Reviewed-by: Bryan O'Donoghue Reviewed-by: Sebastian Reichel Tested-by: Sebastian Reichel # T14s Gen6 Snapdragon Signed-off-by: Hans de Goede Signed-off-by: Hans Verkuil --- drivers/media/i2c/ov02c10.c | 12 ++++-------- 1 file changed, 4 insertions(+), 8 deletions(-) diff --git a/drivers/media/i2c/ov02c10.c b/drivers/media/i2c/ov02c10.c index 6369841de88b6..384c2f0b16086 100644 --- a/drivers/media/i2c/ov02c10.c +++ b/drivers/media/i2c/ov02c10.c @@ -165,10 +165,6 @@ static const struct reg_sequence sensor_1928x1092_30fps_setting[] = { {0x3809, 0x88}, {0x380a, 0x04}, {0x380b, 0x44}, - {0x3810, 0x00}, - {0x3811, 0x02}, - {0x3812, 0x00}, - {0x3813, 0x01}, {0x3814, 0x01}, {0x3815, 0x01}, {0x3816, 0x01}, @@ -465,11 +461,15 @@ static int ov02c10_set_ctrl(struct v4l2_ctrl *ctrl) break; case V4L2_CID_HFLIP: + cci_write(ov02c10->regmap, OV02C10_ISP_X_WIN_CONTROL, + ctrl->val ? 1 : 2, &ret); cci_update_bits(ov02c10->regmap, OV02C10_ROTATE_CONTROL, BIT(3), ov02c10->hflip->val << 3, &ret); break; case V4L2_CID_VFLIP: + cci_write(ov02c10->regmap, OV02C10_ISP_Y_WIN_CONTROL, + ctrl->val ? 2 : 1, &ret); cci_update_bits(ov02c10->regmap, OV02C10_ROTATE_CONTROL, BIT(4), ov02c10->vflip->val << 4, &ret); break; @@ -551,13 +551,9 @@ static int ov02c10_init_controls(struct ov02c10 *ov02c10) ov02c10->hflip = v4l2_ctrl_new_std(ctrl_hdlr, &ov02c10_ctrl_ops, V4L2_CID_HFLIP, 0, 1, 1, 0); - if (ov02c10->hflip) - ov02c10->hflip->flags |= V4L2_CTRL_FLAG_MODIFY_LAYOUT; ov02c10->vflip = v4l2_ctrl_new_std(ctrl_hdlr, &ov02c10_ctrl_ops, V4L2_CID_VFLIP, 0, 1, 1, 0); - if (ov02c10->vflip) - ov02c10->vflip->flags |= V4L2_CTRL_FLAG_MODIFY_LAYOUT; v4l2_ctrl_new_std_menu_items(ctrl_hdlr, &ov02c10_ctrl_ops, V4L2_CID_TEST_PATTERN, From 2d54b96a7cba0babdea41fd35704ba37c2507f4f Mon Sep 17 00:00:00 2001 From: Hans de Goede Date: Mon, 8 Dec 2025 15:32:58 +0100 Subject: [PATCH 0972/1321] media: ov02c10: Fix the horizontal flip control During sensor calibration I noticed that with the hflip control set to false/disabled the image was mirrored. The horizontal flip control is inverted and needs to be set to 1 to not flip. This is something which seems to be common with various recent Omnivision sensors, the ov01a10 and ov08x40 also have an inverted mirror control. Invert the hflip control to fix the sensor mirroring by default. Fixes: b7cd2ba3f692 ("media: ov02c10: Support hflip and vflip") Cc: stable@vger.kernel.org Reviewed-by: Bryan O'Donoghue Reviewed-by: Sebastian Reichel Tested-by: Sebastian Reichel # T14s Gen6 Snapdragon Signed-off-by: Hans de Goede Signed-off-by: Hans Verkuil --- drivers/media/i2c/ov02c10.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/media/i2c/ov02c10.c b/drivers/media/i2c/ov02c10.c index 384c2f0b16086..f912ae1420408 100644 --- a/drivers/media/i2c/ov02c10.c +++ b/drivers/media/i2c/ov02c10.c @@ -170,7 +170,7 @@ static const struct reg_sequence sensor_1928x1092_30fps_setting[] = { {0x3816, 0x01}, {0x3817, 0x01}, - {0x3820, 0xa0}, + {0x3820, 0xa8}, {0x3821, 0x00}, {0x3822, 0x80}, {0x3823, 0x08}, @@ -462,9 +462,9 @@ static int ov02c10_set_ctrl(struct v4l2_ctrl *ctrl) case V4L2_CID_HFLIP: cci_write(ov02c10->regmap, OV02C10_ISP_X_WIN_CONTROL, - ctrl->val ? 1 : 2, &ret); + ctrl->val ? 2 : 1, &ret); cci_update_bits(ov02c10->regmap, OV02C10_ROTATE_CONTROL, - BIT(3), ov02c10->hflip->val << 3, &ret); + BIT(3), ctrl->val ? 0 : BIT(3), &ret); break; case V4L2_CID_VFLIP: From 291b5fe4bd54bc5eab1c127aa34b09d92ab21e6d Mon Sep 17 00:00:00 2001 From: Hans de Goede Date: Tue, 9 Dec 2025 10:18:50 +0100 Subject: [PATCH 0973/1321] media: ipu-bridge: Add DMI quirk for Dell XPS laptops with upside down sensors MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The Dell XPS 13 9350 and XPS 16 9640 both have an upside-down mounted OV02C10 sensor. This rotation of 180° is reported in neither the SSDB nor the _PLD for the sensor (both report a rotation of 0°). Add a DMI quirk mechanism for upside-down sensors and add 2 initial entries to the DMI quirk list for these 2 laptops. Note the OV02C10 driver was originally developed on a XPS 16 9640 which resulted in inverted vflip + hflip settings making it look like the sensor was upright on the XPS 16 9640 and upside down elsewhere this has been fixed in commit d5ebe3f7d13d ("media: ov02c10: Fix default vertical flip"). This makes this commit a regression fix since now the video is upside down on these Dell XPS models where it was not before. Fixes: d5ebe3f7d13d ("media: ov02c10: Fix default vertical flip") Cc: stable@vger.kernel.org Reviewed-by: Bryan O'Donoghue Reviewed-by: Sebastian Reichel Signed-off-by: Hans de Goede Signed-off-by: Hans Verkuil --- drivers/media/pci/intel/Kconfig | 2 +- drivers/media/pci/intel/ipu-bridge.c | 29 ++++++++++++++++++++++++++++ 2 files changed, 30 insertions(+), 1 deletion(-) diff --git a/drivers/media/pci/intel/Kconfig b/drivers/media/pci/intel/Kconfig index d9fcddce028bf..3f14ca110d06c 100644 --- a/drivers/media/pci/intel/Kconfig +++ b/drivers/media/pci/intel/Kconfig @@ -6,7 +6,7 @@ source "drivers/media/pci/intel/ivsc/Kconfig" config IPU_BRIDGE tristate "Intel IPU Bridge" - depends on ACPI || COMPILE_TEST + depends on ACPI depends on I2C help The IPU bridge is a helper library for Intel IPU drivers to diff --git a/drivers/media/pci/intel/ipu-bridge.c b/drivers/media/pci/intel/ipu-bridge.c index c9827e9fe6ffc..b2b710094914c 100644 --- a/drivers/media/pci/intel/ipu-bridge.c +++ b/drivers/media/pci/intel/ipu-bridge.c @@ -5,6 +5,7 @@ #include #include #include +#include #include #include #include @@ -98,6 +99,28 @@ static const struct ipu_sensor_config ipu_supported_sensors[] = { IPU_SENSOR_CONFIG("XMCC0003", 1, 321468000), }; +/* + * DMI matches for laptops which have their sensor mounted upside-down + * without reporting a rotation of 180° in neither the SSDB nor the _PLD. + */ +static const struct dmi_system_id upside_down_sensor_dmi_ids[] = { + { + .matches = { + DMI_EXACT_MATCH(DMI_SYS_VENDOR, "Dell Inc."), + DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "XPS 13 9350"), + }, + .driver_data = "OVTI02C1", + }, + { + .matches = { + DMI_EXACT_MATCH(DMI_SYS_VENDOR, "Dell Inc."), + DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "XPS 16 9640"), + }, + .driver_data = "OVTI02C1", + }, + {} /* Terminating entry */ +}; + static const struct ipu_property_names prop_names = { .clock_frequency = "clock-frequency", .rotation = "rotation", @@ -248,6 +271,12 @@ static int ipu_bridge_read_acpi_buffer(struct acpi_device *adev, char *id, static u32 ipu_bridge_parse_rotation(struct acpi_device *adev, struct ipu_sensor_ssdb *ssdb) { + const struct dmi_system_id *dmi_id; + + dmi_id = dmi_first_match(upside_down_sensor_dmi_ids); + if (dmi_id && acpi_dev_hid_match(adev, dmi_id->driver_data)) + return 180; + switch (ssdb->degree) { case IPU_SENSOR_ROTATION_NORMAL: return 0; From 6b22b55d08dcadb41cce3ad6fc89fc22186adda7 Mon Sep 17 00:00:00 2001 From: Hans de Goede Date: Mon, 8 Dec 2025 15:37:24 +0100 Subject: [PATCH 0974/1321] media: ov02c10: Remove unnecessary hflip and vflip pointers The cci_update_bits() inside ov02c10_set_ctrl() can use the passed data in the ctrl argument to access the vflip control value. After changing this there is no need to store a pointer to the hflip and vflip controls inside struct ov02c10, drop these. Reviewed-by: Bryan O'Donoghue Reviewed-by: Sebastian Reichel Tested-by: Sebastian Reichel # T14s Gen6 Snapdragon Signed-off-by: Hans de Goede Signed-off-by: Hans Verkuil --- drivers/media/i2c/ov02c10.c | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-) diff --git a/drivers/media/i2c/ov02c10.c b/drivers/media/i2c/ov02c10.c index f912ae1420408..cf93d36032e14 100644 --- a/drivers/media/i2c/ov02c10.c +++ b/drivers/media/i2c/ov02c10.c @@ -381,8 +381,6 @@ struct ov02c10 { struct v4l2_ctrl *vblank; struct v4l2_ctrl *hblank; struct v4l2_ctrl *exposure; - struct v4l2_ctrl *hflip; - struct v4l2_ctrl *vflip; struct clk *img_clk; struct gpio_desc *reset; @@ -471,7 +469,7 @@ static int ov02c10_set_ctrl(struct v4l2_ctrl *ctrl) cci_write(ov02c10->regmap, OV02C10_ISP_Y_WIN_CONTROL, ctrl->val ? 2 : 1, &ret); cci_update_bits(ov02c10->regmap, OV02C10_ROTATE_CONTROL, - BIT(4), ov02c10->vflip->val << 4, &ret); + BIT(4), ctrl->val << 4, &ret); break; default: @@ -549,11 +547,11 @@ static int ov02c10_init_controls(struct ov02c10 *ov02c10) OV02C10_EXPOSURE_STEP, exposure_max); - ov02c10->hflip = v4l2_ctrl_new_std(ctrl_hdlr, &ov02c10_ctrl_ops, - V4L2_CID_HFLIP, 0, 1, 1, 0); + v4l2_ctrl_new_std(ctrl_hdlr, &ov02c10_ctrl_ops, V4L2_CID_HFLIP, + 0, 1, 1, 0); - ov02c10->vflip = v4l2_ctrl_new_std(ctrl_hdlr, &ov02c10_ctrl_ops, - V4L2_CID_VFLIP, 0, 1, 1, 0); + v4l2_ctrl_new_std(ctrl_hdlr, &ov02c10_ctrl_ops, V4L2_CID_VFLIP, + 0, 1, 1, 0); v4l2_ctrl_new_std_menu_items(ctrl_hdlr, &ov02c10_ctrl_ops, V4L2_CID_TEST_PATTERN, From ff0c750e4d4c8e16b7db4c7aa24a1aec24d71088 Mon Sep 17 00:00:00 2001 From: Alice Ryhl Date: Mon, 5 Jan 2026 10:44:06 +0000 Subject: [PATCH 0975/1321] rust: bitops: fix missing _find_* functions on 32-bit ARM On 32-bit ARM, you may encounter linker errors such as this one: ld.lld: error: undefined symbol: _find_next_zero_bit >>> referenced by rust_binder_main.43196037ba7bcee1-cgu.0 >>> drivers/android/binder/rust_binder_main.o:(::insert_or_update_handle) in archive vmlinux.a >>> referenced by rust_binder_main.43196037ba7bcee1-cgu.0 >>> drivers/android/binder/rust_binder_main.o:(::insert_or_update_handle) in archive vmlinux.a This error occurs because even though the functions are declared by include/linux/find.h, the definition is #ifdef'd out on 32-bit ARM. This is because arch/arm/include/asm/bitops.h contains: #define find_first_zero_bit(p,sz) _find_first_zero_bit_le(p,sz) #define find_next_zero_bit(p,sz,off) _find_next_zero_bit_le(p,sz,off) #define find_first_bit(p,sz) _find_first_bit_le(p,sz) #define find_next_bit(p,sz,off) _find_next_bit_le(p,sz,off) And the underscore-prefixed function is conditional on #ifndef of the non-underscore-prefixed name, but the declaration in find.h is *not* conditional on that #ifndef. To fix the linker error, we ensure that the symbols in question exist when compiling Rust code. We do this by defining them in rust/helpers/ whenever the normal definition is #ifndef'd out. Note that these helpers are somewhat unusual in that they do not have the rust_helper_ prefix that most helpers have. Adding the rust_helper_ prefix does not compile, as 'bindings::_find_next_zero_bit()' will result in a call to a symbol called _find_next_zero_bit as defined by include/linux/find.h rather than a symbol with the rust_helper_ prefix. This is because when a symbol is present in both include/ and rust/helpers/, the one from include/ wins under the assumption that the current configuration is one where that helper is unnecessary. This heuristic fails for _find_next_zero_bit() because the header file always declares it even if the symbol does not exist. The functions still use the __rust_helper annotation. This lets the wrapper function be inlined into Rust code even if full kernel LTO is not used once the patch series for that feature lands. Yury: arches are free to implement they own find_bit() functions. Most rely on generic implementation, but arm32 and m86k - not; so they require custom handling. Alice confirmed it fixes the build for both. Cc: stable@vger.kernel.org Fixes: 6cf93a9ed39e ("rust: add bindings for bitops.h") Reported-by: Andreas Hindborg Closes: https://rust-for-linux.zulipchat.com/#narrow/channel/x/topic/x/near/561677301 Tested-by: Andreas Hindborg Reviewed-by: Dirk Behme Signed-off-by: Alice Ryhl Signed-off-by: Yury Norov (NVIDIA) --- rust/helpers/bitops.c | 42 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 42 insertions(+) diff --git a/rust/helpers/bitops.c b/rust/helpers/bitops.c index 5d0861d29d3f0..e79ef9e6d98f9 100644 --- a/rust/helpers/bitops.c +++ b/rust/helpers/bitops.c @@ -1,6 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 #include +#include void rust_helper___set_bit(unsigned long nr, unsigned long *addr) { @@ -21,3 +22,44 @@ void rust_helper_clear_bit(unsigned long nr, volatile unsigned long *addr) { clear_bit(nr, addr); } + +/* + * The rust_helper_ prefix is intentionally omitted below so that the + * declarations in include/linux/find.h are compatible with these helpers. + * + * Note that the below #ifdefs mean that the helper is only created if C does + * not provide a definition. + */ +#ifdef find_first_zero_bit +__rust_helper +unsigned long _find_first_zero_bit(const unsigned long *p, unsigned long size) +{ + return find_first_zero_bit(p, size); +} +#endif /* find_first_zero_bit */ + +#ifdef find_next_zero_bit +__rust_helper +unsigned long _find_next_zero_bit(const unsigned long *addr, + unsigned long size, unsigned long offset) +{ + return find_next_zero_bit(addr, size, offset); +} +#endif /* find_next_zero_bit */ + +#ifdef find_first_bit +__rust_helper +unsigned long _find_first_bit(const unsigned long *addr, unsigned long size) +{ + return find_first_bit(addr, size); +} +#endif /* find_first_bit */ + +#ifdef find_next_bit +__rust_helper +unsigned long _find_next_bit(const unsigned long *addr, unsigned long size, + unsigned long offset) +{ + return find_next_bit(addr, size, offset); +} +#endif /* find_next_bit */ From 50c84485dac56d876fa6b5db0fb1bbab4b4c6951 Mon Sep 17 00:00:00 2001 From: Zhaoming Luo Date: Wed, 17 Dec 2025 22:03:38 +0800 Subject: [PATCH 0976/1321] scsi: ufs: dt-bindings: Fix several grammar errors Fix several grammar errors. Signed-off-by: Zhaoming Luo Reviewed-by: Rob Herring (Arm) Link: https://patch.msgid.link/20251217-fix-minor-grammar-err-v3-1-9be220cdd56a@posteo.com Signed-off-by: Martin K. Petersen --- Documentation/devicetree/bindings/ufs/ufs-common.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/devicetree/bindings/ufs/ufs-common.yaml b/Documentation/devicetree/bindings/ufs/ufs-common.yaml index 9f04f34d8c5aa..ed97f56825093 100644 --- a/Documentation/devicetree/bindings/ufs/ufs-common.yaml +++ b/Documentation/devicetree/bindings/ufs/ufs-common.yaml @@ -48,8 +48,8 @@ properties: enum: [1, 2] default: 2 description: - Number of lanes available per direction. Note that it is assume same - number of lanes is used both directions at once. + Number of lanes available per direction. Note that it is assumed that + the same number of lanes are used in both directions at once. vdd-hba-supply: description: From 60e9ebdf0eced7d3ddc09daada61ff02e8740c8e Mon Sep 17 00:00:00 2001 From: Miao Li Date: Thu, 18 Dec 2025 10:31:29 +0800 Subject: [PATCH 0977/1321] scsi: core: Correct documentation for scsi_test_unit_ready() If scsi_test_unit_ready() returns zero, TEST UNIT READY was executed successfully. Signed-off-by: Miao Li Link: https://patch.msgid.link/20251218023129.284307-1-limiao870622@163.com Signed-off-by: Martin K. Petersen --- drivers/scsi/scsi_lib.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 93031326ac3ee..c7d6b76c86d24 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -2459,7 +2459,7 @@ EXPORT_SYMBOL(scsi_mode_sense); * @retries: number of retries before failing * @sshdr: outpout pointer for decoded sense information. * - * Returns zero if unsuccessful or an error if TUR failed. For + * Returns zero if successful or an error if TUR failed. For * removable media, UNIT_ATTENTION sets ->changed flag. **/ int From b669b6e1a9141371608aac1766cff851f1613e06 Mon Sep 17 00:00:00 2001 From: Brian Kao Date: Thu, 18 Dec 2025 03:17:23 +0000 Subject: [PATCH 0978/1321] scsi: core: Fix error handler encryption support Some low-level drivers (LLD) access block layer crypto fields, such as rq->crypt_keyslot and rq->crypt_ctx within `struct request`, to configure hardware for inline encryption. However, SCSI Error Handling (EH) commands (e.g., TEST UNIT READY, START STOP UNIT) should not involve any encryption setup. To prevent drivers from erroneously applying crypto settings during EH, this patch saves the original values of rq->crypt_keyslot and rq->crypt_ctx before an EH command is prepared via scsi_eh_prep_cmnd(). These fields in the 'struct request' are then set to NULL. The original values are restored in scsi_eh_restore_cmnd() after the EH command completes. This ensures that the block layer crypto context does not leak into EH command execution. Signed-off-by: Brian Kao Link: https://patch.msgid.link/20251218031726.2642834-1-powenkao@google.com Cc: stable@vger.kernel.org Reviewed-by: Bart Van Assche Signed-off-by: Martin K. Petersen --- drivers/scsi/scsi_error.c | 24 ++++++++++++++++++++++++ include/scsi/scsi_eh.h | 6 ++++++ 2 files changed, 30 insertions(+) diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index f869108fd9693..eebca96c1fc15 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -1063,6 +1063,9 @@ void scsi_eh_prep_cmnd(struct scsi_cmnd *scmd, struct scsi_eh_save *ses, unsigned char *cmnd, int cmnd_size, unsigned sense_bytes) { struct scsi_device *sdev = scmd->device; +#ifdef CONFIG_BLK_INLINE_ENCRYPTION + struct request *rq = scsi_cmd_to_rq(scmd); +#endif /* * We need saved copies of a number of fields - this is because @@ -1114,6 +1117,18 @@ void scsi_eh_prep_cmnd(struct scsi_cmnd *scmd, struct scsi_eh_save *ses, scmd->cmnd[1] = (scmd->cmnd[1] & 0x1f) | (sdev->lun << 5 & 0xe0); + /* + * Encryption must be disabled for the commands submitted by the error handler. + * Hence, clear the encryption context information. + */ +#ifdef CONFIG_BLK_INLINE_ENCRYPTION + ses->rq_crypt_keyslot = rq->crypt_keyslot; + ses->rq_crypt_ctx = rq->crypt_ctx; + + rq->crypt_keyslot = NULL; + rq->crypt_ctx = NULL; +#endif + /* * Zero the sense buffer. The scsi spec mandates that any * untransferred sense data should be interpreted as being zero. @@ -1131,6 +1146,10 @@ EXPORT_SYMBOL(scsi_eh_prep_cmnd); */ void scsi_eh_restore_cmnd(struct scsi_cmnd* scmd, struct scsi_eh_save *ses) { +#ifdef CONFIG_BLK_INLINE_ENCRYPTION + struct request *rq = scsi_cmd_to_rq(scmd); +#endif + /* * Restore original data */ @@ -1143,6 +1162,11 @@ void scsi_eh_restore_cmnd(struct scsi_cmnd* scmd, struct scsi_eh_save *ses) scmd->underflow = ses->underflow; scmd->prot_op = ses->prot_op; scmd->eh_eflags = ses->eh_eflags; + +#ifdef CONFIG_BLK_INLINE_ENCRYPTION + rq->crypt_keyslot = ses->rq_crypt_keyslot; + rq->crypt_ctx = ses->rq_crypt_ctx; +#endif } EXPORT_SYMBOL(scsi_eh_restore_cmnd); diff --git a/include/scsi/scsi_eh.h b/include/scsi/scsi_eh.h index 1ae08e81339fa..15679be90c5c3 100644 --- a/include/scsi/scsi_eh.h +++ b/include/scsi/scsi_eh.h @@ -41,6 +41,12 @@ struct scsi_eh_save { unsigned char cmnd[32]; struct scsi_data_buffer sdb; struct scatterlist sense_sgl; + + /* struct request fields */ +#ifdef CONFIG_BLK_INLINE_ENCRYPTION + struct bio_crypt_ctx *rq_crypt_ctx; + struct blk_crypto_keyslot *rq_crypt_keyslot; +#endif }; extern void scsi_eh_prep_cmnd(struct scsi_cmnd *scmd, From 7107e55dcbfa111a2c30b7999514e16b9b974500 Mon Sep 17 00:00:00 2001 From: Bart Van Assche Date: Thu, 18 Dec 2025 15:07:37 -0800 Subject: [PATCH 0979/1321] scsi: ufs: core: Configure MCQ after link startup Commit f46b9a595fa9 ("scsi: ufs: core: Allocate the SCSI host earlier") did not only cause scsi_add_host() to be called earlier. It also swapped the order of link startup and enabling and configuring MCQ mode. Before that commit, the call chains for link startup and enabling MCQ were as follows: ufshcd_init() ufshcd_link_startup() ufshcd_add_scsi_host() ufshcd_mcq_enable() Apparently this change causes link startup to fail. Fix this by configuring MCQ after link startup has completed. Reported-by: Nitin Rawat Fixes: f46b9a595fa9 ("scsi: ufs: core: Allocate the SCSI host earlier") Signed-off-by: Bart Van Assche Reviewed-by: Peter Wang Link: https://patch.msgid.link/20251218230741.2661049-1-bvanassche@acm.org Signed-off-by: Martin K. Petersen --- drivers/ufs/core/ufshcd.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c index 0babb7035200f..604043a7533d3 100644 --- a/drivers/ufs/core/ufshcd.c +++ b/drivers/ufs/core/ufshcd.c @@ -10736,9 +10736,7 @@ static int ufshcd_add_scsi_host(struct ufs_hba *hba) if (is_mcq_supported(hba)) { ufshcd_mcq_enable(hba); err = ufshcd_alloc_mcq(hba); - if (!err) { - ufshcd_config_mcq(hba); - } else { + if (err) { /* Continue with SDB mode */ ufshcd_mcq_disable(hba); use_mcq_mode = false; @@ -11011,6 +11009,9 @@ int ufshcd_init(struct ufs_hba *hba, void __iomem *mmio_base, unsigned int irq) if (err) goto out_disable; + if (hba->mcq_enabled) + ufshcd_config_mcq(hba); + if (hba->quirks & UFSHCD_QUIRK_SKIP_PH_CONFIGURATION) goto initialized; From c2af9127b821d6bd4aa440b762589de28a09fb87 Mon Sep 17 00:00:00 2001 From: Ranjan Kumar Date: Tue, 23 Dec 2025 16:17:21 +0530 Subject: [PATCH 0980/1321] scsi: mpt3sas: Update maintainer list As an active participant in the development of the mpt3sas driver, add myself to the maintainers list. Signed-off-by: Ranjan Kumar Link: https://patch.msgid.link/20251223104721.16882-1-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen --- MAINTAINERS | 1 + 1 file changed, 1 insertion(+) diff --git a/MAINTAINERS b/MAINTAINERS index 0d044a58cbfe0..cf755238c429f 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -14880,6 +14880,7 @@ LSILOGIC MPT FUSION DRIVERS (FC/SAS/SPI) M: Sathya Prakash M: Sreekanth Reddy M: Suganath Prabu Subramani +M: Ranjan Kumar L: MPT-FusionLinux.pdl@broadcom.com L: linux-scsi@vger.kernel.org S: Supported From f77903ce88d3bd1944096f53ffa083611c27709a Mon Sep 17 00:00:00 2001 From: Julia Lawall Date: Wed, 31 Dec 2025 17:50:27 +0100 Subject: [PATCH 0981/1321] scsi: bfa: Update outdated comment The function bfa_lps_is_brcd_fabric() was eliminated, being a one-line function, in commit f7f73812e950 ("[SCSI] bfa: clean up one line functions"). Replace the call in the comment by its inlined counterpart, referring to the parameter of the subsequent function. Signed-off-by: Julia Lawall Link: https://patch.msgid.link/20251231165027.142443-1-Julia.Lawall@inria.fr Signed-off-by: Martin K. Petersen --- drivers/scsi/bfa/bfa_fcs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/bfa/bfa_fcs.c b/drivers/scsi/bfa/bfa_fcs.c index e52ce9b01f49a..9b57312f43f50 100644 --- a/drivers/scsi/bfa/bfa_fcs.c +++ b/drivers/scsi/bfa/bfa_fcs.c @@ -1169,7 +1169,7 @@ bfa_fcs_fabric_vport_lookup(struct bfa_fcs_fabric_s *fabric, wwn_t pwwn) * This function should be used only if there is any requirement * to check for FOS version below 6.3. * To check if the attached fabric is a brocade fabric, use - * bfa_lps_is_brcd_fabric() which works for FOS versions 6.3 + * fabric->lps->brcd_switch which works for FOS versions 6.3 * or above only. */ From 963162b1ab4782f6f08878b904bd0a0f784cdb81 Mon Sep 17 00:00:00 2001 From: Colin Ian King Date: Fri, 19 Dec 2025 21:44:28 +0000 Subject: [PATCH 0982/1321] scsi: ufs: host: mediatek: Make read-only array scale_us static const Don't populate the read-only array scale_us on the stack at run time, instead make it static const. Signed-off-by: Colin Ian King Reviewed-by: Peter Wang Link: https://patch.msgid.link/20251219214428.492744-1-colin.i.king@gmail.com Signed-off-by: Martin K. Petersen --- drivers/ufs/host/ufs-mediatek.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/ufs/host/ufs-mediatek.c b/drivers/ufs/host/ufs-mediatek.c index ecbbf52bf7341..66b11cc0703bd 100644 --- a/drivers/ufs/host/ufs-mediatek.c +++ b/drivers/ufs/host/ufs-mediatek.c @@ -1112,7 +1112,7 @@ static void ufs_mtk_setup_clk_gating(struct ufs_hba *hba) unsigned long flags; u32 ah_ms = 10; u32 ah_scale, ah_timer; - u32 scale_us[] = {1, 10, 100, 1000, 10000, 100000}; + static const u32 scale_us[] = {1, 10, 100, 1000, 10000, 100000}; if (ufshcd_is_clkgating_allowed(hba)) { if (ufshcd_is_auto_hibern8_supported(hba) && hba->ahit) { From 23b1ec16c1e9e3107f2e268124c839ebc813d76d Mon Sep 17 00:00:00 2001 From: Jakub Kicinski Date: Tue, 6 Jan 2026 12:07:06 -0800 Subject: [PATCH 0983/1321] MAINTAINERS: add docs and selftest to the TLS file list The TLS MAINTAINERS entry does not seem to cover the selftest or docs. Add those. While at it remove the unnecessary wildcard from net/tls/, there are no subdirectories anyway so this change has no impact today. Reviewed-by: Sabrina Dubroca Link: https://patch.msgid.link/20260106200706.1596250-1-kuba@kernel.org Signed-off-by: Jakub Kicinski --- MAINTAINERS | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index cf755238c429f..61bc5c5665529 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -18424,9 +18424,11 @@ M: Jakub Kicinski M: Sabrina Dubroca L: netdev@vger.kernel.org S: Maintained +F: Documentation/networking/tls* F: include/net/tls.h F: include/uapi/linux/tls.h -F: net/tls/* +F: net/tls/ +F: tools/testing/selftests/net/tls.c NETWORKING [SOCKETS] M: Eric Dumazet From 9d8ededa3fd26dba05bf71c7373e8f27f0f51002 Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Tue, 6 Jan 2026 17:24:26 +0000 Subject: [PATCH 0984/1321] ipv4: ip_tunnel: spread netdev_lockdep_set_classes() Inspired by yet another syzbot report. IPv6 tunnels call netdev_lockdep_set_classes() for each tunnel type, while IPv4 currently centralizes netdev_lockdep_set_classes() call from ip_tunnel_init(). Make ip_tunnel_init() a macro, so that we have different lockdep classes per tunnel type. Fixes: 0bef512012b1 ("net: add netdev_lockdep_set_classes() to virtual drivers") Reported-by: syzbot+1240b33467289f5ab50b@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/695d439f.050a0220.1c677c.0347.GAE@google.com/T/#u Signed-off-by: Eric Dumazet Link: https://patch.msgid.link/20260106172426.1760721-1-edumazet@google.com Signed-off-by: Jakub Kicinski --- include/net/ip_tunnels.h | 13 ++++++++++++- net/ipv4/ip_tunnel.c | 5 ++--- 2 files changed, 14 insertions(+), 4 deletions(-) diff --git a/include/net/ip_tunnels.h b/include/net/ip_tunnels.h index ecae35512b9b4..4021e6a73e32b 100644 --- a/include/net/ip_tunnels.h +++ b/include/net/ip_tunnels.h @@ -19,6 +19,7 @@ #include #include #include +#include #if IS_ENABLED(CONFIG_IPV6) #include @@ -372,7 +373,17 @@ static inline void ip_tunnel_init_flow(struct flowi4 *fl4, fl4->flowi4_flags = flow_flags; } -int ip_tunnel_init(struct net_device *dev); +int __ip_tunnel_init(struct net_device *dev); +#define ip_tunnel_init(DEV) \ +({ \ + struct net_device *__dev = (DEV); \ + int __res = __ip_tunnel_init(__dev); \ + \ + if (!__res) \ + netdev_lockdep_set_classes(__dev);\ + __res; \ +}) + void ip_tunnel_uninit(struct net_device *dev); void ip_tunnel_dellink(struct net_device *dev, struct list_head *head); struct net *ip_tunnel_get_link_net(const struct net_device *dev); diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c index 158a30ae7c5f2..50d0f5fe4e4c6 100644 --- a/net/ipv4/ip_tunnel.c +++ b/net/ipv4/ip_tunnel.c @@ -1281,7 +1281,7 @@ int ip_tunnel_changelink(struct net_device *dev, struct nlattr *tb[], } EXPORT_SYMBOL_GPL(ip_tunnel_changelink); -int ip_tunnel_init(struct net_device *dev) +int __ip_tunnel_init(struct net_device *dev) { struct ip_tunnel *tunnel = netdev_priv(dev); struct iphdr *iph = &tunnel->parms.iph; @@ -1308,10 +1308,9 @@ int ip_tunnel_init(struct net_device *dev) if (tunnel->collect_md) netif_keep_dst(dev); - netdev_lockdep_set_classes(dev); return 0; } -EXPORT_SYMBOL_GPL(ip_tunnel_init); +EXPORT_SYMBOL_GPL(__ip_tunnel_init); void ip_tunnel_uninit(struct net_device *dev) { From d2e01749b2c920472470c85c4e4756c48aa369d7 Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Thu, 8 Jan 2026 09:38:06 +0000 Subject: [PATCH 0985/1321] net: bridge: annotate data-races around fdb->{updated,used} fdb->updated and fdb->used are read and written locklessly. Add READ_ONCE()/WRITE_ONCE() annotations. Fixes: 31cbc39b6344 ("net: bridge: add option to allow activity notifications for any fdb entries") Reported-by: syzbot+bfab43087ad57222ce96@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/695e3d74.050a0220.1c677c.035f.GAE@google.com/ Signed-off-by: Eric Dumazet Acked-by: Nikolay Aleksandrov Reviewed-by: Ido Schimmel Link: https://patch.msgid.link/20260108093806.834459-1-edumazet@google.com Signed-off-by: Jakub Kicinski --- net/bridge/br_fdb.c | 28 ++++++++++++++++------------ net/bridge/br_input.c | 4 ++-- 2 files changed, 18 insertions(+), 14 deletions(-) diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c index 58d22e2b85fc3..0501ffcb8a3dd 100644 --- a/net/bridge/br_fdb.c +++ b/net/bridge/br_fdb.c @@ -70,7 +70,7 @@ static inline int has_expired(const struct net_bridge *br, { return !test_bit(BR_FDB_STATIC, &fdb->flags) && !test_bit(BR_FDB_ADDED_BY_EXT_LEARN, &fdb->flags) && - time_before_eq(fdb->updated + hold_time(br), jiffies); + time_before_eq(READ_ONCE(fdb->updated) + hold_time(br), jiffies); } static int fdb_to_nud(const struct net_bridge *br, @@ -126,9 +126,9 @@ static int fdb_fill_info(struct sk_buff *skb, const struct net_bridge *br, if (nla_put_u32(skb, NDA_FLAGS_EXT, ext_flags)) goto nla_put_failure; - ci.ndm_used = jiffies_to_clock_t(now - fdb->used); + ci.ndm_used = jiffies_to_clock_t(now - READ_ONCE(fdb->used)); ci.ndm_confirmed = 0; - ci.ndm_updated = jiffies_to_clock_t(now - fdb->updated); + ci.ndm_updated = jiffies_to_clock_t(now - READ_ONCE(fdb->updated)); ci.ndm_refcnt = 0; if (nla_put(skb, NDA_CACHEINFO, sizeof(ci), &ci)) goto nla_put_failure; @@ -551,7 +551,7 @@ void br_fdb_cleanup(struct work_struct *work) */ rcu_read_lock(); hlist_for_each_entry_rcu(f, &br->fdb_list, fdb_node) { - unsigned long this_timer = f->updated + delay; + unsigned long this_timer = READ_ONCE(f->updated) + delay; if (test_bit(BR_FDB_STATIC, &f->flags) || test_bit(BR_FDB_ADDED_BY_EXT_LEARN, &f->flags)) { @@ -924,6 +924,7 @@ int br_fdb_fillbuf(struct net_bridge *br, void *buf, { struct net_bridge_fdb_entry *f; struct __fdb_entry *fe = buf; + unsigned long delta; int num = 0; memset(buf, 0, maxnum*sizeof(struct __fdb_entry)); @@ -953,8 +954,11 @@ int br_fdb_fillbuf(struct net_bridge *br, void *buf, fe->port_hi = f->dst->port_no >> 8; fe->is_local = test_bit(BR_FDB_LOCAL, &f->flags); - if (!test_bit(BR_FDB_STATIC, &f->flags)) - fe->ageing_timer_value = jiffies_delta_to_clock_t(jiffies - f->updated); + if (!test_bit(BR_FDB_STATIC, &f->flags)) { + delta = jiffies - READ_ONCE(f->updated); + fe->ageing_timer_value = + jiffies_delta_to_clock_t(delta); + } ++fe; ++num; } @@ -1002,8 +1006,8 @@ void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source, unsigned long now = jiffies; bool fdb_modified = false; - if (now != fdb->updated) { - fdb->updated = now; + if (now != READ_ONCE(fdb->updated)) { + WRITE_ONCE(fdb->updated, now); fdb_modified = __fdb_mark_active(fdb); } @@ -1242,10 +1246,10 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source, if (fdb_handle_notify(fdb, notify)) modified = true; - fdb->used = jiffies; + WRITE_ONCE(fdb->used, jiffies); if (modified) { if (refresh) - fdb->updated = jiffies; + WRITE_ONCE(fdb->updated, jiffies); fdb_notify(br, fdb, RTM_NEWNEIGH, true); } @@ -1556,7 +1560,7 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p, goto err_unlock; } - fdb->updated = jiffies; + WRITE_ONCE(fdb->updated, jiffies); if (READ_ONCE(fdb->dst) != p) { WRITE_ONCE(fdb->dst, p); @@ -1565,7 +1569,7 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p, if (test_and_set_bit(BR_FDB_ADDED_BY_EXT_LEARN, &fdb->flags)) { /* Refresh entry */ - fdb->used = jiffies; + WRITE_ONCE(fdb->used, jiffies); } else { modified = true; } diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c index 777fa869c1a14..e355a15bf5ab1 100644 --- a/net/bridge/br_input.c +++ b/net/bridge/br_input.c @@ -221,8 +221,8 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb if (test_bit(BR_FDB_LOCAL, &dst->flags)) return br_pass_frame_up(skb, false); - if (now != dst->used) - dst->used = now; + if (now != READ_ONCE(dst->used)) + WRITE_ONCE(dst->used, now); br_forward(dst->dst, skb, local_rcv, false); } else { if (!mcast_hit) From d5dedd560ab11d7c34da6235167d03ed81346110 Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Wed, 7 Jan 2026 16:31:09 +0000 Subject: [PATCH 0986/1321] ip6_tunnel: use skb_vlan_inet_prepare() in __ip6_tnl_rcv() Blamed commit did not take care of VLAN encapsulations as spotted by syzbot [1]. Use skb_vlan_inet_prepare() instead of pskb_inet_may_pull(). [1] BUG: KMSAN: uninit-value in __INET_ECN_decapsulate include/net/inet_ecn.h:253 [inline] BUG: KMSAN: uninit-value in INET_ECN_decapsulate include/net/inet_ecn.h:275 [inline] BUG: KMSAN: uninit-value in IP6_ECN_decapsulate+0x7a8/0x1fa0 include/net/inet_ecn.h:321 __INET_ECN_decapsulate include/net/inet_ecn.h:253 [inline] INET_ECN_decapsulate include/net/inet_ecn.h:275 [inline] IP6_ECN_decapsulate+0x7a8/0x1fa0 include/net/inet_ecn.h:321 ip6ip6_dscp_ecn_decapsulate+0x16f/0x1b0 net/ipv6/ip6_tunnel.c:729 __ip6_tnl_rcv+0xed9/0x1b50 net/ipv6/ip6_tunnel.c:860 ip6_tnl_rcv+0xc3/0x100 net/ipv6/ip6_tunnel.c:903 gre_rcv+0x1529/0x1b90 net/ipv6/ip6_gre.c:-1 ip6_protocol_deliver_rcu+0x1c89/0x2c60 net/ipv6/ip6_input.c:438 ip6_input_finish+0x1f4/0x4a0 net/ipv6/ip6_input.c:489 NF_HOOK include/linux/netfilter.h:318 [inline] ip6_input+0x9c/0x330 net/ipv6/ip6_input.c:500 ip6_mc_input+0x7ca/0xc10 net/ipv6/ip6_input.c:590 dst_input include/net/dst.h:474 [inline] ip6_rcv_finish+0x958/0x990 net/ipv6/ip6_input.c:79 NF_HOOK include/linux/netfilter.h:318 [inline] ipv6_rcv+0xf1/0x3c0 net/ipv6/ip6_input.c:311 __netif_receive_skb_one_core net/core/dev.c:6139 [inline] __netif_receive_skb+0x1df/0xac0 net/core/dev.c:6252 netif_receive_skb_internal net/core/dev.c:6338 [inline] netif_receive_skb+0x57/0x630 net/core/dev.c:6397 tun_rx_batched+0x1df/0x980 drivers/net/tun.c:1485 tun_get_user+0x5c0e/0x6c60 drivers/net/tun.c:1953 tun_chr_write_iter+0x3e9/0x5c0 drivers/net/tun.c:1999 new_sync_write fs/read_write.c:593 [inline] vfs_write+0xbe2/0x15d0 fs/read_write.c:686 ksys_write fs/read_write.c:738 [inline] __do_sys_write fs/read_write.c:749 [inline] __se_sys_write fs/read_write.c:746 [inline] __x64_sys_write+0x1fb/0x4d0 fs/read_write.c:746 x64_sys_call+0x30ab/0x3e70 arch/x86/include/generated/asm/syscalls_64.h:2 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xd3/0xf80 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f Uninit was created at: slab_post_alloc_hook mm/slub.c:4960 [inline] slab_alloc_node mm/slub.c:5263 [inline] kmem_cache_alloc_node_noprof+0x9e7/0x17a0 mm/slub.c:5315 kmalloc_reserve+0x13c/0x4b0 net/core/skbuff.c:586 __alloc_skb+0x805/0x1040 net/core/skbuff.c:690 alloc_skb include/linux/skbuff.h:1383 [inline] alloc_skb_with_frags+0xc5/0xa60 net/core/skbuff.c:6712 sock_alloc_send_pskb+0xacc/0xc60 net/core/sock.c:2995 tun_alloc_skb drivers/net/tun.c:1461 [inline] tun_get_user+0x1142/0x6c60 drivers/net/tun.c:1794 tun_chr_write_iter+0x3e9/0x5c0 drivers/net/tun.c:1999 new_sync_write fs/read_write.c:593 [inline] vfs_write+0xbe2/0x15d0 fs/read_write.c:686 ksys_write fs/read_write.c:738 [inline] __do_sys_write fs/read_write.c:749 [inline] __se_sys_write fs/read_write.c:746 [inline] __x64_sys_write+0x1fb/0x4d0 fs/read_write.c:746 x64_sys_call+0x30ab/0x3e70 arch/x86/include/generated/asm/syscalls_64.h:2 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xd3/0xf80 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f CPU: 0 UID: 0 PID: 6465 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(none) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025 Fixes: 8d975c15c0cd ("ip6_tunnel: make sure to pull inner header in __ip6_tnl_rcv()") Reported-by: syzbot+d4dda070f833dc5dc89a@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/695e88b2.050a0220.1c677c.036d.GAE@google.com/T/#u Signed-off-by: Eric Dumazet Link: https://patch.msgid.link/20260107163109.4188620-1-edumazet@google.com Signed-off-by: Jakub Kicinski --- net/ipv6/ip6_tunnel.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c index 6405072050e0e..c1f39735a2367 100644 --- a/net/ipv6/ip6_tunnel.c +++ b/net/ipv6/ip6_tunnel.c @@ -844,7 +844,7 @@ static int __ip6_tnl_rcv(struct ip6_tnl *tunnel, struct sk_buff *skb, skb_reset_network_header(skb); - if (!pskb_inet_may_pull(skb)) { + if (skb_vlan_inet_prepare(skb, true)) { DEV_STATS_INC(tunnel->dev, rx_length_errors); DEV_STATS_INC(tunnel->dev, rx_errors); goto drop; From 5263dfc745f5d4eeb35c3e72065977bc55e53b74 Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Thu, 8 Jan 2026 09:32:44 +0000 Subject: [PATCH 0987/1321] net: update netdev_lock_{type,name} Add missing entries in netdev_lock_type[] and netdev_lock_name[] : CAN, MCTP, RAWIP, CAIF, IP6GRE, 6LOWPAN, NETLINK, VSOCKMON, IEEE802154_MONITOR. Also add a WARN_ONCE() in netdev_lock_pos() to help future bug hunting next time a protocol is added without updating these arrays. Fixes: 1a33e10e4a95 ("net: partially revert dynamic lockdep key changes") Signed-off-by: Eric Dumazet Link: https://patch.msgid.link/20260108093244.830280-1-edumazet@google.com Signed-off-by: Jakub Kicinski --- net/core/dev.c | 25 +++++++++++++++++++------ 1 file changed, 19 insertions(+), 6 deletions(-) diff --git a/net/core/dev.c b/net/core/dev.c index 36dc5199037ed..9af9c3df452f7 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -478,15 +478,21 @@ static const unsigned short netdev_lock_type[] = { ARPHRD_IEEE1394, ARPHRD_EUI64, ARPHRD_INFINIBAND, ARPHRD_SLIP, ARPHRD_CSLIP, ARPHRD_SLIP6, ARPHRD_CSLIP6, ARPHRD_RSRVD, ARPHRD_ADAPT, ARPHRD_ROSE, ARPHRD_X25, ARPHRD_HWX25, + ARPHRD_CAN, ARPHRD_MCTP, ARPHRD_PPP, ARPHRD_CISCO, ARPHRD_LAPB, ARPHRD_DDCMP, - ARPHRD_RAWHDLC, ARPHRD_TUNNEL, ARPHRD_TUNNEL6, ARPHRD_FRAD, + ARPHRD_RAWHDLC, ARPHRD_RAWIP, + ARPHRD_TUNNEL, ARPHRD_TUNNEL6, ARPHRD_FRAD, ARPHRD_SKIP, ARPHRD_LOOPBACK, ARPHRD_LOCALTLK, ARPHRD_FDDI, ARPHRD_BIF, ARPHRD_SIT, ARPHRD_IPDDP, ARPHRD_IPGRE, ARPHRD_PIMREG, ARPHRD_HIPPI, ARPHRD_ASH, ARPHRD_ECONET, ARPHRD_IRDA, ARPHRD_FCPP, ARPHRD_FCAL, ARPHRD_FCPL, ARPHRD_FCFABRIC, ARPHRD_IEEE80211, ARPHRD_IEEE80211_PRISM, - ARPHRD_IEEE80211_RADIOTAP, ARPHRD_PHONET, ARPHRD_PHONET_PIPE, - ARPHRD_IEEE802154, ARPHRD_VOID, ARPHRD_NONE}; + ARPHRD_IEEE80211_RADIOTAP, + ARPHRD_IEEE802154, ARPHRD_IEEE802154_MONITOR, + ARPHRD_PHONET, ARPHRD_PHONET_PIPE, + ARPHRD_CAIF, ARPHRD_IP6GRE, ARPHRD_NETLINK, ARPHRD_6LOWPAN, + ARPHRD_VSOCKMON, + ARPHRD_VOID, ARPHRD_NONE}; static const char *const netdev_lock_name[] = { "_xmit_NETROM", "_xmit_ETHER", "_xmit_EETHER", "_xmit_AX25", @@ -495,15 +501,21 @@ static const char *const netdev_lock_name[] = { "_xmit_IEEE1394", "_xmit_EUI64", "_xmit_INFINIBAND", "_xmit_SLIP", "_xmit_CSLIP", "_xmit_SLIP6", "_xmit_CSLIP6", "_xmit_RSRVD", "_xmit_ADAPT", "_xmit_ROSE", "_xmit_X25", "_xmit_HWX25", + "_xmit_CAN", "_xmit_MCTP", "_xmit_PPP", "_xmit_CISCO", "_xmit_LAPB", "_xmit_DDCMP", - "_xmit_RAWHDLC", "_xmit_TUNNEL", "_xmit_TUNNEL6", "_xmit_FRAD", + "_xmit_RAWHDLC", "_xmit_RAWIP", + "_xmit_TUNNEL", "_xmit_TUNNEL6", "_xmit_FRAD", "_xmit_SKIP", "_xmit_LOOPBACK", "_xmit_LOCALTLK", "_xmit_FDDI", "_xmit_BIF", "_xmit_SIT", "_xmit_IPDDP", "_xmit_IPGRE", "_xmit_PIMREG", "_xmit_HIPPI", "_xmit_ASH", "_xmit_ECONET", "_xmit_IRDA", "_xmit_FCPP", "_xmit_FCAL", "_xmit_FCPL", "_xmit_FCFABRIC", "_xmit_IEEE80211", "_xmit_IEEE80211_PRISM", - "_xmit_IEEE80211_RADIOTAP", "_xmit_PHONET", "_xmit_PHONET_PIPE", - "_xmit_IEEE802154", "_xmit_VOID", "_xmit_NONE"}; + "_xmit_IEEE80211_RADIOTAP", + "_xmit_IEEE802154", "_xmit_IEEE802154_MONITOR", + "_xmit_PHONET", "_xmit_PHONET_PIPE", + "_xmit_CAIF", "_xmit_IP6GRE", "_xmit_NETLINK", "_xmit_6LOWPAN", + "_xmit_VSOCKMON", + "_xmit_VOID", "_xmit_NONE"}; static struct lock_class_key netdev_xmit_lock_key[ARRAY_SIZE(netdev_lock_type)]; static struct lock_class_key netdev_addr_lock_key[ARRAY_SIZE(netdev_lock_type)]; @@ -516,6 +528,7 @@ static inline unsigned short netdev_lock_pos(unsigned short dev_type) if (netdev_lock_type[i] == dev_type) return i; /* the last key is used by default */ + WARN_ONCE(1, "netdev_lock_pos() could not find dev_type=%u\n", dev_type); return ARRAY_SIZE(netdev_lock_type) - 1; } From 2e225b2cdb0b54b615d3bc2c4913eab12f3e4be3 Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Thu, 8 Jan 2026 13:36:51 +0000 Subject: [PATCH 0988/1321] macvlan: fix possible UAF in macvlan_forward_source() Add RCU protection on (struct macvlan_source_entry)->vlan. Whenever macvlan_hash_del_source() is called, we must clear entry->vlan pointer before RCU grace period starts. This allows macvlan_forward_source() to skip over entries queued for freeing. Note that macvlan_dev are already RCU protected, as they are embedded in a standard netdev (netdev_priv(ndev)). Fixes: 79cf79abce71 ("macvlan: add source mode") Reported-by: syzbot+7182fbe91e58602ec1fe@syzkaller.appspotmail.com https: //lore.kernel.org/netdev/695fb1e8.050a0220.1c677c.039f.GAE@google.com/T/#u Signed-off-by: Eric Dumazet Link: https://patch.msgid.link/20260108133651.1130486-1-edumazet@google.com Signed-off-by: Jakub Kicinski --- drivers/net/macvlan.c | 20 +++++++++++++------- 1 file changed, 13 insertions(+), 7 deletions(-) diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c index 7966545512cfe..b4df7e184791d 100644 --- a/drivers/net/macvlan.c +++ b/drivers/net/macvlan.c @@ -59,7 +59,7 @@ struct macvlan_port { struct macvlan_source_entry { struct hlist_node hlist; - struct macvlan_dev *vlan; + struct macvlan_dev __rcu *vlan; unsigned char addr[6+2] __aligned(sizeof(u16)); struct rcu_head rcu; }; @@ -146,7 +146,7 @@ static struct macvlan_source_entry *macvlan_hash_lookup_source( hlist_for_each_entry_rcu(entry, h, hlist, lockdep_rtnl_is_held()) { if (ether_addr_equal_64bits(entry->addr, addr) && - entry->vlan == vlan) + rcu_access_pointer(entry->vlan) == vlan) return entry; } return NULL; @@ -168,7 +168,7 @@ static int macvlan_hash_add_source(struct macvlan_dev *vlan, return -ENOMEM; ether_addr_copy(entry->addr, addr); - entry->vlan = vlan; + RCU_INIT_POINTER(entry->vlan, vlan); h = &port->vlan_source_hash[macvlan_eth_hash(addr)]; hlist_add_head_rcu(&entry->hlist, h); vlan->macaddr_count++; @@ -187,6 +187,7 @@ static void macvlan_hash_add(struct macvlan_dev *vlan) static void macvlan_hash_del_source(struct macvlan_source_entry *entry) { + RCU_INIT_POINTER(entry->vlan, NULL); hlist_del_rcu(&entry->hlist); kfree_rcu(entry, rcu); } @@ -390,7 +391,7 @@ static void macvlan_flush_sources(struct macvlan_port *port, int i; hash_for_each_safe(port->vlan_source_hash, i, next, entry, hlist) - if (entry->vlan == vlan) + if (rcu_access_pointer(entry->vlan) == vlan) macvlan_hash_del_source(entry); vlan->macaddr_count = 0; @@ -433,9 +434,14 @@ static bool macvlan_forward_source(struct sk_buff *skb, hlist_for_each_entry_rcu(entry, h, hlist) { if (ether_addr_equal_64bits(entry->addr, addr)) { - if (entry->vlan->flags & MACVLAN_FLAG_NODST) + struct macvlan_dev *vlan = rcu_dereference(entry->vlan); + + if (!vlan) + continue; + + if (vlan->flags & MACVLAN_FLAG_NODST) consume = true; - macvlan_forward_source_one(skb, entry->vlan); + macvlan_forward_source_one(skb, vlan); } } @@ -1680,7 +1686,7 @@ static int macvlan_fill_info_macaddr(struct sk_buff *skb, struct macvlan_source_entry *entry; hlist_for_each_entry_rcu(entry, h, hlist, lockdep_rtnl_is_held()) { - if (entry->vlan != vlan) + if (rcu_access_pointer(entry->vlan) != vlan) continue; if (nla_put(skb, IFLA_MACVLAN_MACADDR, ETH_ALEN, entry->addr)) return 1; From 65147937e5f2628b7eddeb206f0c3bbeea855f0b Mon Sep 17 00:00:00 2001 From: Bui Quang Minh Date: Tue, 6 Jan 2026 22:04:36 +0700 Subject: [PATCH 0989/1321] virtio-net: don't schedule delayed refill worker When we fail to refill the receive buffers, we schedule a delayed worker to retry later. However, this worker creates some concurrency issues. For example, when the worker runs concurrently with virtnet_xdp_set, both need to temporarily disable queue's NAPI before enabling again. Without proper synchronization, a deadlock can happen when napi_disable() is called on an already disabled NAPI. That napi_disable() call will be stuck and so will the subsequent napi_enable() call. To simplify the logic and avoid further problems, we will instead retry refilling in the next NAPI poll. Fixes: 4bc12818b363 ("virtio-net: disable delayed refill when pausing rx") Reported-by: Paolo Abeni Closes: https://lore.kernel.org/526b5396-459d-4d02-8635-a222d07b46d7@redhat.com Cc: stable@vger.kernel.org Suggested-by: Xuan Zhuo Signed-off-by: Bui Quang Minh Acked-by: Michael S. Tsirkin Link: https://patch.msgid.link/20260106150438.7425-2-minhquangbui99@gmail.com Signed-off-by: Jakub Kicinski --- drivers/net/virtio_net.c | 47 ++++++++++++++++++++-------------------- 1 file changed, 24 insertions(+), 23 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 22d894101c018..fd8859bd41c7c 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -3046,16 +3046,16 @@ static int virtnet_receive(struct receive_queue *rq, int budget, else packets = virtnet_receive_packets(vi, rq, budget, xdp_xmit, &stats); + u64_stats_set(&stats.packets, packets); if (rq->vq->num_free > min((unsigned int)budget, virtqueue_get_vring_size(rq->vq)) / 2) { - if (!try_fill_recv(vi, rq, GFP_ATOMIC)) { - spin_lock(&vi->refill_lock); - if (vi->refill_enabled) - schedule_delayed_work(&vi->refill, 0); - spin_unlock(&vi->refill_lock); - } + if (!try_fill_recv(vi, rq, GFP_ATOMIC)) + /* We need to retry refilling in the next NAPI poll so + * we must return budget to make sure the NAPI is + * repolled. + */ + packets = budget; } - u64_stats_set(&stats.packets, packets); u64_stats_update_begin(&rq->stats.syncp); for (i = 0; i < ARRAY_SIZE(virtnet_rq_stats_desc); i++) { size_t offset = virtnet_rq_stats_desc[i].offset; @@ -3230,9 +3230,10 @@ static int virtnet_open(struct net_device *dev) for (i = 0; i < vi->max_queue_pairs; i++) { if (i < vi->curr_queue_pairs) - /* Make sure we have some buffers: if oom use wq. */ - if (!try_fill_recv(vi, &vi->rq[i], GFP_KERNEL)) - schedule_delayed_work(&vi->refill, 0); + /* Pre-fill rq agressively, to make sure we are ready to + * get packets immediately. + */ + try_fill_recv(vi, &vi->rq[i], GFP_KERNEL); err = virtnet_enable_queue_pair(vi, i); if (err < 0) @@ -3472,16 +3473,15 @@ static void __virtnet_rx_resume(struct virtnet_info *vi, struct receive_queue *rq, bool refill) { - bool running = netif_running(vi->dev); - bool schedule_refill = false; + if (netif_running(vi->dev)) { + /* Pre-fill rq agressively, to make sure we are ready to get + * packets immediately. + */ + if (refill) + try_fill_recv(vi, rq, GFP_KERNEL); - if (refill && !try_fill_recv(vi, rq, GFP_KERNEL)) - schedule_refill = true; - if (running) virtnet_napi_enable(rq); - - if (schedule_refill) - schedule_delayed_work(&vi->refill, 0); + } } static void virtnet_rx_resume_all(struct virtnet_info *vi) @@ -3829,11 +3829,12 @@ static int virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) } succ: vi->curr_queue_pairs = queue_pairs; - /* virtnet_open() will refill when device is going to up. */ - spin_lock_bh(&vi->refill_lock); - if (dev->flags & IFF_UP && vi->refill_enabled) - schedule_delayed_work(&vi->refill, 0); - spin_unlock_bh(&vi->refill_lock); + if (dev->flags & IFF_UP) { + local_bh_disable(); + for (int i = 0; i < vi->curr_queue_pairs; ++i) + virtqueue_napi_schedule(&vi->rq[i].napi, vi->rq[i].vq); + local_bh_enable(); + } return 0; } From 172caac2372bff23b758050d6fd0e184f69b365f Mon Sep 17 00:00:00 2001 From: Bui Quang Minh Date: Tue, 6 Jan 2026 22:04:37 +0700 Subject: [PATCH 0990/1321] virtio-net: remove unused delayed refill worker Since we switched to retry refilling receive buffer in NAPI poll instead of delayed worker, remove all now unused delayed refill worker code. Acked-by: Michael S. Tsirkin Signed-off-by: Bui Quang Minh Link: https://patch.msgid.link/20260106150438.7425-3-minhquangbui99@gmail.com Signed-off-by: Jakub Kicinski --- drivers/net/virtio_net.c | 86 ---------------------------------------- 1 file changed, 86 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index fd8859bd41c7c..9e37708243983 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -441,9 +441,6 @@ struct virtnet_info { /* Packet virtio header size */ u8 hdr_len; - /* Work struct for delayed refilling if we run low on memory. */ - struct delayed_work refill; - /* UDP tunnel support */ bool tx_tnl; @@ -451,12 +448,6 @@ struct virtnet_info { bool rx_tnl_csum; - /* Is delayed refill enabled? */ - bool refill_enabled; - - /* The lock to synchronize the access to refill_enabled */ - spinlock_t refill_lock; - /* Work struct for config space updates */ struct work_struct config_work; @@ -720,20 +711,6 @@ static void virtnet_rq_free_buf(struct virtnet_info *vi, put_page(virt_to_head_page(buf)); } -static void enable_delayed_refill(struct virtnet_info *vi) -{ - spin_lock_bh(&vi->refill_lock); - vi->refill_enabled = true; - spin_unlock_bh(&vi->refill_lock); -} - -static void disable_delayed_refill(struct virtnet_info *vi) -{ - spin_lock_bh(&vi->refill_lock); - vi->refill_enabled = false; - spin_unlock_bh(&vi->refill_lock); -} - static void enable_rx_mode_work(struct virtnet_info *vi) { rtnl_lock(); @@ -2948,42 +2925,6 @@ static void virtnet_napi_disable(struct receive_queue *rq) napi_disable(napi); } -static void refill_work(struct work_struct *work) -{ - struct virtnet_info *vi = - container_of(work, struct virtnet_info, refill.work); - bool still_empty; - int i; - - for (i = 0; i < vi->curr_queue_pairs; i++) { - struct receive_queue *rq = &vi->rq[i]; - - /* - * When queue API support is added in the future and the call - * below becomes napi_disable_locked, this driver will need to - * be refactored. - * - * One possible solution would be to: - * - cancel refill_work with cancel_delayed_work (note: - * non-sync) - * - cancel refill_work with cancel_delayed_work_sync in - * virtnet_remove after the netdev is unregistered - * - wrap all of the work in a lock (perhaps the netdev - * instance lock) - * - check netif_running() and return early to avoid a race - */ - napi_disable(&rq->napi); - still_empty = !try_fill_recv(vi, rq, GFP_KERNEL); - virtnet_napi_do_enable(rq->vq, &rq->napi); - - /* In theory, this can happen: if we don't get any buffers in - * we will *never* try to fill again. - */ - if (still_empty) - schedule_delayed_work(&vi->refill, HZ/2); - } -} - static int virtnet_receive_xsk_bufs(struct virtnet_info *vi, struct receive_queue *rq, int budget, @@ -3226,8 +3167,6 @@ static int virtnet_open(struct net_device *dev) struct virtnet_info *vi = netdev_priv(dev); int i, err; - enable_delayed_refill(vi); - for (i = 0; i < vi->max_queue_pairs; i++) { if (i < vi->curr_queue_pairs) /* Pre-fill rq agressively, to make sure we are ready to @@ -3252,9 +3191,6 @@ static int virtnet_open(struct net_device *dev) return 0; err_enable_qp: - disable_delayed_refill(vi); - cancel_delayed_work_sync(&vi->refill); - for (i--; i >= 0; i--) { virtnet_disable_queue_pair(vi, i); virtnet_cancel_dim(vi, &vi->rq[i].dim); @@ -3448,24 +3384,12 @@ static void virtnet_rx_pause_all(struct virtnet_info *vi) { int i; - /* - * Make sure refill_work does not run concurrently to - * avoid napi_disable race which leads to deadlock. - */ - disable_delayed_refill(vi); - cancel_delayed_work_sync(&vi->refill); for (i = 0; i < vi->max_queue_pairs; i++) __virtnet_rx_pause(vi, &vi->rq[i]); } static void virtnet_rx_pause(struct virtnet_info *vi, struct receive_queue *rq) { - /* - * Make sure refill_work does not run concurrently to - * avoid napi_disable race which leads to deadlock. - */ - disable_delayed_refill(vi); - cancel_delayed_work_sync(&vi->refill); __virtnet_rx_pause(vi, rq); } @@ -3488,7 +3412,6 @@ static void virtnet_rx_resume_all(struct virtnet_info *vi) { int i; - enable_delayed_refill(vi); for (i = 0; i < vi->max_queue_pairs; i++) { if (i < vi->curr_queue_pairs) __virtnet_rx_resume(vi, &vi->rq[i], true); @@ -3499,7 +3422,6 @@ static void virtnet_rx_resume_all(struct virtnet_info *vi) static void virtnet_rx_resume(struct virtnet_info *vi, struct receive_queue *rq) { - enable_delayed_refill(vi); __virtnet_rx_resume(vi, rq, true); } @@ -3844,10 +3766,6 @@ static int virtnet_close(struct net_device *dev) struct virtnet_info *vi = netdev_priv(dev); int i; - /* Make sure NAPI doesn't schedule refill work */ - disable_delayed_refill(vi); - /* Make sure refill_work doesn't re-enable napi! */ - cancel_delayed_work_sync(&vi->refill); /* Prevent the config change callback from changing carrier * after close */ @@ -5803,7 +5721,6 @@ static int virtnet_restore_up(struct virtio_device *vdev) virtio_device_ready(vdev); - enable_delayed_refill(vi); enable_rx_mode_work(vi); if (netif_running(vi->dev)) { @@ -6560,7 +6477,6 @@ static int virtnet_alloc_queues(struct virtnet_info *vi) if (!vi->rq) goto err_rq; - INIT_DELAYED_WORK(&vi->refill, refill_work); for (i = 0; i < vi->max_queue_pairs; i++) { vi->rq[i].pages = NULL; netif_napi_add_config(vi->dev, &vi->rq[i].napi, virtnet_poll, @@ -6902,7 +6818,6 @@ static int virtnet_probe(struct virtio_device *vdev) INIT_WORK(&vi->config_work, virtnet_config_changed_work); INIT_WORK(&vi->rx_mode_work, virtnet_rx_mode_work); - spin_lock_init(&vi->refill_lock); if (virtio_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF)) { vi->mergeable_rx_bufs = true; @@ -7166,7 +7081,6 @@ static int virtnet_probe(struct virtio_device *vdev) net_failover_destroy(vi->failover); free_vqs: virtio_reset_device(vdev); - cancel_delayed_work_sync(&vi->refill); free_receive_page_frags(vi); virtnet_del_vqs(vi); free: From 4772b5da3870da57526b7efa12009297ea80b9c4 Mon Sep 17 00:00:00 2001 From: Bui Quang Minh Date: Tue, 6 Jan 2026 22:04:38 +0700 Subject: [PATCH 0991/1321] virtio-net: clean up __virtnet_rx_pause/resume The delayed refill worker is removed which makes virtnet_rx_pause/resume quite the same as __virtnet_rx_pause/resume. So remove __virtnet_rx_pause/resume and move the code to virtnet_rx_pause/resume. Acked-by: Michael S. Tsirkin Signed-off-by: Bui Quang Minh Link: https://patch.msgid.link/20260106150438.7425-4-minhquangbui99@gmail.com Signed-off-by: Jakub Kicinski --- drivers/net/virtio_net.c | 30 ++++++++++-------------------- 1 file changed, 10 insertions(+), 20 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 9e37708243983..ca92b4a1879ce 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -3369,8 +3369,8 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev) return NETDEV_TX_OK; } -static void __virtnet_rx_pause(struct virtnet_info *vi, - struct receive_queue *rq) +static void virtnet_rx_pause(struct virtnet_info *vi, + struct receive_queue *rq) { bool running = netif_running(vi->dev); @@ -3385,17 +3385,12 @@ static void virtnet_rx_pause_all(struct virtnet_info *vi) int i; for (i = 0; i < vi->max_queue_pairs; i++) - __virtnet_rx_pause(vi, &vi->rq[i]); + virtnet_rx_pause(vi, &vi->rq[i]); } -static void virtnet_rx_pause(struct virtnet_info *vi, struct receive_queue *rq) -{ - __virtnet_rx_pause(vi, rq); -} - -static void __virtnet_rx_resume(struct virtnet_info *vi, - struct receive_queue *rq, - bool refill) +static void virtnet_rx_resume(struct virtnet_info *vi, + struct receive_queue *rq, + bool refill) { if (netif_running(vi->dev)) { /* Pre-fill rq agressively, to make sure we are ready to get @@ -3414,17 +3409,12 @@ static void virtnet_rx_resume_all(struct virtnet_info *vi) for (i = 0; i < vi->max_queue_pairs; i++) { if (i < vi->curr_queue_pairs) - __virtnet_rx_resume(vi, &vi->rq[i], true); + virtnet_rx_resume(vi, &vi->rq[i], true); else - __virtnet_rx_resume(vi, &vi->rq[i], false); + virtnet_rx_resume(vi, &vi->rq[i], false); } } -static void virtnet_rx_resume(struct virtnet_info *vi, struct receive_queue *rq) -{ - __virtnet_rx_resume(vi, rq, true); -} - static int virtnet_rx_resize(struct virtnet_info *vi, struct receive_queue *rq, u32 ring_num) { @@ -3438,7 +3428,7 @@ static int virtnet_rx_resize(struct virtnet_info *vi, if (err) netdev_err(vi->dev, "resize rx fail: rx queue index: %d err: %d\n", qindex, err); - virtnet_rx_resume(vi, rq); + virtnet_rx_resume(vi, rq, true); return err; } @@ -5810,7 +5800,7 @@ static int virtnet_rq_bind_xsk_pool(struct virtnet_info *vi, struct receive_queu rq->xsk_pool = pool; - virtnet_rx_resume(vi, rq); + virtnet_rx_resume(vi, rq, true); if (pool) return 0; From 2c6932ba081ccce4240218366e21531d98a2f4f9 Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Thu, 8 Jan 2026 19:02:14 +0000 Subject: [PATCH 0992/1321] ipv4: ip_gre: make ipgre_header() robust Analog to commit db5b4e39c4e6 ("ip6_gre: make ip6gre_header() robust") Over the years, syzbot found many ways to crash the kernel in ipgre_header() [1]. This involves team or bonding drivers ability to dynamically change their dev->needed_headroom and/or dev->hard_header_len In this particular crash mld_newpack() allocated an skb with a too small reserve/headroom, and by the time mld_sendpack() was called, syzbot managed to attach an ipgre device. [1] skbuff: skb_under_panic: text:ffffffff89ea3cb7 len:2030915468 put:2030915372 head:ffff888058b43000 data:ffff887fdfa6e194 tail:0x120 end:0x6c0 dev:team0 kernel BUG at net/core/skbuff.c:213 ! Oops: invalid opcode: 0000 [#1] SMP KASAN PTI CPU: 1 UID: 0 PID: 1322 Comm: kworker/1:9 Not tainted syzkaller #0 PREEMPT(full) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025 Workqueue: mld mld_ifc_work RIP: 0010:skb_panic+0x157/0x160 net/core/skbuff.c:213 Call Trace: skb_under_panic net/core/skbuff.c:223 [inline] skb_push+0xc3/0xe0 net/core/skbuff.c:2641 ipgre_header+0x67/0x290 net/ipv4/ip_gre.c:897 dev_hard_header include/linux/netdevice.h:3436 [inline] neigh_connected_output+0x286/0x460 net/core/neighbour.c:1618 NF_HOOK_COND include/linux/netfilter.h:307 [inline] ip6_output+0x340/0x550 net/ipv6/ip6_output.c:247 NF_HOOK+0x9e/0x380 include/linux/netfilter.h:318 mld_sendpack+0x8d4/0xe60 net/ipv6/mcast.c:1855 mld_send_cr net/ipv6/mcast.c:2154 [inline] mld_ifc_work+0x83e/0xd60 net/ipv6/mcast.c:2693 process_one_work kernel/workqueue.c:3257 [inline] process_scheduled_works+0xad1/0x1770 kernel/workqueue.c:3340 worker_thread+0x8a0/0xda0 kernel/workqueue.c:3421 kthread+0x711/0x8a0 kernel/kthread.c:463 ret_from_fork+0x510/0xa50 arch/x86/kernel/process.c:158 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246 Fixes: c54419321455 ("GRE: Refactor GRE tunneling code.") Reported-by: syzbot+7c134e1c3aa3283790b9@syzkaller.appspotmail.com Closes: https://www.spinics.net/lists/netdev/msg1147302.html Signed-off-by: Eric Dumazet Link: https://patch.msgid.link/20260108190214.1667040-1-edumazet@google.com Signed-off-by: Jakub Kicinski --- net/ipv4/ip_gre.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c index 8178c44a3cdd4..e13244729ad8d 100644 --- a/net/ipv4/ip_gre.c +++ b/net/ipv4/ip_gre.c @@ -891,10 +891,17 @@ static int ipgre_header(struct sk_buff *skb, struct net_device *dev, const void *daddr, const void *saddr, unsigned int len) { struct ip_tunnel *t = netdev_priv(dev); - struct iphdr *iph; struct gre_base_hdr *greh; + struct iphdr *iph; + int needed; + + needed = t->hlen + sizeof(*iph); + if (skb_headroom(skb) < needed && + pskb_expand_head(skb, HH_DATA_ALIGN(needed - skb_headroom(skb)), + 0, GFP_ATOMIC)) + return -needed; - iph = skb_push(skb, t->hlen + sizeof(*iph)); + iph = skb_push(skb, needed); greh = (struct gre_base_hdr *)(iph+1); greh->flags = gre_tnl_flags_to_gre_flags(t->parms.o_flags); greh->protocol = htons(type); From 03cf52a0ed0b14ad58ff719eeaa9e335f782959e Mon Sep 17 00:00:00 2001 From: Stefano Garzarella Date: Thu, 8 Jan 2026 12:44:19 +0100 Subject: [PATCH 0993/1321] vsock/test: add a final full barrier after run all tests If the last test fails, the other side still completes correctly, which could lead to false positives. Let's add a final barrier that ensures that the last test has finished correctly on both sides, but also that the two sides agree on the number of tests to be performed. Fixes: 2f65b44e199c ("VSOCK: add full barrier between test cases") Reviewed-by: Luigi Leonardi Signed-off-by: Stefano Garzarella Link: https://patch.msgid.link/20260108114419.52747-1-sgarzare@redhat.com Signed-off-by: Jakub Kicinski --- tools/testing/vsock/util.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/tools/testing/vsock/util.c b/tools/testing/vsock/util.c index d843643ced6b7..9430ef5b8bc3e 100644 --- a/tools/testing/vsock/util.c +++ b/tools/testing/vsock/util.c @@ -511,6 +511,18 @@ void run_tests(const struct test_case *test_cases, printf("ok\n"); } + + printf("All tests have been executed. Waiting other peer..."); + fflush(stdout); + + /* + * Final full barrier, to ensure that all tests have been run and + * that even the last one has been successful on both sides. + */ + control_writeln("COMPLETED"); + control_expectln("COMPLETED"); + + printf("ok\n"); } void list_tests(const struct test_case *test_cases) From 95d2e6275ba9bd341bf230ad8b1b456b94c4f644 Mon Sep 17 00:00:00 2001 From: Yang Li Date: Fri, 19 Dec 2025 10:43:09 +0800 Subject: [PATCH 0994/1321] Bluetooth: hci_sync: enable PA Sync Lost event Enable the PA Sync Lost event mask to ensure PA sync loss is properly reported and handled. Fixes: 485e0626e587 ("Bluetooth: hci_event: Fix not handling PA Sync Lost event") Signed-off-by: Yang Li Signed-off-by: Luiz Augusto von Dentz --- net/bluetooth/hci_sync.c | 1 + 1 file changed, 1 insertion(+) diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c index a9f5b1a68356e..cbc3a75d73262 100644 --- a/net/bluetooth/hci_sync.c +++ b/net/bluetooth/hci_sync.c @@ -4420,6 +4420,7 @@ static int hci_le_set_event_mask_sync(struct hci_dev *hdev) if (bis_capable(hdev)) { events[1] |= 0x20; /* LE PA Report */ events[1] |= 0x40; /* LE PA Sync Established */ + events[1] |= 0x80; /* LE PA Sync Lost */ events[3] |= 0x04; /* LE Create BIG Complete */ events[3] |= 0x08; /* LE Terminate BIG Complete */ events[3] |= 0x10; /* LE BIG Sync Established */ From da276f02bc8a4942a629d08298394d002c193807 Mon Sep 17 00:00:00 2001 From: Szymon Wilczek Date: Tue, 23 Dec 2025 02:17:32 +0100 Subject: [PATCH 0995/1321] can: etas_es58x: allow partial RX URB allocation to succeed When es58x_alloc_rx_urbs() fails to allocate the requested number of URBs but succeeds in allocating some, it returns an error code. This causes es58x_open() to return early, skipping the cleanup label 'free_urbs', which leads to the anchored URBs being leaked. As pointed out by maintainer Vincent Mailhol, the driver is designed to handle partial URB allocation gracefully. Therefore, partial allocation should not be treated as a fatal error. Modify es58x_alloc_rx_urbs() to return 0 if at least one URB has been allocated, restoring the intended behavior and preventing the leak in es58x_open(). Fixes: 8537257874e9 ("can: etas_es58x: add core support for ETAS ES58X CAN USB interfaces") Reported-by: syzbot+e8cb6691a7cf68256cb8@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=e8cb6691a7cf68256cb8 Signed-off-by: Szymon Wilczek Reviewed-by: Vincent Mailhol Link: https://patch.msgid.link/20251223011732.39361-1-swilczek.lx@gmail.com Signed-off-by: Marc Kleine-Budde --- drivers/net/can/usb/etas_es58x/es58x_core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/can/usb/etas_es58x/es58x_core.c b/drivers/net/can/usb/etas_es58x/es58x_core.c index f799233c2b72c..2d248deb69dc1 100644 --- a/drivers/net/can/usb/etas_es58x/es58x_core.c +++ b/drivers/net/can/usb/etas_es58x/es58x_core.c @@ -1736,7 +1736,7 @@ static int es58x_alloc_rx_urbs(struct es58x_device *es58x_dev) dev_dbg(dev, "%s: Allocated %d rx URBs each of size %u\n", __func__, i, rx_buf_len); - return ret; + return 0; } /** From c6c5590bf2a88753626cdbb1cdfab5089f6cd7ab Mon Sep 17 00:00:00 2001 From: Marc Kleine-Budde Date: Tue, 23 Dec 2025 21:21:39 +0100 Subject: [PATCH 0996/1321] can: gs_usb: gs_usb_receive_bulk_callback(): fix URB memory leak In gs_can_open(), the URBs for USB-in transfers are allocated, added to the parent->rx_submitted anchor and submitted. In the complete callback gs_usb_receive_bulk_callback(), the URB is processed and resubmitted. In gs_can_close() the URBs are freed by calling usb_kill_anchored_urbs(parent->rx_submitted). However, this does not take into account that the USB framework unanchors the URB before the complete function is called. This means that once an in-URB has been completed, it is no longer anchored and is ultimately not released in gs_can_close(). Fix the memory leak by anchoring the URB in the gs_usb_receive_bulk_callback() to the parent->rx_submitted anchor. Fixes: d08e973a77d1 ("can: gs_usb: Added support for the GS_USB CAN devices") Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20260105-gs_usb-fix-memory-leak-v2-1-cc6ed6438034@pengutronix.de Signed-off-by: Marc Kleine-Budde --- drivers/net/can/usb/gs_usb.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/net/can/usb/gs_usb.c b/drivers/net/can/usb/gs_usb.c index a0233e550a5ad..d093babbc3201 100644 --- a/drivers/net/can/usb/gs_usb.c +++ b/drivers/net/can/usb/gs_usb.c @@ -751,6 +751,8 @@ static void gs_usb_receive_bulk_callback(struct urb *urb) hf, parent->hf_size_rx, gs_usb_receive_bulk_callback, parent); + usb_anchor_urb(urb, &parent->rx_submitted); + rc = usb_submit_urb(urb, GFP_ATOMIC); /* USB failure take down all interfaces */ From 240cbd7e4d175734418a4b2ff2aa4fa20e6ea1a1 Mon Sep 17 00:00:00 2001 From: Ondrej Ille Date: Mon, 5 Jan 2026 12:16:20 +0100 Subject: [PATCH 0997/1321] can: ctucanfd: fix SSP_SRC in cases when bit-rate is higher than 1 MBit. The Secondary Sample Point Source field has been set to an incorrect value by some mistake in the past 0b01 - SSP_SRC_NO_SSP - SSP is not used. for data bitrates above 1 MBit/s. The correct/default value already used for lower bitrates is 0b00 - SSP_SRC_MEAS_N_OFFSET - SSP position = TRV_DELAY (Measured Transmitter delay) + SSP_OFFSET. The related configuration register structure is described in section 3.1.46 SSP_CFG of the CTU CAN FD IP CORE Datasheet. The analysis leading to the proper configuration is described in section 2.8.3 Secondary sampling point of the datasheet. The change has been tested on AMD/Xilinx Zynq with the next CTU CN FD IP core versions: - 2.6 aka master in the "integration with Zynq-7000 system" test 6.12.43-rt12+ #1 SMP PREEMPT_RT kernel with CTU CAN FD git driver (change already included in the driver repo) - older 2.5 snapshot with mainline kernels with this patch applied locally in the multiple CAN latency tester nightly runs 6.18.0-rc4-rt3-dut #1 SMP PREEMPT_RT 6.19.0-rc3-dut The logs, the datasheet and sources are available at https://canbus.pages.fel.cvut.cz/ Signed-off-by: Ondrej Ille Signed-off-by: Pavel Pisa Link: https://patch.msgid.link/20260105111620.16580-1-pisa@fel.cvut.cz Fixes: 2dcb8e8782d8 ("can: ctucanfd: add support for CTU CAN FD open-source IP core - bus independent part.") Cc: stable@vger.kernel.org Signed-off-by: Marc Kleine-Budde --- drivers/net/can/ctucanfd/ctucanfd_base.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/can/ctucanfd/ctucanfd_base.c b/drivers/net/can/ctucanfd/ctucanfd_base.c index 1e6b9e3dc2fea..0ea1ff28dfce8 100644 --- a/drivers/net/can/ctucanfd/ctucanfd_base.c +++ b/drivers/net/can/ctucanfd/ctucanfd_base.c @@ -310,7 +310,7 @@ static int ctucan_set_secondary_sample_point(struct net_device *ndev) } ssp_cfg = FIELD_PREP(REG_TRV_DELAY_SSP_OFFSET, ssp_offset); - ssp_cfg |= FIELD_PREP(REG_TRV_DELAY_SSP_SRC, 0x1); + ssp_cfg |= FIELD_PREP(REG_TRV_DELAY_SSP_SRC, 0x0); } ctucan_write32(priv, CTUCANFD_TRV_DELAY, ssp_cfg); From eda82c3ea9e2b59710a3cacdfda221746005681f Mon Sep 17 00:00:00 2001 From: Saeed Mahameed Date: Thu, 8 Jan 2026 13:26:54 -0800 Subject: [PATCH 0998/1321] net/mlx5e: Fix crash on profile change rollback failure mlx5e_netdev_change_profile can fail to attach a new profile and can fail to rollback to old profile, in such case, we could end up with a dangling netdev with a fully reset netdev_priv. A retry to change profile, e.g. another attempt to call mlx5e_netdev_change_profile via switchdev mode change, will crash trying to access the now NULL priv->mdev. This fix allows mlx5e_netdev_change_profile() to handle previous failures and an empty priv, by not assuming priv is valid. Pass netdev and mdev to all flows requiring mlx5e_netdev_change_profile() and avoid passing priv. In mlx5e_netdev_change_profile() check if current priv is valid, and if not, just attach the new profile without trying to access the old one. This fixes the following oops, when enabling switchdev mode for the 2nd time after first time failure: ## Enabling switchdev mode first time: mlx5_core 0012:03:00.1: E-Switch: Supported tc chains and prios offload workqueue: Failed to create a rescuer kthread for wq "mlx5e": -EINTR mlx5_core 0012:03:00.1: mlx5e_netdev_init_profile:6214:(pid 37199): mlx5e_priv_init failed, err=-12 mlx5_core 0012:03:00.1 gpu3rdma1: mlx5e_netdev_change_profile: new profile init failed, -12 workqueue: Failed to create a rescuer kthread for wq "mlx5e": -EINTR mlx5_core 0012:03:00.1: mlx5e_netdev_init_profile:6214:(pid 37199): mlx5e_priv_init failed, err=-12 mlx5_core 0012:03:00.1 gpu3rdma1: mlx5e_netdev_change_profile: failed to rollback to orig profile, -12 ^^^^^^^^ mlx5_core 0000:00:03.0: E-Switch: Disable: mode(LEGACY), nvfs(0), necvfs(0), active vports(0) ## retry: Enabling switchdev mode 2nd time: mlx5_core 0000:00:03.0: E-Switch: Supported tc chains and prios offload BUG: kernel NULL pointer dereference, address: 0000000000000038 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP NOPTI CPU: 13 UID: 0 PID: 520 Comm: devlink Not tainted 6.18.0-rc4+ #91 PREEMPT(voluntary) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-2.fc40 04/01/2014 RIP: 0010:mlx5e_detach_netdev+0x3c/0x90 Code: 50 00 00 f0 80 4f 78 02 48 8b bf e8 07 00 00 48 85 ff 74 16 48 8b 73 78 48 d1 ee 83 e6 01 83 f6 01 40 0f b6 f6 e8 c4 42 00 00 <48> 8b 45 38 48 85 c0 74 08 48 89 df e8 cc 47 40 1e 48 8b bb f0 07 RSP: 0018:ffffc90000673890 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff8881036a89c0 RCX: 0000000000000000 RDX: ffff888113f63800 RSI: ffffffff822fe720 RDI: 0000000000000000 RBP: 0000000000000000 R08: 0000000000002dcd R09: 0000000000000000 R10: ffffc900006738e8 R11: 00000000ffffffff R12: 0000000000000000 R13: 0000000000000000 R14: ffff8881036a89c0 R15: 0000000000000000 FS: 00007fdfb8384740(0000) GS:ffff88856a9d6000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000038 CR3: 0000000112ae0005 CR4: 0000000000370ef0 Call Trace: mlx5e_netdev_change_profile+0x45/0xb0 mlx5e_vport_rep_load+0x27b/0x2d0 mlx5_esw_offloads_rep_load+0x72/0xf0 esw_offloads_enable+0x5d0/0x970 mlx5_eswitch_enable_locked+0x349/0x430 ? is_mp_supported+0x57/0xb0 mlx5_devlink_eswitch_mode_set+0x26b/0x430 devlink_nl_eswitch_set_doit+0x6f/0xf0 genl_family_rcv_msg_doit+0xe8/0x140 genl_rcv_msg+0x18b/0x290 ? __pfx_devlink_nl_pre_doit+0x10/0x10 ? __pfx_devlink_nl_eswitch_set_doit+0x10/0x10 ? __pfx_devlink_nl_post_doit+0x10/0x10 ? __pfx_genl_rcv_msg+0x10/0x10 netlink_rcv_skb+0x52/0x100 genl_rcv+0x28/0x40 netlink_unicast+0x282/0x3e0 ? __alloc_skb+0xd6/0x190 netlink_sendmsg+0x1f7/0x430 __sys_sendto+0x213/0x220 ? __sys_recvmsg+0x6a/0xd0 __x64_sys_sendto+0x24/0x30 do_syscall_64+0x50/0x1f0 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7fdfb8495047 Fixes: c4d7eb57687f ("net/mxl5e: Add change profile method") Signed-off-by: Saeed Mahameed Reviewed-by: Tariq Toukan Link: https://patch.msgid.link/20260108212657.25090-2-saeed@kernel.org Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/mellanox/mlx5/core/en.h | 9 ++-- .../net/ethernet/mellanox/mlx5/core/en_main.c | 48 +++++++++++++------ .../net/ethernet/mellanox/mlx5/core/en_rep.c | 11 ++--- 3 files changed, 44 insertions(+), 24 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h index 262dc032e276a..f422567687009 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h @@ -1243,9 +1243,12 @@ mlx5e_create_netdev(struct mlx5_core_dev *mdev, const struct mlx5e_profile *prof int mlx5e_attach_netdev(struct mlx5e_priv *priv); void mlx5e_detach_netdev(struct mlx5e_priv *priv); void mlx5e_destroy_netdev(struct mlx5e_priv *priv); -int mlx5e_netdev_change_profile(struct mlx5e_priv *priv, - const struct mlx5e_profile *new_profile, void *new_ppriv); -void mlx5e_netdev_attach_nic_profile(struct mlx5e_priv *priv); +int mlx5e_netdev_change_profile(struct net_device *netdev, + struct mlx5_core_dev *mdev, + const struct mlx5e_profile *new_profile, + void *new_ppriv); +void mlx5e_netdev_attach_nic_profile(struct net_device *netdev, + struct mlx5_core_dev *mdev); void mlx5e_set_netdev_mtu_boundaries(struct mlx5e_priv *priv); void mlx5e_build_nic_params(struct mlx5e_priv *priv, struct mlx5e_xsk *xsk, u16 mtu); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index 07fc4d2c8fadd..e50525b771bc4 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -6584,19 +6584,28 @@ mlx5e_netdev_attach_profile(struct net_device *netdev, struct mlx5_core_dev *mde return err; } -int mlx5e_netdev_change_profile(struct mlx5e_priv *priv, - const struct mlx5e_profile *new_profile, void *new_ppriv) +int mlx5e_netdev_change_profile(struct net_device *netdev, + struct mlx5_core_dev *mdev, + const struct mlx5e_profile *new_profile, + void *new_ppriv) { - const struct mlx5e_profile *orig_profile = priv->profile; - struct net_device *netdev = priv->netdev; - struct mlx5_core_dev *mdev = priv->mdev; - void *orig_ppriv = priv->ppriv; + struct mlx5e_priv *priv = netdev_priv(netdev); + const struct mlx5e_profile *orig_profile; int err, rollback_err; + void *orig_ppriv; - /* cleanup old profile */ - mlx5e_detach_netdev(priv); - priv->profile->cleanup(priv); - mlx5e_priv_cleanup(priv); + orig_profile = priv->profile; + orig_ppriv = priv->ppriv; + + /* NULL could happen if previous change_profile failed to rollback */ + if (priv->profile) { + WARN_ON_ONCE(priv->mdev != mdev); + /* cleanup old profile */ + mlx5e_detach_netdev(priv); + priv->profile->cleanup(priv); + mlx5e_priv_cleanup(priv); + } + /* priv members are not valid from this point ... */ if (mdev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR) { mlx5e_netdev_init_profile(netdev, mdev, new_profile, new_ppriv); @@ -6613,16 +6622,25 @@ int mlx5e_netdev_change_profile(struct mlx5e_priv *priv, return 0; rollback: + if (!orig_profile) { + netdev_warn(netdev, "no original profile to rollback to\n"); + priv->profile = NULL; + return err; + } + rollback_err = mlx5e_netdev_attach_profile(netdev, mdev, orig_profile, orig_ppriv); - if (rollback_err) - netdev_err(netdev, "%s: failed to rollback to orig profile, %d\n", - __func__, rollback_err); + if (rollback_err) { + netdev_err(netdev, "failed to rollback to orig profile, %d\n", + rollback_err); + priv->profile = NULL; + } return err; } -void mlx5e_netdev_attach_nic_profile(struct mlx5e_priv *priv) +void mlx5e_netdev_attach_nic_profile(struct net_device *netdev, + struct mlx5_core_dev *mdev) { - mlx5e_netdev_change_profile(priv, &mlx5e_nic_profile, NULL); + mlx5e_netdev_change_profile(netdev, mdev, &mlx5e_nic_profile, NULL); } void mlx5e_destroy_netdev(struct mlx5e_priv *priv) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c index ee9595109649a..52d3ad0b9cd92 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c @@ -1508,17 +1508,16 @@ mlx5e_vport_uplink_rep_load(struct mlx5_core_dev *dev, struct mlx5_eswitch_rep * { struct mlx5e_rep_priv *rpriv = mlx5e_rep_to_rep_priv(rep); struct net_device *netdev; - struct mlx5e_priv *priv; int err; netdev = mlx5_uplink_netdev_get(dev); if (!netdev) return 0; - priv = netdev_priv(netdev); - rpriv->netdev = priv->netdev; - err = mlx5e_netdev_change_profile(priv, &mlx5e_uplink_rep_profile, - rpriv); + /* must not use netdev_priv(netdev), it might not be initialized yet */ + rpriv->netdev = netdev; + err = mlx5e_netdev_change_profile(netdev, dev, + &mlx5e_uplink_rep_profile, rpriv); mlx5_uplink_netdev_put(dev, netdev); return err; } @@ -1546,7 +1545,7 @@ mlx5e_vport_uplink_rep_unload(struct mlx5e_rep_priv *rpriv) if (!(priv->mdev->priv.flags & MLX5_PRIV_FLAGS_SWITCH_LEGACY)) unregister_netdev(netdev); - mlx5e_netdev_attach_nic_profile(priv); + mlx5e_netdev_attach_nic_profile(netdev, priv->mdev); } static int From f57eaf8005071833244fecf130e3afdb36110ede Mon Sep 17 00:00:00 2001 From: Saeed Mahameed Date: Thu, 8 Jan 2026 13:26:55 -0800 Subject: [PATCH 0999/1321] net/mlx5e: Don't store mlx5e_priv in mlx5e_dev devlink priv mlx5e_priv is an unstable structure that can be memset(0) if profile attaching fails, mlx5e_priv in mlx5e_dev devlink private is used to reference the netdev and mdev associated with that struct. Instead, store netdev directly into mlx5e_dev and get mdev from the containing mlx5_adev aux device structure. This fixes a kernel oops in mlx5e_remove when switchdev mode fails due to change profile failure. $ devlink dev eswitch set pci/0000:00:03.0 mode switchdev Error: mlx5_core: Failed setting eswitch to offloads. dmesg: workqueue: Failed to create a rescuer kthread for wq "mlx5e": -EINTR mlx5_core 0012:03:00.1: mlx5e_netdev_init_profile:6214:(pid 37199): mlx5e_priv_init failed, err=-12 mlx5_core 0012:03:00.1 gpu3rdma1: mlx5e_netdev_change_profile: new profile init failed, -12 workqueue: Failed to create a rescuer kthread for wq "mlx5e": -EINTR mlx5_core 0012:03:00.1: mlx5e_netdev_init_profile:6214:(pid 37199): mlx5e_priv_init failed, err=-12 mlx5_core 0012:03:00.1 gpu3rdma1: mlx5e_netdev_change_profile: failed to rollback to orig profile, -12 $ devlink dev reload pci/0000:00:03.0 ==> oops BUG: kernel NULL pointer dereference, address: 0000000000000520 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP NOPTI CPU: 3 UID: 0 PID: 521 Comm: devlink Not tainted 6.18.0-rc5+ #117 PREEMPT(voluntary) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-2.fc40 04/01/2014 RIP: 0010:mlx5e_remove+0x68/0x130 RSP: 0018:ffffc900034838f0 EFLAGS: 00010246 RAX: ffff88810283c380 RBX: ffff888101874400 RCX: ffffffff826ffc45 RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000 RBP: ffff888102d789c0 R08: ffff8881007137f0 R09: ffff888100264e10 R10: ffffc90003483898 R11: ffffc900034838a0 R12: ffff888100d261a0 R13: ffff888100d261a0 R14: ffff8881018749a0 R15: ffff888101874400 FS: 00007f8565fea740(0000) GS:ffff88856a759000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000520 CR3: 000000010b11a004 CR4: 0000000000370ef0 Call Trace: device_release_driver_internal+0x19c/0x200 bus_remove_device+0xc6/0x130 device_del+0x160/0x3d0 ? devl_param_driverinit_value_get+0x2d/0x90 mlx5_detach_device+0x89/0xe0 mlx5_unload_one_devl_locked+0x3a/0x70 mlx5_devlink_reload_down+0xc8/0x220 devlink_reload+0x7d/0x260 devlink_nl_reload_doit+0x45b/0x5a0 genl_family_rcv_msg_doit+0xe8/0x140 Fixes: ee75f1fc44dd ("net/mlx5e: Create separate devlink instance for ethernet auxiliary device") Fixes: c4d7eb57687f ("net/mxl5e: Add change profile method") Signed-off-by: Saeed Mahameed Link: https://patch.msgid.link/20260108212657.25090-3-saeed@kernel.org Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/mellanox/mlx5/core/en.h | 2 +- .../net/ethernet/mellanox/mlx5/core/en_main.c | 20 ++++++++++--------- 2 files changed, 12 insertions(+), 10 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h index f422567687009..be52c30c2ad61 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h @@ -962,7 +962,7 @@ struct mlx5e_priv { }; struct mlx5e_dev { - struct mlx5e_priv *priv; + struct net_device *netdev; struct devlink_port dl_port; }; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index e50525b771bc4..9f8d95f8915eb 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -6655,8 +6655,8 @@ static int _mlx5e_resume(struct auxiliary_device *adev) { struct mlx5_adev *edev = container_of(adev, struct mlx5_adev, adev); struct mlx5e_dev *mlx5e_dev = auxiliary_get_drvdata(adev); - struct mlx5e_priv *priv = mlx5e_dev->priv; - struct net_device *netdev = priv->netdev; + struct mlx5e_priv *priv = netdev_priv(mlx5e_dev->netdev); + struct net_device *netdev = mlx5e_dev->netdev; struct mlx5_core_dev *mdev = edev->mdev; struct mlx5_core_dev *pos, *to; int err, i; @@ -6702,10 +6702,11 @@ static int mlx5e_resume(struct auxiliary_device *adev) static int _mlx5e_suspend(struct auxiliary_device *adev, bool pre_netdev_reg) { + struct mlx5_adev *edev = container_of(adev, struct mlx5_adev, adev); struct mlx5e_dev *mlx5e_dev = auxiliary_get_drvdata(adev); - struct mlx5e_priv *priv = mlx5e_dev->priv; - struct net_device *netdev = priv->netdev; - struct mlx5_core_dev *mdev = priv->mdev; + struct mlx5e_priv *priv = netdev_priv(mlx5e_dev->netdev); + struct net_device *netdev = mlx5e_dev->netdev; + struct mlx5_core_dev *mdev = edev->mdev; struct mlx5_core_dev *pos; int i; @@ -6766,11 +6767,11 @@ static int _mlx5e_probe(struct auxiliary_device *adev) goto err_devlink_port_unregister; } SET_NETDEV_DEVLINK_PORT(netdev, &mlx5e_dev->dl_port); + mlx5e_dev->netdev = netdev; mlx5e_build_nic_netdev(netdev); priv = netdev_priv(netdev); - mlx5e_dev->priv = priv; priv->profile = profile; priv->ppriv = NULL; @@ -6833,7 +6834,8 @@ static void _mlx5e_remove(struct auxiliary_device *adev) { struct mlx5_adev *edev = container_of(adev, struct mlx5_adev, adev); struct mlx5e_dev *mlx5e_dev = auxiliary_get_drvdata(adev); - struct mlx5e_priv *priv = mlx5e_dev->priv; + struct net_device *netdev = mlx5e_dev->netdev; + struct mlx5e_priv *priv = netdev_priv(netdev); struct mlx5_core_dev *mdev = edev->mdev; mlx5_core_uplink_netdev_set(mdev, NULL); @@ -6842,8 +6844,8 @@ static void _mlx5e_remove(struct auxiliary_device *adev) * if it's from legacy mode. If from switchdev mode, it * is already unregistered before changing to NIC profile. */ - if (priv->netdev->reg_state == NETREG_REGISTERED) { - unregister_netdev(priv->netdev); + if (netdev->reg_state == NETREG_REGISTERED) { + unregister_netdev(netdev); _mlx5e_suspend(adev, false); } else { struct mlx5_core_dev *pos; From b18fa4476277336cd3563932ef1389c2ae1e212c Mon Sep 17 00:00:00 2001 From: Saeed Mahameed Date: Thu, 8 Jan 2026 13:26:56 -0800 Subject: [PATCH 1000/1321] net/mlx5e: Pass netdev to mlx5e_destroy_netdev instead of priv mlx5e_priv is an unstable structure that can be memset(0) if profile attaching fails. Pass netdev to mlx5e_destroy_netdev() to guarantee it will work on a valid netdev. On mlx5e_remove: Check validity of priv->profile, before attempting to cleanup any resources that might be not there. This fixes a kernel oops in mlx5e_remove when switchdev mode fails due to change profile failure. $ devlink dev eswitch set pci/0000:00:03.0 mode switchdev Error: mlx5_core: Failed setting eswitch to offloads. dmesg: workqueue: Failed to create a rescuer kthread for wq "mlx5e": -EINTR mlx5_core 0012:03:00.1: mlx5e_netdev_init_profile:6214:(pid 37199): mlx5e_priv_init failed, err=-12 mlx5_core 0012:03:00.1 gpu3rdma1: mlx5e_netdev_change_profile: new profile init failed, -12 workqueue: Failed to create a rescuer kthread for wq "mlx5e": -EINTR mlx5_core 0012:03:00.1: mlx5e_netdev_init_profile:6214:(pid 37199): mlx5e_priv_init failed, err=-12 mlx5_core 0012:03:00.1 gpu3rdma1: mlx5e_netdev_change_profile: failed to rollback to orig profile, -12 $ devlink dev reload pci/0000:00:03.0 ==> oops BUG: kernel NULL pointer dereference, address: 0000000000000370 PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP NOPTI CPU: 15 UID: 0 PID: 520 Comm: devlink Not tainted 6.18.0-rc5+ #115 PREEMPT(voluntary) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-2.fc40 04/01/2014 RIP: 0010:mlx5e_dcbnl_dscp_app+0x23/0x100 RSP: 0018:ffffc9000083f8b8 EFLAGS: 00010286 RAX: ffff8881126fc380 RBX: ffff8881015ac400 RCX: ffffffff826ffc45 RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8881035109c0 RBP: ffff8881035109c0 R08: ffff888101e3e838 R09: ffff888100264e10 R10: ffffc9000083f898 R11: ffffc9000083f8a0 R12: ffff888101b921a0 R13: ffff888101b921a0 R14: ffff8881015ac9a0 R15: ffff8881015ac400 FS: 00007f789a3c8740(0000) GS:ffff88856aa59000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000370 CR3: 000000010b6c0001 CR4: 0000000000370ef0 Call Trace: mlx5e_remove+0x57/0x110 device_release_driver_internal+0x19c/0x200 bus_remove_device+0xc6/0x130 device_del+0x160/0x3d0 ? devl_param_driverinit_value_get+0x2d/0x90 mlx5_detach_device+0x89/0xe0 mlx5_unload_one_devl_locked+0x3a/0x70 mlx5_devlink_reload_down+0xc8/0x220 devlink_reload+0x7d/0x260 devlink_nl_reload_doit+0x45b/0x5a0 genl_family_rcv_msg_doit+0xe8/0x140 Fixes: c4d7eb57687f ("net/mxl5e: Add change profile method") Signed-off-by: Saeed Mahameed Reviewed-by: Shay Drori Reviewed-by: Tariq Toukan Link: https://patch.msgid.link/20260108212657.25090-4-saeed@kernel.org Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/mellanox/mlx5/core/en.h | 2 +- drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 15 +++++++++------ drivers/net/ethernet/mellanox/mlx5/core/en_rep.c | 4 ++-- 3 files changed, 12 insertions(+), 9 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h index be52c30c2ad61..ff4ab4691baf0 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h @@ -1242,7 +1242,7 @@ struct net_device * mlx5e_create_netdev(struct mlx5_core_dev *mdev, const struct mlx5e_profile *profile); int mlx5e_attach_netdev(struct mlx5e_priv *priv); void mlx5e_detach_netdev(struct mlx5e_priv *priv); -void mlx5e_destroy_netdev(struct mlx5e_priv *priv); +void mlx5e_destroy_netdev(struct net_device *netdev); int mlx5e_netdev_change_profile(struct net_device *netdev, struct mlx5_core_dev *mdev, const struct mlx5e_profile *new_profile, diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index 9f8d95f8915eb..2c06a4abea045 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -6643,11 +6643,12 @@ void mlx5e_netdev_attach_nic_profile(struct net_device *netdev, mlx5e_netdev_change_profile(netdev, mdev, &mlx5e_nic_profile, NULL); } -void mlx5e_destroy_netdev(struct mlx5e_priv *priv) +void mlx5e_destroy_netdev(struct net_device *netdev) { - struct net_device *netdev = priv->netdev; + struct mlx5e_priv *priv = netdev_priv(netdev); - mlx5e_priv_cleanup(priv); + if (priv->profile) + mlx5e_priv_cleanup(priv); free_netdev(netdev); } @@ -6804,7 +6805,7 @@ static int _mlx5e_probe(struct auxiliary_device *adev) err_profile_cleanup: profile->cleanup(priv); err_destroy_netdev: - mlx5e_destroy_netdev(priv); + mlx5e_destroy_netdev(netdev); err_devlink_port_unregister: mlx5e_devlink_port_unregister(mlx5e_dev); err_devlink_unregister: @@ -6839,7 +6840,9 @@ static void _mlx5e_remove(struct auxiliary_device *adev) struct mlx5_core_dev *mdev = edev->mdev; mlx5_core_uplink_netdev_set(mdev, NULL); - mlx5e_dcbnl_delete_app(priv); + + if (priv->profile) + mlx5e_dcbnl_delete_app(priv); /* When unload driver, the netdev is in registered state * if it's from legacy mode. If from switchdev mode, it * is already unregistered before changing to NIC profile. @@ -6860,7 +6863,7 @@ static void _mlx5e_remove(struct auxiliary_device *adev) /* Avoid cleanup if profile rollback failed. */ if (priv->profile) priv->profile->cleanup(priv); - mlx5e_destroy_netdev(priv); + mlx5e_destroy_netdev(netdev); mlx5e_devlink_port_unregister(mlx5e_dev); mlx5e_destroy_devlink(mlx5e_dev); } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c index 52d3ad0b9cd92..6eec88fa6d102 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c @@ -1611,7 +1611,7 @@ mlx5e_vport_vf_rep_load(struct mlx5_core_dev *dev, struct mlx5_eswitch_rep *rep) priv->profile->cleanup(priv); err_destroy_netdev: - mlx5e_destroy_netdev(netdev_priv(netdev)); + mlx5e_destroy_netdev(netdev); return err; } @@ -1666,7 +1666,7 @@ mlx5e_vport_rep_unload(struct mlx5_eswitch_rep *rep) mlx5e_rep_vnic_reporter_destroy(priv); mlx5e_detach_netdev(priv); priv->profile->cleanup(priv); - mlx5e_destroy_netdev(priv); + mlx5e_destroy_netdev(netdev); free_ppriv: kvfree(ppriv); /* mlx5e_rep_priv */ } From 98abe408d1348c0c3c94dbead05c195acf85a4c7 Mon Sep 17 00:00:00 2001 From: Saeed Mahameed Date: Thu, 8 Jan 2026 13:26:57 -0800 Subject: [PATCH 1001/1321] net/mlx5e: Restore destroying state bit after profile cleanup Profile rollback can fail in mlx5e_netdev_change_profile() and we will end up with invalid mlx5e_priv memset to 0, we must maintain the 'destroying' bit in order to gracefully shutdown even if the profile/priv are not valid. This patch maintains the previous state of the 'destroying' state of mlx5e_priv after priv cleanup, to allow the remove flow to cleanup common resources from mlx5_core to avoid FW fatal errors as seen below: $ devlink dev eswitch set pci/0000:00:03.0 mode switchdev Error: mlx5_core: Failed setting eswitch to offloads. dmesg: mlx5_core 0000:00:03.0 enp0s3np0: failed to rollback to orig profile, ... $ devlink dev reload pci/0000:00:03.0 mlx5_core 0000:00:03.0: E-Switch: Disable: mode(LEGACY), nvfs(0), necvfs(0), active vports(0) mlx5_core 0000:00:03.0: poll_health:803:(pid 519): Fatal error 3 detected mlx5_core 0000:00:03.0: firmware version: 28.41.1000 mlx5_core 0000:00:03.0: 0.000 Gb/s available PCIe bandwidth (Unknown x255 link) mlx5_core 0000:00:03.0: mlx5_function_enable:1200:(pid 519): enable hca failed mlx5_core 0000:00:03.0: mlx5_function_enable:1200:(pid 519): enable hca failed mlx5_core 0000:00:03.0: mlx5_health_try_recover:340:(pid 141): handling bad device here mlx5_core 0000:00:03.0: mlx5_handle_bad_state:285:(pid 141): Expected to see disabled NIC but it is full driver mlx5_core 0000:00:03.0: mlx5_error_sw_reset:236:(pid 141): start mlx5_core 0000:00:03.0: NIC IFC still 0 after 4000ms. Fixes: c4d7eb57687f ("net/mxl5e: Add change profile method") Signed-off-by: Saeed Mahameed Reviewed-by: Tariq Toukan Link: https://patch.msgid.link/20260108212657.25090-5-saeed@kernel.org Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index 2c06a4abea045..9042c8a388e42 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -6325,6 +6325,7 @@ int mlx5e_priv_init(struct mlx5e_priv *priv, void mlx5e_priv_cleanup(struct mlx5e_priv *priv) { + bool destroying = test_bit(MLX5E_STATE_DESTROYING, &priv->state); int i; /* bail if change profile failed and also rollback failed */ @@ -6352,6 +6353,8 @@ void mlx5e_priv_cleanup(struct mlx5e_priv *priv) } memset(priv, 0, sizeof(*priv)); + if (destroying) /* restore destroying bit, to allow unload */ + set_bit(MLX5E_STATE_DESTROYING, &priv->state); } static unsigned int mlx5e_get_max_num_txqs(struct mlx5_core_dev *mdev, From 99a9ddd56206b12974209865d262ab863a904635 Mon Sep 17 00:00:00 2001 From: Kery Qi Date: Fri, 9 Jan 2026 00:42:57 +0800 Subject: [PATCH 1002/1321] net: octeon_ep_vf: fix free_irq dev_id mismatch in IRQ rollback octep_vf_request_irqs() requests MSI-X queue IRQs with dev_id set to ioq_vector. If request_irq() fails part-way, the rollback loop calls free_irq() with dev_id set to 'oct', which does not match the original dev_id and may leave the irqaction registered. This can keep IRQ handlers alive while ioq_vector is later freed during unwind/teardown, leading to a use-after-free or crash when an interrupt fires. Fix the error path to free IRQs with the same ioq_vector dev_id used during request_irq(). Fixes: 1cd3b407977c ("octeon_ep_vf: add Tx/Rx processing and interrupt support") Signed-off-by: Kery Qi Link: https://patch.msgid.link/20260108164256.1749-2-qikeyu2017@gmail.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_main.c b/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_main.c index 420c3f4cf7417..1d9760b4b8f47 100644 --- a/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_main.c +++ b/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_main.c @@ -218,7 +218,7 @@ static int octep_vf_request_irqs(struct octep_vf_device *oct) ioq_irq_err: while (i) { --i; - free_irq(oct->msix_entries[i].vector, oct); + free_irq(oct->msix_entries[i].vector, oct->ioq_vector[i]); } return -1; } From 50d252f8d2c0fa9c6b3a4afb81100fd8eceebe8c Mon Sep 17 00:00:00 2001 From: Jijie Shao Date: Thu, 8 Jan 2026 15:14:09 +0800 Subject: [PATCH 1003/1321] net: phy: motorcomm: fix duplex setting error for phy leds fix duplex setting error for phy leds Fixes: 355b82c54c12 ("net: phy: motorcomm: Add support for PHY LEDs on YT8521") Signed-off-by: Jijie Shao Reviewed-by: Andrew Lunn Link: https://patch.msgid.link/20260108071409.2750607-1-shaojijie@huawei.com Signed-off-by: Jakub Kicinski --- drivers/net/phy/motorcomm.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/phy/motorcomm.c b/drivers/net/phy/motorcomm.c index 89b5b19a9bd28..42d46b5758fc5 100644 --- a/drivers/net/phy/motorcomm.c +++ b/drivers/net/phy/motorcomm.c @@ -1741,10 +1741,10 @@ static int yt8521_led_hw_control_set(struct phy_device *phydev, u8 index, val |= YT8521_LED_1000_ON_EN; if (test_bit(TRIGGER_NETDEV_FULL_DUPLEX, &rules)) - val |= YT8521_LED_HDX_ON_EN; + val |= YT8521_LED_FDX_ON_EN; if (test_bit(TRIGGER_NETDEV_HALF_DUPLEX, &rules)) - val |= YT8521_LED_FDX_ON_EN; + val |= YT8521_LED_HDX_ON_EN; if (test_bit(TRIGGER_NETDEV_TX, &rules) || test_bit(TRIGGER_NETDEV_RX, &rules)) From 6906e92b6928aaebbac00f8d182d8473fd539245 Mon Sep 17 00:00:00 2001 From: Lorenzo Bianconi Date: Fri, 9 Jan 2026 10:29:06 +0100 Subject: [PATCH 1004/1321] net: airoha: Fix typo in airoha_ppe_setup_tc_block_cb definition Fix Typo in airoha_ppe_dev_setup_tc_block_cb routine definition when CONFIG_NET_AIROHA is not enabled. Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202601090517.Fj6v501r-lkp@intel.com/ Fixes: f45fc18b6de04 ("net: airoha: Add airoha_ppe_dev struct definition") Signed-off-by: Lorenzo Bianconi Link: https://patch.msgid.link/20260109-airoha_ppe_dev_setup_tc_block_cb-typo-v1-1-282e8834a9f9@kernel.org Signed-off-by: Jakub Kicinski --- include/linux/soc/airoha/airoha_offload.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/include/linux/soc/airoha/airoha_offload.h b/include/linux/soc/airoha/airoha_offload.h index ab64ecdf39a06..d01ef4a6b3d7c 100644 --- a/include/linux/soc/airoha/airoha_offload.h +++ b/include/linux/soc/airoha/airoha_offload.h @@ -52,8 +52,8 @@ static inline void airoha_ppe_put_dev(struct airoha_ppe_dev *dev) { } -static inline int airoha_ppe_setup_tc_block_cb(struct airoha_ppe_dev *dev, - void *type_data) +static inline int airoha_ppe_dev_setup_tc_block_cb(struct airoha_ppe_dev *dev, + void *type_data) { return -EOPNOTSUPP; } From 7fea18a49e974e3fddf27703b35ab801331a5eea Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Wed, 7 Jan 2026 10:41:59 +0000 Subject: [PATCH 1005/1321] net: add net.core.qdisc_max_burst In blamed commit, I added a check against the temporary queue built in __dev_xmit_skb(). Idea was to drop packets early, before any spinlock was acquired. if (unlikely(defer_count > READ_ONCE(q->limit))) { kfree_skb_reason(skb, SKB_DROP_REASON_QDISC_DROP); return NET_XMIT_DROP; } It turned out that HTB Qdisc has a zero q->limit. HTB limits packets on a per-class basis. Some of our tests became flaky. Add a new sysctl : net.core.qdisc_max_burst to control how many packets can be stored in the temporary lockless queue. Also add a new QDISC_BURST_DROP drop reason to better diagnose future issues. Thanks Neal ! Fixes: 100dfa74cad9 ("net: dev_queue_xmit() llist adoption") Reported-and-bisected-by: Neal Cardwell Signed-off-by: Eric Dumazet Reviewed-by: Neal Cardwell Link: https://patch.msgid.link/20260107104159.3669285-1-edumazet@google.com Signed-off-by: Paolo Abeni --- Documentation/admin-guide/sysctl/net.rst | 8 ++++++++ include/net/dropreason-core.h | 6 ++++++ include/net/hotdata.h | 1 + net/core/dev.c | 6 +++--- net/core/hotdata.c | 1 + net/core/sysctl_net_core.c | 7 +++++++ 6 files changed, 26 insertions(+), 3 deletions(-) diff --git a/Documentation/admin-guide/sysctl/net.rst b/Documentation/admin-guide/sysctl/net.rst index 369a738a68193..91fa4ccd326c2 100644 --- a/Documentation/admin-guide/sysctl/net.rst +++ b/Documentation/admin-guide/sysctl/net.rst @@ -303,6 +303,14 @@ netdev_max_backlog Maximum number of packets, queued on the INPUT side, when the interface receives packets faster than kernel can process them. +qdisc_max_burst +------------------ + +Maximum number of packets that can be temporarily stored before +reaching qdisc. + +Default: 1000 + netdev_rss_key -------------- diff --git a/include/net/dropreason-core.h b/include/net/dropreason-core.h index 58d91ccc56e0b..a7b7abd66e215 100644 --- a/include/net/dropreason-core.h +++ b/include/net/dropreason-core.h @@ -67,6 +67,7 @@ FN(TC_EGRESS) \ FN(SECURITY_HOOK) \ FN(QDISC_DROP) \ + FN(QDISC_BURST_DROP) \ FN(QDISC_OVERLIMIT) \ FN(QDISC_CONGESTED) \ FN(CAKE_FLOOD) \ @@ -374,6 +375,11 @@ enum skb_drop_reason { * failed to enqueue to current qdisc) */ SKB_DROP_REASON_QDISC_DROP, + /** + * @SKB_DROP_REASON_QDISC_BURST_DROP: dropped when net.core.qdisc_max_burst + * limit is hit. + */ + SKB_DROP_REASON_QDISC_BURST_DROP, /** * @SKB_DROP_REASON_QDISC_OVERLIMIT: dropped by qdisc when a qdisc * instance exceeds its total buffer size limit. diff --git a/include/net/hotdata.h b/include/net/hotdata.h index 4acec191c54ab..6632b1aa75848 100644 --- a/include/net/hotdata.h +++ b/include/net/hotdata.h @@ -42,6 +42,7 @@ struct net_hotdata { int netdev_budget_usecs; int tstamp_prequeue; int max_backlog; + int qdisc_max_burst; int dev_tx_weight; int dev_rx_weight; int sysctl_max_skb_frags; diff --git a/net/core/dev.c b/net/core/dev.c index 9af9c3df452f7..ccef685023c29 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -4203,8 +4203,8 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q, do { if (first_n && !defer_count) { defer_count = atomic_long_inc_return(&q->defer_count); - if (unlikely(defer_count > READ_ONCE(q->limit))) { - kfree_skb_reason(skb, SKB_DROP_REASON_QDISC_DROP); + if (unlikely(defer_count > READ_ONCE(net_hotdata.qdisc_max_burst))) { + kfree_skb_reason(skb, SKB_DROP_REASON_QDISC_BURST_DROP); return NET_XMIT_DROP; } } @@ -4222,7 +4222,7 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q, ll_list = llist_del_all(&q->defer_list); /* There is a small race because we clear defer_count not atomically * with the prior llist_del_all(). This means defer_list could grow - * over q->limit. + * over qdisc_max_burst. */ atomic_long_set(&q->defer_count, 0); diff --git a/net/core/hotdata.c b/net/core/hotdata.c index dddd5c287cf08..a6db365808178 100644 --- a/net/core/hotdata.c +++ b/net/core/hotdata.c @@ -17,6 +17,7 @@ struct net_hotdata net_hotdata __cacheline_aligned = { .tstamp_prequeue = 1, .max_backlog = 1000, + .qdisc_max_burst = 1000, .dev_tx_weight = 64, .dev_rx_weight = 64, .sysctl_max_skb_frags = MAX_SKB_FRAGS, diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c index 8d4decb2606fa..05dd55cf8b58e 100644 --- a/net/core/sysctl_net_core.c +++ b/net/core/sysctl_net_core.c @@ -429,6 +429,13 @@ static struct ctl_table net_core_table[] = { .mode = 0644, .proc_handler = proc_dointvec }, + { + .procname = "qdisc_max_burst", + .data = &net_hotdata.qdisc_max_burst, + .maxlen = sizeof(int), + .mode = 0644, + .proc_handler = proc_dointvec + }, { .procname = "netdev_rss_key", .data = &netdev_rss_key, From 559dbf23859b8e9cfb43b0c7d5965882233b58a3 Mon Sep 17 00:00:00 2001 From: Donald Hunter Date: Mon, 12 Jan 2026 15:34:36 +0000 Subject: [PATCH 1006/1321] tools: ynl: render event op docs correctly MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The docs for YNL event ops currently render raw python structs. For example in: https://docs.kernel.org/netlink/specs/ethtool.html#cable-test-ntf event: {‘attributes’: [‘header’, ‘status’, ‘nest’], ‘__lineno__’: 2385} Handle event ops correctly and render their op attributes: event: attributes: [header, status] Signed-off-by: Donald Hunter Link: https://patch.msgid.link/20260112153436.75495-1-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski --- tools/net/ynl/pyynl/lib/doc_generator.py | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/tools/net/ynl/pyynl/lib/doc_generator.py b/tools/net/ynl/pyynl/lib/doc_generator.py index 3a16b8eb01ca0..8b922d8f89e8f 100644 --- a/tools/net/ynl/pyynl/lib/doc_generator.py +++ b/tools/net/ynl/pyynl/lib/doc_generator.py @@ -166,13 +166,13 @@ def parse_do(self, do_dict: Dict[str, Any], level: int = 0) -> str: continue lines.append(self.fmt.rst_paragraph(self.fmt.bold(key), level + 1)) if key in ['request', 'reply']: - lines.append(self.parse_do_attributes(do_dict[key], level + 1) + "\n") + lines.append(self.parse_op_attributes(do_dict[key], level + 1) + "\n") else: lines.append(self.fmt.headroom(level + 2) + do_dict[key] + "\n") return "\n".join(lines) - def parse_do_attributes(self, attrs: Dict[str, Any], level: int = 0) -> str: + def parse_op_attributes(self, attrs: Dict[str, Any], level: int = 0) -> str: """Parse 'attributes' section""" if "attributes" not in attrs: return "" @@ -184,7 +184,7 @@ def parse_do_attributes(self, attrs: Dict[str, Any], level: int = 0) -> str: def parse_operations(self, operations: List[Dict[str, Any]], namespace: str) -> str: """Parse operations block""" - preprocessed = ["name", "doc", "title", "do", "dump", "flags"] + preprocessed = ["name", "doc", "title", "do", "dump", "flags", "event"] linkable = ["fixed-header", "attribute-set"] lines = [] @@ -217,6 +217,9 @@ def parse_operations(self, operations: List[Dict[str, Any]], namespace: str) -> if "dump" in operation: lines.append(self.fmt.rst_paragraph(":dump:", 0)) lines.append(self.parse_do(operation["dump"], 0)) + if "event" in operation: + lines.append(self.fmt.rst_paragraph(":event:", 0)) + lines.append(self.parse_op_attributes(operation["event"], 0)) # New line after fields lines.append("\n") From ffe06e62062a44e9ca0ce6332b3917ee2c71f2bc Mon Sep 17 00:00:00 2001 From: Aditya Garg Date: Mon, 12 Jan 2026 02:01:33 -0800 Subject: [PATCH 1007/1321] net: hv_netvsc: reject RSS hash key programming without RX indirection table RSS configuration requires a valid RX indirection table. When the device reports a single receive queue, rndis_filter_device_add() does not allocate an indirection table, accepting RSS hash key updates in this state leads to a hang. Fix this by gating netvsc_set_rxfh() on ndc->rx_table_sz and return -EOPNOTSUPP when the table is absent. This aligns set_rxfh with the device capabilities and prevents incorrect behavior. Fixes: 962f3fee83a4 ("netvsc: add ethtool ops to get/set RSS key") Signed-off-by: Aditya Garg Reviewed-by: Dipayaan Roy Reviewed-by: Haiyang Zhang Link: https://patch.msgid.link/1768212093-1594-1-git-send-email-gargaditya@linux.microsoft.com Signed-off-by: Jakub Kicinski --- drivers/net/hyperv/netvsc_drv.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c index 3d47d749ef9f1..cbd52cb79268e 100644 --- a/drivers/net/hyperv/netvsc_drv.c +++ b/drivers/net/hyperv/netvsc_drv.c @@ -1750,6 +1750,9 @@ static int netvsc_set_rxfh(struct net_device *dev, rxfh->hfunc != ETH_RSS_HASH_TOP) return -EOPNOTSUPP; + if (!ndc->rx_table_sz) + return -EOPNOTSUPP; + rndis_dev = ndev->extension; if (rxfh->indir) { for (i = 0; i < ndc->rx_table_sz; i++) From 65f849226f8afe56e47ff27225b2d7f73e59c5bd Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Mon, 12 Jan 2026 10:38:25 +0000 Subject: [PATCH 1008/1321] dst: fix races in rt6_uncached_list_del() and rt_del_uncached_list() syzbot was able to crash the kernel in rt6_uncached_list_flush_dev() in an interesting way [1] Crash happens in list_del_init()/INIT_LIST_HEAD() while writing list->prev, while the prior write on list->next went well. static inline void INIT_LIST_HEAD(struct list_head *list) { WRITE_ONCE(list->next, list); // This went well WRITE_ONCE(list->prev, list); // Crash, @list has been freed. } Issue here is that rt6_uncached_list_del() did not attempt to lock ul->lock, as list_empty(&rt->dst.rt_uncached) returned true because the WRITE_ONCE(list->next, list) happened on the other CPU. We might use list_del_init_careful() and list_empty_careful(), or make sure rt6_uncached_list_del() always grabs the spinlock whenever rt->dst.rt_uncached_list has been set. A similar fix is neeed for IPv4. [1] BUG: KASAN: slab-use-after-free in INIT_LIST_HEAD include/linux/list.h:46 [inline] BUG: KASAN: slab-use-after-free in list_del_init include/linux/list.h:296 [inline] BUG: KASAN: slab-use-after-free in rt6_uncached_list_flush_dev net/ipv6/route.c:191 [inline] BUG: KASAN: slab-use-after-free in rt6_disable_ip+0x633/0x730 net/ipv6/route.c:5020 Write of size 8 at addr ffff8880294cfa78 by task kworker/u8:14/3450 CPU: 0 UID: 0 PID: 3450 Comm: kworker/u8:14 Tainted: G L syzkaller #0 PREEMPT_{RT,(full)} Tainted: [L]=SOFTLOCKUP Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025 Workqueue: netns cleanup_net Call Trace: dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:378 [inline] print_report+0xca/0x240 mm/kasan/report.c:482 kasan_report+0x118/0x150 mm/kasan/report.c:595 INIT_LIST_HEAD include/linux/list.h:46 [inline] list_del_init include/linux/list.h:296 [inline] rt6_uncached_list_flush_dev net/ipv6/route.c:191 [inline] rt6_disable_ip+0x633/0x730 net/ipv6/route.c:5020 addrconf_ifdown+0x143/0x18a0 net/ipv6/addrconf.c:3853 addrconf_notify+0x1bc/0x1050 net/ipv6/addrconf.c:-1 notifier_call_chain+0x19d/0x3a0 kernel/notifier.c:85 call_netdevice_notifiers_extack net/core/dev.c:2268 [inline] call_netdevice_notifiers net/core/dev.c:2282 [inline] netif_close_many+0x29c/0x410 net/core/dev.c:1785 unregister_netdevice_many_notify+0xb50/0x2330 net/core/dev.c:12353 ops_exit_rtnl_list net/core/net_namespace.c:187 [inline] ops_undo_list+0x3dc/0x990 net/core/net_namespace.c:248 cleanup_net+0x4de/0x7b0 net/core/net_namespace.c:696 process_one_work kernel/workqueue.c:3257 [inline] process_scheduled_works+0xad1/0x1770 kernel/workqueue.c:3340 worker_thread+0x8a0/0xda0 kernel/workqueue.c:3421 kthread+0x711/0x8a0 kernel/kthread.c:463 ret_from_fork+0x510/0xa50 arch/x86/kernel/process.c:158 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246 Allocated by task 803: kasan_save_stack mm/kasan/common.c:57 [inline] kasan_save_track+0x3e/0x80 mm/kasan/common.c:78 unpoison_slab_object mm/kasan/common.c:340 [inline] __kasan_slab_alloc+0x6c/0x80 mm/kasan/common.c:366 kasan_slab_alloc include/linux/kasan.h:253 [inline] slab_post_alloc_hook mm/slub.c:4953 [inline] slab_alloc_node mm/slub.c:5263 [inline] kmem_cache_alloc_noprof+0x18d/0x6c0 mm/slub.c:5270 dst_alloc+0x105/0x170 net/core/dst.c:89 ip6_dst_alloc net/ipv6/route.c:342 [inline] icmp6_dst_alloc+0x75/0x460 net/ipv6/route.c:3333 mld_sendpack+0x683/0xe60 net/ipv6/mcast.c:1844 mld_send_cr net/ipv6/mcast.c:2154 [inline] mld_ifc_work+0x83e/0xd60 net/ipv6/mcast.c:2693 process_one_work kernel/workqueue.c:3257 [inline] process_scheduled_works+0xad1/0x1770 kernel/workqueue.c:3340 worker_thread+0x8a0/0xda0 kernel/workqueue.c:3421 kthread+0x711/0x8a0 kernel/kthread.c:463 ret_from_fork+0x510/0xa50 arch/x86/kernel/process.c:158 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246 Freed by task 20: kasan_save_stack mm/kasan/common.c:57 [inline] kasan_save_track+0x3e/0x80 mm/kasan/common.c:78 kasan_save_free_info+0x46/0x50 mm/kasan/generic.c:584 poison_slab_object mm/kasan/common.c:253 [inline] __kasan_slab_free+0x5c/0x80 mm/kasan/common.c:285 kasan_slab_free include/linux/kasan.h:235 [inline] slab_free_hook mm/slub.c:2540 [inline] slab_free mm/slub.c:6670 [inline] kmem_cache_free+0x18f/0x8d0 mm/slub.c:6781 dst_destroy+0x235/0x350 net/core/dst.c:121 rcu_do_batch kernel/rcu/tree.c:2605 [inline] rcu_core kernel/rcu/tree.c:2857 [inline] rcu_cpu_kthread+0xba5/0x1af0 kernel/rcu/tree.c:2945 smpboot_thread_fn+0x542/0xa60 kernel/smpboot.c:160 kthread+0x711/0x8a0 kernel/kthread.c:463 ret_from_fork+0x510/0xa50 arch/x86/kernel/process.c:158 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246 Last potentially related work creation: kasan_save_stack+0x3e/0x60 mm/kasan/common.c:57 kasan_record_aux_stack+0xbd/0xd0 mm/kasan/generic.c:556 __call_rcu_common kernel/rcu/tree.c:3119 [inline] call_rcu+0xee/0x890 kernel/rcu/tree.c:3239 refdst_drop include/net/dst.h:266 [inline] skb_dst_drop include/net/dst.h:278 [inline] skb_release_head_state+0x71/0x360 net/core/skbuff.c:1156 skb_release_all net/core/skbuff.c:1180 [inline] __kfree_skb net/core/skbuff.c:1196 [inline] sk_skb_reason_drop+0xe9/0x170 net/core/skbuff.c:1234 kfree_skb_reason include/linux/skbuff.h:1322 [inline] tcf_kfree_skb_list include/net/sch_generic.h:1127 [inline] __dev_xmit_skb net/core/dev.c:4260 [inline] __dev_queue_xmit+0x26aa/0x3210 net/core/dev.c:4785 NF_HOOK_COND include/linux/netfilter.h:307 [inline] ip6_output+0x340/0x550 net/ipv6/ip6_output.c:247 NF_HOOK+0x9e/0x380 include/linux/netfilter.h:318 mld_sendpack+0x8d4/0xe60 net/ipv6/mcast.c:1855 mld_send_cr net/ipv6/mcast.c:2154 [inline] mld_ifc_work+0x83e/0xd60 net/ipv6/mcast.c:2693 process_one_work kernel/workqueue.c:3257 [inline] process_scheduled_works+0xad1/0x1770 kernel/workqueue.c:3340 worker_thread+0x8a0/0xda0 kernel/workqueue.c:3421 kthread+0x711/0x8a0 kernel/kthread.c:463 ret_from_fork+0x510/0xa50 arch/x86/kernel/process.c:158 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246 The buggy address belongs to the object at ffff8880294cfa00 which belongs to the cache ip6_dst_cache of size 232 The buggy address is located 120 bytes inside of freed 232-byte region [ffff8880294cfa00, ffff8880294cfae8) The buggy address belongs to the physical page: page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x294cf memcg:ffff88803536b781 flags: 0x80000000000000(node=0|zone=1) page_type: f5(slab) raw: 0080000000000000 ffff88802ff1c8c0 ffffea0000bf2bc0 dead000000000006 raw: 0000000000000000 00000000800c000c 00000000f5000000 ffff88803536b781 page dumped because: kasan: bad access detected page_owner tracks the page as allocated page last allocated via order 0, migratetype Unmovable, gfp_mask 0x52820(GFP_ATOMIC|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP), pid 9, tgid 9 (kworker/0:0), ts 91119585830, free_ts 91088628818 set_page_owner include/linux/page_owner.h:32 [inline] post_alloc_hook+0x234/0x290 mm/page_alloc.c:1857 prep_new_page mm/page_alloc.c:1865 [inline] get_page_from_freelist+0x28c0/0x2960 mm/page_alloc.c:3915 __alloc_frozen_pages_noprof+0x181/0x370 mm/page_alloc.c:5210 alloc_pages_mpol+0xd1/0x380 mm/mempolicy.c:2486 alloc_slab_page mm/slub.c:3075 [inline] allocate_slab+0x86/0x3b0 mm/slub.c:3248 new_slab mm/slub.c:3302 [inline] ___slab_alloc+0xb10/0x13e0 mm/slub.c:4656 __slab_alloc+0xc6/0x1f0 mm/slub.c:4779 __slab_alloc_node mm/slub.c:4855 [inline] slab_alloc_node mm/slub.c:5251 [inline] kmem_cache_alloc_noprof+0x101/0x6c0 mm/slub.c:5270 dst_alloc+0x105/0x170 net/core/dst.c:89 ip6_dst_alloc net/ipv6/route.c:342 [inline] icmp6_dst_alloc+0x75/0x460 net/ipv6/route.c:3333 mld_sendpack+0x683/0xe60 net/ipv6/mcast.c:1844 mld_send_cr net/ipv6/mcast.c:2154 [inline] mld_ifc_work+0x83e/0xd60 net/ipv6/mcast.c:2693 process_one_work kernel/workqueue.c:3257 [inline] process_scheduled_works+0xad1/0x1770 kernel/workqueue.c:3340 worker_thread+0x8a0/0xda0 kernel/workqueue.c:3421 kthread+0x711/0x8a0 kernel/kthread.c:463 ret_from_fork+0x510/0xa50 arch/x86/kernel/process.c:158 page last free pid 5859 tgid 5859 stack trace: reset_page_owner include/linux/page_owner.h:25 [inline] free_pages_prepare mm/page_alloc.c:1406 [inline] __free_frozen_pages+0xfe1/0x1170 mm/page_alloc.c:2943 discard_slab mm/slub.c:3346 [inline] __put_partials+0x149/0x170 mm/slub.c:3886 __slab_free+0x2af/0x330 mm/slub.c:5952 qlink_free mm/kasan/quarantine.c:163 [inline] qlist_free_all+0x97/0x100 mm/kasan/quarantine.c:179 kasan_quarantine_reduce+0x148/0x160 mm/kasan/quarantine.c:286 __kasan_slab_alloc+0x22/0x80 mm/kasan/common.c:350 kasan_slab_alloc include/linux/kasan.h:253 [inline] slab_post_alloc_hook mm/slub.c:4953 [inline] slab_alloc_node mm/slub.c:5263 [inline] kmem_cache_alloc_noprof+0x18d/0x6c0 mm/slub.c:5270 getname_flags+0xb8/0x540 fs/namei.c:146 getname include/linux/fs.h:2498 [inline] do_sys_openat2+0xbc/0x200 fs/open.c:1426 do_sys_open fs/open.c:1436 [inline] __do_sys_openat fs/open.c:1452 [inline] __se_sys_openat fs/open.c:1447 [inline] __x64_sys_openat+0x138/0x170 fs/open.c:1447 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xec/0xf80 arch/x86/entry/syscall_64.c:94 Fixes: 8d0b94afdca8 ("ipv6: Keep track of DST_NOCACHE routes in case of iface down/unregister") Fixes: 78df76a065ae ("ipv4: take rt_uncached_lock only if needed") Reported-by: syzbot+179fc225724092b8b2b2@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/6964cdf2.050a0220.eaf7.009d.GAE@google.com/T/#u Signed-off-by: Eric Dumazet Cc: Martin KaFai Lau Reviewed-by: David Ahern Link: https://patch.msgid.link/20260112103825.3810713-1-edumazet@google.com Signed-off-by: Jakub Kicinski --- net/core/dst.c | 1 + net/ipv4/route.c | 4 ++-- net/ipv6/route.c | 4 ++-- 3 files changed, 5 insertions(+), 4 deletions(-) diff --git a/net/core/dst.c b/net/core/dst.c index e9d35f49c9e78..1dae26c51ebec 100644 --- a/net/core/dst.c +++ b/net/core/dst.c @@ -68,6 +68,7 @@ void dst_init(struct dst_entry *dst, struct dst_ops *ops, dst->lwtstate = NULL; rcuref_init(&dst->__rcuref, 1); INIT_LIST_HEAD(&dst->rt_uncached); + dst->rt_uncached_list = NULL; dst->__use = 0; dst->lastuse = jiffies; dst->flags = flags; diff --git a/net/ipv4/route.c b/net/ipv4/route.c index b549d6a573073..11d990703d31a 100644 --- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -1537,9 +1537,9 @@ void rt_add_uncached_list(struct rtable *rt) void rt_del_uncached_list(struct rtable *rt) { - if (!list_empty(&rt->dst.rt_uncached)) { - struct uncached_list *ul = rt->dst.rt_uncached_list; + struct uncached_list *ul = rt->dst.rt_uncached_list; + if (ul) { spin_lock_bh(&ul->lock); list_del_init(&rt->dst.rt_uncached); spin_unlock_bh(&ul->lock); diff --git a/net/ipv6/route.c b/net/ipv6/route.c index a3e051dc66ee0..e3a260a5564ba 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -148,9 +148,9 @@ void rt6_uncached_list_add(struct rt6_info *rt) void rt6_uncached_list_del(struct rt6_info *rt) { - if (!list_empty(&rt->dst.rt_uncached)) { - struct uncached_list *ul = rt->dst.rt_uncached_list; + struct uncached_list *ul = rt->dst.rt_uncached_list; + if (ul) { spin_lock_bh(&ul->lock); list_del_init(&rt->dst.rt_uncached); spin_unlock_bh(&ul->lock); From e3f98e90aaeb41118cf8ff2e3aeb76c3c8365cf0 Mon Sep 17 00:00:00 2001 From: Kuniyuki Iwashima Date: Tue, 13 Jan 2026 01:05:08 +0000 Subject: [PATCH 1009/1321] ipv6: Fix use-after-free in inet6_addr_del(). syzbot reported use-after-free of inet6_ifaddr in inet6_addr_del(). [0] The cited commit accidentally moved ipv6_del_addr() for mngtmpaddr before reading its ifp->flags for temporary addresses in inet6_addr_del(). Let's move ipv6_del_addr() down to fix the UAF. [0]: BUG: KASAN: slab-use-after-free in inet6_addr_del.constprop.0+0x67a/0x6b0 net/ipv6/addrconf.c:3117 Read of size 4 at addr ffff88807b89c86c by task syz.3.1618/9593 CPU: 0 UID: 0 PID: 9593 Comm: syz.3.1618 Not tainted syzkaller #0 PREEMPT(full) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025 Call Trace: __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:378 [inline] print_report+0xcd/0x630 mm/kasan/report.c:482 kasan_report+0xe0/0x110 mm/kasan/report.c:595 inet6_addr_del.constprop.0+0x67a/0x6b0 net/ipv6/addrconf.c:3117 addrconf_del_ifaddr+0x11e/0x190 net/ipv6/addrconf.c:3181 inet6_ioctl+0x1e5/0x2b0 net/ipv6/af_inet6.c:582 sock_do_ioctl+0x118/0x280 net/socket.c:1254 sock_ioctl+0x227/0x6b0 net/socket.c:1375 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:597 [inline] __se_sys_ioctl fs/ioctl.c:583 [inline] __x64_sys_ioctl+0x18e/0x210 fs/ioctl.c:583 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xcd/0xf80 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f164cf8f749 Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f164de64038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007f164d1e5fa0 RCX: 00007f164cf8f749 RDX: 0000200000000000 RSI: 0000000000008936 RDI: 0000000000000003 RBP: 00007f164d013f91 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 00007f164d1e6038 R14: 00007f164d1e5fa0 R15: 00007ffde15c8288 Allocated by task 9593: kasan_save_stack+0x33/0x60 mm/kasan/common.c:56 kasan_save_track+0x14/0x30 mm/kasan/common.c:77 poison_kmalloc_redzone mm/kasan/common.c:397 [inline] __kasan_kmalloc+0xaa/0xb0 mm/kasan/common.c:414 kmalloc_noprof include/linux/slab.h:957 [inline] kzalloc_noprof include/linux/slab.h:1094 [inline] ipv6_add_addr+0x4e3/0x2010 net/ipv6/addrconf.c:1120 inet6_addr_add+0x256/0x9b0 net/ipv6/addrconf.c:3050 addrconf_add_ifaddr+0x1fc/0x450 net/ipv6/addrconf.c:3160 inet6_ioctl+0x103/0x2b0 net/ipv6/af_inet6.c:580 sock_do_ioctl+0x118/0x280 net/socket.c:1254 sock_ioctl+0x227/0x6b0 net/socket.c:1375 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:597 [inline] __se_sys_ioctl fs/ioctl.c:583 [inline] __x64_sys_ioctl+0x18e/0x210 fs/ioctl.c:583 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xcd/0xf80 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f Freed by task 6099: kasan_save_stack+0x33/0x60 mm/kasan/common.c:56 kasan_save_track+0x14/0x30 mm/kasan/common.c:77 kasan_save_free_info+0x3b/0x60 mm/kasan/generic.c:584 poison_slab_object mm/kasan/common.c:252 [inline] __kasan_slab_free+0x5f/0x80 mm/kasan/common.c:284 kasan_slab_free include/linux/kasan.h:234 [inline] slab_free_hook mm/slub.c:2540 [inline] slab_free_freelist_hook mm/slub.c:2569 [inline] slab_free_bulk mm/slub.c:6696 [inline] kmem_cache_free_bulk mm/slub.c:7383 [inline] kmem_cache_free_bulk+0x2bf/0x680 mm/slub.c:7362 kfree_bulk include/linux/slab.h:830 [inline] kvfree_rcu_bulk+0x1b7/0x1e0 mm/slab_common.c:1523 kvfree_rcu_drain_ready mm/slab_common.c:1728 [inline] kfree_rcu_monitor+0x1d0/0x2f0 mm/slab_common.c:1801 process_one_work+0x9ba/0x1b20 kernel/workqueue.c:3257 process_scheduled_works kernel/workqueue.c:3340 [inline] worker_thread+0x6c8/0xf10 kernel/workqueue.c:3421 kthread+0x3c5/0x780 kernel/kthread.c:463 ret_from_fork+0x983/0xb10 arch/x86/kernel/process.c:158 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246 Fixes: 00b5b7aab9e42 ("net/ipv6: delete temporary address if mngtmpaddr is removed or unmanaged") Reported-by: syzbot+72e610f4f1a930ca9d8a@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/696598e9.050a0220.3be5c5.0009.GAE@google.com/ Signed-off-by: Kuniyuki Iwashima Reviewed-by: Hangbin Liu Reviewed-by: Eric Dumazet Link: https://patch.msgid.link/20260113010538.2019411-1-kuniyu@google.com Signed-off-by: Jakub Kicinski --- net/ipv6/addrconf.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index b66217d1b2f82..27ab9d7adc649 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -3112,12 +3112,12 @@ static int inet6_addr_del(struct net *net, int ifindex, u32 ifa_flags, in6_ifa_hold(ifp); read_unlock_bh(&idev->lock); - ipv6_del_addr(ifp); - if (!(ifp->flags & IFA_F_TEMPORARY) && (ifp->flags & IFA_F_MANAGETEMPADDR)) delete_tempaddrs(idev, ifp); + ipv6_del_addr(ifp); + addrconf_verify_rtnl(net); if (ipv6_addr_is_multicast(pfx)) { ipv6_mc_config(net->ipv6.mc_autojoin_sk, From 956b7daba7bc323a508c8d153add353745156aeb Mon Sep 17 00:00:00 2001 From: Gal Pressman Date: Mon, 12 Jan 2026 19:37:14 +0200 Subject: [PATCH 1010/1321] selftests: drv-net: fix RPS mask handling in toeplitz test The toeplitz.py test passed the hex mask without "0x" prefix (e.g., "300" for CPUs 8,9). The toeplitz.c strtoul() call wrongly parsed this as decimal 300 (0x12c) instead of hex 0x300. Pass the prefixed mask to toeplitz.c, and the unprefixed one to sysfs. Fixes: 9cf9aa77a1f6 ("selftests: drv-net: hw: convert the Toeplitz test to Python") Reviewed-by: Nimrod Oren Signed-off-by: Gal Pressman Reviewed-by: Willem de Bruijn Link: https://patch.msgid.link/20260112173715.384843-2-gal@nvidia.com Signed-off-by: Jakub Kicinski --- tools/testing/selftests/drivers/net/hw/toeplitz.py | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/drivers/net/hw/toeplitz.py b/tools/testing/selftests/drivers/net/hw/toeplitz.py index d2db5ee9e3586..d288c57894f68 100755 --- a/tools/testing/selftests/drivers/net/hw/toeplitz.py +++ b/tools/testing/selftests/drivers/net/hw/toeplitz.py @@ -94,12 +94,14 @@ def _configure_rps(cfg, rps_cpus): mask = 0 for cpu in rps_cpus: mask |= (1 << cpu) - mask = hex(mask)[2:] + + mask = hex(mask) # Set RPS bitmap for all rx queues for rps_file in glob.glob(f"/sys/class/net/{cfg.ifname}/queues/rx-*/rps_cpus"): with open(rps_file, "w", encoding="utf-8") as fp: - fp.write(mask) + # sysfs expects hex without '0x' prefix, toeplitz.c needs the prefix + fp.write(mask[2:]) return mask From 31df885948accc679597f5c6a52e5b6c9f08ddc1 Mon Sep 17 00:00:00 2001 From: Gal Pressman Date: Mon, 12 Jan 2026 19:37:15 +0200 Subject: [PATCH 1011/1321] selftests: drv-net: fix RPS mask handling for high CPU numbers The RPS bitmask bounds check uses ~(RPS_MAX_CPUS - 1) which equals ~15 = 0xfff0, only allowing CPUs 0-3. Change the mask to ~((1UL << RPS_MAX_CPUS) - 1) = ~0xffff to allow CPUs 0-15. Fixes: 5ebfb4cc3048 ("selftests/net: toeplitz test") Reviewed-by: Nimrod Oren Signed-off-by: Gal Pressman Reviewed-by: Willem de Bruijn Link: https://patch.msgid.link/20260112173715.384843-3-gal@nvidia.com Signed-off-by: Jakub Kicinski --- tools/testing/selftests/drivers/net/hw/toeplitz.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/drivers/net/hw/toeplitz.c b/tools/testing/selftests/drivers/net/hw/toeplitz.c index d23b3b0c20a36..285bb17df9c2f 100644 --- a/tools/testing/selftests/drivers/net/hw/toeplitz.c +++ b/tools/testing/selftests/drivers/net/hw/toeplitz.c @@ -485,8 +485,8 @@ static void parse_rps_bitmap(const char *arg) bitmap = strtoul(arg, NULL, 0); - if (bitmap & ~(RPS_MAX_CPUS - 1)) - error(1, 0, "rps bitmap 0x%lx out of bounds 0..%lu", + if (bitmap & ~((1UL << RPS_MAX_CPUS) - 1)) + error(1, 0, "rps bitmap 0x%lx out of bounds, max cpu %lu", bitmap, RPS_MAX_CPUS - 1); for (i = 0; i < RPS_MAX_CPUS; i++) From 4764f0de797006b78a7d72556a007619be39f5eb Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Mon, 12 Jan 2026 17:56:56 +0000 Subject: [PATCH 1012/1321] net/sched: sch_qfq: do not free existing class in qfq_change_class() Fixes qfq_change_class() error case. cl->qdisc and cl should only be freed if a new class and qdisc were allocated, or we risk various UAF. Fixes: 462dbc9101ac ("pkt_sched: QFQ Plus: fair-queueing service at DRR cost") Reported-by: syzbot+07f3f38f723c335f106d@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/6965351d.050a0220.eaf7.00c5.GAE@google.com/T/#u Signed-off-by: Eric Dumazet Reviewed-by: Jamal Hadi Salim Link: https://patch.msgid.link/20260112175656.17605-1-edumazet@google.com Signed-off-by: Jakub Kicinski --- net/sched/sch_qfq.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/net/sched/sch_qfq.c b/net/sched/sch_qfq.c index f4013b547438f..9d59090bbe934 100644 --- a/net/sched/sch_qfq.c +++ b/net/sched/sch_qfq.c @@ -529,8 +529,10 @@ static int qfq_change_class(struct Qdisc *sch, u32 classid, u32 parentid, return 0; destroy_class: - qdisc_put(cl->qdisc); - kfree(cl); + if (!existing) { + qdisc_put(cl->qdisc); + kfree(cl); + } return err; } From 69eba8209273ac3e4f8c00f5f975ed82c49136c4 Mon Sep 17 00:00:00 2001 From: "Gustavo A. R. Silva" Date: Sat, 10 Jan 2026 17:07:17 +0900 Subject: [PATCH 1013/1321] virtio_net: Fix misalignment bug in struct virtnet_info Use the new TRAILING_OVERLAP() helper to fix a misalignment bug along with the following warning: drivers/net/virtio_net.c:429:46: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] This helper creates a union between a flexible-array member (FAM) and a set of members that would otherwise follow it (in this case `u8 rss_hash_key_data[VIRTIO_NET_RSS_MAX_KEY_SIZE];`). This overlays the trailing members (rss_hash_key_data) onto the FAM (hash_key_data) while keeping the FAM and the start of MEMBERS aligned. The static_assert() ensures this alignment remains. Notice that due to tail padding in flexible `struct virtio_net_rss_config_trailer`, `rss_trailer.hash_key_data` (at offset 83 in struct virtnet_info) and `rss_hash_key_data` (at offset 84 in struct virtnet_info) are misaligned by one byte. See below: struct virtio_net_rss_config_trailer { __le16 max_tx_vq; /* 0 2 */ __u8 hash_key_length; /* 2 1 */ __u8 hash_key_data[]; /* 3 0 */ /* size: 4, cachelines: 1, members: 3 */ /* padding: 1 */ /* last cacheline: 4 bytes */ }; struct virtnet_info { ... struct virtio_net_rss_config_trailer rss_trailer; /* 80 4 */ /* XXX last struct has 1 byte of padding */ u8 rss_hash_key_data[40]; /* 84 40 */ ... /* size: 832, cachelines: 13, members: 48 */ /* sum members: 801, holes: 8, sum holes: 31 */ /* paddings: 2, sum paddings: 5 */ }; After changes, those members are correctly aligned at offset 795: struct virtnet_info { ... union { struct virtio_net_rss_config_trailer rss_trailer; /* 792 4 */ struct { unsigned char __offset_to_hash_key_data[3]; /* 792 3 */ u8 rss_hash_key_data[40]; /* 795 40 */ }; /* 792 43 */ }; /* 792 44 */ ... /* size: 840, cachelines: 14, members: 47 */ /* sum members: 801, holes: 8, sum holes: 35 */ /* padding: 4 */ /* paddings: 1, sum paddings: 4 */ /* last cacheline: 8 bytes */ }; As a result, the RSS key passed to the device is shifted by 1 byte: the last byte is cut off, and instead a (possibly uninitialized) byte is added at the beginning. As a last note `struct virtio_net_rss_config_hdr *rss_hdr;` is also moved to the end, since it seems those three members should stick around together. :) Cc: stable@vger.kernel.org Fixes: ed3100e90d0d ("virtio_net: Use new RSS config structs") Signed-off-by: Gustavo A. R. Silva Acked-by: Michael S. Tsirkin Link: https://patch.msgid.link/aWIItWq5dV9XTTCJ@kspp Signed-off-by: Paolo Abeni --- drivers/net/virtio_net.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index ca92b4a1879ce..db88dcaefb20b 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -425,9 +425,6 @@ struct virtnet_info { u16 rss_indir_table_size; u32 rss_hash_types_supported; u32 rss_hash_types_saved; - struct virtio_net_rss_config_hdr *rss_hdr; - struct virtio_net_rss_config_trailer rss_trailer; - u8 rss_hash_key_data[VIRTIO_NET_RSS_MAX_KEY_SIZE]; /* Has control virtqueue */ bool has_cvq; @@ -484,7 +481,16 @@ struct virtnet_info { struct failover *failover; u64 device_stats_cap; + + struct virtio_net_rss_config_hdr *rss_hdr; + + /* Must be last as it ends in a flexible-array member. */ + TRAILING_OVERLAP(struct virtio_net_rss_config_trailer, rss_trailer, hash_key_data, + u8 rss_hash_key_data[VIRTIO_NET_RSS_MAX_KEY_SIZE]; + ); }; +static_assert(offsetof(struct virtnet_info, rss_trailer.hash_key_data) == + offsetof(struct virtnet_info, rss_hash_key_data)); struct padded_vnet_hdr { struct virtio_net_hdr_v1_hash hdr; From 37619714ab12a5a060a0298eeb366a06f552d954 Mon Sep 17 00:00:00 2001 From: Jianbo Liu Date: Thu, 20 Nov 2025 05:56:09 +0200 Subject: [PATCH 1014/1321] xfrm: Fix inner mode lookup in tunnel mode GSO segmentation Commit 61fafbee6cfe ("xfrm: Determine inner GSO type from packet inner protocol") attempted to fix GSO segmentation by reading the inner protocol from XFRM_MODE_SKB_CB(skb)->protocol. This was incorrect because the field holds the inner L4 protocol (TCP/UDP) instead of the required tunnel protocol. Also, the memory location (shared by XFRM_SKB_CB(skb) which could be overwritten by xfrm_replay_overflow()) is prone to corruption. This combination caused the kernel to select the wrong inner mode and get the wrong address family. The correct value is in xfrm_offload(skb)->proto, which is set from the outer tunnel header's protocol field by esp[4|6]_gso_encap(). It is initialized by xfrm[4|6]_tunnel_encap_add() to either IPPROTO_IPIP or IPPROTO_IPV6, using xfrm_af2proto() and correctly reflects the inner packet's address family. Fixes: 61fafbee6cfe ("xfrm: Determine inner GSO type from packet inner protocol") Signed-off-by: Jianbo Liu Reviewed-by: Sabrina Dubroca Signed-off-by: Steffen Klassert --- net/ipv4/esp4_offload.c | 4 ++-- net/ipv6/esp6_offload.c | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/net/ipv4/esp4_offload.c b/net/ipv4/esp4_offload.c index 05828d4cb6cdb..abd77162f5e75 100644 --- a/net/ipv4/esp4_offload.c +++ b/net/ipv4/esp4_offload.c @@ -122,8 +122,8 @@ static struct sk_buff *xfrm4_tunnel_gso_segment(struct xfrm_state *x, struct sk_buff *skb, netdev_features_t features) { - const struct xfrm_mode *inner_mode = xfrm_ip2inner_mode(x, - XFRM_MODE_SKB_CB(skb)->protocol); + struct xfrm_offload *xo = xfrm_offload(skb); + const struct xfrm_mode *inner_mode = xfrm_ip2inner_mode(x, xo->proto); __be16 type = inner_mode->family == AF_INET6 ? htons(ETH_P_IPV6) : htons(ETH_P_IP); diff --git a/net/ipv6/esp6_offload.c b/net/ipv6/esp6_offload.c index 22410243ebe88..22895521a57d0 100644 --- a/net/ipv6/esp6_offload.c +++ b/net/ipv6/esp6_offload.c @@ -158,8 +158,8 @@ static struct sk_buff *xfrm6_tunnel_gso_segment(struct xfrm_state *x, struct sk_buff *skb, netdev_features_t features) { - const struct xfrm_mode *inner_mode = xfrm_ip2inner_mode(x, - XFRM_MODE_SKB_CB(skb)->protocol); + struct xfrm_offload *xo = xfrm_offload(skb); + const struct xfrm_mode *inner_mode = xfrm_ip2inner_mode(x, xo->proto); __be16 type = inner_mode->family == AF_INET ? htons(ETH_P_IP) : htons(ETH_P_IPV6); From bc969c4436fc253da066255ab79ebbe7f40bc76b Mon Sep 17 00:00:00 2001 From: Antony Antony Date: Thu, 11 Dec 2025 11:30:27 +0100 Subject: [PATCH 1015/1321] xfrm: set ipv4 no_pmtu_disc flag only on output sa when direction is set The XFRM_STATE_NOPMTUDISC flag is only meaningful for output SAs, but it was being applied regardless of the SA direction when the sysctl ip_no_pmtu_disc is enabled. This can unintentionally affect input SAs. Limit setting XFRM_STATE_NOPMTUDISC to output SAs when the SA direction is configured. Closes: https://github.com/strongswan/strongswan/issues/2946 Fixes: a4a87fa4e96c ("xfrm: Add Direction to the SA in or out") Signed-off-by: Antony Antony Signed-off-by: Steffen Klassert --- net/xfrm/xfrm_state.c | 1 + 1 file changed, 1 insertion(+) diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index 9e14e453b55cc..98b362d518363 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -3151,6 +3151,7 @@ int __xfrm_init_state(struct xfrm_state *x, struct netlink_ext_ack *extack) int err; if (family == AF_INET && + (!x->dir || x->dir == XFRM_SA_DIR_OUT) && READ_ONCE(xs_net(x)->ipv4.sysctl_ip_no_pmtu_disc)) x->props.flags |= XFRM_STATE_NOPMTUDISC; From a1e793a6ab286346672a28c57fd33a2c7071dda4 Mon Sep 17 00:00:00 2001 From: Oliver Hartkopp Date: Fri, 9 Jan 2026 15:41:33 +0100 Subject: [PATCH 1016/1321] Revert "can: raw: instantly reject unsupported CAN frames" This reverts commit 1a620a723853a0f49703c317d52dc6b9602cbaa8 and its follow-up fixes for the introduced dependency issues. commit 1a620a723853 ("can: raw: instantly reject unsupported CAN frames") commit cb2dc6d2869a ("can: Kconfig: select CAN driver infrastructure by default") commit 6abd4577bccc ("can: fix build dependency") commit 5a5aff6338c0 ("can: fix build dependency") The entire problem was caused by the requirement that a new network layer feature needed to know about the protocol capabilities of the CAN devices. Instead of accessing CAN device internal data structures which caused the dependency problems a better approach has been developed which makes use of CAN specific ml_priv data which is accessible from both sides. Cc: Marc Kleine-Budde Cc: Arnd Bergmann Cc: Vincent Mailhol Signed-off-by: Oliver Hartkopp Link: https://patch.msgid.link/20260109144135.8495-2-socketcan@hartkopp.net Signed-off-by: Marc Kleine-Budde --- drivers/net/can/Kconfig | 7 +++-- drivers/net/can/Makefile | 2 +- drivers/net/can/dev/Makefile | 5 ++-- include/linux/can/dev.h | 7 ----- net/can/raw.c | 54 ++++++------------------------------ 5 files changed, 17 insertions(+), 58 deletions(-) diff --git a/drivers/net/can/Kconfig b/drivers/net/can/Kconfig index cfaea6178a719..e15e320db4763 100644 --- a/drivers/net/can/Kconfig +++ b/drivers/net/can/Kconfig @@ -1,7 +1,7 @@ # SPDX-License-Identifier: GPL-2.0-only menuconfig CAN_DEV - bool "CAN Device Drivers" + tristate "CAN Device Drivers" default y depends on CAN help @@ -17,7 +17,10 @@ menuconfig CAN_DEV virtual ones. If you own such devices or plan to use the virtual CAN interfaces to develop applications, say Y here. -if CAN_DEV && CAN + To compile as a module, choose M here: the module will be called + can-dev. + +if CAN_DEV config CAN_VCAN tristate "Virtual Local CAN Interface (vcan)" diff --git a/drivers/net/can/Makefile b/drivers/net/can/Makefile index 37e2f1a2faecd..d7bc10a6b8eae 100644 --- a/drivers/net/can/Makefile +++ b/drivers/net/can/Makefile @@ -7,7 +7,7 @@ obj-$(CONFIG_CAN_VCAN) += vcan.o obj-$(CONFIG_CAN_VXCAN) += vxcan.o obj-$(CONFIG_CAN_SLCAN) += slcan/ -obj-$(CONFIG_CAN_DEV) += dev/ +obj-y += dev/ obj-y += esd/ obj-y += rcar/ obj-y += rockchip/ diff --git a/drivers/net/can/dev/Makefile b/drivers/net/can/dev/Makefile index 64226acf0f3d4..633687d6b6c0c 100644 --- a/drivers/net/can/dev/Makefile +++ b/drivers/net/can/dev/Makefile @@ -1,8 +1,9 @@ # SPDX-License-Identifier: GPL-2.0 -obj-$(CONFIG_CAN) += can-dev.o +obj-$(CONFIG_CAN_DEV) += can-dev.o + +can-dev-y += skb.o -can-dev-$(CONFIG_CAN_DEV) += skb.o can-dev-$(CONFIG_CAN_CALC_BITTIMING) += calc_bittiming.o can-dev-$(CONFIG_CAN_NETLINK) += bittiming.o can-dev-$(CONFIG_CAN_NETLINK) += dev.o diff --git a/include/linux/can/dev.h b/include/linux/can/dev.h index f6416a56e95d5..52c8be5c160e4 100644 --- a/include/linux/can/dev.h +++ b/include/linux/can/dev.h @@ -111,14 +111,7 @@ struct net_device *alloc_candev_mqs(int sizeof_priv, unsigned int echo_skb_max, void free_candev(struct net_device *dev); /* a candev safe wrapper around netdev_priv */ -#if IS_ENABLED(CONFIG_CAN_NETLINK) struct can_priv *safe_candev_priv(struct net_device *dev); -#else -static inline struct can_priv *safe_candev_priv(struct net_device *dev) -{ - return NULL; -} -#endif int open_candev(struct net_device *dev); void close_candev(struct net_device *dev); diff --git a/net/can/raw.c b/net/can/raw.c index be1ef7cf4204e..f36a83d3447cf 100644 --- a/net/can/raw.c +++ b/net/can/raw.c @@ -892,58 +892,20 @@ static void raw_put_canxl_vcid(struct raw_sock *ro, struct sk_buff *skb) } } -static inline bool raw_dev_cc_enabled(struct net_device *dev, - struct can_priv *priv) +static unsigned int raw_check_txframe(struct raw_sock *ro, struct sk_buff *skb, int mtu) { - /* The CANXL-only mode disables error-signalling on the CAN bus - * which is needed to send CAN CC/FD frames - */ - if (priv) - return !can_dev_in_xl_only_mode(priv); - - /* virtual CAN interfaces always support CAN CC */ - return true; -} - -static inline bool raw_dev_fd_enabled(struct net_device *dev, - struct can_priv *priv) -{ - /* check FD ctrlmode on real CAN interfaces */ - if (priv) - return (priv->ctrlmode & CAN_CTRLMODE_FD); - - /* check MTU for virtual CAN FD interfaces */ - return (READ_ONCE(dev->mtu) >= CANFD_MTU); -} - -static inline bool raw_dev_xl_enabled(struct net_device *dev, - struct can_priv *priv) -{ - /* check XL ctrlmode on real CAN interfaces */ - if (priv) - return (priv->ctrlmode & CAN_CTRLMODE_XL); - - /* check MTU for virtual CAN XL interfaces */ - return can_is_canxl_dev_mtu(READ_ONCE(dev->mtu)); -} - -static unsigned int raw_check_txframe(struct raw_sock *ro, struct sk_buff *skb, - struct net_device *dev) -{ - struct can_priv *priv = safe_candev_priv(dev); - - /* Classical CAN */ - if (can_is_can_skb(skb) && raw_dev_cc_enabled(dev, priv)) + /* Classical CAN -> no checks for flags and device capabilities */ + if (can_is_can_skb(skb)) return CAN_MTU; - /* CAN FD */ + /* CAN FD -> needs to be enabled and a CAN FD or CAN XL device */ if (ro->fd_frames && can_is_canfd_skb(skb) && - raw_dev_fd_enabled(dev, priv)) + (mtu == CANFD_MTU || can_is_canxl_dev_mtu(mtu))) return CANFD_MTU; - /* CAN XL */ + /* CAN XL -> needs to be enabled and a CAN XL device */ if (ro->xl_frames && can_is_canxl_skb(skb) && - raw_dev_xl_enabled(dev, priv)) + can_is_canxl_dev_mtu(mtu)) return CANXL_MTU; return 0; @@ -999,7 +961,7 @@ static int raw_sendmsg(struct socket *sock, struct msghdr *msg, size_t size) err = -EINVAL; /* check for valid CAN (CC/FD/XL) frame content */ - txmtu = raw_check_txframe(ro, skb, dev); + txmtu = raw_check_txframe(ro, skb, READ_ONCE(dev->mtu)); if (!txmtu) goto free_skb; From a1bec0d5b3b2701920cef5c469b644ef6300dfbe Mon Sep 17 00:00:00 2001 From: Oliver Hartkopp Date: Fri, 9 Jan 2026 15:41:34 +0100 Subject: [PATCH 1017/1321] can: propagate CAN device capabilities via ml_priv Commit 1a620a723853 ("can: raw: instantly reject unsupported CAN frames") caused a sequence of dependency and linker fixes. Instead of accessing CAN device internal data structures which caused the dependency problems this patch introduces capability information into the CAN specific ml_priv data which is accessible from both sides. With this change the CAN network layer can check the required features and the decoupling of the driver layer and network layer is restored. Fixes: 1a620a723853 ("can: raw: instantly reject unsupported CAN frames") Cc: Marc Kleine-Budde Cc: Arnd Bergmann Cc: Vincent Mailhol Signed-off-by: Oliver Hartkopp Link: https://patch.msgid.link/20260109144135.8495-3-socketcan@hartkopp.net Signed-off-by: Marc Kleine-Budde --- drivers/net/can/dev/dev.c | 27 +++++++++++++++++++++++++++ drivers/net/can/dev/netlink.c | 1 + drivers/net/can/vcan.c | 15 +++++++++++++++ drivers/net/can/vxcan.c | 15 +++++++++++++++ include/linux/can/can-ml.h | 24 ++++++++++++++++++++++++ include/linux/can/dev.h | 1 + 6 files changed, 83 insertions(+) diff --git a/drivers/net/can/dev/dev.c b/drivers/net/can/dev/dev.c index 091f30e94c610..7ab9578f5b890 100644 --- a/drivers/net/can/dev/dev.c +++ b/drivers/net/can/dev/dev.c @@ -375,6 +375,32 @@ void can_set_default_mtu(struct net_device *dev) } } +void can_set_cap_info(struct net_device *dev) +{ + struct can_priv *priv = netdev_priv(dev); + u32 can_cap; + + if (can_dev_in_xl_only_mode(priv)) { + /* XL only mode => no CC/FD capability */ + can_cap = CAN_CAP_XL; + } else { + /* mixed mode => CC + FD/XL capability */ + can_cap = CAN_CAP_CC; + + if (priv->ctrlmode & CAN_CTRLMODE_FD) + can_cap |= CAN_CAP_FD; + + if (priv->ctrlmode & CAN_CTRLMODE_XL) + can_cap |= CAN_CAP_XL; + } + + if (priv->ctrlmode & (CAN_CTRLMODE_LISTENONLY | + CAN_CTRLMODE_RESTRICTED)) + can_cap |= CAN_CAP_RO; + + can_set_cap(dev, can_cap); +} + /* helper to define static CAN controller features at device creation time */ int can_set_static_ctrlmode(struct net_device *dev, u32 static_mode) { @@ -390,6 +416,7 @@ int can_set_static_ctrlmode(struct net_device *dev, u32 static_mode) /* override MTU which was set by default in can_setup()? */ can_set_default_mtu(dev); + can_set_cap_info(dev); return 0; } diff --git a/drivers/net/can/dev/netlink.c b/drivers/net/can/dev/netlink.c index d6b0e686fb114..0498198a46965 100644 --- a/drivers/net/can/dev/netlink.c +++ b/drivers/net/can/dev/netlink.c @@ -377,6 +377,7 @@ static int can_ctrlmode_changelink(struct net_device *dev, } can_set_default_mtu(dev); + can_set_cap_info(dev); return 0; } diff --git a/drivers/net/can/vcan.c b/drivers/net/can/vcan.c index fdc662aea2798..76e6b7b5c6a11 100644 --- a/drivers/net/can/vcan.c +++ b/drivers/net/can/vcan.c @@ -130,6 +130,19 @@ static netdev_tx_t vcan_tx(struct sk_buff *skb, struct net_device *dev) return NETDEV_TX_OK; } +static void vcan_set_cap_info(struct net_device *dev) +{ + u32 can_cap = CAN_CAP_CC; + + if (dev->mtu > CAN_MTU) + can_cap |= CAN_CAP_FD; + + if (dev->mtu >= CANXL_MIN_MTU) + can_cap |= CAN_CAP_XL; + + can_set_cap(dev, can_cap); +} + static int vcan_change_mtu(struct net_device *dev, int new_mtu) { /* Do not allow changing the MTU while running */ @@ -141,6 +154,7 @@ static int vcan_change_mtu(struct net_device *dev, int new_mtu) return -EINVAL; WRITE_ONCE(dev->mtu, new_mtu); + vcan_set_cap_info(dev); return 0; } @@ -162,6 +176,7 @@ static void vcan_setup(struct net_device *dev) dev->tx_queue_len = 0; dev->flags = IFF_NOARP; can_set_ml_priv(dev, netdev_priv(dev)); + vcan_set_cap_info(dev); /* set flags according to driver capabilities */ if (echo) diff --git a/drivers/net/can/vxcan.c b/drivers/net/can/vxcan.c index b2c19f8c5f8e5..f14c6f02b6627 100644 --- a/drivers/net/can/vxcan.c +++ b/drivers/net/can/vxcan.c @@ -125,6 +125,19 @@ static int vxcan_get_iflink(const struct net_device *dev) return iflink; } +static void vxcan_set_cap_info(struct net_device *dev) +{ + u32 can_cap = CAN_CAP_CC; + + if (dev->mtu > CAN_MTU) + can_cap |= CAN_CAP_FD; + + if (dev->mtu >= CANXL_MIN_MTU) + can_cap |= CAN_CAP_XL; + + can_set_cap(dev, can_cap); +} + static int vxcan_change_mtu(struct net_device *dev, int new_mtu) { /* Do not allow changing the MTU while running */ @@ -136,6 +149,7 @@ static int vxcan_change_mtu(struct net_device *dev, int new_mtu) return -EINVAL; WRITE_ONCE(dev->mtu, new_mtu); + vxcan_set_cap_info(dev); return 0; } @@ -167,6 +181,7 @@ static void vxcan_setup(struct net_device *dev) can_ml = netdev_priv(dev) + ALIGN(sizeof(struct vxcan_priv), NETDEV_ALIGN); can_set_ml_priv(dev, can_ml); + vxcan_set_cap_info(dev); } /* forward declaration for rtnl_create_link() */ diff --git a/include/linux/can/can-ml.h b/include/linux/can/can-ml.h index 8afa92d15a664..1e99fda2b3803 100644 --- a/include/linux/can/can-ml.h +++ b/include/linux/can/can-ml.h @@ -46,6 +46,12 @@ #include #include +/* exposed CAN device capabilities for network layer */ +#define CAN_CAP_CC BIT(0) /* CAN CC aka Classical CAN */ +#define CAN_CAP_FD BIT(1) /* CAN FD */ +#define CAN_CAP_XL BIT(2) /* CAN XL */ +#define CAN_CAP_RO BIT(3) /* read-only mode (LISTEN/RESTRICTED) */ + #define CAN_SFF_RCV_ARRAY_SZ (1 << CAN_SFF_ID_BITS) #define CAN_EFF_RCV_HASH_BITS 10 #define CAN_EFF_RCV_ARRAY_SZ (1 << CAN_EFF_RCV_HASH_BITS) @@ -64,6 +70,7 @@ struct can_ml_priv { #ifdef CAN_J1939 struct j1939_priv *j1939_priv; #endif + u32 can_cap; }; static inline struct can_ml_priv *can_get_ml_priv(struct net_device *dev) @@ -77,4 +84,21 @@ static inline void can_set_ml_priv(struct net_device *dev, netdev_set_ml_priv(dev, ml_priv, ML_PRIV_CAN); } +static inline bool can_cap_enabled(struct net_device *dev, u32 cap) +{ + struct can_ml_priv *can_ml = can_get_ml_priv(dev); + + if (!can_ml) + return false; + + return (can_ml->can_cap & cap); +} + +static inline void can_set_cap(struct net_device *dev, u32 cap) +{ + struct can_ml_priv *can_ml = can_get_ml_priv(dev); + + can_ml->can_cap = cap; +} + #endif /* CAN_ML_H */ diff --git a/include/linux/can/dev.h b/include/linux/can/dev.h index 52c8be5c160e4..6d0710d6f571d 100644 --- a/include/linux/can/dev.h +++ b/include/linux/can/dev.h @@ -116,6 +116,7 @@ struct can_priv *safe_candev_priv(struct net_device *dev); int open_candev(struct net_device *dev); void close_candev(struct net_device *dev); void can_set_default_mtu(struct net_device *dev); +void can_set_cap_info(struct net_device *dev); int __must_check can_set_static_ctrlmode(struct net_device *dev, u32 static_mode); int can_hwtstamp_get(struct net_device *netdev, From 96e14d1897bdbdc548960aa4e2541661e55aac31 Mon Sep 17 00:00:00 2001 From: Oliver Hartkopp Date: Fri, 9 Jan 2026 15:41:35 +0100 Subject: [PATCH 1018/1321] can: raw: instantly reject disabled CAN frames For real CAN interfaces the CAN_CTRLMODE_FD and CAN_CTRLMODE_XL control modes indicate whether an interface can handle those CAN FD/XL frames. In the case a CAN XL interface is configured in CANXL-only mode with disabled error-signalling neither CAN CC nor CAN FD frames can be sent. The checks are now performed on CAN_RAW sockets to give an instant feedback to the user when writing unsupported CAN frames to the interface or when the CAN interface is in read-only mode. Fixes: 1a620a723853 ("can: raw: instantly reject unsupported CAN frames") Cc: Marc Kleine-Budde Cc: Arnd Bergmann Cc: Vincent Mailhol Signed-off-by: Oliver Hartkopp Link: https://patch.msgid.link/20260109144135.8495-4-socketcan@hartkopp.net [mkl: fix dev reference leak] Link: https://lore.kernel.org/all/0636c732-2e71-4633-8005-dfa85e1da445@hartkopp.net Signed-off-by: Marc Kleine-Budde --- net/can/raw.c | 25 ++++++++++++++++--------- 1 file changed, 16 insertions(+), 9 deletions(-) diff --git a/net/can/raw.c b/net/can/raw.c index f36a83d3447cf..12293363413ce 100644 --- a/net/can/raw.c +++ b/net/can/raw.c @@ -49,8 +49,8 @@ #include #include #include +#include #include -#include /* for can_is_canxl_dev_mtu() */ #include #include #include @@ -892,20 +892,21 @@ static void raw_put_canxl_vcid(struct raw_sock *ro, struct sk_buff *skb) } } -static unsigned int raw_check_txframe(struct raw_sock *ro, struct sk_buff *skb, int mtu) +static unsigned int raw_check_txframe(struct raw_sock *ro, struct sk_buff *skb, + struct net_device *dev) { - /* Classical CAN -> no checks for flags and device capabilities */ - if (can_is_can_skb(skb)) + /* Classical CAN */ + if (can_is_can_skb(skb) && can_cap_enabled(dev, CAN_CAP_CC)) return CAN_MTU; - /* CAN FD -> needs to be enabled and a CAN FD or CAN XL device */ + /* CAN FD */ if (ro->fd_frames && can_is_canfd_skb(skb) && - (mtu == CANFD_MTU || can_is_canxl_dev_mtu(mtu))) + can_cap_enabled(dev, CAN_CAP_FD)) return CANFD_MTU; - /* CAN XL -> needs to be enabled and a CAN XL device */ + /* CAN XL */ if (ro->xl_frames && can_is_canxl_skb(skb) && - can_is_canxl_dev_mtu(mtu)) + can_cap_enabled(dev, CAN_CAP_XL)) return CANXL_MTU; return 0; @@ -944,6 +945,12 @@ static int raw_sendmsg(struct socket *sock, struct msghdr *msg, size_t size) if (!dev) return -ENXIO; + /* no sending on a CAN device in read-only mode */ + if (can_cap_enabled(dev, CAN_CAP_RO)) { + err = -EACCES; + goto put_dev; + } + skb = sock_alloc_send_skb(sk, size + sizeof(struct can_skb_priv), msg->msg_flags & MSG_DONTWAIT, &err); if (!skb) @@ -961,7 +968,7 @@ static int raw_sendmsg(struct socket *sock, struct msghdr *msg, size_t size) err = -EINVAL; /* check for valid CAN (CC/FD/XL) frame content */ - txmtu = raw_check_txframe(ro, skb, READ_ONCE(dev->mtu)); + txmtu = raw_check_txframe(ro, skb, dev); if (!txmtu) goto free_skb; From b5e7db4b2098802e2af81f93a8a2debd74e2304e Mon Sep 17 00:00:00 2001 From: Tetsuo Handa Date: Wed, 14 Jan 2026 00:28:47 +0900 Subject: [PATCH 1019/1321] net: can: j1939: j1939_xtp_rx_rts_session_active(): deactivate session upon receiving the second rts Since j1939_session_deactivate_activate_next() in j1939_tp_rxtimer() is called only when the timer is enabled, we need to call j1939_session_deactivate_activate_next() if we cancelled the timer. Otherwise, refcount for j1939_session leaks, which will later appear as | unregister_netdevice: waiting for vcan0 to become free. Usage count = 2. problem. Reported-by: syzbot Closes: https://syzkaller.appspot.com/bug?extid=881d65229ca4f9ae8c84 Signed-off-by: Tetsuo Handa Tested-by: Oleksij Rempel Acked-by: Oleksij Rempel Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol") Link: https://patch.msgid.link/b1212653-8fa1-44e1-be9d-12f950fb3a07@I-love.SAKURA.ne.jp Cc: stable@vger.kernel.org Signed-off-by: Marc Kleine-Budde --- net/can/j1939/transport.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/net/can/j1939/transport.c b/net/can/j1939/transport.c index 613a911dda100..8656ab388c83e 100644 --- a/net/can/j1939/transport.c +++ b/net/can/j1939/transport.c @@ -1695,8 +1695,16 @@ static int j1939_xtp_rx_rts_session_active(struct j1939_session *session, j1939_session_timers_cancel(session); j1939_session_cancel(session, J1939_XTP_ABORT_BUSY); - if (session->transmission) + if (session->transmission) { j1939_session_deactivate_activate_next(session); + } else if (session->state == J1939_SESSION_WAITING_ABORT) { + /* Force deactivation for the receiver. + * If we rely on the timer starting in j1939_session_cancel, + * a second RTS call here will cancel that timer and fail + * to restart it because the state is already WAITING_ABORT. + */ + j1939_session_deactivate_activate_next(session); + } return -EBUSY; } From 3f6a854348d6e89b03dd5ef5eb80adc4704cae1b Mon Sep 17 00:00:00 2001 From: Bagas Sanjaya Date: Fri, 19 Dec 2025 08:40:04 +0700 Subject: [PATCH 1020/1321] mm: describe @flags parameter in memalloc_flags_save() Patch series "mm kernel-doc fixes". Here are kernel-doc fixes for mm subsystem. I'm also including textsearch fix since there's currently no maintainer for include/linux/textsearch.h (get_maintainer.pl only shows LKML). This patch (of 4): Sphinx reports kernel-doc warning: WARNING: ./include/linux/sched/mm.h:332 function parameter 'flags' not described in 'memalloc_flags_save' Describe @flags to fix it. Link: https://lkml.kernel.org/r/20251219014006.16328-2-bagasdotme@gmail.com Link: https://lkml.kernel.org/r/20251219014006.16328-3-bagasdotme@gmail.com Signed-off-by: Bagas Sanjaya Fixes: 3f6d5e6a468d ("mm: introduce memalloc_flags_{save,restore}") Acked-by: David Hildenbrand (Red Hat) Acked-by: Harry Yoo Signed-off-by: Andrew Morton --- include/linux/sched/mm.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h index 0e1d73955fa51..95d0040df5841 100644 --- a/include/linux/sched/mm.h +++ b/include/linux/sched/mm.h @@ -325,6 +325,7 @@ static inline void might_alloc(gfp_t gfp_mask) /** * memalloc_flags_save - Add a PF_* flag to current->flags, save old value + * @flags: Flags to add. * * This allows PF_* flags to be conveniently added, irrespective of current * value, and then the old version restored with memalloc_flags_restore(). From d3995f0230306f66c1f086f2d2e020918d45d674 Mon Sep 17 00:00:00 2001 From: Bagas Sanjaya Date: Fri, 19 Dec 2025 08:40:05 +0700 Subject: [PATCH 1021/1321] textsearch: describe @list member in ts_ops search Sphinx reports kernel-doc warning: WARNING: ./include/linux/textsearch.h:49 struct member 'list' not described in 'ts_ops' Describe @list member to fix it. Link: https://lkml.kernel.org/r/20251219014006.16328-4-bagasdotme@gmail.com Fixes: 2de4ff7bd658 ("[LIB]: Textsearch infrastructure.") Signed-off-by: Bagas Sanjaya Cc: Thomas Graf Cc: "David S. Miller" Signed-off-by: Andrew Morton --- include/linux/textsearch.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/linux/textsearch.h b/include/linux/textsearch.h index 6673e4d4ac2e1..4933777404d61 100644 --- a/include/linux/textsearch.h +++ b/include/linux/textsearch.h @@ -35,6 +35,7 @@ struct ts_state * @get_pattern: return head of pattern * @get_pattern_len: return length of pattern * @owner: module reference to algorithm + * @list: list to search */ struct ts_ops { From 85a74cf6a99abd2305361a9cb98e98b5d58f90e7 Mon Sep 17 00:00:00 2001 From: Bagas Sanjaya Date: Fri, 19 Dec 2025 08:40:06 +0700 Subject: [PATCH 1022/1321] mm: vmalloc: fix up vrealloc_node_align() kernel-doc macro name Sphinx reports kernel-doc warning: WARNING: ./mm/vmalloc.c:4284 expecting prototype for vrealloc_node_align_noprof(). Prototype was for vrealloc_node_align() instead Fix the macro name in vrealloc_node_align_noprof() kernel-doc comment. Link: https://lkml.kernel.org/r/20251219014006.16328-5-bagasdotme@gmail.com Fixes: 4c5d3365882d ("mm/vmalloc: allow to set node and align in vrealloc") Signed-off-by: Bagas Sanjaya Reviewed-by: Vishal Moola (Oracle) Signed-off-by: Andrew Morton --- mm/vmalloc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/vmalloc.c b/mm/vmalloc.c index 41dd01e8430c5..628f96e83b118 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -4248,7 +4248,7 @@ void *vzalloc_node_noprof(unsigned long size, int node) EXPORT_SYMBOL(vzalloc_node_noprof); /** - * vrealloc_node_align_noprof - reallocate virtually contiguous memory; contents + * vrealloc_node_align - reallocate virtually contiguous memory; contents * remain unchanged * @p: object to reallocate memory for * @size: the size to reallocate From 4bc193285225969ca413a65b983494aa5a96b946 Mon Sep 17 00:00:00 2001 From: Bagas Sanjaya Date: Fri, 19 Dec 2025 08:40:07 +0700 Subject: [PATCH 1023/1321] mm, kfence: describe @slab parameter in __kfence_obj_info() Sphinx reports kernel-doc warning: WARNING: ./include/linux/kfence.h:220 function parameter 'slab' not described in '__kfence_obj_info' Fix it by describing @slab parameter. Link: https://lkml.kernel.org/r/20251219014006.16328-6-bagasdotme@gmail.com Fixes: 2dfe63e61cc3 ("mm, kfence: support kmem_dump_obj() for KFENCE objects") Signed-off-by: Bagas Sanjaya Acked-by: Marco Elver Acked-by: David Hildenbrand (Red Hat) Acked-by: Harry Yoo Signed-off-by: Andrew Morton --- include/linux/kfence.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/linux/kfence.h b/include/linux/kfence.h index 0ad1ddbb8b996..e5822f6e7f279 100644 --- a/include/linux/kfence.h +++ b/include/linux/kfence.h @@ -211,6 +211,7 @@ struct kmem_obj_info; * __kfence_obj_info() - fill kmem_obj_info struct * @kpp: kmem_obj_info to be filled * @object: the object + * @slab: the slab * * Return: * * false - not a KFENCE object From 887e8cf905488fc6bdb0c84369e85be04ee2cfa1 Mon Sep 17 00:00:00 2001 From: Szymon Wilczek Date: Sun, 21 Dec 2025 16:17:10 +0100 Subject: [PATCH 1024/1321] mailmap: update email address for Szymon Wilczek Map my old address to my new address . The old account is no longer accessible due to provider blocking access. Link: https://lkml.kernel.org/r/20251221151710.13747-1-swilczek.lx@gmail.com Signed-off-by: Szymon Wilczek Signed-off-by: Andrew Morton --- .mailmap | 1 + 1 file changed, 1 insertion(+) diff --git a/.mailmap b/.mailmap index fa018b5bd5339..bc94004bc4c2e 100644 --- a/.mailmap +++ b/.mailmap @@ -794,6 +794,7 @@ Sven Eckelmann Sven Eckelmann Sven Eckelmann Sven Peter +Szymon Wilczek Takashi YOSHII Tamizh Chelvam Raja Taniya Das From ae91b0dc4cb0a928b7fdf0dbb7f72fd0178a238f Mon Sep 17 00:00:00 2001 From: Marco Elver Date: Mon, 22 Dec 2025 16:00:06 +0100 Subject: [PATCH 1025/1321] docs: kernel-parameters: add kfence parameters Add a brief summary for KFENCE's kernel command-line parameters in admin-guide/kernel-parameters. Link: https://lkml.kernel.org/r/20251222150018.1349672-1-elver@google.com Signed-off-by: Marco Elver Cc: Alexander Potapenko Cc: Dmitriy Vyukov Cc: Jonathan Corbet Signed-off-by: Andrew Morton --- .../admin-guide/kernel-parameters.txt | 35 +++++++++++++++++++ 1 file changed, 35 insertions(+) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index a8d0afde7f85a..1058f2a6d6a8c 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -2917,6 +2917,41 @@ Kernel parameters for Movable pages. "nn[KMGTPE]", "nn%", and "mirror" are exclusive, so you cannot specify multiple forms. + kfence.burst= [MM,KFENCE] The number of additional successive + allocations to be attempted through KFENCE for each + sample interval. + Format: + Default: 0 + + kfence.check_on_panic= + [MM,KFENCE] Whether to check all KFENCE-managed objects' + canaries on panic. + Format: + Default: false + + kfence.deferrable= + [MM,KFENCE] Whether to use a deferrable timer to trigger + allocations. This avoids forcing CPU wake-ups if the + system is idle, at the risk of a less predictable + sample interval. + Format: + Default: CONFIG_KFENCE_DEFERRABLE + + kfence.sample_interval= + [MM,KFENCE] KFENCE's sample interval in milliseconds. + Format: + 0 - Disable KFENCE. + >0 - Enabled KFENCE with given sample interval. + Default: CONFIG_KFENCE_SAMPLE_INTERVAL + + kfence.skip_covered_thresh= + [MM,KFENCE] If pool utilization reaches this threshold + (pool usage%), KFENCE limits currently covered + allocations of the same source from further filling + up the pool. + Format: + Default: 75 + kgdbdbgp= [KGDB,HW,EARLY] kgdb over EHCI usb debug port. Format: [,poll interval] The controller # is the number of the ehci usb debug From 203933d225c876eb56f3c321e8d993363dc9d1ab Mon Sep 17 00:00:00 2001 From: Shakeel Butt Date: Mon, 22 Dec 2025 12:58:59 -0800 Subject: [PATCH 1026/1321] lib/buildid: use __kernel_read() for sleepable context Prevent a "BUG: unable to handle kernel NULL pointer dereference in filemap_read_folio". For the sleepable context, convert freader to use __kernel_read() instead of direct page cache access via read_cache_folio(). This simplifies the faultable code path by using the standard kernel file reading interface which handles all the complexity of reading file data. At the moment we are not changing the code for non-sleepable context which uses filemap_get_folio() and only succeeds if the target folios are already in memory and up-to-date. The reason is to keep the patch simple and easier to backport to stable kernels. Syzbot repro does not crash the kernel anymore and the selftests run successfully. In the follow up we will make __kernel_read() with IOCB_NOWAIT work for non-sleepable contexts. In addition, I would like to replace the secretmem check with a more generic approach and will add fstest for the buildid code. Link: https://lkml.kernel.org/r/20251222205859.3968077-1-shakeel.butt@linux.dev Fixes: ad41251c290d ("lib/buildid: implement sleepable build_id_parse() API") Reported-by: syzbot+09b7d050e4806540153d@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=09b7d050e4806540153d Signed-off-by: Shakeel Butt Reviewed-by: Christoph Hellwig Tested-by: Jinchao Wang Link: https://lkml.kernel.org/r/aUteBPWPYzVWIZFH@ndev Reviewed-by: Christian Brauner Cc: Alexei Starovoitov Cc: Andrii Nakryiko Cc: Daniel Borkman Cc: "Darrick J. Wong" Cc: Matthew Wilcox (Oracle) Cc: Signed-off-by: Andrew Morton --- lib/buildid.c | 32 ++++++++++++++++++++------------ 1 file changed, 20 insertions(+), 12 deletions(-) diff --git a/lib/buildid.c b/lib/buildid.c index aaf61dfc09195..818331051afe2 100644 --- a/lib/buildid.c +++ b/lib/buildid.c @@ -5,6 +5,7 @@ #include #include #include +#include #include #define BUILD_ID 3 @@ -46,20 +47,9 @@ static int freader_get_folio(struct freader *r, loff_t file_off) freader_put_folio(r); - /* reject secretmem folios created with memfd_secret() */ - if (secretmem_mapping(r->file->f_mapping)) - return -EFAULT; - + /* only use page cache lookup - fail if not already cached */ r->folio = filemap_get_folio(r->file->f_mapping, file_off >> PAGE_SHIFT); - /* if sleeping is allowed, wait for the page, if necessary */ - if (r->may_fault && (IS_ERR(r->folio) || !folio_test_uptodate(r->folio))) { - filemap_invalidate_lock_shared(r->file->f_mapping); - r->folio = read_cache_folio(r->file->f_mapping, file_off >> PAGE_SHIFT, - NULL, r->file); - filemap_invalidate_unlock_shared(r->file->f_mapping); - } - if (IS_ERR(r->folio) || !folio_test_uptodate(r->folio)) { if (!IS_ERR(r->folio)) folio_put(r->folio); @@ -97,6 +87,24 @@ const void *freader_fetch(struct freader *r, loff_t file_off, size_t sz) return r->data + file_off; } + /* reject secretmem folios created with memfd_secret() */ + if (secretmem_mapping(r->file->f_mapping)) { + r->err = -EFAULT; + return NULL; + } + + /* use __kernel_read() for sleepable context */ + if (r->may_fault) { + ssize_t ret; + + ret = __kernel_read(r->file, r->buf, sz, &file_off); + if (ret != sz) { + r->err = (ret < 0) ? ret : -EIO; + return NULL; + } + return r->buf; + } + /* fetch or reuse folio for given file offset */ r->err = freader_get_folio(r, file_off); if (r->err) From 50e9c8fb96b6e6ad1125344ef4447089898d820a Mon Sep 17 00:00:00 2001 From: Pasha Tatashin Date: Tue, 23 Dec 2025 09:01:40 -0500 Subject: [PATCH 1027/1321] kho: validate preserved memory map during population If the previous kernel enabled KHO but did not call kho_finalize() (e.g., CONFIG_LIVEUPDATE=n or userspace skipped the finalization step), the 'preserved-memory-map' property in the FDT remains empty/zero. Previously, kho_populate() would succeed regardless of the memory map's state, reserving the incoming scratch regions in memblock. However, kho_memory_init() would later fail to deserialize the empty map. By that time, the scratch regions were already registered, leading to partial initialization and subsequent list corruption (freeing scratch area twice) during kho_init(). Move the validation of the preserved memory map earlier into kho_populate(). If the memory map is empty/NULL: 1. Abort kho_populate() immediately with -ENOENT. 2. Do not register or reserve the incoming scratch memory, allowing the new kernel to reclaim those pages as standard free memory. 3. Leave the global 'kho_in' state uninitialized. Consequently, kho_memory_init() sees no active KHO context (kho_in.mem_chunks_phys is 0) and falls back to kho_reserve_scratch(), allocating fresh scratch memory as if it were a standard cold boot. Link: https://lkml.kernel.org/r/20251223140140.2090337-1-pasha.tatashin@soleen.com Fixes: de51999e687c ("kho: allow memory preservation state updates after finalization") Signed-off-by: Pasha Tatashin Reported-by: Ricardo Neri Closes: https://lore.kernel.org/all/20251218215613.GA17304@ranerica-svr.sc.intel.com Reviewed-by: Mike Rapoport (Microsoft) Tested-by: Ricardo Neri Reviewed-by: Pratyush Yadav Cc: Alexander Graf Signed-off-by: Andrew Morton --- kernel/liveupdate/kexec_handover.c | 37 +++++++++++++++--------------- 1 file changed, 19 insertions(+), 18 deletions(-) diff --git a/kernel/liveupdate/kexec_handover.c b/kernel/liveupdate/kexec_handover.c index 9dc51fab604f1..d4482b6e3cae3 100644 --- a/kernel/liveupdate/kexec_handover.c +++ b/kernel/liveupdate/kexec_handover.c @@ -460,27 +460,23 @@ static void __init deserialize_bitmap(unsigned int order, } } -/* Return true if memory was deserizlied */ -static bool __init kho_mem_deserialize(const void *fdt) +/* Returns physical address of the preserved memory map from FDT */ +static phys_addr_t __init kho_get_mem_map_phys(const void *fdt) { - struct khoser_mem_chunk *chunk; const void *mem_ptr; - u64 mem; int len; mem_ptr = fdt_getprop(fdt, 0, PROP_PRESERVED_MEMORY_MAP, &len); if (!mem_ptr || len != sizeof(u64)) { pr_err("failed to get preserved memory bitmaps\n"); - return false; + return 0; } - mem = get_unaligned((const u64 *)mem_ptr); - chunk = mem ? phys_to_virt(mem) : NULL; - - /* No preserved physical pages were passed, no deserialization */ - if (!chunk) - return false; + return get_unaligned((const u64 *)mem_ptr); +} +static void __init kho_mem_deserialize(struct khoser_mem_chunk *chunk) +{ while (chunk) { unsigned int i; @@ -489,8 +485,6 @@ static bool __init kho_mem_deserialize(const void *fdt) &chunk->bitmaps[i]); chunk = KHOSER_LOAD_PTR(chunk->hdr.next); } - - return true; } /* @@ -1253,6 +1247,7 @@ bool kho_finalized(void) struct kho_in { phys_addr_t fdt_phys; phys_addr_t scratch_phys; + phys_addr_t mem_map_phys; struct kho_debugfs dbg; }; @@ -1434,12 +1429,10 @@ static void __init kho_release_scratch(void) void __init kho_memory_init(void) { - if (kho_in.scratch_phys) { + if (kho_in.mem_map_phys) { kho_scratch = phys_to_virt(kho_in.scratch_phys); kho_release_scratch(); - - if (!kho_mem_deserialize(kho_get_fdt())) - kho_in.fdt_phys = 0; + kho_mem_deserialize(phys_to_virt(kho_in.mem_map_phys)); } else { kho_reserve_scratch(); } @@ -1448,8 +1441,9 @@ void __init kho_memory_init(void) void __init kho_populate(phys_addr_t fdt_phys, u64 fdt_len, phys_addr_t scratch_phys, u64 scratch_len) { - void *fdt = NULL; struct kho_scratch *scratch = NULL; + phys_addr_t mem_map_phys; + void *fdt = NULL; int err = 0; unsigned int scratch_cnt = scratch_len / sizeof(*kho_scratch); @@ -1475,6 +1469,12 @@ void __init kho_populate(phys_addr_t fdt_phys, u64 fdt_len, goto out; } + mem_map_phys = kho_get_mem_map_phys(fdt); + if (!mem_map_phys) { + err = -ENOENT; + goto out; + } + scratch = early_memremap(scratch_phys, scratch_len); if (!scratch) { pr_warn("setup: failed to memremap scratch (phys=0x%llx, len=%lld)\n", @@ -1515,6 +1515,7 @@ void __init kho_populate(phys_addr_t fdt_phys, u64 fdt_len, kho_in.fdt_phys = fdt_phys; kho_in.scratch_phys = scratch_phys; + kho_in.mem_map_phys = mem_map_phys; kho_scratch_cnt = scratch_cnt; pr_info("found kexec handover data.\n"); From 89ce9c5fcd797821a0787e18a7c416f15f159854 Mon Sep 17 00:00:00 2001 From: Shakeel Butt Date: Wed, 24 Dec 2025 16:29:04 -0800 Subject: [PATCH 1028/1321] mm/damon/core: get memcg reference before access The commit b74a120bcf507 ("mm/damon/core: implement DAMOS_QUOTA_NODE_MEMCG_USED_BP") added accesses to memcg structure without getting reference to it. This is unsafe. Let's get the reference before accessing the memcg. Link: https://lkml.kernel.org/r/20251225002904.139543-1-shakeel.butt@linux.dev Fixes: b74a120bcf507 ("mm/damon/core: implement DAMOS_QUOTA_NODE_MEMCG_USED_BP") Signed-off-by: Shakeel Butt Reviewed-by: SeongJae Park Signed-off-by: Andrew Morton --- mm/damon/core.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/mm/damon/core.c b/mm/damon/core.c index f9fc0375890a8..fd09bd106ad54 100644 --- a/mm/damon/core.c +++ b/mm/damon/core.c @@ -2051,13 +2051,15 @@ static unsigned long damos_get_node_memcg_used_bp( rcu_read_lock(); memcg = mem_cgroup_from_id(goal->memcg_id); - rcu_read_unlock(); - if (!memcg) { + if (!memcg || !mem_cgroup_tryget(memcg)) { + rcu_read_unlock(); if (goal->metric == DAMOS_QUOTA_NODE_MEMCG_USED_BP) return 0; else /* DAMOS_QUOTA_NODE_MEMCG_FREE_BP */ return 10000; } + rcu_read_unlock(); + mem_cgroup_flush_stats(memcg); lruvec = mem_cgroup_lruvec(memcg, NODE_DATA(goal->nid)); used_pages = lruvec_page_state(lruvec, NR_ACTIVE_ANON); @@ -2065,6 +2067,8 @@ static unsigned long damos_get_node_memcg_used_bp( used_pages += lruvec_page_state(lruvec, NR_ACTIVE_FILE); used_pages += lruvec_page_state(lruvec, NR_INACTIVE_FILE); + mem_cgroup_put(memcg); + si_meminfo_node(&i, goal->nid); if (goal->metric == DAMOS_QUOTA_NODE_MEMCG_USED_BP) numerator = used_pages; From b8637338e429f94f51a3a3d2a78f25b093fa595f Mon Sep 17 00:00:00 2001 From: Aboorva Devarajan Date: Mon, 1 Dec 2025 11:30:09 +0530 Subject: [PATCH 1029/1321] mm/page_alloc: make percpu_pagelist_high_fraction reads lock-free When page isolation loops indefinitely during memory offline, reading /proc/sys/vm/percpu_pagelist_high_fraction blocks on pcp_batch_high_lock, causing hung task warnings. Make procfs reads lock-free since percpu_pagelist_high_fraction is a simple integer with naturally atomic reads, writers still serialize via the mutex. This prevents hung task warnings when reading the procfs file during long-running memory offline operations. [akpm@linux-foundation.org: add comment, per Michal] Link: https://lkml.kernel.org/r/aS_y9AuJQFydLEXo@tiehlicka Link: https://lkml.kernel.org/r/20251201060009.1420792-1-aboorvad@linux.ibm.com Signed-off-by: Aboorva Devarajan Acked-by: Michal Hocko Cc: Brendan Jackman Cc: Johannes Weiner Cc: Suren Baghdasaryan Cc: Vlastimil Babka Cc: Zi Yan Cc: Signed-off-by: Andrew Morton --- mm/page_alloc.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index c380f063e8b7b..3dce96f3be494 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -6667,11 +6667,19 @@ static int percpu_pagelist_high_fraction_sysctl_handler(const struct ctl_table * int old_percpu_pagelist_high_fraction; int ret; + /* + * Avoid using pcp_batch_high_lock for reads as the value is read + * atomically and a race with offlining is harmless. + */ + + if (!write) + return proc_dointvec_minmax(table, write, buffer, length, ppos); + mutex_lock(&pcp_batch_high_lock); old_percpu_pagelist_high_fraction = percpu_pagelist_high_fraction; ret = proc_dointvec_minmax(table, write, buffer, length, ppos); - if (!write || ret < 0) + if (ret < 0) goto out; /* Sanity checking to avoid pcp imbalance */ From 1b5525644553e55139bbc6d532d5edcd7095bb3e Mon Sep 17 00:00:00 2001 From: Sourabh Jain Date: Wed, 24 Dec 2025 17:25:24 +0530 Subject: [PATCH 1030/1321] mm/hugetlb: ignore hugepage kernel args if hugepages are unsupported Skip processing hugepage kernel arguments (hugepagesz, hugepages, and default_hugepagesz) when hugepages are not supported by the architecture. Some architectures may need to disable hugepages based on conditions discovered during kernel boot. The hugepages_supported() helper allows architecture code to advertise whether hugepages are supported. Currently, normal hugepage allocation is guarded by hugepages_supported(), but gigantic hugepages are allocated regardless of this check. This causes problems on powerpc for fadump (firmware- assisted dump). In the fadump (firmware-assisted dump) scenario, a production kernel crash causes the system to boot into a special kernel whose sole purpose is to collect the memory dump and reboot. Features such as hugepages are not required in this environment and should be disabled. For example, when the fadump kernel boots with the following kernel arguments: default_hugepagesz=1GB hugepagesz=1GB hugepages=200 Before this patch, the kernel prints the following logs: HugeTLB: allocating 200 of page size 1.00 GiB failed. Only allocated 58 hugepages. HugeTLB support is disabled! HugeTLB: huge pages not supported, ignoring associated command-line parameters hugetlbfs: disabling because there are no supported hugepage sizes Even though the logs state that HugeTLB support is disabled, gigantic hugepages are still allocated. This causes the fadump kernel to run out of memory during boot. After this patch is applied, the kernel prints the following logs for the same command line: HugeTLB: hugepages unsupported, ignoring default_hugepagesz=1GB cmdline HugeTLB: hugepages unsupported, ignoring hugepagesz=1GB cmdline HugeTLB: hugepages unsupported, ignoring hugepages=200 cmdline HugeTLB support is disabled! hugetlbfs: disabling because there are no supported hugepage sizes To fix the issue, gigantic hugepage allocation should be guarded by hugepages_supported(). Previously, two approaches were proposed to bring gigantic hugepage allocation under hugepages_supported(): [1] Check hugepages_supported() in the generic code before allocating gigantic hugepages [2] Make arch_hugetlb_valid_size() return false for all hugetlb sizes Approach [2] has two minor issues: 1. It prints misleading logs about invalid hugepage sizes 2. The kernel still processes hugepage kernel arguments unnecessarily To control gigantic hugepage allocation, skip processing hugepage kernel arguments (default_hugepagesz, hugepagesz and hugepages) when hugepages_supported() returns false. Note for backporting: This fix is a partial reversion of the commit mentioned in the Fixes tag and is only valid once the change referenced by the Depends-on tag is present. When backporting this patch, the commit mentioned in the Depends-on tag must be included first. Link: https://lore.kernel.org/all/20250121150419.1342794-1-sourabhjain@linux.ibm.com/ [1] Link: https://lore.kernel.org/all/20250128043358.163372-1-sourabhjain@linux.ibm.com/ [2] Link: https://lkml.kernel.org/r/20251224115524.1272010-1-sourabhjain@linux.ibm.com Fixes: c2833a5bf75b ("hugetlbfs: fix changes to command line processing") Signed-off-by: Sourabh Jain Depends-on: 2354ad252b66 ("powerpc/mm: Update default hugetlb size early") Acked-by: David Hildenbrand (Red Hat) Reviewed-by: Ritesh Harjani (IBM) Cc: Borislav Petkov Cc: Christophe Leroy Cc: Heiko Carstens Cc: Ingo Molnar Cc: Madhavan Srinivasan Cc: Michael Ellerman Cc: Muchun Song Cc: Oscar Salvador Cc: Thomas Gleixner Cc: Vasily Gorbik Signed-off-by: Andrew Morton --- mm/hugetlb.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 51273baec9e5d..e0ab140205134 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -4286,6 +4286,11 @@ static int __init hugepages_setup(char *s) unsigned long tmp; char *p = s; + if (!hugepages_supported()) { + pr_warn("HugeTLB: hugepages unsupported, ignoring hugepages=%s cmdline\n", s); + return 0; + } + if (!parsed_valid_hugepagesz) { pr_warn("HugeTLB: hugepages=%s does not follow a valid hugepagesz, ignoring\n", s); parsed_valid_hugepagesz = true; @@ -4366,6 +4371,11 @@ static int __init hugepagesz_setup(char *s) unsigned long size; struct hstate *h; + if (!hugepages_supported()) { + pr_warn("HugeTLB: hugepages unsupported, ignoring hugepagesz=%s cmdline\n", s); + return 0; + } + parsed_valid_hugepagesz = false; size = (unsigned long)memparse(s, NULL); @@ -4414,6 +4424,12 @@ static int __init default_hugepagesz_setup(char *s) unsigned long size; int i; + if (!hugepages_supported()) { + pr_warn("HugeTLB: hugepages unsupported, ignoring default_hugepagesz=%s cmdline\n", + s); + return 0; + } + parsed_valid_hugepagesz = false; if (parsed_default_hugepagesz) { pr_err("HugeTLB: default_hugepagesz previously specified, ignoring %s\n", s); From 21ef51553d2eeecdb7129b7721780b6f4410109e Mon Sep 17 00:00:00 2001 From: "Mike Rapoport (Microsoft)" Date: Wed, 31 Dec 2025 12:57:01 +0200 Subject: [PATCH 1031/1321] mips: fix HIGHMEM initialization Commit 6faea3422e3b ("arch, mm: streamline HIGHMEM freeing") overzealously removed mem_init_free_highmem() function that beside freeing high memory pages checked for CPU support for high memory as a prerequisite. Partially restore mem_init_free_highmem() with a new highmem_init() name and make it discard high memory in case there is no CPU support for it. Link: https://lkml.kernel.org/r/20251231105701.519711-1-rppt@kernel.org Fixes: 6faea3422e3b ("arch, mm: streamline HIGHMEM freeing") Signed-off-by: Mike Rapoport (Microsoft) Reported-by: Markus Stockhausen Cc: Chris Packham Cc: Hauke Mehrtens Cc: Jonas Jelonek Cc: Thomas Bogendoerfer Cc: Thomas Gleinxer Signed-off-by: Andrew Morton --- arch/mips/mm/init.c | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/arch/mips/mm/init.c b/arch/mips/mm/init.c index a673d3d68254b..8986048f9b110 100644 --- a/arch/mips/mm/init.c +++ b/arch/mips/mm/init.c @@ -425,6 +425,28 @@ void __init paging_init(void) static struct kcore_list kcore_kseg0; #endif +static inline void __init highmem_init(void) +{ +#ifdef CONFIG_HIGHMEM + unsigned long tmp; + + /* + * If CPU cannot support HIGHMEM discard the memory above highstart_pfn + */ + if (cpu_has_dc_aliases) { + memblock_remove(PFN_PHYS(highstart_pfn), -1); + return; + } + + for (tmp = highstart_pfn; tmp < highend_pfn; tmp++) { + struct page *page = pfn_to_page(tmp); + + if (!memblock_is_memory(PFN_PHYS(tmp))) + SetPageReserved(page); + } +#endif +} + void __init arch_mm_preinit(void) { /* @@ -435,6 +457,7 @@ void __init arch_mm_preinit(void) maar_init(); setup_zero_pages(); /* Setup zeroed pages. */ + highmem_init(); #ifdef CONFIG_64BIT if ((unsigned long) &_text > (unsigned long) CKSEG0) From e119c400fb7b4c3a8371c43bb0f5765887866d50 Mon Sep 17 00:00:00 2001 From: Feng Tang Date: Wed, 31 Dec 2025 16:03:09 +0800 Subject: [PATCH 1032/1321] powerpc/watchdog: add support for hardlockup_sys_info sysctl Commit a9af76a78760 ("watchdog: add sys_info sysctls to dump sys info on system lockup") adds 'hardlock_sys_info' systcl knob for general kernel watchdog to control what kinds of system debug info to be dumped on hardlockup. Add similar support in powerpc watchdog code to make the sysctl knob more general, which also fixes a compiling warning in general watchdog code reported by 0day bot. Link: https://lkml.kernel.org/r/20251231080309.39642-1-feng.tang@linux.alibaba.com Fixes: a9af76a78760 ("watchdog: add sys_info sysctls to dump sys info on system lockup") Signed-off-by: Feng Tang Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202512030920.NFKtekA7-lkp@intel.com/ Suggested-by: Petr Mladek Reviewed-by: Petr Mladek Cc: Madhavan Srinivasan Cc: Michael Ellerman Cc: Nicholas Piggin Signed-off-by: Andrew Morton --- arch/powerpc/kernel/watchdog.c | 15 ++++++++++----- include/linux/nmi.h | 1 + kernel/watchdog.c | 2 +- 3 files changed, 12 insertions(+), 6 deletions(-) diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c index 2429cb1c7baa7..764001deb0605 100644 --- a/arch/powerpc/kernel/watchdog.c +++ b/arch/powerpc/kernel/watchdog.c @@ -26,6 +26,7 @@ #include #include #include +#include #include #include @@ -235,7 +236,11 @@ static void watchdog_smp_panic(int cpu) pr_emerg("CPU %d TB:%lld, last SMP heartbeat TB:%lld (%lldms ago)\n", cpu, tb, last_reset, tb_to_ns(tb - last_reset) / 1000000); - if (!sysctl_hardlockup_all_cpu_backtrace) { + if (sysctl_hardlockup_all_cpu_backtrace || + (hardlockup_si_mask & SYS_INFO_ALL_BT)) { + trigger_allbutcpu_cpu_backtrace(cpu); + cpumask_clear(&wd_smp_cpus_ipi); + } else { /* * Try to trigger the stuck CPUs, unless we are going to * get a backtrace on all of them anyway. @@ -244,11 +249,9 @@ static void watchdog_smp_panic(int cpu) smp_send_nmi_ipi(c, wd_lockup_ipi, 1000000); __cpumask_clear_cpu(c, &wd_smp_cpus_ipi); } - } else { - trigger_allbutcpu_cpu_backtrace(cpu); - cpumask_clear(&wd_smp_cpus_ipi); } + sys_info(hardlockup_si_mask & ~SYS_INFO_ALL_BT); if (hardlockup_panic) nmi_panic(NULL, "Hard LOCKUP"); @@ -415,9 +418,11 @@ DEFINE_INTERRUPT_HANDLER_NMI(soft_nmi_interrupt) xchg(&__wd_nmi_output, 1); // see wd_lockup_ipi - if (sysctl_hardlockup_all_cpu_backtrace) + if (sysctl_hardlockup_all_cpu_backtrace || + (hardlockup_si_mask & SYS_INFO_ALL_BT)) trigger_allbutcpu_cpu_backtrace(cpu); + sys_info(hardlockup_si_mask & ~SYS_INFO_ALL_BT); if (hardlockup_panic) nmi_panic(regs, "Hard LOCKUP"); diff --git a/include/linux/nmi.h b/include/linux/nmi.h index cf3c6ab408aac..207156f2143c5 100644 --- a/include/linux/nmi.h +++ b/include/linux/nmi.h @@ -83,6 +83,7 @@ static inline void reset_hung_task_detector(void) { } #if defined(CONFIG_HARDLOCKUP_DETECTOR) extern void hardlockup_detector_disable(void); extern unsigned int hardlockup_panic; +extern unsigned long hardlockup_si_mask; #else static inline void hardlockup_detector_disable(void) {} #endif diff --git a/kernel/watchdog.c b/kernel/watchdog.c index 0685e3a8aa0a8..366122f4a0f87 100644 --- a/kernel/watchdog.c +++ b/kernel/watchdog.c @@ -71,7 +71,7 @@ unsigned int __read_mostly hardlockup_panic = * hard lockup is detected, it could be task, memory, lock etc. * Refer include/linux/sys_info.h for detailed bit definition. */ -static unsigned long hardlockup_si_mask; +unsigned long hardlockup_si_mask; #ifdef CONFIG_SYSFS From 018531708ac831f0fd88853eac6ae311de543ab2 Mon Sep 17 00:00:00 2001 From: SeongJae Park Date: Tue, 30 Dec 2025 17:23:13 -0800 Subject: [PATCH 1033/1321] mm/damon/core: remove call_control in inactive contexts If damon_call() is executed against a DAMON context that is not running, the function returns error while keeping the damon_call_control object linked to the context's call_controls list. Let's suppose the object is deallocated after the damon_call(), and yet another damon_call() is executed against the same context. The function tries to add the new damon_call_control object to the call_controls list, which still has the pointer to the previous damon_call_control object, which is deallocated. As a result, use-after-free happens. This can actually be triggered using the DAMON sysfs interface. It is not easily exploitable since it requires the sysfs write permission and making a definitely weird file writes, though. Please refer to the report for more details about the issue reproduction steps. Fix the issue by making two changes. Firstly, move the final kdamond_call() for cancelling all existing damon_call() requests from terminating DAMON context to be done before the ctx->kdamond reset. This makes any code that sees NULL ctx->kdamond can safely assume the context may not access damon_call() requests anymore. Secondly, let damon_call() to cleanup the damon_call_control objects that were added to the already-terminated DAMON context, before returning the error. Link: https://lkml.kernel.org/r/20251231012315.75835-1-sj@kernel.org Fixes: 004ded6bee11 ("mm/damon: accept parallel damon_call() requests") Signed-off-by: SeongJae Park Reported-by: JaeJoon Jung Closes: https://lore.kernel.org/20251224094401.20384-1-rgbi3307@gmail.com Cc: # 6.17.x Signed-off-by: Andrew Morton --- mm/damon/core.c | 33 +++++++++++++++++++++++++++++++-- 1 file changed, 31 insertions(+), 2 deletions(-) diff --git a/mm/damon/core.c b/mm/damon/core.c index fd09bd106ad54..84f80a20f2336 100644 --- a/mm/damon/core.c +++ b/mm/damon/core.c @@ -1431,6 +1431,35 @@ bool damon_is_running(struct damon_ctx *ctx) return running; } +/* + * damon_call_handle_inactive_ctx() - handle DAMON call request that added to + * an inactive context. + * @ctx: The inactive DAMON context. + * @control: Control variable of the call request. + * + * This function is called in a case that @control is added to @ctx but @ctx is + * not running (inactive). See if @ctx handled @control or not, and cleanup + * @control if it was not handled. + * + * Returns 0 if @control was handled by @ctx, negative error code otherwise. + */ +static int damon_call_handle_inactive_ctx( + struct damon_ctx *ctx, struct damon_call_control *control) +{ + struct damon_call_control *c; + + mutex_lock(&ctx->call_controls_lock); + list_for_each_entry(c, &ctx->call_controls, list) { + if (c == control) { + list_del(&control->list); + mutex_unlock(&ctx->call_controls_lock); + return -EINVAL; + } + } + mutex_unlock(&ctx->call_controls_lock); + return 0; +} + /** * damon_call() - Invoke a given function on DAMON worker thread (kdamond). * @ctx: DAMON context to call the function for. @@ -1461,7 +1490,7 @@ int damon_call(struct damon_ctx *ctx, struct damon_call_control *control) list_add_tail(&control->list, &ctx->call_controls); mutex_unlock(&ctx->call_controls_lock); if (!damon_is_running(ctx)) - return -EINVAL; + return damon_call_handle_inactive_ctx(ctx, control); if (control->repeat) return 0; wait_for_completion(&control->completion); @@ -2755,13 +2784,13 @@ static int kdamond_fn(void *data) if (ctx->ops.cleanup) ctx->ops.cleanup(ctx); kfree(ctx->regions_score_histogram); + kdamond_call(ctx, true); pr_debug("kdamond (%d) finishes\n", current->pid); mutex_lock(&ctx->kdamond_lock); ctx->kdamond = NULL; mutex_unlock(&ctx->kdamond_lock); - kdamond_call(ctx, true); damos_walk_cancel(ctx); mutex_lock(&damon_lock); From c3a954c073a83d5344d298fd31b10d363aa00882 Mon Sep 17 00:00:00 2001 From: SeongJae Park Date: Wed, 24 Dec 2025 18:30:34 -0800 Subject: [PATCH 1034/1321] mm/damon/sysfs: cleanup intervals subdirs on attrs dir setup failure Patch series "mm/damon/sysfs: free setup failures generated zombie sub-sub dirs". Some DAMON sysfs directory setup functions generates its sub and sub-sub directories. For example, 'monitoring_attrs/' directory setup creates 'intervals/' and 'intervals/intervals_goal/' directories under 'monitoring_attrs/' directory. When such sub-sub directories are successfully made but followup setup is failed, the setup function should recursively clean up the subdirectories. However, such setup functions are only dereferencing sub directory reference counters. As a result, under certain setup failures, the sub-sub directories keep having non-zero reference counters. It means the directories cannot be removed like zombies, and the memory for the directories cannot be freed. The user impact of this issue is limited due to the following reasons. When the issue happens, the zombie directories are still taking the path. Hence attempts to generate the directories again will fail, without additional memory leak. This means the upper bound memory leak is limited. Nonetheless this also implies controlling DAMON with a feature that requires the setup-failed sysfs files will be impossible until the system reboots. Also, the setup operations are quite simple. The certain failures would hence only rarely happen, and are difficult to artificially trigger. This patch (of 4): When attrs/ DAMON sysfs directory setup is failed after setup of intervals/ directory, intervals/intervals_goal/ directory is not cleaned up. As a result, DAMON sysfs interface is nearly broken until the system reboots, and the memory for the unremoved directory is leaked. Cleanup the directory under such failures. Link: https://lkml.kernel.org/r/20251225023043.18579-1-sj@kernel.org Link: https://lkml.kernel.org/r/20251225023043.18579-2-sj@kernel.org Fixes: 8fbbcbeaafeb ("mm/damon/sysfs: implement intervals tuning goal directory") Signed-off-by: SeongJae Park Cc: chongjiapeng Cc: # 6.15.x Signed-off-by: Andrew Morton --- mm/damon/sysfs.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/mm/damon/sysfs.c b/mm/damon/sysfs.c index e2bd2d7becddf..a669de0687706 100644 --- a/mm/damon/sysfs.c +++ b/mm/damon/sysfs.c @@ -792,7 +792,7 @@ static int damon_sysfs_attrs_add_dirs(struct damon_sysfs_attrs *attrs) nr_regions_range = damon_sysfs_ul_range_alloc(10, 1000); if (!nr_regions_range) { err = -ENOMEM; - goto put_intervals_out; + goto rmdir_put_intervals_out; } err = kobject_init_and_add(&nr_regions_range->kobj, @@ -806,6 +806,8 @@ static int damon_sysfs_attrs_add_dirs(struct damon_sysfs_attrs *attrs) put_nr_regions_intervals_out: kobject_put(&nr_regions_range->kobj); attrs->nr_regions_range = NULL; +rmdir_put_intervals_out: + damon_sysfs_intervals_rm_dirs(intervals); put_intervals_out: kobject_put(&intervals->kobj); attrs->intervals = NULL; From fe109fb96d5f8b132d87fcdd95fac418b6fe6879 Mon Sep 17 00:00:00 2001 From: SeongJae Park Date: Wed, 24 Dec 2025 18:30:35 -0800 Subject: [PATCH 1035/1321] mm/damon/sysfs: cleanup attrs subdirs on context dir setup failure When a context DAMON sysfs directory setup is failed after setup of attrs/ directory, subdirectories of attrs/ directory are not cleaned up. As a result, DAMON sysfs interface is nearly broken until the system reboots, and the memory for the unremoved directory is leaked. Cleanup the directories under such failures. Link: https://lkml.kernel.org/r/20251225023043.18579-3-sj@kernel.org Fixes: c951cd3b8901 ("mm/damon: implement a minimal stub for sysfs-based DAMON interface") Signed-off-by: SeongJae Park Cc: chongjiapeng Cc: # 5.18.x Signed-off-by: Andrew Morton --- mm/damon/sysfs.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/mm/damon/sysfs.c b/mm/damon/sysfs.c index a669de0687706..95fd9375a7d84 100644 --- a/mm/damon/sysfs.c +++ b/mm/damon/sysfs.c @@ -950,7 +950,7 @@ static int damon_sysfs_context_add_dirs(struct damon_sysfs_context *context) err = damon_sysfs_context_set_targets(context); if (err) - goto put_attrs_out; + goto rmdir_put_attrs_out; err = damon_sysfs_context_set_schemes(context); if (err) @@ -960,7 +960,8 @@ static int damon_sysfs_context_add_dirs(struct damon_sysfs_context *context) put_targets_attrs_out: kobject_put(&context->targets->kobj); context->targets = NULL; -put_attrs_out: +rmdir_put_attrs_out: + damon_sysfs_attrs_rm_dirs(context->attrs); kobject_put(&context->attrs->kobj); context->attrs = NULL; return err; From 01635b9e08ed232417b0dfaab98e4ee79c055137 Mon Sep 17 00:00:00 2001 From: SeongJae Park Date: Wed, 24 Dec 2025 18:30:36 -0800 Subject: [PATCH 1036/1321] mm/damon/sysfs-scheme: cleanup quotas subdirs on scheme dir setup failure When a DAMOS-scheme DAMON sysfs directory setup fails after setup of quotas/ directory, subdirectories of quotas/ directory are not cleaned up. As a result, DAMON sysfs interface is nearly broken until the system reboots, and the memory for the unremoved directory is leaked. Cleanup the directories under such failures. Link: https://lkml.kernel.org/r/20251225023043.18579-4-sj@kernel.org Fixes: 1b32234ab087 ("mm/damon/sysfs: support DAMOS watermarks") Signed-off-by: SeongJae Park Cc: chongjiapeng Cc: # 5.18.x Signed-off-by: Andrew Morton --- mm/damon/sysfs-schemes.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/mm/damon/sysfs-schemes.c b/mm/damon/sysfs-schemes.c index 30d20f5b3192d..2c7a2b54be576 100644 --- a/mm/damon/sysfs-schemes.c +++ b/mm/damon/sysfs-schemes.c @@ -2158,7 +2158,7 @@ static int damon_sysfs_scheme_add_dirs(struct damon_sysfs_scheme *scheme) goto put_dests_out; err = damon_sysfs_scheme_set_watermarks(scheme); if (err) - goto put_quotas_access_pattern_out; + goto rmdir_put_quotas_access_pattern_out; err = damos_sysfs_set_filter_dirs(scheme); if (err) goto put_watermarks_quotas_access_pattern_out; @@ -2183,7 +2183,8 @@ static int damon_sysfs_scheme_add_dirs(struct damon_sysfs_scheme *scheme) put_watermarks_quotas_access_pattern_out: kobject_put(&scheme->watermarks->kobj); scheme->watermarks = NULL; -put_quotas_access_pattern_out: +rmdir_put_quotas_access_pattern_out: + damon_sysfs_quotas_rm_dirs(scheme->quotas); kobject_put(&scheme->quotas->kobj); scheme->quotas = NULL; put_dests_out: From 6a17aec98ea4a3d54a4234591aafa54841103516 Mon Sep 17 00:00:00 2001 From: SeongJae Park Date: Wed, 24 Dec 2025 18:30:37 -0800 Subject: [PATCH 1037/1321] mm/damon/sysfs-scheme: cleanup access_pattern subdirs on scheme dir setup failure When a DAMOS-scheme DAMON sysfs directory setup fails after setup of access_pattern/ directory, subdirectories of access_pattern/ directory are not cleaned up. As a result, DAMON sysfs interface is nearly broken until the system reboots, and the memory for the unremoved directory is leaked. Cleanup the directories under such failures. Link: https://lkml.kernel.org/r/20251225023043.18579-5-sj@kernel.org Fixes: 9bbb820a5bd5 ("mm/damon/sysfs: support DAMOS quotas") Signed-off-by: SeongJae Park Cc: chongjiapeng Cc: # 5.18.x Signed-off-by: Andrew Morton --- mm/damon/sysfs-schemes.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/mm/damon/sysfs-schemes.c b/mm/damon/sysfs-schemes.c index 2c7a2b54be576..3a699dcd5a7fe 100644 --- a/mm/damon/sysfs-schemes.c +++ b/mm/damon/sysfs-schemes.c @@ -2152,7 +2152,7 @@ static int damon_sysfs_scheme_add_dirs(struct damon_sysfs_scheme *scheme) return err; err = damos_sysfs_set_dests(scheme); if (err) - goto put_access_pattern_out; + goto rmdir_put_access_pattern_out; err = damon_sysfs_scheme_set_quotas(scheme); if (err) goto put_dests_out; @@ -2190,7 +2190,8 @@ static int damon_sysfs_scheme_add_dirs(struct damon_sysfs_scheme *scheme) put_dests_out: kobject_put(&scheme->dests->kobj); scheme->dests = NULL; -put_access_pattern_out: +rmdir_put_access_pattern_out: + damon_sysfs_access_pattern_rm_dirs(scheme->access_pattern); kobject_put(&scheme->access_pattern->kobj); scheme->access_pattern = NULL; return err; From b07be203c759ce3e64bc327702e7f1a5c52469e2 Mon Sep 17 00:00:00 2001 From: Pavel Butsykin Date: Wed, 31 Dec 2025 11:46:38 +0400 Subject: [PATCH 1038/1321] mm/zswap: fix error pointer free in zswap_cpu_comp_prepare() crypto_alloc_acomp_node() may return ERR_PTR(), but the fail path checks only for NULL and can pass an error pointer to crypto_free_acomp(). Use IS_ERR_OR_NULL() to only free valid acomp instances. Link: https://lkml.kernel.org/r/20251231074638.2564302-1-pbutsykin@cloudlinux.com Fixes: 779b9955f643 ("mm: zswap: move allocations during CPU init outside the lock") Signed-off-by: Pavel Butsykin Reviewed-by: SeongJae Park Acked-by: Yosry Ahmed Acked-by: Nhat Pham Cc: Johannes Weiner Cc: Chengming Zhou Cc: Signed-off-by: Andrew Morton --- mm/zswap.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/zswap.c b/mm/zswap.c index 5d0f8b13a958d..ac9b7a60736bc 100644 --- a/mm/zswap.c +++ b/mm/zswap.c @@ -787,7 +787,7 @@ static int zswap_cpu_comp_prepare(unsigned int cpu, struct hlist_node *node) return 0; fail: - if (acomp) + if (!IS_ERR_OR_NULL(acomp)) crypto_free_acomp(acomp); kfree(buffer); return ret; From e0ab591d854e1f391a4523e53ee7bf5aa3793396 Mon Sep 17 00:00:00 2001 From: Lorenzo Stoakes Date: Mon, 5 Jan 2026 20:11:47 +0000 Subject: [PATCH 1039/1321] mm/vma: fix anon_vma UAF on mremap() faulted, unfaulted merge Patch series "mm/vma: fix anon_vma UAF on mremap() faulted, unfaulted merge", v2. Commit 879bca0a2c4f ("mm/vma: fix incorrectly disallowed anonymous VMA merges") introduced the ability to merge previously unavailable VMA merge scenarios. However, it is handling merges incorrectly when it comes to mremap() of a faulted VMA adjacent to an unfaulted VMA. The issues arise in three cases: 1. Previous VMA unfaulted: copied -----| v |-----------|.............| | unfaulted |(faulted VMA)| |-----------|.............| prev 2. Next VMA unfaulted: copied -----| v |.............|-----------| |(faulted VMA)| unfaulted | |.............|-----------| next 3. Both adjacent VMAs unfaulted: copied -----| v |-----------|.............|-----------| | unfaulted |(faulted VMA)| unfaulted | |-----------|.............|-----------| prev next This series fixes each of these cases, and introduces self tests to assert that the issues are corrected. I also test a further case which was already handled, to assert that my changes continues to correctly handle it: 4. prev unfaulted, next faulted: copied -----| v |-----------|.............|-----------| | unfaulted |(faulted VMA)| faulted | |-----------|.............|-----------| prev next This bug was discovered via a syzbot report, linked to in the first patch in the series, I confirmed that this series fixes the bug. I also discovered that we are failing to check that the faulted VMA was not forked when merging a copied VMA in cases 1-3 above, an issue this series also addresses. I also added self tests to assert that this is resolved (and confirmed that the tests failed prior to this). I also cleaned up vma_expand() as part of this work, renamed vma_had_uncowed_parents() to vma_is_fork_child() as the previous name was unduly confusing, and simplified the comments around this function. This patch (of 4): Commit 879bca0a2c4f ("mm/vma: fix incorrectly disallowed anonymous VMA merges") introduced the ability to merge previously unavailable VMA merge scenarios. The key piece of logic introduced was the ability to merge a faulted VMA immediately next to an unfaulted VMA, which relies upon dup_anon_vma() to correctly handle anon_vma state. In the case of the merge of an existing VMA (that is changing properties of a VMA and then merging if those properties are shared by adjacent VMAs), dup_anon_vma() is invoked correctly. However in the case of the merge of a new VMA, a corner case peculiar to mremap() was missed. The issue is that vma_expand() only performs dup_anon_vma() if the target (the VMA that will ultimately become the merged VMA): is not the next VMA, i.e. the one that appears after the range in which the new VMA is to be established. A key insight here is that in all other cases other than mremap(), a new VMA merge either expands an existing VMA, meaning that the target VMA will be that VMA, or would have anon_vma be NULL. Specifically: * __mmap_region() - no anon_vma in place, initial mapping. * do_brk_flags() - expanding an existing VMA. * vma_merge_extend() - expanding an existing VMA. * relocate_vma_down() - no anon_vma in place, initial mapping. In addition, we are in the unique situation of needing to duplicate anon_vma state from a VMA that is neither the previous or next VMA being merged with. dup_anon_vma() deals exclusively with the target=unfaulted, src=faulted case. This leaves four possibilities, in each case where the copied VMA is faulted: 1. Previous VMA unfaulted: copied -----| v |-----------|.............| | unfaulted |(faulted VMA)| |-----------|.............| prev target = prev, expand prev to cover. 2. Next VMA unfaulted: copied -----| v |.............|-----------| |(faulted VMA)| unfaulted | |.............|-----------| next target = next, expand next to cover. 3. Both adjacent VMAs unfaulted: copied -----| v |-----------|.............|-----------| | unfaulted |(faulted VMA)| unfaulted | |-----------|.............|-----------| prev next target = prev, expand prev to cover. 4. prev unfaulted, next faulted: copied -----| v |-----------|.............|-----------| | unfaulted |(faulted VMA)| faulted | |-----------|.............|-----------| prev next target = prev, expand prev to cover. Essentially equivalent to 3, but with additional requirement that next's anon_vma is the same as the copied VMA's. This is covered by the existing logic. To account for this very explicitly, we introduce vma_merge_copied_range(), which sets a newly introduced vmg->copied_from field, then invokes vma_merge_new_range() which handles the rest of the logic. We then update the key vma_expand() function to clean up the logic and make what's going on clearer, making the 'remove next' case less special, before invoking dup_anon_vma() unconditionally should we be copying from a VMA. Note that in case 3, the if (remove_next) ... branch will be a no-op, as next=src in this instance and src is unfaulted. In case 4, it won't be, but since in this instance next=src and it is faulted, this will have required tgt=faulted, src=faulted to be compatible, meaning that next->anon_vma == vmg->copied_from->anon_vma, and thus a single dup_anon_vma() of next suffices to copy anon_vma state for the copied-from VMA also. If we are copying from a VMA in a successful merge we must _always_ propagate anon_vma state. This issue can be observed most directly by invoked mremap() to move around a VMA and cause this kind of merge with the MREMAP_DONTUNMAP flag specified. This will result in unlink_anon_vmas() being called after failing to duplicate anon_vma state to the target VMA, which results in the anon_vma itself being freed with folios still possessing dangling pointers to the anon_vma and thus a use-after-free bug. This bug was discovered via a syzbot report, which this patch resolves. We further make a change to update the mergeable anon_vma check to assert the copied-from anon_vma did not have CoW parents, as otherwise dup_anon_vma() might incorrectly propagate CoW ancestors from the next VMA in case 4 despite the anon_vma's being identical for both VMAs. Link: https://lkml.kernel.org/r/cover.1767638272.git.lorenzo.stoakes@oracle.com Link: https://lkml.kernel.org/r/b7930ad2b1503a657e29fe928eb33061d7eadf5b.1767638272.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes Fixes: 879bca0a2c4f ("mm/vma: fix incorrectly disallowed anonymous VMA merges") Reported-by: syzbot+b165fc2e11771c66d8ba@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/694a2745.050a0220.19928e.0017.GAE@google.com/ Reported-by: syzbot+5272541ccbbb14e2ec30@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/694e3dc6.050a0220.35954c.0066.GAE@google.com/ Reviewed-by: Harry Yoo Reviewed-by: Jeongjun Park Acked-by: Vlastimil Babka Cc: David Hildenbrand (Red Hat) Cc: Jann Horn Cc: Yeoreum Yun Cc: Liam Howlett Cc: "Liam R. Howlett" Cc: Pedro Falcato Cc: Rik van Riel Cc: Signed-off-by: Andrew Morton --- mm/vma.c | 84 +++++++++++++++++++++++++++++++++++++++----------------- mm/vma.h | 3 ++ 2 files changed, 62 insertions(+), 25 deletions(-) diff --git a/mm/vma.c b/mm/vma.c index fc90befd162f5..9df9e3b786046 100644 --- a/mm/vma.c +++ b/mm/vma.c @@ -829,6 +829,8 @@ static __must_check struct vm_area_struct *vma_merge_existing_range( VM_WARN_ON_VMG(middle && !(vma_iter_addr(vmg->vmi) >= middle->vm_start && vma_iter_addr(vmg->vmi) < middle->vm_end), vmg); + /* An existing merge can never be used by the mremap() logic. */ + VM_WARN_ON_VMG(vmg->copied_from, vmg); vmg->state = VMA_MERGE_NOMERGE; @@ -1098,6 +1100,33 @@ struct vm_area_struct *vma_merge_new_range(struct vma_merge_struct *vmg) return NULL; } +/* + * vma_merge_copied_range - Attempt to merge a VMA that is being copied by + * mremap() + * + * @vmg: Describes the VMA we are adding, in the copied-to range @vmg->start to + * @vmg->end (exclusive), which we try to merge with any adjacent VMAs if + * possible. + * + * vmg->prev, next, start, end, pgoff should all be relative to the COPIED TO + * range, i.e. the target range for the VMA. + * + * Returns: In instances where no merge was possible, NULL. Otherwise, a pointer + * to the VMA we expanded. + * + * ASSUMPTIONS: Same as vma_merge_new_range(), except vmg->middle must contain + * the copied-from VMA. + */ +static struct vm_area_struct *vma_merge_copied_range(struct vma_merge_struct *vmg) +{ + /* We must have a copied-from VMA. */ + VM_WARN_ON_VMG(!vmg->middle, vmg); + + vmg->copied_from = vmg->middle; + vmg->middle = NULL; + return vma_merge_new_range(vmg); +} + /* * vma_expand - Expand an existing VMA * @@ -1117,46 +1146,52 @@ struct vm_area_struct *vma_merge_new_range(struct vma_merge_struct *vmg) int vma_expand(struct vma_merge_struct *vmg) { struct vm_area_struct *anon_dup = NULL; - bool remove_next = false; struct vm_area_struct *target = vmg->target; struct vm_area_struct *next = vmg->next; + bool remove_next = false; vm_flags_t sticky_flags; - - sticky_flags = vmg->vm_flags & VM_STICKY; - sticky_flags |= target->vm_flags & VM_STICKY; - - VM_WARN_ON_VMG(!target, vmg); + int ret = 0; mmap_assert_write_locked(vmg->mm); - vma_start_write(target); - if (next && (target != next) && (vmg->end == next->vm_end)) { - int ret; - sticky_flags |= next->vm_flags & VM_STICKY; + if (next && target != next && vmg->end == next->vm_end) remove_next = true; - /* This should already have been checked by this point. */ - VM_WARN_ON_VMG(!can_merge_remove_vma(next), vmg); - vma_start_write(next); - /* - * In this case we don't report OOM, so vmg->give_up_on_mm is - * safe. - */ - ret = dup_anon_vma(target, next, &anon_dup); - if (ret) - return ret; - } + /* We must have a target. */ + VM_WARN_ON_VMG(!target, vmg); + /* This should have already been checked by this point. */ + VM_WARN_ON_VMG(remove_next && !can_merge_remove_vma(next), vmg); /* Not merging but overwriting any part of next is not handled. */ VM_WARN_ON_VMG(next && !remove_next && next != target && vmg->end > next->vm_start, vmg); - /* Only handles expanding */ + /* Only handles expanding. */ VM_WARN_ON_VMG(target->vm_start < vmg->start || target->vm_end > vmg->end, vmg); + sticky_flags = vmg->vm_flags & VM_STICKY; + sticky_flags |= target->vm_flags & VM_STICKY; if (remove_next) - vmg->__remove_next = true; + sticky_flags |= next->vm_flags & VM_STICKY; + /* + * If we are removing the next VMA or copying from a VMA + * (e.g. mremap()'ing), we must propagate anon_vma state. + * + * Note that, by convention, callers ignore OOM for this case, so + * we don't need to account for vmg->give_up_on_mm here. + */ + if (remove_next) + ret = dup_anon_vma(target, next, &anon_dup); + if (!ret && vmg->copied_from) + ret = dup_anon_vma(target, vmg->copied_from, &anon_dup); + if (ret) + return ret; + + if (remove_next) { + vma_start_write(next); + vmg->__remove_next = true; + } if (commit_merge(vmg)) goto nomem; @@ -1828,10 +1863,9 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap, if (new_vma && new_vma->vm_start < addr + len) return NULL; /* should never get here */ - vmg.middle = NULL; /* New VMA range. */ vmg.pgoff = pgoff; vmg.next = vma_iter_next_rewind(&vmi, NULL); - new_vma = vma_merge_new_range(&vmg); + new_vma = vma_merge_copied_range(&vmg); if (new_vma) { /* diff --git a/mm/vma.h b/mm/vma.h index abada6a64c4e6..9d5ee6ac913ad 100644 --- a/mm/vma.h +++ b/mm/vma.h @@ -106,6 +106,9 @@ struct vma_merge_struct { struct anon_vma_name *anon_name; enum vma_merge_state state; + /* If copied from (i.e. mremap()'d) the VMA from which we are copying. */ + struct vm_area_struct *copied_from; + /* Flags which callers can use to modify merge behaviour: */ /* From 2db3e88c47b4161b449e1ddf5ce6294bda70fe8e Mon Sep 17 00:00:00 2001 From: Lorenzo Stoakes Date: Mon, 5 Jan 2026 20:11:48 +0000 Subject: [PATCH 1040/1321] tools/testing/selftests: add tests for !tgt, src mremap() merges Test that mremap()'ing a VMA into a position such that the target VMA on merge is unfaulted and the source faulted is correctly performed. We cover 4 cases: 1. Previous VMA unfaulted: copied -----| v |-----------|.............| | unfaulted |(faulted VMA)| |-----------|.............| prev target = prev, expand prev to cover. 2. Next VMA unfaulted: copied -----| v |.............|-----------| |(faulted VMA)| unfaulted | |.............|-----------| next target = next, expand next to cover. 3. Both adjacent VMAs unfaulted: copied -----| v |-----------|.............|-----------| | unfaulted |(faulted VMA)| unfaulted | |-----------|.............|-----------| prev next target = prev, expand prev to cover. 4. prev unfaulted, next faulted: copied -----| v |-----------|.............|-----------| | unfaulted |(faulted VMA)| faulted | |-----------|.............|-----------| prev next target = prev, expand prev to cover. Essentially equivalent to 3, but with additional requirement that next's anon_vma is the same as the copied VMA's. Each of these are performed with MREMAP_DONTUNMAP set, which will cause a KASAN assert for UAF or an assert on zero refcount anon_vma if a bug exists with correctly propagating anon_vma state in each scenario. Link: https://lkml.kernel.org/r/f903af2930c7c2c6e0948c886b58d0f42d8e8ba3.1767638272.git.lorenzo.stoakes@oracle.com Fixes: 879bca0a2c4f ("mm/vma: fix incorrectly disallowed anonymous VMA merges") Signed-off-by: Lorenzo Stoakes Cc: David Hildenbrand (Red Hat) Cc: Jann Horn Cc: Jeongjun Park Cc: Liam Howlett Cc: Pedro Falcato Cc: Rik van Riel Cc: Vlastimil Babka Cc: Yeoreum Yun Cc: Harry Yoo Cc: Signed-off-by: Andrew Morton --- tools/testing/selftests/mm/merge.c | 232 +++++++++++++++++++++++++++++ 1 file changed, 232 insertions(+) diff --git a/tools/testing/selftests/mm/merge.c b/tools/testing/selftests/mm/merge.c index 363c1033cc7dd..22be149f71093 100644 --- a/tools/testing/selftests/mm/merge.c +++ b/tools/testing/selftests/mm/merge.c @@ -1171,4 +1171,236 @@ TEST_F(merge, mremap_correct_placed_faulted) ASSERT_EQ(procmap->query.vma_end, (unsigned long)ptr + 15 * page_size); } +TEST_F(merge, mremap_faulted_to_unfaulted_prev) +{ + struct procmap_fd *procmap = &self->procmap; + unsigned int page_size = self->page_size; + char *ptr_a, *ptr_b; + + /* + * mremap() such that A and B merge: + * + * |------------| + * | \ | + * |-----------| | / |---------| + * | unfaulted | v \ | faulted | + * |-----------| / |---------| + * B \ A + */ + + /* Map VMA A into place. */ + ptr_a = mmap(&self->carveout[page_size + 3 * page_size], + 3 * page_size, + PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANON | MAP_FIXED, -1, 0); + ASSERT_NE(ptr_a, MAP_FAILED); + /* Fault it in. */ + ptr_a[0] = 'x'; + + /* + * Now move it out of the way so we can place VMA B in position, + * unfaulted. + */ + ptr_a = mremap(ptr_a, 3 * page_size, 3 * page_size, + MREMAP_FIXED | MREMAP_MAYMOVE, &self->carveout[20 * page_size]); + ASSERT_NE(ptr_a, MAP_FAILED); + + /* Map VMA B into place. */ + ptr_b = mmap(&self->carveout[page_size], 3 * page_size, + PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANON | MAP_FIXED, -1, 0); + ASSERT_NE(ptr_b, MAP_FAILED); + + /* + * Now move VMA A into position with MREMAP_DONTUNMAP to catch incorrect + * anon_vma propagation. + */ + ptr_a = mremap(ptr_a, 3 * page_size, 3 * page_size, + MREMAP_FIXED | MREMAP_MAYMOVE | MREMAP_DONTUNMAP, + &self->carveout[page_size + 3 * page_size]); + ASSERT_NE(ptr_a, MAP_FAILED); + + /* The VMAs should have merged. */ + ASSERT_TRUE(find_vma_procmap(procmap, ptr_b)); + ASSERT_EQ(procmap->query.vma_start, (unsigned long)ptr_b); + ASSERT_EQ(procmap->query.vma_end, (unsigned long)ptr_b + 6 * page_size); +} + +TEST_F(merge, mremap_faulted_to_unfaulted_next) +{ + struct procmap_fd *procmap = &self->procmap; + unsigned int page_size = self->page_size; + char *ptr_a, *ptr_b; + + /* + * mremap() such that A and B merge: + * + * |---------------------------| + * | \ | + * | |-----------| / |---------| + * v | unfaulted | \ | faulted | + * |-----------| / |---------| + * B \ A + * + * Then unmap VMA A to trigger the bug. + */ + + /* Map VMA A into place. */ + ptr_a = mmap(&self->carveout[page_size], 3 * page_size, + PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANON | MAP_FIXED, -1, 0); + ASSERT_NE(ptr_a, MAP_FAILED); + /* Fault it in. */ + ptr_a[0] = 'x'; + + /* + * Now move it out of the way so we can place VMA B in position, + * unfaulted. + */ + ptr_a = mremap(ptr_a, 3 * page_size, 3 * page_size, + MREMAP_FIXED | MREMAP_MAYMOVE, &self->carveout[20 * page_size]); + ASSERT_NE(ptr_a, MAP_FAILED); + + /* Map VMA B into place. */ + ptr_b = mmap(&self->carveout[page_size + 3 * page_size], 3 * page_size, + PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANON | MAP_FIXED, -1, 0); + ASSERT_NE(ptr_b, MAP_FAILED); + + /* + * Now move VMA A into position with MREMAP_DONTUNMAP to catch incorrect + * anon_vma propagation. + */ + ptr_a = mremap(ptr_a, 3 * page_size, 3 * page_size, + MREMAP_FIXED | MREMAP_MAYMOVE | MREMAP_DONTUNMAP, + &self->carveout[page_size]); + ASSERT_NE(ptr_a, MAP_FAILED); + + /* The VMAs should have merged. */ + ASSERT_TRUE(find_vma_procmap(procmap, ptr_a)); + ASSERT_EQ(procmap->query.vma_start, (unsigned long)ptr_a); + ASSERT_EQ(procmap->query.vma_end, (unsigned long)ptr_a + 6 * page_size); +} + +TEST_F(merge, mremap_faulted_to_unfaulted_prev_unfaulted_next) +{ + struct procmap_fd *procmap = &self->procmap; + unsigned int page_size = self->page_size; + char *ptr_a, *ptr_b, *ptr_c; + + /* + * mremap() with MREMAP_DONTUNMAP such that A, B and C merge: + * + * |---------------------------| + * | \ | + * |-----------| | |-----------| / |---------| + * | unfaulted | v | unfaulted | \ | faulted | + * |-----------| |-----------| / |---------| + * A C \ B + */ + + /* Map VMA B into place. */ + ptr_b = mmap(&self->carveout[page_size + 3 * page_size], 3 * page_size, + PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANON | MAP_FIXED, -1, 0); + ASSERT_NE(ptr_b, MAP_FAILED); + /* Fault it in. */ + ptr_b[0] = 'x'; + + /* + * Now move it out of the way so we can place VMAs A, C in position, + * unfaulted. + */ + ptr_b = mremap(ptr_b, 3 * page_size, 3 * page_size, + MREMAP_FIXED | MREMAP_MAYMOVE, &self->carveout[20 * page_size]); + ASSERT_NE(ptr_b, MAP_FAILED); + + /* Map VMA A into place. */ + + ptr_a = mmap(&self->carveout[page_size], 3 * page_size, + PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANON | MAP_FIXED, -1, 0); + ASSERT_NE(ptr_a, MAP_FAILED); + + /* Map VMA C into place. */ + ptr_c = mmap(&self->carveout[page_size + 3 * page_size + 3 * page_size], + 3 * page_size, PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANON | MAP_FIXED, -1, 0); + ASSERT_NE(ptr_c, MAP_FAILED); + + /* + * Now move VMA B into position with MREMAP_DONTUNMAP to catch incorrect + * anon_vma propagation. + */ + ptr_b = mremap(ptr_b, 3 * page_size, 3 * page_size, + MREMAP_FIXED | MREMAP_MAYMOVE | MREMAP_DONTUNMAP, + &self->carveout[page_size + 3 * page_size]); + ASSERT_NE(ptr_b, MAP_FAILED); + + /* The VMAs should have merged. */ + ASSERT_TRUE(find_vma_procmap(procmap, ptr_a)); + ASSERT_EQ(procmap->query.vma_start, (unsigned long)ptr_a); + ASSERT_EQ(procmap->query.vma_end, (unsigned long)ptr_a + 9 * page_size); +} + +TEST_F(merge, mremap_faulted_to_unfaulted_prev_faulted_next) +{ + struct procmap_fd *procmap = &self->procmap; + unsigned int page_size = self->page_size; + char *ptr_a, *ptr_b, *ptr_bc; + + /* + * mremap() with MREMAP_DONTUNMAP such that A, B and C merge: + * + * |---------------------------| + * | \ | + * |-----------| | |-----------| / |---------| + * | unfaulted | v | faulted | \ | faulted | + * |-----------| |-----------| / |---------| + * A C \ B + */ + + /* + * Map VMA B and C into place. We have to map them together so their + * anon_vma is the same and the vma->vm_pgoff's are correctly aligned. + */ + ptr_bc = mmap(&self->carveout[page_size + 3 * page_size], + 3 * page_size + 3 * page_size, + PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANON | MAP_FIXED, -1, 0); + ASSERT_NE(ptr_bc, MAP_FAILED); + + /* Fault it in. */ + ptr_bc[0] = 'x'; + + /* + * Now move VMA B out the way (splitting VMA BC) so we can place VMA A + * in position, unfaulted, and leave the remainder of the VMA we just + * moved in place, faulted, as VMA C. + */ + ptr_b = mremap(ptr_bc, 3 * page_size, 3 * page_size, + MREMAP_FIXED | MREMAP_MAYMOVE, &self->carveout[20 * page_size]); + ASSERT_NE(ptr_b, MAP_FAILED); + + /* Map VMA A into place. */ + ptr_a = mmap(&self->carveout[page_size], 3 * page_size, + PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANON | MAP_FIXED, -1, 0); + ASSERT_NE(ptr_a, MAP_FAILED); + + /* + * Now move VMA B into position with MREMAP_DONTUNMAP to catch incorrect + * anon_vma propagation. + */ + ptr_b = mremap(ptr_b, 3 * page_size, 3 * page_size, + MREMAP_FIXED | MREMAP_MAYMOVE | MREMAP_DONTUNMAP, + &self->carveout[page_size + 3 * page_size]); + ASSERT_NE(ptr_b, MAP_FAILED); + + /* The VMAs should have merged. */ + ASSERT_TRUE(find_vma_procmap(procmap, ptr_a)); + ASSERT_EQ(procmap->query.vma_start, (unsigned long)ptr_a); + ASSERT_EQ(procmap->query.vma_end, (unsigned long)ptr_a + 9 * page_size); +} + TEST_HARNESS_MAIN From 293c173077b2a183eb5f68e431099f5abd2582a7 Mon Sep 17 00:00:00 2001 From: Lorenzo Stoakes Date: Mon, 5 Jan 2026 20:11:49 +0000 Subject: [PATCH 1041/1321] mm/vma: enforce VMA fork limit on unfaulted,faulted mremap merge too The is_mergeable_anon_vma() function uses vmg->middle as the source VMA. However when merging a new VMA, this field is NULL. In all cases except mremap(), the new VMA will either be newly established and thus lack an anon_vma, or will be an expansion of an existing VMA thus we do not care about whether VMA is CoW'd or not. In the case of an mremap(), we can end up in a situation where we can accidentally allow an unfaulted/faulted merge with a VMA that has been forked, violating the general rule that we do not permit this for reasons of anon_vma lock scalability. Now we have the ability to be aware of the fact we are copying a VMA and also know which VMA that is, we can explicitly check for this, so do so. This is pertinent since commit 879bca0a2c4f ("mm/vma: fix incorrectly disallowed anonymous VMA merges"), as this patch permits unfaulted/faulted merges that were previously disallowed running afoul of this issue. While we are here, vma_had_uncowed_parents() is a confusing name, so make it simple and rename it to vma_is_fork_child(). Link: https://lkml.kernel.org/r/6e2b9b3024ae1220961c8b81d74296d4720eaf2b.1767638272.git.lorenzo.stoakes@oracle.com Fixes: 879bca0a2c4f ("mm/vma: fix incorrectly disallowed anonymous VMA merges") Signed-off-by: Lorenzo Stoakes Reviewed-by: Harry Yoo Reviewed-by: Jeongjun Park Acked-by: Vlastimil Babka Cc: David Hildenbrand (Red Hat) Cc: Jann Horn Cc: Liam Howlett Cc: Pedro Falcato Cc: Rik van Riel Cc: Yeoreum Yun Cc: Signed-off-by: Andrew Morton --- mm/vma.c | 27 +++++++++++++++------------ 1 file changed, 15 insertions(+), 12 deletions(-) diff --git a/mm/vma.c b/mm/vma.c index 9df9e3b786046..dc92f3dd8514b 100644 --- a/mm/vma.c +++ b/mm/vma.c @@ -67,18 +67,13 @@ struct mmap_state { .state = VMA_MERGE_START, \ } -/* - * If, at any point, the VMA had unCoW'd mappings from parents, it will maintain - * more than one anon_vma_chain connecting it to more than one anon_vma. A merge - * would mean a wider range of folios sharing the root anon_vma lock, and thus - * potential lock contention, we do not wish to encourage merging such that this - * scales to a problem. - */ -static bool vma_had_uncowed_parents(struct vm_area_struct *vma) +/* Was this VMA ever forked from a parent, i.e. maybe contains CoW mappings? */ +static bool vma_is_fork_child(struct vm_area_struct *vma) { /* * The list_is_singular() test is to avoid merging VMA cloned from - * parents. This can improve scalability caused by anon_vma lock. + * parents. This can improve scalability caused by the anon_vma root + * lock. */ return vma && vma->anon_vma && !list_is_singular(&vma->anon_vma_chain); } @@ -115,11 +110,19 @@ static bool is_mergeable_anon_vma(struct vma_merge_struct *vmg, bool merge_next) VM_WARN_ON(src && src_anon != src->anon_vma); /* Case 1 - we will dup_anon_vma() from src into tgt. */ - if (!tgt_anon && src_anon) - return !vma_had_uncowed_parents(src); + if (!tgt_anon && src_anon) { + struct vm_area_struct *copied_from = vmg->copied_from; + + if (vma_is_fork_child(src)) + return false; + if (vma_is_fork_child(copied_from)) + return false; + + return true; + } /* Case 2 - we will simply use tgt's anon_vma. */ if (tgt_anon && !src_anon) - return !vma_had_uncowed_parents(tgt); + return !vma_is_fork_child(tgt); /* Case 3 - the anon_vma's are already shared. */ return src_anon == tgt_anon; } From aaf80036a07e1d4ca983b9bfebd82eef7adc1c9b Mon Sep 17 00:00:00 2001 From: Lorenzo Stoakes Date: Mon, 5 Jan 2026 20:11:50 +0000 Subject: [PATCH 1042/1321] tools/testing/selftests: add forked (un)/faulted VMA merge tests Now we correctly handle forked faulted/unfaulted merge on mremap(), exhaustively assert that we handle this correctly. Do this in the less duplicative way by adding a new merge_with_fork fixture and forked/unforked variants, and abstract the forking logic as necessary to avoid code duplication with this also. Link: https://lkml.kernel.org/r/1daf76d89fdb9d96f38a6a0152d8f3c2e9e30ac7.1767638272.git.lorenzo.stoakes@oracle.com Fixes: 879bca0a2c4f ("mm/vma: fix incorrectly disallowed anonymous VMA merges") Signed-off-by: Lorenzo Stoakes Cc: David Hildenbrand (Red Hat) Cc: Jann Horn Cc: Jeongjun Park Cc: Liam Howlett Cc: Pedro Falcato Cc: Rik van Riel Cc: Vlastimil Babka Cc: Yeoreum Yun Cc: Harry Yoo Cc: Signed-off-by: Andrew Morton --- tools/testing/selftests/mm/merge.c | 180 ++++++++++++++++++++++------- 1 file changed, 139 insertions(+), 41 deletions(-) diff --git a/tools/testing/selftests/mm/merge.c b/tools/testing/selftests/mm/merge.c index 22be149f71093..10b686102b79d 100644 --- a/tools/testing/selftests/mm/merge.c +++ b/tools/testing/selftests/mm/merge.c @@ -22,12 +22,37 @@ FIXTURE(merge) struct procmap_fd procmap; }; +static char *map_carveout(unsigned int page_size) +{ + return mmap(NULL, 30 * page_size, PROT_NONE, + MAP_ANON | MAP_PRIVATE, -1, 0); +} + +static pid_t do_fork(struct procmap_fd *procmap) +{ + pid_t pid = fork(); + + if (pid == -1) + return -1; + if (pid != 0) { + wait(NULL); + return pid; + } + + /* Reopen for child. */ + if (close_procmap(procmap)) + return -1; + if (open_self_procmap(procmap)) + return -1; + + return 0; +} + FIXTURE_SETUP(merge) { self->page_size = psize(); /* Carve out PROT_NONE region to map over. */ - self->carveout = mmap(NULL, 30 * self->page_size, PROT_NONE, - MAP_ANON | MAP_PRIVATE, -1, 0); + self->carveout = map_carveout(self->page_size); ASSERT_NE(self->carveout, MAP_FAILED); /* Setup PROCMAP_QUERY interface. */ ASSERT_EQ(open_self_procmap(&self->procmap), 0); @@ -36,7 +61,8 @@ FIXTURE_SETUP(merge) FIXTURE_TEARDOWN(merge) { ASSERT_EQ(munmap(self->carveout, 30 * self->page_size), 0); - ASSERT_EQ(close_procmap(&self->procmap), 0); + /* May fail for parent of forked process. */ + close_procmap(&self->procmap); /* * Clear unconditionally, as some tests set this. It is no issue if this * fails (KSM may be disabled for instance). @@ -44,6 +70,44 @@ FIXTURE_TEARDOWN(merge) prctl(PR_SET_MEMORY_MERGE, 0, 0, 0, 0); } +FIXTURE(merge_with_fork) +{ + unsigned int page_size; + char *carveout; + struct procmap_fd procmap; +}; + +FIXTURE_VARIANT(merge_with_fork) +{ + bool forked; +}; + +FIXTURE_VARIANT_ADD(merge_with_fork, forked) +{ + .forked = true, +}; + +FIXTURE_VARIANT_ADD(merge_with_fork, unforked) +{ + .forked = false, +}; + +FIXTURE_SETUP(merge_with_fork) +{ + self->page_size = psize(); + self->carveout = map_carveout(self->page_size); + ASSERT_NE(self->carveout, MAP_FAILED); + ASSERT_EQ(open_self_procmap(&self->procmap), 0); +} + +FIXTURE_TEARDOWN(merge_with_fork) +{ + ASSERT_EQ(munmap(self->carveout, 30 * self->page_size), 0); + ASSERT_EQ(close_procmap(&self->procmap), 0); + /* See above. */ + prctl(PR_SET_MEMORY_MERGE, 0, 0, 0, 0); +} + TEST_F(merge, mprotect_unfaulted_left) { unsigned int page_size = self->page_size; @@ -322,8 +386,8 @@ TEST_F(merge, forked_target_vma) unsigned int page_size = self->page_size; char *carveout = self->carveout; struct procmap_fd *procmap = &self->procmap; - pid_t pid; char *ptr, *ptr2; + pid_t pid; int i; /* @@ -344,19 +408,10 @@ TEST_F(merge, forked_target_vma) */ ptr[0] = 'x'; - pid = fork(); + pid = do_fork(&self->procmap); ASSERT_NE(pid, -1); - - if (pid != 0) { - wait(NULL); + if (pid != 0) return; - } - - /* Child process below: */ - - /* Reopen for child. */ - ASSERT_EQ(close_procmap(&self->procmap), 0); - ASSERT_EQ(open_self_procmap(&self->procmap), 0); /* unCOWing everything does not cause the AVC to go away. */ for (i = 0; i < 5 * page_size; i += page_size) @@ -386,8 +441,8 @@ TEST_F(merge, forked_source_vma) unsigned int page_size = self->page_size; char *carveout = self->carveout; struct procmap_fd *procmap = &self->procmap; - pid_t pid; char *ptr, *ptr2; + pid_t pid; int i; /* @@ -408,19 +463,10 @@ TEST_F(merge, forked_source_vma) */ ptr[0] = 'x'; - pid = fork(); + pid = do_fork(&self->procmap); ASSERT_NE(pid, -1); - - if (pid != 0) { - wait(NULL); + if (pid != 0) return; - } - - /* Child process below: */ - - /* Reopen for child. */ - ASSERT_EQ(close_procmap(&self->procmap), 0); - ASSERT_EQ(open_self_procmap(&self->procmap), 0); /* unCOWing everything does not cause the AVC to go away. */ for (i = 0; i < 5 * page_size; i += page_size) @@ -1171,10 +1217,11 @@ TEST_F(merge, mremap_correct_placed_faulted) ASSERT_EQ(procmap->query.vma_end, (unsigned long)ptr + 15 * page_size); } -TEST_F(merge, mremap_faulted_to_unfaulted_prev) +TEST_F(merge_with_fork, mremap_faulted_to_unfaulted_prev) { struct procmap_fd *procmap = &self->procmap; unsigned int page_size = self->page_size; + unsigned long offset; char *ptr_a, *ptr_b; /* @@ -1197,6 +1244,14 @@ TEST_F(merge, mremap_faulted_to_unfaulted_prev) /* Fault it in. */ ptr_a[0] = 'x'; + if (variant->forked) { + pid_t pid = do_fork(&self->procmap); + + ASSERT_NE(pid, -1); + if (pid != 0) + return; + } + /* * Now move it out of the way so we can place VMA B in position, * unfaulted. @@ -1220,16 +1275,19 @@ TEST_F(merge, mremap_faulted_to_unfaulted_prev) &self->carveout[page_size + 3 * page_size]); ASSERT_NE(ptr_a, MAP_FAILED); - /* The VMAs should have merged. */ + /* The VMAs should have merged, if not forked. */ ASSERT_TRUE(find_vma_procmap(procmap, ptr_b)); ASSERT_EQ(procmap->query.vma_start, (unsigned long)ptr_b); - ASSERT_EQ(procmap->query.vma_end, (unsigned long)ptr_b + 6 * page_size); + + offset = variant->forked ? 3 * page_size : 6 * page_size; + ASSERT_EQ(procmap->query.vma_end, (unsigned long)ptr_b + offset); } -TEST_F(merge, mremap_faulted_to_unfaulted_next) +TEST_F(merge_with_fork, mremap_faulted_to_unfaulted_next) { struct procmap_fd *procmap = &self->procmap; unsigned int page_size = self->page_size; + unsigned long offset; char *ptr_a, *ptr_b; /* @@ -1253,6 +1311,14 @@ TEST_F(merge, mremap_faulted_to_unfaulted_next) /* Fault it in. */ ptr_a[0] = 'x'; + if (variant->forked) { + pid_t pid = do_fork(&self->procmap); + + ASSERT_NE(pid, -1); + if (pid != 0) + return; + } + /* * Now move it out of the way so we can place VMA B in position, * unfaulted. @@ -1276,16 +1342,18 @@ TEST_F(merge, mremap_faulted_to_unfaulted_next) &self->carveout[page_size]); ASSERT_NE(ptr_a, MAP_FAILED); - /* The VMAs should have merged. */ + /* The VMAs should have merged, if not forked. */ ASSERT_TRUE(find_vma_procmap(procmap, ptr_a)); ASSERT_EQ(procmap->query.vma_start, (unsigned long)ptr_a); - ASSERT_EQ(procmap->query.vma_end, (unsigned long)ptr_a + 6 * page_size); + offset = variant->forked ? 3 * page_size : 6 * page_size; + ASSERT_EQ(procmap->query.vma_end, (unsigned long)ptr_a + offset); } -TEST_F(merge, mremap_faulted_to_unfaulted_prev_unfaulted_next) +TEST_F(merge_with_fork, mremap_faulted_to_unfaulted_prev_unfaulted_next) { struct procmap_fd *procmap = &self->procmap; unsigned int page_size = self->page_size; + unsigned long offset; char *ptr_a, *ptr_b, *ptr_c; /* @@ -1307,6 +1375,14 @@ TEST_F(merge, mremap_faulted_to_unfaulted_prev_unfaulted_next) /* Fault it in. */ ptr_b[0] = 'x'; + if (variant->forked) { + pid_t pid = do_fork(&self->procmap); + + ASSERT_NE(pid, -1); + if (pid != 0) + return; + } + /* * Now move it out of the way so we can place VMAs A, C in position, * unfaulted. @@ -1337,13 +1413,21 @@ TEST_F(merge, mremap_faulted_to_unfaulted_prev_unfaulted_next) &self->carveout[page_size + 3 * page_size]); ASSERT_NE(ptr_b, MAP_FAILED); - /* The VMAs should have merged. */ + /* The VMAs should have merged, if not forked. */ ASSERT_TRUE(find_vma_procmap(procmap, ptr_a)); ASSERT_EQ(procmap->query.vma_start, (unsigned long)ptr_a); - ASSERT_EQ(procmap->query.vma_end, (unsigned long)ptr_a + 9 * page_size); + offset = variant->forked ? 3 * page_size : 9 * page_size; + ASSERT_EQ(procmap->query.vma_end, (unsigned long)ptr_a + offset); + + /* If forked, B and C should also not have merged. */ + if (variant->forked) { + ASSERT_TRUE(find_vma_procmap(procmap, ptr_b)); + ASSERT_EQ(procmap->query.vma_start, (unsigned long)ptr_b); + ASSERT_EQ(procmap->query.vma_end, (unsigned long)ptr_b + 3 * page_size); + } } -TEST_F(merge, mremap_faulted_to_unfaulted_prev_faulted_next) +TEST_F(merge_with_fork, mremap_faulted_to_unfaulted_prev_faulted_next) { struct procmap_fd *procmap = &self->procmap; unsigned int page_size = self->page_size; @@ -1373,6 +1457,14 @@ TEST_F(merge, mremap_faulted_to_unfaulted_prev_faulted_next) /* Fault it in. */ ptr_bc[0] = 'x'; + if (variant->forked) { + pid_t pid = do_fork(&self->procmap); + + ASSERT_NE(pid, -1); + if (pid != 0) + return; + } + /* * Now move VMA B out the way (splitting VMA BC) so we can place VMA A * in position, unfaulted, and leave the remainder of the VMA we just @@ -1397,10 +1489,16 @@ TEST_F(merge, mremap_faulted_to_unfaulted_prev_faulted_next) &self->carveout[page_size + 3 * page_size]); ASSERT_NE(ptr_b, MAP_FAILED); - /* The VMAs should have merged. */ - ASSERT_TRUE(find_vma_procmap(procmap, ptr_a)); - ASSERT_EQ(procmap->query.vma_start, (unsigned long)ptr_a); - ASSERT_EQ(procmap->query.vma_end, (unsigned long)ptr_a + 9 * page_size); + /* The VMAs should have merged. A,B,C if unforked, B, C if forked. */ + if (variant->forked) { + ASSERT_TRUE(find_vma_procmap(procmap, ptr_b)); + ASSERT_EQ(procmap->query.vma_start, (unsigned long)ptr_b); + ASSERT_EQ(procmap->query.vma_end, (unsigned long)ptr_b + 6 * page_size); + } else { + ASSERT_TRUE(find_vma_procmap(procmap, ptr_a)); + ASSERT_EQ(procmap->query.vma_start, (unsigned long)ptr_a); + ASSERT_EQ(procmap->query.vma_end, (unsigned long)ptr_a + 9 * page_size); + } } TEST_HARNESS_MAIN From 2d4a9a17ef5b5ef44210f5a3a0dee8d694ea516e Mon Sep 17 00:00:00 2001 From: Ryan Roberts Date: Sun, 4 Jan 2026 13:43:47 +0000 Subject: [PATCH 1043/1321] mm: kmsan: fix poisoning of high-order non-compound pages kmsan_free_page() is called by the page allocator's free_pages_prepare() during page freeing. Its job is to poison all the memory covered by the page. It can be called with an order-0 page, a compound high-order page or a non-compound high-order page. But page_size() only works for order-0 and compound pages. For a non-compound high-order page it will incorrectly return PAGE_SIZE. The implication is that the tail pages of a high-order non-compound page do not get poisoned at free, so any invalid access while they are free could go unnoticed. It looks like the pages will be poisoned again at allocation time, so that would bookend the window. Fix this by using the order parameter to calculate the size. Link: https://lkml.kernel.org/r/20260104134348.3544298-1-ryan.roberts@arm.com Fixes: b073d7f8aee4 ("mm: kmsan: maintain KMSAN metadata for page operations") Signed-off-by: Ryan Roberts Reviewed-by: Alexander Potapenko Tested-by: Alexander Potapenko Cc: Dmitriy Vyukov Cc: Dmitry Vyukov Cc: Marco Elver Cc: Ryan Roberts Cc: Signed-off-by: Andrew Morton --- mm/kmsan/shadow.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/kmsan/shadow.c b/mm/kmsan/shadow.c index e7f554a31bb44..9e1c5f2b7a416 100644 --- a/mm/kmsan/shadow.c +++ b/mm/kmsan/shadow.c @@ -207,7 +207,7 @@ void kmsan_free_page(struct page *page, unsigned int order) if (!kmsan_enabled || kmsan_in_runtime()) return; kmsan_enter_runtime(); - kmsan_internal_poison_memory(page_address(page), page_size(page), + kmsan_internal_poison_memory(page_address(page), PAGE_SIZE << order, GFP_KERNEL & ~(__GFP_RECLAIM), KMSAN_POISON_CHECK | KMSAN_POISON_FREE); kmsan_leave_runtime(); From d30bbde725ef44e97d5014379008ecf3a6b75279 Mon Sep 17 00:00:00 2001 From: Carlos Llamas Date: Mon, 5 Jan 2026 19:07:46 +0000 Subject: [PATCH 1044/1321] iommu/sva: include mmu_notifier.h header A call to mmu_notifier_arch_invalidate_secondary_tlbs() was introduced in commit e37d5a2d60a3 ("iommu/sva: invalidate stale IOTLB entries for kernel address space") but without explicitly adding its corresponding header file . This was evidenced while trying to enable compile testing support for IOMMU_SVA: config IOMMU_SVA select IOMMU_MM_DATA - bool + bool "Shared Virtual Addressing" if COMPILE_TEST The thing is for certain architectures this header file is indirectly included via . However, for others such as 32-bit arm the header is missing and it results in a build failure: $ make ARCH=arm allmodconfig [...] drivers/iommu/iommu-sva.c:340:3: error: call to undeclared function 'mmu_notifier_arch_invalidate_secondary_tlbs' [...] 340 | mmu_notifier_arch_invalidate_secondary_tlbs(iommu_mm->mm, start, end); | ^ Fix this by including the appropriate header file. Link: https://lkml.kernel.org/r/20260105190747.625082-1-cmllamas@google.com Fixes: e37d5a2d60a3 ("iommu/sva: invalidate stale IOTLB entries for kernel address space") Signed-off-by: Carlos Llamas Cc: Baolu Lu Cc: Jason Gunthorpe Cc: Joerg Roedel Cc: Kevin Tian Cc: Robin Murphy Cc: Vasant Hegde Cc: Will Deacon Signed-off-by: Andrew Morton --- drivers/iommu/iommu-sva.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c index d236aef80a8d5..e1e63c2be82b2 100644 --- a/drivers/iommu/iommu-sva.c +++ b/drivers/iommu/iommu-sva.c @@ -3,6 +3,7 @@ * Helpers for IOMMU drivers implementing SVA */ #include +#include #include #include #include From d5d829fd085a2cb39215dc653b09f8063cdc5707 Mon Sep 17 00:00:00 2001 From: Vlastimil Babka Date: Mon, 5 Jan 2026 16:08:56 +0100 Subject: [PATCH 1045/1321] mm/page_alloc: prevent pcp corruption with SMP=n The kernel test robot has reported: BUG: spinlock trylock failure on UP on CPU#0, kcompactd0/28 lock: 0xffff888807e35ef0, .magic: dead4ead, .owner: kcompactd0/28, .owner_cpu: 0 CPU: 0 UID: 0 PID: 28 Comm: kcompactd0 Not tainted 6.18.0-rc5-00127-ga06157804399 #1 PREEMPT 8cc09ef94dcec767faa911515ce9e609c45db470 Call Trace: __dump_stack (lib/dump_stack.c:95) dump_stack_lvl (lib/dump_stack.c:123) dump_stack (lib/dump_stack.c:130) spin_dump (kernel/locking/spinlock_debug.c:71) do_raw_spin_trylock (kernel/locking/spinlock_debug.c:?) _raw_spin_trylock (include/linux/spinlock_api_smp.h:89 kernel/locking/spinlock.c:138) __free_frozen_pages (mm/page_alloc.c:2973) ___free_pages (mm/page_alloc.c:5295) __free_pages (mm/page_alloc.c:5334) tlb_remove_table_rcu (include/linux/mm.h:? include/linux/mm.h:3122 include/asm-generic/tlb.h:220 mm/mmu_gather.c:227 mm/mmu_gather.c:290) ? __cfi_tlb_remove_table_rcu (mm/mmu_gather.c:289) ? rcu_core (kernel/rcu/tree.c:?) rcu_core (include/linux/rcupdate.h:341 kernel/rcu/tree.c:2607 kernel/rcu/tree.c:2861) rcu_core_si (kernel/rcu/tree.c:2879) handle_softirqs (arch/x86/include/asm/jump_label.h:36 include/trace/events/irq.h:142 kernel/softirq.c:623) __irq_exit_rcu (arch/x86/include/asm/jump_label.h:36 kernel/softirq.c:725) irq_exit_rcu (kernel/softirq.c:741) sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1052) RIP: 0010:_raw_spin_unlock_irqrestore (arch/x86/include/asm/preempt.h:95 include/linux/spinlock_api_smp.h:152 kernel/locking/spinlock.c:194) free_pcppages_bulk (mm/page_alloc.c:1494) drain_pages_zone (include/linux/spinlock.h:391 mm/page_alloc.c:2632) __drain_all_pages (mm/page_alloc.c:2731) drain_all_pages (mm/page_alloc.c:2747) kcompactd (mm/compaction.c:3115) kthread (kernel/kthread.c:465) ? __cfi_kcompactd (mm/compaction.c:3166) ? __cfi_kthread (kernel/kthread.c:412) ret_from_fork (arch/x86/kernel/process.c:164) ? __cfi_kthread (kernel/kthread.c:412) ret_from_fork_asm (arch/x86/entry/entry_64.S:255) Matthew has analyzed the report and identified that in drain_page_zone() we are in a section protected by spin_lock(&pcp->lock) and then get an interrupt that attempts spin_trylock() on the same lock. The code is designed to work this way without disabling IRQs and occasionally fail the trylock with a fallback. However, the SMP=n spinlock implementation assumes spin_trylock() will always succeed, and thus it's normally a no-op. Here the enabled lock debugging catches the problem, but otherwise it could cause a corruption of the pcp structure. The problem has been introduced by commit 574907741599 ("mm/page_alloc: leave IRQs enabled for per-cpu page allocations"). The pcp locking scheme recognizes the need for disabling IRQs to prevent nesting spin_trylock() sections on SMP=n, but the need to prevent the nesting in spin_lock() has not been recognized. Fix it by introducing local wrappers that change the spin_lock() to spin_lock_iqsave() with SMP=n and use them in all places that do spin_lock(&pcp->lock). [vbabka@suse.cz: add pcp_ prefix to the spin_lock_irqsave wrappers, per Steven] Link: https://lkml.kernel.org/r/20260105-fix-pcp-up-v1-1-5579662d2071@suse.cz Fixes: 574907741599 ("mm/page_alloc: leave IRQs enabled for per-cpu page allocations") Signed-off-by: Vlastimil Babka Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-lkp/202512101320.e2f2dd6f-lkp@intel.com Analyzed-by: Matthew Wilcox Link: https://lore.kernel.org/all/aUW05pyc9nZkvY-1@casper.infradead.org/ Acked-by: Mel Gorman Cc: Brendan Jackman Cc: Johannes Weiner Cc: Michal Hocko Cc: Sebastian Andrzej Siewior Cc: Steven Rostedt Cc: Suren Baghdasaryan Cc: Zi Yan Cc: Signed-off-by: Andrew Morton --- mm/page_alloc.c | 47 +++++++++++++++++++++++++++++++++++++++-------- 1 file changed, 39 insertions(+), 8 deletions(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 3dce96f3be494..f65c4edf199d5 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -167,6 +167,33 @@ static inline void __pcp_trylock_noop(unsigned long *flags) { } pcp_trylock_finish(UP_flags); \ }) +/* + * With the UP spinlock implementation, when we spin_lock(&pcp->lock) (for i.e. + * a potentially remote cpu drain) and get interrupted by an operation that + * attempts pcp_spin_trylock(), we can't rely on the trylock failure due to UP + * spinlock assumptions making the trylock a no-op. So we have to turn that + * spin_lock() to a spin_lock_irqsave(). This works because on UP there are no + * remote cpu's so we can only be locking the only existing local one. + */ +#if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT_RT) +static inline void __flags_noop(unsigned long *flags) { } +#define pcp_spin_lock_maybe_irqsave(ptr, flags) \ +({ \ + __flags_noop(&(flags)); \ + spin_lock(&(ptr)->lock); \ +}) +#define pcp_spin_unlock_maybe_irqrestore(ptr, flags) \ +({ \ + spin_unlock(&(ptr)->lock); \ + __flags_noop(&(flags)); \ +}) +#else +#define pcp_spin_lock_maybe_irqsave(ptr, flags) \ + spin_lock_irqsave(&(ptr)->lock, flags) +#define pcp_spin_unlock_maybe_irqrestore(ptr, flags) \ + spin_unlock_irqrestore(&(ptr)->lock, flags) +#endif + #ifdef CONFIG_USE_PERCPU_NUMA_NODE_ID DEFINE_PER_CPU(int, numa_node); EXPORT_PER_CPU_SYMBOL(numa_node); @@ -2556,6 +2583,7 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order, bool decay_pcp_high(struct zone *zone, struct per_cpu_pages *pcp) { int high_min, to_drain, to_drain_batched, batch; + unsigned long UP_flags; bool todo = false; high_min = READ_ONCE(pcp->high_min); @@ -2575,9 +2603,9 @@ bool decay_pcp_high(struct zone *zone, struct per_cpu_pages *pcp) to_drain = pcp->count - pcp->high; while (to_drain > 0) { to_drain_batched = min(to_drain, batch); - spin_lock(&pcp->lock); + pcp_spin_lock_maybe_irqsave(pcp, UP_flags); free_pcppages_bulk(zone, to_drain_batched, pcp, 0); - spin_unlock(&pcp->lock); + pcp_spin_unlock_maybe_irqrestore(pcp, UP_flags); todo = true; to_drain -= to_drain_batched; @@ -2594,14 +2622,15 @@ bool decay_pcp_high(struct zone *zone, struct per_cpu_pages *pcp) */ void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp) { + unsigned long UP_flags; int to_drain, batch; batch = READ_ONCE(pcp->batch); to_drain = min(pcp->count, batch); if (to_drain > 0) { - spin_lock(&pcp->lock); + pcp_spin_lock_maybe_irqsave(pcp, UP_flags); free_pcppages_bulk(zone, to_drain, pcp, 0); - spin_unlock(&pcp->lock); + pcp_spin_unlock_maybe_irqrestore(pcp, UP_flags); } } #endif @@ -2612,10 +2641,11 @@ void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp) static void drain_pages_zone(unsigned int cpu, struct zone *zone) { struct per_cpu_pages *pcp = per_cpu_ptr(zone->per_cpu_pageset, cpu); + unsigned long UP_flags; int count; do { - spin_lock(&pcp->lock); + pcp_spin_lock_maybe_irqsave(pcp, UP_flags); count = pcp->count; if (count) { int to_drain = min(count, @@ -2624,7 +2654,7 @@ static void drain_pages_zone(unsigned int cpu, struct zone *zone) free_pcppages_bulk(zone, to_drain, pcp, 0); count -= to_drain; } - spin_unlock(&pcp->lock); + pcp_spin_unlock_maybe_irqrestore(pcp, UP_flags); } while (count); } @@ -6109,6 +6139,7 @@ static void zone_pcp_update_cacheinfo(struct zone *zone, unsigned int cpu) { struct per_cpu_pages *pcp; struct cpu_cacheinfo *cci; + unsigned long UP_flags; pcp = per_cpu_ptr(zone->per_cpu_pageset, cpu); cci = get_cpu_cacheinfo(cpu); @@ -6119,12 +6150,12 @@ static void zone_pcp_update_cacheinfo(struct zone *zone, unsigned int cpu) * This can reduce zone lock contention without hurting * cache-hot pages sharing. */ - spin_lock(&pcp->lock); + pcp_spin_lock_maybe_irqsave(pcp, UP_flags); if ((cci->per_cpu_data_slice_size >> PAGE_SHIFT) > 3 * pcp->batch) pcp->flags |= PCPF_FREE_HIGH_BATCH; else pcp->flags &= ~PCPF_FREE_HIGH_BATCH; - spin_unlock(&pcp->lock); + pcp_spin_unlock_maybe_irqrestore(pcp, UP_flags); } void setup_pcp_cacheinfo(unsigned int cpu) From fcf75dd8ebc7cd135afdf05aacd979f2ff3d7c7b Mon Sep 17 00:00:00 2001 From: Lorenzo Stoakes Date: Tue, 6 Jan 2026 15:45:47 +0000 Subject: [PATCH 1046/1321] tools/testing/selftests: fix gup_longterm for unknown fs Commit 66bce7afbaca ("selftests/mm: fix test result reporting in gup_longterm") introduced a small bug causing unknown filesystems to always result in a test failure. This is because do_test() was updated to use a common reporting path, but this case appears to have been missed. This is problematic for e.g. virtme-ng which uses an overlayfs file system, causing gup_longterm to appear to fail each time due to a test count mismatch: # Planned tests != run tests (50 != 46) # Totals: pass:24 fail:0 xfail:0 xpass:0 skip:22 error:0 The fix is to simply change the return into a break. Link: https://lkml.kernel.org/r/20260106154547.214907-1-lorenzo.stoakes@oracle.com Fixes: 66bce7afbaca ("selftests/mm: fix test result reporting in gup_longterm") Signed-off-by: Lorenzo Stoakes Reviewed-by: David Hildenbrand (Red Hat) Cc: Jason Gunthorpe Cc: John Hubbard Cc: Liam Howlett Cc: "Liam R. Howlett" Cc: Mark Brown Cc: Michal Hocko Cc: Mike Rapoport Cc: Peter Xu Cc: Shuah Khan Cc: Suren Baghdasaryan Cc: Vlastimil Babka Cc: Signed-off-by: Andrew Morton --- tools/testing/selftests/mm/gup_longterm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/testing/selftests/mm/gup_longterm.c b/tools/testing/selftests/mm/gup_longterm.c index 6279893a0adcf..f61150d28eb2e 100644 --- a/tools/testing/selftests/mm/gup_longterm.c +++ b/tools/testing/selftests/mm/gup_longterm.c @@ -179,7 +179,7 @@ static void do_test(int fd, size_t size, enum test_type type, bool shared) if (rw && shared && fs_is_unknown(fs_type)) { ksft_print_msg("Unknown filesystem\n"); result = KSFT_SKIP; - return; + break; } /* * R/O pinning or pinning in a private mapping is always From 34066f471658985cbedee185842515aa97b37c90 Mon Sep 17 00:00:00 2001 From: Daniel Thompson Date: Thu, 8 Jan 2026 11:09:54 +0000 Subject: [PATCH 1047/1321] mailmap: add entry for Daniel Thompson My linaro address stopped working a long time ago but I didn't update my mailmap entry. Fix that. Link: https://lkml.kernel.org/r/20260108-mailmap-daniel_thompson_linaro-org-v1-1-83f610876377@kernel.org Signed-off-by: Daniel Thompson Signed-off-by: Andrew Morton --- .mailmap | 1 + 1 file changed, 1 insertion(+) diff --git a/.mailmap b/.mailmap index bc94004bc4c2e..4a8a160f28ed0 100644 --- a/.mailmap +++ b/.mailmap @@ -207,6 +207,7 @@ Daniel Borkmann Daniel Borkmann Daniel Borkmann Daniel Borkmann +Daniel Thompson Danilo Krummrich David Brownell David Collins From 6babb53d8d0394948d4819c1948f534468b722d1 Mon Sep 17 00:00:00 2001 From: Ben Dooks Date: Thu, 8 Jan 2026 10:15:39 +0000 Subject: [PATCH 1048/1321] mm: numa,memblock: include for 'numa_nodes_parsed' The 'numa_nodes_parsed' is defined in but this file is not included in mm/numa_memblks.c (build x86_64) so add this to the incldues to fix the following sparse warning: mm/numa_memblks.c:13:12: warning: symbol 'numa_nodes_parsed' was not declared. Should it be static? Link: https://lkml.kernel.org/r/20260108101539.229192-1-ben.dooks@codethink.co.uk Fixes: 87482708210f ("mm: introduce numa_memblks") Signed-off-by: Ben Dooks Reviewed-by: Mike Rapoport (Microsoft) Cc: Ben Dooks Cc: Signed-off-by: Andrew Morton --- mm/numa_memblks.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/mm/numa_memblks.c b/mm/numa_memblks.c index 5b009a9cd8b4c..8f5735fda0a21 100644 --- a/mm/numa_memblks.c +++ b/mm/numa_memblks.c @@ -7,6 +7,8 @@ #include #include +#include + int numa_distance_cnt; static u8 *numa_distance; From 0524d0b7a82f080085f9e3455ed8d1aba7a2f82b Mon Sep 17 00:00:00 2001 From: John Groves Date: Sat, 10 Jan 2026 13:18:04 -0600 Subject: [PATCH 1049/1321] drivers/dax: add some missing kerneldoc comment fields for struct dev_dax Add the missing @align and @memmap_on_memory fields to kerneldoc comment header for struct dev_dax. Also, some other fields were followed by '-' and others by ':'. Fix all to be ':' for actual kerneldoc compliance. Link: https://lkml.kernel.org/r/20260110191804.5739-1-john@groves.net Fixes: 33cf94d71766 ("device-dax: make align a per-device property") Fixes: 4eca0ef49af9 ("dax/kmem: allow kmem to add memory with memmap_on_memory") Signed-off-by: John Groves Cc: Dan Williams Cc: Joao Martins Cc: Vishal Verma Signed-off-by: Andrew Morton --- drivers/dax/dax-private.h | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/drivers/dax/dax-private.h b/drivers/dax/dax-private.h index 0867115aeef2e..c6ae27c982f43 100644 --- a/drivers/dax/dax-private.h +++ b/drivers/dax/dax-private.h @@ -67,14 +67,16 @@ struct dev_dax_range { /** * struct dev_dax - instance data for a subdivision of a dax region, and * data while the device is activated in the driver. - * @region - parent region - * @dax_dev - core dax functionality + * @region: parent region + * @dax_dev: core dax functionality + * @align: alignment of this instance * @target_node: effective numa node if dev_dax memory range is onlined * @dyn_id: is this a dynamic or statically created instance * @id: ida allocated id when the dax_region is not static * @ida: mapping id allocator - * @dev - device core - * @pgmap - pgmap for memmap setup / lifetime (driver owned) + * @dev: device core + * @pgmap: pgmap for memmap setup / lifetime (driver owned) + * @memmap_on_memory: allow kmem to put the memmap in the memory * @nr_range: size of @ranges * @ranges: range tuples of memory used */ From 4072786f7174dac670d4f177c831cb3caaee1da4 Mon Sep 17 00:00:00 2001 From: Ard Biesheuvel Date: Fri, 5 Dec 2025 10:32:16 +0100 Subject: [PATCH 1050/1321] efi: Wipe INITRD config table from memory after consumption When the EFI stub itself loads the initrd and puts it in memory (rather than simply passing on a struct boot_params or device tree that already carries initrd information), it exposes this information to the core kernel via a INITRD configuration table. Given that config tables are preserved across kexec, this means that subsequent kexec boots will observe the same information, even though it most likely has become stale by that point. On x86, this information is usually superseded by the initrd info passed via bootparams, in which case this stale information is simply ignored. However, when performing a kexec boot without passing an initrd, the loader falls back to this stale information and explodes. So wipe the base and size from the INITRD config table as soon as it has been consumed. This fixes the issue for kexec on all EFI architectures. Reported-by: James Le Cuirot Tested-by: James Le Cuirot Acked-by: H. Peter Anvin (Intel) Link: https://lore.kernel.org/all/20251126173209.374755-2-chewi@gentoo.org Signed-off-by: Ard Biesheuvel --- drivers/firmware/efi/efi.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c index 55452e61af31d..f5ff6e84a9b7d 100644 --- a/drivers/firmware/efi/efi.c +++ b/drivers/firmware/efi/efi.c @@ -819,6 +819,7 @@ int __init efi_config_parse_tables(const efi_config_table_t *config_tables, if (tbl) { phys_initrd_start = tbl->base; phys_initrd_size = tbl->size; + tbl->base = tbl->size = 0; early_memunmap(tbl, sizeof(*tbl)); } } From c057885c593d786f51901cb7a37b36268fbe83b7 Mon Sep 17 00:00:00 2001 From: Mauro Carvalho Chehab Date: Tue, 6 Jan 2026 11:24:26 +0100 Subject: [PATCH 1051/1321] MAINTAINERS: add cper to APEI files The CPER records are defined as part of UEFI specs, but its primary way to report it is via APEI/GHES. As such, let's place it under the same umbrella to make easier for patch review. Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Ard Biesheuvel --- MAINTAINERS | 2 ++ 1 file changed, 2 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index 61bc5c5665529..f1b0205885973 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -314,6 +314,7 @@ R: Mauro Carvalho Chehab R: Shuai Xue L: linux-acpi@vger.kernel.org F: drivers/acpi/apei/ +F: drivers/firmware/efi/cper* ACPI COMPONENT ARCHITECTURE (ACPICA) M: "Rafael J. Wysocki" @@ -9516,6 +9517,7 @@ F: arch/arm/boot/compressed/efi-header.S F: arch/x86/platform/efi/ F: drivers/firmware/efi/ F: include/linux/efi*.h +X: drivers/firmware/efi/cper* EXTERNAL CONNECTOR SUBSYSTEM (EXTCON) M: MyungJoo Ham From 70ee9ca556504f17b6e1bbfe69671d8cb4416218 Mon Sep 17 00:00:00 2001 From: Morduan Zang Date: Wed, 14 Jan 2026 13:30:33 +0800 Subject: [PATCH 1052/1321] efi/cper: Fix cper_bits_to_str buffer handling and return value The return value calculation was incorrect: `return len - buf_size;` Initially `len = buf_size`, then `len` decreases with each operation. This results in a negative return value on success. Fix by returning `buf_size - len` which correctly calculates the actual number of bytes written. Fixes: a976d790f494 ("efi/cper: Add a new helper function to print bitmasks") Signed-off-by: Morduan Zang Signed-off-by: Ard Biesheuvel --- drivers/firmware/efi/cper.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c index 0232bd040f61c..bd99802cb0cad 100644 --- a/drivers/firmware/efi/cper.c +++ b/drivers/firmware/efi/cper.c @@ -162,7 +162,7 @@ int cper_bits_to_str(char *buf, int buf_size, unsigned long bits, len -= size; str += size; } - return len - buf_size; + return buf_size - len; } EXPORT_SYMBOL_GPL(cper_bits_to_str); From 54731724ccaa5bc3594d5d7c285b178ca036634b Mon Sep 17 00:00:00 2001 From: Trond Myklebust Date: Mon, 8 Dec 2025 14:45:00 -0500 Subject: [PATCH 1053/1321] pNFS: Fix a deadlock when returning a delegation during open() Ben Coddington reports seeing a hang in the following stack trace: 0 [ffffd0b50e1774e0] __schedule at ffffffff9ca05415 1 [ffffd0b50e177548] schedule at ffffffff9ca05717 2 [ffffd0b50e177558] bit_wait at ffffffff9ca061e1 3 [ffffd0b50e177568] __wait_on_bit at ffffffff9ca05cfb 4 [ffffd0b50e1775c8] out_of_line_wait_on_bit at ffffffff9ca05ea5 5 [ffffd0b50e177618] pnfs_roc at ffffffffc154207b [nfsv4] 6 [ffffd0b50e1776b8] _nfs4_proc_delegreturn at ffffffffc1506586 [nfsv4] 7 [ffffd0b50e177788] nfs4_proc_delegreturn at ffffffffc1507480 [nfsv4] 8 [ffffd0b50e1777f8] nfs_do_return_delegation at ffffffffc1523e41 [nfsv4] 9 [ffffd0b50e177838] nfs_inode_set_delegation at ffffffffc1524a75 [nfsv4] 10 [ffffd0b50e177888] nfs4_process_delegation at ffffffffc14f41dd [nfsv4] 11 [ffffd0b50e1778a0] _nfs4_opendata_to_nfs4_state at ffffffffc1503edf [nfsv4] 12 [ffffd0b50e1778c0] _nfs4_open_and_get_state at ffffffffc1504e56 [nfsv4] 13 [ffffd0b50e177978] _nfs4_do_open at ffffffffc15051b8 [nfsv4] 14 [ffffd0b50e1779f8] nfs4_do_open at ffffffffc150559c [nfsv4] 15 [ffffd0b50e177a80] nfs4_atomic_open at ffffffffc15057fb [nfsv4] 16 [ffffd0b50e177ad0] nfs4_file_open at ffffffffc15219be [nfsv4] 17 [ffffd0b50e177b78] do_dentry_open at ffffffff9c09e6ea 18 [ffffd0b50e177ba8] vfs_open at ffffffff9c0a082e 19 [ffffd0b50e177bd0] dentry_open at ffffffff9c0a0935 The issue is that the delegreturn is being asked to wait for a layout return that cannot complete because a state recovery was initiated. The state recovery cannot complete until the open() finishes processing the delegations it was given. The solution is to propagate the existing flags that indicate a non-blocking call to the function pnfs_roc(), so that it knows not to wait in this situation. Reported-by: Benjamin Coddington Fixes: 29ade5db1293 ("pNFS: Wait on outstanding layoutreturns to complete in pnfs_roc()") Signed-off-by: Trond Myklebust --- fs/nfs/nfs4proc.c | 6 ++--- fs/nfs/pnfs.c | 58 +++++++++++++++++++++++++++++++++-------------- fs/nfs/pnfs.h | 17 ++++++-------- 3 files changed, 51 insertions(+), 30 deletions(-) diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index ec1ce593dea2b..51da62ba65595 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -3894,8 +3894,8 @@ int nfs4_do_close(struct nfs4_state *state, gfp_t gfp_mask, int wait) calldata->res.seqid = calldata->arg.seqid; calldata->res.server = server; calldata->res.lr_ret = -NFS4ERR_NOMATCHING_LAYOUT; - calldata->lr.roc = pnfs_roc(state->inode, - &calldata->lr.arg, &calldata->lr.res, msg.rpc_cred); + calldata->lr.roc = pnfs_roc(state->inode, &calldata->lr.arg, + &calldata->lr.res, msg.rpc_cred, wait); if (calldata->lr.roc) { calldata->arg.lr_args = &calldata->lr.arg; calldata->res.lr_res = &calldata->lr.res; @@ -7005,7 +7005,7 @@ static int _nfs4_proc_delegreturn(struct inode *inode, const struct cred *cred, data->inode = nfs_igrab_and_active(inode); if (data->inode || issync) { data->lr.roc = pnfs_roc(inode, &data->lr.arg, &data->lr.res, - cred); + cred, issync); if (data->lr.roc) { data->args.lr_args = &data->lr.arg; data->res.lr_res = &data->lr.res; diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c index b72d7cc367662..cff225721d1ce 100644 --- a/fs/nfs/pnfs.c +++ b/fs/nfs/pnfs.c @@ -1533,10 +1533,9 @@ static int pnfs_layout_return_on_reboot(struct pnfs_layout_hdr *lo) PNFS_FL_LAYOUTRETURN_PRIVILEGED); } -bool pnfs_roc(struct inode *ino, - struct nfs4_layoutreturn_args *args, - struct nfs4_layoutreturn_res *res, - const struct cred *cred) +bool pnfs_roc(struct inode *ino, struct nfs4_layoutreturn_args *args, + struct nfs4_layoutreturn_res *res, const struct cred *cred, + bool sync) { struct nfs_inode *nfsi = NFS_I(ino); struct nfs_open_context *ctx; @@ -1547,7 +1546,7 @@ bool pnfs_roc(struct inode *ino, nfs4_stateid stateid; enum pnfs_iomode iomode = 0; bool layoutreturn = false, roc = false; - bool skip_read = false; + bool skip_read; if (!nfs_have_layout(ino)) return false; @@ -1560,20 +1559,14 @@ bool pnfs_roc(struct inode *ino, lo = NULL; goto out_noroc; } - pnfs_get_layout_hdr(lo); - if (test_bit(NFS_LAYOUT_RETURN_LOCK, &lo->plh_flags)) { - spin_unlock(&ino->i_lock); - rcu_read_unlock(); - wait_on_bit(&lo->plh_flags, NFS_LAYOUT_RETURN, - TASK_UNINTERRUPTIBLE); - pnfs_put_layout_hdr(lo); - goto retry; - } /* no roc if we hold a delegation */ + skip_read = false; if (nfs4_check_delegation(ino, FMODE_READ)) { - if (nfs4_check_delegation(ino, FMODE_WRITE)) + if (nfs4_check_delegation(ino, FMODE_WRITE)) { + lo = NULL; goto out_noroc; + } skip_read = true; } @@ -1582,12 +1575,43 @@ bool pnfs_roc(struct inode *ino, if (state == NULL) continue; /* Don't return layout if there is open file state */ - if (state->state & FMODE_WRITE) + if (state->state & FMODE_WRITE) { + lo = NULL; goto out_noroc; + } if (state->state & FMODE_READ) skip_read = true; } + if (skip_read) { + bool writes = false; + + list_for_each_entry(lseg, &lo->plh_segs, pls_list) { + if (lseg->pls_range.iomode != IOMODE_READ) { + writes = true; + break; + } + } + if (!writes) { + lo = NULL; + goto out_noroc; + } + } + + pnfs_get_layout_hdr(lo); + if (test_bit(NFS_LAYOUT_RETURN_LOCK, &lo->plh_flags)) { + if (!sync) { + pnfs_set_plh_return_info( + lo, skip_read ? IOMODE_RW : IOMODE_ANY, 0); + goto out_noroc; + } + spin_unlock(&ino->i_lock); + rcu_read_unlock(); + wait_on_bit(&lo->plh_flags, NFS_LAYOUT_RETURN, + TASK_UNINTERRUPTIBLE); + pnfs_put_layout_hdr(lo); + goto retry; + } list_for_each_entry_safe(lseg, next, &lo->plh_segs, pls_list) { if (skip_read && lseg->pls_range.iomode == IOMODE_READ) @@ -1627,7 +1651,7 @@ bool pnfs_roc(struct inode *ino, out_noroc: spin_unlock(&ino->i_lock); rcu_read_unlock(); - pnfs_layoutcommit_inode(ino, true); + pnfs_layoutcommit_inode(ino, sync); if (roc) { struct pnfs_layoutdriver_type *ld = NFS_SERVER(ino)->pnfs_curr_ld; if (ld->prepare_layoutreturn) diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h index 91ff877185c8a..3db8f13d8fe4e 100644 --- a/fs/nfs/pnfs.h +++ b/fs/nfs/pnfs.h @@ -303,10 +303,9 @@ int pnfs_mark_matching_lsegs_return(struct pnfs_layout_hdr *lo, u32 seq); int pnfs_mark_layout_stateid_invalid(struct pnfs_layout_hdr *lo, struct list_head *lseg_list); -bool pnfs_roc(struct inode *ino, - struct nfs4_layoutreturn_args *args, - struct nfs4_layoutreturn_res *res, - const struct cred *cred); +bool pnfs_roc(struct inode *ino, struct nfs4_layoutreturn_args *args, + struct nfs4_layoutreturn_res *res, const struct cred *cred, + bool sync); int pnfs_roc_done(struct rpc_task *task, struct nfs4_layoutreturn_args **argpp, struct nfs4_layoutreturn_res **respp, int *ret); void pnfs_roc_release(struct nfs4_layoutreturn_args *args, @@ -773,12 +772,10 @@ pnfs_layoutcommit_outstanding(struct inode *inode) return false; } - -static inline bool -pnfs_roc(struct inode *ino, - struct nfs4_layoutreturn_args *args, - struct nfs4_layoutreturn_res *res, - const struct cred *cred) +static inline bool pnfs_roc(struct inode *ino, + struct nfs4_layoutreturn_args *args, + struct nfs4_layoutreturn_res *res, + const struct cred *cred, bool sync) { return false; } From 4288fc9254323f555c9e2a1630c4fb33a6540b67 Mon Sep 17 00:00:00 2001 From: Trond Myklebust Date: Wed, 31 Dec 2025 11:42:31 -0500 Subject: [PATCH 1054/1321] NFS: Fix a deadlock involving nfs_release_folio() Wang Zhaolong reports a deadlock involving NFSv4.1 state recovery waiting on kthreadd, which is attempting to reclaim memory by calling nfs_release_folio(). The latter cannot make progress due to state recovery being needed. It seems that the only safe thing to do here is to kick off a writeback of the folio, without waiting for completion, or else kicking off an asynchronous commit. Reported-by: Wang Zhaolong Fixes: 96780ca55e3c ("NFS: fix up nfs_release_folio() to try to release the page") Signed-off-by: Trond Myklebust --- fs/nfs/file.c | 3 ++- fs/nfs/nfstrace.h | 3 +++ fs/nfs/write.c | 33 +++++++++++++++++++++++++++++++++ include/linux/nfs_fs.h | 1 + 4 files changed, 39 insertions(+), 1 deletion(-) diff --git a/fs/nfs/file.c b/fs/nfs/file.c index d020aab40c64e..d1c138a416cfb 100644 --- a/fs/nfs/file.c +++ b/fs/nfs/file.c @@ -511,7 +511,8 @@ static bool nfs_release_folio(struct folio *folio, gfp_t gfp) if ((current_gfp_context(gfp) & GFP_KERNEL) != GFP_KERNEL || current_is_kswapd() || current_is_kcompactd()) return false; - if (nfs_wb_folio(folio->mapping->host, folio) < 0) + if (nfs_wb_folio_reclaim(folio->mapping->host, folio) < 0 || + folio_test_private(folio)) return false; } return nfs_fscache_release_folio(folio, gfp); diff --git a/fs/nfs/nfstrace.h b/fs/nfs/nfstrace.h index 6ce55e8e6b67c..9f9ce4a565ea6 100644 --- a/fs/nfs/nfstrace.h +++ b/fs/nfs/nfstrace.h @@ -1062,6 +1062,9 @@ DECLARE_EVENT_CLASS(nfs_folio_event_done, DEFINE_NFS_FOLIO_EVENT(nfs_aop_readpage); DEFINE_NFS_FOLIO_EVENT_DONE(nfs_aop_readpage_done); +DEFINE_NFS_FOLIO_EVENT(nfs_writeback_folio_reclaim); +DEFINE_NFS_FOLIO_EVENT_DONE(nfs_writeback_folio_reclaim_done); + DEFINE_NFS_FOLIO_EVENT(nfs_writeback_folio); DEFINE_NFS_FOLIO_EVENT_DONE(nfs_writeback_folio_done); diff --git a/fs/nfs/write.c b/fs/nfs/write.c index 336c510f37502..bf412455e8edf 100644 --- a/fs/nfs/write.c +++ b/fs/nfs/write.c @@ -2024,6 +2024,39 @@ int nfs_wb_folio_cancel(struct inode *inode, struct folio *folio) return ret; } +/** + * nfs_wb_folio_reclaim - Write back all requests on one page + * @inode: pointer to page + * @folio: pointer to folio + * + * Assumes that the folio has been locked by the caller + */ +int nfs_wb_folio_reclaim(struct inode *inode, struct folio *folio) +{ + loff_t range_start = folio_pos(folio); + size_t len = folio_size(folio); + struct writeback_control wbc = { + .sync_mode = WB_SYNC_ALL, + .nr_to_write = 0, + .range_start = range_start, + .range_end = range_start + len - 1, + .for_sync = 1, + }; + int ret; + + if (folio_test_writeback(folio)) + return -EBUSY; + if (folio_clear_dirty_for_io(folio)) { + trace_nfs_writeback_folio_reclaim(inode, range_start, len); + ret = nfs_writepage_locked(folio, &wbc); + trace_nfs_writeback_folio_reclaim_done(inode, range_start, len, + ret); + return ret; + } + nfs_commit_inode(inode, 0); + return 0; +} + /** * nfs_wb_folio - Write back all requests on one page * @inode: pointer to page diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h index a6624edb7226a..8dd79a3f3d662 100644 --- a/include/linux/nfs_fs.h +++ b/include/linux/nfs_fs.h @@ -637,6 +637,7 @@ extern int nfs_update_folio(struct file *file, struct folio *folio, extern int nfs_sync_inode(struct inode *inode); extern int nfs_wb_all(struct inode *inode); extern int nfs_wb_folio(struct inode *inode, struct folio *folio); +extern int nfs_wb_folio_reclaim(struct inode *inode, struct folio *folio); int nfs_wb_folio_cancel(struct inode *inode, struct folio *folio); extern int nfs_commit_inode(struct inode *, int); extern struct nfs_commit_data *nfs_commitdata_alloc(void); From 7ce85c8cd9868b02c78132c5c68ff4a2409a708d Mon Sep 17 00:00:00 2001 From: Zilin Guan Date: Thu, 25 Dec 2025 07:41:03 +0000 Subject: [PATCH 1055/1321] pnfs/flexfiles: Fix memory leak in nfs4_ff_alloc_deviceid_node() In nfs4_ff_alloc_deviceid_node(), if the allocation for ds_versions fails, the function jumps to the out_scratch label without freeing the already allocated dsaddrs list, leading to a memory leak. Fix this by jumping to the out_err_drain_dsaddrs label, which properly frees the dsaddrs list before cleaning up other resources. Fixes: d67ae825a59d6 ("pnfs/flexfiles: Add the FlexFile Layout Driver") Signed-off-by: Zilin Guan Signed-off-by: Trond Myklebust --- fs/nfs/flexfilelayout/flexfilelayoutdev.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/nfs/flexfilelayout/flexfilelayoutdev.c b/fs/nfs/flexfilelayout/flexfilelayoutdev.c index c55ea8fa3bfa5..c2d8a13a9dbdd 100644 --- a/fs/nfs/flexfilelayout/flexfilelayoutdev.c +++ b/fs/nfs/flexfilelayout/flexfilelayoutdev.c @@ -103,7 +103,7 @@ nfs4_ff_alloc_deviceid_node(struct nfs_server *server, struct pnfs_device *pdev, sizeof(struct nfs4_ff_ds_version), gfp_flags); if (!ds_versions) - goto out_scratch; + goto out_err_drain_dsaddrs; for (i = 0; i < version_count; i++) { /* 20 = version(4) + minor_version(4) + rsize(4) + wsize(4) + From 1057487ba5ec2f9c23c99a16bec9eef250f2c0c7 Mon Sep 17 00:00:00 2001 From: Zilin Guan Date: Thu, 25 Dec 2025 08:45:26 +0000 Subject: [PATCH 1056/1321] pnfs/blocklayout: Fix memory leak in bl_parse_scsi() In bl_parse_scsi(), if the block device length is zero, the function returns immediately without releasing the file reference obtained via bl_open_path(), leading to a memory leak. Fix this by jumping to the out_blkdev_put label to ensure the file reference is properly released. Fixes: d76c769c8db4c ("pnfs/blocklayout: Don't add zero-length pnfs_block_dev") Signed-off-by: Zilin Guan Signed-off-by: Trond Myklebust --- fs/nfs/blocklayout/dev.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/fs/nfs/blocklayout/dev.c b/fs/nfs/blocklayout/dev.c index ab76120705e20..134d7f760a33a 100644 --- a/fs/nfs/blocklayout/dev.c +++ b/fs/nfs/blocklayout/dev.c @@ -417,8 +417,10 @@ bl_parse_scsi(struct nfs_server *server, struct pnfs_block_dev *d, d->map = bl_map_simple; d->pr_key = v->scsi.pr_key; - if (d->len == 0) - return -ENODEV; + if (d->len == 0) { + error = -ENODEV; + goto out_blkdev_put; + } ops = bdev->bd_disk->fops->pr_ops; if (!ops) { From bd590d5d464a228ebc980207e90cbba463c43054 Mon Sep 17 00:00:00 2001 From: Anna Schumaker Date: Fri, 19 Dec 2025 15:13:44 -0500 Subject: [PATCH 1057/1321] NFS: Fix directory delegation verifier checks Doing this check in nfs_check_verifier() resulted in many, many more lookups on the wire when running Christoph's delegation benchmarking script. After some experimentation, I found that we can treat directory delegations exactly the same as having a delegated verifier when we reach nfs4_lookup_revalidate() for the best performance. Reported-by: Christoph Hellwig Fixes: 156b09482933 ("NFS: Request a directory delegation on ACCESS, CREATE, and UNLINK") Signed-off-by: Anna Schumaker Signed-off-by: Trond Myklebust --- fs/nfs/dir.c | 21 ++------------------- 1 file changed, 2 insertions(+), 19 deletions(-) diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c index 23a78a742b619..c0e9d5a45cd0b 100644 --- a/fs/nfs/dir.c +++ b/fs/nfs/dir.c @@ -1516,14 +1516,6 @@ static int nfs_check_verifier(struct inode *dir, struct dentry *dentry, if (!nfs_dentry_verify_change(dir, dentry)) return 0; - /* - * If we have a directory delegation then we don't need to revalidate - * the directory. The delegation will either get recalled or we will - * receive a notification when it changes. - */ - if (nfs_have_directory_delegation(dir)) - return 0; - /* Revalidate nfsi->cache_change_attribute before we declare a match */ if (nfs_mapping_need_revalidate_inode(dir)) { if (rcu_walk) @@ -2216,13 +2208,6 @@ int nfs_atomic_open(struct inode *dir, struct dentry *dentry, } EXPORT_SYMBOL_GPL(nfs_atomic_open); -static int -nfs_lookup_revalidate_delegated_parent(struct inode *dir, struct dentry *dentry, - struct inode *inode) -{ - return nfs_lookup_revalidate_done(dir, dentry, inode, 1); -} - static int nfs4_lookup_revalidate(struct inode *dir, const struct qstr *name, struct dentry *dentry, unsigned int flags) @@ -2247,12 +2232,10 @@ nfs4_lookup_revalidate(struct inode *dir, const struct qstr *name, if (inode == NULL) goto full_reval; - if (nfs_verifier_is_delegated(dentry)) + if (nfs_verifier_is_delegated(dentry) || + nfs_have_directory_delegation(inode)) return nfs_lookup_revalidate_delegated(dir, dentry, inode); - if (nfs_have_directory_delegation(dir)) - return nfs_lookup_revalidate_delegated_parent(dir, dentry, inode); - /* NFS only supports OPEN on regular files */ if (!S_ISREG(inode->i_mode)) goto full_reval; From 47b649d34fc09c59a750bfbfef1646e9eb3a71e6 Mon Sep 17 00:00:00 2001 From: Trond Myklebust Date: Wed, 31 Dec 2025 16:41:15 -0500 Subject: [PATCH 1058/1321] NFSv4: Fix nfs_clear_verifier_delegated() for delegated directories If the client returns a directory delegation, then look up all the child dentries, and clear their 'verifier delegated' bit, unless subject to a file delegation. Similarly, if a file delegation is being returned, check if there is a directory delegation before clearing a 'verifier delegated' bit. Reported-by: Christoph Hellwig Fixes: 156b09482933 ("NFS: Request a directory delegation on ACCESS, CREATE, and UNLINK") Signed-off-by: Trond Myklebust --- fs/nfs/dir.c | 57 ++++++++++++++++++++++++++++++++++++++++++++-------- 1 file changed, 49 insertions(+), 8 deletions(-) diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c index c0e9d5a45cd0b..8f9ea79b78823 100644 --- a/fs/nfs/dir.c +++ b/fs/nfs/dir.c @@ -1440,7 +1440,8 @@ static void nfs_set_verifier_locked(struct dentry *dentry, unsigned long verf) if (!dir || !nfs_verify_change_attribute(dir, verf)) return; - if (inode && NFS_PROTO(inode)->have_delegation(inode, FMODE_READ, 0)) + if (NFS_PROTO(dir)->have_delegation(dir, FMODE_READ, 0) || + (inode && NFS_PROTO(inode)->have_delegation(inode, FMODE_READ, 0))) nfs_set_verifier_delegated(&verf); dentry->d_time = verf; } @@ -1465,6 +1466,49 @@ void nfs_set_verifier(struct dentry *dentry, unsigned long verf) EXPORT_SYMBOL_GPL(nfs_set_verifier); #if IS_ENABLED(CONFIG_NFS_V4) +static void nfs_clear_verifier_file(struct inode *inode) +{ + struct dentry *alias; + struct inode *dir; + + hlist_for_each_entry(alias, &inode->i_dentry, d_u.d_alias) { + spin_lock(&alias->d_lock); + dir = d_inode_rcu(alias->d_parent); + if (!dir || + !NFS_PROTO(dir)->have_delegation(dir, FMODE_READ, 0)) + nfs_unset_verifier_delegated(&alias->d_time); + spin_unlock(&alias->d_lock); + } +} + +static void nfs_clear_verifier_directory(struct inode *dir) +{ + struct dentry *this_parent; + struct dentry *dentry; + struct inode *inode; + + if (hlist_empty(&dir->i_dentry)) + return; + this_parent = + hlist_entry(dir->i_dentry.first, struct dentry, d_u.d_alias); + + spin_lock(&this_parent->d_lock); + nfs_unset_verifier_delegated(&this_parent->d_time); + dentry = d_first_child(this_parent); + hlist_for_each_entry_from(dentry, d_sib) { + if (unlikely(dentry->d_flags & DCACHE_DENTRY_CURSOR)) + continue; + inode = d_inode_rcu(dentry); + if (inode && + NFS_PROTO(inode)->have_delegation(inode, FMODE_READ, 0)) + continue; + spin_lock_nested(&dentry->d_lock, DENTRY_D_LOCK_NESTED); + nfs_unset_verifier_delegated(&dentry->d_time); + spin_unlock(&dentry->d_lock); + } + spin_unlock(&this_parent->d_lock); +} + /** * nfs_clear_verifier_delegated - clear the dir verifier delegation tag * @inode: pointer to inode @@ -1477,16 +1521,13 @@ EXPORT_SYMBOL_GPL(nfs_set_verifier); */ void nfs_clear_verifier_delegated(struct inode *inode) { - struct dentry *alias; - if (!inode) return; spin_lock(&inode->i_lock); - hlist_for_each_entry(alias, &inode->i_dentry, d_u.d_alias) { - spin_lock(&alias->d_lock); - nfs_unset_verifier_delegated(&alias->d_time); - spin_unlock(&alias->d_lock); - } + if (S_ISREG(inode->i_mode)) + nfs_clear_verifier_file(inode); + else if (S_ISDIR(inode->i_mode)) + nfs_clear_verifier_directory(inode); spin_unlock(&inode->i_lock); } EXPORT_SYMBOL_GPL(nfs_clear_verifier_delegated); From 2da8c8e4d16e466e079a1aa5ff20f9d8daf6d5b4 Mon Sep 17 00:00:00 2001 From: Trond Myklebust Date: Thu, 1 Jan 2026 20:16:04 -0500 Subject: [PATCH 1059/1321] NFSv4: Don't free slots prematurely if requesting a directory delegation When requesting a directory delegation, it is imperative to hold the slot until the delegation state has been recorded. Otherwise, if a recall comes in, the call to referring_call_exists() will assume the processing is done, and when it doesn't find a delegation, it will assume it has been returned. Fixes: 156b09482933 ("NFS: Request a directory delegation on ACCESS, CREATE, and UNLINK") Signed-off-by: Trond Myklebust --- fs/nfs/nfs4proc.c | 47 +++++++++++++++++++++++++++++++++++++++-------- 1 file changed, 39 insertions(+), 8 deletions(-) diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index 51da62ba65595..a0885ae55abc1 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -4494,6 +4494,25 @@ static bool should_request_dir_deleg(struct inode *inode) } #endif /* CONFIG_NFS_V4_1 */ +static void nfs4_call_getattr_prepare(struct rpc_task *task, void *calldata) +{ + struct nfs4_call_sync_data *data = calldata; + nfs4_setup_sequence(data->seq_server->nfs_client, data->seq_args, + data->seq_res, task); +} + +static void nfs4_call_getattr_done(struct rpc_task *task, void *calldata) +{ + struct nfs4_call_sync_data *data = calldata; + + nfs4_sequence_process(task, data->seq_res); +} + +static const struct rpc_call_ops nfs4_call_getattr_ops = { + .rpc_call_prepare = nfs4_call_getattr_prepare, + .rpc_call_done = nfs4_call_getattr_done, +}; + static int _nfs4_proc_getattr(struct nfs_server *server, struct nfs_fh *fhandle, struct nfs_fattr *fattr, struct inode *inode) { @@ -4511,16 +4530,26 @@ static int _nfs4_proc_getattr(struct nfs_server *server, struct nfs_fh *fhandle, .rpc_argp = &args, .rpc_resp = &res, }; + struct nfs4_call_sync_data data = { + .seq_server = server, + .seq_args = &args.seq_args, + .seq_res = &res.seq_res, + }; + struct rpc_task_setup task_setup = { + .rpc_client = server->client, + .rpc_message = &msg, + .callback_ops = &nfs4_call_getattr_ops, + .callback_data = &data, + }; struct nfs4_gdd_res gdd_res; - unsigned short task_flags = 0; int status; if (nfs4_has_session(server->nfs_client)) - task_flags = RPC_TASK_MOVEABLE; + task_setup.flags = RPC_TASK_MOVEABLE; /* Is this is an attribute revalidation, subject to softreval? */ if (inode && (server->flags & NFS_MOUNT_SOFTREVAL)) - task_flags |= RPC_TASK_TIMEOUT; + task_setup.flags |= RPC_TASK_TIMEOUT; args.get_dir_deleg = should_request_dir_deleg(inode); if (args.get_dir_deleg) @@ -4530,22 +4559,24 @@ static int _nfs4_proc_getattr(struct nfs_server *server, struct nfs_fh *fhandle, nfs_fattr_init(fattr); nfs4_init_sequence(&args.seq_args, &res.seq_res, 0, 0); - status = nfs4_do_call_sync(server->client, server, &msg, - &args.seq_args, &res.seq_res, task_flags); + status = nfs4_call_sync_custom(&task_setup); + if (args.get_dir_deleg) { switch (status) { case 0: if (gdd_res.status != GDD4_OK) break; - status = nfs_inode_set_delegation( - inode, current_cred(), FMODE_READ, - &gdd_res.deleg, 0, NFS4_OPEN_DELEGATE_READ); + nfs_inode_set_delegation(inode, current_cred(), + FMODE_READ, &gdd_res.deleg, 0, + NFS4_OPEN_DELEGATE_READ); break; case -ENOTSUPP: case -EOPNOTSUPP: server->caps &= ~NFS_CAP_DIR_DELEG; } } + + nfs4_sequence_free_slot(&res.seq_res); return status; } From dccc96729129b6c4fb6f3160b05d4cc3587c3ec8 Mon Sep 17 00:00:00 2001 From: Trond Myklebust Date: Tue, 6 Jan 2026 11:54:32 -0500 Subject: [PATCH 1060/1321] NFSv4.x: Directory delegations don't require any state recovery The state recovery code in nfs_end_delegation_return() is intended to allow regular files to recover cached open and lock state. It has no function for directory delegations, and may cause corruption. Fixes: 156b09482933 ("NFS: Request a directory delegation on ACCESS, CREATE, and UNLINK") Signed-off-by: Trond Myklebust --- fs/nfs/delegation.c | 5 +++++ fs/nfs/nfs4state.c | 6 +++++- 2 files changed, 10 insertions(+), 1 deletion(-) diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c index 2248e3ad089a7..c9fa4c1f68fce 100644 --- a/fs/nfs/delegation.c +++ b/fs/nfs/delegation.c @@ -581,6 +581,10 @@ static int nfs_end_delegation_return(struct inode *inode, struct nfs_delegation if (delegation == NULL) return 0; + /* Directory delegations don't require any state recovery */ + if (!S_ISREG(inode->i_mode)) + goto out_return; + if (!issync) mode |= O_NONBLOCK; /* Recall of any remaining application leases */ @@ -604,6 +608,7 @@ static int nfs_end_delegation_return(struct inode *inode, struct nfs_delegation goto out; } +out_return: err = nfs_do_return_delegation(inode, delegation, issync); out: /* Refcount matched in nfs_start_delegation_return_locked() */ diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c index 01179f7de3225..dba51c622cf36 100644 --- a/fs/nfs/nfs4state.c +++ b/fs/nfs/nfs4state.c @@ -1445,6 +1445,8 @@ void nfs_inode_find_state_and_recover(struct inode *inode, struct nfs4_state *state; bool found = false; + if (!S_ISREG(inode->i_mode)) + goto out; rcu_read_lock(); list_for_each_entry_rcu(ctx, &nfsi->open_files, list) { state = ctx->state; @@ -1466,7 +1468,7 @@ void nfs_inode_find_state_and_recover(struct inode *inode, found = true; } rcu_read_unlock(); - +out: nfs_inode_find_delegation_state_and_recover(inode, stateid); if (found) nfs4_schedule_state_manager(clp); @@ -1478,6 +1480,8 @@ static void nfs4_state_mark_open_context_bad(struct nfs4_state *state, int err) struct nfs_inode *nfsi = NFS_I(inode); struct nfs_open_context *ctx; + if (!S_ISREG(inode->i_mode)) + return; rcu_read_lock(); list_for_each_entry_rcu(ctx, &nfsi->open_files, list) { if (ctx->state != state) From 0b7e7a767aff0dc755b9953f0026a352eeb41525 Mon Sep 17 00:00:00 2001 From: Trond Myklebust Date: Fri, 2 Jan 2026 16:01:06 -0500 Subject: [PATCH 1061/1321] NFS/localio: Stop further I/O upon hitting an error If the call into the filesystem results in an I/O error, then the next chunk of data won't be contiguous with the end of the last successful chunk. So break out of the I/O loop and report the results. Currently the localio code will do this for a short read/write, but not for an error. Fixes: 6a218b9c3183 ("nfs/localio: do not issue misaligned DIO out-of-order") Signed-off-by: Trond Myklebust Reviewed-by: Mike Snitzer --- fs/nfs/localio.c | 30 ++++++++++++++---------------- 1 file changed, 14 insertions(+), 16 deletions(-) diff --git a/fs/nfs/localio.c b/fs/nfs/localio.c index a113bfdacfd61..c884245e8fb87 100644 --- a/fs/nfs/localio.c +++ b/fs/nfs/localio.c @@ -618,7 +618,6 @@ static void nfs_local_call_read(struct work_struct *work) struct nfs_local_kiocb *iocb = container_of(work, struct nfs_local_kiocb, work); struct file *filp = iocb->kiocb.ki_filp; - bool force_done = false; ssize_t status; int n_iters; @@ -637,13 +636,13 @@ static void nfs_local_call_read(struct work_struct *work) scoped_with_creds(filp->f_cred) status = filp->f_op->read_iter(&iocb->kiocb, &iocb->iters[i]); - if (status != -EIOCBQUEUED) { - if (unlikely(status >= 0 && status < iocb->iters[i].count)) - force_done = true; /* Partial read */ - if (nfs_local_pgio_done(iocb, status, force_done)) { - nfs_local_read_iocb_done(iocb); - break; - } + if (status == -EIOCBQUEUED) + continue; + /* Break on completion, errors, or short reads */ + if (nfs_local_pgio_done(iocb, status, false) || status < 0 || + (size_t)status < iov_iter_count(&iocb->iters[i])) { + nfs_local_read_iocb_done(iocb); + break; } } } @@ -821,7 +820,6 @@ static void nfs_local_call_write(struct work_struct *work) container_of(work, struct nfs_local_kiocb, work); struct file *filp = iocb->kiocb.ki_filp; unsigned long old_flags = current->flags; - bool force_done = false; ssize_t status; int n_iters; @@ -843,13 +841,13 @@ static void nfs_local_call_write(struct work_struct *work) scoped_with_creds(filp->f_cred) status = filp->f_op->write_iter(&iocb->kiocb, &iocb->iters[i]); - if (status != -EIOCBQUEUED) { - if (unlikely(status >= 0 && status < iocb->iters[i].count)) - force_done = true; /* Partial write */ - if (nfs_local_pgio_done(iocb, status, force_done)) { - nfs_local_write_iocb_done(iocb); - break; - } + if (status == -EIOCBQUEUED) + continue; + /* Break on completion, errors, or short writes */ + if (nfs_local_pgio_done(iocb, status, false) || status < 0 || + (size_t)status < iov_iter_count(&iocb->iters[i])) { + nfs_local_write_iocb_done(iocb); + break; } } file_end_write(filp); From e46e16a1d04e974ba2a7bafe3c182081b26d1325 Mon Sep 17 00:00:00 2001 From: Trond Myklebust Date: Fri, 2 Jan 2026 18:55:08 -0500 Subject: [PATCH 1062/1321] NFS/localio: Deal with page bases that are > PAGE_SIZE When resending requests, etc, the page base can quickly grow larger than the page size. Fixes: 091bdcfcece0 ("nfs/localio: refactor iocb and iov_iter_bvec initialization") Signed-off-by: Trond Myklebust Reviewed-by: Mike Snitzer --- fs/nfs/localio.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/fs/nfs/localio.c b/fs/nfs/localio.c index c884245e8fb87..41fbcb3f9167e 100644 --- a/fs/nfs/localio.c +++ b/fs/nfs/localio.c @@ -461,6 +461,8 @@ nfs_local_iters_init(struct nfs_local_kiocb *iocb, int rw) v = 0; total = hdr->args.count; base = hdr->args.pgbase; + pagevec += base >> PAGE_SHIFT; + base &= ~PAGE_MASK; while (total && v < hdr->page_array.npages) { len = min_t(size_t, total, PAGE_SIZE - base); bvec_set_page(&iocb->bvec[v], *pagevec, len, base); From f8201dc73bf409beb58ca1138a10c20d99db9d6c Mon Sep 17 00:00:00 2001 From: Anna Schumaker Date: Fri, 9 Jan 2026 15:41:14 -0500 Subject: [PATCH 1063/1321] NFS: Don't immediately return directory delegations when disabled The function nfs_inode_evict_delegation() immediately and synchronously returns a delegation when called. This means we can't call it from nfs4_have_delegation(), since that function could be called under a lock. Instead we should mark the delegation for return and let the state manager handle it for us. Fixes: b6d2a520f463 ("NFS: Add a module option to disable directory delegations") Signed-off-by: Anna Schumaker Signed-off-by: Trond Myklebust --- fs/nfs/delegation.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c index c9fa4c1f68fce..8a3857a49d847 100644 --- a/fs/nfs/delegation.c +++ b/fs/nfs/delegation.c @@ -149,7 +149,7 @@ static int nfs4_do_check_delegation(struct inode *inode, fmode_t type, int nfs4_have_delegation(struct inode *inode, fmode_t type, int flags) { if (S_ISDIR(inode->i_mode) && !directory_delegations) - nfs_inode_evict_delegation(inode); + nfs4_inode_set_return_delegation_on_close(inode); return nfs4_do_check_delegation(inode, type, flags, true); } From 08b8424f02c2af93d3dfcbe016e168e0f3b783f5 Mon Sep 17 00:00:00 2001 From: Trond Myklebust Date: Sat, 10 Jan 2026 18:53:34 -0500 Subject: [PATCH 1064/1321] NFS: Fix size read races in truncate, fallocate and copy offload If the pre-operation file size is read before locking the inode and quiescing O_DIRECT writes, then nfs_truncate_last_folio() might end up overwriting valid file data. Fixes: b1817b18ff20 ("NFS: Protect against 'eof page pollution'") Signed-off-by: Trond Myklebust --- fs/nfs/inode.c | 10 ++++++---- fs/nfs/io.c | 2 ++ fs/nfs/nfs42proc.c | 29 +++++++++++++++++++---------- 3 files changed, 27 insertions(+), 14 deletions(-) diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c index 84049f3cd3405..de2cce1d08f45 100644 --- a/fs/nfs/inode.c +++ b/fs/nfs/inode.c @@ -716,7 +716,7 @@ nfs_setattr(struct mnt_idmap *idmap, struct dentry *dentry, { struct inode *inode = d_inode(dentry); struct nfs_fattr *fattr; - loff_t oldsize = i_size_read(inode); + loff_t oldsize; int error = 0; kuid_t task_uid = current_fsuid(); kuid_t owner_uid = inode->i_uid; @@ -727,6 +727,10 @@ nfs_setattr(struct mnt_idmap *idmap, struct dentry *dentry, if (attr->ia_valid & (ATTR_KILL_SUID | ATTR_KILL_SGID)) attr->ia_valid &= ~ATTR_MODE; + if (S_ISREG(inode->i_mode)) + nfs_file_block_o_direct(NFS_I(inode)); + + oldsize = i_size_read(inode); if (attr->ia_valid & ATTR_SIZE) { BUG_ON(!S_ISREG(inode->i_mode)); @@ -774,10 +778,8 @@ nfs_setattr(struct mnt_idmap *idmap, struct dentry *dentry, trace_nfs_setattr_enter(inode); /* Write all dirty data */ - if (S_ISREG(inode->i_mode)) { - nfs_file_block_o_direct(NFS_I(inode)); + if (S_ISREG(inode->i_mode)) nfs_sync_inode(inode); - } fattr = nfs_alloc_fattr_with_label(NFS_SERVER(inode)); if (fattr == NULL) { diff --git a/fs/nfs/io.c b/fs/nfs/io.c index d275b0a250bf3..8337f0ae852d4 100644 --- a/fs/nfs/io.c +++ b/fs/nfs/io.c @@ -84,6 +84,7 @@ nfs_start_io_write(struct inode *inode) nfs_file_block_o_direct(NFS_I(inode)); return err; } +EXPORT_SYMBOL_GPL(nfs_start_io_write); /** * nfs_end_io_write - declare that the buffered write operation is done @@ -97,6 +98,7 @@ nfs_end_io_write(struct inode *inode) { up_write(&inode->i_rwsem); } +EXPORT_SYMBOL_GPL(nfs_end_io_write); /* Call with exclusively locked inode->i_rwsem */ static void nfs_block_buffered(struct nfs_inode *nfsi, struct inode *inode) diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c index d537fb0c230e8..c08520828708b 100644 --- a/fs/nfs/nfs42proc.c +++ b/fs/nfs/nfs42proc.c @@ -114,7 +114,6 @@ static int nfs42_proc_fallocate(struct rpc_message *msg, struct file *filep, exception.inode = inode; exception.state = lock->open_context->state; - nfs_file_block_o_direct(NFS_I(inode)); err = nfs_sync_inode(inode); if (err) goto out; @@ -138,13 +137,17 @@ int nfs42_proc_allocate(struct file *filep, loff_t offset, loff_t len) .rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_ALLOCATE], }; struct inode *inode = file_inode(filep); - loff_t oldsize = i_size_read(inode); + loff_t oldsize; int err; if (!nfs_server_capable(inode, NFS_CAP_ALLOCATE)) return -EOPNOTSUPP; - inode_lock(inode); + err = nfs_start_io_write(inode); + if (err) + return err; + + oldsize = i_size_read(inode); err = nfs42_proc_fallocate(&msg, filep, offset, len); @@ -155,7 +158,7 @@ int nfs42_proc_allocate(struct file *filep, loff_t offset, loff_t len) NFS_SERVER(inode)->caps &= ~(NFS_CAP_ALLOCATE | NFS_CAP_ZERO_RANGE); - inode_unlock(inode); + nfs_end_io_write(inode); return err; } @@ -170,7 +173,9 @@ int nfs42_proc_deallocate(struct file *filep, loff_t offset, loff_t len) if (!nfs_server_capable(inode, NFS_CAP_DEALLOCATE)) return -EOPNOTSUPP; - inode_lock(inode); + err = nfs_start_io_write(inode); + if (err) + return err; err = nfs42_proc_fallocate(&msg, filep, offset, len); if (err == 0) @@ -179,7 +184,7 @@ int nfs42_proc_deallocate(struct file *filep, loff_t offset, loff_t len) NFS_SERVER(inode)->caps &= ~(NFS_CAP_DEALLOCATE | NFS_CAP_ZERO_RANGE); - inode_unlock(inode); + nfs_end_io_write(inode); return err; } @@ -189,14 +194,17 @@ int nfs42_proc_zero_range(struct file *filep, loff_t offset, loff_t len) .rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_ZERO_RANGE], }; struct inode *inode = file_inode(filep); - loff_t oldsize = i_size_read(inode); + loff_t oldsize; int err; if (!nfs_server_capable(inode, NFS_CAP_ZERO_RANGE)) return -EOPNOTSUPP; - inode_lock(inode); + err = nfs_start_io_write(inode); + if (err) + return err; + oldsize = i_size_read(inode); err = nfs42_proc_fallocate(&msg, filep, offset, len); if (err == 0) { nfs_truncate_last_folio(inode->i_mapping, oldsize, @@ -205,7 +213,7 @@ int nfs42_proc_zero_range(struct file *filep, loff_t offset, loff_t len) } else if (err == -EOPNOTSUPP) NFS_SERVER(inode)->caps &= ~NFS_CAP_ZERO_RANGE; - inode_unlock(inode); + nfs_end_io_write(inode); return err; } @@ -416,7 +424,7 @@ static ssize_t _nfs42_proc_copy(struct file *src, struct nfs_server *src_server = NFS_SERVER(src_inode); loff_t pos_src = args->src_pos; loff_t pos_dst = args->dst_pos; - loff_t oldsize_dst = i_size_read(dst_inode); + loff_t oldsize_dst; size_t count = args->count; ssize_t status; @@ -461,6 +469,7 @@ static ssize_t _nfs42_proc_copy(struct file *src, &src_lock->open_context->state->flags); set_bit(NFS_CLNT_DST_SSC_COPY_STATE, &dst_lock->open_context->state->flags); + oldsize_dst = i_size_read(dst_inode); status = nfs4_call_sync(dst_server->client, dst_server, &msg, &args->seq_args, &res->seq_res, 0); From e92838a6be3bd71d297bd5b0cb177040c156f6d4 Mon Sep 17 00:00:00 2001 From: Guenter Roeck Date: Tue, 13 Jan 2026 07:22:42 -0800 Subject: [PATCH 1065/1321] ftrace: Do not over-allocate ftrace memory The pg_remaining calculation in ftrace_process_locs() assumes that ENTRIES_PER_PAGE multiplied by 2^order equals the actual capacity of the allocated page group. However, ENTRIES_PER_PAGE is PAGE_SIZE / ENTRY_SIZE (integer division). When PAGE_SIZE is not a multiple of ENTRY_SIZE (e.g. 4096 / 24 = 170 with remainder 16), high-order allocations (like 256 pages) have significantly more capacity than 256 * 170. This leads to pg_remaining being underestimated, which in turn makes skip (derived from skipped - pg_remaining) larger than expected, causing the WARN(skip != remaining) to trigger. Extra allocated pages for ftrace: 2 with 654 skipped WARNING: CPU: 0 PID: 0 at kernel/trace/ftrace.c:7295 ftrace_process_locs+0x5bf/0x5e0 A similar problem in ftrace_allocate_records() can result in allocating too many pages. This can trigger the second warning in ftrace_process_locs(). Extra allocated pages for ftrace WARNING: CPU: 0 PID: 0 at kernel/trace/ftrace.c:7276 ftrace_process_locs+0x548/0x580 Use the actual capacity of a page group to determine the number of pages to allocate. Have ftrace_allocate_pages() return the number of allocated pages to avoid having to calculate it. Use the actual page group capacity when validating the number of unused pages due to skipped entries. Drop the definition of ENTRIES_PER_PAGE since it is no longer used. Cc: stable@vger.kernel.org Fixes: 4a3efc6baff93 ("ftrace: Update the mcount_loc check of skipped entries") Link: https://patch.msgid.link/20260113152243.3557219-1-linux@roeck-us.net Signed-off-by: Guenter Roeck Signed-off-by: Steven Rostedt (Google) --- kernel/trace/ftrace.c | 29 +++++++++++++++-------------- 1 file changed, 15 insertions(+), 14 deletions(-) diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c index ef2d5dca6f70c..aa758efc37314 100644 --- a/kernel/trace/ftrace.c +++ b/kernel/trace/ftrace.c @@ -1148,7 +1148,6 @@ struct ftrace_page { }; #define ENTRY_SIZE sizeof(struct dyn_ftrace) -#define ENTRIES_PER_PAGE (PAGE_SIZE / ENTRY_SIZE) static struct ftrace_page *ftrace_pages_start; static struct ftrace_page *ftrace_pages; @@ -3834,7 +3833,8 @@ static int ftrace_update_code(struct module *mod, struct ftrace_page *new_pgs) return 0; } -static int ftrace_allocate_records(struct ftrace_page *pg, int count) +static int ftrace_allocate_records(struct ftrace_page *pg, int count, + unsigned long *num_pages) { int order; int pages; @@ -3844,7 +3844,7 @@ static int ftrace_allocate_records(struct ftrace_page *pg, int count) return -EINVAL; /* We want to fill as much as possible, with no empty pages */ - pages = DIV_ROUND_UP(count, ENTRIES_PER_PAGE); + pages = DIV_ROUND_UP(count * ENTRY_SIZE, PAGE_SIZE); order = fls(pages) - 1; again: @@ -3859,6 +3859,7 @@ static int ftrace_allocate_records(struct ftrace_page *pg, int count) } ftrace_number_of_pages += 1 << order; + *num_pages += 1 << order; ftrace_number_of_groups++; cnt = (PAGE_SIZE << order) / ENTRY_SIZE; @@ -3887,12 +3888,14 @@ static void ftrace_free_pages(struct ftrace_page *pages) } static struct ftrace_page * -ftrace_allocate_pages(unsigned long num_to_init) +ftrace_allocate_pages(unsigned long num_to_init, unsigned long *num_pages) { struct ftrace_page *start_pg; struct ftrace_page *pg; int cnt; + *num_pages = 0; + if (!num_to_init) return NULL; @@ -3906,7 +3909,7 @@ ftrace_allocate_pages(unsigned long num_to_init) * waste as little space as possible. */ for (;;) { - cnt = ftrace_allocate_records(pg, num_to_init); + cnt = ftrace_allocate_records(pg, num_to_init, num_pages); if (cnt < 0) goto free_pages; @@ -7192,8 +7195,6 @@ static int ftrace_process_locs(struct module *mod, if (!count) return 0; - pages = DIV_ROUND_UP(count, ENTRIES_PER_PAGE); - /* * Sorting mcount in vmlinux at build time depend on * CONFIG_BUILDTIME_MCOUNT_SORT, while mcount loc in @@ -7206,7 +7207,7 @@ static int ftrace_process_locs(struct module *mod, test_is_sorted(start, count); } - start_pg = ftrace_allocate_pages(count); + start_pg = ftrace_allocate_pages(count, &pages); if (!start_pg) return -ENOMEM; @@ -7305,27 +7306,27 @@ static int ftrace_process_locs(struct module *mod, /* We should have used all pages unless we skipped some */ if (pg_unuse) { unsigned long pg_remaining, remaining = 0; - unsigned long skip; + long skip; /* Count the number of entries unused and compare it to skipped. */ - pg_remaining = (ENTRIES_PER_PAGE << pg->order) - pg->index; + pg_remaining = (PAGE_SIZE << pg->order) / ENTRY_SIZE - pg->index; if (!WARN(skipped < pg_remaining, "Extra allocated pages for ftrace")) { skip = skipped - pg_remaining; - for (pg = pg_unuse; pg; pg = pg->next) + for (pg = pg_unuse; pg && skip > 0; pg = pg->next) { remaining += 1 << pg->order; + skip -= (PAGE_SIZE << pg->order) / ENTRY_SIZE; + } pages -= remaining; - skip = DIV_ROUND_UP(skip, ENTRIES_PER_PAGE); - /* * Check to see if the number of pages remaining would * just fit the number of entries skipped. */ - WARN(skip != remaining, "Extra allocated pages for ftrace: %lu with %lu skipped", + WARN(pg || skip > 0, "Extra allocated pages for ftrace: %lu with %lu skipped", remaining, skipped); } /* Need to synchronize with ftrace_location_range() */ From fa4f827c7dc0585e42fa0fc854ce1b09a8ab76d8 Mon Sep 17 00:00:00 2001 From: Tim Bird Date: Thu, 15 Jan 2026 17:04:31 -0700 Subject: [PATCH 1066/1321] kernel: modules: Add SPDX license identifier to kmod.c Add a GPL-2.0 license identifier line for this file. kmod.c was originally introduced in the kernel in February of 1998 by Linus Torvalds - who was familiar with kernel licensing at the time this was introduced. Signed-off-by: Tim Bird Signed-off-by: Linus Torvalds --- kernel/module/kmod.c | 1 + 1 file changed, 1 insertion(+) diff --git a/kernel/module/kmod.c b/kernel/module/kmod.c index 25f2538125128..a25dccdf7aa75 100644 --- a/kernel/module/kmod.c +++ b/kernel/module/kmod.c @@ -1,3 +1,4 @@ +// SPDX-License-Identifier: GPL-2.0 /* * kmod - the kernel module loader * From f5d3f32ae2b311f0a6370a6b0f82e01ba40bd935 Mon Sep 17 00:00:00 2001 From: Dan Carpenter Date: Fri, 19 Dec 2025 13:33:50 +0300 Subject: [PATCH 1067/1321] xfs: fix memory leak in xfs_growfs_check_rtgeom() Free the "nmp" allocation before returning -EINVAL. Fixes: dc68c0f60169 ("xfs: fix the zoned RT growfs check for zone alignment") Signed-off-by: Dan Carpenter Reviewed-by: Christoph Hellwig Signed-off-by: Carlos Maiolino --- fs/xfs/xfs_rtalloc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c index e063f4f2f2e61..167298ad88dde 100644 --- a/fs/xfs/xfs_rtalloc.c +++ b/fs/xfs/xfs_rtalloc.c @@ -1265,7 +1265,7 @@ xfs_growfs_check_rtgeom( uint32_t rem; if (rextsize != 1) - return -EINVAL; + goto out_inval; div_u64_rem(nmp->m_sb.sb_rblocks, gblocks, &rem); if (rem) { xfs_warn(mp, From fa6e237dc6dfcda2d8a7e50e448cf243ef56849a Mon Sep 17 00:00:00 2001 From: "Nirjhar Roy (IBM)" Date: Mon, 12 Jan 2026 15:35:23 +0530 Subject: [PATCH 1068/1321] xfs: Fix the return value of xfs_rtcopy_summary() xfs_rtcopy_summary() should return the appropriate error code instead of always returning 0. The caller of this function which is xfs_growfs_rt_bmblock() is already handling the error. Fixes: e94b53ff699c ("xfs: cache last bitmap block in realtime allocator") Signed-off-by: Nirjhar Roy (IBM) Reviewed-by: Darrick J. Wong Reviewed-by: Christoph Hellwig Cc: stable@vger.kernel.org # v6.7 Signed-off-by: Carlos Maiolino --- fs/xfs/xfs_rtalloc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c index 167298ad88dde..202dcd2f40395 100644 --- a/fs/xfs/xfs_rtalloc.c +++ b/fs/xfs/xfs_rtalloc.c @@ -126,7 +126,7 @@ xfs_rtcopy_summary( error = 0; out: xfs_rtbuf_cache_relse(oargs); - return 0; + return error; } /* * Mark an extent specified by start and len allocated. From 21232e064057e6f7334f5cd060e5aa83c21000d5 Mon Sep 17 00:00:00 2001 From: Christoph Hellwig Date: Fri, 9 Jan 2026 16:18:53 +0100 Subject: [PATCH 1069/1321] xfs: mark __xfs_rtgroup_extents static __xfs_rtgroup_extents is not used outside of xfs_rtgroup.c, so mark it static. Move it and xfs_rtgroup_extents up in the file to avoid forward declarations. Signed-off-by: Christoph Hellwig Reviewed-by: Darrick J. Wong Signed-off-by: Carlos Maiolino --- fs/xfs/libxfs/xfs_rtgroup.c | 50 ++++++++++++++++++------------------- fs/xfs/libxfs/xfs_rtgroup.h | 2 -- 2 files changed, 25 insertions(+), 27 deletions(-) diff --git a/fs/xfs/libxfs/xfs_rtgroup.c b/fs/xfs/libxfs/xfs_rtgroup.c index 9186c58e83d50..5a3d0dc6ae1b5 100644 --- a/fs/xfs/libxfs/xfs_rtgroup.c +++ b/fs/xfs/libxfs/xfs_rtgroup.c @@ -48,6 +48,31 @@ xfs_rtgroup_min_block( return 0; } +/* Compute the number of rt extents in this realtime group. */ +static xfs_rtxnum_t +__xfs_rtgroup_extents( + struct xfs_mount *mp, + xfs_rgnumber_t rgno, + xfs_rgnumber_t rgcount, + xfs_rtbxlen_t rextents) +{ + ASSERT(rgno < rgcount); + if (rgno == rgcount - 1) + return rextents - ((xfs_rtxnum_t)rgno * mp->m_sb.sb_rgextents); + + ASSERT(xfs_has_rtgroups(mp)); + return mp->m_sb.sb_rgextents; +} + +xfs_rtxnum_t +xfs_rtgroup_extents( + struct xfs_mount *mp, + xfs_rgnumber_t rgno) +{ + return __xfs_rtgroup_extents(mp, rgno, mp->m_sb.sb_rgcount, + mp->m_sb.sb_rextents); +} + /* Precompute this group's geometry */ void xfs_rtgroup_calc_geometry( @@ -136,31 +161,6 @@ xfs_initialize_rtgroups( return error; } -/* Compute the number of rt extents in this realtime group. */ -xfs_rtxnum_t -__xfs_rtgroup_extents( - struct xfs_mount *mp, - xfs_rgnumber_t rgno, - xfs_rgnumber_t rgcount, - xfs_rtbxlen_t rextents) -{ - ASSERT(rgno < rgcount); - if (rgno == rgcount - 1) - return rextents - ((xfs_rtxnum_t)rgno * mp->m_sb.sb_rgextents); - - ASSERT(xfs_has_rtgroups(mp)); - return mp->m_sb.sb_rgextents; -} - -xfs_rtxnum_t -xfs_rtgroup_extents( - struct xfs_mount *mp, - xfs_rgnumber_t rgno) -{ - return __xfs_rtgroup_extents(mp, rgno, mp->m_sb.sb_rgcount, - mp->m_sb.sb_rextents); -} - /* * Update the rt extent count of the previous tail rtgroup if it changed during * recovery (i.e. recovery of a growfs). diff --git a/fs/xfs/libxfs/xfs_rtgroup.h b/fs/xfs/libxfs/xfs_rtgroup.h index 03f1e2493334f..73cace4d25c79 100644 --- a/fs/xfs/libxfs/xfs_rtgroup.h +++ b/fs/xfs/libxfs/xfs_rtgroup.h @@ -285,8 +285,6 @@ void xfs_free_rtgroups(struct xfs_mount *mp, xfs_rgnumber_t first_rgno, int xfs_initialize_rtgroups(struct xfs_mount *mp, xfs_rgnumber_t first_rgno, xfs_rgnumber_t end_rgno, xfs_rtbxlen_t rextents); -xfs_rtxnum_t __xfs_rtgroup_extents(struct xfs_mount *mp, xfs_rgnumber_t rgno, - xfs_rgnumber_t rgcount, xfs_rtbxlen_t rextents); xfs_rtxnum_t xfs_rtgroup_extents(struct xfs_mount *mp, xfs_rgnumber_t rgno); void xfs_rtgroup_calc_geometry(struct xfs_mount *mp, struct xfs_rtgroup *rtg, xfs_rgnumber_t rgno, xfs_rgnumber_t rgcount, From 01f1aeb7eb1c49ad9dbe98a5d009ee1d314a72a8 Mon Sep 17 00:00:00 2001 From: Christoph Hellwig Date: Fri, 9 Jan 2026 16:18:54 +0100 Subject: [PATCH 1070/1321] xfs: fix an overly long line in xfs_rtgroup_calc_geometry Signed-off-by: Christoph Hellwig Reviewed-by: Darrick J. Wong Signed-off-by: Carlos Maiolino --- fs/xfs/libxfs/xfs_rtgroup.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/xfs/libxfs/xfs_rtgroup.c b/fs/xfs/libxfs/xfs_rtgroup.c index 5a3d0dc6ae1b5..be16efaa69251 100644 --- a/fs/xfs/libxfs/xfs_rtgroup.c +++ b/fs/xfs/libxfs/xfs_rtgroup.c @@ -83,7 +83,8 @@ xfs_rtgroup_calc_geometry( xfs_rtbxlen_t rextents) { rtg->rtg_extents = __xfs_rtgroup_extents(mp, rgno, rgcount, rextents); - rtg_group(rtg)->xg_block_count = rtg->rtg_extents * mp->m_sb.sb_rextsize; + rtg_group(rtg)->xg_block_count = + rtg->rtg_extents * mp->m_sb.sb_rextsize; rtg_group(rtg)->xg_min_gbno = xfs_rtgroup_min_block(mp, rgno); } From f86b6fc6a0e3909c701dd6f19fb8d0bc5cc6b0a4 Mon Sep 17 00:00:00 2001 From: Christoph Hellwig Date: Fri, 9 Jan 2026 16:18:21 +0100 Subject: [PATCH 1071/1321] xfs: improve the assert at the top of xfs_log_cover Move each condition into a separate assert so that we can see which on triggered. Signed-off-by: Christoph Hellwig Reviewed-by: Darrick J. Wong Signed-off-by: Carlos Maiolino --- fs/xfs/xfs_log.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c index a311385b23d8b..d4544ccafea5d 100644 --- a/fs/xfs/xfs_log.c +++ b/fs/xfs/xfs_log.c @@ -1180,9 +1180,11 @@ xfs_log_cover( int error = 0; bool need_covered; - ASSERT((xlog_cil_empty(mp->m_log) && xlog_iclogs_empty(mp->m_log) && - !xfs_ail_min_lsn(mp->m_log->l_ailp)) || - xlog_is_shutdown(mp->m_log)); + if (!xlog_is_shutdown(mp->m_log)) { + ASSERT(xlog_cil_empty(mp->m_log)); + ASSERT(xlog_iclogs_empty(mp->m_log)); + ASSERT(!xfs_ail_min_lsn(mp->m_log->l_ailp)); + } if (!xfs_log_writable(mp)) return 0; From e6533b21e8cfa8070ea4d1c0cc665e45197ac37c Mon Sep 17 00:00:00 2001 From: "Nirjhar Roy (IBM)" Date: Mon, 12 Jan 2026 13:54:02 +0530 Subject: [PATCH 1072/1321] xfs: Fix xfs_grow_last_rtg() The last rtg should be able to grow when the size of the last is less than (and not equal to) sb_rgextents. xfs_growfs with realtime groups fails without this patch. The reason is that, xfs_growfs_rtg() tries to grow the last rt group even when the last rt group is at its maximal size i.e, sb_rgextents. It fails with the following messages: XFS (loop0): Internal error block >= mp->m_rsumblocks at line 253 of file fs/xfs/libxfs/xfs_rtbitmap.c. Caller xfs_rtsummary_read_buf+0x20/0x80 XFS (loop0): Corruption detected. Unmount and run xfs_repair XFS (loop0): Internal error xfs_trans_cancel at line 976 of file fs/xfs/xfs_trans.c. Caller xfs_growfs_rt_bmblock+0x402/0x450 XFS (loop0): Corruption of in-memory data (0x8) detected at xfs_trans_cancel+0x10a/0x1f0 (fs/xfs/xfs_trans.c:977). Shutting down filesystem. XFS (loop0): Please unmount the filesystem and rectify the problem(s) Signed-off-by: Nirjhar Roy (IBM) Reviewed-by: Christoph Hellwig Reviewed-by: Darrick J. Wong Signed-off-by: Carlos Maiolino --- fs/xfs/xfs_rtalloc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c index 202dcd2f40395..a12ffed123919 100644 --- a/fs/xfs/xfs_rtalloc.c +++ b/fs/xfs/xfs_rtalloc.c @@ -1326,7 +1326,7 @@ xfs_grow_last_rtg( return true; if (mp->m_sb.sb_rgcount == 0) return false; - return xfs_rtgroup_extents(mp, mp->m_sb.sb_rgcount - 1) <= + return xfs_rtgroup_extents(mp, mp->m_sb.sb_rgcount - 1) < mp->m_sb.sb_rgextents; } From 18a99c0cf75cb2dae2e2b472e46aae4e133dab95 Mon Sep 17 00:00:00 2001 From: Brian Foster Date: Fri, 9 Jan 2026 12:49:05 -0500 Subject: [PATCH 1073/1321] xfs: set max_agbno to allow sparse alloc of last full inode chunk Sparse inode cluster allocation sets min/max agbno values to avoid allocating an inode cluster that might map to an invalid inode chunk. For example, we can't have an inode record mapped to agbno 0 or that extends past the end of a runt AG of misaligned size. The initial calculation of max_agbno is unnecessarily conservative, however. This has triggered a corner case allocation failure where a small runt AG (i.e. 2063 blocks) is mostly full save for an extent to the EOFS boundary: [2050,13]. max_agbno is set to 2048 in this case, which happens to be the offset of the last possible valid inode chunk in the AG. In practice, we should be able to allocate the 4-block cluster at agbno 2052 to map to the parent inode record at agbno 2048, but the max_agbno value precludes it. Note that this can result in filesystem shutdown via dirty trans cancel on stable kernels prior to commit 9eb775968b68 ("xfs: walk all AGs if TRYLOCK passed to xfs_alloc_vextent_iterate_ags") because the tail AG selection by the allocator sets t_highest_agno on the transaction. If the inode allocator spins around and finds an inode chunk with free inodes in an earlier AG, the subsequent dir name creation path may still fail to allocate due to the AG restriction and cancel. To avoid this problem, update the max_agbno calculation to the agbno prior to the last chunk aligned agbno in the AG. This is not necessarily the last valid allocation target for a sparse chunk, but since inode chunks (i.e. records) are chunk aligned and sparse allocs are cluster sized/aligned, this allows the sb_spino_align alignment restriction to take over and round down the max effective agbno to within the last valid inode chunk in the AG. Note that even though the allocator improvements in the aforementioned commit seem to avoid this particular dirty trans cancel situation, the max_agbno logic improvement still applies as we should be able to allocate from an AG that has been appropriately selected. The more important target for this patch however are older/stable kernels prior to this allocator rework/improvement. Cc: stable@vger.kernel.org # v4.2 Fixes: 56d1115c9bc7 ("xfs: allocate sparse inode chunks on full chunk allocation failure") Signed-off-by: Brian Foster Reviewed-by: Darrick J. Wong Signed-off-by: Carlos Maiolino --- fs/xfs/libxfs/xfs_ialloc.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/fs/xfs/libxfs/xfs_ialloc.c b/fs/xfs/libxfs/xfs_ialloc.c index d97295eaebe63..c19d6d713780c 100644 --- a/fs/xfs/libxfs/xfs_ialloc.c +++ b/fs/xfs/libxfs/xfs_ialloc.c @@ -848,15 +848,16 @@ xfs_ialloc_ag_alloc( * invalid inode records, such as records that start at agbno 0 * or extend beyond the AG. * - * Set min agbno to the first aligned, non-zero agbno and max to - * the last aligned agbno that is at least one full chunk from - * the end of the AG. + * Set min agbno to the first chunk aligned, non-zero agbno and + * max to one less than the last chunk aligned agbno from the + * end of the AG. We subtract 1 from max so that the cluster + * allocation alignment takes over and allows allocation within + * the last full inode chunk in the AG. */ args.min_agbno = args.mp->m_sb.sb_inoalignmt; args.max_agbno = round_down(xfs_ag_block_count(args.mp, pag_agno(pag)), - args.mp->m_sb.sb_inoalignmt) - - igeo->ialloc_blks; + args.mp->m_sb.sb_inoalignmt) - 1; error = xfs_alloc_vextent_near_bno(&args, xfs_agbno_to_fsb(pag, From acd138575251cef6e8612a6f8f5eb2eea64b32a1 Mon Sep 17 00:00:00 2001 From: Petr Mladek Date: Fri, 12 Dec 2025 13:45:20 +0100 Subject: [PATCH 1074/1321] printk/nbcon: Restore IRQ in atomic flush after each emitted record The commit d5d399efff6577 ("printk/nbcon: Release nbcon consoles ownership in atomic flush after each emitted record") prevented stall of a CPU which lost nbcon console ownership because another CPU entered an emergency flush. But there is still the problem that the CPU doing the emergency flush might cause a stall on its own. Let's go even further and restore IRQ in the atomic flush after each emitted record. It is not a complete solution. The interrupts and/or scheduling might still be blocked when the emergency atomic flush was called with IRQs and/or scheduling disabled. But it should remove the following lockup: mlx5_core 0000:03:00.0: Shutdown was called kvm: exiting hardware virtualization arm-smmu-v3 arm-smmu-v3.10.auto: CMD_SYNC timeout at 0x00000103 [hwprod 0x00000104, hwcons 0x00000102] smp: csd: Detected non-responsive CSD lock (#1) on CPU#4, waiting 5000000032 ns for CPU#00 do_nothing (kernel/smp.c:1057) smp: csd: CSD lock (#1) unresponsive. [...] Call trace: pl011_console_write_atomic (./arch/arm64/include/asm/vdso/processor.h:12 drivers/tty/serial/amba-pl011.c:2540) (P) nbcon_emit_next_record (kernel/printk/nbcon.c:1049) __nbcon_atomic_flush_pending_con (kernel/printk/nbcon.c:1517) __nbcon_atomic_flush_pending.llvm.15488114865160659019 (./arch/arm64/include/asm/alternative-macros.h:254 ./arch/arm64/include/asm/cpufeature.h:808 ./arch/arm64/include/asm/irqflags.h:192 kernel/printk/nbcon.c:1562 kernel/printk/nbcon.c:1612) nbcon_atomic_flush_pending (kernel/printk/nbcon.c:1629) printk_kthreads_shutdown (kernel/printk/printk.c:?) syscore_shutdown (drivers/base/syscore.c:120) kernel_kexec (kernel/kexec_core.c:1045) __arm64_sys_reboot (kernel/reboot.c:794 kernel/reboot.c:722 kernel/reboot.c:722) invoke_syscall (arch/arm64/kernel/syscall.c:50) el0_svc_common.llvm.14158405452757855239 (arch/arm64/kernel/syscall.c:?) do_el0_svc (arch/arm64/kernel/syscall.c:152) el0_svc (./arch/arm64/include/asm/alternative-macros.h:254 ./arch/arm64/include/asm/cpufeature.h:808 ./arch/arm64/include/asm/irqflags.h:73 arch/arm64/kernel/entry-common.c:169 arch/arm64/kernel/entry-common.c:182 arch/arm64/kernel/entry-common.c:749) el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:820) el0t_64_sync (arch/arm64/kernel/entry.S:600) In this case, nbcon_atomic_flush_pending() is called from printk_kthreads_shutdown() with IRQs and scheduling enabled. Note that __nbcon_atomic_flush_pending_con() is directly called also from nbcon_device_release() where the disabled IRQs might break PREEMPT_RT guarantees. But the atomic flush is called only in emergency or panic situations where the latencies are irrelevant anyway. An ultimate solution would be a touching of watchdogs. But it would hide all problems. Let's do it later when anyone reports a stall which does not have a better solution. Closes: https://lore.kernel.org/r/sqwajvt7utnt463tzxgwu2yctyn5m6bjwrslsnupfexeml6hkd@v6sqmpbu3vvu Tested-by: Breno Leitao Reviewed-by: John Ogness Link: https://patch.msgid.link/20251212124520.244483-1-pmladek@suse.com Signed-off-by: Petr Mladek --- kernel/printk/nbcon.c | 38 ++++++++++++++++++-------------------- 1 file changed, 18 insertions(+), 20 deletions(-) diff --git a/kernel/printk/nbcon.c b/kernel/printk/nbcon.c index 3fa403f9831f0..32fc12e536752 100644 --- a/kernel/printk/nbcon.c +++ b/kernel/printk/nbcon.c @@ -1557,18 +1557,27 @@ static int __nbcon_atomic_flush_pending_con(struct console *con, u64 stop_seq) ctxt->allow_unsafe_takeover = nbcon_allow_unsafe_takeover(); while (nbcon_seq_read(con) < stop_seq) { - if (!nbcon_context_try_acquire(ctxt, false)) - return -EPERM; - /* - * nbcon_emit_next_record() returns false when the console was - * handed over or taken over. In both cases the context is no - * longer valid. + * Atomic flushing does not use console driver synchronization + * (i.e. it does not hold the port lock for uart consoles). + * Therefore IRQs must be disabled to avoid being interrupted + * and then calling into a driver that will deadlock trying + * to acquire console ownership. */ - if (!nbcon_emit_next_record(&wctxt, true)) - return -EAGAIN; + scoped_guard(irqsave) { + if (!nbcon_context_try_acquire(ctxt, false)) + return -EPERM; - nbcon_context_release(ctxt); + /* + * nbcon_emit_next_record() returns false when + * the console was handed over or taken over. + * In both cases the context is no longer valid. + */ + if (!nbcon_emit_next_record(&wctxt, true)) + return -EAGAIN; + + nbcon_context_release(ctxt); + } if (!ctxt->backlog) { /* Are there reserved but not yet finalized records? */ @@ -1595,22 +1604,11 @@ static int __nbcon_atomic_flush_pending_con(struct console *con, u64 stop_seq) static void nbcon_atomic_flush_pending_con(struct console *con, u64 stop_seq) { struct console_flush_type ft; - unsigned long flags; int err; again: - /* - * Atomic flushing does not use console driver synchronization (i.e. - * it does not hold the port lock for uart consoles). Therefore IRQs - * must be disabled to avoid being interrupted and then calling into - * a driver that will deadlock trying to acquire console ownership. - */ - local_irq_save(flags); - err = __nbcon_atomic_flush_pending_con(con, stop_seq); - local_irq_restore(flags); - /* * If there was a new owner (-EPERM, -EAGAIN), that context is * responsible for completing. From cf92ce7282c9e4c7887207f54700be38abceda18 Mon Sep 17 00:00:00 2001 From: Bartosz Golaszewski Date: Fri, 9 Jan 2026 14:08:32 +0100 Subject: [PATCH 1075/1321] gpio: davinci: implement .get_direction() It's strongly recommended for GPIO drivers to always implement the .get_direction() callback - even for fixed-direction controllers. GPIO core will even emit a warning if the callback is missing, when users try to read the direction of a pin. Implement .get_direction() for gpio-davinci. Reported-by: Michael Walle Closes: https://lore.kernel.org/all/DFJAFK3DTBOZ.3G2P3A5IH34GF@kernel.org/ Reviewed-by: Linus Walleij Fixes: a060b8c511ab ("gpiolib: implement low-level, shared GPIO support") Tested-by: Michael Walle # on sa67 Link: https://lore.kernel.org/r/20260109130832.27326-1-bartosz.golaszewski@oss.qualcomm.com Signed-off-by: Bartosz Golaszewski --- drivers/gpio/gpio-davinci.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/drivers/gpio/gpio-davinci.c b/drivers/gpio/gpio-davinci.c index 538f27209ce71..97780f27ce5b5 100644 --- a/drivers/gpio/gpio-davinci.c +++ b/drivers/gpio/gpio-davinci.c @@ -6,6 +6,7 @@ * Copyright (c) 2007, MontaVista Software, Inc. */ +#include #include #include #include @@ -109,6 +110,22 @@ davinci_direction_out(struct gpio_chip *chip, unsigned offset, int value) return __davinci_direction(chip, offset, true, value); } +static int davinci_get_direction(struct gpio_chip *chip, unsigned int offset) +{ + struct davinci_gpio_controller *d = gpiochip_get_data(chip); + struct davinci_gpio_regs __iomem *g; + u32 mask = __gpio_mask(offset), val; + int bank = offset / 32; + + g = d->regs[bank]; + + guard(spinlock_irqsave)(&d->lock); + + val = readl_relaxed(&g->dir); + + return (val & mask) ? GPIO_LINE_DIRECTION_IN : GPIO_LINE_DIRECTION_OUT; +} + /* * Read the pin's value (works even if it's set up as output); * returns zero/nonzero. @@ -203,6 +220,7 @@ static int davinci_gpio_probe(struct platform_device *pdev) chips->chip.get = davinci_gpio_get; chips->chip.direction_output = davinci_direction_out; chips->chip.set = davinci_gpio_set; + chips->chip.get_direction = davinci_get_direction; chips->chip.ngpio = ngpio; chips->chip.base = -1; From f0f6ac2f0f93aded9eaaceea468ff32c885c4779 Mon Sep 17 00:00:00 2001 From: Bartosz Golaszewski Date: Fri, 9 Jan 2026 11:55:56 +0100 Subject: [PATCH 1076/1321] gpiolib: remove redundant callback check The presence of the .get_direction() callback is already checked in gpiochip_get_direction(). Remove the duplicated check which also returns the wrong error code to user-space. Fixes: e623c4303ed1 ("gpiolib: sanitize the return value of gpio_chip::get_direction()") Reported-by: Michael Walle Closes: https://lore.kernel.org/all/DFJAFK3DTBOZ.3G2P3A5IH34GF@kernel.org/ Link: https://lore.kernel.org/r/20260109105557.20024-1-bartosz.golaszewski@oss.qualcomm.com Signed-off-by: Bartosz Golaszewski --- drivers/gpio/gpiolib.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c index dcf427d3cf437..fe2d107b0a844 100644 --- a/drivers/gpio/gpiolib.c +++ b/drivers/gpio/gpiolib.c @@ -468,9 +468,6 @@ int gpiod_get_direction(struct gpio_desc *desc) test_bit(GPIOD_FLAG_IS_OUT, &flags)) return 0; - if (!guard.gc->get_direction) - return -ENOTSUPP; - ret = gpiochip_get_direction(guard.gc, offset); if (ret < 0) return ret; From 1d93bbfda88816dc9b4050caa6b77c2177b2f080 Mon Sep 17 00:00:00 2001 From: Jaroslav Kysela Date: Wed, 7 Jan 2026 22:36:42 +0100 Subject: [PATCH 1077/1321] ALSA: pcm: Improve the fix for race of buffer access at PCM OSS layer Handle the error code from snd_pcm_buffer_access_lock() in snd_pcm_runtime_buffer_set_silence() function. Found by Alexandros Panagiotou Fixes: 93a81ca06577 ("ALSA: pcm: Fix race of buffer access at PCM OSS layer") Cc: stable@vger.kernel.org # 6.15 Signed-off-by: Jaroslav Kysela Link: https://patch.msgid.link/20260107213642.332954-1-perex@perex.cz Signed-off-by: Takashi Iwai --- include/sound/pcm.h | 2 +- sound/core/oss/pcm_oss.c | 4 +++- sound/core/pcm_native.c | 9 +++++++-- 3 files changed, 11 insertions(+), 4 deletions(-) diff --git a/include/sound/pcm.h b/include/sound/pcm.h index 58fd6e84f9613..a7860c047503a 100644 --- a/include/sound/pcm.h +++ b/include/sound/pcm.h @@ -1402,7 +1402,7 @@ int snd_pcm_lib_mmap_iomem(struct snd_pcm_substream *substream, struct vm_area_s #define snd_pcm_lib_mmap_iomem NULL #endif -void snd_pcm_runtime_buffer_set_silence(struct snd_pcm_runtime *runtime); +int snd_pcm_runtime_buffer_set_silence(struct snd_pcm_runtime *runtime); /** * snd_pcm_limit_isa_dma_size - Get the max size fitting with ISA DMA transfer diff --git a/sound/core/oss/pcm_oss.c b/sound/core/oss/pcm_oss.c index a82dd155e1d3a..b12df5b5ddfc1 100644 --- a/sound/core/oss/pcm_oss.c +++ b/sound/core/oss/pcm_oss.c @@ -1074,7 +1074,9 @@ static int snd_pcm_oss_change_params_locked(struct snd_pcm_substream *substream) runtime->oss.params = 0; runtime->oss.prepare = 1; runtime->oss.buffer_used = 0; - snd_pcm_runtime_buffer_set_silence(runtime); + err = snd_pcm_runtime_buffer_set_silence(runtime); + if (err < 0) + goto failure; runtime->oss.period_frames = snd_pcm_alsa_frames(substream, oss_period_size); diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c index 68bee40c9adaf..932a9bf98cbc0 100644 --- a/sound/core/pcm_native.c +++ b/sound/core/pcm_native.c @@ -730,13 +730,18 @@ static void snd_pcm_buffer_access_unlock(struct snd_pcm_runtime *runtime) } /* fill the PCM buffer with the current silence format; called from pcm_oss.c */ -void snd_pcm_runtime_buffer_set_silence(struct snd_pcm_runtime *runtime) +int snd_pcm_runtime_buffer_set_silence(struct snd_pcm_runtime *runtime) { - snd_pcm_buffer_access_lock(runtime); + int err; + + err = snd_pcm_buffer_access_lock(runtime); + if (err < 0) + return err; if (runtime->dma_area) snd_pcm_format_set_silence(runtime->format, runtime->dma_area, bytes_to_samples(runtime, runtime->dma_bytes)); snd_pcm_buffer_access_unlock(runtime); + return 0; } EXPORT_SYMBOL_GPL(snd_pcm_runtime_buffer_set_silence); From fc627a3442866150f57b36bddef71f533343b0ca Mon Sep 17 00:00:00 2001 From: Matthew Schwartz Date: Thu, 8 Jan 2026 01:36:50 -0800 Subject: [PATCH 1078/1321] ALSA: hda/tas2781: Skip UEFI calibration on ASUS ROG Xbox Ally X There is currently an issue with UEFI calibration data parsing for some TAS devices, like the ASUS ROG Xbox Ally X (RC73XA), that causes audio quality issues such as gaps in playback. Until the issue is root caused and fixed, add a quirk to skip using the UEFI calibration data and fall back to using the calibration data provided by the DSP firmware, which restores full speaker functionality on affected devices. Cc: stable@vger.kernel.org # 6.18 Link: https://lore.kernel.org/all/160aef32646c4d5498cbfd624fd683cc@ti.com/ Closes: https://lore.kernel.org/all/0ba100d0-9b6f-4a3b-bffa-61abe1b46cd5@linux.dev/ Suggested-by: Baojun Xu Signed-off-by: Matthew Schwartz Reviewed-by: Antheas Kapenekakis Link: https://patch.msgid.link/20260108093650.1142176-1-matthew.schwartz@linux.dev Signed-off-by: Takashi Iwai --- sound/hda/codecs/side-codecs/tas2781_hda_i2c.c | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/sound/hda/codecs/side-codecs/tas2781_hda_i2c.c b/sound/hda/codecs/side-codecs/tas2781_hda_i2c.c index f7a7f216d5865..0e4bda3a544ea 100644 --- a/sound/hda/codecs/side-codecs/tas2781_hda_i2c.c +++ b/sound/hda/codecs/side-codecs/tas2781_hda_i2c.c @@ -60,6 +60,7 @@ struct tas2781_hda_i2c_priv { int (*save_calibration)(struct tas2781_hda *h); int hda_chip_id; + bool skip_calibration; }; static int tas2781_get_i2c_res(struct acpi_resource *ares, void *data) @@ -491,7 +492,8 @@ static void tasdevice_dspfw_init(void *context) /* If calibrated data occurs error, dsp will still works with default * calibrated data inside algo. */ - hda_priv->save_calibration(tas_hda); + if (!hda_priv->skip_calibration) + hda_priv->save_calibration(tas_hda); } static void tasdev_fw_ready(const struct firmware *fmw, void *context) @@ -548,6 +550,7 @@ static int tas2781_hda_bind(struct device *dev, struct device *master, void *master_data) { struct tas2781_hda *tas_hda = dev_get_drvdata(dev); + struct tas2781_hda_i2c_priv *hda_priv = tas_hda->hda_priv; struct hda_component_parent *parent = master_data; struct hda_component *comp; struct hda_codec *codec; @@ -573,6 +576,14 @@ static int tas2781_hda_bind(struct device *dev, struct device *master, break; } + /* + * Using ASUS ROG Xbox Ally X (RC73XA) UEFI calibration data + * causes audio dropouts during playback, use fallback data + * from DSP firmware as a workaround. + */ + if (codec->core.subsystem_id == 0x10431384) + hda_priv->skip_calibration = true; + pm_runtime_get_sync(dev); comp->dev = dev; From 8179f0b874e14e313782b35709da370a869988b6 Mon Sep 17 00:00:00 2001 From: Aleksandrs Vinarskis Date: Mon, 12 Jan 2026 01:06:37 +0100 Subject: [PATCH 1079/1321] ALSA: hda/realtek: Add quirk for Asus Zephyrus G14 2025 using CS35L56, fix speakers Just like GA403U, this GA403W needs to remap woofers to DAC1. Similarly to other Asus devices, headphones/headset MIC is not working, however the pin config alone is not enough to fix it. From Windows dump of GA403W: 0x12, 0x90a60140 # Correctly set by codec out of the box 0x13, 0x90a60550 0x14, 0x90170510 0x17, 0x90170120 # Correctly set by codec out of the box 0x19, 0x03a11050 # Set by ALC285_FIXUP_ASUS_GA403U_HEADSET_MIC 0x1a, 0x411115F0 0x1b, 0x03a11c30 # Set by ALC285_FIXUP_ASUS_GA403U_HEADSET_MIC 0x1d, 0x40663A45 # Correctly set by codec out of the box 0x21, 0x03211430 Even with all the values set, MIC of the jack is not detected. Until a complete solution is found, set ALC285_FIXUP_ASUS_GA403U_HEADSET_MIC for GA403W which fixes audio volume control for woofers. No need to create new quirk with missing pin config just yet, since its not making the situation better. Signed-off-by: Aleksandrs Vinarskis Link: https://patch.msgid.link/20260112-asus-rog-audio-v1-1-513957b4704e@vinarskis.com Signed-off-by: Takashi Iwai --- sound/hda/codecs/realtek/alc269.c | 1 + 1 file changed, 1 insertion(+) diff --git a/sound/hda/codecs/realtek/alc269.c b/sound/hda/codecs/realtek/alc269.c index 61c7372e6307a..dbbe8b4985838 100644 --- a/sound/hda/codecs/realtek/alc269.c +++ b/sound/hda/codecs/realtek/alc269.c @@ -6817,6 +6817,7 @@ static const struct hda_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x103c, 0x8f42, "HP ZBook 8 G2a 14W", ALC245_FIXUP_HP_TAS2781_I2C_MUTE_LED), SND_PCI_QUIRK(0x103c, 0x8f57, "HP Trekker G7JC", ALC287_FIXUP_CS35L41_I2C_2), SND_PCI_QUIRK(0x103c, 0x8f62, "HP ZBook 8 G2a 16W", ALC245_FIXUP_HP_TAS2781_I2C_MUTE_LED), + SND_PCI_QUIRK(0x1043, 0x1024, "ASUS Zephyrus G14 2025", ALC285_FIXUP_ASUS_GA403U_HEADSET_MIC), SND_PCI_QUIRK(0x1043, 0x1032, "ASUS VivoBook X513EA", ALC256_FIXUP_ASUS_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1043, 0x1034, "ASUS GU605C", ALC285_FIXUP_ASUS_GU605_SPI_SPEAKER2_TO_DAC1), SND_PCI_QUIRK(0x1043, 0x103e, "ASUS X540SA", ALC256_FIXUP_ASUS_MIC), From 7f2a9a94ac01e8a2d710d0e5f89dde98aa68adbf Mon Sep 17 00:00:00 2001 From: Richard Fitzgerald Date: Tue, 13 Jan 2026 13:09:54 +0000 Subject: [PATCH 1080/1321] ALSA: hda/cirrus_scodec_test: Fix incorrect setup of gpiochip Set gpiochip parent to the struct device of the dummy GPIO driver so that the software node will be associated with the GPIO chip. The recent commit e5d527be7e698 ("gpio: swnode: don't use the swnode's name as the key for GPIO lookup") broke cirrus_scodec_test, because the software node no longer gets associated with the GPIO driver by name. Instead, setting struct gpio_chip.parent to the owning struct device will find the node using a normal fwnode lookup. Signed-off-by: Richard Fitzgerald Fixes: 2144833e7b414 ("ALSA: hda: cirrus_scodec: Add KUnit test") Link: https://patch.msgid.link/20260113130954.574670-1-rf@opensource.cirrus.com Signed-off-by: Takashi Iwai --- sound/hda/codecs/side-codecs/cirrus_scodec_test.c | 1 + 1 file changed, 1 insertion(+) diff --git a/sound/hda/codecs/side-codecs/cirrus_scodec_test.c b/sound/hda/codecs/side-codecs/cirrus_scodec_test.c index 3cca750857b68..159ac80a93144 100644 --- a/sound/hda/codecs/side-codecs/cirrus_scodec_test.c +++ b/sound/hda/codecs/side-codecs/cirrus_scodec_test.c @@ -103,6 +103,7 @@ static int cirrus_scodec_test_gpio_probe(struct platform_device *pdev) /* GPIO core modifies our struct gpio_chip so use a copy */ gpio_priv->chip = cirrus_scodec_test_gpio_chip; + gpio_priv->chip.parent = &pdev->dev; ret = devm_gpiochip_add_data(&pdev->dev, &gpio_priv->chip, gpio_priv); if (ret) return dev_err_probe(&pdev->dev, ret, "Failed to add gpiochip\n"); From 2f1507f3968129075bbb1fe3d02684305fce2695 Mon Sep 17 00:00:00 2001 From: Richard Fitzgerald Date: Tue, 13 Jan 2026 13:40:56 +0000 Subject: [PATCH 1081/1321] ALSA: hda/cirrus_scodec_test: Fix test suite name Change the test suite name string to "snd-hda-cirrus-scodec-test". It was incorrectly named "snd-hda-scodec-cs35l56-test", a leftover from when the code under test was actually in the cs35l56 driver. Signed-off-by: Richard Fitzgerald Fixes: 2144833e7b414 ("ALSA: hda: cirrus_scodec: Add KUnit test") Link: https://patch.msgid.link/20260113134056.619051-1-rf@opensource.cirrus.com Signed-off-by: Takashi Iwai --- sound/hda/codecs/side-codecs/cirrus_scodec_test.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sound/hda/codecs/side-codecs/cirrus_scodec_test.c b/sound/hda/codecs/side-codecs/cirrus_scodec_test.c index 159ac80a93144..dc35932b6b22f 100644 --- a/sound/hda/codecs/side-codecs/cirrus_scodec_test.c +++ b/sound/hda/codecs/side-codecs/cirrus_scodec_test.c @@ -320,7 +320,7 @@ static struct kunit_case cirrus_scodec_test_cases[] = { }; static struct kunit_suite cirrus_scodec_test_suite = { - .name = "snd-hda-scodec-cs35l56-test", + .name = "snd-hda-cirrus-scodec-test", .init = cirrus_scodec_test_case_init, .test_cases = cirrus_scodec_test_cases, }; From 81221ac93c9988cd9b7b798b493307aef241d8fb Mon Sep 17 00:00:00 2001 From: Edward Adam Davis Date: Tue, 13 Jan 2026 16:29:23 +0800 Subject: [PATCH 1082/1321] ALSA: usb-audio: Prevent excessive number of frames In this case, the user constructed the parameters with maxpacksize 40 for rate 22050 / pps 1000, and packsize[0] 22 packsize[1] 23. The buffer size for each data URB is maxpacksize * packets, which in this example is 40 * 6 = 240; When the user performs a write operation to send audio data into the ALSA PCM playback stream, the calculated number of frames is packsize[0] * packets = 264, which exceeds the allocated URB buffer size, triggering the out-of-bounds (OOB) issue reported by syzbot [1]. Added a check for the number of single data URB frames when calculating the number of frames to prevent [1]. [1] BUG: KASAN: slab-out-of-bounds in copy_to_urb+0x261/0x460 sound/usb/pcm.c:1487 Write of size 264 at addr ffff88804337e800 by task syz.0.17/5506 Call Trace: copy_to_urb+0x261/0x460 sound/usb/pcm.c:1487 prepare_playback_urb+0x953/0x13d0 sound/usb/pcm.c:1611 prepare_outbound_urb+0x377/0xc50 sound/usb/endpoint.c:333 Reported-by: syzbot+6db0415d6d5c635f72cb@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=6db0415d6d5c635f72cb Tested-by: syzbot+6db0415d6d5c635f72cb@syzkaller.appspotmail.com Signed-off-by: Edward Adam Davis Link: https://patch.msgid.link/tencent_9AECE6CD2C7A826D902D696C289724E8120A@qq.com Signed-off-by: Takashi Iwai --- sound/usb/pcm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c index 54d01dfd820fa..263abb36bb2d1 100644 --- a/sound/usb/pcm.c +++ b/sound/usb/pcm.c @@ -1553,7 +1553,7 @@ static int prepare_playback_urb(struct snd_usb_substream *subs, for (i = 0; i < ctx->packets; i++) { counts = snd_usb_endpoint_next_packet_size(ep, ctx, i, avail); - if (counts < 0) + if (counts < 0 || frames + counts >= ep->max_urb_frames) break; /* set up descriptor */ urb->iso_frame_desc[i].offset = frames * stride; From 9f94e8e895e79e37d920408254d67e3eeaf4d903 Mon Sep 17 00:00:00 2001 From: Zhang Heng Date: Thu, 15 Jan 2026 09:58:44 +0800 Subject: [PATCH 1083/1321] ALSA: hda/realtek: Add quirk for HP Pavilion x360 to enable mute LED This quirk enables mute LED on HP Pavilion x360 2-in-1 Laptop 14-ek0xxx, which use ALC245 codec. Link: https://bugzilla.kernel.org/show_bug.cgi?id=220220 Cc: Signed-off-by: Zhang Heng Link: https://patch.msgid.link/20260115015844.3129890-1-zhangheng@kylinos.cn Signed-off-by: Takashi Iwai --- sound/hda/codecs/realtek/alc269.c | 1 + 1 file changed, 1 insertion(+) diff --git a/sound/hda/codecs/realtek/alc269.c b/sound/hda/codecs/realtek/alc269.c index dbbe8b4985838..29469e5497914 100644 --- a/sound/hda/codecs/realtek/alc269.c +++ b/sound/hda/codecs/realtek/alc269.c @@ -6613,6 +6613,7 @@ static const struct hda_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x103c, 0x8a2e, "HP Envy 16", ALC287_FIXUP_CS35L41_I2C_2), SND_PCI_QUIRK(0x103c, 0x8a30, "HP Envy 17", ALC287_FIXUP_CS35L41_I2C_2), SND_PCI_QUIRK(0x103c, 0x8a31, "HP Envy 15", ALC287_FIXUP_CS35L41_I2C_2), + SND_PCI_QUIRK(0x103c, 0x8a34, "HP Pavilion x360 2-in-1 Laptop 14-ek0xxx", ALC245_FIXUP_HP_MUTE_LED_COEFBIT), SND_PCI_QUIRK(0x103c, 0x8a4f, "HP Victus 15-fa0xxx (MB 8A4F)", ALC245_FIXUP_HP_MUTE_LED_COEFBIT), SND_PCI_QUIRK(0x103c, 0x8a6e, "HP EDNA 360", ALC287_FIXUP_CS35L41_I2C_4), SND_PCI_QUIRK(0x103c, 0x8a74, "HP ProBook 440 G8 Notebook PC", ALC236_FIXUP_HP_GPIO_LED), From 6180ee0d3b8bfb7afb6270083b27544f1c7d3ef3 Mon Sep 17 00:00:00 2001 From: Deep Harsora Date: Fri, 2 Jan 2026 15:21:24 +0000 Subject: [PATCH 1084/1321] ASoC: Intel: sof_sdw: Add new quirks for PTL on Dell with CS42L43 Add missing quirks for some new Dell laptops using cs42l43's speaker outputs. Signed-off-by: Deep Harsora Signed-off-by: Maciej Strozek Link: https://patch.msgid.link/20260102152132.3053106-1-mstrozek@opensource.cirrus.com Signed-off-by: Mark Brown --- sound/soc/intel/boards/sof_sdw.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/sound/soc/intel/boards/sof_sdw.c b/sound/soc/intel/boards/sof_sdw.c index 2c1001148d540..8721a098d53f1 100644 --- a/sound/soc/intel/boards/sof_sdw.c +++ b/sound/soc/intel/boards/sof_sdw.c @@ -764,6 +764,14 @@ static const struct dmi_system_id sof_sdw_quirk_table[] = { .driver_data = (void *)(SOC_SDW_CODEC_SPKR), }, /* Pantherlake devices*/ + { + .callback = sof_sdw_quirk_cb, + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc"), + DMI_EXACT_MATCH(DMI_PRODUCT_SKU, "0DD6") + }, + .driver_data = (void *)(SOC_SDW_SIDECAR_AMPS), + }, { .callback = sof_sdw_quirk_cb, .matches = { From 6aeac30f0906969397f7894f6b1aea9b27c6f2a8 Mon Sep 17 00:00:00 2001 From: Shengjiu Wang Date: Mon, 29 Dec 2025 17:04:32 +0800 Subject: [PATCH 1085/1321] ASoC: simple-card-utils: Check device node before overwrite direction Even the device node don't exist, the graph_util_parse_link_direction() will overwrite the playback_only and capture_only to be zero. Which cause the playback_only and capture_only are not correct, so check device node exist or not before update the value. Signed-off-by: Shengjiu Wang Acked-by: Kuninori Morimoto Link: https://patch.msgid.link/20251229090432.3964848-1-shengjiu.wang@nxp.com Signed-off-by: Mark Brown --- sound/soc/generic/simple-card-utils.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/sound/soc/generic/simple-card-utils.c b/sound/soc/generic/simple-card-utils.c index 355f7ec8943c2..bdc02e85b089f 100644 --- a/sound/soc/generic/simple-card-utils.c +++ b/sound/soc/generic/simple-card-utils.c @@ -1179,9 +1179,9 @@ void graph_util_parse_link_direction(struct device_node *np, bool is_playback_only = of_property_read_bool(np, "playback-only"); bool is_capture_only = of_property_read_bool(np, "capture-only"); - if (playback_only) + if (np && playback_only) *playback_only = is_playback_only; - if (capture_only) + if (np && capture_only) *capture_only = is_capture_only; } EXPORT_SYMBOL_GPL(graph_util_parse_link_direction); From d5f5bb2aacb5bad7ca9bfea191fd1fcdeb2d7e97 Mon Sep 17 00:00:00 2001 From: "Rob Herring (Arm)" Date: Mon, 5 Jan 2026 13:32:03 -0600 Subject: [PATCH 1086/1321] ASoC: dt-bindings: everest,es8316: Add interrupt support The Everest ES8316 has interrupt capability on its GPIO3 pin for headphone detection. Several of the RockPi 4 variants are using it already. Signed-off-by: Rob Herring (Arm) Link: https://patch.msgid.link/20260105193203.3166320-1-robh@kernel.org Signed-off-by: Mark Brown --- Documentation/devicetree/bindings/sound/everest,es8316.yaml | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/Documentation/devicetree/bindings/sound/everest,es8316.yaml b/Documentation/devicetree/bindings/sound/everest,es8316.yaml index 81a0215050e01..fe5d938ca3102 100644 --- a/Documentation/devicetree/bindings/sound/everest,es8316.yaml +++ b/Documentation/devicetree/bindings/sound/everest,es8316.yaml @@ -49,6 +49,10 @@ properties: items: - const: mclk + interrupts: + maxItems: 1 + description: Headphone detect interrupt + port: $ref: audio-graph-port.yaml# unevaluatedProperties: false From b3ee41253c5304a3afb3ac184f3dd5c99eaec574 Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Fri, 2 Jan 2026 12:14:10 +0100 Subject: [PATCH 1087/1321] ASoC: codecs: wsa883x: fix unnecessary initialisation The soundwire update_status() callback may be called multiple times with the same ATTACHED status but initialisation should only be done when transitioning from UNATTACHED to ATTACHED. This avoids repeated initialisation of the codecs during boot of machines like the Lenovo ThinkPad X13s: [ 11.614523] wsa883x-codec sdw:1:0:0217:0202:00:1: WSA883X Version 1_1, Variant: WSA8835_V2 [ 11.618022] wsa883x-codec sdw:1:0:0217:0202:00:1: WSA883X Version 1_1, Variant: WSA8835_V2 [ 11.621377] wsa883x-codec sdw:1:0:0217:0202:00:1: WSA883X Version 1_1, Variant: WSA8835_V2 [ 11.624065] wsa883x-codec sdw:1:0:0217:0202:00:1: WSA883X Version 1_1, Variant: WSA8835_V2 [ 11.631382] wsa883x-codec sdw:1:0:0217:0202:00:2: WSA883X Version 1_1, Variant: WSA8835_V2 [ 11.634424] wsa883x-codec sdw:1:0:0217:0202:00:2: WSA883X Version 1_1, Variant: WSA8835_V2 Fixes: 43b8c7dc85a1 ("ASoC: codecs: add wsa883x amplifier support") Cc: stable@vger.kernel.org # 6.0 Cc: Srinivas Kandagatla Signed-off-by: Johan Hovold Reviewed-by: Krzysztof Kozlowski Reviewed-by: Srinivas Kandagatla Link: https://patch.msgid.link/20260102111413.9605-2-johan@kernel.org Signed-off-by: Mark Brown --- sound/soc/codecs/wsa883x.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/sound/soc/codecs/wsa883x.c b/sound/soc/codecs/wsa883x.c index c3046e260cb95..3ffea56aeb0fa 100644 --- a/sound/soc/codecs/wsa883x.c +++ b/sound/soc/codecs/wsa883x.c @@ -475,6 +475,7 @@ struct wsa883x_priv { int active_ports; int dev_mode; int comp_offset; + bool hw_init; /* * Protects temperature reading code (related to speaker protection) and * fields: temperature and pa_on. @@ -1043,6 +1044,9 @@ static int wsa883x_init(struct wsa883x_priv *wsa883x) struct regmap *regmap = wsa883x->regmap; int variant, version, ret; + if (wsa883x->hw_init) + return 0; + ret = regmap_read(regmap, WSA883X_OTP_REG_0, &variant); if (ret) return ret; @@ -1085,6 +1089,8 @@ static int wsa883x_init(struct wsa883x_priv *wsa883x) wsa883x->comp_offset); } + wsa883x->hw_init = true; + return 0; } @@ -1093,6 +1099,9 @@ static int wsa883x_update_status(struct sdw_slave *slave, { struct wsa883x_priv *wsa883x = dev_get_drvdata(&slave->dev); + if (status == SDW_SLAVE_UNATTACHED) + wsa883x->hw_init = false; + if (status == SDW_SLAVE_ATTACHED && slave->dev_num > 0) return wsa883x_init(wsa883x); From 401964c033bf9abfa82a81ad90dc1e181e7ef699 Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Fri, 2 Jan 2026 12:14:11 +0100 Subject: [PATCH 1088/1321] ASoC: codecs: wsa881x: fix unnecessary initialisation The soundwire update_status() callback may be called multiple times with the same ATTACHED status but initialisation should only be done when transitioning from UNATTACHED to ATTACHED. Fixes: a0aab9e1404a ("ASoC: codecs: add wsa881x amplifier support") Cc: stable@vger.kernel.org # 5.6 Cc: Srinivas Kandagatla Signed-off-by: Johan Hovold Reviewed-by: Krzysztof Kozlowski Reviewed-by: Srinivas Kandagatla Link: https://patch.msgid.link/20260102111413.9605-3-johan@kernel.org Signed-off-by: Mark Brown --- sound/soc/codecs/wsa881x.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/sound/soc/codecs/wsa881x.c b/sound/soc/codecs/wsa881x.c index d7aca6567c2de..2fc234adca5f7 100644 --- a/sound/soc/codecs/wsa881x.c +++ b/sound/soc/codecs/wsa881x.c @@ -678,6 +678,7 @@ struct wsa881x_priv { */ unsigned int sd_n_val; int active_ports; + bool hw_init; bool port_prepared[WSA881X_MAX_SWR_PORTS]; bool port_enable[WSA881X_MAX_SWR_PORTS]; }; @@ -687,6 +688,9 @@ static void wsa881x_init(struct wsa881x_priv *wsa881x) struct regmap *rm = wsa881x->regmap; unsigned int val = 0; + if (wsa881x->hw_init) + return; + regmap_register_patch(wsa881x->regmap, wsa881x_rev_2_0, ARRAY_SIZE(wsa881x_rev_2_0)); @@ -724,6 +728,8 @@ static void wsa881x_init(struct wsa881x_priv *wsa881x) regmap_update_bits(rm, WSA881X_OTP_REG_28, 0x3F, 0x3A); regmap_update_bits(rm, WSA881X_BONGO_RESRV_REG1, 0xFF, 0xB2); regmap_update_bits(rm, WSA881X_BONGO_RESRV_REG2, 0xFF, 0x05); + + wsa881x->hw_init = true; } static int wsa881x_component_probe(struct snd_soc_component *comp) @@ -1067,6 +1073,9 @@ static int wsa881x_update_status(struct sdw_slave *slave, { struct wsa881x_priv *wsa881x = dev_get_drvdata(&slave->dev); + if (status == SDW_SLAVE_UNATTACHED) + wsa881x->hw_init = false; + if (status == SDW_SLAVE_ATTACHED && slave->dev_num > 0) wsa881x_init(wsa881x); From 4ae81ff9480b4e3ea1413e06d23fa2f5d99489e1 Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Fri, 2 Jan 2026 12:14:12 +0100 Subject: [PATCH 1089/1321] ASoC: codecs: wsa884x: fix codec initialisation The soundwire update_status() callback may be called multiple times with the same ATTACHED status but initialisation should only be done when transitioning from UNATTACHED to ATTACHED. Fix the inverted hw_init flag which was set to false instead of true after initialisation which defeats its purpose and may result in repeated unnecessary initialisation. Similarly, the initial state of the flag was also inverted so that the codec would only be initialised and brought out of regmap cache only mode if its status first transitions to UNATTACHED. Fixes: aa21a7d4f68a ("ASoC: codecs: wsa884x: Add WSA884x family of speakers") Cc: stable@vger.kernel.org # 6.5 Cc: Krzysztof Kozlowski Signed-off-by: Johan Hovold Reviewed-by: Krzysztof Kozlowski Tested-by: Krzysztof Kozlowski Reviewed-by: Srinivas Kandagatla Link: https://patch.msgid.link/20260102111413.9605-4-johan@kernel.org Signed-off-by: Mark Brown --- sound/soc/codecs/wsa884x.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/sound/soc/codecs/wsa884x.c b/sound/soc/codecs/wsa884x.c index 887edd2be705e..6c6b497657d0c 100644 --- a/sound/soc/codecs/wsa884x.c +++ b/sound/soc/codecs/wsa884x.c @@ -1534,7 +1534,7 @@ static void wsa884x_init(struct wsa884x_priv *wsa884x) wsa884x_set_gain_parameters(wsa884x); - wsa884x->hw_init = false; + wsa884x->hw_init = true; } static int wsa884x_update_status(struct sdw_slave *slave, @@ -2109,7 +2109,6 @@ static int wsa884x_probe(struct sdw_slave *pdev, /* Start in cache-only until device is enumerated */ regcache_cache_only(wsa884x->regmap, true); - wsa884x->hw_init = true; if (IS_REACHABLE(CONFIG_HWMON)) { struct device *hwmon; From 9e1b6a7b022fc141015d77a38c32dd886bd1e40d Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Fri, 2 Jan 2026 12:14:13 +0100 Subject: [PATCH 1090/1321] ASoC: codecs: wsa883x: suppress variant printk Drivers should generally be silent on successful probe. Demote the codec variant printk to debug level and instead add a warning in case an unknown variant is ever encountered. Signed-off-by: Johan Hovold Reviewed-by: Krzysztof Kozlowski Reviewed-by: Srinivas Kandagatla Link: https://patch.msgid.link/20260102111413.9605-5-johan@kernel.org Signed-off-by: Mark Brown --- sound/soc/codecs/wsa883x.c | 17 +++++++++-------- 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/sound/soc/codecs/wsa883x.c b/sound/soc/codecs/wsa883x.c index 3ffea56aeb0fa..468d2b38a22ab 100644 --- a/sound/soc/codecs/wsa883x.c +++ b/sound/soc/codecs/wsa883x.c @@ -1058,22 +1058,23 @@ static int wsa883x_init(struct wsa883x_priv *wsa883x) switch (variant) { case WSA8830: - dev_info(wsa883x->dev, "WSA883X Version 1_%d, Variant: WSA8830\n", - version); + dev_dbg(wsa883x->dev, "WSA883X Version 1_%d, Variant: WSA8830\n", + version); break; case WSA8835: - dev_info(wsa883x->dev, "WSA883X Version 1_%d, Variant: WSA8835\n", - version); + dev_dbg(wsa883x->dev, "WSA883X Version 1_%d, Variant: WSA8835\n", + version); break; case WSA8832: - dev_info(wsa883x->dev, "WSA883X Version 1_%d, Variant: WSA8832\n", - version); + dev_dbg(wsa883x->dev, "WSA883X Version 1_%d, Variant: WSA8832\n", + version); break; case WSA8835_V2: - dev_info(wsa883x->dev, "WSA883X Version 1_%d, Variant: WSA8835_V2\n", - version); + dev_dbg(wsa883x->dev, "WSA883X Version 1_%d, Variant: WSA8835_V2\n", + version); break; default: + dev_warn(wsa883x->dev, "unknown variant: %d\n", variant); break; } From 5ddcf6bb868f13e64a8663ecb9293fc2d0c47dcd Mon Sep 17 00:00:00 2001 From: Ben Dooks Date: Tue, 6 Jan 2026 22:58:46 +0000 Subject: [PATCH 1091/1321] ASoC: ops: fix pointer types to be big-endian If manipulating big-endian data, make the pointers be big-endian instead of host-endian. This should stop the following sparse warnigns about endian-conversion: sound/soc/soc-ops.c:547:33: warning: invalid assignment: &= sound/soc/soc-ops.c:547:33: left side has type unsigned short sound/soc/soc-ops.c:547:33: right side has type restricted __be16 sound/soc/soc-ops.c:551:33: warning: invalid assignment: &= sound/soc/soc-ops.c:551:33: left side has type unsigned int sound/soc/soc-ops.c:551:33: right side has type restricted __be32 Signed-off-by: Ben Dooks Link: https://patch.msgid.link/20260106225846.83580-1-ben.dooks@codethink.co.uk Signed-off-by: Mark Brown --- sound/soc/soc-ops.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/sound/soc/soc-ops.c b/sound/soc/soc-ops.c index 624e9269fc25b..ba42939d5f013 100644 --- a/sound/soc/soc-ops.c +++ b/sound/soc/soc-ops.c @@ -543,11 +543,11 @@ int snd_soc_bytes_get(struct snd_kcontrol *kcontrol, ucontrol->value.bytes.data[0] &= ~params->mask; break; case 2: - ((u16 *)(&ucontrol->value.bytes.data))[0] + ((__be16 *)(&ucontrol->value.bytes.data))[0] &= cpu_to_be16(~params->mask); break; case 4: - ((u32 *)(&ucontrol->value.bytes.data))[0] + ((__be32 *)(&ucontrol->value.bytes.data))[0] &= cpu_to_be32(~params->mask); break; default: From 0f68dfde8a44b07129b04c8bd022c097ca1992b1 Mon Sep 17 00:00:00 2001 From: Kery Qi Date: Wed, 7 Jan 2026 23:48:37 +0800 Subject: [PATCH 1092/1321] ASoC: davinci-evm: Fix reference leak in davinci_evm_probe The davinci_evm_probe() function calls of_parse_phandle() to acquire device nodes for "ti,audio-codec" and "ti,mcasp-controller". These functions return device nodes with incremented reference counts. However, in several error paths (e.g., when the second of_parse_phandle(), snd_soc_of_parse_card_name(), or devm_snd_soc_register_card() fails), the function returns directly without releasing the acquired nodes, leading to reference leaks. This patch adds an error handling path 'err_put' to properly release the device nodes using of_node_put() and clean up the pointers when an error occurs. Signed-off-by: Kery Qi Link: https://patch.msgid.link/20260107154836.1521-2-qikeyu2017@gmail.com Signed-off-by: Mark Brown --- sound/soc/ti/davinci-evm.c | 39 ++++++++++++++++++++++++++++++-------- 1 file changed, 31 insertions(+), 8 deletions(-) diff --git a/sound/soc/ti/davinci-evm.c b/sound/soc/ti/davinci-evm.c index 3848766d96c3e..ad514c2e5a254 100644 --- a/sound/soc/ti/davinci-evm.c +++ b/sound/soc/ti/davinci-evm.c @@ -194,27 +194,32 @@ static int davinci_evm_probe(struct platform_device *pdev) return -EINVAL; dai->cpus->of_node = of_parse_phandle(np, "ti,mcasp-controller", 0); - if (!dai->cpus->of_node) - return -EINVAL; + if (!dai->cpus->of_node) { + ret = -EINVAL; + goto err_put; + } dai->platforms->of_node = dai->cpus->of_node; evm_soc_card.dev = &pdev->dev; ret = snd_soc_of_parse_card_name(&evm_soc_card, "ti,model"); if (ret) - return ret; + goto err_put; mclk = devm_clk_get(&pdev->dev, "mclk"); if (PTR_ERR(mclk) == -EPROBE_DEFER) { - return -EPROBE_DEFER; + ret = -EPROBE_DEFER; + goto err_put; } else if (IS_ERR(mclk)) { dev_dbg(&pdev->dev, "mclk not found.\n"); mclk = NULL; } drvdata = devm_kzalloc(&pdev->dev, sizeof(*drvdata), GFP_KERNEL); - if (!drvdata) - return -ENOMEM; + if (!drvdata) { + ret = -ENOMEM; + goto err_put; + } drvdata->mclk = mclk; @@ -224,7 +229,8 @@ static int davinci_evm_probe(struct platform_device *pdev) if (!drvdata->mclk) { dev_err(&pdev->dev, "No clock or clock rate defined.\n"); - return -EINVAL; + ret = -EINVAL; + goto err_put; } drvdata->sysclk = clk_get_rate(drvdata->mclk); } else if (drvdata->mclk) { @@ -240,8 +246,25 @@ static int davinci_evm_probe(struct platform_device *pdev) snd_soc_card_set_drvdata(&evm_soc_card, drvdata); ret = devm_snd_soc_register_card(&pdev->dev, &evm_soc_card); - if (ret) + if (ret) { dev_err(&pdev->dev, "snd_soc_register_card failed (%d)\n", ret); + goto err_put; + } + + return ret; + +err_put: + dai->platforms->of_node = NULL; + + if (dai->cpus->of_node) { + of_node_put(dai->cpus->of_node); + dai->cpus->of_node = NULL; + } + + if (dai->codecs->of_node) { + of_node_put(dai->codecs->of_node); + dai->codecs->of_node = NULL; + } return ret; } From 971f66f547eeeaa5ec9b2384f821472a41042ed6 Mon Sep 17 00:00:00 2001 From: "Rob Herring (Arm)" Date: Thu, 8 Jan 2026 15:53:05 -0600 Subject: [PATCH 1093/1321] ASoC: dt-bindings: realtek,rt5640: Add missing properties/node The RT5640 has an MCLK pin and several users already define a clocks entry. A 'port' node is also in use and a common node for codecs. Signed-off-by: Rob Herring (Arm) Link: https://patch.msgid.link/20260108215307.1138515-1-robh@kernel.org Signed-off-by: Mark Brown --- .../devicetree/bindings/sound/realtek,rt5640.yaml | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/Documentation/devicetree/bindings/sound/realtek,rt5640.yaml b/Documentation/devicetree/bindings/sound/realtek,rt5640.yaml index 3f4f59287c1cc..a0b8bf6cb1108 100644 --- a/Documentation/devicetree/bindings/sound/realtek,rt5640.yaml +++ b/Documentation/devicetree/bindings/sound/realtek,rt5640.yaml @@ -47,6 +47,12 @@ properties: reg: maxItems: 1 + clocks: + maxItems: 1 + + clock-names: + const: mclk + interrupts: maxItems: 1 description: The CODEC's interrupt output. @@ -121,6 +127,9 @@ properties: - 2 # Scale current by 1.0 - 3 # Scale current by 1.5 + port: + $ref: /schemas/graph.yaml#/properties/port + required: - compatible - reg From 67474c1c794548c5e8deb8b2d9929ae79ffa07ba Mon Sep 17 00:00:00 2001 From: "Rob Herring (Arm)" Date: Thu, 8 Jan 2026 15:53:06 -0600 Subject: [PATCH 1094/1321] ASoC: dt-bindings: realtek,rt5640: Allow 7 for realtek,jack-detect-source The driver accepts and uses a value of 7 for realtek,jack-detect-source. What exactly it means isn't clear though. Signed-off-by: Rob Herring (Arm) Link: https://patch.msgid.link/20260108215307.1138515-2-robh@kernel.org Signed-off-by: Mark Brown --- Documentation/devicetree/bindings/sound/realtek,rt5640.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/Documentation/devicetree/bindings/sound/realtek,rt5640.yaml b/Documentation/devicetree/bindings/sound/realtek,rt5640.yaml index a0b8bf6cb1108..cd95d7189d340 100644 --- a/Documentation/devicetree/bindings/sound/realtek,rt5640.yaml +++ b/Documentation/devicetree/bindings/sound/realtek,rt5640.yaml @@ -104,6 +104,7 @@ properties: - 4 # Use GPIO2 for jack-detect - 5 # Use GPIO3 for jack-detect - 6 # Use GPIO4 for jack-detect + - 7 # HDA? realtek,jack-detect-not-inverted: description: From 9452f304600d1755560ee5bf6c2f39545344aa4b Mon Sep 17 00:00:00 2001 From: "Rob Herring (Arm)" Date: Thu, 8 Jan 2026 16:49:36 -0600 Subject: [PATCH 1095/1321] ASoC: dt-bindings: rockchip-spdif: Allow "port" node Add a "port" node entry for Rockchip S/PDIF binding. It's already in use and a common property for DAIs. Signed-off-by: Rob Herring (Arm) Link: https://patch.msgid.link/20260108224938.1320809-1-robh@kernel.org Signed-off-by: Mark Brown --- Documentation/devicetree/bindings/sound/rockchip-spdif.yaml | 3 +++ 1 file changed, 3 insertions(+) diff --git a/Documentation/devicetree/bindings/sound/rockchip-spdif.yaml b/Documentation/devicetree/bindings/sound/rockchip-spdif.yaml index 32dea7392e8d4..56c755c22945f 100644 --- a/Documentation/devicetree/bindings/sound/rockchip-spdif.yaml +++ b/Documentation/devicetree/bindings/sound/rockchip-spdif.yaml @@ -70,6 +70,9 @@ properties: "#sound-dai-cells": const: 0 + port: + $ref: /schemas/graph.yaml#/properties/port + required: - compatible - reg From ea313d88f059d536eacb599ee07e1b1c1e04a7fa Mon Sep 17 00:00:00 2001 From: sheetal Date: Wed, 17 Dec 2025 18:55:24 +0530 Subject: [PATCH 1096/1321] ASoC: tegra: Revert fix for uninitialized flat cache warning in tegra210_ahub Commit 4d4021b0bbd1 ("ASoC: tegra: Fix uninitialized flat cache warning in tegra210_ahub") attempted to fix the uninitialized flat cache warning that is observed for the Tegra210 AHUB driver. However, the change broke various audio tests because an -EBUSY error is returned when accessing registers from cache before they are read from hardware. Revert this change for now, until a proper fix is available. Fixes: 4d4021b0bbd1 ("ASoC: tegra: Fix uninitialized flat cache warning in tegra210_ahub") Signed-off-by: sheetal Acked-by: Jon Hunter Link: https://patch.msgid.link/20251217132524.2844499-1-sheetal@nvidia.com Signed-off-by: Mark Brown --- sound/soc/tegra/tegra210_ahub.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/sound/soc/tegra/tegra210_ahub.c b/sound/soc/tegra/tegra210_ahub.c index 261d9067d27b6..e795907a3963a 100644 --- a/sound/soc/tegra/tegra210_ahub.c +++ b/sound/soc/tegra/tegra210_ahub.c @@ -2077,7 +2077,7 @@ static const struct regmap_config tegra210_ahub_regmap_config = { .val_bits = 32, .reg_stride = 4, .max_register = TEGRA210_MAX_REGISTER_ADDR, - .cache_type = REGCACHE_FLAT_S, + .cache_type = REGCACHE_FLAT, }; static const struct regmap_config tegra186_ahub_regmap_config = { @@ -2085,7 +2085,7 @@ static const struct regmap_config tegra186_ahub_regmap_config = { .val_bits = 32, .reg_stride = 4, .max_register = TEGRA186_MAX_REGISTER_ADDR, - .cache_type = REGCACHE_FLAT_S, + .cache_type = REGCACHE_FLAT, }; static const struct regmap_config tegra264_ahub_regmap_config = { @@ -2094,7 +2094,7 @@ static const struct regmap_config tegra264_ahub_regmap_config = { .reg_stride = 4, .writeable_reg = tegra264_ahub_wr_reg, .max_register = TEGRA264_MAX_REGISTER_ADDR, - .cache_type = REGCACHE_FLAT_S, + .cache_type = REGCACHE_FLAT, }; static const struct tegra_ahub_soc_data soc_data_tegra210 = { From 4f5a3ffdbf13d2353ee573e708ba929ed0affe37 Mon Sep 17 00:00:00 2001 From: Radhi Bajahaw Date: Mon, 12 Jan 2026 21:38:14 +0100 Subject: [PATCH 1097/1321] ASoC: amd: yc: Fix microphone on ASUS M6500RE Add DMI match for ASUSTeK COMPUTER INC. M6500RE to enable the internal microphone. Signed-off-by: Radhi Bajahaw Link: https://patch.msgid.link/20260112203814.155-1-bajahawradhi@gmail.com Signed-off-by: Mark Brown --- sound/soc/amd/yc/acp6x-mach.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/sound/soc/amd/yc/acp6x-mach.c b/sound/soc/amd/yc/acp6x-mach.c index bf4d9d3365617..0294177acc663 100644 --- a/sound/soc/amd/yc/acp6x-mach.c +++ b/sound/soc/amd/yc/acp6x-mach.c @@ -416,6 +416,13 @@ static const struct dmi_system_id yc_acp_quirk_table[] = { DMI_MATCH(DMI_PRODUCT_NAME, "M6500RC"), } }, + { + .driver_data = &acp6x_card, + .matches = { + DMI_MATCH(DMI_BOARD_VENDOR, "ASUSTeK COMPUTER INC."), + DMI_MATCH(DMI_PRODUCT_NAME, "M6500RE"), + } + }, { .driver_data = &acp6x_card, .matches = { From 84ecaaadeca7efa93173b660d8640fff8596406b Mon Sep 17 00:00:00 2001 From: Jon Hunter Date: Thu, 8 Jan 2026 14:31:56 +0000 Subject: [PATCH 1098/1321] ASoC: dt-bindings: realtek,rt5640: Document mclk Commit eba5a0bac211 ("ASoC: dt-bindings: realtek,rt5640: Convert to dtschema") converted the rt5640 dt-binding to yaml format but in the process dropped 'clock' and 'clock-names' properties that are used to specify the codec 'mclk'. This is causing DTB build warnings for boards that use this codec and define an 'mclk' in device-tree. Update the rt5640 binding document to add the optional mclk. Fixes: eba5a0bac211 ("ASoC: dt-bindings: realtek,rt5640: Convert to dtschema") Signed-off-by: Jon Hunter Reviewed-by: Krzysztof Kozlowski Link: https://patch.msgid.link/20260108143158.351223-2-jonathanh@nvidia.com Signed-off-by: Mark Brown --- .../devicetree/bindings/sound/realtek,rt5640.yaml | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/Documentation/devicetree/bindings/sound/realtek,rt5640.yaml b/Documentation/devicetree/bindings/sound/realtek,rt5640.yaml index cd95d7189d340..f91764aca9855 100644 --- a/Documentation/devicetree/bindings/sound/realtek,rt5640.yaml +++ b/Documentation/devicetree/bindings/sound/realtek,rt5640.yaml @@ -44,6 +44,14 @@ properties: - realtek,rt5640 - realtek,rt5639 + clocks: + items: + - description: phandle and clock specifier for codec MCLK. + + clock-names: + items: + - const: mclk + reg: maxItems: 1 From 68c2eb7f38e7efc0c1bf44a6de59444b97c9d378 Mon Sep 17 00:00:00 2001 From: Jon Hunter Date: Thu, 8 Jan 2026 14:31:57 +0000 Subject: [PATCH 1099/1321] ASoC: dt-bindings: realtek,rt5640: Update jack-detect The device-tree property 'realtek,jack-detect-source' currently only permits values from 0-6. However, commit 2b9c8d2b3c89 ("ASoC: rt5640: Add the HDA header support") updated the Realtek rt5640 to support setting the 'realtek,jack-detect-source' to 7 to support the HDA header. The Tegra234 platforms currently set 'realtek,jack-detect-source' to 7 for the HDA header and this is causing a warning when building device-tree. audio-codec@1c (realtek,rt5640): realtek,jack-detect-source: 7 is not one of [0, 1, 2, 3, 4, 5, 6] Given that the driver already supports this settings, update the binding document for the rt5640 device to add the HDA header as a valid configuration for the 'realtek,jack-detect-source' property. Fixes: 2b9c8d2b3c89 ("ASoC: rt5640: Add the HDA header support") Signed-off-by: Jon Hunter Reviewed-by: Krzysztof Kozlowski Link: https://patch.msgid.link/20260108143158.351223-3-jonathanh@nvidia.com Signed-off-by: Mark Brown --- Documentation/devicetree/bindings/sound/realtek,rt5640.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/devicetree/bindings/sound/realtek,rt5640.yaml b/Documentation/devicetree/bindings/sound/realtek,rt5640.yaml index f91764aca9855..7810b75857f1e 100644 --- a/Documentation/devicetree/bindings/sound/realtek,rt5640.yaml +++ b/Documentation/devicetree/bindings/sound/realtek,rt5640.yaml @@ -112,7 +112,7 @@ properties: - 4 # Use GPIO2 for jack-detect - 5 # Use GPIO3 for jack-detect - 6 # Use GPIO4 for jack-detect - - 7 # HDA? + - 7 # Use HDA header for jack-detect realtek,jack-detect-not-inverted: description: From f8f686b087e1426e09a0bbbce19f1c98104cfd1f Mon Sep 17 00:00:00 2001 From: Jon Hunter Date: Thu, 8 Jan 2026 14:31:58 +0000 Subject: [PATCH 1100/1321] ASoC: dt-bindings: realtek,rt5640: Document port node Various boards that use the rt5640 audio codec define a 'port' child node under the codec node to describe the interface between it and the SoC that it is connected to. The binding document for the rt5640 codec does not define the 'port' child node and so this is generating warnings when running the DTB checks for these boards. Add the 'port' node to the binding document for the rt5640 codec to fix this. Signed-off-by: Jon Hunter Reviewed-by: Krzysztof Kozlowski Link: https://patch.msgid.link/20260108143158.351223-4-jonathanh@nvidia.com Signed-off-by: Mark Brown --- Documentation/devicetree/bindings/sound/realtek,rt5640.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/Documentation/devicetree/bindings/sound/realtek,rt5640.yaml b/Documentation/devicetree/bindings/sound/realtek,rt5640.yaml index 7810b75857f1e..42c0c8e285b8a 100644 --- a/Documentation/devicetree/bindings/sound/realtek,rt5640.yaml +++ b/Documentation/devicetree/bindings/sound/realtek,rt5640.yaml @@ -138,6 +138,7 @@ properties: port: $ref: /schemas/graph.yaml#/properties/port + unevaluatedProperties: false required: - compatible From 6680c1333e8b4425b72d897b813fbf9af09699c5 Mon Sep 17 00:00:00 2001 From: Cole Leavitt Date: Tue, 13 Jan 2026 19:55:18 -0700 Subject: [PATCH 1101/1321] ASoC: sdw_utils: cs42l43: Enable Headphone pin for LINEOUT jack type The CS42L43 codec's load detection can return different impedance values that map to either HEADPHONE or LINEOUT jack types. However, the soc_jack_pins array only maps SND_JACK_HEADPHONE to the "Headphone" DAPM pin, not SND_JACK_LINEOUT. When headphones are detected with an impedance that maps to LINEOUT (such as impedance value 0x2), the driver reports SND_JACK_LINEOUT. Since this doesn't match the jack pin mask, the "Headphone" DAPM pin is not activated, and no audio is routed to the headphone outputs. Fix by adding SND_JACK_LINEOUT to the Headphone pin mask, so that both headphone and line-out detection properly enable the headphone output path. This fixes no audio output on devices like the Lenovo ThinkPad P16 Gen 3 where headphones are detected with LINEOUT impedance. Fixes: d74bad3b7452 ("ASoC: intel: sof_sdw_cs42l43: Create separate jacks for hp and mic") Reviewed-by: Charles Keepax Signed-off-by: Cole Leavitt Link: https://patch.msgid.link/20260114025518.28519-1-cole@unwrap.rs Signed-off-by: Mark Brown --- sound/soc/sdw_utils/soc_sdw_cs42l43.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sound/soc/sdw_utils/soc_sdw_cs42l43.c b/sound/soc/sdw_utils/soc_sdw_cs42l43.c index 4c954501e5002..2685ff4f09320 100644 --- a/sound/soc/sdw_utils/soc_sdw_cs42l43.c +++ b/sound/soc/sdw_utils/soc_sdw_cs42l43.c @@ -44,7 +44,7 @@ static const struct snd_soc_dapm_route cs42l43_dmic_map[] = { static struct snd_soc_jack_pin soc_jack_pins[] = { { .pin = "Headphone", - .mask = SND_JACK_HEADPHONE, + .mask = SND_JACK_HEADPHONE | SND_JACK_LINEOUT, }, { .pin = "Headset Mic", From c020de801275df8f5f08d729e323680ba2e19d37 Mon Sep 17 00:00:00 2001 From: Richard Fitzgerald Date: Mon, 12 Jan 2026 14:07:56 +0000 Subject: [PATCH 1102/1321] soundwire: Add missing EXPORT for sdw_slave_type include/sdw_type.h provides the function is_sdw_slave() which requires sdw_slave_type. But sdw_slave_type was not exported. Signed-off-by: Richard Fitzgerald Acked-by: Vinod Koul Reviewed-by: Pierre-Louis Bossart Reviewed-by: Charles Keepax Link: https://patch.msgid.link/20260112140758.215799-2-rf@opensource.cirrus.com Signed-off-by: Mark Brown --- drivers/soundwire/slave.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/soundwire/slave.c b/drivers/soundwire/slave.c index 3d4d00188c26c..d933cebad52b4 100644 --- a/drivers/soundwire/slave.c +++ b/drivers/soundwire/slave.c @@ -23,6 +23,7 @@ const struct device_type sdw_slave_type = { .release = sdw_slave_release, .uevent = sdw_slave_uevent, }; +EXPORT_SYMBOL_GPL(sdw_slave_type); int sdw_slave_add(struct sdw_bus *bus, struct sdw_slave_id *id, struct fwnode_handle *fwnode) From e4aec3b0c4e3d8322bcd59765bc323fb0be455df Mon Sep 17 00:00:00 2001 From: Richard Fitzgerald Date: Mon, 12 Jan 2026 14:07:57 +0000 Subject: [PATCH 1103/1321] ASoC: sdw_utils: Call init callbacks on the correct codec DAI asoc_sdw_rtd_init() needs to call the rtd_init() callbacks for each codec in a dailink. It was finding the codecs by looking for the matching DAI name in codec_info_list[] but this isn't correct, because the DAI name isn't guaranteed to be unique. Parts using the same codec driver (so the same DAI names) might require different machine driver setup. Instead, get the struct sdw_slave and extract the SoundWire part ID. Use this to lookup the entry in codec_info_list[]. This is the same identity info that was used to find the entry when the machine driver created the dailink. Signed-off-by: Richard Fitzgerald Fixes: e377c9477317 ("ASoC: intel/sdw_utils: move soundwire codec_info_list structure") Reviewed-by: Pierre-Louis Bossart Reviewed-by: Charles Keepax Link: https://patch.msgid.link/20260112140758.215799-3-rf@opensource.cirrus.com Signed-off-by: Mark Brown --- sound/soc/sdw_utils/soc_sdw_utils.c | 43 ++++++++++++++++++++++++++++- 1 file changed, 42 insertions(+), 1 deletion(-) diff --git a/sound/soc/sdw_utils/soc_sdw_utils.c b/sound/soc/sdw_utils/soc_sdw_utils.c index bf382aa07e92f..ccf149f949e8f 100644 --- a/sound/soc/sdw_utils/soc_sdw_utils.c +++ b/sound/soc/sdw_utils/soc_sdw_utils.c @@ -841,6 +841,19 @@ struct asoc_sdw_codec_info *asoc_sdw_find_codec_info_part(const u64 adr) } EXPORT_SYMBOL_NS(asoc_sdw_find_codec_info_part, "SND_SOC_SDW_UTILS"); +static struct asoc_sdw_codec_info *asoc_sdw_find_codec_info_sdw_id(const struct sdw_slave_id *id) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(codec_info_list); i++) + if (id->part_id == codec_info_list[i].part_id && + (!codec_info_list[i].version_id || + id->sdw_version == codec_info_list[i].version_id)) + return &codec_info_list[i]; + + return NULL; +} + struct asoc_sdw_codec_info *asoc_sdw_find_codec_info_acpi(const u8 *acpi_id) { int i; @@ -873,22 +886,46 @@ struct asoc_sdw_codec_info *asoc_sdw_find_codec_info_dai(const char *dai_name, i } EXPORT_SYMBOL_NS(asoc_sdw_find_codec_info_dai, "SND_SOC_SDW_UTILS"); +static int asoc_sdw_find_codec_info_dai_index(const struct asoc_sdw_codec_info *codec_info, + const char *dai_name) +{ + int i; + + for (i = 0; i < codec_info->dai_num; i++) { + if (!strcmp(codec_info->dais[i].dai_name, dai_name)) + return i; + } + + return -ENOENT; +} + int asoc_sdw_rtd_init(struct snd_soc_pcm_runtime *rtd) { struct snd_soc_card *card = rtd->card; struct snd_soc_dapm_context *dapm = snd_soc_card_to_dapm(card); struct asoc_sdw_codec_info *codec_info; struct snd_soc_dai *dai; + struct sdw_slave *sdw_peripheral; const char *spk_components=""; int dai_index; int ret; int i; for_each_rtd_codec_dais(rtd, i, dai) { - codec_info = asoc_sdw_find_codec_info_dai(dai->name, &dai_index); + if (is_sdw_slave(dai->component->dev)) + sdw_peripheral = dev_to_sdw_dev(dai->component->dev); + else if (dai->component->dev->parent && is_sdw_slave(dai->component->dev->parent)) + sdw_peripheral = dev_to_sdw_dev(dai->component->dev->parent); + else + continue; + + codec_info = asoc_sdw_find_codec_info_sdw_id(&sdw_peripheral->id); if (!codec_info) return -EINVAL; + dai_index = asoc_sdw_find_codec_info_dai_index(codec_info, dai->name); + WARN_ON(dai_index < 0); + /* * A codec dai can be connected to different dai links for capture and playback, * but we only need to call the rtd_init function once. @@ -898,6 +935,10 @@ int asoc_sdw_rtd_init(struct snd_soc_pcm_runtime *rtd) if (codec_info->dais[dai_index].rtd_init_done) continue; + dev_dbg(card->dev, "%#x/%s initializing for %s/%s\n", + codec_info->part_id, codec_info->dais[dai_index].dai_name, + dai->component->name, dai->name); + /* * Add card controls and dapm widgets for the first codec dai. * The controls and widgets will be used for all codec dais. From a892c3e93052c1d6af43ce01e4ab19ee3a9298c7 Mon Sep 17 00:00:00 2001 From: Emil Svendsen Date: Tue, 13 Jan 2026 11:58:44 +0100 Subject: [PATCH 1104/1321] ASoC: tlv320adcx140: invert DRE_ENABLE Looking at section 8.6.1.1.69 in datasheets for both 5140 and 6140 (3140 doesn't support DRE). REG ADCX140_DSP_CFG1 BIT 3 field "DRE_AGC_SEL" it select either DRE or AGC. It states: * 0 = DRE * 1 = AGC The control is called "DRE_ENABLE" and for it to be true it has to be active low. This commit will invert the control so "DRE_ENABLE" is active low. Signed-off-by: Emil Svendsen Signed-off-by: Sascha Hauer Link: https://patch.msgid.link/20260113-sound-soc-codecs-tvl320adcx140-v4-1-8f7ecec525c8@pengutronix.de Signed-off-by: Mark Brown --- sound/soc/codecs/tlv320adcx140.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sound/soc/codecs/tlv320adcx140.c b/sound/soc/codecs/tlv320adcx140.c index 443cf59cb71ab..75e1007012a48 100644 --- a/sound/soc/codecs/tlv320adcx140.c +++ b/sound/soc/codecs/tlv320adcx140.c @@ -338,7 +338,7 @@ static const struct snd_kcontrol_new adcx140_dapm_ch4_dre_en_switch = SOC_DAPM_SINGLE("Switch", ADCX140_CH4_CFG0, 0, 1, 0); static const struct snd_kcontrol_new adcx140_dapm_dre_en_switch = - SOC_DAPM_SINGLE("Switch", ADCX140_DSP_CFG1, 3, 1, 0); + SOC_DAPM_SINGLE("Switch", ADCX140_DSP_CFG1, 3, 1, 1); /* Output Mixer */ static const struct snd_kcontrol_new adcx140_output_mixer_controls[] = { From 4b9562e0bb619fe2dd0d584c7bea64455e0200f3 Mon Sep 17 00:00:00 2001 From: Emil Svendsen Date: Tue, 13 Jan 2026 11:58:45 +0100 Subject: [PATCH 1105/1321] ASoC: tlv320adcx140: fix null pointer The "snd_soc_component" in "adcx140_priv" was only used once but never set. It was only used for reaching "dev" which is already present in "adcx140_priv". Fixes: 4e82971f7b55 ("ASoC: tlv320adcx140: Add a new kcontrol") Signed-off-by: Emil Svendsen Signed-off-by: Sascha Hauer Link: https://patch.msgid.link/20260113-sound-soc-codecs-tvl320adcx140-v4-2-8f7ecec525c8@pengutronix.de Signed-off-by: Mark Brown --- sound/soc/codecs/tlv320adcx140.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/sound/soc/codecs/tlv320adcx140.c b/sound/soc/codecs/tlv320adcx140.c index 75e1007012a48..3fb1b6251e6f8 100644 --- a/sound/soc/codecs/tlv320adcx140.c +++ b/sound/soc/codecs/tlv320adcx140.c @@ -23,7 +23,6 @@ #include "tlv320adcx140.h" struct adcx140_priv { - struct snd_soc_component *component; struct regulator *supply_areg; struct gpio_desc *gpio_reset; struct regmap *regmap; @@ -699,7 +698,6 @@ static void adcx140_pwr_ctrl(struct adcx140_priv *adcx140, bool power_state) { int pwr_ctrl = 0; int ret = 0; - struct snd_soc_component *component = adcx140->component; if (power_state) pwr_ctrl = ADCX140_PWR_CFG_ADC_PDZ | ADCX140_PWR_CFG_PLL_PDZ; @@ -711,7 +709,7 @@ static void adcx140_pwr_ctrl(struct adcx140_priv *adcx140, bool power_state) ret = regmap_write(adcx140->regmap, ADCX140_PHASE_CALIB, adcx140->phase_calib_on ? 0x00 : 0x40); if (ret) - dev_err(component->dev, "%s: register write error %d\n", + dev_err(adcx140->dev, "%s: register write error %d\n", __func__, ret); } From a918086e56c23c1434b8220528c9791899b19729 Mon Sep 17 00:00:00 2001 From: Dimitrios Katsaros Date: Tue, 13 Jan 2026 11:58:46 +0100 Subject: [PATCH 1106/1321] ASoC: tlv320adcx140: Propagate error codes during probe When scanning for the reset pin, we could get an -EPROBE_DEFER. The driver would assume that no reset pin had been defined, which would mean that the chip would never be powered. Now we both respect any error we get from devm_gpiod_get_optional. We also now properly report the missing GPIO definition when 'gpio_reset' is NULL. Signed-off-by: Dimitrios Katsaros Signed-off-by: Sascha Hauer Link: https://patch.msgid.link/20260113-sound-soc-codecs-tvl320adcx140-v4-3-8f7ecec525c8@pengutronix.de Signed-off-by: Mark Brown --- sound/soc/codecs/tlv320adcx140.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/sound/soc/codecs/tlv320adcx140.c b/sound/soc/codecs/tlv320adcx140.c index 3fb1b6251e6f8..58a6dfa228cc3 100644 --- a/sound/soc/codecs/tlv320adcx140.c +++ b/sound/soc/codecs/tlv320adcx140.c @@ -1154,6 +1154,9 @@ static int adcx140_i2c_probe(struct i2c_client *i2c) adcx140->gpio_reset = devm_gpiod_get_optional(adcx140->dev, "reset", GPIOD_OUT_LOW); if (IS_ERR(adcx140->gpio_reset)) + return dev_err_probe(&i2c->dev, PTR_ERR(adcx140->gpio_reset), + "Failed to get Reset GPIO\n"); + if (!adcx140->gpio_reset) dev_info(&i2c->dev, "Reset GPIO not defined\n"); adcx140->supply_areg = devm_regulator_get_optional(adcx140->dev, From 1ad86ee5154e6cdc03a862a39f9b467bcf90dd50 Mon Sep 17 00:00:00 2001 From: Emil Svendsen Date: Tue, 13 Jan 2026 11:58:47 +0100 Subject: [PATCH 1107/1321] ASoC: tlv320adcx140: fix word length The word length is the physical width of the channel slots. So the hw_params would misconfigure when format width and physical width doesn't match. Like S24_LE which has data width of 24 bits but physical width of 32 bits. So if using asymmetric formats you will get a lot of noise. Fixes: 689c7655b50c5 ("ASoC: tlv320adcx140: Add the tlv320adcx140 codec driver family") Signed-off-by: Emil Svendsen Signed-off-by: Sascha Hauer Link: https://patch.msgid.link/20260113-sound-soc-codecs-tvl320adcx140-v4-4-8f7ecec525c8@pengutronix.de Signed-off-by: Mark Brown --- sound/soc/codecs/tlv320adcx140.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/sound/soc/codecs/tlv320adcx140.c b/sound/soc/codecs/tlv320adcx140.c index 58a6dfa228cc3..fdf4a9add852d 100644 --- a/sound/soc/codecs/tlv320adcx140.c +++ b/sound/soc/codecs/tlv320adcx140.c @@ -725,7 +725,7 @@ static int adcx140_hw_params(struct snd_pcm_substream *substream, struct adcx140_priv *adcx140 = snd_soc_component_get_drvdata(component); u8 data = 0; - switch (params_width(params)) { + switch (params_physical_width(params)) { case 16: data = ADCX140_16_BIT_WORD; break; @@ -740,7 +740,7 @@ static int adcx140_hw_params(struct snd_pcm_substream *substream, break; default: dev_err(component->dev, "%s: Unsupported width %d\n", - __func__, params_width(params)); + __func__, params_physical_width(params)); return -EINVAL; } From 94be4e68d01a8cfddd1af5f4ef2ca684295670fa Mon Sep 17 00:00:00 2001 From: Mark Brown Date: Wed, 14 Jan 2026 22:08:35 +0000 Subject: [PATCH 1108/1321] ASoC: rt5640: Fix duplicate clock properties in DT binding Not quite overlapping changes to the rt5640 binding resulted in duplicate definitions of the clocks and clock-names properties. Delete one of them, preferring the simpler one. Reported-by: Jon Hunter Closes: https://lore.kernel.org/r/0e68c5f4-f68d-4544-bc7a-40694829db75@nvidia.com Signed-off-by: Mark Brown Link: https://patch.msgid.link/20260114-asoc-fix-rt5640-dt-clocks-v1-1-421d438673c2@kernel.org Signed-off-by: Mark Brown --- .../devicetree/bindings/sound/realtek,rt5640.yaml | 8 -------- 1 file changed, 8 deletions(-) diff --git a/Documentation/devicetree/bindings/sound/realtek,rt5640.yaml b/Documentation/devicetree/bindings/sound/realtek,rt5640.yaml index 42c0c8e285b8a..5f2bd7cfab634 100644 --- a/Documentation/devicetree/bindings/sound/realtek,rt5640.yaml +++ b/Documentation/devicetree/bindings/sound/realtek,rt5640.yaml @@ -44,14 +44,6 @@ properties: - realtek,rt5640 - realtek,rt5639 - clocks: - items: - - description: phandle and clock specifier for codec MCLK. - - clock-names: - items: - - const: mclk - reg: maxItems: 1 From e220f09bebe15b3bb7887344894f14fcb4807403 Mon Sep 17 00:00:00 2001 From: Shenghao Ding Date: Thu, 15 Jan 2026 20:49:06 +0800 Subject: [PATCH 1109/1321] ALSA: hda/tas2781: Add newly-released HP laptop HP released the new laptop with the subid 0x103C. Signed-off-by: Shenghao Ding Link: https://patch.msgid.link/20260115124907.629-1-shenghao-ding@ti.com Signed-off-by: Takashi Iwai --- sound/hda/codecs/side-codecs/tas2781_hda_i2c.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/sound/hda/codecs/side-codecs/tas2781_hda_i2c.c b/sound/hda/codecs/side-codecs/tas2781_hda_i2c.c index 0e4bda3a544ea..624a822341bb7 100644 --- a/sound/hda/codecs/side-codecs/tas2781_hda_i2c.c +++ b/sound/hda/codecs/side-codecs/tas2781_hda_i2c.c @@ -2,7 +2,7 @@ // // TAS2781 HDA I2C driver // -// Copyright 2023 - 2025 Texas Instruments, Inc. +// Copyright 2023 - 2026 Texas Instruments, Inc. // // Author: Shenghao Ding // Current maintainer: Baojun Xu @@ -571,6 +571,9 @@ static int tas2781_hda_bind(struct device *dev, struct device *master, case 0x1028: tas_hda->catlog_id = DELL; break; + case 0x103C: + tas_hda->catlog_id = HP; + break; default: tas_hda->catlog_id = LENOVO; break; From 8717ade329365aadf1e71ca1cd01e17b9ed5a591 Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Wed, 7 Jan 2026 16:14:55 +0100 Subject: [PATCH 1110/1321] ACPI: PM: s2idle: Add missing checks to acpi_s2idle_begin_lps0() Commit 32ece31db4df ("ACPI: PM: s2idle: Only retrieve constraints when needed"), that attempted to avoid useless evaluation of LPS0 _DSM Function 1 in lps0_device_attach(), forgot to add checks for lps0_device_handle and sleep_no_lps0 to acpi_s2idle_begin_lps0() where they should be done before calling lpi_device_get_constraints() or lpi_device_get_constraints_amd(). Add the missing checks. Fixes: 32ece31db4df ("ACPI: PM: s2idle: Only retrieve constraints when needed") Signed-off-by: Rafael J. Wysocki Reviewed-by: Mario Limonciello (AMD) Link: https://patch.msgid.link/2818730.mvXUDI8C0e@rafael.j.wysocki --- drivers/acpi/x86/s2idle.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/acpi/x86/s2idle.c b/drivers/acpi/x86/s2idle.c index 6d4d06236f616..2a9edb53d5d45 100644 --- a/drivers/acpi/x86/s2idle.c +++ b/drivers/acpi/x86/s2idle.c @@ -515,7 +515,8 @@ static struct acpi_scan_handler lps0_handler = { static int acpi_s2idle_begin_lps0(void) { - if (pm_debug_messages_on && !lpi_constraints_table) { + if (lps0_device_handle && !sleep_no_lps0 && pm_debug_messages_on && + !lpi_constraints_table) { if (acpi_s2idle_vendor_amd()) lpi_device_get_constraints_amd(); else From fe62f06f6aebc8d8fbcee18f1487463702cd5104 Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Tue, 13 Jan 2026 14:36:54 +0100 Subject: [PATCH 1111/1321] ACPI: PM: s2idle: Add module parameter for LPS0 constraints checking Commit 32ece31db4df ("ACPI: PM: s2idle: Only retrieve constraints when needed") attempted to avoid useless evaluation of LPS0 _DSM Function 1 in lps0_device_attach() because pm_debug_messages_on might never be set (and that is the case on production systems most of the time), but it turns out that LPS0 _DSM Function 1 is generally problematic on some platforms and causes suspend issues to occur when pm_debug_messages_on is set now. In Linux, LPS0 _DSM Function 1 is only useful for diagnostics and only in the cases when the system does not reach the deepest platform idle state during suspend-to-idle for some reason. If such diagnostics is not necessary, evaluating it is a loss of time, so using it along with the other pm_debug_messages_on diagnostics is questionable because the latter is expected to be suitable for collecting debug information even during production use of system suspend. For this reason, add a module parameter called check_lps0_constraints to control whether or not the list of LPS0 constraints will be checked in acpi_s2idle_prepare_late_lps0() and so whether or not to evaluate LPS0 _DSM Function 1 (once) in acpi_s2idle_begin_lps0(). Fixes: 32ece31db4df ("ACPI: PM: s2idle: Only retrieve constraints when needed") Signed-off-by: Rafael J. Wysocki Reviewed-by: Mario Limonciello (AMD) Link: https://patch.msgid.link/2827214.mvXUDI8C0e@rafael.j.wysocki --- drivers/acpi/x86/s2idle.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/acpi/x86/s2idle.c b/drivers/acpi/x86/s2idle.c index 2a9edb53d5d45..cc3c83e4cc23b 100644 --- a/drivers/acpi/x86/s2idle.c +++ b/drivers/acpi/x86/s2idle.c @@ -28,6 +28,10 @@ static bool sleep_no_lps0 __read_mostly; module_param(sleep_no_lps0, bool, 0644); MODULE_PARM_DESC(sleep_no_lps0, "Do not use the special LPS0 device interface"); +static bool check_lps0_constraints __read_mostly; +module_param(check_lps0_constraints, bool, 0644); +MODULE_PARM_DESC(check_lps0_constraints, "Check LPS0 device constraints"); + static const struct acpi_device_id lps0_device_ids[] = { {"PNP0D80", }, {"", }, @@ -515,7 +519,7 @@ static struct acpi_scan_handler lps0_handler = { static int acpi_s2idle_begin_lps0(void) { - if (lps0_device_handle && !sleep_no_lps0 && pm_debug_messages_on && + if (lps0_device_handle && !sleep_no_lps0 && check_lps0_constraints && !lpi_constraints_table) { if (acpi_s2idle_vendor_amd()) lpi_device_get_constraints_amd(); @@ -540,7 +544,7 @@ static int acpi_s2idle_prepare_late_lps0(void) if (!lps0_device_handle || sleep_no_lps0) return 0; - if (pm_debug_messages_on) + if (check_lps0_constraints) lpi_check_constraints(); /* Screen off */ From df617a2dc9da2ba23cc62ca64ddfc5a753e00cd9 Mon Sep 17 00:00:00 2001 From: Yaxiong Tian Date: Tue, 30 Dec 2025 14:15:34 +0800 Subject: [PATCH 1112/1321] PM: EM: Fix incorrect description of the cost field in struct em_perf_state Due to commit 1b600da51073 ("PM: EM: Optimize em_cpu_energy() and remove division"), the logic for energy consumption calculation has been modified. The actual calculation of cost is 10 * power * max_frequency / frequency instead of power * max_frequency / frequency. Therefore, the comment for cost has been updated to reflect the correct content. Fixes: 1b600da51073 ("PM: EM: Optimize em_cpu_energy() and remove division") Signed-off-by: Yaxiong Tian Reviewed-by: Lukasz Luba [ rjw: Added Fixes: tag ] Link: https://patch.msgid.link/20251230061534.816894-1-tianyaxiong@kylinos.cn Signed-off-by: Rafael J. Wysocki --- include/linux/energy_model.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/energy_model.h b/include/linux/energy_model.h index 43aa6153dc578..e7497f8046442 100644 --- a/include/linux/energy_model.h +++ b/include/linux/energy_model.h @@ -18,7 +18,7 @@ * @power: The power consumed at this level (by 1 CPU or by a registered * device). It can be a total power: static and dynamic. * @cost: The cost coefficient associated with this level, used during - * energy calculation. Equal to: power * max_frequency / frequency + * energy calculation. Equal to: 10 * power * max_frequency / frequency * @flags: see "em_perf_state flags" description below. */ struct em_perf_state { From 2b03ddb53cd4d25724c4dcc3a6e854adbd2e8670 Mon Sep 17 00:00:00 2001 From: Malaya Kumar Rout Date: Mon, 5 Jan 2026 16:07:29 +0530 Subject: [PATCH 1113/1321] PM: EM: Fix memory leak in em_create_pd() error path When ida_alloc() fails in em_create_pd(), the function returns without freeing the previously allocated 'pd' structure, leading to a memory leak. The 'pd' pointer is allocated either at line 436 (for CPU devices with cpumask) or line 442 (for other devices) using kzalloc(). Additionally, the function incorrectly returns -ENOMEM when ida_alloc() fails, ignoring the actual error code returned by ida_alloc(), which can fail for reasons other than memory exhaustion. Fix both issues by: 1. Freeing the 'pd' structure with kfree() when ida_alloc() fails 2. Returning the actual error code from ida_alloc() instead of -ENOMEM This ensures proper cleanup on the error path and accurate error reporting. Fixes: cbe5aeedecc7 ("PM: EM: Assign a unique ID when creating a performance domain") Signed-off-by: Malaya Kumar Rout Reviewed-by: Changwoo Min Link: https://patch.msgid.link/20260105103730.65626-1-mrout@redhat.com Signed-off-by: Rafael J. Wysocki --- kernel/power/energy_model.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/kernel/power/energy_model.c b/kernel/power/energy_model.c index 11af9f64aa827..5b055cbe5341e 100644 --- a/kernel/power/energy_model.c +++ b/kernel/power/energy_model.c @@ -449,8 +449,10 @@ static int em_create_pd(struct device *dev, int nr_states, INIT_LIST_HEAD(&pd->node); id = ida_alloc(&em_pd_ida, GFP_KERNEL); - if (id < 0) - return -ENOMEM; + if (id < 0) { + kfree(pd); + return id; + } pd->id = id; em_table = em_table_alloc(pd); From b91c41590c09e58e0762112ea240655895225338 Mon Sep 17 00:00:00 2001 From: Changwoo Min Date: Thu, 8 Jan 2026 14:32:09 +0900 Subject: [PATCH 1114/1321] PM: EM: Fix yamllint warnings in the EM YNL spec MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The energy model YNL spec has the following two warnings when checking with yamlint: 3:1 warning missing document start "---" (document-start) 107:13 error wrong indentation: expected 10 but found 12 (indentation) So let’s fix whose lint warnings. Fixes: bd26631ccdfd ("PM: EM: Add em.yaml and autogen files") Suggested-by: Donald Hunter Reviewed-by: Lukasz Luba Reviewed-by: Donald Hunter Signed-off-by: Changwoo Min Link: https://patch.msgid.link/20260108053212.642478-2-changwoo@igalia.com Signed-off-by: Rafael J. Wysocki --- Documentation/netlink/specs/em.yaml | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/Documentation/netlink/specs/em.yaml b/Documentation/netlink/specs/em.yaml index 9905ca482325d..0c595a874f080 100644 --- a/Documentation/netlink/specs/em.yaml +++ b/Documentation/netlink/specs/em.yaml @@ -1,5 +1,8 @@ # SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) - +# +# Copyright (c) 2025 Valve Corporation. +# +--- name: em doc: | @@ -104,7 +107,7 @@ operations: attribute-set: pd-table event: attributes: - - pd-id + - pd-id mcgrp: event mcast-groups: From b7d2f886627801486e2987c51767ac6f9fe5a4b3 Mon Sep 17 00:00:00 2001 From: Changwoo Min Date: Thu, 8 Jan 2026 14:32:10 +0900 Subject: [PATCH 1115/1321] PM: EM: Rename em.yaml to dev-energymodel.yaml MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The EM YNL specification used many acronyms, including ‘em’, ‘pd’, ‘ps’, etc. While the acronyms are short and convenient, they could be confusing. So, let’s spell them out to be more specific. The following changes were made in the spec. Note that the protocol name cannot exceed GENL_NAMSIZ (16). em -> dev-energymodel pds -> perf-domains pd -> perf-domain pd-id -> perf-domain-id pd-table -> perf-table ps -> perf-state get-pds -> get-perf-domains get-pd-table -> get-perf-table pd-created -> perf-domain-created pd-updated -> perf-domain-updated pd-deleted -> perf-domain-deleted In addition. doc strings were added to the spec. based on the comments in energy_model.h. Two flag attributes (perf-state-flags and perf-domain-flags) were added for easily interpreting the bit flags. Finally, the autogenerated files and em_netlink.c were updated accordingly to reflect the name changes. Suggested-by: Donald Hunter Reviewed-by: Lukasz Luba Reviewed-by: Donald Hunter Signed-off-by: Changwoo Min Link: https://patch.msgid.link/20260108053212.642478-3-changwoo@igalia.com Signed-off-by: Rafael J. Wysocki --- .../netlink/specs/dev-energymodel.yaml | 175 ++++++++++++++++++ Documentation/netlink/specs/em.yaml | 116 ------------ MAINTAINERS | 8 +- include/uapi/linux/dev_energymodel.h | 89 +++++++++ include/uapi/linux/energy_model.h | 63 ------- kernel/power/em_netlink.c | 135 ++++++++------ kernel/power/em_netlink_autogen.c | 44 ++--- kernel/power/em_netlink_autogen.h | 20 +- 8 files changed, 384 insertions(+), 266 deletions(-) create mode 100644 Documentation/netlink/specs/dev-energymodel.yaml delete mode 100644 Documentation/netlink/specs/em.yaml create mode 100644 include/uapi/linux/dev_energymodel.h delete mode 100644 include/uapi/linux/energy_model.h diff --git a/Documentation/netlink/specs/dev-energymodel.yaml b/Documentation/netlink/specs/dev-energymodel.yaml new file mode 100644 index 0000000000000..cbc4bc38f23cc --- /dev/null +++ b/Documentation/netlink/specs/dev-energymodel.yaml @@ -0,0 +1,175 @@ +# SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) +# +# Copyright (c) 2025 Valve Corporation. +# +--- +name: dev-energymodel + +doc: | + Energy model netlink interface to notify its changes. + +protocol: genetlink + +uapi-header: linux/dev_energymodel.h + +definitions: + - + type: flags + name: perf-state-flags + entries: + - + name: perf-state-inefficient + doc: >- + The performance state is inefficient. There is in this perf-domain, + another performance state with a higher frequency but a lower or + equal power cost. + - + type: flags + name: perf-domain-flags + entries: + - + name: perf-domain-microwatts + doc: >- + The power values are in micro-Watts or some other scale. + - + name: perf-domain-skip-inefficiencies + doc: >- + Skip inefficient states when estimating energy consumption. + - + name: perf-domain-artificial + doc: >- + The power values are artificial and might be created by platform + missing real power information. + +attribute-sets: + - + name: perf-domains + doc: >- + Information on all the performance domains. + attributes: + - + name: perf-domain + type: nest + nested-attributes: perf-domain + multi-attr: true + - + name: perf-domain + doc: >- + Information on a single performance domains. + attributes: + - + name: pad + type: pad + - + name: perf-domain-id + type: u32 + doc: >- + A unique ID number for each performance domain. + - + name: flags + type: u64 + doc: >- + Bitmask of performance domain flags. + enum: perf-domain-flags + - + name: cpus + type: string + doc: >- + CPUs that belong to this performance domain. + - + name: perf-table + doc: >- + Performance states table. + attributes: + - + name: perf-domain-id + type: u32 + doc: >- + A unique ID number for each performance domain. + - + name: perf-state + type: nest + nested-attributes: perf-state + multi-attr: true + - + name: perf-state + doc: >- + Performance state of a performance domain. + attributes: + - + name: pad + type: pad + - + name: performance + type: u64 + doc: >- + CPU performance (capacity) at a given frequency. + - + name: frequency + type: u64 + doc: >- + The frequency in KHz, for consistency with CPUFreq. + - + name: power + type: u64 + doc: >- + The power consumed at this level (by 1 CPU or by a registered + device). It can be a total power: static and dynamic. + - + name: cost + type: u64 + doc: >- + The cost coefficient associated with this level, used during energy + calculation. Equal to: power * max_frequency / frequency. + - + name: flags + type: u64 + doc: >- + Bitmask of performance state flags. + enum: perf-state-flags + +operations: + list: + - + name: get-perf-domains + attribute-set: perf-domains + doc: Get the list of information for all performance domains. + do: + reply: + attributes: + - perf-domain + - + name: get-perf-table + attribute-set: perf-table + doc: Get the energy model table of a performance domain. + do: + request: + attributes: + - perf-domain-id + reply: + attributes: + - perf-domain-id + - perf-state + - + name: perf-domain-created + doc: A performance domain is created. + notify: get-perf-table + mcgrp: event + - + name: perf-domain-updated + doc: A performance domain is updated. + notify: get-perf-table + mcgrp: event + - + name: perf-domain-deleted + doc: A performance domain is deleted. + attribute-set: perf-table + event: + attributes: + - perf-domain-id + mcgrp: event + +mcast-groups: + list: + - + name: event diff --git a/Documentation/netlink/specs/em.yaml b/Documentation/netlink/specs/em.yaml deleted file mode 100644 index 0c595a874f080..0000000000000 --- a/Documentation/netlink/specs/em.yaml +++ /dev/null @@ -1,116 +0,0 @@ -# SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) -# -# Copyright (c) 2025 Valve Corporation. -# ---- -name: em - -doc: | - Energy model netlink interface to notify its changes. - -protocol: genetlink - -uapi-header: linux/energy_model.h - -attribute-sets: - - - name: pds - attributes: - - - name: pd - type: nest - nested-attributes: pd - multi-attr: true - - - name: pd - attributes: - - - name: pad - type: pad - - - name: pd-id - type: u32 - - - name: flags - type: u64 - - - name: cpus - type: string - - - name: pd-table - attributes: - - - name: pd-id - type: u32 - - - name: ps - type: nest - nested-attributes: ps - multi-attr: true - - - name: ps - attributes: - - - name: pad - type: pad - - - name: performance - type: u64 - - - name: frequency - type: u64 - - - name: power - type: u64 - - - name: cost - type: u64 - - - name: flags - type: u64 - -operations: - list: - - - name: get-pds - attribute-set: pds - doc: Get the list of information for all performance domains. - do: - reply: - attributes: - - pd - - - name: get-pd-table - attribute-set: pd-table - doc: Get the energy model table of a performance domain. - do: - request: - attributes: - - pd-id - reply: - attributes: - - pd-id - - ps - - - name: pd-created - doc: A performance domain is created. - notify: get-pd-table - mcgrp: event - - - name: pd-updated - doc: A performance domain is updated. - notify: get-pd-table - mcgrp: event - - - name: pd-deleted - doc: A performance domain is deleted. - attribute-set: pd-table - event: - attributes: - - pd-id - mcgrp: event - -mcast-groups: - list: - - - name: event diff --git a/MAINTAINERS b/MAINTAINERS index f1b0205885973..41ace5aeecbe5 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -9304,12 +9304,12 @@ M: Lukasz Luba M: "Rafael J. Wysocki" L: linux-pm@vger.kernel.org S: Maintained -F: kernel/power/energy_model.c -F: include/linux/energy_model.h +F: Documentation/netlink/specs/dev-energymodel.yaml F: Documentation/power/energy-model.rst -F: Documentation/netlink/specs/em.yaml -F: include/uapi/linux/energy_model.h +F: include/linux/energy_model.h +F: include/uapi/linux/dev_energymodel.h F: kernel/power/em_netlink*.* +F: kernel/power/energy_model.c EPAPR HYPERVISOR BYTE CHANNEL DEVICE DRIVER M: Laurentiu Tudor diff --git a/include/uapi/linux/dev_energymodel.h b/include/uapi/linux/dev_energymodel.h new file mode 100644 index 0000000000000..3399967e1f930 --- /dev/null +++ b/include/uapi/linux/dev_energymodel.h @@ -0,0 +1,89 @@ +/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) */ +/* Do not edit directly, auto-generated from: */ +/* Documentation/netlink/specs/dev-energymodel.yaml */ +/* YNL-GEN uapi header */ +/* To regenerate run: tools/net/ynl/ynl-regen.sh */ + +#ifndef _UAPI_LINUX_DEV_ENERGYMODEL_H +#define _UAPI_LINUX_DEV_ENERGYMODEL_H + +#define DEV_ENERGYMODEL_FAMILY_NAME "dev-energymodel" +#define DEV_ENERGYMODEL_FAMILY_VERSION 1 + +/** + * enum dev_energymodel_perf_state_flags + * @DEV_ENERGYMODEL_PERF_STATE_FLAGS_PERF_STATE_INEFFICIENT: The performance + * state is inefficient. There is in this perf-domain, another performance + * state with a higher frequency but a lower or equal power cost. + */ +enum dev_energymodel_perf_state_flags { + DEV_ENERGYMODEL_PERF_STATE_FLAGS_PERF_STATE_INEFFICIENT = 1, +}; + +/** + * enum dev_energymodel_perf_domain_flags + * @DEV_ENERGYMODEL_PERF_DOMAIN_FLAGS_PERF_DOMAIN_MICROWATTS: The power values + * are in micro-Watts or some other scale. + * @DEV_ENERGYMODEL_PERF_DOMAIN_FLAGS_PERF_DOMAIN_SKIP_INEFFICIENCIES: Skip + * inefficient states when estimating energy consumption. + * @DEV_ENERGYMODEL_PERF_DOMAIN_FLAGS_PERF_DOMAIN_ARTIFICIAL: The power values + * are artificial and might be created by platform missing real power + * information. + */ +enum dev_energymodel_perf_domain_flags { + DEV_ENERGYMODEL_PERF_DOMAIN_FLAGS_PERF_DOMAIN_MICROWATTS = 1, + DEV_ENERGYMODEL_PERF_DOMAIN_FLAGS_PERF_DOMAIN_SKIP_INEFFICIENCIES = 2, + DEV_ENERGYMODEL_PERF_DOMAIN_FLAGS_PERF_DOMAIN_ARTIFICIAL = 4, +}; + +enum { + DEV_ENERGYMODEL_A_PERF_DOMAINS_PERF_DOMAIN = 1, + + __DEV_ENERGYMODEL_A_PERF_DOMAINS_MAX, + DEV_ENERGYMODEL_A_PERF_DOMAINS_MAX = (__DEV_ENERGYMODEL_A_PERF_DOMAINS_MAX - 1) +}; + +enum { + DEV_ENERGYMODEL_A_PERF_DOMAIN_PAD = 1, + DEV_ENERGYMODEL_A_PERF_DOMAIN_PERF_DOMAIN_ID, + DEV_ENERGYMODEL_A_PERF_DOMAIN_FLAGS, + DEV_ENERGYMODEL_A_PERF_DOMAIN_CPUS, + + __DEV_ENERGYMODEL_A_PERF_DOMAIN_MAX, + DEV_ENERGYMODEL_A_PERF_DOMAIN_MAX = (__DEV_ENERGYMODEL_A_PERF_DOMAIN_MAX - 1) +}; + +enum { + DEV_ENERGYMODEL_A_PERF_TABLE_PERF_DOMAIN_ID = 1, + DEV_ENERGYMODEL_A_PERF_TABLE_PERF_STATE, + + __DEV_ENERGYMODEL_A_PERF_TABLE_MAX, + DEV_ENERGYMODEL_A_PERF_TABLE_MAX = (__DEV_ENERGYMODEL_A_PERF_TABLE_MAX - 1) +}; + +enum { + DEV_ENERGYMODEL_A_PERF_STATE_PAD = 1, + DEV_ENERGYMODEL_A_PERF_STATE_PERFORMANCE, + DEV_ENERGYMODEL_A_PERF_STATE_FREQUENCY, + DEV_ENERGYMODEL_A_PERF_STATE_POWER, + DEV_ENERGYMODEL_A_PERF_STATE_COST, + DEV_ENERGYMODEL_A_PERF_STATE_FLAGS, + + __DEV_ENERGYMODEL_A_PERF_STATE_MAX, + DEV_ENERGYMODEL_A_PERF_STATE_MAX = (__DEV_ENERGYMODEL_A_PERF_STATE_MAX - 1) +}; + +enum { + DEV_ENERGYMODEL_CMD_GET_PERF_DOMAINS = 1, + DEV_ENERGYMODEL_CMD_GET_PERF_TABLE, + DEV_ENERGYMODEL_CMD_PERF_DOMAIN_CREATED, + DEV_ENERGYMODEL_CMD_PERF_DOMAIN_UPDATED, + DEV_ENERGYMODEL_CMD_PERF_DOMAIN_DELETED, + + __DEV_ENERGYMODEL_CMD_MAX, + DEV_ENERGYMODEL_CMD_MAX = (__DEV_ENERGYMODEL_CMD_MAX - 1) +}; + +#define DEV_ENERGYMODEL_MCGRP_EVENT "event" + +#endif /* _UAPI_LINUX_DEV_ENERGYMODEL_H */ diff --git a/include/uapi/linux/energy_model.h b/include/uapi/linux/energy_model.h deleted file mode 100644 index 0bcad967854ff..0000000000000 --- a/include/uapi/linux/energy_model.h +++ /dev/null @@ -1,63 +0,0 @@ -/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) */ -/* Do not edit directly, auto-generated from: */ -/* Documentation/netlink/specs/em.yaml */ -/* YNL-GEN uapi header */ -/* To regenerate run: tools/net/ynl/ynl-regen.sh */ - -#ifndef _UAPI_LINUX_ENERGY_MODEL_H -#define _UAPI_LINUX_ENERGY_MODEL_H - -#define EM_FAMILY_NAME "em" -#define EM_FAMILY_VERSION 1 - -enum { - EM_A_PDS_PD = 1, - - __EM_A_PDS_MAX, - EM_A_PDS_MAX = (__EM_A_PDS_MAX - 1) -}; - -enum { - EM_A_PD_PAD = 1, - EM_A_PD_PD_ID, - EM_A_PD_FLAGS, - EM_A_PD_CPUS, - - __EM_A_PD_MAX, - EM_A_PD_MAX = (__EM_A_PD_MAX - 1) -}; - -enum { - EM_A_PD_TABLE_PD_ID = 1, - EM_A_PD_TABLE_PS, - - __EM_A_PD_TABLE_MAX, - EM_A_PD_TABLE_MAX = (__EM_A_PD_TABLE_MAX - 1) -}; - -enum { - EM_A_PS_PAD = 1, - EM_A_PS_PERFORMANCE, - EM_A_PS_FREQUENCY, - EM_A_PS_POWER, - EM_A_PS_COST, - EM_A_PS_FLAGS, - - __EM_A_PS_MAX, - EM_A_PS_MAX = (__EM_A_PS_MAX - 1) -}; - -enum { - EM_CMD_GET_PDS = 1, - EM_CMD_GET_PD_TABLE, - EM_CMD_PD_CREATED, - EM_CMD_PD_UPDATED, - EM_CMD_PD_DELETED, - - __EM_CMD_MAX, - EM_CMD_MAX = (__EM_CMD_MAX - 1) -}; - -#define EM_MCGRP_EVENT "event" - -#endif /* _UAPI_LINUX_ENERGY_MODEL_H */ diff --git a/kernel/power/em_netlink.c b/kernel/power/em_netlink.c index 4b85da138a063..6f6238c465bb9 100644 --- a/kernel/power/em_netlink.c +++ b/kernel/power/em_netlink.c @@ -12,27 +12,31 @@ #include #include #include -#include +#include #include "em_netlink.h" #include "em_netlink_autogen.h" -#define EM_A_PD_CPUS_LEN 256 +#define DEV_ENERGYMODEL_A_PERF_DOMAIN_CPUS_LEN 256 /*************************** Command encoding ********************************/ static int __em_nl_get_pd_size(struct em_perf_domain *pd, void *data) { - char cpus_buf[EM_A_PD_CPUS_LEN]; + char cpus_buf[DEV_ENERGYMODEL_A_PERF_DOMAIN_CPUS_LEN]; int *tot_msg_sz = data; int msg_sz, cpus_sz; cpus_sz = snprintf(cpus_buf, sizeof(cpus_buf), "%*pb", cpumask_pr_args(to_cpumask(pd->cpus))); - msg_sz = nla_total_size(0) + /* EM_A_PDS_PD */ - nla_total_size(sizeof(u32)) + /* EM_A_PD_PD_ID */ - nla_total_size_64bit(sizeof(u64)) + /* EM_A_PD_FLAGS */ - nla_total_size(cpus_sz); /* EM_A_PD_CPUS */ + msg_sz = nla_total_size(0) + + /* DEV_ENERGYMODEL_A_PERF_DOMAINS_PERF_DOMAIN */ + nla_total_size(sizeof(u32)) + + /* DEV_ENERGYMODEL_A_PERF_DOMAIN_PERF_DOMAIN_ID */ + nla_total_size_64bit(sizeof(u64)) + + /* DEV_ENERGYMODEL_A_PERF_DOMAIN_FLAGS */ + nla_total_size(cpus_sz); + /* DEV_ENERGYMODEL_A_PERF_DOMAIN_CPUS */ *tot_msg_sz += nlmsg_total_size(genlmsg_msg_size(msg_sz)); return 0; @@ -40,23 +44,26 @@ static int __em_nl_get_pd_size(struct em_perf_domain *pd, void *data) static int __em_nl_get_pd(struct em_perf_domain *pd, void *data) { - char cpus_buf[EM_A_PD_CPUS_LEN]; + char cpus_buf[DEV_ENERGYMODEL_A_PERF_DOMAIN_CPUS_LEN]; struct sk_buff *msg = data; struct nlattr *entry; - entry = nla_nest_start(msg, EM_A_PDS_PD); + entry = nla_nest_start(msg, + DEV_ENERGYMODEL_A_PERF_DOMAINS_PERF_DOMAIN); if (!entry) goto out_cancel_nest; - if (nla_put_u32(msg, EM_A_PD_PD_ID, pd->id)) + if (nla_put_u32(msg, DEV_ENERGYMODEL_A_PERF_DOMAIN_PERF_DOMAIN_ID, + pd->id)) goto out_cancel_nest; - if (nla_put_u64_64bit(msg, EM_A_PD_FLAGS, pd->flags, EM_A_PD_PAD)) + if (nla_put_u64_64bit(msg, DEV_ENERGYMODEL_A_PERF_DOMAIN_FLAGS, + pd->flags, DEV_ENERGYMODEL_A_PERF_DOMAIN_PAD)) goto out_cancel_nest; snprintf(cpus_buf, sizeof(cpus_buf), "%*pb", cpumask_pr_args(to_cpumask(pd->cpus))); - if (nla_put_string(msg, EM_A_PD_CPUS, cpus_buf)) + if (nla_put_string(msg, DEV_ENERGYMODEL_A_PERF_DOMAIN_CPUS, cpus_buf)) goto out_cancel_nest; nla_nest_end(msg, entry); @@ -69,7 +76,8 @@ static int __em_nl_get_pd(struct em_perf_domain *pd, void *data) return -EMSGSIZE; } -int em_nl_get_pds_doit(struct sk_buff *skb, struct genl_info *info) +int dev_energymodel_nl_get_perf_domains_doit(struct sk_buff *skb, + struct genl_info *info) { struct sk_buff *msg; void *hdr; @@ -82,7 +90,7 @@ int em_nl_get_pds_doit(struct sk_buff *skb, struct genl_info *info) if (!msg) return -ENOMEM; - hdr = genlmsg_put_reply(msg, info, &em_nl_family, 0, cmd); + hdr = genlmsg_put_reply(msg, info, &dev_energymodel_nl_family, 0, cmd); if (!hdr) goto out_free_msg; @@ -107,10 +115,10 @@ static struct em_perf_domain *__em_nl_get_pd_table_id(struct nlattr **attrs) struct em_perf_domain *pd; int id; - if (!attrs[EM_A_PD_TABLE_PD_ID]) + if (!attrs[DEV_ENERGYMODEL_A_PERF_TABLE_PERF_DOMAIN_ID]) return NULL; - id = nla_get_u32(attrs[EM_A_PD_TABLE_PD_ID]); + id = nla_get_u32(attrs[DEV_ENERGYMODEL_A_PERF_TABLE_PERF_DOMAIN_ID]); pd = em_perf_domain_get_by_id(id); return pd; } @@ -119,25 +127,34 @@ static int __em_nl_get_pd_table_size(const struct em_perf_domain *pd) { int id_sz, ps_sz; - id_sz = nla_total_size(sizeof(u32)); /* EM_A_PD_TABLE_PD_ID */ - ps_sz = nla_total_size(0) + /* EM_A_PD_TABLE_PS */ - nla_total_size_64bit(sizeof(u64)) + /* EM_A_PS_PERFORMANCE */ - nla_total_size_64bit(sizeof(u64)) + /* EM_A_PS_FREQUENCY */ - nla_total_size_64bit(sizeof(u64)) + /* EM_A_PS_POWER */ - nla_total_size_64bit(sizeof(u64)) + /* EM_A_PS_COST */ - nla_total_size_64bit(sizeof(u64)); /* EM_A_PS_FLAGS */ + id_sz = nla_total_size(sizeof(u32)); + /* DEV_ENERGYMODEL_A_PERF_TABLE_PERF_DOMAIN_ID */ + ps_sz = nla_total_size(0) + + /* DEV_ENERGYMODEL_A_PERF_TABLE_PERF_STATE */ + nla_total_size_64bit(sizeof(u64)) + + /* DEV_ENERGYMODEL_A_PERF_STATE_PERFORMANCE */ + nla_total_size_64bit(sizeof(u64)) + + /* DEV_ENERGYMODEL_A_PERF_STATE_FREQUENCY */ + nla_total_size_64bit(sizeof(u64)) + + /* DEV_ENERGYMODEL_A_PERF_STATE_POWER */ + nla_total_size_64bit(sizeof(u64)) + + /* DEV_ENERGYMODEL_A_PERF_STATE_COST */ + nla_total_size_64bit(sizeof(u64)); + /* DEV_ENERGYMODEL_A_PERF_STATE_FLAGS */ ps_sz *= pd->nr_perf_states; return nlmsg_total_size(genlmsg_msg_size(id_sz + ps_sz)); } -static int __em_nl_get_pd_table(struct sk_buff *msg, const struct em_perf_domain *pd) +static +int __em_nl_get_pd_table(struct sk_buff *msg, const struct em_perf_domain *pd) { struct em_perf_state *table, *ps; struct nlattr *entry; int i; - if (nla_put_u32(msg, EM_A_PD_TABLE_PD_ID, pd->id)) + if (nla_put_u32(msg, DEV_ENERGYMODEL_A_PERF_TABLE_PERF_DOMAIN_ID, + pd->id)) goto out_err; rcu_read_lock(); @@ -146,24 +163,35 @@ static int __em_nl_get_pd_table(struct sk_buff *msg, const struct em_perf_domain for (i = 0; i < pd->nr_perf_states; i++) { ps = &table[i]; - entry = nla_nest_start(msg, EM_A_PD_TABLE_PS); + entry = nla_nest_start(msg, + DEV_ENERGYMODEL_A_PERF_TABLE_PERF_STATE); if (!entry) goto out_unlock_ps; - if (nla_put_u64_64bit(msg, EM_A_PS_PERFORMANCE, - ps->performance, EM_A_PS_PAD)) + if (nla_put_u64_64bit(msg, + DEV_ENERGYMODEL_A_PERF_STATE_PERFORMANCE, + ps->performance, + DEV_ENERGYMODEL_A_PERF_STATE_PAD)) goto out_cancel_ps_nest; - if (nla_put_u64_64bit(msg, EM_A_PS_FREQUENCY, - ps->frequency, EM_A_PS_PAD)) + if (nla_put_u64_64bit(msg, + DEV_ENERGYMODEL_A_PERF_STATE_FREQUENCY, + ps->frequency, + DEV_ENERGYMODEL_A_PERF_STATE_PAD)) goto out_cancel_ps_nest; - if (nla_put_u64_64bit(msg, EM_A_PS_POWER, - ps->power, EM_A_PS_PAD)) + if (nla_put_u64_64bit(msg, + DEV_ENERGYMODEL_A_PERF_STATE_POWER, + ps->power, + DEV_ENERGYMODEL_A_PERF_STATE_PAD)) goto out_cancel_ps_nest; - if (nla_put_u64_64bit(msg, EM_A_PS_COST, - ps->cost, EM_A_PS_PAD)) + if (nla_put_u64_64bit(msg, + DEV_ENERGYMODEL_A_PERF_STATE_COST, + ps->cost, + DEV_ENERGYMODEL_A_PERF_STATE_PAD)) goto out_cancel_ps_nest; - if (nla_put_u64_64bit(msg, EM_A_PS_FLAGS, - ps->flags, EM_A_PS_PAD)) + if (nla_put_u64_64bit(msg, + DEV_ENERGYMODEL_A_PERF_STATE_FLAGS, + ps->flags, + DEV_ENERGYMODEL_A_PERF_STATE_PAD)) goto out_cancel_ps_nest; nla_nest_end(msg, entry); @@ -179,7 +207,8 @@ static int __em_nl_get_pd_table(struct sk_buff *msg, const struct em_perf_domain return -EMSGSIZE; } -int em_nl_get_pd_table_doit(struct sk_buff *skb, struct genl_info *info) +int dev_energymodel_nl_get_perf_table_doit(struct sk_buff *skb, + struct genl_info *info) { int cmd = info->genlhdr->cmd; int msg_sz, ret = -EMSGSIZE; @@ -197,7 +226,7 @@ int em_nl_get_pd_table_doit(struct sk_buff *skb, struct genl_info *info) if (!msg) return -ENOMEM; - hdr = genlmsg_put_reply(msg, info, &em_nl_family, 0, cmd); + hdr = genlmsg_put_reply(msg, info, &dev_energymodel_nl_family, 0, cmd); if (!hdr) goto out_free_msg; @@ -221,7 +250,7 @@ static void __em_notify_pd_table(const struct em_perf_domain *pd, int ntf_type) int msg_sz, ret = -EMSGSIZE; void *hdr; - if (!genl_has_listeners(&em_nl_family, &init_net, EM_NLGRP_EVENT)) + if (!genl_has_listeners(&dev_energymodel_nl_family, &init_net, DEV_ENERGYMODEL_NLGRP_EVENT)) return; msg_sz = __em_nl_get_pd_table_size(pd); @@ -230,7 +259,7 @@ static void __em_notify_pd_table(const struct em_perf_domain *pd, int ntf_type) if (!msg) return; - hdr = genlmsg_put(msg, 0, 0, &em_nl_family, 0, ntf_type); + hdr = genlmsg_put(msg, 0, 0, &dev_energymodel_nl_family, 0, ntf_type); if (!hdr) goto out_free_msg; @@ -240,28 +269,28 @@ static void __em_notify_pd_table(const struct em_perf_domain *pd, int ntf_type) genlmsg_end(msg, hdr); - genlmsg_multicast(&em_nl_family, msg, 0, EM_NLGRP_EVENT, GFP_KERNEL); + genlmsg_multicast(&dev_energymodel_nl_family, msg, 0, + DEV_ENERGYMODEL_NLGRP_EVENT, GFP_KERNEL); return; out_free_msg: nlmsg_free(msg); - return; } void em_notify_pd_created(const struct em_perf_domain *pd) { - __em_notify_pd_table(pd, EM_CMD_PD_CREATED); + __em_notify_pd_table(pd, DEV_ENERGYMODEL_CMD_PERF_DOMAIN_CREATED); } void em_notify_pd_updated(const struct em_perf_domain *pd) { - __em_notify_pd_table(pd, EM_CMD_PD_UPDATED); + __em_notify_pd_table(pd, DEV_ENERGYMODEL_CMD_PERF_DOMAIN_UPDATED); } static int __em_notify_pd_deleted_size(const struct em_perf_domain *pd) { - int id_sz = nla_total_size(sizeof(u32)); /* EM_A_PD_TABLE_PD_ID */ + int id_sz = nla_total_size(sizeof(u32)); /* DEV_ENERGYMODEL_A_PERF_TABLE_PERF_DOMAIN_ID */ return nlmsg_total_size(genlmsg_msg_size(id_sz)); } @@ -272,7 +301,8 @@ void em_notify_pd_deleted(const struct em_perf_domain *pd) void *hdr; int msg_sz; - if (!genl_has_listeners(&em_nl_family, &init_net, EM_NLGRP_EVENT)) + if (!genl_has_listeners(&dev_energymodel_nl_family, &init_net, + DEV_ENERGYMODEL_NLGRP_EVENT)) return; msg_sz = __em_notify_pd_deleted_size(pd); @@ -281,28 +311,29 @@ void em_notify_pd_deleted(const struct em_perf_domain *pd) if (!msg) return; - hdr = genlmsg_put(msg, 0, 0, &em_nl_family, 0, EM_CMD_PD_DELETED); + hdr = genlmsg_put(msg, 0, 0, &dev_energymodel_nl_family, 0, + DEV_ENERGYMODEL_CMD_PERF_DOMAIN_DELETED); if (!hdr) goto out_free_msg; - if (nla_put_u32(msg, EM_A_PD_TABLE_PD_ID, pd->id)) { + if (nla_put_u32(msg, DEV_ENERGYMODEL_A_PERF_TABLE_PERF_DOMAIN_ID, + pd->id)) goto out_free_msg; - } genlmsg_end(msg, hdr); - genlmsg_multicast(&em_nl_family, msg, 0, EM_NLGRP_EVENT, GFP_KERNEL); + genlmsg_multicast(&dev_energymodel_nl_family, msg, 0, + DEV_ENERGYMODEL_NLGRP_EVENT, GFP_KERNEL); return; out_free_msg: nlmsg_free(msg); - return; } /**************************** Initialization *********************************/ static int __init em_netlink_init(void) { - return genl_register_family(&em_nl_family); + return genl_register_family(&dev_energymodel_nl_family); } postcore_initcall(em_netlink_init); diff --git a/kernel/power/em_netlink_autogen.c b/kernel/power/em_netlink_autogen.c index ceb3b2bb6ebe0..44acef0e7df2b 100644 --- a/kernel/power/em_netlink_autogen.c +++ b/kernel/power/em_netlink_autogen.c @@ -1,6 +1,6 @@ // SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) /* Do not edit directly, auto-generated from: */ -/* Documentation/netlink/specs/em.yaml */ +/* Documentation/netlink/specs/dev-energymodel.yaml */ /* YNL-GEN kernel source */ /* To regenerate run: tools/net/ynl/ynl-regen.sh */ @@ -9,41 +9,41 @@ #include "em_netlink_autogen.h" -#include +#include -/* EM_CMD_GET_PD_TABLE - do */ -static const struct nla_policy em_get_pd_table_nl_policy[EM_A_PD_TABLE_PD_ID + 1] = { - [EM_A_PD_TABLE_PD_ID] = { .type = NLA_U32, }, +/* DEV_ENERGYMODEL_CMD_GET_PERF_TABLE - do */ +static const struct nla_policy dev_energymodel_get_perf_table_nl_policy[DEV_ENERGYMODEL_A_PERF_TABLE_PERF_DOMAIN_ID + 1] = { + [DEV_ENERGYMODEL_A_PERF_TABLE_PERF_DOMAIN_ID] = { .type = NLA_U32, }, }; -/* Ops table for em */ -static const struct genl_split_ops em_nl_ops[] = { +/* Ops table for dev_energymodel */ +static const struct genl_split_ops dev_energymodel_nl_ops[] = { { - .cmd = EM_CMD_GET_PDS, - .doit = em_nl_get_pds_doit, + .cmd = DEV_ENERGYMODEL_CMD_GET_PERF_DOMAINS, + .doit = dev_energymodel_nl_get_perf_domains_doit, .flags = GENL_CMD_CAP_DO, }, { - .cmd = EM_CMD_GET_PD_TABLE, - .doit = em_nl_get_pd_table_doit, - .policy = em_get_pd_table_nl_policy, - .maxattr = EM_A_PD_TABLE_PD_ID, + .cmd = DEV_ENERGYMODEL_CMD_GET_PERF_TABLE, + .doit = dev_energymodel_nl_get_perf_table_doit, + .policy = dev_energymodel_get_perf_table_nl_policy, + .maxattr = DEV_ENERGYMODEL_A_PERF_TABLE_PERF_DOMAIN_ID, .flags = GENL_CMD_CAP_DO, }, }; -static const struct genl_multicast_group em_nl_mcgrps[] = { - [EM_NLGRP_EVENT] = { "event", }, +static const struct genl_multicast_group dev_energymodel_nl_mcgrps[] = { + [DEV_ENERGYMODEL_NLGRP_EVENT] = { "event", }, }; -struct genl_family em_nl_family __ro_after_init = { - .name = EM_FAMILY_NAME, - .version = EM_FAMILY_VERSION, +struct genl_family dev_energymodel_nl_family __ro_after_init = { + .name = DEV_ENERGYMODEL_FAMILY_NAME, + .version = DEV_ENERGYMODEL_FAMILY_VERSION, .netnsok = true, .parallel_ops = true, .module = THIS_MODULE, - .split_ops = em_nl_ops, - .n_split_ops = ARRAY_SIZE(em_nl_ops), - .mcgrps = em_nl_mcgrps, - .n_mcgrps = ARRAY_SIZE(em_nl_mcgrps), + .split_ops = dev_energymodel_nl_ops, + .n_split_ops = ARRAY_SIZE(dev_energymodel_nl_ops), + .mcgrps = dev_energymodel_nl_mcgrps, + .n_mcgrps = ARRAY_SIZE(dev_energymodel_nl_mcgrps), }; diff --git a/kernel/power/em_netlink_autogen.h b/kernel/power/em_netlink_autogen.h index 140ab548103ce..f7e4bddcbd53a 100644 --- a/kernel/power/em_netlink_autogen.h +++ b/kernel/power/em_netlink_autogen.h @@ -1,24 +1,26 @@ /* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) */ /* Do not edit directly, auto-generated from: */ -/* Documentation/netlink/specs/em.yaml */ +/* Documentation/netlink/specs/dev-energymodel.yaml */ /* YNL-GEN kernel header */ /* To regenerate run: tools/net/ynl/ynl-regen.sh */ -#ifndef _LINUX_EM_GEN_H -#define _LINUX_EM_GEN_H +#ifndef _LINUX_DEV_ENERGYMODEL_GEN_H +#define _LINUX_DEV_ENERGYMODEL_GEN_H #include #include -#include +#include -int em_nl_get_pds_doit(struct sk_buff *skb, struct genl_info *info); -int em_nl_get_pd_table_doit(struct sk_buff *skb, struct genl_info *info); +int dev_energymodel_nl_get_perf_domains_doit(struct sk_buff *skb, + struct genl_info *info); +int dev_energymodel_nl_get_perf_table_doit(struct sk_buff *skb, + struct genl_info *info); enum { - EM_NLGRP_EVENT, + DEV_ENERGYMODEL_NLGRP_EVENT, }; -extern struct genl_family em_nl_family; +extern struct genl_family dev_energymodel_nl_family; -#endif /* _LINUX_EM_GEN_H */ +#endif /* _LINUX_DEV_ENERGYMODEL_GEN_H */ From af23b574c98e41098a44c2dd79b6ece8b2afe6dc Mon Sep 17 00:00:00 2001 From: Changwoo Min Date: Thu, 8 Jan 2026 14:32:11 +0900 Subject: [PATCH 1116/1321] PM: EM: Change cpus' type from string to u64 array in the EM YNL spec MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Previously, the cpus attribute was a string format which was a "%*pb" stringification of a bitmap. That is not very consumable for a UAPI, so let’s change it to an u64 array of CPU ids. Suggested-by: Donald Hunter Reviewed-by: Lukasz Luba Reviewed-by: Donald Hunter Signed-off-by: Changwoo Min Link: https://patch.msgid.link/20260108053212.642478-4-changwoo@igalia.com Signed-off-by: Rafael J. Wysocki --- .../netlink/specs/dev-energymodel.yaml | 3 ++- kernel/power/em_netlink.c | 22 +++++++++---------- 2 files changed, 13 insertions(+), 12 deletions(-) diff --git a/Documentation/netlink/specs/dev-energymodel.yaml b/Documentation/netlink/specs/dev-energymodel.yaml index cbc4bc38f23cc..af8b8f72f722c 100644 --- a/Documentation/netlink/specs/dev-energymodel.yaml +++ b/Documentation/netlink/specs/dev-energymodel.yaml @@ -73,7 +73,8 @@ attribute-sets: enum: perf-domain-flags - name: cpus - type: string + type: u64 + multi-attr: true doc: >- CPUs that belong to this performance domain. - diff --git a/kernel/power/em_netlink.c b/kernel/power/em_netlink.c index 6f6238c465bb9..b6edb018c65a6 100644 --- a/kernel/power/em_netlink.c +++ b/kernel/power/em_netlink.c @@ -17,17 +17,14 @@ #include "em_netlink.h" #include "em_netlink_autogen.h" -#define DEV_ENERGYMODEL_A_PERF_DOMAIN_CPUS_LEN 256 - /*************************** Command encoding ********************************/ static int __em_nl_get_pd_size(struct em_perf_domain *pd, void *data) { - char cpus_buf[DEV_ENERGYMODEL_A_PERF_DOMAIN_CPUS_LEN]; + int nr_cpus, msg_sz, cpus_sz; int *tot_msg_sz = data; - int msg_sz, cpus_sz; - cpus_sz = snprintf(cpus_buf, sizeof(cpus_buf), "%*pb", - cpumask_pr_args(to_cpumask(pd->cpus))); + nr_cpus = cpumask_weight(to_cpumask(pd->cpus)); + cpus_sz = nla_total_size_64bit(sizeof(u64)) * nr_cpus; msg_sz = nla_total_size(0) + /* DEV_ENERGYMODEL_A_PERF_DOMAINS_PERF_DOMAIN */ @@ -44,9 +41,10 @@ static int __em_nl_get_pd_size(struct em_perf_domain *pd, void *data) static int __em_nl_get_pd(struct em_perf_domain *pd, void *data) { - char cpus_buf[DEV_ENERGYMODEL_A_PERF_DOMAIN_CPUS_LEN]; struct sk_buff *msg = data; + struct cpumask *cpumask; struct nlattr *entry; + int cpu; entry = nla_nest_start(msg, DEV_ENERGYMODEL_A_PERF_DOMAINS_PERF_DOMAIN); @@ -61,10 +59,12 @@ static int __em_nl_get_pd(struct em_perf_domain *pd, void *data) pd->flags, DEV_ENERGYMODEL_A_PERF_DOMAIN_PAD)) goto out_cancel_nest; - snprintf(cpus_buf, sizeof(cpus_buf), "%*pb", - cpumask_pr_args(to_cpumask(pd->cpus))); - if (nla_put_string(msg, DEV_ENERGYMODEL_A_PERF_DOMAIN_CPUS, cpus_buf)) - goto out_cancel_nest; + cpumask = to_cpumask(pd->cpus); + for_each_cpu(cpu, cpumask) { + if (nla_put_u64_64bit(msg, DEV_ENERGYMODEL_A_PERF_DOMAIN_CPUS, + cpu, DEV_ENERGYMODEL_A_PERF_DOMAIN_PAD)) + goto out_cancel_nest; + } nla_nest_end(msg, entry); From c0d128716e0db0b268f12ae09d433213392bdd54 Mon Sep 17 00:00:00 2001 From: Changwoo Min Date: Thu, 8 Jan 2026 14:32:12 +0900 Subject: [PATCH 1117/1321] PM: EM: Add dump to get-perf-domains in the EM YNL spec Add dump to get-perf-domains, so that a user can fetch either information about a specific performance domain with do or information about all performance domains with dump. Share the reply format of do and dump using perf-domain-attrs, so remove perf-domains. The YNL spec, autogenerated files, and the do implementation are updated, and the dump implementation is added. Suggested-by: Donald Hunter Reviewed-by: Lukasz Luba Reviewed-by: Donald Hunter Signed-off-by: Changwoo Min Link: https://patch.msgid.link/20260108053212.642478-5-changwoo@igalia.com Signed-off-by: Rafael J. Wysocki --- .../netlink/specs/dev-energymodel.yaml | 25 ++++--- include/uapi/linux/dev_energymodel.h | 7 -- kernel/power/em_netlink.c | 68 ++++++++++++++----- kernel/power/em_netlink_autogen.c | 16 ++++- kernel/power/em_netlink_autogen.h | 2 + 5 files changed, 80 insertions(+), 38 deletions(-) diff --git a/Documentation/netlink/specs/dev-energymodel.yaml b/Documentation/netlink/specs/dev-energymodel.yaml index af8b8f72f722c..11faabfdfbe8f 100644 --- a/Documentation/netlink/specs/dev-energymodel.yaml +++ b/Documentation/netlink/specs/dev-energymodel.yaml @@ -42,16 +42,6 @@ definitions: missing real power information. attribute-sets: - - - name: perf-domains - doc: >- - Information on all the performance domains. - attributes: - - - name: perf-domain - type: nest - nested-attributes: perf-domain - multi-attr: true - name: perf-domain doc: >- @@ -133,12 +123,21 @@ operations: list: - name: get-perf-domains - attribute-set: perf-domains + attribute-set: perf-domain doc: Get the list of information for all performance domains. do: - reply: + request: attributes: - - perf-domain + - perf-domain-id + reply: + attributes: &perf-domain-attrs + - pad + - perf-domain-id + - flags + - cpus + dump: + reply: + attributes: *perf-domain-attrs - name: get-perf-table attribute-set: perf-table diff --git a/include/uapi/linux/dev_energymodel.h b/include/uapi/linux/dev_energymodel.h index 3399967e1f930..355d8885c9a07 100644 --- a/include/uapi/linux/dev_energymodel.h +++ b/include/uapi/linux/dev_energymodel.h @@ -36,13 +36,6 @@ enum dev_energymodel_perf_domain_flags { DEV_ENERGYMODEL_PERF_DOMAIN_FLAGS_PERF_DOMAIN_ARTIFICIAL = 4, }; -enum { - DEV_ENERGYMODEL_A_PERF_DOMAINS_PERF_DOMAIN = 1, - - __DEV_ENERGYMODEL_A_PERF_DOMAINS_MAX, - DEV_ENERGYMODEL_A_PERF_DOMAINS_MAX = (__DEV_ENERGYMODEL_A_PERF_DOMAINS_MAX - 1) -}; - enum { DEV_ENERGYMODEL_A_PERF_DOMAIN_PAD = 1, DEV_ENERGYMODEL_A_PERF_DOMAIN_PERF_DOMAIN_ID, diff --git a/kernel/power/em_netlink.c b/kernel/power/em_netlink.c index b6edb018c65a6..5a611d3950fd5 100644 --- a/kernel/power/em_netlink.c +++ b/kernel/power/em_netlink.c @@ -18,6 +18,13 @@ #include "em_netlink_autogen.h" /*************************** Command encoding ********************************/ +struct dump_ctx { + int idx; + int start; + struct sk_buff *skb; + struct netlink_callback *cb; +}; + static int __em_nl_get_pd_size(struct em_perf_domain *pd, void *data) { int nr_cpus, msg_sz, cpus_sz; @@ -43,14 +50,8 @@ static int __em_nl_get_pd(struct em_perf_domain *pd, void *data) { struct sk_buff *msg = data; struct cpumask *cpumask; - struct nlattr *entry; int cpu; - entry = nla_nest_start(msg, - DEV_ENERGYMODEL_A_PERF_DOMAINS_PERF_DOMAIN); - if (!entry) - goto out_cancel_nest; - if (nla_put_u32(msg, DEV_ENERGYMODEL_A_PERF_DOMAIN_PERF_DOMAIN_ID, pd->id)) goto out_cancel_nest; @@ -66,26 +67,50 @@ static int __em_nl_get_pd(struct em_perf_domain *pd, void *data) goto out_cancel_nest; } - nla_nest_end(msg, entry); - return 0; out_cancel_nest: - nla_nest_cancel(msg, entry); - return -EMSGSIZE; } +static int __em_nl_get_pd_for_dump(struct em_perf_domain *pd, void *data) +{ + const struct genl_info *info; + struct dump_ctx *ctx = data; + void *hdr; + int ret; + + if (ctx->idx++ < ctx->start) + return 0; + + info = genl_info_dump(ctx->cb); + hdr = genlmsg_iput(ctx->skb, info); + if (!hdr) { + genlmsg_cancel(ctx->skb, hdr); + return -EMSGSIZE; + } + + ret = __em_nl_get_pd(pd, ctx->skb); + genlmsg_end(ctx->skb, hdr); + return ret; +} + int dev_energymodel_nl_get_perf_domains_doit(struct sk_buff *skb, struct genl_info *info) { + int id, ret = -EMSGSIZE, msg_sz = 0; + int cmd = info->genlhdr->cmd; + struct em_perf_domain *pd; struct sk_buff *msg; void *hdr; - int cmd = info->genlhdr->cmd; - int ret = -EMSGSIZE, msg_sz = 0; - for_each_em_perf_domain(__em_nl_get_pd_size, &msg_sz); + if (!info->attrs[DEV_ENERGYMODEL_A_PERF_DOMAIN_PERF_DOMAIN_ID]) + return -EINVAL; + id = nla_get_u32(info->attrs[DEV_ENERGYMODEL_A_PERF_DOMAIN_PERF_DOMAIN_ID]); + pd = em_perf_domain_get_by_id(id); + + __em_nl_get_pd_size(pd, &msg_sz); msg = genlmsg_new(msg_sz, GFP_KERNEL); if (!msg) return -ENOMEM; @@ -94,10 +119,9 @@ int dev_energymodel_nl_get_perf_domains_doit(struct sk_buff *skb, if (!hdr) goto out_free_msg; - ret = for_each_em_perf_domain(__em_nl_get_pd, msg); + ret = __em_nl_get_pd(pd, msg); if (ret) goto out_cancel_msg; - genlmsg_end(msg, hdr); return genlmsg_reply(msg, info); @@ -106,10 +130,22 @@ int dev_energymodel_nl_get_perf_domains_doit(struct sk_buff *skb, genlmsg_cancel(msg, hdr); out_free_msg: nlmsg_free(msg); - return ret; } +int dev_energymodel_nl_get_perf_domains_dumpit(struct sk_buff *skb, + struct netlink_callback *cb) +{ + struct dump_ctx ctx = { + .idx = 0, + .start = cb->args[0], + .skb = skb, + .cb = cb, + }; + + return for_each_em_perf_domain(__em_nl_get_pd_for_dump, &ctx); +} + static struct em_perf_domain *__em_nl_get_pd_table_id(struct nlattr **attrs) { struct em_perf_domain *pd; diff --git a/kernel/power/em_netlink_autogen.c b/kernel/power/em_netlink_autogen.c index 44acef0e7df2b..fedd473e4244e 100644 --- a/kernel/power/em_netlink_autogen.c +++ b/kernel/power/em_netlink_autogen.c @@ -11,6 +11,11 @@ #include +/* DEV_ENERGYMODEL_CMD_GET_PERF_DOMAINS - do */ +static const struct nla_policy dev_energymodel_get_perf_domains_nl_policy[DEV_ENERGYMODEL_A_PERF_DOMAIN_PERF_DOMAIN_ID + 1] = { + [DEV_ENERGYMODEL_A_PERF_DOMAIN_PERF_DOMAIN_ID] = { .type = NLA_U32, }, +}; + /* DEV_ENERGYMODEL_CMD_GET_PERF_TABLE - do */ static const struct nla_policy dev_energymodel_get_perf_table_nl_policy[DEV_ENERGYMODEL_A_PERF_TABLE_PERF_DOMAIN_ID + 1] = { [DEV_ENERGYMODEL_A_PERF_TABLE_PERF_DOMAIN_ID] = { .type = NLA_U32, }, @@ -18,10 +23,17 @@ static const struct nla_policy dev_energymodel_get_perf_table_nl_policy[DEV_ENER /* Ops table for dev_energymodel */ static const struct genl_split_ops dev_energymodel_nl_ops[] = { + { + .cmd = DEV_ENERGYMODEL_CMD_GET_PERF_DOMAINS, + .doit = dev_energymodel_nl_get_perf_domains_doit, + .policy = dev_energymodel_get_perf_domains_nl_policy, + .maxattr = DEV_ENERGYMODEL_A_PERF_DOMAIN_PERF_DOMAIN_ID, + .flags = GENL_CMD_CAP_DO, + }, { .cmd = DEV_ENERGYMODEL_CMD_GET_PERF_DOMAINS, - .doit = dev_energymodel_nl_get_perf_domains_doit, - .flags = GENL_CMD_CAP_DO, + .dumpit = dev_energymodel_nl_get_perf_domains_dumpit, + .flags = GENL_CMD_CAP_DUMP, }, { .cmd = DEV_ENERGYMODEL_CMD_GET_PERF_TABLE, diff --git a/kernel/power/em_netlink_autogen.h b/kernel/power/em_netlink_autogen.h index f7e4bddcbd53a..5caf2f7e18a51 100644 --- a/kernel/power/em_netlink_autogen.h +++ b/kernel/power/em_netlink_autogen.h @@ -14,6 +14,8 @@ int dev_energymodel_nl_get_perf_domains_doit(struct sk_buff *skb, struct genl_info *info); +int dev_energymodel_nl_get_perf_domains_dumpit(struct sk_buff *skb, + struct netlink_callback *cb); int dev_energymodel_nl_get_perf_table_doit(struct sk_buff *skb, struct genl_info *info); From e47db746dcd9605111e550aebd43bda2fe24e4e6 Mon Sep 17 00:00:00 2001 From: Boqun Feng Date: Fri, 26 Dec 2025 19:39:38 +0800 Subject: [PATCH 1118/1321] PCI: Provide pci_free_irq_vectors() stub 473b9f331718 ("rust: pci: fix build failure when CONFIG_PCI_MSI is disabled") fixed a build error by providing Rust helpers when CONFIG_PCI_MSI is not set. However the Rust helpers rely on pci_free_irq_vectors(), which is only available when CONFIG_PCI=y. When CONFIG_PCI is not set, there is already a stub for pci_alloc_irq_vectors(). Add a similar stub for pci_free_irq_vectors(). Fixes: 473b9f331718 ("rust: pci: fix build failure when CONFIG_PCI_MSI is disabled") Reported-by: FUJITA Tomonori Closes: https://lore.kernel.org/rust-for-linux/20251209014312.575940-1-fujita.tomonori@gmail.com/ Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202512220740.4Kexm4dW-lkp@intel.com/ Reported-by: Liang Jie Closes: https://lore.kernel.org/rust-for-linux/20251222034415.1384223-1-buaajxlj@163.com/ Signed-off-by: Boqun Feng Signed-off-by: Bjorn Helgaas Reviewed-by: Drew Fustini Reviewed-by: David Gow Reviewed-by: Joel Fernandes Reviewed-by: Danilo Krummrich Link: https://patch.msgid.link/20251226113938.52145-1-boqun.feng@gmail.com --- include/linux/pci.h | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/include/linux/pci.h b/include/linux/pci.h index 864775651c6fa..b5cc0c2b99065 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -2210,6 +2210,10 @@ pci_alloc_irq_vectors(struct pci_dev *dev, unsigned int min_vecs, { return -ENOSPC; } + +static inline void pci_free_irq_vectors(struct pci_dev *dev) +{ +} #endif /* CONFIG_PCI */ /* Include architecture-dependent settings and functions */ From c274a29e81221988f8571559b583aa6b0003a781 Mon Sep 17 00:00:00 2001 From: Dan Williams Date: Thu, 6 Nov 2025 15:13:50 -0800 Subject: [PATCH 1119/1321] x86/kaslr: Recognize all ZONE_DEVICE users as physaddr consumers Commit 7ffb791423c7 ("x86/kaslr: Reduce KASLR entropy on most x86 systems") is too narrow. The effect being mitigated in that commit is caused by ZONE_DEVICE which PCI_P2PDMA has a dependency. ZONE_DEVICE, in general, lets any physical address be added to the direct-map. I.e. not only ACPI hotplug ranges, CXL Memory Windows, or EFI Specific Purpose Memory, but also any PCI MMIO range for the DEVICE_PRIVATE and PCI_P2PDMA cases. Update the mitigation, limit KASLR entropy, to apply in all ZONE_DEVICE=y cases. Distro kernels typically have PCI_P2PDMA=y, so the practical exposure of this problem is limited to the PCI_P2PDMA=n case. A potential path to recover entropy would be to walk ACPI and determine the limits for hotplug and PCI MMIO before kernel_randomize_memory(). On smaller systems that could yield some KASLR address bits. This needs additional investigation to determine if some limited ACPI table scanning can happen this early without an open coded solution like arch/x86/boot/compressed/acpi.c needs to deploy. Cc: Ingo Molnar Cc: Kees Cook Cc: Bjorn Helgaas Cc: Peter Zijlstra Cc: Andy Lutomirski Cc: Logan Gunthorpe Cc: Andrew Morton Cc: David Hildenbrand Cc: Lorenzo Stoakes Cc: "Liam R. Howlett" Cc: Vlastimil Babka Cc: Mike Rapoport Cc: Suren Baghdasaryan Cc: Michal Hocko Fixes: 7ffb791423c7 ("x86/kaslr: Reduce KASLR entropy on most x86 systems") Cc: Signed-off-by: Dan Williams Reviewed-by: Balbir Singh Tested-by: Yasunori Goto Acked-by: Dave Hansen Link: http://patch.msgid.link/692e08b2516d4_261c1100a3@dwillia2-mobl4.notmuch Signed-off-by: Dave Jiang --- arch/x86/mm/kaslr.c | 10 +++++----- drivers/pci/Kconfig | 6 ------ mm/Kconfig | 12 ++++++++---- 3 files changed, 13 insertions(+), 15 deletions(-) diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c index 3c306de52fd4d..834641c6049a5 100644 --- a/arch/x86/mm/kaslr.c +++ b/arch/x86/mm/kaslr.c @@ -115,12 +115,12 @@ void __init kernel_randomize_memory(void) /* * Adapt physical memory region size based on available memory, - * except when CONFIG_PCI_P2PDMA is enabled. P2PDMA exposes the - * device BAR space assuming the direct map space is large enough - * for creating a ZONE_DEVICE mapping in the direct map corresponding - * to the physical BAR address. + * except when CONFIG_ZONE_DEVICE is enabled. ZONE_DEVICE wants to map + * any physical address into the direct-map. KASLR wants to reliably + * steal some physical address bits. Those design choices are in direct + * conflict. */ - if (!IS_ENABLED(CONFIG_PCI_P2PDMA) && (memory_tb < kaslr_regions[0].size_tb)) + if (!IS_ENABLED(CONFIG_ZONE_DEVICE) && (memory_tb < kaslr_regions[0].size_tb)) kaslr_regions[0].size_tb = memory_tb; /* diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig index 00b0210e1f1de..e3f848ffb52a7 100644 --- a/drivers/pci/Kconfig +++ b/drivers/pci/Kconfig @@ -225,12 +225,6 @@ config PCI_P2PDMA P2P DMA transactions must be between devices behind the same root port. - Enabling this option will reduce the entropy of x86 KASLR memory - regions. For example - on a 46 bit system, the entropy goes down - from 16 bits to 15 bits. The actual reduction in entropy depends - on the physical address bits, on processor features, kernel config - (5 level page table) and physical memory present on the system. - If unsure, say N. config PCI_LABEL diff --git a/mm/Kconfig b/mm/Kconfig index bd0ea5454af82..a992f2203eb91 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -1220,10 +1220,14 @@ config ZONE_DEVICE Device memory hotplug support allows for establishing pmem, or other device driver discovered memory regions, in the memmap. This allows pfn_to_page() lookups of otherwise - "device-physical" addresses which is needed for using a DAX - mapping in an O_DIRECT operation, among other things. - - If FS_DAX is enabled, then say Y. + "device-physical" addresses which is needed for DAX, PCI_P2PDMA, and + DEVICE_PRIVATE features among others. + + Enabling this option will reduce the entropy of x86 KASLR memory + regions. For example - on a 46 bit system, the entropy goes down + from 16 bits to 15 bits. The actual reduction in entropy depends + on the physical address bits, on processor features, kernel config + (5 level page table) and physical memory present on the system. # # Helpers to mirror range of the CPU page tables of a process into device page From a9edf840f224ce200d4e6f41955eb2e7d3613e16 Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Thu, 4 Dec 2025 10:52:26 +0100 Subject: [PATCH 1120/1321] cxl/region: fix format string for resource_size_t The size of this type is architecture specific, and the recommended way to print it portably is through the custom %pap format string. Fixes: d6602e25819d ("cxl/region: Add support to indicate region has extended linear cache") Signed-off-by: Arnd Bergmann Reviewed-by: Ira Weiny Reviewed-by: Dave Jiang > --- Reviewed-by: Alison Schofield Link: https://patch.msgid.link/20251204095237.1032528-1-arnd@kernel.org Signed-off-by: Dave Jiang --- drivers/cxl/core/region.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c index ae899f68551f7..fc36a5413d3f7 100644 --- a/drivers/cxl/core/region.c +++ b/drivers/cxl/core/region.c @@ -759,7 +759,7 @@ static ssize_t extended_linear_cache_size_show(struct device *dev, ACQUIRE(rwsem_read_intr, rwsem)(&cxl_rwsem.region); if ((rc = ACQUIRE_ERR(rwsem_read_intr, &rwsem))) return rc; - return sysfs_emit(buf, "%#llx\n", p->cache_size); + return sysfs_emit(buf, "%pap\n", &p->cache_size); } static DEVICE_ATTR_RO(extended_linear_cache_size); From a5e2b85226398be58f0de7bf5459a20ad85bdb78 Mon Sep 17 00:00:00 2001 From: Robert Richter Date: Thu, 8 Jan 2026 11:13:23 +0100 Subject: [PATCH 1121/1321] cxl/port: Fix target list setup for multiple decoders sharing the same dport If a switch port has more than one decoder that is using the same downstream port, the enumeration of the target lists may fail with: # dmesg | grep target.list update_decoder_targets: cxl decoder1.0: dport3 found in target list, index 3 update_decoder_targets: cxl decoder1.0: dport2 found in target list, index 2 update_decoder_targets: cxl decoder1.0: dport0 found in target list, index 0 update_decoder_targets: cxl decoder2.0: dport3 found in target list, index 1 update_decoder_targets: cxl decoder4.0: dport3 found in target list, index 1 cxl_mem mem6: failed to find endpoint12:0000:00:01.4 in target list of decoder2.1 cxl_mem mem8: failed to find endpoint13:0000:20:01.4 in target list of decoder4.1 The case, that the same downstream port can be used in multiple target lists, is allowed and possible. Fix the update of the target list. Enumerate all children of the switch port and do not stop the iteration after the first matching target was found. With the fix applied: # dmesg | grep target.list update_decoder_targets: cxl decoder1.0: dport2 found in target list, index 2 update_decoder_targets: cxl decoder1.0: dport0 found in target list, index 0 update_decoder_targets: cxl decoder1.0: dport3 found in target list, index 3 update_decoder_targets: cxl decoder2.0: dport3 found in target list, index 1 update_decoder_targets: cxl decoder2.1: dport3 found in target list, index 1 update_decoder_targets: cxl decoder4.0: dport3 found in target list, index 1 update_decoder_targets: cxl decoder4.1: dport3 found in target list, index 1 Analyzing the conditions when this happens: 1) A dport is shared by multiple decoders. 2) The decoders have interleaving configured (ways > 1). The configuration above has the following hierarchy details (fixed version): root0 |_ | | | decoder0.1 | ways: 2 | target_list: 0,1 |_______________________________________ | | | dport0 | dport1 | | port2 port4 | | |___________________ |_____________________ | | | | | | | decoder2.0 decoder2.1 | decoder4.0 decoder4.1 | ways: 2 ways: 2 | ways: 2 ways: 2 | target_list: 2,3 target_list: 2,3 | target_list: 2,3 target_list: 2,3 |___________________ |___________________ | | | | | dport2 | dport3 | dport2 | dport3 | | | | endpoint7 endpoint12 endpoint9 endpoint13 |_ |_ |_ |_ | | | | | | | | | decoder7.0 | decoder12.0 | decoder9.0 | decoder13.0 | decoder7.2 | decoder12.2 | decoder9.2 | decoder13.2 | | | | mem3 mem5 mem6 mem8 Note: Device numbers vary for every boot. Current kernel fails to enumerate endpoint12 and endpoint13 as the target list is not updated for the second decoder. Fixes: 4f06d81e7c6a ("cxl: Defer dport allocation for switch ports") Reviewed-by: Dave Jiang Reviewed-by: Alison Schofield Reviewed-by: Jonathan Cameron Signed-off-by: Robert Richter Link: https://patch.msgid.link/20260108101324.509667-1-rrichter@amd.com Signed-off-by: Dave Jiang --- drivers/cxl/core/port.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c index fef3aa0c6680c..3310dbfae9d66 100644 --- a/drivers/cxl/core/port.c +++ b/drivers/cxl/core/port.c @@ -1590,7 +1590,7 @@ static int update_decoder_targets(struct device *dev, void *data) cxlsd->target[i] = dport; dev_dbg(dev, "dport%d found in target list, index %d\n", dport->port_id, i); - return 1; + return 0; } } From e70e479dff229655857d95e32e2f5e930c4c833a Mon Sep 17 00:00:00 2001 From: Alison Schofield Date: Fri, 9 Jan 2026 11:49:44 -0800 Subject: [PATCH 1122/1321] cxl/acpi: Restore HBIW check before dereferencing platform_data Commit 4fe516d2ad1a ("cxl/acpi: Make the XOR calculations available for testing") split xormap handling code to create a reusable helper function but inadvertently dropped the check of HBIW values before dereferencing cxlrd->platform_data. When HBIW is 1 or 3, no xormaps are needed and platform_data may be NULL, leading to a potential NULL pointer dereference. Affects platform configs using XOR Arithmetic with HBIWs of 1 or 3, when performing DPA->HPA address translation for CXL events. Those events would be any of poison ops, general media, or dram. Restore the early return check for HBIW values of 1 and 3 before dereferencing platform_data. Fixes: 4fe516d2ad1a ("cxl/acpi: Make the XOR calculations available for testing") Signed-off-by: Alison Schofield Reviewed-by: Dave Jiang Link: https://patch.msgid.link/20260109194946.431083-1-alison.schofield@intel.com Signed-off-by: Dave Jiang --- drivers/cxl/acpi.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c index 77ac940e30138..49bba2b9a3c45 100644 --- a/drivers/cxl/acpi.c +++ b/drivers/cxl/acpi.c @@ -75,9 +75,16 @@ EXPORT_SYMBOL_FOR_MODULES(cxl_do_xormap_calc, "cxl_translate"); static u64 cxl_apply_xor_maps(struct cxl_root_decoder *cxlrd, u64 addr) { - struct cxl_cxims_data *cximsd = cxlrd->platform_data; + int hbiw = cxlrd->cxlsd.nr_targets; + struct cxl_cxims_data *cximsd; + + /* No xormaps for host bridge interleave ways of 1 or 3 */ + if (hbiw == 1 || hbiw == 3) + return addr; + + cximsd = cxlrd->platform_data; - return cxl_do_xormap_calc(cximsd, addr, cxlrd->cxlsd.nr_targets); + return cxl_do_xormap_calc(cximsd, addr, hbiw); } struct cxl_cxims_context { From d5d92e7301dc6e74accdfd413971b8395a62dd59 Mon Sep 17 00:00:00 2001 From: Li Ming Date: Mon, 12 Jan 2026 20:05:26 +0800 Subject: [PATCH 1123/1321] cxl/hdm: Fix potential infinite loop in __cxl_dpa_reserve() In __cxl_dpa_reserve(), it will check if the new resource range is included in one of paritions of the cxl memory device. cxlds->nr_paritions is used to represent how many partitions information the cxl memory device has. In the loop, if driver cannot find a partition including the new resource range, it will be an infinite loop. [ dj: Removed incorrect fixes tag ] Fixes: 991d98f17d31 ("cxl: Make cxl_dpa_alloc() DPA partition number agnostic") Signed-off-by: Li Ming Reviewed-by: Ira Weiny Reviewed-by: Dave Jiang Link: https://patch.msgid.link/20260112120526.530232-1-ming.li@zohomail.com Signed-off-by: Dave Jiang --- drivers/cxl/core/hdm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c index 1c5d2022c87a9..a470099a69f10 100644 --- a/drivers/cxl/core/hdm.c +++ b/drivers/cxl/core/hdm.c @@ -403,7 +403,7 @@ static int __cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled, * is not set. */ if (cxled->part < 0) - for (int i = 0; cxlds->nr_partitions; i++) + for (int i = 0; i < cxlds->nr_partitions; i++) if (resource_contains(&cxlds->part[i].res, res)) { cxled->part = i; break; From 5c965b8538552ec6253d22ecab60aa8652a7283e Mon Sep 17 00:00:00 2001 From: Robert Richter Date: Wed, 7 Jan 2026 13:05:43 +0100 Subject: [PATCH 1124/1321] cxl: Check for invalid addresses returned from translation functions on errors Translation functions may return an invalid address in case of errors. If the address is not checked the further use of the invalid value will cause an address corruption. Consistently check for a valid address returned by translation functions. Use RESOURCE_SIZE_MAX to indicate an invalid address for type resource_size_t. Depending on the type either RESOURCE_SIZE_MAX or ULLONG_MAX is used to indicate an address error. Propagating an invalid address from a failed translation may cause userspace to think it has received a valid SPA, when in fact it is wrong. The CXL userspace API, using trace events, expects ULLONG_MAX to indicate a translation failure. If ULLONG_MAX is not returned immediately, subsequent calculations can transform that bad address into a different value (!ULLONG_MAX), and an invalid SPA may be returned to userspace. This can lead to incorrect diagnostics and erroneous corrective actions. [ dj: Added user impact statement from Alison. ] [ dj: Fixed checkpatch tab alignment issue. ] Reviewed-by: Dave Jiang Signed-off-by: Robert Richter Fixes: c3dd67681c70 ("cxl/region: Add inject and clear poison by region offset") Fixes: b78b9e7b7979 ("cxl/region: Refactor address translation funcs for testing") Reviewed-by: Alison Schofield Reviewed-by: Jonathan Cameron Link: https://patch.msgid.link/20260107120544.410993-1-rrichter@amd.com Signed-off-by: Dave Jiang --- drivers/cxl/core/hdm.c | 2 +- drivers/cxl/core/region.c | 34 ++++++++++++++++++++------ tools/testing/cxl/test/cxl_translate.c | 30 ++++++++++++++--------- 3 files changed, 45 insertions(+), 21 deletions(-) diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c index a470099a69f10..eb5a3a7640c60 100644 --- a/drivers/cxl/core/hdm.c +++ b/drivers/cxl/core/hdm.c @@ -530,7 +530,7 @@ resource_size_t cxl_dpa_size(struct cxl_endpoint_decoder *cxled) resource_size_t cxl_dpa_resource_start(struct cxl_endpoint_decoder *cxled) { - resource_size_t base = -1; + resource_size_t base = RESOURCE_SIZE_MAX; lockdep_assert_held(&cxl_rwsem.dpa); if (cxled->dpa_res) diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c index fc36a5413d3f7..5bd1213737fa2 100644 --- a/drivers/cxl/core/region.c +++ b/drivers/cxl/core/region.c @@ -3118,7 +3118,7 @@ u64 cxl_dpa_to_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd, struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent); struct cxl_region_params *p = &cxlr->params; struct cxl_endpoint_decoder *cxled = NULL; - u64 dpa_offset, hpa_offset, hpa; + u64 base, dpa_offset, hpa_offset, hpa; u16 eig = 0; u8 eiw = 0; int pos; @@ -3136,8 +3136,14 @@ u64 cxl_dpa_to_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd, ways_to_eiw(p->interleave_ways, &eiw); granularity_to_eig(p->interleave_granularity, &eig); - dpa_offset = dpa - cxl_dpa_resource_start(cxled); + base = cxl_dpa_resource_start(cxled); + if (base == RESOURCE_SIZE_MAX) + return ULLONG_MAX; + + dpa_offset = dpa - base; hpa_offset = cxl_calculate_hpa_offset(dpa_offset, pos, eiw, eig); + if (hpa_offset == ULLONG_MAX) + return ULLONG_MAX; /* Apply the hpa_offset to the region base address */ hpa = hpa_offset + p->res->start + p->cache_size; @@ -3146,6 +3152,9 @@ u64 cxl_dpa_to_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd, if (cxlrd->ops.hpa_to_spa) hpa = cxlrd->ops.hpa_to_spa(cxlrd, hpa); + if (hpa == ULLONG_MAX) + return ULLONG_MAX; + if (!cxl_resource_contains_addr(p->res, hpa)) { dev_dbg(&cxlr->dev, "Addr trans fail: hpa 0x%llx not in region\n", hpa); @@ -3170,7 +3179,8 @@ static int region_offset_to_dpa_result(struct cxl_region *cxlr, u64 offset, struct cxl_region_params *p = &cxlr->params; struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent); struct cxl_endpoint_decoder *cxled; - u64 hpa, hpa_offset, dpa_offset; + u64 hpa_offset = offset; + u64 dpa, dpa_offset; u16 eig = 0; u8 eiw = 0; int pos; @@ -3187,10 +3197,13 @@ static int region_offset_to_dpa_result(struct cxl_region *cxlr, u64 offset, * CXL HPA is assumed to equal SPA. */ if (cxlrd->ops.spa_to_hpa) { - hpa = cxlrd->ops.spa_to_hpa(cxlrd, p->res->start + offset); - hpa_offset = hpa - p->res->start; - } else { - hpa_offset = offset; + hpa_offset = cxlrd->ops.spa_to_hpa(cxlrd, p->res->start + offset); + if (hpa_offset == ULLONG_MAX) { + dev_dbg(&cxlr->dev, "HPA not found for %pr offset %#llx\n", + p->res, offset); + return -ENXIO; + } + hpa_offset -= p->res->start; } pos = cxl_calculate_position(hpa_offset, eiw, eig); @@ -3207,8 +3220,13 @@ static int region_offset_to_dpa_result(struct cxl_region *cxlr, u64 offset, cxled = p->targets[i]; if (cxled->pos != pos) continue; + + dpa = cxl_dpa_resource_start(cxled); + if (dpa != RESOURCE_SIZE_MAX) + dpa += dpa_offset; + result->cxlmd = cxled_to_memdev(cxled); - result->dpa = cxl_dpa_resource_start(cxled) + dpa_offset; + result->dpa = dpa; return 0; } diff --git a/tools/testing/cxl/test/cxl_translate.c b/tools/testing/cxl/test/cxl_translate.c index 2200ae21795c7..16328b2112b23 100644 --- a/tools/testing/cxl/test/cxl_translate.c +++ b/tools/testing/cxl/test/cxl_translate.c @@ -68,6 +68,8 @@ static u64 to_hpa(u64 dpa_offset, int pos, u8 r_eiw, u16 r_eig, u8 hb_ways, /* Calculate base HPA offset from DPA and position */ hpa_offset = cxl_calculate_hpa_offset(dpa_offset, pos, r_eiw, r_eig); + if (hpa_offset == ULLONG_MAX) + return ULLONG_MAX; if (math == XOR_MATH) { cximsd->nr_maps = hbiw_to_nr_maps[hb_ways]; @@ -258,19 +260,23 @@ static int test_random_params(void) pos = get_random_u32() % ways; dpa = get_random_u64() >> 12; + reverse_dpa = ULLONG_MAX; + reverse_pos = -1; + hpa = cxl_calculate_hpa_offset(dpa, pos, eiw, eig); - reverse_dpa = cxl_calculate_dpa_offset(hpa, eiw, eig); - reverse_pos = cxl_calculate_position(hpa, eiw, eig); - - if (reverse_dpa != dpa || reverse_pos != pos) { - pr_err("test random iter %d FAIL hpa=%llu, dpa=%llu reverse_dpa=%llu, pos=%d reverse_pos=%d eiw=%u eig=%u\n", - i, hpa, dpa, reverse_dpa, pos, reverse_pos, eiw, - eig); - - if (failures++ > 10) { - pr_err("test random too many failures, stop\n"); - break; - } + if (hpa != ULLONG_MAX) { + reverse_dpa = cxl_calculate_dpa_offset(hpa, eiw, eig); + reverse_pos = cxl_calculate_position(hpa, eiw, eig); + if (reverse_dpa == dpa && reverse_pos == pos) + continue; + } + + pr_err("test random iter %d FAIL hpa=%llu, dpa=%llu reverse_dpa=%llu, pos=%d reverse_pos=%d eiw=%u eig=%u\n", + i, hpa, dpa, reverse_dpa, pos, reverse_pos, eiw, eig); + + if (failures++ > 10) { + pr_err("test random too many failures, stop\n"); + break; } } pr_info("..... test random: PASS %d FAIL %d\n", i - failures, failures); From e6d4bb0e311ac02915d81247a3f6b6a8751e1f8f Mon Sep 17 00:00:00 2001 From: Ben Dooks Date: Thu, 8 Jan 2026 15:12:03 -0500 Subject: [PATCH 1125/1321] drm/i915/guc: make 'guc_hw_reg_state' static as it isn't exported The guc_hw_reg_state array is not exported, so make it static. Fixes the following sparse warning: drivers/gpu/drm/i915/i915_gpu_error.c:692:3: warning: symbol 'guc_hw_reg_state' was not declared. Should it be static? Fixes: ba391a102ec11 ("drm/i915/guc: Include the GuC registers in the error state") Signed-off-by: Ben Dooks Reviewed-by: Rodrigo Vivi Link: https://patch.msgid.link/20260108201202.59250-2-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi (cherry picked from commit 701c47493328a8173996e7590733be3493af572f) Signed-off-by: Jani Nikula --- drivers/gpu/drm/i915/i915_gpu_error.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 7582ef34bf3fb..303d8d9b77753 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -686,7 +686,7 @@ static void err_print_guc_ctb(struct drm_i915_error_state_buf *m, } /* This list includes registers that are useful in debugging GuC hangs. */ -const struct { +static const struct { u32 start; u32 count; } guc_hw_reg_state[] = { From 4f47a499f4bd6fd195eb2897ac3de365847a431e Mon Sep 17 00:00:00 2001 From: Philip Yang Date: Thu, 4 Dec 2025 12:13:05 -0500 Subject: [PATCH 1126/1321] drm/amdgpu: Fix gfx9 update PTE mtype flag MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Fix copy&paste error, that should have been an assignment instead of an or, otherwise MTYPE_UC 0x3 can not be updated to MTYPE_RW 0x1. Signed-off-by: Philip Yang Reviewed-by: Christian König Signed-off-by: Alex Deucher (cherry picked from commit fc1366016abe4103c0f0fac882811aea961ef213) Cc: stable@vger.kernel.org --- drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c index 8ad7519f7b581..f1ee3921d970c 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c @@ -1235,16 +1235,16 @@ static void gmc_v9_0_get_vm_pte(struct amdgpu_device *adev, *flags = AMDGPU_PTE_MTYPE_VG10(*flags, MTYPE_NC); break; case AMDGPU_VM_MTYPE_WC: - *flags |= AMDGPU_PTE_MTYPE_VG10(*flags, MTYPE_WC); + *flags = AMDGPU_PTE_MTYPE_VG10(*flags, MTYPE_WC); break; case AMDGPU_VM_MTYPE_RW: - *flags |= AMDGPU_PTE_MTYPE_VG10(*flags, MTYPE_RW); + *flags = AMDGPU_PTE_MTYPE_VG10(*flags, MTYPE_RW); break; case AMDGPU_VM_MTYPE_CC: - *flags |= AMDGPU_PTE_MTYPE_VG10(*flags, MTYPE_CC); + *flags = AMDGPU_PTE_MTYPE_VG10(*flags, MTYPE_CC); break; case AMDGPU_VM_MTYPE_UC: - *flags |= AMDGPU_PTE_MTYPE_VG10(*flags, MTYPE_UC); + *flags = AMDGPU_PTE_MTYPE_VG10(*flags, MTYPE_UC); break; } From 0707e7fac6a5cacc1832f402cb0372020e6c90e9 Mon Sep 17 00:00:00 2001 From: Lu Yao Date: Tue, 6 Jan 2026 10:37:12 +0800 Subject: [PATCH 1127/1321] drm/amdgpu: fix drm panic null pointer when driver not support atomic When driver not support atomic, fb using plane->fb rather than plane->state->fb. Fixes: fe151ed7af54 ("drm/amdgpu: add generic display panic helper code") Signed-off-by: Lu Yao Signed-off-by: Alex Deucher (cherry picked from commit 2f2a72de673513247cd6fae14e53f6c40c5841ef) --- drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c index b5d34797d6065..52bc04452812e 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c @@ -1880,7 +1880,12 @@ int amdgpu_display_get_scanout_buffer(struct drm_plane *plane, struct drm_scanout_buffer *sb) { struct amdgpu_bo *abo; - struct drm_framebuffer *fb = plane->state->fb; + struct drm_framebuffer *fb; + + if (drm_drv_uses_atomic_modeset(plane->dev)) + fb = plane->state->fb; + else + fb = plane->fb; if (!fb) return -EINVAL; From 67e755d6d734b4721a14e081fab186327244b6d2 Mon Sep 17 00:00:00 2001 From: "Mario Limonciello (AMD)" Date: Wed, 7 Jan 2026 15:37:28 -0600 Subject: [PATCH 1128/1321] drm/amd: Clean up kfd node on surprise disconnect When an eGPU is unplugged the KFD topology should also be destroyed for that GPU. This never happens because the fini_sw callbacks never get to run. Run them manually before calling amdgpu_device_ip_fini_early() when a device has already been disconnected. This location is intentionally chosen to make sure that the kfd locking refcount doesn't get incremented unintentionally. Cc: kent.russell@amd.com Closes: https://community.frame.work/t/amd-egpu-on-linux/8691/33 Signed-off-by: Mario Limonciello (AMD) Reviewed-by: Kent Russell Signed-off-by: Alex Deucher (cherry picked from commit 6a23e7b4332c10f8b56c33a9c5431b52ecff9aab) Cc: stable@vger.kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index d5c44bd34d456..d2c3885de711f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -5063,6 +5063,14 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev) amdgpu_ttm_set_buffer_funcs_status(adev, false); + /* + * device went through surprise hotplug; we need to destroy topology + * before ip_fini_early to prevent kfd locking refcount issues by calling + * amdgpu_amdkfd_suspend() + */ + if (drm_dev_is_unplugged(adev_to_drm(adev))) + amdgpu_amdkfd_device_fini_sw(adev); + amdgpu_device_ip_fini_early(adev); amdgpu_irq_fini_hw(adev); From 9d9fde5dc50a32f2cc36cb07a397f85c1693dcb2 Mon Sep 17 00:00:00 2001 From: Peter Colberg Date: Mon, 22 Dec 2025 12:42:48 -0500 Subject: [PATCH 1129/1321] Revert duplicate "drm/amdgpu: disable peer-to-peer access for DCC-enabled GC12 VRAM surfaces" This reverts commit 22a36e660d014925114feb09a2680bb3c2d1e279 once, which was merged twice due to an incorrect backmerge resolution. Fixes: ce0478b02ed2 ("Merge tag 'v6.18-rc6' into drm-next") Signed-off-by: Peter Colberg Signed-off-by: Alex Deucher (cherry picked from commit 38a0f4cf8c6147fd10baa206ab349f8ff724e391) --- drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 12 ------------ 1 file changed, 12 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c index e22cfa7c6d32f..c1461317eb298 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c @@ -83,18 +83,6 @@ static int amdgpu_dma_buf_attach(struct dma_buf *dmabuf, struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev); int r; - /* - * Disable peer-to-peer access for DCC-enabled VRAM surfaces on GFX12+. - * Such buffers cannot be safely accessed over P2P due to device-local - * compression metadata. Fallback to system-memory path instead. - * Device supports GFX12 (GC 12.x or newer) - * BO was created with the AMDGPU_GEM_CREATE_GFX12_DCC flag - * - */ - if (amdgpu_ip_version(adev, GC_HWIP, 0) >= IP_VERSION(12, 0, 0) && - bo->flags & AMDGPU_GEM_CREATE_GFX12_DCC) - attach->peer2peer = false; - /* * Disable peer-to-peer access for DCC-enabled VRAM surfaces on GFX12+. * Such buffers cannot be safely accessed over P2P due to device-local From 09ccc9ab1fef360b19af6e845f246bbd8f2b1c6f Mon Sep 17 00:00:00 2001 From: Xiaogang Chen Date: Thu, 8 Jan 2026 09:50:36 -0600 Subject: [PATCH 1130/1321] drm/amdgpu: Use correct address to setup gart page table for vram access Use dst input parameter to setup gart page table entries instead of using fixed location. Fixes: 237d623ae659 ("drm/amdgpu/gart: Add helper to bind VRAM pages (v2)") Signed-off-by: Xiaogang Chen Reviewed-by: Alex Deucher Signed-off-by: Alex Deucher (cherry picked from commit ca5d4db8db843be7ed35fc9334737490c2b58d32) --- drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c index d2237ce9da70e..1485f47894401 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c @@ -375,7 +375,7 @@ void amdgpu_gart_map(struct amdgpu_device *adev, uint64_t offset, * @start_page: first page to map in the GART aperture * @num_pages: number of pages to be mapped * @flags: page table entry flags - * @dst: CPU address of the GART table + * @dst: valid CPU address of GART table, cannot be null * * Binds a BO that is allocated in VRAM to the GART page table * (all ASICs). @@ -396,7 +396,7 @@ void amdgpu_gart_map_vram_range(struct amdgpu_device *adev, uint64_t pa, return; for (i = 0; i < num_pages; ++i) { - amdgpu_gmc_set_pte_pde(adev, adev->gart.ptr, + amdgpu_gmc_set_pte_pde(adev, dst, start_page + i, pa + AMDGPU_GPU_PAGE_SIZE * i, flags); } From 07dda03612dcfdcd92d554bb09bce40bec247dfc Mon Sep 17 00:00:00 2001 From: Alex Deucher Date: Fri, 9 Jan 2026 08:54:55 -0500 Subject: [PATCH 1131/1321] drm/amdgpu: make sure userqs are enabled in userq IOCTLs MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit These IOCTLs shouldn't be called when userqs are not enabled. Make sure they are enabled before executing the IOCTLs. Reviewed-by: Christian König Signed-off-by: Alex Deucher (cherry picked from commit d967509651601cddce7ff2a9f09479f3636f684d) Cc: stable@vger.kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c | 16 ++++++++++++++++ drivers/gpu/drm/amd/amdgpu/amdgpu_userq.h | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c | 6 ++++++ 3 files changed, 23 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c index 9a969175900e9..58b26c78b6425 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c @@ -885,12 +885,28 @@ static int amdgpu_userq_input_args_validate(struct drm_device *dev, return 0; } +bool amdgpu_userq_enabled(struct drm_device *dev) +{ + struct amdgpu_device *adev = drm_to_adev(dev); + int i; + + for (i = 0; i < AMDGPU_HW_IP_NUM; i++) { + if (adev->userq_funcs[i]) + return true; + } + + return false; +} + int amdgpu_userq_ioctl(struct drm_device *dev, void *data, struct drm_file *filp) { union drm_amdgpu_userq *args = data; int r; + if (!amdgpu_userq_enabled(dev)) + return -ENOTSUPP; + if (amdgpu_userq_input_args_validate(dev, args, filp) < 0) return -EINVAL; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.h index c37444427a14c..b48b3bc293fca 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.h @@ -141,6 +141,7 @@ uint64_t amdgpu_userq_get_doorbell_index(struct amdgpu_userq_mgr *uq_mgr, struct drm_file *filp); u32 amdgpu_userq_get_supported_ip_mask(struct amdgpu_device *adev); +bool amdgpu_userq_enabled(struct drm_device *dev); int amdgpu_userq_suspend(struct amdgpu_device *adev); int amdgpu_userq_resume(struct amdgpu_device *adev); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c index eba9fb359047f..53d8707f9881a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c @@ -471,6 +471,9 @@ int amdgpu_userq_signal_ioctl(struct drm_device *dev, void *data, struct drm_exec exec; u64 wptr; + if (!amdgpu_userq_enabled(dev)) + return -ENOTSUPP; + num_syncobj_handles = args->num_syncobj_handles; syncobj_handles = memdup_user(u64_to_user_ptr(args->syncobj_handles), size_mul(sizeof(u32), num_syncobj_handles)); @@ -653,6 +656,9 @@ int amdgpu_userq_wait_ioctl(struct drm_device *dev, void *data, int r, i, rentry, wentry, cnt; struct drm_exec exec; + if (!amdgpu_userq_enabled(dev)) + return -ENOTSUPP; + num_read_bo_handles = wait_info->num_bo_read_handles; bo_handles_read = memdup_user(u64_to_user_ptr(wait_info->bo_read_handles), size_mul(sizeof(u32), num_read_bo_handles)); From 67212aee1e5af95f08a5ccbee595ffad69a60e00 Mon Sep 17 00:00:00 2001 From: Haoxiang Li Date: Thu, 8 Jan 2026 15:18:22 +0800 Subject: [PATCH 1132/1321] drm/amdkfd: fix a memory leak in device_queue_manager_init() If dqm->ops.initialize() fails, add deallocate_hiq_sdma_mqd() to release the memory allocated by allocate_hiq_sdma_mqd(). Move deallocate_hiq_sdma_mqd() up to ensure proper function visibility at the point of use. Fixes: 11614c36bc8f ("drm/amdkfd: Allocate MQD trunk for HIQ and SDMA") Signed-off-by: Haoxiang Li Signed-off-by: Felix Kuehling Reviewed-by: Oak Zeng Reviewed-by: Felix Kuehling Signed-off-by: Alex Deucher (cherry picked from commit b7cccc8286bb9919a0952c812872da1dcfe9d390) Cc: stable@vger.kernel.org --- .../drm/amd/amdkfd/kfd_device_queue_manager.c | 19 +++++++++++-------- 1 file changed, 11 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c index d7a2e7178ea9c..8af0929ca40a3 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c @@ -2919,6 +2919,14 @@ static int allocate_hiq_sdma_mqd(struct device_queue_manager *dqm) return retval; } +static void deallocate_hiq_sdma_mqd(struct kfd_node *dev, + struct kfd_mem_obj *mqd) +{ + WARN(!mqd, "No hiq sdma mqd trunk to free"); + + amdgpu_amdkfd_free_gtt_mem(dev->adev, &mqd->gtt_mem); +} + struct device_queue_manager *device_queue_manager_init(struct kfd_node *dev) { struct device_queue_manager *dqm; @@ -3042,19 +3050,14 @@ struct device_queue_manager *device_queue_manager_init(struct kfd_node *dev) return dqm; } + if (!dev->kfd->shared_resources.enable_mes) + deallocate_hiq_sdma_mqd(dev, &dqm->hiq_sdma_mqd); + out_free: kfree(dqm); return NULL; } -static void deallocate_hiq_sdma_mqd(struct kfd_node *dev, - struct kfd_mem_obj *mqd) -{ - WARN(!mqd, "No hiq sdma mqd trunk to free"); - - amdgpu_amdkfd_free_gtt_mem(dev->adev, &mqd->gtt_mem); -} - void device_queue_manager_uninit(struct device_queue_manager *dqm) { dqm->ops.stop(dqm); From c49bbe33b6338dbe33c613aab4c8b39dea31612d Mon Sep 17 00:00:00 2001 From: "Mario Limonciello (AMD)" Date: Sun, 14 Dec 2025 08:59:16 -0600 Subject: [PATCH 1133/1321] drm/amd/display: Show link name in PSR status message [Why] The PSR message was moved in commit 4321742c394e ("drm/amd/display: Move PSR support message into amdgpu_dm"). This message however shows for every single link without showing which link is which. This can send a confusing message to the user. [How] Add link name into the message. Fixes: 4321742c394e ("drm/amd/display: Move PSR support message into amdgpu_dm") Reviewed-by: Alex Hung Signed-off-by: Mario Limonciello (AMD) Signed-off-by: Matthew Stewart Tested-by: Dan Wheeler Signed-off-by: Alex Deucher (cherry picked from commit 99f77f6229c0766b980ae05affcf9f742d97de6a) --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index 740711ac1037c..936c3bc3f2d60 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -5626,7 +5626,8 @@ static int amdgpu_dm_initialize_drm_device(struct amdgpu_device *adev) if (psr_feature_enabled) { amdgpu_dm_set_psr_caps(link); - drm_info(adev_to_drm(adev), "PSR support %d, DC PSR ver %d, sink PSR ver %d DPCD caps 0x%x su_y_granularity %d\n", + drm_info(adev_to_drm(adev), "%s: PSR support %d, DC PSR ver %d, sink PSR ver %d DPCD caps 0x%x su_y_granularity %d\n", + aconnector->base.name, link->psr_settings.psr_feature_enabled, link->psr_settings.psr_version, link->dpcd_caps.psr_info.psr_version, From 3d0bc775ee96e186e90dbba2798a0d54dfa63751 Mon Sep 17 00:00:00 2001 From: Mario Limonciello Date: Mon, 15 Dec 2025 14:08:30 -0600 Subject: [PATCH 1134/1321] drm/amd/display: Bump the HDMI clock to 340MHz [Why] DP-HDMI dongles can execeed bandwidth requirements on high resolution monitors. This can lead to pruning the high resolution modes. HDMI 1.3 bumped the clock to 340MHz, but display code never matched it. [How] Set default to (DVI) 165MHz. Once HDMI display is identified update to 340MHz. Reported-by: Dianne Skoll Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4780 Reviewed-by: Chris Park Signed-off-by: Mario Limonciello Signed-off-by: Matthew Stewart Tested-by: Dan Wheeler Signed-off-by: Alex Deucher (cherry picked from commit ac1e65d8ade46c09fb184579b81acadf36dcb91e) Cc: stable@vger.kernel.org --- drivers/gpu/drm/amd/display/dc/dc_hdmi_types.h | 2 +- drivers/gpu/drm/amd/display/dc/link/link_detection.c | 4 +++- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/dc_hdmi_types.h b/drivers/gpu/drm/amd/display/dc/dc_hdmi_types.h index b015e80672ec9..fcd3ab4b00459 100644 --- a/drivers/gpu/drm/amd/display/dc/dc_hdmi_types.h +++ b/drivers/gpu/drm/amd/display/dc/dc_hdmi_types.h @@ -41,7 +41,7 @@ /* kHZ*/ #define DP_ADAPTOR_DVI_MAX_TMDS_CLK 165000 /* kHZ*/ -#define DP_ADAPTOR_HDMI_SAFE_MAX_TMDS_CLK 165000 +#define DP_ADAPTOR_HDMI_SAFE_MAX_TMDS_CLK 340000 struct dp_hdmi_dongle_signature_data { int8_t id[15];/* "DP-HDMI ADAPTOR"*/ diff --git a/drivers/gpu/drm/amd/display/dc/link/link_detection.c b/drivers/gpu/drm/amd/display/dc/link/link_detection.c index e1940b8e5bc3f..7fa6bc97a9193 100644 --- a/drivers/gpu/drm/amd/display/dc/link/link_detection.c +++ b/drivers/gpu/drm/amd/display/dc/link/link_detection.c @@ -336,7 +336,7 @@ static void query_dp_dual_mode_adaptor( /* Assume we have no valid DP passive dongle connected */ *dongle = DISPLAY_DONGLE_NONE; - sink_cap->max_hdmi_pixel_clock = DP_ADAPTOR_HDMI_SAFE_MAX_TMDS_CLK; + sink_cap->max_hdmi_pixel_clock = DP_ADAPTOR_DVI_MAX_TMDS_CLK; /* Read DP-HDMI dongle I2c (no response interpreted as DP-DVI dongle)*/ if (!i2c_read( @@ -392,6 +392,8 @@ static void query_dp_dual_mode_adaptor( } } + if (is_valid_hdmi_signature) + sink_cap->max_hdmi_pixel_clock = DP_ADAPTOR_HDMI_SAFE_MAX_TMDS_CLK; if (is_type2_dongle) { uint32_t max_tmds_clk = From 1c33ed020a4a6d4d4892951713cb5024ca851a97 Mon Sep 17 00:00:00 2001 From: Vivek Das Mohapatra Date: Mon, 12 Jan 2026 15:28:56 +0000 Subject: [PATCH 1135/1321] drm/amd/display: Initialise backlight level values from hw Internal backlight levels are initialised from ACPI but the values are sometimes out of sync with the levels in effect until there has been a read from hardware (eg triggered by reading from sysfs). This means that the first drm_commit can cause the levels to be set to a different value than the actual starting one, which results in a sudden change in brightness. This path shows the problem (when the values are out of sync): amdgpu_dm_atomic_commit_tail() -> amdgpu_dm_commit_streams() -> amdgpu_dm_backlight_set_level(..., dm->brightness[n]) This patch calls the backlight ops get_brightness explicitly at the end of backlight registration to make sure dm->brightness[n] is in sync with the actual hardware levels. Fixes: 2fe87f54abdc ("drm/amd/display: Set default brightness according to ACPI") Signed-off-by: Vivek Das Mohapatra Reviewed-by: Mario Limonciello (AMD) Signed-off-by: Alex Deucher (cherry picked from commit 318b1c36d82a0cd2b06a4bb43272fa6f1bc8adc1) Cc: stable@vger.kernel.org --- .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index 936c3bc3f2d60..fae88ce8327fa 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -5266,6 +5266,8 @@ amdgpu_dm_register_backlight_device(struct amdgpu_dm_connector *aconnector) struct amdgpu_dm_backlight_caps *caps; char bl_name[16]; int min, max; + int real_brightness; + int init_brightness; if (aconnector->bl_idx == -1) return; @@ -5290,6 +5292,8 @@ amdgpu_dm_register_backlight_device(struct amdgpu_dm_connector *aconnector) } else props.brightness = props.max_brightness = MAX_BACKLIGHT_LEVEL; + init_brightness = props.brightness; + if (caps->data_points && !(amdgpu_dc_debug_mask & DC_DISABLE_CUSTOM_BRIGHTNESS_CURVE)) { drm_info(drm, "Using custom brightness curve\n"); props.scale = BACKLIGHT_SCALE_NON_LINEAR; @@ -5308,8 +5312,20 @@ amdgpu_dm_register_backlight_device(struct amdgpu_dm_connector *aconnector) if (IS_ERR(dm->backlight_dev[aconnector->bl_idx])) { drm_err(drm, "DM: Backlight registration failed!\n"); dm->backlight_dev[aconnector->bl_idx] = NULL; - } else + } else { + /* + * dm->brightness[x] can be inconsistent just after startup until + * ops.get_brightness is called. + */ + real_brightness = + amdgpu_dm_backlight_ops.get_brightness(dm->backlight_dev[aconnector->bl_idx]); + + if (real_brightness != init_brightness) { + dm->actual_brightness[aconnector->bl_idx] = real_brightness; + dm->brightness[aconnector->bl_idx] = real_brightness; + } drm_dbg_driver(drm, "DM: Registered Backlight device: %s\n", bl_name); + } } static int initialize_plane(struct amdgpu_display_manager *dm, From 2abfc6d51aff866ac17aa72f96daff64d52742ec Mon Sep 17 00:00:00 2001 From: Yang Wang Date: Tue, 6 Jan 2026 14:42:40 +0800 Subject: [PATCH 1136/1321] drm/amd/pm: fix smu overdrive data type wrong issue on smu 14.0.2 resolving the issue of incorrect type definitions potentially causing calculation errors. Fixes: 54f7f3ca982a ("drm/amdgpu/swm14: Update power limit logic") Signed-off-by: Yang Wang Reviewed-by: Hawking Zhang Signed-off-by: Alex Deucher (cherry picked from commit e3a03d0ae16d6b56e893cce8e52b44140e1ed985) --- drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0_2_ppt.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0_2_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0_2_ppt.c index 33c3cd2e1e244..d7642d388bc38 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0_2_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0_2_ppt.c @@ -1702,8 +1702,9 @@ static int smu_v14_0_2_get_power_limit(struct smu_context *smu, table_context->power_play_table; PPTable_t *pptable = table_context->driver_pptable; CustomSkuTable_t *skutable = &pptable->CustomSkuTable; - uint32_t power_limit, od_percent_upper = 0, od_percent_lower = 0; + int16_t od_percent_upper = 0, od_percent_lower = 0; uint32_t msg_limit = pptable->SkuTable.MsgLimits.Power[PPT_THROTTLER_PPT0][POWER_SOURCE_AC]; + uint32_t power_limit; if (smu_v14_0_get_current_power_limit(smu, &power_limit)) power_limit = smu->adev->pm.ac_power ? From 3f00caaf8d0311bec8c07f3de42f9ada6e6592c7 Mon Sep 17 00:00:00 2001 From: Prike Liang Date: Tue, 6 Jan 2026 17:00:57 +0800 Subject: [PATCH 1137/1321] drm/amdgpu: validate the flush_gpu_tlb_pasid() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Validate flush_gpu_tlb_pasid() availability before flushing tlb. Signed-off-by: Prike Liang Reviewed-by: Christian König Signed-off-by: Alex Deucher (cherry picked from commit f4db9913e4d3dabe9ff3ea6178f2c1bc286012b8) --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c index 869bceb0fe2c6..8924380086c84 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c @@ -732,6 +732,10 @@ int amdgpu_gmc_flush_gpu_tlb_pasid(struct amdgpu_device *adev, uint16_t pasid, return 0; if (!adev->gmc.flush_pasid_uses_kiq || !ring->sched.ready) { + + if (!adev->gmc.gmc_funcs->flush_gpu_tlb_pasid) + return 0; + if (adev->gmc.flush_tlb_needs_extra_type_2) adev->gmc.gmc_funcs->flush_gpu_tlb_pasid(adev, pasid, 2, all_hub, From 6db648a57f5f84ebc59c67ade0db851849968cd0 Mon Sep 17 00:00:00 2001 From: Prike Liang Date: Fri, 9 Jan 2026 16:15:11 +0800 Subject: [PATCH 1138/1321] Revert "drm/amdgpu: don't attach the tlb fence for SI" MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit This reverts commit 820b3d376e8a102c6aeab737ec6edebbbb710e04. It’s better to validate VM TLB flushes in the flush‑TLB backend rather than in the generic VM layer. Reverting this patch depends on commit fa7c231fc2b0 ("drm/amdgpu: validate the flush_gpu_tlb_pasid()") being present in the tree. Signed-off-by: Prike Liang Reviewed-by: Christian König Signed-off-by: Alex Deucher (cherry picked from commit 9163fe4d790fb4e16d6b0e23f55b43cddd3d4a65) --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c index c362d4dfb5bbb..a67285118c37b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -1069,9 +1069,7 @@ amdgpu_vm_tlb_flush(struct amdgpu_vm_update_params *params, } /* Prepare a TLB flush fence to be attached to PTs */ - if (!params->unlocked && - /* SI doesn't support pasid or KIQ/MES */ - params->adev->family > AMDGPU_FAMILY_SI) { + if (!params->unlocked) { amdgpu_vm_tlb_fence_create(params->adev, vm, fence); /* Makes sure no PD/PT is freed before the flush */ From 86050fcaa5b9d1281c9d79ccc7a8431053b9261e Mon Sep 17 00:00:00 2001 From: Harish Kasiviswanathan Date: Sun, 11 Jan 2026 16:53:18 -0500 Subject: [PATCH 1139/1321] drm/amdkfd: No need to suspend whole MES to evict process Each queue of the process is individually removed and there is not need to suspend whole mes. Suspending mes stops kernel mode queues also causing unnecessary timeouts when running mixed work loads Fixes: 079ae5118e1f ("drm/amdkfd: fix suspend/resume all calls in mes based eviction path") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4765 Signed-off-by: Harish Kasiviswanathan Reviewed-by: Alex Deucher Signed-off-by: Alex Deucher (cherry picked from commit 3fd20580b96a6e9da65b94ac3b58ee288239b731) --- .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 12 +----------- 1 file changed, 1 insertion(+), 11 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c index 8af0929ca40a3..625ea8ab7a749 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c @@ -1209,14 +1209,8 @@ static int evict_process_queues_cpsch(struct device_queue_manager *dqm, pr_debug_ratelimited("Evicting process pid %d queues\n", pdd->process->lead_thread->pid); - if (dqm->dev->kfd->shared_resources.enable_mes) { + if (dqm->dev->kfd->shared_resources.enable_mes) pdd->last_evict_timestamp = get_jiffies_64(); - retval = suspend_all_queues_mes(dqm); - if (retval) { - dev_err(dev, "Suspending all queues failed"); - goto out; - } - } /* Mark all queues as evicted. Deactivate all active queues on * the qpd. @@ -1246,10 +1240,6 @@ static int evict_process_queues_cpsch(struct device_queue_manager *dqm, KFD_UNMAP_QUEUES_FILTER_ALL_QUEUES : KFD_UNMAP_QUEUES_FILTER_DYNAMIC_QUEUES, 0, USE_DEFAULT_GRACE_PERIOD); - } else { - retval = resume_all_queues_mes(dqm); - if (retval) - dev_err(dev, "Resuming all queues failed"); } out: From ea51386d2fb530c07fa95525b5652dc83921d30b Mon Sep 17 00:00:00 2001 From: Srinivasan Shanmugam Date: Wed, 14 Jan 2026 16:14:53 +0530 Subject: [PATCH 1140/1321] drm/amdgpu/userq: Fix fence reference leak on queue teardown v2 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The user mode queue keeps a pointer to the most recent fence in userq->last_fence. This pointer holds an extra dma_fence reference. When the queue is destroyed, we free the fence driver and its xarray, but we forgot to drop the last_fence reference. Because of the missing dma_fence_put(), the last fence object can stay alive when the driver unloads. This leaves an allocated object in the amdgpu_userq_fence slab cache and triggers This is visible during driver unload as: BUG amdgpu_userq_fence: Objects remaining on __kmem_cache_shutdown() kmem_cache_destroy amdgpu_userq_fence: Slab cache still has objects Call Trace: kmem_cache_destroy amdgpu_userq_fence_slab_fini amdgpu_exit __do_sys_delete_module Fix this by putting userq->last_fence and clearing the pointer during amdgpu_userq_fence_driver_free(). This makes sure the fence reference is released and the slab cache is empty when the module exits. v2: Update to only release userq->last_fence with dma_fence_put() (Christian) Fixes: edc762a51c71 ("drm/amdgpu/userq: move some code around") Cc: Alex Deucher Cc: Christian König Signed-off-by: Srinivasan Shanmugam Reviewed-by: Christian König Signed-off-by: Alex Deucher (cherry picked from commit 8e051e38a8d45caf6a866d4ff842105b577953bb) --- drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c index 53d8707f9881a..85e9edc1cb6ff 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c @@ -141,6 +141,8 @@ static void amdgpu_userq_walk_and_drop_fence_drv(struct xarray *xa) void amdgpu_userq_fence_driver_free(struct amdgpu_usermode_queue *userq) { + dma_fence_put(userq->last_fence); + amdgpu_userq_walk_and_drop_fence_drv(&userq->fence_drv_xa); xa_destroy(&userq->fence_drv_xa); /* Drop the fence_drv reference held by user queue */ From 605fadd55baeea37fec2934f847748aff65e7c52 Mon Sep 17 00:00:00 2001 From: Ivan Lipski Date: Tue, 13 Jan 2026 17:29:59 -0500 Subject: [PATCH 1141/1321] drm/amd/display: Add an hdmi_hpd_debounce_delay_ms module [Why&How] Right now, the HDMI HPD filter is enabled by default at 1500ms. We want to disable it by default, as most modern displays with HDMI do not require it for DPMS mode. The HPD can instead be enabled as a driver parameter with a custom delay value in ms (up to 5000ms). Fixes: c918e75e1ed9 ("drm/amd/display: Add an HPD filter for HDMI") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4859 Signed-off-by: Ivan Lipski Reviewed-by: Mario Limonciello (AMD) Signed-off-by: Alex Deucher (cherry picked from commit 6a681cd9034587fe3550868bacfbd639d1c6891f) --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 11 +++++++++++ drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 15 ++++++++++++--- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h | 5 ++++- 4 files changed, 29 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index 9f9774f58ce1c..b20a06abb65dd 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -274,6 +274,8 @@ extern int amdgpu_rebar; extern int amdgpu_wbrf; extern int amdgpu_user_queue; +extern uint amdgpu_hdmi_hpd_debounce_delay_ms; + #define AMDGPU_VM_MAX_NUM_CTX 4096 #define AMDGPU_SG_THRESHOLD (256*1024*1024) #define AMDGPU_WAIT_IDLE_TIMEOUT_IN_MS 3000 diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index 848e6b7db482d..6ccb80e2d7c9b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -247,6 +247,7 @@ int amdgpu_damage_clips = -1; /* auto */ int amdgpu_umsch_mm_fwlog; int amdgpu_rebar = -1; /* auto */ int amdgpu_user_queue = -1; +uint amdgpu_hdmi_hpd_debounce_delay_ms; DECLARE_DYNDBG_CLASSMAP(drm_debug_classes, DD_CLASS_TYPE_DISJOINT_BITS, 0, "DRM_UT_CORE", @@ -1123,6 +1124,16 @@ module_param_named(rebar, amdgpu_rebar, int, 0444); MODULE_PARM_DESC(user_queue, "Enable user queues (-1 = auto (default), 0 = disable, 1 = enable, 2 = enable UQs and disable KQs)"); module_param_named(user_queue, amdgpu_user_queue, int, 0444); +/* + * DOC: hdmi_hpd_debounce_delay_ms (uint) + * HDMI HPD disconnect debounce delay in milliseconds. + * + * Used to filter short disconnect->reconnect HPD toggles some HDMI sinks + * generate while entering/leaving power save. Set to 0 to disable by default. + */ +MODULE_PARM_DESC(hdmi_hpd_debounce_delay_ms, "HDMI HPD disconnect debounce delay in milliseconds (0 to disable (by default), 1500 is common)"); +module_param_named(hdmi_hpd_debounce_delay_ms, amdgpu_hdmi_hpd_debounce_delay_ms, uint, 0644); + /* These devices are not supported by amdgpu. * They are supported by the mach64, r128, radeon drivers */ diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index fae88ce8327fa..1ea5a250440f6 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -8947,9 +8947,18 @@ void amdgpu_dm_connector_init_helper(struct amdgpu_display_manager *dm, mutex_init(&aconnector->hpd_lock); mutex_init(&aconnector->handle_mst_msg_ready); - aconnector->hdmi_hpd_debounce_delay_ms = AMDGPU_DM_HDMI_HPD_DEBOUNCE_MS; - INIT_DELAYED_WORK(&aconnector->hdmi_hpd_debounce_work, hdmi_hpd_debounce_work); - aconnector->hdmi_prev_sink = NULL; + /* + * If HDMI HPD debounce delay is set, use the minimum between selected + * value and AMDGPU_DM_MAX_HDMI_HPD_DEBOUNCE_MS + */ + if (amdgpu_hdmi_hpd_debounce_delay_ms) { + aconnector->hdmi_hpd_debounce_delay_ms = min(amdgpu_hdmi_hpd_debounce_delay_ms, + AMDGPU_DM_MAX_HDMI_HPD_DEBOUNCE_MS); + INIT_DELAYED_WORK(&aconnector->hdmi_hpd_debounce_work, hdmi_hpd_debounce_work); + aconnector->hdmi_prev_sink = NULL; + } else { + aconnector->hdmi_hpd_debounce_delay_ms = 0; + } /* * configure support HPD hot plug connector_>polled default value is 0 diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h index bd0403005f370..beb0d04d3e682 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h @@ -59,7 +59,10 @@ #define AMDGPU_HDR_MULT_DEFAULT (0x100000000LL) -#define AMDGPU_DM_HDMI_HPD_DEBOUNCE_MS 1500 +/* + * Maximum HDMI HPD debounce delay in milliseconds + */ +#define AMDGPU_DM_MAX_HDMI_HPD_DEBOUNCE_MS 5000 /* #include "include/amdgpu_dal_power_if.h" #include "amdgpu_dm_irq.h" From 8f07ed0bea230ff94cc728db53de39d60540aefd Mon Sep 17 00:00:00 2001 From: Sebastian Reichel Date: Tue, 14 Oct 2025 18:00:57 +0200 Subject: [PATCH 1142/1321] drm/bridge: dw-hdmi-qp: Fix spurious IRQ on resume After resume from suspend to RAM, the following splash is generated if the HDMI driver is probed (independent of a connected cable): [ 1194.484052] irq 80: nobody cared (try booting with the "irqpoll" option) [ 1194.484074] CPU: 0 UID: 0 PID: 627 Comm: rtcwake Not tainted 6.17.0-rc7-g96f1a11414b3 #1 PREEMPT [ 1194.484082] Hardware name: Rockchip RK3576 EVB V10 Board (DT) [ 1194.484085] Call trace: [ 1194.484087] ... (stripped) [ 1194.484283] handlers: [ 1194.484285] [<00000000bc363dcb>] dw_hdmi_qp_main_hardirq [dw_hdmi_qp] [ 1194.484302] Disabling IRQ #80 Apparently the HDMI IP is losing part of its state while the system is suspended and generates spurious interrupts during resume. The bug has not yet been noticed, as system suspend does not yet work properly on upstream kernel with either the Rockchip RK3588 or RK3576 platform. Fixes: 128a9bf8ace2 ("drm/rockchip: Add basic RK3588 HDMI output support") Signed-off-by: Sebastian Reichel Reviewed-by: Cristian Ciocaltea Signed-off-by: Heiko Stuebner Link: https://patch.msgid.link/20251014-rockchip-hdmi-suspend-fix-v1-1-983fcbf44839@collabora.com --- drivers/gpu/drm/bridge/synopsys/dw-hdmi-qp.c | 9 +++++++++ drivers/gpu/drm/rockchip/dw_hdmi_qp-rockchip.c | 12 +++++++++++- include/drm/bridge/dw_hdmi_qp.h | 1 + 3 files changed, 21 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-qp.c b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-qp.c index fe4c026280f03..60166919c5b54 100644 --- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-qp.c +++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-qp.c @@ -163,6 +163,7 @@ struct dw_hdmi_qp { unsigned long ref_clk_rate; struct regmap *regm; + int main_irq; unsigned long tmds_char_rate; }; @@ -1271,6 +1272,7 @@ struct dw_hdmi_qp *dw_hdmi_qp_bind(struct platform_device *pdev, dw_hdmi_qp_init_hw(hdmi); + hdmi->main_irq = plat_data->main_irq; ret = devm_request_threaded_irq(dev, plat_data->main_irq, dw_hdmi_qp_main_hardirq, NULL, IRQF_SHARED, dev_name(dev), hdmi); @@ -1331,9 +1333,16 @@ struct dw_hdmi_qp *dw_hdmi_qp_bind(struct platform_device *pdev, } EXPORT_SYMBOL_GPL(dw_hdmi_qp_bind); +void dw_hdmi_qp_suspend(struct device *dev, struct dw_hdmi_qp *hdmi) +{ + disable_irq(hdmi->main_irq); +} +EXPORT_SYMBOL_GPL(dw_hdmi_qp_suspend); + void dw_hdmi_qp_resume(struct device *dev, struct dw_hdmi_qp *hdmi) { dw_hdmi_qp_init_hw(hdmi); + enable_irq(hdmi->main_irq); } EXPORT_SYMBOL_GPL(dw_hdmi_qp_resume); diff --git a/drivers/gpu/drm/rockchip/dw_hdmi_qp-rockchip.c b/drivers/gpu/drm/rockchip/dw_hdmi_qp-rockchip.c index c9fe6aa3e3e36..6e39e8a007749 100644 --- a/drivers/gpu/drm/rockchip/dw_hdmi_qp-rockchip.c +++ b/drivers/gpu/drm/rockchip/dw_hdmi_qp-rockchip.c @@ -640,6 +640,15 @@ static void dw_hdmi_qp_rockchip_remove(struct platform_device *pdev) component_del(&pdev->dev, &dw_hdmi_qp_rockchip_ops); } +static int __maybe_unused dw_hdmi_qp_rockchip_suspend(struct device *dev) +{ + struct rockchip_hdmi_qp *hdmi = dev_get_drvdata(dev); + + dw_hdmi_qp_suspend(dev, hdmi->hdmi); + + return 0; +} + static int __maybe_unused dw_hdmi_qp_rockchip_resume(struct device *dev) { struct rockchip_hdmi_qp *hdmi = dev_get_drvdata(dev); @@ -655,7 +664,8 @@ static int __maybe_unused dw_hdmi_qp_rockchip_resume(struct device *dev) } static const struct dev_pm_ops dw_hdmi_qp_rockchip_pm = { - SET_SYSTEM_SLEEP_PM_OPS(NULL, dw_hdmi_qp_rockchip_resume) + SET_SYSTEM_SLEEP_PM_OPS(dw_hdmi_qp_rockchip_suspend, + dw_hdmi_qp_rockchip_resume) }; struct platform_driver dw_hdmi_qp_rockchip_pltfm_driver = { diff --git a/include/drm/bridge/dw_hdmi_qp.h b/include/drm/bridge/dw_hdmi_qp.h index 3f461f6b9bbfb..3af12f82da2c1 100644 --- a/include/drm/bridge/dw_hdmi_qp.h +++ b/include/drm/bridge/dw_hdmi_qp.h @@ -34,5 +34,6 @@ struct dw_hdmi_qp_plat_data { struct dw_hdmi_qp *dw_hdmi_qp_bind(struct platform_device *pdev, struct drm_encoder *encoder, const struct dw_hdmi_qp_plat_data *plat_data); +void dw_hdmi_qp_suspend(struct device *dev, struct dw_hdmi_qp *hdmi); void dw_hdmi_qp_resume(struct device *dev, struct dw_hdmi_qp *hdmi); #endif /* __DW_HDMI_QP__ */ From 9e488ff69d151fb79be8915b06df9e0a30cc1c39 Mon Sep 17 00:00:00 2001 From: Ian Forbes Date: Fri, 14 Nov 2025 14:37:03 -0600 Subject: [PATCH 1143/1321] drm/vmwgfx: Fix KMS with 3D on HW version 10 HW version 10 does not have GB Surfaces so there is no backing buffer for surface backed FBs. This would result in a nullptr dereference and crash the driver causing a black screen. Fixes: 965544150d1c ("drm/vmwgfx: Refactor cursor handling") Signed-off-by: Ian Forbes Reviewed-by: Zack Rusin Signed-off-by: Zack Rusin Link: https://patch.msgid.link/20251114203703.1946616-1-ian.forbes@broadcom.com --- drivers/gpu/drm/vmwgfx/vmwgfx_kms.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c index d32ce1cb579e6..bc51b5d55e38a 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c @@ -766,13 +766,15 @@ static struct drm_framebuffer *vmw_kms_fb_create(struct drm_device *dev, return ERR_PTR(ret); } - ttm_bo_reserve(&bo->tbo, false, false, NULL); - ret = vmw_bo_dirty_add(bo); - if (!ret && surface && surface->res.func->dirty_alloc) { - surface->res.coherent = true; - ret = surface->res.func->dirty_alloc(&surface->res); + if (bo) { + ttm_bo_reserve(&bo->tbo, false, false, NULL); + ret = vmw_bo_dirty_add(bo); + if (!ret && surface && surface->res.func->dirty_alloc) { + surface->res.coherent = true; + ret = surface->res.func->dirty_alloc(&surface->res); + } + ttm_bo_unreserve(&bo->tbo); } - ttm_bo_unreserve(&bo->tbo); return &vfb->base; } From 2345307e93f20858eb507319dad39914cf5f982f Mon Sep 17 00:00:00 2001 From: Ian Forbes Date: Wed, 7 Jan 2026 09:20:59 -0600 Subject: [PATCH 1144/1321] drm/vmwgfx: Merge vmw_bo_release and vmw_bo_free functions Some of the warnings need to be reordered between these two functions in order to be correct. This has happened multiple times. Merging them solves this problem once and for all. Fixes: d6667f0ddf46 ("drm/vmwgfx: Fix handling of dumb buffers") Signed-off-by: Ian Forbes Signed-off-by: Zack Rusin Link: https://patch.msgid.link/20260107152059.3048329-1-ian.forbes@broadcom.com --- drivers/gpu/drm/vmwgfx/vmwgfx_bo.c | 22 ++++++++-------------- 1 file changed, 8 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c index f031a312c7835..b22887e8c8815 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c @@ -32,9 +32,15 @@ #include -static void vmw_bo_release(struct vmw_bo *vbo) +/** + * vmw_bo_free - vmw_bo destructor + * + * @bo: Pointer to the embedded struct ttm_buffer_object + */ +static void vmw_bo_free(struct ttm_buffer_object *bo) { struct vmw_resource *res; + struct vmw_bo *vbo = to_vmw_bo(&bo->base); WARN_ON(kref_read(&vbo->tbo.base.refcount) != 0); vmw_bo_unmap(vbo); @@ -62,20 +68,8 @@ static void vmw_bo_release(struct vmw_bo *vbo) } vmw_surface_unreference(&vbo->dumb_surface); } - drm_gem_object_release(&vbo->tbo.base); -} - -/** - * vmw_bo_free - vmw_bo destructor - * - * @bo: Pointer to the embedded struct ttm_buffer_object - */ -static void vmw_bo_free(struct ttm_buffer_object *bo) -{ - struct vmw_bo *vbo = to_vmw_bo(&bo->base); - WARN_ON(!RB_EMPTY_ROOT(&vbo->res_tree)); - vmw_bo_release(vbo); + drm_gem_object_release(&vbo->tbo.base); WARN_ON(vbo->dirty); kfree(vbo); } From a5e10bbf2a26b6676206fc647572f0903ec90530 Mon Sep 17 00:00:00 2001 From: Haoxiang Li Date: Wed, 24 Dec 2025 17:11:05 +0800 Subject: [PATCH 1145/1321] drm/vmwgfx: Fix an error return check in vmw_compat_shader_add() In vmw_compat_shader_add(), the return value check of vmw_shader_alloc() is not proper. Modify the check for the return pointer 'res'. Found by code review and compiled on ubuntu 20.04. Fixes: 18e4a4669c50 ("drm/vmwgfx: Fix compat shader namespace") Cc: stable@vger.kernel.org Signed-off-by: Haoxiang Li Signed-off-by: Zack Rusin Link: https://patch.msgid.link/20251224091105.1569464-1-lihaoxiang@isrc.iscas.ac.cn --- drivers/gpu/drm/vmwgfx/vmwgfx_shader.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_shader.c b/drivers/gpu/drm/vmwgfx/vmwgfx_shader.c index 69dfe69ce0f87..a8c8c9375d297 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_shader.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_shader.c @@ -923,8 +923,10 @@ int vmw_compat_shader_add(struct vmw_private *dev_priv, ttm_bo_unreserve(&buf->tbo); res = vmw_shader_alloc(dev_priv, buf, size, 0, shader_type); - if (unlikely(ret != 0)) + if (IS_ERR(res)) { + ret = PTR_ERR(res); goto no_reserve; + } ret = vmw_cmdbuf_res_add(man, vmw_cmdbuf_res_shader, vmw_shader_key(user_key, shader_type), From 6b4c0666d9eff77c4bb86b83aec1c2a89fdbdeaf Mon Sep 17 00:00:00 2001 From: Bartlomiej Kubik Date: Thu, 11 Dec 2025 19:10:44 +0100 Subject: [PATCH 1146/1321] drm/vmwgfx: Fix kernel-doc warnings for vmwgfx_fence Add missing descriptions for vmw_event_fence_action_seq_passed. This fixes the following warnings: drivers/gpu/drm/vmwgfx/vmwgfx_fence.c:526 function parameter 'f' not described in 'vmw_event_fence_action_seq_passed' drivers/gpu/drm/vmwgfx/vmwgfx_fence.c:526 function parameter 'cb' not described in 'vmw_event_fence_action_seq_passed' Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202507040807.jKTxWGVQ-lkp@intel.com/ Signed-off-by: Bartlomiej Kubik Signed-off-by: Zack Rusin Link: https://lore.kernel.org/all/20251211181044.4098689-1-kubik.bartlomiej@gmail.com/ --- drivers/gpu/drm/vmwgfx/vmwgfx_fence.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_fence.c b/drivers/gpu/drm/vmwgfx/vmwgfx_fence.c index 00be92da55097..85795082fef91 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_fence.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_fence.c @@ -515,12 +515,12 @@ int vmw_fence_obj_unref_ioctl(struct drm_device *dev, void *data, /** * vmw_event_fence_action_seq_passed * - * @action: The struct vmw_fence_action embedded in a struct - * vmw_event_fence_action. + * @f: The struct dma_fence which provides timestamp for the action event + * @cb: The struct dma_fence_cb callback for the action event. * - * This function is called when the seqno of the fence where @action is - * attached has passed. It queues the event on the submitter's event list. - * This function is always called from atomic context. + * This function is called when the seqno of the fence has passed + * and it is always called from atomic context. + * It queues the event on the submitter's event list. */ static void vmw_event_fence_action_seq_passed(struct dma_fence *f, struct dma_fence_cb *cb) From 2ce7d2f1641756b69a5e0ef2682312adb6e84536 Mon Sep 17 00:00:00 2001 From: Andy Yan Date: Fri, 18 Jul 2025 14:41:13 +0800 Subject: [PATCH 1147/1321] drm/rockchip: vop2: Add delay between poll registers According to the implementation of read_poll_timeout_atomic, if the delay time is 0, it will only use a simple loop based on timeout_us to decrement the count. Therefore, the final timeout time will differ significantly from the set timeout time. So, here we set a specific delay time to ensure that the calculation of the timeout duration is accurate. Fixes: 3e89a8c68354 ("drm/rockchip: vop2: Fix the update of LAYER/PORT select registers when there are multi display output on rk3588/rk3568") Signed-off-by: Andy Yan Signed-off-by: Heiko Stuebner Link: https://patch.msgid.link/20250718064120.8811-1-andyshrk@163.com --- drivers/gpu/drm/rockchip/rockchip_vop2_reg.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/rockchip/rockchip_vop2_reg.c b/drivers/gpu/drm/rockchip/rockchip_vop2_reg.c index cd8380f0eddc8..855386a6a9f5c 100644 --- a/drivers/gpu/drm/rockchip/rockchip_vop2_reg.c +++ b/drivers/gpu/drm/rockchip/rockchip_vop2_reg.c @@ -2104,7 +2104,7 @@ static void rk3568_vop2_wait_for_port_mux_done(struct vop2 *vop2) * Spin until the previous port_mux figuration is done. */ ret = readx_poll_timeout_atomic(rk3568_vop2_read_port_mux, vop2, port_mux_sel, - port_mux_sel == vop2->old_port_sel, 0, 50 * 1000); + port_mux_sel == vop2->old_port_sel, 10, 50 * 1000); if (ret) DRM_DEV_ERROR(vop2->dev, "wait port_mux done timeout: 0x%x--0x%x\n", port_mux_sel, vop2->old_port_sel); @@ -2124,7 +2124,7 @@ static void rk3568_vop2_wait_for_layer_cfg_done(struct vop2 *vop2, u32 cfg) * Spin until the previous layer configuration is done. */ ret = readx_poll_timeout_atomic(rk3568_vop2_read_layer_cfg, vop2, atv_layer_cfg, - atv_layer_cfg == cfg, 0, 50 * 1000); + atv_layer_cfg == cfg, 10, 50 * 1000); if (ret) DRM_DEV_ERROR(vop2->dev, "wait layer cfg done timeout: 0x%x--0x%x\n", atv_layer_cfg, cfg); From e72c0c37137020ccdaf6d2e0a730a50fa5eae849 Mon Sep 17 00:00:00 2001 From: Andy Yan Date: Fri, 18 Jul 2025 14:41:14 +0800 Subject: [PATCH 1148/1321] drm/rockchip: vop2: Only wait for changed layer cfg done when there is pending cfgdone bits The write of cfgdone bits always done at .atomic_flush. When userspace makes plane zpos changes of two crtc within one commit, at the .atomic_begin stage, crtcN will never receive the "layer change cfg done" event of crtcM because crtcM has not yet written "cfgdone". So only wait when there is pending cfgdone bits to avoid long timeout. Fixes: 3e89a8c68354 ("drm/rockchip: vop2: Fix the update of LAYER/PORT select registers when there are multi display output on rk3588/rk3568") Signed-off-by: Andy Yan Signed-off-by: Heiko Stuebner Link: https://patch.msgid.link/20250718064120.8811-2-andyshrk@163.com --- drivers/gpu/drm/rockchip/rockchip_vop2_reg.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/rockchip/rockchip_vop2_reg.c b/drivers/gpu/drm/rockchip/rockchip_vop2_reg.c index 855386a6a9f5c..f3950e8476a75 100644 --- a/drivers/gpu/drm/rockchip/rockchip_vop2_reg.c +++ b/drivers/gpu/drm/rockchip/rockchip_vop2_reg.c @@ -2144,6 +2144,7 @@ static void rk3568_vop2_setup_layer_mixer(struct vop2_video_port *vp) u8 layer_sel_id; unsigned int ofs; u32 ovl_ctrl; + u32 cfg_done; int i; struct vop2_video_port *vp0 = &vop2->vps[0]; struct vop2_video_port *vp1 = &vop2->vps[1]; @@ -2298,8 +2299,16 @@ static void rk3568_vop2_setup_layer_mixer(struct vop2_video_port *vp) rk3568_vop2_wait_for_port_mux_done(vop2); } - if (layer_sel != old_layer_sel && atv_layer_sel != old_layer_sel) - rk3568_vop2_wait_for_layer_cfg_done(vop2, vop2->old_layer_sel); + if (layer_sel != old_layer_sel && atv_layer_sel != old_layer_sel) { + cfg_done = vop2_readl(vop2, RK3568_REG_CFG_DONE); + cfg_done &= (BIT(vop2->data->nr_vps) - 1); + cfg_done &= ~BIT(vp->id); + /* + * Changes of other VPs' overlays have not taken effect + */ + if (cfg_done) + rk3568_vop2_wait_for_layer_cfg_done(vop2, vop2->old_layer_sel); + } vop2_writel(vop2, RK3568_OVL_LAYER_SEL, layer_sel); mutex_unlock(&vop2->ovl_lock); From e538646a38f1e10501404ab5f438050189e6fcaa Mon Sep 17 00:00:00 2001 From: Alice Ryhl Date: Thu, 8 Jan 2026 16:07:31 +0000 Subject: [PATCH 1149/1321] drm/gpuvm: take GEM lock inside drm_gpuvm_bo_obtain_prealloc() When calling drm_gpuvm_bo_obtain_prealloc() and using immediate mode, this may result in a call to ops->vm_bo_free(vm_bo) while holding the GEMs gpuva mutex. This is a problem if ops->vm_bo_free(vm_bo) performs any operations that are not safe in the fence signalling critical path, and it turns out that Panthor (the only current user of the method) calls drm_gem_shmem_unpin() which takes a resv lock internally. This constitutes both a violation of signalling safety and lock inversion. To fix this, we modify the method to internally take the GEMs gpuva mutex so that the mutex can be unlocked before freeing the preallocated vm_bo. Note that this modification introduces a requirement that the driver uses immediate mode to call drm_gpuvm_bo_obtain_prealloc() as it would otherwise take the wrong lock. Fixes: 63e919a31625 ("panthor: use drm_gpuva_unlink_defer()") Reviewed-by: Boris Brezillon Signed-off-by: Alice Ryhl Link: https://patch.msgid.link/20260108-gpuvm-rust-v2-1-dbd014005a0b@google.com Signed-off-by: Danilo Krummrich --- drivers/gpu/drm/drm_gpuvm.c | 69 +++++++++++++++++++-------- drivers/gpu/drm/panthor/panthor_mmu.c | 10 ---- 2 files changed, 48 insertions(+), 31 deletions(-) diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c index 8a06d296561d1..0de47e83d84df 100644 --- a/drivers/gpu/drm/drm_gpuvm.c +++ b/drivers/gpu/drm/drm_gpuvm.c @@ -1602,14 +1602,48 @@ drm_gpuvm_bo_create(struct drm_gpuvm *gpuvm, } EXPORT_SYMBOL_GPL(drm_gpuvm_bo_create); +/* + * drm_gpuvm_bo_destroy_not_in_lists() - final part of drm_gpuvm_bo cleanup + * @vm_bo: the &drm_gpuvm_bo to destroy + * + * It is illegal to call this method if the @vm_bo is present in the GEMs gpuva + * list, the extobj list, or the evicted list. + * + * Note that this puts a refcount on the GEM object, which may destroy the GEM + * object if the refcount reaches zero. It's illegal for this to happen if the + * caller holds the GEMs gpuva mutex because it would free the mutex. + */ +static void +drm_gpuvm_bo_destroy_not_in_lists(struct drm_gpuvm_bo *vm_bo) +{ + struct drm_gpuvm *gpuvm = vm_bo->vm; + const struct drm_gpuvm_ops *ops = gpuvm->ops; + struct drm_gem_object *obj = vm_bo->obj; + + if (ops && ops->vm_bo_free) + ops->vm_bo_free(vm_bo); + else + kfree(vm_bo); + + drm_gpuvm_put(gpuvm); + drm_gem_object_put(obj); +} + +static void +drm_gpuvm_bo_destroy_not_in_lists_kref(struct kref *kref) +{ + struct drm_gpuvm_bo *vm_bo = container_of(kref, struct drm_gpuvm_bo, + kref); + + drm_gpuvm_bo_destroy_not_in_lists(vm_bo); +} + static void drm_gpuvm_bo_destroy(struct kref *kref) { struct drm_gpuvm_bo *vm_bo = container_of(kref, struct drm_gpuvm_bo, kref); struct drm_gpuvm *gpuvm = vm_bo->vm; - const struct drm_gpuvm_ops *ops = gpuvm->ops; - struct drm_gem_object *obj = vm_bo->obj; bool lock = !drm_gpuvm_resv_protected(gpuvm); if (!lock) @@ -1618,16 +1652,10 @@ drm_gpuvm_bo_destroy(struct kref *kref) drm_gpuvm_bo_list_del(vm_bo, extobj, lock); drm_gpuvm_bo_list_del(vm_bo, evict, lock); - drm_gem_gpuva_assert_lock_held(gpuvm, obj); + drm_gem_gpuva_assert_lock_held(gpuvm, vm_bo->obj); list_del(&vm_bo->list.entry.gem); - if (ops && ops->vm_bo_free) - ops->vm_bo_free(vm_bo); - else - kfree(vm_bo); - - drm_gpuvm_put(gpuvm); - drm_gem_object_put(obj); + drm_gpuvm_bo_destroy_not_in_lists(vm_bo); } /** @@ -1745,9 +1773,7 @@ EXPORT_SYMBOL_GPL(drm_gpuvm_bo_put_deferred); void drm_gpuvm_bo_deferred_cleanup(struct drm_gpuvm *gpuvm) { - const struct drm_gpuvm_ops *ops = gpuvm->ops; struct drm_gpuvm_bo *vm_bo; - struct drm_gem_object *obj; struct llist_node *bo_defer; bo_defer = llist_del_all(&gpuvm->bo_defer); @@ -1766,14 +1792,7 @@ drm_gpuvm_bo_deferred_cleanup(struct drm_gpuvm *gpuvm) while (bo_defer) { vm_bo = llist_entry(bo_defer, struct drm_gpuvm_bo, list.entry.bo_defer); bo_defer = bo_defer->next; - obj = vm_bo->obj; - if (ops && ops->vm_bo_free) - ops->vm_bo_free(vm_bo); - else - kfree(vm_bo); - - drm_gpuvm_put(gpuvm); - drm_gem_object_put(obj); + drm_gpuvm_bo_destroy_not_in_lists(vm_bo); } } EXPORT_SYMBOL_GPL(drm_gpuvm_bo_deferred_cleanup); @@ -1861,6 +1880,9 @@ EXPORT_SYMBOL_GPL(drm_gpuvm_bo_obtain); * count is decreased. If not found @__vm_bo is returned without further * increase of the reference count. * + * The provided @__vm_bo must not already be in the gpuva, evict, or extobj + * lists prior to calling this method. + * * A new &drm_gpuvm_bo is added to the GEMs gpuva list. * * Returns: a pointer to the found &drm_gpuvm_bo or @__vm_bo if no existing @@ -1873,14 +1895,19 @@ drm_gpuvm_bo_obtain_prealloc(struct drm_gpuvm_bo *__vm_bo) struct drm_gem_object *obj = __vm_bo->obj; struct drm_gpuvm_bo *vm_bo; + drm_WARN_ON(gpuvm->drm, !drm_gpuvm_immediate_mode(gpuvm)); + + mutex_lock(&obj->gpuva.lock); vm_bo = drm_gpuvm_bo_find(gpuvm, obj); if (vm_bo) { - drm_gpuvm_bo_put(__vm_bo); + mutex_unlock(&obj->gpuva.lock); + kref_put(&__vm_bo->kref, drm_gpuvm_bo_destroy_not_in_lists_kref); return vm_bo; } drm_gem_gpuva_assert_lock_held(gpuvm, obj); list_add_tail(&__vm_bo->list.entry.gem, &obj->gpuva.list); + mutex_unlock(&obj->gpuva.lock); return __vm_bo; } diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c index d4839d282689f..f6339963e4960 100644 --- a/drivers/gpu/drm/panthor/panthor_mmu.c +++ b/drivers/gpu/drm/panthor/panthor_mmu.c @@ -1252,17 +1252,7 @@ static int panthor_vm_prepare_map_op_ctx(struct panthor_vm_op_ctx *op_ctx, goto err_cleanup; } - /* drm_gpuvm_bo_obtain_prealloc() will call drm_gpuvm_bo_put() on our - * pre-allocated BO if the association exists. Given we - * only have one ref on preallocated_vm_bo, drm_gpuvm_bo_destroy() will - * be called immediately, and we have to hold the VM resv lock when - * calling this function. - */ - dma_resv_lock(panthor_vm_resv(vm), NULL); - mutex_lock(&bo->base.base.gpuva.lock); op_ctx->map.vm_bo = drm_gpuvm_bo_obtain_prealloc(preallocated_vm_bo); - mutex_unlock(&bo->base.base.gpuva.lock); - dma_resv_unlock(panthor_vm_resv(vm)); op_ctx->map.bo_offset = offset; From 59f8b4528220e9da995ef11c60122ea087ca0e1c Mon Sep 17 00:00:00 2001 From: "Chia-Lin Kao (AceLan)" Date: Thu, 4 Dec 2025 10:46:47 +0800 Subject: [PATCH 1150/1321] drm/dp: Add byte-by-byte fallback for broken USB-C adapters Some USB-C hubs and adapters have buggy firmware where multi-byte AUX reads consistently timeout, while single-byte reads from the same address work correctly. Known affected devices that exhibit this issue: - Lenovo USB-C to VGA adapter (VIA VL817 chipset) idVendor=17ef, idProduct=7217 - Dell DA310 USB-C mobile adapter hub idVendor=413c, idProduct=c010 Analysis of the failure pattern shows: - Single-byte probes to 0xf0000 (LTTPR) succeed - Single-byte probes to 0x00102 (TRAINING_AUX_RD_INTERVAL) succeed - Multi-byte reads from 0x00000 (DPCD capabilities) timeout with -ETIMEDOUT - Retrying does not help - the failure is consistent across all attempts The issue appears to be a firmware bug in the AUX transaction handling that specifically affects multi-byte reads. Add a fallback mechanism in drm_dp_dpcd_read_data() that attempts byte-by-byte reading when the normal multi-byte read fails. This workaround only activates for adapters that fail the standard read path, ensuring no impact on correctly functioning hardware. Tested with: - Lenovo USB-C to VGA adapter (VIA VL817) - now works with fallback - Dell DA310 USB-C hub - now works with fallback - Dell/Analogix Slimport adapter - continues to work with normal path Signed-off-by: Chia-Lin Kao (AceLan) Reviewed-by: Mario Limonciello (AMD) Link: https://patch.msgid.link/20251204024647.1462866-1-acelan.kao@canonical.com Signed-off-by: Mario Limonciello (AMD) --- include/drm/display/drm_dp_helper.h | 57 +++++++++++++++++++---------- 1 file changed, 37 insertions(+), 20 deletions(-) diff --git a/include/drm/display/drm_dp_helper.h b/include/drm/display/drm_dp_helper.h index df2f24b950e4c..14d2859f0bdab 100644 --- a/include/drm/display/drm_dp_helper.h +++ b/include/drm/display/drm_dp_helper.h @@ -551,6 +551,22 @@ ssize_t drm_dp_dpcd_read(struct drm_dp_aux *aux, unsigned int offset, ssize_t drm_dp_dpcd_write(struct drm_dp_aux *aux, unsigned int offset, void *buffer, size_t size); +/** + * drm_dp_dpcd_readb() - read a single byte from the DPCD + * @aux: DisplayPort AUX channel + * @offset: address of the register to read + * @valuep: location where the value of the register will be stored + * + * Returns the number of bytes transferred (1) on success, or a negative + * error code on failure. In most of the cases you should be using + * drm_dp_dpcd_read_byte() instead. + */ +static inline ssize_t drm_dp_dpcd_readb(struct drm_dp_aux *aux, + unsigned int offset, u8 *valuep) +{ + return drm_dp_dpcd_read(aux, offset, valuep, 1); +} + /** * drm_dp_dpcd_read_data() - read a series of bytes from the DPCD * @aux: DisplayPort AUX channel (SST or MST) @@ -570,12 +586,29 @@ static inline int drm_dp_dpcd_read_data(struct drm_dp_aux *aux, void *buffer, size_t size) { int ret; + size_t i; + u8 *buf = buffer; ret = drm_dp_dpcd_read(aux, offset, buffer, size); - if (ret < 0) - return ret; - if (ret < size) - return -EPROTO; + if (ret >= 0) { + if (ret < size) + return -EPROTO; + return 0; + } + + /* + * Workaround for USB-C hubs/adapters with buggy firmware that fail + * multi-byte AUX reads but work with single-byte reads. + * Known affected devices: + * - Lenovo USB-C to VGA adapter (VIA VL817, idVendor=17ef, idProduct=7217) + * - Dell DA310 USB-C hub (idVendor=413c, idProduct=c010) + * Attempt byte-by-byte reading as a fallback. + */ + for (i = 0; i < size; i++) { + ret = drm_dp_dpcd_readb(aux, offset + i, &buf[i]); + if (ret < 0) + return ret; + } return 0; } @@ -609,22 +642,6 @@ static inline int drm_dp_dpcd_write_data(struct drm_dp_aux *aux, return 0; } -/** - * drm_dp_dpcd_readb() - read a single byte from the DPCD - * @aux: DisplayPort AUX channel - * @offset: address of the register to read - * @valuep: location where the value of the register will be stored - * - * Returns the number of bytes transferred (1) on success, or a negative - * error code on failure. In most of the cases you should be using - * drm_dp_dpcd_read_byte() instead. - */ -static inline ssize_t drm_dp_dpcd_readb(struct drm_dp_aux *aux, - unsigned int offset, u8 *valuep) -{ - return drm_dp_dpcd_read(aux, offset, valuep, 1); -} - /** * drm_dp_dpcd_writeb() - write a single byte to the DPCD * @aux: DisplayPort AUX channel From e330cefb6a299bd85dac2b97370013c1776c1aa1 Mon Sep 17 00:00:00 2001 From: Cristian Ciocaltea Date: Sat, 10 Jan 2026 23:12:11 +0200 Subject: [PATCH 1151/1321] drm/rockchip: dw_hdmi_qp: Switch to gpiod_set_value_cansleep() Since commit 20cf2aed89ac ("gpio: rockchip: mark the GPIO controller as sleeping"), the Rockchip GPIO chip operations potentially sleep, hence the kernel complains when trying to make use of the non-sleeping API: [ 16.653343] WARNING: drivers/gpio/gpiolib.c:3902 at gpiod_set_value+0xd0/0x108, CPU#5: kworker/5:1/93 ... [ 16.678470] Hardware name: Radxa ROCK 5B (DT) [ 16.682374] Workqueue: events dw_hdmi_qp_rk3588_hpd_work [rockchipdrm] ... [ 16.729314] Call trace: [ 16.731846] gpiod_set_value+0xd0/0x108 (P) [ 16.734548] dw_hdmi_qp_rockchip_encoder_enable+0xbc/0x3a8 [rockchipdrm] [ 16.737487] drm_atomic_helper_commit_encoder_bridge_enable+0x314/0x380 [drm_kms_helper] [ 16.740555] drm_atomic_helper_commit_tail_rpm+0xa4/0x100 [drm_kms_helper] [ 16.743501] commit_tail+0x1e0/0x2c0 [drm_kms_helper] [ 16.746290] drm_atomic_helper_commit+0x274/0x2b8 [drm_kms_helper] [ 16.749178] drm_atomic_commit+0x1f0/0x248 [drm] [ 16.752000] drm_client_modeset_commit_atomic+0x490/0x5d0 [drm] [ 16.754954] drm_client_modeset_commit_locked+0xf4/0x400 [drm] [ 16.757911] drm_client_modeset_commit+0x50/0x80 [drm] [ 16.760791] __drm_fb_helper_restore_fbdev_mode_unlocked+0x9c/0x170 [drm_kms_helper] [ 16.763843] drm_fb_helper_hotplug_event+0x340/0x368 [drm_kms_helper] [ 16.766780] drm_fbdev_client_hotplug+0x64/0x1d0 [drm_client_lib] [ 16.769634] drm_client_hotplug+0x178/0x240 [drm] [ 16.772455] drm_client_dev_hotplug+0x170/0x1c0 [drm] [ 16.775303] drm_connector_helper_hpd_irq_event+0xa4/0x178 [drm_kms_helper] [ 16.778248] dw_hdmi_qp_rk3588_hpd_work+0x44/0xb8 [rockchipdrm] [ 16.781080] process_one_work+0xc3c/0x1658 [ 16.783719] worker_thread+0xa24/0xc40 [ 16.786333] kthread+0x3b4/0x3d8 [ 16.788889] ret_from_fork+0x10/0x20 Since gpiod_get_value() is called from a context that can sleep, switch to its *_cansleep() variant and get rid of the issue. Signed-off-by: Cristian Ciocaltea Signed-off-by: Heiko Stuebner Link: https://patch.msgid.link/20260110-dw-hdmi-qp-cansleep-v1-1-1ce937c5b201@collabora.com --- drivers/gpu/drm/rockchip/dw_hdmi_qp-rockchip.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/rockchip/dw_hdmi_qp-rockchip.c b/drivers/gpu/drm/rockchip/dw_hdmi_qp-rockchip.c index 6e39e8a007749..8604342f99432 100644 --- a/drivers/gpu/drm/rockchip/dw_hdmi_qp-rockchip.c +++ b/drivers/gpu/drm/rockchip/dw_hdmi_qp-rockchip.c @@ -121,7 +121,7 @@ static void dw_hdmi_qp_rockchip_encoder_enable(struct drm_encoder *encoder) struct drm_crtc *crtc = encoder->crtc; /* Unconditionally switch to TMDS as FRL is not yet supported */ - gpiod_set_value(hdmi->frl_enable_gpio, 0); + gpiod_set_value_cansleep(hdmi->frl_enable_gpio, 0); if (!crtc || !crtc->state) return; From 9c15ecaabec58b75a4323c7af2c4605e22dfdb53 Mon Sep 17 00:00:00 2001 From: Marek Vasut Date: Sat, 10 Jan 2026 16:27:28 +0100 Subject: [PATCH 1152/1321] drm/panel-simple: fix connector type for DataImage SCF0700C48GGU18 panel The connector type for the DataImage SCF0700C48GGU18 panel is missing and devm_drm_panel_bridge_add() requires connector type to be set. This leads to a warning and a backtrace in the kernel log and panel does not work: " WARNING: CPU: 3 PID: 38 at drivers/gpu/drm/bridge/panel.c:379 devm_drm_of_get_bridge+0xac/0xb8 " The warning is triggered by a check for valid connector type in devm_drm_panel_bridge_add(). If there is no valid connector type set for a panel, the warning is printed and panel is not added. Fill in the missing connector type to fix the warning and make the panel operational once again. Cc: stable@vger.kernel.org Fixes: 97ceb1fb08b6 ("drm/panel: simple: Add support for DataImage SCF0700C48GGU18") Signed-off-by: Marek Vasut Reviewed-by: Neil Armstrong Signed-off-by: Neil Armstrong Link: https://patch.msgid.link/20260110152750.73848-1-marex@nabladev.com --- drivers/gpu/drm/panel/panel-simple.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/panel/panel-simple.c b/drivers/gpu/drm/panel/panel-simple.c index b26b682826bca..246d6883fbf7f 100644 --- a/drivers/gpu/drm/panel/panel-simple.c +++ b/drivers/gpu/drm/panel/panel-simple.c @@ -1900,6 +1900,7 @@ static const struct panel_desc dataimage_scf0700c48ggu18 = { }, .bus_format = MEDIA_BUS_FMT_RGB888_1X24, .bus_flags = DRM_BUS_FLAG_DE_HIGH | DRM_BUS_FLAG_PIXDATA_DRIVE_POSEDGE, + .connector_type = DRM_MODE_CONNECTOR_DPI, }; static const struct display_timing dlc_dlc0700yzg_1_timing = { From 9472b1a962ce881d6e073b98de9bbc82e98bb3fd Mon Sep 17 00:00:00 2001 From: Ludovic Desroches Date: Thu, 18 Dec 2025 14:34:43 +0100 Subject: [PATCH 1153/1321] drm/panel: simple: restore connector_type fallback The switch from devm_kzalloc() + drm_panel_init() to devm_drm_panel_alloc() introduced a regression. Several panel descriptors do not set connector_type. For those panels, panel_simple_probe() used to compute a connector type (currently DPI as a fallback) and pass that value to drm_panel_init(). After the conversion to devm_drm_panel_alloc(), the call unconditionally used desc->connector_type instead, ignoring the computed fallback and potentially passing DRM_MODE_CONNECTOR_Unknown, which drm_panel_bridge_add() does not allow. Move the connector_type validation / fallback logic before the devm_drm_panel_alloc() call and pass the computed connector_type to devm_drm_panel_alloc(), so panels without an explicit connector_type once again get the DPI default. Signed-off-by: Ludovic Desroches Fixes: de04bb0089a9 ("drm/panel/panel-simple: Use the new allocation in place of devm_kzalloc()") Cc: stable@vger.kernel.org Reviewed-by: Luca Ceresoli Link: https://lore.kernel.org/stable/20251126-lcd_panel_connector_type_fix-v2-1-c15835d1f7cb%40microchip.com Signed-off-by: Neil Armstrong Link: https://patch.msgid.link/20251218-lcd_panel_connector_type_fix-v3-1-ddcea6d8d7ef@microchip.com --- drivers/gpu/drm/panel/panel-simple.c | 89 ++++++++++++++-------------- 1 file changed, 44 insertions(+), 45 deletions(-) diff --git a/drivers/gpu/drm/panel/panel-simple.c b/drivers/gpu/drm/panel/panel-simple.c index 246d6883fbf7f..162cc58c7b8f0 100644 --- a/drivers/gpu/drm/panel/panel-simple.c +++ b/drivers/gpu/drm/panel/panel-simple.c @@ -623,49 +623,6 @@ static struct panel_simple *panel_simple_probe(struct device *dev) if (IS_ERR(desc)) return ERR_CAST(desc); - panel = devm_drm_panel_alloc(dev, struct panel_simple, base, - &panel_simple_funcs, desc->connector_type); - if (IS_ERR(panel)) - return ERR_CAST(panel); - - panel->desc = desc; - - panel->supply = devm_regulator_get(dev, "power"); - if (IS_ERR(panel->supply)) - return ERR_CAST(panel->supply); - - panel->enable_gpio = devm_gpiod_get_optional(dev, "enable", - GPIOD_OUT_LOW); - if (IS_ERR(panel->enable_gpio)) - return dev_err_cast_probe(dev, panel->enable_gpio, - "failed to request GPIO\n"); - - err = of_drm_get_panel_orientation(dev->of_node, &panel->orientation); - if (err) { - dev_err(dev, "%pOF: failed to get orientation %d\n", dev->of_node, err); - return ERR_PTR(err); - } - - ddc = of_parse_phandle(dev->of_node, "ddc-i2c-bus", 0); - if (ddc) { - panel->ddc = of_find_i2c_adapter_by_node(ddc); - of_node_put(ddc); - - if (!panel->ddc) - return ERR_PTR(-EPROBE_DEFER); - } - - if (!of_device_is_compatible(dev->of_node, "panel-dpi") && - !of_get_display_timing(dev->of_node, "panel-timing", &dt)) - panel_simple_parse_panel_timing_node(dev, panel, &dt); - - if (desc->connector_type == DRM_MODE_CONNECTOR_LVDS) { - /* Optional data-mapping property for overriding bus format */ - err = panel_simple_override_nondefault_lvds_datamapping(dev, panel); - if (err) - goto free_ddc; - } - connector_type = desc->connector_type; /* Catch common mistakes for panels. */ switch (connector_type) { @@ -690,8 +647,7 @@ static struct panel_simple *panel_simple_probe(struct device *dev) break; case DRM_MODE_CONNECTOR_eDP: dev_warn(dev, "eDP panels moved to panel-edp\n"); - err = -EINVAL; - goto free_ddc; + return ERR_PTR(-EINVAL); case DRM_MODE_CONNECTOR_DSI: if (desc->bpc != 6 && desc->bpc != 8) dev_warn(dev, "Expected bpc in {6,8} but got: %u\n", desc->bpc); @@ -720,6 +676,49 @@ static struct panel_simple *panel_simple_probe(struct device *dev) break; } + panel = devm_drm_panel_alloc(dev, struct panel_simple, base, + &panel_simple_funcs, connector_type); + if (IS_ERR(panel)) + return ERR_CAST(panel); + + panel->desc = desc; + + panel->supply = devm_regulator_get(dev, "power"); + if (IS_ERR(panel->supply)) + return ERR_CAST(panel->supply); + + panel->enable_gpio = devm_gpiod_get_optional(dev, "enable", + GPIOD_OUT_LOW); + if (IS_ERR(panel->enable_gpio)) + return dev_err_cast_probe(dev, panel->enable_gpio, + "failed to request GPIO\n"); + + err = of_drm_get_panel_orientation(dev->of_node, &panel->orientation); + if (err) { + dev_err(dev, "%pOF: failed to get orientation %d\n", dev->of_node, err); + return ERR_PTR(err); + } + + ddc = of_parse_phandle(dev->of_node, "ddc-i2c-bus", 0); + if (ddc) { + panel->ddc = of_find_i2c_adapter_by_node(ddc); + of_node_put(ddc); + + if (!panel->ddc) + return ERR_PTR(-EPROBE_DEFER); + } + + if (!of_device_is_compatible(dev->of_node, "panel-dpi") && + !of_get_display_timing(dev->of_node, "panel-timing", &dt)) + panel_simple_parse_panel_timing_node(dev, panel, &dt); + + if (desc->connector_type == DRM_MODE_CONNECTOR_LVDS) { + /* Optional data-mapping property for overriding bus format */ + err = panel_simple_override_nondefault_lvds_datamapping(dev, panel); + if (err) + goto free_ddc; + } + dev_set_drvdata(dev, panel); /* From 3a4dbada1db308a246435881734d48fa3f6d5098 Mon Sep 17 00:00:00 2001 From: Shenghao Yang Date: Wed, 31 Dec 2025 13:50:26 +0800 Subject: [PATCH 1154/1321] drm/gud: fix NULL fb and crtc dereferences on USB disconnect On disconnect drm_atomic_helper_disable_all() is called which sets both the fb and crtc for a plane to NULL before invoking a commit. This causes a kernel oops on every display disconnect. Add guards for those dereferences. Cc: # 6.18.x Fixes: 73cfd166e045 ("drm/gud: Replace simple display pipe with DRM atomic helpers") Signed-off-by: Shenghao Yang Reviewed-by: Ruben Wauters Signed-off-by: Ruben Wauters Link: https://patch.msgid.link/20251231055039.44266-1-me@shenghaoyang.info --- drivers/gpu/drm/gud/gud_pipe.c | 20 ++++++++------------ 1 file changed, 8 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/gud/gud_pipe.c b/drivers/gpu/drm/gud/gud_pipe.c index 76d77a736d84a..4b77be94348d9 100644 --- a/drivers/gpu/drm/gud/gud_pipe.c +++ b/drivers/gpu/drm/gud/gud_pipe.c @@ -457,27 +457,20 @@ int gud_plane_atomic_check(struct drm_plane *plane, struct drm_plane_state *old_plane_state = drm_atomic_get_old_plane_state(state, plane); struct drm_plane_state *new_plane_state = drm_atomic_get_new_plane_state(state, plane); struct drm_crtc *crtc = new_plane_state->crtc; - struct drm_crtc_state *crtc_state; + struct drm_crtc_state *crtc_state = NULL; const struct drm_display_mode *mode; struct drm_framebuffer *old_fb = old_plane_state->fb; struct drm_connector_state *connector_state = NULL; struct drm_framebuffer *fb = new_plane_state->fb; - const struct drm_format_info *format = fb->format; + const struct drm_format_info *format; struct drm_connector *connector; unsigned int i, num_properties; struct gud_state_req *req; int idx, ret; size_t len; - if (drm_WARN_ON_ONCE(plane->dev, !fb)) - return -EINVAL; - - if (drm_WARN_ON_ONCE(plane->dev, !crtc)) - return -EINVAL; - - crtc_state = drm_atomic_get_new_crtc_state(state, crtc); - - mode = &crtc_state->mode; + if (crtc) + crtc_state = drm_atomic_get_new_crtc_state(state, crtc); ret = drm_atomic_helper_check_plane_state(new_plane_state, crtc_state, DRM_PLANE_NO_SCALING, @@ -492,6 +485,9 @@ int gud_plane_atomic_check(struct drm_plane *plane, if (old_plane_state->rotation != new_plane_state->rotation) crtc_state->mode_changed = true; + mode = &crtc_state->mode; + format = fb->format; + if (old_fb && old_fb->format != format) crtc_state->mode_changed = true; @@ -598,7 +594,7 @@ void gud_plane_atomic_update(struct drm_plane *plane, struct drm_atomic_helper_damage_iter iter; int ret, idx; - if (crtc->state->mode_changed || !crtc->state->enable) { + if (!crtc || crtc->state->mode_changed || !crtc->state->enable) { cancel_work_sync(&gdrm->work); mutex_lock(&gdrm->damage_lock); if (gdrm->fb) { From 6a88563e6f8e04bb8cccc30a21d39d1667ffe51e Mon Sep 17 00:00:00 2001 From: Lyude Paul Date: Fri, 19 Dec 2025 16:52:02 -0500 Subject: [PATCH 1155/1321] drm/nouveau/disp/nv50-: Set lock_core in curs507a_prepare For a while, I've been seeing a strange issue where some (usually not all) of the display DMA channels will suddenly hang, particularly when there is a visible cursor on the screen that is being frequently updated, and especially when said cursor happens to go between two screens. While this brings back lovely memories of fixing Intel Skylake bugs, I would quite like to fix it :). It turns out the problem that's happening here is that we're managing to reach nv50_head_flush_set() in our atomic commit path without actually holding nv50_disp->mutex. This means that cursor updates happening in parallel (along with any other atomic updates that need to use the core channel) will race with eachother, which eventually causes us to corrupt the pushbuffer - leading to a plethora of various GSP errors, usually: nouveau 0000:c1:00.0: gsp: Xid:56 CMDre 00000000 00000218 00102680 00000004 00800003 nouveau 0000:c1:00.0: gsp: Xid:56 CMDre 00000000 0000021c 00040509 00000004 00000001 nouveau 0000:c1:00.0: gsp: Xid:56 CMDre 00000000 00000000 00000000 00000001 00000001 The reason this is happening is because generally we check whether we need to set nv50_atom->lock_core at the end of nv50_head_atomic_check(). However, curs507a_prepare is called from the fb_prepare callback, which happens after the atomic check phase. As a result, this can lead to commits that both touch the core channel but also don't grab nv50_disp->mutex. So, fix this by making sure that we set nv50_atom->lock_core in cus507a_prepare(). Reviewed-by: Dave Airlie Signed-off-by: Lyude Paul Fixes: 1590700d94ac ("drm/nouveau/kms/nv50-: split each resource type into their own source files") Cc: # v4.18+ Link: https://patch.msgid.link/20251219215344.170852-2-lyude@redhat.com --- drivers/gpu/drm/nouveau/dispnv50/curs507a.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/nouveau/dispnv50/curs507a.c b/drivers/gpu/drm/nouveau/dispnv50/curs507a.c index a95ee5dcc2e39..1a889139cb053 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/curs507a.c +++ b/drivers/gpu/drm/nouveau/dispnv50/curs507a.c @@ -84,6 +84,7 @@ curs507a_prepare(struct nv50_wndw *wndw, struct nv50_head_atom *asyh, asyh->curs.handle = handle; asyh->curs.offset = offset; asyh->set.curs = asyh->curs.visible; + nv50_atom(asyh->state.state)->lock_core = true; } } From fe32bcf00911649cdd68c0c31223553b3b07a74c Mon Sep 17 00:00:00 2001 From: Lyude Paul Date: Fri, 19 Dec 2025 16:52:03 -0500 Subject: [PATCH 1156/1321] drm/nouveau/kms/nv50-: Assert we hold nv50_disp->lock in nv50_head_flush_* Now that we've had one bug that occurred in nouveau as the result of nv50_head_flush_* being called without the appropriate locks, let's add some lockdep asserts to make sure this doesn't happen in the future. Reviewed-by: Dave Airlie Signed-off-by: Lyude Paul Link: https://patch.msgid.link/20251219215344.170852-3-lyude@redhat.com --- drivers/gpu/drm/nouveau/dispnv50/head.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/gpu/drm/nouveau/dispnv50/head.c b/drivers/gpu/drm/nouveau/dispnv50/head.c index 3dd742b4f8232..e32ed1db6c566 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/head.c +++ b/drivers/gpu/drm/nouveau/dispnv50/head.c @@ -43,6 +43,9 @@ nv50_head_flush_clr(struct nv50_head *head, union nv50_head_atom_mask clr = { .mask = asyh->clr.mask & ~(flush ? 0 : asyh->set.mask), }; + + lockdep_assert_held(&head->disp->mutex); + if (clr.crc) nv50_crc_atomic_clr(head); if (clr.olut) head->func->olut_clr(head); if (clr.core) head->func->core_clr(head); @@ -65,6 +68,8 @@ nv50_head_flush_set_wndw(struct nv50_head *head, struct nv50_head_atom *asyh) void nv50_head_flush_set(struct nv50_head *head, struct nv50_head_atom *asyh) { + lockdep_assert_held(&head->disp->mutex); + if (asyh->set.view ) head->func->view (head, asyh); if (asyh->set.mode ) head->func->mode (head, asyh); if (asyh->set.core ) head->func->core_set(head, asyh); From 069544ce59122ff789515dba44038fd1384a0e4e Mon Sep 17 00:00:00 2001 From: Thomas Zimmermann Date: Thu, 8 Jan 2026 15:19:46 +0100 Subject: [PATCH 1157/1321] drm/sysfb: Remove duplicate declarations Commit 6046b49bafff ("drm/sysfb: Share helpers for integer validation") and commit e8c086880b2b ("drm/sysfb: Share helpers for screen_info validation") added duplicate function declarations. Remove the latter ones. Signed-off-by: Thomas Zimmermann Fixes: e8c086880b2b ("drm/sysfb: Share helpers for screen_info validation") Cc: Thomas Zimmermann Cc: Javier Martinez Canillas Cc: dri-devel@lists.freedesktop.org Cc: # v6.16+ Reviewed-by: Javier Martinez Canillas Link: https://patch.msgid.link/20260108145058.56943-7-tzimmermann@suse.de --- drivers/gpu/drm/sysfb/drm_sysfb_helper.h | 9 --------- 1 file changed, 9 deletions(-) diff --git a/drivers/gpu/drm/sysfb/drm_sysfb_helper.h b/drivers/gpu/drm/sysfb/drm_sysfb_helper.h index da670d7eeb2e1..de96bfe7562c4 100644 --- a/drivers/gpu/drm/sysfb/drm_sysfb_helper.h +++ b/drivers/gpu/drm/sysfb/drm_sysfb_helper.h @@ -54,15 +54,6 @@ const struct drm_format_info *drm_sysfb_get_format_si(struct drm_device *dev, const struct screen_info *si); #endif -/* - * Input parsing - */ - -int drm_sysfb_get_validated_int(struct drm_device *dev, const char *name, - u64 value, u32 max); -int drm_sysfb_get_validated_int0(struct drm_device *dev, const char *name, - u64 value, u32 max); - /* * Display modes */ From f59ac2578429a44e73e64cc0f13d38a939143555 Mon Sep 17 00:00:00 2001 From: Ming Lei Date: Wed, 14 Jan 2026 16:54:05 +0800 Subject: [PATCH 1158/1321] io_uring: move local task_work in exit cancel loop With IORING_SETUP_DEFER_TASKRUN, task work is queued to ctx->work_llist (local work) rather than the fallback list. During io_ring_exit_work(), io_move_task_work_from_local() was called once before the cancel loop, moving work from work_llist to fallback_llist. However, task work can be added to work_llist during the cancel loop itself. There are two cases: 1) io_kill_timeouts() is called from io_uring_try_cancel_requests() to cancel pending timeouts, and it adds task work via io_req_queue_tw_complete() for each cancelled timeout: 2) URING_CMD requests like ublk can be completed via io_uring_cmd_complete_in_task() from ublk_queue_rq() during canceling, given ublk request queue is only quiesced when canceling the 1st uring_cmd. Since io_allowed_defer_tw_run() returns false in io_ring_exit_work() (kworker != submitter_task), io_run_local_work() is never invoked, and the work_llist entries are never processed. This causes io_uring_try_cancel_requests() to loop indefinitely, resulting in 100% CPU usage in kworker threads. Fix this by moving io_move_task_work_from_local() inside the cancel loop, ensuring any work on work_llist is moved to fallback before each cancel attempt. Cc: stable@vger.kernel.org Fixes: c0e0d6ba25f1 ("io_uring: add IORING_SETUP_DEFER_TASKRUN") Signed-off-by: Ming Lei Signed-off-by: Jens Axboe --- io_uring/io_uring.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c index 87a87396e9409..b7a077c11c21a 100644 --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -3003,12 +3003,12 @@ static __cold void io_ring_exit_work(struct work_struct *work) mutex_unlock(&ctx->uring_lock); } - if (ctx->flags & IORING_SETUP_DEFER_TASKRUN) - io_move_task_work_from_local(ctx); - /* The SQPOLL thread never reaches this path */ - while (io_uring_try_cancel_requests(ctx, NULL, true, false)) + do { + if (ctx->flags & IORING_SETUP_DEFER_TASKRUN) + io_move_task_work_from_local(ctx); cond_resched(); + } while (io_uring_try_cancel_requests(ctx, NULL, true, false)); if (ctx->sq_data) { struct io_sq_data *sqd = ctx->sq_data; From 7649ce437da62ed5abf5ff0cc102f6b89fb95c40 Mon Sep 17 00:00:00 2001 From: Caleb Sander Mateos Date: Thu, 8 Jan 2026 10:22:10 -0700 Subject: [PATCH 1159/1321] block: zero non-PI portion of auto integrity buffer The auto-generated integrity buffer for writes needs to be fully initialized before being passed to the underlying block device, otherwise the uninitialized memory can be read back by userspace or anyone with physical access to the storage device. If protection information is generated, that portion of the integrity buffer is already initialized. The integrity data is also zeroed if PI generation is disabled via sysfs or the PI tuple size is 0. However, this misses the case where PI is generated and the PI tuple size is nonzero, but the metadata size is larger than the PI tuple. In this case, the remainder ("opaque") of the metadata is left uninitialized. Generalize the BLK_INTEGRITY_CSUM_NONE check to cover any case when the metadata is larger than just the PI tuple. Signed-off-by: Caleb Sander Mateos Fixes: c546d6f43833 ("block: only zero non-PI metadata tuples in bio_integrity_prep") Reviewed-by: Anuj Gupta Reviewed-by: Christoph Hellwig Reviewed-by: Martin K. Petersen Signed-off-by: Jens Axboe --- block/bio-integrity-auto.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block/bio-integrity-auto.c b/block/bio-integrity-auto.c index 9850c338548d3..cff025b06be18 100644 --- a/block/bio-integrity-auto.c +++ b/block/bio-integrity-auto.c @@ -140,7 +140,7 @@ bool bio_integrity_prep(struct bio *bio) return true; set_flags = false; gfp |= __GFP_ZERO; - } else if (bi->csum_type == BLK_INTEGRITY_CSUM_NONE) + } else if (bi->metadata_size > bi->pi_tuple_size) gfp |= __GFP_ZERO; break; default: From 2d3ffc06914a1f6aea625fe80c816add0921e164 Mon Sep 17 00:00:00 2001 From: Nilay Shroff Date: Tue, 13 Jan 2026 12:27:22 +0530 Subject: [PATCH 1160/1321] null_blk: fix kmemleak by releasing references to fault configfs items When CONFIG_BLK_DEV_NULL_BLK_FAULT_INJECTION is enabled, the null-blk driver sets up fault injection support by creating the timeout_inject, requeue_inject, and init_hctx_fault_inject configfs items as children of the top-level nullbX configfs group. However, when the nullbX device is removed, the references taken to these fault-config configfs items are not released. As a result, kmemleak reports a memory leak, for example: unreferenced object 0xc00000021ff25c40 (size 32): comm "mkdir", pid 10665, jiffies 4322121578 hex dump (first 32 bytes): 69 6e 69 74 5f 68 63 74 78 5f 66 61 75 6c 74 5f init_hctx_fault_ 69 6e 6a 65 63 74 00 88 00 00 00 00 00 00 00 00 inject.......... backtrace (crc 1a018c86): __kmalloc_node_track_caller_noprof+0x494/0xbd8 kvasprintf+0x74/0xf4 config_item_set_name+0xf0/0x104 config_group_init_type_name+0x48/0xfc fault_config_init+0x48/0xf0 0xc0080000180559e4 configfs_mkdir+0x304/0x814 vfs_mkdir+0x49c/0x604 do_mkdirat+0x314/0x3d0 sys_mkdir+0xa0/0xd8 system_call_exception+0x1b0/0x4f0 system_call_vectored_common+0x15c/0x2ec Fix this by explicitly releasing the references to the fault-config configfs items when dropping the reference to the top-level nullbX configfs group. Cc: stable@vger.kernel.org Reviewed-by: Chaitanya Kulkarni Fixes: bb4c19e030f4 ("block: null_blk: make fault-injection dynamically configurable per device") Signed-off-by: Nilay Shroff Signed-off-by: Jens Axboe --- drivers/block/null_blk/main.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c index c7c0fb79a6bf6..4c0632ab4e1bb 100644 --- a/drivers/block/null_blk/main.c +++ b/drivers/block/null_blk/main.c @@ -665,12 +665,22 @@ static void nullb_add_fault_config(struct nullb_device *dev) configfs_add_default_group(&dev->init_hctx_fault_config.group, &dev->group); } +static void nullb_del_fault_config(struct nullb_device *dev) +{ + config_item_put(&dev->init_hctx_fault_config.group.cg_item); + config_item_put(&dev->requeue_config.group.cg_item); + config_item_put(&dev->timeout_config.group.cg_item); +} + #else static void nullb_add_fault_config(struct nullb_device *dev) { } +static void nullb_del_fault_config(struct nullb_device *dev) +{ +} #endif static struct @@ -702,7 +712,7 @@ nullb_group_drop_item(struct config_group *group, struct config_item *item) null_del_dev(dev->nullb); mutex_unlock(&lock); } - + nullb_del_fault_config(dev); config_item_put(item); } From 30683a203ed06ce8320aec0472a5152187a3a199 Mon Sep 17 00:00:00 2001 From: Ilikara Zheng Date: Mon, 8 Dec 2025 21:23:40 +0800 Subject: [PATCH 1161/1321] nvme-pci: disable secondary temp for Wodposit WPBSNM8 Secondary temperature thresholds (temp2_{min,max}) were not reported properly on this NVMe SSD. This resulted in an error while attempting to read these values with sensors(1): ERROR: Can't get value of subfeature temp2_min: I/O error ERROR: Can't get value of subfeature temp2_max: I/O error Add the device to the nvme_id_table with the NVME_QUIRK_NO_SECONDARY_TEMP_THRESH flag to suppress access to all non- composite temperature thresholds. Cc: stable@vger.kernel.org Tested-by: Wu Haotian Signed-off-by: Ilikara Zheng Signed-off-by: Keith Busch --- drivers/nvme/host/pci.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index 0e4caeab739c7..29e715d5b8f30 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -3999,6 +3999,8 @@ static const struct pci_device_id nvme_id_table[] = { .driver_data = NVME_QUIRK_NO_DEEPEST_PS, }, { PCI_DEVICE(0x1e49, 0x0041), /* ZHITAI TiPro7000 NVMe SSD */ .driver_data = NVME_QUIRK_NO_DEEPEST_PS, }, + { PCI_DEVICE(0x1fa0, 0x2283), /* Wodposit WPBSNM8-256GTP */ + .driver_data = NVME_QUIRK_NO_SECONDARY_TEMP_THRESH, }, { PCI_DEVICE(0x025e, 0xf1ac), /* SOLIDIGM P44 pro SSDPFKKW020X7 */ .driver_data = NVME_QUIRK_NO_DEEPEST_PS, }, { PCI_DEVICE(0xc0a9, 0x540a), /* Crucial P2 */ From 577f5a893d8e94e96a82edb55be9060719dff88f Mon Sep 17 00:00:00 2001 From: Shivam Kumar Date: Sat, 13 Dec 2025 13:57:48 -0500 Subject: [PATCH 1162/1321] nvme-tcp: fix NULL pointer dereferences in nvmet_tcp_build_pdu_iovec MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Commit efa56305908b ("nvmet-tcp: Fix a kernel panic when host sends an invalid H2C PDU length") added ttag bounds checking and data_offset validation in nvmet_tcp_handle_h2c_data_pdu(), but it did not validate whether the command's data structures (cmd->req.sg and cmd->iov) have been properly initialized before processing H2C_DATA PDUs. The nvmet_tcp_build_pdu_iovec() function dereferences these pointers without NULL checks. This can be triggered by sending H2C_DATA PDU immediately after the ICREQ/ICRESP handshake, before sending a CONNECT command or NVMe write command. Attack vectors that trigger NULL pointer dereferences: 1. H2C_DATA PDU sent before CONNECT → both pointers NULL 2. H2C_DATA PDU for READ command → cmd->req.sg allocated, cmd->iov NULL 3. H2C_DATA PDU for uninitialized command slot → both pointers NULL The fix validates both cmd->req.sg and cmd->iov before calling nvmet_tcp_build_pdu_iovec(). Both checks are required because: - Uninitialized commands: both NULL - READ commands: cmd->req.sg allocated, cmd->iov NULL - WRITE commands: both allocated Fixes: efa56305908b ("nvmet-tcp: Fix a kernel panic when host sends an invalid H2C PDU length") Reviewed-by: Sagi Grimberg Signed-off-by: Shivam Kumar Signed-off-by: Keith Busch --- drivers/nvme/target/tcp.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c index 15416ff0eac49..d5966d007ba3f 100644 --- a/drivers/nvme/target/tcp.c +++ b/drivers/nvme/target/tcp.c @@ -982,6 +982,18 @@ static int nvmet_tcp_handle_h2c_data_pdu(struct nvmet_tcp_queue *queue) pr_err("H2CData PDU len %u is invalid\n", cmd->pdu_len); goto err_proto; } + /* + * Ensure command data structures are initialized. We must check both + * cmd->req.sg and cmd->iov because they can have different NULL states: + * - Uninitialized commands: both NULL + * - READ commands: cmd->req.sg allocated, cmd->iov NULL + * - WRITE commands: both allocated + */ + if (unlikely(!cmd->req.sg || !cmd->iov)) { + pr_err("queue %d: H2CData PDU received for invalid command state (ttag %u)\n", + queue->idx, data->ttag); + goto err_proto; + } cmd->pdu_recv = 0; nvmet_tcp_build_pdu_iovec(cmd); queue->cmd = cmd; From eca44ea1a1de627b9ffea2b4f03f5c4781157db8 Mon Sep 17 00:00:00 2001 From: Janne Grunau Date: Wed, 31 Dec 2025 11:10:57 +0100 Subject: [PATCH 1163/1321] nvme-apple: add "apple,t8103-nvme-ans2" as compatible After discussion with the devicetree maintainers we agreed to not extend lists with the generic compatible "apple,nvme-ans2" anymore [1]. Add "apple,t8103-nvme-ans2" as fallback compatible as it is the SoC the driver and bindings were written for. [1]: https://lore.kernel.org/asahi/12ab93b7-1fc2-4ce0-926e-c8141cfe81bf@kernel.org/ Cc: stable@vger.kernel.org # v6.18+ Fixes: 5bd2927aceba ("nvme-apple: Add initial Apple SoC NVMe driver") Reviewed-by: Neal Gompa Reviewed-by: Christoph Hellwig Signed-off-by: Janne Grunau Signed-off-by: Keith Busch --- drivers/nvme/host/apple.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/nvme/host/apple.c b/drivers/nvme/host/apple.c index 15b3d07f8ccdd..ed61b97fde59f 100644 --- a/drivers/nvme/host/apple.c +++ b/drivers/nvme/host/apple.c @@ -1704,6 +1704,7 @@ static const struct apple_nvme_hw apple_nvme_t8103_hw = { static const struct of_device_id apple_nvme_of_match[] = { { .compatible = "apple,t8015-nvme-ans2", .data = &apple_nvme_t8015_hw }, + { .compatible = "apple,t8103-nvme-ans2", .data = &apple_nvme_t8103_hw }, { .compatible = "apple,nvme-ans2", .data = &apple_nvme_t8103_hw }, {}, }; From 62607e4a8c77a50c50cfd4a2ad046439b7a519fa Mon Sep 17 00:00:00 2001 From: Chaitanya Kulkarni Date: Fri, 19 Dec 2025 16:18:42 -0800 Subject: [PATCH 1164/1321] nvme-fc: release admin tagset if init fails nvme_fabrics creates an NVMe/FC controller in following path: nvmf_dev_write() -> nvmf_create_ctrl() -> nvme_fc_create_ctrl() -> nvme_fc_init_ctrl() nvme_fc_init_ctrl() allocates the admin blk-mq resources right after nvme_add_ctrl() succeeds. If any of the subsequent steps fail (changing the controller state, scheduling connect work, etc.), we jump to the fail_ctrl path, which tears down the controller references but never frees the admin queue/tag set. The leaked blk-mq allocations match the kmemleak report seen during blktests nvme/fc. Check ctrl->ctrl.admin_tagset in the fail_ctrl path and call nvme_remove_admin_tag_set() when it is set so that all admin queue allocations are reclaimed whenever controller setup aborts. Reported-by: Yi Zhang Reviewed-by: Justin Tee Signed-off-by: Chaitanya Kulkarni Signed-off-by: Keith Busch --- drivers/nvme/host/fc.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c index bc455fa982462..6948de3f438ae 100644 --- a/drivers/nvme/host/fc.c +++ b/drivers/nvme/host/fc.c @@ -3587,6 +3587,8 @@ nvme_fc_init_ctrl(struct device *dev, struct nvmf_ctrl_options *opts, ctrl->ctrl.opts = NULL; + if (ctrl->ctrl.admin_tagset) + nvme_remove_admin_tag_set(&ctrl->ctrl); /* initiate nvme ctrl ref counting teardown */ nvme_uninit_ctrl(&ctrl->ctrl); From e4b5218ab58a3b3bfc0f43d0b66deb9d60d0e971 Mon Sep 17 00:00:00 2001 From: Hannes Reinecke Date: Mon, 18 Aug 2025 11:32:45 +0200 Subject: [PATCH 1165/1321] nvmet-tcp: fixup hang in nvmet_tcp_listen_data_ready() When the socket is closed while in TCP_LISTEN a callback is run to flush all outstanding packets, which in turns calls nvmet_tcp_listen_data_ready() with the sk_callback_lock held. So we need to check if we are in TCP_LISTEN before attempting to get the sk_callback_lock() to avoid a deadlock. Link: https://lore.kernel.org/linux-nvme/CAHj4cs-zu7eVB78yUpFjVe2UqMWFkLk8p+DaS3qj+uiGCXBAoA@mail.gmail.com/ Tested-by: Yi Zhang Reviewed-by: Sagi Grimberg Signed-off-by: Hannes Reinecke Signed-off-by: Keith Busch --- drivers/nvme/target/tcp.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c index d5966d007ba3f..549a4786d1c3d 100644 --- a/drivers/nvme/target/tcp.c +++ b/drivers/nvme/target/tcp.c @@ -2004,14 +2004,13 @@ static void nvmet_tcp_listen_data_ready(struct sock *sk) trace_sk_data_ready(sk); + if (sk->sk_state != TCP_LISTEN) + return; + read_lock_bh(&sk->sk_callback_lock); port = sk->sk_user_data; - if (!port) - goto out; - - if (sk->sk_state == TCP_LISTEN) + if (port) queue_work(nvmet_wq, &port->accept_work); -out: read_unlock_bh(&sk->sk_callback_lock); } From be0f6cc8f8884fffd80a407dfc7c6c333f917a45 Mon Sep 17 00:00:00 2001 From: Shin'ichiro Kawasaki Date: Sun, 21 Dec 2025 16:37:14 +0900 Subject: [PATCH 1166/1321] nvmet: do not copy beyond sybsysnqn string length Commit edd17206e363 ("nvmet: remove redundant subsysnqn field from ctrl") replaced ctrl->subsysnqn with ctrl->subsys->subsysnqn. This change works as expected because both point to strings with the same data. However, their memory allocation lengths differ. ctrl->subsysnqn had the fixed size defined as NVMF_NQN_FILED_LEN, while ctrl->subsys->subsysnqn has variable length determined by kstrndup(). Due to this difference, KASAN slab-out-of-bounds occurs at memcpy() in nvmet_passthru_override_id_ctrl() after the commit. The failure can be recreated by running the blktests test case nvme/033. To prevent such failures, replace memcpy() with strscpy(), which copies only the string length and avoids overruns. Fixes: edd17206e363 ("nvmet: remove redundant subsysnqn field from ctrl") Signed-off-by: Shin'ichiro Kawasaki Reviewed-by: Christoph Hellwig Reviewed-by: Sagi Grimberg Reviewed-by: Chaitanya Kulkarni Signed-off-by: Keith Busch --- drivers/nvme/target/passthru.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/nvme/target/passthru.c b/drivers/nvme/target/passthru.c index 96648ec2fadb5..67c423a8b052b 100644 --- a/drivers/nvme/target/passthru.c +++ b/drivers/nvme/target/passthru.c @@ -150,7 +150,7 @@ static u16 nvmet_passthru_override_id_ctrl(struct nvmet_req *req) * code path with duplicate ctrl subsysnqn. In order to prevent that we * mask the passthru-ctrl subsysnqn with the target ctrl subsysnqn. */ - memcpy(id->subnqn, ctrl->subsys->subsysnqn, sizeof(id->subnqn)); + strscpy(id->subnqn, ctrl->subsys->subsysnqn, sizeof(id->subnqn)); /* use fabric id-ctrl values */ id->ioccsz = cpu_to_le32((sizeof(struct nvme_command) + From eaf0c8329401aaeabcfbabe132a9c1c8b3100f31 Mon Sep 17 00:00:00 2001 From: Nilay Shroff Date: Wed, 14 Jan 2026 12:54:13 +0530 Subject: [PATCH 1167/1321] nvme: fix PCIe subsystem reset controller state transition MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The commit d2fe192348f9 (“nvme: only allow entering LIVE from CONNECTING state”) disallows controller state transitions directly from RESETTING to LIVE. However, the NVMe PCIe subsystem reset path relies on this transition to recover the controller on PowerPC (PPC) systems. On PPC systems, issuing a subsystem reset causes a temporary loss of communication with the NVMe adapter. A subsequent PCIe MMIO read then triggers EEH recovery, which restores the PCIe link and brings the controller back online. For EEH recovery to proceed correctly, the controller must transition back to the LIVE state. Due to the changes introduced by commit d2fe192348f9 (“nvme: only allow entering LIVE from CONNECTING state”), the controller can no longer transition directly from RESETTING to LIVE. As a result, EEH recovery exits prematurely, leaving the controller stuck in the RESETTING state. Fix this by explicitly transitioning the controller state from RESETTING to CONNECTING and then to LIVE. This satisfies the updated state transition rules and allows the controller to be successfully recovered on PPC systems following a PCIe subsystem reset. Cc: stable@vger.kernel.org Fixes: d2fe192348f9 ("nvme: only allow entering LIVE from CONNECTING state") Reviewed-by: Daniel Wagner Signed-off-by: Nilay Shroff Signed-off-by: Keith Busch --- drivers/nvme/host/pci.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index 29e715d5b8f30..58f3097888a76 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -1532,7 +1532,10 @@ static int nvme_pci_subsystem_reset(struct nvme_ctrl *ctrl) } writel(NVME_SUBSYS_RESET, dev->bar + NVME_REG_NSSR); - nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE); + + if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_CONNECTING) || + !nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE)) + goto unlock; /* * Read controller status to flush the previous write and trigger a From 9419af4dc25da7a2c5cf47c541badbc1996507a6 Mon Sep 17 00:00:00 2001 From: Chaitanya Kulkarni Date: Mon, 12 Jan 2026 15:19:28 -0800 Subject: [PATCH 1168/1321] rnbd-clt: fix refcount underflow in device unmap path During device unmapping (triggered by module unload or explicit unmap), a refcount underflow occurs causing a use-after-free warning: [14747.574913] ------------[ cut here ]------------ [14747.574916] refcount_t: underflow; use-after-free. [14747.574917] WARNING: lib/refcount.c:28 at refcount_warn_saturate+0x55/0x90, CPU#9: kworker/9:1/378 [14747.574924] Modules linked in: rnbd_client(-) rtrs_client rnbd_server rtrs_server rtrs_core ... [14747.574998] CPU: 9 UID: 0 PID: 378 Comm: kworker/9:1 Tainted: G O N 6.19.0-rc3lblk-fnext+ #42 PREEMPT(voluntary) [14747.575005] Workqueue: rnbd_clt_wq unmap_device_work [rnbd_client] [14747.575010] RIP: 0010:refcount_warn_saturate+0x55/0x90 [14747.575037] Call Trace: [14747.575038] [14747.575038] rnbd_clt_unmap_device+0x170/0x1d0 [rnbd_client] [14747.575044] process_one_work+0x211/0x600 [14747.575052] worker_thread+0x184/0x330 [14747.575055] ? __pfx_worker_thread+0x10/0x10 [14747.575058] kthread+0x10d/0x250 [14747.575062] ? __pfx_kthread+0x10/0x10 [14747.575066] ret_from_fork+0x319/0x390 [14747.575069] ? __pfx_kthread+0x10/0x10 [14747.575072] ret_from_fork_asm+0x1a/0x30 [14747.575083] [14747.575096] ---[ end trace 0000000000000000 ]--- Befor this patch :- The bug is a double kobject_put() on dev->kobj during device cleanup. Kobject Lifecycle: kobject_init_and_add() sets kobj.kref = 1 (initialization) kobject_put() sets kobj.kref = 0 (should be called once) * Before this patch: rnbd_clt_unmap_device() rnbd_destroy_sysfs() kobject_del(&dev->kobj) [remove from sysfs] kobject_put(&dev->kobj) PUT #1 (WRONG!) kref: 1 to 0 rnbd_dev_release() kfree(dev) [DEVICE FREED!] rnbd_destroy_gen_disk() [use-after-free!] rnbd_clt_put_dev() refcount_dec_and_test(&dev->refcount) kobject_put(&dev->kobj) PUT #2 (UNDERFLOW!) kref: 0 to -1 [WARNING!] The first kobject_put() in rnbd_destroy_sysfs() prematurely frees the device via rnbd_dev_release(), then the second kobject_put() in rnbd_clt_put_dev() causes refcount underflow. * After this patch :- Remove kobject_put() from rnbd_destroy_sysfs(). This function should only remove sysfs visibility (kobject_del), not manage object lifetime. Call Graph (FIXED): rnbd_clt_unmap_device() rnbd_destroy_sysfs() kobject_del(&dev->kobj) [remove from sysfs only] [kref unchanged: 1] rnbd_destroy_gen_disk() [device still valid] rnbd_clt_put_dev() refcount_dec_and_test(&dev->refcount) kobject_put(&dev->kobj) ONLY PUT (CORRECT!) kref: 1 to 0 [BALANCED] rnbd_dev_release() kfree(dev) [CLEAN DESTRUCTION] This follows the kernel pattern where sysfs removal (kobject_del) is separate from object destruction (kobject_put). Fixes: 581cf833cac4 ("block: rnbd: add .release to rnbd_dev_ktype") Signed-off-by: Chaitanya Kulkarni Acked-by: Jack Wang Reviewed-by: Jack Wang Signed-off-by: Jens Axboe --- drivers/block/rnbd/rnbd-clt.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/block/rnbd/rnbd-clt.c b/drivers/block/rnbd/rnbd-clt.c index d1c354636315d..8194a970f002a 100644 --- a/drivers/block/rnbd/rnbd-clt.c +++ b/drivers/block/rnbd/rnbd-clt.c @@ -1662,7 +1662,6 @@ static void destroy_sysfs(struct rnbd_clt_dev *dev, /* To avoid deadlock firstly remove itself */ sysfs_remove_file_self(&dev->kobj, sysfs_self); kobject_del(&dev->kobj); - kobject_put(&dev->kobj); } } From 2d47f048093224dfcece9b3fb25622c4ebfba282 Mon Sep 17 00:00:00 2001 From: Jiapeng Chong Date: Mon, 12 Jan 2026 16:58:28 +0000 Subject: [PATCH 1169/1321] arm_mpam: Remove duplicate linux/srcu.h header ./drivers/resctrl/mpam_internal.h: linux/srcu.h is included more than once. Reviewed-by: Jonathan Cameron Reported-by: Abaci Robot Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=27328 Signed-off-by: Jiapeng Chong Acked-by: James Morse [BH: Keep alphabetical order] Signed-off-by: Ben Horgan Reviewed-by: Gavin Shan Reviewed-by: Fenghua Yu Tested-by: Shaopeng Tan Tested-by: Peter Newman Signed-off-by: Catalin Marinas --- drivers/resctrl/mpam_internal.h | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/resctrl/mpam_internal.h b/drivers/resctrl/mpam_internal.h index e79c3c47259c9..17cdc3080d585 100644 --- a/drivers/resctrl/mpam_internal.h +++ b/drivers/resctrl/mpam_internal.h @@ -12,7 +12,6 @@ #include #include #include -#include #include #include #include From 3ed9e0d02e217f17ca5e7ab12f997fe050e78bea Mon Sep 17 00:00:00 2001 From: Ben Horgan Date: Mon, 12 Jan 2026 16:58:29 +0000 Subject: [PATCH 1170/1321] arm_mpam: Use non-atomic bitops when modifying feature bitmap In the test__props_mismatch() kunit test we rely on the struct mpam_props being packed to ensure memcmp doesn't consider packing. Making it packed reduces the alignment of the features bitmap and so breaks a requirement for the use of atomics. As we don't rely on the set/clear of these bits being atomic, just make them non-atomic. Reviewed-by: Jonathan Cameron Signed-off-by: Ben Horgan Fixes: 8c90dc68a5de ("arm_mpam: Probe the hardware features resctrl supports") Reviewed-by: Gavin Shan Tested-by: Shaopeng Tan Tested-by: Peter Newman Signed-off-by: Catalin Marinas --- drivers/resctrl/mpam_internal.h | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/resctrl/mpam_internal.h b/drivers/resctrl/mpam_internal.h index 17cdc3080d585..e8971842b124f 100644 --- a/drivers/resctrl/mpam_internal.h +++ b/drivers/resctrl/mpam_internal.h @@ -200,8 +200,12 @@ struct mpam_props { } PACKED_FOR_KUNIT; #define mpam_has_feature(_feat, x) test_bit(_feat, (x)->features) -#define mpam_set_feature(_feat, x) set_bit(_feat, (x)->features) -#define mpam_clear_feature(_feat, x) clear_bit(_feat, (x)->features) +/* + * The non-atomic get/set operations are used because if struct mpam_props is + * packed, the alignment requirements for atomics aren't met. + */ +#define mpam_set_feature(_feat, x) __set_bit(_feat, (x)->features) +#define mpam_clear_feature(_feat, x) __clear_bit(_feat, (x)->features) /* The values for MSMON_CFG_MBWU_FLT.RWBW */ enum mon_filter_options { From 3366683134687923f1545ae72e587c4ee2fb4be0 Mon Sep 17 00:00:00 2001 From: Huacai Chen Date: Thu, 15 Jan 2026 18:50:48 +0800 Subject: [PATCH 1171/1321] LoongArch: Remove redundant code in head.S SETUP_MODES already setup the initial values of CSR.CRMD, CSR.PRMD and CSR.EUEN, so the redundant open code can be removed. Fixes: 7b2afeafaf9c2d5 ("LoongArch: Adjust boot & setup for 32BIT/64BIT") Signed-off-by: Huacai Chen --- arch/loongarch/kernel/head.S | 8 -------- 1 file changed, 8 deletions(-) diff --git a/arch/loongarch/kernel/head.S b/arch/loongarch/kernel/head.S index 7f288e89573b7..4eed7bc312a86 100644 --- a/arch/loongarch/kernel/head.S +++ b/arch/loongarch/kernel/head.S @@ -126,14 +126,6 @@ SYM_CODE_START(smpboot_entry) LONG_LI t1, CSR_STFILL csrxchg t0, t1, LOONGARCH_CSR_IMPCTL1 #endif - /* Enable PG */ - li.w t0, 0xb0 # PLV=0, IE=0, PG=1 - csrwr t0, LOONGARCH_CSR_CRMD - li.w t0, 0x04 # PLV=0, PIE=1, PWE=0 - csrwr t0, LOONGARCH_CSR_PRMD - li.w t0, 0x00 # FPE=0, SXE=0, ASXE=0, BTE=0 - csrwr t0, LOONGARCH_CSR_EUEN - la.pcrel t0, cpuboot_data ld.d sp, t0, CPU_BOOT_STACK ld.d tp, t0, CPU_BOOT_TINFO From a008f6ac19cec0ac4f9da2be45ee34a1ce9aee30 Mon Sep 17 00:00:00 2001 From: Lisa Robinson Date: Sat, 17 Jan 2026 10:56:43 +0800 Subject: [PATCH 1172/1321] LoongArch: Fix PMU counter allocation for mixed-type event groups When validating a perf event group, validate_group() unconditionally attempts to allocate hardware PMU counters for the leader, sibling events and the new event being added. This is incorrect for mixed-type groups. If a PERF_TYPE_SOFTWARE event is part of the group, the current code still tries to allocate a hardware PMU counter for it, which can wrongly consume hardware PMU resources and cause spurious allocation failures. Fix this by only allocating PMU counters for hardware events during group validation, and skipping software events. A trimmed down reproducer is as simple as this: #include #include #include #include #include #include int main (int argc, char *argv[]) { struct perf_event_attr attr = { 0 }; int fds[5]; attr.disabled = 1; attr.exclude_kernel = 1; attr.exclude_hv = 1; attr.read_format = PERF_FORMAT_TOTAL_TIME_ENABLED | PERF_FORMAT_TOTAL_TIME_RUNNING | PERF_FORMAT_ID | PERF_FORMAT_GROUP; attr.size = sizeof (attr); attr.type = PERF_TYPE_SOFTWARE; attr.config = PERF_COUNT_SW_DUMMY; fds[0] = syscall (SYS_perf_event_open, &attr, 0, -1, -1, 0); assert (fds[0] >= 0); attr.type = PERF_TYPE_HARDWARE; attr.config = PERF_COUNT_HW_CPU_CYCLES; fds[1] = syscall (SYS_perf_event_open, &attr, 0, -1, fds[0], 0); assert (fds[1] >= 0); attr.type = PERF_TYPE_HARDWARE; attr.config = PERF_COUNT_HW_INSTRUCTIONS; fds[2] = syscall (SYS_perf_event_open, &attr, 0, -1, fds[0], 0); assert (fds[2] >= 0); attr.type = PERF_TYPE_HARDWARE; attr.config = PERF_COUNT_HW_BRANCH_MISSES; fds[3] = syscall (SYS_perf_event_open, &attr, 0, -1, fds[0], 0); assert (fds[3] >= 0); attr.type = PERF_TYPE_HARDWARE; attr.config = PERF_COUNT_HW_CACHE_REFERENCES; fds[4] = syscall (SYS_perf_event_open, &attr, 0, -1, fds[0], 0); assert (fds[4] >= 0); printf ("PASSED\n"); return 0; } Cc: stable@vger.kernel.org Fixes: b37042b2bb7c ("LoongArch: Add perf events support") Signed-off-by: Lisa Robinson Signed-off-by: Huacai Chen --- arch/loongarch/kernel/perf_event.c | 21 ++++++++++++++++++--- 1 file changed, 18 insertions(+), 3 deletions(-) diff --git a/arch/loongarch/kernel/perf_event.c b/arch/loongarch/kernel/perf_event.c index 9d257c8519c90..e34a6fb33e11c 100644 --- a/arch/loongarch/kernel/perf_event.c +++ b/arch/loongarch/kernel/perf_event.c @@ -626,6 +626,18 @@ static const struct loongarch_perf_event *loongarch_pmu_map_cache_event(u64 conf return pev; } +static inline bool loongarch_pmu_event_requires_counter(const struct perf_event *event) +{ + switch (event->attr.type) { + case PERF_TYPE_HARDWARE: + case PERF_TYPE_HW_CACHE: + case PERF_TYPE_RAW: + return true; + default: + return false; + } +} + static int validate_group(struct perf_event *event) { struct cpu_hw_events fake_cpuc; @@ -633,15 +645,18 @@ static int validate_group(struct perf_event *event) memset(&fake_cpuc, 0, sizeof(fake_cpuc)); - if (loongarch_pmu_alloc_counter(&fake_cpuc, &leader->hw) < 0) + if (loongarch_pmu_event_requires_counter(leader) && + loongarch_pmu_alloc_counter(&fake_cpuc, &leader->hw) < 0) return -EINVAL; for_each_sibling_event(sibling, leader) { - if (loongarch_pmu_alloc_counter(&fake_cpuc, &sibling->hw) < 0) + if (loongarch_pmu_event_requires_counter(sibling) && + loongarch_pmu_alloc_counter(&fake_cpuc, &sibling->hw) < 0) return -EINVAL; } - if (loongarch_pmu_alloc_counter(&fake_cpuc, &event->hw) < 0) + if (loongarch_pmu_event_requires_counter(event) && + loongarch_pmu_alloc_counter(&fake_cpuc, &event->hw) < 0) return -EINVAL; return 0; From 74dc312e74a04b851190f212633af9d55239d717 Mon Sep 17 00:00:00 2001 From: Yao Zi Date: Sat, 17 Jan 2026 10:56:52 +0800 Subject: [PATCH 1173/1321] LoongArch: dts: Describe PCI sideband IRQ through interrupt-extended SoC integrated peripherals on LS2K1000 and LS2K2000 could be discovered as PCI devices, but require sideband interrupts to function, which are previously described by interrupts and interrupt-parent properties. However, pci/pci-device.yaml allows interrupts property to only specify PCI INTx interrupts, not sideband ones. Convert these devices to use interrupt-extended property, which describes sideband interrupts used by PCI devices since dt-schema commit e6ea659d2baa ("schemas: pci-device: Allow interrupts-extended for sideband interrupts"), eliminating dtbs_check warnings. Cc: stable@vger.kernel.org Fixes: 30a5532a3206 ("LoongArch: dts: DeviceTree for Loongson-2K1000") Signed-off-by: Yao Zi Signed-off-by: Binbin Zhou Signed-off-by: Huacai Chen --- arch/loongarch/boot/dts/loongson-2k1000.dtsi | 25 ++++++--------- arch/loongarch/boot/dts/loongson-2k2000.dtsi | 32 ++++++++------------ 2 files changed, 21 insertions(+), 36 deletions(-) diff --git a/arch/loongarch/boot/dts/loongson-2k1000.dtsi b/arch/loongarch/boot/dts/loongson-2k1000.dtsi index 60ab425f793fe..eee06b84951c5 100644 --- a/arch/loongarch/boot/dts/loongson-2k1000.dtsi +++ b/arch/loongarch/boot/dts/loongson-2k1000.dtsi @@ -437,54 +437,47 @@ gmac0: ethernet@3,0 { reg = <0x1800 0x0 0x0 0x0 0x0>; - interrupt-parent = <&liointc0>; - interrupts = <12 IRQ_TYPE_LEVEL_HIGH>, - <13 IRQ_TYPE_LEVEL_HIGH>; + interrupts-extended = <&liointc0 12 IRQ_TYPE_LEVEL_HIGH>, + <&liointc0 13 IRQ_TYPE_LEVEL_HIGH>; interrupt-names = "macirq", "eth_lpi"; status = "disabled"; }; gmac1: ethernet@3,1 { reg = <0x1900 0x0 0x0 0x0 0x0>; - interrupt-parent = <&liointc0>; - interrupts = <14 IRQ_TYPE_LEVEL_HIGH>, - <15 IRQ_TYPE_LEVEL_HIGH>; + interrupts-extended = <&liointc0 14 IRQ_TYPE_LEVEL_HIGH>, + <&liointc0 15 IRQ_TYPE_LEVEL_HIGH>; interrupt-names = "macirq", "eth_lpi"; status = "disabled"; }; ehci0: usb@4,1 { reg = <0x2100 0x0 0x0 0x0 0x0>; - interrupt-parent = <&liointc1>; - interrupts = <18 IRQ_TYPE_LEVEL_HIGH>; + interrupts-extended = <&liointc1 18 IRQ_TYPE_LEVEL_HIGH>; status = "disabled"; }; ohci0: usb@4,2 { reg = <0x2200 0x0 0x0 0x0 0x0>; - interrupt-parent = <&liointc1>; - interrupts = <19 IRQ_TYPE_LEVEL_HIGH>; + interrupts-extended = <&liointc1 19 IRQ_TYPE_LEVEL_HIGH>; status = "disabled"; }; display@6,0 { reg = <0x3000 0x0 0x0 0x0 0x0>; - interrupt-parent = <&liointc0>; - interrupts = <28 IRQ_TYPE_LEVEL_HIGH>; + interrupts-extended = <&liointc0 28 IRQ_TYPE_LEVEL_HIGH>; status = "disabled"; }; hda@7,0 { reg = <0x3800 0x0 0x0 0x0 0x0>; - interrupt-parent = <&liointc0>; - interrupts = <4 IRQ_TYPE_LEVEL_HIGH>; + interrupts-extended = <&liointc0 4 IRQ_TYPE_LEVEL_HIGH>; status = "disabled"; }; sata: sata@8,0 { reg = <0x4000 0x0 0x0 0x0 0x0>; - interrupt-parent = <&liointc0>; - interrupts = <19 IRQ_TYPE_LEVEL_HIGH>; + interrupts-extended = <&liointc0 19 IRQ_TYPE_LEVEL_HIGH>; status = "disabled"; }; diff --git a/arch/loongarch/boot/dts/loongson-2k2000.dtsi b/arch/loongarch/boot/dts/loongson-2k2000.dtsi index 6c77b86ee06c8..87c45f1f7cc7a 100644 --- a/arch/loongarch/boot/dts/loongson-2k2000.dtsi +++ b/arch/loongarch/boot/dts/loongson-2k2000.dtsi @@ -291,65 +291,57 @@ gmac0: ethernet@3,0 { reg = <0x1800 0x0 0x0 0x0 0x0>; - interrupts = <12 IRQ_TYPE_LEVEL_HIGH>, - <13 IRQ_TYPE_LEVEL_HIGH>; + interrupts-extended = <&pic 12 IRQ_TYPE_LEVEL_HIGH>, + <&pic 13 IRQ_TYPE_LEVEL_HIGH>; interrupt-names = "macirq", "eth_lpi"; - interrupt-parent = <&pic>; status = "disabled"; }; gmac1: ethernet@3,1 { reg = <0x1900 0x0 0x0 0x0 0x0>; - interrupts = <14 IRQ_TYPE_LEVEL_HIGH>, - <15 IRQ_TYPE_LEVEL_HIGH>; + interrupts-extended = <&pic 14 IRQ_TYPE_LEVEL_HIGH>, + <&pic 15 IRQ_TYPE_LEVEL_HIGH>; interrupt-names = "macirq", "eth_lpi"; - interrupt-parent = <&pic>; status = "disabled"; }; gmac2: ethernet@3,2 { reg = <0x1a00 0x0 0x0 0x0 0x0>; - interrupts = <17 IRQ_TYPE_LEVEL_HIGH>, - <18 IRQ_TYPE_LEVEL_HIGH>; + interrupts-extended = <&pic 17 IRQ_TYPE_LEVEL_HIGH>, + <&pic 18 IRQ_TYPE_LEVEL_HIGH>; interrupt-names = "macirq", "eth_lpi"; - interrupt-parent = <&pic>; status = "disabled"; }; xhci0: usb@4,0 { reg = <0x2000 0x0 0x0 0x0 0x0>; - interrupts = <48 IRQ_TYPE_LEVEL_HIGH>; - interrupt-parent = <&pic>; + interrupts-extended = <&pic 48 IRQ_TYPE_LEVEL_HIGH>; status = "disabled"; }; xhci1: usb@19,0 { reg = <0xc800 0x0 0x0 0x0 0x0>; - interrupts = <22 IRQ_TYPE_LEVEL_HIGH>; - interrupt-parent = <&pic>; + interrupts-extended = <&pic 22 IRQ_TYPE_LEVEL_HIGH>; status = "disabled"; }; display@6,1 { reg = <0x3100 0x0 0x0 0x0 0x0>; - interrupts = <28 IRQ_TYPE_LEVEL_HIGH>; - interrupt-parent = <&pic>; + interrupts-extended = <&pic 28 IRQ_TYPE_LEVEL_HIGH>; status = "disabled"; }; i2s@7,0 { reg = <0x3800 0x0 0x0 0x0 0x0>; - interrupts = <78 IRQ_TYPE_LEVEL_HIGH>, - <79 IRQ_TYPE_LEVEL_HIGH>; + interrupts-extended = <&pic 78 IRQ_TYPE_LEVEL_HIGH>, + <&pic 79 IRQ_TYPE_LEVEL_HIGH>; interrupt-names = "tx", "rx"; - interrupt-parent = <&pic>; status = "disabled"; }; sata: sata@8,0 { reg = <0x4000 0x0 0x0 0x0 0x0>; - interrupts = <16 IRQ_TYPE_LEVEL_HIGH>; - interrupt-parent = <&pic>; + interrupts-extended = <&pic 16 IRQ_TYPE_LEVEL_HIGH>; status = "disabled"; }; From a7580a0be79900fd5b26d7fcca0d00c4bb530f16 Mon Sep 17 00:00:00 2001 From: Binbin Zhou Date: Sat, 17 Jan 2026 10:56:52 +0800 Subject: [PATCH 1174/1321] LoongArch: dts: loongson-2k0500: Add default interrupt controller address cells Add missing address-cells 0 to the Local I/O and Extend I/O interrupt controller node to silence W=1 warning: loongson-2k0500.dtsi:513.5-51: Warning (interrupt_map): /bus@10000000/pcie@1a000000/pcie@0,0:interrupt-map: Missing property '#address-cells' in node /bus@10000000/interrupt-controller@1fe11600, using 0 as fallback Value '0' is correct because: 1. The Local I/O & Extend I/O interrupt controller do not have children, 2. interrupt-map property (in PCI node) consists of five components and the fourth component "parent unit address", which size is defined by '#address-cells' of the node pointed to by the interrupt-parent component, is not used (=0) Cc: stable@vger.kernel.org Signed-off-by: Binbin Zhou Signed-off-by: Huacai Chen --- arch/loongarch/boot/dts/loongson-2k0500.dtsi | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/loongarch/boot/dts/loongson-2k0500.dtsi b/arch/loongarch/boot/dts/loongson-2k0500.dtsi index 357de4ca75550..e759fae77dcf4 100644 --- a/arch/loongarch/boot/dts/loongson-2k0500.dtsi +++ b/arch/loongarch/boot/dts/loongson-2k0500.dtsi @@ -131,6 +131,7 @@ reg-names = "main", "isr0"; interrupt-controller; + #address-cells = <0>; #interrupt-cells = <2>; interrupt-parent = <&cpuintc>; interrupts = <2>; @@ -149,6 +150,7 @@ reg-names = "main", "isr0"; interrupt-controller; + #address-cells = <0>; #interrupt-cells = <2>; interrupt-parent = <&cpuintc>; interrupts = <4>; @@ -164,6 +166,7 @@ compatible = "loongson,ls2k0500-eiointc"; reg = <0x0 0x1fe11600 0x0 0xea00>; interrupt-controller; + #address-cells = <0>; #interrupt-cells = <1>; interrupt-parent = <&cpuintc>; interrupts = <3>; From b95c9c13c62eb63aac81678aa5715f4f6a80dcc0 Mon Sep 17 00:00:00 2001 From: Binbin Zhou Date: Sat, 17 Jan 2026 10:56:53 +0800 Subject: [PATCH 1175/1321] LoongArch: dts: loongson-2k1000: Add default interrupt controller address cells Add missing address-cells 0 to the Local I/O interrupt controller node to silence W=1 warning: loongson-2k1000.dtsi:498.5-55: Warning (interrupt_map): /bus@10000000/pcie@1a000000/pcie@9,0:interrupt-map: Missing property '#address-cells' in node /bus@10000000/interrupt-controller@1fe01440, using 0 as fallback Value '0' is correct because: 1. The Local I/O interrupt controller does not have children, 2. interrupt-map property (in PCI node) consists of five components and the fourth component "parent unit address", which size is defined by '#address-cells' of the node pointed to by the interrupt-parent component, is not used (=0) Cc: stable@vger.kernel.org Signed-off-by: Binbin Zhou Signed-off-by: Huacai Chen --- arch/loongarch/boot/dts/loongson-2k1000.dtsi | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/loongarch/boot/dts/loongson-2k1000.dtsi b/arch/loongarch/boot/dts/loongson-2k1000.dtsi index eee06b84951c5..440a8f3c01f48 100644 --- a/arch/loongarch/boot/dts/loongson-2k1000.dtsi +++ b/arch/loongarch/boot/dts/loongson-2k1000.dtsi @@ -114,6 +114,7 @@ <0x0 0x1fe01140 0x0 0x8>; reg-names = "main", "isr0", "isr1"; interrupt-controller; + #address-cells = <0>; #interrupt-cells = <2>; interrupt-parent = <&cpuintc>; interrupts = <2>; @@ -131,6 +132,7 @@ <0x0 0x1fe01148 0x0 0x8>; reg-names = "main", "isr0", "isr1"; interrupt-controller; + #address-cells = <0>; #interrupt-cells = <2>; interrupt-parent = <&cpuintc>; interrupts = <3>; From 61718d44e2c3a2d57f413ff6b834eb8cb83cdd3c Mon Sep 17 00:00:00 2001 From: Binbin Zhou Date: Sat, 17 Jan 2026 10:56:53 +0800 Subject: [PATCH 1176/1321] LoongArch: dts: loongson-2k2000: Add default interrupt controller address cells Add missing address-cells 0 to the Local I/O, Extend I/O and PCH-PIC Interrupt Controller node to silence W=1 warning: loongson-2k2000.dtsi:364.5-49: Warning (interrupt_map): /bus@10000000/pcie@1a000000/pcie@9,0:interrupt-map: Missing property '#address-cells' in node /bus@10000000/interrupt-controller@10000000, using 0 as fallback Value '0' is correct because: 1. The LIO/EIO/PCH interrupt controller does not have children, 2. interrupt-map property (in PCI node) consists of five components and the fourth component "parent unit address", which size is defined by '#address-cells' of the node pointed to by the interrupt-parent component, is not used (=0) Cc: stable@vger.kernel.org Signed-off-by: Binbin Zhou Signed-off-by: Huacai Chen --- arch/loongarch/boot/dts/loongson-2k2000.dtsi | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/loongarch/boot/dts/loongson-2k2000.dtsi b/arch/loongarch/boot/dts/loongson-2k2000.dtsi index 87c45f1f7cc7a..3678c084adf71 100644 --- a/arch/loongarch/boot/dts/loongson-2k2000.dtsi +++ b/arch/loongarch/boot/dts/loongson-2k2000.dtsi @@ -126,6 +126,7 @@ reg = <0x0 0x1fe01400 0x0 0x64>; interrupt-controller; + #address-cells = <0>; #interrupt-cells = <2>; interrupt-parent = <&cpuintc>; interrupts = <2>; @@ -140,6 +141,7 @@ compatible = "loongson,ls2k2000-eiointc"; reg = <0x0 0x1fe01600 0x0 0xea00>; interrupt-controller; + #address-cells = <0>; #interrupt-cells = <1>; interrupt-parent = <&cpuintc>; interrupts = <3>; @@ -149,6 +151,7 @@ compatible = "loongson,pch-pic-1.0"; reg = <0x0 0x10000000 0x0 0x400>; interrupt-controller; + #address-cells = <0>; #interrupt-cells = <2>; loongson,pic-base-vec = <0>; interrupt-parent = <&eiointc>; From 6a804f91231f0244295d833674d789e5161b4423 Mon Sep 17 00:00:00 2001 From: Binbin Zhou Date: Sat, 17 Jan 2026 10:56:53 +0800 Subject: [PATCH 1177/1321] LoongArch: dts: loongson-2k1000: Fix i2c-gpio node names The binding wants the node to be named "i2c-number", but those are named "i2c-gpio-number" instead. Thus rename those to i2c-0, i2c-1 to adhere to the binding and suppress dtbs_check warnings. Cc: stable@vger.kernel.org Reviewed-by: Krzysztof Kozlowski Signed-off-by: Binbin Zhou Signed-off-by: Huacai Chen --- arch/loongarch/boot/dts/loongson-2k1000.dtsi | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/loongarch/boot/dts/loongson-2k1000.dtsi b/arch/loongarch/boot/dts/loongson-2k1000.dtsi index 440a8f3c01f48..be4f7d119660e 100644 --- a/arch/loongarch/boot/dts/loongson-2k1000.dtsi +++ b/arch/loongarch/boot/dts/loongson-2k1000.dtsi @@ -46,7 +46,7 @@ }; /* i2c of the dvi eeprom edid */ - i2c-gpio-0 { + i2c-0 { compatible = "i2c-gpio"; scl-gpios = <&gpio0 0 (GPIO_ACTIVE_HIGH | GPIO_OPEN_DRAIN)>; sda-gpios = <&gpio0 1 (GPIO_ACTIVE_HIGH | GPIO_OPEN_DRAIN)>; @@ -57,7 +57,7 @@ }; /* i2c of the eeprom edid */ - i2c-gpio-1 { + i2c-1 { compatible = "i2c-gpio"; scl-gpios = <&gpio0 33 (GPIO_ACTIVE_HIGH | GPIO_OPEN_DRAIN)>; sda-gpios = <&gpio0 32 (GPIO_ACTIVE_HIGH | GPIO_OPEN_DRAIN)>; From d61ac29388b9ad044402062e742e38e7ae1c0c89 Mon Sep 17 00:00:00 2001 From: Qiang Ma Date: Sat, 17 Jan 2026 10:57:02 +0800 Subject: [PATCH 1178/1321] LoongArch: KVM: Fix kvm_device leak in kvm_ipi_destroy() In kvm_ioctl_create_device(), kvm_device has allocated memory, kvm_device->destroy() seems to be supposed to free its kvm_device struct, but kvm_ipi_destroy() is not currently doing this, that would lead to a memory leak. So, fix it. Cc: stable@vger.kernel.org Reviewed-by: Bibo Mao Signed-off-by: Qiang Ma Signed-off-by: Huacai Chen --- arch/loongarch/kvm/intc/ipi.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/loongarch/kvm/intc/ipi.c b/arch/loongarch/kvm/intc/ipi.c index 05cefd29282e8..1058c13dba7f4 100644 --- a/arch/loongarch/kvm/intc/ipi.c +++ b/arch/loongarch/kvm/intc/ipi.c @@ -459,6 +459,7 @@ static void kvm_ipi_destroy(struct kvm_device *dev) ipi = kvm->arch.ipi; kvm_io_bus_unregister_dev(kvm, KVM_IOCSR_BUS, &ipi->device); kfree(ipi); + kfree(dev); } static struct kvm_device_ops kvm_ipi_dev_ops = { From 6c116868386de66821583b7ddebe645d8c66691e Mon Sep 17 00:00:00 2001 From: Qiang Ma Date: Sat, 17 Jan 2026 10:57:02 +0800 Subject: [PATCH 1179/1321] LoongArch: KVM: Fix kvm_device leak in kvm_eiointc_destroy() In kvm_ioctl_create_device(), kvm_device has allocated memory, kvm_device->destroy() seems to be supposed to free its kvm_device struct, but kvm_eiointc_destroy() is not currently doing this, that would lead to a memory leak. So, fix it. Cc: stable@vger.kernel.org Reviewed-by: Bibo Mao Signed-off-by: Qiang Ma Signed-off-by: Huacai Chen --- arch/loongarch/kvm/intc/eiointc.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/loongarch/kvm/intc/eiointc.c b/arch/loongarch/kvm/intc/eiointc.c index 29886876143f1..dfaf6ccfdd8b3 100644 --- a/arch/loongarch/kvm/intc/eiointc.c +++ b/arch/loongarch/kvm/intc/eiointc.c @@ -679,6 +679,7 @@ static void kvm_eiointc_destroy(struct kvm_device *dev) kvm_io_bus_unregister_dev(kvm, KVM_IOCSR_BUS, &eiointc->device); kvm_io_bus_unregister_dev(kvm, KVM_IOCSR_BUS, &eiointc->device_vext); kfree(eiointc); + kfree(dev); } static struct kvm_device_ops kvm_eiointc_dev_ops = { From 0fc1868b10f70ef32f0a855f90e21f0d58ff3148 Mon Sep 17 00:00:00 2001 From: Qiang Ma Date: Sat, 17 Jan 2026 10:57:03 +0800 Subject: [PATCH 1180/1321] LoongArch: KVM: Fix kvm_device leak in kvm_pch_pic_destroy() In kvm_ioctl_create_device(), kvm_device has allocated memory, kvm_device->destroy() seems to be supposed to free its kvm_device struct, but kvm_pch_pic_destroy() is not currently doing this, that would lead to a memory leak. So, fix it. Cc: stable@vger.kernel.org Reviewed-by: Bibo Mao Signed-off-by: Qiang Ma Signed-off-by: Huacai Chen --- arch/loongarch/kvm/intc/pch_pic.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/loongarch/kvm/intc/pch_pic.c b/arch/loongarch/kvm/intc/pch_pic.c index a698a73de399b..4addb34bf432b 100644 --- a/arch/loongarch/kvm/intc/pch_pic.c +++ b/arch/loongarch/kvm/intc/pch_pic.c @@ -475,6 +475,7 @@ static void kvm_pch_pic_destroy(struct kvm_device *dev) /* unregister pch pic device and free it's memory */ kvm_io_bus_unregister_dev(kvm, KVM_MMIO_BUS, &s->device); kfree(s); + kfree(dev); } static struct kvm_device_ops kvm_pch_pic_dev_ops = { From bd1681b6f8839d64113b33ef39ada56ce2cc93b8 Mon Sep 17 00:00:00 2001 From: Filipe Manana Date: Fri, 19 Dec 2025 11:26:02 +0000 Subject: [PATCH 1181/1321] btrfs: release path before iget_failed() in btrfs_read_locked_inode() In btrfs_read_locked_inode() if we fail to lookup the inode, we jump to the 'out' label with a path that has a read locked leaf and then we call iget_failed(). This can result in a ABBA deadlock, since iget_failed() triggers inode eviction and that causes the release of the delayed inode, which must lock the delayed inode's mutex, and a task updating a delayed inode starts by taking the node's mutex and then modifying the inode's subvolume btree. Syzbot reported the following lockdep splat for this: ====================================================== WARNING: possible circular locking dependency detected syzkaller #0 Not tainted ------------------------------------------------------ btrfs-cleaner/8725 is trying to acquire lock: ffff0000d6826a48 (&delayed_node->mutex){+.+.}-{4:4}, at: __btrfs_release_delayed_node+0xa0/0x9b0 fs/btrfs/delayed-inode.c:290 but task is already holding lock: ffff0000dbeba878 (btrfs-tree-00){++++}-{4:4}, at: btrfs_tree_read_lock_nested+0x44/0x2ec fs/btrfs/locking.c:145 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (btrfs-tree-00){++++}-{4:4}: __lock_release kernel/locking/lockdep.c:5574 [inline] lock_release+0x198/0x39c kernel/locking/lockdep.c:5889 up_read+0x24/0x3c kernel/locking/rwsem.c:1632 btrfs_tree_read_unlock+0xdc/0x298 fs/btrfs/locking.c:169 btrfs_tree_unlock_rw fs/btrfs/locking.h:218 [inline] btrfs_search_slot+0xa6c/0x223c fs/btrfs/ctree.c:2133 btrfs_lookup_inode+0xd8/0x38c fs/btrfs/inode-item.c:395 __btrfs_update_delayed_inode+0x124/0xed0 fs/btrfs/delayed-inode.c:1032 btrfs_update_delayed_inode fs/btrfs/delayed-inode.c:1118 [inline] __btrfs_commit_inode_delayed_items+0x15f8/0x1748 fs/btrfs/delayed-inode.c:1141 __btrfs_run_delayed_items+0x1ac/0x514 fs/btrfs/delayed-inode.c:1176 btrfs_run_delayed_items_nr+0x28/0x38 fs/btrfs/delayed-inode.c:1219 flush_space+0x26c/0xb68 fs/btrfs/space-info.c:828 do_async_reclaim_metadata_space+0x110/0x364 fs/btrfs/space-info.c:1158 btrfs_async_reclaim_metadata_space+0x90/0xd8 fs/btrfs/space-info.c:1226 process_one_work+0x7e8/0x155c kernel/workqueue.c:3263 process_scheduled_works kernel/workqueue.c:3346 [inline] worker_thread+0x958/0xed8 kernel/workqueue.c:3427 kthread+0x5fc/0x75c kernel/kthread.c:463 ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:844 -> #0 (&delayed_node->mutex){+.+.}-{4:4}: check_prev_add kernel/locking/lockdep.c:3165 [inline] check_prevs_add kernel/locking/lockdep.c:3284 [inline] validate_chain kernel/locking/lockdep.c:3908 [inline] __lock_acquire+0x1774/0x30a4 kernel/locking/lockdep.c:5237 lock_acquire+0x14c/0x2e0 kernel/locking/lockdep.c:5868 __mutex_lock_common+0x1d0/0x2678 kernel/locking/mutex.c:598 __mutex_lock kernel/locking/mutex.c:760 [inline] mutex_lock_nested+0x2c/0x38 kernel/locking/mutex.c:812 __btrfs_release_delayed_node+0xa0/0x9b0 fs/btrfs/delayed-inode.c:290 btrfs_release_delayed_node fs/btrfs/delayed-inode.c:315 [inline] btrfs_remove_delayed_node+0x68/0x84 fs/btrfs/delayed-inode.c:1326 btrfs_evict_inode+0x578/0xe28 fs/btrfs/inode.c:5587 evict+0x414/0x928 fs/inode.c:810 iput_final fs/inode.c:1914 [inline] iput+0x95c/0xad4 fs/inode.c:1966 iget_failed+0xec/0x134 fs/bad_inode.c:248 btrfs_read_locked_inode+0xe1c/0x1234 fs/btrfs/inode.c:4101 btrfs_iget+0x1b0/0x264 fs/btrfs/inode.c:5837 btrfs_run_defrag_inode fs/btrfs/defrag.c:237 [inline] btrfs_run_defrag_inodes+0x520/0xdc4 fs/btrfs/defrag.c:309 cleaner_kthread+0x21c/0x418 fs/btrfs/disk-io.c:1516 kthread+0x5fc/0x75c kernel/kthread.c:463 ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:844 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- rlock(btrfs-tree-00); lock(&delayed_node->mutex); lock(btrfs-tree-00); lock(&delayed_node->mutex); *** DEADLOCK *** 1 lock held by btrfs-cleaner/8725: #0: ffff0000dbeba878 (btrfs-tree-00){++++}-{4:4}, at: btrfs_tree_read_lock_nested+0x44/0x2ec fs/btrfs/locking.c:145 stack backtrace: CPU: 0 UID: 0 PID: 8725 Comm: btrfs-cleaner Not tainted syzkaller #0 PREEMPT Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/03/2025 Call trace: show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:499 (C) __dump_stack+0x30/0x40 lib/dump_stack.c:94 dump_stack_lvl+0xd8/0x12c lib/dump_stack.c:120 dump_stack+0x1c/0x28 lib/dump_stack.c:129 print_circular_bug+0x324/0x32c kernel/locking/lockdep.c:2043 check_noncircular+0x154/0x174 kernel/locking/lockdep.c:2175 check_prev_add kernel/locking/lockdep.c:3165 [inline] check_prevs_add kernel/locking/lockdep.c:3284 [inline] validate_chain kernel/locking/lockdep.c:3908 [inline] __lock_acquire+0x1774/0x30a4 kernel/locking/lockdep.c:5237 lock_acquire+0x14c/0x2e0 kernel/locking/lockdep.c:5868 __mutex_lock_common+0x1d0/0x2678 kernel/locking/mutex.c:598 __mutex_lock kernel/locking/mutex.c:760 [inline] mutex_lock_nested+0x2c/0x38 kernel/locking/mutex.c:812 __btrfs_release_delayed_node+0xa0/0x9b0 fs/btrfs/delayed-inode.c:290 btrfs_release_delayed_node fs/btrfs/delayed-inode.c:315 [inline] btrfs_remove_delayed_node+0x68/0x84 fs/btrfs/delayed-inode.c:1326 btrfs_evict_inode+0x578/0xe28 fs/btrfs/inode.c:5587 evict+0x414/0x928 fs/inode.c:810 iput_final fs/inode.c:1914 [inline] iput+0x95c/0xad4 fs/inode.c:1966 iget_failed+0xec/0x134 fs/bad_inode.c:248 btrfs_read_locked_inode+0xe1c/0x1234 fs/btrfs/inode.c:4101 btrfs_iget+0x1b0/0x264 fs/btrfs/inode.c:5837 btrfs_run_defrag_inode fs/btrfs/defrag.c:237 [inline] btrfs_run_defrag_inodes+0x520/0xdc4 fs/btrfs/defrag.c:309 cleaner_kthread+0x21c/0x418 fs/btrfs/disk-io.c:1516 kthread+0x5fc/0x75c kernel/kthread.c:463 ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:844 Fix this by releasing the path before calling iget_failed(). Reported-by: syzbot+c1c6edb02bea1da754d8@syzkaller.appspotmail.com Link: https://lore.kernel.org/linux-btrfs/694530c2.a70a0220.207337.010d.GAE@google.com/ Fixes: 69673992b1ae ("btrfs: push cleanup into btrfs_read_locked_inode()") Reviewed-by: Boris Burkov Signed-off-by: Filipe Manana Signed-off-by: David Sterba --- fs/btrfs/inode.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index d2b302ac6af96..a2b5b440637e6 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -4180,6 +4180,15 @@ static int btrfs_read_locked_inode(struct btrfs_inode *inode, struct btrfs_path return 0; out: + /* + * We may have a read locked leaf and iget_failed() triggers inode + * eviction which needs to release the delayed inode and that needs + * to lock the delayed inode's mutex. This can cause a ABBA deadlock + * with a task running delayed items, as that require first locking + * the delayed inode's mutex and then modifying its subvolume btree. + * So release the path before iget_failed(). + */ + btrfs_release_path(path); iget_failed(vfs_inode); return ret; } From 16464310e34c5ee70995d01d15690661ad026f81 Mon Sep 17 00:00:00 2001 From: Zilin Guan Date: Fri, 26 Dec 2025 11:30:22 +0000 Subject: [PATCH 1182/1321] btrfs: tests: fix root tree leak in btrfs_test_qgroups() If btrfs_insert_fs_root() fails, the tmp_root allocated by btrfs_alloc_dummy_root() is leaked because its initial reference count is not decremented. Fix this by calling btrfs_put_root() unconditionally after btrfs_insert_fs_root(). This ensures the local reference is always dropped. Also fix a copy-paste error in the error message where the subvolume root insertion failure was incorrectly logged as "fs root". Co-developed-by: Jianhao Xu Signed-off-by: Jianhao Xu Signed-off-by: Zilin Guan Reviewed-by: David Sterba Signed-off-by: David Sterba --- fs/btrfs/tests/qgroup-tests.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/fs/btrfs/tests/qgroup-tests.c b/fs/btrfs/tests/qgroup-tests.c index e9124605974bf..0fcc31beeffe9 100644 --- a/fs/btrfs/tests/qgroup-tests.c +++ b/fs/btrfs/tests/qgroup-tests.c @@ -517,11 +517,11 @@ int btrfs_test_qgroups(u32 sectorsize, u32 nodesize) tmp_root->root_key.objectid = BTRFS_FS_TREE_OBJECTID; root->fs_info->fs_root = tmp_root; ret = btrfs_insert_fs_root(root->fs_info, tmp_root); + btrfs_put_root(tmp_root); if (ret) { test_err("couldn't insert fs root %d", ret); goto out; } - btrfs_put_root(tmp_root); tmp_root = btrfs_alloc_dummy_root(fs_info); if (IS_ERR(tmp_root)) { @@ -532,11 +532,11 @@ int btrfs_test_qgroups(u32 sectorsize, u32 nodesize) tmp_root->root_key.objectid = BTRFS_FIRST_FREE_OBJECTID; ret = btrfs_insert_fs_root(root->fs_info, tmp_root); + btrfs_put_root(tmp_root); if (ret) { - test_err("couldn't insert fs root %d", ret); + test_err("couldn't insert subvolume root %d", ret); goto out; } - btrfs_put_root(tmp_root); test_msg("running qgroup tests"); ret = test_no_shared_qgroup(root, sectorsize, nodesize); From b51609cc51c545ed5d57aebec7f841fa0bde8425 Mon Sep 17 00:00:00 2001 From: Naohiro Aota Date: Mon, 5 Jan 2026 17:19:05 +0900 Subject: [PATCH 1183/1321] btrfs: tests: fix return 0 on rmap test failure In test_rmap_blocks(), we have ret = 0 before checking the results. We need to set it to -EINVAL, so that a mismatching result will return -EINVAL not 0. Reviewed-by: Qu Wenruo Signed-off-by: Naohiro Aota Reviewed-by: David Sterba Signed-off-by: David Sterba --- fs/btrfs/tests/extent-map-tests.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/fs/btrfs/tests/extent-map-tests.c b/fs/btrfs/tests/extent-map-tests.c index 0b9f25dd1a68a..aabf825e8d7bc 100644 --- a/fs/btrfs/tests/extent-map-tests.c +++ b/fs/btrfs/tests/extent-map-tests.c @@ -1059,6 +1059,7 @@ static int test_rmap_block(struct btrfs_fs_info *fs_info, if (out_stripe_len != BTRFS_STRIPE_LEN) { test_err("calculated stripe length doesn't match"); + ret = -EINVAL; goto out; } @@ -1066,12 +1067,14 @@ static int test_rmap_block(struct btrfs_fs_info *fs_info, for (i = 0; i < out_ndaddrs; i++) test_msg("mapped %llu", logical[i]); test_err("unexpected number of mapped addresses: %d", out_ndaddrs); + ret = -EINVAL; goto out; } for (i = 0; i < out_ndaddrs; i++) { if (logical[i] != test->mapped_logical[i]) { test_err("unexpected logical address mapped"); + ret = -EINVAL; goto out; } } From 31a0649142eab7a00a499ae8b1806f59ecafde06 Mon Sep 17 00:00:00 2001 From: Qu Wenruo Date: Tue, 6 Jan 2026 20:26:40 +1030 Subject: [PATCH 1184/1321] btrfs: send: check for inline extents in range_is_hole_in_parent() Before accessing the disk_bytenr field of a file extent item we need to check if we are dealing with an inline extent. This is because for inline extents their data starts at the offset of the disk_bytenr field. So accessing the disk_bytenr means we are accessing inline data or in case the inline data is less than 8 bytes we can actually cause an invalid memory access if this inline extent item is the first item in the leaf or access metadata from other items. Fixes: 82bfb2e7b645 ("Btrfs: incremental send, fix unnecessary hole writes for sparse files") Reviewed-by: Filipe Manana Signed-off-by: Qu Wenruo Reviewed-by: David Sterba Signed-off-by: David Sterba --- fs/btrfs/send.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index 2522faa974784..d8127a7120c2d 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -6383,6 +6383,8 @@ static int range_is_hole_in_parent(struct send_ctx *sctx, extent_end = btrfs_file_extent_end(path); if (extent_end <= start) goto next; + if (btrfs_file_extent_type(leaf, fi) == BTRFS_FILE_EXTENT_INLINE) + return 0; if (btrfs_file_extent_disk_bytenr(leaf, fi) == 0) { search_start = extent_end; goto next; From c61a18b2bed4af9c6f02d155c21aab45ee63127c Mon Sep 17 00:00:00 2001 From: Qu Wenruo Date: Fri, 9 Jan 2026 14:01:14 +1030 Subject: [PATCH 1185/1321] btrfs: update the Kconfig string for CONFIG_BTRFS_EXPERIMENTAL The following new features are missing: - Async checksum - Shutdown ioctl and auto-degradation - Larger block size support Which is dependent on larger folios. Signed-off-by: Qu Wenruo Reviewed-by: David Sterba Signed-off-by: David Sterba --- fs/btrfs/Kconfig | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/Kconfig b/fs/btrfs/Kconfig index 4438637c8900c..6d6fc85835d46 100644 --- a/fs/btrfs/Kconfig +++ b/fs/btrfs/Kconfig @@ -115,6 +115,10 @@ config BTRFS_EXPERIMENTAL - extent tree v2 - complex rework of extent tracking - - large folio support + - large folio and block size (> page size) support + + - shutdown ioctl and auto-degradation support + + - asynchronous checksum generation for data writes If unsure, say N. From eac775d93be882422e342160862a103854e2bad1 Mon Sep 17 00:00:00 2001 From: Filipe Manana Date: Thu, 8 Jan 2026 16:16:38 +0000 Subject: [PATCH 1186/1321] btrfs: invalidate pages instead of truncate after reflinking Qu reported that generic/164 often fails because the read operations get zeroes when it expects to either get all bytes with a value of 0x61 or 0x62. The issue stems from truncating the pages from the page cache instead of invalidating, as truncating can zero page contents. This zeroing is not just in case the range is not page sized (as it's commented in truncate_inode_pages_range()) but also in case we are using large folios, they need to be split and the splitting fails. Stealing Qu's comment in the thread linked below: "We can have the following case: 0 4K 8K 12K 16K | | | | | |<---- Extent A ----->|<----- Extent B ------>| The page size is still 4K, but the folio we got is 16K. Then if we remap the range for [8K, 16K), then truncate_inode_pages_range() will get the large folio 0 sized 16K, then call truncate_inode_partial_folio(). Which later calls folio_zero_range() for the [8K, 16K) range first, then tries to split the folio into smaller ones to properly drop them from the cache. But if splitting failed (e.g. racing with other operations holding the filemap lock), the partially zeroed large folio will be kept, resulting the range [8K, 16K) being zeroed meanwhile the folio is still a 16K sized large one." So instead of truncating, invalidate the page cache range with a call to filemap_invalidate_inode(), which besides not doing any zeroing also ensures that while it's invalidating folios, no new folios are added. This helps ensure that buffered reads that happen while a reflink operation is in progress always get either the whole old data (the one before the reflink) or the whole new data, which is what generic/164 expects. Link: https://lore.kernel.org/linux-btrfs/7fb9b44f-9680-4c22-a47f-6648cb109ddf@suse.com/ Reported-by: Qu Wenruo Reviewed-by: Qu Wenruo Reviewed-by: Boris Burkov Signed-off-by: Filipe Manana Signed-off-by: David Sterba --- fs/btrfs/reflink.c | 23 +++++++++++++---------- 1 file changed, 13 insertions(+), 10 deletions(-) diff --git a/fs/btrfs/reflink.c b/fs/btrfs/reflink.c index b5fe95baf92ed..58dc3e5057ce3 100644 --- a/fs/btrfs/reflink.c +++ b/fs/btrfs/reflink.c @@ -705,7 +705,6 @@ static noinline int btrfs_clone_files(struct file *file, struct file *file_src, struct inode *src = file_inode(file_src); struct btrfs_fs_info *fs_info = inode_to_fs_info(inode); int ret; - int wb_ret; u64 len = olen; u64 bs = fs_info->sectorsize; u64 end; @@ -750,25 +749,29 @@ static noinline int btrfs_clone_files(struct file *file, struct file *file_src, btrfs_lock_extent(&BTRFS_I(inode)->io_tree, destoff, end, &cached_state); ret = btrfs_clone(src, inode, off, olen, len, destoff, 0); btrfs_unlock_extent(&BTRFS_I(inode)->io_tree, destoff, end, &cached_state); + if (ret < 0) + return ret; /* * We may have copied an inline extent into a page of the destination - * range, so wait for writeback to complete before truncating pages + * range, so wait for writeback to complete before invalidating pages * from the page cache. This is a rare case. */ - wb_ret = btrfs_wait_ordered_range(BTRFS_I(inode), destoff, len); - ret = ret ? ret : wb_ret; + ret = btrfs_wait_ordered_range(BTRFS_I(inode), destoff, len); + if (ret < 0) + return ret; + /* - * Truncate page cache pages so that future reads will see the cloned - * data immediately and not the previous data. + * Invalidate page cache so that future reads will see the cloned data + * immediately and not the previous data. */ - truncate_inode_pages_range(&inode->i_data, - round_down(destoff, PAGE_SIZE), - round_up(destoff + len, PAGE_SIZE) - 1); + ret = filemap_invalidate_inode(inode, false, destoff, end); + if (ret < 0) + return ret; btrfs_btree_balance_dirty(fs_info); - return ret; + return 0; } static int btrfs_remap_file_range_prep(struct file *file_in, loff_t pos_in, From 0fe1fc23a3bd0b29f41b51c7c8510b2d4f392243 Mon Sep 17 00:00:00 2001 From: Jiasheng Jiang Date: Sun, 11 Jan 2026 19:20:37 +0000 Subject: [PATCH 1187/1321] btrfs: fix memory leaks in create_space_info() error paths In create_space_info(), the 'space_info' object is allocated at the beginning of the function. However, there are two error paths where the function returns an error code without freeing the allocated memory: 1. When create_space_info_sub_group() fails in zoned mode. 2. When btrfs_sysfs_add_space_info_type() fails. In both cases, 'space_info' has not yet been added to the fs_info->space_info list, resulting in a memory leak. Fix this by adding an error handling label to kfree(space_info) before returning. Fixes: 2be12ef79fe9 ("btrfs: Separate space_info create/update") Reviewed-by: Qu Wenruo Signed-off-by: Jiasheng Jiang Signed-off-by: David Sterba --- fs/btrfs/space-info.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c index 6babbe333741d..3f08e450f7961 100644 --- a/fs/btrfs/space-info.c +++ b/fs/btrfs/space-info.c @@ -306,18 +306,22 @@ static int create_space_info(struct btrfs_fs_info *info, u64 flags) 0); if (ret) - return ret; + goto out_free; } ret = btrfs_sysfs_add_space_info_type(space_info); if (ret) - return ret; + goto out_free; list_add(&space_info->list, &info->space_info); if (flags & BTRFS_BLOCK_GROUP_DATA) info->data_sinfo = space_info; return ret; + +out_free: + kfree(space_info); + return ret; } int btrfs_init_space_info(struct btrfs_fs_info *fs_info) From a619ab57d019a560298c0b9402dc71fc08f9d939 Mon Sep 17 00:00:00 2001 From: Johannes Thumshirn Date: Wed, 17 Dec 2025 14:41:36 +0100 Subject: [PATCH 1188/1321] btrfs: remove zoned statistics from sysfs Remove the newly introduced zoned statistics from sysfs, as sysfs can only show a single page this will truncate the output on a busy filesystem. Reviewed-by: Filipe Manana Signed-off-by: Johannes Thumshirn Reviewed-by: David Sterba Signed-off-by: David Sterba --- fs/btrfs/sysfs.c | 52 ------------------------------------------------ 1 file changed, 52 deletions(-) diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c index 1f64c132b3873..4b3c2acac51a6 100644 --- a/fs/btrfs/sysfs.c +++ b/fs/btrfs/sysfs.c @@ -26,7 +26,6 @@ #include "misc.h" #include "fs.h" #include "accessors.h" -#include "zoned.h" /* * Structure name Path @@ -1189,56 +1188,6 @@ static ssize_t btrfs_commit_stats_store(struct kobject *kobj, } BTRFS_ATTR_RW(, commit_stats, btrfs_commit_stats_show, btrfs_commit_stats_store); -static ssize_t btrfs_zoned_stats_show(struct kobject *kobj, - struct kobj_attribute *a, char *buf) -{ - struct btrfs_fs_info *fs_info = to_fs_info(kobj); - struct btrfs_block_group *bg; - size_t ret = 0; - - - if (!btrfs_is_zoned(fs_info)) - return ret; - - spin_lock(&fs_info->zone_active_bgs_lock); - ret += sysfs_emit_at(buf, ret, "active block-groups: %zu\n", - list_count_nodes(&fs_info->zone_active_bgs)); - spin_unlock(&fs_info->zone_active_bgs_lock); - - mutex_lock(&fs_info->reclaim_bgs_lock); - spin_lock(&fs_info->unused_bgs_lock); - ret += sysfs_emit_at(buf, ret, "\treclaimable: %zu\n", - list_count_nodes(&fs_info->reclaim_bgs)); - ret += sysfs_emit_at(buf, ret, "\tunused: %zu\n", - list_count_nodes(&fs_info->unused_bgs)); - spin_unlock(&fs_info->unused_bgs_lock); - mutex_unlock(&fs_info->reclaim_bgs_lock); - - ret += sysfs_emit_at(buf, ret, "\tneed reclaim: %s\n", - str_true_false(btrfs_zoned_should_reclaim(fs_info))); - - if (fs_info->data_reloc_bg) - ret += sysfs_emit_at(buf, ret, - "data relocation block-group: %llu\n", - fs_info->data_reloc_bg); - if (fs_info->treelog_bg) - ret += sysfs_emit_at(buf, ret, - "tree-log block-group: %llu\n", - fs_info->treelog_bg); - - spin_lock(&fs_info->zone_active_bgs_lock); - ret += sysfs_emit_at(buf, ret, "active zones:\n"); - list_for_each_entry(bg, &fs_info->zone_active_bgs, active_bg_list) { - ret += sysfs_emit_at(buf, ret, - "\tstart: %llu, wp: %llu used: %llu, reserved: %llu, unusable: %llu\n", - bg->start, bg->alloc_offset, bg->used, - bg->reserved, bg->zone_unusable); - } - spin_unlock(&fs_info->zone_active_bgs_lock); - return ret; -} -BTRFS_ATTR(, zoned_stats, btrfs_zoned_stats_show); - static ssize_t btrfs_clone_alignment_show(struct kobject *kobj, struct kobj_attribute *a, char *buf) { @@ -1651,7 +1600,6 @@ static const struct attribute *btrfs_attrs[] = { BTRFS_ATTR_PTR(, bg_reclaim_threshold), BTRFS_ATTR_PTR(, commit_stats), BTRFS_ATTR_PTR(, temp_fsid), - BTRFS_ATTR_PTR(, zoned_stats), #ifdef CONFIG_BTRFS_EXPERIMENTAL BTRFS_ATTR_PTR(, offload_csum), #endif From ca65d14073fed958bd68e06789693cecdac40e26 Mon Sep 17 00:00:00 2001 From: Luo Haiyang Date: Tue, 13 Jan 2026 11:19:30 +0800 Subject: [PATCH 1189/1321] irqchip/riscv-imsic: Revert "Remove redundant irq_data lookups" Commit c475c0b71314("irqchip/riscv-imsic: Remove redundant irq_data lookups") leads to a NULL pointer deference in imsic_msi_update_msg(): virtio_blk virtio1: 8/0/0 default/read/poll queues Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 Current kworker/u32:2 pgtable: 4K pagesize, 48-bit VAs, pgdp=0x0000000081c33000 [0000000000000000] pgd=0000000000000000, p4d=0000000000000000 CPU: 5 UID: 0 PID: 75 Comm: kworker/u32:2 Not tainted 6.19.0-rc4-next-20260109 #1 NONE epc : 0x0 ra : imsic_irq_set_affinity+0x110/0x130 The irq_data argument of imsic_irq_set_affinity() is associated with the imsic domain and not with the top-level MSI domain. As a consequence the code dereferences the wrong interrupt chip, which has the irq_write_msi_msg() callback not populated. Signed-off-by: Luo Haiyang Signed-off-by: Thomas Gleixner Link: https://patch.msgid.link/20260113111930821RrC26avITHWSFCN0bYbgI@zte.com.cn --- drivers/irqchip/irq-riscv-imsic-platform.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/irqchip/irq-riscv-imsic-platform.c b/drivers/irqchip/irq-riscv-imsic-platform.c index 7228a33f6c37f..643c8e4596117 100644 --- a/drivers/irqchip/irq-riscv-imsic-platform.c +++ b/drivers/irqchip/irq-riscv-imsic-platform.c @@ -158,11 +158,11 @@ static int imsic_irq_set_affinity(struct irq_data *d, const struct cpumask *mask tmp_vec.local_id = new_vec->local_id; /* Point device to the temporary vector */ - imsic_msi_update_msg(d, &tmp_vec); + imsic_msi_update_msg(irq_get_irq_data(d->irq), &tmp_vec); } /* Point device to the new vector */ - imsic_msi_update_msg(d, new_vec); + imsic_msi_update_msg(irq_get_irq_data(d->irq), new_vec); /* Update irq descriptors with the new vector */ d->chip_data = new_vec; From 7ec0d2f47818dde9364533503b83d79979678b35 Mon Sep 17 00:00:00 2001 From: Mikulas Patocka Date: Tue, 6 Jan 2026 12:13:15 +0100 Subject: [PATCH 1190/1321] objtool: fix compilation failure with the x32 toolchain When using the x32 toolchain, compilation fails because the printf specifier "%lx" (long), doesn't match the type of the "checksum" variable (long long). Fix this by changing the printf specifier to "%llx" and casting "checksum" to unsigned long long. Fixes: a3493b33384a ("objtool/klp: Add --debug-checksum= to show per-instruction checksums") Signed-off-by: Mikulas Patocka Signed-off-by: Peter Zijlstra (Intel) Link: https://patch.msgid.link/a1158c99-fe0e-a218-4b5b-ffac212489f6@redhat.com --- tools/objtool/include/objtool/warn.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/objtool/include/objtool/warn.h b/tools/objtool/include/objtool/warn.h index 25ff7942b4d54..2b27b54096b86 100644 --- a/tools/objtool/include/objtool/warn.h +++ b/tools/objtool/include/objtool/warn.h @@ -152,8 +152,8 @@ static inline void unindent(int *unused) { indent--; } if (unlikely(insn->sym && insn->sym->pfunc && \ insn->sym->pfunc->debug_checksum)) { \ char *insn_off = offstr(insn->sec, insn->offset); \ - __dbg("checksum: %s %s %016lx", \ - func->name, insn_off, checksum); \ + __dbg("checksum: %s %s %016llx", \ + func->name, insn_off, (unsigned long long)checksum);\ free(insn_off); \ } \ }) From ba5ae2566da00fff25609707574ea5f69133f69a Mon Sep 17 00:00:00 2001 From: Sasha Levin Date: Tue, 23 Dec 2025 07:03:57 -0500 Subject: [PATCH 1191/1321] objtool: fix build failure due to missing libopcodes check Commit 59953303827e ("objtool: Disassemble code with libopcodes instead of running objdump") added support for using libopcodes for disassembly. However, the feature detection checks for libbfd availability but then unconditionally links against libopcodes: ifeq ($(feature-libbfd),1) OBJTOOL_LDFLAGS += -lopcodes endif This causes build failures in environments where libbfd is installed but libopcodes is not, since the test-libbfd.c feature test only links against -lbfd and -ldl, not -lopcodes: /usr/bin/ld: cannot find -lopcodes: No such file or directory collect2: error: ld returned 1 exit status make[4]: *** [Makefile:109: objtool] Error 1 Additionally, the shared feature framework uses $(CC) which is the cross-compiler in cross-compilation builds. Since objtool is a host tool that links with $(HOSTCC) against host libraries, the feature detection can falsely report libopcodes as available when the cross-compiler's sysroot has it but the host system doesn't. Fix this by replacing the feature framework check with a direct inline test that uses $(HOSTCC) to compile and link a test program against libopcodes, similar to how xxhash availability is detected. Fixes: 59953303827e ("objtool: Disassemble code with libopcodes instead of running objdump") Assisted-by: claude-opus-4-5-20251101 Signed-off-by: Sasha Levin Signed-off-by: Peter Zijlstra (Intel) Link: https://patch.msgid.link/20251223120357.2492008-1-sashal@kernel.org --- tools/objtool/Makefile | 24 ++++++++++++++---------- 1 file changed, 14 insertions(+), 10 deletions(-) diff --git a/tools/objtool/Makefile b/tools/objtool/Makefile index ad6e1ec706ce0..9b4503113ce5f 100644 --- a/tools/objtool/Makefile +++ b/tools/objtool/Makefile @@ -72,23 +72,27 @@ HOST_OVERRIDES := CC="$(HOSTCC)" LD="$(HOSTLD)" AR="$(HOSTAR)" # # To support disassembly, objtool needs libopcodes which is provided -# with libbdf (binutils-dev or binutils-devel package). +# with libbfd (binutils-dev or binutils-devel package). # -FEATURE_USER = .objtool -FEATURE_TESTS = libbfd disassembler-init-styled -FEATURE_DISPLAY = -include $(srctree)/tools/build/Makefile.feature +# We check using HOSTCC directly rather than the shared feature framework +# because objtool is a host tool that links against host libraries. +# +HAVE_LIBOPCODES := $(shell echo 'int main(void) { return 0; }' | \ + $(HOSTCC) -xc - -o /dev/null -lopcodes 2>/dev/null && echo y) -ifeq ($(feature-disassembler-init-styled), 1) - OBJTOOL_CFLAGS += -DDISASM_INIT_STYLED -endif +# Styled disassembler support requires binutils >= 2.39 +HAVE_DISASM_STYLED := $(shell echo '$(pound)include ' | \ + $(HOSTCC) -E -xc - 2>/dev/null | grep -q disassembler_style && echo y) BUILD_DISAS := n -ifeq ($(feature-libbfd),1) +ifeq ($(HAVE_LIBOPCODES),y) BUILD_DISAS := y - OBJTOOL_CFLAGS += -DDISAS -DPACKAGE="objtool" + OBJTOOL_CFLAGS += -DDISAS -DPACKAGE='"objtool"' OBJTOOL_LDFLAGS += -lopcodes +ifeq ($(HAVE_DISASM_STYLED),y) + OBJTOOL_CFLAGS += -DDISASM_INIT_STYLED +endif endif export BUILD_DISAS From 313678b80a98dc1ba48b9056be1306ca29f8c050 Mon Sep 17 00:00:00 2001 From: Pingfan Liu Date: Tue, 25 Nov 2025 11:26:29 +0800 Subject: [PATCH 1192/1321] sched/deadline: Remove unnecessary comment in dl_add_task_root_domain() The comments above dl_get_task_effective_cpus() and dl_add_task_root_domain() already explain how to fetch a valid root domain and protect against races. There's no need to repeat this inside dl_add_task_root_domain(). Remove the redundant comment to keep the code clean. No functional change. Signed-off-by: Pingfan Liu Signed-off-by: Peter Zijlstra (Intel) Acked-by: Juri Lelli Acked-by: Waiman Long Link: https://patch.msgid.link/20251125032630.8746-2-piliu@redhat.com --- kernel/sched/deadline.c | 9 --------- 1 file changed, 9 deletions(-) diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c index 319439fe18702..463ba50f9fffb 100644 --- a/kernel/sched/deadline.c +++ b/kernel/sched/deadline.c @@ -3162,20 +3162,11 @@ void dl_add_task_root_domain(struct task_struct *p) return; } - /* - * Get an active rq, whose rq->rd traces the correct root - * domain. - * Ideally this would be under cpuset reader lock until rq->rd is - * fetched. However, sleepable locks cannot nest inside pi_lock, so we - * rely on the caller of dl_add_task_root_domain() holds 'cpuset_mutex' - * to guarantee the CPU stays in the cpuset. - */ dl_get_task_effective_cpus(p, msk); cpu = cpumask_first_and(cpu_active_mask, msk); BUG_ON(cpu >= nr_cpu_ids); rq = cpu_rq(cpu); dl_b = &rq->rd->dl_bw; - /* End of fetching rd */ raw_spin_lock(&dl_b->lock); __dl_add(dl_b, p->dl.dl_bw, cpumask_weight(rq->rd->span)); From 7babf4426c4dcb5a4fffda1f2805e96ddde06371 Mon Sep 17 00:00:00 2001 From: Pingfan Liu Date: Tue, 25 Nov 2025 11:26:30 +0800 Subject: [PATCH 1193/1321] sched/deadline: Fix potential race in dl_add_task_root_domain() The access rule for local_cpu_mask_dl requires it to be called on the local CPU with preemption disabled. However, dl_add_task_root_domain() currently violates this rule. Without preemption disabled, the following race can occur: 1. ThreadA calls dl_add_task_root_domain() on CPU 0 2. Gets pointer to CPU 0's local_cpu_mask_dl 3. ThreadA is preempted and migrated to CPU 1 4. ThreadA continues using CPU 0's local_cpu_mask_dl 5. Meanwhile, the scheduler on CPU 0 calls find_later_rq() which also uses local_cpu_mask_dl (with preemption properly disabled) 6. Both contexts now corrupt the same per-CPU buffer concurrently Fix this by moving the local_cpu_mask_dl access to the preemption disabled section. Closes: https://lore.kernel.org/lkml/aSBjm3mN_uIy64nz@jlelli-thinkpadt14gen4.remote.csb Fixes: 318e18ed22e8 ("sched/deadline: Walk up cpuset hierarchy to decide root domain when hot-unplug") Reported-by: Juri Lelli Signed-off-by: Pingfan Liu Signed-off-by: Peter Zijlstra (Intel) Acked-by: Juri Lelli Acked-by: Waiman Long Link: https://patch.msgid.link/20251125032630.8746-3-piliu@redhat.com --- kernel/sched/deadline.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c index 463ba50f9fffb..e3efc40349f13 100644 --- a/kernel/sched/deadline.c +++ b/kernel/sched/deadline.c @@ -3154,7 +3154,7 @@ void dl_add_task_root_domain(struct task_struct *p) struct rq *rq; struct dl_bw *dl_b; unsigned int cpu; - struct cpumask *msk = this_cpu_cpumask_var_ptr(local_cpu_mask_dl); + struct cpumask *msk; raw_spin_lock_irqsave(&p->pi_lock, rf.flags); if (!dl_task(p) || dl_entity_is_special(&p->dl)) { @@ -3162,6 +3162,7 @@ void dl_add_task_root_domain(struct task_struct *p) return; } + msk = this_cpu_cpumask_var_ptr(local_cpu_mask_dl); dl_get_task_effective_cpus(p, msk); cpu = cpumask_first_and(cpu_active_mask, msk); BUG_ON(cpu >= nr_cpu_ids); From 75439dee8b446f4146da275e7deafda51f8d2878 Mon Sep 17 00:00:00 2001 From: Peter Zijlstra Date: Tue, 13 Jan 2026 10:50:41 +0100 Subject: [PATCH 1194/1321] sched: Provide idle_rq() helper A fix for the dl_server 'requires' idle_cpu() usage, which made me note that it and available_idle_cpu() are extern function calls. And while idle_cpu() is used outside of kernel/sched/, available_idle_cpu() is not. This makes it hard to make idle_cpu() an inline helper, so provide idle_rq() and implement idle_cpu() and available_idle_cpu() using that. Signed-off-by: Peter Zijlstra (Intel) --- include/linux/sched.h | 1 - kernel/sched/sched.h | 22 ++++++++++++++++++++++ kernel/sched/syscalls.c | 30 +----------------------------- 3 files changed, 23 insertions(+), 30 deletions(-) diff --git a/include/linux/sched.h b/include/linux/sched.h index d395f2810facb..da0133524d08c 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1874,7 +1874,6 @@ static inline int task_nice(const struct task_struct *p) extern int can_nice(const struct task_struct *p, const int nice); extern int task_curr(const struct task_struct *p); extern int idle_cpu(int cpu); -extern int available_idle_cpu(int cpu); extern int sched_setscheduler(struct task_struct *, int, const struct sched_param *); extern int sched_setscheduler_nocheck(struct task_struct *, int, const struct sched_param *); extern void sched_set_fifo(struct task_struct *p); diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index d30cca6870f5f..e885a935b7162 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -1364,6 +1364,28 @@ static inline u32 sched_rng(void) #define cpu_curr(cpu) (cpu_rq(cpu)->curr) #define raw_rq() raw_cpu_ptr(&runqueues) +static inline bool idle_rq(struct rq *rq) +{ + return rq->curr == rq->idle && !rq->nr_running && !rq->ttwu_pending; +} + +/** + * available_idle_cpu - is a given CPU idle for enqueuing work. + * @cpu: the CPU in question. + * + * Return: 1 if the CPU is currently idle. 0 otherwise. + */ +static inline bool available_idle_cpu(int cpu) +{ + if (!idle_rq(cpu_rq(cpu))) + return 0; + + if (vcpu_is_preempted(cpu)) + return 0; + + return 1; +} + #ifdef CONFIG_SCHED_PROXY_EXEC static inline void rq_set_donor(struct rq *rq, struct task_struct *t) { diff --git a/kernel/sched/syscalls.c b/kernel/sched/syscalls.c index 0496dc29ed0f9..cb337de679b86 100644 --- a/kernel/sched/syscalls.c +++ b/kernel/sched/syscalls.c @@ -180,35 +180,7 @@ int task_prio(const struct task_struct *p) */ int idle_cpu(int cpu) { - struct rq *rq = cpu_rq(cpu); - - if (rq->curr != rq->idle) - return 0; - - if (rq->nr_running) - return 0; - - if (rq->ttwu_pending) - return 0; - - return 1; -} - -/** - * available_idle_cpu - is a given CPU idle for enqueuing work. - * @cpu: the CPU in question. - * - * Return: 1 if the CPU is currently idle. 0 otherwise. - */ -int available_idle_cpu(int cpu) -{ - if (!idle_cpu(cpu)) - return 0; - - if (vcpu_is_preempted(cpu)) - return 0; - - return 1; + return idle_rq(cpu_rq(cpu)); } /** From 8ddf3aa87154cd8af99b5a8acbe4507d2832acb6 Mon Sep 17 00:00:00 2001 From: Gabriele Monaco Date: Tue, 13 Jan 2026 09:52:01 +0100 Subject: [PATCH 1195/1321] sched/deadline: Fix server stopping with runnable tasks The deadline server can currently stop due to idle although fair tasks are runnable. This happens essentially when: * the server is set to idle, a task wakes up, the server stops * a task wakes up, the server sets itself to idle and stops right away Address both cases by clearing the server idle flag whenever a fair task wakes up and accounting also for pending tasks in the definition of idle. Fixes: f5a538c07df2 ("sched/deadline: Fix dl_server stop condition") Signed-off-by: Gabriele Monaco Signed-off-by: Peter Zijlstra (Intel) Link: https://patch.msgid.link/20260113085159.114226-3-gmonaco@redhat.com --- kernel/sched/deadline.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c index e3efc40349f13..b5c19b17e386e 100644 --- a/kernel/sched/deadline.c +++ b/kernel/sched/deadline.c @@ -1420,7 +1420,7 @@ update_stats_dequeue_dl(struct dl_rq *dl_rq, struct sched_dl_entity *dl_se, int static void update_curr_dl_se(struct rq *rq, struct sched_dl_entity *dl_se, s64 delta_exec) { - bool idle = rq->curr == rq->idle; + bool idle = idle_rq(rq); s64 scaled_delta_exec; if (unlikely(delta_exec <= 0)) { @@ -1603,8 +1603,8 @@ void dl_server_update(struct sched_dl_entity *dl_se, s64 delta_exec) * | 8 | B:zero_laxity-wait | | | * | | | <---+ | * | +--------------------------------+ | - * | | ^ ^ 2 | - * | | 7 | 2 +--------------------+ + * | | ^ ^ 2 | + * | | 7 | 2, 1 +----------------+ * | v | * | +-------------+ | * +-- | C:idle-wait | -+ @@ -1649,8 +1649,11 @@ void dl_server_update(struct sched_dl_entity *dl_se, s64 delta_exec) * dl_defer_idle = 0 * * - * [1] A->B, A->D + * [1] A->B, A->D, C->B * dl_server_start() + * dl_defer_idle = 0; + * if (dl_server_active) + * return; // [B] * dl_server_active = 1; * enqueue_dl_entity() * update_dl_entity(WAKEUP) @@ -1759,6 +1762,7 @@ void dl_server_update(struct sched_dl_entity *dl_se, s64 delta_exec) * "B:zero_laxity-wait" -> "C:idle-wait" [label="7:dl_server_update_idle"] * "B:zero_laxity-wait" -> "D:running" [label="3:dl_server_timer"] * "C:idle-wait" -> "A:init" [label="8:dl_server_timer"] + * "C:idle-wait" -> "B:zero_laxity-wait" [label="1:dl_server_start"] * "C:idle-wait" -> "B:zero_laxity-wait" [label="2:dl_server_update"] * "C:idle-wait" -> "C:idle-wait" [label="7:dl_server_update_idle"] * "D:running" -> "A:init" [label="4:pick_task_dl"] @@ -1784,6 +1788,7 @@ void dl_server_start(struct sched_dl_entity *dl_se) { struct rq *rq = dl_se->rq; + dl_se->dl_defer_idle = 0; if (!dl_server(dl_se) || dl_se->dl_server_active) return; From a8be06247118eb996c0b037c557efbfec18a5a31 Mon Sep 17 00:00:00 2001 From: Peter Zijlstra Date: Tue, 6 Jan 2026 11:41:13 +0100 Subject: [PATCH 1196/1321] sched/deadline: Ensure get_prio_dl() is up-to-date Pratheek tripped a WARN and noted the following issue: > Inspecting the set of events that led to the warning being triggered > showed the following: > > systemd-1 [008] dN.31 ...: do_set_cpus_allowed: set_cpus_allowed begin! > > systemd-1 [008] dN.31 ...: sched_change_begin: Begin! > systemd-1 [008] dN.31 ...: sched_change_begin: Before dequeue_task()! > systemd-1 [008] dN.31 ...: update_curr_dl_se: update_curr_dl_se: ENQUEUE_REPLENISH > systemd-1 [008] dN.31 ...: enqueue_dl_entity: enqueue_dl_entity: ENQUEUE_REPLENISH > systemd-1 [008] dN.31 ...: replenish_dl_entity: Replenish before: 14815760217 > systemd-1 [008] dN.31 ...: replenish_dl_entity: Replenish after: 14816960047 > systemd-1 [008] dN.31 ...: sched_change_begin: Before put_prev_task()! > > systemd-1 [008] dN.31 ...: sched_change_end: Before enqueue_task()! > systemd-1 [008] dN.31 ...: sched_change_end: Before put_prev_task()! > systemd-1 [008] dN.31 ...: prio_changed_dl: Queuing pull task on prio change: 14815760217 -> 14816960047 > systemd-1 [008] dN.31 ...: prio_changed_dl: Queuing balance callback! > systemd-1 [008] dN.31 ...: sched_change_end: End! > > systemd-1 [008] dN.31 ...: do_set_cpus_allowed: set_cpus_allowed end! > systemd-1 [008] dN.21 ...: __schedule: Woops! Balance callback found! > > 1. sched_change_begin() from guard(sched_change) in > do_set_cpus_allowed() stashes the priority, which for the deadline > task, is "p->dl.deadline". > 2. The dequeue of the deadline task replenishes the deadline. > 3. The task is enqueued back after guard's scope ends and since there is > no *_CLASS flags set, sched_change_end() calls > dl_sched_class->prio_changed() which compares the deadline. > 4. Since deadline was moved on dequeue, prio_changed_dl() sees the value > differ from the stashed value and queues a balance pull callback. > 5. do_set_cpus_allowed() finishes and drops the rq_lock without doing a > do_balance_callbacks(). > 6. Grabbing the rq_lock() at subsequent __schedule() triggers the > warning since the balance pull callback was never executed before > dropping the lock. Meaning get_prio_dl() ought to update current and return an up-to-date value. Fixes: 6455ad5346c9 ("sched: Move sched_class::prio_changed() into the change pattern") Reported-by: K Prateek Nayak Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: K Prateek Nayak Tested-by: K Prateek Nayak Link: https://patch.msgid.link/20260106104113.GX3707891@noisy.programming.kicks-ass.net --- kernel/sched/deadline.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c index b5c19b17e386e..b7acf74b6527c 100644 --- a/kernel/sched/deadline.c +++ b/kernel/sched/deadline.c @@ -3296,6 +3296,12 @@ static void switched_to_dl(struct rq *rq, struct task_struct *p) static u64 get_prio_dl(struct rq *rq, struct task_struct *p) { + /* + * Make sure to update current so we don't return a stale value. + */ + if (task_current_donor(rq, p)) + update_curr_dl(rq); + return p->dl.deadline; } From 05d2e2e08700d3365f582ddd8f22912ffba186f8 Mon Sep 17 00:00:00 2001 From: Peter Zijlstra Date: Tue, 13 Jan 2026 12:57:14 +0100 Subject: [PATCH 1197/1321] sched/deadline: Avoid double update_rq_clock() When setup_new_dl_entity() is called from enqueue_task_dl() -> enqueue_dl_entity(), the rq-clock should already be updated, and calling update_rq_clock() again is not right. Move the update_rq_clock() to the one other caller of setup_new_dl_entity(): sched_init_dl_server(). Fixes: 9f239df55546 ("sched/deadline: Initialize dl_servers after SMP") Reported-by: Pierre Gondois Signed-off-by: Peter Zijlstra (Intel) Tested-by: Pierre Gondois Link: https://patch.msgid.link/20260113115622.GA831285@noisy.programming.kicks-ass.net --- kernel/sched/deadline.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c index b7acf74b6527c..5d6f3cced7401 100644 --- a/kernel/sched/deadline.c +++ b/kernel/sched/deadline.c @@ -752,8 +752,6 @@ static inline void setup_new_dl_entity(struct sched_dl_entity *dl_se) struct dl_rq *dl_rq = dl_rq_of_se(dl_se); struct rq *rq = rq_of_dl_rq(dl_rq); - update_rq_clock(rq); - WARN_ON(is_dl_boosted(dl_se)); WARN_ON(dl_time_before(rq_clock(rq), dl_se->deadline)); @@ -1839,6 +1837,7 @@ void sched_init_dl_servers(void) rq = cpu_rq(cpu); guard(rq_lock_irq)(rq); + update_rq_clock(rq); dl_se = &rq->fair_server; From ff119dd03c6749d5090f4af1aa0469733d7d642c Mon Sep 17 00:00:00 2001 From: Peter Zijlstra Date: Thu, 15 Jan 2026 09:16:44 +0100 Subject: [PATCH 1198/1321] sched: Fold rq-pin swizzle into __balance_callbacks() Prepare for more users needing the rq-pin swizzle. Signed-off-by: Peter Zijlstra (Intel) Tested-by: Pierre Gondois Tested-by: Juri Lelli Link: https://patch.msgid.link/20260114130528.GB831285@noisy.programming.kicks-ass.net --- kernel/sched/core.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 60afadb6eedea..842a3adaf7467 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -4950,9 +4950,13 @@ struct balance_callback *splice_balance_callbacks(struct rq *rq) return __splice_balance_callbacks(rq, true); } -static void __balance_callbacks(struct rq *rq) +static void __balance_callbacks(struct rq *rq, struct rq_flags *rf) { + if (rf) + rq_unpin_lock(rq, rf); do_balance_callbacks(rq, __splice_balance_callbacks(rq, false)); + if (rf) + rq_repin_lock(rq, rf); } void balance_callbacks(struct rq *rq, struct balance_callback *head) @@ -4991,7 +4995,7 @@ static inline void finish_lock_switch(struct rq *rq) * prev into current: */ spin_acquire(&__rq_lockp(rq)->dep_map, 0, 0, _THIS_IP_); - __balance_callbacks(rq); + __balance_callbacks(rq, NULL); raw_spin_rq_unlock_irq(rq); } @@ -6867,7 +6871,7 @@ static void __sched notrace __schedule(int sched_mode) proxy_tag_curr(rq, next); rq_unpin_lock(rq, &rf); - __balance_callbacks(rq); + __balance_callbacks(rq, NULL); raw_spin_rq_unlock_irq(rq); } trace_sched_exit_tp(is_switch); @@ -7362,9 +7366,7 @@ void rt_mutex_setprio(struct task_struct *p, struct task_struct *pi_task) out_unlock: /* Caller holds task_struct::pi_lock, IRQs are still disabled */ - rq_unpin_lock(rq, &rf); - __balance_callbacks(rq); - rq_repin_lock(rq, &rf); + __balance_callbacks(rq, &rf); __task_rq_unlock(rq, p, &rf); } #endif /* CONFIG_RT_MUTEXES */ From d7206955f455b38b736fa8bb1d959e785d16d2d2 Mon Sep 17 00:00:00 2001 From: Peter Zijlstra Date: Thu, 15 Jan 2026 09:17:49 +0100 Subject: [PATCH 1199/1321] sched: Audit MOVE vs balance_callbacks The {DE,EN}QUEUE_MOVE flag indicates a task is allowed to change priority, which means there could be balance callbacks queued. Therefore audit all MOVE users and make sure they do run balance callbacks before dropping rq-lock. Fixes: 6455ad5346c9 ("sched: Move sched_class::prio_changed() into the change pattern") Signed-off-by: Peter Zijlstra (Intel) Tested-by: Pierre Gondois Tested-by: Juri Lelli Link: https://patch.msgid.link/20260114130528.GB831285@noisy.programming.kicks-ass.net --- kernel/sched/core.c | 4 +++- kernel/sched/ext.c | 1 + kernel/sched/sched.h | 5 ++++- 3 files changed, 8 insertions(+), 2 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 842a3adaf7467..4d925d7ad097e 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -4950,7 +4950,7 @@ struct balance_callback *splice_balance_callbacks(struct rq *rq) return __splice_balance_callbacks(rq, true); } -static void __balance_callbacks(struct rq *rq, struct rq_flags *rf) +void __balance_callbacks(struct rq *rq, struct rq_flags *rf) { if (rf) rq_unpin_lock(rq, rf); @@ -9126,6 +9126,8 @@ void sched_move_task(struct task_struct *tsk, bool for_autogroup) if (resched) resched_curr(rq); + + __balance_callbacks(rq, &rq_guard.rf); } static struct cgroup_subsys_state * diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c index 8f6d8d7f895cc..afe28c04d5aa7 100644 --- a/kernel/sched/ext.c +++ b/kernel/sched/ext.c @@ -545,6 +545,7 @@ static void scx_task_iter_start(struct scx_task_iter *iter) static void __scx_task_iter_rq_unlock(struct scx_task_iter *iter) { if (iter->locked_task) { + __balance_callbacks(iter->rq, &iter->rf); task_rq_unlock(iter->rq, iter->locked_task, &iter->rf); iter->locked_task = NULL; } diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index e885a935b7162..93fce4bbff5ea 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -2388,7 +2388,8 @@ extern const u32 sched_prio_to_wmult[40]; * should preserve as much state as possible. * * MOVE - paired with SAVE/RESTORE, explicitly does not preserve the location - * in the runqueue. + * in the runqueue. IOW the priority is allowed to change. Callers + * must expect to deal with balance callbacks. * * NOCLOCK - skip the update_rq_clock() (avoids double updates) * @@ -3969,6 +3970,8 @@ extern void enqueue_task(struct rq *rq, struct task_struct *p, int flags); extern bool dequeue_task(struct rq *rq, struct task_struct *p, int flags); extern struct balance_callback *splice_balance_callbacks(struct rq *rq); + +extern void __balance_callbacks(struct rq *rq, struct rq_flags *rf); extern void balance_callbacks(struct rq *rq, struct balance_callback *head); /* From 1a82cd6f058a2570e3827473bf046e238b65e0ff Mon Sep 17 00:00:00 2001 From: Peter Zijlstra Date: Thu, 15 Jan 2026 09:25:37 +0100 Subject: [PATCH 1200/1321] sched: Deadline has dynamic priority While FIFO/RR have static priority, DEADLINE is a dynamic priority scheme. Notably it has static priority -1. Do not assume the priority doesn't change for deadline tasks just because the static priority doesn't change. This ensures DL always sees {DE,EN}QUEUE_MOVE where appropriate. Fixes: ff77e4685359 ("sched/rt: Fix PI handling vs. sched_setscheduler()") Signed-off-by: Peter Zijlstra (Intel) Tested-by: Pierre Gondois Tested-by: Juri Lelli Link: https://patch.msgid.link/20260114130528.GB831285@noisy.programming.kicks-ass.net --- kernel/sched/core.c | 2 +- kernel/sched/syscalls.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 4d925d7ad097e..045f83ad261e2 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -7320,7 +7320,7 @@ void rt_mutex_setprio(struct task_struct *p, struct task_struct *pi_task) trace_sched_pi_setprio(p, pi_task); oldprio = p->prio; - if (oldprio == prio) + if (oldprio == prio && !dl_prio(prio)) queue_flag &= ~DEQUEUE_MOVE; prev_class = p->sched_class; diff --git a/kernel/sched/syscalls.c b/kernel/sched/syscalls.c index cb337de679b86..6f10db3646e7f 100644 --- a/kernel/sched/syscalls.c +++ b/kernel/sched/syscalls.c @@ -639,7 +639,7 @@ int __sched_setscheduler(struct task_struct *p, * itself. */ newprio = rt_effective_prio(p, newprio); - if (newprio == oldprio) + if (newprio == oldprio && !dl_prio(newprio)) queue_flags &= ~DEQUEUE_MOVE; } From 97a78cfe78c86493605bfa21442ff2d2f090ec2b Mon Sep 17 00:00:00 2001 From: Peter Zijlstra Date: Thu, 15 Jan 2026 09:27:22 +0100 Subject: [PATCH 1201/1321] sched/deadline: Use ENQUEUE_MOVE to allow priority change Pierre reported hitting balance callback warnings for deadline tasks after commit 6455ad5346c9 ("sched: Move sched_class::prio_changed() into the change pattern"). It turns out that DEQUEUE_SAVE+ENQUEUE_RESTORE does not preserve DL priority and subsequently trips a balance pass -- where one was not expected. From discussion with Juri and Luca, the purpose of this clause was to deal with tasks new to DL and all those sites will have MOVE set (as well as CLASS, but MOVE is move conservative at this point). Per the previous patches MOVE is audited to always run the balance callbacks, so switch enqueue_dl_entity() to use MOVE for this case. Fixes: 6455ad5346c9 ("sched: Move sched_class::prio_changed() into the change pattern") Reported-by: Pierre Gondois Signed-off-by: Peter Zijlstra (Intel) Tested-by: Pierre Gondois Tested-by: Juri Lelli Link: https://patch.msgid.link/20260114130528.GB831285@noisy.programming.kicks-ass.net --- kernel/sched/deadline.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c index 5d6f3cced7401..c509f2e7d69de 100644 --- a/kernel/sched/deadline.c +++ b/kernel/sched/deadline.c @@ -2214,7 +2214,7 @@ enqueue_dl_entity(struct sched_dl_entity *dl_se, int flags) update_dl_entity(dl_se); } else if (flags & ENQUEUE_REPLENISH) { replenish_dl_entity(dl_se); - } else if ((flags & ENQUEUE_RESTORE) && + } else if ((flags & ENQUEUE_MOVE) && !is_dl_boosted(dl_se) && dl_time_before(dl_se->deadline, rq_clock(rq_of_dl_se(dl_se)))) { setup_new_dl_entity(dl_se); From 571f4883973bb6f266ac13244f929923ef5e38e5 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Thomas=20Wei=C3=9Fschuh?= Date: Wed, 7 Jan 2026 11:39:24 +0100 Subject: [PATCH 1202/1321] hrtimer: Fix softirq base check in update_needs_ipi() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The 'clockid' field is not the correct way to check for a softirq base. Fix the check to correctly compare the base type instead of the clockid. Fixes: 1e7f7fbcd40c ("hrtimer: Avoid more SMP function calls in clock_was_set()") Signed-off-by: Thomas Weißschuh Signed-off-by: Thomas Gleixner Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20260107-hrtimer-clock-base-check-v1-1-afb5dbce94a1@linutronix.de --- kernel/time/hrtimer.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c index bdb30cc5e873e..0e4bc1ca15ff1 100644 --- a/kernel/time/hrtimer.c +++ b/kernel/time/hrtimer.c @@ -913,7 +913,7 @@ static bool update_needs_ipi(struct hrtimer_cpu_base *cpu_base, return true; /* Extra check for softirq clock bases */ - if (base->clockid < HRTIMER_BASE_MONOTONIC_SOFT) + if (base->index < HRTIMER_BASE_MONOTONIC_SOFT) continue; if (cpu_base->softirq_activated) continue; From 43f67096bb3b078e2519d3e82d95242a07933a61 Mon Sep 17 00:00:00 2001 From: Xiaochen Shen Date: Tue, 9 Dec 2025 14:26:49 +0800 Subject: [PATCH 1203/1321] x86/resctrl: Add missing resctrl initialization for Hygon Hygon CPUs supporting Platform QoS features currently undergo partial resctrl initialization through resctrl_cpu_detect() in the Hygon BSP init helper and AMD/Hygon common initialization code. However, several critical data structures remain uninitialized for Hygon CPUs in the following paths: - get_mem_config()-> __rdt_get_mem_config_amd(): rdt_resource::membw,alloc_capable hw_res::num_closid - rdt_init_res_defs()->rdt_init_res_defs_amd(): rdt_resource::cache hw_res::msr_base,msr_update Add the missing AMD/Hygon common initialization to ensure proper Platform QoS functionality on Hygon CPUs. Fixes: d8df126349da ("x86/cpu/hygon: Add missing resctrl_cpu_detect() in bsp_init helper") Signed-off-by: Xiaochen Shen Signed-off-by: Borislav Petkov (AMD) Reviewed-by: Reinette Chatre Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251209062650.1536952-2-shenxiaochen@open-hieco.net --- arch/x86/kernel/cpu/resctrl/core.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c index 3792ab4819dc6..10de1594d328f 100644 --- a/arch/x86/kernel/cpu/resctrl/core.c +++ b/arch/x86/kernel/cpu/resctrl/core.c @@ -825,7 +825,8 @@ static __init bool get_mem_config(void) if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) return __get_mem_config_intel(&hw_res->r_resctrl); - else if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) + else if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD || + boot_cpu_data.x86_vendor == X86_VENDOR_HYGON) return __rdt_get_mem_config_amd(&hw_res->r_resctrl); return false; @@ -987,7 +988,8 @@ static __init void rdt_init_res_defs(void) { if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) rdt_init_res_defs_intel(); - else if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) + else if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD || + boot_cpu_data.x86_vendor == X86_VENDOR_HYGON) rdt_init_res_defs_amd(); } From bda4f3eb3229bc3a739805b96713988e838701ed Mon Sep 17 00:00:00 2001 From: Xiaochen Shen Date: Tue, 9 Dec 2025 14:26:50 +0800 Subject: [PATCH 1204/1321] x86/resctrl: Fix memory bandwidth counter width for Hygon The memory bandwidth calculation relies on reading the hardware counter and measuring the delta between samples. To ensure accurate measurement, the software reads the counter frequently enough to prevent it from rolling over twice between reads. The default Memory Bandwidth Monitoring (MBM) counter width is 24 bits. Hygon CPUs provide a 32-bit width counter, but they do not support the MBM capability CPUID leaf (0xF.[ECX=1]:EAX) to report the width offset (from 24 bits). Consequently, the kernel falls back to the 24-bit default counter width, which causes incorrect overflow handling on Hygon CPUs. Fix this by explicitly setting the counter width offset to 8 bits (resulting in a 32-bit total counter width) for Hygon CPUs. Fixes: d8df126349da ("x86/cpu/hygon: Add missing resctrl_cpu_detect() in bsp_init helper") Signed-off-by: Xiaochen Shen Signed-off-by: Borislav Petkov (AMD) Reviewed-by: Tony Luck Reviewed-by: Reinette Chatre Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251209062650.1536952-3-shenxiaochen@open-hieco.net --- arch/x86/kernel/cpu/resctrl/core.c | 15 +++++++++++++-- arch/x86/kernel/cpu/resctrl/internal.h | 3 +++ 2 files changed, 16 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c index 10de1594d328f..6ebff44a3f754 100644 --- a/arch/x86/kernel/cpu/resctrl/core.c +++ b/arch/x86/kernel/cpu/resctrl/core.c @@ -1021,8 +1021,19 @@ void resctrl_cpu_detect(struct cpuinfo_x86 *c) c->x86_cache_occ_scale = ebx; c->x86_cache_mbm_width_offset = eax & 0xff; - if (c->x86_vendor == X86_VENDOR_AMD && !c->x86_cache_mbm_width_offset) - c->x86_cache_mbm_width_offset = MBM_CNTR_WIDTH_OFFSET_AMD; + if (!c->x86_cache_mbm_width_offset) { + switch (c->x86_vendor) { + case X86_VENDOR_AMD: + c->x86_cache_mbm_width_offset = MBM_CNTR_WIDTH_OFFSET_AMD; + break; + case X86_VENDOR_HYGON: + c->x86_cache_mbm_width_offset = MBM_CNTR_WIDTH_OFFSET_HYGON; + break; + default: + /* Leave c->x86_cache_mbm_width_offset as 0 */ + break; + } + } } } diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h index 4a916c84a3220..79c18657ede0a 100644 --- a/arch/x86/kernel/cpu/resctrl/internal.h +++ b/arch/x86/kernel/cpu/resctrl/internal.h @@ -14,6 +14,9 @@ #define MBM_CNTR_WIDTH_OFFSET_AMD 20 +/* Hygon MBM counter width as an offset from MBM_CNTR_WIDTH_BASE */ +#define MBM_CNTR_WIDTH_OFFSET_HYGON 8 + #define RMID_VAL_ERROR BIT_ULL(63) #define RMID_VAL_UNAVAIL BIT_ULL(62) From ef7224b773152b807ce78a1626b5f9f33f6df41e Mon Sep 17 00:00:00 2001 From: Bala-Vignesh-Reddy Date: Wed, 22 Oct 2025 11:59:48 +0530 Subject: [PATCH 1205/1321] selftests/x86: Add selftests include path for kselftest.h after centralization The previous change centralizing kselftest.h include path in lib.mk caused x86 selftests to fail, as x86 Makefile overwrites CFLAGS using ":=", dropping the include path added in lib.mk. Therefore, helpers.h could not find kselftest.h during compilation. Fix this by adding the tools/testing/sefltest to CFLAGS in x86 Makefile. [ bp: Correct commit ID in Fixes: ] Fixes: e6fbd1759c9e ("selftests: complete kselftest include centralization") Closes: https://lore.kernel.org/lkml/CA+G9fYvKjQcCBMfXA-z2YuL2L+3Qd-pJjEUDX8PDdz2-EEQd=Q@mail.gmail.com/T/#m83fd330231287fc9d6c921155bee16c591db7360 Reported-by: Linux Kernel Functional Testing Signed-off-by: Bala-Vignesh-Reddy Signed-off-by: Borislav Petkov (AMD) Tested-by: Anders Roxell Tested-by: Brendan Jackman Link: https://patch.msgid.link/20251022062948.162852-1-reddybalavignesh9979@gmail.com --- tools/testing/selftests/x86/Makefile | 1 + 1 file changed, 1 insertion(+) diff --git a/tools/testing/selftests/x86/Makefile b/tools/testing/selftests/x86/Makefile index 83148875a12c2..434065215d127 100644 --- a/tools/testing/selftests/x86/Makefile +++ b/tools/testing/selftests/x86/Makefile @@ -36,6 +36,7 @@ BINARIES_32 := $(patsubst %,$(OUTPUT)/%,$(BINARIES_32)) BINARIES_64 := $(patsubst %,$(OUTPUT)/%,$(BINARIES_64)) CFLAGS := -O2 -g -std=gnu99 -pthread -Wall $(KHDR_INCLUDES) +CFLAGS += -I $(top_srcdir)/tools/testing/selftests/ # call32_from_64 in thunks.S uses absolute addresses. ifeq ($(CAN_BUILD_WITH_NOPIE),1) From 1bb0fedc3ff3d93d9fa42329a15b07337fb0b609 Mon Sep 17 00:00:00 2001 From: Haoxiang Li Date: Tue, 23 Dec 2025 20:32:02 +0800 Subject: [PATCH 1206/1321] EDAC/i3200: Fix a resource leak in i3200_probe1() If edac_mc_alloc() fails, also unmap the window. [ bp: Use separate labels, turning it into the classic unwind pattern. ] Fixes: dd8ef1db87a4 ("edac: i3200 memory controller driver") Signed-off-by: Haoxiang Li Signed-off-by: Borislav Petkov (AMD) Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251223123202.1492038-1-lihaoxiang@isrc.iscas.ac.cn --- drivers/edac/i3200_edac.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/drivers/edac/i3200_edac.c b/drivers/edac/i3200_edac.c index afccdebf5ac1e..6cade6d7ceff6 100644 --- a/drivers/edac/i3200_edac.c +++ b/drivers/edac/i3200_edac.c @@ -358,10 +358,11 @@ static int i3200_probe1(struct pci_dev *pdev, int dev_idx) layers[1].type = EDAC_MC_LAYER_CHANNEL; layers[1].size = nr_channels; layers[1].is_virt_csrow = false; - mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, - sizeof(struct i3200_priv)); + + rc = -ENOMEM; + mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, sizeof(struct i3200_priv)); if (!mci) - return -ENOMEM; + goto unmap; edac_dbg(3, "MC: init mci\n"); @@ -421,9 +422,9 @@ static int i3200_probe1(struct pci_dev *pdev, int dev_idx) return 0; fail: + edac_mc_free(mci); +unmap: iounmap(window); - if (mci) - edac_mc_free(mci); return rc; } From 9f48d952752cd0b1234b82003fffa47940261602 Mon Sep 17 00:00:00 2001 From: Haoxiang Li Date: Tue, 23 Dec 2025 20:43:50 +0800 Subject: [PATCH 1207/1321] EDAC/x38: Fix a resource leak in x38_probe1() If edac_mc_alloc() fails, also unmap the window. [ bp: Use separate labels, turning it into the classic unwind pattern. ] Fixes: df8bc08c192f ("edac x38: new MC driver module") Signed-off-by: Haoxiang Li Signed-off-by: Borislav Petkov (AMD) Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251223124350.1496325-1-lihaoxiang@isrc.iscas.ac.cn --- drivers/edac/x38_edac.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/drivers/edac/x38_edac.c b/drivers/edac/x38_edac.c index 49ab5721aab25..292dda754c236 100644 --- a/drivers/edac/x38_edac.c +++ b/drivers/edac/x38_edac.c @@ -341,9 +341,12 @@ static int x38_probe1(struct pci_dev *pdev, int dev_idx) layers[1].type = EDAC_MC_LAYER_CHANNEL; layers[1].size = x38_channel_num; layers[1].is_virt_csrow = false; + + + rc = -ENOMEM; mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, 0); if (!mci) - return -ENOMEM; + goto unmap; edac_dbg(3, "MC: init mci\n"); @@ -403,9 +406,9 @@ static int x38_probe1(struct pci_dev *pdev, int dev_idx) return 0; fail: + edac_mc_free(mci); +unmap: iounmap(window); - if (mci) - edac_mc_free(mci); return rc; } From b4eae155b9bf6a019a58114e559b90614a72c650 Mon Sep 17 00:00:00 2001 From: "Rob Herring (Arm)" Date: Mon, 15 Dec 2025 15:28:05 -0600 Subject: [PATCH 1208/1321] dt-bindings: i2c: brcm,iproc-i2c: Allow 2 reg entries for brcm,iproc-nic-i2c The brcm,iproc-nic-i2c variant has 2 reg entries. The second one is related to the brcm,ape-hsls-addr-mask property, but it's not clear what a proper description would be. Signed-off-by: Rob Herring (Arm) Reviewed-by: Florian Fainelli Signed-off-by: Wolfram Sang --- .../devicetree/bindings/i2c/brcm,iproc-i2c.yaml | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/Documentation/devicetree/bindings/i2c/brcm,iproc-i2c.yaml b/Documentation/devicetree/bindings/i2c/brcm,iproc-i2c.yaml index 2aa75b7add7bf..daa70a8500e9b 100644 --- a/Documentation/devicetree/bindings/i2c/brcm,iproc-i2c.yaml +++ b/Documentation/devicetree/bindings/i2c/brcm,iproc-i2c.yaml @@ -16,7 +16,8 @@ properties: - brcm,iproc-nic-i2c reg: - maxItems: 1 + minItems: 1 + maxItems: 2 clock-frequency: enum: [ 100000, 400000 ] @@ -41,8 +42,15 @@ allOf: contains: const: brcm,iproc-nic-i2c then: + properties: + reg: + minItems: 2 required: - brcm,ape-hsls-addr-mask + else: + properties: + reg: + maxItems: 1 unevaluatedProperties: false From d27fee809b27a65d0c63956c891597d669f713f6 Mon Sep 17 00:00:00 2001 From: Tommaso Merciai Date: Thu, 18 Dec 2025 16:10:21 +0100 Subject: [PATCH 1209/1321] i2c: riic: Move suspend handling to NOIRQ phase Commit 53326135d0e0 ("i2c: riic: Add suspend/resume support") added suspend support for the Renesas I2C driver and following this change on RZ/G3E the following WARNING is seen on entering suspend ... [ 134.275704] Freezing remaining freezable tasks completed (elapsed 0.001 seconds) [ 134.285536] ------------[ cut here ]------------ [ 134.290298] i2c i2c-2: Transfer while suspended [ 134.295174] WARNING: drivers/i2c/i2c-core.h:56 at __i2c_smbus_xfer+0x1e4/0x214, CPU#0: systemd-sleep/388 [ 134.365507] Tainted: [W]=WARN [ 134.368485] Hardware name: Renesas SMARC EVK version 2 based on r9a09g047e57 (DT) [ 134.375961] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 134.382935] pc : __i2c_smbus_xfer+0x1e4/0x214 [ 134.387329] lr : __i2c_smbus_xfer+0x1e4/0x214 [ 134.391717] sp : ffff800083f23860 [ 134.395040] x29: ffff800083f23860 x28: 0000000000000000 x27: ffff800082ed5d60 [ 134.402226] x26: 0000001f4395fd74 x25: 0000000000000007 x24: 0000000000000001 [ 134.409408] x23: 0000000000000000 x22: 000000000000006f x21: ffff800083f23936 [ 134.416589] x20: ffff0000c090e140 x19: ffff0000c090e0d0 x18: 0000000000000006 [ 134.423771] x17: 6f63657320313030 x16: 2e30206465737061 x15: ffff800083f23280 [ 134.430953] x14: 0000000000000000 x13: ffff800082b16ce8 x12: 0000000000000f09 [ 134.438134] x11: 0000000000000503 x10: ffff800082b6ece8 x9 : ffff800082b16ce8 [ 134.445315] x8 : 00000000ffffefff x7 : ffff800082b6ece8 x6 : 80000000fffff000 [ 134.452495] x5 : 0000000000000504 x4 : 0000000000000000 x3 : 0000000000000000 [ 134.459672] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0000c9ee9e80 [ 134.466851] Call trace: [ 134.469311] __i2c_smbus_xfer+0x1e4/0x214 (P) [ 134.473715] i2c_smbus_xfer+0xbc/0x120 [ 134.477507] i2c_smbus_read_byte_data+0x4c/0x84 [ 134.482077] isl1208_i2c_read_time+0x44/0x178 [rtc_isl1208] [ 134.487703] isl1208_rtc_read_time+0x14/0x20 [rtc_isl1208] [ 134.493226] __rtc_read_time+0x44/0x88 [ 134.497012] rtc_read_time+0x3c/0x68 [ 134.500622] rtc_suspend+0x9c/0x170 The warning is triggered because I2C transfers can still be attempted while the controller is already suspended, due to inappropriate ordering of the system sleep callbacks. If the controller is autosuspended, there is no way to wake it up once runtime PM disabled (in suspend_late()). During system resume, the I2C controller will be available only after runtime PM is re-enabled (in resume_early()). However, this may be too late for some devices. Wake up the controller in the suspend() callback while runtime PM is still enabled. The I2C controller will remain available until the suspend_noirq() callback (pm_runtime_force_suspend()) is called. During resume, the I2C controller can be restored by the resume_noirq() callback (pm_runtime_force_resume()). Finally, the resume() callback re-enables autosuspend. As a result, the I2C controller can remain available until the system enters suspend_noirq() and from resume_noirq(). Cc: stable@vger.kernel.org Fixes: 53326135d0e0 ("i2c: riic: Add suspend/resume support") Signed-off-by: Tommaso Merciai Reviewed-by: Biju Das Tested-by: Biju Das Signed-off-by: Wolfram Sang --- drivers/i2c/busses/i2c-riic.c | 46 +++++++++++++++++++++++++++++------ 1 file changed, 39 insertions(+), 7 deletions(-) diff --git a/drivers/i2c/busses/i2c-riic.c b/drivers/i2c/busses/i2c-riic.c index 3e8f126cb7f74..9e3595b3623e4 100644 --- a/drivers/i2c/busses/i2c-riic.c +++ b/drivers/i2c/busses/i2c-riic.c @@ -670,12 +670,39 @@ static const struct riic_of_data riic_rz_t2h_info = { static int riic_i2c_suspend(struct device *dev) { - struct riic_dev *riic = dev_get_drvdata(dev); - int ret; + /* + * Some I2C devices may need the I2C controller to remain active + * during resume_noirq() or suspend_noirq(). If the controller is + * autosuspended, there is no way to wake it up once runtime PM is + * disabled (in suspend_late()). + * + * During system resume, the I2C controller will be available only + * after runtime PM is re-enabled (in resume_early()). However, this + * may be too late for some devices. + * + * Wake up the controller in the suspend() callback while runtime PM + * is still enabled. The I2C controller will remain available until + * the suspend_noirq() callback (pm_runtime_force_suspend()) is + * called. During resume, the I2C controller can be restored by the + * resume_noirq() callback (pm_runtime_force_resume()). + * + * Finally, the resume() callback re-enables autosuspend, ensuring + * the I2C controller remains available until the system enters + * suspend_noirq() and from resume_noirq(). + */ + return pm_runtime_resume_and_get(dev); +} - ret = pm_runtime_resume_and_get(dev); - if (ret) - return ret; +static int riic_i2c_resume(struct device *dev) +{ + pm_runtime_put_autosuspend(dev); + + return 0; +} + +static int riic_i2c_suspend_noirq(struct device *dev) +{ + struct riic_dev *riic = dev_get_drvdata(dev); i2c_mark_adapter_suspended(&riic->adapter); @@ -683,12 +710,12 @@ static int riic_i2c_suspend(struct device *dev) riic_clear_set_bit(riic, ICCR1_ICE, 0, RIIC_ICCR1); pm_runtime_mark_last_busy(dev); - pm_runtime_put_sync(dev); + pm_runtime_force_suspend(dev); return reset_control_assert(riic->rstc); } -static int riic_i2c_resume(struct device *dev) +static int riic_i2c_resume_noirq(struct device *dev) { struct riic_dev *riic = dev_get_drvdata(dev); int ret; @@ -697,6 +724,10 @@ static int riic_i2c_resume(struct device *dev) if (ret) return ret; + ret = pm_runtime_force_resume(dev); + if (ret) + return ret; + ret = riic_init_hw(riic); if (ret) { /* @@ -714,6 +745,7 @@ static int riic_i2c_resume(struct device *dev) } static const struct dev_pm_ops riic_i2c_pm_ops = { + NOIRQ_SYSTEM_SLEEP_PM_OPS(riic_i2c_suspend_noirq, riic_i2c_resume_noirq) SYSTEM_SLEEP_PM_OPS(riic_i2c_suspend, riic_i2c_resume) }; From ade39cbba09821d7eccdcfea9b1cab0ecb993bd3 Mon Sep 17 00:00:00 2001 From: Neil Armstrong Date: Wed, 29 Oct 2025 19:07:42 +0100 Subject: [PATCH 1210/1321] i2c: qcom-geni: make sure I2C hub controllers can't use SE DMA The I2C Hub controller is a simpler GENI I2C variant that doesn't support DMA at all, add a no_dma flag to make sure it nevers selects the SE DMA mode with mappable 32bytes long transfers. Fixes: cacd9643eca7 ("i2c: qcom-geni: add support for I2C Master Hub variant") Signed-off-by: Neil Armstrong Reviewed-by: Konrad Dybcio Reviewed-by: Mukesh Kumar Savaliya > Signed-off-by: Wolfram Sang --- drivers/i2c/busses/i2c-qcom-geni.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/drivers/i2c/busses/i2c-qcom-geni.c b/drivers/i2c/busses/i2c-qcom-geni.c index 3a04016db2c32..ae609bdd2ec49 100644 --- a/drivers/i2c/busses/i2c-qcom-geni.c +++ b/drivers/i2c/busses/i2c-qcom-geni.c @@ -116,6 +116,7 @@ struct geni_i2c_dev { dma_addr_t dma_addr; struct dma_chan *tx_c; struct dma_chan *rx_c; + bool no_dma; bool gpi_mode; bool abort_done; bool is_tx_multi_desc_xfer; @@ -447,7 +448,7 @@ static int geni_i2c_rx_one_msg(struct geni_i2c_dev *gi2c, struct i2c_msg *msg, size_t len = msg->len; struct i2c_msg *cur; - dma_buf = i2c_get_dma_safe_msg_buf(msg, 32); + dma_buf = gi2c->no_dma ? NULL : i2c_get_dma_safe_msg_buf(msg, 32); if (dma_buf) geni_se_select_mode(se, GENI_SE_DMA); else @@ -486,7 +487,7 @@ static int geni_i2c_tx_one_msg(struct geni_i2c_dev *gi2c, struct i2c_msg *msg, size_t len = msg->len; struct i2c_msg *cur; - dma_buf = i2c_get_dma_safe_msg_buf(msg, 32); + dma_buf = gi2c->no_dma ? NULL : i2c_get_dma_safe_msg_buf(msg, 32); if (dma_buf) geni_se_select_mode(se, GENI_SE_DMA); else @@ -1080,10 +1081,12 @@ static int geni_i2c_probe(struct platform_device *pdev) goto err_resources; } - if (desc && desc->no_dma_support) + if (desc && desc->no_dma_support) { fifo_disable = false; - else + gi2c->no_dma = true; + } else { fifo_disable = readl_relaxed(gi2c->se.base + GENI_IF_DISABLE_RO) & FIFO_IF_DISABLE; + } if (fifo_disable) { /* FIFO is disabled, so we can only use GPI DMA */ From db5a01aacefaf4335fab5be0b6103013bc440ce3 Mon Sep 17 00:00:00 2001 From: Carlos Song Date: Fri, 21 Nov 2025 11:00:30 +0800 Subject: [PATCH 1211/1321] i2c: imx-lpi2c: change to PIO mode in system-wide suspend/resume progress EDMA resumes early and suspends late in the system power transition sequence, while LPI2C enters the NOIRQ stage for both suspend and resume. This means LPI2C resources become available before EDMA is fully resumed. Once IRQs are enabled, a slave device may immediately trigger an LPI2C transfer. If the transfer length meets DMA requirements, the driver will attempt to use EDMA even though EDMA may still be unavailable. This timing gap can lead to transfer failures. To prevent this, force LPI2C to use PIO mode during system-wide suspend and resume transitions. This reduces dependency on EDMA and avoids using an unready DMA resource. Fixes: a09c8b3f9047 ("i2c: imx-lpi2c: add eDMA mode support for LPI2C") Signed-off-by: Carlos Song Reviewed-by: Frank Li Signed-off-by: Wolfram Sang --- drivers/i2c/busses/i2c-imx-lpi2c.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/drivers/i2c/busses/i2c-imx-lpi2c.c b/drivers/i2c/busses/i2c-imx-lpi2c.c index 2a0962a0b4417..d882126c1778c 100644 --- a/drivers/i2c/busses/i2c-imx-lpi2c.c +++ b/drivers/i2c/busses/i2c-imx-lpi2c.c @@ -592,6 +592,13 @@ static bool is_use_dma(struct lpi2c_imx_struct *lpi2c_imx, struct i2c_msg *msg) if (!lpi2c_imx->can_use_dma) return false; + /* + * A system-wide suspend or resume transition is in progress. LPI2C should use PIO to + * transfer data to avoid issue caused by no ready DMA HW resource. + */ + if (pm_suspend_in_progress()) + return false; + /* * When the length of data is less than I2C_DMA_THRESHOLD, * cpu mode is used directly to avoid low performance. From 3902714d2a50509b8c8ea862e54fecab74d90b48 Mon Sep 17 00:00:00 2001 From: Thinh Nguyen Date: Fri, 2 Jan 2026 21:53:46 +0000 Subject: [PATCH 1212/1321] usb: dwc3: Check for USB4 IP_NAME Synopsys renamed DWC_usb32 IP to DWC_usb4 as of IP version 1.30. No functional change except checking for the IP_NAME here. The driver will treat the new IP_NAME as if it's DWC_usb32. Additional features for USB4 will be introduced and checked separately. Cc: stable@vger.kernel.org Signed-off-by: Thinh Nguyen Link: https://patch.msgid.link/e6f1827754c7a7ddc5eb7382add20bfe3a9b312f.1767390747.git.Thinh.Nguyen@synopsys.com Signed-off-by: Greg Kroah-Hartman --- drivers/usb/dwc3/core.c | 2 ++ drivers/usb/dwc3/core.h | 1 + 2 files changed, 3 insertions(+) diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c index ec8407972b9df..93fd5fdf95cb1 100644 --- a/drivers/usb/dwc3/core.c +++ b/drivers/usb/dwc3/core.c @@ -993,6 +993,8 @@ static bool dwc3_core_is_valid(struct dwc3 *dwc) reg = dwc3_readl(dwc->regs, DWC3_GSNPSID); dwc->ip = DWC3_GSNPS_ID(reg); + if (dwc->ip == DWC4_IP) + dwc->ip = DWC32_IP; /* This should read as U3 followed by revision number */ if (DWC3_IP_IS(DWC3)) { diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h index a5fc92c4ffa3b..45757169b672f 100644 --- a/drivers/usb/dwc3/core.h +++ b/drivers/usb/dwc3/core.h @@ -1265,6 +1265,7 @@ struct dwc3 { #define DWC3_IP 0x5533 #define DWC31_IP 0x3331 #define DWC32_IP 0x3332 +#define DWC4_IP 0x3430 u32 revision; From d5569db2fb2796883ebf3605b51d6f0796ecf9c1 Mon Sep 17 00:00:00 2001 From: Arnaud Ferraris Date: Mon, 5 Jan 2026 09:43:23 +0100 Subject: [PATCH 1213/1321] tcpm: allow looking for role_sw device in the main node If ports are defined in the tcpc main node, fwnode_usb_role_switch_get() returns an error, meaning usb_role_switch_get() (which would succeed) never gets a chance to run as port->role_sw isn't NULL, causing a regression on devices where this is the case. Fix this by turning the NULL check into IS_ERR_OR_NULL(), so usb_role_switch_get() can actually run and the device get properly probed. Fixes: 2d8713f807a4 ("tcpm: switch check for role_sw device with fw_node") Cc: stable Reviewed-by: Heikki Krogerus Reviewed-by: Dragan Simic Signed-off-by: Arnaud Ferraris Link: https://patch.msgid.link/20260105-fix-ppp-power-v2-1-6924f5a41224@collabora.com Signed-off-by: Greg Kroah-Hartman --- drivers/usb/typec/tcpm/tcpm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/usb/typec/tcpm/tcpm.c b/drivers/usb/typec/tcpm/tcpm.c index 4ca2746ce16bc..be49a976428fc 100644 --- a/drivers/usb/typec/tcpm/tcpm.c +++ b/drivers/usb/typec/tcpm/tcpm.c @@ -7890,7 +7890,7 @@ struct tcpm_port *tcpm_register_port(struct device *dev, struct tcpc_dev *tcpc) port->partner_desc.identity = &port->partner_ident; port->role_sw = fwnode_usb_role_switch_get(tcpc->fwnode); - if (!port->role_sw) + if (IS_ERR_OR_NULL(port->role_sw)) port->role_sw = usb_role_switch_get(port->dev); if (IS_ERR(port->role_sw)) { err = PTR_ERR(port->role_sw); From 843a8bcc3d875d439a3835f2de6fe0038bc6c53c Mon Sep 17 00:00:00 2001 From: Krzysztof Kozlowski Date: Tue, 6 Jan 2026 19:50:13 +0100 Subject: [PATCH 1214/1321] dt-bindings: usb: qcom,dwc3: Correct IPQ5018 interrupts According to reference manual, IPQ5018 does not have QUSB2 PHY and its interrupts should rather match ones used in IPQ5332 (so power_event, eud_dmse_int_mx, eud_dpse_int_mx). Fixes: 53c6d854be4e ("dt-bindings: usb: dwc3: Clean up hs_phy_irq in binding") Fixes: 6e762f7b8edc ("dt-bindings: usb: Introduce qcom,snps-dwc3") Signed-off-by: Krzysztof Kozlowski Link: https://patch.msgid.link/20260106185012.19551-3-krzysztof.kozlowski@oss.qualcomm.com Signed-off-by: Greg Kroah-Hartman --- Documentation/devicetree/bindings/usb/qcom,dwc3.yaml | 2 +- Documentation/devicetree/bindings/usb/qcom,snps-dwc3.yaml | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/devicetree/bindings/usb/qcom,dwc3.yaml b/Documentation/devicetree/bindings/usb/qcom,dwc3.yaml index a792434c59db2..809280b091431 100644 --- a/Documentation/devicetree/bindings/usb/qcom,dwc3.yaml +++ b/Documentation/devicetree/bindings/usb/qcom,dwc3.yaml @@ -406,7 +406,6 @@ allOf: compatible: contains: enum: - - qcom,ipq5018-dwc3 - qcom,ipq6018-dwc3 - qcom,ipq8074-dwc3 - qcom,msm8953-dwc3 @@ -451,6 +450,7 @@ allOf: compatible: contains: enum: + - qcom,ipq5018-dwc3 - qcom,ipq5332-dwc3 then: properties: diff --git a/Documentation/devicetree/bindings/usb/qcom,snps-dwc3.yaml b/Documentation/devicetree/bindings/usb/qcom,snps-dwc3.yaml index 8cee7c5582f2a..3073943c59642 100644 --- a/Documentation/devicetree/bindings/usb/qcom,snps-dwc3.yaml +++ b/Documentation/devicetree/bindings/usb/qcom,snps-dwc3.yaml @@ -420,7 +420,6 @@ allOf: compatible: contains: enum: - - qcom,ipq5018-dwc3 - qcom,ipq6018-dwc3 - qcom,ipq8074-dwc3 - qcom,msm8953-dwc3 @@ -467,6 +466,7 @@ allOf: compatible: contains: enum: + - qcom,ipq5018-dwc3 - qcom,ipq5332-dwc3 then: properties: From b6b39bf2584c68983ec5028a6e6ba2703a72d30f Mon Sep 17 00:00:00 2001 From: Krzysztof Kozlowski Date: Tue, 6 Jan 2026 19:50:14 +0100 Subject: [PATCH 1215/1321] dt-bindings: usb: qcom,dwc3: Correct MSM8994 interrupts According to the reference manual, MSM8994 does have QUSB2 PHY and does not have DP/DM IRQs interrupts. It is also logical it has the same constraints as similar device: MSM8996. This fixes dtbs_check warnings like: msm8994-sony-xperia-kitakami-karin.dtb: usb@f92f8800 (qcom,msm8994-dwc3): interrupt-names:1: 'hs_phy_irq' was expected msm8994-sony-xperia-kitakami-karin.dtb: usb@f92f8800 (qcom,msm8994-dwc3): interrupt-names:2: 'dp_hs_phy_irq' was expected msm8994-sony-xperia-kitakami-karin.dtb: usb@f92f8800 (qcom,msm8994-dwc3): interrupt-names:3: 'dm_hs_phy_irq' was expected Fixes: 53c6d854be4e ("dt-bindings: usb: dwc3: Clean up hs_phy_irq in binding") Fixes: 6e762f7b8edc ("dt-bindings: usb: Introduce qcom,snps-dwc3") Signed-off-by: Krzysztof Kozlowski Link: https://patch.msgid.link/20260106185012.19551-4-krzysztof.kozlowski@oss.qualcomm.com Signed-off-by: Greg Kroah-Hartman --- Documentation/devicetree/bindings/usb/qcom,dwc3.yaml | 2 +- Documentation/devicetree/bindings/usb/qcom,snps-dwc3.yaml | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/devicetree/bindings/usb/qcom,dwc3.yaml b/Documentation/devicetree/bindings/usb/qcom,dwc3.yaml index 809280b091431..a7f58114c02e8 100644 --- a/Documentation/devicetree/bindings/usb/qcom,dwc3.yaml +++ b/Documentation/devicetree/bindings/usb/qcom,dwc3.yaml @@ -427,6 +427,7 @@ allOf: compatible: contains: enum: + - qcom,msm8994-dwc3 - qcom,msm8996-dwc3 - qcom,qcs404-dwc3 - qcom,sdm660-dwc3 @@ -488,7 +489,6 @@ allOf: enum: - qcom,ipq4019-dwc3 - qcom,ipq8064-dwc3 - - qcom,msm8994-dwc3 - qcom,qcs615-dwc3 - qcom,qcs8300-dwc3 - qcom,qdu1000-dwc3 diff --git a/Documentation/devicetree/bindings/usb/qcom,snps-dwc3.yaml b/Documentation/devicetree/bindings/usb/qcom,snps-dwc3.yaml index 3073943c59642..7d784a648b7d9 100644 --- a/Documentation/devicetree/bindings/usb/qcom,snps-dwc3.yaml +++ b/Documentation/devicetree/bindings/usb/qcom,snps-dwc3.yaml @@ -442,6 +442,7 @@ allOf: compatible: contains: enum: + - qcom,msm8994-dwc3 - qcom,msm8996-dwc3 - qcom,qcs404-dwc3 - qcom,sdm660-dwc3 @@ -509,7 +510,6 @@ allOf: - qcom,ipq4019-dwc3 - qcom,ipq8064-dwc3 - qcom,kaanapali-dwc3 - - qcom,msm8994-dwc3 - qcom,qcs615-dwc3 - qcom,qcs8300-dwc3 - qcom,qdu1000-dwc3 From 240f4d5540770acce6e35b4432d5b8c69f334efb Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Johannes=20Br=C3=BCderl?= Date: Sun, 7 Dec 2025 10:02:20 +0100 Subject: [PATCH 1216/1321] usb: core: add USB_QUIRK_NO_BOS for devices that hang on BOS descriptor MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add USB_QUIRK_NO_BOS quirk flag to skip requesting the BOS descriptor for devices that cannot handle it. Add Elgato 4K X (0fd9:009b) to the quirk table. This device hangs when the BOS descriptor is requested at SuperSpeed Plus (10Gbps). Link: https://bugzilla.kernel.org/show_bug.cgi?id=220027 Cc: stable Signed-off-by: Johannes Brüderl Link: https://patch.msgid.link/20251207090220.14807-1-johannes.bruederl@gmail.com Signed-off-by: Greg Kroah-Hartman --- drivers/usb/core/config.c | 5 +++++ drivers/usb/core/quirks.c | 3 +++ include/linux/usb/quirks.h | 3 +++ 3 files changed, 11 insertions(+) diff --git a/drivers/usb/core/config.c b/drivers/usb/core/config.c index baf5bc844b6ff..2bb1ceb9d621a 100644 --- a/drivers/usb/core/config.c +++ b/drivers/usb/core/config.c @@ -1040,6 +1040,11 @@ int usb_get_bos_descriptor(struct usb_device *dev) __u8 cap_type; int ret; + if (dev->quirks & USB_QUIRK_NO_BOS) { + dev_dbg(ddev, "skipping BOS descriptor\n"); + return -ENOMSG; + } + bos = kzalloc(sizeof(*bos), GFP_KERNEL); if (!bos) return -ENOMEM; diff --git a/drivers/usb/core/quirks.c b/drivers/usb/core/quirks.c index 47f589c4104a3..c4d85089d19b1 100644 --- a/drivers/usb/core/quirks.c +++ b/drivers/usb/core/quirks.c @@ -450,6 +450,9 @@ static const struct usb_device_id usb_quirk_list[] = { { USB_DEVICE(0x0c45, 0x7056), .driver_info = USB_QUIRK_IGNORE_REMOTE_WAKEUP }, + /* Elgato 4K X - BOS descriptor fetch hangs at SuperSpeed Plus */ + { USB_DEVICE(0x0fd9, 0x009b), .driver_info = USB_QUIRK_NO_BOS }, + /* Sony Xperia XZ1 Compact (lilac) smartphone in fastboot mode */ { USB_DEVICE(0x0fce, 0x0dde), .driver_info = USB_QUIRK_NO_LPM }, diff --git a/include/linux/usb/quirks.h b/include/linux/usb/quirks.h index 59409c1fc3dee..2f7bd2fdc6164 100644 --- a/include/linux/usb/quirks.h +++ b/include/linux/usb/quirks.h @@ -75,4 +75,7 @@ /* short SET_ADDRESS request timeout */ #define USB_QUIRK_SHORT_SET_ADDRESS_REQ_TIMEOUT BIT(16) +/* skip BOS descriptor request */ +#define USB_QUIRK_NO_BOS BIT(17) + #endif /* __LINUX_USB_QUIRKS_H */ From a65c99bac99eca63d5bce18d8aead2d50e77f417 Mon Sep 17 00:00:00 2001 From: Sven Peter Date: Fri, 9 Jan 2026 17:43:34 +0100 Subject: [PATCH 1217/1321] usb: dwc3: apple: Set USB2 PHY mode before dwc3 init Now that the upstream code has been getting broader test coverage by our users we occasionally see issues with USB2 devices plugged in during boot. Before Linux is running, the USB2 PHY has usually been running in device mode and it turns out that sometimes host->device or device->host transitions don't work. The root cause: If the role inside the USB2 PHY is re-configured when it has already been powered on or when dwc3 has already enabled the ULPI interface the new configuration sometimes doesn't take affect until dwc3 is reset again. Fix this rare issue by configuring the role much earlier. Note that the USB3 PHY does not suffer from this issue and actually requires dwc3 to be up before the correct role can be configured there. Reported-by: James Calligeros Reported-by: Janne Grunau Fixes: 0ec946d32ef7 ("usb: dwc3: Add Apple Silicon DWC3 glue layer driver") Cc: stable Tested-by: Janne Grunau Reviewed-by: Janne Grunau Acked-by: Thinh Nguyen Signed-off-by: Sven Peter Link: https://patch.msgid.link/20260109-dwc3-apple-usb2phy-fix-v2-1-ab6b041e3b26@kernel.org Signed-off-by: Greg Kroah-Hartman --- drivers/usb/dwc3/dwc3-apple.c | 48 ++++++++++++++++++++++++----------- 1 file changed, 33 insertions(+), 15 deletions(-) diff --git a/drivers/usb/dwc3/dwc3-apple.c b/drivers/usb/dwc3/dwc3-apple.c index cc47cad232e39..c2ae8eb21d514 100644 --- a/drivers/usb/dwc3/dwc3-apple.c +++ b/drivers/usb/dwc3/dwc3-apple.c @@ -218,25 +218,31 @@ static int dwc3_apple_core_init(struct dwc3_apple *appledwc) return ret; } -static void dwc3_apple_phy_set_mode(struct dwc3_apple *appledwc, enum phy_mode mode) -{ - lockdep_assert_held(&appledwc->lock); - - /* - * This platform requires SUSPHY to be enabled here already in order to properly configure - * the PHY and switch dwc3's PIPE interface to USB3 PHY. - */ - dwc3_enable_susphy(&appledwc->dwc, true); - phy_set_mode(appledwc->dwc.usb2_generic_phy[0], mode); - phy_set_mode(appledwc->dwc.usb3_generic_phy[0], mode); -} - static int dwc3_apple_init(struct dwc3_apple *appledwc, enum dwc3_apple_state state) { int ret, ret_reset; lockdep_assert_held(&appledwc->lock); + /* + * The USB2 PHY on this platform must be configured for host or device mode while it is + * still powered off and before dwc3 tries to access it. Otherwise, the new configuration + * will sometimes only take affect after the *next* time dwc3 is brought up which causes + * the connected device to just not work. + * The USB3 PHY must be configured later after dwc3 has already been initialized. + */ + switch (state) { + case DWC3_APPLE_HOST: + phy_set_mode(appledwc->dwc.usb2_generic_phy[0], PHY_MODE_USB_HOST); + break; + case DWC3_APPLE_DEVICE: + phy_set_mode(appledwc->dwc.usb2_generic_phy[0], PHY_MODE_USB_DEVICE); + break; + default: + /* Unreachable unless there's a bug in this driver */ + return -EINVAL; + } + ret = reset_control_deassert(appledwc->reset); if (ret) { dev_err(appledwc->dev, "Failed to deassert reset, err=%d\n", ret); @@ -257,7 +263,13 @@ static int dwc3_apple_init(struct dwc3_apple *appledwc, enum dwc3_apple_state st case DWC3_APPLE_HOST: appledwc->dwc.dr_mode = USB_DR_MODE_HOST; dwc3_apple_set_ptrcap(appledwc, DWC3_GCTL_PRTCAP_HOST); - dwc3_apple_phy_set_mode(appledwc, PHY_MODE_USB_HOST); + /* + * This platform requires SUSPHY to be enabled here already in order to properly + * configure the PHY and switch dwc3's PIPE interface to USB3 PHY. The USB2 PHY + * has already been configured to the correct mode earlier. + */ + dwc3_enable_susphy(&appledwc->dwc, true); + phy_set_mode(appledwc->dwc.usb3_generic_phy[0], PHY_MODE_USB_HOST); ret = dwc3_host_init(&appledwc->dwc); if (ret) { dev_err(appledwc->dev, "Failed to initialize host, ret=%d\n", ret); @@ -268,7 +280,13 @@ static int dwc3_apple_init(struct dwc3_apple *appledwc, enum dwc3_apple_state st case DWC3_APPLE_DEVICE: appledwc->dwc.dr_mode = USB_DR_MODE_PERIPHERAL; dwc3_apple_set_ptrcap(appledwc, DWC3_GCTL_PRTCAP_DEVICE); - dwc3_apple_phy_set_mode(appledwc, PHY_MODE_USB_DEVICE); + /* + * This platform requires SUSPHY to be enabled here already in order to properly + * configure the PHY and switch dwc3's PIPE interface to USB3 PHY. The USB2 PHY + * has already been configured to the correct mode earlier. + */ + dwc3_enable_susphy(&appledwc->dwc, true); + phy_set_mode(appledwc->dwc.usb3_generic_phy[0], PHY_MODE_USB_DEVICE); ret = dwc3_gadget_init(&appledwc->dwc); if (ret) { dev_err(appledwc->dev, "Failed to initialize gadget, ret=%d\n", ret); From 67934342631bc5df8a8ae34fc86262b7f37f5d44 Mon Sep 17 00:00:00 2001 From: Huacai Chen Date: Mon, 12 Jan 2026 16:48:02 +0800 Subject: [PATCH 1218/1321] USB: OHCI/UHCI: Add soft dependencies on ehci_platform Commit 9beeee6584b9aa4f ("USB: EHCI: log a warning if ehci-hcd is not loaded first") said that ehci-hcd should be loaded before ohci-hcd and uhci-hcd. However, commit 05c92da0c52494ca ("usb: ohci/uhci - add soft dependencies on ehci_pci") only makes ohci-pci/uhci-pci depend on ehci- pci, which is not enough and we may still see the warnings in boot log. To eliminate the warnings we should make ohci-hcd/uhci-hcd depend on ehci-hcd. But Alan said that the warning introduced by 9beeee6584b9aa4f is bogus, we only need the soft dependencies in the PCI level rather than the HCD level. However, there is really another neccessary soft dependencies between ohci-platform/uhci-platform and ehci-platform, which is added by this patch. The boot logs are below. 1. ohci-platform loaded before ehci-platform: ohci-platform 1f058000.usb: Generic Platform OHCI controller ohci-platform 1f058000.usb: new USB bus registered, assigned bus number 1 ohci-platform 1f058000.usb: irq 28, io mem 0x1f058000 hub 1-0:1.0: USB hub found hub 1-0:1.0: 4 ports detected Warning! ehci_hcd should always be loaded before uhci_hcd and ohci_hcd, not after usb 1-4: new low-speed USB device number 2 using ohci-platform ehci-platform 1f050000.usb: EHCI Host Controller ehci-platform 1f050000.usb: new USB bus registered, assigned bus number 2 ehci-platform 1f050000.usb: irq 29, io mem 0x1f050000 ehci-platform 1f050000.usb: USB 2.0 started, EHCI 1.00 usb 1-4: device descriptor read/all, error -62 hub 2-0:1.0: USB hub found hub 2-0:1.0: 4 ports detected usb 1-4: new low-speed USB device number 3 using ohci-platform input: YSPRINGTECH USB OPTICAL MOUSE as /devices/platform/bus@10000000/1f058000.usb/usb1/1-4/1-4:1.0/0003:10C4:8105.0001/input/input0 hid-generic 0003:10C4:8105.0001: input,hidraw0: USB HID v1.11 Mouse [YSPRINGTECH USB OPTICAL MOUSE] on usb-1f058000.usb-4/input0 2. ehci-platform loaded before ohci-platform: ehci-platform 1f050000.usb: EHCI Host Controller ehci-platform 1f050000.usb: new USB bus registered, assigned bus number 1 ehci-platform 1f050000.usb: irq 28, io mem 0x1f050000 ehci-platform 1f050000.usb: USB 2.0 started, EHCI 1.00 hub 1-0:1.0: USB hub found hub 1-0:1.0: 4 ports detected ohci-platform 1f058000.usb: Generic Platform OHCI controller ohci-platform 1f058000.usb: new USB bus registered, assigned bus number 2 ohci-platform 1f058000.usb: irq 29, io mem 0x1f058000 hub 2-0:1.0: USB hub found hub 2-0:1.0: 4 ports detected usb 2-4: new low-speed USB device number 2 using ohci-platform input: YSPRINGTECH USB OPTICAL MOUSE as /devices/platform/bus@10000000/1f058000.usb/usb2/2-4/2-4:1.0/0003:10C4:8105.0001/input/input0 hid-generic 0003:10C4:8105.0001: input,hidraw0: USB HID v1.11 Mouse [YSPRINGTECH USB OPTICAL MOUSE] on usb-1f058000.usb-4/input0 In the later case, there is no re-connection for USB-1.0/1.1 devices, which is expected. Cc: stable Reported-by: Shengwen Xiao Signed-off-by: Huacai Chen Reviewed-by: Alan Stern Link: https://patch.msgid.link/20260112084802.1995923-1-chenhuacai@loongson.cn Signed-off-by: Greg Kroah-Hartman --- drivers/usb/host/ohci-platform.c | 1 + drivers/usb/host/uhci-platform.c | 1 + 2 files changed, 2 insertions(+) diff --git a/drivers/usb/host/ohci-platform.c b/drivers/usb/host/ohci-platform.c index 2e4bb5cc2165f..c801527d5bd29 100644 --- a/drivers/usb/host/ohci-platform.c +++ b/drivers/usb/host/ohci-platform.c @@ -392,3 +392,4 @@ MODULE_DESCRIPTION(DRIVER_DESC); MODULE_AUTHOR("Hauke Mehrtens"); MODULE_AUTHOR("Alan Stern"); MODULE_LICENSE("GPL"); +MODULE_SOFTDEP("pre: ehci_platform"); diff --git a/drivers/usb/host/uhci-platform.c b/drivers/usb/host/uhci-platform.c index 5e02f2ceafb6e..f4419d4526c43 100644 --- a/drivers/usb/host/uhci-platform.c +++ b/drivers/usb/host/uhci-platform.c @@ -211,3 +211,4 @@ static struct platform_driver uhci_platform_driver = { .of_match_table = platform_uhci_ids, }, }; +MODULE_SOFTDEP("pre: ehci_platform"); From 7f27d188dd83fc6aa628383bb443bf4df877a872 Mon Sep 17 00:00:00 2001 From: Wayne Chang Date: Mon, 12 Jan 2026 22:56:53 +0800 Subject: [PATCH 1219/1321] usb: host: xhci-tegra: Use platform_get_irq_optional() for wake IRQs When some wake IRQs are disabled in the device tree, the corresponding interrupt entries are removed from DT. In such cases, the driver currently calls platform_get_irq(), which returns -ENXIO and logs an error like: tegra-xusb 3610000.usb: error -ENXIO: IRQ index 2 not found However, not all wake IRQs are mandatory. The hardware can operate normally even if some wake sources are not defined in DT. To avoid this false alarm and allow missing wake IRQs gracefully, use platform_get_irq_optional() instead of platform_get_irq(). Fixes: 5df186e2ef11 ("usb: xhci: tegra: Support USB wakeup function for Tegra234") Cc: stable Signed-off-by: Wayne Chang Signed-off-by: Wei-Cheng Chen Reviewed-by: Jon Hunter Tested-by: Jon Hunter Link: https://patch.msgid.link/20260112145653.95691-1-weichengc@nvidia.com Signed-off-by: Greg Kroah-Hartman --- drivers/usb/host/xhci-tegra.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/usb/host/xhci-tegra.c b/drivers/usb/host/xhci-tegra.c index 31ccced5125e8..8b492871d21d6 100644 --- a/drivers/usb/host/xhci-tegra.c +++ b/drivers/usb/host/xhci-tegra.c @@ -1563,7 +1563,7 @@ static int tegra_xusb_setup_wakeup(struct platform_device *pdev, struct tegra_xu for (i = 0; i < tegra->soc->max_num_wakes; i++) { struct irq_data *data; - tegra->wake_irqs[i] = platform_get_irq(pdev, i + WAKE_IRQ_START_INDEX); + tegra->wake_irqs[i] = platform_get_irq_optional(pdev, i + WAKE_IRQ_START_INDEX); if (tegra->wake_irqs[i] < 0) break; From e061d264cbcee8d0c0f5f853a804d3875e614200 Mon Sep 17 00:00:00 2001 From: Janne Grunau Date: Fri, 9 Jan 2026 11:13:48 +0100 Subject: [PATCH 1220/1321] usb: dwc3: apple: Ignore USB role switches to the active role Ignore USB role switches if dwc3-apple is already in the desired state. The USB-C port controller on M2 and M1/M2 Pro/Max/Ultra devices issues additional interrupts which result in USB role switches to the already active role. Ignore these USB role switches to ensure the USB-C port controller and dwc3-apple are always in a consistent state. This matches the behaviour in __dwc3_set_mode() in core.c. Fixes detecting USB 2.0 and 3.x devices on the affected systems. The reset caused by the additional role switch appears to leave the USB devices in a state which prevents detection when the phy and dwc3 is brought back up again. Fixes: 0ec946d32ef7 ("usb: dwc3: Add Apple Silicon DWC3 glue layer driver") Cc: stable Signed-off-by: Janne Grunau Acked-by: Thinh Nguyen Reviewed-by: Sven Peter Tested-by: Sven Peter # M1 mac mini and macbook air Link: https://patch.msgid.link/20260109-apple-dwc3-role-switch-v1-1-11623b0f6222@jannau.net Signed-off-by: Greg Kroah-Hartman --- drivers/usb/dwc3/dwc3-apple.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/drivers/usb/dwc3/dwc3-apple.c b/drivers/usb/dwc3/dwc3-apple.c index c2ae8eb21d514..40c3ccfddb670 100644 --- a/drivers/usb/dwc3/dwc3-apple.c +++ b/drivers/usb/dwc3/dwc3-apple.c @@ -357,6 +357,22 @@ static int dwc3_usb_role_switch_set(struct usb_role_switch *sw, enum usb_role ro guard(mutex)(&appledwc->lock); + /* + * Skip role switches if appledwc is already in the desired state. The + * USB-C port controller on M2 and M1/M2 Pro/Max/Ultra devices issues + * additional interrupts which results in usb_role_switch_set_role() + * calls with the current role. + * Ignore those calls here to ensure the USB-C port controller and + * appledwc are in a consistent state. + * This matches the behaviour in __dwc3_set_mode(). + * Do no handle USB_ROLE_NONE for DWC3_APPLE_NO_CABLE and + * DWC3_APPLE_PROBE_PENDING since that is no-op anyway. + */ + if (appledwc->state == DWC3_APPLE_HOST && role == USB_ROLE_HOST) + return 0; + if (appledwc->state == DWC3_APPLE_DEVICE && role == USB_ROLE_DEVICE) + return 0; + /* * We need to tear all of dwc3 down and re-initialize it every time a cable is * connected or disconnected or when the mode changes. See the documentation for enum From b4d5f635b24d6f6d0d08a0412e00138d7c137898 Mon Sep 17 00:00:00 2001 From: Xu Yang Date: Tue, 13 Jan 2026 17:53:07 +0800 Subject: [PATCH 1221/1321] usb: gadget: uvc: fix req_payload_size calculation Current req_payload_size calculation has 2 issue: (1) When the first time calculate req_payload_size for all the buffers, reqs_per_frame = 0 will be the divisor of DIV_ROUND_UP(). So the result is undefined. This happens because VIDIOC_STREAMON is always executed after VIDIOC_QBUF. So video->reqs_per_frame will be 0 until VIDIOC_STREAMON is run. (2) The buf->req_payload_size may be bigger than max_req_size. Take YUYV pixel format as example: If bInterval = 1, video->interval = 666666, high-speed: video->reqs_per_frame = 666666 / 1250 = 534 720p: buf->req_payload_size = 1843200 / 534 = 3452 1080p: buf->req_payload_size = 4147200 / 534 = 7766 Based on such req_payload_size, the controller can't run normally. To fix above issue, assign max_req_size to buf->req_payload_size when video->reqs_per_frame = 0. And limit buf->req_payload_size to video->req_size if it's large than video->req_size. Since max_req_size is used at many place, add it to struct uvc_video and set the value once endpoint is enabled. Fixes: 98ad03291560 ("usb: gadget: uvc: set req_length based on payload by nreqs instead of req_size") Cc: stable@vger.kernel.org Reviewed-by: Frank Li Signed-off-by: Xu Yang Link: https://patch.msgid.link/20260113-uvc-gadget-fix-patch-v2-1-62950ef5bcb5@nxp.com Signed-off-by: Greg Kroah-Hartman --- drivers/usb/gadget/function/f_uvc.c | 4 ++++ drivers/usb/gadget/function/uvc.h | 1 + drivers/usb/gadget/function/uvc_queue.c | 15 +++++++++++---- drivers/usb/gadget/function/uvc_video.c | 4 +--- 4 files changed, 17 insertions(+), 7 deletions(-) diff --git a/drivers/usb/gadget/function/f_uvc.c b/drivers/usb/gadget/function/f_uvc.c index aa6ab666741a9..a96476507d2fd 100644 --- a/drivers/usb/gadget/function/f_uvc.c +++ b/drivers/usb/gadget/function/f_uvc.c @@ -362,6 +362,10 @@ uvc_function_set_alt(struct usb_function *f, unsigned interface, unsigned alt) return ret; usb_ep_enable(uvc->video.ep); + uvc->video.max_req_size = uvc->video.ep->maxpacket + * max_t(unsigned int, uvc->video.ep->maxburst, 1) + * (uvc->video.ep->mult); + memset(&v4l2_event, 0, sizeof(v4l2_event)); v4l2_event.type = UVC_EVENT_STREAMON; v4l2_event_queue(&uvc->vdev, &v4l2_event); diff --git a/drivers/usb/gadget/function/uvc.h b/drivers/usb/gadget/function/uvc.h index 9e79cbe507157..b3f88670bff80 100644 --- a/drivers/usb/gadget/function/uvc.h +++ b/drivers/usb/gadget/function/uvc.h @@ -117,6 +117,7 @@ struct uvc_video { /* Requests */ bool is_enabled; /* tracks whether video stream is enabled */ unsigned int req_size; + unsigned int max_req_size; struct list_head ureqs; /* all uvc_requests allocated by uvc_video */ /* USB requests that the video pump thread can encode into */ diff --git a/drivers/usb/gadget/function/uvc_queue.c b/drivers/usb/gadget/function/uvc_queue.c index 9a1bbd79ff5af..21d80322cb614 100644 --- a/drivers/usb/gadget/function/uvc_queue.c +++ b/drivers/usb/gadget/function/uvc_queue.c @@ -86,10 +86,17 @@ static int uvc_buffer_prepare(struct vb2_buffer *vb) buf->bytesused = 0; } else { buf->bytesused = vb2_get_plane_payload(vb, 0); - buf->req_payload_size = - DIV_ROUND_UP(buf->bytesused + - (video->reqs_per_frame * UVCG_REQUEST_HEADER_LEN), - video->reqs_per_frame); + + if (video->reqs_per_frame != 0) { + buf->req_payload_size = + DIV_ROUND_UP(buf->bytesused + + (video->reqs_per_frame * UVCG_REQUEST_HEADER_LEN), + video->reqs_per_frame); + if (buf->req_payload_size > video->req_size) + buf->req_payload_size = video->req_size; + } else { + buf->req_payload_size = video->max_req_size; + } } return 0; diff --git a/drivers/usb/gadget/function/uvc_video.c b/drivers/usb/gadget/function/uvc_video.c index fb77b0b217901..1c0672f707e4e 100644 --- a/drivers/usb/gadget/function/uvc_video.c +++ b/drivers/usb/gadget/function/uvc_video.c @@ -503,9 +503,7 @@ uvc_video_prep_requests(struct uvc_video *video) unsigned int max_req_size, req_size, header_size; unsigned int nreq; - max_req_size = video->ep->maxpacket - * max_t(unsigned int, video->ep->maxburst, 1) - * (video->ep->mult); + max_req_size = video->max_req_size; if (!usb_endpoint_xfer_isoc(video->ep->desc)) { video->req_size = max_req_size; From ada5144a7a888689f62f168e74212e974499cc1d Mon Sep 17 00:00:00 2001 From: Xu Yang Date: Tue, 13 Jan 2026 17:53:08 +0800 Subject: [PATCH 1222/1321] usb: gadget: uvc: fix interval_duration calculation According to USB specification: For full-/high-speed isochronous endpoints, the bInterval value is used as the exponent for a 2^(bInterval-1) value. To correctly convert bInterval as interval_duration: interval_duration = 2^(bInterval-1) * frame_interval Because the unit of video->interval is 100ns, add a comment info to make it clear. Fixes: 48dbe731171e ("usb: gadget: uvc: set req_size and n_requests based on the frame interval") Cc: stable@vger.kernel.org Reviewed-by: Frank Li Signed-off-by: Xu Yang Link: https://patch.msgid.link/20260113-uvc-gadget-fix-patch-v2-2-62950ef5bcb5@nxp.com Signed-off-by: Greg Kroah-Hartman --- drivers/usb/gadget/function/uvc.h | 2 +- drivers/usb/gadget/function/uvc_video.c | 7 +++++-- 2 files changed, 6 insertions(+), 3 deletions(-) diff --git a/drivers/usb/gadget/function/uvc.h b/drivers/usb/gadget/function/uvc.h index b3f88670bff80..676419a049762 100644 --- a/drivers/usb/gadget/function/uvc.h +++ b/drivers/usb/gadget/function/uvc.h @@ -107,7 +107,7 @@ struct uvc_video { unsigned int width; unsigned int height; unsigned int imagesize; - unsigned int interval; + unsigned int interval; /* in 100ns units */ struct mutex mutex; /* protects frame parameters */ unsigned int uvc_num_requests; diff --git a/drivers/usb/gadget/function/uvc_video.c b/drivers/usb/gadget/function/uvc_video.c index 1c0672f707e4e..9dc3af16e2f38 100644 --- a/drivers/usb/gadget/function/uvc_video.c +++ b/drivers/usb/gadget/function/uvc_video.c @@ -499,7 +499,7 @@ uvc_video_prep_requests(struct uvc_video *video) { struct uvc_device *uvc = container_of(video, struct uvc_device, video); struct usb_composite_dev *cdev = uvc->func.config->cdev; - unsigned int interval_duration = video->ep->desc->bInterval * 1250; + unsigned int interval_duration; unsigned int max_req_size, req_size, header_size; unsigned int nreq; @@ -513,8 +513,11 @@ uvc_video_prep_requests(struct uvc_video *video) return; } + interval_duration = 2 << (video->ep->desc->bInterval - 1); if (cdev->gadget->speed < USB_SPEED_HIGH) - interval_duration = video->ep->desc->bInterval * 10000; + interval_duration *= 10000; + else + interval_duration *= 1250; nreq = DIV_ROUND_UP(video->interval, interval_duration); From dd26b0b8c571cea8dfc6f044964fd175262eb661 Mon Sep 17 00:00:00 2001 From: Xu Yang Date: Tue, 13 Jan 2026 17:53:09 +0800 Subject: [PATCH 1223/1321] usb: gadget: uvc: return error from uvcg_queue_init() uvcg_queue_init() may fail, but its return value is currently ignored. Propagate the error code from uvcg_queue_init() to correctly report initialization failures. Signed-off-by: Xu Yang Link: https://patch.msgid.link/20260113-uvc-gadget-fix-patch-v2-3-62950ef5bcb5@nxp.com Signed-off-by: Greg Kroah-Hartman --- drivers/usb/gadget/function/uvc_video.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/usb/gadget/function/uvc_video.c b/drivers/usb/gadget/function/uvc_video.c index 9dc3af16e2f38..f568dee08b3b7 100644 --- a/drivers/usb/gadget/function/uvc_video.c +++ b/drivers/usb/gadget/function/uvc_video.c @@ -838,7 +838,6 @@ int uvcg_video_init(struct uvc_video *video, struct uvc_device *uvc) video->interval = 666666; /* Initialize the video buffers queue. */ - uvcg_queue_init(&video->queue, uvc->v4l2_dev.dev->parent, + return uvcg_queue_init(&video->queue, uvc->v4l2_dev.dev->parent, V4L2_BUF_TYPE_VIDEO_OUTPUT, &video->mutex); - return 0; } From 18f39ffa064e98e78fd6422a5a0bf2c5b3114d67 Mon Sep 17 00:00:00 2001 From: Xu Yang Date: Tue, 13 Jan 2026 17:53:10 +0800 Subject: [PATCH 1224/1321] usb: gadget: uvc: retry vb2_reqbufs() with vb_vmalloc_memops if use_sg fail Based on the reality[1][2] that vb2_dma_sg_alloc() can't alloc buffer with device DMA limits, those device will always get below error: "swiotlb buffer is full (sz: 393216 bytes), total 65536 (slots), used 2358 (slots)" and the uvc gadget function can't work at all. The videobuf2-dma-sg.c driver doesn't has a formal improve about this issue till now. For UVC gadget, the videobuf2 subsystem doesn't do dma_map() on vmalloc returned big buffer when allocate the video buffers, however, it do it for dma_sg returned buffer. So the issue happens for vb2_dma_sg_alloc(). To workaround the issue, lets retry vb2_reqbufs() with vb_vmalloc_memops if it fails to allocate buffer with vb2_dma_sg_memops. If use vmalloced buffer, UVC gadget will allocate some small buffers for each usb_request to do dma transfer, then uvc driver will memcopy data from big buffer to small buffer. Link[1]: https://lore.kernel.org/linux-media/20230828075420.2009568-1-anle.pan@nxp.com/ Link[2]: https://lore.kernel.org/linux-media/20230914145812.12851-1-hui.fang@nxp.com/ Signed-off-by: Xu Yang Reviewed-by: Frank Li Link: https://patch.msgid.link/20260113-uvc-gadget-fix-patch-v2-4-62950ef5bcb5@nxp.com Signed-off-by: Greg Kroah-Hartman --- drivers/usb/gadget/function/uvc_queue.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/drivers/usb/gadget/function/uvc_queue.c b/drivers/usb/gadget/function/uvc_queue.c index 21d80322cb614..586e5524c171f 100644 --- a/drivers/usb/gadget/function/uvc_queue.c +++ b/drivers/usb/gadget/function/uvc_queue.c @@ -182,7 +182,15 @@ int uvcg_alloc_buffers(struct uvc_video_queue *queue, { int ret; +retry: ret = vb2_reqbufs(&queue->queue, rb); + if (ret < 0 && queue->use_sg) { + uvc_trace(UVC_TRACE_IOCTL, + "failed to alloc buffer with sg enabled, try non-sg mode\n"); + queue->use_sg = 0; + queue->queue.mem_ops = &vb2_vmalloc_memops; + goto retry; + } return ret ? ret : rb->count; } From 9f3cade8a90f26b0f5dbab32bbbc447962933141 Mon Sep 17 00:00:00 2001 From: Ulrich Mohr Date: Tue, 9 Dec 2025 21:08:41 +0100 Subject: [PATCH 1225/1321] USB: serial: option: add Telit LE910 MBIM composition Add support for Telit LE910 module when operating in MBIM composition with additional ttys. This USB product ID is used by the module when AT#USBCFG is set to 7. 0x1252: MBIM + tty(NMEA) + tty(MODEM) + tty(MODEM) + SAP T: Bus=01 Lev=01 Prnt=01 Port=01 Cnt=01 Dev#= 2 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=1252 Rev=03.18 S: Manufacturer=Android S: Product=LE910C1-EU S: SerialNumber=0123456789ABCDEF C: #Ifs= 6 Cfg#= 1 Atr=a0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim E: Ad=82(I) Atr=03(Int.) MxPS= 64 Ivl=32ms I: If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=84(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=86(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=88(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 5 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=89(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=8a(I) Atr=03(Int.) MxPS= 10 Ivl=32ms Signed-off-by: Ulrich Mohr Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold --- drivers/usb/serial/option.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/usb/serial/option.c b/drivers/usb/serial/option.c index 4c0e5a3ab557b..9f2cc5fb9f456 100644 --- a/drivers/usb/serial/option.c +++ b/drivers/usb/serial/option.c @@ -1505,6 +1505,7 @@ static const struct usb_device_id option_ids[] = { { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x1231, 0xff), /* Telit LE910Cx (RNDIS) */ .driver_info = NCTRL(2) | RSVD(3) }, { USB_DEVICE_AND_INTERFACE_INFO(TELIT_VENDOR_ID, 0x1250, 0xff, 0x00, 0x00) }, /* Telit LE910Cx (rmnet) */ + { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x1252, 0xff) }, /* Telit LE910Cx (MBIM) */ { USB_DEVICE(TELIT_VENDOR_ID, 0x1260), .driver_info = NCTRL(0) | RSVD(1) | RSVD(2) }, { USB_DEVICE(TELIT_VENDOR_ID, 0x1261), From ba6e65784aacbba93b1d0971305d669f587d6ac1 Mon Sep 17 00:00:00 2001 From: Ethan Nelson-Moore Date: Wed, 10 Dec 2025 18:01:17 -0800 Subject: [PATCH 1226/1321] USB: serial: ftdi_sio: add support for PICAXE AXE027 cable The vendor provides instructions to write "0403 bd90" to /sys/bus/usb-serial/drivers/ftdi_sio/new_id; see: https://picaxe.com/docs/picaxe_linux_instructions.pdf Cc: stable@vger.kernel.org Signed-off-by: Ethan Nelson-Moore Signed-off-by: Johan Hovold --- drivers/usb/serial/ftdi_sio.c | 1 + drivers/usb/serial/ftdi_sio_ids.h | 2 ++ 2 files changed, 3 insertions(+) diff --git a/drivers/usb/serial/ftdi_sio.c b/drivers/usb/serial/ftdi_sio.c index fe2f21d85737e..acb48b1c83f7d 100644 --- a/drivers/usb/serial/ftdi_sio.c +++ b/drivers/usb/serial/ftdi_sio.c @@ -848,6 +848,7 @@ static const struct usb_device_id id_table_combined[] = { { USB_DEVICE_INTERFACE_NUMBER(FTDI_VID, LMI_LM3S_DEVEL_BOARD_PID, 1) }, { USB_DEVICE_INTERFACE_NUMBER(FTDI_VID, LMI_LM3S_EVAL_BOARD_PID, 1) }, { USB_DEVICE_INTERFACE_NUMBER(FTDI_VID, LMI_LM3S_ICDI_BOARD_PID, 1) }, + { USB_DEVICE(FTDI_VID, FTDI_AXE027_PID) }, { USB_DEVICE_INTERFACE_NUMBER(FTDI_VID, FTDI_TURTELIZER_PID, 1) }, { USB_DEVICE(RATOC_VENDOR_ID, RATOC_PRODUCT_ID_USB60F) }, { USB_DEVICE(RATOC_VENDOR_ID, RATOC_PRODUCT_ID_SCU18) }, diff --git a/drivers/usb/serial/ftdi_sio_ids.h b/drivers/usb/serial/ftdi_sio_ids.h index 2539b9e2f712c..6c76cfebfd0e4 100644 --- a/drivers/usb/serial/ftdi_sio_ids.h +++ b/drivers/usb/serial/ftdi_sio_ids.h @@ -96,6 +96,8 @@ #define LMI_LM3S_EVAL_BOARD_PID 0xbcd9 #define LMI_LM3S_ICDI_BOARD_PID 0xbcda +#define FTDI_AXE027_PID 0xBD90 /* PICAXE AXE027 USB download cable */ + #define FTDI_TURTELIZER_PID 0xBDC8 /* JTAG/RS-232 adapter by egnite GmbH */ /* OpenDCC (www.opendcc.de) product id */ From 73c5261eea5180709f9315bd0c5c29addcac487b Mon Sep 17 00:00:00 2001 From: "Ji-Ze Hong (Peter Hong)" Date: Fri, 12 Dec 2025 15:08:31 +0800 Subject: [PATCH 1227/1321] USB: serial: f81232: fix incomplete serial port generation The Fintek F81532A/534A/535/536 family relies on the F81534A_CTRL_CMD_ENABLE_PORT (116h) register during initialization to both determine serial port status and control port creation. If the driver experiences fast load/unload cycles, the device state may becomes unstable, resulting in the incomplete generation of serial ports. Performing a dummy read operation on the register prior to the initial write command resolves the issue. This clears the device's stale internal state. Subsequent write operations will correctly generate all serial ports. This patch also removes the retry loop in f81534a_ctrl_set_register() because the stale state has been fixed. Tested on: HygonDM1SLT(Hygon C86 3250 8-core Processor) Signed-off-by: Ji-Ze Hong (Peter Hong) Signed-off-by: Johan Hovold --- drivers/usb/serial/f81232.c | 77 ++++++++++++++++++++++--------------- 1 file changed, 47 insertions(+), 30 deletions(-) diff --git a/drivers/usb/serial/f81232.c b/drivers/usb/serial/f81232.c index 530b77fc2f78d..9262a2ac97f58 100644 --- a/drivers/usb/serial/f81232.c +++ b/drivers/usb/serial/f81232.c @@ -70,7 +70,6 @@ MODULE_DEVICE_TABLE(usb, combined_id_table); #define F81232_REGISTER_REQUEST 0xa0 #define F81232_GET_REGISTER 0xc0 #define F81232_SET_REGISTER 0x40 -#define F81534A_ACCESS_REG_RETRY 2 #define SERIAL_BASE_ADDRESS 0x0120 #define RECEIVE_BUFFER_REGISTER (0x00 + SERIAL_BASE_ADDRESS) @@ -824,36 +823,31 @@ static void f81232_lsr_worker(struct work_struct *work) static int f81534a_ctrl_set_register(struct usb_interface *intf, u16 reg, u16 size, void *val) { - struct usb_device *dev = interface_to_usbdev(intf); - int retry = F81534A_ACCESS_REG_RETRY; - int status; - - while (retry--) { - status = usb_control_msg_send(dev, - 0, - F81232_REGISTER_REQUEST, - F81232_SET_REGISTER, - reg, - 0, - val, - size, - USB_CTRL_SET_TIMEOUT, - GFP_KERNEL); - if (status) { - status = usb_translate_errors(status); - if (status == -EIO) - continue; - } - - break; - } - - if (status) { - dev_err(&intf->dev, "failed to set register 0x%x: %d\n", - reg, status); - } + return usb_control_msg_send(interface_to_usbdev(intf), + 0, + F81232_REGISTER_REQUEST, + F81232_SET_REGISTER, + reg, + 0, + val, + size, + USB_CTRL_SET_TIMEOUT, + GFP_KERNEL); +} - return status; +static int f81534a_ctrl_get_register(struct usb_interface *intf, u16 reg, + u16 size, void *val) +{ + return usb_control_msg_recv(interface_to_usbdev(intf), + 0, + F81232_REGISTER_REQUEST, + F81232_GET_REGISTER, + reg, + 0, + val, + size, + USB_CTRL_GET_TIMEOUT, + GFP_KERNEL); } static int f81534a_ctrl_enable_all_ports(struct usb_interface *intf, bool en) @@ -869,6 +863,29 @@ static int f81534a_ctrl_enable_all_ports(struct usb_interface *intf, bool en) * bit 0~11 : Serial port enable bit. */ if (en) { + /* + * The Fintek F81532A/534A/535/536 family relies on the + * F81534A_CTRL_CMD_ENABLE_PORT (116h) register during + * initialization to both determine serial port status and + * control port creation. + * + * If the driver experiences fast load/unload cycles, the + * device state may becomes unstable, resulting in the + * incomplete generation of serial ports. + * + * Performing a dummy read operation on the register prior + * to the initial write command resolves the issue. + * + * This clears the device's stale internal state. Subsequent + * write operations will correctly generate all serial ports. + */ + status = f81534a_ctrl_get_register(intf, + F81534A_CTRL_CMD_ENABLE_PORT, + sizeof(enable), + enable); + if (status) + return status; + enable[0] = 0xff; enable[1] = 0x8f; } From 93bd88677a598e10b4528dea4f27f6d8eff9b728 Mon Sep 17 00:00:00 2001 From: Mathias Nyman Date: Fri, 16 Jan 2026 01:37:58 +0200 Subject: [PATCH 1228/1321] xhci: sideband: don't dereference freed ring when removing sideband endpoint MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit xhci_sideband_remove_endpoint() incorrecly assumes that the endpoint is running and has a valid transfer ring. Lianqin reported a crash during suspend/wake-up stress testing, and found the cause to be dereferencing a non-existing transfer ring 'ep->ring' during xhci_sideband_remove_endpoint(). The endpoint and its ring may be in unknown state if this function is called after xHCI was reinitialized in resume (lost power), or if device is being re-enumerated, disconnected or endpoint already dropped. Fix this by both removing unnecessary ring access, and by checking ep->ring exists before dereferencing it. Also make sure endpoint is running before attempting to stop it. Remove the xhci_initialize_ring_info() call during sideband endpoint removal as is it only initializes ring structure enqueue, dequeue and cycle state values to their starting values without changing actual hardware enqueue, dequeue and cycle state. Leaving them out of sync is worse than leaving it as it is. The endpoint will get freed in after this in most usecases. If the (audio) class driver want's to reuse the endpoint after offload then it is up to the class driver to ensure endpoint is properly set up. Reported-by: 胡连勤 Closes: https://lore.kernel.org/linux-usb/TYUPR06MB6217B105B059A7730C4F6EC8D2B9A@TYUPR06MB6217.apcprd06.prod.outlook.com/ Tested-by: 胡连勤 Fixes: de66754e9f80 ("xhci: sideband: add initial api to register a secondary interrupter entity") Cc: stable@vger.kernel.org Signed-off-by: Mathias Nyman Link: https://patch.msgid.link/20260115233758.364097-2-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman --- drivers/usb/host/xhci-sideband.c | 1 - drivers/usb/host/xhci.c | 15 ++++++++++++--- 2 files changed, 12 insertions(+), 4 deletions(-) diff --git a/drivers/usb/host/xhci-sideband.c b/drivers/usb/host/xhci-sideband.c index a85f62a73313a..2bd77255032b9 100644 --- a/drivers/usb/host/xhci-sideband.c +++ b/drivers/usb/host/xhci-sideband.c @@ -210,7 +210,6 @@ xhci_sideband_remove_endpoint(struct xhci_sideband *sb, return -ENODEV; __xhci_sideband_remove_endpoint(sb, ep); - xhci_initialize_ring_info(ep->ring); return 0; } diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c index 02c9bfe21ae27..b3ba16b9718cf 100644 --- a/drivers/usb/host/xhci.c +++ b/drivers/usb/host/xhci.c @@ -2898,16 +2898,25 @@ int xhci_stop_endpoint_sync(struct xhci_hcd *xhci, struct xhci_virt_ep *ep, int gfp_t gfp_flags) { struct xhci_command *command; + struct xhci_ep_ctx *ep_ctx; unsigned long flags; - int ret; + int ret = -ENODEV; command = xhci_alloc_command(xhci, true, gfp_flags); if (!command) return -ENOMEM; spin_lock_irqsave(&xhci->lock, flags); - ret = xhci_queue_stop_endpoint(xhci, command, ep->vdev->slot_id, - ep->ep_index, suspend); + + /* make sure endpoint exists and is running before stopping it */ + if (ep->ring) { + ep_ctx = xhci_get_ep_ctx(xhci, ep->vdev->out_ctx, ep->ep_index); + if (GET_EP_CTX_STATE(ep_ctx) == EP_STATE_RUNNING) + ret = xhci_queue_stop_endpoint(xhci, command, + ep->vdev->slot_id, + ep->ep_index, suspend); + } + if (ret < 0) { spin_unlock_irqrestore(&xhci->lock, flags); goto out; From ec6099dd46216fe79e0623360e5db296b74d763c Mon Sep 17 00:00:00 2001 From: Harshit Mogalapalli Date: Sat, 10 Jan 2026 12:19:58 -0800 Subject: [PATCH 1229/1321] soundwire: bus: fix off-by-one when allocating slave IDs ida_alloc_max() interprets its max argument as inclusive. Using SDW_FW_MAX_DEVICES(16) therefore allows an ID of 16 to be allocated, but the IRQ domain created for the bus is sized for IDs 0-15. If 16 is returned, irq_create_mapping() fails and the driver ends up with an invalid IRQ mapping. Limit the allocation to 0-15 by passing SDW_FW_MAX_DEVICES - 1. Reported-by: kernel test robot Reported-by: Dan Carpenter Closes: https://lore.kernel.org/r/202512240450.hlDH3nCs-lkp@intel.com/ Fixes: aab12022b076 ("soundwire: bus: Add internal slave ID and use for IRQs") Signed-off-by: Harshit Mogalapalli Reviewed-by: Charles Keepax Link: https://patch.msgid.link/20260110201959.2523024-1-harshit.m.mogalapalli@oracle.com Signed-off-by: Vinod Koul --- drivers/soundwire/bus_type.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/soundwire/bus_type.c b/drivers/soundwire/bus_type.c index 91e70cb46fb57..5c67c13e57357 100644 --- a/drivers/soundwire/bus_type.c +++ b/drivers/soundwire/bus_type.c @@ -105,7 +105,7 @@ static int sdw_drv_probe(struct device *dev) if (ret) return ret; - ret = ida_alloc_max(&slave->bus->slave_ida, SDW_FW_MAX_DEVICES, GFP_KERNEL); + ret = ida_alloc_max(&slave->bus->slave_ida, SDW_FW_MAX_DEVICES - 1, GFP_KERNEL); if (ret < 0) { dev_err(dev, "Failed to allocated ID: %d\n", ret); return ret; From d30849ad0e00a369fe07c4e3b49b3cb8e3c7283a Mon Sep 17 00:00:00 2001 From: Franz Schnyder Date: Wed, 26 Nov 2025 15:01:33 +0100 Subject: [PATCH 1230/1321] phy: fsl-imx8mq-usb: fix typec orientation switch when built as module Currently, the PHY only registers the typec orientation switch when it is built in. If the typec driver is built as a module, the switch registration is skipped due to the preprocessor condition, causing orientation detection to fail. With commit 45fe729be9a6 ("usb: typec: Stub out typec_switch APIs when CONFIG_TYPEC=n") the preprocessor condition is not needed anymore and the orientation switch is correctly registered for both built-in and module builds. Fixes: b58f0f86fd61 ("phy: fsl-imx8mq-usb: add tca function driver for imx95") Cc: stable@vger.kernel.org Suggested-by: Xu Yang Signed-off-by: Franz Schnyder Reviewed-by: Frank Li Reviewed-by: Xu Yang Link: https://patch.msgid.link/20251126140136.1202241-1-fra.schnyder@gmail.com Signed-off-by: Vinod Koul --- drivers/phy/freescale/phy-fsl-imx8mq-usb.c | 14 -------------- 1 file changed, 14 deletions(-) diff --git a/drivers/phy/freescale/phy-fsl-imx8mq-usb.c b/drivers/phy/freescale/phy-fsl-imx8mq-usb.c index ad8a55012e42f..99d2bdd4bfc88 100644 --- a/drivers/phy/freescale/phy-fsl-imx8mq-usb.c +++ b/drivers/phy/freescale/phy-fsl-imx8mq-usb.c @@ -126,8 +126,6 @@ struct imx8mq_usb_phy { static void tca_blk_orientation_set(struct tca_blk *tca, enum typec_orientation orientation); -#ifdef CONFIG_TYPEC - static int tca_blk_typec_switch_set(struct typec_switch_dev *sw, enum typec_orientation orientation) { @@ -175,18 +173,6 @@ static void tca_blk_put_typec_switch(struct typec_switch_dev *sw) typec_switch_unregister(sw); } -#else - -static struct typec_switch_dev *tca_blk_get_typec_switch(struct platform_device *pdev, - struct imx8mq_usb_phy *imx_phy) -{ - return NULL; -} - -static void tca_blk_put_typec_switch(struct typec_switch_dev *sw) {} - -#endif /* CONFIG_TYPEC */ - static void tca_blk_orientation_set(struct tca_blk *tca, enum typec_orientation orientation) { From cbc0f89ea3f481355ba66ae2cb210ac1c9b7b76d Mon Sep 17 00:00:00 2001 From: Ziyue Zhang Date: Fri, 28 Nov 2025 18:49:23 +0800 Subject: [PATCH 1231/1321] dt-bindings: phy: qcom,sc8280xp-qmp-pcie-phy: Update pcie phy bindings for qcs8300 The gcc_aux_clk is not required by the PCIe PHY on qcs8300 and is not specified in the device tree node. Hence, move the qcs8300 phy compatibility entry into the list of PHYs that require six clocks. Removed the phy_aux clock from the PCIe PHY binding as it is no longer used by any instance. Fixes: e46e59b77a9e ("dt-bindings: phy: qcom,sc8280xp-qmp-pcie-phy: Document the QCS8300 QMP PCIe PHY Gen4 x2") Signed-off-by: Ziyue Zhang Acked-by: Manivannan Sadhasivam Acked-by: Rob Herring (Arm) Reviewed-by: Johan Hovold Link: https://patch.msgid.link/20251128104928.4070050-2-ziyue.zhang@oss.qualcomm.com Signed-off-by: Vinod Koul --- .../phy/qcom,sc8280xp-qmp-pcie-phy.yaml | 17 ++--------------- 1 file changed, 2 insertions(+), 15 deletions(-) diff --git a/Documentation/devicetree/bindings/phy/qcom,sc8280xp-qmp-pcie-phy.yaml b/Documentation/devicetree/bindings/phy/qcom,sc8280xp-qmp-pcie-phy.yaml index 48bd11410e8c2..f5068df20cfe6 100644 --- a/Documentation/devicetree/bindings/phy/qcom,sc8280xp-qmp-pcie-phy.yaml +++ b/Documentation/devicetree/bindings/phy/qcom,sc8280xp-qmp-pcie-phy.yaml @@ -56,7 +56,7 @@ properties: clocks: minItems: 5 - maxItems: 7 + maxItems: 6 clock-names: minItems: 5 @@ -67,7 +67,6 @@ properties: - enum: [rchng, refgen] - const: pipe - const: pipediv2 - - const: phy_aux power-domains: maxItems: 1 @@ -180,6 +179,7 @@ allOf: contains: enum: - qcom,glymur-qmp-gen5x4-pcie-phy + - qcom,qcs8300-qmp-gen4x2-pcie-phy - qcom,sa8775p-qmp-gen4x2-pcie-phy - qcom,sa8775p-qmp-gen4x4-pcie-phy - qcom,sc8280xp-qmp-gen3x1-pcie-phy @@ -197,19 +197,6 @@ allOf: clock-names: minItems: 6 - - if: - properties: - compatible: - contains: - enum: - - qcom,qcs8300-qmp-gen4x2-pcie-phy - then: - properties: - clocks: - minItems: 7 - clock-names: - minItems: 7 - - if: properties: compatible: From 3c406f320a287ddbed1c40fed0d0d2fa6c3ce937 Mon Sep 17 00:00:00 2001 From: Stefano Radaelli Date: Fri, 19 Dec 2025 17:09:12 +0100 Subject: [PATCH 1232/1321] phy: fsl-imx8mq-usb: Clear the PCS_TX_SWING_FULL field before using it Clear the PCS_TX_SWING_FULL field mask before setting the new value in PHY_CTRL5 register. Without clearing the mask first, the OR operation could leave previously set bits, resulting in incorrect register configuration. Fixes: 63c85ad0cd81 ("phy: fsl-imx8mp-usb: add support for phy tuning") Suggested-by: Leonid Segal Acked-by: Pierluigi Passaro Signed-off-by: Stefano Radaelli Reviewed-by: Xu Yang Reviewed-by: Frank Li Reviewed-by: Fabio Estevam Reviewed-by: Ahmad Fatoum Link: https://patch.msgid.link/20251219160912.561431-1-stefano.r@variscite.com Signed-off-by: Vinod Koul --- drivers/phy/freescale/phy-fsl-imx8mq-usb.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/phy/freescale/phy-fsl-imx8mq-usb.c b/drivers/phy/freescale/phy-fsl-imx8mq-usb.c index 99d2bdd4bfc88..91b3e62743d3a 100644 --- a/drivers/phy/freescale/phy-fsl-imx8mq-usb.c +++ b/drivers/phy/freescale/phy-fsl-imx8mq-usb.c @@ -490,6 +490,7 @@ static void imx8m_phy_tune(struct imx8mq_usb_phy *imx_phy) if (imx_phy->pcs_tx_swing_full != PHY_TUNE_DEFAULT) { value = readl(imx_phy->base + PHY_CTRL5); + value &= ~PHY_CTRL5_PCS_TX_SWING_FULL_MASK; value |= FIELD_PREP(PHY_CTRL5_PCS_TX_SWING_FULL_MASK, imx_phy->pcs_tx_swing_full); writel(value, imx_phy->base + PHY_CTRL5); From f79b320dfdcb06bfee0aa66fc1c603774e843447 Mon Sep 17 00:00:00 2001 From: Loic Poulain Date: Fri, 19 Dec 2025 09:56:40 +0100 Subject: [PATCH 1233/1321] phy: qcom-qusb2: Fix NULL pointer dereference on early suspend Enabling runtime PM before attaching the QPHY instance as driver data can lead to a NULL pointer dereference in runtime PM callbacks that expect valid driver data. There is a small window where the suspend callback may run after PM runtime enabling and before runtime forbid. This causes a sporadic crash during boot: ``` Unable to handle kernel NULL pointer dereference at virtual address 00000000000000a1 [...] CPU: 0 UID: 0 PID: 11 Comm: kworker/0:1 Not tainted 6.16.7+ #116 PREEMPT Workqueue: pm pm_runtime_work pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : qusb2_phy_runtime_suspend+0x14/0x1e0 [phy_qcom_qusb2] lr : pm_generic_runtime_suspend+0x2c/0x44 [...] ``` Attach the QPHY instance as driver data before enabling runtime PM to prevent NULL pointer dereference in runtime PM callbacks. Reorder pm_runtime_enable() and pm_runtime_forbid() to prevent a short window where an unnecessary runtime suspend can occur. Use the devres-managed version to ensure PM runtime is symmetrically disabled during driver removal for proper cleanup. Fixes: 891a96f65ac3 ("phy: qcom-qusb2: Add support for runtime PM") Signed-off-by: Loic Poulain Reviewed-by: Dmitry Baryshkov Reviewed-by: Abel Vesa Link: https://patch.msgid.link/20251219085640.114473-1-loic.poulain@oss.qualcomm.com Signed-off-by: Vinod Koul --- drivers/phy/qualcomm/phy-qcom-qusb2.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/drivers/phy/qualcomm/phy-qcom-qusb2.c b/drivers/phy/qualcomm/phy-qcom-qusb2.c index b5514a32ff8ff..eb93015be841f 100644 --- a/drivers/phy/qualcomm/phy-qcom-qusb2.c +++ b/drivers/phy/qualcomm/phy-qcom-qusb2.c @@ -1093,29 +1093,29 @@ static int qusb2_phy_probe(struct platform_device *pdev) or->hsdisc_trim.override = true; } - pm_runtime_set_active(dev); - pm_runtime_enable(dev); + dev_set_drvdata(dev, qphy); + /* - * Prevent runtime pm from being ON by default. Users can enable - * it using power/control in sysfs. + * Enable runtime PM support, but forbid it by default. + * Users can allow it again via the power/control attribute in sysfs. */ + pm_runtime_set_active(dev); pm_runtime_forbid(dev); + ret = devm_pm_runtime_enable(dev); + if (ret) + return ret; generic_phy = devm_phy_create(dev, NULL, &qusb2_phy_gen_ops); if (IS_ERR(generic_phy)) { ret = PTR_ERR(generic_phy); dev_err(dev, "failed to create phy, %d\n", ret); - pm_runtime_disable(dev); return ret; } qphy->phy = generic_phy; - dev_set_drvdata(dev, qphy); phy_set_drvdata(generic_phy, qphy); phy_provider = devm_of_phy_provider_register(dev, of_phy_simple_xlate); - if (IS_ERR(phy_provider)) - pm_runtime_disable(dev); return PTR_ERR_OR_ZERO(phy_provider); } From 1f62ce9912b7c889b3a2eb3efd51d1e67f013a26 Mon Sep 17 00:00:00 2001 From: Dan Carpenter Date: Tue, 9 Dec 2025 09:53:36 +0300 Subject: [PATCH 1234/1321] phy: stm32-usphyc: Fix off by one in probe() The "index" variable is used as an index into the usbphyc->phys[] array which has usbphyc->nphys elements. So if it is equal to usbphyc->nphys then it is one element out of bounds. The "index" comes from the device tree so it's data that we trust and it's unlikely to be wrong, however it's obviously still worth fixing the bug. Change the > to >=. Fixes: 94c358da3a05 ("phy: stm32: add support for STM32 USB PHY Controller (USBPHYC)") Signed-off-by: Dan Carpenter Reviewed-by: Amelie Delaunay Link: https://patch.msgid.link/aTfHcMJK1wFVnvEe@stanley.mountain Signed-off-by: Vinod Koul --- drivers/phy/st/phy-stm32-usbphyc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/phy/st/phy-stm32-usbphyc.c b/drivers/phy/st/phy-stm32-usbphyc.c index 27fe92f73f331..b44afbff8616b 100644 --- a/drivers/phy/st/phy-stm32-usbphyc.c +++ b/drivers/phy/st/phy-stm32-usbphyc.c @@ -712,7 +712,7 @@ static int stm32_usbphyc_probe(struct platform_device *pdev) } ret = of_property_read_u32(child, "reg", &index); - if (ret || index > usbphyc->nphys) { + if (ret || index >= usbphyc->nphys) { dev_err(&phy->dev, "invalid reg property: %d\n", ret); if (!ret) ret = -EINVAL; From 6433949696fae62bbfae24bf3b06107a9e6a7bb6 Mon Sep 17 00:00:00 2001 From: Haotian Zhang Date: Mon, 24 Nov 2025 18:57:34 +0800 Subject: [PATCH 1235/1321] phy: ti: da8xx-usb: Handle devm_pm_runtime_enable() errors devm_pm_runtime_enable() can fail due to memory allocation. The current code ignores its return value after calling pm_runtime_set_active(), leaving the device in an inconsistent state if runtime PM initialization fails. Check the return value of devm_pm_runtime_enable() and return on failure. Also move the declaration of 'ret' to the function scope to support this check. Fixes: ee8e41b5044f ("phy: ti: phy-da8xx-usb: Add runtime PM support") Suggested-by: Neil Armstrong Signed-off-by: Haotian Zhang Reviewed-by: Neil Armstrong Link: https://patch.msgid.link/20251124105734.1027-1-vulab@iscas.ac.cn Signed-off-by: Vinod Koul --- drivers/phy/ti/phy-da8xx-usb.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/phy/ti/phy-da8xx-usb.c b/drivers/phy/ti/phy-da8xx-usb.c index 1d81a1e6ec6b6..62fa6f89c0e61 100644 --- a/drivers/phy/ti/phy-da8xx-usb.c +++ b/drivers/phy/ti/phy-da8xx-usb.c @@ -180,6 +180,7 @@ static int da8xx_usb_phy_probe(struct platform_device *pdev) struct da8xx_usb_phy_platform_data *pdata = dev->platform_data; struct device_node *node = dev->of_node; struct da8xx_usb_phy *d_phy; + int ret; d_phy = devm_kzalloc(dev, sizeof(*d_phy), GFP_KERNEL); if (!d_phy) @@ -233,8 +234,6 @@ static int da8xx_usb_phy_probe(struct platform_device *pdev) return PTR_ERR(d_phy->phy_provider); } } else { - int ret; - ret = phy_create_lookup(d_phy->usb11_phy, "usb-phy", "ohci-da8xx"); if (ret) @@ -249,7 +248,9 @@ static int da8xx_usb_phy_probe(struct platform_device *pdev) PHY_INIT_BITS, PHY_INIT_BITS); pm_runtime_set_active(dev); - devm_pm_runtime_enable(dev); + ret = devm_pm_runtime_enable(dev); + if (ret) + return ret; /* * Prevent runtime pm from being ON by default. Users can enable * it using power/control in sysfs. From 8258ce527c54616a3aeebdc27c7f806186dd53ec Mon Sep 17 00:00:00 2001 From: Robert Marko Date: Fri, 31 Oct 2025 13:18:12 +0100 Subject: [PATCH 1236/1321] phy: sparx5-serdes: make it selectable for ARCH_LAN969X LAN969x uses the SparX-5 SERDES driver, so make it selectable for ARCH_LAN969X. Reviewed-by: Daniel Machon Signed-off-by: Robert Marko Tested-by: Gabor Juhos Tested-by: Vladimir Oltean Link: https://patch.msgid.link/20251031121834.665987-1-robert.marko@sartura.hr Signed-off-by: Vinod Koul --- drivers/phy/microchip/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/phy/microchip/Kconfig b/drivers/phy/microchip/Kconfig index 2f0045e874ac8..2e6d1224711e3 100644 --- a/drivers/phy/microchip/Kconfig +++ b/drivers/phy/microchip/Kconfig @@ -6,7 +6,7 @@ config PHY_SPARX5_SERDES tristate "Microchip Sparx5 SerDes PHY driver" select GENERIC_PHY - depends on ARCH_SPARX5 || COMPILE_TEST + depends on ARCH_SPARX5 || ARCH_LAN969X || COMPILE_TEST depends on OF depends on HAS_IOMEM help From cde4bfb84f5d15b48cc700a5c81e1bd44f691ad8 Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Thu, 27 Nov 2025 14:48:34 +0100 Subject: [PATCH 1237/1321] phy: ti: gmii-sel: fix regmap leak on probe failure The mmio regmap that may be allocated during probe is never freed. Switch to using the device managed allocator so that the regmap is released on probe failures (e.g. probe deferral) and on driver unbind. Fixes: 5ab90f40121a ("phy: ti: gmii-sel: Do not use syscon helper to build regmap") Cc: stable@vger.kernel.org # 6.14 Cc: Andrew Davis Signed-off-by: Johan Hovold Acked-by: Andrew Davis Link: https://patch.msgid.link/20251127134834.2030-1-johan@kernel.org Signed-off-by: Vinod Koul --- drivers/phy/ti/phy-gmii-sel.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/phy/ti/phy-gmii-sel.c b/drivers/phy/ti/phy-gmii-sel.c index 6cfe2538d15b1..6213c2b6005a5 100644 --- a/drivers/phy/ti/phy-gmii-sel.c +++ b/drivers/phy/ti/phy-gmii-sel.c @@ -512,7 +512,7 @@ static int phy_gmii_sel_probe(struct platform_device *pdev) return dev_err_probe(dev, PTR_ERR(base), "failed to get base memory resource\n"); - priv->regmap = regmap_init_mmio(dev, base, &phy_gmii_sel_regmap_cfg); + priv->regmap = devm_regmap_init_mmio(dev, base, &phy_gmii_sel_regmap_cfg); if (IS_ERR(priv->regmap)) return dev_err_probe(dev, PTR_ERR(priv->regmap), "Failed to get syscon\n"); From c6c715c0cf6fc865eed177b3352da971ff07b6f3 Mon Sep 17 00:00:00 2001 From: Louis Chauvet Date: Thu, 27 Nov 2025 11:26:16 +0100 Subject: [PATCH 1238/1321] phy: rockchip: inno-usb2: fix disconnection in gadget mode MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit When the OTG USB port is used to power the SoC, configured as peripheral and used in gadget mode, there is a disconnection about 6 seconds after the gadget is configured and enumerated. The problem was observed on a Radxa Rock Pi S board, which can only be powered by the only USB-C connector. That connector is the only one usable in gadget mode. This implies the USB cable is connected from before boot and never disconnects while the kernel runs. The problem happens because of the PHY driver code flow, summarized as: * UDC start code (triggered via configfs at any time after boot) -> phy_init -> rockchip_usb2phy_init -> schedule_delayed_work(otg_sm_work [A], 6 sec) -> phy_power_on -> rockchip_usb2phy_power_on -> enable clock -> rockchip_usb2phy_reset * Now the gadget interface is up and running. * 6 seconds later otg_sm_work starts [A] -> rockchip_usb2phy_otg_sm_work(): if (B_IDLE state && VBUS present && ...): schedule_delayed_work(&rport->chg_work [B], 0); * immediately the chg_detect_work starts [B] -> rockchip_chg_detect_work(): if chg_state is UNDEFINED: if (!rport->suspended): rockchip_usb2phy_power_off() <--- [X] At [X], the PHY is powered off, causing a disconnection. This quickly triggers a new connection and following re-enumeration, but any connection that had been established during the 6 seconds is broken. The code already checks for !rport->suspended (which, somewhat counter-intuitively, means the PHY is powered on), so add a guard for VBUS as well to avoid a disconnection when a cable is connected. Fixes: 98898f3bc83c ("phy: rockchip-inno-usb2: support otg-port for rk3399") Cc: stable@vger.kernel.org Closes: https://lore.kernel.org/lkml/20250414185458.7767aabc@booty/ Signed-off-by: Louis Chauvet Co-developed-by: Luca Ceresoli Signed-off-by: Luca Ceresoli Reviewed-by: Théo Lebrun Link: https://patch.msgid.link/20251127-rk3308-fix-usb-gadget-phy-disconnect-v2-1-dac8a02cd2ca@bootlin.com Signed-off-by: Vinod Koul --- drivers/phy/rockchip/phy-rockchip-inno-usb2.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/phy/rockchip/phy-rockchip-inno-usb2.c b/drivers/phy/rockchip/phy-rockchip-inno-usb2.c index b0f23690ec300..0106d7b7ae24e 100644 --- a/drivers/phy/rockchip/phy-rockchip-inno-usb2.c +++ b/drivers/phy/rockchip/phy-rockchip-inno-usb2.c @@ -821,14 +821,16 @@ static void rockchip_chg_detect_work(struct work_struct *work) container_of(work, struct rockchip_usb2phy_port, chg_work.work); struct rockchip_usb2phy *rphy = dev_get_drvdata(rport->phy->dev.parent); struct regmap *base = get_reg_base(rphy); - bool is_dcd, tmout, vout; + bool is_dcd, tmout, vout, vbus_attach; unsigned long delay; + vbus_attach = property_enabled(rphy->grf, &rport->port_cfg->utmi_bvalid); + dev_dbg(&rport->phy->dev, "chg detection work state = %d\n", rphy->chg_state); switch (rphy->chg_state) { case USB_CHG_STATE_UNDEFINED: - if (!rport->suspended) + if (!rport->suspended && !vbus_attach) rockchip_usb2phy_power_off(rport->phy); /* put the controller in non-driving mode */ property_enable(base, &rphy->phy_cfg->chg_det.opmode, false); From f2b948e0e6da0edd353793ae8aa2982f5855b7a0 Mon Sep 17 00:00:00 2001 From: Luca Ceresoli Date: Thu, 27 Nov 2025 11:26:17 +0100 Subject: [PATCH 1239/1321] phy: rockchip: inno-usb2: fix communication disruption in gadget mode MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit When the OTG USB port is used to power to SoC, configured as peripheral and used in gadget mode, communication stops without notice about 6 seconds after the gadget is configured and enumerated. The problem was observed on a Radxa Rock Pi S board, which can only be powered by the only USB-C connector. That connector is the only one usable in gadget mode. This implies the USB cable is connected from before boot and never disconnects while the kernel runs. The related code flow in the PHY driver code can be summarized as: * the first time chg_detect_work starts (6 seconds after gadget is configured and enumerated) -> rockchip_chg_detect_work(): if chg_state is UNDEFINED: property_enable(base, &rphy->phy_cfg->chg_det.opmode, false); [Y] * rockchip_chg_detect_work() changes state and re-triggers itself a few times until it reaches the DETECTED state: -> rockchip_chg_detect_work(): if chg_state is DETECTED: property_enable(base, &rphy->phy_cfg->chg_det.opmode, true); [Z] At [Y] all existing communications stop. E.g. using a CDC serial gadget, the /dev/tty* devices are still present on both host and device, but no data is transferred anymore. The later call with a 'true' argument at [Z] does not restore it. Due to the lack of documentation, what chg_det.opmode does exactly is not clear, however by code inspection it seems reasonable that is disables something needed to keep the communication working, and testing proves that disabling these lines lets gadget mode keep working. So prevent changes to chg_det.opmode when there is a cable connected (VBUS present). Fixes: 98898f3bc83c ("phy: rockchip-inno-usb2: support otg-port for rk3399") Cc: stable@vger.kernel.org Closes: https://lore.kernel.org/lkml/20250414185458.7767aabc@booty/ Signed-off-by: Luca Ceresoli Reviewed-by: Théo Lebrun Link: https://patch.msgid.link/20251127-rk3308-fix-usb-gadget-phy-disconnect-v2-2-dac8a02cd2ca@bootlin.com Signed-off-by: Vinod Koul --- drivers/phy/rockchip/phy-rockchip-inno-usb2.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/phy/rockchip/phy-rockchip-inno-usb2.c b/drivers/phy/rockchip/phy-rockchip-inno-usb2.c index 0106d7b7ae24e..e5efae7b01359 100644 --- a/drivers/phy/rockchip/phy-rockchip-inno-usb2.c +++ b/drivers/phy/rockchip/phy-rockchip-inno-usb2.c @@ -833,7 +833,8 @@ static void rockchip_chg_detect_work(struct work_struct *work) if (!rport->suspended && !vbus_attach) rockchip_usb2phy_power_off(rport->phy); /* put the controller in non-driving mode */ - property_enable(base, &rphy->phy_cfg->chg_det.opmode, false); + if (!vbus_attach) + property_enable(base, &rphy->phy_cfg->chg_det.opmode, false); /* Start DCD processing stage 1 */ rockchip_chg_enable_dcd(rphy, true); rphy->chg_state = USB_CHG_STATE_WAIT_FOR_DCD; @@ -896,7 +897,8 @@ static void rockchip_chg_detect_work(struct work_struct *work) fallthrough; case USB_CHG_STATE_DETECTED: /* put the controller in normal mode */ - property_enable(base, &rphy->phy_cfg->chg_det.opmode, true); + if (!vbus_attach) + property_enable(base, &rphy->phy_cfg->chg_det.opmode, true); rockchip_usb2phy_otg_sm_work(&rport->otg_sm_work.work); dev_dbg(&rport->phy->dev, "charger = %s\n", chg_to_string(rphy->chg_type)); From 40b8c7595ba7b9f843ca8b099d88ee819bb0e661 Mon Sep 17 00:00:00 2001 From: Wayne Chang Date: Fri, 12 Dec 2025 11:21:16 +0800 Subject: [PATCH 1240/1321] phy: tegra: xusb: Explicitly configure HS_DISCON_LEVEL to 0x7 The USB2 Bias Pad Control register manages analog parameters for signal detection. Previously, the HS_DISCON_LEVEL relied on hardware reset values, which may lead to the detection failure. Explicitly configure HS_DISCON_LEVEL to 0x7. This ensures the disconnect threshold is sufficient to guarantee reliable detection. Fixes: bbf711682cd5 ("phy: tegra: xusb: Add Tegra186 support") Cc: stable@vger.kernel.org Signed-off-by: Wayne Chang Link: https://patch.msgid.link/20251212032116.768307-1-waynec@nvidia.com Signed-off-by: Vinod Koul --- drivers/phy/tegra/xusb-tegra186.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/phy/tegra/xusb-tegra186.c b/drivers/phy/tegra/xusb-tegra186.c index e818f6c3980e6..bec9616c4a2e0 100644 --- a/drivers/phy/tegra/xusb-tegra186.c +++ b/drivers/phy/tegra/xusb-tegra186.c @@ -84,6 +84,7 @@ #define XUSB_PADCTL_USB2_BIAS_PAD_CTL0 0x284 #define BIAS_PAD_PD BIT(11) #define HS_SQUELCH_LEVEL(x) (((x) & 0x7) << 0) +#define HS_DISCON_LEVEL(x) (((x) & 0x7) << 3) #define XUSB_PADCTL_USB2_BIAS_PAD_CTL1 0x288 #define USB2_TRK_START_TIMER(x) (((x) & 0x7f) << 12) @@ -623,6 +624,8 @@ static void tegra186_utmi_bias_pad_power_on(struct tegra_xusb_padctl *padctl) value &= ~BIAS_PAD_PD; value &= ~HS_SQUELCH_LEVEL(~0); value |= HS_SQUELCH_LEVEL(priv->calib.hs_squelch); + value &= ~HS_DISCON_LEVEL(~0); + value |= HS_DISCON_LEVEL(0x7); padctl_writel(padctl, value, XUSB_PADCTL_USB2_BIAS_PAD_CTL0); udelay(1); From 68f8300f56ee48555f45e8633b27029d46f1105e Mon Sep 17 00:00:00 2001 From: Krzysztof Kozlowski Date: Wed, 24 Dec 2025 12:55:34 +0100 Subject: [PATCH 1241/1321] phy: broadcom: ns-usb3: Fix Wvoid-pointer-to-enum-cast warning (again) "family" is an enum, thus cast of pointer on 64-bit compile test with clang W=1 causes: phy-bcm-ns-usb3.c:206:17: error: cast to smaller integer type 'enum bcm_ns_family' from 'const void *' [-Werror,-Wvoid-pointer-to-enum-cast] This was already fixed in commit bd6e74a2f0a0 ("phy: broadcom: ns-usb3: fix Wvoid-pointer-to-enum-cast warning") but then got bad in commit 21bf6fc47a1e ("phy: Use device_get_match_data()"). Note that after various discussions the preferred cast is via "unsigned long", not "uintptr_t". Fixes: 21bf6fc47a1e ("phy: Use device_get_match_data()") Signed-off-by: Krzysztof Kozlowski Link: https://patch.msgid.link/20251224115533.154162-2-krzysztof.kozlowski@oss.qualcomm.com Signed-off-by: Vinod Koul --- drivers/phy/broadcom/phy-bcm-ns-usb3.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/phy/broadcom/phy-bcm-ns-usb3.c b/drivers/phy/broadcom/phy-bcm-ns-usb3.c index 9f995e156f755..6e56498d0644b 100644 --- a/drivers/phy/broadcom/phy-bcm-ns-usb3.c +++ b/drivers/phy/broadcom/phy-bcm-ns-usb3.c @@ -203,7 +203,7 @@ static int bcm_ns_usb3_mdio_probe(struct mdio_device *mdiodev) usb3->dev = dev; usb3->mdiodev = mdiodev; - usb3->family = (enum bcm_ns_family)device_get_match_data(dev); + usb3->family = (unsigned long)device_get_match_data(dev); syscon_np = of_parse_phandle(dev->of_node, "usb3-dmp-syscon", 0); err = of_address_to_resource(syscon_np, 0, &res); From dd76cd2f56e946db87d40245028c2053e60ec2bd Mon Sep 17 00:00:00 2001 From: Wentao Liang Date: Fri, 9 Jan 2026 15:46:26 +0000 Subject: [PATCH 1242/1321] phy: rockchip: inno-usb2: Fix a double free bug in rockchip_usb2phy_probe() The for_each_available_child_of_node() calls of_node_put() to release child_np in each success loop. After breaking from the loop with the child_np has been released, the code will jump to the put_child label and will call the of_node_put() again if the devm_request_threaded_irq() fails. These cause a double free bug. Fix by returning directly to avoid the duplicate of_node_put(). Fixes: ed2b5a8e6b98 ("phy: phy-rockchip-inno-usb2: support muxed interrupts") Cc: stable@vger.kernel.org Signed-off-by: Wentao Liang Reviewed-by: Neil Armstrong Link: https://patch.msgid.link/20260109154626.2452034-1-vulab@iscas.ac.cn Signed-off-by: Vinod Koul --- drivers/phy/rockchip/phy-rockchip-inno-usb2.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/phy/rockchip/phy-rockchip-inno-usb2.c b/drivers/phy/rockchip/phy-rockchip-inno-usb2.c index e5efae7b01359..8f4c08e599aa2 100644 --- a/drivers/phy/rockchip/phy-rockchip-inno-usb2.c +++ b/drivers/phy/rockchip/phy-rockchip-inno-usb2.c @@ -1495,7 +1495,7 @@ static int rockchip_usb2phy_probe(struct platform_device *pdev) rphy); if (ret) { dev_err_probe(rphy->dev, ret, "failed to request usb2phy irq handle\n"); - goto put_child; + return ret; } } From ee9293be87ed2027188e7d26bbb8d97bb4a8019b Mon Sep 17 00:00:00 2001 From: Rafael Beims Date: Tue, 23 Dec 2025 12:02:54 -0300 Subject: [PATCH 1243/1321] phy: freescale: imx8m-pcie: assert phy reset during power on After U-Boot initializes PCIe with "pcie enum", Linux fails to detect an NVMe disk on some boot cycles with: phy phy-32f00000.pcie-phy.0: phy poweron failed --> -110 Discussion with NXP identified that the iMX8MP PCIe PHY PLL may fail to lock when re-initialized without a reset cycle [1]. The issue reproduces on 7% of tested hardware platforms, with a 30-40% failure rate per affected device across boot cycles. Insert a reset cycle in the power-on routine to ensure the PHY is initialized from a known state. [1] https://community.nxp.com/t5/i-MX-Processors/iMX8MP-PCIe-initialization-in-U-Boot/m-p/2248437#M242401 Signed-off-by: Rafael Beims Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251223150254.1075221-1-rafael@beims.me Signed-off-by: Vinod Koul --- drivers/phy/freescale/phy-fsl-imx8m-pcie.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/phy/freescale/phy-fsl-imx8m-pcie.c b/drivers/phy/freescale/phy-fsl-imx8m-pcie.c index 68fcc8114d750..7f5600103a001 100644 --- a/drivers/phy/freescale/phy-fsl-imx8m-pcie.c +++ b/drivers/phy/freescale/phy-fsl-imx8m-pcie.c @@ -89,7 +89,8 @@ static int imx8_pcie_phy_power_on(struct phy *phy) writel(imx8_phy->tx_deemph_gen2, imx8_phy->base + PCIE_PHY_TRSV_REG6); break; - case IMX8MP: /* Do nothing. */ + case IMX8MP: + reset_control_assert(imx8_phy->reset); break; } From bf5a4f6b65c1adef6f5c191878d5b05f8e34a606 Mon Sep 17 00:00:00 2001 From: Guodong Xu Date: Thu, 18 Sep 2025 22:27:27 +0800 Subject: [PATCH 1244/1321] dmaengine: mmp_pdma: fix DMA mask handling The driver's existing logic for setting the DMA mask for "marvell,pdma-1.0" was flawed. It incorrectly relied on pdev->dev->coherent_dma_mask instead of declaring the hardware's fixed addressing capability. A cleaner and more correct approach is to define the mask directly based on the hardware limitations. The MMP/PXA PDMA controller is a 32-bit DMA engine. This is supported by datasheets and various dtsi files for PXA25x, PXA27x, PXA3xx, and MMP2, all of which are 32-bit systems. This patch simplifies the driver's logic by replacing the 'u64 dma_mask' field with a simpler 'u32 dma_width' to store the addressing capability in bits. The complex if/else block in probe() is then replaced with a single, clear call to dma_set_mask_and_coherent(). This sets a fixed 32-bit DMA mask for "marvell,pdma-1.0" and a 64-bit mask for "spacemit,k1-pdma," matching each device's hardware capabilities. Finally, this change also works around a specific build error encountered with clang-20 on x86_64 allyesconfig. The shift-count-overflow error is caused by a known clang compiler issue where the DMA_BIT_MASK(n) macro's ternary operator is not correctly evaluated in static initializers. By moving the macro's evaluation into the probe() function, the driver avoids this compiler bug. Fixes: 5cfe585d8624 ("dmaengine: mmp_pdma: Add SpacemiT K1 PDMA support with 64-bit addressing") Reported-by: Naresh Kamboju Closes: https://lore.kernel.org/lkml/CA+G9fYsPcMfW-e_0_TRqu4cnwqOqYF3aJOeKUYk6Z4qRStdFvg@mail.gmail.com Suggested-by: Arnd Bergmann Signed-off-by: Guodong Xu Reviewed-by: Arnd Bergmann Tested-by: Nathan Chancellor # build Tested-by: Naresh Kamboju Signed-off-by: Vinod Koul --- drivers/dma/mmp_pdma.c | 20 ++++++++------------ 1 file changed, 8 insertions(+), 12 deletions(-) diff --git a/drivers/dma/mmp_pdma.c b/drivers/dma/mmp_pdma.c index d07229a748868..86661eb3cde1f 100644 --- a/drivers/dma/mmp_pdma.c +++ b/drivers/dma/mmp_pdma.c @@ -152,8 +152,8 @@ struct mmp_pdma_phy { * * Controller Configuration: * @run_bits: Control bits in DCSR register for channel start/stop - * @dma_mask: DMA addressing capability of controller. 0 to use OF/platform - * settings, or explicit mask like DMA_BIT_MASK(32/64) + * @dma_width: DMA addressing width in bits (32 or 64). Determines the + * DMA mask capability of the controller hardware. */ struct mmp_pdma_ops { /* Hardware Register Operations */ @@ -173,7 +173,7 @@ struct mmp_pdma_ops { /* Controller Configuration */ u32 run_bits; - u64 dma_mask; + u32 dma_width; }; struct mmp_pdma_device { @@ -1172,7 +1172,7 @@ static const struct mmp_pdma_ops marvell_pdma_v1_ops = { .get_desc_src_addr = get_desc_src_addr_32, .get_desc_dst_addr = get_desc_dst_addr_32, .run_bits = (DCSR_RUN), - .dma_mask = 0, /* let OF/platform set DMA mask */ + .dma_width = 32, }; static const struct mmp_pdma_ops spacemit_k1_pdma_ops = { @@ -1185,7 +1185,7 @@ static const struct mmp_pdma_ops spacemit_k1_pdma_ops = { .get_desc_src_addr = get_desc_src_addr_64, .get_desc_dst_addr = get_desc_dst_addr_64, .run_bits = (DCSR_RUN | DCSR_LPAEEN), - .dma_mask = DMA_BIT_MASK(64), /* force 64-bit DMA addr capability */ + .dma_width = 64, }; static const struct of_device_id mmp_pdma_dt_ids[] = { @@ -1314,13 +1314,9 @@ static int mmp_pdma_probe(struct platform_device *op) pdev->device.directions = BIT(DMA_MEM_TO_DEV) | BIT(DMA_DEV_TO_MEM); pdev->device.residue_granularity = DMA_RESIDUE_GRANULARITY_DESCRIPTOR; - /* Set DMA mask based on ops->dma_mask, or OF/platform */ - if (pdev->ops->dma_mask) - dma_set_mask(pdev->dev, pdev->ops->dma_mask); - else if (pdev->dev->coherent_dma_mask) - dma_set_mask(pdev->dev, pdev->dev->coherent_dma_mask); - else - dma_set_mask(pdev->dev, DMA_BIT_MASK(64)); + /* Set DMA mask based on controller hardware capabilities */ + dma_set_mask_and_coherent(pdev->dev, + DMA_BIT_MASK(pdev->ops->dma_width)); ret = dma_async_device_register(&pdev->device); if (ret) { From 9429fc6f7df256695aaec50218b7f8488fb500c6 Mon Sep 17 00:00:00 2001 From: Anthony Brandon Date: Mon, 13 Oct 2025 17:48:49 +0200 Subject: [PATCH 1245/1321] dmaengine: xilinx: xdma: Fix regmap max_register The max_register field is assigned the size of the register memory region instead of the offset of the last register. The result is that reading from the regmap via debugfs can cause a segmentation fault: tail /sys/kernel/debug/regmap/xdma.1.auto/registers Unable to handle kernel paging request at virtual address ffff800082f70000 Mem abort info: ESR = 0x0000000096000007 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x07: level 3 translation fault [...] Call trace: regmap_mmio_read32le+0x10/0x30 _regmap_bus_reg_read+0x74/0xc0 _regmap_read+0x68/0x198 regmap_read+0x54/0x88 regmap_read_debugfs+0x140/0x380 regmap_map_read_file+0x30/0x48 full_proxy_read+0x68/0xc8 vfs_read+0xcc/0x310 ksys_read+0x7c/0x120 __arm64_sys_read+0x24/0x40 invoke_syscall.constprop.0+0x64/0x108 do_el0_svc+0xb0/0xd8 el0_svc+0x38/0x130 el0t_64_sync_handler+0x120/0x138 el0t_64_sync+0x194/0x198 Code: aa1e03e9 d503201f f9400000 8b214000 (b9400000) ---[ end trace 0000000000000000 ]--- note: tail[1217] exited with irqs disabled note: tail[1217] exited with preempt_count 1 Segmentation fault Fixes: 17ce252266c7 ("dmaengine: xilinx: xdma: Add xilinx xdma driver") Reviewed-by: Lizhi Hou Reviewed-by: Radhey Shyam Pandey Reviewed-by: Alexander Stein Signed-off-by: Anthony Brandon Signed-off-by: Vinod Koul --- drivers/dma/xilinx/xdma-regs.h | 1 + drivers/dma/xilinx/xdma.c | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/dma/xilinx/xdma-regs.h b/drivers/dma/xilinx/xdma-regs.h index 6ad08878e9386..70bca92621aa4 100644 --- a/drivers/dma/xilinx/xdma-regs.h +++ b/drivers/dma/xilinx/xdma-regs.h @@ -9,6 +9,7 @@ /* The length of register space exposed to host */ #define XDMA_REG_SPACE_LEN 65536 +#define XDMA_MAX_REG_OFFSET (XDMA_REG_SPACE_LEN - 4) /* * maximum number of DMA channels for each direction: diff --git a/drivers/dma/xilinx/xdma.c b/drivers/dma/xilinx/xdma.c index 0d88b1a670e14..5ecf8223c112e 100644 --- a/drivers/dma/xilinx/xdma.c +++ b/drivers/dma/xilinx/xdma.c @@ -38,7 +38,7 @@ static const struct regmap_config xdma_regmap_config = { .reg_bits = 32, .val_bits = 32, .reg_stride = 4, - .max_register = XDMA_REG_SPACE_LEN, + .max_register = XDMA_MAX_REG_OFFSET, }; /** From bc86100d746e86a39d87b722f08b6e91ebc16275 Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Mon, 17 Nov 2025 17:12:43 +0100 Subject: [PATCH 1246/1321] dmaengine: at_hdmac: fix device leak on of_dma_xlate() Make sure to drop the reference taken when looking up the DMA platform device during of_dma_xlate() when releasing channel resources. Note that commit 3832b78b3ec2 ("dmaengine: at_hdmac: add missing put_device() call in at_dma_xlate()") fixed the leak in a couple of error paths but the reference is still leaking on successful allocation. Fixes: bbe89c8e3d59 ("at_hdmac: move to generic DMA binding") Fixes: 3832b78b3ec2 ("dmaengine: at_hdmac: add missing put_device() call in at_dma_xlate()") Cc: stable@vger.kernel.org # 3.10: 3832b78b3ec2 Cc: Yu Kuai Signed-off-by: Johan Hovold Link: https://patch.msgid.link/20251117161258.10679-2-johan@kernel.org Signed-off-by: Vinod Koul --- drivers/dma/at_hdmac.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/drivers/dma/at_hdmac.c b/drivers/dma/at_hdmac.c index 7d226453961f8..22bb604a3f975 100644 --- a/drivers/dma/at_hdmac.c +++ b/drivers/dma/at_hdmac.c @@ -1765,6 +1765,7 @@ static int atc_alloc_chan_resources(struct dma_chan *chan) static void atc_free_chan_resources(struct dma_chan *chan) { struct at_dma_chan *atchan = to_at_dma_chan(chan); + struct at_dma_slave *atslave; BUG_ON(atc_chan_is_enabled(atchan)); @@ -1774,8 +1775,12 @@ static void atc_free_chan_resources(struct dma_chan *chan) /* * Free atslave allocated in at_dma_xlate() */ - kfree(chan->private); - chan->private = NULL; + atslave = chan->private; + if (atslave) { + put_device(atslave->dma_dev); + kfree(atslave); + chan->private = NULL; + } dev_vdbg(chan2dev(chan), "free_chan_resources: done\n"); } From a8f8b53f5555a681edf7ec7ba6541bd60189269a Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Mon, 17 Nov 2025 17:12:45 +0100 Subject: [PATCH 1247/1321] dmaengine: bcm-sba-raid: fix device leak on probe Make sure to drop the reference taken when looking up the mailbox device during probe on probe failures and on driver unbind. Fixes: 743e1c8ffe4e ("dmaengine: Add Broadcom SBA RAID driver") Cc: stable@vger.kernel.org # 4.13 Signed-off-by: Johan Hovold Link: https://patch.msgid.link/20251117161258.10679-4-johan@kernel.org Signed-off-by: Vinod Koul --- drivers/dma/bcm-sba-raid.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/dma/bcm-sba-raid.c b/drivers/dma/bcm-sba-raid.c index 7f0e76439ce58..ed037fa883f6f 100644 --- a/drivers/dma/bcm-sba-raid.c +++ b/drivers/dma/bcm-sba-raid.c @@ -1699,7 +1699,7 @@ static int sba_probe(struct platform_device *pdev) /* Prealloc channel resource */ ret = sba_prealloc_channel_resources(sba); if (ret) - goto fail_free_mchan; + goto fail_put_mbox; /* Check availability of debugfs */ if (!debugfs_initialized()) @@ -1729,6 +1729,8 @@ static int sba_probe(struct platform_device *pdev) fail_free_resources: debugfs_remove_recursive(sba->root); sba_freeup_channel_resources(sba); +fail_put_mbox: + put_device(sba->mbox_dev); fail_free_mchan: mbox_free_channel(sba->mchan); return ret; @@ -1744,6 +1746,8 @@ static void sba_remove(struct platform_device *pdev) sba_freeup_channel_resources(sba); + put_device(sba->mbox_dev); + mbox_free_channel(sba->mchan); } From d049967c322b58eedaa8c190d43e05c5e08f2988 Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Mon, 17 Nov 2025 17:12:46 +0100 Subject: [PATCH 1248/1321] dmaengine: cv1800b-dmamux: fix device leak on route allocation Make sure to drop the reference taken when looking up the DMA mux platform device during route allocation. Note that holding a reference to a device does not prevent its driver data from going away so there is no point in keeping the reference. Fixes: db7d07b5add4 ("dmaengine: add driver for Sophgo CV18XX/SG200X dmamux") Cc: stable@vger.kernel.org # 6.17 Cc: Inochi Amaoto Signed-off-by: Johan Hovold Link: https://patch.msgid.link/20251117161258.10679-5-johan@kernel.org Signed-off-by: Vinod Koul --- drivers/dma/cv1800b-dmamux.c | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/drivers/dma/cv1800b-dmamux.c b/drivers/dma/cv1800b-dmamux.c index e900d65956171..f7a952fcbc7d7 100644 --- a/drivers/dma/cv1800b-dmamux.c +++ b/drivers/dma/cv1800b-dmamux.c @@ -102,11 +102,11 @@ static void *cv1800_dmamux_route_allocate(struct of_phandle_args *dma_spec, struct llist_node *node; unsigned long flags; unsigned int chid, devid, cpuid; - int ret; + int ret = -EINVAL; if (dma_spec->args_count != DMAMUX_NCELLS) { dev_err(&pdev->dev, "invalid number of dma mux args\n"); - return ERR_PTR(-EINVAL); + goto err_put_pdev; } devid = dma_spec->args[0]; @@ -115,18 +115,18 @@ static void *cv1800_dmamux_route_allocate(struct of_phandle_args *dma_spec, if (devid > MAX_DMA_MAPPING_ID) { dev_err(&pdev->dev, "invalid device id: %u\n", devid); - return ERR_PTR(-EINVAL); + goto err_put_pdev; } if (cpuid > MAX_DMA_CPU_ID) { dev_err(&pdev->dev, "invalid cpu id: %u\n", cpuid); - return ERR_PTR(-EINVAL); + goto err_put_pdev; } dma_spec->np = of_parse_phandle(ofdma->of_node, "dma-masters", 0); if (!dma_spec->np) { dev_err(&pdev->dev, "can't get dma master\n"); - return ERR_PTR(-EINVAL); + goto err_put_pdev; } spin_lock_irqsave(&dmamux->lock, flags); @@ -136,8 +136,6 @@ static void *cv1800_dmamux_route_allocate(struct of_phandle_args *dma_spec, if (map->peripheral == devid && map->cpu == cpuid) goto found; } - - ret = -EINVAL; goto failed; } else { node = llist_del_first(&dmamux->free_maps); @@ -171,12 +169,17 @@ static void *cv1800_dmamux_route_allocate(struct of_phandle_args *dma_spec, dev_dbg(&pdev->dev, "register channel %u for req %u (cpu %u)\n", chid, devid, cpuid); + put_device(&pdev->dev); + return map; failed: spin_unlock_irqrestore(&dmamux->lock, flags); of_node_put(dma_spec->np); dev_err(&pdev->dev, "errno %d\n", ret); +err_put_pdev: + put_device(&pdev->dev); + return ERR_PTR(ret); } From 0253388e9b5eccf4e50ab0a2868d0cb6aac51a8b Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Mon, 17 Nov 2025 17:12:47 +0100 Subject: [PATCH 1249/1321] dmaengine: dw: dmamux: fix OF node leak on route allocation failure Make sure to drop the reference taken to the DMA master OF node also on late route allocation failures. Fixes: 134d9c52fca2 ("dmaengine: dw: dmamux: Introduce RZN1 DMA router support") Cc: stable@vger.kernel.org # 5.19 Cc: Miquel Raynal Signed-off-by: Johan Hovold Reviewed-by: Miquel Raynal Link: https://patch.msgid.link/20251117161258.10679-6-johan@kernel.org Signed-off-by: Vinod Koul --- drivers/dma/dw/rzn1-dmamux.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/dma/dw/rzn1-dmamux.c b/drivers/dma/dw/rzn1-dmamux.c index deadf135681b6..cbec277af4dd3 100644 --- a/drivers/dma/dw/rzn1-dmamux.c +++ b/drivers/dma/dw/rzn1-dmamux.c @@ -90,7 +90,7 @@ static void *rzn1_dmamux_route_allocate(struct of_phandle_args *dma_spec, if (test_and_set_bit(map->req_idx, dmamux->used_chans)) { ret = -EBUSY; - goto free_map; + goto put_dma_spec_np; } mask = BIT(map->req_idx); @@ -103,6 +103,8 @@ static void *rzn1_dmamux_route_allocate(struct of_phandle_args *dma_spec, clear_bitmap: clear_bit(map->req_idx, dmamux->used_chans); +put_dma_spec_np: + of_node_put(dma_spec->np); free_map: kfree(map); put_device: From 99ae563d4b62381ae177478c498271376a60bbce Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Mon, 17 Nov 2025 17:12:48 +0100 Subject: [PATCH 1250/1321] dmaengine: idxd: fix device leaks on compat bind and unbind Make sure to drop the reference taken when looking up the idxd device as part of the compat bind and unbind sysfs interface. Fixes: 6e7f3ee97bbe ("dmaengine: idxd: move dsa_drv support to compatible mode") Cc: stable@vger.kernel.org # 5.15 Cc: Dave Jiang Signed-off-by: Johan Hovold Link: https://patch.msgid.link/20251117161258.10679-7-johan@kernel.org Signed-off-by: Vinod Koul --- drivers/dma/idxd/compat.c | 23 +++++++++++++++++++---- 1 file changed, 19 insertions(+), 4 deletions(-) diff --git a/drivers/dma/idxd/compat.c b/drivers/dma/idxd/compat.c index eff9943f1a42e..95b8ef9586338 100644 --- a/drivers/dma/idxd/compat.c +++ b/drivers/dma/idxd/compat.c @@ -20,11 +20,16 @@ static ssize_t unbind_store(struct device_driver *drv, const char *buf, size_t c int rc = -ENODEV; dev = bus_find_device_by_name(bus, NULL, buf); - if (dev && dev->driver) { + if (!dev) + return -ENODEV; + + if (dev->driver) { device_driver_detach(dev); rc = count; } + put_device(dev); + return rc; } static DRIVER_ATTR_IGNORE_LOCKDEP(unbind, 0200, NULL, unbind_store); @@ -38,9 +43,12 @@ static ssize_t bind_store(struct device_driver *drv, const char *buf, size_t cou struct idxd_dev *idxd_dev; dev = bus_find_device_by_name(bus, NULL, buf); - if (!dev || dev->driver || drv != &dsa_drv.drv) + if (!dev) return -ENODEV; + if (dev->driver || drv != &dsa_drv.drv) + goto err_put_dev; + idxd_dev = confdev_to_idxd_dev(dev); if (is_idxd_dev(idxd_dev)) { alt_drv = driver_find("idxd", bus); @@ -53,13 +61,20 @@ static ssize_t bind_store(struct device_driver *drv, const char *buf, size_t cou alt_drv = driver_find("user", bus); } if (!alt_drv) - return -ENODEV; + goto err_put_dev; rc = device_driver_attach(alt_drv, dev); if (rc < 0) - return rc; + goto err_put_dev; + + put_device(dev); return count; + +err_put_dev: + put_device(dev); + + return rc; } static DRIVER_ATTR_IGNORE_LOCKDEP(bind, 0200, NULL, bind_store); From 6a61ed7373468792947159dd4535f38d3371689f Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Mon, 17 Nov 2025 17:12:49 +0100 Subject: [PATCH 1251/1321] dmaengine: lpc18xx-dmamux: fix device leak on route allocation Make sure to drop the reference taken when looking up the DMA mux platform device during route allocation. Note that holding a reference to a device does not prevent its driver data from going away so there is no point in keeping the reference. Fixes: e5f4ae84be74 ("dmaengine: add driver for lpc18xx dmamux") Cc: stable@vger.kernel.org # 4.3 Signed-off-by: Johan Hovold Reviewed-by: Vladimir Zapolskiy Link: https://patch.msgid.link/20251117161258.10679-8-johan@kernel.org Signed-off-by: Vinod Koul --- drivers/dma/lpc18xx-dmamux.c | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-) diff --git a/drivers/dma/lpc18xx-dmamux.c b/drivers/dma/lpc18xx-dmamux.c index 2b6436f4b1937..d3ff521951b83 100644 --- a/drivers/dma/lpc18xx-dmamux.c +++ b/drivers/dma/lpc18xx-dmamux.c @@ -57,30 +57,31 @@ static void *lpc18xx_dmamux_reserve(struct of_phandle_args *dma_spec, struct lpc18xx_dmamux_data *dmamux = platform_get_drvdata(pdev); unsigned long flags; unsigned mux; + int ret = -EINVAL; if (dma_spec->args_count != 3) { dev_err(&pdev->dev, "invalid number of dma mux args\n"); - return ERR_PTR(-EINVAL); + goto err_put_pdev; } mux = dma_spec->args[0]; if (mux >= dmamux->dma_master_requests) { dev_err(&pdev->dev, "invalid mux number: %d\n", dma_spec->args[0]); - return ERR_PTR(-EINVAL); + goto err_put_pdev; } if (dma_spec->args[1] > LPC18XX_DMAMUX_MAX_VAL) { dev_err(&pdev->dev, "invalid dma mux value: %d\n", dma_spec->args[1]); - return ERR_PTR(-EINVAL); + goto err_put_pdev; } /* The of_node_put() will be done in the core for the node */ dma_spec->np = of_parse_phandle(ofdma->of_node, "dma-masters", 0); if (!dma_spec->np) { dev_err(&pdev->dev, "can't get dma master\n"); - return ERR_PTR(-EINVAL); + goto err_put_pdev; } spin_lock_irqsave(&dmamux->lock, flags); @@ -89,7 +90,8 @@ static void *lpc18xx_dmamux_reserve(struct of_phandle_args *dma_spec, dev_err(&pdev->dev, "dma request %u busy with %u.%u\n", mux, mux, dmamux->muxes[mux].value); of_node_put(dma_spec->np); - return ERR_PTR(-EBUSY); + ret = -EBUSY; + goto err_put_pdev; } dmamux->muxes[mux].busy = true; @@ -106,7 +108,14 @@ static void *lpc18xx_dmamux_reserve(struct of_phandle_args *dma_spec, dev_dbg(&pdev->dev, "mapping dmamux %u.%u to dma request %u\n", mux, dmamux->muxes[mux].value, mux); + put_device(&pdev->dev); + return &dmamux->muxes[mux]; + +err_put_pdev: + put_device(&pdev->dev); + + return ERR_PTR(ret); } static int lpc18xx_dmamux_probe(struct platform_device *pdev) From b79783f7e870af682752eeb2937496b44dd9446b Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Mon, 17 Nov 2025 17:12:50 +0100 Subject: [PATCH 1252/1321] dmaengine: lpc32xx-dmamux: fix device leak on route allocation Make sure to drop the reference taken when looking up the DMA mux platform device during route allocation. Note that holding a reference to a device does not prevent its driver data from going away so there is no point in keeping the reference. Fixes: 5d318b595982 ("dmaengine: Add dma router for pl08x in LPC32XX SoC") Cc: stable@vger.kernel.org # 6.12 Cc: Piotr Wojtaszczyk Signed-off-by: Johan Hovold Reviewed-by: Vladimir Zapolskiy Link: https://patch.msgid.link/20251117161258.10679-9-johan@kernel.org Signed-off-by: Vinod Koul --- drivers/dma/lpc32xx-dmamux.c | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-) diff --git a/drivers/dma/lpc32xx-dmamux.c b/drivers/dma/lpc32xx-dmamux.c index 351d7e23e6156..33be714740ddf 100644 --- a/drivers/dma/lpc32xx-dmamux.c +++ b/drivers/dma/lpc32xx-dmamux.c @@ -95,11 +95,12 @@ static void *lpc32xx_dmamux_reserve(struct of_phandle_args *dma_spec, struct lpc32xx_dmamux_data *dmamux = platform_get_drvdata(pdev); unsigned long flags; struct lpc32xx_dmamux *mux = NULL; + int ret = -EINVAL; int i; if (dma_spec->args_count != 3) { dev_err(&pdev->dev, "invalid number of dma mux args\n"); - return ERR_PTR(-EINVAL); + goto err_put_pdev; } for (i = 0; i < ARRAY_SIZE(lpc32xx_muxes); i++) { @@ -111,20 +112,20 @@ static void *lpc32xx_dmamux_reserve(struct of_phandle_args *dma_spec, if (!mux) { dev_err(&pdev->dev, "invalid mux request number: %d\n", dma_spec->args[0]); - return ERR_PTR(-EINVAL); + goto err_put_pdev; } if (dma_spec->args[2] > 1) { dev_err(&pdev->dev, "invalid dma mux value: %d\n", dma_spec->args[1]); - return ERR_PTR(-EINVAL); + goto err_put_pdev; } /* The of_node_put() will be done in the core for the node */ dma_spec->np = of_parse_phandle(ofdma->of_node, "dma-masters", 0); if (!dma_spec->np) { dev_err(&pdev->dev, "can't get dma master\n"); - return ERR_PTR(-EINVAL); + goto err_put_pdev; } spin_lock_irqsave(&dmamux->lock, flags); @@ -133,7 +134,8 @@ static void *lpc32xx_dmamux_reserve(struct of_phandle_args *dma_spec, dev_err(dev, "dma request signal %d busy, routed to %s\n", mux->signal, mux->muxval ? mux->name_sel1 : mux->name_sel1); of_node_put(dma_spec->np); - return ERR_PTR(-EBUSY); + ret = -EBUSY; + goto err_put_pdev; } mux->busy = true; @@ -148,7 +150,14 @@ static void *lpc32xx_dmamux_reserve(struct of_phandle_args *dma_spec, dev_dbg(dev, "dma request signal %d routed to %s\n", mux->signal, mux->muxval ? mux->name_sel1 : mux->name_sel1); + put_device(&pdev->dev); + return mux; + +err_put_pdev: + put_device(&pdev->dev); + + return ERR_PTR(ret); } static int lpc32xx_dmamux_probe(struct platform_device *pdev) From 89640fbf5ade1ddcea1063eda2cec9a6331fb9fb Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Mon, 17 Nov 2025 17:12:51 +0100 Subject: [PATCH 1253/1321] dmaengine: sh: rz-dmac: fix device leak on probe failure Make sure to drop the reference taken when looking up the ICU device during probe also on probe failures (e.g. probe deferral). Fixes: 7de873201c44 ("dmaengine: sh: rz-dmac: Add RZ/V2H(P) support") Cc: stable@vger.kernel.org # 6.16 Cc: Fabrizio Castro Signed-off-by: Johan Hovold Reviewed-by: Fabrizio Castro Link: https://patch.msgid.link/20251117161258.10679-10-johan@kernel.org Signed-off-by: Vinod Koul --- drivers/dma/sh/rz-dmac.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/drivers/dma/sh/rz-dmac.c b/drivers/dma/sh/rz-dmac.c index 1f687b08d6b86..38137e8d80b9f 100644 --- a/drivers/dma/sh/rz-dmac.c +++ b/drivers/dma/sh/rz-dmac.c @@ -854,6 +854,13 @@ static int rz_dmac_chan_probe(struct rz_dmac *dmac, return 0; } +static void rz_dmac_put_device(void *_dev) +{ + struct device *dev = _dev; + + put_device(dev); +} + static int rz_dmac_parse_of_icu(struct device *dev, struct rz_dmac *dmac) { struct device_node *np = dev->of_node; @@ -876,6 +883,10 @@ static int rz_dmac_parse_of_icu(struct device *dev, struct rz_dmac *dmac) return -ENODEV; } + ret = devm_add_action_or_reset(dev, rz_dmac_put_device, &dmac->icu.pdev->dev); + if (ret) + return ret; + dmac_index = args.args[0]; if (dmac_index > RZV2H_MAX_DMAC_INDEX) { dev_err(dev, "DMAC index %u invalid.\n", dmac_index); @@ -1055,8 +1066,6 @@ static void rz_dmac_remove(struct platform_device *pdev) reset_control_assert(dmac->rstc); pm_runtime_put(&pdev->dev); pm_runtime_disable(&pdev->dev); - - platform_device_put(dmac->icu.pdev); } static const struct of_device_id of_rz_dmac_match[] = { From 73edaa04cb67b0a536642ca5830d57da37fa6c11 Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Mon, 17 Nov 2025 17:12:52 +0100 Subject: [PATCH 1254/1321] dmaengine: stm32: dmamux: fix device leak on route allocation Make sure to drop the reference taken when looking up the DMA mux platform device during route allocation. Note that holding a reference to a device does not prevent its driver data from going away so there is no point in keeping the reference. Fixes: df7e762db5f6 ("dmaengine: Add STM32 DMAMUX driver") Cc: stable@vger.kernel.org # 4.15 Cc: Pierre-Yves MORDRET Signed-off-by: Johan Hovold Reviewed-by: Amelie Delaunay Link: https://patch.msgid.link/20251117161258.10679-11-johan@kernel.org Signed-off-by: Vinod Koul --- drivers/dma/stm32/stm32-dmamux.c | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/drivers/dma/stm32/stm32-dmamux.c b/drivers/dma/stm32/stm32-dmamux.c index 8d77e2a7939a0..7911797607824 100644 --- a/drivers/dma/stm32/stm32-dmamux.c +++ b/drivers/dma/stm32/stm32-dmamux.c @@ -90,23 +90,25 @@ static void *stm32_dmamux_route_allocate(struct of_phandle_args *dma_spec, struct stm32_dmamux_data *dmamux = platform_get_drvdata(pdev); struct stm32_dmamux *mux; u32 i, min, max; - int ret; + int ret = -EINVAL; unsigned long flags; if (dma_spec->args_count != 3) { dev_err(&pdev->dev, "invalid number of dma mux args\n"); - return ERR_PTR(-EINVAL); + goto err_put_pdev; } if (dma_spec->args[0] > dmamux->dmamux_requests) { dev_err(&pdev->dev, "invalid mux request number: %d\n", dma_spec->args[0]); - return ERR_PTR(-EINVAL); + goto err_put_pdev; } mux = kzalloc(sizeof(*mux), GFP_KERNEL); - if (!mux) - return ERR_PTR(-ENOMEM); + if (!mux) { + ret = -ENOMEM; + goto err_put_pdev; + } spin_lock_irqsave(&dmamux->lock, flags); mux->chan_id = find_first_zero_bit(dmamux->dma_inuse, @@ -133,7 +135,6 @@ static void *stm32_dmamux_route_allocate(struct of_phandle_args *dma_spec, dma_spec->np = of_parse_phandle(ofdma->of_node, "dma-masters", i - 1); if (!dma_spec->np) { dev_err(&pdev->dev, "can't get dma master\n"); - ret = -EINVAL; goto error; } @@ -160,6 +161,8 @@ static void *stm32_dmamux_route_allocate(struct of_phandle_args *dma_spec, dev_dbg(&pdev->dev, "Mapping DMAMUX(%u) to DMA%u(%u)\n", mux->request, mux->master, mux->chan_id); + put_device(&pdev->dev); + return mux; error: @@ -167,6 +170,9 @@ static void *stm32_dmamux_route_allocate(struct of_phandle_args *dma_spec, error_chan_id: kfree(mux); +err_put_pdev: + put_device(&pdev->dev); + return ERR_PTR(ret); } From 2ff5726afcd07d547f8525a11fee8f803b2b05bd Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Mon, 17 Nov 2025 17:12:53 +0100 Subject: [PATCH 1255/1321] dmaengine: stm32: dmamux: fix OF node leak on route allocation failure Make sure to drop the reference taken to the DMA master OF node also on late route allocation failures. Fixes: df7e762db5f6 ("dmaengine: Add STM32 DMAMUX driver") Cc: stable@vger.kernel.org # 4.15 Cc: Pierre-Yves MORDRET Signed-off-by: Johan Hovold Reviewed-by: Amelie Delaunay Link: https://patch.msgid.link/20251117161258.10679-12-johan@kernel.org Signed-off-by: Vinod Koul --- drivers/dma/stm32/stm32-dmamux.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/dma/stm32/stm32-dmamux.c b/drivers/dma/stm32/stm32-dmamux.c index 7911797607824..2bd218dbabbb1 100644 --- a/drivers/dma/stm32/stm32-dmamux.c +++ b/drivers/dma/stm32/stm32-dmamux.c @@ -143,7 +143,7 @@ static void *stm32_dmamux_route_allocate(struct of_phandle_args *dma_spec, ret = pm_runtime_resume_and_get(&pdev->dev); if (ret < 0) { spin_unlock_irqrestore(&dmamux->lock, flags); - goto error; + goto err_put_dma_spec_np; } spin_unlock_irqrestore(&dmamux->lock, flags); @@ -165,6 +165,8 @@ static void *stm32_dmamux_route_allocate(struct of_phandle_args *dma_spec, return mux; +err_put_dma_spec_np: + of_node_put(dma_spec->np); error: clear_bit(mux->chan_id, dmamux->dma_inuse); From 7ea8fe5accc8c3901a726bf26b420ab43ddde7d4 Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Mon, 17 Nov 2025 17:12:54 +0100 Subject: [PATCH 1256/1321] dmaengine: stm32: dmamux: clean up route allocation error labels Error labels should be named after what they do (and not after wherefrom they are jumped to). Signed-off-by: Johan Hovold Reviewed-by: Amelie Delaunay Link: https://patch.msgid.link/20251117161258.10679-13-johan@kernel.org Signed-off-by: Vinod Koul --- drivers/dma/stm32/stm32-dmamux.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/drivers/dma/stm32/stm32-dmamux.c b/drivers/dma/stm32/stm32-dmamux.c index 2bd218dbabbb1..db13498b9c9fe 100644 --- a/drivers/dma/stm32/stm32-dmamux.c +++ b/drivers/dma/stm32/stm32-dmamux.c @@ -118,7 +118,7 @@ static void *stm32_dmamux_route_allocate(struct of_phandle_args *dma_spec, spin_unlock_irqrestore(&dmamux->lock, flags); dev_err(&pdev->dev, "Run out of free DMA requests\n"); ret = -ENOMEM; - goto error_chan_id; + goto err_free_mux; } set_bit(mux->chan_id, dmamux->dma_inuse); spin_unlock_irqrestore(&dmamux->lock, flags); @@ -135,7 +135,7 @@ static void *stm32_dmamux_route_allocate(struct of_phandle_args *dma_spec, dma_spec->np = of_parse_phandle(ofdma->of_node, "dma-masters", i - 1); if (!dma_spec->np) { dev_err(&pdev->dev, "can't get dma master\n"); - goto error; + goto err_clear_inuse; } /* Set dma request */ @@ -167,10 +167,9 @@ static void *stm32_dmamux_route_allocate(struct of_phandle_args *dma_spec, err_put_dma_spec_np: of_node_put(dma_spec->np); -error: +err_clear_inuse: clear_bit(mux->chan_id, dmamux->dma_inuse); - -error_chan_id: +err_free_mux: kfree(mux); err_put_pdev: put_device(&pdev->dev); From ebb6e239e6a6f5c6bfe0cb67b80ff1a641b93c68 Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Mon, 17 Nov 2025 17:12:55 +0100 Subject: [PATCH 1257/1321] dmaengine: ti: dma-crossbar: fix device leak on dra7x route allocation Make sure to drop the reference taken when looking up the crossbar platform device during dra7x route allocation. Note that commit 615a4bfc426e ("dmaengine: ti: Add missing put_device in ti_dra7_xbar_route_allocate") fixed the leak in the error paths but the reference is still leaking on successful allocation. Fixes: a074ae38f859 ("dmaengine: Add driver for TI DMA crossbar on DRA7x") Fixes: 615a4bfc426e ("dmaengine: ti: Add missing put_device in ti_dra7_xbar_route_allocate") Cc: stable@vger.kernel.org # 4.2: 615a4bfc426e Cc: Peter Ujfalusi Cc: Miaoqian Lin Signed-off-by: Johan Hovold Link: https://patch.msgid.link/20251117161258.10679-14-johan@kernel.org Signed-off-by: Vinod Koul --- drivers/dma/ti/dma-crossbar.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/dma/ti/dma-crossbar.c b/drivers/dma/ti/dma-crossbar.c index 7f17ee87a6dce..e52b0e1399008 100644 --- a/drivers/dma/ti/dma-crossbar.c +++ b/drivers/dma/ti/dma-crossbar.c @@ -288,6 +288,8 @@ static void *ti_dra7_xbar_route_allocate(struct of_phandle_args *dma_spec, ti_dra7_xbar_write(xbar->iomem, map->xbar_out, map->xbar_in); + put_device(&pdev->dev); + return map; } From da5fa2877ade2c2400a985ea47a80cde88131411 Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Mon, 17 Nov 2025 17:12:56 +0100 Subject: [PATCH 1258/1321] dmaengine: ti: dma-crossbar: fix device leak on am335x route allocation Make sure to drop the reference taken when looking up the crossbar platform device during am335x route allocation. Fixes: 42dbdcc6bf96 ("dmaengine: ti-dma-crossbar: Add support for crossbar on AM33xx/AM43xx") Cc: stable@vger.kernel.org # 4.4 Cc: Peter Ujfalusi Signed-off-by: Johan Hovold Link: https://patch.msgid.link/20251117161258.10679-15-johan@kernel.org Signed-off-by: Vinod Koul --- drivers/dma/ti/dma-crossbar.c | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-) diff --git a/drivers/dma/ti/dma-crossbar.c b/drivers/dma/ti/dma-crossbar.c index e52b0e1399008..ff05b150ad372 100644 --- a/drivers/dma/ti/dma-crossbar.c +++ b/drivers/dma/ti/dma-crossbar.c @@ -79,34 +79,35 @@ static void *ti_am335x_xbar_route_allocate(struct of_phandle_args *dma_spec, { struct platform_device *pdev = of_find_device_by_node(ofdma->of_node); struct ti_am335x_xbar_data *xbar = platform_get_drvdata(pdev); - struct ti_am335x_xbar_map *map; + struct ti_am335x_xbar_map *map = ERR_PTR(-EINVAL); if (dma_spec->args_count != 3) - return ERR_PTR(-EINVAL); + goto out_put_pdev; if (dma_spec->args[2] >= xbar->xbar_events) { dev_err(&pdev->dev, "Invalid XBAR event number: %d\n", dma_spec->args[2]); - return ERR_PTR(-EINVAL); + goto out_put_pdev; } if (dma_spec->args[0] >= xbar->dma_requests) { dev_err(&pdev->dev, "Invalid DMA request line number: %d\n", dma_spec->args[0]); - return ERR_PTR(-EINVAL); + goto out_put_pdev; } /* The of_node_put() will be done in the core for the node */ dma_spec->np = of_parse_phandle(ofdma->of_node, "dma-masters", 0); if (!dma_spec->np) { dev_err(&pdev->dev, "Can't get DMA master\n"); - return ERR_PTR(-EINVAL); + goto out_put_pdev; } map = kzalloc(sizeof(*map), GFP_KERNEL); if (!map) { of_node_put(dma_spec->np); - return ERR_PTR(-ENOMEM); + map = ERR_PTR(-ENOMEM); + goto out_put_pdev; } map->dma_line = (u16)dma_spec->args[0]; @@ -120,6 +121,9 @@ static void *ti_am335x_xbar_route_allocate(struct of_phandle_args *dma_spec, ti_am335x_xbar_write(xbar->iomem, map->dma_line, map->mux_val); +out_put_pdev: + put_device(&pdev->dev); + return map; } From 239c9db09221312cb090c6c5c9a54c9673f4fc92 Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Mon, 17 Nov 2025 17:12:57 +0100 Subject: [PATCH 1259/1321] dmaengine: ti: dma-crossbar: clean up dra7x route allocation error paths Use a common exit path to drop the cross platform device reference on errors for consistency with am335x. Signed-off-by: Johan Hovold Link: https://patch.msgid.link/20251117161258.10679-16-johan@kernel.org Signed-off-by: Vinod Koul --- drivers/dma/ti/dma-crossbar.c | 17 ++++++++--------- 1 file changed, 8 insertions(+), 9 deletions(-) diff --git a/drivers/dma/ti/dma-crossbar.c b/drivers/dma/ti/dma-crossbar.c index ff05b150ad372..e04077d542d29 100644 --- a/drivers/dma/ti/dma-crossbar.c +++ b/drivers/dma/ti/dma-crossbar.c @@ -245,28 +245,26 @@ static void *ti_dra7_xbar_route_allocate(struct of_phandle_args *dma_spec, { struct platform_device *pdev = of_find_device_by_node(ofdma->of_node); struct ti_dra7_xbar_data *xbar = platform_get_drvdata(pdev); - struct ti_dra7_xbar_map *map; + struct ti_dra7_xbar_map *map = ERR_PTR(-EINVAL); if (dma_spec->args[0] >= xbar->xbar_requests) { dev_err(&pdev->dev, "Invalid XBAR request number: %d\n", dma_spec->args[0]); - put_device(&pdev->dev); - return ERR_PTR(-EINVAL); + goto out_put_pdev; } /* The of_node_put() will be done in the core for the node */ dma_spec->np = of_parse_phandle(ofdma->of_node, "dma-masters", 0); if (!dma_spec->np) { dev_err(&pdev->dev, "Can't get DMA master\n"); - put_device(&pdev->dev); - return ERR_PTR(-EINVAL); + goto out_put_pdev; } map = kzalloc(sizeof(*map), GFP_KERNEL); if (!map) { of_node_put(dma_spec->np); - put_device(&pdev->dev); - return ERR_PTR(-ENOMEM); + map = ERR_PTR(-ENOMEM); + goto out_put_pdev; } mutex_lock(&xbar->mutex); @@ -277,8 +275,8 @@ static void *ti_dra7_xbar_route_allocate(struct of_phandle_args *dma_spec, dev_err(&pdev->dev, "Run out of free DMA requests\n"); kfree(map); of_node_put(dma_spec->np); - put_device(&pdev->dev); - return ERR_PTR(-ENOMEM); + map = ERR_PTR(-ENOMEM); + goto out_put_pdev; } set_bit(map->xbar_out, xbar->dma_inuse); mutex_unlock(&xbar->mutex); @@ -292,6 +290,7 @@ static void *ti_dra7_xbar_route_allocate(struct of_phandle_args *dma_spec, ti_dra7_xbar_write(xbar->iomem, map->xbar_out, map->xbar_in); +out_put_pdev: put_device(&pdev->dev); return map; From cc39d9cf6c7ff3e0e86ea08971ed1ec58e785956 Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Mon, 17 Nov 2025 17:12:58 +0100 Subject: [PATCH 1260/1321] dmaengine: ti: k3-udma: fix device leak on udma lookup Make sure to drop the reference taken when looking up the UDMA platform device. Note that holding a reference to a platform device does not prevent its driver data from going away so there is no point in keeping the reference after the lookup helper returns. Fixes: d70241913413 ("dmaengine: ti: k3-udma: Add glue layer for non DMAengine users") Fixes: 1438cde8fe9c ("dmaengine: ti: k3-udma: add missing put_device() call in of_xudma_dev_get()") Cc: stable@vger.kernel.org # 5.6: 1438cde8fe9c Cc: Grygorii Strashko Cc: Yu Kuai Signed-off-by: Johan Hovold Link: https://patch.msgid.link/20251117161258.10679-17-johan@kernel.org Signed-off-by: Vinod Koul --- drivers/dma/ti/k3-udma-private.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/dma/ti/k3-udma-private.c b/drivers/dma/ti/k3-udma-private.c index 05228bf000333..624360423ef17 100644 --- a/drivers/dma/ti/k3-udma-private.c +++ b/drivers/dma/ti/k3-udma-private.c @@ -42,9 +42,9 @@ struct udma_dev *of_xudma_dev_get(struct device_node *np, const char *property) } ud = platform_get_drvdata(pdev); + put_device(&pdev->dev); if (!ud) { pr_debug("UDMA has not been probed\n"); - put_device(&pdev->dev); return ERR_PTR(-EPROBE_DEFER); } From f114fa71ba6dd24cca432d06becaeef3b10a2165 Mon Sep 17 00:00:00 2001 From: Guodong Xu Date: Tue, 16 Dec 2025 22:10:06 +0800 Subject: [PATCH 1261/1321] dmaengine: mmp_pdma: Fix race condition in mmp_pdma_residue() Add proper locking in mmp_pdma_residue() to prevent use-after-free when accessing descriptor list and descriptor contents. The race occurs when multiple threads call tx_status() while the tasklet on another CPU is freeing completed descriptors: CPU 0 CPU 1 ----- ----- mmp_pdma_tx_status() mmp_pdma_residue() -> NO LOCK held list_for_each_entry(sw, ..) DMA interrupt dma_do_tasklet() -> spin_lock(&desc_lock) list_move(sw->node, ...) spin_unlock(&desc_lock) | dma_pool_free(sw) <- FREED! -> access sw->desc <- UAF! This issue can be reproduced when running dmatest on the same channel with multiple threads (threads_per_chan > 1). Fix by protecting the chain_running list iteration and descriptor access with the chan->desc_lock spinlock. Signed-off-by: Juan Li Signed-off-by: Guodong Xu Link: https://patch.msgid.link/20251216-mmp-pdma-race-v1-1-976a224bb622@riscstar.com Signed-off-by: Vinod Koul --- drivers/dma/mmp_pdma.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/drivers/dma/mmp_pdma.c b/drivers/dma/mmp_pdma.c index 86661eb3cde1f..d12e729ee12c5 100644 --- a/drivers/dma/mmp_pdma.c +++ b/drivers/dma/mmp_pdma.c @@ -928,6 +928,7 @@ static unsigned int mmp_pdma_residue(struct mmp_pdma_chan *chan, { struct mmp_pdma_desc_sw *sw; struct mmp_pdma_device *pdev = to_mmp_pdma_dev(chan->chan.device); + unsigned long flags; u64 curr; u32 residue = 0; bool passed = false; @@ -945,6 +946,8 @@ static unsigned int mmp_pdma_residue(struct mmp_pdma_chan *chan, else curr = pdev->ops->read_src_addr(chan->phy); + spin_lock_irqsave(&chan->desc_lock, flags); + list_for_each_entry(sw, &chan->chain_running, node) { u64 start, end; u32 len; @@ -989,6 +992,7 @@ static unsigned int mmp_pdma_residue(struct mmp_pdma_chan *chan, continue; if (sw->async_tx.cookie == cookie) { + spin_unlock_irqrestore(&chan->desc_lock, flags); return residue; } else { residue = 0; @@ -996,6 +1000,8 @@ static unsigned int mmp_pdma_residue(struct mmp_pdma_chan *chan, } } + spin_unlock_irqrestore(&chan->desc_lock, flags); + /* We should only get here in case of cyclic transactions */ return residue; } From 715bcf4dd1167888830e195f50de21c9de45a4b8 Mon Sep 17 00:00:00 2001 From: Zhen Ni Date: Tue, 14 Oct 2025 17:05:22 +0800 Subject: [PATCH 1262/1321] dmaengine: fsl-edma: Fix clk leak on alloc_chan_resources failure When fsl_edma_alloc_chan_resources() fails after clk_prepare_enable(), the error paths only free IRQs and destroy the TCD pool, but forget to call clk_disable_unprepare(). This causes the channel clock to remain enabled, leaking power and resources. Fix it by disabling the channel clock in the error unwind path. Fixes: d8d4355861d8 ("dmaengine: fsl-edma: add i.MX8ULP edma support") Cc: stable@vger.kernel.org Suggested-by: Frank Li Signed-off-by: Zhen Ni Reviewed-by: Frank Li Link: https://patch.msgid.link/20251014090522.827726-1-zhen.ni@easystack.cn Signed-off-by: Vinod Koul --- drivers/dma/fsl-edma-common.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/dma/fsl-edma-common.c b/drivers/dma/fsl-edma-common.c index a592127580299..7137f51ff6a0a 100644 --- a/drivers/dma/fsl-edma-common.c +++ b/drivers/dma/fsl-edma-common.c @@ -873,6 +873,7 @@ int fsl_edma_alloc_chan_resources(struct dma_chan *chan) free_irq(fsl_chan->txirq, fsl_chan); err_txirq: dma_pool_destroy(fsl_chan->tcd_pool); + clk_disable_unprepare(fsl_chan->clk); return ret; } From 94b66e536c81ba338c571d4c359f3543454e16f5 Mon Sep 17 00:00:00 2001 From: Sheetal Date: Mon, 10 Nov 2025 19:54:45 +0530 Subject: [PATCH 1263/1321] dmaengine: tegra-adma: Fix use-after-free A use-after-free bug exists in the Tegra ADMA driver when audio streams are terminated, particularly during XRUN conditions. The issue occurs when the DMA buffer is freed by tegra_adma_terminate_all() before the vchan completion tasklet finishes accessing it. The race condition follows this sequence: 1. DMA transfer completes, triggering an interrupt that schedules the completion tasklet (tasklet has not executed yet) 2. Audio playback stops, calling tegra_adma_terminate_all() which frees the DMA buffer memory via kfree() 3. The scheduled tasklet finally executes, calling vchan_complete() which attempts to access the already-freed memory Since tasklets can execute at any time after being scheduled, there is no guarantee that the buffer will remain valid when vchan_complete() runs. Fix this by properly synchronizing the virtual channel completion: - Calling vchan_terminate_vdesc() in tegra_adma_stop() to mark the descriptors as terminated instead of freeing the descriptor. - Add the callback tegra_adma_synchronize() that calls vchan_synchronize() which kills any pending tasklets and frees any terminated descriptors. Crash logs: [ 337.427523] BUG: KASAN: use-after-free in vchan_complete+0x124/0x3b0 [ 337.427544] Read of size 8 at addr ffff000132055428 by task swapper/0/0 [ 337.427562] Call trace: [ 337.427564] dump_backtrace+0x0/0x320 [ 337.427571] show_stack+0x20/0x30 [ 337.427575] dump_stack_lvl+0x68/0x84 [ 337.427584] print_address_description.constprop.0+0x74/0x2b8 [ 337.427590] kasan_report+0x1f4/0x210 [ 337.427598] __asan_load8+0xa0/0xd0 [ 337.427603] vchan_complete+0x124/0x3b0 [ 337.427609] tasklet_action_common.constprop.0+0x190/0x1d0 [ 337.427617] tasklet_action+0x30/0x40 [ 337.427623] __do_softirq+0x1a0/0x5c4 [ 337.427628] irq_exit+0x110/0x140 [ 337.427633] handle_domain_irq+0xa4/0xe0 [ 337.427640] gic_handle_irq+0x64/0x160 [ 337.427644] call_on_irq_stack+0x20/0x4c [ 337.427649] do_interrupt_handler+0x7c/0x90 [ 337.427654] el1_interrupt+0x30/0x80 [ 337.427659] el1h_64_irq_handler+0x18/0x30 [ 337.427663] el1h_64_irq+0x7c/0x80 [ 337.427667] cpuidle_enter_state+0xe4/0x540 [ 337.427674] cpuidle_enter+0x54/0x80 [ 337.427679] do_idle+0x2e0/0x380 [ 337.427685] cpu_startup_entry+0x2c/0x70 [ 337.427690] rest_init+0x114/0x130 [ 337.427695] arch_call_rest_init+0x18/0x24 [ 337.427702] start_kernel+0x380/0x3b4 [ 337.427706] __primary_switched+0xc0/0xc8 Fixes: f46b195799b5 ("dmaengine: tegra-adma: Add support for Tegra210 ADMA") Signed-off-by: Sheetal Acked-by: Thierry Reding Link: https://patch.msgid.link/20251110142445.3842036-1-sheetal@nvidia.com Signed-off-by: Vinod Koul --- drivers/dma/tegra210-adma.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/drivers/dma/tegra210-adma.c b/drivers/dma/tegra210-adma.c index d0e8bb27a03b8..215bfef37ec6f 100644 --- a/drivers/dma/tegra210-adma.c +++ b/drivers/dma/tegra210-adma.c @@ -429,10 +429,17 @@ static void tegra_adma_stop(struct tegra_adma_chan *tdc) return; } - kfree(tdc->desc); + vchan_terminate_vdesc(&tdc->desc->vd); tdc->desc = NULL; } +static void tegra_adma_synchronize(struct dma_chan *dc) +{ + struct tegra_adma_chan *tdc = to_tegra_adma_chan(dc); + + vchan_synchronize(&tdc->vc); +} + static void tegra_adma_start(struct tegra_adma_chan *tdc) { struct virt_dma_desc *vd = vchan_next_desc(&tdc->vc); @@ -1155,6 +1162,7 @@ static int tegra_adma_probe(struct platform_device *pdev) tdma->dma_dev.device_config = tegra_adma_slave_config; tdma->dma_dev.device_tx_status = tegra_adma_tx_status; tdma->dma_dev.device_terminate_all = tegra_adma_terminate_all; + tdma->dma_dev.device_synchronize = tegra_adma_synchronize; tdma->dma_dev.src_addr_widths = BIT(DMA_SLAVE_BUSWIDTH_4_BYTES); tdma->dma_dev.dst_addr_widths = BIT(DMA_SLAVE_BUSWIDTH_4_BYTES); tdma->dma_dev.directions = BIT(DMA_DEV_TO_MEM) | BIT(DMA_MEM_TO_DEV); From 5db7cd89b39f93bfb24a5a897cae78e9ee33f9e3 Mon Sep 17 00:00:00 2001 From: Suraj Gupta Date: Wed, 22 Oct 2025 00:00:06 +0530 Subject: [PATCH 1264/1321] dmaengine: xilinx_dma: Fix uninitialized addr_width when "xlnx,addrwidth" property is missing When device tree lacks optional "xlnx,addrwidth" property, the addr_width variable remained uninitialized with garbage values, causing incorrect DMA mask configuration and subsequent probe failure. The fix ensures a fallback to the default 32-bit address width when this property is missing. Signed-off-by: Suraj Gupta Fixes: b72db4005fe4 ("dmaengine: vdma: Add 64 bit addressing support to the driver") Reviewed-by: Radhey Shyam Pandey Reviewed-by: Folker Schwesinger Link: https://patch.msgid.link/20251021183006.3434495-1-suraj.gupta2@amd.com Signed-off-by: Vinod Koul --- drivers/dma/xilinx/xilinx_dma.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/dma/xilinx/xilinx_dma.c b/drivers/dma/xilinx/xilinx_dma.c index fabff602065f6..89a8254d9cdc6 100644 --- a/drivers/dma/xilinx/xilinx_dma.c +++ b/drivers/dma/xilinx/xilinx_dma.c @@ -131,6 +131,7 @@ #define XILINX_MCDMA_MAX_CHANS_PER_DEVICE 0x20 #define XILINX_DMA_MAX_CHANS_PER_DEVICE 0x2 #define XILINX_CDMA_MAX_CHANS_PER_DEVICE 0x1 +#define XILINX_DMA_DFAULT_ADDRWIDTH 0x20 #define XILINX_DMA_DMAXR_ALL_IRQ_MASK \ (XILINX_DMA_DMASR_FRM_CNT_IRQ | \ @@ -3159,7 +3160,7 @@ static int xilinx_dma_probe(struct platform_device *pdev) struct device_node *node = pdev->dev.of_node; struct xilinx_dma_device *xdev; struct device_node *child, *np = pdev->dev.of_node; - u32 num_frames, addr_width, len_width; + u32 num_frames, addr_width = XILINX_DMA_DFAULT_ADDRWIDTH, len_width; int i, err; /* Allocate and initialize the DMA engine structure */ @@ -3235,7 +3236,9 @@ static int xilinx_dma_probe(struct platform_device *pdev) err = of_property_read_u32(node, "xlnx,addrwidth", &addr_width); if (err < 0) - dev_warn(xdev->dev, "missing xlnx,addrwidth property\n"); + dev_warn(xdev->dev, + "missing xlnx,addrwidth property, using default value %d\n", + XILINX_DMA_DFAULT_ADDRWIDTH); if (addr_width > 32) xdev->ext_addr = true; From 6255580d55eaf0ad35e083d6d5a145938269fb98 Mon Sep 17 00:00:00 2001 From: Biju Das Date: Thu, 13 Nov 2025 19:50:48 +0000 Subject: [PATCH 1265/1321] dmaengine: sh: rz-dmac: Fix rz_dmac_terminate_all() After audio full duplex testing, playing the recorded file contains a few playback frames from the previous time. The rz_dmac_terminate_all() does not reset all the hardware descriptors queued previously, leading to the wrong descriptor being picked up during the next DMA transfer. Fix the above issue by resetting all the descriptor headers for a channel in rz_dmac_terminate_all() as rz_dmac_lmdesc_recycle() points to the proper descriptor header filled by the rz_dmac_prepare_descs_for_slave_sg(). Cc: stable@kernel.org Fixes: 5000d37042a6 ("dmaengine: sh: Add DMAC driver for RZ/G2L SoC") Reviewed-by: Geert Uytterhoeven Signed-off-by: Biju Das Reviewed-by: Claudiu Beznea Tested-by: Claudiu Beznea Link: https://patch.msgid.link/20251113195052.564338-1-biju.das.jz@bp.renesas.com Signed-off-by: Vinod Koul --- drivers/dma/sh/rz-dmac.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/dma/sh/rz-dmac.c b/drivers/dma/sh/rz-dmac.c index 38137e8d80b9f..9e5f088355e22 100644 --- a/drivers/dma/sh/rz-dmac.c +++ b/drivers/dma/sh/rz-dmac.c @@ -557,11 +557,16 @@ rz_dmac_prep_slave_sg(struct dma_chan *chan, struct scatterlist *sgl, static int rz_dmac_terminate_all(struct dma_chan *chan) { struct rz_dmac_chan *channel = to_rz_dmac_chan(chan); + struct rz_lmdesc *lmdesc = channel->lmdesc.base; unsigned long flags; + unsigned int i; LIST_HEAD(head); rz_dmac_disable_hw(channel); spin_lock_irqsave(&channel->vc.lock, flags); + for (i = 0; i < DMAC_NR_LMDESC; i++) + lmdesc[i].header = 0; + list_splice_tail_init(&channel->ld_active, &channel->ld_free); list_splice_tail_init(&channel->ld_queue, &channel->ld_free); vchan_get_all_descriptors(&channel->vc, &head); From 224e5afa649fead7fb15b4079c207c1c49145d0d Mon Sep 17 00:00:00 2001 From: Miaoqian Lin Date: Wed, 29 Oct 2025 20:34:19 +0800 Subject: [PATCH 1266/1321] dmaengine: qcom: gpi: Fix memory leak in gpi_peripheral_config() Fix a memory leak in gpi_peripheral_config() where the original memory pointed to by gchan->config could be lost if krealloc() fails. The issue occurs when: 1. gchan->config points to previously allocated memory 2. krealloc() fails and returns NULL 3. The function directly assigns NULL to gchan->config, losing the reference to the original memory 4. The original memory becomes unreachable and cannot be freed Fix this by using a temporary variable to hold the krealloc() result and only updating gchan->config when the allocation succeeds. Found via static analysis and code review. Fixes: 5d0c3533a19f ("dmaengine: qcom: Add GPI dma driver") Cc: stable@vger.kernel.org Signed-off-by: Miaoqian Lin Reviewed-by: Bjorn Andersson Link: https://patch.msgid.link/20251029123421.91973-1-linmq006@gmail.com Signed-off-by: Vinod Koul --- drivers/dma/qcom/gpi.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/dma/qcom/gpi.c b/drivers/dma/qcom/gpi.c index 66bfea1f156d1..6e30f3aa401e3 100644 --- a/drivers/dma/qcom/gpi.c +++ b/drivers/dma/qcom/gpi.c @@ -1605,14 +1605,16 @@ static int gpi_peripheral_config(struct dma_chan *chan, struct dma_slave_config *config) { struct gchan *gchan = to_gchan(chan); + void *new_config; if (!config->peripheral_config) return -EINVAL; - gchan->config = krealloc(gchan->config, config->peripheral_size, GFP_NOWAIT); - if (!gchan->config) + new_config = krealloc(gchan->config, config->peripheral_size, GFP_NOWAIT); + if (!new_config) return -ENOMEM; + gchan->config = new_config; memcpy(gchan->config, config->peripheral_config, config->peripheral_size); return 0; From a8aa2e5c823fa4a1f8c1b3deaf897864cf651a57 Mon Sep 17 00:00:00 2001 From: Haotian Zhang Date: Mon, 3 Nov 2025 15:30:18 +0800 Subject: [PATCH 1267/1321] dmaengine: omap-dma: fix dma_pool resource leak in error paths The dma_pool created by dma_pool_create() is not destroyed when dma_async_device_register() or of_dma_controller_register() fails, causing a resource leak in the probe error paths. Add dma_pool_destroy() in both error paths to properly release the allocated dma_pool resource. Fixes: 7bedaa553760 ("dmaengine: add OMAP DMA engine driver") Signed-off-by: Haotian Zhang Link: https://patch.msgid.link/20251103073018.643-1-vulab@iscas.ac.cn Signed-off-by: Vinod Koul --- drivers/dma/ti/omap-dma.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/dma/ti/omap-dma.c b/drivers/dma/ti/omap-dma.c index 8c023c6e623a5..73ed4b7946304 100644 --- a/drivers/dma/ti/omap-dma.c +++ b/drivers/dma/ti/omap-dma.c @@ -1808,6 +1808,8 @@ static int omap_dma_probe(struct platform_device *pdev) if (rc) { pr_warn("OMAP-DMA: failed to register slave DMA engine device: %d\n", rc); + if (od->ll123_supported) + dma_pool_destroy(od->desc_pool); omap_dma_free(od); return rc; } @@ -1823,6 +1825,8 @@ static int omap_dma_probe(struct platform_device *pdev) if (rc) { pr_warn("OMAP-DMA: failed to register DMA controller\n"); dma_async_device_unregister(&od->ddev); + if (od->ll123_supported) + dma_pool_destroy(od->desc_pool); omap_dma_free(od); } } From c2d9b016d56630aae8a5dbe4438e137c9dbfb4c7 Mon Sep 17 00:00:00 2001 From: Janne Grunau Date: Wed, 31 Dec 2025 13:34:59 +0100 Subject: [PATCH 1268/1321] dmaengine: apple-admac: Add "apple,t8103-admac" compatible After discussion with the devicetree maintainers we agreed to not extend lists with the generic compatible "apple,admac" anymore [1]. Use "apple,t8103-admac" as base compatible as it is the SoC the driver and bindings were written for. [1]: https://lore.kernel.org/asahi/12ab93b7-1fc2-4ce0-926e-c8141cfe81bf@kernel.org/ Fixes: b127315d9a78 ("dmaengine: apple-admac: Add Apple ADMAC driver") Cc: stable@vger.kernel.org Reviewed-by: Neal Gompa Signed-off-by: Janne Grunau Link: https://patch.msgid.link/20251231-apple-admac-t8103-base-compat-v1-1-ec24a3708f76@jannau.net Signed-off-by: Vinod Koul --- drivers/dma/apple-admac.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/dma/apple-admac.c b/drivers/dma/apple-admac.c index bd49f03742912..04bbd774b3b44 100644 --- a/drivers/dma/apple-admac.c +++ b/drivers/dma/apple-admac.c @@ -936,6 +936,7 @@ static void admac_remove(struct platform_device *pdev) } static const struct of_device_id admac_of_match[] = { + { .compatible = "apple,t8103-admac", }, { .compatible = "apple,admac", }, { } }; From ec0043b52f1bf359005ce48d825781b63e8d395e Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Thu, 4 Dec 2025 11:19:10 +0100 Subject: [PATCH 1269/1321] ext4: fix ext4_tune_sb_params padding The padding at the end of struct ext4_tune_sb_params is architecture specific and in particular is different between x86-32 and x86-64, since the __u64 member only enforces struct alignment on the latter. This shows up as a new warning when test-building the headers with -Wpadded: include/linux/ext4.h:144:1: error: padding struct size to alignment boundary with 4 bytes [-Werror=padded] All members inside the structure are naturally aligned, so the only difference here is the amount of padding at the end. Make the padding explicit, to have a consistent sizeof(struct ext4_tune_sb_params) of 232 on all architectures and avoid adding compat ioctl handling for EXT4_IOC_GET_TUNE_SB_PARAM/EXT4_IOC_SET_TUNE_SB_PARAM. This is an ABI break on x86-32 but hopefully this can go into 6.18.y early enough as a fixup so no actual users will be affected. Alternatively, the kernel could handle the ioctl commands for both sizes (232 and 228 bytes) on all architectures. Fixes: 04a91570ac67 ("ext4: implemet new ioctls to set and get superblock parameters") Signed-off-by: Arnd Bergmann Reviewed-by: Jan Kara Link: https://patch.msgid.link/20251204101914.1037148-1-arnd@kernel.org Signed-off-by: Theodore Ts'o Cc: stable@kernel.org --- include/uapi/linux/ext4.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/uapi/linux/ext4.h b/include/uapi/linux/ext4.h index 411dcc1e4a35c..9c683991c32f9 100644 --- a/include/uapi/linux/ext4.h +++ b/include/uapi/linux/ext4.h @@ -139,7 +139,7 @@ struct ext4_tune_sb_params { __u32 clear_feature_incompat_mask; __u32 clear_feature_ro_compat_mask; __u8 mount_opts[64]; - __u8 pad[64]; + __u8 pad[68]; }; #define EXT4_TUNE_FL_ERRORS_BEHAVIOR 0x00000001 From af6cdfd17f064079aef72741d7518ead81634955 Mon Sep 17 00:00:00 2001 From: Julian Sun Date: Mon, 8 Dec 2025 20:37:13 +0800 Subject: [PATCH 1270/1321] ext4: add missing down_write_data_sem in mext_move_extent(). Commit 962e8a01eab9 ("ext4: introduce mext_move_extent()") attempts to call ext4_swap_extents() on the failure path to recover the swapped extents, but fails to acquire locks for the two inode->i_data_sem, triggering the BUG_ON statement in ext4_swap_extents(). This issue can be fixed by calling ext4_double_down_write_data_sem() before ext4_swap_extents(). Signed-off-by: Julian Sun Reported-by: syzbot+4ea6bd8737669b423aae@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/69368649.a70a0220.38f243.0093.GAE@google.com/ Fixes: 962e8a01eab9 ("ext4: introduce mext_move_extent()") Reviewed-by: Baokun Li Reviewed-by: Jan Kara Reviewed-by: Zhang Yi Signed-off-by: Theodore Ts'o Link: https://patch.msgid.link/20251208123713.1971068-1-sunjunchao@bytedance.com Signed-off-by: Theodore Ts'o --- fs/ext4/move_extent.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c index 0550fd30fd10f..635fb8a52e0c2 100644 --- a/fs/ext4/move_extent.c +++ b/fs/ext4/move_extent.c @@ -393,9 +393,11 @@ static int mext_move_extent(struct mext_data *mext, u64 *m_len) repair_branches: ret2 = 0; + ext4_double_down_write_data_sem(orig_inode, donor_inode); r_len = ext4_swap_extents(handle, donor_inode, orig_inode, mext->donor_lblk, orig_map->m_lblk, *m_len, 0, &ret2); + ext4_double_up_write_data_sem(orig_inode, donor_inode); if (ret2 || r_len != *m_len) { ext4_error_inode_block(orig_inode, (sector_t)(orig_map->m_lblk), EIO, "Unable to copy data block, data will be lost!"); From 77bf3551d0ad0cbf10f7f376d329f16772a72f27 Mon Sep 17 00:00:00 2001 From: Yang Erkun Date: Sat, 13 Dec 2025 13:57:06 +0800 Subject: [PATCH 1271/1321] ext4: fix iloc.bh leak in ext4_xattr_inode_update_ref The error branch for ext4_xattr_inode_update_ref forget to release the refcount for iloc.bh. Find this when review code. Fixes: 57295e835408 ("ext4: guard against EA inode refcount underflow in xattr update") Signed-off-by: Yang Erkun Reviewed-by: Baokun Li Reviewed-by: Zhang Yi Link: https://patch.msgid.link/20251213055706.3417529-1-yangerkun@huawei.com Signed-off-by: Theodore Ts'o Cc: stable@kernel.org --- fs/ext4/xattr.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c index 2e02efbddaacc..4ed8ddf2a60b3 100644 --- a/fs/ext4/xattr.c +++ b/fs/ext4/xattr.c @@ -1037,6 +1037,7 @@ static int ext4_xattr_inode_update_ref(handle_t *handle, struct inode *ea_inode, ext4_error_inode(ea_inode, __func__, __LINE__, 0, "EA inode %lu ref wraparound: ref_count=%lld ref_change=%d", ea_inode->i_ino, ref_count, ref_change); + brelse(iloc.bh); ret = -EFSCORRUPTED; goto out; } From 76536189f311934de51187c7b8c8286a3c37a2bd Mon Sep 17 00:00:00 2001 From: Waiman Long Date: Tue, 13 Jan 2026 23:54:35 -0500 Subject: [PATCH 1272/1321] MAINTAINERS: Add Chen Ridong as cpuset reviewer Add Chen Ridong as a reviewer for the cpuset cgroup subsystem. Signed-off-by: Waiman Long Acked-by: Chen Ridong Signed-off-by: Tejun Heo --- MAINTAINERS | 1 + 1 file changed, 1 insertion(+) diff --git a/MAINTAINERS b/MAINTAINERS index 41ace5aeecbe5..da9dbc1a40194 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -6421,6 +6421,7 @@ F: include/linux/blk-cgroup.h CONTROL GROUP - CPUSET M: Waiman Long +R: Chen Ridong L: cgroups@vger.kernel.org S: Maintained T: git git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git From 43434091803650fb49474076f2f89a43a9e4c8ff Mon Sep 17 00:00:00 2001 From: Tim Bird Date: Wed, 14 Jan 2026 13:30:27 -0700 Subject: [PATCH 1273/1321] kernel: cgroup: Add SPDX-License-Identifier lines Add GPL-2.0 SPDX license id lines to a few old files, replacing the reference to the COPYING file. The COPYING file at the time of creation of these files (2007 and 2005) was GPL-v2.0, with an additional clause indicating that only v2 applied. Signed-off-by: Tim Bird Signed-off-by: Tejun Heo --- kernel/cgroup/cgroup.c | 5 +---- kernel/cgroup/cpuset.c | 5 +---- 2 files changed, 2 insertions(+), 8 deletions(-) diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c index 554a02ee298ba..5f0d33b049102 100644 --- a/kernel/cgroup/cgroup.c +++ b/kernel/cgroup/cgroup.c @@ -1,3 +1,4 @@ +// SPDX-License-Identifier: GPL-2.0 /* * Generic process-grouping system. * @@ -20,10 +21,6 @@ * 2003-10-22 Updates by Stephen Hemminger. * 2004 May-July Rework by Paul Jackson. * --------------------------------------------------- - * - * This file is subject to the terms and conditions of the GNU General Public - * License. See the file COPYING in the main directory of the Linux - * distribution for more details. */ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c index 3e8cc34d8d502..c06e2e96f79dc 100644 --- a/kernel/cgroup/cpuset.c +++ b/kernel/cgroup/cpuset.c @@ -1,3 +1,4 @@ +// SPDX-License-Identifier: GPL-2.0 /* * kernel/cpuset.c * @@ -16,10 +17,6 @@ * 2006 Rework by Paul Menage to use generic cgroups * 2008 Rework of the scheduler domains and CPU hotplug handling * by Max Krasnyansky - * - * This file is subject to the terms and conditions of the GNU General Public - * License. See the file COPYING in the main directory of the Linux - * distribution for more details. */ #include "cpuset-internal.h" From a836903e9ac9886f81d5a9241cf897651da835c2 Mon Sep 17 00:00:00 2001 From: Tim Bird Date: Wed, 14 Jan 2026 16:22:08 -0700 Subject: [PATCH 1274/1321] kernel: cgroup: Add LGPL-2.1 SPDX license ID to legacy_freezer.c Add an appropriate SPDX-License-Identifier line to the file, and remove the GNU boilerplate text. Signed-off-by: Tim Bird Signed-off-by: Tejun Heo --- kernel/cgroup/legacy_freezer.c | 9 +-------- 1 file changed, 1 insertion(+), 8 deletions(-) diff --git a/kernel/cgroup/legacy_freezer.c b/kernel/cgroup/legacy_freezer.c index 915b02f659809..817c33450fee7 100644 --- a/kernel/cgroup/legacy_freezer.c +++ b/kernel/cgroup/legacy_freezer.c @@ -1,17 +1,10 @@ +// SPDX-License-Identifier: LGPL-2.1 /* * cgroup_freezer.c - control group freezer subsystem * * Copyright IBM Corporation, 2007 * * Author : Cedric Le Goater - * - * This program is free software; you can redistribute it and/or modify it - * under the terms of version 2.1 of the GNU Lesser General Public License - * as published by the Free Software Foundation. - * - * This program is distributed in the hope that it would be useful, but - * WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. */ #include From 4dcf043bb1b86ff214ad144629ee99b97f963ba6 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Micka=C3=ABl=20Sala=C3=BCn?= Date: Fri, 19 Dec 2025 20:38:51 +0100 Subject: [PATCH 1275/1321] landlock: Fix formatting MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Format with clang-format -i security/landlock/*.[ch] Cc: Christian Brauner Cc: Günther Noack Cc: Mateusz Guzik Fixes: b4dbfd8653b3 ("Coccinelle-based conversion to use ->i_state accessors") Link: https://lore.kernel.org/r/20251219193855.825889-5-mic@digikod.net Reviewed-by: Günther Noack Signed-off-by: Mickaël Salaün --- security/landlock/fs.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/security/landlock/fs.c b/security/landlock/fs.c index fe794875ad461..e3c3a8a9ac27b 100644 --- a/security/landlock/fs.c +++ b/security/landlock/fs.c @@ -1314,7 +1314,8 @@ static void hook_sb_delete(struct super_block *const sb) * second call to iput() for the same Landlock object. Also * checks I_NEW because such inode cannot be tied to an object. */ - if (inode_state_read(inode) & (I_FREEING | I_WILL_FREE | I_NEW)) { + if (inode_state_read(inode) & + (I_FREEING | I_WILL_FREE | I_NEW)) { spin_unlock(&inode->i_lock); continue; } From b79dda71763c182adae75ca3b4be4d92a44c4e0b Mon Sep 17 00:00:00 2001 From: Matthieu Buffet Date: Mon, 27 Oct 2025 20:07:26 +0100 Subject: [PATCH 1276/1321] landlock: Fix TCP handling of short AF_UNSPEC addresses MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit current_check_access_socket() treats AF_UNSPEC addresses as AF_INET ones, and only later adds special case handling to allow connect(AF_UNSPEC), and on IPv4 sockets bind(AF_UNSPEC+INADDR_ANY). This would be fine except AF_UNSPEC addresses can be as short as a bare AF_UNSPEC sa_family_t field, and nothing more. The AF_INET code path incorrectly enforces a length of sizeof(struct sockaddr_in) instead. Move AF_UNSPEC edge case handling up inside the switch-case, before the address is (potentially incorrectly) treated as AF_INET. Fixes: fff69fb03dde ("landlock: Support network rules with TCP bind and connect") Signed-off-by: Matthieu Buffet Link: https://lore.kernel.org/r/20251027190726.626244-4-matthieu@buffet.re Signed-off-by: Mickaël Salaün --- security/landlock/net.c | 118 +++++++++++++++++++++++----------------- 1 file changed, 67 insertions(+), 51 deletions(-) diff --git a/security/landlock/net.c b/security/landlock/net.c index 1f3915a90a808..e6367e30e5b0e 100644 --- a/security/landlock/net.c +++ b/security/landlock/net.c @@ -71,6 +71,61 @@ static int current_check_access_socket(struct socket *const sock, switch (address->sa_family) { case AF_UNSPEC: + if (access_request == LANDLOCK_ACCESS_NET_CONNECT_TCP) { + /* + * Connecting to an address with AF_UNSPEC dissolves + * the TCP association, which have the same effect as + * closing the connection while retaining the socket + * object (i.e., the file descriptor). As for dropping + * privileges, closing connections is always allowed. + * + * For a TCP access control system, this request is + * legitimate. Let the network stack handle potential + * inconsistencies and return -EINVAL if needed. + */ + return 0; + } else if (access_request == LANDLOCK_ACCESS_NET_BIND_TCP) { + /* + * Binding to an AF_UNSPEC address is treated + * differently by IPv4 and IPv6 sockets. The socket's + * family may change under our feet due to + * setsockopt(IPV6_ADDRFORM), but that's ok: we either + * reject entirely or require + * %LANDLOCK_ACCESS_NET_BIND_TCP for the given port, so + * it cannot be used to bypass the policy. + * + * IPv4 sockets map AF_UNSPEC to AF_INET for + * retrocompatibility for bind accesses, only if the + * address is INADDR_ANY (cf. __inet_bind). IPv6 + * sockets always reject it. + * + * Checking the address is required to not wrongfully + * return -EACCES instead of -EAFNOSUPPORT or -EINVAL. + * We could return 0 and let the network stack handle + * these checks, but it is safer to return a proper + * error and test consistency thanks to kselftest. + */ + if (sock->sk->__sk_common.skc_family == AF_INET) { + const struct sockaddr_in *const sockaddr = + (struct sockaddr_in *)address; + + if (addrlen < sizeof(struct sockaddr_in)) + return -EINVAL; + + if (sockaddr->sin_addr.s_addr != + htonl(INADDR_ANY)) + return -EAFNOSUPPORT; + } else { + if (addrlen < SIN6_LEN_RFC2133) + return -EINVAL; + else + return -EAFNOSUPPORT; + } + } else { + WARN_ON_ONCE(1); + } + /* Only for bind(AF_UNSPEC+INADDR_ANY) on IPv4 socket. */ + fallthrough; case AF_INET: { const struct sockaddr_in *addr4; @@ -119,57 +174,18 @@ static int current_check_access_socket(struct socket *const sock, return 0; } - /* Specific AF_UNSPEC handling. */ - if (address->sa_family == AF_UNSPEC) { - /* - * Connecting to an address with AF_UNSPEC dissolves the TCP - * association, which have the same effect as closing the - * connection while retaining the socket object (i.e., the file - * descriptor). As for dropping privileges, closing - * connections is always allowed. - * - * For a TCP access control system, this request is legitimate. - * Let the network stack handle potential inconsistencies and - * return -EINVAL if needed. - */ - if (access_request == LANDLOCK_ACCESS_NET_CONNECT_TCP) - return 0; - - /* - * For compatibility reason, accept AF_UNSPEC for bind - * accesses (mapped to AF_INET) only if the address is - * INADDR_ANY (cf. __inet_bind). Checking the address is - * required to not wrongfully return -EACCES instead of - * -EAFNOSUPPORT. - * - * We could return 0 and let the network stack handle these - * checks, but it is safer to return a proper error and test - * consistency thanks to kselftest. - */ - if (access_request == LANDLOCK_ACCESS_NET_BIND_TCP) { - /* addrlen has already been checked for AF_UNSPEC. */ - const struct sockaddr_in *const sockaddr = - (struct sockaddr_in *)address; - - if (sock->sk->__sk_common.skc_family != AF_INET) - return -EINVAL; - - if (sockaddr->sin_addr.s_addr != htonl(INADDR_ANY)) - return -EAFNOSUPPORT; - } - } else { - /* - * Checks sa_family consistency to not wrongfully return - * -EACCES instead of -EINVAL. Valid sa_family changes are - * only (from AF_INET or AF_INET6) to AF_UNSPEC. - * - * We could return 0 and let the network stack handle this - * check, but it is safer to return a proper error and test - * consistency thanks to kselftest. - */ - if (address->sa_family != sock->sk->__sk_common.skc_family) - return -EINVAL; - } + /* + * Checks sa_family consistency to not wrongfully return + * -EACCES instead of -EINVAL. Valid sa_family changes are + * only (from AF_INET or AF_INET6) to AF_UNSPEC. + * + * We could return 0 and let the network stack handle this + * check, but it is safer to return a proper error and test + * consistency thanks to kselftest. + */ + if (address->sa_family != sock->sk->__sk_common.skc_family && + address->sa_family != AF_UNSPEC) + return -EINVAL; id.key.data = (__force uintptr_t)port; BUILD_BUG_ON(sizeof(port) > sizeof(id.key.data)); From 92072529d3702c3efacde1b4b5ce121d9378a7da Mon Sep 17 00:00:00 2001 From: Matthieu Buffet Date: Mon, 27 Oct 2025 20:07:24 +0100 Subject: [PATCH 1277/1321] selftests/landlock: Fix TCP bind(AF_UNSPEC) test case MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The nominal error code for bind(AF_UNSPEC) on an IPv6 socket is -EAFNOSUPPORT, not -EINVAL. -EINVAL is only returned when the supplied address struct is too short, which happens to be the case in current selftests because they treat AF_UNSPEC like IPv4 sockets do: as an alias for AF_INET (which is a 16-byte struct instead of the 24 bytes required by IPv6 sockets). Make the union large enough for any address (by adding struct sockaddr_storage to the union), and make AF_UNSPEC addresses large enough for any family. Test for -EAFNOSUPPORT instead, and add a dedicated test case for truncated inputs with -EINVAL. Fixes: a549d055a22e ("selftests/landlock: Add network tests") Signed-off-by: Matthieu Buffet Link: https://lore.kernel.org/r/20251027190726.626244-2-matthieu@buffet.re Signed-off-by: Mickaël Salaün --- tools/testing/selftests/landlock/common.h | 1 + tools/testing/selftests/landlock/net_test.c | 16 +++++++++++++++- 2 files changed, 16 insertions(+), 1 deletion(-) diff --git a/tools/testing/selftests/landlock/common.h b/tools/testing/selftests/landlock/common.h index 230b75f6015ba..90551650299c2 100644 --- a/tools/testing/selftests/landlock/common.h +++ b/tools/testing/selftests/landlock/common.h @@ -237,6 +237,7 @@ struct service_fixture { struct sockaddr_un unix_addr; socklen_t unix_addr_len; }; + struct sockaddr_storage _largest; }; }; diff --git a/tools/testing/selftests/landlock/net_test.c b/tools/testing/selftests/landlock/net_test.c index 2a45208551e61..3bbc0508420b1 100644 --- a/tools/testing/selftests/landlock/net_test.c +++ b/tools/testing/selftests/landlock/net_test.c @@ -121,6 +121,10 @@ static socklen_t get_addrlen(const struct service_fixture *const srv, { switch (srv->protocol.domain) { case AF_UNSPEC: + if (minimal) + return sizeof(sa_family_t); + return sizeof(struct sockaddr_storage); + case AF_INET: return sizeof(srv->ipv4_addr); @@ -758,6 +762,11 @@ TEST_F(protocol, bind_unspec) bind_fd = socket_variant(&self->srv0); ASSERT_LE(0, bind_fd); + /* Tries to bind with too small addrlen. */ + EXPECT_EQ(-EINVAL, bind_variant_addrlen( + bind_fd, &self->unspec_any0, + get_addrlen(&self->unspec_any0, true) - 1)); + /* Allowed bind on AF_UNSPEC/INADDR_ANY. */ ret = bind_variant(bind_fd, &self->unspec_any0); if (variant->prot.domain == AF_INET) { @@ -766,6 +775,8 @@ TEST_F(protocol, bind_unspec) TH_LOG("Failed to bind to unspec/any socket: %s", strerror(errno)); } + } else if (variant->prot.domain == AF_INET6) { + EXPECT_EQ(-EAFNOSUPPORT, ret); } else { EXPECT_EQ(-EINVAL, ret); } @@ -792,6 +803,8 @@ TEST_F(protocol, bind_unspec) } else { EXPECT_EQ(0, ret); } + } else if (variant->prot.domain == AF_INET6) { + EXPECT_EQ(-EAFNOSUPPORT, ret); } else { EXPECT_EQ(-EINVAL, ret); } @@ -801,7 +814,8 @@ TEST_F(protocol, bind_unspec) bind_fd = socket_variant(&self->srv0); ASSERT_LE(0, bind_fd); ret = bind_variant(bind_fd, &self->unspec_srv0); - if (variant->prot.domain == AF_INET) { + if (variant->prot.domain == AF_INET || + variant->prot.domain == AF_INET6) { EXPECT_EQ(-EAFNOSUPPORT, ret); } else { EXPECT_EQ(-EINVAL, ret) From e19b07b130024b139efb5d7021ba63a51158d119 Mon Sep 17 00:00:00 2001 From: Matthieu Buffet Date: Mon, 27 Oct 2025 20:07:25 +0100 Subject: [PATCH 1278/1321] selftests/landlock: Add missing connect(minimal AF_UNSPEC) test MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit connect_variant(unspec_any0) is called twice. Both calls end up in connect_variant_addrlen() with an address length of get_addrlen(minimal=false). However, the connect() syscall and its variants (e.g. iouring/compat) accept much shorter addresses of 4 bytes and that behaviour was not tested. Replace one of these calls with one using a minimal address length (just a bare sa_family=AF_UNSPEC field with no actual address). Also add a call using a truncated address for good measure. Signed-off-by: Matthieu Buffet Link: https://lore.kernel.org/r/20251027190726.626244-3-matthieu@buffet.re Signed-off-by: Mickaël Salaün --- tools/testing/selftests/landlock/net_test.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/tools/testing/selftests/landlock/net_test.c b/tools/testing/selftests/landlock/net_test.c index 3bbc0508420b1..b34b139b3f89c 100644 --- a/tools/testing/selftests/landlock/net_test.c +++ b/tools/testing/selftests/landlock/net_test.c @@ -906,7 +906,19 @@ TEST_F(protocol, connect_unspec) EXPECT_EQ(0, close(ruleset_fd)); } - ret = connect_variant(connect_fd, &self->unspec_any0); + /* Try to re-disconnect with a truncated address struct. */ + EXPECT_EQ(-EINVAL, + connect_variant_addrlen( + connect_fd, &self->unspec_any0, + get_addrlen(&self->unspec_any0, true) - 1)); + + /* + * Re-disconnect, with a minimal sockaddr struct (just a + * bare af_family=AF_UNSPEC field). + */ + ret = connect_variant_addrlen(connect_fd, &self->unspec_any0, + get_addrlen(&self->unspec_any0, + true)); if (self->srv0.protocol.domain == AF_UNIX && self->srv0.protocol.type == SOCK_STREAM) { EXPECT_EQ(-EINVAL, ret); From 8e44f89a93fa977c938f4a25961fc0460de2745e Mon Sep 17 00:00:00 2001 From: Matthieu Buffet Date: Mon, 1 Dec 2025 01:36:31 +0100 Subject: [PATCH 1279/1321] selftests/landlock: Remove invalid unix socket bind() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Remove bind() call on a client socket that doesn't make sense. Since strlen(cli_un.sun_path) returns a random value depending on stack garbage, that many uninitialized bytes are read from the stack as an unix socket address. This creates random test failures due to the bind address being invalid or already in use if the same stack value comes up twice. Fixes: f83d51a5bdfe ("selftests/landlock: Check IOCTL restrictions for named UNIX domain sockets") Signed-off-by: Matthieu Buffet Reviewed-by: Günther Noack Link: https://lore.kernel.org/r/20251201003631.190817-1-matthieu@buffet.re Signed-off-by: Mickaël Salaün --- tools/testing/selftests/landlock/fs_test.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/tools/testing/selftests/landlock/fs_test.c b/tools/testing/selftests/landlock/fs_test.c index eee814e09dd70..7d378bdf3bce0 100644 --- a/tools/testing/selftests/landlock/fs_test.c +++ b/tools/testing/selftests/landlock/fs_test.c @@ -4391,9 +4391,6 @@ TEST_F_FORK(layout1, named_unix_domain_socket_ioctl) cli_fd = socket(AF_UNIX, SOCK_STREAM, 0); ASSERT_LE(0, cli_fd); - size = offsetof(struct sockaddr_un, sun_path) + strlen(cli_un.sun_path); - ASSERT_EQ(0, bind(cli_fd, (struct sockaddr *)&cli_un, size)); - bzero(&cli_un, sizeof(cli_un)); cli_un.sun_family = AF_UNIX; strncpy(cli_un.sun_path, path, sizeof(cli_un.sun_path)); From 64a624722f1840a5a68d846a2568c380ec369950 Mon Sep 17 00:00:00 2001 From: Matthieu Buffet Date: Tue, 2 Dec 2025 22:51:41 +0100 Subject: [PATCH 1280/1321] selftests/landlock: NULL-terminate unix pathname addresses MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The size of Unix pathname addresses is computed in selftests using offsetof(struct sockaddr_un, sun_path) + strlen(xxx). It should have been that +1, which makes addresses passed to the libc and kernel non-NULL-terminated. unix_mkname_bsd() fixes that in Linux so there is no harm, but just using sizeof(the address struct) should improve readability. Signed-off-by: Matthieu Buffet Reviewed-by: Günther Noack Link: https://lore.kernel.org/r/20251202215141.689986-1-matthieu@buffet.re Signed-off-by: Mickaël Salaün --- tools/testing/selftests/landlock/fs_test.c | 24 +++++++++---------- .../landlock/scoped_abstract_unix_test.c | 21 +++++++--------- 2 files changed, 20 insertions(+), 25 deletions(-) diff --git a/tools/testing/selftests/landlock/fs_test.c b/tools/testing/selftests/landlock/fs_test.c index 7d378bdf3bce0..76491ba54dce2 100644 --- a/tools/testing/selftests/landlock/fs_test.c +++ b/tools/testing/selftests/landlock/fs_test.c @@ -4362,22 +4362,24 @@ TEST_F_FORK(layout1, named_unix_domain_socket_ioctl) { const char *const path = file1_s1d1; int srv_fd, cli_fd, ruleset_fd; - socklen_t size; - struct sockaddr_un srv_un, cli_un; + struct sockaddr_un srv_un = { + .sun_family = AF_UNIX, + }; + struct sockaddr_un cli_un = { + .sun_family = AF_UNIX, + }; const struct landlock_ruleset_attr attr = { .handled_access_fs = LANDLOCK_ACCESS_FS_IOCTL_DEV, }; /* Sets up a server */ - srv_un.sun_family = AF_UNIX; - strncpy(srv_un.sun_path, path, sizeof(srv_un.sun_path)); - ASSERT_EQ(0, unlink(path)); srv_fd = socket(AF_UNIX, SOCK_STREAM, 0); ASSERT_LE(0, srv_fd); - size = offsetof(struct sockaddr_un, sun_path) + strlen(srv_un.sun_path); - ASSERT_EQ(0, bind(srv_fd, (struct sockaddr *)&srv_un, size)); + strncpy(srv_un.sun_path, path, sizeof(srv_un.sun_path)); + ASSERT_EQ(0, bind(srv_fd, (struct sockaddr *)&srv_un, sizeof(srv_un))); + ASSERT_EQ(0, listen(srv_fd, 10 /* qlen */)); /* Enables Landlock. */ @@ -4387,16 +4389,12 @@ TEST_F_FORK(layout1, named_unix_domain_socket_ioctl) ASSERT_EQ(0, close(ruleset_fd)); /* Sets up a client connection to it */ - cli_un.sun_family = AF_UNIX; cli_fd = socket(AF_UNIX, SOCK_STREAM, 0); ASSERT_LE(0, cli_fd); - bzero(&cli_un, sizeof(cli_un)); - cli_un.sun_family = AF_UNIX; strncpy(cli_un.sun_path, path, sizeof(cli_un.sun_path)); - size = offsetof(struct sockaddr_un, sun_path) + strlen(cli_un.sun_path); - - ASSERT_EQ(0, connect(cli_fd, (struct sockaddr *)&cli_un, size)); + ASSERT_EQ(0, + connect(cli_fd, (struct sockaddr *)&cli_un, sizeof(cli_un))); /* FIONREAD and other IOCTLs should not be forbidden. */ EXPECT_EQ(0, test_fionread_ioctl(cli_fd)); diff --git a/tools/testing/selftests/landlock/scoped_abstract_unix_test.c b/tools/testing/selftests/landlock/scoped_abstract_unix_test.c index 6825082c079c0..2cdf1ba070164 100644 --- a/tools/testing/selftests/landlock/scoped_abstract_unix_test.c +++ b/tools/testing/selftests/landlock/scoped_abstract_unix_test.c @@ -779,7 +779,6 @@ FIXTURE_TEARDOWN(various_address_sockets) TEST_F(various_address_sockets, scoped_pathname_sockets) { - socklen_t size_stream, size_dgram; pid_t child; int status; char buf_child, buf_parent; @@ -798,12 +797,8 @@ TEST_F(various_address_sockets, scoped_pathname_sockets) /* Pathname address. */ snprintf(stream_pathname_addr.sun_path, sizeof(stream_pathname_addr.sun_path), "%s", stream_path); - size_stream = offsetof(struct sockaddr_un, sun_path) + - strlen(stream_pathname_addr.sun_path); snprintf(dgram_pathname_addr.sun_path, sizeof(dgram_pathname_addr.sun_path), "%s", dgram_path); - size_dgram = offsetof(struct sockaddr_un, sun_path) + - strlen(dgram_pathname_addr.sun_path); /* Abstract address. */ memset(&stream_abstract_addr, 0, sizeof(stream_abstract_addr)); @@ -841,8 +836,9 @@ TEST_F(various_address_sockets, scoped_pathname_sockets) /* Connects with pathname sockets. */ stream_pathname_socket = socket(AF_UNIX, SOCK_STREAM, 0); ASSERT_LE(0, stream_pathname_socket); - ASSERT_EQ(0, connect(stream_pathname_socket, - &stream_pathname_addr, size_stream)); + ASSERT_EQ(0, + connect(stream_pathname_socket, &stream_pathname_addr, + sizeof(stream_pathname_addr))); ASSERT_EQ(1, write(stream_pathname_socket, "b", 1)); EXPECT_EQ(0, close(stream_pathname_socket)); @@ -850,12 +846,13 @@ TEST_F(various_address_sockets, scoped_pathname_sockets) dgram_pathname_socket = socket(AF_UNIX, SOCK_DGRAM, 0); ASSERT_LE(0, dgram_pathname_socket); err = sendto(dgram_pathname_socket, "c", 1, 0, - &dgram_pathname_addr, size_dgram); + &dgram_pathname_addr, sizeof(dgram_pathname_addr)); EXPECT_EQ(1, err); /* Sends with connection. */ - ASSERT_EQ(0, connect(dgram_pathname_socket, - &dgram_pathname_addr, size_dgram)); + ASSERT_EQ(0, + connect(dgram_pathname_socket, &dgram_pathname_addr, + sizeof(dgram_pathname_addr))); ASSERT_EQ(1, write(dgram_pathname_socket, "d", 1)); EXPECT_EQ(0, close(dgram_pathname_socket)); @@ -910,13 +907,13 @@ TEST_F(various_address_sockets, scoped_pathname_sockets) stream_pathname_socket = socket(AF_UNIX, SOCK_STREAM, 0); ASSERT_LE(0, stream_pathname_socket); ASSERT_EQ(0, bind(stream_pathname_socket, &stream_pathname_addr, - size_stream)); + sizeof(stream_pathname_addr))); ASSERT_EQ(0, listen(stream_pathname_socket, backlog)); dgram_pathname_socket = socket(AF_UNIX, SOCK_DGRAM, 0); ASSERT_LE(0, dgram_pathname_socket); ASSERT_EQ(0, bind(dgram_pathname_socket, &dgram_pathname_addr, - size_dgram)); + sizeof(dgram_pathname_addr))); /* Sets up abstract servers. */ stream_abstract_socket = socket(AF_UNIX, SOCK_STREAM, 0); From c43ff5138d008e9671beca191b2c4a33475e1870 Mon Sep 17 00:00:00 2001 From: Tingmao Wang Date: Sat, 6 Dec 2025 17:11:06 +0000 Subject: [PATCH 1281/1321] landlock: Fix wrong type usage MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit I think, based on my best understanding, that this type is likely a typo (even though in the end both are u16) Signed-off-by: Tingmao Wang Fixes: 2fc80c69df82 ("landlock: Log file-related denials") Reviewed-by: Günther Noack Link: https://lore.kernel.org/r/7339ad7b47f998affd84ca629a334a71f913616d.1765040503.git.m@maowtm.org Signed-off-by: Mickaël Salaün --- security/landlock/audit.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/security/landlock/audit.c b/security/landlock/audit.c index c52d079cdb77b..e899995f1fd59 100644 --- a/security/landlock/audit.c +++ b/security/landlock/audit.c @@ -191,7 +191,7 @@ static size_t get_denied_layer(const struct landlock_ruleset *const domain, long youngest_layer = -1; for_each_set_bit(access_bit, &access_req, layer_masks_size) { - const access_mask_t mask = (*layer_masks)[access_bit]; + const layer_mask_t mask = (*layer_masks)[access_bit]; long layer; if (!mask) From 0b644f10c9335c4a0e0fcc968c6c250db0f043d3 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Micka=C3=ABl=20Sala=C3=BCn?= Date: Fri, 19 Dec 2025 20:38:47 +0100 Subject: [PATCH 1282/1321] landlock: Remove useless include MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Remove useless audit.h include. Cc: Günther Noack Fixes: 33e65b0d3add ("landlock: Add AUDIT_LANDLOCK_ACCESS and log ptrace denials") Link: https://lore.kernel.org/r/20251219193855.825889-1-mic@digikod.net Reviewed-by: Günther Noack Signed-off-by: Mickaël Salaün --- security/landlock/ruleset.c | 1 - 1 file changed, 1 deletion(-) diff --git a/security/landlock/ruleset.c b/security/landlock/ruleset.c index dfcdc19ea2683..0a5b0c76b3f78 100644 --- a/security/landlock/ruleset.c +++ b/security/landlock/ruleset.c @@ -23,7 +23,6 @@ #include #include "access.h" -#include "audit.h" #include "domain.h" #include "limits.h" #include "object.h" From 55167e5603ff26e01e64289440428539c48c2d30 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Micka=C3=ABl=20Sala=C3=BCn?= Date: Fri, 19 Dec 2025 20:38:48 +0100 Subject: [PATCH 1283/1321] landlock: Improve erratum documentation MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Improve description about scoped signal handling. Reported-by: Günther Noack Link: https://lore.kernel.org/r/20251219193855.825889-2-mic@digikod.net Reviewed-by: Günther Noack Signed-off-by: Mickaël Salaün --- security/landlock/errata/abi-6.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/security/landlock/errata/abi-6.h b/security/landlock/errata/abi-6.h index df7bc0e1fdf47..5113a829f87e9 100644 --- a/security/landlock/errata/abi-6.h +++ b/security/landlock/errata/abi-6.h @@ -9,7 +9,7 @@ * This fix addresses an issue where signal scoping was overly restrictive, * preventing sandboxed threads from signaling other threads within the same * process if they belonged to different domains. Because threads are not - * security boundaries, user space might assume that any thread within the same + * security boundaries, user space might assume that all threads within the same * process can send signals between themselves (see :manpage:`nptl(7)` and * :manpage:`libpsx(3)`). Consistent with :manpage:`ptrace(2)` behavior, direct * interaction between threads of the same process should always be allowed. From 2eeeebee271f163986f0dc9ce423400385deb197 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Micka=C3=ABl=20Sala=C3=BCn?= Date: Fri, 19 Dec 2025 20:38:49 +0100 Subject: [PATCH 1284/1321] landlock: Clean up hook_ptrace_access_check() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Make variable's scope minimal in hook_ptrace_access_check(). Cc: Günther Noack Link: https://lore.kernel.org/r/20251219193855.825889-3-mic@digikod.net Reviewed-by: Günther Noack Signed-off-by: Mickaël Salaün --- security/landlock/task.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/security/landlock/task.c b/security/landlock/task.c index 2385017418ca5..bf4ed15a7f01f 100644 --- a/security/landlock/task.c +++ b/security/landlock/task.c @@ -86,7 +86,6 @@ static int hook_ptrace_access_check(struct task_struct *const child, const unsigned int mode) { const struct landlock_cred_security *parent_subject; - const struct landlock_ruleset *child_dom; int err; /* Quick return for non-landlocked tasks. */ @@ -96,7 +95,8 @@ static int hook_ptrace_access_check(struct task_struct *const child, scoped_guard(rcu) { - child_dom = landlock_get_task_domain(child); + const struct landlock_ruleset *const child_dom = + landlock_get_task_domain(child); err = domain_ptrace(parent_subject->domain, child_dom); } From 53c76ac67ad426747ad0dcd6bb0cb5d78cc03f23 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Micka=C3=ABl=20Sala=C3=BCn?= Date: Fri, 19 Dec 2025 20:38:50 +0100 Subject: [PATCH 1285/1321] landlock: Fix spelling MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Cc: Günther Noack Link: https://lore.kernel.org/r/20251219193855.825889-4-mic@digikod.net Reviewed-by: Günther Noack Signed-off-by: Mickaël Salaün --- security/landlock/domain.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/security/landlock/domain.h b/security/landlock/domain.h index 7fb70b25f85a1..621f054c9a2b7 100644 --- a/security/landlock/domain.h +++ b/security/landlock/domain.h @@ -97,7 +97,7 @@ struct landlock_hierarchy { */ atomic64_t num_denials; /** - * @id: Landlock domain ID, sets once at domain creation time. + * @id: Landlock domain ID, set once at domain creation time. */ u64 id; /** From 6c0df2d3ba49ec557306a3e4996363e5239fe859 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Micka=C3=ABl=20Sala=C3=BCn?= Date: Fri, 19 Dec 2025 15:22:59 +0100 Subject: [PATCH 1286/1321] landlock: Optimize stack usage when !CONFIG_AUDIT MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Until now, each landlock_request struct were allocated on the stack, even if not really used, because is_access_to_paths_allowed() unconditionally modified the passed references. Even if the changed landlock_request variables are not used, the compiler is not smart enough to detect this case. To avoid this issue, explicitly disable the related code when CONFIG_AUDIT is not set, which enables elision of log_request_parent* and associated caller's stack variables thanks to dead code elimination. This makes it possible to reduce the stack frame by 32 bytes for the path_link and path_rename hooks, and by 20 bytes for most other filesystem hooks. Here is a summary of scripts/stackdelta before and after this change when CONFIG_AUDIT is disabled: current_check_refer_path 560 320 -240 current_check_access_path 328 184 -144 hook_file_open 328 184 -144 is_access_to_paths_allowed 376 360 -16 Also, add extra pointer checks to be more future-proof. Cc: Günther Noack Reported-by: Tingmao Wang Closes: https://lore.kernel.org/r/eb86863b-53b0-460b-b223-84dd31d765b9@maowtm.org Fixes: 2fc80c69df82 ("landlock: Log file-related denials") Link: https://lore.kernel.org/r/20251219142302.744917-2-mic@digikod.net Reviewed-by: Günther Noack [mic: Improve stack usage measurement accuracy with scripts/stackdelta] Signed-off-by: Mickaël Salaün --- security/landlock/fs.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/security/landlock/fs.c b/security/landlock/fs.c index e3c3a8a9ac27b..8205673c8b1c4 100644 --- a/security/landlock/fs.c +++ b/security/landlock/fs.c @@ -939,7 +939,12 @@ static bool is_access_to_paths_allowed( } path_put(&walker_path); - if (!allowed_parent1) { + /* + * Check CONFIG_AUDIT to enable elision of log_request_parent* and + * associated caller's stack variables thanks to dead code elimination. + */ +#ifdef CONFIG_AUDIT + if (!allowed_parent1 && log_request_parent1) { log_request_parent1->type = LANDLOCK_REQUEST_FS_ACCESS; log_request_parent1->audit.type = LSM_AUDIT_DATA_PATH; log_request_parent1->audit.u.path = *path; @@ -949,7 +954,7 @@ static bool is_access_to_paths_allowed( ARRAY_SIZE(*layer_masks_parent1); } - if (!allowed_parent2) { + if (!allowed_parent2 && log_request_parent2) { log_request_parent2->type = LANDLOCK_REQUEST_FS_ACCESS; log_request_parent2->audit.type = LSM_AUDIT_DATA_PATH; log_request_parent2->audit.u.path = *path; @@ -958,6 +963,8 @@ static bool is_access_to_paths_allowed( log_request_parent2->layer_masks_size = ARRAY_SIZE(*layer_masks_parent2); } +#endif /* CONFIG_AUDIT */ + return allowed_parent1 && allowed_parent2; } From c499d3130d81f096ad382cff3dac12235420c70c Mon Sep 17 00:00:00 2001 From: Tingmao Wang Date: Sun, 28 Dec 2025 01:27:31 +0000 Subject: [PATCH 1287/1321] selftests/landlock: Fix typo in fs_test MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Signed-off-by: Tingmao Wang Link: https://lore.kernel.org/r/62d18e06eeb26f62bc49d24c4467b3793c5ba32b.1766885035.git.m@maowtm.org Signed-off-by: Mickaël Salaün --- tools/testing/selftests/landlock/fs_test.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/landlock/fs_test.c b/tools/testing/selftests/landlock/fs_test.c index 76491ba54dce2..37a5a3df712ec 100644 --- a/tools/testing/selftests/landlock/fs_test.c +++ b/tools/testing/selftests/landlock/fs_test.c @@ -7069,8 +7069,8 @@ static int matches_log_fs_extra(struct __test_metadata *const _metadata, return -E2BIG; /* - * It is assume that absolute_path does not contain control characters nor - * spaces, see audit_string_contains_control(). + * It is assumed that absolute_path does not contain control + * characters nor spaces, see audit_string_contains_control(). */ absolute_path = realpath(path, NULL); if (!absolute_path) From 9b360e0c0c9e27aafb7ad6db4144e2a33a9b0ac7 Mon Sep 17 00:00:00 2001 From: Tingmao Wang Date: Sun, 28 Dec 2025 01:27:32 +0000 Subject: [PATCH 1288/1321] selftests/landlock: Fix missing semicolon MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add missing semicolon after EXPECT_EQ(0, close(stream_server_child)) in the scoped_vs_unscoped test. I suspect currently it's just not executing the close statement after the line, but this causes no observable difference. Fixes: fefcf0f7cf47 ("selftests/landlock: Test abstract UNIX socket scoping") Cc: Tahera Fahimi Signed-off-by: Tingmao Wang Link: https://lore.kernel.org/r/d9e968e4cd4ecc9bf487593d7b7220bffbb3b5f5.1766885035.git.m@maowtm.org Signed-off-by: Mickaël Salaün --- tools/testing/selftests/landlock/scoped_abstract_unix_test.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/testing/selftests/landlock/scoped_abstract_unix_test.c b/tools/testing/selftests/landlock/scoped_abstract_unix_test.c index 2cdf1ba070164..72f97648d4a7d 100644 --- a/tools/testing/selftests/landlock/scoped_abstract_unix_test.c +++ b/tools/testing/selftests/landlock/scoped_abstract_unix_test.c @@ -543,7 +543,7 @@ TEST_F(scoped_vs_unscoped, unix_scoping) ASSERT_EQ(1, write(pipe_child[1], ".", 1)); ASSERT_EQ(grand_child, waitpid(grand_child, &status, 0)); - EXPECT_EQ(0, close(stream_server_child)) + EXPECT_EQ(0, close(stream_server_child)); EXPECT_EQ(0, close(dgram_server_child)); return; } From 553b1af9fe26e3f45257d612fb04baa96f77900b Mon Sep 17 00:00:00 2001 From: Tingmao Wang Date: Sun, 28 Dec 2025 01:27:34 +0000 Subject: [PATCH 1289/1321] selftests/landlock: Use scoped_base_variants.h for ptrace_test MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit ptrace_test.c currently contains a duplicated version of the scoped_domains fixture variants. This patch removes that and make it use the shared scoped_base_variants.h instead, like in scoped_abstract_unix_test and scoped_signal_test. This required renaming the hierarchy fixture to scoped_domains, but the test is otherwise the same. Cc: Tahera Fahimi Signed-off-by: Tingmao Wang Link: https://lore.kernel.org/r/48148f0134f95f819a25277486a875a6fd88ecf9.1766885035.git.m@maowtm.org Signed-off-by: Mickaël Salaün --- .../testing/selftests/landlock/ptrace_test.c | 154 +----------------- .../selftests/landlock/scoped_base_variants.h | 9 +- 2 files changed, 12 insertions(+), 151 deletions(-) diff --git a/tools/testing/selftests/landlock/ptrace_test.c b/tools/testing/selftests/landlock/ptrace_test.c index 4e356334ecb74..4f64c90583cd6 100644 --- a/tools/testing/selftests/landlock/ptrace_test.c +++ b/tools/testing/selftests/landlock/ptrace_test.c @@ -86,16 +86,9 @@ static int get_yama_ptrace_scope(void) } /* clang-format off */ -FIXTURE(hierarchy) {}; +FIXTURE(scoped_domains) {}; /* clang-format on */ -FIXTURE_VARIANT(hierarchy) -{ - const bool domain_both; - const bool domain_parent; - const bool domain_child; -}; - /* * Test multiple tracing combinations between a parent process P1 and a child * process P2. @@ -104,155 +97,18 @@ FIXTURE_VARIANT(hierarchy) * restriction is enforced in addition to any Landlock check, which means that * all P2 requests to trace P1 would be denied. */ +#include "scoped_base_variants.h" -/* - * No domain - * - * P1-. P1 -> P2 : allow - * \ P2 -> P1 : allow - * 'P2 - */ -/* clang-format off */ -FIXTURE_VARIANT_ADD(hierarchy, allow_without_domain) { - /* clang-format on */ - .domain_both = false, - .domain_parent = false, - .domain_child = false, -}; - -/* - * Child domain - * - * P1--. P1 -> P2 : allow - * \ P2 -> P1 : deny - * .'-----. - * | P2 | - * '------' - */ -/* clang-format off */ -FIXTURE_VARIANT_ADD(hierarchy, allow_with_one_domain) { - /* clang-format on */ - .domain_both = false, - .domain_parent = false, - .domain_child = true, -}; - -/* - * Parent domain - * .------. - * | P1 --. P1 -> P2 : deny - * '------' \ P2 -> P1 : allow - * ' - * P2 - */ -/* clang-format off */ -FIXTURE_VARIANT_ADD(hierarchy, deny_with_parent_domain) { - /* clang-format on */ - .domain_both = false, - .domain_parent = true, - .domain_child = false, -}; - -/* - * Parent + child domain (siblings) - * .------. - * | P1 ---. P1 -> P2 : deny - * '------' \ P2 -> P1 : deny - * .---'--. - * | P2 | - * '------' - */ -/* clang-format off */ -FIXTURE_VARIANT_ADD(hierarchy, deny_with_sibling_domain) { - /* clang-format on */ - .domain_both = false, - .domain_parent = true, - .domain_child = true, -}; - -/* - * Same domain (inherited) - * .-------------. - * | P1----. | P1 -> P2 : allow - * | \ | P2 -> P1 : allow - * | ' | - * | P2 | - * '-------------' - */ -/* clang-format off */ -FIXTURE_VARIANT_ADD(hierarchy, allow_sibling_domain) { - /* clang-format on */ - .domain_both = true, - .domain_parent = false, - .domain_child = false, -}; - -/* - * Inherited + child domain - * .-----------------. - * | P1----. | P1 -> P2 : allow - * | \ | P2 -> P1 : deny - * | .-'----. | - * | | P2 | | - * | '------' | - * '-----------------' - */ -/* clang-format off */ -FIXTURE_VARIANT_ADD(hierarchy, allow_with_nested_domain) { - /* clang-format on */ - .domain_both = true, - .domain_parent = false, - .domain_child = true, -}; - -/* - * Inherited + parent domain - * .-----------------. - * |.------. | P1 -> P2 : deny - * || P1 ----. | P2 -> P1 : allow - * |'------' \ | - * | ' | - * | P2 | - * '-----------------' - */ -/* clang-format off */ -FIXTURE_VARIANT_ADD(hierarchy, deny_with_nested_and_parent_domain) { - /* clang-format on */ - .domain_both = true, - .domain_parent = true, - .domain_child = false, -}; - -/* - * Inherited + parent and child domain (siblings) - * .-----------------. - * | .------. | P1 -> P2 : deny - * | | P1 . | P2 -> P1 : deny - * | '------'\ | - * | \ | - * | .--'---. | - * | | P2 | | - * | '------' | - * '-----------------' - */ -/* clang-format off */ -FIXTURE_VARIANT_ADD(hierarchy, deny_with_forked_domain) { - /* clang-format on */ - .domain_both = true, - .domain_parent = true, - .domain_child = true, -}; - -FIXTURE_SETUP(hierarchy) +FIXTURE_SETUP(scoped_domains) { } -FIXTURE_TEARDOWN(hierarchy) +FIXTURE_TEARDOWN(scoped_domains) { } /* Test PTRACE_TRACEME and PTRACE_ATTACH for parent and child. */ -TEST_F(hierarchy, trace) +TEST_F(scoped_domains, trace) { pid_t child, parent; int status, err_proc_read; diff --git a/tools/testing/selftests/landlock/scoped_base_variants.h b/tools/testing/selftests/landlock/scoped_base_variants.h index d3b1fa8a584e1..7116728ebc68d 100644 --- a/tools/testing/selftests/landlock/scoped_base_variants.h +++ b/tools/testing/selftests/landlock/scoped_base_variants.h @@ -1,8 +1,13 @@ /* SPDX-License-Identifier: GPL-2.0 */ /* - * Landlock scoped_domains variants + * Landlock scoped_domains test variant definition. * - * See the hierarchy variants from ptrace_test.c + * This file defines a fixture variant "scoped_domains" that has all + * permutations of parent/child process being in separate or shared + * Landlock domain, or not being in a Landlock domain at all. + * + * Scoped access tests can include this file to avoid repeating these + * combinations. * * Copyright © 2017-2020 Mickaël Salaün * Copyright © 2019-2020 ANSSI From 294125f1b5c977540619f2c03af89482b95eeca4 Mon Sep 17 00:00:00 2001 From: Tingmao Wang Date: Sun, 28 Dec 2025 01:27:35 +0000 Subject: [PATCH 1290/1321] landlock: Improve the comment for domain_is_scoped MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Currently it is not obvious what "scoped" mean, and the fact that the function returns true when access should be denied is slightly surprising and in need of documentation. Cc: Tahera Fahimi Signed-off-by: Tingmao Wang Link: https://lore.kernel.org/r/06393bc18aee5bc278df5ef31c64a05b742ebc10.1766885035.git.m@maowtm.org [mic: Fix formatting and improve consistency] Signed-off-by: Mickaël Salaün --- security/landlock/task.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/security/landlock/task.c b/security/landlock/task.c index bf4ed15a7f01f..833bc0cfe5c9b 100644 --- a/security/landlock/task.c +++ b/security/landlock/task.c @@ -166,15 +166,15 @@ static int hook_ptrace_traceme(struct task_struct *const parent) } /** - * domain_is_scoped - Checks if the client domain is scoped in the same - * domain as the server. + * domain_is_scoped - Check if an interaction from a client/sender to a + * server/receiver should be restricted based on scope controls. * * @client: IPC sender domain. * @server: IPC receiver domain. * @scope: The scope restriction criteria. * - * Returns: True if the @client domain is scoped to access the @server, - * unless the @server is also scoped in the same domain as @client. + * Returns: True if @server is in a different domain from @client, and @client + * is scoped to access @server (i.e. access should be denied). */ static bool domain_is_scoped(const struct landlock_ruleset *const client, const struct landlock_ruleset *const server, From cffceb13c4cb453ba38298effe10fafc15debffc Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?G=C3=BCnther=20Noack?= Date: Thu, 1 Jan 2026 14:40:58 +0100 Subject: [PATCH 1291/1321] selftests/landlock: Properly close a file descriptor MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add a missing close(srv_fd) call, and use EXPECT_EQ() to check the result. Signed-off-by: Günther Noack Fixes: f83d51a5bdfe ("selftests/landlock: Check IOCTL restrictions for named UNIX domain sockets") Link: https://lore.kernel.org/r/20260101134102.25938-2-gnoack3000@gmail.com [mic: Use EXPECT_EQ() and update commit message] Signed-off-by: Mickaël Salaün --- tools/testing/selftests/landlock/fs_test.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/tools/testing/selftests/landlock/fs_test.c b/tools/testing/selftests/landlock/fs_test.c index 37a5a3df712ec..968a91c927a46 100644 --- a/tools/testing/selftests/landlock/fs_test.c +++ b/tools/testing/selftests/landlock/fs_test.c @@ -4399,7 +4399,8 @@ TEST_F_FORK(layout1, named_unix_domain_socket_ioctl) /* FIONREAD and other IOCTLs should not be forbidden. */ EXPECT_EQ(0, test_fionread_ioctl(cli_fd)); - ASSERT_EQ(0, close(cli_fd)); + EXPECT_EQ(0, close(cli_fd)); + EXPECT_EQ(0, close(srv_fd)); } /* clang-format off */ From abed2c549007183abd2b8c1912ab77b2457f686b Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?G=C3=BCnther=20Noack?= Date: Sun, 11 Jan 2026 18:52:04 +0100 Subject: [PATCH 1292/1321] landlock: Clarify documentation for the IOCTL access right MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Move the description of the LANDLOCK_ACCESS_FS_IOCTL_DEV access right together with the file access rights. This group of access rights applies to files (in this case device files), and they can be added to file or directory inodes using landlock_add_rule(2). The check for that works the same for all file access rights, including LANDLOCK_ACCESS_FS_IOCTL_DEV. Invoking ioctl(2) on directory FDs can not currently be restricted with Landlock. Having it grouped separately in the documentation is a remnant from earlier revisions of the LANDLOCK_ACCESS_FS_IOCTL_DEV patch set. Link: https://lore.kernel.org/all/20260108.Thaex5ruach2@digikod.net/ Signed-off-by: Günther Noack Link: https://lore.kernel.org/r/20260111175203.6545-2-gnoack3000@gmail.com Signed-off-by: Mickaël Salaün --- include/uapi/linux/landlock.h | 37 ++++++++++++++++------------------- 1 file changed, 17 insertions(+), 20 deletions(-) diff --git a/include/uapi/linux/landlock.h b/include/uapi/linux/landlock.h index f030adc462ee7..75fd7f5e6cc31 100644 --- a/include/uapi/linux/landlock.h +++ b/include/uapi/linux/landlock.h @@ -216,6 +216,23 @@ struct landlock_net_port_attr { * :manpage:`ftruncate(2)`, :manpage:`creat(2)`, or :manpage:`open(2)` with * ``O_TRUNC``. This access right is available since the third version of the * Landlock ABI. + * - %LANDLOCK_ACCESS_FS_IOCTL_DEV: Invoke :manpage:`ioctl(2)` commands on an opened + * character or block device. + * + * This access right applies to all `ioctl(2)` commands implemented by device + * drivers. However, the following common IOCTL commands continue to be + * invokable independent of the %LANDLOCK_ACCESS_FS_IOCTL_DEV right: + * + * * IOCTL commands targeting file descriptors (``FIOCLEX``, ``FIONCLEX``), + * * IOCTL commands targeting file descriptions (``FIONBIO``, ``FIOASYNC``), + * * IOCTL commands targeting file systems (``FIFREEZE``, ``FITHAW``, + * ``FIGETBSZ``, ``FS_IOC_GETFSUUID``, ``FS_IOC_GETFSSYSFSPATH``) + * * Some IOCTL commands which do not make sense when used with devices, but + * whose implementations are safe and return the right error codes + * (``FS_IOC_FIEMAP``, ``FICLONE``, ``FICLONERANGE``, ``FIDEDUPERANGE``) + * + * This access right is available since the fifth version of the Landlock + * ABI. * * Whether an opened file can be truncated with :manpage:`ftruncate(2)` or used * with `ioctl(2)` is determined during :manpage:`open(2)`, in the same way as @@ -275,26 +292,6 @@ struct landlock_net_port_attr { * If multiple requirements are not met, the ``EACCES`` error code takes * precedence over ``EXDEV``. * - * The following access right applies both to files and directories: - * - * - %LANDLOCK_ACCESS_FS_IOCTL_DEV: Invoke :manpage:`ioctl(2)` commands on an opened - * character or block device. - * - * This access right applies to all `ioctl(2)` commands implemented by device - * drivers. However, the following common IOCTL commands continue to be - * invokable independent of the %LANDLOCK_ACCESS_FS_IOCTL_DEV right: - * - * * IOCTL commands targeting file descriptors (``FIOCLEX``, ``FIONCLEX``), - * * IOCTL commands targeting file descriptions (``FIONBIO``, ``FIOASYNC``), - * * IOCTL commands targeting file systems (``FIFREEZE``, ``FITHAW``, - * ``FIGETBSZ``, ``FS_IOC_GETFSUUID``, ``FS_IOC_GETFSSYSFSPATH``) - * * Some IOCTL commands which do not make sense when used with devices, but - * whose implementations are safe and return the right error codes - * (``FS_IOC_FIEMAP``, ``FICLONE``, ``FICLONERANGE``, ``FIDEDUPERANGE``) - * - * This access right is available since the fifth version of the Landlock - * ABI. - * * .. warning:: * * It is currently not possible to restrict some file-related actions From 0819607eaa24aea72250307ca3365f28fbea026e Mon Sep 17 00:00:00 2001 From: Linus Torvalds Date: Sun, 18 Jan 2026 15:42:45 -0800 Subject: [PATCH 1293/1321] Linux 6.19-rc6 --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 9d38125263fb0..1465f715786da 100644 --- a/Makefile +++ b/Makefile @@ -2,7 +2,7 @@ VERSION = 6 PATCHLEVEL = 19 SUBLEVEL = 0 -EXTRAVERSION = -rc5 +EXTRAVERSION = -rc6 NAME = Baby Opossum Posse # *DOCUMENTATION* From 3360b8966da655dddb66efa2d86a4a396046f723 Mon Sep 17 00:00:00 2001 From: Abel Vesa Date: Wed, 24 Dec 2025 12:35:01 +0200 Subject: [PATCH 1294/1321] FROMLIST: dt-bindings: phy: sc8280xp-qmp-pcie: Document Glymur PCIe Gen4 2-lanes PHY The fourth and sixth PCIe instances on Glymur are both Gen4 2-lane PHY. So document the compatible. Link: https://lore.kernel.org/r/20251224-phy-qcom-pcie-add-glymur-v3-1-57396145bc22@oss.qualcomm.com Signed-off-by: Abel Vesa Reviewed-by: Krzysztof Kozlowski --- .../devicetree/bindings/phy/qcom,sc8280xp-qmp-pcie-phy.yaml | 3 +++ 1 file changed, 3 insertions(+) diff --git a/Documentation/devicetree/bindings/phy/qcom,sc8280xp-qmp-pcie-phy.yaml b/Documentation/devicetree/bindings/phy/qcom,sc8280xp-qmp-pcie-phy.yaml index f5068df20cfe6..2b3be35ea0e8a 100644 --- a/Documentation/devicetree/bindings/phy/qcom,sc8280xp-qmp-pcie-phy.yaml +++ b/Documentation/devicetree/bindings/phy/qcom,sc8280xp-qmp-pcie-phy.yaml @@ -16,6 +16,7 @@ description: properties: compatible: enum: + - qcom,glymur-qmp-gen4x2-pcie-phy - qcom,glymur-qmp-gen5x4-pcie-phy - qcom,qcs615-qmp-gen3x1-pcie-phy - qcom,qcs8300-qmp-gen4x2-pcie-phy @@ -178,6 +179,7 @@ allOf: compatible: contains: enum: + - qcom,glymur-qmp-gen4x2-pcie-phy - qcom,glymur-qmp-gen5x4-pcie-phy - qcom,qcs8300-qmp-gen4x2-pcie-phy - qcom,sa8775p-qmp-gen4x2-pcie-phy @@ -202,6 +204,7 @@ allOf: compatible: contains: enum: + - qcom,glymur-qmp-gen4x2-pcie-phy - qcom,glymur-qmp-gen5x4-pcie-phy - qcom,sm8550-qmp-gen4x2-pcie-phy - qcom,sm8650-qmp-gen4x2-pcie-phy From 8f883b54d2506d9de4ac6fcb0138b60b89e7f27e Mon Sep 17 00:00:00 2001 From: Abel Vesa Date: Wed, 24 Dec 2025 12:35:02 +0200 Subject: [PATCH 1295/1321] FROMLIST: phy: qcom: qmp-pcie: Add support for Glymur PCIe Gen4x2 PHY Glymur platform has two Gen4 2-lanes controllers, the fourth and sixth instances. Add support for their PHYs. Link: https://lore.kernel.org/r/20251224-phy-qcom-pcie-add-glymur-v3-2-57396145bc22@oss.qualcomm.com Signed-off-by: Abel Vesa Reviewed-by: Dmitry Baryshkov --- drivers/phy/qualcomm/phy-qcom-qmp-pcie.c | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/drivers/phy/qualcomm/phy-qcom-qmp-pcie.c b/drivers/phy/qualcomm/phy-qcom-qmp-pcie.c index 86b1b7e2da86a..f65a8b9586c48 100644 --- a/drivers/phy/qualcomm/phy-qcom-qmp-pcie.c +++ b/drivers/phy/qualcomm/phy-qcom-qmp-pcie.c @@ -4441,6 +4441,22 @@ static const struct qmp_phy_cfg glymur_qmp_gen5x4_pciephy_cfg = { .phy_status = PHYSTATUS_4_20, }; +static const struct qmp_phy_cfg glymur_qmp_gen4x2_pciephy_cfg = { + .lanes = 2, + + .offsets = &qmp_pcie_offsets_v8_0, + + .reset_list = sdm845_pciephy_reset_l, + .num_resets = ARRAY_SIZE(sdm845_pciephy_reset_l), + .vreg_list = qmp_phy_vreg_l, + .num_vregs = ARRAY_SIZE(qmp_phy_vreg_l), + + .regs = pciephy_v8_regs_layout, + + .pwrdn_ctrl = SW_PWRDN | REFCLK_DRV_DSBL, + .phy_status = PHYSTATUS_4_20, +}; + static void qmp_pcie_init_port_b(struct qmp_pcie *qmp, const struct qmp_phy_cfg_tbls *tbls) { const struct qmp_phy_cfg *cfg = qmp->cfg; @@ -5192,6 +5208,9 @@ static int qmp_pcie_probe(struct platform_device *pdev) static const struct of_device_id qmp_pcie_of_match_table[] = { { + .compatible = "qcom,glymur-qmp-gen4x2-pcie-phy", + .data = &glymur_qmp_gen4x2_pciephy_cfg, + }, { .compatible = "qcom,glymur-qmp-gen5x4-pcie-phy", .data = &glymur_qmp_gen5x4_pciephy_cfg, }, { From a7cb298044b1e5e7dc1d5656072bee17efe591a4 Mon Sep 17 00:00:00 2001 From: Prudhvi Yarlagadda Date: Fri, 17 Oct 2025 18:33:40 -0700 Subject: [PATCH 1296/1321] FROMLIST: dt-bindings: PCI: qcom: Document the Glymur PCIe Controller On the Qualcomm Glymur platform the PCIe host is compatible with the DWC controller present on the X1E80100 platform. So document the PCIe controllers found on Glymur and use the X1E80100 compatible string as a fallback in the schema. Link: https://lore.kernel.org/all/20251017-glymur_pcie-v5-3-82d0c4bd402b@oss.qualcomm.com/ Signed-off-by: Prudhvi Yarlagadda Signed-off-by: Wenbin Yao Acked-by: Rob Herring (Arm) Signed-off-by: Qiang Yu --- .../devicetree/bindings/pci/qcom,pcie-x1e80100.yaml | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/Documentation/devicetree/bindings/pci/qcom,pcie-x1e80100.yaml b/Documentation/devicetree/bindings/pci/qcom,pcie-x1e80100.yaml index 62c674ca0cf74..3d3b9f309a738 100644 --- a/Documentation/devicetree/bindings/pci/qcom,pcie-x1e80100.yaml +++ b/Documentation/devicetree/bindings/pci/qcom,pcie-x1e80100.yaml @@ -16,7 +16,12 @@ description: properties: compatible: - const: qcom,pcie-x1e80100 + oneOf: + - const: qcom,pcie-x1e80100 + - items: + - enum: + - qcom,glymur-pcie + - const: qcom,pcie-x1e80100 reg: minItems: 6 From b812b352977760fca83c4c4f61c2b873a7377605 Mon Sep 17 00:00:00 2001 From: Qiang Yu Date: Mon, 24 Nov 2025 02:24:34 -0800 Subject: [PATCH 1297/1321] FROMLIST: dt-bindings: phy: qcom,sc8280xp-qmp-pcie-phy: Add Kaanapali compatible Document compatible for the QMP PCIe PHY on Kaanapali platform. Link: https://lore.kernel.org/all/20251124-kaanapali-pcie-phy-v4-1-d04ee9cca83b@oss.qualcomm.com/ Signed-off-by: Jingyi Wang Reviewed-by: Krzysztof Kozlowski Reviewed-by: Neil Armstrong Signed-off-by: Qiang Yu --- .../devicetree/bindings/phy/qcom,sc8280xp-qmp-pcie-phy.yaml | 3 +++ 1 file changed, 3 insertions(+) diff --git a/Documentation/devicetree/bindings/phy/qcom,sc8280xp-qmp-pcie-phy.yaml b/Documentation/devicetree/bindings/phy/qcom,sc8280xp-qmp-pcie-phy.yaml index 2b3be35ea0e8a..c8d27d7d491e9 100644 --- a/Documentation/devicetree/bindings/phy/qcom,sc8280xp-qmp-pcie-phy.yaml +++ b/Documentation/devicetree/bindings/phy/qcom,sc8280xp-qmp-pcie-phy.yaml @@ -147,6 +147,7 @@ allOf: compatible: contains: enum: + - qcom,kaanapali-qmp-gen3x2-pcie-phy - qcom,qcs615-qmp-gen3x1-pcie-phy - qcom,sar2130p-qmp-gen3x2-pcie-phy - qcom,sc8180x-qmp-pcie-phy @@ -181,6 +182,7 @@ allOf: enum: - qcom,glymur-qmp-gen4x2-pcie-phy - qcom,glymur-qmp-gen5x4-pcie-phy + - qcom,kaanapali-qmp-gen3x2-pcie-phy - qcom,qcs8300-qmp-gen4x2-pcie-phy - qcom,sa8775p-qmp-gen4x2-pcie-phy - qcom,sa8775p-qmp-gen4x4-pcie-phy @@ -206,6 +208,7 @@ allOf: enum: - qcom,glymur-qmp-gen4x2-pcie-phy - qcom,glymur-qmp-gen5x4-pcie-phy + - qcom,kaanapali-qmp-gen3x2-pcie-phy - qcom,sm8550-qmp-gen4x2-pcie-phy - qcom,sm8650-qmp-gen4x2-pcie-phy - qcom,x1e80100-qmp-gen3x2-pcie-phy From db72a5365846c4e67a80e2bef5eefc3149e0bc57 Mon Sep 17 00:00:00 2001 From: Qiang Yu Date: Mon, 24 Nov 2025 02:24:35 -0800 Subject: [PATCH 1298/1321] FROMLIST: phy: qcom-qmp: qserdes-txrx: Add complete QMP PCIe PHY v8 register offsets Kaanapali SoC uses QMP PHY with version v8 for PCIe Gen3 x2, but requires a completely unique qserdes-txrx register offsets compared to existing v8 offsets. Hence, add a dedicated header file containing the FULL SET of qserdes-txrx register definitions required for Kaanapali's PCIe PHY operation. Link:https://lore.kernel.org/all/20251124-kaanapali-pcie-phy-v4-2-d04ee9cca83b@oss.qualcomm.com/ Signed-off-by: Jingyi Wang Reviewed-by: Dmitry Baryshkov Reviewed-by: Neil Armstrong Signed-off-by: Qiang Yu --- .../phy-qcom-qmp-qserdes-txrx-pcie-v8.h | 71 +++++++++++++++++++ 1 file changed, 71 insertions(+) create mode 100644 drivers/phy/qualcomm/phy-qcom-qmp-qserdes-txrx-pcie-v8.h diff --git a/drivers/phy/qualcomm/phy-qcom-qmp-qserdes-txrx-pcie-v8.h b/drivers/phy/qualcomm/phy-qcom-qmp-qserdes-txrx-pcie-v8.h new file mode 100644 index 0000000000000..181846e08c0f0 --- /dev/null +++ b/drivers/phy/qualcomm/phy-qcom-qmp-qserdes-txrx-pcie-v8.h @@ -0,0 +1,71 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries. All rights reserved. + */ + +#ifndef QCOM_PHY_QMP_QSERDES_TXRX_PCIE_V8_H_ +#define QCOM_PHY_QMP_QSERDES_TXRX_PCIE_V8_H_ + +#define QSERDES_V8_PCIE_TX_RES_CODE_LANE_OFFSET_TX 0x030 +#define QSERDES_V8_PCIE_TX_RES_CODE_LANE_OFFSET_RX 0x034 +#define QSERDES_V8_PCIE_TX_LANE_MODE_1 0x07c +#define QSERDES_V8_PCIE_TX_LANE_MODE_2 0x080 +#define QSERDES_V8_PCIE_TX_LANE_MODE_3 0x084 +#define QSERDES_V8_PCIE_TX_TRAN_DRVR_EMP_EN 0x0b4 +#define QSERDES_V8_PCIE_TX_TX_BAND0 0x0e0 +#define QSERDES_V8_PCIE_TX_TX_BAND1 0x0e4 +#define QSERDES_V8_PCIE_TX_SEL_10B_8B 0x0f4 +#define QSERDES_V8_PCIE_TX_SEL_20B_10B 0x0f8 +#define QSERDES_V8_PCIE_TX_PARRATE_REC_DETECT_IDLE_EN 0x058 +#define QSERDES_V8_PCIE_TX_TX_ADAPT_POST_THRESH1 0x118 +#define QSERDES_V8_PCIE_TX_TX_ADAPT_POST_THRESH2 0x11c +#define QSERDES_V8_PCIE_TX_PHPRE_CTRL 0x128 +#define QSERDES_V8_PCIE_TX_EQ_RCF_CTRL_RATE3 0x148 +#define QSERDES_V8_PCIE_TX_EQ_RCF_CTRL_RATE4 0x14c + +#define QSERDES_V8_PCIE_RX_UCDR_FO_GAIN_RATE4 0x0dc +#define QSERDES_V8_PCIE_RX_UCDR_SO_GAIN_RATE3 0x0ec +#define QSERDES_V8_PCIE_RX_UCDR_SO_GAIN_RATE4 0x0f0 +#define QSERDES_V8_PCIE_RX_UCDR_PI_CONTROLS 0x0f4 +#define QSERDES_V8_PCIE_RX_VGA_CAL_CNTRL1 0x170 +#define QSERDES_V8_PCIE_RX_VGA_CAL_MAN_VAL 0x178 +#define QSERDES_V8_PCIE_RX_RX_EQU_ADAPTOR_CNTRL4 0x1b4 +#define QSERDES_V8_PCIE_RX_SIGDET_ENABLES 0x1d8 +#define QSERDES_V8_PCIE_RX_SIGDET_LVL 0x1e0 +#define QSERDES_V8_PCIE_RX_RXCLK_DIV2_CTRL 0x0b8 +#define QSERDES_V8_PCIE_RX_RX_BAND_CTRL0 0x0bc +#define QSERDES_V8_PCIE_RX_RX_TERM_BW_CTRL0 0x0c4 +#define QSERDES_V8_PCIE_RX_RX_TERM_BW_CTRL1 0x0c8 +#define QSERDES_V8_PCIE_RX_SVS_MODE_CTRL 0x0b4 +#define QSERDES_V8_PCIE_RX_UCDR_PI_CTRL1 0x058 +#define QSERDES_V8_PCIE_RX_UCDR_PI_CTRL2 0x05c +#define QSERDES_V8_PCIE_RX_UCDR_SB2_THRESH2_RATE3 0x084 +#define QSERDES_V8_PCIE_RX_UCDR_SB2_GAIN1_RATE3 0x098 +#define QSERDES_V8_PCIE_RX_UCDR_SB2_GAIN2_RATE3 0x0ac +#define QSERDES_V8_PCIE_RX_RX_MODE_RATE_0_1_B0 0x218 +#define QSERDES_V8_PCIE_RX_RX_MODE_RATE_0_1_B1 0x21c +#define QSERDES_V8_PCIE_RX_RX_MODE_RATE_0_1_B2 0x220 +#define QSERDES_V8_PCIE_RX_RX_MODE_RATE_0_1_B4 0x228 +#define QSERDES_V8_PCIE_RX_RX_MODE_RATE_0_1_B7 0x234 +#define QSERDES_V8_PCIE_RX_RX_MODE_RATE3_B0 0x260 +#define QSERDES_V8_PCIE_RX_RX_MODE_RATE3_B1 0x264 +#define QSERDES_V8_PCIE_RX_RX_MODE_RATE3_B2 0x268 +#define QSERDES_V8_PCIE_RX_RX_MODE_RATE3_B3 0x26c +#define QSERDES_V8_PCIE_RX_RX_MODE_RATE3_B4 0x270 +#define QSERDES_V8_PCIE_RX_RX_MODE_RATE4_SA_B0 0x284 +#define QSERDES_V8_PCIE_RX_RX_MODE_RATE4_SA_B1 0x288 +#define QSERDES_V8_PCIE_RX_RX_MODE_RATE4_SA_B2 0x28c +#define QSERDES_V8_PCIE_RX_RX_MODE_RATE4_SA_B3 0x290 +#define QSERDES_V8_PCIE_RX_RX_MODE_RATE4_SA_B4 0x294 +#define QSERDES_V8_PCIE_RX_RX_MODE_RATE4_SA_B5 0x298 +#define QSERDES_V8_PCIE_RX_Q_PI_INTRINSIC_BIAS_RATE32 0x31c +#define QSERDES_V8_PCIE_RX_Q_PI_INTRINSIC_BIAS_RATE4 0x320 +#define QSERDES_V8_PCIE_RX_EOM_MAX_ERR_LIMIT_LSB 0x11c +#define QSERDES_V8_PCIE_RX_EOM_MAX_ERR_LIMIT_MSB 0x120 +#define QSERDES_V8_PCIE_RX_AUXDATA_BIN_RATE23 0x108 +#define QSERDES_V8_PCIE_RX_AUXDATA_BIN_RATE4 0x10c +#define QSERDES_V8_PCIE_RX_VTHRESH_CAL_MAN_VAL_RATE3 0x198 +#define QSERDES_V8_PCIE_RX_VTHRESH_CAL_MAN_VAL_RATE4 0x19c +#define QSERDES_V8_PCIE_RX_GM_CAL 0x1a0 + +#endif From 0a25c219ec1c3126ce2389e971d2a1eb6adf6d08 Mon Sep 17 00:00:00 2001 From: Qiang Yu Date: Mon, 24 Nov 2025 02:24:36 -0800 Subject: [PATCH 1299/1321] FROMLIST: phy: qcom-qmp: pcs-pcie: Add v8 register offsets Kaanapali SoC uses QMP phy with version v8 for PCIe Gen3 x2. Add the new PCS PCIE specific offsets in a dedicated header file. Link: https://lore.kernel.org/all/20251124-kaanapali-pcie-phy-v4-3-d04ee9cca83b@oss.qualcomm.com/ Signed-off-by: Jingyi Wang Reviewed-by: Dmitry Baryshkov Reviewed-by: Neil Armstrong Signed-off-by: Qiang Yu --- .../phy/qualcomm/phy-qcom-qmp-pcs-pcie-v8.h | 34 +++++++++++++++++++ 1 file changed, 34 insertions(+) create mode 100644 drivers/phy/qualcomm/phy-qcom-qmp-pcs-pcie-v8.h diff --git a/drivers/phy/qualcomm/phy-qcom-qmp-pcs-pcie-v8.h b/drivers/phy/qualcomm/phy-qcom-qmp-pcs-pcie-v8.h new file mode 100644 index 0000000000000..1e06aa9d73d58 --- /dev/null +++ b/drivers/phy/qualcomm/phy-qcom-qmp-pcs-pcie-v8.h @@ -0,0 +1,34 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries. All rights reserved. + */ + +#ifndef QCOM_PHY_QMP_PCS_PCIE_V8_H_ +#define QCOM_PHY_QMP_PCS_PCIE_V8_H_ + +/* Only for QMP V8 PHY - PCIE PCS registers */ + +#define QPHY_PCIE_V8_PCS_POWER_STATE_CONFIG2 0x00c +#define QPHY_PCIE_V8_PCS_TX_RX_CONFIG 0x018 +#define QPHY_PCIE_V8_PCS_ENDPOINT_REFCLK_DRIVE 0x01c +#define QPHY_PCIE_V8_PCS_OSC_DTCT_ACTIONS 0x090 +#define QPHY_PCIE_V8_PCS_EQ_CONFIG1 0x0a0 +#define QPHY_PCIE_V8_PCS_G3_RXEQEVAL_TIME 0x0f0 +#define QPHY_PCIE_V8_PCS_G4_RXEQEVAL_TIME 0x0f4 +#define QPHY_PCIE_V8_PCS_G4_EQ_CONFIG5 0x108 +#define QPHY_PCIE_V8_PCS_G4_PRE_GAIN 0x15c +#define QPHY_PCIE_V8_PCS_G12S1_TXDEEMPH_M6DB 0x170 +#define QPHY_PCIE_V8_PCS_G3S2_PRE_GAIN 0x178 +#define QPHY_PCIE_V8_PCS_RX_MARGINING_CONFIG1 0x17c +#define QPHY_PCIE_V8_PCS_RX_MARGINING_CONFIG3 0x184 +#define QPHY_PCIE_V8_PCS_RX_MARGINING_CONFIG5 0x18c +#define QPHY_PCIE_V8_PCS_RX_SIGDET_LVL 0x190 +#define QPHY_PCIE_V8_PCS_G3_FOM_EQ_CONFIG5 0x1ac +#define QPHY_PCIE_V8_PCS_ELECIDLE_DLY_SEL 0x1b8 +#define QPHY_PCIE_V8_PCS_G4_FOM_EQ_CONFIG5 0x1c0 +#define QPHY_PCIE_V8_PCS_POWER_STATE_CONFIG6 0x1d0 +#define QPHY_PCIE_V8_PCS_PCS_TX_RX_CONFIG1 0x1dc +#define QPHY_PCIE_V8_PCS_PCS_TX_RX_CONFIG2 0x1e0 +#define QPHY_PCIE_V8_PCS_EQ_CONFIG4 0x1f8 +#define QPHY_PCIE_V8_PCS_EQ_CONFIG5 0x1fc +#endif From ae4ec5feff1ddc2cc8a80f60ab65a4c9c4014a82 Mon Sep 17 00:00:00 2001 From: Qiang Yu Date: Mon, 24 Nov 2025 02:24:37 -0800 Subject: [PATCH 1300/1321] FROMLIST: phy: qcom-qmp: qserdes-com: Add some more v8 register offsets Some qserdes-com register offsets for the v8 PHY were previously omitted, as they were not needed by earlier v8 PHY initialization sequences. Add these missing v8 register offsets now required to support PCIe QMP PHY on Kaanapali platform. Link: https://lore.kernel.org/all/20251124-kaanapali-pcie-phy-v4-4-d04ee9cca83b@oss.qualcomm.com/ Signed-off-by: Jingyi Wang Reviewed-by: Dmitry Baryshkov Reviewed-by: Neil Armstrong Signed-off-by: Qiang Yu --- drivers/phy/qualcomm/phy-qcom-qmp-qserdes-com-v8.h | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/drivers/phy/qualcomm/phy-qcom-qmp-qserdes-com-v8.h b/drivers/phy/qualcomm/phy-qcom-qmp-qserdes-com-v8.h index d3b2292257bc5..d8ac4c4a2c316 100644 --- a/drivers/phy/qualcomm/phy-qcom-qmp-qserdes-com-v8.h +++ b/drivers/phy/qualcomm/phy-qcom-qmp-qserdes-com-v8.h @@ -33,6 +33,7 @@ #define QSERDES_V8_COM_CP_CTRL_MODE0 0x070 #define QSERDES_V8_COM_PLL_RCTRL_MODE0 0x074 #define QSERDES_V8_COM_PLL_CCTRL_MODE0 0x078 +#define QSERDES_V8_COM_CORECLK_DIV_MODE0 0x07c #define QSERDES_V8_COM_LOCK_CMP1_MODE0 0x080 #define QSERDES_V8_COM_LOCK_CMP2_MODE0 0x084 #define QSERDES_V8_COM_DEC_START_MODE0 0x088 @@ -40,6 +41,7 @@ #define QSERDES_V8_COM_DIV_FRAC_START1_MODE0 0x090 #define QSERDES_V8_COM_DIV_FRAC_START2_MODE0 0x094 #define QSERDES_V8_COM_DIV_FRAC_START3_MODE0 0x098 +#define QSERDES_V8_COM_HSCLK_HS_SWITCH_SEL_1 0x09c #define QSERDES_V8_COM_VCO_TUNE1_MODE0 0x0a8 #define QSERDES_V8_COM_VCO_TUNE2_MODE0 0x0ac #define QSERDES_V8_COM_BG_TIMER 0x0bc @@ -47,13 +49,22 @@ #define QSERDES_V8_COM_SSC_PER1 0x0cc #define QSERDES_V8_COM_SSC_PER2 0x0d0 #define QSERDES_V8_COM_BIAS_EN_CLKBUFLR_EN 0x0dc +#define QSERDES_V8_COM_CLK_ENABLE1 0x0e0 +#define QSERDES_V8_COM_SYS_CLK_CTRL 0x0e4 +#define QSERDES_V8_COM_PLL_IVCO 0x0f4 #define QSERDES_V8_COM_SYSCLK_BUF_ENABLE 0x0e8 #define QSERDES_V8_COM_SYSCLK_EN_SEL 0x110 #define QSERDES_V8_COM_RESETSM_CNTRL 0x118 +#define QSERDES_V8_COM_LOCK_CMP_EN 0x120 #define QSERDES_V8_COM_LOCK_CMP_CFG 0x124 #define QSERDES_V8_COM_VCO_TUNE_MAP 0x140 +#define QSERDES_V8_COM_CLK_SELECT 0x164 #define QSERDES_V8_COM_CORE_CLK_EN 0x170 #define QSERDES_V8_COM_CMN_CONFIG_1 0x174 +#define QSERDES_V8_COM_CMN_MISC_1 0x184 +#define QSERDES_V8_COM_CMN_MODE 0x188 +#define QSERDES_V8_COM_VCO_DC_LEVEL_CTRL 0x198 +#define QSERDES_V8_COM_PLL_SPARE_FOR_ECO 0x2b4 #define QSERDES_V8_COM_AUTO_GAIN_ADJ_CTRL_1 0x1a4 #define QSERDES_V8_COM_AUTO_GAIN_ADJ_CTRL_2 0x1a8 #define QSERDES_V8_COM_AUTO_GAIN_ADJ_CTRL_3 0x1ac From aa015177c4c0f0f0c026769d6c33d930995029e2 Mon Sep 17 00:00:00 2001 From: Qiang Yu Date: Mon, 24 Nov 2025 02:24:38 -0800 Subject: [PATCH 1301/1321] FROMLIST: phy: qcom: qmp-pcie: add QMP PCIe PHY tables for Kaanapali Add QMP PCIe PHY support for the Kaanapali platform. Link: https://lore.kernel.org/r/20251124-kaanapali-pcie-phy-v4-5-d04ee9cca83b@oss.qualcomm.com Signed-off-by: Jingyi Wang Reviewed-by: Abel Vesa Reviewed-by: Neil Armstrong Signed-off-by: Qiang Yu --- drivers/phy/qualcomm/phy-qcom-qmp-pcie.c | 194 +++++++++++++++++++++++ 1 file changed, 194 insertions(+) diff --git a/drivers/phy/qualcomm/phy-qcom-qmp-pcie.c b/drivers/phy/qualcomm/phy-qcom-qmp-pcie.c index f65a8b9586c48..fed2fc9bb3110 100644 --- a/drivers/phy/qualcomm/phy-qcom-qmp-pcie.c +++ b/drivers/phy/qualcomm/phy-qcom-qmp-pcie.c @@ -37,6 +37,9 @@ #include "phy-qcom-qmp-pcs-pcie-v6_30.h" #include "phy-qcom-qmp-pcs-v6_30.h" #include "phy-qcom-qmp-pcie-qhp.h" +#include "phy-qcom-qmp-qserdes-com-v8.h" +#include "phy-qcom-qmp-pcs-pcie-v8.h" +#include "phy-qcom-qmp-qserdes-txrx-pcie-v8.h" #define PHY_INIT_COMPLETE_TIMEOUT 10000 @@ -100,6 +103,13 @@ static const unsigned int pciephy_v7_regs_layout[QPHY_LAYOUT_SIZE] = { [QPHY_PCS_POWER_DOWN_CONTROL] = QPHY_V7_PCS_POWER_DOWN_CONTROL, }; +static const unsigned int pciephy_v8_regs_layout[QPHY_LAYOUT_SIZE] = { + [QPHY_SW_RESET] = QPHY_V8_PCS_SW_RESET, + [QPHY_START_CTRL] = QPHY_V8_PCS_START_CONTROL, + [QPHY_PCS_STATUS] = QPHY_V8_PCS_PCS_STATUS1, + [QPHY_PCS_POWER_DOWN_CONTROL] = QPHY_V8_PCS_POWER_DOWN_CONTROL, +}; + static const unsigned int pciephy_v8_50_regs_layout[QPHY_LAYOUT_SIZE] = { [QPHY_START_CTRL] = QPHY_V8_50_PCS_START_CONTROL, [QPHY_PCS_STATUS] = QPHY_V8_50_PCS_STATUS1, @@ -3067,6 +3077,149 @@ static const struct qmp_phy_init_tbl sar2130p_qmp_gen3x2_pcie_ep_pcs_misc_tbl[] QMP_PHY_INIT_CFG(QPHY_PCIE_V6_PCS_PCIE_POWER_STATE_CONFIG4, 0x07), }; +static const struct qmp_phy_init_tbl kaanapali_qmp_gen3x2_pcie_serdes_tbl[] = { + QMP_PHY_INIT_CFG(QSERDES_V8_COM_SSC_STEP_SIZE1_MODE1, 0x93), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_SSC_STEP_SIZE2_MODE1, 0x01), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_CP_CTRL_MODE1, 0x06), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_PLL_RCTRL_MODE1, 0x16), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_PLL_CCTRL_MODE1, 0x36), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_CORECLK_DIV_MODE1, 0x04), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_LOCK_CMP1_MODE1, 0x0a), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_LOCK_CMP2_MODE1, 0x1a), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_DEC_START_MODE1, 0x34), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_DIV_FRAC_START1_MODE1, 0x55), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_DIV_FRAC_START2_MODE1, 0x55), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_DIV_FRAC_START3_MODE1, 0x01), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_HSCLK_SEL_1, 0x01), + + QMP_PHY_INIT_CFG(QSERDES_V8_COM_SSC_STEP_SIZE1_MODE0, 0xf8), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_SSC_STEP_SIZE2_MODE0, 0x01), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_CP_CTRL_MODE0, 0x06), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_PLL_RCTRL_MODE0, 0x16), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_PLL_CCTRL_MODE0, 0x36), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_CORECLK_DIV_MODE0, 0x0a), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_LOCK_CMP1_MODE0, 0x04), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_LOCK_CMP2_MODE0, 0x0d), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_DEC_START_MODE0, 0x41), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_DIV_FRAC_START1_MODE0, 0xab), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_DIV_FRAC_START2_MODE0, 0xaa), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_DIV_FRAC_START3_MODE0, 0x01), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_HSCLK_HS_SWITCH_SEL_1, 0x00), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_BG_TIMER, 0x0a), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_SSC_PER1, 0x62), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_SSC_PER2, 0x02), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_BIAS_EN_CLKBUFLR_EN, 0x14), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_CLK_ENABLE1, 0x90), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_SYS_CLK_CTRL, 0x82), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_PLL_IVCO, 0x0f), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_SYSCLK_EN_SEL, 0x08), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_LOCK_CMP_EN, 0x46), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_LOCK_CMP_CFG, 0x04), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_VCO_TUNE_MAP, 0x14), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_CLK_SELECT, 0x34), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_CORE_CLK_EN, 0xa0), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_CMN_CONFIG_1, 0x16), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_CMN_MISC_1, 0x88), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_CMN_MODE, 0x04), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_VCO_DC_LEVEL_CTRL, 0x0f), + QMP_PHY_INIT_CFG(QSERDES_V8_COM_PLL_SPARE_FOR_ECO, 0x02), +}; + +static const struct qmp_phy_init_tbl kaanapali_qmp_gen3x2_pcie_tx_tbl[] = { + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_TX_RES_CODE_LANE_OFFSET_TX, 0x1b), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_TX_RES_CODE_LANE_OFFSET_RX, 0x14), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_TX_LANE_MODE_1, 0x00), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_TX_LANE_MODE_2, 0x40), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_TX_LANE_MODE_3, 0x00), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_TX_TRAN_DRVR_EMP_EN, 0x04), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_TX_TX_BAND0, 0x05), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_TX_TX_BAND1, 0x00), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_TX_SEL_10B_8B, 0x07), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_TX_SEL_20B_10B, 0x1f), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_TX_PARRATE_REC_DETECT_IDLE_EN, 0x90), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_TX_TX_ADAPT_POST_THRESH1, 0x02), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_TX_TX_ADAPT_POST_THRESH2, 0x0d), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_TX_EQ_RCF_CTRL_RATE3, 0x53), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_TX_EQ_RCF_CTRL_RATE4, 0x54), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_TX_PHPRE_CTRL, 0x20), +}; + +static const struct qmp_phy_init_tbl kaanapali_qmp_gen3x2_pcie_rx_tbl[] = { + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_UCDR_FO_GAIN_RATE4, 0x0b), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_UCDR_SO_GAIN_RATE3, 0x04), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_UCDR_SO_GAIN_RATE4, 0x05), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_UCDR_PI_CONTROLS, 0x15), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_VGA_CAL_CNTRL1, 0x00), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_VGA_CAL_MAN_VAL, 0x89), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_RX_EQU_ADAPTOR_CNTRL4, 0x2d), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_SIGDET_ENABLES, 0x1c), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_SIGDET_LVL, 0x04), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_RXCLK_DIV2_CTRL, 0x01), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_RX_BAND_CTRL0, 0x05), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_RX_TERM_BW_CTRL0, 0x00), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_RX_TERM_BW_CTRL1, 0x00), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_SVS_MODE_CTRL, 0x00), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_UCDR_PI_CTRL1, 0x40), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_UCDR_PI_CTRL2, 0x42), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_UCDR_SB2_THRESH2_RATE3, 0x18), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_UCDR_SB2_GAIN1_RATE3, 0x12), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_UCDR_SB2_GAIN2_RATE3, 0x18), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_RX_MODE_RATE_0_1_B0, 0xc2), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_RX_MODE_RATE_0_1_B1, 0xc2), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_RX_MODE_RATE_0_1_B2, 0x18), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_RX_MODE_RATE_0_1_B4, 0x0f), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_RX_MODE_RATE_0_1_B7, 0x62), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_RX_MODE_RATE3_B0, 0xe4), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_RX_MODE_RATE3_B1, 0x63), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_RX_MODE_RATE3_B2, 0xd8), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_RX_MODE_RATE3_B3, 0x99), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_RX_MODE_RATE3_B4, 0x67), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_RX_MODE_RATE4_SA_B0, 0xa4), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_RX_MODE_RATE4_SA_B1, 0xa4), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_RX_MODE_RATE4_SA_B2, 0x28), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_RX_MODE_RATE4_SA_B3, 0x9f), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_RX_MODE_RATE4_SA_B4, 0x48), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_RX_MODE_RATE4_SA_B5, 0x24), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_Q_PI_INTRINSIC_BIAS_RATE32, 0x01), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_Q_PI_INTRINSIC_BIAS_RATE4, 0x00), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_EOM_MAX_ERR_LIMIT_LSB, 0xff), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_EOM_MAX_ERR_LIMIT_MSB, 0xff), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_AUXDATA_BIN_RATE23, 0x30), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_AUXDATA_BIN_RATE4, 0x03), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_VTHRESH_CAL_MAN_VAL_RATE3, 0x1f), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_VTHRESH_CAL_MAN_VAL_RATE4, 0x1f), + QMP_PHY_INIT_CFG(QSERDES_V8_PCIE_RX_GM_CAL, 0x0d), +}; + +static const struct qmp_phy_init_tbl kaanapali_qmp_gen3x2_pcie_pcs_tbl[] = { + QMP_PHY_INIT_CFG(QPHY_PCIE_V8_PCS_G12S1_TXDEEMPH_M6DB, 0x17), + QMP_PHY_INIT_CFG(QPHY_PCIE_V8_PCS_G3S2_PRE_GAIN, 0x2e), + QMP_PHY_INIT_CFG(QPHY_PCIE_V8_PCS_RX_SIGDET_LVL, 0xcc), + QMP_PHY_INIT_CFG(QPHY_PCIE_V8_PCS_ELECIDLE_DLY_SEL, 0x40), + QMP_PHY_INIT_CFG(QPHY_PCIE_V8_PCS_PCS_TX_RX_CONFIG1, 0x04), + QMP_PHY_INIT_CFG(QPHY_PCIE_V8_PCS_PCS_TX_RX_CONFIG2, 0x02), + QMP_PHY_INIT_CFG(QPHY_PCIE_V8_PCS_EQ_CONFIG4, 0x00), + QMP_PHY_INIT_CFG(QPHY_PCIE_V8_PCS_EQ_CONFIG5, 0x22), +}; + +static const struct qmp_phy_init_tbl kaanapali_qmp_gen3x2_pcie_pcs_misc_tbl[] = { + QMP_PHY_INIT_CFG(QPHY_PCIE_V8_PCS_TX_RX_CONFIG, 0xc0), + QMP_PHY_INIT_CFG(QPHY_PCIE_V8_PCS_POWER_STATE_CONFIG2, 0x1d), + QMP_PHY_INIT_CFG(QPHY_PCIE_V8_PCS_ENDPOINT_REFCLK_DRIVE, 0xc1), + QMP_PHY_INIT_CFG(QPHY_PCIE_V8_PCS_OSC_DTCT_ACTIONS, 0x00), + QMP_PHY_INIT_CFG(QPHY_PCIE_V8_PCS_EQ_CONFIG1, 0x16), + QMP_PHY_INIT_CFG(QPHY_PCIE_V8_PCS_G3_RXEQEVAL_TIME, 0x27), + QMP_PHY_INIT_CFG(QPHY_PCIE_V8_PCS_G4_RXEQEVAL_TIME, 0x27), + QMP_PHY_INIT_CFG(QPHY_PCIE_V8_PCS_G4_EQ_CONFIG5, 0x02), + QMP_PHY_INIT_CFG(QPHY_PCIE_V8_PCS_G4_PRE_GAIN, 0x2e), + QMP_PHY_INIT_CFG(QPHY_PCIE_V8_PCS_RX_MARGINING_CONFIG1, 0x03), + QMP_PHY_INIT_CFG(QPHY_PCIE_V8_PCS_RX_MARGINING_CONFIG3, 0x28), + QMP_PHY_INIT_CFG(QPHY_PCIE_V8_PCS_RX_MARGINING_CONFIG5, 0x0f), + QMP_PHY_INIT_CFG(QPHY_PCIE_V8_PCS_G3_FOM_EQ_CONFIG5, 0xf2), + QMP_PHY_INIT_CFG(QPHY_PCIE_V8_PCS_G4_FOM_EQ_CONFIG5, 0xf2), + QMP_PHY_INIT_CFG(QPHY_PCIE_V8_PCS_POWER_STATE_CONFIG6, 0x1f), +}; + struct qmp_pcie_offsets { u16 serdes; u16 pcs; @@ -3363,6 +3516,16 @@ static const struct qmp_pcie_offsets qmp_pcie_offsets_v6_30 = { .ln_shrd = 0x8000, }; +static const struct qmp_pcie_offsets qmp_pcie_offsets_v8_0 = { + .serdes = 0x1000, + .pcs = 0x1400, + .pcs_misc = 0x1800, + .tx = 0x0000, + .rx = 0x0200, + .tx2 = 0x0800, + .rx2 = 0x0a00, +}; + static const struct qmp_pcie_offsets qmp_pcie_offsets_v8_50 = { .serdes = 0x8000, .pcs = 0x9000, @@ -4425,6 +4588,34 @@ static const struct qmp_phy_cfg qmp_v6_gen4x4_pciephy_cfg = { .phy_status = PHYSTATUS_4_20, }; +static const struct qmp_phy_cfg qmp_v8_gen3x2_pciephy_cfg = { + .lanes = 2, + + .offsets = &qmp_pcie_offsets_v8_0, + + .tbls = { + .serdes = kaanapali_qmp_gen3x2_pcie_serdes_tbl, + .serdes_num = ARRAY_SIZE(kaanapali_qmp_gen3x2_pcie_serdes_tbl), + .tx = kaanapali_qmp_gen3x2_pcie_tx_tbl, + .tx_num = ARRAY_SIZE(kaanapali_qmp_gen3x2_pcie_tx_tbl), + .rx = kaanapali_qmp_gen3x2_pcie_rx_tbl, + .rx_num = ARRAY_SIZE(kaanapali_qmp_gen3x2_pcie_rx_tbl), + .pcs = kaanapali_qmp_gen3x2_pcie_pcs_tbl, + .pcs_num = ARRAY_SIZE(kaanapali_qmp_gen3x2_pcie_pcs_tbl), + .pcs_misc = kaanapali_qmp_gen3x2_pcie_pcs_misc_tbl, + .pcs_misc_num = ARRAY_SIZE(kaanapali_qmp_gen3x2_pcie_pcs_misc_tbl), + }, + + .reset_list = sdm845_pciephy_reset_l, + .num_resets = ARRAY_SIZE(sdm845_pciephy_reset_l), + .vreg_list = qmp_phy_vreg_l, + .num_vregs = ARRAY_SIZE(qmp_phy_vreg_l), + .regs = pciephy_v8_regs_layout, + + .pwrdn_ctrl = SW_PWRDN | REFCLK_DRV_DSBL, + .phy_status = PHYSTATUS_4_20, +}; + static const struct qmp_phy_cfg glymur_qmp_gen5x4_pciephy_cfg = { .lanes = 4, @@ -5228,6 +5419,9 @@ static const struct of_device_id qmp_pcie_of_match_table[] = { }, { .compatible = "qcom,ipq9574-qmp-gen3x2-pcie-phy", .data = &ipq9574_gen3x2_pciephy_cfg, + }, { + .compatible = "qcom,kaanapali-qmp-gen3x2-pcie-phy", + .data = &qmp_v8_gen3x2_pciephy_cfg, }, { .compatible = "qcom,msm8998-qmp-pcie-phy", .data = &msm8998_pciephy_cfg, From 4ff39898277e410d15aaf002d1324c4addfa9155 Mon Sep 17 00:00:00 2001 From: Krishna Chaitanya Chundru Date: Wed, 28 Jan 2026 17:10:42 +0530 Subject: [PATCH 1302/1321] FROMLIST: PCI: dwc: Use common D3cold eligibility helper in suspend path Previously, the driver skipped putting the link into L2/device state in D3cold whenever L1 ASPM was enabled, since some devices (e.g. NVMe) expect low resume latency and may not tolerate deeper power states. However, such devices typically remain in D0 and are already covered by the new helper's requirement that all endpoints be in D3hot before the host bridge may enter D3cold. So, replace the local L1/L1SS-based check in dw_pcie_suspend_noirq() with the shared pci_host_common_can_enter_d3cold() helper to decide whether the DesignWare host bridge can safely transition to D3cold. Link: https://lore.kernel.org/r/20260128-d3cold-v1-2-dd8f3f0ce824@oss.qualcomm.com Signed-off-by: Krishna Chaitanya Chundru --- drivers/pci/controller/dwc/pcie-designware-host.c | 7 +------ drivers/pci/controller/dwc/pcie-designware.h | 1 + 2 files changed, 2 insertions(+), 6 deletions(-) diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c index 372207c33a857..2c8056761addf 100644 --- a/drivers/pci/controller/dwc/pcie-designware-host.c +++ b/drivers/pci/controller/dwc/pcie-designware-host.c @@ -1157,15 +1157,10 @@ static int dw_pcie_pme_turn_off(struct dw_pcie *pci) int dw_pcie_suspend_noirq(struct dw_pcie *pci) { - u8 offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP); u32 val; int ret; - /* - * If L1SS is supported, then do not put the link into L2 as some - * devices such as NVMe expect low resume latency. - */ - if (dw_pcie_readw_dbi(pci, offset + PCI_EXP_LNKCTL) & PCI_EXP_LNKCTL_ASPM_L1) + if (!pci_host_common_can_enter_d3cold(pci->pp.bridge)) return 0; if (pci->pp.ops->pme_turn_off) { diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h index 31685951a0804..18d8f7d5d2308 100644 --- a/drivers/pci/controller/dwc/pcie-designware.h +++ b/drivers/pci/controller/dwc/pcie-designware.h @@ -26,6 +26,7 @@ #include #include +#include "../pci-host-common.h" #include "../../pci.h" /* DWC PCIe IP-core versions (native support since v4.70a) */ From 545e8d0619f7a3e90ce87dcb18c8960b3fb04a4a Mon Sep 17 00:00:00 2001 From: Krishna Chaitanya Chundru Date: Wed, 28 Jan 2026 17:10:41 +0530 Subject: [PATCH 1303/1321] FROMLIST: PCI: host-common: Add shared D3cold eligibility helper for host bridges Add a common helper, pci_host_common_can_enter_d3cold(), to determine whether a PCI host bridge can safely transition to D3cold. The helper walks all devices on the bridge's bus and only allows the host bridge to enter D3cold if all PCIe endpoints are already in PCI_D3hot. For devices that may wake the system, it additionally requires that the device supports PME wakeup from D3cold(with WAKE#). Devices without wakeup enabled are not restricted by this check and can be allowed to keep device in D3cold. Link: https://lore.kernel.org/r/20260128-d3cold-v1-1-dd8f3f0ce824@oss.qualcomm.com Signed-off-by: Krishna Chaitanya Chundru --- drivers/pci/controller/pci-host-common.c | 29 ++++++++++++++++++++++++ drivers/pci/controller/pci-host-common.h | 2 ++ 2 files changed, 31 insertions(+) diff --git a/drivers/pci/controller/pci-host-common.c b/drivers/pci/controller/pci-host-common.c index c473e7c03baca..225472c5ac82c 100644 --- a/drivers/pci/controller/pci-host-common.c +++ b/drivers/pci/controller/pci-host-common.c @@ -106,5 +106,34 @@ void pci_host_common_remove(struct platform_device *pdev) } EXPORT_SYMBOL_GPL(pci_host_common_remove); +static int pci_host_common_check_d3cold(struct pci_dev *pdev, void *userdata) +{ + bool *d3cold_allow = userdata; + + if (pci_pcie_type(pdev) != PCI_EXP_TYPE_ENDPOINT) + return 0; + + if (pdev->current_state != PCI_D3hot) + goto exit; + + if (device_may_wakeup(&pdev->dev) && !pci_pme_capable(pdev, PCI_D3cold)) + goto exit; + + return 0; +exit: + *d3cold_allow = false; + return -EBUSY; +} + +bool pci_host_common_can_enter_d3cold(struct pci_host_bridge *bridge) +{ + bool d3cold_allow = true; + + pci_walk_bus(bridge->bus, pci_host_common_check_d3cold, &d3cold_allow); + + return d3cold_allow; +} +EXPORT_SYMBOL_GPL(pci_host_common_can_enter_d3cold); + MODULE_DESCRIPTION("Common library for PCI host controller drivers"); MODULE_LICENSE("GPL v2"); diff --git a/drivers/pci/controller/pci-host-common.h b/drivers/pci/controller/pci-host-common.h index b5075d4bd7eb3..18a731bca0588 100644 --- a/drivers/pci/controller/pci-host-common.h +++ b/drivers/pci/controller/pci-host-common.h @@ -20,4 +20,6 @@ void pci_host_common_remove(struct platform_device *pdev); struct pci_config_window *pci_host_common_ecam_create(struct device *dev, struct pci_host_bridge *bridge, const struct pci_ecam_ops *ops); + +bool pci_host_common_can_enter_d3cold(struct pci_host_bridge *bridge); #endif From 54216660b814db8ef9a4aead2b0df2cdb00b0958 Mon Sep 17 00:00:00 2001 From: Krishna Chaitanya Chundru Date: Wed, 28 Jan 2026 17:10:43 +0530 Subject: [PATCH 1304/1321] FROMLIST: PCI: qcom: Add D3cold support Add pme_turn_off() support and use DWC common suspend resume methods for device D3cold entry & exit. If the device is not kept in D3cold use existing methods like keeping icc votes, opp votes etc.. intact. In qcom_pcie_deinit_2_7_0(), explicitly disable PCIe clocks and resets in the controller. Remove suspended flag from qcom_pcie structure as it is no longer needed. Link: https://lore.kernel.org/r/20260128-d3cold-v1-3-dd8f3f0ce824@oss.qualcomm.com Signed-off-by: Krishna Chaitanya Chundru --- drivers/pci/controller/dwc/pcie-qcom.c | 116 +++++++++++++++---------- 1 file changed, 69 insertions(+), 47 deletions(-) diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c index 5a318487b2b3f..37019da5236ee 100644 --- a/drivers/pci/controller/dwc/pcie-qcom.c +++ b/drivers/pci/controller/dwc/pcie-qcom.c @@ -150,6 +150,7 @@ /* ELBI_SYS_CTRL register fields */ #define ELBI_SYS_CTRL_LT_ENABLE BIT(0) +#define ELBI_SYS_CTRL_PME_TURNOFF_MSG BIT(4) /* AXI_MSTR_RESP_COMP_CTRL0 register fields */ #define CFG_REMOTE_RD_REQ_BRIDGE_SIZE_2K 0x4 @@ -283,7 +284,6 @@ struct qcom_pcie { const struct qcom_pcie_cfg *cfg; struct dentry *debugfs; struct list_head ports; - bool suspended; bool use_pm_opp; }; @@ -1074,6 +1074,12 @@ static void qcom_pcie_host_post_init_2_7_0(struct qcom_pcie *pcie) static void qcom_pcie_deinit_2_7_0(struct qcom_pcie *pcie) { struct qcom_pcie_resources_2_7_0 *res = &pcie->res.v2_7_0; + u32 val; + + /* Disable PCIe clocks and resets */ + val = readl(pcie->parf + PARF_PHY_CTRL); + val |= PHY_TEST_PWR_DOWN; + writel(val, pcie->parf + PARF_PHY_CTRL); clk_bulk_disable_unprepare(res->num_clks, res->clks); @@ -1356,10 +1362,18 @@ static void qcom_pcie_host_post_init(struct dw_pcie_rp *pp) pcie->cfg->ops->host_post_init(pcie); } +static void qcom_pcie_host_pme_turn_off(struct dw_pcie_rp *pp) +{ + struct dw_pcie *pci = to_dw_pcie_from_pp(pp); + + writel(ELBI_SYS_CTRL_PME_TURNOFF_MSG, pci->elbi_base + ELBI_SYS_CTRL); +} + static const struct dw_pcie_host_ops qcom_pcie_dw_ops = { .init = qcom_pcie_host_init, .deinit = qcom_pcie_host_deinit, .post_init = qcom_pcie_host_post_init, + .pme_turn_off = qcom_pcie_host_pme_turn_off, }; /* Qcom IP rev.: 2.1.0 Synopsys IP rev.: 4.01a */ @@ -2018,53 +2032,51 @@ static int qcom_pcie_suspend_noirq(struct device *dev) if (!pcie) return 0; - /* - * Set minimum bandwidth required to keep data path functional during - * suspend. - */ - if (pcie->icc_mem) { - ret = icc_set_bw(pcie->icc_mem, 0, kBps_to_icc(1)); - if (ret) { - dev_err(dev, - "Failed to set bandwidth for PCIe-MEM interconnect path: %d\n", - ret); - return ret; - } - } + ret = dw_pcie_suspend_noirq(pcie->pci); + if (ret) + return ret; - /* - * Turn OFF the resources only for controllers without active PCIe - * devices. For controllers with active devices, the resources are kept - * ON and the link is expected to be in L0/L1 (sub)states. - * - * Turning OFF the resources for controllers with active PCIe devices - * will trigger access violation during the end of the suspend cycle, - * as kernel tries to access the PCIe devices config space for masking - * MSIs. - * - * Also, it is not desirable to put the link into L2/L3 state as that - * implies VDD supply will be removed and the devices may go into - * powerdown state. This will affect the lifetime of the storage devices - * like NVMe. - */ - if (!dw_pcie_link_up(pcie->pci)) { - qcom_pcie_host_deinit(&pcie->pci->pp); - pcie->suspended = true; - } + if (pcie->pci->suspended) { + ret = icc_disable(pcie->icc_mem); + if (ret) + dev_err(dev, "Failed to disable PCIe-MEM interconnect path: %d\n", ret); - /* - * Only disable CPU-PCIe interconnect path if the suspend is non-S2RAM. - * Because on some platforms, DBI access can happen very late during the - * S2RAM and a non-active CPU-PCIe interconnect path may lead to NoC - * error. - */ - if (pm_suspend_target_state != PM_SUSPEND_MEM) { ret = icc_disable(pcie->icc_cpu); if (ret) dev_err(dev, "Failed to disable CPU-PCIe interconnect path: %d\n", ret); if (pcie->use_pm_opp) dev_pm_opp_set_opp(pcie->pci->dev, NULL); + } else { + /* + * Set minimum bandwidth required to keep data path functional during + * suspend. + */ + if (pcie->icc_mem) { + ret = icc_set_bw(pcie->icc_mem, 0, kBps_to_icc(1)); + if (ret) { + dev_err(dev, + "Failed to set bandwidth for PCIe-MEM interconnect path: %d\n", + ret); + return ret; + } + } + + /* + * Only disable CPU-PCIe interconnect path if the suspend is non-S2RAM. + * Because on some platforms, DBI access can happen very late during the + * S2RAM and a non-active CPU-PCIe interconnect path may lead to NoC + * error. + */ + if (pm_suspend_target_state != PM_SUSPEND_MEM) { + ret = icc_disable(pcie->icc_cpu); + if (ret) + dev_err(dev, "Failed to disable CPU-PCIe interconnect path: %d\n", + ret); + + if (pcie->use_pm_opp) + dev_pm_opp_set_opp(pcie->pci->dev, NULL); + } } return ret; } @@ -2078,20 +2090,30 @@ static int qcom_pcie_resume_noirq(struct device *dev) if (!pcie) return 0; - if (pm_suspend_target_state != PM_SUSPEND_MEM) { + if (pcie->pci->suspended) { ret = icc_enable(pcie->icc_cpu); if (ret) { dev_err(dev, "Failed to enable CPU-PCIe interconnect path: %d\n", ret); return ret; } - } - if (pcie->suspended) { - ret = qcom_pcie_host_init(&pcie->pci->pp); - if (ret) + ret = icc_enable(pcie->icc_mem); + if (ret) { + dev_err(dev, "Failed to enable PCIe-MEM interconnect path: %d\n", ret); return ret; - - pcie->suspended = false; + } + ret = dw_pcie_resume_noirq(pcie->pci); + if (ret && (ret != -ETIMEDOUT)) + return ret; + } else { + if (pm_suspend_target_state != PM_SUSPEND_MEM) { + ret = icc_enable(pcie->icc_cpu); + if (ret) { + dev_err(dev, "Failed to enable CPU-PCIe interconnect path: %d\n", + ret); + return ret; + } + } } qcom_pcie_icc_opp_update(pcie); From 7f7cdd2c07583f071aed00ec207f8f2453f01aee Mon Sep 17 00:00:00 2001 From: Krishna Chaitanya Chundru Date: Wed, 28 Jan 2026 17:52:42 +0530 Subject: [PATCH 1305/1321] FROMLIST: PCI: qcom: Prevent GDSC power down on suspendi Currently, the driver expects the devices to remain in D0 across system suspend, but the genpd framework may still power down the associated GDSC during suspend. When that happens, the PCIe link goes down and cannot be recovered on resume. Prevent genpd from turning off the PCIe GDSC by using dev_pm_genpd_rpm_always_on() so that the power domain stays on while the controller is suspended. This preserves the link state across suspend/resume and avoids unrecoverable link failures. Fixes: 82a823833f4e ("PCI: qcom: Add Qualcomm PCIe controller driver") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20260128-genpd_fix-v1-1-cd45a249d12f@oss.qualcomm.com Signed-off-by: Krishna Chaitanya Chundru --- drivers/pci/controller/dwc/pcie-qcom.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c index 37019da5236ee..0610692a129b7 100644 --- a/drivers/pci/controller/dwc/pcie-qcom.c +++ b/drivers/pci/controller/dwc/pcie-qcom.c @@ -25,6 +25,7 @@ #include #include #include +#include #include #include #include @@ -2036,6 +2037,11 @@ static int qcom_pcie_suspend_noirq(struct device *dev) if (ret) return ret; + if (pcie->pci->suspended) + dev_pm_genpd_rpm_always_on(dev, false); + else + dev_pm_genpd_rpm_always_on(dev, true); + if (pcie->pci->suspended) { ret = icc_disable(pcie->icc_mem); if (ret) From c8d94ff89bc5cb9501164e6ebdbec857b3f111f2 Mon Sep 17 00:00:00 2001 From: Bjorn Helgaas Date: Thu, 15 Jan 2026 12:58:53 +0530 Subject: [PATCH 1306/1321] FROMLIST: PCI/pwrctrl: pwrseq: Rename private struct and pointers for consistency Previously the pwrseq, tc9563, and slot pwrctrl drivers used different naming conventions for their private data structs and pointers to them, which makes patches hard to read: Previous names New names ------------------------------------ ---------------------------------- struct pci_pwrctrl_pwrseq_data { struct pci_pwrctrl_pwrseq { struct pci_pwrctrl ctx; struct pci_pwrctrl pwrctrl; struct pci_pwrctrl_pwrseq_data *data struct pci_pwrctrl_pwrseq *pwrseq struct tc9563_pwrctrl_ctx { struct pci_pwrctrl_tc9563 { struct tc9563_pwrctrl_ctx *ctx struct pci_pwrctrl_tc9563 *tc9563 struct pci_pwrctrl_slot_data { struct pci_pwrctrl_slot { struct pci_pwrctrl ctx; struct pci_pwrctrl pwrctrl; struct pci_pwrctrl_slot_data *slot struct pci_pwrctrl_slot *slot Rename "struct pci_pwrctrl_pwrseq_data" to "pci_pwrctrl_pwrseq". Rename the "struct pci_pwrctrl ctx" member to "struct pci_pwrctrl pwrctrl". Rename pointers from "struct pci_pwrctrl_pwrseq_data *data" to "struct pci_pwrctrl_pwrseq *pwrseq". No functional change intended. Signed-off-by: Bjorn Helgaas Signed-off-by: Manivannan Sadhasivam Reviewed-by: Bartosz Golaszewski Link: https://lore.kernel.org/r/20260115-pci-pwrctrl-rework-v5-1-9d26da3ce903@oss.qualcomm.com --- drivers/pci/pwrctrl/pci-pwrctrl-pwrseq.c | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/drivers/pci/pwrctrl/pci-pwrctrl-pwrseq.c b/drivers/pci/pwrctrl/pci-pwrctrl-pwrseq.c index 4e664e7b8dd23..c0d22dc3a8561 100644 --- a/drivers/pci/pwrctrl/pci-pwrctrl-pwrseq.c +++ b/drivers/pci/pwrctrl/pci-pwrctrl-pwrseq.c @@ -13,8 +13,8 @@ #include #include -struct pci_pwrctrl_pwrseq_data { - struct pci_pwrctrl ctx; +struct pci_pwrctrl_pwrseq { + struct pci_pwrctrl pwrctrl; struct pwrseq_desc *pwrseq; }; @@ -62,7 +62,7 @@ static void devm_pci_pwrctrl_pwrseq_power_off(void *data) static int pci_pwrctrl_pwrseq_probe(struct platform_device *pdev) { const struct pci_pwrctrl_pwrseq_pdata *pdata; - struct pci_pwrctrl_pwrseq_data *data; + struct pci_pwrctrl_pwrseq *pwrseq; struct device *dev = &pdev->dev; int ret; @@ -76,28 +76,28 @@ static int pci_pwrctrl_pwrseq_probe(struct platform_device *pdev) return ret; } - data = devm_kzalloc(dev, sizeof(*data), GFP_KERNEL); - if (!data) + pwrseq = devm_kzalloc(dev, sizeof(*pwrseq), GFP_KERNEL); + if (!pwrseq) return -ENOMEM; - data->pwrseq = devm_pwrseq_get(dev, pdata->target); - if (IS_ERR(data->pwrseq)) - return dev_err_probe(dev, PTR_ERR(data->pwrseq), + pwrseq->pwrseq = devm_pwrseq_get(dev, pdata->target); + if (IS_ERR(pwrseq->pwrseq)) + return dev_err_probe(dev, PTR_ERR(pwrseq->pwrseq), "Failed to get the power sequencer\n"); - ret = pwrseq_power_on(data->pwrseq); + ret = pwrseq_power_on(pwrseq->pwrseq); if (ret) return dev_err_probe(dev, ret, "Failed to power-on the device\n"); ret = devm_add_action_or_reset(dev, devm_pci_pwrctrl_pwrseq_power_off, - data->pwrseq); + pwrseq->pwrseq); if (ret) return ret; - pci_pwrctrl_init(&data->ctx, dev); + pci_pwrctrl_init(&pwrseq->pwrctrl, dev); - ret = devm_pci_pwrctrl_device_set_ready(dev, &data->ctx); + ret = devm_pci_pwrctrl_device_set_ready(dev, &pwrseq->pwrctrl); if (ret) return dev_err_probe(dev, ret, "Failed to register the pwrctrl wrapper\n"); From 2e8d195ab28768e6ee93473cd5fd3adcf2d3a541 Mon Sep 17 00:00:00 2001 From: Bjorn Helgaas Date: Thu, 15 Jan 2026 12:58:54 +0530 Subject: [PATCH 1307/1321] FROMLIST: PCI/pwrctrl: slot: Rename private struct and pointers for consistency Previously the pwrseq, tc9563, and slot pwrctrl drivers used different naming conventions for their private data structs and pointers to them, which makes patches hard to read: Previous names New names ------------------------------------ ---------------------------------- struct pci_pwrctrl_pwrseq_data { struct pci_pwrctrl_pwrseq { struct pci_pwrctrl ctx; struct pci_pwrctrl pwrctrl; struct pci_pwrctrl_pwrseq_data *data struct pci_pwrctrl_pwrseq *pwrseq struct tc9563_pwrctrl_ctx { struct pci_pwrctrl_tc9563 { struct tc9563_pwrctrl_ctx *ctx struct pci_pwrctrl_tc9563 *tc9563 struct pci_pwrctrl_slot_data { struct pci_pwrctrl_slot { struct pci_pwrctrl ctx; struct pci_pwrctrl pwrctrl; struct pci_pwrctrl_slot_data *slot struct pci_pwrctrl_slot *slot Rename "struct pci_pwrctrl_slot_data" to "struct pci_pwrctrl_slot". Rename the "struct pci_pwrctrl ctx" member to "struct pci_pwrctrl pwrctrl". No functional change intended. Signed-off-by: Bjorn Helgaas Signed-off-by: Manivannan Sadhasivam Reviewed-by: Bartosz Golaszewski Link: https://lore.kernel.org/r/20260115-pci-pwrctrl-rework-v5-2-9d26da3ce903@oss.qualcomm.com --- drivers/pci/pwrctrl/slot.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/pci/pwrctrl/slot.c b/drivers/pci/pwrctrl/slot.c index 3320494b62d89..5ddae4ae34315 100644 --- a/drivers/pci/pwrctrl/slot.c +++ b/drivers/pci/pwrctrl/slot.c @@ -13,15 +13,15 @@ #include #include -struct pci_pwrctrl_slot_data { - struct pci_pwrctrl ctx; +struct pci_pwrctrl_slot { + struct pci_pwrctrl pwrctrl; struct regulator_bulk_data *supplies; int num_supplies; }; static void devm_pci_pwrctrl_slot_power_off(void *data) { - struct pci_pwrctrl_slot_data *slot = data; + struct pci_pwrctrl_slot *slot = data; regulator_bulk_disable(slot->num_supplies, slot->supplies); regulator_bulk_free(slot->num_supplies, slot->supplies); @@ -29,7 +29,7 @@ static void devm_pci_pwrctrl_slot_power_off(void *data) static int pci_pwrctrl_slot_probe(struct platform_device *pdev) { - struct pci_pwrctrl_slot_data *slot; + struct pci_pwrctrl_slot *slot; struct device *dev = &pdev->dev; struct clk *clk; int ret; @@ -64,9 +64,9 @@ static int pci_pwrctrl_slot_probe(struct platform_device *pdev) "Failed to enable slot clock\n"); } - pci_pwrctrl_init(&slot->ctx, dev); + pci_pwrctrl_init(&slot->pwrctrl, dev); - ret = devm_pci_pwrctrl_device_set_ready(dev, &slot->ctx); + ret = devm_pci_pwrctrl_device_set_ready(dev, &slot->pwrctrl); if (ret) return dev_err_probe(dev, ret, "Failed to register pwrctrl driver\n"); From d40c66e2de7be743fa02608fcf5c8726196423c0 Mon Sep 17 00:00:00 2001 From: Manivannan Sadhasivam Date: Thu, 15 Jan 2026 12:58:55 +0530 Subject: [PATCH 1308/1321] FROMLIST: PCI/pwrctrl: tc9563: Use put_device() instead of i2c_put_adapter() The API comment for of_find_i2c_adapter_by_node() recommends using put_device() to drop the reference count of I2C adapter instead of using i2c_put_adapter(). So replace i2c_put_adapter() with put_device(). Fixes: 4c9c7be47310 ("PCI: pwrctrl: Add power control driver for TC9563") Signed-off-by: Bjorn Helgaas Signed-off-by: Manivannan Sadhasivam Reviewed-by: Bartosz Golaszewski Link: https://lore.kernel.org/r/20260115-pci-pwrctrl-rework-v5-3-9d26da3ce903@oss.qualcomm.com --- drivers/pci/pwrctrl/pci-pwrctrl-tc9563.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/pci/pwrctrl/pci-pwrctrl-tc9563.c b/drivers/pci/pwrctrl/pci-pwrctrl-tc9563.c index ec423432ac655..0a63add84d095 100644 --- a/drivers/pci/pwrctrl/pci-pwrctrl-tc9563.c +++ b/drivers/pci/pwrctrl/pci-pwrctrl-tc9563.c @@ -533,7 +533,7 @@ static int tc9563_pwrctrl_probe(struct platform_device *pdev) ctx->client = i2c_new_dummy_device(ctx->adapter, addr); if (IS_ERR(ctx->client)) { dev_err(dev, "Failed to create I2C client\n"); - i2c_put_adapter(ctx->adapter); + put_device(&ctx->adapter->dev); return PTR_ERR(ctx->client); } @@ -613,7 +613,7 @@ static int tc9563_pwrctrl_probe(struct platform_device *pdev) tc9563_pwrctrl_power_off(ctx); remove_i2c: i2c_unregister_device(ctx->client); - i2c_put_adapter(ctx->adapter); + put_device(&ctx->adapter->dev); return ret; } @@ -623,7 +623,7 @@ static void tc9563_pwrctrl_remove(struct platform_device *pdev) tc9563_pwrctrl_power_off(ctx); i2c_unregister_device(ctx->client); - i2c_put_adapter(ctx->adapter); + put_device(&ctx->adapter->dev); } static const struct of_device_id tc9563_pwrctrl_of_match[] = { From 3dd8223913520e595145c08713a41a310d5fb4e8 Mon Sep 17 00:00:00 2001 From: Bjorn Helgaas Date: Thu, 15 Jan 2026 12:58:56 +0530 Subject: [PATCH 1309/1321] FROMLIST: PCI/pwrctrl: tc9563: Clean up whitespace Most of pci-pwrctrl-tc9563.c fits in 80 columns. Wrap lines that are gratuitously longer. Whitespace changes only. Signed-off-by: Bjorn Helgaas Signed-off-by: Manivannan Sadhasivam Reviewed-by: Bartosz Golaszewski Link: https://lore.kernel.org/r/20260115-pci-pwrctrl-rework-v5-4-9d26da3ce903@oss.qualcomm.com --- drivers/pci/pwrctrl/pci-pwrctrl-tc9563.c | 65 +++++++++++++++--------- 1 file changed, 42 insertions(+), 23 deletions(-) diff --git a/drivers/pci/pwrctrl/pci-pwrctrl-tc9563.c b/drivers/pci/pwrctrl/pci-pwrctrl-tc9563.c index 0a63add84d095..efc4d2054bfda 100644 --- a/drivers/pci/pwrctrl/pci-pwrctrl-tc9563.c +++ b/drivers/pci/pwrctrl/pci-pwrctrl-tc9563.c @@ -59,7 +59,7 @@ #define TC9563_POWER_CONTROL_OVREN 0x82b2c8 #define TC9563_GPIO_MASK 0xfffffff3 -#define TC9563_GPIO_DEASSERT_BITS 0xc /* Bits to clear for GPIO deassert */ +#define TC9563_GPIO_DEASSERT_BITS 0xc /* Clear to deassert GPIO */ #define TC9563_TX_MARGIN_MIN_UA 400000 @@ -69,7 +69,7 @@ */ #define TC9563_OSC_STAB_DELAY_US (10 * USEC_PER_MSEC) -#define TC9563_L0S_L1_DELAY_UNIT_NS 256 /* Each unit represents 256 nanoseconds */ +#define TC9563_L0S_L1_DELAY_UNIT_NS 256 /* Each unit represents 256 ns */ struct tc9563_pwrctrl_reg_setting { unsigned int offset; @@ -217,7 +217,8 @@ static int tc9563_pwrctrl_i2c_read(struct i2c_client *client, } static int tc9563_pwrctrl_i2c_bulk_write(struct i2c_client *client, - const struct tc9563_pwrctrl_reg_setting *seq, int len) + const struct tc9563_pwrctrl_reg_setting *seq, + int len) { int ret, i; @@ -252,12 +253,13 @@ static int tc9563_pwrctrl_disable_port(struct tc9563_pwrctrl_ctx *ctx, if (ret) return ret; - return tc9563_pwrctrl_i2c_bulk_write(ctx->client, - common_pwroff_seq, ARRAY_SIZE(common_pwroff_seq)); + return tc9563_pwrctrl_i2c_bulk_write(ctx->client, common_pwroff_seq, + ARRAY_SIZE(common_pwroff_seq)); } static int tc9563_pwrctrl_set_l0s_l1_entry_delay(struct tc9563_pwrctrl_ctx *ctx, - enum tc9563_pwrctrl_ports port, bool is_l1, u32 ns) + enum tc9563_pwrctrl_ports port, + bool is_l1, u32 ns) { u32 rd_val, units; int ret; @@ -269,24 +271,32 @@ static int tc9563_pwrctrl_set_l0s_l1_entry_delay(struct tc9563_pwrctrl_ctx *ctx, units = ns / TC9563_L0S_L1_DELAY_UNIT_NS; if (port == TC9563_ETHERNET) { - ret = tc9563_pwrctrl_i2c_read(ctx->client, TC9563_EMBEDDED_ETH_DELAY, &rd_val); + ret = tc9563_pwrctrl_i2c_read(ctx->client, + TC9563_EMBEDDED_ETH_DELAY, + &rd_val); if (ret) return ret; if (is_l1) - rd_val = u32_replace_bits(rd_val, units, TC9563_ETH_L1_DELAY_MASK); + rd_val = u32_replace_bits(rd_val, units, + TC9563_ETH_L1_DELAY_MASK); else - rd_val = u32_replace_bits(rd_val, units, TC9563_ETH_L0S_DELAY_MASK); + rd_val = u32_replace_bits(rd_val, units, + TC9563_ETH_L0S_DELAY_MASK); - return tc9563_pwrctrl_i2c_write(ctx->client, TC9563_EMBEDDED_ETH_DELAY, rd_val); + return tc9563_pwrctrl_i2c_write(ctx->client, + TC9563_EMBEDDED_ETH_DELAY, + rd_val); } - ret = tc9563_pwrctrl_i2c_write(ctx->client, TC9563_PORT_SELECT, BIT(port)); + ret = tc9563_pwrctrl_i2c_write(ctx->client, TC9563_PORT_SELECT, + BIT(port)); if (ret) return ret; return tc9563_pwrctrl_i2c_write(ctx->client, - is_l1 ? TC9563_PORT_L1_DELAY : TC9563_PORT_L0S_DELAY, units); + is_l1 ? TC9563_PORT_L1_DELAY : TC9563_PORT_L0S_DELAY, + units); } static int tc9563_pwrctrl_set_tx_amplitude(struct tc9563_pwrctrl_ctx *ctx, @@ -321,7 +331,8 @@ static int tc9563_pwrctrl_set_tx_amplitude(struct tc9563_pwrctrl_ctx *ctx, {TC9563_TX_MARGIN, amp}, }; - return tc9563_pwrctrl_i2c_bulk_write(ctx->client, tx_amp_seq, ARRAY_SIZE(tx_amp_seq)); + return tc9563_pwrctrl_i2c_bulk_write(ctx->client, tx_amp_seq, + ARRAY_SIZE(tx_amp_seq)); } static int tc9563_pwrctrl_disable_dfe(struct tc9563_pwrctrl_ctx *ctx, @@ -364,8 +375,8 @@ static int tc9563_pwrctrl_disable_dfe(struct tc9563_pwrctrl_ctx *ctx, {TC9563_PHY_RATE_CHANGE_OVERRIDE, 0x0}, }; - return tc9563_pwrctrl_i2c_bulk_write(ctx->client, - disable_dfe_seq, ARRAY_SIZE(disable_dfe_seq)); + return tc9563_pwrctrl_i2c_bulk_write(ctx->client, disable_dfe_seq, + ARRAY_SIZE(disable_dfe_seq)); } static int tc9563_pwrctrl_set_nfts(struct tc9563_pwrctrl_ctx *ctx, @@ -381,18 +392,22 @@ static int tc9563_pwrctrl_set_nfts(struct tc9563_pwrctrl_ctx *ctx, if (!nfts[0]) return 0; - ret = tc9563_pwrctrl_i2c_write(ctx->client, TC9563_PORT_SELECT, BIT(port)); + ret = tc9563_pwrctrl_i2c_write(ctx->client, TC9563_PORT_SELECT, + BIT(port)); if (ret) return ret; - return tc9563_pwrctrl_i2c_bulk_write(ctx->client, nfts_seq, ARRAY_SIZE(nfts_seq)); + return tc9563_pwrctrl_i2c_bulk_write(ctx->client, nfts_seq, + ARRAY_SIZE(nfts_seq)); } -static int tc9563_pwrctrl_assert_deassert_reset(struct tc9563_pwrctrl_ctx *ctx, bool deassert) +static int tc9563_pwrctrl_assert_deassert_reset(struct tc9563_pwrctrl_ctx *ctx, + bool deassert) { int ret, val; - ret = tc9563_pwrctrl_i2c_write(ctx->client, TC9563_GPIO_CONFIG, TC9563_GPIO_MASK); + ret = tc9563_pwrctrl_i2c_write(ctx->client, TC9563_GPIO_CONFIG, + TC9563_GPIO_MASK); if (ret) return ret; @@ -401,7 +416,8 @@ static int tc9563_pwrctrl_assert_deassert_reset(struct tc9563_pwrctrl_ctx *ctx, return tc9563_pwrctrl_i2c_write(ctx->client, TC9563_RESET_GPIO, val); } -static int tc9563_pwrctrl_parse_device_dt(struct tc9563_pwrctrl_ctx *ctx, struct device_node *node, +static int tc9563_pwrctrl_parse_device_dt(struct tc9563_pwrctrl_ctx *ctx, + struct device_node *node, enum tc9563_pwrctrl_ports port) { struct tc9563_pwrctrl_cfg *cfg = &ctx->cfg[port]; @@ -540,7 +556,8 @@ static int tc9563_pwrctrl_probe(struct platform_device *pdev) for (int i = 0; i < ARRAY_SIZE(tc9563_supply_names); i++) ctx->supplies[i].supply = tc9563_supply_names[i]; - ret = devm_regulator_bulk_get(dev, TC9563_PWRCTL_MAX_SUPPLY, ctx->supplies); + ret = devm_regulator_bulk_get(dev, TC9563_PWRCTL_MAX_SUPPLY, + ctx->supplies); if (ret) { dev_err_probe(dev, ret, "failed to get supply regulator\n"); goto remove_i2c; @@ -563,7 +580,8 @@ static int tc9563_pwrctrl_probe(struct platform_device *pdev) /* * Downstream ports are always children of the upstream port. - * The first node represents DSP1, the second node represents DSP2, and so on. + * The first node represents DSP1, the second node represents DSP2, + * and so on. */ for_each_child_of_node_scoped(pdev->dev.of_node, child) { port++; @@ -574,7 +592,8 @@ static int tc9563_pwrctrl_probe(struct platform_device *pdev) if (port == TC9563_DSP3) { for_each_child_of_node_scoped(child, child1) { port++; - ret = tc9563_pwrctrl_parse_device_dt(ctx, child1, port); + ret = tc9563_pwrctrl_parse_device_dt(ctx, + child1, port); if (ret) break; } From ad129c34ffbbda37515c77da6f01c1c350bd8013 Mon Sep 17 00:00:00 2001 From: Bjorn Helgaas Date: Thu, 15 Jan 2026 12:58:57 +0530 Subject: [PATCH 1310/1321] FROMLIST: PCI/pwrctrl: tc9563: Add local variables to reduce repetition Add local struct device * and struct device_node * variables to reduce repetitive pointer chasing. No functional changes intended. Signed-off-by: Bjorn Helgaas Signed-off-by: Manivannan Sadhasivam Reviewed-by: Bartosz Golaszewski Link: https://lore.kernel.org/r/20260115-pci-pwrctrl-rework-v5-5-9d26da3ce903@oss.qualcomm.com --- drivers/pci/pwrctrl/pci-pwrctrl-tc9563.c | 22 ++++++++++++---------- 1 file changed, 12 insertions(+), 10 deletions(-) diff --git a/drivers/pci/pwrctrl/pci-pwrctrl-tc9563.c b/drivers/pci/pwrctrl/pci-pwrctrl-tc9563.c index efc4d2054bfda..90480e35e968f 100644 --- a/drivers/pci/pwrctrl/pci-pwrctrl-tc9563.c +++ b/drivers/pci/pwrctrl/pci-pwrctrl-tc9563.c @@ -459,12 +459,13 @@ static void tc9563_pwrctrl_power_off(struct tc9563_pwrctrl_ctx *ctx) static int tc9563_pwrctrl_bring_up(struct tc9563_pwrctrl_ctx *ctx) { + struct device *dev = ctx->pwrctrl.dev; struct tc9563_pwrctrl_cfg *cfg; int ret, i; ret = regulator_bulk_enable(ARRAY_SIZE(ctx->supplies), ctx->supplies); if (ret < 0) - return dev_err_probe(ctx->pwrctrl.dev, ret, "cannot enable regulators\n"); + return dev_err_probe(dev, ret, "cannot enable regulators\n"); gpiod_set_value(ctx->reset_gpio, 0); @@ -478,37 +479,37 @@ static int tc9563_pwrctrl_bring_up(struct tc9563_pwrctrl_ctx *ctx) cfg = &ctx->cfg[i]; ret = tc9563_pwrctrl_disable_port(ctx, i); if (ret) { - dev_err(ctx->pwrctrl.dev, "Disabling port failed\n"); + dev_err(dev, "Disabling port failed\n"); goto power_off; } ret = tc9563_pwrctrl_set_l0s_l1_entry_delay(ctx, i, false, cfg->l0s_delay); if (ret) { - dev_err(ctx->pwrctrl.dev, "Setting L0s entry delay failed\n"); + dev_err(dev, "Setting L0s entry delay failed\n"); goto power_off; } ret = tc9563_pwrctrl_set_l0s_l1_entry_delay(ctx, i, true, cfg->l1_delay); if (ret) { - dev_err(ctx->pwrctrl.dev, "Setting L1 entry delay failed\n"); + dev_err(dev, "Setting L1 entry delay failed\n"); goto power_off; } ret = tc9563_pwrctrl_set_tx_amplitude(ctx, i); if (ret) { - dev_err(ctx->pwrctrl.dev, "Setting Tx amplitude failed\n"); + dev_err(dev, "Setting Tx amplitude failed\n"); goto power_off; } ret = tc9563_pwrctrl_set_nfts(ctx, i); if (ret) { - dev_err(ctx->pwrctrl.dev, "Setting N_FTS failed\n"); + dev_err(dev, "Setting N_FTS failed\n"); goto power_off; } ret = tc9563_pwrctrl_disable_dfe(ctx, i); if (ret) { - dev_err(ctx->pwrctrl.dev, "Disabling DFE failed\n"); + dev_err(dev, "Disabling DFE failed\n"); goto power_off; } } @@ -525,6 +526,7 @@ static int tc9563_pwrctrl_bring_up(struct tc9563_pwrctrl_ctx *ctx) static int tc9563_pwrctrl_probe(struct platform_device *pdev) { struct pci_host_bridge *bridge = to_pci_host_bridge(pdev->dev.parent); + struct device_node *node = pdev->dev.of_node; struct pci_bus *bus = bridge->bus; struct device *dev = &pdev->dev; enum tc9563_pwrctrl_ports port; @@ -536,7 +538,7 @@ static int tc9563_pwrctrl_probe(struct platform_device *pdev) if (!ctx) return -ENOMEM; - ret = of_property_read_u32_index(pdev->dev.of_node, "i2c-parent", 1, &addr); + ret = of_property_read_u32_index(node, "i2c-parent", 1, &addr); if (ret) return dev_err_probe(dev, ret, "Failed to read i2c-parent property\n"); @@ -572,7 +574,7 @@ static int tc9563_pwrctrl_probe(struct platform_device *pdev) pci_pwrctrl_init(&ctx->pwrctrl, dev); port = TC9563_USP; - ret = tc9563_pwrctrl_parse_device_dt(ctx, pdev->dev.of_node, port); + ret = tc9563_pwrctrl_parse_device_dt(ctx, node, port); if (ret) { dev_err(dev, "failed to parse device tree properties: %d\n", ret); goto remove_i2c; @@ -583,7 +585,7 @@ static int tc9563_pwrctrl_probe(struct platform_device *pdev) * The first node represents DSP1, the second node represents DSP2, * and so on. */ - for_each_child_of_node_scoped(pdev->dev.of_node, child) { + for_each_child_of_node_scoped(node, child) { port++; ret = tc9563_pwrctrl_parse_device_dt(ctx, child, port); if (ret) From c7a7b7103bcb4b810cef2b62bc35e3632151c896 Mon Sep 17 00:00:00 2001 From: Bjorn Helgaas Date: Thu, 15 Jan 2026 12:58:58 +0530 Subject: [PATCH 1311/1321] FROMLIST: PCI/pwrctrl: tc9563: Rename private struct and pointers for consistency Previously the pwrseq, tc9563, and slot pwrctrl drivers used different naming conventions for their private data structs and pointers to them, which makes patches hard to read: Previous names New names ------------------------------------ ---------------------------------- struct pci_pwrctrl_pwrseq_data { struct pci_pwrctrl_pwrseq { struct pci_pwrctrl ctx; struct pci_pwrctrl pwrctrl; struct pci_pwrctrl_pwrseq_data *data struct pci_pwrctrl_pwrseq *pwrseq struct tc9563_pwrctrl_ctx { struct pci_pwrctrl_tc9563 { struct tc9563_pwrctrl_ctx *ctx struct pci_pwrctrl_tc9563 *tc9563 struct pci_pwrctrl_slot_data { struct pci_pwrctrl_slot { struct pci_pwrctrl ctx; struct pci_pwrctrl pwrctrl; struct pci_pwrctrl_slot_data *slot struct pci_pwrctrl_slot *slot Rename "struct tc9563_pwrctrl_ctx" to "pci_pwrctrl_tc9563". Move the struct pci_pwrctrl member to be the first element in struct pci_pwrctrl_tc9563, as it is in the other drivers. Rename pointers from "struct tc9563_pwrctrl_ctx *ctx" to "struct pci_pwrctrl_tc9563 *tc9563". No functional change intended. Signed-off-by: Bjorn Helgaas Signed-off-by: Manivannan Sadhasivam Reviewed-by: Bartosz Golaszewski Link: https://lore.kernel.org/r/20260115-pci-pwrctrl-rework-v5-6-9d26da3ce903@oss.qualcomm.com --- drivers/pci/pwrctrl/pci-pwrctrl-tc9563.c | 143 ++++++++++++----------- 1 file changed, 72 insertions(+), 71 deletions(-) diff --git a/drivers/pci/pwrctrl/pci-pwrctrl-tc9563.c b/drivers/pci/pwrctrl/pci-pwrctrl-tc9563.c index 90480e35e968f..8ae27abdf3620 100644 --- a/drivers/pci/pwrctrl/pci-pwrctrl-tc9563.c +++ b/drivers/pci/pwrctrl/pci-pwrctrl-tc9563.c @@ -105,13 +105,13 @@ static const char *const tc9563_supply_names[TC9563_PWRCTL_MAX_SUPPLY] = { "vddio18", }; -struct tc9563_pwrctrl_ctx { +struct pci_pwrctrl_tc9563 { + struct pci_pwrctrl pwrctrl; struct regulator_bulk_data supplies[TC9563_PWRCTL_MAX_SUPPLY]; struct tc9563_pwrctrl_cfg cfg[TC9563_MAX]; struct gpio_desc *reset_gpio; struct i2c_adapter *adapter; struct i2c_client *client; - struct pci_pwrctrl pwrctrl; }; /* @@ -231,10 +231,10 @@ static int tc9563_pwrctrl_i2c_bulk_write(struct i2c_client *client, return 0; } -static int tc9563_pwrctrl_disable_port(struct tc9563_pwrctrl_ctx *ctx, +static int tc9563_pwrctrl_disable_port(struct pci_pwrctrl_tc9563 *tc9563, enum tc9563_pwrctrl_ports port) { - struct tc9563_pwrctrl_cfg *cfg = &ctx->cfg[port]; + struct tc9563_pwrctrl_cfg *cfg = &tc9563->cfg[port]; const struct tc9563_pwrctrl_reg_setting *seq; int ret, len; @@ -249,15 +249,15 @@ static int tc9563_pwrctrl_disable_port(struct tc9563_pwrctrl_ctx *ctx, len = ARRAY_SIZE(dsp2_pwroff_seq); } - ret = tc9563_pwrctrl_i2c_bulk_write(ctx->client, seq, len); + ret = tc9563_pwrctrl_i2c_bulk_write(tc9563->client, seq, len); if (ret) return ret; - return tc9563_pwrctrl_i2c_bulk_write(ctx->client, common_pwroff_seq, + return tc9563_pwrctrl_i2c_bulk_write(tc9563->client, common_pwroff_seq, ARRAY_SIZE(common_pwroff_seq)); } -static int tc9563_pwrctrl_set_l0s_l1_entry_delay(struct tc9563_pwrctrl_ctx *ctx, +static int tc9563_pwrctrl_set_l0s_l1_entry_delay(struct pci_pwrctrl_tc9563 *tc9563, enum tc9563_pwrctrl_ports port, bool is_l1, u32 ns) { @@ -271,7 +271,7 @@ static int tc9563_pwrctrl_set_l0s_l1_entry_delay(struct tc9563_pwrctrl_ctx *ctx, units = ns / TC9563_L0S_L1_DELAY_UNIT_NS; if (port == TC9563_ETHERNET) { - ret = tc9563_pwrctrl_i2c_read(ctx->client, + ret = tc9563_pwrctrl_i2c_read(tc9563->client, TC9563_EMBEDDED_ETH_DELAY, &rd_val); if (ret) @@ -284,25 +284,25 @@ static int tc9563_pwrctrl_set_l0s_l1_entry_delay(struct tc9563_pwrctrl_ctx *ctx, rd_val = u32_replace_bits(rd_val, units, TC9563_ETH_L0S_DELAY_MASK); - return tc9563_pwrctrl_i2c_write(ctx->client, + return tc9563_pwrctrl_i2c_write(tc9563->client, TC9563_EMBEDDED_ETH_DELAY, rd_val); } - ret = tc9563_pwrctrl_i2c_write(ctx->client, TC9563_PORT_SELECT, + ret = tc9563_pwrctrl_i2c_write(tc9563->client, TC9563_PORT_SELECT, BIT(port)); if (ret) return ret; - return tc9563_pwrctrl_i2c_write(ctx->client, + return tc9563_pwrctrl_i2c_write(tc9563->client, is_l1 ? TC9563_PORT_L1_DELAY : TC9563_PORT_L0S_DELAY, units); } -static int tc9563_pwrctrl_set_tx_amplitude(struct tc9563_pwrctrl_ctx *ctx, +static int tc9563_pwrctrl_set_tx_amplitude(struct pci_pwrctrl_tc9563 *tc9563, enum tc9563_pwrctrl_ports port) { - u32 amp = ctx->cfg[port].tx_amp; + u32 amp = tc9563->cfg[port].tx_amp; int port_access; if (amp < TC9563_TX_MARGIN_MIN_UA) @@ -331,14 +331,14 @@ static int tc9563_pwrctrl_set_tx_amplitude(struct tc9563_pwrctrl_ctx *ctx, {TC9563_TX_MARGIN, amp}, }; - return tc9563_pwrctrl_i2c_bulk_write(ctx->client, tx_amp_seq, + return tc9563_pwrctrl_i2c_bulk_write(tc9563->client, tx_amp_seq, ARRAY_SIZE(tx_amp_seq)); } -static int tc9563_pwrctrl_disable_dfe(struct tc9563_pwrctrl_ctx *ctx, +static int tc9563_pwrctrl_disable_dfe(struct pci_pwrctrl_tc9563 *tc9563, enum tc9563_pwrctrl_ports port) { - struct tc9563_pwrctrl_cfg *cfg = &ctx->cfg[port]; + struct tc9563_pwrctrl_cfg *cfg = &tc9563->cfg[port]; int port_access, lane_access = 0x3; u32 phy_rate = 0x21; @@ -375,14 +375,14 @@ static int tc9563_pwrctrl_disable_dfe(struct tc9563_pwrctrl_ctx *ctx, {TC9563_PHY_RATE_CHANGE_OVERRIDE, 0x0}, }; - return tc9563_pwrctrl_i2c_bulk_write(ctx->client, disable_dfe_seq, + return tc9563_pwrctrl_i2c_bulk_write(tc9563->client, disable_dfe_seq, ARRAY_SIZE(disable_dfe_seq)); } -static int tc9563_pwrctrl_set_nfts(struct tc9563_pwrctrl_ctx *ctx, +static int tc9563_pwrctrl_set_nfts(struct pci_pwrctrl_tc9563 *tc9563, enum tc9563_pwrctrl_ports port) { - u8 *nfts = ctx->cfg[port].nfts; + u8 *nfts = tc9563->cfg[port].nfts; struct tc9563_pwrctrl_reg_setting nfts_seq[] = { {TC9563_NFTS_2_5_GT, nfts[0]}, {TC9563_NFTS_5_GT, nfts[1]}, @@ -392,35 +392,35 @@ static int tc9563_pwrctrl_set_nfts(struct tc9563_pwrctrl_ctx *ctx, if (!nfts[0]) return 0; - ret = tc9563_pwrctrl_i2c_write(ctx->client, TC9563_PORT_SELECT, + ret = tc9563_pwrctrl_i2c_write(tc9563->client, TC9563_PORT_SELECT, BIT(port)); if (ret) return ret; - return tc9563_pwrctrl_i2c_bulk_write(ctx->client, nfts_seq, + return tc9563_pwrctrl_i2c_bulk_write(tc9563->client, nfts_seq, ARRAY_SIZE(nfts_seq)); } -static int tc9563_pwrctrl_assert_deassert_reset(struct tc9563_pwrctrl_ctx *ctx, +static int tc9563_pwrctrl_assert_deassert_reset(struct pci_pwrctrl_tc9563 *tc9563, bool deassert) { int ret, val; - ret = tc9563_pwrctrl_i2c_write(ctx->client, TC9563_GPIO_CONFIG, + ret = tc9563_pwrctrl_i2c_write(tc9563->client, TC9563_GPIO_CONFIG, TC9563_GPIO_MASK); if (ret) return ret; val = deassert ? TC9563_GPIO_DEASSERT_BITS : 0; - return tc9563_pwrctrl_i2c_write(ctx->client, TC9563_RESET_GPIO, val); + return tc9563_pwrctrl_i2c_write(tc9563->client, TC9563_RESET_GPIO, val); } -static int tc9563_pwrctrl_parse_device_dt(struct tc9563_pwrctrl_ctx *ctx, +static int tc9563_pwrctrl_parse_device_dt(struct pci_pwrctrl_tc9563 *tc9563, struct device_node *node, enum tc9563_pwrctrl_ports port) { - struct tc9563_pwrctrl_cfg *cfg = &ctx->cfg[port]; + struct tc9563_pwrctrl_cfg *cfg = &tc9563->cfg[port]; int ret; /* Disable port if the status of the port is disabled. */ @@ -450,76 +450,77 @@ static int tc9563_pwrctrl_parse_device_dt(struct tc9563_pwrctrl_ctx *ctx, return 0; } -static void tc9563_pwrctrl_power_off(struct tc9563_pwrctrl_ctx *ctx) +static void tc9563_pwrctrl_power_off(struct pci_pwrctrl_tc9563 *tc9563) { - gpiod_set_value(ctx->reset_gpio, 1); + gpiod_set_value(tc9563->reset_gpio, 1); - regulator_bulk_disable(ARRAY_SIZE(ctx->supplies), ctx->supplies); + regulator_bulk_disable(ARRAY_SIZE(tc9563->supplies), tc9563->supplies); } -static int tc9563_pwrctrl_bring_up(struct tc9563_pwrctrl_ctx *ctx) +static int tc9563_pwrctrl_bring_up(struct pci_pwrctrl_tc9563 *tc9563) { - struct device *dev = ctx->pwrctrl.dev; + struct device *dev = tc9563->pwrctrl.dev; struct tc9563_pwrctrl_cfg *cfg; int ret, i; - ret = regulator_bulk_enable(ARRAY_SIZE(ctx->supplies), ctx->supplies); + ret = regulator_bulk_enable(ARRAY_SIZE(tc9563->supplies), + tc9563->supplies); if (ret < 0) return dev_err_probe(dev, ret, "cannot enable regulators\n"); - gpiod_set_value(ctx->reset_gpio, 0); + gpiod_set_value(tc9563->reset_gpio, 0); fsleep(TC9563_OSC_STAB_DELAY_US); - ret = tc9563_pwrctrl_assert_deassert_reset(ctx, false); + ret = tc9563_pwrctrl_assert_deassert_reset(tc9563, false); if (ret) goto power_off; for (i = 0; i < TC9563_MAX; i++) { - cfg = &ctx->cfg[i]; - ret = tc9563_pwrctrl_disable_port(ctx, i); + cfg = &tc9563->cfg[i]; + ret = tc9563_pwrctrl_disable_port(tc9563, i); if (ret) { dev_err(dev, "Disabling port failed\n"); goto power_off; } - ret = tc9563_pwrctrl_set_l0s_l1_entry_delay(ctx, i, false, cfg->l0s_delay); + ret = tc9563_pwrctrl_set_l0s_l1_entry_delay(tc9563, i, false, cfg->l0s_delay); if (ret) { dev_err(dev, "Setting L0s entry delay failed\n"); goto power_off; } - ret = tc9563_pwrctrl_set_l0s_l1_entry_delay(ctx, i, true, cfg->l1_delay); + ret = tc9563_pwrctrl_set_l0s_l1_entry_delay(tc9563, i, true, cfg->l1_delay); if (ret) { dev_err(dev, "Setting L1 entry delay failed\n"); goto power_off; } - ret = tc9563_pwrctrl_set_tx_amplitude(ctx, i); + ret = tc9563_pwrctrl_set_tx_amplitude(tc9563, i); if (ret) { dev_err(dev, "Setting Tx amplitude failed\n"); goto power_off; } - ret = tc9563_pwrctrl_set_nfts(ctx, i); + ret = tc9563_pwrctrl_set_nfts(tc9563, i); if (ret) { dev_err(dev, "Setting N_FTS failed\n"); goto power_off; } - ret = tc9563_pwrctrl_disable_dfe(ctx, i); + ret = tc9563_pwrctrl_disable_dfe(tc9563, i); if (ret) { dev_err(dev, "Disabling DFE failed\n"); goto power_off; } } - ret = tc9563_pwrctrl_assert_deassert_reset(ctx, true); + ret = tc9563_pwrctrl_assert_deassert_reset(tc9563, true); if (!ret) return 0; power_off: - tc9563_pwrctrl_power_off(ctx); + tc9563_pwrctrl_power_off(tc9563); return ret; } @@ -530,12 +531,12 @@ static int tc9563_pwrctrl_probe(struct platform_device *pdev) struct pci_bus *bus = bridge->bus; struct device *dev = &pdev->dev; enum tc9563_pwrctrl_ports port; - struct tc9563_pwrctrl_ctx *ctx; + struct pci_pwrctrl_tc9563 *tc9563; struct device_node *i2c_node; int ret, addr; - ctx = devm_kzalloc(dev, sizeof(*ctx), GFP_KERNEL); - if (!ctx) + tc9563 = devm_kzalloc(dev, sizeof(*tc9563), GFP_KERNEL); + if (!tc9563) return -ENOMEM; ret = of_property_read_u32_index(node, "i2c-parent", 1, &addr); @@ -543,38 +544,38 @@ static int tc9563_pwrctrl_probe(struct platform_device *pdev) return dev_err_probe(dev, ret, "Failed to read i2c-parent property\n"); i2c_node = of_parse_phandle(dev->of_node, "i2c-parent", 0); - ctx->adapter = of_find_i2c_adapter_by_node(i2c_node); + tc9563->adapter = of_find_i2c_adapter_by_node(i2c_node); of_node_put(i2c_node); - if (!ctx->adapter) + if (!tc9563->adapter) return dev_err_probe(dev, -EPROBE_DEFER, "Failed to find I2C adapter\n"); - ctx->client = i2c_new_dummy_device(ctx->adapter, addr); - if (IS_ERR(ctx->client)) { + tc9563->client = i2c_new_dummy_device(tc9563->adapter, addr); + if (IS_ERR(tc9563->client)) { dev_err(dev, "Failed to create I2C client\n"); - put_device(&ctx->adapter->dev); - return PTR_ERR(ctx->client); + put_device(&tc9563->adapter->dev); + return PTR_ERR(tc9563->client); } for (int i = 0; i < ARRAY_SIZE(tc9563_supply_names); i++) - ctx->supplies[i].supply = tc9563_supply_names[i]; + tc9563->supplies[i].supply = tc9563_supply_names[i]; ret = devm_regulator_bulk_get(dev, TC9563_PWRCTL_MAX_SUPPLY, - ctx->supplies); + tc9563->supplies); if (ret) { dev_err_probe(dev, ret, "failed to get supply regulator\n"); goto remove_i2c; } - ctx->reset_gpio = devm_gpiod_get(dev, "resx", GPIOD_OUT_HIGH); - if (IS_ERR(ctx->reset_gpio)) { - ret = dev_err_probe(dev, PTR_ERR(ctx->reset_gpio), "failed to get resx GPIO\n"); + tc9563->reset_gpio = devm_gpiod_get(dev, "resx", GPIOD_OUT_HIGH); + if (IS_ERR(tc9563->reset_gpio)) { + ret = dev_err_probe(dev, PTR_ERR(tc9563->reset_gpio), "failed to get resx GPIO\n"); goto remove_i2c; } - pci_pwrctrl_init(&ctx->pwrctrl, dev); + pci_pwrctrl_init(&tc9563->pwrctrl, dev); port = TC9563_USP; - ret = tc9563_pwrctrl_parse_device_dt(ctx, node, port); + ret = tc9563_pwrctrl_parse_device_dt(tc9563, node, port); if (ret) { dev_err(dev, "failed to parse device tree properties: %d\n", ret); goto remove_i2c; @@ -587,14 +588,14 @@ static int tc9563_pwrctrl_probe(struct platform_device *pdev) */ for_each_child_of_node_scoped(node, child) { port++; - ret = tc9563_pwrctrl_parse_device_dt(ctx, child, port); + ret = tc9563_pwrctrl_parse_device_dt(tc9563, child, port); if (ret) break; /* Embedded ethernet device are under DSP3 */ if (port == TC9563_DSP3) { for_each_child_of_node_scoped(child, child1) { port++; - ret = tc9563_pwrctrl_parse_device_dt(ctx, + ret = tc9563_pwrctrl_parse_device_dt(tc9563, child1, port); if (ret) break; @@ -612,7 +613,7 @@ static int tc9563_pwrctrl_probe(struct platform_device *pdev) goto remove_i2c; } - ret = tc9563_pwrctrl_bring_up(ctx); + ret = tc9563_pwrctrl_bring_up(tc9563); if (ret) goto remove_i2c; @@ -622,29 +623,29 @@ static int tc9563_pwrctrl_probe(struct platform_device *pdev) goto power_off; } - ret = devm_pci_pwrctrl_device_set_ready(dev, &ctx->pwrctrl); + ret = devm_pci_pwrctrl_device_set_ready(dev, &tc9563->pwrctrl); if (ret) goto power_off; - platform_set_drvdata(pdev, ctx); + platform_set_drvdata(pdev, tc9563); return 0; power_off: - tc9563_pwrctrl_power_off(ctx); + tc9563_pwrctrl_power_off(tc9563); remove_i2c: - i2c_unregister_device(ctx->client); - put_device(&ctx->adapter->dev); + i2c_unregister_device(tc9563->client); + put_device(&tc9563->adapter->dev); return ret; } static void tc9563_pwrctrl_remove(struct platform_device *pdev) { - struct tc9563_pwrctrl_ctx *ctx = platform_get_drvdata(pdev); + struct pci_pwrctrl_tc9563 *tc9563 = platform_get_drvdata(pdev); - tc9563_pwrctrl_power_off(ctx); - i2c_unregister_device(ctx->client); - put_device(&ctx->adapter->dev); + tc9563_pwrctrl_power_off(tc9563); + i2c_unregister_device(tc9563->client); + put_device(&tc9563->adapter->dev); } static const struct of_device_id tc9563_pwrctrl_of_match[] = { From e86eedf0878677666d4bf60bfc2d36962c294709 Mon Sep 17 00:00:00 2001 From: Manivannan Sadhasivam Date: Thu, 15 Jan 2026 12:58:59 +0530 Subject: [PATCH 1312/1321] FROMLIST: PCI/pwrctrl: slot: Factor out power on/off code to helpers In order to allow the pwrctrl core to control the power on/off logic of the pwrctrl slot driver, move the power on/off code to pci_pwrctrl_slot_power_{off/on} helper functions. Signed-off-by: Manivannan Sadhasivam Reviewed-by: Bartosz Golaszewski Link: https://lore.kernel.org/r/20260115-pci-pwrctrl-rework-v5-7-9d26da3ce903@oss.qualcomm.com --- drivers/pci/pwrctrl/slot.c | 49 +++++++++++++++++++++++++++----------- 1 file changed, 35 insertions(+), 14 deletions(-) diff --git a/drivers/pci/pwrctrl/slot.c b/drivers/pci/pwrctrl/slot.c index 5ddae4ae34315..5d0ec880c0ecf 100644 --- a/drivers/pci/pwrctrl/slot.c +++ b/drivers/pci/pwrctrl/slot.c @@ -17,13 +17,40 @@ struct pci_pwrctrl_slot { struct pci_pwrctrl pwrctrl; struct regulator_bulk_data *supplies; int num_supplies; + struct clk *clk; }; -static void devm_pci_pwrctrl_slot_power_off(void *data) +static int pci_pwrctrl_slot_power_on(struct pci_pwrctrl *pwrctrl) { - struct pci_pwrctrl_slot *slot = data; + struct pci_pwrctrl_slot *slot = container_of(pwrctrl, + struct pci_pwrctrl_slot, pwrctrl); + int ret; + + ret = regulator_bulk_enable(slot->num_supplies, slot->supplies); + if (ret < 0) { + dev_err(slot->pwrctrl.dev, "Failed to enable slot regulators\n"); + return ret; + } + + return clk_prepare_enable(slot->clk); +} + +static int pci_pwrctrl_slot_power_off(struct pci_pwrctrl *pwrctrl) +{ + struct pci_pwrctrl_slot *slot = container_of(pwrctrl, + struct pci_pwrctrl_slot, pwrctrl); regulator_bulk_disable(slot->num_supplies, slot->supplies); + clk_disable_unprepare(slot->clk); + + return 0; +} + +static void devm_pci_pwrctrl_slot_release(void *data) +{ + struct pci_pwrctrl_slot *slot = data; + + pci_pwrctrl_slot_power_off(&slot->pwrctrl); regulator_bulk_free(slot->num_supplies, slot->supplies); } @@ -31,7 +58,6 @@ static int pci_pwrctrl_slot_probe(struct platform_device *pdev) { struct pci_pwrctrl_slot *slot; struct device *dev = &pdev->dev; - struct clk *clk; int ret; slot = devm_kzalloc(dev, sizeof(*slot), GFP_KERNEL); @@ -46,23 +72,18 @@ static int pci_pwrctrl_slot_probe(struct platform_device *pdev) } slot->num_supplies = ret; - ret = regulator_bulk_enable(slot->num_supplies, slot->supplies); - if (ret < 0) { - dev_err_probe(dev, ret, "Failed to enable slot regulators\n"); - regulator_bulk_free(slot->num_supplies, slot->supplies); - return ret; - } - ret = devm_add_action_or_reset(dev, devm_pci_pwrctrl_slot_power_off, + ret = devm_add_action_or_reset(dev, devm_pci_pwrctrl_slot_release, slot); if (ret) return ret; - clk = devm_clk_get_optional_enabled(dev, NULL); - if (IS_ERR(clk)) { - return dev_err_probe(dev, PTR_ERR(clk), + slot->clk = devm_clk_get_optional(dev, NULL); + if (IS_ERR(slot->clk)) + return dev_err_probe(dev, PTR_ERR(slot->clk), "Failed to enable slot clock\n"); - } + + pci_pwrctrl_slot_power_on(&slot->pwrctrl); pci_pwrctrl_init(&slot->pwrctrl, dev); From a05e67554aee0cf0afbcbf938be06c7da92371e0 Mon Sep 17 00:00:00 2001 From: Manivannan Sadhasivam Date: Thu, 15 Jan 2026 12:59:00 +0530 Subject: [PATCH 1313/1321] FROMLIST: PCI/pwrctrl: pwrseq: Factor out power on/off code to helpers In order to allow the pwrctrl core to control the power on/off logic of the pwrctrl pwrseq driver, move the power on/off code to pci_pwrctrl_pwrseq_power_{off/on} helper functions. Signed-off-by: Manivannan Sadhasivam Reviewed-by: Bartosz Golaszewski Link: https://lore.kernel.org/r/20260115-pci-pwrctrl-rework-v5-8-9d26da3ce903@oss.qualcomm.com --- drivers/pci/pwrctrl/pci-pwrctrl-pwrseq.c | 24 ++++++++++++++++++++---- 1 file changed, 20 insertions(+), 4 deletions(-) diff --git a/drivers/pci/pwrctrl/pci-pwrctrl-pwrseq.c b/drivers/pci/pwrctrl/pci-pwrctrl-pwrseq.c index c0d22dc3a8561..d2c261b09030f 100644 --- a/drivers/pci/pwrctrl/pci-pwrctrl-pwrseq.c +++ b/drivers/pci/pwrctrl/pci-pwrctrl-pwrseq.c @@ -52,11 +52,27 @@ static const struct pci_pwrctrl_pwrseq_pdata pci_pwrctrl_pwrseq_qcom_wcn_pdata = .validate_device = pci_pwrctrl_pwrseq_qcm_wcn_validate_device, }; +static int pci_pwrctrl_pwrseq_power_on(struct pci_pwrctrl *pwrctrl) +{ + struct pci_pwrctrl_pwrseq *pwrseq = container_of(pwrctrl, + struct pci_pwrctrl_pwrseq, pwrctrl); + + return pwrseq_power_on(pwrseq->pwrseq); +} + +static int pci_pwrctrl_pwrseq_power_off(struct pci_pwrctrl *pwrctrl) +{ + struct pci_pwrctrl_pwrseq *pwrseq = container_of(pwrctrl, + struct pci_pwrctrl_pwrseq, pwrctrl); + + return pwrseq_power_off(pwrseq->pwrseq); +} + static void devm_pci_pwrctrl_pwrseq_power_off(void *data) { - struct pwrseq_desc *pwrseq = data; + struct pci_pwrctrl_pwrseq *pwrseq = data; - pwrseq_power_off(pwrseq); + pci_pwrctrl_pwrseq_power_off(&pwrseq->pwrctrl); } static int pci_pwrctrl_pwrseq_probe(struct platform_device *pdev) @@ -85,13 +101,13 @@ static int pci_pwrctrl_pwrseq_probe(struct platform_device *pdev) return dev_err_probe(dev, PTR_ERR(pwrseq->pwrseq), "Failed to get the power sequencer\n"); - ret = pwrseq_power_on(pwrseq->pwrseq); + ret = pci_pwrctrl_pwrseq_power_on(&pwrseq->pwrctrl); if (ret) return dev_err_probe(dev, ret, "Failed to power-on the device\n"); ret = devm_add_action_or_reset(dev, devm_pci_pwrctrl_pwrseq_power_off, - pwrseq->pwrseq); + pwrseq); if (ret) return ret; From 09bd306bc811998e3917161ac36e01fb9f339fed Mon Sep 17 00:00:00 2001 From: Manivannan Sadhasivam Date: Thu, 15 Jan 2026 12:59:01 +0530 Subject: [PATCH 1314/1321] FROMLIST: PCI/pwrctrl: Add 'struct pci_pwrctrl::power_{on/off}' callbacks To allow the pwrctrl core to control the power on/off sequences of the pwrctrl drivers, add the 'struct pci_pwrctrl::power_{on/off}' callbacks and populate them in the respective pwrctrl drivers. The pwrctrl drivers still power on the resources on their own now. So there is no functional change. Co-developed-by: Krishna Chaitanya Chundru Signed-off-by: Manivannan Sadhasivam Signed-off-by: Bjorn Helgaas Tested-by: Chen-Yu Tsai Reviewed-by: Bartosz Golaszewski Link: https://lore.kernel.org/r/20260115-pci-pwrctrl-rework-v5-9-9d26da3ce903@oss.qualcomm.com Signed-off-by: Krishna Chaitanya Chundru --- drivers/pci/pwrctrl/pci-pwrctrl-pwrseq.c | 3 +++ drivers/pci/pwrctrl/pci-pwrctrl-tc9563.c | 22 ++++++++++++++++------ drivers/pci/pwrctrl/slot.c | 3 +++ include/linux/pci-pwrctrl.h | 4 ++++ 4 files changed, 26 insertions(+), 6 deletions(-) diff --git a/drivers/pci/pwrctrl/pci-pwrctrl-pwrseq.c b/drivers/pci/pwrctrl/pci-pwrctrl-pwrseq.c index d2c261b09030f..2ee02edd55a31 100644 --- a/drivers/pci/pwrctrl/pci-pwrctrl-pwrseq.c +++ b/drivers/pci/pwrctrl/pci-pwrctrl-pwrseq.c @@ -111,6 +111,9 @@ static int pci_pwrctrl_pwrseq_probe(struct platform_device *pdev) if (ret) return ret; + pwrseq->pwrctrl.power_on = pci_pwrctrl_pwrseq_power_on; + pwrseq->pwrctrl.power_off = pci_pwrctrl_pwrseq_power_off; + pci_pwrctrl_init(&pwrseq->pwrctrl, dev); ret = devm_pci_pwrctrl_device_set_ready(dev, &pwrseq->pwrctrl); diff --git a/drivers/pci/pwrctrl/pci-pwrctrl-tc9563.c b/drivers/pci/pwrctrl/pci-pwrctrl-tc9563.c index 8ae27abdf3620..a71d7ef2d4b89 100644 --- a/drivers/pci/pwrctrl/pci-pwrctrl-tc9563.c +++ b/drivers/pci/pwrctrl/pci-pwrctrl-tc9563.c @@ -450,15 +450,22 @@ static int tc9563_pwrctrl_parse_device_dt(struct pci_pwrctrl_tc9563 *tc9563, return 0; } -static void tc9563_pwrctrl_power_off(struct pci_pwrctrl_tc9563 *tc9563) +static int tc9563_pwrctrl_power_off(struct pci_pwrctrl *pwrctrl) { + struct pci_pwrctrl_tc9563 *tc9563 = container_of(pwrctrl, + struct pci_pwrctrl_tc9563, pwrctrl); + gpiod_set_value(tc9563->reset_gpio, 1); regulator_bulk_disable(ARRAY_SIZE(tc9563->supplies), tc9563->supplies); + + return 0; } -static int tc9563_pwrctrl_bring_up(struct pci_pwrctrl_tc9563 *tc9563) +static int tc9563_pwrctrl_power_on(struct pci_pwrctrl *pwrctrl) { + struct pci_pwrctrl_tc9563 *tc9563 = container_of(pwrctrl, + struct pci_pwrctrl_tc9563, pwrctrl); struct device *dev = tc9563->pwrctrl.dev; struct tc9563_pwrctrl_cfg *cfg; int ret, i; @@ -520,7 +527,7 @@ static int tc9563_pwrctrl_bring_up(struct pci_pwrctrl_tc9563 *tc9563) return 0; power_off: - tc9563_pwrctrl_power_off(tc9563); + tc9563_pwrctrl_power_off(&tc9563->pwrctrl); return ret; } @@ -613,7 +620,7 @@ static int tc9563_pwrctrl_probe(struct platform_device *pdev) goto remove_i2c; } - ret = tc9563_pwrctrl_bring_up(tc9563); + ret = tc9563_pwrctrl_power_on(&tc9563->pwrctrl); if (ret) goto remove_i2c; @@ -623,6 +630,9 @@ static int tc9563_pwrctrl_probe(struct platform_device *pdev) goto power_off; } + tc9563->pwrctrl.power_on = tc9563_pwrctrl_power_on; + tc9563->pwrctrl.power_off = tc9563_pwrctrl_power_off; + ret = devm_pci_pwrctrl_device_set_ready(dev, &tc9563->pwrctrl); if (ret) goto power_off; @@ -632,7 +642,7 @@ static int tc9563_pwrctrl_probe(struct platform_device *pdev) return 0; power_off: - tc9563_pwrctrl_power_off(tc9563); + tc9563_pwrctrl_power_off(&tc9563->pwrctrl); remove_i2c: i2c_unregister_device(tc9563->client); put_device(&tc9563->adapter->dev); @@ -643,7 +653,7 @@ static void tc9563_pwrctrl_remove(struct platform_device *pdev) { struct pci_pwrctrl_tc9563 *tc9563 = platform_get_drvdata(pdev); - tc9563_pwrctrl_power_off(tc9563); + tc9563_pwrctrl_power_off(&tc9563->pwrctrl); i2c_unregister_device(tc9563->client); put_device(&tc9563->adapter->dev); } diff --git a/drivers/pci/pwrctrl/slot.c b/drivers/pci/pwrctrl/slot.c index 5d0ec880c0ecf..55828aec24862 100644 --- a/drivers/pci/pwrctrl/slot.c +++ b/drivers/pci/pwrctrl/slot.c @@ -85,6 +85,9 @@ static int pci_pwrctrl_slot_probe(struct platform_device *pdev) pci_pwrctrl_slot_power_on(&slot->pwrctrl); + slot->pwrctrl.power_on = pci_pwrctrl_slot_power_on; + slot->pwrctrl.power_off = pci_pwrctrl_slot_power_off; + pci_pwrctrl_init(&slot->pwrctrl, dev); ret = devm_pci_pwrctrl_device_set_ready(dev, &slot->pwrctrl); diff --git a/include/linux/pci-pwrctrl.h b/include/linux/pci-pwrctrl.h index 4aefc7901cd18..435b822c841ea 100644 --- a/include/linux/pci-pwrctrl.h +++ b/include/linux/pci-pwrctrl.h @@ -31,6 +31,8 @@ struct device_link; /** * struct pci_pwrctrl - PCI device power control context. * @dev: Address of the power controlling device. + * @power_on: Callback to power on the power controlling device. + * @power_off: Callback to power off the power controlling device. * * An object of this type must be allocated by the PCI power control device and * passed to the pwrctrl subsystem to trigger a bus rescan and setup a device @@ -38,6 +40,8 @@ struct device_link; */ struct pci_pwrctrl { struct device *dev; + int (*power_on)(struct pci_pwrctrl *pwrctrl); + int (*power_off)(struct pci_pwrctrl *pwrctrl); /* private: internal use only */ struct notifier_block nb; From 275f27fe8f7db696d6ccd22e514818205d2d4b0d Mon Sep 17 00:00:00 2001 From: Krishna Chaitanya Chundru Date: Thu, 15 Jan 2026 12:59:02 +0530 Subject: [PATCH 1315/1321] FROMLIST: PCI/pwrctrl: Add APIs to create, destroy pwrctrl devices Previously, the PCI core created pwrctrl devices during pci_scan_device() on its own and then skipped enumeration of those devices, hoping the pwrctrl driver would power them on and trigger a bus rescan. This approach works for endpoint devices directly connected to Root Ports, but it fails for PCIe switches acting as bus extenders. When the switch requires pwrctrl support and the pwrctrl driver is not available during the pwrctrl device creation, its enumeration will be skipped during the initial PCI bus scan. This premature scan leads the PCI core to allocate resources (bridge windows, bus numbers) for the upstream bridge based on available downstream buses at scan time. For non-hotplug capable bridges, PCI core typically allocates resources based on the number of buses available during the initial bus scan, which happens to be just one if the switch is not powered on and enumerated at that time. When the switch gets enumerated later on, it will fail due to the lack of upstream resources. As a result, a PCIe switch powered on by the pwrctrl driver cannot be reliably enumerated currently. Either the switch has to be enabled in the bootloader or the switch pwrctrl driver has to be loaded during the pwrctrl device creation time to work around these issues. Introduce new APIs to explicitly create and destroy pwrctrl devices from controller drivers by recursively scanning the PCI child nodes of the controller. These APIs allow creating pwrctrl devices based on the original criteria and are intended to be called during controller probe and removal. These APIs, together with the upcoming APIs for power on/off will allow the controller drivers to power on all the devices before starting the initial bus scan, thereby solving the resource allocation issue. Reviewed-by: Bartosz Golaszewski Signed-off-by: Manivannan Sadhasivam Signed-off-by: Bjorn Helgaas Tested-by: Chen-Yu Tsai Link: https://lore.kernel.org/r/20260115-pci-pwrctrl-rework-v5-10-9d26da3ce903@oss.qualcomm.com Signed-off-by: Krishna Chaitanya Chundru --- drivers/pci/of.c | 1 + drivers/pci/pwrctrl/core.c | 114 ++++++++++++++++++++++++++++++++++++ include/linux/pci-pwrctrl.h | 8 ++- 3 files changed, 122 insertions(+), 1 deletion(-) diff --git a/drivers/pci/of.c b/drivers/pci/of.c index 3579265f11984..9bb5f258759be 100644 --- a/drivers/pci/of.c +++ b/drivers/pci/of.c @@ -867,6 +867,7 @@ bool of_pci_supply_present(struct device_node *np) return false; } +EXPORT_SYMBOL_GPL(of_pci_supply_present); #endif /* CONFIG_PCI */ diff --git a/drivers/pci/pwrctrl/core.c b/drivers/pci/pwrctrl/core.c index 6bdbfed584d6d..b423768cc477c 100644 --- a/drivers/pci/pwrctrl/core.c +++ b/drivers/pci/pwrctrl/core.c @@ -3,14 +3,21 @@ * Copyright (C) 2024 Linaro Ltd. */ +#define dev_fmt(fmt) "pwrctrl: " fmt + #include #include #include +#include +#include #include #include +#include #include #include +#include "../pci.h" + static int pci_pwrctrl_notify(struct notifier_block *nb, unsigned long action, void *data) { @@ -145,6 +152,113 @@ int devm_pci_pwrctrl_device_set_ready(struct device *dev, } EXPORT_SYMBOL_GPL(devm_pci_pwrctrl_device_set_ready); +static int pci_pwrctrl_create_device(struct device_node *np, + struct device *parent) +{ + struct platform_device *pdev; + int ret; + + for_each_available_child_of_node_scoped(np, child) { + ret = pci_pwrctrl_create_device(child, parent); + if (ret) + return ret; + } + + /* Bail out if the platform device is already available for the node */ + pdev = of_find_device_by_node(np); + if (pdev) { + platform_device_put(pdev); + return 0; + } + + /* + * Sanity check to make sure that the node has the compatible property + * to allow driver binding. + */ + if (!of_property_present(np, "compatible")) + return 0; + + /* + * Check whether the pwrctrl device really needs to be created or not. + * This is decided based on at least one of the power supplies being + * defined in the devicetree node of the device. + */ + if (!of_pci_supply_present(np)) { + dev_dbg(parent, "Skipping OF node: %s\n", np->name); + return 0; + } + + /* Now create the pwrctrl device */ + pdev = of_platform_device_create(np, NULL, parent); + if (!pdev) { + dev_err(parent, "Failed to create pwrctrl device for node: %s\n", np->name); + return -EINVAL; + } + + return 0; +} + +/** + * pci_pwrctrl_create_devices - Create pwrctrl devices + * + * @parent: PCI host controller device + * + * Recursively create pwrctrl devices for the devicetree hierarchy below + * the specified PCI host controller in a depth first manner. On error, all + * created devices will be destroyed. + * + * Return: 0 on success, negative error number on error. + */ +int pci_pwrctrl_create_devices(struct device *parent) +{ + int ret; + + for_each_available_child_of_node_scoped(parent->of_node, child) { + ret = pci_pwrctrl_create_device(child, parent); + if (ret) { + pci_pwrctrl_destroy_devices(parent); + return ret; + } + } + + return 0; +} +EXPORT_SYMBOL_GPL(pci_pwrctrl_create_devices); + +static void pci_pwrctrl_destroy_device(struct device_node *np) +{ + struct platform_device *pdev; + + for_each_available_child_of_node_scoped(np, child) + pci_pwrctrl_destroy_device(child); + + pdev = of_find_device_by_node(np); + if (!pdev) + return; + + of_device_unregister(pdev); + platform_device_put(pdev); + + of_node_clear_flag(np, OF_POPULATED); +} + +/** + * pci_pwrctrl_destroy_devices - Destroy pwrctrl devices + * + * @parent: PCI host controller device + * + * Recursively destroy pwrctrl devices for the devicetree hierarchy below + * the specified PCI host controller in a depth first manner. + */ +void pci_pwrctrl_destroy_devices(struct device *parent) +{ + struct device_node *np = parent->of_node; + + for_each_available_child_of_node_scoped(np, child) + pci_pwrctrl_destroy_device(child); +} +EXPORT_SYMBOL_GPL(pci_pwrctrl_destroy_devices); + MODULE_AUTHOR("Bartosz Golaszewski "); MODULE_DESCRIPTION("PCI Device Power Control core driver"); MODULE_LICENSE("GPL"); diff --git a/include/linux/pci-pwrctrl.h b/include/linux/pci-pwrctrl.h index 435b822c841ea..44f66872d0901 100644 --- a/include/linux/pci-pwrctrl.h +++ b/include/linux/pci-pwrctrl.h @@ -54,5 +54,11 @@ int pci_pwrctrl_device_set_ready(struct pci_pwrctrl *pwrctrl); void pci_pwrctrl_device_unset_ready(struct pci_pwrctrl *pwrctrl); int devm_pci_pwrctrl_device_set_ready(struct device *dev, struct pci_pwrctrl *pwrctrl); - +#if IS_ENABLED(CONFIG_PCI_PWRCTRL) +int pci_pwrctrl_create_devices(struct device *parent); +void pci_pwrctrl_destroy_devices(struct device *parent); +#else +static inline int pci_pwrctrl_create_devices(struct device *parent) { return 0; } +static void pci_pwrctrl_destroy_devices(struct device *parent) { } +#endif #endif /* __PCI_PWRCTRL_H__ */ From eaa3a49953ca6725d0c77ac4df27aeee530534eb Mon Sep 17 00:00:00 2001 From: Manivannan Sadhasivam Date: Thu, 15 Jan 2026 12:59:03 +0530 Subject: [PATCH 1316/1321] FROMLIST: PCI/pwrctrl: Add APIs to power on/off pwrctrl devices To fix bridge resource allocation issues when powering PCI bridges with the pwrctrl driver, introduce APIs to explicitly power on and off all related devices simultaneously. Previously, the individual pwrctrl drivers powered on/off the PCI devices autonomously, without any control from the controller drivers. But to enforce ordering with respect to powering on the devices, these APIs will power on/off all the devices at the same time. The pci_pwrctrl_power_on_devices() API recursively scans the PCI child nodes, makes sure that pwrctrl drivers are bound to devices, and calls their power_on() callbacks. If any pwrctrl driver is not bound, it will return -EPROBE_DEFER. Similarly, pci_pwrctrl_power_off_devices() API powers off devices recursively via their power_off() callbacks. These APIs are expected to be called during the controller probe and suspend/resume time to power on/off the devices. But before calling these APIs, the pwrctrl devices should be created using the pci_pwrctrl_{create/destroy}_devices() APIs. Co-developed-by: Krishna Chaitanya Chundru Signed-off-by: Manivannan Sadhasivam [bhelgaas: return early when !pdev and unindent, move ctx.power_on/off here] Signed-off-by: Bjorn Helgaas Tested-by: Chen-Yu Tsai Reviewed-by: Bartosz Golaszewski Link: https://lore.kernel.org/r/20260115-pci-pwrctrl-rework-v5-11-9d26da3ce903@oss.qualcomm.com Signed-off-by: Krishna Chaitanya Chundru --- drivers/pci/pwrctrl/core.c | 130 ++++++++++++++++++++++++++++++++++++ include/linux/pci-pwrctrl.h | 4 ++ 2 files changed, 134 insertions(+) diff --git a/drivers/pci/pwrctrl/core.c b/drivers/pci/pwrctrl/core.c index b423768cc477c..fef5243d94458 100644 --- a/drivers/pci/pwrctrl/core.c +++ b/drivers/pci/pwrctrl/core.c @@ -65,6 +65,7 @@ void pci_pwrctrl_init(struct pci_pwrctrl *pwrctrl, struct device *dev) { pwrctrl->dev = dev; INIT_WORK(&pwrctrl->work, rescan_work_func); + dev_set_drvdata(dev, pwrctrl); } EXPORT_SYMBOL_GPL(pci_pwrctrl_init); @@ -152,6 +153,135 @@ int devm_pci_pwrctrl_device_set_ready(struct device *dev, } EXPORT_SYMBOL_GPL(devm_pci_pwrctrl_device_set_ready); +static int __pci_pwrctrl_power_off_device(struct device *dev) +{ + struct pci_pwrctrl *pwrctrl = dev_get_drvdata(dev); + + if (!pwrctrl) + return 0; + + return pwrctrl->power_off(pwrctrl); +} + +static void pci_pwrctrl_power_off_device(struct device_node *np) +{ + struct platform_device *pdev; + int ret; + + for_each_available_child_of_node_scoped(np, child) + pci_pwrctrl_power_off_device(child); + + pdev = of_find_device_by_node(np); + if (!pdev) + return; + + if (device_is_bound(&pdev->dev)) { + ret = __pci_pwrctrl_power_off_device(&pdev->dev); + if (ret) + dev_err(&pdev->dev, "Failed to power off device: %d", ret); + } + + platform_device_put(pdev); +} + +/** + * pci_pwrctrl_power_off_devices - Power off pwrctrl devices + * + * @parent: PCI host controller device + * + * Recursively traverse all pwrctrl devices for the devicetree hierarchy + * below the specified PCI host controller and power them off in a depth + * first manner. + */ +void pci_pwrctrl_power_off_devices(struct device *parent) +{ + struct device_node *np = parent->of_node; + + for_each_available_child_of_node_scoped(np, child) + pci_pwrctrl_power_off_device(child); +} +EXPORT_SYMBOL_GPL(pci_pwrctrl_power_off_devices); + +static int __pci_pwrctrl_power_on_device(struct device *dev) +{ + struct pci_pwrctrl *pwrctrl = dev_get_drvdata(dev); + + if (!pwrctrl) + return 0; + + return pwrctrl->power_on(pwrctrl); +} + +/* + * Power on the devices in a depth first manner. Before powering on the device, + * make sure its driver is bound. + */ +static int pci_pwrctrl_power_on_device(struct device_node *np) +{ + struct platform_device *pdev; + int ret; + + for_each_available_child_of_node_scoped(np, child) { + ret = pci_pwrctrl_power_on_device(child); + if (ret) + return ret; + } + + pdev = of_find_device_by_node(np); + if (!pdev) + return 0; + + if (device_is_bound(&pdev->dev)) { + ret = __pci_pwrctrl_power_on_device(&pdev->dev); + } else { + /* FIXME: Use blocking wait instead of probe deferral */ + dev_dbg(&pdev->dev, "driver is not bound\n"); + ret = -EPROBE_DEFER; + } + + platform_device_put(pdev); + + return ret; +} + +/** + * pci_pwrctrl_power_on_devices - Power on pwrctrl devices + * + * @parent: PCI host controller device + * + * Recursively traverse all pwrctrl devices for the devicetree hierarchy + * below the specified PCI host controller and power them on in a depth + * first manner. On error, all powered on devices will be powered off. + * + * Return: 0 on success, -EPROBE_DEFER if any pwrctrl driver is not bound, an + * appropriate error code otherwise. + */ +int pci_pwrctrl_power_on_devices(struct device *parent) +{ + struct device_node *np = parent->of_node; + struct device_node *child = NULL; + int ret; + + for_each_available_child_of_node(np, child) { + ret = pci_pwrctrl_power_on_device(child); + if (ret) + goto err_power_off; + } + + return 0; + +err_power_off: + for_each_available_child_of_node_scoped(np, tmp) { + if (tmp == child) + break; + pci_pwrctrl_power_off_device(tmp); + } + of_node_put(child); + + return ret; +} +EXPORT_SYMBOL_GPL(pci_pwrctrl_power_on_devices); + static int pci_pwrctrl_create_device(struct device_node *np, struct device *parent) { diff --git a/include/linux/pci-pwrctrl.h b/include/linux/pci-pwrctrl.h index 44f66872d0901..1192a2527521d 100644 --- a/include/linux/pci-pwrctrl.h +++ b/include/linux/pci-pwrctrl.h @@ -57,8 +57,12 @@ int devm_pci_pwrctrl_device_set_ready(struct device *dev, #if IS_ENABLED(CONFIG_PCI_PWRCTRL) int pci_pwrctrl_create_devices(struct device *parent); void pci_pwrctrl_destroy_devices(struct device *parent); +int pci_pwrctrl_power_on_devices(struct device *parent); +void pci_pwrctrl_power_off_devices(struct device *parent); #else static inline int pci_pwrctrl_create_devices(struct device *parent) { return 0; } static void pci_pwrctrl_destroy_devices(struct device *parent) { } +static inline int pci_pwrctrl_power_on_devices(struct device *parent) { return 0; } +static void pci_pwrctrl_power_off_devices(struct device *parent) { } #endif #endif /* __PCI_PWRCTRL_H__ */ From 45b74da64699eaecb7768c188e0e8507fc2b188e Mon Sep 17 00:00:00 2001 From: Manivannan Sadhasivam Date: Thu, 15 Jan 2026 12:59:04 +0530 Subject: [PATCH 1317/1321] FROMLIST: PCI/pwrctrl: Switch to pwrctrl create, power on/off, destroy APIs Adopt pwrctrl APIs to create, power on/off, and destroy pwrctrl devices. In qcom_pcie_host_init(), call pci_pwrctrl_create_devices() to create devices, then pci_pwrctrl_power_on_devices() to power them on, both after controller resource initialization. Once successful, deassert PERST# for all devices. In qcom_pcie_host_deinit(), call pci_pwrctrl_power_off_devices() after asserting PERST#. Note that pci_pwrctrl_destroy_devices() is not called here, as deinit is only invoked during system suspend where device destruction is unnecessary. If the driver becomes removable in future, pci_pwrctrl_destroy_devices() should be called in the remove() handler. Remove the old pwrctrl framework code from the PCI core (including devlinks) as the new APIs are now the sole consumer of pwrctrl functionality. And also do not power on the pwrctrl drivers during probe() as this is now handled by the APIs. Co-developed-by: Krishna Chaitanya Chundru Signed-off-by: Manivannan Sadhasivam Signed-off-by: Bjorn Helgaas Tested-by: Chen-Yu Tsai Reviewed-by: Bartosz Golaszewski Link: https://lore.kernel.org/r/20260115-pci-pwrctrl-rework-v5-12-9d26da3ce903@oss.qualcomm.com Signed-off-by: Krishna Chaitanya Chundru --- drivers/pci/bus.c | 19 -------- drivers/pci/controller/dwc/pcie-qcom.c | 24 +++++++++- drivers/pci/probe.c | 59 ------------------------ drivers/pci/pwrctrl/core.c | 15 ------ drivers/pci/pwrctrl/pci-pwrctrl-pwrseq.c | 5 -- drivers/pci/pwrctrl/pci-pwrctrl-tc9563.c | 24 ++-------- drivers/pci/pwrctrl/slot.c | 2 - drivers/pci/remove.c | 20 -------- 8 files changed, 25 insertions(+), 143 deletions(-) diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c index 4383a36fd6ca0..d60d5f1082120 100644 --- a/drivers/pci/bus.c +++ b/drivers/pci/bus.c @@ -344,7 +344,6 @@ void __weak pcibios_bus_add_device(struct pci_dev *pdev) { } void pci_bus_add_device(struct pci_dev *dev) { struct device_node *dn = dev->dev.of_node; - struct platform_device *pdev; /* * Can not put in pci_device_add yet because resources @@ -361,24 +360,6 @@ void pci_bus_add_device(struct pci_dev *dev) /* Save config space for error recoverability */ pci_save_state(dev); - /* - * If the PCI device is associated with a pwrctrl device with a - * power supply, create a device link between the PCI device and - * pwrctrl device. This ensures that pwrctrl drivers are probed - * before PCI client drivers. - */ - pdev = of_find_device_by_node(dn); - if (pdev) { - if (of_pci_supply_present(dn)) { - if (!device_link_add(&dev->dev, &pdev->dev, - DL_FLAG_AUTOREMOVE_CONSUMER)) { - pci_err(dev, "failed to add device link to power control device %s\n", - pdev->name); - } - } - put_device(&pdev->dev); - } - if (!dn || of_device_is_available(dn)) pci_dev_allow_binding(dev); diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c index 0610692a129b7..38f2c3469b864 100644 --- a/drivers/pci/controller/dwc/pcie-qcom.c +++ b/drivers/pci/controller/dwc/pcie-qcom.c @@ -24,6 +24,7 @@ #include #include #include +#include #include #include #include @@ -1316,10 +1317,18 @@ static int qcom_pcie_host_init(struct dw_pcie_rp *pp) if (ret) goto err_deinit; + ret = pci_pwrctrl_create_devices(pci->dev); + if (ret) + goto err_disable_phy; + + ret = pci_pwrctrl_power_on_devices(pci->dev); + if (ret) + goto err_pwrctrl_destroy; + if (pcie->cfg->ops->post_init) { ret = pcie->cfg->ops->post_init(pcie); if (ret) - goto err_disable_phy; + goto err_pwrctrl_power_off; } qcom_pcie_clear_aspm_l0s(pcie->pci); @@ -1336,6 +1345,11 @@ static int qcom_pcie_host_init(struct dw_pcie_rp *pp) err_assert_reset: qcom_ep_reset_assert(pcie); +err_pwrctrl_power_off: + pci_pwrctrl_power_off_devices(pci->dev); +err_pwrctrl_destroy: + if (ret != -EPROBE_DEFER) + pci_pwrctrl_destroy_devices(pci->dev); err_disable_phy: qcom_pcie_phy_power_off(pcie); err_deinit: @@ -1350,6 +1364,12 @@ static void qcom_pcie_host_deinit(struct dw_pcie_rp *pp) struct qcom_pcie *pcie = to_qcom_pcie(pci); qcom_ep_reset_assert(pcie); + + /* + * No need to destroy pwrctrl devices as this function only gets called + * during system suspend as of now. + */ + pci_pwrctrl_power_off_devices(pci->dev); qcom_pcie_phy_power_off(pcie); pcie->cfg->ops->deinit(pcie); } @@ -1978,7 +1998,7 @@ static int qcom_pcie_probe(struct platform_device *pdev) ret = dw_pcie_host_init(pp); if (ret) { - dev_err(dev, "cannot initialize host\n"); + dev_err_probe(dev, ret, "cannot initialize host\n"); goto err_phy_exit; } diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index 41183aed8f5d9..6e7252404b583 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -2563,56 +2563,6 @@ bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l, } EXPORT_SYMBOL(pci_bus_read_dev_vendor_id); -#if IS_ENABLED(CONFIG_PCI_PWRCTRL) -static struct platform_device *pci_pwrctrl_create_device(struct pci_bus *bus, int devfn) -{ - struct pci_host_bridge *host = pci_find_host_bridge(bus); - struct platform_device *pdev; - struct device_node *np; - - np = of_pci_find_child_device(dev_of_node(&bus->dev), devfn); - if (!np) - return NULL; - - pdev = of_find_device_by_node(np); - if (pdev) { - put_device(&pdev->dev); - goto err_put_of_node; - } - - /* - * First check whether the pwrctrl device really needs to be created or - * not. This is decided based on at least one of the power supplies - * being defined in the devicetree node of the device. - */ - if (!of_pci_supply_present(np)) { - pr_debug("PCI/pwrctrl: Skipping OF node: %s\n", np->name); - goto err_put_of_node; - } - - /* Now create the pwrctrl device */ - pdev = of_platform_device_create(np, NULL, &host->dev); - if (!pdev) { - pr_err("PCI/pwrctrl: Failed to create pwrctrl device for node: %s\n", np->name); - goto err_put_of_node; - } - - of_node_put(np); - - return pdev; - -err_put_of_node: - of_node_put(np); - - return NULL; -} -#else -static struct platform_device *pci_pwrctrl_create_device(struct pci_bus *bus, int devfn) -{ - return NULL; -} -#endif - /* * Read the config data for a PCI device, sanity-check it, * and fill in the dev structure. @@ -2622,15 +2572,6 @@ static struct pci_dev *pci_scan_device(struct pci_bus *bus, int devfn) struct pci_dev *dev; u32 l; - /* - * Create pwrctrl device (if required) for the PCI device to handle the - * power state. If the pwrctrl device is created, then skip scanning - * further as the pwrctrl core will rescan the bus after powering on - * the device. - */ - if (pci_pwrctrl_create_device(bus, devfn)) - return NULL; - if (!pci_bus_read_dev_vendor_id(bus, devfn, &l, 60*1000)) return NULL; diff --git a/drivers/pci/pwrctrl/core.c b/drivers/pci/pwrctrl/core.c index fef5243d94458..1b91375738a08 100644 --- a/drivers/pci/pwrctrl/core.c +++ b/drivers/pci/pwrctrl/core.c @@ -45,16 +45,6 @@ static int pci_pwrctrl_notify(struct notifier_block *nb, unsigned long action, return NOTIFY_DONE; } -static void rescan_work_func(struct work_struct *work) -{ - struct pci_pwrctrl *pwrctrl = container_of(work, - struct pci_pwrctrl, work); - - pci_lock_rescan_remove(); - pci_rescan_bus(to_pci_host_bridge(pwrctrl->dev->parent)->bus); - pci_unlock_rescan_remove(); -} - /** * pci_pwrctrl_init() - Initialize the PCI power control context struct * @@ -64,7 +54,6 @@ static void rescan_work_func(struct work_struct *work) void pci_pwrctrl_init(struct pci_pwrctrl *pwrctrl, struct device *dev) { pwrctrl->dev = dev; - INIT_WORK(&pwrctrl->work, rescan_work_func); dev_set_drvdata(dev, pwrctrl); } EXPORT_SYMBOL_GPL(pci_pwrctrl_init); @@ -95,8 +84,6 @@ int pci_pwrctrl_device_set_ready(struct pci_pwrctrl *pwrctrl) if (ret) return ret; - schedule_work(&pwrctrl->work); - return 0; } EXPORT_SYMBOL_GPL(pci_pwrctrl_device_set_ready); @@ -109,8 +96,6 @@ EXPORT_SYMBOL_GPL(pci_pwrctrl_device_set_ready); */ void pci_pwrctrl_device_unset_ready(struct pci_pwrctrl *pwrctrl) { - cancel_work_sync(&pwrctrl->work); - /* * We don't have to delete the link here. Typically, this function * is only called when the power control device is being detached. If diff --git a/drivers/pci/pwrctrl/pci-pwrctrl-pwrseq.c b/drivers/pci/pwrctrl/pci-pwrctrl-pwrseq.c index 2ee02edd55a31..1df0cd73b416e 100644 --- a/drivers/pci/pwrctrl/pci-pwrctrl-pwrseq.c +++ b/drivers/pci/pwrctrl/pci-pwrctrl-pwrseq.c @@ -101,11 +101,6 @@ static int pci_pwrctrl_pwrseq_probe(struct platform_device *pdev) return dev_err_probe(dev, PTR_ERR(pwrseq->pwrseq), "Failed to get the power sequencer\n"); - ret = pci_pwrctrl_pwrseq_power_on(&pwrseq->pwrctrl); - if (ret) - return dev_err_probe(dev, ret, - "Failed to power-on the device\n"); - ret = devm_add_action_or_reset(dev, devm_pci_pwrctrl_pwrseq_power_off, pwrseq); if (ret) diff --git a/drivers/pci/pwrctrl/pci-pwrctrl-tc9563.c b/drivers/pci/pwrctrl/pci-pwrctrl-tc9563.c index a71d7ef2d4b89..38008d03903f2 100644 --- a/drivers/pci/pwrctrl/pci-pwrctrl-tc9563.c +++ b/drivers/pci/pwrctrl/pci-pwrctrl-tc9563.c @@ -533,9 +533,7 @@ static int tc9563_pwrctrl_power_on(struct pci_pwrctrl *pwrctrl) static int tc9563_pwrctrl_probe(struct platform_device *pdev) { - struct pci_host_bridge *bridge = to_pci_host_bridge(pdev->dev.parent); struct device_node *node = pdev->dev.of_node; - struct pci_bus *bus = bridge->bus; struct device *dev = &pdev->dev; enum tc9563_pwrctrl_ports port; struct pci_pwrctrl_tc9563 *tc9563; @@ -614,22 +612,6 @@ static int tc9563_pwrctrl_probe(struct platform_device *pdev) goto remove_i2c; } - if (bridge->ops->assert_perst) { - ret = bridge->ops->assert_perst(bus, true); - if (ret) - goto remove_i2c; - } - - ret = tc9563_pwrctrl_power_on(&tc9563->pwrctrl); - if (ret) - goto remove_i2c; - - if (bridge->ops->assert_perst) { - ret = bridge->ops->assert_perst(bus, false); - if (ret) - goto power_off; - } - tc9563->pwrctrl.power_on = tc9563_pwrctrl_power_on; tc9563->pwrctrl.power_off = tc9563_pwrctrl_power_off; @@ -637,8 +619,6 @@ static int tc9563_pwrctrl_probe(struct platform_device *pdev) if (ret) goto power_off; - platform_set_drvdata(pdev, tc9563); - return 0; power_off: @@ -651,7 +631,9 @@ static int tc9563_pwrctrl_probe(struct platform_device *pdev) static void tc9563_pwrctrl_remove(struct platform_device *pdev) { - struct pci_pwrctrl_tc9563 *tc9563 = platform_get_drvdata(pdev); + struct pci_pwrctrl *pwrctrl = dev_get_drvdata(&pdev->dev); + struct pci_pwrctrl_tc9563 *tc9563 = container_of(pwrctrl, + struct pci_pwrctrl_tc9563, pwrctrl); tc9563_pwrctrl_power_off(&tc9563->pwrctrl); i2c_unregister_device(tc9563->client); diff --git a/drivers/pci/pwrctrl/slot.c b/drivers/pci/pwrctrl/slot.c index 55828aec24862..a36e5dd424427 100644 --- a/drivers/pci/pwrctrl/slot.c +++ b/drivers/pci/pwrctrl/slot.c @@ -83,8 +83,6 @@ static int pci_pwrctrl_slot_probe(struct platform_device *pdev) return dev_err_probe(dev, PTR_ERR(slot->clk), "Failed to enable slot clock\n"); - pci_pwrctrl_slot_power_on(&slot->pwrctrl); - slot->pwrctrl.power_on = pci_pwrctrl_slot_power_on; slot->pwrctrl.power_off = pci_pwrctrl_slot_power_off; diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c index 417a9ea59117d..e9d519993853f 100644 --- a/drivers/pci/remove.c +++ b/drivers/pci/remove.c @@ -17,25 +17,6 @@ static void pci_free_resources(struct pci_dev *dev) } } -static void pci_pwrctrl_unregister(struct device *dev) -{ - struct device_node *np; - struct platform_device *pdev; - - np = dev_of_node(dev); - if (!np) - return; - - pdev = of_find_device_by_node(np); - if (!pdev) - return; - - of_device_unregister(pdev); - put_device(&pdev->dev); - - of_node_clear_flag(np, OF_POPULATED); -} - static void pci_stop_dev(struct pci_dev *dev) { pci_pme_active(dev, false); @@ -73,7 +54,6 @@ static void pci_destroy_dev(struct pci_dev *dev) pci_ide_destroy(dev); pcie_aspm_exit_link_state(dev); pci_bridge_d3_update(dev); - pci_pwrctrl_unregister(&dev->dev); pci_free_resources(dev); put_device(&dev->dev); } From e2c83a96e6194240b0e5176e38c2ee34bfc04df7 Mon Sep 17 00:00:00 2001 From: Manivannan Sadhasivam Date: Thu, 15 Jan 2026 12:59:05 +0530 Subject: [PATCH 1318/1321] FROMLIST: PCI: qcom: Drop the assert_perst() callbacks Previously, the pcie-qcom driver probed first, deasserted PERST#, enabled link training and scanned the bus. By the time the pwrctrl driver probe got called, link training was already enabled by the controller driver. Thus the pwrctrl drivers had to call the .assert_perst() callback, to assert PERST#, power on the needed resources, and then call the .assert_perst() callback to deassert PERST#. Now since all pwrctrl drivers and this controller driver have been converted to the new pwrctrl design where the pwrctrl drivers will first power on the devices before this driver deasserts PERST# and scan the bus. So there is no longer a need for .assert_perst() callback in this driver and in DWC core driver. Hence, drop them. Signed-off-by: Manivannan Sadhasivam Reviewed-by: Bartosz Golaszewski Link: https://lore.kernel.org/r/20260115-pci-pwrctrl-rework-v5-13-9d26da3ce903@oss.qualcomm.com --- drivers/pci/controller/dwc/pcie-designware-host.c | 9 --------- drivers/pci/controller/dwc/pcie-designware.h | 9 --------- drivers/pci/controller/dwc/pcie-qcom.c | 13 ------------- 3 files changed, 31 deletions(-) diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c index 2c8056761addf..0dccc7e77dddb 100644 --- a/drivers/pci/controller/dwc/pcie-designware-host.c +++ b/drivers/pci/controller/dwc/pcie-designware-host.c @@ -857,19 +857,10 @@ static void __iomem *dw_pcie_ecam_conf_map_bus(struct pci_bus *bus, unsigned int return pci->dbi_base + where; } -static int dw_pcie_op_assert_perst(struct pci_bus *bus, bool assert) -{ - struct dw_pcie_rp *pp = bus->sysdata; - struct dw_pcie *pci = to_dw_pcie_from_pp(pp); - - return dw_pcie_assert_perst(pci, assert); -} - static struct pci_ops dw_pcie_ops = { .map_bus = dw_pcie_own_conf_map_bus, .read = pci_generic_config_read, .write = pci_generic_config_write, - .assert_perst = dw_pcie_op_assert_perst, }; static struct pci_ops dw_pcie_ecam_ops = { diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h index 18d8f7d5d2308..c486824aa0ff4 100644 --- a/drivers/pci/controller/dwc/pcie-designware.h +++ b/drivers/pci/controller/dwc/pcie-designware.h @@ -494,7 +494,6 @@ struct dw_pcie_ops { enum dw_pcie_ltssm (*get_ltssm)(struct dw_pcie *pcie); int (*start_link)(struct dw_pcie *pcie); void (*stop_link)(struct dw_pcie *pcie); - int (*assert_perst)(struct dw_pcie *pcie, bool assert); }; struct debugfs_info { @@ -799,14 +798,6 @@ static inline void dw_pcie_stop_link(struct dw_pcie *pci) pci->ops->stop_link(pci); } -static inline int dw_pcie_assert_perst(struct dw_pcie *pci, bool assert) -{ - if (pci->ops && pci->ops->assert_perst) - return pci->ops->assert_perst(pci, assert); - - return 0; -} - static inline enum dw_pcie_ltssm dw_pcie_get_ltssm(struct dw_pcie *pci) { u32 val; diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c index 38f2c3469b864..5788e24d3ec8f 100644 --- a/drivers/pci/controller/dwc/pcie-qcom.c +++ b/drivers/pci/controller/dwc/pcie-qcom.c @@ -643,18 +643,6 @@ static int qcom_pcie_post_init_1_0_0(struct qcom_pcie *pcie) return 0; } -static int qcom_pcie_assert_perst(struct dw_pcie *pci, bool assert) -{ - struct qcom_pcie *pcie = to_qcom_pcie(pci); - - if (assert) - qcom_ep_reset_assert(pcie); - else - qcom_ep_reset_deassert(pcie); - - return 0; -} - static void qcom_pcie_2_3_2_ltssm_enable(struct qcom_pcie *pcie) { u32 val; @@ -1531,7 +1519,6 @@ static const struct qcom_pcie_cfg cfg_fw_managed = { static const struct dw_pcie_ops dw_pcie_ops = { .link_up = qcom_pcie_link_up, .start_link = qcom_pcie_start_link, - .assert_perst = qcom_pcie_assert_perst, }; static int qcom_pcie_icc_init(struct qcom_pcie *pcie) From 2d6159ae11fef2f1ea048ee0a3ff18d71f90f984 Mon Sep 17 00:00:00 2001 From: Manivannan Sadhasivam Date: Thu, 15 Jan 2026 12:59:06 +0530 Subject: [PATCH 1319/1321] FROMLIST: PCI: Drop the assert_perst() callback Now since all .assert_callback() implementations have been removed from the controller drivers, drop the .assert_callback callback from pci.h. Reviewed-by: Bartosz Golaszewski Signed-off-by: Manivannan Sadhasivam Link: https://lore.kernel.org/r/20260115-pci-pwrctrl-rework-v5-14-9d26da3ce903@oss.qualcomm.com --- include/linux/pci.h | 1 - 1 file changed, 1 deletion(-) diff --git a/include/linux/pci.h b/include/linux/pci.h index b5cc0c2b99065..0245e03af5775 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -854,7 +854,6 @@ struct pci_ops { void __iomem *(*map_bus)(struct pci_bus *bus, unsigned int devfn, int where); int (*read)(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *val); int (*write)(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 val); - int (*assert_perst)(struct pci_bus *bus, bool assert); }; /* From 4e4ad7a7e2d44a3ee379a620b4930de80e107b0a Mon Sep 17 00:00:00 2001 From: Manivannan Sadhasivam Date: Thu, 15 Jan 2026 12:59:07 +0530 Subject: [PATCH 1320/1321] FROMLIST: PCI: qcom: Rename PERST# assert/deassert helpers for uniformity Rename the PERST# assert/deassert helpers from qcom_ep_reset_{assert/deassert}() to qcom_pcie_perst_{assert/deassert}() to maintain uniformity. Reviewed-by: Bartosz Golaszewski Signed-off-by: Manivannan Sadhasivam Link: https://lore.kernel.org/r/20260115-pci-pwrctrl-rework-v5-15-9d26da3ce903@oss.qualcomm.com --- drivers/pci/controller/dwc/pcie-qcom.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c index 5788e24d3ec8f..efb1b76715a47 100644 --- a/drivers/pci/controller/dwc/pcie-qcom.c +++ b/drivers/pci/controller/dwc/pcie-qcom.c @@ -291,7 +291,7 @@ struct qcom_pcie { #define to_qcom_pcie(x) dev_get_drvdata((x)->dev) -static void qcom_perst_assert(struct qcom_pcie *pcie, bool assert) +static void __qcom_pcie_perst_assert(struct qcom_pcie *pcie, bool assert) { struct qcom_pcie_port *port; int val = assert ? 1 : 0; @@ -302,16 +302,16 @@ static void qcom_perst_assert(struct qcom_pcie *pcie, bool assert) usleep_range(PERST_DELAY_US, PERST_DELAY_US + 500); } -static void qcom_ep_reset_assert(struct qcom_pcie *pcie) +static void qcom_pcie_perst_assert(struct qcom_pcie *pcie) { - qcom_perst_assert(pcie, true); + __qcom_pcie_perst_assert(pcie, true); } -static void qcom_ep_reset_deassert(struct qcom_pcie *pcie) +static void qcom_pcie_perst_deassert(struct qcom_pcie *pcie) { /* Ensure that PERST has been asserted for at least 100 ms */ msleep(PCIE_T_PVPERL_MS); - qcom_perst_assert(pcie, false); + __qcom_pcie_perst_assert(pcie, false); } static int qcom_pcie_start_link(struct dw_pcie *pci) @@ -1295,7 +1295,7 @@ static int qcom_pcie_host_init(struct dw_pcie_rp *pp) struct qcom_pcie *pcie = to_qcom_pcie(pci); int ret; - qcom_ep_reset_assert(pcie); + qcom_pcie_perst_assert(pcie); ret = pcie->cfg->ops->init(pcie); if (ret) @@ -1321,7 +1321,7 @@ static int qcom_pcie_host_init(struct dw_pcie_rp *pp) qcom_pcie_clear_aspm_l0s(pcie->pci); - qcom_ep_reset_deassert(pcie); + qcom_pcie_perst_deassert(pcie); if (pcie->cfg->ops->config_sid) { ret = pcie->cfg->ops->config_sid(pcie); @@ -1332,7 +1332,7 @@ static int qcom_pcie_host_init(struct dw_pcie_rp *pp) return 0; err_assert_reset: - qcom_ep_reset_assert(pcie); + qcom_pcie_perst_assert(pcie); err_pwrctrl_power_off: pci_pwrctrl_power_off_devices(pci->dev); err_pwrctrl_destroy: @@ -1351,7 +1351,7 @@ static void qcom_pcie_host_deinit(struct dw_pcie_rp *pp) struct dw_pcie *pci = to_dw_pcie_from_pp(pp); struct qcom_pcie *pcie = to_qcom_pcie(pci); - qcom_ep_reset_assert(pcie); + qcom_pcie_perst_assert(pcie); /* * No need to destroy pwrctrl devices as this function only gets called From 1054f575a6799a4d5d1037c6052588a502edfc1b Mon Sep 17 00:00:00 2001 From: Krishna Chaitanya Chundru Date: Tue, 17 Feb 2026 16:49:08 +0530 Subject: [PATCH 1321/1321] FROMLIST: PCI: qcom: Add .get_ltssm() helper For older targets like sc7280, we see reading DBI after sending PME turn off message is causing NOC error. To avoid unsafe DBI accesses, introduce qcom_pcie_get_ltssm(), which retrieves the LTSSM state from the PARF_LTSSM register instead. Link: https://lore.kernel.org/r/20260217-d3cold-v2-3-89b322864043@oss.qualcomm.com Signed-off-by: Krishna Chaitanya Chundru --- drivers/pci/controller/dwc/pcie-qcom.c | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c index efb1b76715a47..22119d3784050 100644 --- a/drivers/pci/controller/dwc/pcie-qcom.c +++ b/drivers/pci/controller/dwc/pcie-qcom.c @@ -135,6 +135,7 @@ /* PARF_LTSSM register fields */ #define LTSSM_EN BIT(8) +#define PARF_LTSSM_STATE_MASK GENMASK(5, 0) /* PARF_INT_ALL_{STATUS/CLEAR/MASK} register fields */ #define PARF_INT_ALL_LINK_UP BIT(13) @@ -1261,6 +1262,15 @@ static bool qcom_pcie_link_up(struct dw_pcie *pci) return val & PCI_EXP_LNKSTA_DLLLA; } +static enum dw_pcie_ltssm qcom_pcie_get_ltssm(struct dw_pcie *pci) +{ + struct qcom_pcie *pcie = to_qcom_pcie(pci); + u32 val; + + val = readl(pcie->parf + PARF_LTSSM); + return (enum dw_pcie_ltssm)FIELD_GET(PARF_LTSSM_STATE_MASK, val); +} + static void qcom_pcie_phy_power_off(struct qcom_pcie *pcie) { struct qcom_pcie_port *port; @@ -1519,6 +1529,7 @@ static const struct qcom_pcie_cfg cfg_fw_managed = { static const struct dw_pcie_ops dw_pcie_ops = { .link_up = qcom_pcie_link_up, .start_link = qcom_pcie_start_link, + .get_ltssm = qcom_pcie_get_ltssm, }; static int qcom_pcie_icc_init(struct qcom_pcie *pcie)