Skip to content

RDMA/irdma: Cap MSIX used to online CPUs + 1

Kamal Heib requested to merge kheib/centos-stream-9:bz2125810 into main

Bugzilla: https://bugzilla.redhat.com/2125810

commit 9cd9842c46996ef62173c36619c746f57416bcb0
Author: Mustafa Ismail mustafa.ismail@intel.com
Date: Tue Feb 7 14:19:38 2023 -0600

RDMA/irdma: Cap MSIX used to online CPUs + 1  

The irdma driver can use a maximum number of msix vectors equal  
to num_online_cpus() + 1 and the kernel warning stack below is shown  
if that number is exceeded.  

The kernel throws a warning as the driver tries to update the affinity  
hint with a CPU mask greater than the max CPU IDs. Fix this by capping  
the MSIX vectors to num_online_cpus() + 1.  

 WARNING: CPU: 7 PID: 23655 at include/linux/cpumask.h:106 irdma_cfg_ceq_vector+0x34c/0x3f0 [irdma]  
 RIP: 0010:irdma_cfg_ceq_vector+0x34c/0x3f0 [irdma]  
 Call Trace:  
 irdma_rt_init_hw+0xa62/0x1290 [irdma]  
 ? irdma_alloc_local_mac_entry+0x1a0/0x1a0 [irdma]  
 ? __is_kernel_percpu_address+0x63/0x310  
 ? rcu_read_lock_held_common+0xe/0xb0  
 ? irdma_lan_unregister_qset+0x280/0x280 [irdma]  
 ? irdma_request_reset+0x80/0x80 [irdma]  
 ? ice_get_qos_params+0x84/0x390 [ice]  
 irdma_probe+0xa40/0xfc0 [irdma]  
 ? rcu_read_lock_bh_held+0xd0/0xd0  
 ? irdma_remove+0x140/0x140 [irdma]  
 ? rcu_read_lock_sched_held+0x62/0xe0  
 ? down_write+0x187/0x3d0  
 ? auxiliary_match_id+0xf0/0x1a0  
 ? irdma_remove+0x140/0x140 [irdma]  
 auxiliary_bus_probe+0xa6/0x100  
 __driver_probe_device+0x4a4/0xd50  
 ? __device_attach_driver+0x2c0/0x2c0  
 driver_probe_device+0x4a/0x110  
 __driver_attach+0x1aa/0x350  
 bus_for_each_dev+0x11d/0x1b0  
 ? subsys_dev_iter_init+0xe0/0xe0  
 bus_add_driver+0x3b1/0x610  
 driver_register+0x18e/0x410  
 ? 0xffffffffc0b88000  
 irdma_init_module+0x50/0xaa [irdma]  
 do_one_initcall+0x103/0x5f0  
 ? perf_trace_initcall_level+0x420/0x420  
 ? do_init_module+0x4e/0x700  
 ? __kasan_kmalloc+0x7d/0xa0  
 ? kmem_cache_alloc_trace+0x188/0x2b0  
 ? kasan_unpoison+0x21/0x50  
 do_init_module+0x1d1/0x700  
 load_module+0x3867/0x5260  
 ? layout_and_allocate+0x3990/0x3990  
 ? rcu_read_lock_held_common+0xe/0xb0  
 ? rcu_read_lock_sched_held+0x62/0xe0  
 ? rcu_read_lock_bh_held+0xd0/0xd0  
 ? __vmalloc_node_range+0x46b/0x890  
 ? lock_release+0x5c8/0xba0  
 ? alloc_vm_area+0x120/0x120  
 ? selinux_kernel_module_from_file+0x2a5/0x300  
 ? __inode_security_revalidate+0xf0/0xf0  
 ? __do_sys_init_module+0x1db/0x260  
 __do_sys_init_module+0x1db/0x260  
 ? load_module+0x5260/0x5260  
 ? do_syscall_64+0x22/0x450  
 do_syscall_64+0xa5/0x450  
 entry_SYSCALL_64_after_hwframe+0x66/0xdb  

Fixes: 44d9e52977a1 ("RDMA/irdma: Implement device initialization definitions")  
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>  
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>  
Signed-off-by: Sindhu Devale <sindhu.devale@intel.com>  
Link: https://lore.kernel.org/r/20230207201938.1329-1-sindhu.devale@intel.com  
Signed-off-by: Leon Romanovsky <leon@kernel.org>  

Signed-off-by: Kamal Heib kheib@redhat.com

Merge request reports