- Jun 24, 2024
-
-
Andrew Cooper authored
`xl devd` has been observed leaking /var/log/xldevd.log into children. Note this is specifically safe; dup2() leaves O_CLOEXEC disabled on newfd, so after setting up stdout/stderr, it's only the logfile fd which will close on exec(). Link: https://github.com/QubesOS/qubes-issues/issues/8292
Reported-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Acked-by: Anthony PERARD <anthony.perard@vates.tech>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
-
- Jun 21, 2024
-
-
When the left shift amount is up to 31, the shifted quantity wants to be of unsigned int (or wider) type. While there also adjust types: get doesn't alter the array and returns a boolean, while both don't really accept negative "nr". Drop a stray blank each as well.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
-
There exist bit shifts in the IOAPIC code where signed integers are shifted to the left by up to 31 bits, which is undefined behaviour. This patch fixes this by changing the integers from signed to unsigned.
Signed-off-by: Matthew Barnes <matthew.barnes@cloud.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
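Both commits above fix the same class of issue. A hedged sketch of the pattern (generic bitmap helpers, not the actual Xen or IOAPIC code): with a signed operand, `1 << 31` overflows `int` and is undefined behaviour; an unsigned operand makes the shift well defined, and an unsigned "nr" matches the fact that negative bit numbers are meaningless.

```c
#include <assert.h>
#include <stdint.h>

/* Building a mask with an unsigned operand is defined for nr up to 31;
 * "1 << 31" with a signed int would be UB (result not representable). */
static uint32_t bit_mask(unsigned int nr)
{
    return 1u << nr;
}

/* get: doesn't alter the array and returns a boolean. */
static int test_bit(const uint32_t *addr, unsigned int nr)
{
    return (addr[nr / 32] & bit_mask(nr % 32)) != 0;
}

static void set_bit(uint32_t *addr, unsigned int nr)
{
    addr[nr / 32] |= bit_mask(nr % 32);
}
```

Note the highest bit (nr == 31) is exactly the case the signed version gets wrong.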
-
- Jun 20, 2024
-
-
Jan Beulich authored
As was made noticeable by the last of the commits referenced below, using a fixed-size type for such purposes is not only against ./CODING_STYLE, but can lead to actual issues. Switch to using size_t instead, thus also allowing calculations to be lighter-weight in 32-bit builds. No functional change for 64-bit builds. Link: https://gitlab.com/xen-project/xen/-/jobs/7136417308
Fixes: b145b4a3 ("livepatch: Handle arbitrary size names with the list operation")
Fixes: 5083e0ff ("livepatch: Add metadata runtime retrieval mechanism")
Fixes: 43d5c5d5 ("xen: avoid UB in guest handle arithmetic")
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
-
fixup_irqs() is used to evacuate interrupts from CPUs that are to be offlined. Given the CPU is to become offline, the normal migration logic used by Xen, where the vector on the previous target(s) is left configured until the interrupt is received on the new destination, is not suitable. Instead attempt to do as much as possible in order to prevent losing interrupts. If fixup_irqs() is called from the CPU to be offlined (as is currently the case for CPU hot unplug) attempt to forward pending vectors when interrupts that target the current CPU are migrated to a different destination. Additionally, for interrupts that have already been moved from the current CPU prior to the call to fixup_irqs() but that haven't been delivered to the new destination (iow: interrupts with move_in_progress set and the current CPU set in ->arch.old_cpu_mask) also check whether the previous vector is pending and forward it to the new destination. This allows us to remove the window with interrupts enabled at the bottom of fixup_irqs(). Such a window wasn't safe anyway: references to the CPU to become offline are removed from interrupt masks, but the per-CPU vector_irq[] array is not updated to reflect those changes (as the CPU is going offline anyway).
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
-
After the following commit: 3bc14e4f ("tools/libs/light: Add vlan field to libxl_device_nic") xl list -l aborts with a double free error if a domain has at least one vif defined:

$ sudo xl list -l
free(): double free detected in tcache 2
Aborted

Originally, the vlan field was called vid and was defined as an integer. It was appropriate to call libxl__xs_read_checked() with gc passed, as the string data was copied to a different variable. However, the final version uses a string data type and the call should have been changed to use NOGC instead of gc to allow that data to live past the gc-controlled lifetime, in line with the other string fields. This patch makes the change to pass NOGC instead of gc and moves the new code to be next to the other string fields (fixing a couple of errant tabs along the way), as recommended by Jason.
Fixes: 3bc14e4f ("tools/libs/light: Add vlan field to libxl_device_nic")
Signed-off-by: Leigh Brown <leigh@solinno.co.uk>
Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>
Acked-by: Anthony PERARD <anthony.perard@vates.tech>
Release-acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
-
backendtype=phy using the blktap kernel module needs to use write_dev, but tapback can't support that. tapback should perform better, but make the script compatible with the old kernel module again.
Fixes: 76a48419 ("hotplug: Update block-tap")
Signed-off-by: Jason Andryuk <jason.andryuk@amd.com>
Acked-by: Anthony PERARD <anthony.perard@vates.tech>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
-
- Jun 19, 2024
-
-
Jan Beulich authored
At least XENMEM_memory_exchange can have huge values passed in the nr_extents and nr_exchanged fields. Adding such values to pointers can overflow, resulting in UB. Cast respective pointers to "unsigned long" while at the same time making the necessary multiplication explicit. Remaining arithmetic is, despite there possibly being mathematical overflow, okay as per the C99 spec: "A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type." The overflow that we need to guard against is checked for in array_access_ok(). Note that in / down from array_access_ok() the address value is only ever cast to "unsigned long" anyway, which is why in the invocation from guest_handle_subrange_okay() the value doesn't need casting back to pointer type. In compat grant table code change two guest_handle_add_offset() to avoid passing in negative offsets. Since {,__}clear_guest_offset() need touching anyway, also deal with another (latent) issue there: They were losing the handle type, i.e. the size of the individual objects accessed. Luckily the few users we presently have all pass char or uint8 handles.
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Tested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
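The core technique can be shown in a few lines. This is a simplified sketch (the helper name is made up, not Xen's guest handle macros): pointer arithmetic that overflows is UB, but the same computation on `unsigned long` is defined to wrap modulo 2^N, exactly as the quoted C99 wording says, and the real range check is then done separately.

```c
#include <assert.h>
#include <stddef.h>

/* Compute base + nr * elem_size entirely in unsigned integer arithmetic.
 * Unlike "ptr + nr" with huge nr, this cannot invoke UB: unsigned
 * arithmetic wraps modulo 2^N by definition (C99 6.2.5).  A separate
 * bounds check (array_access_ok() in Xen) guards against the wrap. */
static unsigned long handle_offset(unsigned long base, unsigned long nr,
                                   size_t elem_size)
{
    return base + nr * elem_size;
}
```

The multiplication being explicit (rather than implied by pointer arithmetic) is what keeps every step in the defined-to-wrap unsigned domain.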
-
Andrew Cooper authored
With the exception of one case in read_bndcfgu() which can use ilog2(), the *_POS defines are unused. Drop them. X86_XCR0_X87 is the name used by both the SDM and APM, rather than X86_XCR0_FP. No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
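The substitution works because a bit position can always be recovered from the mask itself. A hedged sketch of the ilog2() idea (modelled on the common GCC-builtin shape, not copied from Xen's helper):

```c
#include <assert.h>

/* Index of the highest set bit, i.e. floor(log2(x)).  For a single-bit
 * mask this recovers the bit position, making a separate *_POS define
 * redundant.  x must be non-zero (__builtin_clz(0) is undefined). */
static unsigned int ilog2(unsigned int x)
{
    return 31 - (unsigned int)__builtin_clz(x);
}
```

So `ilog2(X86_XCR0_BNDCSR)` yields the component index directly from the mask, which is why read_bndcfgu() no longer needs the positional constant.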
-
Andrew Cooper authored
First, if XSAVE is available in hardware but not visible to the guest, the dynamic leaves shouldn't be filled in. Second, the comment concerning XSS state is wrong. VT-x doesn't manage host/guest state automatically, but there is provision for "host only" bits to be set, so the implications are still accurate. Introduce xstate_compressed_size() to mirror the uncompressed one. Cross check it at boot.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
-
Andrew Cooper authored
Make use of the xstate_uncompressed_size() helper rather than maintaining the running calculation while accumulating feature components. The rest of the CPUID data can come direct from the raw cpu policy. All per-component data form an ABI through the behaviour of the X{SAVE,RSTOR}* instructions. Use for_each_set_bit() rather than open-coding a slightly awkward version of it. Mask the attributes in ecx down based on the visible features. This isn't actually necessary for any components or attributes defined at the time of writing (up to AMX), but is added out of an abundance of caution.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
-
Andrew Cooper authored
We're soon going to need a compressed helper of the same form. The size of the uncompressed image depends on the single element with the largest offset + size. Sadly this isn't always the element with the largest index. Name the per-xstate-component cpu_policy structure, for legibility of the logic in xstate_uncompressed_size(). Cross-check with hardware during boot, and remove hw_uncompressed_size(). This means that the migration paths don't need to mess with XCR0 just to sanity check the buffer size. It also means we can drop the "fastpath" check against xfeature_mask (there to skip some XCR0 writes); this path is going to be dead logic the moment Xen starts using supervisor states itself. The users of hw_uncompressed_size() in xstate_init() can (and indeed need to) be replaced with CPUID instructions. They run with feature_mask in XCR0, and prior to setup_xstate_features() on the BSP. No practical change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
-
Andrew Cooper authored
This is a tangle, but it's a small step in the right direction. In the following change, xstate_init() is going to start using the Raw policy. calculate_raw_cpu_policy() is sufficiently separate from the other policies to safely move like this. No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
-
Andrew Cooper authored
Right now, xstate_ctxt_size() performs a cross-check of size with CPUID on every call. This is expensive, being used for domain create/migrate, as well as to service certain guest CPUID instructions. Instead, arrange to check the sizes once at boot. See the code comments for details. Right now, it just checks hardware against the algorithm expectations. Later patches will cross-check Xen's XSTATE calculations too. Introduce more X86_XCR0_* and X86_XSS_* constants for CPUID bits. This is to maximise coverage in the sanity check, even if we don't expect to use/virtualise some of these features any time soon. Leave HDC and HWP alone for now; we don't have CPUID bits for them stored nicely. Only perform the cross-checks when SELF_TESTS are active. It's only developers or new hardware liable to trip these checks, and Xen at least tracks "maximum value ever seen in xcr0" for the lifetime of the VM, which we don't want to be tickling in the general case.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
-
Andrew Cooper authored
The clobbering of this_cpu(xcr0) and this_cpu(xss) to architecturally invalid values is to force the subsequent set_xcr0() and set_msr_xss() to reload the hardware register. While XCR0 is reloaded in xstate_init(), MSR_XSS isn't. This causes get_msr_xss() to return the invalid value, and logic of the form:

    old = get_msr_xss();
    set_msr_xss(new);
    ...
    set_msr_xss(old);

to try and restore said invalid value. The architecturally invalid value must be purged from the cache, meaning the hardware register must be written at least once. This in turn highlights that the invalid value must only be used in the case that the hardware register is available.
Fixes: f7f4a523 ("x86/xstate: reset cached register values on resume")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
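The cached-register pattern and the failure mode can be sketched in userspace C. Names and the sentinel value are illustrative stand-ins, not Xen's actual variables; a plain global stands in for the MSR.

```c
#include <assert.h>
#include <stdint.h>

/* The cache is primed with an architecturally invalid value so that the
 * first real write never short-circuits.  The bug described above was
 * that the getter could return this sentinel before any write had purged
 * it, and a save/modify/restore sequence would then write it back. */
#define XSS_INVALID (~0ULL)

static uint64_t cached_xss = XSS_INVALID;  /* state as after S3 resume */
static uint64_t hw_xss;                    /* stands in for MSR_XSS */

static void set_msr_xss(uint64_t val)
{
    if (cached_xss != val) {   /* sentinel never matches: forces a write */
        hw_xss = val;
        cached_xss = val;
    }
}

static uint64_t get_msr_xss(void)
{
    return cached_xss;   /* only meaningful once written at least once */
}
```

Writing the register once on the resume path (as the fix does for XCR0) purges the sentinel, after which the save/restore idiom is safe.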
-
Andrew Cooper authored
These living in cache.h is inherited from Linux, but cache.h is not a terribly appropriate location for them to live. __read_mostly is an optimisation related to data placement in order to avoid having shared data in cachelines that are likely to be written to, but it really is just a section of the linked image separating data by usage patterns; it has nothing to do with cache sizes or flushing logic. Worse, __ro_after_init was only in xen/cache.h because __read_mostly was in arch/cache.h, and has literally nothing whatsoever to do with caches. Move the definitions into xen/sections.h, which in particular means that RISC-V doesn't need to repeat the problematic pattern. Take the opportunity to provide a short description of what these are used for. For now, leave TODO comments next to the other identical definitions. It turns out that unpicking cache.h is more complicated than it appears because a number of files use it for transitive dependencies.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
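The point that these are pure section-placement attributes can be seen from their shape. A simplified sketch in the style of the Xen/Linux pattern (assumption: the exact section names and macro spellings here are illustrative):

```c
#include <assert.h>

/* These macros do nothing but pick a section in the linked image; there is
 * no cache-size or flushing logic involved, which is the argument for
 * hosting them in a sections header rather than cache.h. */
#define __section(s)     __attribute__((__section__(s)))
#define __read_mostly    __section(".data.read_mostly")
#define __ro_after_init  __section(".data.ro_after_init")

/* Written rarely (e.g. at boot), read on hot paths: grouping such data
 * keeps it out of cachelines that are frequently dirtied. */
static unsigned int opt_example __read_mostly = 42;
```

The attribute only influences where the linker places the object; the variable behaves identically otherwise.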
-
Andrew Cooper authored
When centralising irq_ack_none(), different architectures had different names for the parameter. As its type is struct irq_desc *, it should be named desc. Make this consistent. No functional change.
Fixes: 8aeda4a2 ("arch/irq: Make irq_ack_none() mandatory")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <jgrall@amazon.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
-
Julien Grall authored
Michal reported that the gitlab CI is failing because of this series [1]. This reverts commit 6f9d90ea.
Signed-off-by: Julien Grall <jgrall@amazon.com>
-
Julien Grall authored
Michal reported that the gitlab CI is failing because of this series [1]. This reverts commit 53c5c99e. [1] https://gitlab.com/xen-project/xen/-/pipelines/1338067978
Signed-off-by: Julien Grall <jgrall@amazon.com>
-
Michal Orzel authored
Building Xen with CONFIG_STATIC_SHM=y results in a build failure:

arch/arm/static-shmem.c: In function 'process_shm':
arch/arm/static-shmem.c:327:41: error: 'gbase' may be used uninitialized [-Werror=maybe-uninitialized]
  327 |     if ( is_domain_direct_mapped(d) && (pbase != gbase) )
arch/arm/static-shmem.c:305:17: note: 'gbase' was declared here
  305 |     paddr_t gbase, pbase, psize;

This is because the commit cb1ddafd adds a check referencing gbase/pbase variables which were not yet assigned a value. Fix it.
Fixes: cb1ddafd ("xen/arm/static-shmem: Static-shmem should be direct-mapped for direct-mapped domains")
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
-
Henry Wang authored
With the new allocation strategy of the Dom0less DomUs XenStore page, update the doc of the late XenStore init protocol accordingly.
Signed-off-by: Henry Wang <xin.wang2@amd.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
-
Henry Wang authored
There are use cases (for example using the PV driver) in Dom0less setup that require Dom0less DomUs to start immediately with Dom0, but initialize XenStore later after Dom0's successful boot and call to the init-dom0less application. An error message can be seen from the init-dom0less application on 1:1 direct-mapped domains:

```
Allocating magic pages
memory.c:238:d0v0 mfn 0x39000 doesn't belong to d1
Error on alloc magic pages
```

The "magic page" is a terminology used in the toolstack for reserved pages the VM uses to access virtual platform capabilities. Currently the magic pages for Dom0less DomUs are populated by the init-dom0less app through populate_physmap(), and populate_physmap() automatically assumes gfn == mfn for 1:1 direct-mapped domains. This cannot be true for the magic pages that are allocated later from the init-dom0less application executed in Dom0. For domains using statically allocated memory but not 1:1 direct-mapped, a similar error "failed to retrieve a reserved page" can be seen, as the reserved memory list is empty at that time. Since for init-dom0less the magic page region is only for XenStore, to solve the above issue this commit allocates the XenStore page for Dom0less DomUs at domain construction time. The PFN will be noted and communicated to the init-dom0less application executed from Dom0. To keep the XenStore late init protocol, set the connection status to XENSTORE_RECONNECT. Currently the GUEST_MAGIC_BASE in the init-dom0less application is hardcoded, which will lead to failures for 1:1 direct-mapped Dom0less DomUs. Since the guest magic region allocation from init-dom0less is for XenStore, and the XenStore page is now allocated from the hypervisor, instead of hardcoding the guest magic pages region, use xc_hvm_param_get() to get the XenStore page PFN. Rename alloc_xs_page() to get_xs_page() to reflect the changes. With this change, some existing code is not needed anymore, including: (1) the definition of the XenStore page offset; (2) calls to xc_domain_setmaxmem() and xc_clear_domain_page(), as we don't need to set the max mem and clear the page anymore; (3) foreign mapping of the XenStore page, setting of XenStore interface status and HVM_PARAM_STORE_PFN from init-dom0less, as they are set by the hypervisor. Take the opportunity to do some coding style improvements when possible.
Reported-by: Alec Kwapis <alec.kwapis@medtronic.com>
Suggested-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Signed-off-by: Henry Wang <xin.wang2@amd.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>
Acked-by: Anthony PERARD <anthony.perard@vates.tech>
-
Henry Wang authored
Currently, users are allowed to map static shared memory in a non-direct-mapped way for direct-mapped domains. This can lead to clashing of guest memory spaces. Also, the current extended region finding logic only removes the host physical addresses of the static shared memory areas for direct-mapped domains, which may be inconsistent with the guest memory map if users map the static shared memory in a non-direct-mapped way. This will lead to incorrect extended region calculation results. To make things easier, add the restriction that static shared memory should also be direct-mapped for direct-mapped domains. Check that the host physical address matches the guest physical address when parsing the device tree. Document this restriction in the doc.
Signed-off-by: Henry Wang <xin.wang2@amd.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Acked-by: Michal Orzel <michal.orzel@amd.com>
-
- Jun 18, 2024
-
-
Andrew Cooper authored
struct type_descriptor is arranged with a NUL-terminated string following the kind/info fields. The only reason this doesn't trip UBSAN detection itself (on more modern compilers at least) is because struct type_descriptor is only referenced in suppressed regions. Switch the declaration to be a real flexible member. No functional change.
Fixes: 00fcf4dd ("xen/ubsan: Import ubsan implementation from Linux 4.13")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
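The shape of the fix can be shown with a simplified version of the structure (field names follow the commit's description; the allocation helper is illustrative). A flexible array member tells the compiler the trailing string's length is determined by the allocation, so indexing past the struct's declared size is no longer out of bounds:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Simplified sketch: the trailing NUL-terminated name is declared as a
 * real flexible array member, rather than e.g. a fixed char name[1]
 * that is deliberately over-indexed (which UBSAN would flag). */
struct type_descriptor {
    uint16_t type_kind;
    uint16_t type_info;
    char type_name[];   /* flexible array member, NUL-terminated */
};

static struct type_descriptor *make_desc(const char *name)
{
    /* sizeof(*d) excludes the flexible member; add room for the string. */
    struct type_descriptor *d = malloc(sizeof(*d) + strlen(name) + 1);
    if (d) {
        d->type_kind = 0;
        d->type_info = 0;
        strcpy(d->type_name, name);
    }
    return d;
}
```

Behaviour is unchanged; only the declaration now matches what the memory layout always was.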
-
Currently there's logic in fixup_irqs() that attempts to prevent _assign_irq_vector() from failing, as fixup_irqs() is required to evacuate all interrupts from the CPUs not present in the input mask. The current logic in fixup_irqs() is incomplete, as it doesn't deal with interrupts that have move_cleanup_count > 0 and a non-empty ->arch.old_cpu_mask field. Instead of attempting to fixup the interrupt descriptor in fixup_irqs() so that _assign_irq_vector() cannot fail, introduce logic in _assign_irq_vector() to deal with interrupts that have either move_{in_progress,cleanup_count} set and no remaining online CPUs in ->arch.cpu_mask. If _assign_irq_vector() is requested to move an interrupt in the state described above, first attempt to see if ->arch.old_cpu_mask contains any valid CPUs that could be used as fallback, and if that's the case do move the interrupt back to the previous destination. Note this is easier because the vector hasn't been released yet, so there's no need to allocate and setup a new vector on the destination. Due to the logic in fixup_irqs() that clears offline CPUs from ->arch.old_cpu_mask (and releases the old vector if the mask becomes empty) it shouldn't be possible to get into _assign_irq_vector() with ->arch.move_{in_progress,cleanup_count} set but no online CPUs in ->arch.old_cpu_mask. However if ->arch.move_{in_progress,cleanup_count} is set and the interrupt has also changed affinity, it's possible the members of ->arch.old_cpu_mask are no longer part of the affinity set; in that case move the interrupt to a different CPU that is part of the provided mask and keep the current ->arch.old_{cpu_mask,vector} for the pending interrupt movement to be completed.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
-
Given the current logic it's possible for ->arch.old_cpu_mask to get out of sync: if a CPU set in old_cpu_mask is offlined and then onlined again without old_cpu_mask having been updated, the data in the mask will no longer be accurate, as when brought back online the CPU will no longer have old_vector configured to handle the old interrupt source. If there's an interrupt movement in progress, and the to be offlined CPU (which is the call context) is in the old_cpu_mask, clear it and update the mask, so it doesn't contain stale data. Note that when the system is going down fixup_irqs() will be called by smp_send_stop() from CPU 0 with a mask with only CPU 0 on it, effectively asking to move all interrupts to the current caller (CPU 0) which is the only CPU to remain online. In that case we don't care to migrate interrupts that are in the process of being moved, as it's likely we won't be able to move all interrupts to CPU 0 due to vector shortage anyway.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
-
Jan Beulich authored
Intel CPUs have a MSR bit to limit CPUID enumeration to leaf two. If this bit is set by the BIOS then CPUID evaluation does not work when data from any leaf greater than two is needed; early_cpu_init() in particular wants to collect leaf 7 data. Cure this by unlocking CPUID right before evaluating anything which depends on the maximum CPUID leaf being greater than two. Inspired by (and description cloned from) Linux commit 0c2f6d04619e ("x86/topology/intel: Unlock CPUID before evaluating anything").
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
-
Nicola Vetrini authored
Rules 20.12 and 14.4 are now clean on ARM and x86, so they are added to the list of clean guidelines. Some guidelines listed in the additional clean section for ARM are also clean on x86, so they can be removed from there. No functional change.
Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
[stefano: remove 20.9 from commit message]
Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
-
Nicola Vetrini authored
The DEFINE macro in asm-offsets.c (for all architectures) still generates violations despite the file(s) being excluded from compliance, due to the fact that in its expansion it sometimes refers to entities in non-excluded files. These corner cases are deviated by the configuration. No functional change.
Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
-
- Jun 14, 2024
-
-
Penny authored
This commit describes the new scenario where the host address is not provided in the "xen,shared-mem" property, and a new example is added to the page to explain it in detail. Take the occasion to fix some typos in the page.
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
-
Luca Fancellu authored
This commit implements the logic to have the static shared memory banks allocated from the Xen heap instead of having the host physical address passed by the user. When the host physical address is not supplied, the physical memory is taken from the Xen heap using allocate_domheap_memory; the allocation needs to occur at the first handled DT node and the allocated banks need to be saved somewhere. Introduce 'shm_heap_banks' for that reason, a struct that will hold the banks allocated from the heap; its field bank[].shmem_extra will be used to point to the bootinfo shared memory banks' .shmem_extra space, so that there is no further allocation of memory and every bank in shm_heap_banks can be safely identified by the shm_id to reconstruct its traceability and whether it was allocated or not. A search into 'shm_heap_banks' will reveal if the banks were allocated or not, in case the host address is not passed, and the callback given to allocate_domheap_memory will store the banks in the structure and map them to the current domain. To do that, some changes to acquire_shared_memory_bank are made to let it differentiate if the bank is from the heap; if it is, then assign_pages is called for every bank. When the bank is already allocated, for every bank allocated with the corresponding shm_id, handle_shared_mem_bank is called and the mappings are done.
Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
-
Luca Fancellu authored
The function allocate_bank_memory allocates pages from the heap and maps them to the guest using guest_physmap_add_page. As preparation work to support static shared memory banks when the host physical address is not provided, Xen needs to allocate memory from the heap, so rework allocate_bank_memory, moving the page allocation out into a new function called allocate_domheap_memory. The function allocate_domheap_memory takes a callback function and a pointer to some extra information passed to the callback, and this function will be called for every region, until a defined size is reached. In order to keep allocate_bank_memory functionality, the callback passed to allocate_domheap_memory is a wrapper for guest_physmap_add_page. Let allocate_domheap_memory be externally visible, in order to use it in the future from the static shared memory module. Take the opportunity to change the signature of allocate_bank_memory and remove the 'struct domain' parameter, which can be retrieved from 'struct kernel_info'. No functional change is intended.
Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
-
Luca Fancellu authored
Handle the parsing of the 'xen,shared-mem' property when the host physical address is not provided. This commit introduces the logic to parse it, but the functionality is still not implemented and will be part of future commits. Rework the logic inside process_shm_node to check the shm_id before doing the other checks, because it eases the logic itself; add more comments on the logic. Now when the host physical address is not provided, the value INVALID_PADDR is chosen to signal this condition and it is stored as the start of the bank. Due to that change, early_print_info_shmem and init_sharedmem_pages are also changed, to not handle banks with start equal to INVALID_PADDR. Another change is done inside meminfo_overlap_check, to skip banks whose start address is INVALID_PADDR; that function is used to check banks from reserved memory, shared memory and ACPI, and since the comment above the function states that wrapping around is not handled, it's unlikely for these banks to have INVALID_PADDR as start address. The same change is done inside the consider_modules, find_unallocated_memory and dt_unreserved_regions functions, in order to skip banks that start with INVALID_PADDR from any computation. The changes above hold because of this consideration.
Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
-
Penny authored
We are doing foreign memory mapping for static shared memory, and there is a great possibility that it could be super mapped. But today, p2m_put_l3_page could not handle superpages. This commit implements a new function p2m_put_l2_superpage to handle level 2 superpages, specifically for helping put extra references for foreign superpages. Modify relinquish_p2m_mapping as well to take into account preemption when we have a level-2 foreign mapping. Currently level 1 superpages are not handled because Xen is not preemptible and therefore some work is needed to handle such superpages, for which at some point Xen might end up freeing memory and therefore for such a big mapping it could end up in a very long operation.
Signed-off-by: Penny Zheng <penny.zheng@arm.com>
Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
-
Luca Fancellu authored
Wrap the code and logic that is calling assign_shared_memory and map_regions_p2mt into a new function 'handle_shared_mem_bank'; it will become useful later when the code will allow the user not to pass the host physical address.
Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
-
Luca Fancellu authored
The current static shared memory code is using bootinfo banks when it needs to find the number of borrowers, so every time assign_shared_memory is called, the bank is searched in the bootinfo.shmem structure. There is nothing wrong with it, however the bank can also be used to retrieve the start address and size, and to pass fewer arguments to assign_shared_memory. When retrieving the information from the bootinfo bank, it's also possible to move the checks on alignment to process_shm_node in the early stages. So create a new function find_shm_bank_by_id() which takes a 'struct shared_meminfo' structure and the shared memory ID, to look for a bank with a matching ID, take the physical host address and size from the bank, pass the bank to assign_shared_memory() removing the now unnecessary arguments, and finally remove the acquire_nr_borrower_domain() function since the information can now be extracted from the passed bank. Move the "xen,shm-id" parsing early in process_shm to bail out quickly in case of errors (unlikely); as said above, move the checks on alignment to process_shm_node. A drawback of this change is that the bootinfo is now used also when the bank doesn't need to be allocated; however it will be convenient later to use it as an argument for assign_shared_memory when dealing with the use case where the host physical address is not supplied by the user.
Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
-
- Jun 13, 2024
-
-
Jan Beulich authored
mfn_valid() is RAM-focused; it will often return false for MMIO. Yet access to actual MMIO space should not generally be restricted to UC only; especially video frame buffer accesses are unduly affected by such a restriction. Since, as of 777c71d3 ("x86/EPT: avoid marking non-present entries for re-configuring"), the function won't be called with INVALID_MFN or, worse, truncated forms thereof anymore, we can fully drop that check.
Fixes: 81fd0d3c ("x86/hvm: simplify 'mmio_direct' check in epte_get_entry_emt()")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
-
Jan Beulich authored
For non-present entries EMT, like most other fields, is meaningless to hardware. Make the logic in ept_set_entry() setting the field (and iPAT) conditional upon dealing with a present entry, leaving the value at 0 otherwise. This has two effects for epte_get_entry_emt() which we'll want to leverage subsequently:
1) The call moved here now won't be issued with INVALID_MFN anymore (a respective BUG_ON() is being added).
2) Neither of the other two calls could now be issued with a truncated form of INVALID_MFN anymore (as long as there's no bug anywhere marking an entry present when that was populated using INVALID_MFN).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
-
Jan Beulich authored
mfn_valid() granularity is (currently) 256Mb. Therefore the start of a 1Gb page passing the test doesn't necessarily mean all parts of such a range would also pass. Yet using the result of mfn_to_page() on an MFN which doesn't pass mfn_valid() checking is liable to result in a crash (the invocation of mfn_to_page() alone is presumably "just" UB in such a case).
Fixes: ca24b2ff ("x86/hvm: set 'ipat' in EPT for special pages")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
-
Jens Wiklander authored
Add support for FF-A notifications, currently limited to an SP (Secure Partition) sending an asynchronous notification to a guest. Guests and Xen itself are made aware of pending notifications with an interrupt. The interrupt handler triggers a tasklet to retrieve the notifications using the FF-A ABI and deliver them to their destinations. Update ffa_partinfo_domain_init() to return an error code like ffa_notif_domain_init().
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
-