- Mar 05, 2022
-
Jonathan Cameron authored
The Device Serial Number Extended Capability (PCI r6.0 sec 7.9.3) provides a standard way to expose a device serial number as an IEEE-defined 64-bit extended unique identifier (EUI-64). CXL 2.0 section 8.1.12.2, Memory Device PCIe Capabilities and Extended Capabilities, requires this to be used to uniquely identify CXL memory devices. Signed-off-by:
Jonathan Cameron <Jonathan.Cameron@huawei.com>
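As a rough illustration of the register layout described above, the sketch below packs a 64-bit serial number into the three dwords of the capability. It assumes the usual PCIe extended capability header layout (capability ID in bits 15:0, version in bits 19:16, next capability offset in bits 31:20) and capability ID 0x0003 for Device Serial Number. `build_dsn_capability` is a hypothetical helper written for this example, not a QEMU function.

```python
import struct

# Assumed value: extended capability ID for Device Serial Number.
PCI_EXT_CAP_ID_DSN = 0x0003


def build_dsn_capability(serial: int, next_offset: int = 0, version: int = 1) -> bytes:
    """Return the 12 bytes of a Device Serial Number capability (hypothetical helper)."""
    if not 0 <= serial < (1 << 64):
        raise ValueError("serial number must fit in 64 bits")
    # Header dword: ID [15:0], version [19:16], next capability offset [31:20].
    header = PCI_EXT_CAP_ID_DSN | (version << 16) | (next_offset << 20)
    lower = serial & 0xFFFFFFFF          # serial number, lower dword
    upper = (serial >> 32) & 0xFFFFFFFF  # serial number, upper dword
    # Config space is little-endian.
    return struct.pack("<III", header, lower, upper)
```

The serial number itself is treated as an opaque 64-bit value here; how it is supplied to the device is outside the scope of this sketch.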
-
Huai-Cheng Kuo authored
The Data Object Exchange implementation of CXL Coherent Device Attribute Table (CDAT). This implementation refers to "Coherent Device Attribute Table Specification, Rev. 1.02, Oct. 2020" and "Compute Express Link Specification, Rev. 2.0, Oct. 2020". The CDAT can be specified in two ways. One is to add ",cdat=<filename>" to the "-device cxl-type3" command option. The file must provide the whole CDAT table in binary form. The other is to use the default CDAT value created by build_cdat_table in hw/cxl/cxl-cdat.c. A DOE capability for CDAT is added to hw/mem/cxl_type3.c at capability offset 0x190. The OS can generate config reads/writes to this capability range to request the CDAT data. Signed-off-by:
hchkuo <hchkuo@avery-design.com.tw> Signed-off-by:
Chris Browy <cbrowy@avery-design.com> Signed-off-by:
Jonathan Cameron <Jonathan.Cameron@huawei.com>
-
Huai-Cheng Kuo authored
The Data Object Exchange implementation of CXL Compliance Mode refers to "Compute Express Link (CXL) Specification, Rev. 2.0, Oct. 2020". The data structures for CXL compliance requests and responses are added to the header. Due to the limited scope of QEMU, most compliance responses only return the corresponding length. A DOE capability for CXL Compliance is added to hw/mem/cxl_type3.c at capability offset 0x160. The OS can generate config reads/writes to this capability range to request the Compliance info. Signed-off-by:
hchkuo <hchkuo@avery-design.com.tw> Signed-off-by:
Chris Browy <cbrowy@avery-design.com> Signed-off-by:
Jonathan Cameron <Jonathan.Cameron@huawei.com>
-
Huai-Cheng Kuo authored
PCIe Data Object Exchange (DOE) implementation for QEMU, referring to "PCIe Data Object Exchange ECN, March 12, 2020". The patch supports multiple DOE capabilities for a single PCIe device in QEMU. For each capability, a static array of DOEProtocol should be passed to pcie_doe_init(). The protocols in that array will be registered under the DOE capability structure. For each protocol, the vendor ID, type, and corresponding callback function (handle_request()) should be implemented. This callback function defines how a DOE request for the corresponding protocol will be handled. pcie_doe_{read/write}_config() must be appended to the corresponding PCI device's config_read/write() handler to enable DOE access. pcie_doe_read_config() returns false if the pci_config_read() offset is not within the DOE capability range. pcie_doe_write_config() returns early if the offset is not within the related DOE range. Signed-off-by:
hchkuo <hchkuo@avery-design.com.tw> Signed-off-by:
Chris Browy <cbrowy@avery-design.com> Signed-off-by:
Jonathan Cameron <Jonathan.Cameron@huawei.com>
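The registration and dispatch scheme described above can be modelled in a few lines. This is only a sketch of the idea, not the QEMU C API: `DOEProtocol` mirrors the name used in the commit message, while `DOECap`, `covers()` and `dispatch()` are hypothetical names. It also assumes the first header dword of a data object carries the vendor ID in bits 15:0 and the data object type in bits 23:16.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass(frozen=True)
class DOEProtocol:
    vendor_id: int
    data_obj_type: int
    # Callback that turns a request (list of dwords) into a response.
    handle_request: Callable[[Sequence[int]], List[int]]


class DOECap:
    """Toy model of one DOE capability and its registered protocols."""

    def __init__(self, offset: int, size: int, protocols: Sequence[DOEProtocol]):
        self.offset = offset
        self.size = size
        self.protocols = list(protocols)

    def covers(self, addr: int) -> bool:
        # Config accesses outside this range are not for this capability.
        return self.offset <= addr < self.offset + self.size

    def dispatch(self, request: Sequence[int]) -> Optional[List[int]]:
        # Assumed header layout: vendor ID [15:0], object type [23:16].
        vendor_id = request[0] & 0xFFFF
        obj_type = (request[0] >> 16) & 0xFF
        for proto in self.protocols:
            if proto.vendor_id == vendor_id and proto.data_obj_type == obj_type:
                return proto.handle_request(request)
        return None  # no protocol registered for this vendor/type pair
```

A device with several DOE capabilities would hold one such object per capability and try each in turn from its config read/write handlers, falling through to the default handling when none of them covers the offset.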
-
Huai-Cheng Kuo authored
Macros for the vendor ID of PCI-SIG mentioned in "PCIe Data Object Exchange ECN, March 12, 2020" and the size of PCIe Data Object Exchange. Signed-off-by:
hchkuo <hchkuo@avery-design.com.tw> Signed-off-by:
Chris Browy <cbrowy@avery-design.com> Signed-off-by:
Jonathan Cameron <Jonathan.Cameron@huawei.com>
-
Huai-Cheng Kuo authored
Linux standard header for the registers of PCI Data Object Exchange (DOE). This header might be generated via script. The DOE feature should be added in a future Linux release, so this patch can be removed then. Signed-off-by:
hchkuo <hchkuo@avery-design.com.tw> Signed-off-by:
Chris Browy <cbrowy@avery-design.com> Signed-off-by:
Jonathan Cameron <Jonathan.Cameron@huawei.com>
-
Jonathan Cameron authored
Extend the walk of the CXL bus during interleave decoding to take into account one layer of switches. Whilst theoretically CXL 2.0 allows multiple switch levels, in the vast majority of use cases only one level is expected, and currently that is all the proposed Linux support provides. Signed-off-by:
Jonathan Cameron <Jonathan.Cameron@huawei.com>
-
Jonathan Cameron authored
Emulation of a simple CXL Switch downstream port. The Device ID has been allocated for this use. Signed-off-by:
Jonathan Cameron <Jonathan.Cameron@huawei.com>
-
Jonathan Cameron authored
An initial simple upstream port emulation to allow the creation of CXL switches. The Device ID has been allocated for this use. Signed-off-by:
Jonathan Cameron <Jonathan.Cameron@huawei.com>
-
Jonathan Cameron authored
Provide an introduction to the main components of a CXL system, with a detailed explanation of memory interleaving, example command lines and kernel configuration. This was a challenging document to write due to the need to extract only that subset of CXL information which is relevant to either users of QEMU emulation of CXL or to those interested in the implementation. Much of CXL is concerned with specific elements of the protocol, management of memory pooling etc., which are simply not relevant to what is currently planned for CXL emulation in QEMU. All comments welcome. Signed-off-by:
Jonathan Cameron <Jonathan.Cameron@huawei.com>
-
Jonathan Cameron authored
Add a single complex case for aarch64 virt machine. Signed-off-by:
Jonathan Cameron <Jonathan.Cameron@huawei.com>
-
Jonathan Cameron authored
Code based on i386/pc enablement. The memory layout places space for 16 host bridge register regions after the GIC_REDIST2 in the extended memmap. The CFMWs are placed above the extended memmap. Only create the CEDT table if cxl=on is set for the machine. Signed-off-by:
Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by:
Ben Widawsky <ben.widawsky@intel.com>
-
Ben Widawsky authored
Add CXL Fixed Memory Windows to the CXL tests. Signed-off-by:
Ben Widawsky <ben.widawsky@intel.com> Co-developed-by:
Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by:
Jonathan Cameron <Jonathan.Cameron@huawei.com>
-
Jonathan Cameron authored
Tables that differ from normal Q35 tables when running the CXL test. Signed-off-by:
Jonathan Cameron <Jonathan.Cameron@huawei.com>
-
Jonathan Cameron authored
The DSDT includes several CXL specific elements and the CEDT table is only present if we enable CXL. The test exercises all current functionality with several CFMWS, CHBS structures in CEDT and ACPI0016/ACPI0017 and _OSC entries in DSDT. Signed-off-by:
Jonathan Cameron <Jonathan.Cameron@huawei.com>
-
Jonathan Cameron authored
Add exceptions for the DSDT and the new CEDT tables specific to a new CXL test in the following patch. Signed-off-by:
Jonathan Cameron <Jonathan.Cameron@huawei.com>
-
Jonathan Cameron authored
Add the CFMWs memory regions to the memory map and adjust the PCI window to avoid hitting the same memory. Signed-off-by:
Jonathan Cameron <jonathan.cameron@huawei.com>
-
Ben Widawsky authored
Add a trivial handler for now to cover the root bridge where we could do some error checking in future. Signed-off-by:
Ben Widawsky <ben.widawsky@intel.com> Signed-off-by:
Jonathan Cameron <Jonathan.Cameron@huawei.com>
-
Jonathan Cameron authored
In order to implement memory interleaving we need a means to proxy the calls. Adding mem_ops allows such proxying. Note that this should have no impact on use cases not using _dispatch_read/write. For now, only file backed hostmem is considered, to seek feedback on the approach before considering other hostmem backends. Signed-off-by:
Jonathan Cameron <jonathan.cameron@huawei.com>
-
Jonathan Cameron authored
These memops perform interleave decoding, walking down the CXL topology from CFMWS described host interleave decoder via CXL host bridge HDM decoders, through the CXL root ports and finally call CXL type 3 specific read and write functions. Note that, whilst functional the current implementation does not support:
* switches
* multiple HDM decoders at a given level.
* unaligned accesses across the interleave boundaries
Signed-off-by:
Jonathan Cameron <jonathan.cameron@huawei.com>
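The target selection performed at each level of that walk can be sketched as follows. This is a minimal model, assuming simple modulo interleaving over a single decoder per level; `select_target` and its parameters are hypothetical names for this example and do not correspond to functions in the patch.

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def select_target(hpa: int, base: int, size: int, granularity: int,
                  targets: Sequence[T]) -> Optional[T]:
    """Pick the interleave target for a host physical address.

    Returns None when the address is outside the decoded range.
    """
    if not base <= hpa < base + size:
        return None
    # Consecutive granularity-sized chunks rotate round-robin over the targets.
    chunk = (hpa - base) // granularity
    return targets[chunk % len(targets)]
```

Applying the same function once with the CFMWS parameters (to pick a host bridge) and once with the host bridge HDM decoder parameters (to pick a root port) gives the two-step walk described in the message, under the stated assumptions.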
-
Jonathan Cameron authored
Once a read or write reaches a CXL type 3 device, the HDM decoders on the device are used to establish the Device Physical Address which should be accessed. These functions perform the required maths and then directly access the hostmem->mr to fulfil the actual operation. Note that failed writes are silent, but failed reads return poison. Note this is based loosely on: https://lore.kernel.org/qemu-devel/20200817161853.593247-6-f4bug@amsat.org/ [RFC PATCH 0/9] hw/misc: Add support for interleaved memory accesses Only lightly tested so far. More complex test cases yet to be written. Signed-off-by:
Jonathan Cameron <Jonathan.Cameron@huawei.com>
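A simplified model of that address maths and of the failure behaviour is sketched below. It assumes modulo interleaving with a single decoder and that the access has already been routed to the correct device. `hpa_to_dpa`, `Type3Model` and the `POISON` marker are hypothetical names for this example; the actual poison value returned by the emulation is not stated in the message, so `None` is used as a placeholder.

```python
POISON = None  # placeholder marker; the real poison pattern is not specified here


def hpa_to_dpa(hpa: int, base: int, granularity: int, ways: int) -> int:
    """Strip the interleave selection out of a host address to get a device address."""
    chunk, within = divmod(hpa - base, granularity)
    # Every `ways`-th chunk lands on this device, so chunks are packed densely.
    return (chunk // ways) * granularity + within


class Type3Model:
    """Toy byte-addressed type 3 device with one HDM decoder."""

    def __init__(self, size: int, base: int, granularity: int, ways: int):
        self.mem = bytearray(size)
        self.base, self.granularity, self.ways = base, granularity, ways

    def _dpa(self, hpa: int):
        if hpa < self.base:
            return None
        dpa = hpa_to_dpa(hpa, self.base, self.granularity, self.ways)
        return dpa if dpa < len(self.mem) else None

    def read(self, hpa: int):
        dpa = self._dpa(hpa)
        return POISON if dpa is None else self.mem[dpa]  # failed reads return poison

    def write(self, hpa: int, value: int) -> None:
        dpa = self._dpa(hpa)
        if dpa is not None:  # failed writes are silently dropped
            self.mem[dpa] = value & 0xFF
```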
-
Jonathan Cameron authored
Accessor to get hold of the cxl state for a CXL host bridge without exposing the internals of the implementation. Signed-off-by:
Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by:
Alex Bennée <alex.bennee@linaro.org>
-
Jonathan Cameron authored
Simple function to search a PCIBus to find a port by its port number. CXL interleave decoding uses the port number as a target so it is necessary to locate the port when doing interleave decoding. Signed-off-by:
Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by:
Alex Bennée <alex.bennee@linaro.org>
-
Jonathan Cameron authored
This adds code to instantiate the slightly extended ACPI root port description in DSDT as per the CXL 2.0 specification. Basically a cut and paste job from the i386/pc code. Signed-off-by:
Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by:
Ben Widawsky <ben.widawsky@intel.com> Reviewed-by:
Alex Bennée <alex.bennee@linaro.org>
-
Ben Widawsky authored
The CEDT CXL Fixed Memory Window Structures (CFMWs) define regions of the host physical address map which (via an impdef means) are configured such that they have a particular interleave setup across one or more CXL Host Bridges. Reported-by:
Alison Schofield <alison.schofield@intel.com> Signed-off-by:
Ben Widawsky <ben.widawsky@intel.com> Signed-off-by:
Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by:
Alex Bennée <alex.bennee@linaro.org>
-
Jonathan Cameron authored
The concept of these is introduced in [1] in terms of the description in the CEDT ACPI table. The principle is more general. Unlike the routing once traffic hits the CXL root bridges, the host system memory address routing is implementation defined and effectively static once observable by standard / generic system software. Each CXL Fixed Memory Window (CFMW) is a region of PA space which has fixed system dependent routing configured so that accesses can be routed to the CXL devices below a set of target root bridges. The accesses may be interleaved across multiple root bridges. For QEMU we could have fully specified these regions in terms of a base PA + size, but as the absolute address does not matter it is simpler to let individual platforms place the memory regions.
Examples:
-cxl-fixed-memory-window targets.0=cxl.0,size=128G
-cxl-fixed-memory-window targets.0=cxl.1,size=128G
-cxl-fixed-memory-window targets.0=cxl.0,targets.1=cxl.1,size=256G,interleave-granularity=2k
Specifies
* 2x 128G regions not interleaved across root bridges, one for each of the root bridges with ids cxl.0 and cxl.1
* 256G region interleaved across root bridges with ids cxl.0 and cxl.1 with a 2k interleave granularity.
When system software enumerates the devices below a given root bridge it can then decide which CFMW to use. If no interleave is desired (or possible) it can use the appropriate CFMW for the root bridge in question. If there are suitable devices to interleave across the two root bridges then it may use the 3rd CFMW. A number of other designs were considered but the following constraints made it hard to adapt existing QEMU approaches to this particular problem.
1) The size must be known before a specific architecture / board brings up its PA memory map. We need to set up an appropriate region.
2) Using links to the host bridges provides a clean command line interface but these links cannot be established until command line devices have been added.
Hence the two step process used here of first establishing the size, interleave-ways and granularity + caching the ids of the host bridges and then, once available finding the actual host bridges so they can be used later to support interleave decoding. [1] CXL 2.0 ECN: CEDT CFMWS & QTG DSM (computeexpresslink.org / specifications) Signed-off-by:
Jonathan Cameron <jonathan.cameron@huawei.com>
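The two-step process can be illustrated with a small model: first parse the option string and cache the host bridge ids, then resolve those ids once the devices exist. This is a sketch only. `parse_size`, `parse_cfmw` and `resolve_targets` are hypothetical helpers, the size-suffix handling is an assumption, and QEMU's own option parsing is not reproduced here.

```python
import re

_SUFFIX = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30, "T": 1 << 40}


def parse_size(text: str) -> int:
    """Parse sizes such as '2k' or '256G' (assumed suffix convention)."""
    match = re.fullmatch(r"(\d+)([kKMGT]?)", text)
    if match is None:
        raise ValueError(f"bad size: {text!r}")
    return int(match.group(1)) * _SUFFIX[match.group(2).upper()]


def parse_cfmw(option: str) -> dict:
    """Step 1: record size, granularity and the ids of the target host bridges."""
    targets, props = {}, {}
    for item in option.split(","):
        key, _, value = item.partition("=")
        if key.startswith("targets."):
            targets[int(key.split(".", 1)[1])] = value  # id only, no device yet
        elif key in ("size", "interleave-granularity"):
            props[key] = parse_size(value)
        else:
            raise ValueError(f"unknown property: {key!r}")
    ordered = [targets[i] for i in range(len(targets))]  # KeyError on gaps
    return {
        "targets": ordered,
        "size": props["size"],
        "interleave_granularity": props.get("interleave-granularity"),
    }


def resolve_targets(window: dict, host_bridges: dict) -> list:
    """Step 2: once command line devices exist, map cached ids to host bridges."""
    return [host_bridges[name] for name in window["targets"]]
```

Deferring the lookup keeps the window size available early enough for the board to lay out its memory map, which is the first constraint listed above.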
-
Jonathan Cameron authored
Both registers and the CFMWS entries in CEDT use simple encodings for the number of interleave ways and the interleave granularity. Introduce simple conversion functions to/from the unencoded number / size. So far the iw decode has not been needed so it is not implemented. Signed-off-by:
Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by:
Alex Bennée <alex.bennee@linaro.org>
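A sketch of such conversions is shown below. It assumes the power-of-two encodings of CXL 2.0, where the granularity is 256 bytes shifted left by the encoded value (256 B up to 16 KiB) and the number of ways is 1 shifted left by the encoded value (1 up to 16). The function names are hypothetical and are not the ones used in the patch.

```python
def encode_granularity(size_bytes: int) -> int:
    """Encode an interleave granularity in bytes (assumed: 256 << code)."""
    is_pow2 = size_bytes > 0 and size_bytes & (size_bytes - 1) == 0
    if not is_pow2 or not 256 <= size_bytes <= 16384:
        raise ValueError(f"unsupported granularity: {size_bytes}")
    return size_bytes.bit_length() - 1 - 8  # log2(size) - 8


def decode_granularity(code: int) -> int:
    """Decode an encoded granularity back to bytes."""
    if not 0 <= code <= 6:
        raise ValueError(f"unsupported granularity encoding: {code}")
    return 256 << code


def encode_ways(ways: int) -> int:
    """Encode the number of interleave ways (assumed: 1 << code)."""
    if ways not in (1, 2, 4, 8, 16):
        raise ValueError(f"unsupported interleave ways: {ways}")
    return ways.bit_length() - 1  # log2(ways)
```

As in the commit message, no decode for the interleave ways is provided here.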
-
Ben Widawsky authored
The CXL Early Discovery Table is defined in the CXL 2.0 specification as a way for the OS to get CXL specific information from the system firmware. CXL 2.0 specification adds an _HID, ACPI0016, for CXL capable host bridges, with a _CID of PNP0A08 (PCIe host bridge). CXL aware software is able to use this to initiate the proper _OSC method, and get the _UID which is referenced by the CEDT. Therefore the existence of an ACPI0016 device allows a CXL aware driver to perform the necessary actions. For a CXL capable OS, this works. For a CXL unaware OS, this works. CEDT awareness requires more. The motivation for ACPI0017 is to provide the possibility of having a Linux CXL module that can work on a legacy Linux kernel. Linux core PCI/ACPI, which won't be built as a module, will see the _CID of PNP0A08 and bind a driver to it. If we later loaded a driver for ACPI0016, Linux wouldn't be able to bind it to the hardware because it has already bound the PNP0A08 driver. The ACPI0017 device provides an object to which a Linux driver can bind; that driver will be used to walk the CXL topology and do everything that we would have preferred to do with ACPI0016. There is another motivation for an ACPI0017 device which isn't implemented here. An operating system needs an attach point for a non-volatile region provider that understands cross-hostbridge interleaving. Since QEMU emulation doesn't support interleaving yet, this is more important on the OS side, for now. As of CXL 2.0 spec, only 1 sub structure is defined, the CXL Host Bridge Structure (CHBS) which is primarily useful for telling the OS exactly where the MMIO for the host bridge is. Link: https://lore.kernel.org/linux-cxl/20210115034911.nkgpzc756d6qmjpl@intel.com/T/#t Signed-off-by:
Ben Widawsky <ben.widawsky@intel.com> Signed-off-by:
Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by:
Alex Bennée <alex.bennee@linaro.org>
-
Ben Widawsky authored
CXL 2.0 specification adds 2 new dwords to the existing _OSC definition from PCIe. The new dwords are accessed with a new uuid. This implementation supports what is in the specification. Signed-off-by:
Ben Widawsky <ben.widawsky@intel.com> Signed-off-by:
Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by:
Alex Bennée <alex.bennee@linaro.org>
-
Ben Widawsky authored
CXL host bridges themselves may have MMIO. Since host bridges don't have a BAR they are treated as special for MMIO. This patch includes i386/pc support. Also hook up the device reset now that we have the MMIO space in which the results are visible. Note that we duplicate the PCI express case for the aml_build but the implementations will diverge when the CXL specific _OSC is introduced. Signed-off-by:
Ben Widawsky <ben.widawsky@intel.com> Co-developed-by:
Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by:
Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by:
Alex Bennée <alex.bennee@linaro.org>
-
Jonathan Cameron authored
At this stage we can boot configurations with host bridges, root ports and type 3 memory devices, so add appropriate tests. Signed-off-by:
Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by:
Alex Bennée <alex.bennee@linaro.org>
-
Ben Widawsky authored
Implement get and set handlers for the Label Storage Area, which holds data describing the persistent memory configuration so that the same configuration is seen after reboot. Signed-off-by:
Ben Widawsky <ben.widawsky@intel.com> Signed-off-by:
Jonathan Cameron <Jonathan.Cameron@huawei.com>
-
Ben Widawsky authored
This should introduce no change. Subsequent work will make use of this new class member. Signed-off-by:
Ben Widawsky <ben.widawsky@intel.com> Signed-off-by:
Jonathan Cameron <Jonathan.Cameron@huawei.com>
-
Ben Widawsky authored
GET_FW_INFO and GET_PARTITION_INFO, for this emulation, are equivalent to info already returned by the IDENTIFY command. To have a more robust implementation, add those. Signed-off-by:
Ben Widawsky <ben.widawsky@intel.com> Signed-off-by:
Jonathan Cameron <Jonathan.Cameron@huawei.com>
-
Ben Widawsky authored
A device's volatile and persistent memory are known as Host-managed Device Memory (HDM) regions. The mechanism by which the device is programmed to claim the addresses associated with those regions is through dedicated logic known as the HDM decoder. In order to allow the OS to properly program the HDMs, the HDM decoders must be modeled. There are two ways the HDM decoders can be implemented: the legacy mechanism is through the PCIe DVSEC programming from CXL 1.1 (8.1.3.8), and MMIO is found in 8.2.5.12 of the spec. For now, 8.1.3.8 is not implemented. Much of CXL device logic is implemented in cxl-utils. The HDM decoder however is implemented directly by the device implementation. Whilst the implementation currently does no validity checks on the decoder set up, future work will add sanity checking specific to the type of cxl component. Signed-off-by:
Ben Widawsky <ben.widawsky@intel.com> Co-developed-by:
Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by:
Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by:
Alex Bennée <alex.bennee@linaro.org>
-
Ben Widawsky authored
A CXL memory device (AKA Type 3) is a CXL component that contains some combination of volatile and persistent memory. It also implements the previously defined mailbox interface as well as the memory device firmware interface. Although the memory device is configured like a normal PCIe device, the memory traffic is on an entirely separate bus conceptually (using the same physical wires as PCIe, but different protocol). Once the CXL topology is fully configured and address decoders committed, the guest physical address for the memory device is part of a larger window which is owned by the platform. The creation of these windows is later in this series. The following example will create a 256M device in a 512M window: -object "memory-backend-file,id=cxl-mem1,share,mem-path=cxl-type3,size=512M" -device "cxl-type3,bus=rp0,memdev=cxl-mem1,id=cxl-pmem0" Note: Dropped PCDIMM info interfaces for now. They can be added if appropriate at a later date. Signed-off-by:
Ben Widawsky <ben.widawsky@intel.com> Signed-off-by:
Jonathan Cameron <Jonathan.Cameron@huawei.com>
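For orientation, the fragments quoted in this message can be combined with the `pxb-cxl` and `cxl-rp` examples from the earlier commits of the series into one argument list. The sketch below only assembles the strings; the choice of `qemu-system-x86_64`, the `q35` machine type and the `cxl=on` machine property placement are assumptions for this example, and `cxl_type3_args` is a hypothetical helper.

```python
import shlex


def cxl_type3_args(backing_path: str = "cxl-type3", size: str = "512M") -> list:
    """Assemble an illustrative argument list: host bridge, root port, type 3 device."""
    return [
        "qemu-system-x86_64",           # assumed binary
        "-machine", "q35,cxl=on",       # assumed machine type with CXL enabled
        "-object",
        f"memory-backend-file,id=cxl-mem1,share,mem-path={backing_path},size={size}",
        "-device", "pxb-cxl,id=cxl.0,bus=pcie.0,bus_nr=1",
        "-device", "cxl-rp,id=rp0,bus=cxl.0,addr=0.0,chassis=4",
        "-device", "cxl-type3,bus=rp0,memdev=cxl-mem1,id=cxl-pmem0",
    ]


if __name__ == "__main__":
    # Print a shell-quoted command line for inspection.
    print(shlex.join(cxl_type3_args()))
```

The ordering matters in the sense that each device names the bus created by the one before it: the root port sits on `cxl.0` and the memory device on `rp0`.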
-
Ben Widawsky authored
This adds just enough of a root port implementation to be able to enumerate root ports (creating the required DVSEC entries). What's not here yet is the MMIO or the ability to write some of the DVSEC entries. This can be added with the qemu command line by adding a root port to a specific CXL host bridge. For example: -device cxl-rp,id=rp0,bus="cxl.0",addr=0.0,chassis=4 Like the host bridge patch, the ACPI tables aren't generated at this point and so system software cannot use it. Signed-off-by:
Ben Widawsky <ben.widawsky@intel.com> Signed-off-by:
Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by:
Alex Bennée <alex.bennee@linaro.org>
-
Jonathan Cameron authored
Initial test with just pxb-cxl. Other tests will be added alongside functionality. Signed-off-by:
Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by:
Alex Bennée <alex.bennee@linaro.org> Tested-by:
Alex Bennée <alex.bennee@linaro.org>
-
Ben Widawsky authored
This works like adding a typical pxb device, except the name is 'pxb-cxl' instead of 'pxb-pcie'. An example command line would be as follows: -device pxb-cxl,id=cxl.0,bus="pcie.0",bus_nr=1 A CXL PXB is backward compatible with PCIe. What this means in practice is that an operating system that is unaware of CXL should still be able to enumerate this topology as if it were PCIe. One can create multiple CXL PXB host bridges, but a host bridge can only be connected to the main root bus. Host bridges cannot appear elsewhere in the topology. Note that as of this patch, the ACPI tables needed for the host bridge (specifically, an ACPI object in _SB named ACPI0016 and the CEDT) aren't created. So while this patch internally creates it, it cannot be properly used by an operating system or other system software. It is also necessary to add an exception to scripts/device-crash-test similar to that for the existing pxb, as both must be created on a PCIexpress host bus. Signed-off-by:
Ben Widawsky <ben.widawsky@intel.com> Signed-off-by:
Jonathan.Cameron <Jonathan.Cameron@huawei.com> Reviewed-by:
Alex Bennée <alex.bennee@linaro.org>
-
Jonathan Cameron authored
There are going to be some potential overheads to CXL enablement, for example the host bridge region reserved in memory maps. Add a machine level control so that CXL is disabled by default. Signed-off-by:
Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by:
Alex Bennée <alex.bennee@linaro.org>
-