Whilst I know others are working on an implementation of at least some of this, a desire to work on the kernel user of this required an implementation so I threw this together in the meantime and am sending it out on the basis that it may be of use to others.
As it is an RFC I have been lazy on some cleanup and error handling. Will fix that in a future version if we decide to take this forwards.
Based on Ben's cxl-2.0v3 branch from https://gitlab.com/bwidawsk/qemu Possible I've picked an unstable branch so this may or may not currently apply :) I'll rebase on Ben's next version when avaialble.
Jonathan Cameron (4): include/standard-headers/linux/pci_regs: temp hack to add necessary DOE definitions. hw/pci/pcie_doe: Introduce utility functions for PCIe DOE hw/cxl/cxl-cdat: Initial CDAT implementation for use by CXL devices hw/mem/cxl_type3: Enabled DOE mailbox for access to CDAT
hw/cxl/cxl-cdat.c | 252 +++++++++++++++++++++ hw/cxl/meson.build | 1 + hw/mem/cxl_type3.c | 49 ++++- hw/pci/meson.build | 2 +- hw/pci/pcie_doe.c | 257 ++++++++++++++++++++++ include/hw/cxl/cxl.h | 1 + include/hw/cxl/cxl_cdat.h | 101 +++++++++ include/hw/pci/doe.h | 40 ++++ include/hw/pci/pci_ids.h | 2 + include/standard-headers/linux/pci_regs.h | 33 ++- 10 files changed, 734 insertions(+), 4 deletions(-) create mode 100644 hw/cxl/cxl-cdat.c create mode 100644 hw/pci/pcie_doe.c create mode 100644 include/hw/cxl/cxl_cdat.h create mode 100644 include/hw/pci/doe.h
Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com --- include/standard-headers/linux/pci_regs.h | 33 ++++++++++++++++++++++- 1 file changed, 32 insertions(+), 1 deletion(-)
diff --git a/include/standard-headers/linux/pci_regs.h b/include/standard-headers/linux/pci_regs.h index e709ae8235..7e852d3dd0 100644 --- a/include/standard-headers/linux/pci_regs.h +++ b/include/standard-headers/linux/pci_regs.h @@ -730,7 +730,8 @@ #define PCI_EXT_CAP_ID_DVSEC 0x23 /* Designated Vendor-Specific */ #define PCI_EXT_CAP_ID_DLF 0x25 /* Data Link Feature */ #define PCI_EXT_CAP_ID_PL_16GT 0x26 /* Physical Layer 16.0 GT/s */ -#define PCI_EXT_CAP_ID_MAX PCI_EXT_CAP_ID_PL_16GT +#define PCI_EXT_CAP_ID_DOE 0x2E /* Data Object Exchange */ +#define PCI_EXT_CAP_ID_MAX PCI_EXT_CAP_ID_DOE
#define PCI_EXT_CAP_DSN_SIZEOF 12 #define PCI_EXT_CAP_MCAST_ENDPOINT_SIZEOF 40 @@ -1092,4 +1093,34 @@ #define PCI_PL_16GT_LE_CTRL_USP_TX_PRESET_MASK 0x000000F0 #define PCI_PL_16GT_LE_CTRL_USP_TX_PRESET_SHIFT 4
+/* Data Object Exchange */ +#define PCI_DOE_CAP 0x04 +#define PCI_DOE_CAP_INT_SUPPORT 0x00000001 +#define PCI_DOE_CAP_INT_MSG_NUM 0x00000FFE + +#define PCI_DOE_CTRL 0x08 +#define PCI_DOE_CTRL_DOE_ABORT 0x00000001 +#define PCI_DOE_CTRL_DOE_INT_EN 0x00000002 +#define PCI_DOE_CTRL_DOE_GO 0x80000000 + +#define PCI_DOE_STATUS 0x0c +#define PCI_DOE_STATUS_DOE_BUSY 0x00000001 +#define PCI_DOE_STATUS_INT_STATUS 0x00000002 +#define PCI_DOE_STATUS_DOE_ERROR 0x00000004 +#define PCI_DOE_STATUS_DATA_OBJECT_READY 0x80000000 + +#define PCI_DOE_WRITE_MAILBOX 0x10 +#define PCI_DOE_READ_MAILBOX 0x14 + +/* Data Object Format DOE ECN 6.xx.1 */ +#define PCI_DATA_OBJ_DW0_VID 0x0000ffff +#define PCI_DATA_OBJ_DW0_TYPE 0x00ff0000 +#define PCI_DATA_OBJ_DW1_LEN 0x0003ffff + +/* DOE Discover Data Object */ +#define PCI_DOE_DIS_OBJ_TYPE 0x1 +#define PCI_DOE_DIS_REQ_D0_DW0_INDEX 0x000000ff +#define PCI_DOE_DIS_RSP_DO_DW0_VID 0x0000ffff +#define PCI_DOE_DIS_RSP_D0_DW0_PROT 0x00ff0000 +#define PCI_DOE_DIS_RSP_D0_DW0_NEXT_INDEX 0xff000000 #endif /* LINUX_PCI_REGS_H */
On 21-02-01 23:16:26, Jonathan Cameron wrote:
Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com
include/standard-headers/linux/pci_regs.h | 33 ++++++++++++++++++++++- 1 file changed, 32 insertions(+), 1 deletion(-)
diff --git a/include/standard-headers/linux/pci_regs.h b/include/standard-headers/linux/pci_regs.h index e709ae8235..7e852d3dd0 100644 --- a/include/standard-headers/linux/pci_regs.h +++ b/include/standard-headers/linux/pci_regs.h @@ -730,7 +730,8 @@ #define PCI_EXT_CAP_ID_DVSEC 0x23 /* Designated Vendor-Specific */ #define PCI_EXT_CAP_ID_DLF 0x25 /* Data Link Feature */ #define PCI_EXT_CAP_ID_PL_16GT 0x26 /* Physical Layer 16.0 GT/s */ -#define PCI_EXT_CAP_ID_MAX PCI_EXT_CAP_ID_PL_16GT +#define PCI_EXT_CAP_ID_DOE 0x2E /* Data Object Exchange */ +#define PCI_EXT_CAP_ID_MAX PCI_EXT_CAP_ID_DOE
#define PCI_EXT_CAP_DSN_SIZEOF 12 #define PCI_EXT_CAP_MCAST_ENDPOINT_SIZEOF 40 @@ -1092,4 +1093,34 @@ #define PCI_PL_16GT_LE_CTRL_USP_TX_PRESET_MASK 0x000000F0 #define PCI_PL_16GT_LE_CTRL_USP_TX_PRESET_SHIFT 4
+/* Data Object Exchange */ +#define PCI_DOE_CAP 0x04 +#define PCI_DOE_CAP_INT_SUPPORT 0x00000001 +#define PCI_DOE_CAP_INT_MSG_NUM 0x00000FFE
+#define PCI_DOE_CTRL 0x08 +#define PCI_DOE_CTRL_DOE_ABORT 0x00000001 +#define PCI_DOE_CTRL_DOE_INT_EN 0x00000002 +#define PCI_DOE_CTRL_DOE_GO 0x80000000
+#define PCI_DOE_STATUS 0x0c +#define PCI_DOE_STATUS_DOE_BUSY 0x00000001 +#define PCI_DOE_STATUS_INT_STATUS 0x00000002 +#define PCI_DOE_STATUS_DOE_ERROR 0x00000004 +#define PCI_DOE_STATUS_DATA_OBJECT_READY 0x80000000
+#define PCI_DOE_WRITE_MAILBOX 0x10 +#define PCI_DOE_READ_MAILBOX 0x14
+/* Data Object Format DOE ECN 6.xx.1 */ +#define PCI_DATA_OBJ_DW0_VID 0x0000ffff +#define PCI_DATA_OBJ_DW0_TYPE 0x00ff0000 +#define PCI_DATA_OBJ_DW1_LEN 0x0003ffff
+/* DOE Discover Data Object */ +#define PCI_DOE_DIS_OBJ_TYPE 0x1 +#define PCI_DOE_DIS_REQ_D0_DW0_INDEX 0x000000ff +#define PCI_DOE_DIS_RSP_DO_DW0_VID 0x0000ffff +#define PCI_DOE_DIS_RSP_D0_DW0_PROT 0x00ff0000 +#define PCI_DOE_DIS_RSP_D0_DW0_NEXT_INDEX 0xff000000 #endif /* LINUX_PCI_REGS_H */
I think a lot of these should have had _MASK at the end.
As for the accuracy of the values, lgtm.
On Tue, Feb 02, 2021 at 07:39:51AM -0800, Ben Widawsky wrote:
On 21-02-01 23:16:26, Jonathan Cameron wrote:
Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com
include/standard-headers/linux/pci_regs.h | 33 ++++++++++++++++++++++- 1 file changed, 32 insertions(+), 1 deletion(-)
diff --git a/include/standard-headers/linux/pci_regs.h b/include/standard-headers/linux/pci_regs.h index e709ae8235..7e852d3dd0 100644 --- a/include/standard-headers/linux/pci_regs.h +++ b/include/standard-headers/linux/pci_regs.h @@ -730,7 +730,8 @@ #define PCI_EXT_CAP_ID_DVSEC 0x23 /* Designated Vendor-Specific */ #define PCI_EXT_CAP_ID_DLF 0x25 /* Data Link Feature */ #define PCI_EXT_CAP_ID_PL_16GT 0x26 /* Physical Layer 16.0 GT/s */ -#define PCI_EXT_CAP_ID_MAX PCI_EXT_CAP_ID_PL_16GT +#define PCI_EXT_CAP_ID_DOE 0x2E /* Data Object Exchange */ +#define PCI_EXT_CAP_ID_MAX PCI_EXT_CAP_ID_DOE
#define PCI_EXT_CAP_DSN_SIZEOF 12 #define PCI_EXT_CAP_MCAST_ENDPOINT_SIZEOF 40 @@ -1092,4 +1093,34 @@ #define PCI_PL_16GT_LE_CTRL_USP_TX_PRESET_MASK 0x000000F0 #define PCI_PL_16GT_LE_CTRL_USP_TX_PRESET_SHIFT 4
+/* Data Object Exchange */ +#define PCI_DOE_CAP 0x04 +#define PCI_DOE_CAP_INT_SUPPORT 0x00000001 +#define PCI_DOE_CAP_INT_MSG_NUM 0x00000FFE
+#define PCI_DOE_CTRL 0x08 +#define PCI_DOE_CTRL_DOE_ABORT 0x00000001 +#define PCI_DOE_CTRL_DOE_INT_EN 0x00000002 +#define PCI_DOE_CTRL_DOE_GO 0x80000000
+#define PCI_DOE_STATUS 0x0c +#define PCI_DOE_STATUS_DOE_BUSY 0x00000001 +#define PCI_DOE_STATUS_INT_STATUS 0x00000002 +#define PCI_DOE_STATUS_DOE_ERROR 0x00000004 +#define PCI_DOE_STATUS_DATA_OBJECT_READY 0x80000000
+#define PCI_DOE_WRITE_MAILBOX 0x10 +#define PCI_DOE_READ_MAILBOX 0x14
+/* Data Object Format DOE ECN 6.xx.1 */ +#define PCI_DATA_OBJ_DW0_VID 0x0000ffff +#define PCI_DATA_OBJ_DW0_TYPE 0x00ff0000 +#define PCI_DATA_OBJ_DW1_LEN 0x0003ffff
+/* DOE Discover Data Object */ +#define PCI_DOE_DIS_OBJ_TYPE 0x1 +#define PCI_DOE_DIS_REQ_D0_DW0_INDEX 0x000000ff +#define PCI_DOE_DIS_RSP_DO_DW0_VID 0x0000ffff +#define PCI_DOE_DIS_RSP_D0_DW0_PROT 0x00ff0000 +#define PCI_DOE_DIS_RSP_D0_DW0_NEXT_INDEX 0xff000000 #endif /* LINUX_PCI_REGS_H */
I think a lot of these should have had _MASK at the end.
As for the accuracy of the values, lgtm.
just add them in the source file where they are used. standard-headers are over-written by scripts, adding your own macros there won't help.
This implements the ECN to the PCI 5.0 specification available at https://members.pcisig.com/wg/PCI-SIG/document/14143
Does not currently support interrupts.
Note that currently no attempt is made to clean up allocated memory.
Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com --- hw/pci/meson.build | 2 +- hw/pci/pcie_doe.c | 257 +++++++++++++++++++++++++++++++++++++++ include/hw/pci/doe.h | 40 ++++++ include/hw/pci/pci_ids.h | 2 + 4 files changed, 300 insertions(+), 1 deletion(-)
diff --git a/hw/pci/meson.build b/hw/pci/meson.build index 5c4bbac817..7336620ee3 100644 --- a/hw/pci/meson.build +++ b/hw/pci/meson.build @@ -11,7 +11,7 @@ pci_ss.add(files( # The functions in these modules can be used by devices too. Since we # allow plugging PCIe devices into PCI buses, include them even if # CONFIG_PCI_EXPRESS=n. -pci_ss.add(files('pcie.c', 'pcie_aer.c')) +pci_ss.add(files('pcie.c', 'pcie_aer.c', 'pcie_doe.c')) softmmu_ss.add(when: 'CONFIG_PCI_EXPRESS', if_true: files('pcie_port.c', 'pcie_host.c')) softmmu_ss.add_all(when: 'CONFIG_PCI', if_true: pci_ss)
diff --git a/hw/pci/pcie_doe.c b/hw/pci/pcie_doe.c new file mode 100644 index 0000000000..8739c41280 --- /dev/null +++ b/hw/pci/pcie_doe.c @@ -0,0 +1,257 @@ +/* + * pcie_doe.c + * utility functions for pci express data object exchange introduced + * in PCI 5.0 Data Object Exchange (DOE) ECN + * + * Copyright (c) 2021 Jonathan Cameron Jonathan.Cameron@huawei.com + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +#include "qemu/osdep.h" +#include "qemu/units.h" +#include "qemu/error-report.h" +#include "hw/pci/pci.h" +#include "hw/pci/doe.h" +#include "hw/qdev-properties.h" +#include "qapi/error.h" +#include "qemu/log.h" +#include "qemu/module.h" +#include "qemu/range.h" +#include "qemu/rcu.h" +#include "sysemu/hostmem.h" + +struct doe_handler { + uint16_t vendor_id; + uint8_t object_type; + doe_msg_handler_t handler; + void *priv; +}; + +static void doe_set_ctl(PCIEDOE *doe, uint32_t val) +{ + /* Abort */ + if (val & PCI_DOE_CTRL_DOE_ABORT) { + doe->req_index = 0; + doe->rsp_index = 0; + doe->req_length = 0; + doe->error = false; + doe->data_object_ready = false; + } + + if (val & PCI_DOE_CTRL_DOE_GO) { + GList *l; + uint16_t vendor_id = doe->store[0] & PCI_DATA_OBJ_DW0_VID; + uint8_t object_type = (doe->store[0] & PCI_DATA_OBJ_DW0_TYPE) >> + ctz32(PCI_DATA_OBJ_DW0_TYPE); + if ((doe->req_index != 3) || (doe->req_length != 3)) { + /* + * Not entirely clear what should happen if req_length is correct + * buf insufficient data has been received. + */ + doe->error = true; + return; + } + /* Discovery protocol - DOE ECN */ + if (vendor_id == PCI_VENDOR_ID_PCI_SIG && + object_type == PCI_DOE_DIS_OBJ_TYPE) { + uint8_t index = doe->store[2] & PCI_DOE_DIS_REQ_D0_DW0_INDEX; + doe->store[1] = 3; + if (index == 0) { + /* First entry is this one, the discovery protocol itself */ + uint8_t next; + + if (doe->cb_list) { + next = index + 1; + } else { + next = 0; + } + doe->store[2] = + (next << ctz32(PCI_DOE_DIS_RSP_D0_DW0_NEXT_INDEX)) | + (0 << ctz32(PCI_DOE_DIS_RSP_D0_DW0_PROT)) | + 0x0001; + } else { + /* Other entries based on register callbacks */ + uint8_t next; + struct doe_handler *h; + + h = g_list_nth_data(doe->cb_list, index - 1); + /* + * Off end of list, Table 7-x4 in DOE ECN - + * Vendor ID 0xFFFF if no more indices + */ + if (h == NULL) { + doe->store[2] = 0xFFFF; + } else { + if (g_list_nth(doe->cb_list, index)) { + next = index + 1; + } else { + next = 0; + } + doe->store[2] = + (next << ctz32(PCI_DOE_DIS_RSP_D0_DW0_NEXT_INDEX)) | + (h->object_type << ctz32(PCI_DOE_DIS_RSP_D0_DW0_PROT)) | + h->vendor_id; + } + } + doe->data_object_ready = true; + doe->rsp_index = 0; + } else { + for (l = doe->cb_list; l != NULL; l = l->next) { + struct doe_handler *h = l->data; + if (h->vendor_id == vendor_id && + h->object_type == object_type) { + int ret = h->handler(doe, h->vendor_id, h->object_type, + h->priv); + if (ret) { + /* + * No response so as per 6.xx.1 in DOE ECN + * "... within 1 second after the DOE Go bit was Set + * in the DOE Control register, otherwise the DOE + * instance must Set the DOE Error bit in the DOE + * Status register.." + */ + doe->error = true; + break; + } + doe->data_object_ready = true; + doe->rsp_index = 0; + break; + } + } + /* Comamnd not handled */ + if (l == NULL) { + doe->error = true; + } + } + /* Reset input index to allow for a new message */ + doe->req_index = 0; + } +} + +static void doe_set_write_mailbox(PCIEDOE *doe, uint32_t val) +{ + if (doe->req_index == 1) { + if (val & 0x3FFFF) { + doe->req_length = val & PCI_DATA_OBJ_DW1_LEN; + } else { + doe->req_length = 1 << 18; + } + } + if (doe->req_length && doe->req_index == doe->req_length) { + /* + * 6.xx.1 Data Objects + * If the DW transferred does not match the indicated Length + * for a data object, then the data object must be + * silently discarded + */ + return; + } + doe->store[doe->req_index] = val; + doe->req_index++; +} + +static uint32_t doe_get_read_mailbox(PCIEDOE *doe) +{ + uint32_t val; + + if (doe->rsp_index == 0) { + doe->rsp_length = doe->store[1] & PCI_DATA_OBJ_DW1_LEN; + } + if (!doe->data_object_ready) { + /* Underflow of the Read Data Mailbox Mechanism */ + doe->error = true; + return 0; + } + + val = doe->store[doe->rsp_index]; + doe->rsp_index++; + if (doe->rsp_index == doe->rsp_length) { + doe->rsp_index = -1; + doe->data_object_ready = false; + } + + return val; +} + +static uint32_t doe_get_status(PCIEDOE *doe) +{ + uint32_t val = 0; + + if (doe->busy) { + val |= PCI_DOE_STATUS_DOE_BUSY; + } + /* bit 1: interrupt not yet supported */ + if (doe->error) { + val |= PCI_DOE_STATUS_DOE_ERROR; + } + if (doe->data_object_ready) { + val |= PCI_DOE_STATUS_DATA_OBJECT_READY; + } + + return val; +} + +void doe_add_message_handler(PCIEDOE *doe, uint16_t vendor_id, + uint8_t object_type, + const doe_msg_handler_t handler, void *priv) +{ + struct doe_handler *h = g_malloc0(sizeof(*handler)); + + h->vendor_id = vendor_id; + h->object_type = object_type; + h->handler = handler; + h->priv = priv; + doe->cb_list = g_list_append(doe->cb_list, h); +} + +uint32_t pcie_doe_ecap(PCIEDOE *doe, PCIDevice *d, uint16_t offset) +{ + doe->doe_base = offset; + /* Length field is 18 bits and is in dwords */ + doe->store = g_malloc0((1 << 18) * sizeof(uint32_t)); + + pcie_add_capability(d, PCI_EXT_CAP_ID_DOE, 1, offset, 0x18); + offset += 0x18; + + return offset; +} + +void pcie_doe_write(PCIEDOE *doe, uint32_t addr, uint32_t val, int len) +{ + if (len != 4) { + return; + } + + switch (addr - doe->doe_base) { + case PCI_DOE_CTRL: + doe_set_ctl(doe, val); + break; + case PCI_DOE_WRITE_MAILBOX: + doe_set_write_mailbox(doe, val); + break; + } +} + +uint32_t pcie_doe_read(PCIEDOE *doe, uint32_t addr, int len, int *found) +{ + if (len != 4) { + *found = 0; + return 0; + } + + *found = 1; + switch (addr - doe->doe_base) { + case PCI_DOE_CAP: + return 0; /* No interrupt support */ + case PCI_DOE_STATUS: + return doe_get_status(doe); + case PCI_DOE_READ_MAILBOX: + return doe_get_read_mailbox(doe); + default: + *found = 0; + return 0; + } +} + diff --git a/include/hw/pci/doe.h b/include/hw/pci/doe.h new file mode 100644 index 0000000000..364c866c53 --- /dev/null +++ b/include/hw/pci/doe.h @@ -0,0 +1,40 @@ +/* + * PCIE DOE emulation. + * + * Copyright (c) 2021 Jonathan Cameron Jonathan.Cameron@huawei.com + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +#ifndef QEMU_PCIE_DOE_H_ +#define QEMU_PCIE_DOE_H_ +#include "qemu/osdep.h" +#include "qemu/units.h" +#include "hw/qdev-properties.h" +#include "qapi/error.h" +#include "qemu/module.h" + +typedef struct pcie_doe { + uint32_t doe_base; + GList *cb_list; + int req_index; + int req_length; + int rsp_index; + int rsp_length; + bool data_object_ready; + bool error; + bool busy; + uint32_t *store; +} PCIEDOE; + +typedef int (*doe_msg_handler_t)(PCIEDOE *doe, uint16_t vendor_id, + uint8_t object_type, void *priv); + +uint32_t pcie_doe_ecap(PCIEDOE *doe, PCIDevice *d, uint16_t offset); +void doe_add_message_handler(PCIEDOE *doe, uint16_t vendor_id, + uint8_t object_type, + const doe_msg_handler_t handler, void *priv); +uint32_t pcie_doe_read(PCIEDOE *doe, uint32_t addr, int len, int *found); +void pcie_doe_write(PCIEDOE *doe, uint32_t addr, uint32_t val, int len); +#endif diff --git a/include/hw/pci/pci_ids.h b/include/hw/pci/pci_ids.h index 76bf3ed590..636b2e8017 100644 --- a/include/hw/pci/pci_ids.h +++ b/include/hw/pci/pci_ids.h @@ -157,6 +157,8 @@
/* Vendors and devices. Sort key: vendor first, device next. */
+#define PCI_VENDOR_ID_PCI_SIG 0x0001 + #define PCI_VENDOR_ID_LSI_LOGIC 0x1000 #define PCI_DEVICE_ID_LSI_53C810 0x0001 #define PCI_DEVICE_ID_LSI_53C895A 0x0012
This was a bit more complicated than I was anticipating :-)
On 21-02-01 23:16:27, Jonathan Cameron wrote:
This implements the ECN to the PCI 5.0 specification available at https://members.pcisig.com/wg/PCI-SIG/document/14143
Does not currently support interrupts.
Note that currently no attempt is made to clean up allocated memory.
Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com
hw/pci/meson.build | 2 +- hw/pci/pcie_doe.c | 257 +++++++++++++++++++++++++++++++++++++++ include/hw/pci/doe.h | 40 ++++++ include/hw/pci/pci_ids.h | 2 + 4 files changed, 300 insertions(+), 1 deletion(-)
diff --git a/hw/pci/meson.build b/hw/pci/meson.build index 5c4bbac817..7336620ee3 100644 --- a/hw/pci/meson.build +++ b/hw/pci/meson.build @@ -11,7 +11,7 @@ pci_ss.add(files( # The functions in these modules can be used by devices too. Since we # allow plugging PCIe devices into PCI buses, include them even if # CONFIG_PCI_EXPRESS=n. -pci_ss.add(files('pcie.c', 'pcie_aer.c')) +pci_ss.add(files('pcie.c', 'pcie_aer.c', 'pcie_doe.c')) softmmu_ss.add(when: 'CONFIG_PCI_EXPRESS', if_true: files('pcie_port.c', 'pcie_host.c')) softmmu_ss.add_all(when: 'CONFIG_PCI', if_true: pci_ss)
diff --git a/hw/pci/pcie_doe.c b/hw/pci/pcie_doe.c new file mode 100644 index 0000000000..8739c41280 --- /dev/null +++ b/hw/pci/pcie_doe.c @@ -0,0 +1,257 @@ +/*
- pcie_doe.c
- utility functions for pci express data object exchange introduced
- in PCI 5.0 Data Object Exchange (DOE) ECN
- Copyright (c) 2021 Jonathan Cameron Jonathan.Cameron@huawei.com
- This work is licensed under the terms of the GNU GPL, version 2 or later.
- See the COPYING file in the top-level directory.
- */
+#include "qemu/osdep.h" +#include "qemu/units.h" +#include "qemu/error-report.h" +#include "hw/pci/pci.h" +#include "hw/pci/doe.h" +#include "hw/qdev-properties.h" +#include "qapi/error.h" +#include "qemu/log.h" +#include "qemu/module.h" +#include "qemu/range.h" +#include "qemu/rcu.h" +#include "sysemu/hostmem.h"
I know it's RFC and quickly thrown together, but:
/* VID and Type */ #define DOE_DATA_OBJECT_HDR1 0 /* Length */ #define DOE_DATA_OBJECT_HDR2 1 /* Index */ #define DOE_DATA_OBJECT_REQUEST_DATA 2 #define DOE_DATA_OBJECT_RESPONSE_DATA 3
Then use that throughout below, please?
+struct doe_handler {
- uint16_t vendor_id;
- uint8_t object_type;
- doe_msg_handler_t handler;
- void *priv;
+};
+static void doe_set_ctl(PCIEDOE *doe, uint32_t val) +{
- /* Abort */
- if (val & PCI_DOE_CTRL_DOE_ABORT) {
doe->req_index = 0;
doe->rsp_index = 0;
doe->req_length = 0;
doe->error = false;
doe->data_object_ready = false;
- }
- if (val & PCI_DOE_CTRL_DOE_GO) {
GList *l;
uint16_t vendor_id = doe->store[0] & PCI_DATA_OBJ_DW0_VID;
uint8_t object_type = (doe->store[0] & PCI_DATA_OBJ_DW0_TYPE) >>
ctz32(PCI_DATA_OBJ_DW0_TYPE);
I think it'd be much nicer to read this as REG32/FIELD_EX32.
if ((doe->req_index != 3) || (doe->req_length != 3)) {
/*
* Not entirely clear what should happen if req_length is correct
* buf insufficient data has been received.
s/buf/but
Also, maybe for more resiliency and readability, with a comment about why '3': if ((doe->req_index < 3) ...
I think it'd be much more readable if you pulled the index directly out of the register store here instead of when you set.
len = doe->store[DOE_DATA_OBJECT_HDR2] & PCI_DATA_OBJ_DW1_LEN; if (!len) len = 1<<18;
*/
doe->error = true;
return;
I don't think this should be an error: "If the Length is shorter than expected for a specific data object, then the data object must be silently discarded.
If the Length is greater than expected for a specific data object, then the portion of the data object up to the expected length must be processednormally and the remainder of the data object must be silently discarded."
}
/* Discovery protocol - DOE ECN */
if (vendor_id == PCI_VENDOR_ID_PCI_SIG &&
object_type == PCI_DOE_DIS_OBJ_TYPE) {
uint8_t index = doe->store[2] & PCI_DOE_DIS_REQ_D0_DW0_INDEX;
doe->store[1] = 3;
if (index == 0) {
/* First entry is this one, the discovery protocol itself */
uint8_t next;
if (doe->cb_list) {
next = index + 1;
} else {
next = 0;
}
I think a comment here that you're terminating the list if no callbacks are registered would be good.
doe->store[2] =
(next << ctz32(PCI_DOE_DIS_RSP_D0_DW0_NEXT_INDEX)) |
(0 << ctz32(PCI_DOE_DIS_RSP_D0_DW0_PROT)) |
0x0001;
Same comment about FIELD_DP|EX32. I'd try it and see how it looks.
Would probably be nice to #define some vendor ID, or use fields so it's obvious why you have a 0x1.
} else {
/* Other entries based on register callbacks */
uint8_t next;
struct doe_handler *h;
h = g_list_nth_data(doe->cb_list, index - 1);
It wasn't immediately obvious to me why index - 1. Maybe it'd be a little nicer if you had a helper:
static inline struct doe_handler *doe_get_handler(PCIEDOE *doe, int doe_index) { struct doe_handler *ret = NULL;
// assert(doe->cb_list); if (doe->cb_list) ret = g_list_nth_data(doe->cb_list, doe_index - 1);
return ret; }
/*
* Off end of list, Table 7-x4 in DOE ECN -
* Vendor ID 0xFFFF if no more indices
*/
if (h == NULL) {
doe->store[2] = 0xFFFF;
} else {
if (g_list_nth(doe->cb_list, index)) {
next = index + 1;
} else {
next = 0;
}
A bit confusing here as well IMO, but I don't know a better way to write it other than adding comments about index effectively being off by 1.
doe->store[2] =
(next << ctz32(PCI_DOE_DIS_RSP_D0_DW0_NEXT_INDEX)) |
(h->object_type << ctz32(PCI_DOE_DIS_RSP_D0_DW0_PROT)) |
h->vendor_id;
}
}
doe->data_object_ready = true;
doe->rsp_index = 0;
} else {
for (l = doe->cb_list; l != NULL; l = l->next) {
struct doe_handler *h = l->data;
if (h->vendor_id == vendor_id &&
h->object_type == object_type) {
int ret = h->handler(doe, h->vendor_id, h->object_type,
h->priv);
if (ret) {
/*
* No response so as per 6.xx.1 in DOE ECN
* "... within 1 second after the DOE Go bit was Set
* in the DOE Control register, otherwise the DOE
* instance must Set the DOE Error bit in the DOE
* Status register.."
*/
doe->error = true;
extra whitespace
break;
}
doe->data_object_ready = true;
doe->rsp_index = 0;
break;
}
}
/* Comamnd not handled */
if (l == NULL) {
doe->error = true;
}
}
/* Reset input index to allow for a new message */
doe->req_index = 0;
- }
+}
+static void doe_set_write_mailbox(PCIEDOE *doe, uint32_t val) +{
- if (doe->req_index == 1) {
if (val & 0x3FFFF) {
val & PCI_DATA_OBJ_DW1_LEN
doe->req_length = val & PCI_DATA_OBJ_DW1_LEN;
} else {
doe->req_length = 1 << 18;
}
- }
- if (doe->req_length && doe->req_index == doe->req_length) {
/*
* 6.xx.1 Data Objects
* If the DW transferred does not match the indicated Length
* for a data object, then the data object must be
* silently discarded
*/
return;
- }
- doe->store[doe->req_index] = val;
- doe->req_index++;
+}
+static uint32_t doe_get_read_mailbox(PCIEDOE *doe) +{
- uint32_t val;
- if (doe->rsp_index == 0) {
doe->rsp_length = doe->store[1] & PCI_DATA_OBJ_DW1_LEN;
- }
- if (!doe->data_object_ready) {
/* Underflow of the Read Data Mailbox Mechanism */
doe->error = true;
return 0;
- }
I don't think this should be an error condition. Could you please explain?
- val = doe->store[doe->rsp_index];
- doe->rsp_index++;
- if (doe->rsp_index == doe->rsp_length) {
doe->rsp_index = -1;
doe->data_object_ready = false;
- }
- return val;
+}
+static uint32_t doe_get_status(PCIEDOE *doe) +{
- uint32_t val = 0;
- if (doe->busy) {
val |= PCI_DOE_STATUS_DOE_BUSY;
- }
Do you actually intend to model busy?
- /* bit 1: interrupt not yet supported */
- if (doe->error) {
val |= PCI_DOE_STATUS_DOE_ERROR;
- }
- if (doe->data_object_ready) {
val |= PCI_DOE_STATUS_DATA_OBJECT_READY;
- }
- return val;
+}
+void doe_add_message_handler(PCIEDOE *doe, uint16_t vendor_id,
uint8_t object_type,
const doe_msg_handler_t handler, void *priv)
+{
- struct doe_handler *h = g_malloc0(sizeof(*handler));
- h->vendor_id = vendor_id;
- h->object_type = object_type;
- h->handler = handler;
- h->priv = priv;
- doe->cb_list = g_list_append(doe->cb_list, h);
+}
+uint32_t pcie_doe_ecap(PCIEDOE *doe, PCIDevice *d, uint16_t offset) +{
- doe->doe_base = offset;
- /* Length field is 18 bits and is in dwords */
- doe->store = g_malloc0((1 << 18) * sizeof(uint32_t));
I think it's fine if you're instantiating DOE to just have this be statically allocated.
- pcie_add_capability(d, PCI_EXT_CAP_ID_DOE, 1, offset, 0x18);
- offset += 0x18;
- return offset;
+}
+void pcie_doe_write(PCIEDOE *doe, uint32_t addr, uint32_t val, int len) +{
- if (len != 4) {
return;
- }
Do you want to check alignment also?
- switch (addr - doe->doe_base) {
- case PCI_DOE_CTRL:
doe_set_ctl(doe, val);
break;
- case PCI_DOE_WRITE_MAILBOX:
doe_set_write_mailbox(doe, val);
break;
- }
+}
+uint32_t pcie_doe_read(PCIEDOE *doe, uint32_t addr, int len, int *found) +{
- if (len != 4) {
*found = 0;
return 0;
- }
Same comment as _write
- *found = 1;
- switch (addr - doe->doe_base) {
- case PCI_DOE_CAP:
return 0; /* No interrupt support */
- case PCI_DOE_STATUS:
return doe_get_status(doe);
- case PCI_DOE_READ_MAILBOX:
return doe_get_read_mailbox(doe);
- default:
*found = 0;
return 0;
- }
+}
I'm guessing you don't love 'found'. Perhaps you can have this return an int64_t and negative values are errors?
I'm not yet convinced you even need that level of error handling though. I suppose next patch will tell me.
diff --git a/include/hw/pci/doe.h b/include/hw/pci/doe.h new file mode 100644 index 0000000000..364c866c53 --- /dev/null +++ b/include/hw/pci/doe.h @@ -0,0 +1,40 @@ +/*
- PCIE DOE emulation.
- Copyright (c) 2021 Jonathan Cameron Jonathan.Cameron@huawei.com
- This work is licensed under the terms of the GNU GPL, version 2 or later.
- See the COPYING file in the top-level directory.
- */
+#ifndef QEMU_PCIE_DOE_H_ +#define QEMU_PCIE_DOE_H_ +#include "qemu/osdep.h" +#include "qemu/units.h" +#include "hw/qdev-properties.h" +#include "qapi/error.h" +#include "qemu/module.h"
+typedef struct pcie_doe {
- uint32_t doe_base;
- GList *cb_list;
- int req_index;
- int req_length;
- int rsp_index;
- int rsp_length;
- bool data_object_ready;
- bool error;
- bool busy;
- uint32_t *store;
+} PCIEDOE;
+typedef int (*doe_msg_handler_t)(PCIEDOE *doe, uint16_t vendor_id,
uint8_t object_type, void *priv);
+uint32_t pcie_doe_ecap(PCIEDOE *doe, PCIDevice *d, uint16_t offset); +void doe_add_message_handler(PCIEDOE *doe, uint16_t vendor_id,
uint8_t object_type,
const doe_msg_handler_t handler, void *priv);
+uint32_t pcie_doe_read(PCIEDOE *doe, uint32_t addr, int len, int *found); +void pcie_doe_write(PCIEDOE *doe, uint32_t addr, uint32_t val, int len); +#endif diff --git a/include/hw/pci/pci_ids.h b/include/hw/pci/pci_ids.h index 76bf3ed590..636b2e8017 100644 --- a/include/hw/pci/pci_ids.h +++ b/include/hw/pci/pci_ids.h @@ -157,6 +157,8 @@
/* Vendors and devices. Sort key: vendor first, device next. */
+#define PCI_VENDOR_ID_PCI_SIG 0x0001
#define PCI_VENDOR_ID_LSI_LOGIC 0x1000 #define PCI_DEVICE_ID_LSI_53C810 0x0001
#define PCI_DEVICE_ID_LSI_53C895A 0x0012
2.19.1
On Tue, 2 Feb 2021 09:54:11 -0800 Ben Widawsky ben@bwidawsk.net wrote:
This was a bit more complicated than I was anticipating :-)
On 21-02-01 23:16:27, Jonathan Cameron wrote:
This implements the ECN to the PCI 5.0 specification available at https://members.pcisig.com/wg/PCI-SIG/document/14143
Does not currently support interrupts.
Note that currently no attempt is made to clean up allocated memory.
Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com
Hi Ben,
Thanks, for taking a look.
I've only commented when more info to give or disagree.
hw/pci/meson.build | 2 +- hw/pci/pcie_doe.c | 257 +++++++++++++++++++++++++++++++++++++++ include/hw/pci/doe.h | 40 ++++++ include/hw/pci/pci_ids.h | 2 + 4 files changed, 300 insertions(+), 1 deletion(-)
diff --git a/hw/pci/meson.build b/hw/pci/meson.build index 5c4bbac817..7336620ee3 100644 --- a/hw/pci/meson.build +++ b/hw/pci/meson.build @@ -11,7 +11,7 @@ pci_ss.add(files( # The functions in these modules can be used by devices too. Since we # allow plugging PCIe devices into PCI buses, include them even if # CONFIG_PCI_EXPRESS=n. -pci_ss.add(files('pcie.c', 'pcie_aer.c')) +pci_ss.add(files('pcie.c', 'pcie_aer.c', 'pcie_doe.c')) softmmu_ss.add(when: 'CONFIG_PCI_EXPRESS', if_true: files('pcie_port.c', 'pcie_host.c')) softmmu_ss.add_all(when: 'CONFIG_PCI', if_true: pci_ss)
diff --git a/hw/pci/pcie_doe.c b/hw/pci/pcie_doe.c new file mode 100644 index 0000000000..8739c41280 --- /dev/null +++ b/hw/pci/pcie_doe.c @@ -0,0 +1,257 @@ +/*
- pcie_doe.c
- utility functions for pci express data object exchange introduced
- in PCI 5.0 Data Object Exchange (DOE) ECN
- Copyright (c) 2021 Jonathan Cameron Jonathan.Cameron@huawei.com
- This work is licensed under the terms of the GNU GPL, version 2 or later.
- See the COPYING file in the top-level directory.
- */
+#include "qemu/osdep.h" +#include "qemu/units.h" +#include "qemu/error-report.h" +#include "hw/pci/pci.h" +#include "hw/pci/doe.h" +#include "hw/qdev-properties.h" +#include "qapi/error.h" +#include "qemu/log.h" +#include "qemu/module.h" +#include "qemu/range.h" +#include "qemu/rcu.h" +#include "sysemu/hostmem.h"
I know it's RFC and quickly thrown together, but:
/* VID and Type */ #define DOE_DATA_OBJECT_HDR1 0 /* Length */ #define DOE_DATA_OBJECT_HDR2 1 /* Index */ #define DOE_DATA_OBJECT_REQUEST_DATA 2 #define DOE_DATA_OBJECT_RESPONSE_DATA 3
Then use that throughout below, please?
+struct doe_handler {
- uint16_t vendor_id;
- uint8_t object_type;
- doe_msg_handler_t handler;
- void *priv;
+};
+static void doe_set_ctl(PCIEDOE *doe, uint32_t val) +{
- /* Abort */
- if (val & PCI_DOE_CTRL_DOE_ABORT) {
doe->req_index = 0;
doe->rsp_index = 0;
doe->req_length = 0;
doe->error = false;
doe->data_object_ready = false;
- }
- if (val & PCI_DOE_CTRL_DOE_GO) {
GList *l;
uint16_t vendor_id = doe->store[0] & PCI_DATA_OBJ_DW0_VID;
uint8_t object_type = (doe->store[0] & PCI_DATA_OBJ_DW0_TYPE) >>
ctz32(PCI_DATA_OBJ_DW0_TYPE);
I think it'd be much nicer to read this as REG32/FIELD_EX32.
Probably :)
if ((doe->req_index != 3) || (doe->req_length != 3)) {
/*
* Not entirely clear what should happen if req_length is correct
* buf insufficient data has been received.
s/buf/but
Also, maybe for more resiliency and readability, with a comment about why '3': if ((doe->req_index < 3) ...
I think it'd be much more readable if you pulled the index directly out of the register store here instead of when you set.
len = doe->store[DOE_DATA_OBJECT_HDR2] & PCI_DATA_OBJ_DW1_LEN; if (!len) len = 1<<18;
*/
doe->error = true;
return;
I don't think this should be an error: "If the Length is shorter than expected for a specific data object, then the data object must be silently discarded.
Comment was to tell me to come back and check it. Hinges on the bit below.
If the Length is greater than expected for a specific data object, then the portion of the data object up to the expected length must be processed normally and the remainder of the data object must be silently discarded."
There is a gotcha around that one that still has me scratching my head. "Any of the following events must result in the DOE error bit being set:" ... "Optionally, if the associated data object protocol does not provide an alternative mechanism for reporting such errors, the transfer of a data object that does not correspond to the expected length of the data object".
So I think the normally and dropping statement you've quoted may mean that you wouldn't expect software to check the error bit before continuing to write to the mailbox. Thus it should carry on allowing writes and just dropping them.
Of course, an implementation might optionally 'not' set the error in this case.
}
/* Discovery protocol - DOE ECN */
if (vendor_id == PCI_VENDOR_ID_PCI_SIG &&
object_type == PCI_DOE_DIS_OBJ_TYPE) {
uint8_t index = doe->store[2] & PCI_DOE_DIS_REQ_D0_DW0_INDEX;
doe->store[1] = 3;
if (index == 0) {
/* First entry is this one, the discovery protocol itself */
uint8_t next;
if (doe->cb_list) {
next = index + 1;
} else {
next = 0;
}
I think a comment here that you're terminating the list if no callbacks are registered would be good.
doe->store[2] =
(next << ctz32(PCI_DOE_DIS_RSP_D0_DW0_NEXT_INDEX)) |
(0 << ctz32(PCI_DOE_DIS_RSP_D0_DW0_PROT)) |
0x0001;
Same comment about FIELD_DP|EX32. I'd try it and see how it looks.
Would probably be nice to #define some vendor ID, or use fields so it's obvious why you have a 0x1.
} else {
/* Other entries based on register callbacks */
uint8_t next;
struct doe_handler *h;
h = g_list_nth_data(doe->cb_list, index - 1);
It wasn't immediately obvious to me why index - 1. Maybe it'd be a little nicer if you had a helper:
static inline struct doe_handler *doe_get_handler(PCIEDOE *doe, int doe_index) { struct doe_handler *ret = NULL;
// assert(doe->cb_list); if (doe->cb_list) ret = g_list_nth_data(doe->cb_list, doe_index - 1);
return ret; }
/*
* Off end of list, Table 7-x4 in DOE ECN -
* Vendor ID 0xFFFF if no more indices
*/
if (h == NULL) {
doe->store[2] = 0xFFFF;
} else {
if (g_list_nth(doe->cb_list, index)) {
next = index + 1;
} else {
next = 0;
}
A bit confusing here as well IMO, but I don't know a better way to write it other than adding comments about index effectively being off by 1.
doe->store[2] =
(next << ctz32(PCI_DOE_DIS_RSP_D0_DW0_NEXT_INDEX)) |
(h->object_type << ctz32(PCI_DOE_DIS_RSP_D0_DW0_PROT)) |
h->vendor_id;
}
}
doe->data_object_ready = true;
doe->rsp_index = 0;
} else {
for (l = doe->cb_list; l != NULL; l = l->next) {
struct doe_handler *h = l->data;
if (h->vendor_id == vendor_id &&
h->object_type == object_type) {
int ret = h->handler(doe, h->vendor_id, h->object_type,
h->priv);
if (ret) {
/*
* No response so as per 6.xx.1 in DOE ECN
* "... within 1 second after the DOE Go bit was Set
* in the DOE Control register, otherwise the DOE
* instance must Set the DOE Error bit in the DOE
* Status register.."
*/
doe->error = true;
extra whitespace
break;
}
doe->data_object_ready = true;
doe->rsp_index = 0;
break;
}
}
/* Comamnd not handled */
if (l == NULL) {
doe->error = true;
}
}
/* Reset input index to allow for a new message */
doe->req_index = 0;
- }
+}
+static void doe_set_write_mailbox(PCIEDOE *doe, uint32_t val) +{
- if (doe->req_index == 1) {
if (val & 0x3FFFF) {
val & PCI_DATA_OBJ_DW1_LEN
doe->req_length = val & PCI_DATA_OBJ_DW1_LEN;
} else {
doe->req_length = 1 << 18;
}
- }
- if (doe->req_length && doe->req_index == doe->req_length) {
/*
* 6.xx.1 Data Objects
* If the DW transferred does not match the indicated Length
* for a data object, then the data object must be
* silently discarded
*/
return;
- }
- doe->store[doe->req_index] = val;
- doe->req_index++;
+}
+static uint32_t doe_get_read_mailbox(PCIEDOE *doe) +{
- uint32_t val;
- if (doe->rsp_index == 0) {
doe->rsp_length = doe->store[1] & PCI_DATA_OBJ_DW1_LEN;
- }
- if (!doe->data_object_ready) {
/* Underflow of the Read Data Mailbox Mechanism */
doe->error = true;
return 0;
- }
I don't think this should be an error condition. Could you please explain?
More fun with spec... "Any of the following events must result in the DOE error bit being set" "Underflow of the Read Data Mailbox Mechanism".
A read before data_object_ready is to me a read of an empty mailbox so hits that. Open to interpretation...
I guess I'll have annoying our PCI-SIG people...
- val = doe->store[doe->rsp_index];
- doe->rsp_index++;
- if (doe->rsp_index == doe->rsp_length) {
doe->rsp_index = -1;
doe->data_object_ready = false;
- }
- return val;
+}
+static uint32_t doe_get_status(PCIEDOE *doe) +{
- uint32_t val = 0;
- if (doe->busy) {
val |= PCI_DOE_STATUS_DOE_BUSY;
- }
Do you actually intend to model busy?
It's an interesting corner. We probably want to be able to to check that OS code is doing the right thing. + It is possible we'll get writes from multiple CPUs so maybe it can race?
Obviously right now it does nothing.
- /* bit 1: interrupt not yet supported */
- if (doe->error) {
val |= PCI_DOE_STATUS_DOE_ERROR;
- }
- if (doe->data_object_ready) {
val |= PCI_DOE_STATUS_DATA_OBJECT_READY;
- }
- return val;
+}
+void doe_add_message_handler(PCIEDOE *doe, uint16_t vendor_id,
uint8_t object_type,
const doe_msg_handler_t handler, void *priv)
+{
- struct doe_handler *h = g_malloc0(sizeof(*handler));
- h->vendor_id = vendor_id;
- h->object_type = object_type;
- h->handler = handler;
- h->priv = priv;
- doe->cb_list = g_list_append(doe->cb_list, h);
+}
+uint32_t pcie_doe_ecap(PCIEDOE *doe, PCIDevice *d, uint16_t offset) +{
- doe->doe_base = offset;
- /* Length field is 18 bits and is in dwords */
- doe->store = g_malloc0((1 << 18) * sizeof(uint32_t));
I think it's fine if you're instantiating DOE to just have this be statically allocated.
Probably true for now. Might want to make it flexible given how large it needs to be is dependent on what commands are actually supported. Nice to be smaller if we never need it to be so big.
- pcie_add_capability(d, PCI_EXT_CAP_ID_DOE, 1, offset, 0x18);
- offset += 0x18;
- return offset;
+}
+void pcie_doe_write(PCIEDOE *doe, uint32_t addr, uint32_t val, int len) +{
- if (len != 4) {
return;
- }
Do you want to check alignment also?
I 'think' config writes are always aligned, but will check.
- switch (addr - doe->doe_base) {
- case PCI_DOE_CTRL:
doe_set_ctl(doe, val);
break;
- case PCI_DOE_WRITE_MAILBOX:
doe_set_write_mailbox(doe, val);
break;
- }
+}
+uint32_t pcie_doe_read(PCIEDOE *doe, uint32_t addr, int len, int *found) +{
- if (len != 4) {
*found = 0;
return 0;
- }
Same comment as _write
- *found = 1;
- switch (addr - doe->doe_base) {
- case PCI_DOE_CAP:
return 0; /* No interrupt support */
- case PCI_DOE_STATUS:
return doe_get_status(doe);
- case PCI_DOE_READ_MAILBOX:
return doe_get_read_mailbox(doe);
- default:
*found = 0;
return 0;
- }
+}
I'm guessing you don't love 'found'. Perhaps you can have this return an int64_t and negative values are errors?
I thought about it but that is also rather uggly.
I'm not yet convinced you even need that level of error handling though. I suppose next patch will tell me.
Found isn't about error handling, it's about saving time if we have lots of device specific config space callbacks. Don't want to waste time running callbacks as a given config register should only be handled once.
diff --git a/include/hw/pci/doe.h b/include/hw/pci/doe.h new file mode 100644 index 0000000000..364c866c53 --- /dev/null +++ b/include/hw/pci/doe.h @@ -0,0 +1,40 @@ +/*
- PCIE DOE emulation.
- Copyright (c) 2021 Jonathan Cameron Jonathan.Cameron@huawei.com
- This work is licensed under the terms of the GNU GPL, version 2 or later.
- See the COPYING file in the top-level directory.
- */
+#ifndef QEMU_PCIE_DOE_H_ +#define QEMU_PCIE_DOE_H_ +#include "qemu/osdep.h" +#include "qemu/units.h" +#include "hw/qdev-properties.h" +#include "qapi/error.h" +#include "qemu/module.h"
+typedef struct pcie_doe {
- uint32_t doe_base;
- GList *cb_list;
- int req_index;
- int req_length;
- int rsp_index;
- int rsp_length;
- bool data_object_ready;
- bool error;
- bool busy;
- uint32_t *store;
+} PCIEDOE;
+typedef int (*doe_msg_handler_t)(PCIEDOE *doe, uint16_t vendor_id,
uint8_t object_type, void *priv);
+uint32_t pcie_doe_ecap(PCIEDOE *doe, PCIDevice *d, uint16_t offset); +void doe_add_message_handler(PCIEDOE *doe, uint16_t vendor_id,
uint8_t object_type,
const doe_msg_handler_t handler, void *priv);
+uint32_t pcie_doe_read(PCIEDOE *doe, uint32_t addr, int len, int *found); +void pcie_doe_write(PCIEDOE *doe, uint32_t addr, uint32_t val, int len); +#endif diff --git a/include/hw/pci/pci_ids.h b/include/hw/pci/pci_ids.h index 76bf3ed590..636b2e8017 100644 --- a/include/hw/pci/pci_ids.h +++ b/include/hw/pci/pci_ids.h @@ -157,6 +157,8 @@
/* Vendors and devices. Sort key: vendor first, device next. */
+#define PCI_VENDOR_ID_PCI_SIG 0x0001
#define PCI_VENDOR_ID_LSI_LOGIC 0x1000 #define PCI_DEVICE_ID_LSI_53C810 0x0001
#define PCI_DEVICE_ID_LSI_53C895A 0x0012
2.19.1
CDAT is an ACPI like format defined by the CXL consortium. It is available from
https://www.uefi.org/node/4093
Here support for managing all the entires is introduced, along with an implementation of a callback for a DOE mailbox which may be used to read these values from CXL hardware by either firmware or an OS.
Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com --- hw/cxl/cxl-cdat.c | 252 ++++++++++++++++++++++++++++++++++++++ hw/cxl/meson.build | 1 + include/hw/cxl/cxl_cdat.h | 101 +++++++++++++++ 3 files changed, 354 insertions(+)
diff --git a/hw/cxl/cxl-cdat.c b/hw/cxl/cxl-cdat.c new file mode 100644 index 0000000000..6ed4c15cc0 --- /dev/null +++ b/hw/cxl/cxl-cdat.c @@ -0,0 +1,252 @@ +/* + * Support for CDAT entires as defined in + * Coherent Device Attribute Table (CDAT) Specification rev 1.02 + * Available from uefi.org. + * + * Copyright (c) 2021 Jonathan Cameron Jonathan.Cameron@huawei.com + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ +#include "qemu/osdep.h" +#include "qemu/units.h" +#include "qemu/error-report.h" +#include "hw/mem/memory-device.h" +#include "hw/mem/pc-dimm.h" +#include "hw/pci/pci.h" +#include "hw/pci/doe.h" +#include "hw/qdev-properties.h" +#include "qapi/error.h" +#include "qemu/log.h" +#include "qemu/module.h" +#include "qemu/range.h" +#include "qemu/rcu.h" +#include "sysemu/hostmem.h" +#include "sysemu/numa.h" +#include "hw/cxl/cxl.h" + +void cdat_add_dsmas(CXLCDAT *cdat, uint8_t dsmad_handle, uint8_t flags, + uint64_t dpa_base, uint64_t dpa_length) +{ + struct cxl_cdat_dsmas *dsmas = g_malloc0(sizeof(*dsmas)); + dsmas->dsmad_handle = dsmad_handle; + dsmas->flags = flags; + dsmas->dpa_base = dpa_base; + dsmas->dpa_length = dpa_length; + cdat->dsmas_list = g_list_append(cdat->dsmas_list, dsmas); +} + +void cdat_add_dslbis(CXLCDAT *cdat, uint8_t handle, uint8_t flags, + uint8_t data_type, uint64_t base_unit, + uint16_t entry0, uint16_t entry1, uint16_t entry2) +{ + struct cxl_cdat_dslbis *dslbis = g_malloc0(sizeof(*dslbis)); + dslbis->handle = handle; + dslbis->flags = flags; + dslbis->data_type = data_type; + dslbis->entry_base_unit = base_unit; + dslbis->entries[0] = entry0; + dslbis->entries[1] = entry1; + dslbis->entries[2] = entry2; + cdat->dslbis_list = g_list_append(cdat->dslbis_list, dslbis); +} + +void cdat_add_dsmscis(CXLCDAT *cdat, uint8_t dsmas_handle, + uint64_t memory_sc_size, uint64_t cache_attrs) +{ + struct cxl_cdat_dsmscis *dsmscis = g_malloc(sizeof(*dsmscis)); + dsmscis->dsmas_handle = dsmas_handle; + dsmscis->memory_side_cache_size = memory_sc_size; + dsmscis->cache_attributes = cache_attrs; + cdat->dsmscis_list = g_list_append(cdat->dsmscis_list, dsmscis); +} + +void cdat_add_dsis(CXLCDAT *cdat, uint8_t flags, uint8_t handle) +{ + struct cxl_cdat_dsis *dsis = g_malloc(sizeof(*dsis)); + dsis->flags = flags; + dsis->handle = handle; + cdat->dsis_list = g_list_append(cdat->dsis_list, dsis); +} + +void cdat_add_dsemts(CXLCDAT *cdat, uint8_t dsmas_handle, + uint8_t efi_mem_type_attr, uint64_t dpa_offset, + uint64_t dpa_length) +{ + struct cxl_cdat_dsemts *dsemts = g_malloc(sizeof(*dsemts)); + dsemts->dsmas_handle = dsmas_handle; + dsemts->efi_mem_type_attr = efi_mem_type_attr; + dsemts->dpa_offset = dpa_offset; + dsemts->dpa_length = dpa_length; + cdat->dsemts_list = g_list_append(cdat->dsemts_list, dsemts); +} + +struct cxl_cdat_sslbis *cdat_add_sslbis(CXLCDAT *cdat, uint8_t num_entries, + uint8_t data_type, uint64_t base_unit) +{ + struct cxl_cdat_sslbis *sslbis = + g_malloc(sizeof(*sslbis) + num_entries * sizeof(sslbis->entries[0])); + sslbis->num_entries = num_entries; + sslbis->data_type = data_type; + sslbis->base_unit = base_unit; + cdat->sslbis_list = g_list_append(cdat->sslbis_list, sslbis); + return sslbis; +} + +int cdata_sslbis_set_entry(struct cxl_cdat_sslbis *sslbis, uint8_t index, + uint16_t portx, uint16_t porty, uint16_t val) +{ + struct cxl_cdat_sslbis_entry *entry; + if (index >= sslbis->num_entries) { + return -1; + } + entry = &sslbis->entries[index]; + entry->port_x_id = portx; + entry->port_y_id = porty; + entry->val = val; + return 0; +} + +int cxl_table_access(PCIEDOE *doe, uint16_t vendor_id, uint8_t object_type, + void *priv) +{ + uint8_t table_type; + int total_entries; + uint16_t entry_handle; + CXLCDAT *cdat = priv; + uint16_t next_entry; + + if (doe->req_length != 3) { + /* optional error for unexpected command length */ + return -1; + } + + if ((doe->store[2] & CXL_DOE_TABLE_ACCESS_DW2_RCODE) != 0) { + /* + * Only table access code currently supported. + * Error indication is lack of Data Object Ready + */ + return -1; + } + + table_type = (doe->store[2] & CXL_DOE_TABLE_ACCESS_DW2_TYPE) >> + ctz32(CXL_DOE_TABLE_ACCESS_DW2_TYPE); + if (table_type != 0) { + /* Unsuported table ID so just don't set Data Object Ready */ + return -1; + } + entry_handle = (doe->store[2] & CXL_DOE_TABLE_ACCESS_DW2_ENTRYHANDLE) >> + ctz32(CXL_DOE_TABLE_ACCESS_DW2_ENTRYHANDLE); + + /* Assume entry handle == CDAT structure index */ + total_entries = g_list_length(cdat->dsmas_list) + + g_list_length(cdat->dslbis_list) + + g_list_length(cdat->dsmscis_list) + + g_list_length(cdat->dsis_list) + + g_list_length(cdat->dsemts_list) + + g_list_length(cdat->sslbis_list); + if (entry_handle + 1 == total_entries) { + next_entry = 0xFFFF; + } else { + next_entry = entry_handle + 1; + } + + if (entry_handle < g_list_length(cdat->dsmas_list)) { + const int dsmas_len = 24; + struct cxl_cdat_dsmas *dsmas = + g_list_nth_data(cdat->dsmas_list, entry_handle); + + doe->store[1] = 3 + dsmas_len / sizeof(uint32_t); + doe->store[2] = next_entry << 16; + doe->store[3] = (dsmas_len << 16) | CXL_CDAT_DSMAS_TYPE; + + /* cxl version of proximity domain */ + doe->store[4] = dsmas->dsmad_handle; + /* flags in 2nd byte of [4] (bit 2 is non volatile) */ + doe->store[4] |= (dsmas->flags << 8); + doe->store[5] = dsmas->dpa_base & 0xFFFFFFFF; + doe->store[6] = (dsmas->dpa_base >> 32) & 0xFFFFFFFF; + doe->store[7] = dsmas->dpa_length & 0xFFFFFFFF; + doe->store[8] = (dsmas->dpa_length >> 32) & 0xFFFFFFFF; + return 0; + } + entry_handle -= g_list_length(cdat->dsmas_list); + if (entry_handle < g_list_length(cdat->dslbis_list)) { + const int dslbis_len = 24; + struct cxl_cdat_dslbis *dslbis = + g_list_nth_data(cdat->dslbis_list, entry_handle); + + doe->store[1] = 3 + dslbis_len / sizeof(uint32_t); + doe->store[2] = next_entry << 16; + doe->store[3] = (dslbis_len << 16) | CXL_CDAT_DSLBIS_TYPE; + + doe->store[4] = (dslbis->data_type << 24) | (dslbis->flags << 8) | + dslbis->handle; + doe->store[5] = dslbis->entry_base_unit & 0xFFFFFFFF; + doe->store[6] = (dslbis->entry_base_unit >> 32) & 0xFFFFFFFF; + doe->store[7] = (dslbis->entries[1] << 16) | dslbis->entries[0]; + doe->store[8] = dslbis->entries[2]; + return 0; + } + entry_handle -= g_list_length(cdat->dslbis_list); + if (entry_handle < g_list_length(cdat->dsmscis_list)) { + const int dsmscis_len = 20; + struct cxl_cdat_dsmscis *dsmscis = + g_list_nth_data(cdat->dsmscis_list, entry_handle); + + doe->store[1] = 3 + dsmscis_len / sizeof(uint32_t); + doe->store[2] = next_entry << 16; + doe->store[3] = (dsmscis_len << 16) | CXL_CDAT_DSMSCIS_TYPE; + doe->store[4] = dsmscis->dsmas_handle; + doe->store[5] = dsmscis->memory_side_cache_size & 0xffffffff; + doe->store[6] = (dsmscis->memory_side_cache_size >> 32) & 0xffffffff; + doe->store[7] = dsmscis->cache_attributes; + } + entry_handle -= g_list_length(cdat->dsmscis_list); + if (entry_handle < g_list_length(cdat->dsis_list)) { + const int dsis_len = 8; + struct cxl_cdat_dsis *dsis = + g_list_nth_data(cdat->dsis_list, entry_handle); + + doe->store[1] = 3 + dsis_len / sizeof(uint32_t); + doe->store[2] = next_entry << 16; + doe->store[3] = (dsis_len << 16) | CXL_CDAT_DSMSCIS_TYPE; + doe->store[4] = (dsis->handle << 8) | dsis->flags; + } + entry_handle -= g_list_length(cdat->dsis_list); + if (entry_handle < g_list_length(cdat->dsemts_list)) { + const int dsemts_len = 24; + struct cxl_cdat_dsemts *dsemts = + g_list_nth_data(cdat->dsemts_list, entry_handle); + + doe->store[1] = 3 + dsemts_len / sizeof(uint32_t); + doe->store[2] = next_entry << 16; + doe->store[3] = (dsemts_len << 16) | CXL_CDAT_DSEMTS_TYPE; + doe->store[4] = (dsemts->efi_mem_type_attr << 8) | dsemts->dsmas_handle; + doe->store[5] = dsemts->dpa_offset & 0xffffffff; + doe->store[6] = (dsemts->dpa_offset >> 32) & 0xffffffff; + doe->store[7] = dsemts->dpa_length & 0xffffffff; + doe->store[8] = (dsemts->dpa_length >> 32) & 0xffffffff; + } + entry_handle -= g_list_length(cdat->dsemts_list); + if (entry_handle < g_list_length(cdat->sslbis_list)) { + struct cxl_cdat_sslbis *sslbis = + g_list_nth_data(cdat->sslbis_list, entry_handle); + int sslbis_len = 16 + 8 * sslbis->num_entries; + int i; + + doe->store[1] = 3 + sslbis_len / sizeof(uint32_t); + doe->store[2] = next_entry << 16; + doe->store[3] = (sslbis_len << 16) | CXL_CDAT_SSLBIS_TYPE; + doe->store[4] = sslbis->data_type; + doe->store[5] = sslbis->base_unit & 0xffffffff; + doe->store[6] = (sslbis->base_unit >> 32) & 0xffffffff; + for (i = 0; i < sslbis->num_entries; i++) { + doe->store[7 + i * 2] = (sslbis->entries[i].port_y_id << 8) | + sslbis->entries[i].port_x_id; + doe->store[7 + i * 2 + 1] = sslbis->entries[i].val; + } + } + + return -1; +} diff --git a/hw/cxl/meson.build b/hw/cxl/meson.build index 0eca715d10..9e2e5f4094 100644 --- a/hw/cxl/meson.build +++ b/hw/cxl/meson.build @@ -2,4 +2,5 @@ softmmu_ss.add(when: 'CONFIG_CXL', if_true: files( 'cxl-component-utils.c', 'cxl-device-utils.c', 'cxl-mailbox-utils.c', + 'cxl-cdat.c', )) diff --git a/include/hw/cxl/cxl_cdat.h b/include/hw/cxl/cxl_cdat.h new file mode 100644 index 0000000000..d0226c463c --- /dev/null +++ b/include/hw/cxl/cxl_cdat.h @@ -0,0 +1,101 @@ +/* + * Support for CDAT entires as defined in + * Coherent Device Attribute Table (CDAT) Specification rev 1.02 + * Available from uefi.org. + * + * Copyright (c) 2021 Jonathan Cameron Jonathan.Cameron@huawei.com + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ +#ifndef QEMU_CXL_CDAT_H +#define QEMU_CXL_CDAT_H +#include "hw/pci/doe.h" +/* DW2 is common to the request and response */ +#define CXL_DOE_TABLE_ACCESS_DW2_RCODE 0x000000ff +#define CXL_DOE_TABLE_ACCESS_DW2_TYPE 0x0000ff00 +#define CXL_DOE_TABLE_ACCESS_DW2_ENTRYHANDLE 0xffff0000 + +#define CXL_CDAT_DSMAS_TYPE 0 +#define CXL_CDAT_DSLBIS_TYPE 1 +#define CXL_CDAT_DSMSCIS_TYPE 2 +#define CXL_CDAT_DSIS_TYPE 3 +#define CXL_CDAT_DSEMTS_TYPE 4 +#define CXL_CDAT_SSLBIS_TYPE 5 + +struct cxl_cdat_dsmas { + uint8_t dsmad_handle; + uint8_t flags; +#define CDAT_DSMAS_FLAG_NV (1 << 2) + uint64_t dpa_base; + uint64_t dpa_length; +}; + +struct cxl_cdat_dslbis { + uint8_t handle; + uint8_t flags; + uint8_t data_type; + uint64_t entry_base_unit; + uint16_t entries[3]; /* 6 bytes */ +}; + +struct cxl_cdat_dsmscis { + uint8_t dsmas_handle; + uint64_t memory_side_cache_size; + uint32_t cache_attributes; +}; + +struct cxl_cdat_dsis { + uint8_t flags; +#define CDAT_DSIS_MEMORY_ATTACHED 0x01 + uint8_t handle; +}; + +struct cxl_cdat_dsemts { + uint8_t dsmas_handle; + uint8_t efi_mem_type_attr; + uint64_t dpa_offset; + uint64_t dpa_length; +}; + +struct cxl_cdat_sslbis_entry { + uint16_t port_x_id; + uint16_t port_y_id; + uint16_t val; +}; + +struct cxl_cdat_sslbis { + uint8_t num_entries; /* needed to compute length */ + uint8_t data_type; + uint64_t base_unit; + struct cxl_cdat_sslbis_entry entries[]; +}; + +typedef struct cxl_cdat { + GList *dsmas_list; + GList *dslbis_list; + GList *dsmscis_list; + GList *dsis_list; + GList *dsemts_list; + GList *sslbis_list; +} CXLCDAT; + +void cdat_add_dsmas(CXLCDAT *cdat, uint8_t dsmad_handle, uint8_t flags, + uint64_t dpa_base, uint64_t dpa_length); +void cdat_add_dslbis(CXLCDAT *cdat, uint8_t handle, uint8_t flags, + uint8_t data_type, uint64_t base_unit, + uint16_t entry0, uint16_t entry1, uint16_t entry2); +void cdat_add_dsmscis(CXLCDAT *cdat, uint8_t dsmas_handle, + uint64_t memory_sc_size, uint64_t cache_attrs); +void cdat_add_dsis(CXLCDAT *cdat, uint8_t flags, uint8_t handle); +void cdat_add_dsemts(CXLCDAT *cdat, uint8_t dsmas_handle, + uint8_t efi_mem_type_attr, uint64_t dpa_offset, + uint64_t dpa_length); +struct cxl_cdat_sslbis *cdat_add_sslbis(CXLCDAT *cdat, uint8_t num_entries, + uint8_t data_type, uint64_t base_unit); +int cdata_sslbis_set_entry(struct cxl_cdat_sslbis *sslbis, uint8_t index, + uint16_t portx, uint16_t pory, uint16_t val); + +int cxl_table_access(PCIEDOE *doe, uint16_t vendor_id, uint8_t object_type, + void *priv); +#endif
On 21-02-01 23:16:28, Jonathan Cameron wrote:
CDAT is an ACPI like format defined by the CXL consortium. It is available from
https://www.uefi.org/node/4093
Here support for managing all the entires is introduced, along with an implementation of a callback for a DOE mailbox which may be used to read these values from CXL hardware by either firmware or an OS.
Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com
I seem to be missing one critical thing, where is the CDAT header (Table 1 from the spec) actually populated, length, revision, checksum, etc?
General, probably underthought-out, comment:
With the CXLType3Class I added since you probably wrote these patches, I wonder if that'd be a better fit for populating some of these tables, having them populate dynamically when the DOE/CDAT request actually comes in.
I have no strong opinion. The one advantage I see to using the class is operations like managing handles, length calculations, etc can be managed by the helper library rather than the device implementation having to do it. OTTOH if you have devices that want to do weird things, they might lose some flexibility.
I think with just a few methods added to the CXLType3Class you could pretty trivially build up a nice sane default CDAT for any CXL device.
hw/cxl/cxl-cdat.c | 252 ++++++++++++++++++++++++++++++++++++++ hw/cxl/meson.build | 1 + include/hw/cxl/cxl_cdat.h | 101 +++++++++++++++ 3 files changed, 354 insertions(+)
diff --git a/hw/cxl/cxl-cdat.c b/hw/cxl/cxl-cdat.c new file mode 100644 index 0000000000..6ed4c15cc0 --- /dev/null +++ b/hw/cxl/cxl-cdat.c @@ -0,0 +1,252 @@ +/*
- Support for CDAT entires as defined in
- Coherent Device Attribute Table (CDAT) Specification rev 1.02
- Available from uefi.org.
- Copyright (c) 2021 Jonathan Cameron Jonathan.Cameron@huawei.com
- This work is licensed under the terms of the GNU GPL, version 2 or later.
- See the COPYING file in the top-level directory.
- */
+#include "qemu/osdep.h" +#include "qemu/units.h" +#include "qemu/error-report.h" +#include "hw/mem/memory-device.h" +#include "hw/mem/pc-dimm.h" +#include "hw/pci/pci.h" +#include "hw/pci/doe.h" +#include "hw/qdev-properties.h" +#include "qapi/error.h" +#include "qemu/log.h" +#include "qemu/module.h" +#include "qemu/range.h" +#include "qemu/rcu.h" +#include "sysemu/hostmem.h" +#include "sysemu/numa.h" +#include "hw/cxl/cxl.h"
+void cdat_add_dsmas(CXLCDAT *cdat, uint8_t dsmad_handle, uint8_t flags,
uint64_t dpa_base, uint64_t dpa_length)
+{
- struct cxl_cdat_dsmas *dsmas = g_malloc0(sizeof(*dsmas));
- dsmas->dsmad_handle = dsmad_handle;
- dsmas->flags = flags;
- dsmas->dpa_base = dpa_base;
- dsmas->dpa_length = dpa_length;
- cdat->dsmas_list = g_list_append(cdat->dsmas_list, dsmas);
+}
+void cdat_add_dslbis(CXLCDAT *cdat, uint8_t handle, uint8_t flags,
uint8_t data_type, uint64_t base_unit,
uint16_t entry0, uint16_t entry1, uint16_t entry2)
+{
- struct cxl_cdat_dslbis *dslbis = g_malloc0(sizeof(*dslbis));
- dslbis->handle = handle;
- dslbis->flags = flags;
- dslbis->data_type = data_type;
- dslbis->entry_base_unit = base_unit;
- dslbis->entries[0] = entry0;
- dslbis->entries[1] = entry1;
- dslbis->entries[2] = entry2;
- cdat->dslbis_list = g_list_append(cdat->dslbis_list, dslbis);
+}
+void cdat_add_dsmscis(CXLCDAT *cdat, uint8_t dsmas_handle,
uint64_t memory_sc_size, uint64_t cache_attrs)
+{
- struct cxl_cdat_dsmscis *dsmscis = g_malloc(sizeof(*dsmscis));
- dsmscis->dsmas_handle = dsmas_handle;
- dsmscis->memory_side_cache_size = memory_sc_size;
- dsmscis->cache_attributes = cache_attrs;
- cdat->dsmscis_list = g_list_append(cdat->dsmscis_list, dsmscis);
+}
+void cdat_add_dsis(CXLCDAT *cdat, uint8_t flags, uint8_t handle) +{
- struct cxl_cdat_dsis *dsis = g_malloc(sizeof(*dsis));
- dsis->flags = flags;
- dsis->handle = handle;
- cdat->dsis_list = g_list_append(cdat->dsis_list, dsis);
+}
+void cdat_add_dsemts(CXLCDAT *cdat, uint8_t dsmas_handle,
uint8_t efi_mem_type_attr, uint64_t dpa_offset,
uint64_t dpa_length)
+{
- struct cxl_cdat_dsemts *dsemts = g_malloc(sizeof(*dsemts));
- dsemts->dsmas_handle = dsmas_handle;
- dsemts->efi_mem_type_attr = efi_mem_type_attr;
- dsemts->dpa_offset = dpa_offset;
- dsemts->dpa_length = dpa_length;
- cdat->dsemts_list = g_list_append(cdat->dsemts_list, dsemts);
+}
+struct cxl_cdat_sslbis *cdat_add_sslbis(CXLCDAT *cdat, uint8_t num_entries,
uint8_t data_type, uint64_t base_unit)
+{
- struct cxl_cdat_sslbis *sslbis =
g_malloc(sizeof(*sslbis) + num_entries * sizeof(sslbis->entries[0]));
- sslbis->num_entries = num_entries;
- sslbis->data_type = data_type;
- sslbis->base_unit = base_unit;
- cdat->sslbis_list = g_list_append(cdat->sslbis_list, sslbis);
- return sslbis;
+}
+int cdata_sslbis_set_entry(struct cxl_cdat_sslbis *sslbis, uint8_t index,
uint16_t portx, uint16_t porty, uint16_t val)
+{
- struct cxl_cdat_sslbis_entry *entry;
- if (index >= sslbis->num_entries) {
return -1;
- }
- entry = &sslbis->entries[index];
- entry->port_x_id = portx;
- entry->port_y_id = porty;
- entry->val = val;
- return 0;
+}
+int cxl_table_access(PCIEDOE *doe, uint16_t vendor_id, uint8_t object_type,
void *priv)
+{
- uint8_t table_type;
- int total_entries;
- uint16_t entry_handle;
- CXLCDAT *cdat = priv;
- uint16_t next_entry;
- if (doe->req_length != 3) {
/* optional error for unexpected command length */
return -1;
- }
- if ((doe->store[2] & CXL_DOE_TABLE_ACCESS_DW2_RCODE) != 0) {
/*
* Only table access code currently supported.
* Error indication is lack of Data Object Ready
*/
return -1;
- }
- table_type = (doe->store[2] & CXL_DOE_TABLE_ACCESS_DW2_TYPE) >>
ctz32(CXL_DOE_TABLE_ACCESS_DW2_TYPE);
- if (table_type != 0) {
/* Unsuported table ID so just don't set Data Object Ready */
return -1;
- }
- entry_handle = (doe->store[2] & CXL_DOE_TABLE_ACCESS_DW2_ENTRYHANDLE) >>
ctz32(CXL_DOE_TABLE_ACCESS_DW2_ENTRYHANDLE);
- /* Assume entry handle == CDAT structure index */
- total_entries = g_list_length(cdat->dsmas_list) +
g_list_length(cdat->dslbis_list) +
g_list_length(cdat->dsmscis_list) +
g_list_length(cdat->dsis_list) +
g_list_length(cdat->dsemts_list) +
g_list_length(cdat->sslbis_list);
- if (entry_handle + 1 == total_entries) {
next_entry = 0xFFFF;
- } else {
next_entry = entry_handle + 1;
- }
- if (entry_handle < g_list_length(cdat->dsmas_list)) {
const int dsmas_len = 24;
struct cxl_cdat_dsmas *dsmas =
g_list_nth_data(cdat->dsmas_list, entry_handle);
doe->store[1] = 3 + dsmas_len / sizeof(uint32_t);
doe->store[2] = next_entry << 16;
doe->store[3] = (dsmas_len << 16) | CXL_CDAT_DSMAS_TYPE;
/* cxl version of proximity domain */
doe->store[4] = dsmas->dsmad_handle;
/* flags in 2nd byte of [4] (bit 2 is non volatile) */
doe->store[4] |= (dsmas->flags << 8);
doe->store[5] = dsmas->dpa_base & 0xFFFFFFFF;
doe->store[6] = (dsmas->dpa_base >> 32) & 0xFFFFFFFF;
doe->store[7] = dsmas->dpa_length & 0xFFFFFFFF;
doe->store[8] = (dsmas->dpa_length >> 32) & 0xFFFFFFFF;
return 0;
- }
- entry_handle -= g_list_length(cdat->dsmas_list);
- if (entry_handle < g_list_length(cdat->dslbis_list)) {
const int dslbis_len = 24;
struct cxl_cdat_dslbis *dslbis =
g_list_nth_data(cdat->dslbis_list, entry_handle);
doe->store[1] = 3 + dslbis_len / sizeof(uint32_t);
doe->store[2] = next_entry << 16;
doe->store[3] = (dslbis_len << 16) | CXL_CDAT_DSLBIS_TYPE;
doe->store[4] = (dslbis->data_type << 24) | (dslbis->flags << 8) |
dslbis->handle;
doe->store[5] = dslbis->entry_base_unit & 0xFFFFFFFF;
doe->store[6] = (dslbis->entry_base_unit >> 32) & 0xFFFFFFFF;
doe->store[7] = (dslbis->entries[1] << 16) | dslbis->entries[0];
doe->store[8] = dslbis->entries[2];
return 0;
- }
- entry_handle -= g_list_length(cdat->dslbis_list);
- if (entry_handle < g_list_length(cdat->dsmscis_list)) {
const int dsmscis_len = 20;
struct cxl_cdat_dsmscis *dsmscis =
g_list_nth_data(cdat->dsmscis_list, entry_handle);
doe->store[1] = 3 + dsmscis_len / sizeof(uint32_t);
doe->store[2] = next_entry << 16;
doe->store[3] = (dsmscis_len << 16) | CXL_CDAT_DSMSCIS_TYPE;
doe->store[4] = dsmscis->dsmas_handle;
doe->store[5] = dsmscis->memory_side_cache_size & 0xffffffff;
doe->store[6] = (dsmscis->memory_side_cache_size >> 32) & 0xffffffff;
doe->store[7] = dsmscis->cache_attributes;
- }
- entry_handle -= g_list_length(cdat->dsmscis_list);
- if (entry_handle < g_list_length(cdat->dsis_list)) {
const int dsis_len = 8;
struct cxl_cdat_dsis *dsis =
g_list_nth_data(cdat->dsis_list, entry_handle);
doe->store[1] = 3 + dsis_len / sizeof(uint32_t);
doe->store[2] = next_entry << 16;
doe->store[3] = (dsis_len << 16) | CXL_CDAT_DSMSCIS_TYPE;
doe->store[4] = (dsis->handle << 8) | dsis->flags;
- }
- entry_handle -= g_list_length(cdat->dsis_list);
- if (entry_handle < g_list_length(cdat->dsemts_list)) {
const int dsemts_len = 24;
struct cxl_cdat_dsemts *dsemts =
g_list_nth_data(cdat->dsemts_list, entry_handle);
doe->store[1] = 3 + dsemts_len / sizeof(uint32_t);
doe->store[2] = next_entry << 16;
doe->store[3] = (dsemts_len << 16) | CXL_CDAT_DSEMTS_TYPE;
doe->store[4] = (dsemts->efi_mem_type_attr << 8) | dsemts->dsmas_handle;
doe->store[5] = dsemts->dpa_offset & 0xffffffff;
doe->store[6] = (dsemts->dpa_offset >> 32) & 0xffffffff;
doe->store[7] = dsemts->dpa_length & 0xffffffff;
doe->store[8] = (dsemts->dpa_length >> 32) & 0xffffffff;
- }
- entry_handle -= g_list_length(cdat->dsemts_list);
- if (entry_handle < g_list_length(cdat->sslbis_list)) {
struct cxl_cdat_sslbis *sslbis =
g_list_nth_data(cdat->sslbis_list, entry_handle);
int sslbis_len = 16 + 8 * sslbis->num_entries;
int i;
doe->store[1] = 3 + sslbis_len / sizeof(uint32_t);
doe->store[2] = next_entry << 16;
doe->store[3] = (sslbis_len << 16) | CXL_CDAT_SSLBIS_TYPE;
doe->store[4] = sslbis->data_type;
doe->store[5] = sslbis->base_unit & 0xffffffff;
doe->store[6] = (sslbis->base_unit >> 32) & 0xffffffff;
for (i = 0; i < sslbis->num_entries; i++) {
doe->store[7 + i * 2] = (sslbis->entries[i].port_y_id << 8) |
sslbis->entries[i].port_x_id;
doe->store[7 + i * 2 + 1] = sslbis->entries[i].val;
}
- }
- return -1;
+} diff --git a/hw/cxl/meson.build b/hw/cxl/meson.build index 0eca715d10..9e2e5f4094 100644 --- a/hw/cxl/meson.build +++ b/hw/cxl/meson.build @@ -2,4 +2,5 @@ softmmu_ss.add(when: 'CONFIG_CXL', if_true: files( 'cxl-component-utils.c', 'cxl-device-utils.c', 'cxl-mailbox-utils.c',
- 'cxl-cdat.c',
)) diff --git a/include/hw/cxl/cxl_cdat.h b/include/hw/cxl/cxl_cdat.h new file mode 100644 index 0000000000..d0226c463c --- /dev/null +++ b/include/hw/cxl/cxl_cdat.h @@ -0,0 +1,101 @@ +/*
- Support for CDAT entires as defined in
- Coherent Device Attribute Table (CDAT) Specification rev 1.02
- Available from uefi.org.
- Copyright (c) 2021 Jonathan Cameron Jonathan.Cameron@huawei.com
- This work is licensed under the terms of the GNU GPL, version 2 or later.
- See the COPYING file in the top-level directory.
- */
+#ifndef QEMU_CXL_CDAT_H +#define QEMU_CXL_CDAT_H +#include "hw/pci/doe.h" +/* DW2 is common to the request and response */ +#define CXL_DOE_TABLE_ACCESS_DW2_RCODE 0x000000ff +#define CXL_DOE_TABLE_ACCESS_DW2_TYPE 0x0000ff00 +#define CXL_DOE_TABLE_ACCESS_DW2_ENTRYHANDLE 0xffff0000
+#define CXL_CDAT_DSMAS_TYPE 0 +#define CXL_CDAT_DSLBIS_TYPE 1 +#define CXL_CDAT_DSMSCIS_TYPE 2 +#define CXL_CDAT_DSIS_TYPE 3 +#define CXL_CDAT_DSEMTS_TYPE 4 +#define CXL_CDAT_SSLBIS_TYPE 5
+struct cxl_cdat_dsmas {
- uint8_t dsmad_handle;
- uint8_t flags;
+#define CDAT_DSMAS_FLAG_NV (1 << 2)
- uint64_t dpa_base;
- uint64_t dpa_length;
+};
+struct cxl_cdat_dslbis {
- uint8_t handle;
- uint8_t flags;
- uint8_t data_type;
- uint64_t entry_base_unit;
- uint16_t entries[3]; /* 6 bytes */
+};
+struct cxl_cdat_dsmscis {
- uint8_t dsmas_handle;
- uint64_t memory_side_cache_size;
- uint32_t cache_attributes;
+};
+struct cxl_cdat_dsis {
- uint8_t flags;
+#define CDAT_DSIS_MEMORY_ATTACHED 0x01
- uint8_t handle;
+};
+struct cxl_cdat_dsemts {
- uint8_t dsmas_handle;
- uint8_t efi_mem_type_attr;
- uint64_t dpa_offset;
- uint64_t dpa_length;
+};
+struct cxl_cdat_sslbis_entry {
- uint16_t port_x_id;
- uint16_t port_y_id;
- uint16_t val;
+};
+struct cxl_cdat_sslbis {
- uint8_t num_entries; /* needed to compute length */
- uint8_t data_type;
- uint64_t base_unit;
- struct cxl_cdat_sslbis_entry entries[];
+};
+typedef struct cxl_cdat {
- GList *dsmas_list;
- GList *dslbis_list;
- GList *dsmscis_list;
- GList *dsis_list;
- GList *dsemts_list;
- GList *sslbis_list;
+} CXLCDAT;
+void cdat_add_dsmas(CXLCDAT *cdat, uint8_t dsmad_handle, uint8_t flags,
uint64_t dpa_base, uint64_t dpa_length);
+void cdat_add_dslbis(CXLCDAT *cdat, uint8_t handle, uint8_t flags,
uint8_t data_type, uint64_t base_unit,
uint16_t entry0, uint16_t entry1, uint16_t entry2);
+void cdat_add_dsmscis(CXLCDAT *cdat, uint8_t dsmas_handle,
uint64_t memory_sc_size, uint64_t cache_attrs);
+void cdat_add_dsis(CXLCDAT *cdat, uint8_t flags, uint8_t handle); +void cdat_add_dsemts(CXLCDAT *cdat, uint8_t dsmas_handle,
uint8_t efi_mem_type_attr, uint64_t dpa_offset,
uint64_t dpa_length);
+struct cxl_cdat_sslbis *cdat_add_sslbis(CXLCDAT *cdat, uint8_t num_entries,
uint8_t data_type, uint64_t base_unit);
+int cdata_sslbis_set_entry(struct cxl_cdat_sslbis *sslbis, uint8_t index,
uint16_t portx, uint16_t pory, uint16_t val);
+int cxl_table_access(PCIEDOE *doe, uint16_t vendor_id, uint8_t object_type,
void *priv);
+#endif
2.19.1
On Tue, 2 Feb 2021 10:49:23 -0800 Ben Widawsky ben@bwidawsk.net wrote:
On 21-02-01 23:16:28, Jonathan Cameron wrote:
CDAT is an ACPI like format defined by the CXL consortium. It is available from
https://www.uefi.org/node/4093
Here support for managing all the entires is introduced, along with an implementation of a callback for a DOE mailbox which may be used to read these values from CXL hardware by either firmware or an OS.
Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com
I seem to be missing one critical thing, where is the CDAT header (Table 1 from the spec) actually populated, length, revision, checksum, etc?
An excellent question. As far as I can see there is no actual means defined to read that from a device. The DOE stuff in the CXL spec just accesses 'entries' not the header (thus as far as I can tell making a good chunk of the CDAT spec entirely pointless).
It's on my list to raise on the SSWG list / call. Perhaps I'm missing something...
General, probably underthought-out, comment:
With the CXLType3Class I added since you probably wrote these patches, I wonder if that'd be a better fit for populating some of these tables, having them populate dynamically when the DOE/CDAT request actually comes in.
I have no strong opinion. The one advantage I see to using the class is operations like managing handles, length calculations, etc can be managed by the helper library rather than the device implementation having to do it. OTTOH if you have devices that want to do weird things, they might lose some flexibility.
I think with just a few methods added to the CXLType3Class you could pretty trivially build up a nice sane default CDAT for any CXL device.
I'm not convinced there is such a thing as a sane default, but can certainly look at this if we take this particular implementation forwards. I suspect we'll also want to make a lot of this configurable from the command line or similar on a per device basis.
Thanks,
Jonathan
hw/cxl/cxl-cdat.c | 252 ++++++++++++++++++++++++++++++++++++++ hw/cxl/meson.build | 1 + include/hw/cxl/cxl_cdat.h | 101 +++++++++++++++ 3 files changed, 354 insertions(+)
diff --git a/hw/cxl/cxl-cdat.c b/hw/cxl/cxl-cdat.c new file mode 100644 index 0000000000..6ed4c15cc0 --- /dev/null +++ b/hw/cxl/cxl-cdat.c @@ -0,0 +1,252 @@ +/*
- Support for CDAT entires as defined in
- Coherent Device Attribute Table (CDAT) Specification rev 1.02
- Available from uefi.org.
- Copyright (c) 2021 Jonathan Cameron Jonathan.Cameron@huawei.com
- This work is licensed under the terms of the GNU GPL, version 2 or later.
- See the COPYING file in the top-level directory.
- */
+#include "qemu/osdep.h" +#include "qemu/units.h" +#include "qemu/error-report.h" +#include "hw/mem/memory-device.h" +#include "hw/mem/pc-dimm.h" +#include "hw/pci/pci.h" +#include "hw/pci/doe.h" +#include "hw/qdev-properties.h" +#include "qapi/error.h" +#include "qemu/log.h" +#include "qemu/module.h" +#include "qemu/range.h" +#include "qemu/rcu.h" +#include "sysemu/hostmem.h" +#include "sysemu/numa.h" +#include "hw/cxl/cxl.h"
+void cdat_add_dsmas(CXLCDAT *cdat, uint8_t dsmad_handle, uint8_t flags,
uint64_t dpa_base, uint64_t dpa_length)
+{
- struct cxl_cdat_dsmas *dsmas = g_malloc0(sizeof(*dsmas));
- dsmas->dsmad_handle = dsmad_handle;
- dsmas->flags = flags;
- dsmas->dpa_base = dpa_base;
- dsmas->dpa_length = dpa_length;
- cdat->dsmas_list = g_list_append(cdat->dsmas_list, dsmas);
+}
+void cdat_add_dslbis(CXLCDAT *cdat, uint8_t handle, uint8_t flags,
uint8_t data_type, uint64_t base_unit,
uint16_t entry0, uint16_t entry1, uint16_t entry2)
+{
- struct cxl_cdat_dslbis *dslbis = g_malloc0(sizeof(*dslbis));
- dslbis->handle = handle;
- dslbis->flags = flags;
- dslbis->data_type = data_type;
- dslbis->entry_base_unit = base_unit;
- dslbis->entries[0] = entry0;
- dslbis->entries[1] = entry1;
- dslbis->entries[2] = entry2;
- cdat->dslbis_list = g_list_append(cdat->dslbis_list, dslbis);
+}
+void cdat_add_dsmscis(CXLCDAT *cdat, uint8_t dsmas_handle,
uint64_t memory_sc_size, uint64_t cache_attrs)
+{
- struct cxl_cdat_dsmscis *dsmscis = g_malloc(sizeof(*dsmscis));
- dsmscis->dsmas_handle = dsmas_handle;
- dsmscis->memory_side_cache_size = memory_sc_size;
- dsmscis->cache_attributes = cache_attrs;
- cdat->dsmscis_list = g_list_append(cdat->dsmscis_list, dsmscis);
+}
+void cdat_add_dsis(CXLCDAT *cdat, uint8_t flags, uint8_t handle) +{
- struct cxl_cdat_dsis *dsis = g_malloc(sizeof(*dsis));
- dsis->flags = flags;
- dsis->handle = handle;
- cdat->dsis_list = g_list_append(cdat->dsis_list, dsis);
+}
+void cdat_add_dsemts(CXLCDAT *cdat, uint8_t dsmas_handle,
uint8_t efi_mem_type_attr, uint64_t dpa_offset,
uint64_t dpa_length)
+{
- struct cxl_cdat_dsemts *dsemts = g_malloc(sizeof(*dsemts));
- dsemts->dsmas_handle = dsmas_handle;
- dsemts->efi_mem_type_attr = efi_mem_type_attr;
- dsemts->dpa_offset = dpa_offset;
- dsemts->dpa_length = dpa_length;
- cdat->dsemts_list = g_list_append(cdat->dsemts_list, dsemts);
+}
+struct cxl_cdat_sslbis *cdat_add_sslbis(CXLCDAT *cdat, uint8_t num_entries,
uint8_t data_type, uint64_t base_unit)
+{
- struct cxl_cdat_sslbis *sslbis =
g_malloc(sizeof(*sslbis) + num_entries * sizeof(sslbis->entries[0]));
- sslbis->num_entries = num_entries;
- sslbis->data_type = data_type;
- sslbis->base_unit = base_unit;
- cdat->sslbis_list = g_list_append(cdat->sslbis_list, sslbis);
- return sslbis;
+}
+int cdata_sslbis_set_entry(struct cxl_cdat_sslbis *sslbis, uint8_t index,
uint16_t portx, uint16_t porty, uint16_t val)
+{
- struct cxl_cdat_sslbis_entry *entry;
- if (index >= sslbis->num_entries) {
return -1;
- }
- entry = &sslbis->entries[index];
- entry->port_x_id = portx;
- entry->port_y_id = porty;
- entry->val = val;
- return 0;
+}
+int cxl_table_access(PCIEDOE *doe, uint16_t vendor_id, uint8_t object_type,
void *priv)
+{
- uint8_t table_type;
- int total_entries;
- uint16_t entry_handle;
- CXLCDAT *cdat = priv;
- uint16_t next_entry;
- if (doe->req_length != 3) {
/* optional error for unexpected command length */
return -1;
- }
- if ((doe->store[2] & CXL_DOE_TABLE_ACCESS_DW2_RCODE) != 0) {
/*
* Only table access code currently supported.
* Error indication is lack of Data Object Ready
*/
return -1;
- }
- table_type = (doe->store[2] & CXL_DOE_TABLE_ACCESS_DW2_TYPE) >>
ctz32(CXL_DOE_TABLE_ACCESS_DW2_TYPE);
- if (table_type != 0) {
/* Unsuported table ID so just don't set Data Object Ready */
return -1;
- }
- entry_handle = (doe->store[2] & CXL_DOE_TABLE_ACCESS_DW2_ENTRYHANDLE) >>
ctz32(CXL_DOE_TABLE_ACCESS_DW2_ENTRYHANDLE);
- /* Assume entry handle == CDAT structure index */
- total_entries = g_list_length(cdat->dsmas_list) +
g_list_length(cdat->dslbis_list) +
g_list_length(cdat->dsmscis_list) +
g_list_length(cdat->dsis_list) +
g_list_length(cdat->dsemts_list) +
g_list_length(cdat->sslbis_list);
- if (entry_handle + 1 == total_entries) {
next_entry = 0xFFFF;
- } else {
next_entry = entry_handle + 1;
- }
- if (entry_handle < g_list_length(cdat->dsmas_list)) {
const int dsmas_len = 24;
struct cxl_cdat_dsmas *dsmas =
g_list_nth_data(cdat->dsmas_list, entry_handle);
doe->store[1] = 3 + dsmas_len / sizeof(uint32_t);
doe->store[2] = next_entry << 16;
doe->store[3] = (dsmas_len << 16) | CXL_CDAT_DSMAS_TYPE;
/* cxl version of proximity domain */
doe->store[4] = dsmas->dsmad_handle;
/* flags in 2nd byte of [4] (bit 2 is non volatile) */
doe->store[4] |= (dsmas->flags << 8);
doe->store[5] = dsmas->dpa_base & 0xFFFFFFFF;
doe->store[6] = (dsmas->dpa_base >> 32) & 0xFFFFFFFF;
doe->store[7] = dsmas->dpa_length & 0xFFFFFFFF;
doe->store[8] = (dsmas->dpa_length >> 32) & 0xFFFFFFFF;
return 0;
- }
- entry_handle -= g_list_length(cdat->dsmas_list);
- if (entry_handle < g_list_length(cdat->dslbis_list)) {
const int dslbis_len = 24;
struct cxl_cdat_dslbis *dslbis =
g_list_nth_data(cdat->dslbis_list, entry_handle);
doe->store[1] = 3 + dslbis_len / sizeof(uint32_t);
doe->store[2] = next_entry << 16;
doe->store[3] = (dslbis_len << 16) | CXL_CDAT_DSLBIS_TYPE;
doe->store[4] = (dslbis->data_type << 24) | (dslbis->flags << 8) |
dslbis->handle;
doe->store[5] = dslbis->entry_base_unit & 0xFFFFFFFF;
doe->store[6] = (dslbis->entry_base_unit >> 32) & 0xFFFFFFFF;
doe->store[7] = (dslbis->entries[1] << 16) | dslbis->entries[0];
doe->store[8] = dslbis->entries[2];
return 0;
- }
- entry_handle -= g_list_length(cdat->dslbis_list);
- if (entry_handle < g_list_length(cdat->dsmscis_list)) {
const int dsmscis_len = 20;
struct cxl_cdat_dsmscis *dsmscis =
g_list_nth_data(cdat->dsmscis_list, entry_handle);
doe->store[1] = 3 + dsmscis_len / sizeof(uint32_t);
doe->store[2] = next_entry << 16;
doe->store[3] = (dsmscis_len << 16) | CXL_CDAT_DSMSCIS_TYPE;
doe->store[4] = dsmscis->dsmas_handle;
doe->store[5] = dsmscis->memory_side_cache_size & 0xffffffff;
doe->store[6] = (dsmscis->memory_side_cache_size >> 32) & 0xffffffff;
doe->store[7] = dsmscis->cache_attributes;
- }
- entry_handle -= g_list_length(cdat->dsmscis_list);
- if (entry_handle < g_list_length(cdat->dsis_list)) {
const int dsis_len = 8;
struct cxl_cdat_dsis *dsis =
g_list_nth_data(cdat->dsis_list, entry_handle);
doe->store[1] = 3 + dsis_len / sizeof(uint32_t);
doe->store[2] = next_entry << 16;
doe->store[3] = (dsis_len << 16) | CXL_CDAT_DSMSCIS_TYPE;
doe->store[4] = (dsis->handle << 8) | dsis->flags;
- }
- entry_handle -= g_list_length(cdat->dsis_list);
- if (entry_handle < g_list_length(cdat->dsemts_list)) {
const int dsemts_len = 24;
struct cxl_cdat_dsemts *dsemts =
g_list_nth_data(cdat->dsemts_list, entry_handle);
doe->store[1] = 3 + dsemts_len / sizeof(uint32_t);
doe->store[2] = next_entry << 16;
doe->store[3] = (dsemts_len << 16) | CXL_CDAT_DSEMTS_TYPE;
doe->store[4] = (dsemts->efi_mem_type_attr << 8) | dsemts->dsmas_handle;
doe->store[5] = dsemts->dpa_offset & 0xffffffff;
doe->store[6] = (dsemts->dpa_offset >> 32) & 0xffffffff;
doe->store[7] = dsemts->dpa_length & 0xffffffff;
doe->store[8] = (dsemts->dpa_length >> 32) & 0xffffffff;
- }
- entry_handle -= g_list_length(cdat->dsemts_list);
- if (entry_handle < g_list_length(cdat->sslbis_list)) {
struct cxl_cdat_sslbis *sslbis =
g_list_nth_data(cdat->sslbis_list, entry_handle);
int sslbis_len = 16 + 8 * sslbis->num_entries;
int i;
doe->store[1] = 3 + sslbis_len / sizeof(uint32_t);
doe->store[2] = next_entry << 16;
doe->store[3] = (sslbis_len << 16) | CXL_CDAT_SSLBIS_TYPE;
doe->store[4] = sslbis->data_type;
doe->store[5] = sslbis->base_unit & 0xffffffff;
doe->store[6] = (sslbis->base_unit >> 32) & 0xffffffff;
for (i = 0; i < sslbis->num_entries; i++) {
doe->store[7 + i * 2] = (sslbis->entries[i].port_y_id << 8) |
sslbis->entries[i].port_x_id;
doe->store[7 + i * 2 + 1] = sslbis->entries[i].val;
}
- }
- return -1;
+} diff --git a/hw/cxl/meson.build b/hw/cxl/meson.build index 0eca715d10..9e2e5f4094 100644 --- a/hw/cxl/meson.build +++ b/hw/cxl/meson.build @@ -2,4 +2,5 @@ softmmu_ss.add(when: 'CONFIG_CXL', if_true: files( 'cxl-component-utils.c', 'cxl-device-utils.c', 'cxl-mailbox-utils.c',
- 'cxl-cdat.c',
)) diff --git a/include/hw/cxl/cxl_cdat.h b/include/hw/cxl/cxl_cdat.h new file mode 100644 index 0000000000..d0226c463c --- /dev/null +++ b/include/hw/cxl/cxl_cdat.h @@ -0,0 +1,101 @@ +/*
- Support for CDAT entires as defined in
- Coherent Device Attribute Table (CDAT) Specification rev 1.02
- Available from uefi.org.
- Copyright (c) 2021 Jonathan Cameron Jonathan.Cameron@huawei.com
- This work is licensed under the terms of the GNU GPL, version 2 or later.
- See the COPYING file in the top-level directory.
- */
+#ifndef QEMU_CXL_CDAT_H +#define QEMU_CXL_CDAT_H +#include "hw/pci/doe.h" +/* DW2 is common to the request and response */ +#define CXL_DOE_TABLE_ACCESS_DW2_RCODE 0x000000ff +#define CXL_DOE_TABLE_ACCESS_DW2_TYPE 0x0000ff00 +#define CXL_DOE_TABLE_ACCESS_DW2_ENTRYHANDLE 0xffff0000
+#define CXL_CDAT_DSMAS_TYPE 0 +#define CXL_CDAT_DSLBIS_TYPE 1 +#define CXL_CDAT_DSMSCIS_TYPE 2 +#define CXL_CDAT_DSIS_TYPE 3 +#define CXL_CDAT_DSEMTS_TYPE 4 +#define CXL_CDAT_SSLBIS_TYPE 5
+struct cxl_cdat_dsmas {
- uint8_t dsmad_handle;
- uint8_t flags;
+#define CDAT_DSMAS_FLAG_NV (1 << 2)
- uint64_t dpa_base;
- uint64_t dpa_length;
+};
+struct cxl_cdat_dslbis {
- uint8_t handle;
- uint8_t flags;
- uint8_t data_type;
- uint64_t entry_base_unit;
- uint16_t entries[3]; /* 6 bytes */
+};
+struct cxl_cdat_dsmscis {
- uint8_t dsmas_handle;
- uint64_t memory_side_cache_size;
- uint32_t cache_attributes;
+};
+struct cxl_cdat_dsis {
- uint8_t flags;
+#define CDAT_DSIS_MEMORY_ATTACHED 0x01
- uint8_t handle;
+};
+struct cxl_cdat_dsemts {
- uint8_t dsmas_handle;
- uint8_t efi_mem_type_attr;
- uint64_t dpa_offset;
- uint64_t dpa_length;
+};
+struct cxl_cdat_sslbis_entry {
- uint16_t port_x_id;
- uint16_t port_y_id;
- uint16_t val;
+};
+struct cxl_cdat_sslbis {
- uint8_t num_entries; /* needed to compute length */
- uint8_t data_type;
- uint64_t base_unit;
- struct cxl_cdat_sslbis_entry entries[];
+};
+typedef struct cxl_cdat {
- GList *dsmas_list;
- GList *dslbis_list;
- GList *dsmscis_list;
- GList *dsis_list;
- GList *dsemts_list;
- GList *sslbis_list;
+} CXLCDAT;
+void cdat_add_dsmas(CXLCDAT *cdat, uint8_t dsmad_handle, uint8_t flags,
uint64_t dpa_base, uint64_t dpa_length);
+void cdat_add_dslbis(CXLCDAT *cdat, uint8_t handle, uint8_t flags,
uint8_t data_type, uint64_t base_unit,
uint16_t entry0, uint16_t entry1, uint16_t entry2);
+void cdat_add_dsmscis(CXLCDAT *cdat, uint8_t dsmas_handle,
uint64_t memory_sc_size, uint64_t cache_attrs);
+void cdat_add_dsis(CXLCDAT *cdat, uint8_t flags, uint8_t handle); +void cdat_add_dsemts(CXLCDAT *cdat, uint8_t dsmas_handle,
uint8_t efi_mem_type_attr, uint64_t dpa_offset,
uint64_t dpa_length);
+struct cxl_cdat_sslbis *cdat_add_sslbis(CXLCDAT *cdat, uint8_t num_entries,
uint8_t data_type, uint64_t base_unit);
+int cdata_sslbis_set_entry(struct cxl_cdat_sslbis *sslbis, uint8_t index,
uint16_t portx, uint16_t pory, uint16_t val);
+int cxl_table_access(PCIEDOE *doe, uint16_t vendor_id, uint8_t object_type,
void *priv);
+#endif
2.19.1
Linuxarm mailing list -- linuxarm@openeuler.org To unsubscribe send an email to linuxarm-leave@openeuler.org
This enables basic read back of CDAT entries via a DOE mailbox in the devices PCI config space.
Note that for now a few possible entries are provided that don't make a lot of sense for current instances of this device. It will need to be made somewhat dynamic and possibly require additional command line parameters.
Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com --- hw/mem/cxl_type3.c | 49 ++++++++++++++++++++++++++++++++++++++++++-- include/hw/cxl/cxl.h | 1 + 2 files changed, 48 insertions(+), 2 deletions(-)
diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c index dee5a8884b..3449f02090 100644 --- a/hw/mem/cxl_type3.c +++ b/hw/mem/cxl_type3.c @@ -4,6 +4,7 @@ #include "hw/mem/memory-device.h" #include "hw/mem/pc-dimm.h" #include "hw/pci/pci.h" +#include "hw/pci/doe.h" #include "hw/qdev-properties.h" #include "qapi/error.h" #include "qemu/log.h" @@ -11,6 +12,7 @@ #include "qemu/range.h" #include "qemu/rcu.h" #include "sysemu/hostmem.h" +#include "sysemu/numa.h" #include "hw/cxl/cxl.h"
typedef struct cxl_type3_dev { @@ -21,6 +23,9 @@ typedef struct cxl_type3_dev { uint64_t size; HostMemoryBackend *hostmem;
+ PCIEDOE doe; + CXLCDAT cdat; + /* State */ CXLComponentState cxl_cstate; CXLDeviceState cxl_dstate; @@ -28,7 +33,7 @@ typedef struct cxl_type3_dev {
#define CT3(obj) OBJECT_CHECK(CXLType3Dev, (obj), TYPE_CXL_TYPE3_DEV)
-static void build_dvsecs(CXLType3Dev *ct3d) +static int build_dvsecs(CXLType3Dev *ct3d) { CXLComponentState *cxl_cstate = &ct3d->cxl_cstate; uint8_t *dvsec; @@ -55,6 +60,7 @@ static void build_dvsecs(CXLType3Dev *ct3d) }; cxl_component_create_dvsec(cxl_cstate, REG_LOC_DVSEC_LENGTH, REG_LOC_DVSEC, REG_LOC_DVSEC_REVID, dvsec); + return cxl_cstate->dvsec_offset; }
static void hdm_decoder_commit(CXLType3Dev *ct3d, int which) @@ -201,6 +207,7 @@ static void ct3_realize(PCIDevice *pci_dev, Error **errp) ComponentRegisters *regs = &cxl_cstate->crb; MemoryRegion *mr = ®s->component_registers; uint8_t *pci_conf = pci_dev->config; + int offset;
if (!ct3d->cxl_dstate.pmem) { cxl_setup_memory(ct3d, errp); @@ -213,7 +220,21 @@ static void ct3_realize(PCIDevice *pci_dev, Error **errp) cxl_cstate->dvsec_offset = 0x100;
ct3d->cxl_cstate.pdev = pci_dev; - build_dvsecs(ct3d); + offset = build_dvsecs(ct3d); + + /* Build DOE - no interrupt for now */ + offset = pcie_doe_ecap(&ct3d->doe, pci_dev, offset); + doe_add_message_handler(&ct3d->doe, 0x1e98, 0x2, &cxl_table_access, + &ct3d->cdat); + + /* Store entires of CDAT for read back from DOE */ + cdat_add_dsmas(&ct3d->cdat, 0, CDAT_DSMAS_FLAG_NV, 0x3345000000, + 0x0001000000); + cdat_add_dsmas(&ct3d->cdat, 1, 0, 0x33460000000, 0x00010000000); + cdat_add_dslbis(&ct3d->cdat, 0, HMAT_LB_MEM_MEMORY, + (uint8_t)HMAT_LB_DATA_ACCESS_LATENCY, + 1000, /* microsecs */ + 90, 0, 0);
#ifndef SET_PMEM_PADDR regs->special_ops = g_new0(MemoryRegionOps, 1); @@ -287,6 +308,28 @@ static void pc_dimm_md_fill_device_info(const MemoryDeviceState *md, info->type = MEMORY_DEVICE_INFO_KIND_CXL; }
+static void ct3d_config_write(PCIDevice *pci_dev, uint32_t addr, uint32_t val, int len) +{ + CXLType3Dev *ct3d = CT3(pci_dev); + + pci_default_write_config(pci_dev, addr, val, len); + pcie_doe_write(&ct3d->doe, addr, val, len); + +} + +static uint32_t ct3d_config_read(PCIDevice *pci_dev, uint32_t addr, int len) +{ + CXLType3Dev *ct3d = CT3(pci_dev); + int found = 0; + uint32_t val; + + val = pcie_doe_read(&ct3d->doe, addr, len, &found); + if (found) { + return val; + } + return pci_default_read_config(pci_dev, addr, len); +} + static void ct3_class_init(ObjectClass *oc, void *data) { DeviceClass *dc = DEVICE_CLASS(oc); @@ -298,6 +341,8 @@ static void ct3_class_init(ObjectClass *oc, void *data) pc->vendor_id = PCI_VENDOR_ID_INTEL; pc->device_id = 0xd93; /* LVF for now */ pc->revision = 1; + pc->config_write = ct3d_config_write; + pc->config_read = ct3d_config_read;
set_bit(DEVICE_CATEGORY_STORAGE, dc->categories); dc->desc = "CXL PMEM Device (Type 3)"; diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h index 6961e47076..ec3b4d7398 100644 --- a/include/hw/cxl/cxl.h +++ b/include/hw/cxl/cxl.h @@ -13,6 +13,7 @@ #include "cxl_pci.h" #include "cxl_component.h" #include "cxl_device.h" +#include "cxl_cdat.h"
#define COMPONENT_REG_BAR_IDX 0 #define DEVICE_REG_BAR_IDX 2
On Mon, 1 Feb 2021 23:16:25 +0800 Jonathan Cameron Jonathan.Cameron@huawei.com wrote:
Whilst I know others are working on an implementation of at least some of this, a desire to work on the kernel user of this required an implementation so I threw this together in the meantime and am sending it out on the basis that it may be of use to others.
As it is an RFC I have been lazy on some cleanup and error handling. Will fix that in a future version if we decide to take this forwards.
In the interests of info for anyone seeing this, note there is an alternative version from Chris Browy and colleagues.
https://lore.kernel.org/qemu-devel/alpine.LRH.2.23.451.2102021543190.30097@s...
Note that it's proved useful because we disagreed on a few things some of which I believe need spec clarification. (definitely one spec misread in this version - oops).
Based on Ben's cxl-2.0v3 branch from https://gitlab.com/bwidawsk/qemu Possible I've picked an unstable branch so this may or may not currently apply :) I'll rebase on Ben's next version when avaialble.
Jonathan Cameron (4): include/standard-headers/linux/pci_regs: temp hack to add necessary DOE definitions. hw/pci/pcie_doe: Introduce utility functions for PCIe DOE hw/cxl/cxl-cdat: Initial CDAT implementation for use by CXL devices hw/mem/cxl_type3: Enabled DOE mailbox for access to CDAT
hw/cxl/cxl-cdat.c | 252 +++++++++++++++++++++ hw/cxl/meson.build | 1 + hw/mem/cxl_type3.c | 49 ++++- hw/pci/meson.build | 2 +- hw/pci/pcie_doe.c | 257 ++++++++++++++++++++++ include/hw/cxl/cxl.h | 1 + include/hw/cxl/cxl_cdat.h | 101 +++++++++ include/hw/pci/doe.h | 40 ++++ include/hw/pci/pci_ids.h | 2 + include/standard-headers/linux/pci_regs.h | 33 ++- 10 files changed, 734 insertions(+), 4 deletions(-) create mode 100644 hw/cxl/cxl-cdat.c create mode 100644 hw/pci/pcie_doe.c create mode 100644 include/hw/cxl/cxl_cdat.h create mode 100644 include/hw/pci/doe.h