The current uadk initialization process is:
1. Call wd_request_ctx() to request ctxs from devices.
2. Call wd_sched_rr_alloc() to create a sched (or another scheduler alloc function, if one exists).
3. Initialize the sched.
4. Call wd_<alg>_init() with ctx_config and sched.
The logic is reasonable. In practice, however, the `wd_request_ctx()` and `wd_sched_rr_alloc()` steps are very tedious, which makes the interface difficult to use. One of the main reasons is that uadk exposes many configuration options in the scheduler in order to give users better performance. For the same reason, the current uadk requires the user to divide hardware resources according to the device topology during initialization. The existing scheme therefore remains useful as an advanced interface that offers customized configuration to users with demanding requirements.
All algorithm initialization interfaces have the same input parameters and behavioral logic. The preparation before calling wd_<alg>_init() is really just the configuration of `struct wd_ctx_config` and `struct wd_sched`. Therefore, the next step is to describe the user's requirements for these two structures with a small, easy-to-use set of input parameters, while ensuring that the new interface init2 provides the same functions as init. For ease of description, v1 refers to the existing interface and v2 refers to the new layer of encapsulation.
At present, at least four parameters are required to cover the user configuration requirements while keeping the v1 interface functionality unchanged:
@device_list: The available uacce device list. Users can get it with wd_get_accel_list().
@numa_bitmask: The bitmask provided by libnuma. Users can use this parameter to control which devices ctxs are requested from in the NUMA-binding scenario.
@ctx_nums: The number of ctxs requested for each NUMA node. Because users may need different numbers of ctxs for different ctx types, a two-dimensional array is needed as input.
@sched_type: The scheduling type the user wants to use.
What's more, some users want uadk to provide default values for the input parameters in performance-insensitive scenarios. C has no native way to give function parameters default values.
Changelog:
v3->v4: - Restore the wd_comp_init2() parameters and rename it to wd_comp_init2_(). Then add a macro named wd_comp_init2() which takes simpler parameters.
v2->v3: - Update the wd_comp_init2() parameters.
v1->v2: - Update the description of wd_<alg>_init in wd_design.md.
Yang Shen (6):
  uadk - support algorithms initialization reentry protect
  uadk/doc - update wd_alg_init support reentrancy
  uadk - support return error number as pointer
  uadk - mv some function to header file
  uadk/comp - add wd_comp_init2
  uadk/docs - support a simple interface for initialization
 Makefile.am             |   4 +-
 docs/wd_alg_init2.md    | 154 ++++++++++++++++++++++++
 docs/wd_design.md       |   5 +-
 include/wd.h            |  54 ++++++++-
 include/wd_alg_common.h |  24 ++++
 include/wd_comp.h       |  27 +++++
 include/wd_util.h       |  57 +++++++++
 wd.c                    |  97 ++++++++++++---
 wd_aead.c               |  33 +++--
 wd_cipher.c             |  35 ++++--
 wd_comp.c               | 121 +++++++++++++++++--
 wd_dh.c                 |  34 ++++--
 wd_digest.c             |  33 +++--
 wd_ecc.c                |  33 +++--
 wd_rsa.c                |  33 +++--
 wd_util.c               | 260 +++++++++++++++++++++++++++++++++++++++-
 16 files changed, 898 insertions(+), 106 deletions(-)
 create mode 100644 docs/wd_alg_init2.md
-- 2.24.0
'wd_<alg>_init()' was designed as non-reentrant, so add a status field to protect against repeated or concurrent calls.

When 'wd_<alg>_init()' is called, it first reads the status. If the status is WD_UNINIT, it sets the status to WD_INITING, then changes it to WD_INIT on success or reverts it to WD_UNINIT if something goes wrong. If the status is WD_INIT, it returns directly. If the status is WD_INITING, another thread is initializing, so the caller waits for the result.
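The status handling described above can be sketched in isolation. This is a minimal illustration and not the code in this patch: every demo_* name is hypothetical, it uses C11 <stdatomic.h> with the default sequentially consistent ordering instead of the __atomic builtins with relaxed ordering, and it yields with sched_yield() instead of sleeping with a retry counter.

```c
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical states mirroring WD_UNINIT / WD_INITING / WD_INIT. */
enum demo_status { DEMO_UNINIT, DEMO_INITING, DEMO_INIT };

/*
 * Return true if the caller won the right to initialize, or false if
 * initialization has already completed. While another thread holds the
 * INITING state, keep retrying until it finishes or rolls back.
 */
static bool demo_try_init(_Atomic int *status)
{
    for (;;) {
        int expected = DEMO_UNINIT;

        /* UNINIT -> INITING: only one caller can win this exchange. */
        if (atomic_compare_exchange_strong(status, &expected, DEMO_INITING))
            return true;
        if (expected == DEMO_INIT)
            return false;
        /* Another thread is initializing: give up the CPU and retry. */
        sched_yield();
    }
}

static void demo_set_init(_Atomic int *status)
{
    atomic_store(status, DEMO_INIT);
}

static void demo_clear_init(_Atomic int *status)
{
    atomic_store(status, DEMO_UNINIT);
}

/* Hypothetical init entry point; 'fail' simulates an error in a later step. */
static int demo_alg_init(_Atomic int *status, bool fail)
{
    if (!demo_try_init(status))
        return 0;                   /* already initialized, nothing to do */

    if (fail) {
        demo_clear_init(status);    /* roll back so a later call can retry */
        return -1;
    }

    demo_set_init(status);
    return 0;
}
```

A failed call leaves the status at DEMO_UNINIT so a later call can retry, while a call made after a successful initialization returns 0 without repeating any work.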
Signed-off-by: Yang Shen <shenyang39@huawei.com>
---
 include/wd_util.h | 38 ++++++++++++++++++++++++++++++++++++++
 wd_aead.c         | 33 ++++++++++++++++++++++-----------
 wd_cipher.c       | 35 +++++++++++++++++++++++------------
 wd_comp.c         | 35 ++++++++++++++++++++++++-----------
 wd_dh.c           | 34 ++++++++++++++++++++++------------
 wd_digest.c       | 33 ++++++++++++++++++++++-----------
 wd_ecc.c          | 33 ++++++++++++++++++++++-----------
 wd_rsa.c          | 33 ++++++++++++++++++++++-----------
 wd_util.c         | 24 ++++++++++++++++++++++++
 9 files changed, 219 insertions(+), 79 deletions(-)
diff --git a/include/wd_util.h b/include/wd_util.h index 83ac5f8..eafe3ce 100644 --- a/include/wd_util.h +++ b/include/wd_util.h @@ -21,6 +21,12 @@ extern "C" { for ((i) = 0, (config_numa) = (config)->config_per_numa; \ (i) < (config)->numa_num; (config_numa)++, (i)++)
+enum wd_status { + WD_UNINIT, + WD_INITING, + WD_INIT, +}; + struct wd_async_msg_pool { struct msg_pool *pools; __u32 pool_num; @@ -356,6 +362,38 @@ int wd_handle_msg_sync(struct wd_msg_handle *msg_handle, handle_t ctx, */ int wd_init_param_check(struct wd_ctx_config *config, struct wd_sched *sched);
+/** + * wd_alg_try_init() - Check the algorithm status and set it as WD_INITING + * if need initialization. + * @status: algorithm initialization status. + * + * Return true if need initialization and false if initialized, otherwise will wait + * last initialization result. + */ +bool wd_alg_try_init(enum wd_status *status); + +/** + * wd_alg_set_init() - Set the algorithm status as WD_INIT. + * @status: algorithm initialization status. + */ +static inline void wd_alg_set_init(enum wd_status *status) +{ + enum wd_status setting = WD_INIT; + + __atomic_store(status, &setting, __ATOMIC_RELAXED); +} + +/** + * wd_alg_clear_init() - Set the algorithm status as WD_UNINIT. + * @status: algorithm initialization status. + */ +static inline void wd_alg_clear_init(enum wd_status *status) +{ + enum wd_status setting = WD_UNINIT; + + __atomic_store(status, &setting, __ATOMIC_RELAXED); +} + /** * wd_dfx_msg_cnt() - Message counter interface for ctx * @msg: Shared memory addr. diff --git a/wd_aead.c b/wd_aead.c index d6c2380..2307b20 100644 --- a/wd_aead.c +++ b/wd_aead.c @@ -31,6 +31,7 @@ static int g_aead_mac_len[WD_DIGEST_TYPE_MAX] = { };
struct wd_aead_setting { + enum wd_status status; struct wd_ctx_config_internal config; struct wd_sched sched; struct wd_aead_driver *driver; @@ -392,24 +393,29 @@ static int wd_aead_param_check(struct wd_aead_sess *sess, int wd_aead_init(struct wd_ctx_config *config, struct wd_sched *sched) { void *priv; + bool flag; int ret;
+ flag = wd_alg_try_init(&wd_aead_setting.status); + if (!flag) + return 0; + ret = wd_init_param_check(config, sched); if (ret) - return ret; + goto out_clear_init;
ret = wd_set_epoll_en("WD_AEAD_EPOLL_EN", &wd_aead_setting.config.epoll_en); if (ret < 0) - return ret; + goto out_clear_init;
ret = wd_init_ctx_config(&wd_aead_setting.config, config); if (ret) - return ret; + goto out_clear_init;
ret = wd_init_sched(&wd_aead_setting.sched, sched); if (ret < 0) - goto out; + goto out_clear_ctx_config;
/* set driver */ #ifdef WD_STATIC_DRV @@ -421,33 +427,37 @@ int wd_aead_init(struct wd_ctx_config *config, struct wd_sched *sched) config->ctx_num, WD_POOL_MAX_ENTRIES, sizeof(struct wd_aead_msg)); if (ret < 0) - goto out_sched; + goto out_clear_sched;
/* init ctx related resources in specific driver */ priv = calloc(1, wd_aead_setting.driver->drv_ctx_size); if (!priv) { ret = -WD_ENOMEM; - goto out_priv; + goto out_clear_pool; } wd_aead_setting.priv = priv;
ret = wd_aead_setting.driver->init(&wd_aead_setting.config, priv); if (ret < 0) { WD_ERR("failed to init aead dirver!\n"); - goto out_init; + goto out_free_priv; }
+ wd_alg_set_init(&wd_aead_setting.status); + return 0;
-out_init: +out_free_priv: free(priv); wd_aead_setting.priv = NULL; -out_priv: +out_clear_pool: wd_uninit_async_request_pool(&wd_aead_setting.pool); -out_sched: +out_clear_sched: wd_clear_sched(&wd_aead_setting.sched); -out: +out_clear_ctx_config: wd_clear_ctx_config(&wd_aead_setting.config); +out_clear_init: + wd_alg_clear_init(&wd_aead_setting.status); return ret; }
@@ -465,6 +475,7 @@ void wd_aead_uninit(void) wd_uninit_async_request_pool(&wd_aead_setting.pool); wd_clear_sched(&wd_aead_setting.sched); wd_clear_ctx_config(&wd_aead_setting.config); + wd_alg_clear_init(&wd_aead_setting.status); }
static void fill_request_msg(struct wd_aead_msg *msg, struct wd_aead_req *req, diff --git a/wd_cipher.c b/wd_cipher.c index 8ce975a..a85629d 100644 --- a/wd_cipher.c +++ b/wd_cipher.c @@ -45,6 +45,7 @@ static const unsigned char des_weak_keys[DES_WEAK_KEY_NUM][DES_KEY_SIZE] = { };
struct wd_cipher_setting { + enum wd_status status; struct wd_ctx_config_internal config; struct wd_sched sched; void *sched_ctx; @@ -231,24 +232,29 @@ void wd_cipher_free_sess(handle_t h_sess) int wd_cipher_init(struct wd_ctx_config *config, struct wd_sched *sched) { void *priv; + bool flag; int ret;
+ flag = wd_alg_try_init(&wd_cipher_setting.status); + if (!flag) + return 0; + ret = wd_init_param_check(config, sched); if (ret) - return ret; + goto out_clear_init;
ret = wd_set_epoll_en("WD_CIPHER_EPOLL_EN", &wd_cipher_setting.config.epoll_en); if (ret < 0) - return ret; + goto out_clear_init;
ret = wd_init_ctx_config(&wd_cipher_setting.config, config); if (ret < 0) - return ret; + goto out_clear_init;
ret = wd_init_sched(&wd_cipher_setting.sched, sched); if (ret < 0) - goto out; + goto out_clear_ctx_config;
#ifdef WD_STATIC_DRV /* set driver */ @@ -260,33 +266,37 @@ int wd_cipher_init(struct wd_ctx_config *config, struct wd_sched *sched) config->ctx_num, WD_POOL_MAX_ENTRIES, sizeof(struct wd_cipher_msg)); if (ret < 0) - goto out_sched; + goto out_clear_sched;
/* init ctx related resources in specific driver */ priv = calloc(1, wd_cipher_setting.driver->drv_ctx_size); if (!priv) { ret = -WD_ENOMEM; - goto out_priv; + goto out_clear_pool; } wd_cipher_setting.priv = priv;
ret = wd_cipher_setting.driver->init(&wd_cipher_setting.config, priv); if (ret < 0) { - WD_ERR("hisi sec init failed.\n"); - goto out_init; + WD_ERR("failed to do dirver init, ret = %d.\n", ret); + goto out_free_priv; }
+ wd_alg_set_init(&wd_cipher_setting.status); + return 0;
-out_init: +out_free_priv: free(priv); wd_cipher_setting.priv = NULL; -out_priv: +out_clear_pool: wd_uninit_async_request_pool(&wd_cipher_setting.pool); -out_sched: +out_clear_sched: wd_clear_sched(&wd_cipher_setting.sched); -out: +out_clear_ctx_config: wd_clear_ctx_config(&wd_cipher_setting.config); +out_clear_init: + wd_alg_clear_init(&wd_cipher_setting.status); return ret; }
@@ -304,6 +314,7 @@ void wd_cipher_uninit(void) wd_uninit_async_request_pool(&wd_cipher_setting.pool); wd_clear_sched(&wd_cipher_setting.sched); wd_clear_ctx_config(&wd_cipher_setting.config); + wd_alg_clear_init(&wd_cipher_setting.status); }
static void fill_request_msg(struct wd_cipher_msg *msg, diff --git a/wd_comp.c b/wd_comp.c index eacebd3..44593a6 100644 --- a/wd_comp.c +++ b/wd_comp.c @@ -41,6 +41,7 @@ struct wd_comp_sess { };
struct wd_comp_setting { + enum wd_status status; struct wd_ctx_config_internal config; struct wd_sched sched; struct wd_comp_driver *driver; @@ -81,24 +82,29 @@ void wd_comp_set_driver(struct wd_comp_driver *drv) int wd_comp_init(struct wd_ctx_config *config, struct wd_sched *sched) { void *priv; + bool flag; int ret;
+ flag = wd_alg_try_init(&wd_comp_setting.status); + if (!flag) + return 0; + ret = wd_init_param_check(config, sched); if (ret) - return ret; + goto out_clear_init;
ret = wd_set_epoll_en("WD_COMP_EPOLL_EN", &wd_comp_setting.config.epoll_en); if (ret < 0) - return ret; + goto out_clear_init;
ret = wd_init_ctx_config(&wd_comp_setting.config, config); if (ret < 0) - return ret; + goto out_clear_init;
ret = wd_init_sched(&wd_comp_setting.sched, sched); if (ret < 0) - goto out; + goto out_clear_ctx_config; /* * Fix me: ctx could be passed into wd_comp_set_static_drv to help to * choose static compiled vendor driver. For dynamic vendor driver, @@ -118,31 +124,36 @@ int wd_comp_init(struct wd_ctx_config *config, struct wd_sched *sched) config->ctx_num, WD_POOL_MAX_ENTRIES, sizeof(struct wd_comp_msg)); if (ret < 0) - goto out_sched; + goto out_clear_sched;
/* init ctx related resources in specific driver */ priv = calloc(1, wd_comp_setting.driver->drv_ctx_size); if (!priv) { ret = -WD_ENOMEM; - goto out_priv; + goto out_clear_pool; } wd_comp_setting.priv = priv; ret = wd_comp_setting.driver->init(&wd_comp_setting.config, priv); if (ret < 0) { WD_ERR("failed to do driver init, ret = %d!\n", ret); - goto out_init; + goto out_free_priv; } + + wd_alg_set_init(&wd_comp_setting.status); + return 0;
-out_init: +out_free_priv: free(priv); wd_comp_setting.priv = NULL; -out_priv: +out_clear_pool: wd_uninit_async_request_pool(&wd_comp_setting.pool); -out_sched: +out_clear_sched: wd_clear_sched(&wd_comp_setting.sched); -out: +out_clear_ctx_config: wd_clear_ctx_config(&wd_comp_setting.config); +out_clear_init: + wd_alg_clear_init(&wd_comp_setting.status); return ret; }
@@ -163,6 +174,8 @@ void wd_comp_uninit(void) /* unset config, sched, driver */ wd_clear_sched(&wd_comp_setting.sched); wd_clear_ctx_config(&wd_comp_setting.config); + + wd_alg_clear_init(&wd_comp_setting.status); }
struct wd_comp_msg *wd_comp_get_msg(__u32 idx, __u32 tag) diff --git a/wd_dh.c b/wd_dh.c index 0bf770d..85382e2 100644 --- a/wd_dh.c +++ b/wd_dh.c @@ -32,6 +32,7 @@ struct wd_dh_sess { };
static struct wd_dh_setting { + enum wd_status status; struct wd_ctx_config_internal config; struct wd_sched sched; void *sched_ctx; @@ -78,24 +79,29 @@ void wd_dh_set_driver(struct wd_dh_driver *drv) int wd_dh_init(struct wd_ctx_config *config, struct wd_sched *sched) { void *priv; + bool flag; int ret;
+ flag = wd_alg_try_init(&wd_dh_setting.status); + if (!flag) + return 0; + ret = wd_init_param_check(config, sched); if (ret) - return ret; + goto out_clear_init;
ret = wd_set_epoll_en("WD_DH_EPOLL_EN", &wd_dh_setting.config.epoll_en); if (ret < 0) - return ret; + goto out_clear_init;
ret = wd_init_ctx_config(&wd_dh_setting.config, config); if (ret) - return ret; + goto out_clear_init;
ret = wd_init_sched(&wd_dh_setting.sched, sched); if (ret) - goto out; + goto out_clear_ctx_config;
#ifdef WD_STATIC_DRV wd_dh_set_static_drv(); @@ -106,13 +112,13 @@ int wd_dh_init(struct wd_ctx_config *config, struct wd_sched *sched) config->ctx_num, WD_POOL_MAX_ENTRIES, sizeof(struct wd_dh_msg)); if (ret) - goto out_sched; + goto out_clear_sched;
/* initialize ctx related resources in specific driver */ priv = calloc(1, wd_dh_setting.driver->drv_ctx_size); if (!priv) { ret = -WD_ENOMEM; - goto out_priv; + goto out_clear_pool; }
wd_dh_setting.priv = priv; @@ -120,21 +126,24 @@ int wd_dh_init(struct wd_ctx_config *config, struct wd_sched *sched) wd_dh_setting.driver->alg_name); if (ret < 0) { WD_ERR("failed to init dh driver, ret= %d!\n", ret); - goto out_init; + goto out_free_priv; }
+ wd_alg_set_init(&wd_dh_setting.status); + return 0;
-out_init: +out_free_priv: free(priv); wd_dh_setting.priv = NULL; -out_priv: +out_clear_pool: wd_uninit_async_request_pool(&wd_dh_setting.pool); -out_sched: +out_clear_sched: wd_clear_sched(&wd_dh_setting.sched); -out: +out_clear_ctx_config: wd_clear_ctx_config(&wd_dh_setting.config); - +out_clear_init: + wd_alg_clear_init(&wd_dh_setting.status); return ret; }
@@ -156,6 +165,7 @@ void wd_dh_uninit(void) /* unset config, sched, driver */ wd_clear_sched(&wd_dh_setting.sched); wd_clear_ctx_config(&wd_dh_setting.config); + wd_alg_clear_init(&wd_dh_setting.status); }
static int fill_dh_msg(struct wd_dh_msg *msg, struct wd_dh_req *req, diff --git a/wd_digest.c b/wd_digest.c index f56be0c..26dc7d1 100644 --- a/wd_digest.c +++ b/wd_digest.c @@ -39,6 +39,7 @@ static int g_digest_mac_full_len[WD_DIGEST_TYPE_MAX] = { };
struct wd_digest_setting { + enum wd_status status; struct wd_ctx_config_internal config; struct wd_sched sched; struct wd_digest_driver *driver; @@ -186,24 +187,29 @@ void wd_digest_free_sess(handle_t h_sess) int wd_digest_init(struct wd_ctx_config *config, struct wd_sched *sched) { void *priv; + bool flag; int ret;
+ flag = wd_alg_try_init(&wd_digest_setting.status); + if (!flag) + return 0; + ret = wd_init_param_check(config, sched); if (ret) - return ret; + goto out_clear_init;
ret = wd_set_epoll_en("WD_DIGEST_EPOLL_EN", &wd_digest_setting.config.epoll_en); if (ret < 0) - return ret; + goto out_clear_init;
ret = wd_init_ctx_config(&wd_digest_setting.config, config); if (ret < 0) - return ret; + goto out_clear_init;
ret = wd_init_sched(&wd_digest_setting.sched, sched); if (ret < 0) - goto out; + goto out_clear_ctx_config;
/* set driver */ #ifdef WD_STATIC_DRV @@ -215,33 +221,37 @@ int wd_digest_init(struct wd_ctx_config *config, struct wd_sched *sched) config->ctx_num, WD_POOL_MAX_ENTRIES, sizeof(struct wd_digest_msg)); if (ret < 0) - goto out_sched; + goto out_clear_sched;
/* init ctx related resources in specific driver */ priv = calloc(1, wd_digest_setting.driver->drv_ctx_size); if (!priv) { ret = -WD_ENOMEM; - goto out_priv; + goto out_clear_pool; } wd_digest_setting.priv = priv;
ret = wd_digest_setting.driver->init(&wd_digest_setting.config, priv); if (ret < 0) { WD_ERR("failed to init digest dirver!\n"); - goto out_init; + goto out_free_priv; }
+ wd_alg_set_init(&wd_digest_setting.status); + return 0;
-out_init: +out_free_priv: free(priv); wd_digest_setting.priv = NULL; -out_priv: +out_clear_pool: wd_uninit_async_request_pool(&wd_digest_setting.pool); -out_sched: +out_clear_sched: wd_clear_sched(&wd_digest_setting.sched); -out: +out_clear_ctx_config: wd_clear_ctx_config(&wd_digest_setting.config); +out_clear_init: + wd_alg_clear_init(&wd_digest_setting.status); return ret; }
@@ -260,6 +270,7 @@ void wd_digest_uninit(void)
wd_clear_sched(&wd_digest_setting.sched); wd_clear_ctx_config(&wd_digest_setting.config); + wd_alg_clear_init(&wd_digest_setting.status); }
static int wd_aes_hmac_length_check(struct wd_digest_sess *sess, diff --git a/wd_ecc.c b/wd_ecc.c index 2266b1d..3e902bd 100644 --- a/wd_ecc.c +++ b/wd_ecc.c @@ -64,6 +64,7 @@ struct wd_ecc_curve_list { };
static struct wd_ecc_setting { + enum wd_status status; struct wd_ctx_config_internal config; struct wd_sched sched; void *sched_ctx; @@ -133,24 +134,29 @@ void wd_ecc_set_driver(struct wd_ecc_driver *drv) int wd_ecc_init(struct wd_ctx_config *config, struct wd_sched *sched) { void *priv; + bool flag; int ret;
+ flag = wd_alg_try_init(&wd_ecc_setting.status); + if (!flag) + return 0; + ret = wd_init_param_check(config, sched); if (ret) - return ret; + goto out_clear_init;
ret = wd_set_epoll_en("WD_ECC_EPOLL_EN", &wd_ecc_setting.config.epoll_en); if (ret < 0) - return ret; + goto out_clear_init;
ret = wd_init_ctx_config(&wd_ecc_setting.config, config); if (ret < 0) - return ret; + goto out_clear_init;
ret = wd_init_sched(&wd_ecc_setting.sched, sched); if (ret < 0) - goto out; + goto out_clear_ctx_config;
#ifdef WD_STATIC_DRV wd_ecc_set_static_drv(); @@ -161,13 +167,13 @@ int wd_ecc_init(struct wd_ctx_config *config, struct wd_sched *sched) config->ctx_num, WD_POOL_MAX_ENTRIES, sizeof(struct wd_ecc_msg)); if (ret < 0) - goto out_sched; + goto out_clear_sched;
/* initialize ctx related resources in specific driver */ priv = calloc(1, wd_ecc_setting.driver->drv_ctx_size); if (!priv) { ret = -WD_ENOMEM; - goto out_priv; + goto out_clear_pool; }
wd_ecc_setting.priv = priv; @@ -175,20 +181,24 @@ int wd_ecc_init(struct wd_ctx_config *config, struct wd_sched *sched) wd_ecc_setting.driver->alg_name); if (ret < 0) { WD_ERR("failed to init ecc driver, ret = %d!\n", ret); - goto out_init; + goto out_free_priv; }
+ wd_alg_set_init(&wd_ecc_setting.status); + return 0;
-out_init: +out_free_priv: free(priv); wd_ecc_setting.priv = NULL; -out_priv: +out_clear_pool: wd_uninit_async_request_pool(&wd_ecc_setting.pool); -out_sched: +out_clear_sched: wd_clear_sched(&wd_ecc_setting.sched); -out: +out_clear_ctx_config: wd_clear_ctx_config(&wd_ecc_setting.config); +out_clear_init: + wd_alg_clear_init(&wd_ecc_setting.status); return ret; }
@@ -210,6 +220,7 @@ void wd_ecc_uninit(void) /* unset config, sched, driver */ wd_clear_sched(&wd_ecc_setting.sched); wd_clear_ctx_config(&wd_ecc_setting.config); + wd_alg_clear_init(&wd_ecc_setting.status); }
static int trans_to_binpad(char *dst, const char *src, diff --git a/wd_rsa.c b/wd_rsa.c index 489833e..aab16ce 100644 --- a/wd_rsa.c +++ b/wd_rsa.c @@ -72,6 +72,7 @@ struct wd_rsa_sess { };
static struct wd_rsa_setting { + enum wd_status status; struct wd_ctx_config_internal config; struct wd_sched sched; void *sched_ctx; @@ -118,24 +119,29 @@ void wd_rsa_set_driver(struct wd_rsa_driver *drv) int wd_rsa_init(struct wd_ctx_config *config, struct wd_sched *sched) { void *priv; + bool flag; int ret;
+ flag = wd_alg_try_init(&wd_rsa_setting.status); + if (!flag) + return 0; + ret = wd_init_param_check(config, sched); if (ret) - return ret; + goto out_clear_init;
ret = wd_set_epoll_en("WD_RSA_EPOLL_EN", &wd_rsa_setting.config.epoll_en); if (ret < 0) - return ret; + goto out_clear_init;
ret = wd_init_ctx_config(&wd_rsa_setting.config, config); if (ret < 0) - return ret; + goto out_clear_init;
ret = wd_init_sched(&wd_rsa_setting.sched, sched); if (ret < 0) - goto out; + goto out_clear_ctx_config;
#ifdef WD_STATIC_DRV wd_rsa_set_static_drv(); @@ -146,13 +152,13 @@ int wd_rsa_init(struct wd_ctx_config *config, struct wd_sched *sched) config->ctx_num, WD_POOL_MAX_ENTRIES, sizeof(struct wd_rsa_msg)); if (ret < 0) - goto out_sched; + goto out_clear_sched;
/* initialize ctx related resources in specific driver */ priv = calloc(1, wd_rsa_setting.driver->drv_ctx_size); if (!priv) { ret = -WD_ENOMEM; - goto out_priv; + goto out_clear_pool; }
wd_rsa_setting.priv = priv; @@ -160,20 +166,24 @@ int wd_rsa_init(struct wd_ctx_config *config, struct wd_sched *sched) wd_rsa_setting.driver->alg_name); if (ret < 0) { WD_ERR("failed to init rsa driver, ret = %d!\n", ret); - goto out_init; + goto out_free_priv; }
+ wd_alg_set_init(&wd_rsa_setting.status); + return 0;
-out_init: +out_free_priv: free(priv); wd_rsa_setting.priv = NULL; -out_priv: +out_clear_pool: wd_uninit_async_request_pool(&wd_rsa_setting.pool); -out_sched: +out_clear_sched: wd_clear_sched(&wd_rsa_setting.sched); -out: +out_clear_ctx_config: wd_clear_ctx_config(&wd_rsa_setting.config); +out_clear_init: + wd_alg_clear_init(&wd_rsa_setting.status); return ret; }
@@ -195,6 +205,7 @@ void wd_rsa_uninit(void) /* unset config, sched, driver */ wd_clear_sched(&wd_rsa_setting.sched); wd_clear_ctx_config(&wd_rsa_setting.config); + wd_alg_clear_init(&wd_rsa_setting.status); }
static int fill_rsa_msg(struct wd_rsa_msg *msg, struct wd_rsa_req *req, diff --git a/wd_util.c b/wd_util.c index bd82075..fa77b46 100644 --- a/wd_util.c +++ b/wd_util.c @@ -22,6 +22,9 @@ #define PRIVILEGE_FLAG 600 #define MIN(a, b) ((a) > (b) ? (b) : (a))
+#define WD_INIT_SLEEP_UTIME 1000 +#define WD_INIT_RETRY_TIMES 10000 + struct msg_pool { /* message array allocated dynamically */ void *msgs; @@ -1777,3 +1780,24 @@ int wd_init_param_check(struct wd_ctx_config *config, struct wd_sched *sched)
return 0; } + +bool wd_alg_try_init(enum wd_status *status) +{ + enum wd_status expected; + int count = 0; + bool ret; + + do { + expected = WD_UNINIT; + ret = __atomic_compare_exchange_n(status, &expected, WD_INITING, true, + __ATOMIC_RELAXED, __ATOMIC_RELAXED); + if (expected == WD_INIT) + return false; + usleep(WD_INIT_SLEEP_UTIME); + if (!(++count % WD_INIT_RETRY_TIMES)) + WD_ERR("The algorithm initizalite has been waiting for %ds!\n", + WD_INIT_SLEEP_UTIME * count / 1000000); + } while (!ret); + + return true; +}
uadk now supports concurrent multi-threaded and reentrant calls to the initialization interface.
Signed-off-by: Yang Shen <shenyang39@huawei.com>
---
 docs/wd_design.md | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/docs/wd_design.md b/docs/wd_design.md index ba5a5b9..3e5297e 100644 --- a/docs/wd_design.md +++ b/docs/wd_design.md @@ -81,6 +81,7 @@ | | |2) Change *user* layer to *sched* layer since | | | | sample_sched is moved from user space into UADK | | | | framework. | +| 1.4 | |1) Update *wd_alg_init* reentrancy. |
## Terminology @@ -493,7 +494,9 @@ device. Return 0 if it succeeds. And return error number if it fails.
In *wd_comp_init()*, context resources, user scheduler and vendor driver are -initialized. +initialized. This function supports multi-threaded concurrent calls and +reentrant. When one thread is initializing, other threads will wait for +completion.
***void wd_comp_uninit(void)***
Add a set of interfaces, 'WD_ERR_PTR()' and 'WD_PTR_ERR()', for returning an error number through a pointer.
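The error-as-pointer convention can be illustrated with a small stand-alone sketch. The demo_* helpers below are hypothetical copies modelled on WD_IS_ERR(), WD_ERR_PTR() and WD_PTR_ERR() from include/wd.h; demo_lookup() and its -EINVAL result are assumptions made only for this example, which also assumes a platform where a pointer fits in a long.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins modelled on the helpers in include/wd.h. */
#define DEMO_IS_ERR(h) ((uintptr_t)(h) > (uintptr_t)(-1000))

static inline void *demo_err_ptr(uintptr_t error)
{
    return (void *)error;
}

static inline long demo_ptr_err(const void *ptr)
{
    return (long)ptr;
}

static int demo_value = 42;

/*
 * Hypothetical lookup: return a valid pointer on success, or a negative
 * errno value encoded as a pointer on failure.
 */
static int *demo_lookup(int key)
{
    if (key < 0)
        return demo_err_ptr(-EINVAL);

    return &demo_value;
}
```

A caller would first test the result with DEMO_IS_ERR() and only then either dereference it or recover the error number with demo_ptr_err(). This lets a function that returns a pointer report why it failed without needing an extra output parameter.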
Signed-off-by: Yang Shen <shenyang39@huawei.com>
---
 include/wd.h | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)
diff --git a/include/wd.h b/include/wd.h index b0580ba..74d714b 100644 --- a/include/wd.h +++ b/include/wd.h @@ -91,11 +91,6 @@ typedef void (*wd_log)(const char *format, ...); #define WD_IS_ERR(h) ((uintptr_t)(h) > \ (uintptr_t)(-1000))
-static inline void *WD_ERR_PTR(uintptr_t error) -{ - return (void *)error; -} - enum wcrypto_type { WD_CIPHER, WD_DIGEST, @@ -185,6 +180,16 @@ static inline void wd_iowrite64(void *addr, uint64_t value) *((volatile uint64_t *)addr) = value; }
+static inline void *WD_ERR_PTR(uintptr_t error) +{ + return (void *)error; +} + +static inline long WD_PTR_ERR(const void *ptr) +{ + return (long)ptr; +} + /** * wd_request_ctx() - Request a communication context from a device. * @dev: Indicate one device.
Since two functions will be used by multiple files, rename them and declare them in the header file.
Signed-off-by: Yang Shen <shenyang39@huawei.com>
---
 include/wd.h | 15 +++++++++++++++
 wd.c         | 35 +++++++++++++++++------------------
 2 files changed, 32 insertions(+), 18 deletions(-)
diff --git a/include/wd.h b/include/wd.h index 74d714b..e1a87de 100644 --- a/include/wd.h +++ b/include/wd.h @@ -508,6 +508,21 @@ void wd_mempool_stats(handle_t mempool, struct wd_mempool_stats *stats); */ void wd_blockpool_stats(handle_t blkpool, struct wd_blockpool_stats *stats);
+/** + * wd_clone_dev() - clone a new uacce device. + * @dev: The source device. + * + * Return a pointer value if succeed, and NULL if fail. + */ +struct uacce_dev *wd_clone_dev(struct uacce_dev *dev); + +/** + * wd_add_dev_to_list() - add a node to end of list. + * @head: The list head. + * @node: The node need to be add. + */ +void wd_add_dev_to_list(struct uacce_dev_list *head, struct uacce_dev_list *node); + /** * wd_ctx_get_dev_name() - Get the device name about task. * @h_ctx: The handle of context. diff --git a/wd.c b/wd.c index 6ea17f3..78094d8 100644 --- a/wd.c +++ b/wd.c @@ -342,7 +342,13 @@ out: return strndup(name, len); }
-static struct uacce_dev *clone_uacce_dev(struct uacce_dev *dev) +static void wd_ctx_init_qfrs_offs(struct wd_ctx_h *ctx) +{ + memcpy(&ctx->qfrs_offs, &ctx->dev->qfrs_offs, + sizeof(ctx->qfrs_offs)); +} + +struct uacce_dev *wd_clone_dev(struct uacce_dev *dev) { struct uacce_dev *new;
@@ -355,10 +361,14 @@ static struct uacce_dev *clone_uacce_dev(struct uacce_dev *dev) return new; }
-static void wd_ctx_init_qfrs_offs(struct wd_ctx_h *ctx) +void wd_add_dev_to_list(struct uacce_dev_list *head, struct uacce_dev_list *node) { - memcpy(&ctx->qfrs_offs, &ctx->dev->qfrs_offs, - sizeof(ctx->qfrs_offs)); + struct uacce_dev_list *tmp = head; + + while (tmp->next) + tmp = tmp->next; + + tmp->next = node; }
handle_t wd_request_ctx(struct uacce_dev *dev) @@ -393,7 +403,7 @@ handle_t wd_request_ctx(struct uacce_dev *dev) if (!ctx->drv_name) goto free_dev_name;
- ctx->dev = clone_uacce_dev(dev); + ctx->dev = wd_clone_dev(dev); if (!ctx->dev) goto free_drv_name;
@@ -633,17 +643,6 @@ static bool dev_has_alg(const char *dev_alg_name, const char *alg_name) return false; }
-static void add_uacce_dev_to_list(struct uacce_dev_list *head, - struct uacce_dev_list *node) -{ - struct uacce_dev_list *tmp = head; - - while (tmp->next) - tmp = tmp->next; - - tmp->next = node; -} - static int check_alg_name(const char *alg_name) { int i = 0; @@ -715,7 +714,7 @@ struct uacce_dev_list *wd_get_accel_list(const char *alg_name) if (!head) head = node; else - add_uacce_dev_to_list(head, node); + wd_add_dev_to_list(head, node); }
closedir(wd_class); @@ -774,7 +773,7 @@ struct uacce_dev *wd_get_accel_dev(const char *alg_name) }
if (dev) - target = clone_uacce_dev(dev); + target = wd_clone_dev(dev);
wd_free_list_accels(head);
For performance reasons, uadk leaves many configuration options to users. This gives users great flexibility, but it also makes the current initialization interface highly complex. Therefore, to help users adapt quickly, a new set of interfaces is provided.
'wd_alg_init2_()' completes all initialization steps. Four parameters describe the user configuration requirements:
@alg: The algorithm the user wants to use.
@sched_type: The scheduling type the user wants to use.
@numa_bitmask: The bitmask provided by libnuma. Users can use this parameter to control which devices ctxs are requested from in the NUMA-binding scenario.
@ctx_nums: The number of ctxs requested for each NUMA node. Because users may need different numbers of ctxs for different ctx types, a two-dimensional array is needed as input.
If users find 'wd_alg_init2_()' too complex, 'wd_alg_init2()' is a simplified wrapper that uses the default values of numa_bitmask and ctx_nums.
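The relationship between the full interface and the simplified wrapper can be sketched as follows. All demo_* names are hypothetical, the structs only mirror the shape of struct wd_ctx_nums and struct wd_ctx_params, the NUMA bitmask parameter is left out so the example does not depend on libnuma, and the default of one sync plus one async ctx is an assumption made for illustration rather than the value uadk uses.

```c
#include <stddef.h>

/* Hypothetical mirrors of struct wd_ctx_nums and struct wd_ctx_params. */
struct demo_ctx_nums {
    unsigned int sync_ctx_num;
    unsigned int async_ctx_num;
};

struct demo_ctx_params {
    unsigned int ctx_set_size;          /* number of ctx sets */
    struct demo_ctx_nums *ctx_set_num;  /* one entry per ctx set */
};

/* Assumed default for this sketch: a single set with 1 sync and 1 async ctx. */
static struct demo_ctx_nums demo_default_nums[] = { { 1, 1 } };
static struct demo_ctx_params demo_default_params = { 1, demo_default_nums };

/*
 * Full form: a NULL cparams selects the defaults. Instead of requesting
 * real ctxs, this sketch returns the total number that would be requested.
 */
static int demo_init2_(const char *alg, unsigned int sched_type,
                       const struct demo_ctx_params *cparams)
{
    unsigned int i, total = 0;

    (void)sched_type;   /* not used by this sketch */

    if (!alg)
        return -1;

    if (!cparams)
        cparams = &demo_default_params;

    for (i = 0; i < cparams->ctx_set_size; i++)
        total += cparams->ctx_set_num[i].sync_ctx_num +
                 cparams->ctx_set_num[i].async_ctx_num;

    return (int)total;
}

/* Simplified form: a macro supplies the default for the optional argument. */
#define demo_init2(alg, sched_type) demo_init2_(alg, sched_type, NULL)
```

With these definitions, demo_init2("zlib", 0) falls back to the defaults, while demo_init2_() with an explicit demo_ctx_params keeps full control over the ctx counts. The macro form gives callers a short entry point without duplicating the initialization logic.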
Signed-off-by: Yang Shen <shenyang39@huawei.com>
---
 Makefile.am             |   4 +-
 include/wd.h            |  24 ++++
 include/wd_alg_common.h |  24 ++++
 include/wd_comp.h       |  27 +++++
 include/wd_util.h       |  19 ++++
 wd.c                    |  62 +++++++++++
 wd_comp.c               |  86 +++++++++++++++
 wd_util.c               | 236 +++++++++++++++++++++++++++++++++++++++-
 8 files changed, 479 insertions(+), 3 deletions(-)
diff --git a/Makefile.am b/Makefile.am index 53f36f9..5465b64 100644 --- a/Makefile.am +++ b/Makefile.am @@ -86,7 +86,7 @@ AM_CFLAGS += -DWD_NO_LOG
libwd_la_LIBADD = $(libwd_la_OBJECTS) -lnuma
-libwd_comp_la_LIBADD = $(libwd_la_OBJECTS) -ldl +libwd_comp_la_LIBADD = $(libwd_la_OBJECTS) -ldl -lnuma libwd_comp_la_DEPENDENCIES = libwd.la
libhisi_zip_la_LIBADD = -ldl @@ -103,7 +103,7 @@ else libwd_la_LDFLAGS=$(UADK_VERSION) libwd_la_LIBADD= -lnuma
-libwd_comp_la_LIBADD= -lwd -ldl +libwd_comp_la_LIBADD= -lwd -ldl -lnuma libwd_comp_la_LDFLAGS=$(UADK_VERSION) libwd_comp_la_DEPENDENCIES= libwd.la
diff --git a/include/wd.h b/include/wd.h index e1a87de..facd992 100644 --- a/include/wd.h +++ b/include/wd.h @@ -348,6 +348,16 @@ int wd_get_avail_ctx(struct uacce_dev *dev); */ struct uacce_dev_list *wd_get_accel_list(const char *alg_name);
+/** + * wd_find_dev_by_numa() - get device with max available ctx number from an + * device list according to numa id. + * @list: The device list. + * @numa_id: The numa_id. + * + * Return device if succeed and other error number if fail. + */ +struct uacce_dev *wd_find_dev_by_numa(struct uacce_dev_list *list, int numa_id); + /** * wd_get_accel_dev() - Get device supporting the algorithm with smallest numa distance to current numa node. @@ -523,6 +533,20 @@ struct uacce_dev *wd_clone_dev(struct uacce_dev *dev); */ void wd_add_dev_to_list(struct uacce_dev_list *head, struct uacce_dev_list *node);
+/** + * wd_create_device_nodemask() - create a numa node mask of device list. + * @list: The devices list. + * + * Return a pointer value if succeed, and error number if fail. + */ +struct bitmask *wd_create_device_nodemask(struct uacce_dev_list *list); + +/** + * wd_free_device_nodemask() - free a numa node mask. + * @bmp: A numa node mask. + */ +void wd_free_device_nodemask(struct bitmask *bmp); + /** * wd_ctx_get_dev_name() - Get the device name about task. * @h_ctx: The handle of context. diff --git a/include/wd_alg_common.h b/include/wd_alg_common.h index c455dc3..5f63215 100644 --- a/include/wd_alg_common.h +++ b/include/wd_alg_common.h @@ -63,6 +63,30 @@ struct wd_ctx_config { void *priv; };
+/** + * struct wd_ctx_nums - Define the ctx sets numbers. + * @sync_ctx_num: The ctx numbers which are used for sync mode for each + * ctx sets. + * @async_ctx_num: The ctx numbers which are used for async mode for each + * ctx sets. + */ +struct wd_ctx_nums { + __u32 sync_ctx_num; + __u32 async_ctx_num; +}; + +/** + * struct wd_ctx_params - Define the ctx sets params which are used for init + * algorithms. + * @ctx_set_size: Number of ctx sets to be created. Usually users can + * set it according to <alg>_op_type. + * @ctx_set_num: Each ctx sets numbers. + */ +struct wd_ctx_params { + __u32 ctx_set_size; + struct wd_ctx_nums *ctx_set_num; +}; + struct wd_ctx_internal { handle_t ctx; __u8 op_type; diff --git a/include/wd_comp.h b/include/wd_comp.h index e043a83..d96110e 100644 --- a/include/wd_comp.h +++ b/include/wd_comp.h @@ -7,6 +7,7 @@ #ifndef __WD_COMP_H #define __WD_COMP_H
+#include <numa.h> #include "wd.h" #include "wd_alg_common.h"
@@ -113,6 +114,32 @@ int wd_comp_init(struct wd_ctx_config *config, struct wd_sched *sched); */ void wd_comp_uninit(void);
+/** + * wd_comp_init2_() - A simplify interface to initializate uadk + * compression/decompression. This interface keeps most functions of + * wd_comp_init(). Users just need to descripe the deployment of + * business scenarios. Then the initialization will request appropriate + * resources to support the business scenarios. + * To make the initializate simpler, bmp and cparams support set NULL. + * And then the function will set them as default. + * + * @alg: The selected algorithm. + * @sched_type: The scheduler type. + * @bmp: Node mask of the required devices. + * @cparams: The ctx number settings. + * + * Return 0 if succeed and others if fail. + */ +int wd_comp_init2_(char *alg, __u32 sched_type, struct bitmask *bmp, struct wd_ctx_params *cparams); + +#define wd_comp_init2(alg, sched_type) \ + wd_comp_init2_(alg, sched_type, NULL, NULL) + +/** + * wd_comp_uninit2() - Uninitialise ctx configuration and scheduler. + */ +void wd_comp_uninit2(void); + struct wd_comp_sess_setup { enum wd_comp_alg_type alg_type; /* Denoted by enum wd_comp_alg_type */ enum wd_comp_level comp_lv; /* Denoted by enum wd_comp_level */ diff --git a/include/wd_util.h b/include/wd_util.h index eafe3ce..8ae70f1 100644 --- a/include/wd_util.h +++ b/include/wd_util.h @@ -7,6 +7,7 @@ #ifndef __WD_UTIL_H #define __WD_UTIL_H
+#include <numa.h> #include <stdbool.h> #include <sys/ipc.h> #include <sys/shm.h> @@ -112,6 +113,15 @@ struct wd_msg_handle { int (*recv)(handle_t sess, void *msg); };
+struct wd_init_attrs { + __u32 sched_type; + char *alg; + struct bitmask *bmp; + struct wd_sched *sched; + struct wd_ctx_params *cparams; + struct wd_ctx_config *ctx_config; +}; + /* * wd_init_ctx_config() - Init internal ctx configuration. * @in: ctx configuration in global setting. @@ -394,6 +404,15 @@ static inline void wd_alg_clear_init(enum wd_status *status) __atomic_store(status, &setting, __ATOMIC_RELAXED); }
+/** + * wd_alg_pre_init() - Request the ctxs and initialize the sched_domain + * with the given devices list, ctxs number and numa mask. + * @attrs: the algorithm initialization parameters. + * + * Return device if succeed and other error number if fail. + */ +int wd_alg_pre_init(struct wd_init_attrs *attrs); + /** * wd_dfx_msg_cnt() - Message counter interface for ctx * @msg: Shared memory addr. diff --git a/wd.c b/wd.c index 78094d8..9eb69d2 100644 --- a/wd.c +++ b/wd.c @@ -727,6 +727,35 @@ free_list: return NULL; }
+struct uacce_dev *wd_find_dev_by_numa(struct uacce_dev_list *list, int numa_id) +{ + struct uacce_dev *dev = WD_ERR_PTR(-WD_ENODEV); + struct uacce_dev_list *p = list; + int ctx_num, ctx_max = 0; + + if (!list) { + WD_ERR("invalid: list is NULL!\n"); + return WD_ERR_PTR(-WD_EINVAL); + } + + while (p) { + if (numa_id != p->dev->numa_id) { + p = p->next; + continue; + } + + ctx_num = wd_get_avail_ctx(p->dev); + if (ctx_num > ctx_max) { + dev = p->dev; + ctx_max = ctx_num; + } + + p = p->next; + } + + return dev; +} + void wd_free_list_accels(struct uacce_dev_list *list) { struct uacce_dev_list *curr, *next; @@ -793,6 +822,39 @@ int wd_ctx_set_io_cmd(handle_t h_ctx, unsigned long cmd, void *arg) return ioctl(ctx->fd, cmd, arg); }
+struct bitmask *wd_create_device_nodemask(struct uacce_dev_list *list) +{ + struct uacce_dev_list *p; + struct bitmask *bmp; + + if (!list) { + WD_ERR("invalid: list is NULL!\n"); + return WD_ERR_PTR(-WD_EINVAL); + } + + bmp = numa_allocate_nodemask(); + if (!bmp) { + WD_ERR("failed to alloc bitmask(%d)!\n", errno); + return WD_ERR_PTR(-WD_ENOMEM); + } + + p = list; + while (p) { + numa_bitmask_setbit(bmp, p->dev->numa_id); + p = p->next; + } + + return bmp; +} + +void wd_free_device_nodemask(struct bitmask *bmp) +{ + if (!bmp) + return; + + numa_free_nodemask(bmp); +} + void wd_get_version(void) { const char *wd_released_time = UADK_RELEASED_TIME; diff --git a/wd_comp.c b/wd_comp.c index 44593a6..ea80d13 100644 --- a/wd_comp.c +++ b/wd_comp.c @@ -14,6 +14,7 @@
#include "config.h" #include "drv/wd_comp_drv.h" +#include "wd_sched.h" #include "wd_util.h" #include "wd_comp.h"
@@ -21,6 +22,8 @@ #define HW_CTX_SIZE (64 * 1024) #define STREAM_CHUNK (128 * 1024)
+#define SCHED_RR_NAME "sched_rr" + #define swap_byte(x) \ ((((x) & 0x000000ff) << 24) | \ (((x) & 0x0000ff00) << 8) | \ @@ -42,6 +45,7 @@ struct wd_comp_sess {
struct wd_comp_setting { enum wd_status status; + enum wd_status status2; struct wd_ctx_config_internal config; struct wd_sched sched; struct wd_comp_driver *driver; @@ -52,6 +56,19 @@ struct wd_comp_setting {
struct wd_env_config wd_comp_env_config;
+static struct wd_init_attrs wd_comp_init_attrs; +static struct wd_ctx_config wd_comp_ctx; +static struct wd_sched *wd_comp_sched; + +static struct wd_ctx_nums wd_comp_ctx_num[] = { + {1, 1}, {1, 1}, {} +}; + +static struct wd_ctx_params wd_comp_cparams = { + .ctx_set_size = WD_DIR_MAX, + .ctx_set_num = wd_comp_ctx_num +}; + #ifdef WD_STATIC_DRV static void wd_comp_set_static_drv(void) { @@ -178,6 +195,73 @@ void wd_comp_uninit(void) wd_alg_clear_init(&wd_comp_setting.status); }
+int wd_comp_init2_(char *alg, __u32 sched_type, struct bitmask *bmp, struct wd_ctx_params *cparams) +{ + bool flag; + int ret; + + flag = wd_alg_try_init(&wd_comp_setting.status2); + if (!flag) + return 0; + + if (!alg) { + WD_ERR("invalid: alg is NULL!\n"); + ret = -WD_EINVAL; + goto out_uninit; + } + + wd_comp_init_attrs.alg = alg; + wd_comp_init_attrs.sched_type = sched_type; + wd_comp_init_attrs.bmp = bmp; + wd_comp_init_attrs.cparams = cparams ? cparams : &wd_comp_cparams; + wd_comp_init_attrs.ctx_config = &wd_comp_ctx; + + wd_comp_sched = wd_sched_rr_alloc(sched_type, wd_comp_init_attrs.cparams->ctx_set_size, + numa_max_node() + 1, wd_comp_poll_ctx); + if (!wd_comp_sched) { + ret = -WD_EINVAL; + goto out_uninit; + } + wd_comp_sched->name = SCHED_RR_NAME; + wd_comp_init_attrs.sched = wd_comp_sched; + + ret = wd_alg_pre_init(&wd_comp_init_attrs); + if (ret) + goto out_freesched; + + ret = wd_comp_init(&wd_comp_ctx, wd_comp_sched); + if (ret) + goto out_freesched; + + wd_alg_set_init(&wd_comp_setting.status2); + + return 0; + +out_freesched: + wd_sched_rr_release(wd_comp_sched); + +out_uninit: + wd_alg_clear_init(&wd_comp_setting.status2); + + return ret; +} + +void wd_comp_uninit2(void) +{ + int i; + + wd_comp_uninit(); + + for (i = 0; i < wd_comp_ctx.ctx_num; i++) + if (wd_comp_ctx.ctxs[i].ctx) { + wd_release_ctx(wd_comp_ctx.ctxs[i].ctx); + wd_comp_ctx.ctxs[i].ctx = 0; + } + + wd_sched_rr_release(wd_comp_sched); + wd_alg_clear_init(&wd_comp_setting.status2); +} + struct wd_comp_msg *wd_comp_get_msg(__u32 idx, __u32 tag) { return wd_find_msg_in_pool(&wd_comp_setting.pool, idx, tag); @@ -289,6 +373,7 @@ handle_t wd_comp_alloc_sess(struct wd_comp_sess_setup *setup) sess->comp_lv = setup->comp_lv; sess->win_sz = setup->win_sz; sess->stream_pos = WD_COMP_STREAM_NEW; + /* Some simple scheduler don't need scheduling parameters */ sess->sched_key = (void *)wd_comp_setting.sched.sched_init( wd_comp_setting.sched.h_sched_ctx, setup->sched_param); @@ -318,6 +403,7 @@ 
void wd_comp_free_sess(handle_t h_sess)
if (sess->sched_key) free(sess->sched_key); + free(sess); }
diff --git a/wd_util.c b/wd_util.c index fa77b46..d618776 100644 --- a/wd_util.c +++ b/wd_util.c @@ -5,7 +5,6 @@ */
#define _GNU_SOURCE -#include <numa.h> #include <pthread.h> #include <semaphore.h> #include <string.h> @@ -1801,3 +1800,238 @@ bool wd_alg_try_init(enum wd_status *status)
return true; } + +static __u32 wd_get_ctx_numbers(struct wd_ctx_params cparams, int end) +{ + __u32 count = 0; + int i; + + for (i = 0; i < end; i++) { + count += cparams.ctx_set_num[i].sync_ctx_num; + count += cparams.ctx_set_num[i].async_ctx_num; + } + + return count; +} + +struct uacce_dev_list *wd_get_usable_list(struct uacce_dev_list *list, struct bitmask *bmp) +{ + struct uacce_dev_list *p, *node, *result = NULL; + struct uacce_dev *dev; + int numa_id, ret; + + if (!bmp) { + WD_ERR("invalid: bmp is NULL!\n"); + return WD_ERR_PTR(-WD_EINVAL); + } + + p = list; + while (p) { + dev = p->dev; + numa_id = dev->numa_id; + ret = numa_bitmask_isbitset(bmp, numa_id); + if (!ret) { + p = p->next; + continue; + } + + node = calloc(1, sizeof(*node)); + if (!node) { + result = WD_ERR_PTR(-WD_ENOMEM); + goto out_free_list; + } + + node->dev = wd_clone_dev(dev); + if (!node->dev) { + result = WD_ERR_PTR(-WD_ENOMEM); + goto out_free_node; + } + + if (!result) + result = node; + else + wd_add_dev_to_list(result, node); + + p = p->next; + } + + return result; + +out_free_node: + free(node); +out_free_list: + wd_free_list_accels(result); + return result; +} + +static int wd_init_ctx_set(struct wd_init_attrs *attrs, struct uacce_dev_list *list, + int idx, int numa_id, int op_type) +{ + struct wd_ctx_nums ctx_nums = attrs->cparams->ctx_set_num[op_type]; + __u32 ctx_set_num = ctx_nums.sync_ctx_num + ctx_nums.async_ctx_num; + struct wd_ctx_config *ctx_config = attrs->ctx_config; + struct uacce_dev *dev; + int i; + + dev = wd_find_dev_by_numa(list, numa_id); + if (WD_IS_ERR(dev)) + return WD_PTR_ERR(dev); + + for (i = idx; i < idx + ctx_set_num; i++) { + ctx_config->ctxs[i].ctx = wd_request_ctx(dev); + if (errno == WD_EBUSY) { + dev = wd_find_dev_by_numa(list, numa_id); + if (WD_IS_ERR(dev)) + return WD_PTR_ERR(dev); + i--; + } + ctx_config->ctxs[i].op_type = op_type; + ctx_config->ctxs[i].ctx_mode = + ((i - idx) < ctx_nums.sync_ctx_num) ? 
+ CTX_MODE_SYNC : CTX_MODE_ASYNC; + } + + return 0; +} + +static void wd_release_ctx_set(struct wd_ctx_config *ctx_config) +{ + int i; + + for (i = 0; i < ctx_config->ctx_num; i++) + if (ctx_config->ctxs[i].ctx) { + wd_release_ctx(ctx_config->ctxs[i].ctx); + ctx_config->ctxs[i].ctx = 0; + } +} + +static int wd_instance_sched_set(struct wd_sched *sched, struct wd_ctx_nums ctx_nums, + int idx, int numa_id, int op_type) +{ + struct sched_params sparams; + int i, ret = 0; + + for (i = 0; i < CTX_MODE_MAX; i++) { + sparams.numa_id = numa_id; + sparams.type = op_type; + sparams.mode = i; + sparams.begin = idx + ctx_nums.sync_ctx_num * i; + sparams.end = idx - 1 + ctx_nums.sync_ctx_num + ctx_nums.async_ctx_num * i; + if (sparams.begin > sparams.end) + continue; + ret = wd_sched_rr_instance(sched, &sparams); + if (ret) + goto out; + } + +out: + return ret; +} + +static int wd_init_ctx_and_sched(struct wd_init_attrs *attrs, struct bitmask *bmp, + struct uacce_dev_list *list) +{ + struct wd_ctx_params *cparams = attrs->cparams; + __u32 ctx_set_size = cparams->ctx_set_size; + int max_node = numa_max_node() + 1; + struct wd_ctx_nums ctx_nums; + int i, j, ret; + int idx = 0; + + for (i = 0; i < max_node; i++) { + if (!numa_bitmask_isbitset(bmp, i)) + continue; + for (j = 0; j < ctx_set_size; j++) { + ctx_nums = cparams->ctx_set_num[j]; + ret = wd_init_ctx_set(attrs, list, idx, i, j); + if (ret) + goto free_ctxs; + ret = wd_instance_sched_set(attrs->sched, ctx_nums, idx, i, j); + if (ret) + goto free_ctxs; + idx += (ctx_nums.sync_ctx_num + ctx_nums.async_ctx_num); + } + } + + return 0; + +free_ctxs: + wd_release_ctx_set(attrs->ctx_config); + + return ret; +} + +int wd_alg_pre_init(struct wd_init_attrs *attrs) +{ + struct wd_ctx_config *ctx_config = attrs->ctx_config; + struct wd_ctx_params *cparams = attrs->cparams; + struct uacce_dev_list *list, *used_list = NULL; + struct bitmask *used_bmp, *bmp = attrs->bmp; + __u32 ctx_set_num, ctx_set_size; + int numa_cnt, ret; + + list = 
wd_get_accel_list(attrs->alg); + if (!list) { + WD_ERR("failed to get devices!\n"); + return -WD_ENODEV; + } + + ctx_set_size = cparams->ctx_set_size; + ctx_set_num = wd_get_ctx_numbers(*cparams, ctx_set_size); + if (!ctx_set_num || !ctx_set_size) { + WD_ERR("invalid: ctx_set_num is %d, ctx_set_size is %d!\n", + ctx_set_num, ctx_set_size); + ret = -WD_EINVAL; + goto out_freelist; + } + + /* + * Not every numa has a device. Therefore, the first thing is to + * filter the devices in the selected numa node, and the second + * thing is to obtain the distribution of devices. + */ + if (bmp) { + used_list = wd_get_usable_list(list, bmp); + if (WD_IS_ERR(used_list)) { + ret = WD_PTR_ERR(used_list); + WD_ERR("failed to get usable devices(%d)!\n", ret); + goto out_freelist; + } + } + + used_bmp = wd_create_device_nodemask(used_list ? used_list : list); + if (WD_IS_ERR(bmp)) { + ret = WD_PTR_ERR(bmp); + goto out_freeusedlist; + } + + numa_cnt = numa_bitmask_weight(used_bmp); + if (!numa_cnt) { + ret = numa_cnt; + WD_ERR("invalid: bmp is clear!\n"); + goto out_freenodemask; + } + + ctx_config->ctx_num = ctx_set_num * numa_cnt; + ctx_config->ctxs = calloc(ctx_config->ctx_num, sizeof(struct wd_ctx)); + if (!ctx_config->ctxs) { + ret = -WD_ENOMEM; + WD_ERR("failed to alloc ctxs!\n"); + goto out_freenodemask; + } + + ret = wd_init_ctx_and_sched(attrs, used_bmp, used_list ? used_list : list); + if (ret) + free(ctx_config->ctxs); + +out_freenodemask: + wd_free_device_nodemask(used_bmp); + +out_freeusedlist: + wd_free_list_accels(used_list); + +out_freelist: + wd_free_list_accels(list); + + return ret; +}
On 2022/10/28 15:00, Yang Shen wrote:
For performance reasons, uadk leaves many configuration options to users. This gives users great flexibility, but it also makes the current initialization interface complex. Therefore, to help users adapt quickly, a new set of interfaces is provided.
'wd_alg_init2_()' completes all initialization steps. Four parameters describe the user's configuration requirements. @alg: The algorithm the user wants to use. @sched_type: The scheduling type the user wants to use. @numa_bitmask: The bitmask provided by libnuma. Users can use this parameter to control which devices ctxs are requested from in the NUMA-binding scenario. @ctx_nums: The number of ctxs requested for each NUMA node. Because users may need different numbers of ctxs for different ctx types, a two-dimensional array is required as input.
If users find 'wd_alg_init2_()' too complex, 'wd_alg_init2()' is a simplified wrapper that uses the default values of numa_bitmask and ctx_nums.
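The relationship between the full and the simplified entry point can be sketched as below. This is only an illustration with stand-in types: every demo_* name is hypothetical and not part of uadk, the node mask is modelled as a plain unsigned long instead of a libnuma bitmask, and the default values simply copy the {1, 1} pairs used for compression in this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in types for this sketch only; the real ones are defined by uadk. */
struct demo_ctx_nums {
	unsigned int sync_ctx_num;
	unsigned int async_ctx_num;
};

struct demo_ctx_params {
	unsigned int ctx_set_size;
	struct demo_ctx_nums *ctx_set_num;
};

/* Assumed defaults: one sync and one async ctx for each of two op types. */
static struct demo_ctx_nums demo_default_nums[] = { {1, 1}, {1, 1} };
static struct demo_ctx_params demo_default_params = {
	.ctx_set_size = 2,
	.ctx_set_num = demo_default_nums,
};

/* Records what the last call resolved to, so the effect is observable. */
static const struct demo_ctx_params *demo_used_params;
static int demo_used_all_nodes;

/* Full form: NULL for node_mask or cparams selects the default. */
static int demo_init2_(const char *alg, unsigned int sched_type,
		       const unsigned long *node_mask,
		       const struct demo_ctx_params *cparams)
{
	(void)sched_type; /* not needed for this illustration */

	if (!alg)
		return -1; /* the patch returns -WD_EINVAL in this case */

	demo_used_params = cparams ? cparams : &demo_default_params;
	demo_used_all_nodes = (node_mask == NULL);

	return 0;
}

/* Short form: the same entry point with both optional arguments defaulted. */
#define demo_init2(alg, sched_type) \
	demo_init2_(alg, sched_type, NULL, NULL)
```

A caller that is happy with the defaults would write demo_init2("zlib", 0), while a caller that binds to specific nodes or needs different ctx counts would call demo_init2_() with its own mask and parameters.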
Signed-off-by: Yang Shen <shenyang39@huawei.com>
 Makefile.am             |   4 +-
 include/wd.h            |  24 ++++
 include/wd_alg_common.h |  24 ++++
 include/wd_comp.h       |  27 +++++
 include/wd_util.h       |  19 ++++
 wd.c                    |  62 +++++++++++
 wd_comp.c               |  86 +++++++++++++++
 wd_util.c               | 236 +++++++++++++++++++++++++++++++++++++++-
 8 files changed, 479 insertions(+), 3 deletions(-)
diff --git a/Makefile.am b/Makefile.am index 53f36f9..5465b64 100644 --- a/Makefile.am +++ b/Makefile.am @@ -86,7 +86,7 @@ AM_CFLAGS += -DWD_NO_LOG
libwd_la_LIBADD = $(libwd_la_OBJECTS) -lnuma
-libwd_comp_la_LIBADD = $(libwd_la_OBJECTS) -ldl +libwd_comp_la_LIBADD = $(libwd_la_OBJECTS) -ldl -lnuma libwd_comp_la_DEPENDENCIES = libwd.la
libhisi_zip_la_LIBADD = -ldl @@ -103,7 +103,7 @@ else libwd_la_LDFLAGS=$(UADK_VERSION) libwd_la_LIBADD= -lnuma
-libwd_comp_la_LIBADD= -lwd -ldl +libwd_comp_la_LIBADD= -lwd -ldl -lnuma libwd_comp_la_LDFLAGS=$(UADK_VERSION) libwd_comp_la_DEPENDENCIES= libwd.la
diff --git a/include/wd.h b/include/wd.h index e1a87de..facd992 100644 --- a/include/wd.h +++ b/include/wd.h @@ -348,6 +348,16 @@ int wd_get_avail_ctx(struct uacce_dev *dev); */ struct uacce_dev_list *wd_get_accel_list(const char *alg_name);
+/**
- wd_find_dev_by_numa() - get device with max available ctx number from an
device list according to numa id.
- @list: The device list.
- @numa_id: The numa_id.
- Return device if succeed and other error number if fail.
- */
+struct uacce_dev *wd_find_dev_by_numa(struct uacce_dev_list *list, int numa_id);
/**
- wd_get_accel_dev() - Get device supporting the algorithm with smallest numa distance to current numa node.
@@ -523,6 +533,20 @@ struct uacce_dev *wd_clone_dev(struct uacce_dev *dev); */ void wd_add_dev_to_list(struct uacce_dev_list *head, struct uacce_dev_list *node);
+/**
- wd_create_device_nodemask() - create a numa node mask of device list.
- @list: The devices list.
- Return a pointer value if succeed, and error number if fail.
- */
+struct bitmask *wd_create_device_nodemask(struct uacce_dev_list *list);
+/**
- wd_free_device_nodemask() - free a numa node mask.
- @bmp: A numa node mask.
- */
+void wd_free_device_nodemask(struct bitmask *bmp);
/**
- wd_ctx_get_dev_name() - Get the device name about task.
- @h_ctx: The handle of context.
diff --git a/include/wd_alg_common.h b/include/wd_alg_common.h index c455dc3..5f63215 100644 --- a/include/wd_alg_common.h +++ b/include/wd_alg_common.h @@ -63,6 +63,30 @@ struct wd_ctx_config { void *priv; };
+/**
- struct wd_ctx_nums - Define the ctx sets numbers.
- @sync_ctx_num: The ctx numbers which are used for sync mode for each
- ctx sets.
- @async_ctx_num: The ctx numbers which are used for async mode for each
- ctx sets.
- */
+struct wd_ctx_nums {
- __u32 sync_ctx_num;
- __u32 async_ctx_num;
+};
+/**
- struct wd_ctx_params - Define the ctx sets params which are used for init
- algorithms.
- @ctx_set_size: Number of ctx sets to be created. Usually users can
- set it according to <alg>_op_type.
- @ctx_set_num: Each ctx sets numbers.
- */
+struct wd_ctx_params {
- __u32 ctx_set_size;
- struct wd_ctx_nums *ctx_set_num;
+};
struct wd_ctx_internal { handle_t ctx; __u8 op_type; diff --git a/include/wd_comp.h b/include/wd_comp.h index e043a83..d96110e 100644 --- a/include/wd_comp.h +++ b/include/wd_comp.h @@ -7,6 +7,7 @@ #ifndef __WD_COMP_H #define __WD_COMP_H
+#include <numa.h> #include "wd.h" #include "wd_alg_common.h"
@@ -113,6 +114,32 @@ int wd_comp_init(struct wd_ctx_config *config, struct wd_sched *sched); */ void wd_comp_uninit(void);
+/**
- wd_comp_init2_() - A simplify interface to initializate uadk
- compression/decompression. This interface keeps most functions of
- wd_comp_init(). Users just need to descripe the deployment of
- business scenarios. Then the initialization will request appropriate
- resources to support the business scenarios.
- To make the initializate simpler, bmp and cparams support set NULL.
- And then the function will set them as default.
- @alg: The selected algorithm.
- @sched_type: The scheduler type.
- @bmp: Node mask of the required devices.
- @cparams: The ctx number settings.
- Return 0 if succeed and others if fail.
- */
+int wd_comp_init2_(char *alg, __u32 sched_type, struct bitmask *bmp, struct wd_ctx_params *cparams);
+#define wd_comp_init2(alg, sched_type) \
- wd_comp_init2_(alg, sched_type, NULL, NULL)
+/**
- wd_comp_uninit2() - Uninitialise ctx configuration and scheduler.
- */
+void wd_comp_uninit2(void);
struct wd_comp_sess_setup { enum wd_comp_alg_type alg_type; /* Denoted by enum wd_comp_alg_type */ enum wd_comp_level comp_lv; /* Denoted by enum wd_comp_level */ diff --git a/include/wd_util.h b/include/wd_util.h index eafe3ce..8ae70f1 100644 --- a/include/wd_util.h +++ b/include/wd_util.h @@ -7,6 +7,7 @@ #ifndef __WD_UTIL_H #define __WD_UTIL_H
+#include <numa.h> #include <stdbool.h> #include <sys/ipc.h> #include <sys/shm.h> @@ -112,6 +113,15 @@ struct wd_msg_handle { int (*recv)(handle_t sess, void *msg); };
+struct wd_init_attrs {
- __u32 sched_type;
- char *alg;
- struct bitmask *bmp;
- struct wd_sched *sched;
- struct wd_ctx_params *cparams;
- struct wd_ctx_config *ctx_config;
+};
/*
- wd_init_ctx_config() - Init internal ctx configuration.
- @in: ctx configuration in global setting.
@@ -394,6 +404,15 @@ static inline void wd_alg_clear_init(enum wd_status *status) __atomic_store(status, &setting, __ATOMIC_RELAXED); }
+/**
- wd_alg_pre_init() - Request the ctxs and initialize the sched_domain
with the given devices list, ctxs number and numa mask.
- @attrs: the algorithm initialization parameters.
- Return device if succeed and other error number if fail.
- */
+int wd_alg_pre_init(struct wd_init_attrs *attrs);
/**
- wd_dfx_msg_cnt() - Message counter interface for ctx
- @msg: Shared memory addr.
diff --git a/wd.c b/wd.c index 78094d8..9eb69d2 100644 --- a/wd.c +++ b/wd.c @@ -727,6 +727,35 @@ free_list: return NULL; }
+struct uacce_dev *wd_find_dev_by_numa(struct uacce_dev_list *list, int numa_id) +{
- struct uacce_dev *dev = WD_ERR_PTR(-WD_ENODEV);
- struct uacce_dev_list *p = list;
- int ctx_num, ctx_max = 0;
- if (!list) {
WD_ERR("invalid: list is NULL!\n");
return WD_ERR_PTR(-WD_EINVAL);
- }
- while (p) {
if (numa_id != p->dev->numa_id) {
p = p->next;
continue;
}
ctx_num = wd_get_avail_ctx(p->dev);
if (ctx_num > ctx_max) {
dev = p->dev;
ctx_max = ctx_num;
}
p = p->next;
- }
- return dev;
+}
void wd_free_list_accels(struct uacce_dev_list *list) { struct uacce_dev_list *curr, *next; @@ -793,6 +822,39 @@ int wd_ctx_set_io_cmd(handle_t h_ctx, unsigned long cmd, void *arg) return ioctl(ctx->fd, cmd, arg); }
+struct bitmask *wd_create_device_nodemask(struct uacce_dev_list *list) +{
- struct uacce_dev_list *p;
- struct bitmask *bmp;
- if (!list) {
WD_ERR("invalid: list is NULL!\n");
return WD_ERR_PTR(-WD_EINVAL);
- }
- bmp = numa_allocate_nodemask();
- if (!bmp) {
WD_ERR("failed to alloc bitmask(%d)!\n", errno);
return WD_ERR_PTR(-WD_ENOMEM);
- }
- p = list;
- while (p) {
numa_bitmask_setbit(bmp, p->dev->numa_id);
p = p->next;
- }
- return bmp;
+}
+void wd_free_device_nodemask(struct bitmask *bmp) +{
- if (!bmp)
return;
- numa_free_nodemask(bmp);
+}
void wd_get_version(void) { const char *wd_released_time = UADK_RELEASED_TIME; diff --git a/wd_comp.c b/wd_comp.c index 44593a6..ea80d13 100644 --- a/wd_comp.c +++ b/wd_comp.c @@ -14,6 +14,7 @@
#include "config.h" #include "drv/wd_comp_drv.h" +#include "wd_sched.h" #include "wd_util.h" #include "wd_comp.h"
@@ -21,6 +22,8 @@ #define HW_CTX_SIZE (64 * 1024) #define STREAM_CHUNK (128 * 1024)
+#define SCHED_RR_NAME "sched_rr"
#define swap_byte(x) \ ((((x) & 0x000000ff) << 24) | \ (((x) & 0x0000ff00) << 8) | \ @@ -42,6 +45,7 @@ struct wd_comp_sess {
struct wd_comp_setting { enum wd_status status;
- enum wd_status status2; struct wd_ctx_config_internal config; struct wd_sched sched; struct wd_comp_driver *driver;
@@ -52,6 +56,19 @@ struct wd_comp_setting {
struct wd_env_config wd_comp_env_config;
+static struct wd_init_attrs wd_comp_init_attrs; +static struct wd_ctx_config wd_comp_ctx; +static struct wd_sched *wd_comp_sched;
+static struct wd_ctx_nums wd_comp_ctx_num[] = {
- {1, 1}, {1, 1}, {}
+};
+static struct wd_ctx_params wd_comp_cparams = {
- .ctx_set_size = WD_DIR_MAX,
- .ctx_set_num = wd_comp_ctx_num
+};
#ifdef WD_STATIC_DRV static void wd_comp_set_static_drv(void) { @@ -178,6 +195,73 @@ void wd_comp_uninit(void) wd_alg_clear_init(&wd_comp_setting.status); }
+int wd_comp_init2_(char *alg, __u32 sched_type, struct bitmask *bmp, struct wd_ctx_params *cparams)
I recommend merging bmp into cparams, because it is also a parameter that affects the number of ctxs.
+{
- bool flag;
- int ret;
- flag = wd_alg_try_init(&wd_comp_setting.status2);
- if (!flag)
return 0;
- if (!alg) {
WD_ERR("invalid: alg is NULL!\n");
ret = -WD_EINVAL;
goto out_uninit;
- }
- wd_comp_init_attrs.alg = alg;
- wd_comp_init_attrs.sched_type = sched_type;
- wd_comp_init_attrs.bmp = bmp;
- wd_comp_init_attrs.cparams = cparams ? cparams : &wd_comp_cparams;
I recommend not using this parameter structure: use cparams directly, and pass alg and sched_type to wd_util.c.
- wd_comp_init_attrs.ctx_config = &wd_comp_ctx;
- wd_comp_sched = wd_sched_rr_alloc(sched_type, wd_comp_init_attrs.cparams->ctx_set_size,
numa_max_node() + 1, wd_comp_poll_ctx);
- if (!wd_comp_sched) {
ret = -WD_EINVAL;
goto out_uninit;
- }
- wd_comp_sched->name = SCHED_RR_NAME;
- wd_comp_init_attrs.sched = wd_comp_sched;
- ret = wd_alg_pre_init(&wd_comp_init_attrs);
- if (ret)
goto out_freesched;
- ret = wd_comp_init(&wd_comp_ctx, wd_comp_sched);
- if (ret)
goto out_freesched;
- wd_alg_set_init(&wd_comp_setting.status2);
- return 0;
+out_freesched:
- wd_sched_rr_release(wd_comp_sched);
+out_uninit:
- wd_alg_clear_init(&wd_comp_setting.status2);
- return ret;
+}
+void wd_comp_uninit2(void) +{
- int i;
- wd_comp_uninit();
- for (i = 0; i < wd_comp_ctx.ctx_num; i++)
if (wd_comp_ctx.ctxs[i].ctx) {
wd_release_ctx(wd_comp_ctx.ctxs[i].ctx);
wd_comp_ctx.ctxs[i].ctx = 0;
- }
- wd_sched_rr_release(wd_comp_sched);
- wd_alg_clear_init(&wd_comp_setting.status2);
+}
struct wd_comp_msg *wd_comp_get_msg(__u32 idx, __u32 tag) { return wd_find_msg_in_pool(&wd_comp_setting.pool, idx, tag); @@ -289,6 +373,7 @@ handle_t wd_comp_alloc_sess(struct wd_comp_sess_setup *setup) sess->comp_lv = setup->comp_lv; sess->win_sz = setup->win_sz; sess->stream_pos = WD_COMP_STREAM_NEW;
- /* Some simple scheduler don't need scheduling parameters */ sess->sched_key = (void *)wd_comp_setting.sched.sched_init( wd_comp_setting.sched.h_sched_ctx, setup->sched_param);
@@ -318,6 +403,7 @@ void wd_comp_free_sess(handle_t h_sess)
if (sess->sched_key) free(sess->sched_key);
- free(sess);
}
diff --git a/wd_util.c b/wd_util.c index fa77b46..d618776 100644 --- a/wd_util.c +++ b/wd_util.c @@ -5,7 +5,6 @@ */
#define _GNU_SOURCE -#include <numa.h> #include <pthread.h> #include <semaphore.h> #include <string.h> @@ -1801,3 +1800,238 @@ bool wd_alg_try_init(enum wd_status *status)
return true; }
+static __u32 wd_get_ctx_numbers(struct wd_ctx_params cparams, int end) +{
- __u32 count = 0;
- int i;
- for (i = 0; i < end; i++) {
count += cparams.ctx_set_num[i].sync_ctx_num;
count += cparams.ctx_set_num[i].async_ctx_num;
- }
- return count;
+}
+struct uacce_dev_list *wd_get_usable_list(struct uacce_dev_list *list, struct bitmask *bmp) +{
- struct uacce_dev_list *p, *node, *result = NULL;
- struct uacce_dev *dev;
- int numa_id, ret;
- if (!bmp) {
WD_ERR("invalid: bmp is NULL!\n");
return WD_ERR_PTR(-WD_EINVAL);
- }
- p = list;
- while (p) {
dev = p->dev;
numa_id = dev->numa_id;
ret = numa_bitmask_isbitset(bmp, numa_id);
if (!ret) {
p = p->next;
continue;
}
node = calloc(1, sizeof(*node));
if (!node) {
result = WD_ERR_PTR(-WD_ENOMEM);
goto out_free_list;
}
node->dev = wd_clone_dev(dev);
if (!node->dev) {
result = WD_ERR_PTR(-WD_ENOMEM);
goto out_free_node;
}
if (!result)
result = node;
else
wd_add_dev_to_list(result, node);
p = p->next;
- }
- return result;
+out_free_node:
- free(node);
+out_free_list:
- wd_free_list_accels(result);
- return result;
+}
+static int wd_init_ctx_set(struct wd_init_attrs *attrs, struct uacce_dev_list *list,
int idx, int numa_id, int op_type)
+{
- struct wd_ctx_nums ctx_nums = attrs->cparams->ctx_set_num[op_type];
- __u32 ctx_set_num = ctx_nums.sync_ctx_num + ctx_nums.async_ctx_num;
- struct wd_ctx_config *ctx_config = attrs->ctx_config;
- struct uacce_dev *dev;
- int i;
- dev = wd_find_dev_by_numa(list, numa_id);
- if (WD_IS_ERR(dev))
return WD_PTR_ERR(dev);
- for (i = idx; i < idx + ctx_set_num; i++) {
ctx_config->ctxs[i].ctx = wd_request_ctx(dev);
if (errno == WD_EBUSY) {
dev = wd_find_dev_by_numa(list, numa_id);
if (WD_IS_ERR(dev))
return WD_PTR_ERR(dev);
i--;
}
ctx_config->ctxs[i].op_type = op_type;
ctx_config->ctxs[i].ctx_mode =
((i - idx) < ctx_nums.sync_ctx_num) ?
CTX_MODE_SYNC : CTX_MODE_ASYNC;
- }
- return 0;
+}
+static void wd_release_ctx_set(struct wd_ctx_config *ctx_config) +{
- int i;
- for (i = 0; i < ctx_config->ctx_num; i++)
if (ctx_config->ctxs[i].ctx) {
wd_release_ctx(ctx_config->ctxs[i].ctx);
ctx_config->ctxs[i].ctx = 0;
}
+}
+static int wd_instance_sched_set(struct wd_sched *sched, struct wd_ctx_nums ctx_nums,
int idx, int numa_id, int op_type)
+{
- struct sched_params sparams;
- int i, ret = 0;
- for (i = 0; i < CTX_MODE_MAX; i++) {
sparams.numa_id = numa_id;
sparams.type = op_type;
sparams.mode = i;
sparams.begin = idx + ctx_nums.sync_ctx_num * i;
sparams.end = idx - 1 + ctx_nums.sync_ctx_num + ctx_nums.async_ctx_num * i;
if (sparams.begin > sparams.end)
continue;
ret = wd_sched_rr_instance(sched, &sparams);
if (ret)
goto out;
- }
+out:
- return ret;
+}
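The begin/end arithmetic in wd_instance_sched_set() above can be read as follows: inside one ctx set starting at idx, the sync ctxs occupy the first indexes and the async ctxs follow. A minimal standalone sketch of that layout, assuming mode 0 is sync and mode 1 is async as the loop implies (all demo_* names are hypothetical, and signed int is used for the bounds so that an empty range is visible as begin > end):

```c
#include <assert.h>

/* Stand-in for the per-op-type ctx counts (mirrors struct wd_ctx_nums). */
struct demo_ctx_nums {
	unsigned int sync_ctx_num;
	unsigned int async_ctx_num;
};

/* Inclusive index range; valid is 0 when the mode owns no ctxs. */
struct demo_range {
	int begin;
	int end;
	int valid;
};

enum { DEMO_MODE_SYNC = 0, DEMO_MODE_ASYNC = 1 };

/*
 * Compute the ctx index range of one mode inside the set starting at idx,
 * using the same expressions as the quoted function.
 */
static struct demo_range demo_mode_range(struct demo_ctx_nums nums, int idx, int mode)
{
	struct demo_range r;

	r.begin = idx + (int)nums.sync_ctx_num * mode;
	r.end = idx - 1 + (int)nums.sync_ctx_num + (int)nums.async_ctx_num * mode;
	r.valid = r.begin <= r.end;

	return r;
}
```

With 2 sync and 3 async ctxs starting at index 5, the sync range is [5, 6] and the async range is [7, 9]. With 0 sync ctxs, the sync range is empty and is skipped, which corresponds to the `continue` in the loop.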
+static int wd_init_ctx_and_sched(struct wd_init_attrs *attrs, struct bitmask *bmp,
struct uacce_dev_list *list)
+{
- struct wd_ctx_params *cparams = attrs->cparams;
- __u32 ctx_set_size = cparams->ctx_set_size;
- int max_node = numa_max_node() + 1;
- struct wd_ctx_nums ctx_nums;
- int i, j, ret;
- int idx = 0;
- for (i = 0; i < max_node; i++) {
if (!numa_bitmask_isbitset(bmp, i))
continue;
for (j = 0; j < ctx_set_size; j++) {
ctx_nums = cparams->ctx_set_num[j];
ret = wd_init_ctx_set(attrs, list, idx, i, j);
if (ret)
goto free_ctxs;
ret = wd_instance_sched_set(attrs->sched, ctx_nums, idx, i, j);
if (ret)
goto free_ctxs;
idx += (ctx_nums.sync_ctx_num + ctx_nums.async_ctx_num);
}
- }
- return 0;
+free_ctxs:
- wd_release_ctx_set(attrs->ctx_config);
- return ret;
+}
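For reference, the walk in wd_init_ctx_and_sched() lays the ctx array out node by node, and within each node op type by op type. The sketch below reproduces only that index bookkeeping with stand-in types. The demo_* names are hypothetical, and node_rank is an assumption of this sketch: it counts the nodes that are set in the mask, in ascending order, rather than using the raw NUMA id.

```c
#include <assert.h>

/* Stand-in for the per-op-type ctx counts (mirrors struct wd_ctx_nums). */
struct demo_ctx_nums {
	unsigned int sync_ctx_num;
	unsigned int async_ctx_num;
};

/* Sum of sync and async ctxs over all op types: the per-node total. */
static unsigned int demo_ctxs_per_node(const struct demo_ctx_nums *nums,
				       unsigned int set_size)
{
	unsigned int count = 0;
	unsigned int j;

	for (j = 0; j < set_size; j++)
		count += nums[j].sync_ctx_num + nums[j].async_ctx_num;

	return count;
}

/*
 * Start index of the ctx set for (node_rank, op_type): skip the full
 * blocks of the earlier nodes, then the earlier op types of this node.
 */
static unsigned int demo_set_start(const struct demo_ctx_nums *nums,
				   unsigned int set_size,
				   unsigned int node_rank,
				   unsigned int op_type)
{
	unsigned int idx = node_rank * demo_ctxs_per_node(nums, set_size);
	unsigned int j;

	for (j = 0; j < op_type; j++)
		idx += nums[j].sync_ctx_num + nums[j].async_ctx_num;

	return idx;
}
```

With counts {1, 1} and {2, 3} for two op types, each node takes 7 ctxs, so two selected nodes need 14 entries in total, which matches the ctx_num = ctx_set_num * numa_cnt sizing used in wd_alg_pre_init().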
+int wd_alg_pre_init(struct wd_init_attrs *attrs) +{
- struct wd_ctx_config *ctx_config = attrs->ctx_config;
- struct wd_ctx_params *cparams = attrs->cparams;
- struct uacce_dev_list *list, *used_list = NULL;
- struct bitmask *used_bmp, *bmp = attrs->bmp;
- __u32 ctx_set_num, ctx_set_size;
- int numa_cnt, ret;
- list = wd_get_accel_list(attrs->alg);
- if (!list) {
WD_ERR("failed to get devices!\n");
return -WD_ENODEV;
- }
- ctx_set_size = cparams->ctx_set_size;
- ctx_set_num = wd_get_ctx_numbers(*cparams, ctx_set_size);
- if (!ctx_set_num || !ctx_set_size) {
WD_ERR("invalid: ctx_set_num is %d, ctx_set_size is %d!\n",
ctx_set_num, ctx_set_size);
ret = -WD_EINVAL;
goto out_freelist;
- }
- /*
* Not every numa has a device. Therefore, the first thing is to
* filter the devices in the selected numa node, and the second
* thing is to obtain the distribution of devices.
*/
- if (bmp) {
used_list = wd_get_usable_list(list, bmp);
if (WD_IS_ERR(used_list)) {
ret = WD_PTR_ERR(used_list);
WD_ERR("failed to get usable devices(%d)!\n", ret);
goto out_freelist;
}
- }
- used_bmp = wd_create_device_nodemask(used_list ? used_list : list);
- if (WD_IS_ERR(bmp)) {
ret = WD_PTR_ERR(bmp);
goto out_freeusedlist;
- }
- numa_cnt = numa_bitmask_weight(used_bmp);
- if (!numa_cnt) {
ret = numa_cnt;
WD_ERR("invalid: bmp is clear!\n");
goto out_freenodemask;
- }
- ctx_config->ctx_num = ctx_set_num * numa_cnt;
- ctx_config->ctxs = calloc(ctx_config->ctx_num, sizeof(struct wd_ctx));
- if (!ctx_config->ctxs) {
ret = -WD_ENOMEM;
WD_ERR("failed to alloc ctxs!\n");
goto out_freenodemask;
- }
- ret = wd_init_ctx_and_sched(attrs, used_bmp, used_list ? used_list : list);
- if (ret)
free(ctx_config->ctxs);
+out_freenodemask:
- wd_free_device_nodemask(used_bmp);
+out_freeusedlist:
- wd_free_list_accels(used_list);
+out_freelist:
- wd_free_list_accels(list);
- return ret;
+}
On 2022/10/28 15:31, liulongfang wrote:
On 2022/10/28 15:00, Yang Shen wrote:
For performance reasons, uadk leaves many configuration options to users. This gives users great flexibility, but it also makes the current initialization interface complex. Therefore, to help users adapt quickly, a new set of interfaces is provided.
'wd_alg_init2_()' completes all initialization steps. Four parameters describe the user's configuration requirements. @alg: The algorithm the user wants to use. @sched_type: The scheduling type the user wants to use. @numa_bitmask: The bitmask provided by libnuma. Users can use this parameter to control which devices ctxs are requested from in the NUMA-binding scenario. @ctx_nums: The number of ctxs requested for each NUMA node. Because users may need different numbers of ctxs for different ctx types, a two-dimensional array is required as input.
If users find 'wd_alg_init2_()' too complex, 'wd_alg_init2()' is a simplified wrapper that uses the default values of numa_bitmask and ctx_nums.
Signed-off-by: Yang Shen <shenyang39@huawei.com>
 Makefile.am             |   4 +-
 include/wd.h            |  24 ++++
 include/wd_alg_common.h |  24 ++++
 include/wd_comp.h       |  27 +++++
 include/wd_util.h       |  19 ++++
 wd.c                    |  62 +++++++++++
 wd_comp.c               |  86 +++++++++++++++
 wd_util.c               | 236 +++++++++++++++++++++++++++++++++++++++-
 8 files changed, 479 insertions(+), 3 deletions(-)
diff --git a/Makefile.am b/Makefile.am
index 53f36f9..5465b64 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -86,7 +86,7 @@ AM_CFLAGS += -DWD_NO_LOG

 libwd_la_LIBADD = $(libwd_la_OBJECTS) -lnuma

-libwd_comp_la_LIBADD = $(libwd_la_OBJECTS) -ldl
+libwd_comp_la_LIBADD = $(libwd_la_OBJECTS) -ldl -lnuma
 libwd_comp_la_DEPENDENCIES = libwd.la

 libhisi_zip_la_LIBADD = -ldl
@@ -103,7 +103,7 @@ else
 libwd_la_LDFLAGS=$(UADK_VERSION)
 libwd_la_LIBADD= -lnuma

-libwd_comp_la_LIBADD= -lwd -ldl
+libwd_comp_la_LIBADD= -lwd -ldl -lnuma
 libwd_comp_la_LDFLAGS=$(UADK_VERSION)
 libwd_comp_la_DEPENDENCIES= libwd.la
diff --git a/include/wd.h b/include/wd.h index e1a87de..facd992 100644 --- a/include/wd.h +++ b/include/wd.h @@ -348,6 +348,16 @@ int wd_get_avail_ctx(struct uacce_dev *dev); */ struct uacce_dev_list *wd_get_accel_list(const char *alg_name);
+/**
- wd_find_dev_by_numa() - get device with max available ctx number from an
device list according to numa id.
- @list: The device list.
- @numa_id: The numa_id.
- Return the device on success, or an error pointer on failure.
- */
+struct uacce_dev *wd_find_dev_by_numa(struct uacce_dev_list *list, int numa_id);
- /**
- wd_get_accel_dev() - Get device supporting the algorithm with smallest numa distance to current numa node.
@@ -523,6 +533,20 @@ struct uacce_dev *wd_clone_dev(struct uacce_dev *dev); */ void wd_add_dev_to_list(struct uacce_dev_list *head, struct uacce_dev_list *node);
+/**
- wd_create_device_nodemask() - create a numa node mask of device list.
- @list: The devices list.
- Return a pointer value if succeed, and error number if fail.
- */
+struct bitmask *wd_create_device_nodemask(struct uacce_dev_list *list);
+/**
- wd_free_device_nodemask() - free a numa node mask.
- @bmp: A numa node mask.
- */
+void wd_free_device_nodemask(struct bitmask *bmp);
- /**
- wd_ctx_get_dev_name() - Get the device name about task.
- @h_ctx: The handle of context.
diff --git a/include/wd_alg_common.h b/include/wd_alg_common.h index c455dc3..5f63215 100644 --- a/include/wd_alg_common.h +++ b/include/wd_alg_common.h @@ -63,6 +63,30 @@ struct wd_ctx_config { void *priv; };
+/**
- struct wd_ctx_nums - Define the ctx sets numbers.
- @sync_ctx_num: The ctx numbers which are used for sync mode for each
- ctx sets.
- @async_ctx_num: The ctx numbers which are used for async mode for each
- ctx sets.
- */
+struct wd_ctx_nums {
- __u32 sync_ctx_num;
- __u32 async_ctx_num;
+};
+/**
- struct wd_ctx_params - Define the ctx sets params which are used for init
- algorithms.
- @ctx_set_size: Number of ctx sets to be created. Usually users can
- set it according to <alg>_op_type.
- @ctx_set_num: Each ctx sets numbers.
- */
+struct wd_ctx_params {
- __u32 ctx_set_size;
- struct wd_ctx_nums *ctx_set_num;
+};
- struct wd_ctx_internal { handle_t ctx; __u8 op_type;
diff --git a/include/wd_comp.h b/include/wd_comp.h index e043a83..d96110e 100644 --- a/include/wd_comp.h +++ b/include/wd_comp.h @@ -7,6 +7,7 @@ #ifndef __WD_COMP_H #define __WD_COMP_H
+#include <numa.h> #include "wd.h" #include "wd_alg_common.h"
@@ -113,6 +114,32 @@ int wd_comp_init(struct wd_ctx_config *config, struct wd_sched *sched); */ void wd_comp_uninit(void);
+/**
- wd_comp_init2_() - A simplified interface to initialize uadk
- compression/decompression. This interface keeps most functions of
- wd_comp_init(). Users just need to describe the deployment of
- business scenarios. Then the initialization will request appropriate
- resources to support the business scenarios.
- To simplify initialization, bmp and cparams may be NULL, in which
- case the function uses default values.
- @alg: The selected algorithm.
- @sched_type: The scheduler type.
- @bmp: Node mask of the required devices.
- @cparams: The ctx number settings.
- Return 0 if succeed and others if fail.
- */
+int wd_comp_init2_(char *alg, __u32 sched_type, struct bitmask *bmp, struct wd_ctx_params *cparams);
+#define wd_comp_init2(alg, sched_type) \
- wd_comp_init2_(alg, sched_type, NULL, NULL)
+/**
- wd_comp_uninit2() - Uninitialise ctx configuration and scheduler.
- */
+void wd_comp_uninit2(void);
- struct wd_comp_sess_setup { enum wd_comp_alg_type alg_type; /* Denoted by enum wd_comp_alg_type */ enum wd_comp_level comp_lv; /* Denoted by enum wd_comp_level */
diff --git a/include/wd_util.h b/include/wd_util.h index eafe3ce..8ae70f1 100644 --- a/include/wd_util.h +++ b/include/wd_util.h @@ -7,6 +7,7 @@ #ifndef __WD_UTIL_H #define __WD_UTIL_H
+#include <numa.h> #include <stdbool.h> #include <sys/ipc.h> #include <sys/shm.h> @@ -112,6 +113,15 @@ struct wd_msg_handle { int (*recv)(handle_t sess, void *msg); };
+struct wd_init_attrs {
- __u32 sched_type;
- char *alg;
- struct bitmask *bmp;
- struct wd_sched *sched;
- struct wd_ctx_params *cparams;
- struct wd_ctx_config *ctx_config;
+};
- /*
- wd_init_ctx_config() - Init internal ctx configuration.
- @in: ctx configuration in global setting.
@@ -394,6 +404,15 @@ static inline void wd_alg_clear_init(enum wd_status *status) __atomic_store(status, &setting, __ATOMIC_RELAXED); }
+/**
- wd_alg_pre_init() - Request the ctxs and initialize the sched_domain
with the given devices list, ctxs number and numa mask.
- @attrs: the algorithm initialization parameters.
- Return 0 on success and a negative error number on failure.
- */
+int wd_alg_pre_init(struct wd_init_attrs *attrs);
- /**
- wd_dfx_msg_cnt() - Message counter interface for ctx
- @msg: Shared memory addr.
diff --git a/wd.c b/wd.c index 78094d8..9eb69d2 100644 --- a/wd.c +++ b/wd.c @@ -727,6 +727,35 @@ free_list: return NULL; }
+struct uacce_dev *wd_find_dev_by_numa(struct uacce_dev_list *list, int numa_id) +{
- struct uacce_dev *dev = WD_ERR_PTR(-WD_ENODEV);
- struct uacce_dev_list *p = list;
- int ctx_num, ctx_max = 0;
- if (!list) {
WD_ERR("invalid: list is NULL!\n");
return WD_ERR_PTR(-WD_EINVAL);
- }
- while (p) {
if (numa_id != p->dev->numa_id) {
p = p->next;
continue;
}
ctx_num = wd_get_avail_ctx(p->dev);
if (ctx_num > ctx_max) {
dev = p->dev;
ctx_max = ctx_num;
}
p = p->next;
- }
- return dev;
+}
- void wd_free_list_accels(struct uacce_dev_list *list) { struct uacce_dev_list *curr, *next;
@@ -793,6 +822,39 @@ int wd_ctx_set_io_cmd(handle_t h_ctx, unsigned long cmd, void *arg) return ioctl(ctx->fd, cmd, arg); }
+struct bitmask *wd_create_device_nodemask(struct uacce_dev_list *list) +{
- struct uacce_dev_list *p;
- struct bitmask *bmp;
- if (!list) {
WD_ERR("invalid: list is NULL!\n");
return WD_ERR_PTR(-WD_EINVAL);
- }
- bmp = numa_allocate_nodemask();
- if (!bmp) {
WD_ERR("failed to alloc bitmask(%d)!\n", errno);
return WD_ERR_PTR(-WD_ENOMEM);
- }
- p = list;
- while (p) {
numa_bitmask_setbit(bmp, p->dev->numa_id);
p = p->next;
- }
- return bmp;
+}
+void wd_free_device_nodemask(struct bitmask *bmp) +{
- if (!bmp)
return;
- numa_free_nodemask(bmp);
+}
- void wd_get_version(void) { const char *wd_released_time = UADK_RELEASED_TIME;
diff --git a/wd_comp.c b/wd_comp.c index 44593a6..ea80d13 100644 --- a/wd_comp.c +++ b/wd_comp.c @@ -14,6 +14,7 @@
#include "config.h" #include "drv/wd_comp_drv.h" +#include "wd_sched.h" #include "wd_util.h" #include "wd_comp.h"
@@ -21,6 +22,8 @@ #define HW_CTX_SIZE (64 * 1024) #define STREAM_CHUNK (128 * 1024)
+#define SCHED_RR_NAME "sched_rr"
- #define swap_byte(x) \ ((((x) & 0x000000ff) << 24) | \ (((x) & 0x0000ff00) << 8) | \
@@ -42,6 +45,7 @@ struct wd_comp_sess {
struct wd_comp_setting { enum wd_status status;
- enum wd_status status2; struct wd_ctx_config_internal config; struct wd_sched sched; struct wd_comp_driver *driver;
@@ -52,6 +56,19 @@ struct wd_comp_setting {
struct wd_env_config wd_comp_env_config;
+static struct wd_init_attrs wd_comp_init_attrs; +static struct wd_ctx_config wd_comp_ctx; +static struct wd_sched *wd_comp_sched;
+static struct wd_ctx_nums wd_comp_ctx_num[] = {
- {1, 1}, {1, 1}, {}
+};
+static struct wd_ctx_params wd_comp_cparams = {
- .ctx_set_size = WD_DIR_MAX,
- .ctx_set_num = wd_comp_ctx_num
+};
- #ifdef WD_STATIC_DRV static void wd_comp_set_static_drv(void) {
@@ -178,6 +195,73 @@ void wd_comp_uninit(void) wd_alg_clear_init(&wd_comp_setting.status); }
+int wd_comp_init2_(char *alg, __u32 sched_type, struct bitmask *bmp, struct wd_ctx_params *cparams)
I recommend merging this bmp into cparams, since it is also a parameter that affects the number of ctxs.
OK, but then wd_ctx_params may need to be renamed. Maybe init_params?
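To make the suggestion above concrete, here is a minimal sketch of what a merged parameter block could look like. The names `init_params` and `init_params_ctx_per_node()` are hypothetical and do not appear in the patch; `uint32_t` stands in for `__u32`, and `struct bitmask` is left opaque so the sketch builds without libnuma.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Opaque stand-in for libnuma's struct bitmask; only a pointer is stored. */
struct bitmask;

/* Same shape as struct wd_ctx_nums in the patch (uint32_t instead of __u32). */
struct ctx_nums {
	uint32_t sync_ctx_num;
	uint32_t async_ctx_num;
};

/* Hypothetical merged parameter block; the name is an assumption. */
struct init_params {
	uint32_t ctx_set_size;        /* number of entries in ctx_set_num */
	struct ctx_nums *ctx_set_num; /* per-op-type ctx counts */
	struct bitmask *bmp;          /* NULL: derive the mask from the device list */
};

/* Count the ctxs requested per NUMA node; 0 for a NULL or empty block. */
static uint32_t init_params_ctx_per_node(const struct init_params *p)
{
	uint32_t i, total = 0;

	if (!p || !p->ctx_set_num)
		return 0;

	for (i = 0; i < p->ctx_set_size; i++)
		total += p->ctx_set_num[i].sync_ctx_num +
			 p->ctx_set_num[i].async_ctx_num;

	return total;
}
```

With one block, the init function would take a single optional pointer, and a NULL `bmp` inside a non-NULL block could still mean "use every node that has a device".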
+{
- bool flag;
- int ret;
- flag = wd_alg_try_init(&wd_comp_setting.status2);
- if (!flag)
return 0;
- if (!alg) {
WD_ERR("invalid: alg is NULL!\n");
ret = -WD_EINVAL;
goto out_uninit;
- }
- wd_comp_init_attrs.alg = alg;
- wd_comp_init_attrs.sched_type = sched_type;
- wd_comp_init_attrs.bmp = bmp;
- wd_comp_init_attrs.cparams = cparams ? cparams : &wd_comp_cparams;
I recommend not using this parameter structure: use cparams directly, and pass alg and sched_type to wd_util.c.
But different algorithms need their own default ctx_params, so we have to check whether the user passed NULL.
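The NULL check discussed here can be sketched as a small fallback helper. This is only an illustration under assumptions: `pick_params()` and the `other_default_*` tables are hypothetical, the compression defaults mirror the `{1, 1}, {1, 1}` table in the patch, and a set size of 2 is assumed for the two compression directions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct ctx_nums {
	uint32_t sync_ctx_num;
	uint32_t async_ctx_num;
};

struct ctx_params {
	uint32_t ctx_set_size;
	struct ctx_nums *ctx_set_num;
};

/* Defaults modelled on wd_comp_ctx_num: one sync and one async ctx per direction. */
static struct ctx_nums comp_default_nums[] = { {1, 1}, {1, 1} };
static struct ctx_params comp_default_params = { 2, comp_default_nums };

/* Hypothetical second algorithm with a single op type and different counts. */
static struct ctx_nums other_default_nums[] = { {2, 2} };
static struct ctx_params other_default_params = { 1, other_default_nums };

/* Use the caller's settings when given, else the algorithm's own defaults. */
static const struct ctx_params *pick_params(const struct ctx_params *user,
					    const struct ctx_params *alg_default)
{
	return user ? user : alg_default;
}
```

Because the defaults differ per algorithm, the fallback has to be resolved in the algorithm layer (or the default has to be passed down) before the common code in wd_util.c runs.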
- wd_comp_init_attrs.ctx_config = &wd_comp_ctx;
- wd_comp_sched = wd_sched_rr_alloc(sched_type, wd_comp_init_attrs.cparams->ctx_set_size,
numa_max_node() + 1, wd_comp_poll_ctx);
- if (!wd_comp_sched) {
ret = -WD_EINVAL;
goto out_uninit;
- }
- wd_comp_sched->name = SCHED_RR_NAME;
- wd_comp_init_attrs.sched = wd_comp_sched;
- ret = wd_alg_pre_init(&wd_comp_init_attrs);
- if (ret)
goto out_freesched;
- ret = wd_comp_init(&wd_comp_ctx, wd_comp_sched);
- if (ret)
goto out_freesched;
- wd_alg_set_init(&wd_comp_setting.status2);
- return 0;
+out_freesched:
- wd_sched_rr_release(wd_comp_sched);
+out_uninit:
- wd_alg_clear_init(&wd_comp_setting.status2);
- return ret;
+}
+void wd_comp_uninit2(void) +{
- int i;
- wd_comp_uninit();
- for (i = 0; i < wd_comp_ctx.ctx_num; i++)
if (wd_comp_ctx.ctxs[i].ctx) {
wd_release_ctx(wd_comp_ctx.ctxs[i].ctx);
wd_comp_ctx.ctxs[i].ctx = 0;
- }
- wd_sched_rr_release(wd_comp_sched);
- wd_alg_clear_init(&wd_comp_setting.status2);
+}
- struct wd_comp_msg *wd_comp_get_msg(__u32 idx, __u32 tag) { return wd_find_msg_in_pool(&wd_comp_setting.pool, idx, tag);
@@ -289,6 +373,7 @@ handle_t wd_comp_alloc_sess(struct wd_comp_sess_setup *setup) sess->comp_lv = setup->comp_lv; sess->win_sz = setup->win_sz; sess->stream_pos = WD_COMP_STREAM_NEW;
- /* Some simple schedulers don't need scheduling parameters */ sess->sched_key = (void *)wd_comp_setting.sched.sched_init( wd_comp_setting.sched.h_sched_ctx, setup->sched_param);
@@ -318,6 +403,7 @@ void wd_comp_free_sess(handle_t h_sess)
if (sess->sched_key) free(sess->sched_key);
- free(sess); }
diff --git a/wd_util.c b/wd_util.c index fa77b46..d618776 100644 --- a/wd_util.c +++ b/wd_util.c @@ -5,7 +5,6 @@ */
#define _GNU_SOURCE -#include <numa.h> #include <pthread.h> #include <semaphore.h> #include <string.h> @@ -1801,3 +1800,238 @@ bool wd_alg_try_init(enum wd_status *status)
return true; }
+static __u32 wd_get_ctx_numbers(struct wd_ctx_params cparams, int end) +{
- __u32 count = 0;
- int i;
- for (i = 0; i < end; i++) {
count += cparams.ctx_set_num[i].sync_ctx_num;
count += cparams.ctx_set_num[i].async_ctx_num;
- }
- return count;
+}
+struct uacce_dev_list *wd_get_usable_list(struct uacce_dev_list *list, struct bitmask *bmp) +{
- struct uacce_dev_list *p, *node, *result = NULL;
- struct uacce_dev *dev;
- int numa_id, ret;
- if (!bmp) {
WD_ERR("invalid: bmp is NULL!\n");
return WD_ERR_PTR(-WD_EINVAL);
- }
- p = list;
- while (p) {
dev = p->dev;
numa_id = dev->numa_id;
ret = numa_bitmask_isbitset(bmp, numa_id);
if (!ret) {
p = p->next;
continue;
}
node = calloc(1, sizeof(*node));
if (!node) {
result = WD_ERR_PTR(-WD_ENOMEM);
goto out_free_list;
}
node->dev = wd_clone_dev(dev);
if (!node->dev) {
result = WD_ERR_PTR(-WD_ENOMEM);
goto out_free_node;
}
if (!result)
result = node;
else
wd_add_dev_to_list(result, node);
p = p->next;
- }
- return result;
+out_free_node:
- free(node);
+out_free_list:
- wd_free_list_accels(result);
- return result;
+}
+static int wd_init_ctx_set(struct wd_init_attrs *attrs, struct uacce_dev_list *list,
int idx, int numa_id, int op_type)
+{
- struct wd_ctx_nums ctx_nums = attrs->cparams->ctx_set_num[op_type];
- __u32 ctx_set_num = ctx_nums.sync_ctx_num + ctx_nums.async_ctx_num;
- struct wd_ctx_config *ctx_config = attrs->ctx_config;
- struct uacce_dev *dev;
- int i;
- dev = wd_find_dev_by_numa(list, numa_id);
- if (WD_IS_ERR(dev))
return WD_PTR_ERR(dev);
- for (i = idx; i < idx + ctx_set_num; i++) {
ctx_config->ctxs[i].ctx = wd_request_ctx(dev);
if (errno == WD_EBUSY) {
dev = wd_find_dev_by_numa(list, numa_id);
if (WD_IS_ERR(dev))
return WD_PTR_ERR(dev);
i--;
}
ctx_config->ctxs[i].op_type = op_type;
ctx_config->ctxs[i].ctx_mode =
((i - idx) < ctx_nums.sync_ctx_num) ?
CTX_MODE_SYNC : CTX_MODE_ASYNC;
- }
- return 0;
+}
+static void wd_release_ctx_set(struct wd_ctx_config *ctx_config) +{
- int i;
- for (i = 0; i < ctx_config->ctx_num; i++)
if (ctx_config->ctxs[i].ctx) {
wd_release_ctx(ctx_config->ctxs[i].ctx);
ctx_config->ctxs[i].ctx = 0;
}
+}
+static int wd_instance_sched_set(struct wd_sched *sched, struct wd_ctx_nums ctx_nums,
int idx, int numa_id, int op_type)
+{
- struct sched_params sparams;
- int i, ret = 0;
- for (i = 0; i < CTX_MODE_MAX; i++) {
sparams.numa_id = numa_id;
sparams.type = op_type;
sparams.mode = i;
sparams.begin = idx + ctx_nums.sync_ctx_num * i;
sparams.end = idx - 1 + ctx_nums.sync_ctx_num + ctx_nums.async_ctx_num * i;
if (sparams.begin > sparams.end)
continue;
ret = wd_sched_rr_instance(sched, &sparams);
if (ret)
goto out;
- }
+out:
- return ret;
+}
+static int wd_init_ctx_and_sched(struct wd_init_attrs *attrs, struct bitmask *bmp,
struct uacce_dev_list *list)
+{
- struct wd_ctx_params *cparams = attrs->cparams;
- __u32 ctx_set_size = cparams->ctx_set_size;
- int max_node = numa_max_node() + 1;
- struct wd_ctx_nums ctx_nums;
- int i, j, ret;
- int idx = 0;
- for (i = 0; i < max_node; i++) {
if (!numa_bitmask_isbitset(bmp, i))
continue;
for (j = 0; j < ctx_set_size; j++) {
ctx_nums = cparams->ctx_set_num[j];
ret = wd_init_ctx_set(attrs, list, idx, i, j);
if (ret)
goto free_ctxs;
ret = wd_instance_sched_set(attrs->sched, ctx_nums, idx, i, j);
if (ret)
goto free_ctxs;
idx += (ctx_nums.sync_ctx_num + ctx_nums.async_ctx_num);
}
- }
- return 0;
+free_ctxs:
- wd_release_ctx_set(attrs->ctx_config);
- return ret;
+}
+int wd_alg_pre_init(struct wd_init_attrs *attrs) +{
- struct wd_ctx_config *ctx_config = attrs->ctx_config;
- struct wd_ctx_params *cparams = attrs->cparams;
- struct uacce_dev_list *list, *used_list = NULL;
- struct bitmask *used_bmp, *bmp = attrs->bmp;
- __u32 ctx_set_num, ctx_set_size;
- int numa_cnt, ret;
- list = wd_get_accel_list(attrs->alg);
- if (!list) {
WD_ERR("failed to get devices!\n");
return -WD_ENODEV;
- }
- ctx_set_size = cparams->ctx_set_size;
- ctx_set_num = wd_get_ctx_numbers(*cparams, ctx_set_size);
- if (!ctx_set_num || !ctx_set_size) {
WD_ERR("invalid: ctx_set_num is %d, ctx_set_size is %d!\n",
ctx_set_num, ctx_set_size);
ret = -WD_EINVAL;
goto out_freelist;
- }
- /*
* Not every numa has a device. Therefore, the first thing is to
* filter the devices in the selected numa node, and the second
* thing is to obtain the distribution of devices.
*/
- if (bmp) {
used_list = wd_get_usable_list(list, bmp);
if (WD_IS_ERR(used_list)) {
ret = WD_PTR_ERR(used_list);
WD_ERR("failed to get usable devices(%d)!\n", ret);
goto out_freelist;
}
- }
- used_bmp = wd_create_device_nodemask(used_list ? used_list : list);
- if (WD_IS_ERR(used_bmp)) {
ret = WD_PTR_ERR(used_bmp);
goto out_freeusedlist;
- }
- numa_cnt = numa_bitmask_weight(used_bmp);
- if (!numa_cnt) {
ret = -WD_EINVAL;
WD_ERR("invalid: bmp is clear!\n");
goto out_freenodemask;
- }
- ctx_config->ctx_num = ctx_set_num * numa_cnt;
- ctx_config->ctxs = calloc(ctx_config->ctx_num, sizeof(struct wd_ctx));
- if (!ctx_config->ctxs) {
ret = -WD_ENOMEM;
WD_ERR("failed to alloc ctxs!\n");
goto out_freenodemask;
- }
- ret = wd_init_ctx_and_sched(attrs, used_bmp, used_list ? used_list : list);
- if (ret)
free(ctx_config->ctxs);
+out_freenodemask:
- wd_free_device_nodemask(used_bmp);
+out_freeusedlist:
- wd_free_list_accels(used_list);
+out_freelist:
- wd_free_list_accels(list);
- return ret;
+}
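The index arithmetic in wd_instance_sched_set() is easier to check in isolation. The sketch below restates it with signed integers; `ctx_mode_range()`, `struct ctx_range` and the `MODE_*` constants are illustrative names rather than uadk API, and it assumes sync mode is 0 and async mode is 1, as the loop over CTX_MODE_MAX suggests.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed mode numbering: sync first, async second. */
enum { MODE_SYNC = 0, MODE_ASYNC = 1, MODE_MAX = 2 };

struct ctx_nums {
	uint32_t sync_ctx_num;
	uint32_t async_ctx_num;
};

/* Inclusive ctx index range for one (numa node, op type, mode) region. */
struct ctx_range {
	int begin;
	int end;
	int valid; /* 0 when the region is empty (begin > end) */
};

/*
 * Sync ctxs occupy [idx, idx + sync - 1]; async ctxs follow them in
 * [idx + sync, idx + sync + async - 1]. This matches the begin/end
 * expressions in the patch for mode 0 and mode 1.
 */
static struct ctx_range ctx_mode_range(int idx, struct ctx_nums n, int mode)
{
	struct ctx_range r;

	r.begin = idx + (int)n.sync_ctx_num * mode;
	r.end = idx - 1 + (int)n.sync_ctx_num + (int)n.async_ctx_num * mode;
	r.valid = r.begin <= r.end;

	return r;
}
```

For example, with idx 10, two sync ctxs and three async ctxs, the sync region is 10..11 and the async region is 12..14. The sketch uses `int` on purpose: with zero sync ctxs the sync range ends at idx - 1, which only compares correctly against begin when the values are signed.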
Due to the complexity of wd_alg_init, add the wd_alg_init2 interface for users, and add the design document.

Signed-off-by: Yang Shen <shenyang39@huawei.com>
---
 docs/wd_alg_init2.md | 154 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 154 insertions(+)
 create mode 100644 docs/wd_alg_init2.md
diff --git a/docs/wd_alg_init2.md b/docs/wd_alg_init2.md
new file mode 100644
index 0000000..c4fe530
--- /dev/null
+++ b/docs/wd_alg_init2.md
@@ -0,0 +1,154 @@
+# wd_alg_init2
+
+## Preface
+
+The current uadk initialization process is:
+1. Call wd_request_ctx() to request ctxs from devices.
+2. Call wd_sched_rr_alloc() to create a sched (or some other scheduler alloc function if one exists).
+3. Initialize the sched.
+4. Call wd_alg_init() with ctx_config and sched.
+
+```flow
+st=>start: Start
+o1=>operation: request ctxs
+o2=>operation: create uadk_sched and instance ctxs to sched region
+o3=>operation: call wd_alg_init
+e=>end
+st->o1->o2->o3->e
+```
+
+The logic is reasonable. But in practice, the steps wd_request_ctx()
+and wd_sched_rr_alloc() are very tedious. This makes it difficult
+for users to use the interface. One of the main reasons for this is
+that uadk has made a lot of configurations in the scheduler in order
+to provide users with better performance. Based on this consideration,
+the current uadk requires the user to arrange the division of hardware
+resources according to the device topology during initialization.
+Therefore, as a high-level interface, this scheme can provide customized
+scheme configuration for users with deep needs.
+
+## wd_alg_init2
+
+### Design
+
+Is there any way to simplify these steps? Not currently. Because the
+architecture model designed by uadk is to manage hardware resources
+through a scheduler, users no longer see the hardware resources after
+specifying them, and all subsequent tasks are handled by the scheduler.
+The original intention of this design is to make the scenarios supported
+by uadk more flexible. Because resource requirements and task models
+differ from one business scenario to another, the best performance
+can be obtained by letting the scheduler match resources to the
+tasks.
+
+But we can try to provide a layer of encapsulation. The original design
+intention of this layer of encapsulation is that users only need to
+specify available resources and requirements, and the configuration of
+resources is completed internally by the interface. Because the previous
+interface complexity mainly lies in the parameter configuration of CTX
+and scheduler, it is easy for users to make configuration errors and
+generate bugs because of their misunderstanding of parameters.
+
+All algorithms have the same input parameters and initialization logic.
+
+```c
+struct wd_ctx_config {
+	__u32 ctx_num;
+	struct wd_ctx *ctxs;
+	void *priv;
+};
+
+struct wd_sched {
+	const char *name;
+	int sched_policy;
+	handle_t (*sched_init)(handle_t h_sched_ctx, void *sched_param);
+	__u32 (*pick_next_ctx)(handle_t h_sched_ctx, void *sched_key,
+			       const int sched_mode);
+	int (*poll_policy)(handle_t h_sched_ctx, __u32 expect, __u32 *count);
+	handle_t h_sched_ctx;
+};
+
+int wd_alg_init(struct wd_ctx_config *config, struct wd_sched *sched);
+```
+
+`wd_ctx_config` is the requested ctxs descriptor, and the attributes
+of ctxs are contained in their own structure. The attributes will be
+used in the scheduler for picking a ctx according to request type. The main
+difficulty in this step is that users need to apply for CTXs from the
+appropriate device nodes according to their own business distribution.
+If the user does not consider the appropriate device distribution,
+it may lead to cross-chip or cross-numa-node access, which will affect
+performance.
+
+`wd_sched` is the scheduler descriptor of the request. It will create
+the scheduling domain based on parameters passed by the user. The user needs
+to allocate the ctxs applied to the scheduling domain that meets the
+attribute, so that uadk can select the appropriate ctxs according to
+the issued business. The main difficulty in this step is that the user
+needs to initialize the correct scheduling domain according to the ctxs
+attributes previously applied. However, there are many attributes of
+ctxs here, which should be divided by multiple dimensions. If the
+parameters are not understood enough, it is easy to make queue
+allocation errors, resulting in the scheduling of the wrong ctxs when
+the task is finally issued, and cause unexpected errors.
+
+Therefore, the next thing to be done is to use limited and easy-to-use
+input parameters to describe users' requirements on the two input
+parameters, ensuring that the functions of the new interface init2
+are the same as those of init. For ease of description, v1 is used
+to refer to the existing interface, and v2 is used to refer to the
+layer of encapsulation.
+
+Let's clarify the following logic first: all uacce devices under a
+numa node can be regarded as the same. So although we request
+ctxs from the device, we manage ctxs according to numa nodes.
+That means if users want to get the same performance for all cpus,
+the uadk configuration should be the same for all numa nodes.
+
+At present, at least 4 parameters are required to meet the user
+configuration requirements while the V1 interface functionality remains
+unchanged.
+
+@alg: The algorithm the user wants to use.
+
+@numa_bitmask: The bitmask provided by libnuma. Users can use this
+parameter to control requesting ctxs devices in the bind NUMA scenario.
+This parameter is mainly convenient for users to use in the binding
+cpu scenario. It can avoid resource waste or initialization failure
+caused by insufficient resources. Libnuma provides a complete operation
+interface which can be found in numa.h.
+
+@ctx_nums: The requested ctx number for each numa node. Because users
+may need different ctx numbers for different ctx types,
+a two-dimensional array is needed as input.
+
+@sched_type: Scheduling type the user wants to use.
+
+To sum up, wd_alg_init2_() is as follows:
+
+```c
+struct wd_ctx_nums {
+	__u32 sync_ctx_num;
+	__u32 async_ctx_num;
+};
+
+struct wd_ctx_params {
+	__u32 ctx_set_size;
+	struct wd_ctx_nums *ctx_set_num;
+};
+
+int wd_alg_init2_(char *alg, __u32 sched_type, struct bitmask *bmp,
+		  struct wd_ctx_params *cparams);
+```
+
+Somebody may say that wd_alg_init2_() is still complex, since some
+input parameters are structures. So the interface supports default values
+for some parameters. The @bmp can be set as NULL, and then it will be
+initialized according to device list. The @cparams can be set as NULL,
+and it has a default value in wd_alg.c. So there is a simpler interface
+wd_alg_init2().
+
+```c
+#define wd_alg_init2(alg, sched_type) \
+	wd_alg_init2_(alg, sched_type, NULL, NULL)
+```
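As a rough illustration of the NULL @bmp default described in the document, the following sketch derives a node mask from a device list and sizes the ctx array from it. It is not the uadk implementation: `struct dev_node`, `device_nodemask()`, `mask_weight()` and `total_ctx_num()` are hypothetical, and a plain 64-bit word replaces libnuma's `struct bitmask` so the example is self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a device list entry; only the numa id matters here. */
struct dev_node {
	int numa_id;
	struct dev_node *next;
};

/* Set one bit per NUMA node that has at least one device (nodes 0..63 only). */
static uint64_t device_nodemask(const struct dev_node *list)
{
	uint64_t mask = 0;

	for (; list; list = list->next)
		if (list->numa_id >= 0 && list->numa_id < 64)
			mask |= UINT64_C(1) << list->numa_id;

	return mask;
}

/* Number of set bits, i.e. number of NUMA nodes that own a device. */
static unsigned int mask_weight(uint64_t mask)
{
	unsigned int n = 0;

	for (; mask; mask &= mask - 1)
		n++;

	return n;
}

/* Every selected node gets the same ctx layout, so the total is a product. */
static uint32_t total_ctx_num(uint64_t mask, uint32_t ctx_per_node)
{
	return ctx_per_node * mask_weight(mask);
}
```

Two devices on node 2 and one on node 0 give a mask with two bits set, so a request for four ctxs per node needs eight ctx slots in total. This reflects the rule stated above that ctxs are managed per numa node rather than per device.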