The current uadk initialization process is:
1. Call wd_request_ctx() to request ctxs from devices.
2. Call wd_sched_rr_alloc() to create a sched (or another scheduler alloc function, if one exists).
3. Initialize the sched.
4. Call wd_<alg>_init() with ctx_config and sched.
The logic is reasonable, but in practice the `wd_request_ctx()` and `wd_sched_rr_alloc()` steps are very tedious, which makes the interface difficult to use. One of the main reasons is that uadk exposes a lot of configuration in the scheduler in order to give users better performance. Based on this consideration, the current uadk requires the user to arrange the division of hardware resources according to the device topology during initialization. As an advanced interface, this scheme can therefore still provide customized configuration for users with deep needs.
All algorithm initialization interfaces have the same input parameters and behavioral logic. The pre-processing for wd_<alg>_init() is actually the configuration of `struct wd_ctx_config` and `struct wd_sched`. The next step is therefore to describe the user's requirements for these two structures with a small set of easy-to-use input parameters, while ensuring that the new interface init2 provides the same functions as init. For ease of description, v1 refers to the existing interface and v2 refers to the new layer of encapsulation.
At present, at least 4 parameters are required to meet the user configuration requirements while the v1 interface function remains unchanged:
@device_list: The available uacce device list. Users can get it with wd_get_accel_list().
@numa_bitmask: The bitmask provided by libnuma. Users can use this parameter to control which devices ctxs are requested from in the NUMA-binding scenario.
@ctx_nums: The requested ctx number for each NUMA node. Because users may have different requirements for different types of ctx, a two-dimensional array is needed as input.
@sched_type: The scheduling type the user wants to use.
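To make the interaction between @numa_bitmask and @ctx_nums concrete, here is a minimal sketch. It is not uadk code: the names demo_total_ctxs, DEMO_NUMA_NODES and DEMO_CTX_TYPES are hypothetical, and a plain unsigned long stands in for the libnuma bitmask. It only shows how a two-dimensional request table could be reduced to the total number of ctxs for the selected nodes.

```c
#include <stddef.h>

/* Hypothetical sizes and names, chosen only for this illustration. */
#define DEMO_NUMA_NODES 4
#define DEMO_CTX_TYPES  2	/* e.g. 0 = sync ctxs, 1 = async ctxs */

/*
 * Sum the ctxs requested on every NUMA node whose bit is set in node_mask.
 * ctx_nums[node][type] is the two-dimensional request table; a plain
 * unsigned long is used here instead of libnuma's struct bitmask.
 */
static unsigned int demo_total_ctxs(
	unsigned int ctx_nums[DEMO_NUMA_NODES][DEMO_CTX_TYPES],
	unsigned long node_mask)
{
	unsigned int total = 0;

	for (size_t node = 0; node < DEMO_NUMA_NODES; node++) {
		if (!(node_mask & (1UL << node)))
			continue;	/* node not selected by the mask */
		for (size_t type = 0; type < DEMO_CTX_TYPES; type++)
			total += ctx_nums[node][type];
	}

	return total;
}
```

With such a table, a mask of 0x5 selects only nodes 0 and 2, so the rows for the other nodes are ignored.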
What's more, some users want uadk to provide default values for the input parameters in performance-insensitive scenarios. C code has no way to.
Changelog:
v2->v3: - update the wd_comp_init2 parameters.
v1->v2: - Update the description about wd_<alg>_init in wd_design.md.
Yang Shen (6):
  uadk - support algorithms initialization reentry protect
  uadk/doc - update wd_alg_init support reentrancy
  uadk - support return error number as pointer
  uadk - mv some function to header file
  uadk/comp - add wd_comp_init2
  uadk/docs - support a simple interface for initialization
 Makefile.am | 4 +-
 docs/wd_alg_init2.md | 176 ++++++++++++++++++++++++++++
 docs/wd_design.md | 5 +-
 include/wd.h | 54 ++++++++-
 include/wd_alg_common.h | 24 ++++
 include/wd_comp.h | 27 +++++
 include/wd_util.h | 57 +++++++++
 wd.c | 97 +++++++++++++---
 wd_aead.c | 33 ++++--
 wd_cipher.c | 35 ++++--
 wd_comp.c | 248 ++++++++++++++++++++++++++++++++++++++--
 wd_dh.c | 34 ++++--
 wd_digest.c | 33 ++++--
 wd_ecc.c | 33 ++++--
 wd_rsa.c | 33 ++++--
 wd_util.c | 83 +++++++++++++-
 16 files changed, 870 insertions(+), 106 deletions(-)
 create mode 100644 docs/wd_alg_init2.md
-- 2.24.0
'wd_<alg>_init()' is designed to be non-reentrant, so add a status to protect against repeated or concurrent calls.
When 'wd_<alg>_init()' is called, it reads the status first. If the status is WD_UNINIT, it sets the status to WD_INITING, then changes it to WD_INIT if initialization succeeds or restores it to WD_UNINIT if something goes wrong. If the status is WD_INIT, it returns directly. If the status is WD_INITING, another thread is initializing, so the caller needs to wait for the result.
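The three-state protocol described above can be sketched in isolation as follows. This is an illustration only, not the code in the patch: the demo_* names are hypothetical, C11 <stdatomic.h> is used instead of the __atomic builtins, acquire/release ordering is assumed rather than relaxed, and the waiter yields the CPU instead of sleeping.

```c
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical mirror of the three initialization states. */
enum demo_status { DEMO_UNINIT, DEMO_INITING, DEMO_INIT };

/*
 * Return true if the caller moved the status from DEMO_UNINIT to
 * DEMO_INITING and must now run the initialization, or false if the
 * initialization has already completed. While another thread holds
 * DEMO_INITING, yield and retry until it finishes or rolls back.
 */
static bool demo_try_init(atomic_int *status)
{
	for (;;) {
		int expected = DEMO_UNINIT;

		if (atomic_compare_exchange_strong_explicit(status, &expected,
							    DEMO_INITING,
							    memory_order_acquire,
							    memory_order_acquire))
			return true;
		if (expected == DEMO_INIT)
			return false;
		sched_yield();	/* another thread is initializing */
	}
}

/* Publish a successful initialization. */
static void demo_set_init(atomic_int *status)
{
	atomic_store_explicit(status, DEMO_INIT, memory_order_release);
}

/* Roll back after a failed initialization, or after uninit. */
static void demo_clear_init(atomic_int *status)
{
	atomic_store_explicit(status, DEMO_UNINIT, memory_order_release);
}
```

In this sketch the release store in demo_set_init() pairs with the acquire load in demo_try_init(), so a thread that sees DEMO_INIT also sees the data written during initialization. Whether the real code needs the same ordering depends on how the settings are accessed afterwards.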
Signed-off-by: Yang Shen shenyang39@huawei.com --- include/wd_util.h | 38 ++++++++++++++++++++++++++++++++++++++ wd_aead.c | 33 ++++++++++++++++++++++----------- wd_cipher.c | 35 +++++++++++++++++++++++------------ wd_comp.c | 35 ++++++++++++++++++++++++----------- wd_dh.c | 34 ++++++++++++++++++++++------------ wd_digest.c | 33 ++++++++++++++++++++++----------- wd_ecc.c | 33 ++++++++++++++++++++++----------- wd_rsa.c | 33 ++++++++++++++++++++++----------- wd_util.c | 24 ++++++++++++++++++++++++ 9 files changed, 219 insertions(+), 79 deletions(-)
diff --git a/include/wd_util.h b/include/wd_util.h index 83ac5f8..eafe3ce 100644 --- a/include/wd_util.h +++ b/include/wd_util.h @@ -21,6 +21,12 @@ extern "C" { for ((i) = 0, (config_numa) = (config)->config_per_numa; \ (i) < (config)->numa_num; (config_numa)++, (i)++)
+enum wd_status { + WD_UNINIT, + WD_INITING, + WD_INIT, +}; + struct wd_async_msg_pool { struct msg_pool *pools; __u32 pool_num; @@ -356,6 +362,38 @@ int wd_handle_msg_sync(struct wd_msg_handle *msg_handle, handle_t ctx, */ int wd_init_param_check(struct wd_ctx_config *config, struct wd_sched *sched);
+/** + * wd_alg_try_init() - Check the algorithm status and set it as WD_INITING + * if need initialization. + * @status: algorithm initialization status. + * + * Return true if need initialization and false if initialized, otherwise will wait + * last initialization result. + */ +bool wd_alg_try_init(enum wd_status *status); + +/** + * wd_alg_set_init() - Set the algorithm status as WD_INIT. + * @status: algorithm initialization status. + */ +static inline void wd_alg_set_init(enum wd_status *status) +{ + enum wd_status setting = WD_INIT; + + __atomic_store(status, &setting, __ATOMIC_RELAXED); +} + +/** + * wd_alg_clear_init() - Set the algorithm status as WD_UNINIT. + * @status: algorithm initialization status. + */ +static inline void wd_alg_clear_init(enum wd_status *status) +{ + enum wd_status setting = WD_UNINIT; + + __atomic_store(status, &setting, __ATOMIC_RELAXED); +} + /** * wd_dfx_msg_cnt() - Message counter interface for ctx * @msg: Shared memory addr. diff --git a/wd_aead.c b/wd_aead.c index d6c2380..2307b20 100644 --- a/wd_aead.c +++ b/wd_aead.c @@ -31,6 +31,7 @@ static int g_aead_mac_len[WD_DIGEST_TYPE_MAX] = { };
struct wd_aead_setting { + enum wd_status status; struct wd_ctx_config_internal config; struct wd_sched sched; struct wd_aead_driver *driver; @@ -392,24 +393,29 @@ static int wd_aead_param_check(struct wd_aead_sess *sess, int wd_aead_init(struct wd_ctx_config *config, struct wd_sched *sched) { void *priv; + bool flag; int ret;
+ flag = wd_alg_try_init(&wd_aead_setting.status); + if (!flag) + return 0; + ret = wd_init_param_check(config, sched); if (ret) - return ret; + goto out_clear_init;
ret = wd_set_epoll_en("WD_AEAD_EPOLL_EN", &wd_aead_setting.config.epoll_en); if (ret < 0) - return ret; + goto out_clear_init;
ret = wd_init_ctx_config(&wd_aead_setting.config, config); if (ret) - return ret; + goto out_clear_init;
ret = wd_init_sched(&wd_aead_setting.sched, sched); if (ret < 0) - goto out; + goto out_clear_ctx_config;
/* set driver */ #ifdef WD_STATIC_DRV @@ -421,33 +427,37 @@ int wd_aead_init(struct wd_ctx_config *config, struct wd_sched *sched) config->ctx_num, WD_POOL_MAX_ENTRIES, sizeof(struct wd_aead_msg)); if (ret < 0) - goto out_sched; + goto out_clear_sched;
/* init ctx related resources in specific driver */ priv = calloc(1, wd_aead_setting.driver->drv_ctx_size); if (!priv) { ret = -WD_ENOMEM; - goto out_priv; + goto out_clear_pool; } wd_aead_setting.priv = priv;
ret = wd_aead_setting.driver->init(&wd_aead_setting.config, priv); if (ret < 0) { WD_ERR("failed to init aead dirver!\n"); - goto out_init; + goto out_free_priv; }
+ wd_alg_set_init(&wd_aead_setting.status); + return 0;
-out_init: +out_free_priv: free(priv); wd_aead_setting.priv = NULL; -out_priv: +out_clear_pool: wd_uninit_async_request_pool(&wd_aead_setting.pool); -out_sched: +out_clear_sched: wd_clear_sched(&wd_aead_setting.sched); -out: +out_clear_ctx_config: wd_clear_ctx_config(&wd_aead_setting.config); +out_clear_init: + wd_alg_clear_init(&wd_aead_setting.status); return ret; }
@@ -465,6 +475,7 @@ void wd_aead_uninit(void) wd_uninit_async_request_pool(&wd_aead_setting.pool); wd_clear_sched(&wd_aead_setting.sched); wd_clear_ctx_config(&wd_aead_setting.config); + wd_alg_clear_init(&wd_aead_setting.status); }
static void fill_request_msg(struct wd_aead_msg *msg, struct wd_aead_req *req, diff --git a/wd_cipher.c b/wd_cipher.c index 8ce975a..a85629d 100644 --- a/wd_cipher.c +++ b/wd_cipher.c @@ -45,6 +45,7 @@ static const unsigned char des_weak_keys[DES_WEAK_KEY_NUM][DES_KEY_SIZE] = { };
struct wd_cipher_setting { + enum wd_status status; struct wd_ctx_config_internal config; struct wd_sched sched; void *sched_ctx; @@ -231,24 +232,29 @@ void wd_cipher_free_sess(handle_t h_sess) int wd_cipher_init(struct wd_ctx_config *config, struct wd_sched *sched) { void *priv; + bool flag; int ret;
+ flag = wd_alg_try_init(&wd_cipher_setting.status); + if (!flag) + return 0; + ret = wd_init_param_check(config, sched); if (ret) - return ret; + goto out_clear_init;
ret = wd_set_epoll_en("WD_CIPHER_EPOLL_EN", &wd_cipher_setting.config.epoll_en); if (ret < 0) - return ret; + goto out_clear_init;
ret = wd_init_ctx_config(&wd_cipher_setting.config, config); if (ret < 0) - return ret; + goto out_clear_init;
ret = wd_init_sched(&wd_cipher_setting.sched, sched); if (ret < 0) - goto out; + goto out_clear_ctx_config;
#ifdef WD_STATIC_DRV /* set driver */ @@ -260,33 +266,37 @@ int wd_cipher_init(struct wd_ctx_config *config, struct wd_sched *sched) config->ctx_num, WD_POOL_MAX_ENTRIES, sizeof(struct wd_cipher_msg)); if (ret < 0) - goto out_sched; + goto out_clear_sched;
/* init ctx related resources in specific driver */ priv = calloc(1, wd_cipher_setting.driver->drv_ctx_size); if (!priv) { ret = -WD_ENOMEM; - goto out_priv; + goto out_clear_pool; } wd_cipher_setting.priv = priv;
ret = wd_cipher_setting.driver->init(&wd_cipher_setting.config, priv); if (ret < 0) { - WD_ERR("hisi sec init failed.\n"); - goto out_init; + WD_ERR("failed to do dirver init, ret = %d.\n", ret); + goto out_free_priv; }
+ wd_alg_set_init(&wd_cipher_setting.status); + return 0;
-out_init: +out_free_priv: free(priv); wd_cipher_setting.priv = NULL; -out_priv: +out_clear_pool: wd_uninit_async_request_pool(&wd_cipher_setting.pool); -out_sched: +out_clear_sched: wd_clear_sched(&wd_cipher_setting.sched); -out: +out_clear_ctx_config: wd_clear_ctx_config(&wd_cipher_setting.config); +out_clear_init: + wd_alg_clear_init(&wd_cipher_setting.status); return ret; }
@@ -304,6 +314,7 @@ void wd_cipher_uninit(void) wd_uninit_async_request_pool(&wd_cipher_setting.pool); wd_clear_sched(&wd_cipher_setting.sched); wd_clear_ctx_config(&wd_cipher_setting.config); + wd_alg_clear_init(&wd_cipher_setting.status); }
static void fill_request_msg(struct wd_cipher_msg *msg, diff --git a/wd_comp.c b/wd_comp.c index eacebd3..44593a6 100644 --- a/wd_comp.c +++ b/wd_comp.c @@ -41,6 +41,7 @@ struct wd_comp_sess { };
struct wd_comp_setting { + enum wd_status status; struct wd_ctx_config_internal config; struct wd_sched sched; struct wd_comp_driver *driver; @@ -81,24 +82,29 @@ void wd_comp_set_driver(struct wd_comp_driver *drv) int wd_comp_init(struct wd_ctx_config *config, struct wd_sched *sched) { void *priv; + bool flag; int ret;
+ flag = wd_alg_try_init(&wd_comp_setting.status); + if (!flag) + return 0; + ret = wd_init_param_check(config, sched); if (ret) - return ret; + goto out_clear_init;
ret = wd_set_epoll_en("WD_COMP_EPOLL_EN", &wd_comp_setting.config.epoll_en); if (ret < 0) - return ret; + goto out_clear_init;
ret = wd_init_ctx_config(&wd_comp_setting.config, config); if (ret < 0) - return ret; + goto out_clear_init;
ret = wd_init_sched(&wd_comp_setting.sched, sched); if (ret < 0) - goto out; + goto out_clear_ctx_config; /* * Fix me: ctx could be passed into wd_comp_set_static_drv to help to * choose static compiled vendor driver. For dynamic vendor driver, @@ -118,31 +124,36 @@ int wd_comp_init(struct wd_ctx_config *config, struct wd_sched *sched) config->ctx_num, WD_POOL_MAX_ENTRIES, sizeof(struct wd_comp_msg)); if (ret < 0) - goto out_sched; + goto out_clear_sched;
/* init ctx related resources in specific driver */ priv = calloc(1, wd_comp_setting.driver->drv_ctx_size); if (!priv) { ret = -WD_ENOMEM; - goto out_priv; + goto out_clear_pool; } wd_comp_setting.priv = priv; ret = wd_comp_setting.driver->init(&wd_comp_setting.config, priv); if (ret < 0) { WD_ERR("failed to do driver init, ret = %d!\n", ret); - goto out_init; + goto out_free_priv; } + + wd_alg_set_init(&wd_comp_setting.status); + return 0;
-out_init: +out_free_priv: free(priv); wd_comp_setting.priv = NULL; -out_priv: +out_clear_pool: wd_uninit_async_request_pool(&wd_comp_setting.pool); -out_sched: +out_clear_sched: wd_clear_sched(&wd_comp_setting.sched); -out: +out_clear_ctx_config: wd_clear_ctx_config(&wd_comp_setting.config); +out_clear_init: + wd_alg_clear_init(&wd_comp_setting.status); return ret; }
@@ -163,6 +174,8 @@ void wd_comp_uninit(void) /* unset config, sched, driver */ wd_clear_sched(&wd_comp_setting.sched); wd_clear_ctx_config(&wd_comp_setting.config); + + wd_alg_clear_init(&wd_comp_setting.status); }
struct wd_comp_msg *wd_comp_get_msg(__u32 idx, __u32 tag) diff --git a/wd_dh.c b/wd_dh.c index 0bf770d..85382e2 100644 --- a/wd_dh.c +++ b/wd_dh.c @@ -32,6 +32,7 @@ struct wd_dh_sess { };
static struct wd_dh_setting { + enum wd_status status; struct wd_ctx_config_internal config; struct wd_sched sched; void *sched_ctx; @@ -78,24 +79,29 @@ void wd_dh_set_driver(struct wd_dh_driver *drv) int wd_dh_init(struct wd_ctx_config *config, struct wd_sched *sched) { void *priv; + bool flag; int ret;
+ flag = wd_alg_try_init(&wd_dh_setting.status); + if (!flag) + return 0; + ret = wd_init_param_check(config, sched); if (ret) - return ret; + goto out_clear_init;
ret = wd_set_epoll_en("WD_DH_EPOLL_EN", &wd_dh_setting.config.epoll_en); if (ret < 0) - return ret; + goto out_clear_init;
ret = wd_init_ctx_config(&wd_dh_setting.config, config); if (ret) - return ret; + goto out_clear_init;
ret = wd_init_sched(&wd_dh_setting.sched, sched); if (ret) - goto out; + goto out_clear_ctx_config;
#ifdef WD_STATIC_DRV wd_dh_set_static_drv(); @@ -106,13 +112,13 @@ int wd_dh_init(struct wd_ctx_config *config, struct wd_sched *sched) config->ctx_num, WD_POOL_MAX_ENTRIES, sizeof(struct wd_dh_msg)); if (ret) - goto out_sched; + goto out_clear_sched;
/* initialize ctx related resources in specific driver */ priv = calloc(1, wd_dh_setting.driver->drv_ctx_size); if (!priv) { ret = -WD_ENOMEM; - goto out_priv; + goto out_clear_pool; }
wd_dh_setting.priv = priv; @@ -120,21 +126,24 @@ int wd_dh_init(struct wd_ctx_config *config, struct wd_sched *sched) wd_dh_setting.driver->alg_name); if (ret < 0) { WD_ERR("failed to init dh driver, ret= %d!\n", ret); - goto out_init; + goto out_free_priv; }
+ wd_alg_set_init(&wd_dh_setting.status); + return 0;
-out_init: +out_free_priv: free(priv); wd_dh_setting.priv = NULL; -out_priv: +out_clear_pool: wd_uninit_async_request_pool(&wd_dh_setting.pool); -out_sched: +out_clear_sched: wd_clear_sched(&wd_dh_setting.sched); -out: +out_clear_ctx_config: wd_clear_ctx_config(&wd_dh_setting.config); - +out_clear_init: + wd_alg_clear_init(&wd_dh_setting.status); return ret; }
@@ -156,6 +165,7 @@ void wd_dh_uninit(void) /* unset config, sched, driver */ wd_clear_sched(&wd_dh_setting.sched); wd_clear_ctx_config(&wd_dh_setting.config); + wd_alg_clear_init(&wd_dh_setting.status); }
static int fill_dh_msg(struct wd_dh_msg *msg, struct wd_dh_req *req, diff --git a/wd_digest.c b/wd_digest.c index f56be0c..26dc7d1 100644 --- a/wd_digest.c +++ b/wd_digest.c @@ -39,6 +39,7 @@ static int g_digest_mac_full_len[WD_DIGEST_TYPE_MAX] = { };
struct wd_digest_setting { + enum wd_status status; struct wd_ctx_config_internal config; struct wd_sched sched; struct wd_digest_driver *driver; @@ -186,24 +187,29 @@ void wd_digest_free_sess(handle_t h_sess) int wd_digest_init(struct wd_ctx_config *config, struct wd_sched *sched) { void *priv; + bool flag; int ret;
+ flag = wd_alg_try_init(&wd_digest_setting.status); + if (!flag) + return 0; + ret = wd_init_param_check(config, sched); if (ret) - return ret; + goto out_clear_init;
ret = wd_set_epoll_en("WD_DIGEST_EPOLL_EN", &wd_digest_setting.config.epoll_en); if (ret < 0) - return ret; + goto out_clear_init;
ret = wd_init_ctx_config(&wd_digest_setting.config, config); if (ret < 0) - return ret; + goto out_clear_init;
ret = wd_init_sched(&wd_digest_setting.sched, sched); if (ret < 0) - goto out; + goto out_clear_ctx_config;
/* set driver */ #ifdef WD_STATIC_DRV @@ -215,33 +221,37 @@ int wd_digest_init(struct wd_ctx_config *config, struct wd_sched *sched) config->ctx_num, WD_POOL_MAX_ENTRIES, sizeof(struct wd_digest_msg)); if (ret < 0) - goto out_sched; + goto out_clear_sched;
/* init ctx related resources in specific driver */ priv = calloc(1, wd_digest_setting.driver->drv_ctx_size); if (!priv) { ret = -WD_ENOMEM; - goto out_priv; + goto out_clear_pool; } wd_digest_setting.priv = priv;
ret = wd_digest_setting.driver->init(&wd_digest_setting.config, priv); if (ret < 0) { WD_ERR("failed to init digest dirver!\n"); - goto out_init; + goto out_free_priv; }
+ wd_alg_set_init(&wd_digest_setting.status); + return 0;
-out_init: +out_free_priv: free(priv); wd_digest_setting.priv = NULL; -out_priv: +out_clear_pool: wd_uninit_async_request_pool(&wd_digest_setting.pool); -out_sched: +out_clear_sched: wd_clear_sched(&wd_digest_setting.sched); -out: +out_clear_ctx_config: wd_clear_ctx_config(&wd_digest_setting.config); +out_clear_init: + wd_alg_clear_init(&wd_digest_setting.status); return ret; }
@@ -260,6 +270,7 @@ void wd_digest_uninit(void)
wd_clear_sched(&wd_digest_setting.sched); wd_clear_ctx_config(&wd_digest_setting.config); + wd_alg_clear_init(&wd_digest_setting.status); }
static int wd_aes_hmac_length_check(struct wd_digest_sess *sess, diff --git a/wd_ecc.c b/wd_ecc.c index 2266b1d..3e902bd 100644 --- a/wd_ecc.c +++ b/wd_ecc.c @@ -64,6 +64,7 @@ struct wd_ecc_curve_list { };
static struct wd_ecc_setting { + enum wd_status status; struct wd_ctx_config_internal config; struct wd_sched sched; void *sched_ctx; @@ -133,24 +134,29 @@ void wd_ecc_set_driver(struct wd_ecc_driver *drv) int wd_ecc_init(struct wd_ctx_config *config, struct wd_sched *sched) { void *priv; + bool flag; int ret;
+ flag = wd_alg_try_init(&wd_ecc_setting.status); + if (!flag) + return 0; + ret = wd_init_param_check(config, sched); if (ret) - return ret; + goto out_clear_init;
ret = wd_set_epoll_en("WD_ECC_EPOLL_EN", &wd_ecc_setting.config.epoll_en); if (ret < 0) - return ret; + goto out_clear_init;
ret = wd_init_ctx_config(&wd_ecc_setting.config, config); if (ret < 0) - return ret; + goto out_clear_init;
ret = wd_init_sched(&wd_ecc_setting.sched, sched); if (ret < 0) - goto out; + goto out_clear_ctx_config;
#ifdef WD_STATIC_DRV wd_ecc_set_static_drv(); @@ -161,13 +167,13 @@ int wd_ecc_init(struct wd_ctx_config *config, struct wd_sched *sched) config->ctx_num, WD_POOL_MAX_ENTRIES, sizeof(struct wd_ecc_msg)); if (ret < 0) - goto out_sched; + goto out_clear_sched;
/* initialize ctx related resources in specific driver */ priv = calloc(1, wd_ecc_setting.driver->drv_ctx_size); if (!priv) { ret = -WD_ENOMEM; - goto out_priv; + goto out_clear_pool; }
wd_ecc_setting.priv = priv; @@ -175,20 +181,24 @@ int wd_ecc_init(struct wd_ctx_config *config, struct wd_sched *sched) wd_ecc_setting.driver->alg_name); if (ret < 0) { WD_ERR("failed to init ecc driver, ret = %d!\n", ret); - goto out_init; + goto out_free_priv; }
+ wd_alg_set_init(&wd_ecc_setting.status); + return 0;
-out_init: +out_free_priv: free(priv); wd_ecc_setting.priv = NULL; -out_priv: +out_clear_pool: wd_uninit_async_request_pool(&wd_ecc_setting.pool); -out_sched: +out_clear_sched: wd_clear_sched(&wd_ecc_setting.sched); -out: +out_clear_ctx_config: wd_clear_ctx_config(&wd_ecc_setting.config); +out_clear_init: + wd_alg_clear_init(&wd_ecc_setting.status); return ret; }
@@ -210,6 +220,7 @@ void wd_ecc_uninit(void) /* unset config, sched, driver */ wd_clear_sched(&wd_ecc_setting.sched); wd_clear_ctx_config(&wd_ecc_setting.config); + wd_alg_clear_init(&wd_ecc_setting.status); }
static int trans_to_binpad(char *dst, const char *src, diff --git a/wd_rsa.c b/wd_rsa.c index 489833e..aab16ce 100644 --- a/wd_rsa.c +++ b/wd_rsa.c @@ -72,6 +72,7 @@ struct wd_rsa_sess { };
static struct wd_rsa_setting { + enum wd_status status; struct wd_ctx_config_internal config; struct wd_sched sched; void *sched_ctx; @@ -118,24 +119,29 @@ void wd_rsa_set_driver(struct wd_rsa_driver *drv) int wd_rsa_init(struct wd_ctx_config *config, struct wd_sched *sched) { void *priv; + bool flag; int ret;
+ flag = wd_alg_try_init(&wd_rsa_setting.status); + if (!flag) + return 0; + ret = wd_init_param_check(config, sched); if (ret) - return ret; + goto out_clear_init;
ret = wd_set_epoll_en("WD_RSA_EPOLL_EN", &wd_rsa_setting.config.epoll_en); if (ret < 0) - return ret; + goto out_clear_init;
ret = wd_init_ctx_config(&wd_rsa_setting.config, config); if (ret < 0) - return ret; + goto out_clear_init;
ret = wd_init_sched(&wd_rsa_setting.sched, sched); if (ret < 0) - goto out; + goto out_clear_ctx_config;
#ifdef WD_STATIC_DRV wd_rsa_set_static_drv(); @@ -146,13 +152,13 @@ int wd_rsa_init(struct wd_ctx_config *config, struct wd_sched *sched) config->ctx_num, WD_POOL_MAX_ENTRIES, sizeof(struct wd_rsa_msg)); if (ret < 0) - goto out_sched; + goto out_clear_sched;
/* initialize ctx related resources in specific driver */ priv = calloc(1, wd_rsa_setting.driver->drv_ctx_size); if (!priv) { ret = -WD_ENOMEM; - goto out_priv; + goto out_clear_pool; }
wd_rsa_setting.priv = priv; @@ -160,20 +166,24 @@ int wd_rsa_init(struct wd_ctx_config *config, struct wd_sched *sched) wd_rsa_setting.driver->alg_name); if (ret < 0) { WD_ERR("failed to init rsa driver, ret = %d!\n", ret); - goto out_init; + goto out_free_priv; }
+ wd_alg_set_init(&wd_rsa_setting.status); + return 0;
-out_init: +out_free_priv: free(priv); wd_rsa_setting.priv = NULL; -out_priv: +out_clear_pool: wd_uninit_async_request_pool(&wd_rsa_setting.pool); -out_sched: +out_clear_sched: wd_clear_sched(&wd_rsa_setting.sched); -out: +out_clear_ctx_config: wd_clear_ctx_config(&wd_rsa_setting.config); +out_clear_init: + wd_alg_clear_init(&wd_rsa_setting.status); return ret; }
@@ -195,6 +205,7 @@ void wd_rsa_uninit(void) /* unset config, sched, driver */ wd_clear_sched(&wd_rsa_setting.sched); wd_clear_ctx_config(&wd_rsa_setting.config); + wd_alg_clear_init(&wd_rsa_setting.status); }
static int fill_rsa_msg(struct wd_rsa_msg *msg, struct wd_rsa_req *req, diff --git a/wd_util.c b/wd_util.c index 349df81..efc0d41 100644 --- a/wd_util.c +++ b/wd_util.c @@ -22,6 +22,9 @@ #define WD_RECV_MAX_CNT_NOSLEEP 200000000 #define PRIVILEGE_FLAG 600
+#define WD_INIT_SLEEP_UTIME 1000 +#define WD_INIT_RETRY_TIMES 10000 + struct msg_pool { /* message array allocated dynamically */ void *msgs; @@ -1777,3 +1780,24 @@ int wd_init_param_check(struct wd_ctx_config *config, struct wd_sched *sched)
return 0; } + +bool wd_alg_try_init(enum wd_status *status) +{ + enum wd_status expected; + int count = 0; + bool ret; + + do { + expected = WD_UNINIT; + ret = __atomic_compare_exchange_n(status, &expected, WD_INITING, true, + __ATOMIC_RELAXED, __ATOMIC_RELAXED); + if (expected == WD_INIT) + return false; + usleep(WD_INIT_SLEEP_UTIME); + if (!(++count % WD_INIT_RETRY_TIMES)) + WD_ERR("The algorithm initizalite has been waiting for %ds!\n", + WD_INIT_SLEEP_UTIME * count / 1000000); + } while (!ret); + + return true; +}
Now the uadk initialization interface supports concurrent calls from multiple threads and is reentrant.
Signed-off-by: Yang Shen shenyang39@huawei.com --- docs/wd_design.md | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/docs/wd_design.md b/docs/wd_design.md index ba5a5b9..3e5297e 100644 --- a/docs/wd_design.md +++ b/docs/wd_design.md @@ -81,6 +81,7 @@ | | |2) Change *user* layer to *sched* layer since | | | | sample_sched is moved from user space into UADK | | | | framework. | +| 1.4 | |1) Update *wd_alg_init* reentrancy. |
## Terminology @@ -493,7 +494,9 @@ device. Return 0 if it succeeds. And return error number if it fails.
In *wd_comp_init()*, context resources, user scheduler and vendor driver are -initialized. +initialized. This function supports multi-threaded concurrent calls and +reentrant. When one thread is initializing, other threads will wait for +completion.
***void wd_comp_uninit(void)***
Add a new pair of interfaces, 'WD_ERR_PTR()' and 'WD_PTR_ERR()', for returning an error value as a pointer.
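For readers unfamiliar with the idiom, here is a small self-contained sketch of how an error number can travel through a pointer return value. The demo_* helpers are hypothetical stand-ins written for this example. They follow the same casts as the helpers in this patch and the same threshold as the existing WD_IS_ERR() check, but they are not the uadk definitions.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical bound: values in the top 1000 addresses encode errors. */
#define DEMO_MAX_ERRNO 1000

/* Encode a negative error number as a pointer value. */
static inline void *demo_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Decode the error number carried by a pointer value. */
static inline long demo_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Nonzero if the pointer value lies in the reserved error range. */
static inline int demo_is_err(const void *ptr)
{
	return (uintptr_t)ptr > (uintptr_t)(-DEMO_MAX_ERRNO);
}

static int demo_table[4] = { 10, 20, 30, 40 };

/* Return a pointer to an entry, or an encoded -EINVAL for a bad index. */
static int *demo_lookup(int idx)
{
	if (idx < 0 || idx >= 4)
		return demo_err_ptr(-EINVAL);
	return &demo_table[idx];
}
```

A caller checks the result with demo_is_err() before dereferencing it and recovers the error number with demo_ptr_err(), so one return value can carry either a valid pointer or the reason for the failure.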
Signed-off-by: Yang Shen shenyang39@huawei.com --- include/wd.h | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-)
diff --git a/include/wd.h b/include/wd.h index b0580ba..74d714b 100644 --- a/include/wd.h +++ b/include/wd.h @@ -91,11 +91,6 @@ typedef void (*wd_log)(const char *format, ...); #define WD_IS_ERR(h) ((uintptr_t)(h) > \ (uintptr_t)(-1000))
-static inline void *WD_ERR_PTR(uintptr_t error) -{ - return (void *)error; -} - enum wcrypto_type { WD_CIPHER, WD_DIGEST, @@ -185,6 +180,16 @@ static inline void wd_iowrite64(void *addr, uint64_t value) *((volatile uint64_t *)addr) = value; }
+static inline void *WD_ERR_PTR(uintptr_t error) +{ + return (void *)error; +} + +static inline long WD_PTR_ERR(const void *ptr) +{ + return (long)ptr; +} + /** * wd_request_ctx() - Request a communication context from a device. * @dev: Indicate one device.
Since two functions will be used by multiple files, move them to the header file.
Signed-off-by: Yang Shen shenyang39@huawei.com --- include/wd.h | 15 +++++++++++++++ wd.c | 35 +++++++++++++++++------------------ 2 files changed, 32 insertions(+), 18 deletions(-)
diff --git a/include/wd.h b/include/wd.h index 74d714b..e1a87de 100644 --- a/include/wd.h +++ b/include/wd.h @@ -508,6 +508,21 @@ void wd_mempool_stats(handle_t mempool, struct wd_mempool_stats *stats); */ void wd_blockpool_stats(handle_t blkpool, struct wd_blockpool_stats *stats);
+/** + * wd_clone_dev() - clone a new uacce device. + * @dev: The source device. + * + * Return a pointer value if succeed, and NULL if fail. + */ +struct uacce_dev *wd_clone_dev(struct uacce_dev *dev); + +/** + * wd_add_dev_to_list() - add a node to end of list. + * @head: The list head. + * @node: The node need to be add. + */ +void wd_add_dev_to_list(struct uacce_dev_list *head, struct uacce_dev_list *node); + /** * wd_ctx_get_dev_name() - Get the device name about task. * @h_ctx: The handle of context. diff --git a/wd.c b/wd.c index b0c3dec..d99d4ec 100644 --- a/wd.c +++ b/wd.c @@ -365,7 +365,13 @@ out: return strndup(name, len); }
-static struct uacce_dev *clone_uacce_dev(struct uacce_dev *dev) +static void wd_ctx_init_qfrs_offs(struct wd_ctx_h *ctx) +{ + memcpy(&ctx->qfrs_offs, &ctx->dev->qfrs_offs, + sizeof(ctx->qfrs_offs)); +} + +struct uacce_dev *wd_clone_dev(struct uacce_dev *dev) { struct uacce_dev *new;
@@ -378,10 +384,14 @@ static struct uacce_dev *clone_uacce_dev(struct uacce_dev *dev) return new; }
-static void wd_ctx_init_qfrs_offs(struct wd_ctx_h *ctx) +void wd_add_dev_to_list(struct uacce_dev_list *head, struct uacce_dev_list *node) { - memcpy(&ctx->qfrs_offs, &ctx->dev->qfrs_offs, - sizeof(ctx->qfrs_offs)); + struct uacce_dev_list *tmp = head; + + while (tmp->next) + tmp = tmp->next; + + tmp->next = node; }
handle_t wd_request_ctx(struct uacce_dev *dev) @@ -416,7 +426,7 @@ handle_t wd_request_ctx(struct uacce_dev *dev) if (!ctx->drv_name) goto free_dev_name;
- ctx->dev = clone_uacce_dev(dev); + ctx->dev = wd_clone_dev(dev); if (!ctx->dev) goto free_drv_name;
@@ -656,17 +666,6 @@ static bool dev_has_alg(const char *dev_alg_name, const char *alg_name) return false; }
-static void add_uacce_dev_to_list(struct uacce_dev_list *head, - struct uacce_dev_list *node) -{ - struct uacce_dev_list *tmp = head; - - while (tmp->next) - tmp = tmp->next; - - tmp->next = node; -} - static int check_alg_name(const char *alg_name) { int i = 0; @@ -729,7 +728,7 @@ struct uacce_dev_list *wd_get_accel_list(const char *alg_name) if (!head) head = node; else - add_uacce_dev_to_list(head, node); + wd_add_dev_to_list(head, node); }
closedir(wd_class); @@ -788,7 +787,7 @@ struct uacce_dev *wd_get_accel_dev(const char *alg_name) }
if (dev) - target = clone_uacce_dev(dev); + target = wd_clone_dev(dev);
wd_free_list_accels(head);
For performance reasons, uadk leaves many configuration options to users. This gives users great flexibility, but it also makes the current initialization interface highly complex. Therefore, to help users adapt quickly, a new set of interfaces is provided.
'wd_alg_init2()' will complete all initialization steps. There are 4 parameters to describe the user configuration requirements:
@device_list: The available uacce device list. Users can get it with wd_get_accel_list().
@numa_bitmask: The bitmask provided by libnuma. Users can use this parameter to control which devices ctxs are requested from in the NUMA-binding scenario.
@ctx_nums: The requested ctx number for each NUMA node. Because users may have different requirements for different types of ctx, a two-dimensional array is needed as input.
@sched_type: The scheduling type the user wants to use.
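As an illustration of how the ctx numbers might be consumed, the sketch below mirrors the layout of 'struct wd_ctx_nums' and 'struct wd_ctx_params' from this patch under hypothetical demo_* names and sums the sync and async ctx numbers of the first few ctx sets. The behaviour is an assumption modelled only on the comment of wd_get_ctx_numbers(). The clamping of 'end' is added for this example and may not match the real helper.

```c
#include <stdint.h>

/* Local mirrors of the structures added to wd_alg_common.h (demo names). */
struct demo_ctx_nums {
	uint32_t sync_ctx_num;
	uint32_t async_ctx_num;
};

struct demo_ctx_params {
	uint32_t ctx_set_num;			/* number of ctx sets */
	struct demo_ctx_nums *ctx_set_size;	/* size of each ctx set */
};

/*
 * Sum the sync and async ctx numbers of the first 'end' ctx sets.
 * Assumed behaviour; 'end' is clamped to ctx_set_num in this sketch.
 */
static uint32_t demo_ctx_numbers(struct demo_ctx_params cparams, uint32_t end)
{
	uint32_t count = 0;

	if (end > cparams.ctx_set_num)
		end = cparams.ctx_set_num;
	for (uint32_t i = 0; i < end; i++)
		count += cparams.ctx_set_size[i].sync_ctx_num +
			 cparams.ctx_set_size[i].async_ctx_num;

	return count;
}
```

A prefix sum of this kind gives the index of the first ctx that belongs to a given ctx set, which is one plausible reason for exposing an 'end' argument.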
Signed-off-by: Yang Shen shenyang39@huawei.com --- Makefile.am | 4 +- include/wd.h | 24 +++++ include/wd_alg_common.h | 24 +++++ include/wd_comp.h | 27 +++++ include/wd_util.h | 19 ++++ wd.c | 62 ++++++++++++ wd_comp.c | 213 ++++++++++++++++++++++++++++++++++++++++ wd_util.c | 59 ++++++++++- 8 files changed, 429 insertions(+), 3 deletions(-)
diff --git a/Makefile.am b/Makefile.am index b3f07df..6cfb6b3 100644 --- a/Makefile.am +++ b/Makefile.am @@ -86,7 +86,7 @@ AM_CFLAGS += -DWD_NO_LOG
libwd_la_LIBADD = $(libwd_la_OBJECTS) -lnuma
-libwd_comp_la_LIBADD = $(libwd_la_OBJECTS) -ldl +libwd_comp_la_LIBADD = $(libwd_la_OBJECTS) -ldl -lnuma libwd_comp_la_DEPENDENCIES = libwd.la
libhisi_zip_la_LIBADD = -ldl @@ -103,7 +103,7 @@ else libwd_la_LDFLAGS=$(UADK_VERSION) libwd_la_LIBADD= -lnuma
-libwd_comp_la_LIBADD= -lwd -ldl +libwd_comp_la_LIBADD= -lwd -ldl -lnuma libwd_comp_la_LDFLAGS=$(UADK_VERSION) libwd_comp_la_DEPENDENCIES= libwd.la
diff --git a/include/wd.h b/include/wd.h index e1a87de..facd992 100644 --- a/include/wd.h +++ b/include/wd.h @@ -348,6 +348,16 @@ int wd_get_avail_ctx(struct uacce_dev *dev); */ struct uacce_dev_list *wd_get_accel_list(const char *alg_name);
+/** + * wd_find_dev_by_numa() - get device with max available ctx number from an + * device list according to numa id. + * @list: The device list. + * @numa_id: The numa_id. + * + * Return device if succeed and other error number if fail. + */ +struct uacce_dev *wd_find_dev_by_numa(struct uacce_dev_list *list, int numa_id); + /** * wd_get_accel_dev() - Get device supporting the algorithm with smallest numa distance to current numa node. @@ -523,6 +533,20 @@ struct uacce_dev *wd_clone_dev(struct uacce_dev *dev); */ void wd_add_dev_to_list(struct uacce_dev_list *head, struct uacce_dev_list *node);
+/** + * wd_create_device_nodemask() - create a numa node mask of device list. + * @list: The devices list. + * + * Return a pointer value if succeed, and error number if fail. + */ +struct bitmask *wd_create_device_nodemask(struct uacce_dev_list *list); + +/** + * wd_free_device_nodemask() - free a numa node mask. + * @bmp: A numa node mask. + */ +void wd_free_device_nodemask(struct bitmask *bmp); + /** * wd_ctx_get_dev_name() - Get the device name about task. * @h_ctx: The handle of context. diff --git a/include/wd_alg_common.h b/include/wd_alg_common.h index c455dc3..f261830 100644 --- a/include/wd_alg_common.h +++ b/include/wd_alg_common.h @@ -63,6 +63,30 @@ struct wd_ctx_config { void *priv; };
+/** + * struct wd_ctx_nums - Define the ctx sets numbers. + * @sync_ctx_num: The ctx numbers which are used for sync mode for each + * ctx sets. + * @async_ctx_num: The ctx numbers which are used for async mode for each + * ctx sets. + */ +struct wd_ctx_nums { + __u32 sync_ctx_num; + __u32 async_ctx_num; +}; + +/** + * struct wd_ctx_params - Define the ctx sets params which are used for init + * algorithms. + * @ctx_set_num: Number of ctx sets to be created. Usually users can + * set it according to <alg>_op_type. + * @ctx_set_size: Each ctx sets numbers. + */ +struct wd_ctx_params { + __u32 ctx_set_num; + struct wd_ctx_nums *ctx_set_size; +}; + struct wd_ctx_internal { handle_t ctx; __u8 op_type; diff --git a/include/wd_comp.h b/include/wd_comp.h index e043a83..9cd50dd 100644 --- a/include/wd_comp.h +++ b/include/wd_comp.h @@ -7,6 +7,7 @@ #ifndef __WD_COMP_H #define __WD_COMP_H
+#include <numa.h> #include "wd.h" #include "wd_alg_common.h"
@@ -113,6 +114,32 @@ int wd_comp_init(struct wd_ctx_config *config, struct wd_sched *sched); */ void wd_comp_uninit(void);
+/** + * wd_comp_init2() - A simplify interface to initializate uadk + * compression/decompression. Users can use wd_get_accel_list() to + * get the usable device list with the algrithms. Users should provide + * a device numa node mask to show which numa devices will be + * selected. wd_create_device_nodemask() can create a node mask + * according the list. If all numa devices on the list are match + * the requirement, just use the return of it. Otherwise, users can + * use the function in libnuma to set the node mask. + * To make the initializate simpler, bmp and cparams support set NULL. + * And then the function will set them as default. + * + * @list: The device list. + * @bmp: Node mask of the required devices. + * @cparams: The ctx settings. + * @sched_type: The scheduler type. + * + * Return 0 if succeed and others if fail. + */ +int wd_comp_init2(const char *alg_name, __u32 sched_type); + +/** + * wd_comp_uninit2() - Uninitialise ctx configuration and scheduler. + */ +void wd_comp_uninit2(void); + struct wd_comp_sess_setup { enum wd_comp_alg_type alg_type; /* Denoted by enum wd_comp_alg_type */ enum wd_comp_level comp_lv; /* Denoted by enum wd_comp_level */ diff --git a/include/wd_util.h b/include/wd_util.h index eafe3ce..4a2e102 100644 --- a/include/wd_util.h +++ b/include/wd_util.h @@ -7,6 +7,7 @@ #ifndef __WD_UTIL_H #define __WD_UTIL_H
+#include <numa.h> #include <stdbool.h> #include <sys/ipc.h> #include <sys/shm.h> @@ -394,6 +395,24 @@ static inline void wd_alg_clear_init(enum wd_status *status) __atomic_store(status, &setting, __ATOMIC_RELAXED); }
+/** + * wd_get_usable_list() - choose the devices according bitmask. + * @list: The device list. + * @bmp: The devices node mask. + * + * Return a list that meet user's requirement if succeed, and error number if fail. + */ +struct uacce_dev_list *wd_get_usable_list(struct uacce_dev_list *list, struct bitmask *bmp); + +/** + * wd_get_ctx_numbers() - count the ctx number for first to end. + * @cparams: the input ctx setting numbers. + * @end: the end index of cparams. + * + * Return the sum of top '@end' cparams ctx number. + */ +__u32 wd_get_ctx_numbers(struct wd_ctx_params cparams, int end); + /** * wd_dfx_msg_cnt() - Message counter interface for ctx * @msg: Shared memory addr. diff --git a/wd.c b/wd.c index d99d4ec..c63805c 100644 --- a/wd.c +++ b/wd.c @@ -741,6 +741,35 @@ free_list: return NULL; }
+struct uacce_dev *wd_find_dev_by_numa(struct uacce_dev_list *list, int numa_id) +{ + struct uacce_dev *dev = WD_ERR_PTR(-WD_ENODEV); + struct uacce_dev_list *p = list; + int ctx_num, ctx_max = 0; + + if (!list) { + WD_ERR("invalid: list is NULL!\n"); + return WD_ERR_PTR(-WD_EINVAL); + } + + while (p) { + if (numa_id != p->dev->numa_id) { + p = p->next; + continue; + } + + ctx_num = wd_get_avail_ctx(p->dev); + if (ctx_num > ctx_max) { + dev = p->dev; + ctx_max = ctx_num; + } + + p = p->next; + } + + return dev; +} + void wd_free_list_accels(struct uacce_dev_list *list) { struct uacce_dev_list *curr, *next; @@ -807,6 +836,39 @@ int wd_ctx_set_io_cmd(handle_t h_ctx, unsigned long cmd, void *arg) return ioctl(ctx->fd, cmd, arg); }
+struct bitmask *wd_create_device_nodemask(struct uacce_dev_list *list) +{ + struct uacce_dev_list *p; + struct bitmask *bmp; + + if (!list) { + WD_ERR("invalid: list is NULL!\n"); + return WD_ERR_PTR(-WD_EINVAL); + } + + bmp = numa_allocate_nodemask(); + if (!bmp) { + WD_ERR("failed to alloc bitmask(%d)!\n", errno); + return WD_ERR_PTR(-WD_EINVAL); + } + + p = list; + while (p) { + numa_bitmask_setbit(bmp, p->dev->numa_id); + p = p->next; + } + + return bmp; +} + +void wd_free_device_nodemask(struct bitmask *bmp) +{ + if (!bmp) + return; + + numa_free_nodemask(bmp); +} + void wd_get_version(void) { const char *wd_released_time = UADK_RELEASED_TIME; diff --git a/wd_comp.c b/wd_comp.c index 44593a6..ba79838 100644 --- a/wd_comp.c +++ b/wd_comp.c @@ -14,6 +14,7 @@
#include "config.h" #include "drv/wd_comp_drv.h" +#include "wd_sched.h" #include "wd_util.h" #include "wd_comp.h"
@@ -21,6 +22,8 @@ #define HW_CTX_SIZE (64 * 1024) #define STREAM_CHUNK (128 * 1024)
+#define SCHED_RR_NAME "sched_rr" + #define swap_byte(x) \ ((((x) & 0x000000ff) << 24) | \ (((x) & 0x0000ff00) << 8) | \ @@ -42,6 +45,7 @@ struct wd_comp_sess {
struct wd_comp_setting { enum wd_status status; + enum wd_status status2; struct wd_ctx_config_internal config; struct wd_sched sched; struct wd_comp_driver *driver; @@ -52,6 +56,10 @@ struct wd_comp_setting {
struct wd_env_config wd_comp_env_config;
+static struct wd_ctx_config wd_comp_ctx; +static struct wd_sched *wd_comp_sched; +static int wd_comp_numa_count; + #ifdef WD_STATIC_DRV static void wd_comp_set_static_drv(void) { @@ -178,6 +186,209 @@ void wd_comp_uninit(void) wd_alg_clear_init(&wd_comp_setting.status); }
+static int wd_comp_request_ctx(struct uacce_dev_list *list, + struct wd_ctx_nums ctx_nums, + int idx, int numa_id, int op_type) +{ + int ctx_set_size = ctx_nums.sync_ctx_num + ctx_nums.async_ctx_num; + struct uacce_dev *dev; + int i; + + dev = wd_find_dev_by_numa(list, numa_id); + if (!dev) + return -WD_EBUSY; + + for (i = idx; i < idx + ctx_set_size; i++) { + wd_comp_ctx.ctxs[i].ctx = wd_request_ctx(dev); + if (errno == WD_EBUSY) { + dev = wd_find_dev_by_numa(list, numa_id); + if (!dev) + return -WD_EBUSY; + i--; + } + wd_comp_ctx.ctxs[i].op_type = op_type; + wd_comp_ctx.ctxs[i].ctx_mode = + ((i - idx) < ctx_nums.sync_ctx_num) ? + CTX_MODE_SYNC : CTX_MODE_ASYNC; + } + + return 0; +} + +static void wd_comp_release_ctx(void) +{ + int i; + + for (i = 0; i < wd_comp_ctx.ctx_num; i++) + if (wd_comp_ctx.ctxs[i].ctx) { + wd_release_ctx(wd_comp_ctx.ctxs[i].ctx); + wd_comp_ctx.ctxs[i].ctx = 0; + } +} + +static int wd_comp_instance_sched(struct wd_ctx_nums ctx_nums, int idx, + int numa_id, int op_type) +{ + struct sched_params sparams; + int i, ret = 0; + + for (i = 0; i < CTX_MODE_MAX; i++) { + sparams.numa_id = numa_id; + sparams.type = op_type; + sparams.mode = i; + sparams.begin = idx + ctx_nums.sync_ctx_num * i; + sparams.end = idx - 1 + ctx_nums.sync_ctx_num + ctx_nums.async_ctx_num * i; + if (sparams.begin > sparams.end) + continue; + ret = wd_sched_rr_instance(wd_comp_sched, &sparams); + if (ret) + goto out; + } + +out: + return ret; +} + +static int __wd_comp_init2(struct uacce_dev_list *list, struct bitmask *bmp, + struct wd_ctx_params cparams) +{ + int ctx_set_num = cparams.ctx_set_num; + int max_node = numa_max_node() + 1; + struct wd_ctx_nums ctx_nums; + int i, j, ret; + int idx = 0; + + for (i = 0; i < max_node; i++) { + if (!numa_bitmask_isbitset(bmp, i)) + continue; + for (j = 0; j < ctx_set_num; j++) { + ctx_nums = cparams.ctx_set_size[j]; + ret = wd_comp_request_ctx(list, ctx_nums, idx, i, j); + if (ret) + goto free_ctxs; + ret = 
wd_comp_instance_sched(ctx_nums, idx, i, j); + if (ret) + goto free_ctxs; + idx += (ctx_nums.sync_ctx_num + ctx_nums.async_ctx_num); + } + } + + ret = wd_comp_init(&wd_comp_ctx, wd_comp_sched); + if (ret) + goto free_ctxs; + + return 0; + +free_ctxs: + wd_comp_release_ctx(); + + return ret; +} + +static struct wd_ctx_nums comp_default_ctxsize[] = { + {1, 1}, {1, 1}, { } +}; + +static struct wd_ctx_params comp_default_cparams = { + .ctx_set_num = WD_DIR_MAX, + .ctx_set_size = comp_default_ctxsize, +}; + +int wd_comp_init2(const char *alg_name, __u32 sched_type) +{ + struct uacce_dev_list *dev_list = NULL; + __u32 ctx_set_num, ctx_set_size; + struct bitmask *dev_bmp; + bool flag; + int ret; + + flag = wd_alg_try_init(&wd_comp_setting.status2); + if (!flag) + return 0; + + if (!alg_name) { + WD_ERR("invalid: alg_name is NULL!\n"); + ret = -WD_EINVAL; + goto out_uninit; + } + + dev_list = wd_get_accel_list(alg_name); + if (!dev_list) { + WD_ERR("invalid: alg_name is not support!\n"); + ret = -WD_EINVAL; + goto out_uninit; + } + + dev_bmp = wd_create_device_nodemask(dev_list); + if (WD_IS_ERR(dev_bmp)) { + ret = WD_PTR_ERR(dev_bmp); + goto out_freelist; + } + + wd_comp_numa_count = numa_bitmask_weight(dev_bmp); + if (!wd_comp_numa_count) { + WD_ERR("invalid: bmp is clear!\n"); + ret = -WD_ENODEV; + goto out_freebmp; + } + + ctx_set_num = comp_default_cparams.ctx_set_num; + ctx_set_size = wd_get_ctx_numbers(comp_default_cparams, ctx_set_num); + wd_comp_ctx.ctx_num = ctx_set_size * wd_comp_numa_count; + wd_comp_ctx.ctxs = calloc(wd_comp_ctx.ctx_num, sizeof(struct wd_ctx)); + if (!wd_comp_ctx.ctxs) { + ret = -WD_ENOMEM; + WD_ERR("failed to alloc ctxs!\n"); + goto out_freebmp; + } + + wd_comp_sched = wd_sched_rr_alloc(sched_type, ctx_set_num, + numa_max_node() + 1, wd_comp_poll_ctx); + if (!wd_comp_sched) { + ret = -WD_EINVAL; + goto out_freectxs; + } + wd_comp_sched->name = SCHED_RR_NAME; + + ret = __wd_comp_init2(dev_list, dev_bmp, comp_default_cparams); + if (ret) + goto 
out_freesched; + + wd_free_device_nodemask(dev_bmp); + wd_free_list_accels(dev_list); + + wd_alg_set_init(&wd_comp_setting.status2); + + return ret; + +out_freesched: + wd_sched_rr_release(wd_comp_sched); + wd_comp_sched = NULL; + +out_freectxs: + free(wd_comp_ctx.ctxs); + wd_comp_ctx.ctxs = NULL; + +out_freebmp: + wd_free_device_nodemask(dev_bmp); + +out_freelist: + wd_free_list_accels(dev_list); + +out_uninit: + wd_alg_clear_init(&wd_comp_setting.status2); + + return ret; +} + +void wd_comp_uninit2(void) +{ + wd_comp_uninit(); + wd_comp_release_ctx(); + wd_sched_rr_release(wd_comp_sched); + wd_alg_clear_init(&wd_comp_setting.status2); +} + struct wd_comp_msg *wd_comp_get_msg(__u32 idx, __u32 tag) { return wd_find_msg_in_pool(&wd_comp_setting.pool, idx, tag); @@ -289,6 +500,7 @@ handle_t wd_comp_alloc_sess(struct wd_comp_sess_setup *setup) sess->comp_lv = setup->comp_lv; sess->win_sz = setup->win_sz; sess->stream_pos = WD_COMP_STREAM_NEW; + /* Some simple scheduler don't need scheduling parameters */ sess->sched_key = (void *)wd_comp_setting.sched.sched_init( wd_comp_setting.sched.h_sched_ctx, setup->sched_param); @@ -318,6 +530,7 @@ void wd_comp_free_sess(handle_t h_sess)
if (sess->sched_key) free(sess->sched_key); + free(sess); }
diff --git a/wd_util.c b/wd_util.c index efc0d41..471ca07 100644 --- a/wd_util.c +++ b/wd_util.c @@ -5,7 +5,6 @@ */
#define _GNU_SOURCE -#include <numa.h> #include <pthread.h> #include <semaphore.h> #include <string.h> @@ -1801,3 +1800,61 @@ bool wd_alg_try_init(enum wd_status *status)
return true; } + +struct uacce_dev_list *wd_get_usable_list(struct uacce_dev_list *list, struct bitmask *bmp) +{ + struct uacce_dev_list *p, *node, *result = NULL; + struct uacce_dev *dev; + int numa_id, ret; + + p = list; + while (p) { + dev = p->dev; + numa_id = dev->numa_id; + ret = numa_bitmask_isbitset(bmp, numa_id); + if (!ret) { + p = p->next; + continue; + } + + node = calloc(1, sizeof(*node)); + if (!node) { + result = WD_ERR_PTR(-WD_ENOMEM); + goto out_free_list; + } + + node->dev = wd_clone_dev(dev); + if (!node->dev) { + result = WD_ERR_PTR(-WD_ENOMEM); + goto out_free_node; + } + + if (!result) + result = node; + else + wd_add_dev_to_list(result, node); + + p = p->next; + } + + return result; + +out_free_node: + free(node); +out_free_list: + wd_free_list_accels(result); + return result; +} + +__u32 wd_get_ctx_numbers(struct wd_ctx_params cparams, int end) +{ + __u32 count = 0; + int i; + + for (i = 0; i < end; i++) { + count += cparams.ctx_set_size[i].sync_ctx_num; + count += cparams.ctx_set_size[i].async_ctx_num; + } + + return count; +}
On 2022/9/24 18:18, Yang Shen wrote:
For performance reasons, uadk leaves many configuration options to users. This gives users great flexibility, but it also makes the current initialization interface complex. Therefore, to help users adopt uadk quickly, a new set of interfaces is provided.

'wd_alg_init2()' completes all initialization steps. Four parameters describe the user's configuration requirements:
@device_list: The list of available uacce devices. Users can get it with wd_get_accel_list().
@numa_bitmask: The bitmask provided by libnuma. Users can use this parameter to control which NUMA nodes' devices ctxs are requested from when binding to NUMA nodes.
@ctx_nums: The number of ctxs requested for each NUMA node. Because users may need different numbers of ctxs for different ctx types, a two-dimensional array is required as input.
@sched_type: The scheduling type the user wants to use.
This description needs to be updated.
Signed-off-by: Yang Shen <shenyang39@huawei.com>
---
 Makefile.am             |   4 +-
 include/wd.h            |  24 +++++
 include/wd_alg_common.h |  24 +++++
 include/wd_comp.h       |  27 +++++
 include/wd_util.h       |  19 ++++
 wd.c                    |  62 ++++++++++++
 wd_comp.c               | 213 ++++++++++++++++++++++++++++++++++++++++
 wd_util.c               |  59 ++++++++++-
 8 files changed, 429 insertions(+), 3 deletions(-)

diff --git a/Makefile.am b/Makefile.am
index b3f07df..6cfb6b3 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -86,7 +86,7 @@ AM_CFLAGS += -DWD_NO_LOG

 libwd_la_LIBADD = $(libwd_la_OBJECTS) -lnuma

-libwd_comp_la_LIBADD = $(libwd_la_OBJECTS) -ldl
+libwd_comp_la_LIBADD = $(libwd_la_OBJECTS) -ldl -lnuma
 libwd_comp_la_DEPENDENCIES = libwd.la

 libhisi_zip_la_LIBADD = -ldl
@@ -103,7 +103,7 @@ else
 libwd_la_LDFLAGS=$(UADK_VERSION)
 libwd_la_LIBADD= -lnuma

-libwd_comp_la_LIBADD= -lwd -ldl
+libwd_comp_la_LIBADD= -lwd -ldl -lnuma
 libwd_comp_la_LDFLAGS=$(UADK_VERSION)
 libwd_comp_la_DEPENDENCIES= libwd.la

diff --git a/include/wd.h b/include/wd.h
index e1a87de..facd992 100644
--- a/include/wd.h
+++ b/include/wd.h
@@ -348,6 +348,16 @@ int wd_get_avail_ctx(struct uacce_dev *dev);
  */
 struct uacce_dev_list *wd_get_accel_list(const char *alg_name);

+/**
+ * wd_find_dev_by_numa() - get device with max available ctx number from an
+ * device list according to numa id.
+ * @list: The device list.
+ * @numa_id: The numa_id.
+ *
+ * Return device if succeed and other error number if fail.
+ */
+struct uacce_dev *wd_find_dev_by_numa(struct uacce_dev_list *list, int numa_id);
+
 /**
  * wd_get_accel_dev() - Get device supporting the algorithm with
 			smallest numa distance to current numa node.
@@ -523,6 +533,20 @@ struct uacce_dev *wd_clone_dev(struct uacce_dev *dev);
  */
 void wd_add_dev_to_list(struct uacce_dev_list *head, struct uacce_dev_list *node);

+/**
+ * wd_create_device_nodemask() - create a numa node mask of device list.
+ * @list: The devices list.
+ *
+ * Return a pointer value if succeed, and error number if fail.
+ */
+struct bitmask *wd_create_device_nodemask(struct uacce_dev_list *list);
+
+/**
+ * wd_free_device_nodemask() - free a numa node mask.
+ * @bmp: A numa node mask.
+ */
+void wd_free_device_nodemask(struct bitmask *bmp);
+
 /**
  * wd_ctx_get_dev_name() - Get the device name about task.
  * @h_ctx: The handle of context.
diff --git a/include/wd_alg_common.h b/include/wd_alg_common.h
index c455dc3..f261830 100644
--- a/include/wd_alg_common.h
+++ b/include/wd_alg_common.h
@@ -63,6 +63,30 @@ struct wd_ctx_config {
 	void *priv;
 };

+/**
+ * struct wd_ctx_nums - Define the ctx sets numbers.
+ * @sync_ctx_num: The ctx numbers which are used for sync mode for each
+ * ctx sets.
+ * @async_ctx_num: The ctx numbers which are used for async mode for each
+ * ctx sets.
+ */
+struct wd_ctx_nums {
+	__u32 sync_ctx_num;
+	__u32 async_ctx_num;
+};
+
+/**
+ * struct wd_ctx_params - Define the ctx sets params which are used for init
+ * algorithms.
+ * @ctx_set_num: Number of ctx sets to be created. Usually users can
+ * set it according to <alg>_op_type.
+ * @ctx_set_size: Each ctx sets numbers.
+ */
+struct wd_ctx_params {
+	__u32 ctx_set_num;
+	struct wd_ctx_nums *ctx_set_size;
+};
+
 struct wd_ctx_internal {
 	handle_t ctx;
 	__u8 op_type;
diff --git a/include/wd_comp.h b/include/wd_comp.h
index e043a83..9cd50dd 100644
--- a/include/wd_comp.h
+++ b/include/wd_comp.h
@@ -7,6 +7,7 @@
 #ifndef __WD_COMP_H
 #define __WD_COMP_H

+#include <numa.h>
 #include "wd.h"
 #include "wd_alg_common.h"

@@ -113,6 +114,32 @@ int wd_comp_init(struct wd_ctx_config *config, struct wd_sched *sched);
  */
 void wd_comp_uninit(void);

+/**
+ * wd_comp_init2() - A simplify interface to initializate uadk
+ * compression/decompression. Users can use wd_get_accel_list() to
+ * get the usable device list with the algrithms. Users should provide
+ * a device numa node mask to show which numa devices will be
+ * selected. wd_create_device_nodemask() can create a node mask
+ * according the list. If all numa devices on the list are match
+ * the requirement, just use the return of it. Otherwise, users can
+ * use the function in libnuma to set the node mask.
+ * To make the initializate simpler, bmp and cparams support set NULL.
+ * And then the function will set them as default.
+ *
+ * @list: The device list.
+ * @bmp: Node mask of the required devices.
+ * @cparams: The ctx settings.
+ * @sched_type: The scheduler type.

Same as above: this comment needs to be updated.

+ *
+ * Return 0 if succeed and others if fail.
+ */
+int wd_comp_init2(const char *alg_name, __u32 sched_type);
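
direction and op_mode may also need to be configured during init.

sample:
wd_comp_init2(ctx_setup);
struct ctx_setup {
	alg_name;    /* search for usable devices by algorithm name and attach the driver ops */
	direction;   /* compression (encryption), decompression (decryption), or dual mode */
	op_mode;     /* synchronous or asynchronous */
	sched_type;  /* scheduler type; RR scheduling by default */
};
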
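The suggested `ctx_setup` layout above leaves the field types open. A minimal sketch of one possible typed version follows, assuming hypothetical enum names and a hypothetical `ctx_setup_check()` helper; none of these identifiers exist in uadk, and the value sets are guesses based on the comments in the sample.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical enums: names and values are illustrative, not uadk API. */
enum ctx_direction { CTX_DIR_COMP, CTX_DIR_DECOMP, CTX_DIR_DUAL };
enum ctx_op_mode { CTX_OP_SYNC, CTX_OP_ASYNC };
enum ctx_sched_type { CTX_SCHED_RR };

/* Typed variant of the reviewer's sample structure (an assumption). */
struct ctx_setup {
	const char *alg_name;           /* used to look up usable devices */
	enum ctx_direction direction;   /* compress, decompress, or both */
	enum ctx_op_mode op_mode;       /* synchronous or asynchronous */
	enum ctx_sched_type sched_type; /* round-robin as the only choice here */
};

/* Return 0 if the setup looks usable, -1 otherwise. */
static int ctx_setup_check(const struct ctx_setup *setup)
{
	if (!setup || !setup->alg_name || setup->alg_name[0] == '\0')
		return -1;
	if (setup->direction > CTX_DIR_DUAL)
		return -1;
	if (setup->op_mode > CTX_OP_ASYNC)
		return -1;
	if (setup->sched_type > CTX_SCHED_RR)
		return -1;
	return 0;
}
```

Passing a single structure would let later fields be added without changing the `wd_comp_init2()` prototype, which may be the motivation behind the suggestion.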
+
+/**
+ * wd_comp_uninit2() - Uninitialise ctx configuration and scheduler.
+ */
+void wd_comp_uninit2(void);
+
 struct wd_comp_sess_setup {
 	enum wd_comp_alg_type alg_type; /* Denoted by enum wd_comp_alg_type */
 	enum wd_comp_level comp_lv; /* Denoted by enum wd_comp_level */
diff --git a/include/wd_util.h b/include/wd_util.h
index eafe3ce..4a2e102 100644
--- a/include/wd_util.h
+++ b/include/wd_util.h
@@ -7,6 +7,7 @@
 #ifndef __WD_UTIL_H
 #define __WD_UTIL_H

+#include <numa.h>
 #include <stdbool.h>
 #include <sys/ipc.h>
 #include <sys/shm.h>
@@ -394,6 +395,24 @@ static inline void wd_alg_clear_init(enum wd_status *status)
 	__atomic_store(status, &setting, __ATOMIC_RELAXED);
 }

+/**
+ * wd_get_usable_list() - choose the devices according bitmask.
+ * @list: The device list.
+ * @bmp: The devices node mask.
+ *
+ * Return a list that meet user's requirement if succeed, and error number if fail.
+ */
+struct uacce_dev_list *wd_get_usable_list(struct uacce_dev_list *list, struct bitmask *bmp);
+
+/**
+ * wd_get_ctx_numbers() - count the ctx number for first to end.
+ * @cparams: the input ctx setting numbers.
+ * @end: the end index of cparams.
+ *
+ * Return the sum of top '@end' cparams ctx number.
+ */
+__u32 wd_get_ctx_numbers(struct wd_ctx_params cparams, int end);
+
 /**
  * wd_dfx_msg_cnt() - Message counter interface for ctx
  * @msg: Shared memory addr.
diff --git a/wd.c b/wd.c
index d99d4ec..c63805c 100644
--- a/wd.c
+++ b/wd.c
@@ -741,6 +741,35 @@ free_list:
 	return NULL;
 }

+struct uacce_dev *wd_find_dev_by_numa(struct uacce_dev_list *list, int numa_id)
+{
+	struct uacce_dev *dev = WD_ERR_PTR(-WD_ENODEV);
+	struct uacce_dev_list *p = list;
+	int ctx_num, ctx_max = 0;
+
+	if (!list) {
+		WD_ERR("invalid: list is NULL!\n");
+		return WD_ERR_PTR(-WD_EINVAL);
+	}
+
+	while (p) {
+		if (numa_id != p->dev->numa_id) {
+			p = p->next;
+			continue;
+		}
+
+		ctx_num = wd_get_avail_ctx(p->dev);
+		if (ctx_num > ctx_max) {
+			dev = p->dev;
+			ctx_max = ctx_num;
+		}
+
+		p = p->next;
+	}
+
+	return dev;
+}
+
 void wd_free_list_accels(struct uacce_dev_list *list)
 {
 	struct uacce_dev_list *curr, *next;
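The selection rule in wd_find_dev_by_numa() (pick the device on the requested NUMA node with the most available ctxs) can be illustrated without uadk. The sketch below uses hypothetical `mock_dev` and `mock_dev_list` types and returns NULL when nothing matches, which is a simplification: the patch returns `WD_ERR_PTR(-WD_ENODEV)` instead, so, if that macro follows the usual error-pointer convention, a caller would need `WD_IS_ERR()` rather than a plain `!dev` check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct uacce_dev / struct uacce_dev_list. */
struct mock_dev {
	int numa_id;
	int avail_ctx; /* stands in for wd_get_avail_ctx(dev) */
};

struct mock_dev_list {
	struct mock_dev *dev;
	struct mock_dev_list *next;
};

/* Return the device on numa_id with the most free ctxs, or NULL if none. */
static struct mock_dev *find_dev_by_numa(struct mock_dev_list *list, int numa_id)
{
	struct mock_dev *best = NULL;
	int ctx_max = 0;

	for (; list; list = list->next) {
		if (list->dev->numa_id != numa_id)
			continue;
		/* A device with zero free ctxs is never selected. */
		if (list->dev->avail_ctx > ctx_max) {
			best = list->dev;
			ctx_max = list->dev->avail_ctx;
		}
	}

	return best;
}
```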
@@ -807,6 +836,39 @@ int wd_ctx_set_io_cmd(handle_t h_ctx, unsigned long cmd, void *arg)
 	return ioctl(ctx->fd, cmd, arg);
 }

+struct bitmask *wd_create_device_nodemask(struct uacce_dev_list *list)
+{
+	struct uacce_dev_list *p;
+	struct bitmask *bmp;
+
+	if (!list) {
+		WD_ERR("invalid: list is NULL!\n");
+		return WD_ERR_PTR(-WD_EINVAL);
+	}
+
+	bmp = numa_allocate_nodemask();
+	if (!bmp) {
+		WD_ERR("failed to alloc bitmask(%d)!\n", errno);
+		return WD_ERR_PTR(-WD_EINVAL);
+	}
+
+	p = list;
+	while (p) {
+		numa_bitmask_setbit(bmp, p->dev->numa_id);
+		p = p->next;
+	}
+
+	return bmp;
+}
+
+void wd_free_device_nodemask(struct bitmask *bmp)
+{
+	if (!bmp)
+		return;
+
+	numa_free_nodemask(bmp);
+}
+
 void wd_get_version(void)
 {
 	const char *wd_released_time = UADK_RELEASED_TIME;
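What wd_create_device_nodemask() computes, and what wd_comp_init2() later derives from it with numa_bitmask_weight(), can be shown with a plain 64-bit mask. This is only a sketch: it replaces libnuma's `struct bitmask` with `uint64_t`, assumes node ids below 64, and uses a hypothetical `node_item` list type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical list item carrying only the device's NUMA node id. */
struct node_item {
	int numa_id;
	struct node_item *next;
};

/* Set one bit per NUMA node that has at least one device in the list. */
static uint64_t create_nodemask(const struct node_item *list)
{
	uint64_t mask = 0;

	for (; list; list = list->next) {
		assert(list->numa_id >= 0 && list->numa_id < 64);
		mask |= UINT64_C(1) << list->numa_id;
	}

	return mask;
}

/* Count set bits, i.e. the number of distinct NUMA nodes in the mask. */
static int nodemask_weight(uint64_t mask)
{
	int count = 0;

	while (mask) {
		mask &= mask - 1; /* clear the lowest set bit */
		count++;
	}

	return count;
}
```

Two devices on the same node set the same bit, so the weight counts nodes rather than devices; that is the quantity the patch multiplies by the per-node ctx count.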
diff --git a/wd_comp.c b/wd_comp.c
index 44593a6..ba79838 100644
--- a/wd_comp.c
+++ b/wd_comp.c
@@ -14,6 +14,7 @@

 #include "config.h"
 #include "drv/wd_comp_drv.h"
+#include "wd_sched.h"
 #include "wd_util.h"
 #include "wd_comp.h"

@@ -21,6 +22,8 @@
 #define HW_CTX_SIZE (64 * 1024)
 #define STREAM_CHUNK (128 * 1024)

+#define SCHED_RR_NAME "sched_rr"
+
 #define swap_byte(x) \
 	((((x) & 0x000000ff) << 24) | \
 	(((x) & 0x0000ff00) << 8) | \
@@ -42,6 +45,7 @@ struct wd_comp_sess {

 struct wd_comp_setting {
 	enum wd_status status;
+	enum wd_status status2;
 	struct wd_ctx_config_internal config;
 	struct wd_sched sched;
 	struct wd_comp_driver *driver;
@@ -52,6 +56,10 @@ struct wd_comp_setting {

 struct wd_env_config wd_comp_env_config;

+static struct wd_ctx_config wd_comp_ctx;
+static struct wd_sched *wd_comp_sched;
+static int wd_comp_numa_count;
+
 #ifdef WD_STATIC_DRV
 static void wd_comp_set_static_drv(void)
 {
@@ -178,6 +186,209 @@ void wd_comp_uninit(void)
 	wd_alg_clear_init(&wd_comp_setting.status);
 }

+static int wd_comp_request_ctx(struct uacce_dev_list *list,
+			       struct wd_ctx_nums ctx_nums,
+			       int idx, int numa_id, int op_type)
+{
+	int ctx_set_size = ctx_nums.sync_ctx_num + ctx_nums.async_ctx_num;
+	struct uacce_dev *dev;
+	int i;
+
+	dev = wd_find_dev_by_numa(list, numa_id);
+	if (!dev)
+		return -WD_EBUSY;
+
+	for (i = idx; i < idx + ctx_set_size; i++) {
+		wd_comp_ctx.ctxs[i].ctx = wd_request_ctx(dev);
+		if (errno == WD_EBUSY) {
+			dev = wd_find_dev_by_numa(list, numa_id);
+			if (!dev)
+				return -WD_EBUSY;
+			i--;
+		}
+		wd_comp_ctx.ctxs[i].op_type = op_type;
+		wd_comp_ctx.ctxs[i].ctx_mode =
+			((i - idx) < ctx_nums.sync_ctx_num) ?
+			CTX_MODE_SYNC : CTX_MODE_ASYNC;
+	}
+
+	return 0;
+}
+
+static void wd_comp_release_ctx(void)
+{
+	int i;
+
+	for (i = 0; i < wd_comp_ctx.ctx_num; i++)
+		if (wd_comp_ctx.ctxs[i].ctx) {
+			wd_release_ctx(wd_comp_ctx.ctxs[i].ctx);
+			wd_comp_ctx.ctxs[i].ctx = 0;
+		}
+}
+
+static int wd_comp_instance_sched(struct wd_ctx_nums ctx_nums, int idx,
+				  int numa_id, int op_type)
+{
+	struct sched_params sparams;
+	int i, ret = 0;
+
+	for (i = 0; i < CTX_MODE_MAX; i++) {
+		sparams.numa_id = numa_id;
+		sparams.type = op_type;
+		sparams.mode = i;
+		sparams.begin = idx + ctx_nums.sync_ctx_num * i;
+		sparams.end = idx - 1 + ctx_nums.sync_ctx_num + ctx_nums.async_ctx_num * i;
+		if (sparams.begin > sparams.end)
+			continue;
+		ret = wd_sched_rr_instance(wd_comp_sched, &sparams);
+		if (ret)
+			goto out;
+	}
+
+out:
+	return ret;
+}
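The begin/end arithmetic in wd_comp_instance_sched() is compact, so a worked example may help when checking it. The sketch below reproduces the two expressions with plain `int` values, assuming the sync mode is 0 and the async mode is 1 as the loop over `CTX_MODE_MAX` suggests; `ctx_range` and `ctx_range_for_mode()` are hypothetical names used only for illustration.

```c
#include <assert.h>

/* Assumed mode numbering: sync first, then async. */
enum { MODE_SYNC = 0, MODE_ASYNC = 1 };

/* Inclusive range of ctx indexes; begin > end means the range is empty. */
struct ctx_range {
	int begin;
	int end;
};

/* Same expressions as sparams.begin / sparams.end in the patch. */
static struct ctx_range ctx_range_for_mode(int idx, int sync_num,
					   int async_num, int mode)
{
	struct ctx_range r;

	r.begin = idx + sync_num * mode;
	r.end = idx - 1 + sync_num + async_num * mode;

	return r;
}
```

With `idx = 4`, two sync ctxs and three async ctxs, the sync range is [4, 5] and the async range is [6, 8]. With zero sync ctxs the sync range becomes [idx, idx - 1], which is the empty case that the `begin > end` check skips.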
+
+static int __wd_comp_init2(struct uacce_dev_list *list, struct bitmask *bmp,
+			   struct wd_ctx_params cparams)
+{
+	int ctx_set_num = cparams.ctx_set_num;
+	int max_node = numa_max_node() + 1;
+	struct wd_ctx_nums ctx_nums;
+	int i, j, ret;
+	int idx = 0;
+
+	for (i = 0; i < max_node; i++) {
+		if (!numa_bitmask_isbitset(bmp, i))
+			continue;
+		for (j = 0; j < ctx_set_num; j++) {
+			ctx_nums = cparams.ctx_set_size[j];
+			ret = wd_comp_request_ctx(list, ctx_nums, idx, i, j);
+			if (ret)
+				goto free_ctxs;
+			ret = wd_comp_instance_sched(ctx_nums, idx, i, j);
+			if (ret)
+				goto free_ctxs;
+			idx += (ctx_nums.sync_ctx_num + ctx_nums.async_ctx_num);
+		}
+	}
+
+	ret = wd_comp_init(&wd_comp_ctx, wd_comp_sched);
+	if (ret)
+		goto free_ctxs;
+
+	return 0;
+
+free_ctxs:
+	wd_comp_release_ctx();
+
+	return ret;
+}
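The nested loop in __wd_comp_init2() lays the ctx array out node by node, and within each node set by set. A small sketch of the resulting start index for a given set follows, assuming `node_rank` is the position of the node among the selected (bit-set) nodes; the `ctx_nums` type and `ctx_set_start()` helper are hypothetical and exist only to make the layout checkable.

```c
#include <assert.h>

/* Hypothetical mirror of struct wd_ctx_nums. */
struct ctx_nums {
	unsigned int sync_ctx_num;
	unsigned int async_ctx_num;
};

/*
 * First ctx index used by set `op_type` on the `node_rank`-th selected
 * node, following the order in which the patch advances idx.
 */
static unsigned int ctx_set_start(const struct ctx_nums *sizes,
				  unsigned int set_num,
				  unsigned int node_rank,
				  unsigned int op_type)
{
	unsigned int per_node = 0, before = 0, j;

	assert(op_type < set_num);

	for (j = 0; j < set_num; j++) {
		unsigned int size = sizes[j].sync_ctx_num + sizes[j].async_ctx_num;

		if (j < op_type)
			before += size; /* sets placed earlier on this node */
		per_node += size;       /* total ctxs consumed per node */
	}

	return node_rank * per_node + before;
}
```

For set sizes {2, 1} and {1, 3}, each node consumes 7 ctxs, so the second set on the second selected node starts at index 10.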
+
+static struct wd_ctx_nums comp_default_ctxsize[] = {
+	{1, 1}, {1, 1}, { }
+};
+
+static struct wd_ctx_params comp_default_cparams = {
+	.ctx_set_num = WD_DIR_MAX,
+	.ctx_set_size = comp_default_ctxsize,
+};
+
+int wd_comp_init2(const char *alg_name, __u32 sched_type)
+{
+	struct uacce_dev_list *dev_list = NULL;
+	__u32 ctx_set_num, ctx_set_size;
+	struct bitmask *dev_bmp;
+	bool flag;
+	int ret;
+
+	flag = wd_alg_try_init(&wd_comp_setting.status2);
+	if (!flag)
+		return 0;
+
+	if (!alg_name) {
+		WD_ERR("invalid: alg_name is NULL!\n");
+		ret = -WD_EINVAL;
+		goto out_uninit;
+	}
+
+	dev_list = wd_get_accel_list(alg_name);
+	if (!dev_list) {
+		WD_ERR("invalid: alg_name is not support!\n");
+		ret = -WD_EINVAL;
+		goto out_uninit;
+	}
+
+	dev_bmp = wd_create_device_nodemask(dev_list);
+	if (WD_IS_ERR(dev_bmp)) {
+		ret = WD_PTR_ERR(dev_bmp);
+		goto out_freelist;
+	}
+
+	wd_comp_numa_count = numa_bitmask_weight(dev_bmp);
+	if (!wd_comp_numa_count) {
+		WD_ERR("invalid: bmp is clear!\n");
+		ret = -WD_ENODEV;
+		goto out_freebmp;
+	}
+
+	ctx_set_num = comp_default_cparams.ctx_set_num;
+	ctx_set_size = wd_get_ctx_numbers(comp_default_cparams, ctx_set_num);
+	wd_comp_ctx.ctx_num = ctx_set_size * wd_comp_numa_count;
+	wd_comp_ctx.ctxs = calloc(wd_comp_ctx.ctx_num, sizeof(struct wd_ctx));
+	if (!wd_comp_ctx.ctxs) {
+		ret = -WD_ENOMEM;
+		WD_ERR("failed to alloc ctxs!\n");
+		goto out_freebmp;
+	}
+
+	wd_comp_sched = wd_sched_rr_alloc(sched_type, ctx_set_num,
+					  numa_max_node() + 1, wd_comp_poll_ctx);
+	if (!wd_comp_sched) {
+		ret = -WD_EINVAL;
+		goto out_freectxs;
+	}
+	wd_comp_sched->name = SCHED_RR_NAME;
+
+	ret = __wd_comp_init2(dev_list, dev_bmp, comp_default_cparams);
+	if (ret)
+		goto out_freesched;
+
+	wd_free_device_nodemask(dev_bmp);
+	wd_free_list_accels(dev_list);
+
+	wd_alg_set_init(&wd_comp_setting.status2);
+
+	return ret;
+
+out_freesched:
+	wd_sched_rr_release(wd_comp_sched);
+	wd_comp_sched = NULL;
+
+out_freectxs:
+	free(wd_comp_ctx.ctxs);
+	wd_comp_ctx.ctxs = NULL;
+
+out_freebmp:
+	wd_free_device_nodemask(dev_bmp);
+
+out_freelist:
+	wd_free_list_accels(dev_list);
+
+out_uninit:
+	wd_alg_clear_init(&wd_comp_setting.status2);
+
+	return ret;
+}
+
+void wd_comp_uninit2(void)
+{
+	wd_comp_uninit();
+	wd_comp_release_ctx();
+	wd_sched_rr_release(wd_comp_sched);
+	wd_alg_clear_init(&wd_comp_setting.status2);
+}
+
 struct wd_comp_msg *wd_comp_get_msg(__u32 idx, __u32 tag)
 {
 	return wd_find_msg_in_pool(&wd_comp_setting.pool, idx, tag);
@@ -289,6 +500,7 @@ handle_t wd_comp_alloc_sess(struct wd_comp_sess_setup *setup)
 	sess->comp_lv = setup->comp_lv;
 	sess->win_sz = setup->win_sz;
 	sess->stream_pos = WD_COMP_STREAM_NEW;
+
 	/* Some simple scheduler don't need scheduling parameters */
 	sess->sched_key = (void *)wd_comp_setting.sched.sched_init(
 		wd_comp_setting.sched.h_sched_ctx, setup->sched_param);
@@ -318,6 +530,7 @@ void wd_comp_free_sess(handle_t h_sess)

 	if (sess->sched_key)
 		free(sess->sched_key);
+
 	free(sess);
 }

diff --git a/wd_util.c b/wd_util.c
index efc0d41..471ca07 100644
--- a/wd_util.c
+++ b/wd_util.c
@@ -5,7 +5,6 @@
  */

 #define _GNU_SOURCE
-#include <numa.h>
 #include <pthread.h>
 #include <semaphore.h>
 #include <string.h>
@@ -1801,3 +1800,61 @@ bool wd_alg_try_init(enum wd_status *status)

 	return true;
 }
+
+struct uacce_dev_list *wd_get_usable_list(struct uacce_dev_list *list, struct bitmask *bmp)
+{
+	struct uacce_dev_list *p, *node, *result = NULL;
+	struct uacce_dev *dev;
+	int numa_id, ret;
+
+	p = list;
+	while (p) {
+		dev = p->dev;
+		numa_id = dev->numa_id;
+		ret = numa_bitmask_isbitset(bmp, numa_id);
+		if (!ret) {
+			p = p->next;
+			continue;
+		}
+
+		node = calloc(1, sizeof(*node));
+		if (!node) {
+			result = WD_ERR_PTR(-WD_ENOMEM);
+			goto out_free_list;
+		}
+
+		node->dev = wd_clone_dev(dev);
+		if (!node->dev) {
+			result = WD_ERR_PTR(-WD_ENOMEM);
+			goto out_free_node;
+		}
+
+		if (!result)
+			result = node;
+		else
+			wd_add_dev_to_list(result, node);
+
+		p = p->next;
+	}
+
+	return result;
+
+out_free_node:
+	free(node);
+out_free_list:
+	wd_free_list_accels(result);
+	return result;
+}
+
+__u32 wd_get_ctx_numbers(struct wd_ctx_params cparams, int end)
+{
+	__u32 count = 0;
+	int i;
+
+	for (i = 0; i < end; i++) {
+		count += cparams.ctx_set_size[i].sync_ctx_num;
+		count += cparams.ctx_set_size[i].async_ctx_num;
+	}
+
+	return count;
+}
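wd_get_ctx_numbers() determines how many ctx slots wd_comp_init2() allocates: the per-node sum is multiplied by the number of selected NUMA nodes. The sketch below mirrors that calculation with hypothetical `ctx_nums` and `ctx_params` types, and assumes two ctx sets (one per direction) with the patch's default of one sync and one async ctx each.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirrors of struct wd_ctx_nums / struct wd_ctx_params. */
struct ctx_nums {
	uint32_t sync_ctx_num;
	uint32_t async_ctx_num;
};

struct ctx_params {
	uint32_t ctx_set_num;
	const struct ctx_nums *ctx_set_size;
};

/* Sum the sync and async ctx counts of the first `end` sets. */
static uint32_t count_ctxs(const struct ctx_params *params, uint32_t end)
{
	uint32_t count = 0, i;

	assert(end <= params->ctx_set_num);

	for (i = 0; i < end; i++) {
		count += params->ctx_set_size[i].sync_ctx_num;
		count += params->ctx_set_size[i].async_ctx_num;
	}

	return count;
}
```

With the defaults this gives 4 ctxs per node, so a list whose devices sit on two NUMA nodes would lead to 8 allocated ctx slots.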
On 2022/9/24 18:18, Yang Shen wrote:
For performance reasons, uadk leaves many configuration options to users. This gives users great flexibility, but it also makes the current initialization interface complex. Therefore, to help users adopt uadk quickly, a new set of interfaces is provided.

'wd_alg_init2()' completes all initialization steps. Four parameters describe the user's configuration requirements:
@device_list: The list of available uacce devices. Users can get it with wd_get_accel_list().
@numa_bitmask: The bitmask provided by libnuma. Users can use this parameter to control which NUMA nodes' devices ctxs are requested from when binding to NUMA nodes.
@ctx_nums: The number of ctxs requested for each NUMA node. Because users may need different numbers of ctxs for different ctx types, a two-dimensional array is required as input.
@sched_type: The scheduling type the user wants to use.
Signed-off-by: Yang Shen <shenyang39@huawei.com>
---
 Makefile.am             |   4 +-
 include/wd.h            |  24 +++++
 include/wd_alg_common.h |  24 +++++
 include/wd_comp.h       |  27 +++++
 include/wd_util.h       |  19 ++++
 wd.c                    |  62 ++++++++++++
 wd_comp.c               | 213 ++++++++++++++++++++++++++++++++++++++++
 wd_util.c               |  59 ++++++++++-
 8 files changed, 429 insertions(+), 3 deletions(-)

diff --git a/Makefile.am b/Makefile.am
index b3f07df..6cfb6b3 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -86,7 +86,7 @@ AM_CFLAGS += -DWD_NO_LOG

 libwd_la_LIBADD = $(libwd_la_OBJECTS) -lnuma

-libwd_comp_la_LIBADD = $(libwd_la_OBJECTS) -ldl
+libwd_comp_la_LIBADD = $(libwd_la_OBJECTS) -ldl -lnuma
 libwd_comp_la_DEPENDENCIES = libwd.la

 libhisi_zip_la_LIBADD = -ldl
@@ -103,7 +103,7 @@ else
 libwd_la_LDFLAGS=$(UADK_VERSION)
 libwd_la_LIBADD= -lnuma

-libwd_comp_la_LIBADD= -lwd -ldl
+libwd_comp_la_LIBADD= -lwd -ldl -lnuma
 libwd_comp_la_LDFLAGS=$(UADK_VERSION)
 libwd_comp_la_DEPENDENCIES= libwd.la

diff --git a/include/wd.h b/include/wd.h
index e1a87de..facd992 100644
--- a/include/wd.h
+++ b/include/wd.h
@@ -348,6 +348,16 @@ int wd_get_avail_ctx(struct uacce_dev *dev);
  */
 struct uacce_dev_list *wd_get_accel_list(const char *alg_name);

+/**
+ * wd_find_dev_by_numa() - get device with max available ctx number from an
+ * device list according to numa id.
+ * @list: The device list.
+ * @numa_id: The numa_id.
+ *
+ * Return device if succeed and other error number if fail.
+ */
+struct uacce_dev *wd_find_dev_by_numa(struct uacce_dev_list *list, int numa_id);
+
 /**
  * wd_get_accel_dev() - Get device supporting the algorithm with
 			smallest numa distance to current numa node.
@@ -523,6 +533,20 @@ struct uacce_dev *wd_clone_dev(struct uacce_dev *dev);
  */
 void wd_add_dev_to_list(struct uacce_dev_list *head, struct uacce_dev_list *node);

+/**
+ * wd_create_device_nodemask() - create a numa node mask of device list.
+ * @list: The devices list.
+ *
+ * Return a pointer value if succeed, and error number if fail.
+ */
+struct bitmask *wd_create_device_nodemask(struct uacce_dev_list *list);
+
+/**
+ * wd_free_device_nodemask() - free a numa node mask.
+ * @bmp: A numa node mask.
+ */
+void wd_free_device_nodemask(struct bitmask *bmp);
+
 /**
  * wd_ctx_get_dev_name() - Get the device name about task.
  * @h_ctx: The handle of context.
diff --git a/include/wd_alg_common.h b/include/wd_alg_common.h index c455dc3..f261830 100644 --- a/include/wd_alg_common.h +++ b/include/wd_alg_common.h @@ -63,6 +63,30 @@ struct wd_ctx_config { void *priv; };
+/**
- struct wd_ctx_nums - Define the ctx sets numbers.
- @sync_ctx_num: The ctx numbers which are used for sync mode for each
- ctx sets.
- @async_ctx_num: The ctx numbers which are used for async mode for each
- ctx sets.
- */
+struct wd_ctx_nums {
- __u32 sync_ctx_num;
- __u32 async_ctx_num;
+};
+/**
- struct wd_ctx_params - Define the ctx sets params which are used for init
- algorithms.
- @ctx_set_num: Number of ctx sets to be created. Usually users can
- set it according to <alg>_op_type.
- @ctx_set_size: Each ctx sets numbers.
- */
+struct wd_ctx_params {
- __u32 ctx_set_num;
- struct wd_ctx_nums *ctx_set_size;
+};
struct wd_ctx_internal { handle_t ctx; __u8 op_type; diff --git a/include/wd_comp.h b/include/wd_comp.h index e043a83..9cd50dd 100644 --- a/include/wd_comp.h +++ b/include/wd_comp.h @@ -7,6 +7,7 @@ #ifndef __WD_COMP_H #define __WD_COMP_H
+#include <numa.h> #include "wd.h" #include "wd_alg_common.h"
@@ -113,6 +114,32 @@ int wd_comp_init(struct wd_ctx_config *config, struct wd_sched *sched); */ void wd_comp_uninit(void);
+/**
+ * wd_comp_init2() - A simplified interface to initialize uadk
+ * compression/decompression. Users can use wd_get_accel_list() to
+ * get the usable device list for the algorithms. Users should provide
+ * a device numa node mask to show which numa devices will be
+ * selected. wd_create_device_nodemask() can create a node mask
+ * according to the list. If all numa devices on the list match
+ * the requirement, just use the return of it. Otherwise, users can
+ * use the functions in libnuma to set the node mask.
+ * To make the initialization simpler, bmp and cparams may be set to NULL,
+ * and the function will then set them to default values.
+ * @list: The device list.
+ * @bmp: Node mask of the required devices.
+ * @cparams: The ctx settings.
+ * @sched_type: The scheduler type.
+ * Return 0 if succeed and others if fail.
+ */
+int wd_comp_init2(const char *alg_name, __u32 sched_type);
+
+/**
+ * wd_comp_uninit2() - Uninitialise ctx configuration and scheduler.
+ */
+void wd_comp_uninit2(void);
+
 struct wd_comp_sess_setup {
 	enum wd_comp_alg_type alg_type; /* Denoted by enum wd_comp_alg_type */
 	enum wd_comp_level comp_lv; /* Denoted by enum wd_comp_level */
diff --git a/include/wd_util.h b/include/wd_util.h
index eafe3ce..4a2e102 100644
--- a/include/wd_util.h
+++ b/include/wd_util.h
@@ -7,6 +7,7 @@
 #ifndef __WD_UTIL_H
 #define __WD_UTIL_H

+#include <numa.h>
 #include <stdbool.h>
 #include <sys/ipc.h>
 #include <sys/shm.h>
@@ -394,6 +395,24 @@ static inline void wd_alg_clear_init(enum wd_status *status)
 	__atomic_store(status, &setting, __ATOMIC_RELAXED);
 }

+/**
+ * wd_get_usable_list() - choose the devices according bitmask.
+ * @list: The device list.
+ * @bmp: The devices node mask.
+ * Return a list that meet user's requirement if succeed, and error number if fail.
+ */
+struct uacce_dev_list *wd_get_usable_list(struct uacce_dev_list *list, struct bitmask *bmp);
+
+/**
+ * wd_get_ctx_numbers() - count the ctx number for first to end.
+ * @cparams: the input ctx setting numbers.
+ * @end: the end index of cparams.
+ * Return the sum of top '@end' cparams ctx number.
+ */
+__u32 wd_get_ctx_numbers(struct wd_ctx_params cparams, int end);
+
 /**
  * wd_dfx_msg_cnt() - Message counter interface for ctx
  * @msg: Shared memory addr.
diff --git a/wd.c b/wd.c
index d99d4ec..c63805c 100644
--- a/wd.c
+++ b/wd.c
@@ -741,6 +741,35 @@ free_list:
 	return NULL;
 }

+struct uacce_dev *wd_find_dev_by_numa(struct uacce_dev_list *list, int numa_id)
+{
+	struct uacce_dev *dev = WD_ERR_PTR(-WD_ENODEV);
+	struct uacce_dev_list *p = list;
+	int ctx_num, ctx_max = 0;
+
+	if (!list) {
+		WD_ERR("invalid: list is NULL!\n");
+		return WD_ERR_PTR(-WD_EINVAL);
+	}
+
+	while (p) {
+		if (numa_id != p->dev->numa_id) {
+			p = p->next;
+			continue;
+		}
+
+		ctx_num = wd_get_avail_ctx(p->dev);
+		if (ctx_num > ctx_max) {
+			dev = p->dev;
+			ctx_max = ctx_num;
+		}
+
+		p = p->next;
+	}
+
+	return dev;
+}
+
 void wd_free_list_accels(struct uacce_dev_list *list)
 {
 	struct uacce_dev_list *curr, *next;
@@ -807,6 +836,39 @@ int wd_ctx_set_io_cmd(handle_t h_ctx, unsigned long cmd, void *arg)
 	return ioctl(ctx->fd, cmd, arg);
 }

+struct bitmask *wd_create_device_nodemask(struct uacce_dev_list *list)
+{
+	struct uacce_dev_list *p;
+	struct bitmask *bmp;
+
+	if (!list) {
+		WD_ERR("invalid: list is NULL!\n");
+		return WD_ERR_PTR(-WD_EINVAL);
+	}
+
+	bmp = numa_allocate_nodemask();
+	if (!bmp) {
+		WD_ERR("failed to alloc bitmask(%d)!\n", errno);
+		return WD_ERR_PTR(-WD_EINVAL);
+	}
+
+	p = list;
+	while (p) {
+		numa_bitmask_setbit(bmp, p->dev->numa_id);
+		p = p->next;
+	}
+
+	return bmp;
+}
+
+void wd_free_device_nodemask(struct bitmask *bmp)
+{
+	if (!bmp)
+		return;
+
+	numa_free_nodemask(bmp);
+}
+
 void wd_get_version(void)
 {
 	const char *wd_released_time = UADK_RELEASED_TIME;
diff --git a/wd_comp.c b/wd_comp.c
index 44593a6..ba79838 100644
--- a/wd_comp.c
+++ b/wd_comp.c
@@ -14,6 +14,7 @@

 #include "config.h"
 #include "drv/wd_comp_drv.h"
+#include "wd_sched.h"
 #include "wd_util.h"
 #include "wd_comp.h"

@@ -21,6 +22,8 @@
 #define HW_CTX_SIZE (64 * 1024)
 #define STREAM_CHUNK (128 * 1024)

+#define SCHED_RR_NAME "sched_rr"
+
 #define swap_byte(x) \
 	((((x) & 0x000000ff) << 24) | \
 	(((x) & 0x0000ff00) << 8) | \
@@ -42,6 +45,7 @@ struct wd_comp_sess {

 struct wd_comp_setting {
 	enum wd_status status;
+	enum wd_status status2;
 	struct wd_ctx_config_internal config;
 	struct wd_sched sched;
 	struct wd_comp_driver *driver;
@@ -52,6 +56,10 @@ struct wd_comp_setting {

 struct wd_env_config wd_comp_env_config;

+static struct wd_ctx_config wd_comp_ctx;
+static struct wd_sched *wd_comp_sched;
+static int wd_comp_numa_count;
+
 #ifdef WD_STATIC_DRV
 static void wd_comp_set_static_drv(void)
 {
@@ -178,6 +186,209 @@ void wd_comp_uninit(void)
 	wd_alg_clear_init(&wd_comp_setting.status);
 }

+static int wd_comp_request_ctx(struct uacce_dev_list *list,
+			       struct wd_ctx_nums ctx_nums,
+			       int idx, int numa_id, int op_type)
+{
+	int ctx_set_size = ctx_nums.sync_ctx_num + ctx_nums.async_ctx_num;
+	struct uacce_dev *dev;
+	int i;
+
+	dev = wd_find_dev_by_numa(list, numa_id);
+	if (!dev)
+		return -WD_EBUSY;
+
+	for (i = idx; i < idx + ctx_set_size; i++) {
+		wd_comp_ctx.ctxs[i].ctx = wd_request_ctx(dev);
+		if (errno == WD_EBUSY) {
+			dev = wd_find_dev_by_numa(list, numa_id);
+			if (!dev)
+				return -WD_EBUSY;
+			i--;
+		}
+		wd_comp_ctx.ctxs[i].op_type = op_type;
+		wd_comp_ctx.ctxs[i].ctx_mode =
+			((i - idx) < ctx_nums.sync_ctx_num) ?
+			CTX_MODE_SYNC : CTX_MODE_ASYNC;
+	}
+
+	return 0;
+}
+
+static void wd_comp_release_ctx(void)
+{
+	int i;
+
+	for (i = 0; i < wd_comp_ctx.ctx_num; i++)
+		if (wd_comp_ctx.ctxs[i].ctx) {
+			wd_release_ctx(wd_comp_ctx.ctxs[i].ctx);
+			wd_comp_ctx.ctxs[i].ctx = 0;
+		}
+}
+
+static int wd_comp_instance_sched(struct wd_ctx_nums ctx_nums, int idx,
+				  int numa_id, int op_type)
+{
+	struct sched_params sparams;
+	int i, ret = 0;
+
+	for (i = 0; i < CTX_MODE_MAX; i++) {
+		sparams.numa_id = numa_id;
+		sparams.type = op_type;
+		sparams.mode = i;
+		sparams.begin = idx + ctx_nums.sync_ctx_num * i;
+		sparams.end = idx - 1 + ctx_nums.sync_ctx_num + ctx_nums.async_ctx_num * i;
+		if (sparams.begin > sparams.end)
+			continue;
+		ret = wd_sched_rr_instance(wd_comp_sched, &sparams);
+		if (ret)
+			goto out;
+	}
+
+out:
+	return ret;
+}
+
+static int __wd_comp_init2(struct uacce_dev_list *list, struct bitmask *bmp,
+			   struct wd_ctx_params cparams)
+{
+	int ctx_set_num = cparams.ctx_set_num;
+	int max_node = numa_max_node() + 1;
+	struct wd_ctx_nums ctx_nums;
+	int i, j, ret;
+	int idx = 0;
+
+	for (i = 0; i < max_node; i++) {
+		if (!numa_bitmask_isbitset(bmp, i))
+			continue;
+		for (j = 0; j < ctx_set_num; j++) {
+			ctx_nums = cparams.ctx_set_size[j];
+			ret = wd_comp_request_ctx(list, ctx_nums, idx, i, j);
+			if (ret)
+				goto free_ctxs;
+			ret = wd_comp_instance_sched(ctx_nums, idx, i, j);
+			if (ret)
+				goto free_ctxs;
+			idx += (ctx_nums.sync_ctx_num + ctx_nums.async_ctx_num);
+		}
+	}
+
+	ret = wd_comp_init(&wd_comp_ctx, wd_comp_sched);
+	if (ret)
+		goto free_ctxs;
+
+	return 0;
+
+free_ctxs:
+	wd_comp_release_ctx();
+	return ret;
+}
+
+static struct wd_ctx_nums comp_default_ctxsize[] = {
+	{1, 1}, {1, 1}, { }
+};
+
+static struct wd_ctx_params comp_default_cparams = {
+	.ctx_set_num = WD_DIR_MAX,
+	.ctx_set_size = comp_default_ctxsize,
+};
This way of implementing the default parameters does not seem very suitable. Would it be better to pass in a setup parameter structure, or to get the defaults directly from the driver?
Thanks,
Longfang
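To make the first option in the comment above a bit more concrete, here is a rough sketch of how a caller-supplied setup structure could fall back to built-in defaults. It is only an illustration under assumptions: the `ctx_nums` and `ctx_params` types are local stand-ins that mirror the patch's `wd_ctx_nums` and `wd_ctx_params`, the helper names `resolve_ctx_params()` and `ctxs_per_node()` are hypothetical, and the default of two ctx sets is assumed here in place of `WD_DIR_MAX`. Nothing in it is existing uadk API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for the patch's wd_ctx_nums / wd_ctx_params (not the uadk headers). */
struct ctx_nums {
	uint32_t sync_ctx_num;
	uint32_t async_ctx_num;
};

struct ctx_params {
	uint32_t ctx_set_num;
	const struct ctx_nums *ctx_set_size;
};

/* Defaults as in the patch: one sync and one async ctx per set; two sets are assumed. */
static const struct ctx_nums default_set_size[] = { {1, 1}, {1, 1} };
static const struct ctx_params default_params = {
	.ctx_set_num = 2,
	.ctx_set_size = default_set_size,
};

/* Hypothetical helper: prefer the caller's setup, fall back to the defaults. */
static const struct ctx_params *resolve_ctx_params(const struct ctx_params *user)
{
	if (!user || !user->ctx_set_num || !user->ctx_set_size)
		return &default_params;
	return user;
}

/* Number of ctxs that one NUMA node needs for the resolved parameters. */
static uint32_t ctxs_per_node(const struct ctx_params *user)
{
	const struct ctx_params *p = resolve_ctx_params(user);
	uint32_t i, count = 0;

	assert(p != NULL);
	for (i = 0; i < p->ctx_set_num; i++)
		count += p->ctx_set_size[i].sync_ctx_num +
			 p->ctx_set_size[i].async_ctx_num;
	return count;
}
```

With this shape, passing NULL keeps today's behaviour, while a caller that needs more queues only fills in one small structure. Whether the defaults should instead come from the driver is a separate question that this sketch does not answer.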
+int wd_comp_init2(const char *alg_name, __u32 sched_type)
+{
+	struct uacce_dev_list *dev_list = NULL;
+	__u32 ctx_set_num, ctx_set_size;
+	struct bitmask *dev_bmp;
+	bool flag;
+	int ret;
+
+	flag = wd_alg_try_init(&wd_comp_setting.status2);
+	if (!flag)
+		return 0;
+
+	if (!alg_name) {
+		WD_ERR("invalid: alg_name is NULL!\n");
+		ret = -WD_EINVAL;
+		goto out_uninit;
+	}
+
+	dev_list = wd_get_accel_list(alg_name);
+	if (!dev_list) {
+		WD_ERR("invalid: alg_name is not support!\n");
+		ret = -WD_EINVAL;
+		goto out_uninit;
+	}
+
+	dev_bmp = wd_create_device_nodemask(dev_list);
+	if (WD_IS_ERR(dev_bmp)) {
+		ret = WD_PTR_ERR(dev_bmp);
+		goto out_freelist;
+	}
+
+	wd_comp_numa_count = numa_bitmask_weight(dev_bmp);
+	if (!wd_comp_numa_count) {
+		WD_ERR("invalid: bmp is clear!\n");
+		ret = -WD_ENODEV;
+		goto out_freebmp;
+	}
+
+	ctx_set_num = comp_default_cparams.ctx_set_num;
+	ctx_set_size = wd_get_ctx_numbers(comp_default_cparams, ctx_set_num);
+	wd_comp_ctx.ctx_num = ctx_set_size * wd_comp_numa_count;
+	wd_comp_ctx.ctxs = calloc(wd_comp_ctx.ctx_num, sizeof(struct wd_ctx));
+	if (!wd_comp_ctx.ctxs) {
+		ret = -WD_ENOMEM;
+		WD_ERR("failed to alloc ctxs!\n");
+		goto out_freebmp;
+	}
+
+	wd_comp_sched = wd_sched_rr_alloc(sched_type, ctx_set_num,
+					  numa_max_node() + 1, wd_comp_poll_ctx);
+	if (!wd_comp_sched) {
+		ret = -WD_EINVAL;
+		goto out_freectxs;
+	}
+	wd_comp_sched->name = SCHED_RR_NAME;
+
+	ret = __wd_comp_init2(dev_list, dev_bmp, comp_default_cparams);
+	if (ret)
+		goto out_freesched;
+
+	wd_free_device_nodemask(dev_bmp);
+	wd_free_list_accels(dev_list);
+
+	wd_alg_set_init(&wd_comp_setting.status2);
+
+	return ret;
+
+out_freesched:
+	wd_sched_rr_release(wd_comp_sched);
+	wd_comp_sched = NULL;
+out_freectxs:
+	free(wd_comp_ctx.ctxs);
+	wd_comp_ctx.ctxs = NULL;
+out_freebmp:
+	wd_free_device_nodemask(dev_bmp);
+out_freelist:
+	wd_free_list_accels(dev_list);
+out_uninit:
+	wd_alg_clear_init(&wd_comp_setting.status2);
+	return ret;
+}
+
+void wd_comp_uninit2(void)
+{
+	wd_comp_uninit();
+	wd_comp_release_ctx();
+	wd_sched_rr_release(wd_comp_sched);
+	wd_alg_clear_init(&wd_comp_setting.status2);
+}
+
 struct wd_comp_msg *wd_comp_get_msg(__u32 idx, __u32 tag)
 {
 	return wd_find_msg_in_pool(&wd_comp_setting.pool, idx, tag);
@@ -289,6 +500,7 @@ handle_t wd_comp_alloc_sess(struct wd_comp_sess_setup *setup)
 	sess->comp_lv = setup->comp_lv;
 	sess->win_sz = setup->win_sz;
 	sess->stream_pos = WD_COMP_STREAM_NEW;
+
 	/* Some simple scheduler don't need scheduling parameters */
 	sess->sched_key = (void *)wd_comp_setting.sched.sched_init(
 		wd_comp_setting.sched.h_sched_ctx, setup->sched_param);
@@ -318,6 +530,7 @@ void wd_comp_free_sess(handle_t h_sess)

 	if (sess->sched_key)
 		free(sess->sched_key);
+
 	free(sess);
 }

diff --git a/wd_util.c b/wd_util.c
index efc0d41..471ca07 100644
--- a/wd_util.c
+++ b/wd_util.c
@@ -5,7 +5,6 @@
  */

 #define _GNU_SOURCE
-#include <numa.h>
 #include <pthread.h>
 #include <semaphore.h>
 #include <string.h>
@@ -1801,3 +1800,61 @@ bool wd_alg_try_init(enum wd_status *status)

 	return true;
 }
+
+struct uacce_dev_list *wd_get_usable_list(struct uacce_dev_list *list, struct bitmask *bmp)
+{
+	struct uacce_dev_list *p, *node, *result = NULL;
+	struct uacce_dev *dev;
+	int numa_id, ret;
+
+	p = list;
+	while (p) {
+		dev = p->dev;
+		numa_id = dev->numa_id;
+		ret = numa_bitmask_isbitset(bmp, numa_id);
+		if (!ret) {
+			p = p->next;
+			continue;
+		}
+
+		node = calloc(1, sizeof(*node));
+		if (!node) {
+			result = WD_ERR_PTR(-WD_ENOMEM);
+			goto out_free_list;
+		}
+
+		node->dev = wd_clone_dev(dev);
+		if (!node->dev) {
+			result = WD_ERR_PTR(-WD_ENOMEM);
+			goto out_free_node;
+		}
+
+		if (!result)
+			result = node;
+		else
+			wd_add_dev_to_list(result, node);
+
+		p = p->next;
+	}
+
+	return result;
+
+out_free_node:
+	free(node);
+out_free_list:
+	wd_free_list_accels(result);
+	return result;
+}
+
+__u32 wd_get_ctx_numbers(struct wd_ctx_params cparams, int end)
+{
+	__u32 count = 0;
+	int i;
+
+	for (i = 0; i < end; i++) {
+		count += cparams.ctx_set_size[i].sync_ctx_num;
+		count += cparams.ctx_set_size[i].async_ctx_num;
+	}
+
+	return count;
+}
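
For readers following the index arithmetic in wd_comp_instance_sched() and wd_get_ctx_numbers(), the layout can be checked in isolation. The sketch below is a stand-alone illustration, not uadk code: `ctx_nums`, `ctx_region`, `ctx_numbers()` and `ctx_region_for_mode()` are hypothetical names, and the mode values 0 (sync) and 1 (async) are assumed to match the order the patch iterates over. Within one ctx set the sync ctxs come first and the async ctxs follow, and both bounds are inclusive.

```c
#include <assert.h>
#include <stdint.h>

/* Local mirror of the structure added by the patch; not the uadk header. */
struct ctx_nums {
	uint32_t sync_ctx_num;
	uint32_t async_ctx_num;
};

/* Assumed mode numbering: sync first, async second. */
enum { MODE_SYNC = 0, MODE_ASYNC = 1 };

struct ctx_region {
	int begin;	/* first ctx index of the region */
	int end;	/* last ctx index of the region, inclusive */
};

/* Sum of sync + async ctxs over the first 'end' ctx sets. */
static uint32_t ctx_numbers(const struct ctx_nums *sets, int end)
{
	uint32_t count = 0;
	int i;

	for (i = 0; i < end; i++)
		count += sets[i].sync_ctx_num + sets[i].async_ctx_num;
	return count;
}

/* Same arithmetic as the patch: region of one mode inside the set starting at idx. */
static struct ctx_region ctx_region_for_mode(struct ctx_nums nums, int idx, int mode)
{
	struct ctx_region r;

	assert(mode == MODE_SYNC || mode == MODE_ASYNC);
	r.begin = idx + (int)nums.sync_ctx_num * mode;
	r.end = idx - 1 + (int)nums.sync_ctx_num + (int)nums.async_ctx_num * mode;
	return r;
}
```

For a set with 2 sync and 3 async ctxs starting at index 0, the sync region is [0, 1] and the async region is [2, 4]. When a count is zero, `begin` ends up greater than `end`, which is the case the patch skips with `continue`.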
Due to the complexity of wd_alg_init, add wd_alg_init2 interface for users. And add the design documents.
Signed-off-by: Yang Shen <shenyang39@huawei.com>
---
 docs/wd_alg_init2.md | 176 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 176 insertions(+)
 create mode 100644 docs/wd_alg_init2.md
diff --git a/docs/wd_alg_init2.md b/docs/wd_alg_init2.md
new file mode 100644
index 0000000..3fb570c
--- /dev/null
+++ b/docs/wd_alg_init2.md
@@ -0,0 +1,176 @@
+# wd_alg_init2
+
+## Preface
+
+The current uadk initialization process is:
+1.Call wd_request_ctx() to request ctxs from devices.
+2.Call wd_sched_rr_alloc() to create a sched(or some other scheduler alloc function if exists).
+3.Initialize the sched.
+4.Call wd_alg_init() with ctx_config and sched.
+
+```flow
+st=>start: Start
+o1=>operation: request ctxs
+o2=>operation: create uadk_sched and instance ctxs to sched region
+o3=>operation: call wd_alg_init
+e=>end
+st->o1->o2->o3->e
+```
+
+Logic is reasonable. But in practice, the step of `wd_request_ctx()`
+and `wd_sched_rr_alloc()` are very tedious. This makes it difficult
+for users to use the interface. One of the main reasons for this is
+that uadk has made a lot of configurations in the scheduler in order
+to provide users with better performance. Based on this consideration,
+the current uadk requires the user to arrange the division of hardware
+resources according to the device topology during initialization.
+Therefore, as a high-level interface, this scheme can provide customized
+scheme configuration for users with deep needs.
+
+## wd_alg_init2
+
+### Design
+
+Is there any way to simplify these steps? Not currently. Because the
+architecture model designed by uadk is to manage hardware resources
+through a scheduler, users can no longer perceive after specifying
+hardware resources, and all subsequent tasks are handled by the scheduler.
+The original intention of this design is to make the scenarios supported
+by uadk more flexible. Because the resource requirements of different
+business scenarios are different from the task model of the business
+itself, the best performance experience can be obtained through the
+scheduler to match.
+
+But we can try to provide a layer of encapsulation. The original design
+intention of this layer of encapsulation is that users only need to
+specify available resources and requirements, and the configuration of
+resources is completed internally by the interface. Because the previous
+interface complexity mainly lies in the parameter configuration of CTX
+and scheduler, it is easy for users to make configuration errors and
+generate bugs because of their misunderstanding of parameters.
+
+All algorithms have the same input parameters and initialization logic.
+
+```c
+struct wd_ctx_config {
+	__u32 ctx_num;
+	struct wd_ctx *ctxs;
+	void *priv;
+};
+
+struct wd_sched {
+	const char *name;
+	int sched_policy;
+	handle_t (*sched_init)(handle_t h_sched_ctx, void *sched_param);
+	__u32 (*pick_next_ctx)(handle_t h_sched_ctx, void *sched_key,
+			       const int sched_mode);
+	int (*poll_policy)(handle_t h_sched_ctx, __u32 expect, __u32 *count);
+	handle_t h_sched_ctx;
+};
+
+int wd_alg_init(struct wd_ctx_config *config, struct wd_sched *sched);
+```
+
+`wd_ctx_config` is the requested ctxs descriptor, and the attributes
+of ctxs are contained in their own structure. The attributes will be
+used in scheduler for picking ctx according to request type. The main
+difficulty in this step is that users need to apply for CTXs from the
+appropriate device nodes according to their own business distribution.
+If the user does not consider the appropriate device distribution,
+it may lead to cross chip or cross numa node which will affect
+performance.
+
+`wd_sched` is the scheduler descriptor of the request. It will create
+the scheduling domain based parameters passed by the users. User needs
+to allocate the ctxs applied to the scheduling domain that meets the
+attribute, so that uadk can select the appropriate ctxs according to
+the issued business. The main difficulty in this step is that the user
+needs to initialize the correct scheduling domain according to the ctxs
+attributes previously applied. However, there are many attributes of
+ctxs here, which should be divided by multiple dimensions. If the
+parameters are not understood enough, it is easy to make queue
+allocation errors, resulting in the scheduling of the wrong ctxs when
+the task is finally issued, and cause unexpected errors.
+
+Therefore, the next thing to be done is to use limited and easy-to-use
+input parameters to describe users' requirements on the two input
+parameters, ensuring that the functions of the new interface init2
+are the same as those of init. For ease of description, v1 is used
+to refer to the existing interface, and v2 is used to refer to the
+layer of encapsulation.
+
+Let's clarify the following logic first: all uacce devices under a
+numa node can be regarded as the same. So although we request for
+ctxs from the device, we manage ctxs according to numa nodes.
+That means if users want to get the same performance for all cpu,
+the uadk configure should be same for all numa node.
+
+At present, at least 4 parameters are required to meet the user
+configuration requirements with the V1 interface function remains
+unchanged.
+
+@device_list: The available uacce device list. Users can get it by
+`wd_get_accel_list()`.
+
+@numa_bitmask: The bitmask provided by libnuma. Users can use this
+parameter to control requesting ctxs devices in the bind NUMA scenario.
+This parameter is mainly convenient for users to use in the binding
+cpu scenario. It can avoid resource waste or initialization failure
+caused by insufficient resources. Libnuma provides a complete operation
+interface which can be found in numa.h.
+
+@ctx_nums: The requested ctx number for each numa node. Due to users
+may have different requirements for different types of ctx numbers,
+needs a two-dimensional array as input.
+
+@sched_type: Scheduling type the user wants to use.
+
+To sum up, the wd_alg_init2 is as follows
+
+```c
+struct wd_ctx_nums {
+	__u32 sync_ctx_num;
+	__u32 async_ctx_num;
+};
+
+struct wd_ctx_params {
+	__u32 ctx_set_num;
+	struct wd_ctx_nums *ctx_set_size;
+};
+
+int wd_alg_init2(struct uacce_dev_list *list, struct bitmask *bmp,
+		 struct wd_ctx_params *cparams, __u32 sched_type);
+```
+
+Somebody may say that the wd_alg_init2 is still complex for three
+input parameters are structure. So the interface support default value
+for some parameters. The @bmp can be set as NULL, and then it will be
+initialized according to device list. The @cparams can be set as NULL,
+and it has a default value in wd_alg.c. The @list and sched_type are
+necessary.
+
+What's more, uadk provides a new set of interface to get device list
+bit mask.
+
+```c
+struct bitmask *wd_create_device_nodemask(struct uacce_dev_list *list);
+
+void wd_free_device_nodemask(struct bitmask *bmp);
+```
+
+## Demo
+
+The simplest user initialization process is:
+
+```c
+{
+	……
+	struct uacce_dev_list *list;
+	int ret;
+
+	list = wd_get_accel_list(alg);
+	ret = wd_<alg>_init2_(list, NULL, NULL, sched_type);
+	wd_free_list_accel(list);
+	……
+}
+```
On 2022/9/24 18:18, Yang Shen wrote:
Due to the complexity of wd_alg_init, add wd_alg_init2 interface for users. And add the design documents.
Signed-off-by: Yang Shen shenyang39@huawei.com
docs/wd_alg_init2.md | 176 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 176 insertions(+) create mode 100644 docs/wd_alg_init2.md
diff --git a/docs/wd_alg_init2.md b/docs/wd_alg_init2.md new file mode 100644 index 0000000..3fb570c --- /dev/null +++ b/docs/wd_alg_init2.md @@ -0,0 +1,176 @@ +# wd_alg_init2
+## Preface
+The current uadk initialization process is:
+1.Call wd_request_ctx() to request ctxs from devices.
+2.Call wd_sched_rr_alloc() to create a sched(or some other scheduler alloc function if exists).
+3.Initialize the sched.
+4.Call wd_alg_init() with ctx_config and sched.
+```flow +st=>start: Start +o1=>operation: request ctxs +o2=>operation: create uadk_sched and instance ctxs to sched region +o3=>operation: call wd_alg_init +e=>end +st->o1->o2->o3->e +```
+Logic is reasonable. But in practice, the step of `wd_request_ctx()` +and `wd_sched_rr_alloc()` are very tedious. This makes it difficult +for users to use the interface. One of the main reasons for this is +that uadk has made a lot of configurations in the scheduler in order +to provide users with better performance. Based on this consideration, +the current uadk requires the user to arrange the division of hardware +resources according to the device topology during initialization. +Therefore, as a high-level interface, this scheme can provide customized +scheme configuration for users with deep needs.
+## wd_alg_init2
+### Design
+Is there any way to simplify these steps? Not currently. Because the +architecture model designed by uadk is to manage hardware resources +through a scheduler, users can no longer perceive after specifying +hardware resources, and all subsequent tasks are handled by the scheduler. +The original intention of this design is to make the scenarios supported +by uadk more flexible. Because the resource requirements of different +business scenarios are different from the task model of the business +itself, the best performance experience can be obtained through the +scheduler to match.
+But we can try to provide a layer of encapsulation. The original design +intention of this layer of encapsulation is that users only need to +specify available resources and requirements, and the configuration of +resources is completed internally by the interface. Because the previous +interface complexity mainly lies in the parameter configuration of CTX +and scheduler, it is easy for users to make configuration errors and +generate bugs because of their misunderstanding of parameters.
+All algorithms have the same input parameters and initialization logic.
+```c +struct wd_ctx_config {
- __u32 ctx_num;
- struct wd_ctx *ctxs;
- void *priv;
+};
+struct wd_sched {
- const char *name;
- int sched_policy;
- handle_t (*sched_init)(handle_t h_sched_ctx, void *sched_param);
- __u32 (*pick_next_ctx)(handle_t h_sched_ctx, void *sched_key,
const int sched_mode);
- int (*poll_policy)(handle_t h_sched_ctx, __u32 expect, __u32 *count);
- handle_t h_sched_ctx;
+};
+int wd_alg_init(struct wd_ctx_config *config, struct wd_sched *sched); +```
+`wd_ctx_config` is the requested ctxs descriptor, and the attributes +of ctxs are contained in their own structure. The attributes will be +used in scheduler for picking ctx according to request type. The main +difficulty in this step is that users need to apply for CTXs from the +appropriate device nodes according to their own business distribution. +If the user does not consider the appropriate device distribution, +it may lead to cross chip or cross numa node which will affect +performance.
+`wd_sched` is the scheduler descriptor of the request. It will create +the scheduling domain based parameters passed by the users. User needs +to allocate the ctxs applied to the scheduling domain that meets the +attribute, so that uadk can select the appropriate ctxs according to +the issued business. The main difficulty in this step is that the user +needs to initialize the correct scheduling domain according to the ctxs +attributes previously applied. However, there are many attributes of +ctxs here, which should be divided by multiple dimensions. If the +parameters are not understood enough, it is easy to make queue +allocation errors, resulting in the scheduling of the wrong ctxs when +the task is finally issued, and cause unexpected errors.
+Therefore, the next thing to be done is to use limited and easy-to-use +input parameters to describe users' requirements on the two input +parameters, ensuring that the functions of the new interface init2 +are the same as those of init. For ease of description, v1 is used +to refer to the existing interface, and v2 is used to refer to the +layer of encapsulation.
+Let's clarify the following logic first: all uacce devices under a +numa node can be regarded as the same. So although we request for +ctxs from the device, we manage ctxs according to numa nodes. +That means if users want to get the same performance for all cpu, +the uadk configure should be same for all numa node.
+At present, at least 4 parameters are required to meet the user +configuration requirements with the V1 interface function remains +unchanged.
+@device_list: The available uacce device list. Users can get it by +`wd_get_accel_list()`.
+@numa_bitmask: The bitmask provided by libnuma. Users can use this +parameter to control requesting ctxs devices in the bind NUMA scenario. +This parameter is mainly convenient for users to use in the binding +cpu scenario. It can avoid resource waste or initialization failure +caused by insufficient resources. Libnuma provides a complete operation +interface which can be found in numa.h.
+@ctx_nums: The requested ctx number for each numa node. Due to users +may have different requirements for different types of ctx numbers, +needs a two-dimensional array as input.
+@sched_type: Scheduling type the user wants to use.
+To sum up, the wd_alg_init2 is as follows
+```c +struct wd_ctx_nums {
- __u32 sync_ctx_num;
- __u32 async_ctx_num;
+};
+struct wd_ctx_params {
- __u32 ctx_set_num;
- struct wd_ctx_nums *ctx_set_size;
+};
+int wd_alg_init2(struct uacce_dev_list *list, struct bitmask *bmp,
+		 struct wd_ctx_params *cparams, __u32 sched_type);
+```
+Somebody may say that the wd_alg_init2 is still complex for three +input parameters are structure. So the interface support default value +for some parameters. The @bmp can be set as NULL, and then it will be +initialized according to device list. The @cparams can be set as NULL, +and it has a default value in wd_alg.c. The @list and sched_type are +necessary.
+What's more, uadk provides a new set of interface to get device list +bit mask.
+```c
+struct bitmask *wd_create_device_nodemask(struct uacce_dev_list *list);
+void wd_free_device_nodemask(struct bitmask *bmp); +```
+## Demo
+The simplest user initialization process is:
+```c +{
- ……
- struct uacce_dev_list *list;
- int ret;
- list = wd_get_accel_list(alg);
- ret = wd_<alg>_init2_(list, NULL, NULL, sched_type);
- wd_free_list_accel(list);
- ……
+}
need update
+```
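
As a side note on the node mask helpers referenced in the design document, the logic they rely on (set one bit per NUMA node that has a device, then count the set bits) can be sketched without libnuma. This is a simplified illustration under assumptions: `dev_node`, `build_node_mask()` and `mask_weight()` are hypothetical names, a plain 64-bit word stands in for libnuma's `struct bitmask`, and the range check on `numa_id` is an addition of this sketch rather than something the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, minimal device list entry; only the NUMA id matters here. */
struct dev_node {
	int numa_id;
	struct dev_node *next;
};

/*
 * Set one bit per NUMA node that owns at least one device.
 * Returns 0 on success, -1 on a NULL argument or an id outside [0, 63].
 */
static int build_node_mask(const struct dev_node *list, uint64_t *mask)
{
	const struct dev_node *p;

	if (!list || !mask)
		return -1;

	*mask = 0;
	for (p = list; p; p = p->next) {
		if (p->numa_id < 0 || p->numa_id >= 64)
			return -1;
		*mask |= (uint64_t)1 << p->numa_id;
	}
	return 0;
}

/* Count the set bits, i.e. the number of NUMA nodes that have devices. */
static int mask_weight(uint64_t mask)
{
	int n = 0;

	while (mask) {
		mask &= mask - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}
```

Two devices on node 0 and one on node 2 give a mask with bits 0 and 2 set and a weight of 2, which corresponds to the node count that the init2 path multiplies by the per-node ctx total.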