The current uadk initialization process is:
1. Call wd_request_ctx() to request ctxs from devices.
2. Call wd_sched_rr_alloc() to create a sched (or another scheduler alloc function, if one exists).
3. Initialize the sched.
4. Call wd_<alg>_init() with ctx_config and sched.
The logic is reasonable, but in practice the `wd_request_ctx()` and `wd_sched_rr_alloc()` steps are very tedious, which makes the interface difficult to use. One of the main reasons is that uadk exposes many configuration options in the scheduler in order to give users better performance. As a result, uadk currently requires the user to divide hardware resources according to the device topology during initialization. This scheme therefore remains useful as an advanced interface that offers customized configuration for users with demanding requirements.
All algorithm initialization interfaces have the same input parameters and behavioral logic. The preparation before wd_<alg>_init() is really the configuration of `struct wd_ctx_config` and `struct wd_sched`. The next step is therefore to describe the user's requirements for these two structures with a small set of easy-to-use input parameters, while ensuring that the new init2 interface offers the same functionality as init. For ease of description, v1 refers to the existing interface, and v2 refers to the new wrapper layer.
At present, at least 4 parameters are required to meet user configuration requirements while keeping the v1 interface behavior unchanged:
@alg: The algorithm the user wants to use.
@sched_type: The scheduling type the user wants to use.
@task_tp: Reserved.
@wd_ctx_params: op_type_num and ctx_set_num describe the number of ctxs requested for each NUMA node. Because users may need different numbers of ctxs for different ctx types, a two-dimensional array is needed as input. The bitmask (bmp) is provided by libnuma. Users can use it to control which devices ctxs are requested from in the NUMA-binding scenario. It is mainly a convenience for the CPU-binding scenario, where it avoids wasting resources or failing initialization because of insufficient resources. libnuma provides a complete set of bitmask operations, which can be found in numa.h.
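The ctx-number parameter described above can be pictured with a minimal sketch. The structures below are local mirrors of the ones this series proposes (renamed with a `demo_` prefix), the `bmp` field is reduced to an opaque pointer so the example does not depend on libnuma, and `demo_ctxs_per_node()` is a hypothetical helper that is not part of uadk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirrors of the proposed structures; demo_ names are illustrative. */
struct demo_ctx_nums {
	uint32_t sync_ctx_num;   /* ctxs used in sync mode */
	uint32_t async_ctx_num;  /* ctxs used in async mode */
};

struct demo_ctx_params {
	uint32_t op_type_num;               /* number of entries in ctx_set_num */
	struct demo_ctx_nums *ctx_set_num;  /* one entry per operation type */
	void *bmp;                          /* stand-in for libnuma's struct bitmask * */
};

/*
 * Hypothetical helper: sum the ctxs requested for one NUMA node across
 * all operation types and both modes.
 */
uint32_t demo_ctxs_per_node(const struct demo_ctx_params *p)
{
	uint32_t total = 0;
	uint32_t i;

	if (!p || !p->ctx_set_num)
		return 0;

	for (i = 0; i < p->op_type_num; i++)
		total += p->ctx_set_num[i].sync_ctx_num +
			 p->ctx_set_num[i].async_ctx_num;

	return total;
}
```

With two operation types requesting {2 sync, 4 async} and {1 sync, 3 async}, the helper reports 10 ctxs per node, which is the kind of figure a caller could compare against the ctxs a device has available before requesting them.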
Changelog:
v5->v6: - Document a restriction between wd_comp_init() and wd_comp_init2_(): if wd_comp_init2_() is called after wd_comp_init(), some ctx resources may leak until wd_comp_uninit2() is called.
v4->v5: - Update wd_comp_init2() and wd_comp_init2_() parameters.
v3->v4: - Restore the wd_comp_init2() parameters and rename the function to wd_comp_init2_(). Then add a macro named wd_comp_init2() that takes a simpler parameter list.
v2->v3: - Update the wd_comp_init2() parameters.
v1->v2: - Update the description of wd_<alg>_init in wd_design.md.
Yang Shen (6):
  uadk - support algorithms initialization reentry protect
  uadk/doc - update wd_alg_init support reentrancy
  uadk - support return error number as pointer
  uadk - mv some function to header file
  uadk/comp - add wd_comp_init2
  uadk/docs - support a simple interface for initialization
 Makefile.am             |   4 +-
 docs/wd_alg_init2.md    | 159 ++++++++++++++++++++++++
 docs/wd_design.md       |   5 +-
 include/wd.h            |  54 ++++++++-
 include/wd_alg_common.h |  27 +++++
 include/wd_comp.h       |  30 +++++
 include/wd_util.h       |  66 ++++++++++
 wd.c                    |  97 ++++++++++++---
 wd_aead.c               |  33 +++--
 wd_cipher.c             |  35 ++++--
 wd_comp.c               | 129 ++++++++++++++++++--
 wd_dh.c                 |  34 ++++--
 wd_digest.c             |  33 +++--
 wd_ecc.c                |  33 +++--
 wd_rsa.c                |  33 +++--
 wd_util.c               | 260 +++++++++++++++++++++++++++++++++++++++-
 16 files changed, 926 insertions(+), 106 deletions(-)
 create mode 100644 docs/wd_alg_init2.md
-- 2.24.0
'wd_<alg>_init()' was designed as non-reentrant, so add a status field to protect against repeated or concurrent initialization.
When 'wd_<alg>_init()' is called, it first reads the status. If the status is WD_UNINIT, it sets the status to WD_INITING, then changes it to WD_INIT on success or restores it to WD_UNINIT if something goes wrong. If the status is WD_INIT, it returns directly. If the status is WD_INITING, another thread is initializing, so the caller needs to wait for the result.
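The three-state protocol described above can be sketched in a self-contained form. This is an illustration rather than the code in this patch: it uses C11 `<stdatomic.h>` with the default sequentially consistent ordering instead of the `__atomic` builtins with relaxed ordering, it yields instead of sleeping while waiting, and every `demo_` name is hypothetical.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

enum demo_status { DEMO_UNINIT, DEMO_INITING, DEMO_INIT };

struct demo_setting {
	_Atomic int status;
	void *priv;
};

/*
 * Return true if the caller must perform the initialization, false if it
 * has already been done. Waits while another thread is initializing.
 */
bool demo_try_init(_Atomic int *status)
{
	int expected;

	for (;;) {
		expected = DEMO_UNINIT;
		if (atomic_compare_exchange_strong(status, &expected, DEMO_INITING))
			return true;   /* this caller owns the initialization */
		if (expected == DEMO_INIT)
			return false;  /* initialization already completed */
		sched_yield();         /* DEMO_INITING: wait for the other thread */
	}
}

int demo_alg_init(struct demo_setting *s, size_t priv_size)
{
	int ret;

	if (!demo_try_init(&s->status))
		return 0;

	if (priv_size == 0) {
		ret = -1;              /* hypothetical parameter error code */
		goto out_clear_init;
	}

	s->priv = calloc(1, priv_size);
	if (!s->priv) {
		ret = -2;              /* hypothetical out-of-memory code */
		goto out_clear_init;
	}

	atomic_store(&s->status, DEMO_INIT);
	return 0;

out_clear_init:
	/* Every failure path restores DEMO_UNINIT so a later call can retry. */
	atomic_store(&s->status, DEMO_UNINIT);
	return ret;
}

void demo_alg_uninit(struct demo_setting *s)
{
	free(s->priv);
	s->priv = NULL;
	atomic_store(&s->status, DEMO_UNINIT);
}
```

The point of the sketch is the pairing: each exit from the init function either publishes the initialized state or resets the status, so a second call after success is a no-op and a call after failure starts from scratch.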
Signed-off-by: Yang Shen shenyang39@huawei.com --- include/wd_util.h | 48 +++++++++++++++++++++++++++++++++++++++++++++++ wd_aead.c | 33 +++++++++++++++++++++----------- wd_cipher.c | 35 ++++++++++++++++++++++------------ wd_comp.c | 35 +++++++++++++++++++++++----------- wd_dh.c | 34 +++++++++++++++++++++------------ wd_digest.c | 33 +++++++++++++++++++++----------- wd_ecc.c | 33 +++++++++++++++++++++----------- wd_rsa.c | 33 +++++++++++++++++++++----------- wd_util.c | 24 ++++++++++++++++++++++++ 9 files changed, 229 insertions(+), 79 deletions(-)
diff --git a/include/wd_util.h b/include/wd_util.h index 83ac5f8..cd0e112 100644 --- a/include/wd_util.h +++ b/include/wd_util.h @@ -21,6 +21,12 @@ extern "C" { for ((i) = 0, (config_numa) = (config)->config_per_numa; \ (i) < (config)->numa_num; (config_numa)++, (i)++)
+enum wd_status { + WD_UNINIT, + WD_INITING, + WD_INIT, +}; + struct wd_async_msg_pool { struct msg_pool *pools; __u32 pool_num; @@ -356,6 +362,48 @@ int wd_handle_msg_sync(struct wd_msg_handle *msg_handle, handle_t ctx, */ int wd_init_param_check(struct wd_ctx_config *config, struct wd_sched *sched);
+/** + * wd_alg_try_init() - Check the algorithm status and set it as WD_INITING + * if need initialization. + * @status: algorithm initialization status. + * + * Return true if need initialization and false if initialized, otherwise will wait + * last initialization result. + */ +bool wd_alg_try_init(enum wd_status *status); + +/** + * wd_alg_set_init() - Set the algorithm status as WD_INIT. + * @status: algorithm initialization status. + */ +static inline void wd_alg_set_init(enum wd_status *status) +{ + enum wd_status setting = WD_INIT; + + __atomic_store(status, &setting, __ATOMIC_RELAXED); +} + +/** + * wd_alg_get_init() - Get the algorithm status. + * @status: algorithm initialization status. + * @value: value of algorithm initialization status. + */ +static inline void wd_alg_get_init(enum wd_status *status, enum wd_status *value) +{ + __atomic_load(status, value, __ATOMIC_RELAXED); +} + +/** + * wd_alg_clear_init() - Set the algorithm status as WD_UNINIT. + * @status: algorithm initialization status. + */ +static inline void wd_alg_clear_init(enum wd_status *status) +{ + enum wd_status setting = WD_UNINIT; + + __atomic_store(status, &setting, __ATOMIC_RELAXED); +} + /** * wd_dfx_msg_cnt() - Message counter interface for ctx * @msg: Shared memory addr. diff --git a/wd_aead.c b/wd_aead.c index d6c2380..2307b20 100644 --- a/wd_aead.c +++ b/wd_aead.c @@ -31,6 +31,7 @@ static int g_aead_mac_len[WD_DIGEST_TYPE_MAX] = { };
struct wd_aead_setting { + enum wd_status status; struct wd_ctx_config_internal config; struct wd_sched sched; struct wd_aead_driver *driver; @@ -392,24 +393,29 @@ static int wd_aead_param_check(struct wd_aead_sess *sess, int wd_aead_init(struct wd_ctx_config *config, struct wd_sched *sched) { void *priv; + bool flag; int ret;
+ flag = wd_alg_try_init(&wd_aead_setting.status); + if (!flag) + return 0; + ret = wd_init_param_check(config, sched); if (ret) - return ret; + goto out_clear_init;
ret = wd_set_epoll_en("WD_AEAD_EPOLL_EN", &wd_aead_setting.config.epoll_en); if (ret < 0) - return ret; + goto out_clear_init;
ret = wd_init_ctx_config(&wd_aead_setting.config, config); if (ret) - return ret; + goto out_clear_init;
ret = wd_init_sched(&wd_aead_setting.sched, sched); if (ret < 0) - goto out; + goto out_clear_ctx_config;
/* set driver */ #ifdef WD_STATIC_DRV @@ -421,33 +427,37 @@ int wd_aead_init(struct wd_ctx_config *config, struct wd_sched *sched) config->ctx_num, WD_POOL_MAX_ENTRIES, sizeof(struct wd_aead_msg)); if (ret < 0) - goto out_sched; + goto out_clear_sched;
/* init ctx related resources in specific driver */ priv = calloc(1, wd_aead_setting.driver->drv_ctx_size); if (!priv) { ret = -WD_ENOMEM; - goto out_priv; + goto out_clear_pool; } wd_aead_setting.priv = priv;
ret = wd_aead_setting.driver->init(&wd_aead_setting.config, priv); if (ret < 0) { WD_ERR("failed to init aead dirver!\n"); - goto out_init; + goto out_free_priv; }
+ wd_alg_set_init(&wd_aead_setting.status); + return 0;
-out_init: +out_free_priv: free(priv); wd_aead_setting.priv = NULL; -out_priv: +out_clear_pool: wd_uninit_async_request_pool(&wd_aead_setting.pool); -out_sched: +out_clear_sched: wd_clear_sched(&wd_aead_setting.sched); -out: +out_clear_ctx_config: wd_clear_ctx_config(&wd_aead_setting.config); +out_clear_init: + wd_alg_clear_init(&wd_aead_setting.status); return ret; }
@@ -465,6 +475,7 @@ void wd_aead_uninit(void) wd_uninit_async_request_pool(&wd_aead_setting.pool); wd_clear_sched(&wd_aead_setting.sched); wd_clear_ctx_config(&wd_aead_setting.config); + wd_alg_clear_init(&wd_aead_setting.status); }
static void fill_request_msg(struct wd_aead_msg *msg, struct wd_aead_req *req, diff --git a/wd_cipher.c b/wd_cipher.c index 8ce975a..a85629d 100644 --- a/wd_cipher.c +++ b/wd_cipher.c @@ -45,6 +45,7 @@ static const unsigned char des_weak_keys[DES_WEAK_KEY_NUM][DES_KEY_SIZE] = { };
struct wd_cipher_setting { + enum wd_status status; struct wd_ctx_config_internal config; struct wd_sched sched; void *sched_ctx; @@ -231,24 +232,29 @@ void wd_cipher_free_sess(handle_t h_sess) int wd_cipher_init(struct wd_ctx_config *config, struct wd_sched *sched) { void *priv; + bool flag; int ret;
+ flag = wd_alg_try_init(&wd_cipher_setting.status); + if (!flag) + return 0; + ret = wd_init_param_check(config, sched); if (ret) - return ret; + goto out_clear_init;
ret = wd_set_epoll_en("WD_CIPHER_EPOLL_EN", &wd_cipher_setting.config.epoll_en); if (ret < 0) - return ret; + goto out_clear_init;
ret = wd_init_ctx_config(&wd_cipher_setting.config, config); if (ret < 0) - return ret; + goto out_clear_init;
ret = wd_init_sched(&wd_cipher_setting.sched, sched); if (ret < 0) - goto out; + goto out_clear_ctx_config;
#ifdef WD_STATIC_DRV /* set driver */ @@ -260,33 +266,37 @@ int wd_cipher_init(struct wd_ctx_config *config, struct wd_sched *sched) config->ctx_num, WD_POOL_MAX_ENTRIES, sizeof(struct wd_cipher_msg)); if (ret < 0) - goto out_sched; + goto out_clear_sched;
/* init ctx related resources in specific driver */ priv = calloc(1, wd_cipher_setting.driver->drv_ctx_size); if (!priv) { ret = -WD_ENOMEM; - goto out_priv; + goto out_clear_pool; } wd_cipher_setting.priv = priv;
ret = wd_cipher_setting.driver->init(&wd_cipher_setting.config, priv); if (ret < 0) { - WD_ERR("hisi sec init failed.\n"); - goto out_init; + WD_ERR("failed to do dirver init, ret = %d.\n", ret); + goto out_free_priv; }
+ wd_alg_set_init(&wd_cipher_setting.status); + return 0;
-out_init: +out_free_priv: free(priv); wd_cipher_setting.priv = NULL; -out_priv: +out_clear_pool: wd_uninit_async_request_pool(&wd_cipher_setting.pool); -out_sched: +out_clear_sched: wd_clear_sched(&wd_cipher_setting.sched); -out: +out_clear_ctx_config: wd_clear_ctx_config(&wd_cipher_setting.config); +out_clear_init: + wd_alg_clear_init(&wd_cipher_setting.status); return ret; }
@@ -304,6 +314,7 @@ void wd_cipher_uninit(void) wd_uninit_async_request_pool(&wd_cipher_setting.pool); wd_clear_sched(&wd_cipher_setting.sched); wd_clear_ctx_config(&wd_cipher_setting.config); + wd_alg_clear_init(&wd_cipher_setting.status); }
static void fill_request_msg(struct wd_cipher_msg *msg, diff --git a/wd_comp.c b/wd_comp.c index eacebd3..44593a6 100644 --- a/wd_comp.c +++ b/wd_comp.c @@ -41,6 +41,7 @@ struct wd_comp_sess { };
struct wd_comp_setting { + enum wd_status status; struct wd_ctx_config_internal config; struct wd_sched sched; struct wd_comp_driver *driver; @@ -81,24 +82,29 @@ void wd_comp_set_driver(struct wd_comp_driver *drv) int wd_comp_init(struct wd_ctx_config *config, struct wd_sched *sched) { void *priv; + bool flag; int ret;
+ flag = wd_alg_try_init(&wd_comp_setting.status); + if (!flag) + return 0; + ret = wd_init_param_check(config, sched); if (ret) - return ret; + goto out_clear_init;
ret = wd_set_epoll_en("WD_COMP_EPOLL_EN", &wd_comp_setting.config.epoll_en); if (ret < 0) - return ret; + goto out_clear_init;
ret = wd_init_ctx_config(&wd_comp_setting.config, config); if (ret < 0) - return ret; + goto out_clear_init;
ret = wd_init_sched(&wd_comp_setting.sched, sched); if (ret < 0) - goto out; + goto out_clear_ctx_config; /* * Fix me: ctx could be passed into wd_comp_set_static_drv to help to * choose static compiled vendor driver. For dynamic vendor driver, @@ -118,31 +124,36 @@ int wd_comp_init(struct wd_ctx_config *config, struct wd_sched *sched) config->ctx_num, WD_POOL_MAX_ENTRIES, sizeof(struct wd_comp_msg)); if (ret < 0) - goto out_sched; + goto out_clear_sched;
/* init ctx related resources in specific driver */ priv = calloc(1, wd_comp_setting.driver->drv_ctx_size); if (!priv) { ret = -WD_ENOMEM; - goto out_priv; + goto out_clear_pool; } wd_comp_setting.priv = priv; ret = wd_comp_setting.driver->init(&wd_comp_setting.config, priv); if (ret < 0) { WD_ERR("failed to do driver init, ret = %d!\n", ret); - goto out_init; + goto out_free_priv; } + + wd_alg_set_init(&wd_comp_setting.status); + return 0;
-out_init: +out_free_priv: free(priv); wd_comp_setting.priv = NULL; -out_priv: +out_clear_pool: wd_uninit_async_request_pool(&wd_comp_setting.pool); -out_sched: +out_clear_sched: wd_clear_sched(&wd_comp_setting.sched); -out: +out_clear_ctx_config: wd_clear_ctx_config(&wd_comp_setting.config); +out_clear_init: + wd_alg_clear_init(&wd_comp_setting.status); return ret; }
@@ -163,6 +174,8 @@ void wd_comp_uninit(void) /* unset config, sched, driver */ wd_clear_sched(&wd_comp_setting.sched); wd_clear_ctx_config(&wd_comp_setting.config); + + wd_alg_clear_init(&wd_comp_setting.status); }
struct wd_comp_msg *wd_comp_get_msg(__u32 idx, __u32 tag) diff --git a/wd_dh.c b/wd_dh.c index 0bf770d..85382e2 100644 --- a/wd_dh.c +++ b/wd_dh.c @@ -32,6 +32,7 @@ struct wd_dh_sess { };
static struct wd_dh_setting { + enum wd_status status; struct wd_ctx_config_internal config; struct wd_sched sched; void *sched_ctx; @@ -78,24 +79,29 @@ void wd_dh_set_driver(struct wd_dh_driver *drv) int wd_dh_init(struct wd_ctx_config *config, struct wd_sched *sched) { void *priv; + bool flag; int ret;
+ flag = wd_alg_try_init(&wd_dh_setting.status); + if (!flag) + return 0; + ret = wd_init_param_check(config, sched); if (ret) - return ret; + goto out_clear_init;
ret = wd_set_epoll_en("WD_DH_EPOLL_EN", &wd_dh_setting.config.epoll_en); if (ret < 0) - return ret; + goto out_clear_init;
ret = wd_init_ctx_config(&wd_dh_setting.config, config); if (ret) - return ret; + goto out_clear_init;
ret = wd_init_sched(&wd_dh_setting.sched, sched); if (ret) - goto out; + goto out_clear_ctx_config;
#ifdef WD_STATIC_DRV wd_dh_set_static_drv(); @@ -106,13 +112,13 @@ int wd_dh_init(struct wd_ctx_config *config, struct wd_sched *sched) config->ctx_num, WD_POOL_MAX_ENTRIES, sizeof(struct wd_dh_msg)); if (ret) - goto out_sched; + goto out_clear_sched;
/* initialize ctx related resources in specific driver */ priv = calloc(1, wd_dh_setting.driver->drv_ctx_size); if (!priv) { ret = -WD_ENOMEM; - goto out_priv; + goto out_clear_pool; }
wd_dh_setting.priv = priv; @@ -120,21 +126,24 @@ int wd_dh_init(struct wd_ctx_config *config, struct wd_sched *sched) wd_dh_setting.driver->alg_name); if (ret < 0) { WD_ERR("failed to init dh driver, ret= %d!\n", ret); - goto out_init; + goto out_free_priv; }
+ wd_alg_set_init(&wd_dh_setting.status); + return 0;
-out_init: +out_free_priv: free(priv); wd_dh_setting.priv = NULL; -out_priv: +out_clear_pool: wd_uninit_async_request_pool(&wd_dh_setting.pool); -out_sched: +out_clear_sched: wd_clear_sched(&wd_dh_setting.sched); -out: +out_clear_ctx_config: wd_clear_ctx_config(&wd_dh_setting.config); - +out_clear_init: + wd_alg_clear_init(&wd_dh_setting.status); return ret; }
@@ -156,6 +165,7 @@ void wd_dh_uninit(void) /* unset config, sched, driver */ wd_clear_sched(&wd_dh_setting.sched); wd_clear_ctx_config(&wd_dh_setting.config); + wd_alg_clear_init(&wd_dh_setting.status); }
static int fill_dh_msg(struct wd_dh_msg *msg, struct wd_dh_req *req, diff --git a/wd_digest.c b/wd_digest.c index f56be0c..26dc7d1 100644 --- a/wd_digest.c +++ b/wd_digest.c @@ -39,6 +39,7 @@ static int g_digest_mac_full_len[WD_DIGEST_TYPE_MAX] = { };
struct wd_digest_setting { + enum wd_status status; struct wd_ctx_config_internal config; struct wd_sched sched; struct wd_digest_driver *driver; @@ -186,24 +187,29 @@ void wd_digest_free_sess(handle_t h_sess) int wd_digest_init(struct wd_ctx_config *config, struct wd_sched *sched) { void *priv; + bool flag; int ret;
+ flag = wd_alg_try_init(&wd_digest_setting.status); + if (!flag) + return 0; + ret = wd_init_param_check(config, sched); if (ret) - return ret; + goto out_clear_init;
ret = wd_set_epoll_en("WD_DIGEST_EPOLL_EN", &wd_digest_setting.config.epoll_en); if (ret < 0) - return ret; + goto out_clear_init;
ret = wd_init_ctx_config(&wd_digest_setting.config, config); if (ret < 0) - return ret; + goto out_clear_init;
ret = wd_init_sched(&wd_digest_setting.sched, sched); if (ret < 0) - goto out; + goto out_clear_ctx_config;
/* set driver */ #ifdef WD_STATIC_DRV @@ -215,33 +221,37 @@ int wd_digest_init(struct wd_ctx_config *config, struct wd_sched *sched) config->ctx_num, WD_POOL_MAX_ENTRIES, sizeof(struct wd_digest_msg)); if (ret < 0) - goto out_sched; + goto out_clear_sched;
/* init ctx related resources in specific driver */ priv = calloc(1, wd_digest_setting.driver->drv_ctx_size); if (!priv) { ret = -WD_ENOMEM; - goto out_priv; + goto out_clear_pool; } wd_digest_setting.priv = priv;
ret = wd_digest_setting.driver->init(&wd_digest_setting.config, priv); if (ret < 0) { WD_ERR("failed to init digest dirver!\n"); - goto out_init; + goto out_free_priv; }
+ wd_alg_set_init(&wd_digest_setting.status); + return 0;
-out_init: +out_free_priv: free(priv); wd_digest_setting.priv = NULL; -out_priv: +out_clear_pool: wd_uninit_async_request_pool(&wd_digest_setting.pool); -out_sched: +out_clear_sched: wd_clear_sched(&wd_digest_setting.sched); -out: +out_clear_ctx_config: wd_clear_ctx_config(&wd_digest_setting.config); +out_clear_init: + wd_alg_clear_init(&wd_digest_setting.status); return ret; }
@@ -260,6 +270,7 @@ void wd_digest_uninit(void)
wd_clear_sched(&wd_digest_setting.sched); wd_clear_ctx_config(&wd_digest_setting.config); + wd_alg_clear_init(&wd_digest_setting.status); }
static int wd_aes_hmac_length_check(struct wd_digest_sess *sess, diff --git a/wd_ecc.c b/wd_ecc.c index 2266b1d..3e902bd 100644 --- a/wd_ecc.c +++ b/wd_ecc.c @@ -64,6 +64,7 @@ struct wd_ecc_curve_list { };
static struct wd_ecc_setting { + enum wd_status status; struct wd_ctx_config_internal config; struct wd_sched sched; void *sched_ctx; @@ -133,24 +134,29 @@ void wd_ecc_set_driver(struct wd_ecc_driver *drv) int wd_ecc_init(struct wd_ctx_config *config, struct wd_sched *sched) { void *priv; + bool flag; int ret;
+ flag = wd_alg_try_init(&wd_ecc_setting.status); + if (!flag) + return 0; + ret = wd_init_param_check(config, sched); if (ret) - return ret; + goto out_clear_init;
ret = wd_set_epoll_en("WD_ECC_EPOLL_EN", &wd_ecc_setting.config.epoll_en); if (ret < 0) - return ret; + goto out_clear_init;
ret = wd_init_ctx_config(&wd_ecc_setting.config, config); if (ret < 0) - return ret; + goto out_clear_init;
ret = wd_init_sched(&wd_ecc_setting.sched, sched); if (ret < 0) - goto out; + goto out_clear_ctx_config;
#ifdef WD_STATIC_DRV wd_ecc_set_static_drv(); @@ -161,13 +167,13 @@ int wd_ecc_init(struct wd_ctx_config *config, struct wd_sched *sched) config->ctx_num, WD_POOL_MAX_ENTRIES, sizeof(struct wd_ecc_msg)); if (ret < 0) - goto out_sched; + goto out_clear_sched;
/* initialize ctx related resources in specific driver */ priv = calloc(1, wd_ecc_setting.driver->drv_ctx_size); if (!priv) { ret = -WD_ENOMEM; - goto out_priv; + goto out_clear_pool; }
wd_ecc_setting.priv = priv; @@ -175,20 +181,24 @@ int wd_ecc_init(struct wd_ctx_config *config, struct wd_sched *sched) wd_ecc_setting.driver->alg_name); if (ret < 0) { WD_ERR("failed to init ecc driver, ret = %d!\n", ret); - goto out_init; + goto out_free_priv; }
+ wd_alg_set_init(&wd_ecc_setting.status); + return 0;
-out_init: +out_free_priv: free(priv); wd_ecc_setting.priv = NULL; -out_priv: +out_clear_pool: wd_uninit_async_request_pool(&wd_ecc_setting.pool); -out_sched: +out_clear_sched: wd_clear_sched(&wd_ecc_setting.sched); -out: +out_clear_ctx_config: wd_clear_ctx_config(&wd_ecc_setting.config); +out_clear_init: + wd_alg_clear_init(&wd_ecc_setting.status); return ret; }
@@ -210,6 +220,7 @@ void wd_ecc_uninit(void) /* unset config, sched, driver */ wd_clear_sched(&wd_ecc_setting.sched); wd_clear_ctx_config(&wd_ecc_setting.config); + wd_alg_clear_init(&wd_ecc_setting.status); }
static int trans_to_binpad(char *dst, const char *src, diff --git a/wd_rsa.c b/wd_rsa.c index 489833e..aab16ce 100644 --- a/wd_rsa.c +++ b/wd_rsa.c @@ -72,6 +72,7 @@ struct wd_rsa_sess { };
static struct wd_rsa_setting { + enum wd_status status; struct wd_ctx_config_internal config; struct wd_sched sched; void *sched_ctx; @@ -118,24 +119,29 @@ void wd_rsa_set_driver(struct wd_rsa_driver *drv) int wd_rsa_init(struct wd_ctx_config *config, struct wd_sched *sched) { void *priv; + bool flag; int ret;
+ flag = wd_alg_try_init(&wd_rsa_setting.status); + if (!flag) + return 0; + ret = wd_init_param_check(config, sched); if (ret) - return ret; + goto out_clear_init;
ret = wd_set_epoll_en("WD_RSA_EPOLL_EN", &wd_rsa_setting.config.epoll_en); if (ret < 0) - return ret; + goto out_clear_init;
ret = wd_init_ctx_config(&wd_rsa_setting.config, config); if (ret < 0) - return ret; + goto out_clear_init;
ret = wd_init_sched(&wd_rsa_setting.sched, sched); if (ret < 0) - goto out; + goto out_clear_ctx_config;
#ifdef WD_STATIC_DRV wd_rsa_set_static_drv(); @@ -146,13 +152,13 @@ int wd_rsa_init(struct wd_ctx_config *config, struct wd_sched *sched) config->ctx_num, WD_POOL_MAX_ENTRIES, sizeof(struct wd_rsa_msg)); if (ret < 0) - goto out_sched; + goto out_clear_sched;
/* initialize ctx related resources in specific driver */ priv = calloc(1, wd_rsa_setting.driver->drv_ctx_size); if (!priv) { ret = -WD_ENOMEM; - goto out_priv; + goto out_clear_pool; }
wd_rsa_setting.priv = priv; @@ -160,20 +166,24 @@ int wd_rsa_init(struct wd_ctx_config *config, struct wd_sched *sched) wd_rsa_setting.driver->alg_name); if (ret < 0) { WD_ERR("failed to init rsa driver, ret = %d!\n", ret); - goto out_init; + goto out_free_priv; }
+ wd_alg_set_init(&wd_rsa_setting.status); + return 0;
-out_init: +out_free_priv: free(priv); wd_rsa_setting.priv = NULL; -out_priv: +out_clear_pool: wd_uninit_async_request_pool(&wd_rsa_setting.pool); -out_sched: +out_clear_sched: wd_clear_sched(&wd_rsa_setting.sched); -out: +out_clear_ctx_config: wd_clear_ctx_config(&wd_rsa_setting.config); +out_clear_init: + wd_alg_clear_init(&wd_rsa_setting.status); return ret; }
@@ -195,6 +205,7 @@ void wd_rsa_uninit(void) /* unset config, sched, driver */ wd_clear_sched(&wd_rsa_setting.sched); wd_clear_ctx_config(&wd_rsa_setting.config); + wd_alg_clear_init(&wd_rsa_setting.status); }
static int fill_rsa_msg(struct wd_rsa_msg *msg, struct wd_rsa_req *req, diff --git a/wd_util.c b/wd_util.c index bd82075..fa77b46 100644 --- a/wd_util.c +++ b/wd_util.c @@ -22,6 +22,9 @@ #define PRIVILEGE_FLAG 600 #define MIN(a, b) ((a) > (b) ? (b) : (a))
+#define WD_INIT_SLEEP_UTIME 1000 +#define WD_INIT_RETRY_TIMES 10000 + struct msg_pool { /* message array allocated dynamically */ void *msgs; @@ -1777,3 +1780,24 @@ int wd_init_param_check(struct wd_ctx_config *config, struct wd_sched *sched)
return 0; } + +bool wd_alg_try_init(enum wd_status *status) +{ + enum wd_status expected; + int count = 0; + bool ret; + + do { + expected = WD_UNINIT; + ret = __atomic_compare_exchange_n(status, &expected, WD_INITING, true, + __ATOMIC_RELAXED, __ATOMIC_RELAXED); + if (expected == WD_INIT) + return false; + usleep(WD_INIT_SLEEP_UTIME); + if (!(++count % WD_INIT_RETRY_TIMES)) + WD_ERR("The algorithm initizalite has been waiting for %ds!\n", + WD_INIT_SLEEP_UTIME * count / 1000000); + } while (!ret); + + return true; +}
uadk now supports multi-threaded concurrent and reentrant calls to the initialization interface.
Signed-off-by: Yang Shen shenyang39@huawei.com --- docs/wd_design.md | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/docs/wd_design.md b/docs/wd_design.md index ba5a5b9..3e5297e 100644 --- a/docs/wd_design.md +++ b/docs/wd_design.md @@ -81,6 +81,7 @@ | | |2) Change *user* layer to *sched* layer since | | | | sample_sched is moved from user space into UADK | | | | framework. | +| 1.4 | |1) Update *wd_alg_init* reentrancy. |
## Terminology @@ -493,7 +494,9 @@ device. Return 0 if it succeeds. And return error number if it fails.
In *wd_comp_init()*, context resources, user scheduler and vendor driver are -initialized. +initialized. This function supports multi-threaded concurrent calls and +reentrant. When one thread is initializing, other threads will wait for +completion.
***void wd_comp_uninit(void)***
Add a set of interfaces, 'WD_ERR_PTR()' and 'WD_PTR_ERR()', for returning an error value through a pointer.
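A minimal sketch of how such a pair is typically used follows. The three helpers are local copies modelled on the definitions in include/wd.h from this series, renamed with a `DEMO_` prefix. `demo_create_buf()` is a hypothetical function, and the use of `errno.h` codes is an assumption made only for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Values in the top 1000 addresses are treated as encoded error numbers. */
#define DEMO_IS_ERR(h) ((uintptr_t)(h) > (uintptr_t)(-1000))

static inline void *DEMO_ERR_PTR(uintptr_t error)
{
	return (void *)error;
}

static inline long DEMO_PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

/* Hypothetical allocator that reports failures through the pointer itself. */
void *demo_create_buf(size_t len)
{
	void *buf;

	if (len == 0)
		return DEMO_ERR_PTR(-EINVAL);

	buf = malloc(len);
	if (!buf)
		return DEMO_ERR_PTR(-ENOMEM);

	return buf;
}
```

A caller checks the result with `DEMO_IS_ERR()` and, on failure, recovers the negative error number with `DEMO_PTR_ERR()`. This lets a pointer-returning function say why it failed instead of only returning NULL.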
Signed-off-by: Yang Shen shenyang39@huawei.com --- include/wd.h | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-)
diff --git a/include/wd.h b/include/wd.h index b0580ba..74d714b 100644 --- a/include/wd.h +++ b/include/wd.h @@ -91,11 +91,6 @@ typedef void (*wd_log)(const char *format, ...); #define WD_IS_ERR(h) ((uintptr_t)(h) > \ (uintptr_t)(-1000))
-static inline void *WD_ERR_PTR(uintptr_t error) -{ - return (void *)error; -} - enum wcrypto_type { WD_CIPHER, WD_DIGEST, @@ -185,6 +180,16 @@ static inline void wd_iowrite64(void *addr, uint64_t value) *((volatile uint64_t *)addr) = value; }
+static inline void *WD_ERR_PTR(uintptr_t error) +{ + return (void *)error; +} + +static inline long WD_PTR_ERR(const void *ptr) +{ + return (long)ptr; +} + /** * wd_request_ctx() - Request a communication context from a device. * @dev: Indicate one device.
Since two functions will be used by multiple files, export them through the header file.
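For readers unfamiliar with the list helper being exported, here is a self-contained sketch of the same append-to-tail pattern. The node type and both function names are hypothetical stand-ins for the uacce device list. As in the patch, the append helper assumes a non-NULL head, so the caller handles the empty-list case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a singly linked device list node. */
struct demo_dev_list {
	int id;
	struct demo_dev_list *next;
};

/* Append node to the end of the list; head must not be NULL. */
void demo_add_to_list(struct demo_dev_list *head, struct demo_dev_list *node)
{
	struct demo_dev_list *tmp = head;

	while (tmp->next)
		tmp = tmp->next;

	tmp->next = node;
}

/*
 * Caller-side pattern: the first node becomes the head, later nodes are
 * appended. Returns the (possibly new) head.
 */
struct demo_dev_list *demo_collect(struct demo_dev_list *head,
				   struct demo_dev_list *node)
{
	node->next = NULL;
	if (!head)
		return node;

	demo_add_to_list(head, node);
	return head;
}
```

Walking to the tail on every append is linear in the list length, which is a reasonable trade-off for a short device list that is built once.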
Signed-off-by: Yang Shen shenyang39@huawei.com --- include/wd.h | 15 +++++++++++++++ wd.c | 35 +++++++++++++++++------------------ 2 files changed, 32 insertions(+), 18 deletions(-)
diff --git a/include/wd.h b/include/wd.h index 74d714b..e1a87de 100644 --- a/include/wd.h +++ b/include/wd.h @@ -508,6 +508,21 @@ void wd_mempool_stats(handle_t mempool, struct wd_mempool_stats *stats); */ void wd_blockpool_stats(handle_t blkpool, struct wd_blockpool_stats *stats);
+/** + * wd_clone_dev() - clone a new uacce device. + * @dev: The source device. + * + * Return a pointer value if succeed, and NULL if fail. + */ +struct uacce_dev *wd_clone_dev(struct uacce_dev *dev); + +/** + * wd_add_dev_to_list() - add a node to end of list. + * @head: The list head. + * @node: The node need to be add. + */ +void wd_add_dev_to_list(struct uacce_dev_list *head, struct uacce_dev_list *node); + /** * wd_ctx_get_dev_name() - Get the device name about task. * @h_ctx: The handle of context. diff --git a/wd.c b/wd.c index 6ea17f3..78094d8 100644 --- a/wd.c +++ b/wd.c @@ -342,7 +342,13 @@ out: return strndup(name, len); }
-static struct uacce_dev *clone_uacce_dev(struct uacce_dev *dev) +static void wd_ctx_init_qfrs_offs(struct wd_ctx_h *ctx) +{ + memcpy(&ctx->qfrs_offs, &ctx->dev->qfrs_offs, + sizeof(ctx->qfrs_offs)); +} + +struct uacce_dev *wd_clone_dev(struct uacce_dev *dev) { struct uacce_dev *new;
@@ -355,10 +361,14 @@ static struct uacce_dev *clone_uacce_dev(struct uacce_dev *dev) return new; }
-static void wd_ctx_init_qfrs_offs(struct wd_ctx_h *ctx) +void wd_add_dev_to_list(struct uacce_dev_list *head, struct uacce_dev_list *node) { - memcpy(&ctx->qfrs_offs, &ctx->dev->qfrs_offs, - sizeof(ctx->qfrs_offs)); + struct uacce_dev_list *tmp = head; + + while (tmp->next) + tmp = tmp->next; + + tmp->next = node; }
handle_t wd_request_ctx(struct uacce_dev *dev) @@ -393,7 +403,7 @@ handle_t wd_request_ctx(struct uacce_dev *dev) if (!ctx->drv_name) goto free_dev_name;
- ctx->dev = clone_uacce_dev(dev); + ctx->dev = wd_clone_dev(dev); if (!ctx->dev) goto free_drv_name;
@@ -633,17 +643,6 @@ static bool dev_has_alg(const char *dev_alg_name, const char *alg_name) return false; }
-static void add_uacce_dev_to_list(struct uacce_dev_list *head, - struct uacce_dev_list *node) -{ - struct uacce_dev_list *tmp = head; - - while (tmp->next) - tmp = tmp->next; - - tmp->next = node; -} - static int check_alg_name(const char *alg_name) { int i = 0; @@ -715,7 +714,7 @@ struct uacce_dev_list *wd_get_accel_list(const char *alg_name) if (!head) head = node; else - add_uacce_dev_to_list(head, node); + wd_add_dev_to_list(head, node); }
closedir(wd_class); @@ -774,7 +773,7 @@ struct uacce_dev *wd_get_accel_dev(const char *alg_name) }
if (dev) - target = clone_uacce_dev(dev); + target = wd_clone_dev(dev);
wd_free_list_accels(head);
For performance reasons, uadk leaves many configuration options to users. This gives users great flexibility, but it also makes the current initialization interface highly complex. Therefore, a new set of interfaces is provided to help users adapt quickly.
'wd_alg_init2_()' completes all initialization steps. Four parameters describe the user's configuration requirements. @alg: The algorithm users want to use. @sched_type: The scheduling type users want to use. @task_tp: Reserved. @ctx_params: The ctx resources users want to use, including the number of ctxs per operation type and the NUMA nodes on which the business process runs.
If users find 'wd_alg_init2_()' too complex, 'wd_alg_init2()' is a simplified wrapper that uses the default values of numa_bitmask and ctx_nums.
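The relationship between the full function and its simplified wrapper can be sketched as follows. This is a toy model, not the uadk implementation: all `demo_` names are hypothetical, the parameter structure is reduced to two counters, and the default values are placeholders because the real defaults are not stated in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_params {
	uint32_t sync_ctx_num;
	uint32_t async_ctx_num;
};

/* Records what the toy "library" ended up using, for inspection. */
struct demo_params demo_active;

/* Full interface: a NULL params pointer selects the defaults. */
int demo_init2_(const char *alg, uint32_t sched_type, int task_tp,
		const struct demo_params *params)
{
	/* Placeholder defaults, chosen only for this example. */
	static const struct demo_params defaults = { 2, 2 };

	(void)sched_type;
	(void)task_tp;

	if (!alg)
		return -1;  /* hypothetical error code */

	demo_active = params ? *params : defaults;
	return 0;
}

/* Simplified wrapper: same call with the optional parameter fixed to NULL. */
#define demo_init2(alg, sched_type, task_tp) \
	demo_init2_(alg, sched_type, task_tp, NULL)
```

Keeping the wrapper as a macro over the full function means there is one initialization path to maintain, and callers that later need custom ctx numbers can switch to the underscore variant without changing anything else.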
Signed-off-by: Yang Shen shenyang39@huawei.com --- Makefile.am | 4 +- include/wd.h | 24 ++++ include/wd_alg_common.h | 27 +++++ include/wd_comp.h | 30 +++++ include/wd_util.h | 18 +++ wd.c | 62 +++++++++++ wd_comp.c | 94 ++++++++++++++++ wd_util.c | 236 +++++++++++++++++++++++++++++++++++++++- 8 files changed, 492 insertions(+), 3 deletions(-)
diff --git a/Makefile.am b/Makefile.am index 457af43..c5637e5 100644 --- a/Makefile.am +++ b/Makefile.am @@ -87,7 +87,7 @@ AM_CFLAGS += -DWD_NO_LOG
libwd_la_LIBADD = $(libwd_la_OBJECTS) -lnuma
-libwd_comp_la_LIBADD = $(libwd_la_OBJECTS) -ldl +libwd_comp_la_LIBADD = $(libwd_la_OBJECTS) -ldl -lnuma libwd_comp_la_DEPENDENCIES = libwd.la
libhisi_zip_la_LIBADD = -ldl @@ -104,7 +104,7 @@ else libwd_la_LDFLAGS=$(UADK_VERSION) libwd_la_LIBADD= -lnuma
-libwd_comp_la_LIBADD= -lwd -ldl +libwd_comp_la_LIBADD= -lwd -ldl -lnuma libwd_comp_la_LDFLAGS=$(UADK_VERSION) libwd_comp_la_DEPENDENCIES= libwd.la
diff --git a/include/wd.h b/include/wd.h index e1a87de..facd992 100644 --- a/include/wd.h +++ b/include/wd.h @@ -348,6 +348,16 @@ int wd_get_avail_ctx(struct uacce_dev *dev); */ struct uacce_dev_list *wd_get_accel_list(const char *alg_name);
+/** + * wd_find_dev_by_numa() - get device with max available ctx number from an + * device list according to numa id. + * @list: The device list. + * @numa_id: The numa_id. + * + * Return device if succeed and other error number if fail. + */ +struct uacce_dev *wd_find_dev_by_numa(struct uacce_dev_list *list, int numa_id); + /** * wd_get_accel_dev() - Get device supporting the algorithm with smallest numa distance to current numa node. @@ -523,6 +533,20 @@ struct uacce_dev *wd_clone_dev(struct uacce_dev *dev); */ void wd_add_dev_to_list(struct uacce_dev_list *head, struct uacce_dev_list *node);
+/** + * wd_create_device_nodemask() - create a numa node mask of device list. + * @list: The devices list. + * + * Return a pointer value if succeed, and error number if fail. + */ +struct bitmask *wd_create_device_nodemask(struct uacce_dev_list *list); + +/** + * wd_free_device_nodemask() - free a numa node mask. + * @bmp: A numa node mask. + */ +void wd_free_device_nodemask(struct bitmask *bmp); + /** * wd_ctx_get_dev_name() - Get the device name about task. * @h_ctx: The handle of context. diff --git a/include/wd_alg_common.h b/include/wd_alg_common.h index c455dc3..96e908f 100644 --- a/include/wd_alg_common.h +++ b/include/wd_alg_common.h @@ -63,6 +63,33 @@ struct wd_ctx_config { void *priv; };
+/**
+ * struct wd_ctx_nums - Define the ctx sets numbers.
+ * @sync_ctx_num: The ctx numbers which are used for sync mode for each
+ * ctx sets.
+ * @async_ctx_num: The ctx numbers which are used for async mode for each
+ * ctx sets.
+ */
+struct wd_ctx_nums {
+	__u32 sync_ctx_num;
+	__u32 async_ctx_num;
+};
+
+/**
+ * struct wd_ctx_params - Define the ctx sets params which are used for init
+ * algorithms.
+ * @op_type_num: Used for index of ctx_set_num, the order is the same as
+ * wd_<alg>_op_type.
+ * @ctx_set_num: Each operation type ctx sets numbers.
+ * @bmp: Ctxs distribution. Means users want to run business process on these
+ * numa or request ctx from devices located in these numa.
+ */
+struct wd_ctx_params {
+	__u32 op_type_num;
+	struct wd_ctx_nums *ctx_set_num;
+	struct bitmask *bmp;
+};
+
 struct wd_ctx_internal {
 	handle_t ctx;
 	__u8 op_type;
diff --git a/include/wd_comp.h b/include/wd_comp.h
index e043a83..13a3e6a 100644
--- a/include/wd_comp.h
+++ b/include/wd_comp.h
@@ -7,6 +7,7 @@
 #ifndef __WD_COMP_H
 #define __WD_COMP_H
+#include <numa.h>
 #include "wd.h"
 #include "wd_alg_common.h"

@@ -113,6 +114,35 @@ int wd_comp_init(struct wd_ctx_config *config, struct wd_sched *sched);
  */
 void wd_comp_uninit(void);

+/**
+ * wd_comp_init2_() - A simplify interface to initializate uadk
+ * compression/decompression. This interface keeps most functions of
+ * wd_comp_init(). Users just need to descripe the deployment of
+ * business scenarios. Then the initialization will request appropriate
+ * resources to support the business scenarios.
+ * To make the initializate simpler, ctx_params support set NULL.
+ * And then the function will set them as default.
+ * Please do not use this interface with wd_comp_init() together, or
+ * some resources may be leak.
+ *
+ * @alg: The algorithm users want to use.
+ * @sched_type: The scheduling type users want to use.
+ * @task_tp: Reserved.
+ * @ctx_params: The ctxs resources users want to use. Include per operation
+ * type ctx numbers and business process run numa.
+ *
+ * Return 0 if succeed and others if fail.
+ */
+int wd_comp_init2_(char *alg, __u32 sched_type, int task_tp, struct wd_ctx_params *ctx_params);
+
+#define wd_comp_init2(alg, sched_type, task_tp) \
+	wd_comp_init2_(alg, sched_type, task_tp, NULL)
+
+/**
+ * wd_comp_uninit2() - Uninitialise ctx configuration and scheduler.
+ */
+void wd_comp_uninit2(void);
+
 struct wd_comp_sess_setup {
 	enum wd_comp_alg_type alg_type; /* Denoted by enum wd_comp_alg_type */
 	enum wd_comp_level comp_lv; /* Denoted by enum wd_comp_level */
diff --git a/include/wd_util.h b/include/wd_util.h
index cd0e112..a51a35d 100644
--- a/include/wd_util.h
+++ b/include/wd_util.h
@@ -7,6 +7,7 @@
 #ifndef __WD_UTIL_H
 #define __WD_UTIL_H
+#include <numa.h>
 #include <stdbool.h>
 #include <sys/ipc.h>
 #include <sys/shm.h>
@@ -112,6 +113,14 @@ struct wd_msg_handle {
 	int (*recv)(handle_t sess, void *msg);
 };

+struct wd_init_attrs {
+	__u32 sched_type;
+	char *alg;
+	struct wd_sched *sched;
+	struct wd_ctx_params *ctx_params;
+	struct wd_ctx_config *ctx_config;
+};
+
 /*
  * wd_init_ctx_config() - Init internal ctx configuration.
  * @in: ctx configuration in global setting.
@@ -404,6 +413,15 @@ static inline void wd_alg_clear_init(enum wd_status *status)
 	__atomic_store(status, &setting, __ATOMIC_RELAXED);
 }

+/**
+ * wd_alg_pre_init() - Request the ctxs and initialize the sched_domain
+ * with the given devices list, ctxs number and numa mask.
+ * @attrs: the algorithm initialization parameters.
+ *
+ * Return device if succeed and other error number if fail.
+ */
+int wd_alg_pre_init(struct wd_init_attrs *attrs);
+
 /**
  * wd_dfx_msg_cnt() - Message counter interface for ctx
  * @msg: Shared memory addr.
diff --git a/wd.c b/wd.c
index 78094d8..9eb69d2 100644
--- a/wd.c
+++ b/wd.c
@@ -727,6 +727,35 @@ free_list:
 	return NULL;
 }
+struct uacce_dev *wd_find_dev_by_numa(struct uacce_dev_list *list, int numa_id)
+{
+	struct uacce_dev *dev = WD_ERR_PTR(-WD_ENODEV);
+	struct uacce_dev_list *p = list;
+	int ctx_num, ctx_max = 0;
+
+	if (!list) {
+		WD_ERR("invalid: list is NULL!\n");
+		return WD_ERR_PTR(-WD_EINVAL);
+	}
+
+	while (p) {
+		if (numa_id != p->dev->numa_id) {
+			p = p->next;
+			continue;
+		}
+
+		ctx_num = wd_get_avail_ctx(p->dev);
+		if (ctx_num > ctx_max) {
+			dev = p->dev;
+			ctx_max = ctx_num;
+		}
+
+		p = p->next;
+	}
+
+	return dev;
+}
+
 void wd_free_list_accels(struct uacce_dev_list *list)
 {
 	struct uacce_dev_list *curr, *next;
@@ -793,6 +822,39 @@ int wd_ctx_set_io_cmd(handle_t h_ctx, unsigned long cmd, void *arg)
 	return ioctl(ctx->fd, cmd, arg);
 }

+struct bitmask *wd_create_device_nodemask(struct uacce_dev_list *list)
+{
+	struct uacce_dev_list *p;
+	struct bitmask *bmp;
+
+	if (!list) {
+		WD_ERR("invalid: list is NULL!\n");
+		return WD_ERR_PTR(-WD_EINVAL);
+	}
+
+	bmp = numa_allocate_nodemask();
+	if (!bmp) {
+		WD_ERR("failed to alloc bitmask(%d)!\n", errno);
+		return WD_ERR_PTR(-WD_ENOMEM);
+	}
+
+	p = list;
+	while (p) {
+		numa_bitmask_setbit(bmp, p->dev->numa_id);
+		p = p->next;
+	}
+
+	return bmp;
+}
+
+void wd_free_device_nodemask(struct bitmask *bmp)
+{
+	if (!bmp)
+		return;
+
+	numa_free_nodemask(bmp);
+}
+
 void wd_get_version(void)
 {
 	const char *wd_released_time = UADK_RELEASED_TIME;
diff --git a/wd_comp.c b/wd_comp.c
index 44593a6..487fd02 100644
--- a/wd_comp.c
+++ b/wd_comp.c
@@ -14,6 +14,7 @@
 #include "config.h"
 #include "drv/wd_comp_drv.h"
+#include "wd_sched.h"
 #include "wd_util.h"
 #include "wd_comp.h"

@@ -21,6 +22,8 @@
 #define HW_CTX_SIZE (64 * 1024)
 #define STREAM_CHUNK (128 * 1024)

+#define SCHED_RR_NAME "sched_rr"
+
 #define swap_byte(x) \
 	((((x) & 0x000000ff) << 24) | \
 	(((x) & 0x0000ff00) << 8) | \
@@ -42,6 +45,7 @@ struct wd_comp_sess {

 struct wd_comp_setting {
 	enum wd_status status;
+	enum wd_status status2;
 	struct wd_ctx_config_internal config;
 	struct wd_sched sched;
 	struct wd_comp_driver *driver;
@@ -52,6 +56,20 @@ struct wd_comp_setting {

 struct wd_env_config wd_comp_env_config;

+static struct wd_init_attrs wd_comp_init_attrs;
+static struct wd_ctx_config wd_comp_ctx;
+static struct wd_sched *wd_comp_sched;
+
+static struct wd_ctx_nums wd_comp_ctx_num[] = {
+	{1, 1}, {1, 1}, {}
+};
+
+static struct wd_ctx_params wd_comp_ctx_params = {
+	.op_type_num = WD_DIR_MAX,
+	.ctx_set_num = wd_comp_ctx_num,
+	.bmp = NULL,
+};
+
 #ifdef WD_STATIC_DRV
 static void wd_comp_set_static_drv(void)
 {
@@ -178,6 +196,80 @@ void wd_comp_uninit(void)
 	wd_alg_clear_init(&wd_comp_setting.status);
 }
+int wd_comp_init2_(char *alg, __u32 sched_type, int task_tp, struct wd_ctx_params *ctx_params)
+{
+	enum wd_status status;
+	bool flag;
+	int ret;
+
+	wd_alg_get_init(&wd_comp_setting.status, &status);
+	if (status == WD_INIT) {
+		WD_INFO("UADK comp has been initialized with wd_comp_init()!\n");
+		return 0;
+	}
+
+	flag = wd_alg_try_init(&wd_comp_setting.status2);
+	if (!flag)
+		return 0;
+
+	if (!alg) {
+		WD_ERR("invalid: alg is NULL!\n");
+		ret = -WD_EINVAL;
+		goto out_uninit;
+	}
+
+	wd_comp_init_attrs.alg = alg;
+	wd_comp_init_attrs.sched_type = sched_type;
+
+	wd_comp_init_attrs.ctx_params = ctx_params ? ctx_params : &wd_comp_ctx_params;
+	wd_comp_init_attrs.ctx_config = &wd_comp_ctx;
+
+	wd_comp_sched = wd_sched_rr_alloc(sched_type, wd_comp_init_attrs.ctx_params->op_type_num,
+					  numa_max_node() + 1, wd_comp_poll_ctx);
+	if (!wd_comp_sched) {
+		ret = -WD_EINVAL;
+		goto out_uninit;
+	}
+	wd_comp_sched->name = SCHED_RR_NAME;
+	wd_comp_init_attrs.sched = wd_comp_sched;
+
+	ret = wd_alg_pre_init(&wd_comp_init_attrs);
+	if (ret)
+		goto out_freesched;
+
+	ret = wd_comp_init(&wd_comp_ctx, wd_comp_sched);
+	if (ret)
+		goto out_freesched;
+
+	wd_alg_set_init(&wd_comp_setting.status2);
+
+	return 0;
+
+out_freesched:
+	wd_sched_rr_release(wd_comp_sched);
+
+out_uninit:
+	wd_alg_clear_init(&wd_comp_setting.status2);
+
+	return ret;
+}
+
+void wd_comp_uninit2(void)
+{
+	int i;
+
+	wd_comp_uninit();
+
+	for (i = 0; i < wd_comp_ctx.ctx_num; i++)
+		if (wd_comp_ctx.ctxs[i].ctx) {
+			wd_release_ctx(wd_comp_ctx.ctxs[i].ctx);
+			wd_comp_ctx.ctxs[i].ctx = 0;
+		}
+
+	wd_sched_rr_release(wd_comp_sched);
+	wd_alg_clear_init(&wd_comp_setting.status2);
+}
+
 struct wd_comp_msg *wd_comp_get_msg(__u32 idx, __u32 tag)
 {
 	return wd_find_msg_in_pool(&wd_comp_setting.pool, idx, tag);
@@ -289,6 +381,7 @@ handle_t wd_comp_alloc_sess(struct wd_comp_sess_setup *setup)
 	sess->comp_lv = setup->comp_lv;
 	sess->win_sz = setup->win_sz;
 	sess->stream_pos = WD_COMP_STREAM_NEW;
+	/* Some simple scheduler don't need scheduling parameters */
 	sess->sched_key = (void *)wd_comp_setting.sched.sched_init(
 		wd_comp_setting.sched.h_sched_ctx, setup->sched_param);
@@ -318,6 +411,7 @@ void wd_comp_free_sess(handle_t h_sess)

 	if (sess->sched_key)
 		free(sess->sched_key);
+
 	free(sess);
 }

diff --git a/wd_util.c b/wd_util.c
index fa77b46..8fed8ac 100644
--- a/wd_util.c
+++ b/wd_util.c
@@ -5,7 +5,6 @@
  */
 #define _GNU_SOURCE
-#include <numa.h>
 #include <pthread.h>
 #include <semaphore.h>
 #include <string.h>
@@ -1801,3 +1800,238 @@ bool wd_alg_try_init(enum wd_status *status)

 	return true;
 }
+
+static __u32 wd_get_ctx_numbers(struct wd_ctx_params ctx_params, int end)
+{
+	__u32 count = 0;
+	int i;
+
+	for (i = 0; i < end; i++) {
+		count += ctx_params.ctx_set_num[i].sync_ctx_num;
+		count += ctx_params.ctx_set_num[i].async_ctx_num;
+	}
+
+	return count;
+}
+
+struct uacce_dev_list *wd_get_usable_list(struct uacce_dev_list *list, struct bitmask *bmp)
+{
+	struct uacce_dev_list *p, *node, *result = NULL;
+	struct uacce_dev *dev;
+	int numa_id, ret;
+
+	if (!bmp) {
+		WD_ERR("invalid: bmp is NULL!\n");
+		return WD_ERR_PTR(-WD_EINVAL);
+	}
+
+	p = list;
+	while (p) {
+		dev = p->dev;
+		numa_id = dev->numa_id;
+		ret = numa_bitmask_isbitset(bmp, numa_id);
+		if (!ret) {
+			p = p->next;
+			continue;
+		}
+
+		node = calloc(1, sizeof(*node));
+		if (!node) {
+			result = WD_ERR_PTR(-WD_ENOMEM);
+			goto out_free_list;
+		}
+
+		node->dev = wd_clone_dev(dev);
+		if (!node->dev) {
+			result = WD_ERR_PTR(-WD_ENOMEM);
+			goto out_free_node;
+		}
+
+		if (!result)
+			result = node;
+		else
+			wd_add_dev_to_list(result, node);
+
+		p = p->next;
+	}
+
+	return result ? result : WD_ERR_PTR(-WD_ENODEV);
+
+out_free_node:
+	free(node);
+out_free_list:
+	wd_free_list_accels(result);
+	return result;
+}
+
+static int wd_init_ctx_set(struct wd_init_attrs *attrs, struct uacce_dev_list *list,
+			   int idx, int numa_id, int op_type)
+{
+	struct wd_ctx_nums ctx_nums = attrs->ctx_params->ctx_set_num[op_type];
+	__u32 ctx_set_num = ctx_nums.sync_ctx_num + ctx_nums.async_ctx_num;
+	struct wd_ctx_config *ctx_config = attrs->ctx_config;
+	struct uacce_dev *dev;
+	int i;
+
+	dev = wd_find_dev_by_numa(list, numa_id);
+	if (WD_IS_ERR(dev))
+		return WD_PTR_ERR(dev);
+
+	for (i = idx; i < idx + ctx_set_num; i++) {
+		ctx_config->ctxs[i].ctx = wd_request_ctx(dev);
+		if (errno == WD_EBUSY) {
+			dev = wd_find_dev_by_numa(list, numa_id);
+			if (WD_IS_ERR(dev))
+				return WD_PTR_ERR(dev);
+			i--;
+		}
+		ctx_config->ctxs[i].op_type = op_type;
+		ctx_config->ctxs[i].ctx_mode =
+			((i - idx) < ctx_nums.sync_ctx_num) ?
+			CTX_MODE_SYNC : CTX_MODE_ASYNC;
+	}
+
+	return 0;
+}
+
+static void wd_release_ctx_set(struct wd_ctx_config *ctx_config)
+{
+	int i;
+
+	for (i = 0; i < ctx_config->ctx_num; i++)
+		if (ctx_config->ctxs[i].ctx) {
+			wd_release_ctx(ctx_config->ctxs[i].ctx);
+			ctx_config->ctxs[i].ctx = 0;
+		}
+}
+
+static int wd_instance_sched_set(struct wd_sched *sched, struct wd_ctx_nums ctx_nums,
+				 int idx, int numa_id, int op_type)
+{
+	struct sched_params sparams;
+	int i, ret = 0;
+
+	for (i = 0; i < CTX_MODE_MAX; i++) {
+		sparams.numa_id = numa_id;
+		sparams.type = op_type;
+		sparams.mode = i;
+		sparams.begin = idx + ctx_nums.sync_ctx_num * i;
+		sparams.end = idx - 1 + ctx_nums.sync_ctx_num + ctx_nums.async_ctx_num * i;
+		if (sparams.begin > sparams.end)
+			continue;
+		ret = wd_sched_rr_instance(sched, &sparams);
+		if (ret)
+			goto out;
+	}
+
+out:
+	return ret;
+}
+
+static int wd_init_ctx_and_sched(struct wd_init_attrs *attrs, struct bitmask *bmp,
+				 struct uacce_dev_list *list)
+{
+	struct wd_ctx_params *ctx_params = attrs->ctx_params;
+	__u32 op_type_num = ctx_params->op_type_num;
+	int max_node = numa_max_node() + 1;
+	struct wd_ctx_nums ctx_nums;
+	int i, j, ret;
+	int idx = 0;
+
+	for (i = 0; i < max_node; i++) {
+		if (!numa_bitmask_isbitset(bmp, i))
+			continue;
+		for (j = 0; j < op_type_num; j++) {
+			ctx_nums = ctx_params->ctx_set_num[j];
+			ret = wd_init_ctx_set(attrs, list, idx, i, j);
+			if (ret)
+				goto free_ctxs;
+			ret = wd_instance_sched_set(attrs->sched, ctx_nums, idx, i, j);
+			if (ret)
+				goto free_ctxs;
+			idx += (ctx_nums.sync_ctx_num + ctx_nums.async_ctx_num);
+		}
+	}
+
+	return 0;
+
+free_ctxs:
+	wd_release_ctx_set(attrs->ctx_config);
+
+	return ret;
+}
+
+int wd_alg_pre_init(struct wd_init_attrs *attrs)
+{
+	struct wd_ctx_config *ctx_config = attrs->ctx_config;
+	struct wd_ctx_params *ctx_params = attrs->ctx_params;
+	struct bitmask *used_bmp, *bmp = ctx_params->bmp;
+	struct uacce_dev_list *list, *used_list = NULL;
+	__u32 ctx_set_num, op_type_num;
+	int numa_cnt, ret;
+
+	list = wd_get_accel_list(attrs->alg);
+	if (!list) {
+		WD_ERR("failed to get devices!\n");
+		return -WD_ENODEV;
+	}
+
+	op_type_num = ctx_params->op_type_num;
+	ctx_set_num = wd_get_ctx_numbers(*ctx_params, op_type_num);
+	if (!ctx_set_num || !op_type_num) {
+		WD_ERR("invalid: ctx_set_num is %d, op_type_num is %d!\n",
+		       ctx_set_num, op_type_num);
+		ret = -WD_EINVAL;
+		goto out_freelist;
+	}
+
+	/*
+	 * Not every numa has a device. Therefore, the first thing is to
+	 * filter the devices in the selected numa node, and the second
+	 * thing is to obtain the distribution of devices.
+	 */
+	if (bmp) {
+		used_list = wd_get_usable_list(list, bmp);
+		if (WD_IS_ERR(used_list)) {
+			ret = WD_PTR_ERR(used_list);
+			WD_ERR("failed to get usable devices(%d)!\n", ret);
+			goto out_freelist;
+		}
+	}
+
+	used_bmp = wd_create_device_nodemask(used_list ? used_list : list);
+	if (WD_IS_ERR(used_bmp)) {
+		ret = WD_PTR_ERR(used_bmp);
+		goto out_freeusedlist;
+	}
+
+	numa_cnt = numa_bitmask_weight(used_bmp);
+	if (!numa_cnt) {
+		ret = numa_cnt;
+		WD_ERR("invalid: bmp is clear!\n");
+		goto out_freenodemask;
+	}
+
+	ctx_config->ctx_num = ctx_set_num * numa_cnt;
+	ctx_config->ctxs = calloc(ctx_config->ctx_num, sizeof(struct wd_ctx));
+	if (!ctx_config->ctxs) {
+		ret = -WD_ENOMEM;
+		WD_ERR("failed to alloc ctxs!\n");
+		goto out_freenodemask;
+	}
+
+	ret = wd_init_ctx_and_sched(attrs, used_bmp, used_list ? used_list : list);
+	if (ret)
+		free(ctx_config->ctxs);
+
+out_freenodemask:
+	wd_free_device_nodemask(used_bmp);
+
+out_freeusedlist:
+	wd_free_list_accels(used_list);
+
+out_freelist:
+	wd_free_list_accels(list);
+
+	return ret;
+}
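As a reading aid for wd_init_ctx_and_sched() and wd_instance_sched_set() above, here is a minimal, self-contained sketch of how the ctx array appears to be sliced: NUMA nodes are walked first, then operation types, and each (node, type) pair gets a sync range followed by an async range. Everything prefixed with demo_ is hypothetical and not part of uadk. The sketch also assumes every operation type uses the same sync/async counts, and it uses plain int for begin/end so that an empty range shows up as begin > end; the field types of the real struct sched_params are not shown in this patch.

```c
#include <stdio.h>

/* Inclusive ctx index range; it is empty when begin > end. */
struct demo_range {
	int begin;
	int end;
};

/* mode 0 is sync and mode 1 is async, as in the loop over CTX_MODE_MAX. */
static struct demo_range demo_mode_range(int idx, int sync_num, int async_num, int mode)
{
	struct demo_range r;

	r.begin = idx + sync_num * mode;
	r.end = idx - 1 + sync_num + async_num * mode;

	return r;
}

/* Walk nodes, then op types, printing the slice each (node, type, mode) gets. */
static int demo_print_layout(int node_num, int op_type_num, int sync_num, int async_num)
{
	int node, type, mode, idx = 0;

	for (node = 0; node < node_num; node++) {
		for (type = 0; type < op_type_num; type++) {
			for (mode = 0; mode < 2; mode++) {
				struct demo_range r = demo_mode_range(idx, sync_num, async_num, mode);

				if (r.begin > r.end)
					continue;	/* no ctx requested for this mode */
				printf("node %d type %d mode %d: [%d, %d]\n",
				       node, type, mode, r.begin, r.end);
			}
			idx += sync_num + async_num;
		}
	}

	return idx;	/* total number of ctx slots laid out */
}
```

Calling demo_print_layout(2, 2, 1, 1) from a small main() lists the eight one-element ranges that the default {1, 1} configuration would produce on two nodes with two operation types.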
On 2022/10/29 18:19, Yang Shen wrote:
For performance reasons, uadk leaves many configuration options to users. This gives users great flexibility, but it also makes the current initialization interface highly complex. Therefore, a new set of interfaces is provided to help users adapt quickly.
The 'wd_alg_init2_()' will complete all initialization steps. There are 4 parameters to describe the user configuration requirements. @alg: The algorithm users want to use. @sched_type: The scheduling type users want to use. @task_sp: Reserved.
task_type? Does it mean synchronous or asynchronous? I suggest explaining it.
@ctx_params: The ctxs resources users want to use. Include per operation type ctx numbers and business process run numa.
If users think 'wd_alg_init2_()' is too complex, wd_alg_init2() is a simplified wrapper that uses the default values of numa_bitmask and ctx_nums.
Signed-off-by: Yang Shen shenyang39@huawei.com
 Makefile.am             |   4 +-
 include/wd.h            |  24 ++++
 include/wd_alg_common.h |  27 +++++
 include/wd_comp.h       |  30 +++++
 include/wd_util.h       |  18 +++
 wd.c                    |  62 +++++++++++
 wd_comp.c               |  94 ++++++++++++++++
 wd_util.c               | 236 +++++++++++++++++++++++++++++++++++++++-
 8 files changed, 492 insertions(+), 3 deletions(-)
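To make the @ctx_params description in the commit message concrete, the sketch below models how many ctxs a given configuration would ask for: the sync and async counts are summed over all operation types, and that per-node total is multiplied by the number of selected NUMA nodes. The demo_ types only mirror struct wd_ctx_nums and struct wd_ctx_params and are hypothetical; a plain unsigned long stands in for libnuma's struct bitmask so the example compiles without libnuma, and it assumes every selected node actually hosts a device.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors struct wd_ctx_nums from the patch. */
struct demo_ctx_nums {
	uint32_t sync_ctx_num;
	uint32_t async_ctx_num;
};

/* Mirrors struct wd_ctx_params; node_mask stands in for struct bitmask *bmp. */
struct demo_ctx_params {
	uint32_t op_type_num;
	const struct demo_ctx_nums *ctx_set_num;
	unsigned long node_mask;
};

/* Ctxs requested on one NUMA node: sync + async summed over all op types. */
static uint32_t demo_ctx_per_node(const struct demo_ctx_params *p)
{
	uint32_t i, count = 0;

	for (i = 0; i < p->op_type_num; i++)
		count += p->ctx_set_num[i].sync_ctx_num + p->ctx_set_num[i].async_ctx_num;

	return count;
}

/* Number of NUMA nodes selected in the mask (one bit cleared per step). */
static uint32_t demo_node_count(unsigned long mask)
{
	uint32_t n = 0;

	for (; mask; mask &= mask - 1)
		n++;

	return n;
}

/* Size of the ctx array the initialization would have to allocate. */
static uint32_t demo_total_ctx_num(const struct demo_ctx_params *p)
{
	return demo_ctx_per_node(p) * demo_node_count(p->node_mask);
}
```

For example, two operation types configured as {1, 1} and {2, 0} on nodes 0 and 2 (mask 0x5) give 4 ctxs per node and 8 in total.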
On 2022/10/31 21:25, fanghao (A) wrote:
On 2022/10/29 18:19, Yang Shen wrote:
Due to performance, uadk tries to leave many configuration options to users. This gives users great flexibility, but it also leads to a problem that the current initialization interface has high complexity. Therefore, in order to facilitate users to adapt quickly, a new set of interfaces is provided.
The 'wd_alg_init2_()' will complete all initialization steps. There are 4 parameters to describe the user configuration requirements.
@alg: The algorithm users want to use.
@sched_type: The scheduling type users want to use.
@task_tp: Reserved.
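task_type? Is it sync or async? I suggest explaining it.
This interface is reserved for Longfang's switch-to-software-computing work. The current code does not use it; it is added in advance so that the interface does not have to change later. A description will be added when that feature is enabled.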
@ctx_params: The ctx resources users want to use. It includes the number of ctxs per operation type and the NUMA nodes on which the business process runs.
If 'wd_alg_init2_()' is more than users need, 'wd_alg_init2()' is a simplified wrapper that uses the default values of numa_bitmask and ctx_nums.
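To make the role of @ctx_params concrete, here is a minimal sketch of how the per-operation-type counts add up to the number of ctxs one initialization would request. It mirrors struct wd_ctx_nums and struct wd_ctx_params with plain C types; the sketch_ names, the use of uint32_t in place of __u32, and the omission of the bitmask are assumptions made so that the block compiles on its own. It is not the code from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for struct wd_ctx_nums / struct wd_ctx_params (assumed layout). */
struct sketch_ctx_nums {
	uint32_t sync_ctx_num;
	uint32_t async_ctx_num;
};

struct sketch_ctx_params {
	uint32_t op_type_num;
	const struct sketch_ctx_nums *ctx_set_num;
};

/* Sum of sync and async ctxs over all operation types, for one NUMA node. */
static uint32_t sketch_ctxs_per_node(const struct sketch_ctx_params *p)
{
	uint32_t count = 0;
	uint32_t i;

	for (i = 0; i < p->op_type_num; i++)
		count += p->ctx_set_num[i].sync_ctx_num +
			 p->ctx_set_num[i].async_ctx_num;

	return count;
}

/* Total ctxs requested when every selected node gets the same ctx sets. */
static uint32_t sketch_total_ctxs(const struct sketch_ctx_params *p,
				  uint32_t numa_cnt)
{
	return sketch_ctxs_per_node(p) * numa_cnt;
}
```

With two operation types of one sync and one async ctx each, every node needs four ctxs, so two nodes with devices would need eight.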
Signed-off-by: Yang Shen <shenyang39@huawei.com>
---
 Makefile.am             |   4 +-
 include/wd.h            |  24 ++++
 include/wd_alg_common.h |  27 +++++
 include/wd_comp.h       |  30 +++++
 include/wd_util.h       |  18 +++
 wd.c                    |  62 +++++++++++
 wd_comp.c               |  94 ++++++++++++++++
 wd_util.c               | 236 +++++++++++++++++++++++++++++++++++++++-
 8 files changed, 492 insertions(+), 3 deletions(-)
diff --git a/Makefile.am b/Makefile.am
index 457af43..c5637e5 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -87,7 +87,7 @@ AM_CFLAGS += -DWD_NO_LOG
 
 libwd_la_LIBADD = $(libwd_la_OBJECTS) -lnuma
 
-libwd_comp_la_LIBADD = $(libwd_la_OBJECTS) -ldl
+libwd_comp_la_LIBADD = $(libwd_la_OBJECTS) -ldl -lnuma
 libwd_comp_la_DEPENDENCIES = libwd.la
 
 libhisi_zip_la_LIBADD = -ldl
@@ -104,7 +104,7 @@ else
 libwd_la_LDFLAGS=$(UADK_VERSION)
 libwd_la_LIBADD= -lnuma
 
-libwd_comp_la_LIBADD= -lwd -ldl
+libwd_comp_la_LIBADD= -lwd -ldl -lnuma
 libwd_comp_la_LDFLAGS=$(UADK_VERSION)
 libwd_comp_la_DEPENDENCIES= libwd.la
 
diff --git a/include/wd.h b/include/wd.h
index e1a87de..facd992 100644
--- a/include/wd.h
+++ b/include/wd.h
@@ -348,6 +348,16 @@ int wd_get_avail_ctx(struct uacce_dev *dev);
  */
 struct uacce_dev_list *wd_get_accel_list(const char *alg_name);
 
+/**
+ * wd_find_dev_by_numa() - get device with max available ctx number from an
+ *			   device list according to numa id.
+ * @list: The device list.
+ * @numa_id: The numa_id.
+ *
+ * Return device if succeed and other error number if fail.
+ */
+struct uacce_dev *wd_find_dev_by_numa(struct uacce_dev_list *list, int numa_id);
+
 /**
  * wd_get_accel_dev() - Get device supporting the algorithm with
  *			smallest numa distance to current numa node.
@@ -523,6 +533,20 @@ struct uacce_dev *wd_clone_dev(struct uacce_dev *dev);
  */
 void wd_add_dev_to_list(struct uacce_dev_list *head, struct uacce_dev_list *node);
 
+/**
+ * wd_create_device_nodemask() - create a numa node mask of device list.
+ * @list: The devices list.
+ *
+ * Return a pointer value if succeed, and error number if fail.
+ */
+struct bitmask *wd_create_device_nodemask(struct uacce_dev_list *list);
+
+/**
+ * wd_free_device_nodemask() - free a numa node mask.
+ * @bmp: A numa node mask.
+ */
+void wd_free_device_nodemask(struct bitmask *bmp);
+
 /**
  * wd_ctx_get_dev_name() - Get the device name about task.
  * @h_ctx: The handle of context.
diff --git a/include/wd_alg_common.h b/include/wd_alg_common.h
index c455dc3..96e908f 100644
--- a/include/wd_alg_common.h
+++ b/include/wd_alg_common.h
@@ -63,6 +63,33 @@ struct wd_ctx_config {
 	void *priv;
 };
 
+/**
+ * struct wd_ctx_nums - Define the ctx sets numbers.
+ * @sync_ctx_num: The ctx numbers which are used for sync mode for each
+ * ctx sets.
+ * @async_ctx_num: The ctx numbers which are used for async mode for each
+ * ctx sets.
+ */
+struct wd_ctx_nums {
+	__u32 sync_ctx_num;
+	__u32 async_ctx_num;
+};
+
+/**
+ * struct wd_ctx_params - Define the ctx sets params which are used for init
+ * algorithms.
+ * @op_type_num: Used for index of ctx_set_num, the order is the same as
+ * wd_<alg>_op_type.
+ * @ctx_set_num: Each operation type ctx sets numbers.
+ * @bmp: Ctxs distribution. Means users want to run business process on these
+ * numa or request ctx from devices located in these numa.
+ */
+struct wd_ctx_params {
+	__u32 op_type_num;
+	struct wd_ctx_nums *ctx_set_num;
+	struct bitmask *bmp;
+};
+
 struct wd_ctx_internal {
 	handle_t ctx;
 	__u8 op_type;
diff --git a/include/wd_comp.h b/include/wd_comp.h
index e043a83..13a3e6a 100644
--- a/include/wd_comp.h
+++ b/include/wd_comp.h
@@ -7,6 +7,7 @@
 #ifndef __WD_COMP_H
 #define __WD_COMP_H
 
+#include <numa.h>
 #include "wd.h"
 #include "wd_alg_common.h"
 
@@ -113,6 +114,35 @@ int wd_comp_init(struct wd_ctx_config *config, struct wd_sched *sched);
  */
 void wd_comp_uninit(void);
 
+/**
+ * wd_comp_init2_() - A simplify interface to initializate uadk
+ * compression/decompression. This interface keeps most functions of
+ * wd_comp_init(). Users just need to descripe the deployment of
+ * business scenarios. Then the initialization will request appropriate
+ * resources to support the business scenarios.
+ * To make the initializate simpler, ctx_params support set NULL.
+ * And then the function will set them as default.
+ * Please do not use this interface with wd_comp_init() together, or
+ * some resources may be leak.
+ *
+ * @alg: The algorithm users want to use.
+ * @sched_type: The scheduling type users want to use.
+ * @task_tp: Reserved.
+ * @ctx_params: The ctxs resources users want to use. Include per operation
+ * type ctx numbers and business process run numa.
+ *
+ * Return 0 if succeed and others if fail.
+ */
+int wd_comp_init2_(char *alg, __u32 sched_type, int task_tp, struct wd_ctx_params *ctx_params);
+
+#define wd_comp_init2(alg, sched_type, task_tp) \
+	wd_comp_init2_(alg, sched_type, task_tp, NULL)
+
+/**
+ * wd_comp_uninit2() - Uninitialise ctx configuration and scheduler.
+ */
+void wd_comp_uninit2(void);
+
 struct wd_comp_sess_setup {
 	enum wd_comp_alg_type alg_type; /* Denoted by enum wd_comp_alg_type */
 	enum wd_comp_level comp_lv; /* Denoted by enum wd_comp_level */
diff --git a/include/wd_util.h b/include/wd_util.h
index cd0e112..a51a35d 100644
--- a/include/wd_util.h
+++ b/include/wd_util.h
@@ -7,6 +7,7 @@
 #ifndef __WD_UTIL_H
 #define __WD_UTIL_H
 
+#include <numa.h>
 #include <stdbool.h>
 #include <sys/ipc.h>
 #include <sys/shm.h>
@@ -112,6 +113,14 @@ struct wd_msg_handle {
 	int (*recv)(handle_t sess, void *msg);
 };
 
+struct wd_init_attrs {
+	__u32 sched_type;
+	char *alg;
+	struct wd_sched *sched;
+	struct wd_ctx_params *ctx_params;
+	struct wd_ctx_config *ctx_config;
+};
+
 /*
  * wd_init_ctx_config() - Init internal ctx configuration.
  * @in: ctx configuration in global setting.
@@ -404,6 +413,15 @@ static inline void wd_alg_clear_init(enum wd_status *status)
 	__atomic_store(status, &setting, __ATOMIC_RELAXED);
 }
 
+/**
+ * wd_alg_pre_init() - Request the ctxs and initialize the sched_domain
+ *		       with the given devices list, ctxs number and numa mask.
+ * @attrs: the algorithm initialization parameters.
+ *
+ * Return device if succeed and other error number if fail.
+ */
+int wd_alg_pre_init(struct wd_init_attrs *attrs);
+
 /**
  * wd_dfx_msg_cnt() - Message counter interface for ctx
  * @msg: Shared memory addr.
diff --git a/wd.c b/wd.c
index 78094d8..9eb69d2 100644
--- a/wd.c
+++ b/wd.c
@@ -727,6 +727,35 @@ free_list:
 	return NULL;
 }
 
+struct uacce_dev *wd_find_dev_by_numa(struct uacce_dev_list *list, int numa_id)
+{
+	struct uacce_dev *dev = WD_ERR_PTR(-WD_ENODEV);
+	struct uacce_dev_list *p = list;
+	int ctx_num, ctx_max = 0;
+
+	if (!list) {
+		WD_ERR("invalid: list is NULL!\n");
+		return WD_ERR_PTR(-WD_EINVAL);
+	}
+
+	while (p) {
+		if (numa_id != p->dev->numa_id) {
+			p = p->next;
+			continue;
+		}
+
+		ctx_num = wd_get_avail_ctx(p->dev);
+		if (ctx_num > ctx_max) {
+			dev = p->dev;
+			ctx_max = ctx_num;
+		}
+
+		p = p->next;
+	}
+
+	return dev;
+}
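
The selection rule in wd_find_dev_by_numa() can be tried out in isolation. The following is a minimal sketch under stated assumptions: struct sketch_dev and struct sketch_dev_list are hypothetical stand-ins for the uadk device types, the avail_ctx field replaces the call to wd_get_avail_ctx(), and NULL is returned where the patch returns an error pointer.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct uacce_dev and struct uacce_dev_list. */
struct sketch_dev {
	int numa_id;
	int avail_ctx;	/* what wd_get_avail_ctx() would report */
};

struct sketch_dev_list {
	struct sketch_dev *dev;
	struct sketch_dev_list *next;
};

/*
 * Return the device on numa_id with the most available ctxs, or NULL when
 * the node has no device with a free ctx.
 */
static struct sketch_dev *sketch_find_dev_by_numa(struct sketch_dev_list *list,
						  int numa_id)
{
	struct sketch_dev *best = NULL;
	struct sketch_dev_list *p;
	int ctx_max = 0;

	for (p = list; p; p = p->next) {
		if (p->dev->numa_id != numa_id)
			continue;
		/* Strictly greater: a device with zero free ctxs never wins. */
		if (p->dev->avail_ctx > ctx_max) {
			best = p->dev;
			ctx_max = p->dev->avail_ctx;
		}
	}

	return best;
}
```

Because the comparison is strict and ctx_max starts at 0, a node whose devices are all exhausted yields no device, which is the case the caller has to treat as an error.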
+
 void wd_free_list_accels(struct uacce_dev_list *list)
 {
 	struct uacce_dev_list *curr, *next;
@@ -793,6 +822,39 @@ int wd_ctx_set_io_cmd(handle_t h_ctx, unsigned long cmd, void *arg)
 	return ioctl(ctx->fd, cmd, arg);
 }
 
+struct bitmask *wd_create_device_nodemask(struct uacce_dev_list *list)
+{
+	struct uacce_dev_list *p;
+	struct bitmask *bmp;
+
+	if (!list) {
+		WD_ERR("invalid: list is NULL!\n");
+		return WD_ERR_PTR(-WD_EINVAL);
+	}
+
+	bmp = numa_allocate_nodemask();
+	if (!bmp) {
+		WD_ERR("failed to alloc bitmask(%d)!\n", errno);
+		return WD_ERR_PTR(-WD_ENOMEM);
+	}
+
+	p = list;
+	while (p) {
+		numa_bitmask_setbit(bmp, p->dev->numa_id);
+		p = p->next;
+	}
+
+	return bmp;
+}
+
+void wd_free_device_nodemask(struct bitmask *bmp)
+{
+	if (!bmp)
+		return;
+
+	numa_free_nodemask(bmp);
+}
+
 void wd_get_version(void)
 {
 	const char *wd_released_time = UADK_RELEASED_TIME;
diff --git a/wd_comp.c b/wd_comp.c
index 44593a6..487fd02 100644
--- a/wd_comp.c
+++ b/wd_comp.c
@@ -14,6 +14,7 @@
 
 #include "config.h"
 #include "drv/wd_comp_drv.h"
+#include "wd_sched.h"
 #include "wd_util.h"
 #include "wd_comp.h"
 
@@ -21,6 +22,8 @@
 #define HW_CTX_SIZE (64 * 1024)
 #define STREAM_CHUNK (128 * 1024)
 
+#define SCHED_RR_NAME "sched_rr"
+
 #define swap_byte(x) \
 	((((x) & 0x000000ff) << 24) | \
 	(((x) & 0x0000ff00) << 8) | \
@@ -42,6 +45,7 @@ struct wd_comp_sess {
 
 struct wd_comp_setting {
 	enum wd_status status;
+	enum wd_status status2;
 	struct wd_ctx_config_internal config;
 	struct wd_sched sched;
 	struct wd_comp_driver *driver;
@@ -52,6 +56,20 @@ struct wd_comp_setting {
 
 struct wd_env_config wd_comp_env_config;
 
+static struct wd_init_attrs wd_comp_init_attrs;
+static struct wd_ctx_config wd_comp_ctx;
+static struct wd_sched *wd_comp_sched;
+
+static struct wd_ctx_nums wd_comp_ctx_num[] = {
+	{1, 1}, {1, 1}, {}
+};
+
+static struct wd_ctx_params wd_comp_ctx_params = {
+	.op_type_num = WD_DIR_MAX,
+	.ctx_set_num = wd_comp_ctx_num,
+	.bmp = NULL,
+};
+
 #ifdef WD_STATIC_DRV
 static void wd_comp_set_static_drv(void)
 {
@@ -178,6 +196,80 @@ void wd_comp_uninit(void)
 	wd_alg_clear_init(&wd_comp_setting.status);
 }
 
+int wd_comp_init2_(char *alg, __u32 sched_type, int task_tp, struct wd_ctx_params *ctx_params)
+{
+	enum wd_status status;
+	bool flag;
+	int ret;
+
+	wd_alg_get_init(&wd_comp_setting.status, &status);
+	if (status == WD_INIT) {
+		WD_INFO("UADK comp has been initialized with wd_comp_init()!\n");
+		return 0;
+	}
+
+	flag = wd_alg_try_init(&wd_comp_setting.status2);
+	if (!flag)
+		return 0;
+
+	if (!alg) {
+		WD_ERR("invalid: alg is NULL!\n");
+		ret = -WD_EINVAL;
+		goto out_uninit;
+	}
+
+	wd_comp_init_attrs.alg = alg;
+	wd_comp_init_attrs.sched_type = sched_type;
+
+	wd_comp_init_attrs.ctx_params = ctx_params ? ctx_params : &wd_comp_ctx_params;
+	wd_comp_init_attrs.ctx_config = &wd_comp_ctx;
+
+	wd_comp_sched = wd_sched_rr_alloc(sched_type, wd_comp_init_attrs.ctx_params->op_type_num,
+					  numa_max_node() + 1, wd_comp_poll_ctx);
+	if (!wd_comp_sched) {
+		ret = -WD_EINVAL;
+		goto out_uninit;
+	}
+	wd_comp_sched->name = SCHED_RR_NAME;
+	wd_comp_init_attrs.sched = wd_comp_sched;
+
+	ret = wd_alg_pre_init(&wd_comp_init_attrs);
+	if (ret)
+		goto out_freesched;
+
+	ret = wd_comp_init(&wd_comp_ctx, wd_comp_sched);
+	if (ret)
+		goto out_freesched;
+
+	wd_alg_set_init(&wd_comp_setting.status2);
+
+	return 0;
+
+out_freesched:
+	wd_sched_rr_release(wd_comp_sched);
+
+out_uninit:
+	wd_alg_clear_init(&wd_comp_setting.status2);
+
+	return ret;
+}
+
+void wd_comp_uninit2(void)
+{
+	int i;
+
+	wd_comp_uninit();
+
+	for (i = 0; i < wd_comp_ctx.ctx_num; i++)
+		if (wd_comp_ctx.ctxs[i].ctx) {
+			wd_release_ctx(wd_comp_ctx.ctxs[i].ctx);
+			wd_comp_ctx.ctxs[i].ctx = 0;
+		}
+
+	wd_sched_rr_release(wd_comp_sched);
+	wd_alg_clear_init(&wd_comp_setting.status2);
+}
+
 struct wd_comp_msg *wd_comp_get_msg(__u32 idx, __u32 tag)
 {
 	return wd_find_msg_in_pool(&wd_comp_setting.pool, idx, tag);
@@ -289,6 +381,7 @@ handle_t wd_comp_alloc_sess(struct wd_comp_sess_setup *setup)
 	sess->comp_lv = setup->comp_lv;
 	sess->win_sz = setup->win_sz;
 	sess->stream_pos = WD_COMP_STREAM_NEW;
+
 	/* Some simple scheduler don't need scheduling parameters */
 	sess->sched_key = (void *)wd_comp_setting.sched.sched_init(
 		wd_comp_setting.sched.h_sched_ctx, setup->sched_param);
@@ -318,6 +411,7 @@ void wd_comp_free_sess(handle_t h_sess)
 
 	if (sess->sched_key)
 		free(sess->sched_key);
+
 	free(sess);
 }
 
diff --git a/wd_util.c b/wd_util.c
index fa77b46..8fed8ac 100644
--- a/wd_util.c
+++ b/wd_util.c
@@ -5,7 +5,6 @@
  */
 
 #define _GNU_SOURCE
-#include <numa.h>
 #include <pthread.h>
 #include <semaphore.h>
 #include <string.h>
@@ -1801,3 +1800,238 @@ bool wd_alg_try_init(enum wd_status *status)
 
 	return true;
 }
+
+static __u32 wd_get_ctx_numbers(struct wd_ctx_params ctx_params, int end)
+{
+	__u32 count = 0;
+	int i;
+
+	for (i = 0; i < end; i++) {
+		count += ctx_params.ctx_set_num[i].sync_ctx_num;
+		count += ctx_params.ctx_set_num[i].async_ctx_num;
+	}
+
+	return count;
+}
+
+struct uacce_dev_list *wd_get_usable_list(struct uacce_dev_list *list, struct bitmask *bmp)
+{
+	struct uacce_dev_list *p, *node, *result = NULL;
+	struct uacce_dev *dev;
+	int numa_id, ret;
+
+	if (!bmp) {
+		WD_ERR("invalid: bmp is NULL!\n");
+		return WD_ERR_PTR(-WD_EINVAL);
+	}
+
+	p = list;
+	while (p) {
+		dev = p->dev;
+		numa_id = dev->numa_id;
+		ret = numa_bitmask_isbitset(bmp, numa_id);
+		if (!ret) {
+			p = p->next;
+			continue;
+		}
+
+		node = calloc(1, sizeof(*node));
+		if (!node) {
+			result = WD_ERR_PTR(-WD_ENOMEM);
+			goto out_free_list;
+		}
+
+		node->dev = wd_clone_dev(dev);
+		if (!node->dev) {
+			result = WD_ERR_PTR(-WD_ENOMEM);
+			goto out_free_node;
+		}
+
+		if (!result)
+			result = node;
+		else
+			wd_add_dev_to_list(result, node);
+
+		p = p->next;
+	}
+
+	return result ? result : WD_ERR_PTR(-WD_ENODEV);
+
+out_free_node:
+	free(node);
+out_free_list:
+	wd_free_list_accels(result);
+	return result;
+}
+
+static int wd_init_ctx_set(struct wd_init_attrs *attrs, struct uacce_dev_list *list,
+			   int idx, int numa_id, int op_type)
+{
+	struct wd_ctx_nums ctx_nums = attrs->ctx_params->ctx_set_num[op_type];
+	__u32 ctx_set_num = ctx_nums.sync_ctx_num + ctx_nums.async_ctx_num;
+	struct wd_ctx_config *ctx_config = attrs->ctx_config;
+	struct uacce_dev *dev;
+	int i;
+
+	dev = wd_find_dev_by_numa(list, numa_id);
+	if (WD_IS_ERR(dev))
+		return WD_PTR_ERR(dev);
+
+	for (i = idx; i < idx + ctx_set_num; i++) {
+		ctx_config->ctxs[i].ctx = wd_request_ctx(dev);
+		if (errno == WD_EBUSY) {
+			dev = wd_find_dev_by_numa(list, numa_id);
+			if (WD_IS_ERR(dev))
+				return WD_PTR_ERR(dev);
+			i--;
+		}
+		ctx_config->ctxs[i].op_type = op_type;
+		ctx_config->ctxs[i].ctx_mode =
+			((i - idx) < ctx_nums.sync_ctx_num) ?
+			CTX_MODE_SYNC : CTX_MODE_ASYNC;
+	}
+
+	return 0;
+}
+
+static void wd_release_ctx_set(struct wd_ctx_config *ctx_config)
+{
+	int i;
+
+	for (i = 0; i < ctx_config->ctx_num; i++)
+		if (ctx_config->ctxs[i].ctx) {
+			wd_release_ctx(ctx_config->ctxs[i].ctx);
+			ctx_config->ctxs[i].ctx = 0;
+		}
+}
+
+static int wd_instance_sched_set(struct wd_sched *sched, struct wd_ctx_nums ctx_nums,
+				 int idx, int numa_id, int op_type)
+{
+	struct sched_params sparams;
+	int i, ret = 0;
+
+	for (i = 0; i < CTX_MODE_MAX; i++) {
+		sparams.numa_id = numa_id;
+		sparams.type = op_type;
+		sparams.mode = i;
+		sparams.begin = idx + ctx_nums.sync_ctx_num * i;
+		sparams.end = idx - 1 + ctx_nums.sync_ctx_num + ctx_nums.async_ctx_num * i;
+		if (sparams.begin > sparams.end)
+			continue;
+		ret = wd_sched_rr_instance(sched, &sparams);
+		if (ret)
+			goto out;
+	}
+
+out:
+	return ret;
+}
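
The begin/end arithmetic in wd_instance_sched_set() is compact, so a worked sketch may help. It assumes, as the loop does, that sync mode has value 0 and async mode has value 1, and that the sync ctxs of a set come first. The sketch_ names are hypothetical, and signed int is used for the indices here purely to keep the empty-range check easy to follow; it says nothing about the field types of struct sched_params.

```c
/* Hypothetical modes, ordered as the loop in the patch assumes: sync first. */
enum sketch_mode { SKETCH_SYNC = 0, SKETCH_ASYNC = 1, SKETCH_MODE_MAX };

struct sketch_range {
	int begin;
	int end;	/* inclusive */
};

/*
 * Compute the inclusive ctx index range of one mode inside a ctx set that
 * starts at idx. Returns 0 and fills *r, or -1 when the mode has no ctxs
 * (begin > end), which is the case the patch skips with "continue".
 */
static int sketch_mode_range(int idx, int sync_num, int async_num,
			     enum sketch_mode mode, struct sketch_range *r)
{
	r->begin = idx + sync_num * (int)mode;
	r->end = idx - 1 + sync_num + async_num * (int)mode;

	return r->begin > r->end ? -1 : 0;
}
```

For a set starting at index 4 with 2 sync and 3 async ctxs, the sync range is [4, 5] and the async range is [6, 8]. With 0 sync ctxs the sync range collapses to begin 4, end 3 and is skipped.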
+
+static int wd_init_ctx_and_sched(struct wd_init_attrs *attrs, struct bitmask *bmp,
+				 struct uacce_dev_list *list)
+{
+	struct wd_ctx_params *ctx_params = attrs->ctx_params;
+	__u32 op_type_num = ctx_params->op_type_num;
+	int max_node = numa_max_node() + 1;
+	struct wd_ctx_nums ctx_nums;
+	int i, j, ret;
+	int idx = 0;
+
+	for (i = 0; i < max_node; i++) {
+		if (!numa_bitmask_isbitset(bmp, i))
+			continue;
+		for (j = 0; j < op_type_num; j++) {
+			ctx_nums = ctx_params->ctx_set_num[j];
+			ret = wd_init_ctx_set(attrs, list, idx, i, j);
+			if (ret)
+				goto free_ctxs;
+			ret = wd_instance_sched_set(attrs->sched, ctx_nums, idx, i, j);
+			if (ret)
+				goto free_ctxs;
+			idx += (ctx_nums.sync_ctx_num + ctx_nums.async_ctx_num);
+		}
+	}
+
+	return 0;
+
+free_ctxs:
+	wd_release_ctx_set(attrs->ctx_config);
+
+	return ret;
+}
+
+int wd_alg_pre_init(struct wd_init_attrs *attrs)
+{
+	struct wd_ctx_config *ctx_config = attrs->ctx_config;
+	struct wd_ctx_params *ctx_params = attrs->ctx_params;
+	struct bitmask *used_bmp, *bmp = ctx_params->bmp;
+	struct uacce_dev_list *list, *used_list = NULL;
+	__u32 ctx_set_num, op_type_num;
+	int numa_cnt, ret;
+
+	list = wd_get_accel_list(attrs->alg);
+	if (!list) {
+		WD_ERR("failed to get devices!\n");
+		return -WD_ENODEV;
+	}
+
+	op_type_num = ctx_params->op_type_num;
+	ctx_set_num = wd_get_ctx_numbers(*ctx_params, op_type_num);
+	if (!ctx_set_num || !op_type_num) {
+		WD_ERR("invalid: ctx_set_num is %d, op_type_num is %d!\n",
+		       ctx_set_num, op_type_num);
+		ret = -WD_EINVAL;
+		goto out_freelist;
+	}
+
+	/*
+	 * Not every numa has a device. Therefore, the first thing is to
+	 * filter the devices in the selected numa node, and the second
+	 * thing is to obtain the distribution of devices.
+	 */
+	if (bmp) {
+		used_list = wd_get_usable_list(list, bmp);
+		if (WD_IS_ERR(used_list)) {
+			ret = WD_PTR_ERR(used_list);
+			WD_ERR("failed to get usable devices(%d)!\n", ret);
+			goto out_freelist;
+		}
+	}
+
+	used_bmp = wd_create_device_nodemask(used_list ? used_list : list);
+	if (WD_IS_ERR(used_bmp)) {
+		ret = WD_PTR_ERR(used_bmp);
+		goto out_freeusedlist;
+	}
+
+	numa_cnt = numa_bitmask_weight(used_bmp);
+	if (!numa_cnt) {
+		ret = numa_cnt;
+		WD_ERR("invalid: bmp is clear!\n");
+		goto out_freenodemask;
+	}
+
+	ctx_config->ctx_num = ctx_set_num * numa_cnt;
+	ctx_config->ctxs = calloc(ctx_config->ctx_num, sizeof(struct wd_ctx));
+	if (!ctx_config->ctxs) {
+		ret = -WD_ENOMEM;
+		WD_ERR("failed to alloc ctxs!\n");
+		goto out_freenodemask;
+	}
+
+	ret = wd_init_ctx_and_sched(attrs, used_bmp, used_list ? used_list : list);
+	if (ret)
+		free(ctx_config->ctxs);
+
+out_freenodemask:
+	wd_free_device_nodemask(used_bmp);
+
+out_freeusedlist:
+	wd_free_list_accels(used_list);
+
+out_freelist:
+	wd_free_list_accels(list);
+
+	return ret;
+}
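
The way wd_init_ctx_and_sched() lays ctx sets out in the flat ctxs array can be sketched without libnuma. The block below is an illustration under stated assumptions: a uint32_t mask (bit i = NUMA node i, so at most 32 nodes) stands in for struct bitmask, struct sketch_ctx_nums stands in for struct wd_ctx_nums, and the sketch_ names are hypothetical. It records only where each (node, operation type) set starts; it does not request ctxs.

```c
#include <stdint.h>

/* Stand-in for struct wd_ctx_nums (assumed layout). */
struct sketch_ctx_nums {
	uint32_t sync_ctx_num;
	uint32_t async_ctx_num;
};

/*
 * Walk the nodes set in node_mask and, for each operation type, store the
 * first ctx index of that set in start[node * op_type_num + op_type].
 * Entries of unselected nodes are set to -1.
 * Returns the total number of ctx slots laid out.
 */
static int sketch_layout(uint32_t node_mask, int max_node,
			 const struct sketch_ctx_nums *nums, int op_type_num,
			 int *start)
{
	int idx = 0;
	int i, j;

	for (i = 0; i < max_node; i++) {
		for (j = 0; j < op_type_num; j++) {
			if (!(node_mask & (UINT32_C(1) << i))) {
				start[i * op_type_num + j] = -1;
				continue;
			}
			start[i * op_type_num + j] = idx;
			idx += (int)(nums[j].sync_ctx_num + nums[j].async_ctx_num);
		}
	}

	return idx;
}
```

With nodes 0 and 2 selected and sets of sizes 2 and 3 per node, the sets start at indices 0, 2, 5 and 7, and 10 slots are used in total. That total equals the per-node sum times the number of selected nodes, matching how wd_alg_pre_init() sizes ctx_config->ctxs.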
On 2022/11/1 15:41, Yang Shen wrote:
On 2022/10/31 21:25, fanghao (A) wrote:
On 2022/10/29 18:19, Yang Shen wrote:
 struct wd_ctx_internal {
 	handle_t ctx;
 	__u8 op_type;
diff --git a/include/wd_comp.h b/include/wd_comp.h
index e043a83..13a3e6a 100644
--- a/include/wd_comp.h
+++ b/include/wd_comp.h
@@ -7,6 +7,7 @@
 #ifndef __WD_COMP_H
 #define __WD_COMP_H
 
+#include <numa.h>
 #include "wd.h"
 #include "wd_alg_common.h"
 
@@ -113,6 +114,35 @@ int wd_comp_init(struct wd_ctx_config *config, struct wd_sched *sched);
  */
 void wd_comp_uninit(void);
 
+/**
+ * wd_comp_init2_() - A simplify interface to initializate uadk
+ * compression/decompression. This interface keeps most functions of
+ * wd_comp_init(). Users just need to descripe the deployment of
+ * business scenarios. Then the initialization will request appropriate
+ * resources to support the business scenarios.
+ * To make the initializate simpler, ctx_params support set NULL.
+ * And then the function will set them as default.
+ * Please do not use this interface with wd_comp_init() together, or
+ * some resources may be leak.
+ *
+ * @alg: The algorithm users want to use.
+ * @sched_type: The scheduling type users want to use.
+ * @task_tp: Reserved.

I suggest naming this consistently with the previous parameter, sched_type, and using task_type.

However, if it is meant as a switch that enables fallback, fall_back would be more appropriate. task_type is easily confused with sync/async.

fall_back; /* Whether to switch. 0: do not fall back. 1: when computing resources are exhausted, search for and switch to the next level of computing resources. */

+ * @ctx_params: The ctxs resources users want to use. Include per operation
+ * type ctx numbers and business process run numa.
+ *
+ * Return 0 if succeed and others if fail.
+ */
+int wd_comp_init2_(char *alg, __u32 sched_type, int task_tp, struct wd_ctx_params *ctx_params);
+
+#define wd_comp_init2(alg, sched_type, task_tp) \
+	wd_comp_init2_(alg, sched_type, task_tp, NULL)
+
+/**
+ * wd_comp_uninit2() - Uninitialise ctx configuration and scheduler.
+ */
+void wd_comp_uninit2(void);
+
+static int wd_init_ctx_and_sched(struct wd_init_attrs *attrs, struct bitmask *bmp, + struct uacce_dev_list *list) +{ + struct wd_ctx_params *ctx_params = attrs->ctx_params; + __u32 op_type_num = ctx_params->op_type_num; + int max_node = numa_max_node() + 1; + struct wd_ctx_nums ctx_nums; + int i, j, ret; + int idx = 0;
+ for (i = 0; i < max_node; i++) { + if (!numa_bitmask_isbitset(bmp, i)) + continue; + for (j = 0; j < op_type_num; j++) { + ctx_nums = ctx_params->ctx_set_num[j]; + ret = wd_init_ctx_set(attrs, list, idx, i, j); + if (ret) + goto free_ctxs; + ret = wd_instance_sched_set(attrs->sched, ctx_nums, idx, i, j); + if (ret) + goto free_ctxs; + idx += (ctx_nums.sync_ctx_num + ctx_nums.async_ctx_num); + } + }
+ return 0;
+free_ctxs: + wd_release_ctx_set(attrs->ctx_config);
+ return ret; +}
+int wd_alg_pre_init(struct wd_init_attrs *attrs) +{ + struct wd_ctx_config *ctx_config = attrs->ctx_config; + struct wd_ctx_params *ctx_params = attrs->ctx_params; + struct bitmask *used_bmp, *bmp = ctx_params->bmp; + struct uacce_dev_list *list, *used_list = NULL; + __u32 ctx_set_num, op_type_num; + int numa_cnt, ret;
+ list = wd_get_accel_list(attrs->alg); + if (!list) { + WD_ERR("failed to get devices!\n"); + return -WD_ENODEV; + }
+ op_type_num = ctx_params->op_type_num; + ctx_set_num = wd_get_ctx_numbers(*ctx_params, op_type_num); + if (!ctx_set_num || !op_type_num) { + WD_ERR("invalid: ctx_set_num is %d, op_type_num is %d!\n", + ctx_set_num, op_type_num); + ret = -WD_EINVAL; + goto out_freelist; + }
+ /* + * Not every numa has a device. Therefore, the first thing is to + * filter the devices in the selected numa node, and the second + * thing is to obtain the distribution of devices. + */ + if (bmp) { + used_list = wd_get_usable_list(list, bmp); + if (WD_IS_ERR(used_list)) { + ret = WD_PTR_ERR(used_list); + WD_ERR("failed to get usable devices(%d)!\n", ret); + goto out_freelist; + } + }
+ used_bmp = wd_create_device_nodemask(used_list ? used_list : list); + if (WD_IS_ERR(used_bmp)) { + ret = WD_PTR_ERR(used_bmp); + goto out_freeusedlist; + }
+ numa_cnt = numa_bitmask_weight(used_bmp); + if (!numa_cnt) { + ret = numa_cnt; + WD_ERR("invalid: bmp is clear!\n"); + goto out_freenodemask; + }
+ ctx_config->ctx_num = ctx_set_num * numa_cnt; + ctx_config->ctxs = calloc(ctx_config->ctx_num, sizeof(struct wd_ctx)); + if (!ctx_config->ctxs) { + ret = -WD_ENOMEM; + WD_ERR("failed to alloc ctxs!\n"); + goto out_freenodemask; + }
+ ret = wd_init_ctx_and_sched(attrs, used_bmp, used_list ? used_list : list); + if (ret) + free(ctx_config->ctxs);
+out_freenodemask: + wd_free_device_nodemask(used_bmp);
+out_freeusedlist: + wd_free_list_accels(used_list);
+out_freelist: + wd_free_list_accels(list);
+ return ret; +}
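For readers following the index arithmetic in wd_instance_sched_set() above, here is a minimal standalone sketch of how the sync and async ctx ranges are laid out for one (numa node, op type) pair. It copies only the begin/end formulas from the patch. The names ctx_range, ctx_range_for_mode() and ctx_range_is_empty() are hypothetical helpers for illustration, and plain int indexes are an assumption, since the field types of struct sched_params are not shown in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Local copy of the struct added by the patch in include/wd_alg_common.h. */
struct wd_ctx_nums {
	uint32_t sync_ctx_num;
	uint32_t async_ctx_num;
};

/* Hypothetical helper type: inclusive ctx index range for one mode. */
struct ctx_range {
	int begin;
	int end;
};

enum { MODE_SYNC = 0, MODE_ASYNC = 1 };

/*
 * Same formulas as wd_instance_sched_set(): sync ctxs come first,
 * async ctxs follow directly after them, starting at offset idx.
 */
static struct ctx_range ctx_range_for_mode(struct wd_ctx_nums nums, int idx, int mode)
{
	struct ctx_range r;

	assert(mode == MODE_SYNC || mode == MODE_ASYNC);
	r.begin = idx + (int)nums.sync_ctx_num * mode;
	r.end = idx - 1 + (int)nums.sync_ctx_num + (int)nums.async_ctx_num * mode;
	return r;
}

/* A mode with zero ctxs yields begin > end and is skipped by the patch. */
static int ctx_range_is_empty(struct ctx_range r)
{
	return r.begin > r.end;
}
```

With 2 sync and 3 async ctxs starting at idx 4, this gives the sync range [4, 5] and the async range [6, 8].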
On 2022/11/1 16:26, fanghao (A) wrote:
On 2022/11/1 15:41, Yang Shen wrote:
On 2022/10/31 21:25, fanghao (A) wrote:
On 2022/10/29 18:19, Yang Shen wrote:
For performance reasons, uadk leaves many configuration options to users. This gives users great flexibility, but it also makes the current initialization interface complex. Therefore, a new set of interfaces is provided to help users adapt quickly.
'wd_alg_init2_()' completes all initialization steps. Four parameters describe the user's configuration requirements.
@alg: The algorithm users want to use.
@sched_type: The scheduling type users want to use.
@task_tp: Reserved.
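task_type? Does it mean synchronous or asynchronous? I suggest explaining it.
This is an interface reserved for Longfang's switch-to-software-computing feature. The current code does not use it. It is added in advance to avoid changing the interface later, and a description will be added when that feature is enabled.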
@ctx_params: The ctx resources users want to use, including the ctx numbers for each operation type and the NUMA nodes the business process runs on.
If 'wd_alg_init2_()' is too complex for users, 'wd_alg_init2()' is a simplified wrapper that uses the default values of numa_bitmask and ctx_nums.
Signed-off-by: Yang Shen <shenyang39@huawei.com>
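As a rough illustration of what @ctx_params asks for, the sketch below computes how many ctxs one initialization would request: the sum of sync and async ctxs over all operation types, multiplied by the number of NUMA nodes that have usable devices, which is how wd_alg_pre_init() in this patch sizes ctx_config->ctxs. struct ctx_params_sketch and both helper functions are hypothetical names; the bmp field is left out so the example does not depend on libnuma.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local copy of the struct added by the patch in include/wd_alg_common.h. */
struct wd_ctx_nums {
	uint32_t sync_ctx_num;
	uint32_t async_ctx_num;
};

/* Hypothetical stand-in for struct wd_ctx_params without the bmp field. */
struct ctx_params_sketch {
	uint32_t op_type_num;
	const struct wd_ctx_nums *ctx_set_num;
};

/* Ctxs needed on one NUMA node: sync + async, summed over op types. */
static uint32_t ctxs_per_node(const struct ctx_params_sketch *p)
{
	uint32_t count = 0;
	uint32_t i;

	assert(p != NULL && p->ctx_set_num != NULL);
	for (i = 0; i < p->op_type_num; i++) {
		count += p->ctx_set_num[i].sync_ctx_num;
		count += p->ctx_set_num[i].async_ctx_num;
	}
	return count;
}

/* Total ctxs requested when numa_cnt nodes have usable devices. */
static uint32_t total_ctxs(const struct ctx_params_sketch *p, uint32_t numa_cnt)
{
	return ctxs_per_node(p) * numa_cnt;
}
```

With the default used for compression in this patch (two operation types, each with 1 sync and 1 async ctx), one node needs 4 ctxs and two nodes need 8.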
 Makefile.am             |   4 +-
 include/wd.h            |  24 ++++
 include/wd_alg_common.h |  27 +++++
 include/wd_comp.h       |  30 +++++
 include/wd_util.h       |  18 +++
 wd.c                    |  62 +++++++++++
 wd_comp.c               |  94 ++++++++++++++++
 wd_util.c               | 236 +++++++++++++++++++++++++++++++++++++++-
 8 files changed, 492 insertions(+), 3 deletions(-)
diff --git a/Makefile.am b/Makefile.am index 457af43..c5637e5 100644 --- a/Makefile.am +++ b/Makefile.am @@ -87,7 +87,7 @@ AM_CFLAGS += -DWD_NO_LOG libwd_la_LIBADD = $(libwd_la_OBJECTS) -lnuma -libwd_comp_la_LIBADD = $(libwd_la_OBJECTS) -ldl +libwd_comp_la_LIBADD = $(libwd_la_OBJECTS) -ldl -lnuma libwd_comp_la_DEPENDENCIES = libwd.la libhisi_zip_la_LIBADD = -ldl @@ -104,7 +104,7 @@ else libwd_la_LDFLAGS=$(UADK_VERSION) libwd_la_LIBADD= -lnuma -libwd_comp_la_LIBADD= -lwd -ldl +libwd_comp_la_LIBADD= -lwd -ldl -lnuma libwd_comp_la_LDFLAGS=$(UADK_VERSION) libwd_comp_la_DEPENDENCIES= libwd.la diff --git a/include/wd.h b/include/wd.h index e1a87de..facd992 100644 --- a/include/wd.h +++ b/include/wd.h @@ -348,6 +348,16 @@ int wd_get_avail_ctx(struct uacce_dev *dev); */ struct uacce_dev_list *wd_get_accel_list(const char *alg_name); +/**
+ * wd_find_dev_by_numa() - get device with max available ctx number from a
+ * device list according to numa id.
+ * @list: The device list.
+ * @numa_id: The numa_id.
+ * Return device if succeed and other error number if fail.
+ */
+struct uacce_dev *wd_find_dev_by_numa(struct uacce_dev_list *list, int numa_id);
/** * wd_get_accel_dev() - Get device supporting the algorithm with smallest numa distance to current numa node. @@ -523,6 +533,20 @@ struct uacce_dev *wd_clone_dev(struct uacce_dev *dev); */ void wd_add_dev_to_list(struct uacce_dev_list *head, struct uacce_dev_list *node); +/**
+ * wd_create_device_nodemask() - create a numa node mask of device list.
+ * @list: The devices list.
+ * Return a pointer value if succeed, and error number if fail.
+ */
+struct bitmask *wd_create_device_nodemask(struct uacce_dev_list *list);
+/**
+ * wd_free_device_nodemask() - free a numa node mask.
+ * @bmp: A numa node mask.
+ */
+void wd_free_device_nodemask(struct bitmask *bmp);
/** * wd_ctx_get_dev_name() - Get the device name about task. * @h_ctx: The handle of context. diff --git a/include/wd_alg_common.h b/include/wd_alg_common.h index c455dc3..96e908f 100644 --- a/include/wd_alg_common.h +++ b/include/wd_alg_common.h @@ -63,6 +63,33 @@ struct wd_ctx_config { void *priv; }; +/**
+ * struct wd_ctx_nums - Define the ctx sets numbers.
+ * @sync_ctx_num: The ctx numbers which are used for sync mode for each
+ * ctx sets.
+ * @async_ctx_num: The ctx numbers which are used for async mode for each
+ * ctx sets.
+ */
+struct wd_ctx_nums { + __u32 sync_ctx_num; + __u32 async_ctx_num; +};
+/**
+ * struct wd_ctx_params - Define the ctx sets params which are used for init
+ * algorithms.
+ * @op_type_num: Used for index of ctx_set_num, the order is the same as
+ * wd_<alg>_op_type.
+ * @ctx_set_num: Each operation type ctx sets numbers.
+ * @bmp: Ctxs distribution. Means users want to run business process on these
+ * numa or request ctx from devices located in these numa.
+ */
+struct wd_ctx_params { + __u32 op_type_num; + struct wd_ctx_nums *ctx_set_num; + struct bitmask *bmp; +};
struct wd_ctx_internal { handle_t ctx; __u8 op_type; diff --git a/include/wd_comp.h b/include/wd_comp.h index e043a83..13a3e6a 100644 --- a/include/wd_comp.h +++ b/include/wd_comp.h @@ -7,6 +7,7 @@ #ifndef __WD_COMP_H #define __WD_COMP_H +#include <numa.h> #include "wd.h" #include "wd_alg_common.h" @@ -113,6 +114,35 @@ int wd_comp_init(struct wd_ctx_config *config, struct wd_sched *sched); */ void wd_comp_uninit(void); +/**
+ * wd_comp_init2_() - A simplified interface to initialize uadk
+ * compression/decompression. This interface keeps most functions of
+ * wd_comp_init(). Users just need to describe the deployment of
+ * business scenarios. Then the initialization will request appropriate
+ * resources to support the business scenarios.
+ * To make the initialization simpler, ctx_params can be set to NULL,
+ * and the function will then use default values.
+ * Please do not use this interface together with wd_comp_init(), or
+ * some resources may leak.
+ * @alg: The algorithm users want to use.
+ * @sched_type: The scheduling type users want to use.
+ * @task_tp: Reserved.
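I suggest naming it consistently with the previous parameter sched_type, i.e. task_type.
However, if it is meant to enable a switch-over, fall_back would be a more suitable name. task_type is easily confused with sync/async.
fall_back; /* Whether to switch. 0: no downgrade switch. 1: when computing resources are exhausted, search for and switch to the next level of computing power */
This parameter is not a switch-to-software flag. It is defined as follows: enum alg_task_type { TASK_SOFT = 0x0, TASK_HW, TASK_MIX };
It indicates the task type: software computing, hardware computing, or mixed mode. Only mixed mode can switch to software computing, so the fall_back definition does not match the semantics here.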
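To make the distinction in the reply above concrete, here is a small sketch built on the quoted enum. The enum values come from the reply; the two predicate functions are hypothetical and only illustrate the stated semantics that mixed mode is the only type that can switch to software computing. How the reserved task_tp parameter will actually be interpreted is not defined by this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Definition quoted in the reply above. */
enum alg_task_type {
	TASK_SOFT = 0x0,
	TASK_HW,
	TASK_MIX
};

/* Hypothetical helper: does this task type use hardware at all? */
static bool task_type_uses_hw(enum alg_task_type t)
{
	assert(t >= TASK_SOFT && t <= TASK_MIX);
	return t == TASK_HW || t == TASK_MIX;
}

/* Hypothetical helper: only mixed mode may switch to software computing. */
static bool task_type_may_switch_to_soft(enum alg_task_type t)
{
	assert(t >= TASK_SOFT && t <= TASK_MIX);
	return t == TASK_MIX;
}
```

Under this reading a boolean fall_back flag can only express the difference between TASK_HW and TASK_MIX, and cannot express TASK_SOFT.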
+ * @ctx_params: The ctxs resources users want to use. Include per operation
+ * type ctx numbers and business process run numa.
+ * Return 0 if succeed and others if fail.
+ */
+int wd_comp_init2_(char *alg, __u32 sched_type, int task_tp, struct wd_ctx_params *ctx_params);
+#define wd_comp_init2(alg, sched_type, task_tp) \ + wd_comp_init2_(alg, sched_type, task_tp, NULL)
+/**
+ * wd_comp_uninit2() - Uninitialise ctx configuration and scheduler.
+ */
+void wd_comp_uninit2(void);
struct wd_comp_sess_setup { enum wd_comp_alg_type alg_type; /* Denoted by enum wd_comp_alg_type */ enum wd_comp_level comp_lv; /* Denoted by enum wd_comp_level */ diff --git a/include/wd_util.h b/include/wd_util.h index cd0e112..a51a35d 100644 --- a/include/wd_util.h +++ b/include/wd_util.h @@ -7,6 +7,7 @@ #ifndef __WD_UTIL_H #define __WD_UTIL_H +#include <numa.h> #include <stdbool.h> #include <sys/ipc.h> #include <sys/shm.h> @@ -112,6 +113,14 @@ struct wd_msg_handle { int (*recv)(handle_t sess, void *msg); }; +struct wd_init_attrs { + __u32 sched_type; + char *alg; + struct wd_sched *sched; + struct wd_ctx_params *ctx_params; + struct wd_ctx_config *ctx_config; +};
/* * wd_init_ctx_config() - Init internal ctx configuration. * @in: ctx configuration in global setting. @@ -404,6 +413,15 @@ static inline void wd_alg_clear_init(enum wd_status *status) __atomic_store(status, &setting, __ATOMIC_RELAXED); } +/**
+ * wd_alg_pre_init() - Request the ctxs and initialize the sched_domain
+ * with the given devices list, ctxs number and numa mask.
+ * @attrs: the algorithm initialization parameters.
+ * Return 0 if succeed and other error number if fail.
+ */
+int wd_alg_pre_init(struct wd_init_attrs *attrs);
/** * wd_dfx_msg_cnt() - Message counter interface for ctx * @msg: Shared memory addr. diff --git a/wd.c b/wd.c index 78094d8..9eb69d2 100644 --- a/wd.c +++ b/wd.c @@ -727,6 +727,35 @@ free_list: return NULL; } +struct uacce_dev *wd_find_dev_by_numa(struct uacce_dev_list *list, int numa_id) +{ + struct uacce_dev *dev = WD_ERR_PTR(-WD_ENODEV); + struct uacce_dev_list *p = list; + int ctx_num, ctx_max = 0;
+ if (!list) { + WD_ERR("invalid: list is NULL!\n"); + return WD_ERR_PTR(-WD_EINVAL); + }
+ while (p) { + if (numa_id != p->dev->numa_id) { + p = p->next; + continue; + }
+ ctx_num = wd_get_avail_ctx(p->dev); + if (ctx_num > ctx_max) { + dev = p->dev; + ctx_max = ctx_num; + }
+ p = p->next; + }
+ return dev; +}
void wd_free_list_accels(struct uacce_dev_list *list) { struct uacce_dev_list *curr, *next; @@ -793,6 +822,39 @@ int wd_ctx_set_io_cmd(handle_t h_ctx, unsigned long cmd, void *arg) return ioctl(ctx->fd, cmd, arg); } +struct bitmask *wd_create_device_nodemask(struct uacce_dev_list *list) +{ + struct uacce_dev_list *p; + struct bitmask *bmp;
+ if (!list) { + WD_ERR("invalid: list is NULL!\n"); + return WD_ERR_PTR(-WD_EINVAL); + }
+ bmp = numa_allocate_nodemask(); + if (!bmp) { + WD_ERR("failed to alloc bitmask(%d)!\n", errno); + return WD_ERR_PTR(-WD_ENOMEM); + }
+ p = list; + while (p) { + numa_bitmask_setbit(bmp, p->dev->numa_id); + p = p->next; + }
+ return bmp; +}
+void wd_free_device_nodemask(struct bitmask *bmp) +{ + if (!bmp) + return;
+ numa_free_nodemask(bmp); +}
void wd_get_version(void) { const char *wd_released_time = UADK_RELEASED_TIME; diff --git a/wd_comp.c b/wd_comp.c index 44593a6..487fd02 100644 --- a/wd_comp.c +++ b/wd_comp.c @@ -14,6 +14,7 @@ #include "config.h" #include "drv/wd_comp_drv.h" +#include "wd_sched.h" #include "wd_util.h" #include "wd_comp.h" @@ -21,6 +22,8 @@ #define HW_CTX_SIZE (64 * 1024) #define STREAM_CHUNK (128 * 1024) +#define SCHED_RR_NAME "sched_rr"
#define swap_byte(x) \ ((((x) & 0x000000ff) << 24) | \ (((x) & 0x0000ff00) << 8) | \ @@ -42,6 +45,7 @@ struct wd_comp_sess { struct wd_comp_setting { enum wd_status status; + enum wd_status status2; struct wd_ctx_config_internal config; struct wd_sched sched; struct wd_comp_driver *driver; @@ -52,6 +56,20 @@ struct wd_comp_setting { struct wd_env_config wd_comp_env_config; +static struct wd_init_attrs wd_comp_init_attrs; +static struct wd_ctx_config wd_comp_ctx; +static struct wd_sched *wd_comp_sched;
+static struct wd_ctx_nums wd_comp_ctx_num[] = { + {1, 1}, {1, 1}, {} +};
+static struct wd_ctx_params wd_comp_ctx_params = { + .op_type_num = WD_DIR_MAX, + .ctx_set_num = wd_comp_ctx_num, + .bmp = NULL, +};
#ifdef WD_STATIC_DRV static void wd_comp_set_static_drv(void) { @@ -178,6 +196,80 @@ void wd_comp_uninit(void) wd_alg_clear_init(&wd_comp_setting.status); } +int wd_comp_init2_(char *alg, __u32 sched_type, int task_tp, struct wd_ctx_params *ctx_params) +{ + enum wd_status status; + bool flag; + int ret;
+ wd_alg_get_init(&wd_comp_setting.status, &status); + if (status == WD_INIT) { + WD_INFO("UADK comp has been initialized with wd_comp_init()!\n"); + return 0; + }
+ flag = wd_alg_try_init(&wd_comp_setting.status2); + if (!flag) + return 0;
+ if (!alg) { + WD_ERR("invalid: alg is NULL!\n"); + ret = -WD_EINVAL; + goto out_uninit; + }
+ wd_comp_init_attrs.alg = alg; + wd_comp_init_attrs.sched_type = sched_type;
+ wd_comp_init_attrs.ctx_params = ctx_params ? ctx_params : &wd_comp_ctx_params; + wd_comp_init_attrs.ctx_config = &wd_comp_ctx;
+ wd_comp_sched = wd_sched_rr_alloc(sched_type, wd_comp_init_attrs.ctx_params->op_type_num, + numa_max_node() + 1, wd_comp_poll_ctx); + if (!wd_comp_sched) { + ret = -WD_EINVAL; + goto out_uninit; + } + wd_comp_sched->name = SCHED_RR_NAME; + wd_comp_init_attrs.sched = wd_comp_sched;
+ ret = wd_alg_pre_init(&wd_comp_init_attrs); + if (ret) + goto out_freesched;
+ ret = wd_comp_init(&wd_comp_ctx, wd_comp_sched); + if (ret) + goto out_freesched;
+ wd_alg_set_init(&wd_comp_setting.status2);
+ return 0;
+out_freesched: + wd_sched_rr_release(wd_comp_sched);
+out_uninit: + wd_alg_clear_init(&wd_comp_setting.status2);
+ return ret; +}
+void wd_comp_uninit2(void) +{ + int i;
+ wd_comp_uninit();
+ for (i = 0; i < wd_comp_ctx.ctx_num; i++) + if (wd_comp_ctx.ctxs[i].ctx) { + wd_release_ctx(wd_comp_ctx.ctxs[i].ctx); + wd_comp_ctx.ctxs[i].ctx = 0; + }
+ wd_sched_rr_release(wd_comp_sched); + wd_alg_clear_init(&wd_comp_setting.status2); +}
struct wd_comp_msg *wd_comp_get_msg(__u32 idx, __u32 tag) { return wd_find_msg_in_pool(&wd_comp_setting.pool, idx, tag); @@ -289,6 +381,7 @@ handle_t wd_comp_alloc_sess(struct wd_comp_sess_setup *setup) sess->comp_lv = setup->comp_lv; sess->win_sz = setup->win_sz; sess->stream_pos = WD_COMP_STREAM_NEW;
/* Some simple scheduler don't need scheduling parameters */ sess->sched_key = (void *)wd_comp_setting.sched.sched_init( wd_comp_setting.sched.h_sched_ctx, setup->sched_param); @@ -318,6 +411,7 @@ void wd_comp_free_sess(handle_t h_sess) if (sess->sched_key) free(sess->sched_key);
free(sess); } diff --git a/wd_util.c b/wd_util.c index fa77b46..8fed8ac 100644 --- a/wd_util.c +++ b/wd_util.c @@ -5,7 +5,6 @@ */ #define _GNU_SOURCE -#include <numa.h> #include <pthread.h> #include <semaphore.h> #include <string.h> @@ -1801,3 +1800,238 @@ bool wd_alg_try_init(enum wd_status *status) return true; }
+static __u32 wd_get_ctx_numbers(struct wd_ctx_params ctx_params, int end) +{ + __u32 count = 0; + int i;
+ for (i = 0; i < end; i++) { + count += ctx_params.ctx_set_num[i].sync_ctx_num; + count += ctx_params.ctx_set_num[i].async_ctx_num; + }
+ return count; +}
+struct uacce_dev_list *wd_get_usable_list(struct uacce_dev_list *list, struct bitmask *bmp) +{ + struct uacce_dev_list *p, *node, *result = NULL; + struct uacce_dev *dev; + int numa_id, ret;
+ if (!bmp) { + WD_ERR("invalid: bmp is NULL!\n"); + return WD_ERR_PTR(-WD_EINVAL); + }
+ p = list; + while (p) { + dev = p->dev; + numa_id = dev->numa_id; + ret = numa_bitmask_isbitset(bmp, numa_id); + if (!ret) { + p = p->next; + continue; + }
+ node = calloc(1, sizeof(*node)); + if (!node) { + result = WD_ERR_PTR(-WD_ENOMEM); + goto out_free_list; + }
+ node->dev = wd_clone_dev(dev); + if (!node->dev) { + result = WD_ERR_PTR(-WD_ENOMEM); + goto out_free_node; + }
+ if (!result) + result = node; + else + wd_add_dev_to_list(result, node);
+ p = p->next; + }
+ return result ? result : WD_ERR_PTR(-WD_ENODEV);
+out_free_node: + free(node); +out_free_list: + wd_free_list_accels(result); + return result; +}
+static int wd_init_ctx_set(struct wd_init_attrs *attrs, struct uacce_dev_list *list, + int idx, int numa_id, int op_type) +{ + struct wd_ctx_nums ctx_nums = attrs->ctx_params->ctx_set_num[op_type]; + __u32 ctx_set_num = ctx_nums.sync_ctx_num + ctx_nums.async_ctx_num; + struct wd_ctx_config *ctx_config = attrs->ctx_config; + struct uacce_dev *dev; + int i;
+ dev = wd_find_dev_by_numa(list, numa_id); + if (WD_IS_ERR(dev)) + return WD_PTR_ERR(dev);
+ for (i = idx; i < idx + ctx_set_num; i++) { + ctx_config->ctxs[i].ctx = wd_request_ctx(dev); + if (errno == WD_EBUSY) { + dev = wd_find_dev_by_numa(list, numa_id); + if (WD_IS_ERR(dev)) + return WD_PTR_ERR(dev); + i--; + } + ctx_config->ctxs[i].op_type = op_type; + ctx_config->ctxs[i].ctx_mode = + ((i - idx) < ctx_nums.sync_ctx_num) ? + CTX_MODE_SYNC : CTX_MODE_ASYNC; + }
+ return 0; +}
+static void wd_release_ctx_set(struct wd_ctx_config *ctx_config) +{ + int i;
+ for (i = 0; i < ctx_config->ctx_num; i++) + if (ctx_config->ctxs[i].ctx) { + wd_release_ctx(ctx_config->ctxs[i].ctx); + ctx_config->ctxs[i].ctx = 0; + } +}
+static int wd_instance_sched_set(struct wd_sched *sched, struct wd_ctx_nums ctx_nums, + int idx, int numa_id, int op_type) +{ + struct sched_params sparams; + int i, ret = 0;
+ for (i = 0; i < CTX_MODE_MAX; i++) { + sparams.numa_id = numa_id; + sparams.type = op_type; + sparams.mode = i; + sparams.begin = idx + ctx_nums.sync_ctx_num * i; + sparams.end = idx - 1 + ctx_nums.sync_ctx_num + ctx_nums.async_ctx_num * i; + if (sparams.begin > sparams.end) + continue; + ret = wd_sched_rr_instance(sched, &sparams); + if (ret) + goto out; + }
+out: + return ret; +}
+static int wd_init_ctx_and_sched(struct wd_init_attrs *attrs, struct bitmask *bmp, + struct uacce_dev_list *list) +{ + struct wd_ctx_params *ctx_params = attrs->ctx_params; + __u32 op_type_num = ctx_params->op_type_num; + int max_node = numa_max_node() + 1; + struct wd_ctx_nums ctx_nums; + int i, j, ret; + int idx = 0;
+ for (i = 0; i < max_node; i++) { + if (!numa_bitmask_isbitset(bmp, i)) + continue; + for (j = 0; j < op_type_num; j++) { + ctx_nums = ctx_params->ctx_set_num[j]; + ret = wd_init_ctx_set(attrs, list, idx, i, j); + if (ret) + goto free_ctxs; + ret = wd_instance_sched_set(attrs->sched, ctx_nums, idx, i, j); + if (ret) + goto free_ctxs; + idx += (ctx_nums.sync_ctx_num + ctx_nums.async_ctx_num); + } + }
+ return 0;
+free_ctxs: + wd_release_ctx_set(attrs->ctx_config);
+ return ret; +}
+int wd_alg_pre_init(struct wd_init_attrs *attrs) +{ + struct wd_ctx_config *ctx_config = attrs->ctx_config; + struct wd_ctx_params *ctx_params = attrs->ctx_params; + struct bitmask *used_bmp, *bmp = ctx_params->bmp; + struct uacce_dev_list *list, *used_list = NULL; + __u32 ctx_set_num, op_type_num; + int numa_cnt, ret;
+ list = wd_get_accel_list(attrs->alg); + if (!list) { + WD_ERR("failed to get devices!\n"); + return -WD_ENODEV; + }
+ op_type_num = ctx_params->op_type_num; + ctx_set_num = wd_get_ctx_numbers(*ctx_params, op_type_num); + if (!ctx_set_num || !op_type_num) { + WD_ERR("invalid: ctx_set_num is %d, op_type_num is %d!\n", + ctx_set_num, op_type_num); + ret = -WD_EINVAL; + goto out_freelist; + }
+ /* + * Not every numa has a device. Therefore, the first thing is to + * filter the devices in the selected numa node, and the second + * thing is to obtain the distribution of devices. + */ + if (bmp) { + used_list = wd_get_usable_list(list, bmp); + if (WD_IS_ERR(used_list)) { + ret = WD_PTR_ERR(used_list); + WD_ERR("failed to get usable devices(%d)!\n", ret); + goto out_freelist; + } + }
+ used_bmp = wd_create_device_nodemask(used_list ? used_list : list); + if (WD_IS_ERR(used_bmp)) { + ret = WD_PTR_ERR(used_bmp); + goto out_freeusedlist; + }
+ numa_cnt = numa_bitmask_weight(used_bmp); + if (!numa_cnt) { + ret = numa_cnt; + WD_ERR("invalid: bmp is clear!\n"); + goto out_freenodemask; + }
+ ctx_config->ctx_num = ctx_set_num * numa_cnt; + ctx_config->ctxs = calloc(ctx_config->ctx_num, sizeof(struct wd_ctx)); + if (!ctx_config->ctxs) { + ret = -WD_ENOMEM; + WD_ERR("failed to alloc ctxs!\n"); + goto out_freenodemask; + }
+ ret = wd_init_ctx_and_sched(attrs, used_bmp, used_list ? used_list : list); + if (ret) + free(ctx_config->ctxs);
+out_freenodemask: + wd_free_device_nodemask(used_bmp);
+out_freeusedlist: + wd_free_list_accels(used_list);
+out_freelist: + wd_free_list_accels(list);
+ return ret; +}
On 2022/11/2 9:15, liulongfang wrote:
On 2022/11/1 16:26, fanghao (A) wrote:
On 2022/11/1 15:41, Yang Shen wrote:
On 2022/10/31 21:25, fanghao (A) wrote:
On 2022/10/29 18:19, Yang Shen wrote:
For performance reasons, uadk leaves many configuration options to users. This gives users great flexibility, but it also makes the current initialization interface complex. Therefore, a new set of interfaces is provided to help users adapt quickly.
'wd_alg_init2_()' completes all initialization steps. Four parameters describe the user's configuration requirements.
@alg: The algorithm users want to use.
@sched_type: The scheduling type users want to use.
@task_tp: Reserved.
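task_type? Does it mean synchronous or asynchronous? I suggest explaining it.
This is an interface reserved for Longfang's switch-to-software-computing feature. The current code does not use it. It is added in advance to avoid changing the interface later, and a description will be added when that feature is enabled.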
@ctx_params: The ctx resources users want to use, including the ctx numbers for each operation type and the NUMA nodes the business process runs on.
If 'wd_alg_init2_()' is too complex for users, 'wd_alg_init2()' is a simplified wrapper that uses the default values of numa_bitmask and ctx_nums.
Signed-off-by: Yang Shen <shenyang39@huawei.com>
 Makefile.am             |   4 +-
 include/wd.h            |  24 ++++
 include/wd_alg_common.h |  27 +++++
 include/wd_comp.h       |  30 +++++
 include/wd_util.h       |  18 +++
 wd.c                    |  62 +++++++++++
 wd_comp.c               |  94 ++++++++++++++++
 wd_util.c               | 236 +++++++++++++++++++++++++++++++++++++++-
 8 files changed, 492 insertions(+), 3 deletions(-)
diff --git a/Makefile.am b/Makefile.am index 457af43..c5637e5 100644 --- a/Makefile.am +++ b/Makefile.am @@ -87,7 +87,7 @@ AM_CFLAGS += -DWD_NO_LOG libwd_la_LIBADD = $(libwd_la_OBJECTS) -lnuma -libwd_comp_la_LIBADD = $(libwd_la_OBJECTS) -ldl +libwd_comp_la_LIBADD = $(libwd_la_OBJECTS) -ldl -lnuma libwd_comp_la_DEPENDENCIES = libwd.la libhisi_zip_la_LIBADD = -ldl @@ -104,7 +104,7 @@ else libwd_la_LDFLAGS=$(UADK_VERSION) libwd_la_LIBADD= -lnuma -libwd_comp_la_LIBADD= -lwd -ldl +libwd_comp_la_LIBADD= -lwd -ldl -lnuma libwd_comp_la_LDFLAGS=$(UADK_VERSION) libwd_comp_la_DEPENDENCIES= libwd.la diff --git a/include/wd.h b/include/wd.h index e1a87de..facd992 100644 --- a/include/wd.h +++ b/include/wd.h @@ -348,6 +348,16 @@ int wd_get_avail_ctx(struct uacce_dev *dev); */ struct uacce_dev_list *wd_get_accel_list(const char *alg_name); +/**
+ * wd_find_dev_by_numa() - get device with max available ctx number from a
+ * device list according to numa id.
+ * @list: The device list.
+ * @numa_id: The numa_id.
+ * Return device if succeed and other error number if fail.
+ */
+struct uacce_dev *wd_find_dev_by_numa(struct uacce_dev_list *list, int numa_id);
/** * wd_get_accel_dev() - Get device supporting the algorithm with smallest numa distance to current numa node. @@ -523,6 +533,20 @@ struct uacce_dev *wd_clone_dev(struct uacce_dev *dev); */ void wd_add_dev_to_list(struct uacce_dev_list *head, struct uacce_dev_list *node); +/**
+ * wd_create_device_nodemask() - create a numa node mask of device list.
+ * @list: The devices list.
+ * Return a pointer value if succeed, and error number if fail.
+ */
+struct bitmask *wd_create_device_nodemask(struct uacce_dev_list *list);
+/**
+ * wd_free_device_nodemask() - free a numa node mask.
+ * @bmp: A numa node mask.
+ */
+void wd_free_device_nodemask(struct bitmask *bmp);
/** * wd_ctx_get_dev_name() - Get the device name about task. * @h_ctx: The handle of context. diff --git a/include/wd_alg_common.h b/include/wd_alg_common.h index c455dc3..96e908f 100644 --- a/include/wd_alg_common.h +++ b/include/wd_alg_common.h @@ -63,6 +63,33 @@ struct wd_ctx_config { void *priv; }; +/**
+ * struct wd_ctx_nums - Define the ctx sets numbers.
+ * @sync_ctx_num: The ctx numbers which are used for sync mode for each
+ * ctx sets.
+ * @async_ctx_num: The ctx numbers which are used for async mode for each
+ * ctx sets.
+ */
+struct wd_ctx_nums { + __u32 sync_ctx_num; + __u32 async_ctx_num; +};
+/**
+ * struct wd_ctx_params - Define the ctx sets params which are used for init
+ * algorithms.
+ * @op_type_num: Used for index of ctx_set_num, the order is the same as
+ * wd_<alg>_op_type.
+ * @ctx_set_num: Each operation type ctx sets numbers.
+ * @bmp: Ctxs distribution. Means users want to run business process on these
+ * numa or request ctx from devices located in these numa.
+ */
+struct wd_ctx_params { + __u32 op_type_num; + struct wd_ctx_nums *ctx_set_num; + struct bitmask *bmp; +};
struct wd_ctx_internal { handle_t ctx; __u8 op_type; diff --git a/include/wd_comp.h b/include/wd_comp.h index e043a83..13a3e6a 100644 --- a/include/wd_comp.h +++ b/include/wd_comp.h @@ -7,6 +7,7 @@ #ifndef __WD_COMP_H #define __WD_COMP_H +#include <numa.h> #include "wd.h" #include "wd_alg_common.h" @@ -113,6 +114,35 @@ int wd_comp_init(struct wd_ctx_config *config, struct wd_sched *sched); */ void wd_comp_uninit(void); +/**
+ * wd_comp_init2_() - A simplified interface to initialize uadk
+ * compression/decompression. This interface keeps most functions of
+ * wd_comp_init(). Users just need to describe the deployment of
+ * business scenarios. Then the initialization will request appropriate
+ * resources to support the business scenarios.
+ * To make the initialization simpler, ctx_params can be set to NULL,
+ * and the function will then use default values.
+ * Please do not use this interface together with wd_comp_init(), or
+ * some resources may leak.
+ * @alg: The algorithm users want to use.
+ * @sched_type: The scheduling type users want to use.
+ * @task_tp: Reserved.
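I suggest naming it consistently with the previous parameter sched_type, i.e. task_type.
However, if it is meant to enable a switch-over, fall_back would be a more suitable name. task_type is easily confused with sync/async.
fall_back; /* Whether to switch. 0: no downgrade switch. 1: when computing resources are exhausted, search for and switch to the next level of computing power */
This parameter is not a switch-to-software flag. It is defined as follows: enum alg_task_type { TASK_SOFT = 0x0, TASK_HW, TASK_MIX };
It indicates the task type: software computing, hardware computing, or mixed mode. Only mixed mode can switch to software computing, so the fall_back definition does not match the semantics here.
I feel this is somewhat complex and over-designed.
1. It requires the user to specify which type to set. How would the user know whether software or hardware computing is being used?
2. Most users are not aware of which kind of computing power is used. The only case they might care about is using hardware to improve security.
So, looking at the original requirement again: it is whether to use a single kind of computing power or a pool with higher bandwidth.
If fall_back is not a good name, another name can be chosen. But the requirement it is meant to cover is:
fall_back == 0: use only one fixed kind of computing power. For example, if hardware is fast and secure, always select the highest-priority hardware and never switch to software computing.
fall_back == 1: use unified, pooled computing power. The application does not care which kind is used, as long as more computing power can be combined for it. When the highest priority is used up, downgrade to the next-priority computing power. So combinations such as hardware 1 + hardware 2, or hardware 1 + software 1, are possible. This makes it more widely applicable.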
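The proposed fall_back semantics can be sketched as a selection over compute resources ordered by priority. This is only an illustration of the proposal as described above, not code from the patch: struct compute_tier, its fields and pick_tier() are all hypothetical names, and "available capacity" is modelled as a simple counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: one compute resource, listed in priority order. */
struct compute_tier {
	const char *name;	/* e.g. "hw1", "hw2", "soft1" */
	int avail;		/* remaining capacity, 0 when exhausted */
};

/*
 * Return the index of the tier to use, or -1 if none is usable.
 * fall_back == 0: only the highest-priority tier (index 0) may be used.
 * fall_back != 0: use the first tier, in priority order, with capacity left.
 */
static int pick_tier(const struct compute_tier *tiers, size_t n, int fall_back)
{
	size_t i;

	if (!tiers || n == 0)
		return -1;

	if (!fall_back)
		return tiers[0].avail > 0 ? 0 : -1;

	for (i = 0; i < n; i++) {
		if (tiers[i].avail > 0)
			return (int)i;
	}

	return -1;
}
```

With tiers hw1 (exhausted), hw2 and soft1, fall_back == 0 fails because hw1 has no capacity, while fall_back == 1 moves on to hw2. That is the "hardware 1 + hardware 2" combination mentioned above.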
+ * @ctx_params: The ctxs resources users want to use. Include per operation
+ * type ctx numbers and business process run numa.
+ * Return 0 if succeed and others if fail.
+ */
+int wd_comp_init2_(char *alg, __u32 sched_type, int task_tp, struct wd_ctx_params *ctx_params);
+#define wd_comp_init2(alg, sched_type, task_tp) \ + wd_comp_init2_(alg, sched_type, task_tp, NULL)
+/**
+ * wd_comp_uninit2() - Uninitialise ctx configuration and scheduler.
+ */
+void wd_comp_uninit2(void);
struct wd_comp_sess_setup { enum wd_comp_alg_type alg_type; /* Denoted by enum wd_comp_alg_type */ enum wd_comp_level comp_lv; /* Denoted by enum wd_comp_level */ diff --git a/include/wd_util.h b/include/wd_util.h index cd0e112..a51a35d 100644 --- a/include/wd_util.h +++ b/include/wd_util.h @@ -7,6 +7,7 @@ #ifndef __WD_UTIL_H #define __WD_UTIL_H +#include <numa.h> #include <stdbool.h> #include <sys/ipc.h> #include <sys/shm.h> @@ -112,6 +113,14 @@ struct wd_msg_handle { int (*recv)(handle_t sess, void *msg); }; +struct wd_init_attrs { + __u32 sched_type; + char *alg; + struct wd_sched *sched; + struct wd_ctx_params *ctx_params; + struct wd_ctx_config *ctx_config; +};
/* * wd_init_ctx_config() - Init internal ctx configuration. * @in: ctx configuration in global setting. @@ -404,6 +413,15 @@ static inline void wd_alg_clear_init(enum wd_status *status) __atomic_store(status, &setting, __ATOMIC_RELAXED); } +/**
+ * wd_alg_pre_init() - Request the ctxs and initialize the sched_domain
+ * with the given devices list, ctxs number and numa mask.
+ * @attrs: the algorithm initialization parameters.
+ * Return 0 if succeed and other error number if fail.
+ */
+int wd_alg_pre_init(struct wd_init_attrs *attrs);
/** * wd_dfx_msg_cnt() - Message counter interface for ctx * @msg: Shared memory addr. diff --git a/wd.c b/wd.c index 78094d8..9eb69d2 100644 --- a/wd.c +++ b/wd.c @@ -727,6 +727,35 @@ free_list: return NULL; } +struct uacce_dev *wd_find_dev_by_numa(struct uacce_dev_list *list, int numa_id) +{ + struct uacce_dev *dev = WD_ERR_PTR(-WD_ENODEV); + struct uacce_dev_list *p = list; + int ctx_num, ctx_max = 0;
+	if (!list) {
+		WD_ERR("invalid: list is NULL!\n");
+		return WD_ERR_PTR(-WD_EINVAL);
+	}
+
+	while (p) {
+		if (numa_id != p->dev->numa_id) {
+			p = p->next;
+			continue;
+		}
+
+		ctx_num = wd_get_avail_ctx(p->dev);
+		if (ctx_num > ctx_max) {
+			dev = p->dev;
+			ctx_max = ctx_num;
+		}
+
+		p = p->next;
+	}
+
+	return dev;
+}
+
 void wd_free_list_accels(struct uacce_dev_list *list)
 {
 	struct uacce_dev_list *curr, *next;
@@ -793,6 +822,39 @@ int wd_ctx_set_io_cmd(handle_t h_ctx, unsigned long cmd, void *arg)
 	return ioctl(ctx->fd, cmd, arg);
 }

+struct bitmask *wd_create_device_nodemask(struct uacce_dev_list *list)
+{
+	struct uacce_dev_list *p;
+	struct bitmask *bmp;
+
+	if (!list) {
+		WD_ERR("invalid: list is NULL!\n");
+		return WD_ERR_PTR(-WD_EINVAL);
+	}
+
+	bmp = numa_allocate_nodemask();
+	if (!bmp) {
+		WD_ERR("failed to alloc bitmask(%d)!\n", errno);
+		return WD_ERR_PTR(-WD_ENOMEM);
+	}
+
+	p = list;
+	while (p) {
+		numa_bitmask_setbit(bmp, p->dev->numa_id);
+		p = p->next;
+	}
+
+	return bmp;
+}
+
+void wd_free_device_nodemask(struct bitmask *bmp)
+{
+	if (!bmp)
+		return;
+
+	numa_free_nodemask(bmp);
+}
+
 void wd_get_version(void)
 {
 	const char *wd_released_time = UADK_RELEASED_TIME;
diff --git a/wd_comp.c b/wd_comp.c
index 44593a6..487fd02 100644
--- a/wd_comp.c
+++ b/wd_comp.c
@@ -14,6 +14,7 @@
 #include "config.h"
 #include "drv/wd_comp_drv.h"
+#include "wd_sched.h"
 #include "wd_util.h"
 #include "wd_comp.h"
@@ -21,6 +22,8 @@
 #define HW_CTX_SIZE (64 * 1024)
 #define STREAM_CHUNK (128 * 1024)
+#define SCHED_RR_NAME "sched_rr"
+
 #define swap_byte(x) \
 	((((x) & 0x000000ff) << 24) | \
 	(((x) & 0x0000ff00) << 8) | \
@@ -42,6 +45,7 @@ struct wd_comp_sess {
 struct wd_comp_setting {
 	enum wd_status status;
+	enum wd_status status2;
 	struct wd_ctx_config_internal config;
 	struct wd_sched sched;
 	struct wd_comp_driver *driver;
@@ -52,6 +56,20 @@ struct wd_comp_setting {
 struct wd_env_config wd_comp_env_config;
+static struct wd_init_attrs wd_comp_init_attrs;
+static struct wd_ctx_config wd_comp_ctx;
+static struct wd_sched *wd_comp_sched;
+
+static struct wd_ctx_nums wd_comp_ctx_num[] = {
+	{1, 1}, {1, 1}, {}
+};
+
+static struct wd_ctx_params wd_comp_ctx_params = {
+	.op_type_num = WD_DIR_MAX,
+	.ctx_set_num = wd_comp_ctx_num,
+	.bmp = NULL,
+};
+
 #ifdef WD_STATIC_DRV
 static void wd_comp_set_static_drv(void)
 {
@@ -178,6 +196,80 @@ void wd_comp_uninit(void)
 	wd_alg_clear_init(&wd_comp_setting.status);
 }

+int wd_comp_init2_(char *alg, __u32 sched_type, int task_tp, struct wd_ctx_params *ctx_params)
+{
+	enum wd_status status;
+	bool flag;
+	int ret;
+
+	wd_alg_get_init(&wd_comp_setting.status, &status);
+	if (status == WD_INIT) {
+		WD_INFO("UADK comp has been initialized with wd_comp_init()!\n");
+		return 0;
+	}
+
+	flag = wd_alg_try_init(&wd_comp_setting.status2);
+	if (!flag)
+		return 0;
+
+	if (!alg) {
+		WD_ERR("invalid: alg is NULL!\n");
+		ret = -WD_EINVAL;
+		goto out_uninit;
+	}
+
+	wd_comp_init_attrs.alg = alg;
+	wd_comp_init_attrs.sched_type = sched_type;
+
+	wd_comp_init_attrs.ctx_params = ctx_params ? ctx_params : &wd_comp_ctx_params;
+	wd_comp_init_attrs.ctx_config = &wd_comp_ctx;
+
+	wd_comp_sched = wd_sched_rr_alloc(sched_type, wd_comp_init_attrs.ctx_params->op_type_num,
+					  numa_max_node() + 1, wd_comp_poll_ctx);
+	if (!wd_comp_sched) {
+		ret = -WD_EINVAL;
+		goto out_uninit;
+	}
+	wd_comp_sched->name = SCHED_RR_NAME;
+	wd_comp_init_attrs.sched = wd_comp_sched;
+
+	ret = wd_alg_pre_init(&wd_comp_init_attrs);
+	if (ret)
+		goto out_freesched;
+
+	ret = wd_comp_init(&wd_comp_ctx, wd_comp_sched);
+	if (ret)
+		goto out_freesched;
+
+	wd_alg_set_init(&wd_comp_setting.status2);
+
+	return 0;
+
+out_freesched:
+	wd_sched_rr_release(wd_comp_sched);
+
+out_uninit:
+	wd_alg_clear_init(&wd_comp_setting.status2);
+
+	return ret;
+}
+
+void wd_comp_uninit2(void)
+{
+	int i;
+
+	wd_comp_uninit();
+
+	for (i = 0; i < wd_comp_ctx.ctx_num; i++)
+		if (wd_comp_ctx.ctxs[i].ctx) {
+			wd_release_ctx(wd_comp_ctx.ctxs[i].ctx);
+			wd_comp_ctx.ctxs[i].ctx = 0;
+		}
+
+	wd_sched_rr_release(wd_comp_sched);
+	wd_alg_clear_init(&wd_comp_setting.status2);
+}
+
 struct wd_comp_msg *wd_comp_get_msg(__u32 idx, __u32 tag)
 {
 	return wd_find_msg_in_pool(&wd_comp_setting.pool, idx, tag);
@@ -289,6 +381,7 @@ handle_t wd_comp_alloc_sess(struct wd_comp_sess_setup *setup)
 	sess->comp_lv = setup->comp_lv;
 	sess->win_sz = setup->win_sz;
 	sess->stream_pos = WD_COMP_STREAM_NEW;
+
 	/* Some simple scheduler don't need scheduling parameters */
 	sess->sched_key = (void *)wd_comp_setting.sched.sched_init(
 		wd_comp_setting.sched.h_sched_ctx, setup->sched_param);
@@ -318,6 +411,7 @@ void wd_comp_free_sess(handle_t h_sess)
 	if (sess->sched_key)
 		free(sess->sched_key);
+
 	free(sess);
 }
diff --git a/wd_util.c b/wd_util.c
index fa77b46..8fed8ac 100644
--- a/wd_util.c
+++ b/wd_util.c
@@ -5,7 +5,6 @@
  */

 #define _GNU_SOURCE
-#include <numa.h>
 #include <pthread.h>
 #include <semaphore.h>
 #include <string.h>
@@ -1801,3 +1800,238 @@ bool wd_alg_try_init(enum wd_status *status)

 	return true;
 }
+
+static __u32 wd_get_ctx_numbers(struct wd_ctx_params ctx_params, int end)
+{
+	__u32 count = 0;
+	int i;
+
+	for (i = 0; i < end; i++) {
+		count += ctx_params.ctx_set_num[i].sync_ctx_num;
+		count += ctx_params.ctx_set_num[i].async_ctx_num;
+	}
+
+	return count;
+}
+
+struct uacce_dev_list *wd_get_usable_list(struct uacce_dev_list *list, struct bitmask *bmp)
+{
+	struct uacce_dev_list *p, *node, *result = NULL;
+	struct uacce_dev *dev;
+	int numa_id, ret;
+
+	if (!bmp) {
+		WD_ERR("invalid: bmp is NULL!\n");
+		return WD_ERR_PTR(-WD_EINVAL);
+	}
+
+	p = list;
+	while (p) {
+		dev = p->dev;
+		numa_id = dev->numa_id;
+		ret = numa_bitmask_isbitset(bmp, numa_id);
+		if (!ret) {
+			p = p->next;
+			continue;
+		}
+
+		node = calloc(1, sizeof(*node));
+		if (!node) {
+			result = WD_ERR_PTR(-WD_ENOMEM);
+			goto out_free_list;
+		}
+
+		node->dev = wd_clone_dev(dev);
+		if (!node->dev) {
+			result = WD_ERR_PTR(-WD_ENOMEM);
+			goto out_free_node;
+		}
+
+		if (!result)
+			result = node;
+		else
+			wd_add_dev_to_list(result, node);
+
+		p = p->next;
+	}
+
+	return result ? result : WD_ERR_PTR(-WD_ENODEV);
+
+out_free_node:
+	free(node);
+out_free_list:
+	wd_free_list_accels(result);
+	return result;
+}
+
+static int wd_init_ctx_set(struct wd_init_attrs *attrs, struct uacce_dev_list *list,
+			   int idx, int numa_id, int op_type)
+{
+	struct wd_ctx_nums ctx_nums = attrs->ctx_params->ctx_set_num[op_type];
+	__u32 ctx_set_num = ctx_nums.sync_ctx_num + ctx_nums.async_ctx_num;
+	struct wd_ctx_config *ctx_config = attrs->ctx_config;
+	struct uacce_dev *dev;
+	int i;
+
+	dev = wd_find_dev_by_numa(list, numa_id);
+	if (WD_IS_ERR(dev))
+		return WD_PTR_ERR(dev);
+
+	for (i = idx; i < idx + ctx_set_num; i++) {
+		ctx_config->ctxs[i].ctx = wd_request_ctx(dev);
+		if (errno == WD_EBUSY) {
+			dev = wd_find_dev_by_numa(list, numa_id);
+			if (WD_IS_ERR(dev))
+				return WD_PTR_ERR(dev);
+			i--;
+		}
+		ctx_config->ctxs[i].op_type = op_type;
+		ctx_config->ctxs[i].ctx_mode =
+			((i - idx) < ctx_nums.sync_ctx_num) ?
+			CTX_MODE_SYNC : CTX_MODE_ASYNC;
+	}
+
+	return 0;
+}
+
+static void wd_release_ctx_set(struct wd_ctx_config *ctx_config)
+{
+	int i;
+
+	for (i = 0; i < ctx_config->ctx_num; i++)
+		if (ctx_config->ctxs[i].ctx) {
+			wd_release_ctx(ctx_config->ctxs[i].ctx);
+			ctx_config->ctxs[i].ctx = 0;
+		}
+}
+
+static int wd_instance_sched_set(struct wd_sched *sched, struct wd_ctx_nums ctx_nums,
+				 int idx, int numa_id, int op_type)
+{
+	struct sched_params sparams;
+	int i, ret = 0;
+
+	for (i = 0; i < CTX_MODE_MAX; i++) {
+		sparams.numa_id = numa_id;
+		sparams.type = op_type;
+		sparams.mode = i;
+		sparams.begin = idx + ctx_nums.sync_ctx_num * i;
+		sparams.end = idx - 1 + ctx_nums.sync_ctx_num + ctx_nums.async_ctx_num * i;
+		if (sparams.begin > sparams.end)
+			continue;
+		ret = wd_sched_rr_instance(sched, &sparams);
+		if (ret)
+			goto out;
+	}
+
+out:
+	return ret;
+}
+
+static int wd_init_ctx_and_sched(struct wd_init_attrs *attrs, struct bitmask *bmp,
+				 struct uacce_dev_list *list)
+{
+	struct wd_ctx_params *ctx_params = attrs->ctx_params;
+	__u32 op_type_num = ctx_params->op_type_num;
+	int max_node = numa_max_node() + 1;
+	struct wd_ctx_nums ctx_nums;
+	int i, j, ret;
+	int idx = 0;
+
+	for (i = 0; i < max_node; i++) {
+		if (!numa_bitmask_isbitset(bmp, i))
+			continue;
+		for (j = 0; j < op_type_num; j++) {
+			ctx_nums = ctx_params->ctx_set_num[j];
+			ret = wd_init_ctx_set(attrs, list, idx, i, j);
+			if (ret)
+				goto free_ctxs;
+			ret = wd_instance_sched_set(attrs->sched, ctx_nums, idx, i, j);
+			if (ret)
+				goto free_ctxs;
+			idx += (ctx_nums.sync_ctx_num + ctx_nums.async_ctx_num);
+		}
+	}
+
+	return 0;
+
+free_ctxs:
+	wd_release_ctx_set(attrs->ctx_config);
+
+	return ret;
+}
+
+int wd_alg_pre_init(struct wd_init_attrs *attrs)
+{
+	struct wd_ctx_config *ctx_config = attrs->ctx_config;
+	struct wd_ctx_params *ctx_params = attrs->ctx_params;
+	struct bitmask *used_bmp, *bmp = ctx_params->bmp;
+	struct uacce_dev_list *list, *used_list = NULL;
+	__u32 ctx_set_num, op_type_num;
+	int numa_cnt, ret;
+
+	list = wd_get_accel_list(attrs->alg);
+	if (!list) {
+		WD_ERR("failed to get devices!\n");
+		return -WD_ENODEV;
+	}
+
+	op_type_num = ctx_params->op_type_num;
+	ctx_set_num = wd_get_ctx_numbers(*ctx_params, op_type_num);
+	if (!ctx_set_num || !op_type_num) {
+		WD_ERR("invalid: ctx_set_num is %d, op_type_num is %d!\n",
+		       ctx_set_num, op_type_num);
+		ret = -WD_EINVAL;
+		goto out_freelist;
+	}
+
+	/*
+	 * Not every numa has a device. Therefore, the first thing is to
+	 * filter the devices in the selected numa node, and the second
+	 * thing is to obtain the distribution of devices.
+	 */
+	if (bmp) {
+		used_list = wd_get_usable_list(list, bmp);
+		if (WD_IS_ERR(used_list)) {
+			ret = WD_PTR_ERR(used_list);
+			WD_ERR("failed to get usable devices(%d)!\n", ret);
+			goto out_freelist;
+		}
+	}
+
+	used_bmp = wd_create_device_nodemask(used_list ? used_list : list);
+	if (WD_IS_ERR(used_bmp)) {
+		ret = WD_PTR_ERR(used_bmp);
+		goto out_freeusedlist;
+	}
+
+	numa_cnt = numa_bitmask_weight(used_bmp);
+	if (!numa_cnt) {
+		ret = numa_cnt;
+		WD_ERR("invalid: bmp is clear!\n");
+		goto out_freenodemask;
+	}
+
+	ctx_config->ctx_num = ctx_set_num * numa_cnt;
+	ctx_config->ctxs = calloc(ctx_config->ctx_num, sizeof(struct wd_ctx));
+	if (!ctx_config->ctxs) {
+		ret = -WD_ENOMEM;
+		WD_ERR("failed to alloc ctxs!\n");
+		goto out_freenodemask;
+	}
+
+	ret = wd_init_ctx_and_sched(attrs, used_bmp, used_list ? used_list : list);
+	if (ret)
+		free(ctx_config->ctxs);
+
+out_freenodemask:
+	wd_free_device_nodemask(used_bmp);
+
+out_freeusedlist:
+	wd_free_list_accels(used_list);
+
+out_freelist:
+	wd_free_list_accels(list);
+
+	return ret;
+}
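For readers following the quoted patch, the selection rule in wd_find_dev_by_numa() can be looked at in isolation. The sketch below is a simplified stand-in, not the uadk code: `demo_dev`, `demo_find_dev_by_numa` and the `avail_ctx` field are hypothetical names, the devices are held in a plain array instead of a linked list, and the available-ctx count is stored in the struct rather than queried through wd_get_avail_ctx().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified device record (not the uadk struct uacce_dev). */
struct demo_dev {
	int numa_id;
	int avail_ctx;	/* stands in for wd_get_avail_ctx(dev) */
};

/*
 * Return the device on @numa_id with the most available ctxs,
 * or NULL if no device on that node has any ctx left.
 */
static const struct demo_dev *demo_find_dev_by_numa(const struct demo_dev *devs,
						    size_t n, int numa_id)
{
	const struct demo_dev *best = NULL;
	int ctx_max = 0;
	size_t i;

	assert(devs != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (devs[i].numa_id != numa_id)
			continue;
		/* Strictly greater: a device with 0 free ctxs is never chosen. */
		if (devs[i].avail_ctx > ctx_max) {
			best = &devs[i];
			ctx_max = devs[i].avail_ctx;
		}
	}

	return best;
}
```

Because the comparison starts from ctx_max = 0, a node whose devices are all exhausted yields no device, which is the case the patch reports through an error pointer.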
On 2022/11/2 10:38, fanghao (A) wrote:
On 2022/11/2 9:15, liulongfang wrote:
On 2022/11/1 16:26, fanghao (A) wrote:
On 2022/11/1 15:41, Yang Shen wrote:
On 2022/10/31 21:25, fanghao (A) wrote:
On 2022/10/29 18:19, Yang Shen wrote:
For performance reasons, uadk leaves many configuration options to users. This gives users great flexibility, but it also makes the current initialization interface highly complex. Therefore, to help users adapt quickly, a new set of interfaces is provided.
The 'wd_alg_init2_()' will complete all initialization steps. There are 4 parameters to describe the user configuration requirements.
@alg: The algorithm users want to use.
@sched_type: The scheduling type users want to use.
@task_tp: Reserved.
task_type? Does this mean synchronous or asynchronous? I suggest explaining it.
This is an interface reserved for Longfang's switch-to-software-computing feature. The current code does not use it; it is added in advance to avoid changing the interface later. A description will be added when the feature is enabled.
@ctx_params: The ctx resources users want to use. It includes the per-operation-type ctx numbers and the numa nodes on which the business process runs.
If users think 'wd_alg_init2_()' is too complex, wd_alg_init2() is a simplified wrapper that uses the default values of numa_bitmask and ctx_nums.
Signed-off-by: Yang Shen shenyang39@huawei.com
 Makefile.am             |   4 +-
 include/wd.h            |  24 ++++
 include/wd_alg_common.h |  27 +++++
 include/wd_comp.h       |  30 +++++
 include/wd_util.h       |  18 +++
 wd.c                    |  62 +++++++++++
 wd_comp.c               |  94 ++++++++++++++++
 wd_util.c               | 236 +++++++++++++++++++++++++++++++++++++++-
 8 files changed, 492 insertions(+), 3 deletions(-)
diff --git a/Makefile.am b/Makefile.am
index 457af43..c5637e5 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -87,7 +87,7 @@ AM_CFLAGS += -DWD_NO_LOG
 libwd_la_LIBADD = $(libwd_la_OBJECTS) -lnuma
-libwd_comp_la_LIBADD = $(libwd_la_OBJECTS) -ldl
+libwd_comp_la_LIBADD = $(libwd_la_OBJECTS) -ldl -lnuma
 libwd_comp_la_DEPENDENCIES = libwd.la
 libhisi_zip_la_LIBADD = -ldl
@@ -104,7 +104,7 @@ else
 libwd_la_LDFLAGS=$(UADK_VERSION)
 libwd_la_LIBADD= -lnuma
-libwd_comp_la_LIBADD= -lwd -ldl
+libwd_comp_la_LIBADD= -lwd -ldl -lnuma
 libwd_comp_la_LDFLAGS=$(UADK_VERSION)
 libwd_comp_la_DEPENDENCIES= libwd.la
diff --git a/include/wd.h b/include/wd.h
index e1a87de..facd992 100644
--- a/include/wd.h
+++ b/include/wd.h
@@ -348,6 +348,16 @@ int wd_get_avail_ctx(struct uacce_dev *dev);
  */
 struct uacce_dev_list *wd_get_accel_list(const char *alg_name);

+/**
+ * wd_find_dev_by_numa() - get device with max available ctx number from a
+ *			   device list according to numa id.
+ * @list: The device list.
+ * @numa_id: The numa_id.
+ *
+ * Return device if succeed and other error number if fail.
+ */
+struct uacce_dev *wd_find_dev_by_numa(struct uacce_dev_list *list, int numa_id);
+
 /**
  * wd_get_accel_dev() - Get device supporting the algorithm with
 			smallest numa distance to current numa node.
@@ -523,6 +533,20 @@ struct uacce_dev *wd_clone_dev(struct uacce_dev *dev);
  */
 void wd_add_dev_to_list(struct uacce_dev_list *head, struct uacce_dev_list *node);

+/**
+ * wd_create_device_nodemask() - create a numa node mask of device list.
+ * @list: The devices list.
+ *
+ * Return a pointer value if succeed, and error number if fail.
+ */
+struct bitmask *wd_create_device_nodemask(struct uacce_dev_list *list);
+
+/**
+ * wd_free_device_nodemask() - free a numa node mask.
+ * @bmp: A numa node mask.
+ */
+void wd_free_device_nodemask(struct bitmask *bmp);
+
 /**
  * wd_ctx_get_dev_name() - Get the device name about task.
  * @h_ctx: The handle of context.
diff --git a/include/wd_alg_common.h b/include/wd_alg_common.h
index c455dc3..96e908f 100644
--- a/include/wd_alg_common.h
+++ b/include/wd_alg_common.h
@@ -63,6 +63,33 @@ struct wd_ctx_config {
 	void *priv;
 };

+/**
+ * struct wd_ctx_nums - Define the ctx sets numbers.
+ * @sync_ctx_num: The ctx numbers which are used for sync mode for each
+ * ctx sets.
+ * @async_ctx_num: The ctx numbers which are used for async mode for each
+ * ctx sets.
+ */
+struct wd_ctx_nums {
+	__u32 sync_ctx_num;
+	__u32 async_ctx_num;
+};
+
+/**
+ * struct wd_ctx_params - Define the ctx sets params which are used for init
+ * algorithms.
+ * @op_type_num: Used for index of ctx_set_num, the order is the same as
+ * wd_<alg>_op_type.
+ * @ctx_set_num: Each operation type ctx sets numbers.
+ * @bmp: Ctxs distribution. Means users want to run business process on these
+ * numa or request ctx from devices located in these numa.
+ */
+struct wd_ctx_params {
+	__u32 op_type_num;
+	struct wd_ctx_nums *ctx_set_num;
+	struct bitmask *bmp;
+};
+
 struct wd_ctx_internal {
 	handle_t ctx;
 	__u8 op_type;
diff --git a/include/wd_comp.h b/include/wd_comp.h
index e043a83..13a3e6a 100644
--- a/include/wd_comp.h
+++ b/include/wd_comp.h
@@ -7,6 +7,7 @@
 #ifndef __WD_COMP_H
 #define __WD_COMP_H

+#include <numa.h>
 #include "wd.h"
 #include "wd_alg_common.h"

@@ -113,6 +114,35 @@ int wd_comp_init(struct wd_ctx_config *config, struct wd_sched *sched);
  */
 void wd_comp_uninit(void);

+/**
+ * wd_comp_init2_() - A simplified interface to initialize uadk
+ * compression/decompression. This interface keeps most functions of
+ * wd_comp_init(). Users just need to describe the deployment of
+ * business scenarios. Then the initialization will request appropriate
+ * resources to support the business scenarios.
+ * To make the initialization simpler, ctx_params can be set to NULL.
+ * And then the function will set them as default.
+ * Please do not use this interface with wd_comp_init() together, or
+ * some resources may be leaked.
+ *
+ * @alg: The algorithm users want to use.
+ * @sched_type: The scheduling type users want to use.
+ * @task_tp: Reserved.
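I suggest naming it consistently with the previous parameter sched_type, i.e. task_type.
However, if it is meant to enable a fallback switch, fall_back would be a more suitable name. task_type is easily confused with sync/async.
fall_back; /* Whether to fall back. 0: do not fall back to a lower level. 1: when computing resources are exhausted, search for and switch to the next level of computing power. */
This parameter is not a flag for switching to software computing. It is defined as follows: enum alg_task_type { TASK_SOFT = 0x0, TASK_HW, TASK_MIX };
It indicates the task type: software mode, hardware mode, or mixed mode. Only mixed mode involves switching to software computing, so the name fall_back does not match the semantics.
I feel this is somewhat complicated and over-designed. 1. It requires the user to specify which type to set; how would the user know whether to use software or hardware computing? 2. Most users are unaware of which kind of computing power is used. The only case where they might care is using hardware to improve security.
So, looking at the original requirement again: should a single kind of computing power be used, or computing power with higher bandwidth?
If fall_back is not a good name, another can be chosen. But the requirement it is meant to meet is: fall_back == 0: use only one fixed kind of computing power. For example, if hardware is fast and secure, always choose the highest-priority hardware and never switch to software. fall_back == 1: use unified, pooled computing power. The application does not care which kind is used as long as more can be pooled together; when the highest priority is used up, it falls back to the next priority. So there may be combinations such as hardware 1 + hardware 2, or hardware 1 + software 1. This would be more widely applicable.
The description of the original requirement above is incomplete. UADK no longer insists that hardware computing is required; there are also scenarios that use software computing alone. Therefore, using 0/1 to indicate whether software computing supplements hardware only covers the case where hardware is the default; it cannot support software-only use.
Pure software computing means only software is used; any extension happens only at the software level: software 1 + software 2 + ... Pure hardware computing means only hardware is used; any extension happens only at the hardware level: hardware 1 + hardware 2 + ... Mixed computing means both can be used; extension follows priority regardless of software or hardware: whichever can currently execute the task is used according to priority.
This approach considers only the business level and does not need to consider internal implementation details.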
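The three modes discussed above can be sketched as a priority-ordered pick. This only illustrates the semantics described in this thread and is not code from the patch, where task_tp is reserved and unused. The enum mirrors the alg_task_type quoted above; `demo_provider`, `demo_pick_provider` and the `busy` flag are hypothetical, and the provider array is assumed to be sorted from highest to lowest priority.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the enum quoted in the thread. */
enum alg_task_type {
	TASK_SOFT = 0x0,
	TASK_HW,
	TASK_MIX
};

/* Hypothetical compute provider; the array is assumed sorted by priority. */
struct demo_provider {
	const char *name;
	bool is_hw;
	bool busy;	/* true when its resources are exhausted */
};

/*
 * Pick the highest-priority provider that is not busy and is allowed by
 * @type: TASK_SOFT only software, TASK_HW only hardware, TASK_MIX either.
 */
static const struct demo_provider *demo_pick_provider(const struct demo_provider *p,
						      size_t n, enum alg_task_type type)
{
	size_t i;

	assert(p != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (p[i].busy)
			continue;
		if (type == TASK_HW && !p[i].is_hw)
			continue;
		if (type == TASK_SOFT && p[i].is_hw)
			continue;
		return &p[i];
	}

	return NULL;
}
```

In this reading, TASK_HW and TASK_SOFT extend only within their own kind, while TASK_MIX falls through to whatever is next in priority order.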
+ * @ctx_params: The ctxs resources users want to use. Include per operation
+ * type ctx numbers and business process run numa.
+ *
+ * Return 0 if succeed and others if fail.
+ */
+int wd_comp_init2_(char *alg, __u32 sched_type, int task_tp, struct wd_ctx_params *ctx_params);
+
+#define wd_comp_init2(alg, sched_type, task_tp) \
+	wd_comp_init2_(alg, sched_type, task_tp, NULL)
+
+/**
+ * wd_comp_uninit2() - Uninitialise ctx configuration and scheduler.
+ */
+void wd_comp_uninit2(void);
+
 struct wd_comp_sess_setup {
 	enum wd_comp_alg_type alg_type; /* Denoted by enum wd_comp_alg_type */
 	enum wd_comp_level comp_lv; /* Denoted by enum wd_comp_level */
diff --git a/include/wd_util.h b/include/wd_util.h
index cd0e112..a51a35d 100644
--- a/include/wd_util.h
+++ b/include/wd_util.h
@@ -7,6 +7,7 @@
 #ifndef __WD_UTIL_H
 #define __WD_UTIL_H

+#include <numa.h>
 #include <stdbool.h>
 #include <sys/ipc.h>
 #include <sys/shm.h>
@@ -112,6 +113,14 @@ struct wd_msg_handle {
 	int (*recv)(handle_t sess, void *msg);
 };

+struct wd_init_attrs {
+	__u32 sched_type;
+	char *alg;
+	struct wd_sched *sched;
+	struct wd_ctx_params *ctx_params;
+	struct wd_ctx_config *ctx_config;
+};
+
 /*
  * wd_init_ctx_config() - Init internal ctx configuration.
  * @in: ctx configuration in global setting.
@@ -404,6 +413,15 @@ static inline void wd_alg_clear_init(enum wd_status *status)
 	__atomic_store(status, &setting, __ATOMIC_RELAXED);
 }

+/**
+ * wd_alg_pre_init() - Request the ctxs and initialize the sched_domain
+ *		       with the given devices list, ctxs number and numa mask.
+ * @attrs: the algorithm initialization parameters.
+ *
+ * Return 0 if succeed and other error number if fail.
+ */
+int wd_alg_pre_init(struct wd_init_attrs *attrs);
+
 /**
  * wd_dfx_msg_cnt() - Message counter interface for ctx
  * @msg: Shared memory addr.
diff --git a/wd.c b/wd.c
index 78094d8..9eb69d2 100644
--- a/wd.c
+++ b/wd.c
@@ -727,6 +727,35 @@ free_list:
 	return NULL;
 }

+struct uacce_dev *wd_find_dev_by_numa(struct uacce_dev_list *list, int numa_id)
+{
+	struct uacce_dev *dev = WD_ERR_PTR(-WD_ENODEV);
+	struct uacce_dev_list *p = list;
+	int ctx_num, ctx_max = 0;
+
+	if (!list) {
+		WD_ERR("invalid: list is NULL!\n");
+		return WD_ERR_PTR(-WD_EINVAL);
+	}
+
+	while (p) {
+		if (numa_id != p->dev->numa_id) {
+			p = p->next;
+			continue;
+		}
+
+		ctx_num = wd_get_avail_ctx(p->dev);
+		if (ctx_num > ctx_max) {
+			dev = p->dev;
+			ctx_max = ctx_num;
+		}
+
+		p = p->next;
+	}
+
+	return dev;
+}
+
 void wd_free_list_accels(struct uacce_dev_list *list)
 {
 	struct uacce_dev_list *curr, *next;
@@ -793,6 +822,39 @@ int wd_ctx_set_io_cmd(handle_t h_ctx, unsigned long cmd, void *arg)
 	return ioctl(ctx->fd, cmd, arg);
 }

+struct bitmask *wd_create_device_nodemask(struct uacce_dev_list *list)
+{
+	struct uacce_dev_list *p;
+	struct bitmask *bmp;
+
+	if (!list) {
+		WD_ERR("invalid: list is NULL!\n");
+		return WD_ERR_PTR(-WD_EINVAL);
+	}
+
+	bmp = numa_allocate_nodemask();
+	if (!bmp) {
+		WD_ERR("failed to alloc bitmask(%d)!\n", errno);
+		return WD_ERR_PTR(-WD_ENOMEM);
+	}
+
+	p = list;
+	while (p) {
+		numa_bitmask_setbit(bmp, p->dev->numa_id);
+		p = p->next;
+	}
+
+	return bmp;
+}
+
+void wd_free_device_nodemask(struct bitmask *bmp)
+{
+	if (!bmp)
+		return;
+
+	numa_free_nodemask(bmp);
+}
+
 void wd_get_version(void)
 {
 	const char *wd_released_time = UADK_RELEASED_TIME;
diff --git a/wd_comp.c b/wd_comp.c
index 44593a6..487fd02 100644
--- a/wd_comp.c
+++ b/wd_comp.c
@@ -14,6 +14,7 @@
 #include "config.h"
 #include "drv/wd_comp_drv.h"
+#include "wd_sched.h"
 #include "wd_util.h"
 #include "wd_comp.h"
@@ -21,6 +22,8 @@
 #define HW_CTX_SIZE (64 * 1024)
 #define STREAM_CHUNK (128 * 1024)
+#define SCHED_RR_NAME "sched_rr"
+
 #define swap_byte(x) \
 	((((x) & 0x000000ff) << 24) | \
 	(((x) & 0x0000ff00) << 8) | \
@@ -42,6 +45,7 @@ struct wd_comp_sess {
 struct wd_comp_setting {
 	enum wd_status status;
+	enum wd_status status2;
 	struct wd_ctx_config_internal config;
 	struct wd_sched sched;
 	struct wd_comp_driver *driver;
@@ -52,6 +56,20 @@ struct wd_comp_setting {
 struct wd_env_config wd_comp_env_config;
+static struct wd_init_attrs wd_comp_init_attrs;
+static struct wd_ctx_config wd_comp_ctx;
+static struct wd_sched *wd_comp_sched;
+
+static struct wd_ctx_nums wd_comp_ctx_num[] = {
+	{1, 1}, {1, 1}, {}
+};
+
+static struct wd_ctx_params wd_comp_ctx_params = {
+	.op_type_num = WD_DIR_MAX,
+	.ctx_set_num = wd_comp_ctx_num,
+	.bmp = NULL,
+};
+
 #ifdef WD_STATIC_DRV
 static void wd_comp_set_static_drv(void)
 {
@@ -178,6 +196,80 @@ void wd_comp_uninit(void)
 	wd_alg_clear_init(&wd_comp_setting.status);
 }

+int wd_comp_init2_(char *alg, __u32 sched_type, int task_tp, struct wd_ctx_params *ctx_params)
+{
+	enum wd_status status;
+	bool flag;
+	int ret;
+
+	wd_alg_get_init(&wd_comp_setting.status, &status);
+	if (status == WD_INIT) {
+		WD_INFO("UADK comp has been initialized with wd_comp_init()!\n");
+		return 0;
+	}
+
+	flag = wd_alg_try_init(&wd_comp_setting.status2);
+	if (!flag)
+		return 0;
+
+	if (!alg) {
+		WD_ERR("invalid: alg is NULL!\n");
+		ret = -WD_EINVAL;
+		goto out_uninit;
+	}
+
+	wd_comp_init_attrs.alg = alg;
+	wd_comp_init_attrs.sched_type = sched_type;
+
+	wd_comp_init_attrs.ctx_params = ctx_params ? ctx_params : &wd_comp_ctx_params;
+	wd_comp_init_attrs.ctx_config = &wd_comp_ctx;
+
+	wd_comp_sched = wd_sched_rr_alloc(sched_type, wd_comp_init_attrs.ctx_params->op_type_num,
+					  numa_max_node() + 1, wd_comp_poll_ctx);
+	if (!wd_comp_sched) {
+		ret = -WD_EINVAL;
+		goto out_uninit;
+	}
+	wd_comp_sched->name = SCHED_RR_NAME;
+	wd_comp_init_attrs.sched = wd_comp_sched;
+
+	ret = wd_alg_pre_init(&wd_comp_init_attrs);
+	if (ret)
+		goto out_freesched;
+
+	ret = wd_comp_init(&wd_comp_ctx, wd_comp_sched);
+	if (ret)
+		goto out_freesched;
+
+	wd_alg_set_init(&wd_comp_setting.status2);
+
+	return 0;
+
+out_freesched:
+	wd_sched_rr_release(wd_comp_sched);
+
+out_uninit:
+	wd_alg_clear_init(&wd_comp_setting.status2);
+
+	return ret;
+}
+
+void wd_comp_uninit2(void)
+{
+	int i;
+
+	wd_comp_uninit();
+
+	for (i = 0; i < wd_comp_ctx.ctx_num; i++)
+		if (wd_comp_ctx.ctxs[i].ctx) {
+			wd_release_ctx(wd_comp_ctx.ctxs[i].ctx);
+			wd_comp_ctx.ctxs[i].ctx = 0;
+		}
+
+	wd_sched_rr_release(wd_comp_sched);
+	wd_alg_clear_init(&wd_comp_setting.status2);
+}
+
 struct wd_comp_msg *wd_comp_get_msg(__u32 idx, __u32 tag)
 {
 	return wd_find_msg_in_pool(&wd_comp_setting.pool, idx, tag);
@@ -289,6 +381,7 @@ handle_t wd_comp_alloc_sess(struct wd_comp_sess_setup *setup)
 	sess->comp_lv = setup->comp_lv;
 	sess->win_sz = setup->win_sz;
 	sess->stream_pos = WD_COMP_STREAM_NEW;
+
 	/* Some simple scheduler don't need scheduling parameters */
 	sess->sched_key = (void *)wd_comp_setting.sched.sched_init(
 		wd_comp_setting.sched.h_sched_ctx, setup->sched_param);
@@ -318,6 +411,7 @@ void wd_comp_free_sess(handle_t h_sess)
 	if (sess->sched_key)
 		free(sess->sched_key);
+
 	free(sess);
 }
diff --git a/wd_util.c b/wd_util.c
index fa77b46..8fed8ac 100644
--- a/wd_util.c
+++ b/wd_util.c
@@ -5,7 +5,6 @@
  */

 #define _GNU_SOURCE
-#include <numa.h>
 #include <pthread.h>
 #include <semaphore.h>
 #include <string.h>
@@ -1801,3 +1800,238 @@ bool wd_alg_try_init(enum wd_status *status)

 	return true;
 }
+
+static __u32 wd_get_ctx_numbers(struct wd_ctx_params ctx_params, int end)
+{
+	__u32 count = 0;
+	int i;
+
+	for (i = 0; i < end; i++) {
+		count += ctx_params.ctx_set_num[i].sync_ctx_num;
+		count += ctx_params.ctx_set_num[i].async_ctx_num;
+	}
+
+	return count;
+}
+
+struct uacce_dev_list *wd_get_usable_list(struct uacce_dev_list *list, struct bitmask *bmp)
+{
+	struct uacce_dev_list *p, *node, *result = NULL;
+	struct uacce_dev *dev;
+	int numa_id, ret;
+
+	if (!bmp) {
+		WD_ERR("invalid: bmp is NULL!\n");
+		return WD_ERR_PTR(-WD_EINVAL);
+	}
+
+	p = list;
+	while (p) {
+		dev = p->dev;
+		numa_id = dev->numa_id;
+		ret = numa_bitmask_isbitset(bmp, numa_id);
+		if (!ret) {
+			p = p->next;
+			continue;
+		}
+
+		node = calloc(1, sizeof(*node));
+		if (!node) {
+			result = WD_ERR_PTR(-WD_ENOMEM);
+			goto out_free_list;
+		}
+
+		node->dev = wd_clone_dev(dev);
+		if (!node->dev) {
+			result = WD_ERR_PTR(-WD_ENOMEM);
+			goto out_free_node;
+		}
+
+		if (!result)
+			result = node;
+		else
+			wd_add_dev_to_list(result, node);
+
+		p = p->next;
+	}
+
+	return result ? result : WD_ERR_PTR(-WD_ENODEV);
+
+out_free_node:
+	free(node);
+out_free_list:
+	wd_free_list_accels(result);
+	return result;
+}
+
+static int wd_init_ctx_set(struct wd_init_attrs *attrs, struct uacce_dev_list *list,
+			   int idx, int numa_id, int op_type)
+{
+	struct wd_ctx_nums ctx_nums = attrs->ctx_params->ctx_set_num[op_type];
+	__u32 ctx_set_num = ctx_nums.sync_ctx_num + ctx_nums.async_ctx_num;
+	struct wd_ctx_config *ctx_config = attrs->ctx_config;
+	struct uacce_dev *dev;
+	int i;
+
+	dev = wd_find_dev_by_numa(list, numa_id);
+	if (WD_IS_ERR(dev))
+		return WD_PTR_ERR(dev);
+
+	for (i = idx; i < idx + ctx_set_num; i++) {
+		ctx_config->ctxs[i].ctx = wd_request_ctx(dev);
+		if (errno == WD_EBUSY) {
+			dev = wd_find_dev_by_numa(list, numa_id);
+			if (WD_IS_ERR(dev))
+				return WD_PTR_ERR(dev);
+			i--;
+		}
+		ctx_config->ctxs[i].op_type = op_type;
+		ctx_config->ctxs[i].ctx_mode =
+			((i - idx) < ctx_nums.sync_ctx_num) ?
+			CTX_MODE_SYNC : CTX_MODE_ASYNC;
+	}
+
+	return 0;
+}
+
+static void wd_release_ctx_set(struct wd_ctx_config *ctx_config)
+{
+	int i;
+
+	for (i = 0; i < ctx_config->ctx_num; i++)
+		if (ctx_config->ctxs[i].ctx) {
+			wd_release_ctx(ctx_config->ctxs[i].ctx);
+			ctx_config->ctxs[i].ctx = 0;
+		}
+}
+
+static int wd_instance_sched_set(struct wd_sched *sched, struct wd_ctx_nums ctx_nums,
+				 int idx, int numa_id, int op_type)
+{
+	struct sched_params sparams;
+	int i, ret = 0;
+
+	for (i = 0; i < CTX_MODE_MAX; i++) {
+		sparams.numa_id = numa_id;
+		sparams.type = op_type;
+		sparams.mode = i;
+		sparams.begin = idx + ctx_nums.sync_ctx_num * i;
+		sparams.end = idx - 1 + ctx_nums.sync_ctx_num + ctx_nums.async_ctx_num * i;
+		if (sparams.begin > sparams.end)
+			continue;
+		ret = wd_sched_rr_instance(sched, &sparams);
+		if (ret)
+			goto out;
+	}
+
+out:
+	return ret;
+}
+
+static int wd_init_ctx_and_sched(struct wd_init_attrs *attrs, struct bitmask *bmp,
+				 struct uacce_dev_list *list)
+{
+	struct wd_ctx_params *ctx_params = attrs->ctx_params;
+	__u32 op_type_num = ctx_params->op_type_num;
+	int max_node = numa_max_node() + 1;
+	struct wd_ctx_nums ctx_nums;
+	int i, j, ret;
+	int idx = 0;
+
+	for (i = 0; i < max_node; i++) {
+		if (!numa_bitmask_isbitset(bmp, i))
+			continue;
+		for (j = 0; j < op_type_num; j++) {
+			ctx_nums = ctx_params->ctx_set_num[j];
+			ret = wd_init_ctx_set(attrs, list, idx, i, j);
+			if (ret)
+				goto free_ctxs;
+			ret = wd_instance_sched_set(attrs->sched, ctx_nums, idx, i, j);
+			if (ret)
+				goto free_ctxs;
+			idx += (ctx_nums.sync_ctx_num + ctx_nums.async_ctx_num);
+		}
+	}
+
+	return 0;
+
+free_ctxs:
+	wd_release_ctx_set(attrs->ctx_config);
+
+	return ret;
+}
+
+int wd_alg_pre_init(struct wd_init_attrs *attrs)
+{
+	struct wd_ctx_config *ctx_config = attrs->ctx_config;
+	struct wd_ctx_params *ctx_params = attrs->ctx_params;
+	struct bitmask *used_bmp, *bmp = ctx_params->bmp;
+	struct uacce_dev_list *list, *used_list = NULL;
+	__u32 ctx_set_num, op_type_num;
+	int numa_cnt, ret;
+
+	list = wd_get_accel_list(attrs->alg);
+	if (!list) {
+		WD_ERR("failed to get devices!\n");
+		return -WD_ENODEV;
+	}
+
+	op_type_num = ctx_params->op_type_num;
+	ctx_set_num = wd_get_ctx_numbers(*ctx_params, op_type_num);
+	if (!ctx_set_num || !op_type_num) {
+		WD_ERR("invalid: ctx_set_num is %d, op_type_num is %d!\n",
+		       ctx_set_num, op_type_num);
+		ret = -WD_EINVAL;
+		goto out_freelist;
+	}
+
+	/*
+	 * Not every numa has a device. Therefore, the first thing is to
+	 * filter the devices in the selected numa node, and the second
+	 * thing is to obtain the distribution of devices.
+	 */
+	if (bmp) {
+		used_list = wd_get_usable_list(list, bmp);
+		if (WD_IS_ERR(used_list)) {
+			ret = WD_PTR_ERR(used_list);
+			WD_ERR("failed to get usable devices(%d)!\n", ret);
+			goto out_freelist;
+		}
+	}
+
+	used_bmp = wd_create_device_nodemask(used_list ? used_list : list);
+	if (WD_IS_ERR(used_bmp)) {
+		ret = WD_PTR_ERR(used_bmp);
+		goto out_freeusedlist;
+	}
+
+	numa_cnt = numa_bitmask_weight(used_bmp);
+	if (!numa_cnt) {
+		ret = numa_cnt;
+		WD_ERR("invalid: bmp is clear!\n");
+		goto out_freenodemask;
+	}
+
+	ctx_config->ctx_num = ctx_set_num * numa_cnt;
+	ctx_config->ctxs = calloc(ctx_config->ctx_num, sizeof(struct wd_ctx));
+	if (!ctx_config->ctxs) {
+		ret = -WD_ENOMEM;
+		WD_ERR("failed to alloc ctxs!\n");
+		goto out_freenodemask;
+	}
+
+	ret = wd_init_ctx_and_sched(attrs, used_bmp, used_list ? used_list : list);
+	if (ret)
+		free(ctx_config->ctxs);
+
+out_freenodemask:
+	wd_free_device_nodemask(used_bmp);
+
+out_freeusedlist:
+	wd_free_list_accels(used_list);
+
+out_freelist:
+	wd_free_list_accels(list);
+
+	return ret;
+}
Because wd_alg_init is complex, add the wd_alg_init2 interface for users, and add the design document.
Signed-off-by: Yang Shen shenyang39@huawei.com
---
 docs/wd_alg_init2.md | 159 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 159 insertions(+)
 create mode 100644 docs/wd_alg_init2.md
diff --git a/docs/wd_alg_init2.md b/docs/wd_alg_init2.md
new file mode 100644
index 0000000..2673b03
--- /dev/null
+++ b/docs/wd_alg_init2.md
@@ -0,0 +1,159 @@
+# wd_alg_init2
+
+## Preface
+
+The current uadk initialization process is:
+1. Call wd_request_ctx() to request ctxs from devices.
+2. Call wd_sched_rr_alloc() to create a sched (or another scheduler alloc function, if one exists).
+3. Initialize the sched.
+4. Call wd_alg_init() with ctx_config and sched.
+
+```flow
+st=>start: Start
+o1=>operation: request ctxs
+o2=>operation: create uadk_sched and instance ctxs to sched region
+o3=>operation: call wd_alg_init
+e=>end
+st->o1->o2->o3->e
+```
+
+The logic is reasonable. But in practice, the wd_request_ctx() and
+wd_sched_rr_alloc() steps are very tedious, which makes the interface
+difficult for users to use. One of the main reasons for this is
+that uadk has made a lot of configurations in the scheduler in order
+to provide users with better performance. Based on this consideration,
+the current uadk requires the user to arrange the division of hardware
+resources according to the device topology during initialization.
+Therefore, as an advanced interface, this scheme can provide customized
+configuration for users with deeper needs.
+
+## wd_alg_init2
+
+### Design
+
+Is there any way to simplify these steps? Not currently. Because the
+architecture model designed by uadk is to manage hardware resources
+through a scheduler, users no longer see the hardware resources after
+specifying them, and all subsequent tasks are handled by the scheduler.
+The original intention of this design is to make the scenarios supported
+by uadk more flexible. Because resource requirements and task models
+differ between business scenarios, the best performance can be
+obtained by letting the scheduler match resources to each
+scenario.
+
+But we can try to provide a layer of encapsulation. The original design
+intention of this layer of encapsulation is that users only need to
+specify available resources and requirements, and the configuration of
+resources is completed internally by the interface. Because the previous
+interface complexity mainly lies in the parameter configuration of ctxs
+and the scheduler, it is easy for users to make configuration errors and
+introduce bugs because they misunderstand the parameters.
+
+All algorithms have the same input parameters and initialization logic.
+
+```c
+struct wd_ctx_config {
+	__u32 ctx_num;
+	struct wd_ctx *ctxs;
+	void *priv;
+};
+
+struct wd_sched {
+	const char *name;
+	int sched_policy;
+	handle_t (*sched_init)(handle_t h_sched_ctx, void *sched_param);
+	__u32 (*pick_next_ctx)(handle_t h_sched_ctx, void *sched_key,
+			       const int sched_mode);
+	int (*poll_policy)(handle_t h_sched_ctx, __u32 expect, __u32 *count);
+	handle_t h_sched_ctx;
+};
+
+int wd_alg_init(struct wd_ctx_config *config, struct wd_sched *sched);
+```
+
+`wd_ctx_config` is the requested ctxs descriptor, and the attributes
+of ctxs are contained in their own structure. The attributes will be
+used by the scheduler to pick a ctx according to the request type. The main
+difficulty in this step is that users need to request ctxs from the
+appropriate device nodes according to their own business distribution.
+If the user does not consider the appropriate device distribution,
+requests may cross chips or numa nodes, which will affect
+performance.
+
+`wd_sched` is the scheduler descriptor of the request. It will create
+the scheduling domain based on parameters passed by the users. Users need
+to assign the requested ctxs to the scheduling domain that matches their
+attributes, so that uadk can select the appropriate ctxs according to
+the issued task. The main difficulty in this step is that the user
+needs to initialize the correct scheduling domain according to the
+attributes of the ctxs requested earlier. However, ctxs have many
+attributes, which must be divided along multiple dimensions. If the
+parameters are not understood well enough, it is easy to make queue
+allocation errors, so that the wrong ctxs are scheduled when
+the task is finally issued, causing unexpected errors.
+
+Therefore, the next thing to be done is to use limited and easy-to-use
+input parameters to describe users' requirements on the two input
+parameters, ensuring that the functions of the new interface init2
+are the same as those of init. For ease of description, v1 is used
+to refer to the existing interface, and v2 is used to refer to the
+layer of encapsulation.
+
+Let's clarify the following logic first: all uacce devices under a
+numa node can be regarded as the same. So although we request
+ctxs from devices, we manage ctxs according to numa nodes.
+That means if users want to get the same performance on all cpus,
+the uadk configuration should be the same for all numa nodes.
+
+At present, at least 4 parameters are required to meet the user
+configuration requirements while the v1 interface functions remain
+unchanged.
+
+@alg: The algorithm users want to use.
+
+@sched_type: The scheduling type users want to use.
+
+@task_tp: Reserved.
+
+@wd_ctx_params: op_type_num and ctx_set_num describe the requested ctx
+numbers for each numa node. Because users may have different requirements
+for different types of ctxs, a two-dimensional array is needed as
+input. bmp is the bitmask provided by libnuma. Users can use this parameter
+to control which devices ctxs are requested from in the NUMA-binding scenario.
+This parameter is mainly convenient for users in the cpu-binding
+scenario. It can avoid resource waste or initialization failure
+caused by insufficient resources. Libnuma provides a complete operation
+interface which can be found in numa.h.
+
+To sum up, wd_alg_init2_() is as follows:
+
+```c
+struct wd_ctx_nums {
+	__u32 sync_ctx_num;
+	__u32 async_ctx_num;
+};
+
+struct wd_ctx_params {
+	__u32 op_type_num;
+	struct wd_ctx_nums *ctx_set_num;
+	struct bitmask *bmp;
+};
+
+int wd_alg_init2_(char *alg, __u32 sched_type, int task_tp,
+		  struct wd_ctx_params *ctx_params);
+```
+
+Somebody may say that wd_alg_init2_() is still complex because its
+input parameters include structures. So the interface supports default values
+for some parameters. The @bmp can be set to NULL, and then it will be
+initialized according to the device list. The @ctx_params can be set to NULL,
+and it has a default value in wd_alg.c. So there is a simpler interface,
+wd_alg_init2().
+
+```c
+#define wd_alg_init2(alg, sched_type, task_tp) \
+	wd_alg_init2_(alg, sched_type, task_tp, NULL)
+```
+
+Please do not use this interface together with wd_comp_init(),
+or some resources may be leaked.
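The NULL-defaulting described above, together with the per-numa-node sizing of ctxs, can be sketched without the library. The structs below are local mirrors of the ones in this document (with bmp omitted); `demo_init2_`, `demo_init2` and `demo_default_params` are hypothetical names, the default of one sync and one async ctx per op type follows the wd_comp defaults in the accompanying patch, and the function only computes how many ctxs would be requested rather than requesting them.

```c
#include <assert.h>
#include <stddef.h>

/* Local mirrors of the structures shown in the document (bmp omitted). */
struct demo_ctx_nums {
	unsigned int sync_ctx_num;
	unsigned int async_ctx_num;
};

struct demo_ctx_params {
	unsigned int op_type_num;
	const struct demo_ctx_nums *ctx_set_num;
};

/* Assumed default: one sync and one async ctx for each of two op types. */
static const struct demo_ctx_nums demo_default_nums[] = { {1, 1}, {1, 1} };
static const struct demo_ctx_params demo_default_params = {
	.op_type_num = 2,
	.ctx_set_num = demo_default_nums,
};

/* Return the total ctx count for @numa_cnt nodes; NULL params select the defaults. */
static unsigned int demo_init2_(const struct demo_ctx_params *params,
				unsigned int numa_cnt)
{
	unsigned int per_node = 0;
	unsigned int i;

	if (!params)
		params = &demo_default_params;

	assert(params->ctx_set_num != NULL);

	/* One ctx set (all op types, sync plus async) is requested per node. */
	for (i = 0; i < params->op_type_num; i++)
		per_node += params->ctx_set_num[i].sync_ctx_num +
			    params->ctx_set_num[i].async_ctx_num;

	return per_node * numa_cnt;
}

/* Simplified wrapper in the style of wd_alg_init2(): always use the defaults. */
#define demo_init2(numa_cnt) demo_init2_(NULL, (numa_cnt))
```

Under these assumptions the defaults give 4 ctxs per node, so two nodes need 8 ctxs in total, while a custom set of {4, 0} and {0, 2} on three nodes needs 18.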