[PATCH openEuler-1.0-LTS] sched: Fix null pointer dereference for sd->span
by Zhang Changzhong 30 Jun '23
From: Hui Tang <tanghui20(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7HFZV
CVE: NA
----------------------------------------
A NULL pointer dereference may occur when CPU hotplug and task-group
creation run concurrently:
sched_autogroup_create_attach
-> sched_create_group
-> alloc_fair_sched_group
-> init_auto_affinity
-> init_affinity_domains
-> cpumask_copy(xx, sched_domain_span(tmp))
{ tmp may be freed because the rcu read lock is missing }
{ hotplug rebuilds the sched domains }
sched_cpu_activate
-> build_sched_domains
-> cpuset_cpu_active
-> partition_sched_domains
-> build_sched_domains
-> cpu_attach_domain
-> destroy_sched_domains
-> call_rcu(&sd->rcu, destroy_sched_domains_rcu)
So sd must be protected by the rcu read lock across the entire critical section.
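A minimal sketch of the corrected pattern, holding the rcu read lock across the whole domain walk and copy (identifiers follow the patch; this is illustrative kernel-style C, not a compilable standalone unit, and the loop shape is simplified):

```c
static int init_affinity_domains_sketch(struct affinity_domain *ad, int cpu)
{
	struct sched_domain *tmp;
	int i = 0;

	rcu_read_lock();	/* pin the current sched-domain tree */
	for_each_domain(cpu, tmp) {
		/*
		 * A concurrent hotplug rebuild frees old domains via
		 * call_rcu(&sd->rcu, destroy_sched_domains_rcu), so
		 * sched_domain_span(tmp) is only safe to read while the
		 * read-side critical section is still open; the copy
		 * must therefore happen inside it, not after unlock.
		 */
		cpumask_copy(ad->domains[i++], sched_domain_span(tmp));
	}
	rcu_read_unlock();	/* only now may the domains be reclaimed */
	return 0;
}
```

This is also why the patch moves the kmalloc() calls in front of the locked region: GFP_KERNEL allocations may sleep, which is not allowed inside an rcu read-side critical section.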
[ 599.811593] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[ 600.112821] pc : init_affinity_domains+0xf4/0x200
[ 600.125918] lr : init_affinity_domains+0xd4/0x200
[ 600.331355] Call trace:
[ 600.338734] init_affinity_domains+0xf4/0x200
[ 600.347955] init_auto_affinity+0x78/0xc0
[ 600.356622] alloc_fair_sched_group+0xd8/0x210
[ 600.365594] sched_create_group+0x48/0xc0
[ 600.373970] sched_autogroup_create_attach+0x54/0x190
[ 600.383311] ksys_setsid+0x110/0x130
[ 600.391014] __arm64_sys_setsid+0x18/0x24
[ 600.399156] el0_svc_common+0x118/0x170
[ 600.406818] el0_svc_handler+0x3c/0x80
[ 600.414188] el0_svc+0x8/0x640
[ 600.420719] Code: b40002c0 9104e002 f9402061 a9401444 (a9001424)
[ 600.430504] SMP: stopping secondary CPUs
[ 600.441751] Starting crashdump kernel...
Fixes: 713cfd2684fa ("sched: Introduce smart grid scheduling strategy for cfs")
Signed-off-by: Hui Tang <tanghui20(a)huawei.com>
Reviewed-by: Zhang Qiao <zhangqiao22(a)huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong(a)huawei.com>
---
kernel/sched/fair.c | 23 ++++++++++-------------
1 file changed, 10 insertions(+), 13 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index e9eb00e..622d433 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5582,7 +5582,7 @@ void free_affinity_domains(struct affinity_domain *ad)
{
int i;
- for (i = 0; i < ad->dcount; i++) {
+ for (i = 0; i < AD_LEVEL_MAX; i++) {
kfree(ad->domains[i]);
kfree(ad->domains_orig[i]);
ad->domains[i] = NULL;
@@ -5621,6 +5621,12 @@ static int init_affinity_domains(struct affinity_domain *ad)
int i = 0;
int cpu;
+ for (i = 0; i < AD_LEVEL_MAX; i++) {
+ ad->domains[i] = kmalloc(sizeof(cpumask_t), GFP_KERNEL);
+ if (!ad->domains[i])
+ goto err;
+ }
+
rcu_read_lock();
cpu = cpumask_first_and(cpu_active_mask,
housekeeping_cpumask(HK_FLAG_DOMAIN));
@@ -5629,21 +5635,12 @@ static int init_affinity_domains(struct affinity_domain *ad)
dcount++;
}
- if (!sd) {
+ if (!sd || dcount > AD_LEVEL_MAX) {
rcu_read_unlock();
- return -EINVAL;
- }
- rcu_read_unlock();
-
- for (i = 0; i < dcount; i++) {
- ad->domains[i] = kmalloc(sizeof(cpumask_t), GFP_KERNEL);
- if (!ad->domains[i]) {
- ad->dcount = i;
- goto err;
- }
+ ret = -EINVAL;
+ goto err;
}
- rcu_read_lock();
idlest = sd_find_idlest_group(sd);
cpu = group_find_idlest_cpu(idlest);
i = 0;
--
2.9.5
Tu Jinjiang (2):
crypto: add lz4k Cryptographic API
mm/zram: Add lz4k support for zram
crypto/Kconfig | 8 +
crypto/Makefile | 1 +
crypto/lz4k.c | 100 ++++++
drivers/block/zram/zcomp.c | 3 +
include/linux/lz4k.h | 384 +++++++++++++++++++++++
lib/Kconfig | 6 +
lib/Makefile | 2 +
lib/lz4k/Makefile | 2 +
lib/lz4k/lz4k_decode.c | 314 +++++++++++++++++++
lib/lz4k/lz4k_encode.c | 554 +++++++++++++++++++++++++++++++++
lib/lz4k/lz4k_encode_private.h | 142 +++++++++
lib/lz4k/lz4k_private.h | 282 +++++++++++++++++
12 files changed, 1798 insertions(+)
create mode 100644 crypto/lz4k.c
create mode 100644 include/linux/lz4k.h
create mode 100644 lib/lz4k/Makefile
create mode 100644 lib/lz4k/lz4k_decode.c
create mode 100644 lib/lz4k/lz4k_encode.c
create mode 100644 lib/lz4k/lz4k_encode_private.h
create mode 100644 lib/lz4k/lz4k_private.h
--
2.25.1
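Once registered, the algorithm is reachable through the kernel crypto API like any other compression transform; a hedged sketch of how a caller (for example zram's zcomp layer, which looks the backend up by the "lz4k" name added to its backends[] table) would drive it (illustrative kernel-style C, assuming the 4.19-era crypto_comp interface used by this kernel):

```c
#include <linux/crypto.h>
#include <linux/err.h>

static int compress_with_lz4k(const u8 *src, unsigned int slen,
			      u8 *dst, unsigned int *dlen)
{
	struct crypto_comp *tfm;
	int ret;

	/* Allocating the transform runs lz4k_init(), which vmalloc()s
	 * the per-context state of lz4k_encode_state_bytes_min() bytes. */
	tfm = crypto_alloc_comp("lz4k", 0, 0);
	if (IS_ERR(tfm))
		return PTR_ERR(tfm);

	/* Dispatches to lz4k_compress_crypto(); *dlen is in/out: the
	 * output capacity on entry, the compressed size on success. */
	ret = crypto_comp_compress(tfm, src, slen, dst, dlen);

	crypto_free_comp(tfm);
	return ret;
}
```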
On 2023/6/30 10:39, Kefeng Wang wrote:
>
> Please refresh the overall code style to follow the Linux kernel requirements.
>
>
I am revising it.
> On 2023/6/30 10:41, Tu Jinjiang wrote:
>> hulk inclusion
>> category: feature
>> bugzilla: https://gitee.com/openeuler/kernel/issues/I7H9IA
>> CVE: NA
>>
>> -------------------------------------------
>>
>> Add lz4k algorithm support for zram.
> Is there any original author or commit information? If so, it should be preserved.
I do not have that information.
>>
>> Signed-off-by: Nanyong Sun <sunnanyong(a)huawei.com>
>> Signed-off-by: Tu Jinjiang <tujinjiang(a)huawei.com>
>> ---
>> crypto/Kconfig | 8 +
>> crypto/Makefile | 1 +
>> crypto/lz4k.c | 97 ++++++
>> drivers/block/zram/zcomp.c | 3 +
>> include/linux/lz4k.h | 383 +++++++++++++++++++++++
>> lib/Kconfig | 6 +
>> lib/Makefile | 2 +
>> lib/lz4k/Makefile | 2 +
>> lib/lz4k/lz4k_decode.c | 308 +++++++++++++++++++
>> lib/lz4k/lz4k_encode.c | 539 +++++++++++++++++++++++++++++++++
>> lib/lz4k/lz4k_encode_private.h | 137 +++++++++
>> lib/lz4k/lz4k_private.h | 269 ++++++++++++++++
>> 12 files changed, 1755 insertions(+)
>> create mode 100644 crypto/lz4k.c
>> create mode 100644 include/linux/lz4k.h
>> create mode 100644 lib/lz4k/Makefile
>> create mode 100644 lib/lz4k/lz4k_decode.c
>> create mode 100644 lib/lz4k/lz4k_encode.c
>> create mode 100644 lib/lz4k/lz4k_encode_private.h
>> create mode 100644 lib/lz4k/lz4k_private.h
>>
>> diff --git a/crypto/Kconfig b/crypto/Kconfig
>> index 64cb304f5103..35223cff7c8a 100644
>> --- a/crypto/Kconfig
>> +++ b/crypto/Kconfig
>> @@ -1871,6 +1871,14 @@ config CRYPTO_LZ4HC
>> help
>> This is the LZ4 high compression mode algorithm.
>> +config CRYPTO_LZ4K
>> + tristate "LZ4K compression algorithm"
>> + select CRYPTO_ALGAPI
>> + select LZ4K_COMPRESS
>> + select LZ4K_DECOMPRESS
>> + help
>> + This is the LZ4K algorithm.
>> +
>> config CRYPTO_ZSTD
>> tristate "Zstd compression algorithm"
>> select CRYPTO_ALGAPI
>> diff --git a/crypto/Makefile b/crypto/Makefile
>> index 9d1191f2b741..5c3b0a0839c5 100644
>> --- a/crypto/Makefile
>> +++ b/crypto/Makefile
>> @@ -161,6 +161,7 @@ obj-$(CONFIG_CRYPTO_AUTHENC) += authenc.o
>> authencesn.o
>> obj-$(CONFIG_CRYPTO_LZO) += lzo.o lzo-rle.o
>> obj-$(CONFIG_CRYPTO_LZ4) += lz4.o
>> obj-$(CONFIG_CRYPTO_LZ4HC) += lz4hc.o
>> +obj-$(CONFIG_CRYPTO_LZ4K) += lz4k.o
>> obj-$(CONFIG_CRYPTO_XXHASH) += xxhash_generic.o
>> obj-$(CONFIG_CRYPTO_842) += 842.o
>> obj-$(CONFIG_CRYPTO_RNG2) += rng.o
>> diff --git a/crypto/lz4k.c b/crypto/lz4k.c
>> new file mode 100644
>> index 000000000000..8daceab269ef
>> --- /dev/null
>> +++ b/crypto/lz4k.c
>> @@ -0,0 +1,97 @@
>> +/*
>> + * Copyright (c) Huawei Technologies Co., Ltd. 2012-2020. All rights
>> reserved.
>> + * Description: LZ4K compression algorithm for ZRAM
>> + * Author: Arkhipov Denis arkhipov.denis(a)huawei.com
>> + * Create: 2020-03-25
>> + */
>> +
>> +#include <linux/init.h>
>> +#include <linux/module.h>
>> +#include <linux/crypto.h>
>> +#include <linux/vmalloc.h>
>> +#include <linux/lz4k.h>
>> +
>> +
>> +struct lz4k_ctx {
>> + void *lz4k_comp_mem;
>> +};
>> +
>> +static int lz4k_init(struct crypto_tfm *tfm)
>> +{
>> + struct lz4k_ctx *ctx = crypto_tfm_ctx(tfm);
>> +
>> + ctx->lz4k_comp_mem = vmalloc(lz4k_encode_state_bytes_min());
>> + if (!ctx->lz4k_comp_mem)
>> + return -ENOMEM;
>> +
>> + return 0;
>> +}
>> +
>> +static void lz4k_exit(struct crypto_tfm *tfm)
>> +{
>> + struct lz4k_ctx *ctx = crypto_tfm_ctx(tfm);
>> + vfree(ctx->lz4k_comp_mem);
>> +}
>> +
>> +static int lz4k_compress_crypto(struct crypto_tfm *tfm, const u8
>> *src, unsigned int slen, u8 *dst, unsigned int *dlen)
>> +{
>> + struct lz4k_ctx *ctx = crypto_tfm_ctx(tfm);
>> + int ret;
>> +
>> + ret = lz4k_encode(ctx->lz4k_comp_mem, src, dst, slen, *dlen, 0);
>> +
> Remove the blank line.
>> + if (ret < 0) {
>> + return -EINVAL;
>> + }
> Remove the braces.
>> +
>> + if (ret)
>> + *dlen = ret;
>> +
>> + return 0;
>> +}
>> +
>> +static int lz4k_decompress_crypto(struct crypto_tfm *tfm, const u8
>> *src, unsigned int slen, u8 *dst, unsigned int *dlen)
>> +{
>> + int ret;
>> +
>> + ret = lz4k_decode(src, dst, slen, *dlen);
>> +
> Remove the blank line.
>> + if (ret <= 0)
>> + return -EINVAL;
> Add a blank line.
>> + *dlen = ret;
>> + return 0;
>> +}
>> +
>> +static struct crypto_alg alg_lz4k = {
>> + .cra_name = "lz4k",
>> + .cra_driver_name = "lz4k-generic",
>> + .cra_flags = CRYPTO_ALG_TYPE_COMPRESS,
>> + .cra_ctxsize = sizeof(struct lz4k_ctx),
>> + .cra_module = THIS_MODULE,
>> + .cra_list = LIST_HEAD_INIT(alg_lz4k.cra_list),
>> + .cra_init = lz4k_init,
>> + .cra_exit = lz4k_exit,
>> + .cra_u = {
>> + .compress = {
>> + .coa_compress = lz4k_compress_crypto,
>> + .coa_decompress = lz4k_decompress_crypto
>> + }
>> + }
>> +};
>> +
>> +static int __init lz4k_mod_init(void)
>> +{
>> + return crypto_register_alg(&alg_lz4k);
>> +}
>> +
>> +static void __exit lz4k_mod_fini(void)
>> +{
>> + crypto_unregister_alg(&alg_lz4k);
>> +}
>> +
>> +module_init(lz4k_mod_init);
>> +module_exit(lz4k_mod_fini);
>> +
>> +MODULE_LICENSE("GPL");
>> +MODULE_DESCRIPTION("LZ4K Compression Algorithm");
>> +MODULE_ALIAS_CRYPTO("lz4k");
>> diff --git a/drivers/block/zram/zcomp.c b/drivers/block/zram/zcomp.c
>> index b08650417bf0..28bda2035326 100644
>> --- a/drivers/block/zram/zcomp.c
>> +++ b/drivers/block/zram/zcomp.c
>> @@ -29,6 +29,9 @@ static const char * const backends[] = {
>> #if IS_ENABLED(CONFIG_CRYPTO_ZSTD)
>> "zstd",
>> #endif
>> +#if IS_ENABLED(CONFIG_CRYPTO_LZ4K)
>> + "lz4k",
>> +#endif
>> };
>> static void zcomp_strm_free(struct zcomp_strm *zstrm)
>> diff --git a/include/linux/lz4k.h b/include/linux/lz4k.h
>> new file mode 100644
>> index 000000000000..6e73161b1840
>> --- /dev/null
>> +++ b/include/linux/lz4k.h
>> @@ -0,0 +1,383 @@
>> +/*
>> + * Copyright (c) Huawei Technologies Co., Ltd. 2012-2020. All rights
>> reserved.
>> + * Description: LZ4K compression algorithm
>> + * Author: Aleksei Romanovskii aleksei.romanovskii(a)huawei.com
>> + * Created: 2020-03-25
>> + */
>> +
>> +#ifndef _LZ4K_H
>> +#define _LZ4K_H
>> +
>> +/* file lz4k.h
>> + This file contains the platform-independent API of LZ-class
>> + lossless codecs (compressors/decompressors) with complete
>> + in-place documentation. The documentation is formatted
>> + in accordance with DOXYGEN mark-up format. So, one can
>> + generate proper documentation, e.g. in HTML format, using DOXYGEN.
>> +
>> + Currently, LZ-class codecs, documented here, implement following
>> + algorithms for lossless data compression/decompression:
>> + \li "LZ HUAWEI" proprietary codec competing with LZ4 - lz4k_encode(),
>> + lz4k_encode_delta(), lz4k_decode(), lz4k_decode_delta()
>> +
>> + The LZ HUAWEI compressors accept any data as input and compress it
>> + without loss to a smaller size if possible.
>> + Compressed data produced by LZ HUAWEI compressor API lz4k_encode*(),
>> + can be decompressed only by lz4k_decode() API documented below.\n
>> + */
>> +
>> +/*
>> + lz4k_status defines simple set of status values returned by Huawei
>> APIs
>> + */
>
> Change the various "Huawei" references to "lz4k" or similar; the algorithm itself should not keep names like "LZ HUAWEI".
>
>> +typedef enum {
>> + LZ4K_STATUS_INCOMPRESSIBLE = 0, /* !< Return when data is
>> incompressible */
>> + LZ4K_STATUS_FAILED = -1, /* !< Return on general failure */
>> + LZ4K_STATUS_READ_ERROR = -2, /* !< Return when data reading
>> failed */
>> + LZ4K_STATUS_WRITE_ERROR = -3 /* !< Return when data writing
>> failed */
>> +} lz4k_status;
>> +
>> +/*
>> + lz4k_version() returns a static immutable string with the algorithm version
>> + */
>> +const char *lz4k_version(void);
>> +
>> +/*
>> + lz4k_encode_state_bytes_min() returns number of bytes for state
>> parameter,
>> + supplied to lz4k_encode(), lz4k_encode_delta(),
>> + lz4k_update_delta_state().
>> + So, state should occupy at least lz4k_encode_state_bytes_min() for
>> mentioned
>> + functions to work correctly.
>> + */
>> +unsigned lz4k_encode_state_bytes_min(void);
>
> The comment style below should be reworked, or some of the useless comments removed.
>
>> +
>> +/*
>> + lz4k_encode() encodes/compresses one input buffer at *in, places
>> + result of encoding into one output buffer at *out if encoded data
>> + size fits specified values of out_max and out_limit.
>> + It returns the size of encoded data in case of success or a value <= 0
>> otherwise.
>> + The result of successful encoding is in HUAWEI proprietary format,
>> that
>> + is the encoded data can be decoded only by lz4k_decode().
>> +
>> + \return
>> + \li positive value\n
>> + if encoding was successful. The value returned is the size of
>> encoded
>> + (compressed) data always <=out_max.
>> + \li non-positive value\n
>> + if in==0||in_max==0||out==0||out_max==0 or
>> + if out_max is less than needed for encoded (compressed) data.
>> + \li 0 value\n
>> + if encoded data size >= out_limit
>> +
>> + \param[in] state
>> + !=0, pointer to state buffer used internally by the function.
>> Size of
>> + state in bytes should be at least
>> lz4k_encode_state_bytes_min(). The content
>> + of state buffer will be changed during encoding.
>> +
>> + \param[in] in
>> + !=0, pointer to the input buffer to encode (compress). The
>> content of
>> + the input buffer does not change during encoding.
>> +
>> + \param[in] out
>> + !=0, pointer to the output buffer where to place result of encoding
>> + (compression).
>> + If encoding is unsuccessful, e.g. out_max or out_limit are less
>> than
>> + needed for encoded data then content of out buffer may be
>> arbitrary.
>> +
>> + \param[in] in_max
>> + !=0, size in bytes of the input buffer at *in
>> +
>> + \param[in] out_max
>> + !=0, size in bytes of the output buffer at *out
>> +
>> + \param[in] out_limit
>> + encoded data size soft limit in bytes. Due to performance
>> reasons it is
>> + not guaranteed that
>> + lz4k_encode will always detect that resulting encoded data size is
>> + bigger than out_limit.
>> + However, when reaching out_limit is detected, lz4k_encode() returns
>> + earlier and spares CPU cycles. Caller code should recheck result
>> + returned by lz4k_encode() (value greater than 0) if it is really
>> + less or equal than out_limit.
>> + out_limit is ignored if it is equal to 0.
>> + */
>> +int lz4k_encode(
>> + void *const state,
>> + const void *const in,
>> + void *out,
>> + unsigned in_max,
>> + unsigned out_max,
>> + unsigned out_limit);
>> +
>> +/*
>> + lz4k_encode_max_cr() encodes/compresses one input buffer at *in,
>> places
>> + result of encoding into one output buffer at *out if encoded data
>> + size fits specified value of out_max.
>> + It returns the size of encoded data in case of success or a value <= 0
>> otherwise.
>> + The result of successful encoding is in HUAWEI proprietary format,
>> that
>> + is the encoded data can be decoded only by lz4k_decode().
>> +
>> + \return
>> + \li positive value\n
>> + if encoding was successful. The value returned is the size of
>> encoded
>> + (compressed) data always <=out_max.
>> + \li non-positive value\n
>> + if in==0||in_max==0||out==0||out_max==0 or
>> + if out_max is less than needed for encoded (compressed) data.
>> +
>> + \param[in] state
>> + !=0, pointer to state buffer used internally by the function.
>> Size of
>> + state in bytes should be at least
>> lz4k_encode_state_bytes_min(). The content
>> + of state buffer will be changed during encoding.
>> +
>> + \param[in] in
>> + !=0, pointer to the input buffer to encode (compress). The
>> content of
>> + the input buffer does not change during encoding.
>> +
>> + \param[in] out
>> + !=0, pointer to the output buffer where to place result of encoding
>> + (compression).
>> + If encoding is unsuccessful, e.g. out_max is less than
>> + needed for encoded data then content of out buffer may be
>> arbitrary.
>> +
>> + \param[in] in_max
>> + !=0, size in bytes of the input buffer at *in
>> +
>> + \param[in] out_max
>> + !=0, size in bytes of the output buffer at *out
>> +
>> + \param[in] out_limit
>> + encoded data size soft limit in bytes. Due to performance
>> reasons it is
>> + not guaranteed that
>> + lz4k_encode will always detect that resulting encoded data size is
>> + bigger than out_limit.
>> + However, when reaching out_limit is detected, lz4k_encode() returns
>> + earlier and spares CPU cycles. Caller code should recheck result
>> + returned by lz4k_encode() (value greater than 0) if it is really
>> + less or equal than out_limit.
>> + out_limit is ignored if it is equal to 0.
>> + */
>> +int lz4k_encode_max_cr(
>> + void *const state,
>> + const void *const in,
>> + void *out,
>> + unsigned in_max,
>> + unsigned out_max,
>> + unsigned out_limit);
>> +
>> +/*
>> + lz4k_update_delta_state() fills/updates state (hash table) in the
>> same way as
>> + lz4k_encode does while encoding (compressing).
>> + The state and its content can then be used by lz4k_encode_delta()
>> + to encode (compress) data more efficiently.
>> + In other words, the effect of lz4k_update_delta_state() is the same as
>> + lz4k_encode() with all encoded output discarded.
>> +
>> + Example sequence of calls for lz4k_update_delta_state and
>> + lz4k_encode_delta:
>> + //dictionary (1st) block
>> + int result0=lz4k_update_delta_state(state, in0, in0, in_max0);
>> +//delta (2nd) block
>> + int result1=lz4k_encode_delta(state, in0, in, out, in_max,
>> + out_max);
>> +
>> + \param[in] state
>> + !=0, pointer to state buffer used internally by lz4k_encode*.
>> + Size of state in bytes should be at least
>> lz4k_encode_state_bytes_min().
>> + The content of state buffer is zeroed at the beginning of
>> + lz4k_update_delta_state ONLY when in0==in.
>> + The content of state buffer will be changed inside
>> + lz4k_update_delta_state.
>> +
>> + \param[in] in0
>> + !=0, pointer to the reference/dictionary input buffer that was used
>> + as input to preceding call of lz4k_encode() or
>> lz4k_update_delta_state()
>> + to fill/update the state buffer.
>> + The content of the reference/dictionary input buffer does not
>> change
>> + during encoding.
>> + The in0 is needed for use-cases when there are several
>> dictionary and
>> + input blocks interleaved, e.g.
>> + <dictionaryA><inputA><dictionaryB><inputB>..., or
>> + <dictionaryA><dictionaryB><inputAB>..., etc.
>> +
>> + \param[in] in
>> + !=0, pointer to the input buffer to fill/update state as if
>> encoding
>> + (compressing) this input. This input buffer is also called
>> dictionary
>> + input buffer.
>> + The content of the input buffer does not change during encoding.
>> + The two buffers - at in0 and at in - should be contiguous in
>> memory.
>> + That is, the last byte of buffer at in0 is located exactly
>> before byte
>> + at in.
>> +
>> + \param[in] in_max
>> + !=0, size in bytes of the input buffer at in.
>> + */
>> +int lz4k_update_delta_state(
>> + void *const state,
>> + const void *const in0,
>> + const void *const in,
>> + unsigned in_max);
>> +
>> +/*
>> + lz4k_encode_delta() encodes (compresses) data from one input buffer
>> + using one reference buffer as dictionary and places the result of
>> + compression into one output buffer.
>> + The result of successful compression is in HUAWEI proprietary
>> format, so
>> + that compressed data can be decompressed only by lz4k_decode_delta().
>> + Reference/dictionary buffer and input buffer should be contiguous in
>> + memory.
>> +
>> + Example sequence of calls for lz4k_update_delta_state and
>> + lz4k_encode_delta:
>> +//dictionary (1st) block
>> + int result0=lz4k_update_delta_state(state, in0, in0, in_max0);
>> +//delta (2nd) block
>> + int result1=lz4k_encode_delta(state, in0, in, out, in_max,
>> + out_max);
>> +
>> + Example sequence of calls for lz4k_encode and lz4k_encode_delta:
>> +//dictionary (1st) block
>> + int result0=lz4k_encode(state, in0, out0, in_max0, out_max0);
>> +//delta (2nd) block
>> + int result1=lz4k_encode_delta(state, in0, in, out, in_max,
>> + out_max);
>> +
>> + \return
>> + \li positive value\n
>> + if encoding was successful. The value returned is the size of
>> encoded
>> + (compressed) data.
>> + \li non-positive value\n
>> + if state==0||in0==0||in==0||in_max==0||out==0||out_max==0 or
>> + if out_max is less than needed for encoded (compressed) data.
>> +
>> + \param[in] state
>> + !=0, pointer to state buffer used internally by the function.
>> Size of
>> + state in bytes should be at least
>> lz4k_encode_state_bytes_min(). For more
>> + efficient encoding the state buffer may be filled/updated by
>> calling
>> + lz4k_update_delta_state() or lz4k_encode() before
>> lz4k_encode_delta().
>> + The content of state buffer is zeroed at the beginning of
>> + lz4k_encode_delta() ONLY when in0==in.
>> + The content of state will be changed during encoding.
>> +
>> + \param[in] in0
>> + !=0, pointer to the reference/dictionary input buffer that was
>> used as
>> + input to preceding call of lz4k_encode() or
>> lz4k_update_delta_state() to
>> + fill/update the state buffer.
>> + The content of the reference/dictionary input buffer does not
>> change
>> + during encoding.
>> +
>> + \param[in] in
>> + !=0, pointer to the input buffer to encode (compress). The
>> input buffer
>> + is compressed using content of the reference/dictionary input
>> buffer at
>> + in0. The content of the input buffer does not change during
>> encoding.
>> + The two buffers - at *in0 and at *in - should be contiguous in
>> memory.
>> + That is, the last byte of buffer at *in0 is located exactly
>> before byte
>> + at *in.
>> +
>> + \param[in] out
>> + !=0, pointer to the output buffer where to place result of encoding
>> + (compression). If compression is unsuccessful then content of out
>> + buffer may be arbitrary.
>> +
>> + \param[in] in_max
>> + !=0, size in bytes of the input buffer at *in
>> +
>> + \param[in] out_max
>> + !=0, size in bytes of the output buffer at *out.
>> + */
>> +int lz4k_encode_delta(
>> + void *const state,
>> + const void *const in0,
>> + const void *const in,
>> + void *out,
>> + unsigned in_max,
>> + unsigned out_max);
>> +
>> +/*
>> + lz4k_decode() decodes (decompresses) data from one input buffer
>> and places
>> + the result of decompression into one output buffer. The encoded
>> data in input
>> + buffer should be in HUAWEI proprietary format, produced by
>> lz4k_encode()
>> + or by lz4k_encode_delta().
>> +
>> + \return
>> + \li positive value\n
>> + if decoding was successful. The value returned is the size of
>> decoded
>> + (decompressed) data.
>> + \li non-positive value\n
>> + if in==0||in_max==0||out==0||out_max==0 or
>> + if out_max is less than needed for decoded (decompressed) data or
>> + if input encoded data format is corrupted.
>> +
>> + \param[in] in
>> + !=0, pointer to the input buffer to decode (decompress). The
>> content of
>> + the input buffer does not change during decoding.
>> +
>> + \param[in] out
>> + !=0, pointer to the output buffer where to place result of decoding
>> + (decompression). If decompression is unsuccessful then content
>> of out
>> + buffer may be arbitrary.
>> +
>> + \param[in] in_max
>> + !=0, size in bytes of the input buffer at in
>> +
>> + \param[in] out_max
>> + !=0, size in bytes of the output buffer at out
>> + */
>> +int lz4k_decode(
>> + const void *const in,
>> + void *const out,
>> + unsigned in_max,
>> + unsigned out_max);
>> +
>> +/*
>> + lz4k_decode_delta() decodes (decompresses) data from one input buffer
>> + and places the result of decompression into one output buffer. The
>> + compressed data in input buffer should be in format, produced by
>> + lz4k_encode_delta().
>> +
>> + Example sequence of calls for lz4k_decode and lz4k_decode_delta:
>> +//dictionary (1st) block
>> + int result0=lz4k_decode(in0, out0, in_max0, out_max0);
>> +//delta (2nd) block
>> + int result1=lz4k_decode_delta(in, out0, out, in_max, out_max);
>> +
>> + \return
>> + \li positive value\n
>> + if decoding was successful. The value returned is the size of
>> decoded
>> + (decompressed) data.
>> + \li non-positive value\n
>> + if in==0||in_max==0||out==0||out_max==0 or
>> + if out_max is less than needed for decoded (decompressed) data or
>> + if input data format is corrupted.
>> +
>> + \param[in] in
>> + !=0, pointer to the input buffer to decode (decompress). The
>> content of
>> + the input buffer does not change during decoding.
>> +
>> + \param[in] out0
>> + !=0, pointer to the dictionary input buffer that was used as
>> input to
>> + lz4k_update_delta_state() to fill/update the state buffer. The
>> content
>> + of the dictionary input buffer does not change during decoding.
>> +
>> + \param[in] out
>> + !=0, pointer to the output buffer where to place result of decoding
>> + (decompression). If decompression is unsuccessful then content
>> of out
>> + buffer may be arbitrary.
>> + The two buffers - at *out0 and at *out - should be contiguous in
>> memory.
>> + That is, the last byte of buffer at *out0 is located exactly
>> before byte
>> + at *out.
>> +
>> + \param[in] in_max
>> + !=0, size in bytes of the input buffer at *in
>> +
>> + \param[in] out_max
>> + !=0, size in bytes of the output buffer at *out
>> + */
>> +int lz4k_decode_delta(
>> + const void *in,
>> + const void *const out0,
>> + void *const out,
>> + unsigned in_max,
>> + unsigned out_max);
>> +
>> +
>> +#endif /* _LZ4K_H */
>> diff --git a/lib/Kconfig b/lib/Kconfig
>> index 36326864249d..4bf1c2c21157 100644
>> --- a/lib/Kconfig
>> +++ b/lib/Kconfig
>> @@ -310,6 +310,12 @@ config LZ4HC_COMPRESS
>> config LZ4_DECOMPRESS
>> tristate
>> +config LZ4K_COMPRESS
>> + tristate
>> +
>> +config LZ4K_DECOMPRESS
>> + tristate
>> +
>> config ZSTD_COMPRESS
>> select XXHASH
>> tristate
>> diff --git a/lib/Makefile b/lib/Makefile
>> index a803e1527c4b..bd0d3635ae46 100644
>> --- a/lib/Makefile
>> +++ b/lib/Makefile
>> @@ -187,6 +187,8 @@ obj-$(CONFIG_LZO_DECOMPRESS) += lzo/
>> obj-$(CONFIG_LZ4_COMPRESS) += lz4/
>> obj-$(CONFIG_LZ4HC_COMPRESS) += lz4/
>> obj-$(CONFIG_LZ4_DECOMPRESS) += lz4/
>> +obj-$(CONFIG_LZ4K_COMPRESS) += lz4k/
>> +obj-$(CONFIG_LZ4K_DECOMPRESS) += lz4k/
>> obj-$(CONFIG_ZSTD_COMPRESS) += zstd/
>> obj-$(CONFIG_ZSTD_DECOMPRESS) += zstd/
>> obj-$(CONFIG_XZ_DEC) += xz/
>> diff --git a/lib/lz4k/Makefile b/lib/lz4k/Makefile
>> new file mode 100644
>> index 000000000000..6ea3578639d4
>> --- /dev/null
>> +++ b/lib/lz4k/Makefile
>> @@ -0,0 +1,2 @@
>> +obj-$(CONFIG_LZ4K_COMPRESS) += lz4k_encode.o
>> +obj-$(CONFIG_LZ4K_DECOMPRESS) += lz4k_decode.o
>> \ No newline at end of file
>> diff --git a/lib/lz4k/lz4k_decode.c b/lib/lz4k/lz4k_decode.c
>> new file mode 100644
>> index 000000000000..567b76b7bc51
>> --- /dev/null
>> +++ b/lib/lz4k/lz4k_decode.c
>> @@ -0,0 +1,308 @@
>> +/*
>> + * Copyright (c) Huawei Technologies Co., Ltd. 2012-2020. All rights
>> reserved.
>> + * Description: LZ4K compression algorithm
>> + * Author: Aleksei Romanovskii aleksei.romanovskii(a)huawei.com
>> + * Created: 2020-03-25
>> + */
>> +
>> +#if !defined(__KERNEL__)
>> +#include "lz4k.h"
>> +#else
>> +#include <linux/lz4k.h>
>> +#include <linux/module.h>
>> +#endif
>> +
>> +#include "lz4k_private.h" /* types, etc */
>> +
>> +static const uint8_t *get_size(
>> + uint_fast32_t *size,
>> + const uint8_t *in_at,
>> + const uint8_t *const in_end)
>> +{
>> + uint_fast32_t u;
>> + do {
>> + if (unlikely(in_at >= in_end))
>> + return NULL;
>> + *size += (u = *(const uint8_t*)in_at);
>> + ++in_at;
>> + } while (BYTE_MAX == u);
>> + return in_at;
>> +}
>> +
>> +static int end_of_block(
>> + const uint_fast32_t nr_bytes_max,
>> + const uint_fast32_t r_bytes_max,
>> + const uint8_t *const in_at,
>> + const uint8_t *const in_end,
>> + const uint8_t *const out,
>> + const uint8_t *const out_at)
>> +{
>> + if (!nr_bytes_max)
>> + return LZ4K_STATUS_FAILED; /* should be the last one in
>> block */
>> + if (r_bytes_max != REPEAT_MIN)
>> + return LZ4K_STATUS_FAILED; /* should be the last one in
>> block */
>> + if (in_at != in_end)
>> + return LZ4K_STATUS_FAILED; /* should be the last one in
>> block */
>> + return (int)(out_at - out);
>> +}
>> +
>> +enum {
>> + NR_COPY_MIN = 16,
>> + R_COPY_MIN = 16,
>> + R_COPY_SAFE = R_COPY_MIN - 1,
>> + R_COPY_SAFE_2X = (R_COPY_MIN << 1) - 1
>> +};
>> +
>> +static bool out_non_repeat(
>> + const uint8_t **in_at,
>> + uint8_t **out_at,
>> + uint_fast32_t nr_bytes_max,
>> + const uint8_t *const in_end,
>> + const uint8_t *const out_end)
>> +{
>> + const uint8_t *const in_copy_end = *in_at + nr_bytes_max;
>> + uint8_t *const out_copy_end = *out_at + nr_bytes_max;
>> + if (likely(nr_bytes_max <= NR_COPY_MIN)) {
>> + if (likely(*in_at <= in_end - NR_COPY_MIN &&
>> + *out_at <= out_end - NR_COPY_MIN))
>> + m_copy(*out_at, *in_at, NR_COPY_MIN);
>> + else if (in_copy_end <= in_end && out_copy_end <= out_end)
>> + m_copy(*out_at, *in_at, nr_bytes_max);
>> + else
>> + return false;
>> + } else {
>> + if (likely(in_copy_end <= in_end - NR_COPY_MIN &&
>> + out_copy_end <= out_end - NR_COPY_MIN)) {
>> + m_copy(*out_at, *in_at, NR_COPY_MIN);
>> + copy_x_while_lt(*out_at + NR_COPY_MIN,
>> + *in_at + NR_COPY_MIN,
>> + out_copy_end, NR_COPY_MIN);
>> + } else if (in_copy_end <= in_end && out_copy_end <= out_end) {
>> + m_copy(*out_at, *in_at, nr_bytes_max);
>> + } else { /* in_copy_end > in_end || out_copy_end > out_end */
>> + return false;
>> + }
>> + }
>> + *in_at = in_copy_end;
>> + *out_at = out_copy_end;
>> + return true;
>> +}
>> +
>> +static void out_repeat_overlap(
>> + uint_fast32_t offset,
>> + uint8_t *out_at,
>> + const uint8_t *out_from,
>> + const uint8_t *const out_copy_end)
>> +{ /* (1 < offset < R_COPY_MIN/2) && out_copy_end + R_COPY_SAFE_2X
>> <= out_end */
>> + enum {
>> + COPY_MIN = R_COPY_MIN >> 1,
>> + OFFSET_LIMIT = COPY_MIN >> 1
>> + };
>> + m_copy(out_at, out_from, COPY_MIN);
>> + out_at += offset;
>> + if (offset <= OFFSET_LIMIT)
>> + offset <<= 1;
>> + do {
>> + m_copy(out_at, out_from, COPY_MIN);
>> + out_at += offset;
>> + if (offset <= OFFSET_LIMIT)
>> + offset <<= 1;
>> + } while (out_at - out_from < R_COPY_MIN);
>> + while_lt_copy_2x_as_x2(out_at, out_from, out_copy_end, R_COPY_MIN);
>> +}
>> +
>> +static bool out_repeat_slow(
>> + uint_fast32_t r_bytes_max,
>> + uint_fast32_t offset,
>> + uint8_t *out_at,
>> + const uint8_t *out_from,
>> + const uint8_t *const out_copy_end,
>> + const uint8_t *const out_end)
>> +{
>> + if (offset > 1 && out_copy_end <= out_end - R_COPY_SAFE_2X) {
>> + out_repeat_overlap(offset, out_at, out_from, out_copy_end);
>> + } else {
>> + if (unlikely(out_copy_end > out_end))
>> + return false;
>> + if (offset == 1) {
>> + m_set(out_at, *out_from, r_bytes_max);
>> + } else {
>> + do
>> + *out_at++ = *out_from++;
>> + while (out_at < out_copy_end);
>> + }
>> + }
>> + return true;
>> +}
>> +
>> +static int decode(
>> + const uint8_t *const in,
>> + const uint8_t *const out0,
>> + uint8_t *const out,
>> + const uint8_t *const in_end,
>> + const uint8_t *const out_end,
>> + const uint_fast32_t nr_log2,
>> + const uint_fast32_t off_log2)
>> +{
>> + const uint_fast32_t r_log2 = TAG_BITS_MAX - (off_log2 + nr_log2);
>> + const uint8_t *in_at = in;
>> + const uint8_t *const in_end_minus_x = in_end - TAG_BYTES_MAX;
>> + uint8_t *out_at = out;
>> + while (likely(in_at <= in_end_minus_x)) {
>> + const uint_fast32_t utag = read4_at(in_at - 1) >> BYTE_BITS;
>> + const uint_fast32_t offset = utag & mask(off_log2);
>> + uint_fast32_t nr_bytes_max = utag >> (off_log2 + r_log2),
>> + r_bytes_max = ((utag >> off_log2) & mask(r_log2)) +
>> + REPEAT_MIN;
>> + const uint8_t *out_from = 0;
>> + uint8_t *out_copy_end = 0;
>> + in_at += TAG_BYTES_MAX;
>> + if (unlikely(nr_bytes_max == mask(nr_log2))) {
>> + in_at = get_size(&nr_bytes_max, in_at, in_end);
>> + if (in_at == NULL)
>> + return LZ4K_STATUS_READ_ERROR;
>> + }
>> + if (!out_non_repeat(&in_at, &out_at, nr_bytes_max, in_end, out_end))
>> + return LZ4K_STATUS_FAILED;
>> + if (unlikely(r_bytes_max == mask(r_log2) + REPEAT_MIN)) {
>> + in_at = get_size(&r_bytes_max, in_at, in_end);
>> + if (in_at == NULL)
>> + return LZ4K_STATUS_READ_ERROR;
>> + }
>> + out_from = out_at - offset;
>> + if (unlikely(out_from < out0))
>> + return LZ4K_STATUS_FAILED;
>> + out_copy_end = out_at + r_bytes_max;
>> + if (likely(offset >= R_COPY_MIN &&
>> + out_copy_end <= out_end - R_COPY_SAFE_2X)) {
>> + copy_2x_as_x2_while_lt(out_at, out_from, out_copy_end,
>> + R_COPY_MIN);
>> + } else if (likely(offset >= (R_COPY_MIN >> 1) &&
>> + out_copy_end <= out_end - R_COPY_SAFE_2X)) {
>> + m_copy(out_at, out_from, R_COPY_MIN);
>> + out_at += offset;
>> + while_lt_copy_x(out_at, out_from, out_copy_end, R_COPY_MIN);
>> + /* faster than 2x */
>> + } else if (likely(offset > 0)) {
>> + if (!out_repeat_slow(r_bytes_max, offset, out_at, out_from,
>> + out_copy_end, out_end))
>> + return LZ4K_STATUS_FAILED;
>> + } else { /* offset == 0: EOB, last literal */
>> + return end_of_block(nr_bytes_max, r_bytes_max, in_at,
>> + in_end, out, out_at);
>> + }
>> + out_at = out_copy_end;
>> + }
>> + return in_at == in_end ? (int)(out_at - out) : LZ4K_STATUS_FAILED;
>> +}
>> +
>> +static int decode4kb(
>> + const uint8_t *const in,
>> + const uint8_t *const out0,
>> + uint8_t *const out,
>> + const uint8_t *const in_end,
>> + const uint8_t *const out_end)
>> +{
>> + enum {
>> + NR_LOG2 = 6
>> + };
>> + return decode(in, out0, out, in_end, out_end, NR_LOG2, BLOCK_4KB_LOG2);
>> +}
>> +
>> +static int decode8kb(
>> + const uint8_t *const in,
>> + const uint8_t *const out0,
>> + uint8_t *const out,
>> + const uint8_t *const in_end,
>> + const uint8_t *const out_end)
>> +{
>> + enum {
>> + NR_LOG2 = 5
>> + };
>> + return decode(in, out0, out, in_end, out_end, NR_LOG2, BLOCK_8KB_LOG2);
>> +}
>> +
>> +static int decode16kb(
>> + const uint8_t *const in,
>> + const uint8_t *const out0,
>> + uint8_t *const out,
>> + const uint8_t *const in_end,
>> + const uint8_t *const out_end)
>> +{
>> + enum {
>> + NR_LOG2 = 5
>> + };
>> + return decode(in, out0, out, in_end, out_end, NR_LOG2, BLOCK_16KB_LOG2);
>> +}
>> +
>> +static int decode32kb(
>> + const uint8_t *const in,
>> + const uint8_t *const out0,
>> + uint8_t *const out,
>> + const uint8_t *const in_end,
>> + const uint8_t *const out_end)
>> +{
>> + enum {
>> + NR_LOG2 = 4
>> + };
>> + return decode(in, out0, out, in_end, out_end, NR_LOG2, BLOCK_32KB_LOG2);
>> +}
>> +
>> +static int decode64kb(
>> + const uint8_t *const in,
>> + const uint8_t *const out0,
>> + uint8_t *const out,
>> + const uint8_t *const in_end,
>> + const uint8_t *const out_end)
>> +{
>> + enum {
>> + NR_LOG2 = 4
>> + };
>> + return decode(in, out0, out, in_end, out_end, NR_LOG2, BLOCK_64KB_LOG2);
>> +}
>> +
>> +static inline const void *u8_inc(const uint8_t *a)
>> +{
>> + return a+1;
>> +}
>> +
>> +int lz4k_decode(
>> + const void *in,
>> + void *const out,
>> + unsigned in_max,
>> + unsigned out_max)
>> +{
>> + /* ++use volatile pointers to prevent compiler optimizations */
>> + const uint8_t *volatile in_end = (const uint8_t*)in + in_max;
>> + const uint8_t *volatile out_end = (uint8_t*)out + out_max;
>> + uint8_t in_log2 = 0;
>> + if (unlikely(in == NULL || out == NULL || in_max <= 4 || out_max <= 0))
>> + return LZ4K_STATUS_FAILED;
>> + in_log2 = (uint8_t)(BLOCK_4KB_LOG2 + *(const uint8_t*)in);
>> + /* invalid buffer size or pointer overflow */
>> + if (unlikely((const uint8_t*)in >= in_end || (uint8_t*)out >= out_end))
>> + return LZ4K_STATUS_FAILED;
>> + /* -- */
>> + in = u8_inc((const uint8_t*)in);
>> + --in_max;
>> + if (in_log2 < BLOCK_8KB_LOG2)
>> + return decode4kb((const uint8_t*)in, (uint8_t*)out,
>> + (uint8_t*)out, in_end, out_end);
>> + if (in_log2 == BLOCK_8KB_LOG2)
>> + return decode8kb((const uint8_t*)in, (uint8_t*)out,
>> + (uint8_t*)out, in_end, out_end);
>> + if (in_log2 == BLOCK_16KB_LOG2)
>> + return decode16kb((const uint8_t*)in, (uint8_t*)out,
>> + (uint8_t*)out, in_end, out_end);
>> + if (in_log2 == BLOCK_32KB_LOG2)
>> + return decode32kb((const uint8_t*)in, (uint8_t*)out,
>> + (uint8_t*)out, in_end, out_end);
>> + if (in_log2 == BLOCK_64KB_LOG2)
>> + return decode64kb((const uint8_t*)in, (uint8_t*)out,
>> + (uint8_t*)out, in_end, out_end);
>> + return LZ4K_STATUS_FAILED;
>> +}
>> +EXPORT_SYMBOL(lz4k_decode);
>> +
>> +MODULE_LICENSE("Dual BSD/GPL");
>> +MODULE_DESCRIPTION("LZ4K decoder");
>> diff --git a/lib/lz4k/lz4k_encode.c b/lib/lz4k/lz4k_encode.c
>> new file mode 100644
>> index 000000000000..a425d3a0b827
>> --- /dev/null
>> +++ b/lib/lz4k/lz4k_encode.c
>> @@ -0,0 +1,539 @@
>> +/*
>> + * Copyright (c) Huawei Technologies Co., Ltd. 2012-2020. All rights reserved.
>> + * Description: LZ4K compression algorithm
>> + * Author: Aleksei Romanovskii aleksei.romanovskii(a)huawei.com
>> + * Created: 2020-03-25
>> + */
>> +
>> +#if !defined(__KERNEL__)
>> +#include "lz4k.h"
>> +#else
>> +#include <linux/lz4k.h>
>> +#include <linux/module.h>
>> +#endif
>> +
>> +#include "lz4k_private.h"
>> +#include "lz4k_encode_private.h"
>> +
>> +static uint8_t *out_size_bytes(uint8_t *out_at, uint_fast32_t u)
>> +{
>> + for (; unlikely(u >= BYTE_MAX); u -= BYTE_MAX)
>> + *out_at++ = (uint8_t)BYTE_MAX;
>> + *out_at++ = (uint8_t)u;
>> + return out_at;
>> +}
>> +
>> +static inline uint8_t *out_utag_then_bytes_left(
>> + uint8_t *out_at,
>> + uint_fast32_t utag,
>> + uint_fast32_t bytes_left)
>> +{
>> + m_copy(out_at, &utag, TAG_BYTES_MAX);
>> + return out_size_bytes(out_at + TAG_BYTES_MAX, bytes_left);
>> +}
>> +
>> +static int out_tail(
>> + uint8_t *out_at,
>> + uint8_t *const out_end,
>> + const uint8_t *const out,
>> + const uint8_t *const nr0,
>> + const uint8_t *const in_end,
>> + const uint_fast32_t nr_log2,
>> + const uint_fast32_t off_log2,
>> + bool check_out)
>> +{
>> + const uint_fast32_t nr_mask = mask(nr_log2);
>> + const uint_fast32_t r_log2 = TAG_BITS_MAX - (off_log2 + nr_log2);
>> + const uint_fast32_t nr_bytes_max = u_32(in_end - nr0);
>> + if (encoded_bytes_min(nr_log2, nr_bytes_max) > u_32(out_end - out_at))
>> + return check_out ? LZ4K_STATUS_WRITE_ERROR :
>> + LZ4K_STATUS_INCOMPRESSIBLE;
>> + if (nr_bytes_max < nr_mask) {
>> + /* caller guarantees at least one nr-byte */
>> + uint_fast32_t utag = (nr_bytes_max << (off_log2 + r_log2));
>> + m_copy(out_at, &utag, TAG_BYTES_MAX);
>> + out_at += TAG_BYTES_MAX;
>> + } else {
>> + uint_fast32_t bytes_left = nr_bytes_max - nr_mask;
>> + uint_fast32_t utag = (nr_mask << (off_log2 + r_log2));
>> + out_at = out_utag_then_bytes_left(out_at, utag, bytes_left);
>> + }
>> + m_copy(out_at, nr0, nr_bytes_max);
>> + return (int)(out_at + nr_bytes_max - out);
>> +}
>> +
>> +int lz4k_out_tail(
>> + uint8_t *out_at,
>> + uint8_t *const out_end,
>> + const uint8_t *const out,
>> + const uint8_t *const nr0,
>> + const uint8_t *const in_end,
>> + const uint_fast32_t nr_log2,
>> + const uint_fast32_t off_log2,
>> + bool check_out)
>> +{
>> + return out_tail(out_at, out_end, out, nr0, in_end,
>> + nr_log2, off_log2, check_out);
>> +}
>> +
>> +static uint8_t *out_non_repeat(
>> + uint8_t *out_at,
>> + uint8_t *const out_end,
>> + uint_fast32_t utag,
>> + const uint8_t *const nr0,
>> + const uint8_t *const r,
>> + const uint_fast32_t nr_log2,
>> + const uint_fast32_t off_log2,
>> + bool check_out)
>> +{
>> + const uint_fast32_t nr_bytes_max = u_32(r - nr0);
>> + const uint_fast32_t nr_mask = mask(nr_log2),
>> + r_log2 = TAG_BITS_MAX - (off_log2 + nr_log2);
>> + if (likely(nr_bytes_max < nr_mask)) {
>> + if (unlikely(check_out &&
>> + TAG_BYTES_MAX + nr_bytes_max > u_32(out_end - out_at)))
>> + return NULL;
>> + utag |= (nr_bytes_max << (off_log2 + r_log2));
>> + m_copy(out_at, &utag, TAG_BYTES_MAX);
>> + out_at += TAG_BYTES_MAX;
>> + } else {
>> + uint_fast32_t bytes_left = nr_bytes_max - nr_mask;
>> + if (unlikely(check_out &&
>> + TAG_BYTES_MAX + size_bytes_count(bytes_left) + nr_bytes_max >
>> + u_32(out_end - out_at)))
>> + return NULL;
>> + utag |= (nr_mask << (off_log2 + r_log2));
>> + out_at = out_utag_then_bytes_left(out_at, utag, bytes_left);
>> + }
>> + if (unlikely(check_out))
>> + m_copy(out_at, nr0, nr_bytes_max);
>> + else
>> + copy_x_while_total(out_at, nr0, nr_bytes_max, NR_COPY_MIN);
>> + out_at += nr_bytes_max;
>> + return out_at;
>> +}
>> +
>> +uint8_t *lz4k_out_non_repeat(
>> + uint8_t *out_at,
>> + uint8_t *const out_end,
>> + uint_fast32_t utag,
>> + const uint8_t *const nr0,
>> + const uint8_t *const r,
>> + const uint_fast32_t nr_log2,
>> + const uint_fast32_t off_log2,
>> + bool check_out)
>> +{
>> + return out_non_repeat(out_at, out_end, utag, nr0, r,
>> + nr_log2, off_log2, check_out);
>> +}
>> +
>> +static uint8_t *out_r_bytes_left(
>> + uint8_t *out_at,
>> + uint8_t *const out_end,
>> + uint_fast32_t r_bytes_max,
>> + const uint_fast32_t nr_log2,
>> + const uint_fast32_t off_log2,
>> + const bool check_out) /* =false when
>> +out_max>=encoded_bytes_max(in_max), =true otherwise */
>> +{
>> + const uint_fast32_t r_mask = mask(TAG_BITS_MAX - (off_log2 + nr_log2));
>> + if (unlikely(r_bytes_max - REPEAT_MIN >= r_mask)) {
>> + uint_fast32_t bytes_left = r_bytes_max - REPEAT_MIN - r_mask;
>> + if (unlikely(check_out &&
>> + size_bytes_count(bytes_left) > u_32(out_end - out_at)))
>> + return NULL;
>> + out_at = out_size_bytes(out_at, bytes_left);
>> + }
>> + return out_at; /* SUCCESS: continue compression */
>> +}
>> +
>> +uint8_t *lz4k_out_r_bytes_left(
>> + uint8_t *out_at,
>> + uint8_t *const out_end,
>> + uint_fast32_t r_bytes_max,
>> + const uint_fast32_t nr_log2,
>> + const uint_fast32_t off_log2,
>> + const bool check_out)
>> +{
>> + return out_r_bytes_left(out_at, out_end, r_bytes_max,
>> + nr_log2, off_log2, check_out);
>> +}
>> +
>> +static uint8_t *out_repeat(
>> + uint8_t *out_at,
>> + uint8_t *const out_end,
>> + uint_fast32_t utag,
>> + uint_fast32_t r_bytes_max,
>> + const uint_fast32_t nr_log2,
>> + const uint_fast32_t off_log2,
>> + const bool check_out) /* =false when
>> +out_max>=encoded_bytes_max(in_max), =true otherwise */
>> +{
>> + const uint_fast32_t r_mask = mask(TAG_BITS_MAX - (off_log2 + nr_log2));
>> + if (likely(r_bytes_max - REPEAT_MIN < r_mask)) {
>> + if (unlikely(check_out && TAG_BYTES_MAX > u_32(out_end - out_at)))
>> + return NULL;
>> + utag |= ((r_bytes_max - REPEAT_MIN) << off_log2);
>> + m_copy(out_at, &utag, TAG_BYTES_MAX);
>> + out_at += TAG_BYTES_MAX;
>> + } else {
>> + uint_fast32_t bytes_left = r_bytes_max - REPEAT_MIN - r_mask;
>> + if (unlikely(check_out &&
>> + TAG_BYTES_MAX + size_bytes_count(bytes_left) >
>> + u_32(out_end - out_at)))
>> + return NULL;
>> + utag |= (r_mask << off_log2);
>> + out_at = out_utag_then_bytes_left(out_at, utag, bytes_left);
>> + }
>> + return out_at; /* SUCCESS: continue compression */
>> +}
>> +
>> +uint8_t *lz4k_out_repeat(
>> + uint8_t *out_at,
>> + uint8_t *const out_end,
>> + uint_fast32_t utag,
>> + uint_fast32_t r_bytes_max,
>> + const uint_fast32_t nr_log2,
>> + const uint_fast32_t off_log2,
>> + const bool check_out)
>> +{
>> + return out_repeat(out_at, out_end, utag, r_bytes_max,
>> + nr_log2, off_log2, check_out);
>> +}
>> +
>> +static const uint8_t *repeat_end(
>> + const uint8_t *q,
>> + const uint8_t *r,
>> + const uint8_t *const in_end_safe,
>> + const uint8_t *const in_end)
>> +{
>> + q += REPEAT_MIN;
>> + r += REPEAT_MIN;
>> + /* caller guarantees r+12<=in_end */
>> + do {
>> + const uint64_t x = read8_at(q) ^ read8_at(r);
>> + if (x) {
>> + const uint16_t ctz = (uint16_t)__builtin_ctzl(x);
>> + return r + (ctz >> BYTE_BITS_LOG2);
>> + }
>> + /* some bytes differ: trailing zero bits give the count of matching bytes */
>> + q += sizeof(uint64_t), r += sizeof(uint64_t);
>> + } while (likely(r <= in_end_safe)); /* once, at input block end */
>> + do {
>> + if (*q != *r) return r;
>> + ++q;
>> + ++r;
>> + } while (r < in_end);
>> + return r;
>> +}
>> +
>> +const uint8_t *lz4k_repeat_end(
>> + const uint8_t *q,
>> + const uint8_t *r,
>> + const uint8_t *const in_end_safe,
>> + const uint8_t *const in_end)
>> +{
>> + return repeat_end(q, r, in_end_safe, in_end);
>> +}
>> +
>> +enum {
>> + HT_BYTES_LOG2 = HT_LOG2 + 1
>> +};
>> +
>> +inline unsigned encode_state_bytes_min(void)
>> +{
>> + unsigned bytes_total = (1U << HT_BYTES_LOG2);
>> + return bytes_total;
>> +}
>> +
>> +#if !defined(LZ4K_DELTA) && !defined(LZ4K_MAX_CR)
>> +
>> +unsigned lz4k_encode_state_bytes_min(void)
>> +{
>> + return encode_state_bytes_min();
>> +}
>> +EXPORT_SYMBOL(lz4k_encode_state_bytes_min);
>> +
>> +#endif /* !defined(LZ4K_DELTA) && !defined(LZ4K_MAX_CR) */
>> +
>> +/* CR increase order: +STEP, have OFFSETS, use _5b */
>> +/* *_6b to compete with LZ4 */
>> +static inline uint_fast32_t hash0_v(const uint64_t r, uint32_t shift)
>> +{
>> + return hash64v_6b(r, shift);
>> +}
>> +
>> +static inline uint_fast32_t hash0(const uint8_t *r, uint32_t shift)
>> +{
>> + return hash64_6b(r, shift);
>> +}
>> +
>> +/*
>> + * Proof that 'r' increments are safe, i.e. no pointer overflows are possible:
>> + *
>> + * While using STEP_LOG2=5, step_start=1<<STEP_LOG2 == 32, we increment r
>> + * 32 times by 1, 32 times by 2, 32 times by 3, and so on:
>> + * 32*1+32*2+32*3+...+32*31 == 32*SUM(1..31) == 32*((1+31)*15+16).
>> + * So, we can safely increment r by at most 31 for input block size <=
>> + * 1<<13 < 15872.
>> + *
>> + * More precisely, STEP_LIMIT == x for any input block calculated as follows:
>> + * 1<<off_log2 >= (1<<STEP_LOG2)*((x+1)(x-1)/2+x/2) ==>
>> + * 1<<(off_log2-STEP_LOG2+1) >= x^2+x-1 ==>
>> + * x^2+x-1-1<<(off_log2-STEP_LOG2+1) == 0, which is solved by standard
>> + * method.
>> + * To avoid overhead here, a conservative approximate value of x is calculated
>> + * as average of two nearest square roots, see STEP_LIMIT above.
>> + */
>> +
>> +enum {
>> + STEP_LOG2 = 5 /* increase for better CR */
>> +};
>> +
>> +static int encode_any(
>> + uint16_t *const ht0,
>> + const uint8_t *const in0,
>> + uint8_t *const out,
>> + const uint8_t *const in_end,
>> + uint8_t *const out_end, /* ==out_limit for !check_out */
>> + const uint_fast32_t nr_log2,
>> + const uint_fast32_t off_log2,
>> + const bool check_out)
>> +{ /* caller guarantees off_log2 <=16 */
>> + uint8_t *out_at = out;
>> + const uint8_t *const in_end_safe = in_end - NR_COPY_MIN;
>> + const uint8_t *r = in0;
>> + const uint8_t *nr0 = r++;
>> + uint_fast32_t step = 1 << STEP_LOG2;
>> + for (;;) {
>> + uint_fast32_t utag = 0;
>> + const uint8_t *r_end = 0;
>> + uint_fast32_t r_bytes_max = 0;
>> + const uint8_t *const q = hashed(in0, ht0, hash0(r, HT_LOG2), r);
>> + if (!equal4(q, r)) {
>> + r += (++step >> STEP_LOG2);
>> + if (unlikely(r > in_end_safe))
>> + return out_tail(out_at, out_end, out, nr0, in_end,
>> + nr_log2, off_log2, check_out);
>> + continue;
>> + }
>> + utag = u_32(r - q);
>> + r_end = repeat_end(q, r, in_end_safe, in_end);
>> + r = repeat_start(q, r, nr0, in0);
>> + r_bytes_max = u_32(r_end - r);
>> + if (nr0 == r) {
>> + out_at = out_repeat(out_at, out_end, utag, r_bytes_max,
>> + nr_log2, off_log2, check_out);
>> + } else {
>> + update_utag(r_bytes_max, &utag, nr_log2, off_log2);
>> + out_at = out_non_repeat(out_at, out_end, utag, nr0, r,
>> + nr_log2, off_log2, check_out);
>> + if (unlikely(check_out && out_at == NULL))
>> + return LZ4K_STATUS_WRITE_ERROR;
>> + out_at = out_r_bytes_left(out_at, out_end, r_bytes_max,
>> + nr_log2, off_log2, check_out);
>> + }
>> + if (unlikely(check_out && out_at == NULL))
>> + return LZ4K_STATUS_WRITE_ERROR;
>> + nr0 = (r += r_bytes_max);
>> + if (unlikely(r > in_end_safe))
>> + return r == in_end ? (int)(out_at - out) :
>> + out_tail(out_at, out_end, out, r, in_end,
>> + nr_log2, off_log2, check_out);
>> + ht0[hash0(r - 1 - 1, HT_LOG2)] = (uint16_t)(r - 1 - 1 - in0);
>> + step = 1 << STEP_LOG2;
>> + }
>> +}
>> +
>> +static int encode_fast(
>> + uint16_t *const ht,
>> + const uint8_t *const in,
>> + uint8_t *const out,
>> + const uint8_t *const in_end,
>> + uint8_t *const out_end, /* ==out_limit for !check_out */
>> + const uint_fast32_t nr_log2,
>> + const uint_fast32_t off_log2)
>> +{ /* caller guarantees off_log2 <=16 */
>> + return encode_any(ht, in, out, in_end, out_end, nr_log2, off_log2,
>> + false); /* !check_out */
>> +}
>> +
>> +static int encode_slow(
>> + uint16_t *const ht,
>> + const uint8_t *const in,
>> + uint8_t *const out,
>> + const uint8_t *const in_end,
>> + uint8_t *const out_end, /* ==out_limit for !check_out */
>> + const uint_fast32_t nr_log2,
>> + const uint_fast32_t off_log2)
>> +{ /* caller guarantees off_log2 <=16 */
>> + return encode_any(ht, in, out, in_end, out_end, nr_log2, off_log2,
>> + true); /* check_out */
>> +}
>> +
>> +static int encode4kb(
>> + uint16_t *const state,
>> + const uint8_t *const in,
>> + uint8_t *out,
>> + const uint_fast32_t in_max,
>> + const uint_fast32_t out_max,
>> + const uint_fast32_t out_limit)
>> +{
>> + enum {
>> + NR_LOG2 = 6
>> + };
>> + const int result = (encoded_bytes_max(NR_LOG2, in_max) > out_max) ?
>> + encode_slow(state, in, out, in + in_max, out + out_max,
>> + NR_LOG2, BLOCK_4KB_LOG2) :
>> + encode_fast(state, in, out, in + in_max, out + out_limit,
>> + NR_LOG2, BLOCK_4KB_LOG2);
>> + return result <= 0 ? result : result + 1; /* +1 for in_log2 */
>> +}
>> +
>> +static int encode8kb(
>> + uint16_t *const state,
>> + const uint8_t *const in,
>> + uint8_t *out,
>> + const uint_fast32_t in_max,
>> + const uint_fast32_t out_max,
>> + const uint_fast32_t out_limit)
>> +{
>> + enum {
>> + NR_LOG2 = 5
>> + };
>> + const int result = (encoded_bytes_max(NR_LOG2, in_max) > out_max) ?
>> + encode_slow(state, in, out, in + in_max, out + out_max,
>> + NR_LOG2, BLOCK_8KB_LOG2) :
>> + encode_fast(state, in, out, in + in_max, out + out_limit,
>> + NR_LOG2, BLOCK_8KB_LOG2);
>> + return result <= 0 ? result : result + 1; /* +1 for in_log2 */
>> +}
>> +
>> +static int encode16kb(
>> + uint16_t *const state,
>> + const uint8_t *const in,
>> + uint8_t *out,
>> + const uint_fast32_t in_max,
>> + const uint_fast32_t out_max,
>> + const uint_fast32_t out_limit)
>> +{
>> + enum {
>> + NR_LOG2 = 5
>> + };
>> + const int result = (encoded_bytes_max(NR_LOG2, in_max) > out_max) ?
>> + encode_slow(state, in, out, in + in_max, out + out_max,
>> + NR_LOG2, BLOCK_16KB_LOG2) :
>> + encode_fast(state, in, out, in + in_max, out + out_limit,
>> + NR_LOG2, BLOCK_16KB_LOG2);
>> + return result <= 0 ? result : result + 1; /* +1 for in_log2 */
>> +}
>> +
>> +static int encode32kb(
>> + uint16_t *const state,
>> + const uint8_t *const in,
>> + uint8_t *out,
>> + const uint_fast32_t in_max,
>> + const uint_fast32_t out_max,
>> + const uint_fast32_t out_limit)
>> +{
>> + enum {
>> + NR_LOG2 = 4
>> + };
>> + const int result = (encoded_bytes_max(NR_LOG2, in_max) > out_max) ?
>> + encode_slow(state, in, out, in + in_max, out + out_max,
>> + NR_LOG2, BLOCK_32KB_LOG2) :
>> + encode_fast(state, in, out, in + in_max, out + out_limit,
>> + NR_LOG2, BLOCK_32KB_LOG2);
>> + return result <= 0 ? result : result + 1; /* +1 for in_log2 */
>> +}
>> +
>> +static int encode64kb(
>> + uint16_t *const state,
>> + const uint8_t *const in,
>> + uint8_t *out,
>> + const uint_fast32_t in_max,
>> + const uint_fast32_t out_max,
>> + const uint_fast32_t out_limit)
>> +{
>> + enum {
>> + NR_LOG2 = 4
>> + };
>> + const int result = (encoded_bytes_max(NR_LOG2, in_max) > out_max) ?
>> + encode_slow(state, in, out, in + in_max, out + out_max,
>> + NR_LOG2, BLOCK_64KB_LOG2) :
>> + encode_fast(state, in, out, in + in_max, out + out_limit,
>> + NR_LOG2, BLOCK_64KB_LOG2);
>> + return result <= 0 ? result : result + 1; /* +1 for in_log2 */
>> +}
>> +
>> +static int encode(
>> + uint16_t *const state,
>> + const uint8_t *const in,
>> + uint8_t *out,
>> + uint_fast32_t in_max,
>> + uint_fast32_t out_max,
>> + uint_fast32_t out_limit)
>> +{
>> + const uint8_t in_log2 = (uint8_t)(most_significant_bit_of(
>> + round_up_to_power_of2(in_max - REPEAT_MIN)));
>> + m_set(state, 0, encode_state_bytes_min());
>> + *out = in_log2 > BLOCK_4KB_LOG2 ? (uint8_t)(in_log2 -
>> BLOCK_4KB_LOG2) : 0;
>> + ++out;
>> + --out_max;
>> + --out_limit;
>> + if (in_log2 < BLOCK_8KB_LOG2)
>> + return encode4kb(state, in, out, in_max, out_max, out_limit);
>> + if (in_log2 == BLOCK_8KB_LOG2)
>> + return encode8kb(state, in, out, in_max, out_max, out_limit);
>> + if (in_log2 == BLOCK_16KB_LOG2)
>> + return encode16kb(state, in, out, in_max, out_max, out_limit);
>> + if (in_log2 == BLOCK_32KB_LOG2)
>> + return encode32kb(state, in, out, in_max, out_max, out_limit);
>> + if (in_log2 == BLOCK_64KB_LOG2)
>> + return encode64kb(state, in, out, in_max, out_max, out_limit);
>> + return LZ4K_STATUS_FAILED;
>> +}
>> +
>> +int lz4k_encode(
>> + void *const state,
>> + const void *const in,
>> + void *out,
>> + unsigned in_max,
>> + unsigned out_max,
>> + unsigned out_limit)
>> +{
>> + const unsigned gain_max = 64 > (in_max >> 6) ? 64 : (in_max >> 6);
>> + const unsigned out_limit_min = in_max < out_max ? in_max : out_max;
>> + const uint8_t *volatile in_end = (const uint8_t*)in + in_max;
>> + const uint8_t *volatile out_end = (uint8_t*)out + out_max;
>> + const void *volatile state_end =
>> + (uint8_t*)state + encode_state_bytes_min();
>> + if (unlikely(state == NULL))
>> + return LZ4K_STATUS_FAILED;
>> + if (unlikely(in == NULL || out == NULL))
>> + return LZ4K_STATUS_FAILED;
>> + if (unlikely(in_max <= gain_max))
>> + return LZ4K_STATUS_INCOMPRESSIBLE;
>> + if (unlikely(out_max <= gain_max)) /* need 1 byte for in_log2 */
>> + return LZ4K_STATUS_FAILED;
>> + /* ++use volatile pointers to prevent compiler optimizations */
>> + if (unlikely((const uint8_t*)in >= in_end || (uint8_t*)out >= out_end))
>> + return LZ4K_STATUS_FAILED;
>> + if (unlikely(state >= state_end))
>> + return LZ4K_STATUS_FAILED; /* pointer overflow */
>> + if (!out_limit || out_limit >= out_limit_min)
>> + out_limit = out_limit_min - gain_max;
>> + return encode((uint16_t*)state, (const uint8_t*)in, (uint8_t*)out,
>> + in_max, out_max, out_limit);
>> +}
>> +EXPORT_SYMBOL(lz4k_encode);
>> +
>> +const char *lz4k_version(void)
>> +{
>> + static const char *version = "2020.07.07";
>> + return version;
>> +}
>> +EXPORT_SYMBOL(lz4k_version);
>> +
>> +MODULE_LICENSE("Dual BSD/GPL");
>> +MODULE_DESCRIPTION("LZ4K encoder");
>> diff --git a/lib/lz4k/lz4k_encode_private.h b/lib/lz4k/lz4k_encode_private.h
>> new file mode 100644
>> index 000000000000..eb5cd162468f
>> --- /dev/null
>> +++ b/lib/lz4k/lz4k_encode_private.h
>> @@ -0,0 +1,137 @@
>> +/*
>> + * Copyright (c) Huawei Technologies Co., Ltd. 2012-2020. All rights reserved.
>> + * Description: LZ4K compression algorithm
>> + * Author: Aleksei Romanovskii aleksei.romanovskii(a)huawei.com
>> + * Created: 2020-03-25
>> + */
>> +
>> +#ifndef _LZ4K_ENCODE_PRIVATE_H
>> +#define _LZ4K_ENCODE_PRIVATE_H
>> +
>> +#include "lz4k_private.h"
>> +
>> +/* <nrSize bytes for whole block>+<1 terminating 0 byte> */
>> +static inline uint_fast32_t size_bytes_count(uint_fast32_t u)
>> +{
>> + return (u + BYTE_MAX - 1) / BYTE_MAX;
>> +}
>> +
>> +/* minimum encoded size for non-compressible data */
>> +static inline uint_fast32_t encoded_bytes_min(
>> + uint_fast32_t nr_log2,
>> + uint_fast32_t in_max)
>> +{
>> + return in_max < mask(nr_log2) ?
>> + TAG_BYTES_MAX + in_max :
>> + TAG_BYTES_MAX + size_bytes_count(in_max - mask(nr_log2)) + in_max;
>> +}
>> +
>> +enum {
>> + NR_COPY_LOG2 = 4,
>> + NR_COPY_MIN = 1 << NR_COPY_LOG2
>> +};
>> +
>> +static inline uint_fast32_t u_32(int64_t i)
>> +{
>> + return (uint_fast32_t)i;
>> +}
>> +
>> +/* maximum encoded size for non-compressible data if the "fast" encoder is used */
>> +static inline uint_fast32_t encoded_bytes_max(
>> + uint_fast32_t nr_log2,
>> + uint_fast32_t in_max)
>> +{
>> + uint_fast32_t r = TAG_BYTES_MAX + (uint32_t)round_up_to_log2(in_max, NR_COPY_LOG2);
>> + return in_max < mask(nr_log2) ? r : r + size_bytes_count(in_max - mask(nr_log2));
>> +}
>> +
>> +enum {
>> + HT_LOG2 = 12
>> +};
>> +
>> +/*
>> + * Compressed data format (where {} means 0 or more occurrences, [] means
>> + * optional):
>> + * <24bits tag: (off_log2 rOffset| r_log2 rSize|nr_log2 nrSize)>
>> + * {<nrSize byte>}[<nr bytes>]{<rSize byte>}
>> + * <rSize byte> and <nrSize byte> bytes are terminated by byte != 255
>> + *
>> + */
>> +
>> +static inline void update_utag(
>> + uint_fast32_t r_bytes_max,
>> + uint_fast32_t *utag,
>> + const uint_fast32_t nr_log2,
>> + const uint_fast32_t off_log2)
>> +{
>> + const uint_fast32_t r_mask = mask(TAG_BITS_MAX - (off_log2 + nr_log2));
>> + *utag |= likely(r_bytes_max - REPEAT_MIN < r_mask) ?
>> + ((r_bytes_max - REPEAT_MIN) << off_log2) : (r_mask << off_log2);
>> +}
>> +
>> +static inline const uint8_t *hashed(
>> + const uint8_t *const in0,
>> + uint16_t *const ht,
>> + uint_fast32_t h,
>> + const uint8_t *r)
>> +{
>> + const uint8_t *q = in0 + ht[h];
>> + ht[h] = (uint16_t)(r - in0);
>> + return q;
>> +}
>> +
>> +static inline const uint8_t *repeat_start(
>> + const uint8_t *q,
>> + const uint8_t *r,
>> + const uint8_t *const nr0,
>> + const uint8_t *const in0)
>> +{
>> + for (; r > nr0 && likely(q > in0) && unlikely(q[-1] == r[-1]); --q, --r);
>> + return r;
>> +}
>> +
>> +int lz4k_out_tail(
>> + uint8_t *out_at,
>> + uint8_t *const out_end,
>> + const uint8_t *const out,
>> + const uint8_t *const nr0,
>> + const uint8_t *const in_end,
>> + const uint_fast32_t nr_log2,
>> + const uint_fast32_t off_log2,
>> + bool check_out);
>> +
>> +uint8_t *lz4k_out_non_repeat(
>> + uint8_t *out_at,
>> + uint8_t *const out_end,
>> + uint_fast32_t utag,
>> + const uint8_t *const nr0,
>> + const uint8_t *const r,
>> + const uint_fast32_t nr_log2,
>> + const uint_fast32_t off_log2,
>> + bool check_out);
>> +
>> +uint8_t *lz4k_out_r_bytes_left(
>> + uint8_t *out_at,
>> + uint8_t *const out_end,
>> + uint_fast32_t r_bytes_max,
>> + const uint_fast32_t nr_log2,
>> + const uint_fast32_t off_log2,
>> + const bool check_out);
>> +
>> +uint8_t *lz4k_out_repeat(
>> + uint8_t *out_at,
>> + uint8_t *const out_end,
>> + uint_fast32_t utag,
>> + uint_fast32_t r_bytes_max,
>> + const uint_fast32_t nr_log2,
>> + const uint_fast32_t off_log2,
>> + const bool check_out);
>> +
>> +const uint8_t *lz4k_repeat_end(
>> + const uint8_t *q,
>> + const uint8_t *r,
>> + const uint8_t *const in_end_safe,
>> + const uint8_t *const in_end);
>> +
>> +#endif /* _LZ4K_ENCODE_PRIVATE_H */
>> +
>> diff --git a/lib/lz4k/lz4k_private.h b/lib/lz4k/lz4k_private.h
>> new file mode 100644
>> index 000000000000..2a8f4b37dc74
>> --- /dev/null
>> +++ b/lib/lz4k/lz4k_private.h
>> @@ -0,0 +1,269 @@
>> +/*
>> + * Copyright (c) Huawei Technologies Co., Ltd. 2012-2020. All rights reserved.
>> + * Description: LZ4K compression algorithm
>> + * Author: Aleksei Romanovskii aleksei.romanovskii(a)huawei.com
>> + * Created: 2020-03-25
>> + */
>> +
>> +#ifndef _LZ4K_PRIVATE_H
>> +#define _LZ4K_PRIVATE_H
>> +
>> +#if !defined(__KERNEL__)
>> +
>> +#include "lz4k.h"
>> +#include <stdint.h> /* uint*_t */
>> +#define __STDC_WANT_LIB_EXT1__ 1
>> +#include <string.h> /* memcpy() */
>> +
>> +#define likely(e) __builtin_expect(e, 1)
>> +#define unlikely(e) __builtin_expect(e, 0)
>> +
>> +#else /* __KERNEL__ */
>> +
>> +#include <linux/lz4k.h>
>> +#define __STDC_WANT_LIB_EXT1__ 1
>> +#include <linux/string.h> /* memcpy() */
>> +#include <linux/types.h> /* uint8_t, int8_t, uint16_t, int16_t,
>> +uint32_t, int32_t, uint64_t, int64_t */
>> +#include <stddef.h>
>> +
>> +typedef uint64_t uint_fast32_t;
>> +typedef int64_t int_fast32_t;
>> +
>> +#endif /* __KERNEL__ */
>> +
>> +#if defined(__GNUC__) && (__GNUC__>=4)
>> +#define LZ4K_WITH_GCC_INTRINSICS
>> +#endif
>> +
>> +#if !defined(__GNUC__)
>> +#define __builtin_expect(e, v) (e)
>> +#endif /* defined(__GNUC__) */
>> +
>> +enum {
>> + BYTE_BITS = 8,
>> + BYTE_BITS_LOG2 = 3,
>> + BYTE_MAX = 255U,
>> + REPEAT_MIN = 4,
>> + TAG_BYTES_MAX = 3,
>> + TAG_BITS_MAX = TAG_BYTES_MAX * 8,
>> + BLOCK_4KB_LOG2 = 12,
>> + BLOCK_8KB_LOG2 = 13,
>> + BLOCK_16KB_LOG2 = 14,
>> + BLOCK_32KB_LOG2 = 15,
>> + BLOCK_64KB_LOG2 = 16
>> +};
>> +
>> +static inline uint32_t mask(uint_fast32_t log2)
>> +{
>> + return (1U << log2) - 1U;
>> +}
>> +
>> +static inline uint64_t mask64(uint_fast32_t log2)
>> +{
>> + return (1ULL << log2) - 1ULL;
>> +}
>> +
>> +#if defined LZ4K_WITH_GCC_INTRINSICS
>> +static inline int most_significant_bit_of(uint64_t u)
>> +{
>> + return (int)(__builtin_expect((u) == 0, false) ?
>> + -1 : (int)(31 ^ (uint32_t)__builtin_clz((unsigned)(u))));
>> +}
>> +#else /* #!defined LZ4K_WITH_GCC_INTRINSICS */
>> +#error undefined most_significant_bit_of(unsigned u)
>> +#endif /* #if defined LZ4K_WITH_GCC_INTRINSICS */
>> +
>> +static inline uint64_t round_up_to_log2(uint64_t u, uint8_t log2)
>> +{
>> + return (uint64_t)((u + mask64(log2)) & ~mask64(log2));
>> +}
>> +
>> +static inline uint64_t round_up_to_power_of2(uint64_t u)
>> +{
>> + const int_fast32_t msb = most_significant_bit_of(u);
>> + return round_up_to_log2(u, (uint8_t)msb);
>> +}
>> +
>> +static inline void m_copy(void *dst, const void *src, size_t total)
>> +{
>> +#if defined(__STDC_LIB_EXT1__)
>> + (void)memcpy_s(dst, total, src, (total * 2) >> 1); /* *2 >> 1 to avoid bot errors */
>> +#else
>> + (void)__builtin_memcpy(dst, src, total);
>> +#endif
>> +}
>> +
>> +static inline void m_set(void *dst, uint8_t value, size_t total)
>> +{
>> +#if defined(__STDC_LIB_EXT1__)
>> + (void)memset_s(dst, total, value, (total * 2) >> 1); /* *2 >> 1 to avoid bot errors */
>> +#else
>> + (void)__builtin_memset(dst, value, total);
>> +#endif
>> +}
>> +
>> +static inline uint32_t read4_at(const void *p)
>> +{
>> + uint32_t result;
>> + m_copy(&result, p, sizeof(result));
>> + return result;
>> +}
>> +
>> +static inline uint64_t read8_at(const void *p)
>> +{
>> + uint64_t result;
>> + m_copy(&result, p, sizeof(result));
>> + return result;
>> +}
>> +
>> +static inline bool equal4(const uint8_t *const q, const uint8_t *const r)
>> +{
>> + return read4_at(q) == read4_at(r);
>> +}
>> +
>> +static inline bool equal3(const uint8_t *const q, const uint8_t *const r)
>> +{
>> + return (read4_at(q) << BYTE_BITS) == (read4_at(r) << BYTE_BITS);
>> +}
>> +
>> +static inline uint_fast32_t hash24v(const uint64_t r, uint32_t shift)
>> +{
>> + const uint32_t m = 3266489917U;
>> + return (((uint32_t)r << BYTE_BITS) * m) >> (32 - shift);
>> +}
>> +
>> +static inline uint_fast32_t hash24(const uint8_t *r, uint32_t shift)
>> +{
>> + return hash24v(read4_at(r), shift);
>> +}
>> +
>> +static inline uint_fast32_t hash32v_2(const uint64_t r, uint32_t shift)
>> +{
>> + const uint32_t m = 3266489917U;
>> + return ((uint32_t)r * m) >> (32 - shift);
>> +}
>> +
>> +static inline uint_fast32_t hash32_2(const uint8_t *r, uint32_t shift)
>> +{
>> + return hash32v_2(read4_at(r), shift);
>> +}
>> +
>> +static inline uint_fast32_t hash32v(const uint64_t r, uint32_t shift)
>> +{
>> + const uint32_t m = 2654435761U;
>> + return ((uint32_t)r * m) >> (32 - shift);
>> +}
>> +
>> +static inline uint_fast32_t hash32(const uint8_t *r, uint32_t shift)
>> +{
>> + return hash32v(read4_at(r), shift);
>> +}
>> +
>> +static inline uint_fast32_t hash64v_5b(const uint64_t r, uint32_t shift)
>> +{
>> + const uint64_t m = 889523592379ULL;
>> + return (uint32_t)(((r << 24) * m) >> (64 - shift));
>> +}
>> +
>> +static inline uint_fast32_t hash64_5b(const uint8_t *r, uint32_t shift)
>> +{
>> + return hash64v_5b(read8_at(r), shift);
>> +}
>> +
>> +static inline uint_fast32_t hash64v_6b(const uint64_t r, uint32_t shift)
>> +{
>> + const uint64_t m = 227718039650203ULL;
>> + return (uint32_t)(((r << 16) * m) >> (64 - shift));
>> +}
>> +
>> +static inline uint_fast32_t hash64_6b(const uint8_t *r, uint32_t shift)
>> +{
>> + return hash64v_6b(read8_at(r), shift);
>> +}
>> +
>> +static inline uint_fast32_t hash64v_7b(const uint64_t r, uint32_t shift)
>> +{
>> + const uint64_t m = 58295818150454627ULL;
>> + return (uint32_t)(((r << 8) * m) >> (64 - shift));
>> +}
>> +
>> +static inline uint_fast32_t hash64_7b(const uint8_t *r, uint32_t shift)
>> +{
>> + return hash64v_7b(read8_at(r), shift);
>> +}
>> +
>> +static inline uint_fast32_t hash64v_8b(const uint64_t r, uint32_t shift)
>> +{
>> + const uint64_t m = 2870177450012600261ULL;
>> + return (uint32_t)((r * m) >> (64 - shift));
>> +}
>> +
>> +static inline uint_fast32_t hash64_8b(const uint8_t *r, uint32_t shift)
>> +{
>> + return hash64v_8b(read8_at(r), shift);
>> +}
>> +
>> +static inline void while_lt_copy_x(
>> + uint8_t *dst,
>> + const uint8_t *src,
>> + const uint8_t *dst_end,
>> + const size_t copy_min)
>> +{
>> + for (; dst < dst_end; dst += copy_min, src += copy_min)
>> + m_copy(dst, src, copy_min);
>> +}
>> +
>> +static inline void copy_x_while_lt(
>> + uint8_t *dst,
>> + const uint8_t *src,
>> + const uint8_t *dst_end,
>> + const size_t copy_min)
>> +{
>> + m_copy(dst, src, copy_min);
>> + while (dst + copy_min < dst_end)
>> + m_copy(dst += copy_min, src += copy_min, copy_min);
>> +}
>> +
>> +static inline void copy_x_while_total(
>> + uint8_t *dst,
>> + const uint8_t *src,
>> + size_t total,
>> + const size_t copy_min)
>> +{
>> + m_copy(dst, src, copy_min);
>> + for (; total > copy_min; total -= copy_min)
>> + m_copy(dst += copy_min, src += copy_min, copy_min);
>> +}
>> +
>> +static inline void copy_2x(
>> + uint8_t *dst,
>> + const uint8_t *src,
>> + const size_t copy_min)
>> +{
>> + m_copy(dst, src, copy_min);
>> + m_copy(dst + copy_min, src + copy_min, copy_min);
>> +}
>> +
>> +static inline void copy_2x_as_x2_while_lt(
>> + uint8_t *dst,
>> + const uint8_t *src,
>> + const uint8_t *dst_end,
>> + const size_t copy_min)
>> +{
>> + copy_2x(dst, src, copy_min);
>> + while (dst + (copy_min << 1) < dst_end)
>> + copy_2x(dst += (copy_min << 1), src += (copy_min << 1), copy_min);
>> +}
>> +
>> +static inline void while_lt_copy_2x_as_x2(
>> + uint8_t *dst,
>> + const uint8_t *src,
>> + const uint8_t *dst_end,
>> + const size_t copy_min)
>> +{
>> + for (; dst < dst_end; dst += (copy_min << 1), src += (copy_min << 1))
>> + copy_2x(dst, src, copy_min);
>> +}
>> +
>> +#endif /* _LZ4K_PRIVATE_H */
hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I7H9IA
CVE: NA
-------------------------------------------
Add lz4k algorithm support for zram.
Signed-off-by: Nanyong Sun <sunnanyong(a)huawei.com>
Signed-off-by: Tu Jinjiang <tujinjiang(a)huawei.com>
---
crypto/Kconfig | 8 +
crypto/Makefile | 1 +
crypto/lz4k.c | 97 ++++++
drivers/block/zram/zcomp.c | 3 +
include/linux/lz4k.h | 383 +++++++++++++++++++++++
lib/Kconfig | 6 +
lib/Makefile | 2 +
lib/lz4k/Makefile | 2 +
lib/lz4k/lz4k_decode.c | 308 +++++++++++++++++++
lib/lz4k/lz4k_encode.c | 539 +++++++++++++++++++++++++++++++++
lib/lz4k/lz4k_encode_private.h | 137 +++++++++
lib/lz4k/lz4k_private.h | 269 ++++++++++++++++
12 files changed, 1755 insertions(+)
create mode 100644 crypto/lz4k.c
create mode 100644 include/linux/lz4k.h
create mode 100644 lib/lz4k/Makefile
create mode 100644 lib/lz4k/lz4k_decode.c
create mode 100644 lib/lz4k/lz4k_encode.c
create mode 100644 lib/lz4k/lz4k_encode_private.h
create mode 100644 lib/lz4k/lz4k_private.h
diff --git a/crypto/Kconfig b/crypto/Kconfig
index 64cb304f5103..35223cff7c8a 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -1871,6 +1871,14 @@ config CRYPTO_LZ4HC
help
This is the LZ4 high compression mode algorithm.
+config CRYPTO_LZ4K
+ tristate "LZ4K compression algorithm"
+ select CRYPTO_ALGAPI
+ select LZ4K_COMPRESS
+ select LZ4K_DECOMPRESS
+ help
+ This is the LZ4K compression algorithm, an LZ4 variant used by zram.
+
config CRYPTO_ZSTD
tristate "Zstd compression algorithm"
select CRYPTO_ALGAPI
diff --git a/crypto/Makefile b/crypto/Makefile
index 9d1191f2b741..5c3b0a0839c5 100644
--- a/crypto/Makefile
+++ b/crypto/Makefile
@@ -161,6 +161,7 @@ obj-$(CONFIG_CRYPTO_AUTHENC) += authenc.o authencesn.o
obj-$(CONFIG_CRYPTO_LZO) += lzo.o lzo-rle.o
obj-$(CONFIG_CRYPTO_LZ4) += lz4.o
obj-$(CONFIG_CRYPTO_LZ4HC) += lz4hc.o
+obj-$(CONFIG_CRYPTO_LZ4K) += lz4k.o
obj-$(CONFIG_CRYPTO_XXHASH) += xxhash_generic.o
obj-$(CONFIG_CRYPTO_842) += 842.o
obj-$(CONFIG_CRYPTO_RNG2) += rng.o
diff --git a/crypto/lz4k.c b/crypto/lz4k.c
new file mode 100644
index 000000000000..8daceab269ef
--- /dev/null
+++ b/crypto/lz4k.c
@@ -0,0 +1,97 @@
+/*
+ * Copyright (c) Huawei Technologies Co., Ltd. 2012-2020. All rights reserved.
+ * Description: LZ4K compression algorithm for ZRAM
+ * Author: Arkhipov Denis arkhipov.denis(a)huawei.com
+ * Created: 2020-03-25
+ */
+
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/crypto.h>
+#include <linux/vmalloc.h>
+#include <linux/lz4k.h>
+
+
+struct lz4k_ctx {
+ void *lz4k_comp_mem;
+};
+
+static int lz4k_init(struct crypto_tfm *tfm)
+{
+ struct lz4k_ctx *ctx = crypto_tfm_ctx(tfm);
+
+ ctx->lz4k_comp_mem = vmalloc(lz4k_encode_state_bytes_min());
+ if (!ctx->lz4k_comp_mem)
+ return -ENOMEM;
+
+ return 0;
+}
+
+static void lz4k_exit(struct crypto_tfm *tfm)
+{
+ struct lz4k_ctx *ctx = crypto_tfm_ctx(tfm);
+ vfree(ctx->lz4k_comp_mem);
+}
+
+static int lz4k_compress_crypto(struct crypto_tfm *tfm, const u8 *src, unsigned int slen, u8 *dst, unsigned int *dlen)
+{
+ struct lz4k_ctx *ctx = crypto_tfm_ctx(tfm);
+ int ret;
+
+ ret = lz4k_encode(ctx->lz4k_comp_mem, src, dst, slen, *dlen, 0);
+
+ if (ret < 0)
+ return -EINVAL;
+
+ if (ret)
+ *dlen = ret;
+
+ return 0;
+}
+
+static int lz4k_decompress_crypto(struct crypto_tfm *tfm, const u8 *src, unsigned int slen, u8 *dst, unsigned int *dlen)
+{
+ int ret;
+
+ ret = lz4k_decode(src, dst, slen, *dlen);
+
+ if (ret <= 0)
+ return -EINVAL;
+ *dlen = ret;
+ return 0;
+}
+
+static struct crypto_alg alg_lz4k = {
+ .cra_name = "lz4k",
+ .cra_driver_name = "lz4k-generic",
+ .cra_flags = CRYPTO_ALG_TYPE_COMPRESS,
+ .cra_ctxsize = sizeof(struct lz4k_ctx),
+ .cra_module = THIS_MODULE,
+ .cra_list = LIST_HEAD_INIT(alg_lz4k.cra_list),
+ .cra_init = lz4k_init,
+ .cra_exit = lz4k_exit,
+ .cra_u = {
+ .compress = {
+ .coa_compress = lz4k_compress_crypto,
+ .coa_decompress = lz4k_decompress_crypto
+ }
+ }
+};
+
+static int __init lz4k_mod_init(void)
+{
+ return crypto_register_alg(&alg_lz4k);
+}
+
+static void __exit lz4k_mod_fini(void)
+{
+ crypto_unregister_alg(&alg_lz4k);
+}
+
+module_init(lz4k_mod_init);
+module_exit(lz4k_mod_fini);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("LZ4K Compression Algorithm");
+MODULE_ALIAS_CRYPTO("lz4k");
diff --git a/drivers/block/zram/zcomp.c b/drivers/block/zram/zcomp.c
index b08650417bf0..28bda2035326 100644
--- a/drivers/block/zram/zcomp.c
+++ b/drivers/block/zram/zcomp.c
@@ -29,6 +29,9 @@ static const char * const backends[] = {
#if IS_ENABLED(CONFIG_CRYPTO_ZSTD)
"zstd",
#endif
+#if IS_ENABLED(CONFIG_CRYPTO_LZ4K)
+ "lz4k",
+#endif
};
static void zcomp_strm_free(struct zcomp_strm *zstrm)
diff --git a/include/linux/lz4k.h b/include/linux/lz4k.h
new file mode 100644
index 000000000000..6e73161b1840
--- /dev/null
+++ b/include/linux/lz4k.h
@@ -0,0 +1,383 @@
+/*
+ * Copyright (c) Huawei Technologies Co., Ltd. 2012-2020. All rights reserved.
+ * Description: LZ4K compression algorithm
+ * Author: Aleksei Romanovskii aleksei.romanovskii(a)huawei.com
+ * Created: 2020-03-25
+ */
+
+#ifndef _LZ4K_H
+#define _LZ4K_H
+
+/* file lz4k.h
+ This file contains the platform-independent API of LZ-class
+ lossless codecs (compressors/decompressors) with complete
+ in-place documentation. The documentation is formatted
+ in accordance with DOXYGEN mark-up format. So, one can
+ generate proper documentation, e.g. in HTML format, using DOXYGEN.
+
+ Currently, LZ-class codecs, documented here, implement following
+ algorithms for lossless data compression/decompression:
+ \li "LZ HUAWEI" proprietary codec competing with LZ4 - lz4k_encode(),
+ lz4k_encode_delta(), lz4k_decode(), lz4k_decode_delta()
+
+ The LZ HUAWEI compressors accept any data as input and compress it
+ without loss to a smaller size if possible.
+ Compressed data produced by LZ HUAWEI compressor API lz4k_encode*(),
+ can be decompressed only by lz4k_decode() API documented below.\n
+ */
+
+/*
+ lz4k_status defines simple set of status values returned by Huawei APIs
+ */
+typedef enum {
+ LZ4K_STATUS_INCOMPRESSIBLE = 0, /* !< Return when data is incompressible */
+ LZ4K_STATUS_FAILED = -1, /* !< Return on general failure */
+ LZ4K_STATUS_READ_ERROR = -2, /* !< Return when data reading failed */
+ LZ4K_STATUS_WRITE_ERROR = -3 /* !< Return when data writing failed */
+} lz4k_status;
+
+/*
+ lz4k_version() returns a static immutable string with the algorithm version
+ */
+const char *lz4k_version(void);
+
+/*
+ lz4k_encode_state_bytes_min() returns the minimum size in bytes of the
+ state parameter supplied to lz4k_encode(), lz4k_encode_delta() and
+ lz4k_update_delta_state().
+ So, state should occupy at least lz4k_encode_state_bytes_min() bytes
+ for the mentioned functions to work correctly.
+ */
+unsigned lz4k_encode_state_bytes_min(void);
+
+/*
+ lz4k_encode() encodes/compresses one input buffer at *in, places
+ result of encoding into one output buffer at *out if encoded data
+ size fits specified values of out_max and out_limit.
+ It returns the size of the encoded data on success, or a value <= 0 otherwise.
+ The result of successful encoding is in HUAWEI proprietary format, that
+ is the encoded data can be decoded only by lz4k_decode().
+
+ \return
+ \li positive value\n
+ if encoding was successful. The value returned is the size of encoded
+ (compressed) data always <=out_max.
+ \li non-positive value\n
+ if in==0||in_max==0||out==0||out_max==0 or
+ if out_max is less than needed for encoded (compressed) data.
+ \li 0 value\n
+ if encoded data size >= out_limit
+
+ \param[in] state
+ !=0, pointer to state buffer used internally by the function. Size of
+ state in bytes should be at least lz4k_encode_state_bytes_min(). The content
+ of state buffer will be changed during encoding.
+
+ \param[in] in
+ !=0, pointer to the input buffer to encode (compress). The content of
+ the input buffer does not change during encoding.
+
+ \param[in] out
+ !=0, pointer to the output buffer where to place result of encoding
+ (compression).
+ If encoding is unsuccessful, e.g. out_max or out_limit are less than
+ needed for encoded data then content of out buffer may be arbitrary.
+
+ \param[in] in_max
+ !=0, size in bytes of the input buffer at *in
+
+ \param[in] out_max
+ !=0, size in bytes of the output buffer at *out
+
+ \param[in] out_limit
+ encoded data size soft limit in bytes. For performance reasons it is
+ not guaranteed that lz4k_encode() will always detect that the
+ resulting encoded data size is bigger than out_limit.
+ However, when reaching out_limit is detected, lz4k_encode() returns
+ early and spares CPU cycles. Caller code should therefore recheck that
+ a positive result returned by lz4k_encode() is really less than or
+ equal to out_limit.
+ out_limit is ignored if it is equal to 0.
+ */
+int lz4k_encode(
+ void *const state,
+ const void *const in,
+ void *out,
+ unsigned in_max,
+ unsigned out_max,
+ unsigned out_limit);
+
+/*
+ lz4k_encode_max_cr() encodes/compresses one input buffer at *in, places
+ result of encoding into one output buffer at *out if encoded data
+ size fits specified value of out_max.
+ It returns the size of the encoded data on success, or a value <= 0 otherwise.
+ The result of successful encoding is in HUAWEI proprietary format, that
+ is the encoded data can be decoded only by lz4k_decode().
+
+ \return
+ \li positive value\n
+ if encoding was successful. The value returned is the size of encoded
+ (compressed) data always <=out_max.
+ \li non-positive value\n
+ if in==0||in_max==0||out==0||out_max==0 or
+ if out_max is less than needed for encoded (compressed) data.
+
+ \param[in] state
+ !=0, pointer to state buffer used internally by the function. Size of
+ state in bytes should be at least lz4k_encode_state_bytes_min(). The content
+ of state buffer will be changed during encoding.
+
+ \param[in] in
+ !=0, pointer to the input buffer to encode (compress). The content of
+ the input buffer does not change during encoding.
+
+ \param[in] out
+ !=0, pointer to the output buffer where to place result of encoding
+ (compression).
+ If encoding is unsuccessful, e.g. out_max is less than
+ needed for encoded data then content of out buffer may be arbitrary.
+
+ \param[in] in_max
+ !=0, size in bytes of the input buffer at *in
+
+ \param[in] out_max
+ !=0, size in bytes of the output buffer at *out
+
+ \param[in] out_limit
+ encoded data size soft limit in bytes. For performance reasons it is
+ not guaranteed that lz4k_encode_max_cr() will always detect that the
+ resulting encoded data size is bigger than out_limit.
+ However, when reaching out_limit is detected, lz4k_encode_max_cr()
+ returns early and spares CPU cycles. Caller code should therefore
+ recheck that a positive result returned by lz4k_encode_max_cr() is
+ really less than or equal to out_limit.
+ out_limit is ignored if it is equal to 0.
+ */
+int lz4k_encode_max_cr(
+ void *const state,
+ const void *const in,
+ void *out,
+ unsigned in_max,
+ unsigned out_max,
+ unsigned out_limit);
+
+/*
+ lz4k_update_delta_state() fills/updates state (hash table) in the same way as
+ lz4k_encode does while encoding (compressing).
+ The state and its content can then be used by lz4k_encode_delta()
+ to encode (compress) data more efficiently.
+ In other words, the effect of lz4k_update_delta_state() is the same as
+ lz4k_encode() with all encoded output discarded.
+
+ Example sequence of calls for lz4k_update_delta_state and
+ lz4k_encode_delta:
+ //dictionary (1st) block
+ int result0 = lz4k_update_delta_state(state, in0, in0, in_max0);
+ //delta (2nd) block
+ int result1 = lz4k_encode_delta(state, in0, in, out, in_max, out_max);
+
+ \param[in] state
+ !=0, pointer to state buffer used internally by lz4k_encode*.
+ Size of state in bytes should be at least lz4k_encode_state_bytes_min().
+ The content of state buffer is zeroed at the beginning of
+ lz4k_update_delta_state ONLY when in0==in.
+ The content of state buffer will be changed inside
+ lz4k_update_delta_state.
+
+ \param[in] in0
+ !=0, pointer to the reference/dictionary input buffer that was used
+ as input to preceding call of lz4k_encode() or lz4k_update_delta_state()
+ to fill/update the state buffer.
+ The content of the reference/dictionary input buffer does not change
+ during encoding.
+ The in0 is needed for use-cases when there are several dictionary and
+ input blocks interleaved, e.g.
+ <dictionaryA><inputA><dictionaryB><inputB>..., or
+ <dictionaryA><dictionaryB><inputAB>..., etc.
+
+ \param[in] in
+ !=0, pointer to the input buffer to fill/update state as if encoding
+ (compressing) this input. This input buffer is also called dictionary
+ input buffer.
+ The content of the input buffer does not change during encoding.
+ The two buffers - at in0 and at in - should be contiguous in memory.
+ That is, the last byte of buffer at in0 is located exactly before byte
+ at in.
+
+ \param[in] in_max
+ !=0, size in bytes of the input buffer at in.
+ */
+int lz4k_update_delta_state(
+ void *const state,
+ const void *const in0,
+ const void *const in,
+ unsigned in_max);
+
+/*
+ lz4k_encode_delta() encodes (compresses) data from one input buffer
+ using one reference buffer as dictionary and places the result of
+ compression into one output buffer.
+ The result of successful compression is in HUAWEI proprietary format, so
+ that compressed data can be decompressed only by lz4k_decode_delta().
+ Reference/dictionary buffer and input buffer should be contiguous in
+ memory.
+
+ Example sequence of calls for lz4k_update_delta_state and
+ lz4k_encode_delta:
+ //dictionary (1st) block
+ int result0 = lz4k_update_delta_state(state, in0, in0, in_max0);
+ //delta (2nd) block
+ int result1 = lz4k_encode_delta(state, in0, in, out, in_max, out_max);
+
+ Example sequence of calls for lz4k_encode and lz4k_encode_delta:
+ //dictionary (1st) block
+ int result0 = lz4k_encode(state, in0, out0, in_max0, out_max0, 0);
+ //delta (2nd) block
+ int result1 = lz4k_encode_delta(state, in0, in, out, in_max, out_max);
+
+ \return
+ \li positive value\n
+ if encoding was successful. The value returned is the size of encoded
+ (compressed) data.
+ \li non-positive value\n
+ if state==0||in0==0||in==0||in_max==0||out==0||out_max==0 or
+ if out_max is less than needed for encoded (compressed) data.
+
+ \param[in] state
+ !=0, pointer to state buffer used internally by the function. Size of
+ state in bytes should be at least lz4k_encode_state_bytes_min(). For more
+ efficient encoding the state buffer may be filled/updated by calling
+ lz4k_update_delta_state() or lz4k_encode() before lz4k_encode_delta().
+ The content of state buffer is zeroed at the beginning of
+ lz4k_encode_delta() ONLY when in0==in.
+ The content of state will be changed during encoding.
+
+ \param[in] in0
+ !=0, pointer to the reference/dictionary input buffer that was used as
+ input to preceding call of lz4k_encode() or lz4k_update_delta_state() to
+ fill/update the state buffer.
+ The content of the reference/dictionary input buffer does not change
+ during encoding.
+
+ \param[in] in
+ !=0, pointer to the input buffer to encode (compress). The input buffer
+ is compressed using content of the reference/dictionary input buffer at
+ in0. The content of the input buffer does not change during encoding.
+ The two buffers - at *in0 and at *in - should be contiguous in memory.
+ That is, the last byte of buffer at *in0 is located exactly before byte
+ at *in.
+
+ \param[in] out
+ !=0, pointer to the output buffer where to place result of encoding
+ (compression). If compression is unsuccessful then content of out
+ buffer may be arbitrary.
+
+ \param[in] in_max
+ !=0, size in bytes of the input buffer at *in
+
+ \param[in] out_max
+ !=0, size in bytes of the output buffer at *out.
+ */
+int lz4k_encode_delta(
+ void *const state,
+ const void *const in0,
+ const void *const in,
+ void *out,
+ unsigned in_max,
+ unsigned out_max);
+
+/*
+ lz4k_decode() decodes (decompresses) data from one input buffer and places
+ the result of decompression into one output buffer. The encoded data in input
+ buffer should be in HUAWEI proprietary format, produced by lz4k_encode()
+ or by lz4k_encode_delta().
+
+ \return
+ \li positive value\n
+ if decoding was successful. The value returned is the size of decoded
+ (decompressed) data.
+ \li non-positive value\n
+ if in==0||in_max==0||out==0||out_max==0 or
+ if out_max is less than needed for decoded (decompressed) data or
+ if input encoded data format is corrupted.
+
+ \param[in] in
+ !=0, pointer to the input buffer to decode (decompress). The content of
+ the input buffer does not change during decoding.
+
+ \param[in] out
+ !=0, pointer to the output buffer where to place result of decoding
+ (decompression). If decompression is unsuccessful then content of out
+ buffer may be arbitrary.
+
+ \param[in] in_max
+ !=0, size in bytes of the input buffer at in
+
+ \param[in] out_max
+ !=0, size in bytes of the output buffer at out
+ */
+int lz4k_decode(
+ const void *const in,
+ void *const out,
+ unsigned in_max,
+ unsigned out_max);
+
+/*
+ lz4k_decode_delta() decodes (decompresses) data from one input buffer
+ and places the result of decompression into one output buffer. The
+ compressed data in input buffer should be in format, produced by
+ lz4k_encode_delta().
+
+ Example sequence of calls for lz4k_decode and lz4k_decode_delta:
+ //dictionary (1st) block
+ int result0 = lz4k_decode(in0, out0, in_max0, out_max0);
+ //delta (2nd) block
+ int result1 = lz4k_decode_delta(in, out0, out, in_max, out_max);
+
+ \return
+ \li positive value\n
+ if decoding was successful. The value returned is the size of decoded
+ (decompressed) data.
+ \li non-positive value\n
+ if in==0||in_max==0||out==0||out_max==0 or
+ if out_max is less than needed for decoded (decompressed) data or
+ if input data format is corrupted.
+
+ \param[in] in
+ !=0, pointer to the input buffer to decode (decompress). The content of
+ the input buffer does not change during decoding.
+
+ \param[in] out0
+ !=0, pointer to the dictionary input buffer that was used as input to
+ lz4k_update_delta_state() to fill/update the state buffer. The content
+ of the dictionary input buffer does not change during decoding.
+
+ \param[in] out
+ !=0, pointer to the output buffer where to place result of decoding
+ (decompression). If decompression is unsuccessful then content of out
+ buffer may be arbitrary.
+ The two buffers - at *out0 and at *out - should be contiguous in memory.
+ That is, the last byte of buffer at *out0 is located exactly before byte
+ at *out.
+
+ \param[in] in_max
+ !=0, size in bytes of the input buffer at *in
+
+ \param[in] out_max
+ !=0, size in bytes of the output buffer at *out
+ */
+int lz4k_decode_delta(
+ const void *in,
+ const void *const out0,
+ void *const out,
+ unsigned in_max,
+ unsigned out_max);
+
+
+#endif /* _LZ4K_H */
diff --git a/lib/Kconfig b/lib/Kconfig
index 36326864249d..4bf1c2c21157 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -310,6 +310,12 @@ config LZ4HC_COMPRESS
config LZ4_DECOMPRESS
tristate
+config LZ4K_COMPRESS
+ tristate
+
+config LZ4K_DECOMPRESS
+ tristate
+
config ZSTD_COMPRESS
select XXHASH
tristate
diff --git a/lib/Makefile b/lib/Makefile
index a803e1527c4b..bd0d3635ae46 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -187,6 +187,8 @@ obj-$(CONFIG_LZO_DECOMPRESS) += lzo/
obj-$(CONFIG_LZ4_COMPRESS) += lz4/
obj-$(CONFIG_LZ4HC_COMPRESS) += lz4/
obj-$(CONFIG_LZ4_DECOMPRESS) += lz4/
+obj-$(CONFIG_LZ4K_COMPRESS) += lz4k/
+obj-$(CONFIG_LZ4K_DECOMPRESS) += lz4k/
obj-$(CONFIG_ZSTD_COMPRESS) += zstd/
obj-$(CONFIG_ZSTD_DECOMPRESS) += zstd/
obj-$(CONFIG_XZ_DEC) += xz/
diff --git a/lib/lz4k/Makefile b/lib/lz4k/Makefile
new file mode 100644
index 000000000000..6ea3578639d4
--- /dev/null
+++ b/lib/lz4k/Makefile
@@ -0,0 +1,2 @@
+obj-$(CONFIG_LZ4K_COMPRESS) += lz4k_encode.o
+obj-$(CONFIG_LZ4K_DECOMPRESS) += lz4k_decode.o
\ No newline at end of file
diff --git a/lib/lz4k/lz4k_decode.c b/lib/lz4k/lz4k_decode.c
new file mode 100644
index 000000000000..567b76b7bc51
--- /dev/null
+++ b/lib/lz4k/lz4k_decode.c
@@ -0,0 +1,308 @@
+/*
+ * Copyright (c) Huawei Technologies Co., Ltd. 2012-2020. All rights reserved.
+ * Description: LZ4K compression algorithm
+ * Author: Aleksei Romanovskii aleksei.romanovskii(a)huawei.com
+ * Created: 2020-03-25
+ */
+
+#if !defined(__KERNEL__)
+#include "lz4k.h"
+#else
+#include <linux/lz4k.h>
+#include <linux/module.h>
+#endif
+
+#include "lz4k_private.h" /* types, etc */
+
+static const uint8_t *get_size(
+ uint_fast32_t *size,
+ const uint8_t *in_at,
+ const uint8_t *const in_end)
+{
+ uint_fast32_t u;
+ do {
+ if (unlikely(in_at >= in_end))
+ return NULL;
+ *size += (u = *(const uint8_t*)in_at);
+ ++in_at;
+ } while (BYTE_MAX == u);
+ return in_at;
+}
+
+static int end_of_block(
+ const uint_fast32_t nr_bytes_max,
+ const uint_fast32_t r_bytes_max,
+ const uint8_t *const in_at,
+ const uint8_t *const in_end,
+ const uint8_t *const out,
+ const uint8_t *const out_at)
+{
+ if (!nr_bytes_max)
+ return LZ4K_STATUS_FAILED; /* should be the last one in block */
+ if (r_bytes_max != REPEAT_MIN)
+ return LZ4K_STATUS_FAILED; /* should be the last one in block */
+ if (in_at != in_end)
+ return LZ4K_STATUS_FAILED; /* should be the last one in block */
+ return (int)(out_at - out);
+}
+
+enum {
+ NR_COPY_MIN = 16,
+ R_COPY_MIN = 16,
+ R_COPY_SAFE = R_COPY_MIN - 1,
+ R_COPY_SAFE_2X = (R_COPY_MIN << 1) - 1
+};
+
+static bool out_non_repeat(
+ const uint8_t **in_at,
+ uint8_t **out_at,
+ uint_fast32_t nr_bytes_max,
+ const uint8_t *const in_end,
+ const uint8_t *const out_end)
+{
+ const uint8_t *const in_copy_end = *in_at + nr_bytes_max;
+ uint8_t *const out_copy_end = *out_at + nr_bytes_max;
+ if (likely(nr_bytes_max <= NR_COPY_MIN)) {
+ if (likely(*in_at <= in_end - NR_COPY_MIN &&
+ *out_at <= out_end - NR_COPY_MIN))
+ m_copy(*out_at, *in_at, NR_COPY_MIN);
+ else if (in_copy_end <= in_end && out_copy_end <= out_end)
+ m_copy(*out_at, *in_at, nr_bytes_max);
+ else
+ return false;
+ } else {
+ if (likely(in_copy_end <= in_end - NR_COPY_MIN &&
+ out_copy_end <= out_end - NR_COPY_MIN)) {
+ m_copy(*out_at, *in_at, NR_COPY_MIN);
+ copy_x_while_lt(*out_at + NR_COPY_MIN,
+ *in_at + NR_COPY_MIN,
+ out_copy_end, NR_COPY_MIN);
+ } else if (in_copy_end <= in_end && out_copy_end <= out_end) {
+ m_copy(*out_at, *in_at, nr_bytes_max);
+ } else { /* in_copy_end > in_end || out_copy_end > out_end */
+ return false;
+ }
+ }
+ *in_at = in_copy_end;
+ *out_at = out_copy_end;
+ return true;
+}
+
+static void out_repeat_overlap(
+ uint_fast32_t offset,
+ uint8_t *out_at,
+ const uint8_t *out_from,
+ const uint8_t *const out_copy_end)
+{ /* (1 < offset < R_COPY_MIN/2) && out_copy_end + R_COPY_SAFE_2X <= out_end */
+ enum {
+ COPY_MIN = R_COPY_MIN >> 1,
+ OFFSET_LIMIT = COPY_MIN >> 1
+ };
+ m_copy(out_at, out_from, COPY_MIN);
+ out_at += offset;
+ if (offset <= OFFSET_LIMIT)
+ offset <<= 1;
+ do {
+ m_copy(out_at, out_from, COPY_MIN);
+ out_at += offset;
+ if (offset <= OFFSET_LIMIT)
+ offset <<= 1;
+ } while (out_at - out_from < R_COPY_MIN);
+ while_lt_copy_2x_as_x2(out_at, out_from, out_copy_end, R_COPY_MIN);
+}
+
+static bool out_repeat_slow(
+ uint_fast32_t r_bytes_max,
+ uint_fast32_t offset,
+ uint8_t *out_at,
+ const uint8_t *out_from,
+ const uint8_t *const out_copy_end,
+ const uint8_t *const out_end)
+{
+ if (offset > 1 && out_copy_end <= out_end - R_COPY_SAFE_2X) {
+ out_repeat_overlap(offset, out_at, out_from, out_copy_end);
+ } else {
+ if (unlikely(out_copy_end > out_end))
+ return false;
+ if (offset == 1) {
+ m_set(out_at, *out_from, r_bytes_max);
+ } else {
+ do
+ *out_at++ = *out_from++;
+ while (out_at < out_copy_end);
+ }
+ }
+ return true;
+}
+
+static int decode(
+ const uint8_t *const in,
+ const uint8_t *const out0,
+ uint8_t *const out,
+ const uint8_t *const in_end,
+ const uint8_t *const out_end,
+ const uint_fast32_t nr_log2,
+ const uint_fast32_t off_log2)
+{
+ const uint_fast32_t r_log2 = TAG_BITS_MAX - (off_log2 + nr_log2);
+ const uint8_t *in_at = in;
+ const uint8_t *const in_end_minus_x = in_end - TAG_BYTES_MAX;
+ uint8_t *out_at = out;
+ while (likely(in_at <= in_end_minus_x)) {
+ const uint_fast32_t utag = read4_at(in_at - 1) >> BYTE_BITS;
+ const uint_fast32_t offset = utag & mask(off_log2);
+ uint_fast32_t nr_bytes_max = utag >> (off_log2 + r_log2),
+ r_bytes_max = ((utag >> off_log2) & mask(r_log2)) +
+ REPEAT_MIN;
+ const uint8_t *out_from = 0;
+ uint8_t *out_copy_end = 0;
+ in_at += TAG_BYTES_MAX;
+ if (unlikely(nr_bytes_max == mask(nr_log2))) {
+ in_at = get_size(&nr_bytes_max, in_at, in_end);
+ if (in_at == NULL)
+ return LZ4K_STATUS_READ_ERROR;
+ }
+ if (!out_non_repeat(&in_at, &out_at, nr_bytes_max, in_end, out_end))
+ return LZ4K_STATUS_FAILED;
+ if (unlikely(r_bytes_max == mask(r_log2) + REPEAT_MIN)) {
+ in_at = get_size(&r_bytes_max, in_at, in_end);
+ if (in_at == NULL)
+ return LZ4K_STATUS_READ_ERROR;
+ }
+ out_from = out_at - offset;
+ if (unlikely(out_from < out0))
+ return LZ4K_STATUS_FAILED;
+ out_copy_end = out_at + r_bytes_max;
+ if (likely(offset >= R_COPY_MIN &&
+ out_copy_end <= out_end - R_COPY_SAFE_2X)) {
+ copy_2x_as_x2_while_lt(out_at, out_from, out_copy_end,
+ R_COPY_MIN);
+ } else if (likely(offset >= (R_COPY_MIN >> 1) &&
+ out_copy_end <= out_end - R_COPY_SAFE_2X)) {
+ m_copy(out_at, out_from, R_COPY_MIN);
+ out_at += offset;
+ while_lt_copy_x(out_at, out_from, out_copy_end, R_COPY_MIN);
+ /* faster than 2x */
+ } else if (likely(offset > 0)) {
+ if (!out_repeat_slow(r_bytes_max, offset, out_at, out_from,
+ out_copy_end, out_end))
+ return LZ4K_STATUS_FAILED;
+ } else { /* offset == 0: EOB, last literal */
+ return end_of_block(nr_bytes_max, r_bytes_max, in_at,
+ in_end, out, out_at);
+ }
+ out_at = out_copy_end;
+ }
+ return in_at == in_end ? (int)(out_at - out) : LZ4K_STATUS_FAILED;
+}
+
+static int decode4kb(
+ const uint8_t *const in,
+ const uint8_t *const out0,
+ uint8_t *const out,
+ const uint8_t *const in_end,
+ const uint8_t *const out_end)
+{
+ enum {
+ NR_LOG2 = 6
+ };
+ return decode(in, out0, out, in_end, out_end, NR_LOG2, BLOCK_4KB_LOG2);
+}
+
+static int decode8kb(
+ const uint8_t *const in,
+ const uint8_t *const out0,
+ uint8_t *const out,
+ const uint8_t *const in_end,
+ const uint8_t *const out_end)
+{
+ enum {
+ NR_LOG2 = 5
+ };
+ return decode(in, out0, out, in_end, out_end, NR_LOG2, BLOCK_8KB_LOG2);
+}
+
+static int decode16kb(
+ const uint8_t *const in,
+ const uint8_t *const out0,
+ uint8_t *const out,
+ const uint8_t *const in_end,
+ const uint8_t *const out_end)
+{
+ enum {
+ NR_LOG2 = 5
+ };
+ return decode(in, out0, out, in_end, out_end, NR_LOG2, BLOCK_16KB_LOG2);
+}
+
+static int decode32kb(
+ const uint8_t *const in,
+ const uint8_t *const out0,
+ uint8_t *const out,
+ const uint8_t *const in_end,
+ const uint8_t *const out_end)
+{
+ enum {
+ NR_LOG2 = 4
+ };
+ return decode(in, out0, out, in_end, out_end, NR_LOG2, BLOCK_32KB_LOG2);
+}
+
+static int decode64kb(
+ const uint8_t *const in,
+ const uint8_t *const out0,
+ uint8_t *const out,
+ const uint8_t *const in_end,
+ const uint8_t *const out_end)
+{
+ enum {
+ NR_LOG2 = 4
+ };
+ return decode(in, out0, out, in_end, out_end, NR_LOG2, BLOCK_64KB_LOG2);
+}
+
+static inline const void *u8_inc(const uint8_t *a)
+{
+ return a+1;
+}
+
+int lz4k_decode(
+ const void *in,
+ void *const out,
+ unsigned in_max,
+ unsigned out_max)
+{
+ /* ++use volatile pointers to prevent compiler optimizations */
+ const uint8_t *volatile in_end = (const uint8_t*)in + in_max;
+ const uint8_t *volatile out_end = (uint8_t*)out + out_max;
+ uint8_t in_log2 = 0;
+ if (unlikely(in == NULL || out == NULL || in_max <= 4 || out_max <= 0))
+ return LZ4K_STATUS_FAILED;
+ in_log2 = (uint8_t)(BLOCK_4KB_LOG2 + *(const uint8_t*)in);
+ /* invalid buffer size or pointer overflow */
+ if (unlikely((const uint8_t*)in >= in_end || (uint8_t*)out >= out_end))
+ return LZ4K_STATUS_FAILED;
+ /* -- */
+ in = u8_inc((const uint8_t*)in);
+ --in_max;
+ if (in_log2 < BLOCK_8KB_LOG2)
+ return decode4kb((const uint8_t*)in, (uint8_t*)out,
+ (uint8_t*)out, in_end, out_end);
+ if (in_log2 == BLOCK_8KB_LOG2)
+ return decode8kb((const uint8_t*)in, (uint8_t*)out,
+ (uint8_t*)out, in_end, out_end);
+ if (in_log2 == BLOCK_16KB_LOG2)
+ return decode16kb((const uint8_t*)in, (uint8_t*)out,
+ (uint8_t*)out, in_end, out_end);
+ if (in_log2 == BLOCK_32KB_LOG2)
+ return decode32kb((const uint8_t*)in, (uint8_t*)out,
+ (uint8_t*)out, in_end, out_end);
+ if (in_log2 == BLOCK_64KB_LOG2)
+ return decode64kb((const uint8_t*)in, (uint8_t*)out,
+ (uint8_t*)out, in_end, out_end);
+ return LZ4K_STATUS_FAILED;
+}
+EXPORT_SYMBOL(lz4k_decode);
+
+MODULE_LICENSE("Dual BSD/GPL");
+MODULE_DESCRIPTION("LZ4K decoder");
diff --git a/lib/lz4k/lz4k_encode.c b/lib/lz4k/lz4k_encode.c
new file mode 100644
index 000000000000..a425d3a0b827
--- /dev/null
+++ b/lib/lz4k/lz4k_encode.c
@@ -0,0 +1,539 @@
+/*
+ * Copyright (c) Huawei Technologies Co., Ltd. 2012-2020. All rights reserved.
+ * Description: LZ4K compression algorithm
+ * Author: Aleksei Romanovskii aleksei.romanovskii(a)huawei.com
+ * Created: 2020-03-25
+ */
+
+#if !defined(__KERNEL__)
+#include "lz4k.h"
+#else
+#include <linux/lz4k.h>
+#include <linux/module.h>
+#endif
+
+#include "lz4k_private.h"
+#include "lz4k_encode_private.h"
+
+static uint8_t *out_size_bytes(uint8_t *out_at, uint_fast32_t u)
+{
+ for (; unlikely(u >= BYTE_MAX); u -= BYTE_MAX)
+ *out_at++ = (uint8_t)BYTE_MAX;
+ *out_at++ = (uint8_t)u;
+ return out_at;
+}
+
+static inline uint8_t *out_utag_then_bytes_left(
+ uint8_t *out_at,
+ uint_fast32_t utag,
+ uint_fast32_t bytes_left)
+{
+ m_copy(out_at, &utag, TAG_BYTES_MAX);
+ return out_size_bytes(out_at + TAG_BYTES_MAX, bytes_left);
+}
+
+static int out_tail(
+ uint8_t *out_at,
+ uint8_t *const out_end,
+ const uint8_t *const out,
+ const uint8_t *const nr0,
+ const uint8_t *const in_end,
+ const uint_fast32_t nr_log2,
+ const uint_fast32_t off_log2,
+ bool check_out)
+{
+ const uint_fast32_t nr_mask = mask(nr_log2);
+ const uint_fast32_t r_log2 = TAG_BITS_MAX - (off_log2 + nr_log2);
+ const uint_fast32_t nr_bytes_max = u_32(in_end - nr0);
+ if (encoded_bytes_min(nr_log2, nr_bytes_max) > u_32(out_end - out_at))
+ return check_out ? LZ4K_STATUS_WRITE_ERROR :
+ LZ4K_STATUS_INCOMPRESSIBLE;
+ if (nr_bytes_max < nr_mask) {
+ /* caller guarantees at least one nr-byte */
+ uint_fast32_t utag = (nr_bytes_max << (off_log2 + r_log2));
+ m_copy(out_at, &utag, TAG_BYTES_MAX);
+ out_at += TAG_BYTES_MAX;
+ } else {
+ uint_fast32_t bytes_left = nr_bytes_max - nr_mask;
+ uint_fast32_t utag = (nr_mask << (off_log2 + r_log2));
+ out_at = out_utag_then_bytes_left(out_at, utag, bytes_left);
+ }
+ m_copy(out_at, nr0, nr_bytes_max);
+ return (int)(out_at + nr_bytes_max - out);
+}
+
+int lz4k_out_tail(
+ uint8_t *out_at,
+ uint8_t *const out_end,
+ const uint8_t *const out,
+ const uint8_t *const nr0,
+ const uint8_t *const in_end,
+ const uint_fast32_t nr_log2,
+ const uint_fast32_t off_log2,
+ bool check_out)
+{
+ return out_tail(out_at, out_end, out, nr0, in_end,
+ nr_log2, off_log2, check_out);
+}
+
+static uint8_t *out_non_repeat(
+ uint8_t *out_at,
+ uint8_t *const out_end,
+ uint_fast32_t utag,
+ const uint8_t *const nr0,
+ const uint8_t *const r,
+ const uint_fast32_t nr_log2,
+ const uint_fast32_t off_log2,
+ bool check_out)
+{
+ const uint_fast32_t nr_bytes_max = u_32(r - nr0);
+ const uint_fast32_t nr_mask = mask(nr_log2),
+ r_log2 = TAG_BITS_MAX - (off_log2 + nr_log2);
+ if (likely(nr_bytes_max < nr_mask)) {
+ if (unlikely(check_out &&
+ TAG_BYTES_MAX + nr_bytes_max > u_32(out_end - out_at)))
+ return NULL;
+ utag |= (nr_bytes_max << (off_log2 + r_log2));
+ m_copy(out_at, &utag, TAG_BYTES_MAX);
+ out_at += TAG_BYTES_MAX;
+ } else {
+ uint_fast32_t bytes_left = nr_bytes_max - nr_mask;
+ if (unlikely(check_out &&
+ TAG_BYTES_MAX + size_bytes_count(bytes_left) + nr_bytes_max >
+ u_32(out_end - out_at)))
+ return NULL;
+ utag |= (nr_mask << (off_log2 + r_log2));
+ out_at = out_utag_then_bytes_left(out_at, utag, bytes_left);
+ }
+ if (unlikely(check_out))
+ m_copy(out_at, nr0, nr_bytes_max);
+ else
+ copy_x_while_total(out_at, nr0, nr_bytes_max, NR_COPY_MIN);
+ out_at += nr_bytes_max;
+ return out_at;
+}
+
+uint8_t *lz4k_out_non_repeat(
+ uint8_t *out_at,
+ uint8_t *const out_end,
+ uint_fast32_t utag,
+ const uint8_t *const nr0,
+ const uint8_t *const r,
+ const uint_fast32_t nr_log2,
+ const uint_fast32_t off_log2,
+ bool check_out)
+{
+ return out_non_repeat(out_at, out_end, utag, nr0, r,
+ nr_log2, off_log2, check_out);
+}
+
+static uint8_t *out_r_bytes_left(
+ uint8_t *out_at,
+ uint8_t *const out_end,
+ uint_fast32_t r_bytes_max,
+ const uint_fast32_t nr_log2,
+ const uint_fast32_t off_log2,
+ const bool check_out) /* =false when
+out_max>=encoded_bytes_max(in_max), =true otherwise */
+{
+ const uint_fast32_t r_mask = mask(TAG_BITS_MAX - (off_log2 + nr_log2));
+ if (unlikely(r_bytes_max - REPEAT_MIN >= r_mask)) {
+ uint_fast32_t bytes_left = r_bytes_max - REPEAT_MIN - r_mask;
+ if (unlikely(check_out &&
+ size_bytes_count(bytes_left) > u_32(out_end - out_at)))
+ return NULL;
+ out_at = out_size_bytes(out_at, bytes_left);
+ }
+ return out_at; /* SUCCESS: continue compression */
+}
+
+uint8_t *lz4k_out_r_bytes_left(
+ uint8_t *out_at,
+ uint8_t *const out_end,
+ uint_fast32_t r_bytes_max,
+ const uint_fast32_t nr_log2,
+ const uint_fast32_t off_log2,
+ const bool check_out)
+{
+ return out_r_bytes_left(out_at, out_end, r_bytes_max,
+ nr_log2, off_log2, check_out);
+}
+
+static uint8_t *out_repeat(
+ uint8_t *out_at,
+ uint8_t *const out_end,
+ uint_fast32_t utag,
+ uint_fast32_t r_bytes_max,
+ const uint_fast32_t nr_log2,
+ const uint_fast32_t off_log2,
+ const bool check_out) /* =false when
+out_max>=encoded_bytes_max(in_max), =true otherwise */
+{
+ const uint_fast32_t r_mask = mask(TAG_BITS_MAX - (off_log2 + nr_log2));
+ if (likely(r_bytes_max - REPEAT_MIN < r_mask)) {
+ if (unlikely(check_out && TAG_BYTES_MAX > u_32(out_end - out_at)))
+ return NULL;
+ utag |= ((r_bytes_max - REPEAT_MIN) << off_log2);
+ m_copy(out_at, &utag, TAG_BYTES_MAX);
+ out_at += TAG_BYTES_MAX;
+ } else {
+ uint_fast32_t bytes_left = r_bytes_max - REPEAT_MIN - r_mask;
+ if (unlikely(check_out &&
+ TAG_BYTES_MAX + size_bytes_count(bytes_left) >
+ u_32(out_end - out_at)))
+ return NULL;
+ utag |= (r_mask << off_log2);
+ out_at = out_utag_then_bytes_left(out_at, utag, bytes_left);
+ }
+ return out_at; /* SUCCESS: continue compression */
+}
+
+uint8_t *lz4k_out_repeat(
+ uint8_t *out_at,
+ uint8_t *const out_end,
+ uint_fast32_t utag,
+ uint_fast32_t r_bytes_max,
+ const uint_fast32_t nr_log2,
+ const uint_fast32_t off_log2,
+ const bool check_out)
+{
+ return out_repeat(out_at, out_end, utag, r_bytes_max,
+ nr_log2, off_log2, check_out);
+}
+
+static const uint8_t *repeat_end(
+ const uint8_t *q,
+ const uint8_t *r,
+ const uint8_t *const in_end_safe,
+ const uint8_t *const in_end)
+{
+ q += REPEAT_MIN;
+ r += REPEAT_MIN;
+ /* caller guarantees r+12<=in_end */
+ do {
+ const uint64_t x = read8_at(q) ^ read8_at(r);
+ if (x) {
+ const uint16_t ctz = (uint16_t)__builtin_ctzl(x);
+ return r + (ctz >> BYTE_BITS_LOG2);
+ }
+ /* some bytes differ: the count of trailing 0-bits locates the first mismatching byte */
+ q += sizeof(uint64_t), r += sizeof(uint64_t);
+ } while (likely(r <= in_end_safe)); /* once, at input block end */
+ do {
+ if (*q != *r) return r;
+ ++q;
+ ++r;
+ } while (r < in_end);
+ return r;
+}
+
+const uint8_t *lz4k_repeat_end(
+ const uint8_t *q,
+ const uint8_t *r,
+ const uint8_t *const in_end_safe,
+ const uint8_t *const in_end)
+{
+ return repeat_end(q, r, in_end_safe, in_end);
+}
+
+enum {
+ HT_BYTES_LOG2 = HT_LOG2 + 1
+};
+
+inline unsigned encode_state_bytes_min(void)
+{
+ unsigned bytes_total = (1U << HT_BYTES_LOG2);
+ return bytes_total;
+}
+
+#if !defined(LZ4K_DELTA) && !defined(LZ4K_MAX_CR)
+
+unsigned lz4k_encode_state_bytes_min(void)
+{
+ return encode_state_bytes_min();
+}
+EXPORT_SYMBOL(lz4k_encode_state_bytes_min);
+
+#endif /* !defined(LZ4K_DELTA) && !defined(LZ4K_MAX_CR) */
+
+/* CR increase order: +STEP, have OFFSETS, use _5b */
+/* *_6b to compete with LZ4 */
+static inline uint_fast32_t hash0_v(const uint64_t r, uint32_t shift)
+{
+ return hash64v_6b(r, shift);
+}
+
+static inline uint_fast32_t hash0(const uint8_t *r, uint32_t shift)
+{
+ return hash64_6b(r, shift);
+}
+
+/*
+ * Proof that 'r' increments are safe, i.e. no pointer overflow is possible:
+ *
+ * With STEP_LOG2=5 and step_start = 1<<STEP_LOG2 == 32, we increment r
+ * 32 times by 1, 32 times by 2, 32 times by 3, and so on:
+ * 32*1+32*2+32*3+...+32*31 == 32*SUM(1..31) == 32*((1+31)*15+16) == 15872.
+ * So r can safely be incremented by at most 31 for any input block of
+ * size <= 1<<13, since 1<<13 < 15872.
+ *
+ * More precisely, STEP_LIMIT == x for any input block calculated as follows:
+ * 1<<off_log2 >= (1<<STEP_LOG2)*((x+1)(x-1)/2+x/2) ==>
+ * 1<<(off_log2-STEP_LOG2+1) >= x^2+x-1 ==>
+ * x^2+x-1-1<<(off_log2-STEP_LOG2+1) == 0, which is solved by standard
+ * method.
+ * To avoid overhead here conservative approximate value of x is calculated
+ * as average of two nearest square roots, see STEP_LIMIT above.
+ */
+
+enum {
+ STEP_LOG2 = 5 /* increase for better CR */
+};
+
+static int encode_any(
+ uint16_t *const ht0,
+ const uint8_t *const in0,
+ uint8_t *const out,
+ const uint8_t *const in_end,
+ uint8_t *const out_end, /* ==out_limit for !check_out */
+ const uint_fast32_t nr_log2,
+ const uint_fast32_t off_log2,
+ const bool check_out)
+{ /* caller guarantees off_log2 <=16 */
+ uint8_t *out_at = out;
+ const uint8_t *const in_end_safe = in_end - NR_COPY_MIN;
+ const uint8_t *r = in0;
+ const uint8_t *nr0 = r++;
+ uint_fast32_t step = 1 << STEP_LOG2;
+ for (;;) {
+ uint_fast32_t utag = 0;
+ const uint8_t *r_end = 0;
+ uint_fast32_t r_bytes_max = 0;
+ const uint8_t *const q = hashed(in0, ht0, hash0(r, HT_LOG2), r);
+ if (!equal4(q, r)) {
+ r += (++step >> STEP_LOG2);
+ if (unlikely(r > in_end_safe))
+ return out_tail(out_at, out_end, out, nr0, in_end,
+ nr_log2, off_log2, check_out);
+ continue;
+ }
+ utag = u_32(r - q);
+ r_end = repeat_end(q, r, in_end_safe, in_end);
+ r = repeat_start(q, r, nr0, in0);
+ r_bytes_max = u_32(r_end - r);
+ if (nr0 == r) {
+ out_at = out_repeat(out_at, out_end, utag, r_bytes_max,
+ nr_log2, off_log2, check_out);
+ } else {
+ update_utag(r_bytes_max, &utag, nr_log2, off_log2);
+ out_at = out_non_repeat(out_at, out_end, utag, nr0, r,
+ nr_log2, off_log2, check_out);
+ if (unlikely(check_out && out_at == NULL))
+ return LZ4K_STATUS_WRITE_ERROR;
+ out_at = out_r_bytes_left(out_at, out_end, r_bytes_max,
+ nr_log2, off_log2, check_out);
+ }
+ if (unlikely(check_out && out_at == NULL))
+ return LZ4K_STATUS_WRITE_ERROR;
+ nr0 = (r += r_bytes_max);
+ if (unlikely(r > in_end_safe))
+ return r == in_end ? (int)(out_at - out) :
+ out_tail(out_at, out_end, out, r, in_end,
+ nr_log2, off_log2, check_out);
+ ht0[hash0(r - 1 - 1, HT_LOG2)] = (uint16_t)(r - 1 - 1 - in0);
+ step = 1 << STEP_LOG2;
+ }
+}
+
+static int encode_fast(
+ uint16_t *const ht,
+ const uint8_t *const in,
+ uint8_t *const out,
+ const uint8_t *const in_end,
+ uint8_t *const out_end, /* ==out_limit for !check_out */
+ const uint_fast32_t nr_log2,
+ const uint_fast32_t off_log2)
+{ /* caller guarantees off_log2 <=16 */
+ return encode_any(ht, in, out, in_end, out_end, nr_log2, off_log2,
+ false); /* !check_out */
+}
+
+static int encode_slow(
+ uint16_t *const ht,
+ const uint8_t *const in,
+ uint8_t *const out,
+ const uint8_t *const in_end,
+ uint8_t *const out_end, /* ==out_limit for !check_out */
+ const uint_fast32_t nr_log2,
+ const uint_fast32_t off_log2)
+{ /* caller guarantees off_log2 <=16 */
+ return encode_any(ht, in, out, in_end, out_end, nr_log2, off_log2,
+ true); /* check_out */
+}
+
+static int encode4kb(
+ uint16_t *const state,
+ const uint8_t *const in,
+ uint8_t *out,
+ const uint_fast32_t in_max,
+ const uint_fast32_t out_max,
+ const uint_fast32_t out_limit)
+{
+ enum {
+ NR_LOG2 = 6
+ };
+ const int result = (encoded_bytes_max(NR_LOG2, in_max) > out_max) ?
+ encode_slow(state, in, out, in + in_max, out + out_max,
+ NR_LOG2, BLOCK_4KB_LOG2) :
+ encode_fast(state, in, out, in + in_max, out + out_limit,
+ NR_LOG2, BLOCK_4KB_LOG2);
+ return result <= 0 ? result : result + 1; /* +1 for in_log2 */
+}
+
+static int encode8kb(
+ uint16_t *const state,
+ const uint8_t *const in,
+ uint8_t *out,
+ const uint_fast32_t in_max,
+ const uint_fast32_t out_max,
+ const uint_fast32_t out_limit)
+{
+ enum {
+ NR_LOG2 = 5
+ };
+ const int result = (encoded_bytes_max(NR_LOG2, in_max) > out_max) ?
+ encode_slow(state, in, out, in + in_max, out + out_max,
+ NR_LOG2, BLOCK_8KB_LOG2) :
+ encode_fast(state, in, out, in + in_max, out + out_limit,
+ NR_LOG2, BLOCK_8KB_LOG2);
+ return result <= 0 ? result : result + 1; /* +1 for in_log2 */
+}
+
+static int encode16kb(
+ uint16_t *const state,
+ const uint8_t *const in,
+ uint8_t *out,
+ const uint_fast32_t in_max,
+ const uint_fast32_t out_max,
+ const uint_fast32_t out_limit)
+{
+ enum {
+ NR_LOG2 = 5
+ };
+ const int result = (encoded_bytes_max(NR_LOG2, in_max) > out_max) ?
+ encode_slow(state, in, out, in + in_max, out + out_max,
+ NR_LOG2, BLOCK_16KB_LOG2) :
+ encode_fast(state, in, out, in + in_max, out + out_limit,
+ NR_LOG2, BLOCK_16KB_LOG2);
+ return result <= 0 ? result : result + 1; /* +1 for in_log2 */
+}
+
+static int encode32kb(
+ uint16_t *const state,
+ const uint8_t *const in,
+ uint8_t *out,
+ const uint_fast32_t in_max,
+ const uint_fast32_t out_max,
+ const uint_fast32_t out_limit)
+{
+ enum {
+ NR_LOG2 = 4
+ };
+ const int result = (encoded_bytes_max(NR_LOG2, in_max) > out_max) ?
+ encode_slow(state, in, out, in + in_max, out + out_max,
+ NR_LOG2, BLOCK_32KB_LOG2) :
+ encode_fast(state, in, out, in + in_max, out + out_limit,
+ NR_LOG2, BLOCK_32KB_LOG2);
+ return result <= 0 ? result : result + 1; /* +1 for in_log2 */
+}
+
+static int encode64kb(
+ uint16_t *const state,
+ const uint8_t *const in,
+ uint8_t *out,
+ const uint_fast32_t in_max,
+ const uint_fast32_t out_max,
+ const uint_fast32_t out_limit)
+{
+ enum {
+ NR_LOG2 = 4
+ };
+ const int result = (encoded_bytes_max(NR_LOG2, in_max) > out_max) ?
+ encode_slow(state, in, out, in + in_max, out + out_max,
+ NR_LOG2, BLOCK_64KB_LOG2) :
+ encode_fast(state, in, out, in + in_max, out + out_limit,
+ NR_LOG2, BLOCK_64KB_LOG2);
+ return result <= 0 ? result : result + 1; /* +1 for in_log2 */
+}
+
+static int encode(
+ uint16_t *const state,
+ const uint8_t *const in,
+ uint8_t *out,
+ uint_fast32_t in_max,
+ uint_fast32_t out_max,
+ uint_fast32_t out_limit)
+{
+ const uint8_t in_log2 = (uint8_t)(most_significant_bit_of(
+ round_up_to_power_of2(in_max - REPEAT_MIN)));
+ m_set(state, 0, encode_state_bytes_min());
+ *out = in_log2 > BLOCK_4KB_LOG2 ? (uint8_t)(in_log2 - BLOCK_4KB_LOG2) : 0;
+ ++out;
+ --out_max;
+ --out_limit;
+ if (in_log2 < BLOCK_8KB_LOG2)
+ return encode4kb(state, in, out, in_max, out_max, out_limit);
+ if (in_log2 == BLOCK_8KB_LOG2)
+ return encode8kb(state, in, out, in_max, out_max, out_limit);
+ if (in_log2 == BLOCK_16KB_LOG2)
+ return encode16kb(state, in, out, in_max, out_max, out_limit);
+ if (in_log2 == BLOCK_32KB_LOG2)
+ return encode32kb(state, in, out, in_max, out_max, out_limit);
+ if (in_log2 == BLOCK_64KB_LOG2)
+ return encode64kb(state, in, out, in_max, out_max, out_limit);
+ return LZ4K_STATUS_FAILED;
+}
+
+int lz4k_encode(
+ void *const state,
+ const void *const in,
+ void *out,
+ unsigned in_max,
+ unsigned out_max,
+ unsigned out_limit)
+{
+ const unsigned gain_max = 64 > (in_max >> 6) ? 64 : (in_max >> 6);
+ const unsigned out_limit_min = in_max < out_max ? in_max : out_max;
+ const uint8_t *volatile in_end = (const uint8_t*)in + in_max;
+ const uint8_t *volatile out_end = (uint8_t*)out + out_max;
+ const void *volatile state_end =
+ (uint8_t*)state + encode_state_bytes_min();
+ if (unlikely(state == NULL))
+ return LZ4K_STATUS_FAILED;
+ if (unlikely(in == NULL || out == NULL))
+ return LZ4K_STATUS_FAILED;
+ if (unlikely(in_max <= gain_max))
+ return LZ4K_STATUS_INCOMPRESSIBLE;
+ if (unlikely(out_max <= gain_max)) /* need 1 byte for in_log2 */
+ return LZ4K_STATUS_FAILED;
+ /* use volatile end pointers to keep the compiler from optimizing the overflow checks away */
+ if (unlikely((const uint8_t*)in >= in_end || (uint8_t*)out >= out_end))
+ return LZ4K_STATUS_FAILED;
+ if (unlikely(state >= state_end))
+ return LZ4K_STATUS_FAILED; /* pointer overflow */
+ if (!out_limit || out_limit >= out_limit_min)
+ out_limit = out_limit_min - gain_max;
+ return encode((uint16_t*)state, (const uint8_t*)in, (uint8_t*)out,
+ in_max, out_max, out_limit);
+}
+EXPORT_SYMBOL(lz4k_encode);
+
+const char *lz4k_version(void)
+{
+ static const char *version = "2020.07.07";
+ return version;
+}
+EXPORT_SYMBOL(lz4k_version);
+
+MODULE_LICENSE("Dual BSD/GPL");
+MODULE_DESCRIPTION("LZ4K encoder");
diff --git a/lib/lz4k/lz4k_encode_private.h b/lib/lz4k/lz4k_encode_private.h
new file mode 100644
index 000000000000..eb5cd162468f
--- /dev/null
+++ b/lib/lz4k/lz4k_encode_private.h
@@ -0,0 +1,137 @@
+/*
+ * Copyright (c) Huawei Technologies Co., Ltd. 2012-2020. All rights reserved.
+ * Description: LZ4K compression algorithm
+ * Author: Aleksei Romanovskii aleksei.romanovskii(a)huawei.com
+ * Created: 2020-03-25
+ */
+
+#ifndef _LZ4K_ENCODE_PRIVATE_H
+#define _LZ4K_ENCODE_PRIVATE_H
+
+#include "lz4k_private.h"
+
+/* <nrSize bytes for whole block>+<1 terminating 0 byte> */
+static inline uint_fast32_t size_bytes_count(uint_fast32_t u)
+{
+ return (u + BYTE_MAX - 1) / BYTE_MAX;
+}
+
+/* minimum encoded size for non-compressible data */
+static inline uint_fast32_t encoded_bytes_min(
+ uint_fast32_t nr_log2,
+ uint_fast32_t in_max)
+{
+ return in_max < mask(nr_log2) ?
+ TAG_BYTES_MAX + in_max :
+ TAG_BYTES_MAX + size_bytes_count(in_max - mask(nr_log2)) + in_max;
+}
+
+enum {
+ NR_COPY_LOG2 = 4,
+ NR_COPY_MIN = 1 << NR_COPY_LOG2
+};
+
+static inline uint_fast32_t u_32(int64_t i)
+{
+ return (uint_fast32_t)i;
+}
+
+/* maximum encoded size for non-compressible data if "fast" encoder is used */
+static inline uint_fast32_t encoded_bytes_max(
+ uint_fast32_t nr_log2,
+ uint_fast32_t in_max)
+{
+ uint_fast32_t r = TAG_BYTES_MAX + (uint32_t)round_up_to_log2(in_max, NR_COPY_LOG2);
+ return in_max < mask(nr_log2) ? r : r + size_bytes_count(in_max - mask(nr_log2));
+}
+
+enum {
+ HT_LOG2 = 12
+};
+
+/*
+ * Compressed data format (where {} means 0 or more occurrences, [] means
+ * optional):
+ * <24bits tag: (off_log2 rOffset| r_log2 rSize|nr_log2 nrSize)>
+ * {<nrSize byte>}[<nr bytes>]{<rSize byte>}
+ * <rSize byte> and <nrSize byte> bytes are terminated by byte != 255
+ *
+ */
+
+static inline void update_utag(
+ uint_fast32_t r_bytes_max,
+ uint_fast32_t *utag,
+ const uint_fast32_t nr_log2,
+ const uint_fast32_t off_log2)
+{
+ const uint_fast32_t r_mask = mask(TAG_BITS_MAX - (off_log2 + nr_log2));
+ *utag |= likely(r_bytes_max - REPEAT_MIN < r_mask) ?
+ ((r_bytes_max - REPEAT_MIN) << off_log2) : (r_mask << off_log2);
+}
+
+static inline const uint8_t *hashed(
+ const uint8_t *const in0,
+ uint16_t *const ht,
+ uint_fast32_t h,
+ const uint8_t *r)
+{
+ const uint8_t *q = in0 + ht[h];
+ ht[h] = (uint16_t)(r - in0);
+ return q;
+}
+
+static inline const uint8_t *repeat_start(
+ const uint8_t *q,
+ const uint8_t *r,
+ const uint8_t *const nr0,
+ const uint8_t *const in0)
+{
+ for (; r > nr0 && likely(q > in0) && unlikely(q[-1] == r[-1]); --q, --r);
+ return r;
+}
+
+int lz4k_out_tail(
+ uint8_t *out_at,
+ uint8_t *const out_end,
+ const uint8_t *const out,
+ const uint8_t *const nr0,
+ const uint8_t *const in_end,
+ const uint_fast32_t nr_log2,
+ const uint_fast32_t off_log2,
+ bool check_out);
+
+uint8_t *lz4k_out_non_repeat(
+ uint8_t *out_at,
+ uint8_t *const out_end,
+ uint_fast32_t utag,
+ const uint8_t *const nr0,
+ const uint8_t *const r,
+ const uint_fast32_t nr_log2,
+ const uint_fast32_t off_log2,
+ bool check_out);
+
+uint8_t *lz4k_out_r_bytes_left(
+ uint8_t *out_at,
+ uint8_t *const out_end,
+ uint_fast32_t r_bytes_max,
+ const uint_fast32_t nr_log2,
+ const uint_fast32_t off_log2,
+ const bool check_out);
+
+uint8_t *lz4k_out_repeat(
+ uint8_t *out_at,
+ uint8_t *const out_end,
+ uint_fast32_t utag,
+ uint_fast32_t r_bytes_max,
+ const uint_fast32_t nr_log2,
+ const uint_fast32_t off_log2,
+ const bool check_out);
+
+const uint8_t *lz4k_repeat_end(
+ const uint8_t *q,
+ const uint8_t *r,
+ const uint8_t *const in_end_safe,
+ const uint8_t *const in_end);
+
+#endif /* _LZ4K_ENCODE_PRIVATE_H */
+
diff --git a/lib/lz4k/lz4k_private.h b/lib/lz4k/lz4k_private.h
new file mode 100644
index 000000000000..2a8f4b37dc74
--- /dev/null
+++ b/lib/lz4k/lz4k_private.h
@@ -0,0 +1,269 @@
+/*
+ * Copyright (c) Huawei Technologies Co., Ltd. 2012-2020. All rights reserved.
+ * Description: LZ4K compression algorithm
+ * Author: Aleksei Romanovskii aleksei.romanovskii(a)huawei.com
+ * Created: 2020-03-25
+ */
+
+#ifndef _LZ4K_PRIVATE_H
+#define _LZ4K_PRIVATE_H
+
+#if !defined(__KERNEL__)
+
+#include "lz4k.h"
+#include <stdint.h> /* uint*_t */
+#define __STDC_WANT_LIB_EXT1__ 1
+#include <string.h> /* memcpy() */
+
+#define likely(e) __builtin_expect(e, 1)
+#define unlikely(e) __builtin_expect(e, 0)
+
+#else /* __KERNEL__ */
+
+#include <linux/lz4k.h>
+#define __STDC_WANT_LIB_EXT1__ 1
+#include <linux/string.h> /* memcpy() */
+#include <linux/types.h> /* uint8_t, int8_t, uint16_t, int16_t,
+uint32_t, int32_t, uint64_t, int64_t */
+#include <stddef.h>
+
+typedef uint64_t uint_fast32_t;
+typedef int64_t int_fast32_t;
+
+#endif /* __KERNEL__ */
+
+#if defined(__GNUC__) && (__GNUC__>=4)
+#define LZ4K_WITH_GCC_INTRINSICS
+#endif
+
+#if !defined(__GNUC__)
+#define __builtin_expect(e, v) (e)
+#endif /* defined(__GNUC__) */
+
+enum {
+ BYTE_BITS = 8,
+ BYTE_BITS_LOG2 = 3,
+ BYTE_MAX = 255U,
+ REPEAT_MIN = 4,
+ TAG_BYTES_MAX = 3,
+ TAG_BITS_MAX = TAG_BYTES_MAX * 8,
+ BLOCK_4KB_LOG2 = 12,
+ BLOCK_8KB_LOG2 = 13,
+ BLOCK_16KB_LOG2 = 14,
+ BLOCK_32KB_LOG2 = 15,
+ BLOCK_64KB_LOG2 = 16
+};
+
+static inline uint32_t mask(uint_fast32_t log2)
+{
+ return (1U << log2) - 1U;
+}
+
+static inline uint64_t mask64(uint_fast32_t log2)
+{
+ return (1ULL << log2) - 1ULL;
+}
+
+#if defined LZ4K_WITH_GCC_INTRINSICS
+static inline int most_significant_bit_of(uint64_t u)
+{
+ return (int)(__builtin_expect((u) == 0, false) ?
+ -1 : (int)(31 ^ (uint32_t)__builtin_clz((unsigned)(u))));
+}
+#else /* !defined LZ4K_WITH_GCC_INTRINSICS */
+#error undefined most_significant_bit_of(uint64_t u)
+#endif /* #if defined LZ4K_WITH_GCC_INTRINSICS */
+
+static inline uint64_t round_up_to_log2(uint64_t u, uint8_t log2)
+{
+ return (uint64_t)((u + mask64(log2)) & ~mask64(log2));
+}
+
+static inline uint64_t round_up_to_power_of2(uint64_t u)
+{
+ const int_fast32_t msb = most_significant_bit_of(u);
+ return round_up_to_log2(u, (uint8_t)msb);
+}
+
+static inline void m_copy(void *dst, const void *src, size_t total)
+{
+#if defined(__STDC_LIB_EXT1__)
+ (void)memcpy_s(dst, total, src, (total * 2) >> 1); /* *2 >> 1 to avoid bot errors */
+#else
+ (void)__builtin_memcpy(dst, src, total);
+#endif
+}
+
+static inline void m_set(void *dst, uint8_t value, size_t total)
+{
+#if defined(__STDC_LIB_EXT1__)
+ (void)memset_s(dst, total, value, (total * 2) >> 1); /* *2 >> 1 to avoid bot errors */
+#else
+ (void)__builtin_memset(dst, value, total);
+#endif
+}
+
+static inline uint32_t read4_at(const void *p)
+{
+ uint32_t result;
+ m_copy(&result, p, sizeof(result));
+ return result;
+}
+
+static inline uint64_t read8_at(const void *p)
+{
+ uint64_t result;
+ m_copy(&result, p, sizeof(result));
+ return result;
+}
+
+static inline bool equal4(const uint8_t *const q, const uint8_t *const r)
+{
+ return read4_at(q) == read4_at(r);
+}
+
+static inline bool equal3(const uint8_t *const q, const uint8_t *const r)
+{
+ return (read4_at(q) << BYTE_BITS) == (read4_at(r) << BYTE_BITS);
+}
+
+static inline uint_fast32_t hash24v(const uint64_t r, uint32_t shift)
+{
+ const uint32_t m = 3266489917U;
+ return (((uint32_t)r << BYTE_BITS) * m) >> (32 - shift);
+}
+
+static inline uint_fast32_t hash24(const uint8_t *r, uint32_t shift)
+{
+ return hash24v(read4_at(r), shift);
+}
+
+static inline uint_fast32_t hash32v_2(const uint64_t r, uint32_t shift)
+{
+ const uint32_t m = 3266489917U;
+ return ((uint32_t)r * m) >> (32 - shift);
+}
+
+static inline uint_fast32_t hash32_2(const uint8_t *r, uint32_t shift)
+{
+ return hash32v_2(read4_at(r), shift);
+}
+
+static inline uint_fast32_t hash32v(const uint64_t r, uint32_t shift)
+{
+ const uint32_t m = 2654435761U;
+ return ((uint32_t)r * m) >> (32 - shift);
+}
+
+static inline uint_fast32_t hash32(const uint8_t *r, uint32_t shift)
+{
+ return hash32v(read4_at(r), shift);
+}
+
+static inline uint_fast32_t hash64v_5b(const uint64_t r, uint32_t shift)
+{
+ const uint64_t m = 889523592379ULL;
+ return (uint32_t)(((r << 24) * m) >> (64 - shift));
+}
+
+static inline uint_fast32_t hash64_5b(const uint8_t *r, uint32_t shift)
+{
+ return hash64v_5b(read8_at(r), shift);
+}
+
+static inline uint_fast32_t hash64v_6b(const uint64_t r, uint32_t shift)
+{
+ const uint64_t m = 227718039650203ULL;
+ return (uint32_t)(((r << 16) * m) >> (64 - shift));
+}
+
+static inline uint_fast32_t hash64_6b(const uint8_t *r, uint32_t shift)
+{
+ return hash64v_6b(read8_at(r), shift);
+}
+
+static inline uint_fast32_t hash64v_7b(const uint64_t r, uint32_t shift)
+{
+ const uint64_t m = 58295818150454627ULL;
+ return (uint32_t)(((r << 8) * m) >> (64 - shift));
+}
+
+static inline uint_fast32_t hash64_7b(const uint8_t *r, uint32_t shift)
+{
+ return hash64v_7b(read8_at(r), shift);
+}
+
+static inline uint_fast32_t hash64v_8b(const uint64_t r, uint32_t shift)
+{
+ const uint64_t m = 2870177450012600261ULL;
+ return (uint32_t)((r * m) >> (64 - shift));
+}
+
+static inline uint_fast32_t hash64_8b(const uint8_t *r, uint32_t shift)
+{
+ return hash64v_8b(read8_at(r), shift);
+}
+
+static inline void while_lt_copy_x(
+ uint8_t *dst,
+ const uint8_t *src,
+ const uint8_t *dst_end,
+ const size_t copy_min)
+{
+ for (; dst < dst_end; dst += copy_min, src += copy_min)
+ m_copy(dst, src, copy_min);
+}
+
+static inline void copy_x_while_lt(
+ uint8_t *dst,
+ const uint8_t *src,
+ const uint8_t *dst_end,
+ const size_t copy_min)
+{
+ m_copy(dst, src, copy_min);
+ while (dst + copy_min < dst_end)
+ m_copy(dst += copy_min, src += copy_min, copy_min);
+}
+
+static inline void copy_x_while_total(
+ uint8_t *dst,
+ const uint8_t *src,
+ size_t total,
+ const size_t copy_min)
+{
+ m_copy(dst, src, copy_min);
+ for (; total > copy_min; total -= copy_min)
+ m_copy(dst += copy_min, src += copy_min, copy_min);
+}
+
+static inline void copy_2x(
+ uint8_t *dst,
+ const uint8_t *src,
+ const size_t copy_min)
+{
+ m_copy(dst, src, copy_min);
+ m_copy(dst + copy_min, src + copy_min, copy_min);
+}
+
+static inline void copy_2x_as_x2_while_lt(
+ uint8_t *dst,
+ const uint8_t *src,
+ const uint8_t *dst_end,
+ const size_t copy_min)
+{
+ copy_2x(dst, src, copy_min);
+ while (dst + (copy_min << 1) < dst_end)
+ copy_2x(dst += (copy_min << 1), src += (copy_min << 1), copy_min);
+}
+
+static inline void while_lt_copy_2x_as_x2(
+ uint8_t *dst,
+ const uint8_t *src,
+ const uint8_t *dst_end,
+ const size_t copy_min)
+{
+ for (; dst < dst_end; dst += (copy_min << 1), src += (copy_min << 1))
+ copy_2x(dst, src, copy_min);
+}
+
+#endif /* _LZ4K_PRIVATE_H */
--
2.25.1
Patches 1-6 fix some recently found problems.
Patches 7-8 are backported from mainline.
Darrick J. Wong (1):
xfs: fix uninitialized variable access
Dave Chinner (1):
xfs: set XFS_FEAT_NLINK correctly
Long Li (4):
xfs: factor out xfs_defer_pending_abort
xfs: don't leak intent item when recovery intents fail
xfs: factor out xfs_destroy_perag()
xfs: don't leak perag when growfs fails
Ye Bin (1):
xfs: fix warning in xfs_vm_writepages()
yangerkun (1):
xfs: fix mounting failed caused by sequencing problem in the log
records
fs/xfs/libxfs/xfs_defer.c | 26 +++++++++++++++++---------
fs/xfs/libxfs/xfs_defer.h | 1 +
fs/xfs/libxfs/xfs_log_recover.h | 1 +
fs/xfs/libxfs/xfs_sb.c | 2 ++
fs/xfs/xfs_buf_item_recover.c | 2 ++
fs/xfs/xfs_fsmap.c | 1 +
fs/xfs/xfs_fsops.c | 5 ++++-
fs/xfs/xfs_icache.c | 6 ++++++
fs/xfs/xfs_log_recover.c | 27 ++++++++++++++++++++++++---
fs/xfs/xfs_mount.c | 30 ++++++++++++++++++++++--------
fs/xfs/xfs_mount.h | 2 ++
11 files changed, 82 insertions(+), 21 deletions(-)
--
2.31.1
[PATCH OLK-5.10] media: dvb-core: Fix kernel WARNING for blocking operation in wait_event*()
by Chen Jiahao 29 Jun '23
29 Jun '23
From: Takashi Iwai <tiwai(a)suse.de>
mainline inclusion
from mainline-v6.4-rc3
commit b8c75e4a1b325ea0a9433fa8834be97b5836b946
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6YKXB
CVE: CVE-2023-31084
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
Using a semaphore in the wait_event*() condition is no good idea.
It hits a kernel WARN_ON() at prepare_to_wait_event() like:
do not call blocking ops when !TASK_RUNNING; state=1 set at
prepare_to_wait_event+0x6d/0x690
For avoiding the potential deadlock, rewrite to an open-coded loop
instead. Unlike the loop in wait_event*(), this uses wait_woken()
after the condition check, hence the task state stays consistent.
CVE-2023-31084 was assigned to this bug.
Link: https://lore.kernel.org/r/CA+UBctCu7fXn4q41O_3=id1+OdyQ85tZY1x+TkT-6OVBL6KA…
Link: https://lore.kernel.org/linux-media/20230512151800.1874-1-tiwai@suse.de
Reported-by: Yu Hao <yhao016(a)ucr.edu>
Closes: https://nvd.nist.gov/vuln/detail/CVE-2023-31084
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab(a)kernel.org>
Signed-off-by: Chen Jiahao <chenjiahao16(a)huawei.com>
---
drivers/media/dvb-core/dvb_frontend.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/drivers/media/dvb-core/dvb_frontend.c b/drivers/media/dvb-core/dvb_frontend.c
index 06ea30a689d7..579cddec55e5 100644
--- a/drivers/media/dvb-core/dvb_frontend.c
+++ b/drivers/media/dvb-core/dvb_frontend.c
@@ -292,14 +292,22 @@ static int dvb_frontend_get_event(struct dvb_frontend *fe,
}
if (events->eventw == events->eventr) {
- int ret;
+ struct wait_queue_entry wait;
+ int ret = 0;
if (flags & O_NONBLOCK)
return -EWOULDBLOCK;
- ret = wait_event_interruptible(events->wait_queue,
- dvb_frontend_test_event(fepriv, events));
-
+ init_waitqueue_entry(&wait, current);
+ add_wait_queue(&events->wait_queue, &wait);
+ while (!dvb_frontend_test_event(fepriv, events)) {
+ wait_woken(&wait, TASK_INTERRUPTIBLE, 0);
+ if (signal_pending(current)) {
+ ret = -ERESTARTSYS;
+ break;
+ }
+ }
+ remove_wait_queue(&events->wait_queue, &wait);
if (ret < 0)
return ret;
}
--
2.34.1
[PATCH openEuler-1.0-LTS 1/2] scsi: hisi_sas: Fix normally completed I/O analysed as failed
by Yongqiang Liu 29 Jun '23
29 Jun '23
From: Xingui Yang <yangxingui(a)huawei.com>
driver inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7GS2V
CVE: NA
-------------------------------------------------------------------
pio read command has no response frame and the struct iu[1024] won't be
filled, and it's found that I/Os that are normally completed will be
analysed as failed in sas_ata_task_done() when iu contain abnormal dirty
data. So ending_fis should not be filled by iu when the response frame
hasn't been written to the memory.
Fixes: d380f55503ed ("scsi: hisi_sas: Don't bother clearing status buffer IU in task prep")
Signed-off-by: Xingui Yang <yangxingui(a)huawei.com>
Reviewed-by: Xiang Chen <chenxiang66(a)hisilicon.com>
Reviewed-by: kang fenglong <kangfenglong(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
drivers/scsi/hisi_sas/hisi_sas_v2_hw.c | 23 ++++++++++++++++-------
drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 23 +++++++++++++++--------
2 files changed, 31 insertions(+), 15 deletions(-)
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
index 2df70c4873f5..f9867176fa14 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
@@ -2033,6 +2033,11 @@ static void slot_err_v2_hw(struct hisi_hba *hisi_hba,
u16 dma_tx_err_type = cpu_to_le16(err_record->dma_tx_err_type);
u16 sipc_rx_err_type = cpu_to_le16(err_record->sipc_rx_err_type);
u32 dma_rx_err_type = cpu_to_le32(err_record->dma_rx_err_type);
+ struct hisi_sas_complete_v2_hdr *complete_queue =
+ hisi_hba->complete_hdr[slot->cmplt_queue];
+ struct hisi_sas_complete_v2_hdr *complete_hdr =
+ &complete_queue[slot->cmplt_queue_slot];
+ u32 dw0 = le32_to_cpu(complete_hdr->dw0);
int error = -1;
if (err_phase == 1) {
@@ -2318,7 +2323,8 @@ static void slot_err_v2_hw(struct hisi_hba *hisi_hba,
break;
}
}
- hisi_sas_sata_done(task, slot);
+ if (dw0 & CMPLT_HDR_RSPNS_XFRD_MSK)
+ hisi_sas_sata_done(task, slot);
}
break;
default:
@@ -2342,6 +2348,7 @@ slot_complete_v2_hw(struct hisi_hba *hisi_hba, struct hisi_sas_slot *slot)
&complete_queue[slot->cmplt_queue_slot];
unsigned long flags;
bool is_internal = slot->is_internal;
+ u32 dw0;
if (unlikely(!task || !task->lldd_task || !task->dev))
return -EINVAL;
@@ -2366,7 +2373,8 @@ slot_complete_v2_hw(struct hisi_hba *hisi_hba, struct hisi_sas_slot *slot)
}
/* Use SAS+TMF status codes */
- switch ((complete_hdr->dw0 & CMPLT_HDR_ABORT_STAT_MSK)
+ dw0 = le32_to_cpu(complete_hdr->dw0);
+ switch ((dw0 & CMPLT_HDR_ABORT_STAT_MSK)
>> CMPLT_HDR_ABORT_STAT_OFF) {
case STAT_IO_ABORTED:
/* this io has been aborted by abort command */
@@ -2392,9 +2400,9 @@ slot_complete_v2_hw(struct hisi_hba *hisi_hba, struct hisi_sas_slot *slot)
break;
}
- if ((complete_hdr->dw0 & CMPLT_HDR_ERX_MSK) &&
- (!(complete_hdr->dw0 & CMPLT_HDR_RSPNS_XFRD_MSK))) {
- u32 err_phase = (complete_hdr->dw0 & CMPLT_HDR_ERR_PHASE_MSK)
+ if ((dw0 & CMPLT_HDR_ERX_MSK) &&
+ (!(dw0 & CMPLT_HDR_RSPNS_XFRD_MSK))) {
+ u32 err_phase = (dw0 & CMPLT_HDR_ERR_PHASE_MSK)
>> CMPLT_HDR_ERR_PHASE_OFF;
u32 *error_info = hisi_sas_status_buf_addr_mem(slot);
@@ -2409,7 +2417,7 @@ slot_complete_v2_hw(struct hisi_hba *hisi_hba, struct hisi_sas_slot *slot)
"CQ hdr: 0x%x 0x%x 0x%x 0x%x "
"Error info: 0x%x 0x%x 0x%x 0x%x\n",
slot->idx, task, sas_dev->device_id,
- complete_hdr->dw0, complete_hdr->dw1,
+ dw0, complete_hdr->dw1,
complete_hdr->act, complete_hdr->dw3,
error_info[0], error_info[1],
error_info[2], error_info[3]);
@@ -2456,7 +2464,8 @@ slot_complete_v2_hw(struct hisi_hba *hisi_hba, struct hisi_sas_slot *slot)
case SAS_PROTOCOL_SATA | SAS_PROTOCOL_STP:
{
ts->stat = SAM_STAT_GOOD;
- hisi_sas_sata_done(task, slot);
+ if (dw0 & CMPLT_HDR_RSPNS_XFRD_MSK)
+ hisi_sas_sata_done(task, slot);
break;
}
default:
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
index 5e2488fd2466..f79060fca001 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
@@ -2353,7 +2353,8 @@ slot_err_v3_hw(struct hisi_hba *hisi_hba, struct sas_task *task,
ts->stat = SAS_OPEN_REJECT;
ts->open_rej_reason = SAS_OREJ_RSVD_RETRY;
}
- hisi_sas_sata_done(task, slot);
+ if (dw0 & CMPLT_HDR_RSPNS_XFRD_MSK)
+ hisi_sas_sata_done(task, slot);
break;
case SAS_PROTOCOL_SMP:
ts->stat = SAM_STAT_CHECK_CONDITION;
@@ -2402,6 +2403,7 @@ slot_complete_v3_hw(struct hisi_hba *hisi_hba, struct hisi_sas_slot *slot)
&complete_queue[slot->cmplt_queue_slot];
unsigned long flags;
bool is_internal = slot->is_internal;
+ u32 dw0, dw1, dw3;
if (unlikely(!task || !task->lldd_task || !task->dev))
return -EINVAL;
@@ -2425,10 +2427,14 @@ slot_complete_v3_hw(struct hisi_hba *hisi_hba, struct hisi_sas_slot *slot)
goto out;
}
+ dw0 = le32_to_cpu(complete_hdr->dw0);
+ dw1 = le32_to_cpu(complete_hdr->dw1);
+ dw3 = le32_to_cpu(complete_hdr->dw3);
+
/*
* Use SAS+TMF status codes
*/
- switch ((complete_hdr->dw0 & CMPLT_HDR_ABORT_STAT_MSK)
+ switch ((dw0 & CMPLT_HDR_ABORT_STAT_MSK)
>> CMPLT_HDR_ABORT_STAT_OFF) {
case STAT_IO_ABORTED:
/* this IO has been aborted by abort command */
@@ -2452,9 +2458,9 @@ slot_complete_v3_hw(struct hisi_hba *hisi_hba, struct hisi_sas_slot *slot)
}
/* check for erroneous completion, 0x3 means abnormal */
- if ((complete_hdr->dw0 & CMPLT_HDR_CMPLT_MSK) == 0x3) {
+ if ((dw0 & CMPLT_HDR_CMPLT_MSK) == 0x3) {
u32 *error_info = hisi_sas_status_buf_addr_mem(slot);
- u32 device_id = (complete_hdr->dw1 & 0xffff0000) >> 16;
+ u32 device_id = (dw1 & 0xffff0000) >> 16;
struct hisi_sas_itct *itct = &hisi_hba->itct[device_id];
set_aborted_iptt(hisi_hba, slot);
@@ -2464,12 +2470,12 @@ slot_complete_v3_hw(struct hisi_hba *hisi_hba, struct hisi_sas_slot *slot)
"Error info: 0x%x 0x%x 0x%x 0x%x\n",
slot->idx, task, sas_dev->device_id,
itct->sas_addr,
- complete_hdr->dw0, complete_hdr->dw1,
- complete_hdr->act, complete_hdr->dw3,
+ dw0, dw1,
+ complete_hdr->act, dw3,
error_info[0], error_info[1],
error_info[2], error_info[3]);
- if ((complete_hdr->dw0 & CMPLT_HDR_RSPNS_XFRD_MSK) &&
+ if ((dw0 & CMPLT_HDR_RSPNS_XFRD_MSK) &&
(task->task_proto & SAS_PROTOCOL_SATA ||
task->task_proto & SAS_PROTOCOL_STP)) {
struct hisi_sas_status_buffer *status_buf =
@@ -2559,7 +2565,8 @@ slot_complete_v3_hw(struct hisi_hba *hisi_hba, struct hisi_sas_slot *slot)
case SAS_PROTOCOL_STP:
case SAS_PROTOCOL_SATA | SAS_PROTOCOL_STP:
ts->stat = SAM_STAT_GOOD;
- hisi_sas_sata_done(task, slot);
+ if (dw0 & CMPLT_HDR_RSPNS_XFRD_MSK)
+ hisi_sas_sata_done(task, slot);
break;
default:
ts->stat = SAM_STAT_CHECK_CONDITION;
--
2.25.1
[PATCH openEuler-22.03-LTS] arm64: Add AMPERE1 to the Spectre-BHB affected list
by Lin Yujun 29 Jun '23
by Lin Yujun 29 Jun '23
29 Jun '23
From: D Scott Phillips <scott(a)os.amperecomputing.com>
stable inclusion
from stable-v5.10.153
commit 52a43b82006dc88f996bd06da5a3fcfef85220c8
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I64YCA
CVE: CVE-2023-3006
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
[ Upstream commit 0e5d5ae837c8ce04d2ddb874ec5f920118bd9d31 ]
Per AmpereOne erratum AC03_CPU_12, "Branch history may allow control of
speculative execution across software contexts," the AMPERE1 core needs the
bhb clearing loop to mitigate Spectre-BHB, with a loop iteration count of
11.
Signed-off-by: D Scott Phillips <scott(a)os.amperecomputing.com>
Link: https://lore.kernel.org/r/20221011022140.432370-1-scott@os.amperecomputing.…
Reviewed-by: James Morse <james.morse(a)arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas(a)arm.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
conflicts:
arch/arm64/include/asm/cputype.h
Signed-off-by: Lin Yujun <linyujun809(a)huawei.com>
---
arch/arm64/include/asm/cputype.h | 4 ++++
arch/arm64/kernel/proton-pack.c | 6 ++++++
2 files changed, 10 insertions(+)
diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h
index 662708c56397..812781fba3f9 100644
--- a/arch/arm64/include/asm/cputype.h
+++ b/arch/arm64/include/asm/cputype.h
@@ -61,6 +61,7 @@
#define ARM_CPU_IMP_HISI 0x48
#define ARM_CPU_IMP_PHYTIUM 0x70
#define ARM_CPU_IMP_APPLE 0x61
+#define ARM_CPU_IMP_AMPERE 0xC0
#define ARM_CPU_PART_AEM_V8 0xD0F
#define ARM_CPU_PART_FOUNDATION 0xD00
@@ -120,6 +121,8 @@
#define APPLE_CPU_PART_M1_ICESTORM 0x022
#define APPLE_CPU_PART_M1_FIRESTORM 0x023
+#define AMPERE_CPU_PART_AMPERE1 0xAC3
+
#define MIDR_CORTEX_A53 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A53)
#define MIDR_CORTEX_A57 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A57)
#define MIDR_CORTEX_A72 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A72)
@@ -165,6 +168,7 @@
#define MIDR_FT_2500 MIDR_CPU_MODEL(ARM_CPU_IMP_PHYTIUM, PHYTIUM_CPU_PART_2500)
#define MIDR_APPLE_M1_ICESTORM MIDR_CPU_MODEL(ARM_CPU_IMP_APPLE, APPLE_CPU_PART_M1_ICESTORM)
#define MIDR_APPLE_M1_FIRESTORM MIDR_CPU_MODEL(ARM_CPU_IMP_APPLE, APPLE_CPU_PART_M1_FIRESTORM)
+#define MIDR_AMPERE1 MIDR_CPU_MODEL(ARM_CPU_IMP_AMPERE, AMPERE_CPU_PART_AMPERE1)
/* Fujitsu Erratum 010001 affects A64FX 1.0 and 1.1, (v0r0 and v1r0) */
#define MIDR_FUJITSU_ERRATUM_010001 MIDR_FUJITSU_A64FX
diff --git a/arch/arm64/kernel/proton-pack.c b/arch/arm64/kernel/proton-pack.c
index e807f77737e0..9c95d4955b6e 100644
--- a/arch/arm64/kernel/proton-pack.c
+++ b/arch/arm64/kernel/proton-pack.c
@@ -873,6 +873,10 @@ u8 spectre_bhb_loop_affected(int scope)
MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N1),
{},
};
+ static const struct midr_range spectre_bhb_k11_list[] = {
+ MIDR_ALL_VERSIONS(MIDR_AMPERE1),
+ {},
+ };
static const struct midr_range spectre_bhb_k8_list[] = {
MIDR_ALL_VERSIONS(MIDR_CORTEX_A72),
MIDR_ALL_VERSIONS(MIDR_CORTEX_A57),
@@ -883,6 +887,8 @@ u8 spectre_bhb_loop_affected(int scope)
k = 32;
else if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k24_list))
k = 24;
+ else if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k11_list))
+ k = 11;
else if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k8_list))
k = 8;
--
2.34.1
From: Xia Fukun <xiafukun(a)huawei.com>
stable inclusion
from stable-v4.19.287
commit c746a0b9210cebb29511f01d2becf240408327bf
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I7F2UT
CVE: CVE-2023-3220
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=…
--------------------------------
[ Upstream commit 93340e10b9c5fc86730d149636e0aa8b47bb5a34 ]
As kzalloc() may fail and return a NULL pointer,
pstates should be checked before use
to avoid a NULL pointer dereference.
Fixes: 25fdd5933e4c ("drm/msm: Add SDM845 DPU support")
Signed-off-by: Jiasheng Jiang <jiasheng(a)iscas.ac.cn>
Reviewed-by: Abhinav Kumar <quic_abhinavk(a)quicinc.com>
Patchwork: https://patchwork.freedesktop.org/patch/514160/
Link: https://lore.kernel.org/r/20221206080236.43687-1-jiasheng@iscas.ac.cn
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov(a)linaro.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
Signed-off-by: Xia Fukun <xiafukun(a)huawei.com>
Reviewed-by: Zhang Qiao <zhangqiao22(a)huawei.com>
Reviewed-by: zheng zucheng <zhengzucheng(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
index 4752f08f0884..5852e1d356e1 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
@@ -1477,6 +1477,8 @@ static int dpu_crtc_atomic_check(struct drm_crtc *crtc,
}
pstates = kzalloc(sizeof(*pstates) * DPU_STAGE_MAX * 4, GFP_KERNEL);
+ if (!pstates)
+ return -ENOMEM;
dpu_crtc = to_dpu_crtc(crtc);
cstate = to_dpu_crtc_state(state);
--
2.25.1