Kernel

Re: [PATCH openEuler-1.0-LTS V3 09/92] KVM: x86: expose MOVDIRI CPU feature into VM.
by Zenghui Yu 26 Jan '22
Reviewed-by: Zenghui Yu <yuzenghui(a)huawei.com>

Re: [PATCH openEuler-1.0-LTS V3 08/92] KVM: x86: Add requisite includes to hyperv.h
by Zenghui Yu 26 Jan '22
Reviewed-by: Zenghui Yu <yuzenghui(a)huawei.com>

Re: [PATCH openEuler-1.0-LTS V3 07/92] KVM: x86: Add requisite includes to kvm_cache_regs.h
by Zenghui Yu 26 Jan '22
Reviewed-by: Zenghui Yu <yuzenghui(a)huawei.com>

Re: [PATCH openEuler-1.0-LTS V3 06/92] KVM: nVMX: Allocate and configure VM{READ,WRITE} bitmaps iff enable_shadow_vmcs
by Zenghui Yu 26 Jan '22
Reviewed-by: Zenghui Yu <yuzenghui(a)huawei.com>
uniontech inclusion
category: cleanup
CVE: NA
'left' is always 0 here.
If it were non-zero, the preceding if statement would already have jumped to 'goto out;'.
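As a rough illustration (a paraphrased sketch of the surrounding flow, not the exact code in fs/eulerfs/dax.c):

	left = copy_to_user(...);       /* a short copy leaves 'left' != 0 */
	if (left) {
		error = -EFAULT;
		goto out;               /* bail out before the accounting */
	}
	/* reaching here implies left == 0, so (nr - left) == nr */
	copied += nr;
	offset += nr;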
---
Signed-off-by: Gou Hao <gouhao(a)uniontech.com>
---
fs/eulerfs/dax.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/eulerfs/dax.c b/fs/eulerfs/dax.c
index 9ec8ad7..131d08f 100644
--- a/fs/eulerfs/dax.c
+++ b/fs/eulerfs/dax.c
@@ -1172,8 +1172,8 @@ static ssize_t do_mapping_read(struct address_space *mapping,
goto out;
}
- copied += (nr - left);
- offset += (nr - left);
+ copied += nr;
+ offset += nr;
index += offset >> PAGE_SHIFT;
offset &= ~PAGE_MASK;
} while (copied < len);
--
2.20.1
openEuler inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4OFDI
CVE: NA
--------------------------------
Chao Liu (2):
change arm64 configs
change x86 configs
arch/arm64/configs/openeuler_defconfig | 265 +++++++-----
arch/x86/configs/openeuler_defconfig | 549 +++++++++----------------
2 files changed, 349 insertions(+), 465 deletions(-)
--
2.27.0

[PATCH openEuler-1.0-LTS 1/3] etmem: add original kernel swap enabled options
by Yang Yingliang 26 Jan '22
From: liubo <liubo254(a)huawei.com>
euleros inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4QVXW
CVE: NA
-------------------------------------------------
etmem, a memory vertical-expansion technology, combines DRAM and
new high-performance storage media to form multi-level memory
storage.
By grading the stored data, etmem migrates data classified as cold
from DRAM to the high-performance storage medium, expanding
effective memory capacity while reducing memory cost.
When the memory expansion function etmem is running, the native
swap function of the kernel needs to be disabled in certain
scenarios to avoid interference from kernel swap.
This patch provides that control.
The /sys/kernel/mm/swap/ directory provides the kernel_swap_enable
sysfs interface to enable or disable the native swap function
of the kernel.
The default value of /sys/kernel/mm/swap/kernel_swap_enable is true,
that is, kernel swap is enabled by default.
Turn on kernel swap:
echo true > /sys/kernel/mm/swap/kernel_swap_enable
Turn off kernel swap:
echo false > /sys/kernel/mm/swap/kernel_swap_enable
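Query the current setting (a usage note; per the show handler below,
the attribute prints "true" or "false"):
cat /sys/kernel/mm/swap/kernel_swap_enable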
Signed-off-by: liubo <liubo254(a)huawei.com>
Reviewed-by: Miaohe Lin <linmiaohe(a)huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
include/linux/swap.h | 1 +
mm/swap_state.c | 29 +++++++++++++++++++++++++++++
mm/vmscan.c | 18 ++++++++++++++++++
3 files changed, 48 insertions(+)
diff --git a/include/linux/swap.h b/include/linux/swap.h
index b7cfad35987a2..23549741336a4 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -476,6 +476,7 @@ extern struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t flag,
struct vm_fault *vmf);
extern struct page *swapin_readahead(swp_entry_t entry, gfp_t flag,
struct vm_fault *vmf);
+extern bool kernel_swap_enabled(void);
/* linux/mm/swapfile.c */
extern atomic_long_t nr_swap_pages;
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 2137e2d571965..1527ac72928b6 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -40,6 +40,7 @@ static const struct address_space_operations swap_aops = {
struct address_space *swapper_spaces[MAX_SWAPFILES] __read_mostly;
static unsigned int nr_swapper_spaces[MAX_SWAPFILES] __read_mostly;
static bool enable_vma_readahead __read_mostly = true;
+static bool enable_kernel_swap __read_mostly = true;
#define SWAP_RA_WIN_SHIFT (PAGE_SHIFT / 2)
#define SWAP_RA_HITS_MASK ((1UL << SWAP_RA_WIN_SHIFT) - 1)
@@ -326,6 +327,11 @@ static inline bool swap_use_vma_readahead(void)
return READ_ONCE(enable_vma_readahead) && !atomic_read(&nr_rotate_swap);
}
+bool kernel_swap_enabled(void)
+{
+ return READ_ONCE(enable_kernel_swap);
+}
+
/*
* Lookup a swap entry in the swap cache. A found page will be returned
* unlocked and with its refcount incremented - we rely on the kernel
@@ -828,8 +834,31 @@ static struct kobj_attribute vma_ra_enabled_attr =
__ATTR(vma_ra_enabled, 0644, vma_ra_enabled_show,
vma_ra_enabled_store);
+static ssize_t kernel_swap_enable_show(struct kobject *kobj,
+ struct kobj_attribute *attr, char *buf)
+{
+ return sprintf(buf, "%s\n", enable_kernel_swap ? "true" : "false");
+}
+static ssize_t kernel_swap_enable_store(struct kobject *kobj,
+ struct kobj_attribute *attr,
+ const char *buf, size_t count)
+{
+ if (!strncmp(buf, "true", 4) || !strncmp(buf, "1", 1))
+ WRITE_ONCE(enable_kernel_swap, true);
+ else if (!strncmp(buf, "false", 5) || !strncmp(buf, "0", 1))
+ WRITE_ONCE(enable_kernel_swap, false);
+ else
+ return -EINVAL;
+
+ return count;
+}
+static struct kobj_attribute kernel_swap_enable_attr =
+ __ATTR(kernel_swap_enable, 0644, kernel_swap_enable_show,
+ kernel_swap_enable_store);
+
static struct attribute *swap_attrs[] = {
&vma_ra_enabled_attr.attr,
+ &kernel_swap_enable_attr.attr,
NULL,
};
diff --git a/mm/vmscan.c b/mm/vmscan.c
index e8befe70d2800..2676d6cf2ccac 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3248,6 +3248,16 @@ static bool throttle_direct_reclaim(gfp_t gfp_mask, struct zonelist *zonelist,
return false;
}
+/*
+ * Check if original kernel swap is enabled
+ * turn off kernel swap,but leave page cache reclaim on
+ */
+static inline void kernel_swap_check(struct scan_control *sc)
+{
+ if (sc != NULL && !kernel_swap_enabled())
+ sc->may_swap = 0;
+}
+
unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
gfp_t gfp_mask, nodemask_t *nodemask)
{
@@ -3264,6 +3274,7 @@ unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
.may_swap = 1,
};
+ kernel_swap_check(&sc);
/*
* scan_control uses s8 fields for order, priority, and reclaim_idx.
* Confirm they are large enough for max values.
@@ -3548,6 +3559,8 @@ static int balance_pgdat(pg_data_t *pgdat, int order, int classzone_idx)
count_vm_event(PAGEOUTRUN);
+ kernel_swap_check(&sc);
+
#ifdef CONFIG_SHRINK_PAGECACHE
if (vm_cache_limit_mbytes && page_cache_over_limit())
shrink_page_cache(GFP_KERNEL);
@@ -3963,6 +3976,8 @@ static unsigned long __shrink_page_cache(gfp_t mask)
struct zonelist *zonelist = node_zonelist(numa_node_id(), mask);
+ kernel_swap_check(&sc);
+
return do_try_to_free_pages(zonelist, &sc);
}
@@ -4282,6 +4297,9 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
cond_resched();
fs_reclaim_acquire(sc.gfp_mask);
+
+ kernel_swap_check(&sc);
+
/*
* We need to be able to allocate from the reserves for RECLAIM_UNMAP
* and we also need to be able to write out pages for RECLAIM_WRITE
--
2.25.1
openEuler inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4OFDI
CVE: NA
--------------------------------
Chao Liu (2):
change arm64 configs
change x86 configs
arch/arm64/configs/openeuler_defconfig | 260 +++++++-----
arch/x86/configs/openeuler_defconfig | 531 ++++++++-----------------
2 files changed, 337 insertions(+), 454 deletions(-)
--
2.27.0

[PATCH openEuler-1.0-LTS] net: bridge: clear bridge's private skb space on xmit
by Yang Yingliang 24 Jan '22
From: Nikolay Aleksandrov <nikolay(a)cumulusnetworks.com>
mainline inclusion
from mainline-v5.9-rc1
commit fd65e5a95d08389444e8591a20538b3edece0e15
category: bugfix
bugzilla: 186114
CVE: NA
--------------------------------
We need to clear all of the bridge private skb variables as they can be
stale due to the packet being recirculated through the stack and then
transmitted through the bridge device. Similar memset is already done on
bridge's input. We've seen cases where proxyarp_replied was 1 on routed
multicast packets transmitted through the bridge to ports with neigh
suppress which were getting dropped. Same thing can in theory happen with
the port isolation bit as well.
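For context, the bridge keeps its per-packet state in the skb control
block; a minimal sketch of the pattern involved (abridged from
net/bridge/br_private.h, field list shortened):

	struct br_input_skb_cb {
		struct net_device *brdev;
		u8 proxyarp_replied:1;
		/* further per-packet flags omitted */
	};
	#define BR_INPUT_SKB_CB(__skb)	((struct br_input_skb_cb *)(__skb)->cb)

Since skb->cb survives recirculation through the stack, these bits must
be zeroed again on transmit, which is what the memset below does.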
Fixes: 821f1b21cabb ("bridge: add new BR_NEIGH_SUPPRESS port flag to suppress arp and nd flood")
Signed-off-by: Nikolay Aleksandrov <nikolay(a)cumulusnetworks.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Huang Guobin <huangguobin4(a)huawei.com>
Reviewed-by: Yue Haibing <yuehaibing(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
net/bridge/br_device.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index 7b7784bb1cb99..9475e0443ff92 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -41,6 +41,8 @@ netdev_tx_t br_dev_xmit(struct sk_buff *skb, struct net_device *dev)
const unsigned char *dest;
u16 vid = 0;
+ memset(skb->cb, 0, sizeof(struct br_input_skb_cb));
+
rcu_read_lock();
nf_ops = rcu_dereference(nf_br_ops);
if (nf_ops && nf_ops->br_dev_xmit_hook(skb)) {
--
2.25.1

[PATCH openEuler-5.10 01/38] tcp_comp: implement sendmsg for tcp compression
by Zheng Zengkai 22 Jan '22
From: Wei Yongjun <weiyongjun1(a)huawei.com>
hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4PNEK
CVE: NA
-------------------------------------------------
This patch implements software-level compression for
sending TCP messages. All of the TCP payload is
compressed before xmit.
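At a high level the sendmsg path added below proceeds as follows
(an outline of the diff, for orientation):
1. copy the user payload into ctx->tx.plaintext_data (memcopy_from_iter)
2. ZSTD-compress it into ctx->tx.compressed_data (tcp_comp_compress_to_msg)
3. copy the compressed bytes into the sk_msg scatterlist (memcopy_to_msg)
4. transmit the pages with do_tcp_sendpages (tcp_comp_push_msg),
   retrying from tcp_comp_write_space() if the send was partial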
Signed-off-by: Wei Yongjun <weiyongjun1(a)huawei.com>
Signed-off-by: Wang Yufen <wangyufen(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
Signed-off-by: Lu Wei <luwei32(a)huawei.com>
Reviewed-by: Wang Yufen <wangyufen(a)huawei.com>
Reviewed-by: Wei Yongjun <weiyongjun1(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
net/ipv4/Kconfig | 1 +
net/ipv4/tcp_comp.c | 415 +++++++++++++++++++++++++++++++++++++++++++-
2 files changed, 412 insertions(+), 4 deletions(-)
diff --git a/net/ipv4/Kconfig b/net/ipv4/Kconfig
index 59ffbf80f7f2..22c554d3a9ab 100644
--- a/net/ipv4/Kconfig
+++ b/net/ipv4/Kconfig
@@ -745,6 +745,7 @@ config TCP_MD5SIG
config TCP_COMP
bool "TCP: Transport Layer Compression support"
+ depends on ZSTD_COMPRESS=y
help
Enable kernel payload compression support for TCP protocol. This allows
payload compression handling of the TCP protocol to be done in-kernel.
diff --git a/net/ipv4/tcp_comp.c b/net/ipv4/tcp_comp.c
index fb90be4d9e9c..e85803da3924 100644
--- a/net/ipv4/tcp_comp.c
+++ b/net/ipv4/tcp_comp.c
@@ -5,7 +5,15 @@
* Copyright(c) 2021 Huawei Technologies Co., Ltd
*/
-#include <net/tcp.h>
+#include <linux/skmsg.h>
+#include <linux/zstd.h>
+
+#define TCP_COMP_MAX_PADDING 64
+#define TCP_COMP_SCRATCH_SIZE 65400
+#define TCP_COMP_MAX_CSIZE (TCP_COMP_SCRATCH_SIZE + TCP_COMP_MAX_PADDING)
+
+#define TCP_COMP_SEND_PENDING 1
+#define ZSTD_COMP_DEFAULT_LEVEL 1
static unsigned long tcp_compression_ports[65536 / 8];
@@ -14,11 +22,37 @@ int sysctl_tcp_compression_local __read_mostly;
static struct proto tcp_prot_override;
+struct tcp_comp_context_tx {
+ ZSTD_CStream *cstream;
+ void *cworkspace;
+ void *plaintext_data;
+ void *compressed_data;
+ struct sk_msg msg;
+ bool in_tcp_sendpages;
+};
+
struct tcp_comp_context {
- struct proto *sk_proto;
struct rcu_head rcu;
+
+ struct proto *sk_proto;
+ void (*sk_write_space)(struct sock *sk);
+
+ struct tcp_comp_context_tx tx;
+
+ unsigned long flags;
};
+static bool tcp_comp_is_write_pending(struct tcp_comp_context *ctx)
+{
+ return test_bit(TCP_COMP_SEND_PENDING, &ctx->flags);
+}
+
+static void tcp_comp_err_abort(struct sock *sk, int err)
+{
+ sk->sk_err = err;
+ sk->sk_error_report(sk);
+}
+
static bool tcp_comp_enabled(__be32 saddr, __be32 daddr, int port)
{
if (!sysctl_tcp_compression_local &&
@@ -55,11 +89,341 @@ static struct tcp_comp_context *comp_get_ctx(const struct sock *sk)
return (__force void *)icsk->icsk_ulp_data;
}
-static int tcp_comp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
+static int tcp_comp_tx_context_init(struct tcp_comp_context *ctx)
+{
+ ZSTD_parameters params;
+ int csize;
+
+ params = ZSTD_getParams(ZSTD_COMP_DEFAULT_LEVEL, PAGE_SIZE, 0);
+ csize = ZSTD_CStreamWorkspaceBound(params.cParams);
+ if (csize <= 0)
+ return -EINVAL;
+
+ ctx->tx.cworkspace = kmalloc(csize, GFP_KERNEL);
+ if (!ctx->tx.cworkspace)
+ return -ENOMEM;
+
+ ctx->tx.cstream = ZSTD_initCStream(params, 0, ctx->tx.cworkspace,
+ csize);
+ if (!ctx->tx.cstream)
+ goto err_cstream;
+
+ ctx->tx.plaintext_data = kvmalloc(TCP_COMP_SCRATCH_SIZE, GFP_KERNEL);
+ if (!ctx->tx.plaintext_data)
+ goto err_cstream;
+
+ ctx->tx.compressed_data = kvmalloc(TCP_COMP_MAX_CSIZE, GFP_KERNEL);
+ if (!ctx->tx.compressed_data)
+ goto err_compressed;
+
+ return 0;
+
+err_compressed:
+ kvfree(ctx->tx.plaintext_data);
+ ctx->tx.plaintext_data = NULL;
+err_cstream:
+ kfree(ctx->tx.cworkspace);
+ ctx->tx.cworkspace = NULL;
+
+ return -ENOMEM;
+}
+
+static void *tcp_comp_get_tx_stream(struct sock *sk)
+{
+ struct tcp_comp_context *ctx = comp_get_ctx(sk);
+
+ if (!ctx->tx.plaintext_data)
+ tcp_comp_tx_context_init(ctx);
+
+ return ctx->tx.plaintext_data;
+}
+
+static int alloc_compressed_msg(struct sock *sk, int len)
+{
+ struct tcp_comp_context *ctx = comp_get_ctx(sk);
+ struct sk_msg *msg = &ctx->tx.msg;
+
+ sk_msg_init(msg);
+
+ return sk_msg_alloc(sk, msg, len, 0);
+}
+
+static int memcopy_from_iter(struct sock *sk, struct iov_iter *from, int copy)
+{
+ void *dest;
+ int rc;
+
+ dest = tcp_comp_get_tx_stream(sk);
+ if (!dest)
+ return -ENOSPC;
+
+ if (sk->sk_route_caps & NETIF_F_NOCACHE_COPY)
+ rc = copy_from_iter_nocache(dest, copy, from);
+ else
+ rc = copy_from_iter(dest, copy, from);
+
+ if (rc != copy)
+ rc = -EFAULT;
+
+ return rc;
+}
+
+static int memcopy_to_msg(struct sock *sk, int bytes)
+{
+ struct tcp_comp_context *ctx = comp_get_ctx(sk);
+ struct sk_msg *msg = &ctx->tx.msg;
+ int i = msg->sg.curr;
+ struct scatterlist *sge;
+ u32 copy, buf_size;
+ void *from, *to;
+
+ from = ctx->tx.compressed_data;
+ do {
+ sge = sk_msg_elem(msg, i);
+ /* This is possible if a trim operation shrunk the buffer */
+ if (msg->sg.copybreak >= sge->length) {
+ msg->sg.copybreak = 0;
+ sk_msg_iter_var_next(i);
+ if (i == msg->sg.end)
+ break;
+ sge = sk_msg_elem(msg, i);
+ }
+ buf_size = sge->length - msg->sg.copybreak;
+ copy = (buf_size > bytes) ? bytes : buf_size;
+ to = sg_virt(sge) + msg->sg.copybreak;
+ msg->sg.copybreak += copy;
+ memcpy(to, from, copy);
+ bytes -= copy;
+ from += copy;
+ if (!bytes)
+ break;
+ msg->sg.copybreak = 0;
+ sk_msg_iter_var_next(i);
+ } while (i != msg->sg.end);
+
+ msg->sg.curr = i;
+ return bytes;
+}
+
+static int tcp_comp_compress_to_msg(struct sock *sk, int bytes)
+{
+ struct tcp_comp_context *ctx = comp_get_ctx(sk);
+ ZSTD_outBuffer outbuf;
+ ZSTD_inBuffer inbuf;
+ size_t ret;
+
+ inbuf.src = ctx->tx.plaintext_data;
+ outbuf.dst = ctx->tx.compressed_data;
+ inbuf.size = bytes;
+ outbuf.size = TCP_COMP_MAX_CSIZE;
+ inbuf.pos = 0;
+ outbuf.pos = 0;
+
+ ret = ZSTD_compressStream(ctx->tx.cstream, &outbuf, &inbuf);
+ if (ZSTD_isError(ret))
+ return -EIO;
+
+ ret = ZSTD_flushStream(ctx->tx.cstream, &outbuf);
+ if (ZSTD_isError(ret))
+ return -EIO;
+
+ if (inbuf.pos != inbuf.size)
+ return -EIO;
+
+ if (memcopy_to_msg(sk, outbuf.pos))
+ return -EIO;
+
+ sk_msg_trim(sk, &ctx->tx.msg, outbuf.pos);
+
+ return 0;
+}
+
+static int tcp_comp_push_msg(struct sock *sk, struct sk_msg *msg, int flags)
+{
+ struct tcp_comp_context *ctx = comp_get_ctx(sk);
+ struct scatterlist *sg;
+ int ret, offset;
+ struct page *p;
+ size_t size;
+
+ ctx->tx.in_tcp_sendpages = true;
+ while (1) {
+ sg = sk_msg_elem(msg, msg->sg.start);
+ offset = sg->offset;
+ size = sg->length;
+ p = sg_page(sg);
+retry:
+ ret = do_tcp_sendpages(sk, p, offset, size, flags);
+ if (ret != size) {
+ if (ret > 0) {
+ sk_mem_uncharge(sk, ret);
+ sg->offset += ret;
+ sg->length -= ret;
+ size -= ret;
+ offset += ret;
+ goto retry;
+ }
+ ctx->tx.in_tcp_sendpages = false;
+ return ret;
+ }
+
+ sk_mem_uncharge(sk, ret);
+ msg->sg.size -= size;
+ put_page(p);
+ sk_msg_iter_next(msg, start);
+ if (msg->sg.start == msg->sg.end)
+ break;
+ }
+
+ clear_bit(TCP_COMP_SEND_PENDING, &ctx->flags);
+ ctx->tx.in_tcp_sendpages = false;
+
+ return 0;
+}
+
+static int tcp_comp_push(struct sock *sk, int bytes, int flags)
{
struct tcp_comp_context *ctx = comp_get_ctx(sk);
+ int ret;
- return ctx->sk_proto->sendmsg(sk, msg, size);
+ ret = tcp_comp_compress_to_msg(sk, bytes);
+ if (ret < 0) {
+ pr_debug("%s: failed to compress sg\n", __func__);
+ return ret;
+ }
+
+ set_bit(TCP_COMP_SEND_PENDING, &ctx->flags);
+
+ ret = tcp_comp_push_msg(sk, &ctx->tx.msg, flags);
+ if (ret) {
+ pr_debug("%s: failed to tcp_comp_push_sg\n", __func__);
+ return ret;
+ }
+
+ return 0;
+}
+
+static int wait_on_pending_writer(struct sock *sk, long *timeo)
+{
+ DEFINE_WAIT_FUNC(wait, woken_wake_function);
+ int ret = 0;
+
+ add_wait_queue(sk_sleep(sk), &wait);
+ while (1) {
+ if (!*timeo) {
+ ret = -EAGAIN;
+ break;
+ }
+
+ if (signal_pending(current)) {
+ ret = sock_intr_errno(*timeo);
+ break;
+ }
+
+ if (sk_wait_event(sk, timeo, !sk->sk_write_pending, &wait))
+ break;
+ }
+ remove_wait_queue(sk_sleep(sk), &wait);
+
+ return ret;
+}
+
+static int tcp_comp_push_pending_msg(struct sock *sk, int flags)
+{
+ struct tcp_comp_context *ctx = comp_get_ctx(sk);
+ struct sk_msg *msg = &ctx->tx.msg;
+
+ if (msg->sg.start == msg->sg.end)
+ return 0;
+
+ return tcp_comp_push_msg(sk, msg, flags);
+}
+
+static int tcp_comp_complete_pending_work(struct sock *sk, int flags,
+ long *timeo)
+{
+ struct tcp_comp_context *ctx = comp_get_ctx(sk);
+ int ret = 0;
+
+ if (unlikely(sk->sk_write_pending))
+ ret = wait_on_pending_writer(sk, timeo);
+
+ if (!ret && tcp_comp_is_write_pending(ctx))
+ ret = tcp_comp_push_pending_msg(sk, flags);
+
+ return ret;
+}
+
+static int tcp_comp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
+{
+ struct tcp_comp_context *ctx = comp_get_ctx(sk);
+ int copied = 0, err = 0;
+ size_t try_to_copy;
+ int required_size;
+ long timeo;
+
+ lock_sock(sk);
+
+ timeo = sock_sndtimeo(sk, msg->msg_flags & MSG_DONTWAIT);
+
+ err = tcp_comp_complete_pending_work(sk, msg->msg_flags, &timeo);
+ if (err)
+ goto out_err;
+
+ while (msg_data_left(msg)) {
+ if (sk->sk_err) {
+ err = -sk->sk_err;
+ goto out_err;
+ }
+
+ try_to_copy = msg_data_left(msg);
+ if (try_to_copy > TCP_COMP_SCRATCH_SIZE)
+ try_to_copy = TCP_COMP_SCRATCH_SIZE;
+ required_size = try_to_copy + TCP_COMP_MAX_PADDING;
+
+ if (!sk_stream_memory_free(sk))
+ goto wait_for_sndbuf;
+
+alloc_compressed:
+ err = alloc_compressed_msg(sk, required_size);
+ if (err) {
+ if (err != -ENOSPC)
+ goto wait_for_memory;
+ goto out_err;
+ }
+
+ err = memcopy_from_iter(sk, &msg->msg_iter, try_to_copy);
+ if (err < 0)
+ goto out_err;
+
+ copied += try_to_copy;
+
+ err = tcp_comp_push(sk, try_to_copy, msg->msg_flags);
+ if (err < 0) {
+ if (err == -ENOMEM)
+ goto wait_for_memory;
+ if (err != -EAGAIN)
+ tcp_comp_err_abort(sk, EBADMSG);
+ goto out_err;
+ }
+
+ continue;
+wait_for_sndbuf:
+ set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
+wait_for_memory:
+ err = sk_stream_wait_memory(sk, &timeo);
+ if (err)
+ goto out_err;
+ if (ctx->tx.msg.sg.size < required_size)
+ goto alloc_compressed;
+ }
+
+out_err:
+ err = sk_stream_error(sk, msg->msg_flags, err);
+
+ release_sock(sk);
+
+ return copied ? copied : err;
}
static int tcp_comp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
@@ -70,10 +434,35 @@ static int tcp_comp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
return ctx->sk_proto->recvmsg(sk, msg, len, nonblock, flags, addr_len);
}
+static void tcp_comp_write_space(struct sock *sk)
+{
+ struct tcp_comp_context *ctx = comp_get_ctx(sk);
+
+ if (ctx->tx.in_tcp_sendpages) {
+ ctx->sk_write_space(sk);
+ return;
+ }
+
+ if (!sk->sk_write_pending && tcp_comp_is_write_pending(ctx)) {
+ gfp_t sk_allocation = sk->sk_allocation;
+ int rc;
+
+ sk->sk_allocation = GFP_ATOMIC;
+ rc = tcp_comp_push_pending_msg(sk, MSG_DONTWAIT | MSG_NOSIGNAL);
+ sk->sk_allocation = sk_allocation;
+
+ if (rc < 0)
+ return;
+ }
+
+ ctx->sk_write_space(sk);
+}
+
void tcp_init_compression(struct sock *sk)
{
struct inet_connection_sock *icsk = inet_csk(sk);
struct tcp_comp_context *ctx = NULL;
+ struct sk_msg *msg = NULL;
struct tcp_sock *tp = tcp_sk(sk);
if (!tp->rx_opt.comp_ok)
@@ -83,20 +472,38 @@ void tcp_init_compression(struct sock *sk)
if (!ctx)
return;
+ msg = &ctx->tx.msg;
+ sk_msg_init(msg);
+
+ ctx->sk_write_space = sk->sk_write_space;
ctx->sk_proto = sk->sk_prot;
WRITE_ONCE(sk->sk_prot, &tcp_prot_override);
+ sk->sk_write_space = tcp_comp_write_space;
rcu_assign_pointer(icsk->icsk_ulp_data, ctx);
sock_set_flag(sk, SOCK_COMP);
}
+static void tcp_comp_context_tx_free(struct tcp_comp_context *ctx)
+{
+ kfree(ctx->tx.cworkspace);
+ ctx->tx.cworkspace = NULL;
+
+ kvfree(ctx->tx.plaintext_data);
+ ctx->tx.plaintext_data = NULL;
+
+ kvfree(ctx->tx.compressed_data);
+ ctx->tx.compressed_data = NULL;
+}
+
static void tcp_comp_context_free(struct rcu_head *head)
{
struct tcp_comp_context *ctx;
ctx = container_of(head, struct tcp_comp_context, rcu);
+ tcp_comp_context_tx_free(ctx);
kfree(ctx);
}
--
2.20.1
openEuler inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4OFDI
CVE: NA
--------------------------------
Chao Liu (2):
change arm64 configs
change x86 configs
arch/arm64/configs/openeuler_defconfig | 292 ++++++++-----
arch/x86/configs/openeuler_defconfig | 548 +++++++++----------------
2 files changed, 380 insertions(+), 460 deletions(-)
--
2.27.0

22 Jan '22
hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4O31I
Enable legacy pmem register feature for arm64.
Signed-off-by: Zhuling <zhuling8(a)huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
---
arch/arm64/configs/openeuler_defconfig | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/arm64/configs/openeuler_defconfig b/arch/arm64/configs/openeuler_defconfig
index 771eb45..7745715 100644
--- a/arch/arm64/configs/openeuler_defconfig
+++ b/arch/arm64/configs/openeuler_defconfig
@@ -416,6 +416,8 @@ CONFIG_ARM64_CPU_PARK=y
CONFIG_FORCE_MAX_ZONEORDER=11
CONFIG_UNMAP_KERNEL_AT_EL0=y
CONFIG_RODATA_FULL_DEFAULT_ENABLED=y
+CONFIG_ARM64_PMEM_RESERVE=y
+CONFIG_ARM64_PMEM_LEGACY=m
# CONFIG_ARM64_SW_TTBR0_PAN is not set
CONFIG_ARM64_TAGGED_ADDR_ABI=y
CONFIG_ARM64_ILP32=y
@@ -6027,6 +6029,8 @@ CONFIG_USB4=m
# end of Android
CONFIG_LIBNVDIMM=m
+CONFIG_PMEM_LEGACY=m
+CONFIG_PMEM_LEGACY_DEVICE=y
CONFIG_BLK_DEV_PMEM=m
CONFIG_ND_BLK=m
CONFIG_ND_CLAIM=y
--
2.9.5

[PATCH OLK-5.10 v7 1/3] x86: pmem: move persistent memory(legacy) code into nvdimm
by Zhuling 22 Jan '22
hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4O31I
Move x86's pmem.c into nvdimm, and rename X86_PMEM_LEGACY_DEVICE to
PMEM_LEGACY_DEVICE; also add PMEM_LEGACY to control the build of
nd_e820.o, so the code can be reused by other architectures.
Note: this patch fixes the nd_e820.c build breakage introduced by commit 2499317e408e
("arm64: Revert feature: Add memmap parameter and register pmem").
Signed-off-by: Zhuling <zhuling8(a)huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
---
arch/x86/Kconfig | 6 ++----
arch/x86/kernel/Makefile | 1 -
drivers/nvdimm/Kconfig | 6 ++++++
drivers/nvdimm/Makefile | 2 ++
arch/x86/kernel/pmem.c => drivers/nvdimm/pmem_legacy_device.c | 0
tools/testing/nvdimm/Kbuild | 2 +-
6 files changed, 11 insertions(+), 6 deletions(-)
rename arch/x86/kernel/pmem.c => drivers/nvdimm/pmem_legacy_device.c (100%)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index c77ef59..1d3176a 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1667,14 +1667,12 @@ config ILLEGAL_POINTER_VALUE
default 0 if X86_32
default 0xdead000000000000 if X86_64
-config X86_PMEM_LEGACY_DEVICE
- bool
-
config X86_PMEM_LEGACY
tristate "Support non-standard NVDIMMs and ADR protected memory"
depends on PHYS_ADDR_T_64BIT
depends on BLK_DEV
- select X86_PMEM_LEGACY_DEVICE
+ select PMEM_LEGACY
+ select PMEM_LEGACY_DEVICE
select NUMA_KEEP_MEMINFO if NUMA
select LIBNVDIMM
help
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index f0606f8..1e12751 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -130,7 +130,6 @@ obj-$(CONFIG_KVM_GUEST) += kvm.o kvmclock.o
obj-$(CONFIG_PARAVIRT) += paravirt.o paravirt_patch.o
obj-$(CONFIG_PARAVIRT_SPINLOCKS)+= paravirt-spinlocks.o
obj-$(CONFIG_PARAVIRT_CLOCK) += pvclock.o
-obj-$(CONFIG_X86_PMEM_LEGACY_DEVICE) += pmem.o
obj-$(CONFIG_JAILHOUSE_GUEST) += jailhouse.o
diff --git a/drivers/nvdimm/Kconfig b/drivers/nvdimm/Kconfig
index b7d1eb3..632b6ed 100644
--- a/drivers/nvdimm/Kconfig
+++ b/drivers/nvdimm/Kconfig
@@ -19,6 +19,12 @@ menuconfig LIBNVDIMM
if LIBNVDIMM
+config PMEM_LEGACY
+ tristate
+
+config PMEM_LEGACY_DEVICE
+ bool
+
config BLK_DEV_PMEM
tristate "PMEM: Persistent memory block device support"
default LIBNVDIMM
diff --git a/drivers/nvdimm/Makefile b/drivers/nvdimm/Makefile
index 0407753..2098221 100644
--- a/drivers/nvdimm/Makefile
+++ b/drivers/nvdimm/Makefile
@@ -3,6 +3,8 @@ obj-$(CONFIG_LIBNVDIMM) += libnvdimm.o
obj-$(CONFIG_BLK_DEV_PMEM) += nd_pmem.o
obj-$(CONFIG_ND_BTT) += nd_btt.o
obj-$(CONFIG_ND_BLK) += nd_blk.o
+obj-$(CONFIG_PMEM_LEGACY_DEVICE) += pmem_legacy_device.o
+obj-$(CONFIG_PMEM_LEGACY) += nd_e820.o
obj-$(CONFIG_OF_PMEM) += of_pmem.o
obj-$(CONFIG_VIRTIO_PMEM) += virtio_pmem.o nd_virtio.o
diff --git a/arch/x86/kernel/pmem.c b/drivers/nvdimm/pmem_legacy_device.c
similarity index 100%
rename from arch/x86/kernel/pmem.c
rename to drivers/nvdimm/pmem_legacy_device.c
diff --git a/tools/testing/nvdimm/Kbuild b/tools/testing/nvdimm/Kbuild
index 47f9cc9..77aa117 100644
--- a/tools/testing/nvdimm/Kbuild
+++ b/tools/testing/nvdimm/Kbuild
@@ -28,7 +28,7 @@ obj-$(CONFIG_LIBNVDIMM) += libnvdimm.o
obj-$(CONFIG_BLK_DEV_PMEM) += nd_pmem.o
obj-$(CONFIG_ND_BTT) += nd_btt.o
obj-$(CONFIG_ND_BLK) += nd_blk.o
-obj-$(CONFIG_X86_PMEM_LEGACY) += nd_e820.o
+obj-$(CONFIG_PMEM_LEGACY) += nd_e820.o
obj-$(CONFIG_ACPI_NFIT) += nfit.o
ifeq ($(CONFIG_DAX),m)
obj-$(CONFIG_DAX) += dax.o
--
2.9.5
openEuler inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4OFDI
CVE: NA
--------------------------------
Chao Liu (2):
change arm64 configs
change x86 configs
arch/arm64/configs/openeuler_defconfig | 272 ++++++++-----
arch/x86/configs/openeuler_defconfig | 540 +++++++++----------------
2 files changed, 356 insertions(+), 456 deletions(-)
--
2.27.0

[PATCH openEuler-1.0-LTS V2] ipmi_si: Phytium S2500 workaround for MMIO-based IPMI
by Laibin Qiu 22 Jan '22
phytium inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4RK58
CVE: NA
--------------------------------
The system would hang up when the Phytium S2500 communicates with
some BMCs after several rounds of transactions, unless we reset
the controller timeout counter manually by calling firmware through
SMC.
Signed-off-by: Wang Yinfeng <wangyinfeng(a)phytium.com.cn>
Signed-off-by: Chen Baozi <chenbaozi(a)phytium.com.cn> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin(a)huawei.com>
---
drivers/char/ipmi/ipmi_si_mem_io.c | 72 ++++++++++++++++++++++++++++++
1 file changed, 72 insertions(+)
diff --git a/drivers/char/ipmi/ipmi_si_mem_io.c b/drivers/char/ipmi/ipmi_si_mem_io.c
index 75583612ab10..9403e35a1993 100644
--- a/drivers/char/ipmi/ipmi_si_mem_io.c
+++ b/drivers/char/ipmi/ipmi_si_mem_io.c
@@ -3,9 +3,75 @@
#include <linux/io.h>
#include "ipmi_si.h"
+#ifdef CONFIG_ARM_GIC_PHYTIUM_2500
+#include <linux/arm-smccc.h>
+
+#define CTL_RST_FUNC_ID 0xC2000011
+
+static bool apply_phytium2500_workaround;
+
+struct ipmi_workaround_oem_info {
+ char oem_id[ACPI_OEM_ID_SIZE + 1];
+};
+
+static struct ipmi_workaround_oem_info wa_info[] = {
+ {
+ .oem_id = "KPSVVJ",
+ }
+};
+
+static void ipmi_check_phytium_workaround(void)
+{
+#ifdef CONFIG_ACPI
+ struct acpi_table_header tbl;
+ int i;
+
+ if (ACPI_FAILURE(acpi_get_table_header(ACPI_SIG_DSDT, 0, &tbl)))
+ return;
+
+ for (i = 0; i < ARRAY_SIZE(wa_info); i++) {
+ if (strncmp(wa_info[i].oem_id, tbl.oem_id, ACPI_OEM_ID_SIZE))
+ continue;
+
+ apply_phytium2500_workaround = true;
+ break;
+ }
+#endif
+}
+
+static void ctl_smc(unsigned long arg0, unsigned long arg1,
+ unsigned long arg2, unsigned long arg3)
+{
+ struct arm_smccc_res res;
+
+ arm_smccc_smc(arg0, arg1, arg2, arg3, 0, 0, 0, 0, &res);
+ if (res.a0 != 0)
+ pr_err("Error: Firmware call SMC reset Failed: %d, addr: 0x%lx\n",
+ (int)res.a0, arg2);
+}
+
+static void ctl_timeout_reset(void)
+{
+ ctl_smc(CTL_RST_FUNC_ID, 0x1, 0x28100208, 0x1);
+ ctl_smc(CTL_RST_FUNC_ID, 0x1, 0x2810020C, 0x1);
+}
+
+static inline void ipmi_phytium_workaround(void)
+{
+ if (apply_phytium2500_workaround)
+ ctl_timeout_reset();
+}
+
+#else
+static inline void ipmi_check_phytium_workaround(void) {}
+static inline void ipmi_phytium_workaround(void) {}
+#endif
+
static unsigned char intf_mem_inb(const struct si_sm_io *io,
unsigned int offset)
{
+ ipmi_phytium_workaround();
+
return readb((io->addr)+(offset * io->regspacing));
}
@@ -31,6 +97,8 @@ static void intf_mem_outw(const struct si_sm_io *io, unsigned int offset,
static unsigned char intf_mem_inl(const struct si_sm_io *io,
unsigned int offset)
{
+ ipmi_phytium_workaround();
+
return (readl((io->addr)+(offset * io->regspacing)) >> io->regshift)
& 0xff;
}
@@ -44,6 +112,8 @@ static void intf_mem_outl(const struct si_sm_io *io, unsigned int offset,
#ifdef readq
static unsigned char mem_inq(const struct si_sm_io *io, unsigned int offset)
{
+ ipmi_phytium_workaround();
+
return (readq((io->addr)+(offset * io->regspacing)) >> io->regshift)
& 0xff;
}
@@ -81,6 +151,8 @@ int ipmi_si_mem_setup(struct si_sm_io *io)
if (!addr)
return -ENODEV;
+ ipmi_check_phytium_workaround();
+
/*
* Figure out the actual readb/readw/readl/etc routine to use based
* upon the register size.
--
2.22.0

21 Jan '22
On 2022/1/21 18:06, Chen Baozi wrote:
> phytium inclusion
> category: bugfix
> bugzilla: https://gitee.com/openeuler/kernel/issues/I4RK58
> CVE: NA
>
> ---------------------------------
>
> The system would hang up when the Phytium S2500 communicates with
> some BMCs after several rounds of transactions, unless we reset
> the controller timeout counter manually by calling firmware through
> SMC.
>
> Signed-off-by: Wang Yinfeng <wangyinfeng(a)phytium.com.cn>
> Signed-off-by: Chen Baozi <chenbaozi(a)phytium.com.cn>
Looks good to me,
Reviewed-by: Xie XiuQi <xiexiuqi(a)huawei.com>
> ---
> drivers/char/ipmi/ipmi_si_mem_io.c | 74 ++++++++++++++++++++++++++++++
> 1 file changed, 74 insertions(+)
>
> diff --git a/drivers/char/ipmi/ipmi_si_mem_io.c b/drivers/char/ipmi/ipmi_si_mem_io.c
> index 75583612ab10..bb4ed90a1153 100644
> --- a/drivers/char/ipmi/ipmi_si_mem_io.c
> +++ b/drivers/char/ipmi/ipmi_si_mem_io.c
> @@ -3,9 +3,75 @@
> #include <linux/io.h>
> #include "ipmi_si.h"
>
> +#ifdef CONFIG_ARM_GIC_PHYTIUM_2500
> +#include <linux/arm-smccc.h>
> +
> +#define CTL_RST_FUNC_ID 0xC2000011
> +
> +static bool apply_phytium2500_workaround;
> +
> +struct ipmi_workaround_oem_info {
> + char oem_id[ACPI_OEM_ID_SIZE + 1];
> +};
> +
> +static struct ipmi_workaround_oem_info wa_info[] = {
> + {
> + .oem_id = "KPSVVJ",
> + }
> +};
> +
> +static void ipmi_check_phytium_workaround(void)
> +{
> +#ifdef CONFIG_ACPI
> + struct acpi_table_header tbl;
> + int i;
> +
> + if (ACPI_FAILURE(acpi_get_table_header(ACPI_SIG_DSDT, 0, &tbl)))
> + return;
> +
> + for (i = 0; i < ARRAY_SIZE(wa_info); i++) {
> + if (strncmp(wa_info[i].oem_id, tbl.oem_id, ACPI_OEM_ID_SIZE))
> + continue;
> +
> + apply_phytium2500_workaround = true;
> + break;
> + }
> +#endif
> +}
> +
> +static void ctl_smc(unsigned long arg0, unsigned long arg1,
> + unsigned long arg2, unsigned long arg3)
> +{
> + struct arm_smccc_res res;
> +
> + arm_smccc_smc(arg0, arg1, arg2, arg3, 0, 0, 0, 0, &res);
> + if (res.a0 != 0)
> + pr_err("Error: Firmware call SMC reset Failed: %d, addr: 0x%lx\n",
> + (int)res.a0, arg2);
> +}
> +
> +static void ctl_timeout_reset(void)
> +{
> + ctl_smc(CTL_RST_FUNC_ID, 0x1, 0x28100208, 0x1);
> + ctl_smc(CTL_RST_FUNC_ID, 0x1, 0x2810020C, 0x1);
> +}
> +
> +static inline void ipmi_phytium_workaround(void)
> +{
> + if (apply_phytium2500_workaround)
> + ctl_timeout_reset();
> +}
> +
> +#else
> +static inline void ipmi_check_phytium_workaround(void) {}
> +static inline void ipmi_phytium_workaround(void) {}
> +#endif
> +
> static unsigned char intf_mem_inb(const struct si_sm_io *io,
> unsigned int offset)
> {
> + ipmi_phytium_workaround();
> +
> return readb((io->addr)+(offset * io->regspacing));
> }
>
> @@ -18,6 +84,8 @@ static void intf_mem_outb(const struct si_sm_io *io, unsigned int offset,
> static unsigned char intf_mem_inw(const struct si_sm_io *io,
> unsigned int offset)
> {
> + ipmi_phytium_workaround();
> +
> return (readw((io->addr)+(offset * io->regspacing)) >> io->regshift)
> & 0xff;
> }
> @@ -31,6 +99,8 @@ static void intf_mem_outw(const struct si_sm_io *io, unsigned int offset,
> static unsigned char intf_mem_inl(const struct si_sm_io *io,
> unsigned int offset)
> {
> + ipmi_phytium_workaround();
> +
> return (readl((io->addr)+(offset * io->regspacing)) >> io->regshift)
> & 0xff;
> }
> @@ -44,6 +114,8 @@ static void intf_mem_outl(const struct si_sm_io *io, unsigned int offset,
> #ifdef readq
> static unsigned char mem_inq(const struct si_sm_io *io, unsigned int offset)
> {
> + ipmi_phytium_workaround();
> +
> return (readq((io->addr)+(offset * io->regspacing)) >> io->regshift)
> & 0xff;
> }
> @@ -81,6 +153,8 @@ int ipmi_si_mem_setup(struct si_sm_io *io)
> if (!addr)
> return -ENODEV;
>
> + ipmi_check_phytium_workaround();
> +
> /*
> * Figure out the actual readb/readw/readl/etc routine to use based
> * upon the register size.
>

21 Jan '22
Hi,
Thanks for your patch. There are some comments below.
On 2022/1/19 14:57, Chen Baozi wrote:
> The system would hang up when the Phytium S2500 communicates with
> some BMCs after several rounds of transactions, unless we reset
> the controller timeout counter manually by calling firmware through
> SMC.
>
> Signed-off-by: Wang Yinfeng <wangyinfeng(a)phytium.com.cn>
> Signed-off-by: Chen Baozi <chenbaozi(a)phytium.com.cn>
> ---
> drivers/char/ipmi/ipmi_si_mem_io.c | 117 ++++++++++++++++++++++++++++-
> 1 file changed, 113 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/char/ipmi/ipmi_si_mem_io.c b/drivers/char/ipmi/ipmi_si_mem_io.c
> index 75583612ab10..73a3d6979eaa 100644
> --- a/drivers/char/ipmi/ipmi_si_mem_io.c
> +++ b/drivers/char/ipmi/ipmi_si_mem_io.c
> @@ -3,6 +3,105 @@
> #include <linux/io.h>
> #include "ipmi_si.h"
>
> +#ifdef CONFIG_ARM_GIC_PHYTIUM_2500
> +#include <linux/arm-smccc.h>
> +
> +#define CTL_RST_FUNC_ID 0xC2000011
> +
> +static void ctl_smc(unsigned long arg0, unsigned long arg1,
> + unsigned long arg2, unsigned long arg3)
> +{
> + struct arm_smccc_res res;
> +
> + arm_smccc_smc(arg0, arg1, arg2, arg3, 0, 0, 0, 0, &res);
> + if (res.a0 != 0)
> + pr_err("Error: Firmware call SMC reset Failed: %d, addr: 0x%lx\n",
> + (int)res.a0, arg2);
> +}
> +
> +static void ctl_timeout_reset(void)
> +{
> + ctl_smc(CTL_RST_FUNC_ID, 0x1, 0x28100208, 0x1);
> + ctl_smc(CTL_RST_FUNC_ID, 0x1, 0x2810020C, 0x1);
> +}
> +
> +static unsigned char intf_mem_inb_2500(const struct si_sm_io *io,
> + unsigned int offset)
> +{
> + ctl_timeout_reset();
> +
> + return readb((io->addr)+(offset * io->regspacing));
This duplicates the code of intf_mem_inb(); the same applies below.
> +}
> +
> +static unsigned char intf_mem_inw_2500(const struct si_sm_io *io,
> + unsigned int offset)
> +{
> + ctl_timeout_reset();
> +
> + return (readw((io->addr)+(offset * io->regspacing)) >> io->regshift)
> + & 0xff;
> +}
> +
> +static unsigned char intf_mem_inl_2500(const struct si_sm_io *io,
> + unsigned int offset)
> +{
> + ctl_timeout_reset();
> +
> + return (readl((io->addr)+(offset * io->regspacing)) >> io->regshift)
> + & 0xff;
> +}
> +
> +static unsigned char mem_inq_2500(const struct si_sm_io *io,
> + unsigned int offset)
> +{
> + ctl_timeout_reset();
> +
> + return (readq((io->addr)+(offset * io->regspacing)) >> io->regshift)
> + & 0xff;
> +}
> +#else
> +#define intf_mem_inb_2500 intf_mem_inb
> +#define intf_mem_inw_2500 intf_mem_inw
> +#define intf_mem_inl_2500 intf_mem_inl
> +#define mem_inq_2500 mem_inq
> +#endif
> +
> +static bool apply_phytium2500_workaround;
This variable exists globally, but it is really only meaningful on the Phytium 2500 platform.
> +
> +#ifdef CONFIG_ACPI
> +struct ipmi_workaround_oem_info {
> + char oem_id[ACPI_OEM_ID_SIZE + 1];
> +};
> +
> +static struct ipmi_workaround_oem_info wa_info[] = {
> + {
> + .oem_id = "KPSVVJ",
> + }
> +};
> +
> +static void ipmi_check_phytium_workaround(void)
> +{
> + struct acpi_table_header tbl;
> + int i;
> +
> + if (ACPI_FAILURE(acpi_get_table_header(ACPI_SIG_DSDT, 0, &tbl)))
> + return;
> +
> + for (i = 0; i < ARRAY_SIZE(wa_info); i++) {
> + if (strncmp(wa_info[i].oem_id, tbl.oem_id, ACPI_OEM_ID_SIZE))
> + continue;
> +
> + apply_phytium2500_workaround = true;
> + break;
> + }
> +}
This CONFIG_ACPI block has no dependency on PHYTIUM, so these unnecessary checks are still performed even on non-Phytium platforms.
Moreover, on non-ARM64 architectures apply_phytium2500_workaround still exists globally.
Wouldn't something like the following be better:
#ifdef CONFIG_ARM_GIC_PHYTIUM_2500
static bool apply_phytium2500_workaround;
static void ipmi_check_phytium_workaround(void)
{
#ifdef CONFIG_ACPI
...
#endif
}
...
static inline void ipmi_phytium2500_workaround(void)
{
if (apply_phytium2500_workaround)
ctl_timeout_reset();
}
static unsigned char intf_mem_inb(const struct si_sm_io *io,
unsigned int offset)
{
+ ipmi_phytium2500_workaround();
return readb((io->addr)+(offset * io->regspacing));
}
static unsigned char intf_mem_inw(const struct si_sm_io *io,
unsigned int offset)
{
+ ipmi_phytium2500_workaround();
return (readw((io->addr)+(offset * io->regspacing)) >> io->regshift)
& 0xff;
}
...
#else
static inline void ipmi_phytium2500_workaround() {}
static inline void ipmi_check_phytium_workaround(void) {}
#endif
> +#else
> +static void ipmi_check_phytium_workaround(void)
> +{
> + apply_phytium2500_workaround = false;
> +}
> +#endif
> +
> static unsigned char intf_mem_inb(const struct si_sm_io *io,
> unsigned int offset)
> {
> @@ -81,26 +180,36 @@ int ipmi_si_mem_setup(struct si_sm_io *io)
> if (!addr)
> return -ENODEV;
>
> + ipmi_check_phytium_workaround();
> +
> /*
> * Figure out the actual readb/readw/readl/etc routine to use based
> * upon the register size.
> */
> switch (io->regsize) {
> case 1:
> - io->inputb = intf_mem_inb;
> + io->inputb = apply_phytium2500_workaround ?
> + intf_mem_inb_2500 :
> + intf_mem_inb;
> io->outputb = intf_mem_outb;
> break;
> case 2:
> - io->inputb = intf_mem_inw;
> + io->inputb = apply_phytium2500_workaround ?
> + intf_mem_inw_2500 :
> + intf_mem_inw;
> io->outputb = intf_mem_outw;
> break;
> case 4:
> - io->inputb = intf_mem_inl;
> + io->inputb = apply_phytium2500_workaround ?
> + intf_mem_inl_2500 :
> + intf_mem_inl;
> io->outputb = intf_mem_outl;
> break;
> #ifdef readq
> case 8:
> - io->inputb = mem_inq;
> + io->inputb = apply_phytium2500_workaround ?
> + mem_inq_2500 :
> + mem_inq;
> io->outputb = mem_outq;
> break;
> #endif
>

[PATCH OLK-5.10 v6 1/3] x86: pmem: move persistent memory(legacy) code into nvdimm
by Zhuling 21 Jan '22
hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4O31I
Move x86's pmem.c into nvdimm, and rename X86_PMEM_LEGACY_DEVICE to
PMEM_LEGACY_DEVICE; also add PMEM_LEGACY to control the build of
nd_e820.o, so the code can be reused by other architectures.
Note: this patch fixes the nd_e820.c build breakage introduced by commit 2499317e408e
("arm64: Revert feature: Add memmap parameter and register pmem").
Signed-off-by: Zhuling <zhuling8(a)huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
---
arch/x86/Kconfig | 6 ++----
arch/x86/kernel/Makefile | 1 -
drivers/nvdimm/Kconfig | 6 ++++++
drivers/nvdimm/Makefile | 2 ++
arch/x86/kernel/pmem.c => drivers/nvdimm/pmem_legacy_device.c | 0
tools/testing/nvdimm/Kbuild | 2 +-
6 files changed, 11 insertions(+), 6 deletions(-)
rename arch/x86/kernel/pmem.c => drivers/nvdimm/pmem_legacy_device.c (100%)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index c77ef59..1d3176a 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1667,14 +1667,12 @@ config ILLEGAL_POINTER_VALUE
default 0 if X86_32
default 0xdead000000000000 if X86_64
-config X86_PMEM_LEGACY_DEVICE
- bool
-
config X86_PMEM_LEGACY
tristate "Support non-standard NVDIMMs and ADR protected memory"
depends on PHYS_ADDR_T_64BIT
depends on BLK_DEV
- select X86_PMEM_LEGACY_DEVICE
+ select PMEM_LEGACY
+ select PMEM_LEGACY_DEVICE
select NUMA_KEEP_MEMINFO if NUMA
select LIBNVDIMM
help
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index f0606f8..1e12751 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -130,7 +130,6 @@ obj-$(CONFIG_KVM_GUEST) += kvm.o kvmclock.o
obj-$(CONFIG_PARAVIRT) += paravirt.o paravirt_patch.o
obj-$(CONFIG_PARAVIRT_SPINLOCKS)+= paravirt-spinlocks.o
obj-$(CONFIG_PARAVIRT_CLOCK) += pvclock.o
-obj-$(CONFIG_X86_PMEM_LEGACY_DEVICE) += pmem.o
obj-$(CONFIG_JAILHOUSE_GUEST) += jailhouse.o
diff --git a/drivers/nvdimm/Kconfig b/drivers/nvdimm/Kconfig
index b7d1eb3..632b6ed 100644
--- a/drivers/nvdimm/Kconfig
+++ b/drivers/nvdimm/Kconfig
@@ -19,6 +19,12 @@ menuconfig LIBNVDIMM
if LIBNVDIMM
+config PMEM_LEGACY
+ tristate
+
+config PMEM_LEGACY_DEVICE
+ bool
+
config BLK_DEV_PMEM
tristate "PMEM: Persistent memory block device support"
default LIBNVDIMM
diff --git a/drivers/nvdimm/Makefile b/drivers/nvdimm/Makefile
index 0407753..2098221 100644
--- a/drivers/nvdimm/Makefile
+++ b/drivers/nvdimm/Makefile
@@ -3,6 +3,8 @@ obj-$(CONFIG_LIBNVDIMM) += libnvdimm.o
obj-$(CONFIG_BLK_DEV_PMEM) += nd_pmem.o
obj-$(CONFIG_ND_BTT) += nd_btt.o
obj-$(CONFIG_ND_BLK) += nd_blk.o
+obj-$(CONFIG_PMEM_LEGACY_DEVICE) += pmem_legacy_device.o
+obj-$(CONFIG_PMEM_LEGACY) += nd_e820.o
obj-$(CONFIG_OF_PMEM) += of_pmem.o
obj-$(CONFIG_VIRTIO_PMEM) += virtio_pmem.o nd_virtio.o
diff --git a/arch/x86/kernel/pmem.c b/drivers/nvdimm/pmem_legacy_device.c
similarity index 100%
rename from arch/x86/kernel/pmem.c
rename to drivers/nvdimm/pmem_legacy_device.c
diff --git a/tools/testing/nvdimm/Kbuild b/tools/testing/nvdimm/Kbuild
index 47f9cc9..77aa117 100644
--- a/tools/testing/nvdimm/Kbuild
+++ b/tools/testing/nvdimm/Kbuild
@@ -28,7 +28,7 @@ obj-$(CONFIG_LIBNVDIMM) += libnvdimm.o
obj-$(CONFIG_BLK_DEV_PMEM) += nd_pmem.o
obj-$(CONFIG_ND_BTT) += nd_btt.o
obj-$(CONFIG_ND_BLK) += nd_blk.o
-obj-$(CONFIG_X86_PMEM_LEGACY) += nd_e820.o
+obj-$(CONFIG_PMEM_LEGACY) += nd_e820.o
obj-$(CONFIG_ACPI_NFIT) += nfit.o
ifeq ($(CONFIG_DAX),m)
obj-$(CONFIG_DAX) += dax.o
--
2.9.5

Re: [PATCH OLK-5.10 v5 1/3] x86: pmem: move persistent memory(legacy) code into nvdimm
by zhuling 21 Jan '22
On 2022/1/21 14:25, Kefeng Wang wrote:
> Move x86's pmem.c into nvdimm, and rename it pmem_legacy_device.c,
> also rename X86_PMEM_LEGACY_DEVICE to PMEM_LEGACY_DEVICE.
> Meanwhile, add PMEM_LEGACY to control the build of nd_e820.o, so
> the code can be reused by other architectures.
>
> Note: this patch fixes the nd_e820.c build breakage introduced by commit 2499317e408e
> ("arm64: Revert feature: Add memmap parameter and register pmem").
>
> Please revise the changelog;
OK, I will update it.
>
> On 2022/1/21 11:20, Zhuling wrote:
>> hulk inclusion
>> category: feature
>> bugzilla: https://gitee.com/openeuler/kernel/issues/I4O31I
>>
>> Move x86's pmem.c into nvdimm, and rename X86_PMEM_LEGACY_DEVICE to
>> PMEM_LEGACY_DEVICE; also add PMEM_LEGACY to control the build of
>> nd_e820.o, so the code can be reused by other architectures.
>>
>> Signed-off-by: Zhuling <zhuling8(a)huawei.com>
>> ---
>> arch/x86/Kconfig | 6 ++----
>> arch/x86/kernel/Makefile | 1 -
>> drivers/nvdimm/Kconfig | 6 ++++++
>> drivers/nvdimm/Makefile | 2 ++
>> arch/x86/kernel/pmem.c => drivers/nvdimm/pmem_legacy_device.c | 0
>> tools/testing/nvdimm/Kbuild | 2 +-
>> 6 files changed, 11 insertions(+), 6 deletions(-)
>> rename arch/x86/kernel/pmem.c => drivers/nvdimm/pmem_legacy_device.c (100%)
>>
>> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
>> index c77ef59..1d3176a 100644
>> --- a/arch/x86/Kconfig
>> +++ b/arch/x86/Kconfig
>> @@ -1667,14 +1667,12 @@ config ILLEGAL_POINTER_VALUE
>> default 0 if X86_32
>> default 0xdead000000000000 if X86_64
>> -config X86_PMEM_LEGACY_DEVICE
>> - bool
>> -
>> config X86_PMEM_LEGACY
>> tristate "Support non-standard NVDIMMs and ADR protected memory"
>> depends on PHYS_ADDR_T_64BIT
>> depends on BLK_DEV
>> - select X86_PMEM_LEGACY_DEVICE
>> + select PMEM_LEGACY
>> + select PMEM_LEGACY_DEVICE
>> select NUMA_KEEP_MEMINFO if NUMA
>> select LIBNVDIMM
>> help
>> diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
>> index f0606f8..1e12751 100644
>> --- a/arch/x86/kernel/Makefile
>> +++ b/arch/x86/kernel/Makefile
>> @@ -130,7 +130,6 @@ obj-$(CONFIG_KVM_GUEST) += kvm.o kvmclock.o
>> obj-$(CONFIG_PARAVIRT) += paravirt.o paravirt_patch.o
>> obj-$(CONFIG_PARAVIRT_SPINLOCKS)+= paravirt-spinlocks.o
>> obj-$(CONFIG_PARAVIRT_CLOCK) += pvclock.o
>> -obj-$(CONFIG_X86_PMEM_LEGACY_DEVICE) += pmem.o
>> obj-$(CONFIG_JAILHOUSE_GUEST) += jailhouse.o
>> diff --git a/drivers/nvdimm/Kconfig b/drivers/nvdimm/Kconfig
>> index b7d1eb3..632b6ed 100644
>> --- a/drivers/nvdimm/Kconfig
>> +++ b/drivers/nvdimm/Kconfig
>> @@ -19,6 +19,12 @@ menuconfig LIBNVDIMM
>> if LIBNVDIMM
>> +config PMEM_LEGACY
>> + tristate
>> +
>> +config PMEM_LEGACY_DEVICE
>> + bool
>> +
>> config BLK_DEV_PMEM
>> tristate "PMEM: Persistent memory block device support"
>> default LIBNVDIMM
>> diff --git a/drivers/nvdimm/Makefile b/drivers/nvdimm/Makefile
>> index 0407753..2098221 100644
>> --- a/drivers/nvdimm/Makefile
>> +++ b/drivers/nvdimm/Makefile
>> @@ -3,6 +3,8 @@ obj-$(CONFIG_LIBNVDIMM) += libnvdimm.o
>> obj-$(CONFIG_BLK_DEV_PMEM) += nd_pmem.o
>> obj-$(CONFIG_ND_BTT) += nd_btt.o
>> obj-$(CONFIG_ND_BLK) += nd_blk.o
>> +obj-$(CONFIG_PMEM_LEGACY_DEVICE) += pmem_legacy_device.o
>> +obj-$(CONFIG_PMEM_LEGACY) += nd_e820.o
>> obj-$(CONFIG_OF_PMEM) += of_pmem.o
>> obj-$(CONFIG_VIRTIO_PMEM) += virtio_pmem.o nd_virtio.o
>> diff --git a/arch/x86/kernel/pmem.c b/drivers/nvdimm/pmem_legacy_device.c
>> similarity index 100%
>> rename from arch/x86/kernel/pmem.c
>> rename to drivers/nvdimm/pmem_legacy_device.c
>> diff --git a/tools/testing/nvdimm/Kbuild b/tools/testing/nvdimm/Kbuild
>> index 47f9cc9..77aa117 100644
>> --- a/tools/testing/nvdimm/Kbuild
>> +++ b/tools/testing/nvdimm/Kbuild
>> @@ -28,7 +28,7 @@ obj-$(CONFIG_LIBNVDIMM) += libnvdimm.o
>> obj-$(CONFIG_BLK_DEV_PMEM) += nd_pmem.o
>> obj-$(CONFIG_ND_BTT) += nd_btt.o
>> obj-$(CONFIG_ND_BLK) += nd_blk.o
>> -obj-$(CONFIG_X86_PMEM_LEGACY) += nd_e820.o
>> +obj-$(CONFIG_PMEM_LEGACY) += nd_e820.o
>> obj-$(CONFIG_ACPI_NFIT) += nfit.o
>> ifeq ($(CONFIG_DAX),m)
>> obj-$(CONFIG_DAX) += dax.o
> .

[PATCH OLK-5.10 v5 1/3] x86: pmem: move persistent memory(legacy) code into nvdimm
by Zhuling 21 Jan '22
hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4O31I
Move x86's pmem.c into nvdimm, and rename X86_PMEM_LEGACY_DEVICE to
PMEM_LEGACY_DEVICE; also add PMEM_LEGACY to control the build of
nd_e820.o, so the code can be reused by other architectures.
Signed-off-by: Zhuling <zhuling8(a)huawei.com>
---
arch/x86/Kconfig | 6 ++----
arch/x86/kernel/Makefile | 1 -
drivers/nvdimm/Kconfig | 6 ++++++
drivers/nvdimm/Makefile | 2 ++
arch/x86/kernel/pmem.c => drivers/nvdimm/pmem_legacy_device.c | 0
tools/testing/nvdimm/Kbuild | 2 +-
6 files changed, 11 insertions(+), 6 deletions(-)
rename arch/x86/kernel/pmem.c => drivers/nvdimm/pmem_legacy_device.c (100%)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index c77ef59..1d3176a 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1667,14 +1667,12 @@ config ILLEGAL_POINTER_VALUE
default 0 if X86_32
default 0xdead000000000000 if X86_64
-config X86_PMEM_LEGACY_DEVICE
- bool
-
config X86_PMEM_LEGACY
tristate "Support non-standard NVDIMMs and ADR protected memory"
depends on PHYS_ADDR_T_64BIT
depends on BLK_DEV
- select X86_PMEM_LEGACY_DEVICE
+ select PMEM_LEGACY
+ select PMEM_LEGACY_DEVICE
select NUMA_KEEP_MEMINFO if NUMA
select LIBNVDIMM
help
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index f0606f8..1e12751 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -130,7 +130,6 @@ obj-$(CONFIG_KVM_GUEST) += kvm.o kvmclock.o
obj-$(CONFIG_PARAVIRT) += paravirt.o paravirt_patch.o
obj-$(CONFIG_PARAVIRT_SPINLOCKS)+= paravirt-spinlocks.o
obj-$(CONFIG_PARAVIRT_CLOCK) += pvclock.o
-obj-$(CONFIG_X86_PMEM_LEGACY_DEVICE) += pmem.o
obj-$(CONFIG_JAILHOUSE_GUEST) += jailhouse.o
diff --git a/drivers/nvdimm/Kconfig b/drivers/nvdimm/Kconfig
index b7d1eb3..632b6ed 100644
--- a/drivers/nvdimm/Kconfig
+++ b/drivers/nvdimm/Kconfig
@@ -19,6 +19,12 @@ menuconfig LIBNVDIMM
if LIBNVDIMM
+config PMEM_LEGACY
+ tristate
+
+config PMEM_LEGACY_DEVICE
+ bool
+
config BLK_DEV_PMEM
tristate "PMEM: Persistent memory block device support"
default LIBNVDIMM
diff --git a/drivers/nvdimm/Makefile b/drivers/nvdimm/Makefile
index 0407753..2098221 100644
--- a/drivers/nvdimm/Makefile
+++ b/drivers/nvdimm/Makefile
@@ -3,6 +3,8 @@ obj-$(CONFIG_LIBNVDIMM) += libnvdimm.o
obj-$(CONFIG_BLK_DEV_PMEM) += nd_pmem.o
obj-$(CONFIG_ND_BTT) += nd_btt.o
obj-$(CONFIG_ND_BLK) += nd_blk.o
+obj-$(CONFIG_PMEM_LEGACY_DEVICE) += pmem_legacy_device.o
+obj-$(CONFIG_PMEM_LEGACY) += nd_e820.o
obj-$(CONFIG_OF_PMEM) += of_pmem.o
obj-$(CONFIG_VIRTIO_PMEM) += virtio_pmem.o nd_virtio.o
diff --git a/arch/x86/kernel/pmem.c b/drivers/nvdimm/pmem_legacy_device.c
similarity index 100%
rename from arch/x86/kernel/pmem.c
rename to drivers/nvdimm/pmem_legacy_device.c
diff --git a/tools/testing/nvdimm/Kbuild b/tools/testing/nvdimm/Kbuild
index 47f9cc9..77aa117 100644
--- a/tools/testing/nvdimm/Kbuild
+++ b/tools/testing/nvdimm/Kbuild
@@ -28,7 +28,7 @@ obj-$(CONFIG_LIBNVDIMM) += libnvdimm.o
obj-$(CONFIG_BLK_DEV_PMEM) += nd_pmem.o
obj-$(CONFIG_ND_BTT) += nd_btt.o
obj-$(CONFIG_ND_BLK) += nd_blk.o
-obj-$(CONFIG_X86_PMEM_LEGACY) += nd_e820.o
+obj-$(CONFIG_PMEM_LEGACY) += nd_e820.o
obj-$(CONFIG_ACPI_NFIT) += nfit.o
ifeq ($(CONFIG_DAX),m)
obj-$(CONFIG_DAX) += dax.o
--
2.9.5

[PATCH openEuler-1.0-LTS] ipmi_si: Phytium S2500 workaround for MMIO-based IPMI
by Laibin Qiu 20 Jan '22
phytium inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4RK58
CVE: NA
--------------------------------
The system would hang up when the Phytium S2500 communicates with
some BMCs after several rounds of transactions, unless we reset
the controller timeout counter manually by calling firmware through
SMC.
Signed-off-by: Wang Yinfeng <wangyinfeng(a)phytium.com.cn>
Signed-off-by: Chen Baozi <chenbaozi(a)phytium.com.cn> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin(a)huawei.com>
---
drivers/char/ipmi/ipmi_si_mem_io.c | 117 ++++++++++++++++++++++++++++-
1 file changed, 113 insertions(+), 4 deletions(-)
diff --git a/drivers/char/ipmi/ipmi_si_mem_io.c b/drivers/char/ipmi/ipmi_si_mem_io.c
index 75583612ab10..73a3d6979eaa 100644
--- a/drivers/char/ipmi/ipmi_si_mem_io.c
+++ b/drivers/char/ipmi/ipmi_si_mem_io.c
@@ -3,6 +3,105 @@
#include <linux/io.h>
#include "ipmi_si.h"
+#ifdef CONFIG_ARM_GIC_PHYTIUM_2500
+#include <linux/arm-smccc.h>
+
+#define CTL_RST_FUNC_ID 0xC2000011
+
+static void ctl_smc(unsigned long arg0, unsigned long arg1,
+ unsigned long arg2, unsigned long arg3)
+{
+ struct arm_smccc_res res;
+
+ arm_smccc_smc(arg0, arg1, arg2, arg3, 0, 0, 0, 0, &res);
+ if (res.a0 != 0)
+ pr_err("Error: Firmware call SMC reset Failed: %d, addr: 0x%lx\n",
+ (int)res.a0, arg2);
+}
+
+static void ctl_timeout_reset(void)
+{
+ ctl_smc(CTL_RST_FUNC_ID, 0x1, 0x28100208, 0x1);
+ ctl_smc(CTL_RST_FUNC_ID, 0x1, 0x2810020C, 0x1);
+}
+
+static unsigned char intf_mem_inb_2500(const struct si_sm_io *io,
+ unsigned int offset)
+{
+ ctl_timeout_reset();
+
+ return readb((io->addr)+(offset * io->regspacing));
+}
+
+static unsigned char intf_mem_inw_2500(const struct si_sm_io *io,
+ unsigned int offset)
+{
+ ctl_timeout_reset();
+
+ return (readw((io->addr)+(offset * io->regspacing)) >> io->regshift)
+ & 0xff;
+}
+
+static unsigned char intf_mem_inl_2500(const struct si_sm_io *io,
+ unsigned int offset)
+{
+ ctl_timeout_reset();
+
+ return (readl((io->addr)+(offset * io->regspacing)) >> io->regshift)
+ & 0xff;
+}
+
+static unsigned char mem_inq_2500(const struct si_sm_io *io,
+ unsigned int offset)
+{
+ ctl_timeout_reset();
+
+ return (readq((io->addr)+(offset * io->regspacing)) >> io->regshift)
+ & 0xff;
+}
+#else
+#define intf_mem_inb_2500 intf_mem_inb
+#define intf_mem_inw_2500 intf_mem_inw
+#define intf_mem_inl_2500 intf_mem_inl
+#define mem_inq_2500 mem_inq
+#endif
+
+static bool apply_phytium2500_workaround;
+
+#ifdef CONFIG_ACPI
+struct ipmi_workaround_oem_info {
+ char oem_id[ACPI_OEM_ID_SIZE + 1];
+};
+
+static struct ipmi_workaround_oem_info wa_info[] = {
+ {
+ .oem_id = "KPSVVJ",
+ }
+};
+
+static void ipmi_check_phytium_workaround(void)
+{
+ struct acpi_table_header tbl;
+ int i;
+
+ if (ACPI_FAILURE(acpi_get_table_header(ACPI_SIG_DSDT, 0, &tbl)))
+ return;
+
+ for (i = 0; i < ARRAY_SIZE(wa_info); i++) {
+ if (strncmp(wa_info[i].oem_id, tbl.oem_id, ACPI_OEM_ID_SIZE))
+ continue;
+
+ apply_phytium2500_workaround = true;
+ break;
+ }
+}
+#else
+static void ipmi_check_phytium_workaround(void)
+{
+ apply_phytium2500_workaround = false;
+}
+#endif
+
static unsigned char intf_mem_inb(const struct si_sm_io *io,
unsigned int offset)
{
@@ -81,26 +180,36 @@ int ipmi_si_mem_setup(struct si_sm_io *io)
if (!addr)
return -ENODEV;
+ ipmi_check_phytium_workaround();
+
/*
* Figure out the actual readb/readw/readl/etc routine to use based
* upon the register size.
*/
switch (io->regsize) {
case 1:
- io->inputb = intf_mem_inb;
+ io->inputb = apply_phytium2500_workaround ?
+ intf_mem_inb_2500 :
+ intf_mem_inb;
io->outputb = intf_mem_outb;
break;
case 2:
- io->inputb = intf_mem_inw;
+ io->inputb = apply_phytium2500_workaround ?
+ intf_mem_inw_2500 :
+ intf_mem_inw;
io->outputb = intf_mem_outw;
break;
case 4:
- io->inputb = intf_mem_inl;
+ io->inputb = apply_phytium2500_workaround ?
+ intf_mem_inl_2500 :
+ intf_mem_inl;
io->outputb = intf_mem_outl;
break;
#ifdef readq
case 8:
- io->inputb = mem_inq;
+ io->inputb = apply_phytium2500_workaround ?
+ mem_inq_2500 :
+ mem_inq;
io->outputb = mem_outq;
break;
#endif
--
2.22.0

20 Jan '22
From: Smita Koralahalli <Smita.KoralahalliChannabasappa(a)amd.com>
mainline inclusion
from mainline-v5.13-rc1
commit da66658638c947cab0fb157289f03698453ff8d5
category: feature
feature: milan cpu
bugzilla: https://gitee.com/openeuler/kernel/issues/I4OBEH?from=project-issue
CVE: NA
--------------------------------
Add PMU events for AMD Zen3 processors as documented in the AMD Processor
Programming Reference for Family 19h and Model 01h [1].

Below are the events that are new on Zen3:

PMCx041 ls_mab_alloc.{all_allocations|hardware_prefetcher_allocations|load_store_allocations}
PMCx043 ls_dmnd_fills_from_sys.ext_cache_local
PMCx044 ls_any_fills_from_sys.{mem_io_remote|ext_cache_remote|mem_io_local|ext_cache_local|int_cache|lcl_l2}
PMCx047 ls_misal_loads.{ma4k|ma64}
PMCx059 ls_sw_pf_dc_fills.ext_cache_local
PMCx05a ls_hw_pf_dc_fills.ext_cache_local
PMCx05f ls_alloc_mab_count
PMCx085 bp_l1_tlb_miss_l2_tlb_miss.coalesced_4k
PMCx0ab de_dis_cops_from_decoder.disp_op_type.{any_integer_dispatch|any_fp_dispatch}
PMCx0cc ex_ret_ind_brch_instr
PMCx18e ic_tag_hit_miss.{all_instruction_cache_accesses|instruction_cache_miss|instruction_cache_hit}
PMCx1c7 ex_ret_msprd_brnch_instr_dir_msmtch
PMCx28f op_cache_hit_miss.{all_op_cache_accesses|op_cache_miss|op_cache_hit}

Section 2.1.17.2 "Performance Measurement" of "PPR for AMD Family 19h,
Model 01h, Revision B1 Processors - 55898 Rev 0.35 - Feb 5, 2021." lists
new metrics. Add them.

Preserve the events for Zen3 if they are measurable and non-zero, as taken
from the Zen2 directory, even if the PPR of Zen3 [1] omits them. Those
events are the following:

PMCx000 fpu_pipe_assignment.{total|total0|total1|total2|total3}
PMCx004 fp_num_mov_elim_scal_op.{optimized|opt_potential|sse_mov_ops_elim|sse_mov_ops}
PMCx02D ls_rdtsc
PMCx040 ls_dc_accesses
PMCx046 ls_tablewalker.{iside|ic_type1|ic_type0|dside|dc_type1|dc_type0}
PMCx061 l2_request_g2.{group1|ls_rd_sized|ls_rd_sized_nc|ic_rd_sized|ic_rd_sized_nc|smc_inval|bus_lock_originator|bus_locks_responses}
PMCx062 l2_latency.l2_cycles_waiting_on_fills
PMCx063 l2_wcb_req.{wcb_write|wcb_close|zero_byte_store|cl_zero}
PMCx06d l2_fill_pending.l2_fill_busy
PMCx080 ic_fw32
PMCx081 ic_fw32_miss
PMCx086 bp_snp_re_sync
PMCx087 ic_fetch_stall.{ic_stall_any|ic_stall_dq_empty|ic_stall_back_pressure}
PMCx08a bp_l1_btb_correct
PMCx08c ic_cache_inval.{l2_invalidating_probe|fill_invalidated}
PMCx099 bp_tlb_rel
PMCx0a9 de_dis_uop_queue_empty_di0
PMCx0c7 ex_ret_brn_resync
PMCx28a ic_oc_mode_switch.{oc_ic_mode_switch|ic_oc_mode_switch}
L3PMCx01 l3_request_g1.caching_l3_cache_accesses
L3PMCx06 l3_comb_clstr_state.{other_l3_miss_typs|request_miss}

[1] Processor Programming Reference (PPR) for AMD Family 19h, Model 01h,
Revision B1 Processors - 55898 Rev 0.35 - Feb 5, 2021.

[2] Processor Programming Reference (PPR) for AMD Family 17h Model 71h,
Revision B0 Processors, 56176 Rev 3.06 - Jul 17, 2019.

[3] Processor Programming Reference (PPR) for AMD Family 17h Models
01h,08h, Revision B2 Processors, 54945 Rev 3.03 - Jun 14, 2019.

All of the PPRs can be found at:
https://bugzilla.kernel.org/show_bug.cgi?id=206537
Reviewed-by: Robert Richter <rrichter(a)amd.com>
Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa(a)amd.com>
Cc: Alexander Shishkin <alexander.shishkin(a)linux.intel.com>
Cc: Ian Rogers <irogers(a)google.com>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Jiri Olsa <jolsa(a)redhat.com>
Cc: Kim Phillips <kim.phillips(a)amd.com>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Martin Liška <mliska(a)suse.cz>
Cc: Michael Petlan <mpetlan(a)redhat.com>
Cc: Namhyung Kim <namhyung(a)kernel.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Vijay Thakkar <vijaythakkar(a)me.com>
Cc: linux-perf-users(a)vger.kernel.org
Link: https://lore.kernel.org/r/20210406215944.113332-5-Smita.KoralahalliChannaba…
Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
Reviewed-by: Yang Jihong <yangjihong1(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
.../pmu-events/arch/x86/amdzen3/branch.json | 53 +++
.../pmu-events/arch/x86/amdzen3/cache.json | 402 ++++++++++++++++
.../pmu-events/arch/x86/amdzen3/core.json | 137 ++++++
.../arch/x86/amdzen3/data-fabric.json | 98 ++++
.../arch/x86/amdzen3/floating-point.json | 139 ++++++
.../pmu-events/arch/x86/amdzen3/memory.json | 428 ++++++++++++++++++
.../pmu-events/arch/x86/amdzen3/other.json | 103 +++++
.../arch/x86/amdzen3/recommended.json | 214 +++++++++
tools/perf/pmu-events/arch/x86/mapfile.csv | 2 +-
9 files changed, 1575 insertions(+), 1 deletion(-)
create mode 100644 tools/perf/pmu-events/arch/x86/amdzen3/branch.json
create mode 100644 tools/perf/pmu-events/arch/x86/amdzen3/cache.json
create mode 100644 tools/perf/pmu-events/arch/x86/amdzen3/core.json
create mode 100644 tools/perf/pmu-events/arch/x86/amdzen3/data-fabric.json
create mode 100644 tools/perf/pmu-events/arch/x86/amdzen3/floating-point.json
create mode 100644 tools/perf/pmu-events/arch/x86/amdzen3/memory.json
create mode 100644 tools/perf/pmu-events/arch/x86/amdzen3/other.json
create mode 100644 tools/perf/pmu-events/arch/x86/amdzen3/recommended.json
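
Once merged, these names resolve through perf's generated event tables, e.g.
'perf stat -e ls_misal_loads.ma64 -- <workload>'. Under the hood each entry
is just an event-select/unit-mask pair on the core PMU. Below is a minimal,
self-contained sketch (not from this patch; the event choice and the raw
encoding - unit mask in bits 8-15, event select in bits 0-7 - are taken as
assumptions from the JSON fields that follow) counting one of the new Zen3
events via perf_event_open():

	#include <linux/perf_event.h>
	#include <sys/ioctl.h>
	#include <sys/syscall.h>
	#include <stdint.h>
	#include <stdio.h>
	#include <string.h>
	#include <unistd.h>

	int main(void)
	{
		struct perf_event_attr attr;
		uint64_t count;
		int fd;

		memset(&attr, 0, sizeof(attr));
		attr.size = sizeof(attr);
		attr.type = PERF_TYPE_RAW;
		/* ls_misal_loads.ma64: EventCode 0x47, UMask 0x01 */
		attr.config = (0x01ULL << 8) | 0x47;
		attr.disabled = 1;
		attr.exclude_kernel = 1;

		fd = syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0);
		if (fd < 0) {
			perror("perf_event_open");
			return 1;
		}

		ioctl(fd, PERF_EVENT_IOC_RESET, 0);
		ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
		/* ... run the workload of interest here ... */
		ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

		if (read(fd, &count, sizeof(count)) == sizeof(count))
			printf("64B-crossing loads: %llu\n",
			       (unsigned long long)count);
		close(fd);
		return 0;
	}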
diff --git a/tools/perf/pmu-events/arch/x86/amdzen3/branch.json b/tools/perf/pmu-events/arch/x86/amdzen3/branch.json
new file mode 100644
index 000000000000..018a7fe94fb9
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/amdzen3/branch.json
@@ -0,0 +1,53 @@
+[
+ {
+ "EventName": "bp_l1_btb_correct",
+ "EventCode": "0x8a",
+ "BriefDescription": "L1 Branch Prediction Overrides Existing Prediction (speculative)."
+ },
+ {
+ "EventName": "bp_l2_btb_correct",
+ "EventCode": "0x8b",
+ "BriefDescription": "L2 Branch Prediction Overrides Existing Prediction (speculative)."
+ },
+ {
+ "EventName": "bp_dyn_ind_pred",
+ "EventCode": "0x8e",
+ "BriefDescription": "Dynamic Indirect Predictions.",
+ "PublicDescription": "The number of times a branch used the indirect predictor to make a prediction."
+ },
+ {
+ "EventName": "bp_de_redirect",
+ "EventCode": "0x91",
+ "BriefDescription": "Decode Redirects",
+ "PublicDescription": "The number of times the instruction decoder overrides the predicted target."
+ },
+ {
+ "EventName": "bp_l1_tlb_fetch_hit",
+ "EventCode": "0x94",
+ "BriefDescription": "The number of instruction fetches that hit in the L1 ITLB.",
+ "UMask": "0xff"
+ },
+ {
+ "EventName": "bp_l1_tlb_fetch_hit.if1g",
+ "EventCode": "0x94",
+ "BriefDescription": "The number of instruction fetches that hit in the L1 ITLB. L1 Instruction TLB hit (1G page size).",
+ "UMask": "0x04"
+ },
+ {
+ "EventName": "bp_l1_tlb_fetch_hit.if2m",
+ "EventCode": "0x94",
+ "BriefDescription": "The number of instruction fetches that hit in the L1 ITLB. L1 Instruction TLB hit (2M page size).",
+ "UMask": "0x02"
+ },
+ {
+ "EventName": "bp_l1_tlb_fetch_hit.if4k",
+ "EventCode": "0x94",
+ "BriefDescription": "The number of instruction fetches that hit in the L1 ITLB. L1 Instrcution TLB hit (4K or 16K page size).",
+ "UMask": "0x01"
+ },
+ {
+ "EventName": "bp_tlb_rel",
+ "EventCode": "0x99",
+ "BriefDescription": "The number of ITLB reload requests."
+ }
+]
diff --git a/tools/perf/pmu-events/arch/x86/amdzen3/cache.json b/tools/perf/pmu-events/arch/x86/amdzen3/cache.json
new file mode 100644
index 000000000000..fa1d7499a2e3
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/amdzen3/cache.json
@@ -0,0 +1,402 @@
+[
+ {
+ "EventName": "l2_request_g1.rd_blk_l",
+ "EventCode": "0x60",
+ "BriefDescription": "All L2 Cache Requests (Breakdown 1 - Common). Data cache reads (including hardware and software prefetch).",
+ "UMask": "0x80"
+ },
+ {
+ "EventName": "l2_request_g1.rd_blk_x",
+ "EventCode": "0x60",
+ "BriefDescription": "All L2 Cache Requests (Breakdown 1 - Common). Data cache stores.",
+ "UMask": "0x40"
+ },
+ {
+ "EventName": "l2_request_g1.ls_rd_blk_c_s",
+ "EventCode": "0x60",
+ "BriefDescription": "All L2 Cache Requests (Breakdown 1 - Common). Data cache shared reads.",
+ "UMask": "0x20"
+ },
+ {
+ "EventName": "l2_request_g1.cacheable_ic_read",
+ "EventCode": "0x60",
+ "BriefDescription": "All L2 Cache Requests (Breakdown 1 - Common). Instruction cache reads.",
+ "UMask": "0x10"
+ },
+ {
+ "EventName": "l2_request_g1.change_to_x",
+ "EventCode": "0x60",
+ "BriefDescription": "All L2 Cache Requests (Breakdown 1 - Common). Data cache state change requests. Request change to writable, check L2 for current state.",
+ "UMask": "0x08"
+ },
+ {
+ "EventName": "l2_request_g1.prefetch_l2_cmd",
+ "EventCode": "0x60",
+ "BriefDescription": "All L2 Cache Requests (Breakdown 1 - Common). PrefetchL2Cmd.",
+ "UMask": "0x04"
+ },
+ {
+ "EventName": "l2_request_g1.l2_hw_pf",
+ "EventCode": "0x60",
+ "BriefDescription": "All L2 Cache Requests (Breakdown 1 - Common). L2 Prefetcher. All prefetches accepted by L2 pipeline, hit or miss. Types of PF and L2 hit/miss broken out in a separate perfmon event.",
+ "UMask": "0x02"
+ },
+ {
+ "EventName": "l2_request_g1.group2",
+ "EventCode": "0x60",
+ "BriefDescription": "Miscellaneous events covered in more detail by l2_request_g2 (PMCx061).",
+ "UMask": "0x01"
+ },
+ {
+ "EventName": "l2_request_g1.all_no_prefetch",
+ "EventCode": "0x60",
+ "UMask": "0xf9"
+ },
+ {
+ "EventName": "l2_request_g2.group1",
+ "EventCode": "0x61",
+ "BriefDescription": "Miscellaneous events covered in more detail by l2_request_g1 (PMCx060).",
+ "UMask": "0x80"
+ },
+ {
+ "EventName": "l2_request_g2.ls_rd_sized",
+ "EventCode": "0x61",
+ "BriefDescription": "All L2 Cache Requests (Breakdown 2 - Rare). Data cache read sized.",
+ "UMask": "0x40"
+ },
+ {
+ "EventName": "l2_request_g2.ls_rd_sized_nc",
+ "EventCode": "0x61",
+ "BriefDescription": "All L2 Cache Requests (Breakdown 2 - Rare). Data cache read sized non-cacheable.",
+ "UMask": "0x20"
+ },
+ {
+ "EventName": "l2_request_g2.ic_rd_sized",
+ "EventCode": "0x61",
+ "BriefDescription": "All L2 Cache Requests (Breakdown 2 - Rare). Instruction cache read sized.",
+ "UMask": "0x10"
+ },
+ {
+ "EventName": "l2_request_g2.ic_rd_sized_nc",
+ "EventCode": "0x61",
+ "BriefDescription": "All L2 Cache Requests (Breakdown 2 - Rare). Instruction cache read sized non-cacheable.",
+ "UMask": "0x08"
+ },
+ {
+ "EventName": "l2_request_g2.smc_inval",
+ "EventCode": "0x61",
+ "BriefDescription": "All L2 Cache Requests (Breakdown 2 - Rare). Self-modifying code invalidates.",
+ "UMask": "0x04"
+ },
+ {
+ "EventName": "l2_request_g2.bus_locks_originator",
+ "EventCode": "0x61",
+ "BriefDescription": "All L2 Cache Requests (Breakdown 2 - Rare). Bus locks.",
+ "UMask": "0x02"
+ },
+ {
+ "EventName": "l2_request_g2.bus_locks_responses",
+ "EventCode": "0x61",
+ "BriefDescription": "All L2 Cache Requests (Breakdown 2 - Rare). Bus lock response.",
+ "UMask": "0x01"
+ },
+ {
+ "EventName": "l2_latency.l2_cycles_waiting_on_fills",
+ "EventCode": "0x62",
+ "BriefDescription": "Total cycles spent waiting for L2 fills to complete from L3 or memory, divided by four. Event counts are for both threads. To calculate average latency, the number of fills from both threads must be used.",
+ "UMask": "0x01"
+ },
+ {
+ "EventName": "l2_wcb_req.wcb_write",
+ "EventCode": "0x63",
+ "BriefDescription": "LS to L2 WCB write requests. LS (Load/Store unit) to L2 WCB (Write Combining Buffer) write requests.",
+ "UMask": "0x40"
+ },
+ {
+ "EventName": "l2_wcb_req.wcb_close",
+ "EventCode": "0x63",
+ "BriefDescription": "LS to L2 WCB close requests. LS (Load/Store unit) to L2 WCB (Write Combining Buffer) close requests.",
+ "UMask": "0x20"
+ },
+ {
+ "EventName": "l2_wcb_req.zero_byte_store",
+ "EventCode": "0x63",
+ "BriefDescription": "LS to L2 WCB zero byte store requests. LS (Load/Store unit) to L2 WCB (Write Combining Buffer) zero byte store requests.",
+ "UMask": "0x04"
+ },
+ {
+ "EventName": "l2_wcb_req.cl_zero",
+ "EventCode": "0x63",
+ "BriefDescription": "LS to L2 WCB cache line zeroing requests. LS (Load/Store unit) to L2 WCB (Write Combining Buffer) cache line zeroing requests.",
+ "UMask": "0x01"
+ },
+ {
+ "EventName": "l2_cache_req_stat.ls_rd_blk_cs",
+ "EventCode": "0x64",
+ "BriefDescription": "Core to L2 cacheable request access status (not including L2 Prefetch). Data cache shared read hit in L2",
+ "UMask": "0x80"
+ },
+ {
+ "EventName": "l2_cache_req_stat.ls_rd_blk_l_hit_x",
+ "EventCode": "0x64",
+ "BriefDescription": "Core to L2 cacheable request access status (not including L2 Prefetch). Data cache read hit in L2. Modifiable.",
+ "UMask": "0x40"
+ },
+ {
+ "EventName": "l2_cache_req_stat.ls_rd_blk_l_hit_s",
+ "EventCode": "0x64",
+ "BriefDescription": "Core to L2 cacheable request access status (not including L2 Prefetch). Data cache read hit non-modifiable line in L2.",
+ "UMask": "0x20"
+ },
+ {
+ "EventName": "l2_cache_req_stat.ls_rd_blk_x",
+ "EventCode": "0x64",
+ "BriefDescription": "Core to L2 cacheable request access status (not including L2 Prefetch). Data cache store or state change hit in L2.",
+ "UMask": "0x10"
+ },
+ {
+ "EventName": "l2_cache_req_stat.ls_rd_blk_c",
+ "EventCode": "0x64",
+ "BriefDescription": "Core to L2 cacheable request access status (not including L2 Prefetch). Data cache request miss in L2 (all types). Use l2_cache_misses_from_dc_misses instead.",
+ "UMask": "0x08"
+ },
+ {
+ "EventName": "l2_cache_req_stat.ic_fill_hit_x",
+ "EventCode": "0x64",
+ "BriefDescription": "Core to L2 cacheable request access status (not including L2 Prefetch). Instruction cache hit modifiable line in L2.",
+ "UMask": "0x04"
+ },
+ {
+ "EventName": "l2_cache_req_stat.ic_fill_hit_s",
+ "EventCode": "0x64",
+ "BriefDescription": "Core to L2 cacheable request access status (not including L2 Prefetch). Instruction cache hit non-modifiable line in L2.",
+ "UMask": "0x02"
+ },
+ {
+ "EventName": "l2_cache_req_stat.ic_fill_miss",
+ "EventCode": "0x64",
+ "BriefDescription": "Core to L2 cacheable request access status (not including L2 Prefetch). Instruction cache request miss in L2. Use l2_cache_misses_from_ic_miss instead.",
+ "UMask": "0x01"
+ },
+ {
+ "EventName": "l2_cache_req_stat.ic_access_in_l2",
+ "EventCode": "0x64",
+ "BriefDescription": "Core to L2 cacheable request access status (not including L2 Prefetch). Instruction cache requests in L2.",
+ "UMask": "0x07"
+ },
+ {
+ "EventName": "l2_cache_req_stat.ic_dc_miss_in_l2",
+ "EventCode": "0x64",
+ "BriefDescription": "Core to L2 cacheable request access status (not including L2 Prefetch). Instruction cache request miss in L2 and Data cache request miss in L2 (all types).",
+ "UMask": "0x09"
+ },
+ {
+ "EventName": "l2_cache_req_stat.ic_dc_hit_in_l2",
+ "EventCode": "0x64",
+ "BriefDescription": "Core to L2 cacheable request access status (not including L2 Prefetch). Instruction cache request hit in L2 and Data cache request hit in L2 (all types).",
+ "UMask": "0xf6"
+ },
+ {
+ "EventName": "l2_fill_pending.l2_fill_busy",
+ "EventCode": "0x6d",
+ "BriefDescription": "Cycles with fill pending from L2. Total cycles spent with one or more fill requests in flight from L2.",
+ "UMask": "0x01"
+ },
+ {
+ "EventName": "l2_pf_hit_l2",
+ "EventCode": "0x70",
+ "BriefDescription": "L2 prefetch hit in L2. Use l2_cache_hits_from_l2_hwpf instead.",
+ "UMask": "0xff"
+ },
+ {
+ "EventName": "l2_pf_miss_l2_hit_l3",
+ "EventCode": "0x71",
+ "BriefDescription": "L2 prefetcher hits in L3. Counts all L2 prefetches accepted by the L2 pipeline which miss the L2 cache and hit the L3.",
+ "UMask": "0xff"
+ },
+ {
+ "EventName": "l2_pf_miss_l2_l3",
+ "EventCode": "0x72",
+ "BriefDescription": "L2 prefetcher misses in L3. Counts all L2 prefetches accepted by the L2 pipeline which miss the L2 and the L3 caches.",
+ "UMask": "0xff"
+ },
+ {
+ "EventName": "ic_fw32",
+ "EventCode": "0x80",
+ "BriefDescription": "The number of 32B fetch windows transferred from IC pipe to DE instruction decoder (includes non-cacheable and cacheable fill responses)."
+ },
+ {
+ "EventName": "ic_fw32_miss",
+ "EventCode": "0x81",
+ "BriefDescription": "The number of 32B fetch windows tried to read the L1 IC and missed in the full tag."
+ },
+ {
+ "EventName": "ic_cache_fill_l2",
+ "EventCode": "0x82",
+ "BriefDescription": "Instruction Cache Refills from L2. The number of 64 byte instruction cache line was fulfilled from the L2 cache."
+ },
+ {
+ "EventName": "ic_cache_fill_sys",
+ "EventCode": "0x83",
+ "BriefDescription": "Instruction Cache Refills from System. The number of 64 byte instruction cache line fulfilled from system memory or another cache."
+ },
+ {
+ "EventName": "bp_l1_tlb_miss_l2_tlb_hit",
+ "EventCode": "0x84",
+ "BriefDescription": "L1 ITLB Miss, L2 ITLB Hit. The number of instruction fetches that miss in the L1 ITLB but hit in the L2 ITLB."
+ },
+ {
+ "EventName": "bp_l1_tlb_miss_l2_tlb_miss",
+ "EventCode": "0x85",
+ "BriefDescription": "The number of instruction fetches that miss in both the L1 and L2 TLBs.",
+ "UMask": "0xff"
+ },
+ {
+ "EventName": "bp_l1_tlb_miss_l2_tlb_miss.coalesced_4k",
+ "EventCode": "0x85",
+ "BriefDescription": "The number of valid fills into the ITLB originating from the LS Page-Table Walker. Tablewalk requests are issued for L1-ITLB and L2-ITLB misses. Walk for >4K Coalesced page.",
+ "UMask": "0x08"
+ },
+ {
+ "EventName": "bp_l1_tlb_miss_l2_tlb_miss.if1g",
+ "EventCode": "0x85",
+ "BriefDescription": "The number of valid fills into the ITLB originating from the LS Page-Table Walker. Tablewalk requests are issued for L1-ITLB and L2-ITLB misses. Walk for 1G page.",
+ "UMask": "0x04"
+ },
+ {
+ "EventName": "bp_l1_tlb_miss_l2_tlb_miss.if2m",
+ "EventCode": "0x85",
+ "BriefDescription": "The number of valid fills into the ITLB originating from the LS Page-Table Walker. Tablewalk requests are issued for L1-ITLB and L2-ITLB misses. Walk for 2M page.",
+ "UMask": "0x02"
+ },
+ {
+ "EventName": "bp_l1_tlb_miss_l2_tlb_miss.if4k",
+ "EventCode": "0x85",
+ "BriefDescription": "The number of valid fills into the ITLB originating from the LS Page-Table Walker. Tablewalk requests are issued for L1-ITLB and L2-ITLB misses. Walk to 4K page.",
+ "UMask": "0x01"
+ },
+ {
+ "EventName": "bp_snp_re_sync",
+ "EventCode": "0x86",
+ "BriefDescription": "The number of pipeline restarts caused by invalidating probes that hit on the instruction stream currently being executed. This would happen if the active instruction stream was being modified by another processor in an MP system - typically a highly unlikely event."
+ },
+ {
+ "EventName": "ic_fetch_stall.ic_stall_any",
+ "EventCode": "0x87",
+ "BriefDescription": "Instruction Pipe Stall. IC pipe was stalled during this clock cycle for any reason (nothing valid in pipe ICM1).",
+ "UMask": "0x04"
+ },
+ {
+ "EventName": "ic_fetch_stall.ic_stall_dq_empty",
+ "EventCode": "0x87",
+ "BriefDescription": "Instruction Pipe Stall. IC pipe was stalled during this clock cycle (including IC to OC fetches) due to DQ empty.",
+ "UMask": "0x02"
+ },
+ {
+ "EventName": "ic_fetch_stall.ic_stall_back_pressure",
+ "EventCode": "0x87",
+ "BriefDescription": "Instruction Pipe Stall. IC pipe was stalled during this clock cycle (including IC to OC fetches) due to back-pressure.",
+ "UMask": "0x01"
+ },
+ {
+ "EventName": "ic_cache_inval.l2_invalidating_probe",
+ "EventCode": "0x8c",
+ "BriefDescription": "IC line invalidated due to L2 invalidating probe (external or LS). The number of instruction cache lines invalidated. A non-SMC event is CMC (cross modifying code), either from the other thread of the core or another core.",
+ "UMask": "0x02"
+ },
+ {
+ "EventName": "ic_cache_inval.fill_invalidated",
+ "EventCode": "0x8c",
+ "BriefDescription": "IC line invalidated due to overwriting fill response. The number of instruction cache lines invalidated. A non-SMC event is CMC (cross modifying code), either from the other thread of the core or another core.",
+ "UMask": "0x01"
+ },
+ {
+ "EventName": "ic_tag_hit_miss.all_instruction_cache_accesses",
+ "EventCode": "0x18e",
+ "BriefDescription": "All Instruction Cache Accesses. Counts various IC tag related hit and miss events.",
+ "UMask": "0x1f"
+ },
+ {
+ "EventName": "ic_tag_hit_miss.instruction_cache_miss",
+ "EventCode": "0x18e",
+ "BriefDescription": "Instruction Cache Miss. Counts various IC tag related hit and miss events.",
+ "UMask": "0x18"
+ },
+ {
+ "EventName": "ic_tag_hit_miss.instruction_cache_hit",
+ "EventCode": "0x18e",
+ "BriefDescription": "Instruction Cache Hit. Counts various IC tag related hit and miss events.",
+ "UMask": "0x07"
+ },
+ {
+ "EventName": "ic_oc_mode_switch.oc_ic_mode_switch",
+ "EventCode": "0x28a",
+ "BriefDescription": "OC Mode Switch. OC to IC mode switch.",
+ "UMask": "0x02"
+ },
+ {
+ "EventName": "ic_oc_mode_switch.ic_oc_mode_switch",
+ "EventCode": "0x28a",
+ "BriefDescription": "OC Mode Switch. IC to OC mode switch.",
+ "UMask": "0x01"
+ },
+ {
+ "EventName": "op_cache_hit_miss.all_op_cache_accesses",
+ "EventCode": "0x28f",
+ "BriefDescription": "All Op Cache accesses. Counts Op Cache micro-tag hit/miss events",
+ "UMask": "0x07"
+ },
+ {
+ "EventName": "op_cache_hit_miss.op_cache_miss",
+ "EventCode": "0x28f",
+ "BriefDescription": "Op Cache Miss. Counts Op Cache micro-tag hit/miss events",
+ "UMask": "0x04"
+ },
+ {
+ "EventName": "op_cache_hit_miss.op_cache_hit",
+ "EventCode": "0x28f",
+ "BriefDescription": "Op Cache Hit. Counts Op Cache micro-tag hit/miss events",
+ "UMask": "0x03"
+ },
+ {
+ "EventName": "l3_request_g1.caching_l3_cache_accesses",
+ "EventCode": "0x01",
+ "BriefDescription": "Caching: L3 cache accesses",
+ "UMask": "0x80",
+ "Unit": "L3PMC"
+ },
+ {
+ "EventName": "l3_lookup_state.all_l3_req_typs",
+ "EventCode": "0x04",
+ "BriefDescription": "All L3 Request Types. All L3 cache Requests",
+ "UMask": "0xff",
+ "Unit": "L3PMC"
+ },
+ {
+ "EventName": "l3_comb_clstr_state.other_l3_miss_typs",
+ "EventCode": "0x06",
+ "BriefDescription": "Other L3 Miss Request Types",
+ "UMask": "0xfe",
+ "Unit": "L3PMC"
+ },
+ {
+ "EventName": "l3_comb_clstr_state.request_miss",
+ "EventCode": "0x06",
+ "BriefDescription": "L3 cache misses",
+ "UMask": "0x01",
+ "Unit": "L3PMC"
+ },
+ {
+ "EventName": "xi_sys_fill_latency",
+ "EventCode": "0x90",
+ "BriefDescription": "L3 Cache Miss Latency. Total cycles for all transactions divided by 16. Ignores SliceMask and ThreadMask.",
+ "Unit": "L3PMC"
+ },
+ {
+ "EventName": "xi_ccx_sdp_req1",
+ "EventCode": "0x9a",
+ "BriefDescription": "L3 Misses by Request Type. Ignores SliceID, EnAllSlices, CoreID, EnAllCores and ThreadMask. Requires unit mask 0xFF to engage event for counting.",
+ "UMask": "0xff",
+ "Unit": "L3PMC"
+ }
+]
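
A note on the two xi_* entries above: since xi_sys_fill_latency accumulates
total transaction cycles divided by 16 and xi_ccx_sdp_req1 counts the L3
misses themselves, an average L3 read-miss latency in core clocks can be
estimated (a sketch following those divided-by-16 semantics) as:

	avg_l3_miss_latency ~= 16 * xi_sys_fill_latency / xi_ccx_sdp_req1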
diff --git a/tools/perf/pmu-events/arch/x86/amdzen3/core.json b/tools/perf/pmu-events/arch/x86/amdzen3/core.json
new file mode 100644
index 000000000000..4e27a2be359e
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/amdzen3/core.json
@@ -0,0 +1,137 @@
+[
+ {
+ "EventName": "ex_ret_instr",
+ "EventCode": "0xc0",
+ "BriefDescription": "Retired Instructions."
+ },
+ {
+ "EventName": "ex_ret_ops",
+ "EventCode": "0xc1",
+ "BriefDescription": "Retired Ops. Use macro_ops_retired instead.",
+ "PublicDescription": "The number of macro-ops retired."
+ },
+ {
+ "EventName": "ex_ret_brn",
+ "EventCode": "0xc2",
+ "BriefDescription": "Retired Branch Instructions.",
+ "PublicDescription": "The number of branch instructions retired. This includes all types of architectural control flow changes, including exceptions and interrupts."
+ },
+ {
+ "EventName": "ex_ret_brn_misp",
+ "EventCode": "0xc3",
+ "BriefDescription": "Retired Branch Instructions Mispredicted.",
+ "PublicDescription": "The number of retired branch instructions, that were mispredicted."
+ },
+ {
+ "EventName": "ex_ret_brn_tkn",
+ "EventCode": "0xc4",
+ "BriefDescription": "Retired Taken Branch Instructions.",
+ "PublicDescription": "The number of taken branches that were retired. This includes all types of architectural control flow changes, including exceptions and interrupts."
+ },
+ {
+ "EventName": "ex_ret_brn_tkn_misp",
+ "EventCode": "0xc5",
+ "BriefDescription": "Retired Taken Branch Instructions Mispredicted.",
+ "PublicDescription": "The number of retired taken branch instructions that were mispredicted."
+ },
+ {
+ "EventName": "ex_ret_brn_far",
+ "EventCode": "0xc6",
+ "BriefDescription": "Retired Far Control Transfers.",
+ "PublicDescription": "The number of far control transfers retired including far call/jump/return, IRET, SYSCALL and SYSRET, plus exceptions and interrupts. Far control transfers are not subject to branch prediction."
+ },
+ {
+ "EventName": "ex_ret_brn_resync",
+ "EventCode": "0xc7",
+ "BriefDescription": "Retired Branch Resyncs.",
+ "PublicDescription": "The number of resync branches. These reflect pipeline restarts due to certain microcode assists and events such as writes to the active instruction stream, among other things. Each occurrence reflects a restart penalty similar to a branch mispredict. This is relatively rare."
+ },
+ {
+ "EventName": "ex_ret_near_ret",
+ "EventCode": "0xc8",
+ "BriefDescription": "Retired Near Returns.",
+ "PublicDescription": "The number of near return instructions (RET or RET Iw) retired."
+ },
+ {
+ "EventName": "ex_ret_near_ret_mispred",
+ "EventCode": "0xc9",
+ "BriefDescription": "Retired Near Returns Mispredicted.",
+ "PublicDescription": "The number of near returns retired that were not correctly predicted by the return address predictor. Each such mispredict incurs the same penalty as a mispredicted conditional branch instruction."
+ },
+ {
+ "EventName": "ex_ret_brn_ind_misp",
+ "EventCode": "0xca",
+ "BriefDescription": "Retired Indirect Branch Instructions Mispredicted.",
+ "PublicDescription": "The number of indirect branches retired that were not correctly predicted. Each such mispredict incurs the same penalty as a mispredicted conditional branch instruction. Note that only EX mispredicts are counted."
+ },
+ {
+ "EventName": "ex_ret_mmx_fp_instr.sse_instr",
+ "EventCode": "0xcb",
+ "BriefDescription": "SSE instructions (SSE, SSE2, SSE3, SSSE3, SSE4A, SSE41, SSE42, AVX).",
+ "PublicDescription": "The number of MMX, SSE or x87 instructions retired. The UnitMask allows the selection of the individual classes of instructions as given in the table. Each increment represents one complete instruction. Since this event includes non-numeric instructions it is not suitable for measuring MFLOPS.",
+ "UMask": "0x04"
+ },
+ {
+ "EventName": "ex_ret_mmx_fp_instr.mmx_instr",
+ "EventCode": "0xcb",
+ "BriefDescription": "MMX instructions.",
+ "PublicDescription": "The number of MMX, SSE or x87 instructions retired. The UnitMask allows the selection of the individual classes of instructions as given in the table. Each increment represents one complete instruction. Since this event includes non-numeric instructions it is not suitable for measuring MFLOPS. MMX instructions.",
+ "UMask": "0x02"
+ },
+ {
+ "EventName": "ex_ret_mmx_fp_instr.x87_instr",
+ "EventCode": "0xcb",
+ "BriefDescription": "x87 instructions.",
+ "PublicDescription": "The number of MMX, SSE or x87 instructions retired. The UnitMask allows the selection of the individual classes of instructions as given in the table. Each increment represents one complete instruction. Since this event includes non-numeric instructions it is not suitable for measuring MFLOPS. x87 instructions.",
+ "UMask": "0x01"
+ },
+ {
+ "EventName": "ex_ret_ind_brch_instr",
+ "EventCode": "0xcc",
+ "BriefDescription": "Retired Indirect Branch Instructions. The number of indirect branches retired."
+ },
+ {
+ "EventName": "ex_ret_cond",
+ "EventCode": "0xd1",
+ "BriefDescription": "Retired Conditional Branch Instructions."
+ },
+ {
+ "EventName": "ex_div_busy",
+ "EventCode": "0xd3",
+ "BriefDescription": "Div Cycles Busy count."
+ },
+ {
+ "EventName": "ex_div_count",
+ "EventCode": "0xd4",
+ "BriefDescription": "Div Op Count."
+ },
+ {
+ "EventName": "ex_ret_msprd_brnch_instr_dir_msmtch",
+ "EventCode": "0x1c7",
+ "BriefDescription": "Retired Mispredicted Branch Instructions due to Direction Mismatch",
+ "PublicDescription": "The number of retired conditional branch instructions that were not correctly predicted because of a branch direction mismatch."
+ },
+ {
+ "EventName": "ex_tagged_ibs_ops.ibs_count_rollover",
+ "EventCode": "0x1cf",
+ "BriefDescription": "Tagged IBS Ops. Number of times an op could not be tagged by IBS because of a previous tagged op that has not retired.",
+ "UMask": "0x04"
+ },
+ {
+ "EventName": "ex_tagged_ibs_ops.ibs_tagged_ops_ret",
+ "EventCode": "0x1cf",
+ "BriefDescription": "Tagged IBS Ops. Number of Ops tagged by IBS that retired.",
+ "UMask": "0x02"
+ },
+ {
+ "EventName": "ex_tagged_ibs_ops.ibs_tagged_ops",
+ "EventCode": "0x1cf",
+ "BriefDescription": "Tagged IBS Ops. Number of Ops tagged by IBS.",
+ "UMask": "0x01"
+ },
+ {
+ "EventName": "ex_ret_fused_instr",
+ "EventCode": "0x1d0",
+ "BriefDescription": "Counts retired Fused Instructions."
+ }
+]
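
Since these are all retire-based counts, simple derived ratios fall out
directly, e.g. branch misprediction rate ~= ex_ret_brn_misp / ex_ret_brn,
or mispredicts per kilo-instruction ~= 1000 * ex_ret_brn_misp / ex_ret_instr.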
diff --git a/tools/perf/pmu-events/arch/x86/amdzen3/data-fabric.json b/tools/perf/pmu-events/arch/x86/amdzen3/data-fabric.json
new file mode 100644
index 000000000000..40271df40015
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/amdzen3/data-fabric.json
@@ -0,0 +1,98 @@
+[
+ {
+ "EventName": "remote_outbound_data_controller_0",
+ "PublicDescription": "Remote Link Controller Outbound Packet Types: Data (32B): Remote Link Controller 0",
+ "EventCode": "0x7c7",
+ "UMask": "0x02",
+ "PerPkg": "1",
+ "Unit": "DFPMC"
+ },
+ {
+ "EventName": "remote_outbound_data_controller_1",
+ "PublicDescription": "Remote Link Controller Outbound Packet Types: Data (32B): Remote Link Controller 1",
+ "EventCode": "0x807",
+ "UMask": "0x02",
+ "PerPkg": "1",
+ "Unit": "DFPMC"
+ },
+ {
+ "EventName": "remote_outbound_data_controller_2",
+ "PublicDescription": "Remote Link Controller Outbound Packet Types: Data (32B): Remote Link Controller 2",
+ "EventCode": "0x847",
+ "UMask": "0x02",
+ "PerPkg": "1",
+ "Unit": "DFPMC"
+ },
+ {
+ "EventName": "remote_outbound_data_controller_3",
+ "PublicDescription": "Remote Link Controller Outbound Packet Types: Data (32B): Remote Link Controller 3",
+ "EventCode": "0x887",
+ "UMask": "0x02",
+ "PerPkg": "1",
+ "Unit": "DFPMC"
+ },
+ {
+ "EventName": "dram_channel_data_controller_0",
+ "PublicDescription": "DRAM Channel Controller Request Types: Requests with Data (64B): DRAM Channel Controller 0",
+ "EventCode": "0x07",
+ "UMask": "0x38",
+ "PerPkg": "1",
+ "Unit": "DFPMC"
+ },
+ {
+ "EventName": "dram_channel_data_controller_1",
+ "PublicDescription": "DRAM Channel Controller Request Types: Requests with Data (64B): DRAM Channel Controller 0",
+ "EventCode": "0x47",
+ "UMask": "0x38",
+ "PerPkg": "1",
+ "Unit": "DFPMC"
+ },
+ {
+ "EventName": "dram_channel_data_controller_2",
+ "PublicDescription": "DRAM Channel Controller Request Types: Requests with Data (64B): DRAM Channel Controller 0",
+ "EventCode": "0x87",
+ "UMask": "0x38",
+ "PerPkg": "1",
+ "Unit": "DFPMC"
+ },
+ {
+ "EventName": "dram_channel_data_controller_3",
+ "PublicDescription": "DRAM Channel Controller Request Types: Requests with Data (64B): DRAM Channel Controller 0",
+ "EventCode": "0xc7",
+ "UMask": "0x38",
+ "PerPkg": "1",
+ "Unit": "DFPMC"
+ },
+ {
+ "EventName": "dram_channel_data_controller_4",
+ "PublicDescription": "DRAM Channel Controller Request Types: Requests with Data (64B): DRAM Channel Controller 0",
+ "EventCode": "0x107",
+ "UMask": "0x38",
+ "PerPkg": "1",
+ "Unit": "DFPMC"
+ },
+ {
+ "EventName": "dram_channel_data_controller_5",
+ "PublicDescription": "DRAM Channel Controller Request Types: Requests with Data (64B): DRAM Channel Controller 0",
+ "EventCode": "0x147",
+ "UMask": "0x38",
+ "PerPkg": "1",
+ "Unit": "DFPMC"
+ },
+ {
+ "EventName": "dram_channel_data_controller_6",
+ "PublicDescription": "DRAM Channel Controller Request Types: Requests with Data (64B): DRAM Channel Controller 0",
+ "EventCode": "0x187",
+ "UMask": "0x38",
+ "PerPkg": "1",
+ "Unit": "DFPMC"
+ },
+ {
+ "EventName": "dram_channel_data_controller_7",
+ "PublicDescription": "DRAM Channel Controller Request Types: Requests with Data (64B): DRAM Channel Controller 0",
+ "EventCode": "0x1c7",
+ "UMask": "0x38",
+ "PerPkg": "1",
+ "Unit": "DFPMC"
+ }
+]
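
Because each dram_channel_data_controller_N increment represents one 64-byte
DRAM request, a rough per-socket DRAM bandwidth estimate (a sketch; the
number of populated channels varies by SKU, and these DFPMC events are
programmed through the amd_df uncore PMU) is:

	dram_bw_bytes_per_sec ~= 64 * sum(dram_channel_data_controller_0..7) / elapsed_seconds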
diff --git a/tools/perf/pmu-events/arch/x86/amdzen3/floating-point.json b/tools/perf/pmu-events/arch/x86/amdzen3/floating-point.json
new file mode 100644
index 000000000000..98cfcb9c78ec
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/amdzen3/floating-point.json
@@ -0,0 +1,139 @@
+[
+ {
+ "EventName": "fpu_pipe_assignment.total",
+ "EventCode": "0x00",
+ "BriefDescription": "Total number of fp uOps.",
+ "PublicDescription": "Total number of fp uOps. The number of operations (uOps) dispatched to each of the 4 FPU execution pipelines. This event reflects how busy the FPU pipelines are and may be used for workload characterization. This includes all operations performed by x87, MMX, and SSE instructions, including moves. Each increment represents a one- cycle dispatch event. This event is a speculative event. Since this event includes non-numeric operations it is not suitable for measuring MFLOPS.",
+ "UMask": "0x0f"
+ },
+ {
+ "EventName": "fpu_pipe_assignment.total3",
+ "EventCode": "0x00",
+ "BriefDescription": "Total number uOps assigned to pipe 3.",
+ "PublicDescription": "The number of operations (uOps) dispatched to each of the 4 FPU execution pipelines. This event reflects how busy the FPU pipelines are and may be used for workload characterization. This includes all operations performed by x87, MMX, and SSE instructions, including moves. Each increment represents a one-cycle dispatch event. This event is a speculative event. Since this event includes non-numeric operations it is not suitable for measuring MFLOPS. Total number uOps assigned to pipe 3.",
+ "UMask": "0x08"
+ },
+ {
+ "EventName": "fpu_pipe_assignment.total2",
+ "EventCode": "0x00",
+ "BriefDescription": "Total number uOps assigned to pipe 2.",
+ "PublicDescription": "The number of operations (uOps) dispatched to each of the 4 FPU execution pipelines. This event reflects how busy the FPU pipelines are and may be used for workload characterization. This includes all operations performed by x87, MMX, and SSE instructions, including moves. Each increment represents a one- cycle dispatch event. This event is a speculative event. Since this event includes non-numeric operations it is not suitable for measuring MFLOPS. Total number uOps assigned to pipe 2.",
+ "UMask": "0x04"
+ },
+ {
+ "EventName": "fpu_pipe_assignment.total1",
+ "EventCode": "0x00",
+ "BriefDescription": "Total number uOps assigned to pipe 1.",
+ "PublicDescription": "The number of operations (uOps) dispatched to each of the 4 FPU execution pipelines. This event reflects how busy the FPU pipelines are and may be used for workload characterization. This includes all operations performed by x87, MMX, and SSE instructions, including moves. Each increment represents a one- cycle dispatch event. This event is a speculative event. Since this event includes non-numeric operations it is not suitable for measuring MFLOPS. Total number uOps assigned to pipe 1.",
+ "UMask": "0x02"
+ },
+ {
+ "EventName": "fpu_pipe_assignment.total0",
+ "EventCode": "0x00",
+ "BriefDescription": "Total number of fp uOps on pipe 0.",
+ "PublicDescription": "The number of operations (uOps) dispatched to each of the 4 FPU execution pipelines. This event reflects how busy the FPU pipelines are and may be used for workload characterization. This includes all operations performed by x87, MMX, and SSE instructions, including moves. Each increment represents a one- cycle dispatch event. This event is a speculative event. Since this event includes non-numeric operations it is not suitable for measuring MFLOPS. Total number uOps assigned to pipe 0.",
+ "UMask": "0x01"
+ },
+ {
+ "EventName": "fp_ret_sse_avx_ops.all",
+ "EventCode": "0x03",
+ "BriefDescription": "All FLOPS. This is a retire-based event. The number of retired SSE/AVX FLOPS. The number of events logged per cycle can vary from 0 to 64. This event can count above 15.",
+ "UMask": "0xff"
+ },
+ {
+ "EventName": "fp_ret_sse_avx_ops.mac_flops",
+ "EventCode": "0x03",
+ "BriefDescription": "Multiply-Accumulate FLOPs. Each MAC operation is counted as 2 FLOPS. This is a retire-based event. The number of retired SSE/AVX FLOPs. The number of events logged per cycle can vary from 0 to 64. This event requires the use of the MergeEvent since it can count above 15 events per cycle. See 2.1.17.3 [Large Increment per Cycle Events]. It does not provide a useful count without the use of the MergeEvent.",
+ "UMask": "0x08"
+ },
+ {
+ "EventName": "fp_ret_sse_avx_ops.div_flops",
+ "EventCode": "0x03",
+ "BriefDescription": "Divide/square root FLOPs. This is a retire-based event. The number of retired SSE/AVX FLOPs. The number of events logged per cycle can vary from 0 to 64. This event requires the use of the MergeEvent since it can count above 15 events per cycle. See 2.1.17.3 [Large Increment per Cycle Events]. It does not provide a useful count without the use of the MergeEvent.",
+ "UMask": "0x04"
+ },
+ {
+ "EventName": "fp_ret_sse_avx_ops.mult_flops",
+ "EventCode": "0x03",
+ "BriefDescription": "Multiply FLOPs. This is a retire-based event. The number of retired SSE/AVX FLOPs. The number of events logged per cycle can vary from 0 to 64. This event requires the use of the MergeEvent since it can count above 15 events per cycle. See 2.1.17.3 [Large Increment per Cycle Events]. It does not provide a useful count without the use of the MergeEvent.",
+ "UMask": "0x02"
+ },
+ {
+ "EventName": "fp_ret_sse_avx_ops.add_sub_flops",
+ "EventCode": "0x03",
+ "BriefDescription": "Add/subtract FLOPs. This is a retire-based event. The number of retired SSE/AVX FLOPs. The number of events logged per cycle can vary from 0 to 64. This event requires the use of the MergeEvent since it can count above 15 events per cycle. See 2.1.17.3 [Large Increment per Cycle Events]. It does not provide a useful count without the use of the MergeEvent.",
+ "UMask": "0x01"
+ },
+ {
+ "EventName": "fp_num_mov_elim_scal_op.optimized",
+ "EventCode": "0x04",
+ "BriefDescription": "Number of Scalar Ops optimized. This is a dispatch based speculative event, and is useful for measuring the effectiveness of the Move elimination and Scalar code optimization schemes.",
+ "UMask": "0x08"
+ },
+ {
+ "EventName": "fp_num_mov_elim_scal_op.opt_potential",
+ "EventCode": "0x04",
+ "BriefDescription": "Number of Ops that are candidates for optimization (have Z-bit either set or pass). This is a dispatch based speculative event, and is useful for measuring the effectiveness of the Move elimination and Scalar code optimization schemes.",
+ "UMask": "0x04"
+ },
+ {
+ "EventName": "fp_num_mov_elim_scal_op.sse_mov_ops_elim",
+ "EventCode": "0x04",
+ "BriefDescription": "Number of SSE Move Ops eliminated. This is a dispatch based speculative event, and is useful for measuring the effectiveness of the Move elimination and Scalar code optimization schemes.",
+ "UMask": "0x02"
+ },
+ {
+ "EventName": "fp_num_mov_elim_scal_op.sse_mov_ops",
+ "EventCode": "0x04",
+ "BriefDescription": "Number of SSE Move Ops. This is a dispatch based speculative event, and is useful for measuring the effectiveness of the Move elimination and Scalar code optimization schemes.",
+ "UMask": "0x01"
+ },
+ {
+ "EventName": "fp_retired_ser_ops.sse_bot_ret",
+ "EventCode": "0x05",
+ "BriefDescription": "SSE/AVX bottom-executing ops retired. The number of serializing Ops retired.",
+ "UMask": "0x08"
+ },
+ {
+ "EventName": "fp_retired_ser_ops.sse_ctrl_ret",
+ "EventCode": "0x05",
+ "BriefDescription": "SSE/AVX control word mispredict traps. The number of serializing Ops retired.",
+ "UMask": "0x04"
+ },
+ {
+ "EventName": "fp_retired_ser_ops.x87_bot_ret",
+ "EventCode": "0x05",
+ "BriefDescription": "x87 bottom-executing ops retired. The number of serializing Ops retired.",
+ "UMask": "0x02"
+ },
+ {
+ "EventName": "fp_retired_ser_ops.x87_ctrl_ret",
+ "EventCode": "0x05",
+ "BriefDescription": "x87 control word mispredict traps due to mispredictions in RC or PC, or changes in mask bits. The number of serializing Ops retired.",
+ "UMask": "0x01"
+ },
+ {
+ "EventName": "fp_disp_faults.ymm_spill_fault",
+ "EventCode": "0x0e",
+ "BriefDescription": "Floating Point Dispatch Faults. YMM spill fault.",
+ "UMask": "0x08"
+ },
+ {
+ "EventName": "fp_disp_faults.ymm_fill_fault",
+ "EventCode": "0x0e",
+ "BriefDescription": "Floating Point Dispatch Faults. YMM fill fault.",
+ "UMask": "0x04"
+ },
+ {
+ "EventName": "fp_disp_faults.xmm_fill_fault",
+ "EventCode": "0x0e",
+ "BriefDescription": "Floating Point Dispatch Faults. XMM fill fault.",
+ "UMask": "0x02"
+ },
+ {
+ "EventName": "fp_disp_faults.x87_fill_fault",
+ "EventCode": "0x0e",
+ "BriefDescription": "Floating Point Dispatch Faults. x87 fill fault.",
+ "UMask": "0x01"
+ }
+]
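
Given the descriptions above (each MAC counted as 2 FLOPs, and the MergeEvent
required so per-cycle increments above 15 are not clipped), a rough
retire-based FLOP rate is:

	gflops ~= fp_ret_sse_avx_ops.all / (1e9 * elapsed_seconds)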
diff --git a/tools/perf/pmu-events/arch/x86/amdzen3/memory.json b/tools/perf/pmu-events/arch/x86/amdzen3/memory.json
new file mode 100644
index 000000000000..a2833955dcd2
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/amdzen3/memory.json
@@ -0,0 +1,428 @@
+[
+ {
+ "EventName": "ls_bad_status2.stli_other",
+ "EventCode": "0x24",
+ "BriefDescription": "Non-forwardable conflict; used to reduce STLI's via software. All reasons. Store To Load Interlock (STLI) are loads that were unable to complete because of a possible match with an older store, and the older store could not do STLF for some reason.",
+ "PublicDescription" : "Store-to-load conflicts: A load was unable to complete due to a non-forwardable conflict with an older store. Most commonly, a load's address range partially but not completely overlaps with an uncompleted older store. Software can avoid this problem by using same-size and same-alignment loads and stores when accessing the same data. Vector/SIMD code is particularly susceptible to this problem; software should construct wide vector stores by manipulating vector elements in registers using shuffle/blend/swap instructions prior to storing to memory, instead of using narrow element-by-element stores.",
+ "UMask": "0x02"
+ },
+ {
+ "EventName": "ls_locks.spec_lock_hi_spec",
+ "EventCode": "0x25",
+ "BriefDescription": "Retired lock instructions. High speculative cacheable lock speculation succeeded.",
+ "UMask": "0x08"
+ },
+ {
+ "EventName": "ls_locks.spec_lock_lo_spec",
+ "EventCode": "0x25",
+ "BriefDescription": "Retired lock instructions. Low speculative cacheable lock speculation succeeded.",
+ "UMask": "0x04"
+ },
+ {
+ "EventName": "ls_locks.non_spec_lock",
+ "EventCode": "0x25",
+ "BriefDescription": "Retired lock instructions. Non-speculative lock succeeded.",
+ "UMask": "0x02"
+ },
+ {
+ "EventName": "ls_locks.bus_lock",
+ "EventCode": "0x25",
+ "BriefDescription": "Retired lock instructions. Comparable to legacy bus lock.",
+ "UMask": "0x01"
+ },
+ {
+ "EventName": "ls_ret_cl_flush",
+ "EventCode": "0x26",
+ "BriefDescription": "The number of retired CLFLUSH instructions. This is a non-speculative event."
+ },
+ {
+ "EventName": "ls_ret_cpuid",
+ "EventCode": "0x27",
+ "BriefDescription": "The number of CPUID instructions retired."
+ },
+ {
+ "EventName": "ls_dispatch.ld_st_dispatch",
+ "EventCode": "0x29",
+ "BriefDescription": "Load-op-Store Dispatch. Dispatch of a single op that performs a load from and store to the same memory address. Counts the number of operations dispatched to the LS unit. Unit Masks ADDed.",
+ "UMask": "0x04"
+ },
+ {
+ "EventName": "ls_dispatch.store_dispatch",
+ "EventCode": "0x29",
+ "BriefDescription": "Dispatch of a single op that performs a memory store. Counts the number of operations dispatched to the LS unit. Unit Masks ADDed.",
+ "UMask": "0x02"
+ },
+ {
+ "EventName": "ls_dispatch.ld_dispatch",
+ "EventCode": "0x29",
+ "BriefDescription": "Dispatch of a single op that performs a memory load. Counts the number of operations dispatched to the LS unit. Unit Masks ADDed.",
+ "UMask": "0x01"
+ },
+ {
+ "EventName": "ls_smi_rx",
+ "EventCode": "0x2b",
+ "BriefDescription": "Counts the number of SMIs received."
+ },
+ {
+ "EventName": "ls_int_taken",
+ "EventCode": "0x2c",
+ "BriefDescription": "Counts the number of interrupts taken."
+ },
+ {
+ "EventName": "ls_rdtsc",
+ "EventCode": "0x2d",
+ "BriefDescription": "Number of reads of the TSC (RDTSC instructions). The count is speculative."
+ },
+ {
+ "EventName": "ls_stlf",
+ "EventCode": "0x35",
+ "BriefDescription": "Number of STLF hits."
+ },
+ {
+ "EventName": "ls_st_commit_cancel2.st_commit_cancel_wcb_full",
+ "EventCode": "0x37",
+ "BriefDescription": "A non-cacheable store and the non-cacheable commit buffer is full.",
+ "UMask": "0x01"
+ },
+ {
+ "EventName": "ls_dc_accesses",
+ "EventCode": "0x40",
+ "BriefDescription": "Number of accesses to the dcache for load/store references.",
+ "PublicDescription": "The number of accesses to the data cache for load and store references. This may include certain microcode scratchpad accesses, although these are generally rare. Each increment represents an eight-byte access, although the instruction may only be accessing a portion of that. This event is a speculative event."
+ },
+ {
+ "EventName": "ls_mab_alloc.all_allocations",
+ "EventCode": "0x41",
+ "BriefDescription": "All Allocations. Counts when a LS pipe allocates a MAB entry.",
+ "UMask": "0x7f"
+ },
+ {
+ "EventName": "ls_mab_alloc.hardware_prefetcher_allocations",
+ "EventCode": "0x41",
+ "BriefDescription": "Hardware Prefetcher Allocations. Counts when a LS pipe allocates a MAB entry.",
+ "UMask": "0x40"
+ },
+ {
+ "EventName": "ls_mab_alloc.load_store_allocations",
+ "EventCode": "0x41",
+ "BriefDescription": "Load Store Allocations. Counts when a LS pipe allocates a MAB entry.",
+ "UMask": "0x3f"
+ },
+ {
+ "EventName": "ls_mab_alloc.dc_prefetcher",
+ "EventCode": "0x41",
+ "BriefDescription": "LS MAB Allocates by Type. DC prefetcher.",
+ "UMask": "0x08"
+ },
+ {
+ "EventName": "ls_mab_alloc.stores",
+ "EventCode": "0x41",
+ "BriefDescription": "LS MAB Allocates by Type. Stores.",
+ "UMask": "0x02"
+ },
+ {
+ "EventName": "ls_mab_alloc.loads",
+ "EventCode": "0x41",
+ "BriefDescription": "LS MAB Allocates by Type. Loads.",
+ "UMask": "0x01"
+ },
+ {
+ "EventName": "ls_dmnd_fills_from_sys.mem_io_remote",
+ "EventCode": "0x43",
+ "BriefDescription": "Demand Data Cache Fills by Data Source. From DRAM or IO connected in different Node.",
+ "UMask": "0x40"
+ },
+ {
+ "EventName": "ls_dmnd_fills_from_sys.ext_cache_remote",
+ "EventCode": "0x43",
+ "BriefDescription": "Demand Data Cache Fills by Data Source. From CCX Cache in different Node.",
+ "UMask": "0x10"
+ },
+ {
+ "EventName": "ls_dmnd_fills_from_sys.mem_io_local",
+ "EventCode": "0x43",
+ "BriefDescription": "Demand Data Cache Fills by Data Source. From DRAM or IO connected in same node.",
+ "UMask": "0x08"
+ },
+ {
+ "EventName": "ls_dmnd_fills_from_sys.ext_cache_local",
+ "EventCode": "0x43",
+ "BriefDescription": "Demand Data Cache Fills by Data Source. From cache of different CCX in same node.",
+ "UMask": "0x04"
+ },
+ {
+ "EventName": "ls_dmnd_fills_from_sys.int_cache",
+ "EventCode": "0x43",
+ "BriefDescription": "Demand Data Cache Fills by Data Source. From L3 or different L2 in same CCX.",
+ "UMask": "0x02"
+ },
+ {
+ "EventName": "ls_dmnd_fills_from_sys.lcl_l2",
+ "EventCode": "0x43",
+ "BriefDescription": "Demand Data Cache Fills by Data Source. From Local L2 to the core.",
+ "UMask": "0x01"
+ },
+ {
+ "EventName": "ls_any_fills_from_sys.mem_io_remote",
+ "EventCode": "0x44",
+ "BriefDescription": "Any Data Cache Fills by Data Source. From DRAM or IO connected in different Node.",
+ "UMask": "0x40"
+ },
+ {
+ "EventName": "ls_any_fills_from_sys.ext_cache_remote",
+ "EventCode": "0x44",
+ "BriefDescription": "Any Data Cache Fills by Data Source. From CCX Cache in different Node.",
+ "UMask": "0x10"
+ },
+ {
+ "EventName": "ls_any_fills_from_sys.mem_io_local",
+ "EventCode": "0x44",
+ "BriefDescription": "Any Data Cache Fills by Data Source. From DRAM or IO connected in same node.",
+ "UMask": "0x08"
+ },
+ {
+ "EventName": "ls_any_fills_from_sys.ext_cache_local",
+ "EventCode": "0x44",
+ "BriefDescription": "Any Data Cache Fills by Data Source. From cache of different CCX in same node.",
+ "UMask": "0x04"
+ },
+ {
+ "EventName": "ls_any_fills_from_sys.int_cache",
+ "EventCode": "0x44",
+ "BriefDescription": "Any Data Cache Fills by Data Source. From L3 or different L2 in same CCX.",
+ "UMask": "0x02"
+ },
+ {
+ "EventName": "ls_any_fills_from_sys.lcl_l2",
+ "EventCode": "0x44",
+ "BriefDescription": "Any Data Cache Fills by Data Source. From Local L2 to the core.",
+ "UMask": "0x01"
+ },
+ {
+ "EventName": "ls_l1_d_tlb_miss.all",
+ "EventCode": "0x45",
+ "BriefDescription": "All L1 DTLB Misses or Reloads. Use l1_dtlb_misses instead.",
+ "UMask": "0xff"
+ },
+ {
+ "EventName": "ls_l1_d_tlb_miss.tlb_reload_1g_l2_miss",
+ "EventCode": "0x45",
+ "BriefDescription": "L1 DTLB Miss. DTLB reload to a 1G page that also missed in the L2 TLB.",
+ "UMask": "0x80"
+ },
+ {
+ "EventName": "ls_l1_d_tlb_miss.tlb_reload_2m_l2_miss",
+ "EventCode": "0x45",
+ "BriefDescription": "L1 DTLB Miss. DTLB reload to a 2M page that also missed in the L2 TLB.",
+ "UMask": "0x40"
+ },
+ {
+ "EventName": "ls_l1_d_tlb_miss.tlb_reload_coalesced_page_miss",
+ "EventCode": "0x45",
+ "BriefDescription": "L1 DTLB Miss. DTLB reload coalesced page that also missed in the L2 TLB.",
+ "UMask": "0x20"
+ },
+ {
+ "EventName": "ls_l1_d_tlb_miss.tlb_reload_4k_l2_miss",
+ "EventCode": "0x45",
+ "BriefDescription": "L1 DTLB Miss. DTLB reload to a 4K page that missed the L2 TLB.",
+ "UMask": "0x10"
+ },
+ {
+ "EventName": "ls_l1_d_tlb_miss.tlb_reload_1g_l2_hit",
+ "EventCode": "0x45",
+ "BriefDescription": "L1 DTLB Miss. DTLB reload to a 1G page that hit in the L2 TLB.",
+ "UMask": "0x08"
+ },
+ {
+ "EventName": "ls_l1_d_tlb_miss.tlb_reload_2m_l2_hit",
+ "EventCode": "0x45",
+ "BriefDescription": "L1 DTLB Miss. DTLB reload to a 2M page that hit in the L2 TLB.",
+ "UMask": "0x04"
+ },
+ {
+ "EventName": "ls_l1_d_tlb_miss.tlb_reload_coalesced_page_hit",
+ "EventCode": "0x45",
+ "BriefDescription": "L1 DTLB Miss. DTLB reload to a coalesced page that hit in the L2 TLB.",
+ "UMask": "0x02"
+ },
+ {
+ "EventName": "ls_l1_d_tlb_miss.tlb_reload_4k_l2_hit",
+ "EventCode": "0x45",
+ "BriefDescription": "L1 DTLB Miss. DTLB reload to a 4K page that hit in the L2 TLB.",
+ "UMask": "0x01"
+ },
+ {
+ "EventName": "ls_tablewalker.iside",
+ "EventCode": "0x46",
+ "BriefDescription": "Total Page Table Walks on I-side.",
+ "UMask": "0x0c"
+ },
+ {
+ "EventName": "ls_tablewalker.ic_type1",
+ "EventCode": "0x46",
+ "BriefDescription": "Total Page Table Walks IC Type 1.",
+ "UMask": "0x08"
+ },
+ {
+ "EventName": "ls_tablewalker.ic_type0",
+ "EventCode": "0x46",
+ "BriefDescription": "Total Page Table Walks IC Type 0.",
+ "UMask": "0x04"
+ },
+ {
+ "EventName": "ls_tablewalker.dside",
+ "EventCode": "0x46",
+ "BriefDescription": "Total Page Table Walks on D-side.",
+ "UMask": "0x03"
+ },
+ {
+ "EventName": "ls_tablewalker.dc_type1",
+ "EventCode": "0x46",
+ "BriefDescription": "Total Page Table Walks DC Type 1.",
+ "UMask": "0x02"
+ },
+ {
+ "EventName": "ls_tablewalker.dc_type0",
+ "EventCode": "0x46",
+ "BriefDescription": "Total Page Table Walks DC Type 0.",
+ "UMask": "0x01"
+ },
+ {
+ "EventName": "ls_misal_loads.ma4k",
+ "EventCode": "0x47",
+ "BriefDescription": "The number of 4KB misaligned (i.e., page crossing) loads.",
+ "UMask": "0x02"
+ },
+ {
+ "EventName": "ls_misal_loads.ma64",
+ "EventCode": "0x47",
+ "BriefDescription": "The number of 64B misaligned (i.e., cacheline crossing) loads.",
+ "UMask": "0x01"
+ },
+ {
+ "EventName": "ls_pref_instr_disp",
+ "EventCode": "0x4b",
+ "BriefDescription": "Software Prefetch Instructions Dispatched (Speculative).",
+ "UMask": "0xff"
+ },
+ {
+ "EventName": "ls_pref_instr_disp.prefetch_nta",
+ "EventCode": "0x4b",
+ "BriefDescription": "Software Prefetch Instructions Dispatched (Speculative). PrefetchNTA instruction. See docAPM3 PREFETCHlevel.",
+ "UMask": "0x04"
+ },
+ {
+ "EventName": "ls_pref_instr_disp.prefetch_w",
+ "EventCode": "0x4b",
+ "BriefDescription": "Software Prefetch Instructions Dispatched (Speculative). PrefetchW instruction. See docAPM3 PREFETCHW.",
+ "UMask": "0x02"
+ },
+ {
+ "EventName": "ls_pref_instr_disp.prefetch",
+ "EventCode": "0x4b",
+ "BriefDescription": "Software Prefetch Instructions Dispatched (Speculative). PrefetchT0, T1 and T2 instructions. See docAPM3 PREFETCHlevel.",
+ "UMask": "0x01"
+ },
+ {
+ "EventName": "ls_inef_sw_pref.mab_mch_cnt",
+ "EventCode": "0x52",
+ "BriefDescription": "The number of software prefetches that did not fetch data outside of the processor core. Software PREFETCH instruction saw a match on an already-allocated miss request buffer.",
+ "UMask": "0x02"
+ },
+ {
+ "EventName": "ls_inef_sw_pref.data_pipe_sw_pf_dc_hit",
+ "EventCode": "0x52",
+ "BriefDescription": "The number of software prefetches that did not fetch data outside of the processor core. Software PREFETCH instruction saw a DC hit.",
+ "UMask": "0x01"
+ },
+ {
+ "EventName": "ls_sw_pf_dc_fills.mem_io_remote",
+ "EventCode": "0x59",
+ "BriefDescription": "Software Prefetch Data Cache Fills by Data Source. From DRAM or IO connected in different Node.",
+ "UMask": "0x40"
+ },
+ {
+ "EventName": "ls_sw_pf_dc_fills.ext_cache_remote",
+ "EventCode": "0x59",
+ "BriefDescription": "Software Prefetch Data Cache Fills by Data Source. From CCX Cache in different Node.",
+ "UMask": "0x10"
+ },
+ {
+ "EventName": "ls_sw_pf_dc_fills.mem_io_local",
+ "EventCode": "0x59",
+ "BriefDescription": "Software Prefetch Data Cache Fills by Data Source. From DRAM or IO connected in same node.",
+ "UMask": "0x08"
+ },
+ {
+ "EventName": "ls_sw_pf_dc_fills.ext_cache_local",
+ "EventCode": "0x59",
+ "BriefDescription": "Software Prefetch Data Cache Fills by Data Source. From cache of different CCX in same node.",
+ "UMask": "0x04"
+ },
+ {
+ "EventName": "ls_sw_pf_dc_fills.int_cache",
+ "EventCode": "0x59",
+ "BriefDescription": "Software Prefetch Data Cache Fills by Data Source. From L3 or different L2 in same CCX.",
+ "UMask": "0x02"
+ },
+ {
+ "EventName": "ls_sw_pf_dc_fills.lcl_l2",
+ "EventCode": "0x59",
+ "BriefDescription": "Software Prefetch Data Cache Fills by Data Source. From Local L2 to the core.",
+ "UMask": "0x01"
+ },
+ {
+ "EventName": "ls_hw_pf_dc_fills.mem_io_remote",
+ "EventCode": "0x5a",
+ "BriefDescription": "Hardware Prefetch Data Cache Fills by Data Source. From DRAM or IO connected in different Node.",
+ "UMask": "0x40"
+ },
+ {
+ "EventName": "ls_hw_pf_dc_fills.ext_cache_remote",
+ "EventCode": "0x5a",
+ "BriefDescription": "Hardware Prefetch Data Cache Fills by Data Source. From CCX Cache in different Node.",
+ "UMask": "0x10"
+ },
+ {
+ "EventName": "ls_hw_pf_dc_fills.mem_io_local",
+ "EventCode": "0x5a",
+ "BriefDescription": "Hardware Prefetch Data Cache Fills by Data Source. From DRAM or IO connected in same node.",
+ "UMask": "0x08"
+ },
+ {
+ "EventName": "ls_hw_pf_dc_fills.ext_cache_local",
+ "EventCode": "0x5a",
+ "BriefDescription": "Hardware Prefetch Data Cache Fills by Data Source. From cache of different CCX in same node.",
+ "UMask": "0x04"
+ },
+ {
+ "EventName": "ls_hw_pf_dc_fills.int_cache",
+ "EventCode": "0x5a",
+ "BriefDescription": "Hardware Prefetch Data Cache Fills by Data Source. From L3 or different L2 in same CCX.",
+ "UMask": "0x02"
+ },
+ {
+ "EventName": "ls_hw_pf_dc_fills.lcl_l2",
+ "EventCode": "0x5a",
+ "BriefDescription": "Hardware Prefetch Data Cache Fills by Data Source. From Local L2 to the core.",
+ "UMask": "0x01"
+ },
+ {
+ "EventName": "ls_alloc_mab_count",
+ "EventCode": "0x5f",
+ "BriefDescription": "Count of Allocated Mabs",
+ "PublicDescription": "This event counts the in-flight L1 data cache misses (allocated Miss Address Buffers) divided by 4 and rounded down each cycle unless used with the MergeEvent functionality. If the MergeEvent is used, it counts the exact number of outstanding L1 data cache misses. See 2.1.17.3 [Large Increment per Cycle Events]."
+ },
+ {
+ "EventName": "ls_not_halted_cyc",
+ "EventCode": "0x76",
+ "BriefDescription": "Cycles not in Halt."
+ },
+ {
+ "EventName": "ls_tlb_flush.all_tlb_flushes",
+ "EventCode": "0x78",
+ "BriefDescription": "All TLB Flushes. Requires unit mask 0xFF to engage event for counting. Use all_tlbs_flushed instead",
+ "UMask": "0xff"
+ }
+]
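
One more derived figure from this file: since ls_alloc_mab_count reports the
in-flight L1D misses divided by 4 each cycle (when the MergeEvent is not
used), the average number of outstanding misses per unhalted cycle is
roughly:

	avg_outstanding_l1d_misses ~= 4 * ls_alloc_mab_count / ls_not_halted_cyc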
diff --git a/tools/perf/pmu-events/arch/x86/amdzen3/other.json b/tools/perf/pmu-events/arch/x86/amdzen3/other.json
new file mode 100644
index 000000000000..7da5d0791ea3
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/amdzen3/other.json
@@ -0,0 +1,103 @@
+[
+ {
+ "EventName": "de_dis_uop_queue_empty_di0",
+ "EventCode": "0xa9",
+ "BriefDescription": "Cycles where the Micro-Op Queue is empty."
+ },
+ {
+ "EventName": "de_dis_cops_from_decoder.disp_op_type.any_integer_dispatch",
+ "EventCode": "0xab",
+ "BriefDescription": "Any Integer dispatch. Types of Oops Dispatched from Decoder.",
+ "UMask": "0x08"
+ },
+ {
+ "EventName": "de_dis_cops_from_decoder.disp_op_type.any_fp_dispatch",
+ "EventCode": "0xab",
+ "BriefDescription": "Any FP dispatch. Types of Oops Dispatched from Decoder.",
+ "UMask": "0x04"
+ },
+ {
+ "EventName": "de_dis_dispatch_token_stalls1.fp_flush_recovery_stall",
+ "EventCode": "0xae",
+ "BriefDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a Token Stall. Also counts cycles when the thread is not selected to dispatch but would have been stalled due to a Token Stall. FP Flush recovery stall.",
+ "UMask": "0x80"
+ },
+ {
+ "EventName": "de_dis_dispatch_token_stalls1.fp_sch_rsrc_stall",
+ "EventCode": "0xae",
+ "BriefDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a Token Stall. Also counts cycles when the thread is not selected to dispatch but would have been stalled due to a Token Stall. FP scheduler resource stall. Applies to ops that use the FP scheduler.",
+ "UMask": "0x40"
+ },
+ {
+ "EventName": "de_dis_dispatch_token_stalls1.fp_reg_file_rsrc_stall",
+ "EventCode": "0xae",
+ "BriefDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a Token Stall. Also counts cycles when the thread is not selected to dispatch but would have been stalled due to a Token Stall. Floating point register file resource stall. Applies to all FP ops that have a destination register.",
+ "UMask": "0x20"
+ },
+ {
+ "EventName": "de_dis_dispatch_token_stalls1.taken_brnch_buffer_rsrc",
+ "EventCode": "0xae",
+ "BriefDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a Token Stall. Also counts cycles when the thread is not selected to dispatch but would have been stalled due to a Token Stall. Taken branch buffer resource stall.",
+ "UMask": "0x10"
+ },
+ {
+ "EventName": "de_dis_dispatch_token_stalls1.int_sched_misc_token_stall",
+ "EventCode": "0xae",
+ "BriefDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall. Integer Scheduler miscellaneous resource stall.",
+ "UMask": "0x08"
+ },
+ {
+ "EventName": "de_dis_dispatch_token_stalls1.store_queue_rsrc_stall",
+ "EventCode": "0xae",
+ "BriefDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a Token Stall. Also counts cycles when the thread is not selected to dispatch but would have been stalled due to a Token Stall. Store Queue resource stall. Applies to all ops with store semantics.",
+ "UMask": "0x04"
+ },
+ {
+ "EventName": "de_dis_dispatch_token_stalls1.load_queue_rsrc_stall",
+ "EventCode": "0xae",
+ "BriefDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a Token Stall. Also counts cycles when the thread is not selected to dispatch but would have been stalled due to a Token Stall. Load Queue resource stall. Applies to all ops with load semantics.",
+ "UMask": "0x02"
+ },
+ {
+ "EventName": "de_dis_dispatch_token_stalls1.int_phy_reg_file_rsrc_stall",
+ "EventCode": "0xae",
+ "BriefDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a Token Stall. Also counts cycles when the thread is not selected to dispatch but would have been stalled due to a Token Stall. Integer Physical Register File resource stall. Integer Physical Register File, applies to all ops that have an integer destination register.",
+ "UMask": "0x01"
+ },
+ {
+ "EventName": "de_dis_dispatch_token_stalls2.retire_token_stall",
+ "EventCode": "0xaf",
+ "BriefDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall. Insufficient Retire Queue tokens available.",
+ "UMask": "0x20"
+ },
+ {
+ "EventName": "de_dis_dispatch_token_stalls2.agsq_token_stall",
+ "EventCode": "0xaf",
+ "BriefDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall. AGSQ Tokens unavailable.",
+ "UMask": "0x10"
+ },
+ {
+ "EventName": "de_dis_dispatch_token_stalls2.int_sch3_token_stall",
+ "EventCode": "0xaf",
+ "BriefDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall. No tokens for Integer Scheduler Queue 3 available.",
+ "UMask": "0x08"
+ },
+ {
+ "EventName": "de_dis_dispatch_token_stalls2.int_sch2_token_stall",
+ "EventCode": "0xaf",
+ "BriefDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall. No tokens for Integer Scheduler Queue 2 available.",
+ "UMask": "0x04"
+ },
+ {
+ "EventName": "de_dis_dispatch_token_stalls2.int_sch1_token_stall",
+ "EventCode": "0xaf",
+ "BriefDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall. No tokens for Integer Scheduler Queue 1 available.",
+ "UMask": "0x02"
+ },
+ {
+ "EventName": "de_dis_dispatch_token_stalls2.int_sch0_token_stall",
+ "EventCode": "0xaf",
+ "BriefDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall. No tokens for Integer Scheduler Queue 0 available.",
+ "UMask": "0x01"
+ }
+]
diff --git a/tools/perf/pmu-events/arch/x86/amdzen3/recommended.json b/tools/perf/pmu-events/arch/x86/amdzen3/recommended.json
new file mode 100644
index 000000000000..988cf68ae825
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/amdzen3/recommended.json
@@ -0,0 +1,214 @@
+[
+ {
+ "MetricName": "branch_misprediction_ratio",
+ "BriefDescription": "Execution-Time Branch Misprediction Ratio (Non-Speculative)",
+ "MetricExpr": "d_ratio(ex_ret_brn_misp, ex_ret_brn)",
+ "MetricGroup": "branch_prediction",
+ "ScaleUnit": "100%"
+ },
+ {
+ "EventName": "all_data_cache_accesses",
+ "EventCode": "0x29",
+ "BriefDescription": "All L1 Data Cache Accesses",
+ "UMask": "0x07"
+ },
+ {
+ "MetricName": "all_l2_cache_accesses",
+ "BriefDescription": "All L2 Cache Accesses",
+ "MetricExpr": "l2_request_g1.all_no_prefetch + l2_pf_hit_l2 + l2_pf_miss_l2_hit_l3 + l2_pf_miss_l2_l3",
+ "MetricGroup": "l2_cache"
+ },
+ {
+ "EventName": "l2_cache_accesses_from_ic_misses",
+ "EventCode": "0x60",
+ "BriefDescription": "L2 Cache Accesses from L1 Instruction Cache Misses (including prefetch)",
+ "UMask": "0x10"
+ },
+ {
+ "EventName": "l2_cache_accesses_from_dc_misses",
+ "EventCode": "0x60",
+ "BriefDescription": "L2 Cache Accesses from L1 Data Cache Misses (including prefetch)",
+ "UMask": "0xe8"
+ },
+ {
+ "MetricName": "l2_cache_accesses_from_l2_hwpf",
+ "BriefDescription": "L2 Cache Accesses from L2 HWPF",
+ "MetricExpr": "l2_pf_hit_l2 + l2_pf_miss_l2_hit_l3 + l2_pf_miss_l2_l3",
+ "MetricGroup": "l2_cache"
+ },
+ {
+ "MetricName": "all_l2_cache_misses",
+ "BriefDescription": "All L2 Cache Misses",
+ "MetricExpr": "l2_cache_req_stat.ic_dc_miss_in_l2 + l2_pf_miss_l2_hit_l3 + l2_pf_miss_l2_l3",
+ "MetricGroup": "l2_cache"
+ },
+ {
+ "EventName": "l2_cache_misses_from_ic_miss",
+ "EventCode": "0x64",
+ "BriefDescription": "L2 Cache Misses from L1 Instruction Cache Misses",
+ "UMask": "0x01"
+ },
+ {
+ "EventName": "l2_cache_misses_from_dc_misses",
+ "EventCode": "0x64",
+ "BriefDescription": "L2 Cache Misses from L1 Data Cache Misses",
+ "UMask": "0x08"
+ },
+ {
+ "MetricName": "l2_cache_misses_from_l2_hwpf",
+ "BriefDescription": "L2 Cache Misses from L2 Cache HWPF",
+ "MetricExpr": "l2_pf_miss_l2_hit_l3 + l2_pf_miss_l2_l3",
+ "MetricGroup": "l2_cache"
+ },
+ {
+ "MetricName": "all_l2_cache_hits",
+ "BriefDescription": "All L2 Cache Hits",
+ "MetricExpr": "l2_cache_req_stat.ic_dc_hit_in_l2 + l2_pf_hit_l2",
+ "MetricGroup": "l2_cache"
+ },
+ {
+ "EventName": "l2_cache_hits_from_ic_misses",
+ "EventCode": "0x64",
+ "BriefDescription": "L2 Cache Hits from L1 Instruction Cache Misses",
+ "UMask": "0x06"
+ },
+ {
+ "EventName": "l2_cache_hits_from_dc_misses",
+ "EventCode": "0x64",
+ "BriefDescription": "L2 Cache Hits from L1 Data Cache Misses",
+ "UMask": "0xf0"
+ },
+ {
+ "EventName": "l2_cache_hits_from_l2_hwpf",
+ "EventCode": "0x70",
+ "BriefDescription": "L2 Cache Hits from L2 Cache HWPF",
+ "UMask": "0xff"
+ },
+ {
+ "EventName": "l3_cache_accesses",
+ "EventCode": "0x04",
+ "BriefDescription": "L3 Cache Accesses",
+ "UMask": "0xff",
+ "Unit": "L3PMC"
+ },
+ {
+ "EventName": "l3_misses",
+ "EventCode": "0x04",
+ "BriefDescription": "L3 Misses (includes cacheline state change requests)",
+ "UMask": "0x01",
+ "Unit": "L3PMC"
+ },
+ {
+ "MetricName": "l3_read_miss_latency",
+ "BriefDescription": "Average L3 Read Miss Latency (in core clocks)",
+ "MetricExpr": "(xi_sys_fill_latency * 16) / xi_ccx_sdp_req1",
+ "MetricGroup": "l3_cache",
+ "ScaleUnit": "1core clocks"
+ },
+ {
+ "MetricName": "op_cache_fetch_miss_ratio",
+ "BriefDescription": "Op Cache (64B) Fetch Miss Ratio",
+ "MetricExpr": "d_ratio(op_cache_hit_miss.op_cache_miss, op_cache_hit_miss.all_op_cache_accesses)",
+ "MetricGroup": "l2_cache"
+ },
+ {
+ "MetricName": "ic_fetch_miss_ratio",
+ "BriefDescription": "Instruction Cache (32B) Fetch Miss Ratio",
+ "MetricExpr": "d_ratio(ic_tag_hit_miss.instruction_cache_miss, ic_tag_hit_miss.all_instruction_cache_accesses)",
+ "MetricGroup": "l2_cache",
+ "ScaleUnit": "100%"
+ },
+ {
+ "EventName": "l1_data_cache_fills_from_memory",
+ "EventCode": "0x44",
+ "BriefDescription": "L1 Data Cache Fills: From Memory",
+ "UMask": "0x48"
+ },
+ {
+ "EventName": "l1_data_cache_fills_from_remote_node",
+ "EventCode": "0x44",
+ "BriefDescription": "L1 Data Cache Fills: From Remote Node",
+ "UMask": "0x50"
+ },
+ {
+ "EventName": "l1_data_cache_fills_from_within_same_ccx",
+ "EventCode": "0x44",
+ "BriefDescription": "L1 Data Cache Fills: From within same CCX",
+ "UMask": "0x03"
+ },
+ {
+ "EventName": "l1_data_cache_fills_from_external_ccx_cache",
+ "EventCode": "0x44",
+ "BriefDescription": "L1 Data Cache Fills: From External CCX Cache",
+ "UMask": "0x14"
+ },
+ {
+ "EventName": "l1_data_cache_fills_all",
+ "EventCode": "0x44",
+ "BriefDescription": "L1 Data Cache Fills: All",
+ "UMask": "0xff"
+ },
+ {
+ "MetricName": "l1_itlb_misses",
+ "BriefDescription": "L1 ITLB Misses",
+ "MetricExpr": "bp_l1_tlb_miss_l2_tlb_hit + bp_l1_tlb_miss_l2_tlb_miss",
+ "MetricGroup": "tlb"
+ },
+ {
+ "EventName": "l2_itlb_misses",
+ "EventCode": "0x85",
+ "BriefDescription": "L2 ITLB Misses & Instruction page walks",
+ "UMask": "0x07"
+ },
+ {
+ "EventName": "l1_dtlb_misses",
+ "EventCode": "0x45",
+ "BriefDescription": "L1 DTLB Misses",
+ "UMask": "0xff"
+ },
+ {
+ "EventName": "l2_dtlb_misses",
+ "EventCode": "0x45",
+ "BriefDescription": "L2 DTLB Misses & Data page walks",
+ "UMask": "0xf0"
+ },
+ {
+ "EventName": "all_tlbs_flushed",
+ "EventCode": "0x78",
+ "BriefDescription": "All TLBs Flushed",
+ "UMask": "0xff"
+ },
+ {
+ "MetricName": "macro_ops_dispatched",
+ "BriefDescription": "Macro-ops Dispatched",
+ "MetricExpr": "de_dis_cops_from_decoder.disp_op_type.any_integer_dispatch + de_dis_cops_from_decoder.disp_op_type.any_fp_dispatch",
+ "MetricGroup": "decoder"
+ },
+ {
+ "EventName": "sse_avx_stalls",
+ "EventCode": "0x0e",
+ "BriefDescription": "Mixed SSE/AVX Stalls",
+ "UMask": "0x0e"
+ },
+ {
+ "EventName": "macro_ops_retired",
+ "EventCode": "0xc1",
+ "BriefDescription": "Macro-ops Retired"
+ },
+ {
+ "MetricName": "all_remote_links_outbound",
+ "BriefDescription": "Approximate: Outbound data bytes for all Remote Links for a node (die)",
+ "MetricExpr": "remote_outbound_data_controller_0 + remote_outbound_data_controller_1 + remote_outbound_data_controller_2 + remote_outbound_data_controller_3",
+ "MetricGroup": "data_fabric",
+ "PerPkg": "1",
+ "ScaleUnit": "3e-5MiB"
+ },
+ {
+ "MetricName": "nps1_die_to_dram",
+ "BriefDescription": "Approximate: Combined DRAM B/bytes of all channels on a NPS1 node (die) (may need --metric-no-group)",
+ "MetricExpr": "dram_channel_data_controller_0 + dram_channel_data_controller_1 + dram_channel_data_controller_2 + dram_channel_data_controller_3 + dram_channel_data_controller_4 + dram_channel_data_controller_5 + dram_channel_data_controller_6 + dram_channel_data_controller_7",
+ "MetricGroup": "data_fabric",
+ "PerPkg": "1",
+ "ScaleUnit": "6.1e-5MiB"
+ }
+]
diff --git a/tools/perf/pmu-events/arch/x86/mapfile.csv b/tools/perf/pmu-events/arch/x86/mapfile.csv
index 2f2a209e87e1..9f97b32533a0 100644
--- a/tools/perf/pmu-events/arch/x86/mapfile.csv
+++ b/tools/perf/pmu-events/arch/x86/mapfile.csv
@@ -38,4 +38,4 @@ GenuineIntel-6-7E,v1,icelake,core
GenuineIntel-6-86,v1,tremontx,core
AuthenticAMD-23-([12][0-9A-F]|[0-9A-F]),v2,amdzen1,core
AuthenticAMD-23-[[:xdigit:]]+,v1,amdzen2,core
-AuthenticAMD-25-[[:xdigit:]]+,v1,amdzen2,core
+AuthenticAMD-25-[[:xdigit:]]+,v1,amdzen3,core
--
2.20.1
20 Jan '22
Hello, your patch has been submitted to CI for checking.
You can follow the subsequent merge progress in the issue below:
https://gitee.com/openeuler/kernel/issues/I4RK58
Thanks
BR
Laibin
On 2022/1/20 10:35, Chen Baozi wrote:
> phytium inclusion
> category: bugfix
> bugzilla: NA
> CVE: NA
>
> --------------------------------
>
> The system would hang up when the Phytium S2500 communicates with
> some BMCs after several rounds of transactions, unless we reset
> the controller timeout counter manually by calling firmware through
> SMC.
>
> Signed-off-by: Wang Yinfeng <wangyinfeng(a)phytium.com.cn>
> Signed-off-by: Chen Baozi <chenbaozi(a)phytium.com.cn>
> ---
> drivers/char/ipmi/ipmi_si_mem_io.c | 117 ++++++++++++++++++++++++++++-
> 1 file changed, 113 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/char/ipmi/ipmi_si_mem_io.c b/drivers/char/ipmi/ipmi_si_mem_io.c
> index 75583612ab10..73a3d6979eaa 100644
> --- a/drivers/char/ipmi/ipmi_si_mem_io.c
> +++ b/drivers/char/ipmi/ipmi_si_mem_io.c
> @@ -3,6 +3,105 @@
> #include <linux/io.h>
> #include "ipmi_si.h"
>
> +#ifdef CONFIG_ARM_GIC_PHYTIUM_2500
> +#include <linux/arm-smccc.h>
> +
> +#define CTL_RST_FUNC_ID 0xC2000011
> +
> +static void ctl_smc(unsigned long arg0, unsigned long arg1,
> + unsigned long arg2, unsigned long arg3)
> +{
> + struct arm_smccc_res res;
> +
> + arm_smccc_smc(arg0, arg1, arg2, arg3, 0, 0, 0, 0, &res);
> + if (res.a0 != 0)
> + pr_err("Error: Firmware call SMC reset Failed: %d, addr: 0x%lx\n",
> + (int)res.a0, arg2);
> +}
> +
> +static void ctl_timeout_reset(void)
> +{
> + ctl_smc(CTL_RST_FUNC_ID, 0x1, 0x28100208, 0x1);
> + ctl_smc(CTL_RST_FUNC_ID, 0x1, 0x2810020C, 0x1);
> +}
> +
> +static unsigned char intf_mem_inb_2500(const struct si_sm_io *io,
> + unsigned int offset)
> +{
> + ctl_timeout_reset();
> +
> + return readb((io->addr)+(offset * io->regspacing));
> +}
> +
> +static unsigned char intf_mem_inw_2500(const struct si_sm_io *io,
> + unsigned int offset)
> +{
> + ctl_timeout_reset();
> +
> + return (readw((io->addr)+(offset * io->regspacing)) >> io->regshift)
> + & 0xff;
> +}
> +
> +static unsigned char intf_mem_inl_2500(const struct si_sm_io *io,
> + unsigned int offset)
> +{
> + ctl_timeout_reset();
> +
> + return (readl((io->addr)+(offset * io->regspacing)) >> io->regshift)
> + & 0xff;
> +}
> +
> +static unsigned char mem_inq_2500(const struct si_sm_io *io,
> + unsigned int offset)
> +{
> + ctl_timeout_reset();
> +
> + return (readq((io->addr)+(offset * io->regspacing)) >> io->regshift)
> + & 0xff;
> +}
> +#else
> +#define intf_mem_inb_2500 intf_mem_inb
> +#define intf_mem_inw_2500 intf_mem_inw
> +#define intf_mem_inl_2500 intf_mem_inl
> +#define mem_inq_2500 mem_inq
> +#endif
> +
> +static bool apply_phytium2500_workaround;
> +
> +#ifdef CONFIG_ACPI
> +struct ipmi_workaround_oem_info {
> + char oem_id[ACPI_OEM_ID_SIZE + 1];
> +};
> +
> +static struct ipmi_workaround_oem_info wa_info[] = {
> + {
> + .oem_id = "KPSVVJ",
> + }
> +};
> +
> +static void ipmi_check_phytium_workaround(void)
> +{
> + struct acpi_table_header tbl;
> + int i;
> +
> + if (ACPI_FAILURE(acpi_get_table_header(ACPI_SIG_DSDT, 0, &tbl)))
> + return;
> +
> + for (i = 0; i < ARRAY_SIZE(wa_info); i++) {
> + if (strncmp(wa_info[i].oem_id, tbl.oem_id, ACPI_OEM_ID_SIZE))
> + continue;
> +
> + apply_phytium2500_workaround = true;
> + break;
> + }
> +}
> +#else
> +static void ipmi_check_phytium_workaround(void)
> +{
> + apply_phytium2500_workaround = false;
> +}
> +#endif
> +
> static unsigned char intf_mem_inb(const struct si_sm_io *io,
> unsigned int offset)
> {
> @@ -81,26 +180,36 @@ int ipmi_si_mem_setup(struct si_sm_io *io)
> if (!addr)
> return -ENODEV;
>
> + ipmi_check_phytium_workaround();
> +
> /*
> * Figure out the actual readb/readw/readl/etc routine to use based
> * upon the register size.
> */
> switch (io->regsize) {
> case 1:
> - io->inputb = intf_mem_inb;
> + io->inputb = apply_phytium2500_workaround ?
> + intf_mem_inb_2500 :
> + intf_mem_inb;
> io->outputb = intf_mem_outb;
> break;
> case 2:
> - io->inputb = intf_mem_inw;
> + io->inputb = apply_phytium2500_workaround ?
> + intf_mem_inw_2500 :
> + intf_mem_inw;
> io->outputb = intf_mem_outw;
> break;
> case 4:
> - io->inputb = intf_mem_inl;
> + io->inputb = apply_phytium2500_workaround ?
> + intf_mem_inl_2500 :
> + intf_mem_inl;
> io->outputb = intf_mem_outl;
> break;
> #ifdef readq
> case 8:
> - io->inputb = mem_inq;
> + io->inputb = apply_phytium2500_workaround ?
> + mem_inq_2500 :
> + mem_inq;
> io->outputb = mem_outq;
> break;
> #endif
>
[PATCH openEuler-1.0-LTS] audit: bugfix for infinite loop when flush the hold queue
by Yang Yingliang 20 Jan '22
From: Cui GaoSheng <cuigaosheng1(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: 186105, https://gitee.com/openeuler/kernel/issues/I4RGWS?from=project-issue
CVE: NA
-----------------------------------------------------------------
When "audit=1" is added to the cmdline and the audit_hold_queue stays
non-empty, flushing the hold queue falls into an infinite loop. Fix
this by stopping the flush of the hold queue when the netlink socket
is abnormal.
Signed-off-by: Cui GaoSheng <cuigaosheng1(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Reviewed-by: weiyang wang <wangweiyang2(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
kernel/audit.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/kernel/audit.c b/kernel/audit.c
index c5e034fe14bbb..3de5ebb945592 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -740,6 +740,8 @@ static int kauditd_send_queue(struct sock *sk, u32 portid,
if (!sk) {
if (err_hook)
(*err_hook)(skb);
+ if (queue == &audit_hold_queue)
+ goto out;
continue;
}
@@ -756,6 +758,8 @@ static int kauditd_send_queue(struct sock *sk, u32 portid,
(*err_hook)(skb);
if (rc == -EAGAIN)
rc = 0;
+ if (queue == &audit_hold_queue)
+ goto out;
/* continue to drain the queue */
continue;
} else
@@ -767,6 +771,7 @@ static int kauditd_send_queue(struct sock *sk, u32 portid,
}
}
+out:
return (rc >= 0 ? 0 : rc);
}
--
2.25.1
[PATCH openEuler-1.0-LTS] blk-throttle: enable hierarchical throttle in cgroup v1
by Yang Yingliang 19 Jan '22
From: Yu Kuai <yukuai3(a)huawei.com>
hulk inclusion
category: feature
bugzilla: 186072, https://gitee.com/openeuler/kernel/issues/I4RH0V
CVE: NA
-----------------------------------------------
The blkio subsystem is not under the default hierarchy in cgroup v1 by
default, which means io throttle configurations are only effective on
the current cgroup.
This patch introduces a new feature that enables the default hierarchy
for io throttle, so that configurations also take effect on child
cgroups.
The feature is disabled by default and can be enabled by adding
"blkcg_global_limit=1" (or "blkcg_global_limit=Y"/"blkcg_global_limit=y")
to the boot cmdline, as in the example below.
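A hypothetical cgroup v1 session (the device number, paths and limit
are illustrative only); with the feature enabled, a limit set on a
parent group also applies to its children:
  # boot with: ... blkcg_global_limit=1
  mkdir -p /sys/fs/cgroup/blkio/parent/child
  echo "8:0 1048576" > /sys/fs/cgroup/blkio/parent/blkio.throttle.write_bps_device
  echo $$ > /sys/fs/cgroup/blkio/parent/child/tasks   # writes now capped at 1MB/s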
Signed-off-by: Yu Kuai <yukuai3(a)huawei.com>
Reviewed-by: Hou Tao <houtao1(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
block/blk-throttle.c | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index ceeb17104aa82..6b3ced8ae1a5f 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -43,6 +43,19 @@ static struct blkcg_policy blkcg_policy_throtl;
/* A workqueue to queue throttle related work */
static struct workqueue_struct *kthrotld_workqueue;
+/* True if global limit is enabled in cgroup v1 */
+static bool global_limit;
+
+static int __init setup_global_limit(char *str)
+{
+ if (!strcmp(str, "1") || !strcmp(str, "Y") || !strcmp(str, "y"))
+ global_limit = true;
+
+ return 1;
+}
+
+__setup("blkcg_global_limit=", setup_global_limit);
+
/*
* To implement hierarchical throttling, throtl_grps form a tree and bios
* are dispatched upwards level by level until they reach the top and get
@@ -537,7 +550,8 @@ static void throtl_pd_init(struct blkg_policy_data *pd)
* regardless of the position of the group in the hierarchy.
*/
sq->parent_sq = &td->service_queue;
- if (cgroup_subsys_on_dfl(io_cgrp_subsys) && blkg->parent)
+ if ((cgroup_subsys_on_dfl(io_cgrp_subsys) || global_limit) &&
+ blkg->parent)
sq->parent_sq = &blkg_to_tg(blkg->parent)->service_queue;
tg->td = td;
}
--
2.25.1
[PATCH openEuler-1.0-LTS] xfs: map unwritten blocks in XFS_IOC_{ALLOC,FREE}SP just like fallocate
by Yang Yingliang 19 Jan '22
From: "Darrick J. Wong" <djwong(a)kernel.org>
mainline inclusion
from mainline-v5.16-rc5
commit 983d8e60f50806f90534cc5373d0ce867e5aaf79
category: bugfix
bugzilla: 186083
CVE: CVE-2021-4155
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
The old ALLOCSP/FREESP ioctls in XFS can be used to preallocate space at
the end of files, just like fallocate and RESVSP. Make the behavior
consistent with the other ioctls.
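For reference, a minimal userspace sketch of how this path is reached
(illustrative only, not part of the fix; it assumes the xfsprogs headers
for xfs_flock64_t and XFS_IOC_ALLOCSP64):

#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>
#include <string.h>
#include <xfs/xfs.h>	/* xfs_flock64_t, XFS_IOC_ALLOCSP64 (xfsprogs) */

/* Preallocate out to new_size; with this fix the blocks past the old
 * EOF are mapped unwritten, the same as fallocate would do. */
static int allocsp(int fd, off_t new_size)
{
	xfs_flock64_t fl;

	memset(&fl, 0, sizeof(fl));
	fl.l_whence = SEEK_SET;	/* l_start is an absolute offset */
	fl.l_start = new_size;	/* also becomes the new file size */
	fl.l_len = 0;
	return ioctl(fd, XFS_IOC_ALLOCSP64, &fl);
}

Before this change, such a call could leave the newly allocated blocks
mapped as written, exposing stale disk contents; fallocate and RESVSP
already mapped them unwritten.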
Reported-by: Kirill Tkhai <ktkhai(a)virtuozzo.com>
Signed-off-by: Darrick J. Wong <djwong(a)kernel.org>
Signed-off-by: Darrick J. Wong <darrick.wong(a)oracle.com>
Reviewed-by: Dave Chinner <dchinner(a)redhat.com>
Reviewed-by: Eric Sandeen <sandeen(a)redhat.com>
Signed-off-by: Guo Xuenan <guoxuenan(a)huawei.com>
Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
fs/xfs/xfs_ioctl.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index 263436977c39e..b6d85da7f3508 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -701,7 +701,8 @@ xfs_ioc_space(
flags |= XFS_PREALLOC_CLEAR;
if (bf->l_start > XFS_ISIZE(ip)) {
error = xfs_alloc_file_space(ip, XFS_ISIZE(ip),
- bf->l_start - XFS_ISIZE(ip), 0);
+ bf->l_start - XFS_ISIZE(ip),
+ XFS_BMAPI_PREALLOC);
if (error)
goto out_unlock;
}
--
2.25.1
Hi Liu
Thanks for your submission. We have two comments in total on the
previously submitted AMD Milan V2 adaptation patches; please confirm
them.
1. Under Essential, 08ed77e414ab (v5.10-rc1, "perf vendor events amd:
Add recommended events") has not been merged.
2. 41cd02c6f7f6 ("kvm: x86: Expose RDPID in KVM_GET_SUPPORTED_CPUID")
has two follow-up bugfixes that need confirmation:
new fix: 36fa06f9ff39 ("KVM: x86: Add support for RDPID without RDTSCP")
new fix: 8aec21c04caa ("KVM: VMX: Do not advertise RDPID if
ENABLE_RDTSCP control is unsupported")
BR
Laibin
On 2021/12/16 10:27, Jackie Liu wrote:
> Hi, ALL
>
> Attached are the AMD patches merged into the KYLIN kernel, rebased
> onto the openEuler branch. I haven't figured out the rules for
> creating the header, and this script doesn't seem suitable either,
> so I left it unchanged. Please check.
>
> --
> Jackie Liu
>
> On 2021/12/15 at 11:44 AM, Zheng Zengkai wrote:
>> #!/bin/sh
>> #
>> # Usage:
>> #
>> # prerequisite:
>> # The patches to add header should be patches from linux mainline.
>> #
>> # command:
>> # add-openeuler-patch-header.sh commit1..commit2 gitee_issue_url
>> #
>> # commit1: mainline start commit id;
>> # commit2: mainline end commit id
>> # gitee_issue_url: for example:
>> https://gitee.com/openeuler/kernel/issues/I4MKP4
>>
>> # Example:
>> # add-openeuler-patch-header.sh afd2ff9b7e1b..8bb7eca972a
>> https://gitee.com/openeuler/kernel/issues/I4MKP4
>> #
>>
>> for p in `ls *.patch`
>> do
>> if [ $p == "0000-cover-letter.patch" ]; then
>> continue
>> fi
>>
>> title=$(sed '/^Subject: \[PATCH .*\]/{N;s/\n / /g}' $p | grep
>> -E "^Subject: \[PATCH .*\]" | cut -d ']' -f 2)
>> commit=$(git log --pretty=oneline $1 | grep -F "$title" | cut
>> -d ' ' -f 1)
>> multi=$(echo "$commit" | sed -n '2p') # non-empty if several match
>> if [ -n "$multi" ]; then
>> echo "$p" has multi-commits
>> commit="multi"
>> continue
>> fi
>> if [ -z "$commit" ]; then
>> echo "$p" commit is null
>> commit="null"
>> continue
>> fi
>>
>> # find insert line
>> i=5
>> while true
>> do
>> ln=""$i"p"
>> sub=`cat $p | sed -n "$ln"`
>> i=$(($i+1))
>> if [ -z "$sub" ]; then
>> break
>> fi
>> done
>> i=$(($i-1))
>>
>> version=`git name-rev $commit | awk -F '[v~^]' '{print $2}'`
>>
>> sed ""$i"a mainline inclusion\nfrom
>> mainline-v$version\ncommit $commit\ncategory: feature\nbugzilla:
>> $2\nCVE: NA\n\n--------------------------------\n" -i $p
>> done
>>
>
[PATCH openEuler-1.0-LTS V2 00/90] Enabling AMD Milan series processor support
by Laibin Qiu 19 Jan '22
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MKP4
Babu Moger (1):
KVM: SVM: Clear the CR4 register on reset
David Edmondson (1):
KVM: x86: clflushopt should be treated as a no-op by emulation
Fenghua Yu (2):
x86/cpufeatures: Enumerate MOVDIRI instruction
x86/cpufeatures: Enumerate MOVDIR64B instruction
Haiyan Song (2):
perf vendor events intel: Add Icelake V1.00 event file
perf vendor events intel: Add Tremontx event file v1.02
Isaac Vaughn (1):
EDAC/amd64: Add PCI device IDs for family 17h, model 70h
Jan H. Schönherr (1):
x86/mce: Fix use of uninitialized MCE message string
Jim Mattson (1):
kvm: x86: Expose RDPID in KVM_GET_SUPPORTED_CPUID
John Allen (2):
kvm/svm: PKU not currently supported
x86/microcode/AMD: Increase microcode PATCH_MAX_SIZE
Kai Huang (3):
kvm: x86: Move kvm_set_mmio_spte_mask() from x86.c to mmu.c
kvm: x86: Fix reserved bits related calculation errors caused by MKTME
kvm: x86: Fix L1TF mitigation for shadow MMU
Kan Liang (1):
perf vendor events intel: Add uncore_upi JSON support
Kim Phillips (17):
x86/cpu/amd: Call init_amd_zn() om Family 19h processors too
perf/amd/uncore: Prepare L3 thread mask code for Family 19h
perf/amd/uncore: Make L3 thread mask code more readable
perf/amd/uncore: Add support for Family 19h L3 PMU
arch/x86/amd/ibs: Fix re-arming IBS Fetch
perf/x86/amd/ibs: Fix raw sample data accumulation
perf/amd/uncore: Set all slices and threads to restore perf stat -a
behaviour
perf/x86/amd/ibs: Don't include randomized bits in get_ibs_op_count()
perf/amd/uncore: Prepare to scale for more attributes that vary per
family
perf/amd/uncore: Allow F17h user threadmask and slicemask
specification
perf/amd/uncore: Allow F19h user coreid, threadmask, and sliceid
specification
perf vendor events amd: Add L3 cache events for Family 17h
perf vendor events amd: Remove redundant '['
perf vendor events amd: Enable Family 19h users by matching Zen2
events
perf/x86/amd/ibs: Support 27-bit extended Op/cycle counter
tools/power turbostat: Support AMD Family 19h
perf vendor events amd: Add L2 Prefetch events for zen1
Krish Sadhukhan (1):
KVM: SVM: Replace hard-coded value with #define
Like Xu (1):
perf/x86/amd: Don't touch the AMD64_EVENTSEL_HOSTONLY bit inside the
guest
Liu Jingqi (2):
KVM: x86: expose MOVDIRI CPU feature into VM.
KVM: x86: expose MOVDIR64B CPU feature into VM.
Maciej S. Szmigiero (1):
KVM: mmu: Fix SPTE encoding of MMIO generation upper half
Marcel Bocu (1):
x86/amd_nb: Add PCI device IDs for family 17h, model 70h
Martin Liška (1):
perf vendor events amd: perf PMU events for AMD Family 17h
Nathan Chancellor (1):
perf/amd/uncore: Fix sysfs type mismatch
Paolo Bonzini (3):
KVM: x86: only do L1TF workaround on affected processors
KVM: x86: assign two bits to track SPTE kinds
KVM: x86: fix overlap between SPTE_MMIO_MASK and generation
Rasmus Villemoes (1):
build_bug.h: add wrapper for _Static_assert
Sean Christopherson (13):
KVM: nVMX: Allocate and configure VM{READ,WRITE} bitmaps iff
enable_shadow_vmcs
KVM: x86: Add requisite includes to kvm_cache_regs.h
KVM: x86: Add requisite includes to hyperv.h
KVM: x86: Use a u64 when passing the MMIO gen around
KVM: Explicitly define the "memslot update in-progress" bit
KVM: x86: Refactor the MMIO SPTE generation handling
KVM: x86: Rename access permissions cache member in struct
kvm_vcpu_arch
KVM: x86/mmu: Add explicit access mask for MMIO SPTEs
KVM: x86/mmu: Consolidate "is MMIO SPTE" code
KVM: x86/mmu: Apply max PA check for MMIO sptes to 32-bit KVM
KVM: x86/mmu: Set mmio_value to '0' if reserved #PF can't be generated
KVM: Remove the hack to trigger memslot generation wraparound
KVM: Move the memslot update in-progress flag to bit 63
Sebastian Andrzej Siewior (3):
x86/pkeys: Provide *pkru() helpers
x86/fpu: Only write PKRU if it is different from current
x86/pkeys: Don't check if PKRU is zero before writing it
Tom Lendacky (1):
KVM: SVM: Override default MMIO mask if memory encryption is enabled
Vijay Thakkar (3):
perf vendor events amd: Restrict model detection for zen1 based
processors
perf vendor events amd: Add Zen2 events
perf vendor events amd: Update Zen1 events to V2
Woods, Brian (2):
hwmon/k10temp, x86/amd_nb: Consolidate shared device IDs
x86/amd_nb: Add PCI device IDs for family 17h, model 30h
Yazen Ghannam (24):
EDAC/amd64: Drop some family checks for newer systems
x86/amd_nb: Add Family 19h PCI IDs
EDAC/mce_amd: Always load on SMCA systems
x86/MCE/AMD, EDAC/mce_amd: Add new MP5, NBIO, and PCIE SMCA bank types
x86/MCE/AMD, EDAC/mce_amd: Add new McaTypes for CS, PSP, and SMU units
x86/MCE/AMD, EDAC/mce_amd: Add new error descriptions for some SMCA
bank types
x86/MCE/AMD, EDAC/mce_amd: Add new Load Store unit McaType
EDAC/amd64: Use a macro for iterating over Unified Memory Controllers
EDAC/amd64: Support more than two controllers for chip selects
handling
EDAC/amd64: Initialize DIMM info for systems with more than two
channels
EDAC/amd64: Add Family 17h Model 30h PCI IDs
EDAC/amd64: Support more than two Unified Memory Controllers
EDAC/amd64: Set maximum channel layer size depending on family
EDAC/amd64: Recognize x16 symbol size
EDAC/amd64: Adjust printed chip select sizes when interleaved
EDAC/amd64: Find Chip Select memory size using Address Mask
EDAC/amd64: Cache secondary Chip Select registers
EDAC/amd64: Support asymmetric dual-rank DIMMs
EDAC/amd64: Set grain per DIMM
EDAC/amd64: Make struct amd64_family_type global
EDAC/amd64: Gather hardware information early
EDAC/amd64: Save max number of controllers to family type
EDAC/amd64: Add family ops for Family 19h Models 00h-0Fh
EDAC/amd64: Handle three rank interleaving mode
Documentation/virtual/kvm/mmu.txt | 13 +-
arch/x86/events/amd/ibs.c | 93 +-
arch/x86/events/amd/uncore.c | 179 ++--
arch/x86/events/perf_event.h | 3 +-
arch/x86/include/asm/cpufeatures.h | 4 +-
arch/x86/include/asm/kvm_host.h | 10 +-
arch/x86/include/asm/mce.h | 7 +
arch/x86/include/asm/microcode_amd.h | 2 +-
arch/x86/include/asm/msr-index.h | 1 +
arch/x86/include/asm/perf_event.h | 16 +-
arch/x86/include/asm/pgtable.h | 2 +-
arch/x86/include/asm/special_insns.h | 19 +-
arch/x86/kernel/amd_nb.c | 15 +-
arch/x86/kernel/cpu/amd.c | 3 +-
arch/x86/kernel/cpu/mce/amd.c | 28 +-
arch/x86/kernel/cpu/mce/core.c | 4 +-
arch/x86/kvm/cpuid.c | 6 +-
arch/x86/kvm/emulate.c | 8 +-
arch/x86/kvm/hyperv.h | 2 +
arch/x86/kvm/kvm_cache_regs.h | 2 +
arch/x86/kvm/mmu.c | 215 +++--
arch/x86/kvm/mmu.h | 2 +-
arch/x86/kvm/svm.c | 53 +-
arch/x86/kvm/vmx.c | 53 +-
arch/x86/kvm/x86.c | 33 +-
arch/x86/kvm/x86.h | 4 +-
arch/x86/mm/pkeys.c | 7 -
drivers/edac/amd64_edac.c | 609 ++++++++----
drivers/edac/amd64_edac.h | 31 +-
drivers/edac/mce_amd.c | 134 ++-
drivers/hwmon/k10temp.c | 9 +-
include/linux/build_bug.h | 19 +
include/linux/kvm_host.h | 21 +
include/linux/pci_ids.h | 5 +
.../pmu-events/arch/x86/amdzen1/branch.json | 23 +
.../pmu-events/arch/x86/amdzen1/cache.json | 312 ++++++
.../pmu-events/arch/x86/amdzen1/core.json | 125 +++
.../arch/x86/amdzen1/floating-point.json | 224 +++++
.../pmu-events/arch/x86/amdzen1/memory.json | 184 ++++
.../pmu-events/arch/x86/amdzen1/other.json | 56 ++
.../pmu-events/arch/x86/amdzen2/branch.json | 52 +
.../pmu-events/arch/x86/amdzen2/cache.json | 338 +++++++
.../pmu-events/arch/x86/amdzen2/core.json | 130 +++
.../arch/x86/amdzen2/floating-point.json | 140 +++
.../pmu-events/arch/x86/amdzen2/memory.json | 341 +++++++
.../pmu-events/arch/x86/amdzen2/other.json | 115 +++
.../pmu-events/arch/x86/icelake/cache.json | 552 +++++++++++
.../arch/x86/icelake/floating-point.json | 102 ++
.../pmu-events/arch/x86/icelake/frontend.json | 424 +++++++++
.../pmu-events/arch/x86/icelake/memory.json | 410 ++++++++
.../pmu-events/arch/x86/icelake/other.json | 121 +++
.../pmu-events/arch/x86/icelake/pipeline.json | 892 ++++++++++++++++++
.../arch/x86/icelake/virtual-memory.json | 236 +++++
tools/perf/pmu-events/arch/x86/mapfile.csv | 6 +
.../pmu-events/arch/x86/tremontx/cache.json | 111 +++
.../arch/x86/tremontx/frontend.json | 26 +
.../pmu-events/arch/x86/tremontx/memory.json | 26 +
.../pmu-events/arch/x86/tremontx/other.json | 26 +
.../arch/x86/tremontx/pipeline.json | 111 +++
.../arch/x86/tremontx/uncore-memory.json | 73 ++
.../arch/x86/tremontx/uncore-other.json | 431 +++++++++
.../arch/x86/tremontx/uncore-power.json | 11 +
.../arch/x86/tremontx/virtual-memory.json | 86 ++
tools/perf/pmu-events/jevents.c | 2 +
tools/power/x86/turbostat/turbostat.c | 34 +-
virt/kvm/kvm_main.c | 36 +-
66 files changed, 6839 insertions(+), 529 deletions(-)
create mode 100644 tools/perf/pmu-events/arch/x86/amdzen1/branch.json
create mode 100644 tools/perf/pmu-events/arch/x86/amdzen1/cache.json
create mode 100644 tools/perf/pmu-events/arch/x86/amdzen1/core.json
create mode 100644 tools/perf/pmu-events/arch/x86/amdzen1/floating-point.json
create mode 100644 tools/perf/pmu-events/arch/x86/amdzen1/memory.json
create mode 100644 tools/perf/pmu-events/arch/x86/amdzen1/other.json
create mode 100644 tools/perf/pmu-events/arch/x86/amdzen2/branch.json
create mode 100644 tools/perf/pmu-events/arch/x86/amdzen2/cache.json
create mode 100644 tools/perf/pmu-events/arch/x86/amdzen2/core.json
create mode 100644 tools/perf/pmu-events/arch/x86/amdzen2/floating-point.json
create mode 100644 tools/perf/pmu-events/arch/x86/amdzen2/memory.json
create mode 100644 tools/perf/pmu-events/arch/x86/amdzen2/other.json
create mode 100644 tools/perf/pmu-events/arch/x86/icelake/cache.json
create mode 100644 tools/perf/pmu-events/arch/x86/icelake/floating-point.json
create mode 100644 tools/perf/pmu-events/arch/x86/icelake/frontend.json
create mode 100644 tools/perf/pmu-events/arch/x86/icelake/memory.json
create mode 100644 tools/perf/pmu-events/arch/x86/icelake/other.json
create mode 100644 tools/perf/pmu-events/arch/x86/icelake/pipeline.json
create mode 100644 tools/perf/pmu-events/arch/x86/icelake/virtual-memory.json
create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/cache.json
create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/frontend.json
create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/memory.json
create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/other.json
create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/pipeline.json
create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/uncore-memory.json
create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/uncore-other.json
create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/uncore-power.json
create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/virtual-memory.json
--
2.22.0
openEuler inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4OFDI
CVE: NA
--------------------------------
Chao Liu (2):
change arm64 configs
change x86 configs
arch/arm64/configs/openeuler_defconfig | 288 ++++++++-----
arch/x86/configs/openeuler_defconfig | 544 +++++++++----------------
2 files changed, 376 insertions(+), 456 deletions(-)
--
2.27.0
openEuler inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4OFDI
CVE: NA
--------------------------------
Chao Liu (2):
change arm64 configs
change x86 configs
arch/arm64/configs/openeuler_defconfig | 287 ++++++++-----
arch/x86/configs/openeuler_defconfig | 544 +++++++++----------------
2 files changed, 375 insertions(+), 456 deletions(-)
--
2.27.0
hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4O31I?from=project-issue
CVE: NA
------------------------------
Register pmem on arm64:
Use memmap (memmap=nn[KMG]!ss[KMG]) to reserve memory for pmem and
register it as persistent memory on arm64. When the kernel restarts
or is updated, the data in PMEM is not lost and can be loaded faster.
This is a generic feature. Currently, only one pmem region can be
registered and used.
To use this feature, do the following:
1. Reserve memory: add memmap to the kernel cmdline in grub.cfg,
   memmap=nn[KMG]!ss[KMG], e.g. memmap=100K!0x1a0000000.
2. Load the module: modprobe nd_e820.
3. Check the pmem device in /dev, e.g. /dev/pmem0, as in the example
   below.
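A possible end-to-end session (all values and the device name are
illustrative only; the device name depends on registration order):
  # in grub.cfg, appended to the kernel cmdline: memmap=1G!0x100000000
  modprobe nd_e820
  mkfs.ext4 /dev/pmem0
  mount -o dax /dev/pmem0 /mnt/pmem   # contents persist across a reboot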
drivers/nvdimm/e820.c:
This file scans "iomem_resource" and takes advantage of the nvdimm
resource discovery mechanism by registering a resource named
"Persistent Memory (legacy)"; this function does not depend on the
architecture.
We will push the feature to the upstream Linux kernel community and
discuss renaming the file there, because people have the mistaken
notion that e820.c depends on x86.
Signed-off-by: Zhuling <zhuling8(a)huawei.com>
---
arch/arm64/Kconfig | 12 ++++++++++++
arch/arm64/kernel/Makefile | 1 +
arch/arm64/kernel/pmem.c | 34 ++++++++++++++++++++++++++++++++++
drivers/nvdimm/Kconfig | 7 +++++++
drivers/nvdimm/Makefile | 1 +
5 files changed, 55 insertions(+)
create mode 100644 arch/arm64/kernel/pmem.c
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 2cab963..105221c 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1325,6 +1325,18 @@ config RODATA_FULL_DEFAULT_ENABLED
This requires the linear region to be mapped down to pages,
which may adversely affect performance in some cases.
+config ARM64_PMEM_LEGACY_DEVICE
+ bool "Create persistent storage"
+ depends on BLK_DEV
+ depends on LIBNVDIMM
+ select ARM64_PMEM_RESERVE
+ help
+ Use reserved memory for persistent storage when the kernel
+ restart or update. the data in PMEM will not be lost and
+ can be loaded faster.
+
+ Say y if unsure.
+
config ARM64_SW_TTBR0_PAN
bool "Emulate Privileged Access Never using TTBR0_EL1 switching"
help
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 169d90f..f615325 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -68,6 +68,7 @@ obj-$(CONFIG_ARM64_PTR_AUTH) += pointer_auth.o
obj-$(CONFIG_SHADOW_CALL_STACK) += scs.o
obj-$(CONFIG_ARM64_MTE) += mte.o
obj-$(CONFIG_MPAM) += mpam/
+obj-$(CONFIG_ARM64_PMEM_LEGACY_DEVICE) += pmem.o
obj-y += vdso/ probes/
obj-$(CONFIG_COMPAT_VDSO) += vdso32/
diff --git a/arch/arm64/kernel/pmem.c b/arch/arm64/kernel/pmem.c
new file mode 100644
index 0000000..2312ac0
--- /dev/null
+++ b/arch/arm64/kernel/pmem.c
@@ -0,0 +1,34 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * arm64 implement PMEM.
+ * based on arch/x86/kernel/pmem.c
+ */
+#include <linux/platform_device.h>
+#include <linux/init.h>
+#include <linux/ioport.h>
+#include <linux/module.h>
+
+static int found(struct resource *res, void *data)
+{
+ return 1;
+}
+
+static int __init register_e820_pmem(void)
+{
+ struct platform_device *pdev;
+ int rc;
+
+ rc = walk_iomem_res_desc(IORES_DESC_PERSISTENT_MEMORY_LEGACY,
+ IORESOURCE_MEM, 0, -1, NULL, found);
+ if (rc <= 0)
+ return 0;
+
+ /*
+ * See drivers/nvdimm/e820.c for the implementation, this is
+ * simply here to trigger the module to load on demand.
+ */
+ pdev = platform_device_alloc("e820_pmem", -1);
+
+ return platform_device_add(pdev);
+}
+device_initcall(register_e820_pmem);
diff --git a/drivers/nvdimm/Kconfig b/drivers/nvdimm/Kconfig
index b7d1eb3..ce5d4fd 100644
--- a/drivers/nvdimm/Kconfig
+++ b/drivers/nvdimm/Kconfig
@@ -132,3 +132,10 @@ config NVDIMM_TEST_BUILD
infrastructure.
endif
+
+config PMEM_LEGACY
+ tristate "Pmem_legacy"
+ depends on X86 || ARM64
+ select X86_PMEM_LEGACY if X86
+ select ARM64_PMEM_LEGACY_DEVICE if ARM64
+
diff --git a/drivers/nvdimm/Makefile b/drivers/nvdimm/Makefile
index 0407753..6f8dc92 100644
--- a/drivers/nvdimm/Makefile
+++ b/drivers/nvdimm/Makefile
@@ -3,6 +3,7 @@ obj-$(CONFIG_LIBNVDIMM) += libnvdimm.o
obj-$(CONFIG_BLK_DEV_PMEM) += nd_pmem.o
obj-$(CONFIG_ND_BTT) += nd_btt.o
obj-$(CONFIG_ND_BLK) += nd_blk.o
+obj-$(CONFIG_PMEM_LEGACY) += nd_e820.o
obj-$(CONFIG_OF_PMEM) += of_pmem.o
obj-$(CONFIG_VIRTIO_PMEM) += virtio_pmem.o nd_virtio.o
--
2.9.5
From: Liu Shixin <liushixin2(a)huawei.com>
hulk inclusion
category: feature
bugzilla: 46904, https://gitee.com/openeuler/kernel/issues/I4QSHG
CVE: NA
--------------------------------
There are several functions that will be used in subsequent patches
for the dynamic hugetlb feature. Declare them.
No functional changes.
Signed-off-by: Liu Shixin <liushixin2(a)huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
include/linux/hugetlb.h | 3 +++
include/linux/memcontrol.h | 16 ++++++++++++++++
include/linux/memory_hotplug.h | 6 ++++++
mm/hugetlb.c | 2 +-
mm/internal.h | 3 +++
mm/memcontrol.c | 16 ----------------
mm/memory_hotplug.c | 3 +--
mm/page_alloc.c | 6 +++---
8 files changed, 33 insertions(+), 22 deletions(-)
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 61c38e6c6c43..a1135c43719e 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -471,6 +471,9 @@ static inline struct hstate *hstate_inode(struct inode *i)
{
return HUGETLBFS_SB(i->i_sb)->hstate;
}
+
+bool prep_compound_gigantic_page(struct page *page, unsigned int order);
+
#else /* !CONFIG_HUGETLBFS */
#define is_file_hugepages(file) false
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index e5826f1ff337..2e0a480a8665 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -1239,6 +1239,22 @@ unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
gfp_t gfp_mask,
unsigned long *total_scanned);
+/*
+ * Test whether @memcg has children, dead or alive. Note that this
+ * function doesn't care whether @memcg has use_hierarchy enabled and
+ * returns %true if there are child csses according to the cgroup
+ * hierarchy. Testing use_hierarchy is the caller's responsibility.
+ */
+static inline bool memcg_has_children(struct mem_cgroup *memcg)
+{
+ bool ret;
+
+ rcu_read_lock();
+ ret = css_next_child(NULL, &memcg->css);
+ rcu_read_unlock();
+ return ret;
+}
+
#else /* CONFIG_MEMCG */
#define MEM_CGROUP_ID_SHIFT 0
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index c60bda5cbb17..b9aeabcce49a 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -284,6 +284,7 @@ static inline void pgdat_resize_init(struct pglist_data *pgdat) {}
#ifdef CONFIG_MEMORY_HOTREMOVE
+extern int do_migrate_range(unsigned long start_pfn, unsigned long end_pfn);
extern void try_offline_node(int nid);
extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages);
extern int remove_memory(int nid, u64 start, u64 size);
@@ -291,6 +292,11 @@ extern void __remove_memory(int nid, u64 start, u64 size);
extern int offline_and_remove_memory(int nid, u64 start, u64 size);
#else
+static inline int do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
+{
+ return -ENOSYS;
+}
+
static inline void try_offline_node(int nid) {}
static inline int offline_pages(unsigned long start_pfn, unsigned long nr_pages)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 219bf083dc8a..fa3cba3571cc 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1633,7 +1633,7 @@ static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
spin_unlock_irq(&hugetlb_lock);
}
-static bool prep_compound_gigantic_page(struct page *page, unsigned int order)
+bool prep_compound_gigantic_page(struct page *page, unsigned int order)
{
int i, j;
int nr_pages = 1 << order;
diff --git a/mm/internal.h b/mm/internal.h
index db9546707695..31517354f3c7 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -197,6 +197,9 @@ extern void __free_pages_core(struct page *page, unsigned int order);
extern void prep_compound_page(struct page *page, unsigned int order);
extern void post_alloc_hook(struct page *page, unsigned int order,
gfp_t gfp_flags);
+extern void prep_new_page(struct page *page, unsigned int order, gfp_t gfp_flags,
+ unsigned int alloc_flags);
+extern bool free_pages_prepare(struct page *page, unsigned int order, bool check_free);
extern int user_min_free_kbytes;
extern void zone_pcp_update(struct zone *zone);
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 8acf7ff56294..011aff396af2 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3402,22 +3402,6 @@ unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
return nr_reclaimed;
}
-/*
- * Test whether @memcg has children, dead or alive. Note that this
- * function doesn't care whether @memcg has use_hierarchy enabled and
- * returns %true if there are child csses according to the cgroup
- * hierarchy. Testing use_hierarchy is the caller's responsibility.
- */
-static inline bool memcg_has_children(struct mem_cgroup *memcg)
-{
- bool ret;
-
- rcu_read_lock();
- ret = css_next_child(NULL, &memcg->css);
- rcu_read_unlock();
- return ret;
-}
-
/*
* Reclaims as many pages from the given memcg as possible.
*
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 1549b19b36f6..73ea92dae74a 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1165,8 +1165,7 @@ static int scan_movable_pages(unsigned long start, unsigned long end,
return 0;
}
-static int
-do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
+int do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
{
unsigned long pfn;
struct page *page, *head;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 988051bf6795..0ff4f4e3a538 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1203,7 +1203,7 @@ static void kernel_init_free_pages(struct page *page, int numpages)
kasan_enable_current();
}
-static __always_inline bool free_pages_prepare(struct page *page,
+__always_inline bool free_pages_prepare(struct page *page,
unsigned int order, bool check_free)
{
int bad = 0;
@@ -2283,8 +2283,8 @@ inline void post_alloc_hook(struct page *page, unsigned int order,
set_page_owner(page, order, gfp_flags);
}
-static void prep_new_page(struct page *page, unsigned int order, gfp_t gfp_flags,
- unsigned int alloc_flags)
+void prep_new_page(struct page *page, unsigned int order, gfp_t gfp_flags,
+ unsigned int alloc_flags)
{
post_alloc_hook(page, order, gfp_flags);
--
2.20.1
From: Matteo Croce <mcroce(a)microsoft.com>
mainline inclusion
category: feature
from mainline-v5.11
commit 2c622ed0eaa38b68d7440bedb8c6cdd138b5a860
bugzilla: https://gitee.com/openeuler/kernel/issues/I4BULZ
CVE: NA
--------------------------------
The kernel cmdline reboot= option offers some sort of control on how the
reboot is issued.
We don't always know in advance what type of reboot to perform.
Sometimes a warm reboot is preferred to persist certain memory regions
across the reboot. Other times a cold one is needed to apply a future
system update that makes a memory model change, like changing the base
page size or resizing a persistent memory region.
Or simply we want to enable reboot_force because we noticed that
something bad happened.
Add handles in sysfs to allow setting these reboot options, so they can
be changed when the system is booted, other than at boot time.
The handlers are under <sysfs>/kernel/reboot, can be read to get the
current configuration and written to alter it.
# cd /sys/kernel/reboot/
# grep . *
cpu:0
force:0
mode:cold
type:acpi
# echo 2 >cpu
# echo yes >force
# echo soft >mode
# echo bios >type
# grep . *
cpu:2
force:1
mode:soft
type:bios
Before setting anything, check for CAP_SYS_BOOT capability, so it's
possible to allow an unpriviledged process to change these settings simply
by relaxing the handles permissions, without opening them to the world.
[natechancellor(a)gmail.com: fix variable assignments in type_store]
Link: https://lkml.kernel.org/r/20201112035023.974748-1-natechancellor@gmail.com
Link: https://github.com/ClangBuiltLinux/linux/issues/1197
Link: https://lkml.kernel.org/r/20201110202746.9690-1-mcroce@linux.microsoft.com
Signed-off-by: Matteo Croce <mcroce(a)microsoft.com>
Signed-off-by: Nathan Chancellor <natechancellor(a)gmail.com>
Reviewed-by: Petr Mladek <pmladek(a)suse.com>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: Guenter Roeck <linux(a)roeck-us.net>
Cc: Arnd Bergmann <arnd(a)arndb.de>
Cc: Pavel Tatashin <pasha.tatashin(a)soleen.com>
Cc: Kees Cook <keescook(a)chromium.org>
Cc: Tyler Hicks <tyhicks(a)linux.microsoft.com>
Cc: Nathan Chancellor <natechancellor(a)gmail.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Li Hongyu <543306408(a)qq.com>
---
Documentation/ABI/testing/sysfs-kernel-reboot | 32 +++
kernel/reboot.c | 206 ++++++++++++++++++
2 files changed, 238 insertions(+)
create mode 100644 Documentation/ABI/testing/sysfs-kernel-reboot
diff --git a/Documentation/ABI/testing/sysfs-kernel-reboot b/Documentation/ABI/testing/sysfs-kernel-reboot
new file mode 100644
index 000000000000..837330fb2511
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-kernel-reboot
@@ -0,0 +1,32 @@
+What: /sys/kernel/reboot
+Date: November 2020
+KernelVersion: 5.11
+Contact: Matteo Croce <mcroce(a)microsoft.com>
+Description: Interface to set the kernel reboot behavior, similarly to
+ what can be done via the reboot= cmdline option.
+ (see Documentation/admin-guide/kernel-parameters.txt)
+
+What: /sys/kernel/reboot/mode
+Date: November 2020
+KernelVersion: 5.11
+Contact: Matteo Croce <mcroce(a)microsoft.com>
+Description: Reboot mode. Valid values are: cold warm hard soft gpio
+
+What: /sys/kernel/reboot/type
+Date: November 2020
+KernelVersion: 5.11
+Contact: Matteo Croce <mcroce(a)microsoft.com>
+Description: Reboot type. Valid values are: bios acpi kbd triple efi pci
+
+What: /sys/kernel/reboot/cpu
+Date: November 2020
+KernelVersion: 5.11
+Contact: Matteo Croce <mcroce(a)microsoft.com>
+Description: CPU number to use to reboot.
+
+What: /sys/kernel/reboot/force
+Date: November 2020
+KernelVersion: 5.11
+Contact: Matteo Croce <mcroce(a)microsoft.com>
+Description: Don't wait for any other CPUs on reboot and
+ avoid anything that could hang.
diff --git a/kernel/reboot.c b/kernel/reboot.c
index aa3bfd6c673b..940cbb784e17 100644
--- a/kernel/reboot.c
+++ b/kernel/reboot.c
@@ -600,3 +600,209 @@ static int __init reboot_setup(char *str)
return 1;
}
__setup("reboot=", reboot_setup);
+
+#ifdef CONFIG_SYSFS
+
+#define REBOOT_COLD_STR "cold"
+#define REBOOT_WARM_STR "warm"
+#define REBOOT_HARD_STR "hard"
+#define REBOOT_SOFT_STR "soft"
+#define REBOOT_GPIO_STR "gpio"
+#define REBOOT_UNDEFINED_STR "undefined"
+
+#define BOOT_TRIPLE_STR "triple"
+#define BOOT_KBD_STR "kbd"
+#define BOOT_BIOS_STR "bios"
+#define BOOT_ACPI_STR "acpi"
+#define BOOT_EFI_STR "efi"
+#define BOOT_CF9_FORCE_STR "cf9_force"
+#define BOOT_CF9_SAFE_STR "cf9_safe"
+
+static ssize_t mode_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
+{
+ const char *val;
+
+ switch (reboot_mode) {
+ case REBOOT_COLD:
+ val = REBOOT_COLD_STR;
+ break;
+ case REBOOT_WARM:
+ val = REBOOT_WARM_STR;
+ break;
+ case REBOOT_HARD:
+ val = REBOOT_HARD_STR;
+ break;
+ case REBOOT_SOFT:
+ val = REBOOT_SOFT_STR;
+ break;
+ case REBOOT_GPIO:
+ val = REBOOT_GPIO_STR;
+ break;
+ default:
+ val = REBOOT_UNDEFINED_STR;
+ }
+
+ return sprintf(buf, "%s\n", val);
+}
+static ssize_t mode_store(struct kobject *kobj, struct kobj_attribute *attr,
+ const char *buf, size_t count)
+{
+ if (!capable(CAP_SYS_BOOT))
+ return -EPERM;
+
+ if (!strncmp(buf, REBOOT_COLD_STR, strlen(REBOOT_COLD_STR)))
+ reboot_mode = REBOOT_COLD;
+ else if (!strncmp(buf, REBOOT_WARM_STR, strlen(REBOOT_WARM_STR)))
+ reboot_mode = REBOOT_WARM;
+ else if (!strncmp(buf, REBOOT_HARD_STR, strlen(REBOOT_HARD_STR)))
+ reboot_mode = REBOOT_HARD;
+ else if (!strncmp(buf, REBOOT_SOFT_STR, strlen(REBOOT_SOFT_STR)))
+ reboot_mode = REBOOT_SOFT;
+ else if (!strncmp(buf, REBOOT_GPIO_STR, strlen(REBOOT_GPIO_STR)))
+ reboot_mode = REBOOT_GPIO;
+ else
+ return -EINVAL;
+
+ return count;
+}
+static struct kobj_attribute reboot_mode_attr = __ATTR_RW(mode);
+
+static ssize_t type_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
+{
+ const char *val;
+
+ switch (reboot_type) {
+ case BOOT_TRIPLE:
+ val = BOOT_TRIPLE_STR;
+ break;
+ case BOOT_KBD:
+ val = BOOT_KBD_STR;
+ break;
+ case BOOT_BIOS:
+ val = BOOT_BIOS_STR;
+ break;
+ case BOOT_ACPI:
+ val = BOOT_ACPI_STR;
+ break;
+ case BOOT_EFI:
+ val = BOOT_EFI_STR;
+ break;
+ case BOOT_CF9_FORCE:
+ val = BOOT_CF9_FORCE_STR;
+ break;
+ case BOOT_CF9_SAFE:
+ val = BOOT_CF9_SAFE_STR;
+ break;
+ default:
+ val = REBOOT_UNDEFINED_STR;
+ }
+
+ return sprintf(buf, "%s\n", val);
+}
+static ssize_t type_store(struct kobject *kobj, struct kobj_attribute *attr,
+ const char *buf, size_t count)
+{
+ if (!capable(CAP_SYS_BOOT))
+ return -EPERM;
+
+ if (!strncmp(buf, BOOT_TRIPLE_STR, strlen(BOOT_TRIPLE_STR)))
+ reboot_type = BOOT_TRIPLE;
+ else if (!strncmp(buf, BOOT_KBD_STR, strlen(BOOT_KBD_STR)))
+ reboot_type = BOOT_KBD;
+ else if (!strncmp(buf, BOOT_BIOS_STR, strlen(BOOT_BIOS_STR)))
+ reboot_type = BOOT_BIOS;
+ else if (!strncmp(buf, BOOT_ACPI_STR, strlen(BOOT_ACPI_STR)))
+ reboot_type = BOOT_ACPI;
+ else if (!strncmp(buf, BOOT_EFI_STR, strlen(BOOT_EFI_STR)))
+ reboot_type = BOOT_EFI;
+ else if (!strncmp(buf, BOOT_CF9_FORCE_STR, strlen(BOOT_CF9_FORCE_STR)))
+ reboot_type = BOOT_CF9_FORCE;
+ else if (!strncmp(buf, BOOT_CF9_SAFE_STR, strlen(BOOT_CF9_SAFE_STR)))
+ reboot_type = BOOT_CF9_SAFE;
+ else
+ return -EINVAL;
+
+ return count;
+}
+static struct kobj_attribute reboot_type_attr = __ATTR_RW(type);
+
+static ssize_t cpu_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
+{
+ return sprintf(buf, "%d\n", reboot_cpu);
+}
+static ssize_t cpu_store(struct kobject *kobj, struct kobj_attribute *attr,
+ const char *buf, size_t count)
+{
+ unsigned int cpunum;
+ int rc;
+
+ if (!capable(CAP_SYS_BOOT))
+ return -EPERM;
+
+ rc = kstrtouint(buf, 0, &cpunum);
+
+ if (rc)
+ return rc;
+
+ if (cpunum >= num_possible_cpus())
+ return -ERANGE;
+
+ reboot_cpu = cpunum;
+
+ return count;
+}
+static struct kobj_attribute reboot_cpu_attr = __ATTR_RW(cpu);
+
+static ssize_t force_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
+{
+ return sprintf(buf, "%d\n", reboot_force);
+}
+static ssize_t force_store(struct kobject *kobj, struct kobj_attribute *attr,
+ const char *buf, size_t count)
+{
+ bool res;
+
+ if (!capable(CAP_SYS_BOOT))
+ return -EPERM;
+
+ if (kstrtobool(buf, &res))
+ return -EINVAL;
+
+ reboot_force = res;
+
+ return count;
+}
+static struct kobj_attribute reboot_force_attr = __ATTR_RW(force);
+
+static struct attribute *reboot_attrs[] = {
+ &reboot_mode_attr.attr,
+ &reboot_type_attr.attr,
+ &reboot_cpu_attr.attr,
+ &reboot_force_attr.attr,
+ NULL,
+};
+
+static const struct attribute_group reboot_attr_group = {
+ .attrs = reboot_attrs,
+};
+
+static int __init reboot_ksysfs_init(void)
+{
+ struct kobject *reboot_kobj;
+ int ret;
+
+ reboot_kobj = kobject_create_and_add("reboot", kernel_kobj);
+ if (!reboot_kobj)
+ return -ENOMEM;
+
+ ret = sysfs_create_group(reboot_kobj, &reboot_attr_group);
+ if (ret) {
+ kobject_put(reboot_kobj);
+ return ret;
+ }
+
+ return 0;
+}
+late_initcall(reboot_ksysfs_init);
+
+#endif
--
2.17.1
mainline inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4FHN2?from=project-issue
CVE: NA
---------------------------
In modern systems it's not unusual to have a system component monitoring
memory conditions of the system and tasked with keeping system memory
pressure under control. One way to accomplish that is to kill
non-essential processes to free up memory for more important ones.
Examples of this are Facebook's OOM killer daemon called oomd and
Android's low memory killer daemon called lmkd.
For such system component it's important to be able to free memory quickly
and efficiently. Unfortunately the time a process takes to free up its
memory after receiving a SIGKILL might vary based on the state of the
process (uninterruptible sleep), its size, and the OPP level of the core
the process is running on. A mechanism to free the resources of the
target process in a more predictable way would improve the system's
ability to control its
memory pressure.
Introduce process_mrelease system call that releases memory of a dying
process from the context of the caller. This way the memory is freed in a
more controllable way with CPU affinity and priority of the caller. The
workload of freeing the memory will also be charged to the caller. The
operation is allowed only on a dying process.
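As a usage sketch (not part of this patch): a userspace supervisor would
typically kill the target through its pidfd and then reap its memory from its
own context. __NR_process_mrelease is 441 per the syscall tables below; the
SYS_pidfd_open and SYS_pidfd_send_signal numbers from a recent glibc are
assumed.

#include <sys/syscall.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#ifndef __NR_process_mrelease
#define __NR_process_mrelease 441       /* per the syscall tables below */
#endif

int main(int argc, char **argv)
{
        int pidfd;

        if (argc < 2)
                return 1;
        pidfd = syscall(SYS_pidfd_open, atoi(argv[1]), 0);
        if (pidfd < 0)
                return 1;
        /* the target must be dying, so deliver SIGKILL first */
        if (syscall(SYS_pidfd_send_signal, pidfd, SIGKILL, NULL, 0))
                return 1;
        /* reap the victim's memory with the caller's CPU affinity/priority */
        if (syscall(__NR_process_mrelease, pidfd, 0))
                perror("process_mrelease");
        close(pidfd);
        return 0;
}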
Signed-off-by: Suren Baghdasaryan <surenb(a)google.com>
Reviewed-by: Shakeel Butt <shakeelb(a)google.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Acked-by: Christian Brauner <christian.brauner(a)ubuntu.com>
Cc: David Rientjes <rientjes(a)google.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Roman Gushchin <guro(a)fb.com>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Minchan Kim <minchan(a)kernel.org>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: Oleg Nesterov <oleg(a)redhat.com>
Cc: Jann Horn <jannh(a)google.com>
Cc: Geert Uytterhoeven <geert(a)linux-m68k.org>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Christian Brauner <christian.brauner(a)ubuntu.com>
Cc: Florian Weimer <fweimer(a)redhat.com>
Cc: Jan Engelhardt <jengelh(a)inai.de>
Cc: Tim Murray <timmurray(a)google.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Wen Zhiwei <wenzhiwei(a)kylinos.cn>
---
arch/arm64/include/asm/unistd32.h | 3 +-
arch/x86/entry/syscalls/syscall_64.tbl | 2 +-
include/linux/syscalls.h | 3 ++
include/uapi/asm-generic/unistd.h | 5 +-
kernel/sys_ni.c | 1 +
mm/oom_kill.c | 72 +++++++++++++++++++++++++
tools/include/uapi/asm-generic/unistd.h | 4 +-
7 files changed, 86 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
index 107f08e03b9f..e1786b2e8551 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -889,7 +889,8 @@ __SYSCALL(__NR_pidfd_getfd, sys_pidfd_getfd)
__SYSCALL(__NR_faccessat2, sys_faccessat2)
#define __NR_process_madvise 440
__SYSCALL(__NR_process_madvise, sys_process_madvise)
-
+#define __NR_process_mrelease 441
+__SYSCALL(__NR_process_mrelease, sys_process_mrelease)
/*
* Please add new compat syscalls above this comment and update
* __NR_compat_syscalls in asm/unistd.h.
diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index 379819244b91..6eec6496d72c 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -362,7 +362,7 @@
438 common pidfd_getfd sys_pidfd_getfd
439 common faccessat2 sys_faccessat2
440 common process_madvise sys_process_madvise
-
+441 common process_mrelease sys_process_mrelease
#
# Due to a historical design error, certain syscalls are numbered differently
# in x32 as compared to native x86_64. These syscalls have numbers 512-547.
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index aea0ce9f3b74..764bfcdbbcb0 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -905,6 +905,9 @@ asmlinkage long sys_mincore(unsigned long start, size_t len,
asmlinkage long sys_madvise(unsigned long start, size_t len, int behavior);
asmlinkage long sys_process_madvise(int pidfd, const struct iovec __user *vec,
size_t vlen, int behavior, unsigned int flags);
+
+asmlinkage long sys_process_mrelease(int pidfd, unsigned int flags);
+
asmlinkage long sys_remap_file_pages(unsigned long start, unsigned long size,
unsigned long prot, unsigned long pgoff,
unsigned long flags);
diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index 2056318988f7..e5f38ea960c4 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -859,9 +859,12 @@ __SYSCALL(__NR_pidfd_getfd, sys_pidfd_getfd)
__SYSCALL(__NR_faccessat2, sys_faccessat2)
#define __NR_process_madvise 440
__SYSCALL(__NR_process_madvise, sys_process_madvise)
+#define __NR_process_mrelease 441
+__SYSCALL(__NR_process_mrelease, sys_process_mrelease)
#undef __NR_syscalls
-#define __NR_syscalls 441
+#define __NR_syscalls 442
+
/*
* 32 bit systems traditionally used different
diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index f27ac94d5fa7..6b8203edf531 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -281,6 +281,7 @@ COND_SYSCALL(munlockall);
COND_SYSCALL(mincore);
COND_SYSCALL(madvise);
COND_SYSCALL(process_madvise);
+COND_SYSCALL(process_mrelease);
COND_SYSCALL(remap_file_pages);
COND_SYSCALL(mbind);
COND_SYSCALL_COMPAT(mbind);
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index fb39b0902476..d169eb511518 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -43,6 +43,7 @@
#include <linux/kthread.h>
#include <linux/init.h>
#include <linux/mmu_notifier.h>
+#include <linux/syscalls.h>
#include <asm/tlb.h>
#include "internal.h"
@@ -1186,3 +1187,74 @@ void pagefault_out_of_memory(void)
out_of_memory(&oc);
mutex_unlock(&oom_lock);
}
+
+SYSCALL_DEFINE2(process_mrelease, int, pidfd, unsigned int, flags)
+{
+#ifdef CONFIG_MMU
+ struct mm_struct *mm = NULL;
+ struct task_struct *task;
+ struct task_struct *p;
+ unsigned int f_flags;
+ bool reap = false;
+ struct pid *pid;
+ long ret = 0;
+
+ if (flags)
+ return -EINVAL;
+
+ pid = pidfd_get_pid(pidfd, &f_flags);
+ if (IS_ERR(pid))
+ return PTR_ERR(pid);
+
+ task = get_pid_task(pid, PIDTYPE_TGID);
+ if (!task) {
+ ret = -ESRCH;
+ goto put_pid;
+ }
+
+ /*
+ * Make sure to choose a thread which still has a reference to mm
+ * during the group exit
+ */
+ p = find_lock_task_mm(task);
+ if (!p) {
+ ret = -ESRCH;
+ goto put_task;
+ }
+
+ if (mmget_not_zero(p->mm)) {
+ mm = p->mm;
+ if (task_will_free_mem(p))
+ reap = true;
+ else {
+ /* Error only if the work has not been done already */
+ if (!test_bit(MMF_OOM_SKIP, &mm->flags))
+ ret = -EINVAL;
+ }
+ }
+ task_unlock(p);
+
+ if (!reap)
+ goto drop_mm;
+
+ if (mmap_read_lock_killable(mm)) {
+ ret = -EINTR;
+ goto drop_mm;
+ }
+ if (!__oom_reap_task_mm(mm))
+ ret = -EAGAIN;
+ mmap_read_unlock(mm);
+
+drop_mm:
+ if (mm)
+ mmput(mm);
+put_task:
+ put_task_struct(task);
+put_pid:
+ put_pid(pid);
+ return ret;
+#else
+ return -ENOSYS;
+#endif /* CONFIG_MMU */
+}
+
diff --git a/tools/include/uapi/asm-generic/unistd.h b/tools/include/uapi/asm-generic/unistd.h
index 2056318988f7..c252faa802d3 100644
--- a/tools/include/uapi/asm-generic/unistd.h
+++ b/tools/include/uapi/asm-generic/unistd.h
@@ -859,9 +859,11 @@ __SYSCALL(__NR_pidfd_getfd, sys_pidfd_getfd)
__SYSCALL(__NR_faccessat2, sys_faccessat2)
#define __NR_process_madvise 440
__SYSCALL(__NR_process_madvise, sys_process_madvise)
+#define __NR_process_mrelease 441
+__SYSCALL(__NR_process_mrelease, sys_process_mrelease)
#undef __NR_syscalls
-#define __NR_syscalls 441
+#define __NR_syscalls 442
/*
* 32 bit systems traditionally used different
--
2.30.0
2
1
[PATCH OLK-5.10] crypto:rmd320 rmd128 rmd256-remove RIPE-MD 320 128 256 hash algorithm
by Wen Zhiwei 18 Jan '22
mainline inclusion
category: performance
bugzilla: NA
CVE: NA
-----------------------------------------------------------
RIPE-MD 128, 256 and 320 are never referenced anywhere in the kernel, and are
unlikely to be depended upon by userspace via AF_ALG. So let's remove them.
Signed-off-by: Ard Biesheuvel <ardb(a)kernel.org>
Signed-off-by: Herbert Xu <herbert(a)gondor.apana.org.au>
Signed-off-by: Wen Zhiwei <wenzhiwei(a)kylinos.cn>
---
crypto/tcrypt.c | 42 +-----------------------------------------
1 file changed, 1 insertion(+), 41 deletions(-)
diff --git a/crypto/tcrypt.c b/crypto/tcrypt.c
index 8609174e036e..c8b0d31631f5 100644
--- a/crypto/tcrypt.c
+++ b/crypto/tcrypt.c
@@ -71,7 +71,7 @@ static const char *check[] = {
"blowfish", "twofish", "serpent", "sha384", "sha512", "md4", "aes",
"cast6", "arc4", "michael_mic", "deflate", "crc32c", "tea", "xtea",
"khazad", "wp512", "wp384", "wp256", "tnepres", "xeta", "fcrypt",
- "camellia", "seed", "salsa20", "rmd128", "rmd160", "rmd256", "rmd320",
+ "camellia", "seed", "salsa20", "rmd160",
"lzo", "lzo-rle", "cts", "sha3-224", "sha3-256", "sha3-384",
"sha3-512", "streebog256", "streebog512",
NULL
@@ -1860,22 +1860,10 @@ static int do_test(const char *alg, u32 type, u32 mask, int m, u32 num_mb)
ret += tcrypt_test("cts(cbc(aes))");
break;
- case 39:
- ret += tcrypt_test("rmd128");
- break;
-
case 40:
ret += tcrypt_test("rmd160");
break;
- case 41:
- ret += tcrypt_test("rmd256");
- break;
-
- case 42:
- ret += tcrypt_test("rmd320");
- break;
-
case 43:
ret += tcrypt_test("ecb(seed)");
break;
@@ -1948,10 +1936,6 @@ static int do_test(const char *alg, u32 type, u32 mask, int m, u32 num_mb)
ret += tcrypt_test("xcbc(aes)");
break;
- case 107:
- ret += tcrypt_test("hmac(rmd128)");
- break;
-
case 108:
ret += tcrypt_test("hmac(rmd160)");
break;
@@ -2402,22 +2386,10 @@ static int do_test(const char *alg, u32 type, u32 mask, int m, u32 num_mb)
test_hash_speed("sha224", sec, generic_hash_speed_template);
if (mode > 300 && mode < 400) break;
fallthrough;
- case 314:
- test_hash_speed("rmd128", sec, generic_hash_speed_template);
- if (mode > 300 && mode < 400) break;
- fallthrough;
case 315:
test_hash_speed("rmd160", sec, generic_hash_speed_template);
if (mode > 300 && mode < 400) break;
fallthrough;
- case 316:
- test_hash_speed("rmd256", sec, generic_hash_speed_template);
- if (mode > 300 && mode < 400) break;
- fallthrough;
- case 317:
- test_hash_speed("rmd320", sec, generic_hash_speed_template);
- if (mode > 300 && mode < 400) break;
- fallthrough;
case 318:
klen = 16;
test_hash_speed("ghash", sec, generic_hash_speed_template);
@@ -2526,22 +2498,10 @@ static int do_test(const char *alg, u32 type, u32 mask, int m, u32 num_mb)
test_ahash_speed("sha224", sec, generic_hash_speed_template);
if (mode > 400 && mode < 500) break;
fallthrough;
- case 414:
- test_ahash_speed("rmd128", sec, generic_hash_speed_template);
- if (mode > 400 && mode < 500) break;
- fallthrough;
case 415:
test_ahash_speed("rmd160", sec, generic_hash_speed_template);
if (mode > 400 && mode < 500) break;
fallthrough;
- case 416:
- test_ahash_speed("rmd256", sec, generic_hash_speed_template);
- if (mode > 400 && mode < 500) break;
- fallthrough;
- case 417:
- test_ahash_speed("rmd320", sec, generic_hash_speed_template);
- if (mode > 400 && mode < 500) break;
- fallthrough;
case 418:
test_ahash_speed("sha3-224", sec, generic_hash_speed_template);
if (mode > 400 && mode < 500) break;
--
2.30.0
2
1
change arm64 and x86 configs
openEuler inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4OFDI
CVE: NA
--------------------------------
Chao Liu (2):
change arm64 configs
change x86 configs
arch/arm64/configs/openeuler_defconfig | 287 ++++++++-----
arch/x86/configs/openeuler_defconfig | 540 +++++++++----------------
2 files changed, 374 insertions(+), 453 deletions(-)
--
2.27.0
1
2
18 Jan '22
From: Willem de Bruijn <willemb(a)google.com>
mainline inclusion
from mainline-v5.14
commit 8a0ed250f911da31a2aef52101bc707846a800ff
category: bugfix
bugzilla: NA
CVE: CVE-2021-39633
-------------------------------------------------
The GRE tunnel device can pull existing outer headers in ipgre_xmit.
This is a rare path, apparently unique to this device. The below
commit ensured that pulling does not move skb->data beyond csum_start.
But it has a false positive if ip_summed is not CHECKSUM_PARTIAL and
thus csum_start is irrelevant.
Refine to exclude this. At the same time simplify and strengthen the
test.
Simplify, by moving the check next to the offending pull, making it
more self-documenting and removing an unnecessary branch from other
code paths.
Strengthen, by also ensuring that the transport header is correct and
therefore the inner headers will be after skb_reset_inner_headers.
The transport header is set to csum_start in skb_partial_csum_set.
Link: https://lore.kernel.org/netdev/YS+h%2FtqCJJiQei+W@shredder/
Fixes: 1d011c4803c7 ("ip_gre: add validation for csum_start")
Reported-by: Ido Schimmel <idosch(a)idosch.org>
Suggested-by: Alexander Duyck <alexander.duyck(a)gmail.com>
Signed-off-by: Willem de Bruijn <willemb(a)google.com>
Reviewed-by: Alexander Duyck <alexanderduyck(a)fb.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Huang Guobin <huangguobin4(a)huawei.com>
Reviewed-by: Wei Yongjun <weiyongjun1(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
net/ipv4/ip_gre.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
index a8a37d1128201..0c431fd4b1200 100644
--- a/net/ipv4/ip_gre.c
+++ b/net/ipv4/ip_gre.c
@@ -449,8 +449,6 @@ static void __gre_xmit(struct sk_buff *skb, struct net_device *dev,
static int gre_handle_offloads(struct sk_buff *skb, bool csum)
{
- if (csum && skb_checksum_start(skb) < skb->data)
- return -EINVAL;
return iptunnel_handle_offloads(skb, csum ? SKB_GSO_GRE_CSUM : SKB_GSO_GRE);
}
@@ -682,15 +680,20 @@ static netdev_tx_t ipgre_xmit(struct sk_buff *skb,
}
if (dev->header_ops) {
+ const int pull_len = tunnel->hlen + sizeof(struct iphdr);
+
if (skb_cow_head(skb, 0))
goto free_skb;
tnl_params = (const struct iphdr *)skb->data;
+ if (pull_len > skb_transport_offset(skb))
+ goto free_skb;
+
/* Pull skb since ip_tunnel_xmit() needs skb->data pointing
* to gre header.
*/
- skb_pull(skb, tunnel->hlen + sizeof(struct iphdr));
+ skb_pull(skb, pull_len);
skb_reset_mac_header(skb);
} else {
if (skb_cow_head(skb, dev->needed_headroom))
--
2.25.1
1
0
[PATCH openEuler-5.10 01/82] clocksource/arm_arch_timer: Add build-time guards for unhandled register accesses
by Zheng Zengkai 17 Jan '22
From: Marc Zyngier <maz(a)kernel.org>
mainline inclusion
from mainline-v5.16-rc1
commit 4775bc63f880
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4QCBG
CVE: NA
--------------------------
As we are about to change the registers that are used by the driver,
start by adding build-time checks to ensure that we always handle
all registers and access modes.
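The guard works through constant propagation: the accessors are always
inlined and called with compile-time-constant reg/access values, so any
unhandled combination leaves a call to the BUILD_BUG() stub in reachable code
and the build fails. A standalone sketch of the same mechanism (userspace
analogue, not kernel code; the gcc/clang "error" attribute and optimizations
enabled are assumed):

#include <stdio.h>

/* stands in for the kernel's BUILD_BUG(): a call to this that survives
 * dead-code elimination is a compile-time error */
extern void __buildbug(void) __attribute__((error("unhandled timer register")));
#define BUILD_BUG() __buildbug()

enum arch_timer_reg { ARCH_TIMER_REG_CTRL, ARCH_TIMER_REG_TVAL, ARCH_TIMER_REG_CVAL };

static inline __attribute__((always_inline))
unsigned int reg_read(enum arch_timer_reg reg)
{
        switch (reg) {
        case ARCH_TIMER_REG_CTRL: return 1;     /* stands in for an mrc/mrs */
        case ARCH_TIMER_REG_TVAL: return 2;
        default: BUILD_BUG();                   /* caught at build time */
        }
        return 0;
}

int main(void)
{
        printf("%u\n", reg_read(ARCH_TIMER_REG_CTRL));  /* compiles */
        /* reg_read(ARCH_TIMER_REG_CVAL) here would fail the build */
        return 0;
}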
Suggested-by: Mark Rutland <mark.rutland(a)arm.com>
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Link: https://lore.kernel.org/r/20211017124225.3018098-2-maz@kernel.org
Signed-off-by: Daniel Lezcano <daniel.lezcano(a)linaro.org>
Signed-off-by: Xiongfeng Wang <wangxiongfeng2(a)huawei.com>
Reviewed-by: Hanjun Guo <guohanjun(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
arch/arm/include/asm/arch_timer.h | 12 ++++++++++++
arch/arm64/include/asm/arch_timer.h | 13 ++++++++++++-
drivers/clocksource/arm_arch_timer.c | 8 ++++++++
3 files changed, 32 insertions(+), 1 deletion(-)
diff --git a/arch/arm/include/asm/arch_timer.h b/arch/arm/include/asm/arch_timer.h
index 99175812d903..a5d27cff28fa 100644
--- a/arch/arm/include/asm/arch_timer.h
+++ b/arch/arm/include/asm/arch_timer.h
@@ -34,6 +34,8 @@ void arch_timer_reg_write_cp15(int access, enum arch_timer_reg reg, u32 val)
case ARCH_TIMER_REG_TVAL:
asm volatile("mcr p15, 0, %0, c14, c2, 0" : : "r" (val));
break;
+ default:
+ BUILD_BUG();
}
} else if (access == ARCH_TIMER_VIRT_ACCESS) {
switch (reg) {
@@ -43,7 +45,11 @@ void arch_timer_reg_write_cp15(int access, enum arch_timer_reg reg, u32 val)
case ARCH_TIMER_REG_TVAL:
asm volatile("mcr p15, 0, %0, c14, c3, 0" : : "r" (val));
break;
+ default:
+ BUILD_BUG();
}
+ } else {
+ BUILD_BUG();
}
isb();
@@ -62,6 +68,8 @@ u32 arch_timer_reg_read_cp15(int access, enum arch_timer_reg reg)
case ARCH_TIMER_REG_TVAL:
asm volatile("mrc p15, 0, %0, c14, c2, 0" : "=r" (val));
break;
+ default:
+ BUILD_BUG();
}
} else if (access == ARCH_TIMER_VIRT_ACCESS) {
switch (reg) {
@@ -71,7 +79,11 @@ u32 arch_timer_reg_read_cp15(int access, enum arch_timer_reg reg)
case ARCH_TIMER_REG_TVAL:
asm volatile("mrc p15, 0, %0, c14, c3, 0" : "=r" (val));
break;
+ default:
+ BUILD_BUG();
}
+ } else {
+ BUILD_BUG();
}
return val;
diff --git a/arch/arm64/include/asm/arch_timer.h b/arch/arm64/include/asm/arch_timer.h
index ce37c14f9417..7dfb7c6261bb 100644
--- a/arch/arm64/include/asm/arch_timer.h
+++ b/arch/arm64/include/asm/arch_timer.h
@@ -112,6 +112,8 @@ void arch_timer_reg_write_cp15(int access, enum arch_timer_reg reg, u32 val)
case ARCH_TIMER_REG_TVAL:
write_sysreg(val, cntp_tval_el0);
break;
+ default:
+ BUILD_BUG();
}
} else if (access == ARCH_TIMER_VIRT_ACCESS) {
switch (reg) {
@@ -121,7 +123,11 @@ void arch_timer_reg_write_cp15(int access, enum arch_timer_reg reg, u32 val)
case ARCH_TIMER_REG_TVAL:
write_sysreg(val, cntv_tval_el0);
break;
+ default:
+ BUILD_BUG();
}
+ } else {
+ BUILD_BUG();
}
isb();
@@ -136,6 +142,8 @@ u32 arch_timer_reg_read_cp15(int access, enum arch_timer_reg reg)
return read_sysreg(cntp_ctl_el0);
case ARCH_TIMER_REG_TVAL:
return arch_timer_reg_read_stable(cntp_tval_el0);
+ default:
+ BUILD_BUG();
}
} else if (access == ARCH_TIMER_VIRT_ACCESS) {
switch (reg) {
@@ -143,10 +151,13 @@ u32 arch_timer_reg_read_cp15(int access, enum arch_timer_reg reg)
return read_sysreg(cntv_ctl_el0);
case ARCH_TIMER_REG_TVAL:
return arch_timer_reg_read_stable(cntv_tval_el0);
+ default:
+ BUILD_BUG();
}
}
- BUG();
+ BUILD_BUG();
+ unreachable();
}
static inline u32 arch_timer_get_cntfrq(void)
diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
index 3ea1c22314f9..f3c65e5181ec 100644
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -103,6 +103,8 @@ void arch_timer_reg_write(int access, enum arch_timer_reg reg, u32 val,
case ARCH_TIMER_REG_TVAL:
writel_relaxed(val, timer->base + CNTP_TVAL);
break;
+ default:
+ BUILD_BUG();
}
} else if (access == ARCH_TIMER_MEM_VIRT_ACCESS) {
struct arch_timer *timer = to_arch_timer(clk);
@@ -113,6 +115,8 @@ void arch_timer_reg_write(int access, enum arch_timer_reg reg, u32 val,
case ARCH_TIMER_REG_TVAL:
writel_relaxed(val, timer->base + CNTV_TVAL);
break;
+ default:
+ BUILD_BUG();
}
} else {
arch_timer_reg_write_cp15(access, reg, val);
@@ -134,6 +138,8 @@ u32 arch_timer_reg_read(int access, enum arch_timer_reg reg,
case ARCH_TIMER_REG_TVAL:
val = readl_relaxed(timer->base + CNTP_TVAL);
break;
+ default:
+ BUILD_BUG();
}
} else if (access == ARCH_TIMER_MEM_VIRT_ACCESS) {
struct arch_timer *timer = to_arch_timer(clk);
@@ -144,6 +150,8 @@ u32 arch_timer_reg_read(int access, enum arch_timer_reg reg,
case ARCH_TIMER_REG_TVAL:
val = readl_relaxed(timer->base + CNTV_TVAL);
break;
+ default:
+ BUILD_BUG();
}
} else {
val = arch_timer_reg_read_cp15(access, reg);
--
2.20.1
1
81
hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4O31I?from=project-issue
CVE: NA
Add "!" in memmap parameter, use memmap(memmap=nn[KMG]!ss[KMG]) reserve memory
for pmem which register persistent memory in arm64.
Signed-off-by: Zhuling <zhuling8(a)huawei.com>
---
Documentation/admin-guide/kernel-parameters.txt | 9 +++++++--
arch/arm64/mm/init.c | 8 ++++++++
2 files changed, 15 insertions(+), 2 deletions(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 64be32b..1953570 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2829,10 +2829,15 @@
will be eaten.
memmap=nn[KMG]!ss[KMG]
- [KNL,X86] Mark specific memory as protected.
+ [KNL,X86,ARM64] Mark specific memory as protected.
Region of memory to be used, from ss to ss+nn.
- The memory region may be marked as e820 type 12 (0xc)
+ [X86]The memory region may be marked as e820 type 12 (0xc)
and is NVDIMM or ADR memory.
+ [ARM64]Reserve memory for persistent storage across kernel
+ restarts or updates. The data in PMEM will not be lost and can
+ be loaded faster.
+ Example:
+ memmap=100K!0x1a0000000
memmap=<size>%<offset>-<oldtype>+<newtype>
[KNL,ACPI] Convert memory within the specified region
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 6ebfabd..b9eb312 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -56,6 +56,8 @@
s64 memstart_addr __ro_after_init = -1;
EXPORT_SYMBOL(memstart_addr);
+phys_addr_t start_at, mem_size;
+
#ifdef CONFIG_PIN_MEMORY
struct resource pin_memory_resource = {
.name = "Pin memory",
@@ -111,6 +113,8 @@ static void __init reserve_pin_memory_res(void)
*/
phys_addr_t arm64_dma_phys_limit __ro_after_init;
+static unsigned long long pmem_size, pmem_start;
+
#ifndef CONFIG_KEXEC_CORE
static void __init reserve_crashkernel(void)
{
@@ -442,6 +446,10 @@ static int __init parse_memmap_one(char *p)
start_at = memparse(p + 1, &p);
memblock_reserve(start_at, mem_size);
memblock_mark_memmap(start_at, mem_size);
+ } else if (*p == '!') {
+ start_at = memparse(p+1, &p);
+ pmem_start = start_at;
+ pmem_size = mem_size;
} else
pr_info("Unrecognized memmap option, please check the parameter.\n");
--
2.9.5
3
7
[PATCH openEuler-1.0-LTS 1/2] hugetlbfs: extend the definition of hugepages parameter to support node allocation
by Yang Yingliang 17 Jan '22
From: Zhenguo Yao <yaozhenguo1(a)gmail.com>
mainline inclusion
from mainline-v5.16-rc1
commit b5389086ad7be0453c55e0069a89856d1fbdf605
category: feature
bugzilla: 186043
CVE: NA
--------------------------------
We can specify the number of hugepages to allocate at boot. But the
hugepages is balanced in all nodes at present. In some scenarios, we
only need hugepages in one node. For example: DPDK needs hugepages
which are in the same node as NIC.
If DPDK needs four hugepages of 1G size in node1 and the system has 16 NUMA
nodes, we must reserve 64 hugepages on the kernel cmdline. But only four
hugepages are used. The others should be free after boot. If the
system memory is low(for example: 64G), it will be an impossible task.
So extend the hugepages parameter to support specifying hugepages on a
specific node. For example add following parameter:
hugepagesz=1G hugepages=0:1,1:3
It will allocate 1 hugepage in node0 and 3 hugepages in node1.
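A quick way to verify the per-node split after boot is to read the per-node
counters from sysfs; a minimal sketch (standard sysfs layout, with the 1G page
size and the first two nodes hard-coded for illustration):

#include <stdio.h>

int main(void)
{
        char path[128];
        unsigned long n;
        FILE *f;
        int nid;

        for (nid = 0; nid < 2; nid++) {
                snprintf(path, sizeof(path),
                         "/sys/devices/system/node/node%d/hugepages/"
                         "hugepages-1048576kB/nr_hugepages", nid);
                f = fopen(path, "r");
                if (!f)
                        continue;
                if (fscanf(f, "%lu", &n) == 1)
                        printf("node%d: %lu x 1G hugepages\n", nid, n);
                fclose(f);
        }
        return 0;
}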
Link: https://lkml.kernel.org/r/20211005054729.86457-1-yaozhenguo1@gmail.com
Signed-off-by: Zhenguo Yao <yaozhenguo1(a)gmail.com>
Reviewed-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Zhenguo Yao <yaozhenguo1(a)gmail.com>
Cc: Dan Carpenter <dan.carpenter(a)oracle.com>
Cc: Nathan Chancellor <nathan(a)kernel.org>
Cc: Michael Ellerman <mpe(a)ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh(a)kernel.crashing.org>
Cc: Paul Mackerras <paulus(a)samba.org>
Cc: Jonathan Corbet <corbet(a)lwn.net>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Conflicts:
Documentation/admin-guide/kernel-parameters.txt
Documentation/admin-guide/mm/hugetlbpage.rst
arch/powerpc/mm/hugetlbpage.c
include/linux/hugetlb.h
mm/hugetlb.c
Signed-off-by: Liu Shixin <liushixin2(a)huawei.com>
Reviewed-by: Kefeng Wang<wangkefeng.wang(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
.../admin-guide/kernel-parameters.txt | 5 +
arch/powerpc/mm/hugetlbpage.c | 9 +-
include/linux/hugetlb.h | 6 +-
mm/hugetlb.c | 142 +++++++++++++++---
4 files changed, 137 insertions(+), 25 deletions(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 420f7317aa6af..5623a9d1cf813 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1402,6 +1402,11 @@
registers. Default set by CONFIG_HPET_MMAP_DEFAULT.
hugepages= [HW,X86-32,IA-64] HugeTLB pages to allocate at boot.
+ If using node format, the number of pages to allocate
+ per-node can be specified.
+ Format: <integer> or (node format)
+ <node>:<integer>[,<node>:<integer>]
+
hugepagesz= [HW,IA-64,PPC,X86-64] The size of the HugeTLB pages.
On x86-64 and powerpc, this option can be specified
multiple times interleaved with hugepages= to reserve
diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index cef0b7ee10246..163b0ef7d156e 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -241,17 +241,22 @@ int __init pseries_alloc_bootmem_huge_page(struct hstate *hstate)
m->hstate = hstate;
return 1;
}
+
+bool __init hugetlb_node_alloc_supported(void)
+{
+ return false;
+}
#endif
-int __init alloc_bootmem_huge_page(struct hstate *h)
+int __init alloc_bootmem_huge_page(struct hstate *h, int nid)
{
#ifdef CONFIG_PPC_BOOK3S_64
if (firmware_has_feature(FW_FEATURE_LPAR) && !radix_enabled())
return pseries_alloc_bootmem_huge_page(h);
#endif
- return __alloc_bootmem_huge_page(h);
+ return __alloc_bootmem_huge_page(h, nid);
}
#if defined(CONFIG_PPC_FSL_BOOK3E) || defined(CONFIG_PPC_8xx)
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 561e7679fadc4..588f9f2a44fce 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -347,6 +347,7 @@ struct hstate {
unsigned long nr_overcommit_huge_pages;
struct list_head hugepage_activelist;
struct list_head hugepage_freelists[MAX_NUMNODES];
+ unsigned int max_huge_pages_node[MAX_NUMNODES];
unsigned int nr_huge_pages_node[MAX_NUMNODES];
unsigned int free_huge_pages_node[MAX_NUMNODES];
unsigned int surplus_huge_pages_node[MAX_NUMNODES];
@@ -409,8 +410,9 @@ int hugetlb_insert_hugepage(struct vm_area_struct *vma, unsigned long addr,
struct page *hpage, pgprot_t prot);
/* arch callback */
-int __init __alloc_bootmem_huge_page(struct hstate *h);
-int __init alloc_bootmem_huge_page(struct hstate *h);
+int __init __alloc_bootmem_huge_page(struct hstate *h, int nid);
+int __init alloc_bootmem_huge_page(struct hstate *h, int nid);
+bool __init hugetlb_node_alloc_supported(void);
void __init hugetlb_bad_size(void);
void __init hugetlb_add_hstate(unsigned order);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index b2ed9174bfd73..8286a6f8cb050 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -56,6 +56,7 @@ static struct hstate * __initdata parsed_hstate;
static unsigned long __initdata default_hstate_max_huge_pages;
static unsigned long __initdata default_hstate_size;
static bool __initdata parsed_valid_hugepagesz = true;
+static unsigned int default_hugepages_in_node[MAX_NUMNODES] __initdata;
int enable_charge_mighp __read_mostly;
/*
@@ -2241,33 +2242,39 @@ struct page *alloc_huge_page(struct vm_area_struct *vma,
return ERR_PTR(-ENOSPC);
}
-int alloc_bootmem_huge_page(struct hstate *h)
+int alloc_bootmem_huge_page(struct hstate *h, int nid)
__attribute__ ((weak, alias("__alloc_bootmem_huge_page")));
-int __alloc_bootmem_huge_page(struct hstate *h)
+int __alloc_bootmem_huge_page(struct hstate *h, int nid)
{
- struct huge_bootmem_page *m;
+ struct huge_bootmem_page *m = NULL; /* initialize for clang */
int nr_nodes, node;
+ if (nid >= nr_online_nodes)
+ return 0;
+ /* do node specific alloc */
+ if (nid != NUMA_NO_NODE) {
+ m = memblock_virt_alloc_try_nid_raw(huge_page_size(h), huge_page_size(h),
+ 0, MEMBLOCK_ALLOC_ACCESSIBLE, nid);
+ if (!m)
+ return 0;
+ goto found;
+ }
+ /* allocate from next node when distributing huge pages */
for_each_node_mask_to_alloc(h, nr_nodes, node, &node_states[N_MEMORY]) {
- void *addr;
-
- addr = memblock_virt_alloc_try_nid_raw(
+ m = memblock_virt_alloc_try_nid_raw(
huge_page_size(h), huge_page_size(h),
0, BOOTMEM_ALLOC_ACCESSIBLE, node);
- if (addr) {
- /*
- * Use the beginning of the huge page to store the
- * huge_bootmem_page struct (until gather_bootmem
- * puts them into the mem_map).
- */
- m = addr;
- goto found;
- }
+ /*
+ * Use the beginning of the huge page to store the
+ * huge_bootmem_page struct (until gather_bootmem
+ * puts them into the mem_map).
+ */
+ if (!m)
+ return 0;
+ goto found;
}
- return 0;
found:
- BUG_ON(!IS_ALIGNED(virt_to_phys(m), huge_page_size(h)));
/* Put them into a private list first because mem_map is not up yet */
INIT_LIST_HEAD(&m->list);
list_add(&m->list, &huge_boot_pages);
@@ -2310,12 +2317,55 @@ static void __init gather_bootmem_prealloc(void)
cond_resched();
}
}
+static void __init hugetlb_hstate_alloc_pages_onenode(struct hstate *h, int nid)
+{
+ unsigned long i;
+ char buf[32];
+
+ for (i = 0; i < h->max_huge_pages_node[nid]; ++i) {
+ if (hstate_is_gigantic(h)) {
+ if (!alloc_bootmem_huge_page(h, nid))
+ break;
+ } else {
+ struct page *page;
+ gfp_t gfp_mask = htlb_alloc_mask(h) | __GFP_THISNODE;
+
+ page = alloc_fresh_huge_page(h, gfp_mask, nid,
+ &node_states[N_MEMORY], NULL);
+ if (!page)
+ break;
+ put_page(page); /* free it into the hugepage allocator */
+ }
+ cond_resched();
+ }
+ if (i == h->max_huge_pages_node[nid])
+ return;
+
+ string_get_size(huge_page_size(h), 1, STRING_UNITS_2, buf, 32);
+ pr_warn("HugeTLB: allocating %u of page size %s failed node%d. Only allocated %lu hugepages.\n",
+ h->max_huge_pages_node[nid], buf, nid, i);
+ h->max_huge_pages -= (h->max_huge_pages_node[nid] - i);
+ h->max_huge_pages_node[nid] = i;
+}
static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
{
unsigned long i;
nodemask_t *node_alloc_noretry;
+ bool node_specific_alloc = false;
+
+ /* do node specific alloc */
+ for (i = 0; i < nr_online_nodes; i++) {
+ if (h->max_huge_pages_node[i] > 0) {
+ hugetlb_hstate_alloc_pages_onenode(h, i);
+ node_specific_alloc = true;
+ }
+ }
+
+ if (node_specific_alloc)
+ return;
+ /* below will do all node balanced alloc */
if (!hstate_is_gigantic(h)) {
/*
* Bit mask controlling how hard we retry per-node allocations.
@@ -2336,7 +2386,7 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
for (i = 0; i < h->max_huge_pages; ++i) {
if (hstate_is_gigantic(h)) {
- if (!alloc_bootmem_huge_page(h))
+ if (!alloc_bootmem_huge_page(h, NUMA_NO_NODE))
break;
} else if (!alloc_pool_huge_page(h,
&node_states[N_MEMORY],
@@ -2991,8 +3041,13 @@ static int __init hugetlb_init(void)
}
default_hstate_idx = hstate_index(size_to_hstate(default_hstate_size));
if (default_hstate_max_huge_pages) {
- if (!default_hstate.max_huge_pages)
+ if (!default_hstate.max_huge_pages) {
default_hstate.max_huge_pages = default_hstate_max_huge_pages;
+
+ for (i = 0; i < nr_online_nodes; i++)
+ default_hstate.max_huge_pages_node[i] =
+ default_hugepages_in_node[i];
+ }
}
hugetlb_init_hstates();
@@ -3052,10 +3107,19 @@ void __init hugetlb_add_hstate(unsigned int order)
parsed_hstate = h;
}
+bool __init __weak hugetlb_node_alloc_supported(void)
+{
+ return true;
+}
+
static int __init hugetlb_nrpages_setup(char *s)
{
unsigned long *mhp;
static unsigned long *last_mhp;
+ int node = NUMA_NO_NODE;
+ int count;
+ unsigned long tmp;
+ char *p = s;
if (!parsed_valid_hugepagesz) {
pr_warn("hugepages = %s preceded by "
@@ -3077,8 +3141,40 @@ static int __init hugetlb_nrpages_setup(char *s)
return 1;
}
- if (sscanf(s, "%lu", mhp) <= 0)
- *mhp = 0;
+ while (*p) {
+ count = 0;
+ if (sscanf(p, "%lu%n", &tmp, &count) != 1)
+ goto invalid;
+ /* Parameter is node format */
+ if (p[count] == ':') {
+ if (!hugetlb_node_alloc_supported()) {
+ pr_warn("HugeTLB: architecture can't support node specific alloc, ignoring!\n");
+ return 0;
+ }
+ node = tmp;
+ p += count + 1;
+ if (node < 0 || node >= nr_online_nodes)
+ goto invalid;
+ /* Parse hugepages */
+ if (sscanf(p, "%lu%n", &tmp, &count) != 1)
+ goto invalid;
+ if (!hugetlb_max_hstate)
+ default_hugepages_in_node[node] = tmp;
+ else
+ parsed_hstate->max_huge_pages_node[node] = tmp;
+ *mhp += tmp;
+ /* Go to parse next node*/
+ if (p[count] == ',')
+ p += count + 1;
+ else
+ break;
+ } else {
+ if (p != s)
+ goto invalid;
+ *mhp = tmp;
+ break;
+ }
+ }
/*
* Global state is always initialized later in hugetlb_init.
@@ -3091,6 +3187,10 @@ static int __init hugetlb_nrpages_setup(char *s)
last_mhp = mhp;
return 1;
+
+invalid:
+ pr_warn("HugeTLB: Invalid hugepages parameter %s\n", p);
+ return 0;
}
__setup("hugepages=", hugetlb_nrpages_setup);
--
2.25.1
1
1
Hello, when will the RDMA topic start?
----------
Sent from a mobile device
--------------Original message--------------
From: "tangjiayuan" <tangjiayuan(a)huawei.com>;
Sent: Monday, 17 January 2022, 3:10 PM
To: "kernel(a)openeuler.org" <kernel(a)openeuler.org>; "kernel-discuss(a)openeuler.org" <kernel-discuss(a)openeuler.org>;
Cc: "xie xiuqi" <xiexiuqi(a)huawei.com>; "chengjian (D)" <cj.chengjian(a)huawei.com>; "qiulaibin" <qiulaibin(a)huawei.com>;
Subject: Re: openEuler kernel tech-sharing session 17 & biweekly meeting
-----------------------------------
Kernel SIG has one more topic: we hope to support RDMA communication in kernel mode.
-----Original Message-----
From: openEuler conference [mailto:public@openeuler.org]
Sent: 27 December 2021, 10:20
To: kernel(a)openeuler.org; tc(a)openeuler.org; dev(a)openeuler.org; kernel-discuss(a)openeuler.org
Subject: openEuler kernel tech-sharing session 17 & biweekly meeting
Hello!
Kernel SIG invites you to attend the ZOOM meeting (auto-recorded) to be held at 2021-12-31 14:00.
Meeting subject: openEuler kernel tech-sharing session 17 & biweekly meeting
Agenda:
14:10-15:30 openEuler kernel tech-sharing session 17 - introduction to cgroups
Meeting link: https://us06web.zoom.us/j/83118407147?pwd=elVzMVhGc2YzaDQ1TGorN2tPa1NhZz09
Reminder: you are advised to change your participant name after joining the meeting; you may also use your gitee.com ID.
More information: https://openeuler.org/zh/
Hello!
openEuler Kernel SIG invites you to attend the ZOOM conference(auto recording) will be held at 2021-12-31 14:00,
The subject of the conference is openEuler kernel tech-sharing session 17 & biweekly meeting,
Summary:
14:10-15:30 openEuler kernel tech-sharing session 17 - introduction to cgroups
You can join the meeting at https://us06web.zoom.us/j/83118407147?pwd=elVzMVhGc2YzaDQ1TGorN2tPa1NhZz09.
Note: You are advised to change the participant name after joining the conference or use your ID at gitee.com.
More information: https://openeuler.org/en/
_______________________________________________
Kernel-discuss mailing list -- kernel-discuss(a)openeuler.org To unsubscribe send an email to kernel-discuss-leave(a)openeuler.org
_______________________________________________
Kernel mailing list -- kernel(a)openeuler.org
To unsubscribe send an email to kernel-leave(a)openeuler.org
1
0
[PATCH openEuler-1.0-LTS 1/2] share pool: use rwsem to protect sp group exit
by Yang Yingliang 17 Jan '22
From: Guo Mengqi <guomengqi3(a)huawei.com>
ascend inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4ODMN
CVE: NA
-------------------------------------------------
Fix the following situation: the last process in a group exits while a
second process tries to add itself to that group. The second process may
get an invalid spg, yet the group's use_count is still increased by 1,
which causes the first process to fail to free the group when it exits.
The second process then calls sp_group_drop --> free_sp_group, which
requests the rwsem a second time.
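The fix below follows the usual locked/unlocked split: the exit path already
holds sp_group_sem for writing, so the final put must call a *_locked teardown
rather than free_sp_group(), which takes the sem itself. A standalone sketch
of the pattern (userspace analogue with a pthread rwlock; all names
hypothetical):

#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

static pthread_rwlock_t grp_sem = PTHREAD_RWLOCK_INITIALIZER;

struct group { atomic_int use_count; };

/* caller already holds grp_sem for writing */
static void free_group_locked(struct group *g)
{
        /* teardown that must run under the lock (idr_remove in the patch) */
        free(g);
}

static void free_group(struct group *g)         /* caller holds nothing */
{
        pthread_rwlock_wrlock(&grp_sem);
        free_group_locked(g);
        pthread_rwlock_unlock(&grp_sem);
}

/* exit path: walks the group list under the write lock, so the final put
 * must use the _locked variant -- calling free_group() here would deadlock */
static void exit_drop(struct group *g)
{
        pthread_rwlock_wrlock(&grp_sem);
        if (atomic_fetch_sub(&g->use_count, 1) == 1)
                free_group_locked(g);
        pthread_rwlock_unlock(&grp_sem);
}

int main(void)
{
        struct group *g = calloc(1, sizeof(*g));

        if (!g)
                return 1;
        atomic_store(&g->use_count, 2);
        exit_drop(g);                           /* not the last reference */
        if (atomic_fetch_sub(&g->use_count, 1) == 1)
                free_group(g);                  /* last put outside the lock */
        return 0;
}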
Signed-off-by: Guo Mengqi <guomengqi3(a)huawei.com>
Reviewed-by: Weilong Chen <chenweilong(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
mm/share_pool.c | 18 ++++++++++++------
1 file changed, 12 insertions(+), 6 deletions(-)
diff --git a/mm/share_pool.c b/mm/share_pool.c
index 428742f5d4044..c79ea4bb06026 100644
--- a/mm/share_pool.c
+++ b/mm/share_pool.c
@@ -711,20 +711,25 @@ static void free_new_spg_id(bool new, int spg_id)
free_sp_group_id(spg_id);
}
-static void free_sp_group(struct sp_group *spg)
+static void free_sp_group_locked(struct sp_group *spg)
{
fput(spg->file);
fput(spg->file_hugetlb);
free_spg_stat(spg->id);
- down_write(&sp_group_sem);
idr_remove(&sp_group_idr, spg->id);
- up_write(&sp_group_sem);
free_sp_group_id((unsigned int)spg->id);
kfree(spg);
system_group_count--;
WARN(system_group_count < 0, "unexpected group count\n");
}
+static void free_sp_group(struct sp_group *spg)
+{
+ down_write(&sp_group_sem);
+ free_sp_group_locked(spg);
+ up_write(&sp_group_sem);
+}
+
static void sp_group_drop(struct sp_group *spg)
{
if (atomic_dec_and_test(&spg->use_count))
@@ -4460,14 +4465,15 @@ void sp_group_post_exit(struct mm_struct *mm)
sp_proc_stat_drop(stat);
}
- /* lockless traverse */
+ down_write(&sp_group_sem);
list_for_each_entry_safe(spg_node, tmp, &master->node_list, group_node) {
spg = spg_node->spg;
/* match with refcount inc in sp_group_add_task */
- sp_group_drop(spg);
+ if (atomic_dec_and_test(&spg->use_count))
+ free_sp_group_locked(spg);
kfree(spg_node);
}
-
+ up_write(&sp_group_sem);
kfree(master);
}
--
2.25.1
1
1
[PATCH openEuler-1.0-LTS 1/9] Revert "svm: Add svm_set_user_mpam_en to enable/disable mpam for smmu"
by Yang Yingliang 17 Jan '22
From: Xingang Wang <wangxingang5(a)huawei.com>
ascend inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I49RB2
CVE: NA
-------------------------------------------------
This reverts commit c50ad40d63ef0cc9e0b158354af08f1f98419273.
The commit "svm: Add svm_set_user_mpam_en to enable/disable mpam for smmu"
added and exported an interface in the svm module, making mpam depend on
the svm module; revert it to avoid the coupling.
Signed-off-by: Xingang Wang <wangxingang5(a)huawei.com>
Reviewed-by: Weilong Chen <chenweilong(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/char/svm.c | 64 -------------------------------------
include/linux/ascend_smmu.h | 3 --
2 files changed, 67 deletions(-)
diff --git a/drivers/char/svm.c b/drivers/char/svm.c
index c877c75fdcd5f..000b9ad01c350 100644
--- a/drivers/char/svm.c
+++ b/drivers/char/svm.c
@@ -135,14 +135,11 @@ struct meminfo {
struct svm_mpam {
#define SVM_GET_DEV_MPAM (1 << 0)
#define SVM_SET_DEV_MPAM (1 << 1)
-#define SVM_GET_USER_MPAM_EN (1 << 2)
-#define SVM_SET_USER_MPAM_EN (1 << 3)
int flags;
int pasid;
int partid;
int pmg;
int s1mpam;
- int user_mpam_en;
};
struct phymeminfo {
@@ -1505,14 +1502,6 @@ static int svm_get_core_mpam(struct device *dev, void *data)
}
}
- if (mpam->flags & SVM_GET_USER_MPAM_EN) {
- err = arm_smmu_get_dev_user_mpam_en(dev, &mpam->user_mpam_en);
- if (err) {
- dev_err(dev, "set user_mpam_en failed, %d\n", err);
- return err;
- }
- }
-
return err;
}
@@ -1554,14 +1543,6 @@ static int svm_set_core_mpam(struct device *dev, void *data)
}
}
- if (mpam->flags & SVM_SET_USER_MPAM_EN) {
- err = arm_smmu_set_dev_user_mpam_en(dev, mpam->user_mpam_en);
- if (err) {
- dev_err(dev, "set user_mpam_en failed, %d\n", err);
- return err;
- }
- }
-
return 0;
}
@@ -1655,51 +1636,6 @@ int svm_get_mpam(int pasid, int *partid, int *pmg, int *s1mpam)
}
EXPORT_SYMBOL_GPL(svm_get_mpam);
-/**
- * svm_set_user_mpam_en() - set user_mpam_en
- * @user_mpam_en: 0 for smmu mpam, 1 for user mpam
- */
-int svm_set_user_mpam_en(int user_mpam_en)
-{
- int err;
- struct svm_mpam mpam, old_mpam;
-
- old_mpam.flags = SVM_GET_USER_MPAM_EN;
- err = __svm_get_mpam(&old_mpam);
-
- mpam.flags = SVM_SET_USER_MPAM_EN,
- mpam.user_mpam_en = user_mpam_en,
- err = __svm_set_mpam(&mpam);
- if (err)
- goto rollback;
-
- return 0;
-
-rollback:
- __svm_set_mpam(&mpam);
- return err;
-}
-EXPORT_SYMBOL_GPL(svm_set_user_mpam_en);
-
-/**
- * svm_set_user_mpam_en() - set user_mpam_en
- * @user_mpam_en: pointer to user_mpam_en
- */
-int svm_get_user_mpam_en(int *user_mpam_en)
-{
- int err;
- struct svm_mpam mpam;
-
- mpam.flags = SVM_GET_USER_MPAM_EN;
- err = __svm_get_mpam(&mpam);
- if (err)
- return err;
-
- *user_mpam_en = mpam.user_mpam_en;
- return 0;
-}
-EXPORT_SYMBOL_GPL(svm_get_user_mpam_en);
-
static int svm_set_rc(unsigned long __user *arg)
{
unsigned long addr, size, rc;
diff --git a/include/linux/ascend_smmu.h b/include/linux/ascend_smmu.h
index 32241a02a8a89..e570acfc8f6ac 100644
--- a/include/linux/ascend_smmu.h
+++ b/include/linux/ascend_smmu.h
@@ -15,7 +15,4 @@ extern int arm_smmu_get_dev_user_mpam_en(struct device *dev, int *user_mpam_en);
extern int svm_get_mpam(int pasid, int *partid, int *pmg, int *s1mpam);
extern int svm_set_mpam(int pasid, int partid, int pmg, int s1mpam);
-extern int svm_set_user_mpam_en(int user_mpam_en);
-extern int svm_get_user_mpam_en(int *user_mpam_en);
-
#endif /* __LINUX_ASCEND_SMMU_H */
--
2.25.1
1
8
[PATCH openEuler-1.0-LTS 1/3] Revert "svm: Add svm_set_user_mpam_en to enable/disable mpam for smmu"
by Yang Yingliang 17 Jan '22
From: Xingang Wang <wangxingang5(a)huawei.com>
ascend inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I49RB2
CVE: NA
-------------------------------------------------
This reverts commit c50ad40d63ef0cc9e0b158354af08f1f98419273.
The commit "svm: Add svm_set_user_mpam_en to enable/disable mpam for smmu"
added and exported an interface in the svm module, making mpam depend on
the svm module; revert it to avoid the coupling.
Signed-off-by: Xingang Wang <wangxingang5(a)huawei.com>
Reviewed-by: Weilong Chen <chenweilong(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/char/svm.c | 64 -------------------------------------
include/linux/ascend_smmu.h | 3 --
2 files changed, 67 deletions(-)
diff --git a/drivers/char/svm.c b/drivers/char/svm.c
index c877c75fdcd5f..000b9ad01c350 100644
--- a/drivers/char/svm.c
+++ b/drivers/char/svm.c
@@ -135,14 +135,11 @@ struct meminfo {
struct svm_mpam {
#define SVM_GET_DEV_MPAM (1 << 0)
#define SVM_SET_DEV_MPAM (1 << 1)
-#define SVM_GET_USER_MPAM_EN (1 << 2)
-#define SVM_SET_USER_MPAM_EN (1 << 3)
int flags;
int pasid;
int partid;
int pmg;
int s1mpam;
- int user_mpam_en;
};
struct phymeminfo {
@@ -1505,14 +1502,6 @@ static int svm_get_core_mpam(struct device *dev, void *data)
}
}
- if (mpam->flags & SVM_GET_USER_MPAM_EN) {
- err = arm_smmu_get_dev_user_mpam_en(dev, &mpam->user_mpam_en);
- if (err) {
- dev_err(dev, "set user_mpam_en failed, %d\n", err);
- return err;
- }
- }
-
return err;
}
@@ -1554,14 +1543,6 @@ static int svm_set_core_mpam(struct device *dev, void *data)
}
}
- if (mpam->flags & SVM_SET_USER_MPAM_EN) {
- err = arm_smmu_set_dev_user_mpam_en(dev, mpam->user_mpam_en);
- if (err) {
- dev_err(dev, "set user_mpam_en failed, %d\n", err);
- return err;
- }
- }
-
return 0;
}
@@ -1655,51 +1636,6 @@ int svm_get_mpam(int pasid, int *partid, int *pmg, int *s1mpam)
}
EXPORT_SYMBOL_GPL(svm_get_mpam);
-/**
- * svm_set_user_mpam_en() - set user_mpam_en
- * @user_mpam_en: 0 for smmu mpam, 1 for user mpam
- */
-int svm_set_user_mpam_en(int user_mpam_en)
-{
- int err;
- struct svm_mpam mpam, old_mpam;
-
- old_mpam.flags = SVM_GET_USER_MPAM_EN;
- err = __svm_get_mpam(&old_mpam);
-
- mpam.flags = SVM_SET_USER_MPAM_EN,
- mpam.user_mpam_en = user_mpam_en,
- err = __svm_set_mpam(&mpam);
- if (err)
- goto rollback;
-
- return 0;
-
-rollback:
- __svm_set_mpam(&mpam);
- return err;
-}
-EXPORT_SYMBOL_GPL(svm_set_user_mpam_en);
-
-/**
- * svm_set_user_mpam_en() - set user_mpam_en
- * @user_mpam_en: pointer to user_mpam_en
- */
-int svm_get_user_mpam_en(int *user_mpam_en)
-{
- int err;
- struct svm_mpam mpam;
-
- mpam.flags = SVM_GET_USER_MPAM_EN;
- err = __svm_get_mpam(&mpam);
- if (err)
- return err;
-
- *user_mpam_en = mpam.user_mpam_en;
- return 0;
-}
-EXPORT_SYMBOL_GPL(svm_get_user_mpam_en);
-
static int svm_set_rc(unsigned long __user *arg)
{
unsigned long addr, size, rc;
diff --git a/include/linux/ascend_smmu.h b/include/linux/ascend_smmu.h
index 32241a02a8a89..e570acfc8f6ac 100644
--- a/include/linux/ascend_smmu.h
+++ b/include/linux/ascend_smmu.h
@@ -15,7 +15,4 @@ extern int arm_smmu_get_dev_user_mpam_en(struct device *dev, int *user_mpam_en);
extern int svm_get_mpam(int pasid, int *partid, int *pmg, int *s1mpam);
extern int svm_set_mpam(int pasid, int partid, int pmg, int s1mpam);
-extern int svm_set_user_mpam_en(int user_mpam_en);
-extern int svm_get_user_mpam_en(int *user_mpam_en);
-
#endif /* __LINUX_ASCEND_SMMU_H */
--
2.25.1
1
2
[PATCH openEuler-1.0-LTS 1/7] cgroup: Use open-time credentials for process migraton perm checks
by Yang Yingliang 17 Jan '22
From: Tejun Heo <tj(a)kernel.org>
mainline inclusion
from mainline-v5.16-9
commit 1756d7994ad85c2479af6ae5a9750b92324685af
category: bugfix
bugzilla: NA
CVE: CVE-2021-4197
---------------------------------------------------
cgroup process migration permission checks are performed at write time as
whether a given operation is allowed or not is dependent on the content of
the write - the PID. This currently uses current's credentials which is a
potential security weakness as it may allow scenarios where a less
privileged process tricks a more privileged one into writing into a fd that
it created.
This patch makes both cgroup2 and cgroup1 process migration interfaces
use the credentials saved at the time of open (file->f_cred) instead of
current's.
Reported-by: "Eric W. Biederman" <ebiederm(a)xmission.com>
Suggested-by: Linus Torvalds <torvalds(a)linuxfoundation.org>
Fixes: 187fe84067bd ("cgroup: require write perm on common ancestor when moving processes on the default hierarchy")
Reviewed-by: Michal Koutný <mkoutny(a)suse.com>
Signed-off-by: Tejun Heo <tj(a)kernel.org>
Conflicts:
kernel/cgroup/cgroup.c
Signed-off-by: Lu Jialin <lujialin4(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
kernel/cgroup/cgroup-v1.c | 7 ++++---
kernel/cgroup/cgroup.c | 19 ++++++++++++++++++-
2 files changed, 22 insertions(+), 4 deletions(-)
diff --git a/kernel/cgroup/cgroup-v1.c b/kernel/cgroup/cgroup-v1.c
index 055063c9c2a38..7a3ea4d6b54cf 100644
--- a/kernel/cgroup/cgroup-v1.c
+++ b/kernel/cgroup/cgroup-v1.c
@@ -535,10 +535,11 @@ static ssize_t __cgroup1_procs_write(struct kernfs_open_file *of,
goto out_unlock;
/*
- * Even if we're attaching all tasks in the thread group, we only
- * need to check permissions on one of them.
+ * Even if we're attaching all tasks in the thread group, we only need
+ * to check permissions on one of them. Check permissions using the
+ * credentials from file open to protect against inherited fd attacks.
*/
- cred = current_cred();
+ cred = of->file->f_cred;
tcred = get_task_cred(task);
if (!uid_eq(cred->euid, GLOBAL_ROOT_UID) &&
!uid_eq(cred->euid, tcred->uid) &&
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 6f4dcdd6f77b2..49e8bf7ee9c73 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -4490,6 +4490,7 @@ static ssize_t cgroup_procs_write(struct kernfs_open_file *of,
{
struct cgroup *src_cgrp, *dst_cgrp;
struct task_struct *task;
+ const struct cred *saved_cred;
ssize_t ret;
dst_cgrp = cgroup_kn_lock_live(of->kn, false);
@@ -4506,8 +4507,16 @@ static ssize_t cgroup_procs_write(struct kernfs_open_file *of,
src_cgrp = task_cgroup_from_root(task, &cgrp_dfl_root);
spin_unlock_irq(&css_set_lock);
+ /*
+ * Process and thread migrations follow same delegation rule. Check
+ * permissions using the credentials from file open to protect against
+ * inherited fd attacks.
+ */
+
+ saved_cred = override_creds(of->file->f_cred);
ret = cgroup_procs_write_permission(src_cgrp, dst_cgrp,
of->file->f_path.dentry->d_sb);
+ revert_creds(saved_cred);
if (ret)
goto out_finish;
@@ -4531,6 +4540,7 @@ static ssize_t cgroup_threads_write(struct kernfs_open_file *of,
{
struct cgroup *src_cgrp, *dst_cgrp;
struct task_struct *task;
+ const struct cred *saved_cred;
ssize_t ret;
buf = strstrip(buf);
@@ -4549,9 +4559,16 @@ static ssize_t cgroup_threads_write(struct kernfs_open_file *of,
src_cgrp = task_cgroup_from_root(task, &cgrp_dfl_root);
spin_unlock_irq(&css_set_lock);
- /* thread migrations follow the cgroup.procs delegation rule */
+ /*
+ * Process and thread migrations follow same delegation rule. Check
+ * permissions using the credentials from file open to protect against
+ * inherited fd attacks.
+ */
+
+ saved_cred = override_creds(of->file->f_cred);
ret = cgroup_procs_write_permission(src_cgrp, dst_cgrp,
of->file->f_path.dentry->d_sb);
+ revert_creds(saved_cred);
if (ret)
goto out_finish;
--
2.25.1
1
6
[PATCH openEuler-5.10 1/2] x86: hugepage: use nt copy hugepage to AEP in x86
by Zheng Zengkai 15 Jan '22
From: Kemeng Shi <shikemeng(a)huawei.com>
euleros inclusion
category: feature
feature: etmem
bugzilla: https://gitee.com/openeuler/kernel/issues/I4OODH?from=project-issue
CVE: NA
-------------------------------------------------
Add a /proc/sys/vm/hugepage_nocache_copy switch. Set 1 to copy hugepages
with the movnt SSE instruction if the CPU supports it. Set 0 to copy hugepages
as usual.
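The switch is flipped by writing 0 or 1 to /proc/sys/vm/hugepage_nocache_copy.
For the copy itself, the movnti loop added below bypasses the cache on the way
to pmem; a userspace sketch of the same idea using SSE2 intrinsics (16-byte
aligned buffers assumed, names made up):

#include <emmintrin.h>
#include <stddef.h>
#include <stdlib.h>

/* non-temporal copy: _mm_stream_si128 is the intrinsic form of movnt*,
 * and _mm_sfence is the final ordering barrier, as in the asm below */
static void copy_nt(void *dst, const void *src, size_t bytes)
{
        __m128i *d = dst;
        const __m128i *s = src;
        size_t i;

        for (i = 0; i < bytes / sizeof(__m128i); i++)
                _mm_stream_si128(&d[i], _mm_load_si128(&s[i]));
        _mm_sfence();   /* order the NT stores before any later access */
}

int main(void)
{
        void *a = aligned_alloc(16, 4096), *b = aligned_alloc(16, 4096);

        if (a && b)
                copy_nt(b, a, 4096);
        return 0;
}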
Signed-off-by: Kemeng Shi <shikemeng(a)huawei.com>
Reviewed-by: louhongxiang <louhongxiang(a)huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
arch/x86/include/asm/page_64.h | 6 ++
arch/x86/lib/Makefile | 1 +
arch/x86/lib/copy_highpages.c | 105 +++++++++++++++++++++++++++++++++
arch/x86/lib/copy_page_64.S | 73 +++++++++++++++++++++++
include/linux/highmem.h | 14 +++++
mm/migrate.c | 6 +-
6 files changed, 200 insertions(+), 5 deletions(-)
create mode 100644 arch/x86/lib/copy_highpages.c
diff --git a/arch/x86/include/asm/page_64.h b/arch/x86/include/asm/page_64.h
index 939b1cff4a7b..d9fbc486728a 100644
--- a/arch/x86/include/asm/page_64.h
+++ b/arch/x86/include/asm/page_64.h
@@ -56,6 +56,12 @@ static inline void clear_page(void *page)
void copy_page(void *to, void *from);
+void copy_page_nocache(void *to, void *from);
+void copy_page_nocache_barrir(void);
+
+struct page;
+#define __HAVE_ARCH_COPY_HUGEPAGES 1
+void copy_highpages(struct page *to, struct page *from, int nr_pages);
#endif /* !__ASSEMBLY__ */
#ifdef CONFIG_X86_VSYSCALL_EMULATION
diff --git a/arch/x86/lib/Makefile b/arch/x86/lib/Makefile
index bad4dee4f0e4..ab0d91b808be 100644
--- a/arch/x86/lib/Makefile
+++ b/arch/x86/lib/Makefile
@@ -70,4 +70,5 @@ else
lib-y += memmove_64.o memset_64.o
lib-y += copy_user_64.o
lib-y += cmpxchg16b_emu.o
+ lib-y += copy_highpages.o
endif
diff --git a/arch/x86/lib/copy_highpages.c b/arch/x86/lib/copy_highpages.c
new file mode 100644
index 000000000000..ed95b8a7b906
--- /dev/null
+++ b/arch/x86/lib/copy_highpages.c
@@ -0,0 +1,105 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * accelerate copying pages to pmem with non-temporal stores
+ */
+#include <linux/sched.h>
+#include <linux/mmzone.h>
+#include <linux/highmem.h>
+#include <linux/sysctl.h>
+
+DEFINE_STATIC_KEY_FALSE(hugepage_nocache_copy);
+#ifdef CONFIG_SYSCTL
+static void set_hugepage_nocache_copy(bool enabled)
+{
+ if (enabled)
+ static_branch_enable(&hugepage_nocache_copy);
+ else
+ static_branch_disable(&hugepage_nocache_copy);
+}
+
+int sysctl_hugepage_nocache_copy(struct ctl_table *table, int write,
+ void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+ struct ctl_table t;
+ int err;
+ int state;
+
+ if (write && !capable(CAP_SYS_ADMIN))
+ return -EPERM;
+
+ state = static_branch_unlikely(&hugepage_nocache_copy);
+ t = *table;
+ t.data = &state;
+ err = proc_dointvec_minmax(&t, write, buffer, lenp, ppos);
+ if (err < 0)
+ return err;
+ if (write)
+ set_hugepage_nocache_copy(state);
+ return err;
+}
+
+static struct ctl_table copy_highpages_table[] = {
+ {
+ .procname = "hugepage_nocache_copy",
+ .data = NULL,
+ .maxlen = sizeof(unsigned int),
+ .mode = 0600,
+ .proc_handler = sysctl_hugepage_nocache_copy,
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = SYSCTL_ONE,
+ },
+ {}
+};
+
+static struct ctl_table copy_highpages_root_table[] = {
+ {
+ .procname = "vm",
+ .mode = 0555,
+ .child = copy_highpages_table,
+ },
+ {}
+};
+
+static __init int copy_highpages_init(void)
+{
+ return register_sysctl_table(copy_highpages_root_table) ? 0 : -ENOMEM;
+}
+__initcall(copy_highpages_init);
+#endif
+
+static void copy_highpages_nocache(struct page *to, struct page *from, int nr_pages)
+{
+ char *vfrom, *vto;
+ int i;
+
+ for (i = 0; i < nr_pages; i++) {
+ cond_resched();
+ vfrom = kmap_atomic(from);
+ vto = kmap_atomic(to);
+ copy_page_nocache(vto, vfrom);
+ kunmap_atomic(vto);
+ kunmap_atomic(vfrom);
+ to++;
+ from++;
+ }
+ copy_page_nocache_barrir();
+}
+
+static void copy_highpages_cache(struct page *to, struct page *from, int nr_pages)
+{
+ int i;
+
+ for (i = 0; i < nr_pages; i++) {
+ cond_resched();
+ copy_highpage(to + i, from + i);
+ }
+}
+
+void copy_highpages(struct page *to, struct page *from, int nr_pages)
+{
+ if (static_branch_unlikely(&hugepage_nocache_copy) &&
+ get_node_type(page_to_nid(to)) == NODE_TYPE_PMEM)
+ return copy_highpages_nocache(to, from, nr_pages);
+
+ return copy_highpages_cache(to, from, nr_pages);
+}
diff --git a/arch/x86/lib/copy_page_64.S b/arch/x86/lib/copy_page_64.S
index 2402d4c489d2..16ca196d349b 100644
--- a/arch/x86/lib/copy_page_64.S
+++ b/arch/x86/lib/copy_page_64.S
@@ -87,3 +87,76 @@ SYM_FUNC_START_LOCAL(copy_page_regs)
addq $2*8, %rsp
ret
SYM_FUNC_END(copy_page_regs)
+
+SYM_FUNC_START(copy_page_nocache)
+ ALTERNATIVE "jmp copy_page", "", X86_FEATURE_XMM2
+ subq $2*8, %rsp
+ movq %rbx, (%rsp)
+ movq %r12, 1*8(%rsp)
+
+ movl $(4096/64)-5, %ecx
+ .p2align 4
+.LoopNT64:
+ dec %rcx
+ movq 0x8*0(%rsi), %rax
+ movq 0x8*1(%rsi), %rbx
+ movq 0x8*2(%rsi), %rdx
+ movq 0x8*3(%rsi), %r8
+ movq 0x8*4(%rsi), %r9
+ movq 0x8*5(%rsi), %r10
+ movq 0x8*6(%rsi), %r11
+ movq 0x8*7(%rsi), %r12
+
+ prefetcht0 5*64(%rsi)
+
+ movnti %rax, 0x8*0(%rdi)
+ movnti %rbx, 0x8*1(%rdi)
+ movnti %rdx, 0x8*2(%rdi)
+ movnti %r8, 0x8*3(%rdi)
+ movnti %r9, 0x8*4(%rdi)
+ movnti %r10, 0x8*5(%rdi)
+ movnti %r11, 0x8*6(%rdi)
+ movnti %r12, 0x8*7(%rdi)
+
+ leaq 64 (%rsi), %rsi
+ leaq 64 (%rdi), %rdi
+
+ jnz .LoopNT64
+
+ movl $5, %ecx
+ .p2align 4
+.LoopNT2:
+ decl %ecx
+
+ movq 0x8*0(%rsi), %rax
+ movq 0x8*1(%rsi), %rbx
+ movq 0x8*2(%rsi), %rdx
+ movq 0x8*3(%rsi), %r8
+ movq 0x8*4(%rsi), %r9
+ movq 0x8*5(%rsi), %r10
+ movq 0x8*6(%rsi), %r11
+ movq 0x8*7(%rsi), %r12
+
+ movnti %rax, 0x8*0(%rdi)
+ movnti %rbx, 0x8*1(%rdi)
+ movnti %rdx, 0x8*2(%rdi)
+ movnti %r8, 0x8*3(%rdi)
+ movnti %r9, 0x8*4(%rdi)
+ movnti %r10, 0x8*5(%rdi)
+ movnti %r11, 0x8*6(%rdi)
+ movnti %r12, 0x8*7(%rdi)
+
+ leaq 64(%rdi), %rdi
+ leaq 64(%rsi), %rsi
+ jnz .LoopNT2
+
+ movq (%rsp), %rbx
+ movq 1*8(%rsp), %r12
+ addq $2*8, %rsp
+ ret
+SYM_FUNC_END(copy_page_nocache)
+
+SYM_FUNC_START(copy_page_nocache_barrir)
+ ALTERNATIVE "", "sfence", X86_FEATURE_XMM2
+ ret
+SYM_FUNC_END(copy_page_nocache_barrir)
diff --git a/include/linux/highmem.h b/include/linux/highmem.h
index 8e21fe82b3a3..6b27af8fe624 100644
--- a/include/linux/highmem.h
+++ b/include/linux/highmem.h
@@ -356,4 +356,18 @@ static inline void copy_highpage(struct page *to, struct page *from)
#endif
+#ifndef __HAVE_ARCH_COPY_HUGEPAGES
+
+static inline void copy_highpages(struct page *to, struct page *from, int nr_pages)
+{
+ int i;
+
+ for (i = 0; i < nr_pages; i++) {
+ cond_resched();
+ copy_highpage(to + i, from + i);
+ }
+}
+
+#endif /* __HAVE_ARCH_COPY_HUGEPAGES */
+
#endif /* _LINUX_HIGHMEM_H */
diff --git a/mm/migrate.c b/mm/migrate.c
index 8a57a09ad665..14c4db23b3ed 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -578,7 +578,6 @@ static void __copy_gigantic_page(struct page *dst, struct page *src,
static void copy_huge_page(struct page *dst, struct page *src)
{
- int i;
int nr_pages;
if (PageHuge(src)) {
@@ -596,10 +595,7 @@ static void copy_huge_page(struct page *dst, struct page *src)
nr_pages = thp_nr_pages(src);
}
- for (i = 0; i < nr_pages; i++) {
- cond_resched();
- copy_highpage(dst + i, src + i);
- }
+ copy_highpages(dst, src, nr_pages);
}
/*
--
2.20.1
1
1
mainline inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4CMMB?from=project-issue
CVE: NA
Populate X86_FEATURE_SGX feature from CPUID and tie it to the Kconfig
option with disabled-features.h.
IA32_FEATURE_CONTROL.SGX_ENABLE must be examined in addition to the CPUID
bits to enable full SGX support. The BIOS must both set this bit and lock
IA32_FEATURE_CONTROL for SGX to be supported (Intel SDM section 36.7.1).
The setting or clearing of this bit has no impact on the CPUID bits above,
which is why it needs to be detected separately.
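A userspace probe of the CPUID half of this check is straightforward; a
minimal sketch (recent gcc/clang cpuid.h assumed; the IA32_FEATURE_CONTROL
half needs ring 0 or /dev/cpu/N/msr, so it is only noted in a comment):

#include <cpuid.h>
#include <stdio.h>

int main(void)
{
        unsigned int eax, ebx, ecx, edx;

        if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
                return 1;
        /* X86_FEATURE_SGX is CPUID.(EAX=7,ECX=0):EBX bit 2 */
        printf("CPUID SGX bit: %s\n", (ebx & (1u << 2)) ? "set" : "clear");
        /* full support additionally needs MSR 0x3A (IA32_FEATURE_CONTROL)
         * bit 18 (SGX_ENABLED) set and bit 0 (LOCKED) set by the BIOS */
        return 0;
}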
Signed-off-by: Sean Christopherson <sean.j.christopherson(a)intel.com>
Co-developed-by: Jarkko Sakkinen <jarkko(a)kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko(a)kernel.org>
Signed-off-by: Borislav Petkov <bp(a)suse.de>
Acked-by: Jethro Beekman <jethro(a)fortanix.com>
Signed-off-by: zhangyue <zhangyue1(a)kylinos.cn>
---
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/disabled-features.h | 8 +++++++-
arch/x86/include/asm/msr-index.h | 1 +
3 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index dad350d42ecf..1181f5c7bbef 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -241,6 +241,7 @@
/* Intel-defined CPU features, CPUID level 0x00000007:0 (EBX), word 9 */
#define X86_FEATURE_FSGSBASE ( 9*32+ 0) /* RDFSBASE, WRFSBASE, RDGSBASE, WRGSBASE instructions*/
#define X86_FEATURE_TSC_ADJUST ( 9*32+ 1) /* TSC adjustment MSR 0x3B */
+#define X86_FEATURE_SGX ( 9*32+ 2) /* Software Guard Extensions */
#define X86_FEATURE_BMI1 ( 9*32+ 3) /* 1st group bit manipulation extensions */
#define X86_FEATURE_HLE ( 9*32+ 4) /* Hardware Lock Elision */
#define X86_FEATURE_AVX2 ( 9*32+ 5) /* AVX2 instructions */
diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h
index 09db5b8f1444..24d66717c642 100644
--- a/arch/x86/include/asm/disabled-features.h
+++ b/arch/x86/include/asm/disabled-features.h
@@ -59,6 +59,12 @@
/* Force disable because it's broken beyond repair */
#define DISABLE_ENQCMD (1 << (X86_FEATURE_ENQCMD & 31))
+#ifdef CONFIG_X86_SGX
+#define DISABLE_SGX 0
+#else
+#define DISABLE_SGX (1 << (X86_FEATURE_SGX & 31))
+#endif
+
/*
* Make sure to add features to the correct mask
*/
@@ -71,7 +77,7 @@
#define DISABLED_MASK6 0
#define DISABLED_MASK7 (DISABLE_PTI)
#define DISABLED_MASK8 0
-#define DISABLED_MASK9 (DISABLE_SMAP)
+#define DISABLED_MASK9 (DISABLE_SMAP|DISABLE_SGX)
#define DISABLED_MASK10 0
#define DISABLED_MASK11 0
#define DISABLED_MASK12 0
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 972a34d93505..258d555d22f2 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -609,6 +609,7 @@
#define FEAT_CTL_LOCKED BIT(0)
#define FEAT_CTL_VMX_ENABLED_INSIDE_SMX BIT(1)
#define FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX BIT(2)
+#define FEAT_CTL_SGX_ENABLED BIT(18)
#define FEAT_CTL_LMCE_ENABLED BIT(20)
#define MSR_IA32_TSC_ADJUST 0x0000003b
--
2.30.0
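A minimal sketch of how the two conditions described above combine at
detection time, assuming the FEAT_CTL_* bits introduced by this patch;
the helper name and its placement are hypothetical, not taken from the
patch, and the real detection logic lives elsewhere in the SGX series:
/* Illustrative only -- not part of this patch. SGX is usable only
 * when the CPUID bit is present AND the BIOS has both enabled SGX
 * and locked IA32_FEATURE_CONTROL (Intel SDM section 36.7.1). */
#include <asm/cpufeature.h>
#include <asm/msr.h>
static bool __init sgx_bios_enabled(void)	/* hypothetical helper */
{
	u64 feat_ctl;
	if (!boot_cpu_has(X86_FEATURE_SGX))
		return false;
	rdmsrl(MSR_IA32_FEAT_CTL, feat_ctl);
	return (feat_ctl & FEAT_CTL_LOCKED) &&
	       (feat_ctl & FEAT_CTL_SGX_ENABLED);
}
Checking FEAT_CTL_LOCKED as well matters because an unlocked MSR can
still be changed, so the enable bit alone is not a stable indication.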
2
1
From: JiaoFenfang <jiaofenfang(a)uniontech.com>
openEuler inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4KVP7?from=project-issue
Signed-off-by: Jiao Fenfang <jiaofenfang(a)uniontech.com>
---
arch/x86/configs/openeuler_defconfig | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/arch/x86/configs/openeuler_defconfig b/arch/x86/configs/openeuler_defconfig
index 7b608301823c..01a902eba3b9 100644
--- a/arch/x86/configs/openeuler_defconfig
+++ b/arch/x86/configs/openeuler_defconfig
@@ -7451,7 +7451,13 @@ CONFIG_XFS_POSIX_ACL=y
CONFIG_GFS2_FS=m
CONFIG_GFS2_FS_LOCKING_DLM=y
# CONFIG_OCFS2_FS is not set
-# CONFIG_BTRFS_FS is not set
+CONFIG_BTRFS_FS=m
+# CONFIG_BTRFS_FS_POSIX_ACL is not set
+# CONFIG_BTRFS_FS_CHECK_INTEGRITY is not set
+# CONFIG_BTRFS_FS_RUN_SANITY_TESTS is not set
+# CONFIG_BTRFS_DEBUG is not set
+# CONFIG_BTRFS_ASSERT is not set
+# CONFIG_BTRFS_FS_REF_VERIFY is not set
# CONFIG_NILFS2_FS is not set
# CONFIG_F2FS_FS is not set
# CONFIG_ZONEFS_FS is not set
--
2.20.1
2
1
Backport 5.10.88 LTS patches from upstream.
Alex Bee (1):
arm64: dts: rockchip: fix audio-supply for Rock Pi 4
Alyssa Ross (1):
dmaengine: st_fdma: fix MODULE_ALIAS
Amelie Delaunay (1):
usb: dwc2: fix STM ID/VBUS detection startup delay in
dwc2_driver_probe
Artem Lapkin (1):
arm64: dts: rockchip: remove mmc-hs400-enhanced-strobe from
rk3399-khadas-edge
Baowen Zheng (1):
flow_offload: return EOPNOTSUPP for the unsupported mpls action type
Cyril Novikov (1):
ixgbe: set X550 MDIO speed before talking to PHY
D. Wythe (1):
net/smc: Prevent smc_release() from long blocking
Dan Carpenter (2):
vdpa: check that offsets are within bounds
tee: amdtee: fix an IS_ERR() vs NULL bug
Daniel Borkmann (3):
bpf: Fix signed bounds propagation after mov32
bpf: Make 32->64 bounds propagation slightly more robust
bpf, selftests: Add test case trying to taint map value pointer
Daniele Palmas (1):
USB: serial: option: add Telit FN990 compositions
David Ahern (3):
selftests: Add duplicate config only for MD5 VRF tests
selftests: Fix raw socket bind tests with VRF
selftests: Fix IPv6 address bind tests
Davide Caratti (1):
net/sched: sch_ets: don't remove idle classes from the round-robin
list
Dinh Nguyen (1):
ARM: socfpga: dts: fix qspi node compatible
Eric Dumazet (3):
sch_cake: do not call cake_destroy() from cake_init()
inet_diag: fix kernel-infoleak for UDP sockets
sit: do not call ipip6_dev_free() from sit_init_net()
Fabio Estevam (2):
arm64: dts: imx8mp-evk: Improve the Ethernet PHY description
ARM: dts: imx6ull-pinfunc: Fix CSI_DATA07__ESAI_TX0 pad name
Felix Fietkau (2):
mac80211: fix regression in SSN handling of addba tx
mac80211: send ADDBA requests using the tid/queue of the aggregation
session
Filipe Manana (1):
btrfs: fix double free of anon_dev after failure to create subvolume
Florian Fainelli (1):
net: systemport: Add global locking for descriptor lifecycle
Florian Westphal (1):
mptcp: clear 'kern' flag from fallback sockets
Gal Pressman (1):
net: Fix double 0x prefix print in SKB dump
George Kennedy (4):
libata: if T_LENGTH is zero, dma direction should be DMA_NONE
scsi: scsi_debug: Don't call kcalloc() if size arg is zero
scsi: scsi_debug: Fix type in min_t to avoid stack OOB
scsi: scsi_debug: Sanity check block descriptor length in
resp_mode_select()
Greg Kroah-Hartman (1):
Revert "usb: early: convert to readl_poll_timeout_atomic()"
Hangbin Liu (1):
selftest/net/forwarding: declare NETIFS p9 p10
Hu Weiwen (1):
ceph: fix duplicate increment of opened_inodes metric
Jerome Marchand (1):
recordmcount.pl: look for jgnop instruction as well as bcrl on s390
Ji-Ze Hong (Peter Hong) (1):
serial: 8250_fintek: Fix garbled text for console
Jianglei Nie (1):
btrfs: fix memory leak in __add_inode_ref()
Jiasheng Jiang (2):
drm/ast: potential dereference of null pointer
sfc_ef100: potential dereference of null pointer
Jie Wang (1):
net: hns3: fix use-after-free bug in hclgevf_send_mbx_msg
Jie2x Zhou (1):
selftests: net: Correct ping6 expected rc from 2 to 1
Jimmy Wang (1):
USB: NO_LPM quirk Lenovo USB-C to Ethernet Adapher(RTL8153-04)
Joakim Zhang (1):
arm64: dts: imx8m: correct assigned clocks for FEC
Joe Thornber (1):
dm btree remove: fix use after free in rebalance_children()
Johan Hovold (1):
USB: serial: cp210x: fix CP2105 GPIO registration
Johannes Berg (5):
mac80211: mark TX-during-stop for TX in in_reconfig
mac80211: validate extended element ID is present
mac80211: track only QoS data frames for admission control
mac80211: agg-tx: don't schedule_and_wake_txq() under sta->lock
mac80211: fix lookup when adding AddBA extension element
John Keeping (2):
arm64: dts: rockchip: fix rk3308-roc-cc vcc-sd supply
arm64: dts: rockchip: fix rk3399-leez-p710 vcc3v3-lan supply
Juergen Gross (5):
xen/blkfront: harden blkfront against event channel storms
xen/netfront: harden netfront against event channel storms
xen/console: harden hvc_xen against event channel storms
xen/netback: fix rx queue stall detection
xen/netback: don't queue unlimited number of packages
Karen Sornek (1):
igb: Fix removal of unicast MAC filters of VFs
Lang Yu (1):
drm/amd/pm: fix a potential gpu_metrics_table memory leak
Le Ma (1):
drm/amdgpu: correct register access for RLC_JUMP_TABLE_RESTORE
Letu Ren (1):
igbvf: fix double free in `igbvf_probe`
Magnus Karlsson (2):
xsk: Do not sleep in poll() when need_wakeup set
Revert "xsk: Do not sleep in poll() when need_wakeup set"
Martin KaFai Lau (1):
bpf, selftests: Fix racing issue in btf_skc_cls_ingress test
Mike Tipton (1):
clk: Don't parent clks until the parent is fully registered
Miklos Szeredi (2):
fuse: annotate lock in fuse_reverse_inval_entry()
ovl: fix warning in ovl_create_real()
Naohiro Aota (1):
zonefs: add MODULE_ALIAS_FS
Nathan Chancellor (2):
soc/tegra: fuse: Fix bitwise vs. logical OR warning
Input: touchscreen - avoid bitwise vs logical OR warning
Nehal Bakulchandra Shah (1):
usb: xhci: Extend support for runtime power management for AMD's
Yellow carp.
Paolo Bonzini (1):
KVM: downgrade two BUG_ONs to WARN_ON_ONCE
Paul E. McKenney (1):
rcu: Mark accesses to rcu_state.n_force_qs
Pavel Skripkin (1):
media: mxl111sf: change mutex_init() location
Philipp Rudo (1):
s390/kexec_file: fix error handling when applying relocations
Robert Schlabbach (1):
ixgbe: Document how to enable NBASE-T support
Sasha Neftin (1):
igc: Fix typo in i225 LTR functions
Stefan Roese (1):
PCI/MSI: Mask MSI-X vectors only on success
Stephan Gerhold (1):
soc: imx: Register SoC device only on i.MX boards
Sudeep Holla (1):
firmware: arm_scpi: Fix string overflow in SCPI genpd driver
Tejun Heo (1):
iocost: Fix divide-by-zero on donation from low hweight cgroup
Tetsuo Handa (1):
tty: n_hdlc: make n_hdlc_tty_wakeup() asynchronous
Thomas Gleixner (1):
PCI/MSI: Clear PCI_MSIX_FLAGS_MASKALL on error
Tony Lindgren (1):
bus: ti-sysc: Fix variable set but not used warning for reinit_modules
Vitaly Kuznetsov (2):
KVM: selftests: Make sure kvm_create_max_vcpus test won't hit
RLIMIT_NOFILE
KVM: x86: Drop guest CPUID check for host initiated writes to
MSR_IA32_PERF_CAPABILITIES
Wei Wang (1):
virtio/vsock: fix the transport to work with VMADDR_CID_ANY
Will Deacon (1):
virtio_ring: Fix querying of maximum DMA mapping size for virtio
device
Willem de Bruijn (1):
net/packet: rx_owner_map depends on pg_vec
Xiubo Li (1):
ceph: initialize pathlen variable in reconnect_caps_cb
.../device_drivers/ethernet/intel/ixgbe.rst | 16 +++
arch/arm/boot/dts/imx6ull-pinfunc.h | 2 +-
.../boot/dts/socfpga_arria10_socdk_qspi.dts | 2 +-
arch/arm/boot/dts/socfpga_arria5_socdk.dts | 2 +-
arch/arm/boot/dts/socfpga_cyclone5_socdk.dts | 2 +-
arch/arm/boot/dts/socfpga_cyclone5_sockit.dts | 2 +-
.../boot/dts/socfpga_cyclone5_socrates.dts | 2 +-
arch/arm/boot/dts/socfpga_cyclone5_sodia.dts | 2 +-
.../boot/dts/socfpga_cyclone5_vining_fpga.dts | 4 +-
arch/arm64/boot/dts/freescale/imx8mm.dtsi | 7 +-
arch/arm64/boot/dts/freescale/imx8mn.dtsi | 7 +-
arch/arm64/boot/dts/freescale/imx8mp-evk.dts | 2 +
arch/arm64/boot/dts/freescale/imx8mp.dtsi | 7 +-
.../arm64/boot/dts/rockchip/rk3308-roc-cc.dts | 2 +-
.../boot/dts/rockchip/rk3399-khadas-edge.dtsi | 1 -
.../boot/dts/rockchip/rk3399-leez-p710.dts | 2 +-
.../boot/dts/rockchip/rk3399-rock-pi-4.dtsi | 2 +-
arch/s390/kernel/machine_kexec_file.c | 7 +-
arch/x86/kvm/x86.c | 2 +-
block/blk-iocost.c | 9 +-
drivers/ata/libata-scsi.c | 15 ++-
drivers/block/xen-blkfront.c | 15 ++-
drivers/bus/ti-sysc.c | 3 +-
drivers/clk/clk.c | 15 ++-
drivers/dma/st_fdma.c | 2 +-
drivers/firmware/scpi_pm_domain.c | 10 +-
drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 4 +-
.../gpu/drm/amd/pm/swsmu/smu12/smu_v12_0.c | 3 +
drivers/gpu/drm/ast/ast_mode.c | 5 +-
drivers/input/touchscreen/of_touchscreen.c | 42 +++---
drivers/md/persistent-data/dm-btree-remove.c | 2 +-
drivers/media/usb/dvb-usb-v2/mxl111sf.c | 16 ++-
drivers/net/ethernet/broadcom/bcmsysport.c | 5 +-
drivers/net/ethernet/broadcom/bcmsysport.h | 1 +
.../hisilicon/hns3/hns3vf/hclgevf_mbx.c | 3 +-
drivers/net/ethernet/intel/igb/igb_main.c | 28 ++--
drivers/net/ethernet/intel/igbvf/netdev.c | 1 +
drivers/net/ethernet/intel/igc/igc_i225.c | 2 +-
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 4 +
drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c | 3 +
drivers/net/ethernet/sfc/ef100_nic.c | 3 +
drivers/net/xen-netback/common.h | 1 +
drivers/net/xen-netback/rx.c | 77 +++++++----
drivers/net/xen-netfront.c | 125 +++++++++++++-----
drivers/pci/msi.c | 15 ++-
drivers/scsi/scsi_debug.c | 42 +++---
drivers/soc/imx/soc-imx.c | 4 +
drivers/soc/tegra/fuse/fuse-tegra.c | 2 +-
drivers/soc/tegra/fuse/fuse.h | 2 +-
drivers/tee/amdtee/core.c | 5 +-
drivers/tty/hvc/hvc_xen.c | 30 ++++-
drivers/tty/n_hdlc.c | 23 +++-
drivers/tty/serial/8250/8250_fintek.c | 20 ---
drivers/usb/core/quirks.c | 3 +
drivers/usb/dwc2/platform.c | 3 +
drivers/usb/early/xhci-dbc.c | 15 ++-
drivers/usb/host/xhci-pci.c | 6 +-
drivers/usb/serial/cp210x.c | 6 +-
drivers/usb/serial/option.c | 8 ++
drivers/vhost/vdpa.c | 2 +-
drivers/virtio/virtio_ring.c | 2 +-
fs/btrfs/disk-io.c | 8 ++
fs/btrfs/tree-log.c | 1 +
fs/ceph/caps.c | 16 +--
fs/ceph/mds_client.c | 3 +-
fs/fuse/dir.c | 2 +-
fs/overlayfs/dir.c | 3 +-
fs/overlayfs/overlayfs.h | 1 +
fs/overlayfs/super.c | 12 +-
fs/zonefs/super.c | 1 +
kernel/bpf/verifier.c | 28 ++--
kernel/rcu/tree.c | 10 +-
net/core/skbuff.c | 2 +-
net/ipv4/inet_diag.c | 4 +-
net/ipv6/sit.c | 1 -
net/mac80211/agg-rx.c | 5 +-
net/mac80211/agg-tx.c | 16 ++-
net/mac80211/driver-ops.h | 5 +-
net/mac80211/mlme.c | 13 +-
net/mac80211/sta_info.h | 1 +
net/mac80211/util.c | 7 +-
net/mptcp/protocol.c | 4 +-
net/packet/af_packet.c | 5 +-
net/sched/cls_api.c | 1 +
net/sched/sch_cake.c | 6 +-
net/sched/sch_ets.c | 4 +-
net/smc/af_smc.c | 4 +-
net/vmw_vsock/virtio_transport_common.c | 3 +-
scripts/recordmcount.pl | 2 +-
.../bpf/prog_tests/btf_skc_cls_ingress.c | 16 ++-
.../selftests/bpf/verifier/value_ptr_arith.c | 23 ++++
.../selftests/kvm/kvm_create_max_vcpus.c | 30 +++++
tools/testing/selftests/net/fcnal-test.sh | 49 +++++--
.../net/forwarding/forwarding.config.sample | 2 +
virt/kvm/kvm_main.c | 6 +-
95 files changed, 670 insertions(+), 279 deletions(-)
--
2.20.1
1
92
Backport 5.10.87 LTS patches from upstream.
Adrian Hunter (8):
perf inject: Fix itrace space allowed for new attributes
perf intel-pt: Fix some PGE (packet generation enable/control flow
packets) usage
perf intel-pt: Fix sync state when a PSB (synchronization) packet is
found
perf intel-pt: Fix intel_pt_fup_event() assumptions about setting
state type
perf intel-pt: Fix state setting when receiving overflow (OVF) packet
perf intel-pt: Fix next 'err' value, walking trace
perf intel-pt: Fix missing 'instruction' events with 'q' option
perf intel-pt: Fix error timestamp setting on the decoder error path
Alexander Stein (1):
Revert "tty: serial: fsl_lpuart: drop earlycon entry for i.MX8QXP"
Antoine Tenart (1):
ethtool: do not perform operations on net devices being unregistered
Armin Wolf (1):
hwmon: (dell-smm) Fix warning on /proc/i8k creation error
Bui Quang Minh (1):
bpf: Fix integer overflow in argument calculation for
bpf_map_area_alloc
Chen Jun (1):
tracing: Fix a kmemleak false positive in tracing_map
Erik Ekman (1):
net/mlx4_en: Update reported link modes for 1/10G
Harshit Mogalapalli (1):
net: netlink: af_netlink: Prevent empty skb by adding a check on len.
Helge Deller (1):
parisc/agp: Annotate parisc agp init functions with __init
Ilie Halip (1):
s390/test_unwind: use raw opcode instead of invalid instruction
Kai Vehmanen (2):
ALSA: hda: Add Intel DG2 PCI ID and HDMI codec vid
ALSA: hda/hdmi: fix HDA codec entry table order for ADL-P
Marc Zyngier (1):
KVM: arm64: Save PSTATE early on exit
Mike Rapoport (4):
memblock: free_unused_memmap: use pageblock units instead of MAX_ORDER
memblock: align freed memory map on pageblock boundaries with
SPARSEMEM
arm: extend pfn_valid to take into account freed memory map alignment
arm: ioremap: don't abuse pfn_valid() to check if pfn is in RAM
Miklos Szeredi (1):
fuse: make sure reclaim doesn't write the inode
Mustapha Ghaddar (1):
drm/amd/display: Fix for the no Audio bug with Tiled Displays
Nikita Yushchenko (1):
staging: most: dim2: use device release method
Ondrej Jirman (1):
i2c: rk3x: Handle a spurious start completion interrupt flag
Perry Yuan (1):
drm/amd/display: add connector type check for CRC source set
Philip Chen (1):
drm/msm/dsi: set default num_data_lanes
Sean Christopherson (1):
KVM: x86: Ignore sparse banks size for an "all CPUs", non-sparse IPI
req
Tadeusz Struk (1):
nfc: fix segfault in nfc_genl_dump_devices_done
arch/arm/mm/init.c | 37 ++++++---
arch/arm/mm/ioremap.c | 4 +-
arch/arm64/kvm/hyp/include/hyp/switch.h | 6 ++
arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h | 7 +-
arch/s390/lib/test_unwind.c | 5 +-
arch/x86/kvm/hyperv.c | 7 +-
drivers/char/agp/parisc-agp.c | 6 +-
.../drm/amd/display/amdgpu_dm/amdgpu_dm_crc.c | 8 ++
.../gpu/drm/amd/display/dc/core/dc_resource.c | 4 +
drivers/gpu/drm/msm/dsi/dsi_host.c | 2 +
drivers/hwmon/dell-smm-hwmon.c | 7 +-
drivers/i2c/busses/i2c-rk3x.c | 4 +-
.../net/ethernet/mellanox/mlx4/en_ethtool.c | 6 +-
drivers/staging/most/dim2/dim2.c | 55 ++++++------
drivers/tty/serial/fsl_lpuart.c | 1 +
fs/fuse/dir.c | 8 ++
fs/fuse/file.c | 15 ++++
fs/fuse/fuse_i.h | 1 +
fs/fuse/inode.c | 3 +
kernel/bpf/devmap.c | 4 +-
kernel/trace/tracing_map.c | 3 +
net/core/sock_map.c | 2 +-
net/ethtool/netlink.h | 3 +
net/netlink/af_netlink.c | 5 ++
net/nfc/netlink.c | 6 +-
sound/pci/hda/hda_intel.c | 12 ++-
sound/pci/hda/patch_hdmi.c | 3 +-
tools/perf/builtin-inject.c | 2 +-
.../util/intel-pt-decoder/intel-pt-decoder.c | 83 ++++++++++++-------
tools/perf/util/intel-pt.c | 1 +
30 files changed, 220 insertions(+), 90 deletions(-)
--
2.20.1
1
32

[PATCH openEuler-5.10 1/5] cgroup: cgroup.{procs,threads} factor out common parts
by Zheng Zengkai 14 Jan '22
From: Michal Koutný <mkoutny(a)suse.com>
mainline inclusion
from mainline-v5.12-rc1
commit da70862efe0065bada33d67a903270cdbbaf07d9
bugzilla: 186050 https://gitee.com/openeuler/kernel/issues/I4DDEL
CVE: CVE-2021-4197
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
The functions cgroup_threads_write and cgroup_procs_write are almost
identical. In order to reduce duplication, factor out the common code in
similar fashion we already do for other threadgroup/task functions. No
functional changes are intended.
Suggested-by: Hao Lee <haolee.swjtu(a)gmail.com>
Signed-off-by: Michal Koutný <mkoutny(a)suse.com>
Reviewed-by: Daniel Jordan <daniel.m.jordan(a)oracle.com>
Signed-off-by: Tejun Heo <tj(a)kernel.org>
Signed-off-by: Lu Jialin <lujialin4(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
kernel/cgroup/cgroup.c | 55 +++++++++++-------------------------------
1 file changed, 14 insertions(+), 41 deletions(-)
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 353369a1bc6b..c76dbb03e887 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -4835,8 +4835,8 @@ static int cgroup_attach_permissions(struct cgroup *src_cgrp,
return ret;
}
-static ssize_t cgroup_procs_write(struct kernfs_open_file *of,
- char *buf, size_t nbytes, loff_t off)
+static ssize_t __cgroup_procs_write(struct kernfs_open_file *of, char *buf,
+ bool threadgroup)
{
struct cgroup *src_cgrp, *dst_cgrp;
struct task_struct *task;
@@ -4847,7 +4847,7 @@ static ssize_t cgroup_procs_write(struct kernfs_open_file *of,
if (!dst_cgrp)
return -ENODEV;
- task = cgroup_procs_write_start(buf, true, &locked);
+ task = cgroup_procs_write_start(buf, threadgroup, &locked);
ret = PTR_ERR_OR_ZERO(task);
if (ret)
goto out_unlock;
@@ -4857,19 +4857,26 @@ static ssize_t cgroup_procs_write(struct kernfs_open_file *of,
src_cgrp = task_cgroup_from_root(task, &cgrp_dfl_root);
spin_unlock_irq(&css_set_lock);
+ /* process and thread migrations follow same delegation rule */
ret = cgroup_attach_permissions(src_cgrp, dst_cgrp,
- of->file->f_path.dentry->d_sb, true);
+ of->file->f_path.dentry->d_sb, threadgroup);
if (ret)
goto out_finish;
- ret = cgroup_attach_task(dst_cgrp, task, true);
+ ret = cgroup_attach_task(dst_cgrp, task, threadgroup);
out_finish:
cgroup_procs_write_finish(task, locked);
out_unlock:
cgroup_kn_unlock(of->kn);
- return ret ?: nbytes;
+ return ret;
+}
+
+static ssize_t cgroup_procs_write(struct kernfs_open_file *of,
+ char *buf, size_t nbytes, loff_t off)
+{
+ return __cgroup_procs_write(of, buf, true) ?: nbytes;
}
static void *cgroup_threads_start(struct seq_file *s, loff_t *pos)
@@ -4880,41 +4887,7 @@ static void *cgroup_threads_start(struct seq_file *s, loff_t *pos)
static ssize_t cgroup_threads_write(struct kernfs_open_file *of,
char *buf, size_t nbytes, loff_t off)
{
- struct cgroup *src_cgrp, *dst_cgrp;
- struct task_struct *task;
- ssize_t ret;
- bool locked;
-
- buf = strstrip(buf);
-
- dst_cgrp = cgroup_kn_lock_live(of->kn, false);
- if (!dst_cgrp)
- return -ENODEV;
-
- task = cgroup_procs_write_start(buf, false, &locked);
- ret = PTR_ERR_OR_ZERO(task);
- if (ret)
- goto out_unlock;
-
- /* find the source cgroup */
- spin_lock_irq(&css_set_lock);
- src_cgrp = task_cgroup_from_root(task, &cgrp_dfl_root);
- spin_unlock_irq(&css_set_lock);
-
- /* thread migrations follow the cgroup.procs delegation rule */
- ret = cgroup_attach_permissions(src_cgrp, dst_cgrp,
- of->file->f_path.dentry->d_sb, false);
- if (ret)
- goto out_finish;
-
- ret = cgroup_attach_task(dst_cgrp, task, false);
-
-out_finish:
- cgroup_procs_write_finish(task, locked);
-out_unlock:
- cgroup_kn_unlock(of->kn);
-
- return ret ?: nbytes;
+ return __cgroup_procs_write(of, buf, false) ?: nbytes;
}
/* cgroup core interface files for the default hierarchy */
--
2.20.1
1
4

[PATCH openEuler-5.10] netfilter: selftest: conntrack_vrf.sh: fix file permission
by Zheng Zengkai 14 Jan '22
From: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
stable inclusion
from stable-v5.10.86
commit 32414491834c80ab39519467deb3f8d1e4f5bade
bugzilla: 186045 https://gitee.com/openeuler/kernel/issues/I4QVPD
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
When backporting 33b8aad21ac1 ("selftests: netfilter: add a
vrf+conntrack testcase") to this stable branch, the executable bits were
not properly set on the
tools/testing/selftests/netfilter/conntrack_vrf.sh file due to quilt not
honoring them.
Fix this up manually by setting the correct mode.
Reported-by: "Rantala, Tommi T. (Nokia - FI/Espoo)" <tommi.t.rantala(a)nokia.com>
Link: https://lore.kernel.org/r/234d7a6a81664610fdf21ac72730f8bd10d3f46f.camel@no…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
tools/testing/selftests/netfilter/conntrack_vrf.sh | 0
1 file changed, 0 insertions(+), 0 deletions(-)
mode change 100644 => 100755 tools/testing/selftests/netfilter/conntrack_vrf.sh
diff --git a/tools/testing/selftests/netfilter/conntrack_vrf.sh b/tools/testing/selftests/netfilter/conntrack_vrf.sh
old mode 100644
new mode 100755
--
2.20.1
1
0
Backport 5.10.85 LTS patches from upstream.
Alan Young (1):
ALSA: ctl: Fix copy of updated id with element read/write
Alexander Stein (1):
dt-bindings: net: Reintroduce PHY no lane swap binding
Alexander Sverdlin (1):
nfsd: Fix nsfd startup race (again)
Alyssa Ross (1):
iio: trigger: stm32-timer: fix MODULE_ALIAS
Andrea Mayer (1):
seg6: fix the iif in the IPv6 socket control block
Arnaldo Carvalho de Melo (1):
tools build: Remove needless libpython-version feature check that
breaks test-all fast path
Bas Nieuwenhuizen (1):
drm/syncobj: Deal with signalled fences in drm_syncobj_find_fence.
Benjamin Tissoires (1):
HID: bigbenff: prevent null pointer dereference
Billy Tsai (1):
irqchip/aspeed-scu: Replace update_bits with write_bits.
Björn Töpel (1):
bpf, x86: Fix "no previous prototype" warning
Brian Silverman (1):
can: m_can: Disable and ignore ELO interrupt
Dan Carpenter (3):
can: sja1000: fix use after free in ems_pcmcia_add_card()
net: altera: set a couple error code in probe()
net/qla3xxx: fix an error code in ql_adapter_up()
Davidlohr Bueso (1):
block: fix ioprio_get(IOPRIO_WHO_PGRP) vs setuid(2)
Dmitry Baryshkov (1):
clk: qcom: regmap-mux: fix parent clock lookup
Eric Biggers (5):
wait: add wake_up_pollfree()
binder: use wake_up_pollfree()
signalfd: use wake_up_pollfree()
aio: keep poll requests on waitqueue until completed
aio: fix use-after-free due to missing POLLFREE handling
Eric Dumazet (5):
bonding: make tx_rebalance_counter an atomic
netfilter: conntrack: annotate data-races around ct->timeout
devlink: fix netns refcount leak in devlink_nl_cmd_reload()
net/sched: fq_pie: prevent dismantle issue
net, neigh: clear whole pneigh_entry at alloc time
Evgeny Boger (1):
iio: adc: axp20x_adc: fix charging current reporting on AXP22x
Fabrice Gasnier (1):
iio: adc: stm32: fix a current leak by resetting pcsel before
disabling vdda
Florian Westphal (1):
selftests: netfilter: add a vrf+conntrack testcase
Greg Kroah-Hartman (7):
HID: add hid_is_usb() function to make it simpler for USB detection
HID: add USB_HID dependancy to hid-prodikeys
HID: add USB_HID dependancy to hid-chicony
HID: add USB_HID dependancy on some USB HID drivers
HID: wacom: fix problems when device is not a valid USB device
HID: check for valid USB device for many HID drivers
USB: gadget: zero allocate endpoint 0 buffers
Gwendal Grignou (1):
iio: at91-sama5d2: Fix incorrect sign extension
Hannes Reinecke (1):
libata: add horkage for ASMedia 1092
Hans de Goede (1):
HID: quirks: Add quirk for the Microsoft Surface 3 type-cover
Herve Codina (2):
mtd: rawnand: fsmc: Take instruction delay into account
mtd: rawnand: fsmc: Fix timing computation
Ian Rogers (1):
perf tools: Fix SMT detection fast read path
Igor Pylypiv (1):
scsi: pm80xx: Do not call scsi_remove_host() in pm8001_alloc()
J. Bruce Fields (1):
nfsd: fix use-after-free due to delegation race
James Zhu (3):
drm/amdkfd: separate kfd_iommu_resume from kfd_resume
drm/amdgpu: add amdgpu_amdkfd_resume_iommu
drm/amdgpu: move iommu_resume before ip init/resume
Jesse Brandeburg (1):
ice: ignore dropped packets during init
Jeya R (1):
misc: fastrpc: fix improper packet size calculation
Jianglei Nie (1):
nfp: Fix memory leak in nfp_cpp_area_cache_add()
Jianguo Wu (1):
udp: using datalen to cap max gso segments
Jimmy Assarsson (2):
can: kvaser_usb: get CAN clock frequency from device
can: kvaser_pciefd: kvaser_pciefd_rx_error_frame(): increase correct
stats->{rx,tx}_errors counter
Joakim Zhang (1):
net: fec: only clear interrupt of handling queue in
fec_enet_rx_queue()
Josef Bacik (1):
btrfs: clear extent buffer uptodate when we fail to write it
Kai-Heng Feng (1):
xhci: Remove CONFIG_USB_DEFAULT_PERSIST to prevent xHCI from runtime
suspending
Kailang Yang (1):
ALSA: hda/realtek - Add headset Mic support for Lenovo ALC897 platform
Karen Sornek (1):
i40e: Fix failed opcode appearing if handling messages from VF
Kelly Devilliv (1):
csky: fix typo of fpu config macro
Kister Genesis Jimenez (1):
iio: gyro: adxrs290: fix data signedness
Krzysztof Kozlowski (1):
nfc: fix potential NULL pointer deref in nfc_genl_dump_ses_done
Lang Yu (1):
drm/amd/amdkfd: adjust dummy functions' placement
Lars-Peter Clausen (8):
iio: trigger: Fix reference counting
iio: stk3310: Don't return error code in interrupt handler
iio: mma8452: Fix trigger reference couting
iio: ltr501: Don't return error code in trigger handler
iio: kxsd9: Don't return error code in trigger handler
iio: itg3200: Call iio_trigger_notify_done() on error
iio: dln2: Check return value of devm_iio_trigger_register()
iio: ad7768-1: Call iio_trigger_notify_done() on error
Lee Jones (1):
net: cdc_ncm: Allow for dwNtbOutMaxSize to be unset or zero
Louis Amas (1):
net: mvpp2: fix XDP rx queues registering
Lukas Bulwahn (1):
MAINTAINERS: adjust GCC PLUGINS after gcc-plugin.sh removal
Manish Chopra (1):
qede: validate non LSO skb length
Manjong Lee (1):
mm: bdi: initialize bdi_min_ratio when bdi is unregistered
Marek Behún (1):
Revert "PCI: aardvark: Fix support for PCI_ROM_ADDRESS1 on emulated
bridge"
Markus Hochholdinger (1):
md: fix update super 1.0 on rdev size change
Masahiro Yamada (3):
gcc-plugins: simplify GCC plugin-dev capability test
kbuild: simplify GCC_PLUGINS enablement in dummy-tools/gcc
doc: gcc-plugins: update gcc-plugins.rst
Mateusz Palczewski (1):
i40e: Fix pre-set max number of queues for VF
Mathias Nyman (1):
xhci: avoid race between disable slot command and host runtime suspend
Maxim Mikityanskiy (2):
bpf: Fix the off-by-two error in range markings
bpf: Add selftests to cover packet access corner cases
Michal Maloszewski (1):
iavf: Fix reporting when setting descriptor count
Mike Marciniszyn (4):
IB/hfi1: Insure use of smp_processor_id() is preempt disabled
IB/hfi1: Fix early init panic
IB/hfi1: Fix leak of rcvhdrtail_dummy_kvaddr
IB/hfi1: Correct guard on eager buffer deallocation
Miles Chen (1):
clk: imx: use module_platform_driver
Mitch Williams (1):
iavf: restore MSI state on reset
Nicolas Dichtel (1):
vrf: don't run conntrack on vrf with !dflt qdisc
Noralf Trønnes (1):
iio: dln2-adc: Fix lockdep complaint
Norbert Zulinski (1):
i40e: Fix NULL pointer dereference in i40e_dbg_dump_desc
Pali Rohár (2):
irqchip/armada-370-xp: Fix return value of armada_370_xp_msi_alloc()
irqchip/armada-370-xp: Fix support for Multi-MSI interrupts
Pavel Hofman (2):
usb: core: config: fix validation of wMaxPacketValue entries
usb: core: config: using bit mask instead of individual bits
Peilin Ye (1):
selftests/fib_tests: Rework fib_rp_filter_test()
Qu Wenruo (1):
btrfs: replace the BUG_ON in btrfs_del_root_ref with proper error
handling
Rafael J. Wysocki (1):
PM: runtime: Fix pm_runtime_active() kerneldoc comment
Rob Clark (1):
ASoC: rt5682: Fix crash due to out of scope stack vars
Robert Karszniewicz (1):
Documentation/Kbuild: Remove references to gcc-plugin.sh
Roman Bolshakov (1):
scsi: qla2xxx: Format log strings only if needed
Sebastian Andrzej Siewior (1):
Documentation/locking/locktypes: Update migrate_disable() bits.
Shin'ichiro Kawasaki (1):
scsi: scsi_debug: Fix buffer size of REPORT ZONES command
Srinivas Kandagatla (4):
ASoC: qdsp6: q6routing: Fix return value from
msm_routing_put_audio_mixer
ASoC: codecs: wsa881x: fix return values from kcontrol put
ASoC: codecs: wcd934x: handle channel mappping list correctly
ASoC: codecs: wcd934x: return correct value from mixer put
Stefano Brivio (1):
nft_set_pipapo: Fix bucket load in AVX2 lookup routine for six 8-bit
groups
Steven Rostedt (VMware) (2):
tracefs: Have new files inherit the ownership of their parent
tracefs: Set all files to the same group ownership as the mount option
Takashi Iwai (3):
ALSA: pcm: oss: Fix negative period/buffer sizes
ALSA: pcm: oss: Limit the period size to 16MB
ALSA: pcm: oss: Handle missing errors in snd_pcm_oss_change_params*()
Thomas Haemmerle (1):
usb: gadget: uvc: fix multiple opens
Tom Lendacky (1):
x86/sme: Explicitly map new EFI memmap table as encrypted
Valdis Kletnieks (1):
gcc-plugins: fix gcc 11 indigestion with plugins...
Vincent Mailhol (1):
can: pch_can: pch_can_rx_normal: fix use after free
Vitaly Kuznetsov (1):
KVM: x86: Wait for IPIs to be delivered when handling Hyper-V TLB
flush hypercall
Vladimir Murzin (1):
irqchip: nvic: Fix offset for Interrupt Priority Offsets
Werner Sembach (1):
ALSA: hda/realtek: Fix quirk for TongFang PHxTxX1
Wolfram Sang (1):
mmc: renesas_sdhi: initialize variable properly when tuning
Wudi Wang (1):
irqchip/irq-gic-v3-its.c: Force synchronisation when issuing INVALL
Yang Yingliang (1):
iio: accel: kxcjk-1013: Fix possible memory leak in probe and remove
Yangyang Li (2):
RDMA/hns: Do not halt commands during reset until later
RDMA/hns: Do not destroy QP resources in the hw resetting phase
Yifan Zhang (2):
drm/amdgpu: init iommu after amdkfd device init
drm/amdkfd: fix boot failure when iommu is disabled in Picasso.
xiazhengqiao (1):
HID: google: add eel USB id
.../devicetree/bindings/net/ethernet-phy.yaml | 8 +
Documentation/kbuild/gcc-plugins.rst | 47 +-
Documentation/locking/locktypes.rst | 9 +-
MAINTAINERS | 1 -
arch/csky/kernel/traps.c | 4 +-
arch/x86/Kconfig | 1 +
arch/x86/include/asm/kvm_host.h | 2 +-
arch/x86/platform/efi/quirks.c | 3 +-
block/ioprio.c | 3 +
drivers/android/binder.c | 21 +-
drivers/ata/libata-core.c | 2 +
drivers/clk/imx/clk-imx8qxp-lpcg.c | 2 +-
drivers/clk/imx/clk-imx8qxp.c | 2 +-
drivers/clk/qcom/clk-regmap-mux.c | 2 +-
drivers/clk/qcom/common.c | 12 +
drivers/clk/qcom/common.h | 2 +
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 97 +--
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 145 +++-
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 +
drivers/gpu/drm/amd/amdkfd/kfd_device.c | 15 +-
drivers/gpu/drm/drm_syncobj.c | 11 +-
drivers/hid/Kconfig | 10 +-
drivers/hid/hid-asus.c | 6 +-
drivers/hid/hid-bigbenff.c | 2 +-
drivers/hid/hid-chicony.c | 8 +-
drivers/hid/hid-corsair.c | 7 +-
drivers/hid/hid-elan.c | 2 +-
drivers/hid/hid-elo.c | 3 +
drivers/hid/hid-google-hammer.c | 2 +
drivers/hid/hid-holtek-kbd.c | 9 +-
drivers/hid/hid-holtek-mouse.c | 9 +
drivers/hid/hid-ids.h | 2 +
drivers/hid/hid-lg.c | 10 +-
drivers/hid/hid-logitech-dj.c | 2 +-
drivers/hid/hid-prodikeys.c | 10 +-
drivers/hid/hid-quirks.c | 1 +
drivers/hid/hid-roccat-arvo.c | 3 +
drivers/hid/hid-roccat-isku.c | 3 +
drivers/hid/hid-roccat-kone.c | 3 +
drivers/hid/hid-roccat-koneplus.c | 3 +
drivers/hid/hid-roccat-konepure.c | 3 +
drivers/hid/hid-roccat-kovaplus.c | 3 +
drivers/hid/hid-roccat-lua.c | 3 +
drivers/hid/hid-roccat-pyra.c | 3 +
drivers/hid/hid-roccat-ryos.c | 3 +
drivers/hid/hid-roccat-savu.c | 3 +
drivers/hid/hid-samsung.c | 3 +
drivers/hid/hid-u2fzero.c | 2 +-
drivers/hid/hid-uclogic-core.c | 3 +
drivers/hid/hid-uclogic-params.c | 3 +-
drivers/hid/wacom_sys.c | 19 +-
drivers/iio/accel/kxcjk-1013.c | 5 +-
drivers/iio/accel/kxsd9.c | 6 +-
drivers/iio/accel/mma8452.c | 2 +-
drivers/iio/adc/ad7768-1.c | 2 +-
drivers/iio/adc/at91-sama5d2_adc.c | 3 +-
drivers/iio/adc/axp20x_adc.c | 18 +-
drivers/iio/adc/dln2-adc.c | 21 +-
drivers/iio/adc/stm32-adc.c | 1 +
drivers/iio/gyro/adxrs290.c | 5 +-
drivers/iio/gyro/itg3200_buffer.c | 2 +-
drivers/iio/industrialio-trigger.c | 1 -
drivers/iio/light/ltr501.c | 2 +-
drivers/iio/light/stk3310.c | 6 +-
drivers/iio/trigger/stm32-timer-trigger.c | 2 +-
drivers/infiniband/hw/hfi1/chip.c | 2 +
drivers/infiniband/hw/hfi1/driver.c | 2 +
drivers/infiniband/hw/hfi1/init.c | 40 +-
drivers/infiniband/hw/hfi1/sdma.c | 2 +-
drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 14 +-
drivers/irqchip/irq-armada-370-xp.c | 16 +-
drivers/irqchip/irq-aspeed-scu-ic.c | 4 +-
drivers/irqchip/irq-gic-v3-its.c | 2 +-
drivers/irqchip/irq-nvic.c | 2 +-
drivers/md/md.c | 1 +
drivers/misc/fastrpc.c | 10 +-
drivers/mmc/host/renesas_sdhi_core.c | 2 +-
drivers/mtd/nand/raw/fsmc_nand.c | 36 +-
drivers/net/bonding/bond_alb.c | 14 +-
drivers/net/can/kvaser_pciefd.c | 8 +-
drivers/net/can/m_can/m_can.c | 14 +-
drivers/net/can/pch_can.c | 2 +-
drivers/net/can/sja1000/ems_pcmcia.c | 7 +-
.../net/can/usb/kvaser_usb/kvaser_usb_leaf.c | 101 ++-
drivers/net/ethernet/altera/altera_tse_main.c | 9 +-
drivers/net/ethernet/freescale/fec.h | 3 +
drivers/net/ethernet/freescale/fec_main.c | 2 +-
.../net/ethernet/intel/i40e/i40e_debugfs.c | 8 +
.../ethernet/intel/i40e/i40e_virtchnl_pf.c | 75 ++-
.../ethernet/intel/i40e/i40e_virtchnl_pf.h | 2 +
.../net/ethernet/intel/iavf/iavf_ethtool.c | 43 +-
drivers/net/ethernet/intel/iavf/iavf_main.c | 1 +
drivers/net/ethernet/intel/ice/ice_main.c | 3 +
.../net/ethernet/marvell/mvpp2/mvpp2_main.c | 4 +-
.../netronome/nfp/nfpcore/nfp_cppcore.c | 4 +-
drivers/net/ethernet/qlogic/qede/qede_fp.c | 7 +
drivers/net/ethernet/qlogic/qla3xxx.c | 19 +-
drivers/net/usb/cdc_ncm.c | 2 +
drivers/net/vrf.c | 8 +-
drivers/pci/controller/pci-aardvark.c | 9 -
drivers/scsi/pm8001/pm8001_init.c | 6 +-
drivers/scsi/qla2xxx/qla_dbg.c | 3 +
drivers/scsi/scsi_debug.c | 2 +-
drivers/usb/core/config.c | 6 +-
drivers/usb/gadget/function/uvc.h | 2 +
drivers/usb/gadget/function/uvc_v4l2.c | 49 +-
drivers/usb/gadget/legacy/dbgp.c | 2 +-
drivers/usb/host/xhci-hub.c | 1 +
drivers/usb/host/xhci-ring.c | 1 -
drivers/usb/host/xhci.c | 26 +-
fs/aio.c | 184 ++++-
fs/btrfs/extent_io.c | 6 +
fs/btrfs/root-tree.c | 3 +-
fs/nfsd/nfs4recover.c | 1 +
fs/nfsd/nfs4state.c | 9 +-
fs/nfsd/nfsctl.c | 14 +-
fs/signalfd.c | 12 +-
fs/tracefs/inode.c | 76 +++
include/linux/bpf.h | 1 +
include/linux/hid.h | 5 +
include/linux/pm_runtime.h | 2 +-
include/linux/wait.h | 26 +
include/net/bond_alb.h | 2 +-
include/net/netfilter/nf_conntrack.h | 6 +-
include/uapi/asm-generic/poll.h | 2 +-
kernel/bpf/verifier.c | 2 +-
kernel/sched/wait.c | 7 +
mm/backing-dev.c | 7 +
net/core/devlink.c | 16 +-
net/core/neighbour.c | 3 +-
net/ipv4/udp.c | 2 +-
net/ipv6/seg6_iptunnel.c | 8 +
net/netfilter/nf_conntrack_core.c | 6 +-
net/netfilter/nf_conntrack_netlink.c | 2 +-
net/netfilter/nf_flow_table_core.c | 4 +-
net/netfilter/nft_set_pipapo_avx2.c | 2 +-
net/nfc/netlink.c | 6 +-
net/sched/sch_fq_pie.c | 1 +
scripts/dummy-tools/gcc | 10 +-
scripts/gcc-plugin.sh | 19 -
scripts/gcc-plugins/Kconfig | 2 +-
scripts/gcc-plugins/Makefile | 4 +-
sound/core/control_compat.c | 3 +
sound/core/oss/pcm_oss.c | 37 +-
sound/pci/hda/patch_realtek.c | 80 ++-
sound/soc/codecs/rt5682.c | 10 +-
sound/soc/codecs/wcd934x.c | 126 +++-
sound/soc/codecs/wsa881x.c | 16 +-
sound/soc/qcom/qdsp6/q6routing.c | 8 +-
tools/build/Makefile.feature | 1 -
tools/build/feature/Makefile | 4 -
tools/build/feature/test-all.c | 5 -
tools/build/feature/test-libpython-version.c | 11 -
tools/perf/Makefile.config | 2 -
tools/perf/util/smt.c | 2 +-
.../bpf/verifier/xdp_direct_packet_access.c | 632 +++++++++++++++++-
tools/testing/selftests/net/fib_tests.sh | 59 +-
tools/testing/selftests/netfilter/Makefile | 3 +-
.../selftests/netfilter/conntrack_vrf.sh | 241 +++++++
159 files changed, 2201 insertions(+), 681 deletions(-)
delete mode 100755 scripts/gcc-plugin.sh
delete mode 100644 tools/build/feature/test-libpython-version.c
create mode 100644 tools/testing/selftests/netfilter/conntrack_vrf.sh
--
2.20.1
1
131

14 Jan '22
From: Hangyu Hua <hbh25y(a)gmail.com>
mainline inclusion
from mainline-v5.16-rc6
commit 5f9562ebe710c307adc5f666bf1a2162ee7977c0
bugzilla: 185989 https://gitee.com/openeuler/kernel/issues/I4DDEL
CVE: CVE-2021-45480
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
__rds_conn_create() did not release conn->c_path when loop_trans != 0 and
trans->t_prefer_loopback != 0 and is_outgoing == 0.
Fixes: aced3ce57cd3 ("RDS tcp loopback connection can hang")
Signed-off-by: Hangyu Hua <hbh25y(a)gmail.com>
Reviewed-by: Sharath Srinivasan <sharath.srinivasan(a)oracle.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Baisong Zhong <zhongbaisong(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
net/rds/connection.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/net/rds/connection.c b/net/rds/connection.c
index a3bc4b54d491..b4cc699c5fad 100644
--- a/net/rds/connection.c
+++ b/net/rds/connection.c
@@ -253,6 +253,7 @@ static struct rds_connection *__rds_conn_create(struct net *net,
* should end up here, but if it
* does, reset/destroy the connection.
*/
+ kfree(conn->c_path);
kmem_cache_free(rds_conn_slab, conn);
conn = ERR_PTR(-EOPNOTSUPP);
goto out;
--
2.20.1
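A condensed, hypothetical illustration of the fixed error handling;
the kcalloc() allocation of conn->c_path follows mainline
__rds_conn_create(), but the function below is a simplified sketch
with made-up parameters, not a verbatim quote:
/* Sketch of the fixed error path -- not verbatim kernel code. */
static struct rds_connection *conn_create_sketch(int npaths, gfp_t gfp,
						 bool prefer_loopback,
						 bool is_outgoing)
{
	struct rds_connection *conn;
	conn = kmem_cache_zalloc(rds_conn_slab, gfp);
	if (!conn)
		return ERR_PTR(-ENOMEM);
	conn->c_path = kcalloc(npaths, sizeof(*conn->c_path), gfp);
	if (!conn->c_path) {
		kmem_cache_free(rds_conn_slab, conn);
		return ERR_PTR(-ENOMEM);
	}
	if (prefer_loopback && !is_outgoing) {
		/* Before the fix, c_path leaked on this path. */
		kfree(conn->c_path);
		kmem_cache_free(rds_conn_slab, conn);
		return ERR_PTR(-EOPNOTSUPP);
	}
	return conn;
}
Note the mixed allocators: conn comes from a slab cache and c_path
from kcalloc(), so each must be released with its matching free
function, as the one-line fix above does.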
1
2
Backport 5.10.84 LTS patches from upstream.
Aaro Koskinen (1):
i2c: cbus-gpio: set atomic transfer callback
Alain Volmat (3):
i2c: stm32f7: flush TX FIFO upon transfer errors
i2c: stm32f7: recover the bus on access timeout
i2c: stm32f7: stop dma transfer in case of NACK
Alexey Kardashevskiy (1):
powerpc/pseries/ddw: Revert "Extend upper limit for huge DMA window
for persistent memory"
Andreas Gruenbacher (1):
gfs2: Fix length of holes reported at end-of-file
Arnd Bergmann (1):
siphash: use _unaligned version by default
Badhri Jagan Sridharan (1):
usb: typec: tcpm: Wait in SNK_DEBOUNCED until disconnect
Baokun Li (2):
sata_fsl: fix UAF in sata_fsl_port_stop when rmmod sata_fsl
sata_fsl: fix warning in remove_proc_entry when rmmod sata_fsl
Benjamin Coddington (1):
NFSv42: Fix pagecache invalidation after COPY/CLONE
Benjamin Poirier (1):
net: mpls: Fix notifications when deleting a device
Bernard Zhao (1):
drm/amd/amdgpu: fix potential memleak
Bob Peterson (1):
gfs2: release iopen glock early in evict
Catalin Marinas (1):
KVM: arm64: Avoid setting the upper 32 bits of TCR_EL2 and CPTR_EL2 to
1
Christophe JAILLET (1):
net: marvell: mvpp2: Fix the computation of shared CPUs
Dan Carpenter (1):
KVM: VMX: Set failure code in prepare_vmcs02()
Dmitry Bogdanov (2):
atlantic: Increase delay for fw transactions
atlantic: Fix statistics logic for production hardware
Dongliang Mu (1):
dpaa2-eth: destroy workqueue at the end of remove function
Douglas Anderson (1):
drm/msm/a6xx: Allocate enough space for GMU registers
Dust Li (1):
net/smc: fix wrong list_del in smc_lgr_cleanup_early
Eiichi Tsukata (2):
rxrpc: Fix rxrpc_peer leak in rxrpc_look_up_bundle()
rxrpc: Fix rxrpc_local leak in rxrpc_lookup_peer()
Eric Dumazet (2):
net: annotate data-races on txq->xmit_lock_owner
ipv4: convert fib_num_tclassid_users to atomic_t
Feng Tang (2):
x86/tsc: Add a timer to make sure TSC_adjust is always checked
x86/tsc: Disable clocksource watchdog for TSC on qualified platorms
Gustavo A. R. Silva (1):
wireguard: ratelimiter: use kvcalloc() instead of kvzalloc()
Helge Deller (3):
parisc: Fix KBUILD_IMAGE for self-extracting kernel
parisc: Fix "make install" on newer debian releases
parisc: Mark cr16 CPU clocksource unstable on all SMP machines
Ian Rogers (2):
perf hist: Fix memory leak of a perf_hpp_fmt
perf report: Fix memory leaks around perf_tip()
Ioanna Alifieraki (1):
ipmi: Move remove_work to dedicated workqueue
Jason A. Donenfeld (6):
wireguard: selftests: increase default dmesg log size
wireguard: allowedips: add missing __rcu annotation to satisfy sparse
wireguard: selftests: actually test for routing loops
wireguard: device: reset peer src endpoint when netns exits
wireguard: receive: use ring buffer for incoming handshakes
wireguard: receive: drop handshakes if queue lock is contended
Jay Dolan (2):
serial: 8250_pci: Fix ACCES entries in pci_serial_quirks array
serial: 8250_pci: rewrite pericom_do_set_divisor()
Jimmy Wang (1):
platform/x86: thinkpad_acpi: Add support for dual fan control
Joerg Roedel (1):
x86/64/mm: Map all kernel memory into trampoline_pgd
Johan Hovold (1):
serial: core: fix transmit-buffer reset and memleak
Jordy Zomer (1):
ipv6: check return value of ipv6_skip_exthdr
Juergen Gross (1):
x86/pv: Switch SWAPGS to ALTERNATIVE
Julian Braha (1):
drm/sun4i: fix unmet dependency on RESET_CONTROLLER for
PHY_SUN6I_MIPI_DPHY
Lai Jiangshan (4):
KVM: X86: Use vcpu->arch.walk_mmu for kvm_mmu_invlpg()
x86/entry: Use the correct fence macro after swapgs in kernel CR3
x86/xen: Add xenpv_restore_regs_and_return_to_usermode()
x86/entry: Add a fence for kernel entry SWAPGS in paranoid_entry()
Li Zhijian (2):
wireguard: selftests: rename DEBUG_PI_LIST to DEBUG_PLIST
selftests: net: Correct case name
Like Xu (1):
KVM: x86/pmu: Fix reserved bits for AMD PerfEvtSeln register
Lorenzo Bianconi (1):
mt76: mt7915: fix NULL pointer dereference in mt7915_get_phy_mode
Lukas Wunner (1):
serial: 8250: Fix RTS modem control while in rs485 mode
Maciej W. Rozycki (1):
vgacon: Propagate console boot parameters before calling `vc_resize'
Manaf Meethalavalappu Pallikunhi (1):
thermal: core: Reset previous low and high trip during thermal zone
init
Mario Limonciello (2):
ata: ahci: Add Green Sardine vendor ID as board_ahci_mobile
ACPI: Add stubs for wakeup handler functions
Mark Rutland (1):
arm64: ftrace: add missing BTIs
Masami Hiramatsu (1):
kprobes: Limit max data_size of the kretprobe instances
Mathias Nyman (1):
xhci: Fix commad ring abort, write all 64 bits to CRCR register.
Michael Sterritt (1):
x86/sev: Fix SEV-ES INS/OUTS instructions for word, dword, and qword
Mike Christie (1):
scsi: iscsi: Unblock session then wake up error handler
Miklos Szeredi (2):
ovl: simplify file splice
ovl: fix deadlock in splice write
Mordechay Goodstein (1):
iwlwifi: mvm: retry init flow if failed
Nicholas Kazlauskas (1):
drm/amd/display: Allow DSC on supported MST branch devices
Nikita Danilov (2):
atlatnic: enable Nbase-t speeds with base-t
atlantic: Add missing DIDs and fix 115c.
Niklas Schnelle (1):
s390/pci: move pseudo-MMIO to prevent MIO overlap
Ole Ernst (1):
USB: NO_LPM quirk Lenovo Powered USB-C Travel Hub
Paolo Abeni (1):
tcp: fix page frag corruption on page fault
Paolo Bonzini (1):
KVM: x86: Use a stable condition around all VT-d PI paths
Patrik John (1):
serial: tegra: Change lower tolerance baud rate limit for tegra20 and
tegra30
Pierre Gondois (1):
serial: pl011: Add ACPI SBSA UART match id
Pierre-Louis Bossart (1):
ALSA: intel-dsp-config: add quirk for CML devices based on ES8336
codec
Qais Yousef (1):
sched/uclamp: Fix rq->uclamp_max not set on first enqueue
Randy Dunlap (1):
natsemi: xtensa: fix section mismatch warnings
Rob Clark (1):
drm/msm: Do hw_init() before capturing GPU state
Sameer Pujar (9):
ASoC: tegra: Fix wrong value type in ADMAIF
ASoC: tegra: Fix wrong value type in I2S
ASoC: tegra: Fix wrong value type in DMIC
ASoC: tegra: Fix wrong value type in DSPK
ASoC: tegra: Fix kcontrol put callback in ADMAIF
ASoC: tegra: Fix kcontrol put callback in I2S
ASoC: tegra: Fix kcontrol put callback in DMIC
ASoC: tegra: Fix kcontrol put callback in DSPK
ASoC: tegra: Fix kcontrol put callback in AHUB
Sameer Saurabh (3):
atlantic: Fix to display FW bundle version instead of FW mac version.
Remove Half duplex mode speed capabilities.
atlantic: Remove warn trace message.
Sean Christopherson (2):
KVM: Disallow user memslot with size that exceeds "unsigned long"
KVM: nVMX: Flush current VPID (L1 vs. L2) for KVM_REQ_TLB_FLUSH_GUEST
Slark Xiao (1):
platform/x86: thinkpad_acpi: Fix WWAN device disabled issue after S3
deep
Stanislaw Gruszka (1):
rt2x00: do not mark device gone on EPROTO errors during start
Stephen Suryaputra (1):
vrf: Reset IPCB/IP6CB when processing outbound pkts in vrf dev xmit
Steven Rostedt (VMware) (1):
tracing/histograms: String compares should not care about signed
values
Sven Eckelmann (1):
tty: serial: msm_serial: Deactivate RX DMA for polling support
Sven Schuchmann (1):
net: usb: lan78xx: lan78xx_phy_init(): use PHY_POLL instead of "0" if
no IRQ is available
Teng Qi (2):
ethernet: hisilicon: hns: hns_dsaf_misc: fix a possible array overflow
in hns_dsaf_ge_srst_by_port()
net: ethernet: dec: tulip: de4x5: fix possible array overflows in
type3_infoblock()
Tianjia Zhang (1):
net/tls: Fix authentication failure in CCM mode
Tony Lu (1):
net/smc: Keep smc_close_final rc during active close
Vasily Gorbik (1):
s390/setup: avoid using memblock_enforce_memory_limit
Wang Yugui (1):
btrfs: check-integrity: fix a warning on write caching disabled disk
Wei Yongjun (1):
ipmi: msghandler: Make symbol 'remove_work_wq' static
Wen Gu (2):
net/smc: Transfer remaining wait queue entries during fallback
net/smc: Avoid warning of possible recursive locking
William Kucharski (1):
net/rds: correct socket tunable error in rds_tcp_tune()
Xing Song (1):
mac80211: do not access the IV when it was stripped
Zhang Changzhong (1):
can: j1939: j1939_tp_cmd_recv(): check the dst address of TP.CM_BAM
Zhou Qingyang (2):
net: qlogic: qlcnic: Fix a NULL pointer dereference in
qlcnic_83xx_add_rings()
net/mlx4_en: Fix an use-after-free bug in
mlx4_en_try_alloc_resources()
liuguoqiang (1):
net: return correct error code
msizanoen1 (1):
ipv6: fix memory leak in fib6_rule_suppress
shaoyunl (1):
drm/amd/amdkfd: Fix kernel panic when reset failed and been triggered
again
zhangyue (1):
net: tulip: de4x5: fix the problem that the array 'lp->phy[8]' may be
out of bound
arch/arm64/include/asm/kvm_arm.h | 4 +-
arch/arm64/kernel/entry-ftrace.S | 6 +
arch/parisc/Makefile | 5 +
arch/parisc/install.sh | 1 +
arch/parisc/kernel/time.c | 24 +-
arch/powerpc/platforms/pseries/iommu.c | 9 -
arch/s390/include/asm/pci_io.h | 7 +-
arch/s390/kernel/setup.c | 3 -
arch/x86/entry/entry_64.S | 45 ++-
arch/x86/include/asm/irqflags.h | 20 +-
arch/x86/include/asm/paravirt.h | 20 --
arch/x86/include/asm/paravirt_types.h | 2 -
arch/x86/kernel/asm-offsets_64.c | 1 -
arch/x86/kernel/paravirt.c | 1 -
arch/x86/kernel/paravirt_patch.c | 3 -
arch/x86/kernel/sev-es.c | 57 ++--
arch/x86/kernel/tsc.c | 28 +-
arch/x86/kernel/tsc_sync.c | 41 +++
arch/x86/kvm/mmu/mmu.c | 2 +-
arch/x86/kvm/svm/pmu.c | 2 +-
arch/x86/kvm/vmx/nested.c | 4 +-
arch/x86/kvm/vmx/posted_intr.c | 20 +-
arch/x86/kvm/vmx/vmx.c | 23 +-
arch/x86/realmode/init.c | 12 +-
arch/x86/xen/enlighten_pv.c | 3 -
arch/x86/xen/xen-asm.S | 20 ++
drivers/ata/ahci.c | 1 +
drivers/ata/sata_fsl.c | 20 +-
drivers/char/ipmi/ipmi_msghandler.c | 13 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 1 +
.../drm/amd/amdkfd/kfd_device_queue_manager.c | 5 +
.../display/amdgpu_dm/amdgpu_dm_mst_types.c | 20 +-
drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 4 +-
drivers/gpu/drm/msm/msm_debugfs.c | 1 +
drivers/gpu/drm/sun4i/Kconfig | 1 +
drivers/i2c/busses/i2c-cbus-gpio.c | 5 +-
drivers/i2c/busses/i2c-stm32f7.c | 31 +-
.../ethernet/aquantia/atlantic/aq_common.h | 27 +-
.../net/ethernet/aquantia/atlantic/aq_hw.h | 2 +
.../net/ethernet/aquantia/atlantic/aq_nic.c | 10 +-
.../ethernet/aquantia/atlantic/aq_pci_func.c | 7 +-
.../net/ethernet/aquantia/atlantic/aq_vec.c | 3 -
.../aquantia/atlantic/hw_atl/hw_atl_utils.c | 15 +-
.../atlantic/hw_atl/hw_atl_utils_fw2x.c | 3 -
.../aquantia/atlantic/hw_atl2/hw_atl2.c | 22 +-
.../aquantia/atlantic/hw_atl2/hw_atl2.h | 2 +
.../aquantia/atlantic/hw_atl2/hw_atl2_utils.h | 38 ++-
.../atlantic/hw_atl2/hw_atl2_utils_fw.c | 110 +++++--
drivers/net/ethernet/dec/tulip/de4x5.c | 34 +-
.../net/ethernet/freescale/dpaa2/dpaa2-eth.c | 2 +
.../ethernet/hisilicon/hns/hns_dsaf_misc.c | 4 +
.../net/ethernet/marvell/mvpp2/mvpp2_main.c | 2 +-
.../net/ethernet/mellanox/mlx4/en_netdev.c | 9 +-
drivers/net/ethernet/natsemi/xtsonic.c | 2 +-
.../ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c | 10 +-
drivers/net/usb/lan78xx.c | 2 +-
drivers/net/vrf.c | 2 +
drivers/net/wireguard/allowedips.c | 2 +-
drivers/net/wireguard/device.c | 39 +--
drivers/net/wireguard/device.h | 9 +-
drivers/net/wireguard/queueing.c | 6 +-
drivers/net/wireguard/queueing.h | 2 +-
drivers/net/wireguard/ratelimiter.c | 4 +-
drivers/net/wireguard/receive.c | 39 ++-
drivers/net/wireguard/socket.c | 2 +-
drivers/net/wireless/intel/iwlwifi/iwl-drv.c | 22 +-
drivers/net/wireless/intel/iwlwifi/iwl-drv.h | 3 +
.../net/wireless/intel/iwlwifi/mvm/mac80211.c | 24 +-
drivers/net/wireless/intel/iwlwifi/mvm/mvm.h | 3 +
drivers/net/wireless/intel/iwlwifi/mvm/ops.c | 3 +
.../net/wireless/mediatek/mt76/mt7915/mcu.c | 4 +-
.../net/wireless/ralink/rt2x00/rt2x00usb.c | 3 +
drivers/platform/x86/thinkpad_acpi.c | 13 +-
drivers/scsi/scsi_transport_iscsi.c | 6 +-
drivers/thermal/thermal_core.c | 2 +
drivers/tty/serial/8250/8250_pci.c | 39 ++-
drivers/tty/serial/8250/8250_port.c | 7 -
drivers/tty/serial/amba-pl011.c | 1 +
drivers/tty/serial/msm_serial.c | 3 +
drivers/tty/serial/serial-tegra.c | 4 +-
drivers/tty/serial/serial_core.c | 18 +-
drivers/usb/core/quirks.c | 3 +
drivers/usb/host/xhci-ring.c | 21 +-
drivers/usb/typec/tcpm/tcpm.c | 4 -
drivers/video/console/vgacon.c | 14 +-
fs/btrfs/disk-io.c | 14 +-
fs/gfs2/bmap.c | 2 +-
fs/gfs2/super.c | 14 +-
fs/nfs/nfs42proc.c | 5 +-
fs/overlayfs/file.c | 59 ++--
include/linux/acpi.h | 9 +
include/linux/kprobes.h | 2 +
include/linux/netdevice.h | 19 +-
include/linux/siphash.h | 14 +-
include/net/dst_cache.h | 11 +
include/net/fib_rules.h | 4 +-
include/net/ip_fib.h | 2 +-
include/net/netns/ipv4.h | 2 +-
include/net/sock.h | 13 +-
kernel/kprobes.c | 3 +
kernel/sched/core.c | 2 +-
kernel/trace/trace_events_hist.c | 2 +-
lib/siphash.c | 12 +-
net/can/j1939/transport.c | 6 +
net/core/dev.c | 5 +-
net/core/dst_cache.c | 19 ++
net/core/fib_rules.c | 2 +-
net/ipv4/devinet.c | 2 +-
net/ipv4/fib_frontend.c | 2 +-
net/ipv4/fib_rules.c | 5 +-
net/ipv4/fib_semantics.c | 4 +-
net/ipv6/esp6.c | 6 +
net/ipv6/fib6_rules.c | 4 +-
net/mac80211/rx.c | 3 +-
net/mpls/af_mpls.c | 68 +++-
net/rds/tcp.c | 2 +-
net/rxrpc/conn_client.c | 14 +-
net/rxrpc/peer_object.c | 14 +-
net/smc/af_smc.c | 14 +
net/smc/smc_close.c | 8 +-
net/smc/smc_core.c | 7 +-
net/tls/tls_sw.c | 4 +-
sound/hda/intel-dsp-config.c | 10 +
sound/soc/tegra/tegra186_dspk.c | 181 +++++++++--
sound/soc/tegra/tegra210_admaif.c | 140 +++++++--
sound/soc/tegra/tegra210_ahub.c | 11 +-
sound/soc/tegra/tegra210_dmic.c | 184 ++++++++---
sound/soc/tegra/tegra210_i2s.c | 296 +++++++++++++-----
tools/perf/builtin-report.c | 15 +-
tools/perf/ui/hist.c | 28 +-
tools/perf/util/hist.h | 1 -
tools/perf/util/util.c | 14 +-
tools/perf/util/util.h | 2 +-
tools/testing/selftests/net/fcnal-test.sh | 4 +-
tools/testing/selftests/wireguard/netns.sh | 30 +-
.../selftests/wireguard/qemu/debug.config | 2 +-
.../selftests/wireguard/qemu/kernel.config | 1 +
virt/kvm/kvm_main.c | 3 +-
138 files changed, 1694 insertions(+), 677 deletions(-)
--
2.20.1
1
121
Backport 5.10.83 LTS patches from upstream.
Adrian Hunter (1):
mmc: sdhci: Fix ADMA for PAGE_SIZE >= 64KiB
Albert Wang (1):
usb: dwc3: gadget: Fix null pointer exception
Alex Deucher (1):
drm/amdgpu/gfx9: switch to golden tsc registers for renoir+
Alexander Aring (1):
net: ieee802154: handle iftypes as u32
Alexander Mikhalitsyn (1):
shm: extend forced shm destroy to support objects from several IPC
nses
Amit Cohen (1):
mlxsw: spectrum: Protect driver from buggy firmware
Arjun Roy (1):
tcp: correctly handle increased zerocopy args struct size
Christophe Leroy (1):
powerpc/32: Fix hardlockup on vmap stack overflow
Dan Carpenter (4):
usb: chipidea: ci_hdrc_imx: fix potential error pointer dereference in
probe
staging: rtl8192e: Fix use after free in _rtl92e_pci_disconnect()
drm/nouveau/acr: fix a couple NULL vs IS_ERR() checks
drm/vc4: fix error code in vc4_create_object()
Daniele Palmas (1):
USB: serial: option: add Telit LE910S1 0x9200 composition
Danielle Ratson (1):
mlxsw: Verify the accessed index doesn't exceed the array length
David Hildenbrand (2):
proc/vmcore: fix clearing user buffer by properly using clear_user()
s390/mm: validate VMA in PGSTE manipulation functions
Davide Caratti (1):
net/sched: sch_ets: don't peek at classes beyond 'nbands'
Diana Wang (1):
nfp: checking parameter process for rx-usecs/tx-usecs is invalid
Dylan Hung (1):
mdio: aspeed: Fix "Link is Down" issue
Eric Dumazet (3):
mptcp: fix delack timer
ipv6: fix typos in __ip6_finish_output()
tcp_cubic: fix spurious Hystart ACK train detections for
not-cwnd-limited flows
Florent Fourcot (2):
netfilter: ctnetlink: fix filtering with CTA_TUPLE_REPLY
netfilter: ctnetlink: do not erase error code with EINVAL
Florian Fainelli (3):
ARM: dts: BCM5301X: Fix I2C controller interrupt
ARM: dts: BCM5301X: Add interrupt properties to GPIO node
ARM: dts: bcm2711: Fix PCIe interrupts
Guo DaXing (1):
net/smc: Fix loop in smc_listen
Hans Verkuil (1):
media: cec: copy sequence field for the reply
Heiner Kallweit (1):
lan743x: fix deadlock in lan743x_phy_link_status_change()
Helge Deller (1):
Revert "parisc: Fix backtrace to always include init funtion names"
Holger Assmann (1):
net: stmmac: retain PTP clock time during SIOCSHWTSTAMP ioctls
Huang Jianan (1):
erofs: fix deadlock when shrink erofs slab
Huang Pei (2):
MIPS: loongson64: fix FTLB configuration
MIPS: use 3-level pgtable for 64KB page size on MIPS_VA_BITS_48
Jakub Kicinski (2):
tls: splice_read: fix record type check
tls: fix replacing proto_ops
Jason Gerecke (1):
HID: wacom: Use "Confidence" flag to prevent reporting invalid
contacts
Jeff Layton (1):
ceph: properly handle statfs on multifs setups
Jesse Brandeburg (1):
igb: fix netpoll exit with traffic
Jiri Olsa (1):
tracing/uprobe: Fix uprobe_perf_open probes iteration
Joakim Zhang (2):
net: stmmac: fix system hang caused by eee_ctrl_timer during
suspend/resume
net: stmmac: platform: fix build warning when with !CONFIG_PM_SLEEP
Joerg Roedel (1):
iommu/amd: Clarify AMD IOMMUv2 initialization messages
Juergen Gross (9):
xen: sync include/xen/interface/io/ring.h with Xen's newest version
xen/blkfront: read response from backend only once
xen/blkfront: don't take local copy of a request from the ring page
xen/blkfront: don't trust the backend response data blindly
xen/netfront: read response from backend only once
xen/netfront: don't read data from request on the ring page
xen/netfront: disentangle tx_skb_freelist
xen/netfront: don't trust the backend response data blindly
tty: hvc: replace BUG_ON() with negative return value
Karsten Graul (1):
net/smc: Fix NULL pointer dereferencing in smc_vlan_by_tcpsk()
Kumar Thangavel (1):
net/ncsi : Add payload to be 32-bit aligned to fix dropped packets
Maciej Fijalkowski (1):
ice: fix vsi->txq_map sizing
Marek Behún (2):
PCI: aardvark: Deduplicate code in advk_pcie_rd_conf()
net: marvell: mvpp2: increase MTU limit when XDP enabled
Mark Rutland (1):
sched/scs: Reset task stack state in bringup_cpu()
Marta Plantykow (1):
ice: avoid bpf_prog refcount underflow
Mathias Nyman (2):
usb: hub: Fix usb enumeration issue due to address0 race
usb: hub: Fix locking issues with address0_mutex
Maurizio Lombardi (1):
nvmet: use IOCB_NOWAIT only if the filesystem supports it
Michael Kelley (1):
firmware: smccc: Fix check for ARCH_SOC_ID not implemented
Mike Christie (1):
scsi: core: sysfs: Fix setting device state to SDEV_RUNNING
Miklos Szeredi (1):
fuse: release pipe buf after last use
Minas Harutyunyan (1):
usb: dwc2: gadget: Fix ISOC flow for elapsed frames
Mingjie Zhang (1):
USB: serial: option: add Fibocom FM101-GL variants
Nathan Chancellor (1):
usb: dwc2: hcd_queue: Fix use of floating point literal
Nicholas Kazlauskas (1):
drm/amd/display: Set plane update flags for all planes in reset
Nicholas Piggin (1):
KVM: PPC: Book3S HV: Prevent POWER7/8 TLB flush flushing SLB
Nikolay Aleksandrov (3):
net: nexthop: fix null pointer dereference when IPv6 is not enabled
net: ipv6: add fib6_nh_release_dsts stub
net: nexthop: release IPv6 per-cpu dsts when replacing a nexthop group
Nitesh B Venkatesh (1):
iavf: Prevent changing static ITR values if adaptive moderation is on
Noralf Trønnes (1):
staging/fbtft: Fix backlight
Ondrej Jirman (1):
usb: typec: fusb302: Fix masking of comparator and bc_lvl interrupts
Pali Rohár (4):
PCI: aardvark: Update comment about disabling link training
PCI: aardvark: Implement re-issuing config requests on CRS response
PCI: aardvark: Simplify initialization of rootcap on virtual bridge
PCI: aardvark: Fix link training
Peng Fan (1):
firmware: arm_scmi: pm: Propagate return value to caller
Pierre-Louis Bossart (1):
ALSA: intel-dsp-config: add quirk for JSL devices based on ES8336
codec
Russell King (Oracle) (2):
net: phylink: Force link down and retrigger resolve on interface
change
net: phylink: Force retrigger in case of latched link-fail indicator
Sakari Ailus (1):
ACPI: Get acpi_device's parent from the parent field
Shin'ichiro Kawasaki (1):
scsi: scsi_debug: Zero clear zones at reset write pointer
Sreekanth Reddy (1):
scsi: mpt3sas: Fix kernel panic during drive powercycle test
Srinivas Kandagatla (3):
ASoC: qdsp6: q6routing: Conditionally reset FrontEnd Mixer
ASoC: qdsp6: q6asm: fix q6asm_dai_prepare error handling
ASoC: codecs: wcd934x: return error code correctly from hw_params
Stefano Garzarella (1):
vhost/vsock: fix incorrect used length reported to the guest
Stefano Stabellini (2):
xen: don't continue xenstore initialization in case of errors
xen: detect uninitialized xenbus in xenbus_init
Steve French (1):
smb3: do not error on fsync when readonly
Steven Rostedt (VMware) (2):
tracing: Fix pid filtering when triggers are attached
tracing: Check pid filtering when creating events
Takashi Iwai (5):
ALSA: ctxfi: Fix out-of-range access
ALSA: hda/realtek: Fix LED on HP ProBook 435 G7
staging: greybus: Add missing rwsem around snd_ctl_remove() calls
ASoC: topology: Add missing rwsem around snd_ctl_remove() calls
ARM: socfpga: Fix crash with CONFIG_FORTIRY_SOURCE
Thinh Nguyen (2):
usb: dwc3: gadget: Ignore NoStream after End Transfer
usb: dwc3: gadget: Check for L1/L2/U3 for Start Transfer
Thomas Zeitlhofer (1):
PM: hibernate: use correct mode for swsusp_close()
Tim Harvey (1):
mmc: sdhci-esdhc-imx: disable CMDQ support
Todd Kjos (1):
binder: fix test regression due to sender_euid change
Tony Lu (2):
net/smc: Ensure the active closing peer first closes clcsock
net/smc: Don't call clcsock shutdown twice when smc shutdown
Trond Myklebust (1):
NFSv42: Don't fail clone() unless the OP_CLONE operation failed
Varun Prakash (1):
nvmet-tcp: fix incomplete data digest send
Vladimir Oltean (2):
net: mscc: ocelot: don't downgrade timestamping RX filters in
SIOCSHWTSTAMP
net: mscc: ocelot: correctly report the timestamping RX filters in
ethtool
Volodymyr Mytnyk (1):
net: marvell: prestera: fix double free issue on err path
Weichao Guo (1):
f2fs: set SBI_NEED_FSCK flag when inconsistent node block found
Werner Sembach (1):
ALSA: hda/realtek: Add quirk for ASRock NUC Box 1100
Will Mortensen (1):
netfilter: flowtable: fix IPv6 tunnel addr match
Ziyang Xuan (1):
net: vlan: fix underflow for the real_dev refcnt
yangxingwu (1):
netfilter: ipvs: Fix reuse connection if RS weight is 0
Documentation/networking/ipvs-sysctl.rst | 3 +-
arch/arm/boot/dts/bcm2711.dtsi | 8 +-
arch/arm/boot/dts/bcm5301x.dtsi | 4 +-
arch/arm/mach-socfpga/core.h | 2 +-
arch/arm/mach-socfpga/platsmp.c | 8 +-
arch/mips/Kconfig | 2 +-
arch/mips/kernel/cpu-probe.c | 4 +-
arch/parisc/kernel/vmlinux.lds.S | 3 +-
arch/powerpc/kernel/head_32.h | 6 +-
arch/powerpc/kvm/book3s_hv_builtin.c | 5 +-
arch/s390/mm/pgtable.c | 13 +
drivers/acpi/property.c | 11 +-
drivers/android/binder.c | 2 +-
drivers/block/xen-blkfront.c | 126 +++++---
drivers/firmware/arm_scmi/scmi_pm_domain.c | 4 +-
drivers/firmware/smccc/soc_id.c | 2 +-
drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 46 ++-
.../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 4 +-
.../gpu/drm/nouveau/nvkm/subdev/acr/gm200.c | 6 +-
.../gpu/drm/nouveau/nvkm/subdev/acr/gp102.c | 6 +-
drivers/gpu/drm/vc4/vc4_bo.c | 2 +-
drivers/hid/wacom_wac.c | 8 +-
drivers/hid/wacom_wac.h | 1 +
drivers/iommu/amd/iommu_v2.c | 6 +-
drivers/media/cec/core/cec-adap.c | 1 +
drivers/mmc/host/sdhci-esdhc-imx.c | 2 -
drivers/mmc/host/sdhci.c | 21 +-
drivers/mmc/host/sdhci.h | 4 +-
.../net/ethernet/intel/iavf/iavf_ethtool.c | 30 +-
drivers/net/ethernet/intel/ice/ice_lib.c | 9 +-
drivers/net/ethernet/intel/ice/ice_main.c | 18 +-
drivers/net/ethernet/intel/igb/igb_main.c | 2 +-
.../net/ethernet/marvell/mvpp2/mvpp2_main.c | 14 +-
.../marvell/prestera/prestera_switchdev.c | 6 +-
drivers/net/ethernet/mellanox/mlxsw/minimal.c | 4 +
.../net/ethernet/mellanox/mlxsw/spectrum.c | 5 +
.../ethernet/mellanox/mlxsw/spectrum_ptp.c | 3 +
.../ethernet/mellanox/mlxsw/spectrum_router.c | 3 +
.../mellanox/mlxsw/spectrum_switchdev.c | 4 +
drivers/net/ethernet/microchip/lan743x_main.c | 12 +-
drivers/net/ethernet/mscc/ocelot.c | 11 +-
drivers/net/ethernet/netronome/nfp/nfp_net.h | 3 -
.../ethernet/netronome/nfp/nfp_net_ethtool.c | 2 +-
drivers/net/ethernet/stmicro/stmmac/stmmac.h | 1 +
.../net/ethernet/stmicro/stmmac/stmmac_main.c | 139 +++++----
.../ethernet/stmicro/stmmac/stmmac_platform.c | 44 +++
drivers/net/mdio/mdio-aspeed.c | 7 +
drivers/net/phy/phylink.c | 26 +-
drivers/net/xen-netfront.c | 272 ++++++++++-------
drivers/nvme/target/io-cmd-file.c | 4 +-
drivers/nvme/target/tcp.c | 7 +-
drivers/pci/controller/pci-aardvark.c | 235 +++++++--------
drivers/scsi/mpt3sas/mpt3sas_scsih.c | 2 +-
drivers/scsi/scsi_debug.c | 5 +
drivers/scsi/scsi_sysfs.c | 2 +-
drivers/staging/fbtft/fb_ssd1351.c | 4 -
drivers/staging/fbtft/fbtft-core.c | 9 +-
drivers/staging/greybus/audio_helper.c | 8 +-
drivers/staging/rtl8192e/rtl8192e/rtl_core.c | 3 +-
drivers/tty/hvc/hvc_xen.c | 17 +-
drivers/usb/chipidea/ci_hdrc_imx.c | 18 +-
drivers/usb/core/hub.c | 24 +-
drivers/usb/dwc2/gadget.c | 17 +-
drivers/usb/dwc2/hcd_queue.c | 2 +-
drivers/usb/dwc3/gadget.c | 39 ++-
drivers/usb/serial/option.c | 5 +
drivers/usb/typec/tcpm/fusb302.c | 6 +-
drivers/vhost/vsock.c | 2 +-
drivers/xen/xenbus/xenbus_probe.c | 27 +-
fs/ceph/super.c | 11 +-
fs/cifs/file.c | 35 ++-
fs/erofs/utils.c | 8 +-
fs/f2fs/node.c | 1 +
fs/fuse/dev.c | 10 +-
fs/nfs/nfs42xdr.c | 3 +-
fs/proc/vmcore.c | 16 +-
include/linux/ipc_namespace.h | 15 +
include/linux/sched/task.h | 2 +-
include/net/ip6_fib.h | 1 +
include/net/ipv6_stubs.h | 1 +
include/net/nl802154.h | 7 +-
include/xen/interface/io/ring.h | 278 ++++++++++--------
ipc/shm.c | 189 +++++++++---
kernel/cpu.c | 7 +
kernel/power/hibernate.c | 6 +-
kernel/sched/core.c | 4 -
kernel/trace/trace.h | 24 +-
kernel/trace/trace_events.c | 10 +
kernel/trace/trace_uprobe.c | 1 +
net/8021q/vlan.c | 3 -
net/8021q/vlan_dev.c | 3 +
net/ipv4/nexthop.c | 35 ++-
net/ipv4/tcp.c | 4 +-
net/ipv4/tcp_cubic.c | 5 +-
net/ipv6/af_inet6.c | 1 +
net/ipv6/ip6_output.c | 2 +-
net/ipv6/route.c | 19 ++
net/mptcp/options.c | 3 +-
net/ncsi/ncsi-cmd.c | 24 +-
net/netfilter/ipvs/ip_vs_core.c | 8 +-
net/netfilter/nf_conntrack_netlink.c | 6 +-
net/netfilter/nf_flow_table_offload.c | 4 +-
net/sched/sch_ets.c | 8 +-
net/smc/af_smc.c | 12 +-
net/smc/smc_close.c | 6 +
net/smc/smc_core.c | 35 +--
net/tls/tls_main.c | 47 ++-
net/tls/tls_sw.c | 23 +-
sound/hda/intel-dsp-config.c | 9 +
sound/pci/ctxfi/ctamixer.c | 14 +-
sound/pci/ctxfi/ctdaio.c | 16 +-
sound/pci/ctxfi/ctresource.c | 7 +-
sound/pci/ctxfi/ctresource.h | 4 +-
sound/pci/ctxfi/ctsrc.c | 7 +-
sound/pci/hda/patch_realtek.c | 28 ++
sound/soc/codecs/wcd934x.c | 3 +-
sound/soc/qcom/qdsp6/q6asm-dai.c | 19 +-
sound/soc/qcom/qdsp6/q6routing.c | 6 +-
sound/soc/soc-topology.c | 3 +
119 files changed, 1539 insertions(+), 815 deletions(-)
--
2.20.1
1
119
[PATCH openEuler-5.10 01/16] USB: gadget: detect too-big endpoint 0 requests
by Zheng Zengkai 14 Jan '22
by Zheng Zengkai 14 Jan '22
14 Jan '22
From: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
stable inclusion
from stable-v5.10.85
commit 7193ad3e50e596ac2192531c58ba83b9e6d2444b
bugzilla: 185937 https://gitee.com/openeuler/kernel/issues/I4DDEL
CVE: CVE-2021-39685
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
Sometimes USB hosts can ask for buffers that are too large from endpoint
0, which should not be allowed. If this happens for OUT requests, stall
the endpoint, but for IN requests, trim the request size to the endpoint
buffer size.
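A minimal sketch of that rule follows (the ep0_clamp_wlength() helper and its buffer-size parameter are illustrative only, not part of the patch; each gadget below open-codes the same check against its own ep0 buffer size):
#include <linux/errno.h>
#include <linux/usb/ch9.h>
static int ep0_clamp_wlength(struct usb_ctrlrequest *ctrl, u16 bufsiz)
{
	u16 w_length = le16_to_cpu(ctrl->wLength);
	if (w_length <= bufsiz)
		return 0;
	/* OUT request too big: report an error so the caller stalls ep0 */
	if (ctrl->bRequestType == USB_DIR_OUT)
		return -EOVERFLOW;
	/* IN request: trim to what the ep0 buffer can hold */
	ctrl->wLength = cpu_to_le16(bufsiz);
	return 0;
}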
Co-developed-by: Szymon Heidrich <szymon.heidrich(a)gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
drivers/usb/gadget/composite.c | 12 ++++++++++++
drivers/usb/gadget/legacy/dbgp.c | 13 +++++++++++++
drivers/usb/gadget/legacy/inode.c | 16 +++++++++++++++-
3 files changed, 40 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/gadget/composite.c b/drivers/usb/gadget/composite.c
index a8704e6498ab..426132988512 100644
--- a/drivers/usb/gadget/composite.c
+++ b/drivers/usb/gadget/composite.c
@@ -1648,6 +1648,18 @@ composite_setup(struct usb_gadget *gadget, const struct usb_ctrlrequest *ctrl)
struct usb_function *f = NULL;
u8 endp;
+ if (w_length > USB_COMP_EP0_BUFSIZ) {
+ if (ctrl->bRequestType == USB_DIR_OUT) {
+ goto done;
+ } else {
+ /* Cast away the const, we are going to overwrite on purpose. */
+ __le16 *temp = (__le16 *)&ctrl->wLength;
+
+ *temp = cpu_to_le16(USB_COMP_EP0_BUFSIZ);
+ w_length = USB_COMP_EP0_BUFSIZ;
+ }
+ }
+
/* partial re-init of the response message; the function or the
* gadget might need to intercept e.g. a control-OUT completion
* when we delegate to it.
diff --git a/drivers/usb/gadget/legacy/dbgp.c b/drivers/usb/gadget/legacy/dbgp.c
index e1d566c9918a..e567afcb2794 100644
--- a/drivers/usb/gadget/legacy/dbgp.c
+++ b/drivers/usb/gadget/legacy/dbgp.c
@@ -345,6 +345,19 @@ static int dbgp_setup(struct usb_gadget *gadget,
void *data = NULL;
u16 len = 0;
+ if (length > DBGP_REQ_LEN) {
+ if (ctrl->bRequestType == USB_DIR_OUT) {
+ return err;
+ } else {
+ /* Cast away the const, we are going to overwrite on purpose. */
+ __le16 *temp = (__le16 *)&ctrl->wLength;
+
+ *temp = cpu_to_le16(DBGP_REQ_LEN);
+ length = DBGP_REQ_LEN;
+ }
+ }
+
+
if (request == USB_REQ_GET_DESCRIPTOR) {
switch (value>>8) {
case USB_DT_DEVICE:
diff --git a/drivers/usb/gadget/legacy/inode.c b/drivers/usb/gadget/legacy/inode.c
index 71e7d10dd76b..04b9c4f5f129 100644
--- a/drivers/usb/gadget/legacy/inode.c
+++ b/drivers/usb/gadget/legacy/inode.c
@@ -110,6 +110,8 @@ enum ep0_state {
/* enough for the whole queue: most events invalidate others */
#define N_EVENT 5
+#define RBUF_SIZE 256
+
struct dev_data {
spinlock_t lock;
refcount_t count;
@@ -144,7 +146,7 @@ struct dev_data {
struct dentry *dentry;
/* except this scratch i/o buffer for ep0 */
- u8 rbuf [256];
+ u8 rbuf[RBUF_SIZE];
};
static inline void get_dev (struct dev_data *data)
@@ -1333,6 +1335,18 @@ gadgetfs_setup (struct usb_gadget *gadget, const struct usb_ctrlrequest *ctrl)
u16 w_value = le16_to_cpu(ctrl->wValue);
u16 w_length = le16_to_cpu(ctrl->wLength);
+ if (w_length > RBUF_SIZE) {
+ if (ctrl->bRequestType == USB_DIR_OUT) {
+ return value;
+ } else {
+ /* Cast away the const, we are going to overwrite on purpose. */
+ __le16 *temp = (__le16 *)&ctrl->wLength;
+
+ *temp = cpu_to_le16(RBUF_SIZE);
+ w_length = RBUF_SIZE;
+ }
+ }
+
spin_lock (&dev->lock);
dev->setup_abort = 0;
if (dev->state == STATE_DEV_UNCONNECTED) {
--
2.20.1
1
15
Backport LTS 5.10.82 patches from upstream.
Adrian Hunter (2):
scsi: ufs: core: Fix task management completion
scsi: ufs: core: Fix task management completion timeout race
Alex Elder (1):
net: ipa: disable HOLB drop when updating timer
Alexander Antonov (2):
perf/x86/intel/uncore: Fix filter_tid mask for CHA events on Skylake
Server
perf/x86/intel/uncore: Fix IIO event constraints for Skylake Server
Alexander Mikhalitsyn (1):
ipc: WARN if trying to remove ipc object which is absent
Alistair Delva (1):
block: Check ADMIN before NICE for IOPRIO_CLASS_RT
Alvin Lee (1):
drm/amd/display: Update swizzle mode enums
Amit Kumar Mahapatra (1):
arm64: zynqmp: Do not duplicate flash partition label property
Anatolij Gustschin (1):
powerpc/5200: dts: fix memory node unit name
AngeloGioacchino Del Regno (1):
arm64: dts: qcom: msm8998: Fix CPU/L2 idle state latency and residency
Arjun Roy (3):
net-zerocopy: Copy straggler unaligned data for TCP Rx. zerocopy.
net-zerocopy: Refactor skb frag fast-forward op.
tcp: Fix uninitialized access in skb frags array for Rx 0cp.
Baoquan He (1):
s390/kexec: fix memory leak of ipl report buffer
Bart Van Assche (1):
MIPS: sni: Fix the build
Bjorn Andersson (1):
pinctrl: qcom: sdm845: Enable dual edge errata
Bongsu Jeon (1):
net: nfc: nci: Change the NCI close sequence
Chao Yu (1):
f2fs: fix incorrect return value in f2fs_sanity_check_ckpt()
Chengfeng Ye (1):
ALSA: gus: fix null pointer dereference on pointer block
Christian Lamparter (1):
ARM: BCM53016: Specify switch ports for Meraki MR32
Christophe JAILLET (1):
platform/x86: hp_accel: Fix an error handling path in
'lis3lv02d_probe()'
Christophe Leroy (2):
powerpc/8xx: Fix Oops with STRICT_KERNEL_RWX without DEBUG_RODATA_TEST
powerpc/8xx: Fix pinned TLBs with CONFIG_STRICT_KERNEL_RWX
Colin Ian King (1):
MIPS: generic/yamon-dt: fix uninitialized variable error
David Heidelberg (1):
ARM: dts: qcom: fix memory and mdio nodes naming for RB3011
Dmitry Baryshkov (1):
clk: qcom: gcc-msm8996: Drop (again) gcc_aggre1_pnoc_ahb_clk
Eric Dumazet (1):
net: reduce indentation level in sk_clone_lock()
Eryk Rybak (3):
i40e: Fix correct max_pkt_size on VF RX queue
i40e: Fix changing previously set num_queue_pairs for PFs
i40e: Fix ping is lost after configuring ADq on VF
Ewan D. Milne (1):
scsi: qla2xxx: Fix mailbox direction flags in qla2xxx_get_adapter_id()
Fabio Aiuto (1):
staging: rtl8723bs: remove possible deadlock when disconnect (v2)
Gao Xiang (1):
f2fs: fix up f2fs_lookup tracepoints
Grzegorz Szczurek (2):
iavf: Fix for setting queues to 0
i40e: Fix display error code in dmesg
Guanghui Feng (1):
tty: tty_buffer: Fix the softlockup issue in flush_to_ldisc
Guo Zhi (1):
scsi: advansys: Fix kernel pointer leak
Hans Verkuil (1):
drm/nouveau: hdmigv100.c: fix corrupted HDMI Vendor InfoFrame
Hans de Goede (1):
ASoC: nau8824: Add DMI quirk mechanism for active-high jack-detect
Heiko Carstens (1):
s390/kexec: fix return code handling
Hyeong-Jun Kim (1):
f2fs: compress: disallow disabling compress on non-empty compressed
file
Ian Rogers (1):
perf bpf: Avoid memory leak from perf_env__insert_btf()
Imre Deak (1):
drm/i915/dp: Ensure sink rate values are always valid
Jacob Keller (1):
iavf: prevent accidental free of filter structure
James Clark (1):
perf tests: Remove bash construct from record+zstd_comp_decomp.sh
James Smart (1):
scsi: lpfc: Fix list_add() corruption in lpfc_drain_txq()
Jan Kara (1):
udf: Fix crash after seekdir
Jedrzej Jagielski (1):
i40e: Fix creation of first queue by omitting it if is not power of
two
Jesse Brandeburg (1):
e100: fix device suspend/resume
Joel Stanley (1):
clk/ast2600: Fix soc revision for AHB
Johan Hovold (1):
drm/udl: fix control-message timeout
Jonathan Davies (1):
net: virtio_net_hdr_to_skb: count transport header in UFO
Josef Bacik (2):
fs: export an inode_update_time helper
btrfs: update device path inode time instead of bd_inode
Jérôme Pouiller (1):
staging: wfx: ensure IRQ is ready before enabling it
Karen Sornek (1):
i40e: Fix warning message and call stack during rmmod i40e driver
Keoseong Park (1):
f2fs: fix to use WHINT_MODE
Leon Romanovsky (2):
RDMA/netlink: Add __maybe_unused to static inline in C file
ice: Delete always true check of PF pointer
Li Yang (2):
ARM: dts: ls1021a: move thermal-zones node out of soc/
ARM: dts: ls1021a-tsn: use generic "jedec,spi-nor" compatible for
flash
Like Xu (1):
perf/x86/vlbr: Add c->flags to vlbr event constraints
Lin Ma (3):
NFC: reorganize the functions in nci_request
NFC: reorder the logic in nfc_{un,}register_device
NFC: add NCI_UNREG flag to eliminate the race
Linus Walleij (1):
ARM: dts: ux500: Skomer regulator fixes
Lu Wei (1):
maple: fix wrong return value of maple_bus_init().
Luis Chamberlain (1):
firmware_loader: fix pre-allocated buf built-in firmware use
Maher Sanalla (1):
net/mlx5: Lag, update tracker when state change event received
Masami Hiramatsu (1):
tracing/histogram: Do not copy the fixed-size char array field over
the field size
Mateusz Palczewski (1):
iavf: Fix return of set the new channel count
Matthew Hagan (1):
ARM: dts: NSP: Fix mpcore, mmc node names
Matthias Brugger (1):
arm64: dts: rockchip: Disable CDN DP on Pinebook Pro
Maxim Levitsky (1):
KVM: nVMX: don't use vcpu->arch.efer when checking host state on
nested state load
Maxime Ripard (3):
ARM: dts: sunxi: Fix OPPs node name
arm64: dts: allwinner: h5: Fix GPU thermal zone node name
arm64: dts: allwinner: a100: Fix thermal zone node name
Meng Li (1):
net: stmmac: socfpga: add runtime suspend/resume callback for
stratix10 platform
Michael Ellerman (2):
powerpc/dcr: Use cmplwi instead of 3-argument cmpli
KVM: PPC: Book3S HV: Use GLOBAL_TOC for kvmppc_h_set_dabr/xdabr()
Michael Walle (2):
arm64: dts: hisilicon: fix arm,sp805 compatible string
arm64: dts: freescale: fix arm,sp805 compatible string
Michal Maloszewski (1):
i40e: Fix NULL ptr dereference on VSI filter sync
Michal Simek (1):
arm64: zynqmp: Fix serial compatible string
Mike Christie (3):
scsi: target: Fix ordered tag handling
scsi: target: Fix alua_tg_pt_gps_count tracking
scsi: core: sysfs: Fix hang when device state is set via sysfs
Mitch Williams (1):
iavf: validate pointers
Nathan Chancellor (2):
hexagon: export raw I/O routines for modules
hexagon: clean up timer-regs.h
Nguyen Dinh Phi (1):
cfg80211: call cfg80211_stop_ap when switch from P2P_GO type
Nicholas Nunley (2):
iavf: check for null in iavf_fix_features
iavf: free q_vectors before queues in iavf_disable_vf
Nick Desaulniers (2):
sh: check return code of request_irq
arm64: vdso32: suppress error message for 'make mrproper'
Nicolas Dichtel (1):
tun: fix bonding active backup with arp monitoring
Nikolay Borisov (1):
btrfs: fix memory ordering between normal and ordered work functions
Ondrej Mosnacek (1):
selinux: fix NULL-pointer dereference when hashtab allocation fails
Paul Cercueil (1):
clk: ingenic: Fix bugs with divided dividers
Pavel Skripkin (2):
net: bnx2x: fix variable dereferenced before check
net: dpaa2-eth: fix use-after-free in dpaa2_eth_remove
Pierre-Louis Bossart (4):
ASoC: SOF: Intel: hda-dai: fix potential locking issue
ALSA: intel-dsp-config: add quirk for APL/GLK/TGL devices based on
ES8336 codec
ALSA: hda: hdac_ext_stream: fix potential locking issues
ALSA: hda: hdac_stream: fix potential locking issue in
snd_hdac_stream_assign()
Piotr Marczak (1):
iavf: Fix failure to exit out from last all-multicast mode
Punit Agrawal (1):
net: stmmac: dwmac-rk: Fix ethernet on rk3399 based devices
Raed Salem (1):
net/mlx5: E-Switch, return error if encap isn't supported
Randy Dunlap (8):
ALSA: ISA: not for M68K
sh: fix kconfig unmet dependency warning for FRAME_POINTER
sh: math-emu: drop unused functions
sh: define __BIG_ENDIAN for math-emu
mips: BCM63XX: ensure that CPU_SUPPORTS_32BIT_KERNEL is set
mips: bcm63xx: add support for clk_get_parent()
mips: lantiq: add support for clk_get_parent()
x86/Kconfig: Fix an unused variable error in dell-smm-hwmon
Roger Quadros (1):
ARM: dts: omap: fix gpmc,mux-add-data type
Roi Dayan (1):
net/mlx5: E-Switch, Change mode lock from mutex to rw semaphore
Rustam Kovhaev (1):
mm: kmemleak: slob: respect SLAB_NOLEAKTRACE flag
Sasha Levin (1):
Revert "perf: Rework perf_event_exit_event()"
Sean Christopherson (1):
x86/hyperv: Fix NULL deref in set_hv_tscchange_cb() if Hyper-V setup
fails
Selvin Xavier (1):
RDMA/bnxt_re: Check if the vlan is valid before reporting
Shawn Guo (1):
arm64: dts: qcom: ipq6018: Fix qcom,controlled-remotely property
Sohaib Mohamed (1):
perf bench futex: Fix memory leak of perf_cpu_map__new()
Sriharsha Basavapatna (1):
bnxt_en: reject indirect blk offload when hw-tc-offload is off
Stefan Riedmueller (1):
clk: imx: imx6ul: Move csi_sel mux to correct base register
Steven Rostedt (VMware) (1):
tracing: Add length protection to histogram string copies
Surabhi Boob (1):
iavf: Fix for the false positive ASQ/ARQ errors while issuing VF reset
Sven Peter (1):
usb: typec: tipd: Remove WARN_ON in tps6598x_block_read
Sven Schnelle (1):
parisc/sticon: fix reverse colors
Tadeusz Struk (1):
tipc: check for null after calling kmemdup
Takashi Iwai (1):
ASoC: DAPM: Cover regression by kctl change notification fix
Teng Qi (1):
iio: imu: st_lsm6dsx: Avoid potential array overflow in
st_lsm6dsx_set_odr()
Tetsuo Handa (1):
sock: fix /proc/net/sockstat underflow in sk_clone_lock()
Tony Lindgren (2):
bus: ti-sysc: Add quirk handling for reinit on context lost
bus: ti-sysc: Use context lost quirk for otg
Uwe Kleine-König (1):
usb: max-3421: Use driver data instead of maintaining a list of bound
devices
Valentine Fatiev (1):
net/mlx5e: nullify cq->dbg pointer in mlx5_debug_cq_remove()
Vincent Donnefort (1):
sched/core: Mitigate race cpus_share_cache()/update_top_cache_domain()
Wen Gu (1):
net/smc: Make sure the link_id is unique
Xin Long (2):
tipc: only accept encrypted MSG_CRYPTO msgs
net: sched: act_mirred: drop dst for the direction from egress to
ingress
Yang Yingliang (2):
usb: musb: tusb6010: check return value after calling
platform_get_resource()
usb: host: ohci-tmio: check return value after calling
platform_get_resource()
hongao (1):
drm/amdgpu: fix set scaling mode Full/Full aspect/Center not works on
vga and dvi connectors
arch/arm/boot/dts/bcm-nsp.dtsi | 4 +-
arch/arm/boot/dts/bcm53016-meraki-mr32.dts | 22 ++
arch/arm/boot/dts/ls1021a-tsn.dts | 2 +-
arch/arm/boot/dts/ls1021a.dtsi | 66 +++---
arch/arm/boot/dts/omap-gpmc-smsc9221.dtsi | 2 +-
.../boot/dts/omap3-overo-tobiduo-common.dtsi | 2 +-
arch/arm/boot/dts/qcom-ipq8064-rb3011.dts | 6 +-
.../arm/boot/dts/ste-ux500-samsung-skomer.dts | 8 +-
arch/arm/boot/dts/sun8i-a33.dtsi | 4 +-
arch/arm/boot/dts/sun8i-a83t.dtsi | 4 +-
arch/arm/boot/dts/sun8i-h3.dtsi | 4 +-
.../arm64/boot/dts/allwinner/sun50i-a100.dtsi | 6 +-
.../dts/allwinner/sun50i-a64-cpu-opp.dtsi | 2 +-
.../boot/dts/allwinner/sun50i-h5-cpu-opp.dtsi | 2 +-
arch/arm64/boot/dts/allwinner/sun50i-h5.dtsi | 2 +-
.../boot/dts/allwinner/sun50i-h6-cpu-opp.dtsi | 2 +-
.../arm64/boot/dts/freescale/fsl-ls1088a.dtsi | 16 +-
.../arm64/boot/dts/freescale/fsl-ls208xa.dtsi | 16 +-
arch/arm64/boot/dts/hisilicon/hi3660.dtsi | 4 +-
arch/arm64/boot/dts/hisilicon/hi6220.dtsi | 2 +-
arch/arm64/boot/dts/qcom/ipq6018.dtsi | 2 +-
arch/arm64/boot/dts/qcom/msm8998.dtsi | 20 +-
.../boot/dts/rockchip/rk3399-pinebook-pro.dts | 4 -
.../dts/xilinx/zynqmp-zc1751-xm016-dc2.dts | 4 +-
arch/arm64/boot/dts/xilinx/zynqmp.dtsi | 4 +-
arch/arm64/kernel/vdso32/Makefile | 3 +-
arch/hexagon/include/asm/timer-regs.h | 26 ---
arch/hexagon/include/asm/timex.h | 3 +-
arch/hexagon/kernel/time.c | 12 +-
arch/hexagon/lib/io.c | 4 +
arch/mips/Kconfig | 3 +
arch/mips/bcm63xx/clk.c | 6 +
arch/mips/generic/yamon-dt.c | 2 +-
arch/mips/lantiq/clk.c | 6 +
arch/mips/sni/time.c | 4 +-
arch/powerpc/boot/dts/charon.dts | 2 +-
arch/powerpc/boot/dts/digsy_mtc.dts | 2 +-
arch/powerpc/boot/dts/lite5200.dts | 2 +-
arch/powerpc/boot/dts/lite5200b.dts | 2 +-
arch/powerpc/boot/dts/media5200.dts | 2 +-
arch/powerpc/boot/dts/mpc5200b.dtsi | 2 +-
arch/powerpc/boot/dts/o2d.dts | 2 +-
arch/powerpc/boot/dts/o2d.dtsi | 2 +-
arch/powerpc/boot/dts/o2dnt2.dts | 2 +-
arch/powerpc/boot/dts/o3dnt.dts | 2 +-
arch/powerpc/boot/dts/pcm032.dts | 2 +-
arch/powerpc/boot/dts/tqm5200.dts | 2 +-
arch/powerpc/kernel/head_8xx.S | 13 +-
arch/powerpc/kvm/book3s_hv_rmhandlers.S | 4 +-
arch/powerpc/sysdev/dcr-low.S | 2 +-
arch/s390/include/asm/kexec.h | 6 +
arch/s390/kernel/ipl.c | 3 +-
arch/s390/kernel/machine_kexec_file.c | 18 +-
arch/sh/Kconfig.debug | 1 +
arch/sh/include/asm/sfp-machine.h | 8 +
arch/sh/kernel/cpu/sh4a/smp-shx3.c | 5 +-
arch/sh/math-emu/math.c | 103 ----------
arch/x86/Kconfig | 3 +-
arch/x86/events/intel/core.c | 4 +-
arch/x86/events/intel/uncore_snbep.c | 4 +
arch/x86/hyperv/hv_init.c | 3 +
arch/x86/kvm/vmx/nested.c | 22 +-
block/ioprio.c | 9 +-
drivers/base/firmware_loader/main.c | 13 +-
drivers/bus/ti-sysc.c | 110 +++++++++-
drivers/clk/clk-ast2600.c | 12 +-
drivers/clk/imx/clk-imx6ul.c | 2 +-
drivers/clk/ingenic/cgu.c | 6 +-
drivers/clk/qcom/gcc-msm8996.c | 15 --
.../gpu/drm/amd/amdgpu/amdgpu_connectors.c | 1 +
.../drm/amd/display/dc/dcn20/dcn20_resource.c | 4 +-
.../amd/display/dc/dml/display_mode_enums.h | 4 +-
drivers/gpu/drm/i915/display/intel_dp.c | 11 +
.../drm/nouveau/nvkm/engine/disp/hdmigv100.c | 1 -
drivers/gpu/drm/udl/udl_connector.c | 2 +-
drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_core.c | 6 +-
drivers/infiniband/hw/bnxt_re/ib_verbs.c | 12 +-
.../ethernet/broadcom/bnx2x/bnx2x_init_ops.h | 4 +-
drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c | 2 +-
.../net/ethernet/freescale/dpaa2/dpaa2-eth.c | 4 +-
drivers/net/ethernet/intel/e100.c | 18 +-
drivers/net/ethernet/intel/i40e/i40e.h | 2 +
drivers/net/ethernet/intel/i40e/i40e_main.c | 160 +++++++++------
.../ethernet/intel/i40e/i40e_virtchnl_pf.c | 121 ++++-------
.../net/ethernet/intel/iavf/iavf_ethtool.c | 30 ++-
drivers/net/ethernet/intel/iavf/iavf_main.c | 14 +-
drivers/net/ethernet/intel/ice/ice_main.c | 3 -
drivers/net/ethernet/mellanox/mlx5/core/cq.c | 5 +-
.../net/ethernet/mellanox/mlx5/core/debugfs.c | 4 +-
.../net/ethernet/mellanox/mlx5/core/eswitch.c | 11 +-
.../net/ethernet/mellanox/mlx5/core/eswitch.h | 2 +-
.../mellanox/mlx5/core/eswitch_offloads.c | 28 +--
drivers/net/ethernet/mellanox/mlx5/core/lag.c | 28 ++-
.../net/ethernet/stmicro/stmmac/dwmac-rk.c | 5 +
.../ethernet/stmicro/stmmac/dwmac-socfpga.c | 24 ++-
drivers/net/ipa/ipa_endpoint.c | 2 +
drivers/net/tun.c | 5 +
drivers/pinctrl/qcom/pinctrl-sdm845.c | 1 +
drivers/platform/x86/hp_accel.c | 2 +
drivers/scsi/advansys.c | 4 +-
drivers/scsi/lpfc/lpfc_sli.c | 1 +
drivers/scsi/qla2xxx/qla_mbx.c | 6 +-
drivers/scsi/scsi_sysfs.c | 30 ++-
drivers/scsi/ufs/ufshcd.c | 57 +++---
drivers/scsi/ufs/ufshcd.h | 1 +
drivers/sh/maple/maple.c | 5 +-
drivers/staging/rtl8723bs/core/rtw_mlme_ext.c | 7 +-
drivers/staging/rtl8723bs/core/rtw_recv.c | 10 +-
drivers/staging/rtl8723bs/core/rtw_sta_mgt.c | 22 +-
drivers/staging/rtl8723bs/core/rtw_xmit.c | 16 +-
.../staging/rtl8723bs/hal/rtl8723bs_xmit.c | 2 -
drivers/staging/wfx/bus_sdio.c | 17 +-
drivers/target/target_core_alua.c | 1 -
drivers/target/target_core_device.c | 2 +
drivers/target/target_core_internal.h | 1 +
drivers/target/target_core_transport.c | 76 +++++--
drivers/tty/tty_buffer.c | 3 +
drivers/usb/host/max3421-hcd.c | 25 +--
drivers/usb/host/ohci-tmio.c | 2 +-
drivers/usb/musb/tusb6010.c | 5 +
drivers/usb/typec/tps6598x.c | 2 +-
drivers/video/console/sticon.c | 12 +-
fs/btrfs/async-thread.c | 14 ++
fs/btrfs/volumes.c | 21 +-
fs/f2fs/f2fs.h | 3 +-
fs/f2fs/super.c | 4 +-
fs/inode.c | 7 +-
fs/udf/dir.c | 32 ++-
fs/udf/namei.c | 3 +
fs/udf/super.c | 2 +
include/linux/fs.h | 2 +
include/linux/perf_event.h | 1 -
include/linux/platform_data/ti-sysc.h | 1 +
include/linux/trace_events.h | 2 +-
include/linux/virtio_net.h | 7 +-
include/net/nfc/nci_core.h | 1 +
include/rdma/rdma_netlink.h | 2 +-
include/sound/hdaudio_ext.h | 2 +
include/target/target_core_base.h | 6 +-
include/trace/events/f2fs.h | 12 +-
include/uapi/linux/tcp.h | 2 +
ipc/util.c | 6 +-
kernel/events/core.c | 142 ++++++-------
kernel/sched/core.c | 3 +
kernel/trace/trace_events_hist.c | 14 +-
mm/slab.h | 2 +-
net/core/sock.c | 189 +++++++++---------
net/ipv4/tcp.c | 122 ++++++++---
net/nfc/core.c | 32 +--
net/nfc/nci/core.c | 34 +++-
net/sched/act_mirred.c | 11 +-
net/smc/smc_core.c | 3 +-
net/tipc/crypto.c | 4 +
net/tipc/link.c | 7 +-
net/wireless/util.c | 1 +
security/selinux/ss/hashtab.c | 17 +-
sound/core/Makefile | 2 +
sound/hda/ext/hdac_ext_stream.c | 46 +++--
sound/hda/hdac_stream.c | 4 +-
sound/hda/intel-dsp-config.c | 22 +-
sound/isa/Kconfig | 2 +-
sound/isa/gus/gus_dma.c | 2 +
sound/pci/Kconfig | 1 +
sound/soc/codecs/nau8824.c | 40 ++++
sound/soc/soc-dapm.c | 29 ++-
sound/soc/sof/intel/hda-dai.c | 7 +-
tools/perf/bench/futex-lock-pi.c | 1 +
tools/perf/bench/futex-requeue.c | 1 +
tools/perf/bench/futex-wake-parallel.c | 1 +
tools/perf/bench/futex-wake.c | 1 +
.../tests/shell/record+zstd_comp_decomp.sh | 2 +-
tools/perf/util/bpf-event.c | 6 +-
tools/perf/util/env.c | 5 +-
tools/perf/util/env.h | 2 +-
174 files changed, 1413 insertions(+), 964 deletions(-)
delete mode 100644 arch/hexagon/include/asm/timer-regs.h
--
2.20.1
1
146
[PATCH openEuler-5.10 01/16] drm/i915/guc: Update to use firmware v49.0.1
by Zheng Zengkai 14 Jan '22
by Zheng Zengkai 14 Jan '22
14 Jan '22
From: John Harrison <John.C.Harrison(a)Intel.com>
mainline inclusion
from mainline-5.11-rc1
commit c784e5249e773689e38d2bc1749f08b986621a26
bugzilla: 185850 https://gitee.com/openeuler/kernel/issues/I4DDEL
CVE: CVE-2020-12363, CVE-2020-12364
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
The latest GuC firmware includes a number of interface changes that
require driver updates to match.
* Starting from Gen11, the ID to be provided to GuC needs to contain
the engine class in bits [0..2] and the instance in bits [3..6].
NOTE: this patch breaks pointer dereferences in some existing GuC
functions that use the guc_id to dereference arrays but these functions
are not used for now as we have GuC submission disabled and we will
update these functions in follow up patch which requires new IDs.
* The new GuC requires the additional data structure (ADS) and associated
'private_data' pointer to be setup. This is basically a scratch area
of memory that the GuC owns. The size is read from the CSS header.
* There is now a physical to logical engine mapping table in the ADS
which needs to be configured in order for the firmware to load. For
now, the table is initialised with a 1 to 1 mapping.
* GUC_CTL_CTXINFO has been removed from the initialization params.
* reg_state_buffer is maintained internally by the GuC as part of
the private data.
* The ADS layout has changed significantly. This patch updates the
shared structure and also adds better documentation of the layout.
* While i915 does not use GuC doorbells, the firmware now requires
that some initialisation is done.
* The number of engine classes and instances supported in the ADS has
been increased.
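As an illustration of the new ID layout (a sketch built on the macros this patch introduces; in i915, VIDEO_DECODE_CLASS is 1, so class 1 / instance 2 packs to 0x11):
u32 guc_id = MAKE_GUC_ID(VIDEO_DECODE_CLASS, 2);
/* guc_id == (1 << GUC_ENGINE_CLASS_SHIFT) | (2 << GUC_ENGINE_INSTANCE_SHIFT) == 0x11 */
GEM_BUG_ON(GUC_ID_TO_ENGINE_CLASS(guc_id) != VIDEO_DECODE_CLASS);
GEM_BUG_ON(GUC_ID_TO_ENGINE_INSTANCE(guc_id) != 2);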
Signed-off-by: John Harrison <John.C.Harrison(a)Intel.com>
Signed-off-by: Matthew Brost <matthew.brost(a)intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio(a)intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo(a)intel.com>
Signed-off-by: Michel Thierry <michel.thierry(a)intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko(a)intel.com>
Cc: Michal Winiarski <michal.winiarski(a)intel.com>
Cc: Tomasz Lis <tomasz.lis(a)intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen(a)linux.intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio(a)intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen(a)linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201028145826.2949180-2-John…
Signed-off-by: Chen Jun <chenjun102(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Reviewed-by: Jason Yan <yanaijie(a)huawei.com>
Signed-off-by: Chen Jun <chenjun102(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
drivers/gpu/drm/i915/gt/intel_engine_cs.c | 3 +-
drivers/gpu/drm/i915/gt/uc/intel_guc.c | 18 ---
drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 131 +++++++++++++++----
drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h | 80 +++++------
drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h | 5 +
drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c | 27 ++--
drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h | 2 +
drivers/gpu/drm/i915/gt/uc/intel_uc_fw_abi.h | 6 +-
8 files changed, 176 insertions(+), 96 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index a19537706ed1..c940ac3aae2f 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -305,8 +305,9 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id)
engine->i915 = i915;
engine->gt = gt;
engine->uncore = gt->uncore;
- engine->hw_id = engine->guc_id = info->hw_id;
engine->mmio_base = __engine_mmio_base(i915, info->mmio_bases);
+ engine->hw_id = info->hw_id;
+ engine->guc_id = MAKE_GUC_ID(info->class, info->instance);
engine->class = info->class;
engine->instance = info->instance;
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
index 942c7c187adb..6909da1e1a73 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
@@ -213,23 +213,6 @@ static u32 guc_ctl_feature_flags(struct intel_guc *guc)
return flags;
}
-static u32 guc_ctl_ctxinfo_flags(struct intel_guc *guc)
-{
- u32 flags = 0;
-
- if (intel_guc_submission_is_used(guc)) {
- u32 ctxnum, base;
-
- base = intel_guc_ggtt_offset(guc, guc->stage_desc_pool);
- ctxnum = GUC_MAX_STAGE_DESCRIPTORS / 16;
-
- base >>= PAGE_SHIFT;
- flags |= (base << GUC_CTL_BASE_ADDR_SHIFT) |
- (ctxnum << GUC_CTL_CTXNUM_IN16_SHIFT);
- }
- return flags;
-}
-
static u32 guc_ctl_log_params_flags(struct intel_guc *guc)
{
u32 offset = intel_guc_ggtt_offset(guc, guc->log.vma) >> PAGE_SHIFT;
@@ -291,7 +274,6 @@ static void guc_init_params(struct intel_guc *guc)
BUILD_BUG_ON(sizeof(guc->params) != GUC_CTL_MAX_DWORDS * sizeof(u32));
- params[GUC_CTL_CTXINFO] = guc_ctl_ctxinfo_flags(guc);
params[GUC_CTL_LOG_PARAMS] = guc_ctl_log_params_flags(guc);
params[GUC_CTL_FEATURE] = guc_ctl_feature_flags(guc);
params[GUC_CTL_DEBUG] = guc_ctl_debug_flags(guc);
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index d44061033f23..7950d28beb8c 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -10,11 +10,52 @@
/*
* The Additional Data Struct (ADS) has pointers for different buffers used by
- * the GuC. One single gem object contains the ADS struct itself (guc_ads), the
- * scheduling policies (guc_policies), a structure describing a collection of
- * register sets (guc_mmio_reg_state) and some extra pages for the GuC to save
- * its internal state for sleep.
+ * the GuC. One single gem object contains the ADS struct itself (guc_ads) and
+ * all the extra buffers indirectly linked via the ADS struct's entries.
+ *
+ * Layout of the ADS blob allocated for the GuC:
+ *
+ * +---------------------------------------+ <== base
+ * | guc_ads |
+ * +---------------------------------------+
+ * | guc_policies |
+ * +---------------------------------------+
+ * | guc_gt_system_info |
+ * +---------------------------------------+
+ * | guc_clients_info |
+ * +---------------------------------------+
+ * | guc_ct_pool_entry[size] |
+ * +---------------------------------------+
+ * | padding |
+ * +---------------------------------------+ <== 4K aligned
+ * | private data |
+ * +---------------------------------------+
+ * | padding |
+ * +---------------------------------------+ <== 4K aligned
*/
+struct __guc_ads_blob {
+ struct guc_ads ads;
+ struct guc_policies policies;
+ struct guc_gt_system_info system_info;
+ struct guc_clients_info clients_info;
+ struct guc_ct_pool_entry ct_pool[GUC_CT_POOL_SIZE];
+} __packed;
+
+static u32 guc_ads_private_data_size(struct intel_guc *guc)
+{
+ return PAGE_ALIGN(guc->fw.private_data_size);
+}
+
+static u32 guc_ads_private_data_offset(struct intel_guc *guc)
+{
+ return PAGE_ALIGN(sizeof(struct __guc_ads_blob));
+}
+
+static u32 guc_ads_blob_size(struct intel_guc *guc)
+{
+ return guc_ads_private_data_offset(guc) +
+ guc_ads_private_data_size(guc);
+}
static void guc_policy_init(struct guc_policy *policy)
{
@@ -48,26 +89,37 @@ static void guc_ct_pool_entries_init(struct guc_ct_pool_entry *pool, u32 num)
memset(pool, 0, num * sizeof(*pool));
}
+static void guc_mapping_table_init(struct intel_gt *gt,
+ struct guc_gt_system_info *system_info)
+{
+ unsigned int i, j;
+ struct intel_engine_cs *engine;
+ enum intel_engine_id id;
+
+ /* Table must be set to invalid values for entries not used */
+ for (i = 0; i < GUC_MAX_ENGINE_CLASSES; ++i)
+ for (j = 0; j < GUC_MAX_INSTANCES_PER_CLASS; ++j)
+ system_info->mapping_table[i][j] =
+ GUC_MAX_INSTANCES_PER_CLASS;
+
+ for_each_engine(engine, gt, id) {
+ u8 guc_class = engine->class;
+
+ system_info->mapping_table[guc_class][engine->instance] =
+ engine->instance;
+ }
+}
+
/*
* The first 80 dwords of the register state context, containing the
* execlists and ppgtt registers.
*/
#define LR_HW_CONTEXT_SIZE (80 * sizeof(u32))
-/* The ads obj includes the struct itself and buffers passed to GuC */
-struct __guc_ads_blob {
- struct guc_ads ads;
- struct guc_policies policies;
- struct guc_mmio_reg_state reg_state;
- struct guc_gt_system_info system_info;
- struct guc_clients_info clients_info;
- struct guc_ct_pool_entry ct_pool[GUC_CT_POOL_SIZE];
- u8 reg_state_buffer[GUC_S3_SAVE_SPACE_PAGES * PAGE_SIZE];
-} __packed;
-
static void __guc_ads_init(struct intel_guc *guc)
{
struct intel_gt *gt = guc_to_gt(guc);
+ struct drm_i915_private *i915 = gt->i915;
struct __guc_ads_blob *blob = guc->ads_blob;
const u32 skipped_size = LRC_PPHWSP_SZ * PAGE_SIZE + LR_HW_CONTEXT_SIZE;
u32 base;
@@ -99,13 +151,25 @@ static void __guc_ads_init(struct intel_guc *guc)
}
/* System info */
- blob->system_info.slice_enabled = hweight8(gt->info.sseu.slice_mask);
- blob->system_info.rcs_enabled = 1;
- blob->system_info.bcs_enabled = 1;
+ blob->system_info.engine_enabled_masks[RENDER_CLASS] = 1;
+ blob->system_info.engine_enabled_masks[COPY_ENGINE_CLASS] = 1;
+ blob->system_info.engine_enabled_masks[VIDEO_DECODE_CLASS] = VDBOX_MASK(gt);
+ blob->system_info.engine_enabled_masks[VIDEO_ENHANCEMENT_CLASS] = VEBOX_MASK(gt);
+
+ blob->system_info.generic_gt_sysinfo[GUC_GENERIC_GT_SYSINFO_SLICE_ENABLED] =
+ hweight8(gt->info.sseu.slice_mask);
+ blob->system_info.generic_gt_sysinfo[GUC_GENERIC_GT_SYSINFO_VDBOX_SFC_SUPPORT_MASK] =
+ gt->info.vdbox_sfc_access;
+
+ if (INTEL_GEN(i915) >= 12 && !IS_DGFX(i915)) {
+ u32 distdbreg = intel_uncore_read(gt->uncore,
+ GEN12_DIST_DBS_POPULATED);
+ blob->system_info.generic_gt_sysinfo[GUC_GENERIC_GT_SYSINFO_DOORBELL_COUNT_PER_SQIDI] =
+ ((distdbreg >> GEN12_DOORBELLS_PER_SQIDI_SHIFT) &
+ GEN12_DOORBELLS_PER_SQIDI) + 1;
+ }
- blob->system_info.vdbox_enable_mask = VDBOX_MASK(gt);
- blob->system_info.vebox_enable_mask = VEBOX_MASK(gt);
- blob->system_info.vdbox_sfc_support_mask = gt->info.vdbox_sfc_access;
+ guc_mapping_table_init(guc_to_gt(guc), &blob->system_info);
base = intel_guc_ggtt_offset(guc, guc->ads_vma);
@@ -118,11 +182,12 @@ static void __guc_ads_init(struct intel_guc *guc)
/* ADS */
blob->ads.scheduler_policies = base + ptr_offset(blob, policies);
- blob->ads.reg_state_buffer = base + ptr_offset(blob, reg_state_buffer);
- blob->ads.reg_state_addr = base + ptr_offset(blob, reg_state);
blob->ads.gt_system_info = base + ptr_offset(blob, system_info);
blob->ads.clients_info = base + ptr_offset(blob, clients_info);
+ /* Private Data */
+ blob->ads.private_data = base + guc_ads_private_data_offset(guc);
+
i915_gem_object_flush_map(guc->ads_vma->obj);
}
@@ -135,14 +200,15 @@ static void __guc_ads_init(struct intel_guc *guc)
*/
int intel_guc_ads_create(struct intel_guc *guc)
{
- const u32 size = PAGE_ALIGN(sizeof(struct __guc_ads_blob));
+ u32 size;
int ret;
GEM_BUG_ON(guc->ads_vma);
+ size = guc_ads_blob_size(guc);
+
ret = intel_guc_allocate_and_map_vma(guc, size, &guc->ads_vma,
(void **)&guc->ads_blob);
-
if (ret)
return ret;
@@ -156,6 +222,18 @@ void intel_guc_ads_destroy(struct intel_guc *guc)
i915_vma_unpin_and_release(&guc->ads_vma, I915_VMA_RELEASE_MAP);
}
+static void guc_ads_private_data_reset(struct intel_guc *guc)
+{
+ u32 size;
+
+ size = guc_ads_private_data_size(guc);
+ if (!size)
+ return;
+
+ memset((void *)guc->ads_blob + guc_ads_private_data_offset(guc), 0,
+ size);
+}
+
/**
* intel_guc_ads_reset() - prepares GuC Additional Data Struct for reuse
* @guc: intel_guc struct
@@ -168,5 +246,8 @@ void intel_guc_ads_reset(struct intel_guc *guc)
{
if (!guc->ads_vma)
return;
+
__guc_ads_init(guc);
+
+ guc_ads_private_data_reset(guc);
}
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
index a6b733c146c9..79c560d9c0b6 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
@@ -26,8 +26,8 @@
#define GUC_VIDEO_ENGINE2 4
#define GUC_MAX_ENGINES_NUM (GUC_VIDEO_ENGINE2 + 1)
-#define GUC_MAX_ENGINE_CLASSES 5
-#define GUC_MAX_INSTANCES_PER_CLASS 16
+#define GUC_MAX_ENGINE_CLASSES 16
+#define GUC_MAX_INSTANCES_PER_CLASS 32
#define GUC_DOORBELL_INVALID 256
@@ -62,12 +62,7 @@
#define GUC_STAGE_DESC_ATTR_PCH BIT(6)
#define GUC_STAGE_DESC_ATTR_TERMINATED BIT(7)
-/* New GuC control data */
-#define GUC_CTL_CTXINFO 0
-#define GUC_CTL_CTXNUM_IN16_SHIFT 0
-#define GUC_CTL_BASE_ADDR_SHIFT 12
-
-#define GUC_CTL_LOG_PARAMS 1
+#define GUC_CTL_LOG_PARAMS 0
#define GUC_LOG_VALID (1 << 0)
#define GUC_LOG_NOTIFY_ON_HALF_FULL (1 << 1)
#define GUC_LOG_ALLOC_IN_MEGABYTE (1 << 3)
@@ -79,11 +74,11 @@
#define GUC_LOG_ISR_MASK (0x7 << GUC_LOG_ISR_SHIFT)
#define GUC_LOG_BUF_ADDR_SHIFT 12
-#define GUC_CTL_WA 2
-#define GUC_CTL_FEATURE 3
+#define GUC_CTL_WA 1
+#define GUC_CTL_FEATURE 2
#define GUC_CTL_DISABLE_SCHEDULER (1 << 14)
-#define GUC_CTL_DEBUG 4
+#define GUC_CTL_DEBUG 3
#define GUC_LOG_VERBOSITY_SHIFT 0
#define GUC_LOG_VERBOSITY_LOW (0 << GUC_LOG_VERBOSITY_SHIFT)
#define GUC_LOG_VERBOSITY_MED (1 << GUC_LOG_VERBOSITY_SHIFT)
@@ -97,12 +92,37 @@
#define GUC_LOG_DISABLED (1 << 6)
#define GUC_PROFILE_ENABLED (1 << 7)
-#define GUC_CTL_ADS 5
+#define GUC_CTL_ADS 4
#define GUC_ADS_ADDR_SHIFT 1
#define GUC_ADS_ADDR_MASK (0xFFFFF << GUC_ADS_ADDR_SHIFT)
#define GUC_CTL_MAX_DWORDS (SOFT_SCRATCH_COUNT - 2) /* [1..14] */
+/* Generic GT SysInfo data types */
+#define GUC_GENERIC_GT_SYSINFO_SLICE_ENABLED 0
+#define GUC_GENERIC_GT_SYSINFO_VDBOX_SFC_SUPPORT_MASK 1
+#define GUC_GENERIC_GT_SYSINFO_DOORBELL_COUNT_PER_SQIDI 2
+#define GUC_GENERIC_GT_SYSINFO_MAX 16
+
+/*
+ * The class goes in bits [0..2] of the GuC ID, the instance in bits [3..6].
+ * Bit 7 can be used for operations that apply to all engine classes&instances.
+ */
+#define GUC_ENGINE_CLASS_SHIFT 0
+#define GUC_ENGINE_CLASS_MASK (0x7 << GUC_ENGINE_CLASS_SHIFT)
+#define GUC_ENGINE_INSTANCE_SHIFT 3
+#define GUC_ENGINE_INSTANCE_MASK (0xf << GUC_ENGINE_INSTANCE_SHIFT)
+#define GUC_ENGINE_ALL_INSTANCES BIT(7)
+
+#define MAKE_GUC_ID(class, instance) \
+ (((class) << GUC_ENGINE_CLASS_SHIFT) | \
+ ((instance) << GUC_ENGINE_INSTANCE_SHIFT))
+
+#define GUC_ID_TO_ENGINE_CLASS(guc_id) \
+ (((guc_id) & GUC_ENGINE_CLASS_MASK) >> GUC_ENGINE_CLASS_SHIFT)
+#define GUC_ID_TO_ENGINE_INSTANCE(guc_id) \
+ (((guc_id) & GUC_ENGINE_INSTANCE_MASK) >> GUC_ENGINE_INSTANCE_SHIFT)
+
/* Work item for submitting workloads into work queue of GuC. */
struct guc_wq_item {
u32 header;
@@ -336,11 +356,6 @@ struct guc_policies {
} __packed;
/* GuC MMIO reg state struct */
-
-
-#define GUC_REGSET_MAX_REGISTERS 64
-#define GUC_S3_SAVE_SPACE_PAGES 10
-
struct guc_mmio_reg {
u32 offset;
u32 value;
@@ -348,28 +363,18 @@ struct guc_mmio_reg {
#define GUC_REGSET_MASKED (1 << 0)
} __packed;
-struct guc_mmio_regset {
- struct guc_mmio_reg registers[GUC_REGSET_MAX_REGISTERS];
- u32 values_valid;
- u32 number_of_registers;
-} __packed;
-
/* GuC register sets */
-struct guc_mmio_reg_state {
- struct guc_mmio_regset engine_reg[GUC_MAX_ENGINE_CLASSES][GUC_MAX_INSTANCES_PER_CLASS];
- u32 reserved[98];
+struct guc_mmio_reg_set {
+ u32 address;
+ u16 count;
+ u16 reserved;
} __packed;
/* HW info */
struct guc_gt_system_info {
- u32 slice_enabled;
- u32 rcs_enabled;
- u32 reserved0;
- u32 bcs_enabled;
- u32 vdbox_enable_mask;
- u32 vdbox_sfc_support_mask;
- u32 vebox_enable_mask;
- u32 reserved[9];
+ u8 mapping_table[GUC_MAX_ENGINE_CLASSES][GUC_MAX_INSTANCES_PER_CLASS];
+ u32 engine_enabled_masks[GUC_MAX_ENGINE_CLASSES];
+ u32 generic_gt_sysinfo[GUC_GENERIC_GT_SYSINFO_MAX];
} __packed;
/* Clients info */
@@ -390,15 +395,16 @@ struct guc_clients_info {
/* GuC Additional Data Struct */
struct guc_ads {
- u32 reg_state_addr;
- u32 reg_state_buffer;
+ struct guc_mmio_reg_set reg_state_list[GUC_MAX_ENGINE_CLASSES][GUC_MAX_INSTANCES_PER_CLASS];
+ u32 reserved0;
u32 scheduler_policies;
u32 gt_system_info;
u32 clients_info;
u32 control_data;
u32 golden_context_lrca[GUC_MAX_ENGINE_CLASSES];
u32 eng_state_size[GUC_MAX_ENGINE_CLASSES];
- u32 reserved[16];
+ u32 private_data;
+ u32 reserved[15];
} __packed;
/* GuC logging structures */
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h
index 1949346e714e..b37fc2ffaef2 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h
@@ -118,6 +118,11 @@ struct guc_doorbell_info {
#define GEN8_DRB_VALID (1<<0)
#define GEN8_DRBREGU(x) _MMIO(0x1000 + (x) * 8 + 4)
+#define GEN12_DIST_DBS_POPULATED _MMIO(0xd08)
+#define GEN12_DOORBELLS_PER_SQIDI_SHIFT 16
+#define GEN12_DOORBELLS_PER_SQIDI (0xff)
+#define GEN12_SQIDIS_DOORBELL_EXIST (0xffff)
+
#define DE_GUCRMR _MMIO(0x44054)
#define GUC_BCS_RCS_IER _MMIO(0xC550)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
index 80e8b6c3bc8c..ee4ac3922277 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
@@ -44,23 +44,19 @@ void intel_uc_fw_change_status(struct intel_uc_fw *uc_fw,
* List of required GuC and HuC binaries per-platform.
* Must be ordered based on platform + revid, from newer to older.
*
- * TGL 35.2 is interface-compatible with 33.0 for previous Gens. The deltas
- * between 33.0 and 35.2 are only related to new additions to support new Gen12
- * features.
- *
* Note that RKL uses the same firmware as TGL.
*/
#define INTEL_UC_FIRMWARE_DEFS(fw_def, guc_def, huc_def) \
- fw_def(ROCKETLAKE, 0, guc_def(tgl, 35, 2, 0), huc_def(tgl, 7, 5, 0)) \
- fw_def(TIGERLAKE, 0, guc_def(tgl, 35, 2, 0), huc_def(tgl, 7, 5, 0)) \
- fw_def(ELKHARTLAKE, 0, guc_def(ehl, 33, 0, 4), huc_def(ehl, 9, 0, 0)) \
- fw_def(ICELAKE, 0, guc_def(icl, 33, 0, 0), huc_def(icl, 9, 0, 0)) \
- fw_def(COMETLAKE, 5, guc_def(cml, 33, 0, 0), huc_def(cml, 4, 0, 0)) \
- fw_def(COFFEELAKE, 0, guc_def(kbl, 33, 0, 0), huc_def(kbl, 4, 0, 0)) \
- fw_def(GEMINILAKE, 0, guc_def(glk, 33, 0, 0), huc_def(glk, 4, 0, 0)) \
- fw_def(KABYLAKE, 0, guc_def(kbl, 33, 0, 0), huc_def(kbl, 4, 0, 0)) \
- fw_def(BROXTON, 0, guc_def(bxt, 33, 0, 0), huc_def(bxt, 2, 0, 0)) \
- fw_def(SKYLAKE, 0, guc_def(skl, 33, 0, 0), huc_def(skl, 2, 0, 0))
+ fw_def(ROCKETLAKE, 0, guc_def(tgl, 49, 0, 1), huc_def(tgl, 7, 5, 0)) \
+ fw_def(TIGERLAKE, 0, guc_def(tgl, 49, 0, 1), huc_def(tgl, 7, 5, 0)) \
+ fw_def(ELKHARTLAKE, 0, guc_def(ehl, 49, 0, 1), huc_def(ehl, 9, 0, 0)) \
+ fw_def(ICELAKE, 0, guc_def(icl, 49, 0, 1), huc_def(icl, 9, 0, 0)) \
+ fw_def(COMETLAKE, 5, guc_def(cml, 49, 0, 1), huc_def(cml, 4, 0, 0)) \
+ fw_def(COFFEELAKE, 0, guc_def(kbl, 49, 0, 1), huc_def(kbl, 4, 0, 0)) \
+ fw_def(GEMINILAKE, 0, guc_def(glk, 49, 0, 1), huc_def(glk, 4, 0, 0)) \
+ fw_def(KABYLAKE, 0, guc_def(kbl, 49, 0, 1), huc_def(kbl, 4, 0, 0)) \
+ fw_def(BROXTON, 0, guc_def(bxt, 49, 0, 1), huc_def(bxt, 2, 0, 0)) \
+ fw_def(SKYLAKE, 0, guc_def(skl, 49, 0, 1), huc_def(skl, 2, 0, 0))
#define __MAKE_UC_FW_PATH(prefix_, name_, major_, minor_, patch_) \
"i915/" \
@@ -371,6 +367,9 @@ int intel_uc_fw_fetch(struct intel_uc_fw *uc_fw)
}
}
+ if (uc_fw->type == INTEL_UC_FW_TYPE_GUC)
+ uc_fw->private_data_size = css->private_data_size;
+
obj = i915_gem_object_create_shmem_from_data(i915, fw->data, fw->size);
if (IS_ERR(obj)) {
err = PTR_ERR(obj);
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h
index 23d3a423ac0f..99bb1fe1af66 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h
@@ -88,6 +88,8 @@ struct intel_uc_fw {
u32 rsa_size;
u32 ucode_size;
+
+ u32 private_data_size;
};
#ifdef CONFIG_DRM_I915_DEBUG_GUC
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw_abi.h b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw_abi.h
index 029214cdedd5..e41ffc7a7fbc 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw_abi.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw_abi.h
@@ -69,7 +69,11 @@ struct uc_css_header {
#define CSS_SW_VERSION_UC_MAJOR (0xFF << 16)
#define CSS_SW_VERSION_UC_MINOR (0xFF << 8)
#define CSS_SW_VERSION_UC_PATCH (0xFF << 0)
- u32 reserved[14];
+ u32 reserved0[13];
+ union {
+ u32 private_data_size; /* only applies to GuC */
+ u32 reserved1;
+ };
u32 header_info;
} __packed;
static_assert(sizeof(struct uc_css_header) == 128);
--
2.20.1
1
15
[PATCH OLK-5.10] x86/tsc: Make cur->adjusted values in package#1 the same
by LeoLiuoc 14 Jan '22
by LeoLiuoc 14 Jan '22
14 Jan '22
When resuming from S4 on a Zhaoxin 2-package platform that supports
X86_FEATURE_TSC_ADJUST, the following warning messages appear:
[ 327.445302] [Firmware Bug]: TSC ADJUST differs: CPU15 45960750 -->
78394770. Restoring
[ 329.209120] [Firmware Bug]: TSC ADJUST differs: CPU14 45960750 -->
78394770. Restoring
[ 329.209128] [Firmware Bug]: TSC ADJUST differs: CPU13 45960750 -->
78394770. Restoring
[ 329.209138] [Firmware Bug]: TSC ADJUST differs: CPU12 45960750 -->
78394770. Restoring
[ 329.209151] [Firmware Bug]: TSC ADJUST differs: CPU11 45960750 -->
78394770. Restoring
[ 329.209160] [Firmware Bug]: TSC ADJUST differs: CPU10 45960750 -->
78394770. Restoring
[ 329.209169] [Firmware Bug]: TSC ADJUST differs: CPU9 45960750 -->
78394770. Restoring
The reason is:
Step 1: Bring up.
TSC is synchronized after bring-up with the following settings:
                          MSR 0x3b   cur->adjusted
Package#0 CPU 0-7         0          0
Package#1 first CPU       value1     value1
Package#1 non-first CPU   value1     value1
Step 2: Suspend to S4.
The settings from Step 1 are not changed in this step.
Step 3: Bring up caused by an S4 wake-up event.
TSC is synchronized during bring-up with the following settings:
                          MSR 0x3b   cur->adjusted
Package#0 CPU 0-7         0          0
Package#1 first CPU       value2     value2
Package#1 non-first CPU   value2     value2
Step 4: Resume from S4.
When resuming from S4, the current TSC synchronization mechanism
produces the following settings:
                          MSR 0x3b   cur->adjusted
Package#0 CPU 0-7         0          0
Package#1 first CPU       value2     value2
Package#1 non-first CPU   value2     value1
In these steps, value1 != 0 and value2 != value1.
In Step 4, as tsc_store_and_check_tsc_adjust() currently works, when the
value of MSR 0x3b on a non-first online CPU in package#1 is equal to the
value of cur->adjusted on the first online CPU in the same package, the
cur->adjusted value on that non-first online CPU keeps the old value1.
This causes tsc_verify_tsc_adjust() to set MSR 0x3b on the non-first
online CPUs in package#1 back to the old value1 and print the warning
messages shown above.
Fix it by setting the cur->adjusted value on a non-first online CPU in a
package to the value of MSR 0x3b on the same CPU when the two are not
equal.
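To make the scenario concrete, here is an illustrative trace of Step 4 on a package#1 non-first online CPU with the fix applied (annotated from the hunk below, not additional code):
u64 bootval;
rdmsrl(MSR_IA32_TSC_ADJUST, bootval);  /* bootval == value2, written in Step 3 */
/* ref->adjusted == value2 (first CPU of the package) */
/* cur->adjusted == value1 (stale value cached before S4) */
if (bootval != ref->adjusted) {        /* false: value2 == value2 */
	cur->adjusted = ref->adjusted;
	wrmsrl(MSR_IA32_TSC_ADJUST, ref->adjusted);
} else if (cur->adjusted != bootval) { /* true with the fix */
	cur->adjusted = bootval;       /* refresh the stale cache to value2 */
}
/* tsc_verify_tsc_adjust() then no longer rewrites MSR 0x3b back to value1 */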
Signed-off-by: LeoLiu-oc <LeoLiu-oc(a)zhaoxin.com>
---
arch/x86/kernel/tsc_sync.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/x86/kernel/tsc_sync.c b/arch/x86/kernel/tsc_sync.c
index 3d3c761eb..19a0bed9b 100644
--- a/arch/x86/kernel/tsc_sync.c
+++ b/arch/x86/kernel/tsc_sync.c
@@ -190,6 +190,8 @@ bool tsc_store_and_check_tsc_adjust(bool bootcpu)
if (bootval != ref->adjusted) {
cur->adjusted = ref->adjusted;
wrmsrl(MSR_IA32_TSC_ADJUST, ref->adjusted);
+ } else if (cur->adjusted != bootval) {
+ cur->adjusted = bootval;
}
/*
* We have the TSCs forced to be in sync on this package. Skip sync
--
2.20.1
1
0
[PATCH OLK-5.10] Add support for PxSCTL.IPM setting based on actual LPM circumstances
by LeoLiuoc 14 Jan '22
by LeoLiuoc 14 Jan '22
14 Jan '22
The AHCI spec states that the PxSCTL.IPM field of each port must be
programmed to disallow device-initiated LPM requests when the HBA cannot
support transitions to the corresponding LPM state. The current LPM
driver places no restrictions on LPM transitions when enabling
device-initiated LPM.
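For reference, the IPM field occupies bits 11:8 of PxSCTL (SControl), and the values the hunk below ORs in correspond to the SATA-spec restriction bits (the macro names here are illustrative, not part of the patch):
#define SCTL_IPM_NO_RESTRICTIONS  (0x0 << 8)  /* device may request Partial/Slumber */
#define SCTL_IPM_NO_PARTIAL       (0x1 << 8)  /* disallow transitions to Partial */
#define SCTL_IPM_NO_SLUMBER       (0x2 << 8)  /* disallow transitions to Slumber */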
Signed-off-by: LeoLiu-oc <LeoLiu-oc(a)zhaoxin.com>
---
drivers/ata/libata-sata.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/drivers/ata/libata-sata.c b/drivers/ata/libata-sata.c
index c16423e44..7e1296b55 100644
--- a/drivers/ata/libata-sata.c
+++ b/drivers/ata/libata-sata.c
@@ -394,9 +394,18 @@ int sata_link_scr_lpm(struct ata_link *link, enum ata_lpm_policy policy,
case ATA_LPM_MED_POWER_WITH_DIPM:
case ATA_LPM_MIN_POWER_WITH_PARTIAL:
case ATA_LPM_MIN_POWER:
- if (ata_link_nr_enabled(link) > 0)
+ if (ata_link_nr_enabled(link) > 0) {
/* no restrictions on LPM transitions */
scontrol &= ~(0x7 << 8);
+ /* if the host does not support Partial, then disallow it;
+ * the same for Slumber
+ */
+ if (!(link->ap->host->flags & ATA_HOST_PART))
+ scontrol |= (0x1 << 8);
+
+ if (!(link->ap->host->flags & ATA_HOST_SSC))
+ scontrol |= (0x2 << 8);
+ }
else {
/* empty port, power off */
scontrol &= ~0xf;
--
2.20.1
1
0
[PATCH OLK-5.10] Add support for PxSCTL.IPM setting based on actual LPM circumstances
by LeoLiuoc 14 Jan '22
by LeoLiuoc 14 Jan '22
14 Jan '22
Per the AHCI spec, the PhyRdy Change Interrupt cannot coexist with Link
Power Management (LPM), yet the LPM driver disables the PhyRdy Change
interrupt without first checking whether the HBA supports LPM at all.
Therefore, if LPM is enabled by default from the LPM driver, it cannot
be disabled from the BIOS without hurting the hot-unplug capability.
Signed-off-by: LeoLiu-oc <LeoLiu-oc(a)zhaoxin.com>
---
drivers/ata/ahci.c | 9 +++++++++
drivers/ata/libata-eh.c | 4 ++++
include/linux/libata.h | 4 ++++
3 files changed, 17 insertions(+)
diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
index f8059eed3..2d1ac1b5b 100644
--- a/drivers/ata/ahci.c
+++ b/drivers/ata/ahci.c
@@ -1883,6 +1883,15 @@ static int ahci_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
else
dev_info(&pdev->dev, "SSS flag set, parallel bus scan disabled\n");
+ if (hpriv->cap & HOST_CAP_PART)
+ host->flags |= ATA_HOST_PART;
+
+ if (hpriv->cap & HOST_CAP_SSC)
+ host->flags |= ATA_HOST_SSC;
+
+ if (hpriv->cap2 & HOST_CAP2_SDS)
+ host->flags |= ATA_HOST_DEVSLP;
+
if (pi.flags & ATA_FLAG_EM)
ahci_reset_em(host);
diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index b6f92050e..ff6e660d6 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -3237,6 +3237,10 @@ static int ata_eh_set_lpm(struct ata_link *link, enum ata_lpm_policy policy,
unsigned int err_mask;
int rc;
+ /* if the host does not support LPM, then set the no-LPM flag */
+ if (!(ap->host->flags & (ATA_HOST_PART | ATA_HOST_SSC | ATA_HOST_DEVSLP)))
+ link->flags |= ATA_LFLAG_NO_LPM;
+
/* if the link or host doesn't do LPM, noop */
if (!IS_ENABLED(CONFIG_SATA_HOST) ||
(link->flags & ATA_LFLAG_NO_LPM) || (ap && !ap->ops->set_lpm))
diff --git a/include/linux/libata.h b/include/linux/libata.h
index 5f550eb27..81124ddc2 100644
--- a/include/linux/libata.h
+++ b/include/linux/libata.h
@@ -260,6 +260,10 @@ enum {
ATA_HOST_PARALLEL_SCAN = (1 << 2), /* Ports on this host can be scanned in parallel */
ATA_HOST_IGNORE_ATA = (1 << 3), /* Ignore ATA devices on this host. */
+ ATA_HOST_PART = (1 << 4), /* Host supports Partial. */
+ ATA_HOST_SSC = (1 << 5), /* Host supports Slumber. */
+ ATA_HOST_DEVSLP = (1 << 6), /* Host supports DevSlp. */
+
/* bits 24:31 of host->flags are reserved for LLD specific flags */
/* various lengths of time */
--
2.20.1
1
0
[PATCH OLK-5.10] Fix some bugs like plugin support and SATA link stability when the user enables AHCI RTD3
by LeoLiuoc 14 Jan '22
by LeoLiuoc 14 Jan '22
14 Jan '22
The kernel driver currently disables interrupts when a port is
suspended, so plug-in events are not detected. Enabling the plug-in
interrupt according to the power-management state makes hotplug work
again.
pm_request_resume() is added to resume the port when a plug-in event or
a PME signal wakes up a controller that is in D3.
When the AHCI controller frequently enters and leaves D3, the identify
command may time out while the controller resumes and re-establishes
the connection with the device. Delaying 10ms between controller resume
and port resume is effective, letting the link transition smoothly.
With non-power-management requests and power-management requests
competing with each other in the queue, block IO is often found to hang
for 120s while the system disk is suspending or resuming. It is now
guaranteed that PM requests enter the queue no matter which non-PM
requests are waiting: increase the pm_only counter before checking
whether any non-PM blk_queue_enter() calls are in progress.
Meanwhile, the new blk_pm_request_resume() call needs to happen while a
request is assigned to a queue when the device is suspended.
Signed-off-by: LeoLiu-oc <LeoLiu-oc(a)zhaoxin.com>
---
drivers/ata/ahci.c | 13 +++++++++++++
drivers/ata/libahci.c | 14 ++++++++++++++
2 files changed, 27 insertions(+)
diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
index 33192a8f6..617cfcb2b 100644
--- a/drivers/ata/ahci.c
+++ b/drivers/ata/ahci.c
@@ -868,6 +868,19 @@ static int ahci_pci_device_runtime_resume(struct device *dev)
if (rc)
return rc;
ahci_pci_init_controller(host);
+
+ /* controller delay for 10ms when being reusmed &&
+ * port resume for Zx platform
+ */
+ if (pdev->vendor == PCI_VENDOR_ID_ZHAOXIN) {
+ ata_msleep(NULL, 10);
+ for (rc = 0; rc < host->n_ports; rc++) {
+ struct ata_port *ap = host->ports[rc];
+
+ pm_request_resume(&ap->tdev);
+ }
+ }
+
return 0;
}
diff --git a/drivers/ata/libahci.c b/drivers/ata/libahci.c
index fec2e9754..a37eac335 100644
--- a/drivers/ata/libahci.c
+++ b/drivers/ata/libahci.c
@@ -826,6 +826,12 @@ static void ahci_power_down(struct ata_port *ap)
void __iomem *port_mmio = ahci_port_base(ap);
u32 cmd, scontrol;
+ /* port suspended enable Plugin intr for Zx platform */
+ struct pci_dev *pdev = to_pci_dev(ap->host->dev);
+
+ if ((pdev->vendor == PCI_VENDOR_ID_ZHAOXIN) &&
(!ap->link.device->sdev))
+ writel(PORT_IRQ_CONNECT, port_mmio + PORT_IRQ_MASK);
+
if (!(hpriv->cap & HOST_CAP_SSS))
return;
@@ -1703,6 +1709,7 @@ static void ahci_error_intr(struct ata_port *ap,
u32 irq_stat)
struct ata_eh_info *active_ehi;
bool fbs_need_dec = false;
u32 serror;
+ struct pci_dev *pdev;
/* determine active link with error */
if (pp->fbs_enabled) {
@@ -1791,6 +1798,13 @@ static void ahci_error_intr(struct ata_port *ap,
u32 irq_stat)
ata_ehi_push_desc(host_ehi, "%s",
irq_stat & PORT_IRQ_CONNECT ?
"connection status changed" : "PHY RDY changed");
+ /* when plugin intr happen,now resume suspended port for Zx
platform */
+ pdev = to_pci_dev(ap->host->dev);
+ if ((pdev->vendor == PCI_VENDOR_ID_ZHAOXIN) &&
+ (ap->pflags & ATA_PFLAG_SUSPENDED)) {
+ pm_request_resume(&ap->tdev);
+ return;
+ }
}
/* okay, let's hand over to EH */
--
2.20.1
[PATCH OLK-5.10] EHCI: Clear wakeup signal latched in S0 state when a device is plugged in
by LeoLiuoc 14 Jan '22
If a LS/FS device is plugged into a USB2 port of EHCI, a wakeup signal
is latched inside the EHCI controller. This is an EHCI bug on some
Zhaoxin projects. If EHCI runtime suspend is enabled and no device is
attached, the PM core lets EHCI go to D3 to save power. However, once
EHCI enters D3, it releases the wakeup signal that was latched when the
device was connected to the port during S0, which generates an SCI
interrupt and brings EHCI back to D0. With no device connected, EHCI
then goes to D3 again, so there is a suspend-resume loop that generates
SCI interrupts continuously.
To fix this issue, clear the wakeup signal latched in EHCI when the
EHCI suspend function is called.
Signed-off-by: LeoLiu-oc <LeoLiu-oc(a)zhaoxin.com>
---
drivers/pci/pci-driver.c | 6 +++++-
drivers/usb/host/ehci-hcd.c | 21 +++++++++++++++++++++
drivers/usb/host/ehci-pci.c | 4 ++++
drivers/usb/host/ehci.h | 1 +
4 files changed, 31 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 8b587fc97..c33d7e0a6 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -518,7 +518,11 @@ static int pci_restore_standard_config(struct
pci_dev *pci_dev)
}
pci_restore_state(pci_dev);
- pci_pme_restore(pci_dev);
+ if (!((pci_dev->vendor == PCI_VENDOR_ID_ZHAOXIN) &&
+ (pci_dev->device == 0x3104) &&
+ ((pci_dev->revision & 0xf0) == 0x90)) ||
+ !(pci_dev->class == PCI_CLASS_SERIAL_USB_EHCI))
+ pci_pme_restore(pci_dev);
return 0;
}
diff --git a/drivers/usb/host/ehci-hcd.c b/drivers/usb/host/ehci-hcd.c
index 8aff19ff8..586e2735d 100644
--- a/drivers/usb/host/ehci-hcd.c
+++ b/drivers/usb/host/ehci-hcd.c
@@ -1142,6 +1142,27 @@ int ehci_suspend(struct usb_hcd *hcd, bool do_wakeup)
return -EBUSY;
}
+ /*clear wakeup signal locked in S0 state when device plug in*/
+ if (ehci->zx_wakeup_clear == 1) {
+ u32 __iomem *reg = &ehci->regs->port_status[4];
+ u32 t1 = ehci_readl(ehci, reg);
+
+ t1 &= (u32)~0xf0000;
+ t1 |= PORT_TEST_FORCE;
+ ehci_writel(ehci, t1, reg);
+ t1 = ehci_readl(ehci, reg);
+ usleep_range(1000, 2000);
+ t1 &= (u32)~0xf0000;
+ ehci_writel(ehci, t1, reg);
+ usleep_range(1000, 2000);
+ t1 = ehci_readl(ehci, reg);
+ ehci_writel(ehci, t1 | PORT_CSC, reg);
+ udelay(500);
+ t1 = ehci_readl(ehci, &ehci->regs->status);
+ ehci_writel(ehci, t1 & STS_PCD, &ehci->regs->status);
+ ehci_readl(ehci, &ehci->regs->status);
+ }
+
return 0;
}
EXPORT_SYMBOL_GPL(ehci_suspend);
diff --git a/drivers/usb/host/ehci-pci.c b/drivers/usb/host/ehci-pci.c
index e87cf3a00..a5e27deda 100644
--- a/drivers/usb/host/ehci-pci.c
+++ b/drivers/usb/host/ehci-pci.c
@@ -222,6 +222,10 @@ static int ehci_pci_setup(struct usb_hcd *hcd)
ehci->has_synopsys_hc_bug = 1;
}
break;
+ case PCI_VENDOR_ID_ZHAOXIN:
+ if (pdev->device == 0x3104 && (pdev->revision & 0xf0) == 0x90)
+ ehci->zx_wakeup_clear = 1;
+ break;
}
/* optional debug port, normally in the first BAR */
diff --git a/drivers/usb/host/ehci.h b/drivers/usb/host/ehci.h
index 59fd523c5..8da080a54 100644
--- a/drivers/usb/host/ehci.h
+++ b/drivers/usb/host/ehci.h
@@ -219,6 +219,7 @@ struct ehci_hcd { /* one per controller */
unsigned need_oc_pp_cycle:1; /* MPC834X port power */
unsigned imx28_write_fix:1; /* For Freescale i.MX28 */
unsigned is_aspeed:1;
+ unsigned zx_wakeup_clear:1;
/* required for usb32 quirk */
#define OHCI_CTRL_HCFS (3 << 6)
--
2.25.1
[PATCH OLK-5.10] XHCI: Fix failure to identify some devices when xHCI runtime suspend is enabled
by LeoLiuoc 14 Jan '22
If a device is plugged out of an xHCI controller with runtime suspend
enabled, two things happen. On the one hand, the driver disconnects the
device and sends a Disable Slot command to the xHC. On the other hand,
with no device connected, the PM core calls the xHCI suspend function
to let the controller go to D3 to save power.
However, there is a race to acquire the xhci lock between the Disable
Slot command completion interrupt and xHCI suspend. If the suspend
function takes the lock first, it clears the xHCI command ring, and the
driver then finds an invalid command TRB when it handles the Disable
Slot completion interrupt. This is a serious error for the driver,
which responds by cleaning up the xHC, so any device connected to this
xHCI port afterwards is no longer recognized.
To fix this, let the Disable Slot completion ISR take the xhci lock
first by adding a delay in the xHCI suspend function before it acquires
the lock.
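For reference, the quirk is consumed on the suspend side roughly like
this (a sketch based on the mainline xhci_suspend() in
drivers/usb/host/xhci.c; simplified):
	int xhci_suspend(struct xhci_hcd *xhci, bool do_wakeup)
	{
		...
		/* Give the Disable Slot completion ISR a chance to take
		 * xhci->lock before suspend clears the command ring. */
		if (xhci->quirks & XHCI_SUSPEND_DELAY)
			usleep_range(1000, 1500);

		spin_lock_irq(&xhci->lock);
		...
	}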
Signed-off-by: LeoLiu-oc <LeoLiu-oc(a)zhaoxin.com>
---
drivers/usb/host/xhci-pci.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index 583a8e895..83aa8e179 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -272,6 +272,8 @@ static void xhci_pci_quirks(struct device *dev,
struct xhci_hcd *xhci)
xhci->quirks |= XHCI_LPM_SUPPORT;
xhci->quirks |= XHCI_ZHAOXIN_HOST;
}
+ if (pdev->vendor == PCI_VENDOR_ID_ZHAOXIN)
+ xhci->quirks |= XHCI_SUSPEND_DELAY;
/* See https://bugzilla.kernel.org/show_bug.cgi?id=79511 */
if (pdev->vendor == PCI_VENDOR_ID_VIA &&
--
2.20.1
14 Jan '22
When the RTC divider is switched from reset to an operating time base,
the first update cycle should occur 500ms later. On some Zhaoxin SoCs,
however, this first update cycle occurs one second later, so setting
the RTC time on these SoCs introduces a 500ms delay.
Skip the RTC divider setup on these SoCs in mc146818_set_time() to fix it.
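For context, the divider bracketing that this patch makes conditional
is the standard mc146818 sequence around the time-register writes
(taken from the unpatched function):
	save_freq_select = CMOS_READ(RTC_FREQ_SELECT);
	CMOS_WRITE(save_freq_select | RTC_DIV_RESET2, RTC_FREQ_SELECT);
	/* ... write the time registers ... */
	CMOS_WRITE(save_freq_select, RTC_FREQ_SELECT);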
Signed-off-by: LeoLiu-oc <LeoLiu-oc(a)zhaoxin.com>
---
drivers/rtc/rtc-mc146818-lib.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/drivers/rtc/rtc-mc146818-lib.c b/drivers/rtc/rtc-mc146818-lib.c
index 2ecd8752b..96d9d0219 100644
--- a/drivers/rtc/rtc-mc146818-lib.c
+++ b/drivers/rtc/rtc-mc146818-lib.c
@@ -171,8 +171,17 @@ int mc146818_set_time(struct rtc_time *time)
save_control = CMOS_READ(RTC_CONTROL);
CMOS_WRITE((save_control|RTC_SET), RTC_CONTROL);
+#ifdef CONFIG_X86
+ if (!((boot_cpu_data.x86_vendor == X86_VENDOR_CENTAUR ||
+ boot_cpu_data.x86_vendor == X86_VENDOR_ZHAOXIN) &&
+ (boot_cpu_data.x86 <= 7 && boot_cpu_data.x86_model <= 59))) {
+ save_freq_select = CMOS_READ(RTC_FREQ_SELECT);
+ CMOS_WRITE((save_freq_select|RTC_DIV_RESET2), RTC_FREQ_SELECT);
+ }
+#else
save_freq_select = CMOS_READ(RTC_FREQ_SELECT);
CMOS_WRITE((save_freq_select|RTC_DIV_RESET2), RTC_FREQ_SELECT);
+#endif
#ifdef CONFIG_MACH_DECSTATION
CMOS_WRITE(real_yrs, RTC_DEC_YEAR);
@@ -190,7 +199,14 @@ int mc146818_set_time(struct rtc_time *time)
#endif
CMOS_WRITE(save_control, RTC_CONTROL);
+#ifdef CONFIG_X86
+ if (!((boot_cpu_data.x86_vendor == X86_VENDOR_CENTAUR ||
+ boot_cpu_data.x86_vendor == X86_VENDOR_ZHAOXIN) &&
+ (boot_cpu_data.x86 <= 7 && boot_cpu_data.x86_model <= 59)))
+ CMOS_WRITE(save_freq_select, RTC_FREQ_SELECT);
+#else
CMOS_WRITE(save_freq_select, RTC_FREQ_SELECT);
+#endif
spin_unlock_irqrestore(&rtc_lock, flags);
--
2.20.1
hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4O31I?from=project-issue
CVE: NA
Add "!" in memmap parameter, use memmap(memmap=nn[KMG]!ss[KMG]) reserve memory
for pmem which register persistent memory in arm64.
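For example (values illustrative), reserving 2G of pmem at 0x1a0000000
on the kernel command line:
	memmap=2G!0x1a0000000
With pmem support enabled in the kernel, the reserved range is
registered as persistent memory and shows up as a pmem block device
(e.g. /dev/pmem0).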
Signed-off-by: Zhuling <zhuling8(a)huawei.com>
---
Documentation/admin-guide/kernel-parameters.txt | 9 +++++++--
arch/arm64/mm/init.c | 6 ++++++
2 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 64be32b..ab42db4 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2829,10 +2829,15 @@
will be eaten.
memmap=nn[KMG]!ss[KMG]
- [KNL,X86] Mark specific memory as protected.
+ [KNL,X86,ARM64] Mark specific memory as protected.
Region of memory to be used, from ss to ss+nn.
- The memory region may be marked as e820 type 12 (0xc)
+ [X86]The memory region may be marked as e820 type 12 (0xc)
and is NVDIMM or ADR memory.
+ [ARM64] Reserve memory for persistent storage across kernel
+ restart or update. The data in PMEM will not be lost and can
+ be loaded faster.
+ Example:
+ memmap=100K!0x1a0000000
memmap=<size>%<offset>-<oldtype>+<newtype>
[KNL,ACPI] Convert memory within the specified region
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 6ebfabd..b6512dc 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -56,6 +56,8 @@
s64 memstart_addr __ro_after_init = -1;
EXPORT_SYMBOL(memstart_addr);
+phys_addr_t start_at, mem_size;
+
#ifdef CONFIG_PIN_MEMORY
struct resource pin_memory_resource = {
.name = "Pin memory",
@@ -442,6 +444,10 @@ static int __init parse_memmap_one(char *p)
start_at = memparse(p + 1, &p);
memblock_reserve(start_at, mem_size);
memblock_mark_memmap(start_at, mem_size);
+ } else if (*p == '!') {
+ start_at = memparse(p+1, &p);
+ pmem_start = start_at;
+ pmem_size = mem_size;
} else
pr_info("Unrecognized memmap option, please check the parameter.\n");
--
2.9.5
Hello!
The Kernel SIG invites you to a Zoom meeting (auto-recorded) to be held at 2021-12-31 14:00.
Subject: openEuler kernel tech-sharing session #17 & biweekly meeting
Agenda:
14:10-15:30 openEuler kernel tech-sharing session #17 - introduction to cgroup
Meeting link: https://us06web.zoom.us/j/83118407147?pwd=elVzMVhGc2YzaDQ1TGorN2tPa1NhZz09
Note: You are advised to change your participant name after joining the meeting, or use your gitee.com ID.
More information: https://openeuler.org/en/
Hello!
The openEuler Kernel SIG invites you to a Zoom meeting (auto-recorded) to be held at 2022-01-14 14:00.
Subject: openEuler kernel SIG biweekly meeting
Agenda:
Topic: add a mechanism to eBPF to support non-mainline helpers while maintaining a degree of compatibility
Meeting link: https://us06web.zoom.us/j/86252266487?pwd=VjRNTVJzdkVwaXl6K2V1VUcxTWpEQT09
Note: You are advised to change your participant name after joining the meeting, or use your gitee.com ID.
More information: https://openeuler.org/en/
From: Hannes Reinecke <hare(a)suse.de>
mainline inclusion
from mainline-5.14-rc1
commit ced202f7bd78eb6a79c441a8b217e0f3d38bccfc
category: bugfix
bugzilla: 185813
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
---------------------------
Return the actual error code in __scsi_execute() (which, according to the
documentation, should have happened anyway). And audit all callers to cope
with negative return values from __scsi_execute() and friends.
[mkp: resolve conflict and return bool]
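The resulting caller pattern (illustrative; it mirrors the hunks below)
is to check for a negative return before decoding the SCSI result bytes:
	result = scsi_execute_req(sdev, cmd, DMA_FROM_DEVICE, buffer, len,
				  &sshdr, 30 * HZ, 3, NULL);
	if (result < 0)
		return result;	/* submission failed; no SCSI status to decode */
	if (driver_byte(result) == DRIVER_SENSE && scsi_sense_valid(&sshdr))
		scsi_print_sense_hdr(sdev, NULL, &sshdr);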
Link: https://lore.kernel.org/r/20210427083046.31620-7-hare@suse.de
Reviewed-by: Bart Van Assche <bvanassche(a)acm.org>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Signed-off-by: Hannes Reinecke <hare(a)suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
Signed-off-by: Zhang Yi <yi.zhang(a)huawei.com>
---
drivers/ata/libata-scsi.c | 8 ++++++++
drivers/scsi/ch.c | 3 ++-
drivers/scsi/cxlflash/superpipe.c | 2 +-
drivers/scsi/scsi.c | 2 ++
drivers/scsi/scsi_ioctl.c | 4 +++-
drivers/scsi/scsi_lib.c | 15 +++++++++------
drivers/scsi/scsi_scan.c | 4 ++--
drivers/scsi/scsi_transport_spi.c | 2 +-
drivers/scsi/sd.c | 15 +++++++++------
drivers/scsi/sd_zbc.c | 2 +-
drivers/scsi/sr_ioctl.c | 4 ++++
drivers/scsi/ufs/ufshcd.c | 2 +-
drivers/scsi/virtio_scsi.c | 2 +-
include/scsi/scsi.h | 7 +++++--
14 files changed, 49 insertions(+), 23 deletions(-)
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 48b8934970f3..c5129b9e3afd 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -409,6 +409,10 @@ int ata_cmd_ioctl(struct scsi_device *scsidev, void __user *arg)
cmd_result = scsi_execute(scsidev, scsi_cmd, data_dir, argbuf, argsize,
sensebuf, &sshdr, (10*HZ), 5, 0, 0, NULL);
+ if (cmd_result < 0) {
+ rc = cmd_result;
+ goto error;
+ }
if (driver_byte(cmd_result) == DRIVER_SENSE) {/* sense data available */
u8 *desc = sensebuf + 8;
cmd_result &= ~(0xFF<<24); /* DRIVER_SENSE is not an error */
@@ -490,6 +494,10 @@ int ata_task_ioctl(struct scsi_device *scsidev, void __user *arg)
cmd_result = scsi_execute(scsidev, scsi_cmd, DMA_NONE, NULL, 0,
sensebuf, &sshdr, (10*HZ), 5, 0, 0, NULL);
+ if (cmd_result < 0) {
+ rc = cmd_result;
+ goto error;
+ }
if (driver_byte(cmd_result) == DRIVER_SENSE) {/* sense data available */
u8 *desc = sensebuf + 8;
cmd_result &= ~(0xFF<<24); /* DRIVER_SENSE is not an error */
diff --git a/drivers/scsi/ch.c b/drivers/scsi/ch.c
index cb74ab1ae5a4..0e7d1214c3d8 100644
--- a/drivers/scsi/ch.c
+++ b/drivers/scsi/ch.c
@@ -198,7 +198,8 @@ ch_do_scsi(scsi_changer *ch, unsigned char *cmd, int cmd_len,
result = scsi_execute_req(ch->device, cmd, direction, buffer,
buflength, &sshdr, timeout * HZ,
MAX_RETRIES, NULL);
-
+ if (result < 0)
+ return result;
if (driver_byte(result) == DRIVER_SENSE) {
if (debug)
scsi_print_sense_hdr(ch->device, ch->name, &sshdr);
diff --git a/drivers/scsi/cxlflash/superpipe.c b/drivers/scsi/cxlflash/superpipe.c
index 5dddf67dfa24..3ebb7ac86a64 100644
--- a/drivers/scsi/cxlflash/superpipe.c
+++ b/drivers/scsi/cxlflash/superpipe.c
@@ -369,7 +369,7 @@ static int read_cap16(struct scsi_device *sdev, struct llun_info *lli)
goto out;
}
- if (driver_byte(result) == DRIVER_SENSE) {
+ if (result > 0 && driver_byte(result) == DRIVER_SENSE) {
result &= ~(0xFF<<24); /* DRIVER_SENSE is not an error */
if (result & SAM_STAT_CHECK_CONDITION) {
switch (sshdr.sense_key) {
diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index 6ad834d61d4c..f561caabc771 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -495,6 +495,8 @@ int scsi_report_opcode(struct scsi_device *sdev, unsigned char *buffer,
result = scsi_execute_req(sdev, cmd, DMA_FROM_DEVICE, buffer, len,
&sshdr, 30 * HZ, 3, NULL);
+ if (result < 0)
+ return result;
if (result && scsi_sense_valid(&sshdr) &&
sshdr.sense_key == ILLEGAL_REQUEST &&
(sshdr.asc == 0x20 || sshdr.asc == 0x24) && sshdr.ascq == 0x00)
diff --git a/drivers/scsi/scsi_ioctl.c b/drivers/scsi/scsi_ioctl.c
index 14872c9dc78c..d34e1b41dc71 100644
--- a/drivers/scsi/scsi_ioctl.c
+++ b/drivers/scsi/scsi_ioctl.c
@@ -101,6 +101,8 @@ static int ioctl_internal_command(struct scsi_device *sdev, char *cmd,
SCSI_LOG_IOCTL(2, sdev_printk(KERN_INFO, sdev,
"Ioctl returned 0x%x\n", result));
+ if (result < 0)
+ goto out;
if (driver_byte(result) == DRIVER_SENSE &&
scsi_sense_valid(&sshdr)) {
switch (sshdr.sense_key) {
@@ -133,7 +135,7 @@ static int ioctl_internal_command(struct scsi_device *sdev, char *cmd,
break;
}
}
-
+out:
SCSI_LOG_IOCTL(2, sdev_printk(KERN_INFO, sdev,
"IOCTL Releasing command\n"));
return result;
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 0a265a3e3dc6..9352a309fe55 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -245,20 +245,23 @@ int __scsi_execute(struct scsi_device *sdev, const unsigned char *cmd,
{
struct request *req;
struct scsi_request *rq;
- int ret = DRIVER_ERROR << 24;
+ int ret;
req = blk_get_request(sdev->request_queue,
data_direction == DMA_TO_DEVICE ?
REQ_OP_SCSI_OUT : REQ_OP_SCSI_IN,
rq_flags & RQF_PM ? BLK_MQ_REQ_PM : 0);
if (IS_ERR(req))
- return ret;
- rq = scsi_req(req);
+ return PTR_ERR(req);
- if (bufflen && blk_rq_map_kern(sdev->request_queue, req,
- buffer, bufflen, GFP_NOIO))
- goto out;
+ rq = scsi_req(req);
+ if (bufflen) {
+ ret = blk_rq_map_kern(sdev->request_queue, req,
+ buffer, bufflen, GFP_NOIO);
+ if (ret)
+ goto out;
+ }
rq->cmd_len = COMMAND_SIZE(cmd[0]);
memcpy(rq->cmd, cmd, rq->cmd_len);
rq->retries = retries;
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 8e474b145249..57002529174f 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -599,7 +599,7 @@ static int scsi_probe_lun(struct scsi_device *sdev, unsigned char *inq_result,
"scsi scan: INQUIRY %s with code 0x%x\n",
result ? "failed" : "successful", result));
- if (result) {
+ if (result > 0) {
/*
* not-ready to ready transition [asc/ascq=0x28/0x0]
* or power-on, reset [asc/ascq=0x29/0x0], continue.
@@ -614,7 +614,7 @@ static int scsi_probe_lun(struct scsi_device *sdev, unsigned char *inq_result,
(sshdr.ascq == 0))
continue;
}
- } else {
+ } else if (result == 0) {
/*
* if nothing was transferred, we try
* again. It's a workaround for some USB
diff --git a/drivers/scsi/scsi_transport_spi.c b/drivers/scsi/scsi_transport_spi.c
index c37dd15d16d2..a9bb7ae2fafd 100644
--- a/drivers/scsi/scsi_transport_spi.c
+++ b/drivers/scsi/scsi_transport_spi.c
@@ -127,7 +127,7 @@ static int spi_execute(struct scsi_device *sdev, const void *cmd,
REQ_FAILFAST_TRANSPORT |
REQ_FAILFAST_DRIVER,
RQF_PM, NULL);
- if (driver_byte(result) != DRIVER_SENSE ||
+ if (result < 0 || driver_byte(result) != DRIVER_SENSE ||
sshdr->sense_key != UNIT_ATTENTION)
break;
}
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 97713f595a29..3fc184e9702f 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1673,7 +1673,7 @@ static unsigned int sd_check_events(struct gendisk *disk, unsigned int clearing)
&sshdr);
/* failed to execute TUR, assume media not present */
- if (host_byte(retval)) {
+ if (retval < 0 || host_byte(retval)) {
set_media_not_present(sdkp);
goto out;
}
@@ -1734,6 +1734,9 @@ static int sd_sync_cache(struct scsi_disk *sdkp, struct scsi_sense_hdr *sshdr)
if (res) {
sd_print_result(sdkp, "Synchronize Cache(10) failed", res);
+ if (res < 0)
+ return res;
+
if (driver_byte(res) == DRIVER_SENSE)
sd_print_sense_hdr(sdkp, sshdr);
@@ -1842,7 +1845,7 @@ static int sd_pr_command(struct block_device *bdev, u8 sa,
result = scsi_execute_req(sdev, cmd, DMA_TO_DEVICE, &data, sizeof(data),
&sshdr, SD_TIMEOUT, sdkp->max_retries, NULL);
- if (driver_byte(result) == DRIVER_SENSE &&
+ if (result > 0 && driver_byte(result) == DRIVER_SENSE &&
scsi_sense_valid(&sshdr)) {
sdev_printk(KERN_INFO, sdev, "PR command failed: %d\n", result);
scsi_print_sense_hdr(sdev, NULL, &sshdr);
@@ -2194,7 +2197,7 @@ sd_spinup_disk(struct scsi_disk *sdkp)
((driver_byte(the_result) == DRIVER_SENSE) &&
sense_valid && sshdr.sense_key == UNIT_ATTENTION)));
- if (driver_byte(the_result) != DRIVER_SENSE) {
+ if (the_result < 0 || driver_byte(the_result) != DRIVER_SENSE) {
/* no sense, TUR either succeeded or failed
* with a status error */
if(!spintime && !scsi_status_is_good(the_result)) {
@@ -2379,7 +2382,7 @@ static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
if (media_not_present(sdkp, &sshdr))
return -ENODEV;
- if (the_result) {
+ if (the_result > 0) {
sense_valid = scsi_sense_valid(&sshdr);
if (sense_valid &&
sshdr.sense_key == ILLEGAL_REQUEST &&
@@ -2464,7 +2467,7 @@ static int read_capacity_10(struct scsi_disk *sdkp, struct scsi_device *sdp,
if (media_not_present(sdkp, &sshdr))
return -ENODEV;
- if (the_result) {
+ if (the_result > 0) {
sense_valid = scsi_sense_valid(&sshdr);
if (sense_valid &&
sshdr.sense_key == UNIT_ATTENTION &&
@@ -3614,7 +3617,7 @@ static int sd_start_stop_device(struct scsi_disk *sdkp, int start)
SD_TIMEOUT, sdkp->max_retries, 0, RQF_PM, NULL);
if (res) {
sd_print_result(sdkp, "Start/Stop Unit failed", res);
- if (driver_byte(res) == DRIVER_SENSE)
+ if (res > 0 && driver_byte(res) == DRIVER_SENSE)
sd_print_sense_hdr(sdkp, &sshdr);
if (scsi_sense_valid(&sshdr) &&
/* 0x3a is medium not present */
diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 01088f333dbc..6834e029906a 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -116,7 +116,7 @@ static int sd_zbc_do_report_zones(struct scsi_disk *sdkp, unsigned char *buf,
sd_printk(KERN_ERR, sdkp,
"REPORT ZONES start lba %llu failed\n", lba);
sd_print_result(sdkp, "REPORT ZONES", result);
- if (driver_byte(result) == DRIVER_SENSE &&
+ if (result > 0 && driver_byte(result) == DRIVER_SENSE &&
scsi_sense_valid(&sshdr))
sd_print_sense_hdr(sdkp, &sshdr);
return -EIO;
diff --git a/drivers/scsi/sr_ioctl.c b/drivers/scsi/sr_ioctl.c
index ffcf902da390..4c1de11e69fb 100644
--- a/drivers/scsi/sr_ioctl.c
+++ b/drivers/scsi/sr_ioctl.c
@@ -205,6 +205,10 @@ int sr_do_ioctl(Scsi_CD *cd, struct packet_command *cgc)
cgc->timeout, IOCTL_RETRIES, 0, 0, NULL);
/* Minimal error checking. Ignore cases we know about, and report the rest. */
+ if (result < 0) {
+ err = result;
+ goto out;
+ }
if (driver_byte(result) != 0) {
switch (sshdr->sense_key) {
case UNIT_ATTENTION:
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 930f35863cbb..b8462f54cb72 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -8354,7 +8354,7 @@ static int ufshcd_set_dev_pwr_mode(struct ufs_hba *hba,
sdev_printk(KERN_WARNING, sdp,
"START_STOP failed for power mode: %d, result %x\n",
pwr_mode, ret);
- if (driver_byte(ret) == DRIVER_SENSE)
+ if (ret > 0 && driver_byte(ret) == DRIVER_SENSE)
scsi_print_sense_hdr(sdp, NULL, &sshdr);
}
diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index 6dac58ae6120..fabbad5cda60 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -355,7 +355,7 @@ static void virtscsi_rescan_hotunplug(struct virtio_scsi *vscsi)
if (result == 0 && inq_result[0] >> 5) {
/* PQ indicates the LUN is not attached */
scsi_remove_device(sdev);
- } else if (host_byte(result) == DID_BAD_TARGET) {
+ } else if (result > 0 && host_byte(result) == DID_BAD_TARGET) {
/*
* If all LUNs of a virtio-scsi device are unplugged
* it will respond with BAD TARGET on any INQUIRY
diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h
index 7ac50dbee475..6f227b610681 100644
--- a/include/scsi/scsi.h
+++ b/include/scsi/scsi.h
@@ -256,10 +256,13 @@ static inline int scsi_is_wlun(u64 lun)
* This returns true for known good conditions that may be treated as
* command completed normally
*/
-static inline int scsi_status_is_good(int status)
+static inline bool scsi_status_is_good(int status)
{
+ if (status < 0)
+ return false;
+
if (host_byte(status) == DID_NO_CONNECT)
- return 0;
+ return false;
/*
* FIXME: bit0 is listed as reserved in SCSI-2, but is
--
2.31.1
Ramaxel inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4QLBP
CVE: NA
Changes:
1. Read lb mode from chip
2. Disable fake vf function
3. Remove unnecessary code
Signed-off-by: Yanling Song <songyl(a)ramaxel.com>
Reviewed-by: Yun Xu <xuyun(a)ramaxel.com>
---
.../ethernet/ramaxel/spnic/hw/sphw_cfg_cmd.h | 3 +-
.../ethernet/ramaxel/spnic/hw/sphw_hw_cfg.c | 3 +
.../ethernet/ramaxel/spnic/hw/sphw_hw_cfg.h | 3 +
drivers/scsi/spfc/hw/spfc_cqm_bat_cla.c | 43 +--
drivers/scsi/spfc/hw/spfc_cqm_bitmap_table.c | 10 +-
drivers/scsi/spfc/hw/spfc_cqm_main.c | 299 +-----------------
drivers/scsi/spfc/hw/spfc_cqm_main.h | 3 -
drivers/scsi/spfc/hw/spfc_cqm_object.c | 21 --
8 files changed, 29 insertions(+), 356 deletions(-)
diff --git a/drivers/net/ethernet/ramaxel/spnic/hw/sphw_cfg_cmd.h b/drivers/net/ethernet/ramaxel/spnic/hw/sphw_cfg_cmd.h
index 23644958e33a..63b89e71c552 100644
--- a/drivers/net/ethernet/ramaxel/spnic/hw/sphw_cfg_cmd.h
+++ b/drivers/net/ethernet/ramaxel/spnic/hw/sphw_cfg_cmd.h
@@ -40,7 +40,8 @@ struct cfg_cmd_dev_cap {
u8 sf_svc_attr;
u8 func_sf_en;
- u16 rsvd_sf;
+ u8 lb_mode;
+ u8 smf_pg;
u32 max_conn_num;
u16 max_stick2cache_num;
diff --git a/drivers/net/ethernet/ramaxel/spnic/hw/sphw_hw_cfg.c b/drivers/net/ethernet/ramaxel/spnic/hw/sphw_hw_cfg.c
index 5d6a53307f0a..4b2674ec66f0 100644
--- a/drivers/net/ethernet/ramaxel/spnic/hw/sphw_hw_cfg.c
+++ b/drivers/net/ethernet/ramaxel/spnic/hw/sphw_hw_cfg.c
@@ -112,6 +112,9 @@ static void parse_pub_res_cap(struct sphw_hwdev *hwdev,
else
cap->sf_en = false;
+ cap->lb_mode = dev_cap->lb_mode;
+ cap->smf_pg = dev_cap->smf_pg;
+
cap->timer_en = (u8)timer_enable; /* timer enable */
cap->host_oq_id_mask_val = dev_cap->host_oq_id_mask_val;
cap->max_connect_num = dev_cap->max_conn_num;
diff --git a/drivers/net/ethernet/ramaxel/spnic/hw/sphw_hw_cfg.h b/drivers/net/ethernet/ramaxel/spnic/hw/sphw_hw_cfg.h
index 1147beeb76ad..1b48e0991563 100644
--- a/drivers/net/ethernet/ramaxel/spnic/hw/sphw_hw_cfg.h
+++ b/drivers/net/ethernet/ramaxel/spnic/hw/sphw_hw_cfg.h
@@ -201,6 +201,9 @@ struct service_cap {
u8 timer_en; /* 0:disable, 1:enable */
u8 bloomfilter_en; /* 0:disable, 1:enable*/
+ u8 lb_mode;
+ u8 smf_pg;
+
/* For test */
u32 test_mode;
u32 test_qpc_num;
diff --git a/drivers/scsi/spfc/hw/spfc_cqm_bat_cla.c b/drivers/scsi/spfc/hw/spfc_cqm_bat_cla.c
index 30fb56a9bfed..0c1d97d9e3e6 100644
--- a/drivers/scsi/spfc/hw/spfc_cqm_bat_cla.c
+++ b/drivers/scsi/spfc/hw/spfc_cqm_bat_cla.c
@@ -53,17 +53,6 @@ cqm_bat_fill_cla_common_gpa(struct cqm_handle *cqm_handle,
gpa.acs_spu_en = 0;
}
- /* In fake mode, fake_vf_en in the GPA address of the BAT
- * must be set to 1.
- */
- if (cqm_handle->func_capability.fake_func_type == CQM_FAKE_FUNC_CHILD) {
- gpa.fake_vf_en = 1;
- func_attr = &cqm_handle->parent_cqm_handle->func_attribute;
- gpa.pf_id = func_attr->func_global_idx;
- } else {
- gpa.fake_vf_en = 0;
- }
-
memcpy(&cla_gpa_h, &gpa, sizeof(u32));
bat_entry_standerd->cla_gpa_h = cla_gpa_h;
@@ -379,13 +368,8 @@ s32 cqm_bat_update(struct cqm_handle *cqm_handle)
CQM_PTR_CHECK_RET(buf_in, CQM_FAIL, CQM_ALLOC_FAIL(buf_in));
buf_in->size = sizeof(struct cqm_cmdq_bat_update);
- /* In non-fake mode, func_id is set to 0xffff, indicating the current func.
- * In fake mode, the value of func_id is specified. This is a fake func_id.
- */
- if (cqm_handle->func_capability.fake_func_type == CQM_FAKE_FUNC_CHILD)
- func_id = cqm_handle->func_attribute.func_global_idx;
- else
- func_id = 0xffff;
+ /* In non-fake mode, func_id is set to 0xffff */
+ func_id = 0xffff;
/* The LB scenario is supported.
* The normal mode is the traditional mode and is configured on SMF0.
@@ -545,19 +529,6 @@ s32 cqm_cla_fill_buf(struct cqm_handle *cqm_handle, struct cqm_buf *cla_base_buf
spu_en = 0;
}
- /* fake enable */
- if (cqm_handle->func_capability.fake_func_type ==
- CQM_FAKE_FUNC_CHILD) {
- fake_en = 1ULL << 62;
- func_attr =
- &cqm_handle->parent_cqm_handle->func_attribute;
- pf_id = func_attr->func_global_idx;
- pf_id = (pf_id & 0x1f) << 57;
- } else {
- fake_en = 0;
- pf_id = 0;
- }
-
*base = (((((cla_sub_buf->buf_list[i].pa & CQM_CHIP_GPA_MASK) |
spu_en) |
fake_en) |
@@ -1248,14 +1219,8 @@ s32 cqm_cla_update(struct cqm_handle *cqm_handle, struct cqm_buf_list *buf_node_
}
}
- /* In non-fake mode, set func_id to 0xffff.
- * Indicates the current func fake mode, set func_id to the
- * specified value, This is a fake func_id.
- */
- if (cqm_handle->func_capability.fake_func_type == CQM_FAKE_FUNC_CHILD)
- cmd.func_id = cqm_handle->func_attribute.func_global_idx;
- else
- cmd.func_id = 0xffff;
+ /* In non-fake mode, set func_id to 0xffff. */
+ cmd.func_id = 0xffff;
/* Mode 0 is hashed to 4 SMF engines (excluding PPF) by func ID. */
if (cqm_handle->func_capability.lb_mode == CQM_LB_MODE_NORMAL ||
diff --git a/drivers/scsi/spfc/hw/spfc_cqm_bitmap_table.c b/drivers/scsi/spfc/hw/spfc_cqm_bitmap_table.c
index 4e482776a14f..21100e8db8f4 100644
--- a/drivers/scsi/spfc/hw/spfc_cqm_bitmap_table.c
+++ b/drivers/scsi/spfc/hw/spfc_cqm_bitmap_table.c
@@ -405,14 +405,8 @@ s32 cqm_cla_cache_invalid(struct cqm_handle *cqm_handle, dma_addr_t gpa, u32 cac
cmd.gpa_h = CQM_ADDR_HI(gpa);
cmd.gpa_l = CQM_ADDR_LW(gpa);
- /* In non-fake mode, set func_id to 0xffff.
- * Indicate the current func fake mode.
- * The value of func_id is a fake func ID.
- */
- if (cqm_handle->func_capability.fake_func_type == CQM_FAKE_FUNC_CHILD)
- cmd.func_id = cqm_handle->func_attribute.func_global_idx;
- else
- cmd.func_id = 0xffff;
+ /* In non-fake mode, set func_id to 0xffff. */
+ cmd.func_id = 0xffff;
/* Mode 0 is hashed to 4 SMF engines (excluding PPF) by func ID. */
if (cqm_handle->func_capability.lb_mode == CQM_LB_MODE_NORMAL ||
diff --git a/drivers/scsi/spfc/hw/spfc_cqm_main.c b/drivers/scsi/spfc/hw/spfc_cqm_main.c
index a4c8e60971b1..52cc2c7838e9 100644
--- a/drivers/scsi/spfc/hw/spfc_cqm_main.c
+++ b/drivers/scsi/spfc/hw/spfc_cqm_main.c
@@ -11,21 +11,8 @@
#include "sphw_crm.h"
#include "sphw_hw.h"
#include "sphw_hw_cfg.h"
-
#include "spfc_cqm_main.h"
-static unsigned char cqm_lb_mode = CQM_LB_MODE_NORMAL;
-module_param(cqm_lb_mode, byte, 0644);
-MODULE_PARM_DESC(cqm_lb_mode, "for cqm lb mode (default=0xff)");
-
-static unsigned char cqm_fake_mode = CQM_FAKE_MODE_DISABLE;
-module_param(cqm_fake_mode, byte, 0644);
-MODULE_PARM_DESC(cqm_fake_mode, "for cqm fake mode (default=0 disable)");
-
-static unsigned char cqm_platform_mode = CQM_FPGA_MODE;
-module_param(cqm_platform_mode, byte, 0644);
-MODULE_PARM_DESC(cqm_platform_mode, "for cqm platform mode (default=0 FPGA)");
-
s32 cqm3_init(void *ex_handle)
{
struct sphw_hwdev *handle = (struct sphw_hwdev *)ex_handle;
@@ -63,19 +50,6 @@ s32 cqm3_init(void *ex_handle)
goto err1;
}
- /* In FAKE mode, only the bitmap of the timer of the function is
- * enabled, and resources are not initialized. Otherwise, the
- * configuration of the fake function is overwritten.
- */
- if (cqm_handle->func_capability.fake_func_type == CQM_FAKE_FUNC_CHILD_CONFLICT) {
- if (sphw_func_tmr_bitmap_set(ex_handle, true) != CQM_SUCCESS)
- cqm_err(handle->dev_hdl, "Timer start: enable timer bitmap failed\n");
-
- handle->cqm_hdl = NULL;
- kfree(cqm_handle);
- return CQM_SUCCESS;
- }
-
/* Initialize memory entries such as BAT, CLA, and bitmap. */
if (cqm_mem_init(ex_handle) != CQM_SUCCESS) {
cqm_err(handle->dev_hdl, CQM_FUNCTION_FAIL(cqm_mem_init));
@@ -287,64 +261,6 @@ void cqm_service_capability_init(struct cqm_handle *cqm_handle,
cqm_service_capability_init_fc(cqm_handle, (void *)service_capability);
}
-s32 cqm_get_fake_func_type(struct cqm_handle *cqm_handle)
-{
- struct cqm_func_capability *func_cap = &cqm_handle->func_capability;
- u32 parent_func, child_func_start, child_func_number, i;
- u32 idx = cqm_handle->func_attribute.func_global_idx;
-
- /* Currently, only one set of fake configurations is implemented.
- * fake_cfg_number = 1
- */
- for (i = 0; i < func_cap->fake_cfg_number; i++) {
- parent_func = func_cap->fake_cfg[i].parent_func;
- child_func_start = func_cap->fake_cfg[i].child_func_start;
- child_func_number = func_cap->fake_cfg[i].child_func_number;
-
- if (idx == parent_func) {
- return CQM_FAKE_FUNC_PARENT;
- } else if ((idx >= child_func_start) &&
- (idx < (child_func_start + child_func_number))) {
- return CQM_FAKE_FUNC_CHILD_CONFLICT;
- }
- }
-
- return CQM_FAKE_FUNC_NORMAL;
-}
-
-s32 cqm_get_child_func_start(struct cqm_handle *cqm_handle)
-{
- struct cqm_func_capability *func_cap = &cqm_handle->func_capability;
- struct sphw_func_attr *func_attr = &cqm_handle->func_attribute;
- u32 i;
-
- /* Currently, only one set of fake configurations is implemented.
- * fake_cfg_number = 1
- */
- for (i = 0; i < func_cap->fake_cfg_number; i++) {
- if (func_attr->func_global_idx ==
- func_cap->fake_cfg[i].parent_func)
- return (s32)(func_cap->fake_cfg[i].child_func_start);
- }
-
- return CQM_FAIL;
-}
-
-s32 cqm_get_child_func_number(struct cqm_handle *cqm_handle)
-{
- struct cqm_func_capability *func_cap = &cqm_handle->func_capability;
- struct sphw_func_attr *func_attr = &cqm_handle->func_attribute;
- u32 i;
-
- for (i = 0; i < func_cap->fake_cfg_number; i++) {
- if (func_attr->func_global_idx ==
- func_cap->fake_cfg[i].parent_func)
- return (s32)(func_cap->fake_cfg[i].child_func_number);
- }
-
- return CQM_FAIL;
-}
-
/* Set func_type in fake_cqm_handle to ppf, pf, or vf. */
void cqm_set_func_type(struct cqm_handle *cqm_handle)
{
@@ -358,41 +274,20 @@ void cqm_set_func_type(struct cqm_handle *cqm_handle)
cqm_handle->func_attribute.func_type = CQM_VF;
}
-void cqm_lb_fake_mode_init(struct cqm_handle *cqm_handle)
+void cqm_lb_fake_mode_init(struct cqm_handle *cqm_handle, struct service_cap *svc_cap)
{
struct cqm_func_capability *func_cap = &cqm_handle->func_capability;
- struct cqm_fake_cfg *cfg = func_cap->fake_cfg;
- func_cap->lb_mode = cqm_lb_mode;
- func_cap->fake_mode = cqm_fake_mode;
+ func_cap->lb_mode = svc_cap->lb_mode;
/* Initializing the LB Mode */
- if (func_cap->lb_mode == CQM_LB_MODE_NORMAL) {
+ if (func_cap->lb_mode == CQM_LB_MODE_NORMAL)
func_cap->smf_pg = 0;
- } else {
- /* The LB mode is tailored on the FPGA.
- * Only SMF0 and SMF2 are instantiated.
- */
- if (cqm_platform_mode == CQM_FPGA_MODE)
- func_cap->smf_pg = 0x5;
- else
- func_cap->smf_pg = 0xF;
- }
+ else
+ func_cap->smf_pg = svc_cap->smf_pg;
- /* Initializing the FAKE Mode */
- if (func_cap->fake_mode == CQM_FAKE_MODE_DISABLE) {
- func_cap->fake_cfg_number = 0;
- func_cap->fake_func_type = CQM_FAKE_FUNC_NORMAL;
- } else {
- func_cap->fake_cfg_number = 1;
-
- /* When configuring fake mode, ensure that the parent function
- * cannot be contained in the child function; otherwise, the
- * system will be initialized repeatedly.
- */
- cfg[0].child_func_start = CQM_FAKE_CFUNC_START;
- func_cap->fake_func_type = cqm_get_fake_func_type(cqm_handle);
- }
+ func_cap->fake_cfg_number = 0;
+ func_cap->fake_func_type = CQM_FAKE_FUNC_NORMAL;
}
s32 cqm_capability_init(void *ex_handle)
@@ -465,10 +360,9 @@ s32 cqm_capability_init(void *ex_handle)
func_cap->gpa_check_enable = true;
- cqm_lb_fake_mode_init(cqm_handle);
+ cqm_lb_fake_mode_init(cqm_handle, service_capability);
cqm_info(handle->dev_hdl, "Cap init: lb_mode=%u\n", func_cap->lb_mode);
cqm_info(handle->dev_hdl, "Cap init: smf_pg=%u\n", func_cap->smf_pg);
- cqm_info(handle->dev_hdl, "Cap init: fake_mode=%u\n", func_cap->fake_mode);
cqm_info(handle->dev_hdl, "Cap init: fake_func_type=%u\n", func_cap->fake_func_type);
cqm_info(handle->dev_hdl, "Cap init: fake_cfg_number=%u\n", func_cap->fake_cfg_number);
@@ -517,153 +411,6 @@ s32 cqm_capability_init(void *ex_handle)
return err;
}
-void cqm_fake_uninit(struct cqm_handle *cqm_handle)
-{
- u32 i;
-
- if (cqm_handle->func_capability.fake_func_type !=
- CQM_FAKE_FUNC_PARENT)
- return;
-
- for (i = 0; i < CQM_FAKE_FUNC_MAX; i++) {
- kfree(cqm_handle->fake_cqm_handle[i]);
- cqm_handle->fake_cqm_handle[i] = NULL;
- }
-}
-
-s32 cqm_fake_init(struct cqm_handle *cqm_handle)
-{
- struct sphw_hwdev *handle = cqm_handle->ex_handle;
- struct cqm_func_capability *func_cap = NULL;
- struct cqm_handle *fake_cqm_handle = NULL;
- struct sphw_func_attr *func_attr = NULL;
- s32 child_func_start, child_func_number;
- u32 i;
-
- func_cap = &cqm_handle->func_capability;
- if (func_cap->fake_func_type != CQM_FAKE_FUNC_PARENT)
- return CQM_SUCCESS;
-
- child_func_start = cqm_get_child_func_start(cqm_handle);
- if (child_func_start == CQM_FAIL) {
- cqm_err(handle->dev_hdl, CQM_WRONG_VALUE(child_func_start));
- return CQM_FAIL;
- }
-
- child_func_number = cqm_get_child_func_number(cqm_handle);
- if (child_func_number == CQM_FAIL) {
- cqm_err(handle->dev_hdl, CQM_WRONG_VALUE(child_func_number));
- return CQM_FAIL;
- }
-
- for (i = 0; i < (u32)child_func_number; i++) {
- fake_cqm_handle = kmalloc(sizeof(*fake_cqm_handle), GFP_KERNEL | __GFP_ZERO);
- if (!fake_cqm_handle) {
- cqm_err(handle->dev_hdl,
- CQM_ALLOC_FAIL(fake_cqm_handle));
- goto err;
- }
-
- /* Copy the attributes of the parent CQM handle to the child CQM
- * handle and modify the values of function.
- */
- memcpy(fake_cqm_handle, cqm_handle, sizeof(struct cqm_handle));
- func_attr = &fake_cqm_handle->func_attribute;
- func_cap = &fake_cqm_handle->func_capability;
- func_attr->func_global_idx = (u16)(child_func_start + i);
- cqm_set_func_type(fake_cqm_handle);
- func_cap->fake_func_type = CQM_FAKE_FUNC_CHILD;
- cqm_info(handle->dev_hdl, "Fake func init: function[%u] type %d(0:PF,1:VF,2:PPF)\n",
- func_attr->func_global_idx, func_attr->func_type);
-
- fake_cqm_handle->parent_cqm_handle = cqm_handle;
- cqm_handle->fake_cqm_handle[i] = fake_cqm_handle;
- }
-
- return CQM_SUCCESS;
-
-err:
- cqm_fake_uninit(cqm_handle);
- return CQM_FAIL;
-}
-
-void cqm_fake_mem_uninit(struct cqm_handle *cqm_handle)
-{
- struct sphw_hwdev *handle = cqm_handle->ex_handle;
- struct cqm_handle *fake_cqm_handle = NULL;
- s32 child_func_number;
- u32 i;
-
- if (cqm_handle->func_capability.fake_func_type != CQM_FAKE_FUNC_PARENT)
- return;
-
- child_func_number = cqm_get_child_func_number(cqm_handle);
- if (child_func_number == CQM_FAIL) {
- cqm_err(handle->dev_hdl, CQM_WRONG_VALUE(child_func_number));
- return;
- }
-
- for (i = 0; i < (u32)child_func_number; i++) {
- fake_cqm_handle = cqm_handle->fake_cqm_handle[i];
- cqm_object_table_uninit(fake_cqm_handle);
- cqm_bitmap_uninit(fake_cqm_handle);
- cqm_cla_uninit(fake_cqm_handle, CQM_BAT_ENTRY_MAX);
- cqm_bat_uninit(fake_cqm_handle);
- }
-}
-
-s32 cqm_fake_mem_init(struct cqm_handle *cqm_handle)
-{
- struct sphw_hwdev *handle = cqm_handle->ex_handle;
- struct cqm_handle *fake_cqm_handle = NULL;
- s32 child_func_number;
- u32 i;
-
- if (cqm_handle->func_capability.fake_func_type !=
- CQM_FAKE_FUNC_PARENT)
- return CQM_SUCCESS;
-
- child_func_number = cqm_get_child_func_number(cqm_handle);
- if (child_func_number == CQM_FAIL) {
- cqm_err(handle->dev_hdl, CQM_WRONG_VALUE(child_func_number));
- return CQM_FAIL;
- }
-
- for (i = 0; i < (u32)child_func_number; i++) {
- fake_cqm_handle = cqm_handle->fake_cqm_handle[i];
-
- if (cqm_bat_init(fake_cqm_handle) != CQM_SUCCESS) {
- cqm_err(handle->dev_hdl,
- CQM_FUNCTION_FAIL(cqm_bat_init));
- goto err;
- }
-
- if (cqm_cla_init(fake_cqm_handle) != CQM_SUCCESS) {
- cqm_err(handle->dev_hdl,
- CQM_FUNCTION_FAIL(cqm_cla_init));
- goto err;
- }
-
- if (cqm_bitmap_init(fake_cqm_handle) != CQM_SUCCESS) {
- cqm_err(handle->dev_hdl,
- CQM_FUNCTION_FAIL(cqm_bitmap_init));
- goto err;
- }
-
- if (cqm_object_table_init(fake_cqm_handle) != CQM_SUCCESS) {
- cqm_err(handle->dev_hdl,
- CQM_FUNCTION_FAIL(cqm_object_table_init));
- goto err;
- }
- }
-
- return CQM_SUCCESS;
-
-err:
- cqm_fake_mem_uninit(cqm_handle);
- return CQM_FAIL;
-}
-
s32 cqm_mem_init(void *ex_handle)
{
struct sphw_hwdev *handle = (struct sphw_hwdev *)ex_handle;
@@ -671,49 +418,35 @@ s32 cqm_mem_init(void *ex_handle)
cqm_handle = (struct cqm_handle *)(handle->cqm_hdl);
- if (cqm_fake_init(cqm_handle) != CQM_SUCCESS) {
- cqm_err(handle->dev_hdl, CQM_FUNCTION_FAIL(cqm_fake_init));
- return CQM_FAIL;
- }
-
- if (cqm_fake_mem_init(cqm_handle) != CQM_SUCCESS) {
- cqm_err(handle->dev_hdl, CQM_FUNCTION_FAIL(cqm_fake_mem_init));
- goto err1;
- }
-
if (cqm_bat_init(cqm_handle) != CQM_SUCCESS) {
cqm_err(handle->dev_hdl, CQM_FUNCTION_FAIL(cqm_bat_init));
- goto err2;
+ return CQM_FAIL;
}
if (cqm_cla_init(cqm_handle) != CQM_SUCCESS) {
cqm_err(handle->dev_hdl, CQM_FUNCTION_FAIL(cqm_cla_init));
- goto err3;
+ goto err1;
}
if (cqm_bitmap_init(cqm_handle) != CQM_SUCCESS) {
cqm_err(handle->dev_hdl, CQM_FUNCTION_FAIL(cqm_bitmap_init));
- goto err4;
+ goto err2;
}
if (cqm_object_table_init(cqm_handle) != CQM_SUCCESS) {
cqm_err(handle->dev_hdl,
CQM_FUNCTION_FAIL(cqm_object_table_init));
- goto err5;
+ goto err3;
}
return CQM_SUCCESS;
-err5:
- cqm_bitmap_uninit(cqm_handle);
-err4:
- cqm_cla_uninit(cqm_handle, CQM_BAT_ENTRY_MAX);
err3:
- cqm_bat_uninit(cqm_handle);
+ cqm_bitmap_uninit(cqm_handle);
err2:
- cqm_fake_mem_uninit(cqm_handle);
+ cqm_cla_uninit(cqm_handle, CQM_BAT_ENTRY_MAX);
err1:
- cqm_fake_uninit(cqm_handle);
+ cqm_bat_uninit(cqm_handle);
return CQM_FAIL;
}
@@ -728,8 +461,6 @@ void cqm_mem_uninit(void *ex_handle)
cqm_bitmap_uninit(cqm_handle);
cqm_cla_uninit(cqm_handle, CQM_BAT_ENTRY_MAX);
cqm_bat_uninit(cqm_handle);
- cqm_fake_mem_uninit(cqm_handle);
- cqm_fake_uninit(cqm_handle);
}
s32 cqm_event_init(void *ex_handle)
diff --git a/drivers/scsi/spfc/hw/spfc_cqm_main.h b/drivers/scsi/spfc/hw/spfc_cqm_main.h
index 1b8cf8bdb3b7..cf10d7f5c339 100644
--- a/drivers/scsi/spfc/hw/spfc_cqm_main.h
+++ b/drivers/scsi/spfc/hw/spfc_cqm_main.h
@@ -328,9 +328,6 @@ void cqm_mem_uninit(void *ex_handle);
s32 cqm_event_init(void *ex_handle);
void cqm_event_uninit(void *ex_handle);
u8 cqm_aeq_callback(void *ex_handle, u8 event, u8 *data);
-s32 cqm_get_fake_func_type(struct cqm_handle *cqm_handle);
-s32 cqm_get_child_func_start(struct cqm_handle *cqm_handle);
-s32 cqm_get_child_func_number(struct cqm_handle *cqm_handle);
s32 cqm3_init(void *ex_handle);
void cqm3_uninit(void *ex_handle);
diff --git a/drivers/scsi/spfc/hw/spfc_cqm_object.c b/drivers/scsi/spfc/hw/spfc_cqm_object.c
index b895d37aebae..165794e9c7e5 100644
--- a/drivers/scsi/spfc/hw/spfc_cqm_object.c
+++ b/drivers/scsi/spfc/hw/spfc_cqm_object.c
@@ -155,8 +155,6 @@ struct cqm_qpc_mpt *cqm3_object_qpc_mpt_create(void *ex_handle, u32 service_type
struct cqm_qpc_mpt_info *qpc_mpt_info = NULL;
struct cqm_handle *cqm_handle = NULL;
s32 ret = CQM_FAIL;
- u32 relative_index;
- u32 fake_func_id;
CQM_PTR_CHECK_RET(ex_handle, NULL, CQM_PTR_NULL(ex_handle));
@@ -180,25 +178,6 @@ struct cqm_qpc_mpt *cqm3_object_qpc_mpt_create(void *ex_handle, u32 service_type
return NULL;
}
- /* fake vf adaption, switch to corresponding VF. */
- if (cqm_handle->func_capability.fake_func_type ==
- CQM_FAKE_FUNC_PARENT) {
- fake_func_id = index / cqm_handle->func_capability.qpc_number;
- relative_index = index % cqm_handle->func_capability.qpc_number;
-
- cqm_info(handle->dev_hdl, "qpc create: fake_func_id=%u, relative_index=%u\n",
- fake_func_id, relative_index);
-
- if ((s32)fake_func_id >=
- cqm_get_child_func_number(cqm_handle)) {
- cqm_err(handle->dev_hdl, CQM_WRONG_VALUE(fake_func_id));
- return NULL;
- }
-
- index = relative_index;
- cqm_handle = cqm_handle->fake_cqm_handle[fake_func_id];
- }
-
qpc_mpt_info = kmalloc(sizeof(*qpc_mpt_info), GFP_ATOMIC | __GFP_ZERO);
CQM_PTR_CHECK_RET(qpc_mpt_info, NULL, CQM_ALLOC_FAIL(qpc_mpt_info));
--
2.32.0
1. Remove unused functions about CEQ
2. Remove unused CLP hardware channels
3. Remove the code for polling mode
4. Remove the code for little-endian/big-endian conversion
Yanling Song (4):
net/spnic: Remove unused functions about ceq
net/spnic: Remove unused clp hardware channels
net/spnic: Remove the code of polling mode
net/spnic: Remove the code about little endian and big endian conversion
.../net/ethernet/ramaxel/spnic/hw/sphw_cmdq.c | 58 +--
.../ethernet/ramaxel/spnic/hw/sphw_common.h | 12 -
.../net/ethernet/ramaxel/spnic/hw/sphw_crm.h | 2 -
.../net/ethernet/ramaxel/spnic/hw/sphw_csr.h | 13 -
.../net/ethernet/ramaxel/spnic/hw/sphw_eqs.c | 104 +---
.../net/ethernet/ramaxel/spnic/hw/sphw_hw.h | 2 -
.../ethernet/ramaxel/spnic/hw/sphw_hwdev.c | 31 --
.../ethernet/ramaxel/spnic/hw/sphw_hwdev.h | 2 -
.../net/ethernet/ramaxel/spnic/hw/sphw_hwif.c | 27 +-
.../net/ethernet/ramaxel/spnic/hw/sphw_mbox.c | 20 +-
.../net/ethernet/ramaxel/spnic/hw/sphw_mbox.h | 2 -
.../net/ethernet/ramaxel/spnic/hw/sphw_mgmt.c | 487 ------------------
.../net/ethernet/ramaxel/spnic/hw/sphw_mgmt.h | 50 --
.../net/ethernet/ramaxel/spnic/hw/sphw_mt.h | 1 -
.../net/ethernet/ramaxel/spnic/spnic_nic_io.h | 6 +-
.../net/ethernet/ramaxel/spnic/spnic_nic_qp.h | 7 +-
drivers/net/ethernet/ramaxel/spnic/spnic_rx.c | 39 +-
drivers/net/ethernet/ramaxel/spnic/spnic_tx.c | 20 +-
18 files changed, 41 insertions(+), 842 deletions(-)
--
2.23.0
[PATCH openEuler-1.0-LTS] ext4: Fix BUG_ON in ext4_bread when writing quota data
by Yang Yingliang 13 Jan '22
From: Ye Bin <yebin10(a)huawei.com>
mainline inclusion
from mainline-v5.17
commit ce85548ab4295234b4f8e63a0eea0c157d2f6b25
category: bugfix
bugzilla: 185930
CVE: NA
-----------------------------------------------
We got the following issue when running syzkaller:
[ 167.936972] EXT4-fs error (device loop0): __ext4_remount:6314: comm rep: Abort forced by user
[ 167.938306] EXT4-fs (loop0): Remounting filesystem read-only
[ 167.981637] Assertion failure in ext4_getblk() at fs/ext4/inode.c:847: '(EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) || handle != NULL || create == 0'
[ 167.983601] ------------[ cut here ]------------
[ 167.984245] kernel BUG at fs/ext4/inode.c:847!
[ 167.984882] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
[ 167.985624] CPU: 7 PID: 2290 Comm: rep Tainted: G B 5.16.0-rc5-next-20211217+ #123
[ 167.986823] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
[ 167.988590] RIP: 0010:ext4_getblk+0x17e/0x504
[ 167.989189] Code: c6 01 74 28 49 c7 c0 a0 a3 5c 9b b9 4f 03 00 00 48 c7 c2 80 9c 5c 9b 48 c7 c6 40 b6 5c 9b 48 c7 c7 20 a4 5c 9b e8 77 e3 fd ff <0f> 0b 8b 04 244
[ 167.991679] RSP: 0018:ffff8881736f7398 EFLAGS: 00010282
[ 167.992385] RAX: 0000000000000094 RBX: 1ffff1102e6dee75 RCX: 0000000000000000
[ 167.993337] RDX: 0000000000000001 RSI: ffffffff9b6e29e0 RDI: ffffed102e6dee66
[ 167.994292] RBP: ffff88816a076210 R08: 0000000000000094 R09: ffffed107363fa09
[ 167.995252] R10: ffff88839b1fd047 R11: ffffed107363fa08 R12: ffff88816a0761e8
[ 167.996205] R13: 0000000000000000 R14: 0000000000000021 R15: 0000000000000001
[ 167.997158] FS: 00007f6a1428c740(0000) GS:ffff88839b000000(0000) knlGS:0000000000000000
[ 167.998238] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 167.999025] CR2: 00007f6a140716c8 CR3: 0000000133216000 CR4: 00000000000006e0
[ 167.999987] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 168.000944] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 168.001899] Call Trace:
[ 168.002235] <TASK>
[ 168.007167] ext4_bread+0xd/0x53
[ 168.007612] ext4_quota_write+0x20c/0x5c0
[ 168.010457] write_blk+0x100/0x220
[ 168.010944] remove_free_dqentry+0x1c6/0x440
[ 168.011525] free_dqentry.isra.0+0x565/0x830
[ 168.012133] remove_tree+0x318/0x6d0
[ 168.014744] remove_tree+0x1eb/0x6d0
[ 168.017346] remove_tree+0x1eb/0x6d0
[ 168.019969] remove_tree+0x1eb/0x6d0
[ 168.022128] qtree_release_dquot+0x291/0x340
[ 168.023297] v2_release_dquot+0xce/0x120
[ 168.023847] dquot_release+0x197/0x3e0
[ 168.024358] ext4_release_dquot+0x22a/0x2d0
[ 168.024932] dqput.part.0+0x1c9/0x900
[ 168.025430] __dquot_drop+0x120/0x190
[ 168.025942] ext4_clear_inode+0x86/0x220
[ 168.026472] ext4_evict_inode+0x9e8/0xa22
[ 168.028200] evict+0x29e/0x4f0
[ 168.028625] dispose_list+0x102/0x1f0
[ 168.029148] evict_inodes+0x2c1/0x3e0
[ 168.030188] generic_shutdown_super+0xa4/0x3b0
[ 168.030817] kill_block_super+0x95/0xd0
[ 168.031360] deactivate_locked_super+0x85/0xd0
[ 168.031977] cleanup_mnt+0x2bc/0x480
[ 168.033062] task_work_run+0xd1/0x170
[ 168.033565] do_exit+0xa4f/0x2b50
[ 168.037155] do_group_exit+0xef/0x2d0
[ 168.037666] __x64_sys_exit_group+0x3a/0x50
[ 168.038237] do_syscall_64+0x3b/0x90
[ 168.038751] entry_SYSCALL_64_after_hwframe+0x44/0xae
In order to reproduce this problem, the following conditions need to be met:
1. Ext4 filesystem with no journal;
2. Filesystem image with incorrect quota data;
3. Filesystem abort forced by the user;
4. Unmount the filesystem.
As in ext4_quota_write:
...
if (EXT4_SB(sb)->s_journal && !handle) {
ext4_msg(sb, KERN_WARNING, "Quota write (off=%llu, len=%llu)"
" cancelled because transaction is not started",
(unsigned long long)off, (unsigned long long)len);
return -EIO;
}
...
We only check whether handle is NULL when the filesystem has a journal.
The handle needs to be checked for NULL even when the filesystem has no
journal.
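For reference, the assertion that fires in ext4_getblk() (quoted from
the trace above) is:
	ASSERT((EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) ||
	       handle != NULL || create == 0);
In no-journal mode journal_current_handle() returns NULL, so the quota
write path reaches ext4_bread() with a NULL handle while create is
non-zero, tripping the assertion.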
Signed-off-by: Ye Bin <yebin10(a)huawei.com>
Reviewed-by: Jan Kara <jack(a)suse.cz>
Link: https://lore.kernel.org/r/20211223015506.297766-1-yebin10@huawei.com
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
Cc: stable(a)kernel.org
Signed-off-by: Ye Bin <yebin10(a)huawei.com>
Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
fs/ext4/super.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 587817ac60990..b89f431ec78d3 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -6245,7 +6245,7 @@ static ssize_t ext4_quota_write(struct super_block *sb, int type,
struct buffer_head *bh;
handle_t *handle = journal_current_handle();
- if (EXT4_SB(sb)->s_journal && !handle) {
+ if (!handle) {
ext4_msg(sb, KERN_WARNING, "Quota write (off=%llu, len=%llu)"
" cancelled because transaction is not started",
(unsigned long long)off, (unsigned long long)len);
--
2.25.1
[PATCH openEuler-1.0-LTS] PM: hibernate: use correct mode for swsusp_close()
by Yang Yingliang 13 Jan '22
From: Thomas Zeitlhofer <thomas.zeitlhofer+lkml(a)ze-it.at>
stable inclusion
from linux-v4.19.219
commit 68945e943519df1532e598fafab16ac54488933f
---------------------------------------------------
[ Upstream commit cefcf24b4d351daf70ecd945324e200d3736821e ]
Commit 39fbef4b0f77 ("PM: hibernate: Get block device exclusively in
swsusp_check()") changed the opening mode of the block device to
(FMODE_READ | FMODE_EXCL).
In the corresponding calls to swsusp_close(), the mode is still just
FMODE_READ which triggers the warning in blkdev_flush_mapping() on
resume from hibernate.
So, use the mode (FMODE_READ | FMODE_EXCL) also when closing the
device.
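A minimal sketch of the pairing rule (illustrative, not part of the
patch): a block device opened exclusively must be released with the
same mode, e.g.
	bdev = blkdev_get_by_dev(dev, FMODE_READ | FMODE_EXCL, holder);
	...
	blkdev_put(bdev, FMODE_READ | FMODE_EXCL);	/* not plain FMODE_READ */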
Fixes: 39fbef4b0f77 ("PM: hibernate: Get block device exclusively in swsusp_check()")
Signed-off-by: Thomas Zeitlhofer <thomas.zeitlhofer+lkml(a)ze-it.at>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
Signed-off-by: Ye Bin <yebin10(a)huawei.com>
Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
kernel/power/hibernate.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
index 72ef9df2a29cd..fc6466351e62f 100644
--- a/kernel/power/hibernate.c
+++ b/kernel/power/hibernate.c
@@ -686,7 +686,7 @@ static int load_image_and_restore(void)
goto Unlock;
error = swsusp_read(&flags);
- swsusp_close(FMODE_READ);
+ swsusp_close(FMODE_READ | FMODE_EXCL);
if (!error)
hibernation_restore(flags & SF_PLATFORM_MODE);
@@ -896,7 +896,7 @@ static int software_resume(void)
/* The snapshot device should not be opened while we're running */
if (!atomic_add_unless(&snapshot_device_available, -1, 0)) {
error = -EBUSY;
- swsusp_close(FMODE_READ);
+ swsusp_close(FMODE_READ | FMODE_EXCL);
goto Unlock;
}
@@ -925,7 +925,7 @@ static int software_resume(void)
pm_pr_dbg("Hibernation image not present or could not be loaded.\n");
return error;
Close_Finish:
- swsusp_close(FMODE_READ);
+ swsusp_close(FMODE_READ | FMODE_EXCL);
goto Finish;
}
--
2.25.1
[PATCH openEuler-1.0-LTS] Revert "watchdog: Fix check_preemption_disabled() error"
by Yang Yingliang 13 Jan '22
hulk inclusion
category: bugfix
bugzilla: 173968, https://gitee.com/openeuler/kernel/issues/I3J87Y
CVE: NA
---------------------------
This reverts commit b2e484e966ac2f211f8f8cf56083e9bc4d5d3c6a.
When CONFIG_LOCKDEP and CONFIG_DEBUG_LOCKDEP are enabled, it detects the following error:
[ 10.145007] BUG: sleeping function called from invalid context at mm/slab.h:418
[ 10.145394] in_atomic(): 1, irqs_disabled(): 0, pid: 1, name: swapper/0
[ 10.145765] Preemption disabled at:
[ 10.145978] [<ffff000008f8e7b4>] hardlockup_detector_perf_init+0x20/0x100
[ 10.146770] CPU: 6 PID: 1 Comm: swapper/0 Not tainted 4.19.90+ #3
[ 10.148242] Hardware name: linux,dummy-virt (DT)
[ 10.148572] Call trace:
[ 10.148667] dump_backtrace+0x0/0x190
[ 10.148765] show_stack+0x24/0x30
[ 10.148875] dump_stack+0xa4/0xf8
[ 10.148964] ___might_sleep+0x150/0x180
[ 10.149065] __might_sleep+0x58/0x90
[ 10.149199] kmem_cache_alloc_trace+0x244/0x2b0
[ 10.149308] perf_event_alloc+0x74/0x680
[ 10.149402] perf_event_create_kernel_counter+0x2c/0x190
[ 10.149516] arch_probe_cpu_freq+0x84/0x1ac
[ 10.149611] hw_nmi_get_sample_period+0xb8/0x180
[ 10.149713] hardlockup_detector_event_create+0x28/0xfc
[ 10.149827] hardlockup_detector_perf_init+0x24/0x100
[ 10.149943] watchdog_nmi_probe+0x14/0x1c
[ 10.150037] lockup_detector_init+0x58/0x98
[ 10.150173] kernel_init_freeable+0x10c/0x1c4
[ 10.150298] kernel_init+0x18/0x110
[ 10.150422] ret_from_fork+0x10/0x18
In commit b2e484e966ac ("watchdog: Fix check_preemption_disabled()
error"), we tried to fix the check_preemption_disabled() error by
disabling preemption in hardlockup_detector_perf_init(), but missed
that perf_event_create_kernel_counter() may sleep.
The problem that commit wanted to fix does not actually exist, so just
revert it.
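The reverted commit effectively introduced the following pattern, which
is invalid because perf_event_create_kernel_counter() allocates with
GFP_KERNEL and may sleep inside the preempt-disabled section:
	preempt_disable();
	ret = hardlockup_detector_event_create();	/* may sleep -> splat above */
	preempt_enable();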
Fixes: b2e484e966ac ("watchdog: Fix check_preemption_disabled() error")
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
kernel/watchdog_hld.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c
index a5aff8ffa48ce..43832b1023693 100644
--- a/kernel/watchdog_hld.c
+++ b/kernel/watchdog_hld.c
@@ -508,17 +508,14 @@ void __init hardlockup_detector_perf_restart(void)
*/
int __init hardlockup_detector_perf_init(void)
{
- int ret;
+ int ret = hardlockup_detector_event_create();
- preempt_disable();
- ret = hardlockup_detector_event_create();
if (ret) {
pr_info("Perf NMI watchdog permanently disabled\n");
} else {
perf_event_release_kernel(this_cpu_read(watchdog_ev));
this_cpu_write(watchdog_ev, NULL);
}
- preempt_enable();
return ret;
}
#endif /* CONFIG_HARDLOCKUP_DETECTOR_PERF */
--
2.25.1
cmd.gpa_h = CQM_ADDR_HI(gpa);
cmd.gpa_l = CQM_ADDR_LW(gpa);
- /* In non-fake mode, set func_id to 0xffff.
- * Indicate the current func fake mode.
- * The value of func_id is a fake func ID.
- */
- if (cqm_handle->func_capability.fake_func_type == CQM_FAKE_FUNC_CHILD)
- cmd.func_id = cqm_handle->func_attribute.func_global_idx;
- else
- cmd.func_id = 0xffff;
+ /* In non-fake mode, set func_id to 0xffff. */
+ cmd.func_id = 0xffff;
/* Mode 0 is hashed to 4 SMF engines (excluding PPF) by func ID. */
if (cqm_handle->func_capability.lb_mode == CQM_LB_MODE_NORMAL ||
diff --git a/drivers/scsi/spfc/hw/spfc_cqm_main.c b/drivers/scsi/spfc/hw/spfc_cqm_main.c
index a4c8e60971b1..52cc2c7838e9 100644
--- a/drivers/scsi/spfc/hw/spfc_cqm_main.c
+++ b/drivers/scsi/spfc/hw/spfc_cqm_main.c
@@ -11,21 +11,8 @@
#include "sphw_crm.h"
#include "sphw_hw.h"
#include "sphw_hw_cfg.h"
-
#include "spfc_cqm_main.h"
-static unsigned char cqm_lb_mode = CQM_LB_MODE_NORMAL;
-module_param(cqm_lb_mode, byte, 0644);
-MODULE_PARM_DESC(cqm_lb_mode, "for cqm lb mode (default=0xff)");
-
-static unsigned char cqm_fake_mode = CQM_FAKE_MODE_DISABLE;
-module_param(cqm_fake_mode, byte, 0644);
-MODULE_PARM_DESC(cqm_fake_mode, "for cqm fake mode (default=0 disable)");
-
-static unsigned char cqm_platform_mode = CQM_FPGA_MODE;
-module_param(cqm_platform_mode, byte, 0644);
-MODULE_PARM_DESC(cqm_platform_mode, "for cqm platform mode (default=0 FPGA)");
-
s32 cqm3_init(void *ex_handle)
{
struct sphw_hwdev *handle = (struct sphw_hwdev *)ex_handle;
@@ -63,19 +50,6 @@ s32 cqm3_init(void *ex_handle)
goto err1;
}
- /* In FAKE mode, only the bitmap of the timer of the function is
- * enabled, and resources are not initialized. Otherwise, the
- * configuration of the fake function is overwritten.
- */
- if (cqm_handle->func_capability.fake_func_type == CQM_FAKE_FUNC_CHILD_CONFLICT) {
- if (sphw_func_tmr_bitmap_set(ex_handle, true) != CQM_SUCCESS)
- cqm_err(handle->dev_hdl, "Timer start: enable timer bitmap failed\n");
-
- handle->cqm_hdl = NULL;
- kfree(cqm_handle);
- return CQM_SUCCESS;
- }
-
/* Initialize memory entries such as BAT, CLA, and bitmap. */
if (cqm_mem_init(ex_handle) != CQM_SUCCESS) {
cqm_err(handle->dev_hdl, CQM_FUNCTION_FAIL(cqm_mem_init));
@@ -287,64 +261,6 @@ void cqm_service_capability_init(struct cqm_handle *cqm_handle,
cqm_service_capability_init_fc(cqm_handle, (void *)service_capability);
}
-s32 cqm_get_fake_func_type(struct cqm_handle *cqm_handle)
-{
- struct cqm_func_capability *func_cap = &cqm_handle->func_capability;
- u32 parent_func, child_func_start, child_func_number, i;
- u32 idx = cqm_handle->func_attribute.func_global_idx;
-
- /* Currently, only one set of fake configurations is implemented.
- * fake_cfg_number = 1
- */
- for (i = 0; i < func_cap->fake_cfg_number; i++) {
- parent_func = func_cap->fake_cfg[i].parent_func;
- child_func_start = func_cap->fake_cfg[i].child_func_start;
- child_func_number = func_cap->fake_cfg[i].child_func_number;
-
- if (idx == parent_func) {
- return CQM_FAKE_FUNC_PARENT;
- } else if ((idx >= child_func_start) &&
- (idx < (child_func_start + child_func_number))) {
- return CQM_FAKE_FUNC_CHILD_CONFLICT;
- }
- }
-
- return CQM_FAKE_FUNC_NORMAL;
-}
-
-s32 cqm_get_child_func_start(struct cqm_handle *cqm_handle)
-{
- struct cqm_func_capability *func_cap = &cqm_handle->func_capability;
- struct sphw_func_attr *func_attr = &cqm_handle->func_attribute;
- u32 i;
-
- /* Currently, only one set of fake configurations is implemented.
- * fake_cfg_number = 1
- */
- for (i = 0; i < func_cap->fake_cfg_number; i++) {
- if (func_attr->func_global_idx ==
- func_cap->fake_cfg[i].parent_func)
- return (s32)(func_cap->fake_cfg[i].child_func_start);
- }
-
- return CQM_FAIL;
-}
-
-s32 cqm_get_child_func_number(struct cqm_handle *cqm_handle)
-{
- struct cqm_func_capability *func_cap = &cqm_handle->func_capability;
- struct sphw_func_attr *func_attr = &cqm_handle->func_attribute;
- u32 i;
-
- for (i = 0; i < func_cap->fake_cfg_number; i++) {
- if (func_attr->func_global_idx ==
- func_cap->fake_cfg[i].parent_func)
- return (s32)(func_cap->fake_cfg[i].child_func_number);
- }
-
- return CQM_FAIL;
-}
-
/* Set func_type in fake_cqm_handle to ppf, pf, or vf. */
void cqm_set_func_type(struct cqm_handle *cqm_handle)
{
@@ -358,41 +274,20 @@ void cqm_set_func_type(struct cqm_handle *cqm_handle)
cqm_handle->func_attribute.func_type = CQM_VF;
}
-void cqm_lb_fake_mode_init(struct cqm_handle *cqm_handle)
+void cqm_lb_fake_mode_init(struct cqm_handle *cqm_handle, struct service_cap *svc_cap)
{
struct cqm_func_capability *func_cap = &cqm_handle->func_capability;
- struct cqm_fake_cfg *cfg = func_cap->fake_cfg;
- func_cap->lb_mode = cqm_lb_mode;
- func_cap->fake_mode = cqm_fake_mode;
+ func_cap->lb_mode = svc_cap->lb_mode;
/* Initializing the LB Mode */
- if (func_cap->lb_mode == CQM_LB_MODE_NORMAL) {
+ if (func_cap->lb_mode == CQM_LB_MODE_NORMAL)
func_cap->smf_pg = 0;
- } else {
- /* The LB mode is tailored on the FPGA.
- * Only SMF0 and SMF2 are instantiated.
- */
- if (cqm_platform_mode == CQM_FPGA_MODE)
- func_cap->smf_pg = 0x5;
- else
- func_cap->smf_pg = 0xF;
- }
+ else
+ func_cap->smf_pg = svc_cap->smf_pg;
- /* Initializing the FAKE Mode */
- if (func_cap->fake_mode == CQM_FAKE_MODE_DISABLE) {
- func_cap->fake_cfg_number = 0;
- func_cap->fake_func_type = CQM_FAKE_FUNC_NORMAL;
- } else {
- func_cap->fake_cfg_number = 1;
-
- /* When configuring fake mode, ensure that the parent function
- * cannot be contained in the child function; otherwise, the
- * system will be initialized repeatedly.
- */
- cfg[0].child_func_start = CQM_FAKE_CFUNC_START;
- func_cap->fake_func_type = cqm_get_fake_func_type(cqm_handle);
- }
+ func_cap->fake_cfg_number = 0;
+ func_cap->fake_func_type = CQM_FAKE_FUNC_NORMAL;
}
s32 cqm_capability_init(void *ex_handle)
@@ -465,10 +360,9 @@ s32 cqm_capability_init(void *ex_handle)
func_cap->gpa_check_enable = true;
- cqm_lb_fake_mode_init(cqm_handle);
+ cqm_lb_fake_mode_init(cqm_handle, service_capability);
cqm_info(handle->dev_hdl, "Cap init: lb_mode=%u\n", func_cap->lb_mode);
cqm_info(handle->dev_hdl, "Cap init: smf_pg=%u\n", func_cap->smf_pg);
- cqm_info(handle->dev_hdl, "Cap init: fake_mode=%u\n", func_cap->fake_mode);
cqm_info(handle->dev_hdl, "Cap init: fake_func_type=%u\n", func_cap->fake_func_type);
cqm_info(handle->dev_hdl, "Cap init: fake_cfg_number=%u\n", func_cap->fake_cfg_number);
@@ -517,153 +411,6 @@ s32 cqm_capability_init(void *ex_handle)
return err;
}
-void cqm_fake_uninit(struct cqm_handle *cqm_handle)
-{
- u32 i;
-
- if (cqm_handle->func_capability.fake_func_type !=
- CQM_FAKE_FUNC_PARENT)
- return;
-
- for (i = 0; i < CQM_FAKE_FUNC_MAX; i++) {
- kfree(cqm_handle->fake_cqm_handle[i]);
- cqm_handle->fake_cqm_handle[i] = NULL;
- }
-}
-
-s32 cqm_fake_init(struct cqm_handle *cqm_handle)
-{
- struct sphw_hwdev *handle = cqm_handle->ex_handle;
- struct cqm_func_capability *func_cap = NULL;
- struct cqm_handle *fake_cqm_handle = NULL;
- struct sphw_func_attr *func_attr = NULL;
- s32 child_func_start, child_func_number;
- u32 i;
-
- func_cap = &cqm_handle->func_capability;
- if (func_cap->fake_func_type != CQM_FAKE_FUNC_PARENT)
- return CQM_SUCCESS;
-
- child_func_start = cqm_get_child_func_start(cqm_handle);
- if (child_func_start == CQM_FAIL) {
- cqm_err(handle->dev_hdl, CQM_WRONG_VALUE(child_func_start));
- return CQM_FAIL;
- }
-
- child_func_number = cqm_get_child_func_number(cqm_handle);
- if (child_func_number == CQM_FAIL) {
- cqm_err(handle->dev_hdl, CQM_WRONG_VALUE(child_func_number));
- return CQM_FAIL;
- }
-
- for (i = 0; i < (u32)child_func_number; i++) {
- fake_cqm_handle = kmalloc(sizeof(*fake_cqm_handle), GFP_KERNEL | __GFP_ZERO);
- if (!fake_cqm_handle) {
- cqm_err(handle->dev_hdl,
- CQM_ALLOC_FAIL(fake_cqm_handle));
- goto err;
- }
-
- /* Copy the attributes of the parent CQM handle to the child CQM
- * handle and modify the values of function.
- */
- memcpy(fake_cqm_handle, cqm_handle, sizeof(struct cqm_handle));
- func_attr = &fake_cqm_handle->func_attribute;
- func_cap = &fake_cqm_handle->func_capability;
- func_attr->func_global_idx = (u16)(child_func_start + i);
- cqm_set_func_type(fake_cqm_handle);
- func_cap->fake_func_type = CQM_FAKE_FUNC_CHILD;
- cqm_info(handle->dev_hdl, "Fake func init: function[%u] type %d(0:PF,1:VF,2:PPF)\n",
- func_attr->func_global_idx, func_attr->func_type);
-
- fake_cqm_handle->parent_cqm_handle = cqm_handle;
- cqm_handle->fake_cqm_handle[i] = fake_cqm_handle;
- }
-
- return CQM_SUCCESS;
-
-err:
- cqm_fake_uninit(cqm_handle);
- return CQM_FAIL;
-}
-
-void cqm_fake_mem_uninit(struct cqm_handle *cqm_handle)
-{
- struct sphw_hwdev *handle = cqm_handle->ex_handle;
- struct cqm_handle *fake_cqm_handle = NULL;
- s32 child_func_number;
- u32 i;
-
- if (cqm_handle->func_capability.fake_func_type != CQM_FAKE_FUNC_PARENT)
- return;
-
- child_func_number = cqm_get_child_func_number(cqm_handle);
- if (child_func_number == CQM_FAIL) {
- cqm_err(handle->dev_hdl, CQM_WRONG_VALUE(child_func_number));
- return;
- }
-
- for (i = 0; i < (u32)child_func_number; i++) {
- fake_cqm_handle = cqm_handle->fake_cqm_handle[i];
- cqm_object_table_uninit(fake_cqm_handle);
- cqm_bitmap_uninit(fake_cqm_handle);
- cqm_cla_uninit(fake_cqm_handle, CQM_BAT_ENTRY_MAX);
- cqm_bat_uninit(fake_cqm_handle);
- }
-}
-
-s32 cqm_fake_mem_init(struct cqm_handle *cqm_handle)
-{
- struct sphw_hwdev *handle = cqm_handle->ex_handle;
- struct cqm_handle *fake_cqm_handle = NULL;
- s32 child_func_number;
- u32 i;
-
- if (cqm_handle->func_capability.fake_func_type !=
- CQM_FAKE_FUNC_PARENT)
- return CQM_SUCCESS;
-
- child_func_number = cqm_get_child_func_number(cqm_handle);
- if (child_func_number == CQM_FAIL) {
- cqm_err(handle->dev_hdl, CQM_WRONG_VALUE(child_func_number));
- return CQM_FAIL;
- }
-
- for (i = 0; i < (u32)child_func_number; i++) {
- fake_cqm_handle = cqm_handle->fake_cqm_handle[i];
-
- if (cqm_bat_init(fake_cqm_handle) != CQM_SUCCESS) {
- cqm_err(handle->dev_hdl,
- CQM_FUNCTION_FAIL(cqm_bat_init));
- goto err;
- }
-
- if (cqm_cla_init(fake_cqm_handle) != CQM_SUCCESS) {
- cqm_err(handle->dev_hdl,
- CQM_FUNCTION_FAIL(cqm_cla_init));
- goto err;
- }
-
- if (cqm_bitmap_init(fake_cqm_handle) != CQM_SUCCESS) {
- cqm_err(handle->dev_hdl,
- CQM_FUNCTION_FAIL(cqm_bitmap_init));
- goto err;
- }
-
- if (cqm_object_table_init(fake_cqm_handle) != CQM_SUCCESS) {
- cqm_err(handle->dev_hdl,
- CQM_FUNCTION_FAIL(cqm_object_table_init));
- goto err;
- }
- }
-
- return CQM_SUCCESS;
-
-err:
- cqm_fake_mem_uninit(cqm_handle);
- return CQM_FAIL;
-}
-
s32 cqm_mem_init(void *ex_handle)
{
struct sphw_hwdev *handle = (struct sphw_hwdev *)ex_handle;
@@ -671,49 +418,35 @@ s32 cqm_mem_init(void *ex_handle)
cqm_handle = (struct cqm_handle *)(handle->cqm_hdl);
- if (cqm_fake_init(cqm_handle) != CQM_SUCCESS) {
- cqm_err(handle->dev_hdl, CQM_FUNCTION_FAIL(cqm_fake_init));
- return CQM_FAIL;
- }
-
- if (cqm_fake_mem_init(cqm_handle) != CQM_SUCCESS) {
- cqm_err(handle->dev_hdl, CQM_FUNCTION_FAIL(cqm_fake_mem_init));
- goto err1;
- }
-
if (cqm_bat_init(cqm_handle) != CQM_SUCCESS) {
cqm_err(handle->dev_hdl, CQM_FUNCTION_FAIL(cqm_bat_init));
- goto err2;
+ return CQM_FAIL;
}
if (cqm_cla_init(cqm_handle) != CQM_SUCCESS) {
cqm_err(handle->dev_hdl, CQM_FUNCTION_FAIL(cqm_cla_init));
- goto err3;
+ goto err1;
}
if (cqm_bitmap_init(cqm_handle) != CQM_SUCCESS) {
cqm_err(handle->dev_hdl, CQM_FUNCTION_FAIL(cqm_bitmap_init));
- goto err4;
+ goto err2;
}
if (cqm_object_table_init(cqm_handle) != CQM_SUCCESS) {
cqm_err(handle->dev_hdl,
CQM_FUNCTION_FAIL(cqm_object_table_init));
- goto err5;
+ goto err3;
}
return CQM_SUCCESS;
-err5:
- cqm_bitmap_uninit(cqm_handle);
-err4:
- cqm_cla_uninit(cqm_handle, CQM_BAT_ENTRY_MAX);
err3:
- cqm_bat_uninit(cqm_handle);
+ cqm_bitmap_uninit(cqm_handle);
err2:
- cqm_fake_mem_uninit(cqm_handle);
+ cqm_cla_uninit(cqm_handle, CQM_BAT_ENTRY_MAX);
err1:
- cqm_fake_uninit(cqm_handle);
+ cqm_bat_uninit(cqm_handle);
return CQM_FAIL;
}
@@ -728,8 +461,6 @@ void cqm_mem_uninit(void *ex_handle)
cqm_bitmap_uninit(cqm_handle);
cqm_cla_uninit(cqm_handle, CQM_BAT_ENTRY_MAX);
cqm_bat_uninit(cqm_handle);
- cqm_fake_mem_uninit(cqm_handle);
- cqm_fake_uninit(cqm_handle);
}
s32 cqm_event_init(void *ex_handle)
diff --git a/drivers/scsi/spfc/hw/spfc_cqm_main.h b/drivers/scsi/spfc/hw/spfc_cqm_main.h
index 1b8cf8bdb3b7..cf10d7f5c339 100644
--- a/drivers/scsi/spfc/hw/spfc_cqm_main.h
+++ b/drivers/scsi/spfc/hw/spfc_cqm_main.h
@@ -328,9 +328,6 @@ void cqm_mem_uninit(void *ex_handle);
s32 cqm_event_init(void *ex_handle);
void cqm_event_uninit(void *ex_handle);
u8 cqm_aeq_callback(void *ex_handle, u8 event, u8 *data);
-s32 cqm_get_fake_func_type(struct cqm_handle *cqm_handle);
-s32 cqm_get_child_func_start(struct cqm_handle *cqm_handle);
-s32 cqm_get_child_func_number(struct cqm_handle *cqm_handle);
s32 cqm3_init(void *ex_handle);
void cqm3_uninit(void *ex_handle);
diff --git a/drivers/scsi/spfc/hw/spfc_cqm_object.c b/drivers/scsi/spfc/hw/spfc_cqm_object.c
index b895d37aebae..165794e9c7e5 100644
--- a/drivers/scsi/spfc/hw/spfc_cqm_object.c
+++ b/drivers/scsi/spfc/hw/spfc_cqm_object.c
@@ -155,8 +155,6 @@ struct cqm_qpc_mpt *cqm3_object_qpc_mpt_create(void *ex_handle, u32 service_type
struct cqm_qpc_mpt_info *qpc_mpt_info = NULL;
struct cqm_handle *cqm_handle = NULL;
s32 ret = CQM_FAIL;
- u32 relative_index;
- u32 fake_func_id;
CQM_PTR_CHECK_RET(ex_handle, NULL, CQM_PTR_NULL(ex_handle));
@@ -180,25 +178,6 @@ struct cqm_qpc_mpt *cqm3_object_qpc_mpt_create(void *ex_handle, u32 service_type
return NULL;
}
- /* fake vf adaption, switch to corresponding VF. */
- if (cqm_handle->func_capability.fake_func_type ==
- CQM_FAKE_FUNC_PARENT) {
- fake_func_id = index / cqm_handle->func_capability.qpc_number;
- relative_index = index % cqm_handle->func_capability.qpc_number;
-
- cqm_info(handle->dev_hdl, "qpc create: fake_func_id=%u, relative_index=%u\n",
- fake_func_id, relative_index);
-
- if ((s32)fake_func_id >=
- cqm_get_child_func_number(cqm_handle)) {
- cqm_err(handle->dev_hdl, CQM_WRONG_VALUE(fake_func_id));
- return NULL;
- }
-
- index = relative_index;
- cqm_handle = cqm_handle->fake_cqm_handle[fake_func_id];
- }
-
qpc_mpt_info = kmalloc(sizeof(*qpc_mpt_info), GFP_ATOMIC | __GFP_ZERO);
CQM_PTR_CHECK_RET(qpc_mpt_info, NULL, CQM_ALLOC_FAIL(qpc_mpt_info));
--
2.32.0
1
0
[PATCH openEuler-5.10 01/30] net: hns3: refactor hns3 makefile to support hns3_common module
by Zheng Zengkai 12 Jan '22
From: Jie Wang <wangjie125(a)huawei.com>
mainline inclusion
from mainline-br26_refactor2
commit 5f20be4e90e6
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4Q02P
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
----------------------------------------------------------------------
Currently we plan to refactor the PF and VF cmdq modules. A new folder,
hns3_common, will be created to store the new common APIs used by the PF
and VF cmdq modules. The PF and VF compilation processes will thus both
depend on hns3_common, which may cause parallel-build problems if it is
added as a new makefile build unit.
So this patch merges the PF and VF makefile scripts into the top-level
makefile to support the new hns3_common folder, which will be created in
the next patch.
Signed-off-by: Jie Wang <wangjie125(a)huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2(a)huawei.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Reviewed-by: Jian Shen <shenjian15(a)huawei.com>
Reviewed-by: Yue Haibing <yuehaibing(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
drivers/net/ethernet/hisilicon/hns3/Makefile | 14 +++++++++++---
.../net/ethernet/hisilicon/hns3/hns3pf/Makefile | 12 ------------
.../net/ethernet/hisilicon/hns3/hns3vf/Makefile | 10 ----------
3 files changed, 11 insertions(+), 25 deletions(-)
delete mode 100644 drivers/net/ethernet/hisilicon/hns3/hns3pf/Makefile
delete mode 100644 drivers/net/ethernet/hisilicon/hns3/hns3vf/Makefile
diff --git a/drivers/net/ethernet/hisilicon/hns3/Makefile b/drivers/net/ethernet/hisilicon/hns3/Makefile
index 7aa2fac76c5e..32e24e0945f5 100644
--- a/drivers/net/ethernet/hisilicon/hns3/Makefile
+++ b/drivers/net/ethernet/hisilicon/hns3/Makefile
@@ -4,9 +4,8 @@
#
ccflags-y += -I$(srctree)/$(src)
-
-obj-$(CONFIG_HNS3) += hns3pf/
-obj-$(CONFIG_HNS3) += hns3vf/
+ccflags-y += -I$(srctree)/drivers/net/ethernet/hisilicon/hns3/hns3pf
+ccflags-y += -I$(srctree)/drivers/net/ethernet/hisilicon/hns3/hns3vf
obj-$(CONFIG_HNS3) += hnae3.o
@@ -14,3 +13,12 @@ obj-$(CONFIG_HNS3_ENET) += hns3.o
hns3-objs = hns3_enet.o hns3_ethtool.o hns3_debugfs.o
hns3-$(CONFIG_HNS3_DCB) += hns3_dcbnl.o
+
+obj-$(CONFIG_HNS3_HCLGEVF) += hclgevf.o
+hclgevf-objs = hns3vf/hclgevf_main.o hns3vf/hclgevf_cmd.o hns3vf/hclgevf_mbx.o hns3vf/hclgevf_devlink.o
+
+obj-$(CONFIG_HNS3_HCLGE) += hclge.o
+hclge-objs = hns3pf/hclge_main.o hns3pf/hclge_cmd.o hns3pf/hclge_mdio.o hns3pf/hclge_tm.o \
+ hns3pf/hclge_mbx.o hns3pf/hclge_err.o hns3pf/hclge_debugfs.o hns3pf/hclge_ptp.o hns3pf/hclge_devlink.o
+
+hclge-$(CONFIG_HNS3_DCB) += hns3pf/hclge_dcb.o
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/Makefile b/drivers/net/ethernet/hisilicon/hns3/hns3pf/Makefile
deleted file mode 100644
index d1bf5c4c0abb..000000000000
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/Makefile
+++ /dev/null
@@ -1,12 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0+
-#
-# Makefile for the HISILICON network device drivers.
-#
-
-ccflags-y := -I $(srctree)/drivers/net/ethernet/hisilicon/hns3
-ccflags-y += -I $(srctree)/$(src)
-
-obj-$(CONFIG_HNS3_HCLGE) += hclge.o
-hclge-objs = hclge_main.o hclge_cmd.o hclge_mdio.o hclge_tm.o hclge_mbx.o hclge_err.o hclge_debugfs.o hclge_ptp.o hclge_devlink.o
-
-hclge-$(CONFIG_HNS3_DCB) += hclge_dcb.o
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/Makefile b/drivers/net/ethernet/hisilicon/hns3/hns3vf/Makefile
deleted file mode 100644
index 51ff7d86ee90..000000000000
--- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/Makefile
+++ /dev/null
@@ -1,10 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0+
-#
-# Makefile for the HISILICON network device drivers.
-#
-
-ccflags-y := -I $(srctree)/drivers/net/ethernet/hisilicon/hns3
-ccflags-y += -I $(srctree)/$(src)
-
-obj-$(CONFIG_HNS3_HCLGEVF) += hclgevf.o
-hclgevf-objs = hclgevf_main.o hclgevf_cmd.o hclgevf_mbx.o hclgevf_devlink.o
--
2.20.1
1
29
This patch set introduces many conflicts while backporting mainline bcache
patches, so revert it temporarily.
Zheng Zengkai (8):
Revert "bcache: always record start time of a sample"
Revert "bcache: do not collect data insert info created by
write_moving"
Revert "bcache: Rewrite patch to delay to invalidate cache data"
Revert "bcache: Add a sample of userspace prefetch client"
Revert "bcache: Delay to invalidate cache data in writearound write"
Revert "bcache: inflight prefetch requests block overlapped normal
requests"
Revert "bcache: provide a switch to bypass all IO requests"
Revert "bcache: add a framework to perform prefetch"
Documentation/admin-guide/bcache.rst | 4 -
drivers/md/bcache/Makefile | 2 +-
drivers/md/bcache/acache.c | 586 ---------------------------
drivers/md/bcache/acache.h | 79 ----
drivers/md/bcache/bcache.h | 3 -
drivers/md/bcache/btree.c | 4 +-
drivers/md/bcache/request.c | 161 +++-----
drivers/md/bcache/request.h | 45 --
drivers/md/bcache/stats.c | 13 -
drivers/md/bcache/stats.h | 3 -
drivers/md/bcache/super.c | 6 -
drivers/md/bcache/sysfs.c | 12 -
include/trace/events/bcache.h | 22 -
samples/acache_client/Makefile | 13 -
samples/acache_client/connect.c | 144 -------
samples/acache_client/connect.h | 74 ----
samples/acache_client/main.c | 133 ------
17 files changed, 59 insertions(+), 1245 deletions(-)
delete mode 100644 drivers/md/bcache/acache.c
delete mode 100644 drivers/md/bcache/acache.h
delete mode 100644 samples/acache_client/Makefile
delete mode 100644 samples/acache_client/connect.c
delete mode 100644 samples/acache_client/connect.h
delete mode 100644 samples/acache_client/main.c
--
2.20.1
2
9
[PATCH openEuler-5.10 01/17] tcp: Add some stub info for KABI consistency
by Zheng Zengkai 11 Jan '22
From: Baisong Zhong <zhongbaisong(a)huawei.com>
hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4PY1Q
CVE: NA
--------------------------------
We add stub fields to some structures to maintain KABI consistency.
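A minimal userspace demonstration (field names are made up; only the
layout property matters) of why adding a member inside an existing union,
as the sk_buff hunk below does with ll_node, keeps the structure size and
hence the KABI layout unchanged:

#include <stdio.h>

struct node { struct node *next; };

struct skb_before {
	union {
		struct node rbnode_like;
		struct node list_like;
	};
};

struct skb_after {
	union {
		struct node rbnode_like;
		struct node list_like;
		struct node ll_node_like;	/* new member shares the storage */
	};
};

int main(void)
{
	/* Both print the same size: the union absorbs the new member. */
	printf("before=%zu after=%zu\n",
	       sizeof(struct skb_before), sizeof(struct skb_after));
	return 0;
}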
Signed-off-by: Baisong Zhong <zhongbaisong(a)huawei.com>
Reviewed-by: Wei Yongjun <weiyongjun1(a)huawei.com>
Reviewed-by: Yue Haibing <yuehaibing(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
include/linux/skbuff.h | 2 ++
include/net/ipv6.h | 2 +-
include/net/sock.h | 3 +++
3 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 9e3a454d2377..d485f17ff33a 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -36,6 +36,7 @@
#include <linux/splice.h>
#include <linux/in6.h>
#include <linux/if_packet.h>
+#include <linux/llist.h>
#include <net/flow.h>
#include <net/page_pool.h>
#include <linux/kabi.h>
@@ -732,6 +733,7 @@ struct sk_buff {
};
struct rb_node rbnode; /* used in netem, ip4 defrag, and tcp stack */
struct list_head list;
+ struct llist_node ll_node;
};
union {
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index bd1f396cc9c7..c0273ae50296 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -344,9 +344,9 @@ struct ipcm6_cookie {
struct sockcm_cookie sockc;
__s16 hlimit;
__s16 tclass;
+ __u16 gso_size;
__s8 dontfrag;
struct ipv6_txoptions *opt;
- __u16 gso_size;
};
static inline void ipcm6_init(struct ipcm6_cookie *ipc6)
diff --git a/include/net/sock.h b/include/net/sock.h
index 712bb7b09f96..b3d451878640 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -63,6 +63,7 @@
#include <linux/atomic.h>
#include <linux/refcount.h>
+#include <linux/llist.h>
#include <net/dst.h>
#include <net/checksum.h>
#include <net/tcp_states.h>
@@ -405,6 +406,8 @@ struct sock {
struct sk_buff *head;
struct sk_buff *tail;
} sk_backlog;
+ struct llist_head defer_list;
+
#define sk_rmem_alloc sk_backlog.rmem_alloc
int sk_forward_alloc;
--
2.20.1
2
17
hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4O31I?from=project-issue
CVE: NA
------------------------------
Register PMEM on arm64:
Use memmap (memmap=nn[KMG]!ss[KMG]) to reserve memory and the
e820 (drivers/nvdimm/e820.c) function to register persistent
memory on arm64. When the kernel restarts or updates, the data
in PMEM will not be lost and can be loaded faster. This is a
general feature.
drivers/nvdimm/e820.c:
This file scans "iomem_resource" and takes advantage of the
nvdimm resource discovery mechanism by registering a resource
named "Persistent Memory (legacy)". This functionality does not
depend on the architecture.
We will push this feature to the Linux kernel community and
discuss renaming the file, because people mistakenly assume that
e820.c depends on x86.
To use this feature, do the following (a worked example of the
reserved range follows the steps):
1. Reserve memory: add memmap to grub.cfg to reserve memory,
memmap=nn[KMG]!ss[KMG], e.g. memmap=100K!0x1a0000000.
2. Load nd_e820.ko: modprobe nd_e820.
3. Check for the pmem device in /dev, e.g. /dev/pmem0.
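As a small worked example (userspace arithmetic only; the base address is
the one from step 1 above), memmap=100K!0x1a0000000 reserves the range
computed below:

#include <stdio.h>

int main(void)
{
	unsigned long long base = 0x1a0000000ULL;
	unsigned long long size = 100ULL << 10;	/* 100K = 0x19000 bytes */

	/* Prints: pmem: [0x1a0000000, 0x1a0019000) (102400 bytes) */
	printf("pmem: [0x%llx, 0x%llx) (%llu bytes)\n",
	       base, base + size, size);
	return 0;
}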
Signed-off-by: Zhuling <zhuling8(a)huawei.com>
---
arch/arm64/Kconfig | 22 ++++++++++++++++
arch/arm64/kernel/Makefile | 1 +
arch/arm64/kernel/pmem.c | 36 ++++++++++++++++++++++++++
arch/arm64/kernel/setup.c | 11 ++++++++
arch/arm64/mm/init.c | 63 ++++++++++++++++++++++++++++++++++++++++++++++
drivers/nvdimm/Kconfig | 6 +++++
drivers/nvdimm/Makefile | 1 +
7 files changed, 140 insertions(+)
create mode 100644 arch/arm64/kernel/pmem.c
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 405e5ce..2231eac 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1322,6 +1322,28 @@ config RODATA_FULL_DEFAULT_ENABLED
This requires the linear region to be mapped down to pages,
which may adversely affect performance in some cases.
+config ARM64_PMEM_RESERVE
+ bool "Reserve memory for persistent storage"
+ default n
+ help
+ Use memmap=nn[KMG]!ss[KMG] (e.g. memmap=100K!0x1a0000000) to
+ reserve memory for persistent storage.
+
+ Say y here to enable this feature.
+
+config ARM64_PMEM_LEGACY_DEVICE
+ bool "Create persistent storage"
+ depends on BLK_DEV
+ depends on LIBNVDIMM
+ select ARM64_PMEM_RESERVE
+ help
+ Use reserved memory for persistent storage when the kernel
+ restart or update. the data in PMEM will not be lost and
+ can be loaded faster.
+
+ Say y if unsure.
+
+
config ARM64_SW_TTBR0_PAN
bool "Emulate Privileged Access Never using TTBR0_EL1 switching"
help
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 169d90f..f615325 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -68,6 +68,7 @@ obj-$(CONFIG_ARM64_PTR_AUTH) += pointer_auth.o
obj-$(CONFIG_SHADOW_CALL_STACK) += scs.o
obj-$(CONFIG_ARM64_MTE) += mte.o
obj-$(CONFIG_MPAM) += mpam/
+obj-$(CONFIG_ARM64_PMEM_LEGACY_DEVICE) += pmem.o
obj-y += vdso/ probes/
obj-$(CONFIG_COMPAT_VDSO) += vdso32/
diff --git a/arch/arm64/kernel/pmem.c b/arch/arm64/kernel/pmem.c
new file mode 100644
index 0000000..d1efdc8
--- /dev/null
+++ b/arch/arm64/kernel/pmem.c
@@ -0,0 +1,36 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright(c) 2021 Huawei Technologies Co., Ltd
+ *
+ * Derived from x86 and arm64 implement PMEM.
+ */
+#include <linux/platform_device.h>
+#include <linux/init.h>
+#include <linux/ioport.h>
+#include <linux/module.h>
+
+static int found(struct resource *res, void *data)
+{
+ return 1;
+}
+
+static int __init register_e820_pmem(void)
+{
+ struct platform_device *pdev;
+ int rc;
+
+ rc = walk_iomem_res_desc(IORES_DESC_PERSISTENT_MEMORY_LEGACY,
+ IORESOURCE_MEM, 0, -1, NULL, found);
+ if (rc <= 0)
+ return 0;
+
+ /*
+ * See drivers/nvdimm/e820.c for the implementation, this is
+ * simply here to trigger the module to load on demand.
+ */
+ pdev = platform_device_alloc("e820_pmem", -1);
+
+ return platform_device_add(pdev);
+}
+device_initcall(register_e820_pmem);
+
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 92d75e3..1a8a4d2 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -70,6 +70,11 @@ static int __init arm64_enable_cpu0_hotplug(char *str)
__setup("arm64_cpu0_hotplug", arm64_enable_cpu0_hotplug);
#endif
+#ifdef CONFIG_ARM64_PMEM_RESERVE
+extern struct resource pmem_res;
+#endif
+
+
phys_addr_t __fdt_pointer __initdata;
/*
@@ -305,6 +310,11 @@ static void __init request_standard_resources(void)
}
+
+#ifdef CONFIG_ARM64_PMEM_RESERVE
+ if (pmem_res.end && pmem_res.start)
+ request_resource(&iomem_resource, &pmem_res);
+#endif
}
static int __init reserve_memblock_reserved_regions(void)
{
u64 i, j;
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 6ebfabd..3b6907f 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -56,6 +56,8 @@
s64 memstart_addr __ro_after_init = -1;
EXPORT_SYMBOL(memstart_addr);
+phys_addr_t start_at, mem_size;
+
#ifdef CONFIG_PIN_MEMORY
struct resource pin_memory_resource = {
.name = "Pin memory",
@@ -111,6 +113,18 @@ static void __init reserve_pin_memory_res(void)
*/
phys_addr_t arm64_dma_phys_limit __ro_after_init;
+static unsigned long long pmem_size, pmem_start;
+
+#ifdef CONFIG_ARM64_PMEM_RESERVE
+struct resource pmem_res = {
+ .name = "Persistent Memory (legacy)",
+ .start = 0,
+ .end = 0,
+ .flags = IORESOURCE_MEM,
+ .desc = IORES_DESC_PERSISTENT_MEMORY_LEGACY
+};
+#endif
+
#ifndef CONFIG_KEXEC_CORE
static void __init reserve_crashkernel(void)
{
@@ -403,6 +417,26 @@ static int __init reserve_park_mem(void)
return -EINVAL;
}
#endif
+static bool __init is_mem_valid(unsigned long long mem_size, unsigned long long mem_start)
+{
+ if (!memblock_is_region_memory(mem_start, mem_size)) {
+ pr_warn("cannot reserve mem: region is not memory!\n");
+ return false;
+ }
+
+ if (memblock_is_region_reserved(mem_start, mem_size)) {
+ pr_warn("cannot reserve mem: region overlaps reserved memory!\n");
+ return false;
+ }
+
+ if (!IS_ALIGNED(mem_start, SZ_2M)) {
+ pr_warn("cannot reserve mem: base address is not 2MB aligned!\n");
+ return false;
+ }
+
+ return true;
+}
+
static int need_remove_real_memblock __initdata;
@@ -442,7 +476,12 @@ static int __init parse_memmap_one(char *p)
start_at = memparse(p + 1, &p);
memblock_reserve(start_at, mem_size);
memblock_mark_memmap(start_at, mem_size);
+ } else if (*p == '!') {
+ start_at = memparse(p + 1, &p);
+
+ pmem_start = start_at;
+ pmem_size = mem_size;
} else
pr_info("Unrecognized memmap option, please check the parameter.\n");
return *p == '\0' ? 0 : -EINVAL;
@@ -464,6 +508,20 @@ static int __init parse_memmap_opt(char *str)
}
early_param("memmap", parse_memmap_opt);
+#ifdef CONFIG_ARM64_PMEM_RESERVE
+static void __init reserve_pmem(void)
+{
+ if (!is_mem_valid(pmem_size, pmem_start))
+ return;
+
+ memblock_remove(pmem_start, pmem_size);
+ pr_info("pmem reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
+ pmem_start, pmem_start + pmem_size, pmem_size >> 20);
+ pmem_res.start = pmem_start;
+ pmem_res.end = pmem_start + pmem_size - 1;
+}
+#endif
+
void __init arm64_memblock_init(void)
{
const s64 linear_region_size = BIT(vabits_actual - 1);
@@ -638,6 +696,11 @@ void __init bootmem_init(void)
reserve_quick_kexec();
#endif
+#ifdef CONFIG_ARM64_PMEM_RESERVE
+ reserve_pmem();
+#endif
+
+
reserve_pin_memory_res();
memblock_dump_all();
diff --git a/drivers/nvdimm/Kconfig b/drivers/nvdimm/Kconfig
index b7d1eb3..9567cca 100644
--- a/drivers/nvdimm/Kconfig
+++ b/drivers/nvdimm/Kconfig
@@ -132,3 +132,9 @@ config NVDIMM_TEST_BUILD
infrastructure.
endif
+
+config PMEM_LEGACY
+ tristate "Pmem_legacy"
+ select X86_PMEM_LEGACY if X86
+ select ARM64_PMEM_LEGACY_DEVICE if ARM64
+
diff --git a/drivers/nvdimm/Makefile b/drivers/nvdimm/Makefile
index 0407753..6f8dc92 100644
--- a/drivers/nvdimm/Makefile
+++ b/drivers/nvdimm/Makefile
@@ -3,6 +3,7 @@ obj-$(CONFIG_LIBNVDIMM) += libnvdimm.o
obj-$(CONFIG_BLK_DEV_PMEM) += nd_pmem.o
obj-$(CONFIG_ND_BTT) += nd_btt.o
obj-$(CONFIG_ND_BLK) += nd_blk.o
+obj-$(CONFIG_PMEM_LEGACY) += nd_e820.o
obj-$(CONFIG_OF_PMEM) += of_pmem.o
obj-$(CONFIG_VIRTIO_PMEM) += virtio_pmem.o nd_virtio.o
--
2.9.5
3
3
[PATCH openEuler-5.10] BMA: Fix format string compile warning in arm32 builds
by Zheng Zengkai 10 Jan '22
driver inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4ETXO
CVE: NA
-----------------------------------------
Fix the following build warnings in arm32 builds:
drivers/net/ethernet/huawei/bma/edma_drv/bma_devintf.c: In function ‘bma_cdev_add_msg’:
drivers/net/ethernet/huawei/bma/edma_drv/bma_pci.h:92:20: warning: format ‘%ld’ expects argument of type ‘long int’, but argument 5 has type ‘size_t {aka unsigned int}’ [-Wformat=]
drivers/net/ethernet/huawei/bma/veth_drv/veth_hb.c: In function ‘veth_recv_pkt’:
drivers/net/ethernet/huawei/bma/veth_drv/veth_hb.c:74:37: warning: format ‘%llx’ expects argument of type ‘long long unsigned int’, but argument 7 has type ‘dma_addr_t {aka unsigned int}’ [-Wformat=]
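A small userspace demonstration (assumption: plain printf stands in for
the kernel log helpers) of the portable conversions this fix switches to:
%zu/%zd always match size_t/ssize_t on both arm32 and arm64, and the
kernel-only %pad takes a dma_addr_t by address, so its width stops
mattering:

#include <stdio.h>
#include <sys/types.h>

int main(void)
{
	size_t count = 42;	/* unsigned int on arm32, unsigned long on arm64 */
	ssize_t length = -1;

	/* %zu/%zd are correct for both targets; %ld/%lu are not. */
	printf("count=%zu length=%zd\n", count, length);

	/* Kernel only, shown as a comment since %pad has no userspace
	 * equivalent: dev_dbg(dev, "dma_map=%pad\n", &dma_map); */
	return 0;
}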
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
.../bma/cdev_veth_drv/virtual_cdev_eth_net.c | 14 +++++++-------
.../net/ethernet/huawei/bma/edma_drv/bma_devintf.c | 2 +-
.../net/ethernet/huawei/bma/edma_drv/edma_host.c | 2 +-
drivers/net/ethernet/huawei/bma/veth_drv/veth_hb.c | 4 ++--
4 files changed, 11 insertions(+), 11 deletions(-)
diff --git a/drivers/net/ethernet/huawei/bma/cdev_veth_drv/virtual_cdev_eth_net.c b/drivers/net/ethernet/huawei/bma/cdev_veth_drv/virtual_cdev_eth_net.c
index 04ea55f4e0a2..d8c3ab655566 100644
--- a/drivers/net/ethernet/huawei/bma/cdev_veth_drv/virtual_cdev_eth_net.c
+++ b/drivers/net/ethernet/huawei/bma/cdev_veth_drv/virtual_cdev_eth_net.c
@@ -605,7 +605,7 @@ static int edma_veth_cut_tx_packet_send(struct edma_eth_dev_s *eth_dev,
do_queue_rate_limit(eth_dev->ptx_queue);
while (length > 0) {
- LOG(DLOG_DEBUG, "length: %u/%lu", length, len);
+ LOG(DLOG_DEBUG, "length: %u/%zu", length, len);
if (length > BSPPACKET_MTU_MAX) {
/* fragment. */
@@ -1689,7 +1689,7 @@ static ssize_t cdev_copy_packet_to_user(struct edma_eth_dev_s *dev,
start = dev->rx_packet[dev->rx_packet_head].packet + g_read_pos;
LOG(DLOG_DEBUG,
- "User needs %ld bytes, pos: %u, total len: %u, left: %ld.",
+ "User needs %zu bytes, pos: %u, total len: %u, left: %zd.",
count, g_read_pos, dev->rx_packet[dev->rx_packet_head].len, left);
if (left <= 0) {
/* No more data in this message, retry. */
@@ -1721,7 +1721,7 @@ static ssize_t cdev_copy_packet_to_user(struct edma_eth_dev_s *dev,
}
LOG(DLOG_DEBUG,
- "Copied bytes: %ld, pos: %d, buf len: %lu, free_packet: %d.",
+ "Copied bytes: %zd, pos: %d, buf len: %zu, free_packet: %d.",
length, g_read_pos, count, free_packet);
if (packet) {
@@ -1807,11 +1807,11 @@ ssize_t cdev_read(struct file *filp, char __user *data,
if (!data || count >= MAX_PACKET_LEN)
return -EFAULT;
- LOG(DLOG_DEBUG, "read begin, count: %ld, pos: %u.", count, g_read_pos);
+ LOG(DLOG_DEBUG, "read begin, count: %zu, pos: %u.", count, g_read_pos);
length = cdev_copy_packet_to_user(dev, data, count);
- LOG(DLOG_DEBUG, "read done, length: %ld, pos: %u.", length, g_read_pos);
+ LOG(DLOG_DEBUG, "read done, length: %zd, pos: %u.", length, g_read_pos);
return length;
}
@@ -1837,7 +1837,7 @@ ssize_t cdev_write(struct file *filp, const char __user *data,
g_peer_not_ready = 0;
}
- LOG(DLOG_DEBUG, "data length is %lu, pos: %u (%u/%u)",
+ LOG(DLOG_DEBUG, "data length is %zu, pos: %u (%u/%u)",
count, g_read_pos,
pdev->ptx_queue->pshmqhd_v->count,
pdev->ptx_queue->pshmqhd_v->total);
@@ -1859,4 +1859,4 @@ MODULE_DESCRIPTION("HUAWEI CDEV DRIVER");
MODULE_LICENSE("GPL");
module_init(edma_cdev_init);
-module_exit(edma_cdev_exit);
\ No newline at end of file
+module_exit(edma_cdev_exit);
diff --git a/drivers/net/ethernet/huawei/bma/edma_drv/bma_devintf.c b/drivers/net/ethernet/huawei/bma/edma_drv/bma_devintf.c
index 7817f58f8635..acf6bbfc50ff 100644
--- a/drivers/net/ethernet/huawei/bma/edma_drv/bma_devintf.c
+++ b/drivers/net/ethernet/huawei/bma/edma_drv/bma_devintf.c
@@ -497,7 +497,7 @@ int bma_cdev_add_msg(void *handle, const char __user *msg, size_t msg_len)
hdr->sub_type = priv->user.sub_type;
hdr->user_id = priv->user.user_id;
hdr->datalen = msg_len;
- BMA_LOG(DLOG_DEBUG, "msg_len is %ld\n", msg_len);
+ BMA_LOG(DLOG_DEBUG, "msg_len is %zu\n", msg_len);
if (copy_from_user(hdr->data, msg, msg_len)) {
BMA_LOG(DLOG_ERROR, "copy_from_user error\n");
diff --git a/drivers/net/ethernet/huawei/bma/edma_drv/edma_host.c b/drivers/net/ethernet/huawei/bma/edma_drv/edma_host.c
index 2d5f4ffd79d9..3525d41c865f 100644
--- a/drivers/net/ethernet/huawei/bma/edma_drv/edma_host.c
+++ b/drivers/net/ethernet/huawei/bma/edma_drv/edma_host.c
@@ -789,7 +789,7 @@ static int edma_host_send_msg(struct edma_host_s *edma_host)
if (edma_host->msg_send_write >
HOST_MAX_SEND_MBX_LEN - SIZE_OF_MBX_HDR) {
BMA_LOG(DLOG_ERROR,
- "Length of send message %u is larger than %lu\n",
+ "Length of send message %u is larger than %zu\n",
edma_host->msg_send_write,
HOST_MAX_SEND_MBX_LEN - SIZE_OF_MBX_HDR);
edma_host->msg_send_write = 0;
diff --git a/drivers/net/ethernet/huawei/bma/veth_drv/veth_hb.c b/drivers/net/ethernet/huawei/bma/veth_drv/veth_hb.c
index 9681ce3bfc7b..240db31d7178 100644
--- a/drivers/net/ethernet/huawei/bma/veth_drv/veth_hb.c
+++ b/drivers/net/ethernet/huawei/bma/veth_drv/veth_hb.c
@@ -1488,8 +1488,8 @@ s32 veth_recv_pkt(struct bspveth_rxtx_q *prx_queue, int queue)
skb->len, skb->protocol);
VETH_LOG(DLOG_DEBUG,
- "dma_p=0x%llx,dma_map=0x%llx,",
- pbd_v->dma_p, dma_map);
+ "dma_p=0x%llx,dma_map=%pad,",
+ pbd_v->dma_p, &dma_map);
VETH_LOG(DLOG_DEBUG,
"skb=%p,skb->data=%p,skb->len=%d,tail=%d,shm_off=%d\n",
--
2.20.1
1
0
08 Jan '22
From: Yixing Liu <liuyixing1(a)huawei.com>
mainline inclusion
from mainline-v5.15-rc1
commit 260f64a40198
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4O23M
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
----------------------------------------------------------------------------
The stash feature is enabled by default on HIP09.
Fixes: f93c39bc9547 ("RDMA/hns: Add support for QP stash")
Fixes: bfefae9f108d ("RDMA/hns: Add support for CQ stash")
Link: https://lore.kernel.org/r/1629539607-33217-3-git-send-email-liangwenpeng@hu…
Signed-off-by: Yixing Liu <liuyixing1(a)huawei.com>
Signed-off-by: Wenpeng Liang <liangwenpeng(a)huawei.com>
Signed-off-by: Jason Gunthorpe <jgg(a)nvidia.com>
Signed-off-by: Guofeng Yue <yueguofeng(a)hisilicon.com>
Reviewed-by: Yangyang Li <liyangyang20(a)huawei.com>
Acked-by: Xie XiuQi <xiexiuqi(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
index aea6a6a7b238..4f8738bda0a2 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
@@ -2004,6 +2004,7 @@ static void set_default_caps(struct hns_roce_dev *hr_dev)
caps->gid_table_len[0] = HNS_ROCE_V2_GID_INDEX_NUM;
if (hr_dev->pci_dev->revision >= PCI_REVISION_ID_HIP09) {
+ caps->flags |= HNS_ROCE_CAP_FLAG_STASH;
caps->max_sq_inline = HNS_ROCE_V3_MAX_SQ_INLINE;
} else {
caps->max_sq_inline = HNS_ROCE_V2_MAX_SQ_INLINE;
--
2.20.1
1
20
08 Jan '22
driver inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4ETXO
CVE: NA
-----------------------------------------
Fix the following build warnings in arm32 builds:
drivers/net/ethernet/huawei/bma/edma_drv/bma_devintf.c: In function ‘bma_cdev_add_msg’:
drivers/net/ethernet/huawei/bma/edma_drv/bma_pci.h:92:20: warning: format ‘%ld’ expects argument of type ‘long int’, but argument 5 has type ‘size_t {aka unsigned int}’ [-Wformat=]
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
.../bma/cdev_veth_drv/virtual_cdev_eth_net.c | 14 +++++++-------
.../net/ethernet/huawei/bma/edma_drv/bma_devintf.c | 2 +-
.../net/ethernet/huawei/bma/edma_drv/edma_host.c | 2 +-
drivers/net/ethernet/huawei/bma/veth_drv/veth_hb.c | 2 +-
4 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/drivers/net/ethernet/huawei/bma/cdev_veth_drv/virtual_cdev_eth_net.c b/drivers/net/ethernet/huawei/bma/cdev_veth_drv/virtual_cdev_eth_net.c
index 04ea55f4e0a2..99466848c034 100644
--- a/drivers/net/ethernet/huawei/bma/cdev_veth_drv/virtual_cdev_eth_net.c
+++ b/drivers/net/ethernet/huawei/bma/cdev_veth_drv/virtual_cdev_eth_net.c
@@ -605,7 +605,7 @@ static int edma_veth_cut_tx_packet_send(struct edma_eth_dev_s *eth_dev,
do_queue_rate_limit(eth_dev->ptx_queue);
while (length > 0) {
- LOG(DLOG_DEBUG, "length: %u/%lu", length, len);
+ LOG(DLOG_DEBUG, "length: %u/%u", length, len);
if (length > BSPPACKET_MTU_MAX) {
/* fragment. */
@@ -1689,7 +1689,7 @@ static ssize_t cdev_copy_packet_to_user(struct edma_eth_dev_s *dev,
start = dev->rx_packet[dev->rx_packet_head].packet + g_read_pos;
LOG(DLOG_DEBUG,
- "User needs %ld bytes, pos: %u, total len: %u, left: %ld.",
+ "User needs %d bytes, pos: %u, total len: %u, left: %d.",
count, g_read_pos, dev->rx_packet[dev->rx_packet_head].len, left);
if (left <= 0) {
/* No more data in this message, retry. */
@@ -1721,7 +1721,7 @@ static ssize_t cdev_copy_packet_to_user(struct edma_eth_dev_s *dev,
}
LOG(DLOG_DEBUG,
- "Copied bytes: %ld, pos: %d, buf len: %lu, free_packet: %d.",
+ "Copied bytes: %d, pos: %d, buf len: %u, free_packet: %d.",
length, g_read_pos, count, free_packet);
if (packet) {
@@ -1807,11 +1807,11 @@ ssize_t cdev_read(struct file *filp, char __user *data,
if (!data || count >= MAX_PACKET_LEN)
return -EFAULT;
- LOG(DLOG_DEBUG, "read begin, count: %ld, pos: %u.", count, g_read_pos);
+ LOG(DLOG_DEBUG, "read begin, count: %d, pos: %u.", count, g_read_pos);
length = cdev_copy_packet_to_user(dev, data, count);
- LOG(DLOG_DEBUG, "read done, length: %ld, pos: %u.", length, g_read_pos);
+ LOG(DLOG_DEBUG, "read done, length: %d, pos: %u.", length, g_read_pos);
return length;
}
@@ -1837,7 +1837,7 @@ ssize_t cdev_write(struct file *filp, const char __user *data,
g_peer_not_ready = 0;
}
- LOG(DLOG_DEBUG, "data length is %lu, pos: %u (%u/%u)",
+ LOG(DLOG_DEBUG, "data length is %u, pos: %u (%u/%u)",
count, g_read_pos,
pdev->ptx_queue->pshmqhd_v->count,
pdev->ptx_queue->pshmqhd_v->total);
@@ -1859,4 +1859,4 @@ MODULE_DESCRIPTION("HUAWEI CDEV DRIVER");
MODULE_LICENSE("GPL");
module_init(edma_cdev_init);
-module_exit(edma_cdev_exit);
\ No newline at end of file
+module_exit(edma_cdev_exit);
diff --git a/drivers/net/ethernet/huawei/bma/edma_drv/bma_devintf.c b/drivers/net/ethernet/huawei/bma/edma_drv/bma_devintf.c
index 7817f58f8635..2d63e44f0ed2 100644
--- a/drivers/net/ethernet/huawei/bma/edma_drv/bma_devintf.c
+++ b/drivers/net/ethernet/huawei/bma/edma_drv/bma_devintf.c
@@ -497,7 +497,7 @@ int bma_cdev_add_msg(void *handle, const char __user *msg, size_t msg_len)
hdr->sub_type = priv->user.sub_type;
hdr->user_id = priv->user.user_id;
hdr->datalen = msg_len;
- BMA_LOG(DLOG_DEBUG, "msg_len is %ld\n", msg_len);
+ BMA_LOG(DLOG_DEBUG, "msg_len is %d\n", msg_len);
if (copy_from_user(hdr->data, msg, msg_len)) {
BMA_LOG(DLOG_ERROR, "copy_from_user error\n");
diff --git a/drivers/net/ethernet/huawei/bma/edma_drv/edma_host.c b/drivers/net/ethernet/huawei/bma/edma_drv/edma_host.c
index 2d5f4ffd79d9..a89a4160dedd 100644
--- a/drivers/net/ethernet/huawei/bma/edma_drv/edma_host.c
+++ b/drivers/net/ethernet/huawei/bma/edma_drv/edma_host.c
@@ -789,7 +789,7 @@ static int edma_host_send_msg(struct edma_host_s *edma_host)
if (edma_host->msg_send_write >
HOST_MAX_SEND_MBX_LEN - SIZE_OF_MBX_HDR) {
BMA_LOG(DLOG_ERROR,
- "Length of send message %u is larger than %lu\n",
+ "Length of send message %u is larger than %u\n",
edma_host->msg_send_write,
HOST_MAX_SEND_MBX_LEN - SIZE_OF_MBX_HDR);
edma_host->msg_send_write = 0;
diff --git a/drivers/net/ethernet/huawei/bma/veth_drv/veth_hb.c b/drivers/net/ethernet/huawei/bma/veth_drv/veth_hb.c
index 9681ce3bfc7b..7b511c05c2e9 100644
--- a/drivers/net/ethernet/huawei/bma/veth_drv/veth_hb.c
+++ b/drivers/net/ethernet/huawei/bma/veth_drv/veth_hb.c
@@ -1488,7 +1488,7 @@ s32 veth_recv_pkt(struct bspveth_rxtx_q *prx_queue, int queue)
skb->len, skb->protocol);
VETH_LOG(DLOG_DEBUG,
- "dma_p=0x%llx,dma_map=0x%llx,",
+ "dma_p=0x%llx,dma_map=0x%x,",
pbd_v->dma_p, dma_map);
VETH_LOG(DLOG_DEBUG,
--
2.20.1
2
1
[PATCH openEuler-5.10 001/125] kabi: Generalize naming of kabi helper macros
by Zheng Zengkai 07 Jan '22
hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4K3S5
--------------------------
Generalize naming of some kabi helper macros.
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
Reviewed-by: Xie XiuQi <xiexiuqi(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
include/linux/kabi.h | 38 +++++++++++++++++++-------------------
1 file changed, 19 insertions(+), 19 deletions(-)
diff --git a/include/linux/kabi.h b/include/linux/kabi.h
index 713f63cf56cb..0bc7ca2483f4 100644
--- a/include/linux/kabi.h
+++ b/include/linux/kabi.h
@@ -151,7 +151,7 @@
* approaches can (and often are) combined.
*
* To use this for 'struct foo' (the "base structure"), define a new
- * structure called 'struct foo_rh'; this new struct is called "auxiliary
+ * structure called 'struct foo_resvd'; this new struct is called "auxiliary
* structure". Then add KABI_AUX_EMBED or KABI_AUX_PTR to the end
* of the base structure. The argument is the name of the base structure,
* without the 'struct' keyword.
@@ -174,7 +174,7 @@
* end. Note the auxiliary structure cannot be shrunk in size later (i.e.,
* fields cannot be removed, only deprecated). Any code accessing fields
* from the aux struct must guard the access using the KABI_AUX macro.
- * The access itself is then done via a '_rh' field in the base struct.
+ * The access itself is then done via a '_resvd' field in the base struct.
*
* The auxiliary structure is not guaranteed for access by modules unless
* explicitly commented as such in the declaration of the aux struct
@@ -182,7 +182,7 @@
*
* Example:
*
- * struct foo_rh {
+ * struct foo_resvd {
* int newly_added;
* };
*
@@ -194,20 +194,20 @@
* void use(struct foo *f)
* {
* if (KABI_AUX(f, foo, newly_added))
- * f->_rh->newly_added = 123;
+ * f->_resvd->newly_added = 123;
* else
* // the field 'newly_added' is not present in the passed
* // struct, fall back to old behavior
* f->big_hammer = true;
* }
*
- * static struct foo_rh my_foo_rh {
+ * static struct foo_resvd my_foo_resvd {
* .newly_added = 0;
* }
*
* static struct foo my_foo = {
* .big_hammer = false,
- * ._rh = &my_foo_rh,
+ * ._resvd = &my_foo_resvd,
* KABI_AUX_INIT_SIZE(foo)
* };
*
@@ -218,7 +218,7 @@
*
* Example:
*
- * struct foo_rh {
+ * struct foo_resvd {
* };
*
* struct foo {
@@ -385,7 +385,7 @@
_Static_assert(__alignof__(struct{_new;}) <= __alignof__(struct{_orig;}), \
__FILE__ ":" __stringify(__LINE__) ": " __stringify(_orig) " is not aligned the same as " __stringify(_new) KABI_ALIGN_WARNING); \
}
-# define __ABI_CHECK_SIZE(_item, _size) \
+# define __KABI_CHECK_SIZE(_item, _size) \
_Static_assert(sizeof(struct{_item;}) <= _size, \
__FILE__ ":" __stringify(__LINE__) ": " __stringify(_item) " is larger than the reserved size (" __stringify(_size) " bytes)" KABI_ALIGN_WARNING)
#else
@@ -451,14 +451,14 @@
})
#define _KABI_AUX_PTR(_struct) \
- size_t _struct##_size_rh; \
- _KABI_EXCLUDE(struct _struct##_rh *_rh)
+ size_t _struct##_size_resvd; \
+ _KABI_EXCLUDE(struct _struct##_resvd *_resvd)
#define KABI_AUX_PTR(_struct) \
_KABI_AUX_PTR(_struct);
#define _KABI_AUX_EMBED(_struct) \
- size_t _struct##_size_rh; \
- _KABI_EXCLUDE(struct _struct##_rh _rh)
+ size_t _struct##_size_resvd; \
+ _KABI_EXCLUDE(struct _struct##_resvd _resvd)
#define KABI_AUX_EMBED(_struct) \
_KABI_AUX_EMBED(_struct);
@@ -468,7 +468,7 @@
/*
* KABI_AUX_SET_SIZE calculates and sets the size of the extended struct and
- * stores it in the size_rh field for structs that are dynamically allocated.
+ * stores it in the size_resvd field for structs that are dynamically allocated.
* This macro MUST be called when expanding a base struct with
* KABI_SIZE_AND_EXTEND, and it MUST be called from the allocation site
* regardless of being allocated in the kernel or a module.
@@ -476,28 +476,28 @@
* a semicolon is necessary at the end of the line where it is invoked.
*/
#define KABI_AUX_SET_SIZE(_name, _struct) ({ \
- (_name)->_struct##_size_rh = sizeof(struct _struct##_rh); \
+ (_name)->_struct##_size_resvd = sizeof(struct _struct##_resvd); \
})
/*
* KABI_AUX_INIT_SIZE calculates and sets the size of the extended struct and
- * stores it in the size_rh field for structs that are statically allocated.
+ * stores it in the size_resvd field for structs that are statically allocated.
* This macro MUST be called when expanding a base struct with
* KABI_SIZE_AND_EXTEND, and it MUST be called from the declaration site
* regardless of being allocated in the kernel or a module.
*/
#define KABI_AUX_INIT_SIZE(_struct) \
- ._struct##_size_rh = sizeof(struct _struct##_rh),
+ ._struct##_size_resvd = sizeof(struct _struct##_resvd),
/*
* KABI_AUX verifies allocated memory exists. This MUST be called to
- * verify that memory in the _rh struct is valid, and can be called
+ * verify that memory in the _resvd struct is valid, and can be called
* regardless if KABI_SIZE_AND_EXTEND or KABI_SIZE_AND_EXTEND_PTR is
* used.
*/
#define KABI_AUX(_ptr, _struct, _field) ({ \
- size_t __off = offsetof(struct _struct##_rh, _field); \
- (_ptr)->_struct##_size_rh > __off ? true : false; \
+ size_t __off = offsetof(struct _struct##_resvd, _field); \
+ (_ptr)->_struct##_size_resvd > __off ? true : false; \
})
#endif /* _LINUX_KABI_H */
--
2.20.1
1
124
[PATCH openEuler-5.10 1/2] arm64: Add support for memmap kernel parameters
by Zheng Zengkai 07 Jan '22
From: Peng Liu <liupeng256(a)huawei.com>
hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4NYPZ
CVE: NA
-------------------------------------------------
Add support for memmap kernel parameters for ARM64. The three modes
below are supported:
memmap=exactmap
Enable setting of an exact memory map, as specified by the user.
memmap=nn[KMG]@ss[KMG]
Force usage of a specific region of memory.
memmap=nn[KMG]$ss[KMG]
The region of memory to be reserved is from ss to ss+nn; the region must
be within existing memory, otherwise it will be ignored.
If users set memmap=exactmap before memmap=nn[KMG]@ss[KMG], they will
get the exact memory specified by memmap=nn[KMG]@ss[KMG]. For example,
on a machine with 4GB of memory, "memmap=exactmap memmap=1G@1G" will
make the kernel use the memory from 1GB to 2GB only.
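A userspace sketch (a simplified re-implementation of the kernel's
memparse() suffix handling, not the patch code itself) of how the
"memmap=1G@1G" example above resolves to the 1GB-2GB range:

#include <stdio.h>
#include <stdlib.h>

/* Simplified memparse(): a number plus an optional K/M/G suffix. */
static unsigned long long memparse_sketch(const char *s, char **retptr)
{
	unsigned long long v = strtoull(s, retptr, 0);

	switch (**retptr) {
	case 'G': v <<= 30; (*retptr)++; break;
	case 'M': v <<= 20; (*retptr)++; break;
	case 'K': v <<= 10; (*retptr)++; break;
	}
	return v;
}

int main(void)
{
	char *p = "1G@1G";
	unsigned long long size = memparse_sketch(p, &p);
	unsigned long long base = (*p == '@') ? memparse_sketch(p + 1, &p) : 0;

	/* Prints: memmap=1G@1G -> [0x40000000, 0x80000000) */
	printf("memmap=1G@1G -> [0x%llx, 0x%llx)\n", base, base + size);
	return 0;
}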
Signed-off-by: Peng Liu <liupeng256(a)huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
.../admin-guide/kernel-parameters.txt | 9 ++-
arch/arm64/mm/init.c | 59 +++++++++++++++++++
2 files changed, 65 insertions(+), 3 deletions(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index de8f7d447295..64be32ba4373 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2794,8 +2794,8 @@
option.
See Documentation/admin-guide/mm/memory-hotplug.rst.
- memmap=exactmap [KNL,X86] Enable setting of an exact
- E820 memory map, as specified by the user.
+ memmap=exactmap [KNL,X86,ARM64] Enable setting of an exact
+ E820 and ARM64 memory map, as specified by the user.
Such memmap=exactmap lines can be constructed based on
BIOS output or other requirements. See the memmap=nn@ss
option description.
@@ -2806,7 +2806,8 @@
If @ss[KMG] is omitted, it is equivalent to mem=nn[KMG],
which limits max address to nn[KMG].
Multiple different regions can be specified,
- comma delimited.
+ comma delimited; the example below is not supported on
+ ARM64.
Example:
memmap=100M@2G,100M#3G,1G!1024G
@@ -2817,6 +2818,8 @@
memmap=nn[KMG]$ss[KMG]
[KNL,ACPI] Mark specific memory as reserved.
Region of memory to be reserved is from ss to ss+nn.
+ For ARM64, reserved memory must be in the range of
+ existing memory.
Example: Exclude memory from 0x18690000-0x1869ffff
memmap=64K$0x18690000
or
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index e8d446164c76..f59546d3b0de 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -404,6 +404,65 @@ static int __init reserve_park_mem(void)
}
#endif
+static int need_remove_real_memblock __initdata;
+
+static int __init parse_memmap_one(char *p)
+{
+ char *oldp;
+ u64 start_at, mem_size;
+
+ if (!p)
+ return -EINVAL;
+
+ if (!strncmp(p, "exactmap", 8)) {
+ need_remove_real_memblock = 1;
+ return -EINVAL;
+ }
+
+ oldp = p;
+ mem_size = memparse(p, &p);
+ if (p == oldp)
+ return -EINVAL;
+
+ if (!mem_size)
+ return -EINVAL;
+
+ if (*p == '@') {
+ start_at = memparse(p + 1, &p);
+ /*
+ * use the exactmap defined by nn[KMG]@ss[KMG], remove
+ * memblock populated by DT etc.
+ */
+ if (need_remove_real_memblock) {
+ need_remove_real_memblock = 0;
+ memblock_remove(0, ULLONG_MAX);
+ }
+ memblock_add(start_at, mem_size);
+ } else if (*p == '$') {
+ start_at = memparse(p + 1, &p);
+ memblock_reserve(start_at, mem_size);
+ } else
+ pr_info("Unrecognized memmap option, please check the parameter.\n");
+
+ return *p == '\0' ? 0 : -EINVAL;
+}
+
+static int __init parse_memmap_opt(char *str)
+{
+ while (str) {
+ char *k = strchr(str, ',');
+
+ if (k)
+ *k++ = 0;
+
+ parse_memmap_one(str);
+ str = k;
+ }
+
+ return 0;
+}
+early_param("memmap", parse_memmap_opt);
+
void __init arm64_memblock_init(void)
{
const s64 linear_region_size = BIT(vabits_actual - 1);
--
2.20.1
1
1
Ramaxel inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4ON8F
CVE: NA
Changes:
1. Split scmd_tmout_nonpt into two parameters:
scmd_tmout_vd/scmd_tmout_rawdisk
2. Return -ETIME instead of -EINVAL when command is timeout.
3. Add one module parameter: max_io_force.
Signed-off-by: Yanling Song <songyl(a)ramaxel.com>
Reviewed-by: Jiang Yu<yujiang(a)ramaxel.com>
---
drivers/scsi/spraid/spraid_main.c | 96 +++++++++++++++----------------
1 file changed, 45 insertions(+), 51 deletions(-)
diff --git a/drivers/scsi/spraid/spraid_main.c b/drivers/scsi/spraid/spraid_main.c
index 519b39f44e91..7069582d741a 100644
--- a/drivers/scsi/spraid/spraid_main.c
+++ b/drivers/scsi/spraid/spraid_main.c
@@ -42,10 +42,17 @@ static u32 admin_tmout = 60;
module_param(admin_tmout, uint, 0644);
MODULE_PARM_DESC(admin_tmout, "admin commands timeout (seconds)");
-static u32 scmd_tmout_nonpt = 180;
-module_param(scmd_tmout_nonpt, uint, 0644);
-MODULE_PARM_DESC(scmd_tmout_nonpt,
- "scsi commands timeout for rawdisk&raid(seconds)");
+static u32 scmd_tmout_rawdisk = 180;
+module_param(scmd_tmout_rawdisk, uint, 0644);
+MODULE_PARM_DESC(scmd_tmout_rawdisk, "scsi commands timeout for rawdisk(seconds)");
+
+static u32 scmd_tmout_vd = 180;
+module_param(scmd_tmout_vd, uint, 0644);
+MODULE_PARM_DESC(scmd_tmout_vd, "scsi commands timeout for vd(seconds)");
+
+static bool max_io_force;
+module_param(max_io_force, bool, 0644);
+MODULE_PARM_DESC(max_io_force, "force max_hw_sectors_kb = 1024, default false (performance first)");
static int ioq_depth_set(const char *val, const struct kernel_param *kp);
static const struct kernel_param_ops ioq_depth_ops = {
@@ -78,7 +85,7 @@ static unsigned char log_debug_switch;
module_param_cb(log_debug_switch, &log_debug_switch_ops,
&log_debug_switch, 0644);
MODULE_PARM_DESC(log_debug_switch,
- "set log state, default non-zero for switch on");
+ "set log state, default zero for switch off");
static int small_pool_num_set(const char *val, const struct kernel_param *kp)
{
@@ -132,7 +139,6 @@ static struct workqueue_struct *spraid_wq;
#define SPRAID_DRV_VERSION "1.0.0.0"
#define ADMIN_TIMEOUT (admin_tmout * HZ)
-#define ADMIN_ERR_TIMEOUT 32757
#define SPRAID_WAIT_ABNL_CMD_TIMEOUT (3 * 2)
@@ -242,13 +248,6 @@ static int spraid_pci_enable(struct spraid_dev *hdev)
goto disable;
}
- ret = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_ALL_TYPES);
- if (ret < 0) {
- dev_err(hdev->dev,
- "Allocate one IRQ for setup admin channel failed\n");
- goto disable;
- }
-
hdev->cap = lo_hi_readq(hdev->bar + SPRAID_REG_CAP);
hdev->ioq_depth = min_t(u32, SPRAID_CAP_MQES(hdev->cap) + 1,
io_queue_depth);
@@ -261,13 +260,21 @@ static int spraid_pci_enable(struct spraid_dev *hdev)
maskbit);
maskbit = SPRAID_DMA_MSK_BIT_MAX;
}
- if (dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(maskbit))) {
+ if (dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(maskbit)) &&
+ dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32))) {
dev_err(hdev->dev, "set dma mask and coherent failed\n");
goto disable;
}
dev_info(hdev->dev, "set dma mask[%llu] success\n", maskbit);
+ ret = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_ALL_TYPES);
+
+ if (ret < 0) {
+ dev_err(hdev->dev, "Allocate one IRQ for setup admin channel failed\n");
+ goto disable;
+ }
+
pci_enable_pcie_error_reporting(pdev);
pci_save_state(pdev);
@@ -840,6 +847,9 @@ static void spraid_map_status(struct spraid_iod *iod, struct scsi_cmnd *scmd,
break;
default:
set_host_byte(scmd, DID_BAD_TARGET);
+ dev_warn(iod->spraidq->hdev->dev, "[%s] cid[%d] qid[%d];"
+ "bad status[0x%x]\n", __func__, cqe->cmd_id,
+ le16_to_cpu(cqe->sq_id), le16_to_cpu(cqe->status));
break;
}
}
@@ -981,32 +991,17 @@ static void spraid_slave_destroy(struct scsi_device *sdev)
static int spraid_slave_configure(struct scsi_device *sdev)
{
- u16 idx;
- unsigned int timeout = scmd_tmout_nonpt * HZ;
+ unsigned int timeout = scmd_tmout_rawdisk * HZ;
struct spraid_dev *hdev = shost_priv(sdev->host);
struct spraid_sdev_hostdata *hostdata = sdev->hostdata;
u32 max_sec = sdev->host->max_sectors;
if (hostdata) {
- idx = hostdata->hdid - 1;
- if (sdev->channel == hdev->devices[idx].channel &&
- sdev->id == le16_to_cpu(hdev->devices[idx].target) &&
- sdev->lun < hdev->devices[idx].lun) {
- if (SPRAID_DEV_INFO_ATTR_PT(hdev->devices[idx].attr))
- timeout = 30 * HZ;
- else
- timeout = scmd_tmout_nonpt * HZ;
- max_sec = le16_to_cpu(hdev->devices[idx].max_io_kb)
- << 1;
- } else {
- dev_err(hdev->dev,
- "[%s] err, sdev->channel:id:lun[%d:%d:%lld];"
- "devices[%d], channel:target:lun[%d:%d:%d]\n",
- __func__, sdev->channel, sdev->id, sdev->lun,
- idx, hdev->devices[idx].channel,
- hdev->devices[idx].target,
- hdev->devices[idx].lun);
- }
+ if (SPRAID_DEV_INFO_ATTR_VD(hostdata->attr))
+ timeout = scmd_tmout_vd * HZ;
+ else if (SPRAID_DEV_INFO_ATTR_RAWDISK(hostdata->attr))
+ timeout = scmd_tmout_rawdisk * HZ;
+ max_sec = hostdata->max_io_kb << 1;
} else {
dev_err(hdev->dev, "[%s] err, sdev->hostdata is null\n",
__func__);
@@ -1018,11 +1013,12 @@ static int spraid_slave_configure(struct scsi_device *sdev)
if ((max_sec == 0) || (max_sec > sdev->host->max_sectors))
max_sec = sdev->host->max_sectors;
- dev_info(hdev->dev,
- "[%s] sdev->channel:id:lun[%d:%d:%lld];"
- " scmd_timeout[%d]s, maxsec[%d]\n",
- __func__, sdev->channel, sdev->id,
- sdev->lun, timeout / HZ, max_sec);
+ if (!max_io_force)
+ blk_queue_max_hw_sectors(sdev->request_queue, max_sec);
+
+ dev_info(hdev->dev, "[%s] sdev->channel:id:lun[%d:%d:%lld];"
+ "scmd_timeout[%d]s, maxsec[%d]\n", __func__, sdev->channel,
+ sdev->id, sdev->lun, timeout / HZ, max_sec);
return 0;
}
@@ -1677,7 +1673,7 @@ static int spraid_submit_admin_sync_cmd(struct spraid_dev *hdev,
cmd->usr_cmd.opcode, cmd->usr_cmd.info_0.subopcode);
WRITE_ONCE(adm_cmd->state, SPRAID_CMD_TIMEOUT);
spraid_put_cmd(hdev, adm_cmd, SPRAID_CMD_ADM);
- return -EINVAL;
+ return -ETIME;
}
if (result0)
@@ -2630,9 +2626,6 @@ static int spraid_user_admin_cmd(struct spraid_dev *hdev, struct bsg_job *job)
return -EBUSY;
}
- dev_info(hdev->dev, "[%s] opcode[0x%x] subopcode[0x%x] init\n",
- __func__, cmd->opcode, cmd->info_0.subopcode);
-
memset(&admin_cmd, 0, sizeof(admin_cmd));
admin_cmd.common.opcode = cmd->opcode;
admin_cmd.common.flags = cmd->flags;
@@ -2659,10 +2652,11 @@ static int spraid_user_admin_cmd(struct spraid_dev *hdev, struct bsg_job *job)
memcpy(job->reply, result, sizeof(result));
}
- dev_info(hdev->dev, "[%s] opcode[0x%x] subopcode[0x%x];"
- " status[0x%x] result0[0x%x] result1[0x%x]\n",
- __func__, cmd->opcode, cmd->info_0.subopcode,
- status, result[0], result[1]);
+ if (status)
+ dev_info(hdev->dev, "[%s] opcode[0x%x] subopcode[0x%x];"
+ " status[0x%x] result0[0x%x] result1[0x%x]\n",
+ __func__, cmd->opcode, cmd->info_0.subopcode,
+ status, result[0], result[1]);
spraid_bsg_unmap_data(hdev, job);
@@ -2742,7 +2736,7 @@ static int spraid_submit_ioq_sync_cmd(struct spraid_dev *hdev,
(le32_to_cpu(cmd->common.cdw3[0]) & 0xffff));
WRITE_ONCE(pt_cmd->state, SPRAID_CMD_TIMEOUT);
spraid_put_cmd(hdev, pt_cmd, SPRAID_CMD_IOPT);
- return -EINVAL;
+ return -ETIME;
}
if (result && reslen) {
@@ -3140,7 +3134,7 @@ static int spraid_abort_handler(struct scsi_cmnd *scmd)
dev_warn(hdev->dev, "cid[%d] qid[%d] timeout, aborting\n", cid, hwq);
ret = spraid_send_abort_cmd(hdev, hostdata->hdid, hwq, cid);
- if (ret != ADMIN_ERR_TIMEOUT) {
+ if (ret != -ETIME) {
ret = spraid_wait_abnl_cmd_done(iod);
if (ret) {
dev_warn(hdev->dev, "cid[%d] qid[%d] abort failed;"
@@ -3623,7 +3617,7 @@ static int spraid_bsg_host_dispatch(struct bsg_job *job)
struct spraid_bsg_request *bsg_req = job->request;
int ret = 0;
- dev_info(hdev->dev, "[%s] msgcode[%d], msglen[%d], timeout[%d];"
+ dev_log_dbg(hdev->dev, "[%s] msgcode[%d], msglen[%d], timeout[%d];"
" req_nsge[%d], req_len[%d]\n",
__func__, bsg_req->msgcode, job->request_len,
rq->timeout, job->request_payload.sg_cnt,
--
2.27.0
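For readers of change 1 above, the timeout-selection policy reduces to a small predicate over the device attribute bits. The sketch below mirrors the SPRAID_DEV_INFO_ATTR_VD/SPRAID_DEV_INFO_ATTR_RAWDISK tests from the driver header; the helper name and the fallback value are illustrative, not part of the driver.

/* Attribute tests, mirroring SPRAID_DEV_INFO_ATTR_VD/_RAWDISK. */
#define ATTR_VD(attr)		(((attr) & 0x02) == 0x0)
#define ATTR_RAWDISK(attr)	((attr) & 0x20)

/* Hypothetical helper: pick a SCSI command timeout by device type. */
static unsigned int pick_scmd_timeout(unsigned char attr,
				      unsigned int tmout_vd,
				      unsigned int tmout_rawdisk)
{
	if (ATTR_VD(attr))
		return tmout_vd;
	if (ATTR_RAWDISK(attr))
		return tmout_rawdisk;
	return tmout_rawdisk;	/* conservative default */
}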
[PATCH openEuler-1.0-LTS 1/2] arm64/mpam: fix mpam probe error for wrong init order
by Yang Yingliang 06 Jan '22
From: Xingang Wang <wangxingang5(a)huawei.com>
ascend inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I49RB2
CVE: NA
---------------------------------------------------
The mpam init procedure failed when probing with ACPI:
[ 1.148657 ] ACPI MPAM: No CPU has cache with PPTT reference 0x72
[ 1.148658 ] ACPI MPAM: All CPUs must be online to probe mpam.
[ 1.148660 ] ACPI MPAM: discovery failed: -19
This is because mpam needs to be probed after all CPUs are online:
arm_mpam_driver_init must be called after cacheinfo_sysfs_init, so the
device_initcall should be replaced with device_initcall_sync.
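The ordering guarantee relied on here is that *_initcall_sync() functions run after all plain *_initcall() functions at the same level. A minimal sketch, assuming only <linux/init.h>; the probe function is illustrative:

#include <linux/init.h>

/* Runs after every plain device_initcall(), including
 * cacheinfo_sysfs_init(), so the cacheinfo structures are already
 * populated when this probe executes. */
static int __init example_late_probe(void)
{
	return 0;
}
device_initcall_sync(example_late_probe);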
Fixes: b45bdb5a8604 ("arm64/mpam: add device tree support for mpam initialization")
Signed-off-by: Xingang Wang <wangxingang5(a)huawei.com>
Reviewed-by: Wang ShaoBo <bobo.shaobowang(a)huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
arch/arm64/kernel/mpam/mpam_device.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/kernel/mpam/mpam_device.c b/arch/arm64/kernel/mpam/mpam_device.c
index 4910e99348798..c7b5c50d431bd 100644
--- a/arch/arm64/kernel/mpam/mpam_device.c
+++ b/arch/arm64/kernel/mpam/mpam_device.c
@@ -1876,4 +1876,4 @@ static int __init arm_mpam_driver_init(void)
* We want to run after cacheinfo_sysfs_init() has caused the cacheinfo
* structures to be populated. That runs as a device_initcall.
*/
-device_initcall(arm_mpam_driver_init);
+device_initcall_sync(arm_mpam_driver_init);
--
2.25.1
06 Jan '22
Add CONFIG_KABI_RESERVE to control KABI padding reserve and enable it by
default.
Zheng Zengkai (2):
KABI: Add CONFIG_KABI_RESERVE to control KABI padding reserve
openeuler_defconfig: Enable CONFIG_KABI_RESERVE for x86 and arm64
Kconfig | 8 ++++++++
arch/arm64/configs/openeuler_defconfig | 1 +
arch/x86/configs/openeuler_defconfig | 1 +
include/linux/kabi.h | 6 +++++-
4 files changed, 15 insertions(+), 1 deletion(-)
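For illustration, a minimal sketch of how the reserved padding is typically used. The struct is hypothetical, and it assumes KABI_RESERVE(n) expands to an unused placeholder member when CONFIG_KABI_RESERVE=y and to nothing otherwise:

#include <linux/kabi.h>

/* Reserved slots keep the struct layout and size stable, so a later
 * fix can repurpose a slot for a new field without breaking kABI. */
struct example_driver_ops {
	int (*probe)(void *dev);
	void (*remove)(void *dev);
	KABI_RESERVE(1)
	KABI_RESERVE(2)
};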
--
2.20.1
hulk inclusion
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MZU1
CVE: NA
---------------------------
On the x86 platform, with make allmodconfig && make -j64, the
following two build errors are reported.
First error:
- build failed:
- In file included from <command-line>:32:0:
./usr/include/asm/bootparam.h:45:10: fatal error: linux/kabi.h: No such file or directory
#include <linux/kabi.h>
^~~~~~~~~~~~~~
compilation terminated.
make[2]: *** [usr/include/asm/bootparam.hdrtest] Error 1
Second error:
./arch/x86/include/asm/paravirt_types.h:198:2: error: expected specifier-qualifier-list before ‘KABI_RESERVE’
KABI_RESERVE(1)
^~~~~~~~~~~~
./arch/x86/include/asm/paravirt_types.h:286:2: error: expected specifier-qualifier-list before ‘KABI_RESERVE’
KABI_RESERVE(1)
^~~~~~~~~~~~
./arch/x86/include/asm/paravirt_types.h:309:2: error: expected specifier-qualifier-list before ‘KABI_RESERVE’
KABI_RESERVE(1)
^~~~~~~~~~~~
make[1]: *** [scripts/Makefile.build:117: arch/x86/kernel/asm-offsets.s] Error 1
To fix the first error, revert commit 3ba63bacfc ("bootparam: Add kabi_reserve in bootparam").
To fix the second error, include <linux/kabi.h> in arch/x86/include/asm/paravirt_types.h.
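The second error illustrates a general rule: any header that expands KABI_RESERVE() must include <linux/kabi.h> itself, otherwise the preprocessor leaves the bare token and the compiler reports "expected specifier-qualifier-list". A minimal sketch with a hypothetical struct:

#include <linux/kabi.h>	/* must come before any KABI_RESERVE() use */

struct pv_example_ops {
	unsigned long (*save_fl)(void);
	KABI_RESERVE(1)
};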
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
arch/x86/include/asm/paravirt_types.h | 2 ++
arch/x86/include/uapi/asm/bootparam.h | 2 --
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index 98eb135b1888..17a2308ef3af 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -2,6 +2,8 @@
#ifndef _ASM_X86_PARAVIRT_TYPES_H
#define _ASM_X86_PARAVIRT_TYPES_H
+#include <linux/kabi.h>
+
/* Bitmask of what can be clobbered: usually at least eax. */
#define CLBR_NONE 0
#define CLBR_EAX (1 << 0)
diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
index b1dce768d073..600a141c8805 100644
--- a/arch/x86/include/uapi/asm/bootparam.h
+++ b/arch/x86/include/uapi/asm/bootparam.h
@@ -42,7 +42,6 @@
#include <linux/types.h>
#include <linux/screen_info.h>
#include <linux/apm_bios.h>
-#include <linux/kabi.h>
#include <linux/edd.h>
#include <asm/ist.h>
#include <video/edid.h>
@@ -103,7 +102,6 @@ struct setup_header {
__u32 init_size;
__u32 handover_offset;
__u32 kernel_info_offset;
- KABI_RESERVE(1)
} __attribute__((packed));
struct sys_desc_table {
--
2.20.1
[PATCH openEuler-1.0-LTS 1/3] bpf: Use dedicated bpf_trace_printk event instead of trace_printk()
by Liu Xinpeng 06 Jan '22
From: Alan Maguire <alan.maguire(a)oracle.com>
mainline inclusion
from mainline-v5.10.78
commit ac5a72ea5c8989871e61f6bb0852e0f91de51ebe
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4PKDW
CVE: NA
--------------------------------
The bpf helper bpf_trace_printk() uses trace_printk() under the hood.
This leads to an alarming warning message originating from trace
buffer allocation which occurs the first time a program using
bpf_trace_printk() is loaded.
We can instead create a trace event for bpf_trace_printk() and enable
it in-kernel when/if we encounter a program using the
bpf_trace_printk() helper. With this approach, trace_printk()
is not used directly and no warning message appears.
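For context, loading any program that calls bpf_trace_printk() takes this path. A minimal libbpf-style sketch (the tracepoint attach point is illustrative):

// SPDX-License-Identifier: GPL-2.0
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

SEC("tracepoint/syscalls/sys_enter_execve")
int trace_execve(void *ctx)
{
	char fmt[] = "execve called\n";

	/* Loading this program enables the bpf_trace_printk event. */
	bpf_trace_printk(fmt, sizeof(fmt));
	return 0;
}

char LICENSE[] SEC("license") = "GPL";

With this patch applied, the output reaches the trace buffer via the bpf_trace/bpf_trace_printk event rather than through trace_printk() directly.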
This work was started by Steven (see Link) and finished by Alan; added
Steven's Signed-off-by with his permission.
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
Signed-off-by: Alan Maguire <alan.maguire(a)oracle.com>
Signed-off-by: Alexei Starovoitov <ast(a)kernel.org>
Acked-by: Andrii Nakryiko <andriin(a)fb.com>
Link: https://lore.kernel.org/r/20200628194334.6238b933@oasis.local.home
Link: https://lore.kernel.org/bpf/1594641154-18897-2-git-send-email-alan.maguire@…
Signed-off-by: Liu Xinpeng <liuxp11(a)chinatelecom.cn> # openEuler_contributor
Signed-off-by: Ctyun Kernel <ctyuncommiter01(a)chinatelecom.cn> # openEuler_contributor
---
kernel/trace/Makefile | 2 ++
kernel/trace/bpf_trace.c | 42 +++++++++++++++++++++++++++++++++++++-----
kernel/trace/bpf_trace.h | 34 ++++++++++++++++++++++++++++++++++
3 files changed, 73 insertions(+), 5 deletions(-)
create mode 100644 kernel/trace/bpf_trace.h
diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
index f81dadbc7..846b6b0 100644
--- a/kernel/trace/Makefile
+++ b/kernel/trace/Makefile
@@ -28,6 +28,8 @@ ifdef CONFIG_GCOV_PROFILE_FTRACE
GCOV_PROFILE := y
endif
+CFLAGS_bpf_trace.o := -I$(src)
+
CFLAGS_trace_benchmark.o := -I$(src)
CFLAGS_trace_events_filter.o := -I$(src)
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index e0b0e92..f359f79 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -11,12 +11,16 @@
#include <linux/uaccess.h>
#include <linux/ctype.h>
#include <linux/kprobes.h>
+#include <linux/spinlock.h>
#include <linux/syscalls.h>
#include <linux/error-injection.h>
#include "trace_probe.h"
#include "trace.h"
+#define CREATE_TRACE_POINTS
+#include "bpf_trace.h"
+
#ifdef CONFIG_MODULES
struct bpf_trace_module {
struct module *module;
@@ -318,6 +322,30 @@ static const struct bpf_func_proto *bpf_get_probe_write_proto(void)
return &bpf_probe_write_user_proto;
}
+static DEFINE_RAW_SPINLOCK(trace_printk_lock);
+
+#define BPF_TRACE_PRINTK_SIZE 1024
+
+static inline __printf(1, 0) int bpf_do_trace_printk(const char *fmt, ...)
+{
+ static char buf[BPF_TRACE_PRINTK_SIZE];
+ unsigned long flags;
+ va_list ap;
+ int ret;
+
+ raw_spin_lock_irqsave(&trace_printk_lock, flags);
+ va_start(ap, fmt);
+ ret = vsnprintf(buf, sizeof(buf), fmt, ap);
+ va_end(ap);
+ /* vsnprintf() will not append null for zero-length strings */
+ if (ret == 0)
+ buf[0] = '\0';
+ trace_bpf_trace_printk(buf);
+ raw_spin_unlock_irqrestore(&trace_printk_lock, flags);
+
+ return ret;
+}
+
/*
* Only limited trace_printk() conversion specifiers allowed:
* %d %i %u %x %ld %li %lu %lx %lld %lli %llu %llx %p %s
@@ -408,8 +436,7 @@ static const struct bpf_func_proto *bpf_get_probe_write_proto(void)
*/
#define __BPF_TP_EMIT() __BPF_ARG3_TP()
#define __BPF_TP(...) \
- __trace_printk(0 /* Fake ip */, \
- fmt, ##__VA_ARGS__)
+ bpf_do_trace_printk(fmt, ##__VA_ARGS__)
#define __BPF_ARG1_TP(...) \
((mod[0] == 2 || (mod[0] == 1 && __BITS_PER_LONG == 64)) \
@@ -446,10 +473,15 @@ static const struct bpf_func_proto *bpf_get_probe_write_proto(void)
const struct bpf_func_proto *bpf_get_trace_printk_proto(void)
{
/*
- * this program might be calling bpf_trace_printk,
- * so allocate per-cpu printk buffers
+ * This program might be calling bpf_trace_printk,
+ * so enable the associated bpf_trace/bpf_trace_printk event.
+ * Repeat this each time as it is possible a user has
+ * disabled bpf_trace_printk events. By loading a program
+ * calling bpf_trace_printk() however the user has expressed
+ * the intent to see such events.
*/
- trace_printk_init_buffers();
+ if (trace_set_clr_event("bpf_trace", "bpf_trace_printk", 1))
+ pr_warn_ratelimited("could not enable bpf_trace_printk events");
return &bpf_trace_printk_proto;
}
diff --git a/kernel/trace/bpf_trace.h b/kernel/trace/bpf_trace.h
new file mode 100644
index 00000000..9acbc11
--- /dev/null
+++ b/kernel/trace/bpf_trace.h
@@ -0,0 +1,34 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM bpf_trace
+
+#if !defined(_TRACE_BPF_TRACE_H) || defined(TRACE_HEADER_MULTI_READ)
+
+#define _TRACE_BPF_TRACE_H
+
+#include <linux/tracepoint.h>
+
+TRACE_EVENT(bpf_trace_printk,
+
+ TP_PROTO(const char *bpf_string),
+
+ TP_ARGS(bpf_string),
+
+ TP_STRUCT__entry(
+ __string(bpf_string, bpf_string)
+ ),
+
+ TP_fast_assign(
+ __assign_str(bpf_string, bpf_string);
+ ),
+
+ TP_printk("%s", __get_str(bpf_string))
+);
+
+#endif /* _TRACE_BPF_TRACE_H */
+
+#undef TRACE_INCLUDE_PATH
+#define TRACE_INCLUDE_PATH .
+#define TRACE_INCLUDE_FILE bpf_trace
+
+#include <trace/define_trace.h>
--
1.8.3.1
06 Jan '22
Ramaxel inclusion
category: features
bugzilla: https://gitee.com/openeuler/kernel/issues/I4JXCG
CVE: NA
Changes:
1. Use bsg module to replace with ioctrl
2. Split scmd_tmout_nonpt into two parameters:
scmd_tmout_vd/scmd_tmout_rawdisk
3. Return -ETIME instead of -EINVAL when a command times out.
4. Add one module parameter: max_io_force.
5. Remove some unnecessary module parameters.
6. Report disks in the order of channel/target id.
7. Add host_reset handler.
8. Use get_unaligned_be24.
Signed-off-by: Yanling Song <songyl(a)ramaxel.com>
Reviewed-by: Jiang Yu<yujiang(a)ramaxel.com>
---
drivers/scsi/spraid/Kconfig | 6 +-
drivers/scsi/spraid/spraid.h | 142 ++-
drivers/scsi/spraid/spraid_main.c | 1633 +++++++++++++++--------------
3 files changed, 968 insertions(+), 813 deletions(-)
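As an aside on change 8, the unaligned helpers replace open-coded shift-and-or byte assembly. A minimal sketch of decoding the 21-bit LBA of a READ(6) CDB, mirroring the hunk in spraid_setup_rw_cmd() below; the wrapper function is illustrative:

#include <linux/types.h>
#include <asm/unaligned.h>

/* READ(6): bytes 1..3 carry a big-endian 21-bit LBA. */
static u32 read6_lba(const u8 *cdb)
{
	return get_unaligned_be24(&cdb[1]) & 0x1FFFFF;
}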
diff --git a/drivers/scsi/spraid/Kconfig b/drivers/scsi/spraid/Kconfig
index 83962efaab07..bfbba3db8db0 100644
--- a/drivers/scsi/spraid/Kconfig
+++ b/drivers/scsi/spraid/Kconfig
@@ -5,7 +5,9 @@
config RAMAXEL_SPRAID
tristate "Ramaxel spraid Adapter"
depends on PCI && SCSI
+ select BLK_DEV_BSGLIB
depends on ARM64 || X86_64
- default m
help
- This driver supports Ramaxel spraid driver.
+ This driver supports the Ramaxel SPRxxx series
+ RAID controllers, which have a PCIe Gen4 host
+ interface and support SAS/SATA HDDs and SSDs.
diff --git a/drivers/scsi/spraid/spraid.h b/drivers/scsi/spraid/spraid.h
index da46d8e1b4b6..983d7af2faa8 100644
--- a/drivers/scsi/spraid/spraid.h
+++ b/drivers/scsi/spraid/spraid.h
@@ -1,4 +1,5 @@
/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Ramaxel Memory Technology, Ltd */
#ifndef __SPRAID_H_
#define __SPRAID_H_
@@ -24,7 +25,7 @@
#define SENSE_SIZE(depth) ((depth) * SCSI_SENSE_BUFFERSIZE)
#define SPRAID_AQ_DEPTH 128
-#define SPRAID_NR_AEN_COMMANDS 1
+#define SPRAID_NR_AEN_COMMANDS 16
#define SPRAID_AQ_BLK_MQ_DEPTH (SPRAID_AQ_DEPTH - SPRAID_NR_AEN_COMMANDS)
#define SPRAID_AQ_MQ_TAG_DEPTH (SPRAID_AQ_BLK_MQ_DEPTH - 1)
@@ -44,7 +45,7 @@
#define SMALL_POOL_SIZE 256
#define MAX_SMALL_POOL_NUM 16
-#define MAX_CMD_PER_DEV 32
+#define MAX_CMD_PER_DEV 64
#define MAX_CDB_LEN 32
#define SPRAID_UP_TO_MULTY4(x) (((x) + 4) & (~0x03))
@@ -53,7 +54,7 @@
#define PCI_VENDOR_ID_RAMAXEL_LOGIC 0x1E81
-#define SPRAID_SERVER_DEVICE_HAB_DID 0x2100
+#define SPRAID_SERVER_DEVICE_HBA_DID 0x2100
#define SPRAID_SERVER_DEVICE_RAID_DID 0x2200
#define IO_6_DEFAULT_TX_LEN 256
@@ -142,11 +143,15 @@ enum {
enum {
SPRAID_AEN_DEV_CHANGED = 0x00,
+ SPRAID_AEN_FW_ACT_START = 0x01,
SPRAID_AEN_HOST_PROBING = 0x10,
};
enum {
- SPRAID_AEN_TIMESYN = 0x07
+ SPRAID_AEN_TIMESYN = 0x00,
+ SPRAID_AEN_FW_ACT_FINISH = 0x02,
+ SPRAID_AEN_EVENT_MIN = 0x80,
+ SPRAID_AEN_EVENT_MAX = 0xff,
};
enum {
@@ -175,6 +180,16 @@ enum spraid_state {
SPRAID_DEAD,
};
+enum {
+ SPRAID_CARD_HBA,
+ SPRAID_CARD_RAID,
+};
+
+enum spraid_cmd_type {
+ SPRAID_CMD_ADM,
+ SPRAID_CMD_IOPT,
+};
+
struct spraid_completion {
__le32 result;
union {
@@ -217,8 +232,6 @@ struct spraid_dev {
struct dma_pool *prp_page_pool;
struct dma_pool *prp_small_pool[MAX_SMALL_POOL_NUM];
mempool_t *iod_mempool;
- struct blk_mq_tag_set admin_tagset;
- struct request_queue *admin_q;
void __iomem *bar;
u32 max_qid;
u32 num_vecs;
@@ -232,23 +245,27 @@ struct spraid_dev {
u32 ctrl_config;
u32 online_queues;
u64 cap;
- struct device ctrl_device;
- struct cdev cdev;
int instance;
struct spraid_ctrl_info *ctrl_info;
struct spraid_dev_info *devices;
- struct spraid_ioq_ptcmd *ioq_ptcmds;
+ struct spraid_cmd *adm_cmds;
+ struct list_head adm_cmd_list;
+ spinlock_t adm_cmd_lock;
+
+ struct spraid_cmd *ioq_ptcmds;
struct list_head ioq_pt_list;
spinlock_t ioq_pt_lock;
- struct work_struct aen_work;
struct work_struct scan_work;
struct work_struct timesyn_work;
struct work_struct reset_work;
+ struct work_struct fw_act_work;
enum spraid_state state;
spinlock_t state_lock;
+
+ struct request_queue *bsg_queue;
};
struct spraid_sgl_desc {
@@ -347,6 +364,35 @@ struct spraid_get_info {
__u32 rsvd12[4];
};
+struct spraid_usr_cmd {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ union {
+ struct {
+ __le16 subopcode;
+ __le16 rsvd1;
+ } info_0;
+ __le32 cdw2;
+ };
+ union {
+ struct {
+ __le16 data_len;
+ __le16 param_len;
+ } info_1;
+ __le32 cdw3;
+ };
+ __u64 metadata;
+ union spraid_data_ptr dptr;
+ __le32 cdw10;
+ __le32 cdw11;
+ __le32 cdw12;
+ __le32 cdw13;
+ __le32 cdw14;
+ __le32 cdw15;
+};
+
enum {
SPRAID_CMD_FLAG_SGL_METABUF = (1 << 6),
SPRAID_CMD_FLAG_SGL_METASEG = (1 << 7),
@@ -393,6 +439,7 @@ struct spraid_admin_command {
struct spraid_get_info get_info;
struct spraid_abort_cmd abort;
struct spraid_reset_cmd reset;
+ struct spraid_usr_cmd usr_cmd;
};
};
@@ -456,9 +503,6 @@ struct spraid_ioq_command {
};
};
-#define SPRAID_IOCTL_RESET_CMD _IOWR('N', 0x80, struct spraid_passthru_common_cmd)
-#define SPRAID_IOCTL_ADMIN_CMD _IOWR('N', 0x41, struct spraid_passthru_common_cmd)
-
struct spraid_passthru_common_cmd {
__u8 opcode;
__u8 flags;
@@ -494,8 +538,6 @@ struct spraid_passthru_common_cmd {
__u32 result1;
};
-#define SPRAID_IOCTL_IOQ_CMD _IOWR('N', 0x42, struct spraid_ioq_passthru_cmd)
-
struct spraid_ioq_passthru_cmd {
__u8 opcode;
__u8 flags;
@@ -560,7 +602,21 @@ struct spraid_ioq_passthru_cmd {
__u32 result1;
};
-struct spraid_ioq_ptcmd {
+struct spraid_bsg_request {
+ u32 msgcode;
+ u32 control;
+ union {
+ struct spraid_passthru_common_cmd admcmd;
+ struct spraid_ioq_passthru_cmd ioqcmd;
+ };
+};
+
+enum {
+ SPRAID_BSG_ADM,
+ SPRAID_BSG_IOQ,
+};
+
+struct spraid_cmd {
int qid;
int cid;
u32 result0;
@@ -572,14 +628,6 @@ struct spraid_ioq_ptcmd {
struct list_head list;
};
-struct spraid_admin_request {
- struct spraid_admin_command *cmd;
- u32 result0;
- u32 result1;
- u16 flags;
- u16 status;
-};
-
struct spraid_queue {
struct spraid_dev *hdev;
spinlock_t sq_lock; /* spinlock for lock handling */
@@ -607,7 +655,6 @@ struct spraid_queue {
};
struct spraid_iod {
- struct spraid_admin_request req;
struct spraid_queue *spraidq;
enum spraid_cmd_state state;
int npages;
@@ -623,13 +670,51 @@ struct spraid_iod {
};
#define SPRAID_DEV_INFO_ATTR_BOOT(attr) ((attr) & 0x01)
-#define SPRAID_DEV_INFO_ATTR_HDD(attr) ((attr) & 0x02)
+#define SPRAID_DEV_INFO_ATTR_VD(attr) (((attr) & 0x02) == 0x0)
#define SPRAID_DEV_INFO_ATTR_PT(attr) (((attr) & 0x22) == 0x02)
#define SPRAID_DEV_INFO_ATTR_RAWDISK(attr) ((attr) & 0x20)
#define SPRAID_DEV_INFO_FLAG_VALID(flag) ((flag) & 0x01)
#define SPRAID_DEV_INFO_FLAG_CHANGE(flag) ((flag) & 0x02)
+#define BGTASK_TYPE_REBUILD 4
+#define USR_CMD_READ 0xc2
+#define USR_CMD_RDLEN 0x1000
+#define USR_CMD_VDINFO 0x704
+#define USR_CMD_BGTASK 0x504
+#define VDINFO_PARAM_LEN 0x04
+
+struct spraid_vd_info {
+ __u8 name[32];
+ __le16 id;
+ __u8 rg_id;
+ __u8 rg_level;
+ __u8 sg_num;
+ __u8 sg_disk_num;
+ __u8 vd_status;
+ __u8 vd_type;
+ __u8 rsvd1[4056];
+};
+
+#define MAX_REALTIME_BGTASK_NUM 32
+
+struct bgtask_info {
+ __u8 type;
+ __u8 progress;
+ __u8 rate;
+ __u8 rsvd0;
+ __le16 vd_id;
+ __le16 time_left;
+ __u8 rsvd1[4];
+};
+
+struct spraid_bgtask {
+ __u8 sw;
+ __u8 task_num;
+ __u8 rsvd[6];
+ struct bgtask_info bgtask[MAX_REALTIME_BGTASK_NUM];
+};
+
struct spraid_dev_info {
__le32 hdid;
__le16 target;
@@ -649,6 +734,11 @@ struct spraid_dev_list {
struct spraid_sdev_hostdata {
u32 hdid;
+ u16 max_io_kb;
+ u8 attr;
+ u8 flag;
+ u8 rg_id;
+ u8 rsvd[3];
};
#endif
diff --git a/drivers/scsi/spraid/spraid_main.c b/drivers/scsi/spraid/spraid_main.c
index a0a75ecb0027..c6d2a0b8e35e 100644
--- a/drivers/scsi/spraid/spraid_main.c
+++ b/drivers/scsi/spraid/spraid_main.c
@@ -1,8 +1,8 @@
// SPDX-License-Identifier: GPL-2.0
-/*
- * Linux spraid device driver
- * Copyright(c) 2021 Ramaxel Memory Technology, Ltd
- */
+/* Copyright(c) 2021 Ramaxel Memory Technology, Ltd */
+
+/* Ramaxel Raid SPXXX Series Linux Driver */
+
#define pr_fmt(fmt) "spraid: " fmt
#include <linux/sched/signal.h>
@@ -23,6 +23,9 @@
#include <linux/debugfs.h>
#include <linux/io-64-nonatomic-lo-hi.h>
#include <linux/blkdev.h>
+#include <linux/bsg-lib.h>
+#include <asm/unaligned.h>
+#include <linux/sort.h>
#include <scsi/scsi.h>
#include <scsi/scsi_cmnd.h>
@@ -31,27 +34,24 @@
#include <scsi/scsi_transport.h>
#include <scsi/scsi_dbg.h>
+
#include "spraid.h"
static u32 admin_tmout = 60;
module_param(admin_tmout, uint, 0644);
MODULE_PARM_DESC(admin_tmout, "admin commands timeout (seconds)");
-static u32 scmd_tmout_pt = 30;
-module_param(scmd_tmout_pt, uint, 0644);
-MODULE_PARM_DESC(scmd_tmout_pt, "scsi commands timeout for passthrough(seconds)");
+static u32 scmd_tmout_rawdisk = 180;
+module_param(scmd_tmout_rawdisk, uint, 0644);
+MODULE_PARM_DESC(scmd_tmout_rawdisk, "scsi commands timeout for rawdisk(seconds)");
-static u32 scmd_tmout_nonpt = 180;
-module_param(scmd_tmout_nonpt, uint, 0644);
-MODULE_PARM_DESC(scmd_tmout_nonpt, "scsi commands timeout for rawdisk&raid(seconds)");
+static u32 scmd_tmout_vd = 180;
+module_param(scmd_tmout_vd, uint, 0644);
+MODULE_PARM_DESC(scmd_tmout_vd, "scsi commands timeout for vd(seconds)");
-static u32 wait_abl_tmout = 3;
-module_param(wait_abl_tmout, uint, 0644);
-MODULE_PARM_DESC(wait_abl_tmout, "wait abnormal io timeout(seconds)");
-
-static bool use_sgl_force;
-module_param(use_sgl_force, bool, 0644);
-MODULE_PARM_DESC(use_sgl_force, "force IO use sgl format, default false");
+static bool max_io_force;
+module_param(max_io_force, bool, 0644);
+MODULE_PARM_DESC(max_io_force, "force max_hw_sectors_kb = 1024, default false (performance first)");
static int ioq_depth_set(const char *val, const struct kernel_param *kp);
static const struct kernel_param_ops ioq_depth_ops = {
@@ -106,16 +106,20 @@ static const struct kernel_param_ops small_pool_num_ops = {
.get = param_get_byte,
};
+/* It was found that the spinlock of a single pool is heavily
+ * contended across multiple CPUs, so multiple pools are
+ * introduced to reduce the contention.
+ */
static unsigned char small_pool_num = 4;
module_param_cb(small_pool_num, &small_pool_num_ops, &small_pool_num, 0644);
MODULE_PARM_DESC(small_pool_num, "set prp small pool num, default 4, MAX 16");
static void spraid_free_queue(struct spraid_queue *spraidq);
static void spraid_handle_aen_notice(struct spraid_dev *hdev, u32 result);
-static void spraid_handle_aen_vs(struct spraid_dev *hdev, u32 result);
+static void spraid_handle_aen_vs(struct spraid_dev *hdev, u32 result, u32 result1);
static DEFINE_IDA(spraid_instance_ida);
-static dev_t spraid_chr_devt;
+
static struct class *spraid_class;
#define SPRAID_CAP_TIMEOUT_UNIT_MS (HZ / 2)
@@ -131,9 +135,8 @@ static struct workqueue_struct *spraid_wq;
#define SPRAID_DRV_VERSION "1.0.0.0"
#define ADMIN_TIMEOUT (admin_tmout * HZ)
-#define ADMIN_ERR_TIMEOUT 32757
-#define SPRAID_WAIT_ABNL_CMD_TIMEOUT (wait_abl_tmout * 2)
+#define SPRAID_WAIT_ABNL_CMD_TIMEOUT (3 * 2)
#define SPRAID_DMA_MSK_BIT_MAX 64
@@ -147,6 +150,13 @@ enum FW_STAT_CODE {
FW_STAT_NEED_RETRY
};
+static const char * const raid_levels[] = {"0", "1", "5", "6", "10", "50", "60", "NA"};
+
+static const char * const raid_states[] = {
+ "NA", "NORMAL", "FAULT", "DEGRADE", "NOT_FORMATTED", "FORMATTING", "SANITIZING",
+ "INITIALIZING", "INITIALIZE_FAIL", "DELETING", "DELETE_FAIL", "WRITE_PROTECT"
+};
+
static int ioq_depth_set(const char *val, const struct kernel_param *kp)
{
int n = 0;
@@ -231,12 +241,6 @@ static int spraid_pci_enable(struct spraid_dev *hdev)
goto disable;
}
- ret = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_ALL_TYPES);
- if (ret < 0) {
- dev_err(hdev->dev, "Allocate one IRQ for setup admin channel failed\n");
- goto disable;
- }
-
hdev->cap = lo_hi_readq(hdev->bar + SPRAID_REG_CAP);
hdev->ioq_depth = min_t(u32, SPRAID_CAP_MQES(hdev->cap) + 1, io_queue_depth);
hdev->db_stride = 1 << SPRAID_CAP_STRIDE(hdev->cap);
@@ -246,13 +250,20 @@ static int spraid_pci_enable(struct spraid_dev *hdev)
dev_err(hdev->dev, "err, dma mask invalid[%llu], set to default\n", maskbit);
maskbit = SPRAID_DMA_MSK_BIT_MAX;
}
- if (dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(maskbit))) {
+ if (dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(maskbit)) &&
+ dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32))) {
dev_err(hdev->dev, "set dma mask and coherent failed\n");
goto disable;
}
dev_info(hdev->dev, "set dma mask[%llu] success\n", maskbit);
+ ret = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_ALL_TYPES);
+ if (ret < 0) {
+ dev_err(hdev->dev, "Allocate one IRQ for setup admin channel failed\n");
+ goto disable;
+ }
+
pci_enable_pcie_error_reporting(pdev);
pci_save_state(pdev);
@@ -263,12 +274,6 @@ static int spraid_pci_enable(struct spraid_dev *hdev)
return ret;
}
-static inline
-struct spraid_admin_request *spraid_admin_req(struct request *req)
-{
- return blk_mq_rq_to_pdu(req);
-}
-
static int spraid_npages_prp(u32 size, struct spraid_dev *hdev)
{
u32 nprps = DIV_ROUND_UP(size + hdev->page_size, hdev->page_size);
@@ -419,7 +424,7 @@ static void spraid_submit_cmd(struct spraid_queue *spraidq, const void *cmd)
writel(spraidq->sq_tail, spraidq->q_db);
spin_unlock_irqrestore(&spraidq->sq_lock, flags);
- dev_log_dbg(spraidq->hdev->dev, "cid[%d], qid[%d], opcode[0x%x], flags[0x%x], hdid[%u]\n",
+ dev_log_dbg(spraidq->hdev->dev, "cid[%d] qid[%d], opcode[0x%x], flags[0x%x], hdid[%u]\n",
acd->command_id, spraidq->qid, acd->opcode, acd->flags, le32_to_cpu(acd->hdid));
}
@@ -605,18 +610,15 @@ static void spraid_setup_rw_cmd(struct spraid_dev *hdev,
if (scmd->cmd_len == 6) {
datalength = (u32)(scmd->cmnd[4] == 0 ?
IO_6_DEFAULT_TX_LEN : scmd->cmnd[4]);
- start_lba_lo = ((u32)scmd->cmnd[1] << 16) |
- ((u32)scmd->cmnd[2] << 8) | (u32)scmd->cmnd[3];
+ start_lba_lo = (u32)get_unaligned_be24(&scmd->cmnd[1]);
start_lba_lo &= 0x1FFFFF;
}
/* 10-byte READ(0x28) or WRITE(0x2A) cdb */
else if (scmd->cmd_len == 10) {
- datalength = (u32)scmd->cmnd[8] | ((u32)scmd->cmnd[7] << 8);
- start_lba_lo = ((u32)scmd->cmnd[2] << 24) |
- ((u32)scmd->cmnd[3] << 16) |
- ((u32)scmd->cmnd[4] << 8) | (u32)scmd->cmnd[5];
+ datalength = (u32)get_unaligned_be16(&scmd->cmnd[7]);
+ start_lba_lo = get_unaligned_be32(&scmd->cmnd[2]);
if (scmd->cmnd[1] & FUA_MASK)
control |= SPRAID_RW_FUA;
@@ -624,42 +626,26 @@ static void spraid_setup_rw_cmd(struct spraid_dev *hdev,
/* 12-byte READ(0xA8) or WRITE(0xAA) cdb */
else if (scmd->cmd_len == 12) {
- datalength = ((u32)scmd->cmnd[6] << 24) |
- ((u32)scmd->cmnd[7] << 16) |
- ((u32)scmd->cmnd[8] << 8) | (u32)scmd->cmnd[9];
- start_lba_lo = ((u32)scmd->cmnd[2] << 24) |
- ((u32)scmd->cmnd[3] << 16) |
- ((u32)scmd->cmnd[4] << 8) | (u32)scmd->cmnd[5];
+ datalength = get_unaligned_be32(&scmd->cmnd[6]);
+ start_lba_lo = get_unaligned_be32(&scmd->cmnd[2]);
if (scmd->cmnd[1] & FUA_MASK)
control |= SPRAID_RW_FUA;
}
/* 16-byte READ(0x88) or WRITE(0x8A) cdb */
else if (scmd->cmd_len == 16) {
- datalength = ((u32)scmd->cmnd[10] << 24) |
- ((u32)scmd->cmnd[11] << 16) |
- ((u32)scmd->cmnd[12] << 8) | (u32)scmd->cmnd[13];
- start_lba_lo = ((u32)scmd->cmnd[6] << 24) |
- ((u32)scmd->cmnd[7] << 16) |
- ((u32)scmd->cmnd[8] << 8) | (u32)scmd->cmnd[9];
- start_lba_hi = ((u32)scmd->cmnd[2] << 24) |
- ((u32)scmd->cmnd[3] << 16) |
- ((u32)scmd->cmnd[4] << 8) | (u32)scmd->cmnd[5];
+ datalength = get_unaligned_be32(&scmd->cmnd[10]);
+ start_lba_lo = get_unaligned_be32(&scmd->cmnd[6]);
+ start_lba_hi = get_unaligned_be32(&scmd->cmnd[2]);
if (scmd->cmnd[1] & FUA_MASK)
control |= SPRAID_RW_FUA;
}
/* 32-byte READ(0x88) or WRITE(0x8A) cdb */
else if (scmd->cmd_len == 32) {
- datalength = ((u32)scmd->cmnd[28] << 24) |
- ((u32)scmd->cmnd[29] << 16) |
- ((u32)scmd->cmnd[30] << 8) | (u32)scmd->cmnd[31];
- start_lba_lo = ((u32)scmd->cmnd[16] << 24) |
- ((u32)scmd->cmnd[17] << 16) |
- ((u32)scmd->cmnd[18] << 8) | (u32)scmd->cmnd[19];
- start_lba_hi = ((u32)scmd->cmnd[12] << 24) |
- ((u32)scmd->cmnd[13] << 16) |
- ((u32)scmd->cmnd[14] << 8) | (u32)scmd->cmnd[15];
+ datalength = get_unaligned_be32(&scmd->cmnd[28]);
+ start_lba_lo = get_unaligned_be32(&scmd->cmnd[16]);
+ start_lba_hi = get_unaligned_be32(&scmd->cmnd[12]);
if (scmd->cmnd[10] & FUA_MASK)
control |= SPRAID_RW_FUA;
@@ -814,7 +800,7 @@ static void spraid_map_status(struct spraid_iod *iod, struct scsi_cmnd *scmd,
if (scmd->result & SAM_STAT_CHECK_CONDITION) {
memset(scmd->sense_buffer, 0, SCSI_SENSE_BUFFERSIZE);
memcpy(scmd->sense_buffer, iod->sense, SCSI_SENSE_BUFFERSIZE);
- set_driver_byte(scmd, DRIVER_SENSE);
+ scmd->result = (scmd->result & 0x00ffffff) | (DRIVER_SENSE << 24);
}
break;
case FW_STAT_ABORTED:
@@ -825,6 +811,8 @@ static void spraid_map_status(struct spraid_iod *iod, struct scsi_cmnd *scmd,
break;
default:
set_host_byte(scmd, DID_BAD_TARGET);
+ dev_warn(iod->spraidq->hdev->dev, "[%s] cid[%d] qid[%d] bad status[0x%x]\n",
+ __func__, cqe->cmd_id, le16_to_cpu(cqe->sq_id), le16_to_cpu(cqe->status));
break;
}
}
@@ -850,14 +838,13 @@ static int spraid_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
int ret;
if (unlikely(!scmd)) {
- dev_err(hdev->dev, "err, scmd is null, return 0\n");
+ dev_err(hdev->dev, "err, scmd is null\n");
return 0;
}
if (unlikely(hdev->state != SPRAID_LIVE)) {
set_host_byte(scmd, DID_NO_CONNECT);
scmd->scsi_done(scmd);
- dev_err(hdev->dev, "[%s] err, hdev state is not live\n", __func__);
return 0;
}
@@ -894,7 +881,7 @@ static int spraid_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
WRITE_ONCE(iod->state, SPRAID_CMD_IN_FLIGHT);
spraid_submit_cmd(ioq, &ioq_cmd);
elapsed = jiffies - scmd->jiffies_at_alloc;
- dev_log_dbg(hdev->dev, "cid[%d], qid[%d] submit IO cost %3ld.%3ld seconds\n",
+ dev_log_dbg(hdev->dev, "cid[%d] qid[%d] submit IO cost %3ld.%3ld seconds\n",
cid, hwq, elapsed / HZ, elapsed % HZ);
return 0;
@@ -945,6 +932,10 @@ static int spraid_slave_alloc(struct scsi_device *sdev)
scan_host:
hostdata->hdid = le32_to_cpu(hdev->devices[idx].hdid);
+ hostdata->max_io_kb = le16_to_cpu(hdev->devices[idx].max_io_kb);
+ hostdata->attr = hdev->devices[idx].attr;
+ hostdata->flag = hdev->devices[idx].flag;
+ hostdata->rg_id = 0xff;
sdev->hostdata = hostdata;
up_read(&hdev->devices_rwsem);
return 0;
@@ -958,30 +949,17 @@ static void spraid_slave_destroy(struct scsi_device *sdev)
static int spraid_slave_configure(struct scsi_device *sdev)
{
- u16 idx;
- unsigned int timeout = scmd_tmout_nonpt * HZ;
+ unsigned int timeout = scmd_tmout_rawdisk * HZ;
struct spraid_dev *hdev = shost_priv(sdev->host);
struct spraid_sdev_hostdata *hostdata = sdev->hostdata;
u32 max_sec = sdev->host->max_sectors;
- if (!hostdata) {
- idx = hostdata->hdid - 1;
- if (sdev->channel == hdev->devices[idx].channel &&
- sdev->id == le16_to_cpu(hdev->devices[idx].target) &&
- sdev->lun < hdev->devices[idx].lun) {
- if (SPRAID_DEV_INFO_ATTR_PT(hdev->devices[idx].attr))
- timeout = scmd_tmout_pt * HZ;
- else
- timeout = scmd_tmout_nonpt * HZ;
- max_sec = le16_to_cpu(hdev->devices[idx].max_io_kb) << 1;
- } else {
- dev_err(hdev->dev, "[%s] err, sdev->channel:id:lun[%d:%d:%lld];"
- "devices[%d], channel:target:lun[%d:%d:%d]\n",
- __func__, sdev->channel, sdev->id, sdev->lun,
- idx, hdev->devices[idx].channel,
- hdev->devices[idx].target,
- hdev->devices[idx].lun);
- }
+ if (hostdata) {
+ if (SPRAID_DEV_INFO_ATTR_VD(hostdata->attr))
+ timeout = scmd_tmout_vd * HZ;
+ else if (SPRAID_DEV_INFO_ATTR_RAWDISK(hostdata->attr))
+ timeout = scmd_tmout_rawdisk * HZ;
+ max_sec = hostdata->max_io_kb << 1;
} else {
dev_err(hdev->dev, "[%s] err, sdev->hostdata is null\n", __func__);
}
@@ -991,7 +969,9 @@ static int spraid_slave_configure(struct scsi_device *sdev)
if ((max_sec == 0) || (max_sec > sdev->host->max_sectors))
max_sec = sdev->host->max_sectors;
- blk_queue_max_hw_sectors(sdev->request_queue, max_sec);
+
+ if (!max_io_force)
+ blk_queue_max_hw_sectors(sdev->request_queue, max_sec);
dev_info(hdev->dev, "[%s] sdev->channel:id:lun[%d:%d:%lld], scmd_timeout[%d]s, maxsec[%d]\n",
__func__, sdev->channel, sdev->id, sdev->lun, timeout / HZ, max_sec);
@@ -1176,6 +1156,75 @@ static inline bool spraid_cqe_pending(struct spraid_queue *spraidq)
spraidq->cq_phase;
}
+static void spraid_sata_report_zone_handle(struct scsi_cmnd *scmd, struct spraid_iod *iod)
+{
+ int i = 0;
+ unsigned int bytes = 0;
+ struct scatterlist *sg = scsi_sglist(scmd);
+
+ scsi_for_each_sg(scmd, sg, iod->nsge, i) {
+ unsigned int offset = 0;
+
+ if (bytes == 0) {
+ char *hdr;
+ u32 list_length;
+ u64 max_lba, opt_lba;
+ u16 same;
+
+ hdr = sg_virt(sg);
+
+ list_length = get_unaligned_le32(&hdr[0]);
+ same = get_unaligned_le16(&hdr[4]);
+ max_lba = get_unaligned_le64(&hdr[8]);
+ opt_lba = get_unaligned_le64(&hdr[16]);
+ put_unaligned_be32(list_length, &hdr[0]);
+ hdr[4] = same & 0xf;
+ put_unaligned_be64(max_lba, &hdr[8]);
+ put_unaligned_be64(opt_lba, &hdr[16]);
+ offset += 64;
+ bytes += 64;
+ }
+ while (offset < sg_dma_len(sg)) {
+ char *rec;
+ u8 cond, type, non_seq, reset;
+ u64 size, start, wp;
+
+ rec = sg_virt(sg) + offset;
+ type = rec[0] & 0xf;
+ cond = (rec[1] >> 4) & 0xf;
+ non_seq = (rec[1] & 2);
+ reset = (rec[1] & 1);
+ size = get_unaligned_le64(&rec[8]);
+ start = get_unaligned_le64(&rec[16]);
+ wp = get_unaligned_le64(&rec[24]);
+ rec[0] = type;
+ rec[1] = (cond << 4) | non_seq | reset;
+ put_unaligned_be64(size, &rec[8]);
+ put_unaligned_be64(start, &rec[16]);
+ put_unaligned_be64(wp, &rec[24]);
+ WARN_ON(offset + 64 > sg_dma_len(sg));
+ offset += 64;
+ bytes += 64;
+ }
+ }
+}
+
+static inline void spraid_handle_ata_cmd(struct spraid_dev *hdev, struct scsi_cmnd *scmd,
+ struct spraid_iod *iod)
+{
+ if (hdev->ctrl_info->card_type != SPRAID_CARD_HBA)
+ return;
+
+ switch (scmd->cmnd[0]) {
+ case ZBC_IN:
+ dev_info(hdev->dev, "[%s] process report zone\n", __func__);
+ spraid_sata_report_zone_handle(scmd, iod);
+ break;
+ default:
+ break;
+ }
+}
+
static void spraid_complete_ioq_cmnd(struct spraid_queue *ioq, struct spraid_completion *cqe)
{
struct spraid_dev *hdev = ioq->hdev;
@@ -1197,12 +1246,12 @@ static void spraid_complete_ioq_cmnd(struct spraid_queue *ioq, struct spraid_com
iod = scsi_cmd_priv(scmd);
elapsed = jiffies - scmd->jiffies_at_alloc;
- dev_log_dbg(hdev->dev, "cid[%d], qid[%d] finish IO cost %3ld.%3ld seconds\n",
+ dev_log_dbg(hdev->dev, "cid[%d] qid[%d] finish IO cost %3ld.%3ld seconds\n",
cqe->cmd_id, ioq->qid, elapsed / HZ, elapsed % HZ);
if (cmpxchg(&iod->state, SPRAID_CMD_IN_FLIGHT, SPRAID_CMD_COMPLETE) !=
SPRAID_CMD_IN_FLIGHT) {
- dev_warn(hdev->dev, "cid[%d], qid[%d] enters abnormal handler, cost %3ld.%3ld seconds\n",
+ dev_warn(hdev->dev, "cid[%d] qid[%d] enters abnormal handler, cost %3ld.%3ld seconds\n",
cqe->cmd_id, ioq->qid, elapsed / HZ, elapsed % HZ);
WRITE_ONCE(iod->state, SPRAID_CMD_TMO_COMPLETE);
@@ -1215,6 +1264,8 @@ static void spraid_complete_ioq_cmnd(struct spraid_queue *ioq, struct spraid_com
return;
}
+ spraid_handle_ata_cmd(hdev, scmd, iod);
+
spraid_map_status(iod, scmd, cqe);
if (iod->nsge) {
iod->nsge = 0;
@@ -1224,38 +1275,36 @@ static void spraid_complete_ioq_cmnd(struct spraid_queue *ioq, struct spraid_com
scmd->scsi_done(scmd);
}
-static inline void spraid_end_admin_request(struct request *req, __le16 status,
- __le32 result0, __le32 result1)
-{
- struct spraid_admin_request *rq = spraid_admin_req(req);
-
- rq->status = le16_to_cpu(status) >> 1;
- rq->result0 = le32_to_cpu(result0);
- rq->result1 = le32_to_cpu(result1);
- blk_mq_complete_request(req);
-}
-
static void spraid_complete_adminq_cmnd(struct spraid_queue *adminq, struct spraid_completion *cqe)
{
- struct blk_mq_tags *tags = adminq->hdev->admin_tagset.tags[0];
- struct request *req;
+ struct spraid_dev *hdev = adminq->hdev;
+ struct spraid_cmd *adm_cmd;
- req = blk_mq_tag_to_rq(tags, cqe->cmd_id);
- if (unlikely(!req)) {
+ adm_cmd = hdev->adm_cmds + cqe->cmd_id;
+ if (unlikely(adm_cmd->state == SPRAID_CMD_IDLE)) {
dev_warn(adminq->hdev->dev, "Invalid id %d completed on queue %d\n",
cqe->cmd_id, le16_to_cpu(cqe->sq_id));
return;
}
- spraid_end_admin_request(req, cqe->status, cqe->result, cqe->result1);
+
+ adm_cmd->status = le16_to_cpu(cqe->status) >> 1;
+ adm_cmd->result0 = le32_to_cpu(cqe->result);
+ adm_cmd->result1 = le32_to_cpu(cqe->result1);
+
+ complete(&adm_cmd->cmd_done);
}
+static void spraid_send_aen(struct spraid_dev *hdev, u16 cid);
+
static void spraid_complete_aen(struct spraid_queue *spraidq, struct spraid_completion *cqe)
{
struct spraid_dev *hdev = spraidq->hdev;
u32 result = le32_to_cpu(cqe->result);
- dev_info(hdev->dev, "rcv aen, status[%x], result[%x]\n",
- le16_to_cpu(cqe->status) >> 1, result);
+ dev_info(hdev->dev, "rcv aen, cid[%d], status[0x%x], result[0x%x]\n",
+ cqe->cmd_id, le16_to_cpu(cqe->status) >> 1, result);
+
+ spraid_send_aen(hdev, cqe->cmd_id);
if ((le16_to_cpu(cqe->status) >> 1) != SPRAID_SC_SUCCESS)
return;
@@ -1264,22 +1313,19 @@ static void spraid_complete_aen(struct spraid_queue *spraidq, struct spraid_comp
spraid_handle_aen_notice(hdev, result);
break;
case SPRAID_AEN_VS:
- spraid_handle_aen_vs(hdev, result);
+ spraid_handle_aen_vs(hdev, result, le32_to_cpu(cqe->result1));
break;
default:
dev_warn(hdev->dev, "Unsupported async event type: %u\n",
result & 0x7);
break;
}
- queue_work(spraid_wq, &hdev->aen_work);
}
-static void spraid_put_ioq_ptcmd(struct spraid_dev *hdev, struct spraid_ioq_ptcmd *cmd);
-
static void spraid_complete_ioq_sync_cmnd(struct spraid_queue *ioq, struct spraid_completion *cqe)
{
struct spraid_dev *hdev = ioq->hdev;
- struct spraid_ioq_ptcmd *ptcmd;
+ struct spraid_cmd *ptcmd;
ptcmd = hdev->ioq_ptcmds + (ioq->qid - 1) * SPRAID_PTCMDS_PERQ +
cqe->cmd_id - SPRAID_IO_BLK_MQ_DEPTH;
@@ -1289,8 +1335,6 @@ static void spraid_complete_ioq_sync_cmnd(struct spraid_queue *ioq, struct sprai
ptcmd->result1 = le32_to_cpu(cqe->result1);
complete(&ptcmd->cmd_done);
-
- spraid_put_ioq_ptcmd(hdev, ptcmd);
}
static inline void spraid_handle_cqe(struct spraid_queue *spraidq, u16 idx)
@@ -1304,7 +1348,7 @@ static inline void spraid_handle_cqe(struct spraid_queue *spraidq, u16 idx)
return;
}
- dev_log_dbg(hdev->dev, "cid[%d], qid[%d], result[0x%x], sq_id[%d], status[0x%x]\n",
+ dev_log_dbg(hdev->dev, "cid[%d] qid[%d], result[0x%x], sq_id[%d], status[0x%x]\n",
cqe->cmd_id, spraidq->qid, le32_to_cpu(cqe->result),
le16_to_cpu(cqe->sq_id), le16_to_cpu(cqe->status));
@@ -1452,62 +1496,119 @@ static u32 spraid_bar_size(struct spraid_dev *hdev, u32 nr_ioqs)
return (SPRAID_REG_DBS + ((nr_ioqs + 1) * 8 * hdev->db_stride));
}
-static inline void spraid_clear_spraid_request(struct request *req)
+static int spraid_alloc_admin_cmds(struct spraid_dev *hdev)
{
- if (!(req->rq_flags & RQF_DONTPREP)) {
- spraid_admin_req(req)->flags = 0;
- req->rq_flags |= RQF_DONTPREP;
+ int i;
+
+ INIT_LIST_HEAD(&hdev->adm_cmd_list);
+ spin_lock_init(&hdev->adm_cmd_lock);
+
+ hdev->adm_cmds = kcalloc_node(SPRAID_AQ_BLK_MQ_DEPTH, sizeof(struct spraid_cmd),
+ GFP_KERNEL, hdev->numa_node);
+
+ if (!hdev->adm_cmds) {
+ dev_err(hdev->dev, "Alloc admin cmds failed\n");
+ return -ENOMEM;
+ }
+
+ for (i = 0; i < SPRAID_AQ_BLK_MQ_DEPTH; i++) {
+ hdev->adm_cmds[i].qid = 0;
+ hdev->adm_cmds[i].cid = i;
+ list_add_tail(&(hdev->adm_cmds[i].list), &hdev->adm_cmd_list);
}
+
+ dev_info(hdev->dev, "Alloc admin cmds success, num[%d]\n", SPRAID_AQ_BLK_MQ_DEPTH);
+
+ return 0;
}
-static struct request *spraid_alloc_admin_request(struct request_queue *q,
- struct spraid_admin_command *cmd,
- blk_mq_req_flags_t flags)
+static void spraid_free_admin_cmds(struct spraid_dev *hdev)
{
- u32 op = COMMAND_IS_WRITE(cmd) ? REQ_OP_DRV_OUT : REQ_OP_DRV_IN;
- struct request *req;
+ kfree(hdev->adm_cmds);
+ hdev->adm_cmds = NULL;
+ INIT_LIST_HEAD(&hdev->adm_cmd_list);
+}
+
+static struct spraid_cmd *spraid_get_cmd(struct spraid_dev *hdev, enum spraid_cmd_type type)
+{
+ struct spraid_cmd *cmd = NULL;
+ unsigned long flags;
+ struct list_head *head = &hdev->adm_cmd_list;
+ spinlock_t *slock = &hdev->adm_cmd_lock;
+
+ if (type == SPRAID_CMD_IOPT) {
+ head = &hdev->ioq_pt_list;
+ slock = &hdev->ioq_pt_lock;
+ }
+
+ spin_lock_irqsave(slock, flags);
+ if (list_empty(head)) {
+ spin_unlock_irqrestore(slock, flags);
+ dev_err(hdev->dev, "err, cmd[%d] list empty\n", type);
+ return NULL;
+ }
+ cmd = list_entry(head->next, struct spraid_cmd, list);
+ list_del_init(&cmd->list);
+ spin_unlock_irqrestore(slock, flags);
- req = blk_mq_alloc_request(q, op, flags);
- if (IS_ERR(req))
- return req;
- req->cmd_flags |= REQ_FAILFAST_DRIVER;
- spraid_clear_spraid_request(req);
- spraid_admin_req(req)->cmd = cmd;
+ WRITE_ONCE(cmd->state, SPRAID_CMD_IN_FLIGHT);
- return req;
+ return cmd;
}
-static int spraid_submit_admin_sync_cmd(struct request_queue *q,
- struct spraid_admin_command *cmd,
- u32 *result, void *buffer,
- u32 bufflen, u32 timeout, int at_head, blk_mq_req_flags_t flags)
+static void spraid_put_cmd(struct spraid_dev *hdev, struct spraid_cmd *cmd,
+ enum spraid_cmd_type type)
{
- struct request *req;
- int ret;
+ unsigned long flags;
+ struct list_head *head = &hdev->adm_cmd_list;
+ spinlock_t *slock = &hdev->adm_cmd_lock;
- req = spraid_alloc_admin_request(q, cmd, flags);
- if (IS_ERR(req))
- return PTR_ERR(req);
+ if (type == SPRAID_CMD_IOPT) {
+ head = &hdev->ioq_pt_list;
+ slock = &hdev->ioq_pt_lock;
+ }
- req->timeout = timeout ? timeout : ADMIN_TIMEOUT;
- if (buffer && bufflen) {
- ret = blk_rq_map_kern(q, req, buffer, bufflen, GFP_KERNEL);
- if (ret)
- goto out;
+ spin_lock_irqsave(slock, flags);
+ WRITE_ONCE(cmd->state, SPRAID_CMD_IDLE);
+ list_add_tail(&cmd->list, head);
+ spin_unlock_irqrestore(slock, flags);
+}
+
+
+static int spraid_submit_admin_sync_cmd(struct spraid_dev *hdev, struct spraid_admin_command *cmd,
+ u32 *result0, u32 *result1, u32 timeout)
+{
+ struct spraid_cmd *adm_cmd = spraid_get_cmd(hdev, SPRAID_CMD_ADM);
+
+ if (!adm_cmd) {
+ dev_err(hdev->dev, "err, get admin cmd failed\n");
+ return -EFAULT;
}
- blk_execute_rq(req->q, NULL, req, at_head);
- if (result)
- *result = spraid_admin_req(req)->result0;
+ timeout = timeout ? timeout : ADMIN_TIMEOUT;
- if (spraid_admin_req(req)->flags & SPRAID_REQ_CANCELLED)
- ret = -EINTR;
- else
- ret = spraid_admin_req(req)->status;
+ init_completion(&adm_cmd->cmd_done);
-out:
- blk_mq_free_request(req);
- return ret;
+ cmd->common.command_id = adm_cmd->cid;
+ spraid_submit_cmd(&hdev->queues[0], cmd);
+
+ if (!wait_for_completion_timeout(&adm_cmd->cmd_done, timeout)) {
+ dev_err(hdev->dev, "[%s] cid[%d] qid[%d] timeout, opcode[0x%x] subopcode[0x%x]\n",
+ __func__, adm_cmd->cid, adm_cmd->qid, cmd->usr_cmd.opcode,
+ cmd->usr_cmd.info_0.subopcode);
+ WRITE_ONCE(adm_cmd->state, SPRAID_CMD_TIMEOUT);
+ spraid_put_cmd(hdev, adm_cmd, SPRAID_CMD_ADM);
+ return -ETIME;
+ }
+
+ if (result0)
+ *result0 = adm_cmd->result0;
+ if (result1)
+ *result1 = adm_cmd->result1;
+
+ spraid_put_cmd(hdev, adm_cmd, SPRAID_CMD_ADM);
+
+ return adm_cmd->status;
}
static int spraid_create_cq(struct spraid_dev *hdev, u16 qid,
@@ -1524,8 +1625,7 @@ static int spraid_create_cq(struct spraid_dev *hdev, u16 qid,
admin_cmd.create_cq.cq_flags = cpu_to_le16(flags);
admin_cmd.create_cq.irq_vector = cpu_to_le16(cq_vector);
- return spraid_submit_admin_sync_cmd(hdev->admin_q, &admin_cmd, NULL,
- NULL, 0, 0, 0, 0);
+ return spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
}
static int spraid_create_sq(struct spraid_dev *hdev, u16 qid,
@@ -1542,8 +1642,7 @@ static int spraid_create_sq(struct spraid_dev *hdev, u16 qid,
admin_cmd.create_sq.sq_flags = cpu_to_le16(flags);
admin_cmd.create_sq.cqid = cpu_to_le16(qid);
- return spraid_submit_admin_sync_cmd(hdev->admin_q, &admin_cmd, NULL,
- NULL, 0, 0, 0, 0);
+ return spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
}
static void spraid_free_queue(struct spraid_queue *spraidq)
@@ -1581,8 +1680,7 @@ static int spraid_delete_queue(struct spraid_dev *hdev, u8 op, u16 id)
admin_cmd.delete_queue.opcode = op;
admin_cmd.delete_queue.qid = cpu_to_le16(id);
- ret = spraid_submit_admin_sync_cmd(hdev->admin_q, &admin_cmd, NULL,
- NULL, 0, 0, 0, 0);
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
if (ret)
dev_err(hdev->dev, "Delete %s:[%d] failed\n",
@@ -1663,19 +1761,28 @@ static int spraid_set_features(struct spraid_dev *hdev, u32 fid, u32 dword11, vo
size_t buflen, u32 *result)
{
struct spraid_admin_command admin_cmd;
- u32 res;
int ret;
+ u8 *data_ptr = NULL;
+ dma_addr_t data_dma = 0;
+
+ if (buffer && buflen) {
+ data_ptr = dma_alloc_coherent(hdev->dev, buflen, &data_dma, GFP_KERNEL);
+ if (!data_ptr)
+ return -ENOMEM;
+
+ memcpy(data_ptr, buffer, buflen);
+ }
memset(&admin_cmd, 0, sizeof(admin_cmd));
admin_cmd.features.opcode = SPRAID_ADMIN_SET_FEATURES;
admin_cmd.features.fid = cpu_to_le32(fid);
admin_cmd.features.dword11 = cpu_to_le32(dword11);
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
- ret = spraid_submit_admin_sync_cmd(hdev->admin_q, &admin_cmd, &res,
- buffer, buflen, 0, 0, 0);
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, result, NULL, 0);
- if (!ret && result)
- *result = res;
+ if (data_ptr)
+ dma_free_coherent(hdev->dev, buflen, data_ptr, data_dma);
return ret;
}
@@ -1764,8 +1871,7 @@ static int spraid_setup_io_queues(struct spraid_dev *hdev)
break;
}
dev_info(hdev->dev, "[%s] max_qid: %d, queue_count: %d, online_queue: %d, ioq_depth: %d\n",
- __func__, hdev->max_qid, hdev->queue_count,
- hdev->online_queues, hdev->ioq_depth);
+ __func__, hdev->max_qid, hdev->queue_count, hdev->online_queues, hdev->ioq_depth);
return spraid_create_io_queues(hdev);
}
@@ -1889,10 +1995,11 @@ static int spraid_get_dev_list(struct spraid_dev *hdev, struct spraid_dev_info *
u32 nd = le32_to_cpu(hdev->ctrl_info->nd);
struct spraid_admin_command admin_cmd;
struct spraid_dev_list *list_buf;
+ dma_addr_t data_dma = 0;
u32 i, idx, hdid, ndev;
int ret = 0;
- list_buf = kmalloc(sizeof(*list_buf), GFP_KERNEL);
+ list_buf = dma_alloc_coherent(hdev->dev, PAGE_SIZE, &data_dma, GFP_KERNEL);
if (!list_buf)
return -ENOMEM;
@@ -1901,9 +2008,9 @@ static int spraid_get_dev_list(struct spraid_dev *hdev, struct spraid_dev_info *
admin_cmd.get_info.opcode = SPRAID_ADMIN_GET_INFO;
admin_cmd.get_info.type = SPRAID_GET_INFO_DEV_LIST;
admin_cmd.get_info.cdw11 = cpu_to_le32(idx);
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
- ret = spraid_submit_admin_sync_cmd(hdev->admin_q, &admin_cmd, NULL, list_buf,
- sizeof(*list_buf), 0, 0, 0);
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
if (ret) {
dev_err(hdev->dev, "Get device list failed, nd: %u, idx: %u, ret: %d\n",
@@ -1916,12 +2023,11 @@ static int spraid_get_dev_list(struct spraid_dev *hdev, struct spraid_dev_info *
for (i = 0; i < ndev; i++) {
hdid = le32_to_cpu(list_buf->devices[i].hdid);
- dev_info(hdev->dev, "list_buf->devices[%d], hdid: %u target: %d, channel: %d, lun: %d, attr[%x]\n",
- i, hdid,
- le16_to_cpu(list_buf->devices[i].target),
- list_buf->devices[i].channel,
- list_buf->devices[i].lun,
- list_buf->devices[i].attr);
+ dev_info(hdev->dev, "list_buf->devices[%d], hdid: %u target: %d, channel: %d, lun: %d, attr[0x%x]\n",
+ i, hdid, le16_to_cpu(list_buf->devices[i].target),
+ list_buf->devices[i].channel,
+ list_buf->devices[i].lun,
+ list_buf->devices[i].attr);
if (hdid > nd || hdid == 0) {
dev_err(hdev->dev, "err, hdid[%d] invalid\n", hdid);
continue;
@@ -1936,21 +2042,29 @@ static int spraid_get_dev_list(struct spraid_dev *hdev, struct spraid_dev_info *
}
out:
- kfree(list_buf);
+ dma_free_coherent(hdev->dev, PAGE_SIZE, list_buf, data_dma);
return ret;
}
-static void spraid_send_aen(struct spraid_dev *hdev)
+static void spraid_send_aen(struct spraid_dev *hdev, u16 cid)
{
struct spraid_queue *adminq = &hdev->queues[0];
struct spraid_admin_command admin_cmd;
memset(&admin_cmd, 0, sizeof(admin_cmd));
admin_cmd.common.opcode = SPRAID_ADMIN_ASYNC_EVENT;
- admin_cmd.common.command_id = SPRAID_AQ_BLK_MQ_DEPTH;
+ admin_cmd.common.command_id = cid;
spraid_submit_cmd(adminq, &admin_cmd);
- dev_info(hdev->dev, "send aen, cid[%d]\n", SPRAID_AQ_BLK_MQ_DEPTH);
+ dev_info(hdev->dev, "send aen, cid[%d]\n", cid);
+}
+
+static inline void spraid_send_all_aen(struct spraid_dev *hdev)
+{
+ u16 i;
+
+ for (i = 0; i < hdev->ctrl_info->aerl; i++)
+ spraid_send_aen(hdev, i + SPRAID_AQ_BLK_MQ_DEPTH);
}
static int spraid_add_device(struct spraid_dev *hdev, struct spraid_dev_info *device)
@@ -1958,6 +2072,10 @@ static int spraid_add_device(struct spraid_dev *hdev, struct spraid_dev_info *de
struct Scsi_Host *shost = hdev->shost;
struct scsi_device *sdev;
+ dev_info(hdev->dev, "add device, hdid: %u target: %d, channel: %d, lun: %d, attr[0x%x]\n",
+ le32_to_cpu(device->hdid), le16_to_cpu(device->target),
+ device->channel, device->lun, device->attr);
+
sdev = scsi_device_lookup(shost, device->channel, le16_to_cpu(device->target), 0);
if (sdev) {
dev_warn(hdev->dev, "Device is already exist, channel: %d, target_id: %d, lun: %d\n",
@@ -1974,9 +2092,13 @@ static int spraid_rescan_device(struct spraid_dev *hdev, struct spraid_dev_info
struct Scsi_Host *shost = hdev->shost;
struct scsi_device *sdev;
+ dev_info(hdev->dev, "rescan device, hdid: %u target: %d, channel: %d, lun: %d, attr[0x%x]\n",
+ le32_to_cpu(device->hdid), le16_to_cpu(device->target),
+ device->channel, device->lun, device->attr);
+
sdev = scsi_device_lookup(shost, device->channel, le16_to_cpu(device->target), 0);
if (!sdev) {
- dev_warn(hdev->dev, "Device is not exit, channel: %d, target_id: %d, lun: %d\n",
+ dev_warn(hdev->dev, "device is not exit rescan it, channel: %d, target_id: %d, lun: %d\n",
device->channel, le16_to_cpu(device->target), 0);
return -ENODEV;
}
@@ -1991,9 +2113,13 @@ static int spraid_remove_device(struct spraid_dev *hdev, struct spraid_dev_info
struct Scsi_Host *shost = hdev->shost;
struct scsi_device *sdev;
+ dev_info(hdev->dev, "remove device, hdid: %u target: %d, channel: %d, lun: %d, attr[0x%x]\n",
+ le32_to_cpu(org_device->hdid), le16_to_cpu(org_device->target),
+ org_device->channel, org_device->lun, org_device->attr);
+
sdev = scsi_device_lookup(shost, org_device->channel, le16_to_cpu(org_device->target), 0);
if (!sdev) {
- dev_warn(hdev->dev, "Device is not exit, channel: %d, target_id: %d, lun: %d\n",
+ dev_warn(hdev->dev, "device is not exit remove it, channel: %d, target_id: %d, lun: %d\n",
org_device->channel, le16_to_cpu(org_device->target), 0);
return -ENODEV;
}
@@ -2029,36 +2155,54 @@ static int spraid_dev_list_init(struct spraid_dev *hdev)
return 0;
}
+static int luntarget_cmp_func(const void *l, const void *r)
+{
+ const struct spraid_dev_info *ln = l;
+ const struct spraid_dev_info *rn = r;
+
+ if (ln->channel == rn->channel)
+ return le16_to_cpu(ln->target) - le16_to_cpu(rn->target);
+
+ return ln->channel - rn->channel;
+}
+
static void spraid_scan_work(struct work_struct *work)
{
struct spraid_dev *hdev =
container_of(work, struct spraid_dev, scan_work);
struct spraid_dev_info *devices, *org_devices;
+ struct spraid_dev_info *sortdevice;
u32 nd = le32_to_cpu(hdev->ctrl_info->nd);
u8 flag, org_flag;
int i, ret;
+ int count = 0;
devices = kcalloc(nd, sizeof(struct spraid_dev_info), GFP_KERNEL);
if (!devices)
return;
+
+ sortdevice = kcalloc(nd, sizeof(struct spraid_dev_info), GFP_KERNEL);
+ if (!sortdevice)
+ goto free_list;
+
ret = spraid_get_dev_list(hdev, devices);
if (ret)
- goto free_list;
+ goto free_all;
org_devices = hdev->devices;
for (i = 0; i < nd; i++) {
org_flag = org_devices[i].flag;
flag = devices[i].flag;
- dev_log_dbg(hdev->dev, "i: %d, org_flag: 0x%x, flag: 0x%x\n",
- i, org_flag, flag);
+ dev_log_dbg(hdev->dev, "i: %d, org_flag: 0x%x, flag: 0x%x\n", i, org_flag, flag);
if (SPRAID_DEV_INFO_FLAG_VALID(flag)) {
if (!SPRAID_DEV_INFO_FLAG_VALID(org_flag)) {
down_write(&hdev->devices_rwsem);
memcpy(&org_devices[i], &devices[i],
- sizeof(struct spraid_dev_info));
+ sizeof(struct spraid_dev_info));
+ memcpy(&sortdevice[count++], &devices[i],
+ sizeof(struct spraid_dev_info));
up_write(&hdev->devices_rwsem);
- spraid_add_device(hdev, &devices[i]);
} else if (SPRAID_DEV_INFO_FLAG_CHANGE(flag)) {
spraid_rescan_device(hdev, &devices[i]);
}
@@ -2071,6 +2215,16 @@ static void spraid_scan_work(struct work_struct *work)
}
}
}
+
+ dev_info(hdev->dev, "scan work add device count = %d\n", count);
+
+ sort(sortdevice, count, sizeof(sortdevice[0]), luntarget_cmp_func, NULL);
+
+ for (i = 0; i < count; i++)
+ spraid_add_device(hdev, &sortdevice[i]);
+
+free_all:
+ kfree(sortdevice);
free_list:
kfree(devices);
}
@@ -2083,6 +2237,15 @@ static void spraid_timesyn_work(struct work_struct *work)
spraid_configure_timestamp(hdev);
}
+static int spraid_init_ctrl_info(struct spraid_dev *hdev);
+static void spraid_fw_act_work(struct work_struct *work)
+{
+ struct spraid_dev *hdev = container_of(work, struct spraid_dev, fw_act_work);
+
+ if (spraid_init_ctrl_info(hdev))
+ dev_err(hdev->dev, "get ctrl info failed after fw act\n");
+}
+
static void spraid_queue_scan(struct spraid_dev *hdev)
{
queue_work(spraid_wq, &hdev->scan_work);
@@ -2094,6 +2257,9 @@ static void spraid_handle_aen_notice(struct spraid_dev *hdev, u32 result)
case SPRAID_AEN_DEV_CHANGED:
spraid_queue_scan(hdev);
break;
+ case SPRAID_AEN_FW_ACT_START:
+ dev_info(hdev->dev, "fw activation starting\n");
+ break;
case SPRAID_AEN_HOST_PROBING:
break;
default:
@@ -2101,25 +2267,25 @@ static void spraid_handle_aen_notice(struct spraid_dev *hdev, u32 result)
}
}
-static void spraid_handle_aen_vs(struct spraid_dev *hdev, u32 result)
+static void spraid_handle_aen_vs(struct spraid_dev *hdev, u32 result, u32 result1)
{
- switch (result) {
+ switch ((result & 0xff00) >> 8) {
case SPRAID_AEN_TIMESYN:
queue_work(spraid_wq, &hdev->timesyn_work);
break;
+ case SPRAID_AEN_FW_ACT_FINISH:
+ dev_info(hdev->dev, "fw activation finish\n");
+ queue_work(spraid_wq, &hdev->fw_act_work);
+ break;
+ case SPRAID_AEN_EVENT_MIN ... SPRAID_AEN_EVENT_MAX:
+ dev_info(hdev->dev, "rcv card event[%d], param1[0x%x] param2[0x%x]\n",
+ (result & 0xff00) >> 8, result, result1);
+ break;
default:
- dev_warn(hdev->dev, "async event result: %x\n", result);
+ dev_warn(hdev->dev, "async event result: 0x%x\n", result);
}
}
-static void spraid_async_event_work(struct work_struct *work)
-{
- struct spraid_dev *hdev =
- container_of(work, struct spraid_dev, aen_work);
-
- spraid_send_aen(hdev);
-}
-
static int spraid_alloc_resources(struct spraid_dev *hdev)
{
int ret, nqueue;
@@ -2149,10 +2315,16 @@ static int spraid_alloc_resources(struct spraid_dev *hdev)
goto destroy_dma_pools;
}
+ ret = spraid_alloc_admin_cmds(hdev);
+ if (ret)
+ goto free_queues;
+
dev_info(hdev->dev, "[%s] queues num: %d\n", __func__, nqueue);
return 0;
+free_queues:
+ kfree(hdev->queues);
destroy_dma_pools:
spraid_destroy_dma_pools(hdev);
free_ctrl_info:
@@ -2164,50 +2336,18 @@ static int spraid_alloc_resources(struct spraid_dev *hdev)
static void spraid_free_resources(struct spraid_dev *hdev)
{
+ spraid_free_admin_cmds(hdev);
kfree(hdev->queues);
spraid_destroy_dma_pools(hdev);
kfree(hdev->ctrl_info);
ida_free(&spraid_instance_ida, hdev->instance);
}
-static void spraid_setup_passthrough(struct request *req, struct spraid_admin_command *cmd)
-{
- memcpy(cmd, spraid_admin_req(req)->cmd, sizeof(*cmd));
- cmd->common.flags &= ~SPRAID_CMD_FLAG_SGL_ALL;
-}
-
-static inline void spraid_clear_hreq(struct request *req)
-{
- if (!(req->rq_flags & RQF_DONTPREP)) {
- spraid_admin_req(req)->flags = 0;
- req->rq_flags |= RQF_DONTPREP;
- }
-}
-
-static blk_status_t spraid_setup_admin_cmd(struct request *req, struct spraid_admin_command *cmd)
+static void spraid_bsg_unmap_data(struct spraid_dev *hdev, struct bsg_job *job)
{
- spraid_clear_hreq(req);
-
- memset(cmd, 0, sizeof(*cmd));
- switch (req_op(req)) {
- case REQ_OP_DRV_IN:
- case REQ_OP_DRV_OUT:
- spraid_setup_passthrough(req, cmd);
- break;
- default:
- WARN_ON_ONCE(1);
- return BLK_STS_IOERR;
- }
-
- cmd->common.command_id = req->tag;
- return BLK_STS_OK;
-}
-
-static void spraid_unmap_data(struct spraid_dev *hdev, struct request *req)
-{
- struct spraid_iod *iod = blk_mq_rq_to_pdu(req);
- enum dma_data_direction dma_dir = rq_data_dir(req) ?
- DMA_TO_DEVICE : DMA_FROM_DEVICE;
+ struct request *rq = blk_mq_rq_from_pdu(job);
+ struct spraid_iod *iod = job->dd_data;
+ enum dma_data_direction dma_dir = rq_data_dir(rq) ? DMA_TO_DEVICE : DMA_FROM_DEVICE;
if (iod->nsge)
dma_unmap_sg(hdev->dev, iod->sg, iod->nsge, dma_dir);
@@ -2215,36 +2355,36 @@ static void spraid_unmap_data(struct spraid_dev *hdev, struct request *req)
spraid_free_iod_res(hdev, iod);
}
-static blk_status_t spraid_admin_map_data(struct spraid_dev *hdev, struct request *req,
- struct spraid_admin_command *cmd)
+static int spraid_bsg_map_data(struct spraid_dev *hdev, struct bsg_job *job,
+ struct spraid_admin_command *cmd)
{
- struct spraid_iod *iod = blk_mq_rq_to_pdu(req);
- struct request_queue *admin_q = req->q;
- enum dma_data_direction dma_dir = rq_data_dir(req) ?
- DMA_TO_DEVICE : DMA_FROM_DEVICE;
- blk_status_t ret = BLK_STS_IOERR;
- int nr_mapped;
- int res;
+ struct request *rq = blk_mq_rq_from_pdu(job);
+ struct spraid_iod *iod = job->dd_data;
+ enum dma_data_direction dma_dir = rq_data_dir(rq) ? DMA_TO_DEVICE : DMA_FROM_DEVICE;
+ int ret = 0;
+
+ iod->sg = job->request_payload.sg_list;
+ iod->nsge = job->request_payload.sg_cnt;
+ iod->length = job->request_payload.payload_len;
+ iod->use_sgl = false;
+ iod->npages = -1;
+ iod->sg_drv_mgmt = false;
- sg_init_table(iod->sg, blk_rq_nr_phys_segments(req));
- iod->nsge = blk_rq_map_sg(admin_q, req, iod->sg);
if (!iod->nsge)
goto out;
- dev_info(hdev->dev, "nseg: %u, nsge: %u\n",
- blk_rq_nr_phys_segments(req), iod->nsge);
-
- ret = BLK_STS_RESOURCE;
- nr_mapped = dma_map_sg_attrs(hdev->dev, iod->sg, iod->nsge, dma_dir, DMA_ATTR_NO_WARN);
- if (!nr_mapped)
+ ret = dma_map_sg_attrs(hdev->dev, iod->sg, iod->nsge, dma_dir, DMA_ATTR_NO_WARN);
+ if (!ret)
goto out;
- res = spraid_setup_prps(hdev, iod);
- if (res)
+ ret = spraid_setup_prps(hdev, iod);
+ if (ret)
goto unmap;
+
cmd->common.dptr.prp1 = cpu_to_le64(sg_dma_address(iod->sg));
cmd->common.dptr.prp2 = cpu_to_le64(iod->first_dma);
- return BLK_STS_OK;
+
+ return 0;
unmap:
dma_unmap_sg(hdev->dev, iod->sg, iod->nsge, dma_dir);
@@ -2252,137 +2392,29 @@ static blk_status_t spraid_admin_map_data(struct spraid_dev *hdev, struct reques
return ret;
}
-static blk_status_t spraid_init_admin_iod(struct request *rq, struct spraid_dev *hdev)
-{
- struct spraid_iod *iod = blk_mq_rq_to_pdu(rq);
- int nents = blk_rq_nr_phys_segments(rq);
- unsigned int size = blk_rq_payload_bytes(rq);
-
- if (nents > SPRAID_INT_PAGES || size > SPRAID_INT_BYTES(hdev)) {
- iod->sg = mempool_alloc(hdev->iod_mempool, GFP_ATOMIC);
- if (!iod->sg)
- return BLK_STS_RESOURCE;
- } else {
- iod->sg = iod->inline_sg;
- }
-
- iod->nsge = 0;
- iod->use_sgl = false;
- iod->npages = -1;
- iod->length = size;
- iod->sg_drv_mgmt = true;
-
- return BLK_STS_OK;
-}
-
-static blk_status_t spraid_queue_admin_rq(struct blk_mq_hw_ctx *hctx,
- const struct blk_mq_queue_data *bd)
-{
- struct spraid_queue *adminq = hctx->driver_data;
- struct spraid_dev *hdev = adminq->hdev;
- struct request *req = bd->rq;
- struct spraid_iod *iod = blk_mq_rq_to_pdu(req);
- struct spraid_admin_command cmd;
- blk_status_t ret;
-
- ret = spraid_setup_admin_cmd(req, &cmd);
- if (ret)
- goto out;
-
- ret = spraid_init_admin_iod(req, hdev);
- if (ret)
- goto out;
-
- if (blk_rq_nr_phys_segments(req)) {
- ret = spraid_admin_map_data(hdev, req, &cmd);
- if (ret)
- goto cleanup_iod;
- }
-
- blk_mq_start_request(req);
- spraid_submit_cmd(adminq, &cmd);
- return BLK_STS_OK;
-
-cleanup_iod:
- spraid_free_iod_res(hdev, iod);
-out:
- return ret;
-}
-
-static blk_status_t spraid_error_status(struct request *req)
-{
- switch (spraid_admin_req(req)->status & 0x7ff) {
- case SPRAID_SC_SUCCESS:
- return BLK_STS_OK;
- default:
- return BLK_STS_IOERR;
- }
-}
-
-static void spraid_complete_admin_rq(struct request *req)
-{
- struct spraid_iod *iod = blk_mq_rq_to_pdu(req);
- struct spraid_dev *hdev = iod->spraidq->hdev;
-
- if (blk_rq_nr_phys_segments(req))
- spraid_unmap_data(hdev, req);
- blk_mq_end_request(req, spraid_error_status(req));
-}
-
-static int spraid_admin_init_hctx(struct blk_mq_hw_ctx *hctx, void *data, unsigned int hctx_idx)
-{
- struct spraid_dev *hdev = data;
- struct spraid_queue *adminq = &hdev->queues[0];
-
- WARN_ON(hctx_idx != 0);
- WARN_ON(hdev->admin_tagset.tags[0] != hctx->tags);
-
- hctx->driver_data = adminq;
- return 0;
-}
-
-static int spraid_admin_init_request(struct blk_mq_tag_set *set, struct request *req,
- unsigned int hctx_idx, unsigned int numa_node)
-{
- struct spraid_dev *hdev = set->driver_data;
- struct spraid_iod *iod = blk_mq_rq_to_pdu(req);
- struct spraid_queue *adminq = &hdev->queues[0];
-
- WARN_ON(!adminq);
- iod->spraidq = adminq;
- return 0;
-}
-
-static enum blk_eh_timer_return
-spraid_admin_timeout(struct request *req, bool reserved)
-{
- struct spraid_iod *iod = blk_mq_rq_to_pdu(req);
- struct spraid_queue *spraidq = iod->spraidq;
- struct spraid_dev *hdev = spraidq->hdev;
-
- dev_err(hdev->dev, "Admin cid[%d] qid[%d] timeout\n",
- req->tag, spraidq->qid);
-
- if (spraid_poll_cq(spraidq, req->tag)) {
- dev_warn(hdev->dev, "cid[%d] qid[%d] timeout, completion polled\n",
- req->tag, spraidq->qid);
- return BLK_EH_DONE;
- }
-
- spraid_end_admin_request(req, cpu_to_le16(-EINVAL), 0, 0);
- return BLK_EH_DONE;
-}
-
static int spraid_get_ctrl_info(struct spraid_dev *hdev, struct spraid_ctrl_info *ctrl_info)
{
struct spraid_admin_command admin_cmd;
+ u8 *data_ptr = NULL;
+ dma_addr_t data_dma = 0;
+ int ret;
+
+ data_ptr = dma_alloc_coherent(hdev->dev, PAGE_SIZE, &data_dma, GFP_KERNEL);
+ if (!data_ptr)
+ return -ENOMEM;
memset(&admin_cmd, 0, sizeof(admin_cmd));
admin_cmd.get_info.opcode = SPRAID_ADMIN_GET_INFO;
admin_cmd.get_info.type = SPRAID_GET_INFO_CTRL;
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
+
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+ if (!ret)
+ memcpy(ctrl_info, data_ptr, sizeof(struct spraid_ctrl_info));
- return spraid_submit_admin_sync_cmd(hdev->admin_q, &admin_cmd, NULL,
- ctrl_info, sizeof(struct spraid_ctrl_info), 0, 0, 0);
+ dma_free_coherent(hdev->dev, PAGE_SIZE, data_ptr, data_dma);
+
+ return ret;
}
static int spraid_init_ctrl_info(struct spraid_dev *hdev)
@@ -2416,6 +2448,11 @@ static int spraid_init_ctrl_info(struct spraid_dev *hdev)
dev_info(hdev->dev, "[%s]sn = %s\n", __func__, hdev->ctrl_info->sn);
dev_info(hdev->dev, "[%s]fr = %s\n", __func__, hdev->ctrl_info->fr);
+ if (!hdev->ctrl_info->aerl)
+ hdev->ctrl_info->aerl = 1;
+ if (hdev->ctrl_info->aerl > SPRAID_NR_AEN_COMMANDS)
+ hdev->ctrl_info->aerl = SPRAID_NR_AEN_COMMANDS;
+
return 0;
}
@@ -2444,98 +2481,51 @@ static void spraid_free_iod_ext_mem_pool(struct spraid_dev *hdev)
mempool_destroy(hdev->iod_mempool);
}
-static int spraid_submit_user_cmd(struct request_queue *q, struct spraid_admin_command *cmd,
- void __user *ubuffer, unsigned int bufflen, u32 *result,
- unsigned int timeout)
+static int spraid_user_admin_cmd(struct spraid_dev *hdev, struct bsg_job *job)
{
- struct request *req;
- struct bio *bio = NULL;
- int ret;
-
- req = spraid_alloc_admin_request(q, cmd, 0);
- if (IS_ERR(req))
- return PTR_ERR(req);
+ struct spraid_bsg_request *bsg_req = job->request;
+ struct spraid_passthru_common_cmd *cmd = &(bsg_req->admcmd);
+ struct spraid_admin_command admin_cmd;
+ u32 timeout = msecs_to_jiffies(cmd->timeout_ms);
+ u32 result[2] = {0};
+ int status;
- req->timeout = timeout ? timeout : ADMIN_TIMEOUT;
- spraid_admin_req(req)->flags |= SPRAID_REQ_USERCMD;
+ if (hdev->state >= SPRAID_RESETTING) {
+ dev_err(hdev->dev, "[%s] err, host state:[%d] is not right\n",
+ __func__, hdev->state);
+ return -EBUSY;
+ }
- if (ubuffer && bufflen) {
- ret = blk_rq_map_user(q, req, NULL, ubuffer, bufflen, GFP_KERNEL);
- if (ret)
- goto out;
- bio = req->bio;
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.common.opcode = cmd->opcode;
+ admin_cmd.common.flags = cmd->flags;
+ admin_cmd.common.hdid = cpu_to_le32(cmd->nsid);
+ admin_cmd.common.cdw2[0] = cpu_to_le32(cmd->cdw2);
+ admin_cmd.common.cdw2[1] = cpu_to_le32(cmd->cdw3);
+ admin_cmd.common.cdw10 = cpu_to_le32(cmd->cdw10);
+ admin_cmd.common.cdw11 = cpu_to_le32(cmd->cdw11);
+ admin_cmd.common.cdw12 = cpu_to_le32(cmd->cdw12);
+ admin_cmd.common.cdw13 = cpu_to_le32(cmd->cdw13);
+ admin_cmd.common.cdw14 = cpu_to_le32(cmd->cdw14);
+ admin_cmd.common.cdw15 = cpu_to_le32(cmd->cdw15);
+
+ status = spraid_bsg_map_data(hdev, job, &admin_cmd);
+ if (status) {
+ dev_err(hdev->dev, "[%s] err, map data failed\n", __func__);
+ return status;
}
- blk_execute_rq(req->q, NULL, req, 0);
- if (spraid_admin_req(req)->flags & SPRAID_REQ_CANCELLED)
- ret = -EINTR;
- else
- ret = spraid_admin_req(req)->status;
- if (result) {
- result[0] = spraid_admin_req(req)->result0;
- result[1] = spraid_admin_req(req)->result1;
- }
- if (bio)
- blk_rq_unmap_user(bio);
-out:
- blk_mq_free_request(req);
- return ret;
-}
-static int spraid_user_admin_cmd(struct spraid_dev *hdev,
- struct spraid_passthru_common_cmd __user *ucmd)
-{
- struct spraid_passthru_common_cmd cmd;
- struct spraid_admin_command admin_cmd;
- u32 timeout = 0;
- int status;
-
- if (!capable(CAP_SYS_ADMIN)) {
- dev_err(hdev->dev, "Current user hasn't administrator right, reject service\n");
- return -EACCES;
- }
-
- if (copy_from_user(&cmd, ucmd, sizeof(cmd))) {
- dev_err(hdev->dev, "Copy command from user space to kernel space failed\n");
- return -EFAULT;
- }
-
- if (cmd.flags) {
- dev_err(hdev->dev, "Invalid flags in user command\n");
- return -EINVAL;
+ status = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, &result[0], &result[1], timeout);
+ if (status >= 0) {
+ job->reply_len = sizeof(result);
+ memcpy(job->reply, result, sizeof(result));
}
- dev_info(hdev->dev, "user_admin_cmd opcode: 0x%x, subopcode: 0x%x\n",
- cmd.opcode, cmd.cdw2 & 0x7ff);
+ if (status)
+ dev_info(hdev->dev, "[%s] opcode[0x%x] subopcode[0x%x], status[0x%x] result0[0x%x] result1[0x%x]\n",
+ __func__, cmd->opcode, cmd->info_0.subopcode, status, result[0], result[1]);
- memset(&admin_cmd, 0, sizeof(admin_cmd));
- admin_cmd.common.opcode = cmd.opcode;
- admin_cmd.common.flags = cmd.flags;
- admin_cmd.common.hdid = cpu_to_le32(cmd.nsid);
- admin_cmd.common.cdw2[0] = cpu_to_le32(cmd.cdw2);
- admin_cmd.common.cdw2[1] = cpu_to_le32(cmd.cdw3);
- admin_cmd.common.cdw10 = cpu_to_le32(cmd.cdw10);
- admin_cmd.common.cdw11 = cpu_to_le32(cmd.cdw11);
- admin_cmd.common.cdw12 = cpu_to_le32(cmd.cdw12);
- admin_cmd.common.cdw13 = cpu_to_le32(cmd.cdw13);
- admin_cmd.common.cdw14 = cpu_to_le32(cmd.cdw14);
- admin_cmd.common.cdw15 = cpu_to_le32(cmd.cdw15);
-
- if (cmd.timeout_ms)
- timeout = msecs_to_jiffies(cmd.timeout_ms);
-
- status = spraid_submit_user_cmd(hdev->admin_q, &admin_cmd,
- (void __user *)(uintptr_t)cmd.addr, cmd.info_1.data_len,
- &cmd.result0, timeout);
-
- dev_info(hdev->dev, "user_admin_cmd status: 0x%x, result0: 0x%x, result1: 0x%x\n",
- status, cmd.result0, cmd.result1);
-
- if (status >= 0) {
- if (put_user(cmd.result0, &ucmd->result0))
- return -EFAULT;
- if (put_user(cmd.result1, &ucmd->result1))
- return -EFAULT;
- }
+ spraid_bsg_unmap_data(hdev, job);
return status;
}
@@ -2548,8 +2538,8 @@ static int spraid_alloc_ioq_ptcmds(struct spraid_dev *hdev)
INIT_LIST_HEAD(&hdev->ioq_pt_list);
spin_lock_init(&hdev->ioq_pt_lock);
- hdev->ioq_ptcmds = kcalloc_node(ptnum, sizeof(struct spraid_ioq_ptcmd),
- GFP_KERNEL, hdev->numa_node);
+ hdev->ioq_ptcmds = kcalloc_node(ptnum, sizeof(struct spraid_cmd),
+ GFP_KERNEL, hdev->numa_node);
if (!hdev->ioq_ptcmds) {
dev_err(hdev->dev, "Alloc ioq_ptcmds failed\n");
@@ -2567,55 +2557,35 @@ static int spraid_alloc_ioq_ptcmds(struct spraid_dev *hdev)
return 0;
}
-static struct spraid_ioq_ptcmd *spraid_get_ioq_ptcmd(struct spraid_dev *hdev)
-{
- struct spraid_ioq_ptcmd *cmd = NULL;
- unsigned long flags;
-
- spin_lock_irqsave(&hdev->ioq_pt_lock, flags);
- if (list_empty(&hdev->ioq_pt_list)) {
- spin_unlock_irqrestore(&hdev->ioq_pt_lock, flags);
- dev_err(hdev->dev, "err, ioq ptcmd list empty\n");
- return NULL;
- }
- cmd = list_entry((&hdev->ioq_pt_list)->next, struct spraid_ioq_ptcmd, list);
- list_del_init(&cmd->list);
- spin_unlock_irqrestore(&hdev->ioq_pt_lock, flags);
-
- WRITE_ONCE(cmd->state, SPRAID_CMD_IDLE);
-
- return cmd;
-}
-
-static void spraid_put_ioq_ptcmd(struct spraid_dev *hdev, struct spraid_ioq_ptcmd *cmd)
+static void spraid_free_ioq_ptcmds(struct spraid_dev *hdev)
{
- unsigned long flags;
+ kfree(hdev->ioq_ptcmds);
+ hdev->ioq_ptcmds = NULL;
- spin_lock_irqsave(&hdev->ioq_pt_lock, flags);
- list_add(&cmd->list, (&hdev->ioq_pt_list)->next);
- spin_unlock_irqrestore(&hdev->ioq_pt_lock, flags);
+ INIT_LIST_HEAD(&hdev->ioq_pt_list);
}
static int spraid_submit_ioq_sync_cmd(struct spraid_dev *hdev, struct spraid_ioq_command *cmd,
- u32 *result, void **sense, u32 timeout)
+ u32 *result, u32 *reslen, u32 timeout)
{
- struct spraid_queue *ioq;
int ret;
dma_addr_t sense_dma;
- struct spraid_ioq_ptcmd *pt_cmd = spraid_get_ioq_ptcmd(hdev);
-
- *sense = NULL;
+ struct spraid_queue *ioq;
+ void *sense_addr = NULL;
+ struct spraid_cmd *pt_cmd = spraid_get_cmd(hdev, SPRAID_CMD_IOPT);
- if (!pt_cmd)
+ if (!pt_cmd) {
+ dev_err(hdev->dev, "err, get ioq cmd failed\n");
return -EFAULT;
+ }
- dev_info(hdev->dev, "[%s] ptcmd, cid[%d], qid[%d]\n", __func__, pt_cmd->cid, pt_cmd->qid);
+ timeout = timeout ? timeout : ADMIN_TIMEOUT;
init_completion(&pt_cmd->cmd_done);
ioq = &hdev->queues[pt_cmd->qid];
ret = pt_cmd->cid * SCSI_SENSE_BUFFERSIZE;
- pt_cmd->priv = ioq->sense + ret;
+ sense_addr = ioq->sense + ret;
sense_dma = ioq->sense_dma_addr + ret;
cmd->common.sense_addr = cpu_to_le64(sense_dma);
@@ -2625,260 +2595,87 @@ static int spraid_submit_ioq_sync_cmd(struct spraid_dev *hdev, struct spraid_ioq
spraid_submit_cmd(ioq, cmd);
if (!wait_for_completion_timeout(&pt_cmd->cmd_done, timeout)) {
- dev_err(hdev->dev, "[%s] cid[%d], qid[%d] timeout\n",
- __func__, pt_cmd->cid, pt_cmd->qid);
+ dev_err(hdev->dev, "[%s] cid[%d] qid[%d] timeout, opcode[0x%x] subopcode[0x%x]\n",
+ __func__, pt_cmd->cid, pt_cmd->qid, cmd->common.opcode,
+ (le32_to_cpu(cmd->common.cdw3[0]) & 0xffff));
WRITE_ONCE(pt_cmd->state, SPRAID_CMD_TIMEOUT);
- return -EINVAL;
+ spraid_put_cmd(hdev, pt_cmd, SPRAID_CMD_IOPT);
+ return -ETIME;
}
- if (result) {
- result[0] = pt_cmd->result0;
- result[1] = pt_cmd->result1;
+ if (result && reslen) {
+ if ((pt_cmd->status & 0x17f) == 0x101) {
+ memcpy(result, sense_addr, SCSI_SENSE_BUFFERSIZE);
+ *reslen = SCSI_SENSE_BUFFERSIZE;
+ }
}
- if ((pt_cmd->status & 0x17f) == 0x101)
- *sense = pt_cmd->priv;
+ spraid_put_cmd(hdev, pt_cmd, SPRAID_CMD_IOPT);
return pt_cmd->status;
}
-static int spraid_user_ioq_cmd(struct spraid_dev *hdev,
- struct spraid_ioq_passthru_cmd __user *ucmd)
+static int spraid_user_ioq_cmd(struct spraid_dev *hdev, struct bsg_job *job)
{
- struct spraid_ioq_passthru_cmd cmd;
+ struct spraid_bsg_request *bsg_req = (struct spraid_bsg_request *)(job->request);
+ struct spraid_ioq_passthru_cmd *cmd = &(bsg_req->ioqcmd);
struct spraid_ioq_command ioq_cmd;
- u32 timeout = 0;
int status = 0;
- u8 *data_ptr = NULL;
- dma_addr_t data_dma;
- enum dma_data_direction dma_dir = DMA_NONE;
- void *sense = NULL;
-
- if (!capable(CAP_SYS_ADMIN)) {
- dev_err(hdev->dev, "Current user hasn't administrator right, reject service\n");
- return -EACCES;
- }
+ u32 timeout = msecs_to_jiffies(cmd->timeout_ms);
- if (copy_from_user(&cmd, ucmd, sizeof(cmd))) {
- dev_err(hdev->dev, "Copy command from user space to kernel space failed\n");
- return -EFAULT;
- }
-
- if (cmd.data_len > PAGE_SIZE) {
+ if (cmd->data_len > PAGE_SIZE) {
dev_err(hdev->dev, "[%s] data len bigger than 4k\n", __func__);
return -EFAULT;
}
- dev_info(hdev->dev, "[%s] opcode: 0x%x, subopcode: 0x%x, datalen: %d\n",
- __func__, cmd.opcode, cmd.info_1.subopcode, cmd.data_len);
-
- if (cmd.addr && cmd.data_len) {
- data_ptr = dma_alloc_coherent(hdev->dev, PAGE_SIZE, &data_dma, GFP_KERNEL);
- if (!data_ptr)
- return -ENOMEM;
-
- dma_dir = (cmd.opcode & 1) ? DMA_TO_DEVICE : DMA_FROM_DEVICE;
+ if (hdev->state != SPRAID_LIVE) {
+ dev_err(hdev->dev, "[%s] err, host state:[%d] is not live\n",
+ __func__, hdev->state);
+ return -EBUSY;
}
- if (dma_dir == DMA_TO_DEVICE) {
- if (copy_from_user(data_ptr, (void __user *)(uintptr_t)cmd.addr, cmd.data_len)) {
- dev_err(hdev->dev, "[%s] copy user data failed\n", __func__);
- status = -EFAULT;
- goto free_dma_mem;
- }
- }
+ dev_info(hdev->dev, "[%s] opcode[0x%x] subopcode[0x%x] init, datalen[%d]\n",
+ __func__, cmd->opcode, cmd->info_1.subopcode, cmd->data_len);
memset(&ioq_cmd, 0, sizeof(ioq_cmd));
- ioq_cmd.common.opcode = cmd.opcode;
- ioq_cmd.common.flags = cmd.flags;
- ioq_cmd.common.hdid = cpu_to_le32(cmd.nsid);
- ioq_cmd.common.sense_len = cpu_to_le16(cmd.info_0.res_sense_len);
- ioq_cmd.common.cdb_len = cmd.info_0.cdb_len;
- ioq_cmd.common.rsvd2 = cmd.info_0.rsvd0;
- ioq_cmd.common.cdw3[0] = cpu_to_le32(cmd.cdw3);
- ioq_cmd.common.cdw3[1] = cpu_to_le32(cmd.cdw4);
- ioq_cmd.common.cdw3[2] = cpu_to_le32(cmd.cdw5);
- ioq_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
-
- ioq_cmd.common.cdw10[0] = cpu_to_le32(cmd.cdw10);
- ioq_cmd.common.cdw10[1] = cpu_to_le32(cmd.cdw11);
- ioq_cmd.common.cdw10[2] = cpu_to_le32(cmd.cdw12);
- ioq_cmd.common.cdw10[3] = cpu_to_le32(cmd.cdw13);
- ioq_cmd.common.cdw10[4] = cpu_to_le32(cmd.cdw14);
- ioq_cmd.common.cdw10[5] = cpu_to_le32(cmd.data_len);
-
- memcpy(ioq_cmd.common.cdb, &cmd.cdw16, cmd.info_0.cdb_len);
-
- ioq_cmd.common.cdw26[0] = cpu_to_le32(cmd.cdw26[0]);
- ioq_cmd.common.cdw26[1] = cpu_to_le32(cmd.cdw26[1]);
- ioq_cmd.common.cdw26[2] = cpu_to_le32(cmd.cdw26[2]);
- ioq_cmd.common.cdw26[3] = cpu_to_le32(cmd.cdw26[3]);
-
- if (cmd.timeout_ms)
- timeout = msecs_to_jiffies(cmd.timeout_ms);
- timeout = timeout ? timeout : ADMIN_TIMEOUT;
-
- status = spraid_submit_ioq_sync_cmd(hdev, &ioq_cmd, &cmd.result0, &sense, timeout);
-
- if (status >= 0) {
- if (put_user(cmd.result0, &ucmd->result0)) {
- status = -EFAULT;
- goto free_dma_mem;
- }
- if (put_user(cmd.result1, &ucmd->result1)) {
- status = -EFAULT;
- goto free_dma_mem;
- }
- if (dma_dir == DMA_FROM_DEVICE &&
- copy_to_user((void __user *)(uintptr_t)cmd.addr, data_ptr, cmd.data_len)) {
- status = -EFAULT;
- goto free_dma_mem;
- }
- }
-
- if (sense) {
- if (copy_to_user((void *__user *)(uintptr_t)cmd.sense_addr,
- sense, cmd.info_0.res_sense_len)) {
- status = -EFAULT;
- goto free_dma_mem;
- }
- }
-
-free_dma_mem:
- if (data_ptr)
- dma_free_coherent(hdev->dev, PAGE_SIZE, data_ptr, data_dma);
-
- return status;
-
-}
-
-static int spraid_reset_work_sync(struct spraid_dev *hdev);
-
-static int spraid_user_reset_cmd(struct spraid_dev *hdev)
-{
- int ret;
-
- dev_info(hdev->dev, "[%s] start user reset cmd\n", __func__);
- ret = spraid_reset_work_sync(hdev);
- dev_info(hdev->dev, "[%s] stop user reset cmd[%d]\n", __func__, ret);
-
- return ret;
-}
-
-static int hdev_open(struct inode *inode, struct file *file)
-{
- struct spraid_dev *hdev =
- container_of(inode->i_cdev, struct spraid_dev, cdev);
- file->private_data = hdev;
- return 0;
-}
-
-static long hdev_ioctl(struct file *file, u32 cmd, unsigned long arg)
-{
- struct spraid_dev *hdev = file->private_data;
- void __user *argp = (void __user *)arg;
-
- switch (cmd) {
- case SPRAID_IOCTL_ADMIN_CMD:
- return spraid_user_admin_cmd(hdev, argp);
- case SPRAID_IOCTL_IOQ_CMD:
- return spraid_user_ioq_cmd(hdev, argp);
- case SPRAID_IOCTL_RESET_CMD:
- return spraid_user_reset_cmd(hdev);
- default:
- return -ENOTTY;
- }
-}
-
-static const struct file_operations spraid_dev_fops = {
- .owner = THIS_MODULE,
- .open = hdev_open,
- .unlocked_ioctl = hdev_ioctl,
- .compat_ioctl = hdev_ioctl,
-};
-
-static int spraid_create_cdev(struct spraid_dev *hdev)
-{
- int ret;
-
- device_initialize(&hdev->ctrl_device);
- hdev->ctrl_device.devt = MKDEV(MAJOR(spraid_chr_devt), hdev->instance);
- hdev->ctrl_device.class = spraid_class;
- hdev->ctrl_device.parent = hdev->dev;
- dev_set_drvdata(&hdev->ctrl_device, hdev);
- ret = dev_set_name(&hdev->ctrl_device, "spraid%d", hdev->instance);
- if (ret)
- return ret;
- cdev_init(&hdev->cdev, &spraid_dev_fops);
- hdev->cdev.owner = THIS_MODULE;
- ret = cdev_device_add(&hdev->cdev, &hdev->ctrl_device);
- if (ret) {
- dev_err(hdev->dev, "Add cdev failed, ret: %d", ret);
- put_device(&hdev->ctrl_device);
- kfree_const(hdev->ctrl_device.kobj.name);
- return ret;
- }
-
- return 0;
-}
-
-static inline void spraid_remove_cdev(struct spraid_dev *hdev)
-{
- cdev_device_del(&hdev->cdev, &hdev->ctrl_device);
-}
-
-static const struct blk_mq_ops spraid_admin_mq_ops = {
- .queue_rq = spraid_queue_admin_rq,
- .complete = spraid_complete_admin_rq,
- .init_hctx = spraid_admin_init_hctx,
- .init_request = spraid_admin_init_request,
- .timeout = spraid_admin_timeout,
-};
-
-static void spraid_remove_admin_tagset(struct spraid_dev *hdev)
-{
- if (hdev->admin_q && !blk_queue_dying(hdev->admin_q)) {
- blk_mq_unquiesce_queue(hdev->admin_q);
- blk_cleanup_queue(hdev->admin_q);
- blk_mq_free_tag_set(&hdev->admin_tagset);
+ ioq_cmd.common.opcode = cmd->opcode;
+ ioq_cmd.common.flags = cmd->flags;
+ ioq_cmd.common.hdid = cpu_to_le32(cmd->nsid);
+ ioq_cmd.common.sense_len = cpu_to_le16(cmd->info_0.res_sense_len);
+ ioq_cmd.common.cdb_len = cmd->info_0.cdb_len;
+ ioq_cmd.common.rsvd2 = cmd->info_0.rsvd0;
+ ioq_cmd.common.cdw3[0] = cpu_to_le32(cmd->cdw3);
+ ioq_cmd.common.cdw3[1] = cpu_to_le32(cmd->cdw4);
+ ioq_cmd.common.cdw3[2] = cpu_to_le32(cmd->cdw5);
+
+ ioq_cmd.common.cdw10[0] = cpu_to_le32(cmd->cdw10);
+ ioq_cmd.common.cdw10[1] = cpu_to_le32(cmd->cdw11);
+ ioq_cmd.common.cdw10[2] = cpu_to_le32(cmd->cdw12);
+ ioq_cmd.common.cdw10[3] = cpu_to_le32(cmd->cdw13);
+ ioq_cmd.common.cdw10[4] = cpu_to_le32(cmd->cdw14);
+ ioq_cmd.common.cdw10[5] = cpu_to_le32(cmd->data_len);
+
+ memcpy(ioq_cmd.common.cdb, &cmd->cdw16, cmd->info_0.cdb_len);
+
+ ioq_cmd.common.cdw26[0] = cpu_to_le32(cmd->cdw26[0]);
+ ioq_cmd.common.cdw26[1] = cpu_to_le32(cmd->cdw26[1]);
+ ioq_cmd.common.cdw26[2] = cpu_to_le32(cmd->cdw26[2]);
+ ioq_cmd.common.cdw26[3] = cpu_to_le32(cmd->cdw26[3]);
+
+ status = spraid_bsg_map_data(hdev, job, (struct spraid_admin_command *)&ioq_cmd);
+ if (status) {
+ dev_err(hdev->dev, "[%s] err, map data failed\n", __func__);
+ return status;
}
-}
-static int spraid_alloc_admin_tags(struct spraid_dev *hdev)
-{
- if (!hdev->admin_q) {
- hdev->admin_tagset.ops = &spraid_admin_mq_ops;
- hdev->admin_tagset.nr_hw_queues = 1;
+ status = spraid_submit_ioq_sync_cmd(hdev, &ioq_cmd, job->reply, &job->reply_len, timeout);
- hdev->admin_tagset.queue_depth = SPRAID_AQ_MQ_TAG_DEPTH;
- hdev->admin_tagset.timeout = ADMIN_TIMEOUT;
- hdev->admin_tagset.numa_node = hdev->numa_node;
- hdev->admin_tagset.cmd_size =
- spraid_cmd_size(hdev, true, false);
- hdev->admin_tagset.flags = BLK_MQ_F_NO_SCHED;
- hdev->admin_tagset.driver_data = hdev;
+ dev_info(hdev->dev, "[%s] opcode[0x%x] subopcode[0x%x], status[0x%x], reply_len[%d]\n",
+ __func__, cmd->opcode, cmd->info_1.subopcode, status, job->reply_len);
- if (blk_mq_alloc_tag_set(&hdev->admin_tagset)) {
- dev_err(hdev->dev, "Allocate admin tagset failed\n");
- return -ENOMEM;
- }
+ spraid_bsg_unmap_data(hdev, job);
- hdev->admin_q = blk_mq_init_queue(&hdev->admin_tagset);
- if (IS_ERR(hdev->admin_q)) {
- dev_err(hdev->dev, "Initialize admin request queue failed\n");
- blk_mq_free_tag_set(&hdev->admin_tagset);
- return -ENOMEM;
- }
- if (!blk_get_queue(hdev->admin_q)) {
- dev_err(hdev->dev, "Get admin request queue failed\n");
- spraid_remove_admin_tagset(hdev);
- hdev->admin_q = NULL;
- return -ENODEV;
- }
- } else {
- blk_mq_unquiesce_queue(hdev->admin_q);
- }
- return 0;
+ return status;
}
static bool spraid_check_scmd_completed(struct scsi_cmnd *scmd)
@@ -2891,7 +2688,7 @@ static bool spraid_check_scmd_completed(struct scsi_cmnd *scmd)
spraid_get_tag_from_scmd(scmd, &hwq, &cid);
spraidq = &hdev->queues[hwq];
if (READ_ONCE(iod->state) == SPRAID_CMD_COMPLETE || spraid_poll_cq(spraidq, cid)) {
- dev_warn(hdev->dev, "cid[%d], qid[%d] has been completed\n",
+ dev_warn(hdev->dev, "cid[%d] qid[%d] has been completed\n",
cid, spraidq->qid);
return true;
}
@@ -2927,8 +2724,7 @@ static int spraid_send_abort_cmd(struct spraid_dev *hdev, u32 hdid, u16 qid, u16
admin_cmd.abort.sqid = cpu_to_le16(qid);
admin_cmd.abort.cid = cpu_to_le16(cid);
- return spraid_submit_admin_sync_cmd(hdev->admin_q, &admin_cmd, NULL,
- NULL, 0, 0, 0, 0);
+ return spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
}
/* send reset command by admin quueue temporary */
@@ -2941,8 +2737,7 @@ static int spraid_send_reset_cmd(struct spraid_dev *hdev, int type, u32 hdid)
admin_cmd.reset.hdid = cpu_to_le32(hdid);
admin_cmd.reset.type = type;
- return spraid_submit_admin_sync_cmd(hdev->admin_q, &admin_cmd, NULL,
- NULL, 0, 0, 0, 0);
+ return spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
}
static bool spraid_change_host_state(struct spraid_dev *hdev, enum spraid_state newstate)
@@ -3022,7 +2817,7 @@ static void spraid_back_fault_cqe(struct spraid_queue *ioq, struct spraid_comple
scsi_dma_unmap(scmd);
spraid_free_iod_res(hdev, iod);
scmd->scsi_done(scmd);
- dev_warn(hdev->dev, "Back fault CQE, cid[%d], qid[%d]\n",
+ dev_warn(hdev->dev, "Back fault CQE, cid[%d] qid[%d]\n",
cqe->cmd_id, ioq->qid);
}
@@ -3032,6 +2827,8 @@ static void spraid_back_all_io(struct spraid_dev *hdev)
struct spraid_queue *ioq;
struct spraid_completion cqe = { 0 };
+ scsi_block_requests(hdev->shost);
+
for (i = 1; i <= hdev->shost->nr_hw_queues; i++) {
ioq = &hdev->queues[i];
for (j = 0; j < hdev->shost->can_queue; j++) {
@@ -3039,6 +2836,8 @@ static void spraid_back_all_io(struct spraid_dev *hdev)
spraid_back_fault_cqe(ioq, &cqe);
}
}
+
+ scsi_unblock_requests(hdev->shost);
}
static void spraid_dev_disable(struct spraid_dev *hdev, bool shutdown)
@@ -3106,17 +2905,13 @@ static void spraid_reset_work(struct work_struct *work)
if (ret)
goto pci_disable;
- ret = spraid_alloc_admin_tags(hdev);
- if (ret)
- goto pci_disable;
-
ret = spraid_setup_io_queues(hdev);
if (ret || hdev->online_queues <= hdev->shost->nr_hw_queues)
goto pci_disable;
spraid_change_host_state(hdev, SPRAID_LIVE);
- spraid_send_aen(hdev);
+ spraid_send_all_aen(hdev);
return;
@@ -3174,8 +2969,8 @@ static int spraid_abort_handler(struct scsi_cmnd *scmd)
scsi_print_command(scmd);
- if (!spraid_wait_abnl_cmd_done(iod) || spraid_check_scmd_completed(scmd) ||
- hdev->state != SPRAID_LIVE)
+ if (hdev->state != SPRAID_LIVE || !spraid_wait_abnl_cmd_done(iod) ||
+ spraid_check_scmd_completed(scmd))
return SUCCESS;
hostdata = scmd->device->hostdata;
@@ -3183,7 +2978,7 @@ static int spraid_abort_handler(struct scsi_cmnd *scmd)
dev_warn(hdev->dev, "cid[%d] qid[%d] timeout, aborting\n", cid, hwq);
ret = spraid_send_abort_cmd(hdev, hostdata->hdid, hwq, cid);
- if (ret != ADMIN_ERR_TIMEOUT) {
+ if (ret != -ETIME) {
ret = spraid_wait_abnl_cmd_done(iod);
if (ret) {
dev_warn(hdev->dev, "cid[%d] qid[%d] abort failed, not found\n", cid, hwq);
@@ -3206,8 +3001,8 @@ static int spraid_tgt_reset_handler(struct scsi_cmnd *scmd)
scsi_print_command(scmd);
- if (!spraid_wait_abnl_cmd_done(iod) || spraid_check_scmd_completed(scmd) ||
- hdev->state != SPRAID_LIVE)
+ if (hdev->state != SPRAID_LIVE || !spraid_wait_abnl_cmd_done(iod) ||
+ spraid_check_scmd_completed(scmd))
return SUCCESS;
hostdata = scmd->device->hostdata;
@@ -3241,8 +3036,8 @@ static int spraid_bus_reset_handler(struct scsi_cmnd *scmd)
scsi_print_command(scmd);
- if (!spraid_wait_abnl_cmd_done(iod) || spraid_check_scmd_completed(scmd) ||
- hdev->state != SPRAID_LIVE)
+ if (hdev->state != SPRAID_LIVE || !spraid_wait_abnl_cmd_done(iod) ||
+ spraid_check_scmd_completed(scmd))
return SUCCESS;
hostdata = scmd->device->hostdata;
@@ -3272,7 +3067,7 @@ static int spraid_shost_reset_handler(struct scsi_cmnd *scmd)
struct spraid_dev *hdev = shost_priv(scmd->device->host);
scsi_print_command(scmd);
- if (spraid_check_scmd_completed(scmd) || hdev->state != SPRAID_LIVE)
+ if (hdev->state != SPRAID_LIVE || spraid_check_scmd_completed(scmd))
return SUCCESS;
spraid_get_tag_from_scmd(scmd, &hwq, &cid);
@@ -3288,6 +3083,62 @@ static int spraid_shost_reset_handler(struct scsi_cmnd *scmd)
return SUCCESS;
}
+static pci_ers_result_t spraid_pci_error_detected(struct pci_dev *pdev,
+ pci_channel_state_t state)
+{
+ struct spraid_dev *hdev = pci_get_drvdata(pdev);
+
+ dev_info(hdev->dev, "enter pci error detect, state:%d\n", state);
+
+ switch (state) {
+ case pci_channel_io_normal:
+ dev_warn(hdev->dev, "channel is normal, do nothing\n");
+
+ return PCI_ERS_RESULT_CAN_RECOVER;
+ case pci_channel_io_frozen:
+ dev_warn(hdev->dev, "channel io frozen, need reset controller\n");
+
+ scsi_block_requests(hdev->shost);
+
+ spraid_change_host_state(hdev, SPRAID_RESETTING);
+
+ return PCI_ERS_RESULT_NEED_RESET;
+ case pci_channel_io_perm_failure:
+ dev_warn(hdev->dev, "channel io failure, request disconnect\n");
+
+ return PCI_ERS_RESULT_DISCONNECT;
+ }
+
+ return PCI_ERS_RESULT_NEED_RESET;
+}
+
+static pci_ers_result_t spraid_pci_slot_reset(struct pci_dev *pdev)
+{
+ struct spraid_dev *hdev = pci_get_drvdata(pdev);
+
+ dev_info(hdev->dev, "restart after slot reset\n");
+
+ pci_restore_state(pdev);
+
+ if (!queue_work(spraid_wq, &hdev->reset_work)) {
+ dev_err(hdev->dev, "[%s] err, the device is resetting state\n", __func__);
+ return PCI_ERS_RESULT_NONE;
+ }
+
+ flush_work(&hdev->reset_work);
+
+ scsi_unblock_requests(hdev->shost);
+
+ return PCI_ERS_RESULT_RECOVERED;
+}
+
+static void spraid_reset_done(struct pci_dev *pdev)
+{
+ struct spraid_dev *hdev = pci_get_drvdata(pdev);
+
+ dev_info(hdev->dev, "enter spraid reset done\n");
+}
+
static ssize_t csts_pp_show(struct device *cdev, struct device_attribute *attr, char *buf)
{
struct Scsi_Host *shost = class_to_shost(cdev);
@@ -3347,7 +3198,7 @@ static ssize_t fw_version_show(struct device *cdev, struct device_attribute *att
struct Scsi_Host *shost = class_to_shost(cdev);
struct spraid_dev *hdev = shost_priv(shost);
- return snprintf(buf, sizeof(hdev->ctrl_info->fr), "%s\n", hdev->ctrl_info->fr);
+ return snprintf(buf, PAGE_SIZE, "%s\n", hdev->ctrl_info->fr);
}
static DEVICE_ATTR_RO(csts_pp);
@@ -3365,6 +3216,185 @@ static struct device_attribute *spraid_host_attrs[] = {
NULL,
};
+static int spraid_get_vd_info(struct spraid_dev *hdev, struct spraid_vd_info *vd_info, u16 vid)
+{
+ struct spraid_admin_command admin_cmd;
+ u8 *data_ptr = NULL;
+ dma_addr_t data_dma = 0;
+ int ret;
+
+ data_ptr = dma_alloc_coherent(hdev->dev, PAGE_SIZE, &data_dma, GFP_KERNEL);
+ if (!data_ptr)
+ return -ENOMEM;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.usr_cmd.opcode = USR_CMD_READ;
+ admin_cmd.usr_cmd.info_0.subopcode = cpu_to_le16(USR_CMD_VDINFO);
+ admin_cmd.usr_cmd.info_1.data_len = cpu_to_le16(USR_CMD_RDLEN);
+ admin_cmd.usr_cmd.info_1.param_len = cpu_to_le16(VDINFO_PARAM_LEN);
+ admin_cmd.usr_cmd.cdw10 = cpu_to_le32(vid);
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
+
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+ if (!ret)
+ memcpy(vd_info, data_ptr, sizeof(struct spraid_vd_info));
+
+ dma_free_coherent(hdev->dev, PAGE_SIZE, data_ptr, data_dma);
+
+ return ret;
+}
+
+static int spraid_get_bgtask(struct spraid_dev *hdev, struct spraid_bgtask *bgtask)
+{
+ struct spraid_admin_command admin_cmd;
+ u8 *data_ptr = NULL;
+ dma_addr_t data_dma = 0;
+ int ret;
+
+ data_ptr = dma_alloc_coherent(hdev->dev, PAGE_SIZE, &data_dma, GFP_KERNEL);
+ if (!data_ptr)
+ return -ENOMEM;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.usr_cmd.opcode = USR_CMD_READ;
+ admin_cmd.usr_cmd.info_0.subopcode = cpu_to_le16(USR_CMD_BGTASK);
+ admin_cmd.usr_cmd.info_1.data_len = cpu_to_le16(USR_CMD_RDLEN);
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
+
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+ if (!ret)
+ memcpy(bgtask, data_ptr, sizeof(struct spraid_bgtask));
+
+ dma_free_coherent(hdev->dev, PAGE_SIZE, data_ptr, data_dma);
+
+ return ret;
+}
+
+static ssize_t raid_level_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ struct scsi_device *sdev;
+ struct spraid_dev *hdev;
+ struct spraid_vd_info *vd_info;
+ struct spraid_sdev_hostdata *hostdata;
+ int ret;
+
+ sdev = to_scsi_device(dev);
+ hdev = shost_priv(sdev->host);
+ hostdata = sdev->hostdata;
+
+ vd_info = kmalloc(sizeof(*vd_info), GFP_KERNEL);
+ if (!vd_info || !SPRAID_DEV_INFO_ATTR_VD(hostdata->attr))
+ return snprintf(buf, PAGE_SIZE, "NA\n");
+
+ ret = spraid_get_vd_info(hdev, vd_info, sdev->id);
+ if (ret)
+ vd_info->rg_level = ARRAY_SIZE(raid_levels) - 1;
+
+ ret = (vd_info->rg_level < ARRAY_SIZE(raid_levels)) ?
+ vd_info->rg_level : (ARRAY_SIZE(raid_levels) - 1);
+
+ kfree(vd_info);
+
+ return snprintf(buf, PAGE_SIZE, "RAID-%s\n", raid_levels[ret]);
+}
+
+static ssize_t raid_state_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ struct scsi_device *sdev;
+ struct spraid_dev *hdev;
+ struct spraid_vd_info *vd_info;
+ struct spraid_sdev_hostdata *hostdata;
+ int ret;
+
+ sdev = to_scsi_device(dev);
+ hdev = shost_priv(sdev->host);
+ hostdata = sdev->hostdata;
+
+ vd_info = kmalloc(sizeof(*vd_info), GFP_KERNEL);
+ if (!vd_info || !SPRAID_DEV_INFO_ATTR_VD(hostdata->attr))
+ return snprintf(buf, PAGE_SIZE, "NA\n");
+
+ ret = spraid_get_vd_info(hdev, vd_info, sdev->id);
+ if (ret) {
+ vd_info->vd_status = 0;
+ vd_info->rg_id = 0xff;
+ }
+
+ ret = (vd_info->vd_status < ARRAY_SIZE(raid_states)) ? vd_info->vd_status : 0;
+
+ kfree(vd_info);
+
+ return snprintf(buf, PAGE_SIZE, "%s\n", raid_states[ret]);
+}
+
+static ssize_t raid_resync_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ struct scsi_device *sdev;
+ struct spraid_dev *hdev;
+ struct spraid_vd_info *vd_info;
+ struct spraid_bgtask *bgtask;
+ struct spraid_sdev_hostdata *hostdata;
+ u8 rg_id, i, progress = 0;
+ int ret;
+
+ sdev = to_scsi_device(dev);
+ hdev = shost_priv(sdev->host);
+ hostdata = sdev->hostdata;
+
+ vd_info = kmalloc(sizeof(*vd_info), GFP_KERNEL);
+ if (!vd_info || !SPRAID_DEV_INFO_ATTR_VD(hostdata->attr))
+ return snprintf(buf, PAGE_SIZE, "NA\n");
+
+ ret = spraid_get_vd_info(hdev, vd_info, sdev->id);
+ if (ret)
+ goto out;
+
+ rg_id = vd_info->rg_id;
+
+ bgtask = (struct spraid_bgtask *)vd_info;
+ ret = spraid_get_bgtask(hdev, bgtask);
+ if (ret)
+ goto out;
+ for (i = 0; i < bgtask->task_num; i++) {
+ if ((bgtask->bgtask[i].type == BGTASK_TYPE_REBUILD) &&
+ (le16_to_cpu(bgtask->bgtask[i].vd_id) == rg_id))
+ progress = bgtask->bgtask[i].progress;
+ }
+
+out:
+ kfree(vd_info);
+ return snprintf(buf, PAGE_SIZE, "%d\n", progress);
+}
+
+static DEVICE_ATTR_RO(raid_level);
+static DEVICE_ATTR_RO(raid_state);
+static DEVICE_ATTR_RO(raid_resync);
+
+static struct device_attribute *spraid_dev_attrs[] = {
+ &dev_attr_raid_level,
+ &dev_attr_raid_state,
+ &dev_attr_raid_resync,
+ NULL,
+};
+
+static struct pci_error_handlers spraid_err_handler = {
+ .error_detected = spraid_pci_error_detected,
+ .slot_reset = spraid_pci_slot_reset,
+ .reset_done = spraid_reset_done,
+};
+
+static int spraid_sysfs_host_reset(struct Scsi_Host *shost, int reset_type)
+{
+ int ret;
+ struct spraid_dev *hdev = shost_priv(shost);
+
+ dev_info(hdev->dev, "[%s] start sysfs host reset cmd\n", __func__);
+ ret = spraid_reset_work_sync(hdev);
+ dev_info(hdev->dev, "[%s] stop sysfs host reset cmd[%d]\n", __func__, ret);
+
+ return ret;
+}
+
static struct scsi_host_template spraid_driver_template = {
.module = THIS_MODULE,
.name = "Ramaxel Logic spraid driver",
@@ -3379,9 +3409,11 @@ static struct scsi_host_template spraid_driver_template = {
.eh_bus_reset_handler = spraid_bus_reset_handler,
.eh_host_reset_handler = spraid_shost_reset_handler,
.change_queue_depth = scsi_change_queue_depth,
- .host_tagset = 1,
+ .host_tagset = 0,
.this_id = -1,
.shost_attrs = spraid_host_attrs,
+ .sdev_attrs = spraid_dev_attrs,
+ .host_reset = spraid_sysfs_host_reset,
};
static void spraid_shutdown(struct pci_dev *pdev)
@@ -3392,11 +3424,53 @@ static void spraid_shutdown(struct pci_dev *pdev)
spraid_disable_admin_queue(hdev, true);
}
+/* bsg dispatch user command */
+static int spraid_bsg_host_dispatch(struct bsg_job *job)
+{
+ struct Scsi_Host *shost = dev_to_shost(job->dev);
+ struct spraid_dev *hdev = shost_priv(shost);
+ struct request *rq = blk_mq_rq_from_pdu(job);
+ struct spraid_bsg_request *bsg_req = job->request;
+ int ret = 0;
+
+ dev_log_dbg(hdev->dev, "[%s] msgcode[%d], msglen[%d], timeout[%d], req_nsge[%d], req_len[%d]\n",
+ __func__, bsg_req->msgcode, job->request_len, rq->timeout,
+ job->request_payload.sg_cnt, job->request_payload.payload_len);
+
+ job->reply_len = 0;
+
+ switch (bsg_req->msgcode) {
+ case SPRAID_BSG_ADM:
+ ret = spraid_user_admin_cmd(hdev, job);
+ break;
+ case SPRAID_BSG_IOQ:
+ ret = spraid_user_ioq_cmd(hdev, job);
+ break;
+ default:
+ dev_info(hdev->dev, "[%s] unsupport msgcode[%d]\n", __func__, bsg_req->msgcode);
+ break;
+ }
+
+ if (ret > 0)
+ ret = ret | (ret << 8);
+
+ bsg_job_done(job, ret, 0);
+ return 0;
+}
+
+static inline void spraid_remove_bsg(struct spraid_dev *hdev)
+{
+ if (hdev->bsg_queue) {
+ bsg_unregister_queue(hdev->bsg_queue);
+ blk_cleanup_queue(hdev->bsg_queue);
+ }
+}
static int spraid_probe(struct pci_dev *pdev, const struct pci_device_id *id)
{
struct spraid_dev *hdev;
struct Scsi_Host *shost;
int node, ret;
+ char bsg_name[15];
shost = scsi_host_alloc(&spraid_driver_template, sizeof(*hdev));
if (!shost) {
@@ -3421,10 +3495,10 @@ static int spraid_probe(struct pci_dev *pdev, const struct pci_device_id *id)
goto put_dev;
init_rwsem(&hdev->devices_rwsem);
- INIT_WORK(&hdev->aen_work, spraid_async_event_work);
INIT_WORK(&hdev->scan_work, spraid_scan_work);
INIT_WORK(&hdev->timesyn_work, spraid_timesyn_work);
INIT_WORK(&hdev->reset_work, spraid_reset_work);
+ INIT_WORK(&hdev->fw_act_work, spraid_fw_act_work);
spin_lock_init(&hdev->state_lock);
ret = spraid_alloc_resources(hdev);
@@ -3439,17 +3513,13 @@ static int spraid_probe(struct pci_dev *pdev, const struct pci_device_id *id)
if (ret)
goto pci_disable;
- ret = spraid_alloc_admin_tags(hdev);
- if (ret)
- goto disable_admin_q;
-
ret = spraid_init_ctrl_info(hdev);
if (ret)
- goto free_admin_tagset;
+ goto disable_admin_q;
ret = spraid_alloc_iod_ext_mem_pool(hdev);
if (ret)
- goto free_admin_tagset;
+ goto disable_admin_q;
ret = spraid_setup_io_queues(hdev);
if (ret)
@@ -3464,9 +3534,15 @@ static int spraid_probe(struct pci_dev *pdev, const struct pci_device_id *id)
goto remove_io_queues;
}
- ret = spraid_create_cdev(hdev);
- if (ret)
+ snprintf(bsg_name, sizeof(bsg_name), "spraid%d", shost->host_no);
+ hdev->bsg_queue = bsg_setup_queue(&shost->shost_gendev, bsg_name,
+ spraid_bsg_host_dispatch, NULL,
+ spraid_cmd_size(hdev, true, false));
+ if (IS_ERR(hdev->bsg_queue)) {
+ dev_err(hdev->dev, "err, setup bsg failed\n");
+ hdev->bsg_queue = NULL;
goto remove_io_queues;
+ }
if (hdev->online_queues == SPRAID_ADMIN_QUEUE_NUM) {
dev_warn(hdev->dev, "warn only admin queue can be used\n");
@@ -3475,11 +3551,11 @@ static int spraid_probe(struct pci_dev *pdev, const struct pci_device_id *id)
hdev->state = SPRAID_LIVE;
- spraid_send_aen(hdev);
+ spraid_send_all_aen(hdev);
ret = spraid_dev_list_init(hdev);
if (ret)
- goto remove_cdev;
+ goto remove_bsg;
ret = spraid_configure_timestamp(hdev);
if (ret)
@@ -3487,20 +3563,18 @@ static int spraid_probe(struct pci_dev *pdev, const struct pci_device_id *id)
ret = spraid_alloc_ioq_ptcmds(hdev);
if (ret)
- goto remove_cdev;
+ goto remove_bsg;
scsi_scan_host(hdev->shost);
return 0;
-remove_cdev:
- spraid_remove_cdev(hdev);
+remove_bsg:
+ spraid_remove_bsg(hdev);
remove_io_queues:
spraid_remove_io_queues(hdev);
free_iod_mempool:
spraid_free_iod_ext_mem_pool(hdev);
-free_admin_tagset:
- spraid_remove_admin_tagset(hdev);
disable_admin_q:
spraid_disable_admin_queue(hdev, false);
pci_disable:
@@ -3524,22 +3598,17 @@ static void spraid_remove(struct pci_dev *pdev)
dev_info(hdev->dev, "enter spraid remove\n");
spraid_change_host_state(hdev, SPRAID_DELETING);
+ flush_work(&hdev->reset_work);
- if (!pci_device_is_present(pdev)) {
- scsi_block_requests(shost);
+ if (!pci_device_is_present(pdev))
spraid_back_all_io(hdev);
- scsi_unblock_requests(shost);
- }
- flush_work(&hdev->reset_work);
+ spraid_remove_bsg(hdev);
scsi_remove_host(shost);
-
- kfree(hdev->ioq_ptcmds);
+ spraid_free_ioq_ptcmds(hdev);
kfree(hdev->devices);
- spraid_remove_cdev(hdev);
spraid_remove_io_queues(hdev);
spraid_free_iod_ext_mem_pool(hdev);
- spraid_remove_admin_tagset(hdev);
spraid_disable_admin_queue(hdev, false);
spraid_pci_disable(hdev);
spraid_free_resources(hdev);
@@ -3551,7 +3620,7 @@ static void spraid_remove(struct pci_dev *pdev)
}
static const struct pci_device_id spraid_id_table[] = {
- { PCI_DEVICE(PCI_VENDOR_ID_RAMAXEL_LOGIC, SPRAID_SERVER_DEVICE_HAB_DID) },
+ { PCI_DEVICE(PCI_VENDOR_ID_RAMAXEL_LOGIC, SPRAID_SERVER_DEVICE_HBA_DID) },
{ PCI_DEVICE(PCI_VENDOR_ID_RAMAXEL_LOGIC, SPRAID_SERVER_DEVICE_RAID_DID) },
{ 0, }
};
@@ -3563,6 +3632,7 @@ static struct pci_driver spraid_driver = {
.probe = spraid_probe,
.remove = spraid_remove,
.shutdown = spraid_shutdown,
+ .err_handler = &spraid_err_handler,
};
static int __init spraid_init(void)
@@ -3573,14 +3643,10 @@ static int __init spraid_init(void)
if (!spraid_wq)
return -ENOMEM;
- ret = alloc_chrdev_region(&spraid_chr_devt, 0, SPRAID_MINORS, "spraid");
- if (ret < 0)
- goto destroy_wq;
-
spraid_class = class_create(THIS_MODULE, "spraid");
if (IS_ERR(spraid_class)) {
ret = PTR_ERR(spraid_class);
- goto unregister_chrdev;
+ goto destroy_wq;
}
ret = pci_register_driver(&spraid_driver);
@@ -3591,8 +3657,6 @@ static int __init spraid_init(void)
destroy_class:
class_destroy(spraid_class);
-unregister_chrdev:
- unregister_chrdev_region(spraid_chr_devt, SPRAID_MINORS);
destroy_wq:
destroy_workqueue(spraid_wq);
@@ -3603,12 +3667,11 @@ static void __exit spraid_exit(void)
{
pci_unregister_driver(&spraid_driver);
class_destroy(spraid_class);
- unregister_chrdev_region(spraid_chr_devt, SPRAID_MINORS);
destroy_workqueue(spraid_wq);
ida_destroy(&spraid_instance_ida);
}
-MODULE_AUTHOR("Ramaxel Memory Technology");
+MODULE_AUTHOR("songyl(a)ramaxel.com");
MODULE_DESCRIPTION("Ramaxel Memory Technology SPraid Driver");
MODULE_LICENSE("GPL");
MODULE_VERSION(SPRAID_DRV_VERSION);
--
2.32.0
Ramaxel inclusion
category: features
bugzilla: https://gitee.com/openeuler/kernel/issues/I4ON8F
CVE: NA
Changes:
1. Split scmd_tmout_nonpt into two parameters:
scmd_tmout_vd/scmd_tmout_rawdisk
2. Return -ETIME instead of -EINVAL when a command times out.
3. Add one module parameter: max_io_force.
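For illustration only (not part of the patch), a minimal sketch of how a
caller can act on the new return value; the abort-handler hunks below key
off exactly this distinction:

	/* Hypothetical caller: -ETIME now unambiguously means the command
	 * itself timed out in firmware, rather than a bad argument. */
	ret = spraid_send_abort_cmd(hdev, hostdata->hdid, hwq, cid);
	if (ret == -ETIME) {
		/* abort timed out: skip waiting and escalate the reset */
	} else if (ret) {
		/* some other submission or firmware error */
	}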
Signed-off-by: Yanling Song <songyl(a)ramaxel.com>
Reviewed-by: Jiang Yu <yujiang(a)ramaxel.com>
---
drivers/scsi/spraid/spraid_main.c | 96 +++++++++++++++----------------
1 file changed, 45 insertions(+), 51 deletions(-)
diff --git a/drivers/scsi/spraid/spraid_main.c b/drivers/scsi/spraid/spraid_main.c
index 519b39f44e91..7069582d741a 100644
--- a/drivers/scsi/spraid/spraid_main.c
+++ b/drivers/scsi/spraid/spraid_main.c
@@ -42,10 +42,17 @@ static u32 admin_tmout = 60;
module_param(admin_tmout, uint, 0644);
MODULE_PARM_DESC(admin_tmout, "admin commands timeout (seconds)");
-static u32 scmd_tmout_nonpt = 180;
-module_param(scmd_tmout_nonpt, uint, 0644);
-MODULE_PARM_DESC(scmd_tmout_nonpt,
- "scsi commands timeout for rawdisk&raid(seconds)");
+static u32 scmd_tmout_rawdisk = 180;
+module_param(scmd_tmout_rawdisk, uint, 0644);
+MODULE_PARM_DESC(scmd_tmout_rawdisk, "scsi commands timeout for rawdisk(seconds)");
+
+static u32 scmd_tmout_vd = 180;
+module_param(scmd_tmout_vd, uint, 0644);
+MODULE_PARM_DESC(scmd_tmout_vd, "scsi commands timeout for vd(seconds)");
+
+static bool max_io_force;
+module_param(max_io_force, bool, 0644);
+MODULE_PARM_DESC(max_io_force, "force max_hw_sectors_kb = 1024, default false(performance first)");
static int ioq_depth_set(const char *val, const struct kernel_param *kp);
static const struct kernel_param_ops ioq_depth_ops = {
@@ -78,7 +85,7 @@ static unsigned char log_debug_switch;
module_param_cb(log_debug_switch, &log_debug_switch_ops,
&log_debug_switch, 0644);
MODULE_PARM_DESC(log_debug_switch,
- "set log state, default non-zero for switch on");
+ "set log state, default zero for switch off");
static int small_pool_num_set(const char *val, const struct kernel_param *kp)
{
@@ -132,7 +139,6 @@ static struct workqueue_struct *spraid_wq;
#define SPRAID_DRV_VERSION "1.0.0.0"
#define ADMIN_TIMEOUT (admin_tmout * HZ)
-#define ADMIN_ERR_TIMEOUT 32757
#define SPRAID_WAIT_ABNL_CMD_TIMEOUT (3 * 2)
@@ -242,13 +248,6 @@ static int spraid_pci_enable(struct spraid_dev *hdev)
goto disable;
}
- ret = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_ALL_TYPES);
- if (ret < 0) {
- dev_err(hdev->dev,
- "Allocate one IRQ for setup admin channel failed\n");
- goto disable;
- }
-
hdev->cap = lo_hi_readq(hdev->bar + SPRAID_REG_CAP);
hdev->ioq_depth = min_t(u32, SPRAID_CAP_MQES(hdev->cap) + 1,
io_queue_depth);
@@ -261,13 +260,21 @@ static int spraid_pci_enable(struct spraid_dev *hdev)
maskbit);
maskbit = SPRAID_DMA_MSK_BIT_MAX;
}
- if (dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(maskbit))) {
+ if (dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(maskbit)) &&
+ dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32))) {
dev_err(hdev->dev, "set dma mask and coherent failed\n");
goto disable;
}
dev_info(hdev->dev, "set dma mask[%llu] success\n", maskbit);
+ ret = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_ALL_TYPES);
+
+ if (ret < 0) {
+ dev_err(hdev->dev, "Allocate one IRQ for setup admin channel failed\n");
+ goto disable;
+ }
+
pci_enable_pcie_error_reporting(pdev);
pci_save_state(pdev);
@@ -840,6 +847,9 @@ static void spraid_map_status(struct spraid_iod *iod, struct scsi_cmnd *scmd,
break;
default:
set_host_byte(scmd, DID_BAD_TARGET);
+ dev_warn(iod->spraidq->hdev->dev, "[%s] cid[%d] qid[%d];"
+ "bad status[0x%x]\n", __func__, cqe->cmd_id,
+ le16_to_cpu(cqe->sq_id), le16_to_cpu(cqe->status));
break;
}
}
@@ -981,32 +991,17 @@ static void spraid_slave_destroy(struct scsi_device *sdev)
static int spraid_slave_configure(struct scsi_device *sdev)
{
- u16 idx;
- unsigned int timeout = scmd_tmout_nonpt * HZ;
+ unsigned int timeout = scmd_tmout_rawdisk * HZ;
struct spraid_dev *hdev = shost_priv(sdev->host);
struct spraid_sdev_hostdata *hostdata = sdev->hostdata;
u32 max_sec = sdev->host->max_sectors;
if (hostdata) {
- idx = hostdata->hdid - 1;
- if (sdev->channel == hdev->devices[idx].channel &&
- sdev->id == le16_to_cpu(hdev->devices[idx].target) &&
- sdev->lun < hdev->devices[idx].lun) {
- if (SPRAID_DEV_INFO_ATTR_PT(hdev->devices[idx].attr))
- timeout = 30 * HZ;
- else
- timeout = scmd_tmout_nonpt * HZ;
- max_sec = le16_to_cpu(hdev->devices[idx].max_io_kb)
- << 1;
- } else {
- dev_err(hdev->dev,
- "[%s] err, sdev->channel:id:lun[%d:%d:%lld];"
- "devices[%d], channel:target:lun[%d:%d:%d]\n",
- __func__, sdev->channel, sdev->id, sdev->lun,
- idx, hdev->devices[idx].channel,
- hdev->devices[idx].target,
- hdev->devices[idx].lun);
- }
+ if (SPRAID_DEV_INFO_ATTR_VD(hostdata->attr))
+ timeout = scmd_tmout_vd * HZ;
+ else if (SPRAID_DEV_INFO_ATTR_RAWDISK(hostdata->attr))
+ timeout = scmd_tmout_rawdisk * HZ;
+ max_sec = hostdata->max_io_kb << 1;
} else {
dev_err(hdev->dev, "[%s] err, sdev->hostdata is null\n",
__func__);
@@ -1018,11 +1013,12 @@ static int spraid_slave_configure(struct scsi_device *sdev)
if ((max_sec == 0) || (max_sec > sdev->host->max_sectors))
max_sec = sdev->host->max_sectors;
- dev_info(hdev->dev,
- "[%s] sdev->channel:id:lun[%d:%d:%lld];"
- " scmd_timeout[%d]s, maxsec[%d]\n",
- __func__, sdev->channel, sdev->id,
- sdev->lun, timeout / HZ, max_sec);
+ if (!max_io_force)
+ blk_queue_max_hw_sectors(sdev->request_queue, max_sec);
+
+ dev_info(hdev->dev, "[%s] sdev->channel:id:lun[%d:%d:%lld];"
+ "scmd_timeout[%d]s, maxsec[%d]\n", __func__, sdev->channel,
+ sdev->id, sdev->lun, timeout / HZ, max_sec);
return 0;
}
@@ -1677,7 +1673,7 @@ static int spraid_submit_admin_sync_cmd(struct spraid_dev *hdev,
cmd->usr_cmd.opcode, cmd->usr_cmd.info_0.subopcode);
WRITE_ONCE(adm_cmd->state, SPRAID_CMD_TIMEOUT);
spraid_put_cmd(hdev, adm_cmd, SPRAID_CMD_ADM);
- return -EINVAL;
+ return -ETIME;
}
if (result0)
@@ -2630,9 +2626,6 @@ static int spraid_user_admin_cmd(struct spraid_dev *hdev, struct bsg_job *job)
return -EBUSY;
}
- dev_info(hdev->dev, "[%s] opcode[0x%x] subopcode[0x%x] init\n",
- __func__, cmd->opcode, cmd->info_0.subopcode);
-
memset(&admin_cmd, 0, sizeof(admin_cmd));
admin_cmd.common.opcode = cmd->opcode;
admin_cmd.common.flags = cmd->flags;
@@ -2659,10 +2652,11 @@ static int spraid_user_admin_cmd(struct spraid_dev *hdev, struct bsg_job *job)
memcpy(job->reply, result, sizeof(result));
}
- dev_info(hdev->dev, "[%s] opcode[0x%x] subopcode[0x%x];"
- " status[0x%x] result0[0x%x] result1[0x%x]\n",
- __func__, cmd->opcode, cmd->info_0.subopcode,
- status, result[0], result[1]);
+ if (status)
+ dev_info(hdev->dev, "[%s] opcode[0x%x] subopcode[0x%x];"
+ " status[0x%x] result0[0x%x] result1[0x%x]\n",
+ __func__, cmd->opcode, cmd->info_0.subopcode,
+ status, result[0], result[1]);
spraid_bsg_unmap_data(hdev, job);
@@ -2742,7 +2736,7 @@ static int spraid_submit_ioq_sync_cmd(struct spraid_dev *hdev,
(le32_to_cpu(cmd->common.cdw3[0]) & 0xffff));
WRITE_ONCE(pt_cmd->state, SPRAID_CMD_TIMEOUT);
spraid_put_cmd(hdev, pt_cmd, SPRAID_CMD_IOPT);
- return -EINVAL;
+ return -ETIME;
}
if (result && reslen) {
@@ -3140,7 +3134,7 @@ static int spraid_abort_handler(struct scsi_cmnd *scmd)
dev_warn(hdev->dev, "cid[%d] qid[%d] timeout, aborting\n", cid, hwq);
ret = spraid_send_abort_cmd(hdev, hostdata->hdid, hwq, cid);
- if (ret != ADMIN_ERR_TIMEOUT) {
+ if (ret != -ETIME) {
ret = spraid_wait_abnl_cmd_done(iod);
if (ret) {
dev_warn(hdev->dev, "cid[%d] qid[%d] abort failed;"
@@ -3623,7 +3617,7 @@ static int spraid_bsg_host_dispatch(struct bsg_job *job)
struct spraid_bsg_request *bsg_req = job->request;
int ret = 0;
- dev_info(hdev->dev, "[%s] msgcode[%d], msglen[%d], timeout[%d];"
+ dev_log_dbg(hdev->dev, "[%s] msgcode[%d], msglen[%d], timeout[%d];"
" req_nsge[%d], req_len[%d]\n",
__func__, bsg_req->msgcode, job->request_len,
rq->timeout, job->request_payload.sg_cnt,
--
2.27.0
[PATCH openEuler-5.10 1/2] kabi: reserve space for arm64 cpufeature related structure
by Zheng Zengkai 04 Jan '22
04 Jan '22
From: Jialin Zhang <zhangjialin11(a)huawei.com>
hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4JBL0
CVE: NA
-------------------------------
Reserve space for the structures cpu_hwcap_keys and cpu_hwcaps.
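For context, a sketch of why bumping ARM64_NCAPS alone is enough; the
real declarations live in arch/arm64/kernel/cpufeature.c and are assumed
to look roughly like this:

	/* Both per-capability objects are sized by ARM64_NCAPS, so raising
	 * it from 64 to 80 leaves 16 spare slots while existing capability
	 * indices, and thus the kABI layout, stay unchanged. */
	DECLARE_BITMAP(cpu_hwcaps, ARM64_NCAPS);
	DEFINE_STATIC_KEY_ARRAY_FALSE(cpu_hwcap_keys, ARM64_NCAPS);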
Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
Reviewed-by: wangxiongfeng <wangxiongfeng2(a)huawei.com>
Reviewed-by: Yang Yingliang <yangyingliang(a)huawei.com>
Reviewed-by: Hanjun Guo <guohanjun(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
arch/arm64/include/asm/cpucaps.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index 93bae3795165..c58a1919bfc9 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -72,6 +72,6 @@
#define ARM64_HAS_TWED 62
#define ARM64_WORKAROUND_HISILICON_1980005 63
-#define ARM64_NCAPS 64
+#define ARM64_NCAPS 80
#endif /* __ASM_CPUCAPS_H */
--
2.20.1
04 Jan '22
From: Linus Walleij <linus.walleij(a)linaro.org>
mainline inclusion
from mainline-v5.3-rc1
commit b3198c38f02d54a5e964258a2180d502abe6eaf0
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4P7YB
CVE: N/A
---
The 50 ms default timeout waiting for vblanks is too small
for the first vblank from the ST-Ericsson MCDE display
controller over DSI. Presumably this is because the DSI
display is command-mode only and the state machine will
take some time setting up its state for the first display
update, and we hit a timeout. 100 ms makes it pass without
problems.
Signed-off-by: Linus Walleij <linus.walleij(a)linaro.org>
Reviewed-by: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20190430093746.26485-1-linus.…
Signed-off-by: Gou Hao <gouhao(a)uniontech.com>
---
drivers/gpu/drm/drm_atomic_helper.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
index 86efd2da37f9..553415fe8ede 100644
--- a/drivers/gpu/drm/drm_atomic_helper.c
+++ b/drivers/gpu/drm/drm_atomic_helper.c
@@ -1424,7 +1424,7 @@ drm_atomic_helper_wait_for_vblanks(struct drm_device *dev,
ret = wait_event_timeout(dev->vblank[i].queue,
old_state->crtcs[i].last_vblank_count !=
drm_crtc_vblank_count(crtc),
- msecs_to_jiffies(50));
+ msecs_to_jiffies(100));
WARN(!ret, "[CRTC:%d:%s] vblank wait timed out\n",
crtc->base.id, crtc->name);
--
2.20.1
From: Zhang Jian <zhangjian210(a)huawei.com>
ascend inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4OXH9
CVE: NA
-------------------------------------------------
Collect the processes that have the page mapped via collect_procs().
@page: if the page is part of a hugepage/compound page, we must
use compound_head() to find its head page to prevent a kernel panic,
and the page must be locked.
@to_kill: the function returns a linked list; once we are done with
this list, we must kfree() its entries.
@force_early: if we want to find all processes, we must set it to true; if
it is false, the function only returns processes that have the PF_MCE_PROCESS
or PF_MCE_EARLY mark.
Limits: if force_early is true, sysctl_memory_failure_early_kill has no effect.
If force_early is false, no process has the PF_MCE_PROCESS or PF_MCE_EARLY flag,
and sysctl_memory_failure_early_kill is enabled, the function returns all tasks
regardless of whether they have the PF_MCE_PROCESS or PF_MCE_EARLY flag.
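A minimal usage sketch of the exported interface (hypothetical caller; the
to_kill entry type is private to mm/memory-failure.c, so consuming and
freeing the entries is only outlined):

#include <linux/mm.h>
#include <linux/pagemap.h>

/* Hypothetical module code; error handling elided. */
static void example_collect(struct page *page)
{
	struct page *head = compound_head(page); /* cope with compound pages */
	LIST_HEAD(to_kill);

	lock_page(head);
	collect_procs(head, &to_kill, true); /* force_early: find all mappers */
	unlock_page(head);

	/* consume the list here, then kfree() each entry as described above */
}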
Signed-off-by: Zhang Jian <zhangjian210(a)huawei.com>
Reviewed-by: Weilong Chen <chenweilong(a)huawei.com>
Reviewed-by: Kefeng Wang<wangkefeng.wang(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
include/linux/mm.h | 2 ++
mm/memory-failure.c | 3 ++-
2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index f801fa5e60289..7b724d39e6ee0 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2866,6 +2866,8 @@ extern int sysctl_memory_failure_recovery;
extern void shake_page(struct page *p, int access);
extern atomic_long_t num_poisoned_pages __read_mostly;
extern int soft_offline_page(struct page *page, int flags);
+extern void collect_procs(struct page *page, struct list_head *tokill,
+ int force_early);
/*
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 52619a9bc3b0e..72e1746e5386f 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -512,7 +512,7 @@ static void collect_procs_file(struct page *page, struct list_head *to_kill,
* First preallocate one tokill structure outside the spin locks,
* so that we can kill at least one process reasonably reliable.
*/
-static void collect_procs(struct page *page, struct list_head *tokill,
+void collect_procs(struct page *page, struct list_head *tokill,
int force_early)
{
struct to_kill *tk;
@@ -529,6 +529,7 @@ static void collect_procs(struct page *page, struct list_head *tokill,
collect_procs_file(page, tokill, &tk, force_early);
kfree(tk);
}
+EXPORT_SYMBOL_GPL(collect_procs);
static const char *action_name[] = {
[MF_IGNORED] = "Ignored",
--
2.25.1
1
0

31 Dec '21
This initial commit contains Ramaxel's spraid module.
Signed-off-by: Yanling Song <songyl(a)ramaxel.com>
Changes from V3:
1. Use a macro to replace the scmd_tmout_nonpt module parameter.
Changes from V2:
1. Change "depends BLK_DEV_BSGLIB" to "select BLK_DEV_BSGLIB".
2. Add more descriptions in help.
3. Use pr_debug() instead of introducing dev_log_dbg().
4. Use get_unaligned_be*() in spraid_setup_rw_cmd() (see the sketch after this changelog);
5. Remove some unnecessary module parameters:
scmd_tmout_pt, wait_abl_tmout, use_sgl_force
6. Some module parameters will be changed in the next version:
admin_tmout, scmd_tmout_nonpt
Changes from V1:
1. Use the BSG module to replace ioctl
2. Use author's email as MODULE_AUTHOR
3. Remove "default=m" in drivers/scsi/spraid/Kconfig
4. To be changed in the next version:
a. Use get_unaligned_be*() in spraid_setup_rw_cmd();
b. Use pr_debug() instead of introducing dev_log_dbg().
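For reference, a short sketch of the get_unaligned_be*() helpers named in
item 4 above (declared in <asm/unaligned.h>); spraid_setup_rw_cmd() below
uses them to pull big-endian LBA/length fields from unaligned CDB offsets.
The helper here is illustrative only, not part of the driver:

#include <linux/types.h>
#include <asm/unaligned.h>

/* Decode the LBA and transfer length of a 10-byte READ(10)/WRITE(10) cdb. */
static void example_decode_rw10(const u8 *cdb, u32 *lba, u16 *nblocks)
{
	*lba = get_unaligned_be32(&cdb[2]);     /* bytes 2..5, big-endian */
	*nblocks = get_unaligned_be16(&cdb[7]); /* bytes 7..8, big-endian */
}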
---
Documentation/scsi/spraid.rst | 16 +
MAINTAINERS | 7 +
drivers/scsi/Kconfig | 1 +
drivers/scsi/Makefile | 1 +
drivers/scsi/spraid/Kconfig | 13 +
drivers/scsi/spraid/Makefile | 7 +
drivers/scsi/spraid/spraid.h | 695 ++++++
drivers/scsi/spraid/spraid_main.c | 3266 +++++++++++++++++++++++++++++
8 files changed, 4006 insertions(+)
create mode 100644 Documentation/scsi/spraid.rst
create mode 100644 drivers/scsi/spraid/Kconfig
create mode 100644 drivers/scsi/spraid/Makefile
create mode 100644 drivers/scsi/spraid/spraid.h
create mode 100644 drivers/scsi/spraid/spraid_main.c
diff --git a/Documentation/scsi/spraid.rst b/Documentation/scsi/spraid.rst
new file mode 100644
index 000000000000..f87b02a6907b
--- /dev/null
+++ b/Documentation/scsi/spraid.rst
@@ -0,0 +1,16 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==============================================
+SPRAID - Ramaxel SCSI Raid Controller driver
+==============================================
+
+This file describes the spraid SCSI driver for Ramaxel
+raid controllers. The spraid driver is the first generation raid driver for
+Ramaxel Corp.
+
+For Ramaxel spraid controller support, enable the spraid driver
+when configuring the kernel.
+
+Supported devices
+=================
+<Controller names to be added as they become publicly available.>
diff --git a/MAINTAINERS b/MAINTAINERS
index 57f7656fe513..68560008b259 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -17973,6 +17973,13 @@ F: include/dt-bindings/spmi/spmi.h
F: include/linux/spmi.h
F: include/trace/events/spmi.h
+SPRAID SCSI/Raid DRIVERS
+M: Yanling Song <yanling.song(a)ramaxel.com>
+L: linux-scsi(a)vger.kernel.org
+S: Maintained
+F: Documentation/scsi/spraid.rst
+F: drivers/scsi/spraid/
+
SPU FILE SYSTEM
M: Jeremy Kerr <jk(a)ozlabs.org>
L: linuxppc-dev(a)lists.ozlabs.org
diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig
index 6e3a04107bb6..3da5d26e1e11 100644
--- a/drivers/scsi/Kconfig
+++ b/drivers/scsi/Kconfig
@@ -501,6 +501,7 @@ source "drivers/scsi/mpt3sas/Kconfig"
source "drivers/scsi/mpi3mr/Kconfig"
source "drivers/scsi/smartpqi/Kconfig"
source "drivers/scsi/ufs/Kconfig"
+source "drivers/scsi/spraid/Kconfig"
config SCSI_HPTIOP
tristate "HighPoint RocketRAID 3xxx/4xxx Controller support"
diff --git a/drivers/scsi/Makefile b/drivers/scsi/Makefile
index 19814c26c908..192f8fb10a19 100644
--- a/drivers/scsi/Makefile
+++ b/drivers/scsi/Makefile
@@ -96,6 +96,7 @@ obj-$(CONFIG_SCSI_ZALON) += zalon7xx.o
obj-$(CONFIG_SCSI_DC395x) += dc395x.o
obj-$(CONFIG_SCSI_AM53C974) += esp_scsi.o am53c974.o
obj-$(CONFIG_CXLFLASH) += cxlflash/
+obj-$(CONFIG_RAMAXEL_SPRAID) += spraid/
obj-$(CONFIG_MEGARAID_LEGACY) += megaraid.o
obj-$(CONFIG_MEGARAID_NEWGEN) += megaraid/
obj-$(CONFIG_MEGARAID_SAS) += megaraid/
diff --git a/drivers/scsi/spraid/Kconfig b/drivers/scsi/spraid/Kconfig
new file mode 100644
index 000000000000..bfbba3db8db0
--- /dev/null
+++ b/drivers/scsi/spraid/Kconfig
@@ -0,0 +1,13 @@
+#
+# Ramaxel driver configuration
+#
+
+config RAMAXEL_SPRAID
+ tristate "Ramaxel spraid Adapter"
+ depends on PCI && SCSI
+ select BLK_DEV_BSGLIB
+ depends on ARM64 || X86_64
+ help
+ This driver supports the Ramaxel SPRxxx series
+ RAID controllers, which have a PCIe Gen4 host
+ interface and support SAS/SATA HDDs/SSDs.
diff --git a/drivers/scsi/spraid/Makefile b/drivers/scsi/spraid/Makefile
new file mode 100644
index 000000000000..ad2c2a84ddaf
--- /dev/null
+++ b/drivers/scsi/spraid/Makefile
@@ -0,0 +1,7 @@
+#
+# Makefile for the Ramaxel device drivers.
+#
+
+obj-$(CONFIG_RAMAXEL_SPRAID) += spraid.o
+
+spraid-objs := spraid_main.o
diff --git a/drivers/scsi/spraid/spraid.h b/drivers/scsi/spraid/spraid.h
new file mode 100644
index 000000000000..617f5c2cdb82
--- /dev/null
+++ b/drivers/scsi/spraid/spraid.h
@@ -0,0 +1,695 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Ramaxel Memory Technology, Ltd */
+
+#ifndef __SPRAID_H_
+#define __SPRAID_H_
+
+#define SPRAID_CAP_MQES(cap) ((cap) & 0xffff)
+#define SPRAID_CAP_STRIDE(cap) (((cap) >> 32) & 0xf)
+#define SPRAID_CAP_MPSMIN(cap) (((cap) >> 48) & 0xf)
+#define SPRAID_CAP_MPSMAX(cap) (((cap) >> 52) & 0xf)
+#define SPRAID_CAP_TIMEOUT(cap) (((cap) >> 24) & 0xff)
+#define SPRAID_CAP_DMAMASK(cap) (((cap) >> 37) & 0xff)
+
+#define SPRAID_DEFAULT_MAX_CHANNEL 4
+#define SPRAID_DEFAULT_MAX_ID 240
+#define SPRAID_DEFAULT_MAX_LUN_PER_HOST 8
+#define MAX_SECTORS 2048
+
+#define IO_SQE_SIZE sizeof(struct spraid_ioq_command)
+#define ADMIN_SQE_SIZE sizeof(struct spraid_admin_command)
+#define SQE_SIZE(qid) (((qid) > 0) ? IO_SQE_SIZE : ADMIN_SQE_SIZE)
+#define CQ_SIZE(depth) ((depth) * sizeof(struct spraid_completion))
+#define SQ_SIZE(qid, depth) ((depth) * SQE_SIZE(qid))
+
+#define SENSE_SIZE(depth) ((depth) * SCSI_SENSE_BUFFERSIZE)
+
+#define SPRAID_AQ_DEPTH 128
+#define SPRAID_NR_AEN_COMMANDS 16
+#define SPRAID_AQ_BLK_MQ_DEPTH (SPRAID_AQ_DEPTH - SPRAID_NR_AEN_COMMANDS)
+#define SPRAID_AQ_MQ_TAG_DEPTH (SPRAID_AQ_BLK_MQ_DEPTH - 1)
+
+#define SPRAID_ADMIN_QUEUE_NUM 1
+#define SPRAID_PTCMDS_PERQ 1
+#define SPRAID_IO_BLK_MQ_DEPTH (hdev->shost->can_queue)
+#define SPRAID_NR_IOQ_PTCMDS (SPRAID_PTCMDS_PERQ * hdev->shost->nr_hw_queues)
+
+#define FUA_MASK 0x08
+#define SPRAID_MINORS BIT(MINORBITS)
+
+#define COMMAND_IS_WRITE(cmd) ((cmd)->common.opcode & 1)
+
+#define SPRAID_IO_IOSQES 7
+#define SPRAID_IO_IOCQES 4
+#define PRP_ENTRY_SIZE 8
+
+#define SMALL_POOL_SIZE 256
+#define MAX_SMALL_POOL_NUM 16
+#define MAX_CMD_PER_DEV 64
+#define MAX_CDB_LEN 32
+
+#define SPRAID_UP_TO_MULTY4(x) (((x) + 4) & (~0x03))
+
+#define CQE_STATUS_SUCCESS (0x0)
+
+#define PCI_VENDOR_ID_RAMAXEL_LOGIC 0x1E81
+
+#define SPRAID_SERVER_DEVICE_HBA_DID 0x2100
+#define SPRAID_SERVER_DEVICE_RAID_DID 0x2200
+
+#define IO_6_DEFAULT_TX_LEN 256
+
+#define SPRAID_INT_PAGES 2
+#define SPRAID_INT_BYTES(hdev) (SPRAID_INT_PAGES * (hdev)->page_size)
+
+enum {
+ SPRAID_REQ_CANCELLED = (1 << 0),
+ SPRAID_REQ_USERCMD = (1 << 1),
+};
+
+enum {
+ SPRAID_SC_SUCCESS = 0x0,
+ SPRAID_SC_INVALID_OPCODE = 0x1,
+ SPRAID_SC_INVALID_FIELD = 0x2,
+
+ SPRAID_SC_ABORT_LIMIT = 0x103,
+ SPRAID_SC_ABORT_MISSING = 0x104,
+ SPRAID_SC_ASYNC_LIMIT = 0x105,
+
+ SPRAID_SC_DNR = 0x4000,
+};
+
+enum {
+ SPRAID_REG_CAP = 0x0000,
+ SPRAID_REG_CC = 0x0014,
+ SPRAID_REG_CSTS = 0x001c,
+ SPRAID_REG_AQA = 0x0024,
+ SPRAID_REG_ASQ = 0x0028,
+ SPRAID_REG_ACQ = 0x0030,
+ SPRAID_REG_DBS = 0x1000,
+};
+
+enum {
+ SPRAID_CC_ENABLE = 1 << 0,
+ SPRAID_CC_CSS_NVM = 0 << 4,
+ SPRAID_CC_MPS_SHIFT = 7,
+ SPRAID_CC_AMS_SHIFT = 11,
+ SPRAID_CC_SHN_SHIFT = 14,
+ SPRAID_CC_IOSQES_SHIFT = 16,
+ SPRAID_CC_IOCQES_SHIFT = 20,
+ SPRAID_CC_AMS_RR = 0 << SPRAID_CC_AMS_SHIFT,
+ SPRAID_CC_SHN_NONE = 0 << SPRAID_CC_SHN_SHIFT,
+ SPRAID_CC_IOSQES = SPRAID_IO_IOSQES << SPRAID_CC_IOSQES_SHIFT,
+ SPRAID_CC_IOCQES = SPRAID_IO_IOCQES << SPRAID_CC_IOCQES_SHIFT,
+ SPRAID_CC_SHN_NORMAL = 1 << SPRAID_CC_SHN_SHIFT,
+ SPRAID_CC_SHN_MASK = 3 << SPRAID_CC_SHN_SHIFT,
+ SPRAID_CSTS_CFS_SHIFT = 1,
+ SPRAID_CSTS_SHST_SHIFT = 2,
+ SPRAID_CSTS_PP_SHIFT = 5,
+ SPRAID_CSTS_RDY = 1 << 0,
+ SPRAID_CSTS_SHST_CMPLT = 2 << 2,
+ SPRAID_CSTS_SHST_MASK = 3 << 2,
+ SPRAID_CSTS_CFS_MASK = 1 << SPRAID_CSTS_CFS_SHIFT,
+ SPRAID_CSTS_PP_MASK = 1 << SPRAID_CSTS_PP_SHIFT,
+};
+
+enum {
+ SPRAID_ADMIN_DELETE_SQ = 0x00,
+ SPRAID_ADMIN_CREATE_SQ = 0x01,
+ SPRAID_ADMIN_DELETE_CQ = 0x04,
+ SPRAID_ADMIN_CREATE_CQ = 0x05,
+ SPRAID_ADMIN_ABORT_CMD = 0x08,
+ SPRAID_ADMIN_SET_FEATURES = 0x09,
+ SPRAID_ADMIN_ASYNC_EVENT = 0x0c,
+ SPRAID_ADMIN_GET_INFO = 0xc6,
+ SPRAID_ADMIN_RESET = 0xc8,
+};
+
+enum {
+ SPRAID_GET_INFO_CTRL = 0,
+ SPRAID_GET_INFO_DEV_LIST = 1,
+};
+
+enum {
+ SPRAID_RESET_TARGET = 0,
+ SPRAID_RESET_BUS = 1,
+};
+
+enum {
+ SPRAID_AEN_ERROR = 0,
+ SPRAID_AEN_NOTICE = 2,
+ SPRAID_AEN_VS = 7,
+};
+
+enum {
+ SPRAID_AEN_DEV_CHANGED = 0x00,
+ SPRAID_AEN_HOST_PROBING = 0x10,
+};
+
+enum {
+ SPRAID_AEN_TIMESYN = 0x07
+};
+
+enum {
+ SPRAID_CMD_WRITE = 0x01,
+ SPRAID_CMD_READ = 0x02,
+
+ SPRAID_CMD_NONIO_NONE = 0x80,
+ SPRAID_CMD_NONIO_TODEV = 0x81,
+ SPRAID_CMD_NONIO_FROMDEV = 0x82,
+};
+
+enum {
+ SPRAID_QUEUE_PHYS_CONTIG = (1 << 0),
+ SPRAID_CQ_IRQ_ENABLED = (1 << 1),
+
+ SPRAID_FEAT_NUM_QUEUES = 0x07,
+ SPRAID_FEAT_ASYNC_EVENT = 0x0b,
+ SPRAID_FEAT_TIMESTAMP = 0x0e,
+};
+
+enum spraid_state {
+ SPRAID_NEW,
+ SPRAID_LIVE,
+ SPRAID_RESETTING,
+ SPRAID_DELETING,
+ SPRAID_DEAD,
+};
+
+enum {
+ SPRAID_CARD_HBA,
+ SPRAID_CARD_RAID,
+};
+
+enum spraid_cmd_type {
+ SPRAID_CMD_ADM,
+ SPRAID_CMD_IOPT,
+};
+
+struct spraid_completion {
+ __le32 result;
+ union {
+ struct {
+ __u8 sense_len;
+ __u8 resv[3];
+ };
+ __le32 result1;
+ };
+ __le16 sq_head;
+ __le16 sq_id;
+ __u16 cmd_id;
+ __le16 status;
+};
+
+struct spraid_ctrl_info {
+ __le32 nd;
+ __le16 max_cmds;
+ __le16 max_channel;
+ __le32 max_tgt_id;
+ __le16 max_lun;
+ __le16 max_num_sge;
+ __le16 lun_num_in_boot;
+ __u8 mdts;
+ __u8 acl;
+ __u8 aerl;
+ __u8 card_type;
+ __u16 rsvd;
+ __u32 rtd3e;
+ __u8 sn[32];
+ __u8 fr[16];
+ __u8 rsvd1[4020];
+};
+
+struct spraid_dev {
+ struct pci_dev *pdev;
+ struct device *dev;
+ struct Scsi_Host *shost;
+ struct spraid_queue *queues;
+ struct dma_pool *prp_page_pool;
+ struct dma_pool *prp_small_pool[MAX_SMALL_POOL_NUM];
+ mempool_t *iod_mempool;
+ void __iomem *bar;
+ u32 max_qid;
+ u32 num_vecs;
+ u32 queue_count;
+ u32 ioq_depth;
+ int db_stride;
+ u32 __iomem *dbs;
+ struct rw_semaphore devices_rwsem;
+ int numa_node;
+ u32 page_size;
+ u32 ctrl_config;
+ u32 online_queues;
+ u64 cap;
+ int instance;
+ struct spraid_ctrl_info *ctrl_info;
+ struct spraid_dev_info *devices;
+
+ struct spraid_cmd *adm_cmds;
+ struct list_head adm_cmd_list;
+ spinlock_t adm_cmd_lock;
+
+ struct spraid_cmd *ioq_ptcmds;
+ struct list_head ioq_pt_list;
+ spinlock_t ioq_pt_lock;
+
+ struct work_struct scan_work;
+ struct work_struct timesyn_work;
+ struct work_struct reset_work;
+
+ enum spraid_state state;
+ spinlock_t state_lock;
+ struct request_queue *bsg_queue;
+};
+
+struct spraid_sgl_desc {
+ __le64 addr;
+ __le32 length;
+ __u8 rsvd[3];
+ __u8 type;
+};
+
+union spraid_data_ptr {
+ struct {
+ __le64 prp1;
+ __le64 prp2;
+ };
+ struct spraid_sgl_desc sgl;
+};
+
+struct spraid_admin_common_command {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ __le32 cdw2[4];
+ union spraid_data_ptr dptr;
+ __le32 cdw10;
+ __le32 cdw11;
+ __le32 cdw12;
+ __le32 cdw13;
+ __le32 cdw14;
+ __le32 cdw15;
+};
+
+struct spraid_features {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ __u64 rsvd2[2];
+ union spraid_data_ptr dptr;
+ __le32 fid;
+ __le32 dword11;
+ __le32 dword12;
+ __le32 dword13;
+ __le32 dword14;
+ __le32 dword15;
+};
+
+struct spraid_create_cq {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __u32 rsvd1[5];
+ __le64 prp1;
+ __u64 rsvd8;
+ __le16 cqid;
+ __le16 qsize;
+ __le16 cq_flags;
+ __le16 irq_vector;
+ __u32 rsvd12[4];
+};
+
+struct spraid_create_sq {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __u32 rsvd1[5];
+ __le64 prp1;
+ __u64 rsvd8;
+ __le16 sqid;
+ __le16 qsize;
+ __le16 sq_flags;
+ __le16 cqid;
+ __u32 rsvd12[4];
+};
+
+struct spraid_delete_queue {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __u32 rsvd1[9];
+ __le16 qid;
+ __u16 rsvd10;
+ __u32 rsvd11[5];
+};
+
+struct spraid_get_info {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ __u32 rsvd2[4];
+ union spraid_data_ptr dptr;
+ __u8 type;
+ __u8 rsvd10[3];
+ __le32 cdw11;
+ __u32 rsvd12[4];
+};
+
+struct spraid_usr_cmd {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ union {
+ struct {
+ __le16 subopcode;
+ __le16 rsvd1;
+ } info_0;
+ __le32 cdw2;
+ };
+ union {
+ struct {
+ __le16 data_len;
+ __le16 param_len;
+ } info_1;
+ __le32 cdw3;
+ };
+ __u64 metadata;
+ union spraid_data_ptr dptr;
+ __le32 cdw10;
+ __le32 cdw11;
+ __le32 cdw12;
+ __le32 cdw13;
+ __le32 cdw14;
+ __le32 cdw15;
+};
+
+enum {
+ SPRAID_CMD_FLAG_SGL_METABUF = (1 << 6),
+ SPRAID_CMD_FLAG_SGL_METASEG = (1 << 7),
+ SPRAID_CMD_FLAG_SGL_ALL = SPRAID_CMD_FLAG_SGL_METABUF | SPRAID_CMD_FLAG_SGL_METASEG,
+};
+
+enum spraid_cmd_state {
+ SPRAID_CMD_IDLE = 0,
+ SPRAID_CMD_IN_FLIGHT = 1,
+ SPRAID_CMD_COMPLETE = 2,
+ SPRAID_CMD_TIMEOUT = 3,
+ SPRAID_CMD_TMO_COMPLETE = 4,
+};
+
+struct spraid_abort_cmd {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ __u64 rsvd2[4];
+ __le16 sqid;
+ __le16 cid;
+ __u32 rsvd11[5];
+};
+
+struct spraid_reset_cmd {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ __u64 rsvd2[4];
+ __u8 type;
+ __u8 rsvd10[3];
+ __u32 rsvd11[5];
+};
+
+struct spraid_admin_command {
+ union {
+ struct spraid_admin_common_command common;
+ struct spraid_features features;
+ struct spraid_create_cq create_cq;
+ struct spraid_create_sq create_sq;
+ struct spraid_delete_queue delete_queue;
+ struct spraid_get_info get_info;
+ struct spraid_abort_cmd abort;
+ struct spraid_reset_cmd reset;
+ struct spraid_usr_cmd usr_cmd;
+ };
+};
+
+struct spraid_ioq_common_command {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ __le16 sense_len;
+ __u8 cdb_len;
+ __u8 rsvd2;
+ __le32 cdw3[3];
+ union spraid_data_ptr dptr;
+ __le32 cdw10[6];
+ __u8 cdb[32];
+ __le64 sense_addr;
+ __le32 cdw26[6];
+};
+
+struct spraid_rw_command {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ __le16 sense_len;
+ __u8 cdb_len;
+ __u8 rsvd2;
+ __u32 rsvd3[3];
+ union spraid_data_ptr dptr;
+ __le64 slba;
+ __le16 nlb;
+ __le16 control;
+ __u32 rsvd13[3];
+ __u8 cdb[32];
+ __le64 sense_addr;
+ __u32 rsvd26[6];
+};
+
+struct spraid_scsi_nonio {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ __le16 sense_len;
+ __u8 cdb_length;
+ __u8 rsvd2;
+ __u32 rsvd3[3];
+ union spraid_data_ptr dptr;
+ __u32 rsvd10[5];
+ __le32 buffer_len;
+ __u8 cdb[32];
+ __le64 sense_addr;
+ __u32 rsvd26[6];
+};
+
+struct spraid_ioq_command {
+ union {
+ struct spraid_ioq_common_command common;
+ struct spraid_rw_command rw;
+ struct spraid_scsi_nonio scsi_nonio;
+ };
+};
+
+struct spraid_passthru_common_cmd {
+ __u8 opcode;
+ __u8 flags;
+ __u16 rsvd0;
+ __u32 nsid;
+ union {
+ struct {
+ __u16 subopcode;
+ __u16 rsvd1;
+ } info_0;
+ __u32 cdw2;
+ };
+ union {
+ struct {
+ __u16 data_len;
+ __u16 param_len;
+ } info_1;
+ __u32 cdw3;
+ };
+ __u64 metadata;
+
+ __u64 addr;
+ __u64 prp2;
+
+ __u32 cdw10;
+ __u32 cdw11;
+ __u32 cdw12;
+ __u32 cdw13;
+ __u32 cdw14;
+ __u32 cdw15;
+ __u32 timeout_ms;
+ __u32 result0;
+ __u32 result1;
+};
+
+struct spraid_ioq_passthru_cmd {
+ __u8 opcode;
+ __u8 flags;
+ __u16 rsvd0;
+ __u32 nsid;
+ union {
+ struct {
+ __u16 res_sense_len;
+ __u8 cdb_len;
+ __u8 rsvd0;
+ } info_0;
+ __u32 cdw2;
+ };
+ union {
+ struct {
+ __u16 subopcode;
+ __u16 rsvd1;
+ } info_1;
+ __u32 cdw3;
+ };
+ union {
+ struct {
+ __u16 rsvd;
+ __u16 param_len;
+ } info_2;
+ __u32 cdw4;
+ };
+ __u32 cdw5;
+ __u64 addr;
+ __u64 prp2;
+ union {
+ struct {
+ __u16 eid;
+ __u16 sid;
+ } info_3;
+ __u32 cdw10;
+ };
+ union {
+ struct {
+ __u16 did;
+ __u8 did_flag;
+ __u8 rsvd2;
+ } info_4;
+ __u32 cdw11;
+ };
+ __u32 cdw12;
+ __u32 cdw13;
+ __u32 cdw14;
+ __u32 data_len;
+ __u32 cdw16;
+ __u32 cdw17;
+ __u32 cdw18;
+ __u32 cdw19;
+ __u32 cdw20;
+ __u32 cdw21;
+ __u32 cdw22;
+ __u32 cdw23;
+ __u64 sense_addr;
+ __u32 cdw26[4];
+ __u32 timeout_ms;
+ __u32 result0;
+ __u32 result1;
+};
+
+struct spraid_bsg_request {
+ u32 msgcode;
+ u32 control;
+ union {
+ struct spraid_passthru_common_cmd admcmd;
+ struct spraid_ioq_passthru_cmd ioqcmd;
+ };
+};
+
+enum {
+ SPRAID_BSG_ADM,
+ SPRAID_BSG_IOQ,
+};
+
+struct spraid_cmd {
+ int qid;
+ int cid;
+ u32 result0;
+ u32 result1;
+ u16 status;
+ void *priv;
+ enum spraid_cmd_state state;
+ struct completion cmd_done;
+ struct list_head list;
+};
+
+struct spraid_queue {
+ struct spraid_dev *hdev;
+ spinlock_t sq_lock; /* spinlock for lock handling */
+
+ spinlock_t cq_lock ____cacheline_aligned_in_smp; /* spinlock for lock handling */
+
+ void *sq_cmds;
+
+ struct spraid_completion *cqes;
+
+ dma_addr_t sq_dma_addr;
+ dma_addr_t cq_dma_addr;
+ u32 __iomem *q_db;
+ u8 cq_phase;
+ u8 sqes;
+ u16 qid;
+ u16 sq_tail;
+ u16 cq_head;
+ u16 last_cq_head;
+ u16 q_depth;
+ s16 cq_vector;
+ void *sense;
+ dma_addr_t sense_dma_addr;
+ struct dma_pool *prp_small_pool;
+};
+
+struct spraid_iod {
+ struct spraid_queue *spraidq;
+ enum spraid_cmd_state state;
+ int npages;
+ u32 nsge;
+ u32 length;
+ bool use_sgl;
+ bool sg_drv_mgmt;
+ dma_addr_t first_dma;
+ void *sense;
+ dma_addr_t sense_dma;
+ struct scatterlist *sg;
+ struct scatterlist inline_sg[0];
+};
+
+#define SPRAID_DEV_INFO_ATTR_BOOT(attr) ((attr) & 0x01)
+#define SPRAID_DEV_INFO_ATTR_VD(attr) (((attr) & 0x02) == 0x0)
+#define SPRAID_DEV_INFO_ATTR_PT(attr) (((attr) & 0x22) == 0x02)
+#define SPRAID_DEV_INFO_ATTR_RAWDISK(attr) ((attr) & 0x20)
+
+#define SPRAID_DEV_INFO_FLAG_VALID(flag) ((flag) & 0x01)
+#define SPRAID_DEV_INFO_FLAG_CHANGE(flag) ((flag) & 0x02)
+
+struct spraid_dev_info {
+ __le32 hdid;
+ __le16 target;
+ __u8 channel;
+ __u8 lun;
+ __u8 attr;
+ __u8 flag;
+ __le16 max_io_kb;
+};
+
+#define MAX_DEV_ENTRY_PER_PAGE_4K 340
+struct spraid_dev_list {
+ __le32 dev_num;
+ __u32 rsvd0[3];
+ struct spraid_dev_info devices[MAX_DEV_ENTRY_PER_PAGE_4K];
+};
+
+struct spraid_sdev_hostdata {
+ u32 hdid;
+};
+
+#endif
diff --git a/drivers/scsi/spraid/spraid_main.c b/drivers/scsi/spraid/spraid_main.c
new file mode 100644
index 000000000000..7edce06b62a4
--- /dev/null
+++ b/drivers/scsi/spraid/spraid_main.c
@@ -0,0 +1,3266 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Ramaxel Memory Technology, Ltd */
+
+/* Ramaxel Raid SPXXX Series Linux Driver */
+
+#define pr_fmt(fmt) "spraid: " fmt
+
+#include <linux/sched/signal.h>
+#include <linux/pci.h>
+#include <linux/aer.h>
+#include <linux/module.h>
+#include <linux/ioport.h>
+#include <linux/device.h>
+#include <linux/delay.h>
+#include <linux/interrupt.h>
+#include <linux/cdev.h>
+#include <linux/sysfs.h>
+#include <linux/gfp.h>
+#include <linux/types.h>
+#include <linux/ratelimit.h>
+#include <linux/once.h>
+#include <linux/debugfs.h>
+#include <linux/io-64-nonatomic-lo-hi.h>
+#include <linux/blkdev.h>
+#include <linux/bsg-lib.h>
+#include <asm/unaligned.h>
+
+#include <scsi/scsi.h>
+#include <scsi/scsi_cmnd.h>
+#include <scsi/scsi_device.h>
+#include <scsi/scsi_host.h>
+#include <scsi/scsi_transport.h>
+#include <scsi/scsi_dbg.h>
+#include <scsi/sg.h>
+
+
+#include "spraid.h"
+
+#define SCMD_TMOUT_RAWDISK 180
+
+#define SCMD_TMOUT_VD 180
+
+static int ioq_depth_set(const char *val, const struct kernel_param *kp);
+static const struct kernel_param_ops ioq_depth_ops = {
+ .set = ioq_depth_set,
+ .get = param_get_uint,
+};
+
+static u32 io_queue_depth = 1024;
+module_param_cb(io_queue_depth, &ioq_depth_ops, &io_queue_depth, 0644);
+MODULE_PARM_DESC(io_queue_depth, "set io queue depth, should >= 2");
+
+static int small_pool_num_set(const char *val, const struct kernel_param *kp)
+{
+ u8 n = 0;
+ int ret;
+
+ ret = kstrtou8(val, 10, &n);
+ if (ret != 0)
+ return -EINVAL;
+ if (n > MAX_SMALL_POOL_NUM)
+ n = MAX_SMALL_POOL_NUM;
+ if (n < 1)
+ n = 1;
+ *((u8 *)kp->arg) = n;
+
+ return 0;
+}
+
+static const struct kernel_param_ops small_pool_num_ops = {
+ .set = small_pool_num_set,
+ .get = param_get_byte,
+};
+
+/* It was found that the spinlock of a single pool is contended
+ * heavily by multiple CPUs. So multiple pools are introduced
+ * to reduce the contention.
+ */
+static unsigned char small_pool_num = 4;
+module_param_cb(small_pool_num, &small_pool_num_ops, &small_pool_num, 0644);
+MODULE_PARM_DESC(small_pool_num, "set prp small pool num, default 4, MAX 16");
+
+static void spraid_free_queue(struct spraid_queue *spraidq);
+static void spraid_handle_aen_notice(struct spraid_dev *hdev, u32 result);
+static void spraid_handle_aen_vs(struct spraid_dev *hdev, u32 result, u32 result1);
+
+static DEFINE_IDA(spraid_instance_ida);
+
+static struct class *spraid_class;
+
+#define SPRAID_CAP_TIMEOUT_UNIT_MS (HZ / 2)
+
+static struct workqueue_struct *spraid_wq;
+
+#define SPRAID_DRV_VERSION "1.0.0.0"
+
+#define ADMIN_TIMEOUT (60 * HZ)
+#define ADMIN_ERR_TIMEOUT 32757
+
+#define SPRAID_WAIT_ABNL_CMD_TIMEOUT (3 * 2)
+
+#define SPRAID_DMA_MSK_BIT_MAX 64
+
+enum FW_STAT_CODE {
+ FW_STAT_OK = 0,
+ FW_STAT_NEED_CHECK,
+ FW_STAT_ERROR,
+ FW_STAT_EP_PCIE_ERROR,
+ FW_STAT_NAC_DMA_ERROR,
+ FW_STAT_ABORTED,
+ FW_STAT_NEED_RETRY
+};
+
+static int ioq_depth_set(const char *val, const struct kernel_param *kp)
+{
+ int n = 0;
+ int ret;
+
+ ret = kstrtoint(val, 10, &n);
+ if (ret != 0 || n < 2)
+ return -EINVAL;
+
+ return param_set_int(val, kp);
+}
+
+static int spraid_remap_bar(struct spraid_dev *hdev, u32 size)
+{
+ struct pci_dev *pdev = hdev->pdev;
+
+ if (size > pci_resource_len(pdev, 0)) {
+ dev_err(hdev->dev, "Input size[%u] exceed bar0 length[%llu]\n",
+ size, pci_resource_len(pdev, 0));
+ return -ENOMEM;
+ }
+
+ if (hdev->bar)
+ iounmap(hdev->bar);
+
+ hdev->bar = ioremap(pci_resource_start(pdev, 0), size);
+ if (!hdev->bar) {
+ dev_err(hdev->dev, "ioremap for bar0 failed\n");
+ return -ENOMEM;
+ }
+ hdev->dbs = hdev->bar + SPRAID_REG_DBS;
+
+ return 0;
+}
+
+static int spraid_dev_map(struct spraid_dev *hdev)
+{
+ struct pci_dev *pdev = hdev->pdev;
+ int ret;
+
+ ret = pci_request_mem_regions(pdev, "spraid");
+ if (ret) {
+ dev_err(hdev->dev, "fail to request memory regions\n");
+ return ret;
+ }
+
+ ret = spraid_remap_bar(hdev, SPRAID_REG_DBS + 4096);
+ if (ret) {
+ pci_release_mem_regions(pdev);
+ return ret;
+ }
+
+ return 0;
+}
+
+static void spraid_dev_unmap(struct spraid_dev *hdev)
+{
+ struct pci_dev *pdev = hdev->pdev;
+
+ if (hdev->bar) {
+ iounmap(hdev->bar);
+ hdev->bar = NULL;
+ }
+ pci_release_mem_regions(pdev);
+}
+
+static int spraid_pci_enable(struct spraid_dev *hdev)
+{
+ struct pci_dev *pdev = hdev->pdev;
+ int ret = -ENOMEM;
+ u64 maskbit = SPRAID_DMA_MSK_BIT_MAX;
+
+ if (pci_enable_device_mem(pdev)) {
+ dev_err(hdev->dev, "Enable pci device memory resources failed\n");
+ return ret;
+ }
+ pci_set_master(pdev);
+
+ if (readl(hdev->bar + SPRAID_REG_CSTS) == U32_MAX) {
+ ret = -ENODEV;
+ dev_err(hdev->dev, "Read csts register failed\n");
+ goto disable;
+ }
+
+ ret = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_ALL_TYPES);
+ if (ret < 0) {
+ dev_err(hdev->dev, "Allocate one IRQ for setup admin channel failed\n");
+ goto disable;
+ }
+
+ hdev->cap = lo_hi_readq(hdev->bar + SPRAID_REG_CAP);
+ hdev->ioq_depth = min_t(u32, SPRAID_CAP_MQES(hdev->cap) + 1, io_queue_depth);
+ hdev->db_stride = 1 << SPRAID_CAP_STRIDE(hdev->cap);
+
+ maskbit = SPRAID_CAP_DMAMASK(hdev->cap);
+ if (maskbit < 32 || maskbit > SPRAID_DMA_MSK_BIT_MAX) {
+ dev_err(hdev->dev, "err, dma mask invalid[%llu], set to default\n", maskbit);
+ maskbit = SPRAID_DMA_MSK_BIT_MAX;
+ }
+ if (dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(maskbit))) {
+ dev_err(hdev->dev, "set dma mask and coherent failed\n");
+ goto disable;
+ }
+
+ dev_info(hdev->dev, "set dma mask[%llu] success\n", maskbit);
+
+ pci_enable_pcie_error_reporting(pdev);
+ pci_save_state(pdev);
+
+ return 0;
+
+disable:
+ pci_disable_device(pdev);
+ return ret;
+}
+
+static int spraid_npages_prp(u32 size, struct spraid_dev *hdev)
+{
+ u32 nprps = DIV_ROUND_UP(size + hdev->page_size, hdev->page_size);
+
+ return DIV_ROUND_UP(PRP_ENTRY_SIZE * nprps, PAGE_SIZE - PRP_ENTRY_SIZE);
+}
+
+static int spraid_npages_sgl(u32 nseg)
+{
+ return DIV_ROUND_UP(nseg * sizeof(struct spraid_sgl_desc), PAGE_SIZE);
+}
+
+static void **spraid_iod_list(struct spraid_iod *iod)
+{
+ return (void **)(iod->inline_sg + (iod->sg_drv_mgmt ? iod->nsge : 0));
+}
+
+static u32 spraid_iod_ext_size(struct spraid_dev *hdev, u32 size, u32 nsge,
+ bool sg_drv_mgmt, bool use_sgl)
+{
+ size_t alloc_size, sg_size;
+
+ if (use_sgl)
+ alloc_size = sizeof(__le64 *) * spraid_npages_sgl(nsge);
+ else
+ alloc_size = sizeof(__le64 *) * spraid_npages_prp(size, hdev);
+
+ sg_size = sg_drv_mgmt ? (sizeof(struct scatterlist) * nsge) : 0;
+ return sg_size + alloc_size;
+}
+
+static u32 spraid_cmd_size(struct spraid_dev *hdev, bool sg_drv_mgmt, bool use_sgl)
+{
+ u32 alloc_size = spraid_iod_ext_size(hdev, SPRAID_INT_BYTES(hdev),
+ SPRAID_INT_PAGES, sg_drv_mgmt, use_sgl);
+
+ dev_info(hdev->dev, "sg_drv_mgmt: %s, use_sgl: %s, iod size: %lu, alloc_size: %u\n",
+ sg_drv_mgmt ? "true" : "false", use_sgl ? "true" : "false",
+ sizeof(struct spraid_iod), alloc_size);
+
+ return sizeof(struct spraid_iod) + alloc_size;
+}
+
+static int spraid_setup_prps(struct spraid_dev *hdev, struct spraid_iod *iod)
+{
+ struct scatterlist *sg = iod->sg;
+ u64 dma_addr = sg_dma_address(sg);
+ int dma_len = sg_dma_len(sg);
+ __le64 *prp_list, *old_prp_list;
+ u32 page_size = hdev->page_size;
+ int offset = dma_addr & (page_size - 1);
+ void **list = spraid_iod_list(iod);
+ int length = iod->length;
+ struct dma_pool *pool;
+ dma_addr_t prp_dma;
+ int nprps, i;
+
+ length -= (page_size - offset);
+ if (length <= 0) {
+ iod->first_dma = 0;
+ return 0;
+ }
+
+ dma_len -= (page_size - offset);
+ if (dma_len) {
+ dma_addr += (page_size - offset);
+ } else {
+ sg = sg_next(sg);
+ dma_addr = sg_dma_address(sg);
+ dma_len = sg_dma_len(sg);
+ }
+
+ if (length <= page_size) {
+ iod->first_dma = dma_addr;
+ return 0;
+ }
+
+ nprps = DIV_ROUND_UP(length, page_size);
+ if (nprps <= (SMALL_POOL_SIZE / PRP_ENTRY_SIZE)) {
+ pool = iod->spraidq->prp_small_pool;
+ iod->npages = 0;
+ } else {
+ pool = hdev->prp_page_pool;
+ iod->npages = 1;
+ }
+
+ prp_list = dma_pool_alloc(pool, GFP_ATOMIC, &prp_dma);
+ if (!prp_list) {
+ dev_err_ratelimited(hdev->dev, "Allocate first prp_list memory failed\n");
+ iod->first_dma = dma_addr;
+ iod->npages = -1;
+ return -ENOMEM;
+ }
+ list[0] = prp_list;
+ iod->first_dma = prp_dma;
+ i = 0;
+ for (;;) {
+ if (i == page_size / PRP_ENTRY_SIZE) {
+ old_prp_list = prp_list;
+
+ prp_list = dma_pool_alloc(pool, GFP_ATOMIC, &prp_dma);
+ if (!prp_list) {
+ dev_err_ratelimited(hdev->dev, "Allocate %dth prp_list memory failed\n",
+ iod->npages + 1);
+ return -ENOMEM;
+ }
+ list[iod->npages++] = prp_list;
+ prp_list[0] = old_prp_list[i - 1];
+ old_prp_list[i - 1] = cpu_to_le64(prp_dma);
+ i = 1;
+ }
+ prp_list[i++] = cpu_to_le64(dma_addr);
+ dma_len -= page_size;
+ dma_addr += page_size;
+ length -= page_size;
+ if (length <= 0)
+ break;
+ if (dma_len > 0)
+ continue;
+ if (unlikely(dma_len < 0))
+ goto bad_sgl;
+ sg = sg_next(sg);
+ dma_addr = sg_dma_address(sg);
+ dma_len = sg_dma_len(sg);
+ }
+
+ return 0;
+
+bad_sgl:
+ dev_err(hdev->dev, "Setup prps, invalid SGL for payload: %d nents: %d\n",
+ iod->length, iod->nsge);
+ return -EIO;
+}
+
+#define SGES_PER_PAGE (PAGE_SIZE / sizeof(struct spraid_sgl_desc))
+
+static void spraid_submit_cmd(struct spraid_queue *spraidq, const void *cmd)
+{
+ u32 sqes = SQE_SIZE(spraidq->qid);
+ unsigned long flags;
+ struct spraid_admin_common_command *acd = (struct spraid_admin_common_command *)cmd;
+
+ spin_lock_irqsave(&spraidq->sq_lock, flags);
+ memcpy((spraidq->sq_cmds + sqes * spraidq->sq_tail), cmd, sqes);
+ if (++spraidq->sq_tail == spraidq->q_depth)
+ spraidq->sq_tail = 0;
+
+ writel(spraidq->sq_tail, spraidq->q_db);
+ spin_unlock_irqrestore(&spraidq->sq_lock, flags);
+
+ pr_debug("cid[%d], qid[%d], opcode[0x%x], flags[0x%x], hdid[%u]\n",
+ acd->command_id, spraidq->qid, acd->opcode, acd->flags, le32_to_cpu(acd->hdid));
+}
+
+static u32 spraid_mod64(u64 dividend, u32 divisor)
+{
+ u64 d;
+ u32 remainder;
+
+ if (!divisor)
+ pr_err("DIVISOR is zero, in div fn\n");
+
+ d = dividend;
+ remainder = do_div(d, divisor);
+ return remainder;
+}
+
+static inline bool spraid_is_rw_scmd(struct scsi_cmnd *scmd)
+{
+ switch (scmd->cmnd[0]) {
+ case READ_6:
+ case READ_10:
+ case READ_12:
+ case READ_16:
+ case READ_32:
+ case WRITE_6:
+ case WRITE_10:
+ case WRITE_12:
+ case WRITE_16:
+ case WRITE_32:
+ return true;
+ default:
+ return false;
+ }
+}
+
+static bool spraid_is_prp(struct spraid_dev *hdev, struct scsi_cmnd *scmd, u32 nsge)
+{
+ struct scatterlist *sg = scsi_sglist(scmd);
+ u32 page_size = hdev->page_size;
+ bool is_prp = true;
+ int i = 0;
+
+ scsi_for_each_sg(scmd, sg, nsge, i) {
+ if (i != 0 && i != nsge - 1) {
+ if (spraid_mod64(sg_dma_len(sg), page_size) ||
+ spraid_mod64(sg_dma_address(sg), page_size)) {
+ is_prp = false;
+ break;
+ }
+ }
+
+ if (nsge > 1 && i == 0) {
+ if ((spraid_mod64((sg_dma_address(sg) + sg_dma_len(sg)), page_size))) {
+ is_prp = false;
+ break;
+ }
+ }
+
+ if (nsge > 1 && i == (nsge - 1)) {
+ if (spraid_mod64(sg_dma_address(sg), page_size)) {
+ is_prp = false;
+ break;
+ }
+ }
+ }
+
+ return is_prp;
+}
+
+enum {
+ SPRAID_SGL_FMT_DATA_DESC = 0x00,
+ SPRAID_SGL_FMT_SEG_DESC = 0x02,
+ SPRAID_SGL_FMT_LAST_SEG_DESC = 0x03,
+ SPRAID_KEY_SGL_FMT_DATA_DESC = 0x04,
+ SPRAID_TRANSPORT_SGL_DATA_DESC = 0x05
+};
+
+static void spraid_sgl_set_data(struct spraid_sgl_desc *sge, struct scatterlist *sg)
+{
+ sge->addr = cpu_to_le64(sg_dma_address(sg));
+ sge->length = cpu_to_le32(sg_dma_len(sg));
+ sge->type = SPRAID_SGL_FMT_DATA_DESC << 4;
+}
+
+static void spraid_sgl_set_seg(struct spraid_sgl_desc *sge, dma_addr_t dma_addr, int entries)
+{
+ sge->addr = cpu_to_le64(dma_addr);
+ if (entries <= SGES_PER_PAGE) {
+ sge->length = cpu_to_le32(entries * sizeof(*sge));
+ sge->type = SPRAID_SGL_FMT_LAST_SEG_DESC << 4;
+ } else {
+ sge->length = cpu_to_le32(PAGE_SIZE);
+ sge->type = SPRAID_SGL_FMT_SEG_DESC << 4;
+ }
+}
+
+static int spraid_setup_ioq_cmd_sgl(struct spraid_dev *hdev,
+ struct scsi_cmnd *scmd, struct spraid_ioq_command *ioq_cmd,
+ struct spraid_iod *iod)
+{
+ struct spraid_sgl_desc *sg_list, *link, *old_sg_list;
+ struct scatterlist *sg = scsi_sglist(scmd);
+ void **list = spraid_iod_list(iod);
+ struct dma_pool *pool;
+ int nsge = iod->nsge;
+ dma_addr_t sgl_dma;
+ int i = 0;
+
+ ioq_cmd->common.flags |= SPRAID_CMD_FLAG_SGL_METABUF;
+
+ if (nsge == 1) {
+ spraid_sgl_set_data(&ioq_cmd->common.dptr.sgl, sg);
+ return 0;
+ }
+
+ if (nsge <= (SMALL_POOL_SIZE / sizeof(struct spraid_sgl_desc))) {
+ pool = iod->spraidq->prp_small_pool;
+ iod->npages = 0;
+ } else {
+ pool = hdev->prp_page_pool;
+ iod->npages = 1;
+ }
+
+ sg_list = dma_pool_alloc(pool, GFP_ATOMIC, &sgl_dma);
+ if (!sg_list) {
+ dev_err_ratelimited(hdev->dev, "Allocate first sgl_list failed\n");
+ iod->npages = -1;
+ return -ENOMEM;
+ }
+
+ list[0] = sg_list;
+ iod->first_dma = sgl_dma;
+ spraid_sgl_set_seg(&ioq_cmd->common.dptr.sgl, sgl_dma, nsge);
+ do {
+ if (i == SGES_PER_PAGE) {
+ old_sg_list = sg_list;
+ link = &old_sg_list[SGES_PER_PAGE - 1];
+
+ sg_list = dma_pool_alloc(pool, GFP_ATOMIC, &sgl_dma);
+ if (!sg_list) {
+ dev_err_ratelimited(hdev->dev, "Allocate %dth sgl_list failed\n",
+ iod->npages + 1);
+ return -ENOMEM;
+ }
+ list[iod->npages++] = sg_list;
+
+ i = 0;
+ memcpy(&sg_list[i++], link, sizeof(*link));
+ spraid_sgl_set_seg(link, sgl_dma, nsge);
+ }
+
+ spraid_sgl_set_data(&sg_list[i++], sg);
+ sg = sg_next(sg);
+ } while (--nsge > 0);
+
+ return 0;
+}
+
+#define SPRAID_RW_FUA BIT(14)
+
+static void spraid_setup_rw_cmd(struct spraid_dev *hdev,
+ struct spraid_rw_command *rw,
+ struct scsi_cmnd *scmd)
+{
+ u32 start_lba_lo, start_lba_hi;
+ u32 datalength = 0;
+ u16 control = 0;
+
+ start_lba_lo = 0;
+ start_lba_hi = 0;
+
+ if (scmd->sc_data_direction == DMA_TO_DEVICE) {
+ rw->opcode = SPRAID_CMD_WRITE;
+ } else if (scmd->sc_data_direction == DMA_FROM_DEVICE) {
+ rw->opcode = SPRAID_CMD_READ;
+ } else {
+ dev_err(hdev->dev, "Invalid IO for unsupported data direction: %d\n",
+ scmd->sc_data_direction);
+ WARN_ON(1);
+ }
+
+ /* 6-byte READ(0x08) or WRITE(0x0A) cdb */
+ if (scmd->cmd_len == 6) {
+ datalength = (u32)(scmd->cmnd[4] == 0 ?
+ IO_6_DEFAULT_TX_LEN : scmd->cmnd[4]);
+ start_lba_lo = (u32)get_unaligned_be24(&scmd->cmnd[1]);
+
+ start_lba_lo &= 0x1FFFFF;
+ }
+
+ /* 10-byte READ(0x28) or WRITE(0x2A) cdb */
+ else if (scmd->cmd_len == 10) {
+ datalength = (u32)get_unaligned_be16(&scmd->cmnd[7]);
+ start_lba_lo = get_unaligned_be32(&scmd->cmnd[2]);
+
+ if (scmd->cmnd[1] & FUA_MASK)
+ control |= SPRAID_RW_FUA;
+ }
+
+ /* 12-byte READ(0xA8) or WRITE(0xAA) cdb */
+ else if (scmd->cmd_len == 12) {
+ datalength = get_unaligned_be32(&scmd->cmnd[6]);
+ start_lba_lo = get_unaligned_be32(&scmd->cmnd[2]);
+
+ if (scmd->cmnd[1] & FUA_MASK)
+ control |= SPRAID_RW_FUA;
+ }
+ /* 16-byte READ(0x88) or WRITE(0x8A) cdb */
+ else if (scmd->cmd_len == 16) {
+ datalength = get_unaligned_be32(&scmd->cmnd[10]);
+ start_lba_lo = get_unaligned_be32(&scmd->cmnd[6]);
+ start_lba_hi = get_unaligned_be32(&scmd->cmnd[2]);
+
+ if (scmd->cmnd[1] & FUA_MASK)
+ control |= SPRAID_RW_FUA;
+ }
+ /* 32-byte READ(32) or WRITE(32) cdb */
+ else if (scmd->cmd_len == 32) {
+ datalength = get_unaligned_be32(&scmd->cmnd[28]);
+ start_lba_lo = get_unaligned_be32(&scmd->cmnd[16]);
+ start_lba_hi = get_unaligned_be32(&scmd->cmnd[12]);
+
+ if (scmd->cmnd[10] & FUA_MASK)
+ control |= SPRAID_RW_FUA;
+ }
+
+ if (unlikely(datalength > U16_MAX || datalength == 0)) {
+ dev_err(hdev->dev, "Invalid IO for illegal transfer data length: %u\n",
+ datalength);
+ WARN_ON(1);
+ }
+
+ rw->slba = cpu_to_le64(((u64)start_lba_hi << 32) | start_lba_lo);
+ /* 0base for nlb */
+ rw->nlb = cpu_to_le16((u16)(datalength - 1));
+ rw->control = cpu_to_le16(control);
+}
+
+static void spraid_setup_nonio_cmd(struct spraid_dev *hdev,
+ struct spraid_scsi_nonio *scsi_nonio, struct scsi_cmnd *scmd)
+{
+ scsi_nonio->buffer_len = cpu_to_le32(scsi_bufflen(scmd));
+
+ switch (scmd->sc_data_direction) {
+ case DMA_NONE:
+ scsi_nonio->opcode = SPRAID_CMD_NONIO_NONE;
+ break;
+ case DMA_TO_DEVICE:
+ scsi_nonio->opcode = SPRAID_CMD_NONIO_TODEV;
+ break;
+ case DMA_FROM_DEVICE:
+ scsi_nonio->opcode = SPRAID_CMD_NONIO_FROMDEV;
+ break;
+ default:
+ dev_err(hdev->dev, "Invalid IO for unsupported data direction: %d\n",
+ scmd->sc_data_direction);
+ WARN_ON(1);
+ }
+}
+
+static void spraid_setup_ioq_cmd(struct spraid_dev *hdev,
+ struct spraid_ioq_command *ioq_cmd, struct scsi_cmnd *scmd)
+{
+ memcpy(ioq_cmd->common.cdb, scmd->cmnd, scmd->cmd_len);
+ ioq_cmd->common.cdb_len = scmd->cmd_len;
+
+ if (spraid_is_rw_scmd(scmd))
+ spraid_setup_rw_cmd(hdev, &ioq_cmd->rw, scmd);
+ else
+ spraid_setup_nonio_cmd(hdev, &ioq_cmd->scsi_nonio, scmd);
+}
+
+static int spraid_init_iod(struct spraid_dev *hdev,
+ struct spraid_iod *iod, struct spraid_ioq_command *ioq_cmd,
+ struct scsi_cmnd *scmd)
+{
+ if (unlikely(!iod->sense)) {
+ dev_err(hdev->dev, "Allocate sense data buffer failed\n");
+ return -ENOMEM;
+ }
+ ioq_cmd->common.sense_addr = cpu_to_le64(iod->sense_dma);
+ ioq_cmd->common.sense_len = cpu_to_le16(SCSI_SENSE_BUFFERSIZE);
+
+ iod->nsge = 0;
+ iod->npages = -1;
+ iod->use_sgl = 0;
+ iod->sg_drv_mgmt = false;
+ WRITE_ONCE(iod->state, SPRAID_CMD_IDLE);
+
+ return 0;
+}
+
+static void spraid_free_iod_res(struct spraid_dev *hdev, struct spraid_iod *iod)
+{
+ const int last_prp = hdev->page_size / sizeof(__le64) - 1;
+ dma_addr_t dma_addr, next_dma_addr;
+ struct spraid_sgl_desc *sg_list;
+ __le64 *prp_list;
+ void *addr;
+ int i;
+
+ dma_addr = iod->first_dma;
+ if (iod->npages == 0)
+ dma_pool_free(iod->spraidq->prp_small_pool, spraid_iod_list(iod)[0], dma_addr);
+
+ for (i = 0; i < iod->npages; i++) {
+ addr = spraid_iod_list(iod)[i];
+
+ if (iod->use_sgl) {
+ sg_list = addr;
+ next_dma_addr =
+ le64_to_cpu((sg_list[SGES_PER_PAGE - 1]).addr);
+ } else {
+ prp_list = addr;
+ next_dma_addr = le64_to_cpu(prp_list[last_prp]);
+ }
+
+ dma_pool_free(hdev->prp_page_pool, addr, dma_addr);
+ dma_addr = next_dma_addr;
+ }
+
+ if (iod->sg_drv_mgmt && iod->sg != iod->inline_sg) {
+ iod->sg_drv_mgmt = false;
+ mempool_free(iod->sg, hdev->iod_mempool);
+ }
+
+ iod->sense = NULL;
+ iod->npages = -1;
+}
+
+static int spraid_io_map_data(struct spraid_dev *hdev, struct spraid_iod *iod,
+ struct scsi_cmnd *scmd, struct spraid_ioq_command *ioq_cmd)
+{
+ int ret;
+
+ iod->nsge = scsi_dma_map(scmd);
+
+ /* No data to DMA; it may be a non-rw SCSI command */
+ if (unlikely(iod->nsge == 0))
+ return 0;
+
+ iod->length = scsi_bufflen(scmd);
+ iod->sg = scsi_sglist(scmd);
+ iod->use_sgl = !spraid_is_prp(hdev, scmd, iod->nsge);
+
+ if (iod->use_sgl) {
+ ret = spraid_setup_ioq_cmd_sgl(hdev, scmd, ioq_cmd, iod);
+ } else {
+ ret = spraid_setup_prps(hdev, iod);
+ ioq_cmd->common.dptr.prp1 =
+ cpu_to_le64(sg_dma_address(iod->sg));
+ ioq_cmd->common.dptr.prp2 = cpu_to_le64(iod->first_dma);
+ }
+
+ if (ret)
+ scsi_dma_unmap(scmd);
+
+ return ret;
+}
+
+static void spraid_map_status(struct spraid_iod *iod, struct scsi_cmnd *scmd,
+ struct spraid_completion *cqe)
+{
+ scsi_set_resid(scmd, 0);
+
+ switch ((le16_to_cpu(cqe->status) >> 1) & 0x7f) {
+ case FW_STAT_OK:
+ set_host_byte(scmd, DID_OK);
+ break;
+ case FW_STAT_NEED_CHECK:
+ set_host_byte(scmd, DID_OK);
+ scmd->result |= le16_to_cpu(cqe->status) >> 8;
+ if (scmd->result & SAM_STAT_CHECK_CONDITION) {
+ memset(scmd->sense_buffer, 0, SCSI_SENSE_BUFFERSIZE);
+ memcpy(scmd->sense_buffer, iod->sense, SCSI_SENSE_BUFFERSIZE);
+ scmd->result = (scmd->result & 0x00ffffff) | (DRIVER_SENSE << 24);
+ }
+ break;
+ case FW_STAT_ABORTED:
+ set_host_byte(scmd, DID_ABORT);
+ break;
+ case FW_STAT_NEED_RETRY:
+ set_host_byte(scmd, DID_REQUEUE);
+ break;
+ default:
+ set_host_byte(scmd, DID_BAD_TARGET);
+ break;
+ }
+}
+
+static inline void spraid_get_tag_from_scmd(struct scsi_cmnd *scmd, u16 *qid, u16 *cid)
+{
+ u32 tag = blk_mq_unique_tag(scsi_cmd_to_rq(scmd));
+
+ *qid = blk_mq_unique_tag_to_hwq(tag) + 1;
+ *cid = blk_mq_unique_tag_to_tag(tag);
+}
+
+static int spraid_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
+{
+ struct spraid_iod *iod = scsi_cmd_priv(scmd);
+ struct spraid_dev *hdev = shost_priv(shost);
+ struct scsi_device *sdev = scmd->device;
+ struct spraid_sdev_hostdata *hostdata;
+ struct spraid_ioq_command ioq_cmd;
+ struct spraid_queue *ioq;
+ unsigned long elapsed;
+ u16 hwq, cid;
+ int ret;
+
+ if (unlikely(!scmd)) {
+ dev_err(hdev->dev, "err, scmd is null, return 0\n");
+ return 0;
+ }
+
+ if (unlikely(hdev->state != SPRAID_LIVE)) {
+ set_host_byte(scmd, DID_NO_CONNECT);
+ scsi_done(scmd);
+ dev_err(hdev->dev, "[%s] err, hdev state is not live\n", __func__);
+ return 0;
+ }
+
+ spraid_get_tag_from_scmd(scmd, &hwq, &cid);
+ hostdata = sdev->hostdata;
+ ioq = &hdev->queues[hwq];
+ memset(&ioq_cmd, 0, sizeof(ioq_cmd));
+ ioq_cmd.rw.hdid = cpu_to_le32(hostdata->hdid);
+ ioq_cmd.rw.command_id = cid;
+
+ spraid_setup_ioq_cmd(hdev, &ioq_cmd, scmd);
+
+ ret = cid * SCSI_SENSE_BUFFERSIZE;
+ iod->sense = ioq->sense + ret;
+ iod->sense_dma = ioq->sense_dma_addr + ret;
+
+ ret = spraid_init_iod(hdev, iod, &ioq_cmd, scmd);
+ if (unlikely(ret))
+ return SCSI_MLQUEUE_HOST_BUSY;
+
+ iod->spraidq = ioq;
+ ret = spraid_io_map_data(hdev, iod, scmd, &ioq_cmd);
+ if (unlikely(ret)) {
+ dev_err(hdev->dev, "spraid_io_map_data Err.\n");
+ set_host_byte(scmd, DID_ERROR);
+ scsi_done(scmd);
+ ret = 0;
+ goto deinit_iod;
+ }
+
+ WRITE_ONCE(iod->state, SPRAID_CMD_IN_FLIGHT);
+ spraid_submit_cmd(ioq, &ioq_cmd);
+ elapsed = jiffies - scmd->jiffies_at_alloc;
+ pr_debug("cid[%d], qid[%d] submit IO cost %3ld.%3ld seconds\n",
+ cid, hwq, elapsed / HZ, elapsed % HZ);
+ return 0;
+
+deinit_iod:
+ spraid_free_iod_res(hdev, iod);
+ return ret;
+}
+
+static int spraid_match_dev(struct spraid_dev *hdev, u16 idx, struct scsi_device *sdev)
+{
+ if (SPRAID_DEV_INFO_FLAG_VALID(hdev->devices[idx].flag)) {
+ if (sdev->channel == hdev->devices[idx].channel &&
+ sdev->id == le16_to_cpu(hdev->devices[idx].target) &&
+ sdev->lun < hdev->devices[idx].lun) {
+ dev_info(hdev->dev, "Match device success, channel:target:lun[%d:%d:%d]\n",
+ hdev->devices[idx].channel,
+ hdev->devices[idx].target,
+ hdev->devices[idx].lun);
+ return 1;
+ }
+ }
+
+ return 0;
+}
+
+static int spraid_slave_alloc(struct scsi_device *sdev)
+{
+ struct spraid_sdev_hostdata *hostdata;
+ struct spraid_dev *hdev;
+ u16 idx;
+
+ hdev = shost_priv(sdev->host);
+ hostdata = kzalloc(sizeof(*hostdata), GFP_KERNEL);
+ if (!hostdata) {
+ dev_err(hdev->dev, "Alloc scsi host data memory failed\n");
+ return -ENOMEM;
+ }
+
+ down_read(&hdev->devices_rwsem);
+ for (idx = 0; idx < le32_to_cpu(hdev->ctrl_info->nd); idx++) {
+ if (spraid_match_dev(hdev, idx, sdev))
+ goto scan_host;
+ }
+ up_read(&hdev->devices_rwsem);
+
+ kfree(hostdata);
+ return -ENXIO;
+
+scan_host:
+ hostdata->hdid = le32_to_cpu(hdev->devices[idx].hdid);
+ sdev->hostdata = hostdata;
+ up_read(&hdev->devices_rwsem);
+ return 0;
+}
+
+static void spraid_slave_destroy(struct scsi_device *sdev)
+{
+ kfree(sdev->hostdata);
+ sdev->hostdata = NULL;
+}
+
+static int spraid_slave_configure(struct scsi_device *sdev)
+{
+ u16 idx;
+ unsigned int timeout = SCMD_TMOUT_RAWDISK * HZ;
+ struct spraid_dev *hdev = shost_priv(sdev->host);
+ struct spraid_sdev_hostdata *hostdata = sdev->hostdata;
+ u32 max_sec = sdev->host->max_sectors;
+
+ if (hostdata) {
+ idx = hostdata->hdid - 1;
+ if (sdev->channel == hdev->devices[idx].channel &&
+ sdev->id == le16_to_cpu(hdev->devices[idx].target) &&
+ sdev->lun < hdev->devices[idx].lun) {
+ if (SPRAID_DEV_INFO_ATTR_VD(hdev->devices[idx].attr))
+ timeout = SCMD_TMOUT_VD * HZ;
+ else if (SPRAID_DEV_INFO_ATTR_RAWDISK(hdev->devices[idx].attr))
+ timeout = SCMD_TMOUT_RAWDISK * HZ;
+ max_sec = le16_to_cpu(hdev->devices[idx].max_io_kb) << 1;
+ } else {
+ dev_err(hdev->dev, "[%s] err, sdev->channel:id:lun[%d:%d:%lld];"
+ "devices[%d], channel:target:lun[%d:%d:%d]\n",
+ __func__, sdev->channel, sdev->id, sdev->lun,
+ idx, hdev->devices[idx].channel,
+ hdev->devices[idx].target,
+ hdev->devices[idx].lun);
+ }
+ } else {
+ dev_err(hdev->dev, "[%s] err, sdev->hostdata is null\n", __func__);
+ }
+
+ blk_queue_rq_timeout(sdev->request_queue, timeout);
+ sdev->eh_timeout = timeout;
+
+ if ((max_sec == 0) || (max_sec > sdev->host->max_sectors))
+ max_sec = sdev->host->max_sectors;
+ blk_queue_max_hw_sectors(sdev->request_queue, max_sec);
+
+ dev_info(hdev->dev, "[%s] sdev->channel:id:lun[%d:%d:%lld], scmd_timeout[%d]s, maxsec[%d]\n",
+ __func__, sdev->channel, sdev->id, sdev->lun, timeout / HZ, max_sec);
+
+ return 0;
+}
+
+static void spraid_shost_init(struct spraid_dev *hdev)
+{
+ struct pci_dev *pdev = hdev->pdev;
+ u8 domain, bus;
+ u32 dev_func;
+
+ domain = pci_domain_nr(pdev->bus);
+ bus = pdev->bus->number;
+ dev_func = pdev->devfn;
+
+ hdev->shost->nr_hw_queues = hdev->online_queues - 1;
+ hdev->shost->can_queue = (hdev->ioq_depth - SPRAID_PTCMDS_PERQ);
+
+ hdev->shost->sg_tablesize = le16_to_cpu(hdev->ctrl_info->max_num_sge);
+ /* 512B per sector */
+ hdev->shost->max_sectors = (1U << ((hdev->ctrl_info->mdts) * 1U) << 12) / 512;
+ hdev->shost->cmd_per_lun = MAX_CMD_PER_DEV;
+ hdev->shost->max_channel = le16_to_cpu(hdev->ctrl_info->max_channel) - 1;
+ hdev->shost->max_id = le32_to_cpu(hdev->ctrl_info->max_tgt_id);
+ hdev->shost->max_lun = le16_to_cpu(hdev->ctrl_info->max_lun);
+
+ hdev->shost->this_id = -1;
+ hdev->shost->unique_id = (domain << 16) | (bus << 8) | dev_func;
+ hdev->shost->max_cmd_len = MAX_CDB_LEN;
+ hdev->shost->hostt->cmd_size = max(spraid_cmd_size(hdev, false, true),
+ spraid_cmd_size(hdev, false, false));
+}
+
+static inline void spraid_host_deinit(struct spraid_dev *hdev)
+{
+ ida_free(&spraid_instance_ida, hdev->instance);
+}
+
+static int spraid_alloc_queue(struct spraid_dev *hdev, u16 qid, u16 depth)
+{
+ struct spraid_queue *spraidq = &hdev->queues[qid];
+ int ret = 0;
+
+ if (hdev->queue_count > qid) {
+ dev_info(hdev->dev, "[%s] warn: queue[%d] is exist\n", __func__, qid);
+ return 0;
+ }
+
+ spraidq->cqes = dma_alloc_coherent(hdev->dev, CQ_SIZE(depth),
+ &spraidq->cq_dma_addr, GFP_KERNEL | __GFP_ZERO);
+ if (!spraidq->cqes)
+ return -ENOMEM;
+
+ spraidq->sq_cmds = dma_alloc_coherent(hdev->dev, SQ_SIZE(qid, depth),
+ &spraidq->sq_dma_addr, GFP_KERNEL);
+ if (!spraidq->sq_cmds) {
+ ret = -ENOMEM;
+ goto free_cqes;
+ }
+
+ spin_lock_init(&spraidq->sq_lock);
+ spin_lock_init(&spraidq->cq_lock);
+ spraidq->hdev = hdev;
+ spraidq->q_depth = depth;
+ spraidq->qid = qid;
+ spraidq->cq_vector = -1;
+ hdev->queue_count++;
+
+ /* alloc sense buffer */
+ spraidq->sense = dma_alloc_coherent(hdev->dev, SENSE_SIZE(depth),
+ &spraidq->sense_dma_addr, GFP_KERNEL | __GFP_ZERO);
+ if (!spraidq->sense) {
+ ret = -ENOMEM;
+ goto free_sq_cmds;
+ }
+
+ return 0;
+
+free_sq_cmds:
+ dma_free_coherent(hdev->dev, SQ_SIZE(qid, depth), (void *)spraidq->sq_cmds,
+ spraidq->sq_dma_addr);
+free_cqes:
+ dma_free_coherent(hdev->dev, CQ_SIZE(depth), (void *)spraidq->cqes,
+ spraidq->cq_dma_addr);
+ return ret;
+}
+
+static int spraid_wait_ready(struct spraid_dev *hdev, u64 cap, bool enabled)
+{
+ unsigned long timeout =
+ ((SPRAID_CAP_TIMEOUT(cap) + 1) * SPRAID_CAP_TIMEOUT_UNIT_MS) + jiffies;
+ u32 bit = enabled ? SPRAID_CSTS_RDY : 0;
+
+ while ((readl(hdev->bar + SPRAID_REG_CSTS) & SPRAID_CSTS_RDY) != bit) {
+ usleep_range(1000, 2000);
+ if (fatal_signal_pending(current))
+ return -EINTR;
+
+ if (time_after(jiffies, timeout)) {
+ dev_err(hdev->dev, "Device not ready; aborting %s\n",
+ enabled ? "initialisation" : "reset");
+ return -ENODEV;
+ }
+ }
+ return 0;
+}
+
+static int spraid_shutdown_ctrl(struct spraid_dev *hdev)
+{
+ unsigned long timeout = hdev->ctrl_info->rtd3e + jiffies;
+
+ hdev->ctrl_config &= ~SPRAID_CC_SHN_MASK;
+ hdev->ctrl_config |= SPRAID_CC_SHN_NORMAL;
+ writel(hdev->ctrl_config, hdev->bar + SPRAID_REG_CC);
+
+ while ((readl(hdev->bar + SPRAID_REG_CSTS) & SPRAID_CSTS_SHST_MASK) !=
+ SPRAID_CSTS_SHST_CMPLT) {
+ msleep(100);
+ if (fatal_signal_pending(current))
+ return -EINTR;
+ if (time_after(jiffies, timeout)) {
+ dev_err(hdev->dev, "Device shutdown incomplete; abort shutdown\n");
+ return -ENODEV;
+ }
+ }
+ return 0;
+}
+
+static int spraid_disable_ctrl(struct spraid_dev *hdev)
+{
+ hdev->ctrl_config &= ~SPRAID_CC_SHN_MASK;
+ hdev->ctrl_config &= ~SPRAID_CC_ENABLE;
+ writel(hdev->ctrl_config, hdev->bar + SPRAID_REG_CC);
+
+ return spraid_wait_ready(hdev, hdev->cap, false);
+}
+
+static int spraid_enable_ctrl(struct spraid_dev *hdev)
+{
+ u64 cap = hdev->cap;
+ u32 dev_page_min = SPRAID_CAP_MPSMIN(cap) + 12;
+ u32 page_shift = PAGE_SHIFT;
+
+ if (page_shift < dev_page_min) {
+ dev_err(hdev->dev, "Minimum device page size[%u], too large for host[%u]\n",
+ 1U << dev_page_min, 1U << page_shift);
+ return -ENODEV;
+ }
+
+ page_shift = min_t(unsigned int, SPRAID_CAP_MPSMAX(cap) + 12, PAGE_SHIFT);
+ hdev->page_size = 1U << page_shift;
+
+ hdev->ctrl_config = SPRAID_CC_CSS_NVM;
+ hdev->ctrl_config |= (page_shift - 12) << SPRAID_CC_MPS_SHIFT;
+ hdev->ctrl_config |= SPRAID_CC_AMS_RR | SPRAID_CC_SHN_NONE;
+ hdev->ctrl_config |= SPRAID_CC_IOSQES | SPRAID_CC_IOCQES;
+ hdev->ctrl_config |= SPRAID_CC_ENABLE;
+ writel(hdev->ctrl_config, hdev->bar + SPRAID_REG_CC);
+
+ return spraid_wait_ready(hdev, cap, true);
+}
+
+static void spraid_init_queue(struct spraid_queue *spraidq, u16 qid)
+{
+ struct spraid_dev *hdev = spraidq->hdev;
+
+ memset((void *)spraidq->cqes, 0, CQ_SIZE(spraidq->q_depth));
+
+ spraidq->sq_tail = 0;
+ spraidq->cq_head = 0;
+ spraidq->cq_phase = 1;
+ spraidq->q_db = &hdev->dbs[qid * 2 * hdev->db_stride];
+ spraidq->prp_small_pool = hdev->prp_small_pool[qid % small_pool_num];
+ hdev->online_queues++;
+}
+
+static inline bool spraid_cqe_pending(struct spraid_queue *spraidq)
+{
+ return (le16_to_cpu(spraidq->cqes[spraidq->cq_head].status) & 1) ==
+ spraidq->cq_phase;
+}
+
+static void spraid_complete_ioq_cmnd(struct spraid_queue *ioq, struct spraid_completion *cqe)
+{
+ struct spraid_dev *hdev = ioq->hdev;
+ struct blk_mq_tags *tags;
+ struct scsi_cmnd *scmd;
+ struct spraid_iod *iod;
+ struct request *req;
+ unsigned long elapsed;
+
+ tags = hdev->shost->tag_set.tags[ioq->qid - 1];
+ req = blk_mq_tag_to_rq(tags, cqe->cmd_id);
+ if (unlikely(!req || !blk_mq_request_started(req))) {
+ dev_warn(hdev->dev, "Invalid id %d completed on queue %d\n",
+ cqe->cmd_id, ioq->qid);
+ return;
+ }
+
+ scmd = blk_mq_rq_to_pdu(req);
+ iod = scsi_cmd_priv(scmd);
+
+ elapsed = jiffies - scmd->jiffies_at_alloc;
+ pr_debug("cid[%d], qid[%d] finish IO cost %3ld.%3ld seconds\n",
+ cqe->cmd_id, ioq->qid, elapsed / HZ, elapsed % HZ);
+
+ if (cmpxchg(&iod->state, SPRAID_CMD_IN_FLIGHT, SPRAID_CMD_COMPLETE) !=
+ SPRAID_CMD_IN_FLIGHT) {
+ dev_warn(hdev->dev, "cid[%d], qid[%d] enters abnormal handler, cost %3ld.%3ld seconds\n",
+ cqe->cmd_id, ioq->qid, elapsed / HZ, elapsed % HZ);
+ WRITE_ONCE(iod->state, SPRAID_CMD_TMO_COMPLETE);
+
+ if (iod->nsge) {
+ iod->nsge = 0;
+ scsi_dma_unmap(scmd);
+ }
+ spraid_free_iod_res(hdev, iod);
+
+ return;
+ }
+
+ spraid_map_status(iod, scmd, cqe);
+ if (iod->nsge) {
+ iod->nsge = 0;
+ scsi_dma_unmap(scmd);
+ }
+ spraid_free_iod_res(hdev, iod);
+ scsi_done(scmd);
+}
+
+
+static void spraid_complete_adminq_cmnd(struct spraid_queue *adminq, struct spraid_completion *cqe)
+{
+ struct spraid_dev *hdev = adminq->hdev;
+ struct spraid_cmd *adm_cmd;
+
+ adm_cmd = hdev->adm_cmds + cqe->cmd_id;
+ if (unlikely(adm_cmd->state == SPRAID_CMD_IDLE)) {
+ dev_warn(adminq->hdev->dev, "Invalid id %d completed on queue %d\n",
+ cqe->cmd_id, le16_to_cpu(cqe->sq_id));
+ return;
+ }
+
+ adm_cmd->status = le16_to_cpu(cqe->status) >> 1;
+ adm_cmd->result0 = le32_to_cpu(cqe->result);
+ adm_cmd->result1 = le32_to_cpu(cqe->result1);
+
+ complete(&adm_cmd->cmd_done);
+}
+
+static void spraid_send_aen(struct spraid_dev *hdev, u16 cid);
+
+static void spraid_complete_aen(struct spraid_queue *spraidq, struct spraid_completion *cqe)
+{
+ struct spraid_dev *hdev = spraidq->hdev;
+ u32 result = le32_to_cpu(cqe->result);
+
+ dev_info(hdev->dev, "rcv aen, status[%x], result[%x]\n",
+ le16_to_cpu(cqe->status) >> 1, result);
+
+ spraid_send_aen(hdev, cqe->cmd_id);
+
+ if ((le16_to_cpu(cqe->status) >> 1) != SPRAID_SC_SUCCESS)
+ return;
+ switch (result & 0x7) {
+ case SPRAID_AEN_NOTICE:
+ spraid_handle_aen_notice(hdev, result);
+ break;
+ case SPRAID_AEN_VS:
+ spraid_handle_aen_vs(hdev, result, le32_to_cpu(cqe->result1));
+ break;
+ default:
+ dev_warn(hdev->dev, "Unsupported async event type: %u\n",
+ result & 0x7);
+ break;
+ }
+}
+
+static void spraid_complete_ioq_sync_cmnd(struct spraid_queue *ioq, struct spraid_completion *cqe)
+{
+ struct spraid_dev *hdev = ioq->hdev;
+ struct spraid_cmd *ptcmd;
+
+ ptcmd = hdev->ioq_ptcmds + (ioq->qid - 1) * SPRAID_PTCMDS_PERQ +
+ cqe->cmd_id - SPRAID_IO_BLK_MQ_DEPTH;
+
+ ptcmd->status = le16_to_cpu(cqe->status) >> 1;
+ ptcmd->result0 = le32_to_cpu(cqe->result);
+ ptcmd->result1 = le32_to_cpu(cqe->result1);
+
+ complete(&ptcmd->cmd_done);
+}
+
+static inline void spraid_handle_cqe(struct spraid_queue *spraidq, u16 idx)
+{
+ struct spraid_completion *cqe = &spraidq->cqes[idx];
+ struct spraid_dev *hdev = spraidq->hdev;
+
+ if (unlikely(cqe->cmd_id >= spraidq->q_depth)) {
+ dev_err(hdev->dev, "Invalid command id[%d] completed on queue %d\n",
+ cqe->cmd_id, cqe->sq_id);
+ return;
+ }
+
+ pr_debug("cid[%d], qid[%d], result[0x%x], sq_id[%d], status[0x%x]\n",
+ cqe->cmd_id, spraidq->qid, le32_to_cpu(cqe->result),
+ le16_to_cpu(cqe->sq_id), le16_to_cpu(cqe->status));
+
+ if (unlikely(spraidq->qid == 0 && cqe->cmd_id >= SPRAID_AQ_BLK_MQ_DEPTH)) {
+ spraid_complete_aen(spraidq, cqe);
+ return;
+ }
+
+ if (unlikely(spraidq->qid && cqe->cmd_id >= SPRAID_IO_BLK_MQ_DEPTH)) {
+ spraid_complete_ioq_sync_cmnd(spraidq, cqe);
+ return;
+ }
+
+ if (spraidq->qid)
+ spraid_complete_ioq_cmnd(spraidq, cqe);
+ else
+ spraid_complete_adminq_cmnd(spraidq, cqe);
+}
+
+static void spraid_complete_cqes(struct spraid_queue *spraidq, u16 start, u16 end)
+{
+ while (start != end) {
+ spraid_handle_cqe(spraidq, start);
+ if (++start == spraidq->q_depth)
+ start = 0;
+ }
+}
+
+static inline void spraid_update_cq_head(struct spraid_queue *spraidq)
+{
+ if (++spraidq->cq_head == spraidq->q_depth) {
+ spraidq->cq_head = 0;
+ spraidq->cq_phase = !spraidq->cq_phase;
+ }
+}
+
+static inline bool spraid_process_cq(struct spraid_queue *spraidq, u16 *start, u16 *end, int tag)
+{
+ bool found = false;
+
+ *start = spraidq->cq_head;
+ while (!found && spraid_cqe_pending(spraidq)) {
+ if (spraidq->cqes[spraidq->cq_head].cmd_id == tag)
+ found = true;
+ spraid_update_cq_head(spraidq);
+ }
+ *end = spraidq->cq_head;
+
+ if (*start != *end)
+ writel(spraidq->cq_head, spraidq->q_db + spraidq->hdev->db_stride);
+
+ return found;
+}
+
+static bool spraid_poll_cq(struct spraid_queue *spraidq, int cid)
+{
+ u16 start, end;
+ bool found;
+
+ if (!spraid_cqe_pending(spraidq))
+ return 0;
+
+ spin_lock_irq(&spraidq->cq_lock);
+ found = spraid_process_cq(spraidq, &start, &end, cid);
+ spin_unlock_irq(&spraidq->cq_lock);
+
+ spraid_complete_cqes(spraidq, start, end);
+ return found;
+}
+
+static irqreturn_t spraid_irq(int irq, void *data)
+{
+ struct spraid_queue *spraidq = data;
+ irqreturn_t ret = IRQ_NONE;
+ u16 start, end;
+
+ spin_lock(&spraidq->cq_lock);
+ if (spraidq->cq_head != spraidq->last_cq_head)
+ ret = IRQ_HANDLED;
+
+ spraid_process_cq(spraidq, &start, &end, -1);
+ spraidq->last_cq_head = spraidq->cq_head;
+ spin_unlock(&spraidq->cq_lock);
+
+ if (start != end) {
+ spraid_complete_cqes(spraidq, start, end);
+ ret = IRQ_HANDLED;
+ }
+ return ret;
+}
+
+static int spraid_setup_admin_queue(struct spraid_dev *hdev)
+{
+ struct spraid_queue *adminq = &hdev->queues[0];
+ u32 aqa;
+ int ret;
+
+ dev_info(hdev->dev, "[%s] start disable ctrl\n", __func__);
+
+ ret = spraid_disable_ctrl(hdev);
+ if (ret)
+ return ret;
+
+ ret = spraid_alloc_queue(hdev, 0, SPRAID_AQ_DEPTH);
+ if (ret)
+ return ret;
+
+ aqa = adminq->q_depth - 1;
+ aqa |= aqa << 16;
+ writel(aqa, hdev->bar + SPRAID_REG_AQA);
+ lo_hi_writeq(adminq->sq_dma_addr, hdev->bar + SPRAID_REG_ASQ);
+ lo_hi_writeq(adminq->cq_dma_addr, hdev->bar + SPRAID_REG_ACQ);
+
+ dev_info(hdev->dev, "[%s] start enable ctrl\n", __func__);
+
+ ret = spraid_enable_ctrl(hdev);
+ if (ret) {
+ ret = -ENODEV;
+ goto free_queue;
+ }
+
+ adminq->cq_vector = 0;
+ spraid_init_queue(adminq, 0);
+ ret = pci_request_irq(hdev->pdev, adminq->cq_vector, spraid_irq, NULL,
+ adminq, "spraid%d_q%d", hdev->instance, adminq->qid);
+
+ if (ret) {
+ adminq->cq_vector = -1;
+ hdev->online_queues--;
+ goto free_queue;
+ }
+
+ dev_info(hdev->dev, "[%s] success, queuecount:[%d], onlinequeue:[%d]\n",
+ __func__, hdev->queue_count, hdev->online_queues);
+
+ return 0;
+
+free_queue:
+ spraid_free_queue(adminq);
+ return ret;
+}
+
+static u32 spraid_bar_size(struct spraid_dev *hdev, u32 nr_ioqs)
+{
+ return (SPRAID_REG_DBS + ((nr_ioqs + 1) * 8 * hdev->db_stride));
+}
+
+static int spraid_alloc_admin_cmds(struct spraid_dev *hdev)
+{
+ int i;
+
+ INIT_LIST_HEAD(&hdev->adm_cmd_list);
+ spin_lock_init(&hdev->adm_cmd_lock);
+
+ hdev->adm_cmds = kcalloc_node(SPRAID_AQ_BLK_MQ_DEPTH, sizeof(struct spraid_cmd),
+ GFP_KERNEL, hdev->numa_node);
+
+ if (!hdev->adm_cmds) {
+ dev_err(hdev->dev, "Alloc admin cmds failed\n");
+ return -ENOMEM;
+ }
+
+ for (i = 0; i < SPRAID_AQ_BLK_MQ_DEPTH; i++) {
+ hdev->adm_cmds[i].qid = 0;
+ hdev->adm_cmds[i].cid = i;
+ list_add_tail(&(hdev->adm_cmds[i].list), &hdev->adm_cmd_list);
+ }
+
+ dev_info(hdev->dev, "Alloc admin cmds success, num[%d]\n", SPRAID_AQ_BLK_MQ_DEPTH);
+
+ return 0;
+}
+
+static void spraid_free_admin_cmds(struct spraid_dev *hdev)
+{
+ kfree(hdev->adm_cmds);
+ INIT_LIST_HEAD(&hdev->adm_cmd_list);
+}
+
+static struct spraid_cmd *spraid_get_cmd(struct spraid_dev *hdev, enum spraid_cmd_type type)
+{
+ struct spraid_cmd *cmd = NULL;
+ unsigned long flags;
+ struct list_head *head = &hdev->adm_cmd_list;
+ spinlock_t *slock = &hdev->adm_cmd_lock;
+
+ if (type == SPRAID_CMD_IOPT) {
+ head = &hdev->ioq_pt_list;
+ slock = &hdev->ioq_pt_lock;
+ }
+
+ spin_lock_irqsave(slock, flags);
+ if (list_empty(head)) {
+ spin_unlock_irqrestore(slock, flags);
+ dev_err(hdev->dev, "err, cmd[%d] list empty\n", type);
+ return NULL;
+ }
+ cmd = list_entry(head->next, struct spraid_cmd, list);
+ list_del_init(&cmd->list);
+ spin_unlock_irqrestore(slock, flags);
+
+ WRITE_ONCE(cmd->state, SPRAID_CMD_IN_FLIGHT);
+
+ return cmd;
+}
+
+static void spraid_put_cmd(struct spraid_dev *hdev, struct spraid_cmd *cmd,
+ enum spraid_cmd_type type)
+{
+ unsigned long flags;
+ struct list_head *head = &hdev->adm_cmd_list;
+ spinlock_t *slock = &hdev->adm_cmd_lock;
+
+ if (type == SPRAID_CMD_IOPT) {
+ head = &hdev->ioq_pt_list;
+ slock = &hdev->ioq_pt_lock;
+ }
+
+ spin_lock_irqsave(slock, flags);
+ WRITE_ONCE(cmd->state, SPRAID_CMD_IDLE);
+ list_add_tail(&cmd->list, head);
+ spin_unlock_irqrestore(slock, flags);
+}
+
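+/*
+ * Issue an admin command and wait for its completion. On timeout the slot
+ * is marked SPRAID_CMD_TIMEOUT and is not returned to the free list, which
+ * keeps a late completion from racing with slot reuse.
+ */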
+static int spraid_submit_admin_sync_cmd(struct spraid_dev *hdev, struct spraid_admin_command *cmd,
+ u32 *result0, u32 *result1, u32 timeout)
+{
+	struct spraid_cmd *adm_cmd = spraid_get_cmd(hdev, SPRAID_CMD_ADM);
+	int status;
+
+ if (!adm_cmd) {
+ dev_err(hdev->dev, "err, get admin cmd failed\n");
+ return -EFAULT;
+ }
+
+ timeout = timeout ? timeout : ADMIN_TIMEOUT;
+
+ init_completion(&adm_cmd->cmd_done);
+
+ cmd->common.command_id = adm_cmd->cid;
+ spraid_submit_cmd(&hdev->queues[0], cmd);
+
+ if (!wait_for_completion_timeout(&adm_cmd->cmd_done, timeout)) {
+ dev_err(hdev->dev, "[%s] cid[%d] qid[%d] timeout, opcode[0x%x] subopcode[0x%x]\n",
+ __func__, adm_cmd->cid, adm_cmd->qid, cmd->usr_cmd.opcode,
+ cmd->usr_cmd.info_0.subopcode);
+ WRITE_ONCE(adm_cmd->state, SPRAID_CMD_TIMEOUT);
+ return -EINVAL;
+ }
+
+ if (result0)
+ *result0 = adm_cmd->result0;
+ if (result1)
+ *result1 = adm_cmd->result1;
+
+	/* snapshot the status before the slot goes back to the free list */
+	status = adm_cmd->status;
+	spraid_put_cmd(hdev, adm_cmd, SPRAID_CMD_ADM);
+
+	return status;
+}
+
+static int spraid_create_cq(struct spraid_dev *hdev, u16 qid,
+ struct spraid_queue *spraidq, u16 cq_vector)
+{
+ struct spraid_admin_command admin_cmd;
+ int flags = SPRAID_QUEUE_PHYS_CONTIG | SPRAID_CQ_IRQ_ENABLED;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.create_cq.opcode = SPRAID_ADMIN_CREATE_CQ;
+ admin_cmd.create_cq.prp1 = cpu_to_le64(spraidq->cq_dma_addr);
+ admin_cmd.create_cq.cqid = cpu_to_le16(qid);
+ admin_cmd.create_cq.qsize = cpu_to_le16(spraidq->q_depth - 1);
+ admin_cmd.create_cq.cq_flags = cpu_to_le16(flags);
+ admin_cmd.create_cq.irq_vector = cpu_to_le16(cq_vector);
+
+ return spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+}
+
+static int spraid_create_sq(struct spraid_dev *hdev, u16 qid,
+ struct spraid_queue *spraidq)
+{
+ struct spraid_admin_command admin_cmd;
+ int flags = SPRAID_QUEUE_PHYS_CONTIG;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.create_sq.opcode = SPRAID_ADMIN_CREATE_SQ;
+ admin_cmd.create_sq.prp1 = cpu_to_le64(spraidq->sq_dma_addr);
+ admin_cmd.create_sq.sqid = cpu_to_le16(qid);
+ admin_cmd.create_sq.qsize = cpu_to_le16(spraidq->q_depth - 1);
+ admin_cmd.create_sq.sq_flags = cpu_to_le16(flags);
+ admin_cmd.create_sq.cqid = cpu_to_le16(qid);
+
+ return spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+}
+
+static void spraid_free_queue(struct spraid_queue *spraidq)
+{
+ struct spraid_dev *hdev = spraidq->hdev;
+
+ hdev->queue_count--;
+ dma_free_coherent(hdev->dev, CQ_SIZE(spraidq->q_depth),
+ (void *)spraidq->cqes, spraidq->cq_dma_addr);
+ dma_free_coherent(hdev->dev, SQ_SIZE(spraidq->qid, spraidq->q_depth),
+ spraidq->sq_cmds, spraidq->sq_dma_addr);
+ dma_free_coherent(hdev->dev, SENSE_SIZE(spraidq->q_depth),
+ spraidq->sense, spraidq->sense_dma_addr);
+}
+
+static void spraid_free_admin_queue(struct spraid_dev *hdev)
+{
+ spraid_free_queue(&hdev->queues[0]);
+}
+
+static void spraid_free_io_queues(struct spraid_dev *hdev)
+{
+ int i;
+
+ for (i = hdev->queue_count - 1; i >= 1; i--)
+ spraid_free_queue(&hdev->queues[i]);
+}
+
+static int spraid_delete_queue(struct spraid_dev *hdev, u8 op, u16 id)
+{
+ struct spraid_admin_command admin_cmd;
+ int ret;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.delete_queue.opcode = op;
+ admin_cmd.delete_queue.qid = cpu_to_le16(id);
+
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+
+ if (ret)
+ dev_err(hdev->dev, "Delete %s:[%d] failed\n",
+ (op == SPRAID_ADMIN_DELETE_CQ) ? "cq" : "sq", id);
+
+ return ret;
+}
+
+static int spraid_delete_cq(struct spraid_dev *hdev, u16 cqid)
+{
+ return spraid_delete_queue(hdev, SPRAID_ADMIN_DELETE_CQ, cqid);
+}
+
+static int spraid_delete_sq(struct spraid_dev *hdev, u16 sqid)
+{
+ return spraid_delete_queue(hdev, SPRAID_ADMIN_DELETE_SQ, sqid);
+}
+
+static int spraid_create_queue(struct spraid_queue *spraidq, u16 qid)
+{
+ struct spraid_dev *hdev = spraidq->hdev;
+ u16 cq_vector;
+ int ret;
+
+ cq_vector = (hdev->num_vecs == 1) ? 0 : qid;
+ ret = spraid_create_cq(hdev, qid, spraidq, cq_vector);
+ if (ret)
+ return ret;
+
+ ret = spraid_create_sq(hdev, qid, spraidq);
+ if (ret)
+ goto delete_cq;
+
+ spraid_init_queue(spraidq, qid);
+ spraidq->cq_vector = cq_vector;
+
+ ret = pci_request_irq(hdev->pdev, cq_vector, spraid_irq, NULL,
+ spraidq, "spraid%d_q%d", hdev->instance, qid);
+
+ if (ret) {
+ dev_err(hdev->dev, "Request queue[%d] irq failed\n", qid);
+ goto delete_sq;
+ }
+
+ return 0;
+
+delete_sq:
+ spraidq->cq_vector = -1;
+ hdev->online_queues--;
+ spraid_delete_sq(hdev, qid);
+delete_cq:
+ spraid_delete_cq(hdev, qid);
+
+ return ret;
+}
+
+static int spraid_create_io_queues(struct spraid_dev *hdev)
+{
+ u32 i, max;
+ int ret = 0;
+
+ max = min(hdev->max_qid, hdev->queue_count - 1);
+ for (i = hdev->online_queues; i <= max; i++) {
+ ret = spraid_create_queue(&hdev->queues[i], i);
+ if (ret) {
+ dev_err(hdev->dev, "Create queue[%d] failed\n", i);
+ break;
+ }
+ }
+
+ dev_info(hdev->dev, "[%s] queue_count[%d], online_queue[%d]",
+ __func__, hdev->queue_count, hdev->online_queues);
+
+ return ret >= 0 ? 0 : ret;
+}
+
+static int spraid_set_features(struct spraid_dev *hdev, u32 fid, u32 dword11, void *buffer,
+ size_t buflen, u32 *result)
+{
+ struct spraid_admin_command admin_cmd;
+ int ret;
+ u8 *data_ptr = NULL;
+ dma_addr_t data_dma = 0;
+
+ if (buffer && buflen) {
+ data_ptr = dma_alloc_coherent(hdev->dev, buflen, &data_dma, GFP_KERNEL);
+ if (!data_ptr)
+ return -ENOMEM;
+
+ memcpy(data_ptr, buffer, buflen);
+ }
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.features.opcode = SPRAID_ADMIN_SET_FEATURES;
+ admin_cmd.features.fid = cpu_to_le32(fid);
+ admin_cmd.features.dword11 = cpu_to_le32(dword11);
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
+
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, result, NULL, 0);
+
+ if (data_ptr)
+ dma_free_coherent(hdev->dev, buflen, data_ptr, data_dma);
+
+ return ret;
+}
+
+static int spraid_configure_timestamp(struct spraid_dev *hdev)
+{
+ __le64 ts;
+ int ret;
+
+ ts = cpu_to_le64(ktime_to_ms(ktime_get_real()));
+ ret = spraid_set_features(hdev, SPRAID_FEAT_TIMESTAMP, 0, &ts, sizeof(ts), NULL);
+
+ if (ret)
+ dev_err(hdev->dev, "set timestamp failed: %d\n", ret);
+ return ret;
+}
+
+static int spraid_set_queue_cnt(struct spraid_dev *hdev, u32 *cnt)
+{
+ u32 q_cnt = (*cnt - 1) | ((*cnt - 1) << 16);
+ u32 nr_ioqs, result;
+ int status;
+
+ status = spraid_set_features(hdev, SPRAID_FEAT_NUM_QUEUES, q_cnt, NULL, 0, &result);
+ if (status) {
+ dev_err(hdev->dev, "Set queue count failed, status: %d\n",
+ status);
+ return -EIO;
+ }
+
+ nr_ioqs = min(result & 0xffff, result >> 16) + 1;
+ *cnt = min(*cnt, nr_ioqs);
+ if (*cnt == 0) {
+ dev_err(hdev->dev, "Illegal queue count: zero\n");
+ return -EIO;
+ }
+ return 0;
+}
+
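+/*
+ * Negotiate the I/O queue count with the firmware, remap the BAR so the
+ * doorbell region covers every queue, then reallocate MSI-X vectors with
+ * IRQ affinity (one pre-vector stays reserved for the admin queue).
+ */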
+static int spraid_setup_io_queues(struct spraid_dev *hdev)
+{
+ struct spraid_queue *adminq = &hdev->queues[0];
+ struct pci_dev *pdev = hdev->pdev;
+ u32 nr_ioqs = num_online_cpus();
+ u32 i, size;
+ int ret;
+
+ struct irq_affinity affd = {
+ .pre_vectors = 1
+ };
+
+ ret = spraid_set_queue_cnt(hdev, &nr_ioqs);
+ if (ret < 0)
+ return ret;
+
+ size = spraid_bar_size(hdev, nr_ioqs);
+ ret = spraid_remap_bar(hdev, size);
+ if (ret)
+ return -ENOMEM;
+
+ adminq->q_db = hdev->dbs;
+
+ pci_free_irq(pdev, 0, adminq);
+ pci_free_irq_vectors(pdev);
+
+ ret = pci_alloc_irq_vectors_affinity(pdev, 1, (nr_ioqs + 1),
+ PCI_IRQ_ALL_TYPES | PCI_IRQ_AFFINITY, &affd);
+ if (ret <= 0)
+ return -EIO;
+
+ hdev->num_vecs = ret;
+
+ hdev->max_qid = max(ret - 1, 1);
+
+ ret = pci_request_irq(pdev, adminq->cq_vector, spraid_irq, NULL,
+ adminq, "spraid%d_q%d", hdev->instance, adminq->qid);
+ if (ret) {
+ dev_err(hdev->dev, "Request admin irq failed\n");
+ adminq->cq_vector = -1;
+ return ret;
+ }
+
+ for (i = hdev->queue_count; i <= hdev->max_qid; i++) {
+ ret = spraid_alloc_queue(hdev, i, hdev->ioq_depth);
+ if (ret)
+ break;
+ }
+ dev_info(hdev->dev, "[%s] max_qid: %d, queue_count: %d, online_queue: %d, ioq_depth: %d\n",
+ __func__, hdev->max_qid, hdev->queue_count, hdev->online_queues, hdev->ioq_depth);
+
+ return spraid_create_io_queues(hdev);
+}
+
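+/*
+ * Tear down the I/O queues in two passes: delete every submission queue
+ * first, then the completion queues.
+ */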
+static void spraid_delete_io_queues(struct spraid_dev *hdev)
+{
+ u16 queues = hdev->online_queues - 1;
+ u8 opcode = SPRAID_ADMIN_DELETE_SQ;
+ u16 i, pass;
+
+ if (!pci_device_is_present(hdev->pdev)) {
+		dev_err(hdev->dev, "PCI device is not present, skip disabling io queues\n");
+ return;
+ }
+
+ if (hdev->online_queues < 2) {
+		dev_err(hdev->dev, "[%s] err, io queues have already been deleted\n", __func__);
+ return;
+ }
+
+ for (pass = 0; pass < 2; pass++) {
+ for (i = queues; i > 0; i--)
+ if (spraid_delete_queue(hdev, opcode, i))
+ break;
+
+ opcode = SPRAID_ADMIN_DELETE_CQ;
+ }
+}
+
+static void spraid_remove_io_queues(struct spraid_dev *hdev)
+{
+ spraid_delete_io_queues(hdev);
+ spraid_free_io_queues(hdev);
+}
+
+static void spraid_pci_disable(struct spraid_dev *hdev)
+{
+ struct pci_dev *pdev = hdev->pdev;
+ u32 i;
+
+ for (i = 0; i < hdev->online_queues; i++)
+ pci_free_irq(pdev, hdev->queues[i].cq_vector, &hdev->queues[i]);
+ pci_free_irq_vectors(pdev);
+ if (pci_is_enabled(pdev)) {
+ pci_disable_pcie_error_reporting(pdev);
+ pci_disable_device(pdev);
+ }
+ hdev->online_queues = 0;
+}
+
+static void spraid_disable_admin_queue(struct spraid_dev *hdev, bool shutdown)
+{
+ struct spraid_queue *adminq = &hdev->queues[0];
+ u16 start, end;
+
+ if (pci_device_is_present(hdev->pdev)) {
+ if (shutdown)
+ spraid_shutdown_ctrl(hdev);
+ else
+ spraid_disable_ctrl(hdev);
+ }
+
+ if (hdev->queue_count == 0) {
+		dev_err(hdev->dev, "[%s] err, admin queue has already been deleted\n", __func__);
+ return;
+ }
+
+ spin_lock_irq(&adminq->cq_lock);
+ spraid_process_cq(adminq, &start, &end, -1);
+ spin_unlock_irq(&adminq->cq_lock);
+
+ spraid_complete_cqes(adminq, start, end);
+ spraid_free_admin_queue(hdev);
+}
+
+static int spraid_create_dma_pools(struct spraid_dev *hdev)
+{
+ int i;
+ char poolname[20] = { 0 };
+
+ hdev->prp_page_pool = dma_pool_create("prp list page", hdev->dev,
+ PAGE_SIZE, PAGE_SIZE, 0);
+
+ if (!hdev->prp_page_pool) {
+ dev_err(hdev->dev, "create prp_page_pool failed\n");
+ return -ENOMEM;
+ }
+
+ for (i = 0; i < small_pool_num; i++) {
+ sprintf(poolname, "prp_list_256_%d", i);
+ hdev->prp_small_pool[i] = dma_pool_create(poolname, hdev->dev, SMALL_POOL_SIZE,
+ SMALL_POOL_SIZE, 0);
+
+ if (!hdev->prp_small_pool[i]) {
+ dev_err(hdev->dev, "create prp_small_pool %d failed\n", i);
+ goto destroy_prp_small_pool;
+ }
+ }
+
+ return 0;
+
+destroy_prp_small_pool:
+ while (i > 0)
+ dma_pool_destroy(hdev->prp_small_pool[--i]);
+ dma_pool_destroy(hdev->prp_page_pool);
+
+ return -ENOMEM;
+}
+
+static void spraid_destroy_dma_pools(struct spraid_dev *hdev)
+{
+ int i;
+
+ for (i = 0; i < small_pool_num; i++)
+ dma_pool_destroy(hdev->prp_small_pool[i]);
+ dma_pool_destroy(hdev->prp_page_pool);
+}
+
+static int spraid_get_dev_list(struct spraid_dev *hdev, struct spraid_dev_info *devices)
+{
+ u32 nd = le32_to_cpu(hdev->ctrl_info->nd);
+ struct spraid_admin_command admin_cmd;
+ struct spraid_dev_list *list_buf;
+ dma_addr_t data_dma = 0;
+ u32 i, idx, hdid, ndev;
+ int ret = 0;
+
+ list_buf = dma_alloc_coherent(hdev->dev, PAGE_SIZE, &data_dma, GFP_KERNEL);
+ if (!list_buf)
+ return -ENOMEM;
+
+ for (idx = 0; idx < nd;) {
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.get_info.opcode = SPRAID_ADMIN_GET_INFO;
+ admin_cmd.get_info.type = SPRAID_GET_INFO_DEV_LIST;
+ admin_cmd.get_info.cdw11 = cpu_to_le32(idx);
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
+
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+
+ if (ret) {
+ dev_err(hdev->dev, "Get device list failed, nd: %u, idx: %u, ret: %d\n",
+ nd, idx, ret);
+ goto out;
+ }
+ ndev = le32_to_cpu(list_buf->dev_num);
+
+ dev_info(hdev->dev, "ndev numbers: %u\n", ndev);
+
+ for (i = 0; i < ndev; i++) {
+ hdid = le32_to_cpu(list_buf->devices[i].hdid);
+ dev_info(hdev->dev, "list_buf->devices[%d], hdid: %u target: %d, channel: %d, lun: %d, attr[0x%x]\n",
+ i, hdid, le16_to_cpu(list_buf->devices[i].target),
+ list_buf->devices[i].channel,
+ list_buf->devices[i].lun,
+ list_buf->devices[i].attr);
+ if (hdid > nd || hdid == 0) {
+ dev_err(hdev->dev, "err, hdid[%d] invalid\n", hdid);
+ continue;
+ }
+ memcpy(&devices[hdid - 1], &list_buf->devices[i],
+ sizeof(struct spraid_dev_info));
+ }
+ idx += ndev;
+
+ if (idx < MAX_DEV_ENTRY_PER_PAGE_4K)
+ break;
+ }
+
+out:
+ dma_free_coherent(hdev->dev, PAGE_SIZE, list_buf, data_dma);
+ return ret;
+}
+
+static void spraid_send_aen(struct spraid_dev *hdev, u16 cid)
+{
+ struct spraid_queue *adminq = &hdev->queues[0];
+ struct spraid_admin_command admin_cmd;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.common.opcode = SPRAID_ADMIN_ASYNC_EVENT;
+ admin_cmd.common.command_id = cid;
+
+ spraid_submit_cmd(adminq, &admin_cmd);
+ dev_info(hdev->dev, "send aen, cid[%d]\n", cid);
+}
+
+static inline void spraid_send_all_aen(struct spraid_dev *hdev)
+{
+ u16 i;
+
+ for (i = 0; i < hdev->ctrl_info->aerl; i++)
+ spraid_send_aen(hdev, i + SPRAID_AQ_BLK_MQ_DEPTH);
+}
+
+static int spraid_add_device(struct spraid_dev *hdev, struct spraid_dev_info *device)
+{
+ struct Scsi_Host *shost = hdev->shost;
+ struct scsi_device *sdev;
+
+ sdev = scsi_device_lookup(shost, device->channel, le16_to_cpu(device->target), 0);
+ if (sdev) {
+		dev_warn(hdev->dev, "Device already exists, channel: %d, target_id: %d, lun: %d\n",
+ device->channel, le16_to_cpu(device->target), 0);
+ scsi_device_put(sdev);
+ return -EEXIST;
+ }
+ scsi_add_device(shost, device->channel, le16_to_cpu(device->target), 0);
+ return 0;
+}
+
+static int spraid_rescan_device(struct spraid_dev *hdev, struct spraid_dev_info *device)
+{
+ struct Scsi_Host *shost = hdev->shost;
+ struct scsi_device *sdev;
+
+ sdev = scsi_device_lookup(shost, device->channel, le16_to_cpu(device->target), 0);
+ if (!sdev) {
+		dev_warn(hdev->dev, "Device does not exist, channel: %d, target_id: %d, lun: %d\n",
+ device->channel, le16_to_cpu(device->target), 0);
+ return -ENODEV;
+ }
+
+ scsi_rescan_device(&sdev->sdev_gendev);
+ scsi_device_put(sdev);
+ return 0;
+}
+
+static int spraid_remove_device(struct spraid_dev *hdev, struct spraid_dev_info *org_device)
+{
+ struct Scsi_Host *shost = hdev->shost;
+ struct scsi_device *sdev;
+
+ sdev = scsi_device_lookup(shost, org_device->channel, le16_to_cpu(org_device->target), 0);
+ if (!sdev) {
+		dev_warn(hdev->dev, "Device does not exist, channel: %d, target_id: %d, lun: %d\n",
+ org_device->channel, le16_to_cpu(org_device->target), 0);
+ return -ENODEV;
+ }
+
+ scsi_remove_device(sdev);
+ scsi_device_put(sdev);
+ return 0;
+}
+
+static int spraid_dev_list_init(struct spraid_dev *hdev)
+{
+ u32 nd = le32_to_cpu(hdev->ctrl_info->nd);
+ int i, ret;
+
+ hdev->devices = kzalloc_node(nd * sizeof(struct spraid_dev_info),
+ GFP_KERNEL, hdev->numa_node);
+ if (!hdev->devices)
+ return -ENOMEM;
+
+ ret = spraid_get_dev_list(hdev, hdev->devices);
+ if (ret) {
+		dev_err(hdev->dev, "Ignoring failure to get device list during initialization\n");
+ return 0;
+ }
+
+ for (i = 0; i < nd; i++) {
+ if (SPRAID_DEV_INFO_FLAG_VALID(hdev->devices[i].flag) &&
+ SPRAID_DEV_INFO_ATTR_BOOT(hdev->devices[i].attr)) {
+ spraid_add_device(hdev, &hdev->devices[i]);
+ break;
+ }
+ }
+ return 0;
+}
+
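+/*
+ * AEN-driven rescan: fetch a fresh device list from the firmware and diff
+ * it against the cached copy, adding, rescanning or removing SCSI devices
+ * according to each slot's VALID/CHANGE flags.
+ */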
+static void spraid_scan_work(struct work_struct *work)
+{
+ struct spraid_dev *hdev =
+ container_of(work, struct spraid_dev, scan_work);
+ struct spraid_dev_info *devices, *org_devices;
+ u32 nd = le32_to_cpu(hdev->ctrl_info->nd);
+ u8 flag, org_flag;
+ int i, ret;
+
+ devices = kcalloc(nd, sizeof(struct spraid_dev_info), GFP_KERNEL);
+ if (!devices)
+ return;
+ ret = spraid_get_dev_list(hdev, devices);
+ if (ret)
+ goto free_list;
+ org_devices = hdev->devices;
+ for (i = 0; i < nd; i++) {
+ org_flag = org_devices[i].flag;
+ flag = devices[i].flag;
+
+ pr_debug("i: %d, org_flag: 0x%x, flag: 0x%x\n",
+ i, org_flag, flag);
+
+ if (SPRAID_DEV_INFO_FLAG_VALID(flag)) {
+ if (!SPRAID_DEV_INFO_FLAG_VALID(org_flag)) {
+ down_write(&hdev->devices_rwsem);
+ memcpy(&org_devices[i], &devices[i],
+ sizeof(struct spraid_dev_info));
+ up_write(&hdev->devices_rwsem);
+ spraid_add_device(hdev, &devices[i]);
+ } else if (SPRAID_DEV_INFO_FLAG_CHANGE(flag)) {
+ spraid_rescan_device(hdev, &devices[i]);
+ }
+ } else {
+ if (SPRAID_DEV_INFO_FLAG_VALID(org_flag)) {
+ down_write(&hdev->devices_rwsem);
+ org_devices[i].flag &= 0xfe;
+ up_write(&hdev->devices_rwsem);
+ spraid_remove_device(hdev, &org_devices[i]);
+ }
+ }
+ }
+free_list:
+ kfree(devices);
+}
+
+static void spraid_timesyn_work(struct work_struct *work)
+{
+ struct spraid_dev *hdev =
+ container_of(work, struct spraid_dev, timesyn_work);
+
+ spraid_configure_timestamp(hdev);
+}
+
+static void spraid_queue_scan(struct spraid_dev *hdev)
+{
+ queue_work(spraid_wq, &hdev->scan_work);
+}
+
+static void spraid_handle_aen_notice(struct spraid_dev *hdev, u32 result)
+{
+ switch ((result & 0xff00) >> 8) {
+ case SPRAID_AEN_DEV_CHANGED:
+ spraid_queue_scan(hdev);
+ break;
+ case SPRAID_AEN_HOST_PROBING:
+ break;
+ default:
+ dev_warn(hdev->dev, "async event result %08x\n", result);
+ }
+}
+
+static void spraid_handle_aen_vs(struct spraid_dev *hdev, u32 result, u32 result1)
+{
+ switch ((result & 0xff00) >> 8) {
+ case SPRAID_AEN_TIMESYN:
+ queue_work(spraid_wq, &hdev->timesyn_work);
+ break;
+ default:
+ dev_warn(hdev->dev, "async event result: 0x%x\n", result);
+ }
+}
+
+static int spraid_alloc_resources(struct spraid_dev *hdev)
+{
+ int ret, nqueue;
+
+ ret = ida_alloc(&spraid_instance_ida, GFP_KERNEL);
+ if (ret < 0) {
+ dev_err(hdev->dev, "Get instance id failed\n");
+ return ret;
+ }
+ hdev->instance = ret;
+
+ hdev->ctrl_info = kzalloc_node(sizeof(*hdev->ctrl_info),
+ GFP_KERNEL, hdev->numa_node);
+ if (!hdev->ctrl_info) {
+ ret = -ENOMEM;
+ goto release_instance;
+ }
+
+ ret = spraid_create_dma_pools(hdev);
+ if (ret)
+ goto free_ctrl_info;
+ nqueue = num_possible_cpus() + 1;
+ hdev->queues = kcalloc_node(nqueue, sizeof(struct spraid_queue),
+ GFP_KERNEL, hdev->numa_node);
+ if (!hdev->queues) {
+ ret = -ENOMEM;
+ goto destroy_dma_pools;
+ }
+
+ ret = spraid_alloc_admin_cmds(hdev);
+ if (ret)
+ goto free_queues;
+
+ dev_info(hdev->dev, "[%s] queues num: %d\n", __func__, nqueue);
+
+ return 0;
+
+free_queues:
+ kfree(hdev->queues);
+destroy_dma_pools:
+ spraid_destroy_dma_pools(hdev);
+free_ctrl_info:
+ kfree(hdev->ctrl_info);
+release_instance:
+ ida_free(&spraid_instance_ida, hdev->instance);
+ return ret;
+}
+
+static void spraid_free_resources(struct spraid_dev *hdev)
+{
+ spraid_free_admin_cmds(hdev);
+ kfree(hdev->queues);
+ spraid_destroy_dma_pools(hdev);
+ kfree(hdev->ctrl_info);
+ ida_free(&spraid_instance_ida, hdev->instance);
+}
+
+static void spraid_bsg_unmap_data(struct spraid_dev *hdev, struct bsg_job *job)
+{
+ struct request *rq = blk_mq_rq_from_pdu(job);
+ struct spraid_iod *iod = job->dd_data;
+ enum dma_data_direction dma_dir = rq_data_dir(rq) ? DMA_TO_DEVICE : DMA_FROM_DEVICE;
+
+ if (iod->nsge)
+ dma_unmap_sg(hdev->dev, iod->sg, iod->nsge, dma_dir);
+
+ spraid_free_iod_res(hdev, iod);
+}
+
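+/*
+ * Map the bsg payload scatterlist for DMA and build the PRP entries the
+ * passthrough command will reference.
+ */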
+static int spraid_bsg_map_data(struct spraid_dev *hdev, struct bsg_job *job,
+ struct spraid_admin_command *cmd)
+{
+ struct request *rq = blk_mq_rq_from_pdu(job);
+ struct spraid_iod *iod = job->dd_data;
+ enum dma_data_direction dma_dir = rq_data_dir(rq) ? DMA_TO_DEVICE : DMA_FROM_DEVICE;
+ int ret = 0;
+
+ iod->sg = job->request_payload.sg_list;
+ iod->nsge = job->request_payload.sg_cnt;
+ iod->length = job->request_payload.payload_len;
+ iod->use_sgl = false;
+ iod->npages = -1;
+ iod->sg_drv_mgmt = false;
+
+ if (!iod->nsge)
+ goto out;
+
+ ret = dma_map_sg_attrs(hdev->dev, iod->sg, iod->nsge, dma_dir, DMA_ATTR_NO_WARN);
+ if (!ret)
+ goto out;
+
+ ret = spraid_setup_prps(hdev, iod);
+ if (ret)
+ goto unmap;
+
+ cmd->common.dptr.prp1 = cpu_to_le64(sg_dma_address(iod->sg));
+ cmd->common.dptr.prp2 = cpu_to_le64(iod->first_dma);
+
+ return 0;
+
+unmap:
+ dma_unmap_sg(hdev->dev, iod->sg, iod->nsge, dma_dir);
+out:
+ return ret;
+}
+
+static int spraid_get_ctrl_info(struct spraid_dev *hdev, struct spraid_ctrl_info *ctrl_info)
+{
+ struct spraid_admin_command admin_cmd;
+ u8 *data_ptr = NULL;
+ dma_addr_t data_dma = 0;
+ int ret;
+
+ data_ptr = dma_alloc_coherent(hdev->dev, PAGE_SIZE, &data_dma, GFP_KERNEL);
+ if (!data_ptr)
+ return -ENOMEM;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.get_info.opcode = SPRAID_ADMIN_GET_INFO;
+ admin_cmd.get_info.type = SPRAID_GET_INFO_CTRL;
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
+
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+ if (!ret)
+ memcpy(ctrl_info, data_ptr, sizeof(struct spraid_ctrl_info));
+
+ dma_free_coherent(hdev->dev, PAGE_SIZE, data_ptr, data_dma);
+
+ return ret;
+}
+
+static int spraid_init_ctrl_info(struct spraid_dev *hdev)
+{
+ int ret;
+
+ hdev->ctrl_info->nd = cpu_to_le32(240);
+ hdev->ctrl_info->mdts = 8;
+ hdev->ctrl_info->max_cmds = cpu_to_le16(4096);
+ hdev->ctrl_info->max_num_sge = cpu_to_le16(128);
+ hdev->ctrl_info->max_channel = cpu_to_le16(4);
+ hdev->ctrl_info->max_tgt_id = cpu_to_le32(3239);
+ hdev->ctrl_info->max_lun = cpu_to_le16(2);
+
+ ret = spraid_get_ctrl_info(hdev, hdev->ctrl_info);
+ if (ret)
+ dev_err(hdev->dev, "get controller info failed: %d\n", ret);
+
+ dev_info(hdev->dev, "[%s]nd = %d\n", __func__, hdev->ctrl_info->nd);
+ dev_info(hdev->dev, "[%s]max_cmd = %d\n", __func__, hdev->ctrl_info->max_cmds);
+ dev_info(hdev->dev, "[%s]max_channel = %d\n", __func__, hdev->ctrl_info->max_channel);
+ dev_info(hdev->dev, "[%s]max_tgt_id = %d\n", __func__, hdev->ctrl_info->max_tgt_id);
+ dev_info(hdev->dev, "[%s]max_lun = %d\n", __func__, hdev->ctrl_info->max_lun);
+ dev_info(hdev->dev, "[%s]max_num_sge = %d\n", __func__, hdev->ctrl_info->max_num_sge);
+ dev_info(hdev->dev, "[%s]lun_num_boot = %d\n", __func__, hdev->ctrl_info->lun_num_in_boot);
+ dev_info(hdev->dev, "[%s]mdts = %d\n", __func__, hdev->ctrl_info->mdts);
+ dev_info(hdev->dev, "[%s]acl = %d\n", __func__, hdev->ctrl_info->acl);
+	dev_info(hdev->dev, "[%s]aerl = %d\n", __func__, hdev->ctrl_info->aerl);
+ dev_info(hdev->dev, "[%s]card_type = %d\n", __func__, hdev->ctrl_info->card_type);
+ dev_info(hdev->dev, "[%s]rtd3e = %d\n", __func__, hdev->ctrl_info->rtd3e);
+ dev_info(hdev->dev, "[%s]sn = %s\n", __func__, hdev->ctrl_info->sn);
+ dev_info(hdev->dev, "[%s]fr = %s\n", __func__, hdev->ctrl_info->fr);
+
+ return 0;
+}
+
+#define SPRAID_MAX_ADMIN_PAYLOAD_SIZE BIT(16)
+static int spraid_alloc_iod_ext_mem_pool(struct spraid_dev *hdev)
+{
+ u16 max_sge = le16_to_cpu(hdev->ctrl_info->max_num_sge);
+ size_t alloc_size;
+
+ alloc_size = spraid_iod_ext_size(hdev, SPRAID_MAX_ADMIN_PAYLOAD_SIZE,
+ max_sge, true, false);
+ if (alloc_size > PAGE_SIZE)
+		dev_warn(hdev->dev, "sg allocation larger than one page is unexpected\n");
+ hdev->iod_mempool = mempool_create_node(1, mempool_kmalloc, mempool_kfree,
+ (void *)alloc_size, GFP_KERNEL, hdev->numa_node);
+ if (!hdev->iod_mempool) {
+ dev_err(hdev->dev, "Create iod extension memory pool failed\n");
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
+static void spraid_free_iod_ext_mem_pool(struct spraid_dev *hdev)
+{
+ mempool_destroy(hdev->iod_mempool);
+}
+
+static int spraid_user_admin_cmd(struct spraid_dev *hdev, struct bsg_job *job)
+{
+ struct spraid_bsg_request *bsg_req = (struct spraid_bsg_request *)(job->request);
+ struct spraid_passthru_common_cmd *cmd = &(bsg_req->admcmd);
+ struct spraid_admin_command admin_cmd;
+ u32 timeout = msecs_to_jiffies(cmd->timeout_ms);
+ u32 result[2] = {0};
+ int status;
+
+ if (hdev->state >= SPRAID_RESETTING) {
+ dev_err(hdev->dev, "[%s] err, host state:[%d] is not right\n",
+ __func__, hdev->state);
+ return -EBUSY;
+ }
+
+ dev_info(hdev->dev, "[%s] opcode[0x%x] subopcode[0x%x] init\n",
+ __func__, cmd->opcode, cmd->info_0.subopcode);
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.common.opcode = cmd->opcode;
+ admin_cmd.common.flags = cmd->flags;
+ admin_cmd.common.hdid = cpu_to_le32(cmd->nsid);
+ admin_cmd.common.cdw2[0] = cpu_to_le32(cmd->cdw2);
+ admin_cmd.common.cdw2[1] = cpu_to_le32(cmd->cdw3);
+ admin_cmd.common.cdw10 = cpu_to_le32(cmd->cdw10);
+ admin_cmd.common.cdw11 = cpu_to_le32(cmd->cdw11);
+ admin_cmd.common.cdw12 = cpu_to_le32(cmd->cdw12);
+ admin_cmd.common.cdw13 = cpu_to_le32(cmd->cdw13);
+ admin_cmd.common.cdw14 = cpu_to_le32(cmd->cdw14);
+ admin_cmd.common.cdw15 = cpu_to_le32(cmd->cdw15);
+
+ status = spraid_bsg_map_data(hdev, job, &admin_cmd);
+ if (status) {
+ dev_err(hdev->dev, "[%s] err, map data failed\n", __func__);
+ return status;
+ }
+
+ status = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, &result[0], &result[1], timeout);
+ if (status >= 0) {
+ job->reply_len = sizeof(result);
+ memcpy(job->reply, result, sizeof(result));
+ }
+
+ dev_info(hdev->dev, "[%s] opcode[0x%x] subopcode[0x%x], status[0x%x] result0[0x%x] result1[0x%x]\n",
+ __func__, cmd->opcode, cmd->info_0.subopcode, status, result[0], result[1]);
+
+ spraid_bsg_unmap_data(hdev, job);
+
+ return status;
+}
+
+static int spraid_alloc_ioq_ptcmds(struct spraid_dev *hdev)
+{
+ int i;
+ int ptnum = SPRAID_NR_IOQ_PTCMDS;
+
+ INIT_LIST_HEAD(&hdev->ioq_pt_list);
+ spin_lock_init(&hdev->ioq_pt_lock);
+
+ hdev->ioq_ptcmds = kcalloc_node(ptnum, sizeof(struct spraid_cmd),
+ GFP_KERNEL, hdev->numa_node);
+
+ if (!hdev->ioq_ptcmds) {
+ dev_err(hdev->dev, "Alloc ioq_ptcmds failed\n");
+ return -ENOMEM;
+ }
+
+ for (i = 0; i < ptnum; i++) {
+ hdev->ioq_ptcmds[i].qid = i / SPRAID_PTCMDS_PERQ + 1;
+ hdev->ioq_ptcmds[i].cid = i % SPRAID_PTCMDS_PERQ + SPRAID_IO_BLK_MQ_DEPTH;
+ list_add_tail(&(hdev->ioq_ptcmds[i].list), &hdev->ioq_pt_list);
+ }
+
+ dev_info(hdev->dev, "Alloc ioq_ptcmds success, ptnum[%d]\n", ptnum);
+
+ return 0;
+}
+
+static void spraid_free_ioq_ptcmds(struct spraid_dev *hdev)
+{
+ kfree(hdev->ioq_ptcmds);
+ INIT_LIST_HEAD(&hdev->ioq_pt_list);
+}
+
+static int spraid_submit_ioq_sync_cmd(struct spraid_dev *hdev, struct spraid_ioq_command *cmd,
+ u32 *result, u32 *reslen, u32 timeout)
+{
+ int ret;
+ dma_addr_t sense_dma;
+ struct spraid_queue *ioq;
+ void *sense_addr = NULL;
+ struct spraid_cmd *pt_cmd = spraid_get_cmd(hdev, SPRAID_CMD_IOPT);
+
+ if (!pt_cmd) {
+ dev_err(hdev->dev, "err, get ioq cmd failed\n");
+ return -EFAULT;
+ }
+
+ timeout = timeout ? timeout : ADMIN_TIMEOUT;
+
+ init_completion(&pt_cmd->cmd_done);
+
+ ioq = &hdev->queues[pt_cmd->qid];
+ ret = pt_cmd->cid * SCSI_SENSE_BUFFERSIZE;
+ sense_addr = ioq->sense + ret;
+ sense_dma = ioq->sense_dma_addr + ret;
+
+ cmd->common.sense_addr = cpu_to_le64(sense_dma);
+ cmd->common.sense_len = cpu_to_le16(SCSI_SENSE_BUFFERSIZE);
+ cmd->common.command_id = pt_cmd->cid;
+
+ spraid_submit_cmd(ioq, cmd);
+
+ if (!wait_for_completion_timeout(&pt_cmd->cmd_done, timeout)) {
+ dev_err(hdev->dev, "[%s] cid[%d] qid[%d] timeout, opcode[0x%x] subopcode[0x%x]\n",
+ __func__, pt_cmd->cid, pt_cmd->qid, cmd->common.opcode,
+ (le32_to_cpu(cmd->common.cdw3[0]) & 0xffff));
+ WRITE_ONCE(pt_cmd->state, SPRAID_CMD_TIMEOUT);
+ return -EINVAL;
+ }
+
+ if (result && reslen) {
+ if ((pt_cmd->status & 0x17f) == 0x101) {
+ memcpy(result, sense_addr, SCSI_SENSE_BUFFERSIZE);
+ *reslen = SCSI_SENSE_BUFFERSIZE;
+ }
+ }
+
+	/* snapshot the status before the slot goes back to the free list */
+	ret = pt_cmd->status;
+	spraid_put_cmd(hdev, pt_cmd, SPRAID_CMD_IOPT);
+
+	return ret;
+}
+
+static int spraid_user_ioq_cmd(struct spraid_dev *hdev, struct bsg_job *job)
+{
+ struct spraid_bsg_request *bsg_req = (struct spraid_bsg_request *)(job->request);
+ struct spraid_ioq_passthru_cmd *cmd = &(bsg_req->ioqcmd);
+ struct spraid_ioq_command ioq_cmd;
+ int status = 0;
+ u32 timeout = msecs_to_jiffies(cmd->timeout_ms);
+
+ if (cmd->data_len > PAGE_SIZE) {
+ dev_err(hdev->dev, "[%s] data len bigger than 4k\n", __func__);
+ return -EFAULT;
+ }
+
+ if (hdev->state != SPRAID_LIVE) {
+ dev_err(hdev->dev, "[%s] err, host state:[%d] is not live\n",
+ __func__, hdev->state);
+ return -EBUSY;
+ }
+
+ dev_info(hdev->dev, "[%s] opcode[0x%x] subopcode[0x%x] init, datalen[%d]\n",
+ __func__, cmd->opcode, cmd->info_1.subopcode, cmd->data_len);
+
+ memset(&ioq_cmd, 0, sizeof(ioq_cmd));
+ ioq_cmd.common.opcode = cmd->opcode;
+ ioq_cmd.common.flags = cmd->flags;
+ ioq_cmd.common.hdid = cpu_to_le32(cmd->nsid);
+ ioq_cmd.common.sense_len = cpu_to_le16(cmd->info_0.res_sense_len);
+ ioq_cmd.common.cdb_len = cmd->info_0.cdb_len;
+ ioq_cmd.common.rsvd2 = cmd->info_0.rsvd0;
+ ioq_cmd.common.cdw3[0] = cpu_to_le32(cmd->cdw3);
+ ioq_cmd.common.cdw3[1] = cpu_to_le32(cmd->cdw4);
+ ioq_cmd.common.cdw3[2] = cpu_to_le32(cmd->cdw5);
+
+ ioq_cmd.common.cdw10[0] = cpu_to_le32(cmd->cdw10);
+ ioq_cmd.common.cdw10[1] = cpu_to_le32(cmd->cdw11);
+ ioq_cmd.common.cdw10[2] = cpu_to_le32(cmd->cdw12);
+ ioq_cmd.common.cdw10[3] = cpu_to_le32(cmd->cdw13);
+ ioq_cmd.common.cdw10[4] = cpu_to_le32(cmd->cdw14);
+ ioq_cmd.common.cdw10[5] = cpu_to_le32(cmd->data_len);
+
+ memcpy(ioq_cmd.common.cdb, &cmd->cdw16, cmd->info_0.cdb_len);
+
+ ioq_cmd.common.cdw26[0] = cpu_to_le32(cmd->cdw26[0]);
+ ioq_cmd.common.cdw26[1] = cpu_to_le32(cmd->cdw26[1]);
+ ioq_cmd.common.cdw26[2] = cpu_to_le32(cmd->cdw26[2]);
+ ioq_cmd.common.cdw26[3] = cpu_to_le32(cmd->cdw26[3]);
+
+ status = spraid_bsg_map_data(hdev, job, (struct spraid_admin_command *)&ioq_cmd);
+ if (status) {
+ dev_err(hdev->dev, "[%s] err, map data failed\n", __func__);
+ return status;
+ }
+
+ status = spraid_submit_ioq_sync_cmd(hdev, &ioq_cmd, job->reply, &job->reply_len, timeout);
+
+ dev_info(hdev->dev, "[%s] opcode[0x%x] subopcode[0x%x], status[0x%x] reply_len[%d]\n",
+ __func__, cmd->opcode, cmd->info_1.subopcode, status, job->reply_len);
+
+ spraid_bsg_unmap_data(hdev, job);
+
+ return status;
+}
+
+static int spraid_reset_work_sync(struct spraid_dev *hdev);
+
+static bool spraid_check_scmd_completed(struct scsi_cmnd *scmd)
+{
+ struct spraid_dev *hdev = shost_priv(scmd->device->host);
+ struct spraid_iod *iod = scsi_cmd_priv(scmd);
+ struct spraid_queue *spraidq;
+ u16 hwq, cid;
+
+ spraid_get_tag_from_scmd(scmd, &hwq, &cid);
+ spraidq = &hdev->queues[hwq];
+ if (READ_ONCE(iod->state) == SPRAID_CMD_COMPLETE || spraid_poll_cq(spraidq, cid)) {
+ dev_warn(hdev->dev, "cid[%d], qid[%d] has been completed\n",
+ cid, spraidq->qid);
+ return true;
+ }
+ return false;
+}
+
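+/*
+ * blk-mq timeout hook: if the command really is overdue, flip its state to
+ * SPRAID_CMD_TIMEOUT and return BLK_EH_DONE so SCSI error handling (the
+ * abort/reset handlers below) takes over; otherwise rearm the timer.
+ */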
+static enum blk_eh_timer_return spraid_scmd_timeout(struct scsi_cmnd *scmd)
+{
+ struct spraid_iod *iod = scsi_cmd_priv(scmd);
+ unsigned int timeout = scmd->device->request_queue->rq_timeout;
+
+ if (spraid_check_scmd_completed(scmd))
+ goto out;
+
+ if (time_after(jiffies, scmd->jiffies_at_alloc + timeout)) {
+ if (cmpxchg(&iod->state, SPRAID_CMD_IN_FLIGHT, SPRAID_CMD_TIMEOUT) ==
+ SPRAID_CMD_IN_FLIGHT) {
+ return BLK_EH_DONE;
+ }
+ }
+out:
+ return BLK_EH_RESET_TIMER;
+}
+
+/* Send the abort command via the admin queue (temporary implementation) */
+static int spraid_send_abort_cmd(struct spraid_dev *hdev, u32 hdid, u16 qid, u16 cid)
+{
+ struct spraid_admin_command admin_cmd;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.abort.opcode = SPRAID_ADMIN_ABORT_CMD;
+ admin_cmd.abort.hdid = cpu_to_le32(hdid);
+ admin_cmd.abort.sqid = cpu_to_le16(qid);
+ admin_cmd.abort.cid = cpu_to_le16(cid);
+
+ return spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+}
+
+/* Send the reset command via the admin queue (temporary implementation) */
+static int spraid_send_reset_cmd(struct spraid_dev *hdev, int type, u32 hdid)
+{
+ struct spraid_admin_command admin_cmd;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.reset.opcode = SPRAID_ADMIN_RESET;
+ admin_cmd.reset.hdid = cpu_to_le32(hdid);
+ admin_cmd.reset.type = type;
+
+ return spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+}
+
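+/*
+ * Controller state machine: NEW/RESETTING may become LIVE, only LIVE may
+ * enter RESETTING, DEAD is reachable from NEW/LIVE/RESETTING, and DELETING
+ * from any other state; invalid transitions are rejected.
+ */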
+static bool spraid_change_host_state(struct spraid_dev *hdev, enum spraid_state newstate)
+{
+ unsigned long flags;
+ enum spraid_state oldstate;
+ bool change = false;
+
+ spin_lock_irqsave(&hdev->state_lock, flags);
+
+ oldstate = hdev->state;
+ switch (newstate) {
+ case SPRAID_LIVE:
+ switch (oldstate) {
+ case SPRAID_NEW:
+ case SPRAID_RESETTING:
+ change = true;
+ break;
+ default:
+ break;
+ }
+ break;
+ case SPRAID_RESETTING:
+ switch (oldstate) {
+ case SPRAID_LIVE:
+ change = true;
+ break;
+ default:
+ break;
+ }
+ break;
+ case SPRAID_DELETING:
+ if (oldstate != SPRAID_DELETING)
+ change = true;
+ break;
+ case SPRAID_DEAD:
+ switch (oldstate) {
+ case SPRAID_NEW:
+ case SPRAID_LIVE:
+ case SPRAID_RESETTING:
+ change = true;
+ break;
+ default:
+ break;
+ }
+ break;
+ default:
+ break;
+ }
+ if (change)
+ hdev->state = newstate;
+ spin_unlock_irqrestore(&hdev->state_lock, flags);
+
+ dev_info(hdev->dev, "[%s][%d]->[%d], change[%d]\n", __func__, oldstate, newstate, change);
+
+ return change;
+}
+
+static void spraid_back_fault_cqe(struct spraid_queue *ioq, struct spraid_completion *cqe)
+{
+ struct spraid_dev *hdev = ioq->hdev;
+ struct blk_mq_tags *tags;
+ struct scsi_cmnd *scmd;
+ struct spraid_iod *iod;
+ struct request *req;
+
+ tags = hdev->shost->tag_set.tags[ioq->qid - 1];
+ req = blk_mq_tag_to_rq(tags, cqe->cmd_id);
+ if (unlikely(!req || !blk_mq_request_started(req)))
+ return;
+
+ scmd = blk_mq_rq_to_pdu(req);
+ iod = scsi_cmd_priv(scmd);
+
+ set_host_byte(scmd, DID_NO_CONNECT);
+ if (iod->nsge)
+ scsi_dma_unmap(scmd);
+ spraid_free_iod_res(hdev, iod);
+ scsi_done(scmd);
+ dev_warn(hdev->dev, "Back fault CQE, cid[%d], qid[%d]\n",
+ cqe->cmd_id, ioq->qid);
+}
+
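+/*
+ * Fail every possibly outstanding request on all hardware queues by
+ * synthesizing a fault CQE per tag; started requests complete with
+ * DID_NO_CONNECT.
+ */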
+static void spraid_back_all_io(struct spraid_dev *hdev)
+{
+ int i, j;
+ struct spraid_queue *ioq;
+ struct spraid_completion cqe = { 0 };
+
+ for (i = 1; i <= hdev->shost->nr_hw_queues; i++) {
+ ioq = &hdev->queues[i];
+ for (j = 0; j < hdev->shost->can_queue; j++) {
+ cqe.cmd_id = j;
+ spraid_back_fault_cqe(ioq, &cqe);
+ }
+ }
+}
+
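+/*
+ * Quiesce the controller: disable it (or shut it down), poll readiness for
+ * up to 600 seconds, drain the admin CQ, release PCI resources and fail
+ * all outstanding I/O.
+ */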
+static void spraid_dev_disable(struct spraid_dev *hdev, bool shutdown)
+{
+ struct spraid_queue *adminq = &hdev->queues[0];
+ u16 start, end;
+ unsigned long timeout = jiffies + 600 * HZ;
+
+ if (pci_device_is_present(hdev->pdev)) {
+ if (shutdown)
+ spraid_shutdown_ctrl(hdev);
+ else
+ spraid_disable_ctrl(hdev);
+ }
+
+ while (!time_after(jiffies, timeout)) {
+ if (!pci_device_is_present(hdev->pdev)) {
+			dev_info(hdev->dev, "[%s] PCI device not present, skip waiting\n", __func__);
+ break;
+ }
+ if (!spraid_wait_ready(hdev, hdev->cap, false)) {
+ dev_info(hdev->dev, "[%s] wait ready success after reset\n", __func__);
+ break;
+ }
+ dev_info(hdev->dev, "[%s] waiting csts_rdy ready\n", __func__);
+ }
+
+ if (hdev->queue_count == 0) {
+		dev_err(hdev->dev, "[%s] warn, queues have already been deleted\n", __func__);
+ return;
+ }
+
+ spin_lock_irq(&adminq->cq_lock);
+ spraid_process_cq(adminq, &start, &end, -1);
+ spin_unlock_irq(&adminq->cq_lock);
+ spraid_complete_cqes(adminq, start, end);
+
+ spraid_pci_disable(hdev);
+
+ spraid_back_all_io(hdev);
+}
+
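+/*
+ * Full controller recovery: disable the wedged controller, re-enable PCI
+ * resources and rebuild the admin and I/O queues; any failure leaves the
+ * host in the DEAD state.
+ */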
+static void spraid_reset_work(struct work_struct *work)
+{
+ int ret;
+ struct spraid_dev *hdev = container_of(work, struct spraid_dev, reset_work);
+
+ if (hdev->state != SPRAID_RESETTING) {
+ dev_err(hdev->dev, "[%s] err, host is not reset state\n", __func__);
+ return;
+ }
+
+ dev_info(hdev->dev, "[%s] enter host reset\n", __func__);
+
+ if (hdev->ctrl_config & SPRAID_CC_ENABLE) {
+ dev_info(hdev->dev, "[%s] start dev_disable\n", __func__);
+ spraid_dev_disable(hdev, false);
+ }
+
+ ret = spraid_pci_enable(hdev);
+ if (ret)
+ goto out;
+
+ ret = spraid_setup_admin_queue(hdev);
+ if (ret)
+ goto pci_disable;
+
+ ret = spraid_setup_io_queues(hdev);
+ if (ret || hdev->online_queues <= hdev->shost->nr_hw_queues)
+ goto pci_disable;
+
+ spraid_change_host_state(hdev, SPRAID_LIVE);
+
+ spraid_send_all_aen(hdev);
+
+ return;
+
+pci_disable:
+ spraid_pci_disable(hdev);
+out:
+ spraid_change_host_state(hdev, SPRAID_DEAD);
+ dev_err(hdev->dev, "[%s] err, host reset failed\n", __func__);
+}
+
+static int spraid_reset_work_sync(struct spraid_dev *hdev)
+{
+ if (!spraid_change_host_state(hdev, SPRAID_RESETTING)) {
+ dev_info(hdev->dev, "[%s] can't change to reset state\n", __func__);
+ return -EBUSY;
+ }
+
+ if (!queue_work(spraid_wq, &hdev->reset_work)) {
+ dev_err(hdev->dev, "[%s] err, host is already in reset state\n", __func__);
+ return -EBUSY;
+ }
+
+ flush_work(&hdev->reset_work);
+ if (hdev->state != SPRAID_LIVE)
+ return -ENODEV;
+
+ return 0;
+}
+
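+/*
+ * Poll in 500ms steps, bounded by SPRAID_WAIT_ABNL_CMD_TIMEOUT, for a
+ * timed-out command to reach SPRAID_CMD_TMO_COMPLETE after an abort or
+ * reset was issued.
+ */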
+static int spraid_wait_abnl_cmd_done(struct spraid_iod *iod)
+{
+ u16 times = 0;
+
+ do {
+ if (READ_ONCE(iod->state) == SPRAID_CMD_TMO_COMPLETE)
+ break;
+ msleep(500);
+ times++;
+ } while (times <= SPRAID_WAIT_ABNL_CMD_TIMEOUT);
+
+	/* the command is still outstanding after a successful abort/reset */
+ if (times >= SPRAID_WAIT_ABNL_CMD_TIMEOUT)
+ return -ETIMEDOUT;
+
+ return 0;
+}
+
+static int spraid_abort_handler(struct scsi_cmnd *scmd)
+{
+ struct spraid_dev *hdev = shost_priv(scmd->device->host);
+ struct spraid_iod *iod = scsi_cmd_priv(scmd);
+ struct spraid_sdev_hostdata *hostdata;
+ u16 hwq, cid;
+ int ret;
+
+ scsi_print_command(scmd);
+
+ if (!spraid_wait_abnl_cmd_done(iod) || spraid_check_scmd_completed(scmd) ||
+ hdev->state != SPRAID_LIVE)
+ return SUCCESS;
+
+ hostdata = scmd->device->hostdata;
+ spraid_get_tag_from_scmd(scmd, &hwq, &cid);
+
+ dev_warn(hdev->dev, "cid[%d] qid[%d] timeout, aborting\n", cid, hwq);
+ ret = spraid_send_abort_cmd(hdev, hostdata->hdid, hwq, cid);
+ if (ret != ADMIN_ERR_TIMEOUT) {
+ ret = spraid_wait_abnl_cmd_done(iod);
+ if (ret) {
+ dev_warn(hdev->dev, "cid[%d] qid[%d] abort failed, not found\n", cid, hwq);
+ return FAILED;
+ }
+		dev_warn(hdev->dev, "cid[%d] qid[%d] abort success\n", cid, hwq);
+ return SUCCESS;
+ }
+ dev_warn(hdev->dev, "cid[%d] qid[%d] abort failed, timeout\n", cid, hwq);
+ return FAILED;
+}
+
+static int spraid_tgt_reset_handler(struct scsi_cmnd *scmd)
+{
+ struct spraid_dev *hdev = shost_priv(scmd->device->host);
+ struct spraid_iod *iod = scsi_cmd_priv(scmd);
+ struct spraid_sdev_hostdata *hostdata;
+ u16 hwq, cid;
+ int ret;
+
+ scsi_print_command(scmd);
+
+ if (!spraid_wait_abnl_cmd_done(iod) || spraid_check_scmd_completed(scmd) ||
+ hdev->state != SPRAID_LIVE)
+ return SUCCESS;
+
+ hostdata = scmd->device->hostdata;
+ spraid_get_tag_from_scmd(scmd, &hwq, &cid);
+
+ dev_warn(hdev->dev, "cid[%d] qid[%d] timeout, target reset\n", cid, hwq);
+ ret = spraid_send_reset_cmd(hdev, SPRAID_RESET_TARGET, hostdata->hdid);
+ if (ret == 0) {
+ ret = spraid_wait_abnl_cmd_done(iod);
+ if (ret) {
+			dev_warn(hdev->dev, "cid[%d] qid[%d] target reset failed, not found\n",
+ cid, hwq);
+ return FAILED;
+ }
+
+ dev_warn(hdev->dev, "cid[%d] qid[%d] target reset success\n", cid, hwq);
+ return SUCCESS;
+ }
+
+ dev_warn(hdev->dev, "cid[%d] qid[%d] ret[%d] target reset failed\n", cid, hwq, ret);
+ return FAILED;
+}
+
+static int spraid_bus_reset_handler(struct scsi_cmnd *scmd)
+{
+ struct spraid_dev *hdev = shost_priv(scmd->device->host);
+ struct spraid_iod *iod = scsi_cmd_priv(scmd);
+ struct spraid_sdev_hostdata *hostdata;
+ u16 hwq, cid;
+ int ret;
+
+ scsi_print_command(scmd);
+
+ if (!spraid_wait_abnl_cmd_done(iod) || spraid_check_scmd_completed(scmd) ||
+ hdev->state != SPRAID_LIVE)
+ return SUCCESS;
+
+ hostdata = scmd->device->hostdata;
+ spraid_get_tag_from_scmd(scmd, &hwq, &cid);
+
+ dev_warn(hdev->dev, "cid[%d] qid[%d] timeout, bus reset\n", cid, hwq);
+ ret = spraid_send_reset_cmd(hdev, SPRAID_RESET_BUS, hostdata->hdid);
+ if (ret == 0) {
+ ret = spraid_wait_abnl_cmd_done(iod);
+ if (ret) {
+ dev_warn(hdev->dev, "cid[%d] qid[%d] bus reset failed, not found\n",
+ cid, hwq);
+ return FAILED;
+ }
+
+		dev_warn(hdev->dev, "cid[%d] qid[%d] bus reset success\n", cid, hwq);
+ return SUCCESS;
+ }
+
+ dev_warn(hdev->dev, "cid[%d] qid[%d] ret[%d] bus reset failed\n", cid, hwq, ret);
+ return FAILED;
+}
+
+static int spraid_shost_reset_handler(struct scsi_cmnd *scmd)
+{
+ u16 hwq, cid;
+ struct spraid_dev *hdev = shost_priv(scmd->device->host);
+
+ scsi_print_command(scmd);
+ if (spraid_check_scmd_completed(scmd) || hdev->state != SPRAID_LIVE)
+ return SUCCESS;
+
+ spraid_get_tag_from_scmd(scmd, &hwq, &cid);
+ dev_warn(hdev->dev, "cid[%d] qid[%d] host reset\n", cid, hwq);
+
+ if (spraid_reset_work_sync(hdev)) {
+ dev_warn(hdev->dev, "cid[%d] qid[%d] host reset failed\n", cid, hwq);
+ return FAILED;
+ }
+
+ dev_warn(hdev->dev, "cid[%d] qid[%d] host reset success\n", cid, hwq);
+
+ return SUCCESS;
+}
+
+static ssize_t csts_pp_show(struct device *cdev, struct device_attribute *attr, char *buf)
+{
+ struct Scsi_Host *shost = class_to_shost(cdev);
+ struct spraid_dev *hdev = shost_priv(shost);
+ int ret = -1;
+
+ if (pci_device_is_present(hdev->pdev)) {
+ ret = (readl(hdev->bar + SPRAID_REG_CSTS) & SPRAID_CSTS_PP_MASK);
+ ret >>= SPRAID_CSTS_PP_SHIFT;
+ }
+
+ return snprintf(buf, PAGE_SIZE, "%d\n", ret);
+}
+
+static ssize_t csts_shst_show(struct device *cdev, struct device_attribute *attr, char *buf)
+{
+ struct Scsi_Host *shost = class_to_shost(cdev);
+ struct spraid_dev *hdev = shost_priv(shost);
+ int ret = -1;
+
+ if (pci_device_is_present(hdev->pdev)) {
+ ret = (readl(hdev->bar + SPRAID_REG_CSTS) & SPRAID_CSTS_SHST_MASK);
+ ret >>= SPRAID_CSTS_SHST_SHIFT;
+ }
+
+ return snprintf(buf, PAGE_SIZE, "%d\n", ret);
+}
+
+static ssize_t csts_cfs_show(struct device *cdev, struct device_attribute *attr, char *buf)
+{
+ struct Scsi_Host *shost = class_to_shost(cdev);
+ struct spraid_dev *hdev = shost_priv(shost);
+ int ret = -1;
+
+ if (pci_device_is_present(hdev->pdev)) {
+ ret = (readl(hdev->bar + SPRAID_REG_CSTS) & SPRAID_CSTS_CFS_MASK);
+ ret >>= SPRAID_CSTS_CFS_SHIFT;
+ }
+
+ return snprintf(buf, PAGE_SIZE, "%d\n", ret);
+}
+
+static ssize_t csts_rdy_show(struct device *cdev, struct device_attribute *attr, char *buf)
+{
+ struct Scsi_Host *shost = class_to_shost(cdev);
+ struct spraid_dev *hdev = shost_priv(shost);
+ int ret = -1;
+
+ if (pci_device_is_present(hdev->pdev))
+ ret = (readl(hdev->bar + SPRAID_REG_CSTS) & SPRAID_CSTS_RDY);
+
+ return snprintf(buf, PAGE_SIZE, "%d\n", ret);
+}
+
+static ssize_t fw_version_show(struct device *cdev, struct device_attribute *attr, char *buf)
+{
+ struct Scsi_Host *shost = class_to_shost(cdev);
+ struct spraid_dev *hdev = shost_priv(shost);
+
+ return snprintf(buf, PAGE_SIZE, "%s\n", hdev->ctrl_info->fr);
+}
+
+static DEVICE_ATTR_RO(csts_pp);
+static DEVICE_ATTR_RO(csts_shst);
+static DEVICE_ATTR_RO(csts_cfs);
+static DEVICE_ATTR_RO(csts_rdy);
+static DEVICE_ATTR_RO(fw_version);
+
+static struct attribute *spraid_host_attrs[] = {
+ &dev_attr_csts_pp.attr,
+ &dev_attr_csts_shst.attr,
+ &dev_attr_csts_cfs.attr,
+ &dev_attr_csts_rdy.attr,
+ &dev_attr_fw_version.attr,
+ NULL,
+};
+
+static const struct attribute_group spraid_host_attr_group = {
+ .attrs = spraid_host_attrs
+};
+
+static const struct attribute_group *spraid_host_attr_groups[] = {
+ &spraid_host_attr_group,
+ NULL,
+};
+
+static struct scsi_host_template spraid_driver_template = {
+ .module = THIS_MODULE,
+ .name = "Ramaxel Logic spraid driver",
+ .proc_name = "spraid",
+ .queuecommand = spraid_queue_command,
+ .slave_alloc = spraid_slave_alloc,
+ .slave_destroy = spraid_slave_destroy,
+ .slave_configure = spraid_slave_configure,
+ .eh_timed_out = spraid_scmd_timeout,
+ .eh_abort_handler = spraid_abort_handler,
+ .eh_target_reset_handler = spraid_tgt_reset_handler,
+ .eh_bus_reset_handler = spraid_bus_reset_handler,
+ .eh_host_reset_handler = spraid_shost_reset_handler,
+ .change_queue_depth = scsi_change_queue_depth,
+ .host_tagset = 0,
+ .this_id = -1,
+ .shost_groups = spraid_host_attr_groups,
+};
+
+static void spraid_shutdown(struct pci_dev *pdev)
+{
+ struct spraid_dev *hdev = pci_get_drvdata(pdev);
+
+ spraid_remove_io_queues(hdev);
+ spraid_disable_admin_queue(hdev, true);
+}
+
+/* bsg dispatch user command */
+static int spraid_bsg_host_dispatch(struct bsg_job *job)
+{
+ struct Scsi_Host *shost = dev_to_shost(job->dev);
+ struct spraid_dev *hdev = shost_priv(shost);
+ struct request *rq = blk_mq_rq_from_pdu(job);
+ struct spraid_bsg_request *bsg_req = job->request;
+ int ret = 0;
+
+ dev_info(hdev->dev, "[%s]msgcode[%d], msglen[%d], timeout[%d], req_nsge[%d], req_len[%d]\n",
+ __func__, bsg_req->msgcode, job->request_len, rq->timeout,
+ job->request_payload.sg_cnt, job->request_payload.payload_len);
+
+ job->reply_len = 0;
+
+ switch (bsg_req->msgcode) {
+ case SPRAID_BSG_ADM:
+ ret = spraid_user_admin_cmd(hdev, job);
+ break;
+ case SPRAID_BSG_IOQ:
+ ret = spraid_user_ioq_cmd(hdev, job);
+ break;
+ default:
+		dev_info(hdev->dev, "[%s] unsupported msgcode[%d]\n", __func__, bsg_req->msgcode);
+ break;
+ }
+
+ bsg_job_done(job, ret, 0);
+ return 0;
+}
+
+static inline void spraid_remove_bsg(struct spraid_dev *hdev)
+{
+ struct request_queue *q = hdev->bsg_queue;
+
+ if (q)
+ bsg_remove_queue(q);
+}
+
+static int spraid_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+{
+ struct spraid_dev *hdev;
+ struct Scsi_Host *shost;
+ int node, ret;
+ char bsg_name[15];
+
+ shost = scsi_host_alloc(&spraid_driver_template, sizeof(*hdev));
+ if (!shost) {
+ dev_err(&pdev->dev, "Failed to allocate scsi host\n");
+ return -ENOMEM;
+ }
+ hdev = shost_priv(shost);
+ hdev->pdev = pdev;
+ hdev->dev = get_device(&pdev->dev);
+
+ node = dev_to_node(hdev->dev);
+ if (node == NUMA_NO_NODE) {
+ node = first_memory_node;
+ set_dev_node(hdev->dev, node);
+ }
+ hdev->numa_node = node;
+ hdev->shost = shost;
+ pci_set_drvdata(pdev, hdev);
+
+ ret = spraid_dev_map(hdev);
+ if (ret)
+ goto put_dev;
+
+ init_rwsem(&hdev->devices_rwsem);
+ INIT_WORK(&hdev->scan_work, spraid_scan_work);
+ INIT_WORK(&hdev->timesyn_work, spraid_timesyn_work);
+ INIT_WORK(&hdev->reset_work, spraid_reset_work);
+ spin_lock_init(&hdev->state_lock);
+
+ ret = spraid_alloc_resources(hdev);
+ if (ret)
+ goto dev_unmap;
+
+ ret = spraid_pci_enable(hdev);
+ if (ret)
+ goto resources_free;
+
+ ret = spraid_setup_admin_queue(hdev);
+ if (ret)
+ goto pci_disable;
+
+ ret = spraid_init_ctrl_info(hdev);
+ if (ret)
+ goto disable_admin_q;
+
+ ret = spraid_alloc_iod_ext_mem_pool(hdev);
+ if (ret)
+ goto disable_admin_q;
+
+ ret = spraid_setup_io_queues(hdev);
+ if (ret)
+ goto free_iod_mempool;
+
+ spraid_shost_init(hdev);
+
+ ret = scsi_add_host(hdev->shost, hdev->dev);
+ if (ret) {
+ dev_err(hdev->dev, "Add shost to system failed, ret: %d\n",
+ ret);
+ goto remove_io_queues;
+ }
+
+ snprintf(bsg_name, sizeof(bsg_name), "spraid%d", shost->host_no);
+ hdev->bsg_queue = bsg_setup_queue(&shost->shost_gendev, bsg_name,
+ spraid_bsg_host_dispatch, NULL,
+ spraid_cmd_size(hdev, true, false));
+ if (IS_ERR(hdev->bsg_queue)) {
+ dev_err(hdev->dev, "err, setup bsg failed\n");
+ hdev->bsg_queue = NULL;
+ goto remove_io_queues;
+ }
+
+ if (hdev->online_queues == SPRAID_ADMIN_QUEUE_NUM) {
+		dev_warn(hdev->dev, "warn: only the admin queue can be used\n");
+ return 0;
+ }
+
+ hdev->state = SPRAID_LIVE;
+
+ spraid_send_all_aen(hdev);
+
+ ret = spraid_dev_list_init(hdev);
+ if (ret)
+ goto remove_bsg;
+
+ ret = spraid_configure_timestamp(hdev);
+ if (ret)
+ dev_warn(hdev->dev, "init set timestamp failed\n");
+
+ ret = spraid_alloc_ioq_ptcmds(hdev);
+ if (ret)
+ goto remove_bsg;
+
+ scsi_scan_host(hdev->shost);
+
+ return 0;
+
+remove_bsg:
+ spraid_remove_bsg(hdev);
+remove_io_queues:
+ spraid_remove_io_queues(hdev);
+free_iod_mempool:
+ spraid_free_iod_ext_mem_pool(hdev);
+disable_admin_q:
+ spraid_disable_admin_queue(hdev, false);
+pci_disable:
+ spraid_pci_disable(hdev);
+resources_free:
+ spraid_free_resources(hdev);
+dev_unmap:
+ spraid_dev_unmap(hdev);
+put_dev:
+ put_device(hdev->dev);
+ scsi_host_put(shost);
+
+ return -ENODEV;
+}
+
+static void spraid_remove(struct pci_dev *pdev)
+{
+ struct spraid_dev *hdev = pci_get_drvdata(pdev);
+ struct Scsi_Host *shost = hdev->shost;
+
+ dev_info(hdev->dev, "enter spraid remove\n");
+
+ spraid_change_host_state(hdev, SPRAID_DELETING);
+
+ if (!pci_device_is_present(pdev)) {
+ scsi_block_requests(shost);
+ spraid_back_all_io(hdev);
+ scsi_unblock_requests(shost);
+ }
+
+ flush_work(&hdev->reset_work);
+ spraid_remove_bsg(hdev);
+ scsi_remove_host(shost);
+ spraid_free_ioq_ptcmds(hdev);
+ kfree(hdev->devices);
+ spraid_remove_io_queues(hdev);
+ spraid_free_iod_ext_mem_pool(hdev);
+ spraid_disable_admin_queue(hdev, false);
+ spraid_pci_disable(hdev);
+ spraid_free_resources(hdev);
+ spraid_dev_unmap(hdev);
+ put_device(hdev->dev);
+ scsi_host_put(shost);
+
+ dev_info(hdev->dev, "exit spraid remove\n");
+}
+
+static const struct pci_device_id spraid_id_table[] = {
+ { PCI_DEVICE(PCI_VENDOR_ID_RAMAXEL_LOGIC, SPRAID_SERVER_DEVICE_HBA_DID) },
+ { PCI_DEVICE(PCI_VENDOR_ID_RAMAXEL_LOGIC, SPRAID_SERVER_DEVICE_RAID_DID) },
+ { 0, }
+};
+MODULE_DEVICE_TABLE(pci, spraid_id_table);
+
+static struct pci_driver spraid_driver = {
+ .name = "spraid",
+ .id_table = spraid_id_table,
+ .probe = spraid_probe,
+ .remove = spraid_remove,
+ .shutdown = spraid_shutdown,
+};
+
+static int __init spraid_init(void)
+{
+ int ret;
+
+ spraid_wq = alloc_workqueue("spraid-wq", WQ_UNBOUND | WQ_MEM_RECLAIM | WQ_SYSFS, 0);
+ if (!spraid_wq)
+ return -ENOMEM;
+
+ spraid_class = class_create(THIS_MODULE, "spraid");
+ if (IS_ERR(spraid_class)) {
+ ret = PTR_ERR(spraid_class);
+ goto destroy_wq;
+ }
+
+ ret = pci_register_driver(&spraid_driver);
+ if (ret < 0)
+ goto destroy_class;
+
+ return 0;
+
+destroy_class:
+ class_destroy(spraid_class);
+destroy_wq:
+ destroy_workqueue(spraid_wq);
+
+ return ret;
+}
+
+static void __exit spraid_exit(void)
+{
+ pci_unregister_driver(&spraid_driver);
+ class_destroy(spraid_class);
+ destroy_workqueue(spraid_wq);
+ ida_destroy(&spraid_instance_ida);
+}
+
+MODULE_AUTHOR("songyl(a)ramaxel.com");
+MODULE_DESCRIPTION("Ramaxel Memory Technology SPraid Driver");
+MODULE_LICENSE("GPL");
+MODULE_VERSION(SPRAID_DRV_VERSION);
+module_init(spraid_init);
+module_exit(spraid_exit);
--
2.27.0

31 Dec '21
KABI: Add KABI paddings for base structures
Alexey Gladkov (1):
Increase size of ucounts to atomic_long_t
Cui GaoSheng (1):
kabi: reserve space for net_namespace
GONG, Ruiqi (1):
kabi: reserve space for cred and user_namespace
Guan Jing (1):
KABI:reserve space for sched structures
Guo Zihua (1):
KABI: reserve space for IMA IPE
Gustavo A. R. Silva (1):
UAPI: nfsfh.h: Replace one-element array with flexible-array member
Jialin Zhang (2):
kabi: reserve space for posix clock related structure
kabi: reserve space for power management related structure
Lin Ruizhe (6):
bootparam: Add kabi_reserve in bootparam
interrupt: Add kabi_reserve in interrupt.h
irq: Add kabi_reserve in irq
irq_desc: Add kabi_reserve in irq_desc
irqdomain: Add kabi_reserve in irqdomain
msi: Add kabi_reserve in msi.h
Lu Jialin (4):
kabi: reserve space for cgroup framework related structures
kabi: reserve space for memcg related structures
kabi: reserve space for cpu cgroup and cpuset cgroup related
structures
kabi: reserve space for cgroup bpf structures
Tan Xiaojun (1):
kabi: reserve space for pci subsystem related structure
Wang Hai (6):
kabi: net: reserve space for net base subsystem related structure
kabi: net: reserve space for net can subsystem related structure
kabi: net: reserve space for net sunrpc subsystem related structure
kabi: net: reserve space for net rdma subsystem related structure
kabi: net: reserve space for net bpf subsystem related structure
kabi: net: reserve space for net netfilter subsystem related structure
Wang ShaoBo (8):
kabi: reserve space for io subsystem related structures
kabi: reserve space for kobject related structures
kabi: reserve space for struct module
kabi: reserve space for struct ptp_clock
kabi: reserve space for struct ptp_clock_info
kabi: reserve space for ptp_clock.h
kabi: reserve space for iommu.h
kabi: reserve space for fwnode.h
Xie XiuQi (6):
kabi: add kabi helper macros
kabi: add KABI_SIZE_ALIGN_CHECKS for more stringent kabi checks
kabi: enables more stringent kabi checks
kabi: add script tools to check kabi symbol
kabi: add a tool to generate the kabi reference relationship
kabi: add kABI reference checking tool
Yang Jihong (1):
kabi: reserve space for perf subsystem related structures
Yang Yingliang (2):
kabi: reserve space for struct cpu_stop_work
kabi: reserve space for struct dma_map_ops
Yongqiang Liu (1):
kabi: mm: reserve space for memory subsystem related
Yu Liao (3):
kabi: reserve space for struct worker
kabi: reserve space for time and workqueue subsystem related structure
kabi: reserve space for hrtimer related structures
Zheng Zengkai (2):
KABI: add KABI padding to cpuidle structures
KABI: add KABI padding to x86/paravirt ops structures
Zhihao Cheng (1):
kabi: Add kabi reservation for storage module
Kconfig | 7 +
arch/arm64/configs/openeuler_defconfig | 1 +
arch/x86/configs/openeuler_defconfig | 1 +
arch/x86/include/asm/paravirt_types.h | 9 +
arch/x86/include/uapi/asm/bootparam.h | 2 +
block/blk-mq-tag.h | 7 +
drivers/nvme/host/nvme.h | 6 +
drivers/pci/pci.h | 9 +
drivers/ptp/ptp_private.h | 4 +
include/linux/backing-dev-defs.h | 11 +
include/linux/bio.h | 10 +
include/linux/blk-cgroup.h | 11 +
include/linux/blk-mq.h | 30 ++
include/linux/blk_types.h | 12 +
include/linux/blkdev.h | 16 +
include/linux/bpf-cgroup.h | 18 +
include/linux/can/core.h | 4 +
include/linux/can/dev.h | 6 +
include/linux/can/rx-offload.h | 4 +
include/linux/can/skb.h | 1 +
include/linux/cgroup-defs.h | 29 ++
include/linux/cpuidle.h | 15 +
include/linux/cred.h | 12 +
include/linux/dcache.h | 9 +
include/linux/delayacct.h | 4 +
include/linux/device.h | 19 +
include/linux/device/class.h | 7 +
include/linux/device/driver.h | 6 +
include/linux/dma-map-ops.h | 7 +
include/linux/elevator.h | 15 +
include/linux/ethtool.h | 6 +
include/linux/exportfs.h | 4 +
include/linux/fs.h | 48 +++
include/linux/fsnotify_backend.h | 3 +
include/linux/fwnode.h | 7 +
include/linux/genhd.h | 14 +
include/linux/hrtimer.h | 10 +
include/linux/interrupt.h | 3 +
include/linux/iomap.h | 6 +
include/linux/iommu.h | 12 +
include/linux/ioport.h | 6 +
include/linux/ipv6.h | 6 +
include/linux/irq.h | 6 +
include/linux/irqdesc.h | 3 +-
include/linux/irqdomain.h | 2 +
include/linux/jbd2.h | 6 +
include/linux/kabi.h | 494 ++++++++++++++++++++++++-
include/linux/kernfs.h | 9 +
include/linux/kobject.h | 13 +
include/linux/lsm_hooks.h | 2 +
include/linux/memcontrol.h | 20 +
include/linux/mm.h | 6 +
include/linux/mm_types.h | 15 +
include/linux/mmu_notifier.h | 9 +
include/linux/mmzone.h | 9 +
include/linux/module.h | 4 +
include/linux/mount.h | 3 +
include/linux/msi.h | 3 +
include/linux/net.h | 6 +
include/linux/netdevice.h | 47 +++
include/linux/netfilter.h | 9 +
include/linux/netfilter/ipset/ip_set.h | 7 +
include/linux/netfilter/nfnetlink.h | 5 +
include/linux/netfilter_ipv6.h | 3 +
include/linux/ns_common.h | 4 +
include/linux/pci.h | 34 ++
include/linux/pci_hotplug.h | 18 +
include/linux/perf_event.h | 8 +
include/linux/pm.h | 14 +
include/linux/pm_domain.h | 9 +
include/linux/pm_qos.h | 7 +
include/linux/pm_wakeup.h | 4 +
include/linux/posix-clock.h | 17 +
include/linux/ptp_clock_kernel.h | 4 +
include/linux/quota.h | 7 +
include/linux/sbitmap.h | 3 +
include/linux/sched.h | 30 ++
include/linux/sched/signal.h | 6 +
include/linux/sched/topology.h | 4 +
include/linux/sched/user.h | 4 +
include/linux/skbuff.h | 6 +
include/linux/skmsg.h | 11 +
include/linux/stop_machine.h | 2 +
include/linux/sunrpc/cache.h | 4 +
include/linux/sunrpc/clnt.h | 10 +
include/linux/sunrpc/sched.h | 14 +
include/linux/sunrpc/stats.h | 5 +
include/linux/sunrpc/xprt.h | 17 +
include/linux/swap.h | 4 +
include/linux/sysfs.h | 3 +
include/linux/timer.h | 6 +
include/linux/user_namespace.h | 36 +-
include/linux/workqueue.h | 11 +
include/linux/writeback.h | 4 +
include/linux/xattr.h | 3 +
include/net/dcbnl.h | 9 +
include/net/dst.h | 10 +
include/net/dst_ops.h | 10 +
include/net/fib_rules.h | 10 +
include/net/flow.h | 10 +
include/net/genetlink.h | 10 +
include/net/inet_connection_sock.h | 5 +
include/net/ip6_fib.h | 9 +
include/net/l3mdev.h | 6 +
include/net/lwtunnel.h | 7 +
include/net/neighbour.h | 8 +
include/net/net_namespace.h | 6 +
include/net/netfilter/nf_conntrack.h | 4 +
include/net/netlink.h | 5 +
include/net/netns/can.h | 3 +
include/net/netns/ipv4.h | 8 +
include/net/netns/ipv6.h | 3 +
include/net/netns/netfilter.h | 3 +
include/net/netns/nftables.h | 3 +
include/net/netns/xfrm.h | 3 +
include/net/page_pool.h | 3 +
include/net/rtnetlink.h | 10 +
include/net/sch_generic.h | 7 +
include/net/sock.h | 19 +
include/net/switchdev.h | 6 +
include/net/tcp.h | 6 +
include/net/tls.h | 5 +
include/net/xdp.h | 6 +
include/net/xfrm.h | 6 +
include/rdma/ib_addr.h | 3 +
include/rdma/rdma_cm.h | 8 +
include/rdma/rdma_counter.h | 3 +
include/rdma/rdma_netlink.h | 4 +
include/rdma/rdma_vt.h | 6 +
include/rdma/rdmavt_qp.h | 5 +
include/scsi/scsi_cmnd.h | 6 +
include/scsi/scsi_device.h | 15 +
include/scsi/scsi_host.h | 13 +
include/scsi/scsi_transport_fc.h | 25 ++
include/target/target_core_base.h | 7 +
include/uapi/linux/nfsd/nfsfh.h | 27 +-
include/uapi/linux/ptp_clock.h | 4 +
kernel/cgroup/cpuset.c | 6 +
kernel/sched/sched.h | 23 ++
kernel/ucount.c | 32 +-
kernel/workqueue_internal.h | 6 +
scripts/check-kabi | 147 ++++++++
scripts/kabideps | 161 ++++++++
scripts/kabisyms | 141 +++++++
144 files changed, 2220 insertions(+), 29 deletions(-)
create mode 100755 scripts/check-kabi
create mode 100755 scripts/kabideps
create mode 100755 scripts/kabisyms
--
2.20.1
[PATCH openEuler-5.10 01/71] perf mem: Search event name with more flexible path
by Zheng Zengkai 31 Dec '21
From: Leo Yan <leo.yan(a)linaro.org>
mainline inclusion
from mainline-v5.11-rc1
commit f9f16dfbe76e
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4NGPV
CVE: NA
-------------------------------------------------
The perf tool searches for a memory event name under the folder
'/sys/devices/cpu/events/', which limits memory profiling to events
located under that folder. Thus it's impossible to use any other event
as a memory event if it is not under this specific folder, e.g. the
Arm SPE hardware event is not located in '/sys/devices/cpu/events/'
so it cannot be enabled for memory profiling.
This patch changes the search folder from '/sys/devices/cpu/events/'
to '/sys/devices', giving the flexibility to find events which can be
used for memory profiling.
Signed-off-by: Leo Yan <leo.yan(a)linaro.org>
Acked-by: Jiri Olsa <jolsa(a)redhat.com>
Link: https://lore.kernel.org/r/20201106094853.21082-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
Signed-off-by: Wei Li <liwei391(a)huawei.com>
Reviewed-by: Yang Jihong <yangjihong1(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
tools/perf/util/mem-events.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
index ea0af0bc4314..35c8d175a9d2 100644
--- a/tools/perf/util/mem-events.c
+++ b/tools/perf/util/mem-events.c
@@ -18,8 +18,8 @@ unsigned int perf_mem_events__loads_ldlat = 30;
#define E(t, n, s) { .tag = t, .name = n, .sysfs_name = s }
struct perf_mem_event perf_mem_events[PERF_MEM_EVENTS__MAX] = {
- E("ldlat-loads", "cpu/mem-loads,ldlat=%u/P", "mem-loads"),
- E("ldlat-stores", "cpu/mem-stores/P", "mem-stores"),
+ E("ldlat-loads", "cpu/mem-loads,ldlat=%u/P", "cpu/events/mem-loads"),
+ E("ldlat-stores", "cpu/mem-stores/P", "cpu/events/mem-stores"),
};
#undef E
@@ -93,7 +93,7 @@ int perf_mem_events__init(void)
struct perf_mem_event *e = &perf_mem_events[j];
struct stat st;
- scnprintf(path, PATH_MAX, "%s/devices/cpu/events/%s",
+ scnprintf(path, PATH_MAX, "%s/devices/%s",
mnt, e->sysfs_name);
if (!stat(path, &st))
--
2.20.1
[PATCH openEuler-1.0-LTS 1/4] net: hns3: fix the concurrency between functions reading debugfs
by Yang Yingliang 31 Dec '21
From: Yufeng Mo <moyufeng(a)huawei.com>
driver inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4OSRU
CVE: NA
----------------------------
[1298504.847848] Call trace:
[1298504.847859] [<ffff000008089e14>] dump_backtrace+0x0/0x23c
[1298504.847865] [<ffff00000808a074>] show_stack+0x24/0x2c
[1298504.847870] [<ffff0000088568a8>] dump_stack+0x84/0xa8
[1298504.847878] [<ffff0000082122fc>] bad_page+0xec/0x14c
[1298504.847883] [<ffff000008219384>] free_pages_check_bad+0x90/0x9c
[1298504.847888] [<ffff00000821307c>] __free_pages_ok+0x2b8/0x2ec
[1298504.847894] [<ffff0000082153ec>] __free_pages+0x44/0x64
[1298504.847900] [<ffff000008288788>] kfree+0x198/0x1a0
[1298504.847905] [<ffff00000823432c>] kvfree+0x3c/0x58
[1298504.847937] [<ffff0000014fabf4>] hns3_dbg_read+0xf4/0x278 [hns3]
[1298504.847944] [<ffff000008359550>] full_proxy_read+0x60/0x90
[1298504.847949] [<ffff0000082b22a4>] __vfs_read+0x58/0x178
[1298504.847952] [<ffff0000082b2454>] vfs_read+0x90/0x14c
[1298504.847956] [<ffff0000082b2b70>] SyS_read+0x60/0xc0
When different functions read the same debugfs node, a double-free
can occur because the functions share the same node buffer.
This patch gives each function its own buffer to fix the problem.
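As a rough userspace analogy of the fix (a sketch with invented names, not the driver code): move the buffer table from storage shared by all users into each handle, so a reader frees only the buffers it allocated itself.
#include <stdlib.h>
#define NUM_CMDS 4
/* Per-handle buffer table, mirroring the dbgfs_buf array the patch adds
 * to struct hnae3_handle: two handles reading the same command index now
 * use distinct pointers, so they can no longer double-free a shared one.
 */
struct handle {
        char *bufs[NUM_CMDS];
};
static char *read_cmd(struct handle *h, int idx)
{
        if (!h->bufs[idx])
                h->bufs[idx] = calloc(1, 4096); /* filled on first read */
        return h->bufs[idx];
}
static void handle_uninit(struct handle *h)
{
        for (int i = 0; i < NUM_CMDS; i++) {
                free(h->bufs[i]);
                h->bufs[i] = NULL;
        }
}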
Fixes: 319ba0a4d154 ("net: hns3: fix race condition in debugfs")
Fixes: c91910efc03a ("net: hns3: refactor the debugfs process")
Signed-off-by: Yufeng Mo <moyufeng(a)huawei.com>
Signed-off-by: Yonglong Liu <liuyonglong(a)huawei.com>
Reviewed-by: Jian Shen <shenjian15(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/net/ethernet/hisilicon/hns3/hnae3.h | 1 +
.../net/ethernet/hisilicon/hns3/hns3_debugfs.c | 15 +++++++++++----
.../net/ethernet/hisilicon/hns3/hns3_debugfs.h | 1 -
3 files changed, 12 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hnae3.h b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
index 0d849c13f22f5..61b92430e56db 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hnae3.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
@@ -765,6 +765,7 @@ struct hnae3_handle {
u8 netdev_flags;
struct dentry *hnae3_dbgfs;
struct mutex dbgfs_lock;
+ char **dbgfs_buf;
/* Network interface message level enabled bits */
u32 msg_enable;
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c b/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c
index c68e5f3d0ba52..8dbbf597d8a2c 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c
@@ -808,7 +808,7 @@ static ssize_t hns3_dbg_read(struct file *filp, char __user *buffer,
return ret;
mutex_lock(&handle->dbgfs_lock);
- save_buf = &hns3_dbg_cmd[index].buf;
+ save_buf = &handle->dbgfs_buf[index];
if (!test_bit(HNS3_NIC_STATE_INITED, &priv->state) ||
test_bit(HNS3_NIC_STATE_RESETTING, &priv->state)) {
@@ -911,6 +911,13 @@ int hns3_dbg_init(struct hnae3_handle *handle)
int ret;
u32 i;
+ handle->dbgfs_buf = devm_kcalloc(&handle->pdev->dev,
+ ARRAY_SIZE(hns3_dbg_cmd),
+ sizeof(*handle->dbgfs_buf),
+ GFP_KERNEL);
+ if (!handle->dbgfs_buf)
+ return -ENOMEM;
+
hns3_dbg_dentry[HNS3_DBG_DENTRY_COMMON].dentry =
debugfs_create_dir(name, hns3_dbgfs_root);
handle->hnae3_dbgfs = hns3_dbg_dentry[HNS3_DBG_DENTRY_COMMON].dentry;
@@ -952,9 +959,9 @@ void hns3_dbg_uninit(struct hnae3_handle *handle)
u32 i;
for (i = 0; i < ARRAY_SIZE(hns3_dbg_cmd); i++)
- if (hns3_dbg_cmd[i].buf) {
- kvfree(hns3_dbg_cmd[i].buf);
- hns3_dbg_cmd[i].buf = NULL;
+ if (handle->dbgfs_buf[i]) {
+ kvfree(handle->dbgfs_buf[i]);
+ handle->dbgfs_buf[i] = NULL;
}
mutex_destroy(&handle->dbgfs_lock);
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.h b/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.h
index ea65e5174ea98..902e16d99fb7c 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.h
@@ -49,7 +49,6 @@ struct hns3_dbg_cmd_info {
enum hnae3_dbg_cmd cmd;
enum hns3_dbg_dentry_type dentry;
u32 buf_len;
- char *buf;
int (*init)(struct hnae3_handle *handle, unsigned int cmd);
};
--
2.25.1
[PATCH openEuler-1.0-LTS v2] f2fs: fix to do sanity check on last xattr entry in __f2fs_setxattr()
by Yang Yingliang 31 Dec '21
From: Chao Yu <chao(a)kernel.org>
mainline inclusion
from mainline-v5.17
commit 5598b24efaf4892741c798b425d543e4bed357a1
category: bugfix
CVE: CVE-2021-45469
--------------------------------
As Wenqing Liu reported in bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=215235
- Overview
page fault in f2fs_setxattr() when mount and operate on corrupted image
- Reproduce
tested on kernel 5.16-rc3, 5.15.X under root
1. unzip tmp7.zip
2. ./single.sh f2fs 7
Sometimes need to run the script several times
- Kernel dump
loop0: detected capacity change from 0 to 131072
F2FS-fs (loop0): Found nat_bits in checkpoint
F2FS-fs (loop0): Mounted with checkpoint version = 7548c2ee
BUG: unable to handle page fault for address: ffffe47bc7123f48
RIP: 0010:kfree+0x66/0x320
Call Trace:
__f2fs_setxattr+0x2aa/0xc00 [f2fs]
f2fs_setxattr+0xfa/0x480 [f2fs]
__f2fs_set_acl+0x19b/0x330 [f2fs]
__vfs_removexattr+0x52/0x70
__vfs_removexattr_locked+0xb1/0x140
vfs_removexattr+0x56/0x100
removexattr+0x57/0x80
path_removexattr+0xa3/0xc0
__x64_sys_removexattr+0x17/0x20
do_syscall_64+0x37/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae
The root cause is that __f2fs_setxattr() missed a sanity check on the
last xattr entry, resulting in an out-of-bounds memory access while
updating inconsistent xattr data of the target inode.
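The shape of the fix is a bounds check while walking variable-length on-disk entries. A simplified sketch of the idea (the struct and macros are illustrative, not the real f2fs layout):
#include <stdint.h>
struct entry {
        uint8_t  name_len;              /* 0 marks the last entry */
        uint16_t value_size;
        char     name[];
};
#define ENTRY_SIZE(e) \
        (sizeof(struct entry) + (e)->name_len + (e)->value_size)
#define NEXT_ENTRY(e) \
        ((struct entry *)((char *)(e) + ENTRY_SIZE(e)))
/* Before dereferencing an entry or stepping past it, verify it fits in
 * the buffer; corrupted sizes then produce an error instead of an
 * out-of-bounds access.
 */
static int walk(struct entry *e, const void *end)
{
        for (;;) {
                if ((const char *)e + sizeof(*e) > (const char *)end)
                        return -1;      /* entry header overflows buffer */
                if (e->name_len == 0)
                        return 0;       /* reached the last entry safely */
                if ((const char *)NEXT_ENTRY(e) > (const char *)end)
                        return -1;      /* corrupted entry size */
                e = NEXT_ENTRY(e);
        }
}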
After the fix, it can detect such xattr inconsistency as below:
F2FS-fs (loop11): inode (7) has invalid last xattr entry, entry_size: 60676
F2FS-fs (loop11): inode (8) has corrupted xattr
F2FS-fs (loop11): inode (8) has corrupted xattr
F2FS-fs (loop11): inode (8) has invalid last xattr entry, entry_size: 47736
Cc: stable(a)vger.kernel.org
Reported-by: Wenqing Liu <wenqingliu0120(a)gmail.com>
Signed-off-by: Chao Yu <chao(a)kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk(a)kernel.org>
Conflicts:
fs/f2fs/xattr.c
[yyl: replace f2fs_err() with f2fs_msg(KERN_ERR)]
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Reviewed-by: fang wei <fangwei1(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
fs/f2fs/xattr.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/fs/f2fs/xattr.c b/fs/f2fs/xattr.c
index 90d0d305f6a36..1858c66d871d1 100644
--- a/fs/f2fs/xattr.c
+++ b/fs/f2fs/xattr.c
@@ -657,8 +657,17 @@ static int __f2fs_setxattr(struct inode *inode, int index,
}
last = here;
- while (!IS_XATTR_LAST_ENTRY(last))
+ while (!IS_XATTR_LAST_ENTRY(last)) {
+ if ((void *)(last) + sizeof(__u32) > last_base_addr ||
+ (void *)XATTR_NEXT_ENTRY(last) > last_base_addr) {
+ f2fs_msg(F2FS_I_SB(inode)->sb, KERN_ERR, "inode (%lu) has invalid last xattr entry, entry_size: %zu",
+ inode->i_ino, ENTRY_SIZE(last));
+ set_sbi_flag(F2FS_I_SB(inode), SBI_NEED_FSCK);
+ error = -EFSCORRUPTED;
+ goto exit;
+ }
last = XATTR_NEXT_ENTRY(last);
+ }
newsize = XATTR_ALIGN(sizeof(struct f2fs_xattr_entry) + len + size);
--
2.25.1
patch #1 ~ #2 prepare for backporting the fix patch
patch #3 fixes CVE-2021-44733
Jens Wiklander (3):
tee: remove linked list of struct tee_shm
tee: don't assign shm id for private shms
tee: handle lookup of shm with reference count 0
drivers/tee/tee_core.c | 1 -
drivers/tee/tee_private.h | 3 +-
drivers/tee/tee_shm.c | 196 ++++++++++++++------------------------
include/linux/tee_drv.h | 7 +-
4 files changed, 77 insertions(+), 130 deletions(-)
--
2.25.1
[PATCH] f2fs: fix to do sanity check on last xattr entry in __f2fs_setxattr()
by Yang Yingliang 30 Dec '21
From: Chao Yu <chao(a)kernel.org>
mainline inclusion
from mainline-v5.17
commit 5598b24efaf4892741c798b425d543e4bed357a1
category: bugfix
CVE: CVE-2021-45469
--------------------------------
As Wenqing Liu reported in bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=215235
- Overview
page fault in f2fs_setxattr() when mount and operate on corrupted image
- Reproduce
tested on kernel 5.16-rc3, 5.15.X under root
1. unzip tmp7.zip
2. ./single.sh f2fs 7
Sometimes need to run the script several times
- Kernel dump
loop0: detected capacity change from 0 to 131072
F2FS-fs (loop0): Found nat_bits in checkpoint
F2FS-fs (loop0): Mounted with checkpoint version = 7548c2ee
BUG: unable to handle page fault for address: ffffe47bc7123f48
RIP: 0010:kfree+0x66/0x320
Call Trace:
__f2fs_setxattr+0x2aa/0xc00 [f2fs]
f2fs_setxattr+0xfa/0x480 [f2fs]
__f2fs_set_acl+0x19b/0x330 [f2fs]
__vfs_removexattr+0x52/0x70
__vfs_removexattr_locked+0xb1/0x140
vfs_removexattr+0x56/0x100
removexattr+0x57/0x80
path_removexattr+0xa3/0xc0
__x64_sys_removexattr+0x17/0x20
do_syscall_64+0x37/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae
The root cause is that __f2fs_setxattr() missed a sanity check on the
last xattr entry, resulting in an out-of-bounds memory access while
updating inconsistent xattr data of the target inode.
After the fix, it can detect such xattr inconsistency as below:
F2FS-fs (loop11): inode (7) has invalid last xattr entry, entry_size: 60676
F2FS-fs (loop11): inode (8) has corrupted xattr
F2FS-fs (loop11): inode (8) has corrupted xattr
F2FS-fs (loop11): inode (8) has invalid last xattr entry, entry_size: 47736
Cc: stable(a)vger.kernel.org
Reported-by: Wenqing Liu <wenqingliu0120(a)gmail.com>
Signed-off-by: Chao Yu <chao(a)kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk(a)kernel.org>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
fs/f2fs/xattr.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/fs/f2fs/xattr.c b/fs/f2fs/xattr.c
index 90d0d305f6a36..b55d0920e777c 100644
--- a/fs/f2fs/xattr.c
+++ b/fs/f2fs/xattr.c
@@ -657,8 +657,17 @@ static int __f2fs_setxattr(struct inode *inode, int index,
}
last = here;
- while (!IS_XATTR_LAST_ENTRY(last))
+ while (!IS_XATTR_LAST_ENTRY(last)) {
+ if ((void *)(last) + sizeof(__u32) > last_base_addr ||
+ (void *)XATTR_NEXT_ENTRY(last) > last_base_addr) {
+ f2fs_err(F2FS_I_SB(inode), "inode (%lu) has invalid last xattr entry, entry_size: %zu",
+ inode->i_ino, ENTRY_SIZE(last));
+ set_sbi_flag(F2FS_I_SB(inode), SBI_NEED_FSCK);
+ error = -EFSCORRUPTED;
+ goto exit;
+ }
last = XATTR_NEXT_ENTRY(last);
+ }
newsize = XATTR_ALIGN(sizeof(struct f2fs_xattr_entry) + len + size);
--
2.25.1
From: Zekun Shen <bruceshenzk(a)gmail.com>
mainline inclusion
from mainline-v5.17
commit 04d80663f67ccef893061b49ec8a42ff7045ae84
category: bugfix
CVE: CVE-2021-43976
--------------------------------
Currently, with an unknown recv_type, mwifiex_usb_recv
just returns -1 without restoring the skb. The next time
mwifiex_usb_rx_complete is invoked with the same skb,
calling skb_put causes skb_over_panic.
The bug is triggerable with a compromised/malfunctioning
usb device. After applying the patch, skb_over_panic
no longer shows up with the same input.
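The fix follows a general rule for parsers that consume a header before dispatching on it: on an unrecognized type, undo the consumption before returning the error, so the buffer is left exactly as it was received. A standalone sketch (the buf API is a stand-in for skb_pull/skb_push, not the real skb interface):
#include <stdint.h>
#include <string.h>
struct buf {
        uint8_t *data;
        unsigned int len;
};
static void buf_pull(struct buf *b, unsigned int n) { b->data += n; b->len -= n; }
static void buf_push(struct buf *b, unsigned int n) { b->data -= n; b->len += n; }
static int recv_one(struct buf *b)
{
        uint32_t type;
        int ret = 0;
        memcpy(&type, b->data, sizeof(type));
        buf_pull(b, sizeof(type));      /* consume the header */
        switch (type) {
        case 1:                         /* known type: handle payload */
                break;
        default:                        /* unknown type: restore header */
                ret = -1;
                goto exit_restore;
        }
        return 0;
exit_restore:
        buf_push(b, sizeof(type));      /* buffer is safely reusable */
        return ret;
}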
Attached is the panic report from fuzzing.
skbuff: skb_over_panic: text:000000003bf1b5fa
len:2048 put:4 head:00000000dd6a115b data:000000000a9445d8
tail:0x844 end:0x840 dev:<NULL>
kernel BUG at net/core/skbuff.c:109!
invalid opcode: 0000 [#1] SMP KASAN NOPTI
CPU: 0 PID: 198 Comm: in:imklog Not tainted 5.6.0 #60
RIP: 0010:skb_panic+0x15f/0x161
Call Trace:
<IRQ>
? mwifiex_usb_rx_complete+0x26b/0xfcd [mwifiex_usb]
skb_put.cold+0x24/0x24
mwifiex_usb_rx_complete+0x26b/0xfcd [mwifiex_usb]
__usb_hcd_giveback_urb+0x1e4/0x380
usb_giveback_urb_bh+0x241/0x4f0
? __hrtimer_run_queues+0x316/0x740
? __usb_hcd_giveback_urb+0x380/0x380
tasklet_action_common.isra.0+0x135/0x330
__do_softirq+0x18c/0x634
irq_exit+0x114/0x140
smp_apic_timer_interrupt+0xde/0x380
apic_timer_interrupt+0xf/0x20
</IRQ>
Reported-by: Brendan Dolan-Gavitt <brendandg(a)nyu.edu>
Signed-off-by: Zekun Shen <bruceshenzk(a)gmail.com>
Signed-off-by: Kalle Valo <kvalo(a)codeaurora.org>
Link: https://lore.kernel.org/r/YX4CqjfRcTa6bVL+@Zekuns-MBP-16.fios-router.home
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/net/wireless/marvell/mwifiex/usb.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/wireless/marvell/mwifiex/usb.c b/drivers/net/wireless/marvell/mwifiex/usb.c
index d445acc4786b7..22a42eb0cc0ac 100644
--- a/drivers/net/wireless/marvell/mwifiex/usb.c
+++ b/drivers/net/wireless/marvell/mwifiex/usb.c
@@ -130,7 +130,8 @@ static int mwifiex_usb_recv(struct mwifiex_adapter *adapter,
default:
mwifiex_dbg(adapter, ERROR,
"unknown recv_type %#x\n", recv_type);
- return -1;
+ ret = -1;
+ goto exit_restore_skb;
}
break;
case MWIFIEX_USB_EP_DATA:
--
2.25.1
30 Dec '21
From: xulei <stone.xulei(a)huawei.com>
virt inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4NP0K
CVE: NA
-------------------
Fix the problem of losing SMIs: if the vCPU is already in SMM when a
new SMI is injected, clear the stale SMM state so the new SMI request
is delivered rather than dropped.
Signed-off-by: xulei <stone.xulei(a)huawei.com>
Signed-off-by: Jingyi Wang <wangjingyi11(a)huawei.com>
Reviewed-by: Zenghui Yu <yuzenghui(a)huawei.com>
Reviewed-by: Wei Li <liwei391(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
arch/x86/kvm/x86.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 12db47c8bd3f..e33414f36dba 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4227,6 +4227,11 @@ static int kvm_vcpu_ioctl_nmi(struct kvm_vcpu *vcpu)
static int kvm_vcpu_ioctl_smi(struct kvm_vcpu *vcpu)
{
+ if (is_smm(vcpu)) {
+ vcpu->arch.hflags &= ~HF_SMM_MASK;
+ vcpu->arch.smi_pending = 0;
+ }
+
kvm_make_request(KVM_REQ_SMI, vcpu);
return 0;
--
2.20.1
[PATCH openEuler-1.0-LTS] mm/page_alloc: Use cmdline to disable "place pages to tail"
by Yang Yingliang 30 Dec '21
From: Peng Liu <liupeng256(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4EG0R
CVE: NA
-----------------------------------------------
Add a cmdline option to disable "place pages to tail" when onlining memory.
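For example (usage sketch based on the documentation added below), booting with "fpi_to_tail=off" on the kernel command line makes __free_pages_core() free pages through the normal refcounted path instead of passing FPI_TO_TAIL; omitting the option keeps the default place-to-tail behaviour.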
Signed-off-by: Peng Liu <liupeng256(a)huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
.../admin-guide/kernel-parameters.txt | 9 +++++++
mm/page_alloc.c | 27 ++++++++++++++-----
2 files changed, 30 insertions(+), 6 deletions(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 425015359b5ce..420f7317aa6af 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1254,6 +1254,15 @@
Warning: use of this parameter will taint the kernel
and may cause unknown problems.
+ fpi_to_tail=off
+ [KNL] Place pages to front in __free_pages_core().
+ This kernel start-up parameter can only be used as
+ "fpi_to_tail=off", which means memory is onlined to
+ the front of the freelist. Dynamic enabling is not
+ supported; in other words, "fpi_to_tail=on" will be
+ ignored by the kernel. This is a workaround for some
+ latent bugs revealed by "place pages to tail".
+
ftrace=[tracer]
[FTRACE] will set and start the specified tracer
as early as possible in order to facilitate early
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 935435e03cdd4..4cad86f1e3a91 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -102,6 +102,17 @@ typedef int __bitwise fpi_t;
*/
#define FPI_TO_TAIL ((__force fpi_t)BIT(1))
+static bool fpi_to_tail = true;
+
+static int __init early_fpi_to_tail(char *str)
+{
+ if (str && !strcmp(str, "off"))
+ fpi_to_tail = false;
+
+ return 0;
+}
+early_param("fpi_to_tail", early_fpi_to_tail);
+
/* prevent >1 _updater_ of zone percpu pageset ->high and ->batch fields */
static DEFINE_MUTEX(pcp_batch_high_lock);
#define MIN_PERCPU_PAGELIST_FRACTION (8)
@@ -1342,12 +1353,16 @@ void __free_pages_core(struct page *page, unsigned int order)
}
__ClearPageReserved(p);
set_page_count(p, 0);
-
- /*
- * Bypass PCP and place fresh pages right to the tail, primarily
- * relevant for memory onlining.
- */
- __free_pages_ok(page, order, FPI_TO_TAIL);
+ if (fpi_to_tail) {
+ /*
+ * Bypass PCP and place fresh pages right to the tail, primarily
+ * relevant for memory onlining.
+ */
+ __free_pages_ok(page, order, FPI_TO_TAIL);
+ } else {
+ set_page_refcounted(page);
+ __free_pages(page, order);
+ }
}
#if defined(CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID) || \
--
2.25.1
Add three modifications:
1. RSS initialization process optimization.
2. Attribute negotiation optimization.
3. Reset command flags modification.
Yanling Song (3):
net/spnic:RSS initialization process optimization
net/spnic:Attribute negotiation and optimization.
net/spnic:The reset command flags modification.
.../net/ethernet/ramaxel/spnic/hw/sphw_hw.h | 28 ++++-----
.../ethernet/ramaxel/spnic/hw/sphw_hwdev.c | 9 ++-
.../ethernet/ramaxel/spnic/hw/sphw_hwdev.h | 2 +
.../net/ethernet/ramaxel/spnic/spnic_main.c | 22 +++----
.../ramaxel/spnic/spnic_mgmt_interface.h | 12 ----
.../ethernet/ramaxel/spnic/spnic_nic_cfg.c | 57 ++++++++++++-------
.../ethernet/ramaxel/spnic/spnic_nic_cfg.h | 19 +------
.../ethernet/ramaxel/spnic/spnic_nic_dev.h | 2 +
.../net/ethernet/ramaxel/spnic/spnic_rss.c | 16 +-----
.../ethernet/ramaxel/spnic/spnic_rss_cfg.c | 57 -------------------
10 files changed, 70 insertions(+), 154 deletions(-)
--
2.23.0
[PATCH openEuler-1.0-LTS] scsi: ufs: Correct the LUN used in eh_device_reset_handler() callback
by Yang Yingliang 30 Dec '21
From: Can Guo <cang(a)codeaurora.org>
mainline inclusion
from mainline-v5.11-rc4
commit 35fc4cd34426c242ab015ef280853b7bff101f48
category: bugfix
bugzilla: NA
CVE: CVE-2021-39657
-----------------------------------------------
Users can initiate resets to specific SCSI device/target/host through
IOCTL. When this happens, the SCSI cmd passed to eh_device/target/host
_reset_handler() callbacks is initialized with a request whose tag is -1.
In this case it is not right for eh_device_reset_handler() callback to
count on the LUN get from hba->lrb[-1]. Fix it by getting LUN from the SCSI
device associated with the SCSI cmd.
Link: https://lore.kernel.org/r/1609157080-26283-1-git-send-email-cang@codeaurora…
Reviewed-by: Avri Altman <avri.altman(a)wdc.com>
Reviewed-by: Stanley Chu <stanley.chu(a)mediatek.com>
Signed-off-by: Can Guo <cang(a)codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
Signed-off-by: Ye Bin <yebin10(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Reviewed-by: Jason Yan <yanaijie(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/scsi/ufs/ufshcd.c | 11 ++++-------
1 file changed, 4 insertions(+), 7 deletions(-)
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 3601e770da162..3d9bfa5207689 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -5752,19 +5752,16 @@ static int ufshcd_eh_device_reset_handler(struct scsi_cmnd *cmd)
{
struct Scsi_Host *host;
struct ufs_hba *hba;
- unsigned int tag;
u32 pos;
int err;
- u8 resp = 0xF;
- struct ufshcd_lrb *lrbp;
+ u8 resp = 0xF, lun;
unsigned long flags;
host = cmd->device->host;
hba = shost_priv(host);
- tag = cmd->request->tag;
- lrbp = &hba->lrb[tag];
- err = ufshcd_issue_tm_cmd(hba, lrbp->lun, 0, UFS_LOGICAL_RESET, &resp);
+ lun = ufshcd_scsi_to_upiu_lun(cmd->device->lun);
+ err = ufshcd_issue_tm_cmd(hba, lun, 0, UFS_LOGICAL_RESET, &resp);
if (err || resp != UPIU_TASK_MANAGEMENT_FUNC_COMPL) {
if (!err)
err = resp;
@@ -5773,7 +5770,7 @@ static int ufshcd_eh_device_reset_handler(struct scsi_cmnd *cmd)
/* clear the commands that were pending for corresponding LUN */
for_each_set_bit(pos, &hba->outstanding_reqs, hba->nutrs) {
- if (hba->lrb[pos].lun == lrbp->lun) {
+ if (hba->lrb[pos].lun == lun) {
err = ufshcd_clear_cmd(hba, pos);
if (err)
break;
--
2.25.1
[PATCH openEuler-1.0-LTS] ext4: fix an use-after-free issue about data=journal writeback mode
by Yang Yingliang 30 Dec '21
From: Zhang Yi <yi.zhang(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: 185944, https://gitee.com/openeuler/kernel/issues/I4OP92
CVE: NA
---------------------------
Our syzkaller reported a use-after-free issue when accessing a freed
buffer_head on the writeback page in __ext4_journalled_writepage(). The
problem is that if a truncate races with the data=journalled writeback
procedure, the writeback length can become zero and bget_one() refuses
to take the buffer_head's refcount; the truncate procedure then
releases the buffer once we drop the page lock, and finally the last
ext4_walk_page_buffers() triggers the use-after-free.
sync truncate
ext4_sync_file()
file_write_and_wait_range()
ext4_setattr(0)
inode->i_size = 0
ext4_writepage()
len = 0
__ext4_journalled_writepage()
page_bufs = page_buffers(page)
ext4_walk_page_buffers(bget_one) <- does not get refcount
do_invalidatepage()
free_buffer_head()
ext4_walk_page_buffers(page_bufs) <- trigger use-after-free
After commit bdf96838aea6 ("ext4: fix race between truncate and
__ext4_journalled_writepage()"), we have already handled the racing
case, so bget_one() and bput_one() are no longer needed. This patch
simply removes these hunks and rechecks i_size to make it safe.
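The resulting pattern is the usual drop-lock/retake/revalidate discipline; a generic sketch of it (pthread-based illustration, not the ext4 code):
#include <pthread.h>
struct object {
        pthread_mutex_t lock;
        long size;              /* shrinks under a concurrent truncate */
        void *owner;            /* changes if the object is recycled */
};
/* After re-taking a lock that had to be dropped for a blocking
 * operation, revalidate everything read before the drop; the patch
 * does this by rechecking page->mapping and re-reading i_size before
 * walking the page's buffers.
 */
static int finish_write(struct object *o, void *expected_owner, long len)
{
        pthread_mutex_lock(&o->lock);
        if (o->owner != expected_owner) {       /* gone under us: bail out */
                pthread_mutex_unlock(&o->lock);
                return 0;
        }
        if (len > o->size)                      /* clamp to current size */
                len = o->size;
        /* ... now safe to operate on len bytes ... */
        pthread_mutex_unlock(&o->lock);
        return 0;
}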
Fixes: bdf96838aea6 ("ext4: fix race between truncate and __ext4_journalled_writepage()")
Signed-off-by: Zhang Yi <yi.zhang(a)huawei.com>
Reviewed-by: yangerkun <yangerkun(a)huawei.com>
Reviewed-by: Jason Yan <yanaijie(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
fs/ext4/inode.c | 35 ++++++++++-------------------------
1 file changed, 10 insertions(+), 25 deletions(-)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 6213816056cd6..e106a9317b429 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1962,28 +1962,16 @@ int ext4_da_get_block_prep(struct inode *inode, sector_t iblock,
return 0;
}
-static int bget_one(handle_t *handle, struct buffer_head *bh)
-{
- get_bh(bh);
- return 0;
-}
-
-static int bput_one(handle_t *handle, struct buffer_head *bh)
-{
- put_bh(bh);
- return 0;
-}
-
static int __ext4_journalled_writepage(struct page *page,
unsigned int len)
{
struct address_space *mapping = page->mapping;
struct inode *inode = mapping->host;
- struct buffer_head *page_bufs = NULL;
handle_t *handle = NULL;
int ret = 0, err = 0;
int inline_data = ext4_has_inline_data(inode);
struct buffer_head *inode_bh = NULL;
+ loff_t size;
ClearPageChecked(page);
@@ -1993,14 +1981,6 @@ static int __ext4_journalled_writepage(struct page *page,
inode_bh = ext4_journalled_write_inline_data(inode, len, page);
if (inode_bh == NULL)
goto out;
- } else {
- page_bufs = page_buffers(page);
- if (!page_bufs) {
- BUG();
- goto out;
- }
- ext4_walk_page_buffers(handle, page_bufs, 0, len,
- NULL, bget_one);
}
/*
* We need to release the page lock before we start the
@@ -2021,7 +2001,8 @@ static int __ext4_journalled_writepage(struct page *page,
lock_page(page);
put_page(page);
- if (page->mapping != mapping) {
+ size = i_size_read(inode);
+ if (page->mapping != mapping || page_offset(page) > size) {
/* The page got truncated from under us */
ext4_journal_stop(handle);
ret = 0;
@@ -2031,6 +2012,13 @@ static int __ext4_journalled_writepage(struct page *page,
if (inline_data) {
ret = ext4_mark_inode_dirty(handle, inode);
} else {
+ struct buffer_head *page_bufs = page_buffers(page);
+
+ if (page->index == size >> PAGE_SHIFT)
+ len = size & ~PAGE_MASK;
+ else
+ len = PAGE_SIZE;
+
ret = ext4_walk_page_buffers(handle, page_bufs, 0, len, NULL,
do_journal_get_write_access);
@@ -2048,9 +2036,6 @@ static int __ext4_journalled_writepage(struct page *page,
out:
unlock_page(page);
out_no_pagelock:
- if (!inline_data && page_bufs)
- ext4_walk_page_buffers(NULL, page_bufs, 0, len,
- NULL, bput_one);
brelse(inode_bh);
return ret;
}
--
2.25.1
[PATCH openEuler-1.0-LTS] netdevsim: Zero-initialize memory for new map's value in function nsim_bpf_map_alloc
by Yang Yingliang 30 Dec '21
From: Haimin Zhang <tcs.kernel(a)gmail.com>
mainline inclusion
from mainline-5.16-rc6
commit 481221775d53d6215a6e5e9ce1cce6d2b4ab9a46
category: bugfix
CVE: CVE-2021-4135
--------------------------------
Zero-initialize the memory for a new map's value in nsim_bpf_map_alloc(),
since otherwise it may cause a kernel information leak, as follows:
1. nsim_bpf_map_alloc calls nsim_map_alloc_elem to allocate elements for
a new map.
2. nsim_map_alloc_elem uses kmalloc to allocate map's value, but doesn't
zero it.
3. A user application can use IOCTL BPF_MAP_LOOKUP_ELEM to get specific
element's information in the map.
4. The kernel function map_lookup_elem will call bpf_map_copy_value to get
the information allocated at step-2, then use copy_to_user to copy to the
user buffer.
This can only leak information for an array map.
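A minimal userspace illustration of this leak class (illustrative only, not the netdevsim code): memory handed out by the allocator may still hold stale bytes, so zero it before it becomes readable.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
        size_t value_size = 64;
        unsigned char *value;
        /* Analogous to the buggy path: kmalloc'd (here, malloc'd)
         * memory is not zeroed, so a later lookup could read back
         * whatever was previously stored there.
         */
        value = malloc(value_size);
        if (!value)
                return 1;
        /* The fix: clear the buffer before it becomes visible
         * (in the kernel, the added memset, or kzalloc upfront).
         */
        memset(value, 0, value_size);
        for (size_t i = 0; i < value_size; i++)
                printf("%02x", value[i]);       /* all zeros now */
        putchar('\n');
        free(value);
        return 0;
}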
Fixes: 395cacb5f1a0 ("netdevsim: bpf: support fake map offload")
Suggested-by: Jakub Kicinski <kuba(a)kernel.org>
Acked-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Haimin Zhang <tcs.kernel(a)gmail.com>
Link: https://lore.kernel.org/r/20211215111530.72103-1-tcs.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Zhengchao Shao <shaozhengchao(a)huawei.com>
Reviewed-by: Wei Yongjun <weiyongjun1(a)huawei.com>
Reviewed-by: Yue Haibing <yuehaibing(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/net/netdevsim/bpf.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/netdevsim/bpf.c b/drivers/net/netdevsim/bpf.c
index 81444208b2162..12f100392ed11 100644
--- a/drivers/net/netdevsim/bpf.c
+++ b/drivers/net/netdevsim/bpf.c
@@ -493,6 +493,7 @@ nsim_bpf_map_alloc(struct netdevsim *ns, struct bpf_offloaded_map *offmap)
goto err_free;
key = nmap->entry[i].key;
*key = i;
+ memset(nmap->entry[i].value, 0, offmap->map.value_size);
}
}
--
2.25.1
[PATCH openEuler-1.0-LTS 1/7] bpf: Fix up register-based shifts in interpreter to silence KUBSAN
by Yang Yingliang 30 Dec '21
From: Daniel Borkmann <daniel(a)iogearbox.net>
mainline inclusion
from mainline-v5.12
commit 28131e9d933339a92f78e7ab6429f4aaaa07061c
category: bugfix
bugzilla: NA
CVE: NA
-----------------------------------------------------------------------
syzbot reported a shift-out-of-bounds that KUBSAN observed in the
interpreter:
[...]
UBSAN: shift-out-of-bounds in kernel/bpf/core.c:1420:2
shift exponent 255 is too large for 64-bit type 'long long unsigned int'
CPU: 1 PID: 11097 Comm: syz-executor.4 Not tainted 5.12.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:79 [inline]
dump_stack+0x141/0x1d7 lib/dump_stack.c:120
ubsan_epilogue+0xb/0x5a lib/ubsan.c:148
__ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:327
___bpf_prog_run.cold+0x19/0x56c kernel/bpf/core.c:1420
__bpf_prog_run32+0x8f/0xd0 kernel/bpf/core.c:1735
bpf_dispatcher_nop_func include/linux/bpf.h:644 [inline]
bpf_prog_run_pin_on_cpu include/linux/filter.h:624 [inline]
bpf_prog_run_clear_cb include/linux/filter.h:755 [inline]
run_filter+0x1a1/0x470 net/packet/af_packet.c:2031
packet_rcv+0x313/0x13e0 net/packet/af_packet.c:2104
dev_queue_xmit_nit+0x7c2/0xa90 net/core/dev.c:2387
xmit_one net/core/dev.c:3588 [inline]
dev_hard_start_xmit+0xad/0x920 net/core/dev.c:3609
__dev_queue_xmit+0x2121/0x2e00 net/core/dev.c:4182
__bpf_tx_skb net/core/filter.c:2116 [inline]
__bpf_redirect_no_mac net/core/filter.c:2141 [inline]
__bpf_redirect+0x548/0xc80 net/core/filter.c:2164
____bpf_clone_redirect net/core/filter.c:2448 [inline]
bpf_clone_redirect+0x2ae/0x420 net/core/filter.c:2420
___bpf_prog_run+0x34e1/0x77d0 kernel/bpf/core.c:1523
__bpf_prog_run512+0x99/0xe0 kernel/bpf/core.c:1737
bpf_dispatcher_nop_func include/linux/bpf.h:644 [inline]
bpf_test_run+0x3ed/0xc50 net/bpf/test_run.c:50
bpf_prog_test_run_skb+0xabc/0x1c50 net/bpf/test_run.c:582
bpf_prog_test_run kernel/bpf/syscall.c:3127 [inline]
__do_sys_bpf+0x1ea9/0x4f00 kernel/bpf/syscall.c:4406
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xae
[...]
Generally speaking, KUBSAN reports from the kernel should be fixed.
However, in case of BPF, this particular report caused concerns since
the large shift is not wrong from BPF point of view, just undefined.
In the verifier, K-based shifts that are >= {64,32} (depending on the
bitwidth of the instruction) are already rejected. The register-based
cases were not given their content might not be known at verification
time. Ideas such as verifier instruction rewrite with an additional
AND instruction for the source register were brought up, but regularly
rejected due to the additional runtime overhead they incur.
As Edward Cree rightly put it:
Shifts by more than insn bitness are legal in the BPF ISA; they are
implementation-defined behaviour [of the underlying architecture],
rather than UB, and have been made legal for performance reasons.
Each of the JIT backends compiles the BPF shift operations to machine
instructions which produce implementation-defined results in such a
case; the resulting contents of the register may be arbitrary but
program behaviour as a whole remains defined.
Guard checks in the fast path (i.e. affecting JITted code) will thus
not be accepted.
The case of division by zero is not truly analogous here, as division
instructions on many of the JIT-targeted architectures will raise a
machine exception / fault on division by zero, whereas (to the best
of my knowledge) none will do so on an out-of-bounds shift.
Given the KUBSAN report only affects the BPF interpreter, but not JITs,
one solution is to add the ANDs with 63 or 31 into ___bpf_prog_run().
That would make the shifts defined, and thus shuts up KUBSAN, and the
compiler would optimize out the AND on any CPU that interprets the shift
amounts modulo the width anyway (e.g., confirmed from disassembly that
on x86-64 and arm64 the generated interpreter code is the same before
and after this fix).
The BPF interpreter is slow path, and most likely compiled out anyway
as distros select BPF_JIT_ALWAYS_ON to avoid speculative execution of
BPF instructions by the interpreter. Given the main argument was to
avoid sacrificing performance, the fact that the AND is optimized away
from compiler for mainstream archs helps as well as a solution moving
forward. Also add a comment on LSH/RSH/ARSH translation for JIT authors
to provide guidance when they see the ___bpf_prog_run() interpreter
code and use it as a model for a new JIT backend.
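The effect of the masking can be demonstrated in plain C (a standalone sketch mirroring the interpreter change below, not the kernel source): the AND keeps the shift defined for any source value, and compilers for architectures that already reduce the amount modulo the width optimize it away.
#include <stdint.h>
#include <stdio.h>
/* DST = DST << SRC with SRC >= 64 is undefined behaviour in C;
 * masking with 63 (or 31 for the 32-bit ops) makes it defined and
 * matches what x86-64/arm64 shift instructions do with the amount.
 */
static uint64_t lsh64(uint64_t dst, uint64_t src)
{
        return dst << (src & 63);
}
static uint32_t lsh32(uint32_t dst, uint32_t src)
{
        return dst << (src & 31);
}
int main(void)
{
        printf("%llu\n", (unsigned long long)lsh64(1, 255)); /* 255 & 63 == 63 */
        printf("%u\n", lsh32(1, 33));                        /* 33 & 31 == 1 */
        return 0;
}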
Reported-by: syzbot+bed360704c521841c85d(a)syzkaller.appspotmail.com
Reported-by: Kurt Manucredo <fuzzybritches0(a)gmail.com>
Signed-off-by: Eric Biggers <ebiggers(a)kernel.org>
Co-developed-by: Eric Biggers <ebiggers(a)kernel.org>
Signed-off-by: Daniel Borkmann <daniel(a)iogearbox.net>
Acked-by: Alexei Starovoitov <ast(a)kernel.org>
Acked-by: Andrii Nakryiko <andrii(a)kernel.org>
Tested-by: syzbot+bed360704c521841c85d(a)syzkaller.appspotmail.com
Cc: Edward Cree <ecree.xilinx(a)gmail.com>
Link: https://lore.kernel.org/bpf/0000000000008f912605bd30d5d7@google.com
Link: https://lore.kernel.org/bpf/bac16d8d-c174-bdc4-91bd-bfa62b410190@gmail.com
conflicts:
kernel/bpf/core.c
Signed-off-by: He Fengqing <hefengqing(a)huawei.com>
Reviewed-by: Kuohai Xu <xukuohai(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
kernel/bpf/core.c | 59 +++++++++++++++++++++++++++++++++--------------
1 file changed, 42 insertions(+), 17 deletions(-)
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 641622b65de44..625edbc8360de 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -1100,29 +1100,54 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u64 *stack)
select_insn:
goto *jumptable[insn->code];
- /* ALU */
-#define ALU(OPCODE, OP) \
- ALU64_##OPCODE##_X: \
- DST = DST OP SRC; \
- CONT; \
- ALU_##OPCODE##_X: \
- DST = (u32) DST OP (u32) SRC; \
- CONT; \
- ALU64_##OPCODE##_K: \
- DST = DST OP IMM; \
- CONT; \
- ALU_##OPCODE##_K: \
- DST = (u32) DST OP (u32) IMM; \
+ /* Explicitly mask the register-based shift amounts with 63 or 31
+ * to avoid undefined behavior. Normally this won't affect the
+ * generated code, for example, in case of native 64 bit archs such
+ * as x86-64 or arm64, the compiler is optimizing the AND away for
+ * the interpreter. In case of JITs, each of the JIT backends compiles
+ * the BPF shift operations to machine instructions which produce
+ * implementation-defined results in such a case; the resulting
+ * contents of the register may be arbitrary, but program behaviour
+ * as a whole remains defined. In other words, in case of JIT backends,
+ * the AND must /not/ be added to the emitted LSH/RSH/ARSH translation.
+ */
+ /* ALU (shifts) */
+#define SHT(OPCODE, OP) \
+ ALU64_##OPCODE##_X: \
+ DST = DST OP (SRC & 63); \
+ CONT; \
+ ALU_##OPCODE##_X: \
+ DST = (u32) DST OP ((u32) SRC & 31); \
+ CONT; \
+ ALU64_##OPCODE##_K: \
+ DST = DST OP IMM; \
+ CONT; \
+ ALU_##OPCODE##_K: \
+ DST = (u32) DST OP (u32) IMM; \
+ CONT;
+ /* ALU (rest) */
+#define ALU(OPCODE, OP) \
+ ALU64_##OPCODE##_X: \
+ DST = DST OP SRC; \
+ CONT; \
+ ALU_##OPCODE##_X: \
+ DST = (u32) DST OP (u32) SRC; \
+ CONT; \
+ ALU64_##OPCODE##_K: \
+ DST = DST OP IMM; \
+ CONT; \
+ ALU_##OPCODE##_K: \
+ DST = (u32) DST OP (u32) IMM; \
CONT;
-
ALU(ADD, +)
ALU(SUB, -)
ALU(AND, &)
ALU(OR, |)
- ALU(LSH, <<)
- ALU(RSH, >>)
ALU(XOR, ^)
ALU(MUL, *)
+ SHT(LSH, <<)
+ SHT(RSH, >>)
+#undef SHT
#undef ALU
ALU_NEG:
DST = (u32) -DST;
@@ -1147,7 +1172,7 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u64 *stack)
insn++;
CONT;
ALU64_ARSH_X:
- (*(s64 *) &DST) >>= SRC;
+ (*(s64 *) &DST) >>= (SRC & 63);
CONT;
ALU64_ARSH_K:
(*(s64 *) &DST) >>= IMM;
--
2.25.1
patch #1 ~ #9 prepare for fixing CVEs
patch #10 ~ #14 fix:
CVE-2021-28711
CVE-2021-28712
CVE-2021-28713
CVE-2021-28714
CVE-2021-28715
Juergen Gross (14):
xen/netback: avoid race in xenvif_rx_ring_slots_available()
xen: sync include/xen/interface/io/ring.h with Xen's newest version
xen/blkfront: read response from backend only once
xen/blkfront: don't take local copy of a request from the ring page
xen/blkfront: don't trust the backend response data blindly
xen/netfront: read response from backend only once
xen/netfront: don't read data from request on the ring page
xen/netfront: disentangle tx_skb_freelist
xen/netfront: don't trust the backend response data blindly
xen/blkfront: harden blkfront against event channel storms
xen/netfront: harden netfront against event channel storms
xen/console: harden hvc_xen against event channel storms
xen/netback: fix rx queue stall detection
xen/netback: don't queue unlimited number of packages
drivers/block/xen-blkfront.c | 141 ++++++++----
drivers/net/xen-netback/common.h | 1 +
drivers/net/xen-netback/rx.c | 70 ++++--
drivers/net/xen-netfront.c | 372 ++++++++++++++++++++-----------
drivers/tty/hvc/hvc_xen.c | 30 ++-
include/xen/interface/io/ring.h | 293 +++++++++++++-----------
6 files changed, 575 insertions(+), 332 deletions(-)
--
2.25.1
Ramaxel inclusion
category: features
bugzilla: https://gitee.com/openeuler/kernel/issues/I4ON8F
CVE: NA
Changes:
1. Split scmd_tmout_nonpt into two parameters:
scmd_tmout_vd/scmd_tmout_rawdisk
2. Return -ETIME instead of -EINVAL when a command times out.
3. Add one module parameter: max_io_force.
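As a usage sketch (parameter names are from the diff below; the values are purely illustrative), the new knobs would be set at module load time, e.g. "modprobe spraid scmd_tmout_vd=60 scmd_tmout_rawdisk=180 max_io_force=1".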
Signed-off-by: Yanling Song <songyl(a)ramaxel.com>
Reviewed-by: Jiang Yu<yujiang(a)ramaxel.com>
---
drivers/scsi/spraid/spraid_main.c | 55 ++++++++++++++-----------------
1 file changed, 24 insertions(+), 31 deletions(-)
diff --git a/drivers/scsi/spraid/spraid_main.c b/drivers/scsi/spraid/spraid_main.c
index 519b39f44e91..df15b1cea426 100644
--- a/drivers/scsi/spraid/spraid_main.c
+++ b/drivers/scsi/spraid/spraid_main.c
@@ -42,10 +42,17 @@ static u32 admin_tmout = 60;
module_param(admin_tmout, uint, 0644);
MODULE_PARM_DESC(admin_tmout, "admin commands timeout (seconds)");
-static u32 scmd_tmout_nonpt = 180;
-module_param(scmd_tmout_nonpt, uint, 0644);
-MODULE_PARM_DESC(scmd_tmout_nonpt,
- "scsi commands timeout for rawdisk&raid(seconds)");
+static u32 scmd_tmout_rawdisk = 180;
+module_param(scmd_tmout_rawdisk, uint, 0644);
+MODULE_PARM_DESC(scmd_tmout_rawdisk, "scsi commands timeout for rawdisk(seconds)");
+
+static u32 scmd_tmout_vd = 180;
+module_param(scmd_tmout_vd, uint, 0644);
+MODULE_PARM_DESC(scmd_tmout_vd, "scsi commands timeout for vd(seconds)");
+
+static bool max_io_force;
+module_param(max_io_force, bool, 0644);
+MODULE_PARM_DESC(max_io_force, "force max_hw_sectors_kb = 1024, default false(performance first)");
static int ioq_depth_set(const char *val, const struct kernel_param *kp);
static const struct kernel_param_ops ioq_depth_ops = {
@@ -78,7 +85,7 @@ static unsigned char log_debug_switch;
module_param_cb(log_debug_switch, &log_debug_switch_ops,
&log_debug_switch, 0644);
MODULE_PARM_DESC(log_debug_switch,
- "set log state, default non-zero for switch on");
+ "set log state, default zero for switch off");
static int small_pool_num_set(const char *val, const struct kernel_param *kp)
{
@@ -981,32 +988,17 @@ static void spraid_slave_destroy(struct scsi_device *sdev)
static int spraid_slave_configure(struct scsi_device *sdev)
{
- u16 idx;
- unsigned int timeout = scmd_tmout_nonpt * HZ;
+ unsigned int timeout = scmd_tmout_rawdisk * HZ;
struct spraid_dev *hdev = shost_priv(sdev->host);
struct spraid_sdev_hostdata *hostdata = sdev->hostdata;
u32 max_sec = sdev->host->max_sectors;
if (hostdata) {
- idx = hostdata->hdid - 1;
- if (sdev->channel == hdev->devices[idx].channel &&
- sdev->id == le16_to_cpu(hdev->devices[idx].target) &&
- sdev->lun < hdev->devices[idx].lun) {
- if (SPRAID_DEV_INFO_ATTR_PT(hdev->devices[idx].attr))
- timeout = 30 * HZ;
- else
- timeout = scmd_tmout_nonpt * HZ;
- max_sec = le16_to_cpu(hdev->devices[idx].max_io_kb)
- << 1;
- } else {
- dev_err(hdev->dev,
- "[%s] err, sdev->channel:id:lun[%d:%d:%lld];"
- "devices[%d], channel:target:lun[%d:%d:%d]\n",
- __func__, sdev->channel, sdev->id, sdev->lun,
- idx, hdev->devices[idx].channel,
- hdev->devices[idx].target,
- hdev->devices[idx].lun);
- }
+ if (SPRAID_DEV_INFO_ATTR_VD(hostdata->attr))
+ timeout = scmd_tmout_vd * HZ;
+ else if (SPRAID_DEV_INFO_ATTR_RAWDISK(hostdata->attr))
+ timeout = scmd_tmout_rawdisk * HZ;
+ max_sec = hostdata->max_io_kb << 1;
} else {
dev_err(hdev->dev, "[%s] err, sdev->hostdata is null\n",
__func__);
@@ -1018,11 +1010,12 @@ static int spraid_slave_configure(struct scsi_device *sdev)
if ((max_sec == 0) || (max_sec > sdev->host->max_sectors))
max_sec = sdev->host->max_sectors;
- dev_info(hdev->dev,
- "[%s] sdev->channel:id:lun[%d:%d:%lld];"
- " scmd_timeout[%d]s, maxsec[%d]\n",
- __func__, sdev->channel, sdev->id,
- sdev->lun, timeout / HZ, max_sec);
+ if (!max_io_force)
+ blk_queue_max_hw_sectors(sdev->request_queue, max_sec);
+
+ dev_info(hdev->dev, "[%s] sdev->channel:id:lun[%d:%d:%lld];"
+ "scmd_timeout[%d]s, maxsec[%d]\n", __func__, sdev->channel,
+ sdev->id, sdev->lun, timeout / HZ, max_sec);
return 0;
}
--
2.27.0
[PATCH openEuler-1.0-LTS 1/2] sctp: account stream padding length for reconf chunk
by Yang Yingliang 29 Dec '21
From: Eiichi Tsukata <eiichi.tsukata(a)nutanix.com>
stable inclusion
from linux-4.19.213
commit c57fdeff69b152185fafabd37e6bfecfce51efda
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4OJ3D
CVE: NA
--------------------------------
commit a2d859e3fc97e79d907761550dbc03ff1b36479c upstream.
sctp_make_strreset_req() makes repeated calls to sctp_addto_chunk()
which will automatically account for padding on each call. inreq and
outreq are already 4 bytes aligned, but the payload is not and doing
SCTP_PAD4(a + b) (which _sctp_make_chunk() did implicitly here) is
different from SCTP_PAD4(a) + SCTP_PAD4(b) and not enough. It led to
possible attempt to use more buffer than it was allocated and triggered
a BUG_ON.
Cc: Vlad Yasevich <vyasevich(a)gmail.com>
Cc: Neil Horman <nhorman(a)tuxdriver.com>
Cc: Greg KH <gregkh(a)linuxfoundation.org>
Fixes: cc16f00f6529 ("sctp: add support for generating stream reconf ssn reset request chunk")
Reported-by: Eiichi Tsukata <eiichi.tsukata(a)nutanix.com>
Signed-off-by: Eiichi Tsukata <eiichi.tsukata(a)nutanix.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner(a)gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <mleitner(a)redhat.com>
Reviewed-by: Xin Long <lucien.xin(a)gmail.com>
Link: https://lore.kernel.org/r/b97c1f8b0c7ff79ac4ed206fc2c49d3612e0850c.16341568…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Wang Hai <wanghai38(a)huawei.com>
Reviewed-by: Wei Yongjun <weiyongjun1(a)huawei.com>
Reviewed-by: Yue Haibing <yuehaibing(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
net/sctp/sm_make_chunk.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
index 0789109c2d093..35e1fb708f4a7 100644
--- a/net/sctp/sm_make_chunk.c
+++ b/net/sctp/sm_make_chunk.c
@@ -3673,7 +3673,7 @@ struct sctp_chunk *sctp_make_strreset_req(
outlen = (sizeof(outreq) + stream_len) * out;
inlen = (sizeof(inreq) + stream_len) * in;
- retval = sctp_make_reconf(asoc, outlen + inlen);
+ retval = sctp_make_reconf(asoc, SCTP_PAD4(outlen) + SCTP_PAD4(inlen));
if (!retval)
return NULL;
--
2.25.1
[PATCH openEuler-5.10 01/64] arm64: Revert feature: Add memmap parameter and register pmem
by Zheng Zengkai 29 Dec '21
From: Zhuling <zhuling8(a)huawei.com>
euleros inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4O31I?from=project-issue
CVE: NA
--------------------------------
The reserved memory of PMEM conflicts with
"add memmap interface to reserved memory for mremap syscall usage";
this needs to be rolled back and resubmitted after adaptation.
Feature related commit:
1.PMEM function commit: 94dc364f5eda10f49449ba573dc3322e1ea92280
2.PMEM feature config commit: 36d7a831e15ceb84e937122c87d01c14242dc377
Signed-off-by: Zhuling <zhuling8(a)huawei.com>
Reviewed-by: Sang Yan <sangyan(a)huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
arch/arm64/Kconfig | 21 ------
arch/arm64/configs/openeuler_defconfig | 3 -
arch/arm64/kernel/Makefile | 1 -
arch/arm64/kernel/pmem.c | 35 ----------
arch/arm64/kernel/setup.c | 10 ---
arch/arm64/mm/init.c | 94 --------------------------
drivers/nvdimm/Kconfig | 5 --
drivers/nvdimm/Makefile | 1 -
8 files changed, 170 deletions(-)
delete mode 100644 arch/arm64/kernel/pmem.c
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index df90a6e05ad2..2df4b310eb23 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1321,27 +1321,6 @@ config RODATA_FULL_DEFAULT_ENABLED
This requires the linear region to be mapped down to pages,
which may adversely affect performance in some cases.
-config ARM64_PMEM_RESERVE
- bool "Reserve memory for persistent storage"
- default n
- help
- Use memmap=nn[KMG]!ss[KMG](memmap=100K!0x1a0000000) reserve
- memory for persistent storage.
-
- Say y here to enable this feature.
-
-config ARM64_PMEM_LEGACY_DEVICE
- bool "Create persistent storage"
- depends on BLK_DEV
- depends on LIBNVDIMM
- select ARM64_PMEM_RESERVE
- help
- Use reserved memory for persistent storage when the kernel
- restart or update. the data in PMEM will not be lost and
- can be loaded faster.
-
- Say y if unsure.
-
config ARM64_SW_TTBR0_PAN
bool "Emulate Privileged Access Never using TTBR0_EL1 switching"
help
diff --git a/arch/arm64/configs/openeuler_defconfig b/arch/arm64/configs/openeuler_defconfig
index 17bc8750bba7..b5fc851f1949 100644
--- a/arch/arm64/configs/openeuler_defconfig
+++ b/arch/arm64/configs/openeuler_defconfig
@@ -416,8 +416,6 @@ CONFIG_ARM64_CPU_PARK=y
CONFIG_FORCE_MAX_ZONEORDER=11
CONFIG_UNMAP_KERNEL_AT_EL0=y
CONFIG_RODATA_FULL_DEFAULT_ENABLED=y
-CONFIG_ARM64_PMEM_RESERVE=y
-CONFIG_ARM64_PMEM_LEGACY_DEVICE=y
# CONFIG_ARM64_SW_TTBR0_PAN is not set
CONFIG_ARM64_TAGGED_ADDR_ABI=y
CONFIG_ARM64_ILP32=y
@@ -6026,7 +6024,6 @@ CONFIG_ND_BTT=m
CONFIG_BTT=y
CONFIG_OF_PMEM=m
CONFIG_NVDIMM_KEYS=y
-CONFIG_PMEM_LEGACY=m
CONFIG_DAX_DRIVER=y
CONFIG_DAX=y
CONFIG_DEV_DAX=m
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index f6153250b631..169d90f11cf5 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -68,7 +68,6 @@ obj-$(CONFIG_ARM64_PTR_AUTH) += pointer_auth.o
obj-$(CONFIG_SHADOW_CALL_STACK) += scs.o
obj-$(CONFIG_ARM64_MTE) += mte.o
obj-$(CONFIG_MPAM) += mpam/
-obj-$(CONFIG_ARM64_PMEM_LEGACY_DEVICE) += pmem.o
obj-y += vdso/ probes/
obj-$(CONFIG_COMPAT_VDSO) += vdso32/
diff --git a/arch/arm64/kernel/pmem.c b/arch/arm64/kernel/pmem.c
deleted file mode 100644
index 16eaf706f671..000000000000
--- a/arch/arm64/kernel/pmem.c
+++ /dev/null
@@ -1,35 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Copyright(c) 2021 Huawei Technologies Co., Ltd
- *
- * Derived from x86 and arm64 implement PMEM.
- */
-#include <linux/platform_device.h>
-#include <linux/init.h>
-#include <linux/ioport.h>
-#include <linux/module.h>
-
-static int found(struct resource *res, void *data)
-{
- return 1;
-}
-
-static int __init register_e820_pmem(void)
-{
- struct platform_device *pdev;
- int rc;
-
- rc = walk_iomem_res_desc(IORES_DESC_PERSISTENT_MEMORY_LEGACY,
- IORESOURCE_MEM, 0, -1, NULL, found);
- if (rc <= 0)
- return 0;
-
- /*
- * See drivers/nvdimm/e820.c for the implementation, this is
- * simply here to trigger the module to load on demand.
- */
- pdev = platform_device_alloc("e820_pmem", -1);
-
- return platform_device_add(pdev);
-}
-device_initcall(register_e820_pmem);
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 88ec49504001..7cd042536d3b 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -70,10 +70,6 @@ static int __init arm64_enable_cpu0_hotplug(char *str)
__setup("arm64_cpu0_hotplug", arm64_enable_cpu0_hotplug);
#endif
-#ifdef CONFIG_ARM64_PMEM_RESERVE
-extern struct resource pmem_res;
-#endif
-
phys_addr_t __fdt_pointer __initdata;
/*
@@ -288,12 +284,6 @@ static void __init request_standard_resources(void)
request_resource(res, &pin_memory_resource);
#endif
}
-
-#ifdef CONFIG_ARM64_PMEM_RESERVE
- if (pmem_res.end && pmem_res.start)
- request_resource(&iomem_resource, &pmem_res);
-#endif
-
}
static int __init reserve_memblock_reserved_regions(void)
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 3b9401ee9c58..e8d446164c76 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -55,7 +55,6 @@
*/
s64 memstart_addr __ro_after_init = -1;
EXPORT_SYMBOL(memstart_addr);
-phys_addr_t start_at, mem_size;
#ifdef CONFIG_PIN_MEMORY
struct resource pin_memory_resource = {
@@ -112,18 +111,6 @@ static void __init reserve_pin_memory_res(void)
*/
phys_addr_t arm64_dma_phys_limit __ro_after_init;
-static unsigned long long pmem_size, pmem_start;
-
-#ifdef CONFIG_ARM64_PMEM_RESERVE
-struct resource pmem_res = {
- .name = "Persistent Memory (legacy)",
- .start = 0,
- .end = 0,
- .flags = IORESOURCE_MEM,
- .desc = IORES_DESC_PERSISTENT_MEMORY_LEGACY
-};
-#endif
-
#ifndef CONFIG_KEXEC_CORE
static void __init reserve_crashkernel(void)
{
@@ -417,83 +404,6 @@ static int __init reserve_park_mem(void)
}
#endif
-static bool __init is_mem_valid(unsigned long long mem_size, unsigned long long mem_start)
-{
- if (!memblock_is_region_memory(mem_start, mem_size)) {
- pr_warn("cannot reserve mem: region is not memory!\n");
- return false;
- }
-
- if (memblock_is_region_reserved(mem_start, mem_size)) {
- pr_warn("cannot reserve mem: region overlaps reserved memory!\n");
- return false;
- }
-
- if (!IS_ALIGNED(mem_start, SZ_2M)) {
- pr_warn("cannot reserve mem: base address is not 2MB aligned!\n");
- return false;
- }
-
- return true;
-}
-
-static int __init parse_memmap_one(char *p)
-{
- char *oldp;
-
- if (!p)
- return -EINVAL;
-
- oldp = p;
- mem_size = memparse(p, &p);
- if (p == oldp)
- return -EINVAL;
-
- if (!mem_size)
- return -EINVAL;
-
- mem_size = PAGE_ALIGN(mem_size);
-
- if (*p == '!') {
- start_at = memparse(p+1, &p);
-
- pmem_start = start_at;
- pmem_size = mem_size;
- } else
- pr_info("Unrecognized memmap option, please check the parameter.\n");
-
- return *p == '\0' ? 0 : -EINVAL;
-}
-
-static int __init parse_memmap_opt(char *str)
-{
- while (str) {
- char *k = strchr(str, ',');
-
- if (k)
- *k++ = 0;
- parse_memmap_one(str);
- str = k;
- }
-
- return 0;
-}
-early_param("memmap", parse_memmap_opt);
-
-#ifdef CONFIG_ARM64_PMEM_RESERVE
-static void __init reserve_pmem(void)
-{
- if (!is_mem_valid(mem_size, start_at))
- return;
-
- memblock_remove(pmem_start, pmem_size);
- pr_info("pmem reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
- pmem_start, pmem_start + pmem_size, pmem_size >> 20);
- pmem_res.start = pmem_start;
- pmem_res.end = pmem_start + pmem_size - 1;
-}
-#endif
-
void __init arm64_memblock_init(void)
{
const s64 linear_region_size = BIT(vabits_actual - 1);
@@ -668,10 +578,6 @@ void __init bootmem_init(void)
reserve_quick_kexec();
#endif
-#ifdef CONFIG_ARM64_PMEM_RESERVE
- reserve_pmem();
-#endif
-
reserve_pin_memory_res();
memblock_dump_all();
diff --git a/drivers/nvdimm/Kconfig b/drivers/nvdimm/Kconfig
index ce4de75262b9..b7d1eb38b27d 100644
--- a/drivers/nvdimm/Kconfig
+++ b/drivers/nvdimm/Kconfig
@@ -132,8 +132,3 @@ config NVDIMM_TEST_BUILD
infrastructure.
endif
-
-config PMEM_LEGACY
- tristate "Pmem_legacy"
- select X86_PMEM_LEGACY if X86
- select ARM64_PMEM_LEGACY_DEVICE if ARM64
diff --git a/drivers/nvdimm/Makefile b/drivers/nvdimm/Makefile
index 6f8dc9242a81..04077532f7ed 100644
--- a/drivers/nvdimm/Makefile
+++ b/drivers/nvdimm/Makefile
@@ -3,7 +3,6 @@ obj-$(CONFIG_LIBNVDIMM) += libnvdimm.o
obj-$(CONFIG_BLK_DEV_PMEM) += nd_pmem.o
obj-$(CONFIG_ND_BTT) += nd_btt.o
obj-$(CONFIG_ND_BLK) += nd_blk.o
-obj-$(CONFIG_PMEM_LEGACY) += nd_e820.o
obj-$(CONFIG_OF_PMEM) += of_pmem.o
obj-$(CONFIG_VIRTIO_PMEM) += virtio_pmem.o nd_virtio.o
--
2.20.1

[PATCH openEuler-1.0-LTS 00/93] Enabling AMD Milan series processor support
by Laibin Qiu 28 Dec '21
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MKP4
Babu Moger (1):
KVM: SVM: Clear the CR4 register on reset
David Edmondson (1):
KVM: x86: clflushopt should be treated as a no-op by emulation
Fenghua Yu (2):
x86/cpufeatures: Enumerate MOVDIRI instruction
x86/cpufeatures: Enumerate MOVDIR64B instruction
Haiyan Song (2):
perf vendor events intel: Add Icelake V1.00 event file
perf vendor events intel: Add Tremontx event file v1.02
Isaac Vaughn (1):
EDAC/amd64: Add PCI device IDs for family 17h, model 70h
Jan H. Schönherr (1):
x86/mce: Fix use of uninitialized MCE message string
Jim Mattson (1):
kvm: x86: Expose RDPID in KVM_GET_SUPPORTED_CPUID
John Allen (2):
kvm/svm: PKU not currently supported
x86/microcode/AMD: Increase microcode PATCH_MAX_SIZE
John Garry (4):
perf jevents: Add support for Hisi hip08 DDRC PMU aliasing
perf jevents: Add support for Hisi hip08 HHA PMU aliasing
perf jevents: Add support for Hisi hip08 L3C PMU aliasing
perf vendor events arm64: Fix Hisi hip08 DDRC PMU eventname
Kai Huang (3):
kvm: x86: Move kvm_set_mmio_spte_mask() from x86.c to mmu.c
kvm: x86: Fix reserved bits related calculation errors caused by MKTME
kvm: x86: Fix L1TF mitigation for shadow MMU
Kan Liang (1):
perf vendor events intel: Add uncore_upi JSON support
Kim Phillips (17):
perf/amd/uncore: Prepare L3 thread mask code for Family 19h
perf/amd/uncore: Make L3 thread mask code more readable
perf/amd/uncore: Add support for Family 19h L3 PMU
arch/x86/amd/ibs: Fix re-arming IBS Fetch
perf/x86/amd/ibs: Fix raw sample data accumulation
perf/amd/uncore: Set all slices and threads to restore perf stat -a
behaviour
perf/x86/amd/ibs: Don't include randomized bits in get_ibs_op_count()
perf/amd/uncore: Prepare to scale for more attributes that vary per
family
perf/amd/uncore: Allow F17h user threadmask and slicemask
specification
perf/amd/uncore: Allow F19h user coreid, threadmask, and sliceid
specification
perf vendor events amd: Add L3 cache events for Family 17h
perf vendor events amd: Remove redundant '['
perf vendor events amd: Enable Family 19h users by matching Zen2
events
x86/cpu/amd: Call init_amd_zn() om Family 19h processors too
perf/x86/amd/ibs: Support 27-bit extended Op/cycle counter
tools/power turbostat: Support AMD Family 19h
perf vendor events amd: Add L2 Prefetch events for zen1
Krish Sadhukhan (1):
KVM: SVM: Replace hard-coded value with #define
Like Xu (1):
perf/x86/amd: Don't touch the AMD64_EVENTSEL_HOSTONLY bit inside the
guest
Liu Jingqi (2):
KVM: x86: expose MOVDIRI CPU feature into VM.
KVM: x86: expose MOVDIR64B CPU feature into VM.
Maciej S. Szmigiero (1):
KVM: mmu: Fix SPTE encoding of MMIO generation upper half
Marcel Bocu (1):
x86/amd_nb: Add PCI device IDs for family 17h, model 70h
Martin Liška (1):
perf vendor events amd: perf PMU events for AMD Family 17h
Nathan Chancellor (2):
crypto: ccp - Remove forward declaration
perf/amd/uncore: Fix sysfs type mismatch
Paolo Bonzini (3):
KVM: x86: only do L1TF workaround on affected processors
KVM: x86: assign two bits to track SPTE kinds
KVM: x86: fix overlap between SPTE_MMIO_MASK and generation
Rasmus Villemoes (1):
build_bug.h: add wrapper for _Static_assert
Sean Christopherson (13):
KVM: nVMX: Allocate and configure VM{READ,WRITE} bitmaps iff
enable_shadow_vmcs
KVM: x86: Add requisite includes to kvm_cache_regs.h
KVM: x86: Add requisite includes to hyperv.h
KVM: x86: Use a u64 when passing the MMIO gen around
KVM: Explicitly define the "memslot update in-progress" bit
KVM: x86: Refactor the MMIO SPTE generation handling
KVM: x86: Rename access permissions cache member in struct
kvm_vcpu_arch
KVM: x86/mmu: Add explicit access mask for MMIO SPTEs
KVM: x86/mmu: Consolidate "is MMIO SPTE" code
KVM: x86/mmu: Apply max PA check for MMIO sptes to 32-bit KVM
KVM: x86/mmu: Set mmio_value to '0' if reserved #PF can't be generated
KVM: Remove the hack to trigger memslot generation wraparound
KVM: Move the memslot update in-progress flag to bit 63
Sebastian Andrzej Siewior (1):
x86/pkeys: Don't check if PKRU is zero before writing it
Tom Lendacky (1):
KVM: SVM: Override default MMIO mask if memory encryption is enabled
Vijay Thakkar (3):
perf vendor events amd: Restrict model detection for zen1 based
processors
perf vendor events amd: Add Zen2 events
perf vendor events amd: Update Zen1 events to V2
Woods, Brian (2):
hwmon/k10temp, x86/amd_nb: Consolidate shared device IDs
x86/amd_nb: Add PCI device IDs for family 17h, model 30h
Yazen Ghannam (24):
EDAC/amd64: Drop some family checks for newer systems
x86/amd_nb: Add Family 19h PCI IDs
EDAC/mce_amd: Always load on SMCA systems
x86/MCE/AMD, EDAC/mce_amd: Add new MP5, NBIO, and PCIE SMCA bank types
x86/MCE/AMD, EDAC/mce_amd: Add new McaTypes for CS, PSP, and SMU units
x86/MCE/AMD, EDAC/mce_amd: Add new error descriptions for some SMCA
bank types
x86/MCE/AMD, EDAC/mce_amd: Add new Load Store unit McaType
EDAC/amd64: Use a macro for iterating over Unified Memory Controllers
EDAC/amd64: Support more than two controllers for chip selects
handling
EDAC/amd64: Initialize DIMM info for systems with more than two
channels
EDAC/amd64: Add Family 17h Model 30h PCI IDs
EDAC/amd64: Support more than two Unified Memory Controllers
EDAC/amd64: Set maximum channel layer size depending on family
EDAC/amd64: Recognize x16 symbol size
EDAC/amd64: Adjust printed chip select sizes when interleaved
EDAC/amd64: Find Chip Select memory size using Address Mask
EDAC/amd64: Cache secondary Chip Select registers
EDAC/amd64: Support asymmetric dual-rank DIMMs
EDAC/amd64: Set grain per DIMM
EDAC/amd64: Make struct amd64_family_type global
EDAC/amd64: Gather hardware information early
EDAC/amd64: Save max number of controllers to family type
EDAC/amd64: Add family ops for Family 19h Models 00h-0Fh
EDAC/amd64: Handle three rank interleaving mode
Documentation/virtual/kvm/mmu.txt | 13 +-
arch/x86/events/amd/ibs.c | 93 +-
arch/x86/events/amd/uncore.c | 179 ++--
arch/x86/events/perf_event.h | 3 +-
arch/x86/include/asm/cpufeatures.h | 4 +-
arch/x86/include/asm/kvm_host.h | 10 +-
arch/x86/include/asm/mce.h | 7 +
arch/x86/include/asm/microcode_amd.h | 2 +-
arch/x86/include/asm/msr-index.h | 1 +
arch/x86/include/asm/perf_event.h | 16 +-
arch/x86/kernel/amd_nb.c | 15 +-
arch/x86/kernel/cpu/amd.c | 3 +-
arch/x86/kernel/cpu/mce/amd.c | 28 +-
arch/x86/kernel/cpu/mce/core.c | 4 +-
arch/x86/kvm/cpuid.c | 6 +-
arch/x86/kvm/emulate.c | 8 +-
arch/x86/kvm/hyperv.h | 2 +
arch/x86/kvm/kvm_cache_regs.h | 2 +
arch/x86/kvm/mmu.c | 215 +++--
arch/x86/kvm/mmu.h | 2 +-
arch/x86/kvm/svm.c | 53 +-
arch/x86/kvm/vmx.c | 51 +-
arch/x86/kvm/x86.c | 33 +-
arch/x86/kvm/x86.h | 4 +-
arch/x86/mm/pkeys.c | 7 -
drivers/crypto/ccp/sp-platform.c | 53 +-
drivers/edac/amd64_edac.c | 609 ++++++++----
drivers/edac/amd64_edac.h | 31 +-
drivers/edac/mce_amd.c | 134 ++-
drivers/hwmon/k10temp.c | 9 +-
include/linux/build_bug.h | 19 +
include/linux/kvm_host.h | 21 +
include/linux/pci_ids.h | 5 +
.../arm64/hisilicon/hip08/uncore-ddrc.json | 44 +
.../arm64/hisilicon/hip08/uncore-hha.json | 51 +
.../arm64/hisilicon/hip08/uncore-l3c.json | 37 +
.../pmu-events/arch/x86/amdzen1/branch.json | 23 +
.../pmu-events/arch/x86/amdzen1/cache.json | 312 ++++++
.../pmu-events/arch/x86/amdzen1/core.json | 125 +++
.../arch/x86/amdzen1/floating-point.json | 224 +++++
.../pmu-events/arch/x86/amdzen1/memory.json | 184 ++++
.../pmu-events/arch/x86/amdzen1/other.json | 56 ++
.../pmu-events/arch/x86/amdzen2/branch.json | 52 +
.../pmu-events/arch/x86/amdzen2/cache.json | 338 +++++++
.../pmu-events/arch/x86/amdzen2/core.json | 130 +++
.../arch/x86/amdzen2/floating-point.json | 140 +++
.../pmu-events/arch/x86/amdzen2/memory.json | 341 +++++++
.../pmu-events/arch/x86/amdzen2/other.json | 115 +++
.../pmu-events/arch/x86/icelake/cache.json | 552 +++++++++++
.../arch/x86/icelake/floating-point.json | 102 ++
.../pmu-events/arch/x86/icelake/frontend.json | 424 +++++++++
.../pmu-events/arch/x86/icelake/memory.json | 410 ++++++++
.../pmu-events/arch/x86/icelake/other.json | 121 +++
.../pmu-events/arch/x86/icelake/pipeline.json | 892 ++++++++++++++++++
.../arch/x86/icelake/virtual-memory.json | 236 +++++
tools/perf/pmu-events/arch/x86/mapfile.csv | 6 +
.../pmu-events/arch/x86/tremontx/cache.json | 111 +++
.../arch/x86/tremontx/frontend.json | 26 +
.../pmu-events/arch/x86/tremontx/memory.json | 26 +
.../pmu-events/arch/x86/tremontx/other.json | 26 +
.../arch/x86/tremontx/pipeline.json | 111 +++
.../arch/x86/tremontx/uncore-memory.json | 73 ++
.../arch/x86/tremontx/uncore-other.json | 431 +++++++++
.../arch/x86/tremontx/uncore-power.json | 11 +
.../arch/x86/tremontx/virtual-memory.json | 86 ++
tools/perf/pmu-events/jevents.c | 5 +
tools/power/x86/turbostat/turbostat.c | 34 +-
virt/kvm/kvm_main.c | 36 +-
68 files changed, 6981 insertions(+), 552 deletions(-)
create mode 100644 tools/perf/pmu-events/arch/arm64/hisilicon/hip08/uncore-ddrc.json
create mode 100644 tools/perf/pmu-events/arch/arm64/hisilicon/hip08/uncore-hha.json
create mode 100644 tools/perf/pmu-events/arch/arm64/hisilicon/hip08/uncore-l3c.json
create mode 100644 tools/perf/pmu-events/arch/x86/amdzen1/branch.json
create mode 100644 tools/perf/pmu-events/arch/x86/amdzen1/cache.json
create mode 100644 tools/perf/pmu-events/arch/x86/amdzen1/core.json
create mode 100644 tools/perf/pmu-events/arch/x86/amdzen1/floating-point.json
create mode 100644 tools/perf/pmu-events/arch/x86/amdzen1/memory.json
create mode 100644 tools/perf/pmu-events/arch/x86/amdzen1/other.json
create mode 100644 tools/perf/pmu-events/arch/x86/amdzen2/branch.json
create mode 100644 tools/perf/pmu-events/arch/x86/amdzen2/cache.json
create mode 100644 tools/perf/pmu-events/arch/x86/amdzen2/core.json
create mode 100644 tools/perf/pmu-events/arch/x86/amdzen2/floating-point.json
create mode 100644 tools/perf/pmu-events/arch/x86/amdzen2/memory.json
create mode 100644 tools/perf/pmu-events/arch/x86/amdzen2/other.json
create mode 100644 tools/perf/pmu-events/arch/x86/icelake/cache.json
create mode 100644 tools/perf/pmu-events/arch/x86/icelake/floating-point.json
create mode 100644 tools/perf/pmu-events/arch/x86/icelake/frontend.json
create mode 100644 tools/perf/pmu-events/arch/x86/icelake/memory.json
create mode 100644 tools/perf/pmu-events/arch/x86/icelake/other.json
create mode 100644 tools/perf/pmu-events/arch/x86/icelake/pipeline.json
create mode 100644 tools/perf/pmu-events/arch/x86/icelake/virtual-memory.json
create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/cache.json
create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/frontend.json
create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/memory.json
create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/other.json
create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/pipeline.json
create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/uncore-memory.json
create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/uncore-other.json
create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/uncore-power.json
create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/virtual-memory.json
--
2.22.0

[PATCH openEuler-1.0-LTS] netfilter: fix regression in looped (broad|multi)cast's MAC handling
by Yang Yingliang 28 Dec '21
From: Ignacy Gawędzki <ignacy.gawedzki(a)green-communications.fr>
mainline inclusion
from mainline-v5.16-rc7
commit ebb966d3bdfed581ecccbb4a7432341baf7619b4
category: bugfix
bugzilla: NA
CVE: NA
--------------------------------
In commit 5648b5e1169f ("netfilter: nfnetlink_queue: fix OOB when mac
header was cleared"), the test for non-empty MAC header introduced in
commit 2c38de4c1f8da7 ("netfilter: fix looped (broad|multi)cast's MAC
handling") has been replaced with a test for a set MAC header.
This breaks the case when the MAC header has been reset (using
skb_reset_mac_header), as is the case with looped-back multicast
packets. As a result, the packets ending up in NFQUEUE get a bogus
hwaddr interpreted from the first bytes of the IP header.
This patch adds a test for a non-empty MAC header in addition to the
test for a set MAC header. The same two tests are also implemented in
nfnetlink_log.c, where the initial code of commit 2c38de4c1f8da7
("netfilter: fix looped (broad|multi)cast's MAC handling") has not been
touched, but where supposedly the same situation may happen.
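As a minimal illustration of the combined test (the wrapper name is
hypothetical; skb_mac_header_was_set() and skb_mac_header_len() are
the helpers the patch relies on):

#include <linux/skbuff.h>

/* Hypothetical helper: true only when the MAC header is both set and
 * non-empty, i.e. it is safe to copy a hardware address from it.
 */
static inline bool skb_mac_header_usable(const struct sk_buff *skb)
{
	return skb_mac_header_was_set(skb) && skb_mac_header_len(skb) != 0;
}

Both call sites in the diff below open-code exactly this pair of
checks.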
Fixes: 5648b5e1169f ("netfilter: nfnetlink_queue: fix OOB when mac header was cleared")
Signed-off-by: Ignacy Gawędzki <ignacy.gawedzki(a)green-communications.fr>
Reviewed-by: Florian Westphal <fw(a)strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo(a)netfilter.org>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
Reviewed-by: Yue Haibing <yuehaibing(a)huawei.com>
Reviewed-by: Wei Yongjun <weiyongjun1(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
net/netfilter/nfnetlink_log.c | 3 ++-
net/netfilter/nfnetlink_queue.c | 3 ++-
2 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/net/netfilter/nfnetlink_log.c b/net/netfilter/nfnetlink_log.c
index 25298b3eb8546..17ca9a681d47b 100644
--- a/net/netfilter/nfnetlink_log.c
+++ b/net/netfilter/nfnetlink_log.c
@@ -509,7 +509,8 @@ __build_packet_message(struct nfnl_log_net *log,
goto nla_put_failure;
if (indev && skb->dev &&
- skb->mac_header != skb->network_header) {
+ skb_mac_header_was_set(skb) &&
+ skb_mac_header_len(skb) != 0) {
struct nfulnl_msg_packet_hw phw;
int len;
diff --git a/net/netfilter/nfnetlink_queue.c b/net/netfilter/nfnetlink_queue.c
index eb5a052d3b252..8955431f2ab26 100644
--- a/net/netfilter/nfnetlink_queue.c
+++ b/net/netfilter/nfnetlink_queue.c
@@ -566,7 +566,8 @@ nfqnl_build_packet_message(struct net *net, struct nfqnl_instance *queue,
goto nla_put_failure;
if (indev && entskb->dev &&
- skb_mac_header_was_set(entskb)) {
+ skb_mac_header_was_set(entskb) &&
+ skb_mac_header_len(entskb) != 0) {
struct nfqnl_msg_packet_hw phw;
int len;
--
2.25.1
euleros inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4OBDE
CVE: NA
-------------------------------------------------
This config enables dangerous old drivers. It is selected by
CONFIG_NOUVEAU_LEGACY_CTX_SUPPORT, for which the upstream community
suggests that "modern distros should consider turning it off".
CONFIG_DRM_VM was selected by CONFIG_DRM_LEGACY and can be turned
off now.
Other distros have disabled this config:
https://src.fedoraproject.org/rpms/kernel/blob/rawhide/f/kernel-aarch64-rhe…
Signed-off-by: Liu Zixian <liuzixian4(a)huawei.com>
---
arch/arm64/configs/openeuler_defconfig | 10 ++--------
arch/x86/configs/openeuler_defconfig | 10 ++--------
2 files changed, 4 insertions(+), 16 deletions(-)
diff --git a/arch/arm64/configs/openeuler_defconfig b/arch/arm64/configs/openeuler_defconfig
index 17bc8750b..6949824e8 100644
--- a/arch/arm64/configs/openeuler_defconfig
+++ b/arch/arm64/configs/openeuler_defconfig
@@ -4673,7 +4673,6 @@ CONFIG_DRM_TTM_DMA_PAGE_POOL=y
CONFIG_DRM_VRAM_HELPER=y
CONFIG_DRM_TTM_HELPER=y
CONFIG_DRM_GEM_SHMEM_HELPER=y
-CONFIG_DRM_VM=y
CONFIG_DRM_SCHED=m
#
@@ -4719,7 +4718,7 @@ CONFIG_DRM_AMD_DC_DCN=y
# CONFIG_HSA_AMD is not set
CONFIG_DRM_NOUVEAU=m
-CONFIG_NOUVEAU_LEGACY_CTX_SUPPORT=y
+# CONFIG_NOUVEAU_LEGACY_CTX_SUPPORT is not set
CONFIG_NOUVEAU_DEBUG=5
CONFIG_NOUVEAU_DEBUG_DEFAULT=3
# CONFIG_NOUVEAU_DEBUG_MMU is not set
@@ -4817,12 +4816,7 @@ CONFIG_DRM_CIRRUS_QEMU=m
# CONFIG_DRM_LIMA is not set
# CONFIG_DRM_PANFROST is not set
# CONFIG_DRM_TIDSS is not set
-CONFIG_DRM_LEGACY=y
-# CONFIG_DRM_TDFX is not set
-# CONFIG_DRM_R128 is not set
-# CONFIG_DRM_MGA is not set
-# CONFIG_DRM_VIA is not set
-# CONFIG_DRM_SAVAGE is not set
+# CONFIG_DRM_LEGACY is not set
CONFIG_DRM_PANEL_ORIENTATION_QUIRKS=y
#
diff --git a/arch/x86/configs/openeuler_defconfig b/arch/x86/configs/openeuler_defconfig
index 7b6083018..dbae04231 100644
--- a/arch/x86/configs/openeuler_defconfig
+++ b/arch/x86/configs/openeuler_defconfig
@@ -5040,7 +5040,6 @@ CONFIG_DRM_TTM_DMA_PAGE_POOL=y
CONFIG_DRM_VRAM_HELPER=m
CONFIG_DRM_TTM_HELPER=m
CONFIG_DRM_GEM_SHMEM_HELPER=y
-CONFIG_DRM_VM=y
CONFIG_DRM_SCHED=m
#
@@ -5085,7 +5084,7 @@ CONFIG_DRM_AMD_DC_DCN=y
# CONFIG_HSA_AMD is not set
CONFIG_DRM_NOUVEAU=m
-CONFIG_NOUVEAU_LEGACY_CTX_SUPPORT=y
+# CONFIG_NOUVEAU_LEGACY_CTX_SUPPORT is not set
CONFIG_NOUVEAU_DEBUG=5
CONFIG_NOUVEAU_DEBUG_DEFAULT=3
# CONFIG_NOUVEAU_DEBUG_MMU is not set
@@ -5226,12 +5225,7 @@ CONFIG_DRM_CIRRUS_QEMU=m
# CONFIG_TINYDRM_ST7735R is not set
# CONFIG_DRM_XEN is not set
# CONFIG_DRM_VBOXVIDEO is not set
-CONFIG_DRM_LEGACY=y
-# CONFIG_DRM_TDFX is not set
-# CONFIG_DRM_R128 is not set
-# CONFIG_DRM_MGA is not set
-# CONFIG_DRM_VIA is not set
-# CONFIG_DRM_SAVAGE is not set
+# CONFIG_DRM_LEGACY is not set
CONFIG_DRM_PANEL_ORIENTATION_QUIRKS=y
#
--
2.27.0
Dear Madam/Sir,
Hello!
Thank you for choosing to cooperate with Huawei and for your support
throughout our cooperation!
To better achieve mutual growth and development, we are currently
conducting the "2021 Huawei Ecosystem Health Survey". We have
commissioned an independent third-party research agency, Guangzhou
Nielsen Market Research Co., Ltd. (hereinafter "Nielsen"), to carry
out the survey, so as to understand your evaluation of and suggestions
for the Huawei ecosystem and to analyze improvements. The survey
content will be used for research purposes only and will be kept
strictly confidential.
The questionnaire takes about 5-10 minutes. To participate, please
click the survey link:
https://csurveys.nielseniq.cn/wix/p1611532.aspx
If you have any questions about this survey or need any help, you can
contact Ecosurvey(a)huawei.com and Huawei staff will answer your
questions.
Thank you for your participation!
2021 Huawei Ecosystem Health Survey Project Team
From: sdlzx <hdu_sdlzx(a)163.com>
redhat inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4OBDE
CVE: NA
-------------------------------------------------
This config enables dangerous old drivers. It is selected by
CONFIG_NOUVEAU_LEGACY_CTX_SUPPORT, for which the upstream community
suggests that "modern distros should consider turning it off".
CONFIG_DRM_VM was selected by CONFIG_DRM_LEGACY and can be turned
off now.
Other distros have disabled this config:
https://src.fedoraproject.org/rpms/kernel/blob/rawhide/f/kernel-aarch64-rhe…
Signed-off-by: sdlzx <hdu_sdlzx(a)163.com>
---
arch/arm64/configs/openeuler_defconfig | 10 ++--------
arch/x86/configs/openeuler_defconfig | 10 ++--------
2 files changed, 4 insertions(+), 16 deletions(-)
diff --git a/arch/arm64/configs/openeuler_defconfig b/arch/arm64/configs/openeuler_defconfig
index 17bc8750b..6949824e8 100644
--- a/arch/arm64/configs/openeuler_defconfig
+++ b/arch/arm64/configs/openeuler_defconfig
@@ -4673,7 +4673,6 @@ CONFIG_DRM_TTM_DMA_PAGE_POOL=y
CONFIG_DRM_VRAM_HELPER=y
CONFIG_DRM_TTM_HELPER=y
CONFIG_DRM_GEM_SHMEM_HELPER=y
-CONFIG_DRM_VM=y
CONFIG_DRM_SCHED=m
#
@@ -4719,7 +4718,7 @@ CONFIG_DRM_AMD_DC_DCN=y
# CONFIG_HSA_AMD is not set
CONFIG_DRM_NOUVEAU=m
-CONFIG_NOUVEAU_LEGACY_CTX_SUPPORT=y
+# CONFIG_NOUVEAU_LEGACY_CTX_SUPPORT is not set
CONFIG_NOUVEAU_DEBUG=5
CONFIG_NOUVEAU_DEBUG_DEFAULT=3
# CONFIG_NOUVEAU_DEBUG_MMU is not set
@@ -4817,12 +4816,7 @@ CONFIG_DRM_CIRRUS_QEMU=m
# CONFIG_DRM_LIMA is not set
# CONFIG_DRM_PANFROST is not set
# CONFIG_DRM_TIDSS is not set
-CONFIG_DRM_LEGACY=y
-# CONFIG_DRM_TDFX is not set
-# CONFIG_DRM_R128 is not set
-# CONFIG_DRM_MGA is not set
-# CONFIG_DRM_VIA is not set
-# CONFIG_DRM_SAVAGE is not set
+# CONFIG_DRM_LEGACY is not set
CONFIG_DRM_PANEL_ORIENTATION_QUIRKS=y
#
diff --git a/arch/x86/configs/openeuler_defconfig b/arch/x86/configs/openeuler_defconfig
index 7b6083018..dbae04231 100644
--- a/arch/x86/configs/openeuler_defconfig
+++ b/arch/x86/configs/openeuler_defconfig
@@ -5040,7 +5040,6 @@ CONFIG_DRM_TTM_DMA_PAGE_POOL=y
CONFIG_DRM_VRAM_HELPER=m
CONFIG_DRM_TTM_HELPER=m
CONFIG_DRM_GEM_SHMEM_HELPER=y
-CONFIG_DRM_VM=y
CONFIG_DRM_SCHED=m
#
@@ -5085,7 +5084,7 @@ CONFIG_DRM_AMD_DC_DCN=y
# CONFIG_HSA_AMD is not set
CONFIG_DRM_NOUVEAU=m
-CONFIG_NOUVEAU_LEGACY_CTX_SUPPORT=y
+# CONFIG_NOUVEAU_LEGACY_CTX_SUPPORT is not set
CONFIG_NOUVEAU_DEBUG=5
CONFIG_NOUVEAU_DEBUG_DEFAULT=3
# CONFIG_NOUVEAU_DEBUG_MMU is not set
@@ -5226,12 +5225,7 @@ CONFIG_DRM_CIRRUS_QEMU=m
# CONFIG_TINYDRM_ST7735R is not set
# CONFIG_DRM_XEN is not set
# CONFIG_DRM_VBOXVIDEO is not set
-CONFIG_DRM_LEGACY=y
-# CONFIG_DRM_TDFX is not set
-# CONFIG_DRM_R128 is not set
-# CONFIG_DRM_MGA is not set
-# CONFIG_DRM_VIA is not set
-# CONFIG_DRM_SAVAGE is not set
+# CONFIG_DRM_LEGACY is not set
CONFIG_DRM_PANEL_ORIENTATION_QUIRKS=y
#
--
2.27.0

[PATCH OLK-5.10 00/13] Introduce AESNI/AVX and AESNI/AVX2 accelerated implementation for SM4 algorithm
by shenzijun 28 Dec '21
From: 沈子俊 <shenzijun(a)kylinos.cn>
This patchset adds support for testing the GCM/CCM modes of SM4. The
GCM/CCM modes of SM4 are defined in the RFC 8998 specification:
https://datatracker.ietf.org/doc/html/rfc8998
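As a minimal usage sketch (assuming the generic gcm()/ccm() templates
resolve over the sm4 cipher once this series is applied; the function
name is illustrative only), a kernel user would allocate the AEAD
transform by name:

#include <crypto/aead.h>
#include <linux/err.h>

/* Probe for the RFC 8998 SM4-GCM AEAD by template name; "ccm(sm4)"
 * is requested the same way. A real user would go on to set the key
 * and queue requests; here we only check availability.
 */
static int sm4_gcm_probe(void)
{
	struct crypto_aead *tfm = crypto_alloc_aead("gcm(sm4)", 0, 0);

	if (IS_ERR(tfm))
		return PTR_ERR(tfm);

	crypto_free_aead(tfm);
	return 0;
}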
This patchset extracts the common SM4 algorithm into a separate
library; at the same time, the arm64 SM4 acceleration is adjusted to
build on this library. It then introduces AESNI/AVX and AESNI/AVX2
accelerated implementations on x86_64. The AESNI/AVX2 implementation
reuses the functions of the AESNI/AVX one.
The optimization supports four SM4 modes: ECB, CBC, CFB, and CTR.
Since CBC and CFB do not allow parallel encryption of multiple
blocks, the optimization effect for them is limited. All selftests
pass.
The main algorithm implementation comes from SM4 AES-NI work by
libgcrypt and Markku-Juhani O. Saarinen at:
https://github.com/mjosaarinen/sm4ni
Finally, it adds the configuration options for accelerated SM4 and
introduces some bugfixes for the AESNI/AVX2 implementation.
Benchmarked on an Intel i5-6200U @ 2.30GHz. Performance data for the
three implementations (pure software sm4-generic, AESNI/AVX
acceleration, and AESNI/AVX2 acceleration) comes from tcrypt modes
218 and 518. The columns are block sizes in bytes; the unit is Mb/s:
block-size | 16 64 128 256 1024 1420 4096
sm4-generic
ECB enc | 60.94 70.41 72.27 73.02 73.87 73.58 73.59
ECB dec | 61.87 70.53 72.15 73.09 73.89 73.92 73.86
CBC enc | 56.71 66.31 68.05 69.84 70.02 70.12 70.24
CBC dec | 54.54 65.91 68.22 69.51 70.63 70.79 70.82
CFB enc | 57.21 67.24 69.10 70.25 70.73 70.52 71.42
CFB dec | 57.22 64.74 66.31 67.24 67.40 67.64 67.58
CTR enc | 59.47 68.64 69.91 71.02 71.86 71.61 71.95
CTR dec | 59.94 68.77 69.95 71.00 71.84 71.55 71.95
sm4-aesni-avx
ECB enc | 44.95 177.35 292.06 316.98 339.48 322.27 330.59
ECB dec | 45.28 178.66 292.31 317.52 339.59 322.52 331.16
CBC enc | 57.75 67.68 69.72 70.60 71.48 71.63 71.74
CBC dec | 44.32 176.83 284.32 307.24 328.61 312.61 325.82
CFB enc | 57.81 67.64 69.63 70.55 71.40 71.35 71.70
CFB dec | 43.14 167.78 282.03 307.20 328.35 318.24 325.95
CTR enc | 42.35 163.32 279.11 302.93 320.86 310.56 317.93
CTR dec | 42.39 162.81 278.49 302.37 321.11 310.33 318.37
sm4-aesni-avx2
ECB enc | 45.19 177.41 292.42 316.12 339.90 322.53 330.54
ECB dec | 44.83 178.90 291.45 317.31 339.85 322.55 331.07
CBC enc | 57.66 67.62 69.73 70.55 71.58 71.66 71.77
CBC dec | 44.34 176.86 286.10 501.68 559.58 483.87 527.46
CFB enc | 57.43 67.60 69.61 70.52 71.43 71.28 71.65
CFB dec | 43.12 167.75 268.09 499.33 558.35 490.36 524.73
CTR enc | 42.42 163.39 256.17 493.95 552.45 481.58 517.19
CTR dec | 42.49 163.11 256.36 493.34 552.62 481.49 516.83
From the benchmark data, it can be seen that at a block size of 1024,
the AVX2 implementation is about 70% faster than the AVX one (e.g.
CBC dec: 559.58 vs 328.61 Mb/s), and about 7.7 times as fast as the
pure software sm4-generic implementation.
沈子俊 (13):
crypto: tcrypt - Fix missing return value check
crypto: testmgr - Add GCM/CCM mode test of SM4 algorithm
crypto: tcrypt - add GCM/CCM mode test for SM4 algorithm
crypto: sm4 - create SM4 library based on sm4 generic code
crypto: arm64/sm4-ce - Make dependent on sm4 library instead of
sm4-generic
crypto: x86/sm4 - add AES-NI/AVX/x86_64 implementation
crypto: tcrypt - add the asynchronous speed test for SM4
crypto: x86/sm4 - export reusable AESNI/AVX functions
crypto: x86/sm4 - add AES-NI/AVX2/x86_64 implementation
Add the configuration for accelerated of SM4
crypto: x86/sm4 - Fix frame pointer stack corruption
crypto: sm4 - Do not change section of ck and sbox
crypto: x86/sm4 - Fix invalid section entry size
arch/arm64/crypto/Kconfig | 2 +-
arch/arm64/crypto/sm4-ce-glue.c | 20 +-
arch/x86/configs/openeuler_defconfig | 2 +
arch/x86/crypto/Makefile | 6 +
arch/x86/crypto/sm4-aesni-avx-asm_64.S | 594 ++++++++++++++++++++++++
arch/x86/crypto/sm4-aesni-avx2-asm_64.S | 501 ++++++++++++++++++++
arch/x86/crypto/sm4-avx.h | 24 +
arch/x86/crypto/sm4_aesni_avx2_glue.c | 169 +++++++
arch/x86/crypto/sm4_aesni_avx_glue.c | 487 +++++++++++++++++++
crypto/Kconfig | 44 ++
crypto/sm4_generic.c | 180 +------
crypto/tcrypt.c | 99 +++-
crypto/testmgr.c | 29 ++
crypto/testmgr.h | 148 ++++++
include/crypto/sm4.h | 25 +-
lib/crypto/Kconfig | 3 +
lib/crypto/Makefile | 3 +
lib/crypto/sm4.c | 176 +++++++
18 files changed, 2325 insertions(+), 187 deletions(-)
create mode 100644 arch/x86/crypto/sm4-aesni-avx-asm_64.S
create mode 100644 arch/x86/crypto/sm4-aesni-avx2-asm_64.S
create mode 100644 arch/x86/crypto/sm4-avx.h
create mode 100644 arch/x86/crypto/sm4_aesni_avx2_glue.c
create mode 100644 arch/x86/crypto/sm4_aesni_avx_glue.c
create mode 100644 lib/crypto/sm4.c
--
2.30.0

28 Dec '21
Ramaxel inclusion
category: features
bugzilla: https://gitee.com/openeuler/kernel/issues/I4JXCG
CVE: NA
Changes from v2:
1. Split scmd_tmout_nonpt into two parameters:
scmd_tmout_vd/scmd_tmout_rawdisk
2. Return -ETIME instead of -EINVAL when a command times out.
3. Add one module parameter: max_io_force.
Changes from v1:
1. Add more debug info.
2. Remove some unnecessary module parameters.
3. Report disks in the order of channel/target id.
4. Add host_reset handler.
5. Use get_unaligned_be24 (see the sketch after this list).
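A minimal sketch of what item 5 above amounts to for the 6-byte CDB
case (field offsets as in the patch; the helper name is illustrative):

#include <linux/types.h>
#include <asm/unaligned.h>

/* Extract the 21-bit LBA from a READ(6)/WRITE(6) CDB.
 * Open-coded before:
 *   ((u32)cdb[1] << 16) | ((u32)cdb[2] << 8) | (u32)cdb[3]
 * With the unaligned helper:
 */
static inline u32 read6_lba(const u8 *cdb)
{
	return get_unaligned_be24(&cdb[1]) & 0x1FFFFF;
}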
Signed-off-by: Yanling Song <songyl(a)ramaxel.com>
Reviewed-by: Jiang Yu <yujiang(a)ramaxel.com>
---
drivers/scsi/spraid/Kconfig | 6 +-
drivers/scsi/spraid/spraid.h | 142 ++-
drivers/scsi/spraid/spraid_main.c | 1633 +++++++++++++++--------------
3 files changed, 968 insertions(+), 813 deletions(-)
diff --git a/drivers/scsi/spraid/Kconfig b/drivers/scsi/spraid/Kconfig
index 83962efaab07..bfbba3db8db0 100644
--- a/drivers/scsi/spraid/Kconfig
+++ b/drivers/scsi/spraid/Kconfig
@@ -5,7 +5,9 @@
config RAMAXEL_SPRAID
tristate "Ramaxel spraid Adapter"
depends on PCI && SCSI
+ select BLK_DEV_BSGLIB
depends on ARM64 || X86_64
- default m
help
- This driver supports Ramaxel spraid driver.
+ This driver supports the Ramaxel SPRxxx series
+ RAID controllers, which have a PCIe Gen4 host
+ interface and support SAS/SATA HDDs and SSDs.
diff --git a/drivers/scsi/spraid/spraid.h b/drivers/scsi/spraid/spraid.h
index da46d8e1b4b6..983d7af2faa8 100644
--- a/drivers/scsi/spraid/spraid.h
+++ b/drivers/scsi/spraid/spraid.h
@@ -1,4 +1,5 @@
/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Ramaxel Memory Technology, Ltd */
#ifndef __SPRAID_H_
#define __SPRAID_H_
@@ -24,7 +25,7 @@
#define SENSE_SIZE(depth) ((depth) * SCSI_SENSE_BUFFERSIZE)
#define SPRAID_AQ_DEPTH 128
-#define SPRAID_NR_AEN_COMMANDS 1
+#define SPRAID_NR_AEN_COMMANDS 16
#define SPRAID_AQ_BLK_MQ_DEPTH (SPRAID_AQ_DEPTH - SPRAID_NR_AEN_COMMANDS)
#define SPRAID_AQ_MQ_TAG_DEPTH (SPRAID_AQ_BLK_MQ_DEPTH - 1)
@@ -44,7 +45,7 @@
#define SMALL_POOL_SIZE 256
#define MAX_SMALL_POOL_NUM 16
-#define MAX_CMD_PER_DEV 32
+#define MAX_CMD_PER_DEV 64
#define MAX_CDB_LEN 32
#define SPRAID_UP_TO_MULTY4(x) (((x) + 4) & (~0x03))
@@ -53,7 +54,7 @@
#define PCI_VENDOR_ID_RAMAXEL_LOGIC 0x1E81
-#define SPRAID_SERVER_DEVICE_HAB_DID 0x2100
+#define SPRAID_SERVER_DEVICE_HBA_DID 0x2100
#define SPRAID_SERVER_DEVICE_RAID_DID 0x2200
#define IO_6_DEFAULT_TX_LEN 256
@@ -142,11 +143,15 @@ enum {
enum {
SPRAID_AEN_DEV_CHANGED = 0x00,
+ SPRAID_AEN_FW_ACT_START = 0x01,
SPRAID_AEN_HOST_PROBING = 0x10,
};
enum {
- SPRAID_AEN_TIMESYN = 0x07
+ SPRAID_AEN_TIMESYN = 0x00,
+ SPRAID_AEN_FW_ACT_FINISH = 0x02,
+ SPRAID_AEN_EVENT_MIN = 0x80,
+ SPRAID_AEN_EVENT_MAX = 0xff,
};
enum {
@@ -175,6 +180,16 @@ enum spraid_state {
SPRAID_DEAD,
};
+enum {
+ SPRAID_CARD_HBA,
+ SPRAID_CARD_RAID,
+};
+
+enum spraid_cmd_type {
+ SPRAID_CMD_ADM,
+ SPRAID_CMD_IOPT,
+};
+
struct spraid_completion {
__le32 result;
union {
@@ -217,8 +232,6 @@ struct spraid_dev {
struct dma_pool *prp_page_pool;
struct dma_pool *prp_small_pool[MAX_SMALL_POOL_NUM];
mempool_t *iod_mempool;
- struct blk_mq_tag_set admin_tagset;
- struct request_queue *admin_q;
void __iomem *bar;
u32 max_qid;
u32 num_vecs;
@@ -232,23 +245,27 @@ struct spraid_dev {
u32 ctrl_config;
u32 online_queues;
u64 cap;
- struct device ctrl_device;
- struct cdev cdev;
int instance;
struct spraid_ctrl_info *ctrl_info;
struct spraid_dev_info *devices;
- struct spraid_ioq_ptcmd *ioq_ptcmds;
+ struct spraid_cmd *adm_cmds;
+ struct list_head adm_cmd_list;
+ spinlock_t adm_cmd_lock;
+
+ struct spraid_cmd *ioq_ptcmds;
struct list_head ioq_pt_list;
spinlock_t ioq_pt_lock;
- struct work_struct aen_work;
struct work_struct scan_work;
struct work_struct timesyn_work;
struct work_struct reset_work;
+ struct work_struct fw_act_work;
enum spraid_state state;
spinlock_t state_lock;
+
+ struct request_queue *bsg_queue;
};
struct spraid_sgl_desc {
@@ -347,6 +364,35 @@ struct spraid_get_info {
__u32 rsvd12[4];
};
+struct spraid_usr_cmd {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ union {
+ struct {
+ __le16 subopcode;
+ __le16 rsvd1;
+ } info_0;
+ __le32 cdw2;
+ };
+ union {
+ struct {
+ __le16 data_len;
+ __le16 param_len;
+ } info_1;
+ __le32 cdw3;
+ };
+ __u64 metadata;
+ union spraid_data_ptr dptr;
+ __le32 cdw10;
+ __le32 cdw11;
+ __le32 cdw12;
+ __le32 cdw13;
+ __le32 cdw14;
+ __le32 cdw15;
+};
+
enum {
SPRAID_CMD_FLAG_SGL_METABUF = (1 << 6),
SPRAID_CMD_FLAG_SGL_METASEG = (1 << 7),
@@ -393,6 +439,7 @@ struct spraid_admin_command {
struct spraid_get_info get_info;
struct spraid_abort_cmd abort;
struct spraid_reset_cmd reset;
+ struct spraid_usr_cmd usr_cmd;
};
};
@@ -456,9 +503,6 @@ struct spraid_ioq_command {
};
};
-#define SPRAID_IOCTL_RESET_CMD _IOWR('N', 0x80, struct spraid_passthru_common_cmd)
-#define SPRAID_IOCTL_ADMIN_CMD _IOWR('N', 0x41, struct spraid_passthru_common_cmd)
-
struct spraid_passthru_common_cmd {
__u8 opcode;
__u8 flags;
@@ -494,8 +538,6 @@ struct spraid_passthru_common_cmd {
__u32 result1;
};
-#define SPRAID_IOCTL_IOQ_CMD _IOWR('N', 0x42, struct spraid_ioq_passthru_cmd)
-
struct spraid_ioq_passthru_cmd {
__u8 opcode;
__u8 flags;
@@ -560,7 +602,21 @@ struct spraid_ioq_passthru_cmd {
__u32 result1;
};
-struct spraid_ioq_ptcmd {
+struct spraid_bsg_request {
+ u32 msgcode;
+ u32 control;
+ union {
+ struct spraid_passthru_common_cmd admcmd;
+ struct spraid_ioq_passthru_cmd ioqcmd;
+ };
+};
+
+enum {
+ SPRAID_BSG_ADM,
+ SPRAID_BSG_IOQ,
+};
+
+struct spraid_cmd {
int qid;
int cid;
u32 result0;
@@ -572,14 +628,6 @@ struct spraid_ioq_ptcmd {
struct list_head list;
};
-struct spraid_admin_request {
- struct spraid_admin_command *cmd;
- u32 result0;
- u32 result1;
- u16 flags;
- u16 status;
-};
-
struct spraid_queue {
struct spraid_dev *hdev;
spinlock_t sq_lock; /* spinlock for lock handling */
@@ -607,7 +655,6 @@ struct spraid_queue {
};
struct spraid_iod {
- struct spraid_admin_request req;
struct spraid_queue *spraidq;
enum spraid_cmd_state state;
int npages;
@@ -623,13 +670,51 @@ struct spraid_iod {
};
#define SPRAID_DEV_INFO_ATTR_BOOT(attr) ((attr) & 0x01)
-#define SPRAID_DEV_INFO_ATTR_HDD(attr) ((attr) & 0x02)
+#define SPRAID_DEV_INFO_ATTR_VD(attr) (((attr) & 0x02) == 0x0)
#define SPRAID_DEV_INFO_ATTR_PT(attr) (((attr) & 0x22) == 0x02)
#define SPRAID_DEV_INFO_ATTR_RAWDISK(attr) ((attr) & 0x20)
#define SPRAID_DEV_INFO_FLAG_VALID(flag) ((flag) & 0x01)
#define SPRAID_DEV_INFO_FLAG_CHANGE(flag) ((flag) & 0x02)
+#define BGTASK_TYPE_REBUILD 4
+#define USR_CMD_READ 0xc2
+#define USR_CMD_RDLEN 0x1000
+#define USR_CMD_VDINFO 0x704
+#define USR_CMD_BGTASK 0x504
+#define VDINFO_PARAM_LEN 0x04
+
+struct spraid_vd_info {
+ __u8 name[32];
+ __le16 id;
+ __u8 rg_id;
+ __u8 rg_level;
+ __u8 sg_num;
+ __u8 sg_disk_num;
+ __u8 vd_status;
+ __u8 vd_type;
+ __u8 rsvd1[4056];
+};
+
+#define MAX_REALTIME_BGTASK_NUM 32
+
+struct bgtask_info {
+ __u8 type;
+ __u8 progress;
+ __u8 rate;
+ __u8 rsvd0;
+ __le16 vd_id;
+ __le16 time_left;
+ __u8 rsvd1[4];
+};
+
+struct spraid_bgtask {
+ __u8 sw;
+ __u8 task_num;
+ __u8 rsvd[6];
+ struct bgtask_info bgtask[MAX_REALTIME_BGTASK_NUM];
+};
+
struct spraid_dev_info {
__le32 hdid;
__le16 target;
@@ -649,6 +734,11 @@ struct spraid_dev_list {
struct spraid_sdev_hostdata {
u32 hdid;
+ u16 max_io_kb;
+ u8 attr;
+ u8 flag;
+ u8 rg_id;
+ u8 rsvd[3];
};
#endif
diff --git a/drivers/scsi/spraid/spraid_main.c b/drivers/scsi/spraid/spraid_main.c
index a0a75ecb0027..c6d2a0b8e35e 100644
--- a/drivers/scsi/spraid/spraid_main.c
+++ b/drivers/scsi/spraid/spraid_main.c
@@ -1,8 +1,8 @@
// SPDX-License-Identifier: GPL-2.0
-/*
- * Linux spraid device driver
- * Copyright(c) 2021 Ramaxel Memory Technology, Ltd
- */
+/* Copyright(c) 2021 Ramaxel Memory Technology, Ltd */
+
+/* Ramaxel Raid SPXXX Series Linux Driver */
+
#define pr_fmt(fmt) "spraid: " fmt
#include <linux/sched/signal.h>
@@ -23,6 +23,9 @@
#include <linux/debugfs.h>
#include <linux/io-64-nonatomic-lo-hi.h>
#include <linux/blkdev.h>
+#include <linux/bsg-lib.h>
+#include <asm/unaligned.h>
+#include <linux/sort.h>
#include <scsi/scsi.h>
#include <scsi/scsi_cmnd.h>
@@ -31,27 +34,24 @@
#include <scsi/scsi_transport.h>
#include <scsi/scsi_dbg.h>
+
#include "spraid.h"
static u32 admin_tmout = 60;
module_param(admin_tmout, uint, 0644);
MODULE_PARM_DESC(admin_tmout, "admin commands timeout (seconds)");
-static u32 scmd_tmout_pt = 30;
-module_param(scmd_tmout_pt, uint, 0644);
-MODULE_PARM_DESC(scmd_tmout_pt, "scsi commands timeout for passthrough(seconds)");
+static u32 scmd_tmout_rawdisk = 180;
+module_param(scmd_tmout_rawdisk, uint, 0644);
+MODULE_PARM_DESC(scmd_tmout_rawdisk, "scsi commands timeout for rawdisk(seconds)");
-static u32 scmd_tmout_nonpt = 180;
-module_param(scmd_tmout_nonpt, uint, 0644);
-MODULE_PARM_DESC(scmd_tmout_nonpt, "scsi commands timeout for rawdisk&raid(seconds)");
+static u32 scmd_tmout_vd = 180;
+module_param(scmd_tmout_vd, uint, 0644);
+MODULE_PARM_DESC(scmd_tmout_vd, "scsi commands timeout for vd(seconds)");
-static u32 wait_abl_tmout = 3;
-module_param(wait_abl_tmout, uint, 0644);
-MODULE_PARM_DESC(wait_abl_tmout, "wait abnormal io timeout(seconds)");
-
-static bool use_sgl_force;
-module_param(use_sgl_force, bool, 0644);
-MODULE_PARM_DESC(use_sgl_force, "force IO use sgl format, default false");
+static bool max_io_force;
+module_param(max_io_force, bool, 0644);
+MODULE_PARM_DESC(max_io_force, "force max_hw_sectors_kb = 1024, default false(performance first)");
static int ioq_depth_set(const char *val, const struct kernel_param *kp);
static const struct kernel_param_ops ioq_depth_ops = {
@@ -106,16 +106,20 @@ static const struct kernel_param_ops small_pool_num_ops = {
.get = param_get_byte,
};
+/* It was found that the spinlock of a single pool is contended
+ * heavily when multiple CPUs submit I/O, so multiple pools are
+ * introduced to reduce the contention.
+ */
static unsigned char small_pool_num = 4;
module_param_cb(small_pool_num, &small_pool_num_ops, &small_pool_num, 0644);
MODULE_PARM_DESC(small_pool_num, "set prp small pool num, default 4, MAX 16");
static void spraid_free_queue(struct spraid_queue *spraidq);
static void spraid_handle_aen_notice(struct spraid_dev *hdev, u32 result);
-static void spraid_handle_aen_vs(struct spraid_dev *hdev, u32 result);
+static void spraid_handle_aen_vs(struct spraid_dev *hdev, u32 result, u32 result1);
static DEFINE_IDA(spraid_instance_ida);
-static dev_t spraid_chr_devt;
+
static struct class *spraid_class;
#define SPRAID_CAP_TIMEOUT_UNIT_MS (HZ / 2)
@@ -131,9 +135,8 @@ static struct workqueue_struct *spraid_wq;
#define SPRAID_DRV_VERSION "1.0.0.0"
#define ADMIN_TIMEOUT (admin_tmout * HZ)
-#define ADMIN_ERR_TIMEOUT 32757
-#define SPRAID_WAIT_ABNL_CMD_TIMEOUT (wait_abl_tmout * 2)
+#define SPRAID_WAIT_ABNL_CMD_TIMEOUT (3 * 2)
#define SPRAID_DMA_MSK_BIT_MAX 64
@@ -147,6 +150,13 @@ enum FW_STAT_CODE {
FW_STAT_NEED_RETRY
};
+static const char * const raid_levels[] = {"0", "1", "5", "6", "10", "50", "60", "NA"};
+
+static const char * const raid_states[] = {
+ "NA", "NORMAL", "FAULT", "DEGRADE", "NOT_FORMATTED", "FORMATTING", "SANITIZING",
+ "INITIALIZING", "INITIALIZE_FAIL", "DELETING", "DELETE_FAIL", "WRITE_PROTECT"
+};
+
static int ioq_depth_set(const char *val, const struct kernel_param *kp)
{
int n = 0;
@@ -231,12 +241,6 @@ static int spraid_pci_enable(struct spraid_dev *hdev)
goto disable;
}
- ret = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_ALL_TYPES);
- if (ret < 0) {
- dev_err(hdev->dev, "Allocate one IRQ for setup admin channel failed\n");
- goto disable;
- }
-
hdev->cap = lo_hi_readq(hdev->bar + SPRAID_REG_CAP);
hdev->ioq_depth = min_t(u32, SPRAID_CAP_MQES(hdev->cap) + 1, io_queue_depth);
hdev->db_stride = 1 << SPRAID_CAP_STRIDE(hdev->cap);
@@ -246,13 +250,20 @@ static int spraid_pci_enable(struct spraid_dev *hdev)
dev_err(hdev->dev, "err, dma mask invalid[%llu], set to default\n", maskbit);
maskbit = SPRAID_DMA_MSK_BIT_MAX;
}
- if (dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(maskbit))) {
+ if (dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(maskbit)) &&
+ dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32))) {
dev_err(hdev->dev, "set dma mask and coherent failed\n");
goto disable;
}
dev_info(hdev->dev, "set dma mask[%llu] success\n", maskbit);
+ ret = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_ALL_TYPES);
+ if (ret < 0) {
+ dev_err(hdev->dev, "Allocate one IRQ for setup admin channel failed\n");
+ goto disable;
+ }
+
pci_enable_pcie_error_reporting(pdev);
pci_save_state(pdev);
@@ -263,12 +274,6 @@ static int spraid_pci_enable(struct spraid_dev *hdev)
return ret;
}
-static inline
-struct spraid_admin_request *spraid_admin_req(struct request *req)
-{
- return blk_mq_rq_to_pdu(req);
-}
-
static int spraid_npages_prp(u32 size, struct spraid_dev *hdev)
{
u32 nprps = DIV_ROUND_UP(size + hdev->page_size, hdev->page_size);
@@ -419,7 +424,7 @@ static void spraid_submit_cmd(struct spraid_queue *spraidq, const void *cmd)
writel(spraidq->sq_tail, spraidq->q_db);
spin_unlock_irqrestore(&spraidq->sq_lock, flags);
- dev_log_dbg(spraidq->hdev->dev, "cid[%d], qid[%d], opcode[0x%x], flags[0x%x], hdid[%u]\n",
+ dev_log_dbg(spraidq->hdev->dev, "cid[%d] qid[%d], opcode[0x%x], flags[0x%x], hdid[%u]\n",
acd->command_id, spraidq->qid, acd->opcode, acd->flags, le32_to_cpu(acd->hdid));
}
@@ -605,18 +610,15 @@ static void spraid_setup_rw_cmd(struct spraid_dev *hdev,
if (scmd->cmd_len == 6) {
datalength = (u32)(scmd->cmnd[4] == 0 ?
IO_6_DEFAULT_TX_LEN : scmd->cmnd[4]);
- start_lba_lo = ((u32)scmd->cmnd[1] << 16) |
- ((u32)scmd->cmnd[2] << 8) | (u32)scmd->cmnd[3];
+ start_lba_lo = (u32)get_unaligned_be24(&scmd->cmnd[1]);
start_lba_lo &= 0x1FFFFF;
}
/* 10-byte READ(0x28) or WRITE(0x2A) cdb */
else if (scmd->cmd_len == 10) {
- datalength = (u32)scmd->cmnd[8] | ((u32)scmd->cmnd[7] << 8);
- start_lba_lo = ((u32)scmd->cmnd[2] << 24) |
- ((u32)scmd->cmnd[3] << 16) |
- ((u32)scmd->cmnd[4] << 8) | (u32)scmd->cmnd[5];
+ datalength = (u32)get_unaligned_be16(&scmd->cmnd[7]);
+ start_lba_lo = get_unaligned_be32(&scmd->cmnd[2]);
if (scmd->cmnd[1] & FUA_MASK)
control |= SPRAID_RW_FUA;
@@ -624,42 +626,26 @@ static void spraid_setup_rw_cmd(struct spraid_dev *hdev,
/* 12-byte READ(0xA8) or WRITE(0xAA) cdb */
else if (scmd->cmd_len == 12) {
- datalength = ((u32)scmd->cmnd[6] << 24) |
- ((u32)scmd->cmnd[7] << 16) |
- ((u32)scmd->cmnd[8] << 8) | (u32)scmd->cmnd[9];
- start_lba_lo = ((u32)scmd->cmnd[2] << 24) |
- ((u32)scmd->cmnd[3] << 16) |
- ((u32)scmd->cmnd[4] << 8) | (u32)scmd->cmnd[5];
+ datalength = get_unaligned_be32(&scmd->cmnd[6]);
+ start_lba_lo = get_unaligned_be32(&scmd->cmnd[2]);
if (scmd->cmnd[1] & FUA_MASK)
control |= SPRAID_RW_FUA;
}
/* 16-byte READ(0x88) or WRITE(0x8A) cdb */
else if (scmd->cmd_len == 16) {
- datalength = ((u32)scmd->cmnd[10] << 24) |
- ((u32)scmd->cmnd[11] << 16) |
- ((u32)scmd->cmnd[12] << 8) | (u32)scmd->cmnd[13];
- start_lba_lo = ((u32)scmd->cmnd[6] << 24) |
- ((u32)scmd->cmnd[7] << 16) |
- ((u32)scmd->cmnd[8] << 8) | (u32)scmd->cmnd[9];
- start_lba_hi = ((u32)scmd->cmnd[2] << 24) |
- ((u32)scmd->cmnd[3] << 16) |
- ((u32)scmd->cmnd[4] << 8) | (u32)scmd->cmnd[5];
+ datalength = get_unaligned_be32(&scmd->cmnd[10]);
+ start_lba_lo = get_unaligned_be32(&scmd->cmnd[6]);
+ start_lba_hi = get_unaligned_be32(&scmd->cmnd[2]);
if (scmd->cmnd[1] & FUA_MASK)
control |= SPRAID_RW_FUA;
}
/* 32-byte READ(0x88) or WRITE(0x8A) cdb */
else if (scmd->cmd_len == 32) {
- datalength = ((u32)scmd->cmnd[28] << 24) |
- ((u32)scmd->cmnd[29] << 16) |
- ((u32)scmd->cmnd[30] << 8) | (u32)scmd->cmnd[31];
- start_lba_lo = ((u32)scmd->cmnd[16] << 24) |
- ((u32)scmd->cmnd[17] << 16) |
- ((u32)scmd->cmnd[18] << 8) | (u32)scmd->cmnd[19];
- start_lba_hi = ((u32)scmd->cmnd[12] << 24) |
- ((u32)scmd->cmnd[13] << 16) |
- ((u32)scmd->cmnd[14] << 8) | (u32)scmd->cmnd[15];
+ datalength = get_unaligned_be32(&scmd->cmnd[28]);
+ start_lba_lo = get_unaligned_be32(&scmd->cmnd[16]);
+ start_lba_hi = get_unaligned_be32(&scmd->cmnd[12]);
if (scmd->cmnd[10] & FUA_MASK)
control |= SPRAID_RW_FUA;
@@ -814,7 +800,7 @@ static void spraid_map_status(struct spraid_iod *iod, struct scsi_cmnd *scmd,
if (scmd->result & SAM_STAT_CHECK_CONDITION) {
memset(scmd->sense_buffer, 0, SCSI_SENSE_BUFFERSIZE);
memcpy(scmd->sense_buffer, iod->sense, SCSI_SENSE_BUFFERSIZE);
- set_driver_byte(scmd, DRIVER_SENSE);
+ scmd->result = (scmd->result & 0x00ffffff) | (DRIVER_SENSE << 24);
}
break;
case FW_STAT_ABORTED:
@@ -825,6 +811,8 @@ static void spraid_map_status(struct spraid_iod *iod, struct scsi_cmnd *scmd,
break;
default:
set_host_byte(scmd, DID_BAD_TARGET);
+ dev_warn(iod->spraidq->hdev->dev, "[%s] cid[%d] qid[%d] bad status[0x%x]\n",
+ __func__, cqe->cmd_id, le16_to_cpu(cqe->sq_id), le16_to_cpu(cqe->status));
break;
}
}
@@ -850,14 +838,13 @@ static int spraid_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
int ret;
if (unlikely(!scmd)) {
- dev_err(hdev->dev, "err, scmd is null, return 0\n");
+ dev_err(hdev->dev, "err, scmd is null\n");
return 0;
}
if (unlikely(hdev->state != SPRAID_LIVE)) {
set_host_byte(scmd, DID_NO_CONNECT);
scmd->scsi_done(scmd);
- dev_err(hdev->dev, "[%s] err, hdev state is not live\n", __func__);
return 0;
}
@@ -894,7 +881,7 @@ static int spraid_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
WRITE_ONCE(iod->state, SPRAID_CMD_IN_FLIGHT);
spraid_submit_cmd(ioq, &ioq_cmd);
elapsed = jiffies - scmd->jiffies_at_alloc;
- dev_log_dbg(hdev->dev, "cid[%d], qid[%d] submit IO cost %3ld.%3ld seconds\n",
+ dev_log_dbg(hdev->dev, "cid[%d] qid[%d] submit IO cost %3ld.%3ld seconds\n",
cid, hwq, elapsed / HZ, elapsed % HZ);
return 0;
@@ -945,6 +932,10 @@ static int spraid_slave_alloc(struct scsi_device *sdev)
scan_host:
hostdata->hdid = le32_to_cpu(hdev->devices[idx].hdid);
+ hostdata->max_io_kb = le16_to_cpu(hdev->devices[idx].max_io_kb);
+ hostdata->attr = hdev->devices[idx].attr;
+ hostdata->flag = hdev->devices[idx].flag;
+ hostdata->rg_id = 0xff;
sdev->hostdata = hostdata;
up_read(&hdev->devices_rwsem);
return 0;
@@ -958,30 +949,17 @@ static void spraid_slave_destroy(struct scsi_device *sdev)
static int spraid_slave_configure(struct scsi_device *sdev)
{
- u16 idx;
- unsigned int timeout = scmd_tmout_nonpt * HZ;
+ unsigned int timeout = scmd_tmout_rawdisk * HZ;
struct spraid_dev *hdev = shost_priv(sdev->host);
struct spraid_sdev_hostdata *hostdata = sdev->hostdata;
u32 max_sec = sdev->host->max_sectors;
- if (!hostdata) {
- idx = hostdata->hdid - 1;
- if (sdev->channel == hdev->devices[idx].channel &&
- sdev->id == le16_to_cpu(hdev->devices[idx].target) &&
- sdev->lun < hdev->devices[idx].lun) {
- if (SPRAID_DEV_INFO_ATTR_PT(hdev->devices[idx].attr))
- timeout = scmd_tmout_pt * HZ;
- else
- timeout = scmd_tmout_nonpt * HZ;
- max_sec = le16_to_cpu(hdev->devices[idx].max_io_kb) << 1;
- } else {
- dev_err(hdev->dev, "[%s] err, sdev->channel:id:lun[%d:%d:%lld];"
- "devices[%d], channel:target:lun[%d:%d:%d]\n",
- __func__, sdev->channel, sdev->id, sdev->lun,
- idx, hdev->devices[idx].channel,
- hdev->devices[idx].target,
- hdev->devices[idx].lun);
- }
+ if (hostdata) {
+ if (SPRAID_DEV_INFO_ATTR_VD(hostdata->attr))
+ timeout = scmd_tmout_vd * HZ;
+ else if (SPRAID_DEV_INFO_ATTR_RAWDISK(hostdata->attr))
+ timeout = scmd_tmout_rawdisk * HZ;
+ max_sec = hostdata->max_io_kb << 1;
} else {
dev_err(hdev->dev, "[%s] err, sdev->hostdata is null\n", __func__);
}
@@ -991,7 +969,9 @@ static int spraid_slave_configure(struct scsi_device *sdev)
if ((max_sec == 0) || (max_sec > sdev->host->max_sectors))
max_sec = sdev->host->max_sectors;
- blk_queue_max_hw_sectors(sdev->request_queue, max_sec);
+
+ if (!max_io_force)
+ blk_queue_max_hw_sectors(sdev->request_queue, max_sec);
dev_info(hdev->dev, "[%s] sdev->channel:id:lun[%d:%d:%lld], scmd_timeout[%d]s, maxsec[%d]\n",
__func__, sdev->channel, sdev->id, sdev->lun, timeout / HZ, max_sec);
@@ -1176,6 +1156,75 @@ static inline bool spraid_cqe_pending(struct spraid_queue *spraidq)
spraidq->cq_phase;
}
+static void spraid_sata_report_zone_handle(struct scsi_cmnd *scmd, struct spraid_iod *iod)
+{
+ int i = 0;
+ unsigned int bytes = 0;
+ struct scatterlist *sg = scsi_sglist(scmd);
+
+ scsi_for_each_sg(scmd, sg, iod->nsge, i) {
+ unsigned int offset = 0;
+
+ if (bytes == 0) {
+ char *hdr;
+ u32 list_length;
+ u64 max_lba, opt_lba;
+ u16 same;
+
+ hdr = sg_virt(sg);
+
+ list_length = get_unaligned_le32(&hdr[0]);
+ same = get_unaligned_le16(&hdr[4]);
+ max_lba = get_unaligned_le64(&hdr[8]);
+ opt_lba = get_unaligned_le64(&hdr[16]);
+ put_unaligned_be32(list_length, &hdr[0]);
+ hdr[4] = same & 0xf;
+ put_unaligned_be64(max_lba, &hdr[8]);
+ put_unaligned_be64(opt_lba, &hdr[16]);
+ offset += 64;
+ bytes += 64;
+ }
+ while (offset < sg_dma_len(sg)) {
+ char *rec;
+ u8 cond, type, non_seq, reset;
+ u64 size, start, wp;
+
+ rec = sg_virt(sg) + offset;
+ type = rec[0] & 0xf;
+ cond = (rec[1] >> 4) & 0xf;
+ non_seq = (rec[1] & 2);
+ reset = (rec[1] & 1);
+ size = get_unaligned_le64(&rec[8]);
+ start = get_unaligned_le64(&rec[16]);
+ wp = get_unaligned_le64(&rec[24]);
+ rec[0] = type;
+ rec[1] = (cond << 4) | non_seq | reset;
+ put_unaligned_be64(size, &rec[8]);
+ put_unaligned_be64(start, &rec[16]);
+ put_unaligned_be64(wp, &rec[24]);
+ WARN_ON(offset + 64 > sg_dma_len(sg));
+ offset += 64;
+ bytes += 64;
+ }
+ }
+}
+
+static inline void spraid_handle_ata_cmd(struct spraid_dev *hdev, struct scsi_cmnd *scmd,
+ struct spraid_iod *iod)
+{
+ if (hdev->ctrl_info->card_type != SPRAID_CARD_HBA)
+ return;
+
+ switch (scmd->cmnd[0]) {
+ case ZBC_IN:
+ dev_info(hdev->dev, "[%s] process report zone\n", __func__);
+ spraid_sata_report_zone_handle(scmd, iod);
+ break;
+ default:
+ break;
+ }
+}
+
static void spraid_complete_ioq_cmnd(struct spraid_queue *ioq, struct spraid_completion *cqe)
{
struct spraid_dev *hdev = ioq->hdev;
@@ -1197,12 +1246,12 @@ static void spraid_complete_ioq_cmnd(struct spraid_queue *ioq, struct spraid_com
iod = scsi_cmd_priv(scmd);
elapsed = jiffies - scmd->jiffies_at_alloc;
- dev_log_dbg(hdev->dev, "cid[%d], qid[%d] finish IO cost %3ld.%3ld seconds\n",
+ dev_log_dbg(hdev->dev, "cid[%d] qid[%d] finish IO cost %3ld.%3ld seconds\n",
cqe->cmd_id, ioq->qid, elapsed / HZ, elapsed % HZ);
if (cmpxchg(&iod->state, SPRAID_CMD_IN_FLIGHT, SPRAID_CMD_COMPLETE) !=
SPRAID_CMD_IN_FLIGHT) {
- dev_warn(hdev->dev, "cid[%d], qid[%d] enters abnormal handler, cost %3ld.%3ld seconds\n",
+ dev_warn(hdev->dev, "cid[%d] qid[%d] enters abnormal handler, cost %3ld.%3ld seconds\n",
cqe->cmd_id, ioq->qid, elapsed / HZ, elapsed % HZ);
WRITE_ONCE(iod->state, SPRAID_CMD_TMO_COMPLETE);
@@ -1215,6 +1264,8 @@ static void spraid_complete_ioq_cmnd(struct spraid_queue *ioq, struct spraid_com
return;
}
+ spraid_handle_ata_cmd(hdev, scmd, iod);
+
spraid_map_status(iod, scmd, cqe);
if (iod->nsge) {
iod->nsge = 0;
@@ -1224,38 +1275,36 @@ static void spraid_complete_ioq_cmnd(struct spraid_queue *ioq, struct spraid_com
scmd->scsi_done(scmd);
}
-static inline void spraid_end_admin_request(struct request *req, __le16 status,
- __le32 result0, __le32 result1)
-{
- struct spraid_admin_request *rq = spraid_admin_req(req);
-
- rq->status = le16_to_cpu(status) >> 1;
- rq->result0 = le32_to_cpu(result0);
- rq->result1 = le32_to_cpu(result1);
- blk_mq_complete_request(req);
-}
-
static void spraid_complete_adminq_cmnd(struct spraid_queue *adminq, struct spraid_completion *cqe)
{
- struct blk_mq_tags *tags = adminq->hdev->admin_tagset.tags[0];
- struct request *req;
+ struct spraid_dev *hdev = adminq->hdev;
+ struct spraid_cmd *adm_cmd;
- req = blk_mq_tag_to_rq(tags, cqe->cmd_id);
- if (unlikely(!req)) {
+ adm_cmd = hdev->adm_cmds + cqe->cmd_id;
+ if (unlikely(adm_cmd->state == SPRAID_CMD_IDLE)) {
dev_warn(adminq->hdev->dev, "Invalid id %d completed on queue %d\n",
cqe->cmd_id, le16_to_cpu(cqe->sq_id));
return;
}
- spraid_end_admin_request(req, cqe->status, cqe->result, cqe->result1);
+
+ adm_cmd->status = le16_to_cpu(cqe->status) >> 1;
+ adm_cmd->result0 = le32_to_cpu(cqe->result);
+ adm_cmd->result1 = le32_to_cpu(cqe->result1);
+
+ complete(&adm_cmd->cmd_done);
}
+static void spraid_send_aen(struct spraid_dev *hdev, u16 cid);
+
static void spraid_complete_aen(struct spraid_queue *spraidq, struct spraid_completion *cqe)
{
struct spraid_dev *hdev = spraidq->hdev;
u32 result = le32_to_cpu(cqe->result);
- dev_info(hdev->dev, "rcv aen, status[%x], result[%x]\n",
- le16_to_cpu(cqe->status) >> 1, result);
+ dev_info(hdev->dev, "rcv aen, cid[%d], status[0x%x], result[0x%x]\n",
+ cqe->cmd_id, le16_to_cpu(cqe->status) >> 1, result);
+
+ spraid_send_aen(hdev, cqe->cmd_id);
if ((le16_to_cpu(cqe->status) >> 1) != SPRAID_SC_SUCCESS)
return;
@@ -1264,22 +1313,19 @@ static void spraid_complete_aen(struct spraid_queue *spraidq, struct spraid_comp
spraid_handle_aen_notice(hdev, result);
break;
case SPRAID_AEN_VS:
- spraid_handle_aen_vs(hdev, result);
+ spraid_handle_aen_vs(hdev, result, le32_to_cpu(cqe->result1));
break;
default:
dev_warn(hdev->dev, "Unsupported async event type: %u\n",
result & 0x7);
break;
}
- queue_work(spraid_wq, &hdev->aen_work);
}
-static void spraid_put_ioq_ptcmd(struct spraid_dev *hdev, struct spraid_ioq_ptcmd *cmd);
-
static void spraid_complete_ioq_sync_cmnd(struct spraid_queue *ioq, struct spraid_completion *cqe)
{
struct spraid_dev *hdev = ioq->hdev;
- struct spraid_ioq_ptcmd *ptcmd;
+ struct spraid_cmd *ptcmd;
ptcmd = hdev->ioq_ptcmds + (ioq->qid - 1) * SPRAID_PTCMDS_PERQ +
cqe->cmd_id - SPRAID_IO_BLK_MQ_DEPTH;
@@ -1289,8 +1335,6 @@ static void spraid_complete_ioq_sync_cmnd(struct spraid_queue *ioq, struct sprai
ptcmd->result1 = le32_to_cpu(cqe->result1);
complete(&ptcmd->cmd_done);
-
- spraid_put_ioq_ptcmd(hdev, ptcmd);
}
static inline void spraid_handle_cqe(struct spraid_queue *spraidq, u16 idx)
@@ -1304,7 +1348,7 @@ static inline void spraid_handle_cqe(struct spraid_queue *spraidq, u16 idx)
return;
}
- dev_log_dbg(hdev->dev, "cid[%d], qid[%d], result[0x%x], sq_id[%d], status[0x%x]\n",
+ dev_log_dbg(hdev->dev, "cid[%d] qid[%d], result[0x%x], sq_id[%d], status[0x%x]\n",
cqe->cmd_id, spraidq->qid, le32_to_cpu(cqe->result),
le16_to_cpu(cqe->sq_id), le16_to_cpu(cqe->status));
@@ -1452,62 +1496,119 @@ static u32 spraid_bar_size(struct spraid_dev *hdev, u32 nr_ioqs)
return (SPRAID_REG_DBS + ((nr_ioqs + 1) * 8 * hdev->db_stride));
}
-static inline void spraid_clear_spraid_request(struct request *req)
+static int spraid_alloc_admin_cmds(struct spraid_dev *hdev)
{
- if (!(req->rq_flags & RQF_DONTPREP)) {
- spraid_admin_req(req)->flags = 0;
- req->rq_flags |= RQF_DONTPREP;
+ int i;
+
+ INIT_LIST_HEAD(&hdev->adm_cmd_list);
+ spin_lock_init(&hdev->adm_cmd_lock);
+
+ hdev->adm_cmds = kcalloc_node(SPRAID_AQ_BLK_MQ_DEPTH, sizeof(struct spraid_cmd),
+ GFP_KERNEL, hdev->numa_node);
+
+ if (!hdev->adm_cmds) {
+ dev_err(hdev->dev, "Alloc admin cmds failed\n");
+ return -ENOMEM;
+ }
+
+ for (i = 0; i < SPRAID_AQ_BLK_MQ_DEPTH; i++) {
+ hdev->adm_cmds[i].qid = 0;
+ hdev->adm_cmds[i].cid = i;
+ list_add_tail(&(hdev->adm_cmds[i].list), &hdev->adm_cmd_list);
}
+
+ dev_info(hdev->dev, "Alloc admin cmds success, num[%d]\n", SPRAID_AQ_BLK_MQ_DEPTH);
+
+ return 0;
}
-static struct request *spraid_alloc_admin_request(struct request_queue *q,
- struct spraid_admin_command *cmd,
- blk_mq_req_flags_t flags)
+static void spraid_free_admin_cmds(struct spraid_dev *hdev)
{
- u32 op = COMMAND_IS_WRITE(cmd) ? REQ_OP_DRV_OUT : REQ_OP_DRV_IN;
- struct request *req;
+ kfree(hdev->adm_cmds);
+ hdev->adm_cmds = NULL;
+ INIT_LIST_HEAD(&hdev->adm_cmd_list);
+}
+
+static struct spraid_cmd *spraid_get_cmd(struct spraid_dev *hdev, enum spraid_cmd_type type)
+{
+ struct spraid_cmd *cmd = NULL;
+ unsigned long flags;
+ struct list_head *head = &hdev->adm_cmd_list;
+ spinlock_t *slock = &hdev->adm_cmd_lock;
+
+ if (type == SPRAID_CMD_IOPT) {
+ head = &hdev->ioq_pt_list;
+ slock = &hdev->ioq_pt_lock;
+ }
+
+ spin_lock_irqsave(slock, flags);
+ if (list_empty(head)) {
+ spin_unlock_irqrestore(slock, flags);
+ dev_err(hdev->dev, "err, cmd[%d] list empty\n", type);
+ return NULL;
+ }
+ cmd = list_entry(head->next, struct spraid_cmd, list);
+ list_del_init(&cmd->list);
+ spin_unlock_irqrestore(slock, flags);
- req = blk_mq_alloc_request(q, op, flags);
- if (IS_ERR(req))
- return req;
- req->cmd_flags |= REQ_FAILFAST_DRIVER;
- spraid_clear_spraid_request(req);
- spraid_admin_req(req)->cmd = cmd;
+ WRITE_ONCE(cmd->state, SPRAID_CMD_IN_FLIGHT);
- return req;
+ return cmd;
}
-static int spraid_submit_admin_sync_cmd(struct request_queue *q,
- struct spraid_admin_command *cmd,
- u32 *result, void *buffer,
- u32 bufflen, u32 timeout, int at_head, blk_mq_req_flags_t flags)
+static void spraid_put_cmd(struct spraid_dev *hdev, struct spraid_cmd *cmd,
+ enum spraid_cmd_type type)
{
- struct request *req;
- int ret;
+ unsigned long flags;
+ struct list_head *head = &hdev->adm_cmd_list;
+ spinlock_t *slock = &hdev->adm_cmd_lock;
- req = spraid_alloc_admin_request(q, cmd, flags);
- if (IS_ERR(req))
- return PTR_ERR(req);
+ if (type == SPRAID_CMD_IOPT) {
+ head = &hdev->ioq_pt_list;
+ slock = &hdev->ioq_pt_lock;
+ }
- req->timeout = timeout ? timeout : ADMIN_TIMEOUT;
- if (buffer && bufflen) {
- ret = blk_rq_map_kern(q, req, buffer, bufflen, GFP_KERNEL);
- if (ret)
- goto out;
+ spin_lock_irqsave(slock, flags);
+ WRITE_ONCE(cmd->state, SPRAID_CMD_IDLE);
+ list_add_tail(&cmd->list, head);
+ spin_unlock_irqrestore(slock, flags);
+}
+
+
+static int spraid_submit_admin_sync_cmd(struct spraid_dev *hdev, struct spraid_admin_command *cmd,
+ u32 *result0, u32 *result1, u32 timeout)
+{
+ struct spraid_cmd *adm_cmd = spraid_get_cmd(hdev, SPRAID_CMD_ADM);
+
+ if (!adm_cmd) {
+ dev_err(hdev->dev, "err, get admin cmd failed\n");
+ return -EFAULT;
}
- blk_execute_rq(req->q, NULL, req, at_head);
- if (result)
- *result = spraid_admin_req(req)->result0;
+ timeout = timeout ? timeout : ADMIN_TIMEOUT;
- if (spraid_admin_req(req)->flags & SPRAID_REQ_CANCELLED)
- ret = -EINTR;
- else
- ret = spraid_admin_req(req)->status;
+ init_completion(&adm_cmd->cmd_done);
-out:
- blk_mq_free_request(req);
- return ret;
+ cmd->common.command_id = adm_cmd->cid;
+ spraid_submit_cmd(&hdev->queues[0], cmd);
+
+ if (!wait_for_completion_timeout(&adm_cmd->cmd_done, timeout)) {
+ dev_err(hdev->dev, "[%s] cid[%d] qid[%d] timeout, opcode[0x%x] subopcode[0x%x]\n",
+ __func__, adm_cmd->cid, adm_cmd->qid, cmd->usr_cmd.opcode,
+ cmd->usr_cmd.info_0.subopcode);
+ WRITE_ONCE(adm_cmd->state, SPRAID_CMD_TIMEOUT);
+ spraid_put_cmd(hdev, adm_cmd, SPRAID_CMD_ADM);
+ return -ETIME;
+ }
+
+ if (result0)
+ *result0 = adm_cmd->result0;
+ if (result1)
+ *result1 = adm_cmd->result1;
+
+ spraid_put_cmd(hdev, adm_cmd, SPRAID_CMD_ADM);
+
+ return adm_cmd->status;
}
static int spraid_create_cq(struct spraid_dev *hdev, u16 qid,
@@ -1524,8 +1625,7 @@ static int spraid_create_cq(struct spraid_dev *hdev, u16 qid,
admin_cmd.create_cq.cq_flags = cpu_to_le16(flags);
admin_cmd.create_cq.irq_vector = cpu_to_le16(cq_vector);
- return spraid_submit_admin_sync_cmd(hdev->admin_q, &admin_cmd, NULL,
- NULL, 0, 0, 0, 0);
+ return spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
}
static int spraid_create_sq(struct spraid_dev *hdev, u16 qid,
@@ -1542,8 +1642,7 @@ static int spraid_create_sq(struct spraid_dev *hdev, u16 qid,
admin_cmd.create_sq.sq_flags = cpu_to_le16(flags);
admin_cmd.create_sq.cqid = cpu_to_le16(qid);
- return spraid_submit_admin_sync_cmd(hdev->admin_q, &admin_cmd, NULL,
- NULL, 0, 0, 0, 0);
+ return spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
}
static void spraid_free_queue(struct spraid_queue *spraidq)
@@ -1581,8 +1680,7 @@ static int spraid_delete_queue(struct spraid_dev *hdev, u8 op, u16 id)
admin_cmd.delete_queue.opcode = op;
admin_cmd.delete_queue.qid = cpu_to_le16(id);
- ret = spraid_submit_admin_sync_cmd(hdev->admin_q, &admin_cmd, NULL,
- NULL, 0, 0, 0, 0);
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
if (ret)
dev_err(hdev->dev, "Delete %s:[%d] failed\n",
@@ -1663,19 +1761,28 @@ static int spraid_set_features(struct spraid_dev *hdev, u32 fid, u32 dword11, vo
size_t buflen, u32 *result)
{
struct spraid_admin_command admin_cmd;
- u32 res;
int ret;
+ u8 *data_ptr = NULL;
+ dma_addr_t data_dma = 0;
+
+ if (buffer && buflen) {
+ data_ptr = dma_alloc_coherent(hdev->dev, buflen, &data_dma, GFP_KERNEL);
+ if (!data_ptr)
+ return -ENOMEM;
+
+ memcpy(data_ptr, buffer, buflen);
+ }
memset(&admin_cmd, 0, sizeof(admin_cmd));
admin_cmd.features.opcode = SPRAID_ADMIN_SET_FEATURES;
admin_cmd.features.fid = cpu_to_le32(fid);
admin_cmd.features.dword11 = cpu_to_le32(dword11);
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
- ret = spraid_submit_admin_sync_cmd(hdev->admin_q, &admin_cmd, &res,
- buffer, buflen, 0, 0, 0);
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, result, NULL, 0);
- if (!ret && result)
- *result = res;
+ if (data_ptr)
+ dma_free_coherent(hdev->dev, buflen, data_ptr, data_dma);
return ret;
}
@@ -1764,8 +1871,7 @@ static int spraid_setup_io_queues(struct spraid_dev *hdev)
break;
}
dev_info(hdev->dev, "[%s] max_qid: %d, queue_count: %d, online_queue: %d, ioq_depth: %d\n",
- __func__, hdev->max_qid, hdev->queue_count,
- hdev->online_queues, hdev->ioq_depth);
+ __func__, hdev->max_qid, hdev->queue_count, hdev->online_queues, hdev->ioq_depth);
return spraid_create_io_queues(hdev);
}
@@ -1889,10 +1995,11 @@ static int spraid_get_dev_list(struct spraid_dev *hdev, struct spraid_dev_info *
u32 nd = le32_to_cpu(hdev->ctrl_info->nd);
struct spraid_admin_command admin_cmd;
struct spraid_dev_list *list_buf;
+ dma_addr_t data_dma = 0;
u32 i, idx, hdid, ndev;
int ret = 0;
- list_buf = kmalloc(sizeof(*list_buf), GFP_KERNEL);
+ list_buf = dma_alloc_coherent(hdev->dev, PAGE_SIZE, &data_dma, GFP_KERNEL);
if (!list_buf)
return -ENOMEM;
@@ -1901,9 +2008,9 @@ static int spraid_get_dev_list(struct spraid_dev *hdev, struct spraid_dev_info *
admin_cmd.get_info.opcode = SPRAID_ADMIN_GET_INFO;
admin_cmd.get_info.type = SPRAID_GET_INFO_DEV_LIST;
admin_cmd.get_info.cdw11 = cpu_to_le32(idx);
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
- ret = spraid_submit_admin_sync_cmd(hdev->admin_q, &admin_cmd, NULL, list_buf,
- sizeof(*list_buf), 0, 0, 0);
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
if (ret) {
dev_err(hdev->dev, "Get device list failed, nd: %u, idx: %u, ret: %d\n",
@@ -1916,12 +2023,11 @@ static int spraid_get_dev_list(struct spraid_dev *hdev, struct spraid_dev_info *
for (i = 0; i < ndev; i++) {
hdid = le32_to_cpu(list_buf->devices[i].hdid);
- dev_info(hdev->dev, "list_buf->devices[%d], hdid: %u target: %d, channel: %d, lun: %d, attr[%x]\n",
- i, hdid,
- le16_to_cpu(list_buf->devices[i].target),
- list_buf->devices[i].channel,
- list_buf->devices[i].lun,
- list_buf->devices[i].attr);
+ dev_info(hdev->dev, "list_buf->devices[%d], hdid: %u target: %d, channel: %d, lun: %d, attr[0x%x]\n",
+ i, hdid, le16_to_cpu(list_buf->devices[i].target),
+ list_buf->devices[i].channel,
+ list_buf->devices[i].lun,
+ list_buf->devices[i].attr);
if (hdid > nd || hdid == 0) {
dev_err(hdev->dev, "err, hdid[%d] invalid\n", hdid);
continue;
@@ -1936,21 +2042,29 @@ static int spraid_get_dev_list(struct spraid_dev *hdev, struct spraid_dev_info *
}
out:
- kfree(list_buf);
+ dma_free_coherent(hdev->dev, PAGE_SIZE, list_buf, data_dma);
return ret;
}
-static void spraid_send_aen(struct spraid_dev *hdev)
+static void spraid_send_aen(struct spraid_dev *hdev, u16 cid)
{
struct spraid_queue *adminq = &hdev->queues[0];
struct spraid_admin_command admin_cmd;
memset(&admin_cmd, 0, sizeof(admin_cmd));
admin_cmd.common.opcode = SPRAID_ADMIN_ASYNC_EVENT;
- admin_cmd.common.command_id = SPRAID_AQ_BLK_MQ_DEPTH;
+ admin_cmd.common.command_id = cid;
spraid_submit_cmd(adminq, &admin_cmd);
- dev_info(hdev->dev, "send aen, cid[%d]\n", SPRAID_AQ_BLK_MQ_DEPTH);
+ dev_info(hdev->dev, "send aen, cid[%d]\n", cid);
+}
+
+static inline void spraid_send_all_aen(struct spraid_dev *hdev)
+{
+ u16 i;
+
+ for (i = 0; i < hdev->ctrl_info->aerl; i++)
+ spraid_send_aen(hdev, i + SPRAID_AQ_BLK_MQ_DEPTH);
}
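/*
 * Note: the command ids used above, SPRAID_AQ_BLK_MQ_DEPTH up to
 * SPRAID_AQ_BLK_MQ_DEPTH + aerl - 1, sit past the ids used by regular
 * admin commands, so AEN completions can be told apart; aerl itself is
 * clamped to SPRAID_NR_AEN_COMMANDS in spraid_init_ctrl_info() below.
 */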
static int spraid_add_device(struct spraid_dev *hdev, struct spraid_dev_info *device)
@@ -1958,6 +2072,10 @@ static int spraid_add_device(struct spraid_dev *hdev, struct spraid_dev_info *de
struct Scsi_Host *shost = hdev->shost;
struct scsi_device *sdev;
+ dev_info(hdev->dev, "add device, hdid: %u target: %d, channel: %d, lun: %d, attr[0x%x]\n",
+ le32_to_cpu(device->hdid), le16_to_cpu(device->target),
+ device->channel, device->lun, device->attr);
+
sdev = scsi_device_lookup(shost, device->channel, le16_to_cpu(device->target), 0);
if (sdev) {
dev_warn(hdev->dev, "Device is already exist, channel: %d, target_id: %d, lun: %d\n",
@@ -1974,9 +2092,13 @@ static int spraid_rescan_device(struct spraid_dev *hdev, struct spraid_dev_info
struct Scsi_Host *shost = hdev->shost;
struct scsi_device *sdev;
+ dev_info(hdev->dev, "rescan device, hdid: %u target: %d, channel: %d, lun: %d, attr[0x%x]\n",
+ le32_to_cpu(device->hdid), le16_to_cpu(device->target),
+ device->channel, device->lun, device->attr);
+
sdev = scsi_device_lookup(shost, device->channel, le16_to_cpu(device->target), 0);
if (!sdev) {
- dev_warn(hdev->dev, "Device is not exit, channel: %d, target_id: %d, lun: %d\n",
+ dev_warn(hdev->dev, "device is not exit rescan it, channel: %d, target_id: %d, lun: %d\n",
device->channel, le16_to_cpu(device->target), 0);
return -ENODEV;
}
@@ -1991,9 +2113,13 @@ static int spraid_remove_device(struct spraid_dev *hdev, struct spraid_dev_info
struct Scsi_Host *shost = hdev->shost;
struct scsi_device *sdev;
+ dev_info(hdev->dev, "remove device, hdid: %u target: %d, channel: %d, lun: %d, attr[0x%x]\n",
+ le32_to_cpu(org_device->hdid), le16_to_cpu(org_device->target),
+ org_device->channel, org_device->lun, org_device->attr);
+
sdev = scsi_device_lookup(shost, org_device->channel, le16_to_cpu(org_device->target), 0);
if (!sdev) {
- dev_warn(hdev->dev, "Device is not exit, channel: %d, target_id: %d, lun: %d\n",
+ dev_warn(hdev->dev, "device is not exit remove it, channel: %d, target_id: %d, lun: %d\n",
org_device->channel, le16_to_cpu(org_device->target), 0);
return -ENODEV;
}
@@ -2029,36 +2155,54 @@ static int spraid_dev_list_init(struct spraid_dev *hdev)
return 0;
}
+static int luntarget_cmp_func(const void *l, const void *r)
+{
+ const struct spraid_dev_info *ln = l;
+ const struct spraid_dev_info *rn = r;
+
+ if (ln->channel == rn->channel)
+ return le16_to_cpu(ln->target) - le16_to_cpu(rn->target);
+
+ return ln->channel - rn->channel;
+}
+
static void spraid_scan_work(struct work_struct *work)
{
struct spraid_dev *hdev =
container_of(work, struct spraid_dev, scan_work);
struct spraid_dev_info *devices, *org_devices;
+ struct spraid_dev_info *sortdevice;
u32 nd = le32_to_cpu(hdev->ctrl_info->nd);
u8 flag, org_flag;
int i, ret;
+ int count = 0;
devices = kcalloc(nd, sizeof(struct spraid_dev_info), GFP_KERNEL);
if (!devices)
return;
+
+ sortdevice = kcalloc(nd, sizeof(struct spraid_dev_info), GFP_KERNEL);
+ if (!sortdevice)
+ goto free_list;
+
ret = spraid_get_dev_list(hdev, devices);
if (ret)
- goto free_list;
+ goto free_all;
org_devices = hdev->devices;
for (i = 0; i < nd; i++) {
org_flag = org_devices[i].flag;
flag = devices[i].flag;
- dev_log_dbg(hdev->dev, "i: %d, org_flag: 0x%x, flag: 0x%x\n",
- i, org_flag, flag);
+ dev_log_dbg(hdev->dev, "i: %d, org_flag: 0x%x, flag: 0x%x\n", i, org_flag, flag);
if (SPRAID_DEV_INFO_FLAG_VALID(flag)) {
if (!SPRAID_DEV_INFO_FLAG_VALID(org_flag)) {
down_write(&hdev->devices_rwsem);
memcpy(&org_devices[i], &devices[i],
- sizeof(struct spraid_dev_info));
+ sizeof(struct spraid_dev_info));
+ memcpy(&sortdevice[count++], &devices[i],
+ sizeof(struct spraid_dev_info));
up_write(&hdev->devices_rwsem);
- spraid_add_device(hdev, &devices[i]);
} else if (SPRAID_DEV_INFO_FLAG_CHANGE(flag)) {
spraid_rescan_device(hdev, &devices[i]);
}
@@ -2071,6 +2215,16 @@ static void spraid_scan_work(struct work_struct *work)
}
}
}
+
+ dev_info(hdev->dev, "scan work add device count = %d\n", count);
+
+ sort(sortdevice, count, sizeof(sortdevice[0]), luntarget_cmp_func, NULL);
+
+ for (i = 0; i < count; i++)
+ spraid_add_device(hdev, &sortdevice[i]);
+
+free_all:
+ kfree(sortdevice);
free_list:
kfree(devices);
}
@@ -2083,6 +2237,15 @@ static void spraid_timesyn_work(struct work_struct *work)
spraid_configure_timestamp(hdev);
}
+static int spraid_init_ctrl_info(struct spraid_dev *hdev);
+static void spraid_fw_act_work(struct work_struct *work)
+{
+ struct spraid_dev *hdev = container_of(work, struct spraid_dev, fw_act_work);
+
+ if (spraid_init_ctrl_info(hdev))
+ dev_err(hdev->dev, "get ctrl info failed after fw act\n");
+}
+
static void spraid_queue_scan(struct spraid_dev *hdev)
{
queue_work(spraid_wq, &hdev->scan_work);
@@ -2094,6 +2257,9 @@ static void spraid_handle_aen_notice(struct spraid_dev *hdev, u32 result)
case SPRAID_AEN_DEV_CHANGED:
spraid_queue_scan(hdev);
break;
+ case SPRAID_AEN_FW_ACT_START:
+ dev_info(hdev->dev, "fw activation starting\n");
+ break;
case SPRAID_AEN_HOST_PROBING:
break;
default:
@@ -2101,25 +2267,25 @@ static void spraid_handle_aen_notice(struct spraid_dev *hdev, u32 result)
}
}
-static void spraid_handle_aen_vs(struct spraid_dev *hdev, u32 result)
+static void spraid_handle_aen_vs(struct spraid_dev *hdev, u32 result, u32 result1)
{
- switch (result) {
+ switch ((result & 0xff00) >> 8) {
case SPRAID_AEN_TIMESYN:
queue_work(spraid_wq, &hdev->timesyn_work);
break;
+ case SPRAID_AEN_FW_ACT_FINISH:
+ dev_info(hdev->dev, "fw activation finish\n");
+ queue_work(spraid_wq, &hdev->fw_act_work);
+ break;
+ case SPRAID_AEN_EVENT_MIN ... SPRAID_AEN_EVENT_MAX:
+ dev_info(hdev->dev, "rcv card event[%d], param1[0x%x] param2[0x%x]\n",
+ (result & 0xff00) >> 8, result, result1);
+ break;
default:
- dev_warn(hdev->dev, "async event result: %x\n", result);
+ dev_warn(hdev->dev, "async event result: 0x%x\n", result);
}
}
-static void spraid_async_event_work(struct work_struct *work)
-{
- struct spraid_dev *hdev =
- container_of(work, struct spraid_dev, aen_work);
-
- spraid_send_aen(hdev);
-}
-
static int spraid_alloc_resources(struct spraid_dev *hdev)
{
int ret, nqueue;
@@ -2149,10 +2315,16 @@ static int spraid_alloc_resources(struct spraid_dev *hdev)
goto destroy_dma_pools;
}
+ ret = spraid_alloc_admin_cmds(hdev);
+ if (ret)
+ goto free_queues;
+
dev_info(hdev->dev, "[%s] queues num: %d\n", __func__, nqueue);
return 0;
+free_queues:
+ kfree(hdev->queues);
destroy_dma_pools:
spraid_destroy_dma_pools(hdev);
free_ctrl_info:
@@ -2164,50 +2336,18 @@ static int spraid_alloc_resources(struct spraid_dev *hdev)
static void spraid_free_resources(struct spraid_dev *hdev)
{
+ spraid_free_admin_cmds(hdev);
kfree(hdev->queues);
spraid_destroy_dma_pools(hdev);
kfree(hdev->ctrl_info);
ida_free(&spraid_instance_ida, hdev->instance);
}
-static void spraid_setup_passthrough(struct request *req, struct spraid_admin_command *cmd)
-{
- memcpy(cmd, spraid_admin_req(req)->cmd, sizeof(*cmd));
- cmd->common.flags &= ~SPRAID_CMD_FLAG_SGL_ALL;
-}
-
-static inline void spraid_clear_hreq(struct request *req)
-{
- if (!(req->rq_flags & RQF_DONTPREP)) {
- spraid_admin_req(req)->flags = 0;
- req->rq_flags |= RQF_DONTPREP;
- }
-}
-
-static blk_status_t spraid_setup_admin_cmd(struct request *req, struct spraid_admin_command *cmd)
+static void spraid_bsg_unmap_data(struct spraid_dev *hdev, struct bsg_job *job)
{
- spraid_clear_hreq(req);
-
- memset(cmd, 0, sizeof(*cmd));
- switch (req_op(req)) {
- case REQ_OP_DRV_IN:
- case REQ_OP_DRV_OUT:
- spraid_setup_passthrough(req, cmd);
- break;
- default:
- WARN_ON_ONCE(1);
- return BLK_STS_IOERR;
- }
-
- cmd->common.command_id = req->tag;
- return BLK_STS_OK;
-}
-
-static void spraid_unmap_data(struct spraid_dev *hdev, struct request *req)
-{
- struct spraid_iod *iod = blk_mq_rq_to_pdu(req);
- enum dma_data_direction dma_dir = rq_data_dir(req) ?
- DMA_TO_DEVICE : DMA_FROM_DEVICE;
+ struct request *rq = blk_mq_rq_from_pdu(job);
+ struct spraid_iod *iod = job->dd_data;
+ enum dma_data_direction dma_dir = rq_data_dir(rq) ? DMA_TO_DEVICE : DMA_FROM_DEVICE;
if (iod->nsge)
dma_unmap_sg(hdev->dev, iod->sg, iod->nsge, dma_dir);
@@ -2215,36 +2355,36 @@ static void spraid_unmap_data(struct spraid_dev *hdev, struct request *req)
spraid_free_iod_res(hdev, iod);
}
-static blk_status_t spraid_admin_map_data(struct spraid_dev *hdev, struct request *req,
- struct spraid_admin_command *cmd)
+static int spraid_bsg_map_data(struct spraid_dev *hdev, struct bsg_job *job,
+ struct spraid_admin_command *cmd)
{
- struct spraid_iod *iod = blk_mq_rq_to_pdu(req);
- struct request_queue *admin_q = req->q;
- enum dma_data_direction dma_dir = rq_data_dir(req) ?
- DMA_TO_DEVICE : DMA_FROM_DEVICE;
- blk_status_t ret = BLK_STS_IOERR;
- int nr_mapped;
- int res;
+ struct request *rq = blk_mq_rq_from_pdu(job);
+ struct spraid_iod *iod = job->dd_data;
+ enum dma_data_direction dma_dir = rq_data_dir(rq) ? DMA_TO_DEVICE : DMA_FROM_DEVICE;
+ int ret = 0;
+
+ iod->sg = job->request_payload.sg_list;
+ iod->nsge = job->request_payload.sg_cnt;
+ iod->length = job->request_payload.payload_len;
+ iod->use_sgl = false;
+ iod->npages = -1;
+ iod->sg_drv_mgmt = false;
- sg_init_table(iod->sg, blk_rq_nr_phys_segments(req));
- iod->nsge = blk_rq_map_sg(admin_q, req, iod->sg);
if (!iod->nsge)
goto out;
- dev_info(hdev->dev, "nseg: %u, nsge: %u\n",
- blk_rq_nr_phys_segments(req), iod->nsge);
-
- ret = BLK_STS_RESOURCE;
- nr_mapped = dma_map_sg_attrs(hdev->dev, iod->sg, iod->nsge, dma_dir, DMA_ATTR_NO_WARN);
- if (!nr_mapped)
+ ret = dma_map_sg_attrs(hdev->dev, iod->sg, iod->nsge, dma_dir, DMA_ATTR_NO_WARN);
+ if (!ret)
goto out;
- res = spraid_setup_prps(hdev, iod);
- if (res)
+ ret = spraid_setup_prps(hdev, iod);
+ if (ret)
goto unmap;
+
cmd->common.dptr.prp1 = cpu_to_le64(sg_dma_address(iod->sg));
cmd->common.dptr.prp2 = cpu_to_le64(iod->first_dma);
- return BLK_STS_OK;
+
+ return 0;
unmap:
dma_unmap_sg(hdev->dev, iod->sg, iod->nsge, dma_dir);
@@ -2252,137 +2392,29 @@ static blk_status_t spraid_admin_map_data(struct spraid_dev *hdev, struct reques
return ret;
}
-static blk_status_t spraid_init_admin_iod(struct request *rq, struct spraid_dev *hdev)
-{
- struct spraid_iod *iod = blk_mq_rq_to_pdu(rq);
- int nents = blk_rq_nr_phys_segments(rq);
- unsigned int size = blk_rq_payload_bytes(rq);
-
- if (nents > SPRAID_INT_PAGES || size > SPRAID_INT_BYTES(hdev)) {
- iod->sg = mempool_alloc(hdev->iod_mempool, GFP_ATOMIC);
- if (!iod->sg)
- return BLK_STS_RESOURCE;
- } else {
- iod->sg = iod->inline_sg;
- }
-
- iod->nsge = 0;
- iod->use_sgl = false;
- iod->npages = -1;
- iod->length = size;
- iod->sg_drv_mgmt = true;
-
- return BLK_STS_OK;
-}
-
-static blk_status_t spraid_queue_admin_rq(struct blk_mq_hw_ctx *hctx,
- const struct blk_mq_queue_data *bd)
-{
- struct spraid_queue *adminq = hctx->driver_data;
- struct spraid_dev *hdev = adminq->hdev;
- struct request *req = bd->rq;
- struct spraid_iod *iod = blk_mq_rq_to_pdu(req);
- struct spraid_admin_command cmd;
- blk_status_t ret;
-
- ret = spraid_setup_admin_cmd(req, &cmd);
- if (ret)
- goto out;
-
- ret = spraid_init_admin_iod(req, hdev);
- if (ret)
- goto out;
-
- if (blk_rq_nr_phys_segments(req)) {
- ret = spraid_admin_map_data(hdev, req, &cmd);
- if (ret)
- goto cleanup_iod;
- }
-
- blk_mq_start_request(req);
- spraid_submit_cmd(adminq, &cmd);
- return BLK_STS_OK;
-
-cleanup_iod:
- spraid_free_iod_res(hdev, iod);
-out:
- return ret;
-}
-
-static blk_status_t spraid_error_status(struct request *req)
-{
- switch (spraid_admin_req(req)->status & 0x7ff) {
- case SPRAID_SC_SUCCESS:
- return BLK_STS_OK;
- default:
- return BLK_STS_IOERR;
- }
-}
-
-static void spraid_complete_admin_rq(struct request *req)
-{
- struct spraid_iod *iod = blk_mq_rq_to_pdu(req);
- struct spraid_dev *hdev = iod->spraidq->hdev;
-
- if (blk_rq_nr_phys_segments(req))
- spraid_unmap_data(hdev, req);
- blk_mq_end_request(req, spraid_error_status(req));
-}
-
-static int spraid_admin_init_hctx(struct blk_mq_hw_ctx *hctx, void *data, unsigned int hctx_idx)
-{
- struct spraid_dev *hdev = data;
- struct spraid_queue *adminq = &hdev->queues[0];
-
- WARN_ON(hctx_idx != 0);
- WARN_ON(hdev->admin_tagset.tags[0] != hctx->tags);
-
- hctx->driver_data = adminq;
- return 0;
-}
-
-static int spraid_admin_init_request(struct blk_mq_tag_set *set, struct request *req,
- unsigned int hctx_idx, unsigned int numa_node)
-{
- struct spraid_dev *hdev = set->driver_data;
- struct spraid_iod *iod = blk_mq_rq_to_pdu(req);
- struct spraid_queue *adminq = &hdev->queues[0];
-
- WARN_ON(!adminq);
- iod->spraidq = adminq;
- return 0;
-}
-
-static enum blk_eh_timer_return
-spraid_admin_timeout(struct request *req, bool reserved)
-{
- struct spraid_iod *iod = blk_mq_rq_to_pdu(req);
- struct spraid_queue *spraidq = iod->spraidq;
- struct spraid_dev *hdev = spraidq->hdev;
-
- dev_err(hdev->dev, "Admin cid[%d] qid[%d] timeout\n",
- req->tag, spraidq->qid);
-
- if (spraid_poll_cq(spraidq, req->tag)) {
- dev_warn(hdev->dev, "cid[%d] qid[%d] timeout, completion polled\n",
- req->tag, spraidq->qid);
- return BLK_EH_DONE;
- }
-
- spraid_end_admin_request(req, cpu_to_le16(-EINVAL), 0, 0);
- return BLK_EH_DONE;
-}
-
static int spraid_get_ctrl_info(struct spraid_dev *hdev, struct spraid_ctrl_info *ctrl_info)
{
struct spraid_admin_command admin_cmd;
+ u8 *data_ptr = NULL;
+ dma_addr_t data_dma = 0;
+ int ret;
+
+ data_ptr = dma_alloc_coherent(hdev->dev, PAGE_SIZE, &data_dma, GFP_KERNEL);
+ if (!data_ptr)
+ return -ENOMEM;
memset(&admin_cmd, 0, sizeof(admin_cmd));
admin_cmd.get_info.opcode = SPRAID_ADMIN_GET_INFO;
admin_cmd.get_info.type = SPRAID_GET_INFO_CTRL;
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
+
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+ if (!ret)
+ memcpy(ctrl_info, data_ptr, sizeof(struct spraid_ctrl_info));
- return spraid_submit_admin_sync_cmd(hdev->admin_q, &admin_cmd, NULL,
- ctrl_info, sizeof(struct spraid_ctrl_info), 0, 0, 0);
+ dma_free_coherent(hdev->dev, PAGE_SIZE, data_ptr, data_dma);
+
+ return ret;
}
static int spraid_init_ctrl_info(struct spraid_dev *hdev)
@@ -2416,6 +2448,11 @@ static int spraid_init_ctrl_info(struct spraid_dev *hdev)
dev_info(hdev->dev, "[%s]sn = %s\n", __func__, hdev->ctrl_info->sn);
dev_info(hdev->dev, "[%s]fr = %s\n", __func__, hdev->ctrl_info->fr);
+ if (!hdev->ctrl_info->aerl)
+ hdev->ctrl_info->aerl = 1;
+ if (hdev->ctrl_info->aerl > SPRAID_NR_AEN_COMMANDS)
+ hdev->ctrl_info->aerl = SPRAID_NR_AEN_COMMANDS;
+
return 0;
}
@@ -2444,98 +2481,51 @@ static void spraid_free_iod_ext_mem_pool(struct spraid_dev *hdev)
mempool_destroy(hdev->iod_mempool);
}
-static int spraid_submit_user_cmd(struct request_queue *q, struct spraid_admin_command *cmd,
- void __user *ubuffer, unsigned int bufflen, u32 *result,
- unsigned int timeout)
+static int spraid_user_admin_cmd(struct spraid_dev *hdev, struct bsg_job *job)
{
- struct request *req;
- struct bio *bio = NULL;
- int ret;
-
- req = spraid_alloc_admin_request(q, cmd, 0);
- if (IS_ERR(req))
- return PTR_ERR(req);
+ struct spraid_bsg_request *bsg_req = job->request;
+ struct spraid_passthru_common_cmd *cmd = &(bsg_req->admcmd);
+ struct spraid_admin_command admin_cmd;
+ u32 timeout = msecs_to_jiffies(cmd->timeout_ms);
+ u32 result[2] = {0};
+ int status;
- req->timeout = timeout ? timeout : ADMIN_TIMEOUT;
- spraid_admin_req(req)->flags |= SPRAID_REQ_USERCMD;
+ if (hdev->state >= SPRAID_RESETTING) {
+ dev_err(hdev->dev, "[%s] err, host state:[%d] is not right\n",
+ __func__, hdev->state);
+ return -EBUSY;
+ }
- if (ubuffer && bufflen) {
- ret = blk_rq_map_user(q, req, NULL, ubuffer, bufflen, GFP_KERNEL);
- if (ret)
- goto out;
- bio = req->bio;
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.common.opcode = cmd->opcode;
+ admin_cmd.common.flags = cmd->flags;
+ admin_cmd.common.hdid = cpu_to_le32(cmd->nsid);
+ admin_cmd.common.cdw2[0] = cpu_to_le32(cmd->cdw2);
+ admin_cmd.common.cdw2[1] = cpu_to_le32(cmd->cdw3);
+ admin_cmd.common.cdw10 = cpu_to_le32(cmd->cdw10);
+ admin_cmd.common.cdw11 = cpu_to_le32(cmd->cdw11);
+ admin_cmd.common.cdw12 = cpu_to_le32(cmd->cdw12);
+ admin_cmd.common.cdw13 = cpu_to_le32(cmd->cdw13);
+ admin_cmd.common.cdw14 = cpu_to_le32(cmd->cdw14);
+ admin_cmd.common.cdw15 = cpu_to_le32(cmd->cdw15);
+
+ status = spraid_bsg_map_data(hdev, job, &admin_cmd);
+ if (status) {
+ dev_err(hdev->dev, "[%s] err, map data failed\n", __func__);
+ return status;
}
- blk_execute_rq(req->q, NULL, req, 0);
- if (spraid_admin_req(req)->flags & SPRAID_REQ_CANCELLED)
- ret = -EINTR;
- else
- ret = spraid_admin_req(req)->status;
- if (result) {
- result[0] = spraid_admin_req(req)->result0;
- result[1] = spraid_admin_req(req)->result1;
- }
- if (bio)
- blk_rq_unmap_user(bio);
-out:
- blk_mq_free_request(req);
- return ret;
-}
-static int spraid_user_admin_cmd(struct spraid_dev *hdev,
- struct spraid_passthru_common_cmd __user *ucmd)
-{
- struct spraid_passthru_common_cmd cmd;
- struct spraid_admin_command admin_cmd;
- u32 timeout = 0;
- int status;
-
- if (!capable(CAP_SYS_ADMIN)) {
- dev_err(hdev->dev, "Current user hasn't administrator right, reject service\n");
- return -EACCES;
- }
-
- if (copy_from_user(&cmd, ucmd, sizeof(cmd))) {
- dev_err(hdev->dev, "Copy command from user space to kernel space failed\n");
- return -EFAULT;
- }
-
- if (cmd.flags) {
- dev_err(hdev->dev, "Invalid flags in user command\n");
- return -EINVAL;
+ status = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, &result[0], &result[1], timeout);
+ if (status >= 0) {
+ job->reply_len = sizeof(result);
+ memcpy(job->reply, result, sizeof(result));
}
- dev_info(hdev->dev, "user_admin_cmd opcode: 0x%x, subopcode: 0x%x\n",
- cmd.opcode, cmd.cdw2 & 0x7ff);
+ if (status)
+ dev_info(hdev->dev, "[%s] opcode[0x%x] subopcode[0x%x], status[0x%x] result0[0x%x] result1[0x%x]\n",
+ __func__, cmd->opcode, cmd->info_0.subopcode, status, result[0], result[1]);
- memset(&admin_cmd, 0, sizeof(admin_cmd));
- admin_cmd.common.opcode = cmd.opcode;
- admin_cmd.common.flags = cmd.flags;
- admin_cmd.common.hdid = cpu_to_le32(cmd.nsid);
- admin_cmd.common.cdw2[0] = cpu_to_le32(cmd.cdw2);
- admin_cmd.common.cdw2[1] = cpu_to_le32(cmd.cdw3);
- admin_cmd.common.cdw10 = cpu_to_le32(cmd.cdw10);
- admin_cmd.common.cdw11 = cpu_to_le32(cmd.cdw11);
- admin_cmd.common.cdw12 = cpu_to_le32(cmd.cdw12);
- admin_cmd.common.cdw13 = cpu_to_le32(cmd.cdw13);
- admin_cmd.common.cdw14 = cpu_to_le32(cmd.cdw14);
- admin_cmd.common.cdw15 = cpu_to_le32(cmd.cdw15);
-
- if (cmd.timeout_ms)
- timeout = msecs_to_jiffies(cmd.timeout_ms);
-
- status = spraid_submit_user_cmd(hdev->admin_q, &admin_cmd,
- (void __user *)(uintptr_t)cmd.addr, cmd.info_1.data_len,
- &cmd.result0, timeout);
-
- dev_info(hdev->dev, "user_admin_cmd status: 0x%x, result0: 0x%x, result1: 0x%x\n",
- status, cmd.result0, cmd.result1);
-
- if (status >= 0) {
- if (put_user(cmd.result0, &ucmd->result0))
- return -EFAULT;
- if (put_user(cmd.result1, &ucmd->result1))
- return -EFAULT;
- }
+ spraid_bsg_unmap_data(hdev, job);
return status;
}
@@ -2548,8 +2538,8 @@ static int spraid_alloc_ioq_ptcmds(struct spraid_dev *hdev)
INIT_LIST_HEAD(&hdev->ioq_pt_list);
spin_lock_init(&hdev->ioq_pt_lock);
- hdev->ioq_ptcmds = kcalloc_node(ptnum, sizeof(struct spraid_ioq_ptcmd),
- GFP_KERNEL, hdev->numa_node);
+ hdev->ioq_ptcmds = kcalloc_node(ptnum, sizeof(struct spraid_cmd),
+ GFP_KERNEL, hdev->numa_node);
if (!hdev->ioq_ptcmds) {
dev_err(hdev->dev, "Alloc ioq_ptcmds failed\n");
@@ -2567,55 +2557,35 @@ static int spraid_alloc_ioq_ptcmds(struct spraid_dev *hdev)
return 0;
}
-static struct spraid_ioq_ptcmd *spraid_get_ioq_ptcmd(struct spraid_dev *hdev)
-{
- struct spraid_ioq_ptcmd *cmd = NULL;
- unsigned long flags;
-
- spin_lock_irqsave(&hdev->ioq_pt_lock, flags);
- if (list_empty(&hdev->ioq_pt_list)) {
- spin_unlock_irqrestore(&hdev->ioq_pt_lock, flags);
- dev_err(hdev->dev, "err, ioq ptcmd list empty\n");
- return NULL;
- }
- cmd = list_entry((&hdev->ioq_pt_list)->next, struct spraid_ioq_ptcmd, list);
- list_del_init(&cmd->list);
- spin_unlock_irqrestore(&hdev->ioq_pt_lock, flags);
-
- WRITE_ONCE(cmd->state, SPRAID_CMD_IDLE);
-
- return cmd;
-}
-
-static void spraid_put_ioq_ptcmd(struct spraid_dev *hdev, struct spraid_ioq_ptcmd *cmd)
+static void spraid_free_ioq_ptcmds(struct spraid_dev *hdev)
{
- unsigned long flags;
+ kfree(hdev->ioq_ptcmds);
+ hdev->ioq_ptcmds = NULL;
- spin_lock_irqsave(&hdev->ioq_pt_lock, flags);
- list_add(&cmd->list, (&hdev->ioq_pt_list)->next);
- spin_unlock_irqrestore(&hdev->ioq_pt_lock, flags);
+ INIT_LIST_HEAD(&hdev->ioq_pt_list);
}
static int spraid_submit_ioq_sync_cmd(struct spraid_dev *hdev, struct spraid_ioq_command *cmd,
- u32 *result, void **sense, u32 timeout)
+ u32 *result, u32 *reslen, u32 timeout)
{
- struct spraid_queue *ioq;
int ret;
dma_addr_t sense_dma;
- struct spraid_ioq_ptcmd *pt_cmd = spraid_get_ioq_ptcmd(hdev);
-
- *sense = NULL;
+ struct spraid_queue *ioq;
+ void *sense_addr = NULL;
+ struct spraid_cmd *pt_cmd = spraid_get_cmd(hdev, SPRAID_CMD_IOPT);
- if (!pt_cmd)
+ if (!pt_cmd) {
+ dev_err(hdev->dev, "err, get ioq cmd failed\n");
return -EFAULT;
+ }
- dev_info(hdev->dev, "[%s] ptcmd, cid[%d], qid[%d]\n", __func__, pt_cmd->cid, pt_cmd->qid);
+ timeout = timeout ? timeout : ADMIN_TIMEOUT;
init_completion(&pt_cmd->cmd_done);
ioq = &hdev->queues[pt_cmd->qid];
ret = pt_cmd->cid * SCSI_SENSE_BUFFERSIZE;
- pt_cmd->priv = ioq->sense + ret;
+ sense_addr = ioq->sense + ret;
sense_dma = ioq->sense_dma_addr + ret;
cmd->common.sense_addr = cpu_to_le64(sense_dma);
@@ -2625,260 +2595,87 @@ static int spraid_submit_ioq_sync_cmd(struct spraid_dev *hdev, struct spraid_ioq
spraid_submit_cmd(ioq, cmd);
if (!wait_for_completion_timeout(&pt_cmd->cmd_done, timeout)) {
- dev_err(hdev->dev, "[%s] cid[%d], qid[%d] timeout\n",
- __func__, pt_cmd->cid, pt_cmd->qid);
+ dev_err(hdev->dev, "[%s] cid[%d] qid[%d] timeout, opcode[0x%x] subopcode[0x%x]\n",
+ __func__, pt_cmd->cid, pt_cmd->qid, cmd->common.opcode,
+ (le32_to_cpu(cmd->common.cdw3[0]) & 0xffff));
WRITE_ONCE(pt_cmd->state, SPRAID_CMD_TIMEOUT);
- return -EINVAL;
+ spraid_put_cmd(hdev, pt_cmd, SPRAID_CMD_IOPT);
+ return -ETIME;
}
- if (result) {
- result[0] = pt_cmd->result0;
- result[1] = pt_cmd->result1;
+ if (result && reslen) {
+ if ((pt_cmd->status & 0x17f) == 0x101) {
+ memcpy(result, sense_addr, SCSI_SENSE_BUFFERSIZE);
+ *reslen = SCSI_SENSE_BUFFERSIZE;
+ }
}
- if ((pt_cmd->status & 0x17f) == 0x101)
- *sense = pt_cmd->priv;
+ spraid_put_cmd(hdev, pt_cmd, SPRAID_CMD_IOPT);
return pt_cmd->status;
}
-static int spraid_user_ioq_cmd(struct spraid_dev *hdev,
- struct spraid_ioq_passthru_cmd __user *ucmd)
+static int spraid_user_ioq_cmd(struct spraid_dev *hdev, struct bsg_job *job)
{
- struct spraid_ioq_passthru_cmd cmd;
+ struct spraid_bsg_request *bsg_req = (struct spraid_bsg_request *)(job->request);
+ struct spraid_ioq_passthru_cmd *cmd = &(bsg_req->ioqcmd);
struct spraid_ioq_command ioq_cmd;
- u32 timeout = 0;
int status = 0;
- u8 *data_ptr = NULL;
- dma_addr_t data_dma;
- enum dma_data_direction dma_dir = DMA_NONE;
- void *sense = NULL;
-
- if (!capable(CAP_SYS_ADMIN)) {
- dev_err(hdev->dev, "Current user hasn't administrator right, reject service\n");
- return -EACCES;
- }
+ u32 timeout = msecs_to_jiffies(cmd->timeout_ms);
- if (copy_from_user(&cmd, ucmd, sizeof(cmd))) {
- dev_err(hdev->dev, "Copy command from user space to kernel space failed\n");
- return -EFAULT;
- }
-
- if (cmd.data_len > PAGE_SIZE) {
+ if (cmd->data_len > PAGE_SIZE) {
dev_err(hdev->dev, "[%s] data len bigger than 4k\n", __func__);
return -EFAULT;
}
- dev_info(hdev->dev, "[%s] opcode: 0x%x, subopcode: 0x%x, datalen: %d\n",
- __func__, cmd.opcode, cmd.info_1.subopcode, cmd.data_len);
-
- if (cmd.addr && cmd.data_len) {
- data_ptr = dma_alloc_coherent(hdev->dev, PAGE_SIZE, &data_dma, GFP_KERNEL);
- if (!data_ptr)
- return -ENOMEM;
-
- dma_dir = (cmd.opcode & 1) ? DMA_TO_DEVICE : DMA_FROM_DEVICE;
+ if (hdev->state != SPRAID_LIVE) {
+ dev_err(hdev->dev, "[%s] err, host state:[%d] is not live\n",
+ __func__, hdev->state);
+ return -EBUSY;
}
- if (dma_dir == DMA_TO_DEVICE) {
- if (copy_from_user(data_ptr, (void __user *)(uintptr_t)cmd.addr, cmd.data_len)) {
- dev_err(hdev->dev, "[%s] copy user data failed\n", __func__);
- status = -EFAULT;
- goto free_dma_mem;
- }
- }
+ dev_info(hdev->dev, "[%s] opcode[0x%x] subopcode[0x%x] init, datalen[%d]\n",
+ __func__, cmd->opcode, cmd->info_1.subopcode, cmd->data_len);
memset(&ioq_cmd, 0, sizeof(ioq_cmd));
- ioq_cmd.common.opcode = cmd.opcode;
- ioq_cmd.common.flags = cmd.flags;
- ioq_cmd.common.hdid = cpu_to_le32(cmd.nsid);
- ioq_cmd.common.sense_len = cpu_to_le16(cmd.info_0.res_sense_len);
- ioq_cmd.common.cdb_len = cmd.info_0.cdb_len;
- ioq_cmd.common.rsvd2 = cmd.info_0.rsvd0;
- ioq_cmd.common.cdw3[0] = cpu_to_le32(cmd.cdw3);
- ioq_cmd.common.cdw3[1] = cpu_to_le32(cmd.cdw4);
- ioq_cmd.common.cdw3[2] = cpu_to_le32(cmd.cdw5);
- ioq_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
-
- ioq_cmd.common.cdw10[0] = cpu_to_le32(cmd.cdw10);
- ioq_cmd.common.cdw10[1] = cpu_to_le32(cmd.cdw11);
- ioq_cmd.common.cdw10[2] = cpu_to_le32(cmd.cdw12);
- ioq_cmd.common.cdw10[3] = cpu_to_le32(cmd.cdw13);
- ioq_cmd.common.cdw10[4] = cpu_to_le32(cmd.cdw14);
- ioq_cmd.common.cdw10[5] = cpu_to_le32(cmd.data_len);
-
- memcpy(ioq_cmd.common.cdb, &cmd.cdw16, cmd.info_0.cdb_len);
-
- ioq_cmd.common.cdw26[0] = cpu_to_le32(cmd.cdw26[0]);
- ioq_cmd.common.cdw26[1] = cpu_to_le32(cmd.cdw26[1]);
- ioq_cmd.common.cdw26[2] = cpu_to_le32(cmd.cdw26[2]);
- ioq_cmd.common.cdw26[3] = cpu_to_le32(cmd.cdw26[3]);
-
- if (cmd.timeout_ms)
- timeout = msecs_to_jiffies(cmd.timeout_ms);
- timeout = timeout ? timeout : ADMIN_TIMEOUT;
-
- status = spraid_submit_ioq_sync_cmd(hdev, &ioq_cmd, &cmd.result0, &sense, timeout);
-
- if (status >= 0) {
- if (put_user(cmd.result0, &ucmd->result0)) {
- status = -EFAULT;
- goto free_dma_mem;
- }
- if (put_user(cmd.result1, &ucmd->result1)) {
- status = -EFAULT;
- goto free_dma_mem;
- }
- if (dma_dir == DMA_FROM_DEVICE &&
- copy_to_user((void __user *)(uintptr_t)cmd.addr, data_ptr, cmd.data_len)) {
- status = -EFAULT;
- goto free_dma_mem;
- }
- }
-
- if (sense) {
- if (copy_to_user((void *__user *)(uintptr_t)cmd.sense_addr,
- sense, cmd.info_0.res_sense_len)) {
- status = -EFAULT;
- goto free_dma_mem;
- }
- }
-
-free_dma_mem:
- if (data_ptr)
- dma_free_coherent(hdev->dev, PAGE_SIZE, data_ptr, data_dma);
-
- return status;
-
-}
-
-static int spraid_reset_work_sync(struct spraid_dev *hdev);
-
-static int spraid_user_reset_cmd(struct spraid_dev *hdev)
-{
- int ret;
-
- dev_info(hdev->dev, "[%s] start user reset cmd\n", __func__);
- ret = spraid_reset_work_sync(hdev);
- dev_info(hdev->dev, "[%s] stop user reset cmd[%d]\n", __func__, ret);
-
- return ret;
-}
-
-static int hdev_open(struct inode *inode, struct file *file)
-{
- struct spraid_dev *hdev =
- container_of(inode->i_cdev, struct spraid_dev, cdev);
- file->private_data = hdev;
- return 0;
-}
-
-static long hdev_ioctl(struct file *file, u32 cmd, unsigned long arg)
-{
- struct spraid_dev *hdev = file->private_data;
- void __user *argp = (void __user *)arg;
-
- switch (cmd) {
- case SPRAID_IOCTL_ADMIN_CMD:
- return spraid_user_admin_cmd(hdev, argp);
- case SPRAID_IOCTL_IOQ_CMD:
- return spraid_user_ioq_cmd(hdev, argp);
- case SPRAID_IOCTL_RESET_CMD:
- return spraid_user_reset_cmd(hdev);
- default:
- return -ENOTTY;
- }
-}
-
-static const struct file_operations spraid_dev_fops = {
- .owner = THIS_MODULE,
- .open = hdev_open,
- .unlocked_ioctl = hdev_ioctl,
- .compat_ioctl = hdev_ioctl,
-};
-
-static int spraid_create_cdev(struct spraid_dev *hdev)
-{
- int ret;
-
- device_initialize(&hdev->ctrl_device);
- hdev->ctrl_device.devt = MKDEV(MAJOR(spraid_chr_devt), hdev->instance);
- hdev->ctrl_device.class = spraid_class;
- hdev->ctrl_device.parent = hdev->dev;
- dev_set_drvdata(&hdev->ctrl_device, hdev);
- ret = dev_set_name(&hdev->ctrl_device, "spraid%d", hdev->instance);
- if (ret)
- return ret;
- cdev_init(&hdev->cdev, &spraid_dev_fops);
- hdev->cdev.owner = THIS_MODULE;
- ret = cdev_device_add(&hdev->cdev, &hdev->ctrl_device);
- if (ret) {
- dev_err(hdev->dev, "Add cdev failed, ret: %d", ret);
- put_device(&hdev->ctrl_device);
- kfree_const(hdev->ctrl_device.kobj.name);
- return ret;
- }
-
- return 0;
-}
-
-static inline void spraid_remove_cdev(struct spraid_dev *hdev)
-{
- cdev_device_del(&hdev->cdev, &hdev->ctrl_device);
-}
-
-static const struct blk_mq_ops spraid_admin_mq_ops = {
- .queue_rq = spraid_queue_admin_rq,
- .complete = spraid_complete_admin_rq,
- .init_hctx = spraid_admin_init_hctx,
- .init_request = spraid_admin_init_request,
- .timeout = spraid_admin_timeout,
-};
-
-static void spraid_remove_admin_tagset(struct spraid_dev *hdev)
-{
- if (hdev->admin_q && !blk_queue_dying(hdev->admin_q)) {
- blk_mq_unquiesce_queue(hdev->admin_q);
- blk_cleanup_queue(hdev->admin_q);
- blk_mq_free_tag_set(&hdev->admin_tagset);
+ ioq_cmd.common.opcode = cmd->opcode;
+ ioq_cmd.common.flags = cmd->flags;
+ ioq_cmd.common.hdid = cpu_to_le32(cmd->nsid);
+ ioq_cmd.common.sense_len = cpu_to_le16(cmd->info_0.res_sense_len);
+ ioq_cmd.common.cdb_len = cmd->info_0.cdb_len;
+ ioq_cmd.common.rsvd2 = cmd->info_0.rsvd0;
+ ioq_cmd.common.cdw3[0] = cpu_to_le32(cmd->cdw3);
+ ioq_cmd.common.cdw3[1] = cpu_to_le32(cmd->cdw4);
+ ioq_cmd.common.cdw3[2] = cpu_to_le32(cmd->cdw5);
+
+ ioq_cmd.common.cdw10[0] = cpu_to_le32(cmd->cdw10);
+ ioq_cmd.common.cdw10[1] = cpu_to_le32(cmd->cdw11);
+ ioq_cmd.common.cdw10[2] = cpu_to_le32(cmd->cdw12);
+ ioq_cmd.common.cdw10[3] = cpu_to_le32(cmd->cdw13);
+ ioq_cmd.common.cdw10[4] = cpu_to_le32(cmd->cdw14);
+ ioq_cmd.common.cdw10[5] = cpu_to_le32(cmd->data_len);
+
+ memcpy(ioq_cmd.common.cdb, &cmd->cdw16, cmd->info_0.cdb_len);
+
+ ioq_cmd.common.cdw26[0] = cpu_to_le32(cmd->cdw26[0]);
+ ioq_cmd.common.cdw26[1] = cpu_to_le32(cmd->cdw26[1]);
+ ioq_cmd.common.cdw26[2] = cpu_to_le32(cmd->cdw26[2]);
+ ioq_cmd.common.cdw26[3] = cpu_to_le32(cmd->cdw26[3]);
+
+ status = spraid_bsg_map_data(hdev, job, (struct spraid_admin_command *)&ioq_cmd);
+ if (status) {
+ dev_err(hdev->dev, "[%s] err, map data failed\n", __func__);
+ return status;
}
-}
-static int spraid_alloc_admin_tags(struct spraid_dev *hdev)
-{
- if (!hdev->admin_q) {
- hdev->admin_tagset.ops = &spraid_admin_mq_ops;
- hdev->admin_tagset.nr_hw_queues = 1;
+ status = spraid_submit_ioq_sync_cmd(hdev, &ioq_cmd, job->reply, &job->reply_len, timeout);
- hdev->admin_tagset.queue_depth = SPRAID_AQ_MQ_TAG_DEPTH;
- hdev->admin_tagset.timeout = ADMIN_TIMEOUT;
- hdev->admin_tagset.numa_node = hdev->numa_node;
- hdev->admin_tagset.cmd_size =
- spraid_cmd_size(hdev, true, false);
- hdev->admin_tagset.flags = BLK_MQ_F_NO_SCHED;
- hdev->admin_tagset.driver_data = hdev;
+ dev_info(hdev->dev, "[%s] opcode[0x%x] subopcode[0x%x], status[0x%x], reply_len[%d]\n",
+ __func__, cmd->opcode, cmd->info_1.subopcode, status, job->reply_len);
- if (blk_mq_alloc_tag_set(&hdev->admin_tagset)) {
- dev_err(hdev->dev, "Allocate admin tagset failed\n");
- return -ENOMEM;
- }
+ spraid_bsg_unmap_data(hdev, job);
- hdev->admin_q = blk_mq_init_queue(&hdev->admin_tagset);
- if (IS_ERR(hdev->admin_q)) {
- dev_err(hdev->dev, "Initialize admin request queue failed\n");
- blk_mq_free_tag_set(&hdev->admin_tagset);
- return -ENOMEM;
- }
- if (!blk_get_queue(hdev->admin_q)) {
- dev_err(hdev->dev, "Get admin request queue failed\n");
- spraid_remove_admin_tagset(hdev);
- hdev->admin_q = NULL;
- return -ENODEV;
- }
- } else {
- blk_mq_unquiesce_queue(hdev->admin_q);
- }
- return 0;
+ return status;
}
static bool spraid_check_scmd_completed(struct scsi_cmnd *scmd)
@@ -2891,7 +2688,7 @@ static bool spraid_check_scmd_completed(struct scsi_cmnd *scmd)
spraid_get_tag_from_scmd(scmd, &hwq, &cid);
spraidq = &hdev->queues[hwq];
if (READ_ONCE(iod->state) == SPRAID_CMD_COMPLETE || spraid_poll_cq(spraidq, cid)) {
- dev_warn(hdev->dev, "cid[%d], qid[%d] has been completed\n",
+ dev_warn(hdev->dev, "cid[%d] qid[%d] has been completed\n",
cid, spraidq->qid);
return true;
}
@@ -2927,8 +2724,7 @@ static int spraid_send_abort_cmd(struct spraid_dev *hdev, u32 hdid, u16 qid, u16
admin_cmd.abort.sqid = cpu_to_le16(qid);
admin_cmd.abort.cid = cpu_to_le16(cid);
- return spraid_submit_admin_sync_cmd(hdev->admin_q, &admin_cmd, NULL,
- NULL, 0, 0, 0, 0);
+ return spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
}
/* send reset command by admin quueue temporary */
@@ -2941,8 +2737,7 @@ static int spraid_send_reset_cmd(struct spraid_dev *hdev, int type, u32 hdid)
admin_cmd.reset.hdid = cpu_to_le32(hdid);
admin_cmd.reset.type = type;
- return spraid_submit_admin_sync_cmd(hdev->admin_q, &admin_cmd, NULL,
- NULL, 0, 0, 0, 0);
+ return spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
}
static bool spraid_change_host_state(struct spraid_dev *hdev, enum spraid_state newstate)
@@ -3022,7 +2817,7 @@ static void spraid_back_fault_cqe(struct spraid_queue *ioq, struct spraid_comple
scsi_dma_unmap(scmd);
spraid_free_iod_res(hdev, iod);
scmd->scsi_done(scmd);
- dev_warn(hdev->dev, "Back fault CQE, cid[%d], qid[%d]\n",
+ dev_warn(hdev->dev, "Back fault CQE, cid[%d] qid[%d]\n",
cqe->cmd_id, ioq->qid);
}
@@ -3032,6 +2827,8 @@ static void spraid_back_all_io(struct spraid_dev *hdev)
struct spraid_queue *ioq;
struct spraid_completion cqe = { 0 };
+ scsi_block_requests(hdev->shost);
+
for (i = 1; i <= hdev->shost->nr_hw_queues; i++) {
ioq = &hdev->queues[i];
for (j = 0; j < hdev->shost->can_queue; j++) {
@@ -3039,6 +2836,8 @@ static void spraid_back_all_io(struct spraid_dev *hdev)
spraid_back_fault_cqe(ioq, &cqe);
}
}
+
+ scsi_unblock_requests(hdev->shost);
}
static void spraid_dev_disable(struct spraid_dev *hdev, bool shutdown)
@@ -3106,17 +2905,13 @@ static void spraid_reset_work(struct work_struct *work)
if (ret)
goto pci_disable;
- ret = spraid_alloc_admin_tags(hdev);
- if (ret)
- goto pci_disable;
-
ret = spraid_setup_io_queues(hdev);
if (ret || hdev->online_queues <= hdev->shost->nr_hw_queues)
goto pci_disable;
spraid_change_host_state(hdev, SPRAID_LIVE);
- spraid_send_aen(hdev);
+ spraid_send_all_aen(hdev);
return;
@@ -3174,8 +2969,8 @@ static int spraid_abort_handler(struct scsi_cmnd *scmd)
scsi_print_command(scmd);
- if (!spraid_wait_abnl_cmd_done(iod) || spraid_check_scmd_completed(scmd) ||
- hdev->state != SPRAID_LIVE)
+ if (hdev->state != SPRAID_LIVE || !spraid_wait_abnl_cmd_done(iod) ||
+ spraid_check_scmd_completed(scmd))
return SUCCESS;
hostdata = scmd->device->hostdata;
@@ -3183,7 +2978,7 @@ static int spraid_abort_handler(struct scsi_cmnd *scmd)
dev_warn(hdev->dev, "cid[%d] qid[%d] timeout, aborting\n", cid, hwq);
ret = spraid_send_abort_cmd(hdev, hostdata->hdid, hwq, cid);
- if (ret != ADMIN_ERR_TIMEOUT) {
+ if (ret != -ETIME) {
ret = spraid_wait_abnl_cmd_done(iod);
if (ret) {
dev_warn(hdev->dev, "cid[%d] qid[%d] abort failed, not found\n", cid, hwq);
@@ -3206,8 +3001,8 @@ static int spraid_tgt_reset_handler(struct scsi_cmnd *scmd)
scsi_print_command(scmd);
- if (!spraid_wait_abnl_cmd_done(iod) || spraid_check_scmd_completed(scmd) ||
- hdev->state != SPRAID_LIVE)
+ if (hdev->state != SPRAID_LIVE || !spraid_wait_abnl_cmd_done(iod) ||
+ spraid_check_scmd_completed(scmd))
return SUCCESS;
hostdata = scmd->device->hostdata;
@@ -3241,8 +3036,8 @@ static int spraid_bus_reset_handler(struct scsi_cmnd *scmd)
scsi_print_command(scmd);
- if (!spraid_wait_abnl_cmd_done(iod) || spraid_check_scmd_completed(scmd) ||
- hdev->state != SPRAID_LIVE)
+ if (hdev->state != SPRAID_LIVE || !spraid_wait_abnl_cmd_done(iod) ||
+ spraid_check_scmd_completed(scmd))
return SUCCESS;
hostdata = scmd->device->hostdata;
@@ -3272,7 +3067,7 @@ static int spraid_shost_reset_handler(struct scsi_cmnd *scmd)
struct spraid_dev *hdev = shost_priv(scmd->device->host);
scsi_print_command(scmd);
- if (spraid_check_scmd_completed(scmd) || hdev->state != SPRAID_LIVE)
+ if (hdev->state != SPRAID_LIVE || spraid_check_scmd_completed(scmd))
return SUCCESS;
spraid_get_tag_from_scmd(scmd, &hwq, &cid);
@@ -3288,6 +3083,62 @@ static int spraid_shost_reset_handler(struct scsi_cmnd *scmd)
return SUCCESS;
}
+static pci_ers_result_t spraid_pci_error_detected(struct pci_dev *pdev,
+ pci_channel_state_t state)
+{
+ struct spraid_dev *hdev = pci_get_drvdata(pdev);
+
+ dev_info(hdev->dev, "enter pci error detect, state:%d\n", state);
+
+ switch (state) {
+ case pci_channel_io_normal:
+ dev_warn(hdev->dev, "channel is normal, do nothing\n");
+
+ return PCI_ERS_RESULT_CAN_RECOVER;
+ case pci_channel_io_frozen:
+ dev_warn(hdev->dev, "channel io frozen, need reset controller\n");
+
+ scsi_block_requests(hdev->shost);
+
+ spraid_change_host_state(hdev, SPRAID_RESETTING);
+
+ return PCI_ERS_RESULT_NEED_RESET;
+ case pci_channel_io_perm_failure:
+ dev_warn(hdev->dev, "channel io failure, request disconnect\n");
+
+ return PCI_ERS_RESULT_DISCONNECT;
+ }
+
+ return PCI_ERS_RESULT_NEED_RESET;
+}
+
+static pci_ers_result_t spraid_pci_slot_reset(struct pci_dev *pdev)
+{
+ struct spraid_dev *hdev = pci_get_drvdata(pdev);
+
+ dev_info(hdev->dev, "restart after slot reset\n");
+
+ pci_restore_state(pdev);
+
+ if (!queue_work(spraid_wq, &hdev->reset_work)) {
+ dev_err(hdev->dev, "[%s] err, the device is resetting state\n", __func__);
+ return PCI_ERS_RESULT_NONE;
+ }
+
+ flush_work(&hdev->reset_work);
+
+ scsi_unblock_requests(hdev->shost);
+
+ return PCI_ERS_RESULT_RECOVERED;
+}
+
+static void spraid_reset_done(struct pci_dev *pdev)
+{
+ struct spraid_dev *hdev = pci_get_drvdata(pdev);
+
+ dev_info(hdev->dev, "enter spraid reset done\n");
+}
+
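/*
 * Together these three callbacks implement the standard PCI AER recovery
 * flow for the driver: a frozen channel blocks the host and moves it to
 * SPRAID_RESETTING, the slot reset restores saved PCI state and runs
 * reset_work to completion, and reset_done only logs. They are wired up
 * via spraid_err_handler further below.
 */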
static ssize_t csts_pp_show(struct device *cdev, struct device_attribute *attr, char *buf)
{
struct Scsi_Host *shost = class_to_shost(cdev);
@@ -3347,7 +3198,7 @@ static ssize_t fw_version_show(struct device *cdev, struct device_attribute *att
struct Scsi_Host *shost = class_to_shost(cdev);
struct spraid_dev *hdev = shost_priv(shost);
- return snprintf(buf, sizeof(hdev->ctrl_info->fr), "%s\n", hdev->ctrl_info->fr);
+ return snprintf(buf, PAGE_SIZE, "%s\n", hdev->ctrl_info->fr);
}
static DEVICE_ATTR_RO(csts_pp);
@@ -3365,6 +3216,185 @@ static struct device_attribute *spraid_host_attrs[] = {
NULL,
};
+static int spraid_get_vd_info(struct spraid_dev *hdev, struct spraid_vd_info *vd_info, u16 vid)
+{
+ struct spraid_admin_command admin_cmd;
+ u8 *data_ptr = NULL;
+ dma_addr_t data_dma = 0;
+ int ret;
+
+ data_ptr = dma_alloc_coherent(hdev->dev, PAGE_SIZE, &data_dma, GFP_KERNEL);
+ if (!data_ptr)
+ return -ENOMEM;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.usr_cmd.opcode = USR_CMD_READ;
+ admin_cmd.usr_cmd.info_0.subopcode = cpu_to_le16(USR_CMD_VDINFO);
+ admin_cmd.usr_cmd.info_1.data_len = cpu_to_le16(USR_CMD_RDLEN);
+ admin_cmd.usr_cmd.info_1.param_len = cpu_to_le16(VDINFO_PARAM_LEN);
+ admin_cmd.usr_cmd.cdw10 = cpu_to_le32(vid);
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
+
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+ if (!ret)
+ memcpy(vd_info, data_ptr, sizeof(struct spraid_vd_info));
+
+ dma_free_coherent(hdev->dev, PAGE_SIZE, data_ptr, data_dma);
+
+ return ret;
+}
+
+static int spraid_get_bgtask(struct spraid_dev *hdev, struct spraid_bgtask *bgtask)
+{
+ struct spraid_admin_command admin_cmd;
+ u8 *data_ptr = NULL;
+ dma_addr_t data_dma = 0;
+ int ret;
+
+ data_ptr = dma_alloc_coherent(hdev->dev, PAGE_SIZE, &data_dma, GFP_KERNEL);
+ if (!data_ptr)
+ return -ENOMEM;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.usr_cmd.opcode = USR_CMD_READ;
+ admin_cmd.usr_cmd.info_0.subopcode = cpu_to_le16(USR_CMD_BGTASK);
+ admin_cmd.usr_cmd.info_1.data_len = cpu_to_le16(USR_CMD_RDLEN);
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
+
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+ if (!ret)
+ memcpy(bgtask, data_ptr, sizeof(struct spraid_bgtask));
+
+ dma_free_coherent(hdev->dev, PAGE_SIZE, data_ptr, data_dma);
+
+ return ret;
+}
+
+static ssize_t raid_level_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ struct scsi_device *sdev;
+ struct spraid_dev *hdev;
+ struct spraid_vd_info *vd_info;
+ struct spraid_sdev_hostdata *hostdata;
+ int ret;
+
+ sdev = to_scsi_device(dev);
+ hdev = shost_priv(sdev->host);
+ hostdata = sdev->hostdata;
+
+ vd_info = kmalloc(sizeof(*vd_info), GFP_KERNEL);
+ if (!vd_info || !SPRAID_DEV_INFO_ATTR_VD(hostdata->attr))
+ return snprintf(buf, PAGE_SIZE, "NA\n");
+
+ ret = spraid_get_vd_info(hdev, vd_info, sdev->id);
+ if (ret)
+ vd_info->rg_level = ARRAY_SIZE(raid_levels) - 1;
+
+ ret = (vd_info->rg_level < ARRAY_SIZE(raid_levels)) ?
+ vd_info->rg_level : (ARRAY_SIZE(raid_levels) - 1);
+
+ kfree(vd_info);
+
+ return snprintf(buf, PAGE_SIZE, "RAID-%s\n", raid_levels[ret]);
+}
+
+static ssize_t raid_state_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ struct scsi_device *sdev;
+ struct spraid_dev *hdev;
+ struct spraid_vd_info *vd_info;
+ struct spraid_sdev_hostdata *hostdata;
+ int ret;
+
+ sdev = to_scsi_device(dev);
+ hdev = shost_priv(sdev->host);
+ hostdata = sdev->hostdata;
+
+ vd_info = kmalloc(sizeof(*vd_info), GFP_KERNEL);
+ if (!vd_info || !SPRAID_DEV_INFO_ATTR_VD(hostdata->attr))
+ return snprintf(buf, PAGE_SIZE, "NA\n");
+
+ ret = spraid_get_vd_info(hdev, vd_info, sdev->id);
+ if (ret) {
+ vd_info->vd_status = 0;
+ vd_info->rg_id = 0xff;
+ }
+
+ ret = (vd_info->vd_status < ARRAY_SIZE(raid_states)) ? vd_info->vd_status : 0;
+
+ kfree(vd_info);
+
+ return snprintf(buf, PAGE_SIZE, "%s\n", raid_states[ret]);
+}
+
+static ssize_t raid_resync_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ struct scsi_device *sdev;
+ struct spraid_dev *hdev;
+ struct spraid_vd_info *vd_info;
+ struct spraid_bgtask *bgtask;
+ struct spraid_sdev_hostdata *hostdata;
+ u8 rg_id, i, progress = 0;
+ int ret;
+
+ sdev = to_scsi_device(dev);
+ hdev = shost_priv(sdev->host);
+ hostdata = sdev->hostdata;
+
+ vd_info = kmalloc(sizeof(*vd_info), GFP_KERNEL);
+ if (!vd_info || !SPRAID_DEV_INFO_ATTR_VD(hostdata->attr))
+ return snprintf(buf, PAGE_SIZE, "NA\n");
+
+ ret = spraid_get_vd_info(hdev, vd_info, sdev->id);
+ if (ret)
+ goto out;
+
+ rg_id = vd_info->rg_id;
+
+ bgtask = (struct spraid_bgtask *)vd_info;
+ ret = spraid_get_bgtask(hdev, bgtask);
+ if (ret)
+ goto out;
+ for (i = 0; i < bgtask->task_num; i++) {
+ if ((bgtask->bgtask[i].type == BGTASK_TYPE_REBUILD) &&
+ (le16_to_cpu(bgtask->bgtask[i].vd_id) == rg_id))
+ progress = bgtask->bgtask[i].progress;
+ }
+
+out:
+ kfree(vd_info);
+ return snprintf(buf, PAGE_SIZE, "%d\n", progress);
+}
+
+static DEVICE_ATTR_RO(raid_level);
+static DEVICE_ATTR_RO(raid_state);
+static DEVICE_ATTR_RO(raid_resync);
+
+static struct device_attribute *spraid_dev_attrs[] = {
+ &dev_attr_raid_level,
+ &dev_attr_raid_state,
+ &dev_attr_raid_resync,
+ NULL,
+};
+
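/*
 * The attributes above are exposed per SCSI device through sysfs; with
 * the standard scsi_device layout that would be, for example (the exact
 * path is an assumption, not spelled out by this patch):
 *   /sys/class/scsi_device/<h:c:t:l>/device/raid_level
 * and likewise raid_state and raid_resync.
 */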
+static struct pci_error_handlers spraid_err_handler = {
+ .error_detected = spraid_pci_error_detected,
+ .slot_reset = spraid_pci_slot_reset,
+ .reset_done = spraid_reset_done,
+};
+
+static int spraid_sysfs_host_reset(struct Scsi_Host *shost, int reset_type)
+{
+ int ret;
+ struct spraid_dev *hdev = shost_priv(shost);
+
+ dev_info(hdev->dev, "[%s] start sysfs host reset cmd\n", __func__);
+ ret = spraid_reset_work_sync(hdev);
+ dev_info(hdev->dev, "[%s] stop sysfs host reset cmd[%d]\n", __func__, ret);
+
+ return ret;
+}
+
static struct scsi_host_template spraid_driver_template = {
.module = THIS_MODULE,
.name = "Ramaxel Logic spraid driver",
@@ -3379,9 +3409,11 @@ static struct scsi_host_template spraid_driver_template = {
.eh_bus_reset_handler = spraid_bus_reset_handler,
.eh_host_reset_handler = spraid_shost_reset_handler,
.change_queue_depth = scsi_change_queue_depth,
- .host_tagset = 1,
+ .host_tagset = 0,
.this_id = -1,
.shost_attrs = spraid_host_attrs,
+ .sdev_attrs = spraid_dev_attrs,
+ .host_reset = spraid_sysfs_host_reset,
};
static void spraid_shutdown(struct pci_dev *pdev)
@@ -3392,11 +3424,53 @@ static void spraid_shutdown(struct pci_dev *pdev)
spraid_disable_admin_queue(hdev, true);
}
+/* bsg dispatch user command */
+static int spraid_bsg_host_dispatch(struct bsg_job *job)
+{
+ struct Scsi_Host *shost = dev_to_shost(job->dev);
+ struct spraid_dev *hdev = shost_priv(shost);
+ struct request *rq = blk_mq_rq_from_pdu(job);
+ struct spraid_bsg_request *bsg_req = job->request;
+ int ret = 0;
+
+ dev_log_dbg(hdev->dev, "[%s] msgcode[%d], msglen[%d], timeout[%d], req_nsge[%d], req_len[%d]\n",
+ __func__, bsg_req->msgcode, job->request_len, rq->timeout,
+ job->request_payload.sg_cnt, job->request_payload.payload_len);
+
+ job->reply_len = 0;
+
+ switch (bsg_req->msgcode) {
+ case SPRAID_BSG_ADM:
+ ret = spraid_user_admin_cmd(hdev, job);
+ break;
+ case SPRAID_BSG_IOQ:
+ ret = spraid_user_ioq_cmd(hdev, job);
+ break;
+ default:
+ dev_info(hdev->dev, "[%s] unsupport msgcode[%d]\n", __func__, bsg_req->msgcode);
+ break;
+ }
+
+ if (ret > 0)
+ ret = ret | (ret << 8);
+
+ bsg_job_done(job, ret, 0);
+ return 0;
+}
+
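/*
 * For illustration only: a hypothetical user-space caller of the bsg node
 * registered below (created as "spraid<N>", typically /dev/bsg/spraid<N>).
 * The layout of struct spraid_bsg_request and the transfer direction are
 * assumptions based on this patch, not a documented ABI.
 */
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <scsi/sg.h>
#include <linux/bsg.h>

static int spraid_bsg_example(int fd, void *req, size_t req_len,
			      void *buf, size_t buf_len)
{
	struct sg_io_v4 hdr = { 0 };

	hdr.guard = 'Q';
	hdr.protocol = BSG_PROTOCOL_SCSI;
	hdr.subprotocol = BSG_SUB_PROTOCOL_SCSI_TRANSPORT;
	hdr.request = (uint64_t)(uintptr_t)req;		/* spraid_bsg_request */
	hdr.request_len = req_len;
	hdr.din_xferp = (uint64_t)(uintptr_t)buf;	/* data-in buffer */
	hdr.din_xfer_len = buf_len;
	hdr.timeout = 60 * 1000;			/* milliseconds */

	/* fd comes from open("/dev/bsg/spraid0", O_RDWR) or similar. */
	return ioctl(fd, SG_IO, &hdr);
}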
+static inline void spraid_remove_bsg(struct spraid_dev *hdev)
+{
+ if (hdev->bsg_queue) {
+ bsg_unregister_queue(hdev->bsg_queue);
+ blk_cleanup_queue(hdev->bsg_queue);
+ }
+}
static int spraid_probe(struct pci_dev *pdev, const struct pci_device_id *id)
{
struct spraid_dev *hdev;
struct Scsi_Host *shost;
int node, ret;
+ char bsg_name[15];
shost = scsi_host_alloc(&spraid_driver_template, sizeof(*hdev));
if (!shost) {
@@ -3421,10 +3495,10 @@ static int spraid_probe(struct pci_dev *pdev, const struct pci_device_id *id)
goto put_dev;
init_rwsem(&hdev->devices_rwsem);
- INIT_WORK(&hdev->aen_work, spraid_async_event_work);
INIT_WORK(&hdev->scan_work, spraid_scan_work);
INIT_WORK(&hdev->timesyn_work, spraid_timesyn_work);
INIT_WORK(&hdev->reset_work, spraid_reset_work);
+ INIT_WORK(&hdev->fw_act_work, spraid_fw_act_work);
spin_lock_init(&hdev->state_lock);
ret = spraid_alloc_resources(hdev);
@@ -3439,17 +3513,13 @@ static int spraid_probe(struct pci_dev *pdev, const struct pci_device_id *id)
if (ret)
goto pci_disable;
- ret = spraid_alloc_admin_tags(hdev);
- if (ret)
- goto disable_admin_q;
-
ret = spraid_init_ctrl_info(hdev);
if (ret)
- goto free_admin_tagset;
+ goto disable_admin_q;
ret = spraid_alloc_iod_ext_mem_pool(hdev);
if (ret)
- goto free_admin_tagset;
+ goto disable_admin_q;
ret = spraid_setup_io_queues(hdev);
if (ret)
@@ -3464,9 +3534,15 @@ static int spraid_probe(struct pci_dev *pdev, const struct pci_device_id *id)
goto remove_io_queues;
}
- ret = spraid_create_cdev(hdev);
- if (ret)
+ snprintf(bsg_name, sizeof(bsg_name), "spraid%d", shost->host_no);
+ hdev->bsg_queue = bsg_setup_queue(&shost->shost_gendev, bsg_name,
+ spraid_bsg_host_dispatch, NULL,
+ spraid_cmd_size(hdev, true, false));
+ if (IS_ERR(hdev->bsg_queue)) {
+ dev_err(hdev->dev, "err, setup bsg failed\n");
+ hdev->bsg_queue = NULL;
goto remove_io_queues;
+ }
if (hdev->online_queues == SPRAID_ADMIN_QUEUE_NUM) {
dev_warn(hdev->dev, "warn only admin queue can be used\n");
@@ -3475,11 +3551,11 @@ static int spraid_probe(struct pci_dev *pdev, const struct pci_device_id *id)
hdev->state = SPRAID_LIVE;
- spraid_send_aen(hdev);
+ spraid_send_all_aen(hdev);
ret = spraid_dev_list_init(hdev);
if (ret)
- goto remove_cdev;
+ goto remove_bsg;
ret = spraid_configure_timestamp(hdev);
if (ret)
@@ -3487,20 +3563,18 @@ static int spraid_probe(struct pci_dev *pdev, const struct pci_device_id *id)
ret = spraid_alloc_ioq_ptcmds(hdev);
if (ret)
- goto remove_cdev;
+ goto remove_bsg;
scsi_scan_host(hdev->shost);
return 0;
-remove_cdev:
- spraid_remove_cdev(hdev);
+remove_bsg:
+ spraid_remove_bsg(hdev);
remove_io_queues:
spraid_remove_io_queues(hdev);
free_iod_mempool:
spraid_free_iod_ext_mem_pool(hdev);
-free_admin_tagset:
- spraid_remove_admin_tagset(hdev);
disable_admin_q:
spraid_disable_admin_queue(hdev, false);
pci_disable:
@@ -3524,22 +3598,17 @@ static void spraid_remove(struct pci_dev *pdev)
dev_info(hdev->dev, "enter spraid remove\n");
spraid_change_host_state(hdev, SPRAID_DELETING);
+ flush_work(&hdev->reset_work);
- if (!pci_device_is_present(pdev)) {
- scsi_block_requests(shost);
+ if (!pci_device_is_present(pdev))
spraid_back_all_io(hdev);
- scsi_unblock_requests(shost);
- }
- flush_work(&hdev->reset_work);
+ spraid_remove_bsg(hdev);
scsi_remove_host(shost);
-
- kfree(hdev->ioq_ptcmds);
+ spraid_free_ioq_ptcmds(hdev);
kfree(hdev->devices);
- spraid_remove_cdev(hdev);
spraid_remove_io_queues(hdev);
spraid_free_iod_ext_mem_pool(hdev);
- spraid_remove_admin_tagset(hdev);
spraid_disable_admin_queue(hdev, false);
spraid_pci_disable(hdev);
spraid_free_resources(hdev);
@@ -3551,7 +3620,7 @@ static void spraid_remove(struct pci_dev *pdev)
}
static const struct pci_device_id spraid_id_table[] = {
- { PCI_DEVICE(PCI_VENDOR_ID_RAMAXEL_LOGIC, SPRAID_SERVER_DEVICE_HAB_DID) },
+ { PCI_DEVICE(PCI_VENDOR_ID_RAMAXEL_LOGIC, SPRAID_SERVER_DEVICE_HBA_DID) },
{ PCI_DEVICE(PCI_VENDOR_ID_RAMAXEL_LOGIC, SPRAID_SERVER_DEVICE_RAID_DID) },
{ 0, }
};
@@ -3563,6 +3632,7 @@ static struct pci_driver spraid_driver = {
.probe = spraid_probe,
.remove = spraid_remove,
.shutdown = spraid_shutdown,
+ .err_handler = &spraid_err_handler,
};
static int __init spraid_init(void)
@@ -3573,14 +3643,10 @@ static int __init spraid_init(void)
if (!spraid_wq)
return -ENOMEM;
- ret = alloc_chrdev_region(&spraid_chr_devt, 0, SPRAID_MINORS, "spraid");
- if (ret < 0)
- goto destroy_wq;
-
spraid_class = class_create(THIS_MODULE, "spraid");
if (IS_ERR(spraid_class)) {
ret = PTR_ERR(spraid_class);
- goto unregister_chrdev;
+ goto destroy_wq;
}
ret = pci_register_driver(&spraid_driver);
@@ -3591,8 +3657,6 @@ static int __init spraid_init(void)
destroy_class:
class_destroy(spraid_class);
-unregister_chrdev:
- unregister_chrdev_region(spraid_chr_devt, SPRAID_MINORS);
destroy_wq:
destroy_workqueue(spraid_wq);
@@ -3603,12 +3667,11 @@ static void __exit spraid_exit(void)
{
pci_unregister_driver(&spraid_driver);
class_destroy(spraid_class);
- unregister_chrdev_region(spraid_chr_devt, SPRAID_MINORS);
destroy_workqueue(spraid_wq);
ida_destroy(&spraid_instance_ida);
}
-MODULE_AUTHOR("Ramaxel Memory Technology");
+MODULE_AUTHOR("songyl(a)ramaxel.com");
MODULE_DESCRIPTION("Ramaxel Memory Technology SPraid Driver");
MODULE_LICENSE("GPL");
MODULE_VERSION(SPRAID_DRV_VERSION);
--
2.32.0
From: sdlzx <hdu_sdlzx(a)163.com>
Subject: [PATCH openEuler-5.10] Disable legacy DRM drivers
euleros inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4OBDE
CVE: NA
-------------------------------------------------
This config enables dangerous old drivers. It is also selected by
CONFIG_NOUVEAU_LEGACY_CTX_SUPPORT, which the upstream community
suggests "modern distros should consider turning off".
CONFIG_DRM_VM was selected by CONFIG_DRM_LEGACY and can be turned
off now.
Signed-off-by: sdlzx <hdu_sdlzx(a)163.com>
---
arch/arm64/configs/openeuler_defconfig | 10 ++--------
arch/x86/configs/openeuler_defconfig | 10 ++--------
2 files changed, 4 insertions(+), 16 deletions(-)
diff --git a/arch/arm64/configs/openeuler_defconfig b/arch/arm64/configs/openeuler_defconfig
index 17bc8750b..6949824e8 100644
--- a/arch/arm64/configs/openeuler_defconfig
+++ b/arch/arm64/configs/openeuler_defconfig
@@ -4673,7 +4673,6 @@ CONFIG_DRM_TTM_DMA_PAGE_POOL=y
CONFIG_DRM_VRAM_HELPER=y
CONFIG_DRM_TTM_HELPER=y
CONFIG_DRM_GEM_SHMEM_HELPER=y
-CONFIG_DRM_VM=y
CONFIG_DRM_SCHED=m
#
@@ -4719,7 +4718,7 @@ CONFIG_DRM_AMD_DC_DCN=y
# CONFIG_HSA_AMD is not set
CONFIG_DRM_NOUVEAU=m
-CONFIG_NOUVEAU_LEGACY_CTX_SUPPORT=y
+# CONFIG_NOUVEAU_LEGACY_CTX_SUPPORT is not set
CONFIG_NOUVEAU_DEBUG=5
CONFIG_NOUVEAU_DEBUG_DEFAULT=3
# CONFIG_NOUVEAU_DEBUG_MMU is not set
@@ -4817,12 +4816,7 @@ CONFIG_DRM_CIRRUS_QEMU=m
# CONFIG_DRM_LIMA is not set
# CONFIG_DRM_PANFROST is not set
# CONFIG_DRM_TIDSS is not set
-CONFIG_DRM_LEGACY=y
-# CONFIG_DRM_TDFX is not set
-# CONFIG_DRM_R128 is not set
-# CONFIG_DRM_MGA is not set
-# CONFIG_DRM_VIA is not set
-# CONFIG_DRM_SAVAGE is not set
+# CONFIG_DRM_LEGACY is not set
CONFIG_DRM_PANEL_ORIENTATION_QUIRKS=y
#
diff --git a/arch/x86/configs/openeuler_defconfig b/arch/x86/configs/openeuler_defconfig
index 7b6083018..dbae04231 100644
--- a/arch/x86/configs/openeuler_defconfig
+++ b/arch/x86/configs/openeuler_defconfig
@@ -5040,7 +5040,6 @@ CONFIG_DRM_TTM_DMA_PAGE_POOL=y
CONFIG_DRM_VRAM_HELPER=m
CONFIG_DRM_TTM_HELPER=m
CONFIG_DRM_GEM_SHMEM_HELPER=y
-CONFIG_DRM_VM=y
CONFIG_DRM_SCHED=m
#
@@ -5085,7 +5084,7 @@ CONFIG_DRM_AMD_DC_DCN=y
# CONFIG_HSA_AMD is not set
CONFIG_DRM_NOUVEAU=m
-CONFIG_NOUVEAU_LEGACY_CTX_SUPPORT=y
+# CONFIG_NOUVEAU_LEGACY_CTX_SUPPORT is not set
CONFIG_NOUVEAU_DEBUG=5
CONFIG_NOUVEAU_DEBUG_DEFAULT=3
# CONFIG_NOUVEAU_DEBUG_MMU is not set
@@ -5226,12 +5225,7 @@ CONFIG_DRM_CIRRUS_QEMU=m
# CONFIG_TINYDRM_ST7735R is not set
# CONFIG_DRM_XEN is not set
# CONFIG_DRM_VBOXVIDEO is not set
-CONFIG_DRM_LEGACY=y
-# CONFIG_DRM_TDFX is not set
-# CONFIG_DRM_R128 is not set
-# CONFIG_DRM_MGA is not set
-# CONFIG_DRM_VIA is not set
-# CONFIG_DRM_SAVAGE is not set
+# CONFIG_DRM_LEGACY is not set
CONFIG_DRM_PANEL_ORIENTATION_QUIRKS=y
#
--
2.27.0
From: Keefe LIU <liuqifa(a)huawei.com>
hulk inclusion
category: feature
bugzilla: 9511, https://gitee.com/openeuler/kernel/issues/I4IHL1
CVE: NA
-------------------------------------------------
In a typical IPvlan L2 setup the master is in the default-ns and
each slave is in a different (slave) ns. In this setup, if the master
and the slaves are in different nets, egress traffic originating from
a slave-ns cannot be forwarded to the master or to other machines
whose IP is in the same net as the master, nor can it be forwarded to
other interfaces in the default-ns.
This patch introduces a new mode, l2e, for ipvlan to realize the above
goals; it does not affect the original l2, l3 and l3s modes.
As the ip tool doesn't support the l2e mode, we use the module param
"ipvlan_default_mode" to set the default work mode: 0 for l2,
1 for l3, 2 for l2e, 3 for l3s; other values are invalid for now.
Note that when we create ipvlan devices with the "ip" command and
assign a mode explicitly, ipvlan will work in the assigned mode rather
than in "ipvlan_default_mode".
Signed-off-by: Keefe LIU <liuqifa(a)huawei.com>
Reviewed-by: Wei Yongjun <weiyongjun1(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
Modified:
when insmoding the ipvlan module, the original version (4.19) uses
0 for l2, 1 for l3, 2 for l2e and 3 for l3s, while the 5.10 version
uses 0 for l2, 1 for l3, 2 for l3s and 3 for l2e.
Signed-off-by: Lu Wei <luwei32(a)huawei.com>
Reviewed-by: Yue Haibing <yuehaibing(a)huawei.com>
Reviewed-by: Wei Yongjun <weiyongjun1(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
drivers/net/ipvlan/ipvlan_core.c | 190 +++++++++++++++++++++++++++++++
drivers/net/ipvlan/ipvlan_main.c | 6 +-
include/uapi/linux/if_link.h | 1 +
3 files changed, 196 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ipvlan/ipvlan_core.c b/drivers/net/ipvlan/ipvlan_core.c
index 8801d093135c..5b695ec5c650 100644
--- a/drivers/net/ipvlan/ipvlan_core.c
+++ b/drivers/net/ipvlan/ipvlan_core.c
@@ -532,6 +532,122 @@ static int ipvlan_process_outbound(struct sk_buff *skb)
return ret;
}
+static int ipvlan_process_v4_forward(struct sk_buff *skb)
+{
+ const struct iphdr *ip4h = ip_hdr(skb);
+ struct net_device *dev = skb->dev;
+ struct net *net = dev_net(dev);
+ struct rtable *rt;
+ int err, ret = NET_XMIT_DROP;
+ struct flowi4 fl4 = {
+ .flowi4_tos = RT_TOS(ip4h->tos),
+ .flowi4_flags = FLOWI_FLAG_ANYSRC,
+ .flowi4_mark = skb->mark,
+ .daddr = ip4h->daddr,
+ .saddr = ip4h->saddr,
+ };
+
+ rt = ip_route_output_flow(net, &fl4, NULL);
+ if (IS_ERR(rt))
+ goto err;
+
+ if (rt->rt_type != RTN_UNICAST && rt->rt_type != RTN_LOCAL) {
+ ip_rt_put(rt);
+ goto err;
+ }
+ skb_dst_set(skb, &rt->dst);
+ err = ip_local_out(net, skb->sk, skb);
+ if (unlikely(net_xmit_eval(err)))
+ dev->stats.tx_errors++;
+ else
+ ret = NET_XMIT_SUCCESS;
+ goto out;
+err:
+ dev->stats.tx_errors++;
+ kfree_skb(skb);
+out:
+ return ret;
+}
+
+#if IS_ENABLED(CONFIG_IPV6)
+static int ipvlan_process_v6_forward(struct sk_buff *skb)
+{
+ const struct ipv6hdr *ip6h = ipv6_hdr(skb);
+ struct net_device *dev = skb->dev;
+ struct net *net = dev_net(dev);
+ struct dst_entry *dst;
+ int err, ret = NET_XMIT_DROP;
+ struct flowi6 fl6 = {
+ .daddr = ip6h->daddr,
+ .saddr = ip6h->saddr,
+ .flowi6_flags = FLOWI_FLAG_ANYSRC,
+ .flowlabel = ip6_flowinfo(ip6h),
+ .flowi6_mark = skb->mark,
+ .flowi6_proto = ip6h->nexthdr,
+ };
+
+ dst = ip6_route_output(net, NULL, &fl6);
+ if (dst->error) {
+ ret = dst->error;
+ dst_release(dst);
+ goto err;
+ }
+ skb_dst_set(skb, dst);
+ err = ip6_local_out(net, skb->sk, skb);
+ if (unlikely(net_xmit_eval(err)))
+ dev->stats.tx_errors++;
+ else
+ ret = NET_XMIT_SUCCESS;
+ goto out;
+err:
+ dev->stats.tx_errors++;
+ kfree_skb(skb);
+out:
+ return ret;
+}
+#else
+static int ipvlan_process_v6_forward(struct sk_buff *skb)
+{
+ return NET_XMIT_DROP;
+}
+#endif
+
+static int ipvlan_process_forward(struct sk_buff *skb)
+{
+ struct ethhdr *ethh = eth_hdr(skb);
+ int ret = NET_XMIT_DROP;
+
+ /* In this mode we dont care about multicast and broadcast traffic */
+ if (is_multicast_ether_addr(ethh->h_dest)) {
+ pr_debug_ratelimited("Dropped {multi|broad}cast of type=[%x]\n",
+ ntohs(skb->protocol));
+ kfree_skb(skb);
+ goto out;
+ }
+
+ /* The ipvlan is a pseudo-L2 device, so the packets that we receive
+ * will have L2; which need to discarded and processed further
+ * in the net-ns of the main-device.
+ */
+ if (skb_mac_header_was_set(skb)) {
+ skb_pull(skb, sizeof(*ethh));
+ skb->mac_header = (typeof(skb->mac_header))~0U;
+ skb_reset_network_header(skb);
+ }
+
+ if (skb->protocol == htons(ETH_P_IPV6)) {
+ ret = ipvlan_process_v6_forward(skb);
+ } else if (skb->protocol == htons(ETH_P_IP)) {
+ ret = ipvlan_process_v4_forward(skb);
+ } else {
+ pr_warn_ratelimited("Dropped outbound packet type=%x\n",
+ ntohs(skb->protocol));
+ kfree_skb(skb);
+ }
+out:
+ return ret;
+}
+
static void ipvlan_multicast_enqueue(struct ipvl_port *port,
struct sk_buff *skb, bool tx_pkt)
{
@@ -629,6 +745,46 @@ static int ipvlan_xmit_mode_l2(struct sk_buff *skb, struct net_device *dev)
return dev_queue_xmit(skb);
}
+static int ipvlan_xmit_mode_l2e(struct sk_buff *skb, struct net_device *dev)
+{
+ const struct ipvl_dev *ipvlan = netdev_priv(dev);
+ struct ethhdr *eth = eth_hdr(skb);
+ struct ipvl_addr *addr;
+ void *lyr3h;
+ int addr_type;
+
+ if (!ipvlan_is_vepa(ipvlan->port) &&
+ ether_addr_equal(eth->h_dest, eth->h_source)) {
+ lyr3h = ipvlan_get_L3_hdr(ipvlan->port, skb, &addr_type);
+ if (lyr3h) {
+ addr = ipvlan_addr_lookup(ipvlan->port, lyr3h,
+ addr_type, true);
+ if (addr) {
+ if (ipvlan_is_private(ipvlan->port)) {
+ consume_skb(skb);
+ return NET_XMIT_DROP;
+ }
+ return ipvlan_rcv_frame(addr, &skb, true);
+ }
+ }
+ skb = skb_share_check(skb, GFP_ATOMIC);
+ if (!skb)
+ return NET_XMIT_DROP;
+
+ /* maybe the packet need been forward */
+ skb->dev = ipvlan->phy_dev;
+ ipvlan_skb_crossing_ns(skb, ipvlan->phy_dev);
+ return ipvlan_process_forward(skb);
+ } else if (is_multicast_ether_addr(eth->h_dest)) {
+ ipvlan_skb_crossing_ns(skb, NULL);
+ ipvlan_multicast_enqueue(ipvlan->port, skb, true);
+ return NET_XMIT_SUCCESS;
+ }
+
+ skb->dev = ipvlan->phy_dev;
+ return dev_queue_xmit(skb);
+}
+
int ipvlan_queue_xmit(struct sk_buff *skb, struct net_device *dev)
{
struct ipvl_dev *ipvlan = netdev_priv(dev);
@@ -648,6 +804,8 @@ int ipvlan_queue_xmit(struct sk_buff *skb, struct net_device *dev)
case IPVLAN_MODE_L3S:
#endif
return ipvlan_xmit_mode_l3(skb, dev);
+ case IPVLAN_MODE_L2E:
+ return ipvlan_xmit_mode_l2e(skb, dev);
}
/* Should not reach here */
@@ -729,6 +887,36 @@ static rx_handler_result_t ipvlan_handle_mode_l2(struct sk_buff **pskb,
return ret;
}
+static rx_handler_result_t ipvlan_handle_mode_l2e(struct sk_buff **pskb,
+ struct ipvl_port *port)
+{
+ struct sk_buff *skb = *pskb;
+ struct ethhdr *eth = eth_hdr(skb);
+ rx_handler_result_t ret = RX_HANDLER_PASS;
+
+ if (is_multicast_ether_addr(eth->h_dest)) {
+ if (ipvlan_external_frame(skb, port)) {
+ struct sk_buff *nskb = skb_clone(skb, GFP_ATOMIC);
+
+ /* External frames are queued for device local
+ * distribution, but a copy is given to master
+ * straight away to avoid sending duplicates later
+ * when work-queue processes this frame. This is
+ * achieved by returning RX_HANDLER_PASS.
+ */
+ if (nskb) {
+ ipvlan_skb_crossing_ns(nskb, NULL);
+ ipvlan_multicast_enqueue(port, nskb, false);
+ }
+ }
+ } else {
+ /* Perform like l3 mode for non-multicast packet */
+ ret = ipvlan_handle_mode_l3(pskb, port);
+ }
+
+ return ret;
+}
+
rx_handler_result_t ipvlan_handle_frame(struct sk_buff **pskb)
{
struct sk_buff *skb = *pskb;
@@ -746,6 +934,8 @@ rx_handler_result_t ipvlan_handle_frame(struct sk_buff **pskb)
case IPVLAN_MODE_L3S:
return RX_HANDLER_PASS;
#endif
+ case IPVLAN_MODE_L2E:
+ return ipvlan_handle_mode_l2e(pskb, port);
}
/* Should not reach here */
diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c
index 60b7d93bb834..efd1452cd929 100644
--- a/drivers/net/ipvlan/ipvlan_main.c
+++ b/drivers/net/ipvlan/ipvlan_main.c
@@ -4,6 +4,10 @@
#include "ipvlan.h"
+static int ipvlan_default_mode = IPVLAN_MODE_L3;
+module_param(ipvlan_default_mode, int, 0400);
+MODULE_PARM_DESC(ipvlan_default_mode, "set ipvlan default mode: 0 for l2, 1 for l3, 2 for l3s, 3 for l2e, others invalid now");
+
static int ipvlan_set_port_mode(struct ipvl_port *port, u16 nval,
struct netlink_ext_ack *extack)
{
@@ -535,7 +539,7 @@ int ipvlan_link_new(struct net *src_net, struct net_device *dev,
struct ipvl_port *port;
struct net_device *phy_dev;
int err;
- u16 mode = IPVLAN_MODE_L3;
+ u16 mode = ipvlan_default_mode;
if (!tb[IFLA_LINK])
return -EINVAL;
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index c4b23f06f69e..50d4705e1cbc 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -690,6 +690,7 @@ enum ipvlan_mode {
IPVLAN_MODE_L2 = 0,
IPVLAN_MODE_L3,
IPVLAN_MODE_L3S,
+ IPVLAN_MODE_L2E,
IPVLAN_MODE_MAX
};
--
2.20.1
[PATCH OLK-5.10 v2] arm64: Revert feature: Add memmap parameter and register pmem
by Zhuling 27 Dec '21
euleros inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4O31I?from=project-issue
CVE: NA
The reserved memory of PMEM conflicts with
"add memmap interface to reserved memory for mremap syscall usage",
so the feature needs to be rolled back and resubmitted after adaptation.
Related feature commits:
1. PMEM function commit: 94dc364f5eda10f49449ba573dc3322e1ea92280
2. PMEM feature config commit: 36d7a831e15ceb84e937122c87d01c14242dc377
Signed-off-by: Zhuling <zhuling8(a)huawei.com>
---
arch/arm64/Kconfig | 21 --------
arch/arm64/configs/openeuler_defconfig | 3 --
arch/arm64/kernel/Makefile | 1 -
arch/arm64/kernel/pmem.c | 35 -------------
arch/arm64/kernel/setup.c | 10 ----
arch/arm64/mm/init.c | 94 ----------------------------------
drivers/nvdimm/Kconfig | 5 --
drivers/nvdimm/Makefile | 1 -
8 files changed, 170 deletions(-)
delete mode 100644 arch/arm64/kernel/pmem.c
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index df90a6e..2df4b31 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1321,27 +1321,6 @@ config RODATA_FULL_DEFAULT_ENABLED
This requires the linear region to be mapped down to pages,
which may adversely affect performance in some cases.
-config ARM64_PMEM_RESERVE
- bool "Reserve memory for persistent storage"
- default n
- help
- Use memmap=nn[KMG]!ss[KMG](memmap=100K!0x1a0000000) reserve
- memory for persistent storage.
-
- Say y here to enable this feature.
-
-config ARM64_PMEM_LEGACY_DEVICE
- bool "Create persistent storage"
- depends on BLK_DEV
- depends on LIBNVDIMM
- select ARM64_PMEM_RESERVE
- help
- Use reserved memory for persistent storage when the kernel
- restart or update. the data in PMEM will not be lost and
- can be loaded faster.
-
- Say y if unsure.
-
config ARM64_SW_TTBR0_PAN
bool "Emulate Privileged Access Never using TTBR0_EL1 switching"
help
diff --git a/arch/arm64/configs/openeuler_defconfig b/arch/arm64/configs/openeuler_defconfig
index 17bc875..b5fc851 100644
--- a/arch/arm64/configs/openeuler_defconfig
+++ b/arch/arm64/configs/openeuler_defconfig
@@ -416,8 +416,6 @@ CONFIG_ARM64_CPU_PARK=y
CONFIG_FORCE_MAX_ZONEORDER=11
CONFIG_UNMAP_KERNEL_AT_EL0=y
CONFIG_RODATA_FULL_DEFAULT_ENABLED=y
-CONFIG_ARM64_PMEM_RESERVE=y
-CONFIG_ARM64_PMEM_LEGACY_DEVICE=y
# CONFIG_ARM64_SW_TTBR0_PAN is not set
CONFIG_ARM64_TAGGED_ADDR_ABI=y
CONFIG_ARM64_ILP32=y
@@ -6026,7 +6024,6 @@ CONFIG_ND_BTT=m
CONFIG_BTT=y
CONFIG_OF_PMEM=m
CONFIG_NVDIMM_KEYS=y
-CONFIG_PMEM_LEGACY=m
CONFIG_DAX_DRIVER=y
CONFIG_DAX=y
CONFIG_DEV_DAX=m
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index f615325..169d90f 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -68,7 +68,6 @@ obj-$(CONFIG_ARM64_PTR_AUTH) += pointer_auth.o
obj-$(CONFIG_SHADOW_CALL_STACK) += scs.o
obj-$(CONFIG_ARM64_MTE) += mte.o
obj-$(CONFIG_MPAM) += mpam/
-obj-$(CONFIG_ARM64_PMEM_LEGACY_DEVICE) += pmem.o
obj-y += vdso/ probes/
obj-$(CONFIG_COMPAT_VDSO) += vdso32/
diff --git a/arch/arm64/kernel/pmem.c b/arch/arm64/kernel/pmem.c
deleted file mode 100644
index 16eaf70..0000000
--- a/arch/arm64/kernel/pmem.c
+++ /dev/null
@@ -1,35 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Copyright(c) 2021 Huawei Technologies Co., Ltd
- *
- * Derived from x86 and arm64 implement PMEM.
- */
-#include <linux/platform_device.h>
-#include <linux/init.h>
-#include <linux/ioport.h>
-#include <linux/module.h>
-
-static int found(struct resource *res, void *data)
-{
- return 1;
-}
-
-static int __init register_e820_pmem(void)
-{
- struct platform_device *pdev;
- int rc;
-
- rc = walk_iomem_res_desc(IORES_DESC_PERSISTENT_MEMORY_LEGACY,
- IORESOURCE_MEM, 0, -1, NULL, found);
- if (rc <= 0)
- return 0;
-
- /*
- * See drivers/nvdimm/e820.c for the implementation, this is
- * simply here to trigger the module to load on demand.
- */
- pdev = platform_device_alloc("e820_pmem", -1);
-
- return platform_device_add(pdev);
-}
-device_initcall(register_e820_pmem);
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 88ec495..7cd0425 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -70,10 +70,6 @@ static int __init arm64_enable_cpu0_hotplug(char *str)
__setup("arm64_cpu0_hotplug", arm64_enable_cpu0_hotplug);
#endif
-#ifdef CONFIG_ARM64_PMEM_RESERVE
-extern struct resource pmem_res;
-#endif
-
phys_addr_t __fdt_pointer __initdata;
/*
@@ -288,12 +284,6 @@ static void __init request_standard_resources(void)
request_resource(res, &pin_memory_resource);
#endif
}
-
-#ifdef CONFIG_ARM64_PMEM_RESERVE
- if (pmem_res.end && pmem_res.start)
- request_resource(&iomem_resource, &pmem_res);
-#endif
-
}
static int __init reserve_memblock_reserved_regions(void)
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 3b9401e..e8d4461 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -55,7 +55,6 @@
*/
s64 memstart_addr __ro_after_init = -1;
EXPORT_SYMBOL(memstart_addr);
-phys_addr_t start_at, mem_size;
#ifdef CONFIG_PIN_MEMORY
struct resource pin_memory_resource = {
@@ -112,18 +111,6 @@ static void __init reserve_pin_memory_res(void)
*/
phys_addr_t arm64_dma_phys_limit __ro_after_init;
-static unsigned long long pmem_size, pmem_start;
-
-#ifdef CONFIG_ARM64_PMEM_RESERVE
-struct resource pmem_res = {
- .name = "Persistent Memory (legacy)",
- .start = 0,
- .end = 0,
- .flags = IORESOURCE_MEM,
- .desc = IORES_DESC_PERSISTENT_MEMORY_LEGACY
-};
-#endif
-
#ifndef CONFIG_KEXEC_CORE
static void __init reserve_crashkernel(void)
{
@@ -417,83 +404,6 @@ static int __init reserve_park_mem(void)
}
#endif
-static bool __init is_mem_valid(unsigned long long mem_size, unsigned long long mem_start)
-{
- if (!memblock_is_region_memory(mem_start, mem_size)) {
- pr_warn("cannot reserve mem: region is not memory!\n");
- return false;
- }
-
- if (memblock_is_region_reserved(mem_start, mem_size)) {
- pr_warn("cannot reserve mem: region overlaps reserved memory!\n");
- return false;
- }
-
- if (!IS_ALIGNED(mem_start, SZ_2M)) {
- pr_warn("cannot reserve mem: base address is not 2MB aligned!\n");
- return false;
- }
-
- return true;
-}
-
-static int __init parse_memmap_one(char *p)
-{
- char *oldp;
-
- if (!p)
- return -EINVAL;
-
- oldp = p;
- mem_size = memparse(p, &p);
- if (p == oldp)
- return -EINVAL;
-
- if (!mem_size)
- return -EINVAL;
-
- mem_size = PAGE_ALIGN(mem_size);
-
- if (*p == '!') {
- start_at = memparse(p+1, &p);
-
- pmem_start = start_at;
- pmem_size = mem_size;
- } else
- pr_info("Unrecognized memmap option, please check the parameter.\n");
-
- return *p == '\0' ? 0 : -EINVAL;
-}
-
-static int __init parse_memmap_opt(char *str)
-{
- while (str) {
- char *k = strchr(str, ',');
-
- if (k)
- *k++ = 0;
- parse_memmap_one(str);
- str = k;
- }
-
- return 0;
-}
-early_param("memmap", parse_memmap_opt);
-
-#ifdef CONFIG_ARM64_PMEM_RESERVE
-static void __init reserve_pmem(void)
-{
- if (!is_mem_valid(mem_size, start_at))
- return;
-
- memblock_remove(pmem_start, pmem_size);
- pr_info("pmem reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
- pmem_start, pmem_start + pmem_size, pmem_size >> 20);
- pmem_res.start = pmem_start;
- pmem_res.end = pmem_start + pmem_size - 1;
-}
-#endif
-
void __init arm64_memblock_init(void)
{
const s64 linear_region_size = BIT(vabits_actual - 1);
@@ -668,10 +578,6 @@ void __init bootmem_init(void)
reserve_quick_kexec();
#endif
-#ifdef CONFIG_ARM64_PMEM_RESERVE
- reserve_pmem();
-#endif
-
reserve_pin_memory_res();
memblock_dump_all();
diff --git a/drivers/nvdimm/Kconfig b/drivers/nvdimm/Kconfig
index ce4de75..b7d1eb3 100644
--- a/drivers/nvdimm/Kconfig
+++ b/drivers/nvdimm/Kconfig
@@ -132,8 +132,3 @@ config NVDIMM_TEST_BUILD
infrastructure.
endif
-
-config PMEM_LEGACY
- tristate "Pmem_legacy"
- select X86_PMEM_LEGACY if X86
- select ARM64_PMEM_LEGACY_DEVICE if ARM64
diff --git a/drivers/nvdimm/Makefile b/drivers/nvdimm/Makefile
index 6f8dc92..0407753 100644
--- a/drivers/nvdimm/Makefile
+++ b/drivers/nvdimm/Makefile
@@ -3,7 +3,6 @@ obj-$(CONFIG_LIBNVDIMM) += libnvdimm.o
obj-$(CONFIG_BLK_DEV_PMEM) += nd_pmem.o
obj-$(CONFIG_ND_BTT) += nd_btt.o
obj-$(CONFIG_ND_BLK) += nd_blk.o
-obj-$(CONFIG_PMEM_LEGACY) += nd_e820.o
obj-$(CONFIG_OF_PMEM) += of_pmem.o
obj-$(CONFIG_VIRTIO_PMEM) += virtio_pmem.o nd_virtio.o
--
2.9.5
[PATCH openEuler-1.0-LTS 01/36] net: Prevent infinite while loop in skb_tx_hash()
by Yang Yingliang 27 Dec '21
From: Michael Chan <michael.chan(a)broadcom.com>
stable inclusion
from linux-4.19.215
commit 02302cbd52264337630a32848ac03648648e9685
--------------------------------
commit 0c57eeecc559ca6bc18b8c4e2808bc78dbe769b0 upstream.
Drivers call netdev_set_num_tc() and then netdev_set_tc_queue()
to set the queue count and offset for each TC. So the queue count
and offset for the TCs may be zero for a short period after dev->num_tc
has been set. If a TX packet is being transmitted at this time in the
code path netdev_pick_tx() -> skb_tx_hash(), skb_tx_hash() may see
nonzero dev->num_tc but zero qcount for the TC. The while loop that
keeps looping while hash >= qcount will not end.
Fix it by checking the TC's qcount to be nonzero before using it.
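For reference, a condensed sketch of the loop in question (simplified
from skb_tx_hash() in net/core/dev.c; not the complete function):
	/* Condensed from skb_tx_hash(): if the TC's count is still 0, the
	 * subtraction loop never terminates, since the unsigned hash can
	 * never drop below zero. */
	u16 qoffset = sb_dev->tc_to_txq[tc].offset;
	u16 qcount  = sb_dev->tc_to_txq[tc].count;	/* may still be 0 */
	u32 hash    = skb_get_rx_queue(skb);
	while (unlikely(hash >= qcount))		/* spins when qcount == 0 */
		hash -= qcount;
	return hash + qoffset;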
Fixes: eadec877ce9c ("net: Add support for subordinate traffic classes to netdev_pick_tx")
Reviewed-by: Andy Gospodarek <gospo(a)broadcom.com>
Signed-off-by: Michael Chan <michael.chan(a)broadcom.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
net/core/dev.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/net/core/dev.c b/net/core/dev.c
index 5d9800804d4a4..6ed7810ed7bd4 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2846,6 +2846,12 @@ static u16 skb_tx_hash(const struct net_device *dev,
qoffset = sb_dev->tc_to_txq[tc].offset;
qcount = sb_dev->tc_to_txq[tc].count;
+ if (unlikely(!qcount)) {
+ net_warn_ratelimited("%s: invalid qcount, qoffset %u for tc %u\n",
+ sb_dev->name, qoffset, tc);
+ qoffset = 0;
+ qcount = dev->real_num_tx_queues;
+ }
}
if (skb_rx_queue_recorded(skb)) {
--
2.25.1
[PATCH openEuler-1.0-LTS] watchdog: Fix check_preemption_disabled() error
by Yang Yingliang 27 Dec '21
From: Wei Li <liwei391(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: 173968, https://gitee.com/openeuler/kernel/issues/I3J87Y
CVE: NA
-------------------------------------------------
When CONFIG_DEBUG_PREEMPT and CONFIG_PREEMPT are enabled, a 'BUG' is
triggered in the pmu based nmi_watchdog initialization:
[ 3.341853] BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
[ 3.344392] caller is debug_smp_processor_id+0x17/0x20
[ 3.344395] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.10.0+ #398
[ 3.344397] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
[ 3.344399] Call Trace:
[ 3.344410] dump_stack+0x60/0x76
[ 3.344412] check_preemption_disabled+0xba/0xc0
[ 3.344415] debug_smp_processor_id+0x17/0x20
[ 3.344422] hardlockup_detector_event_create+0xf/0x60
[ 3.344427] hardlockup_detector_perf_init+0xf/0x41
[ 3.344430] watchdog_nmi_probe+0xe/0x10
[ 3.344432] lockup_detector_init+0x22/0x5b
[ 3.344437] kernel_init_freeable+0x20c/0x245
[ 3.344439] ? rest_init+0xd0/0xd0
[ 3.344441] kernel_init+0xe/0x110
[ 3.344446] ret_from_fork+0x22/0x30
This issue was introduced by commit 141482cb4b01, which moved
lockup_detector_init() down after do_basic_setup(), and therefore after
sched_init_smp() too.
hardlockup_detector_event_create
  |- hardlockup_detector_perf_init (unsafe)
  |    |- watchdog_nmi_probe
  |         |- lockup_detector_init
  |- hardlockup_detector_perf_enable
       |- watchdog_nmi_enable
            |- watchdog_enable
                 |- lockup_detector_online_cpu
                 |- softlockup_start_fn
                      |- softlockup_start_all
                           |- lockup_detector_reconfigure
                                |- lockup_detector_setup
                                     |- lockup_detector_init
After analysing the calling context, smp_processor_id() is only unsafe
in hardlockup_detector_perf_init(), because the 'kernel_init' thread is
preemptible after sched_init_smp().
Since this call is merely a test of whether the pmu based nmi_watchdog
can be enabled, and the real enabling happens later in
softlockup_start_fn(), which ensures that watchdog_enable() is called
on all cores, it is safe to disable preemption here to fix this 'BUG'.
Fixes: 141482cb4b01 ("lockup_detector: init lockup detector after all the init_calls")
Signed-off-by: Wei Li <liwei391(a)huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
kernel/watchdog_hld.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c
index 43832b1023693..a5aff8ffa48ce 100644
--- a/kernel/watchdog_hld.c
+++ b/kernel/watchdog_hld.c
@@ -508,14 +508,17 @@ void __init hardlockup_detector_perf_restart(void)
*/
int __init hardlockup_detector_perf_init(void)
{
- int ret = hardlockup_detector_event_create();
+ int ret;
+ preempt_disable();
+ ret = hardlockup_detector_event_create();
if (ret) {
pr_info("Perf NMI watchdog permanently disabled\n");
} else {
perf_event_release_kernel(this_cpu_read(watchdog_ev));
this_cpu_write(watchdog_ev, NULL);
}
+ preempt_enable();
return ret;
}
#endif /* CONFIG_HARDLOCKUP_DETECTOR_PERF */
--
2.25.1
[PATCH openEuler-1.0-LTS] btrfs: unlock newly allocated extent buffer after error
by Yang Yingliang 27 Dec '21
From: Qu Wenruo <wqu(a)suse.com>
mainline inclusion
from mainline-v5.15-rc6
commit 19ea40dddf1833db868533958ca066f368862211
category: bugfix
bugzilla: NA
CVE: CVE-2021-4149
--------------------------------
[BUG]
There is a bug report that an injected ENOMEM error could leave a tree
block locked when we return to user-space:
BTRFS info (device loop0): enabling ssd optimizations
FAULT_INJECTION: forcing a failure.
name failslab, interval 1, probability 0, space 0, times 0
CPU: 0 PID: 7579 Comm: syz-executor Not tainted 5.15.0-rc1 #16
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
Call Trace:
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x8d/0xcf lib/dump_stack.c:106
fail_dump lib/fault-inject.c:52 [inline]
should_fail+0x13c/0x160 lib/fault-inject.c:146
should_failslab+0x5/0x10 mm/slab_common.c:1328
slab_pre_alloc_hook.constprop.99+0x4e/0xc0 mm/slab.h:494
slab_alloc_node mm/slub.c:3120 [inline]
slab_alloc mm/slub.c:3214 [inline]
kmem_cache_alloc+0x44/0x280 mm/slub.c:3219
btrfs_alloc_delayed_extent_op fs/btrfs/delayed-ref.h:299 [inline]
btrfs_alloc_tree_block+0x38c/0x670 fs/btrfs/extent-tree.c:4833
__btrfs_cow_block+0x16f/0x7d0 fs/btrfs/ctree.c:415
btrfs_cow_block+0x12a/0x300 fs/btrfs/ctree.c:570
btrfs_search_slot+0x6b0/0xee0 fs/btrfs/ctree.c:1768
btrfs_insert_empty_items+0x80/0xf0 fs/btrfs/ctree.c:3905
btrfs_new_inode+0x311/0xa60 fs/btrfs/inode.c:6530
btrfs_create+0x12b/0x270 fs/btrfs/inode.c:6783
lookup_open+0x660/0x780 fs/namei.c:3282
open_last_lookups fs/namei.c:3352 [inline]
path_openat+0x465/0xe20 fs/namei.c:3557
do_filp_open+0xe3/0x170 fs/namei.c:3588
do_sys_openat2+0x357/0x4a0 fs/open.c:1200
do_sys_open+0x87/0xd0 fs/open.c:1216
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x34/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x46ae99
Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48
89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d
01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f46711b9c48 EFLAGS: 00000246 ORIG_RAX: 0000000000000055
RAX: ffffffffffffffda RBX: 000000000078c0a0 RCX: 000000000046ae99
RDX: 0000000000000000 RSI: 00000000000000a1 RDI: 0000000020005800
RBP: 00007f46711b9c80 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000017
R13: 0000000000000000 R14: 000000000078c0a0 R15: 00007ffc129da6e0
================================================
WARNING: lock held when returning to user space!
5.15.0-rc1 #16 Not tainted
------------------------------------------------
syz-executor/7579 is leaving the kernel with locks still held!
1 lock held by syz-executor/7579:
#0: ffff888104b73da8 (btrfs-tree-01/1){+.+.}-{3:3}, at:
__btrfs_tree_lock+0x2e/0x1a0 fs/btrfs/locking.c:112
[CAUSE]
In btrfs_alloc_tree_block(), after btrfs_init_new_buffer(), the new
extent buffer @buf is locked, but if later operations like adding the
delayed tree ref fail, we just free @buf without unlocking it,
resulting in the above warning.
[FIX]
Unlock @buf in out_free_buf: label.
Reported-by: Hao Sun <sunhao.th(a)gmail.com>
Link: https://lore.kernel.org/linux-btrfs/CACkBjsZ9O6Zr0KK1yGn=1rQi6Crh1yeCRdTSBx…
CC: stable(a)vger.kernel.org # 5.4+
Signed-off-by: Qu Wenruo <wqu(a)suse.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Reviewed-by: Jason Yan <yanaijie(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
fs/btrfs/extent-tree.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 4bf93184362ec..dcbb76b62a0b0 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -8324,6 +8324,7 @@ struct extent_buffer *btrfs_alloc_tree_block(struct btrfs_trans_handle *trans,
out_free_delayed:
btrfs_free_delayed_extent_op(extent_op);
out_free_buf:
+ btrfs_tree_unlock(buf);
free_extent_buffer(buf);
out_free_reserved:
btrfs_free_reserved_extent(fs_info, ins.objectid, ins.offset, 0);
--
2.25.1
[PATCH openEuler-1.0-LTS] ext4: Fix null-ptr-deref in '__ext4_journal_ensure_credits'
by Yang Yingliang 25 Dec '21
From: Ye Bin <yebin10(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: 185945, https://gitee.com/openeuler/kernel/issues/I4O33F
CVE: NA
-----------------------------------------------
We got the following issue when running a syzkaller test:
[ 1901.130043] EXT4-fs error (device vda): ext4_remount:5624: comm syz-executor.5: Abort forced by user
[ 1901.130901] Aborting journal on device vda-8.
[ 1901.131437] EXT4-fs error (device vda): ext4_journal_check_start:61: comm syz-executor.16: Detected aborted journal
[ 1901.131566] EXT4-fs error (device vda): ext4_journal_check_start:61: comm syz-executor.11: Detected aborted journal
[ 1901.132586] EXT4-fs error (device vda): ext4_journal_check_start:61: comm syz-executor.18: Detected aborted journal
[ 1901.132751] EXT4-fs error (device vda): ext4_journal_check_start:61: comm syz-executor.9: Detected aborted journal
[ 1901.136149] EXT4-fs error (device vda) in ext4_reserve_inode_write:6035: Journal has aborted
[ 1901.136837] EXT4-fs error (device vda): ext4_journal_check_start:61: comm syz-fuzzer: Detected aborted journal
[ 1901.136915] ==================================================================
[ 1901.138175] BUG: KASAN: null-ptr-deref in __ext4_journal_ensure_credits+0x74/0x140 [ext4]
[ 1901.138343] EXT4-fs error (device vda): ext4_journal_check_start:61: comm syz-executor.13: Detected aborted journal
[ 1901.138398] EXT4-fs error (device vda): ext4_journal_check_start:61: comm syz-executor.1: Detected aborted journal
[ 1901.138808] Read of size 8 at addr 0000000000000000 by task syz-executor.17/968
[ 1901.138817]
[ 1901.138852] EXT4-fs error (device vda): ext4_journal_check_start:61: comm syz-executor.30: Detected aborted journal
[ 1901.144779] CPU: 1 PID: 968 Comm: syz-executor.17 Not tainted 4.19.90-vhulk2111.1.0.h893.eulerosv2r10.aarch64+ #1
[ 1901.146479] Hardware name: linux,dummy-virt (DT)
[ 1901.147317] Call trace:
[ 1901.147552] dump_backtrace+0x0/0x2d8
[ 1901.147898] show_stack+0x28/0x38
[ 1901.148215] dump_stack+0xec/0x15c
[ 1901.148746] kasan_report+0x108/0x338
[ 1901.149207] __asan_load8+0x58/0xb0
[ 1901.149753] __ext4_journal_ensure_credits+0x74/0x140 [ext4]
[ 1901.150579] ext4_xattr_delete_inode+0xe4/0x700 [ext4]
[ 1901.151316] ext4_evict_inode+0x524/0xba8 [ext4]
[ 1901.151985] evict+0x1a4/0x378
[ 1901.152353] iput+0x310/0x428
[ 1901.152733] do_unlinkat+0x260/0x428
[ 1901.153056] __arm64_sys_unlinkat+0x6c/0xc0
[ 1901.153455] el0_svc_common+0xc8/0x320
[ 1901.153799] el0_svc_handler+0xf8/0x160
[ 1901.154265] el0_svc+0x10/0x218
[ 1901.154682] ==================================================================
This issue may happen like this:
Process1                                     Process2
ext4_evict_inode
  ext4_journal_start
  ext4_truncate
    ext4_ind_truncate
      ext4_free_branches
        ext4_ind_truncate_ensure_credits
          ext4_journal_ensure_credits_fn
            ext4_journal_restart
              handle->h_transaction = NULL;
                                             mount -o remount,abort /mnt
                                             -> trigger JBD abort
              start_this_handle -> will return failed
  ext4_xattr_delete_inode
    ext4_journal_ensure_credits
      ext4_journal_ensure_credits_fn
        __ext4_journal_ensure_credits
          jbd2_handle_buffer_credits
            journal = handle->h_transaction->t_journal; -> null-ptr-deref
Currently, the indirect truncate path does not handle this error.
Simply adding a check for an aborted handle in
'__ext4_journal_ensure_credits' is enough to solve this issue, and the
check is necessary in any case.
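For reference, a condensed sketch of the dereference site (simplified
from jbd2_handle_buffer_credits(); field names follow mainline and may
differ slightly in this backport, and the credit arithmetic is elided):
	/* Once ext4_journal_restart() has failed and left
	 * handle->h_transaction == NULL, this dereference oopses. */
	static inline int jbd2_handle_buffer_credits(handle_t *handle)
	{
		journal_t *journal;
		if (!handle->h_reserved)
			journal = handle->h_transaction->t_journal; /* NULL deref */
		else
			journal = handle->h_journal;
		/* ... credit arithmetic using journal elided ... */
	}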
Signed-off-by: Ye Bin <yebin10(a)huawei.com>
Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
fs/ext4/ext4_jbd2.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/ext4/ext4_jbd2.c b/fs/ext4/ext4_jbd2.c
index 022ab8afd9499..2c6dc99292fdc 100644
--- a/fs/ext4/ext4_jbd2.c
+++ b/fs/ext4/ext4_jbd2.c
@@ -140,6 +140,8 @@ int __ext4_journal_ensure_credits(handle_t *handle, int check_cred,
{
if (!ext4_handle_valid(handle))
return 0;
+ if (is_handle_aborted(handle))
+ return -EROFS;
if (jbd2_handle_buffer_credits(handle) >= check_cred &&
handle->h_revoke_credits >= revoke_cred)
return 0;
--
2.25.1
[PATCH openEuler-1.0-LTS] net/hinic: Fix call trace when the rx_buff module parameter is grater than 2
by Yang Yingliang 25 Dec '21
From: Chiqijun <chiqijun(a)huawei.com>
driver inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4O2ZZ
-----------------------------------------------------------------------
When rx_buff is greater than 2, the driver allocates more than one page
of memory per network rx buffer, but the __GFP_COMP gfp flag is not
set, resulting in the following call trace:
CPU: 3 PID: 494041 Comm: ping Kdump: loaded Tainted: G W OE 4.19.90-2106.3.0.0095.oe1.x86_64 #1
Hardware name: Huawei Technologies Co., Ltd. RH2288H V3/BC11HGSA0, BIOS 5.15 05/21/2019
RIP: 0010:copy_page_to_iter+0x154/0x310
Code: 31 b8 00 10 00 00 f7 c6 00 80 00 00 74 07 0f b6 49 51 48 d3 e0 48 39 c2 0f 86 ed fe ff ff 48 c7 c7 30
RSP: 0018:ffffbd6907d03bd8 EFLAGS: 00010286
RAX: 0000000000000024 RBX: ffffe0ffee5b3000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff9edbbfcd6858 RDI: ffff9edbbfcd6858
RBP: 0000000000000001 R08: 000000000001574a R09: 0000000000000004
R10: 000000000000004e R11: 0000000000000001 R12: ffffbd6907d03ed0
R13: 0000000000002100 R14: 0000000000000030 R15: 0000000000000000
FS: 00007f9d37244dc0(0000) GS:ffff9edbbfcc0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffe0e715f80 CR3: 000000203c018005 CR4: 00000000001606e0
Call Trace:
skb_copy_datagram_iter+0x16c/0x2a0
raw_recvmsg+0xd0/0x1f0
inet_recvmsg+0x5b/0xd0
____sys_recvmsg+0x95/0x160
? import_iovec+0x37/0xd0
? copy_msghdr_from_user+0x5c/0x90
___sys_recvmsg+0x8c/0xd0
? __audit_syscall_exit+0x228/0x290
? kretprobe_trampoline+0x25/0x50
? __sys_recvmsg+0x5b/0xa0
__sys_recvmsg+0x5b/0xa0
do_syscall_64+0x5f/0x240
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Use 'dev_alloc_pages' instead of calling 'alloc_pages_node' directly.
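For context, a condensed sketch of why dev_alloc_pages() helps
(simplified from the mainline __dev_alloc_pages() helper; exact flags
can vary by kernel version):
#include <linux/gfp.h>
#include <linux/skbuff.h>
/* Simplified __dev_alloc_pages(): adding __GFP_COMP makes an order > 0
 * rx buffer a compound page, which the skb copy paths such as
 * copy_page_to_iter() handle correctly. */
static inline struct page *dev_alloc_pages_sketch(unsigned int order)
{
	gfp_t gfp_mask = GFP_ATOMIC | __GFP_NOWARN;
	gfp_mask |= __GFP_COMP | __GFP_MEMALLOC;
	return alloc_pages_node(NUMA_NO_NODE, gfp_mask, order);
}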
Signed-off-by: Chiqijun <chiqijun(a)huawei.com>
Reviewed-by: Wangxiaoyun <cloud.wangxiaoyun(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/net/ethernet/huawei/hinic/hinic_rx.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/huawei/hinic/hinic_rx.c b/drivers/net/ethernet/huawei/hinic/hinic_rx.c
index a9047432f3053..019cd439ce696 100644
--- a/drivers/net/ethernet/huawei/hinic/hinic_rx.c
+++ b/drivers/net/ethernet/huawei/hinic/hinic_rx.c
@@ -67,7 +67,7 @@ static bool rx_alloc_mapped_page(struct hinic_rxq *rxq,
return true;
/* alloc new page for storage */
- page = alloc_pages_node(NUMA_NO_NODE, GFP_ATOMIC, nic_dev->page_order);
+ page = dev_alloc_pages(nic_dev->page_order);
if (unlikely(!page)) {
RXQ_STATS_INC(rxq, alloc_rx_buf_err);
return false;
--
2.25.1
[PATCH openEuler-1.0-LTS 1/3] arm64/mpam: remove __init macro to support driver probe
by Yang Yingliang 24 Dec '21
From: Xingang Wang <wangxingang5(a)huawei.com>
ascend inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I49RB2
CVE: NA
---------------------------------------------------
To support device tree boot for arm64 mpam, the dts driver may need to
call these functions after the init phase, when __init code has already
been freed. Remove the __init macro from the related functions
accordingly.
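For background, a minimal illustration (made-up names, not from this
patch) of why __init is unsafe for code a driver may call after boot:
__init functions live in .init.text, which the kernel frees once boot
completes.
#include <linux/init.h>
static int __init boot_only_helper(void)	/* placed in .init.text */
{
	return 0;
}
static int dts_probe_sketch(void)
{
	/* If this probe runs after free_initmem(), the call below jumps
	 * into memory that has already been released. */
	return boot_only_helper();
}
Modpost's section-mismatch checks warn about exactly this pattern at
build time, which is why the annotations are dropped rather than worked
around.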
Signed-off-by: Xingang Wang <wangxingang5(a)huawei.com>
Reviewed-by: Wang ShaoBo <bobo.shaobowang(a)huawei.com>
Acked-by: Xie XiuQi <xiexiuqi(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
arch/arm64/include/asm/resctrl.h | 2 +-
arch/arm64/kernel/mpam/mpam_device.c | 14 +++++++-------
arch/arm64/kernel/mpam/mpam_internal.h | 2 +-
arch/arm64/kernel/mpam/mpam_resctrl.c | 2 +-
fs/resctrlfs.c | 2 +-
include/linux/arm_mpam.h | 14 +++++++-------
6 files changed, 18 insertions(+), 18 deletions(-)
diff --git a/arch/arm64/include/asm/resctrl.h b/arch/arm64/include/asm/resctrl.h
index ff4ce56430d1f..3fceaa482e5a3 100644
--- a/arch/arm64/include/asm/resctrl.h
+++ b/arch/arm64/include/asm/resctrl.h
@@ -424,7 +424,7 @@ int resctrl_update_groups_config(struct rdtgroup *rdtgrp);
#define RESCTRL_MAX_CLOSID 32
-int __init resctrl_group_init(void);
+int resctrl_group_init(void);
void post_resctrl_mount(void);
diff --git a/arch/arm64/kernel/mpam/mpam_device.c b/arch/arm64/kernel/mpam/mpam_device.c
index c0615f6947a1f..4139b4476a836 100644
--- a/arch/arm64/kernel/mpam/mpam_device.c
+++ b/arch/arm64/kernel/mpam/mpam_device.c
@@ -534,7 +534,7 @@ static void mpam_disable_irqs(void)
* Scheduled by mpam_discovery_complete() once all devices have been created.
* Also scheduled when new devices are probed when new CPUs come online.
*/
-static void __init mpam_enable(struct work_struct *work)
+static void mpam_enable(struct work_struct *work)
{
int err;
unsigned long flags;
@@ -761,7 +761,7 @@ static struct mpam_class * __init mpam_class_get(u8 level_idx,
* class/component structures may be allocated.
* Returns the new device, or an ERR_PTR().
*/
-struct mpam_device * __init
+struct mpam_device *
__mpam_device_create(u8 level_idx, enum mpam_class_types type,
int component_id, const struct cpumask *fw_affinity,
phys_addr_t hwpage_address)
@@ -810,7 +810,7 @@ __mpam_device_create(u8 level_idx, enum mpam_class_types type,
return dev;
}
-void __init mpam_device_set_error_irq(struct mpam_device *dev, u32 irq,
+void mpam_device_set_error_irq(struct mpam_device *dev, u32 irq,
u32 flags)
{
unsigned long irq_save_flags;
@@ -821,7 +821,7 @@ void __init mpam_device_set_error_irq(struct mpam_device *dev, u32 irq,
spin_unlock_irqrestore(&dev->lock, irq_save_flags);
}
-void __init mpam_device_set_overflow_irq(struct mpam_device *dev, u32 irq,
+void mpam_device_set_overflow_irq(struct mpam_device *dev, u32 irq,
u32 flags)
{
unsigned long irq_save_flags;
@@ -864,7 +864,7 @@ static inline u16 mpam_cpu_max_pmg(void)
/*
* prepare for initializing devices.
*/
-int __init mpam_discovery_start(void)
+int mpam_discovery_start(void)
{
if (!mpam_cpus_have_feature())
return -EOPNOTSUPP;
@@ -1104,7 +1104,7 @@ static int mpam_cpu_offline(unsigned int cpu)
return 0;
}
-int __init mpam_discovery_complete(void)
+int mpam_discovery_complete(void)
{
int ret = 0;
@@ -1121,7 +1121,7 @@ int __init mpam_discovery_complete(void)
return ret;
}
-void __init mpam_discovery_failed(void)
+void mpam_discovery_failed(void)
{
struct mpam_class *class, *tmp;
diff --git a/arch/arm64/kernel/mpam/mpam_internal.h b/arch/arm64/kernel/mpam/mpam_internal.h
index cfaef82428aa5..7b84ea54975aa 100644
--- a/arch/arm64/kernel/mpam/mpam_internal.h
+++ b/arch/arm64/kernel/mpam/mpam_internal.h
@@ -329,7 +329,7 @@ int mpam_resctrl_setup(void);
struct raw_resctrl_resource *
mpam_get_raw_resctrl_resource(u32 level);
-int __init mpam_resctrl_init(void);
+int mpam_resctrl_init(void);
int mpam_resctrl_set_default_cpu(unsigned int cpu);
void mpam_resctrl_clear_default_cpu(unsigned int cpu);
diff --git a/arch/arm64/kernel/mpam/mpam_resctrl.c b/arch/arm64/kernel/mpam/mpam_resctrl.c
index 51cdefebaeba8..1ad4cd49762a3 100644
--- a/arch/arm64/kernel/mpam/mpam_resctrl.c
+++ b/arch/arm64/kernel/mpam/mpam_resctrl.c
@@ -2209,7 +2209,7 @@ static int __init mpam_setup(char *str)
}
__setup("mpam", mpam_setup);
-int __init mpam_resctrl_init(void)
+int mpam_resctrl_init(void)
{
mpam_init_padding();
diff --git a/fs/resctrlfs.c b/fs/resctrlfs.c
index 11741c87eb0af..ea9df7d77b95a 100644
--- a/fs/resctrlfs.c
+++ b/fs/resctrlfs.c
@@ -1029,7 +1029,7 @@ static int __init resctrl_group_setup_root(void)
*
* Return: 0 on success or -errno
*/
-int __init resctrl_group_init(void)
+int resctrl_group_init(void)
{
int ret = 0;
diff --git a/include/linux/arm_mpam.h b/include/linux/arm_mpam.h
index 44d0690ae8c4b..a242d4fdff8b3 100644
--- a/include/linux/arm_mpam.h
+++ b/include/linux/arm_mpam.h
@@ -16,7 +16,7 @@ enum mpam_class_types {
MPAM_CLASS_UNKNOWN, /* Everything else, e.g. TLBs etc */
};
-struct mpam_device * __init
+struct mpam_device *
__mpam_device_create(u8 level_idx, enum mpam_class_types type,
int component_id, const struct cpumask *fw_affinity,
phys_addr_t hwpage_address);
@@ -54,9 +54,9 @@ mpam_device_create_memory(int nid, phys_addr_t hwpage_address)
return __mpam_device_create(~0, MPAM_CLASS_MEMORY, nid,
&dev_affinity, hwpage_address);
}
-int __init mpam_discovery_start(void);
-int __init mpam_discovery_complete(void);
-void __init mpam_discovery_failed(void);
+int mpam_discovery_start(void);
+int mpam_discovery_complete(void);
+void mpam_discovery_failed(void);
enum mpam_enable_type {
MPAM_ENABLE_DENIED = 0,
@@ -71,12 +71,12 @@ extern enum mpam_enable_type mpam_enabled;
#define mpam_irq_flags_to_acpi(x) ((x & MPAM_IRQ_MODE_LEVEL) ? \
ACPI_LEVEL_SENSITIVE : ACPI_EDGE_SENSITIVE)
-void __init mpam_device_set_error_irq(struct mpam_device *dev, u32 irq,
+void mpam_device_set_error_irq(struct mpam_device *dev, u32 irq,
u32 flags);
-void __init mpam_device_set_overflow_irq(struct mpam_device *dev, u32 irq,
+void mpam_device_set_overflow_irq(struct mpam_device *dev, u32 irq,
u32 flags);
-static inline int __init mpam_register_device_irq(struct mpam_device *dev,
+static inline int mpam_register_device_irq(struct mpam_device *dev,
u32 overflow_interrupt, u32 overflow_flags,
u32 error_interrupt, u32 error_flags)
{
--
2.25.1
[openEuler-5.10 01/19] drm/hisilicon: Support i2c driver algorithms for bit-shift adapters
by Zheng Zengkai 23 Dec '21
From: Ma Hai <mahai1(a)huawei.com>
mainline inclusion
from mainline-v5.12-rc2
commit 4eb4d99dfe3018d86f4529112aa7082f43b6996a
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4M6CD?from=project-issue
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/d…
----------------------------------------------------------------------
Add a driver implementation to support i2c driver algorithms for
bit-shift adapters, so hibmc will use the interface provided by
DRM to read the EDID.
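As a hedged illustration only (not part of this patch), a connector
.get_modes() hook can consume the new adapter through the core DRM EDID
helpers; hibmc_connector and to_hibmc_connector are the structures this
patch adds:

static int hibmc_connector_get_modes(struct drm_connector *connector)
{
	struct hibmc_connector *hibmc_connector = to_hibmc_connector(connector);
	struct edid *edid;
	int count = 0;

	/* bit-banged DDC transfer over the GPIO pins wired up below */
	edid = drm_get_edid(connector, &hibmc_connector->adapter);
	if (edid) {
		drm_connector_update_edid_property(connector, edid);
		count = drm_add_edid_modes(connector, edid);
		kfree(edid);
	}
	return count;
}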
Signed-off-by: Ma Hai <mahai1(a)huawei.com>
Signed-off-by: Tian Tao <tiantao6(a)hisilicon.com>
Reviewed-by: Thomas Zimmermann <tzimmermann(a)suse.de>
Reviewed-by: Li Dongming <lidongming5(a)huawei.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1600778670-60370-2-git-send-e…
Acked-by: Xie XiuQi <xiexiuqi(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
drivers/gpu/drm/hisilicon/hibmc/Makefile | 2 +-
.../gpu/drm/hisilicon/hibmc/hibmc_drm_drv.h | 25 ++++-
.../gpu/drm/hisilicon/hibmc/hibmc_drm_i2c.c | 99 +++++++++++++++++++
.../gpu/drm/hisilicon/hibmc/hibmc_drm_vdac.c | 2 +-
4 files changed, 125 insertions(+), 3 deletions(-)
create mode 100644 drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_i2c.c
diff --git a/drivers/gpu/drm/hisilicon/hibmc/Makefile b/drivers/gpu/drm/hisilicon/hibmc/Makefile
index f99132715597..684ef794eb7c 100644
--- a/drivers/gpu/drm/hisilicon/hibmc/Makefile
+++ b/drivers/gpu/drm/hisilicon/hibmc/Makefile
@@ -1,4 +1,4 @@
# SPDX-License-Identifier: GPL-2.0-only
-hibmc-drm-y := hibmc_drm_drv.o hibmc_drm_de.o hibmc_drm_vdac.o hibmc_ttm.o
+hibmc-drm-y := hibmc_drm_drv.o hibmc_drm_de.o hibmc_drm_vdac.o hibmc_ttm.o hibmc_drm_i2c.o
obj-$(CONFIG_DRM_HISI_HIBMC) += hibmc-drm.o
diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.h b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.h
index 197485e2fe0b..87d2aad0bb5e 100644
--- a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.h
+++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.h
@@ -14,11 +14,23 @@
#ifndef HIBMC_DRM_DRV_H
#define HIBMC_DRM_DRV_H
+#include <linux/gpio/consumer.h>
+#include <linux/i2c-algo-bit.h>
+#include <linux/i2c.h>
+
+#include <drm/drm_edid.h>
#include <drm/drm_fb_helper.h>
#include <drm/drm_framebuffer.h>
struct drm_device;
+struct hibmc_connector {
+ struct drm_connector base;
+
+ struct i2c_adapter adapter;
+ struct i2c_algo_bit_data bit_data;
+};
+
struct hibmc_drm_private {
/* hw */
void __iomem *mmio;
@@ -31,10 +43,20 @@ struct hibmc_drm_private {
struct drm_plane primary_plane;
struct drm_crtc crtc;
struct drm_encoder encoder;
- struct drm_connector connector;
+ struct hibmc_connector connector;
bool mode_config_initialized;
};
+static inline struct hibmc_connector *to_hibmc_connector(struct drm_connector *connector)
+{
+ return container_of(connector, struct hibmc_connector, base);
+}
+
+static inline struct hibmc_drm_private *to_hibmc_drm_private(struct drm_device *dev)
+{
+ return dev->dev_private;
+}
+
void hibmc_set_power_mode(struct hibmc_drm_private *priv,
unsigned int power_mode);
void hibmc_set_current_gate(struct hibmc_drm_private *priv,
@@ -47,6 +69,7 @@ int hibmc_mm_init(struct hibmc_drm_private *hibmc);
void hibmc_mm_fini(struct hibmc_drm_private *hibmc);
int hibmc_dumb_create(struct drm_file *file, struct drm_device *dev,
struct drm_mode_create_dumb *args);
+int hibmc_ddc_create(struct drm_device *drm_dev, struct hibmc_connector *connector);
extern const struct drm_mode_config_funcs hibmc_mode_funcs;
diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_i2c.c b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_i2c.c
new file mode 100644
index 000000000000..86d712090d87
--- /dev/null
+++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_i2c.c
@@ -0,0 +1,99 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/* Hisilicon Hibmc SoC drm driver
+ *
+ * Based on the bochs drm driver.
+ *
+ * Copyright (c) 2016 Huawei Limited.
+ *
+ * Author:
+ * Tian Tao <tiantao6(a)hisilicon.com>
+ */
+
+#include <linux/delay.h>
+#include <linux/pci.h>
+
+#include <drm/drm_atomic_helper.h>
+#include <drm/drm_probe_helper.h>
+
+#include "hibmc_drm_drv.h"
+
+#define GPIO_DATA 0x0802A0
+#define GPIO_DATA_DIRECTION 0x0802A4
+
+#define I2C_SCL_MASK BIT(0)
+#define I2C_SDA_MASK BIT(1)
+
+static void hibmc_set_i2c_signal(void *data, u32 mask, int value)
+{
+ struct hibmc_connector *hibmc_connector = data;
+ struct hibmc_drm_private *priv = to_hibmc_drm_private(hibmc_connector->base.dev);
+ u32 tmp_dir = readl(priv->mmio + GPIO_DATA_DIRECTION);
+
+ if (value) {
+ tmp_dir &= ~mask;
+ writel(tmp_dir, priv->mmio + GPIO_DATA_DIRECTION);
+ } else {
+ u32 tmp_data = readl(priv->mmio + GPIO_DATA);
+
+ tmp_data &= ~mask;
+ writel(tmp_data, priv->mmio + GPIO_DATA);
+
+ tmp_dir |= mask;
+ writel(tmp_dir, priv->mmio + GPIO_DATA_DIRECTION);
+ }
+}
+
+static int hibmc_get_i2c_signal(void *data, u32 mask)
+{
+ struct hibmc_connector *hibmc_connector = data;
+ struct hibmc_drm_private *priv = to_hibmc_drm_private(hibmc_connector->base.dev);
+ u32 tmp_dir = readl(priv->mmio + GPIO_DATA_DIRECTION);
+
+ if ((tmp_dir & mask) != mask) {
+ tmp_dir &= ~mask;
+ writel(tmp_dir, priv->mmio + GPIO_DATA_DIRECTION);
+ }
+
+ return (readl(priv->mmio + GPIO_DATA) & mask) ? 1 : 0;
+}
+
+static void hibmc_ddc_setsda(void *data, int state)
+{
+ hibmc_set_i2c_signal(data, I2C_SDA_MASK, state);
+}
+
+static void hibmc_ddc_setscl(void *data, int state)
+{
+ hibmc_set_i2c_signal(data, I2C_SCL_MASK, state);
+}
+
+static int hibmc_ddc_getsda(void *data)
+{
+ return hibmc_get_i2c_signal(data, I2C_SDA_MASK);
+}
+
+static int hibmc_ddc_getscl(void *data)
+{
+ return hibmc_get_i2c_signal(data, I2C_SCL_MASK);
+}
+
+int hibmc_ddc_create(struct drm_device *drm_dev,
+ struct hibmc_connector *connector)
+{
+ connector->adapter.owner = THIS_MODULE;
+ connector->adapter.class = I2C_CLASS_DDC;
+ snprintf(connector->adapter.name, I2C_NAME_SIZE, "HIS i2c bit bus");
+ connector->adapter.dev.parent = &drm_dev->pdev->dev;
+ i2c_set_adapdata(&connector->adapter, connector);
+ connector->adapter.algo_data = &connector->bit_data;
+
+ connector->bit_data.udelay = 20;
+ connector->bit_data.timeout = usecs_to_jiffies(2000);
+ connector->bit_data.data = connector;
+ connector->bit_data.setsda = hibmc_ddc_setsda;
+ connector->bit_data.setscl = hibmc_ddc_setscl;
+ connector->bit_data.getsda = hibmc_ddc_getsda;
+ connector->bit_data.getscl = hibmc_ddc_getscl;
+
+ return i2c_bit_add_bus(&connector->adapter);
+}
diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_vdac.c b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_vdac.c
index 376a05ddbc2f..c8b14afbcbed 100644
--- a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_vdac.c
+++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_vdac.c
@@ -78,7 +78,7 @@ int hibmc_vdac_init(struct hibmc_drm_private *priv)
{
struct drm_device *dev = priv->dev;
struct drm_encoder *encoder = &priv->encoder;
- struct drm_connector *connector = &priv->connector;
+ struct drm_connector *connector = &priv->connector.base;
int ret;
encoder->possible_crtcs = 0x1;
--
2.20.1
[PATCH openEuler-1.0-LTS 1/3] arm64/mpam: Fix mpam corrupt when cpu online
by Yang Yingliang 23 Dec '21
From: Wang ShaoBo <bobo.shaobowang(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I3YAI3
CVE: NA
-------------------------------------------------
The following error occurred occasionally on a machine that supports MPAM:
[ 13.321386][ T658] Unable to handle kernel paging request at virtual address ffff80001115816c
[ 13.326013][ T684] hid-generic 0003:12D1:0003.0002: input,hidraw1: USB HID v1.10 Mouse [Keyboard/Mouse KVM 1.1.0] on usb-0000:7a:01.0-1.1/input1
[ 13.340558][ T658] Mem abort info:
[ 13.340563][ T658] ESR = 0x86000007
[ 13.352567][ T5] hub 6-1:1.0: USB hub found
[ 13.364750][ T658] EC = 0x21: IABT (current EL), IL = 32 bits
[ 13.369891][ T5] hub 6-1:1.0: 4 ports detected
[ 13.373871][ T658] SET = 0, FnV = 0
[ 13.396107][ T658] EA = 0, S1PTW = 0
[ 13.400599][ T658] swapper pgtable: 64k pages, 48-bit VAs, pgdp=0000000029540000
[ 13.408726][ T658] [ffff80001115816c] pgd=0000205fffff0003, p4d=0000205fffff0003, pud=0000205fffff0003, pmd=0000205ffffe0003, pte=0000000000000000
[ 13.423346][ T658] Internal error: Oops: 86000007 [#1] SMP
[ 13.429720][ T658] Modules linked in:
[ 13.434243][ T658] CPU: 72 PID: 658 Comm: kworker/72:1 Not tainted 5.10.0-4.17.0.28.oe1.aarch64 #1
[ 13.443966][ T658] Hardware name: Huawei TaiShan 200 (Model 2280)/BC82AMDDA, BIOS 1.70 01/07/2021
[ 13.453683][ T658] Workqueue: events mpam_enable
[ 13.459206][ T658] pstate: 20c00009 (nzCv daif +PAN +UAO -TCO BTYPE=--)
[ 13.466625][ T658] pc : mpam_enable+0x194/0x1d8
[ 13.472019][ T658] lr : mpam_enable+0x194/0x1d8
[ 13.477301][ T658] sp : ffff80004664fd70
[ 13.481937][ T658] x29: ffff80004664fd70 x28: 0000000000000000
[ 13.488578][ T658] x27: ffff00400484a648 x26: ffff800011b71080
[ 13.495306][ T658] x25: 0000000000000000 x24: ffff800011b6cda0
[ 13.502001][ T658] x23: ffff800011646f18 x22: ffff800011b6cd80
[ 13.508684][ T658] x21: ffff800011b6c000 x20: ffff800011646f08
[ 13.515425][ T658] x19: ffff800011646f70 x18: 0000000000000020
[ 13.522075][ T658] x17: 000000001790b332 x16: 0000000000000001
[ 13.528785][ T658] x15: ffffffffffffffff x14: ff00000000000000
[ 13.535464][ T658] x13: ffffffffffffffff x12: 0000000000000006
[ 13.542045][ T658] x11: 00000091cea718e2 x10: 0000000000000b90
[ 13.548735][ T658] x9 : ffff80001009ebac x8 : ffff2040061aabf0
[ 13.555383][ T658] x7 : ffffa05f8dca0000 x6 : 000000000000000f
[ 13.561924][ T658] x5 : 0000000000000000 x4 : ffff2040061aa000
[ 13.568613][ T658] x3 : ffff80001164dfa0 x2 : 00000000ffffffff
[ 13.575267][ T658] x1 : ffffa05f8dca0000 x0 : 00000000000000c1
[ 13.581813][ T658] Call trace:
[ 13.585600][ T658] mpam_enable+0x194/0x1d8
[ 13.590450][ T658] process_one_work+0x1cc/0x390
[ 13.595654][ T658] worker_thread+0x70/0x2f0
[ 13.600499][ T658] kthread+0x118/0x120
[ 13.604935][ T658] ret_from_fork+0x10/0x18
[ 13.609717][ T658] Code: bad PC value
[ 13.613944][ T658] ---[ end trace f1e305d2c339f67f ]---
[ 13.753818][ T658] Kernel panic - not syncing: Oops: Fatal exception
[ 13.760885][ T658] SMP: stopping secondary CPUs
[ 13.765933][ T658] Kernel Offset: disabled
[ 13.770516][ T658] CPU features: 0x8040002,22208a38
[ 13.775862][ T658] Memory Limit: none
[ 13.913929][ T658] ---[ end Kernel panic - not syncing:
The MPAM device initialization process is as follows:
mpam_discovery_start()
... // discover devices
mpam_discovery_complete() // hang up the mpam_online/offline_cpu callbacks
-=> mpam_cpu_online() // probe all devices
-=> mpam_enable() // prepare for resctrl
(1) -=> cpuhp_remove_state() // clean resctrl internal structure
(2) -=> cpuhp_setup_state() // rehang mpam_online/offline_cpu callbacks
-=> mpam_cpu_online() // it does not call mpam_enable again
-=> mpam_resctrl_cpu_online() // pull up resctrl
The re-hang of the mpam_cpu_online/offline callbacks should not be
disturbed by irqs, to ensure that the CPU context is reliable before
mpam_cpu_online() is re-entered; the vulnerable window always lies
between (1) and (2).
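Schematic timeline of the window and the fix (a sketch, not literal
driver code):

	mpam_enable()
	    cpuhp_remove_state(...);    /* (1) callbacks removed */
	    /* an irq landing here leaves the CPU context unreliable
	     * for the re-hang below
	     */
	    local_irq_disable();        /* fix: close the window */
	    cpuhp_setup_state(...);     /* (2) callbacks re-registered */
	    local_irq_enable();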
Fixes: 2ab89c893faf ("arm64/mpam: resctrl: Re-synchronise resctrl's view of online CPUs")
Signed-off-by: Wang ShaoBo <bobo.shaobowang(a)huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
arch/arm64/kernel/mpam/mpam_device.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/arm64/kernel/mpam/mpam_device.c b/arch/arm64/kernel/mpam/mpam_device.c
index f8840274b902f..c0615f6947a1f 100644
--- a/arch/arm64/kernel/mpam/mpam_device.c
+++ b/arch/arm64/kernel/mpam/mpam_device.c
@@ -593,9 +593,11 @@ static void __init mpam_enable(struct work_struct *work)
pr_err("Failed to setup/init resctrl\n");
mutex_unlock(&mpam_devices_lock);
+ local_irq_disable();
mpam_cpuhp_state = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
"mpam:online", mpam_cpu_online,
mpam_cpu_offline);
+ local_irq_enable();
if (mpam_cpuhp_state <= 0)
pr_err("Failed to re-register 'dyn' cpuhp callbacks");
mutex_unlock(&mpam_cpuhp_lock);
--
2.25.1
[PATCH openEuler-1.0-LTS 1/2] kprobes: Set unoptimized flag after unoptimizing code
by Yang Yingliang 22 Dec '21
From: Masami Hiramatsu <mhiramat(a)kernel.org>
stable inclusion
from stable-v4.19.108
commit 39af044d1ccb207da43af5c060bb0fb0dd548b3e
category: bugfix
bugzilla: NA
CVE: NA
-------------------------------------------------
commit f66c0447cca1281116224d474cdb37d6a18e4b5b upstream.
Set the unoptimized flag after confirming the code is completely
unoptimized. Without this fix, when a kprobe hits the intermediate
modified instruction (the first byte is replaced by an INT3, but
later bytes can still be a jump address operand) while unoptimizing,
it can return to the middle byte of the modified code, which causes
an invalid instruction exception in the kernel.
Usually, this is a rare case, but if we put a probe on the function
call while text patching, it always causes a kernel panic as below:
# echo p text_poke+5 > kprobe_events
# echo 1 > events/kprobes/enable
# echo 0 > events/kprobes/enable
invalid opcode: 0000 [#1] PREEMPT SMP PTI
RIP: 0010:text_poke+0x9/0x50
Call Trace:
arch_unoptimize_kprobe+0x22/0x28
arch_unoptimize_kprobes+0x39/0x87
kprobe_optimizer+0x6e/0x290
process_one_work+0x2a0/0x610
worker_thread+0x28/0x3d0
? process_one_work+0x610/0x610
kthread+0x10d/0x130
? kthread_park+0x80/0x80
ret_from_fork+0x3a/0x50
text_poke() is used for patching the code in optprobes.
This can happen even if we blacklist text_poke() and other functions,
because there is a small time window during which we show the intermediate
code to other CPUs.
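Schematically, the corrected ordering (a sketch mirroring the diff
below) restores the original instruction first and only then clears the
flag:

	arch_unoptimize_kprobe(op);             /* original text fully restored */
	op->kp.flags &= ~KPROBE_FLAG_OPTIMIZED; /* only now report it unoptimized */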
[ mingo: Edited the changelog. ]
Tested-by: Alexei Starovoitov <ast(a)kernel.org>
Signed-off-by: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Steven Rostedt <rostedt(a)goodmis.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: bristot(a)redhat.com
Fixes: 6274de4984a6 ("kprobes: Support delayed unoptimizing")
Link: https://lkml.kernel.org/r/157483422375.25881.13508326028469515760.stgit@dev…
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Yang Jihong <yangjihong1(a)huawei.com>
Reviewed-by: Kuohai Xu <xukuohai(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
kernel/kprobes.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index f6e6edaf964c1..666243fff573d 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -523,6 +523,8 @@ static void do_unoptimize_kprobes(void)
arch_unoptimize_kprobes(&unoptimizing_list, &freeing_list);
/* Loop free_list for disarming */
list_for_each_entry_safe(op, tmp, &freeing_list, list) {
+ /* Switching from detour code to origin */
+ op->kp.flags &= ~KPROBE_FLAG_OPTIMIZED;
/* Disarm probes if marked disabled */
if (kprobe_disabled(&op->kp))
arch_disarm_kprobe(&op->kp);
@@ -662,6 +664,7 @@ static void force_unoptimize_kprobe(struct optimized_kprobe *op)
{
lockdep_assert_cpus_held();
arch_unoptimize_kprobe(op);
+ op->kp.flags &= ~KPROBE_FLAG_OPTIMIZED;
if (kprobe_disabled(&op->kp))
arch_disarm_kprobe(&op->kp);
}
@@ -689,7 +692,6 @@ static void unoptimize_kprobe(struct kprobe *p, bool force)
return;
}
- op->kp.flags &= ~KPROBE_FLAG_OPTIMIZED;
if (!list_empty(&op->list)) {
/* Dequeue from the optimization queue */
list_del_init(&op->list);
--
2.25.1
[PATCH openEuler-1.0-LTS] cpufreq: schedutil: Destroy mutex before kobject_put() frees the memory
by Yang Yingliang 22 Dec '21
From: James Morse <james.morse(a)arm.com>
stable inclusion
from linux-4.19.209
commit 3cb9595e23d879824ebcefeabdf5156bc288000c
--------------------------------
[ Upstream commit cdef1196608892b9a46caa5f2b64095a7f0be60c ]
Since commit e5c6b312ce3c ("cpufreq: schedutil: Use kobject release()
method to free sugov_tunables") kobject_put() has kfree()d the
attr_set before gov_attr_set_put() returns.
kobject_put() isn't the last user of attr_set in gov_attr_set_put();
the subsequent mutex_destroy() triggers a use-after-free:
| BUG: KASAN: use-after-free in mutex_is_locked+0x20/0x60
| Read of size 8 at addr ffff000800ca4250 by task cpuhp/2/20
|
| CPU: 2 PID: 20 Comm: cpuhp/2 Not tainted 5.15.0-rc1 #12369
| Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development
| Platform, BIOS EDK II Jul 30 2018
| Call trace:
| dump_backtrace+0x0/0x380
| show_stack+0x1c/0x30
| dump_stack_lvl+0x8c/0xb8
| print_address_description.constprop.0+0x74/0x2b8
| kasan_report+0x1f4/0x210
| kasan_check_range+0xfc/0x1a4
| __kasan_check_read+0x38/0x60
| mutex_is_locked+0x20/0x60
| mutex_destroy+0x80/0x100
| gov_attr_set_put+0xfc/0x150
| sugov_exit+0x78/0x190
| cpufreq_offline.isra.0+0x2c0/0x660
| cpuhp_cpufreq_offline+0x14/0x24
| cpuhp_invoke_callback+0x430/0x6d0
| cpuhp_thread_fun+0x1b0/0x624
| smpboot_thread_fn+0x5e0/0xa6c
| kthread+0x3a0/0x450
| ret_from_fork+0x10/0x20
Swap the order of the calls.
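A minimal sketch of the corrected teardown (matching the hunk below):
destroy the mutex while attr_set is still alive, then drop the
reference, which may kfree() attr_set via the kobject release() method.

	mutex_destroy(&attr_set->update_lock);  /* attr_set still valid here */
	kobject_put(&attr_set->kobj);           /* may free attr_set */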
Fixes: e5c6b312ce3c ("cpufreq: schedutil: Use kobject release() method to free sugov_tunables")
Cc: 4.7+ <stable(a)vger.kernel.org> # 4.7+
Signed-off-by: James Morse <james.morse(a)arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/cpufreq/cpufreq_governor_attr_set.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/cpufreq/cpufreq_governor_attr_set.c b/drivers/cpufreq/cpufreq_governor_attr_set.c
index 52841f807a7eb..45fdf30cade39 100644
--- a/drivers/cpufreq/cpufreq_governor_attr_set.c
+++ b/drivers/cpufreq/cpufreq_governor_attr_set.c
@@ -77,8 +77,8 @@ unsigned int gov_attr_set_put(struct gov_attr_set *attr_set, struct list_head *l
if (count)
return count;
- kobject_put(&attr_set->kobj);
mutex_destroy(&attr_set->update_lock);
+ kobject_put(&attr_set->kobj);
return 0;
}
EXPORT_SYMBOL_GPL(gov_attr_set_put);
--
2.25.1
[PATCH openEuler-1.0-LTS v3] scsi:spraid: support Ramaxel's spraid driver
by Yanling Song 22 Dec '21
Ramaxel inclusion
category: features
bugzilla: https://gitee.com/openeuler/kernel/issues/I4NJIB
CVE: NA
v2->v3:
1. Add defconfig for hulk in arm64 and x86
v1->v2:
1. Add defconfig for arm64 and x86
Support Ramaxel's SPRxxx Raid controller
Signed-off-by: Yanling Song <songyl(a)ramaxel.com>
Reviewed-by: Jiang Yu<yujiang(a)ramaxel.com>
---
arch/arm64/configs/hulk_defconfig | 1 +
arch/arm64/configs/openeuler_defconfig | 1 +
arch/x86/configs/hulk_defconfig | 3 +-
arch/x86/configs/openeuler_defconfig | 1 +
drivers/scsi/Kconfig | 1 +
drivers/scsi/Makefile | 1 +
drivers/scsi/spraid/Kconfig | 13 +
drivers/scsi/spraid/Makefile | 7 +
drivers/scsi/spraid/spraid.h | 746 +++++
drivers/scsi/spraid/spraid_main.c | 3875 ++++++++++++++++++++++++
10 files changed, 4648 insertions(+), 1 deletion(-)
create mode 100644 drivers/scsi/spraid/Kconfig
create mode 100644 drivers/scsi/spraid/Makefile
create mode 100644 drivers/scsi/spraid/spraid.h
create mode 100644 drivers/scsi/spraid/spraid_main.c
diff --git a/arch/arm64/configs/hulk_defconfig b/arch/arm64/configs/hulk_defconfig
index bc350c41b333..71517c091c43 100644
--- a/arch/arm64/configs/hulk_defconfig
+++ b/arch/arm64/configs/hulk_defconfig
@@ -2149,6 +2149,7 @@ CONFIG_SCSI_MPT2SAS_MAX_SGE=128
CONFIG_SCSI_MPT3SAS_MAX_SGE=128
CONFIG_SCSI_MPT2SAS=m
CONFIG_SCSI_SMARTPQI=m
+CONFIG_RAMAXEL_SPRAID=m
# CONFIG_SCSI_UFSHCD is not set
# CONFIG_SCSI_HPTIOP is not set
CONFIG_LIBFC=m
diff --git a/arch/arm64/configs/openeuler_defconfig b/arch/arm64/configs/openeuler_defconfig
index d2167c80757f..47b0931b91bf 100644
--- a/arch/arm64/configs/openeuler_defconfig
+++ b/arch/arm64/configs/openeuler_defconfig
@@ -2143,6 +2143,7 @@ CONFIG_SCSI_MPT2SAS_MAX_SGE=128
CONFIG_SCSI_MPT3SAS_MAX_SGE=128
CONFIG_SCSI_MPT2SAS=m
CONFIG_SCSI_SMARTPQI=m
+CONFIG_RAMAXEL_SPRAID=m
# CONFIG_SCSI_UFSHCD is not set
# CONFIG_SCSI_HPTIOP is not set
CONFIG_LIBFC=m
diff --git a/arch/x86/configs/hulk_defconfig b/arch/x86/configs/hulk_defconfig
index 983d52ccad82..536ffb38c00c 100644
--- a/arch/x86/configs/hulk_defconfig
+++ b/arch/x86/configs/hulk_defconfig
@@ -1,4 +1,4 @@
-#
+
# Automatically generated file; DO NOT EDIT.
# Linux/x86 4.19.21 Kernel Configuration
#
@@ -2177,6 +2177,7 @@ CONFIG_SCSI_MPT2SAS_MAX_SGE=128
CONFIG_SCSI_MPT3SAS_MAX_SGE=128
CONFIG_SCSI_MPT2SAS=m
CONFIG_SCSI_SMARTPQI=m
+CONFIG_RAMAXEL_SPRAID=m
# CONFIG_SCSI_UFSHCD is not set
# CONFIG_SCSI_HPTIOP is not set
# CONFIG_SCSI_BUSLOGIC is not set
diff --git a/arch/x86/configs/openeuler_defconfig b/arch/x86/configs/openeuler_defconfig
index 7043426d9780..786186288fea 100644
--- a/arch/x86/configs/openeuler_defconfig
+++ b/arch/x86/configs/openeuler_defconfig
@@ -2182,6 +2182,7 @@ CONFIG_SCSI_MPT2SAS_MAX_SGE=128
CONFIG_SCSI_MPT3SAS_MAX_SGE=128
CONFIG_SCSI_MPT2SAS=m
CONFIG_SCSI_SMARTPQI=m
+CONFIG_RAMAXEL_SPRAID=m
# CONFIG_SCSI_UFSHCD is not set
# CONFIG_SCSI_HPTIOP is not set
# CONFIG_SCSI_BUSLOGIC is not set
diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig
index 8450484184e3..63d2aaa22834 100644
--- a/drivers/scsi/Kconfig
+++ b/drivers/scsi/Kconfig
@@ -522,6 +522,7 @@ source "drivers/scsi/megaraid/Kconfig.megaraid"
source "drivers/scsi/mpt3sas/Kconfig"
source "drivers/scsi/smartpqi/Kconfig"
source "drivers/scsi/ufs/Kconfig"
+source "drivers/scsi/spraid/Kconfig"
config SCSI_HPTIOP
tristate "HighPoint RocketRAID 3xxx/4xxx Controller support"
diff --git a/drivers/scsi/Makefile b/drivers/scsi/Makefile
index 2973693f6dcc..4056cf26e09e 100644
--- a/drivers/scsi/Makefile
+++ b/drivers/scsi/Makefile
@@ -94,6 +94,7 @@ obj-$(CONFIG_SCSI_ZALON) += zalon7xx.o
obj-$(CONFIG_SCSI_DC395x) += dc395x.o
obj-$(CONFIG_SCSI_AM53C974) += esp_scsi.o am53c974.o
obj-$(CONFIG_CXLFLASH) += cxlflash/
+obj-$(CONFIG_RAMAXEL_SPRAID) += spraid/
obj-$(CONFIG_MEGARAID_LEGACY) += megaraid.o
obj-$(CONFIG_MEGARAID_NEWGEN) += megaraid/
obj-$(CONFIG_MEGARAID_SAS) += megaraid/
diff --git a/drivers/scsi/spraid/Kconfig b/drivers/scsi/spraid/Kconfig
new file mode 100644
index 000000000000..bfbba3db8db0
--- /dev/null
+++ b/drivers/scsi/spraid/Kconfig
@@ -0,0 +1,13 @@
+#
+# Ramaxel driver configuration
+#
+
+config RAMAXEL_SPRAID
+ tristate "Ramaxel spraid Adapter"
+ depends on PCI && SCSI
+ select BLK_DEV_BSGLIB
+ depends on ARM64 || X86_64
+ help
+ This driver supports the Ramaxel SPRxxx series
+ RAID controllers, which have a PCIe Gen4 host
+ interface and support SAS/SATA HDDs/SSDs.
diff --git a/drivers/scsi/spraid/Makefile b/drivers/scsi/spraid/Makefile
new file mode 100644
index 000000000000..aadc2ffd37eb
--- /dev/null
+++ b/drivers/scsi/spraid/Makefile
@@ -0,0 +1,7 @@
+#
+# Makefile for the Ramaxel device drivers.
+#
+
+obj-$(CONFIG_RAMAXEL_SPRAID) += spraid.o
+
+spraid-objs := spraid_main.o
\ No newline at end of file
diff --git a/drivers/scsi/spraid/spraid.h b/drivers/scsi/spraid/spraid.h
new file mode 100644
index 000000000000..c1e4980e18e5
--- /dev/null
+++ b/drivers/scsi/spraid/spraid.h
@@ -0,0 +1,746 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Ramaxel Memory Technology, Ltd */
+
+#ifndef __SPRAID_H_
+#define __SPRAID_H_
+
+#define SPRAID_CAP_MQES(cap) ((cap) & 0xffff)
+#define SPRAID_CAP_STRIDE(cap) (((cap) >> 32) & 0xf)
+#define SPRAID_CAP_MPSMIN(cap) (((cap) >> 48) & 0xf)
+#define SPRAID_CAP_MPSMAX(cap) (((cap) >> 52) & 0xf)
+#define SPRAID_CAP_TIMEOUT(cap) (((cap) >> 24) & 0xff)
+#define SPRAID_CAP_DMAMASK(cap) (((cap) >> 37) & 0xff)
+
+#define SPRAID_DEFAULT_MAX_CHANNEL 4
+#define SPRAID_DEFAULT_MAX_ID 240
+#define SPRAID_DEFAULT_MAX_LUN_PER_HOST 8
+#define MAX_SECTORS 2048
+
+#define IO_SQE_SIZE sizeof(struct spraid_ioq_command)
+#define ADMIN_SQE_SIZE sizeof(struct spraid_admin_command)
+#define SQE_SIZE(qid) (((qid) > 0) ? IO_SQE_SIZE : ADMIN_SQE_SIZE)
+#define CQ_SIZE(depth) ((depth) * sizeof(struct spraid_completion))
+#define SQ_SIZE(qid, depth) ((depth) * SQE_SIZE(qid))
+
+#define SENSE_SIZE(depth) ((depth) * SCSI_SENSE_BUFFERSIZE)
+
+#define SPRAID_AQ_DEPTH 128
+#define SPRAID_NR_AEN_COMMANDS 16
+#define SPRAID_AQ_BLK_MQ_DEPTH (SPRAID_AQ_DEPTH - SPRAID_NR_AEN_COMMANDS)
+#define SPRAID_AQ_MQ_TAG_DEPTH (SPRAID_AQ_BLK_MQ_DEPTH - 1)
+
+#define SPRAID_ADMIN_QUEUE_NUM 1
+#define SPRAID_PTCMDS_PERQ 1
+#define SPRAID_IO_BLK_MQ_DEPTH (hdev->shost->can_queue)
+#define SPRAID_NR_IOQ_PTCMDS (SPRAID_PTCMDS_PERQ * hdev->shost->nr_hw_queues)
+
+#define FUA_MASK 0x08
+#define SPRAID_MINORS BIT(MINORBITS)
+
+#define COMMAND_IS_WRITE(cmd) ((cmd)->common.opcode & 1)
+
+#define SPRAID_IO_IOSQES 7
+#define SPRAID_IO_IOCQES 4
+#define PRP_ENTRY_SIZE 8
+
+#define SMALL_POOL_SIZE 256
+#define MAX_SMALL_POOL_NUM 16
+#define MAX_CMD_PER_DEV 64
+#define MAX_CDB_LEN 32
+
+#define SPRAID_UP_TO_MULTY4(x) (((x) + 4) & (~0x03))
+
+#define CQE_STATUS_SUCCESS (0x0)
+
+#define PCI_VENDOR_ID_RAMAXEL_LOGIC 0x1E81
+
+#define SPRAID_SERVER_DEVICE_HBA_DID 0x2100
+#define SPRAID_SERVER_DEVICE_RAID_DID 0x2200
+
+#define IO_6_DEFAULT_TX_LEN 256
+
+#define SPRAID_INT_PAGES 2
+#define SPRAID_INT_BYTES(hdev) (SPRAID_INT_PAGES * (hdev)->page_size)
+
+enum {
+ SPRAID_REQ_CANCELLED = (1 << 0),
+ SPRAID_REQ_USERCMD = (1 << 1),
+};
+
+enum {
+ SPRAID_SC_SUCCESS = 0x0,
+ SPRAID_SC_INVALID_OPCODE = 0x1,
+ SPRAID_SC_INVALID_FIELD = 0x2,
+
+ SPRAID_SC_ABORT_LIMIT = 0x103,
+ SPRAID_SC_ABORT_MISSING = 0x104,
+ SPRAID_SC_ASYNC_LIMIT = 0x105,
+
+ SPRAID_SC_DNR = 0x4000,
+};
+
+enum {
+ SPRAID_REG_CAP = 0x0000,
+ SPRAID_REG_CC = 0x0014,
+ SPRAID_REG_CSTS = 0x001c,
+ SPRAID_REG_AQA = 0x0024,
+ SPRAID_REG_ASQ = 0x0028,
+ SPRAID_REG_ACQ = 0x0030,
+ SPRAID_REG_DBS = 0x1000,
+};
+
+enum {
+ SPRAID_CC_ENABLE = 1 << 0,
+ SPRAID_CC_CSS_NVM = 0 << 4,
+ SPRAID_CC_MPS_SHIFT = 7,
+ SPRAID_CC_AMS_SHIFT = 11,
+ SPRAID_CC_SHN_SHIFT = 14,
+ SPRAID_CC_IOSQES_SHIFT = 16,
+ SPRAID_CC_IOCQES_SHIFT = 20,
+ SPRAID_CC_AMS_RR = 0 << SPRAID_CC_AMS_SHIFT,
+ SPRAID_CC_SHN_NONE = 0 << SPRAID_CC_SHN_SHIFT,
+ SPRAID_CC_IOSQES = SPRAID_IO_IOSQES << SPRAID_CC_IOSQES_SHIFT,
+ SPRAID_CC_IOCQES = SPRAID_IO_IOCQES << SPRAID_CC_IOCQES_SHIFT,
+ SPRAID_CC_SHN_NORMAL = 1 << SPRAID_CC_SHN_SHIFT,
+ SPRAID_CC_SHN_MASK = 3 << SPRAID_CC_SHN_SHIFT,
+ SPRAID_CSTS_CFS_SHIFT = 1,
+ SPRAID_CSTS_SHST_SHIFT = 2,
+ SPRAID_CSTS_PP_SHIFT = 5,
+ SPRAID_CSTS_RDY = 1 << 0,
+ SPRAID_CSTS_SHST_CMPLT = 2 << 2,
+ SPRAID_CSTS_SHST_MASK = 3 << 2,
+ SPRAID_CSTS_CFS_MASK = 1 << SPRAID_CSTS_CFS_SHIFT,
+ SPRAID_CSTS_PP_MASK = 1 << SPRAID_CSTS_PP_SHIFT,
+};
+
+enum {
+ SPRAID_ADMIN_DELETE_SQ = 0x00,
+ SPRAID_ADMIN_CREATE_SQ = 0x01,
+ SPRAID_ADMIN_DELETE_CQ = 0x04,
+ SPRAID_ADMIN_CREATE_CQ = 0x05,
+ SPRAID_ADMIN_ABORT_CMD = 0x08,
+ SPRAID_ADMIN_SET_FEATURES = 0x09,
+ SPRAID_ADMIN_ASYNC_EVENT = 0x0c,
+ SPRAID_ADMIN_GET_INFO = 0xc6,
+ SPRAID_ADMIN_RESET = 0xc8,
+};
+
+enum {
+ SPRAID_GET_INFO_CTRL = 0,
+ SPRAID_GET_INFO_DEV_LIST = 1,
+};
+
+enum {
+ SPRAID_RESET_TARGET = 0,
+ SPRAID_RESET_BUS = 1,
+};
+
+enum {
+ SPRAID_AEN_ERROR = 0,
+ SPRAID_AEN_NOTICE = 2,
+ SPRAID_AEN_VS = 7,
+};
+
+enum {
+ SPRAID_AEN_DEV_CHANGED = 0x00,
+ SPRAID_AEN_FW_ACT_START = 0x01,
+ SPRAID_AEN_HOST_PROBING = 0x10,
+};
+
+enum {
+ SPRAID_AEN_TIMESYN = 0x00,
+ SPRAID_AEN_FW_ACT_FINISH = 0x02,
+ SPRAID_AEN_EVENT_MIN = 0x80,
+ SPRAID_AEN_EVENT_MAX = 0xff,
+};
+
+enum {
+ SPRAID_CMD_WRITE = 0x01,
+ SPRAID_CMD_READ = 0x02,
+
+ SPRAID_CMD_NONIO_NONE = 0x80,
+ SPRAID_CMD_NONIO_TODEV = 0x81,
+ SPRAID_CMD_NONIO_FROMDEV = 0x82,
+};
+
+enum {
+ SPRAID_QUEUE_PHYS_CONTIG = (1 << 0),
+ SPRAID_CQ_IRQ_ENABLED = (1 << 1),
+
+ SPRAID_FEAT_NUM_QUEUES = 0x07,
+ SPRAID_FEAT_ASYNC_EVENT = 0x0b,
+ SPRAID_FEAT_TIMESTAMP = 0x0e,
+};
+
+enum spraid_state {
+ SPRAID_NEW,
+ SPRAID_LIVE,
+ SPRAID_RESETTING,
+ SPRAID_DELETING,
+ SPRAID_DEAD,
+};
+
+enum {
+ SPRAID_CARD_HBA,
+ SPRAID_CARD_RAID,
+};
+
+enum spraid_cmd_type {
+ SPRAID_CMD_ADM,
+ SPRAID_CMD_IOPT,
+};
+
+struct spraid_completion {
+ __le32 result;
+ union {
+ struct {
+ __u8 sense_len;
+ __u8 resv[3];
+ };
+ __le32 result1;
+ };
+ __le16 sq_head;
+ __le16 sq_id;
+ __u16 cmd_id;
+ __le16 status;
+};
+
+struct spraid_ctrl_info {
+ __le32 nd;
+ __le16 max_cmds;
+ __le16 max_channel;
+ __le32 max_tgt_id;
+ __le16 max_lun;
+ __le16 max_num_sge;
+ __le16 lun_num_in_boot;
+ __u8 mdts;
+ __u8 acl;
+ __u8 aerl;
+ __u8 card_type;
+ __u16 rsvd;
+ __u32 rtd3e;
+ __u8 sn[32];
+ __u8 fr[16];
+ __u8 rsvd1[4020];
+};
+
+struct spraid_dev {
+ struct pci_dev *pdev;
+ struct device *dev;
+ struct Scsi_Host *shost;
+ struct spraid_queue *queues;
+ struct dma_pool *prp_page_pool;
+ struct dma_pool *prp_small_pool[MAX_SMALL_POOL_NUM];
+ mempool_t *iod_mempool;
+ void __iomem *bar;
+ u32 max_qid;
+ u32 num_vecs;
+ u32 queue_count;
+ u32 ioq_depth;
+ int db_stride;
+ u32 __iomem *dbs;
+ struct rw_semaphore devices_rwsem;
+ int numa_node;
+ u32 page_size;
+ u32 ctrl_config;
+ u32 online_queues;
+ u64 cap;
+ int instance;
+ struct spraid_ctrl_info *ctrl_info;
+ struct spraid_dev_info *devices;
+
+ struct spraid_cmd *adm_cmds;
+ struct list_head adm_cmd_list;
+ spinlock_t adm_cmd_lock;
+
+ struct spraid_cmd *ioq_ptcmds;
+ struct list_head ioq_pt_list;
+ spinlock_t ioq_pt_lock;
+
+ struct work_struct scan_work;
+ struct work_struct timesyn_work;
+ struct work_struct reset_work;
+ struct work_struct fw_act_work;
+
+ enum spraid_state state;
+ spinlock_t state_lock;
+
+ struct request_queue *bsg_queue;
+};
+
+struct spraid_sgl_desc {
+ __le64 addr;
+ __le32 length;
+ __u8 rsvd[3];
+ __u8 type;
+};
+
+union spraid_data_ptr {
+ struct {
+ __le64 prp1;
+ __le64 prp2;
+ };
+ struct spraid_sgl_desc sgl;
+};
+
+struct spraid_admin_common_command {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ __le32 cdw2[4];
+ union spraid_data_ptr dptr;
+ __le32 cdw10;
+ __le32 cdw11;
+ __le32 cdw12;
+ __le32 cdw13;
+ __le32 cdw14;
+ __le32 cdw15;
+};
+
+struct spraid_features {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ __u64 rsvd2[2];
+ union spraid_data_ptr dptr;
+ __le32 fid;
+ __le32 dword11;
+ __le32 dword12;
+ __le32 dword13;
+ __le32 dword14;
+ __le32 dword15;
+};
+
+struct spraid_create_cq {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __u32 rsvd1[5];
+ __le64 prp1;
+ __u64 rsvd8;
+ __le16 cqid;
+ __le16 qsize;
+ __le16 cq_flags;
+ __le16 irq_vector;
+ __u32 rsvd12[4];
+};
+
+struct spraid_create_sq {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __u32 rsvd1[5];
+ __le64 prp1;
+ __u64 rsvd8;
+ __le16 sqid;
+ __le16 qsize;
+ __le16 sq_flags;
+ __le16 cqid;
+ __u32 rsvd12[4];
+};
+
+struct spraid_delete_queue {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __u32 rsvd1[9];
+ __le16 qid;
+ __u16 rsvd10;
+ __u32 rsvd11[5];
+};
+
+struct spraid_get_info {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ __u32 rsvd2[4];
+ union spraid_data_ptr dptr;
+ __u8 type;
+ __u8 rsvd10[3];
+ __le32 cdw11;
+ __u32 rsvd12[4];
+};
+
+struct spraid_usr_cmd {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ union {
+ struct {
+ __le16 subopcode;
+ __le16 rsvd1;
+ } info_0;
+ __le32 cdw2;
+ };
+ union {
+ struct {
+ __le16 data_len;
+ __le16 param_len;
+ } info_1;
+ __le32 cdw3;
+ };
+ __u64 metadata;
+ union spraid_data_ptr dptr;
+ __le32 cdw10;
+ __le32 cdw11;
+ __le32 cdw12;
+ __le32 cdw13;
+ __le32 cdw14;
+ __le32 cdw15;
+};
+
+enum {
+ SPRAID_CMD_FLAG_SGL_METABUF = (1 << 6),
+ SPRAID_CMD_FLAG_SGL_METASEG = (1 << 7),
+ SPRAID_CMD_FLAG_SGL_ALL = SPRAID_CMD_FLAG_SGL_METABUF |
+ SPRAID_CMD_FLAG_SGL_METASEG,
+};
+
+enum spraid_cmd_state {
+ SPRAID_CMD_IDLE = 0,
+ SPRAID_CMD_IN_FLIGHT = 1,
+ SPRAID_CMD_COMPLETE = 2,
+ SPRAID_CMD_TIMEOUT = 3,
+ SPRAID_CMD_TMO_COMPLETE = 4,
+};
+
+struct spraid_abort_cmd {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ __u64 rsvd2[4];
+ __le16 sqid;
+ __le16 cid;
+ __u32 rsvd11[5];
+};
+
+struct spraid_reset_cmd {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ __u64 rsvd2[4];
+ __u8 type;
+ __u8 rsvd10[3];
+ __u32 rsvd11[5];
+};
+
+struct spraid_admin_command {
+ union {
+ struct spraid_admin_common_command common;
+ struct spraid_features features;
+ struct spraid_create_cq create_cq;
+ struct spraid_create_sq create_sq;
+ struct spraid_delete_queue delete_queue;
+ struct spraid_get_info get_info;
+ struct spraid_abort_cmd abort;
+ struct spraid_reset_cmd reset;
+ struct spraid_usr_cmd usr_cmd;
+ };
+};
+
+struct spraid_ioq_common_command {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ __le16 sense_len;
+ __u8 cdb_len;
+ __u8 rsvd2;
+ __le32 cdw3[3];
+ union spraid_data_ptr dptr;
+ __le32 cdw10[6];
+ __u8 cdb[32];
+ __le64 sense_addr;
+ __le32 cdw26[6];
+};
+
+struct spraid_rw_command {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ __le16 sense_len;
+ __u8 cdb_len;
+ __u8 rsvd2;
+ __u32 rsvd3[3];
+ union spraid_data_ptr dptr;
+ __le64 slba;
+ __le16 nlb;
+ __le16 control;
+ __u32 rsvd13[3];
+ __u8 cdb[32];
+ __le64 sense_addr;
+ __u32 rsvd26[6];
+};
+
+struct spraid_scsi_nonio {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ __le16 sense_len;
+ __u8 cdb_length;
+ __u8 rsvd2;
+ __u32 rsvd3[3];
+ union spraid_data_ptr dptr;
+ __u32 rsvd10[5];
+ __le32 buffer_len;
+ __u8 cdb[32];
+ __le64 sense_addr;
+ __u32 rsvd26[6];
+};
+
+struct spraid_ioq_command {
+ union {
+ struct spraid_ioq_common_command common;
+ struct spraid_rw_command rw;
+ struct spraid_scsi_nonio scsi_nonio;
+ };
+};
+
+struct spraid_passthru_common_cmd {
+ __u8 opcode;
+ __u8 flags;
+ __u16 rsvd0;
+ __u32 nsid;
+ union {
+ struct {
+ __u16 subopcode;
+ __u16 rsvd1;
+ } info_0;
+ __u32 cdw2;
+ };
+ union {
+ struct {
+ __u16 data_len;
+ __u16 param_len;
+ } info_1;
+ __u32 cdw3;
+ };
+ __u64 metadata;
+
+ __u64 addr;
+ __u64 prp2;
+
+ __u32 cdw10;
+ __u32 cdw11;
+ __u32 cdw12;
+ __u32 cdw13;
+ __u32 cdw14;
+ __u32 cdw15;
+ __u32 timeout_ms;
+ __u32 result0;
+ __u32 result1;
+};
+
+struct spraid_ioq_passthru_cmd {
+ __u8 opcode;
+ __u8 flags;
+ __u16 rsvd0;
+ __u32 nsid;
+ union {
+ struct {
+ __u16 res_sense_len;
+ __u8 cdb_len;
+ __u8 rsvd0;
+ } info_0;
+ __u32 cdw2;
+ };
+ union {
+ struct {
+ __u16 subopcode;
+ __u16 rsvd1;
+ } info_1;
+ __u32 cdw3;
+ };
+ union {
+ struct {
+ __u16 rsvd;
+ __u16 param_len;
+ } info_2;
+ __u32 cdw4;
+ };
+ __u32 cdw5;
+ __u64 addr;
+ __u64 prp2;
+ union {
+ struct {
+ __u16 eid;
+ __u16 sid;
+ } info_3;
+ __u32 cdw10;
+ };
+ union {
+ struct {
+ __u16 did;
+ __u8 did_flag;
+ __u8 rsvd2;
+ } info_4;
+ __u32 cdw11;
+ };
+ __u32 cdw12;
+ __u32 cdw13;
+ __u32 cdw14;
+ __u32 data_len;
+ __u32 cdw16;
+ __u32 cdw17;
+ __u32 cdw18;
+ __u32 cdw19;
+ __u32 cdw20;
+ __u32 cdw21;
+ __u32 cdw22;
+ __u32 cdw23;
+ __u64 sense_addr;
+ __u32 cdw26[4];
+ __u32 timeout_ms;
+ __u32 result0;
+ __u32 result1;
+};
+
+struct spraid_bsg_request {
+ u32 msgcode;
+ u32 control;
+ union {
+ struct spraid_passthru_common_cmd admcmd;
+ struct spraid_ioq_passthru_cmd ioqcmd;
+ };
+};
+
+enum {
+ SPRAID_BSG_ADM,
+ SPRAID_BSG_IOQ,
+};
+
+struct spraid_cmd {
+ int qid;
+ int cid;
+ u32 result0;
+ u32 result1;
+ u16 status;
+ void *priv;
+ enum spraid_cmd_state state;
+ struct completion cmd_done;
+ struct list_head list;
+};
+
+struct spraid_queue {
+ struct spraid_dev *hdev;
+ spinlock_t sq_lock;
+
+ spinlock_t cq_lock ____cacheline_aligned_in_smp;
+
+ void *sq_cmds;
+
+ struct spraid_completion *cqes;
+
+ dma_addr_t sq_dma_addr;
+ dma_addr_t cq_dma_addr;
+ u32 __iomem *q_db;
+ u8 cq_phase;
+ u8 sqes;
+ u16 qid;
+ u16 sq_tail;
+ u16 cq_head;
+ u16 last_cq_head;
+ u16 q_depth;
+ s16 cq_vector;
+ void *sense;
+ dma_addr_t sense_dma_addr;
+ struct dma_pool *prp_small_pool;
+};
+
+struct spraid_iod {
+ struct spraid_queue *spraidq;
+ enum spraid_cmd_state state;
+ int npages;
+ u32 nsge;
+ u32 length;
+ bool use_sgl;
+ bool sg_drv_mgmt;
+ dma_addr_t first_dma;
+ void *sense;
+ dma_addr_t sense_dma;
+ struct scatterlist *sg;
+ struct scatterlist inline_sg[0];
+};
+
+#define SPRAID_DEV_INFO_ATTR_BOOT(attr) ((attr) & 0x01)
+#define SPRAID_DEV_INFO_ATTR_VD(attr) (((attr) & 0x02) == 0x0)
+#define SPRAID_DEV_INFO_ATTR_PT(attr) (((attr) & 0x22) == 0x02)
+#define SPRAID_DEV_INFO_ATTR_RAWDISK(attr) ((attr) & 0x20)
+
+#define SPRAID_DEV_INFO_FLAG_VALID(flag) ((flag) & 0x01)
+#define SPRAID_DEV_INFO_FLAG_CHANGE(flag) ((flag) & 0x02)
+
+#define BGTASK_TYPE_REBUILD 4
+#define USR_CMD_READ 0xc2
+#define USR_CMD_RDLEN 0x1000
+#define USR_CMD_VDINFO 0x704
+#define USR_CMD_BGTASK 0x504
+#define VDINFO_PARAM_LEN 0x04
+
+struct spraid_vd_info {
+ __u8 name[32];
+ __le16 id;
+ __u8 rg_id;
+ __u8 rg_level;
+ __u8 sg_num;
+ __u8 sg_disk_num;
+ __u8 vd_status;
+ __u8 vd_type;
+ __u8 rsvd1[4056];
+};
+
+#define MAX_REALTIME_BGTASK_NUM 32
+
+struct bgtask_info {
+ __u8 type;
+ __u8 progress;
+ __u8 rate;
+ __u8 rsvd0;
+ __le16 vd_id;
+ __le16 time_left;
+ __u8 rsvd1[4];
+};
+
+struct spraid_bgtask {
+ __u8 sw;
+ __u8 task_num;
+ __u8 rsvd[6];
+ struct bgtask_info bgtask[MAX_REALTIME_BGTASK_NUM];
+};
+
+struct spraid_dev_info {
+ __le32 hdid;
+ __le16 target;
+ __u8 channel;
+ __u8 lun;
+ __u8 attr;
+ __u8 flag;
+ __le16 max_io_kb;
+};
+
+#define MAX_DEV_ENTRY_PER_PAGE_4K 340
+struct spraid_dev_list {
+ __le32 dev_num;
+ __u32 rsvd0[3];
+ struct spraid_dev_info devices[MAX_DEV_ENTRY_PER_PAGE_4K];
+};
+
+struct spraid_sdev_hostdata {
+ u32 hdid;
+ u16 max_io_kb;
+ u8 attr;
+ u8 flag;
+ u8 rg_id;
+ u8 rsvd[3];
+};
+
+#endif
+
diff --git a/drivers/scsi/spraid/spraid_main.c b/drivers/scsi/spraid/spraid_main.c
new file mode 100644
index 000000000000..519b39f44e91
--- /dev/null
+++ b/drivers/scsi/spraid/spraid_main.c
@@ -0,0 +1,3875 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Ramaxel Memory Technology, Ltd */
+
+/* Ramaxel Raid SPXXX Series Linux Driver */
+
+#define pr_fmt(fmt) "spraid: " fmt
+
+#include <linux/sched/signal.h>
+#include <linux/version.h>
+#include <linux/pci.h>
+#include <linux/aer.h>
+#include <linux/module.h>
+#include <linux/ioport.h>
+#include <linux/device.h>
+#include <linux/delay.h>
+#include <linux/interrupt.h>
+#include <linux/cdev.h>
+#include <linux/sysfs.h>
+#include <linux/gfp.h>
+#include <linux/types.h>
+#include <linux/ratelimit.h>
+#include <linux/once.h>
+#include <linux/debugfs.h>
+#include <linux/io-64-nonatomic-lo-hi.h>
+#include <linux/blkdev.h>
+#include <linux/bsg-lib.h>
+#include <asm/unaligned.h>
+#include <linux/sort.h>
+#include <target/target_core_backend.h>
+
+#include <scsi/scsi.h>
+#include <scsi/scsi_cmnd.h>
+#include <scsi/scsi_device.h>
+#include <scsi/scsi_host.h>
+#include <scsi/scsi_transport.h>
+#include <scsi/scsi_dbg.h>
+
+
+#include "spraid.h"
+
+static u32 admin_tmout = 60;
+module_param(admin_tmout, uint, 0644);
+MODULE_PARM_DESC(admin_tmout, "admin commands timeout (seconds)");
+
+static u32 scmd_tmout_nonpt = 180;
+module_param(scmd_tmout_nonpt, uint, 0644);
+MODULE_PARM_DESC(scmd_tmout_nonpt,
+ "scsi commands timeout for rawdisk&raid(seconds)");
+
+static int ioq_depth_set(const char *val, const struct kernel_param *kp);
+static const struct kernel_param_ops ioq_depth_ops = {
+ .set = ioq_depth_set,
+ .get = param_get_uint,
+};
+
+static u32 io_queue_depth = 1024;
+module_param_cb(io_queue_depth, &ioq_depth_ops, &io_queue_depth, 0644);
+MODULE_PARM_DESC(io_queue_depth, "set io queue depth, should >= 2");
+
+static int log_debug_switch_set(const char *val, const struct kernel_param *kp)
+{
+ u8 n = 0;
+ int ret;
+
+ ret = kstrtou8(val, 10, &n);
+ if (ret != 0)
+ return -EINVAL;
+
+ return param_set_byte(val, kp);
+}
+
+static const struct kernel_param_ops log_debug_switch_ops = {
+ .set = log_debug_switch_set,
+ .get = param_get_byte,
+};
+
+static unsigned char log_debug_switch;
+module_param_cb(log_debug_switch, &log_debug_switch_ops,
+ &log_debug_switch, 0644);
+MODULE_PARM_DESC(log_debug_switch,
+ "set log state, default non-zero for switch on");
+
+static int small_pool_num_set(const char *val, const struct kernel_param *kp)
+{
+ u8 n = 0;
+ int ret;
+
+ ret = kstrtou8(val, 10, &n);
+ if (ret != 0)
+ return -EINVAL;
+ if (n > MAX_SMALL_POOL_NUM)
+ n = MAX_SMALL_POOL_NUM;
+ if (n < 1)
+ n = 1;
+ *((u8 *)kp->arg) = n;
+
+ return 0;
+}
+
+static const struct kernel_param_ops small_pool_num_ops = {
+ .set = small_pool_num_set,
+ .get = param_get_byte,
+};
+
+/* It was found that the spinlock of a single pool is heavily
+ * contended by multiple CPUs, so multiple pools are introduced
+ * to reduce the contention.
+ */
+static unsigned char small_pool_num = 4;
+module_param_cb(small_pool_num, &small_pool_num_ops, &small_pool_num, 0644);
+MODULE_PARM_DESC(small_pool_num, "set prp small pool num, default 4, MAX 16");
+
+static void spraid_free_queue(struct spraid_queue *spraidq);
+static void spraid_handle_aen_notice(struct spraid_dev *hdev, u32 result);
+static void spraid_handle_aen_vs(struct spraid_dev *hdev,
+ u32 result, u32 result1);
+
+static DEFINE_IDA(spraid_instance_ida);
+
+static struct class *spraid_class;
+
+#define SPRAID_CAP_TIMEOUT_UNIT_MS (HZ / 2)
+
+static struct workqueue_struct *spraid_wq;
+
+#define dev_log_dbg(dev, fmt, ...) do { \
+ if (unlikely(log_debug_switch)) \
+ dev_info(dev, "[%s] [%d] " fmt, \
+ __func__, __LINE__, ##__VA_ARGS__); \
+} while (0)
+
+#define SPRAID_DRV_VERSION "1.0.0.0"
+
+#define ADMIN_TIMEOUT (admin_tmout * HZ)
+#define ADMIN_ERR_TIMEOUT 32757
+
+#define SPRAID_WAIT_ABNL_CMD_TIMEOUT (3 * 2)
+
+#define SPRAID_DMA_MSK_BIT_MAX 64
+
+enum FW_STAT_CODE {
+ FW_STAT_OK = 0,
+ FW_STAT_NEED_CHECK,
+ FW_STAT_ERROR,
+ FW_STAT_EP_PCIE_ERROR,
+ FW_STAT_NAC_DMA_ERROR,
+ FW_STAT_ABORTED,
+ FW_STAT_NEED_RETRY
+};
+
+static const char * const raid_levels[] = {"0", "1", "5", "6", "10", "50", "60",
+ "NA"};
+
+static const char * const raid_states[] = {
+ "NA", "NORMAL", "FAULT", "DEGRADE", "NOT_FORMATTED",
+ "FORMATTING", "SANITIZING", "INITIALIZING", "INITIALIZE_FAIL",
+ "DELETING", "DELETE_FAIL", "WRITE_PROTECT"
+};
+
+static int ioq_depth_set(const char *val, const struct kernel_param *kp)
+{
+ int n = 0;
+ int ret;
+
+ ret = kstrtoint(val, 10, &n);
+ if (ret != 0 || n < 2)
+ return -EINVAL;
+
+ return param_set_int(val, kp);
+}
+
+static int spraid_remap_bar(struct spraid_dev *hdev, u32 size)
+{
+ struct pci_dev *pdev = hdev->pdev;
+
+ if (size > pci_resource_len(pdev, 0)) {
+ dev_err(hdev->dev, "Input size[%u] exceed bar0 length[%llu]\n",
+ size, pci_resource_len(pdev, 0));
+ return -ENOMEM;
+ }
+
+ if (hdev->bar)
+ iounmap(hdev->bar);
+
+ hdev->bar = ioremap(pci_resource_start(pdev, 0), size);
+ if (!hdev->bar) {
+ dev_err(hdev->dev, "ioremap for bar0 failed\n");
+ return -ENOMEM;
+ }
+ hdev->dbs = hdev->bar + SPRAID_REG_DBS;
+
+ return 0;
+}
+
+static int spraid_dev_map(struct spraid_dev *hdev)
+{
+ struct pci_dev *pdev = hdev->pdev;
+ int ret;
+
+ ret = pci_request_mem_regions(pdev, "spraid");
+ if (ret) {
+ dev_err(hdev->dev, "fail to request memory regions\n");
+ return ret;
+ }
+
+ ret = spraid_remap_bar(hdev, SPRAID_REG_DBS + 4096);
+ if (ret) {
+ pci_release_mem_regions(pdev);
+ return ret;
+ }
+
+ return 0;
+}
+
+static void spraid_dev_unmap(struct spraid_dev *hdev)
+{
+ struct pci_dev *pdev = hdev->pdev;
+
+ if (hdev->bar) {
+ iounmap(hdev->bar);
+ hdev->bar = NULL;
+ }
+ pci_release_mem_regions(pdev);
+}
+
+static int spraid_pci_enable(struct spraid_dev *hdev)
+{
+ struct pci_dev *pdev = hdev->pdev;
+ int ret = -ENOMEM;
+ u64 maskbit = SPRAID_DMA_MSK_BIT_MAX;
+
+ if (pci_enable_device_mem(pdev)) {
+ dev_err(hdev->dev,
+ "Enable pci device memory resources failed\n");
+ return ret;
+ }
+ pci_set_master(pdev);
+
+ if (readl(hdev->bar + SPRAID_REG_CSTS) == U32_MAX) {
+ ret = -ENODEV;
+ dev_err(hdev->dev, "Read csts register failed\n");
+ goto disable;
+ }
+
+ ret = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_ALL_TYPES);
+ if (ret < 0) {
+ dev_err(hdev->dev,
+ "Allocate one IRQ for setup admin channel failed\n");
+ goto disable;
+ }
+
+ hdev->cap = lo_hi_readq(hdev->bar + SPRAID_REG_CAP);
+ hdev->ioq_depth = min_t(u32, SPRAID_CAP_MQES(hdev->cap) + 1,
+ io_queue_depth);
+ hdev->db_stride = 1 << SPRAID_CAP_STRIDE(hdev->cap);
+
+ maskbit = SPRAID_CAP_DMAMASK(hdev->cap);
+ if (maskbit < 32 || maskbit > SPRAID_DMA_MSK_BIT_MAX) {
+ dev_err(hdev->dev,
+ "err, dma mask invalid[%llu], set to default\n",
+ maskbit);
+ maskbit = SPRAID_DMA_MSK_BIT_MAX;
+ }
+ if (dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(maskbit))) {
+ dev_err(hdev->dev, "set dma mask and coherent failed\n");
+ goto disable;
+ }
+
+ dev_info(hdev->dev, "set dma mask[%llu] success\n", maskbit);
+
+ pci_enable_pcie_error_reporting(pdev);
+ pci_save_state(pdev);
+
+ return 0;
+
+disable:
+ pci_disable_device(pdev);
+ return ret;
+}
+
+static int spraid_npages_prp(u32 size, struct spraid_dev *hdev)
+{
+ u32 nprps = DIV_ROUND_UP(size + hdev->page_size, hdev->page_size);
+
+ return DIV_ROUND_UP(PRP_ENTRY_SIZE * nprps, PAGE_SIZE - PRP_ENTRY_SIZE);
+}
+
+static int spraid_npages_sgl(u32 nseg)
+{
+ return DIV_ROUND_UP(nseg * sizeof(struct spraid_sgl_desc), PAGE_SIZE);
+}
+
+static void **spraid_iod_list(struct spraid_iod *iod)
+{
+ return (void **)(iod->inline_sg + (iod->sg_drv_mgmt ? iod->nsge : 0));
+}
+
+static u32 spraid_iod_ext_size(struct spraid_dev *hdev, u32 size, u32 nsge,
+ bool sg_drv_mgmt, bool use_sgl)
+{
+ size_t alloc_size, sg_size;
+
+ if (use_sgl)
+ alloc_size = sizeof(__le64 *) * spraid_npages_sgl(nsge);
+ else
+ alloc_size = sizeof(__le64 *) * spraid_npages_prp(size, hdev);
+
+ sg_size = sg_drv_mgmt ? (sizeof(struct scatterlist) * nsge) : 0;
+ return sg_size + alloc_size;
+}
+
+static u32 spraid_cmd_size(struct spraid_dev *hdev,
+ bool sg_drv_mgmt, bool use_sgl)
+{
+ u32 alloc_size = spraid_iod_ext_size(hdev, SPRAID_INT_BYTES(hdev),
+ SPRAID_INT_PAGES, sg_drv_mgmt, use_sgl);
+
+ dev_info(hdev->dev, "sg_drv_mgmt: %s, use_sgl: %s, iod size: %lu;"
+ " alloc_size: %u\n", sg_drv_mgmt ? "true" : "false",
+ use_sgl ? "true" : "false",
+ sizeof(struct spraid_iod), alloc_size);
+
+ return sizeof(struct spraid_iod) + alloc_size;
+}
+
+static int spraid_setup_prps(struct spraid_dev *hdev, struct spraid_iod *iod)
+{
+ struct scatterlist *sg = iod->sg;
+ u64 dma_addr = sg_dma_address(sg);
+ int dma_len = sg_dma_len(sg);
+ __le64 *prp_list, *old_prp_list;
+ u32 page_size = hdev->page_size;
+ int offset = dma_addr & (page_size - 1);
+ void **list = spraid_iod_list(iod);
+ int length = iod->length;
+ struct dma_pool *pool;
+ dma_addr_t prp_dma;
+ int nprps, i;
+
+ length -= (page_size - offset);
+ if (length <= 0) {
+ iod->first_dma = 0;
+ return 0;
+ }
+
+ dma_len -= (page_size - offset);
+ if (dma_len) {
+ dma_addr += (page_size - offset);
+ } else {
+ sg = sg_next(sg);
+ dma_addr = sg_dma_address(sg);
+ dma_len = sg_dma_len(sg);
+ }
+
+ if (length <= page_size) {
+ iod->first_dma = dma_addr;
+ return 0;
+ }
+
+ nprps = DIV_ROUND_UP(length, page_size);
+ if (nprps <= (SMALL_POOL_SIZE / PRP_ENTRY_SIZE)) {
+ pool = iod->spraidq->prp_small_pool;
+ iod->npages = 0;
+ } else {
+ pool = hdev->prp_page_pool;
+ iod->npages = 1;
+ }
+
+ prp_list = dma_pool_alloc(pool, GFP_ATOMIC, &prp_dma);
+ if (!prp_list) {
+ dev_err_ratelimited(hdev->dev,
+ "Allocate first prp_list memory failed\n");
+ iod->first_dma = dma_addr;
+ iod->npages = -1;
+ return -ENOMEM;
+ }
+ list[0] = prp_list;
+ iod->first_dma = prp_dma;
+ i = 0;
+ for (;;) {
+ if (i == page_size / PRP_ENTRY_SIZE) {
+ old_prp_list = prp_list;
+
+ prp_list = dma_pool_alloc(pool, GFP_ATOMIC, &prp_dma);
+ if (!prp_list) {
+ dev_err_ratelimited(hdev->dev, "Allocate %dth"
+ " prp_list memory failed\n",
+ iod->npages + 1);
+ return -ENOMEM;
+ }
+ list[iod->npages++] = prp_list;
+ prp_list[0] = old_prp_list[i - 1];
+ old_prp_list[i - 1] = cpu_to_le64(prp_dma);
+ i = 1;
+ }
+ prp_list[i++] = cpu_to_le64(dma_addr);
+ dma_len -= page_size;
+ dma_addr += page_size;
+ length -= page_size;
+ if (length <= 0)
+ break;
+ if (dma_len > 0)
+ continue;
+ if (unlikely(dma_len < 0))
+ goto bad_sgl;
+ sg = sg_next(sg);
+ dma_addr = sg_dma_address(sg);
+ dma_len = sg_dma_len(sg);
+ }
+
+ return 0;
+
+bad_sgl:
+ dev_err(hdev->dev,
+ "Setup prps, invalid SGL for payload: %d nents: %d\n",
+ iod->length, iod->nsge);
+ return -EIO;
+}
+
+#define SGES_PER_PAGE (PAGE_SIZE / sizeof(struct spraid_sgl_desc))
+
+static void spraid_submit_cmd(struct spraid_queue *spraidq, const void *cmd)
+{
+ u32 sqes = SQE_SIZE(spraidq->qid);
+ unsigned long flags;
+ struct spraid_admin_common_command *acd =
+ (struct spraid_admin_common_command *)cmd;
+
+ spin_lock_irqsave(&spraidq->sq_lock, flags);
+ memcpy((spraidq->sq_cmds + sqes * spraidq->sq_tail), cmd, sqes);
+ if (++spraidq->sq_tail == spraidq->q_depth)
+ spraidq->sq_tail = 0;
+
+ writel(spraidq->sq_tail, spraidq->q_db);
+ spin_unlock_irqrestore(&spraidq->sq_lock, flags);
+
+ dev_log_dbg(spraidq->hdev->dev,
+ "cid[%d] qid[%d], opcode[0x%x], flags[0x%x], hdid[%u]\n",
+ acd->command_id, spraidq->qid, acd->opcode,
+ acd->flags, le32_to_cpu(acd->hdid));
+}
+
+static u32 spraid_mod64(u64 dividend, u32 divisor)
+{
+ u64 d;
+ u32 remainder;
+
+ if (!divisor)
+ pr_err("DIVISOR is zero, in div fn\n");
+
+ d = dividend;
+ remainder = do_div(d, divisor);
+ return remainder;
+}
+
+static inline bool spraid_is_rw_scmd(struct scsi_cmnd *scmd)
+{
+ switch (scmd->cmnd[0]) {
+ case READ_6:
+ case READ_10:
+ case READ_12:
+ case READ_16:
+ case READ_32:
+ case WRITE_6:
+ case WRITE_10:
+ case WRITE_12:
+ case WRITE_16:
+ case WRITE_32:
+ return true;
+ default:
+ return false;
+ }
+}
+
+static bool spraid_is_prp(struct spraid_dev *hdev,
+ struct scsi_cmnd *scmd, u32 nsge)
+{
+ struct scatterlist *sg = scsi_sglist(scmd);
+ u32 page_size = hdev->page_size;
+ bool is_prp = true;
+ int i = 0;
+
+ scsi_for_each_sg(scmd, sg, nsge, i) {
+ if (i != 0 && i != nsge - 1) {
+ if (spraid_mod64(sg_dma_len(sg), page_size) ||
+ spraid_mod64(sg_dma_address(sg), page_size)) {
+ is_prp = false;
+ break;
+ }
+ }
+
+ if (nsge > 1 && i == 0) {
+ if ((spraid_mod64((sg_dma_address(sg) + sg_dma_len(sg)),
+ page_size))) {
+ is_prp = false;
+ break;
+ }
+ }
+
+ if (nsge > 1 && i == (nsge - 1)) {
+ if (spraid_mod64(sg_dma_address(sg), page_size)) {
+ is_prp = false;
+ break;
+ }
+ }
+ }
+
+ return is_prp;
+}
+
+enum {
+ SPRAID_SGL_FMT_DATA_DESC = 0x00,
+ SPRAID_SGL_FMT_SEG_DESC = 0x02,
+ SPRAID_SGL_FMT_LAST_SEG_DESC = 0x03,
+ SPRAID_KEY_SGL_FMT_DATA_DESC = 0x04,
+ SPRAID_TRANSPORT_SGL_DATA_DESC = 0x05
+};
+
+static void spraid_sgl_set_data(struct spraid_sgl_desc *sge,
+ struct scatterlist *sg)
+{
+ sge->addr = cpu_to_le64(sg_dma_address(sg));
+ sge->length = cpu_to_le32(sg_dma_len(sg));
+ sge->type = SPRAID_SGL_FMT_DATA_DESC << 4;
+}
+
+static void spraid_sgl_set_seg(struct spraid_sgl_desc *sge,
+ dma_addr_t dma_addr, int entries)
+{
+ sge->addr = cpu_to_le64(dma_addr);
+ if (entries <= SGES_PER_PAGE) {
+ sge->length = cpu_to_le32(entries * sizeof(*sge));
+ sge->type = SPRAID_SGL_FMT_LAST_SEG_DESC << 4;
+ } else {
+ sge->length = cpu_to_le32(PAGE_SIZE);
+ sge->type = SPRAID_SGL_FMT_SEG_DESC << 4;
+ }
+}
+
+static int spraid_setup_ioq_cmd_sgl(struct spraid_dev *hdev,
+ struct scsi_cmnd *scmd,
+ struct spraid_ioq_command *ioq_cmd,
+ struct spraid_iod *iod)
+{
+ struct spraid_sgl_desc *sg_list, *link, *old_sg_list;
+ struct scatterlist *sg = scsi_sglist(scmd);
+ void **list = spraid_iod_list(iod);
+ struct dma_pool *pool;
+ int nsge = iod->nsge;
+ dma_addr_t sgl_dma;
+ int i = 0;
+
+ ioq_cmd->common.flags |= SPRAID_CMD_FLAG_SGL_METABUF;
+
+ if (nsge == 1) {
+ spraid_sgl_set_data(&ioq_cmd->common.dptr.sgl, sg);
+ return 0;
+ }
+
+ if (nsge <= (SMALL_POOL_SIZE / sizeof(struct spraid_sgl_desc))) {
+ pool = iod->spraidq->prp_small_pool;
+ iod->npages = 0;
+ } else {
+ pool = hdev->prp_page_pool;
+ iod->npages = 1;
+ }
+
+ sg_list = dma_pool_alloc(pool, GFP_ATOMIC, &sgl_dma);
+ if (!sg_list) {
+ dev_err_ratelimited(hdev->dev,
+ "Allocate first sgl_list failed\n");
+ iod->npages = -1;
+ return -ENOMEM;
+ }
+
+ list[0] = sg_list;
+ iod->first_dma = sgl_dma;
+ spraid_sgl_set_seg(&ioq_cmd->common.dptr.sgl, sgl_dma, nsge);
+ do {
+ if (i == SGES_PER_PAGE) {
+ old_sg_list = sg_list;
+ link = &old_sg_list[SGES_PER_PAGE - 1];
+
+ sg_list = dma_pool_alloc(pool, GFP_ATOMIC, &sgl_dma);
+ if (!sg_list) {
+ dev_err_ratelimited(hdev->dev,
+ "Allocate %dth sgl_list;"
+ " failed\n",
+ iod->npages + 1);
+ return -ENOMEM;
+ }
+ list[iod->npages++] = sg_list;
+
+ i = 0;
+ memcpy(&sg_list[i++], link, sizeof(*link));
+ spraid_sgl_set_seg(link, sgl_dma, nsge);
+ }
+
+ spraid_sgl_set_data(&sg_list[i++], sg);
+ sg = sg_next(sg);
+ } while (--nsge > 0);
+
+ return 0;
+}
+
+#define SPRAID_RW_FUA BIT(14)
+
+static void spraid_setup_rw_cmd(struct spraid_dev *hdev,
+ struct spraid_rw_command *rw,
+ struct scsi_cmnd *scmd)
+{
+ u32 start_lba_lo, start_lba_hi;
+ u32 datalength = 0;
+ u16 control = 0;
+
+ start_lba_lo = 0;
+ start_lba_hi = 0;
+
+ if (scmd->sc_data_direction == DMA_TO_DEVICE) {
+ rw->opcode = SPRAID_CMD_WRITE;
+ } else if (scmd->sc_data_direction == DMA_FROM_DEVICE) {
+ rw->opcode = SPRAID_CMD_READ;
+ } else {
+ dev_err(hdev->dev,
+ "Invalid IO for unsupported data direction: %d\n",
+ scmd->sc_data_direction);
+ WARN_ON(1);
+ }
+
+ /* 6-byte READ(0x08) or WRITE(0x0A) cdb */
+ if (scmd->cmd_len == 6) {
+ datalength = (u32)(scmd->cmnd[4] == 0 ?
+ IO_6_DEFAULT_TX_LEN : scmd->cmnd[4]);
+ start_lba_lo = (u32)get_unaligned_be24(&scmd->cmnd[1]);
+
+ start_lba_lo &= 0x1FFFFF;
+ }
+
+ /* 10-byte READ(0x28) or WRITE(0x2A) cdb */
+ else if (scmd->cmd_len == 10) {
+ datalength = (u32)get_unaligned_be16(&scmd->cmnd[7]);
+ start_lba_lo = get_unaligned_be32(&scmd->cmnd[2]);
+
+ if (scmd->cmnd[1] & FUA_MASK)
+ control |= SPRAID_RW_FUA;
+ }
+
+ /* 12-byte READ(0xA8) or WRITE(0xAA) cdb */
+ else if (scmd->cmd_len == 12) {
+ datalength = get_unaligned_be32(&scmd->cmnd[6]);
+ start_lba_lo = get_unaligned_be32(&scmd->cmnd[2]);
+
+ if (scmd->cmnd[1] & FUA_MASK)
+ control |= SPRAID_RW_FUA;
+ }
+ /* 16-byte READ(0x88) or WRITE(0x8A) cdb */
+ else if (scmd->cmd_len == 16) {
+ datalength = get_unaligned_be32(&scmd->cmnd[10]);
+ start_lba_lo = get_unaligned_be32(&scmd->cmnd[6]);
+ start_lba_hi = get_unaligned_be32(&scmd->cmnd[2]);
+
+ if (scmd->cmnd[1] & FUA_MASK)
+ control |= SPRAID_RW_FUA;
+ }
+ /* 32-byte READ(0x88) or WRITE(0x8A) cdb */
+ else if (scmd->cmd_len == 32) {
+ datalength = get_unaligned_be32(&scmd->cmnd[28]);
+ start_lba_lo = get_unaligned_be32(&scmd->cmnd[16]);
+ start_lba_hi = get_unaligned_be32(&scmd->cmnd[12]);
+
+ if (scmd->cmnd[10] & FUA_MASK)
+ control |= SPRAID_RW_FUA;
+ }
+
+ if (unlikely(datalength > U16_MAX || datalength == 0)) {
+ dev_err(hdev->dev,
+ "Invalid IO for illegal transfer data length: %u\n",
+ datalength);
+ WARN_ON(1);
+ }
+
+ rw->slba = cpu_to_le64(((u64)start_lba_hi << 32) | start_lba_lo);
+ /* 0base for nlb */
+ rw->nlb = cpu_to_le16((u16)(datalength - 1));
+ rw->control = cpu_to_le16(control);
+}
+
+static void spraid_setup_nonio_cmd(struct spraid_dev *hdev,
+ struct spraid_scsi_nonio *scsi_nonio,
+ struct scsi_cmnd *scmd)
+{
+ scsi_nonio->buffer_len = cpu_to_le32(scsi_bufflen(scmd));
+
+ switch (scmd->sc_data_direction) {
+ case DMA_NONE:
+ scsi_nonio->opcode = SPRAID_CMD_NONIO_NONE;
+ break;
+ case DMA_TO_DEVICE:
+ scsi_nonio->opcode = SPRAID_CMD_NONIO_TODEV;
+ break;
+ case DMA_FROM_DEVICE:
+ scsi_nonio->opcode = SPRAID_CMD_NONIO_FROMDEV;
+ break;
+ default:
+ dev_err(hdev->dev,
+ "Invalid IO for unsupported data direction: %d\n",
+ scmd->sc_data_direction);
+ WARN_ON(1);
+ }
+}
+
+static void spraid_setup_ioq_cmd(struct spraid_dev *hdev,
+ struct spraid_ioq_command *ioq_cmd,
+ struct scsi_cmnd *scmd)
+{
+ memcpy(ioq_cmd->common.cdb, scmd->cmnd, scmd->cmd_len);
+ ioq_cmd->common.cdb_len = scmd->cmd_len;
+
+ if (spraid_is_rw_scmd(scmd))
+ spraid_setup_rw_cmd(hdev, &ioq_cmd->rw, scmd);
+ else
+ spraid_setup_nonio_cmd(hdev, &ioq_cmd->scsi_nonio, scmd);
+}
+
+static int spraid_init_iod(struct spraid_dev *hdev, struct spraid_iod *iod,
+ struct spraid_ioq_command *ioq_cmd,
+ struct scsi_cmnd *scmd)
+{
+ if (unlikely(!iod->sense)) {
+ dev_err(hdev->dev, "Allocate sense data buffer failed\n");
+ return -ENOMEM;
+ }
+ ioq_cmd->common.sense_addr = cpu_to_le64(iod->sense_dma);
+ ioq_cmd->common.sense_len = cpu_to_le16(SCSI_SENSE_BUFFERSIZE);
+
+ iod->nsge = 0;
+ iod->npages = -1;
+ iod->use_sgl = 0;
+ iod->sg_drv_mgmt = false;
+ WRITE_ONCE(iod->state, SPRAID_CMD_IDLE);
+
+ return 0;
+}
+
+static void spraid_free_iod_res(struct spraid_dev *hdev, struct spraid_iod *iod)
+{
+ const int last_prp = hdev->page_size / sizeof(__le64) - 1;
+ dma_addr_t dma_addr, next_dma_addr;
+ struct spraid_sgl_desc *sg_list;
+ __le64 *prp_list;
+ void *addr;
+ int i;
+
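+	/*
+	 * Walk the chained PRP/SGL pages: the last entry of each page
+	 * holds the DMA address of the next page in the chain.
+	 */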
+ dma_addr = iod->first_dma;
+ if (iod->npages == 0)
+ dma_pool_free(iod->spraidq->prp_small_pool,
+ spraid_iod_list(iod)[0], dma_addr);
+
+ for (i = 0; i < iod->npages; i++) {
+ addr = spraid_iod_list(iod)[i];
+
+ if (iod->use_sgl) {
+ sg_list = addr;
+ next_dma_addr =
+ le64_to_cpu((sg_list[SGES_PER_PAGE - 1]).addr);
+ } else {
+ prp_list = addr;
+ next_dma_addr = le64_to_cpu(prp_list[last_prp]);
+ }
+
+ dma_pool_free(hdev->prp_page_pool, addr, dma_addr);
+ dma_addr = next_dma_addr;
+ }
+
+ if (iod->sg_drv_mgmt && iod->sg != iod->inline_sg) {
+ iod->sg_drv_mgmt = false;
+ mempool_free(iod->sg, hdev->iod_mempool);
+ }
+
+ iod->sense = NULL;
+ iod->npages = -1;
+}
+
+static int spraid_io_map_data(struct spraid_dev *hdev, struct spraid_iod *iod,
+ struct scsi_cmnd *scmd,
+ struct spraid_ioq_command *ioq_cmd)
+{
+ int ret;
+
+ iod->nsge = scsi_dma_map(scmd);
+
+ /* No data to DMA, it may be scsi no-rw command */
+ if (unlikely(iod->nsge == 0))
+ return 0;
+
+ iod->length = scsi_bufflen(scmd);
+ iod->sg = scsi_sglist(scmd);
+ iod->use_sgl = !spraid_is_prp(hdev, scmd, iod->nsge);
+
+ if (iod->use_sgl) {
+ ret = spraid_setup_ioq_cmd_sgl(hdev, scmd, ioq_cmd, iod);
+ } else {
+ ret = spraid_setup_prps(hdev, iod);
+ ioq_cmd->common.dptr.prp1 =
+ cpu_to_le64(sg_dma_address(iod->sg));
+ ioq_cmd->common.dptr.prp2 = cpu_to_le64(iod->first_dma);
+ }
+
+ if (ret)
+ scsi_dma_unmap(scmd);
+
+ return ret;
+}
+
+static void spraid_map_status(struct spraid_iod *iod, struct scsi_cmnd *scmd,
+ struct spraid_completion *cqe)
+{
+ scsi_set_resid(scmd, 0);
+
+ switch ((le16_to_cpu(cqe->status) >> 1) & 0x7f) {
+ case FW_STAT_OK:
+ set_host_byte(scmd, DID_OK);
+ break;
+ case FW_STAT_NEED_CHECK:
+ set_host_byte(scmd, DID_OK);
+ scmd->result |= le16_to_cpu(cqe->status) >> 8;
+ if (scmd->result & SAM_STAT_CHECK_CONDITION) {
+ memset(scmd->sense_buffer, 0, SCSI_SENSE_BUFFERSIZE);
+ memcpy(scmd->sense_buffer, iod->sense,
+ SCSI_SENSE_BUFFERSIZE);
+ scmd->result =
+ (scmd->result & 0x00ffffff) | (DRIVER_SENSE << 24);
+ }
+ break;
+ case FW_STAT_ABORTED:
+ set_host_byte(scmd, DID_ABORT);
+ break;
+ case FW_STAT_NEED_RETRY:
+ set_host_byte(scmd, DID_REQUEUE);
+ break;
+ default:
+ set_host_byte(scmd, DID_BAD_TARGET);
+ break;
+ }
+}
+
+static inline void spraid_get_tag_from_scmd(struct scsi_cmnd *scmd,
+ u16 *qid, u16 *cid)
+{
+ u32 tag = blk_mq_unique_tag(scmd->request);
+
+ *qid = blk_mq_unique_tag_to_hwq(tag) + 1;
+ *cid = blk_mq_unique_tag_to_tag(tag);
+}
+
+static int spraid_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
+{
+ struct spraid_iod *iod = scsi_cmd_priv(scmd);
+ struct spraid_dev *hdev = shost_priv(shost);
+ struct scsi_device *sdev = scmd->device;
+ struct spraid_sdev_hostdata *hostdata;
+ struct spraid_ioq_command ioq_cmd;
+ struct spraid_queue *ioq;
+ unsigned long elapsed;
+ u16 hwq, cid;
+ int ret;
+
+ if (unlikely(!scmd)) {
+ dev_err(hdev->dev, "err, scmd is null\n");
+ return 0;
+ }
+
+ if (unlikely(hdev->state != SPRAID_LIVE)) {
+ set_host_byte(scmd, DID_NO_CONNECT);
+ scmd->scsi_done(scmd);
+ return 0;
+ }
+
+ if (log_debug_switch)
+ scsi_print_command(scmd);
+
+ spraid_get_tag_from_scmd(scmd, &hwq, &cid);
+ hostdata = sdev->hostdata;
+ ioq = &hdev->queues[hwq];
+ memset(&ioq_cmd, 0, sizeof(ioq_cmd));
+ ioq_cmd.rw.hdid = cpu_to_le32(hostdata->hdid);
+ ioq_cmd.rw.command_id = cid;
+
+ spraid_setup_ioq_cmd(hdev, &ioq_cmd, scmd);
+
+ ret = cid * SCSI_SENSE_BUFFERSIZE;
+ iod->sense = ioq->sense + ret;
+ iod->sense_dma = ioq->sense_dma_addr + ret;
+
+ ret = spraid_init_iod(hdev, iod, &ioq_cmd, scmd);
+ if (unlikely(ret))
+ return SCSI_MLQUEUE_HOST_BUSY;
+
+ iod->spraidq = ioq;
+ ret = spraid_io_map_data(hdev, iod, scmd, &ioq_cmd);
+ if (unlikely(ret)) {
+		dev_err(hdev->dev, "spraid_io_map_data failed\n");
+ set_host_byte(scmd, DID_ERROR);
+ scmd->scsi_done(scmd);
+ ret = 0;
+ goto deinit_iod;
+ }
+
+ WRITE_ONCE(iod->state, SPRAID_CMD_IN_FLIGHT);
+ spraid_submit_cmd(ioq, &ioq_cmd);
+ elapsed = jiffies - scmd->jiffies_at_alloc;
+ dev_log_dbg(hdev->dev,
+ "cid[%d] qid[%d] submit IO cost %3ld.%3ld seconds\n",
+ cid, hwq, elapsed / HZ, elapsed % HZ);
+ return 0;
+
+deinit_iod:
+ spraid_free_iod_res(hdev, iod);
+ return ret;
+}
+
+static int spraid_match_dev(struct spraid_dev *hdev, u16 idx,
+ struct scsi_device *sdev)
+{
+ if (SPRAID_DEV_INFO_FLAG_VALID(hdev->devices[idx].flag)) {
+ if (sdev->channel == hdev->devices[idx].channel &&
+ sdev->id == le16_to_cpu(hdev->devices[idx].target) &&
+ sdev->lun < hdev->devices[idx].lun) {
+ dev_info(hdev->dev,
+				 "Match device success, channel:target:lun[%d:%d:%d]\n",
+ hdev->devices[idx].channel,
+ hdev->devices[idx].target,
+ hdev->devices[idx].lun);
+ return 1;
+ }
+ }
+
+ return 0;
+}
+
+static int spraid_slave_alloc(struct scsi_device *sdev)
+{
+ struct spraid_sdev_hostdata *hostdata;
+ struct spraid_dev *hdev;
+ u16 idx;
+
+ hdev = shost_priv(sdev->host);
+ hostdata = kzalloc(sizeof(*hostdata), GFP_KERNEL);
+ if (!hostdata) {
+ dev_err(hdev->dev, "Alloc scsi host data memory failed\n");
+ return -ENOMEM;
+ }
+
+ down_read(&hdev->devices_rwsem);
+ for (idx = 0; idx < le32_to_cpu(hdev->ctrl_info->nd); idx++) {
+ if (spraid_match_dev(hdev, idx, sdev))
+ goto scan_host;
+ }
+ up_read(&hdev->devices_rwsem);
+
+ kfree(hostdata);
+ return -ENXIO;
+
+scan_host:
+ hostdata->hdid = le32_to_cpu(hdev->devices[idx].hdid);
+ hostdata->max_io_kb = le16_to_cpu(hdev->devices[idx].max_io_kb);
+ hostdata->attr = hdev->devices[idx].attr;
+ hostdata->flag = hdev->devices[idx].flag;
+ hostdata->rg_id = 0xff;
+ sdev->hostdata = hostdata;
+ up_read(&hdev->devices_rwsem);
+ return 0;
+}
+
+static void spraid_slave_destroy(struct scsi_device *sdev)
+{
+ kfree(sdev->hostdata);
+ sdev->hostdata = NULL;
+}
+
+static int spraid_slave_configure(struct scsi_device *sdev)
+{
+ u16 idx;
+ unsigned int timeout = scmd_tmout_nonpt * HZ;
+ struct spraid_dev *hdev = shost_priv(sdev->host);
+ struct spraid_sdev_hostdata *hostdata = sdev->hostdata;
+ u32 max_sec = sdev->host->max_sectors;
+
+ if (hostdata) {
+ idx = hostdata->hdid - 1;
+ if (sdev->channel == hdev->devices[idx].channel &&
+ sdev->id == le16_to_cpu(hdev->devices[idx].target) &&
+ sdev->lun < hdev->devices[idx].lun) {
+ if (SPRAID_DEV_INFO_ATTR_PT(hdev->devices[idx].attr))
+ timeout = 30 * HZ;
+ else
+ timeout = scmd_tmout_nonpt * HZ;
+ max_sec = le16_to_cpu(hdev->devices[idx].max_io_kb)
+ << 1;
+ } else {
+ dev_err(hdev->dev,
+				"[%s] err, sdev->channel:id:lun[%d:%d:%lld], "
+				"devices[%d] channel:target:lun[%d:%d:%d]\n",
+ __func__, sdev->channel, sdev->id, sdev->lun,
+ idx, hdev->devices[idx].channel,
+ hdev->devices[idx].target,
+ hdev->devices[idx].lun);
+ }
+ } else {
+ dev_err(hdev->dev, "[%s] err, sdev->hostdata is null\n",
+ __func__);
+ }
+
+ blk_queue_rq_timeout(sdev->request_queue, timeout);
+ sdev->eh_timeout = timeout;
+
+ if ((max_sec == 0) || (max_sec > sdev->host->max_sectors))
+ max_sec = sdev->host->max_sectors;
+
+ dev_info(hdev->dev,
+		 "[%s] sdev->channel:id:lun[%d:%d:%lld], "
+		 "scmd_timeout[%d]s, maxsec[%d]\n",
+ __func__, sdev->channel, sdev->id,
+ sdev->lun, timeout / HZ, max_sec);
+
+ return 0;
+}
+
+static void spraid_shost_init(struct spraid_dev *hdev)
+{
+ struct pci_dev *pdev = hdev->pdev;
+ u8 domain, bus;
+ u32 dev_func;
+
+ domain = pci_domain_nr(pdev->bus);
+ bus = pdev->bus->number;
+ dev_func = pdev->devfn;
+
+ hdev->shost->nr_hw_queues = hdev->online_queues - 1;
+ hdev->shost->can_queue = (hdev->ioq_depth - SPRAID_PTCMDS_PERQ);
+
+ hdev->shost->sg_tablesize = le16_to_cpu(hdev->ctrl_info->max_num_sge);
+ /* 512B per sector */
+ hdev->shost->max_sectors =
+ (1U << ((hdev->ctrl_info->mdts) * 1U) << 12) / 512;
+ hdev->shost->cmd_per_lun = MAX_CMD_PER_DEV;
+ hdev->shost->max_channel =
+ le16_to_cpu(hdev->ctrl_info->max_channel) - 1;
+ hdev->shost->max_id = le32_to_cpu(hdev->ctrl_info->max_tgt_id);
+ hdev->shost->max_lun = le16_to_cpu(hdev->ctrl_info->max_lun);
+
+ hdev->shost->this_id = -1;
+ hdev->shost->unique_id = (domain << 16) | (bus << 8) | dev_func;
+ hdev->shost->max_cmd_len = MAX_CDB_LEN;
+ hdev->shost->hostt->cmd_size = max(spraid_cmd_size(hdev, false, true),
+ spraid_cmd_size(hdev, false, false));
+}
+
+static inline void spraid_host_deinit(struct spraid_dev *hdev)
+{
+ ida_free(&spraid_instance_ida, hdev->instance);
+}
+
+static int spraid_alloc_queue(struct spraid_dev *hdev, u16 qid, u16 depth)
+{
+ struct spraid_queue *spraidq = &hdev->queues[qid];
+ int ret = 0;
+
+ if (hdev->queue_count > qid) {
+		dev_info(hdev->dev, "[%s] warn: queue[%d] already exists\n",
+ __func__, qid);
+ return 0;
+ }
+
+ spraidq->cqes = dma_alloc_coherent(hdev->dev, CQ_SIZE(depth),
+ &spraidq->cq_dma_addr,
+ GFP_KERNEL | __GFP_ZERO);
+ if (!spraidq->cqes)
+ return -ENOMEM;
+
+ spraidq->sq_cmds = dma_alloc_coherent(hdev->dev, SQ_SIZE(qid, depth),
+ &spraidq->sq_dma_addr,
+ GFP_KERNEL);
+ if (!spraidq->sq_cmds) {
+ ret = -ENOMEM;
+ goto free_cqes;
+ }
+
+ spin_lock_init(&spraidq->sq_lock);
+ spin_lock_init(&spraidq->cq_lock);
+ spraidq->hdev = hdev;
+ spraidq->q_depth = depth;
+ spraidq->qid = qid;
+ spraidq->cq_vector = -1;
+ hdev->queue_count++;
+
+ /* alloc sense buffer */
+ spraidq->sense = dma_alloc_coherent(hdev->dev, SENSE_SIZE(depth),
+ &spraidq->sense_dma_addr,
+ GFP_KERNEL | __GFP_ZERO);
+ if (!spraidq->sense) {
+ ret = -ENOMEM;
+ goto free_sq_cmds;
+ }
+
+ return 0;
+
+free_sq_cmds:
+ dma_free_coherent(hdev->dev, SQ_SIZE(qid, depth),
+ (void *)spraidq->sq_cmds, spraidq->sq_dma_addr);
+free_cqes:
+ dma_free_coherent(hdev->dev, CQ_SIZE(depth), (void *)spraidq->cqes,
+ spraidq->cq_dma_addr);
+ return ret;
+}
+
+static int spraid_wait_ready(struct spraid_dev *hdev, u64 cap, bool enabled)
+{
+ unsigned long timeout =
+ ((SPRAID_CAP_TIMEOUT(cap) + 1) * SPRAID_CAP_TIMEOUT_UNIT_MS) + jiffies;
+ u32 bit = enabled ? SPRAID_CSTS_RDY : 0;
+
+ while ((readl(hdev->bar + SPRAID_REG_CSTS) & SPRAID_CSTS_RDY) != bit) {
+ usleep_range(1000, 2000);
+ if (fatal_signal_pending(current))
+ return -EINTR;
+
+ if (time_after(jiffies, timeout)) {
+ dev_err(hdev->dev, "Device not ready; aborting %s\n",
+ enabled ? "initialisation" : "reset");
+ return -ENODEV;
+ }
+ }
+ return 0;
+}
+
+static int spraid_shutdown_ctrl(struct spraid_dev *hdev)
+{
+ unsigned long timeout = hdev->ctrl_info->rtd3e + jiffies;
+
+ hdev->ctrl_config &= ~SPRAID_CC_SHN_MASK;
+ hdev->ctrl_config |= SPRAID_CC_SHN_NORMAL;
+ writel(hdev->ctrl_config, hdev->bar + SPRAID_REG_CC);
+
+ while ((readl(hdev->bar + SPRAID_REG_CSTS) & SPRAID_CSTS_SHST_MASK) !=
+ SPRAID_CSTS_SHST_CMPLT) {
+ msleep(100);
+ if (fatal_signal_pending(current))
+ return -EINTR;
+ if (time_after(jiffies, timeout)) {
+ dev_err(hdev->dev,
+ "Device shutdown incomplete; abort shutdown\n");
+ return -ENODEV;
+ }
+ }
+ return 0;
+}
+
+static int spraid_disable_ctrl(struct spraid_dev *hdev)
+{
+ hdev->ctrl_config &= ~SPRAID_CC_SHN_MASK;
+ hdev->ctrl_config &= ~SPRAID_CC_ENABLE;
+ writel(hdev->ctrl_config, hdev->bar + SPRAID_REG_CC);
+
+ return spraid_wait_ready(hdev, hdev->cap, false);
+}
+
+static int spraid_enable_ctrl(struct spraid_dev *hdev)
+{
+ u64 cap = hdev->cap;
+ u32 dev_page_min = SPRAID_CAP_MPSMIN(cap) + 12;
+ u32 page_shift = PAGE_SHIFT;
+
+ if (page_shift < dev_page_min) {
+ dev_err(hdev->dev,
+ "Minimum device page size[%u], too large for host[%u]\n",
+ 1U << dev_page_min, 1U << page_shift);
+ return -ENODEV;
+ }
+
+ page_shift = min_t(unsigned int, SPRAID_CAP_MPSMAX(cap) + 12,
+ PAGE_SHIFT);
+ hdev->page_size = 1U << page_shift;
+
+ hdev->ctrl_config = SPRAID_CC_CSS_NVM;
+ hdev->ctrl_config |= (page_shift - 12) << SPRAID_CC_MPS_SHIFT;
+ hdev->ctrl_config |= SPRAID_CC_AMS_RR | SPRAID_CC_SHN_NONE;
+ hdev->ctrl_config |= SPRAID_CC_IOSQES | SPRAID_CC_IOCQES;
+ hdev->ctrl_config |= SPRAID_CC_ENABLE;
+ writel(hdev->ctrl_config, hdev->bar + SPRAID_REG_CC);
+
+ return spraid_wait_ready(hdev, cap, true);
+}
+
+static void spraid_init_queue(struct spraid_queue *spraidq, u16 qid)
+{
+ struct spraid_dev *hdev = spraidq->hdev;
+
+ memset((void *)spraidq->cqes, 0, CQ_SIZE(spraidq->q_depth));
+
+ spraidq->sq_tail = 0;
+ spraidq->cq_head = 0;
+ spraidq->cq_phase = 1;
+ spraidq->q_db = &hdev->dbs[qid * 2 * hdev->db_stride];
+ spraidq->prp_small_pool = hdev->prp_small_pool[qid % small_pool_num];
+ hdev->online_queues++;
+}
+
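+/*
+ * A CQE is new when its phase bit matches the queue's current cq_phase;
+ * the phase flips each time the completion ring wraps around.
+ */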
+static inline bool spraid_cqe_pending(struct spraid_queue *spraidq)
+{
+ return (le16_to_cpu(spraidq->cqes[spraidq->cq_head].status) & 1) ==
+ spraidq->cq_phase;
+}
+
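+/*
+ * REPORT ZONES data from SATA drives arrives in little-endian ATA layout;
+ * rewrite the header and each 64-byte zone descriptor in place into the
+ * big-endian SCSI representation.
+ */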
+static void spraid_sata_report_zone_handle(struct scsi_cmnd *scmd,
+ struct spraid_iod *iod)
+{
+ int i = 0;
+ unsigned int bytes = 0;
+ struct scatterlist *sg = scsi_sglist(scmd);
+
+ scsi_for_each_sg(scmd, sg, iod->nsge, i) {
+ unsigned int offset = 0;
+
+ if (bytes == 0) {
+ char *hdr;
+ u32 list_length;
+ u64 max_lba, opt_lba;
+ u16 same;
+
+ hdr = sg_virt(sg);
+
+ list_length = get_unaligned_le32(&hdr[0]);
+ same = get_unaligned_le16(&hdr[4]);
+ max_lba = get_unaligned_le64(&hdr[8]);
+ opt_lba = get_unaligned_le64(&hdr[16]);
+ put_unaligned_be32(list_length, &hdr[0]);
+ hdr[4] = same & 0xf;
+ put_unaligned_be64(max_lba, &hdr[8]);
+ put_unaligned_be64(opt_lba, &hdr[16]);
+ offset += 64;
+ bytes += 64;
+ }
+ while (offset < sg_dma_len(sg)) {
+ char *rec;
+ u8 cond, type, non_seq, reset;
+ u64 size, start, wp;
+
+ rec = sg_virt(sg) + offset;
+ type = rec[0] & 0xf;
+ cond = (rec[1] >> 4) & 0xf;
+ non_seq = (rec[1] & 2);
+ reset = (rec[1] & 1);
+ size = get_unaligned_le64(&rec[8]);
+ start = get_unaligned_le64(&rec[16]);
+ wp = get_unaligned_le64(&rec[24]);
+ rec[0] = type;
+ rec[1] = (cond << 4) | non_seq | reset;
+ put_unaligned_be64(size, &rec[8]);
+ put_unaligned_be64(start, &rec[16]);
+ put_unaligned_be64(wp, &rec[24]);
+ WARN_ON(offset + 64 > sg_dma_len(sg));
+ offset += 64;
+ bytes += 64;
+ }
+ }
+}
+
+static inline void spraid_handle_ata_cmd(struct spraid_dev *hdev,
+ struct scsi_cmnd *scmd,
+ struct spraid_iod *iod)
+{
+ if (hdev->ctrl_info->card_type != SPRAID_CARD_HBA)
+ return;
+
+ switch (scmd->cmnd[0]) {
+ case ZBC_IN:
+ dev_info(hdev->dev, "[%s] process report zone\n", __func__);
+ spraid_sata_report_zone_handle(scmd, iod);
+ break;
+ default:
+ break;
+ }
+}
+
+static void spraid_complete_ioq_cmnd(struct spraid_queue *ioq,
+ struct spraid_completion *cqe)
+{
+ struct spraid_dev *hdev = ioq->hdev;
+ struct blk_mq_tags *tags;
+ struct scsi_cmnd *scmd;
+ struct spraid_iod *iod;
+ struct request *req;
+ unsigned long elapsed;
+
+ tags = hdev->shost->tag_set.tags[ioq->qid - 1];
+ req = blk_mq_tag_to_rq(tags, cqe->cmd_id);
+ if (unlikely(!req || !blk_mq_request_started(req))) {
+ dev_warn(hdev->dev, "Invalid id %d completed on queue %d\n",
+ cqe->cmd_id, ioq->qid);
+ return;
+ }
+
+ scmd = blk_mq_rq_to_pdu(req);
+ iod = scsi_cmd_priv(scmd);
+
+ elapsed = jiffies - scmd->jiffies_at_alloc;
+ dev_log_dbg(hdev->dev,
+ "cid[%d] qid[%d] finish IO cost %3ld.%3ld seconds\n",
+ cqe->cmd_id, ioq->qid, elapsed / HZ, elapsed % HZ);
+
+ if (cmpxchg(&iod->state, SPRAID_CMD_IN_FLIGHT, SPRAID_CMD_COMPLETE) !=
+ SPRAID_CMD_IN_FLIGHT) {
+ dev_warn(hdev->dev,
+			 "cid[%d] qid[%d] enters abnormal handler, "
+			 "cost %3ld.%3ld seconds\n",
+ cqe->cmd_id, ioq->qid, elapsed / HZ, elapsed % HZ);
+ WRITE_ONCE(iod->state, SPRAID_CMD_TMO_COMPLETE);
+
+ if (iod->nsge) {
+ iod->nsge = 0;
+ scsi_dma_unmap(scmd);
+ }
+ spraid_free_iod_res(hdev, iod);
+
+ return;
+ }
+
+ spraid_handle_ata_cmd(hdev, scmd, iod);
+
+ spraid_map_status(iod, scmd, cqe);
+ if (iod->nsge) {
+ iod->nsge = 0;
+ scsi_dma_unmap(scmd);
+ }
+ spraid_free_iod_res(hdev, iod);
+ scmd->scsi_done(scmd);
+}
+
+static void spraid_complete_adminq_cmnd(struct spraid_queue *adminq,
+ struct spraid_completion *cqe)
+{
+ struct spraid_dev *hdev = adminq->hdev;
+ struct spraid_cmd *adm_cmd;
+
+ adm_cmd = hdev->adm_cmds + cqe->cmd_id;
+ if (unlikely(adm_cmd->state == SPRAID_CMD_IDLE)) {
+ dev_warn(adminq->hdev->dev,
+ "Invalid id %d completed on queue %d\n",
+ cqe->cmd_id, le16_to_cpu(cqe->sq_id));
+ return;
+ }
+
+ adm_cmd->status = le16_to_cpu(cqe->status) >> 1;
+ adm_cmd->result0 = le32_to_cpu(cqe->result);
+ adm_cmd->result1 = le32_to_cpu(cqe->result1);
+
+ complete(&adm_cmd->cmd_done);
+}
+
+static void spraid_send_aen(struct spraid_dev *hdev, u16 cid);
+
+static void spraid_complete_aen(struct spraid_queue *spraidq,
+ struct spraid_completion *cqe)
+{
+ struct spraid_dev *hdev = spraidq->hdev;
+ u32 result = le32_to_cpu(cqe->result);
+
+ dev_info(hdev->dev, "rcv aen, cid[%d], status[0x%x], result[0x%x]\n",
+ cqe->cmd_id, le16_to_cpu(cqe->status) >> 1, result);
+
+ spraid_send_aen(hdev, cqe->cmd_id);
+
+ if ((le16_to_cpu(cqe->status) >> 1) != SPRAID_SC_SUCCESS)
+ return;
+ switch (result & 0x7) {
+ case SPRAID_AEN_NOTICE:
+ spraid_handle_aen_notice(hdev, result);
+ break;
+ case SPRAID_AEN_VS:
+ spraid_handle_aen_vs(hdev, result, le32_to_cpu(cqe->result1));
+ break;
+ default:
+ dev_warn(hdev->dev, "Unsupported async event type: %u\n",
+ result & 0x7);
+ break;
+ }
+}
+
+static void spraid_complete_ioq_sync_cmnd(struct spraid_queue *ioq,
+ struct spraid_completion *cqe)
+{
+ struct spraid_dev *hdev = ioq->hdev;
+ struct spraid_cmd *ptcmd;
+
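+	/*
+	 * Reverse the qid/cid encoding used in spraid_alloc_ioq_ptcmds()
+	 * to locate the per-queue passthrough command slot.
+	 */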
+ ptcmd = hdev->ioq_ptcmds + (ioq->qid - 1) * SPRAID_PTCMDS_PERQ +
+ cqe->cmd_id - SPRAID_IO_BLK_MQ_DEPTH;
+
+ ptcmd->status = le16_to_cpu(cqe->status) >> 1;
+ ptcmd->result0 = le32_to_cpu(cqe->result);
+ ptcmd->result1 = le32_to_cpu(cqe->result1);
+
+ complete(&ptcmd->cmd_done);
+}
+
+static inline void spraid_handle_cqe(struct spraid_queue *spraidq, u16 idx)
+{
+ struct spraid_completion *cqe = &spraidq->cqes[idx];
+ struct spraid_dev *hdev = spraidq->hdev;
+
+ if (unlikely(cqe->cmd_id >= spraidq->q_depth)) {
+ dev_err(hdev->dev,
+ "Invalid command id[%d] completed on queue %d\n",
+ cqe->cmd_id, cqe->sq_id);
+ return;
+ }
+
+	dev_log_dbg(hdev->dev,
+		    "cid[%d] qid[%d], result[0x%x], sq_id[%d], status[0x%x]\n",
+ cqe->cmd_id, spraidq->qid, le32_to_cpu(cqe->result),
+ le16_to_cpu(cqe->sq_id), le16_to_cpu(cqe->status));
+
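+	/*
+	 * cmd_id ranges demultiplex the completion: admin ids at or above
+	 * SPRAID_AQ_BLK_MQ_DEPTH are async events, and io-queue ids at or
+	 * above SPRAID_IO_BLK_MQ_DEPTH are internal passthrough commands.
+	 */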
+	if (unlikely(spraidq->qid == 0 &&
+		     cqe->cmd_id >= SPRAID_AQ_BLK_MQ_DEPTH)) {
+ spraid_complete_aen(spraidq, cqe);
+ return;
+ }
+
+ if (unlikely(spraidq->qid && cqe->cmd_id >= SPRAID_IO_BLK_MQ_DEPTH)) {
+ spraid_complete_ioq_sync_cmnd(spraidq, cqe);
+ return;
+ }
+
+ if (spraidq->qid)
+ spraid_complete_ioq_cmnd(spraidq, cqe);
+ else
+ spraid_complete_adminq_cmnd(spraidq, cqe);
+}
+
+static void spraid_complete_cqes(struct spraid_queue *spraidq,
+ u16 start, u16 end)
+{
+ while (start != end) {
+ spraid_handle_cqe(spraidq, start);
+ if (++start == spraidq->q_depth)
+ start = 0;
+ }
+}
+
+static inline void spraid_update_cq_head(struct spraid_queue *spraidq)
+{
+ if (++spraidq->cq_head == spraidq->q_depth) {
+ spraidq->cq_head = 0;
+ spraidq->cq_phase = !spraidq->cq_phase;
+ }
+}
+
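+/*
+ * Reap pending CQEs starting at cq_head; stop once the CQE whose cmd_id
+ * matches @tag has been consumed (tag == -1 never matches, so all pending
+ * entries are reaped). Returns whether the tag was found.
+ */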
+static inline bool spraid_process_cq(struct spraid_queue *spraidq,
+ u16 *start, u16 *end, int tag)
+{
+ bool found = false;
+
+ *start = spraidq->cq_head;
+ while (!found && spraid_cqe_pending(spraidq)) {
+ if (spraidq->cqes[spraidq->cq_head].cmd_id == tag)
+ found = true;
+ spraid_update_cq_head(spraidq);
+ }
+ *end = spraidq->cq_head;
+
+ if (*start != *end)
+ writel(spraidq->cq_head,
+ spraidq->q_db + spraidq->hdev->db_stride);
+
+ return found;
+}
+
+static bool spraid_poll_cq(struct spraid_queue *spraidq, int cid)
+{
+ u16 start, end;
+ bool found;
+
+ if (!spraid_cqe_pending(spraidq))
+		return false;
+
+ spin_lock_irq(&spraidq->cq_lock);
+ found = spraid_process_cq(spraidq, &start, &end, cid);
+ spin_unlock_irq(&spraidq->cq_lock);
+
+ spraid_complete_cqes(spraidq, start, end);
+ return found;
+}
+
+static irqreturn_t spraid_irq(int irq, void *data)
+{
+ struct spraid_queue *spraidq = data;
+ irqreturn_t ret = IRQ_NONE;
+ u16 start, end;
+
+ spin_lock(&spraidq->cq_lock);
+ if (spraidq->cq_head != spraidq->last_cq_head)
+ ret = IRQ_HANDLED;
+
+ spraid_process_cq(spraidq, &start, &end, -1);
+ spraidq->last_cq_head = spraidq->cq_head;
+ spin_unlock(&spraidq->cq_lock);
+
+ if (start != end) {
+ spraid_complete_cqes(spraidq, start, end);
+ ret = IRQ_HANDLED;
+ }
+ return ret;
+}
+
+static int spraid_setup_admin_queue(struct spraid_dev *hdev)
+{
+ struct spraid_queue *adminq = &hdev->queues[0];
+ u32 aqa;
+ int ret;
+
+ dev_info(hdev->dev, "[%s] start disable ctrl\n", __func__);
+
+ ret = spraid_disable_ctrl(hdev);
+ if (ret)
+ return ret;
+
+ ret = spraid_alloc_queue(hdev, 0, SPRAID_AQ_DEPTH);
+ if (ret)
+ return ret;
+
+ aqa = adminq->q_depth - 1;
+ aqa |= aqa << 16;
+ writel(aqa, hdev->bar + SPRAID_REG_AQA);
+ lo_hi_writeq(adminq->sq_dma_addr, hdev->bar + SPRAID_REG_ASQ);
+ lo_hi_writeq(adminq->cq_dma_addr, hdev->bar + SPRAID_REG_ACQ);
+
+ dev_info(hdev->dev, "[%s] start enable ctrl\n", __func__);
+
+ ret = spraid_enable_ctrl(hdev);
+ if (ret) {
+ ret = -ENODEV;
+ goto free_queue;
+ }
+
+ adminq->cq_vector = 0;
+ spraid_init_queue(adminq, 0);
+ ret = pci_request_irq(hdev->pdev, adminq->cq_vector, spraid_irq, NULL,
+ adminq, "spraid%d_q%d",
+ hdev->instance, adminq->qid);
+
+ if (ret) {
+ adminq->cq_vector = -1;
+ hdev->online_queues--;
+ goto free_queue;
+ }
+
+ dev_info(hdev->dev, "[%s] success, queuecount:[%d], onlinequeue:[%d]\n",
+ __func__, hdev->queue_count, hdev->online_queues);
+
+ return 0;
+
+free_queue:
+ spraid_free_queue(adminq);
+ return ret;
+}
+
+static u32 spraid_bar_size(struct spraid_dev *hdev, u32 nr_ioqs)
+{
+ return (SPRAID_REG_DBS + ((nr_ioqs + 1) * 8 * hdev->db_stride));
+}
+
+static int spraid_alloc_admin_cmds(struct spraid_dev *hdev)
+{
+ int i;
+
+ INIT_LIST_HEAD(&hdev->adm_cmd_list);
+ spin_lock_init(&hdev->adm_cmd_lock);
+
+ hdev->adm_cmds = kcalloc_node(SPRAID_AQ_BLK_MQ_DEPTH,
+ sizeof(struct spraid_cmd),
+ GFP_KERNEL, hdev->numa_node);
+
+ if (!hdev->adm_cmds) {
+ dev_err(hdev->dev, "Alloc admin cmds failed\n");
+ return -ENOMEM;
+ }
+
+ for (i = 0; i < SPRAID_AQ_BLK_MQ_DEPTH; i++) {
+ hdev->adm_cmds[i].qid = 0;
+ hdev->adm_cmds[i].cid = i;
+ list_add_tail(&(hdev->adm_cmds[i].list), &hdev->adm_cmd_list);
+ }
+
+ dev_info(hdev->dev, "Alloc admin cmds success, num[%d]\n",
+ SPRAID_AQ_BLK_MQ_DEPTH);
+
+ return 0;
+}
+
+static void spraid_free_admin_cmds(struct spraid_dev *hdev)
+{
+ kfree(hdev->adm_cmds);
+ hdev->adm_cmds = NULL;
+ INIT_LIST_HEAD(&hdev->adm_cmd_list);
+}
+
+static struct spraid_cmd *spraid_get_cmd(struct spraid_dev *hdev,
+ enum spraid_cmd_type type)
+{
+ struct spraid_cmd *cmd = NULL;
+ unsigned long flags;
+ struct list_head *head = &hdev->adm_cmd_list;
+ spinlock_t *slock = &hdev->adm_cmd_lock;
+
+ if (type == SPRAID_CMD_IOPT) {
+ head = &hdev->ioq_pt_list;
+ slock = &hdev->ioq_pt_lock;
+ }
+
+ spin_lock_irqsave(slock, flags);
+ if (list_empty(head)) {
+ spin_unlock_irqrestore(slock, flags);
+ dev_err(hdev->dev, "err, cmd[%d] list empty\n", type);
+ return NULL;
+ }
+ cmd = list_entry(head->next, struct spraid_cmd, list);
+ list_del_init(&cmd->list);
+ spin_unlock_irqrestore(slock, flags);
+
+ WRITE_ONCE(cmd->state, SPRAID_CMD_IN_FLIGHT);
+
+ return cmd;
+}
+
+static void spraid_put_cmd(struct spraid_dev *hdev, struct spraid_cmd *cmd,
+ enum spraid_cmd_type type)
+{
+ unsigned long flags;
+ struct list_head *head = &hdev->adm_cmd_list;
+ spinlock_t *slock = &hdev->adm_cmd_lock;
+
+ if (type == SPRAID_CMD_IOPT) {
+ head = &hdev->ioq_pt_list;
+ slock = &hdev->ioq_pt_lock;
+ }
+
+ spin_lock_irqsave(slock, flags);
+ WRITE_ONCE(cmd->state, SPRAID_CMD_IDLE);
+ list_add_tail(&cmd->list, head);
+ spin_unlock_irqrestore(slock, flags);
+}
+
+
+static int spraid_submit_admin_sync_cmd(struct spraid_dev *hdev,
+ struct spraid_admin_command *cmd,
+ u32 *result0, u32 *result1, u32 timeout)
+{
+ struct spraid_cmd *adm_cmd = spraid_get_cmd(hdev, SPRAID_CMD_ADM);
+
+ if (!adm_cmd) {
+ dev_err(hdev->dev, "err, get admin cmd failed\n");
+ return -EFAULT;
+ }
+
+ timeout = timeout ? timeout : ADMIN_TIMEOUT;
+
+ init_completion(&adm_cmd->cmd_done);
+
+ cmd->common.command_id = adm_cmd->cid;
+ spraid_submit_cmd(&hdev->queues[0], cmd);
+
+ if (!wait_for_completion_timeout(&adm_cmd->cmd_done, timeout)) {
+		dev_err(hdev->dev, "[%s] cid[%d] qid[%d] timeout, "
+			"opcode[0x%x] subopcode[0x%x]\n",
+ __func__, adm_cmd->cid, adm_cmd->qid,
+ cmd->usr_cmd.opcode, cmd->usr_cmd.info_0.subopcode);
+ WRITE_ONCE(adm_cmd->state, SPRAID_CMD_TIMEOUT);
+ spraid_put_cmd(hdev, adm_cmd, SPRAID_CMD_ADM);
+ return -EINVAL;
+ }
+
+ if (result0)
+ *result0 = adm_cmd->result0;
+ if (result1)
+ *result1 = adm_cmd->result1;
+
+ spraid_put_cmd(hdev, adm_cmd, SPRAID_CMD_ADM);
+
+ return adm_cmd->status;
+}
+
+static int spraid_create_cq(struct spraid_dev *hdev, u16 qid,
+ struct spraid_queue *spraidq, u16 cq_vector)
+{
+ struct spraid_admin_command admin_cmd;
+ int flags = SPRAID_QUEUE_PHYS_CONTIG | SPRAID_CQ_IRQ_ENABLED;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.create_cq.opcode = SPRAID_ADMIN_CREATE_CQ;
+ admin_cmd.create_cq.prp1 = cpu_to_le64(spraidq->cq_dma_addr);
+ admin_cmd.create_cq.cqid = cpu_to_le16(qid);
+ admin_cmd.create_cq.qsize = cpu_to_le16(spraidq->q_depth - 1);
+ admin_cmd.create_cq.cq_flags = cpu_to_le16(flags);
+ admin_cmd.create_cq.irq_vector = cpu_to_le16(cq_vector);
+
+ return spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+}
+
+static int spraid_create_sq(struct spraid_dev *hdev, u16 qid,
+ struct spraid_queue *spraidq)
+{
+ struct spraid_admin_command admin_cmd;
+ int flags = SPRAID_QUEUE_PHYS_CONTIG;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.create_sq.opcode = SPRAID_ADMIN_CREATE_SQ;
+ admin_cmd.create_sq.prp1 = cpu_to_le64(spraidq->sq_dma_addr);
+ admin_cmd.create_sq.sqid = cpu_to_le16(qid);
+ admin_cmd.create_sq.qsize = cpu_to_le16(spraidq->q_depth - 1);
+ admin_cmd.create_sq.sq_flags = cpu_to_le16(flags);
+ admin_cmd.create_sq.cqid = cpu_to_le16(qid);
+
+ return spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+}
+
+static void spraid_free_queue(struct spraid_queue *spraidq)
+{
+ struct spraid_dev *hdev = spraidq->hdev;
+
+ hdev->queue_count--;
+ dma_free_coherent(hdev->dev, CQ_SIZE(spraidq->q_depth),
+ (void *)spraidq->cqes, spraidq->cq_dma_addr);
+ dma_free_coherent(hdev->dev, SQ_SIZE(spraidq->qid, spraidq->q_depth),
+ spraidq->sq_cmds, spraidq->sq_dma_addr);
+ dma_free_coherent(hdev->dev, SENSE_SIZE(spraidq->q_depth),
+ spraidq->sense, spraidq->sense_dma_addr);
+}
+
+static void spraid_free_admin_queue(struct spraid_dev *hdev)
+{
+ spraid_free_queue(&hdev->queues[0]);
+}
+
+static void spraid_free_io_queues(struct spraid_dev *hdev)
+{
+ int i;
+
+ for (i = hdev->queue_count - 1; i >= 1; i--)
+ spraid_free_queue(&hdev->queues[i]);
+}
+
+static int spraid_delete_queue(struct spraid_dev *hdev, u8 op, u16 id)
+{
+ struct spraid_admin_command admin_cmd;
+ int ret;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.delete_queue.opcode = op;
+ admin_cmd.delete_queue.qid = cpu_to_le16(id);
+
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+
+ if (ret)
+ dev_err(hdev->dev, "Delete %s:[%d] failed\n",
+ (op == SPRAID_ADMIN_DELETE_CQ) ? "cq" : "sq", id);
+
+ return ret;
+}
+
+static int spraid_delete_cq(struct spraid_dev *hdev, u16 cqid)
+{
+ return spraid_delete_queue(hdev, SPRAID_ADMIN_DELETE_CQ, cqid);
+}
+
+static int spraid_delete_sq(struct spraid_dev *hdev, u16 sqid)
+{
+ return spraid_delete_queue(hdev, SPRAID_ADMIN_DELETE_SQ, sqid);
+}
+
+static int spraid_create_queue(struct spraid_queue *spraidq, u16 qid)
+{
+ struct spraid_dev *hdev = spraidq->hdev;
+ u16 cq_vector;
+ int ret;
+
+ cq_vector = (hdev->num_vecs == 1) ? 0 : qid;
+ ret = spraid_create_cq(hdev, qid, spraidq, cq_vector);
+ if (ret)
+ return ret;
+
+ ret = spraid_create_sq(hdev, qid, spraidq);
+ if (ret)
+ goto delete_cq;
+
+ spraid_init_queue(spraidq, qid);
+ spraidq->cq_vector = cq_vector;
+
+ ret = pci_request_irq(hdev->pdev, cq_vector, spraid_irq, NULL,
+ spraidq, "spraid%d_q%d", hdev->instance, qid);
+
+ if (ret) {
+ dev_err(hdev->dev, "Request queue[%d] irq failed\n", qid);
+ goto delete_sq;
+ }
+
+ return 0;
+
+delete_sq:
+ spraidq->cq_vector = -1;
+ hdev->online_queues--;
+ spraid_delete_sq(hdev, qid);
+delete_cq:
+ spraid_delete_cq(hdev, qid);
+
+ return ret;
+}
+
+static int spraid_create_io_queues(struct spraid_dev *hdev)
+{
+ u32 i, max;
+ int ret = 0;
+
+ max = min(hdev->max_qid, hdev->queue_count - 1);
+ for (i = hdev->online_queues; i <= max; i++) {
+ ret = spraid_create_queue(&hdev->queues[i], i);
+ if (ret) {
+ dev_err(hdev->dev, "Create queue[%d] failed\n", i);
+ break;
+ }
+ }
+
+ dev_info(hdev->dev, "[%s] queue_count[%d], online_queue[%d]",
+ __func__, hdev->queue_count, hdev->online_queues);
+
+ return ret >= 0 ? 0 : ret;
+}
+
+static int spraid_set_features(struct spraid_dev *hdev, u32 fid,
+ u32 dword11, void *buffer,
+ size_t buflen, u32 *result)
+{
+ struct spraid_admin_command admin_cmd;
+ int ret;
+ u8 *data_ptr = NULL;
+ dma_addr_t data_dma = 0;
+
+ if (buffer && buflen) {
+ data_ptr = dma_alloc_coherent(hdev->dev, buflen,
+ &data_dma, GFP_KERNEL);
+ if (!data_ptr)
+ return -ENOMEM;
+
+ memcpy(data_ptr, buffer, buflen);
+ }
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.features.opcode = SPRAID_ADMIN_SET_FEATURES;
+ admin_cmd.features.fid = cpu_to_le32(fid);
+ admin_cmd.features.dword11 = cpu_to_le32(dword11);
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
+
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, result, NULL, 0);
+
+ if (data_ptr)
+ dma_free_coherent(hdev->dev, buflen, data_ptr, data_dma);
+
+ return ret;
+}
+
+static int spraid_configure_timestamp(struct spraid_dev *hdev)
+{
+ __le64 ts;
+ int ret;
+
+ ts = cpu_to_le64(ktime_to_ms(ktime_get_real()));
+ ret = spraid_set_features(hdev, SPRAID_FEAT_TIMESTAMP,
+ 0, &ts, sizeof(ts), NULL);
+
+ if (ret)
+ dev_err(hdev->dev, "set timestamp failed: %d\n", ret);
+ return ret;
+}
+
+static int spraid_set_queue_cnt(struct spraid_dev *hdev, u32 *cnt)
+{
+ u32 q_cnt = (*cnt - 1) | ((*cnt - 1) << 16);
+ u32 nr_ioqs, result;
+ int status;
+
+ status = spraid_set_features(hdev, SPRAID_FEAT_NUM_QUEUES,
+ q_cnt, NULL, 0, &result);
+ if (status) {
+ dev_err(hdev->dev, "Set queue count failed, status: %d\n",
+ status);
+ return -EIO;
+ }
+
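+	/* The low and high 16-bit halves of result hold two zero-based
+	 * granted queue counts; use the smaller of the two.
+	 */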
+ nr_ioqs = min(result & 0xffff, result >> 16) + 1;
+ *cnt = min(*cnt, nr_ioqs);
+ if (*cnt == 0) {
+ dev_err(hdev->dev, "Illegal queue count: zero\n");
+ return -EIO;
+ }
+ return 0;
+}
+
+static int spraid_setup_io_queues(struct spraid_dev *hdev)
+{
+ struct spraid_queue *adminq = &hdev->queues[0];
+ struct pci_dev *pdev = hdev->pdev;
+ u32 nr_ioqs = num_online_cpus();
+ u32 i, size;
+ int ret;
+
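+	/* Reserve vector 0 for the admin queue when spreading IRQ affinity */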
+ struct irq_affinity affd = {
+ .pre_vectors = 1
+ };
+
+ ret = spraid_set_queue_cnt(hdev, &nr_ioqs);
+ if (ret < 0)
+ return ret;
+
+ size = spraid_bar_size(hdev, nr_ioqs);
+ ret = spraid_remap_bar(hdev, size);
+ if (ret)
+ return -ENOMEM;
+
+ adminq->q_db = hdev->dbs;
+
+ pci_free_irq(pdev, 0, adminq);
+ pci_free_irq_vectors(pdev);
+
+ ret = pci_alloc_irq_vectors_affinity(pdev, 1, (nr_ioqs + 1),
+ PCI_IRQ_ALL_TYPES | PCI_IRQ_AFFINITY, &affd);
+ if (ret <= 0)
+ return -EIO;
+
+ hdev->num_vecs = ret;
+
+ hdev->max_qid = max(ret - 1, 1);
+
+ ret = pci_request_irq(pdev, adminq->cq_vector, spraid_irq, NULL,
+ adminq, "spraid%d_q%d",
+ hdev->instance, adminq->qid);
+ if (ret) {
+ dev_err(hdev->dev, "Request admin irq failed\n");
+ adminq->cq_vector = -1;
+ return ret;
+ }
+
+ for (i = hdev->queue_count; i <= hdev->max_qid; i++) {
+ ret = spraid_alloc_queue(hdev, i, hdev->ioq_depth);
+ if (ret)
+ break;
+ }
+	dev_info(hdev->dev, "[%s] max_qid: %d, queue_count: %d, "
+		 "online_queue: %d, ioq_depth: %d\n",
+ __func__, hdev->max_qid, hdev->queue_count,
+ hdev->online_queues, hdev->ioq_depth);
+
+ return spraid_create_io_queues(hdev);
+}
+
+static void spraid_delete_io_queues(struct spraid_dev *hdev)
+{
+ u16 queues = hdev->online_queues - 1;
+ u8 opcode = SPRAID_ADMIN_DELETE_SQ;
+ u16 i, pass;
+
+ if (!pci_device_is_present(hdev->pdev)) {
+ dev_err(hdev->dev,
+ "pci_device is not present, skip disable io queues\n");
+ return;
+ }
+
+ if (hdev->online_queues < 2) {
+		dev_err(hdev->dev, "[%s] err, io queues have been deleted\n",
+ __func__);
+ return;
+ }
+
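+	/*
+	 * Two passes: delete every SQ first, then every CQ, since a CQ
+	 * must outlive the SQ that posts completions to it.
+	 */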
+ for (pass = 0; pass < 2; pass++) {
+ for (i = queues; i > 0; i--)
+ if (spraid_delete_queue(hdev, opcode, i))
+ break;
+
+ opcode = SPRAID_ADMIN_DELETE_CQ;
+ }
+}
+
+static void spraid_remove_io_queues(struct spraid_dev *hdev)
+{
+ spraid_delete_io_queues(hdev);
+ spraid_free_io_queues(hdev);
+}
+
+static void spraid_pci_disable(struct spraid_dev *hdev)
+{
+ struct pci_dev *pdev = hdev->pdev;
+ u32 i;
+
+ for (i = 0; i < hdev->online_queues; i++)
+ pci_free_irq(pdev, hdev->queues[i].cq_vector, &hdev->queues[i]);
+ pci_free_irq_vectors(pdev);
+ if (pci_is_enabled(pdev)) {
+ pci_disable_pcie_error_reporting(pdev);
+ pci_disable_device(pdev);
+ }
+ hdev->online_queues = 0;
+}
+
+static void spraid_disable_admin_queue(struct spraid_dev *hdev, bool shutdown)
+{
+ struct spraid_queue *adminq = &hdev->queues[0];
+ u16 start, end;
+
+ if (pci_device_is_present(hdev->pdev)) {
+ if (shutdown)
+ spraid_shutdown_ctrl(hdev);
+ else
+ spraid_disable_ctrl(hdev);
+ }
+
+ if (hdev->queue_count == 0) {
+		dev_err(hdev->dev, "[%s] err, admin queue has been deleted\n",
+ __func__);
+ return;
+ }
+
+ spin_lock_irq(&adminq->cq_lock);
+ spraid_process_cq(adminq, &start, &end, -1);
+ spin_unlock_irq(&adminq->cq_lock);
+
+ spraid_complete_cqes(adminq, start, end);
+ spraid_free_admin_queue(hdev);
+}
+
+static int spraid_create_dma_pools(struct spraid_dev *hdev)
+{
+ int i;
+ char poolname[20] = { 0 };
+
+ hdev->prp_page_pool = dma_pool_create("prp list page", hdev->dev,
+ PAGE_SIZE, PAGE_SIZE, 0);
+
+ if (!hdev->prp_page_pool) {
+ dev_err(hdev->dev, "create prp_page_pool failed\n");
+ return -ENOMEM;
+ }
+
+ for (i = 0; i < small_pool_num; i++) {
+ sprintf(poolname, "prp_list_256_%d", i);
+ hdev->prp_small_pool[i] =
+ dma_pool_create(poolname, hdev->dev, SMALL_POOL_SIZE,
+ SMALL_POOL_SIZE, 0);
+
+ if (!hdev->prp_small_pool[i]) {
+ dev_err(hdev->dev, "create prp_small_pool %d failed\n",
+ i);
+ goto destroy_prp_small_pool;
+ }
+ }
+
+ return 0;
+
+destroy_prp_small_pool:
+ while (i > 0)
+ dma_pool_destroy(hdev->prp_small_pool[--i]);
+ dma_pool_destroy(hdev->prp_page_pool);
+
+ return -ENOMEM;
+}
+
+static void spraid_destroy_dma_pools(struct spraid_dev *hdev)
+{
+ int i;
+
+ for (i = 0; i < small_pool_num; i++)
+ dma_pool_destroy(hdev->prp_small_pool[i]);
+ dma_pool_destroy(hdev->prp_page_pool);
+}
+
+static int spraid_get_dev_list(struct spraid_dev *hdev,
+ struct spraid_dev_info *devices)
+{
+ u32 nd = le32_to_cpu(hdev->ctrl_info->nd);
+ struct spraid_admin_command admin_cmd;
+ struct spraid_dev_list *list_buf;
+ dma_addr_t data_dma = 0;
+ u32 i, idx, hdid, ndev;
+ int ret = 0;
+
+ list_buf = dma_alloc_coherent(hdev->dev, PAGE_SIZE,
+ &data_dma, GFP_KERNEL);
+ if (!list_buf)
+ return -ENOMEM;
+
+ for (idx = 0; idx < nd;) {
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.get_info.opcode = SPRAID_ADMIN_GET_INFO;
+ admin_cmd.get_info.type = SPRAID_GET_INFO_DEV_LIST;
+ admin_cmd.get_info.cdw11 = cpu_to_le32(idx);
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
+
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd,
+ NULL, NULL, 0);
+
+ if (ret) {
+			dev_err(hdev->dev, "Get device list failed, nd: %u, "
+				"idx: %u, ret: %d\n",
+ nd, idx, ret);
+ goto out;
+ }
+ ndev = le32_to_cpu(list_buf->dev_num);
+
+ dev_info(hdev->dev, "ndev numbers: %u\n", ndev);
+
+ for (i = 0; i < ndev; i++) {
+ hdid = le32_to_cpu(list_buf->devices[i].hdid);
+			dev_info(hdev->dev, "list_buf->devices[%d], hdid: %u, "
+				 "target: %d, channel: %d, lun: %d, attr[0x%x]\n",
+ i, hdid,
+ le16_to_cpu(list_buf->devices[i].target),
+ list_buf->devices[i].channel,
+ list_buf->devices[i].lun,
+ list_buf->devices[i].attr);
+ if (hdid > nd || hdid == 0) {
+ dev_err(hdev->dev, "err, hdid[%d] invalid\n",
+ hdid);
+ continue;
+ }
+ memcpy(&devices[hdid - 1], &list_buf->devices[i],
+ sizeof(struct spraid_dev_info));
+ }
+ idx += ndev;
+
+ if (idx < MAX_DEV_ENTRY_PER_PAGE_4K)
+ break;
+ }
+
+out:
+ dma_free_coherent(hdev->dev, PAGE_SIZE, list_buf, data_dma);
+ return ret;
+}
+
+static void spraid_send_aen(struct spraid_dev *hdev, u16 cid)
+{
+ struct spraid_queue *adminq = &hdev->queues[0];
+ struct spraid_admin_command admin_cmd;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.common.opcode = SPRAID_ADMIN_ASYNC_EVENT;
+ admin_cmd.common.command_id = cid;
+
+ spraid_submit_cmd(adminq, &admin_cmd);
+ dev_info(hdev->dev, "send aen, cid[%d]\n", cid);
+}
+
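+/*
+ * AEN commands use ids SPRAID_AQ_BLK_MQ_DEPTH .. SPRAID_AQ_BLK_MQ_DEPTH +
+ * aerl - 1, above the admin blk-mq depth, so their completions are routed
+ * to spraid_complete_aen() instead of the normal admin path.
+ */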
+static inline void spraid_send_all_aen(struct spraid_dev *hdev)
+{
+ u16 i;
+
+ for (i = 0; i < hdev->ctrl_info->aerl; i++)
+ spraid_send_aen(hdev, i + SPRAID_AQ_BLK_MQ_DEPTH);
+}
+
+static int spraid_add_device(struct spraid_dev *hdev,
+ struct spraid_dev_info *device)
+{
+ struct Scsi_Host *shost = hdev->shost;
+ struct scsi_device *sdev;
+
+	dev_info(hdev->dev, "add device, hdid: %u, target: %d, channel: %d, "
+		 "lun: %d, attr[0x%x]\n",
+ le32_to_cpu(device->hdid), le16_to_cpu(device->target),
+ device->channel, device->lun, device->attr);
+
+ sdev = scsi_device_lookup(shost, device->channel,
+ le16_to_cpu(device->target), 0);
+ if (sdev) {
+		dev_warn(hdev->dev, "Device already exists, channel: %d, "
+			 "target_id: %d, lun: %d\n",
+ device->channel, le16_to_cpu(device->target), 0);
+ scsi_device_put(sdev);
+ return -EEXIST;
+ }
+ scsi_add_device(shost, device->channel, le16_to_cpu(device->target), 0);
+ return 0;
+}
+
+static int spraid_rescan_device(struct spraid_dev *hdev,
+ struct spraid_dev_info *device)
+{
+ struct Scsi_Host *shost = hdev->shost;
+ struct scsi_device *sdev;
+
+	dev_info(hdev->dev, "rescan device, hdid: %u, target: %d, channel: %d, "
+		 "lun: %d, attr[0x%x]\n",
+ le32_to_cpu(device->hdid), le16_to_cpu(device->target),
+ device->channel, device->lun, device->attr);
+
+ sdev = scsi_device_lookup(shost, device->channel,
+ le16_to_cpu(device->target), 0);
+ if (!sdev) {
+		dev_warn(hdev->dev, "device does not exist, skip rescan, "
+			 "channel: %d, target_id: %d, lun: %d\n",
+ device->channel, le16_to_cpu(device->target), 0);
+ return -ENODEV;
+ }
+
+ scsi_rescan_device(&sdev->sdev_gendev);
+ scsi_device_put(sdev);
+ return 0;
+}
+
+static int spraid_remove_device(struct spraid_dev *hdev,
+ struct spraid_dev_info *org_device)
+{
+ struct Scsi_Host *shost = hdev->shost;
+ struct scsi_device *sdev;
+
+	dev_info(hdev->dev, "remove device, hdid: %u, target: %d, channel: %d, "
+		 "lun: %d, attr[0x%x]\n",
+ le32_to_cpu(org_device->hdid), le16_to_cpu(org_device->target),
+ org_device->channel, org_device->lun, org_device->attr);
+
+ sdev = scsi_device_lookup(shost, org_device->channel,
+ le16_to_cpu(org_device->target), 0);
+ if (!sdev) {
+		dev_warn(hdev->dev, "device does not exist, skip remove, "
+			 "channel: %d, target_id: %d, lun: %d\n",
+ org_device->channel,
+ le16_to_cpu(org_device->target), 0);
+ return -ENODEV;
+ }
+
+ scsi_remove_device(sdev);
+ scsi_device_put(sdev);
+ return 0;
+}
+
+static int spraid_dev_list_init(struct spraid_dev *hdev)
+{
+ u32 nd = le32_to_cpu(hdev->ctrl_info->nd);
+ int i, ret;
+
+ hdev->devices = kzalloc_node(nd * sizeof(struct spraid_dev_info),
+ GFP_KERNEL, hdev->numa_node);
+ if (!hdev->devices)
+ return -ENOMEM;
+
+ ret = spraid_get_dev_list(hdev, hdev->devices);
+ if (ret) {
+ dev_err(hdev->dev,
+			"Ignore failure of getting device list within initialization\n");
+ return 0;
+ }
+
+ for (i = 0; i < nd; i++) {
+ if (SPRAID_DEV_INFO_FLAG_VALID(hdev->devices[i].flag) &&
+ SPRAID_DEV_INFO_ATTR_BOOT(hdev->devices[i].attr)) {
+ spraid_add_device(hdev, &hdev->devices[i]);
+ break;
+ }
+ }
+ return 0;
+}
+
+static int luntarget_cmp_func(const void *l, const void *r)
+{
+ const struct spraid_dev_info *ln = l;
+ const struct spraid_dev_info *rn = r;
+
+ if (ln->channel == rn->channel)
+ return le16_to_cpu(ln->target) - le16_to_cpu(rn->target);
+
+ return ln->channel - rn->channel;
+}
+
+static void spraid_scan_work(struct work_struct *work)
+{
+ struct spraid_dev *hdev =
+ container_of(work, struct spraid_dev, scan_work);
+ struct spraid_dev_info *devices, *org_devices;
+ struct spraid_dev_info *sortdevice;
+ u32 nd = le32_to_cpu(hdev->ctrl_info->nd);
+ u8 flag, org_flag;
+ int i, ret;
+ int count = 0;
+
+ devices = kcalloc(nd, sizeof(struct spraid_dev_info), GFP_KERNEL);
+ if (!devices)
+ return;
+
+ sortdevice = kcalloc(nd, sizeof(struct spraid_dev_info), GFP_KERNEL);
+ if (!sortdevice)
+ goto free_list;
+
+ ret = spraid_get_dev_list(hdev, devices);
+ if (ret)
+ goto free_all;
+ org_devices = hdev->devices;
+ for (i = 0; i < nd; i++) {
+ org_flag = org_devices[i].flag;
+ flag = devices[i].flag;
+
+ dev_log_dbg(hdev->dev, "i: %d, org_flag: 0x%x, flag: 0x%x\n",
+ i, org_flag, flag);
+
+ if (SPRAID_DEV_INFO_FLAG_VALID(flag)) {
+ if (!SPRAID_DEV_INFO_FLAG_VALID(org_flag)) {
+ down_write(&hdev->devices_rwsem);
+ memcpy(&org_devices[i], &devices[i],
+ sizeof(struct spraid_dev_info));
+ memcpy(&sortdevice[count++], &devices[i],
+ sizeof(struct spraid_dev_info));
+ up_write(&hdev->devices_rwsem);
+ } else if (SPRAID_DEV_INFO_FLAG_CHANGE(flag)) {
+ spraid_rescan_device(hdev, &devices[i]);
+ }
+ } else {
+ if (SPRAID_DEV_INFO_FLAG_VALID(org_flag)) {
+ down_write(&hdev->devices_rwsem);
+ org_devices[i].flag &= 0xfe;
+ up_write(&hdev->devices_rwsem);
+ spraid_remove_device(hdev, &org_devices[i]);
+ }
+ }
+ }
+
+ dev_info(hdev->dev, "scan work add device count = %d\n", count);
+
+ sort(sortdevice, count, sizeof(sortdevice[0]),
+ luntarget_cmp_func, NULL);
+
+ for (i = 0; i < count; i++)
+ spraid_add_device(hdev, &sortdevice[i]);
+
+free_all:
+ kfree(sortdevice);
+free_list:
+ kfree(devices);
+}
+
+static void spraid_timesyn_work(struct work_struct *work)
+{
+ struct spraid_dev *hdev =
+ container_of(work, struct spraid_dev, timesyn_work);
+
+ spraid_configure_timestamp(hdev);
+}
+
+static int spraid_init_ctrl_info(struct spraid_dev *hdev);
+static void spraid_fw_act_work(struct work_struct *work)
+{
+ struct spraid_dev *hdev =
+ container_of(work, struct spraid_dev, fw_act_work);
+
+ if (spraid_init_ctrl_info(hdev))
+ dev_err(hdev->dev, "get ctrl info failed after fw act\n");
+}
+
+static void spraid_queue_scan(struct spraid_dev *hdev)
+{
+ queue_work(spraid_wq, &hdev->scan_work);
+}
+
+static void spraid_handle_aen_notice(struct spraid_dev *hdev, u32 result)
+{
+ switch ((result & 0xff00) >> 8) {
+ case SPRAID_AEN_DEV_CHANGED:
+ spraid_queue_scan(hdev);
+ break;
+ case SPRAID_AEN_FW_ACT_START:
+ dev_info(hdev->dev, "fw activation starting\n");
+ break;
+ case SPRAID_AEN_HOST_PROBING:
+ break;
+ default:
+ dev_warn(hdev->dev, "async event result %08x\n", result);
+ }
+}
+
+static void spraid_handle_aen_vs(struct spraid_dev *hdev,
+ u32 result, u32 result1)
+{
+ switch ((result & 0xff00) >> 8) {
+ case SPRAID_AEN_TIMESYN:
+ queue_work(spraid_wq, &hdev->timesyn_work);
+ break;
+ case SPRAID_AEN_FW_ACT_FINISH:
+ dev_info(hdev->dev, "fw activation finish\n");
+ queue_work(spraid_wq, &hdev->fw_act_work);
+ break;
+ case SPRAID_AEN_EVENT_MIN ... SPRAID_AEN_EVENT_MAX:
+		dev_info(hdev->dev, "rcv card event[%d], "
+			 "param1[0x%x] param2[0x%x]\n",
+ (result & 0xff00) >> 8, result, result1);
+ break;
+ default:
+ dev_warn(hdev->dev, "async event result: 0x%x\n", result);
+ }
+}
+
+static int spraid_alloc_resources(struct spraid_dev *hdev)
+{
+ int ret, nqueue;
+
+ ret = ida_alloc(&spraid_instance_ida, GFP_KERNEL);
+ if (ret < 0) {
+ dev_err(hdev->dev, "Get instance id failed\n");
+ return ret;
+ }
+ hdev->instance = ret;
+
+ hdev->ctrl_info = kzalloc_node(sizeof(*hdev->ctrl_info),
+ GFP_KERNEL, hdev->numa_node);
+ if (!hdev->ctrl_info) {
+ ret = -ENOMEM;
+ goto release_instance;
+ }
+
+ ret = spraid_create_dma_pools(hdev);
+ if (ret)
+ goto free_ctrl_info;
+ nqueue = num_possible_cpus() + 1;
+ hdev->queues = kcalloc_node(nqueue, sizeof(struct spraid_queue),
+ GFP_KERNEL, hdev->numa_node);
+ if (!hdev->queues) {
+ ret = -ENOMEM;
+ goto destroy_dma_pools;
+ }
+
+ ret = spraid_alloc_admin_cmds(hdev);
+ if (ret)
+ goto free_queues;
+
+ dev_info(hdev->dev, "[%s] queues num: %d\n", __func__, nqueue);
+
+ return 0;
+
+free_queues:
+ kfree(hdev->queues);
+destroy_dma_pools:
+ spraid_destroy_dma_pools(hdev);
+free_ctrl_info:
+ kfree(hdev->ctrl_info);
+release_instance:
+ ida_free(&spraid_instance_ida, hdev->instance);
+ return ret;
+}
+
+static void spraid_free_resources(struct spraid_dev *hdev)
+{
+ spraid_free_admin_cmds(hdev);
+ kfree(hdev->queues);
+ spraid_destroy_dma_pools(hdev);
+ kfree(hdev->ctrl_info);
+ ida_free(&spraid_instance_ida, hdev->instance);
+}
+
+static void spraid_bsg_unmap_data(struct spraid_dev *hdev, struct bsg_job *job)
+{
+ struct request *rq = blk_mq_rq_from_pdu(job);
+ struct spraid_iod *iod = job->dd_data;
+ enum dma_data_direction dma_dir =
+ rq_data_dir(rq) ? DMA_TO_DEVICE : DMA_FROM_DEVICE;
+
+ if (iod->nsge)
+ dma_unmap_sg(hdev->dev, iod->sg, iod->nsge, dma_dir);
+
+ spraid_free_iod_res(hdev, iod);
+}
+
+static int spraid_bsg_map_data(struct spraid_dev *hdev, struct bsg_job *job,
+ struct spraid_admin_command *cmd)
+{
+ struct request *rq = blk_mq_rq_from_pdu(job);
+ struct spraid_iod *iod = job->dd_data;
+ enum dma_data_direction dma_dir =
+ rq_data_dir(rq) ? DMA_TO_DEVICE : DMA_FROM_DEVICE;
+ int ret = 0;
+
+ iod->sg = job->request_payload.sg_list;
+ iod->nsge = job->request_payload.sg_cnt;
+ iod->length = job->request_payload.payload_len;
+ iod->use_sgl = false;
+ iod->npages = -1;
+ iod->sg_drv_mgmt = false;
+
+ if (!iod->nsge)
+ goto out;
+
+ ret = dma_map_sg_attrs(hdev->dev, iod->sg, iod->nsge,
+ dma_dir, DMA_ATTR_NO_WARN);
+ if (!ret)
+ goto out;
+
+ ret = spraid_setup_prps(hdev, iod);
+ if (ret)
+ goto unmap;
+
+ cmd->common.dptr.prp1 = cpu_to_le64(sg_dma_address(iod->sg));
+ cmd->common.dptr.prp2 = cpu_to_le64(iod->first_dma);
+
+ return 0;
+
+unmap:
+ dma_unmap_sg(hdev->dev, iod->sg, iod->nsge, dma_dir);
+out:
+ return ret;
+}
+
+static int spraid_get_ctrl_info(struct spraid_dev *hdev,
+ struct spraid_ctrl_info *ctrl_info)
+{
+ struct spraid_admin_command admin_cmd;
+ u8 *data_ptr = NULL;
+ dma_addr_t data_dma = 0;
+ int ret;
+
+ data_ptr = dma_alloc_coherent(hdev->dev, PAGE_SIZE,
+ &data_dma, GFP_KERNEL);
+ if (!data_ptr)
+ return -ENOMEM;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.get_info.opcode = SPRAID_ADMIN_GET_INFO;
+ admin_cmd.get_info.type = SPRAID_GET_INFO_CTRL;
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
+
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+ if (!ret)
+ memcpy(ctrl_info, data_ptr, sizeof(struct spraid_ctrl_info));
+
+ dma_free_coherent(hdev->dev, PAGE_SIZE, data_ptr, data_dma);
+
+ return ret;
+}
+
+static int spraid_init_ctrl_info(struct spraid_dev *hdev)
+{
+ int ret;
+
+ hdev->ctrl_info->nd = cpu_to_le32(240);
+ hdev->ctrl_info->mdts = 8;
+ hdev->ctrl_info->max_cmds = cpu_to_le16(4096);
+ hdev->ctrl_info->max_num_sge = cpu_to_le16(128);
+ hdev->ctrl_info->max_channel = cpu_to_le16(4);
+ hdev->ctrl_info->max_tgt_id = cpu_to_le32(3239);
+ hdev->ctrl_info->max_lun = cpu_to_le16(2);
+
+ ret = spraid_get_ctrl_info(hdev, hdev->ctrl_info);
+ if (ret)
+ dev_err(hdev->dev, "get controller info failed: %d\n", ret);
+
+ dev_info(hdev->dev, "[%s]nd = %d\n", __func__, hdev->ctrl_info->nd);
+ dev_info(hdev->dev, "[%s]max_cmd = %d\n",
+ __func__, hdev->ctrl_info->max_cmds);
+ dev_info(hdev->dev, "[%s]max_channel = %d\n",
+ __func__, hdev->ctrl_info->max_channel);
+ dev_info(hdev->dev, "[%s]max_tgt_id = %d\n",
+ __func__, hdev->ctrl_info->max_tgt_id);
+ dev_info(hdev->dev, "[%s]max_lun = %d\n",
+ __func__, hdev->ctrl_info->max_lun);
+ dev_info(hdev->dev, "[%s]max_num_sge = %d\n",
+ __func__, hdev->ctrl_info->max_num_sge);
+ dev_info(hdev->dev, "[%s]lun_num_boot = %d\n",
+ __func__, hdev->ctrl_info->lun_num_in_boot);
+ dev_info(hdev->dev, "[%s]mdts = %d\n", __func__, hdev->ctrl_info->mdts);
+ dev_info(hdev->dev, "[%s]acl = %d\n", __func__, hdev->ctrl_info->acl);
+	dev_info(hdev->dev, "[%s]aerl = %d\n", __func__, hdev->ctrl_info->aerl);
+ dev_info(hdev->dev, "[%s]card_type = %d\n",
+ __func__, hdev->ctrl_info->card_type);
+ dev_info(hdev->dev, "[%s]rtd3e = %d\n",
+ __func__, hdev->ctrl_info->rtd3e);
+ dev_info(hdev->dev, "[%s]sn = %s\n", __func__, hdev->ctrl_info->sn);
+ dev_info(hdev->dev, "[%s]fr = %s\n", __func__, hdev->ctrl_info->fr);
+
+ if (!hdev->ctrl_info->aerl)
+ hdev->ctrl_info->aerl = 1;
+ if (hdev->ctrl_info->aerl > SPRAID_NR_AEN_COMMANDS)
+ hdev->ctrl_info->aerl = SPRAID_NR_AEN_COMMANDS;
+
+ return 0;
+}
+
+#define SPRAID_MAX_ADMIN_PAYLOAD_SIZE BIT(16)
+static int spraid_alloc_iod_ext_mem_pool(struct spraid_dev *hdev)
+{
+ u16 max_sge = le16_to_cpu(hdev->ctrl_info->max_num_sge);
+ size_t alloc_size;
+
+ alloc_size = spraid_iod_ext_size(hdev, SPRAID_MAX_ADMIN_PAYLOAD_SIZE,
+ max_sge, true, false);
+ if (alloc_size > PAGE_SIZE)
+		dev_warn(hdev->dev,
+			 "sg allocation larger than one page is unreasonable\n");
+ hdev->iod_mempool = mempool_create_node(1, mempool_kmalloc,
+ mempool_kfree,
+ (void *)alloc_size, GFP_KERNEL,
+ hdev->numa_node);
+ if (!hdev->iod_mempool) {
+ dev_err(hdev->dev, "Create iod extension memory pool failed\n");
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
+static void spraid_free_iod_ext_mem_pool(struct spraid_dev *hdev)
+{
+ mempool_destroy(hdev->iod_mempool);
+}
+
+static int spraid_user_admin_cmd(struct spraid_dev *hdev, struct bsg_job *job)
+{
+ struct spraid_bsg_request *bsg_req = job->request;
+ struct spraid_passthru_common_cmd *cmd = &(bsg_req->admcmd);
+ struct spraid_admin_command admin_cmd;
+ u32 timeout = msecs_to_jiffies(cmd->timeout_ms);
+ u32 result[2] = {0};
+ int status;
+
+ if (hdev->state >= SPRAID_RESETTING) {
+ dev_err(hdev->dev, "[%s] err, host state:[%d] is not right\n",
+ __func__, hdev->state);
+ return -EBUSY;
+ }
+
+ dev_info(hdev->dev, "[%s] opcode[0x%x] subopcode[0x%x] init\n",
+ __func__, cmd->opcode, cmd->info_0.subopcode);
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.common.opcode = cmd->opcode;
+ admin_cmd.common.flags = cmd->flags;
+ admin_cmd.common.hdid = cpu_to_le32(cmd->nsid);
+ admin_cmd.common.cdw2[0] = cpu_to_le32(cmd->cdw2);
+ admin_cmd.common.cdw2[1] = cpu_to_le32(cmd->cdw3);
+ admin_cmd.common.cdw10 = cpu_to_le32(cmd->cdw10);
+ admin_cmd.common.cdw11 = cpu_to_le32(cmd->cdw11);
+ admin_cmd.common.cdw12 = cpu_to_le32(cmd->cdw12);
+ admin_cmd.common.cdw13 = cpu_to_le32(cmd->cdw13);
+ admin_cmd.common.cdw14 = cpu_to_le32(cmd->cdw14);
+ admin_cmd.common.cdw15 = cpu_to_le32(cmd->cdw15);
+
+ status = spraid_bsg_map_data(hdev, job, &admin_cmd);
+ if (status) {
+ dev_err(hdev->dev, "[%s] err, map data failed\n", __func__);
+ return status;
+ }
+
+ status = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, &result[0],
+ &result[1], timeout);
+ if (status >= 0) {
+ job->reply_len = sizeof(result);
+ memcpy(job->reply, result, sizeof(result));
+ }
+
+	dev_info(hdev->dev, "[%s] opcode[0x%x] subopcode[0x%x], "
+		 "status[0x%x] result0[0x%x] result1[0x%x]\n",
+ __func__, cmd->opcode, cmd->info_0.subopcode,
+ status, result[0], result[1]);
+
+ spraid_bsg_unmap_data(hdev, job);
+
+ return status;
+}
+
+static int spraid_alloc_ioq_ptcmds(struct spraid_dev *hdev)
+{
+ int i;
+ int ptnum = SPRAID_NR_IOQ_PTCMDS;
+
+ INIT_LIST_HEAD(&hdev->ioq_pt_list);
+ spin_lock_init(&hdev->ioq_pt_lock);
+
+ hdev->ioq_ptcmds = kcalloc_node(ptnum, sizeof(struct spraid_cmd),
+ GFP_KERNEL, hdev->numa_node);
+
+ if (!hdev->ioq_ptcmds) {
+ dev_err(hdev->dev, "Alloc ioq_ptcmds failed\n");
+ return -ENOMEM;
+ }
+
+ for (i = 0; i < ptnum; i++) {
+ hdev->ioq_ptcmds[i].qid = i / SPRAID_PTCMDS_PERQ + 1;
+ hdev->ioq_ptcmds[i].cid = i % SPRAID_PTCMDS_PERQ
+ + SPRAID_IO_BLK_MQ_DEPTH;
+ list_add_tail(&(hdev->ioq_ptcmds[i].list), &hdev->ioq_pt_list);
+ }
+
+ dev_info(hdev->dev, "Alloc ioq_ptcmds success, ptnum[%d]\n", ptnum);
+
+ return 0;
+}
+
+static void spraid_free_ioq_ptcmds(struct spraid_dev *hdev)
+{
+ kfree(hdev->ioq_ptcmds);
+ hdev->ioq_ptcmds = NULL;
+
+ INIT_LIST_HEAD(&hdev->ioq_pt_list);
+}
+
+static int spraid_submit_ioq_sync_cmd(struct spraid_dev *hdev,
+ struct spraid_ioq_command *cmd,
+ u32 *result, u32 *reslen, u32 timeout)
+{
+ int ret;
+ dma_addr_t sense_dma;
+ struct spraid_queue *ioq;
+ void *sense_addr = NULL;
+ struct spraid_cmd *pt_cmd = spraid_get_cmd(hdev, SPRAID_CMD_IOPT);
+
+ if (!pt_cmd) {
+ dev_err(hdev->dev, "err, get ioq cmd failed\n");
+ return -EFAULT;
+ }
+
+ timeout = timeout ? timeout : ADMIN_TIMEOUT;
+
+ init_completion(&pt_cmd->cmd_done);
+
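+	/* Each cid owns a fixed SCSI_SENSE_BUFFERSIZE slice of the
+	 * queue's sense buffer.
+	 */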
+ ioq = &hdev->queues[pt_cmd->qid];
+ ret = pt_cmd->cid * SCSI_SENSE_BUFFERSIZE;
+ sense_addr = ioq->sense + ret;
+ sense_dma = ioq->sense_dma_addr + ret;
+
+ cmd->common.sense_addr = cpu_to_le64(sense_dma);
+ cmd->common.sense_len = cpu_to_le16(SCSI_SENSE_BUFFERSIZE);
+ cmd->common.command_id = pt_cmd->cid;
+
+ spraid_submit_cmd(ioq, cmd);
+
+ if (!wait_for_completion_timeout(&pt_cmd->cmd_done, timeout)) {
+		dev_err(hdev->dev, "[%s] cid[%d] qid[%d] timeout, "
+			"opcode[0x%x] subopcode[0x%x]\n",
+ __func__, pt_cmd->cid, pt_cmd->qid, cmd->common.opcode,
+ (le32_to_cpu(cmd->common.cdw3[0]) & 0xffff));
+ WRITE_ONCE(pt_cmd->state, SPRAID_CMD_TIMEOUT);
+ spraid_put_cmd(hdev, pt_cmd, SPRAID_CMD_IOPT);
+ return -EINVAL;
+ }
+
+ if (result && reslen) {
+ if ((pt_cmd->status & 0x17f) == 0x101) {
+ memcpy(result, sense_addr, SCSI_SENSE_BUFFERSIZE);
+ *reslen = SCSI_SENSE_BUFFERSIZE;
+ }
+ }
+
+ spraid_put_cmd(hdev, pt_cmd, SPRAID_CMD_IOPT);
+
+ return pt_cmd->status;
+}
+
+static int spraid_user_ioq_cmd(struct spraid_dev *hdev, struct bsg_job *job)
+{
+ struct spraid_bsg_request *bsg_req =
+ (struct spraid_bsg_request *)(job->request);
+ struct spraid_ioq_passthru_cmd *cmd = &(bsg_req->ioqcmd);
+ struct spraid_ioq_command ioq_cmd;
+ int status = 0;
+ u32 timeout = msecs_to_jiffies(cmd->timeout_ms);
+
+ if (cmd->data_len > PAGE_SIZE) {
+		dev_err(hdev->dev, "[%s] data len larger than 4k\n", __func__);
+ return -EFAULT;
+ }
+
+ if (hdev->state != SPRAID_LIVE) {
+ dev_err(hdev->dev, "[%s] err, host state:[%d] is not live\n",
+ __func__, hdev->state);
+ return -EBUSY;
+ }
+
+	dev_info(hdev->dev, "[%s] opcode[0x%x] subopcode[0x%x] init, "
+		 "datalen[%d]\n",
+ __func__, cmd->opcode, cmd->info_1.subopcode, cmd->data_len);
+
+ memset(&ioq_cmd, 0, sizeof(ioq_cmd));
+ ioq_cmd.common.opcode = cmd->opcode;
+ ioq_cmd.common.flags = cmd->flags;
+ ioq_cmd.common.hdid = cpu_to_le32(cmd->nsid);
+ ioq_cmd.common.sense_len = cpu_to_le16(cmd->info_0.res_sense_len);
+ ioq_cmd.common.cdb_len = cmd->info_0.cdb_len;
+ ioq_cmd.common.rsvd2 = cmd->info_0.rsvd0;
+ ioq_cmd.common.cdw3[0] = cpu_to_le32(cmd->cdw3);
+ ioq_cmd.common.cdw3[1] = cpu_to_le32(cmd->cdw4);
+ ioq_cmd.common.cdw3[2] = cpu_to_le32(cmd->cdw5);
+
+ ioq_cmd.common.cdw10[0] = cpu_to_le32(cmd->cdw10);
+ ioq_cmd.common.cdw10[1] = cpu_to_le32(cmd->cdw11);
+ ioq_cmd.common.cdw10[2] = cpu_to_le32(cmd->cdw12);
+ ioq_cmd.common.cdw10[3] = cpu_to_le32(cmd->cdw13);
+ ioq_cmd.common.cdw10[4] = cpu_to_le32(cmd->cdw14);
+ ioq_cmd.common.cdw10[5] = cpu_to_le32(cmd->data_len);
+
+ memcpy(ioq_cmd.common.cdb, &cmd->cdw16, cmd->info_0.cdb_len);
+
+ ioq_cmd.common.cdw26[0] = cpu_to_le32(cmd->cdw26[0]);
+ ioq_cmd.common.cdw26[1] = cpu_to_le32(cmd->cdw26[1]);
+ ioq_cmd.common.cdw26[2] = cpu_to_le32(cmd->cdw26[2]);
+ ioq_cmd.common.cdw26[3] = cpu_to_le32(cmd->cdw26[3]);
+
+ status = spraid_bsg_map_data(hdev, job,
+ (struct spraid_admin_command *)&ioq_cmd);
+ if (status) {
+ dev_err(hdev->dev, "[%s] err, map data failed\n", __func__);
+ return status;
+ }
+
+ status = spraid_submit_ioq_sync_cmd(hdev, &ioq_cmd, job->reply,
+ &job->reply_len, timeout);
+
+	dev_info(hdev->dev, "[%s] opcode[0x%x] subopcode[0x%x], status[0x%x], "
+		 "reply_len[%d]\n",
+ __func__, cmd->opcode, cmd->info_1.subopcode,
+ status, job->reply_len);
+
+ spraid_bsg_unmap_data(hdev, job);
+
+ return status;
+}
+
+static bool spraid_check_scmd_completed(struct scsi_cmnd *scmd)
+{
+ struct spraid_dev *hdev = shost_priv(scmd->device->host);
+ struct spraid_iod *iod = scsi_cmd_priv(scmd);
+ struct spraid_queue *spraidq;
+ u16 hwq, cid;
+
+ spraid_get_tag_from_scmd(scmd, &hwq, &cid);
+ spraidq = &hdev->queues[hwq];
+ if (READ_ONCE(iod->state) == SPRAID_CMD_COMPLETE
+ || spraid_poll_cq(spraidq, cid)) {
+ dev_warn(hdev->dev, "cid[%d] qid[%d] has been completed\n",
+ cid, spraidq->qid);
+ return true;
+ }
+ return false;
+}
+
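+/*
+ * eh_timed_out handler: if the command is still in flight past its
+ * deadline, mark it SPRAID_CMD_TIMEOUT and hand it to the error
+ * handler (BLK_EH_DONE); otherwise keep the block-layer timer running.
+ */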
+static enum blk_eh_timer_return spraid_scmd_timeout(struct scsi_cmnd *scmd)
+{
+ struct spraid_iod *iod = scsi_cmd_priv(scmd);
+ unsigned int timeout = scmd->device->request_queue->rq_timeout;
+
+ if (spraid_check_scmd_completed(scmd))
+ goto out;
+
+ if (time_after(jiffies, scmd->jiffies_at_alloc + timeout)) {
+ if (cmpxchg(&iod->state,
+ SPRAID_CMD_IN_FLIGHT,
+ SPRAID_CMD_TIMEOUT) == SPRAID_CMD_IN_FLIGHT) {
+ return BLK_EH_DONE;
+ }
+ }
+out:
+ return BLK_EH_RESET_TIMER;
+}
+
+/* send abort command through the admin queue for now */
+static int spraid_send_abort_cmd(struct spraid_dev *hdev,
+ u32 hdid, u16 qid, u16 cid)
+{
+ struct spraid_admin_command admin_cmd;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.abort.opcode = SPRAID_ADMIN_ABORT_CMD;
+ admin_cmd.abort.hdid = cpu_to_le32(hdid);
+ admin_cmd.abort.sqid = cpu_to_le16(qid);
+ admin_cmd.abort.cid = cpu_to_le16(cid);
+
+ return spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+}
+
+/* send reset command through the admin queue for now */
+static int spraid_send_reset_cmd(struct spraid_dev *hdev, int type, u32 hdid)
+{
+ struct spraid_admin_command admin_cmd;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.reset.opcode = SPRAID_ADMIN_RESET;
+ admin_cmd.reset.hdid = cpu_to_le32(hdid);
+ admin_cmd.reset.type = type;
+
+ return spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+}
+
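+/*
+ * Controller state machine: only the transitions enumerated below
+ * (e.g. NEW/RESETTING -> LIVE, LIVE -> RESETTING) are accepted;
+ * returns true when the state was actually changed.
+ */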
+static bool spraid_change_host_state(struct spraid_dev *hdev,
+ enum spraid_state newstate)
+{
+ unsigned long flags;
+ enum spraid_state oldstate;
+ bool change = false;
+
+ spin_lock_irqsave(&hdev->state_lock, flags);
+
+ oldstate = hdev->state;
+ switch (newstate) {
+ case SPRAID_LIVE:
+ switch (oldstate) {
+ case SPRAID_NEW:
+ case SPRAID_RESETTING:
+ change = true;
+ break;
+ default:
+ break;
+ }
+ break;
+ case SPRAID_RESETTING:
+ switch (oldstate) {
+ case SPRAID_LIVE:
+ change = true;
+ break;
+ default:
+ break;
+ }
+ break;
+ case SPRAID_DELETING:
+ if (oldstate != SPRAID_DELETING)
+ change = true;
+ break;
+ case SPRAID_DEAD:
+ switch (oldstate) {
+ case SPRAID_NEW:
+ case SPRAID_LIVE:
+ case SPRAID_RESETTING:
+ change = true;
+ break;
+ default:
+ break;
+ }
+ break;
+ default:
+ break;
+ }
+ if (change)
+ hdev->state = newstate;
+ spin_unlock_irqrestore(&hdev->state_lock, flags);
+
+ dev_info(hdev->dev, "[%s][%d]->[%d], change[%d]\n",
+ __func__, oldstate, newstate, change);
+
+ return change;
+}
+
+static void spraid_back_fault_cqe(struct spraid_queue *ioq,
+ struct spraid_completion *cqe)
+{
+ struct spraid_dev *hdev = ioq->hdev;
+ struct blk_mq_tags *tags;
+ struct scsi_cmnd *scmd;
+ struct spraid_iod *iod;
+ struct request *req;
+
+ tags = hdev->shost->tag_set.tags[ioq->qid - 1];
+ req = blk_mq_tag_to_rq(tags, cqe->cmd_id);
+ if (unlikely(!req || !blk_mq_request_started(req)))
+ return;
+
+ scmd = blk_mq_rq_to_pdu(req);
+ iod = scsi_cmd_priv(scmd);
+
+ set_host_byte(scmd, DID_NO_CONNECT);
+ if (iod->nsge)
+ scsi_dma_unmap(scmd);
+ spraid_free_iod_res(hdev, iod);
+ scmd->scsi_done(scmd);
+ dev_warn(hdev->dev, "Back fault CQE, cid[%d] qid[%d]\n",
+ cqe->cmd_id, ioq->qid);
+}
+
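+/*
+ * Fail back all potentially outstanding I/O: fake a completion entry
+ * for every tag on every hw queue so the requests are finished with
+ * DID_NO_CONNECT.
+ */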
+static void spraid_back_all_io(struct spraid_dev *hdev)
+{
+ int i, j;
+ struct spraid_queue *ioq;
+ struct spraid_completion cqe = { 0 };
+
+ scsi_block_requests(hdev->shost);
+
+ for (i = 1; i <= hdev->shost->nr_hw_queues; i++) {
+ ioq = &hdev->queues[i];
+ for (j = 0; j < hdev->shost->can_queue; j++) {
+ cqe.cmd_id = j;
+ spraid_back_fault_cqe(ioq, &cqe);
+ }
+ }
+
+ scsi_unblock_requests(hdev->shost);
+}
+
+static void spraid_dev_disable(struct spraid_dev *hdev, bool shutdown)
+{
+ struct spraid_queue *adminq = &hdev->queues[0];
+ u16 start, end;
+ unsigned long timeout = jiffies + 600 * HZ;
+
+ if (pci_device_is_present(hdev->pdev)) {
+ if (shutdown)
+ spraid_shutdown_ctrl(hdev);
+ else
+ spraid_disable_ctrl(hdev);
+ }
+
+ while (!time_after(jiffies, timeout)) {
+ if (!pci_device_is_present(hdev->pdev)) {
+ dev_info(hdev->dev, "[%s] pci_device not present;"
+ " skip wait\n", __func__);
+ break;
+ }
+ if (!spraid_wait_ready(hdev, hdev->cap, false)) {
+ dev_info(hdev->dev,
+ "[%s] wait ready success after reset\n",
+ __func__);
+ break;
+ }
+ dev_info(hdev->dev, "[%s] waiting csts_rdy ready\n", __func__);
+ }
+
+ if (hdev->queue_count == 0) {
+ dev_err(hdev->dev, "[%s] warn, queue has been delete\n",
+ __func__);
+ return;
+ }
+
+ spin_lock_irq(&adminq->cq_lock);
+ spraid_process_cq(adminq, &start, &end, -1);
+ spin_unlock_irq(&adminq->cq_lock);
+ spraid_complete_cqes(adminq, start, end);
+
+ spraid_pci_disable(hdev);
+
+ spraid_back_all_io(hdev);
+}
+
+static void spraid_reset_work(struct work_struct *work)
+{
+ int ret;
+ struct spraid_dev *hdev =
+ container_of(work, struct spraid_dev, reset_work);
+
+ if (hdev->state != SPRAID_RESETTING) {
+ dev_err(hdev->dev, "[%s] err, host is not reset state\n",
+ __func__);
+ return;
+ }
+
+ dev_info(hdev->dev, "[%s] enter host reset\n", __func__);
+
+ if (hdev->ctrl_config & SPRAID_CC_ENABLE) {
+ dev_info(hdev->dev, "[%s] start dev_disable\n", __func__);
+ spraid_dev_disable(hdev, false);
+ }
+
+ ret = spraid_pci_enable(hdev);
+ if (ret)
+ goto out;
+
+ ret = spraid_setup_admin_queue(hdev);
+ if (ret)
+ goto pci_disable;
+
+ ret = spraid_setup_io_queues(hdev);
+ if (ret || hdev->online_queues <= hdev->shost->nr_hw_queues)
+ goto pci_disable;
+
+ spraid_change_host_state(hdev, SPRAID_LIVE);
+
+ spraid_send_all_aen(hdev);
+
+ return;
+
+pci_disable:
+ spraid_pci_disable(hdev);
+out:
+ spraid_change_host_state(hdev, SPRAID_DEAD);
+ dev_err(hdev->dev, "[%s] err, host reset failed\n", __func__);
+}
+
+static int spraid_reset_work_sync(struct spraid_dev *hdev)
+{
+ if (!spraid_change_host_state(hdev, SPRAID_RESETTING)) {
+ dev_info(hdev->dev, "[%s] can't change to reset state\n",
+ __func__);
+ return -EBUSY;
+ }
+
+ if (!queue_work(spraid_wq, &hdev->reset_work)) {
+ dev_err(hdev->dev, "[%s] err, host is already in reset state\n",
+ __func__);
+ return -EBUSY;
+ }
+
+ flush_work(&hdev->reset_work);
+ if (hdev->state != SPRAID_LIVE)
+ return -ENODEV;
+
+ return 0;
+}
+
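+/*
+ * Poll (500ms per step, ~3s in total) for a timed-out command to be
+ * completed by the firmware after an abort/reset has been issued.
+ */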
+static int spraid_wait_abnl_cmd_done(struct spraid_iod *iod)
+{
+ u16 times = 0;
+
+ do {
+ if (READ_ONCE(iod->state) == SPRAID_CMD_TMO_COMPLETE)
+ break;
+ msleep(500);
+ times++;
+ } while (times <= SPRAID_WAIT_ABNL_CMD_TIMEOUT);
+
+ /* the command never completed after a successful abort/reset */
+ if (times >= SPRAID_WAIT_ABNL_CMD_TIMEOUT)
+ return -ETIMEDOUT;
+
+ return 0;
+}
+
+static int spraid_abort_handler(struct scsi_cmnd *scmd)
+{
+ struct spraid_dev *hdev = shost_priv(scmd->device->host);
+ struct spraid_iod *iod = scsi_cmd_priv(scmd);
+ struct spraid_sdev_hostdata *hostdata;
+ u16 hwq, cid;
+ int ret;
+
+ scsi_print_command(scmd);
+
+ if (hdev->state != SPRAID_LIVE || !spraid_wait_abnl_cmd_done(iod) ||
+ spraid_check_scmd_completed(scmd))
+ return SUCCESS;
+
+ hostdata = scmd->device->hostdata;
+ spraid_get_tag_from_scmd(scmd, &hwq, &cid);
+
+ dev_warn(hdev->dev, "cid[%d] qid[%d] timeout, aborting\n", cid, hwq);
+ ret = spraid_send_abort_cmd(hdev, hostdata->hdid, hwq, cid);
+ if (ret != ADMIN_ERR_TIMEOUT) {
+ ret = spraid_wait_abnl_cmd_done(iod);
+ if (ret) {
+ dev_warn(hdev->dev, "cid[%d] qid[%d] abort failed;"
+ " not found\n", cid, hwq);
+ return FAILED;
+ }
+ dev_warn(hdev->dev, "cid[%d] qid[%d] abort succ\n", cid, hwq);
+ return SUCCESS;
+ }
+ dev_warn(hdev->dev, "cid[%d] qid[%d] abort failed, timeout\n",
+ cid, hwq);
+ return FAILED;
+}
+
+static int spraid_tgt_reset_handler(struct scsi_cmnd *scmd)
+{
+ struct spraid_dev *hdev = shost_priv(scmd->device->host);
+ struct spraid_iod *iod = scsi_cmd_priv(scmd);
+ struct spraid_sdev_hostdata *hostdata;
+ u16 hwq, cid;
+ int ret;
+
+ scsi_print_command(scmd);
+
+ if (hdev->state != SPRAID_LIVE || !spraid_wait_abnl_cmd_done(iod) ||
+ spraid_check_scmd_completed(scmd))
+ return SUCCESS;
+
+ hostdata = scmd->device->hostdata;
+ spraid_get_tag_from_scmd(scmd, &hwq, &cid);
+
+ dev_warn(hdev->dev, "cid[%d] qid[%d] timeout, target reset\n",
+ cid, hwq);
+ ret = spraid_send_reset_cmd(hdev, SPRAID_RESET_TARGET, hostdata->hdid);
+ if (ret == 0) {
+ ret = spraid_wait_abnl_cmd_done(iod);
+ if (ret) {
+ dev_warn(hdev->dev,
+ "cid[%d] qid[%d]target reset failed;"
+ " not found\n", cid, hwq);
+ return FAILED;
+ }
+
+ dev_warn(hdev->dev, "cid[%d] qid[%d] target reset success\n",
+ cid, hwq);
+ return SUCCESS;
+ }
+
+ dev_warn(hdev->dev, "cid[%d] qid[%d] ret[%d] target reset failed\n",
+ cid, hwq, ret);
+ return FAILED;
+}
+
+static int spraid_bus_reset_handler(struct scsi_cmnd *scmd)
+{
+ struct spraid_dev *hdev = shost_priv(scmd->device->host);
+ struct spraid_iod *iod = scsi_cmd_priv(scmd);
+ struct spraid_sdev_hostdata *hostdata;
+ u16 hwq, cid;
+ int ret;
+
+ scsi_print_command(scmd);
+
+ if (hdev->state != SPRAID_LIVE || !spraid_wait_abnl_cmd_done(iod) ||
+ spraid_check_scmd_completed(scmd))
+ return SUCCESS;
+
+ hostdata = scmd->device->hostdata;
+ spraid_get_tag_from_scmd(scmd, &hwq, &cid);
+
+ dev_warn(hdev->dev, "cid[%d] qid[%d] timeout, bus reset\n", cid, hwq);
+ ret = spraid_send_reset_cmd(hdev, SPRAID_RESET_BUS, hostdata->hdid);
+ if (ret == 0) {
+ ret = spraid_wait_abnl_cmd_done(iod);
+ if (ret) {
+ dev_warn(hdev->dev,
+ "cid[%d] qid[%d] bus reset failed;"
+ " not found\n", cid, hwq);
+ return FAILED;
+ }
+
+ dev_warn(hdev->dev, "cid[%d] qid[%d] bus reset succ\n",
+ cid, hwq);
+ return SUCCESS;
+ }
+
+ dev_warn(hdev->dev, "cid[%d] qid[%d] ret[%d] bus reset failed\n",
+ cid, hwq, ret);
+ return FAILED;
+}
+
+static int spraid_shost_reset_handler(struct scsi_cmnd *scmd)
+{
+ u16 hwq, cid;
+ struct spraid_dev *hdev = shost_priv(scmd->device->host);
+
+ scsi_print_command(scmd);
+ if (hdev->state != SPRAID_LIVE || spraid_check_scmd_completed(scmd))
+ return SUCCESS;
+
+ spraid_get_tag_from_scmd(scmd, &hwq, &cid);
+ dev_warn(hdev->dev, "cid[%d] qid[%d] host reset\n", cid, hwq);
+
+ if (spraid_reset_work_sync(hdev)) {
+ dev_warn(hdev->dev, "cid[%d] qid[%d] host reset failed\n",
+ cid, hwq);
+ return FAILED;
+ }
+
+ dev_warn(hdev->dev, "cid[%d] qid[%d] host reset success\n", cid, hwq);
+
+ return SUCCESS;
+}
+
+static pci_ers_result_t spraid_pci_error_detected(struct pci_dev *pdev,
+ pci_channel_state_t state)
+{
+ struct spraid_dev *hdev = pci_get_drvdata(pdev);
+
+ dev_info(hdev->dev, "enter pci error detect, state:%d\n", state);
+
+ switch (state) {
+ case pci_channel_io_normal:
+ dev_warn(hdev->dev, "channel is normal, do nothing\n");
+
+ return PCI_ERS_RESULT_CAN_RECOVER;
+ case pci_channel_io_frozen:
+ dev_warn(hdev->dev,
+ "channel io frozen, need reset controller\n");
+
+ scsi_block_requests(hdev->shost);
+
+ spraid_change_host_state(hdev, SPRAID_RESETTING);
+
+ return PCI_ERS_RESULT_NEED_RESET;
+ case pci_channel_io_perm_failure:
+ dev_warn(hdev->dev, "channel io failure, request disconnect\n");
+
+ return PCI_ERS_RESULT_DISCONNECT;
+ }
+
+ return PCI_ERS_RESULT_NEED_RESET;
+}
+
+static pci_ers_result_t spraid_pci_slot_reset(struct pci_dev *pdev)
+{
+ struct spraid_dev *hdev = pci_get_drvdata(pdev);
+
+ dev_info(hdev->dev, "restart after slot reset\n");
+
+ pci_restore_state(pdev);
+
+ if (!queue_work(spraid_wq, &hdev->reset_work)) {
+ dev_err(hdev->dev, "[%s] err, the device is resetting state\n",
+ __func__);
+ return PCI_ERS_RESULT_NONE;
+ }
+
+ flush_work(&hdev->reset_work);
+
+ scsi_unblock_requests(hdev->shost);
+
+ return PCI_ERS_RESULT_RECOVERED;
+}
+
+static void spraid_reset_done(struct pci_dev *pdev)
+{
+ struct spraid_dev *hdev = pci_get_drvdata(pdev);
+
+ dev_info(hdev->dev, "enter spraid reset done\n");
+}
+
+static ssize_t csts_pp_show(struct device *cdev,
+ struct device_attribute *attr, char *buf)
+{
+ struct Scsi_Host *shost = class_to_shost(cdev);
+ struct spraid_dev *hdev = shost_priv(shost);
+ int ret = -1;
+
+ if (pci_device_is_present(hdev->pdev)) {
+ ret = (readl(hdev->bar + SPRAID_REG_CSTS)
+ & SPRAID_CSTS_PP_MASK);
+ ret >>= SPRAID_CSTS_PP_SHIFT;
+ }
+
+ return snprintf(buf, PAGE_SIZE, "%d\n", ret);
+}
+
+static ssize_t csts_shst_show(struct device *cdev,
+ struct device_attribute *attr, char *buf)
+{
+ struct Scsi_Host *shost = class_to_shost(cdev);
+ struct spraid_dev *hdev = shost_priv(shost);
+ int ret = -1;
+
+ if (pci_device_is_present(hdev->pdev)) {
+ ret = (readl(hdev->bar + SPRAID_REG_CSTS)
+ & SPRAID_CSTS_SHST_MASK);
+ ret >>= SPRAID_CSTS_SHST_SHIFT;
+ }
+
+ return snprintf(buf, PAGE_SIZE, "%d\n", ret);
+}
+
+static ssize_t csts_cfs_show(struct device *cdev,
+ struct device_attribute *attr, char *buf)
+{
+ struct Scsi_Host *shost = class_to_shost(cdev);
+ struct spraid_dev *hdev = shost_priv(shost);
+ int ret = -1;
+
+ if (pci_device_is_present(hdev->pdev)) {
+ ret = (readl(hdev->bar + SPRAID_REG_CSTS)
+ & SPRAID_CSTS_CFS_MASK);
+ ret >>= SPRAID_CSTS_CFS_SHIFT;
+ }
+
+ return snprintf(buf, PAGE_SIZE, "%d\n", ret);
+}
+
+static ssize_t csts_rdy_show(struct device *cdev,
+ struct device_attribute *attr, char *buf)
+{
+ struct Scsi_Host *shost = class_to_shost(cdev);
+ struct spraid_dev *hdev = shost_priv(shost);
+ int ret = -1;
+
+ if (pci_device_is_present(hdev->pdev))
+ ret = (readl(hdev->bar + SPRAID_REG_CSTS) & SPRAID_CSTS_RDY);
+
+ return snprintf(buf, PAGE_SIZE, "%d\n", ret);
+}
+
+static ssize_t fw_version_show(struct device *cdev,
+ struct device_attribute *attr, char *buf)
+{
+ struct Scsi_Host *shost = class_to_shost(cdev);
+ struct spraid_dev *hdev = shost_priv(shost);
+
+ return snprintf(buf, PAGE_SIZE, "%s\n", hdev->ctrl_info->fr);
+}
+
+static DEVICE_ATTR_RO(csts_pp);
+static DEVICE_ATTR_RO(csts_shst);
+static DEVICE_ATTR_RO(csts_cfs);
+static DEVICE_ATTR_RO(csts_rdy);
+static DEVICE_ATTR_RO(fw_version);
+
+static struct device_attribute *spraid_host_attrs[] = {
+ &dev_attr_csts_pp,
+ &dev_attr_csts_shst,
+ &dev_attr_csts_cfs,
+ &dev_attr_csts_rdy,
+ &dev_attr_fw_version,
+ NULL,
+};
+
+static int spraid_get_vd_info(struct spraid_dev *hdev,
+ struct spraid_vd_info *vd_info, u16 vid)
+{
+ struct spraid_admin_command admin_cmd;
+ u8 *data_ptr = NULL;
+ dma_addr_t data_dma = 0;
+ int ret;
+
+ data_ptr = dma_alloc_coherent(hdev->dev, PAGE_SIZE,
+ &data_dma, GFP_KERNEL);
+ if (!data_ptr)
+ return -ENOMEM;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.usr_cmd.opcode = USR_CMD_READ;
+ admin_cmd.usr_cmd.info_0.subopcode = cpu_to_le16(USR_CMD_VDINFO);
+ admin_cmd.usr_cmd.info_1.data_len = cpu_to_le16(USR_CMD_RDLEN);
+ admin_cmd.usr_cmd.info_1.param_len = cpu_to_le16(VDINFO_PARAM_LEN);
+ admin_cmd.usr_cmd.cdw10 = cpu_to_le32(vid);
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
+
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+ if (!ret)
+ memcpy(vd_info, data_ptr, sizeof(struct spraid_vd_info));
+
+ dma_free_coherent(hdev->dev, PAGE_SIZE, data_ptr, data_dma);
+
+ return ret;
+}
+
+static int spraid_get_bgtask(struct spraid_dev *hdev,
+ struct spraid_bgtask *bgtask)
+{
+ struct spraid_admin_command admin_cmd;
+ u8 *data_ptr = NULL;
+ dma_addr_t data_dma = 0;
+ int ret;
+
+ data_ptr = dma_alloc_coherent(hdev->dev, PAGE_SIZE,
+ &data_dma, GFP_KERNEL);
+ if (!data_ptr)
+ return -ENOMEM;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.usr_cmd.opcode = USR_CMD_READ;
+ admin_cmd.usr_cmd.info_0.subopcode = cpu_to_le16(USR_CMD_BGTASK);
+ admin_cmd.usr_cmd.info_1.data_len = cpu_to_le16(USR_CMD_RDLEN);
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
+
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+ if (!ret)
+ memcpy(bgtask, data_ptr, sizeof(struct spraid_bgtask));
+
+ dma_free_coherent(hdev->dev, PAGE_SIZE, data_ptr, data_dma);
+
+ return ret;
+}
+
+static ssize_t raid_level_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct scsi_device *sdev;
+ struct spraid_dev *hdev;
+ struct spraid_vd_info *vd_info;
+ struct spraid_sdev_hostdata *hostdata;
+ int ret;
+
+ sdev = to_scsi_device(dev);
+ hdev = shost_priv(sdev->host);
+ hostdata = sdev->hostdata;
+
+ vd_info = kmalloc(sizeof(*vd_info), GFP_KERNEL);
+ if (!vd_info || !SPRAID_DEV_INFO_ATTR_VD(hostdata->attr))
+ return snprintf(buf, PAGE_SIZE, "NA\n");
+
+ ret = spraid_get_vd_info(hdev, vd_info, sdev->id);
+ if (ret)
+ vd_info->rg_level = ARRAY_SIZE(raid_levels) - 1;
+
+ ret = (vd_info->rg_level < ARRAY_SIZE(raid_levels)) ?
+ vd_info->rg_level : (ARRAY_SIZE(raid_levels) - 1);
+
+ kfree(vd_info);
+
+ return snprintf(buf, PAGE_SIZE, "RAID-%s\n", raid_levels[ret]);
+}
+
+static ssize_t raid_state_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct scsi_device *sdev;
+ struct spraid_dev *hdev;
+ struct spraid_vd_info *vd_info;
+ struct spraid_sdev_hostdata *hostdata;
+ int ret;
+
+ sdev = to_scsi_device(dev);
+ hdev = shost_priv(sdev->host);
+ hostdata = sdev->hostdata;
+
+ vd_info = kmalloc(sizeof(*vd_info), GFP_KERNEL);
+ if (!vd_info || !SPRAID_DEV_INFO_ATTR_VD(hostdata->attr))
+ return snprintf(buf, PAGE_SIZE, "NA\n");
+
+ ret = spraid_get_vd_info(hdev, vd_info, sdev->id);
+ if (ret) {
+ vd_info->vd_status = 0;
+ vd_info->rg_id = 0xff;
+ }
+
+ ret = (vd_info->vd_status < ARRAY_SIZE(raid_states)) ?
+ vd_info->vd_status : 0;
+
+ kfree(vd_info);
+
+ return snprintf(buf, PAGE_SIZE, "%s\n", raid_states[ret]);
+}
+
+static ssize_t raid_resync_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct scsi_device *sdev;
+ struct spraid_dev *hdev;
+ struct spraid_vd_info *vd_info;
+ struct spraid_bgtask *bgtask;
+ struct spraid_sdev_hostdata *hostdata;
+ u8 rg_id, i, progress = 0;
+ int ret;
+
+ sdev = to_scsi_device(dev);
+ hdev = shost_priv(sdev->host);
+ hostdata = sdev->hostdata;
+
+ vd_info = kmalloc(sizeof(*vd_info), GFP_KERNEL);
+ if (!vd_info || !SPRAID_DEV_INFO_ATTR_VD(hostdata->attr))
+ return snprintf(buf, PAGE_SIZE, "NA\n");
+
+ ret = spraid_get_vd_info(hdev, vd_info, sdev->id);
+ if (ret)
+ goto out;
+
+ rg_id = vd_info->rg_id;
+
+ bgtask = (struct spraid_bgtask *)vd_info;
+ ret = spraid_get_bgtask(hdev, bgtask);
+ if (ret)
+ goto out;
+ for (i = 0; i < bgtask->task_num; i++) {
+ if ((bgtask->bgtask[i].type == BGTASK_TYPE_REBUILD) &&
+ (le16_to_cpu(bgtask->bgtask[i].vd_id) == rg_id))
+ progress = bgtask->bgtask[i].progress;
+ }
+
+out:
+ kfree(vd_info);
+ return snprintf(buf, PAGE_SIZE, "%d\n", progress);
+}
+
+static DEVICE_ATTR_RO(raid_level);
+static DEVICE_ATTR_RO(raid_state);
+static DEVICE_ATTR_RO(raid_resync);
+
+static struct device_attribute *spraid_dev_attrs[] = {
+ &dev_attr_raid_level,
+ &dev_attr_raid_state,
+ &dev_attr_raid_resync,
+ NULL,
+};
+
+static struct pci_error_handlers spraid_err_handler = {
+ .error_detected = spraid_pci_error_detected,
+ .slot_reset = spraid_pci_slot_reset,
+ .reset_done = spraid_reset_done,
+};
+
+static int spraid_sysfs_host_reset(struct Scsi_Host *shost, int reset_type)
+{
+ int ret;
+ struct spraid_dev *hdev = shost_priv(shost);
+
+ dev_info(hdev->dev, "[%s] start sysfs host reset cmd\n", __func__);
+ ret = spraid_reset_work_sync(hdev);
+ dev_info(hdev->dev, "[%s] stop sysfs host reset cmd[%d]\n",
+ __func__, ret);
+
+ return ret;
+}
+
+static struct scsi_host_template spraid_driver_template = {
+ .module = THIS_MODULE,
+ .name = "Ramaxel Logic spraid driver",
+ .proc_name = "spraid",
+ .queuecommand = spraid_queue_command,
+ .slave_alloc = spraid_slave_alloc,
+ .slave_destroy = spraid_slave_destroy,
+ .slave_configure = spraid_slave_configure,
+ .eh_timed_out = spraid_scmd_timeout,
+ .eh_abort_handler = spraid_abort_handler,
+ .eh_target_reset_handler = spraid_tgt_reset_handler,
+ .eh_bus_reset_handler = spraid_bus_reset_handler,
+ .eh_host_reset_handler = spraid_shost_reset_handler,
+ .change_queue_depth = scsi_change_queue_depth,
+ .this_id = -1,
+ .shost_attrs = spraid_host_attrs,
+ .sdev_attrs = spraid_dev_attrs,
+ .host_reset = spraid_sysfs_host_reset,
+};
+
+static void spraid_shutdown(struct pci_dev *pdev)
+{
+ struct spraid_dev *hdev = pci_get_drvdata(pdev);
+
+ spraid_remove_io_queues(hdev);
+ spraid_disable_admin_queue(hdev, true);
+}
+
+/* bsg dispatch user command */
+static int spraid_bsg_host_dispatch(struct bsg_job *job)
+{
+ struct Scsi_Host *shost = dev_to_shost(job->dev);
+ struct spraid_dev *hdev = shost_priv(shost);
+ struct request *rq = blk_mq_rq_from_pdu(job);
+ struct spraid_bsg_request *bsg_req = job->request;
+ int ret = 0;
+
+ dev_info(hdev->dev, "[%s] msgcode[%d], msglen[%d], timeout[%d];"
+ " req_nsge[%d], req_len[%d]\n",
+ __func__, bsg_req->msgcode, job->request_len,
+ rq->timeout, job->request_payload.sg_cnt,
+ job->request_payload.payload_len);
+
+ job->reply_len = 0;
+
+ switch (bsg_req->msgcode) {
+ case SPRAID_BSG_ADM:
+ ret = spraid_user_admin_cmd(hdev, job);
+ break;
+ case SPRAID_BSG_IOQ:
+ ret = spraid_user_ioq_cmd(hdev, job);
+ break;
+ default:
+ dev_info(hdev->dev, "[%s] unsupport msgcode[%d]\n",
+ __func__, bsg_req->msgcode);
+ break;
+ }
+
+ if (ret > 0)
+ ret = ret | (ret << 8);
+
+ bsg_job_done(job, ret, 0);
+ return 0;
+}
+
+static inline void spraid_remove_bsg(struct spraid_dev *hdev)
+{
+ if (hdev->bsg_queue) {
+ bsg_unregister_queue(hdev->bsg_queue);
+ blk_cleanup_queue(hdev->bsg_queue);
+ }
+}
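+
+/*
+ * PCI probe: map the BAR, bring up the admin queue, query controller
+ * info, create the I/O queues, then register the Scsi_Host and its
+ * bsg node and scan for devices.
+ */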
+static int spraid_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+{
+ struct spraid_dev *hdev;
+ struct Scsi_Host *shost;
+ int node, ret;
+ char bsg_name[15];
+
+ shost = scsi_host_alloc(&spraid_driver_template, sizeof(*hdev));
+ if (!shost) {
+ dev_err(&pdev->dev, "Failed to allocate scsi host\n");
+ return -ENOMEM;
+ }
+ hdev = shost_priv(shost);
+ hdev->pdev = pdev;
+ hdev->dev = get_device(&pdev->dev);
+
+ node = dev_to_node(hdev->dev);
+ if (node == NUMA_NO_NODE) {
+ node = first_memory_node;
+ set_dev_node(hdev->dev, node);
+ }
+ hdev->numa_node = node;
+ hdev->shost = shost;
+ pci_set_drvdata(pdev, hdev);
+
+ ret = spraid_dev_map(hdev);
+ if (ret)
+ goto put_dev;
+
+ init_rwsem(&hdev->devices_rwsem);
+ INIT_WORK(&hdev->scan_work, spraid_scan_work);
+ INIT_WORK(&hdev->timesyn_work, spraid_timesyn_work);
+ INIT_WORK(&hdev->reset_work, spraid_reset_work);
+ INIT_WORK(&hdev->fw_act_work, spraid_fw_act_work);
+ spin_lock_init(&hdev->state_lock);
+
+ ret = spraid_alloc_resources(hdev);
+ if (ret)
+ goto dev_unmap;
+
+ ret = spraid_pci_enable(hdev);
+ if (ret)
+ goto resources_free;
+
+ ret = spraid_setup_admin_queue(hdev);
+ if (ret)
+ goto pci_disable;
+
+ ret = spraid_init_ctrl_info(hdev);
+ if (ret)
+ goto disable_admin_q;
+
+ ret = spraid_alloc_iod_ext_mem_pool(hdev);
+ if (ret)
+ goto disable_admin_q;
+
+ ret = spraid_setup_io_queues(hdev);
+ if (ret)
+ goto free_iod_mempool;
+
+ spraid_shost_init(hdev);
+
+ ret = scsi_add_host(hdev->shost, hdev->dev);
+ if (ret) {
+ dev_err(hdev->dev, "Add shost to system failed, ret: %d\n",
+ ret);
+ goto remove_io_queues;
+ }
+
+ snprintf(bsg_name, sizeof(bsg_name), "spraid%d", shost->host_no);
+ hdev->bsg_queue = bsg_setup_queue(&shost->shost_gendev, bsg_name,
+ spraid_bsg_host_dispatch,
+ spraid_cmd_size(hdev, true, false));
+ if (IS_ERR(hdev->bsg_queue)) {
+ dev_err(hdev->dev, "err, setup bsg failed\n");
+ hdev->bsg_queue = NULL;
+ goto remove_io_queues;
+ }
+
+ if (hdev->online_queues == SPRAID_ADMIN_QUEUE_NUM) {
+ dev_warn(hdev->dev, "warn only admin queue can be used\n");
+ return 0;
+ }
+
+ hdev->state = SPRAID_LIVE;
+
+ spraid_send_all_aen(hdev);
+
+ ret = spraid_dev_list_init(hdev);
+ if (ret)
+ goto remove_bsg;
+
+ ret = spraid_configure_timestamp(hdev);
+ if (ret)
+ dev_warn(hdev->dev, "init set timestamp failed\n");
+
+ ret = spraid_alloc_ioq_ptcmds(hdev);
+ if (ret)
+ goto remove_bsg;
+
+ scsi_scan_host(hdev->shost);
+
+ return 0;
+
+remove_bsg:
+ spraid_remove_bsg(hdev);
+remove_io_queues:
+ spraid_remove_io_queues(hdev);
+free_iod_mempool:
+ spraid_free_iod_ext_mem_pool(hdev);
+disable_admin_q:
+ spraid_disable_admin_queue(hdev, false);
+pci_disable:
+ spraid_pci_disable(hdev);
+resources_free:
+ spraid_free_resources(hdev);
+dev_unmap:
+ spraid_dev_unmap(hdev);
+put_dev:
+ put_device(hdev->dev);
+ scsi_host_put(shost);
+
+ return -ENODEV;
+}
+
+static void spraid_remove(struct pci_dev *pdev)
+{
+ struct spraid_dev *hdev = pci_get_drvdata(pdev);
+ struct Scsi_Host *shost = hdev->shost;
+
+ dev_info(hdev->dev, "enter spraid remove\n");
+
+ spraid_change_host_state(hdev, SPRAID_DELETING);
+ flush_work(&hdev->reset_work);
+
+ if (!pci_device_is_present(pdev))
+ spraid_back_all_io(hdev);
+
+ spraid_remove_bsg(hdev);
+ scsi_remove_host(shost);
+ spraid_free_ioq_ptcmds(hdev);
+ kfree(hdev->devices);
+ spraid_remove_io_queues(hdev);
+ spraid_free_iod_ext_mem_pool(hdev);
+ spraid_disable_admin_queue(hdev, false);
+ spraid_pci_disable(hdev);
+ spraid_free_resources(hdev);
+ spraid_dev_unmap(hdev);
+ put_device(hdev->dev);
+ scsi_host_put(shost);
+
+ dev_info(hdev->dev, "exit spraid remove\n");
+}
+
+static const struct pci_device_id spraid_id_table[] = {
+ { PCI_DEVICE(PCI_VENDOR_ID_RAMAXEL_LOGIC,
+ SPRAID_SERVER_DEVICE_HBA_DID) },
+ { PCI_DEVICE(PCI_VENDOR_ID_RAMAXEL_LOGIC,
+ SPRAID_SERVER_DEVICE_RAID_DID) },
+ { 0, }
+};
+MODULE_DEVICE_TABLE(pci, spraid_id_table);
+
+static struct pci_driver spraid_driver = {
+ .name = "spraid",
+ .id_table = spraid_id_table,
+ .probe = spraid_probe,
+ .remove = spraid_remove,
+ .shutdown = spraid_shutdown,
+ .err_handler = &spraid_err_handler,
+};
+
+static int __init spraid_init(void)
+{
+ int ret;
+
+ spraid_wq = alloc_workqueue("spraid-wq", WQ_UNBOUND | WQ_MEM_RECLAIM |
+ WQ_SYSFS, 0);
+ if (!spraid_wq)
+ return -ENOMEM;
+
+ spraid_class = class_create(THIS_MODULE, "spraid");
+ if (IS_ERR(spraid_class)) {
+ ret = PTR_ERR(spraid_class);
+ goto destroy_wq;
+ }
+
+ ret = pci_register_driver(&spraid_driver);
+ if (ret < 0)
+ goto destroy_class;
+
+ return 0;
+
+destroy_class:
+ class_destroy(spraid_class);
+destroy_wq:
+ destroy_workqueue(spraid_wq);
+
+ return ret;
+}
+
+static void __exit spraid_exit(void)
+{
+ pci_unregister_driver(&spraid_driver);
+ class_destroy(spraid_class);
+ destroy_workqueue(spraid_wq);
+ ida_destroy(&spraid_instance_ida);
+}
+
+MODULE_AUTHOR("songyl(a)ramaxel.com");
+MODULE_DESCRIPTION("Ramaxel Memory Technology SPraid Driver");
+MODULE_LICENSE("GPL");
+MODULE_VERSION(SPRAID_DRV_VERSION);
+module_init(spraid_init);
+module_exit(spraid_exit);
--
2.27.0
[PATCH openEuler-1.0-LTS] USB: gadget: bRequestType is a bitfield, not a enum
by Yang Yingliang 22 Dec '21
From: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
mainline inclusion
from mainline-v5.16-rc6
commit f08adf5add9a071160c68bb2a61d697f39ab0758
category: bugfix
bugzilla: NA
CVE: CVE-2021-39685
--------------------------------
Szymon rightly pointed out that the previous check for the endpoint
direction in bRequestType was not looking at only the bit involved, but
rather the whole value. Normally this is ok, but for some request
types, bits other than bit 8 could be set and the check for the endpoint
length could not stall correctly.
Fix that up by only checking the single bit.
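A minimal sketch of the difference (illustration only, not part of the
patch; in <linux/usb/ch9.h>, USB_DIR_OUT is 0x00 and USB_DIR_IN is 0x80):
	u8 brt = 0x21;		/* class request to an interface, direction OUT */
	brt == USB_DIR_OUT	/* old check: false, because other bits are set */
	!(brt & USB_DIR_IN)	/* new check: true, it tests bit 7 alone */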
Fixes: 153a2d7e3350 ("USB: gadget: detect too-big endpoint 0 requests")
Cc: Felipe Balbi <balbi(a)kernel.org>
Reported-by: Szymon Heidrich <szymon.heidrich(a)gmail.com>
Link: https://lore.kernel.org/r/20211214184621.385828-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/usb/gadget/composite.c | 6 +++---
drivers/usb/gadget/legacy/dbgp.c | 6 +++---
drivers/usb/gadget/legacy/inode.c | 6 +++---
3 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/drivers/usb/gadget/composite.c b/drivers/usb/gadget/composite.c
index 3813249387b3d..d39ebf5bdbdd5 100644
--- a/drivers/usb/gadget/composite.c
+++ b/drivers/usb/gadget/composite.c
@@ -1570,14 +1570,14 @@ composite_setup(struct usb_gadget *gadget, const struct usb_ctrlrequest *ctrl)
u8 endp;
if (w_length > USB_COMP_EP0_BUFSIZ) {
- if (ctrl->bRequestType == USB_DIR_OUT) {
- goto done;
- } else {
+ if (ctrl->bRequestType & USB_DIR_IN) {
/* Cast away the const, we are going to overwrite on purpose. */
__le16 *temp = (__le16 *)&ctrl->wLength;
*temp = cpu_to_le16(USB_COMP_EP0_BUFSIZ);
w_length = USB_COMP_EP0_BUFSIZ;
+ } else {
+ goto done;
}
}
diff --git a/drivers/usb/gadget/legacy/dbgp.c b/drivers/usb/gadget/legacy/dbgp.c
index e567afcb2794c..ffe58d44b2cb1 100644
--- a/drivers/usb/gadget/legacy/dbgp.c
+++ b/drivers/usb/gadget/legacy/dbgp.c
@@ -346,14 +346,14 @@ static int dbgp_setup(struct usb_gadget *gadget,
u16 len = 0;
if (length > DBGP_REQ_LEN) {
- if (ctrl->bRequestType == USB_DIR_OUT) {
- return err;
- } else {
+ if (ctrl->bRequestType & USB_DIR_IN) {
/* Cast away the const, we are going to overwrite on purpose. */
__le16 *temp = (__le16 *)&ctrl->wLength;
*temp = cpu_to_le16(DBGP_REQ_LEN);
length = DBGP_REQ_LEN;
+ } else {
+ return err;
}
}
diff --git a/drivers/usb/gadget/legacy/inode.c b/drivers/usb/gadget/legacy/inode.c
index 83fd1b55d497a..f91d403da3141 100644
--- a/drivers/usb/gadget/legacy/inode.c
+++ b/drivers/usb/gadget/legacy/inode.c
@@ -1335,14 +1335,14 @@ gadgetfs_setup (struct usb_gadget *gadget, const struct usb_ctrlrequest *ctrl)
u16 w_length = le16_to_cpu(ctrl->wLength);
if (w_length > RBUF_SIZE) {
- if (ctrl->bRequestType == USB_DIR_OUT) {
- return value;
- } else {
+ if (ctrl->bRequestType & USB_DIR_IN) {
/* Cast away the const, we are going to overwrite on purpose. */
__le16 *temp = (__le16 *)&ctrl->wLength;
*temp = cpu_to_le16(RBUF_SIZE);
w_length = RBUF_SIZE;
+ } else {
+ return value;
}
}
--
2.25.1
[PATCH openEuler-1.0-LTS 1/2] scsi:spraid: support Ramaxel's spraid driver
by Yang Yingliang 22 Dec '21
From: Yanling Song <songyl(a)ramaxel.com>
Ramaxel inclusion
category: features
bugzilla: https://gitee.com/openeuler/kernel/issues/I4NJIB
CVE: NA
Support Ramaxel's SPRxxx Raid controller
Signed-off-by: Yanling Song <songyl(a)ramaxel.com>
Reviewed-by: Jiang Yu<yujiang(a)ramaxel.com>
Acked-by: Xie XiuQi <xiexiuqi(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/scsi/Kconfig | 1 +
drivers/scsi/Makefile | 1 +
drivers/scsi/spraid/Kconfig | 13 +
drivers/scsi/spraid/Makefile | 7 +
drivers/scsi/spraid/spraid.h | 746 ++++++
drivers/scsi/spraid/spraid_main.c | 3875 +++++++++++++++++++++++++++++
6 files changed, 4643 insertions(+)
create mode 100644 drivers/scsi/spraid/Kconfig
create mode 100644 drivers/scsi/spraid/Makefile
create mode 100644 drivers/scsi/spraid/spraid.h
create mode 100644 drivers/scsi/spraid/spraid_main.c
diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig
index 8450484184e36..63d2aaa228349 100644
--- a/drivers/scsi/Kconfig
+++ b/drivers/scsi/Kconfig
@@ -522,6 +522,7 @@ source "drivers/scsi/megaraid/Kconfig.megaraid"
source "drivers/scsi/mpt3sas/Kconfig"
source "drivers/scsi/smartpqi/Kconfig"
source "drivers/scsi/ufs/Kconfig"
+source "drivers/scsi/spraid/Kconfig"
config SCSI_HPTIOP
tristate "HighPoint RocketRAID 3xxx/4xxx Controller support"
diff --git a/drivers/scsi/Makefile b/drivers/scsi/Makefile
index 2973693f6dcc7..4056cf26e09e6 100644
--- a/drivers/scsi/Makefile
+++ b/drivers/scsi/Makefile
@@ -94,6 +94,7 @@ obj-$(CONFIG_SCSI_ZALON) += zalon7xx.o
obj-$(CONFIG_SCSI_DC395x) += dc395x.o
obj-$(CONFIG_SCSI_AM53C974) += esp_scsi.o am53c974.o
obj-$(CONFIG_CXLFLASH) += cxlflash/
+obj-$(CONFIG_RAMAXEL_SPRAID) += spraid/
obj-$(CONFIG_MEGARAID_LEGACY) += megaraid.o
obj-$(CONFIG_MEGARAID_NEWGEN) += megaraid/
obj-$(CONFIG_MEGARAID_SAS) += megaraid/
diff --git a/drivers/scsi/spraid/Kconfig b/drivers/scsi/spraid/Kconfig
new file mode 100644
index 0000000000000..bfbba3db8db03
--- /dev/null
+++ b/drivers/scsi/spraid/Kconfig
@@ -0,0 +1,13 @@
+#
+# Ramaxel driver configuration
+#
+
+config RAMAXEL_SPRAID
+ tristate "Ramaxel spraid Adapter"
+ depends on PCI && SCSI
+ select BLK_DEV_BSGLIB
+ depends on ARM64 || X86_64
+ help
+ This driver supports the Ramaxel SPRxxx series
+ RAID controllers, which attach over a PCIe Gen4
+ host interface and support SAS/SATA HDDs/SSDs.
diff --git a/drivers/scsi/spraid/Makefile b/drivers/scsi/spraid/Makefile
new file mode 100644
index 0000000000000..aadc2ffd37ebd
--- /dev/null
+++ b/drivers/scsi/spraid/Makefile
@@ -0,0 +1,7 @@
+#
+# Makefile for the Ramaxel device drivers.
+#
+
+obj-$(CONFIG_RAMAXEL_SPRAID) += spraid.o
+
+spraid-objs := spraid_main.o
\ No newline at end of file
diff --git a/drivers/scsi/spraid/spraid.h b/drivers/scsi/spraid/spraid.h
new file mode 100644
index 0000000000000..c1e4980e18e5d
--- /dev/null
+++ b/drivers/scsi/spraid/spraid.h
@@ -0,0 +1,746 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Ramaxel Memory Technology, Ltd */
+
+#ifndef __SPRAID_H_
+#define __SPRAID_H_
+
+#define SPRAID_CAP_MQES(cap) ((cap) & 0xffff)
+#define SPRAID_CAP_STRIDE(cap) (((cap) >> 32) & 0xf)
+#define SPRAID_CAP_MPSMIN(cap) (((cap) >> 48) & 0xf)
+#define SPRAID_CAP_MPSMAX(cap) (((cap) >> 52) & 0xf)
+#define SPRAID_CAP_TIMEOUT(cap) (((cap) >> 24) & 0xff)
+#define SPRAID_CAP_DMAMASK(cap) (((cap) >> 37) & 0xff)
+
+#define SPRAID_DEFAULT_MAX_CHANNEL 4
+#define SPRAID_DEFAULT_MAX_ID 240
+#define SPRAID_DEFAULT_MAX_LUN_PER_HOST 8
+#define MAX_SECTORS 2048
+
+#define IO_SQE_SIZE sizeof(struct spraid_ioq_command)
+#define ADMIN_SQE_SIZE sizeof(struct spraid_admin_command)
+#define SQE_SIZE(qid) (((qid) > 0) ? IO_SQE_SIZE : ADMIN_SQE_SIZE)
+#define CQ_SIZE(depth) ((depth) * sizeof(struct spraid_completion))
+#define SQ_SIZE(qid, depth) ((depth) * SQE_SIZE(qid))
+
+#define SENSE_SIZE(depth) ((depth) * SCSI_SENSE_BUFFERSIZE)
+
+#define SPRAID_AQ_DEPTH 128
+#define SPRAID_NR_AEN_COMMANDS 16
+#define SPRAID_AQ_BLK_MQ_DEPTH (SPRAID_AQ_DEPTH - SPRAID_NR_AEN_COMMANDS)
+#define SPRAID_AQ_MQ_TAG_DEPTH (SPRAID_AQ_BLK_MQ_DEPTH - 1)
+
+#define SPRAID_ADMIN_QUEUE_NUM 1
+#define SPRAID_PTCMDS_PERQ 1
+#define SPRAID_IO_BLK_MQ_DEPTH (hdev->shost->can_queue)
+#define SPRAID_NR_IOQ_PTCMDS (SPRAID_PTCMDS_PERQ * hdev->shost->nr_hw_queues)
+
+#define FUA_MASK 0x08
+#define SPRAID_MINORS BIT(MINORBITS)
+
+#define COMMAND_IS_WRITE(cmd) ((cmd)->common.opcode & 1)
+
+#define SPRAID_IO_IOSQES 7
+#define SPRAID_IO_IOCQES 4
+#define PRP_ENTRY_SIZE 8
+
+#define SMALL_POOL_SIZE 256
+#define MAX_SMALL_POOL_NUM 16
+#define MAX_CMD_PER_DEV 64
+#define MAX_CDB_LEN 32
+
+#define SPRAID_UP_TO_MULTY4(x) (((x) + 4) & (~0x03))
+
+#define CQE_STATUS_SUCCESS (0x0)
+
+#define PCI_VENDOR_ID_RAMAXEL_LOGIC 0x1E81
+
+#define SPRAID_SERVER_DEVICE_HBA_DID 0x2100
+#define SPRAID_SERVER_DEVICE_RAID_DID 0x2200
+
+#define IO_6_DEFAULT_TX_LEN 256
+
+#define SPRAID_INT_PAGES 2
+#define SPRAID_INT_BYTES(hdev) (SPRAID_INT_PAGES * (hdev)->page_size)
+
+enum {
+ SPRAID_REQ_CANCELLED = (1 << 0),
+ SPRAID_REQ_USERCMD = (1 << 1),
+};
+
+enum {
+ SPRAID_SC_SUCCESS = 0x0,
+ SPRAID_SC_INVALID_OPCODE = 0x1,
+ SPRAID_SC_INVALID_FIELD = 0x2,
+
+ SPRAID_SC_ABORT_LIMIT = 0x103,
+ SPRAID_SC_ABORT_MISSING = 0x104,
+ SPRAID_SC_ASYNC_LIMIT = 0x105,
+
+ SPRAID_SC_DNR = 0x4000,
+};
+
+enum {
+ SPRAID_REG_CAP = 0x0000,
+ SPRAID_REG_CC = 0x0014,
+ SPRAID_REG_CSTS = 0x001c,
+ SPRAID_REG_AQA = 0x0024,
+ SPRAID_REG_ASQ = 0x0028,
+ SPRAID_REG_ACQ = 0x0030,
+ SPRAID_REG_DBS = 0x1000,
+};
+
+enum {
+ SPRAID_CC_ENABLE = 1 << 0,
+ SPRAID_CC_CSS_NVM = 0 << 4,
+ SPRAID_CC_MPS_SHIFT = 7,
+ SPRAID_CC_AMS_SHIFT = 11,
+ SPRAID_CC_SHN_SHIFT = 14,
+ SPRAID_CC_IOSQES_SHIFT = 16,
+ SPRAID_CC_IOCQES_SHIFT = 20,
+ SPRAID_CC_AMS_RR = 0 << SPRAID_CC_AMS_SHIFT,
+ SPRAID_CC_SHN_NONE = 0 << SPRAID_CC_SHN_SHIFT,
+ SPRAID_CC_IOSQES = SPRAID_IO_IOSQES << SPRAID_CC_IOSQES_SHIFT,
+ SPRAID_CC_IOCQES = SPRAID_IO_IOCQES << SPRAID_CC_IOCQES_SHIFT,
+ SPRAID_CC_SHN_NORMAL = 1 << SPRAID_CC_SHN_SHIFT,
+ SPRAID_CC_SHN_MASK = 3 << SPRAID_CC_SHN_SHIFT,
+ SPRAID_CSTS_CFS_SHIFT = 1,
+ SPRAID_CSTS_SHST_SHIFT = 2,
+ SPRAID_CSTS_PP_SHIFT = 5,
+ SPRAID_CSTS_RDY = 1 << 0,
+ SPRAID_CSTS_SHST_CMPLT = 2 << 2,
+ SPRAID_CSTS_SHST_MASK = 3 << 2,
+ SPRAID_CSTS_CFS_MASK = 1 << SPRAID_CSTS_CFS_SHIFT,
+ SPRAID_CSTS_PP_MASK = 1 << SPRAID_CSTS_PP_SHIFT,
+};
+
+enum {
+ SPRAID_ADMIN_DELETE_SQ = 0x00,
+ SPRAID_ADMIN_CREATE_SQ = 0x01,
+ SPRAID_ADMIN_DELETE_CQ = 0x04,
+ SPRAID_ADMIN_CREATE_CQ = 0x05,
+ SPRAID_ADMIN_ABORT_CMD = 0x08,
+ SPRAID_ADMIN_SET_FEATURES = 0x09,
+ SPRAID_ADMIN_ASYNC_EVENT = 0x0c,
+ SPRAID_ADMIN_GET_INFO = 0xc6,
+ SPRAID_ADMIN_RESET = 0xc8,
+};
+
+enum {
+ SPRAID_GET_INFO_CTRL = 0,
+ SPRAID_GET_INFO_DEV_LIST = 1,
+};
+
+enum {
+ SPRAID_RESET_TARGET = 0,
+ SPRAID_RESET_BUS = 1,
+};
+
+enum {
+ SPRAID_AEN_ERROR = 0,
+ SPRAID_AEN_NOTICE = 2,
+ SPRAID_AEN_VS = 7,
+};
+
+enum {
+ SPRAID_AEN_DEV_CHANGED = 0x00,
+ SPRAID_AEN_FW_ACT_START = 0x01,
+ SPRAID_AEN_HOST_PROBING = 0x10,
+};
+
+enum {
+ SPRAID_AEN_TIMESYN = 0x00,
+ SPRAID_AEN_FW_ACT_FINISH = 0x02,
+ SPRAID_AEN_EVENT_MIN = 0x80,
+ SPRAID_AEN_EVENT_MAX = 0xff,
+};
+
+enum {
+ SPRAID_CMD_WRITE = 0x01,
+ SPRAID_CMD_READ = 0x02,
+
+ SPRAID_CMD_NONIO_NONE = 0x80,
+ SPRAID_CMD_NONIO_TODEV = 0x81,
+ SPRAID_CMD_NONIO_FROMDEV = 0x82,
+};
+
+enum {
+ SPRAID_QUEUE_PHYS_CONTIG = (1 << 0),
+ SPRAID_CQ_IRQ_ENABLED = (1 << 1),
+
+ SPRAID_FEAT_NUM_QUEUES = 0x07,
+ SPRAID_FEAT_ASYNC_EVENT = 0x0b,
+ SPRAID_FEAT_TIMESTAMP = 0x0e,
+};
+
+enum spraid_state {
+ SPRAID_NEW,
+ SPRAID_LIVE,
+ SPRAID_RESETTING,
+ SPRAID_DELETING,
+ SPRAID_DEAD,
+};
+
+enum {
+ SPRAID_CARD_HBA,
+ SPRAID_CARD_RAID,
+};
+
+enum spraid_cmd_type {
+ SPRAID_CMD_ADM,
+ SPRAID_CMD_IOPT,
+};
+
+struct spraid_completion {
+ __le32 result;
+ union {
+ struct {
+ __u8 sense_len;
+ __u8 resv[3];
+ };
+ __le32 result1;
+ };
+ __le16 sq_head;
+ __le16 sq_id;
+ __u16 cmd_id;
+ __le16 status;
+};
+
+struct spraid_ctrl_info {
+ __le32 nd;
+ __le16 max_cmds;
+ __le16 max_channel;
+ __le32 max_tgt_id;
+ __le16 max_lun;
+ __le16 max_num_sge;
+ __le16 lun_num_in_boot;
+ __u8 mdts;
+ __u8 acl;
+ __u8 aerl;
+ __u8 card_type;
+ __u16 rsvd;
+ __u32 rtd3e;
+ __u8 sn[32];
+ __u8 fr[16];
+ __u8 rsvd1[4020];
+};
+
+struct spraid_dev {
+ struct pci_dev *pdev;
+ struct device *dev;
+ struct Scsi_Host *shost;
+ struct spraid_queue *queues;
+ struct dma_pool *prp_page_pool;
+ struct dma_pool *prp_small_pool[MAX_SMALL_POOL_NUM];
+ mempool_t *iod_mempool;
+ void __iomem *bar;
+ u32 max_qid;
+ u32 num_vecs;
+ u32 queue_count;
+ u32 ioq_depth;
+ int db_stride;
+ u32 __iomem *dbs;
+ struct rw_semaphore devices_rwsem;
+ int numa_node;
+ u32 page_size;
+ u32 ctrl_config;
+ u32 online_queues;
+ u64 cap;
+ int instance;
+ struct spraid_ctrl_info *ctrl_info;
+ struct spraid_dev_info *devices;
+
+ struct spraid_cmd *adm_cmds;
+ struct list_head adm_cmd_list;
+ spinlock_t adm_cmd_lock;
+
+ struct spraid_cmd *ioq_ptcmds;
+ struct list_head ioq_pt_list;
+ spinlock_t ioq_pt_lock;
+
+ struct work_struct scan_work;
+ struct work_struct timesyn_work;
+ struct work_struct reset_work;
+ struct work_struct fw_act_work;
+
+ enum spraid_state state;
+ spinlock_t state_lock;
+
+ struct request_queue *bsg_queue;
+};
+
+struct spraid_sgl_desc {
+ __le64 addr;
+ __le32 length;
+ __u8 rsvd[3];
+ __u8 type;
+};
+
+union spraid_data_ptr {
+ struct {
+ __le64 prp1;
+ __le64 prp2;
+ };
+ struct spraid_sgl_desc sgl;
+};
+
+struct spraid_admin_common_command {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ __le32 cdw2[4];
+ union spraid_data_ptr dptr;
+ __le32 cdw10;
+ __le32 cdw11;
+ __le32 cdw12;
+ __le32 cdw13;
+ __le32 cdw14;
+ __le32 cdw15;
+};
+
+struct spraid_features {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ __u64 rsvd2[2];
+ union spraid_data_ptr dptr;
+ __le32 fid;
+ __le32 dword11;
+ __le32 dword12;
+ __le32 dword13;
+ __le32 dword14;
+ __le32 dword15;
+};
+
+struct spraid_create_cq {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __u32 rsvd1[5];
+ __le64 prp1;
+ __u64 rsvd8;
+ __le16 cqid;
+ __le16 qsize;
+ __le16 cq_flags;
+ __le16 irq_vector;
+ __u32 rsvd12[4];
+};
+
+struct spraid_create_sq {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __u32 rsvd1[5];
+ __le64 prp1;
+ __u64 rsvd8;
+ __le16 sqid;
+ __le16 qsize;
+ __le16 sq_flags;
+ __le16 cqid;
+ __u32 rsvd12[4];
+};
+
+struct spraid_delete_queue {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __u32 rsvd1[9];
+ __le16 qid;
+ __u16 rsvd10;
+ __u32 rsvd11[5];
+};
+
+struct spraid_get_info {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ __u32 rsvd2[4];
+ union spraid_data_ptr dptr;
+ __u8 type;
+ __u8 rsvd10[3];
+ __le32 cdw11;
+ __u32 rsvd12[4];
+};
+
+struct spraid_usr_cmd {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ union {
+ struct {
+ __le16 subopcode;
+ __le16 rsvd1;
+ } info_0;
+ __le32 cdw2;
+ };
+ union {
+ struct {
+ __le16 data_len;
+ __le16 param_len;
+ } info_1;
+ __le32 cdw3;
+ };
+ __u64 metadata;
+ union spraid_data_ptr dptr;
+ __le32 cdw10;
+ __le32 cdw11;
+ __le32 cdw12;
+ __le32 cdw13;
+ __le32 cdw14;
+ __le32 cdw15;
+};
+
+enum {
+ SPRAID_CMD_FLAG_SGL_METABUF = (1 << 6),
+ SPRAID_CMD_FLAG_SGL_METASEG = (1 << 7),
+ SPRAID_CMD_FLAG_SGL_ALL = SPRAID_CMD_FLAG_SGL_METABUF |
+ SPRAID_CMD_FLAG_SGL_METASEG,
+};
+
+enum spraid_cmd_state {
+ SPRAID_CMD_IDLE = 0,
+ SPRAID_CMD_IN_FLIGHT = 1,
+ SPRAID_CMD_COMPLETE = 2,
+ SPRAID_CMD_TIMEOUT = 3,
+ SPRAID_CMD_TMO_COMPLETE = 4,
+};
+
+struct spraid_abort_cmd {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ __u64 rsvd2[4];
+ __le16 sqid;
+ __le16 cid;
+ __u32 rsvd11[5];
+};
+
+struct spraid_reset_cmd {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ __u64 rsvd2[4];
+ __u8 type;
+ __u8 rsvd10[3];
+ __u32 rsvd11[5];
+};
+
+struct spraid_admin_command {
+ union {
+ struct spraid_admin_common_command common;
+ struct spraid_features features;
+ struct spraid_create_cq create_cq;
+ struct spraid_create_sq create_sq;
+ struct spraid_delete_queue delete_queue;
+ struct spraid_get_info get_info;
+ struct spraid_abort_cmd abort;
+ struct spraid_reset_cmd reset;
+ struct spraid_usr_cmd usr_cmd;
+ };
+};
+
+struct spraid_ioq_common_command {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ __le16 sense_len;
+ __u8 cdb_len;
+ __u8 rsvd2;
+ __le32 cdw3[3];
+ union spraid_data_ptr dptr;
+ __le32 cdw10[6];
+ __u8 cdb[32];
+ __le64 sense_addr;
+ __le32 cdw26[6];
+};
+
+struct spraid_rw_command {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ __le16 sense_len;
+ __u8 cdb_len;
+ __u8 rsvd2;
+ __u32 rsvd3[3];
+ union spraid_data_ptr dptr;
+ __le64 slba;
+ __le16 nlb;
+ __le16 control;
+ __u32 rsvd13[3];
+ __u8 cdb[32];
+ __le64 sense_addr;
+ __u32 rsvd26[6];
+};
+
+struct spraid_scsi_nonio {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ __le16 sense_len;
+ __u8 cdb_length;
+ __u8 rsvd2;
+ __u32 rsvd3[3];
+ union spraid_data_ptr dptr;
+ __u32 rsvd10[5];
+ __le32 buffer_len;
+ __u8 cdb[32];
+ __le64 sense_addr;
+ __u32 rsvd26[6];
+};
+
+struct spraid_ioq_command {
+ union {
+ struct spraid_ioq_common_command common;
+ struct spraid_rw_command rw;
+ struct spraid_scsi_nonio scsi_nonio;
+ };
+};
+
+struct spraid_passthru_common_cmd {
+ __u8 opcode;
+ __u8 flags;
+ __u16 rsvd0;
+ __u32 nsid;
+ union {
+ struct {
+ __u16 subopcode;
+ __u16 rsvd1;
+ } info_0;
+ __u32 cdw2;
+ };
+ union {
+ struct {
+ __u16 data_len;
+ __u16 param_len;
+ } info_1;
+ __u32 cdw3;
+ };
+ __u64 metadata;
+
+ __u64 addr;
+ __u64 prp2;
+
+ __u32 cdw10;
+ __u32 cdw11;
+ __u32 cdw12;
+ __u32 cdw13;
+ __u32 cdw14;
+ __u32 cdw15;
+ __u32 timeout_ms;
+ __u32 result0;
+ __u32 result1;
+};
+
+struct spraid_ioq_passthru_cmd {
+ __u8 opcode;
+ __u8 flags;
+ __u16 rsvd0;
+ __u32 nsid;
+ union {
+ struct {
+ __u16 res_sense_len;
+ __u8 cdb_len;
+ __u8 rsvd0;
+ } info_0;
+ __u32 cdw2;
+ };
+ union {
+ struct {
+ __u16 subopcode;
+ __u16 rsvd1;
+ } info_1;
+ __u32 cdw3;
+ };
+ union {
+ struct {
+ __u16 rsvd;
+ __u16 param_len;
+ } info_2;
+ __u32 cdw4;
+ };
+ __u32 cdw5;
+ __u64 addr;
+ __u64 prp2;
+ union {
+ struct {
+ __u16 eid;
+ __u16 sid;
+ } info_3;
+ __u32 cdw10;
+ };
+ union {
+ struct {
+ __u16 did;
+ __u8 did_flag;
+ __u8 rsvd2;
+ } info_4;
+ __u32 cdw11;
+ };
+ __u32 cdw12;
+ __u32 cdw13;
+ __u32 cdw14;
+ __u32 data_len;
+ __u32 cdw16;
+ __u32 cdw17;
+ __u32 cdw18;
+ __u32 cdw19;
+ __u32 cdw20;
+ __u32 cdw21;
+ __u32 cdw22;
+ __u32 cdw23;
+ __u64 sense_addr;
+ __u32 cdw26[4];
+ __u32 timeout_ms;
+ __u32 result0;
+ __u32 result1;
+};
+
+struct spraid_bsg_request {
+ u32 msgcode;
+ u32 control;
+ union {
+ struct spraid_passthru_common_cmd admcmd;
+ struct spraid_ioq_passthru_cmd ioqcmd;
+ };
+};
+
+enum {
+ SPRAID_BSG_ADM,
+ SPRAID_BSG_IOQ,
+};
+
+struct spraid_cmd {
+ int qid;
+ int cid;
+ u32 result0;
+ u32 result1;
+ u16 status;
+ void *priv;
+ enum spraid_cmd_state state;
+ struct completion cmd_done;
+ struct list_head list;
+};
+
+struct spraid_queue {
+ struct spraid_dev *hdev;
+ spinlock_t sq_lock;
+
+ spinlock_t cq_lock ____cacheline_aligned_in_smp;
+
+ void *sq_cmds;
+
+ struct spraid_completion *cqes;
+
+ dma_addr_t sq_dma_addr;
+ dma_addr_t cq_dma_addr;
+ u32 __iomem *q_db;
+ u8 cq_phase;
+ u8 sqes;
+ u16 qid;
+ u16 sq_tail;
+ u16 cq_head;
+ u16 last_cq_head;
+ u16 q_depth;
+ s16 cq_vector;
+ void *sense;
+ dma_addr_t sense_dma_addr;
+ struct dma_pool *prp_small_pool;
+};
+
+struct spraid_iod {
+ struct spraid_queue *spraidq;
+ enum spraid_cmd_state state;
+ int npages;
+ u32 nsge;
+ u32 length;
+ bool use_sgl;
+ bool sg_drv_mgmt;
+ dma_addr_t first_dma;
+ void *sense;
+ dma_addr_t sense_dma;
+ struct scatterlist *sg;
+ struct scatterlist inline_sg[0];
+};
+
+#define SPRAID_DEV_INFO_ATTR_BOOT(attr) ((attr) & 0x01)
+#define SPRAID_DEV_INFO_ATTR_VD(attr) (((attr) & 0x02) == 0x0)
+#define SPRAID_DEV_INFO_ATTR_PT(attr) (((attr) & 0x22) == 0x02)
+#define SPRAID_DEV_INFO_ATTR_RAWDISK(attr) ((attr) & 0x20)
+
+#define SPRAID_DEV_INFO_FLAG_VALID(flag) ((flag) & 0x01)
+#define SPRAID_DEV_INFO_FLAG_CHANGE(flag) ((flag) & 0x02)
+
+#define BGTASK_TYPE_REBUILD 4
+#define USR_CMD_READ 0xc2
+#define USR_CMD_RDLEN 0x1000
+#define USR_CMD_VDINFO 0x704
+#define USR_CMD_BGTASK 0x504
+#define VDINFO_PARAM_LEN 0x04
+
+struct spraid_vd_info {
+ __u8 name[32];
+ __le16 id;
+ __u8 rg_id;
+ __u8 rg_level;
+ __u8 sg_num;
+ __u8 sg_disk_num;
+ __u8 vd_status;
+ __u8 vd_type;
+ __u8 rsvd1[4056];
+};
+
+#define MAX_REALTIME_BGTASK_NUM 32
+
+struct bgtask_info {
+ __u8 type;
+ __u8 progress;
+ __u8 rate;
+ __u8 rsvd0;
+ __le16 vd_id;
+ __le16 time_left;
+ __u8 rsvd1[4];
+};
+
+struct spraid_bgtask {
+ __u8 sw;
+ __u8 task_num;
+ __u8 rsvd[6];
+ struct bgtask_info bgtask[MAX_REALTIME_BGTASK_NUM];
+};
+
+struct spraid_dev_info {
+ __le32 hdid;
+ __le16 target;
+ __u8 channel;
+ __u8 lun;
+ __u8 attr;
+ __u8 flag;
+ __le16 max_io_kb;
+};
+
+#define MAX_DEV_ENTRY_PER_PAGE_4K 340
+struct spraid_dev_list {
+ __le32 dev_num;
+ __u32 rsvd0[3];
+ struct spraid_dev_info devices[MAX_DEV_ENTRY_PER_PAGE_4K];
+};
+
+struct spraid_sdev_hostdata {
+ u32 hdid;
+ u16 max_io_kb;
+ u8 attr;
+ u8 flag;
+ u8 rg_id;
+ u8 rsvd[3];
+};
+
+#endif
+
diff --git a/drivers/scsi/spraid/spraid_main.c b/drivers/scsi/spraid/spraid_main.c
new file mode 100644
index 0000000000000..519b39f44e914
--- /dev/null
+++ b/drivers/scsi/spraid/spraid_main.c
@@ -0,0 +1,3875 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Ramaxel Memory Technology, Ltd */
+
+/* Ramaxel Raid SPXXX Series Linux Driver */
+
+#define pr_fmt(fmt) "spraid: " fmt
+
+#include <linux/sched/signal.h>
+#include <linux/version.h>
+#include <linux/pci.h>
+#include <linux/aer.h>
+#include <linux/module.h>
+#include <linux/ioport.h>
+#include <linux/device.h>
+#include <linux/delay.h>
+#include <linux/interrupt.h>
+#include <linux/cdev.h>
+#include <linux/sysfs.h>
+#include <linux/gfp.h>
+#include <linux/types.h>
+#include <linux/ratelimit.h>
+#include <linux/once.h>
+#include <linux/debugfs.h>
+#include <linux/io-64-nonatomic-lo-hi.h>
+#include <linux/blkdev.h>
+#include <linux/bsg-lib.h>
+#include <asm/unaligned.h>
+#include <linux/sort.h>
+#include <target/target_core_backend.h>
+
+#include <scsi/scsi.h>
+#include <scsi/scsi_cmnd.h>
+#include <scsi/scsi_device.h>
+#include <scsi/scsi_host.h>
+#include <scsi/scsi_transport.h>
+#include <scsi/scsi_dbg.h>
+
+
+#include "spraid.h"
+
+static u32 admin_tmout = 60;
+module_param(admin_tmout, uint, 0644);
+MODULE_PARM_DESC(admin_tmout, "admin commands timeout (seconds)");
+
+static u32 scmd_tmout_nonpt = 180;
+module_param(scmd_tmout_nonpt, uint, 0644);
+MODULE_PARM_DESC(scmd_tmout_nonpt,
+ "scsi commands timeout for rawdisk&raid(seconds)");
+
+static int ioq_depth_set(const char *val, const struct kernel_param *kp);
+static const struct kernel_param_ops ioq_depth_ops = {
+ .set = ioq_depth_set,
+ .get = param_get_uint,
+};
+
+static u32 io_queue_depth = 1024;
+module_param_cb(io_queue_depth, &ioq_depth_ops, &io_queue_depth, 0644);
+MODULE_PARM_DESC(io_queue_depth, "set io queue depth, should >= 2");
+
+static int log_debug_switch_set(const char *val, const struct kernel_param *kp)
+{
+ u8 n = 0;
+ int ret;
+
+ ret = kstrtou8(val, 10, &n);
+ if (ret != 0)
+ return -EINVAL;
+
+ return param_set_byte(val, kp);
+}
+
+static const struct kernel_param_ops log_debug_switch_ops = {
+ .set = log_debug_switch_set,
+ .get = param_get_byte,
+};
+
+static unsigned char log_debug_switch;
+module_param_cb(log_debug_switch, &log_debug_switch_ops,
+ &log_debug_switch, 0644);
+MODULE_PARM_DESC(log_debug_switch,
+ "set log state, default non-zero for switch on");
+
+static int small_pool_num_set(const char *val, const struct kernel_param *kp)
+{
+ u8 n = 0;
+ int ret;
+
+ ret = kstrtou8(val, 10, &n);
+ if (ret != 0)
+ return -EINVAL;
+ if (n > MAX_SMALL_POOL_NUM)
+ n = MAX_SMALL_POOL_NUM;
+ if (n < 1)
+ n = 1;
+ *((u8 *)kp->arg) = n;
+
+ return 0;
+}
+
+static const struct kernel_param_ops small_pool_num_ops = {
+ .set = small_pool_num_set,
+ .get = param_get_byte,
+};
+
+/* A single pool's spinlock was found to be heavily contended
+ * across multiple CPUs, so several pools are used to reduce
+ * the contention.
+ */
+static unsigned char small_pool_num = 4;
+module_param_cb(small_pool_num, &small_pool_num_ops, &small_pool_num, 0644);
+MODULE_PARM_DESC(small_pool_num, "set prp small pool num, default 4, MAX 16");
+
+static void spraid_free_queue(struct spraid_queue *spraidq);
+static void spraid_handle_aen_notice(struct spraid_dev *hdev, u32 result);
+static void spraid_handle_aen_vs(struct spraid_dev *hdev,
+ u32 result, u32 result1);
+
+static DEFINE_IDA(spraid_instance_ida);
+
+static struct class *spraid_class;
+
+#define SPRAID_CAP_TIMEOUT_UNIT_MS (HZ / 2)
+
+static struct workqueue_struct *spraid_wq;
+
+#define dev_log_dbg(dev, fmt, ...) do { \
+ if (unlikely(log_debug_switch)) \
+ dev_info(dev, "[%s] [%d] " fmt, \
+ __func__, __LINE__, ##__VA_ARGS__); \
+} while (0)
+
+#define SPRAID_DRV_VERSION "1.0.0.0"
+
+#define ADMIN_TIMEOUT (admin_tmout * HZ)
+#define ADMIN_ERR_TIMEOUT 32757
+
+#define SPRAID_WAIT_ABNL_CMD_TIMEOUT (3 * 2)
+
+#define SPRAID_DMA_MSK_BIT_MAX 64
+
+enum FW_STAT_CODE {
+ FW_STAT_OK = 0,
+ FW_STAT_NEED_CHECK,
+ FW_STAT_ERROR,
+ FW_STAT_EP_PCIE_ERROR,
+ FW_STAT_NAC_DMA_ERROR,
+ FW_STAT_ABORTED,
+ FW_STAT_NEED_RETRY
+};
+
+static const char * const raid_levels[] = {"0", "1", "5", "6", "10", "50", "60",
+ "NA"};
+
+static const char * const raid_states[] = {
+ "NA", "NORMAL", "FAULT", "DEGRADE", "NOT_FORMATTED",
+ "FORMATTING", "SANITIZING", "INITIALIZING", "INITIALIZE_FAIL",
+ "DELETING", "DELETE_FAIL", "WRITE_PROTECT"
+};
+
+static int ioq_depth_set(const char *val, const struct kernel_param *kp)
+{
+ int n = 0;
+ int ret;
+
+ ret = kstrtoint(val, 10, &n);
+ if (ret != 0 || n < 2)
+ return -EINVAL;
+
+ return param_set_int(val, kp);
+}
+
+static int spraid_remap_bar(struct spraid_dev *hdev, u32 size)
+{
+ struct pci_dev *pdev = hdev->pdev;
+
+ if (size > pci_resource_len(pdev, 0)) {
+ dev_err(hdev->dev, "Input size[%u] exceed bar0 length[%llu]\n",
+ size, pci_resource_len(pdev, 0));
+ return -ENOMEM;
+ }
+
+ if (hdev->bar)
+ iounmap(hdev->bar);
+
+ hdev->bar = ioremap(pci_resource_start(pdev, 0), size);
+ if (!hdev->bar) {
+ dev_err(hdev->dev, "ioremap for bar0 failed\n");
+ return -ENOMEM;
+ }
+ hdev->dbs = hdev->bar + SPRAID_REG_DBS;
+
+ return 0;
+}
+
+static int spraid_dev_map(struct spraid_dev *hdev)
+{
+ struct pci_dev *pdev = hdev->pdev;
+ int ret;
+
+ ret = pci_request_mem_regions(pdev, "spraid");
+ if (ret) {
+ dev_err(hdev->dev, "fail to request memory regions\n");
+ return ret;
+ }
+
+ ret = spraid_remap_bar(hdev, SPRAID_REG_DBS + 4096);
+ if (ret) {
+ pci_release_mem_regions(pdev);
+ return ret;
+ }
+
+ return 0;
+}
+
+static void spraid_dev_unmap(struct spraid_dev *hdev)
+{
+ struct pci_dev *pdev = hdev->pdev;
+
+ if (hdev->bar) {
+ iounmap(hdev->bar);
+ hdev->bar = NULL;
+ }
+ pci_release_mem_regions(pdev);
+}
+
+static int spraid_pci_enable(struct spraid_dev *hdev)
+{
+ struct pci_dev *pdev = hdev->pdev;
+ int ret = -ENOMEM;
+ u64 maskbit = SPRAID_DMA_MSK_BIT_MAX;
+
+ if (pci_enable_device_mem(pdev)) {
+ dev_err(hdev->dev,
+ "Enable pci device memory resources failed\n");
+ return ret;
+ }
+ pci_set_master(pdev);
+
+ if (readl(hdev->bar + SPRAID_REG_CSTS) == U32_MAX) {
+ ret = -ENODEV;
+ dev_err(hdev->dev, "Read csts register failed\n");
+ goto disable;
+ }
+
+ ret = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_ALL_TYPES);
+ if (ret < 0) {
+ dev_err(hdev->dev,
+ "Allocate one IRQ for setup admin channel failed\n");
+ goto disable;
+ }
+
+ hdev->cap = lo_hi_readq(hdev->bar + SPRAID_REG_CAP);
+ hdev->ioq_depth = min_t(u32, SPRAID_CAP_MQES(hdev->cap) + 1,
+ io_queue_depth);
+ hdev->db_stride = 1 << SPRAID_CAP_STRIDE(hdev->cap);
+
+ maskbit = SPRAID_CAP_DMAMASK(hdev->cap);
+ if (maskbit < 32 || maskbit > SPRAID_DMA_MSK_BIT_MAX) {
+ dev_err(hdev->dev,
+ "err, dma mask invalid[%llu], set to default\n",
+ maskbit);
+ maskbit = SPRAID_DMA_MSK_BIT_MAX;
+ }
+ if (dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(maskbit))) {
+ dev_err(hdev->dev, "set dma mask and coherent failed\n");
+ goto disable;
+ }
+
+ dev_info(hdev->dev, "set dma mask[%llu] success\n", maskbit);
+
+ pci_enable_pcie_error_reporting(pdev);
+ pci_save_state(pdev);
+
+ return 0;
+
+disable:
+ pci_disable_device(pdev);
+ return ret;
+}
+
+static int spraid_npages_prp(u32 size, struct spraid_dev *hdev)
+{
+ u32 nprps = DIV_ROUND_UP(size + hdev->page_size, hdev->page_size);
+
+ return DIV_ROUND_UP(PRP_ENTRY_SIZE * nprps, PAGE_SIZE - PRP_ENTRY_SIZE);
+}
+
+static int spraid_npages_sgl(u32 nseg)
+{
+ return DIV_ROUND_UP(nseg * sizeof(struct spraid_sgl_desc), PAGE_SIZE);
+}
+
+static void **spraid_iod_list(struct spraid_iod *iod)
+{
+ return (void **)(iod->inline_sg + (iod->sg_drv_mgmt ? iod->nsge : 0));
+}
+
+static u32 spraid_iod_ext_size(struct spraid_dev *hdev, u32 size, u32 nsge,
+ bool sg_drv_mgmt, bool use_sgl)
+{
+ size_t alloc_size, sg_size;
+
+ if (use_sgl)
+ alloc_size = sizeof(__le64 *) * spraid_npages_sgl(nsge);
+ else
+ alloc_size = sizeof(__le64 *) * spraid_npages_prp(size, hdev);
+
+ sg_size = sg_drv_mgmt ? (sizeof(struct scatterlist) * nsge) : 0;
+ return sg_size + alloc_size;
+}
+
+static u32 spraid_cmd_size(struct spraid_dev *hdev,
+ bool sg_drv_mgmt, bool use_sgl)
+{
+ u32 alloc_size = spraid_iod_ext_size(hdev, SPRAID_INT_BYTES(hdev),
+ SPRAID_INT_PAGES, sg_drv_mgmt, use_sgl);
+
+ dev_info(hdev->dev, "sg_drv_mgmt: %s, use_sgl: %s, iod size: %lu;"
+ " alloc_size: %u\n", sg_drv_mgmt ? "true" : "false",
+ use_sgl ? "true" : "false",
+ sizeof(struct spraid_iod), alloc_size);
+
+ return sizeof(struct spraid_iod) + alloc_size;
+}
+
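+/*
+ * Build the chained PRP list for an I/O spanning multiple pages: list
+ * pages come from a dma_pool, and when one fills up its last entry is
+ * replaced with the DMA address of the next list page.
+ */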
+static int spraid_setup_prps(struct spraid_dev *hdev, struct spraid_iod *iod)
+{
+ struct scatterlist *sg = iod->sg;
+ u64 dma_addr = sg_dma_address(sg);
+ int dma_len = sg_dma_len(sg);
+ __le64 *prp_list, *old_prp_list;
+ u32 page_size = hdev->page_size;
+ int offset = dma_addr & (page_size - 1);
+ void **list = spraid_iod_list(iod);
+ int length = iod->length;
+ struct dma_pool *pool;
+ dma_addr_t prp_dma;
+ int nprps, i;
+
+ length -= (page_size - offset);
+ if (length <= 0) {
+ iod->first_dma = 0;
+ return 0;
+ }
+
+ dma_len -= (page_size - offset);
+ if (dma_len) {
+ dma_addr += (page_size - offset);
+ } else {
+ sg = sg_next(sg);
+ dma_addr = sg_dma_address(sg);
+ dma_len = sg_dma_len(sg);
+ }
+
+ if (length <= page_size) {
+ iod->first_dma = dma_addr;
+ return 0;
+ }
+
+ nprps = DIV_ROUND_UP(length, page_size);
+ if (nprps <= (SMALL_POOL_SIZE / PRP_ENTRY_SIZE)) {
+ pool = iod->spraidq->prp_small_pool;
+ iod->npages = 0;
+ } else {
+ pool = hdev->prp_page_pool;
+ iod->npages = 1;
+ }
+
+ prp_list = dma_pool_alloc(pool, GFP_ATOMIC, &prp_dma);
+ if (!prp_list) {
+ dev_err_ratelimited(hdev->dev,
+ "Allocate first prp_list memory failed\n");
+ iod->first_dma = dma_addr;
+ iod->npages = -1;
+ return -ENOMEM;
+ }
+ list[0] = prp_list;
+ iod->first_dma = prp_dma;
+ i = 0;
+ for (;;) {
+ if (i == page_size / PRP_ENTRY_SIZE) {
+ old_prp_list = prp_list;
+
+ prp_list = dma_pool_alloc(pool, GFP_ATOMIC, &prp_dma);
+ if (!prp_list) {
+ dev_err_ratelimited(hdev->dev, "Allocate %dth;"
+ " prp_list memory failed\n",
+ iod->npages + 1);
+ return -ENOMEM;
+ }
+ list[iod->npages++] = prp_list;
+ prp_list[0] = old_prp_list[i - 1];
+ old_prp_list[i - 1] = cpu_to_le64(prp_dma);
+ i = 1;
+ }
+ prp_list[i++] = cpu_to_le64(dma_addr);
+ dma_len -= page_size;
+ dma_addr += page_size;
+ length -= page_size;
+ if (length <= 0)
+ break;
+ if (dma_len > 0)
+ continue;
+ if (unlikely(dma_len < 0))
+ goto bad_sgl;
+ sg = sg_next(sg);
+ dma_addr = sg_dma_address(sg);
+ dma_len = sg_dma_len(sg);
+ }
+
+ return 0;
+
+bad_sgl:
+ dev_err(hdev->dev,
+ "Setup prps, invalid SGL for payload: %d nents: %d\n",
+ iod->length, iod->nsge);
+ return -EIO;
+}
+
+#define SGES_PER_PAGE (PAGE_SIZE / sizeof(struct spraid_sgl_desc))
+
+static void spraid_submit_cmd(struct spraid_queue *spraidq, const void *cmd)
+{
+ u32 sqes = SQE_SIZE(spraidq->qid);
+ unsigned long flags;
+ struct spraid_admin_common_command *acd =
+ (struct spraid_admin_common_command *)cmd;
+
+ spin_lock_irqsave(&spraidq->sq_lock, flags);
+ memcpy((spraidq->sq_cmds + sqes * spraidq->sq_tail), cmd, sqes);
+ if (++spraidq->sq_tail == spraidq->q_depth)
+ spraidq->sq_tail = 0;
+
+ writel(spraidq->sq_tail, spraidq->q_db);
+ spin_unlock_irqrestore(&spraidq->sq_lock, flags);
+
+ dev_log_dbg(spraidq->hdev->dev,
+ "cid[%d] qid[%d], opcode[0x%x], flags[0x%x], hdid[%u]\n",
+ acd->command_id, spraidq->qid, acd->opcode,
+ acd->flags, le32_to_cpu(acd->hdid));
+}
+
+static u32 spraid_mod64(u64 dividend, u32 divisor)
+{
+ u64 d;
+ u32 remainder;
+
+ if (!divisor)
+ pr_err("DIVISOR is zero, in div fn\n");
+
+ d = dividend;
+ remainder = do_div(d, divisor);
+ return remainder;
+}
+
+static inline bool spraid_is_rw_scmd(struct scsi_cmnd *scmd)
+{
+ switch (scmd->cmnd[0]) {
+ case READ_6:
+ case READ_10:
+ case READ_12:
+ case READ_16:
+ case READ_32:
+ case WRITE_6:
+ case WRITE_10:
+ case WRITE_12:
+ case WRITE_16:
+ case WRITE_32:
+ return true;
+ default:
+ return false;
+ }
+}
+
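+/*
+ * PRP addressing only works if every middle element is page aligned and
+ * a whole multiple of the page size, the first element ends on a page
+ * boundary and the last one starts on one; anything else falls back to
+ * SGL descriptors.
+ */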
+static bool spraid_is_prp(struct spraid_dev *hdev,
+ struct scsi_cmnd *scmd, u32 nsge)
+{
+ struct scatterlist *sg = scsi_sglist(scmd);
+ u32 page_size = hdev->page_size;
+ bool is_prp = true;
+ int i = 0;
+
+ scsi_for_each_sg(scmd, sg, nsge, i) {
+ if (i != 0 && i != nsge - 1) {
+ if (spraid_mod64(sg_dma_len(sg), page_size) ||
+ spraid_mod64(sg_dma_address(sg), page_size)) {
+ is_prp = false;
+ break;
+ }
+ }
+
+ if (nsge > 1 && i == 0) {
+ if ((spraid_mod64((sg_dma_address(sg) + sg_dma_len(sg)),
+ page_size))) {
+ is_prp = false;
+ break;
+ }
+ }
+
+ if (nsge > 1 && i == (nsge - 1)) {
+ if (spraid_mod64(sg_dma_address(sg), page_size)) {
+ is_prp = false;
+ break;
+ }
+ }
+ }
+
+ return is_prp;
+}
+
+enum {
+ SPRAID_SGL_FMT_DATA_DESC = 0x00,
+ SPRAID_SGL_FMT_SEG_DESC = 0x02,
+ SPRAID_SGL_FMT_LAST_SEG_DESC = 0x03,
+ SPRAID_KEY_SGL_FMT_DATA_DESC = 0x04,
+ SPRAID_TRANSPORT_SGL_DATA_DESC = 0x05
+};
+
+static void spraid_sgl_set_data(struct spraid_sgl_desc *sge,
+ struct scatterlist *sg)
+{
+ sge->addr = cpu_to_le64(sg_dma_address(sg));
+ sge->length = cpu_to_le32(sg_dma_len(sg));
+ sge->type = SPRAID_SGL_FMT_DATA_DESC << 4;
+}
+
+static void spraid_sgl_set_seg(struct spraid_sgl_desc *sge,
+ dma_addr_t dma_addr, int entries)
+{
+ sge->addr = cpu_to_le64(dma_addr);
+ if (entries <= SGES_PER_PAGE) {
+ sge->length = cpu_to_le32(entries * sizeof(*sge));
+ sge->type = SPRAID_SGL_FMT_LAST_SEG_DESC << 4;
+ } else {
+ sge->length = cpu_to_le32(PAGE_SIZE);
+ sge->type = SPRAID_SGL_FMT_SEG_DESC << 4;
+ }
+}
+
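+/*
+ * Describe the scatterlist with hardware SGL descriptors: a single
+ * element is embedded in the command itself, longer lists live in DMA
+ * pool pages chained via SEG/LAST_SEG segment descriptors.
+ */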
+static int spraid_setup_ioq_cmd_sgl(struct spraid_dev *hdev,
+ struct scsi_cmnd *scmd,
+ struct spraid_ioq_command *ioq_cmd,
+ struct spraid_iod *iod)
+{
+ struct spraid_sgl_desc *sg_list, *link, *old_sg_list;
+ struct scatterlist *sg = scsi_sglist(scmd);
+ void **list = spraid_iod_list(iod);
+ struct dma_pool *pool;
+ int nsge = iod->nsge;
+ dma_addr_t sgl_dma;
+ int i = 0;
+
+ ioq_cmd->common.flags |= SPRAID_CMD_FLAG_SGL_METABUF;
+
+ if (nsge == 1) {
+ spraid_sgl_set_data(&ioq_cmd->common.dptr.sgl, sg);
+ return 0;
+ }
+
+ if (nsge <= (SMALL_POOL_SIZE / sizeof(struct spraid_sgl_desc))) {
+ pool = iod->spraidq->prp_small_pool;
+ iod->npages = 0;
+ } else {
+ pool = hdev->prp_page_pool;
+ iod->npages = 1;
+ }
+
+ sg_list = dma_pool_alloc(pool, GFP_ATOMIC, &sgl_dma);
+ if (!sg_list) {
+ dev_err_ratelimited(hdev->dev,
+ "Allocate first sgl_list failed\n");
+ iod->npages = -1;
+ return -ENOMEM;
+ }
+
+ list[0] = sg_list;
+ iod->first_dma = sgl_dma;
+ spraid_sgl_set_seg(&ioq_cmd->common.dptr.sgl, sgl_dma, nsge);
+ do {
+ if (i == SGES_PER_PAGE) {
+ old_sg_list = sg_list;
+ link = &old_sg_list[SGES_PER_PAGE - 1];
+
+ sg_list = dma_pool_alloc(pool, GFP_ATOMIC, &sgl_dma);
+ if (!sg_list) {
+				dev_err_ratelimited(hdev->dev,
+						    "Allocate %dth sgl_list failed\n",
+						    iod->npages + 1);
+ return -ENOMEM;
+ }
+ list[iod->npages++] = sg_list;
+
+ i = 0;
+ memcpy(&sg_list[i++], link, sizeof(*link));
+ spraid_sgl_set_seg(link, sgl_dma, nsge);
+ }
+
+ spraid_sgl_set_data(&sg_list[i++], sg);
+ sg = sg_next(sg);
+ } while (--nsge > 0);
+
+ return 0;
+}
+
+#define SPRAID_RW_FUA BIT(14)
+
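+/*
+ * Translate a SCSI READ/WRITE CDB (6/10/12/16/32 byte variants) into
+ * the firmware rw command: extract the LBA, the transfer length and the
+ * FUA bit. A zero length in a 6-byte CDB means 256 blocks per SBC.
+ */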
+static void spraid_setup_rw_cmd(struct spraid_dev *hdev,
+ struct spraid_rw_command *rw,
+ struct scsi_cmnd *scmd)
+{
+ u32 start_lba_lo, start_lba_hi;
+ u32 datalength = 0;
+ u16 control = 0;
+
+ start_lba_lo = 0;
+ start_lba_hi = 0;
+
+ if (scmd->sc_data_direction == DMA_TO_DEVICE) {
+ rw->opcode = SPRAID_CMD_WRITE;
+ } else if (scmd->sc_data_direction == DMA_FROM_DEVICE) {
+ rw->opcode = SPRAID_CMD_READ;
+ } else {
+ dev_err(hdev->dev,
+ "Invalid IO for unsupported data direction: %d\n",
+ scmd->sc_data_direction);
+ WARN_ON(1);
+ }
+
+ /* 6-byte READ(0x08) or WRITE(0x0A) cdb */
+ if (scmd->cmd_len == 6) {
+ datalength = (u32)(scmd->cmnd[4] == 0 ?
+ IO_6_DEFAULT_TX_LEN : scmd->cmnd[4]);
+ start_lba_lo = (u32)get_unaligned_be24(&scmd->cmnd[1]);
+
+ start_lba_lo &= 0x1FFFFF;
+ }
+
+ /* 10-byte READ(0x28) or WRITE(0x2A) cdb */
+ else if (scmd->cmd_len == 10) {
+ datalength = (u32)get_unaligned_be16(&scmd->cmnd[7]);
+ start_lba_lo = get_unaligned_be32(&scmd->cmnd[2]);
+
+ if (scmd->cmnd[1] & FUA_MASK)
+ control |= SPRAID_RW_FUA;
+ }
+
+ /* 12-byte READ(0xA8) or WRITE(0xAA) cdb */
+ else if (scmd->cmd_len == 12) {
+ datalength = get_unaligned_be32(&scmd->cmnd[6]);
+ start_lba_lo = get_unaligned_be32(&scmd->cmnd[2]);
+
+ if (scmd->cmnd[1] & FUA_MASK)
+ control |= SPRAID_RW_FUA;
+ }
+ /* 16-byte READ(0x88) or WRITE(0x8A) cdb */
+ else if (scmd->cmd_len == 16) {
+ datalength = get_unaligned_be32(&scmd->cmnd[10]);
+ start_lba_lo = get_unaligned_be32(&scmd->cmnd[6]);
+ start_lba_hi = get_unaligned_be32(&scmd->cmnd[2]);
+
+ if (scmd->cmnd[1] & FUA_MASK)
+ control |= SPRAID_RW_FUA;
+ }
+	/* 32-byte READ(32) or WRITE(32) cdb (variable-length opcode 0x7F) */
+ else if (scmd->cmd_len == 32) {
+ datalength = get_unaligned_be32(&scmd->cmnd[28]);
+ start_lba_lo = get_unaligned_be32(&scmd->cmnd[16]);
+ start_lba_hi = get_unaligned_be32(&scmd->cmnd[12]);
+
+ if (scmd->cmnd[10] & FUA_MASK)
+ control |= SPRAID_RW_FUA;
+ }
+
+ if (unlikely(datalength > U16_MAX || datalength == 0)) {
+ dev_err(hdev->dev,
+ "Invalid IO for illegal transfer data length: %u\n",
+ datalength);
+ WARN_ON(1);
+ }
+
+ rw->slba = cpu_to_le64(((u64)start_lba_hi << 32) | start_lba_lo);
+ /* 0base for nlb */
+ rw->nlb = cpu_to_le16((u16)(datalength - 1));
+ rw->control = cpu_to_le16(control);
+}
+
+static void spraid_setup_nonio_cmd(struct spraid_dev *hdev,
+ struct spraid_scsi_nonio *scsi_nonio,
+ struct scsi_cmnd *scmd)
+{
+ scsi_nonio->buffer_len = cpu_to_le32(scsi_bufflen(scmd));
+
+ switch (scmd->sc_data_direction) {
+ case DMA_NONE:
+ scsi_nonio->opcode = SPRAID_CMD_NONIO_NONE;
+ break;
+ case DMA_TO_DEVICE:
+ scsi_nonio->opcode = SPRAID_CMD_NONIO_TODEV;
+ break;
+ case DMA_FROM_DEVICE:
+ scsi_nonio->opcode = SPRAID_CMD_NONIO_FROMDEV;
+ break;
+ default:
+ dev_err(hdev->dev,
+ "Invalid IO for unsupported data direction: %d\n",
+ scmd->sc_data_direction);
+ WARN_ON(1);
+ }
+}
+
+static void spraid_setup_ioq_cmd(struct spraid_dev *hdev,
+ struct spraid_ioq_command *ioq_cmd,
+ struct scsi_cmnd *scmd)
+{
+ memcpy(ioq_cmd->common.cdb, scmd->cmnd, scmd->cmd_len);
+ ioq_cmd->common.cdb_len = scmd->cmd_len;
+
+ if (spraid_is_rw_scmd(scmd))
+ spraid_setup_rw_cmd(hdev, &ioq_cmd->rw, scmd);
+ else
+ spraid_setup_nonio_cmd(hdev, &ioq_cmd->scsi_nonio, scmd);
+}
+
+static int spraid_init_iod(struct spraid_dev *hdev, struct spraid_iod *iod,
+ struct spraid_ioq_command *ioq_cmd,
+ struct scsi_cmnd *scmd)
+{
+ if (unlikely(!iod->sense)) {
+ dev_err(hdev->dev, "Allocate sense data buffer failed\n");
+ return -ENOMEM;
+ }
+ ioq_cmd->common.sense_addr = cpu_to_le64(iod->sense_dma);
+ ioq_cmd->common.sense_len = cpu_to_le16(SCSI_SENSE_BUFFERSIZE);
+
+ iod->nsge = 0;
+ iod->npages = -1;
+ iod->use_sgl = 0;
+ iod->sg_drv_mgmt = false;
+ WRITE_ONCE(iod->state, SPRAID_CMD_IDLE);
+
+ return 0;
+}
+
+static void spraid_free_iod_res(struct spraid_dev *hdev, struct spraid_iod *iod)
+{
+ const int last_prp = hdev->page_size / sizeof(__le64) - 1;
+ dma_addr_t dma_addr, next_dma_addr;
+ struct spraid_sgl_desc *sg_list;
+ __le64 *prp_list;
+ void *addr;
+ int i;
+
+ dma_addr = iod->first_dma;
+ if (iod->npages == 0)
+ dma_pool_free(iod->spraidq->prp_small_pool,
+ spraid_iod_list(iod)[0], dma_addr);
+
+ for (i = 0; i < iod->npages; i++) {
+ addr = spraid_iod_list(iod)[i];
+
+ if (iod->use_sgl) {
+ sg_list = addr;
+ next_dma_addr =
+ le64_to_cpu((sg_list[SGES_PER_PAGE - 1]).addr);
+ } else {
+ prp_list = addr;
+ next_dma_addr = le64_to_cpu(prp_list[last_prp]);
+ }
+
+ dma_pool_free(hdev->prp_page_pool, addr, dma_addr);
+ dma_addr = next_dma_addr;
+ }
+
+ if (iod->sg_drv_mgmt && iod->sg != iod->inline_sg) {
+ iod->sg_drv_mgmt = false;
+ mempool_free(iod->sg, hdev->iod_mempool);
+ }
+
+ iod->sense = NULL;
+ iod->npages = -1;
+}
+
+static int spraid_io_map_data(struct spraid_dev *hdev, struct spraid_iod *iod,
+ struct scsi_cmnd *scmd,
+ struct spraid_ioq_command *ioq_cmd)
+{
+ int ret;
+
+ iod->nsge = scsi_dma_map(scmd);
+
+ /* No data to DMA, it may be scsi no-rw command */
+ if (unlikely(iod->nsge == 0))
+ return 0;
+
+ iod->length = scsi_bufflen(scmd);
+ iod->sg = scsi_sglist(scmd);
+ iod->use_sgl = !spraid_is_prp(hdev, scmd, iod->nsge);
+
+ if (iod->use_sgl) {
+ ret = spraid_setup_ioq_cmd_sgl(hdev, scmd, ioq_cmd, iod);
+ } else {
+ ret = spraid_setup_prps(hdev, iod);
+ ioq_cmd->common.dptr.prp1 =
+ cpu_to_le64(sg_dma_address(iod->sg));
+ ioq_cmd->common.dptr.prp2 = cpu_to_le64(iod->first_dma);
+ }
+
+ if (ret)
+ scsi_dma_unmap(scmd);
+
+ return ret;
+}
+
+static void spraid_map_status(struct spraid_iod *iod, struct scsi_cmnd *scmd,
+ struct spraid_completion *cqe)
+{
+ scsi_set_resid(scmd, 0);
+
+ switch ((le16_to_cpu(cqe->status) >> 1) & 0x7f) {
+ case FW_STAT_OK:
+ set_host_byte(scmd, DID_OK);
+ break;
+ case FW_STAT_NEED_CHECK:
+ set_host_byte(scmd, DID_OK);
+ scmd->result |= le16_to_cpu(cqe->status) >> 8;
+ if (scmd->result & SAM_STAT_CHECK_CONDITION) {
+ memset(scmd->sense_buffer, 0, SCSI_SENSE_BUFFERSIZE);
+ memcpy(scmd->sense_buffer, iod->sense,
+ SCSI_SENSE_BUFFERSIZE);
+ scmd->result =
+ (scmd->result & 0x00ffffff) | (DRIVER_SENSE << 24);
+ }
+ break;
+ case FW_STAT_ABORTED:
+ set_host_byte(scmd, DID_ABORT);
+ break;
+ case FW_STAT_NEED_RETRY:
+ set_host_byte(scmd, DID_REQUEUE);
+ break;
+ default:
+ set_host_byte(scmd, DID_BAD_TARGET);
+ break;
+ }
+}
+
+static inline void spraid_get_tag_from_scmd(struct scsi_cmnd *scmd,
+ u16 *qid, u16 *cid)
+{
+ u32 tag = blk_mq_unique_tag(scmd->request);
+
+ *qid = blk_mq_unique_tag_to_hwq(tag) + 1;
+ *cid = blk_mq_unique_tag_to_tag(tag);
+}
+
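+/*
+ * .queuecommand entry point: derive the hardware queue and command id
+ * from the block layer tag, carve out this command's sense buffer
+ * slice, build the ioq command, map the data and post it to the
+ * submission queue.
+ */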
+static int spraid_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
+{
+ struct spraid_iod *iod = scsi_cmd_priv(scmd);
+ struct spraid_dev *hdev = shost_priv(shost);
+ struct scsi_device *sdev = scmd->device;
+ struct spraid_sdev_hostdata *hostdata;
+ struct spraid_ioq_command ioq_cmd;
+ struct spraid_queue *ioq;
+ unsigned long elapsed;
+ u16 hwq, cid;
+ int ret;
+
+ if (unlikely(!scmd)) {
+ dev_err(hdev->dev, "err, scmd is null\n");
+ return 0;
+ }
+
+ if (unlikely(hdev->state != SPRAID_LIVE)) {
+ set_host_byte(scmd, DID_NO_CONNECT);
+ scmd->scsi_done(scmd);
+ return 0;
+ }
+
+ if (log_debug_switch)
+ scsi_print_command(scmd);
+
+ spraid_get_tag_from_scmd(scmd, &hwq, &cid);
+ hostdata = sdev->hostdata;
+ ioq = &hdev->queues[hwq];
+ memset(&ioq_cmd, 0, sizeof(ioq_cmd));
+ ioq_cmd.rw.hdid = cpu_to_le32(hostdata->hdid);
+ ioq_cmd.rw.command_id = cid;
+
+ spraid_setup_ioq_cmd(hdev, &ioq_cmd, scmd);
+
+ ret = cid * SCSI_SENSE_BUFFERSIZE;
+ iod->sense = ioq->sense + ret;
+ iod->sense_dma = ioq->sense_dma_addr + ret;
+
+ ret = spraid_init_iod(hdev, iod, &ioq_cmd, scmd);
+ if (unlikely(ret))
+ return SCSI_MLQUEUE_HOST_BUSY;
+
+ iod->spraidq = ioq;
+ ret = spraid_io_map_data(hdev, iod, scmd, &ioq_cmd);
+ if (unlikely(ret)) {
+ dev_err(hdev->dev, "spraid_io_map_data Err.\n");
+ set_host_byte(scmd, DID_ERROR);
+ scmd->scsi_done(scmd);
+ ret = 0;
+ goto deinit_iod;
+ }
+
+ WRITE_ONCE(iod->state, SPRAID_CMD_IN_FLIGHT);
+ spraid_submit_cmd(ioq, &ioq_cmd);
+ elapsed = jiffies - scmd->jiffies_at_alloc;
+ dev_log_dbg(hdev->dev,
+ "cid[%d] qid[%d] submit IO cost %3ld.%3ld seconds\n",
+ cid, hwq, elapsed / HZ, elapsed % HZ);
+ return 0;
+
+deinit_iod:
+ spraid_free_iod_res(hdev, iod);
+ return ret;
+}
+
+static int spraid_match_dev(struct spraid_dev *hdev, u16 idx,
+ struct scsi_device *sdev)
+{
+ if (SPRAID_DEV_INFO_FLAG_VALID(hdev->devices[idx].flag)) {
+ if (sdev->channel == hdev->devices[idx].channel &&
+ sdev->id == le16_to_cpu(hdev->devices[idx].target) &&
+ sdev->lun < hdev->devices[idx].lun) {
+ dev_info(hdev->dev,
+ "Match device success, channel;"
+ "target:lun[%d:%d:%d]\n",
+ hdev->devices[idx].channel,
+ hdev->devices[idx].target,
+ hdev->devices[idx].lun);
+ return 1;
+ }
+ }
+
+ return 0;
+}
+
+static int spraid_slave_alloc(struct scsi_device *sdev)
+{
+ struct spraid_sdev_hostdata *hostdata;
+ struct spraid_dev *hdev;
+ u16 idx;
+
+ hdev = shost_priv(sdev->host);
+ hostdata = kzalloc(sizeof(*hostdata), GFP_KERNEL);
+ if (!hostdata) {
+ dev_err(hdev->dev, "Alloc scsi host data memory failed\n");
+ return -ENOMEM;
+ }
+
+ down_read(&hdev->devices_rwsem);
+ for (idx = 0; idx < le32_to_cpu(hdev->ctrl_info->nd); idx++) {
+ if (spraid_match_dev(hdev, idx, sdev))
+ goto scan_host;
+ }
+ up_read(&hdev->devices_rwsem);
+
+ kfree(hostdata);
+ return -ENXIO;
+
+scan_host:
+ hostdata->hdid = le32_to_cpu(hdev->devices[idx].hdid);
+ hostdata->max_io_kb = le16_to_cpu(hdev->devices[idx].max_io_kb);
+ hostdata->attr = hdev->devices[idx].attr;
+ hostdata->flag = hdev->devices[idx].flag;
+ hostdata->rg_id = 0xff;
+ sdev->hostdata = hostdata;
+ up_read(&hdev->devices_rwsem);
+ return 0;
+}
+
+static void spraid_slave_destroy(struct scsi_device *sdev)
+{
+ kfree(sdev->hostdata);
+ sdev->hostdata = NULL;
+}
+
+static int spraid_slave_configure(struct scsi_device *sdev)
+{
+ u16 idx;
+ unsigned int timeout = scmd_tmout_nonpt * HZ;
+ struct spraid_dev *hdev = shost_priv(sdev->host);
+ struct spraid_sdev_hostdata *hostdata = sdev->hostdata;
+ u32 max_sec = sdev->host->max_sectors;
+
+ if (hostdata) {
+ idx = hostdata->hdid - 1;
+ if (sdev->channel == hdev->devices[idx].channel &&
+ sdev->id == le16_to_cpu(hdev->devices[idx].target) &&
+ sdev->lun < hdev->devices[idx].lun) {
+ if (SPRAID_DEV_INFO_ATTR_PT(hdev->devices[idx].attr))
+ timeout = 30 * HZ;
+ else
+ timeout = scmd_tmout_nonpt * HZ;
+ max_sec = le16_to_cpu(hdev->devices[idx].max_io_kb)
+ << 1;
+ } else {
+ dev_err(hdev->dev,
+ "[%s] err, sdev->channel:id:lun[%d:%d:%lld];"
+ "devices[%d], channel:target:lun[%d:%d:%d]\n",
+ __func__, sdev->channel, sdev->id, sdev->lun,
+ idx, hdev->devices[idx].channel,
+ hdev->devices[idx].target,
+ hdev->devices[idx].lun);
+ }
+ } else {
+ dev_err(hdev->dev, "[%s] err, sdev->hostdata is null\n",
+ __func__);
+ }
+
+ blk_queue_rq_timeout(sdev->request_queue, timeout);
+ sdev->eh_timeout = timeout;
+
+ if ((max_sec == 0) || (max_sec > sdev->host->max_sectors))
+ max_sec = sdev->host->max_sectors;
+
+ dev_info(hdev->dev,
+ "[%s] sdev->channel:id:lun[%d:%d:%lld];"
+ " scmd_timeout[%d]s, maxsec[%d]\n",
+ __func__, sdev->channel, sdev->id,
+ sdev->lun, timeout / HZ, max_sec);
+
+ return 0;
+}
+
+static void spraid_shost_init(struct spraid_dev *hdev)
+{
+ struct pci_dev *pdev = hdev->pdev;
+ u8 domain, bus;
+ u32 dev_func;
+
+ domain = pci_domain_nr(pdev->bus);
+ bus = pdev->bus->number;
+ dev_func = pdev->devfn;
+
+ hdev->shost->nr_hw_queues = hdev->online_queues - 1;
+ hdev->shost->can_queue = (hdev->ioq_depth - SPRAID_PTCMDS_PERQ);
+
+ hdev->shost->sg_tablesize = le16_to_cpu(hdev->ctrl_info->max_num_sge);
+ /* 512B per sector */
+ hdev->shost->max_sectors =
+ (1U << ((hdev->ctrl_info->mdts) * 1U) << 12) / 512;
+ hdev->shost->cmd_per_lun = MAX_CMD_PER_DEV;
+ hdev->shost->max_channel =
+ le16_to_cpu(hdev->ctrl_info->max_channel) - 1;
+ hdev->shost->max_id = le32_to_cpu(hdev->ctrl_info->max_tgt_id);
+ hdev->shost->max_lun = le16_to_cpu(hdev->ctrl_info->max_lun);
+
+ hdev->shost->this_id = -1;
+ hdev->shost->unique_id = (domain << 16) | (bus << 8) | dev_func;
+ hdev->shost->max_cmd_len = MAX_CDB_LEN;
+ hdev->shost->hostt->cmd_size = max(spraid_cmd_size(hdev, false, true),
+ spraid_cmd_size(hdev, false, false));
+}
+
+static inline void spraid_host_deinit(struct spraid_dev *hdev)
+{
+ ida_free(&spraid_instance_ida, hdev->instance);
+}
+
+static int spraid_alloc_queue(struct spraid_dev *hdev, u16 qid, u16 depth)
+{
+ struct spraid_queue *spraidq = &hdev->queues[qid];
+ int ret = 0;
+
+ if (hdev->queue_count > qid) {
+ dev_info(hdev->dev, "[%s] warn: queue[%d] is exist\n",
+ __func__, qid);
+ return 0;
+ }
+
+ spraidq->cqes = dma_alloc_coherent(hdev->dev, CQ_SIZE(depth),
+ &spraidq->cq_dma_addr,
+ GFP_KERNEL | __GFP_ZERO);
+ if (!spraidq->cqes)
+ return -ENOMEM;
+
+ spraidq->sq_cmds = dma_alloc_coherent(hdev->dev, SQ_SIZE(qid, depth),
+ &spraidq->sq_dma_addr,
+ GFP_KERNEL);
+ if (!spraidq->sq_cmds) {
+ ret = -ENOMEM;
+ goto free_cqes;
+ }
+
+ spin_lock_init(&spraidq->sq_lock);
+ spin_lock_init(&spraidq->cq_lock);
+ spraidq->hdev = hdev;
+ spraidq->q_depth = depth;
+ spraidq->qid = qid;
+ spraidq->cq_vector = -1;
+ hdev->queue_count++;
+
+ /* alloc sense buffer */
+ spraidq->sense = dma_alloc_coherent(hdev->dev, SENSE_SIZE(depth),
+ &spraidq->sense_dma_addr,
+ GFP_KERNEL | __GFP_ZERO);
+ if (!spraidq->sense) {
+ ret = -ENOMEM;
+ goto free_sq_cmds;
+ }
+
+ return 0;
+
+free_sq_cmds:
+ dma_free_coherent(hdev->dev, SQ_SIZE(qid, depth),
+ (void *)spraidq->sq_cmds, spraidq->sq_dma_addr);
+free_cqes:
+ dma_free_coherent(hdev->dev, CQ_SIZE(depth), (void *)spraidq->cqes,
+ spraidq->cq_dma_addr);
+ return ret;
+}
+
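+/*
+ * Poll CSTS.RDY until it matches the requested enable state, bounded by
+ * the timeout the controller advertises in CAP.TO.
+ */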
+static int spraid_wait_ready(struct spraid_dev *hdev, u64 cap, bool enabled)
+{
+ unsigned long timeout =
+ ((SPRAID_CAP_TIMEOUT(cap) + 1) * SPRAID_CAP_TIMEOUT_UNIT_MS) + jiffies;
+ u32 bit = enabled ? SPRAID_CSTS_RDY : 0;
+
+ while ((readl(hdev->bar + SPRAID_REG_CSTS) & SPRAID_CSTS_RDY) != bit) {
+ usleep_range(1000, 2000);
+ if (fatal_signal_pending(current))
+ return -EINTR;
+
+ if (time_after(jiffies, timeout)) {
+ dev_err(hdev->dev, "Device not ready; aborting %s\n",
+ enabled ? "initialisation" : "reset");
+ return -ENODEV;
+ }
+ }
+ return 0;
+}
+
+static int spraid_shutdown_ctrl(struct spraid_dev *hdev)
+{
+ unsigned long timeout = hdev->ctrl_info->rtd3e + jiffies;
+
+ hdev->ctrl_config &= ~SPRAID_CC_SHN_MASK;
+ hdev->ctrl_config |= SPRAID_CC_SHN_NORMAL;
+ writel(hdev->ctrl_config, hdev->bar + SPRAID_REG_CC);
+
+ while ((readl(hdev->bar + SPRAID_REG_CSTS) & SPRAID_CSTS_SHST_MASK) !=
+ SPRAID_CSTS_SHST_CMPLT) {
+ msleep(100);
+ if (fatal_signal_pending(current))
+ return -EINTR;
+ if (time_after(jiffies, timeout)) {
+ dev_err(hdev->dev,
+ "Device shutdown incomplete; abort shutdown\n");
+ return -ENODEV;
+ }
+ }
+ return 0;
+}
+
+static int spraid_disable_ctrl(struct spraid_dev *hdev)
+{
+ hdev->ctrl_config &= ~SPRAID_CC_SHN_MASK;
+ hdev->ctrl_config &= ~SPRAID_CC_ENABLE;
+ writel(hdev->ctrl_config, hdev->bar + SPRAID_REG_CC);
+
+ return spraid_wait_ready(hdev, hdev->cap, false);
+}
+
+static int spraid_enable_ctrl(struct spraid_dev *hdev)
+{
+ u64 cap = hdev->cap;
+ u32 dev_page_min = SPRAID_CAP_MPSMIN(cap) + 12;
+ u32 page_shift = PAGE_SHIFT;
+
+ if (page_shift < dev_page_min) {
+ dev_err(hdev->dev,
+ "Minimum device page size[%u], too large for host[%u]\n",
+ 1U << dev_page_min, 1U << page_shift);
+ return -ENODEV;
+ }
+
+ page_shift = min_t(unsigned int, SPRAID_CAP_MPSMAX(cap) + 12,
+ PAGE_SHIFT);
+ hdev->page_size = 1U << page_shift;
+
+ hdev->ctrl_config = SPRAID_CC_CSS_NVM;
+ hdev->ctrl_config |= (page_shift - 12) << SPRAID_CC_MPS_SHIFT;
+ hdev->ctrl_config |= SPRAID_CC_AMS_RR | SPRAID_CC_SHN_NONE;
+ hdev->ctrl_config |= SPRAID_CC_IOSQES | SPRAID_CC_IOCQES;
+ hdev->ctrl_config |= SPRAID_CC_ENABLE;
+ writel(hdev->ctrl_config, hdev->bar + SPRAID_REG_CC);
+
+ return spraid_wait_ready(hdev, cap, true);
+}
+
+static void spraid_init_queue(struct spraid_queue *spraidq, u16 qid)
+{
+ struct spraid_dev *hdev = spraidq->hdev;
+
+ memset((void *)spraidq->cqes, 0, CQ_SIZE(spraidq->q_depth));
+
+ spraidq->sq_tail = 0;
+ spraidq->cq_head = 0;
+ spraidq->cq_phase = 1;
+ spraidq->q_db = &hdev->dbs[qid * 2 * hdev->db_stride];
+ spraidq->prp_small_pool = hdev->prp_small_pool[qid % small_pool_num];
+ hdev->online_queues++;
+}
+
+static inline bool spraid_cqe_pending(struct spraid_queue *spraidq)
+{
+ return (le16_to_cpu(spraidq->cqes[spraidq->cq_head].status) & 1) ==
+ spraidq->cq_phase;
+}
+
+static void spraid_sata_report_zone_handle(struct scsi_cmnd *scmd,
+ struct spraid_iod *iod)
+{
+ int i = 0;
+ unsigned int bytes = 0;
+ struct scatterlist *sg = scsi_sglist(scmd);
+
+ scsi_for_each_sg(scmd, sg, iod->nsge, i) {
+ unsigned int offset = 0;
+
+ if (bytes == 0) {
+ char *hdr;
+ u32 list_length;
+ u64 max_lba, opt_lba;
+ u16 same;
+
+ hdr = sg_virt(sg);
+
+ list_length = get_unaligned_le32(&hdr[0]);
+ same = get_unaligned_le16(&hdr[4]);
+ max_lba = get_unaligned_le64(&hdr[8]);
+ opt_lba = get_unaligned_le64(&hdr[16]);
+ put_unaligned_be32(list_length, &hdr[0]);
+ hdr[4] = same & 0xf;
+ put_unaligned_be64(max_lba, &hdr[8]);
+ put_unaligned_be64(opt_lba, &hdr[16]);
+ offset += 64;
+ bytes += 64;
+ }
+ while (offset < sg_dma_len(sg)) {
+ char *rec;
+ u8 cond, type, non_seq, reset;
+ u64 size, start, wp;
+
+ rec = sg_virt(sg) + offset;
+ type = rec[0] & 0xf;
+ cond = (rec[1] >> 4) & 0xf;
+ non_seq = (rec[1] & 2);
+ reset = (rec[1] & 1);
+ size = get_unaligned_le64(&rec[8]);
+ start = get_unaligned_le64(&rec[16]);
+ wp = get_unaligned_le64(&rec[24]);
+ rec[0] = type;
+ rec[1] = (cond << 4) | non_seq | reset;
+ put_unaligned_be64(size, &rec[8]);
+ put_unaligned_be64(start, &rec[16]);
+ put_unaligned_be64(wp, &rec[24]);
+ WARN_ON(offset + 64 > sg_dma_len(sg));
+ offset += 64;
+ bytes += 64;
+ }
+ }
+}
+
+static inline void spraid_handle_ata_cmd(struct spraid_dev *hdev,
+ struct scsi_cmnd *scmd,
+ struct spraid_iod *iod)
+{
+ if (hdev->ctrl_info->card_type != SPRAID_CARD_HBA)
+ return;
+
+ switch (scmd->cmnd[0]) {
+ case ZBC_IN:
+ dev_info(hdev->dev, "[%s] process report zone\n", __func__);
+ spraid_sata_report_zone_handle(scmd, iod);
+ break;
+ default:
+ break;
+ }
+}
+
+static void spraid_complete_ioq_cmnd(struct spraid_queue *ioq,
+ struct spraid_completion *cqe)
+{
+ struct spraid_dev *hdev = ioq->hdev;
+ struct blk_mq_tags *tags;
+ struct scsi_cmnd *scmd;
+ struct spraid_iod *iod;
+ struct request *req;
+ unsigned long elapsed;
+
+ tags = hdev->shost->tag_set.tags[ioq->qid - 1];
+ req = blk_mq_tag_to_rq(tags, cqe->cmd_id);
+ if (unlikely(!req || !blk_mq_request_started(req))) {
+ dev_warn(hdev->dev, "Invalid id %d completed on queue %d\n",
+ cqe->cmd_id, ioq->qid);
+ return;
+ }
+
+ scmd = blk_mq_rq_to_pdu(req);
+ iod = scsi_cmd_priv(scmd);
+
+ elapsed = jiffies - scmd->jiffies_at_alloc;
+ dev_log_dbg(hdev->dev,
+ "cid[%d] qid[%d] finish IO cost %3ld.%3ld seconds\n",
+ cqe->cmd_id, ioq->qid, elapsed / HZ, elapsed % HZ);
+
+ if (cmpxchg(&iod->state, SPRAID_CMD_IN_FLIGHT, SPRAID_CMD_COMPLETE) !=
+ SPRAID_CMD_IN_FLIGHT) {
+ dev_warn(hdev->dev,
+ "cid[%d] qid[%d] enters abnormal handler;"
+ " cost %3ld.%3ld seconds\n",
+ cqe->cmd_id, ioq->qid, elapsed / HZ, elapsed % HZ);
+ WRITE_ONCE(iod->state, SPRAID_CMD_TMO_COMPLETE);
+
+ if (iod->nsge) {
+ iod->nsge = 0;
+ scsi_dma_unmap(scmd);
+ }
+ spraid_free_iod_res(hdev, iod);
+
+ return;
+ }
+
+ spraid_handle_ata_cmd(hdev, scmd, iod);
+
+ spraid_map_status(iod, scmd, cqe);
+ if (iod->nsge) {
+ iod->nsge = 0;
+ scsi_dma_unmap(scmd);
+ }
+ spraid_free_iod_res(hdev, iod);
+ scmd->scsi_done(scmd);
+}
+
+static void spraid_complete_adminq_cmnd(struct spraid_queue *adminq,
+ struct spraid_completion *cqe)
+{
+ struct spraid_dev *hdev = adminq->hdev;
+ struct spraid_cmd *adm_cmd;
+
+ adm_cmd = hdev->adm_cmds + cqe->cmd_id;
+ if (unlikely(adm_cmd->state == SPRAID_CMD_IDLE)) {
+ dev_warn(adminq->hdev->dev,
+ "Invalid id %d completed on queue %d\n",
+ cqe->cmd_id, le16_to_cpu(cqe->sq_id));
+ return;
+ }
+
+ adm_cmd->status = le16_to_cpu(cqe->status) >> 1;
+ adm_cmd->result0 = le32_to_cpu(cqe->result);
+ adm_cmd->result1 = le32_to_cpu(cqe->result1);
+
+ complete(&adm_cmd->cmd_done);
+}
+
+static void spraid_send_aen(struct spraid_dev *hdev, u16 cid);
+
+static void spraid_complete_aen(struct spraid_queue *spraidq,
+ struct spraid_completion *cqe)
+{
+ struct spraid_dev *hdev = spraidq->hdev;
+ u32 result = le32_to_cpu(cqe->result);
+
+ dev_info(hdev->dev, "rcv aen, cid[%d], status[0x%x], result[0x%x]\n",
+ cqe->cmd_id, le16_to_cpu(cqe->status) >> 1, result);
+
+ spraid_send_aen(hdev, cqe->cmd_id);
+
+ if ((le16_to_cpu(cqe->status) >> 1) != SPRAID_SC_SUCCESS)
+ return;
+ switch (result & 0x7) {
+ case SPRAID_AEN_NOTICE:
+ spraid_handle_aen_notice(hdev, result);
+ break;
+ case SPRAID_AEN_VS:
+ spraid_handle_aen_vs(hdev, result, le32_to_cpu(cqe->result1));
+ break;
+ default:
+ dev_warn(hdev->dev, "Unsupported async event type: %u\n",
+ result & 0x7);
+ break;
+ }
+}
+
+static void spraid_complete_ioq_sync_cmnd(struct spraid_queue *ioq,
+ struct spraid_completion *cqe)
+{
+ struct spraid_dev *hdev = ioq->hdev;
+ struct spraid_cmd *ptcmd;
+
+ ptcmd = hdev->ioq_ptcmds + (ioq->qid - 1) * SPRAID_PTCMDS_PERQ +
+ cqe->cmd_id - SPRAID_IO_BLK_MQ_DEPTH;
+
+ ptcmd->status = le16_to_cpu(cqe->status) >> 1;
+ ptcmd->result0 = le32_to_cpu(cqe->result);
+ ptcmd->result1 = le32_to_cpu(cqe->result1);
+
+ complete(&ptcmd->cmd_done);
+}
+
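+/*
+ * Demultiplex one completion by command id: ids at or above the blk-mq
+ * depth on the admin queue belong to AEN commands, ids at or above the
+ * io depth on an io queue belong to internal passthrough commands, and
+ * everything else is a regular admin or scsi command.
+ */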
+static inline void spraid_handle_cqe(struct spraid_queue *spraidq, u16 idx)
+{
+ struct spraid_completion *cqe = &spraidq->cqes[idx];
+ struct spraid_dev *hdev = spraidq->hdev;
+
+ if (unlikely(cqe->cmd_id >= spraidq->q_depth)) {
+ dev_err(hdev->dev,
+ "Invalid command id[%d] completed on queue %d\n",
+			cqe->cmd_id, le16_to_cpu(cqe->sq_id));
+ return;
+ }
+
+ dev_log_dbg(hdev->dev, "cid[%d] qid[%d];"
+ " result[0x%x], sq_id[%d], status[0x%x]\n",
+ cqe->cmd_id, spraidq->qid, le32_to_cpu(cqe->result),
+ le16_to_cpu(cqe->sq_id), le16_to_cpu(cqe->status));
+
+	if (unlikely(spraidq->qid == 0 &&
+		     cqe->cmd_id >= SPRAID_AQ_BLK_MQ_DEPTH)) {
+ spraid_complete_aen(spraidq, cqe);
+ return;
+ }
+
+ if (unlikely(spraidq->qid && cqe->cmd_id >= SPRAID_IO_BLK_MQ_DEPTH)) {
+ spraid_complete_ioq_sync_cmnd(spraidq, cqe);
+ return;
+ }
+
+ if (spraidq->qid)
+ spraid_complete_ioq_cmnd(spraidq, cqe);
+ else
+ spraid_complete_adminq_cmnd(spraidq, cqe);
+}
+
+static void spraid_complete_cqes(struct spraid_queue *spraidq,
+ u16 start, u16 end)
+{
+ while (start != end) {
+ spraid_handle_cqe(spraidq, start);
+ if (++start == spraidq->q_depth)
+ start = 0;
+ }
+}
+
+static inline void spraid_update_cq_head(struct spraid_queue *spraidq)
+{
+ if (++spraidq->cq_head == spraidq->q_depth) {
+ spraidq->cq_head = 0;
+ spraidq->cq_phase = !spraidq->cq_phase;
+ }
+}
+
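+/*
+ * Consume completion entries whose phase bit matches the queue's
+ * current phase, stopping early once @tag is seen, and ring the CQ head
+ * doorbell once for the whole batch.
+ */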
+static inline bool spraid_process_cq(struct spraid_queue *spraidq,
+ u16 *start, u16 *end, int tag)
+{
+ bool found = false;
+
+ *start = spraidq->cq_head;
+ while (!found && spraid_cqe_pending(spraidq)) {
+ if (spraidq->cqes[spraidq->cq_head].cmd_id == tag)
+ found = true;
+ spraid_update_cq_head(spraidq);
+ }
+ *end = spraidq->cq_head;
+
+ if (*start != *end)
+ writel(spraidq->cq_head,
+ spraidq->q_db + spraidq->hdev->db_stride);
+
+ return found;
+}
+
+static bool spraid_poll_cq(struct spraid_queue *spraidq, int cid)
+{
+ u16 start, end;
+ bool found;
+
+ if (!spraid_cqe_pending(spraidq))
+		return false;
+
+ spin_lock_irq(&spraidq->cq_lock);
+ found = spraid_process_cq(spraidq, &start, &end, cid);
+ spin_unlock_irq(&spraidq->cq_lock);
+
+ spraid_complete_cqes(spraidq, start, end);
+ return found;
+}
+
+static irqreturn_t spraid_irq(int irq, void *data)
+{
+ struct spraid_queue *spraidq = data;
+ irqreturn_t ret = IRQ_NONE;
+ u16 start, end;
+
+ spin_lock(&spraidq->cq_lock);
+ if (spraidq->cq_head != spraidq->last_cq_head)
+ ret = IRQ_HANDLED;
+
+ spraid_process_cq(spraidq, &start, &end, -1);
+ spraidq->last_cq_head = spraidq->cq_head;
+ spin_unlock(&spraidq->cq_lock);
+
+ if (start != end) {
+ spraid_complete_cqes(spraidq, start, end);
+ ret = IRQ_HANDLED;
+ }
+ return ret;
+}
+
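+/*
+ * Bring up the admin queue: disable the controller, program AQA/ASQ/ACQ
+ * with the queue depth and DMA addresses, re-enable the controller and
+ * request the admin interrupt.
+ */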
+static int spraid_setup_admin_queue(struct spraid_dev *hdev)
+{
+ struct spraid_queue *adminq = &hdev->queues[0];
+ u32 aqa;
+ int ret;
+
+ dev_info(hdev->dev, "[%s] start disable ctrl\n", __func__);
+
+ ret = spraid_disable_ctrl(hdev);
+ if (ret)
+ return ret;
+
+ ret = spraid_alloc_queue(hdev, 0, SPRAID_AQ_DEPTH);
+ if (ret)
+ return ret;
+
+ aqa = adminq->q_depth - 1;
+ aqa |= aqa << 16;
+ writel(aqa, hdev->bar + SPRAID_REG_AQA);
+ lo_hi_writeq(adminq->sq_dma_addr, hdev->bar + SPRAID_REG_ASQ);
+ lo_hi_writeq(adminq->cq_dma_addr, hdev->bar + SPRAID_REG_ACQ);
+
+ dev_info(hdev->dev, "[%s] start enable ctrl\n", __func__);
+
+ ret = spraid_enable_ctrl(hdev);
+ if (ret) {
+ ret = -ENODEV;
+ goto free_queue;
+ }
+
+ adminq->cq_vector = 0;
+ spraid_init_queue(adminq, 0);
+ ret = pci_request_irq(hdev->pdev, adminq->cq_vector, spraid_irq, NULL,
+ adminq, "spraid%d_q%d",
+ hdev->instance, adminq->qid);
+
+ if (ret) {
+ adminq->cq_vector = -1;
+ hdev->online_queues--;
+ goto free_queue;
+ }
+
+ dev_info(hdev->dev, "[%s] success, queuecount:[%d], onlinequeue:[%d]\n",
+ __func__, hdev->queue_count, hdev->online_queues);
+
+ return 0;
+
+free_queue:
+ spraid_free_queue(adminq);
+ return ret;
+}
+
+static u32 spraid_bar_size(struct spraid_dev *hdev, u32 nr_ioqs)
+{
+ return (SPRAID_REG_DBS + ((nr_ioqs + 1) * 8 * hdev->db_stride));
+}
+
+static int spraid_alloc_admin_cmds(struct spraid_dev *hdev)
+{
+ int i;
+
+ INIT_LIST_HEAD(&hdev->adm_cmd_list);
+ spin_lock_init(&hdev->adm_cmd_lock);
+
+ hdev->adm_cmds = kcalloc_node(SPRAID_AQ_BLK_MQ_DEPTH,
+ sizeof(struct spraid_cmd),
+ GFP_KERNEL, hdev->numa_node);
+
+ if (!hdev->adm_cmds) {
+ dev_err(hdev->dev, "Alloc admin cmds failed\n");
+ return -ENOMEM;
+ }
+
+ for (i = 0; i < SPRAID_AQ_BLK_MQ_DEPTH; i++) {
+ hdev->adm_cmds[i].qid = 0;
+ hdev->adm_cmds[i].cid = i;
+ list_add_tail(&(hdev->adm_cmds[i].list), &hdev->adm_cmd_list);
+ }
+
+ dev_info(hdev->dev, "Alloc admin cmds success, num[%d]\n",
+ SPRAID_AQ_BLK_MQ_DEPTH);
+
+ return 0;
+}
+
+static void spraid_free_admin_cmds(struct spraid_dev *hdev)
+{
+ kfree(hdev->adm_cmds);
+ hdev->adm_cmds = NULL;
+ INIT_LIST_HEAD(&hdev->adm_cmd_list);
+}
+
+static struct spraid_cmd *spraid_get_cmd(struct spraid_dev *hdev,
+ enum spraid_cmd_type type)
+{
+ struct spraid_cmd *cmd = NULL;
+ unsigned long flags;
+ struct list_head *head = &hdev->adm_cmd_list;
+ spinlock_t *slock = &hdev->adm_cmd_lock;
+
+ if (type == SPRAID_CMD_IOPT) {
+ head = &hdev->ioq_pt_list;
+ slock = &hdev->ioq_pt_lock;
+ }
+
+ spin_lock_irqsave(slock, flags);
+ if (list_empty(head)) {
+ spin_unlock_irqrestore(slock, flags);
+ dev_err(hdev->dev, "err, cmd[%d] list empty\n", type);
+ return NULL;
+ }
+ cmd = list_entry(head->next, struct spraid_cmd, list);
+ list_del_init(&cmd->list);
+ spin_unlock_irqrestore(slock, flags);
+
+ WRITE_ONCE(cmd->state, SPRAID_CMD_IN_FLIGHT);
+
+ return cmd;
+}
+
+static void spraid_put_cmd(struct spraid_dev *hdev, struct spraid_cmd *cmd,
+ enum spraid_cmd_type type)
+{
+ unsigned long flags;
+ struct list_head *head = &hdev->adm_cmd_list;
+ spinlock_t *slock = &hdev->adm_cmd_lock;
+
+ if (type == SPRAID_CMD_IOPT) {
+ head = &hdev->ioq_pt_list;
+ slock = &hdev->ioq_pt_lock;
+ }
+
+ spin_lock_irqsave(slock, flags);
+ WRITE_ONCE(cmd->state, SPRAID_CMD_IDLE);
+ list_add_tail(&cmd->list, head);
+ spin_unlock_irqrestore(slock, flags);
+}
+
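+/*
+ * Synchronous admin command: take a preallocated command slot, post it
+ * on queue 0 and sleep on its completion, ADMIN_TIMEOUT by default.
+ */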
+static int spraid_submit_admin_sync_cmd(struct spraid_dev *hdev,
+ struct spraid_admin_command *cmd,
+ u32 *result0, u32 *result1, u32 timeout)
+{
+ struct spraid_cmd *adm_cmd = spraid_get_cmd(hdev, SPRAID_CMD_ADM);
+
+ if (!adm_cmd) {
+ dev_err(hdev->dev, "err, get admin cmd failed\n");
+ return -EFAULT;
+ }
+
+ timeout = timeout ? timeout : ADMIN_TIMEOUT;
+
+ init_completion(&adm_cmd->cmd_done);
+
+ cmd->common.command_id = adm_cmd->cid;
+ spraid_submit_cmd(&hdev->queues[0], cmd);
+
+ if (!wait_for_completion_timeout(&adm_cmd->cmd_done, timeout)) {
+ dev_err(hdev->dev, "[%s] cid[%d] qid[%d] timeout;"
+ " opcode[0x%x] subopcode[0x%x]\n",
+ __func__, adm_cmd->cid, adm_cmd->qid,
+ cmd->usr_cmd.opcode, cmd->usr_cmd.info_0.subopcode);
+ WRITE_ONCE(adm_cmd->state, SPRAID_CMD_TIMEOUT);
+ spraid_put_cmd(hdev, adm_cmd, SPRAID_CMD_ADM);
+ return -EINVAL;
+ }
+
+ if (result0)
+ *result0 = adm_cmd->result0;
+ if (result1)
+ *result1 = adm_cmd->result1;
+
+ spraid_put_cmd(hdev, adm_cmd, SPRAID_CMD_ADM);
+
+ return adm_cmd->status;
+}
+
+static int spraid_create_cq(struct spraid_dev *hdev, u16 qid,
+ struct spraid_queue *spraidq, u16 cq_vector)
+{
+ struct spraid_admin_command admin_cmd;
+ int flags = SPRAID_QUEUE_PHYS_CONTIG | SPRAID_CQ_IRQ_ENABLED;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.create_cq.opcode = SPRAID_ADMIN_CREATE_CQ;
+ admin_cmd.create_cq.prp1 = cpu_to_le64(spraidq->cq_dma_addr);
+ admin_cmd.create_cq.cqid = cpu_to_le16(qid);
+ admin_cmd.create_cq.qsize = cpu_to_le16(spraidq->q_depth - 1);
+ admin_cmd.create_cq.cq_flags = cpu_to_le16(flags);
+ admin_cmd.create_cq.irq_vector = cpu_to_le16(cq_vector);
+
+ return spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+}
+
+static int spraid_create_sq(struct spraid_dev *hdev, u16 qid,
+ struct spraid_queue *spraidq)
+{
+ struct spraid_admin_command admin_cmd;
+ int flags = SPRAID_QUEUE_PHYS_CONTIG;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.create_sq.opcode = SPRAID_ADMIN_CREATE_SQ;
+ admin_cmd.create_sq.prp1 = cpu_to_le64(spraidq->sq_dma_addr);
+ admin_cmd.create_sq.sqid = cpu_to_le16(qid);
+ admin_cmd.create_sq.qsize = cpu_to_le16(spraidq->q_depth - 1);
+ admin_cmd.create_sq.sq_flags = cpu_to_le16(flags);
+ admin_cmd.create_sq.cqid = cpu_to_le16(qid);
+
+ return spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+}
+
+static void spraid_free_queue(struct spraid_queue *spraidq)
+{
+ struct spraid_dev *hdev = spraidq->hdev;
+
+ hdev->queue_count--;
+ dma_free_coherent(hdev->dev, CQ_SIZE(spraidq->q_depth),
+ (void *)spraidq->cqes, spraidq->cq_dma_addr);
+ dma_free_coherent(hdev->dev, SQ_SIZE(spraidq->qid, spraidq->q_depth),
+ spraidq->sq_cmds, spraidq->sq_dma_addr);
+ dma_free_coherent(hdev->dev, SENSE_SIZE(spraidq->q_depth),
+ spraidq->sense, spraidq->sense_dma_addr);
+}
+
+static void spraid_free_admin_queue(struct spraid_dev *hdev)
+{
+ spraid_free_queue(&hdev->queues[0]);
+}
+
+static void spraid_free_io_queues(struct spraid_dev *hdev)
+{
+ int i;
+
+ for (i = hdev->queue_count - 1; i >= 1; i--)
+ spraid_free_queue(&hdev->queues[i]);
+}
+
+static int spraid_delete_queue(struct spraid_dev *hdev, u8 op, u16 id)
+{
+ struct spraid_admin_command admin_cmd;
+ int ret;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.delete_queue.opcode = op;
+ admin_cmd.delete_queue.qid = cpu_to_le16(id);
+
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+
+ if (ret)
+ dev_err(hdev->dev, "Delete %s:[%d] failed\n",
+ (op == SPRAID_ADMIN_DELETE_CQ) ? "cq" : "sq", id);
+
+ return ret;
+}
+
+static int spraid_delete_cq(struct spraid_dev *hdev, u16 cqid)
+{
+ return spraid_delete_queue(hdev, SPRAID_ADMIN_DELETE_CQ, cqid);
+}
+
+static int spraid_delete_sq(struct spraid_dev *hdev, u16 sqid)
+{
+ return spraid_delete_queue(hdev, SPRAID_ADMIN_DELETE_SQ, sqid);
+}
+
+static int spraid_create_queue(struct spraid_queue *spraidq, u16 qid)
+{
+ struct spraid_dev *hdev = spraidq->hdev;
+ u16 cq_vector;
+ int ret;
+
+ cq_vector = (hdev->num_vecs == 1) ? 0 : qid;
+ ret = spraid_create_cq(hdev, qid, spraidq, cq_vector);
+ if (ret)
+ return ret;
+
+ ret = spraid_create_sq(hdev, qid, spraidq);
+ if (ret)
+ goto delete_cq;
+
+ spraid_init_queue(spraidq, qid);
+ spraidq->cq_vector = cq_vector;
+
+ ret = pci_request_irq(hdev->pdev, cq_vector, spraid_irq, NULL,
+ spraidq, "spraid%d_q%d", hdev->instance, qid);
+
+ if (ret) {
+ dev_err(hdev->dev, "Request queue[%d] irq failed\n", qid);
+ goto delete_sq;
+ }
+
+ return 0;
+
+delete_sq:
+ spraidq->cq_vector = -1;
+ hdev->online_queues--;
+ spraid_delete_sq(hdev, qid);
+delete_cq:
+ spraid_delete_cq(hdev, qid);
+
+ return ret;
+}
+
+static int spraid_create_io_queues(struct spraid_dev *hdev)
+{
+ u32 i, max;
+ int ret = 0;
+
+ max = min(hdev->max_qid, hdev->queue_count - 1);
+ for (i = hdev->online_queues; i <= max; i++) {
+ ret = spraid_create_queue(&hdev->queues[i], i);
+ if (ret) {
+ dev_err(hdev->dev, "Create queue[%d] failed\n", i);
+ break;
+ }
+ }
+
+ dev_info(hdev->dev, "[%s] queue_count[%d], online_queue[%d]",
+ __func__, hdev->queue_count, hdev->online_queues);
+
+ return ret >= 0 ? 0 : ret;
+}
+
+static int spraid_set_features(struct spraid_dev *hdev, u32 fid,
+ u32 dword11, void *buffer,
+ size_t buflen, u32 *result)
+{
+ struct spraid_admin_command admin_cmd;
+ int ret;
+ u8 *data_ptr = NULL;
+ dma_addr_t data_dma = 0;
+
+ if (buffer && buflen) {
+ data_ptr = dma_alloc_coherent(hdev->dev, buflen,
+ &data_dma, GFP_KERNEL);
+ if (!data_ptr)
+ return -ENOMEM;
+
+ memcpy(data_ptr, buffer, buflen);
+ }
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.features.opcode = SPRAID_ADMIN_SET_FEATURES;
+ admin_cmd.features.fid = cpu_to_le32(fid);
+ admin_cmd.features.dword11 = cpu_to_le32(dword11);
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
+
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, result, NULL, 0);
+
+ if (data_ptr)
+ dma_free_coherent(hdev->dev, buflen, data_ptr, data_dma);
+
+ return ret;
+}
+
+static int spraid_configure_timestamp(struct spraid_dev *hdev)
+{
+ __le64 ts;
+ int ret;
+
+ ts = cpu_to_le64(ktime_to_ms(ktime_get_real()));
+ ret = spraid_set_features(hdev, SPRAID_FEAT_TIMESTAMP,
+ 0, &ts, sizeof(ts), NULL);
+
+ if (ret)
+ dev_err(hdev->dev, "set timestamp failed: %d\n", ret);
+ return ret;
+}
+
+static int spraid_set_queue_cnt(struct spraid_dev *hdev, u32 *cnt)
+{
+ u32 q_cnt = (*cnt - 1) | ((*cnt - 1) << 16);
+ u32 nr_ioqs, result;
+ int status;
+
+ status = spraid_set_features(hdev, SPRAID_FEAT_NUM_QUEUES,
+ q_cnt, NULL, 0, &result);
+ if (status) {
+ dev_err(hdev->dev, "Set queue count failed, status: %d\n",
+ status);
+ return -EIO;
+ }
+
+ nr_ioqs = min(result & 0xffff, result >> 16) + 1;
+ *cnt = min(*cnt, nr_ioqs);
+ if (*cnt == 0) {
+ dev_err(hdev->dev, "Illegal queue count: zero\n");
+ return -EIO;
+ }
+ return 0;
+}
+
+static int spraid_setup_io_queues(struct spraid_dev *hdev)
+{
+ struct spraid_queue *adminq = &hdev->queues[0];
+ struct pci_dev *pdev = hdev->pdev;
+ u32 nr_ioqs = num_online_cpus();
+ u32 i, size;
+ int ret;
+
+ struct irq_affinity affd = {
+ .pre_vectors = 1
+ };
+
+ ret = spraid_set_queue_cnt(hdev, &nr_ioqs);
+ if (ret < 0)
+ return ret;
+
+ size = spraid_bar_size(hdev, nr_ioqs);
+ ret = spraid_remap_bar(hdev, size);
+ if (ret)
+ return -ENOMEM;
+
+ adminq->q_db = hdev->dbs;
+
+ pci_free_irq(pdev, 0, adminq);
+ pci_free_irq_vectors(pdev);
+
+ ret = pci_alloc_irq_vectors_affinity(pdev, 1, (nr_ioqs + 1),
+ PCI_IRQ_ALL_TYPES | PCI_IRQ_AFFINITY, &affd);
+ if (ret <= 0)
+ return -EIO;
+
+ hdev->num_vecs = ret;
+
+ hdev->max_qid = max(ret - 1, 1);
+
+ ret = pci_request_irq(pdev, adminq->cq_vector, spraid_irq, NULL,
+ adminq, "spraid%d_q%d",
+ hdev->instance, adminq->qid);
+ if (ret) {
+ dev_err(hdev->dev, "Request admin irq failed\n");
+ adminq->cq_vector = -1;
+ return ret;
+ }
+
+ for (i = hdev->queue_count; i <= hdev->max_qid; i++) {
+ ret = spraid_alloc_queue(hdev, i, hdev->ioq_depth);
+ if (ret)
+ break;
+ }
+ dev_info(hdev->dev, "[%s] max_qid: %d, queue_count: %d;"
+ " online_queue: %d, ioq_depth: %d\n",
+ __func__, hdev->max_qid, hdev->queue_count,
+ hdev->online_queues, hdev->ioq_depth);
+
+ return spraid_create_io_queues(hdev);
+}
+
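+/*
+ * Tear down io queues in two passes, deleting every submission queue
+ * before any completion queue, as the queue pair protocol requires.
+ */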
+static void spraid_delete_io_queues(struct spraid_dev *hdev)
+{
+ u16 queues = hdev->online_queues - 1;
+ u8 opcode = SPRAID_ADMIN_DELETE_SQ;
+ u16 i, pass;
+
+ if (!pci_device_is_present(hdev->pdev)) {
+ dev_err(hdev->dev,
+ "pci_device is not present, skip disable io queues\n");
+ return;
+ }
+
+ if (hdev->online_queues < 2) {
+ dev_err(hdev->dev, "[%s] err, io queue has been delete\n",
+ __func__);
+ return;
+ }
+
+ for (pass = 0; pass < 2; pass++) {
+ for (i = queues; i > 0; i--)
+ if (spraid_delete_queue(hdev, opcode, i))
+ break;
+
+ opcode = SPRAID_ADMIN_DELETE_CQ;
+ }
+}
+
+static void spraid_remove_io_queues(struct spraid_dev *hdev)
+{
+ spraid_delete_io_queues(hdev);
+ spraid_free_io_queues(hdev);
+}
+
+static void spraid_pci_disable(struct spraid_dev *hdev)
+{
+ struct pci_dev *pdev = hdev->pdev;
+ u32 i;
+
+ for (i = 0; i < hdev->online_queues; i++)
+ pci_free_irq(pdev, hdev->queues[i].cq_vector, &hdev->queues[i]);
+ pci_free_irq_vectors(pdev);
+ if (pci_is_enabled(pdev)) {
+ pci_disable_pcie_error_reporting(pdev);
+ pci_disable_device(pdev);
+ }
+ hdev->online_queues = 0;
+}
+
+static void spraid_disable_admin_queue(struct spraid_dev *hdev, bool shutdown)
+{
+ struct spraid_queue *adminq = &hdev->queues[0];
+ u16 start, end;
+
+ if (pci_device_is_present(hdev->pdev)) {
+ if (shutdown)
+ spraid_shutdown_ctrl(hdev);
+ else
+ spraid_disable_ctrl(hdev);
+ }
+
+ if (hdev->queue_count == 0) {
+ dev_err(hdev->dev, "[%s] err, admin queue has been delete\n",
+ __func__);
+ return;
+ }
+
+ spin_lock_irq(&adminq->cq_lock);
+ spraid_process_cq(adminq, &start, &end, -1);
+ spin_unlock_irq(&adminq->cq_lock);
+
+ spraid_complete_cqes(adminq, start, end);
+ spraid_free_admin_queue(hdev);
+}
+
+static int spraid_create_dma_pools(struct spraid_dev *hdev)
+{
+ int i;
+ char poolname[20] = { 0 };
+
+ hdev->prp_page_pool = dma_pool_create("prp list page", hdev->dev,
+ PAGE_SIZE, PAGE_SIZE, 0);
+
+ if (!hdev->prp_page_pool) {
+ dev_err(hdev->dev, "create prp_page_pool failed\n");
+ return -ENOMEM;
+ }
+
+ for (i = 0; i < small_pool_num; i++) {
+		snprintf(poolname, sizeof(poolname), "prp_list_256_%d", i);
+ hdev->prp_small_pool[i] =
+ dma_pool_create(poolname, hdev->dev, SMALL_POOL_SIZE,
+ SMALL_POOL_SIZE, 0);
+
+ if (!hdev->prp_small_pool[i]) {
+ dev_err(hdev->dev, "create prp_small_pool %d failed\n",
+ i);
+ goto destroy_prp_small_pool;
+ }
+ }
+
+ return 0;
+
+destroy_prp_small_pool:
+ while (i > 0)
+ dma_pool_destroy(hdev->prp_small_pool[--i]);
+ dma_pool_destroy(hdev->prp_page_pool);
+
+ return -ENOMEM;
+}
+
+static void spraid_destroy_dma_pools(struct spraid_dev *hdev)
+{
+ int i;
+
+ for (i = 0; i < small_pool_num; i++)
+ dma_pool_destroy(hdev->prp_small_pool[i]);
+ dma_pool_destroy(hdev->prp_page_pool);
+}
+
+static int spraid_get_dev_list(struct spraid_dev *hdev,
+ struct spraid_dev_info *devices)
+{
+ u32 nd = le32_to_cpu(hdev->ctrl_info->nd);
+ struct spraid_admin_command admin_cmd;
+ struct spraid_dev_list *list_buf;
+ dma_addr_t data_dma = 0;
+ u32 i, idx, hdid, ndev;
+ int ret = 0;
+
+ list_buf = dma_alloc_coherent(hdev->dev, PAGE_SIZE,
+ &data_dma, GFP_KERNEL);
+ if (!list_buf)
+ return -ENOMEM;
+
+ for (idx = 0; idx < nd;) {
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.get_info.opcode = SPRAID_ADMIN_GET_INFO;
+ admin_cmd.get_info.type = SPRAID_GET_INFO_DEV_LIST;
+ admin_cmd.get_info.cdw11 = cpu_to_le32(idx);
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
+
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd,
+ NULL, NULL, 0);
+
+ if (ret) {
+ dev_err(hdev->dev, "Get device list failed, nd: %u;"
+ "idx: %u, ret: %d\n",
+ nd, idx, ret);
+ goto out;
+ }
+ ndev = le32_to_cpu(list_buf->dev_num);
+
+ dev_info(hdev->dev, "ndev numbers: %u\n", ndev);
+
+ for (i = 0; i < ndev; i++) {
+ hdid = le32_to_cpu(list_buf->devices[i].hdid);
+ dev_info(hdev->dev, "list_buf->devices[%d], hdid: %u;"
+ "target: %d, channel: %d, lun: %d, attr[0x%x]\n",
+ i, hdid,
+ le16_to_cpu(list_buf->devices[i].target),
+ list_buf->devices[i].channel,
+ list_buf->devices[i].lun,
+ list_buf->devices[i].attr);
+ if (hdid > nd || hdid == 0) {
+ dev_err(hdev->dev, "err, hdid[%d] invalid\n",
+ hdid);
+ continue;
+ }
+ memcpy(&devices[hdid - 1], &list_buf->devices[i],
+ sizeof(struct spraid_dev_info));
+ }
+ idx += ndev;
+
+ if (idx < MAX_DEV_ENTRY_PER_PAGE_4K)
+ break;
+ }
+
+out:
+ dma_free_coherent(hdev->dev, PAGE_SIZE, list_buf, data_dma);
+ return ret;
+}
+
+static void spraid_send_aen(struct spraid_dev *hdev, u16 cid)
+{
+ struct spraid_queue *adminq = &hdev->queues[0];
+ struct spraid_admin_command admin_cmd;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.common.opcode = SPRAID_ADMIN_ASYNC_EVENT;
+ admin_cmd.common.command_id = cid;
+
+ spraid_submit_cmd(adminq, &admin_cmd);
+ dev_info(hdev->dev, "send aen, cid[%d]\n", cid);
+}
+
+static inline void spraid_send_all_aen(struct spraid_dev *hdev)
+{
+ u16 i;
+
+ for (i = 0; i < hdev->ctrl_info->aerl; i++)
+ spraid_send_aen(hdev, i + SPRAID_AQ_BLK_MQ_DEPTH);
+}
+
+static int spraid_add_device(struct spraid_dev *hdev,
+ struct spraid_dev_info *device)
+{
+ struct Scsi_Host *shost = hdev->shost;
+ struct scsi_device *sdev;
+
+ dev_info(hdev->dev, "add device, hdid: %u target: %d, channel: %d;"
+ " lun: %d, attr[0x%x]\n",
+ le32_to_cpu(device->hdid), le16_to_cpu(device->target),
+ device->channel, device->lun, device->attr);
+
+ sdev = scsi_device_lookup(shost, device->channel,
+ le16_to_cpu(device->target), 0);
+ if (sdev) {
+ dev_warn(hdev->dev, "Device is already exist, channel: %d;"
+ " target_id: %d, lun: %d\n",
+ device->channel, le16_to_cpu(device->target), 0);
+ scsi_device_put(sdev);
+ return -EEXIST;
+ }
+ scsi_add_device(shost, device->channel, le16_to_cpu(device->target), 0);
+ return 0;
+}
+
+static int spraid_rescan_device(struct spraid_dev *hdev,
+ struct spraid_dev_info *device)
+{
+ struct Scsi_Host *shost = hdev->shost;
+ struct scsi_device *sdev;
+
+ dev_info(hdev->dev, "rescan device, hdid: %u target: %d, channel: %d;"
+ " lun: %d, attr[0x%x]\n",
+ le32_to_cpu(device->hdid), le16_to_cpu(device->target),
+ device->channel, device->lun, device->attr);
+
+ sdev = scsi_device_lookup(shost, device->channel,
+ le16_to_cpu(device->target), 0);
+ if (!sdev) {
+ dev_warn(hdev->dev, "device is not exit rescan it, channel: %d;"
+ " target_id: %d, lun: %d\n",
+ device->channel, le16_to_cpu(device->target), 0);
+ return -ENODEV;
+ }
+
+ scsi_rescan_device(&sdev->sdev_gendev);
+ scsi_device_put(sdev);
+ return 0;
+}
+
+static int spraid_remove_device(struct spraid_dev *hdev,
+ struct spraid_dev_info *org_device)
+{
+ struct Scsi_Host *shost = hdev->shost;
+ struct scsi_device *sdev;
+
+ dev_info(hdev->dev, "remove device, hdid: %u target: %d, channel: %d;"
+ " lun: %d, attr[0x%x]\n",
+ le32_to_cpu(org_device->hdid), le16_to_cpu(org_device->target),
+ org_device->channel, org_device->lun, org_device->attr);
+
+ sdev = scsi_device_lookup(shost, org_device->channel,
+ le16_to_cpu(org_device->target), 0);
+ if (!sdev) {
+ dev_warn(hdev->dev, "device is not exit remove it, channel: %d;"
+ " target_id: %d, lun: %d\n",
+ org_device->channel,
+ le16_to_cpu(org_device->target), 0);
+ return -ENODEV;
+ }
+
+ scsi_remove_device(sdev);
+ scsi_device_put(sdev);
+ return 0;
+}
+
+static int spraid_dev_list_init(struct spraid_dev *hdev)
+{
+ u32 nd = le32_to_cpu(hdev->ctrl_info->nd);
+ int i, ret;
+
+ hdev->devices = kzalloc_node(nd * sizeof(struct spraid_dev_info),
+ GFP_KERNEL, hdev->numa_node);
+ if (!hdev->devices)
+ return -ENOMEM;
+
+ ret = spraid_get_dev_list(hdev, hdev->devices);
+ if (ret) {
+ dev_err(hdev->dev,
+ "Ignore failure of getting device list;"
+ " within initialization\n");
+ return 0;
+ }
+
+ for (i = 0; i < nd; i++) {
+ if (SPRAID_DEV_INFO_FLAG_VALID(hdev->devices[i].flag) &&
+ SPRAID_DEV_INFO_ATTR_BOOT(hdev->devices[i].attr)) {
+ spraid_add_device(hdev, &hdev->devices[i]);
+ break;
+ }
+ }
+ return 0;
+}
+
+static int luntarget_cmp_func(const void *l, const void *r)
+{
+ const struct spraid_dev_info *ln = l;
+ const struct spraid_dev_info *rn = r;
+
+ if (ln->channel == rn->channel)
+ return le16_to_cpu(ln->target) - le16_to_cpu(rn->target);
+
+ return ln->channel - rn->channel;
+}
+
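+/*
+ * Hot-plug scan worker: diff the freshly fetched device list against
+ * the cached one, remove vanished devices, rescan changed ones and add
+ * new ones sorted by channel/target so device names stay stable.
+ */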
+static void spraid_scan_work(struct work_struct *work)
+{
+ struct spraid_dev *hdev =
+ container_of(work, struct spraid_dev, scan_work);
+ struct spraid_dev_info *devices, *org_devices;
+ struct spraid_dev_info *sortdevice;
+ u32 nd = le32_to_cpu(hdev->ctrl_info->nd);
+ u8 flag, org_flag;
+ int i, ret;
+ int count = 0;
+
+ devices = kcalloc(nd, sizeof(struct spraid_dev_info), GFP_KERNEL);
+ if (!devices)
+ return;
+
+ sortdevice = kcalloc(nd, sizeof(struct spraid_dev_info), GFP_KERNEL);
+ if (!sortdevice)
+ goto free_list;
+
+ ret = spraid_get_dev_list(hdev, devices);
+ if (ret)
+ goto free_all;
+ org_devices = hdev->devices;
+ for (i = 0; i < nd; i++) {
+ org_flag = org_devices[i].flag;
+ flag = devices[i].flag;
+
+ dev_log_dbg(hdev->dev, "i: %d, org_flag: 0x%x, flag: 0x%x\n",
+ i, org_flag, flag);
+
+ if (SPRAID_DEV_INFO_FLAG_VALID(flag)) {
+ if (!SPRAID_DEV_INFO_FLAG_VALID(org_flag)) {
+ down_write(&hdev->devices_rwsem);
+ memcpy(&org_devices[i], &devices[i],
+ sizeof(struct spraid_dev_info));
+ memcpy(&sortdevice[count++], &devices[i],
+ sizeof(struct spraid_dev_info));
+ up_write(&hdev->devices_rwsem);
+ } else if (SPRAID_DEV_INFO_FLAG_CHANGE(flag)) {
+ spraid_rescan_device(hdev, &devices[i]);
+ }
+ } else {
+ if (SPRAID_DEV_INFO_FLAG_VALID(org_flag)) {
+ down_write(&hdev->devices_rwsem);
+ org_devices[i].flag &= 0xfe;
+ up_write(&hdev->devices_rwsem);
+ spraid_remove_device(hdev, &org_devices[i]);
+ }
+ }
+ }
+
+ dev_info(hdev->dev, "scan work add device count = %d\n", count);
+
+ sort(sortdevice, count, sizeof(sortdevice[0]),
+ luntarget_cmp_func, NULL);
+
+ for (i = 0; i < count; i++)
+ spraid_add_device(hdev, &sortdevice[i]);
+
+free_all:
+ kfree(sortdevice);
+free_list:
+ kfree(devices);
+}
+
+static void spraid_timesyn_work(struct work_struct *work)
+{
+ struct spraid_dev *hdev =
+ container_of(work, struct spraid_dev, timesyn_work);
+
+ spraid_configure_timestamp(hdev);
+}
+
+static int spraid_init_ctrl_info(struct spraid_dev *hdev);
+static void spraid_fw_act_work(struct work_struct *work)
+{
+ struct spraid_dev *hdev =
+ container_of(work, struct spraid_dev, fw_act_work);
+
+ if (spraid_init_ctrl_info(hdev))
+ dev_err(hdev->dev, "get ctrl info failed after fw act\n");
+}
+
+static void spraid_queue_scan(struct spraid_dev *hdev)
+{
+ queue_work(spraid_wq, &hdev->scan_work);
+}
+
+static void spraid_handle_aen_notice(struct spraid_dev *hdev, u32 result)
+{
+ switch ((result & 0xff00) >> 8) {
+ case SPRAID_AEN_DEV_CHANGED:
+ spraid_queue_scan(hdev);
+ break;
+ case SPRAID_AEN_FW_ACT_START:
+ dev_info(hdev->dev, "fw activation starting\n");
+ break;
+ case SPRAID_AEN_HOST_PROBING:
+ break;
+ default:
+ dev_warn(hdev->dev, "async event result %08x\n", result);
+ }
+}
+
+static void spraid_handle_aen_vs(struct spraid_dev *hdev,
+ u32 result, u32 result1)
+{
+ switch ((result & 0xff00) >> 8) {
+ case SPRAID_AEN_TIMESYN:
+ queue_work(spraid_wq, &hdev->timesyn_work);
+ break;
+ case SPRAID_AEN_FW_ACT_FINISH:
+ dev_info(hdev->dev, "fw activation finish\n");
+ queue_work(spraid_wq, &hdev->fw_act_work);
+ break;
+ case SPRAID_AEN_EVENT_MIN ... SPRAID_AEN_EVENT_MAX:
+ dev_info(hdev->dev, "rcv card event[%d];"
+ " param1[0x%x] param2[0x%x]\n",
+ (result & 0xff00) >> 8, result, result1);
+ break;
+ default:
+ dev_warn(hdev->dev, "async event result: 0x%x\n", result);
+ }
+}
+
+static int spraid_alloc_resources(struct spraid_dev *hdev)
+{
+ int ret, nqueue;
+
+ ret = ida_alloc(&spraid_instance_ida, GFP_KERNEL);
+ if (ret < 0) {
+ dev_err(hdev->dev, "Get instance id failed\n");
+ return ret;
+ }
+ hdev->instance = ret;
+
+ hdev->ctrl_info = kzalloc_node(sizeof(*hdev->ctrl_info),
+ GFP_KERNEL, hdev->numa_node);
+ if (!hdev->ctrl_info) {
+ ret = -ENOMEM;
+ goto release_instance;
+ }
+
+ ret = spraid_create_dma_pools(hdev);
+ if (ret)
+ goto free_ctrl_info;
+ nqueue = num_possible_cpus() + 1;
+ hdev->queues = kcalloc_node(nqueue, sizeof(struct spraid_queue),
+ GFP_KERNEL, hdev->numa_node);
+ if (!hdev->queues) {
+ ret = -ENOMEM;
+ goto destroy_dma_pools;
+ }
+
+ ret = spraid_alloc_admin_cmds(hdev);
+ if (ret)
+ goto free_queues;
+
+ dev_info(hdev->dev, "[%s] queues num: %d\n", __func__, nqueue);
+
+ return 0;
+
+free_queues:
+ kfree(hdev->queues);
+destroy_dma_pools:
+ spraid_destroy_dma_pools(hdev);
+free_ctrl_info:
+ kfree(hdev->ctrl_info);
+release_instance:
+ ida_free(&spraid_instance_ida, hdev->instance);
+ return ret;
+}
+
+static void spraid_free_resources(struct spraid_dev *hdev)
+{
+ spraid_free_admin_cmds(hdev);
+ kfree(hdev->queues);
+ spraid_destroy_dma_pools(hdev);
+ kfree(hdev->ctrl_info);
+ ida_free(&spraid_instance_ida, hdev->instance);
+}
+
+static void spraid_bsg_unmap_data(struct spraid_dev *hdev, struct bsg_job *job)
+{
+ struct request *rq = blk_mq_rq_from_pdu(job);
+ struct spraid_iod *iod = job->dd_data;
+ enum dma_data_direction dma_dir =
+ rq_data_dir(rq) ? DMA_TO_DEVICE : DMA_FROM_DEVICE;
+
+ if (iod->nsge)
+ dma_unmap_sg(hdev->dev, iod->sg, iod->nsge, dma_dir);
+
+ spraid_free_iod_res(hdev, iod);
+}
+
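+/*
+ * Map the bsg payload scatterlist for DMA and describe it with PRPs in
+ * the passthrough command.
+ */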
+static int spraid_bsg_map_data(struct spraid_dev *hdev, struct bsg_job *job,
+ struct spraid_admin_command *cmd)
+{
+ struct request *rq = blk_mq_rq_from_pdu(job);
+ struct spraid_iod *iod = job->dd_data;
+ enum dma_data_direction dma_dir =
+ rq_data_dir(rq) ? DMA_TO_DEVICE : DMA_FROM_DEVICE;
+ int ret = 0;
+
+ iod->sg = job->request_payload.sg_list;
+ iod->nsge = job->request_payload.sg_cnt;
+ iod->length = job->request_payload.payload_len;
+ iod->use_sgl = false;
+ iod->npages = -1;
+ iod->sg_drv_mgmt = false;
+
+ if (!iod->nsge)
+ goto out;
+
+ ret = dma_map_sg_attrs(hdev->dev, iod->sg, iod->nsge,
+ dma_dir, DMA_ATTR_NO_WARN);
+ if (!ret)
+ goto out;
+
+ ret = spraid_setup_prps(hdev, iod);
+ if (ret)
+ goto unmap;
+
+ cmd->common.dptr.prp1 = cpu_to_le64(sg_dma_address(iod->sg));
+ cmd->common.dptr.prp2 = cpu_to_le64(iod->first_dma);
+
+ return 0;
+
+unmap:
+ dma_unmap_sg(hdev->dev, iod->sg, iod->nsge, dma_dir);
+out:
+ return ret;
+}
+
+static int spraid_get_ctrl_info(struct spraid_dev *hdev,
+ struct spraid_ctrl_info *ctrl_info)
+{
+ struct spraid_admin_command admin_cmd;
+ u8 *data_ptr = NULL;
+ dma_addr_t data_dma = 0;
+ int ret;
+
+ data_ptr = dma_alloc_coherent(hdev->dev, PAGE_SIZE,
+ &data_dma, GFP_KERNEL);
+ if (!data_ptr)
+ return -ENOMEM;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.get_info.opcode = SPRAID_ADMIN_GET_INFO;
+ admin_cmd.get_info.type = SPRAID_GET_INFO_CTRL;
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
+
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+ if (!ret)
+ memcpy(ctrl_info, data_ptr, sizeof(struct spraid_ctrl_info));
+
+ dma_free_coherent(hdev->dev, PAGE_SIZE, data_ptr, data_dma);
+
+ return ret;
+}
+
+static int spraid_init_ctrl_info(struct spraid_dev *hdev)
+{
+ int ret;
+
+ hdev->ctrl_info->nd = cpu_to_le32(240);
+ hdev->ctrl_info->mdts = 8;
+ hdev->ctrl_info->max_cmds = cpu_to_le16(4096);
+ hdev->ctrl_info->max_num_sge = cpu_to_le16(128);
+ hdev->ctrl_info->max_channel = cpu_to_le16(4);
+ hdev->ctrl_info->max_tgt_id = cpu_to_le32(3239);
+ hdev->ctrl_info->max_lun = cpu_to_le16(2);
+
+ ret = spraid_get_ctrl_info(hdev, hdev->ctrl_info);
+ if (ret)
+ dev_err(hdev->dev, "get controller info failed: %d\n", ret);
+
+ dev_info(hdev->dev, "[%s]nd = %d\n", __func__, hdev->ctrl_info->nd);
+ dev_info(hdev->dev, "[%s]max_cmd = %d\n",
+ __func__, hdev->ctrl_info->max_cmds);
+ dev_info(hdev->dev, "[%s]max_channel = %d\n",
+ __func__, hdev->ctrl_info->max_channel);
+ dev_info(hdev->dev, "[%s]max_tgt_id = %d\n",
+ __func__, hdev->ctrl_info->max_tgt_id);
+ dev_info(hdev->dev, "[%s]max_lun = %d\n",
+ __func__, hdev->ctrl_info->max_lun);
+ dev_info(hdev->dev, "[%s]max_num_sge = %d\n",
+ __func__, hdev->ctrl_info->max_num_sge);
+ dev_info(hdev->dev, "[%s]lun_num_boot = %d\n",
+ __func__, hdev->ctrl_info->lun_num_in_boot);
+ dev_info(hdev->dev, "[%s]mdts = %d\n", __func__, hdev->ctrl_info->mdts);
+ dev_info(hdev->dev, "[%s]acl = %d\n", __func__, hdev->ctrl_info->acl);
+ dev_info(hdev->dev, "[%s]aer1 = %d\n", __func__, hdev->ctrl_info->aerl);
+ dev_info(hdev->dev, "[%s]card_type = %d\n",
+ __func__, hdev->ctrl_info->card_type);
+ dev_info(hdev->dev, "[%s]rtd3e = %d\n",
+ __func__, hdev->ctrl_info->rtd3e);
+ dev_info(hdev->dev, "[%s]sn = %s\n", __func__, hdev->ctrl_info->sn);
+ dev_info(hdev->dev, "[%s]fr = %s\n", __func__, hdev->ctrl_info->fr);
+
+ if (!hdev->ctrl_info->aerl)
+ hdev->ctrl_info->aerl = 1;
+ if (hdev->ctrl_info->aerl > SPRAID_NR_AEN_COMMANDS)
+ hdev->ctrl_info->aerl = SPRAID_NR_AEN_COMMANDS;
+
+ return 0;
+}
+
+#define SPRAID_MAX_ADMIN_PAYLOAD_SIZE BIT(16)
+static int spraid_alloc_iod_ext_mem_pool(struct spraid_dev *hdev)
+{
+ u16 max_sge = le16_to_cpu(hdev->ctrl_info->max_num_sge);
+ size_t alloc_size;
+
+ alloc_size = spraid_iod_ext_size(hdev, SPRAID_MAX_ADMIN_PAYLOAD_SIZE,
+ max_sge, true, false);
+ if (alloc_size > PAGE_SIZE)
+ dev_warn(hdev->dev, "It is unreasonable ;"
+ " sg allocation more than one page\n");
+ hdev->iod_mempool = mempool_create_node(1, mempool_kmalloc,
+ mempool_kfree,
+ (void *)alloc_size, GFP_KERNEL,
+ hdev->numa_node);
+ if (!hdev->iod_mempool) {
+ dev_err(hdev->dev, "Create iod extension memory pool failed\n");
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
+static void spraid_free_iod_ext_mem_pool(struct spraid_dev *hdev)
+{
+ mempool_destroy(hdev->iod_mempool);
+}
+
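+/* handle a SPRAID_BSG_ADM request as a synchronous admin command */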
+static int spraid_user_admin_cmd(struct spraid_dev *hdev, struct bsg_job *job)
+{
+ struct spraid_bsg_request *bsg_req = job->request;
+ struct spraid_passthru_common_cmd *cmd = &(bsg_req->admcmd);
+ struct spraid_admin_command admin_cmd;
+ u32 timeout = msecs_to_jiffies(cmd->timeout_ms);
+ u32 result[2] = {0};
+ int status;
+
+ if (hdev->state >= SPRAID_RESETTING) {
+ dev_err(hdev->dev, "[%s] err, host state:[%d] is not right\n",
+ __func__, hdev->state);
+ return -EBUSY;
+ }
+
+ dev_info(hdev->dev, "[%s] opcode[0x%x] subopcode[0x%x] init\n",
+ __func__, cmd->opcode, cmd->info_0.subopcode);
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.common.opcode = cmd->opcode;
+ admin_cmd.common.flags = cmd->flags;
+ admin_cmd.common.hdid = cpu_to_le32(cmd->nsid);
+ admin_cmd.common.cdw2[0] = cpu_to_le32(cmd->cdw2);
+ admin_cmd.common.cdw2[1] = cpu_to_le32(cmd->cdw3);
+ admin_cmd.common.cdw10 = cpu_to_le32(cmd->cdw10);
+ admin_cmd.common.cdw11 = cpu_to_le32(cmd->cdw11);
+ admin_cmd.common.cdw12 = cpu_to_le32(cmd->cdw12);
+ admin_cmd.common.cdw13 = cpu_to_le32(cmd->cdw13);
+ admin_cmd.common.cdw14 = cpu_to_le32(cmd->cdw14);
+ admin_cmd.common.cdw15 = cpu_to_le32(cmd->cdw15);
+
+ status = spraid_bsg_map_data(hdev, job, &admin_cmd);
+ if (status) {
+ dev_err(hdev->dev, "[%s] err, map data failed\n", __func__);
+ return status;
+ }
+
+ status = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, &result[0],
+ &result[1], timeout);
+ if (status >= 0) {
+ job->reply_len = sizeof(result);
+ memcpy(job->reply, result, sizeof(result));
+ }
+
+ dev_info(hdev->dev, "[%s] opcode[0x%x] subopcode[0x%x];"
+ " status[0x%x] result0[0x%x] result1[0x%x]\n",
+ __func__, cmd->opcode, cmd->info_0.subopcode,
+ status, result[0], result[1]);
+
+ spraid_bsg_unmap_data(hdev, job);
+
+ return status;
+}
+
+static int spraid_alloc_ioq_ptcmds(struct spraid_dev *hdev)
+{
+ int i;
+ int ptnum = SPRAID_NR_IOQ_PTCMDS;
+
+ INIT_LIST_HEAD(&hdev->ioq_pt_list);
+ spin_lock_init(&hdev->ioq_pt_lock);
+
+ hdev->ioq_ptcmds = kcalloc_node(ptnum, sizeof(struct spraid_cmd),
+ GFP_KERNEL, hdev->numa_node);
+
+ if (!hdev->ioq_ptcmds) {
+ dev_err(hdev->dev, "Alloc ioq_ptcmds failed\n");
+ return -ENOMEM;
+ }
+
+ for (i = 0; i < ptnum; i++) {
+ hdev->ioq_ptcmds[i].qid = i / SPRAID_PTCMDS_PERQ + 1;
+ hdev->ioq_ptcmds[i].cid = i % SPRAID_PTCMDS_PERQ
+ + SPRAID_IO_BLK_MQ_DEPTH;
+ list_add_tail(&(hdev->ioq_ptcmds[i].list), &hdev->ioq_pt_list);
+ }
+
+ dev_info(hdev->dev, "Alloc ioq_ptcmds success, ptnum[%d]\n", ptnum);
+
+ return 0;
+}
+
+static void spraid_free_ioq_ptcmds(struct spraid_dev *hdev)
+{
+ kfree(hdev->ioq_ptcmds);
+ hdev->ioq_ptcmds = NULL;
+
+ INIT_LIST_HEAD(&hdev->ioq_pt_list);
+}
+
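+/*
+ * Submit a passthrough command on its reserved I/O queue slot and wait
+ * for completion; sense data, when returned, is copied back via 'result'.
+ */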
+static int spraid_submit_ioq_sync_cmd(struct spraid_dev *hdev,
+ struct spraid_ioq_command *cmd,
+ u32 *result, u32 *reslen, u32 timeout)
+{
+ int ret;
+ dma_addr_t sense_dma;
+ struct spraid_queue *ioq;
+ void *sense_addr = NULL;
+ struct spraid_cmd *pt_cmd = spraid_get_cmd(hdev, SPRAID_CMD_IOPT);
+
+ if (!pt_cmd) {
+ dev_err(hdev->dev, "err, get ioq cmd failed\n");
+ return -EFAULT;
+ }
+
+ timeout = timeout ? timeout : ADMIN_TIMEOUT;
+
+ init_completion(&pt_cmd->cmd_done);
+
+ ioq = &hdev->queues[pt_cmd->qid];
+ ret = pt_cmd->cid * SCSI_SENSE_BUFFERSIZE;
+ sense_addr = ioq->sense + ret;
+ sense_dma = ioq->sense_dma_addr + ret;
+
+ cmd->common.sense_addr = cpu_to_le64(sense_dma);
+ cmd->common.sense_len = cpu_to_le16(SCSI_SENSE_BUFFERSIZE);
+ cmd->common.command_id = pt_cmd->cid;
+
+ spraid_submit_cmd(ioq, cmd);
+
+ if (!wait_for_completion_timeout(&pt_cmd->cmd_done, timeout)) {
+ dev_err(hdev->dev, "[%s] cid[%d] qid[%d] timeout;"
+ " opcode[0x%x] subopcode[0x%x]\n",
+ __func__, pt_cmd->cid, pt_cmd->qid, cmd->common.opcode,
+ (le32_to_cpu(cmd->common.cdw3[0]) & 0xffff));
+ WRITE_ONCE(pt_cmd->state, SPRAID_CMD_TIMEOUT);
+ spraid_put_cmd(hdev, pt_cmd, SPRAID_CMD_IOPT);
+ return -EINVAL;
+ }
+
+ if (result && reslen) {
+ if ((pt_cmd->status & 0x17f) == 0x101) {
+ memcpy(result, sense_addr, SCSI_SENSE_BUFFERSIZE);
+ *reslen = SCSI_SENSE_BUFFERSIZE;
+ }
+ }
+
+ spraid_put_cmd(hdev, pt_cmd, SPRAID_CMD_IOPT);
+
+ return pt_cmd->status;
+}
+
+static int spraid_user_ioq_cmd(struct spraid_dev *hdev, struct bsg_job *job)
+{
+ struct spraid_bsg_request *bsg_req =
+ (struct spraid_bsg_request *)(job->request);
+ struct spraid_ioq_passthru_cmd *cmd = &(bsg_req->ioqcmd);
+ struct spraid_ioq_command ioq_cmd;
+ int status = 0;
+ u32 timeout = msecs_to_jiffies(cmd->timeout_ms);
+
+ if (cmd->data_len > PAGE_SIZE) {
+ dev_err(hdev->dev, "[%s] data len bigger than 4k\n", __func__);
+ return -EFAULT;
+ }
+
+ if (hdev->state != SPRAID_LIVE) {
+ dev_err(hdev->dev, "[%s] err, host state:[%d] is not live\n",
+ __func__, hdev->state);
+ return -EBUSY;
+ }
+
+ dev_info(hdev->dev, "[%s] opcode[0x%x] subopcode[0x%x] init;"
+ " datalen[%d]\n",
+ __func__, cmd->opcode, cmd->info_1.subopcode, cmd->data_len);
+
+ memset(&ioq_cmd, 0, sizeof(ioq_cmd));
+ ioq_cmd.common.opcode = cmd->opcode;
+ ioq_cmd.common.flags = cmd->flags;
+ ioq_cmd.common.hdid = cpu_to_le32(cmd->nsid);
+ ioq_cmd.common.sense_len = cpu_to_le16(cmd->info_0.res_sense_len);
+ ioq_cmd.common.cdb_len = cmd->info_0.cdb_len;
+ ioq_cmd.common.rsvd2 = cmd->info_0.rsvd0;
+ ioq_cmd.common.cdw3[0] = cpu_to_le32(cmd->cdw3);
+ ioq_cmd.common.cdw3[1] = cpu_to_le32(cmd->cdw4);
+ ioq_cmd.common.cdw3[2] = cpu_to_le32(cmd->cdw5);
+
+ ioq_cmd.common.cdw10[0] = cpu_to_le32(cmd->cdw10);
+ ioq_cmd.common.cdw10[1] = cpu_to_le32(cmd->cdw11);
+ ioq_cmd.common.cdw10[2] = cpu_to_le32(cmd->cdw12);
+ ioq_cmd.common.cdw10[3] = cpu_to_le32(cmd->cdw13);
+ ioq_cmd.common.cdw10[4] = cpu_to_le32(cmd->cdw14);
+ ioq_cmd.common.cdw10[5] = cpu_to_le32(cmd->data_len);
+
+ memcpy(ioq_cmd.common.cdb, &cmd->cdw16, cmd->info_0.cdb_len);
+
+ ioq_cmd.common.cdw26[0] = cpu_to_le32(cmd->cdw26[0]);
+ ioq_cmd.common.cdw26[1] = cpu_to_le32(cmd->cdw26[1]);
+ ioq_cmd.common.cdw26[2] = cpu_to_le32(cmd->cdw26[2]);
+ ioq_cmd.common.cdw26[3] = cpu_to_le32(cmd->cdw26[3]);
+
+ status = spraid_bsg_map_data(hdev, job,
+ (struct spraid_admin_command *)&ioq_cmd);
+ if (status) {
+ dev_err(hdev->dev, "[%s] err, map data failed\n", __func__);
+ return status;
+ }
+
+ status = spraid_submit_ioq_sync_cmd(hdev, &ioq_cmd, job->reply,
+ &job->reply_len, timeout);
+
+ dev_info(hdev->dev, "[%s] opcode[0x%x] subopcode[0x%x], status[0x%x];"
+ " reply_len[%d]\n",
+ __func__, cmd->opcode, cmd->info_1.subopcode,
+ status, job->reply_len);
+
+ spraid_bsg_unmap_data(hdev, job);
+
+ return status;
+}
+
+static bool spraid_check_scmd_completed(struct scsi_cmnd *scmd)
+{
+ struct spraid_dev *hdev = shost_priv(scmd->device->host);
+ struct spraid_iod *iod = scsi_cmd_priv(scmd);
+ struct spraid_queue *spraidq;
+ u16 hwq, cid;
+
+ spraid_get_tag_from_scmd(scmd, &hwq, &cid);
+ spraidq = &hdev->queues[hwq];
+ if (READ_ONCE(iod->state) == SPRAID_CMD_COMPLETE
+ || spraid_poll_cq(spraidq, cid)) {
+ dev_warn(hdev->dev, "cid[%d] qid[%d] has been completed\n",
+ cid, spraidq->qid);
+ return true;
+ }
+ return false;
+}
+
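+/*
+ * Block layer timeout callback: if the command is still in flight past
+ * its deadline, mark it timed out so SCSI error handling takes over;
+ * otherwise keep the timer running.
+ */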
+static enum blk_eh_timer_return spraid_scmd_timeout(struct scsi_cmnd *scmd)
+{
+ struct spraid_iod *iod = scsi_cmd_priv(scmd);
+ unsigned int timeout = scmd->device->request_queue->rq_timeout;
+
+ if (spraid_check_scmd_completed(scmd))
+ goto out;
+
+ if (time_after(jiffies, scmd->jiffies_at_alloc + timeout)) {
+ if (cmpxchg(&iod->state,
+ SPRAID_CMD_IN_FLIGHT,
+ SPRAID_CMD_TIMEOUT) == SPRAID_CMD_IN_FLIGHT) {
+ return BLK_EH_DONE;
+ }
+ }
+out:
+ return BLK_EH_RESET_TIMER;
+}
+
+/* for now, the abort command is sent via the admin queue */
+static int spraid_send_abort_cmd(struct spraid_dev *hdev,
+ u32 hdid, u16 qid, u16 cid)
+{
+ struct spraid_admin_command admin_cmd;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.abort.opcode = SPRAID_ADMIN_ABORT_CMD;
+ admin_cmd.abort.hdid = cpu_to_le32(hdid);
+ admin_cmd.abort.sqid = cpu_to_le16(qid);
+ admin_cmd.abort.cid = cpu_to_le16(cid);
+
+ return spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+}
+
+/* for now, the reset command is sent via the admin queue */
+static int spraid_send_reset_cmd(struct spraid_dev *hdev, int type, u32 hdid)
+{
+ struct spraid_admin_command admin_cmd;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.reset.opcode = SPRAID_ADMIN_RESET;
+ admin_cmd.reset.hdid = cpu_to_le32(hdid);
+ admin_cmd.reset.type = type;
+
+ return spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+}
+
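+/*
+ * Host state machine: NEW/RESETTING -> LIVE, LIVE -> RESETTING,
+ * anything but DELETING -> DELETING, NEW/LIVE/RESETTING -> DEAD.
+ * Returns true if the transition was legal and applied.
+ */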
+static bool spraid_change_host_state(struct spraid_dev *hdev,
+ enum spraid_state newstate)
+{
+ unsigned long flags;
+ enum spraid_state oldstate;
+ bool change = false;
+
+ spin_lock_irqsave(&hdev->state_lock, flags);
+
+ oldstate = hdev->state;
+ switch (newstate) {
+ case SPRAID_LIVE:
+ switch (oldstate) {
+ case SPRAID_NEW:
+ case SPRAID_RESETTING:
+ change = true;
+ break;
+ default:
+ break;
+ }
+ break;
+ case SPRAID_RESETTING:
+ switch (oldstate) {
+ case SPRAID_LIVE:
+ change = true;
+ break;
+ default:
+ break;
+ }
+ break;
+ case SPRAID_DELETING:
+ if (oldstate != SPRAID_DELETING)
+ change = true;
+ break;
+ case SPRAID_DEAD:
+ switch (oldstate) {
+ case SPRAID_NEW:
+ case SPRAID_LIVE:
+ case SPRAID_RESETTING:
+ change = true;
+ break;
+ default:
+ break;
+ }
+ break;
+ default:
+ break;
+ }
+ if (change)
+ hdev->state = newstate;
+ spin_unlock_irqrestore(&hdev->state_lock, flags);
+
+ dev_info(hdev->dev, "[%s][%d]->[%d], change[%d]\n",
+ __func__, oldstate, newstate, change);
+
+ return change;
+}
+
+static void spraid_back_fault_cqe(struct spraid_queue *ioq,
+ struct spraid_completion *cqe)
+{
+ struct spraid_dev *hdev = ioq->hdev;
+ struct blk_mq_tags *tags;
+ struct scsi_cmnd *scmd;
+ struct spraid_iod *iod;
+ struct request *req;
+
+ tags = hdev->shost->tag_set.tags[ioq->qid - 1];
+ req = blk_mq_tag_to_rq(tags, cqe->cmd_id);
+ if (unlikely(!req || !blk_mq_request_started(req)))
+ return;
+
+ scmd = blk_mq_rq_to_pdu(req);
+ iod = scsi_cmd_priv(scmd);
+
+ set_host_byte(scmd, DID_NO_CONNECT);
+ if (iod->nsge)
+ scsi_dma_unmap(scmd);
+ spraid_free_iod_res(hdev, iod);
+ scmd->scsi_done(scmd);
+ dev_warn(hdev->dev, "Back fault CQE, cid[%d] qid[%d]\n",
+ cqe->cmd_id, ioq->qid);
+}
+
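+/*
+ * Complete every started command on all I/O queues with DID_NO_CONNECT,
+ * blocking new requests while the outstanding ones are failed back.
+ */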
+static void spraid_back_all_io(struct spraid_dev *hdev)
+{
+ int i, j;
+ struct spraid_queue *ioq;
+ struct spraid_completion cqe = { 0 };
+
+ scsi_block_requests(hdev->shost);
+
+ for (i = 1; i <= hdev->shost->nr_hw_queues; i++) {
+ ioq = &hdev->queues[i];
+ for (j = 0; j < hdev->shost->can_queue; j++) {
+ cqe.cmd_id = j;
+ spraid_back_fault_cqe(ioq, &cqe);
+ }
+ }
+
+ scsi_unblock_requests(hdev->shost);
+}
+
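+/*
+ * Quiesce the controller (shutdown or plain disable), wait up to 600s
+ * for it to leave the ready state, reap the admin completion queue,
+ * then tear down PCI resources and fail back all outstanding I/O.
+ */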
+static void spraid_dev_disable(struct spraid_dev *hdev, bool shutdown)
+{
+ struct spraid_queue *adminq = &hdev->queues[0];
+ u16 start, end;
+ unsigned long timeout = jiffies + 600 * HZ;
+
+ if (pci_device_is_present(hdev->pdev)) {
+ if (shutdown)
+ spraid_shutdown_ctrl(hdev);
+ else
+ spraid_disable_ctrl(hdev);
+ }
+
+ while (!time_after(jiffies, timeout)) {
+ if (!pci_device_is_present(hdev->pdev)) {
+ dev_info(hdev->dev, "[%s] pci_device not present;"
+ " skip wait\n", __func__);
+ break;
+ }
+ if (!spraid_wait_ready(hdev, hdev->cap, false)) {
+ dev_info(hdev->dev,
+ "[%s] wait ready success after reset\n",
+ __func__);
+ break;
+ }
+ dev_info(hdev->dev, "[%s] waiting csts_rdy ready\n", __func__);
+ }
+
+ if (hdev->queue_count == 0) {
+ dev_err(hdev->dev, "[%s] warn, queue has been delete\n",
+ __func__);
+ return;
+ }
+
+ spin_lock_irq(&adminq->cq_lock);
+ spraid_process_cq(adminq, &start, &end, -1);
+ spin_unlock_irq(&adminq->cq_lock);
+ spraid_complete_cqes(adminq, start, end);
+
+ spraid_pci_disable(hdev);
+
+ spraid_back_all_io(hdev);
+}
+
+static void spraid_reset_work(struct work_struct *work)
+{
+ int ret;
+ struct spraid_dev *hdev =
+ container_of(work, struct spraid_dev, reset_work);
+
+ if (hdev->state != SPRAID_RESETTING) {
+ dev_err(hdev->dev, "[%s] err, host is not reset state\n",
+ __func__);
+ return;
+ }
+
+ dev_info(hdev->dev, "[%s] enter host reset\n", __func__);
+
+ if (hdev->ctrl_config & SPRAID_CC_ENABLE) {
+ dev_info(hdev->dev, "[%s] start dev_disable\n", __func__);
+ spraid_dev_disable(hdev, false);
+ }
+
+ ret = spraid_pci_enable(hdev);
+ if (ret)
+ goto out;
+
+ ret = spraid_setup_admin_queue(hdev);
+ if (ret)
+ goto pci_disable;
+
+ ret = spraid_setup_io_queues(hdev);
+ if (ret || hdev->online_queues <= hdev->shost->nr_hw_queues)
+ goto pci_disable;
+
+ spraid_change_host_state(hdev, SPRAID_LIVE);
+
+ spraid_send_all_aen(hdev);
+
+ return;
+
+pci_disable:
+ spraid_pci_disable(hdev);
+out:
+ spraid_change_host_state(hdev, SPRAID_DEAD);
+ dev_err(hdev->dev, "[%s] err, host reset failed\n", __func__);
+}
+
+static int spraid_reset_work_sync(struct spraid_dev *hdev)
+{
+ if (!spraid_change_host_state(hdev, SPRAID_RESETTING)) {
+ dev_info(hdev->dev, "[%s] can't change to reset state\n",
+ __func__);
+ return -EBUSY;
+ }
+
+ if (!queue_work(spraid_wq, &hdev->reset_work)) {
+ dev_err(hdev->dev, "[%s] err, host is already in reset state\n",
+ __func__);
+ return -EBUSY;
+ }
+
+ flush_work(&hdev->reset_work);
+ if (hdev->state != SPRAID_LIVE)
+ return -ENODEV;
+
+ return 0;
+}
+
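+/* poll (500ms per round) for an aborted/reset command to be completed */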
+static int spraid_wait_abnl_cmd_done(struct spraid_iod *iod)
+{
+ u16 times = 0;
+
+ do {
+ if (READ_ONCE(iod->state) == SPRAID_CMD_TMO_COMPLETE)
+ break;
+ msleep(500);
+ times++;
+ } while (times <= SPRAID_WAIT_ABNL_CMD_TIMEOUT);
+
+ /* command still not completed after a successful abort/reset */
+ if (times >= SPRAID_WAIT_ABNL_CMD_TIMEOUT)
+ return -ETIMEDOUT;
+
+ return 0;
+}
+
+static int spraid_abort_handler(struct scsi_cmnd *scmd)
+{
+ struct spraid_dev *hdev = shost_priv(scmd->device->host);
+ struct spraid_iod *iod = scsi_cmd_priv(scmd);
+ struct spraid_sdev_hostdata *hostdata;
+ u16 hwq, cid;
+ int ret;
+
+ scsi_print_command(scmd);
+
+ if (hdev->state != SPRAID_LIVE || !spraid_wait_abnl_cmd_done(iod) ||
+ spraid_check_scmd_completed(scmd))
+ return SUCCESS;
+
+ hostdata = scmd->device->hostdata;
+ spraid_get_tag_from_scmd(scmd, &hwq, &cid);
+
+ dev_warn(hdev->dev, "cid[%d] qid[%d] timeout, aborting\n", cid, hwq);
+ ret = spraid_send_abort_cmd(hdev, hostdata->hdid, hwq, cid);
+ if (ret != ADMIN_ERR_TIMEOUT) {
+ ret = spraid_wait_abnl_cmd_done(iod);
+ if (ret) {
+ dev_warn(hdev->dev, "cid[%d] qid[%d] abort failed;"
+ " not found\n", cid, hwq);
+ return FAILED;
+ }
+ dev_warn(hdev->dev, "cid[%d] qid[%d] abort succ\n", cid, hwq);
+ return SUCCESS;
+ }
+ dev_warn(hdev->dev, "cid[%d] qid[%d] abort failed, timeout\n",
+ cid, hwq);
+ return FAILED;
+}
+
+static int spraid_tgt_reset_handler(struct scsi_cmnd *scmd)
+{
+ struct spraid_dev *hdev = shost_priv(scmd->device->host);
+ struct spraid_iod *iod = scsi_cmd_priv(scmd);
+ struct spraid_sdev_hostdata *hostdata;
+ u16 hwq, cid;
+ int ret;
+
+ scsi_print_command(scmd);
+
+ if (hdev->state != SPRAID_LIVE || !spraid_wait_abnl_cmd_done(iod) ||
+ spraid_check_scmd_completed(scmd))
+ return SUCCESS;
+
+ hostdata = scmd->device->hostdata;
+ spraid_get_tag_from_scmd(scmd, &hwq, &cid);
+
+ dev_warn(hdev->dev, "cid[%d] qid[%d] timeout, target reset\n",
+ cid, hwq);
+ ret = spraid_send_reset_cmd(hdev, SPRAID_RESET_TARGET, hostdata->hdid);
+ if (ret == 0) {
+ ret = spraid_wait_abnl_cmd_done(iod);
+ if (ret) {
+ dev_warn(hdev->dev,
+ "cid[%d] qid[%d]target reset failed;"
+ " not found\n", cid, hwq);
+ return FAILED;
+ }
+
+ dev_warn(hdev->dev, "cid[%d] qid[%d] target reset success\n",
+ cid, hwq);
+ return SUCCESS;
+ }
+
+ dev_warn(hdev->dev, "cid[%d] qid[%d] ret[%d] target reset failed\n",
+ cid, hwq, ret);
+ return FAILED;
+}
+
+static int spraid_bus_reset_handler(struct scsi_cmnd *scmd)
+{
+ struct spraid_dev *hdev = shost_priv(scmd->device->host);
+ struct spraid_iod *iod = scsi_cmd_priv(scmd);
+ struct spraid_sdev_hostdata *hostdata;
+ u16 hwq, cid;
+ int ret;
+
+ scsi_print_command(scmd);
+
+ if (hdev->state != SPRAID_LIVE || !spraid_wait_abnl_cmd_done(iod) ||
+ spraid_check_scmd_completed(scmd))
+ return SUCCESS;
+
+ hostdata = scmd->device->hostdata;
+ spraid_get_tag_from_scmd(scmd, &hwq, &cid);
+
+ dev_warn(hdev->dev, "cid[%d] qid[%d] timeout, bus reset\n", cid, hwq);
+ ret = spraid_send_reset_cmd(hdev, SPRAID_RESET_BUS, hostdata->hdid);
+ if (ret == 0) {
+ ret = spraid_wait_abnl_cmd_done(iod);
+ if (ret) {
+ dev_warn(hdev->dev,
+ "cid[%d] qid[%d] bus reset failed;"
+ " not found\n", cid, hwq);
+ return FAILED;
+ }
+
+ dev_warn(hdev->dev, "cid[%d] qid[%d] bus reset succ\n",
+ cid, hwq);
+ return SUCCESS;
+ }
+
+ dev_warn(hdev->dev, "cid[%d] qid[%d] ret[%d] bus reset failed\n",
+ cid, hwq, ret);
+ return FAILED;
+}
+
+static int spraid_shost_reset_handler(struct scsi_cmnd *scmd)
+{
+ u16 hwq, cid;
+ struct spraid_dev *hdev = shost_priv(scmd->device->host);
+
+ scsi_print_command(scmd);
+ if (hdev->state != SPRAID_LIVE || spraid_check_scmd_completed(scmd))
+ return SUCCESS;
+
+ spraid_get_tag_from_scmd(scmd, &hwq, &cid);
+ dev_warn(hdev->dev, "cid[%d] qid[%d] host reset\n", cid, hwq);
+
+ if (spraid_reset_work_sync(hdev)) {
+ dev_warn(hdev->dev, "cid[%d] qid[%d] host reset failed\n",
+ cid, hwq);
+ return FAILED;
+ }
+
+ dev_warn(hdev->dev, "cid[%d] qid[%d] host reset success\n", cid, hwq);
+
+ return SUCCESS;
+}
+
+static pci_ers_result_t spraid_pci_error_detected(struct pci_dev *pdev,
+ pci_channel_state_t state)
+{
+ struct spraid_dev *hdev = pci_get_drvdata(pdev);
+
+ dev_info(hdev->dev, "enter pci error detect, state:%d\n", state);
+
+ switch (state) {
+ case pci_channel_io_normal:
+ dev_warn(hdev->dev, "channel is normal, do nothing\n");
+
+ return PCI_ERS_RESULT_CAN_RECOVER;
+ case pci_channel_io_frozen:
+ dev_warn(hdev->dev,
+ "channel io frozen, need reset controller\n");
+
+ scsi_block_requests(hdev->shost);
+
+ spraid_change_host_state(hdev, SPRAID_RESETTING);
+
+ return PCI_ERS_RESULT_NEED_RESET;
+ case pci_channel_io_perm_failure:
+ dev_warn(hdev->dev, "channel io failure, request disconnect\n");
+
+ return PCI_ERS_RESULT_DISCONNECT;
+ }
+
+ return PCI_ERS_RESULT_NEED_RESET;
+}
+
+static pci_ers_result_t spraid_pci_slot_reset(struct pci_dev *pdev)
+{
+ struct spraid_dev *hdev = pci_get_drvdata(pdev);
+
+ dev_info(hdev->dev, "restart after slot reset\n");
+
+ pci_restore_state(pdev);
+
+ if (!queue_work(spraid_wq, &hdev->reset_work)) {
+ dev_err(hdev->dev, "[%s] err, the device is resetting state\n",
+ __func__);
+ return PCI_ERS_RESULT_NONE;
+ }
+
+ flush_work(&hdev->reset_work);
+
+ scsi_unblock_requests(hdev->shost);
+
+ return PCI_ERS_RESULT_RECOVERED;
+}
+
+static void spraid_reset_done(struct pci_dev *pdev)
+{
+ struct spraid_dev *hdev = pci_get_drvdata(pdev);
+
+ dev_info(hdev->dev, "enter spraid reset done\n");
+}
+
+static ssize_t csts_pp_show(struct device *cdev,
+ struct device_attribute *attr, char *buf)
+{
+ struct Scsi_Host *shost = class_to_shost(cdev);
+ struct spraid_dev *hdev = shost_priv(shost);
+ int ret = -1;
+
+ if (pci_device_is_present(hdev->pdev)) {
+ ret = (readl(hdev->bar + SPRAID_REG_CSTS)
+ & SPRAID_CSTS_PP_MASK);
+ ret >>= SPRAID_CSTS_PP_SHIFT;
+ }
+
+ return snprintf(buf, PAGE_SIZE, "%d\n", ret);
+}
+
+static ssize_t csts_shst_show(struct device *cdev,
+ struct device_attribute *attr, char *buf)
+{
+ struct Scsi_Host *shost = class_to_shost(cdev);
+ struct spraid_dev *hdev = shost_priv(shost);
+ int ret = -1;
+
+ if (pci_device_is_present(hdev->pdev)) {
+ ret = (readl(hdev->bar + SPRAID_REG_CSTS)
+ & SPRAID_CSTS_SHST_MASK);
+ ret >>= SPRAID_CSTS_SHST_SHIFT;
+ }
+
+ return snprintf(buf, PAGE_SIZE, "%d\n", ret);
+}
+
+static ssize_t csts_cfs_show(struct device *cdev,
+ struct device_attribute *attr, char *buf)
+{
+ struct Scsi_Host *shost = class_to_shost(cdev);
+ struct spraid_dev *hdev = shost_priv(shost);
+ int ret = -1;
+
+ if (pci_device_is_present(hdev->pdev)) {
+ ret = (readl(hdev->bar + SPRAID_REG_CSTS)
+ & SPRAID_CSTS_CFS_MASK);
+ ret >>= SPRAID_CSTS_CFS_SHIFT;
+ }
+
+ return snprintf(buf, PAGE_SIZE, "%d\n", ret);
+}
+
+static ssize_t csts_rdy_show(struct device *cdev,
+ struct device_attribute *attr, char *buf)
+{
+ struct Scsi_Host *shost = class_to_shost(cdev);
+ struct spraid_dev *hdev = shost_priv(shost);
+ int ret = -1;
+
+ if (pci_device_is_present(hdev->pdev))
+ ret = (readl(hdev->bar + SPRAID_REG_CSTS) & SPRAID_CSTS_RDY);
+
+ return snprintf(buf, PAGE_SIZE, "%d\n", ret);
+}
+
+static ssize_t fw_version_show(struct device *cdev,
+ struct device_attribute *attr, char *buf)
+{
+ struct Scsi_Host *shost = class_to_shost(cdev);
+ struct spraid_dev *hdev = shost_priv(shost);
+
+ return snprintf(buf, PAGE_SIZE, "%s\n", hdev->ctrl_info->fr);
+}
+
+static DEVICE_ATTR_RO(csts_pp);
+static DEVICE_ATTR_RO(csts_shst);
+static DEVICE_ATTR_RO(csts_cfs);
+static DEVICE_ATTR_RO(csts_rdy);
+static DEVICE_ATTR_RO(fw_version);
+
+static struct device_attribute *spraid_host_attrs[] = {
+ &dev_attr_csts_pp,
+ &dev_attr_csts_shst,
+ &dev_attr_csts_cfs,
+ &dev_attr_csts_rdy,
+ &dev_attr_fw_version,
+ NULL,
+};
+
+static int spraid_get_vd_info(struct spraid_dev *hdev,
+ struct spraid_vd_info *vd_info, u16 vid)
+{
+ struct spraid_admin_command admin_cmd;
+ u8 *data_ptr = NULL;
+ dma_addr_t data_dma = 0;
+ int ret;
+
+ data_ptr = dma_alloc_coherent(hdev->dev, PAGE_SIZE,
+ &data_dma, GFP_KERNEL);
+ if (!data_ptr)
+ return -ENOMEM;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.usr_cmd.opcode = USR_CMD_READ;
+ admin_cmd.usr_cmd.info_0.subopcode = cpu_to_le16(USR_CMD_VDINFO);
+ admin_cmd.usr_cmd.info_1.data_len = cpu_to_le16(USR_CMD_RDLEN);
+ admin_cmd.usr_cmd.info_1.param_len = cpu_to_le16(VDINFO_PARAM_LEN);
+ admin_cmd.usr_cmd.cdw10 = cpu_to_le32(vid);
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
+
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+ if (!ret)
+ memcpy(vd_info, data_ptr, sizeof(struct spraid_vd_info));
+
+ dma_free_coherent(hdev->dev, PAGE_SIZE, data_ptr, data_dma);
+
+ return ret;
+}
+
+static int spraid_get_bgtask(struct spraid_dev *hdev,
+ struct spraid_bgtask *bgtask)
+{
+ struct spraid_admin_command admin_cmd;
+ u8 *data_ptr = NULL;
+ dma_addr_t data_dma = 0;
+ int ret;
+
+ data_ptr = dma_alloc_coherent(hdev->dev, PAGE_SIZE,
+ &data_dma, GFP_KERNEL);
+ if (!data_ptr)
+ return -ENOMEM;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.usr_cmd.opcode = USR_CMD_READ;
+ admin_cmd.usr_cmd.info_0.subopcode = cpu_to_le16(USR_CMD_BGTASK);
+ admin_cmd.usr_cmd.info_1.data_len = cpu_to_le16(USR_CMD_RDLEN);
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
+
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+ if (!ret)
+ memcpy(bgtask, data_ptr, sizeof(struct spraid_bgtask));
+
+ dma_free_coherent(hdev->dev, PAGE_SIZE, data_ptr, data_dma);
+
+ return ret;
+}
+
+static ssize_t raid_level_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct scsi_device *sdev;
+ struct spraid_dev *hdev;
+ struct spraid_vd_info *vd_info;
+ struct spraid_sdev_hostdata *hostdata;
+ int ret;
+
+ sdev = to_scsi_device(dev);
+ hdev = shost_priv(sdev->host);
+ hostdata = sdev->hostdata;
+
+ vd_info = kmalloc(sizeof(*vd_info), GFP_KERNEL);
+ if (!vd_info || !SPRAID_DEV_INFO_ATTR_VD(hostdata->attr))
+ return snprintf(buf, PAGE_SIZE, "NA\n");
+
+ ret = spraid_get_vd_info(hdev, vd_info, sdev->id);
+ if (ret)
+ vd_info->rg_level = ARRAY_SIZE(raid_levels) - 1;
+
+ ret = (vd_info->rg_level < ARRAY_SIZE(raid_levels)) ?
+ vd_info->rg_level : (ARRAY_SIZE(raid_levels) - 1);
+
+ kfree(vd_info);
+
+ return snprintf(buf, PAGE_SIZE, "RAID-%s\n", raid_levels[ret]);
+}
+
+static ssize_t raid_state_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct scsi_device *sdev;
+ struct spraid_dev *hdev;
+ struct spraid_vd_info *vd_info;
+ struct spraid_sdev_hostdata *hostdata;
+ int ret;
+
+ sdev = to_scsi_device(dev);
+ hdev = shost_priv(sdev->host);
+ hostdata = sdev->hostdata;
+
+ vd_info = kmalloc(sizeof(*vd_info), GFP_KERNEL);
+ if (!vd_info || !SPRAID_DEV_INFO_ATTR_VD(hostdata->attr))
+ return snprintf(buf, PAGE_SIZE, "NA\n");
+
+ ret = spraid_get_vd_info(hdev, vd_info, sdev->id);
+ if (ret) {
+ vd_info->vd_status = 0;
+ vd_info->rg_id = 0xff;
+ }
+
+ ret = (vd_info->vd_status < ARRAY_SIZE(raid_states)) ?
+ vd_info->vd_status : 0;
+
+ kfree(vd_info);
+
+ return snprintf(buf, PAGE_SIZE, "%s\n", raid_states[ret]);
+}
+
+static ssize_t raid_resync_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct scsi_device *sdev;
+ struct spraid_dev *hdev;
+ struct spraid_vd_info *vd_info;
+ struct spraid_bgtask *bgtask;
+ struct spraid_sdev_hostdata *hostdata;
+ u8 rg_id, i, progress = 0;
+ int ret;
+
+ sdev = to_scsi_device(dev);
+ hdev = shost_priv(sdev->host);
+ hostdata = sdev->hostdata;
+
+ vd_info = kmalloc(sizeof(*vd_info), GFP_KERNEL);
+ if (!vd_info || !SPRAID_DEV_INFO_ATTR_VD(hostdata->attr))
+ return snprintf(buf, PAGE_SIZE, "NA\n");
+
+ ret = spraid_get_vd_info(hdev, vd_info, sdev->id);
+ if (ret)
+ goto out;
+
+ rg_id = vd_info->rg_id;
+
+ bgtask = (struct spraid_bgtask *)vd_info;
+ ret = spraid_get_bgtask(hdev, bgtask);
+ if (ret)
+ goto out;
+ for (i = 0; i < bgtask->task_num; i++) {
+ if ((bgtask->bgtask[i].type == BGTASK_TYPE_REBUILD) &&
+ (le16_to_cpu(bgtask->bgtask[i].vd_id) == rg_id))
+ progress = bgtask->bgtask[i].progress;
+ }
+
+out:
+ kfree(vd_info);
+ return snprintf(buf, PAGE_SIZE, "%d\n", progress);
+}
+
+static DEVICE_ATTR_RO(raid_level);
+static DEVICE_ATTR_RO(raid_state);
+static DEVICE_ATTR_RO(raid_resync);
+
+static struct device_attribute *spraid_dev_attrs[] = {
+ &dev_attr_raid_level,
+ &dev_attr_raid_state,
+ &dev_attr_raid_resync,
+ NULL,
+};
+
+static struct pci_error_handlers spraid_err_handler = {
+ .error_detected = spraid_pci_error_detected,
+ .slot_reset = spraid_pci_slot_reset,
+ .reset_done = spraid_reset_done,
+};
+
+static int spraid_sysfs_host_reset(struct Scsi_Host *shost, int reset_type)
+{
+ int ret;
+ struct spraid_dev *hdev = shost_priv(shost);
+
+ dev_info(hdev->dev, "[%s] start sysfs host reset cmd\n", __func__);
+ ret = spraid_reset_work_sync(hdev);
+ dev_info(hdev->dev, "[%s] stop sysfs host reset cmd[%d]\n",
+ __func__, ret);
+
+ return ret;
+}
+
+static struct scsi_host_template spraid_driver_template = {
+ .module = THIS_MODULE,
+ .name = "Ramaxel Logic spraid driver",
+ .proc_name = "spraid",
+ .queuecommand = spraid_queue_command,
+ .slave_alloc = spraid_slave_alloc,
+ .slave_destroy = spraid_slave_destroy,
+ .slave_configure = spraid_slave_configure,
+ .eh_timed_out = spraid_scmd_timeout,
+ .eh_abort_handler = spraid_abort_handler,
+ .eh_target_reset_handler = spraid_tgt_reset_handler,
+ .eh_bus_reset_handler = spraid_bus_reset_handler,
+ .eh_host_reset_handler = spraid_shost_reset_handler,
+ .change_queue_depth = scsi_change_queue_depth,
+ .this_id = -1,
+ .shost_attrs = spraid_host_attrs,
+ .sdev_attrs = spraid_dev_attrs,
+ .host_reset = spraid_sysfs_host_reset,
+};
+
+static void spraid_shutdown(struct pci_dev *pdev)
+{
+ struct spraid_dev *hdev = pci_get_drvdata(pdev);
+
+ spraid_remove_io_queues(hdev);
+ spraid_disable_admin_queue(hdev, true);
+}
+
+/* bsg dispatch user command */
+static int spraid_bsg_host_dispatch(struct bsg_job *job)
+{
+ struct Scsi_Host *shost = dev_to_shost(job->dev);
+ struct spraid_dev *hdev = shost_priv(shost);
+ struct request *rq = blk_mq_rq_from_pdu(job);
+ struct spraid_bsg_request *bsg_req = job->request;
+ int ret = 0;
+
+ dev_info(hdev->dev, "[%s] msgcode[%d], msglen[%d], timeout[%d];"
+ " req_nsge[%d], req_len[%d]\n",
+ __func__, bsg_req->msgcode, job->request_len,
+ rq->timeout, job->request_payload.sg_cnt,
+ job->request_payload.payload_len);
+
+ job->reply_len = 0;
+
+ switch (bsg_req->msgcode) {
+ case SPRAID_BSG_ADM:
+ ret = spraid_user_admin_cmd(hdev, job);
+ break;
+ case SPRAID_BSG_IOQ:
+ ret = spraid_user_ioq_cmd(hdev, job);
+ break;
+ default:
+ dev_info(hdev->dev, "[%s] unsupport msgcode[%d]\n",
+ __func__, bsg_req->msgcode);
+ break;
+ }
+
+ if (ret > 0)
+ ret = ret | (ret << 8);
+
+ bsg_job_done(job, ret, 0);
+ return 0;
+}
+
+static inline void spraid_remove_bsg(struct spraid_dev *hdev)
+{
+ if (hdev->bsg_queue) {
+ bsg_unregister_queue(hdev->bsg_queue);
+ blk_cleanup_queue(hdev->bsg_queue);
+ }
+}
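+
+/*
+ * PCI probe: map the device, bring up the admin and I/O queues, then
+ * register the SCSI host and its bsg node.
+ */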
+static int spraid_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+{
+ struct spraid_dev *hdev;
+ struct Scsi_Host *shost;
+ int node, ret;
+ char bsg_name[15];
+
+ shost = scsi_host_alloc(&spraid_driver_template, sizeof(*hdev));
+ if (!shost) {
+ dev_err(&pdev->dev, "Failed to allocate scsi host\n");
+ return -ENOMEM;
+ }
+ hdev = shost_priv(shost);
+ hdev->pdev = pdev;
+ hdev->dev = get_device(&pdev->dev);
+
+ node = dev_to_node(hdev->dev);
+ if (node == NUMA_NO_NODE) {
+ node = first_memory_node;
+ set_dev_node(hdev->dev, node);
+ }
+ hdev->numa_node = node;
+ hdev->shost = shost;
+ pci_set_drvdata(pdev, hdev);
+
+ ret = spraid_dev_map(hdev);
+ if (ret)
+ goto put_dev;
+
+ init_rwsem(&hdev->devices_rwsem);
+ INIT_WORK(&hdev->scan_work, spraid_scan_work);
+ INIT_WORK(&hdev->timesyn_work, spraid_timesyn_work);
+ INIT_WORK(&hdev->reset_work, spraid_reset_work);
+ INIT_WORK(&hdev->fw_act_work, spraid_fw_act_work);
+ spin_lock_init(&hdev->state_lock);
+
+ ret = spraid_alloc_resources(hdev);
+ if (ret)
+ goto dev_unmap;
+
+ ret = spraid_pci_enable(hdev);
+ if (ret)
+ goto resources_free;
+
+ ret = spraid_setup_admin_queue(hdev);
+ if (ret)
+ goto pci_disable;
+
+ ret = spraid_init_ctrl_info(hdev);
+ if (ret)
+ goto disable_admin_q;
+
+ ret = spraid_alloc_iod_ext_mem_pool(hdev);
+ if (ret)
+ goto disable_admin_q;
+
+ ret = spraid_setup_io_queues(hdev);
+ if (ret)
+ goto free_iod_mempool;
+
+ spraid_shost_init(hdev);
+
+ ret = scsi_add_host(hdev->shost, hdev->dev);
+ if (ret) {
+ dev_err(hdev->dev, "Add shost to system failed, ret: %d\n",
+ ret);
+ goto remove_io_queues;
+ }
+
+ snprintf(bsg_name, sizeof(bsg_name), "spraid%d", shost->host_no);
+ hdev->bsg_queue = bsg_setup_queue(&shost->shost_gendev, bsg_name,
+ spraid_bsg_host_dispatch,
+ spraid_cmd_size(hdev, true, false));
+ if (IS_ERR(hdev->bsg_queue)) {
+ dev_err(hdev->dev, "err, setup bsg failed\n");
+ hdev->bsg_queue = NULL;
+ goto remove_io_queues;
+ }
+
+ if (hdev->online_queues == SPRAID_ADMIN_QUEUE_NUM) {
+ dev_warn(hdev->dev, "warn only admin queue can be used\n");
+ return 0;
+ }
+
+ hdev->state = SPRAID_LIVE;
+
+ spraid_send_all_aen(hdev);
+
+ ret = spraid_dev_list_init(hdev);
+ if (ret)
+ goto remove_bsg;
+
+ ret = spraid_configure_timestamp(hdev);
+ if (ret)
+ dev_warn(hdev->dev, "init set timestamp failed\n");
+
+ ret = spraid_alloc_ioq_ptcmds(hdev);
+ if (ret)
+ goto remove_bsg;
+
+ scsi_scan_host(hdev->shost);
+
+ return 0;
+
+remove_bsg:
+ spraid_remove_bsg(hdev);
+remove_io_queues:
+ spraid_remove_io_queues(hdev);
+free_iod_mempool:
+ spraid_free_iod_ext_mem_pool(hdev);
+disable_admin_q:
+ spraid_disable_admin_queue(hdev, false);
+pci_disable:
+ spraid_pci_disable(hdev);
+resources_free:
+ spraid_free_resources(hdev);
+dev_unmap:
+ spraid_dev_unmap(hdev);
+put_dev:
+ put_device(hdev->dev);
+ scsi_host_put(shost);
+
+ return -ENODEV;
+}
+
+static void spraid_remove(struct pci_dev *pdev)
+{
+ struct spraid_dev *hdev = pci_get_drvdata(pdev);
+ struct Scsi_Host *shost = hdev->shost;
+
+ dev_info(hdev->dev, "enter spraid remove\n");
+
+ spraid_change_host_state(hdev, SPRAID_DELETING);
+ flush_work(&hdev->reset_work);
+
+ if (!pci_device_is_present(pdev))
+ spraid_back_all_io(hdev);
+
+ spraid_remove_bsg(hdev);
+ scsi_remove_host(shost);
+ spraid_free_ioq_ptcmds(hdev);
+ kfree(hdev->devices);
+ spraid_remove_io_queues(hdev);
+ spraid_free_iod_ext_mem_pool(hdev);
+ spraid_disable_admin_queue(hdev, false);
+ spraid_pci_disable(hdev);
+ spraid_free_resources(hdev);
+ spraid_dev_unmap(hdev);
+ put_device(hdev->dev);
+ scsi_host_put(shost);
+
+ dev_info(hdev->dev, "exit spraid remove\n");
+}
+
+static const struct pci_device_id spraid_id_table[] = {
+ { PCI_DEVICE(PCI_VENDOR_ID_RAMAXEL_LOGIC,
+ SPRAID_SERVER_DEVICE_HBA_DID) },
+ { PCI_DEVICE(PCI_VENDOR_ID_RAMAXEL_LOGIC,
+ SPRAID_SERVER_DEVICE_RAID_DID) },
+ { 0, }
+};
+MODULE_DEVICE_TABLE(pci, spraid_id_table);
+
+static struct pci_driver spraid_driver = {
+ .name = "spraid",
+ .id_table = spraid_id_table,
+ .probe = spraid_probe,
+ .remove = spraid_remove,
+ .shutdown = spraid_shutdown,
+ .err_handler = &spraid_err_handler,
+};
+
+static int __init spraid_init(void)
+{
+ int ret;
+
+ spraid_wq = alloc_workqueue("spraid-wq", WQ_UNBOUND | WQ_MEM_RECLAIM |
+ WQ_SYSFS, 0);
+ if (!spraid_wq)
+ return -ENOMEM;
+
+ spraid_class = class_create(THIS_MODULE, "spraid");
+ if (IS_ERR(spraid_class)) {
+ ret = PTR_ERR(spraid_class);
+ goto destroy_wq;
+ }
+
+ ret = pci_register_driver(&spraid_driver);
+ if (ret < 0)
+ goto destroy_class;
+
+ return 0;
+
+destroy_class:
+ class_destroy(spraid_class);
+destroy_wq:
+ destroy_workqueue(spraid_wq);
+
+ return ret;
+}
+
+static void __exit spraid_exit(void)
+{
+ pci_unregister_driver(&spraid_driver);
+ class_destroy(spraid_class);
+ destroy_workqueue(spraid_wq);
+ ida_destroy(&spraid_instance_ida);
+}
+
+MODULE_AUTHOR("songyl(a)ramaxel.com");
+MODULE_DESCRIPTION("Ramaxel Memory Technology SPraid Driver");
+MODULE_LICENSE("GPL");
+MODULE_VERSION(SPRAID_DRV_VERSION);
+module_init(spraid_init);
+module_exit(spraid_exit);
--
2.25.1
1
1
From: Matteo Croce <mcroce(a)microsoft.com>
maillist inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4BULZ
CVE: NA
Reference: https://lore.kernel.org/all/20201110202746.9690-1-mcroce@linux.microsoft.co…
----------------------------------------------------------------------
The kernel cmdline reboot= option offers some control over
how the reboot is issued.
Add handles in sysfs to allow setting these reboot options, so they
can be changed after the system has booted, not only at boot time.
The handles live under <sysfs>/kernel/reboot; they can be read to
get the current configuration and written to alter it.
# cd /sys/kernel/reboot/
# grep . *
cpu:0
force:0
mode:cold
type:acpi
# echo 2 >cpu
# echo yes >force
# echo soft >mode
# echo bios >type
# grep . *
cpu:2
force:1
mode:soft
type:bios
Before setting anything, the store handlers check for the CAP_SYS_BOOT
capability, so it's possible to allow an unprivileged process to change
these settings simply by relaxing the handles' permissions, without
opening them to the world.
Signed-off-by: Matteo Croce <mcroce(a)microsoft.com>
Signed-off-by: Hongyu Li <543306408(a)qq.com>
---
Documentation/ABI/testing/sysfs-kernel-reboot | 32 +++
kernel/reboot.c | 206 ++++++++++++++++++
2 files changed, 238 insertions(+)
create mode 100644 Documentation/ABI/testing/sysfs-kernel-reboot
diff --git a/Documentation/ABI/testing/sysfs-kernel-reboot b/Documentation/ABI/testing/sysfs-kernel-reboot
new file mode 100644
index 000000000000..837330fb2511
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-kernel-reboot
@@ -0,0 +1,32 @@
+What: /sys/kernel/reboot
+Date: November 2020
+KernelVersion: 5.11
+Contact: Matteo Croce <mcroce(a)microsoft.com>
+Description: Interface to set the kernel reboot behavior, similarly to
+ what can be done via the reboot= cmdline option.
+ (see Documentation/admin-guide/kernel-parameters.txt)
+
+What: /sys/kernel/reboot/mode
+Date: November 2020
+KernelVersion: 5.11
+Contact: Matteo Croce <mcroce(a)microsoft.com>
+Description: Reboot mode. Valid values are: cold warm hard soft gpio
+
+What: /sys/kernel/reboot/type
+Date: November 2020
+KernelVersion: 5.11
+Contact: Matteo Croce <mcroce(a)microsoft.com>
+Description: Reboot type. Valid values are: bios acpi kbd triple efi pci
+
+What: /sys/kernel/reboot/cpu
+Date: November 2020
+KernelVersion: 5.11
+Contact: Matteo Croce <mcroce(a)microsoft.com>
+Description: CPU number to use to reboot.
+
+What: /sys/kernel/reboot/force
+Date: November 2020
+KernelVersion: 5.11
+Contact: Matteo Croce <mcroce(a)microsoft.com>
+Description: Don't wait for any other CPUs on reboot and
+ avoid anything that could hang.
diff --git a/kernel/reboot.c b/kernel/reboot.c
index af6f23d8bea1..778979d92dc7 100644
--- a/kernel/reboot.c
+++ b/kernel/reboot.c
@@ -594,3 +594,209 @@ static int __init reboot_setup(char *str)
return 1;
}
__setup("reboot=", reboot_setup);
+
+#ifdef CONFIG_SYSFS
+
+#define REBOOT_COLD_STR "cold"
+#define REBOOT_WARM_STR "warm"
+#define REBOOT_HARD_STR "hard"
+#define REBOOT_SOFT_STR "soft"
+#define REBOOT_GPIO_STR "gpio"
+#define REBOOT_UNDEFINED_STR "undefined"
+
+#define BOOT_TRIPLE_STR "triple"
+#define BOOT_KBD_STR "kbd"
+#define BOOT_BIOS_STR "bios"
+#define BOOT_ACPI_STR "acpi"
+#define BOOT_EFI_STR "efi"
+#define BOOT_CF9_FORCE_STR "cf9_force"
+#define BOOT_CF9_SAFE_STR "cf9_safe"
+
+static ssize_t mode_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
+{
+ const char *val;
+
+ switch (reboot_mode) {
+ case REBOOT_COLD:
+ val = REBOOT_COLD_STR;
+ break;
+ case REBOOT_WARM:
+ val = REBOOT_WARM_STR;
+ break;
+ case REBOOT_HARD:
+ val = REBOOT_HARD_STR;
+ break;
+ case REBOOT_SOFT:
+ val = REBOOT_SOFT_STR;
+ break;
+ case REBOOT_GPIO:
+ val = REBOOT_GPIO_STR;
+ break;
+ default:
+ val = REBOOT_UNDEFINED_STR;
+ }
+
+ return sprintf(buf, "%s\n", val);
+}
+static ssize_t mode_store(struct kobject *kobj, struct kobj_attribute *attr,
+ const char *buf, size_t count)
+{
+ if (!capable(CAP_SYS_BOOT))
+ return -EPERM;
+
+ if (!strncmp(buf, REBOOT_COLD_STR, strlen(REBOOT_COLD_STR)))
+ reboot_mode = REBOOT_COLD;
+ else if (!strncmp(buf, REBOOT_WARM_STR, strlen(REBOOT_WARM_STR)))
+ reboot_mode = REBOOT_WARM;
+ else if (!strncmp(buf, REBOOT_HARD_STR, strlen(REBOOT_HARD_STR)))
+ reboot_mode = REBOOT_HARD;
+ else if (!strncmp(buf, REBOOT_SOFT_STR, strlen(REBOOT_SOFT_STR)))
+ reboot_mode = REBOOT_SOFT;
+ else if (!strncmp(buf, REBOOT_GPIO_STR, strlen(REBOOT_GPIO_STR)))
+ reboot_mode = REBOOT_GPIO;
+ else
+ return -EINVAL;
+
+ return count;
+}
+static struct kobj_attribute reboot_mode_attr = __ATTR_RW(mode);
+
+static ssize_t type_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
+{
+ const char *val;
+
+ switch (reboot_type) {
+ case BOOT_TRIPLE:
+ val = BOOT_TRIPLE_STR;
+ break;
+ case BOOT_KBD:
+ val = BOOT_KBD_STR;
+ break;
+ case BOOT_BIOS:
+ val = BOOT_BIOS_STR;
+ break;
+ case BOOT_ACPI:
+ val = BOOT_ACPI_STR;
+ break;
+ case BOOT_EFI:
+ val = BOOT_EFI_STR;
+ break;
+ case BOOT_CF9_FORCE:
+ val = BOOT_CF9_FORCE_STR;
+ break;
+ case BOOT_CF9_SAFE:
+ val = BOOT_CF9_SAFE_STR;
+ break;
+ default:
+ val = REBOOT_UNDEFINED_STR;
+ }
+
+ return sprintf(buf, "%s\n", val);
+}
+static ssize_t type_store(struct kobject *kobj, struct kobj_attribute *attr,
+ const char *buf, size_t count)
+{
+ if (!capable(CAP_SYS_BOOT))
+ return -EPERM;
+
+ if (!strncmp(buf, BOOT_TRIPLE_STR, strlen(BOOT_TRIPLE_STR)))
+ reboot_mode = BOOT_TRIPLE;
+ else if (!strncmp(buf, BOOT_KBD_STR, strlen(BOOT_KBD_STR)))
+ reboot_mode = BOOT_KBD;
+ else if (!strncmp(buf, BOOT_BIOS_STR, strlen(BOOT_BIOS_STR)))
+ reboot_mode = BOOT_BIOS;
+ else if (!strncmp(buf, BOOT_ACPI_STR, strlen(BOOT_ACPI_STR)))
+ reboot_mode = BOOT_ACPI;
+ else if (!strncmp(buf, BOOT_EFI_STR, strlen(BOOT_EFI_STR)))
+ reboot_mode = BOOT_EFI;
+ else if (!strncmp(buf, BOOT_CF9_FORCE_STR, strlen(BOOT_CF9_FORCE_STR)))
+ reboot_mode = BOOT_CF9_FORCE;
+ else if (!strncmp(buf, BOOT_CF9_SAFE_STR, strlen(BOOT_CF9_SAFE_STR)))
+ reboot_mode = BOOT_CF9_SAFE;
+ else
+ return -EINVAL;
+
+ return count;
+}
+static struct kobj_attribute reboot_type_attr = __ATTR_RW(type);
+
+static ssize_t cpu_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
+{
+ return sprintf(buf, "%d\n", reboot_cpu);
+}
+static ssize_t cpu_store(struct kobject *kobj, struct kobj_attribute *attr,
+ const char *buf, size_t count)
+{
+ unsigned int cpunum;
+ int rc;
+
+ if (!capable(CAP_SYS_BOOT))
+ return -EPERM;
+
+ rc = kstrtouint(buf, 0, &cpunum);
+
+ if (rc)
+ return rc;
+
+ if (cpunum >= num_possible_cpus())
+ return -ERANGE;
+
+ reboot_cpu = cpunum;
+
+ return count;
+}
+static struct kobj_attribute reboot_cpu_attr = __ATTR_RW(cpu);
+
+static ssize_t force_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
+{
+ return sprintf(buf, "%d\n", reboot_force);
+}
+static ssize_t force_store(struct kobject *kobj, struct kobj_attribute *attr,
+ const char *buf, size_t count)
+{
+ bool res;
+
+ if (!capable(CAP_SYS_BOOT))
+ return -EPERM;
+
+ if (kstrtobool(buf, &res))
+ return -EINVAL;
+
+ reboot_force = res;
+
+ return count;
+}
+static struct kobj_attribute reboot_force_attr = __ATTR_RW(force);
+
+static struct attribute *reboot_attrs[] = {
+ &reboot_mode_attr.attr,
+ &reboot_type_attr.attr,
+ &reboot_cpu_attr.attr,
+ &reboot_force_attr.attr,
+ NULL,
+};
+
+static const struct attribute_group reboot_attr_group = {
+ .attrs = reboot_attrs,
+};
+
+static int __init reboot_ksysfs_init(void)
+{
+ struct kobject *reboot_kobj;
+ int ret;
+
+ reboot_kobj = kobject_create_and_add("reboot", kernel_kobj);
+ if (!reboot_kobj)
+ return -ENOMEM;
+
+ ret = sysfs_create_group(reboot_kobj, &reboot_attr_group);
+ if (ret) {
+ kobject_put(reboot_kobj);
+ return ret;
+ }
+
+ return 0;
+}
+late_initcall(reboot_ksysfs_init);
+
+#endif
--
2.17.1
3
2
From: Nathan Chancellor <natechancellor(a)gmail.com>
maillist inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4BULZ
CVE: NA
Reference: https://lore.kernel.org/all/20201112035023.974748-1-natechancellor@gmail.co…
----------------------------------------------------------------------
Clang warns:
kernel/reboot.c:707:17: warning: implicit conversion from enumeration
type 'enum reboot_type' to different enumeration type 'enum reboot_mode'
[-Wenum-conversion]
reboot_mode = BOOT_TRIPLE;
~ ^~~~~~~~~~~
kernel/reboot.c:709:17: warning: implicit conversion from enumeration
type 'enum reboot_type' to different enumeration type 'enum reboot_mode'
[-Wenum-conversion]
reboot_mode = BOOT_KBD;
~ ^~~~~~~~
kernel/reboot.c:711:17: warning: implicit conversion from enumeration
type 'enum reboot_type' to different enumeration type 'enum reboot_mode'
[-Wenum-conversion]
reboot_mode = BOOT_BIOS;
~ ^~~~~~~~~
kernel/reboot.c:713:17: warning: implicit conversion from enumeration
type 'enum reboot_type' to different enumeration type 'enum reboot_mode'
[-Wenum-conversion]
reboot_mode = BOOT_ACPI;
~ ^~~~~~~~~
kernel/reboot.c:715:17: warning: implicit conversion from enumeration
type 'enum reboot_type' to different enumeration type 'enum reboot_mode'
[-Wenum-conversion]
reboot_mode = BOOT_EFI;
~ ^~~~~~~~
kernel/reboot.c:717:17: warning: implicit conversion from enumeration
type 'enum reboot_type' to different enumeration type 'enum reboot_mode'
[-Wenum-conversion]
reboot_mode = BOOT_CF9_FORCE;
~ ^~~~~~~~~~~~~~
kernel/reboot.c:719:17: warning: implicit conversion from enumeration
type 'enum reboot_type' to different enumeration type 'enum reboot_mode'
[-Wenum-conversion]
reboot_mode = BOOT_CF9_SAFE;
~ ^~~~~~~~~~~~~
7 warnings generated.
It seems that these assignments should be to reboot_type, not
reboot_mode. Fix it so there are no more warnings and the code works
properly.
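As a minimal sketch of the pattern the warning flags (reduced,
hypothetical enum definitions, not the kernel's real ones):

enum reboot_mode { REBOOT_COLD, REBOOT_WARM };
enum reboot_type { BOOT_ACPI, BOOT_KBD };

static enum reboot_mode reboot_mode;
static enum reboot_type reboot_type;

static void set_type(void)
{
	reboot_mode = BOOT_ACPI; /* -Wenum-conversion: wrong variable */
	reboot_type = BOOT_ACPI; /* what the code intended */
}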
Fixes: eab8da48579d ("reboot: allow to specify reboot mode via sysfs")
Link: https://github.com/ClangBuiltLinux/linux/issues/1197
Signed-off-by: Nathan Chancellor <natechancellor(a)gmail.com>
Reviewed-and-tested-by: Matteo Croce <mcroce(a)microsoft.com>
Reviewed-by: Petr Mladek <pmladek(a)suse.com>
Signed-off-by: Hongyu Li <543306408(a)qq.com>
---
kernel/reboot.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/kernel/reboot.c b/kernel/reboot.c
index 778979d92dc7..7a445c710f9a 100644
--- a/kernel/reboot.c
+++ b/kernel/reboot.c
@@ -700,19 +700,19 @@ static ssize_t type_store(struct kobject *kobj, struct kobj_attribute *attr,
return -EPERM;
if (!strncmp(buf, BOOT_TRIPLE_STR, strlen(BOOT_TRIPLE_STR)))
- reboot_mode = BOOT_TRIPLE;
+ reboot_type = BOOT_TRIPLE;
else if (!strncmp(buf, BOOT_KBD_STR, strlen(BOOT_KBD_STR)))
- reboot_mode = BOOT_KBD;
+ reboot_type = BOOT_KBD;
else if (!strncmp(buf, BOOT_BIOS_STR, strlen(BOOT_BIOS_STR)))
- reboot_mode = BOOT_BIOS;
+ reboot_type = BOOT_BIOS;
else if (!strncmp(buf, BOOT_ACPI_STR, strlen(BOOT_ACPI_STR)))
- reboot_mode = BOOT_ACPI;
+ reboot_type = BOOT_ACPI;
else if (!strncmp(buf, BOOT_EFI_STR, strlen(BOOT_EFI_STR)))
- reboot_mode = BOOT_EFI;
+ reboot_type = BOOT_EFI;
else if (!strncmp(buf, BOOT_CF9_FORCE_STR, strlen(BOOT_CF9_FORCE_STR)))
- reboot_mode = BOOT_CF9_FORCE;
+ reboot_type = BOOT_CF9_FORCE;
else if (!strncmp(buf, BOOT_CF9_SAFE_STR, strlen(BOOT_CF9_SAFE_STR)))
- reboot_mode = BOOT_CF9_SAFE;
+ reboot_type = BOOT_CF9_SAFE;
else
return -EINVAL;
--
2.17.1
2
1
[PATCH openEuler-1.0-LTS v2] scsi:spraid: support Ramaxel's spraid driver
by Yanling Song 22 Dec '21
22 Dec '21
Ramaxel inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4NJIB
CVE: NA
v1->v2:
1. Add defconfig entries for arm64 and x86
Support Ramaxel's SPRxxx RAID controller
Signed-off-by: Yanling Song <songyl(a)ramaxel.com>
Reviewed-by: Jiang Yu <yujiang(a)ramaxel.com>
---
arch/arm64/configs/openeuler_defconfig | 1 +
arch/x86/configs/openeuler_defconfig | 1 +
drivers/scsi/Kconfig | 1 +
drivers/scsi/Makefile | 1 +
drivers/scsi/spraid/Kconfig | 13 +
drivers/scsi/spraid/Makefile | 7 +
drivers/scsi/spraid/spraid.h | 746 +++++
drivers/scsi/spraid/spraid_main.c | 3875 ++++++++++++++++++++++++
8 files changed, 4645 insertions(+)
create mode 100644 drivers/scsi/spraid/Kconfig
create mode 100644 drivers/scsi/spraid/Makefile
create mode 100644 drivers/scsi/spraid/spraid.h
create mode 100644 drivers/scsi/spraid/spraid_main.c
diff --git a/arch/arm64/configs/openeuler_defconfig b/arch/arm64/configs/openeuler_defconfig
index d2167c80757f..47b0931b91bf 100644
--- a/arch/arm64/configs/openeuler_defconfig
+++ b/arch/arm64/configs/openeuler_defconfig
@@ -2143,6 +2143,7 @@ CONFIG_SCSI_MPT2SAS_MAX_SGE=128
CONFIG_SCSI_MPT3SAS_MAX_SGE=128
CONFIG_SCSI_MPT2SAS=m
CONFIG_SCSI_SMARTPQI=m
+CONFIG_RAMAXEL_SPRAID=m
# CONFIG_SCSI_UFSHCD is not set
# CONFIG_SCSI_HPTIOP is not set
CONFIG_LIBFC=m
diff --git a/arch/x86/configs/openeuler_defconfig b/arch/x86/configs/openeuler_defconfig
index 7043426d9780..786186288fea 100644
--- a/arch/x86/configs/openeuler_defconfig
+++ b/arch/x86/configs/openeuler_defconfig
@@ -2182,6 +2182,7 @@ CONFIG_SCSI_MPT2SAS_MAX_SGE=128
CONFIG_SCSI_MPT3SAS_MAX_SGE=128
CONFIG_SCSI_MPT2SAS=m
CONFIG_SCSI_SMARTPQI=m
+CONFIG_RAMAXEL_SPRAID=m
# CONFIG_SCSI_UFSHCD is not set
# CONFIG_SCSI_HPTIOP is not set
# CONFIG_SCSI_BUSLOGIC is not set
diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig
index 8450484184e3..63d2aaa22834 100644
--- a/drivers/scsi/Kconfig
+++ b/drivers/scsi/Kconfig
@@ -522,6 +522,7 @@ source "drivers/scsi/megaraid/Kconfig.megaraid"
source "drivers/scsi/mpt3sas/Kconfig"
source "drivers/scsi/smartpqi/Kconfig"
source "drivers/scsi/ufs/Kconfig"
+source "drivers/scsi/spraid/Kconfig"
config SCSI_HPTIOP
tristate "HighPoint RocketRAID 3xxx/4xxx Controller support"
diff --git a/drivers/scsi/Makefile b/drivers/scsi/Makefile
index 2973693f6dcc..4056cf26e09e 100644
--- a/drivers/scsi/Makefile
+++ b/drivers/scsi/Makefile
@@ -94,6 +94,7 @@ obj-$(CONFIG_SCSI_ZALON) += zalon7xx.o
obj-$(CONFIG_SCSI_DC395x) += dc395x.o
obj-$(CONFIG_SCSI_AM53C974) += esp_scsi.o am53c974.o
obj-$(CONFIG_CXLFLASH) += cxlflash/
+obj-$(CONFIG_RAMAXEL_SPRAID) += spraid/
obj-$(CONFIG_MEGARAID_LEGACY) += megaraid.o
obj-$(CONFIG_MEGARAID_NEWGEN) += megaraid/
obj-$(CONFIG_MEGARAID_SAS) += megaraid/
diff --git a/drivers/scsi/spraid/Kconfig b/drivers/scsi/spraid/Kconfig
new file mode 100644
index 000000000000..bfbba3db8db0
--- /dev/null
+++ b/drivers/scsi/spraid/Kconfig
@@ -0,0 +1,13 @@
+#
+# Ramaxel driver configuration
+#
+
+config RAMAXEL_SPRAID
+ tristate "Ramaxel spraid Adapter"
+ depends on PCI && SCSI
+ select BLK_DEV_BSGLIB
+ depends on ARM64 || X86_64
+ help
+ This driver supports the Ramaxel SPRxxx series of
+ RAID controllers, which have a PCIe Gen4 host
+ interface and support SAS/SATA HDDs and SSDs.
diff --git a/drivers/scsi/spraid/Makefile b/drivers/scsi/spraid/Makefile
new file mode 100644
index 000000000000..aadc2ffd37eb
--- /dev/null
+++ b/drivers/scsi/spraid/Makefile
@@ -0,0 +1,7 @@
+#
+# Makefile for the Ramaxel device drivers.
+#
+
+obj-$(CONFIG_RAMAXEL_SPRAID) += spraid.o
+
+spraid-objs := spraid_main.o
\ No newline at end of file
diff --git a/drivers/scsi/spraid/spraid.h b/drivers/scsi/spraid/spraid.h
new file mode 100644
index 000000000000..c1e4980e18e5
--- /dev/null
+++ b/drivers/scsi/spraid/spraid.h
@@ -0,0 +1,746 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Ramaxel Memory Technology, Ltd */
+
+#ifndef __SPRAID_H_
+#define __SPRAID_H_
+
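+/* field extractors for the 64-bit controller CAP register */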
+#define SPRAID_CAP_MQES(cap) ((cap) & 0xffff)
+#define SPRAID_CAP_STRIDE(cap) (((cap) >> 32) & 0xf)
+#define SPRAID_CAP_MPSMIN(cap) (((cap) >> 48) & 0xf)
+#define SPRAID_CAP_MPSMAX(cap) (((cap) >> 52) & 0xf)
+#define SPRAID_CAP_TIMEOUT(cap) (((cap) >> 24) & 0xff)
+#define SPRAID_CAP_DMAMASK(cap) (((cap) >> 37) & 0xff)
+
+#define SPRAID_DEFAULT_MAX_CHANNEL 4
+#define SPRAID_DEFAULT_MAX_ID 240
+#define SPRAID_DEFAULT_MAX_LUN_PER_HOST 8
+#define MAX_SECTORS 2048
+
+#define IO_SQE_SIZE sizeof(struct spraid_ioq_command)
+#define ADMIN_SQE_SIZE sizeof(struct spraid_admin_command)
+#define SQE_SIZE(qid) (((qid) > 0) ? IO_SQE_SIZE : ADMIN_SQE_SIZE)
+#define CQ_SIZE(depth) ((depth) * sizeof(struct spraid_completion))
+#define SQ_SIZE(qid, depth) ((depth) * SQE_SIZE(qid))
+
+#define SENSE_SIZE(depth) ((depth) * SCSI_SENSE_BUFFERSIZE)
+
+#define SPRAID_AQ_DEPTH 128
+#define SPRAID_NR_AEN_COMMANDS 16
+#define SPRAID_AQ_BLK_MQ_DEPTH (SPRAID_AQ_DEPTH - SPRAID_NR_AEN_COMMANDS)
+#define SPRAID_AQ_MQ_TAG_DEPTH (SPRAID_AQ_BLK_MQ_DEPTH - 1)
+
+#define SPRAID_ADMIN_QUEUE_NUM 1
+#define SPRAID_PTCMDS_PERQ 1
+#define SPRAID_IO_BLK_MQ_DEPTH (hdev->shost->can_queue)
+#define SPRAID_NR_IOQ_PTCMDS (SPRAID_PTCMDS_PERQ * hdev->shost->nr_hw_queues)
+
+#define FUA_MASK 0x08
+#define SPRAID_MINORS BIT(MINORBITS)
+
+#define COMMAND_IS_WRITE(cmd) ((cmd)->common.opcode & 1)
+
+#define SPRAID_IO_IOSQES 7
+#define SPRAID_IO_IOCQES 4
+#define PRP_ENTRY_SIZE 8
+
+#define SMALL_POOL_SIZE 256
+#define MAX_SMALL_POOL_NUM 16
+#define MAX_CMD_PER_DEV 64
+#define MAX_CDB_LEN 32
+
+#define SPRAID_UP_TO_MULTY4(x) (((x) + 4) & (~0x03))
+
+#define CQE_STATUS_SUCCESS (0x0)
+
+#define PCI_VENDOR_ID_RAMAXEL_LOGIC 0x1E81
+
+#define SPRAID_SERVER_DEVICE_HBA_DID 0x2100
+#define SPRAID_SERVER_DEVICE_RAID_DID 0x2200
+
+#define IO_6_DEFAULT_TX_LEN 256
+
+#define SPRAID_INT_PAGES 2
+#define SPRAID_INT_BYTES(hdev) (SPRAID_INT_PAGES * (hdev)->page_size)
+
+enum {
+ SPRAID_REQ_CANCELLED = (1 << 0),
+ SPRAID_REQ_USERCMD = (1 << 1),
+};
+
+enum {
+ SPRAID_SC_SUCCESS = 0x0,
+ SPRAID_SC_INVALID_OPCODE = 0x1,
+ SPRAID_SC_INVALID_FIELD = 0x2,
+
+ SPRAID_SC_ABORT_LIMIT = 0x103,
+ SPRAID_SC_ABORT_MISSING = 0x104,
+ SPRAID_SC_ASYNC_LIMIT = 0x105,
+
+ SPRAID_SC_DNR = 0x4000,
+};
+
+enum {
+ SPRAID_REG_CAP = 0x0000,
+ SPRAID_REG_CC = 0x0014,
+ SPRAID_REG_CSTS = 0x001c,
+ SPRAID_REG_AQA = 0x0024,
+ SPRAID_REG_ASQ = 0x0028,
+ SPRAID_REG_ACQ = 0x0030,
+ SPRAID_REG_DBS = 0x1000,
+};
+
+enum {
+ SPRAID_CC_ENABLE = 1 << 0,
+ SPRAID_CC_CSS_NVM = 0 << 4,
+ SPRAID_CC_MPS_SHIFT = 7,
+ SPRAID_CC_AMS_SHIFT = 11,
+ SPRAID_CC_SHN_SHIFT = 14,
+ SPRAID_CC_IOSQES_SHIFT = 16,
+ SPRAID_CC_IOCQES_SHIFT = 20,
+ SPRAID_CC_AMS_RR = 0 << SPRAID_CC_AMS_SHIFT,
+ SPRAID_CC_SHN_NONE = 0 << SPRAID_CC_SHN_SHIFT,
+ SPRAID_CC_IOSQES = SPRAID_IO_IOSQES << SPRAID_CC_IOSQES_SHIFT,
+ SPRAID_CC_IOCQES = SPRAID_IO_IOCQES << SPRAID_CC_IOCQES_SHIFT,
+ SPRAID_CC_SHN_NORMAL = 1 << SPRAID_CC_SHN_SHIFT,
+ SPRAID_CC_SHN_MASK = 3 << SPRAID_CC_SHN_SHIFT,
+ SPRAID_CSTS_CFS_SHIFT = 1,
+ SPRAID_CSTS_SHST_SHIFT = 2,
+ SPRAID_CSTS_PP_SHIFT = 5,
+ SPRAID_CSTS_RDY = 1 << 0,
+ SPRAID_CSTS_SHST_CMPLT = 2 << 2,
+ SPRAID_CSTS_SHST_MASK = 3 << 2,
+ SPRAID_CSTS_CFS_MASK = 1 << SPRAID_CSTS_CFS_SHIFT,
+ SPRAID_CSTS_PP_MASK = 1 << SPRAID_CSTS_PP_SHIFT,
+};
+
+enum {
+ SPRAID_ADMIN_DELETE_SQ = 0x00,
+ SPRAID_ADMIN_CREATE_SQ = 0x01,
+ SPRAID_ADMIN_DELETE_CQ = 0x04,
+ SPRAID_ADMIN_CREATE_CQ = 0x05,
+ SPRAID_ADMIN_ABORT_CMD = 0x08,
+ SPRAID_ADMIN_SET_FEATURES = 0x09,
+ SPRAID_ADMIN_ASYNC_EVENT = 0x0c,
+ SPRAID_ADMIN_GET_INFO = 0xc6,
+ SPRAID_ADMIN_RESET = 0xc8,
+};
+
+enum {
+ SPRAID_GET_INFO_CTRL = 0,
+ SPRAID_GET_INFO_DEV_LIST = 1,
+};
+
+enum {
+ SPRAID_RESET_TARGET = 0,
+ SPRAID_RESET_BUS = 1,
+};
+
+enum {
+ SPRAID_AEN_ERROR = 0,
+ SPRAID_AEN_NOTICE = 2,
+ SPRAID_AEN_VS = 7,
+};
+
+enum {
+ SPRAID_AEN_DEV_CHANGED = 0x00,
+ SPRAID_AEN_FW_ACT_START = 0x01,
+ SPRAID_AEN_HOST_PROBING = 0x10,
+};
+
+enum {
+ SPRAID_AEN_TIMESYN = 0x00,
+ SPRAID_AEN_FW_ACT_FINISH = 0x02,
+ SPRAID_AEN_EVENT_MIN = 0x80,
+ SPRAID_AEN_EVENT_MAX = 0xff,
+};
+
+enum {
+ SPRAID_CMD_WRITE = 0x01,
+ SPRAID_CMD_READ = 0x02,
+
+ SPRAID_CMD_NONIO_NONE = 0x80,
+ SPRAID_CMD_NONIO_TODEV = 0x81,
+ SPRAID_CMD_NONIO_FROMDEV = 0x82,
+};
+
+enum {
+ SPRAID_QUEUE_PHYS_CONTIG = (1 << 0),
+ SPRAID_CQ_IRQ_ENABLED = (1 << 1),
+
+ SPRAID_FEAT_NUM_QUEUES = 0x07,
+ SPRAID_FEAT_ASYNC_EVENT = 0x0b,
+ SPRAID_FEAT_TIMESTAMP = 0x0e,
+};
+
+enum spraid_state {
+ SPRAID_NEW,
+ SPRAID_LIVE,
+ SPRAID_RESETTING,
+ SPRAID_DELETING,
+ SPRAID_DEAD,
+};
+
+enum {
+ SPRAID_CARD_HBA,
+ SPRAID_CARD_RAID,
+};
+
+enum spraid_cmd_type {
+ SPRAID_CMD_ADM,
+ SPRAID_CMD_IOPT,
+};
+
+struct spraid_completion {
+ __le32 result;
+ union {
+ struct {
+ __u8 sense_len;
+ __u8 resv[3];
+ };
+ __le32 result1;
+ };
+ __le16 sq_head;
+ __le16 sq_id;
+ __u16 cmd_id;
+ __le16 status;
+};
+
+struct spraid_ctrl_info {
+ __le32 nd;
+ __le16 max_cmds;
+ __le16 max_channel;
+ __le32 max_tgt_id;
+ __le16 max_lun;
+ __le16 max_num_sge;
+ __le16 lun_num_in_boot;
+ __u8 mdts;
+ __u8 acl;
+ __u8 aerl;
+ __u8 card_type;
+ __u16 rsvd;
+ __u32 rtd3e;
+ __u8 sn[32];
+ __u8 fr[16];
+ __u8 rsvd1[4020];
+};
+
+struct spraid_dev {
+ struct pci_dev *pdev;
+ struct device *dev;
+ struct Scsi_Host *shost;
+ struct spraid_queue *queues;
+ struct dma_pool *prp_page_pool;
+ struct dma_pool *prp_small_pool[MAX_SMALL_POOL_NUM];
+ mempool_t *iod_mempool;
+ void __iomem *bar;
+ u32 max_qid;
+ u32 num_vecs;
+ u32 queue_count;
+ u32 ioq_depth;
+ int db_stride;
+ u32 __iomem *dbs;
+ struct rw_semaphore devices_rwsem;
+ int numa_node;
+ u32 page_size;
+ u32 ctrl_config;
+ u32 online_queues;
+ u64 cap;
+ int instance;
+ struct spraid_ctrl_info *ctrl_info;
+ struct spraid_dev_info *devices;
+
+ struct spraid_cmd *adm_cmds;
+ struct list_head adm_cmd_list;
+ spinlock_t adm_cmd_lock;
+
+ struct spraid_cmd *ioq_ptcmds;
+ struct list_head ioq_pt_list;
+ spinlock_t ioq_pt_lock;
+
+ struct work_struct scan_work;
+ struct work_struct timesyn_work;
+ struct work_struct reset_work;
+ struct work_struct fw_act_work;
+
+ enum spraid_state state;
+ spinlock_t state_lock;
+
+ struct request_queue *bsg_queue;
+};
+
+struct spraid_sgl_desc {
+ __le64 addr;
+ __le32 length;
+ __u8 rsvd[3];
+ __u8 type;
+};
+
+union spraid_data_ptr {
+ struct {
+ __le64 prp1;
+ __le64 prp2;
+ };
+ struct spraid_sgl_desc sgl;
+};
+
+struct spraid_admin_common_command {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ __le32 cdw2[4];
+ union spraid_data_ptr dptr;
+ __le32 cdw10;
+ __le32 cdw11;
+ __le32 cdw12;
+ __le32 cdw13;
+ __le32 cdw14;
+ __le32 cdw15;
+};
+
+struct spraid_features {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ __u64 rsvd2[2];
+ union spraid_data_ptr dptr;
+ __le32 fid;
+ __le32 dword11;
+ __le32 dword12;
+ __le32 dword13;
+ __le32 dword14;
+ __le32 dword15;
+};
+
+struct spraid_create_cq {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __u32 rsvd1[5];
+ __le64 prp1;
+ __u64 rsvd8;
+ __le16 cqid;
+ __le16 qsize;
+ __le16 cq_flags;
+ __le16 irq_vector;
+ __u32 rsvd12[4];
+};
+
+struct spraid_create_sq {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __u32 rsvd1[5];
+ __le64 prp1;
+ __u64 rsvd8;
+ __le16 sqid;
+ __le16 qsize;
+ __le16 sq_flags;
+ __le16 cqid;
+ __u32 rsvd12[4];
+};
+
+struct spraid_delete_queue {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __u32 rsvd1[9];
+ __le16 qid;
+ __u16 rsvd10;
+ __u32 rsvd11[5];
+};
+
+struct spraid_get_info {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ __u32 rsvd2[4];
+ union spraid_data_ptr dptr;
+ __u8 type;
+ __u8 rsvd10[3];
+ __le32 cdw11;
+ __u32 rsvd12[4];
+};
+
+struct spraid_usr_cmd {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ union {
+ struct {
+ __le16 subopcode;
+ __le16 rsvd1;
+ } info_0;
+ __le32 cdw2;
+ };
+ union {
+ struct {
+ __le16 data_len;
+ __le16 param_len;
+ } info_1;
+ __le32 cdw3;
+ };
+ __u64 metadata;
+ union spraid_data_ptr dptr;
+ __le32 cdw10;
+ __le32 cdw11;
+ __le32 cdw12;
+ __le32 cdw13;
+ __le32 cdw14;
+ __le32 cdw15;
+};
+
+enum {
+ SPRAID_CMD_FLAG_SGL_METABUF = (1 << 6),
+ SPRAID_CMD_FLAG_SGL_METASEG = (1 << 7),
+ SPRAID_CMD_FLAG_SGL_ALL = SPRAID_CMD_FLAG_SGL_METABUF |
+ SPRAID_CMD_FLAG_SGL_METASEG,
+};
+
+enum spraid_cmd_state {
+ SPRAID_CMD_IDLE = 0,
+ SPRAID_CMD_IN_FLIGHT = 1,
+ SPRAID_CMD_COMPLETE = 2,
+ SPRAID_CMD_TIMEOUT = 3,
+ SPRAID_CMD_TMO_COMPLETE = 4,
+};
+
+struct spraid_abort_cmd {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ __u64 rsvd2[4];
+ __le16 sqid;
+ __le16 cid;
+ __u32 rsvd11[5];
+};
+
+struct spraid_reset_cmd {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ __u64 rsvd2[4];
+ __u8 type;
+ __u8 rsvd10[3];
+ __u32 rsvd11[5];
+};
+
+struct spraid_admin_command {
+ union {
+ struct spraid_admin_common_command common;
+ struct spraid_features features;
+ struct spraid_create_cq create_cq;
+ struct spraid_create_sq create_sq;
+ struct spraid_delete_queue delete_queue;
+ struct spraid_get_info get_info;
+ struct spraid_abort_cmd abort;
+ struct spraid_reset_cmd reset;
+ struct spraid_usr_cmd usr_cmd;
+ };
+};
+
+struct spraid_ioq_common_command {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ __le16 sense_len;
+ __u8 cdb_len;
+ __u8 rsvd2;
+ __le32 cdw3[3];
+ union spraid_data_ptr dptr;
+ __le32 cdw10[6];
+ __u8 cdb[32];
+ __le64 sense_addr;
+ __le32 cdw26[6];
+};
+
+struct spraid_rw_command {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ __le16 sense_len;
+ __u8 cdb_len;
+ __u8 rsvd2;
+ __u32 rsvd3[3];
+ union spraid_data_ptr dptr;
+ __le64 slba;
+ __le16 nlb;
+ __le16 control;
+ __u32 rsvd13[3];
+ __u8 cdb[32];
+ __le64 sense_addr;
+ __u32 rsvd26[6];
+};
+
+struct spraid_scsi_nonio {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ __le16 sense_len;
+ __u8 cdb_length;
+ __u8 rsvd2;
+ __u32 rsvd3[3];
+ union spraid_data_ptr dptr;
+ __u32 rsvd10[5];
+ __le32 buffer_len;
+ __u8 cdb[32];
+ __le64 sense_addr;
+ __u32 rsvd26[6];
+};
+
+struct spraid_ioq_command {
+ union {
+ struct spraid_ioq_common_command common;
+ struct spraid_rw_command rw;
+ struct spraid_scsi_nonio scsi_nonio;
+ };
+};
+
+struct spraid_passthru_common_cmd {
+ __u8 opcode;
+ __u8 flags;
+ __u16 rsvd0;
+ __u32 nsid;
+ union {
+ struct {
+ __u16 subopcode;
+ __u16 rsvd1;
+ } info_0;
+ __u32 cdw2;
+ };
+ union {
+ struct {
+ __u16 data_len;
+ __u16 param_len;
+ } info_1;
+ __u32 cdw3;
+ };
+ __u64 metadata;
+
+ __u64 addr;
+ __u64 prp2;
+
+ __u32 cdw10;
+ __u32 cdw11;
+ __u32 cdw12;
+ __u32 cdw13;
+ __u32 cdw14;
+ __u32 cdw15;
+ __u32 timeout_ms;
+ __u32 result0;
+ __u32 result1;
+};
+
+struct spraid_ioq_passthru_cmd {
+ __u8 opcode;
+ __u8 flags;
+ __u16 rsvd0;
+ __u32 nsid;
+ union {
+ struct {
+ __u16 res_sense_len;
+ __u8 cdb_len;
+ __u8 rsvd0;
+ } info_0;
+ __u32 cdw2;
+ };
+ union {
+ struct {
+ __u16 subopcode;
+ __u16 rsvd1;
+ } info_1;
+ __u32 cdw3;
+ };
+ union {
+ struct {
+ __u16 rsvd;
+ __u16 param_len;
+ } info_2;
+ __u32 cdw4;
+ };
+ __u32 cdw5;
+ __u64 addr;
+ __u64 prp2;
+ union {
+ struct {
+ __u16 eid;
+ __u16 sid;
+ } info_3;
+ __u32 cdw10;
+ };
+ union {
+ struct {
+ __u16 did;
+ __u8 did_flag;
+ __u8 rsvd2;
+ } info_4;
+ __u32 cdw11;
+ };
+ __u32 cdw12;
+ __u32 cdw13;
+ __u32 cdw14;
+ __u32 data_len;
+ __u32 cdw16;
+ __u32 cdw17;
+ __u32 cdw18;
+ __u32 cdw19;
+ __u32 cdw20;
+ __u32 cdw21;
+ __u32 cdw22;
+ __u32 cdw23;
+ __u64 sense_addr;
+ __u32 cdw26[4];
+ __u32 timeout_ms;
+ __u32 result0;
+ __u32 result1;
+};
+
+struct spraid_bsg_request {
+ u32 msgcode;
+ u32 control;
+ union {
+ struct spraid_passthru_common_cmd admcmd;
+ struct spraid_ioq_passthru_cmd ioqcmd;
+ };
+};
+
+enum {
+ SPRAID_BSG_ADM,
+ SPRAID_BSG_IOQ,
+};
+
+struct spraid_cmd {
+ int qid;
+ int cid;
+ u32 result0;
+ u32 result1;
+ u16 status;
+ void *priv;
+ enum spraid_cmd_state state;
+ struct completion cmd_done;
+ struct list_head list;
+};
+
+struct spraid_queue {
+ struct spraid_dev *hdev;
+ spinlock_t sq_lock;
+
+ spinlock_t cq_lock ____cacheline_aligned_in_smp;
+
+ void *sq_cmds;
+
+ struct spraid_completion *cqes;
+
+ dma_addr_t sq_dma_addr;
+ dma_addr_t cq_dma_addr;
+ u32 __iomem *q_db;
+ u8 cq_phase;
+ u8 sqes;
+ u16 qid;
+ u16 sq_tail;
+ u16 cq_head;
+ u16 last_cq_head;
+ u16 q_depth;
+ s16 cq_vector;
+ void *sense;
+ dma_addr_t sense_dma_addr;
+ struct dma_pool *prp_small_pool;
+};
+
+struct spraid_iod {
+ struct spraid_queue *spraidq;
+ enum spraid_cmd_state state;
+ int npages;
+ u32 nsge;
+ u32 length;
+ bool use_sgl;
+ bool sg_drv_mgmt;
+ dma_addr_t first_dma;
+ void *sense;
+ dma_addr_t sense_dma;
+ struct scatterlist *sg;
+ struct scatterlist inline_sg[0];
+};
+
+#define SPRAID_DEV_INFO_ATTR_BOOT(attr) ((attr) & 0x01)
+#define SPRAID_DEV_INFO_ATTR_VD(attr) (((attr) & 0x02) == 0x0)
+#define SPRAID_DEV_INFO_ATTR_PT(attr) (((attr) & 0x22) == 0x02)
+#define SPRAID_DEV_INFO_ATTR_RAWDISK(attr) ((attr) & 0x20)
+
+#define SPRAID_DEV_INFO_FLAG_VALID(flag) ((flag) & 0x01)
+#define SPRAID_DEV_INFO_FLAG_CHANGE(flag) ((flag) & 0x02)
+
+#define BGTASK_TYPE_REBUILD 4
+#define USR_CMD_READ 0xc2
+#define USR_CMD_RDLEN 0x1000
+#define USR_CMD_VDINFO 0x704
+#define USR_CMD_BGTASK 0x504
+#define VDINFO_PARAM_LEN 0x04
+
+struct spraid_vd_info {
+ __u8 name[32];
+ __le16 id;
+ __u8 rg_id;
+ __u8 rg_level;
+ __u8 sg_num;
+ __u8 sg_disk_num;
+ __u8 vd_status;
+ __u8 vd_type;
+ __u8 rsvd1[4056];
+};
+
+#define MAX_REALTIME_BGTASK_NUM 32
+
+struct bgtask_info {
+ __u8 type;
+ __u8 progress;
+ __u8 rate;
+ __u8 rsvd0;
+ __le16 vd_id;
+ __le16 time_left;
+ __u8 rsvd1[4];
+};
+
+struct spraid_bgtask {
+ __u8 sw;
+ __u8 task_num;
+ __u8 rsvd[6];
+ struct bgtask_info bgtask[MAX_REALTIME_BGTASK_NUM];
+};
+
+struct spraid_dev_info {
+ __le32 hdid;
+ __le16 target;
+ __u8 channel;
+ __u8 lun;
+ __u8 attr;
+ __u8 flag;
+ __le16 max_io_kb;
+};
+
+#define MAX_DEV_ENTRY_PER_PAGE_4K 340
+struct spraid_dev_list {
+ __le32 dev_num;
+ __u32 rsvd0[3];
+ struct spraid_dev_info devices[MAX_DEV_ENTRY_PER_PAGE_4K];
+};
+
+struct spraid_sdev_hostdata {
+ u32 hdid;
+ u16 max_io_kb;
+ u8 attr;
+ u8 flag;
+ u8 rg_id;
+ u8 rsvd[3];
+};
+
+#endif
diff --git a/drivers/scsi/spraid/spraid_main.c b/drivers/scsi/spraid/spraid_main.c
new file mode 100644
index 000000000000..519b39f44e91
--- /dev/null
+++ b/drivers/scsi/spraid/spraid_main.c
@@ -0,0 +1,3875 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Ramaxel Memory Technology, Ltd */
+
+/* Ramaxel Raid SPXXX Series Linux Driver */
+
+#define pr_fmt(fmt) "spraid: " fmt
+
+#include <linux/sched/signal.h>
+#include <linux/version.h>
+#include <linux/pci.h>
+#include <linux/aer.h>
+#include <linux/module.h>
+#include <linux/ioport.h>
+#include <linux/device.h>
+#include <linux/delay.h>
+#include <linux/interrupt.h>
+#include <linux/cdev.h>
+#include <linux/sysfs.h>
+#include <linux/gfp.h>
+#include <linux/types.h>
+#include <linux/ratelimit.h>
+#include <linux/once.h>
+#include <linux/debugfs.h>
+#include <linux/io-64-nonatomic-lo-hi.h>
+#include <linux/blkdev.h>
+#include <linux/bsg-lib.h>
+#include <asm/unaligned.h>
+#include <linux/sort.h>
+#include <target/target_core_backend.h>
+
+#include <scsi/scsi.h>
+#include <scsi/scsi_cmnd.h>
+#include <scsi/scsi_device.h>
+#include <scsi/scsi_host.h>
+#include <scsi/scsi_transport.h>
+#include <scsi/scsi_dbg.h>
+
+
+#include "spraid.h"
+
+static u32 admin_tmout = 60;
+module_param(admin_tmout, uint, 0644);
+MODULE_PARM_DESC(admin_tmout, "admin commands timeout (seconds)");
+
+static u32 scmd_tmout_nonpt = 180;
+module_param(scmd_tmout_nonpt, uint, 0644);
+MODULE_PARM_DESC(scmd_tmout_nonpt,
+ "scsi commands timeout for rawdisk and raid devices (seconds)");
+
+static int ioq_depth_set(const char *val, const struct kernel_param *kp);
+static const struct kernel_param_ops ioq_depth_ops = {
+ .set = ioq_depth_set,
+ .get = param_get_uint,
+};
+
+static u32 io_queue_depth = 1024;
+module_param_cb(io_queue_depth, &ioq_depth_ops, &io_queue_depth, 0644);
+MODULE_PARM_DESC(io_queue_depth, "set io queue depth, should >= 2");
+
+static int log_debug_switch_set(const char *val, const struct kernel_param *kp)
+{
+ u8 n = 0;
+ int ret;
+
+ ret = kstrtou8(val, 10, &n);
+ if (ret != 0)
+ return -EINVAL;
+
+ return param_set_byte(val, kp);
+}
+
+static const struct kernel_param_ops log_debug_switch_ops = {
+ .set = log_debug_switch_set,
+ .get = param_get_byte,
+};
+
+static unsigned char log_debug_switch;
+module_param_cb(log_debug_switch, &log_debug_switch_ops,
+ &log_debug_switch, 0644);
+MODULE_PARM_DESC(log_debug_switch,
+ "set debug log state, 0 for off (default), non-zero for on");
+
+static int small_pool_num_set(const char *val, const struct kernel_param *kp)
+{
+ u8 n = 0;
+ int ret;
+
+ ret = kstrtou8(val, 10, &n);
+ if (ret != 0)
+ return -EINVAL;
+ if (n > MAX_SMALL_POOL_NUM)
+ n = MAX_SMALL_POOL_NUM;
+ if (n < 1)
+ n = 1;
+ *((u8 *)kp->arg) = n;
+
+ return 0;
+}
+
+static const struct kernel_param_ops small_pool_num_ops = {
+ .set = small_pool_num_set,
+ .get = param_get_byte,
+};
+
+/* It was found that the spinlock of a single pool conflicts
+ * a lot with multiple CPUs, so multiple pools are introduced
+ * to reduce the contention.
+ */
+static unsigned char small_pool_num = 4;
+module_param_cb(small_pool_num, &small_pool_num_ops, &small_pool_num, 0644);
+MODULE_PARM_DESC(small_pool_num, "set prp small pool num, default 4, MAX 16");
+
+static void spraid_free_queue(struct spraid_queue *spraidq);
+static void spraid_handle_aen_notice(struct spraid_dev *hdev, u32 result);
+static void spraid_handle_aen_vs(struct spraid_dev *hdev,
+ u32 result, u32 result1);
+
+static DEFINE_IDA(spraid_instance_ida);
+
+static struct class *spraid_class;
+
+#define SPRAID_CAP_TIMEOUT_UNIT_MS (HZ / 2)
+
+static struct workqueue_struct *spraid_wq;
+
+#define dev_log_dbg(dev, fmt, ...) do { \
+ if (unlikely(log_debug_switch)) \
+ dev_info(dev, "[%s] [%d] " fmt, \
+ __func__, __LINE__, ##__VA_ARGS__); \
+} while (0)
+
+#define SPRAID_DRV_VERSION "1.0.0.0"
+
+#define ADMIN_TIMEOUT (admin_tmout * HZ)
+#define ADMIN_ERR_TIMEOUT 32757
+
+#define SPRAID_WAIT_ABNL_CMD_TIMEOUT (3 * 2)
+
+#define SPRAID_DMA_MSK_BIT_MAX 64
+
+enum FW_STAT_CODE {
+ FW_STAT_OK = 0,
+ FW_STAT_NEED_CHECK,
+ FW_STAT_ERROR,
+ FW_STAT_EP_PCIE_ERROR,
+ FW_STAT_NAC_DMA_ERROR,
+ FW_STAT_ABORTED,
+ FW_STAT_NEED_RETRY
+};
+
+static const char * const raid_levels[] = {"0", "1", "5", "6", "10", "50", "60",
+ "NA"};
+
+static const char * const raid_states[] = {
+ "NA", "NORMAL", "FAULT", "DEGRADE", "NOT_FORMATTED",
+ "FORMATTING", "SANITIZING", "INITIALIZING", "INITIALIZE_FAIL",
+ "DELETING", "DELETE_FAIL", "WRITE_PROTECT"
+};
+
+static int ioq_depth_set(const char *val, const struct kernel_param *kp)
+{
+ int n = 0;
+ int ret;
+
+ ret = kstrtoint(val, 10, &n);
+ if (ret != 0 || n < 2)
+ return -EINVAL;
+
+ return param_set_int(val, kp);
+}
+
+static int spraid_remap_bar(struct spraid_dev *hdev, u32 size)
+{
+ struct pci_dev *pdev = hdev->pdev;
+
+ if (size > pci_resource_len(pdev, 0)) {
+ dev_err(hdev->dev, "Input size[%u] exceeds bar0 length[%llu]\n",
+ size, pci_resource_len(pdev, 0));
+ return -ENOMEM;
+ }
+
+ if (hdev->bar)
+ iounmap(hdev->bar);
+
+ hdev->bar = ioremap(pci_resource_start(pdev, 0), size);
+ if (!hdev->bar) {
+ dev_err(hdev->dev, "ioremap for bar0 failed\n");
+ return -ENOMEM;
+ }
+ hdev->dbs = hdev->bar + SPRAID_REG_DBS;
+
+ return 0;
+}
+
+static int spraid_dev_map(struct spraid_dev *hdev)
+{
+ struct pci_dev *pdev = hdev->pdev;
+ int ret;
+
+ ret = pci_request_mem_regions(pdev, "spraid");
+ if (ret) {
+ dev_err(hdev->dev, "fail to request memory regions\n");
+ return ret;
+ }
+
+ ret = spraid_remap_bar(hdev, SPRAID_REG_DBS + 4096);
+ if (ret) {
+ pci_release_mem_regions(pdev);
+ return ret;
+ }
+
+ return 0;
+}
+
+static void spraid_dev_unmap(struct spraid_dev *hdev)
+{
+ struct pci_dev *pdev = hdev->pdev;
+
+ if (hdev->bar) {
+ iounmap(hdev->bar);
+ hdev->bar = NULL;
+ }
+ pci_release_mem_regions(pdev);
+}
+
+static int spraid_pci_enable(struct spraid_dev *hdev)
+{
+ struct pci_dev *pdev = hdev->pdev;
+ int ret = -ENOMEM;
+ u64 maskbit = SPRAID_DMA_MSK_BIT_MAX;
+
+ if (pci_enable_device_mem(pdev)) {
+ dev_err(hdev->dev,
+ "Enable pci device memory resources failed\n");
+ return ret;
+ }
+ pci_set_master(pdev);
+
+ if (readl(hdev->bar + SPRAID_REG_CSTS) == U32_MAX) {
+ ret = -ENODEV;
+ dev_err(hdev->dev, "Read csts register failed\n");
+ goto disable;
+ }
+
+ ret = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_ALL_TYPES);
+ if (ret < 0) {
+ dev_err(hdev->dev,
+ "Allocate one IRQ for setup admin channel failed\n");
+ goto disable;
+ }
+
+ hdev->cap = lo_hi_readq(hdev->bar + SPRAID_REG_CAP);
+ hdev->ioq_depth = min_t(u32, SPRAID_CAP_MQES(hdev->cap) + 1,
+ io_queue_depth);
+ hdev->db_stride = 1 << SPRAID_CAP_STRIDE(hdev->cap);
+
+ maskbit = SPRAID_CAP_DMAMASK(hdev->cap);
+ if (maskbit < 32 || maskbit > SPRAID_DMA_MSK_BIT_MAX) {
+ dev_err(hdev->dev,
+ "err, dma mask invalid[%llu], set to default\n",
+ maskbit);
+ maskbit = SPRAID_DMA_MSK_BIT_MAX;
+ }
+ if (dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(maskbit))) {
+ dev_err(hdev->dev, "set dma mask and coherent failed\n");
+ goto disable;
+ }
+
+ dev_info(hdev->dev, "set dma mask[%llu] success\n", maskbit);
+
+ pci_enable_pcie_error_reporting(pdev);
+ pci_save_state(pdev);
+
+ return 0;
+
+disable:
+ pci_disable_device(pdev);
+ return ret;
+}
+
+static int spraid_npages_prp(u32 size, struct spraid_dev *hdev)
+{
+ u32 nprps = DIV_ROUND_UP(size + hdev->page_size, hdev->page_size);
+
+ return DIV_ROUND_UP(PRP_ENTRY_SIZE * nprps, PAGE_SIZE - PRP_ENTRY_SIZE);
+}
+
+static int spraid_npages_sgl(u32 nseg)
+{
+ return DIV_ROUND_UP(nseg * sizeof(struct spraid_sgl_desc), PAGE_SIZE);
+}
+
+static void **spraid_iod_list(struct spraid_iod *iod)
+{
+ return (void **)(iod->inline_sg + (iod->sg_drv_mgmt ? iod->nsge : 0));
+}
+
+static u32 spraid_iod_ext_size(struct spraid_dev *hdev, u32 size, u32 nsge,
+ bool sg_drv_mgmt, bool use_sgl)
+{
+ size_t alloc_size, sg_size;
+
+ if (use_sgl)
+ alloc_size = sizeof(__le64 *) * spraid_npages_sgl(nsge);
+ else
+ alloc_size = sizeof(__le64 *) * spraid_npages_prp(size, hdev);
+
+ sg_size = sg_drv_mgmt ? (sizeof(struct scatterlist) * nsge) : 0;
+ return sg_size + alloc_size;
+}
+
+static u32 spraid_cmd_size(struct spraid_dev *hdev,
+ bool sg_drv_mgmt, bool use_sgl)
+{
+ u32 alloc_size = spraid_iod_ext_size(hdev, SPRAID_INT_BYTES(hdev),
+ SPRAID_INT_PAGES, sg_drv_mgmt, use_sgl);
+
+ dev_info(hdev->dev, "sg_drv_mgmt: %s, use_sgl: %s, iod size: %lu,"
+ " alloc_size: %u\n", sg_drv_mgmt ? "true" : "false",
+ use_sgl ? "true" : "false",
+ sizeof(struct spraid_iod), alloc_size);
+
+ return sizeof(struct spraid_iod) + alloc_size;
+}
+
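+/*
+ * Build the PRP entries for the command's scatterlist. Data that fits in
+ * prp1 (plus at most one extra page in prp2) needs no list; otherwise a
+ * chain of PRP list pages is allocated, where the last slot of a full
+ * page is re-used as a link to the next page.
+ */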
+static int spraid_setup_prps(struct spraid_dev *hdev, struct spraid_iod *iod)
+{
+ struct scatterlist *sg = iod->sg;
+ u64 dma_addr = sg_dma_address(sg);
+ int dma_len = sg_dma_len(sg);
+ __le64 *prp_list, *old_prp_list;
+ u32 page_size = hdev->page_size;
+ int offset = dma_addr & (page_size - 1);
+ void **list = spraid_iod_list(iod);
+ int length = iod->length;
+ struct dma_pool *pool;
+ dma_addr_t prp_dma;
+ int nprps, i;
+
+ length -= (page_size - offset);
+ if (length <= 0) {
+ iod->first_dma = 0;
+ return 0;
+ }
+
+ dma_len -= (page_size - offset);
+ if (dma_len) {
+ dma_addr += (page_size - offset);
+ } else {
+ sg = sg_next(sg);
+ dma_addr = sg_dma_address(sg);
+ dma_len = sg_dma_len(sg);
+ }
+
+ if (length <= page_size) {
+ iod->first_dma = dma_addr;
+ return 0;
+ }
+
+ nprps = DIV_ROUND_UP(length, page_size);
+ if (nprps <= (SMALL_POOL_SIZE / PRP_ENTRY_SIZE)) {
+ pool = iod->spraidq->prp_small_pool;
+ iod->npages = 0;
+ } else {
+ pool = hdev->prp_page_pool;
+ iod->npages = 1;
+ }
+
+ prp_list = dma_pool_alloc(pool, GFP_ATOMIC, &prp_dma);
+ if (!prp_list) {
+ dev_err_ratelimited(hdev->dev,
+ "Allocate first prp_list memory failed\n");
+ iod->first_dma = dma_addr;
+ iod->npages = -1;
+ return -ENOMEM;
+ }
+ list[0] = prp_list;
+ iod->first_dma = prp_dma;
+ i = 0;
+ for (;;) {
+ if (i == page_size / PRP_ENTRY_SIZE) {
+ old_prp_list = prp_list;
+
+ prp_list = dma_pool_alloc(pool, GFP_ATOMIC, &prp_dma);
+ if (!prp_list) {
+ dev_err_ratelimited(hdev->dev, "Allocate %dth"
+ " prp_list memory failed\n",
+ iod->npages + 1);
+ return -ENOMEM;
+ }
+ list[iod->npages++] = prp_list;
+ prp_list[0] = old_prp_list[i - 1];
+ old_prp_list[i - 1] = cpu_to_le64(prp_dma);
+ i = 1;
+ }
+ prp_list[i++] = cpu_to_le64(dma_addr);
+ dma_len -= page_size;
+ dma_addr += page_size;
+ length -= page_size;
+ if (length <= 0)
+ break;
+ if (dma_len > 0)
+ continue;
+ if (unlikely(dma_len < 0))
+ goto bad_sgl;
+ sg = sg_next(sg);
+ dma_addr = sg_dma_address(sg);
+ dma_len = sg_dma_len(sg);
+ }
+
+ return 0;
+
+bad_sgl:
+ dev_err(hdev->dev,
+ "Setup prps, invalid SGL for payload: %d nents: %d\n",
+ iod->length, iod->nsge);
+ return -EIO;
+}
+
+#define SGES_PER_PAGE (PAGE_SIZE / sizeof(struct spraid_sgl_desc))
+
+static void spraid_submit_cmd(struct spraid_queue *spraidq, const void *cmd)
+{
+ u32 sqes = SQE_SIZE(spraidq->qid);
+ unsigned long flags;
+ struct spraid_admin_common_command *acd =
+ (struct spraid_admin_common_command *)cmd;
+
+ spin_lock_irqsave(&spraidq->sq_lock, flags);
+ memcpy((spraidq->sq_cmds + sqes * spraidq->sq_tail), cmd, sqes);
+ if (++spraidq->sq_tail == spraidq->q_depth)
+ spraidq->sq_tail = 0;
+
+ writel(spraidq->sq_tail, spraidq->q_db);
+ spin_unlock_irqrestore(&spraidq->sq_lock, flags);
+
+ dev_log_dbg(spraidq->hdev->dev,
+ "cid[%d] qid[%d], opcode[0x%x], flags[0x%x], hdid[%u]\n",
+ acd->command_id, spraidq->qid, acd->opcode,
+ acd->flags, le32_to_cpu(acd->hdid));
+}
+
+static u32 spraid_mod64(u64 dividend, u32 divisor)
+{
+ u64 d;
+ u32 remainder;
+
+ if (!divisor) {
+ pr_err("spraid_mod64: divisor is zero\n");
+ return 0;
+ }
+
+ d = dividend;
+ remainder = do_div(d, divisor);
+ return remainder;
+}
+
+static inline bool spraid_is_rw_scmd(struct scsi_cmnd *scmd)
+{
+ switch (scmd->cmnd[0]) {
+ case READ_6:
+ case READ_10:
+ case READ_12:
+ case READ_16:
+ case READ_32:
+ case WRITE_6:
+ case WRITE_10:
+ case WRITE_12:
+ case WRITE_16:
+ case WRITE_32:
+ return true;
+ default:
+ return false;
+ }
+}
+
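+/*
+ * A transfer can use PRPs only if every middle SG element starts on a
+ * page boundary and is a whole number of pages, the first element ends
+ * on a page boundary, and the last element starts on one; anything else
+ * must fall back to SGL mode.
+ */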
+static bool spraid_is_prp(struct spraid_dev *hdev,
+ struct scsi_cmnd *scmd, u32 nsge)
+{
+ struct scatterlist *sg = scsi_sglist(scmd);
+ u32 page_size = hdev->page_size;
+ bool is_prp = true;
+ int i = 0;
+
+ scsi_for_each_sg(scmd, sg, nsge, i) {
+ if (i != 0 && i != nsge - 1) {
+ if (spraid_mod64(sg_dma_len(sg), page_size) ||
+ spraid_mod64(sg_dma_address(sg), page_size)) {
+ is_prp = false;
+ break;
+ }
+ }
+
+ if (nsge > 1 && i == 0) {
+ if ((spraid_mod64((sg_dma_address(sg) + sg_dma_len(sg)),
+ page_size))) {
+ is_prp = false;
+ break;
+ }
+ }
+
+ if (nsge > 1 && i == (nsge - 1)) {
+ if (spraid_mod64(sg_dma_address(sg), page_size)) {
+ is_prp = false;
+ break;
+ }
+ }
+ }
+
+ return is_prp;
+}
+
+enum {
+ SPRAID_SGL_FMT_DATA_DESC = 0x00,
+ SPRAID_SGL_FMT_SEG_DESC = 0x02,
+ SPRAID_SGL_FMT_LAST_SEG_DESC = 0x03,
+ SPRAID_KEY_SGL_FMT_DATA_DESC = 0x04,
+ SPRAID_TRANSPORT_SGL_DATA_DESC = 0x05
+};
+
+static void spraid_sgl_set_data(struct spraid_sgl_desc *sge,
+ struct scatterlist *sg)
+{
+ sge->addr = cpu_to_le64(sg_dma_address(sg));
+ sge->length = cpu_to_le32(sg_dma_len(sg));
+ sge->type = SPRAID_SGL_FMT_DATA_DESC << 4;
+}
+
+static void spraid_sgl_set_seg(struct spraid_sgl_desc *sge,
+ dma_addr_t dma_addr, int entries)
+{
+ sge->addr = cpu_to_le64(dma_addr);
+ if (entries <= SGES_PER_PAGE) {
+ sge->length = cpu_to_le32(entries * sizeof(*sge));
+ sge->type = SPRAID_SGL_FMT_LAST_SEG_DESC << 4;
+ } else {
+ sge->length = cpu_to_le32(PAGE_SIZE);
+ sge->type = SPRAID_SGL_FMT_SEG_DESC << 4;
+ }
+}
+
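+/*
+ * Build a chained SGL for the command: a single element is embedded in
+ * the command itself; longer lists are allocated from the DMA pools, and
+ * the last descriptor of a full page is converted into a segment
+ * descriptor pointing at the next page.
+ */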
+static int spraid_setup_ioq_cmd_sgl(struct spraid_dev *hdev,
+ struct scsi_cmnd *scmd,
+ struct spraid_ioq_command *ioq_cmd,
+ struct spraid_iod *iod)
+{
+ struct spraid_sgl_desc *sg_list, *link, *old_sg_list;
+ struct scatterlist *sg = scsi_sglist(scmd);
+ void **list = spraid_iod_list(iod);
+ struct dma_pool *pool;
+ int nsge = iod->nsge;
+ dma_addr_t sgl_dma;
+ int i = 0;
+
+ ioq_cmd->common.flags |= SPRAID_CMD_FLAG_SGL_METABUF;
+
+ if (nsge == 1) {
+ spraid_sgl_set_data(&ioq_cmd->common.dptr.sgl, sg);
+ return 0;
+ }
+
+ if (nsge <= (SMALL_POOL_SIZE / sizeof(struct spraid_sgl_desc))) {
+ pool = iod->spraidq->prp_small_pool;
+ iod->npages = 0;
+ } else {
+ pool = hdev->prp_page_pool;
+ iod->npages = 1;
+ }
+
+ sg_list = dma_pool_alloc(pool, GFP_ATOMIC, &sgl_dma);
+ if (!sg_list) {
+ dev_err_ratelimited(hdev->dev,
+ "Allocate first sgl_list failed\n");
+ iod->npages = -1;
+ return -ENOMEM;
+ }
+
+ list[0] = sg_list;
+ iod->first_dma = sgl_dma;
+ spraid_sgl_set_seg(&ioq_cmd->common.dptr.sgl, sgl_dma, nsge);
+ do {
+ if (i == SGES_PER_PAGE) {
+ old_sg_list = sg_list;
+ link = &old_sg_list[SGES_PER_PAGE - 1];
+
+ sg_list = dma_pool_alloc(pool, GFP_ATOMIC, &sgl_dma);
+ if (!sg_list) {
+ dev_err_ratelimited(hdev->dev,
+ "Allocate %dth sgl_list"
+ " failed\n",
+ iod->npages + 1);
+ return -ENOMEM;
+ }
+ list[iod->npages++] = sg_list;
+
+ i = 0;
+ memcpy(&sg_list[i++], link, sizeof(*link));
+ spraid_sgl_set_seg(link, sgl_dma, nsge);
+ }
+
+ spraid_sgl_set_data(&sg_list[i++], sg);
+ sg = sg_next(sg);
+ } while (--nsge > 0);
+
+ return 0;
+}
+
+#define SPRAID_RW_FUA BIT(14)
+
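+/*
+ * Translate the LBA, transfer length and FUA bit out of a 6/10/12/16/32
+ * byte READ or WRITE CDB into the firmware's rw command layout.
+ */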
+static void spraid_setup_rw_cmd(struct spraid_dev *hdev,
+ struct spraid_rw_command *rw,
+ struct scsi_cmnd *scmd)
+{
+ u32 start_lba_lo, start_lba_hi;
+ u32 datalength = 0;
+ u16 control = 0;
+
+ start_lba_lo = 0;
+ start_lba_hi = 0;
+
+ if (scmd->sc_data_direction == DMA_TO_DEVICE) {
+ rw->opcode = SPRAID_CMD_WRITE;
+ } else if (scmd->sc_data_direction == DMA_FROM_DEVICE) {
+ rw->opcode = SPRAID_CMD_READ;
+ } else {
+ dev_err(hdev->dev,
+ "Invalid IO for unsupported data direction: %d\n",
+ scmd->sc_data_direction);
+ WARN_ON(1);
+ }
+
+ /* 6-byte READ(0x08) or WRITE(0x0A) cdb */
+ if (scmd->cmd_len == 6) {
+ datalength = (u32)(scmd->cmnd[4] == 0 ?
+ IO_6_DEFAULT_TX_LEN : scmd->cmnd[4]);
+ start_lba_lo = (u32)get_unaligned_be24(&scmd->cmnd[1]);
+
+ start_lba_lo &= 0x1FFFFF;
+ }
+
+ /* 10-byte READ(0x28) or WRITE(0x2A) cdb */
+ else if (scmd->cmd_len == 10) {
+ datalength = (u32)get_unaligned_be16(&scmd->cmnd[7]);
+ start_lba_lo = get_unaligned_be32(&scmd->cmnd[2]);
+
+ if (scmd->cmnd[1] & FUA_MASK)
+ control |= SPRAID_RW_FUA;
+ }
+
+ /* 12-byte READ(0xA8) or WRITE(0xAA) cdb */
+ else if (scmd->cmd_len == 12) {
+ datalength = get_unaligned_be32(&scmd->cmnd[6]);
+ start_lba_lo = get_unaligned_be32(&scmd->cmnd[2]);
+
+ if (scmd->cmnd[1] & FUA_MASK)
+ control |= SPRAID_RW_FUA;
+ }
+ /* 16-byte READ(0x88) or WRITE(0x8A) cdb */
+ else if (scmd->cmd_len == 16) {
+ datalength = get_unaligned_be32(&scmd->cmnd[10]);
+ start_lba_lo = get_unaligned_be32(&scmd->cmnd[6]);
+ start_lba_hi = get_unaligned_be32(&scmd->cmnd[2]);
+
+ if (scmd->cmnd[1] & FUA_MASK)
+ control |= SPRAID_RW_FUA;
+ }
+ /* 32-byte variable-length READ(32) or WRITE(32) cdb, opcode 0x7F */
+ else if (scmd->cmd_len == 32) {
+ datalength = get_unaligned_be32(&scmd->cmnd[28]);
+ start_lba_lo = get_unaligned_be32(&scmd->cmnd[16]);
+ start_lba_hi = get_unaligned_be32(&scmd->cmnd[12]);
+
+ if (scmd->cmnd[10] & FUA_MASK)
+ control |= SPRAID_RW_FUA;
+ }
+
+ if (unlikely(datalength > U16_MAX || datalength == 0)) {
+ dev_err(hdev->dev,
+ "Invalid IO for illegal transfer data length: %u\n",
+ datalength);
+ WARN_ON(1);
+ }
+
+ rw->slba = cpu_to_le64(((u64)start_lba_hi << 32) | start_lba_lo);
+ /* nlb is zero-based: stores (transfer length - 1) */
+ rw->nlb = cpu_to_le16((u16)(datalength - 1));
+ rw->control = cpu_to_le16(control);
+}
+
+static void spraid_setup_nonio_cmd(struct spraid_dev *hdev,
+ struct spraid_scsi_nonio *scsi_nonio,
+ struct scsi_cmnd *scmd)
+{
+ scsi_nonio->buffer_len = cpu_to_le32(scsi_bufflen(scmd));
+
+ switch (scmd->sc_data_direction) {
+ case DMA_NONE:
+ scsi_nonio->opcode = SPRAID_CMD_NONIO_NONE;
+ break;
+ case DMA_TO_DEVICE:
+ scsi_nonio->opcode = SPRAID_CMD_NONIO_TODEV;
+ break;
+ case DMA_FROM_DEVICE:
+ scsi_nonio->opcode = SPRAID_CMD_NONIO_FROMDEV;
+ break;
+ default:
+ dev_err(hdev->dev,
+ "Invalid IO for unsupported data direction: %d\n",
+ scmd->sc_data_direction);
+ WARN_ON(1);
+ }
+}
+
+static void spraid_setup_ioq_cmd(struct spraid_dev *hdev,
+ struct spraid_ioq_command *ioq_cmd,
+ struct scsi_cmnd *scmd)
+{
+ memcpy(ioq_cmd->common.cdb, scmd->cmnd, scmd->cmd_len);
+ ioq_cmd->common.cdb_len = scmd->cmd_len;
+
+ if (spraid_is_rw_scmd(scmd))
+ spraid_setup_rw_cmd(hdev, &ioq_cmd->rw, scmd);
+ else
+ spraid_setup_nonio_cmd(hdev, &ioq_cmd->scsi_nonio, scmd);
+}
+
+static int spraid_init_iod(struct spraid_dev *hdev, struct spraid_iod *iod,
+ struct spraid_ioq_command *ioq_cmd,
+ struct scsi_cmnd *scmd)
+{
+ if (unlikely(!iod->sense)) {
+ dev_err(hdev->dev, "Allocate sense data buffer failed\n");
+ return -ENOMEM;
+ }
+ ioq_cmd->common.sense_addr = cpu_to_le64(iod->sense_dma);
+ ioq_cmd->common.sense_len = cpu_to_le16(SCSI_SENSE_BUFFERSIZE);
+
+ iod->nsge = 0;
+ iod->npages = -1;
+ iod->use_sgl = 0;
+ iod->sg_drv_mgmt = false;
+ WRITE_ONCE(iod->state, SPRAID_CMD_IDLE);
+
+ return 0;
+}
+
+static void spraid_free_iod_res(struct spraid_dev *hdev, struct spraid_iod *iod)
+{
+ const int last_prp = hdev->page_size / sizeof(__le64) - 1;
+ dma_addr_t dma_addr, next_dma_addr;
+ struct spraid_sgl_desc *sg_list;
+ __le64 *prp_list;
+ void *addr;
+ int i;
+
+ dma_addr = iod->first_dma;
+ if (iod->npages == 0)
+ dma_pool_free(iod->spraidq->prp_small_pool,
+ spraid_iod_list(iod)[0], dma_addr);
+
+ for (i = 0; i < iod->npages; i++) {
+ addr = spraid_iod_list(iod)[i];
+
+ if (iod->use_sgl) {
+ sg_list = addr;
+ next_dma_addr =
+ le64_to_cpu((sg_list[SGES_PER_PAGE - 1]).addr);
+ } else {
+ prp_list = addr;
+ next_dma_addr = le64_to_cpu(prp_list[last_prp]);
+ }
+
+ dma_pool_free(hdev->prp_page_pool, addr, dma_addr);
+ dma_addr = next_dma_addr;
+ }
+
+ if (iod->sg_drv_mgmt && iod->sg != iod->inline_sg) {
+ iod->sg_drv_mgmt = false;
+ mempool_free(iod->sg, hdev->iod_mempool);
+ }
+
+ iod->sense = NULL;
+ iod->npages = -1;
+}
+
+static int spraid_io_map_data(struct spraid_dev *hdev, struct spraid_iod *iod,
+ struct scsi_cmnd *scmd,
+ struct spraid_ioq_command *ioq_cmd)
+{
+ int ret;
+
+ iod->nsge = scsi_dma_map(scmd);
+
+ /* No data to DMA, it may be scsi no-rw command */
+ if (unlikely(iod->nsge == 0))
+ return 0;
+
+ iod->length = scsi_bufflen(scmd);
+ iod->sg = scsi_sglist(scmd);
+ iod->use_sgl = !spraid_is_prp(hdev, scmd, iod->nsge);
+
+ if (iod->use_sgl) {
+ ret = spraid_setup_ioq_cmd_sgl(hdev, scmd, ioq_cmd, iod);
+ } else {
+ ret = spraid_setup_prps(hdev, iod);
+ ioq_cmd->common.dptr.prp1 =
+ cpu_to_le64(sg_dma_address(iod->sg));
+ ioq_cmd->common.dptr.prp2 = cpu_to_le64(iod->first_dma);
+ }
+
+ if (ret)
+ scsi_dma_unmap(scmd);
+
+ return ret;
+}
+
+static void spraid_map_status(struct spraid_iod *iod, struct scsi_cmnd *scmd,
+ struct spraid_completion *cqe)
+{
+ scsi_set_resid(scmd, 0);
+
+ switch ((le16_to_cpu(cqe->status) >> 1) & 0x7f) {
+ case FW_STAT_OK:
+ set_host_byte(scmd, DID_OK);
+ break;
+ case FW_STAT_NEED_CHECK:
+ set_host_byte(scmd, DID_OK);
+ scmd->result |= le16_to_cpu(cqe->status) >> 8;
+ if (scmd->result & SAM_STAT_CHECK_CONDITION) {
+ memset(scmd->sense_buffer, 0, SCSI_SENSE_BUFFERSIZE);
+ memcpy(scmd->sense_buffer, iod->sense,
+ SCSI_SENSE_BUFFERSIZE);
+ scmd->result =
+ (scmd->result & 0x00ffffff) | (DRIVER_SENSE << 24);
+ }
+ break;
+ case FW_STAT_ABORTED:
+ set_host_byte(scmd, DID_ABORT);
+ break;
+ case FW_STAT_NEED_RETRY:
+ set_host_byte(scmd, DID_REQUEUE);
+ break;
+ default:
+ set_host_byte(scmd, DID_BAD_TARGET);
+ break;
+ }
+}
+
+static inline void spraid_get_tag_from_scmd(struct scsi_cmnd *scmd,
+ u16 *qid, u16 *cid)
+{
+ u32 tag = blk_mq_unique_tag(scmd->request);
+
+ *qid = blk_mq_unique_tag_to_hwq(tag) + 1;
+ *cid = blk_mq_unique_tag_to_tag(tag);
+}
+
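+/*
+ * .queuecommand entry: translate the scmd into an ioq command, carve the
+ * per-command sense buffer out of the queue's sense area, map the data
+ * buffer (PRP or SGL) and ring the submission queue doorbell.
+ */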
+static int spraid_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
+{
+ struct spraid_iod *iod = scsi_cmd_priv(scmd);
+ struct spraid_dev *hdev = shost_priv(shost);
+ struct scsi_device *sdev = scmd->device;
+ struct spraid_sdev_hostdata *hostdata;
+ struct spraid_ioq_command ioq_cmd;
+ struct spraid_queue *ioq;
+ unsigned long elapsed;
+ u16 hwq, cid;
+ int ret;
+
+ if (unlikely(!scmd)) {
+ dev_err(hdev->dev, "err, scmd is null\n");
+ return 0;
+ }
+
+ if (unlikely(hdev->state != SPRAID_LIVE)) {
+ set_host_byte(scmd, DID_NO_CONNECT);
+ scmd->scsi_done(scmd);
+ return 0;
+ }
+
+ if (log_debug_switch)
+ scsi_print_command(scmd);
+
+ spraid_get_tag_from_scmd(scmd, &hwq, &cid);
+ hostdata = sdev->hostdata;
+ ioq = &hdev->queues[hwq];
+ memset(&ioq_cmd, 0, sizeof(ioq_cmd));
+ ioq_cmd.rw.hdid = cpu_to_le32(hostdata->hdid);
+ ioq_cmd.rw.command_id = cid;
+
+ spraid_setup_ioq_cmd(hdev, &ioq_cmd, scmd);
+
+ iod->sense = ioq->sense + cid * SCSI_SENSE_BUFFERSIZE;
+ iod->sense_dma = ioq->sense_dma_addr + cid * SCSI_SENSE_BUFFERSIZE;
+
+ ret = spraid_init_iod(hdev, iod, &ioq_cmd, scmd);
+ if (unlikely(ret))
+ return SCSI_MLQUEUE_HOST_BUSY;
+
+ iod->spraidq = ioq;
+ ret = spraid_io_map_data(hdev, iod, scmd, &ioq_cmd);
+ if (unlikely(ret)) {
+ dev_err(hdev->dev, "spraid_io_map_data Err.\n");
+ set_host_byte(scmd, DID_ERROR);
+ scmd->scsi_done(scmd);
+ ret = 0;
+ goto deinit_iod;
+ }
+
+ WRITE_ONCE(iod->state, SPRAID_CMD_IN_FLIGHT);
+ spraid_submit_cmd(ioq, &ioq_cmd);
+ elapsed = jiffies - scmd->jiffies_at_alloc;
+ dev_log_dbg(hdev->dev,
+ "cid[%d] qid[%d] submit IO cost %3ld.%3ld seconds\n",
+ cid, hwq, elapsed / HZ, elapsed % HZ);
+ return 0;
+
+deinit_iod:
+ spraid_free_iod_res(hdev, iod);
+ return ret;
+}
+
+static int spraid_match_dev(struct spraid_dev *hdev, u16 idx,
+ struct scsi_device *sdev)
+{
+ if (SPRAID_DEV_INFO_FLAG_VALID(hdev->devices[idx].flag)) {
+ if (sdev->channel == hdev->devices[idx].channel &&
+ sdev->id == le16_to_cpu(hdev->devices[idx].target) &&
+ sdev->lun < hdev->devices[idx].lun) {
+ dev_info(hdev->dev,
+ "Match device success, channel:"
+ "target:lun[%d:%d:%d]\n",
+ hdev->devices[idx].channel,
+ le16_to_cpu(hdev->devices[idx].target),
+ hdev->devices[idx].lun);
+ return 1;
+ }
+ }
+
+ return 0;
+}
+
+static int spraid_slave_alloc(struct scsi_device *sdev)
+{
+ struct spraid_sdev_hostdata *hostdata;
+ struct spraid_dev *hdev;
+ u16 idx;
+
+ hdev = shost_priv(sdev->host);
+ hostdata = kzalloc(sizeof(*hostdata), GFP_KERNEL);
+ if (!hostdata) {
+ dev_err(hdev->dev, "Alloc scsi host data memory failed\n");
+ return -ENOMEM;
+ }
+
+ down_read(&hdev->devices_rwsem);
+ for (idx = 0; idx < le32_to_cpu(hdev->ctrl_info->nd); idx++) {
+ if (spraid_match_dev(hdev, idx, sdev))
+ goto scan_host;
+ }
+ up_read(&hdev->devices_rwsem);
+
+ kfree(hostdata);
+ return -ENXIO;
+
+scan_host:
+ hostdata->hdid = le32_to_cpu(hdev->devices[idx].hdid);
+ hostdata->max_io_kb = le16_to_cpu(hdev->devices[idx].max_io_kb);
+ hostdata->attr = hdev->devices[idx].attr;
+ hostdata->flag = hdev->devices[idx].flag;
+ hostdata->rg_id = 0xff;
+ sdev->hostdata = hostdata;
+ up_read(&hdev->devices_rwsem);
+ return 0;
+}
+
+static void spraid_slave_destroy(struct scsi_device *sdev)
+{
+ kfree(sdev->hostdata);
+ sdev->hostdata = NULL;
+}
+
+static int spraid_slave_configure(struct scsi_device *sdev)
+{
+ u16 idx;
+ unsigned int timeout = scmd_tmout_nonpt * HZ;
+ struct spraid_dev *hdev = shost_priv(sdev->host);
+ struct spraid_sdev_hostdata *hostdata = sdev->hostdata;
+ u32 max_sec = sdev->host->max_sectors;
+
+ if (hostdata) {
+ idx = hostdata->hdid - 1;
+ if (sdev->channel == hdev->devices[idx].channel &&
+ sdev->id == le16_to_cpu(hdev->devices[idx].target) &&
+ sdev->lun < hdev->devices[idx].lun) {
+ if (SPRAID_DEV_INFO_ATTR_PT(hdev->devices[idx].attr))
+ timeout = 30 * HZ;
+ else
+ timeout = scmd_tmout_nonpt * HZ;
+ max_sec = le16_to_cpu(hdev->devices[idx].max_io_kb)
+ << 1;
+ } else {
+ dev_err(hdev->dev,
+ "[%s] err, sdev->channel:id:lun[%d:%d:%lld],"
+ " devices[%d], channel:target:lun[%d:%d:%d]\n",
+ __func__, sdev->channel, sdev->id, sdev->lun,
+ idx, hdev->devices[idx].channel,
+ le16_to_cpu(hdev->devices[idx].target),
+ hdev->devices[idx].lun);
+ }
+ } else {
+ dev_err(hdev->dev, "[%s] err, sdev->hostdata is null\n",
+ __func__);
+ }
+
+ blk_queue_rq_timeout(sdev->request_queue, timeout);
+ sdev->eh_timeout = timeout;
+
+ if ((max_sec == 0) || (max_sec > sdev->host->max_sectors))
+ max_sec = sdev->host->max_sectors;
+
+ dev_info(hdev->dev,
+ "[%s] sdev->channel:id:lun[%d:%d:%lld],"
+ " scmd_timeout[%d]s, maxsec[%d]\n",
+ __func__, sdev->channel, sdev->id,
+ sdev->lun, timeout / HZ, max_sec);
+
+ return 0;
+}
+
+static void spraid_shost_init(struct spraid_dev *hdev)
+{
+ struct pci_dev *pdev = hdev->pdev;
+ u8 domain, bus;
+ u32 dev_func;
+
+ domain = pci_domain_nr(pdev->bus);
+ bus = pdev->bus->number;
+ dev_func = pdev->devfn;
+
+ hdev->shost->nr_hw_queues = hdev->online_queues - 1;
+ hdev->shost->can_queue = (hdev->ioq_depth - SPRAID_PTCMDS_PERQ);
+
+ hdev->shost->sg_tablesize = le16_to_cpu(hdev->ctrl_info->max_num_sge);
+ /* 512B per sector */
+ hdev->shost->max_sectors =
+ (1U << ((hdev->ctrl_info->mdts) * 1U) << 12) / 512;
+ hdev->shost->cmd_per_lun = MAX_CMD_PER_DEV;
+ hdev->shost->max_channel =
+ le16_to_cpu(hdev->ctrl_info->max_channel) - 1;
+ hdev->shost->max_id = le32_to_cpu(hdev->ctrl_info->max_tgt_id);
+ hdev->shost->max_lun = le16_to_cpu(hdev->ctrl_info->max_lun);
+
+ hdev->shost->this_id = -1;
+ hdev->shost->unique_id = (domain << 16) | (bus << 8) | dev_func;
+ hdev->shost->max_cmd_len = MAX_CDB_LEN;
+ hdev->shost->hostt->cmd_size = max(spraid_cmd_size(hdev, false, true),
+ spraid_cmd_size(hdev, false, false));
+}
+
+static inline void spraid_host_deinit(struct spraid_dev *hdev)
+{
+ ida_free(&spraid_instance_ida, hdev->instance);
+}
+
+static int spraid_alloc_queue(struct spraid_dev *hdev, u16 qid, u16 depth)
+{
+ struct spraid_queue *spraidq = &hdev->queues[qid];
+ int ret = 0;
+
+ if (hdev->queue_count > qid) {
+ dev_info(hdev->dev, "[%s] warn: queue[%d] is exist\n",
+ __func__, qid);
+ return 0;
+ }
+
+ spraidq->cqes = dma_alloc_coherent(hdev->dev, CQ_SIZE(depth),
+ &spraidq->cq_dma_addr,
+ GFP_KERNEL | __GFP_ZERO);
+ if (!spraidq->cqes)
+ return -ENOMEM;
+
+ spraidq->sq_cmds = dma_alloc_coherent(hdev->dev, SQ_SIZE(qid, depth),
+ &spraidq->sq_dma_addr,
+ GFP_KERNEL);
+ if (!spraidq->sq_cmds) {
+ ret = -ENOMEM;
+ goto free_cqes;
+ }
+
+ spin_lock_init(&spraidq->sq_lock);
+ spin_lock_init(&spraidq->cq_lock);
+ spraidq->hdev = hdev;
+ spraidq->q_depth = depth;
+ spraidq->qid = qid;
+ spraidq->cq_vector = -1;
+ hdev->queue_count++;
+
+ /* alloc sense buffer */
+ spraidq->sense = dma_alloc_coherent(hdev->dev, SENSE_SIZE(depth),
+ &spraidq->sense_dma_addr,
+ GFP_KERNEL | __GFP_ZERO);
+ if (!spraidq->sense) {
+ ret = -ENOMEM;
+ goto free_sq_cmds;
+ }
+
+ return 0;
+
+free_sq_cmds:
+ dma_free_coherent(hdev->dev, SQ_SIZE(qid, depth),
+ (void *)spraidq->sq_cmds, spraidq->sq_dma_addr);
+free_cqes:
+ dma_free_coherent(hdev->dev, CQ_SIZE(depth), (void *)spraidq->cqes,
+ spraidq->cq_dma_addr);
+ return ret;
+}
+
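+/*
+ * Poll CSTS.RDY until it matches the requested enable state, bounded by
+ * the timeout advertised in the CAP register.
+ */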
+static int spraid_wait_ready(struct spraid_dev *hdev, u64 cap, bool enabled)
+{
+ unsigned long timeout =
+ ((SPRAID_CAP_TIMEOUT(cap) + 1) * SPRAID_CAP_TIMEOUT_UNIT_MS) + jiffies;
+ u32 bit = enabled ? SPRAID_CSTS_RDY : 0;
+
+ while ((readl(hdev->bar + SPRAID_REG_CSTS) & SPRAID_CSTS_RDY) != bit) {
+ usleep_range(1000, 2000);
+ if (fatal_signal_pending(current))
+ return -EINTR;
+
+ if (time_after(jiffies, timeout)) {
+ dev_err(hdev->dev, "Device not ready; aborting %s\n",
+ enabled ? "initialisation" : "reset");
+ return -ENODEV;
+ }
+ }
+ return 0;
+}
+
+static int spraid_shutdown_ctrl(struct spraid_dev *hdev)
+{
+ unsigned long timeout = hdev->ctrl_info->rtd3e + jiffies;
+
+ hdev->ctrl_config &= ~SPRAID_CC_SHN_MASK;
+ hdev->ctrl_config |= SPRAID_CC_SHN_NORMAL;
+ writel(hdev->ctrl_config, hdev->bar + SPRAID_REG_CC);
+
+ while ((readl(hdev->bar + SPRAID_REG_CSTS) & SPRAID_CSTS_SHST_MASK) !=
+ SPRAID_CSTS_SHST_CMPLT) {
+ msleep(100);
+ if (fatal_signal_pending(current))
+ return -EINTR;
+ if (time_after(jiffies, timeout)) {
+ dev_err(hdev->dev,
+ "Device shutdown incomplete; abort shutdown\n");
+ return -ENODEV;
+ }
+ }
+ return 0;
+}
+
+static int spraid_disable_ctrl(struct spraid_dev *hdev)
+{
+ hdev->ctrl_config &= ~SPRAID_CC_SHN_MASK;
+ hdev->ctrl_config &= ~SPRAID_CC_ENABLE;
+ writel(hdev->ctrl_config, hdev->bar + SPRAID_REG_CC);
+
+ return spraid_wait_ready(hdev, hdev->cap, false);
+}
+
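+/*
+ * Negotiate the memory page size against CAP.MPSMIN/MPSMAX, program CC
+ * and wait for the controller to report ready.
+ */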
+static int spraid_enable_ctrl(struct spraid_dev *hdev)
+{
+ u64 cap = hdev->cap;
+ u32 dev_page_min = SPRAID_CAP_MPSMIN(cap) + 12;
+ u32 page_shift = PAGE_SHIFT;
+
+ if (page_shift < dev_page_min) {
+ dev_err(hdev->dev,
+ "Minimum device page size[%u], too large for host[%u]\n",
+ 1U << dev_page_min, 1U << page_shift);
+ return -ENODEV;
+ }
+
+ page_shift = min_t(unsigned int, SPRAID_CAP_MPSMAX(cap) + 12,
+ PAGE_SHIFT);
+ hdev->page_size = 1U << page_shift;
+
+ hdev->ctrl_config = SPRAID_CC_CSS_NVM;
+ hdev->ctrl_config |= (page_shift - 12) << SPRAID_CC_MPS_SHIFT;
+ hdev->ctrl_config |= SPRAID_CC_AMS_RR | SPRAID_CC_SHN_NONE;
+ hdev->ctrl_config |= SPRAID_CC_IOSQES | SPRAID_CC_IOCQES;
+ hdev->ctrl_config |= SPRAID_CC_ENABLE;
+ writel(hdev->ctrl_config, hdev->bar + SPRAID_REG_CC);
+
+ return spraid_wait_ready(hdev, cap, true);
+}
+
+static void spraid_init_queue(struct spraid_queue *spraidq, u16 qid)
+{
+ struct spraid_dev *hdev = spraidq->hdev;
+
+ memset((void *)spraidq->cqes, 0, CQ_SIZE(spraidq->q_depth));
+
+ spraidq->sq_tail = 0;
+ spraidq->cq_head = 0;
+ spraidq->cq_phase = 1;
+ spraidq->q_db = &hdev->dbs[qid * 2 * hdev->db_stride];
+ spraidq->prp_small_pool = hdev->prp_small_pool[qid % small_pool_num];
+ hdev->online_queues++;
+}
+
+static inline bool spraid_cqe_pending(struct spraid_queue *spraidq)
+{
+ return (le16_to_cpu(spraidq->cqes[spraidq->cq_head].status) & 1) ==
+ spraidq->cq_phase;
+}
+
+static void spraid_sata_report_zone_handle(struct scsi_cmnd *scmd,
+ struct spraid_iod *iod)
+{
+ int i = 0;
+ unsigned int bytes = 0;
+ struct scatterlist *sg = scsi_sglist(scmd);
+
+ scsi_for_each_sg(scmd, sg, iod->nsge, i) {
+ unsigned int offset = 0;
+
+ if (bytes == 0) {
+ char *hdr;
+ u32 list_length;
+ u64 max_lba, opt_lba;
+ u16 same;
+
+ hdr = sg_virt(sg);
+
+ list_length = get_unaligned_le32(&hdr[0]);
+ same = get_unaligned_le16(&hdr[4]);
+ max_lba = get_unaligned_le64(&hdr[8]);
+ opt_lba = get_unaligned_le64(&hdr[16]);
+ put_unaligned_be32(list_length, &hdr[0]);
+ hdr[4] = same & 0xf;
+ put_unaligned_be64(max_lba, &hdr[8]);
+ put_unaligned_be64(opt_lba, &hdr[16]);
+ offset += 64;
+ bytes += 64;
+ }
+ while (offset < sg_dma_len(sg)) {
+ char *rec;
+ u8 cond, type, non_seq, reset;
+ u64 size, start, wp;
+
+ rec = sg_virt(sg) + offset;
+ type = rec[0] & 0xf;
+ cond = (rec[1] >> 4) & 0xf;
+ non_seq = (rec[1] & 2);
+ reset = (rec[1] & 1);
+ size = get_unaligned_le64(&rec[8]);
+ start = get_unaligned_le64(&rec[16]);
+ wp = get_unaligned_le64(&rec[24]);
+ rec[0] = type;
+ rec[1] = (cond << 4) | non_seq | reset;
+ put_unaligned_be64(size, &rec[8]);
+ put_unaligned_be64(start, &rec[16]);
+ put_unaligned_be64(wp, &rec[24]);
+ WARN_ON(offset + 64 > sg_dma_len(sg));
+ offset += 64;
+ bytes += 64;
+ }
+ }
+}
+
+static inline void spraid_handle_ata_cmd(struct spraid_dev *hdev,
+ struct scsi_cmnd *scmd,
+ struct spraid_iod *iod)
+{
+ if (hdev->ctrl_info->card_type != SPRAID_CARD_HBA)
+ return;
+
+ switch (scmd->cmnd[0]) {
+ case ZBC_IN:
+ dev_info(hdev->dev, "[%s] process report zone\n", __func__);
+ spraid_sata_report_zone_handle(scmd, iod);
+ break;
+ default:
+ break;
+ }
+}
+
+static void spraid_complete_ioq_cmnd(struct spraid_queue *ioq,
+ struct spraid_completion *cqe)
+{
+ struct spraid_dev *hdev = ioq->hdev;
+ struct blk_mq_tags *tags;
+ struct scsi_cmnd *scmd;
+ struct spraid_iod *iod;
+ struct request *req;
+ unsigned long elapsed;
+
+ tags = hdev->shost->tag_set.tags[ioq->qid - 1];
+ req = blk_mq_tag_to_rq(tags, cqe->cmd_id);
+ if (unlikely(!req || !blk_mq_request_started(req))) {
+ dev_warn(hdev->dev, "Invalid id %d completed on queue %d\n",
+ cqe->cmd_id, ioq->qid);
+ return;
+ }
+
+ scmd = blk_mq_rq_to_pdu(req);
+ iod = scsi_cmd_priv(scmd);
+
+ elapsed = jiffies - scmd->jiffies_at_alloc;
+ dev_log_dbg(hdev->dev,
+ "cid[%d] qid[%d] finish IO cost %3ld.%3ld seconds\n",
+ cqe->cmd_id, ioq->qid, elapsed / HZ, elapsed % HZ);
+
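+ /*
+ * Losing this cmpxchg means the timeout handler already claimed
+ * the command; just drop the DMA mapping and iod resources and
+ * let the abnormal path complete the scmd.
+ */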
+ if (cmpxchg(&iod->state, SPRAID_CMD_IN_FLIGHT, SPRAID_CMD_COMPLETE) !=
+ SPRAID_CMD_IN_FLIGHT) {
+ dev_warn(hdev->dev,
+ "cid[%d] qid[%d] enters abnormal handler,"
+ " cost %3ld.%3ld seconds\n",
+ cqe->cmd_id, ioq->qid, elapsed / HZ, elapsed % HZ);
+ WRITE_ONCE(iod->state, SPRAID_CMD_TMO_COMPLETE);
+
+ if (iod->nsge) {
+ iod->nsge = 0;
+ scsi_dma_unmap(scmd);
+ }
+ spraid_free_iod_res(hdev, iod);
+
+ return;
+ }
+
+ spraid_handle_ata_cmd(hdev, scmd, iod);
+
+ spraid_map_status(iod, scmd, cqe);
+ if (iod->nsge) {
+ iod->nsge = 0;
+ scsi_dma_unmap(scmd);
+ }
+ spraid_free_iod_res(hdev, iod);
+ scmd->scsi_done(scmd);
+}
+
+static void spraid_complete_adminq_cmnd(struct spraid_queue *adminq,
+ struct spraid_completion *cqe)
+{
+ struct spraid_dev *hdev = adminq->hdev;
+ struct spraid_cmd *adm_cmd;
+
+ adm_cmd = hdev->adm_cmds + cqe->cmd_id;
+ if (unlikely(adm_cmd->state == SPRAID_CMD_IDLE)) {
+ dev_warn(adminq->hdev->dev,
+ "Invalid id %d completed on queue %d\n",
+ cqe->cmd_id, le16_to_cpu(cqe->sq_id));
+ return;
+ }
+
+ adm_cmd->status = le16_to_cpu(cqe->status) >> 1;
+ adm_cmd->result0 = le32_to_cpu(cqe->result);
+ adm_cmd->result1 = le32_to_cpu(cqe->result1);
+
+ complete(&adm_cmd->cmd_done);
+}
+
+static void spraid_send_aen(struct spraid_dev *hdev, u16 cid);
+
+static void spraid_complete_aen(struct spraid_queue *spraidq,
+ struct spraid_completion *cqe)
+{
+ struct spraid_dev *hdev = spraidq->hdev;
+ u32 result = le32_to_cpu(cqe->result);
+
+ dev_info(hdev->dev, "rcv aen, cid[%d], status[0x%x], result[0x%x]\n",
+ cqe->cmd_id, le16_to_cpu(cqe->status) >> 1, result);
+
+ spraid_send_aen(hdev, cqe->cmd_id);
+
+ if ((le16_to_cpu(cqe->status) >> 1) != SPRAID_SC_SUCCESS)
+ return;
+ switch (result & 0x7) {
+ case SPRAID_AEN_NOTICE:
+ spraid_handle_aen_notice(hdev, result);
+ break;
+ case SPRAID_AEN_VS:
+ spraid_handle_aen_vs(hdev, result, le32_to_cpu(cqe->result1));
+ break;
+ default:
+ dev_warn(hdev->dev, "Unsupported async event type: %u\n",
+ result & 0x7);
+ break;
+ }
+}
+
+static void spraid_complete_ioq_sync_cmnd(struct spraid_queue *ioq,
+ struct spraid_completion *cqe)
+{
+ struct spraid_dev *hdev = ioq->hdev;
+ struct spraid_cmd *ptcmd;
+
+ ptcmd = hdev->ioq_ptcmds + (ioq->qid - 1) * SPRAID_PTCMDS_PERQ +
+ cqe->cmd_id - SPRAID_IO_BLK_MQ_DEPTH;
+
+ ptcmd->status = le16_to_cpu(cqe->status) >> 1;
+ ptcmd->result0 = le32_to_cpu(cqe->result);
+ ptcmd->result1 = le32_to_cpu(cqe->result1);
+
+ complete(&ptcmd->cmd_done);
+}
+
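+/*
+ * Dispatch one CQE by command id: ids at or above the blk-mq queue depth
+ * belong to driver-reserved slots (AEN events on the admin queue, sync
+ * pass-through commands on IO queues); the rest are normal admin or
+ * scsi-mq completions.
+ */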
+static inline void spraid_handle_cqe(struct spraid_queue *spraidq, u16 idx)
+{
+ struct spraid_completion *cqe = &spraidq->cqes[idx];
+ struct spraid_dev *hdev = spraidq->hdev;
+
+ if (unlikely(cqe->cmd_id >= spraidq->q_depth)) {
+ dev_err(hdev->dev,
+ "Invalid command id[%d] completed on queue %d\n",
+ cqe->cmd_id, le16_to_cpu(cqe->sq_id));
+ return;
+ }
+
+ dev_log_dbg(hdev->dev, "cid[%d] qid[%d],"
+ " result[0x%x], sq_id[%d], status[0x%x]\n",
+ cqe->cmd_id, spraidq->qid, le32_to_cpu(cqe->result),
+ le16_to_cpu(cqe->sq_id), le16_to_cpu(cqe->status));
+
+ if (unlikely(spraidq->qid == 0 &&
+ cqe->cmd_id >= SPRAID_AQ_BLK_MQ_DEPTH)) {
+ spraid_complete_aen(spraidq, cqe);
+ return;
+ }
+
+ if (unlikely(spraidq->qid && cqe->cmd_id >= SPRAID_IO_BLK_MQ_DEPTH)) {
+ spraid_complete_ioq_sync_cmnd(spraidq, cqe);
+ return;
+ }
+
+ if (spraidq->qid)
+ spraid_complete_ioq_cmnd(spraidq, cqe);
+ else
+ spraid_complete_adminq_cmnd(spraidq, cqe);
+}
+
+static void spraid_complete_cqes(struct spraid_queue *spraidq,
+ u16 start, u16 end)
+{
+ while (start != end) {
+ spraid_handle_cqe(spraidq, start);
+ if (++start == spraidq->q_depth)
+ start = 0;
+ }
+}
+
+static inline void spraid_update_cq_head(struct spraid_queue *spraidq)
+{
+ if (++spraidq->cq_head == spraidq->q_depth) {
+ spraidq->cq_head = 0;
+ spraidq->cq_phase = !spraidq->cq_phase;
+ }
+}
+
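+/*
+ * Reap CQEs while the phase bit matches the queue's current phase,
+ * stopping once @tag is found, and write the new head back to the
+ * completion queue doorbell.
+ */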
+static inline bool spraid_process_cq(struct spraid_queue *spraidq,
+ u16 *start, u16 *end, int tag)
+{
+ bool found = false;
+
+ *start = spraidq->cq_head;
+ while (!found && spraid_cqe_pending(spraidq)) {
+ if (spraidq->cqes[spraidq->cq_head].cmd_id == tag)
+ found = true;
+ spraid_update_cq_head(spraidq);
+ }
+ *end = spraidq->cq_head;
+
+ if (*start != *end)
+ writel(spraidq->cq_head,
+ spraidq->q_db + spraidq->hdev->db_stride);
+
+ return found;
+}
+
+static bool spraid_poll_cq(struct spraid_queue *spraidq, int cid)
+{
+ u16 start, end;
+ bool found;
+
+ if (!spraid_cqe_pending(spraidq))
+ return 0;
+
+ spin_lock_irq(&spraidq->cq_lock);
+ found = spraid_process_cq(spraidq, &start, &end, cid);
+ spin_unlock_irq(&spraidq->cq_lock);
+
+ spraid_complete_cqes(spraidq, start, end);
+ return found;
+}
+
+static irqreturn_t spraid_irq(int irq, void *data)
+{
+ struct spraid_queue *spraidq = data;
+ irqreturn_t ret = IRQ_NONE;
+ u16 start, end;
+
+ spin_lock(&spraidq->cq_lock);
+ if (spraidq->cq_head != spraidq->last_cq_head)
+ ret = IRQ_HANDLED;
+
+ spraid_process_cq(spraidq, &start, &end, -1);
+ spraidq->last_cq_head = spraidq->cq_head;
+ spin_unlock(&spraidq->cq_lock);
+
+ if (start != end) {
+ spraid_complete_cqes(spraidq, start, end);
+ ret = IRQ_HANDLED;
+ }
+ return ret;
+}
+
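+/*
+ * Bring up the admin queue: disable the controller, allocate the queue
+ * pair, program AQA/ASQ/ACQ, re-enable the controller and hook up the
+ * admin interrupt.
+ */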
+static int spraid_setup_admin_queue(struct spraid_dev *hdev)
+{
+ struct spraid_queue *adminq = &hdev->queues[0];
+ u32 aqa;
+ int ret;
+
+ dev_info(hdev->dev, "[%s] start disable ctrl\n", __func__);
+
+ ret = spraid_disable_ctrl(hdev);
+ if (ret)
+ return ret;
+
+ ret = spraid_alloc_queue(hdev, 0, SPRAID_AQ_DEPTH);
+ if (ret)
+ return ret;
+
+ aqa = adminq->q_depth - 1;
+ aqa |= aqa << 16;
+ writel(aqa, hdev->bar + SPRAID_REG_AQA);
+ lo_hi_writeq(adminq->sq_dma_addr, hdev->bar + SPRAID_REG_ASQ);
+ lo_hi_writeq(adminq->cq_dma_addr, hdev->bar + SPRAID_REG_ACQ);
+
+ dev_info(hdev->dev, "[%s] start enable ctrl\n", __func__);
+
+ ret = spraid_enable_ctrl(hdev);
+ if (ret) {
+ ret = -ENODEV;
+ goto free_queue;
+ }
+
+ adminq->cq_vector = 0;
+ spraid_init_queue(adminq, 0);
+ ret = pci_request_irq(hdev->pdev, adminq->cq_vector, spraid_irq, NULL,
+ adminq, "spraid%d_q%d",
+ hdev->instance, adminq->qid);
+
+ if (ret) {
+ adminq->cq_vector = -1;
+ hdev->online_queues--;
+ goto free_queue;
+ }
+
+ dev_info(hdev->dev, "[%s] success, queuecount:[%d], onlinequeue:[%d]\n",
+ __func__, hdev->queue_count, hdev->online_queues);
+
+ return 0;
+
+free_queue:
+ spraid_free_queue(adminq);
+ return ret;
+}
+
+static u32 spraid_bar_size(struct spraid_dev *hdev, u32 nr_ioqs)
+{
+ return (SPRAID_REG_DBS + ((nr_ioqs + 1) * 8 * hdev->db_stride));
+}
+
+static int spraid_alloc_admin_cmds(struct spraid_dev *hdev)
+{
+ int i;
+
+ INIT_LIST_HEAD(&hdev->adm_cmd_list);
+ spin_lock_init(&hdev->adm_cmd_lock);
+
+ hdev->adm_cmds = kcalloc_node(SPRAID_AQ_BLK_MQ_DEPTH,
+ sizeof(struct spraid_cmd),
+ GFP_KERNEL, hdev->numa_node);
+
+ if (!hdev->adm_cmds) {
+ dev_err(hdev->dev, "Alloc admin cmds failed\n");
+ return -ENOMEM;
+ }
+
+ for (i = 0; i < SPRAID_AQ_BLK_MQ_DEPTH; i++) {
+ hdev->adm_cmds[i].qid = 0;
+ hdev->adm_cmds[i].cid = i;
+ list_add_tail(&(hdev->adm_cmds[i].list), &hdev->adm_cmd_list);
+ }
+
+ dev_info(hdev->dev, "Alloc admin cmds success, num[%d]\n",
+ SPRAID_AQ_BLK_MQ_DEPTH);
+
+ return 0;
+}
+
+static void spraid_free_admin_cmds(struct spraid_dev *hdev)
+{
+ kfree(hdev->adm_cmds);
+ hdev->adm_cmds = NULL;
+ INIT_LIST_HEAD(&hdev->adm_cmd_list);
+}
+
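+/*
+ * Reserved admin / IO pass-through commands live on free lists; taking
+ * one marks it in-flight, and spraid_put_cmd() below returns it.
+ */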
+static struct spraid_cmd *spraid_get_cmd(struct spraid_dev *hdev,
+ enum spraid_cmd_type type)
+{
+ struct spraid_cmd *cmd = NULL;
+ unsigned long flags;
+ struct list_head *head = &hdev->adm_cmd_list;
+ spinlock_t *slock = &hdev->adm_cmd_lock;
+
+ if (type == SPRAID_CMD_IOPT) {
+ head = &hdev->ioq_pt_list;
+ slock = &hdev->ioq_pt_lock;
+ }
+
+ spin_lock_irqsave(slock, flags);
+ if (list_empty(head)) {
+ spin_unlock_irqrestore(slock, flags);
+ dev_err(hdev->dev, "err, cmd[%d] list empty\n", type);
+ return NULL;
+ }
+ cmd = list_entry(head->next, struct spraid_cmd, list);
+ list_del_init(&cmd->list);
+ spin_unlock_irqrestore(slock, flags);
+
+ WRITE_ONCE(cmd->state, SPRAID_CMD_IN_FLIGHT);
+
+ return cmd;
+}
+
+static void spraid_put_cmd(struct spraid_dev *hdev, struct spraid_cmd *cmd,
+ enum spraid_cmd_type type)
+{
+ unsigned long flags;
+ struct list_head *head = &hdev->adm_cmd_list;
+ spinlock_t *slock = &hdev->adm_cmd_lock;
+
+ if (type == SPRAID_CMD_IOPT) {
+ head = &hdev->ioq_pt_list;
+ slock = &hdev->ioq_pt_lock;
+ }
+
+ spin_lock_irqsave(slock, flags);
+ WRITE_ONCE(cmd->state, SPRAID_CMD_IDLE);
+ list_add_tail(&cmd->list, head);
+ spin_unlock_irqrestore(slock, flags);
+}
+
+
+static int spraid_submit_admin_sync_cmd(struct spraid_dev *hdev,
+ struct spraid_admin_command *cmd,
+ u32 *result0, u32 *result1, u32 timeout)
+{
+ struct spraid_cmd *adm_cmd = spraid_get_cmd(hdev, SPRAID_CMD_ADM);
+
+ if (!adm_cmd) {
+ dev_err(hdev->dev, "err, get admin cmd failed\n");
+ return -EFAULT;
+ }
+
+ timeout = timeout ? timeout : ADMIN_TIMEOUT;
+
+ init_completion(&adm_cmd->cmd_done);
+
+ cmd->common.command_id = adm_cmd->cid;
+ spraid_submit_cmd(&hdev->queues[0], cmd);
+
+ if (!wait_for_completion_timeout(&adm_cmd->cmd_done, timeout)) {
+ dev_err(hdev->dev, "[%s] cid[%d] qid[%d] timeout,"
+ " opcode[0x%x] subopcode[0x%x]\n",
+ __func__, adm_cmd->cid, adm_cmd->qid,
+ cmd->usr_cmd.opcode, cmd->usr_cmd.info_0.subopcode);
+ WRITE_ONCE(adm_cmd->state, SPRAID_CMD_TIMEOUT);
+ spraid_put_cmd(hdev, adm_cmd, SPRAID_CMD_ADM);
+ return -EINVAL;
+ }
+
+ if (result0)
+ *result0 = adm_cmd->result0;
+ if (result1)
+ *result1 = adm_cmd->result1;
+
+ spraid_put_cmd(hdev, adm_cmd, SPRAID_CMD_ADM);
+
+ return adm_cmd->status;
+}
+
+static int spraid_create_cq(struct spraid_dev *hdev, u16 qid,
+ struct spraid_queue *spraidq, u16 cq_vector)
+{
+ struct spraid_admin_command admin_cmd;
+ int flags = SPRAID_QUEUE_PHYS_CONTIG | SPRAID_CQ_IRQ_ENABLED;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.create_cq.opcode = SPRAID_ADMIN_CREATE_CQ;
+ admin_cmd.create_cq.prp1 = cpu_to_le64(spraidq->cq_dma_addr);
+ admin_cmd.create_cq.cqid = cpu_to_le16(qid);
+ admin_cmd.create_cq.qsize = cpu_to_le16(spraidq->q_depth - 1);
+ admin_cmd.create_cq.cq_flags = cpu_to_le16(flags);
+ admin_cmd.create_cq.irq_vector = cpu_to_le16(cq_vector);
+
+ return spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+}
+
+static int spraid_create_sq(struct spraid_dev *hdev, u16 qid,
+ struct spraid_queue *spraidq)
+{
+ struct spraid_admin_command admin_cmd;
+ int flags = SPRAID_QUEUE_PHYS_CONTIG;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.create_sq.opcode = SPRAID_ADMIN_CREATE_SQ;
+ admin_cmd.create_sq.prp1 = cpu_to_le64(spraidq->sq_dma_addr);
+ admin_cmd.create_sq.sqid = cpu_to_le16(qid);
+ admin_cmd.create_sq.qsize = cpu_to_le16(spraidq->q_depth - 1);
+ admin_cmd.create_sq.sq_flags = cpu_to_le16(flags);
+ admin_cmd.create_sq.cqid = cpu_to_le16(qid);
+
+ return spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+}
+
+static void spraid_free_queue(struct spraid_queue *spraidq)
+{
+ struct spraid_dev *hdev = spraidq->hdev;
+
+ hdev->queue_count--;
+ dma_free_coherent(hdev->dev, CQ_SIZE(spraidq->q_depth),
+ (void *)spraidq->cqes, spraidq->cq_dma_addr);
+ dma_free_coherent(hdev->dev, SQ_SIZE(spraidq->qid, spraidq->q_depth),
+ spraidq->sq_cmds, spraidq->sq_dma_addr);
+ dma_free_coherent(hdev->dev, SENSE_SIZE(spraidq->q_depth),
+ spraidq->sense, spraidq->sense_dma_addr);
+}
+
+static void spraid_free_admin_queue(struct spraid_dev *hdev)
+{
+ spraid_free_queue(&hdev->queues[0]);
+}
+
+static void spraid_free_io_queues(struct spraid_dev *hdev)
+{
+ int i;
+
+ for (i = hdev->queue_count - 1; i >= 1; i--)
+ spraid_free_queue(&hdev->queues[i]);
+}
+
+static int spraid_delete_queue(struct spraid_dev *hdev, u8 op, u16 id)
+{
+ struct spraid_admin_command admin_cmd;
+ int ret;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.delete_queue.opcode = op;
+ admin_cmd.delete_queue.qid = cpu_to_le16(id);
+
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+
+ if (ret)
+ dev_err(hdev->dev, "Delete %s:[%d] failed\n",
+ (op == SPRAID_ADMIN_DELETE_CQ) ? "cq" : "sq", id);
+
+ return ret;
+}
+
+static int spraid_delete_cq(struct spraid_dev *hdev, u16 cqid)
+{
+ return spraid_delete_queue(hdev, SPRAID_ADMIN_DELETE_CQ, cqid);
+}
+
+static int spraid_delete_sq(struct spraid_dev *hdev, u16 sqid)
+{
+ return spraid_delete_queue(hdev, SPRAID_ADMIN_DELETE_SQ, sqid);
+}
+
+static int spraid_create_queue(struct spraid_queue *spraidq, u16 qid)
+{
+ struct spraid_dev *hdev = spraidq->hdev;
+ u16 cq_vector;
+ int ret;
+
+ cq_vector = (hdev->num_vecs == 1) ? 0 : qid;
+ ret = spraid_create_cq(hdev, qid, spraidq, cq_vector);
+ if (ret)
+ return ret;
+
+ ret = spraid_create_sq(hdev, qid, spraidq);
+ if (ret)
+ goto delete_cq;
+
+ spraid_init_queue(spraidq, qid);
+ spraidq->cq_vector = cq_vector;
+
+ ret = pci_request_irq(hdev->pdev, cq_vector, spraid_irq, NULL,
+ spraidq, "spraid%d_q%d", hdev->instance, qid);
+
+ if (ret) {
+ dev_err(hdev->dev, "Request queue[%d] irq failed\n", qid);
+ goto delete_sq;
+ }
+
+ return 0;
+
+delete_sq:
+ spraidq->cq_vector = -1;
+ hdev->online_queues--;
+ spraid_delete_sq(hdev, qid);
+delete_cq:
+ spraid_delete_cq(hdev, qid);
+
+ return ret;
+}
+
+static int spraid_create_io_queues(struct spraid_dev *hdev)
+{
+ u32 i, max;
+ int ret = 0;
+
+ max = min(hdev->max_qid, hdev->queue_count - 1);
+ for (i = hdev->online_queues; i <= max; i++) {
+ ret = spraid_create_queue(&hdev->queues[i], i);
+ if (ret) {
+ dev_err(hdev->dev, "Create queue[%d] failed\n", i);
+ break;
+ }
+ }
+
+ dev_info(hdev->dev, "[%s] queue_count[%d], online_queue[%d]\n",
+ __func__, hdev->queue_count, hdev->online_queues);
+
+ return ret >= 0 ? 0 : ret;
+}
+
+static int spraid_set_features(struct spraid_dev *hdev, u32 fid,
+ u32 dword11, void *buffer,
+ size_t buflen, u32 *result)
+{
+ struct spraid_admin_command admin_cmd;
+ int ret;
+ u8 *data_ptr = NULL;
+ dma_addr_t data_dma = 0;
+
+ if (buffer && buflen) {
+ data_ptr = dma_alloc_coherent(hdev->dev, buflen,
+ &data_dma, GFP_KERNEL);
+ if (!data_ptr)
+ return -ENOMEM;
+
+ memcpy(data_ptr, buffer, buflen);
+ }
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.features.opcode = SPRAID_ADMIN_SET_FEATURES;
+ admin_cmd.features.fid = cpu_to_le32(fid);
+ admin_cmd.features.dword11 = cpu_to_le32(dword11);
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
+
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, result, NULL, 0);
+
+ if (data_ptr)
+ dma_free_coherent(hdev->dev, buflen, data_ptr, data_dma);
+
+ return ret;
+}
+
+static int spraid_configure_timestamp(struct spraid_dev *hdev)
+{
+ __le64 ts;
+ int ret;
+
+ ts = cpu_to_le64(ktime_to_ms(ktime_get_real()));
+ ret = spraid_set_features(hdev, SPRAID_FEAT_TIMESTAMP,
+ 0, &ts, sizeof(ts), NULL);
+
+ if (ret)
+ dev_err(hdev->dev, "set timestamp failed: %d\n", ret);
+ return ret;
+}
+
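+/*
+ * Ask the firmware for *cnt IO queue pairs via the Number of Queues
+ * feature (zero-based SQ/CQ counts packed into dword11) and clamp *cnt
+ * to what was actually granted.
+ */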
+static int spraid_set_queue_cnt(struct spraid_dev *hdev, u32 *cnt)
+{
+ u32 q_cnt = (*cnt - 1) | ((*cnt - 1) << 16);
+ u32 nr_ioqs, result;
+ int status;
+
+ status = spraid_set_features(hdev, SPRAID_FEAT_NUM_QUEUES,
+ q_cnt, NULL, 0, &result);
+ if (status) {
+ dev_err(hdev->dev, "Set queue count failed, status: %d\n",
+ status);
+ return -EIO;
+ }
+
+ nr_ioqs = min(result & 0xffff, result >> 16) + 1;
+ *cnt = min(*cnt, nr_ioqs);
+ if (*cnt == 0) {
+ dev_err(hdev->dev, "Illegal queue count: zero\n");
+ return -EIO;
+ }
+ return 0;
+}
+
+static int spraid_setup_io_queues(struct spraid_dev *hdev)
+{
+ struct spraid_queue *adminq = &hdev->queues[0];
+ struct pci_dev *pdev = hdev->pdev;
+ u32 nr_ioqs = num_online_cpus();
+ u32 i, size;
+ int ret;
+
+ struct irq_affinity affd = {
+ .pre_vectors = 1
+ };
+
+ ret = spraid_set_queue_cnt(hdev, &nr_ioqs);
+ if (ret < 0)
+ return ret;
+
+ size = spraid_bar_size(hdev, nr_ioqs);
+ ret = spraid_remap_bar(hdev, size);
+ if (ret)
+ return -ENOMEM;
+
+ adminq->q_db = hdev->dbs;
+
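+ /* release the admin vector so the full MSI-X range can be re-allocated with affinity */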
+ pci_free_irq(pdev, 0, adminq);
+ pci_free_irq_vectors(pdev);
+
+ ret = pci_alloc_irq_vectors_affinity(pdev, 1, (nr_ioqs + 1),
+ PCI_IRQ_ALL_TYPES | PCI_IRQ_AFFINITY, &affd);
+ if (ret <= 0)
+ return -EIO;
+
+ hdev->num_vecs = ret;
+
+ hdev->max_qid = max(ret - 1, 1);
+
+ ret = pci_request_irq(pdev, adminq->cq_vector, spraid_irq, NULL,
+ adminq, "spraid%d_q%d",
+ hdev->instance, adminq->qid);
+ if (ret) {
+ dev_err(hdev->dev, "Request admin irq failed\n");
+ adminq->cq_vector = -1;
+ return ret;
+ }
+
+ for (i = hdev->queue_count; i <= hdev->max_qid; i++) {
+ ret = spraid_alloc_queue(hdev, i, hdev->ioq_depth);
+ if (ret)
+ break;
+ }
+ dev_info(hdev->dev, "[%s] max_qid: %d, queue_count: %d;"
+ " online_queue: %d, ioq_depth: %d\n",
+ __func__, hdev->max_qid, hdev->queue_count,
+ hdev->online_queues, hdev->ioq_depth);
+
+ return spraid_create_io_queues(hdev);
+}
+
+static void spraid_delete_io_queues(struct spraid_dev *hdev)
+{
+ u16 queues = hdev->online_queues - 1;
+ u8 opcode = SPRAID_ADMIN_DELETE_SQ;
+ u16 i, pass;
+
+ if (!pci_device_is_present(hdev->pdev)) {
+ dev_err(hdev->dev,
+ "pci_device is not present, skip disable io queues\n");
+ return;
+ }
+
+ if (hdev->online_queues < 2) {
+ dev_err(hdev->dev, "[%s] err, io queue has been delete\n",
+ __func__);
+ return;
+ }
+
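+ /* first pass deletes every SQ, second pass every CQ */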
+ for (pass = 0; pass < 2; pass++) {
+ for (i = queues; i > 0; i--)
+ if (spraid_delete_queue(hdev, opcode, i))
+ break;
+
+ opcode = SPRAID_ADMIN_DELETE_CQ;
+ }
+}
+
+static void spraid_remove_io_queues(struct spraid_dev *hdev)
+{
+ spraid_delete_io_queues(hdev);
+ spraid_free_io_queues(hdev);
+}
+
+static void spraid_pci_disable(struct spraid_dev *hdev)
+{
+ struct pci_dev *pdev = hdev->pdev;
+ u32 i;
+
+ for (i = 0; i < hdev->online_queues; i++)
+ pci_free_irq(pdev, hdev->queues[i].cq_vector, &hdev->queues[i]);
+ pci_free_irq_vectors(pdev);
+ if (pci_is_enabled(pdev)) {
+ pci_disable_pcie_error_reporting(pdev);
+ pci_disable_device(pdev);
+ }
+ hdev->online_queues = 0;
+}
+
+static void spraid_disable_admin_queue(struct spraid_dev *hdev, bool shutdown)
+{
+ struct spraid_queue *adminq = &hdev->queues[0];
+ u16 start, end;
+
+ if (pci_device_is_present(hdev->pdev)) {
+ if (shutdown)
+ spraid_shutdown_ctrl(hdev);
+ else
+ spraid_disable_ctrl(hdev);
+ }
+
+ if (hdev->queue_count == 0) {
+ dev_err(hdev->dev, "[%s] err, admin queue has been delete\n",
+ __func__);
+ return;
+ }
+
+ spin_lock_irq(&adminq->cq_lock);
+ spraid_process_cq(adminq, &start, &end, -1);
+ spin_unlock_irq(&adminq->cq_lock);
+
+ spraid_complete_cqes(adminq, start, end);
+ spraid_free_admin_queue(hdev);
+}
+
+static int spraid_create_dma_pools(struct spraid_dev *hdev)
+{
+ int i;
+ char poolname[20] = { 0 };
+
+ hdev->prp_page_pool = dma_pool_create("prp list page", hdev->dev,
+ PAGE_SIZE, PAGE_SIZE, 0);
+
+ if (!hdev->prp_page_pool) {
+ dev_err(hdev->dev, "create prp_page_pool failed\n");
+ return -ENOMEM;
+ }
+
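+ /* several 256-byte pools, presumably to spread allocator contention for short PRP lists */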
+ for (i = 0; i < small_pool_num; i++) {
+ sprintf(poolname, "prp_list_256_%d", i);
+ hdev->prp_small_pool[i] =
+ dma_pool_create(poolname, hdev->dev, SMALL_POOL_SIZE,
+ SMALL_POOL_SIZE, 0);
+
+ if (!hdev->prp_small_pool[i]) {
+ dev_err(hdev->dev, "create prp_small_pool %d failed\n",
+ i);
+ goto destroy_prp_small_pool;
+ }
+ }
+
+ return 0;
+
+destroy_prp_small_pool:
+ while (i > 0)
+ dma_pool_destroy(hdev->prp_small_pool[--i]);
+ dma_pool_destroy(hdev->prp_page_pool);
+
+ return -ENOMEM;
+}
+
+static void spraid_destroy_dma_pools(struct spraid_dev *hdev)
+{
+ int i;
+
+ for (i = 0; i < small_pool_num; i++)
+ dma_pool_destroy(hdev->prp_small_pool[i]);
+ dma_pool_destroy(hdev->prp_page_pool);
+}
+
+static int spraid_get_dev_list(struct spraid_dev *hdev,
+ struct spraid_dev_info *devices)
+{
+ u32 nd = le32_to_cpu(hdev->ctrl_info->nd);
+ struct spraid_admin_command admin_cmd;
+ struct spraid_dev_list *list_buf;
+ dma_addr_t data_dma = 0;
+ u32 i, idx, hdid, ndev;
+ int ret = 0;
+
+ list_buf = dma_alloc_coherent(hdev->dev, PAGE_SIZE,
+ &data_dma, GFP_KERNEL);
+ if (!list_buf)
+ return -ENOMEM;
+
+ for (idx = 0; idx < nd;) {
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.get_info.opcode = SPRAID_ADMIN_GET_INFO;
+ admin_cmd.get_info.type = SPRAID_GET_INFO_DEV_LIST;
+ admin_cmd.get_info.cdw11 = cpu_to_le32(idx);
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
+
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd,
+ NULL, NULL, 0);
+
+ if (ret) {
+ dev_err(hdev->dev, "Get device list failed, nd: %u;"
+ "idx: %u, ret: %d\n",
+ nd, idx, ret);
+ goto out;
+ }
+ ndev = le32_to_cpu(list_buf->dev_num);
+
+ dev_info(hdev->dev, "ndev numbers: %u\n", ndev);
+
+ for (i = 0; i < ndev; i++) {
+ hdid = le32_to_cpu(list_buf->devices[i].hdid);
+ dev_info(hdev->dev, "list_buf->devices[%d], hdid: %u;"
+ "target: %d, channel: %d, lun: %d, attr[0x%x]\n",
+ i, hdid,
+ le16_to_cpu(list_buf->devices[i].target),
+ list_buf->devices[i].channel,
+ list_buf->devices[i].lun,
+ list_buf->devices[i].attr);
+ if (hdid > nd || hdid == 0) {
+ dev_err(hdev->dev, "err, hdid[%d] invalid\n",
+ hdid);
+ continue;
+ }
+ memcpy(&devices[hdid - 1], &list_buf->devices[i],
+ sizeof(struct spraid_dev_info));
+ }
+ idx += ndev;
+
+ /* a partially filled page means the controller returned the last batch */
+ if (ndev < MAX_DEV_ENTRY_PER_PAGE_4K)
+ break;
+ }
+
+out:
+ dma_free_coherent(hdev->dev, PAGE_SIZE, list_buf, data_dma);
+ return ret;
+}
+
+static void spraid_send_aen(struct spraid_dev *hdev, u16 cid)
+{
+ struct spraid_queue *adminq = &hdev->queues[0];
+ struct spraid_admin_command admin_cmd;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.common.opcode = SPRAID_ADMIN_ASYNC_EVENT;
+ admin_cmd.common.command_id = cid;
+
+ spraid_submit_cmd(adminq, &admin_cmd);
+ dev_info(hdev->dev, "send aen, cid[%d]\n", cid);
+}
+
+static inline void spraid_send_all_aen(struct spraid_dev *hdev)
+{
+ u16 i;
+
+ for (i = 0; i < hdev->ctrl_info->aerl; i++)
+ spraid_send_aen(hdev, i + SPRAID_AQ_BLK_MQ_DEPTH);
+}
+
+static int spraid_add_device(struct spraid_dev *hdev,
+ struct spraid_dev_info *device)
+{
+ struct Scsi_Host *shost = hdev->shost;
+ struct scsi_device *sdev;
+
+ dev_info(hdev->dev, "add device, hdid: %u target: %d, channel: %d;"
+ " lun: %d, attr[0x%x]\n",
+ le32_to_cpu(device->hdid), le16_to_cpu(device->target),
+ device->channel, device->lun, device->attr);
+
+ sdev = scsi_device_lookup(shost, device->channel,
+ le16_to_cpu(device->target), 0);
+ if (sdev) {
+ dev_warn(hdev->dev, "Device is already exist, channel: %d;"
+ " target_id: %d, lun: %d\n",
+ device->channel, le16_to_cpu(device->target), 0);
+ scsi_device_put(sdev);
+ return -EEXIST;
+ }
+ scsi_add_device(shost, device->channel, le16_to_cpu(device->target), 0);
+ return 0;
+}
+
+static int spraid_rescan_device(struct spraid_dev *hdev,
+ struct spraid_dev_info *device)
+{
+ struct Scsi_Host *shost = hdev->shost;
+ struct scsi_device *sdev;
+
+ dev_info(hdev->dev, "rescan device, hdid: %u target: %d, channel: %d;"
+ " lun: %d, attr[0x%x]\n",
+ le32_to_cpu(device->hdid), le16_to_cpu(device->target),
+ device->channel, device->lun, device->attr);
+
+ sdev = scsi_device_lookup(shost, device->channel,
+ le16_to_cpu(device->target), 0);
+ if (!sdev) {
+ dev_warn(hdev->dev, "device is not exit rescan it, channel: %d;"
+ " target_id: %d, lun: %d\n",
+ device->channel, le16_to_cpu(device->target), 0);
+ return -ENODEV;
+ }
+
+ scsi_rescan_device(&sdev->sdev_gendev);
+ scsi_device_put(sdev);
+ return 0;
+}
+
+static int spraid_remove_device(struct spraid_dev *hdev,
+ struct spraid_dev_info *org_device)
+{
+ struct Scsi_Host *shost = hdev->shost;
+ struct scsi_device *sdev;
+
+ dev_info(hdev->dev, "remove device, hdid: %u target: %d, channel: %d;"
+ " lun: %d, attr[0x%x]\n",
+ le32_to_cpu(org_device->hdid), le16_to_cpu(org_device->target),
+ org_device->channel, org_device->lun, org_device->attr);
+
+ sdev = scsi_device_lookup(shost, org_device->channel,
+ le16_to_cpu(org_device->target), 0);
+ if (!sdev) {
+ dev_warn(hdev->dev, "device is not exit remove it, channel: %d;"
+ " target_id: %d, lun: %d\n",
+ org_device->channel,
+ le16_to_cpu(org_device->target), 0);
+ return -ENODEV;
+ }
+
+ scsi_remove_device(sdev);
+ scsi_device_put(sdev);
+ return 0;
+}
+
+static int spraid_dev_list_init(struct spraid_dev *hdev)
+{
+ u32 nd = le32_to_cpu(hdev->ctrl_info->nd);
+ int i, ret;
+
+ hdev->devices = kzalloc_node(nd * sizeof(struct spraid_dev_info),
+ GFP_KERNEL, hdev->numa_node);
+ if (!hdev->devices)
+ return -ENOMEM;
+
+ ret = spraid_get_dev_list(hdev, hdev->devices);
+ if (ret) {
+ dev_err(hdev->dev,
+ "Ignore failure of getting device list;"
+ " within initialization\n");
+ return 0;
+ }
+
+ for (i = 0; i < nd; i++) {
+ if (SPRAID_DEV_INFO_FLAG_VALID(hdev->devices[i].flag) &&
+ SPRAID_DEV_INFO_ATTR_BOOT(hdev->devices[i].attr)) {
+ spraid_add_device(hdev, &hdev->devices[i]);
+ break;
+ }
+ }
+ return 0;
+}
+
+static int luntarget_cmp_func(const void *l, const void *r)
+{
+ const struct spraid_dev_info *ln = l;
+ const struct spraid_dev_info *rn = r;
+
+ if (ln->channel == rn->channel)
+ return le16_to_cpu(ln->target) - le16_to_cpu(rn->target);
+
+ return ln->channel - rn->channel;
+}
+
+static void spraid_scan_work(struct work_struct *work)
+{
+ struct spraid_dev *hdev =
+ container_of(work, struct spraid_dev, scan_work);
+ struct spraid_dev_info *devices, *org_devices;
+ struct spraid_dev_info *sortdevice;
+ u32 nd = le32_to_cpu(hdev->ctrl_info->nd);
+ u8 flag, org_flag;
+ int i, ret;
+ int count = 0;
+
+ devices = kcalloc(nd, sizeof(struct spraid_dev_info), GFP_KERNEL);
+ if (!devices)
+ return;
+
+ sortdevice = kcalloc(nd, sizeof(struct spraid_dev_info), GFP_KERNEL);
+ if (!sortdevice)
+ goto free_list;
+
+ ret = spraid_get_dev_list(hdev, devices);
+ if (ret)
+ goto free_all;
+ org_devices = hdev->devices;
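+ /* diff the fresh snapshot against the cached list: newly valid entries are
+ * added, changed ones rescanned, entries that lost the valid bit removed
+ */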
+ for (i = 0; i < nd; i++) {
+ org_flag = org_devices[i].flag;
+ flag = devices[i].flag;
+
+ dev_log_dbg(hdev->dev, "i: %d, org_flag: 0x%x, flag: 0x%x\n",
+ i, org_flag, flag);
+
+ if (SPRAID_DEV_INFO_FLAG_VALID(flag)) {
+ if (!SPRAID_DEV_INFO_FLAG_VALID(org_flag)) {
+ down_write(&hdev->devices_rwsem);
+ memcpy(&org_devices[i], &devices[i],
+ sizeof(struct spraid_dev_info));
+ memcpy(&sortdevice[count++], &devices[i],
+ sizeof(struct spraid_dev_info));
+ up_write(&hdev->devices_rwsem);
+ } else if (SPRAID_DEV_INFO_FLAG_CHANGE(flag)) {
+ spraid_rescan_device(hdev, &devices[i]);
+ }
+ } else {
+ if (SPRAID_DEV_INFO_FLAG_VALID(org_flag)) {
+ down_write(&hdev->devices_rwsem);
+ org_devices[i].flag &= 0xfe;
+ up_write(&hdev->devices_rwsem);
+ spraid_remove_device(hdev, &org_devices[i]);
+ }
+ }
+ }
+
+ dev_info(hdev->dev, "scan work add device count = %d\n", count);
+
+ sort(sortdevice, count, sizeof(sortdevice[0]),
+ luntarget_cmp_func, NULL);
+
+ for (i = 0; i < count; i++)
+ spraid_add_device(hdev, &sortdevice[i]);
+
+free_all:
+ kfree(sortdevice);
+free_list:
+ kfree(devices);
+}
+
+static void spraid_timesyn_work(struct work_struct *work)
+{
+ struct spraid_dev *hdev =
+ container_of(work, struct spraid_dev, timesyn_work);
+
+ spraid_configure_timestamp(hdev);
+}
+
+static int spraid_init_ctrl_info(struct spraid_dev *hdev);
+static void spraid_fw_act_work(struct work_struct *work)
+{
+ struct spraid_dev *hdev =
+ container_of(work, struct spraid_dev, fw_act_work);
+
+ if (spraid_init_ctrl_info(hdev))
+ dev_err(hdev->dev, "get ctrl info failed after fw act\n");
+}
+
+static void spraid_queue_scan(struct spraid_dev *hdev)
+{
+ queue_work(spraid_wq, &hdev->scan_work);
+}
+
+static void spraid_handle_aen_notice(struct spraid_dev *hdev, u32 result)
+{
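+ /* byte 1 of the AEN result selects the notice type */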
+ switch ((result & 0xff00) >> 8) {
+ case SPRAID_AEN_DEV_CHANGED:
+ spraid_queue_scan(hdev);
+ break;
+ case SPRAID_AEN_FW_ACT_START:
+ dev_info(hdev->dev, "fw activation starting\n");
+ break;
+ case SPRAID_AEN_HOST_PROBING:
+ break;
+ default:
+ dev_warn(hdev->dev, "async event result %08x\n", result);
+ }
+}
+
+static void spraid_handle_aen_vs(struct spraid_dev *hdev,
+ u32 result, u32 result1)
+{
+ switch ((result & 0xff00) >> 8) {
+ case SPRAID_AEN_TIMESYN:
+ queue_work(spraid_wq, &hdev->timesyn_work);
+ break;
+ case SPRAID_AEN_FW_ACT_FINISH:
+ dev_info(hdev->dev, "fw activation finish\n");
+ queue_work(spraid_wq, &hdev->fw_act_work);
+ break;
+ case SPRAID_AEN_EVENT_MIN ... SPRAID_AEN_EVENT_MAX:
+ dev_info(hdev->dev, "rcv card event[%d];"
+ " param1[0x%x] param2[0x%x]\n",
+ (result & 0xff00) >> 8, result, result1);
+ break;
+ default:
+ dev_warn(hdev->dev, "async event result: 0x%x\n", result);
+ }
+}
+
+static int spraid_alloc_resources(struct spraid_dev *hdev)
+{
+ int ret, nqueue;
+
+ ret = ida_alloc(&spraid_instance_ida, GFP_KERNEL);
+ if (ret < 0) {
+ dev_err(hdev->dev, "Get instance id failed\n");
+ return ret;
+ }
+ hdev->instance = ret;
+
+ hdev->ctrl_info = kzalloc_node(sizeof(*hdev->ctrl_info),
+ GFP_KERNEL, hdev->numa_node);
+ if (!hdev->ctrl_info) {
+ ret = -ENOMEM;
+ goto release_instance;
+ }
+
+ ret = spraid_create_dma_pools(hdev);
+ if (ret)
+ goto free_ctrl_info;
+ nqueue = num_possible_cpus() + 1;
+ hdev->queues = kcalloc_node(nqueue, sizeof(struct spraid_queue),
+ GFP_KERNEL, hdev->numa_node);
+ if (!hdev->queues) {
+ ret = -ENOMEM;
+ goto destroy_dma_pools;
+ }
+
+ ret = spraid_alloc_admin_cmds(hdev);
+ if (ret)
+ goto free_queues;
+
+ dev_info(hdev->dev, "[%s] queues num: %d\n", __func__, nqueue);
+
+ return 0;
+
+free_queues:
+ kfree(hdev->queues);
+destroy_dma_pools:
+ spraid_destroy_dma_pools(hdev);
+free_ctrl_info:
+ kfree(hdev->ctrl_info);
+release_instance:
+ ida_free(&spraid_instance_ida, hdev->instance);
+ return ret;
+}
+
+static void spraid_free_resources(struct spraid_dev *hdev)
+{
+ spraid_free_admin_cmds(hdev);
+ kfree(hdev->queues);
+ spraid_destroy_dma_pools(hdev);
+ kfree(hdev->ctrl_info);
+ ida_free(&spraid_instance_ida, hdev->instance);
+}
+
+static void spraid_bsg_unmap_data(struct spraid_dev *hdev, struct bsg_job *job)
+{
+ struct request *rq = blk_mq_rq_from_pdu(job);
+ struct spraid_iod *iod = job->dd_data;
+ enum dma_data_direction dma_dir =
+ rq_data_dir(rq) ? DMA_TO_DEVICE : DMA_FROM_DEVICE;
+
+ if (iod->nsge)
+ dma_unmap_sg(hdev->dev, iod->sg, iod->nsge, dma_dir);
+
+ spraid_free_iod_res(hdev, iod);
+}
+
+static int spraid_bsg_map_data(struct spraid_dev *hdev, struct bsg_job *job,
+ struct spraid_admin_command *cmd)
+{
+ struct request *rq = blk_mq_rq_from_pdu(job);
+ struct spraid_iod *iod = job->dd_data;
+ enum dma_data_direction dma_dir =
+ rq_data_dir(rq) ? DMA_TO_DEVICE : DMA_FROM_DEVICE;
+ int ret = 0;
+
+ iod->sg = job->request_payload.sg_list;
+ iod->nsge = job->request_payload.sg_cnt;
+ iod->length = job->request_payload.payload_len;
+ iod->use_sgl = false;
+ iod->npages = -1;
+ iod->sg_drv_mgmt = false;
+
+ if (!iod->nsge)
+ goto out;
+
+ ret = dma_map_sg_attrs(hdev->dev, iod->sg, iod->nsge,
+ dma_dir, DMA_ATTR_NO_WARN);
+ if (!ret) {
+ /* dma_map_sg_attrs() returns 0 on failure; don't report it as success */
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ ret = spraid_setup_prps(hdev, iod);
+ if (ret)
+ goto unmap;
+
+ cmd->common.dptr.prp1 = cpu_to_le64(sg_dma_address(iod->sg));
+ cmd->common.dptr.prp2 = cpu_to_le64(iod->first_dma);
+
+ return 0;
+
+unmap:
+ dma_unmap_sg(hdev->dev, iod->sg, iod->nsge, dma_dir);
+out:
+ return ret;
+}
+
+static int spraid_get_ctrl_info(struct spraid_dev *hdev,
+ struct spraid_ctrl_info *ctrl_info)
+{
+ struct spraid_admin_command admin_cmd;
+ u8 *data_ptr = NULL;
+ dma_addr_t data_dma = 0;
+ int ret;
+
+ data_ptr = dma_alloc_coherent(hdev->dev, PAGE_SIZE,
+ &data_dma, GFP_KERNEL);
+ if (!data_ptr)
+ return -ENOMEM;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.get_info.opcode = SPRAID_ADMIN_GET_INFO;
+ admin_cmd.get_info.type = SPRAID_GET_INFO_CTRL;
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
+
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+ if (!ret)
+ memcpy(ctrl_info, data_ptr, sizeof(struct spraid_ctrl_info));
+
+ dma_free_coherent(hdev->dev, PAGE_SIZE, data_ptr, data_dma);
+
+ return ret;
+}
+
+static int spraid_init_ctrl_info(struct spraid_dev *hdev)
+{
+ int ret;
+
+ hdev->ctrl_info->nd = cpu_to_le32(240);
+ hdev->ctrl_info->mdts = 8;
+ hdev->ctrl_info->max_cmds = cpu_to_le16(4096);
+ hdev->ctrl_info->max_num_sge = cpu_to_le16(128);
+ hdev->ctrl_info->max_channel = cpu_to_le16(4);
+ hdev->ctrl_info->max_tgt_id = cpu_to_le32(3239);
+ hdev->ctrl_info->max_lun = cpu_to_le16(2);
+
+ ret = spraid_get_ctrl_info(hdev, hdev->ctrl_info);
+ if (ret)
+ dev_err(hdev->dev, "get controller info failed: %d\n", ret);
+
+ dev_info(hdev->dev, "[%s]nd = %d\n", __func__, hdev->ctrl_info->nd);
+ dev_info(hdev->dev, "[%s]max_cmd = %d\n",
+ __func__, hdev->ctrl_info->max_cmds);
+ dev_info(hdev->dev, "[%s]max_channel = %d\n",
+ __func__, hdev->ctrl_info->max_channel);
+ dev_info(hdev->dev, "[%s]max_tgt_id = %d\n",
+ __func__, hdev->ctrl_info->max_tgt_id);
+ dev_info(hdev->dev, "[%s]max_lun = %d\n",
+ __func__, hdev->ctrl_info->max_lun);
+ dev_info(hdev->dev, "[%s]max_num_sge = %d\n",
+ __func__, hdev->ctrl_info->max_num_sge);
+ dev_info(hdev->dev, "[%s]lun_num_boot = %d\n",
+ __func__, hdev->ctrl_info->lun_num_in_boot);
+ dev_info(hdev->dev, "[%s]mdts = %d\n", __func__, hdev->ctrl_info->mdts);
+ dev_info(hdev->dev, "[%s]acl = %d\n", __func__, hdev->ctrl_info->acl);
+ dev_info(hdev->dev, "[%s]aer1 = %d\n", __func__, hdev->ctrl_info->aerl);
+ dev_info(hdev->dev, "[%s]card_type = %d\n",
+ __func__, hdev->ctrl_info->card_type);
+ dev_info(hdev->dev, "[%s]rtd3e = %d\n",
+ __func__, hdev->ctrl_info->rtd3e);
+ dev_info(hdev->dev, "[%s]sn = %s\n", __func__, hdev->ctrl_info->sn);
+ dev_info(hdev->dev, "[%s]fr = %s\n", __func__, hdev->ctrl_info->fr);
+
+ if (!hdev->ctrl_info->aerl)
+ hdev->ctrl_info->aerl = 1;
+ if (hdev->ctrl_info->aerl > SPRAID_NR_AEN_COMMANDS)
+ hdev->ctrl_info->aerl = SPRAID_NR_AEN_COMMANDS;
+
+ return 0;
+}
+
+#define SPRAID_MAX_ADMIN_PAYLOAD_SIZE BIT(16)
+static int spraid_alloc_iod_ext_mem_pool(struct spraid_dev *hdev)
+{
+ u16 max_sge = le16_to_cpu(hdev->ctrl_info->max_num_sge);
+ size_t alloc_size;
+
+ alloc_size = spraid_iod_ext_size(hdev, SPRAID_MAX_ADMIN_PAYLOAD_SIZE,
+ max_sge, true, false);
+ if (alloc_size > PAGE_SIZE)
+ dev_warn(hdev->dev, "It is unreasonable ;"
+ " sg allocation more than one page\n");
+ hdev->iod_mempool = mempool_create_node(1, mempool_kmalloc,
+ mempool_kfree,
+ (void *)alloc_size, GFP_KERNEL,
+ hdev->numa_node);
+ if (!hdev->iod_mempool) {
+ dev_err(hdev->dev, "Create iod extension memory pool failed\n");
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
+static void spraid_free_iod_ext_mem_pool(struct spraid_dev *hdev)
+{
+ mempool_destroy(hdev->iod_mempool);
+}
+
+static int spraid_user_admin_cmd(struct spraid_dev *hdev, struct bsg_job *job)
+{
+ struct spraid_bsg_request *bsg_req = job->request;
+ struct spraid_passthru_common_cmd *cmd = &(bsg_req->admcmd);
+ struct spraid_admin_command admin_cmd;
+ u32 timeout = msecs_to_jiffies(cmd->timeout_ms);
+ u32 result[2] = {0};
+ int status;
+
+ if (hdev->state >= SPRAID_RESETTING) {
+ dev_err(hdev->dev, "[%s] err, host state:[%d] is not right\n",
+ __func__, hdev->state);
+ return -EBUSY;
+ }
+
+ dev_info(hdev->dev, "[%s] opcode[0x%x] subopcode[0x%x] init\n",
+ __func__, cmd->opcode, cmd->info_0.subopcode);
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.common.opcode = cmd->opcode;
+ admin_cmd.common.flags = cmd->flags;
+ admin_cmd.common.hdid = cpu_to_le32(cmd->nsid);
+ admin_cmd.common.cdw2[0] = cpu_to_le32(cmd->cdw2);
+ admin_cmd.common.cdw2[1] = cpu_to_le32(cmd->cdw3);
+ admin_cmd.common.cdw10 = cpu_to_le32(cmd->cdw10);
+ admin_cmd.common.cdw11 = cpu_to_le32(cmd->cdw11);
+ admin_cmd.common.cdw12 = cpu_to_le32(cmd->cdw12);
+ admin_cmd.common.cdw13 = cpu_to_le32(cmd->cdw13);
+ admin_cmd.common.cdw14 = cpu_to_le32(cmd->cdw14);
+ admin_cmd.common.cdw15 = cpu_to_le32(cmd->cdw15);
+
+ status = spraid_bsg_map_data(hdev, job, &admin_cmd);
+ if (status) {
+ dev_err(hdev->dev, "[%s] err, map data failed\n", __func__);
+ return status;
+ }
+
+ status = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, &result[0],
+ &result[1], timeout);
+ if (status >= 0) {
+ job->reply_len = sizeof(result);
+ memcpy(job->reply, result, sizeof(result));
+ }
+
+ dev_info(hdev->dev, "[%s] opcode[0x%x] subopcode[0x%x];"
+ " status[0x%x] result0[0x%x] result1[0x%x]\n",
+ __func__, cmd->opcode, cmd->info_0.subopcode,
+ status, result[0], result[1]);
+
+ spraid_bsg_unmap_data(hdev, job);
+
+ return status;
+}
+
+static int spraid_alloc_ioq_ptcmds(struct spraid_dev *hdev)
+{
+ int i;
+ int ptnum = SPRAID_NR_IOQ_PTCMDS;
+
+ INIT_LIST_HEAD(&hdev->ioq_pt_list);
+ spin_lock_init(&hdev->ioq_pt_lock);
+
+ hdev->ioq_ptcmds = kcalloc_node(ptnum, sizeof(struct spraid_cmd),
+ GFP_KERNEL, hdev->numa_node);
+
+ if (!hdev->ioq_ptcmds) {
+ dev_err(hdev->dev, "Alloc ioq_ptcmds failed\n");
+ return -ENOMEM;
+ }
+
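+ /* passthrough cids start above SPRAID_IO_BLK_MQ_DEPTH so they never collide
+ * with blk-mq tags used by regular I/O on the same queue
+ */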
+ for (i = 0; i < ptnum; i++) {
+ hdev->ioq_ptcmds[i].qid = i / SPRAID_PTCMDS_PERQ + 1;
+ hdev->ioq_ptcmds[i].cid = i % SPRAID_PTCMDS_PERQ
+ + SPRAID_IO_BLK_MQ_DEPTH;
+ list_add_tail(&(hdev->ioq_ptcmds[i].list), &hdev->ioq_pt_list);
+ }
+
+ dev_info(hdev->dev, "Alloc ioq_ptcmds success, ptnum[%d]\n", ptnum);
+
+ return 0;
+}
+
+static void spraid_free_ioq_ptcmds(struct spraid_dev *hdev)
+{
+ kfree(hdev->ioq_ptcmds);
+ hdev->ioq_ptcmds = NULL;
+
+ INIT_LIST_HEAD(&hdev->ioq_pt_list);
+}
+
+static int spraid_submit_ioq_sync_cmd(struct spraid_dev *hdev,
+ struct spraid_ioq_command *cmd,
+ u32 *result, u32 *reslen, u32 timeout)
+{
+ int ret;
+ dma_addr_t sense_dma;
+ struct spraid_queue *ioq;
+ void *sense_addr = NULL;
+ struct spraid_cmd *pt_cmd = spraid_get_cmd(hdev, SPRAID_CMD_IOPT);
+
+ if (!pt_cmd) {
+ dev_err(hdev->dev, "err, get ioq cmd failed\n");
+ return -EFAULT;
+ }
+
+ timeout = timeout ? timeout : ADMIN_TIMEOUT;
+
+ init_completion(&pt_cmd->cmd_done);
+
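+ /* each cid owns a fixed SCSI_SENSE_BUFFERSIZE slice of the queue's sense buffer */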
+ ioq = &hdev->queues[pt_cmd->qid];
+ ret = pt_cmd->cid * SCSI_SENSE_BUFFERSIZE;
+ sense_addr = ioq->sense + ret;
+ sense_dma = ioq->sense_dma_addr + ret;
+
+ cmd->common.sense_addr = cpu_to_le64(sense_dma);
+ cmd->common.sense_len = cpu_to_le16(SCSI_SENSE_BUFFERSIZE);
+ cmd->common.command_id = pt_cmd->cid;
+
+ spraid_submit_cmd(ioq, cmd);
+
+ if (!wait_for_completion_timeout(&pt_cmd->cmd_done, timeout)) {
+ dev_err(hdev->dev, "[%s] cid[%d] qid[%d] timeout;"
+ " opcode[0x%x] subopcode[0x%x]\n",
+ __func__, pt_cmd->cid, pt_cmd->qid, cmd->common.opcode,
+ (le32_to_cpu(cmd->common.cdw3[0]) & 0xffff));
+ WRITE_ONCE(pt_cmd->state, SPRAID_CMD_TIMEOUT);
+ spraid_put_cmd(hdev, pt_cmd, SPRAID_CMD_IOPT);
+ return -EINVAL;
+ }
+
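+ /* the masked status 0x101 appears to mark valid sense data; inferred from usage */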
+ if (result && reslen) {
+ if ((pt_cmd->status & 0x17f) == 0x101) {
+ memcpy(result, sense_addr, SCSI_SENSE_BUFFERSIZE);
+ *reslen = SCSI_SENSE_BUFFERSIZE;
+ }
+ }
+
+ /* snapshot the status before the command slot is recycled */
+ ret = pt_cmd->status;
+ spraid_put_cmd(hdev, pt_cmd, SPRAID_CMD_IOPT);
+
+ return ret;
+}
+
+static int spraid_user_ioq_cmd(struct spraid_dev *hdev, struct bsg_job *job)
+{
+ struct spraid_bsg_request *bsg_req =
+ (struct spraid_bsg_request *)(job->request);
+ struct spraid_ioq_passthru_cmd *cmd = &(bsg_req->ioqcmd);
+ struct spraid_ioq_command ioq_cmd;
+ int status = 0;
+ u32 timeout = msecs_to_jiffies(cmd->timeout_ms);
+
+ if (cmd->data_len > PAGE_SIZE) {
+ dev_err(hdev->dev, "[%s] data len bigger than 4k\n", __func__);
+ return -EFAULT;
+ }
+
+ if (hdev->state != SPRAID_LIVE) {
+ dev_err(hdev->dev, "[%s] err, host state:[%d] is not live\n",
+ __func__, hdev->state);
+ return -EBUSY;
+ }
+
+ dev_info(hdev->dev, "[%s] opcode[0x%x] subopcode[0x%x] init;"
+ " datalen[%d]\n",
+ __func__, cmd->opcode, cmd->info_1.subopcode, cmd->data_len);
+
+ memset(&ioq_cmd, 0, sizeof(ioq_cmd));
+ ioq_cmd.common.opcode = cmd->opcode;
+ ioq_cmd.common.flags = cmd->flags;
+ ioq_cmd.common.hdid = cpu_to_le32(cmd->nsid);
+ ioq_cmd.common.sense_len = cpu_to_le16(cmd->info_0.res_sense_len);
+ ioq_cmd.common.cdb_len = cmd->info_0.cdb_len;
+ ioq_cmd.common.rsvd2 = cmd->info_0.rsvd0;
+ ioq_cmd.common.cdw3[0] = cpu_to_le32(cmd->cdw3);
+ ioq_cmd.common.cdw3[1] = cpu_to_le32(cmd->cdw4);
+ ioq_cmd.common.cdw3[2] = cpu_to_le32(cmd->cdw5);
+
+ ioq_cmd.common.cdw10[0] = cpu_to_le32(cmd->cdw10);
+ ioq_cmd.common.cdw10[1] = cpu_to_le32(cmd->cdw11);
+ ioq_cmd.common.cdw10[2] = cpu_to_le32(cmd->cdw12);
+ ioq_cmd.common.cdw10[3] = cpu_to_le32(cmd->cdw13);
+ ioq_cmd.common.cdw10[4] = cpu_to_le32(cmd->cdw14);
+ ioq_cmd.common.cdw10[5] = cpu_to_le32(cmd->data_len);
+
+ memcpy(ioq_cmd.common.cdb, &cmd->cdw16, cmd->info_0.cdb_len);
+
+ ioq_cmd.common.cdw26[0] = cpu_to_le32(cmd->cdw26[0]);
+ ioq_cmd.common.cdw26[1] = cpu_to_le32(cmd->cdw26[1]);
+ ioq_cmd.common.cdw26[2] = cpu_to_le32(cmd->cdw26[2]);
+ ioq_cmd.common.cdw26[3] = cpu_to_le32(cmd->cdw26[3]);
+
+ status = spraid_bsg_map_data(hdev, job,
+ (struct spraid_admin_command *)&ioq_cmd);
+ if (status) {
+ dev_err(hdev->dev, "[%s] err, map data failed\n", __func__);
+ return status;
+ }
+
+ status = spraid_submit_ioq_sync_cmd(hdev, &ioq_cmd, job->reply,
+ &job->reply_len, timeout);
+
+ dev_info(hdev->dev, "[%s] opcode[0x%x] subopcode[0x%x], status[0x%x];"
+ " reply_len[%d]\n",
+ __func__, cmd->opcode, cmd->info_1.subopcode,
+ status, job->reply_len);
+
+ spraid_bsg_unmap_data(hdev, job);
+
+ return status;
+}
+
+static bool spraid_check_scmd_completed(struct scsi_cmnd *scmd)
+{
+ struct spraid_dev *hdev = shost_priv(scmd->device->host);
+ struct spraid_iod *iod = scsi_cmd_priv(scmd);
+ struct spraid_queue *spraidq;
+ u16 hwq, cid;
+
+ spraid_get_tag_from_scmd(scmd, &hwq, &cid);
+ spraidq = &hdev->queues[hwq];
+ if (READ_ONCE(iod->state) == SPRAID_CMD_COMPLETE
+ || spraid_poll_cq(spraidq, cid)) {
+ dev_warn(hdev->dev, "cid[%d] qid[%d] has been completed\n",
+ cid, spraidq->qid);
+ return true;
+ }
+ return false;
+}
+
+static enum blk_eh_timer_return spraid_scmd_timeout(struct scsi_cmnd *scmd)
+{
+ struct spraid_iod *iod = scsi_cmd_priv(scmd);
+ unsigned int timeout = scmd->device->request_queue->rq_timeout;
+
+ if (spraid_check_scmd_completed(scmd))
+ goto out;
+
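+ /* hand the command to the error handler exactly once via the state cmpxchg */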
+ if (time_after(jiffies, scmd->jiffies_at_alloc + timeout)) {
+ if (cmpxchg(&iod->state,
+ SPRAID_CMD_IN_FLIGHT,
+ SPRAID_CMD_TIMEOUT) == SPRAID_CMD_IN_FLIGHT) {
+ return BLK_EH_DONE;
+ }
+ }
+out:
+ return BLK_EH_RESET_TIMER;
+}
+
+/* abort commands are sent via the admin queue for now */
+static int spraid_send_abort_cmd(struct spraid_dev *hdev,
+ u32 hdid, u16 qid, u16 cid)
+{
+ struct spraid_admin_command admin_cmd;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.abort.opcode = SPRAID_ADMIN_ABORT_CMD;
+ admin_cmd.abort.hdid = cpu_to_le32(hdid);
+ admin_cmd.abort.sqid = cpu_to_le16(qid);
+ admin_cmd.abort.cid = cpu_to_le16(cid);
+
+ return spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+}
+
+/* reset commands are sent via the admin queue for now */
+static int spraid_send_reset_cmd(struct spraid_dev *hdev, int type, u32 hdid)
+{
+ struct spraid_admin_command admin_cmd;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.reset.opcode = SPRAID_ADMIN_RESET;
+ admin_cmd.reset.hdid = cpu_to_le32(hdid);
+ admin_cmd.reset.type = type;
+
+ return spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+}
+
+static bool spraid_change_host_state(struct spraid_dev *hdev,
+ enum spraid_state newstate)
+{
+ unsigned long flags;
+ enum spraid_state oldstate;
+ bool change = false;
+
+ spin_lock_irqsave(&hdev->state_lock, flags);
+
+ oldstate = hdev->state;
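+ /* only a fixed set of transitions is legal, e.g. RESETTING is reachable
+ * only from LIVE, while DEAD can be entered from any operational state
+ */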
+ switch (newstate) {
+ case SPRAID_LIVE:
+ switch (oldstate) {
+ case SPRAID_NEW:
+ case SPRAID_RESETTING:
+ change = true;
+ break;
+ default:
+ break;
+ }
+ break;
+ case SPRAID_RESETTING:
+ switch (oldstate) {
+ case SPRAID_LIVE:
+ change = true;
+ break;
+ default:
+ break;
+ }
+ break;
+ case SPRAID_DELETING:
+ if (oldstate != SPRAID_DELETING)
+ change = true;
+ break;
+ case SPRAID_DEAD:
+ switch (oldstate) {
+ case SPRAID_NEW:
+ case SPRAID_LIVE:
+ case SPRAID_RESETTING:
+ change = true;
+ break;
+ default:
+ break;
+ }
+ break;
+ default:
+ break;
+ }
+ if (change)
+ hdev->state = newstate;
+ spin_unlock_irqrestore(&hdev->state_lock, flags);
+
+ dev_info(hdev->dev, "[%s][%d]->[%d], change[%d]\n",
+ __func__, oldstate, newstate, change);
+
+ return change;
+}
+
+static void spraid_back_fault_cqe(struct spraid_queue *ioq,
+ struct spraid_completion *cqe)
+{
+ struct spraid_dev *hdev = ioq->hdev;
+ struct blk_mq_tags *tags;
+ struct scsi_cmnd *scmd;
+ struct spraid_iod *iod;
+ struct request *req;
+
+ tags = hdev->shost->tag_set.tags[ioq->qid - 1];
+ req = blk_mq_tag_to_rq(tags, cqe->cmd_id);
+ if (unlikely(!req || !blk_mq_request_started(req)))
+ return;
+
+ scmd = blk_mq_rq_to_pdu(req);
+ iod = scsi_cmd_priv(scmd);
+
+ set_host_byte(scmd, DID_NO_CONNECT);
+ if (iod->nsge)
+ scsi_dma_unmap(scmd);
+ spraid_free_iod_res(hdev, iod);
+ scmd->scsi_done(scmd);
+ dev_warn(hdev->dev, "Back fault CQE, cid[%d] qid[%d]\n",
+ cqe->cmd_id, ioq->qid);
+}
+
+static void spraid_back_all_io(struct spraid_dev *hdev)
+{
+ int i, j;
+ struct spraid_queue *ioq;
+ struct spraid_completion cqe = { 0 };
+
+ scsi_block_requests(hdev->shost);
+
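+ /* fail every possible outstanding tag back to the midlayer as DID_NO_CONNECT */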
+ for (i = 1; i <= hdev->shost->nr_hw_queues; i++) {
+ ioq = &hdev->queues[i];
+ for (j = 0; j < hdev->shost->can_queue; j++) {
+ cqe.cmd_id = j;
+ spraid_back_fault_cqe(ioq, &cqe);
+ }
+ }
+
+ scsi_unblock_requests(hdev->shost);
+}
+
+static void spraid_dev_disable(struct spraid_dev *hdev, bool shutdown)
+{
+ struct spraid_queue *adminq = &hdev->queues[0];
+ u16 start, end;
+ unsigned long timeout = jiffies + 600 * HZ;
+
+ if (pci_device_is_present(hdev->pdev)) {
+ if (shutdown)
+ spraid_shutdown_ctrl(hdev);
+ else
+ spraid_disable_ctrl(hdev);
+ }
+
+ while (!time_after(jiffies, timeout)) {
+ if (!pci_device_is_present(hdev->pdev)) {
+ dev_info(hdev->dev, "[%s] pci_device not present;"
+ " skip wait\n", __func__);
+ break;
+ }
+ if (!spraid_wait_ready(hdev, hdev->cap, false)) {
+ dev_info(hdev->dev,
+ "[%s] wait ready success after reset\n",
+ __func__);
+ break;
+ }
+ dev_info(hdev->dev, "[%s] waiting csts_rdy ready\n", __func__);
+ }
+
+ if (hdev->queue_count == 0) {
+ dev_err(hdev->dev, "[%s] warn, queue has been delete\n",
+ __func__);
+ return;
+ }
+
+ spin_lock_irq(&adminq->cq_lock);
+ spraid_process_cq(adminq, &start, &end, -1);
+ spin_unlock_irq(&adminq->cq_lock);
+ spraid_complete_cqes(adminq, start, end);
+
+ spraid_pci_disable(hdev);
+
+ spraid_back_all_io(hdev);
+}
+
+static void spraid_reset_work(struct work_struct *work)
+{
+ int ret;
+ struct spraid_dev *hdev =
+ container_of(work, struct spraid_dev, reset_work);
+
+ if (hdev->state != SPRAID_RESETTING) {
+ dev_err(hdev->dev, "[%s] err, host is not reset state\n",
+ __func__);
+ return;
+ }
+
+ dev_info(hdev->dev, "[%s] enter host reset\n", __func__);
+
+ if (hdev->ctrl_config & SPRAID_CC_ENABLE) {
+ dev_info(hdev->dev, "[%s] start dev_disable\n", __func__);
+ spraid_dev_disable(hdev, false);
+ }
+
+ ret = spraid_pci_enable(hdev);
+ if (ret)
+ goto out;
+
+ ret = spraid_setup_admin_queue(hdev);
+ if (ret)
+ goto pci_disable;
+
+ ret = spraid_setup_io_queues(hdev);
+ if (ret || hdev->online_queues <= hdev->shost->nr_hw_queues)
+ goto pci_disable;
+
+ spraid_change_host_state(hdev, SPRAID_LIVE);
+
+ spraid_send_all_aen(hdev);
+
+ return;
+
+pci_disable:
+ spraid_pci_disable(hdev);
+out:
+ spraid_change_host_state(hdev, SPRAID_DEAD);
+ dev_err(hdev->dev, "[%s] err, host reset failed\n", __func__);
+}
+
+static int spraid_reset_work_sync(struct spraid_dev *hdev)
+{
+ if (!spraid_change_host_state(hdev, SPRAID_RESETTING)) {
+ dev_info(hdev->dev, "[%s] can't change to reset state\n",
+ __func__);
+ return -EBUSY;
+ }
+
+ if (!queue_work(spraid_wq, &hdev->reset_work)) {
+ dev_err(hdev->dev, "[%s] err, host is already in reset state\n",
+ __func__);
+ return -EBUSY;
+ }
+
+ flush_work(&hdev->reset_work);
+ if (hdev->state != SPRAID_LIVE)
+ return -ENODEV;
+
+ return 0;
+}
+
+static int spraid_wait_abnl_cmd_done(struct spraid_iod *iod)
+{
+ u16 times = 0;
+
+ do {
+ if (READ_ONCE(iod->state) == SPRAID_CMD_TMO_COMPLETE)
+ return 0;
+ msleep(500);
+ times++;
+ } while (times <= SPRAID_WAIT_ABNL_CMD_TIMEOUT);
+
+ /* the command never completed after the abort/reset was issued */
+ return -ETIMEDOUT;
+}
+
+static int spraid_abort_handler(struct scsi_cmnd *scmd)
+{
+ struct spraid_dev *hdev = shost_priv(scmd->device->host);
+ struct spraid_iod *iod = scsi_cmd_priv(scmd);
+ struct spraid_sdev_hostdata *hostdata;
+ u16 hwq, cid;
+ int ret;
+
+ scsi_print_command(scmd);
+
+ if (hdev->state != SPRAID_LIVE || !spraid_wait_abnl_cmd_done(iod) ||
+ spraid_check_scmd_completed(scmd))
+ return SUCCESS;
+
+ hostdata = scmd->device->hostdata;
+ spraid_get_tag_from_scmd(scmd, &hwq, &cid);
+
+ dev_warn(hdev->dev, "cid[%d] qid[%d] timeout, aborting\n", cid, hwq);
+ ret = spraid_send_abort_cmd(hdev, hostdata->hdid, hwq, cid);
+ if (ret != ADMIN_ERR_TIMEOUT) {
+ ret = spraid_wait_abnl_cmd_done(iod);
+ if (ret) {
+ dev_warn(hdev->dev, "cid[%d] qid[%d] abort failed;"
+ " not found\n", cid, hwq);
+ return FAILED;
+ }
+ dev_warn(hdev->dev, "cid[%d] qid[%d] abort succ\n", cid, hwq);
+ return SUCCESS;
+ }
+ dev_warn(hdev->dev, "cid[%d] qid[%d] abort failed, timeout\n",
+ cid, hwq);
+ return FAILED;
+}
+
+static int spraid_tgt_reset_handler(struct scsi_cmnd *scmd)
+{
+ struct spraid_dev *hdev = shost_priv(scmd->device->host);
+ struct spraid_iod *iod = scsi_cmd_priv(scmd);
+ struct spraid_sdev_hostdata *hostdata;
+ u16 hwq, cid;
+ int ret;
+
+ scsi_print_command(scmd);
+
+ if (hdev->state != SPRAID_LIVE || !spraid_wait_abnl_cmd_done(iod) ||
+ spraid_check_scmd_completed(scmd))
+ return SUCCESS;
+
+ hostdata = scmd->device->hostdata;
+ spraid_get_tag_from_scmd(scmd, &hwq, &cid);
+
+ dev_warn(hdev->dev, "cid[%d] qid[%d] timeout, target reset\n",
+ cid, hwq);
+ ret = spraid_send_reset_cmd(hdev, SPRAID_RESET_TARGET, hostdata->hdid);
+ if (ret == 0) {
+ ret = spraid_wait_abnl_cmd_done(iod);
+ if (ret) {
+ dev_warn(hdev->dev,
+ "cid[%d] qid[%d]target reset failed;"
+ " not found\n", cid, hwq);
+ return FAILED;
+ }
+
+ dev_warn(hdev->dev, "cid[%d] qid[%d] target reset success\n",
+ cid, hwq);
+ return SUCCESS;
+ }
+
+ dev_warn(hdev->dev, "cid[%d] qid[%d] ret[%d] target reset failed\n",
+ cid, hwq, ret);
+ return FAILED;
+}
+
+static int spraid_bus_reset_handler(struct scsi_cmnd *scmd)
+{
+ struct spraid_dev *hdev = shost_priv(scmd->device->host);
+ struct spraid_iod *iod = scsi_cmd_priv(scmd);
+ struct spraid_sdev_hostdata *hostdata;
+ u16 hwq, cid;
+ int ret;
+
+ scsi_print_command(scmd);
+
+ if (hdev->state != SPRAID_LIVE || !spraid_wait_abnl_cmd_done(iod) ||
+ spraid_check_scmd_completed(scmd))
+ return SUCCESS;
+
+ hostdata = scmd->device->hostdata;
+ spraid_get_tag_from_scmd(scmd, &hwq, &cid);
+
+ dev_warn(hdev->dev, "cid[%d] qid[%d] timeout, bus reset\n", cid, hwq);
+ ret = spraid_send_reset_cmd(hdev, SPRAID_RESET_BUS, hostdata->hdid);
+ if (ret == 0) {
+ ret = spraid_wait_abnl_cmd_done(iod);
+ if (ret) {
+ dev_warn(hdev->dev,
+ "cid[%d] qid[%d] bus reset failed;"
+ " not found\n", cid, hwq);
+ return FAILED;
+ }
+
+ dev_warn(hdev->dev, "cid[%d] qid[%d] bus reset succ\n",
+ cid, hwq);
+ return SUCCESS;
+ }
+
+ dev_warn(hdev->dev, "cid[%d] qid[%d] ret[%d] bus reset failed\n",
+ cid, hwq, ret);
+ return FAILED;
+}
+
+static int spraid_shost_reset_handler(struct scsi_cmnd *scmd)
+{
+ u16 hwq, cid;
+ struct spraid_dev *hdev = shost_priv(scmd->device->host);
+
+ scsi_print_command(scmd);
+ if (hdev->state != SPRAID_LIVE || spraid_check_scmd_completed(scmd))
+ return SUCCESS;
+
+ spraid_get_tag_from_scmd(scmd, &hwq, &cid);
+ dev_warn(hdev->dev, "cid[%d] qid[%d] host reset\n", cid, hwq);
+
+ if (spraid_reset_work_sync(hdev)) {
+ dev_warn(hdev->dev, "cid[%d] qid[%d] host reset failed\n",
+ cid, hwq);
+ return FAILED;
+ }
+
+ dev_warn(hdev->dev, "cid[%d] qid[%d] host reset success\n", cid, hwq);
+
+ return SUCCESS;
+}
+
+static pci_ers_result_t spraid_pci_error_detected(struct pci_dev *pdev,
+ pci_channel_state_t state)
+{
+ struct spraid_dev *hdev = pci_get_drvdata(pdev);
+
+ dev_info(hdev->dev, "enter pci error detect, state:%d\n", state);
+
+ switch (state) {
+ case pci_channel_io_normal:
+ dev_warn(hdev->dev, "channel is normal, do nothing\n");
+
+ return PCI_ERS_RESULT_CAN_RECOVER;
+ case pci_channel_io_frozen:
+ dev_warn(hdev->dev,
+ "channel io frozen, need reset controller\n");
+
+ scsi_block_requests(hdev->shost);
+
+ spraid_change_host_state(hdev, SPRAID_RESETTING);
+
+ return PCI_ERS_RESULT_NEED_RESET;
+ case pci_channel_io_perm_failure:
+ dev_warn(hdev->dev, "channel io failure, request disconnect\n");
+
+ return PCI_ERS_RESULT_DISCONNECT;
+ }
+
+ return PCI_ERS_RESULT_NEED_RESET;
+}
+
+static pci_ers_result_t spraid_pci_slot_reset(struct pci_dev *pdev)
+{
+ struct spraid_dev *hdev = pci_get_drvdata(pdev);
+
+ dev_info(hdev->dev, "restart after slot reset\n");
+
+ pci_restore_state(pdev);
+
+ if (!queue_work(spraid_wq, &hdev->reset_work)) {
+ dev_err(hdev->dev, "[%s] err, the device is resetting state\n",
+ __func__);
+ return PCI_ERS_RESULT_NONE;
+ }
+
+ flush_work(&hdev->reset_work);
+
+ scsi_unblock_requests(hdev->shost);
+
+ return PCI_ERS_RESULT_RECOVERED;
+}
+
+static void spraid_reset_done(struct pci_dev *pdev)
+{
+ struct spraid_dev *hdev = pci_get_drvdata(pdev);
+
+ dev_info(hdev->dev, "enter spraid reset done\n");
+}
+
+static ssize_t csts_pp_show(struct device *cdev,
+ struct device_attribute *attr, char *buf)
+{
+ struct Scsi_Host *shost = class_to_shost(cdev);
+ struct spraid_dev *hdev = shost_priv(shost);
+ int ret = -1;
+
+ if (pci_device_is_present(hdev->pdev)) {
+ ret = (readl(hdev->bar + SPRAID_REG_CSTS)
+ & SPRAID_CSTS_PP_MASK);
+ ret >>= SPRAID_CSTS_PP_SHIFT;
+ }
+
+ return snprintf(buf, PAGE_SIZE, "%d\n", ret);
+}
+
+static ssize_t csts_shst_show(struct device *cdev,
+ struct device_attribute *attr, char *buf)
+{
+ struct Scsi_Host *shost = class_to_shost(cdev);
+ struct spraid_dev *hdev = shost_priv(shost);
+ int ret = -1;
+
+ if (pci_device_is_present(hdev->pdev)) {
+ ret = (readl(hdev->bar + SPRAID_REG_CSTS)
+ & SPRAID_CSTS_SHST_MASK);
+ ret >>= SPRAID_CSTS_SHST_SHIFT;
+ }
+
+ return snprintf(buf, PAGE_SIZE, "%d\n", ret);
+}
+
+static ssize_t csts_cfs_show(struct device *cdev,
+ struct device_attribute *attr, char *buf)
+{
+ struct Scsi_Host *shost = class_to_shost(cdev);
+ struct spraid_dev *hdev = shost_priv(shost);
+ int ret = -1;
+
+ if (pci_device_is_present(hdev->pdev)) {
+ ret = (readl(hdev->bar + SPRAID_REG_CSTS)
+ & SPRAID_CSTS_CFS_MASK);
+ ret >>= SPRAID_CSTS_CFS_SHIFT;
+ }
+
+ return snprintf(buf, PAGE_SIZE, "%d\n", ret);
+}
+
+static ssize_t csts_rdy_show(struct device *cdev,
+ struct device_attribute *attr, char *buf)
+{
+ struct Scsi_Host *shost = class_to_shost(cdev);
+ struct spraid_dev *hdev = shost_priv(shost);
+ int ret = -1;
+
+ if (pci_device_is_present(hdev->pdev))
+ ret = (readl(hdev->bar + SPRAID_REG_CSTS) & SPRAID_CSTS_RDY);
+
+ return snprintf(buf, PAGE_SIZE, "%d\n", ret);
+}
+
+static ssize_t fw_version_show(struct device *cdev,
+ struct device_attribute *attr, char *buf)
+{
+ struct Scsi_Host *shost = class_to_shost(cdev);
+ struct spraid_dev *hdev = shost_priv(shost);
+
+ return snprintf(buf, PAGE_SIZE, "%s\n", hdev->ctrl_info->fr);
+}
+
+static DEVICE_ATTR_RO(csts_pp);
+static DEVICE_ATTR_RO(csts_shst);
+static DEVICE_ATTR_RO(csts_cfs);
+static DEVICE_ATTR_RO(csts_rdy);
+static DEVICE_ATTR_RO(fw_version);
+
+static struct device_attribute *spraid_host_attrs[] = {
+ &dev_attr_csts_pp,
+ &dev_attr_csts_shst,
+ &dev_attr_csts_cfs,
+ &dev_attr_csts_rdy,
+ &dev_attr_fw_version,
+ NULL,
+};
+
+static int spraid_get_vd_info(struct spraid_dev *hdev,
+ struct spraid_vd_info *vd_info, u16 vid)
+{
+ struct spraid_admin_command admin_cmd;
+ u8 *data_ptr = NULL;
+ dma_addr_t data_dma = 0;
+ int ret;
+
+ data_ptr = dma_alloc_coherent(hdev->dev, PAGE_SIZE,
+ &data_dma, GFP_KERNEL);
+ if (!data_ptr)
+ return -ENOMEM;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.usr_cmd.opcode = USR_CMD_READ;
+ admin_cmd.usr_cmd.info_0.subopcode = cpu_to_le16(USR_CMD_VDINFO);
+ admin_cmd.usr_cmd.info_1.data_len = cpu_to_le16(USR_CMD_RDLEN);
+ admin_cmd.usr_cmd.info_1.param_len = cpu_to_le16(VDINFO_PARAM_LEN);
+ admin_cmd.usr_cmd.cdw10 = cpu_to_le32(vid);
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
+
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+ if (!ret)
+ memcpy(vd_info, data_ptr, sizeof(struct spraid_vd_info));
+
+ dma_free_coherent(hdev->dev, PAGE_SIZE, data_ptr, data_dma);
+
+ return ret;
+}
+
+static int spraid_get_bgtask(struct spraid_dev *hdev,
+ struct spraid_bgtask *bgtask)
+{
+ struct spraid_admin_command admin_cmd;
+ u8 *data_ptr = NULL;
+ dma_addr_t data_dma = 0;
+ int ret;
+
+ data_ptr = dma_alloc_coherent(hdev->dev, PAGE_SIZE,
+ &data_dma, GFP_KERNEL);
+ if (!data_ptr)
+ return -ENOMEM;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.usr_cmd.opcode = USR_CMD_READ;
+ admin_cmd.usr_cmd.info_0.subopcode = cpu_to_le16(USR_CMD_BGTASK);
+ admin_cmd.usr_cmd.info_1.data_len = cpu_to_le16(USR_CMD_RDLEN);
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
+
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+ if (!ret)
+ memcpy(bgtask, data_ptr, sizeof(struct spraid_bgtask));
+
+ dma_free_coherent(hdev->dev, PAGE_SIZE, data_ptr, data_dma);
+
+ return ret;
+}
+
+static ssize_t raid_level_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct scsi_device *sdev;
+ struct spraid_dev *hdev;
+ struct spraid_vd_info *vd_info;
+ struct spraid_sdev_hostdata *hostdata;
+ int ret;
+
+ sdev = to_scsi_device(dev);
+ hdev = shost_priv(sdev->host);
+ hostdata = sdev->hostdata;
+
+ vd_info = kmalloc(sizeof(*vd_info), GFP_KERNEL);
+ if (!vd_info || !SPRAID_DEV_INFO_ATTR_VD(hostdata->attr))
+ return snprintf(buf, PAGE_SIZE, "NA\n");
+
+ ret = spraid_get_vd_info(hdev, vd_info, sdev->id);
+ if (ret)
+ vd_info->rg_level = ARRAY_SIZE(raid_levels) - 1;
+
+ ret = (vd_info->rg_level < ARRAY_SIZE(raid_levels)) ?
+ vd_info->rg_level : (ARRAY_SIZE(raid_levels) - 1);
+
+ kfree(vd_info);
+
+ return snprintf(buf, PAGE_SIZE, "RAID-%s\n", raid_levels[ret]);
+}
+
+static ssize_t raid_state_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct scsi_device *sdev;
+ struct spraid_dev *hdev;
+ struct spraid_vd_info *vd_info;
+ struct spraid_sdev_hostdata *hostdata;
+ int ret;
+
+ sdev = to_scsi_device(dev);
+ hdev = shost_priv(sdev->host);
+ hostdata = sdev->hostdata;
+
+ vd_info = kmalloc(sizeof(*vd_info), GFP_KERNEL);
+ if (!vd_info || !SPRAID_DEV_INFO_ATTR_VD(hostdata->attr))
+ return snprintf(buf, PAGE_SIZE, "NA\n");
+
+ ret = spraid_get_vd_info(hdev, vd_info, sdev->id);
+ if (ret) {
+ vd_info->vd_status = 0;
+ vd_info->rg_id = 0xff;
+ }
+
+ ret = (vd_info->vd_status < ARRAY_SIZE(raid_states)) ?
+ vd_info->vd_status : 0;
+
+ kfree(vd_info);
+
+ return snprintf(buf, PAGE_SIZE, "%s\n", raid_states[ret]);
+}
+
+static ssize_t raid_resync_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct scsi_device *sdev;
+ struct spraid_dev *hdev;
+ struct spraid_vd_info *vd_info;
+ struct spraid_bgtask *bgtask;
+ struct spraid_sdev_hostdata *hostdata;
+ u8 rg_id, i, progress = 0;
+ int ret;
+
+ sdev = to_scsi_device(dev);
+ hdev = shost_priv(sdev->host);
+ hostdata = sdev->hostdata;
+
+ vd_info = kmalloc(sizeof(*vd_info), GFP_KERNEL);
+ if (!vd_info || !SPRAID_DEV_INFO_ATTR_VD(hostdata->attr))
+ return snprintf(buf, PAGE_SIZE, "NA\n");
+
+ ret = spraid_get_vd_info(hdev, vd_info, sdev->id);
+ if (ret)
+ goto out;
+
+ rg_id = vd_info->rg_id;
+
+ bgtask = (struct spraid_bgtask *)vd_info;
+ ret = spraid_get_bgtask(hdev, bgtask);
+ if (ret)
+ goto out;
+ for (i = 0; i < bgtask->task_num; i++) {
+ if ((bgtask->bgtask[i].type == BGTASK_TYPE_REBUILD) &&
+ (le16_to_cpu(bgtask->bgtask[i].vd_id) == rg_id))
+ progress = bgtask->bgtask[i].progress;
+ }
+
+out:
+ kfree(vd_info);
+ return snprintf(buf, PAGE_SIZE, "%d\n", progress);
+}
+
+static DEVICE_ATTR_RO(raid_level);
+static DEVICE_ATTR_RO(raid_state);
+static DEVICE_ATTR_RO(raid_resync);
+
+static struct device_attribute *spraid_dev_attrs[] = {
+ &dev_attr_raid_level,
+ &dev_attr_raid_state,
+ &dev_attr_raid_resync,
+ NULL,
+};
+
+static struct pci_error_handlers spraid_err_handler = {
+ .error_detected = spraid_pci_error_detected,
+ .slot_reset = spraid_pci_slot_reset,
+ .reset_done = spraid_reset_done,
+};
+
+static int spraid_sysfs_host_reset(struct Scsi_Host *shost, int reset_type)
+{
+ int ret;
+ struct spraid_dev *hdev = shost_priv(shost);
+
+ dev_info(hdev->dev, "[%s] start sysfs host reset cmd\n", __func__);
+ ret = spraid_reset_work_sync(hdev);
+ dev_info(hdev->dev, "[%s] stop sysfs host reset cmd[%d]\n",
+ __func__, ret);
+
+ return ret;
+}
+
+static struct scsi_host_template spraid_driver_template = {
+ .module = THIS_MODULE,
+ .name = "Ramaxel Logic spraid driver",
+ .proc_name = "spraid",
+ .queuecommand = spraid_queue_command,
+ .slave_alloc = spraid_slave_alloc,
+ .slave_destroy = spraid_slave_destroy,
+ .slave_configure = spraid_slave_configure,
+ .eh_timed_out = spraid_scmd_timeout,
+ .eh_abort_handler = spraid_abort_handler,
+ .eh_target_reset_handler = spraid_tgt_reset_handler,
+ .eh_bus_reset_handler = spraid_bus_reset_handler,
+ .eh_host_reset_handler = spraid_shost_reset_handler,
+ .change_queue_depth = scsi_change_queue_depth,
+ .this_id = -1,
+ .shost_attrs = spraid_host_attrs,
+ .sdev_attrs = spraid_dev_attrs,
+ .host_reset = spraid_sysfs_host_reset,
+};
+
+static void spraid_shutdown(struct pci_dev *pdev)
+{
+ struct spraid_dev *hdev = pci_get_drvdata(pdev);
+
+ spraid_remove_io_queues(hdev);
+ spraid_disable_admin_queue(hdev, true);
+}
+
+/* bsg dispatch user command */
+static int spraid_bsg_host_dispatch(struct bsg_job *job)
+{
+ struct Scsi_Host *shost = dev_to_shost(job->dev);
+ struct spraid_dev *hdev = shost_priv(shost);
+ struct request *rq = blk_mq_rq_from_pdu(job);
+ struct spraid_bsg_request *bsg_req = job->request;
+ int ret = 0;
+
+ dev_info(hdev->dev, "[%s] msgcode[%d], msglen[%d], timeout[%d];"
+ " req_nsge[%d], req_len[%d]\n",
+ __func__, bsg_req->msgcode, job->request_len,
+ rq->timeout, job->request_payload.sg_cnt,
+ job->request_payload.payload_len);
+
+ job->reply_len = 0;
+
+ switch (bsg_req->msgcode) {
+ case SPRAID_BSG_ADM:
+ ret = spraid_user_admin_cmd(hdev, job);
+ break;
+ case SPRAID_BSG_IOQ:
+ ret = spraid_user_ioq_cmd(hdev, job);
+ break;
+ default:
+ dev_info(hdev->dev, "[%s] unsupport msgcode[%d]\n",
+ __func__, bsg_req->msgcode);
+ break;
+ }
+
+ if (ret > 0)
+ ret = ret | (ret << 8);
+
+ bsg_job_done(job, ret, 0);
+ return 0;
+}
+
+static inline void spraid_remove_bsg(struct spraid_dev *hdev)
+{
+ if (hdev->bsg_queue) {
+ bsg_unregister_queue(hdev->bsg_queue);
+ blk_cleanup_queue(hdev->bsg_queue);
+ }
+}
+static int spraid_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+{
+ struct spraid_dev *hdev;
+ struct Scsi_Host *shost;
+ int node, ret;
+ char bsg_name[15];
+
+ shost = scsi_host_alloc(&spraid_driver_template, sizeof(*hdev));
+ if (!shost) {
+ dev_err(&pdev->dev, "Failed to allocate scsi host\n");
+ return -ENOMEM;
+ }
+ hdev = shost_priv(shost);
+ hdev->pdev = pdev;
+ hdev->dev = get_device(&pdev->dev);
+
+ node = dev_to_node(hdev->dev);
+ if (node == NUMA_NO_NODE) {
+ node = first_memory_node;
+ set_dev_node(hdev->dev, node);
+ }
+ hdev->numa_node = node;
+ hdev->shost = shost;
+ pci_set_drvdata(pdev, hdev);
+
+ ret = spraid_dev_map(hdev);
+ if (ret)
+ goto put_dev;
+
+ init_rwsem(&hdev->devices_rwsem);
+ INIT_WORK(&hdev->scan_work, spraid_scan_work);
+ INIT_WORK(&hdev->timesyn_work, spraid_timesyn_work);
+ INIT_WORK(&hdev->reset_work, spraid_reset_work);
+ INIT_WORK(&hdev->fw_act_work, spraid_fw_act_work);
+ spin_lock_init(&hdev->state_lock);
+
+ ret = spraid_alloc_resources(hdev);
+ if (ret)
+ goto dev_unmap;
+
+ ret = spraid_pci_enable(hdev);
+ if (ret)
+ goto resources_free;
+
+ ret = spraid_setup_admin_queue(hdev);
+ if (ret)
+ goto pci_disable;
+
+ ret = spraid_init_ctrl_info(hdev);
+ if (ret)
+ goto disable_admin_q;
+
+ ret = spraid_alloc_iod_ext_mem_pool(hdev);
+ if (ret)
+ goto disable_admin_q;
+
+ ret = spraid_setup_io_queues(hdev);
+ if (ret)
+ goto free_iod_mempool;
+
+ spraid_shost_init(hdev);
+
+ ret = scsi_add_host(hdev->shost, hdev->dev);
+ if (ret) {
+ dev_err(hdev->dev, "Add shost to system failed, ret: %d\n",
+ ret);
+ goto remove_io_queues;
+ }
+
+ snprintf(bsg_name, sizeof(bsg_name), "spraid%d", shost->host_no);
+ hdev->bsg_queue = bsg_setup_queue(&shost->shost_gendev, bsg_name,
+ spraid_bsg_host_dispatch,
+ spraid_cmd_size(hdev, true, false));
+ if (IS_ERR(hdev->bsg_queue)) {
+ dev_err(hdev->dev, "err, setup bsg failed\n");
+ hdev->bsg_queue = NULL;
+ goto remove_io_queues;
+ }
+
+ if (hdev->online_queues == SPRAID_ADMIN_QUEUE_NUM) {
+ dev_warn(hdev->dev, "warn only admin queue can be used\n");
+ return 0;
+ }
+
+ hdev->state = SPRAID_LIVE;
+
+ spraid_send_all_aen(hdev);
+
+ ret = spraid_dev_list_init(hdev);
+ if (ret)
+ goto remove_bsg;
+
+ ret = spraid_configure_timestamp(hdev);
+ if (ret)
+ dev_warn(hdev->dev, "init set timestamp failed\n");
+
+ ret = spraid_alloc_ioq_ptcmds(hdev);
+ if (ret)
+ goto remove_bsg;
+
+ scsi_scan_host(hdev->shost);
+
+ return 0;
+
+remove_bsg:
+ spraid_remove_bsg(hdev);
+remove_io_queues:
+ spraid_remove_io_queues(hdev);
+free_iod_mempool:
+ spraid_free_iod_ext_mem_pool(hdev);
+disable_admin_q:
+ spraid_disable_admin_queue(hdev, false);
+pci_disable:
+ spraid_pci_disable(hdev);
+resources_free:
+ spraid_free_resources(hdev);
+dev_unmap:
+ spraid_dev_unmap(hdev);
+put_dev:
+ put_device(hdev->dev);
+ scsi_host_put(shost);
+
+ return -ENODEV;
+}
+
+static void spraid_remove(struct pci_dev *pdev)
+{
+ struct spraid_dev *hdev = pci_get_drvdata(pdev);
+ struct Scsi_Host *shost = hdev->shost;
+
+ dev_info(hdev->dev, "enter spraid remove\n");
+
+ spraid_change_host_state(hdev, SPRAID_DELETING);
+ flush_work(&hdev->reset_work);
+
+ if (!pci_device_is_present(pdev))
+ spraid_back_all_io(hdev);
+
+ spraid_remove_bsg(hdev);
+ scsi_remove_host(shost);
+ spraid_free_ioq_ptcmds(hdev);
+ kfree(hdev->devices);
+ spraid_remove_io_queues(hdev);
+ spraid_free_iod_ext_mem_pool(hdev);
+ spraid_disable_admin_queue(hdev, false);
+ spraid_pci_disable(hdev);
+ spraid_free_resources(hdev);
+ spraid_dev_unmap(hdev);
+ put_device(hdev->dev);
+ scsi_host_put(shost);
+
+ dev_info(hdev->dev, "exit spraid remove\n");
+}
+
+static const struct pci_device_id spraid_id_table[] = {
+ { PCI_DEVICE(PCI_VENDOR_ID_RAMAXEL_LOGIC,
+ SPRAID_SERVER_DEVICE_HBA_DID) },
+ { PCI_DEVICE(PCI_VENDOR_ID_RAMAXEL_LOGIC,
+ SPRAID_SERVER_DEVICE_RAID_DID) },
+ { 0, }
+};
+MODULE_DEVICE_TABLE(pci, spraid_id_table);
+
+static struct pci_driver spraid_driver = {
+ .name = "spraid",
+ .id_table = spraid_id_table,
+ .probe = spraid_probe,
+ .remove = spraid_remove,
+ .shutdown = spraid_shutdown,
+ .err_handler = &spraid_err_handler,
+};
+
+static int __init spraid_init(void)
+{
+ int ret;
+
+ spraid_wq = alloc_workqueue("spraid-wq", WQ_UNBOUND | WQ_MEM_RECLAIM |
+ WQ_SYSFS, 0);
+ if (!spraid_wq)
+ return -ENOMEM;
+
+ spraid_class = class_create(THIS_MODULE, "spraid");
+ if (IS_ERR(spraid_class)) {
+ ret = PTR_ERR(spraid_class);
+ goto destroy_wq;
+ }
+
+ ret = pci_register_driver(&spraid_driver);
+ if (ret < 0)
+ goto destroy_class;
+
+ return 0;
+
+destroy_class:
+ class_destroy(spraid_class);
+destroy_wq:
+ destroy_workqueue(spraid_wq);
+
+ return ret;
+}
+
+static void __exit spraid_exit(void)
+{
+ pci_unregister_driver(&spraid_driver);
+ class_destroy(spraid_class);
+ destroy_workqueue(spraid_wq);
+ ida_destroy(&spraid_instance_ida);
+}
+
+MODULE_AUTHOR("songyl(a)ramaxel.com");
+MODULE_DESCRIPTION("Ramaxel Memory Technology SPraid Driver");
+MODULE_LICENSE("GPL");
+MODULE_VERSION(SPRAID_DRV_VERSION);
+module_init(spraid_init);
+module_exit(spraid_exit);
--
2.27.0

[PATCH openEuler-1.0-LTS 1/2] USB: gadget: detect too-big endpoint 0 requests
by Yang Yingliang 22 Dec '21
From: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
mainline inclusion
from mainline-v5.16-rc5
commit 153a2d7e3350cc89d406ba2d35be8793a64c2038
category: bugfix
bugzilla: NA
CVE: CVE-2021-39685
--------------------------------
Sometimes USB hosts can ask for buffers that are too large from endpoint
0, which should not be allowed. If this happens for OUT requests, stall
the endpoint, but for IN requests, trim the request size to the endpoint
buffer size.
Co-developed-by: Szymon Heidrich <szymon.heidrich(a)gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/usb/gadget/composite.c | 12 ++++++++++++
drivers/usb/gadget/legacy/dbgp.c | 13 +++++++++++++
drivers/usb/gadget/legacy/inode.c | 16 +++++++++++++++-
3 files changed, 40 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/gadget/composite.c b/drivers/usb/gadget/composite.c
index af0702cfa92ff..3813249387b3d 100644
--- a/drivers/usb/gadget/composite.c
+++ b/drivers/usb/gadget/composite.c
@@ -1569,6 +1569,18 @@ composite_setup(struct usb_gadget *gadget, const struct usb_ctrlrequest *ctrl)
struct usb_function *f = NULL;
u8 endp;
+ if (w_length > USB_COMP_EP0_BUFSIZ) {
+ if (ctrl->bRequestType == USB_DIR_OUT) {
+ goto done;
+ } else {
+ /* Cast away the const, we are going to overwrite on purpose. */
+ __le16 *temp = (__le16 *)&ctrl->wLength;
+
+ *temp = cpu_to_le16(USB_COMP_EP0_BUFSIZ);
+ w_length = USB_COMP_EP0_BUFSIZ;
+ }
+ }
+
/* partial re-init of the response message; the function or the
* gadget might need to intercept e.g. a control-OUT completion
* when we delegate to it.
diff --git a/drivers/usb/gadget/legacy/dbgp.c b/drivers/usb/gadget/legacy/dbgp.c
index e1d566c9918ae..e567afcb2794c 100644
--- a/drivers/usb/gadget/legacy/dbgp.c
+++ b/drivers/usb/gadget/legacy/dbgp.c
@@ -345,6 +345,19 @@ static int dbgp_setup(struct usb_gadget *gadget,
void *data = NULL;
u16 len = 0;
+ if (length > DBGP_REQ_LEN) {
+ if (ctrl->bRequestType == USB_DIR_OUT) {
+ return err;
+ } else {
+ /* Cast away the const, we are going to overwrite on purpose. */
+ __le16 *temp = (__le16 *)&ctrl->wLength;
+
+ *temp = cpu_to_le16(DBGP_REQ_LEN);
+ length = DBGP_REQ_LEN;
+ }
+ }
+
+
if (request == USB_REQ_GET_DESCRIPTOR) {
switch (value>>8) {
case USB_DT_DEVICE:
diff --git a/drivers/usb/gadget/legacy/inode.c b/drivers/usb/gadget/legacy/inode.c
index 37ca0e669bd85..83fd1b55d497a 100644
--- a/drivers/usb/gadget/legacy/inode.c
+++ b/drivers/usb/gadget/legacy/inode.c
@@ -109,6 +109,8 @@ enum ep0_state {
/* enough for the whole queue: most events invalidate others */
#define N_EVENT 5
+#define RBUF_SIZE 256
+
struct dev_data {
spinlock_t lock;
refcount_t count;
@@ -143,7 +145,7 @@ struct dev_data {
struct dentry *dentry;
/* except this scratch i/o buffer for ep0 */
- u8 rbuf [256];
+ u8 rbuf[RBUF_SIZE];
};
static inline void get_dev (struct dev_data *data)
@@ -1332,6 +1334,18 @@ gadgetfs_setup (struct usb_gadget *gadget, const struct usb_ctrlrequest *ctrl)
u16 w_value = le16_to_cpu(ctrl->wValue);
u16 w_length = le16_to_cpu(ctrl->wLength);
+ if (w_length > RBUF_SIZE) {
+ if (ctrl->bRequestType == USB_DIR_OUT) {
+ return value;
+ } else {
+ /* Cast away the const, we are going to overwrite on purpose. */
+ __le16 *temp = (__le16 *)&ctrl->wLength;
+
+ *temp = cpu_to_le16(RBUF_SIZE);
+ w_length = RBUF_SIZE;
+ }
+ }
+
spin_lock (&dev->lock);
dev->setup_abort = 0;
if (dev->state == STATE_DEV_UNCONNECTED) {
--
2.25.1

21 Dec '21
Ramaxel inclusion
category: features
bugzilla: https://gitee.com/openeuler/kernel/issues/I4NJIB
CVE: NA
Support Ramaxel's SPRxxx Raid controller
Signed-off-by: Yanling Song <songyl(a)ramaxel.com>
Reviewed-by: Jiang Yu<yujiang(a)ramaxel.com>
---
drivers/scsi/Kconfig | 1 +
drivers/scsi/Makefile | 1 +
drivers/scsi/spraid/Kconfig | 13 +
drivers/scsi/spraid/Makefile | 7 +
drivers/scsi/spraid/spraid.h | 746 ++++++
drivers/scsi/spraid/spraid_main.c | 3875 +++++++++++++++++++++++++++++
6 files changed, 4643 insertions(+)
create mode 100644 drivers/scsi/spraid/Kconfig
create mode 100644 drivers/scsi/spraid/Makefile
create mode 100644 drivers/scsi/spraid/spraid.h
create mode 100644 drivers/scsi/spraid/spraid_main.c
diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig
index 8450484184e3..63d2aaa22834 100644
--- a/drivers/scsi/Kconfig
+++ b/drivers/scsi/Kconfig
@@ -522,6 +522,7 @@ source "drivers/scsi/megaraid/Kconfig.megaraid"
source "drivers/scsi/mpt3sas/Kconfig"
source "drivers/scsi/smartpqi/Kconfig"
source "drivers/scsi/ufs/Kconfig"
+source "drivers/scsi/spraid/Kconfig"
config SCSI_HPTIOP
tristate "HighPoint RocketRAID 3xxx/4xxx Controller support"
diff --git a/drivers/scsi/Makefile b/drivers/scsi/Makefile
index 2973693f6dcc..4056cf26e09e 100644
--- a/drivers/scsi/Makefile
+++ b/drivers/scsi/Makefile
@@ -94,6 +94,7 @@ obj-$(CONFIG_SCSI_ZALON) += zalon7xx.o
obj-$(CONFIG_SCSI_DC395x) += dc395x.o
obj-$(CONFIG_SCSI_AM53C974) += esp_scsi.o am53c974.o
obj-$(CONFIG_CXLFLASH) += cxlflash/
+obj-$(CONFIG_RAMAXEL_SPRAID) += spraid/
obj-$(CONFIG_MEGARAID_LEGACY) += megaraid.o
obj-$(CONFIG_MEGARAID_NEWGEN) += megaraid/
obj-$(CONFIG_MEGARAID_SAS) += megaraid/
diff --git a/drivers/scsi/spraid/Kconfig b/drivers/scsi/spraid/Kconfig
new file mode 100644
index 000000000000..bfbba3db8db0
--- /dev/null
+++ b/drivers/scsi/spraid/Kconfig
@@ -0,0 +1,13 @@
+#
+# Ramaxel driver configuration
+#
+
+config RAMAXEL_SPRAID
+ tristate "Ramaxel spraid Adapter"
+ depends on PCI && SCSI
+ select BLK_DEV_BSGLIB
+ depends on ARM64 || X86_64
+ help
+ This driver supports the Ramaxel SPRxxx series
+ RAID controllers, which have a PCIe Gen4 host
+ interface and support SAS/SATA HDDs and SSDs.
diff --git a/drivers/scsi/spraid/Makefile b/drivers/scsi/spraid/Makefile
new file mode 100644
index 000000000000..aadc2ffd37eb
--- /dev/null
+++ b/drivers/scsi/spraid/Makefile
@@ -0,0 +1,7 @@
+#
+# Makefile for the Ramaxel device drivers.
+#
+
+obj-$(CONFIG_RAMAXEL_SPRAID) += spraid.o
+
+spraid-objs := spraid_main.o
\ No newline at end of file
diff --git a/drivers/scsi/spraid/spraid.h b/drivers/scsi/spraid/spraid.h
new file mode 100644
index 000000000000..c1e4980e18e5
--- /dev/null
+++ b/drivers/scsi/spraid/spraid.h
@@ -0,0 +1,746 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Ramaxel Memory Technology, Ltd */
+
+#ifndef __SPRAID_H_
+#define __SPRAID_H_
+
+#define SPRAID_CAP_MQES(cap) ((cap) & 0xffff)
+#define SPRAID_CAP_STRIDE(cap) (((cap) >> 32) & 0xf)
+#define SPRAID_CAP_MPSMIN(cap) (((cap) >> 48) & 0xf)
+#define SPRAID_CAP_MPSMAX(cap) (((cap) >> 52) & 0xf)
+#define SPRAID_CAP_TIMEOUT(cap) (((cap) >> 24) & 0xff)
+#define SPRAID_CAP_DMAMASK(cap) (((cap) >> 37) & 0xff)
+
+#define SPRAID_DEFAULT_MAX_CHANNEL 4
+#define SPRAID_DEFAULT_MAX_ID 240
+#define SPRAID_DEFAULT_MAX_LUN_PER_HOST 8
+#define MAX_SECTORS 2048
+
+#define IO_SQE_SIZE sizeof(struct spraid_ioq_command)
+#define ADMIN_SQE_SIZE sizeof(struct spraid_admin_command)
+#define SQE_SIZE(qid) (((qid) > 0) ? IO_SQE_SIZE : ADMIN_SQE_SIZE)
+#define CQ_SIZE(depth) ((depth) * sizeof(struct spraid_completion))
+#define SQ_SIZE(qid, depth) ((depth) * SQE_SIZE(qid))
+
+#define SENSE_SIZE(depth) ((depth) * SCSI_SENSE_BUFFERSIZE)
+
+#define SPRAID_AQ_DEPTH 128
+#define SPRAID_NR_AEN_COMMANDS 16
+#define SPRAID_AQ_BLK_MQ_DEPTH (SPRAID_AQ_DEPTH - SPRAID_NR_AEN_COMMANDS)
+#define SPRAID_AQ_MQ_TAG_DEPTH (SPRAID_AQ_BLK_MQ_DEPTH - 1)
+
+#define SPRAID_ADMIN_QUEUE_NUM 1
+#define SPRAID_PTCMDS_PERQ 1
+#define SPRAID_IO_BLK_MQ_DEPTH (hdev->shost->can_queue)
+#define SPRAID_NR_IOQ_PTCMDS (SPRAID_PTCMDS_PERQ * hdev->shost->nr_hw_queues)
+
+#define FUA_MASK 0x08
+#define SPRAID_MINORS BIT(MINORBITS)
+
+#define COMMAND_IS_WRITE(cmd) ((cmd)->common.opcode & 1)
+
+#define SPRAID_IO_IOSQES 7
+#define SPRAID_IO_IOCQES 4
+#define PRP_ENTRY_SIZE 8
+
+#define SMALL_POOL_SIZE 256
+#define MAX_SMALL_POOL_NUM 16
+#define MAX_CMD_PER_DEV 64
+#define MAX_CDB_LEN 32
+
+#define SPRAID_UP_TO_MULTY4(x) (((x) + 4) & (~0x03))
+
+#define CQE_STATUS_SUCCESS (0x0)
+
+#define PCI_VENDOR_ID_RAMAXEL_LOGIC 0x1E81
+
+#define SPRAID_SERVER_DEVICE_HBA_DID 0x2100
+#define SPRAID_SERVER_DEVICE_RAID_DID 0x2200
+
+#define IO_6_DEFAULT_TX_LEN 256
+
+#define SPRAID_INT_PAGES 2
+#define SPRAID_INT_BYTES(hdev) (SPRAID_INT_PAGES * (hdev)->page_size)
+
+enum {
+ SPRAID_REQ_CANCELLED = (1 << 0),
+ SPRAID_REQ_USERCMD = (1 << 1),
+};
+
+enum {
+ SPRAID_SC_SUCCESS = 0x0,
+ SPRAID_SC_INVALID_OPCODE = 0x1,
+ SPRAID_SC_INVALID_FIELD = 0x2,
+
+ SPRAID_SC_ABORT_LIMIT = 0x103,
+ SPRAID_SC_ABORT_MISSING = 0x104,
+ SPRAID_SC_ASYNC_LIMIT = 0x105,
+
+ SPRAID_SC_DNR = 0x4000,
+};
+
+enum {
+ SPRAID_REG_CAP = 0x0000,
+ SPRAID_REG_CC = 0x0014,
+ SPRAID_REG_CSTS = 0x001c,
+ SPRAID_REG_AQA = 0x0024,
+ SPRAID_REG_ASQ = 0x0028,
+ SPRAID_REG_ACQ = 0x0030,
+ SPRAID_REG_DBS = 0x1000,
+};
+
+enum {
+ SPRAID_CC_ENABLE = 1 << 0,
+ SPRAID_CC_CSS_NVM = 0 << 4,
+ SPRAID_CC_MPS_SHIFT = 7,
+ SPRAID_CC_AMS_SHIFT = 11,
+ SPRAID_CC_SHN_SHIFT = 14,
+ SPRAID_CC_IOSQES_SHIFT = 16,
+ SPRAID_CC_IOCQES_SHIFT = 20,
+ SPRAID_CC_AMS_RR = 0 << SPRAID_CC_AMS_SHIFT,
+ SPRAID_CC_SHN_NONE = 0 << SPRAID_CC_SHN_SHIFT,
+ SPRAID_CC_IOSQES = SPRAID_IO_IOSQES << SPRAID_CC_IOSQES_SHIFT,
+ SPRAID_CC_IOCQES = SPRAID_IO_IOCQES << SPRAID_CC_IOCQES_SHIFT,
+ SPRAID_CC_SHN_NORMAL = 1 << SPRAID_CC_SHN_SHIFT,
+ SPRAID_CC_SHN_MASK = 3 << SPRAID_CC_SHN_SHIFT,
+ SPRAID_CSTS_CFS_SHIFT = 1,
+ SPRAID_CSTS_SHST_SHIFT = 2,
+ SPRAID_CSTS_PP_SHIFT = 5,
+ SPRAID_CSTS_RDY = 1 << 0,
+ SPRAID_CSTS_SHST_CMPLT = 2 << 2,
+ SPRAID_CSTS_SHST_MASK = 3 << 2,
+ SPRAID_CSTS_CFS_MASK = 1 << SPRAID_CSTS_CFS_SHIFT,
+ SPRAID_CSTS_PP_MASK = 1 << SPRAID_CSTS_PP_SHIFT,
+};
+
+enum {
+ SPRAID_ADMIN_DELETE_SQ = 0x00,
+ SPRAID_ADMIN_CREATE_SQ = 0x01,
+ SPRAID_ADMIN_DELETE_CQ = 0x04,
+ SPRAID_ADMIN_CREATE_CQ = 0x05,
+ SPRAID_ADMIN_ABORT_CMD = 0x08,
+ SPRAID_ADMIN_SET_FEATURES = 0x09,
+ SPRAID_ADMIN_ASYNC_EVENT = 0x0c,
+ SPRAID_ADMIN_GET_INFO = 0xc6,
+ SPRAID_ADMIN_RESET = 0xc8,
+};
+
+enum {
+ SPRAID_GET_INFO_CTRL = 0,
+ SPRAID_GET_INFO_DEV_LIST = 1,
+};
+
+enum {
+ SPRAID_RESET_TARGET = 0,
+ SPRAID_RESET_BUS = 1,
+};
+
+enum {
+ SPRAID_AEN_ERROR = 0,
+ SPRAID_AEN_NOTICE = 2,
+ SPRAID_AEN_VS = 7,
+};
+
+enum {
+ SPRAID_AEN_DEV_CHANGED = 0x00,
+ SPRAID_AEN_FW_ACT_START = 0x01,
+ SPRAID_AEN_HOST_PROBING = 0x10,
+};
+
+enum {
+ SPRAID_AEN_TIMESYN = 0x00,
+ SPRAID_AEN_FW_ACT_FINISH = 0x02,
+ SPRAID_AEN_EVENT_MIN = 0x80,
+ SPRAID_AEN_EVENT_MAX = 0xff,
+};
+
+enum {
+ SPRAID_CMD_WRITE = 0x01,
+ SPRAID_CMD_READ = 0x02,
+
+ SPRAID_CMD_NONIO_NONE = 0x80,
+ SPRAID_CMD_NONIO_TODEV = 0x81,
+ SPRAID_CMD_NONIO_FROMDEV = 0x82,
+};
+
+enum {
+ SPRAID_QUEUE_PHYS_CONTIG = (1 << 0),
+ SPRAID_CQ_IRQ_ENABLED = (1 << 1),
+
+ SPRAID_FEAT_NUM_QUEUES = 0x07,
+ SPRAID_FEAT_ASYNC_EVENT = 0x0b,
+ SPRAID_FEAT_TIMESTAMP = 0x0e,
+};
+
+enum spraid_state {
+ SPRAID_NEW,
+ SPRAID_LIVE,
+ SPRAID_RESETTING,
+ SPRAID_DELETING,
+ SPRAID_DEAD,
+};
+
+enum {
+ SPRAID_CARD_HBA,
+ SPRAID_CARD_RAID,
+};
+
+enum spraid_cmd_type {
+ SPRAID_CMD_ADM,
+ SPRAID_CMD_IOPT,
+};
+
+struct spraid_completion {
+ __le32 result;
+ union {
+ struct {
+ __u8 sense_len;
+ __u8 resv[3];
+ };
+ __le32 result1;
+ };
+ __le16 sq_head;
+ __le16 sq_id;
+ __u16 cmd_id;
+ __le16 status;
+};
+
+struct spraid_ctrl_info {
+ __le32 nd;
+ __le16 max_cmds;
+ __le16 max_channel;
+ __le32 max_tgt_id;
+ __le16 max_lun;
+ __le16 max_num_sge;
+ __le16 lun_num_in_boot;
+ __u8 mdts;
+ __u8 acl;
+ __u8 aerl;
+ __u8 card_type;
+ __u16 rsvd;
+ __u32 rtd3e;
+ __u8 sn[32];
+ __u8 fr[16];
+ __u8 rsvd1[4020];
+};
+
+struct spraid_dev {
+ struct pci_dev *pdev;
+ struct device *dev;
+ struct Scsi_Host *shost;
+ struct spraid_queue *queues;
+ struct dma_pool *prp_page_pool;
+ struct dma_pool *prp_small_pool[MAX_SMALL_POOL_NUM];
+ mempool_t *iod_mempool;
+ void __iomem *bar;
+ u32 max_qid;
+ u32 num_vecs;
+ u32 queue_count;
+ u32 ioq_depth;
+ int db_stride;
+ u32 __iomem *dbs;
+ struct rw_semaphore devices_rwsem;
+ int numa_node;
+ u32 page_size;
+ u32 ctrl_config;
+ u32 online_queues;
+ u64 cap;
+ int instance;
+ struct spraid_ctrl_info *ctrl_info;
+ struct spraid_dev_info *devices;
+
+ struct spraid_cmd *adm_cmds;
+ struct list_head adm_cmd_list;
+ spinlock_t adm_cmd_lock;
+
+ struct spraid_cmd *ioq_ptcmds;
+ struct list_head ioq_pt_list;
+ spinlock_t ioq_pt_lock;
+
+ struct work_struct scan_work;
+ struct work_struct timesyn_work;
+ struct work_struct reset_work;
+ struct work_struct fw_act_work;
+
+ enum spraid_state state;
+ spinlock_t state_lock;
+
+ struct request_queue *bsg_queue;
+};
+
+struct spraid_sgl_desc {
+ __le64 addr;
+ __le32 length;
+ __u8 rsvd[3];
+ __u8 type;
+};
+
+union spraid_data_ptr {
+ struct {
+ __le64 prp1;
+ __le64 prp2;
+ };
+ struct spraid_sgl_desc sgl;
+};
+
+struct spraid_admin_common_command {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ __le32 cdw2[4];
+ union spraid_data_ptr dptr;
+ __le32 cdw10;
+ __le32 cdw11;
+ __le32 cdw12;
+ __le32 cdw13;
+ __le32 cdw14;
+ __le32 cdw15;
+};
+
+struct spraid_features {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ __u64 rsvd2[2];
+ union spraid_data_ptr dptr;
+ __le32 fid;
+ __le32 dword11;
+ __le32 dword12;
+ __le32 dword13;
+ __le32 dword14;
+ __le32 dword15;
+};
+
+struct spraid_create_cq {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __u32 rsvd1[5];
+ __le64 prp1;
+ __u64 rsvd8;
+ __le16 cqid;
+ __le16 qsize;
+ __le16 cq_flags;
+ __le16 irq_vector;
+ __u32 rsvd12[4];
+};
+
+struct spraid_create_sq {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __u32 rsvd1[5];
+ __le64 prp1;
+ __u64 rsvd8;
+ __le16 sqid;
+ __le16 qsize;
+ __le16 sq_flags;
+ __le16 cqid;
+ __u32 rsvd12[4];
+};
+
+struct spraid_delete_queue {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __u32 rsvd1[9];
+ __le16 qid;
+ __u16 rsvd10;
+ __u32 rsvd11[5];
+};
+
+struct spraid_get_info {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ __u32 rsvd2[4];
+ union spraid_data_ptr dptr;
+ __u8 type;
+ __u8 rsvd10[3];
+ __le32 cdw11;
+ __u32 rsvd12[4];
+};
+
+struct spraid_usr_cmd {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ union {
+ struct {
+ __le16 subopcode;
+ __le16 rsvd1;
+ } info_0;
+ __le32 cdw2;
+ };
+ union {
+ struct {
+ __le16 data_len;
+ __le16 param_len;
+ } info_1;
+ __le32 cdw3;
+ };
+ __u64 metadata;
+ union spraid_data_ptr dptr;
+ __le32 cdw10;
+ __le32 cdw11;
+ __le32 cdw12;
+ __le32 cdw13;
+ __le32 cdw14;
+ __le32 cdw15;
+};
+
+enum {
+ SPRAID_CMD_FLAG_SGL_METABUF = (1 << 6),
+ SPRAID_CMD_FLAG_SGL_METASEG = (1 << 7),
+ SPRAID_CMD_FLAG_SGL_ALL = SPRAID_CMD_FLAG_SGL_METABUF |
+ SPRAID_CMD_FLAG_SGL_METASEG,
+};
+
+enum spraid_cmd_state {
+ SPRAID_CMD_IDLE = 0,
+ SPRAID_CMD_IN_FLIGHT = 1,
+ SPRAID_CMD_COMPLETE = 2,
+ SPRAID_CMD_TIMEOUT = 3,
+ SPRAID_CMD_TMO_COMPLETE = 4,
+};
+
+struct spraid_abort_cmd {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ __u64 rsvd2[4];
+ __le16 sqid;
+ __le16 cid;
+ __u32 rsvd11[5];
+};
+
+struct spraid_reset_cmd {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ __u64 rsvd2[4];
+ __u8 type;
+ __u8 rsvd10[3];
+ __u32 rsvd11[5];
+};
+
+struct spraid_admin_command {
+ union {
+ struct spraid_admin_common_command common;
+ struct spraid_features features;
+ struct spraid_create_cq create_cq;
+ struct spraid_create_sq create_sq;
+ struct spraid_delete_queue delete_queue;
+ struct spraid_get_info get_info;
+ struct spraid_abort_cmd abort;
+ struct spraid_reset_cmd reset;
+ struct spraid_usr_cmd usr_cmd;
+ };
+};
+
+struct spraid_ioq_common_command {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ __le16 sense_len;
+ __u8 cdb_len;
+ __u8 rsvd2;
+ __le32 cdw3[3];
+ union spraid_data_ptr dptr;
+ __le32 cdw10[6];
+ __u8 cdb[32];
+ __le64 sense_addr;
+ __le32 cdw26[6];
+};
+
+struct spraid_rw_command {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ __le16 sense_len;
+ __u8 cdb_len;
+ __u8 rsvd2;
+ __u32 rsvd3[3];
+ union spraid_data_ptr dptr;
+ __le64 slba;
+ __le16 nlb;
+ __le16 control;
+ __u32 rsvd13[3];
+ __u8 cdb[32];
+ __le64 sense_addr;
+ __u32 rsvd26[6];
+};
+
+struct spraid_scsi_nonio {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ __le16 sense_len;
+ __u8 cdb_length;
+ __u8 rsvd2;
+ __u32 rsvd3[3];
+ union spraid_data_ptr dptr;
+ __u32 rsvd10[5];
+ __le32 buffer_len;
+ __u8 cdb[32];
+ __le64 sense_addr;
+ __u32 rsvd26[6];
+};
+
+struct spraid_ioq_command {
+ union {
+ struct spraid_ioq_common_command common;
+ struct spraid_rw_command rw;
+ struct spraid_scsi_nonio scsi_nonio;
+ };
+};
+
+struct spraid_passthru_common_cmd {
+ __u8 opcode;
+ __u8 flags;
+ __u16 rsvd0;
+ __u32 nsid;
+ union {
+ struct {
+ __u16 subopcode;
+ __u16 rsvd1;
+ } info_0;
+ __u32 cdw2;
+ };
+ union {
+ struct {
+ __u16 data_len;
+ __u16 param_len;
+ } info_1;
+ __u32 cdw3;
+ };
+ __u64 metadata;
+
+ __u64 addr;
+ __u64 prp2;
+
+ __u32 cdw10;
+ __u32 cdw11;
+ __u32 cdw12;
+ __u32 cdw13;
+ __u32 cdw14;
+ __u32 cdw15;
+ __u32 timeout_ms;
+ __u32 result0;
+ __u32 result1;
+};
+
+struct spraid_ioq_passthru_cmd {
+ __u8 opcode;
+ __u8 flags;
+ __u16 rsvd0;
+ __u32 nsid;
+ union {
+ struct {
+ __u16 res_sense_len;
+ __u8 cdb_len;
+ __u8 rsvd0;
+ } info_0;
+ __u32 cdw2;
+ };
+ union {
+ struct {
+ __u16 subopcode;
+ __u16 rsvd1;
+ } info_1;
+ __u32 cdw3;
+ };
+ union {
+ struct {
+ __u16 rsvd;
+ __u16 param_len;
+ } info_2;
+ __u32 cdw4;
+ };
+ __u32 cdw5;
+ __u64 addr;
+ __u64 prp2;
+ union {
+ struct {
+ __u16 eid;
+ __u16 sid;
+ } info_3;
+ __u32 cdw10;
+ };
+ union {
+ struct {
+ __u16 did;
+ __u8 did_flag;
+ __u8 rsvd2;
+ } info_4;
+ __u32 cdw11;
+ };
+ __u32 cdw12;
+ __u32 cdw13;
+ __u32 cdw14;
+ __u32 data_len;
+ __u32 cdw16;
+ __u32 cdw17;
+ __u32 cdw18;
+ __u32 cdw19;
+ __u32 cdw20;
+ __u32 cdw21;
+ __u32 cdw22;
+ __u32 cdw23;
+ __u64 sense_addr;
+ __u32 cdw26[4];
+ __u32 timeout_ms;
+ __u32 result0;
+ __u32 result1;
+};
+
+struct spraid_bsg_request {
+ u32 msgcode;
+ u32 control;
+ union {
+ struct spraid_passthru_common_cmd admcmd;
+ struct spraid_ioq_passthru_cmd ioqcmd;
+ };
+};
+
+enum {
+ SPRAID_BSG_ADM,
+ SPRAID_BSG_IOQ,
+};
+
+struct spraid_cmd {
+ int qid;
+ int cid;
+ u32 result0;
+ u32 result1;
+ u16 status;
+ void *priv;
+ enum spraid_cmd_state state;
+ struct completion cmd_done;
+ struct list_head list;
+};
+
+struct spraid_queue {
+ struct spraid_dev *hdev;
+ spinlock_t sq_lock;
+
+ spinlock_t cq_lock ____cacheline_aligned_in_smp;
+
+ void *sq_cmds;
+
+ struct spraid_completion *cqes;
+
+ dma_addr_t sq_dma_addr;
+ dma_addr_t cq_dma_addr;
+ u32 __iomem *q_db;
+ u8 cq_phase;
+ u8 sqes;
+ u16 qid;
+ u16 sq_tail;
+ u16 cq_head;
+ u16 last_cq_head;
+ u16 q_depth;
+ s16 cq_vector;
+ void *sense;
+ dma_addr_t sense_dma_addr;
+ struct dma_pool *prp_small_pool;
+};
+
+struct spraid_iod {
+ struct spraid_queue *spraidq;
+ enum spraid_cmd_state state;
+ int npages;
+ u32 nsge;
+ u32 length;
+ bool use_sgl;
+ bool sg_drv_mgmt;
+ dma_addr_t first_dma;
+ void *sense;
+ dma_addr_t sense_dma;
+ struct scatterlist *sg;
+ struct scatterlist inline_sg[0];
+};
+
+#define SPRAID_DEV_INFO_ATTR_BOOT(attr) ((attr) & 0x01)
+#define SPRAID_DEV_INFO_ATTR_VD(attr) (((attr) & 0x02) == 0x0)
+#define SPRAID_DEV_INFO_ATTR_PT(attr) (((attr) & 0x22) == 0x02)
+#define SPRAID_DEV_INFO_ATTR_RAWDISK(attr) ((attr) & 0x20)
+
+#define SPRAID_DEV_INFO_FLAG_VALID(flag) ((flag) & 0x01)
+#define SPRAID_DEV_INFO_FLAG_CHANGE(flag) ((flag) & 0x02)
+
+#define BGTASK_TYPE_REBUILD 4
+#define USR_CMD_READ 0xc2
+#define USR_CMD_RDLEN 0x1000
+#define USR_CMD_VDINFO 0x704
+#define USR_CMD_BGTASK 0x504
+#define VDINFO_PARAM_LEN 0x04
+
+struct spraid_vd_info {
+ __u8 name[32];
+ __le16 id;
+ __u8 rg_id;
+ __u8 rg_level;
+ __u8 sg_num;
+ __u8 sg_disk_num;
+ __u8 vd_status;
+ __u8 vd_type;
+ __u8 rsvd1[4056];
+};
+
+#define MAX_REALTIME_BGTASK_NUM 32
+
+struct bgtask_info {
+ __u8 type;
+ __u8 progress;
+ __u8 rate;
+ __u8 rsvd0;
+ __le16 vd_id;
+ __le16 time_left;
+ __u8 rsvd1[4];
+};
+
+struct spraid_bgtask {
+ __u8 sw;
+ __u8 task_num;
+ __u8 rsvd[6];
+ struct bgtask_info bgtask[MAX_REALTIME_BGTASK_NUM];
+};
+
+struct spraid_dev_info {
+ __le32 hdid;
+ __le16 target;
+ __u8 channel;
+ __u8 lun;
+ __u8 attr;
+ __u8 flag;
+ __le16 max_io_kb;
+};
+
+#define MAX_DEV_ENTRY_PER_PAGE_4K 340
+struct spraid_dev_list {
+ __le32 dev_num;
+ __u32 rsvd0[3];
+ struct spraid_dev_info devices[MAX_DEV_ENTRY_PER_PAGE_4K];
+};
+
+struct spraid_sdev_hostdata {
+ u32 hdid;
+ u16 max_io_kb;
+ u8 attr;
+ u8 flag;
+ u8 rg_id;
+ u8 rsvd[3];
+};
+
+#endif
+
diff --git a/drivers/scsi/spraid/spraid_main.c b/drivers/scsi/spraid/spraid_main.c
new file mode 100644
index 000000000000..519b39f44e91
--- /dev/null
+++ b/drivers/scsi/spraid/spraid_main.c
@@ -0,0 +1,3875 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Ramaxel Memory Technology, Ltd */
+
+/* Ramaxel Raid SPXXX Series Linux Driver */
+
+#define pr_fmt(fmt) "spraid: " fmt
+
+#include <linux/sched/signal.h>
+#include <linux/version.h>
+#include <linux/pci.h>
+#include <linux/aer.h>
+#include <linux/module.h>
+#include <linux/ioport.h>
+#include <linux/device.h>
+#include <linux/delay.h>
+#include <linux/interrupt.h>
+#include <linux/cdev.h>
+#include <linux/sysfs.h>
+#include <linux/gfp.h>
+#include <linux/types.h>
+#include <linux/ratelimit.h>
+#include <linux/once.h>
+#include <linux/debugfs.h>
+#include <linux/io-64-nonatomic-lo-hi.h>
+#include <linux/blkdev.h>
+#include <linux/bsg-lib.h>
+#include <asm/unaligned.h>
+#include <linux/sort.h>
+#include <target/target_core_backend.h>
+
+#include <scsi/scsi.h>
+#include <scsi/scsi_cmnd.h>
+#include <scsi/scsi_device.h>
+#include <scsi/scsi_host.h>
+#include <scsi/scsi_transport.h>
+#include <scsi/scsi_dbg.h>
+
+
+#include "spraid.h"
+
+static u32 admin_tmout = 60;
+module_param(admin_tmout, uint, 0644);
+MODULE_PARM_DESC(admin_tmout, "admin commands timeout (seconds)");
+
+static u32 scmd_tmout_nonpt = 180;
+module_param(scmd_tmout_nonpt, uint, 0644);
+MODULE_PARM_DESC(scmd_tmout_nonpt,
+ "scsi commands timeout for rawdisk&raid(seconds)");
+
+static int ioq_depth_set(const char *val, const struct kernel_param *kp);
+static const struct kernel_param_ops ioq_depth_ops = {
+ .set = ioq_depth_set,
+ .get = param_get_uint,
+};
+
+static u32 io_queue_depth = 1024;
+module_param_cb(io_queue_depth, &ioq_depth_ops, &io_queue_depth, 0644);
MODULE_PARM_DESC(io_queue_depth, "set io queue depth, must be >= 2");
+
+static int log_debug_switch_set(const char *val, const struct kernel_param *kp)
+{
+ u8 n = 0;
+ int ret;
+
+ ret = kstrtou8(val, 10, &n);
+ if (ret != 0)
+ return -EINVAL;
+
+ return param_set_byte(val, kp);
+}
+
+static const struct kernel_param_ops log_debug_switch_ops = {
+ .set = log_debug_switch_set,
+ .get = param_get_byte,
+};
+
+static unsigned char log_debug_switch;
+module_param_cb(log_debug_switch, &log_debug_switch_ops,
+ &log_debug_switch, 0644);
+MODULE_PARM_DESC(log_debug_switch,
+ "set log state, default non-zero for switch on");
+
+static int small_pool_num_set(const char *val, const struct kernel_param *kp)
+{
+ u8 n = 0;
+ int ret;
+
+ ret = kstrtou8(val, 10, &n);
+ if (ret != 0)
+ return -EINVAL;
+ if (n > MAX_SMALL_POOL_NUM)
+ n = MAX_SMALL_POOL_NUM;
+ if (n < 1)
+ n = 1;
+ *((u8 *)kp->arg) = n;
+
+ return 0;
+}
+
+static const struct kernel_param_ops small_pool_num_ops = {
+ .set = small_pool_num_set,
+ .get = param_get_byte,
+};
+
+/* It was found that the spinlock of a single pool is heavily
+ * contended by multiple CPUs, so multiple pools are introduced
+ * to reduce the contention.
+ */
+static unsigned char small_pool_num = 4;
+module_param_cb(small_pool_num, &small_pool_num_ops, &small_pool_num, 0644);
+MODULE_PARM_DESC(small_pool_num, "set prp small pool num, default 4, MAX 16");
+
+static void spraid_free_queue(struct spraid_queue *spraidq);
+static void spraid_handle_aen_notice(struct spraid_dev *hdev, u32 result);
+static void spraid_handle_aen_vs(struct spraid_dev *hdev,
+ u32 result, u32 result1);
+
+static DEFINE_IDA(spraid_instance_ida);
+
+static struct class *spraid_class;
+
+#define SPRAID_CAP_TIMEOUT_UNIT_MS (HZ / 2)
+
+static struct workqueue_struct *spraid_wq;
+
+#define dev_log_dbg(dev, fmt, ...) do { \
+ if (unlikely(log_debug_switch)) \
+ dev_info(dev, "[%s] [%d] " fmt, \
+ __func__, __LINE__, ##__VA_ARGS__); \
+} while (0)
+
+#define SPRAID_DRV_VERSION "1.0.0.0"
+
+#define ADMIN_TIMEOUT (admin_tmout * HZ)
+#define ADMIN_ERR_TIMEOUT 32757
+
+#define SPRAID_WAIT_ABNL_CMD_TIMEOUT (3 * 2)
+
+#define SPRAID_DMA_MSK_BIT_MAX 64
+
+enum FW_STAT_CODE {
+ FW_STAT_OK = 0,
+ FW_STAT_NEED_CHECK,
+ FW_STAT_ERROR,
+ FW_STAT_EP_PCIE_ERROR,
+ FW_STAT_NAC_DMA_ERROR,
+ FW_STAT_ABORTED,
+ FW_STAT_NEED_RETRY
+};
+
+static const char * const raid_levels[] = {"0", "1", "5", "6", "10", "50", "60",
+ "NA"};
+
+static const char * const raid_states[] = {
+ "NA", "NORMAL", "FAULT", "DEGRADE", "NOT_FORMATTED",
+ "FORMATTING", "SANITIZING", "INITIALIZING", "INITIALIZE_FAIL",
+ "DELETING", "DELETE_FAIL", "WRITE_PROTECT"
+};
+
+static int ioq_depth_set(const char *val, const struct kernel_param *kp)
+{
+ int n = 0;
+ int ret;
+
+ ret = kstrtoint(val, 10, &n);
+ if (ret != 0 || n < 2)
+ return -EINVAL;
+
+ return param_set_int(val, kp);
+}
+
+static int spraid_remap_bar(struct spraid_dev *hdev, u32 size)
+{
+ struct pci_dev *pdev = hdev->pdev;
+
+ if (size > pci_resource_len(pdev, 0)) {
+ dev_err(hdev->dev, "Input size[%u] exceed bar0 length[%llu]\n",
+ size, pci_resource_len(pdev, 0));
+ return -ENOMEM;
+ }
+
+ if (hdev->bar)
+ iounmap(hdev->bar);
+
+ hdev->bar = ioremap(pci_resource_start(pdev, 0), size);
+ if (!hdev->bar) {
+ dev_err(hdev->dev, "ioremap for bar0 failed\n");
+ return -ENOMEM;
+ }
+ hdev->dbs = hdev->bar + SPRAID_REG_DBS;
+
+ return 0;
+}
+
+static int spraid_dev_map(struct spraid_dev *hdev)
+{
+ struct pci_dev *pdev = hdev->pdev;
+ int ret;
+
+ ret = pci_request_mem_regions(pdev, "spraid");
+ if (ret) {
+ dev_err(hdev->dev, "fail to request memory regions\n");
+ return ret;
+ }
+
+ ret = spraid_remap_bar(hdev, SPRAID_REG_DBS + 4096);
+ if (ret) {
+ pci_release_mem_regions(pdev);
+ return ret;
+ }
+
+ return 0;
+}
+
+static void spraid_dev_unmap(struct spraid_dev *hdev)
+{
+ struct pci_dev *pdev = hdev->pdev;
+
+ if (hdev->bar) {
+ iounmap(hdev->bar);
+ hdev->bar = NULL;
+ }
+ pci_release_mem_regions(pdev);
+}
+
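+/*
+ * Enable the PCI device and read the CAP register to size the IO
+ * queue depth, doorbell stride and DMA mask; a single IRQ vector is
+ * allocated here so the admin queue can be brought up first.
+ */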
+static int spraid_pci_enable(struct spraid_dev *hdev)
+{
+ struct pci_dev *pdev = hdev->pdev;
+ int ret = -ENOMEM;
+ u64 maskbit = SPRAID_DMA_MSK_BIT_MAX;
+
+ if (pci_enable_device_mem(pdev)) {
+ dev_err(hdev->dev,
+ "Enable pci device memory resources failed\n");
+ return ret;
+ }
+ pci_set_master(pdev);
+
+ if (readl(hdev->bar + SPRAID_REG_CSTS) == U32_MAX) {
+ ret = -ENODEV;
+ dev_err(hdev->dev, "Read csts register failed\n");
+ goto disable;
+ }
+
+ ret = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_ALL_TYPES);
+ if (ret < 0) {
+ dev_err(hdev->dev,
+ "Allocate one IRQ for setup admin channel failed\n");
+ goto disable;
+ }
+
+ hdev->cap = lo_hi_readq(hdev->bar + SPRAID_REG_CAP);
+ hdev->ioq_depth = min_t(u32, SPRAID_CAP_MQES(hdev->cap) + 1,
+ io_queue_depth);
+ hdev->db_stride = 1 << SPRAID_CAP_STRIDE(hdev->cap);
+
+ maskbit = SPRAID_CAP_DMAMASK(hdev->cap);
+ if (maskbit < 32 || maskbit > SPRAID_DMA_MSK_BIT_MAX) {
+ dev_err(hdev->dev,
+ "err, dma mask invalid[%llu], set to default\n",
+ maskbit);
+ maskbit = SPRAID_DMA_MSK_BIT_MAX;
+ }
+ if (dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(maskbit))) {
+ dev_err(hdev->dev, "set dma mask and coherent failed\n");
+ goto disable;
+ }
+
+ dev_info(hdev->dev, "set dma mask[%llu] success\n", maskbit);
+
+ pci_enable_pcie_error_reporting(pdev);
+ pci_save_state(pdev);
+
+ return 0;
+
+disable:
+ pci_disable_device(pdev);
+ return ret;
+}
+
+static int spraid_npages_prp(u32 size, struct spraid_dev *hdev)
+{
+ u32 nprps = DIV_ROUND_UP(size + hdev->page_size, hdev->page_size);
+
+ return DIV_ROUND_UP(PRP_ENTRY_SIZE * nprps, PAGE_SIZE - PRP_ENTRY_SIZE);
+}
+
+static int spraid_npages_sgl(u32 nseg)
+{
+ return DIV_ROUND_UP(nseg * sizeof(struct spraid_sgl_desc), PAGE_SIZE);
+}
+
+static void **spraid_iod_list(struct spraid_iod *iod)
+{
+ return (void **)(iod->inline_sg + (iod->sg_drv_mgmt ? iod->nsge : 0));
+}
+
+static u32 spraid_iod_ext_size(struct spraid_dev *hdev, u32 size, u32 nsge,
+ bool sg_drv_mgmt, bool use_sgl)
+{
+ size_t alloc_size, sg_size;
+
+ if (use_sgl)
+ alloc_size = sizeof(__le64 *) * spraid_npages_sgl(nsge);
+ else
+ alloc_size = sizeof(__le64 *) * spraid_npages_prp(size, hdev);
+
+ sg_size = sg_drv_mgmt ? (sizeof(struct scatterlist) * nsge) : 0;
+ return sg_size + alloc_size;
+}
+
+static u32 spraid_cmd_size(struct spraid_dev *hdev,
+ bool sg_drv_mgmt, bool use_sgl)
+{
+ u32 alloc_size = spraid_iod_ext_size(hdev, SPRAID_INT_BYTES(hdev),
+ SPRAID_INT_PAGES, sg_drv_mgmt, use_sgl);
+
+ dev_info(hdev->dev, "sg_drv_mgmt: %s, use_sgl: %s, iod size: %lu;"
+ " alloc_size: %u\n", sg_drv_mgmt ? "true" : "false",
+ use_sgl ? "true" : "false",
+ sizeof(struct spraid_iod), alloc_size);
+
+ return sizeof(struct spraid_iod) + alloc_size;
+}
+
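+/*
+ * Build the PRP list for a mapped scatterlist. The caller places the
+ * first page fragment in prp1; if the rest of the payload spans more
+ * than one extra page, PRP pages are allocated from a dma pool and
+ * chained through their last entry, with prp2 pointing at the first.
+ */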
+static int spraid_setup_prps(struct spraid_dev *hdev, struct spraid_iod *iod)
+{
+ struct scatterlist *sg = iod->sg;
+ u64 dma_addr = sg_dma_address(sg);
+ int dma_len = sg_dma_len(sg);
+ __le64 *prp_list, *old_prp_list;
+ u32 page_size = hdev->page_size;
+ int offset = dma_addr & (page_size - 1);
+ void **list = spraid_iod_list(iod);
+ int length = iod->length;
+ struct dma_pool *pool;
+ dma_addr_t prp_dma;
+ int nprps, i;
+
+ length -= (page_size - offset);
+ if (length <= 0) {
+ iod->first_dma = 0;
+ return 0;
+ }
+
+ dma_len -= (page_size - offset);
+ if (dma_len) {
+ dma_addr += (page_size - offset);
+ } else {
+ sg = sg_next(sg);
+ dma_addr = sg_dma_address(sg);
+ dma_len = sg_dma_len(sg);
+ }
+
+ if (length <= page_size) {
+ iod->first_dma = dma_addr;
+ return 0;
+ }
+
+ nprps = DIV_ROUND_UP(length, page_size);
+ if (nprps <= (SMALL_POOL_SIZE / PRP_ENTRY_SIZE)) {
+ pool = iod->spraidq->prp_small_pool;
+ iod->npages = 0;
+ } else {
+ pool = hdev->prp_page_pool;
+ iod->npages = 1;
+ }
+
+ prp_list = dma_pool_alloc(pool, GFP_ATOMIC, &prp_dma);
+ if (!prp_list) {
+ dev_err_ratelimited(hdev->dev,
+ "Allocate first prp_list memory failed\n");
+ iod->first_dma = dma_addr;
+ iod->npages = -1;
+ return -ENOMEM;
+ }
+ list[0] = prp_list;
+ iod->first_dma = prp_dma;
+ i = 0;
+ for (;;) {
+ if (i == page_size / PRP_ENTRY_SIZE) {
+ old_prp_list = prp_list;
+
+ prp_list = dma_pool_alloc(pool, GFP_ATOMIC, &prp_dma);
+ if (!prp_list) {
+ dev_err_ratelimited(hdev->dev, "Allocate %dth;"
+ " prp_list memory failed\n",
+ iod->npages + 1);
+ return -ENOMEM;
+ }
+ list[iod->npages++] = prp_list;
+ prp_list[0] = old_prp_list[i - 1];
+ old_prp_list[i - 1] = cpu_to_le64(prp_dma);
+ i = 1;
+ }
+ prp_list[i++] = cpu_to_le64(dma_addr);
+ dma_len -= page_size;
+ dma_addr += page_size;
+ length -= page_size;
+ if (length <= 0)
+ break;
+ if (dma_len > 0)
+ continue;
+ if (unlikely(dma_len < 0))
+ goto bad_sgl;
+ sg = sg_next(sg);
+ dma_addr = sg_dma_address(sg);
+ dma_len = sg_dma_len(sg);
+ }
+
+ return 0;
+
+bad_sgl:
+ dev_err(hdev->dev,
+ "Setup prps, invalid SGL for payload: %d nents: %d\n",
+ iod->length, iod->nsge);
+ return -EIO;
+}
+
+#define SGES_PER_PAGE (PAGE_SIZE / sizeof(struct spraid_sgl_desc))
+
+static void spraid_submit_cmd(struct spraid_queue *spraidq, const void *cmd)
+{
+ u32 sqes = SQE_SIZE(spraidq->qid);
+ unsigned long flags;
+ struct spraid_admin_common_command *acd =
+ (struct spraid_admin_common_command *)cmd;
+
+ spin_lock_irqsave(&spraidq->sq_lock, flags);
+ memcpy((spraidq->sq_cmds + sqes * spraidq->sq_tail), cmd, sqes);
+ if (++spraidq->sq_tail == spraidq->q_depth)
+ spraidq->sq_tail = 0;
+
+ writel(spraidq->sq_tail, spraidq->q_db);
+ spin_unlock_irqrestore(&spraidq->sq_lock, flags);
+
+ dev_log_dbg(spraidq->hdev->dev,
+ "cid[%d] qid[%d], opcode[0x%x], flags[0x%x], hdid[%u]\n",
+ acd->command_id, spraidq->qid, acd->opcode,
+ acd->flags, le32_to_cpu(acd->hdid));
+}
+
+static u32 spraid_mod64(u64 dividend, u32 divisor)
+{
+ u64 d;
+ u32 remainder;
+
+ /* avoid calling do_div() with a zero divisor, which would fault */
+ if (!divisor) {
+ pr_err("divisor is zero in %s\n", __func__);
+ return 0;
+ }
+
+ d = dividend;
+ remainder = do_div(d, divisor);
+ return remainder;
+}
+
+static inline bool spraid_is_rw_scmd(struct scsi_cmnd *scmd)
+{
+ switch (scmd->cmnd[0]) {
+ case READ_6:
+ case READ_10:
+ case READ_12:
+ case READ_16:
+ case READ_32:
+ case WRITE_6:
+ case WRITE_10:
+ case WRITE_12:
+ case WRITE_16:
+ case WRITE_32:
+ return true;
+ default:
+ return false;
+ }
+}
+
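+/*
+ * Data can be described with PRPs only if every intermediate SGE is
+ * page aligned and a whole multiple of the page size, the first SGE
+ * ends on a page boundary and the last SGE starts on one; otherwise
+ * the command falls back to SGL descriptors.
+ */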
+static bool spraid_is_prp(struct spraid_dev *hdev,
+ struct scsi_cmnd *scmd, u32 nsge)
+{
+ struct scatterlist *sg = scsi_sglist(scmd);
+ u32 page_size = hdev->page_size;
+ bool is_prp = true;
+ int i = 0;
+
+ scsi_for_each_sg(scmd, sg, nsge, i) {
+ if (i != 0 && i != nsge - 1) {
+ if (spraid_mod64(sg_dma_len(sg), page_size) ||
+ spraid_mod64(sg_dma_address(sg), page_size)) {
+ is_prp = false;
+ break;
+ }
+ }
+
+ if (nsge > 1 && i == 0) {
+ if ((spraid_mod64((sg_dma_address(sg) + sg_dma_len(sg)),
+ page_size))) {
+ is_prp = false;
+ break;
+ }
+ }
+
+ if (nsge > 1 && i == (nsge - 1)) {
+ if (spraid_mod64(sg_dma_address(sg), page_size)) {
+ is_prp = false;
+ break;
+ }
+ }
+ }
+
+ return is_prp;
+}
+
+enum {
+ SPRAID_SGL_FMT_DATA_DESC = 0x00,
+ SPRAID_SGL_FMT_SEG_DESC = 0x02,
+ SPRAID_SGL_FMT_LAST_SEG_DESC = 0x03,
+ SPRAID_KEY_SGL_FMT_DATA_DESC = 0x04,
+ SPRAID_TRANSPORT_SGL_DATA_DESC = 0x05
+};
+
+static void spraid_sgl_set_data(struct spraid_sgl_desc *sge,
+ struct scatterlist *sg)
+{
+ sge->addr = cpu_to_le64(sg_dma_address(sg));
+ sge->length = cpu_to_le32(sg_dma_len(sg));
+ sge->type = SPRAID_SGL_FMT_DATA_DESC << 4;
+}
+
+static void spraid_sgl_set_seg(struct spraid_sgl_desc *sge,
+ dma_addr_t dma_addr, int entries)
+{
+ sge->addr = cpu_to_le64(dma_addr);
+ if (entries <= SGES_PER_PAGE) {
+ sge->length = cpu_to_le32(entries * sizeof(*sge));
+ sge->type = SPRAID_SGL_FMT_LAST_SEG_DESC << 4;
+ } else {
+ sge->length = cpu_to_le32(PAGE_SIZE);
+ sge->type = SPRAID_SGL_FMT_SEG_DESC << 4;
+ }
+}
+
+static int spraid_setup_ioq_cmd_sgl(struct spraid_dev *hdev,
+ struct scsi_cmnd *scmd,
+ struct spraid_ioq_command *ioq_cmd,
+ struct spraid_iod *iod)
+{
+ struct spraid_sgl_desc *sg_list, *link, *old_sg_list;
+ struct scatterlist *sg = scsi_sglist(scmd);
+ void **list = spraid_iod_list(iod);
+ struct dma_pool *pool;
+ int nsge = iod->nsge;
+ dma_addr_t sgl_dma;
+ int i = 0;
+
+ ioq_cmd->common.flags |= SPRAID_CMD_FLAG_SGL_METABUF;
+
+ if (nsge == 1) {
+ spraid_sgl_set_data(&ioq_cmd->common.dptr.sgl, sg);
+ return 0;
+ }
+
+ if (nsge <= (SMALL_POOL_SIZE / sizeof(struct spraid_sgl_desc))) {
+ pool = iod->spraidq->prp_small_pool;
+ iod->npages = 0;
+ } else {
+ pool = hdev->prp_page_pool;
+ iod->npages = 1;
+ }
+
+ sg_list = dma_pool_alloc(pool, GFP_ATOMIC, &sgl_dma);
+ if (!sg_list) {
+ dev_err_ratelimited(hdev->dev,
+ "Allocate first sgl_list failed\n");
+ iod->npages = -1;
+ return -ENOMEM;
+ }
+
+ list[0] = sg_list;
+ iod->first_dma = sgl_dma;
+ spraid_sgl_set_seg(&ioq_cmd->common.dptr.sgl, sgl_dma, nsge);
+ do {
+ if (i == SGES_PER_PAGE) {
+ old_sg_list = sg_list;
+ link = &old_sg_list[SGES_PER_PAGE - 1];
+
+ sg_list = dma_pool_alloc(pool, GFP_ATOMIC, &sgl_dma);
+ if (!sg_list) {
+ dev_err_ratelimited(hdev->dev,
+ "Allocate %dth sgl_list;"
+ " failed\n",
+ iod->npages + 1);
+ return -ENOMEM;
+ }
+ list[iod->npages++] = sg_list;
+
+ i = 0;
+ memcpy(&sg_list[i++], link, sizeof(*link));
+ spraid_sgl_set_seg(link, sgl_dma, nsge);
+ }
+
+ spraid_sgl_set_data(&sg_list[i++], sg);
+ sg = sg_next(sg);
+ } while (--nsge > 0);
+
+ return 0;
+}
+
+#define SPRAID_RW_FUA BIT(14)
+
+static void spraid_setup_rw_cmd(struct spraid_dev *hdev,
+ struct spraid_rw_command *rw,
+ struct scsi_cmnd *scmd)
+{
+ u32 start_lba_lo, start_lba_hi;
+ u32 datalength = 0;
+ u16 control = 0;
+
+ start_lba_lo = 0;
+ start_lba_hi = 0;
+
+ if (scmd->sc_data_direction == DMA_TO_DEVICE) {
+ rw->opcode = SPRAID_CMD_WRITE;
+ } else if (scmd->sc_data_direction == DMA_FROM_DEVICE) {
+ rw->opcode = SPRAID_CMD_READ;
+ } else {
+ dev_err(hdev->dev,
+ "Invalid IO for unsupported data direction: %d\n",
+ scmd->sc_data_direction);
+ WARN_ON(1);
+ }
+
+ /* 6-byte READ(0x08) or WRITE(0x0A) cdb */
+ if (scmd->cmd_len == 6) {
+ datalength = (u32)(scmd->cmnd[4] == 0 ?
+ IO_6_DEFAULT_TX_LEN : scmd->cmnd[4]);
+ start_lba_lo = (u32)get_unaligned_be24(&scmd->cmnd[1]);
+
+ start_lba_lo &= 0x1FFFFF;
+ }
+
+ /* 10-byte READ(0x28) or WRITE(0x2A) cdb */
+ else if (scmd->cmd_len == 10) {
+ datalength = (u32)get_unaligned_be16(&scmd->cmnd[7]);
+ start_lba_lo = get_unaligned_be32(&scmd->cmnd[2]);
+
+ if (scmd->cmnd[1] & FUA_MASK)
+ control |= SPRAID_RW_FUA;
+ }
+
+ /* 12-byte READ(0xA8) or WRITE(0xAA) cdb */
+ else if (scmd->cmd_len == 12) {
+ datalength = get_unaligned_be32(&scmd->cmnd[6]);
+ start_lba_lo = get_unaligned_be32(&scmd->cmnd[2]);
+
+ if (scmd->cmnd[1] & FUA_MASK)
+ control |= SPRAID_RW_FUA;
+ }
+ /* 16-byte READ(0x88) or WRITE(0x8A) cdb */
+ else if (scmd->cmd_len == 16) {
+ datalength = get_unaligned_be32(&scmd->cmnd[10]);
+ start_lba_lo = get_unaligned_be32(&scmd->cmnd[6]);
+ start_lba_hi = get_unaligned_be32(&scmd->cmnd[2]);
+
+ if (scmd->cmnd[1] & FUA_MASK)
+ control |= SPRAID_RW_FUA;
+ }
+ /* 32-byte READ(32) or WRITE(32) cdb, variable-length opcode 0x7F */
+ else if (scmd->cmd_len == 32) {
+ datalength = get_unaligned_be32(&scmd->cmnd[28]);
+ start_lba_lo = get_unaligned_be32(&scmd->cmnd[16]);
+ start_lba_hi = get_unaligned_be32(&scmd->cmnd[12]);
+
+ if (scmd->cmnd[10] & FUA_MASK)
+ control |= SPRAID_RW_FUA;
+ }
+
+ if (unlikely(datalength > U16_MAX || datalength == 0)) {
+ dev_err(hdev->dev,
+ "Invalid IO for illegal transfer data length: %u\n",
+ datalength);
+ WARN_ON(1);
+ }
+
+ rw->slba = cpu_to_le64(((u64)start_lba_hi << 32) | start_lba_lo);
+ /* 0base for nlb */
+ rw->nlb = cpu_to_le16((u16)(datalength - 1));
+ rw->control = cpu_to_le16(control);
+}
+
+static void spraid_setup_nonio_cmd(struct spraid_dev *hdev,
+ struct spraid_scsi_nonio *scsi_nonio,
+ struct scsi_cmnd *scmd)
+{
+ scsi_nonio->buffer_len = cpu_to_le32(scsi_bufflen(scmd));
+
+ switch (scmd->sc_data_direction) {
+ case DMA_NONE:
+ scsi_nonio->opcode = SPRAID_CMD_NONIO_NONE;
+ break;
+ case DMA_TO_DEVICE:
+ scsi_nonio->opcode = SPRAID_CMD_NONIO_TODEV;
+ break;
+ case DMA_FROM_DEVICE:
+ scsi_nonio->opcode = SPRAID_CMD_NONIO_FROMDEV;
+ break;
+ default:
+ dev_err(hdev->dev,
+ "Invalid IO for unsupported data direction: %d\n",
+ scmd->sc_data_direction);
+ WARN_ON(1);
+ }
+}
+
+static void spraid_setup_ioq_cmd(struct spraid_dev *hdev,
+ struct spraid_ioq_command *ioq_cmd,
+ struct scsi_cmnd *scmd)
+{
+ memcpy(ioq_cmd->common.cdb, scmd->cmnd, scmd->cmd_len);
+ ioq_cmd->common.cdb_len = scmd->cmd_len;
+
+ if (spraid_is_rw_scmd(scmd))
+ spraid_setup_rw_cmd(hdev, &ioq_cmd->rw, scmd);
+ else
+ spraid_setup_nonio_cmd(hdev, &ioq_cmd->scsi_nonio, scmd);
+}
+
+static int spraid_init_iod(struct spraid_dev *hdev, struct spraid_iod *iod,
+ struct spraid_ioq_command *ioq_cmd,
+ struct scsi_cmnd *scmd)
+{
+ if (unlikely(!iod->sense)) {
+ dev_err(hdev->dev, "Allocate sense data buffer failed\n");
+ return -ENOMEM;
+ }
+ ioq_cmd->common.sense_addr = cpu_to_le64(iod->sense_dma);
+ ioq_cmd->common.sense_len = cpu_to_le16(SCSI_SENSE_BUFFERSIZE);
+
+ iod->nsge = 0;
+ iod->npages = -1;
+ iod->use_sgl = 0;
+ iod->sg_drv_mgmt = false;
+ WRITE_ONCE(iod->state, SPRAID_CMD_IDLE);
+
+ return 0;
+}
+
+static void spraid_free_iod_res(struct spraid_dev *hdev, struct spraid_iod *iod)
+{
+ const int last_prp = hdev->page_size / sizeof(__le64) - 1;
+ dma_addr_t dma_addr, next_dma_addr;
+ struct spraid_sgl_desc *sg_list;
+ __le64 *prp_list;
+ void *addr;
+ int i;
+
+ dma_addr = iod->first_dma;
+ if (iod->npages == 0)
+ dma_pool_free(iod->spraidq->prp_small_pool,
+ spraid_iod_list(iod)[0], dma_addr);
+
+ for (i = 0; i < iod->npages; i++) {
+ addr = spraid_iod_list(iod)[i];
+
+ if (iod->use_sgl) {
+ sg_list = addr;
+ next_dma_addr =
+ le64_to_cpu((sg_list[SGES_PER_PAGE - 1]).addr);
+ } else {
+ prp_list = addr;
+ next_dma_addr = le64_to_cpu(prp_list[last_prp]);
+ }
+
+ dma_pool_free(hdev->prp_page_pool, addr, dma_addr);
+ dma_addr = next_dma_addr;
+ }
+
+ if (iod->sg_drv_mgmt && iod->sg != iod->inline_sg) {
+ iod->sg_drv_mgmt = false;
+ mempool_free(iod->sg, hdev->iod_mempool);
+ }
+
+ iod->sense = NULL;
+ iod->npages = -1;
+}
+
+static int spraid_io_map_data(struct spraid_dev *hdev, struct spraid_iod *iod,
+ struct scsi_cmnd *scmd,
+ struct spraid_ioq_command *ioq_cmd)
+{
+ int ret;
+
+ iod->nsge = scsi_dma_map(scmd);
+
+ /* No data to DMA, it may be scsi no-rw command */
+ if (unlikely(iod->nsge == 0))
+ return 0;
+
+ iod->length = scsi_bufflen(scmd);
+ iod->sg = scsi_sglist(scmd);
+ iod->use_sgl = !spraid_is_prp(hdev, scmd, iod->nsge);
+
+ if (iod->use_sgl) {
+ ret = spraid_setup_ioq_cmd_sgl(hdev, scmd, ioq_cmd, iod);
+ } else {
+ ret = spraid_setup_prps(hdev, iod);
+ ioq_cmd->common.dptr.prp1 =
+ cpu_to_le64(sg_dma_address(iod->sg));
+ ioq_cmd->common.dptr.prp2 = cpu_to_le64(iod->first_dma);
+ }
+
+ if (ret)
+ scsi_dma_unmap(scmd);
+
+ return ret;
+}
+
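+/*
+ * Translate the firmware completion status into a SCSI host byte and,
+ * on a check condition, copy the returned sense data back into the
+ * midlayer's sense buffer.
+ */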
+static void spraid_map_status(struct spraid_iod *iod, struct scsi_cmnd *scmd,
+ struct spraid_completion *cqe)
+{
+ scsi_set_resid(scmd, 0);
+
+ switch ((le16_to_cpu(cqe->status) >> 1) & 0x7f) {
+ case FW_STAT_OK:
+ set_host_byte(scmd, DID_OK);
+ break;
+ case FW_STAT_NEED_CHECK:
+ set_host_byte(scmd, DID_OK);
+ scmd->result |= le16_to_cpu(cqe->status) >> 8;
+ if (scmd->result & SAM_STAT_CHECK_CONDITION) {
+ memset(scmd->sense_buffer, 0, SCSI_SENSE_BUFFERSIZE);
+ memcpy(scmd->sense_buffer, iod->sense,
+ SCSI_SENSE_BUFFERSIZE);
+ scmd->result =
+ (scmd->result & 0x00ffffff) | (DRIVER_SENSE << 24);
+ }
+ break;
+ case FW_STAT_ABORTED:
+ set_host_byte(scmd, DID_ABORT);
+ break;
+ case FW_STAT_NEED_RETRY:
+ set_host_byte(scmd, DID_REQUEUE);
+ break;
+ default:
+ set_host_byte(scmd, DID_BAD_TARGET);
+ break;
+ }
+}
+
+static inline void spraid_get_tag_from_scmd(struct scsi_cmnd *scmd,
+ u16 *qid, u16 *cid)
+{
+ u32 tag = blk_mq_unique_tag(scmd->request);
+
+ *qid = blk_mq_unique_tag_to_hwq(tag) + 1;
+ *cid = blk_mq_unique_tag_to_tag(tag);
+}
+
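+/*
+ * Main .queuecommand entry: build an ioq command from the scsi cmnd,
+ * attach the per-command slice of the queue's sense buffer, map the
+ * data as PRPs or SGLs and submit it on the hardware queue that
+ * matches the blk-mq hctx of the request.
+ */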
+static int spraid_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
+{
+ struct spraid_iod *iod = scsi_cmd_priv(scmd);
+ struct spraid_dev *hdev = shost_priv(shost);
+ struct scsi_device *sdev = scmd->device;
+ struct spraid_sdev_hostdata *hostdata;
+ struct spraid_ioq_command ioq_cmd;
+ struct spraid_queue *ioq;
+ unsigned long elapsed;
+ u16 hwq, cid;
+ int ret;
+
+ if (unlikely(!scmd)) {
+ dev_err(hdev->dev, "err, scmd is null\n");
+ return 0;
+ }
+
+ if (unlikely(hdev->state != SPRAID_LIVE)) {
+ set_host_byte(scmd, DID_NO_CONNECT);
+ scmd->scsi_done(scmd);
+ return 0;
+ }
+
+ if (log_debug_switch)
+ scsi_print_command(scmd);
+
+ spraid_get_tag_from_scmd(scmd, &hwq, &cid);
+ hostdata = sdev->hostdata;
+ ioq = &hdev->queues[hwq];
+ memset(&ioq_cmd, 0, sizeof(ioq_cmd));
+ ioq_cmd.rw.hdid = cpu_to_le32(hostdata->hdid);
+ ioq_cmd.rw.command_id = cid;
+
+ spraid_setup_ioq_cmd(hdev, &ioq_cmd, scmd);
+
+ ret = cid * SCSI_SENSE_BUFFERSIZE;
+ iod->sense = ioq->sense + ret;
+ iod->sense_dma = ioq->sense_dma_addr + ret;
+
+ ret = spraid_init_iod(hdev, iod, &ioq_cmd, scmd);
+ if (unlikely(ret))
+ return SCSI_MLQUEUE_HOST_BUSY;
+
+ iod->spraidq = ioq;
+ ret = spraid_io_map_data(hdev, iod, scmd, &ioq_cmd);
+ if (unlikely(ret)) {
+ dev_err(hdev->dev, "spraid_io_map_data Err.\n");
+ set_host_byte(scmd, DID_ERROR);
+ scmd->scsi_done(scmd);
+ ret = 0;
+ goto deinit_iod;
+ }
+
+ WRITE_ONCE(iod->state, SPRAID_CMD_IN_FLIGHT);
+ spraid_submit_cmd(ioq, &ioq_cmd);
+ elapsed = jiffies - scmd->jiffies_at_alloc;
+ dev_log_dbg(hdev->dev,
+ "cid[%d] qid[%d] submit IO cost %3ld.%3ld seconds\n",
+ cid, hwq, elapsed / HZ, elapsed % HZ);
+ return 0;
+
+deinit_iod:
+ spraid_free_iod_res(hdev, iod);
+ return ret;
+}
+
+static int spraid_match_dev(struct spraid_dev *hdev, u16 idx,
+ struct scsi_device *sdev)
+{
+ if (SPRAID_DEV_INFO_FLAG_VALID(hdev->devices[idx].flag)) {
+ if (sdev->channel == hdev->devices[idx].channel &&
+ sdev->id == le16_to_cpu(hdev->devices[idx].target) &&
+ sdev->lun < hdev->devices[idx].lun) {
+ dev_info(hdev->dev,
+ "Match device success, channel;"
+ "target:lun[%d:%d:%d]\n",
+ hdev->devices[idx].channel,
+ hdev->devices[idx].target,
+ hdev->devices[idx].lun);
+ return 1;
+ }
+ }
+
+ return 0;
+}
+
+static int spraid_slave_alloc(struct scsi_device *sdev)
+{
+ struct spraid_sdev_hostdata *hostdata;
+ struct spraid_dev *hdev;
+ u16 idx;
+
+ hdev = shost_priv(sdev->host);
+ hostdata = kzalloc(sizeof(*hostdata), GFP_KERNEL);
+ if (!hostdata) {
+ dev_err(hdev->dev, "Alloc scsi host data memory failed\n");
+ return -ENOMEM;
+ }
+
+ down_read(&hdev->devices_rwsem);
+ for (idx = 0; idx < le32_to_cpu(hdev->ctrl_info->nd); idx++) {
+ if (spraid_match_dev(hdev, idx, sdev))
+ goto scan_host;
+ }
+ up_read(&hdev->devices_rwsem);
+
+ kfree(hostdata);
+ return -ENXIO;
+
+scan_host:
+ hostdata->hdid = le32_to_cpu(hdev->devices[idx].hdid);
+ hostdata->max_io_kb = le16_to_cpu(hdev->devices[idx].max_io_kb);
+ hostdata->attr = hdev->devices[idx].attr;
+ hostdata->flag = hdev->devices[idx].flag;
+ hostdata->rg_id = 0xff;
+ sdev->hostdata = hostdata;
+ up_read(&hdev->devices_rwsem);
+ return 0;
+}
+
+static void spraid_slave_destroy(struct scsi_device *sdev)
+{
+ kfree(sdev->hostdata);
+ sdev->hostdata = NULL;
+}
+
+static int spraid_slave_configure(struct scsi_device *sdev)
+{
+ u16 idx;
+ unsigned int timeout = scmd_tmout_nonpt * HZ;
+ struct spraid_dev *hdev = shost_priv(sdev->host);
+ struct spraid_sdev_hostdata *hostdata = sdev->hostdata;
+ u32 max_sec = sdev->host->max_sectors;
+
+ if (hostdata) {
+ idx = hostdata->hdid - 1;
+ if (sdev->channel == hdev->devices[idx].channel &&
+ sdev->id == le16_to_cpu(hdev->devices[idx].target) &&
+ sdev->lun < hdev->devices[idx].lun) {
+ if (SPRAID_DEV_INFO_ATTR_PT(hdev->devices[idx].attr))
+ timeout = 30 * HZ;
+ else
+ timeout = scmd_tmout_nonpt * HZ;
+ max_sec = le16_to_cpu(hdev->devices[idx].max_io_kb)
+ << 1;
+ } else {
+ dev_err(hdev->dev,
+ "[%s] err, sdev->channel:id:lun[%d:%d:%lld];"
+ "devices[%d], channel:target:lun[%d:%d:%d]\n",
+ __func__, sdev->channel, sdev->id, sdev->lun,
+ idx, hdev->devices[idx].channel,
+ hdev->devices[idx].target,
+ hdev->devices[idx].lun);
+ }
+ } else {
+ dev_err(hdev->dev, "[%s] err, sdev->hostdata is null\n",
+ __func__);
+ }
+
+ blk_queue_rq_timeout(sdev->request_queue, timeout);
+ sdev->eh_timeout = timeout;
+
+ if ((max_sec == 0) || (max_sec > sdev->host->max_sectors))
+ max_sec = sdev->host->max_sectors;
+
+ dev_info(hdev->dev,
+ "[%s] sdev->channel:id:lun[%d:%d:%lld];"
+ " scmd_timeout[%d]s, maxsec[%d]\n",
+ __func__, sdev->channel, sdev->id,
+ sdev->lun, timeout / HZ, max_sec);
+
+ return 0;
+}
+
+static void spraid_shost_init(struct spraid_dev *hdev)
+{
+ struct pci_dev *pdev = hdev->pdev;
+ u8 domain, bus;
+ u32 dev_func;
+
+ domain = pci_domain_nr(pdev->bus);
+ bus = pdev->bus->number;
+ dev_func = pdev->devfn;
+
+ hdev->shost->nr_hw_queues = hdev->online_queues - 1;
+ hdev->shost->can_queue = (hdev->ioq_depth - SPRAID_PTCMDS_PERQ);
+
+ hdev->shost->sg_tablesize = le16_to_cpu(hdev->ctrl_info->max_num_sge);
+ /* 512B per sector */
+ hdev->shost->max_sectors =
+ (1U << ((hdev->ctrl_info->mdts) * 1U) << 12) / 512;
+ hdev->shost->cmd_per_lun = MAX_CMD_PER_DEV;
+ hdev->shost->max_channel =
+ le16_to_cpu(hdev->ctrl_info->max_channel) - 1;
+ hdev->shost->max_id = le32_to_cpu(hdev->ctrl_info->max_tgt_id);
+ hdev->shost->max_lun = le16_to_cpu(hdev->ctrl_info->max_lun);
+
+ hdev->shost->this_id = -1;
+ hdev->shost->unique_id = (domain << 16) | (bus << 8) | dev_func;
+ hdev->shost->max_cmd_len = MAX_CDB_LEN;
+ hdev->shost->hostt->cmd_size = max(spraid_cmd_size(hdev, false, true),
+ spraid_cmd_size(hdev, false, false));
+}
+
+static inline void spraid_host_deinit(struct spraid_dev *hdev)
+{
+ ida_free(&spraid_instance_ida, hdev->instance);
+}
+
+static int spraid_alloc_queue(struct spraid_dev *hdev, u16 qid, u16 depth)
+{
+ struct spraid_queue *spraidq = &hdev->queues[qid];
+ int ret = 0;
+
+ if (hdev->queue_count > qid) {
+ dev_info(hdev->dev, "[%s] warn: queue[%d] is exist\n",
+ __func__, qid);
+ return 0;
+ }
+
+ spraidq->cqes = dma_alloc_coherent(hdev->dev, CQ_SIZE(depth),
+ &spraidq->cq_dma_addr,
+ GFP_KERNEL | __GFP_ZERO);
+ if (!spraidq->cqes)
+ return -ENOMEM;
+
+ spraidq->sq_cmds = dma_alloc_coherent(hdev->dev, SQ_SIZE(qid, depth),
+ &spraidq->sq_dma_addr,
+ GFP_KERNEL);
+ if (!spraidq->sq_cmds) {
+ ret = -ENOMEM;
+ goto free_cqes;
+ }
+
+ spin_lock_init(&spraidq->sq_lock);
+ spin_lock_init(&spraidq->cq_lock);
+ spraidq->hdev = hdev;
+ spraidq->q_depth = depth;
+ spraidq->qid = qid;
+ spraidq->cq_vector = -1;
+ hdev->queue_count++;
+
+ /* alloc sense buffer */
+ spraidq->sense = dma_alloc_coherent(hdev->dev, SENSE_SIZE(depth),
+ &spraidq->sense_dma_addr,
+ GFP_KERNEL | __GFP_ZERO);
+ if (!spraidq->sense) {
+ ret = -ENOMEM;
+ goto free_sq_cmds;
+ }
+
+ return 0;
+
+free_sq_cmds:
+ dma_free_coherent(hdev->dev, SQ_SIZE(qid, depth),
+ (void *)spraidq->sq_cmds, spraidq->sq_dma_addr);
+free_cqes:
+ dma_free_coherent(hdev->dev, CQ_SIZE(depth), (void *)spraidq->cqes,
+ spraidq->cq_dma_addr);
+ return ret;
+}
+
+static int spraid_wait_ready(struct spraid_dev *hdev, u64 cap, bool enabled)
+{
+ unsigned long timeout =
+ ((SPRAID_CAP_TIMEOUT(cap) + 1) * SPRAID_CAP_TIMEOUT_UNIT_MS) + jiffies;
+ u32 bit = enabled ? SPRAID_CSTS_RDY : 0;
+
+ while ((readl(hdev->bar + SPRAID_REG_CSTS) & SPRAID_CSTS_RDY) != bit) {
+ usleep_range(1000, 2000);
+ if (fatal_signal_pending(current))
+ return -EINTR;
+
+ if (time_after(jiffies, timeout)) {
+ dev_err(hdev->dev, "Device not ready; aborting %s\n",
+ enabled ? "initialisation" : "reset");
+ return -ENODEV;
+ }
+ }
+ return 0;
+}
+
+static int spraid_shutdown_ctrl(struct spraid_dev *hdev)
+{
+ unsigned long timeout = hdev->ctrl_info->rtd3e + jiffies;
+
+ hdev->ctrl_config &= ~SPRAID_CC_SHN_MASK;
+ hdev->ctrl_config |= SPRAID_CC_SHN_NORMAL;
+ writel(hdev->ctrl_config, hdev->bar + SPRAID_REG_CC);
+
+ while ((readl(hdev->bar + SPRAID_REG_CSTS) & SPRAID_CSTS_SHST_MASK) !=
+ SPRAID_CSTS_SHST_CMPLT) {
+ msleep(100);
+ if (fatal_signal_pending(current))
+ return -EINTR;
+ if (time_after(jiffies, timeout)) {
+ dev_err(hdev->dev,
+ "Device shutdown incomplete; abort shutdown\n");
+ return -ENODEV;
+ }
+ }
+ return 0;
+}
+
+static int spraid_disable_ctrl(struct spraid_dev *hdev)
+{
+ hdev->ctrl_config &= ~SPRAID_CC_SHN_MASK;
+ hdev->ctrl_config &= ~SPRAID_CC_ENABLE;
+ writel(hdev->ctrl_config, hdev->bar + SPRAID_REG_CC);
+
+ return spraid_wait_ready(hdev, hdev->cap, false);
+}
+
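+/*
+ * Program the CC register (page size, arbitration, queue entry sizes),
+ * set the enable bit and wait for CSTS.RDY; fails if the host page
+ * size is below the minimum the device advertises in CAP.
+ */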
+static int spraid_enable_ctrl(struct spraid_dev *hdev)
+{
+ u64 cap = hdev->cap;
+ u32 dev_page_min = SPRAID_CAP_MPSMIN(cap) + 12;
+ u32 page_shift = PAGE_SHIFT;
+
+ if (page_shift < dev_page_min) {
+ dev_err(hdev->dev,
+ "Minimum device page size[%u], too large for host[%u]\n",
+ 1U << dev_page_min, 1U << page_shift);
+ return -ENODEV;
+ }
+
+ page_shift = min_t(unsigned int, SPRAID_CAP_MPSMAX(cap) + 12,
+ PAGE_SHIFT);
+ hdev->page_size = 1U << page_shift;
+
+ hdev->ctrl_config = SPRAID_CC_CSS_NVM;
+ hdev->ctrl_config |= (page_shift - 12) << SPRAID_CC_MPS_SHIFT;
+ hdev->ctrl_config |= SPRAID_CC_AMS_RR | SPRAID_CC_SHN_NONE;
+ hdev->ctrl_config |= SPRAID_CC_IOSQES | SPRAID_CC_IOCQES;
+ hdev->ctrl_config |= SPRAID_CC_ENABLE;
+ writel(hdev->ctrl_config, hdev->bar + SPRAID_REG_CC);
+
+ return spraid_wait_ready(hdev, cap, true);
+}
+
+static void spraid_init_queue(struct spraid_queue *spraidq, u16 qid)
+{
+ struct spraid_dev *hdev = spraidq->hdev;
+
+ memset((void *)spraidq->cqes, 0, CQ_SIZE(spraidq->q_depth));
+
+ spraidq->sq_tail = 0;
+ spraidq->cq_head = 0;
+ spraidq->cq_phase = 1;
+ spraidq->q_db = &hdev->dbs[qid * 2 * hdev->db_stride];
+ spraidq->prp_small_pool = hdev->prp_small_pool[qid % small_pool_num];
+ hdev->online_queues++;
+}
+
+static inline bool spraid_cqe_pending(struct spraid_queue *spraidq)
+{
+ return (le16_to_cpu(spraidq->cqes[spraidq->cq_head].status) & 1) ==
+ spraidq->cq_phase;
+}
+
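+/*
+ * For HBA cards a REPORT ZONES reply comes back in the SATA drive's
+ * little-endian layout; rewrite the header and each 64-byte zone
+ * descriptor in place into the big-endian SCSI ZBC format.
+ */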
+static void spraid_sata_report_zone_handle(struct scsi_cmnd *scmd,
+ struct spraid_iod *iod)
+{
+ int i = 0;
+ unsigned int bytes = 0;
+ struct scatterlist *sg = scsi_sglist(scmd);
+
+ scsi_for_each_sg(scmd, sg, iod->nsge, i) {
+ unsigned int offset = 0;
+
+ if (bytes == 0) {
+ char *hdr;
+ u32 list_length;
+ u64 max_lba, opt_lba;
+ u16 same;
+
+ hdr = sg_virt(sg);
+
+ list_length = get_unaligned_le32(&hdr[0]);
+ same = get_unaligned_le16(&hdr[4]);
+ max_lba = get_unaligned_le64(&hdr[8]);
+ opt_lba = get_unaligned_le64(&hdr[16]);
+ put_unaligned_be32(list_length, &hdr[0]);
+ hdr[4] = same & 0xf;
+ put_unaligned_be64(max_lba, &hdr[8]);
+ put_unaligned_be64(opt_lba, &hdr[16]);
+ offset += 64;
+ bytes += 64;
+ }
+ while (offset < sg_dma_len(sg)) {
+ char *rec;
+ u8 cond, type, non_seq, reset;
+ u64 size, start, wp;
+
+ rec = sg_virt(sg) + offset;
+ type = rec[0] & 0xf;
+ cond = (rec[1] >> 4) & 0xf;
+ non_seq = (rec[1] & 2);
+ reset = (rec[1] & 1);
+ size = get_unaligned_le64(&rec[8]);
+ start = get_unaligned_le64(&rec[16]);
+ wp = get_unaligned_le64(&rec[24]);
+ rec[0] = type;
+ rec[1] = (cond << 4) | non_seq | reset;
+ put_unaligned_be64(size, &rec[8]);
+ put_unaligned_be64(start, &rec[16]);
+ put_unaligned_be64(wp, &rec[24]);
+ WARN_ON(offset + 64 > sg_dma_len(sg));
+ offset += 64;
+ bytes += 64;
+ }
+ }
+}
+
+static inline void spraid_handle_ata_cmd(struct spraid_dev *hdev,
+ struct scsi_cmnd *scmd,
+ struct spraid_iod *iod)
+{
+ if (hdev->ctrl_info->card_type != SPRAID_CARD_HBA)
+ return;
+
+ switch (scmd->cmnd[0]) {
+ case ZBC_IN:
+ dev_info(hdev->dev, "[%s] process report zone\n", __func__);
+ spraid_sata_report_zone_handle(scmd, iod);
+ break;
+ default:
+ break;
+ }
+}
+
+static void spraid_complete_ioq_cmnd(struct spraid_queue *ioq,
+ struct spraid_completion *cqe)
+{
+ struct spraid_dev *hdev = ioq->hdev;
+ struct blk_mq_tags *tags;
+ struct scsi_cmnd *scmd;
+ struct spraid_iod *iod;
+ struct request *req;
+ unsigned long elapsed;
+
+ tags = hdev->shost->tag_set.tags[ioq->qid - 1];
+ req = blk_mq_tag_to_rq(tags, cqe->cmd_id);
+ if (unlikely(!req || !blk_mq_request_started(req))) {
+ dev_warn(hdev->dev, "Invalid id %d completed on queue %d\n",
+ cqe->cmd_id, ioq->qid);
+ return;
+ }
+
+ scmd = blk_mq_rq_to_pdu(req);
+ iod = scsi_cmd_priv(scmd);
+
+ elapsed = jiffies - scmd->jiffies_at_alloc;
+ dev_log_dbg(hdev->dev,
+ "cid[%d] qid[%d] finish IO cost %3ld.%3ld seconds\n",
+ cqe->cmd_id, ioq->qid, elapsed / HZ, elapsed % HZ);
+
+ if (cmpxchg(&iod->state, SPRAID_CMD_IN_FLIGHT, SPRAID_CMD_COMPLETE) !=
+ SPRAID_CMD_IN_FLIGHT) {
+ dev_warn(hdev->dev,
+ "cid[%d] qid[%d] enters abnormal handler;"
+ " cost %3ld.%3ld seconds\n",
+ cqe->cmd_id, ioq->qid, elapsed / HZ, elapsed % HZ);
+ WRITE_ONCE(iod->state, SPRAID_CMD_TMO_COMPLETE);
+
+ if (iod->nsge) {
+ iod->nsge = 0;
+ scsi_dma_unmap(scmd);
+ }
+ spraid_free_iod_res(hdev, iod);
+
+ return;
+ }
+
+ spraid_handle_ata_cmd(hdev, scmd, iod);
+
+ spraid_map_status(iod, scmd, cqe);
+ if (iod->nsge) {
+ iod->nsge = 0;
+ scsi_dma_unmap(scmd);
+ }
+ spraid_free_iod_res(hdev, iod);
+ scmd->scsi_done(scmd);
+}
+
+static void spraid_complete_adminq_cmnd(struct spraid_queue *adminq,
+ struct spraid_completion *cqe)
+{
+ struct spraid_dev *hdev = adminq->hdev;
+ struct spraid_cmd *adm_cmd;
+
+ adm_cmd = hdev->adm_cmds + cqe->cmd_id;
+ if (unlikely(adm_cmd->state == SPRAID_CMD_IDLE)) {
+ dev_warn(adminq->hdev->dev,
+ "Invalid id %d completed on queue %d\n",
+ cqe->cmd_id, le16_to_cpu(cqe->sq_id));
+ return;
+ }
+
+ adm_cmd->status = le16_to_cpu(cqe->status) >> 1;
+ adm_cmd->result0 = le32_to_cpu(cqe->result);
+ adm_cmd->result1 = le32_to_cpu(cqe->result1);
+
+ complete(&adm_cmd->cmd_done);
+}
+
+static void spraid_send_aen(struct spraid_dev *hdev, u16 cid);
+
+static void spraid_complete_aen(struct spraid_queue *spraidq,
+ struct spraid_completion *cqe)
+{
+ struct spraid_dev *hdev = spraidq->hdev;
+ u32 result = le32_to_cpu(cqe->result);
+
+ dev_info(hdev->dev, "rcv aen, cid[%d], status[0x%x], result[0x%x]\n",
+ cqe->cmd_id, le16_to_cpu(cqe->status) >> 1, result);
+
+ spraid_send_aen(hdev, cqe->cmd_id);
+
+ if ((le16_to_cpu(cqe->status) >> 1) != SPRAID_SC_SUCCESS)
+ return;
+ switch (result & 0x7) {
+ case SPRAID_AEN_NOTICE:
+ spraid_handle_aen_notice(hdev, result);
+ break;
+ case SPRAID_AEN_VS:
+ spraid_handle_aen_vs(hdev, result, le32_to_cpu(cqe->result1));
+ break;
+ default:
+ dev_warn(hdev->dev, "Unsupported async event type: %u\n",
+ result & 0x7);
+ break;
+ }
+}
+
+static void spraid_complete_ioq_sync_cmnd(struct spraid_queue *ioq,
+ struct spraid_completion *cqe)
+{
+ struct spraid_dev *hdev = ioq->hdev;
+ struct spraid_cmd *ptcmd;
+
+ ptcmd = hdev->ioq_ptcmds + (ioq->qid - 1) * SPRAID_PTCMDS_PERQ +
+ cqe->cmd_id - SPRAID_IO_BLK_MQ_DEPTH;
+
+ ptcmd->status = le16_to_cpu(cqe->status) >> 1;
+ ptcmd->result0 = le32_to_cpu(cqe->result);
+ ptcmd->result1 = le32_to_cpu(cqe->result1);
+
+ complete(&ptcmd->cmd_done);
+}
+
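+/*
+ * Demultiplex a completion by command id: ids above the blk-mq depth
+ * on the admin queue are AEN completions, ids above the blk-mq depth
+ * on an IO queue are reserved passthrough slots, anything else is a
+ * regular admin or IO command.
+ */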
+static inline void spraid_handle_cqe(struct spraid_queue *spraidq, u16 idx)
+{
+ struct spraid_completion *cqe = &spraidq->cqes[idx];
+ struct spraid_dev *hdev = spraidq->hdev;
+
+ if (unlikely(cqe->cmd_id >= spraidq->q_depth)) {
+ dev_err(hdev->dev,
+ "Invalid command id[%d] completed on queue %d\n",
+ cqe->cmd_id, cqe->sq_id);
+ return;
+ }
+
+ dev_log_dbg(hdev->dev, "cid[%d] qid[%d];"
+ " result[0x%x], sq_id[%d], status[0x%x]\n",
+ cqe->cmd_id, spraidq->qid, le32_to_cpu(cqe->result),
+ le16_to_cpu(cqe->sq_id), le16_to_cpu(cqe->status));
+
+ if (unlikely(spraidq->qid == 0
+ && cqe->cmd_id >= SPRAID_AQ_BLK_MQ_DEPTH)) {
+ spraid_complete_aen(spraidq, cqe);
+ return;
+ }
+
+ if (unlikely(spraidq->qid && cqe->cmd_id >= SPRAID_IO_BLK_MQ_DEPTH)) {
+ spraid_complete_ioq_sync_cmnd(spraidq, cqe);
+ return;
+ }
+
+ if (spraidq->qid)
+ spraid_complete_ioq_cmnd(spraidq, cqe);
+ else
+ spraid_complete_adminq_cmnd(spraidq, cqe);
+}
+
+static void spraid_complete_cqes(struct spraid_queue *spraidq,
+ u16 start, u16 end)
+{
+ while (start != end) {
+ spraid_handle_cqe(spraidq, start);
+ if (++start == spraidq->q_depth)
+ start = 0;
+ }
+}
+
+static inline void spraid_update_cq_head(struct spraid_queue *spraidq)
+{
+ if (++spraidq->cq_head == spraidq->q_depth) {
+ spraidq->cq_head = 0;
+ spraidq->cq_phase = !spraidq->cq_phase;
+ }
+}
+
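+/*
+ * Walk the completion ring while the phase bit matches, optionally
+ * watching for a specific tag, then ring the CQ head doorbell; the
+ * caller completes the reaped entries outside the cq_lock.
+ */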
+static inline bool spraid_process_cq(struct spraid_queue *spraidq,
+ u16 *start, u16 *end, int tag)
+{
+ bool found = false;
+
+ *start = spraidq->cq_head;
+ while (!found && spraid_cqe_pending(spraidq)) {
+ if (spraidq->cqes[spraidq->cq_head].cmd_id == tag)
+ found = true;
+ spraid_update_cq_head(spraidq);
+ }
+ *end = spraidq->cq_head;
+
+ if (*start != *end)
+ writel(spraidq->cq_head,
+ spraidq->q_db + spraidq->hdev->db_stride);
+
+ return found;
+}
+
+static bool spraid_poll_cq(struct spraid_queue *spraidq, int cid)
+{
+ u16 start, end;
+ bool found;
+
+ if (!spraid_cqe_pending(spraidq))
+ return 0;
+
+ spin_lock_irq(&spraidq->cq_lock);
+ found = spraid_process_cq(spraidq, &start, &end, cid);
+ spin_unlock_irq(&spraidq->cq_lock);
+
+ spraid_complete_cqes(spraidq, start, end);
+ return found;
+}
+
+static irqreturn_t spraid_irq(int irq, void *data)
+{
+ struct spraid_queue *spraidq = data;
+ irqreturn_t ret = IRQ_NONE;
+ u16 start, end;
+
+ spin_lock(&spraidq->cq_lock);
+ if (spraidq->cq_head != spraidq->last_cq_head)
+ ret = IRQ_HANDLED;
+
+ spraid_process_cq(spraidq, &start, &end, -1);
+ spraidq->last_cq_head = spraidq->cq_head;
+ spin_unlock(&spraidq->cq_lock);
+
+ if (start != end) {
+ spraid_complete_cqes(spraidq, start, end);
+ ret = IRQ_HANDLED;
+ }
+ return ret;
+}
+
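+/*
+ * Bring up the admin queue: disable the controller, allocate the
+ * admin SQ/CQ, program AQA/ASQ/ACQ, re-enable the controller and
+ * register the admin IRQ on vector 0.
+ */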
+static int spraid_setup_admin_queue(struct spraid_dev *hdev)
+{
+ struct spraid_queue *adminq = &hdev->queues[0];
+ u32 aqa;
+ int ret;
+
+ dev_info(hdev->dev, "[%s] start disable ctrl\n", __func__);
+
+ ret = spraid_disable_ctrl(hdev);
+ if (ret)
+ return ret;
+
+ ret = spraid_alloc_queue(hdev, 0, SPRAID_AQ_DEPTH);
+ if (ret)
+ return ret;
+
+ aqa = adminq->q_depth - 1;
+ aqa |= aqa << 16;
+ writel(aqa, hdev->bar + SPRAID_REG_AQA);
+ lo_hi_writeq(adminq->sq_dma_addr, hdev->bar + SPRAID_REG_ASQ);
+ lo_hi_writeq(adminq->cq_dma_addr, hdev->bar + SPRAID_REG_ACQ);
+
+ dev_info(hdev->dev, "[%s] start enable ctrl\n", __func__);
+
+ ret = spraid_enable_ctrl(hdev);
+ if (ret) {
+ ret = -ENODEV;
+ goto free_queue;
+ }
+
+ adminq->cq_vector = 0;
+ spraid_init_queue(adminq, 0);
+ ret = pci_request_irq(hdev->pdev, adminq->cq_vector, spraid_irq, NULL,
+ adminq, "spraid%d_q%d",
+ hdev->instance, adminq->qid);
+
+ if (ret) {
+ adminq->cq_vector = -1;
+ hdev->online_queues--;
+ goto free_queue;
+ }
+
+ dev_info(hdev->dev, "[%s] success, queuecount:[%d], onlinequeue:[%d]\n",
+ __func__, hdev->queue_count, hdev->online_queues);
+
+ return 0;
+
+free_queue:
+ spraid_free_queue(adminq);
+ return ret;
+}
+
+static u32 spraid_bar_size(struct spraid_dev *hdev, u32 nr_ioqs)
+{
+ return (SPRAID_REG_DBS + ((nr_ioqs + 1) * 8 * hdev->db_stride));
+}
+
+static int spraid_alloc_admin_cmds(struct spraid_dev *hdev)
+{
+ int i;
+
+ INIT_LIST_HEAD(&hdev->adm_cmd_list);
+ spin_lock_init(&hdev->adm_cmd_lock);
+
+ hdev->adm_cmds = kcalloc_node(SPRAID_AQ_BLK_MQ_DEPTH,
+ sizeof(struct spraid_cmd),
+ GFP_KERNEL, hdev->numa_node);
+
+ if (!hdev->adm_cmds) {
+ dev_err(hdev->dev, "Alloc admin cmds failed\n");
+ return -ENOMEM;
+ }
+
+ for (i = 0; i < SPRAID_AQ_BLK_MQ_DEPTH; i++) {
+ hdev->adm_cmds[i].qid = 0;
+ hdev->adm_cmds[i].cid = i;
+ list_add_tail(&(hdev->adm_cmds[i].list), &hdev->adm_cmd_list);
+ }
+
+ dev_info(hdev->dev, "Alloc admin cmds success, num[%d]\n",
+ SPRAID_AQ_BLK_MQ_DEPTH);
+
+ return 0;
+}
+
+static void spraid_free_admin_cmds(struct spraid_dev *hdev)
+{
+ kfree(hdev->adm_cmds);
+ hdev->adm_cmds = NULL;
+ INIT_LIST_HEAD(&hdev->adm_cmd_list);
+}
+
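+/*
+ * Internal commands are preallocated with fixed cids and recycled
+ * through spinlock-protected free lists: adm_cmd_list for admin
+ * commands, ioq_pt_list for I/O-queue passthrough commands.
+ */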
+static struct spraid_cmd *spraid_get_cmd(struct spraid_dev *hdev,
+ enum spraid_cmd_type type)
+{
+ struct spraid_cmd *cmd = NULL;
+ unsigned long flags;
+ struct list_head *head = &hdev->adm_cmd_list;
+ spinlock_t *slock = &hdev->adm_cmd_lock;
+
+ if (type == SPRAID_CMD_IOPT) {
+ head = &hdev->ioq_pt_list;
+ slock = &hdev->ioq_pt_lock;
+ }
+
+ spin_lock_irqsave(slock, flags);
+ if (list_empty(head)) {
+ spin_unlock_irqrestore(slock, flags);
+ dev_err(hdev->dev, "err, cmd[%d] list empty\n", type);
+ return NULL;
+ }
+ cmd = list_entry(head->next, struct spraid_cmd, list);
+ list_del_init(&cmd->list);
+ spin_unlock_irqrestore(slock, flags);
+
+ WRITE_ONCE(cmd->state, SPRAID_CMD_IN_FLIGHT);
+
+ return cmd;
+}
+
+static void spraid_put_cmd(struct spraid_dev *hdev, struct spraid_cmd *cmd,
+ enum spraid_cmd_type type)
+{
+ unsigned long flags;
+ struct list_head *head = &hdev->adm_cmd_list;
+ spinlock_t *slock = &hdev->adm_cmd_lock;
+
+ if (type == SPRAID_CMD_IOPT) {
+ head = &hdev->ioq_pt_list;
+ slock = &hdev->ioq_pt_lock;
+ }
+
+ spin_lock_irqsave(slock, flags);
+ WRITE_ONCE(cmd->state, SPRAID_CMD_IDLE);
+ list_add_tail(&cmd->list, head);
+ spin_unlock_irqrestore(slock, flags);
+}
+
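+/*
+ * Synchronous admin submission: borrow a preallocated slot, post the
+ * command to the admin SQ and sleep on its completion (ADMIN_TIMEOUT
+ * by default). On timeout the slot state is flipped to
+ * SPRAID_CMD_TIMEOUT and -EINVAL is returned.
+ */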
+static int spraid_submit_admin_sync_cmd(struct spraid_dev *hdev,
+ struct spraid_admin_command *cmd,
+ u32 *result0, u32 *result1, u32 timeout)
+{
+ struct spraid_cmd *adm_cmd = spraid_get_cmd(hdev, SPRAID_CMD_ADM);
+
+ if (!adm_cmd) {
+ dev_err(hdev->dev, "err, get admin cmd failed\n");
+ return -EFAULT;
+ }
+
+ timeout = timeout ? timeout : ADMIN_TIMEOUT;
+
+ init_completion(&adm_cmd->cmd_done);
+
+ cmd->common.command_id = adm_cmd->cid;
+ spraid_submit_cmd(&hdev->queues[0], cmd);
+
+ if (!wait_for_completion_timeout(&adm_cmd->cmd_done, timeout)) {
+ dev_err(hdev->dev, "[%s] cid[%d] qid[%d] timeout;"
+ " opcode[0x%x] subopcode[0x%x]\n",
+ __func__, adm_cmd->cid, adm_cmd->qid,
+ cmd->usr_cmd.opcode, cmd->usr_cmd.info_0.subopcode);
+ WRITE_ONCE(adm_cmd->state, SPRAID_CMD_TIMEOUT);
+ spraid_put_cmd(hdev, adm_cmd, SPRAID_CMD_ADM);
+ return -EINVAL;
+ }
+
+ if (result0)
+ *result0 = adm_cmd->result0;
+ if (result1)
+ *result1 = adm_cmd->result1;
+
+ spraid_put_cmd(hdev, adm_cmd, SPRAID_CMD_ADM);
+
+ return adm_cmd->status;
+}
+
+static int spraid_create_cq(struct spraid_dev *hdev, u16 qid,
+ struct spraid_queue *spraidq, u16 cq_vector)
+{
+ struct spraid_admin_command admin_cmd;
+ int flags = SPRAID_QUEUE_PHYS_CONTIG | SPRAID_CQ_IRQ_ENABLED;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.create_cq.opcode = SPRAID_ADMIN_CREATE_CQ;
+ admin_cmd.create_cq.prp1 = cpu_to_le64(spraidq->cq_dma_addr);
+ admin_cmd.create_cq.cqid = cpu_to_le16(qid);
+ admin_cmd.create_cq.qsize = cpu_to_le16(spraidq->q_depth - 1);
+ admin_cmd.create_cq.cq_flags = cpu_to_le16(flags);
+ admin_cmd.create_cq.irq_vector = cpu_to_le16(cq_vector);
+
+ return spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+}
+
+static int spraid_create_sq(struct spraid_dev *hdev, u16 qid,
+ struct spraid_queue *spraidq)
+{
+ struct spraid_admin_command admin_cmd;
+ int flags = SPRAID_QUEUE_PHYS_CONTIG;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.create_sq.opcode = SPRAID_ADMIN_CREATE_SQ;
+ admin_cmd.create_sq.prp1 = cpu_to_le64(spraidq->sq_dma_addr);
+ admin_cmd.create_sq.sqid = cpu_to_le16(qid);
+ admin_cmd.create_sq.qsize = cpu_to_le16(spraidq->q_depth - 1);
+ admin_cmd.create_sq.sq_flags = cpu_to_le16(flags);
+ admin_cmd.create_sq.cqid = cpu_to_le16(qid);
+
+ return spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+}
+
+static void spraid_free_queue(struct spraid_queue *spraidq)
+{
+ struct spraid_dev *hdev = spraidq->hdev;
+
+ hdev->queue_count--;
+ dma_free_coherent(hdev->dev, CQ_SIZE(spraidq->q_depth),
+ (void *)spraidq->cqes, spraidq->cq_dma_addr);
+ dma_free_coherent(hdev->dev, SQ_SIZE(spraidq->qid, spraidq->q_depth),
+ spraidq->sq_cmds, spraidq->sq_dma_addr);
+ dma_free_coherent(hdev->dev, SENSE_SIZE(spraidq->q_depth),
+ spraidq->sense, spraidq->sense_dma_addr);
+}
+
+static void spraid_free_admin_queue(struct spraid_dev *hdev)
+{
+ spraid_free_queue(&hdev->queues[0]);
+}
+
+static void spraid_free_io_queues(struct spraid_dev *hdev)
+{
+ int i;
+
+ for (i = hdev->queue_count - 1; i >= 1; i--)
+ spraid_free_queue(&hdev->queues[i]);
+}
+
+static int spraid_delete_queue(struct spraid_dev *hdev, u8 op, u16 id)
+{
+ struct spraid_admin_command admin_cmd;
+ int ret;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.delete_queue.opcode = op;
+ admin_cmd.delete_queue.qid = cpu_to_le16(id);
+
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+
+ if (ret)
+ dev_err(hdev->dev, "Delete %s:[%d] failed\n",
+ (op == SPRAID_ADMIN_DELETE_CQ) ? "cq" : "sq", id);
+
+ return ret;
+}
+
+static int spraid_delete_cq(struct spraid_dev *hdev, u16 cqid)
+{
+ return spraid_delete_queue(hdev, SPRAID_ADMIN_DELETE_CQ, cqid);
+}
+
+static int spraid_delete_sq(struct spraid_dev *hdev, u16 sqid)
+{
+ return spraid_delete_queue(hdev, SPRAID_ADMIN_DELETE_SQ, sqid);
+}
+
+static int spraid_create_queue(struct spraid_queue *spraidq, u16 qid)
+{
+ struct spraid_dev *hdev = spraidq->hdev;
+ u16 cq_vector;
+ int ret;
+
+ cq_vector = (hdev->num_vecs == 1) ? 0 : qid;
+ ret = spraid_create_cq(hdev, qid, spraidq, cq_vector);
+ if (ret)
+ return ret;
+
+ ret = spraid_create_sq(hdev, qid, spraidq);
+ if (ret)
+ goto delete_cq;
+
+ spraid_init_queue(spraidq, qid);
+ spraidq->cq_vector = cq_vector;
+
+ ret = pci_request_irq(hdev->pdev, cq_vector, spraid_irq, NULL,
+ spraidq, "spraid%d_q%d", hdev->instance, qid);
+
+ if (ret) {
+ dev_err(hdev->dev, "Request queue[%d] irq failed\n", qid);
+ goto delete_sq;
+ }
+
+ return 0;
+
+delete_sq:
+ spraidq->cq_vector = -1;
+ hdev->online_queues--;
+ spraid_delete_sq(hdev, qid);
+delete_cq:
+ spraid_delete_cq(hdev, qid);
+
+ return ret;
+}
+
+static int spraid_create_io_queues(struct spraid_dev *hdev)
+{
+ u32 i, max;
+ int ret = 0;
+
+ max = min(hdev->max_qid, hdev->queue_count - 1);
+ for (i = hdev->online_queues; i <= max; i++) {
+ ret = spraid_create_queue(&hdev->queues[i], i);
+ if (ret) {
+ dev_err(hdev->dev, "Create queue[%d] failed\n", i);
+ break;
+ }
+ }
+
+ dev_info(hdev->dev, "[%s] queue_count[%d], online_queue[%d]",
+ __func__, hdev->queue_count, hdev->online_queues);
+
+ return ret >= 0 ? 0 : ret;
+}
+
+static int spraid_set_features(struct spraid_dev *hdev, u32 fid,
+ u32 dword11, void *buffer,
+ size_t buflen, u32 *result)
+{
+ struct spraid_admin_command admin_cmd;
+ int ret;
+ u8 *data_ptr = NULL;
+ dma_addr_t data_dma = 0;
+
+ if (buffer && buflen) {
+ data_ptr = dma_alloc_coherent(hdev->dev, buflen,
+ &data_dma, GFP_KERNEL);
+ if (!data_ptr)
+ return -ENOMEM;
+
+ memcpy(data_ptr, buffer, buflen);
+ }
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.features.opcode = SPRAID_ADMIN_SET_FEATURES;
+ admin_cmd.features.fid = cpu_to_le32(fid);
+ admin_cmd.features.dword11 = cpu_to_le32(dword11);
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
+
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, result, NULL, 0);
+
+ if (data_ptr)
+ dma_free_coherent(hdev->dev, buflen, data_ptr, data_dma);
+
+ return ret;
+}
+
+static int spraid_configure_timestamp(struct spraid_dev *hdev)
+{
+ __le64 ts;
+ int ret;
+
+ ts = cpu_to_le64(ktime_to_ms(ktime_get_real()));
+ ret = spraid_set_features(hdev, SPRAID_FEAT_TIMESTAMP,
+ 0, &ts, sizeof(ts), NULL);
+
+ if (ret)
+ dev_err(hdev->dev, "set timestamp failed: %d\n", ret);
+ return ret;
+}
+
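+/*
+ * SPRAID_FEAT_NUM_QUEUES carries zero-based queue counts packed into
+ * both 16-bit halves of dword11; the controller grants SQ/CQ counts
+ * the same way, so the usable count is the minimum of the two halves.
+ */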
+static int spraid_set_queue_cnt(struct spraid_dev *hdev, u32 *cnt)
+{
+ u32 q_cnt = (*cnt - 1) | ((*cnt - 1) << 16);
+ u32 nr_ioqs, result;
+ int status;
+
+ status = spraid_set_features(hdev, SPRAID_FEAT_NUM_QUEUES,
+ q_cnt, NULL, 0, &result);
+ if (status) {
+ dev_err(hdev->dev, "Set queue count failed, status: %d\n",
+ status);
+ return -EIO;
+ }
+
+ nr_ioqs = min(result & 0xffff, result >> 16) + 1;
+ *cnt = min(*cnt, nr_ioqs);
+ if (*cnt == 0) {
+ dev_err(hdev->dev, "Illegal queue count: zero\n");
+ return -EIO;
+ }
+ return 0;
+}
+
+static int spraid_setup_io_queues(struct spraid_dev *hdev)
+{
+ struct spraid_queue *adminq = &hdev->queues[0];
+ struct pci_dev *pdev = hdev->pdev;
+ u32 nr_ioqs = num_online_cpus();
+ u32 i, size;
+ int ret;
+
+ struct irq_affinity affd = {
+ .pre_vectors = 1
+ };
+
+ ret = spraid_set_queue_cnt(hdev, &nr_ioqs);
+ if (ret < 0)
+ return ret;
+
+ size = spraid_bar_size(hdev, nr_ioqs);
+ ret = spraid_remap_bar(hdev, size);
+ if (ret)
+ return -ENOMEM;
+
+ adminq->q_db = hdev->dbs;
+
+ pci_free_irq(pdev, 0, adminq);
+ pci_free_irq_vectors(pdev);
+
+ ret = pci_alloc_irq_vectors_affinity(pdev, 1, (nr_ioqs + 1),
+ PCI_IRQ_ALL_TYPES | PCI_IRQ_AFFINITY, &affd);
+ if (ret <= 0)
+ return -EIO;
+
+ hdev->num_vecs = ret;
+
+ hdev->max_qid = max(ret - 1, 1);
+
+ ret = pci_request_irq(pdev, adminq->cq_vector, spraid_irq, NULL,
+ adminq, "spraid%d_q%d",
+ hdev->instance, adminq->qid);
+ if (ret) {
+ dev_err(hdev->dev, "Request admin irq failed\n");
+ adminq->cq_vector = -1;
+ return ret;
+ }
+
+ for (i = hdev->queue_count; i <= hdev->max_qid; i++) {
+ ret = spraid_alloc_queue(hdev, i, hdev->ioq_depth);
+ if (ret)
+ break;
+ }
+ dev_info(hdev->dev, "[%s] max_qid: %d, queue_count: %d;"
+ " online_queue: %d, ioq_depth: %d\n",
+ __func__, hdev->max_qid, hdev->queue_count,
+ hdev->online_queues, hdev->ioq_depth);
+
+ return spraid_create_io_queues(hdev);
+}
+
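+/*
+ * Tear down I/O queues in two passes, all SQs first and then all CQs,
+ * mirroring the bring-up order in which each CQ is created before its
+ * SQ.
+ */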
+static void spraid_delete_io_queues(struct spraid_dev *hdev)
+{
+ u16 queues = hdev->online_queues - 1;
+ u8 opcode = SPRAID_ADMIN_DELETE_SQ;
+ u16 i, pass;
+
+ if (!pci_device_is_present(hdev->pdev)) {
+ dev_err(hdev->dev,
+ "pci_device is not present, skip disable io queues\n");
+ return;
+ }
+
+ if (hdev->online_queues < 2) {
+ dev_err(hdev->dev, "[%s] err, io queue has been delete\n",
+ __func__);
+ return;
+ }
+
+ for (pass = 0; pass < 2; pass++) {
+ for (i = queues; i > 0; i--)
+ if (spraid_delete_queue(hdev, opcode, i))
+ break;
+
+ opcode = SPRAID_ADMIN_DELETE_CQ;
+ }
+}
+
+static void spraid_remove_io_queues(struct spraid_dev *hdev)
+{
+ spraid_delete_io_queues(hdev);
+ spraid_free_io_queues(hdev);
+}
+
+static void spraid_pci_disable(struct spraid_dev *hdev)
+{
+ struct pci_dev *pdev = hdev->pdev;
+ u32 i;
+
+ for (i = 0; i < hdev->online_queues; i++)
+ pci_free_irq(pdev, hdev->queues[i].cq_vector, &hdev->queues[i]);
+ pci_free_irq_vectors(pdev);
+ if (pci_is_enabled(pdev)) {
+ pci_disable_pcie_error_reporting(pdev);
+ pci_disable_device(pdev);
+ }
+ hdev->online_queues = 0;
+}
+
+static void spraid_disable_admin_queue(struct spraid_dev *hdev, bool shutdown)
+{
+ struct spraid_queue *adminq = &hdev->queues[0];
+ u16 start, end;
+
+ if (pci_device_is_present(hdev->pdev)) {
+ if (shutdown)
+ spraid_shutdown_ctrl(hdev);
+ else
+ spraid_disable_ctrl(hdev);
+ }
+
+ if (hdev->queue_count == 0) {
+ dev_err(hdev->dev, "[%s] err, admin queue has been delete\n",
+ __func__);
+ return;
+ }
+
+ spin_lock_irq(&adminq->cq_lock);
+ spraid_process_cq(adminq, &start, &end, -1);
+ spin_unlock_irq(&adminq->cq_lock);
+
+ spraid_complete_cqes(adminq, start, end);
+ spraid_free_admin_queue(hdev);
+}
+
+static int spraid_create_dma_pools(struct spraid_dev *hdev)
+{
+ int i;
+ char poolname[20] = { 0 };
+
+ hdev->prp_page_pool = dma_pool_create("prp list page", hdev->dev,
+ PAGE_SIZE, PAGE_SIZE, 0);
+
+ if (!hdev->prp_page_pool) {
+ dev_err(hdev->dev, "create prp_page_pool failed\n");
+ return -ENOMEM;
+ }
+
+ for (i = 0; i < small_pool_num; i++) {
+ sprintf(poolname, "prp_list_256_%d", i);
+ hdev->prp_small_pool[i] =
+ dma_pool_create(poolname, hdev->dev, SMALL_POOL_SIZE,
+ SMALL_POOL_SIZE, 0);
+
+ if (!hdev->prp_small_pool[i]) {
+ dev_err(hdev->dev, "create prp_small_pool %d failed\n",
+ i);
+ goto destroy_prp_small_pool;
+ }
+ }
+
+ return 0;
+
+destroy_prp_small_pool:
+ while (i > 0)
+ dma_pool_destroy(hdev->prp_small_pool[--i]);
+ dma_pool_destroy(hdev->prp_page_pool);
+
+ return -ENOMEM;
+}
+
+static void spraid_destroy_dma_pools(struct spraid_dev *hdev)
+{
+ int i;
+
+ for (i = 0; i < small_pool_num; i++)
+ dma_pool_destroy(hdev->prp_small_pool[i]);
+ dma_pool_destroy(hdev->prp_page_pool);
+}
+
+static int spraid_get_dev_list(struct spraid_dev *hdev,
+ struct spraid_dev_info *devices)
+{
+ u32 nd = le32_to_cpu(hdev->ctrl_info->nd);
+ struct spraid_admin_command admin_cmd;
+ struct spraid_dev_list *list_buf;
+ dma_addr_t data_dma = 0;
+ u32 i, idx, hdid, ndev;
+ int ret = 0;
+
+ list_buf = dma_alloc_coherent(hdev->dev, PAGE_SIZE,
+ &data_dma, GFP_KERNEL);
+ if (!list_buf)
+ return -ENOMEM;
+
+ for (idx = 0; idx < nd;) {
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.get_info.opcode = SPRAID_ADMIN_GET_INFO;
+ admin_cmd.get_info.type = SPRAID_GET_INFO_DEV_LIST;
+ admin_cmd.get_info.cdw11 = cpu_to_le32(idx);
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
+
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd,
+ NULL, NULL, 0);
+
+ if (ret) {
+ dev_err(hdev->dev, "Get device list failed, nd: %u;"
+ "idx: %u, ret: %d\n",
+ nd, idx, ret);
+ goto out;
+ }
+ ndev = le32_to_cpu(list_buf->dev_num);
+
+ dev_info(hdev->dev, "ndev numbers: %u\n", ndev);
+
+ for (i = 0; i < ndev; i++) {
+ hdid = le32_to_cpu(list_buf->devices[i].hdid);
+ dev_info(hdev->dev, "list_buf->devices[%d], hdid: %u;"
+ "target: %d, channel: %d, lun: %d, attr[0x%x]\n",
+ i, hdid,
+ le16_to_cpu(list_buf->devices[i].target),
+ list_buf->devices[i].channel,
+ list_buf->devices[i].lun,
+ list_buf->devices[i].attr);
+ if (hdid > nd || hdid == 0) {
+ dev_err(hdev->dev, "err, hdid[%d] invalid\n",
+ hdid);
+ continue;
+ }
+ memcpy(&devices[hdid - 1], &list_buf->devices[i],
+ sizeof(struct spraid_dev_info));
+ }
+ idx += ndev;
+
+ if (idx < MAX_DEV_ENTRY_PER_PAGE_4K)
+ break;
+ }
+
+out:
+ dma_free_coherent(hdev->dev, PAGE_SIZE, list_buf, data_dma);
+ return ret;
+}
+
+static void spraid_send_aen(struct spraid_dev *hdev, u16 cid)
+{
+ struct spraid_queue *adminq = &hdev->queues[0];
+ struct spraid_admin_command admin_cmd;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.common.opcode = SPRAID_ADMIN_ASYNC_EVENT;
+ admin_cmd.common.command_id = cid;
+
+ spraid_submit_cmd(adminq, &admin_cmd);
+ dev_info(hdev->dev, "send aen, cid[%d]\n", cid);
+}
+
+static inline void spraid_send_all_aen(struct spraid_dev *hdev)
+{
+ u16 i;
+
+ for (i = 0; i < hdev->ctrl_info->aerl; i++)
+ spraid_send_aen(hdev, i + SPRAID_AQ_BLK_MQ_DEPTH);
+}
+
+static int spraid_add_device(struct spraid_dev *hdev,
+ struct spraid_dev_info *device)
+{
+ struct Scsi_Host *shost = hdev->shost;
+ struct scsi_device *sdev;
+
+ dev_info(hdev->dev, "add device, hdid: %u target: %d, channel: %d;"
+ " lun: %d, attr[0x%x]\n",
+ le32_to_cpu(device->hdid), le16_to_cpu(device->target),
+ device->channel, device->lun, device->attr);
+
+ sdev = scsi_device_lookup(shost, device->channel,
+ le16_to_cpu(device->target), 0);
+ if (sdev) {
+ dev_warn(hdev->dev, "Device is already exist, channel: %d;"
+ " target_id: %d, lun: %d\n",
+ device->channel, le16_to_cpu(device->target), 0);
+ scsi_device_put(sdev);
+ return -EEXIST;
+ }
+ scsi_add_device(shost, device->channel, le16_to_cpu(device->target), 0);
+ return 0;
+}
+
+static int spraid_rescan_device(struct spraid_dev *hdev,
+ struct spraid_dev_info *device)
+{
+ struct Scsi_Host *shost = hdev->shost;
+ struct scsi_device *sdev;
+
+ dev_info(hdev->dev, "rescan device, hdid: %u target: %d, channel: %d;"
+ " lun: %d, attr[0x%x]\n",
+ le32_to_cpu(device->hdid), le16_to_cpu(device->target),
+ device->channel, device->lun, device->attr);
+
+ sdev = scsi_device_lookup(shost, device->channel,
+ le16_to_cpu(device->target), 0);
+ if (!sdev) {
+ dev_warn(hdev->dev, "device is not exit rescan it, channel: %d;"
+ " target_id: %d, lun: %d\n",
+ device->channel, le16_to_cpu(device->target), 0);
+ return -ENODEV;
+ }
+
+ scsi_rescan_device(&sdev->sdev_gendev);
+ scsi_device_put(sdev);
+ return 0;
+}
+
+static int spraid_remove_device(struct spraid_dev *hdev,
+ struct spraid_dev_info *org_device)
+{
+ struct Scsi_Host *shost = hdev->shost;
+ struct scsi_device *sdev;
+
+ dev_info(hdev->dev, "remove device, hdid: %u target: %d, channel: %d;"
+ " lun: %d, attr[0x%x]\n",
+ le32_to_cpu(org_device->hdid), le16_to_cpu(org_device->target),
+ org_device->channel, org_device->lun, org_device->attr);
+
+ sdev = scsi_device_lookup(shost, org_device->channel,
+ le16_to_cpu(org_device->target), 0);
+ if (!sdev) {
+ dev_warn(hdev->dev, "device is not exit remove it, channel: %d;"
+ " target_id: %d, lun: %d\n",
+ org_device->channel,
+ le16_to_cpu(org_device->target), 0);
+ return -ENODEV;
+ }
+
+ scsi_remove_device(sdev);
+ scsi_device_put(sdev);
+ return 0;
+}
+
+static int spraid_dev_list_init(struct spraid_dev *hdev)
+{
+ u32 nd = le32_to_cpu(hdev->ctrl_info->nd);
+ int i, ret;
+
+ hdev->devices = kzalloc_node(nd * sizeof(struct spraid_dev_info),
+ GFP_KERNEL, hdev->numa_node);
+ if (!hdev->devices)
+ return -ENOMEM;
+
+ ret = spraid_get_dev_list(hdev, hdev->devices);
+ if (ret) {
+ dev_err(hdev->dev,
+ "Ignore failure of getting device list;"
+ " within initialization\n");
+ return 0;
+ }
+
+ for (i = 0; i < nd; i++) {
+ if (SPRAID_DEV_INFO_FLAG_VALID(hdev->devices[i].flag) &&
+ SPRAID_DEV_INFO_ATTR_BOOT(hdev->devices[i].attr)) {
+ spraid_add_device(hdev, &hdev->devices[i]);
+ break;
+ }
+ }
+ return 0;
+}
+
+static int luntarget_cmp_func(const void *l, const void *r)
+{
+ const struct spraid_dev_info *ln = l;
+ const struct spraid_dev_info *rn = r;
+
+ if (ln->channel == rn->channel)
+ return le16_to_cpu(ln->target) - le16_to_cpu(rn->target);
+
+ return ln->channel - rn->channel;
+}
+
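+/*
+ * AEN-driven rescan: fetch a fresh device list and diff it against the
+ * cached copy. Newly valid entries are collected, sorted by (channel,
+ * target) and added; changed entries are rescanned; entries that went
+ * invalid have their valid bit cleared and are removed from the host.
+ */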
+static void spraid_scan_work(struct work_struct *work)
+{
+ struct spraid_dev *hdev =
+ container_of(work, struct spraid_dev, scan_work);
+ struct spraid_dev_info *devices, *org_devices;
+ struct spraid_dev_info *sortdevice;
+ u32 nd = le32_to_cpu(hdev->ctrl_info->nd);
+ u8 flag, org_flag;
+ int i, ret;
+ int count = 0;
+
+ devices = kcalloc(nd, sizeof(struct spraid_dev_info), GFP_KERNEL);
+ if (!devices)
+ return;
+
+ sortdevice = kcalloc(nd, sizeof(struct spraid_dev_info), GFP_KERNEL);
+ if (!sortdevice)
+ goto free_list;
+
+ ret = spraid_get_dev_list(hdev, devices);
+ if (ret)
+ goto free_all;
+ org_devices = hdev->devices;
+ for (i = 0; i < nd; i++) {
+ org_flag = org_devices[i].flag;
+ flag = devices[i].flag;
+
+ dev_log_dbg(hdev->dev, "i: %d, org_flag: 0x%x, flag: 0x%x\n",
+ i, org_flag, flag);
+
+ if (SPRAID_DEV_INFO_FLAG_VALID(flag)) {
+ if (!SPRAID_DEV_INFO_FLAG_VALID(org_flag)) {
+ down_write(&hdev->devices_rwsem);
+ memcpy(&org_devices[i], &devices[i],
+ sizeof(struct spraid_dev_info));
+ memcpy(&sortdevice[count++], &devices[i],
+ sizeof(struct spraid_dev_info));
+ up_write(&hdev->devices_rwsem);
+ } else if (SPRAID_DEV_INFO_FLAG_CHANGE(flag)) {
+ spraid_rescan_device(hdev, &devices[i]);
+ }
+ } else {
+ if (SPRAID_DEV_INFO_FLAG_VALID(org_flag)) {
+ down_write(&hdev->devices_rwsem);
+ org_devices[i].flag &= 0xfe;
+ up_write(&hdev->devices_rwsem);
+ spraid_remove_device(hdev, &org_devices[i]);
+ }
+ }
+ }
+
+ dev_info(hdev->dev, "scan work add device count = %d\n", count);
+
+ sort(sortdevice, count, sizeof(sortdevice[0]),
+ luntarget_cmp_func, NULL);
+
+ for (i = 0; i < count; i++)
+ spraid_add_device(hdev, &sortdevice[i]);
+
+free_all:
+ kfree(sortdevice);
+free_list:
+ kfree(devices);
+}
+
+static void spraid_timesyn_work(struct work_struct *work)
+{
+ struct spraid_dev *hdev =
+ container_of(work, struct spraid_dev, timesyn_work);
+
+ spraid_configure_timestamp(hdev);
+}
+
+static int spraid_init_ctrl_info(struct spraid_dev *hdev);
+static void spraid_fw_act_work(struct work_struct *work)
+{
+ struct spraid_dev *hdev =
+ container_of(work, struct spraid_dev, fw_act_work);
+
+ if (spraid_init_ctrl_info(hdev))
+ dev_err(hdev->dev, "get ctrl info failed after fw act\n");
+}
+
+static void spraid_queue_scan(struct spraid_dev *hdev)
+{
+ queue_work(spraid_wq, &hdev->scan_work);
+}
+
+static void spraid_handle_aen_notice(struct spraid_dev *hdev, u32 result)
+{
+ switch ((result & 0xff00) >> 8) {
+ case SPRAID_AEN_DEV_CHANGED:
+ spraid_queue_scan(hdev);
+ break;
+ case SPRAID_AEN_FW_ACT_START:
+ dev_info(hdev->dev, "fw activation starting\n");
+ break;
+ case SPRAID_AEN_HOST_PROBING:
+ break;
+ default:
+ dev_warn(hdev->dev, "async event result %08x\n", result);
+ }
+}
+
+static void spraid_handle_aen_vs(struct spraid_dev *hdev,
+ u32 result, u32 result1)
+{
+ switch ((result & 0xff00) >> 8) {
+ case SPRAID_AEN_TIMESYN:
+ queue_work(spraid_wq, &hdev->timesyn_work);
+ break;
+ case SPRAID_AEN_FW_ACT_FINISH:
+ dev_info(hdev->dev, "fw activation finish\n");
+ queue_work(spraid_wq, &hdev->fw_act_work);
+ break;
+ case SPRAID_AEN_EVENT_MIN ... SPRAID_AEN_EVENT_MAX:
+ dev_info(hdev->dev, "rcv card event[%d];"
+ " param1[0x%x] param2[0x%x]\n",
+ (result & 0xff00) >> 8, result, result1);
+ break;
+ default:
+ dev_warn(hdev->dev, "async event result: 0x%x\n", result);
+ }
+}
+
+static int spraid_alloc_resources(struct spraid_dev *hdev)
+{
+ int ret, nqueue;
+
+ ret = ida_alloc(&spraid_instance_ida, GFP_KERNEL);
+ if (ret < 0) {
+ dev_err(hdev->dev, "Get instance id failed\n");
+ return ret;
+ }
+ hdev->instance = ret;
+
+ hdev->ctrl_info = kzalloc_node(sizeof(*hdev->ctrl_info),
+ GFP_KERNEL, hdev->numa_node);
+ if (!hdev->ctrl_info) {
+ ret = -ENOMEM;
+ goto release_instance;
+ }
+
+ ret = spraid_create_dma_pools(hdev);
+ if (ret)
+ goto free_ctrl_info;
+ nqueue = num_possible_cpus() + 1;
+ hdev->queues = kcalloc_node(nqueue, sizeof(struct spraid_queue),
+ GFP_KERNEL, hdev->numa_node);
+ if (!hdev->queues) {
+ ret = -ENOMEM;
+ goto destroy_dma_pools;
+ }
+
+ ret = spraid_alloc_admin_cmds(hdev);
+ if (ret)
+ goto free_queues;
+
+ dev_info(hdev->dev, "[%s] queues num: %d\n", __func__, nqueue);
+
+ return 0;
+
+free_queues:
+ kfree(hdev->queues);
+destroy_dma_pools:
+ spraid_destroy_dma_pools(hdev);
+free_ctrl_info:
+ kfree(hdev->ctrl_info);
+release_instance:
+ ida_free(&spraid_instance_ida, hdev->instance);
+ return ret;
+}
+
+static void spraid_free_resources(struct spraid_dev *hdev)
+{
+ spraid_free_admin_cmds(hdev);
+ kfree(hdev->queues);
+ spraid_destroy_dma_pools(hdev);
+ kfree(hdev->ctrl_info);
+ ida_free(&spraid_instance_ida, hdev->instance);
+}
+
+static void spraid_bsg_unmap_data(struct spraid_dev *hdev, struct bsg_job *job)
+{
+ struct request *rq = blk_mq_rq_from_pdu(job);
+ struct spraid_iod *iod = job->dd_data;
+ enum dma_data_direction dma_dir =
+ rq_data_dir(rq) ? DMA_TO_DEVICE : DMA_FROM_DEVICE;
+
+ if (iod->nsge)
+ dma_unmap_sg(hdev->dev, iod->sg, iod->nsge, dma_dir);
+
+ spraid_free_iod_res(hdev, iod);
+}
+
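+/*
+ * DMA-map the bsg payload sg list and build the PRP chain, filling in
+ * the command's prp1/prp2 data pointers.
+ */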
+static int spraid_bsg_map_data(struct spraid_dev *hdev, struct bsg_job *job,
+ struct spraid_admin_command *cmd)
+{
+ struct request *rq = blk_mq_rq_from_pdu(job);
+ struct spraid_iod *iod = job->dd_data;
+ enum dma_data_direction dma_dir =
+ rq_data_dir(rq) ? DMA_TO_DEVICE : DMA_FROM_DEVICE;
+ int ret = 0;
+
+ iod->sg = job->request_payload.sg_list;
+ iod->nsge = job->request_payload.sg_cnt;
+ iod->length = job->request_payload.payload_len;
+ iod->use_sgl = false;
+ iod->npages = -1;
+ iod->sg_drv_mgmt = false;
+
+ if (!iod->nsge)
+ goto out;
+
+ ret = dma_map_sg_attrs(hdev->dev, iod->sg, iod->nsge,
+ dma_dir, DMA_ATTR_NO_WARN);
+ if (!ret) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ ret = spraid_setup_prps(hdev, iod);
+ if (ret)
+ goto unmap;
+
+ cmd->common.dptr.prp1 = cpu_to_le64(sg_dma_address(iod->sg));
+ cmd->common.dptr.prp2 = cpu_to_le64(iod->first_dma);
+
+ return 0;
+
+unmap:
+ dma_unmap_sg(hdev->dev, iod->sg, iod->nsge, dma_dir);
+out:
+ return ret;
+}
+
+static int spraid_get_ctrl_info(struct spraid_dev *hdev,
+ struct spraid_ctrl_info *ctrl_info)
+{
+ struct spraid_admin_command admin_cmd;
+ u8 *data_ptr = NULL;
+ dma_addr_t data_dma = 0;
+ int ret;
+
+ data_ptr = dma_alloc_coherent(hdev->dev, PAGE_SIZE,
+ &data_dma, GFP_KERNEL);
+ if (!data_ptr)
+ return -ENOMEM;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.get_info.opcode = SPRAID_ADMIN_GET_INFO;
+ admin_cmd.get_info.type = SPRAID_GET_INFO_CTRL;
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
+
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+ if (!ret)
+ memcpy(ctrl_info, data_ptr, sizeof(struct spraid_ctrl_info));
+
+ dma_free_coherent(hdev->dev, PAGE_SIZE, data_ptr, data_dma);
+
+ return ret;
+}
+
+static int spraid_init_ctrl_info(struct spraid_dev *hdev)
+{
+ int ret;
+
+ hdev->ctrl_info->nd = cpu_to_le32(240);
+ hdev->ctrl_info->mdts = 8;
+ hdev->ctrl_info->max_cmds = cpu_to_le16(4096);
+ hdev->ctrl_info->max_num_sge = cpu_to_le16(128);
+ hdev->ctrl_info->max_channel = cpu_to_le16(4);
+ hdev->ctrl_info->max_tgt_id = cpu_to_le32(3239);
+ hdev->ctrl_info->max_lun = cpu_to_le16(2);
+
+ ret = spraid_get_ctrl_info(hdev, hdev->ctrl_info);
+ if (ret)
+ dev_err(hdev->dev, "get controller info failed: %d\n", ret);
+
+ dev_info(hdev->dev, "[%s]nd = %d\n", __func__, hdev->ctrl_info->nd);
+ dev_info(hdev->dev, "[%s]max_cmd = %d\n",
+ __func__, hdev->ctrl_info->max_cmds);
+ dev_info(hdev->dev, "[%s]max_channel = %d\n",
+ __func__, hdev->ctrl_info->max_channel);
+ dev_info(hdev->dev, "[%s]max_tgt_id = %d\n",
+ __func__, hdev->ctrl_info->max_tgt_id);
+ dev_info(hdev->dev, "[%s]max_lun = %d\n",
+ __func__, hdev->ctrl_info->max_lun);
+ dev_info(hdev->dev, "[%s]max_num_sge = %d\n",
+ __func__, hdev->ctrl_info->max_num_sge);
+ dev_info(hdev->dev, "[%s]lun_num_boot = %d\n",
+ __func__, hdev->ctrl_info->lun_num_in_boot);
+ dev_info(hdev->dev, "[%s]mdts = %d\n", __func__, hdev->ctrl_info->mdts);
+ dev_info(hdev->dev, "[%s]acl = %d\n", __func__, hdev->ctrl_info->acl);
+ dev_info(hdev->dev, "[%s]aer1 = %d\n", __func__, hdev->ctrl_info->aerl);
+ dev_info(hdev->dev, "[%s]card_type = %d\n",
+ __func__, hdev->ctrl_info->card_type);
+ dev_info(hdev->dev, "[%s]rtd3e = %d\n",
+ __func__, hdev->ctrl_info->rtd3e);
+ dev_info(hdev->dev, "[%s]sn = %s\n", __func__, hdev->ctrl_info->sn);
+ dev_info(hdev->dev, "[%s]fr = %s\n", __func__, hdev->ctrl_info->fr);
+
+ if (!hdev->ctrl_info->aerl)
+ hdev->ctrl_info->aerl = 1;
+ if (hdev->ctrl_info->aerl > SPRAID_NR_AEN_COMMANDS)
+ hdev->ctrl_info->aerl = SPRAID_NR_AEN_COMMANDS;
+
+ return 0;
+}
+
+#define SPRAID_MAX_ADMIN_PAYLOAD_SIZE BIT(16)
+static int spraid_alloc_iod_ext_mem_pool(struct spraid_dev *hdev)
+{
+ u16 max_sge = le16_to_cpu(hdev->ctrl_info->max_num_sge);
+ size_t alloc_size;
+
+ alloc_size = spraid_iod_ext_size(hdev, SPRAID_MAX_ADMIN_PAYLOAD_SIZE,
+ max_sge, true, false);
+ if (alloc_size > PAGE_SIZE)
+ dev_warn(hdev->dev, "It is unreasonable ;"
+ " sg allocation more than one page\n");
+ hdev->iod_mempool = mempool_create_node(1, mempool_kmalloc,
+ mempool_kfree,
+ (void *)alloc_size, GFP_KERNEL,
+ hdev->numa_node);
+ if (!hdev->iod_mempool) {
+ dev_err(hdev->dev, "Create iod extension memory pool failed\n");
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
+static void spraid_free_iod_ext_mem_pool(struct spraid_dev *hdev)
+{
+ mempool_destroy(hdev->iod_mempool);
+}
+
+static int spraid_user_admin_cmd(struct spraid_dev *hdev, struct bsg_job *job)
+{
+ struct spraid_bsg_request *bsg_req = job->request;
+ struct spraid_passthru_common_cmd *cmd = &(bsg_req->admcmd);
+ struct spraid_admin_command admin_cmd;
+ u32 timeout = msecs_to_jiffies(cmd->timeout_ms);
+ u32 result[2] = {0};
+ int status;
+
+ if (hdev->state >= SPRAID_RESETTING) {
+ dev_err(hdev->dev, "[%s] err, host state:[%d] is not right\n",
+ __func__, hdev->state);
+ return -EBUSY;
+ }
+
+ dev_info(hdev->dev, "[%s] opcode[0x%x] subopcode[0x%x] init\n",
+ __func__, cmd->opcode, cmd->info_0.subopcode);
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.common.opcode = cmd->opcode;
+ admin_cmd.common.flags = cmd->flags;
+ admin_cmd.common.hdid = cpu_to_le32(cmd->nsid);
+ admin_cmd.common.cdw2[0] = cpu_to_le32(cmd->cdw2);
+ admin_cmd.common.cdw2[1] = cpu_to_le32(cmd->cdw3);
+ admin_cmd.common.cdw10 = cpu_to_le32(cmd->cdw10);
+ admin_cmd.common.cdw11 = cpu_to_le32(cmd->cdw11);
+ admin_cmd.common.cdw12 = cpu_to_le32(cmd->cdw12);
+ admin_cmd.common.cdw13 = cpu_to_le32(cmd->cdw13);
+ admin_cmd.common.cdw14 = cpu_to_le32(cmd->cdw14);
+ admin_cmd.common.cdw15 = cpu_to_le32(cmd->cdw15);
+
+ status = spraid_bsg_map_data(hdev, job, &admin_cmd);
+ if (status) {
+ dev_err(hdev->dev, "[%s] err, map data failed\n", __func__);
+ return status;
+ }
+
+ status = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, &result[0],
+ &result[1], timeout);
+ if (status >= 0) {
+ job->reply_len = sizeof(result);
+ memcpy(job->reply, result, sizeof(result));
+ }
+
+ dev_info(hdev->dev, "[%s] opcode[0x%x] subopcode[0x%x];"
+ " status[0x%x] result0[0x%x] result1[0x%x]\n",
+ __func__, cmd->opcode, cmd->info_0.subopcode,
+ status, result[0], result[1]);
+
+ spraid_bsg_unmap_data(hdev, job);
+
+ return status;
+}
+
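+/*
+ * Passthrough commands use cids at or above SPRAID_IO_BLK_MQ_DEPTH so
+ * their completions can be told apart from blk-mq driven commands in
+ * the CQE dispatch path; SPRAID_PTCMDS_PERQ cids are reserved per
+ * I/O queue.
+ */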
+static int spraid_alloc_ioq_ptcmds(struct spraid_dev *hdev)
+{
+ int i;
+ int ptnum = SPRAID_NR_IOQ_PTCMDS;
+
+ INIT_LIST_HEAD(&hdev->ioq_pt_list);
+ spin_lock_init(&hdev->ioq_pt_lock);
+
+ hdev->ioq_ptcmds = kcalloc_node(ptnum, sizeof(struct spraid_cmd),
+ GFP_KERNEL, hdev->numa_node);
+
+ if (!hdev->ioq_ptcmds) {
+ dev_err(hdev->dev, "Alloc ioq_ptcmds failed\n");
+ return -ENOMEM;
+ }
+
+ for (i = 0; i < ptnum; i++) {
+ hdev->ioq_ptcmds[i].qid = i / SPRAID_PTCMDS_PERQ + 1;
+ hdev->ioq_ptcmds[i].cid = i % SPRAID_PTCMDS_PERQ
+ + SPRAID_IO_BLK_MQ_DEPTH;
+ list_add_tail(&(hdev->ioq_ptcmds[i].list), &hdev->ioq_pt_list);
+ }
+
+ dev_info(hdev->dev, "Alloc ioq_ptcmds success, ptnum[%d]\n", ptnum);
+
+ return 0;
+}
+
+static void spraid_free_ioq_ptcmds(struct spraid_dev *hdev)
+{
+ kfree(hdev->ioq_ptcmds);
+ hdev->ioq_ptcmds = NULL;
+
+ INIT_LIST_HEAD(&hdev->ioq_pt_list);
+}
+
+static int spraid_submit_ioq_sync_cmd(struct spraid_dev *hdev,
+ struct spraid_ioq_command *cmd,
+ u32 *result, u32 *reslen, u32 timeout)
+{
+ int ret;
+ dma_addr_t sense_dma;
+ struct spraid_queue *ioq;
+ void *sense_addr = NULL;
+ struct spraid_cmd *pt_cmd = spraid_get_cmd(hdev, SPRAID_CMD_IOPT);
+
+ if (!pt_cmd) {
+ dev_err(hdev->dev, "err, get ioq cmd failed\n");
+ return -EFAULT;
+ }
+
+ timeout = timeout ? timeout : ADMIN_TIMEOUT;
+
+ init_completion(&pt_cmd->cmd_done);
+
+ ioq = &hdev->queues[pt_cmd->qid];
+ ret = pt_cmd->cid * SCSI_SENSE_BUFFERSIZE;
+ sense_addr = ioq->sense + ret;
+ sense_dma = ioq->sense_dma_addr + ret;
+
+ cmd->common.sense_addr = cpu_to_le64(sense_dma);
+ cmd->common.sense_len = cpu_to_le16(SCSI_SENSE_BUFFERSIZE);
+ cmd->common.command_id = pt_cmd->cid;
+
+ spraid_submit_cmd(ioq, cmd);
+
+ if (!wait_for_completion_timeout(&pt_cmd->cmd_done, timeout)) {
+ dev_err(hdev->dev, "[%s] cid[%d] qid[%d] timeout;"
+ " opcode[0x%x] subopcode[0x%x]\n",
+ __func__, pt_cmd->cid, pt_cmd->qid, cmd->common.opcode,
+ (le32_to_cpu(cmd->common.cdw3[0]) & 0xffff));
+ WRITE_ONCE(pt_cmd->state, SPRAID_CMD_TIMEOUT);
+ spraid_put_cmd(hdev, pt_cmd, SPRAID_CMD_IOPT);
+ return -EINVAL;
+ }
+
+ if (result && reslen) {
+ if ((pt_cmd->status & 0x17f) == 0x101) {
+ memcpy(result, sense_addr, SCSI_SENSE_BUFFERSIZE);
+ *reslen = SCSI_SENSE_BUFFERSIZE;
+ }
+ }
+
+ spraid_put_cmd(hdev, pt_cmd, SPRAID_CMD_IOPT);
+
+ return pt_cmd->status;
+}
+
+static int spraid_user_ioq_cmd(struct spraid_dev *hdev, struct bsg_job *job)
+{
+ struct spraid_bsg_request *bsg_req =
+ (struct spraid_bsg_request *)(job->request);
+ struct spraid_ioq_passthru_cmd *cmd = &(bsg_req->ioqcmd);
+ struct spraid_ioq_command ioq_cmd;
+ int status = 0;
+ u32 timeout = msecs_to_jiffies(cmd->timeout_ms);
+
+ if (cmd->data_len > PAGE_SIZE) {
+ dev_err(hdev->dev, "[%s] data len bigger than 4k\n", __func__);
+ return -EFAULT;
+ }
+
+ if (hdev->state != SPRAID_LIVE) {
+ dev_err(hdev->dev, "[%s] err, host state:[%d] is not live\n",
+ __func__, hdev->state);
+ return -EBUSY;
+ }
+
+ dev_info(hdev->dev, "[%s] opcode[0x%x] subopcode[0x%x] init;"
+ " datalen[%d]\n",
+ __func__, cmd->opcode, cmd->info_1.subopcode, cmd->data_len);
+
+ memset(&ioq_cmd, 0, sizeof(ioq_cmd));
+ ioq_cmd.common.opcode = cmd->opcode;
+ ioq_cmd.common.flags = cmd->flags;
+ ioq_cmd.common.hdid = cpu_to_le32(cmd->nsid);
+ ioq_cmd.common.sense_len = cpu_to_le16(cmd->info_0.res_sense_len);
+ ioq_cmd.common.cdb_len = cmd->info_0.cdb_len;
+ ioq_cmd.common.rsvd2 = cmd->info_0.rsvd0;
+ ioq_cmd.common.cdw3[0] = cpu_to_le32(cmd->cdw3);
+ ioq_cmd.common.cdw3[1] = cpu_to_le32(cmd->cdw4);
+ ioq_cmd.common.cdw3[2] = cpu_to_le32(cmd->cdw5);
+
+ ioq_cmd.common.cdw10[0] = cpu_to_le32(cmd->cdw10);
+ ioq_cmd.common.cdw10[1] = cpu_to_le32(cmd->cdw11);
+ ioq_cmd.common.cdw10[2] = cpu_to_le32(cmd->cdw12);
+ ioq_cmd.common.cdw10[3] = cpu_to_le32(cmd->cdw13);
+ ioq_cmd.common.cdw10[4] = cpu_to_le32(cmd->cdw14);
+ ioq_cmd.common.cdw10[5] = cpu_to_le32(cmd->data_len);
+
+ memcpy(ioq_cmd.common.cdb, &cmd->cdw16, cmd->info_0.cdb_len);
+
+ ioq_cmd.common.cdw26[0] = cpu_to_le32(cmd->cdw26[0]);
+ ioq_cmd.common.cdw26[1] = cpu_to_le32(cmd->cdw26[1]);
+ ioq_cmd.common.cdw26[2] = cpu_to_le32(cmd->cdw26[2]);
+ ioq_cmd.common.cdw26[3] = cpu_to_le32(cmd->cdw26[3]);
+
+ status = spraid_bsg_map_data(hdev, job,
+ (struct spraid_admin_command *)&ioq_cmd);
+ if (status) {
+ dev_err(hdev->dev, "[%s] err, map data failed\n", __func__);
+ return status;
+ }
+
+ status = spraid_submit_ioq_sync_cmd(hdev, &ioq_cmd, job->reply,
+ &job->reply_len, timeout);
+
+ dev_info(hdev->dev, "[%s] opcode[0x%x] subopcode[0x%x], status[0x%x];"
+ " reply_len[%d]\n",
+ __func__, cmd->opcode, cmd->info_1.subopcode,
+ status, job->reply_len);
+
+ spraid_bsg_unmap_data(hdev, job);
+
+ return status;
+}
+
+static bool spraid_check_scmd_completed(struct scsi_cmnd *scmd)
+{
+ struct spraid_dev *hdev = shost_priv(scmd->device->host);
+ struct spraid_iod *iod = scsi_cmd_priv(scmd);
+ struct spraid_queue *spraidq;
+ u16 hwq, cid;
+
+ spraid_get_tag_from_scmd(scmd, &hwq, &cid);
+ spraidq = &hdev->queues[hwq];
+ if (READ_ONCE(iod->state) == SPRAID_CMD_COMPLETE
+ || spraid_poll_cq(spraidq, cid)) {
+ dev_warn(hdev->dev, "cid[%d] qid[%d] has been completed\n",
+ cid, spraidq->qid);
+ return true;
+ }
+ return false;
+}
+
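+/*
+ * blk-mq timeout hook: if the command is still in flight, atomically
+ * claim it as timed out and return BLK_EH_DONE so SCSI error handling
+ * (the abort/reset escalation below) takes over; otherwise restart the
+ * timer.
+ */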
+static enum blk_eh_timer_return spraid_scmd_timeout(struct scsi_cmnd *scmd)
+{
+ struct spraid_iod *iod = scsi_cmd_priv(scmd);
+ unsigned int timeout = scmd->device->request_queue->rq_timeout;
+
+ if (spraid_check_scmd_completed(scmd))
+ goto out;
+
+ if (time_after(jiffies, scmd->jiffies_at_alloc + timeout)) {
+ if (cmpxchg(&iod->state,
+ SPRAID_CMD_IN_FLIGHT,
+ SPRAID_CMD_TIMEOUT) == SPRAID_CMD_IN_FLIGHT) {
+ return BLK_EH_DONE;
+ }
+ }
+out:
+ return BLK_EH_RESET_TIMER;
+}
+
+/* send abort command via the admin queue for now */
+static int spraid_send_abort_cmd(struct spraid_dev *hdev,
+ u32 hdid, u16 qid, u16 cid)
+{
+ struct spraid_admin_command admin_cmd;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.abort.opcode = SPRAID_ADMIN_ABORT_CMD;
+ admin_cmd.abort.hdid = cpu_to_le32(hdid);
+ admin_cmd.abort.sqid = cpu_to_le16(qid);
+ admin_cmd.abort.cid = cpu_to_le16(cid);
+
+ return spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+}
+
+/* send reset command via the admin queue for now */
+static int spraid_send_reset_cmd(struct spraid_dev *hdev, int type, u32 hdid)
+{
+ struct spraid_admin_command admin_cmd;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.reset.opcode = SPRAID_ADMIN_RESET;
+ admin_cmd.reset.hdid = cpu_to_le32(hdid);
+ admin_cmd.reset.type = type;
+
+ return spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+}
+
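+/*
+ * Host state machine: only NEW/RESETTING->LIVE, LIVE->RESETTING,
+ * *->DELETING and NEW/LIVE/RESETTING->DEAD are legal transitions.
+ * Returns true if the transition was applied under state_lock.
+ */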
+static bool spraid_change_host_state(struct spraid_dev *hdev,
+ enum spraid_state newstate)
+{
+ unsigned long flags;
+ enum spraid_state oldstate;
+ bool change = false;
+
+ spin_lock_irqsave(&hdev->state_lock, flags);
+
+ oldstate = hdev->state;
+ switch (newstate) {
+ case SPRAID_LIVE:
+ switch (oldstate) {
+ case SPRAID_NEW:
+ case SPRAID_RESETTING:
+ change = true;
+ break;
+ default:
+ break;
+ }
+ break;
+ case SPRAID_RESETTING:
+ switch (oldstate) {
+ case SPRAID_LIVE:
+ change = true;
+ break;
+ default:
+ break;
+ }
+ break;
+ case SPRAID_DELETING:
+ if (oldstate != SPRAID_DELETING)
+ change = true;
+ break;
+ case SPRAID_DEAD:
+ switch (oldstate) {
+ case SPRAID_NEW:
+ case SPRAID_LIVE:
+ case SPRAID_RESETTING:
+ change = true;
+ break;
+ default:
+ break;
+ }
+ break;
+ default:
+ break;
+ }
+ if (change)
+ hdev->state = newstate;
+ spin_unlock_irqrestore(&hdev->state_lock, flags);
+
+ dev_info(hdev->dev, "[%s][%d]->[%d], change[%d]\n",
+ __func__, oldstate, newstate, change);
+
+ return change;
+}
+
+static void spraid_back_fault_cqe(struct spraid_queue *ioq,
+ struct spraid_completion *cqe)
+{
+ struct spraid_dev *hdev = ioq->hdev;
+ struct blk_mq_tags *tags;
+ struct scsi_cmnd *scmd;
+ struct spraid_iod *iod;
+ struct request *req;
+
+ tags = hdev->shost->tag_set.tags[ioq->qid - 1];
+ req = blk_mq_tag_to_rq(tags, cqe->cmd_id);
+ if (unlikely(!req || !blk_mq_request_started(req)))
+ return;
+
+ scmd = blk_mq_rq_to_pdu(req);
+ iod = scsi_cmd_priv(scmd);
+
+ set_host_byte(scmd, DID_NO_CONNECT);
+ if (iod->nsge)
+ scsi_dma_unmap(scmd);
+ spraid_free_iod_res(hdev, iod);
+ scmd->scsi_done(scmd);
+ dev_warn(hdev->dev, "Back fault CQE, cid[%d] qid[%d]\n",
+ cqe->cmd_id, ioq->qid);
+}
+
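+/*
+ * Fail back every possibly outstanding request on all hardware queues
+ * with DID_NO_CONNECT so the SCSI midlayer gets its commands back once
+ * the controller is unreachable.
+ */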
+static void spraid_back_all_io(struct spraid_dev *hdev)
+{
+ int i, j;
+ struct spraid_queue *ioq;
+ struct spraid_completion cqe = { 0 };
+
+ scsi_block_requests(hdev->shost);
+
+ for (i = 1; i <= hdev->shost->nr_hw_queues; i++) {
+ ioq = &hdev->queues[i];
+ for (j = 0; j < hdev->shost->can_queue; j++) {
+ cqe.cmd_id = j;
+ spraid_back_fault_cqe(ioq, &cqe);
+ }
+ }
+
+ scsi_unblock_requests(hdev->shost);
+}
+
+static void spraid_dev_disable(struct spraid_dev *hdev, bool shutdown)
+{
+ struct spraid_queue *adminq = &hdev->queues[0];
+ u16 start, end;
+ unsigned long timeout = jiffies + 600 * HZ;
+
+ if (pci_device_is_present(hdev->pdev)) {
+ if (shutdown)
+ spraid_shutdown_ctrl(hdev);
+ else
+ spraid_disable_ctrl(hdev);
+ }
+
+ while (!time_after(jiffies, timeout)) {
+ if (!pci_device_is_present(hdev->pdev)) {
+ dev_info(hdev->dev, "[%s] pci_device not present;"
+ " skip wait\n", __func__);
+ break;
+ }
+ if (!spraid_wait_ready(hdev, hdev->cap, false)) {
+ dev_info(hdev->dev,
+ "[%s] wait ready success after reset\n",
+ __func__);
+ break;
+ }
+ dev_info(hdev->dev, "[%s] waiting csts_rdy ready\n", __func__);
+ }
+
+ if (hdev->queue_count == 0) {
+ dev_err(hdev->dev, "[%s] warn, queue has been delete\n",
+ __func__);
+ return;
+ }
+
+ spin_lock_irq(&adminq->cq_lock);
+ spraid_process_cq(adminq, &start, &end, -1);
+ spin_unlock_irq(&adminq->cq_lock);
+ spraid_complete_cqes(adminq, start, end);
+
+ spraid_pci_disable(hdev);
+
+ spraid_back_all_io(hdev);
+}
+
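+/*
+ * Controller reset: disable the device (draining the admin CQ and
+ * failing back outstanding I/O), then re-enable PCI, rebuild the admin
+ * and I/O queues and re-arm AENs; any failure leaves the host in
+ * SPRAID_DEAD.
+ */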
+static void spraid_reset_work(struct work_struct *work)
+{
+ int ret;
+ struct spraid_dev *hdev =
+ container_of(work, struct spraid_dev, reset_work);
+
+ if (hdev->state != SPRAID_RESETTING) {
+ dev_err(hdev->dev, "[%s] err, host is not reset state\n",
+ __func__);
+ return;
+ }
+
+ dev_info(hdev->dev, "[%s] enter host reset\n", __func__);
+
+ if (hdev->ctrl_config & SPRAID_CC_ENABLE) {
+ dev_info(hdev->dev, "[%s] start dev_disable\n", __func__);
+ spraid_dev_disable(hdev, false);
+ }
+
+ ret = spraid_pci_enable(hdev);
+ if (ret)
+ goto out;
+
+ ret = spraid_setup_admin_queue(hdev);
+ if (ret)
+ goto pci_disable;
+
+ ret = spraid_setup_io_queues(hdev);
+ if (ret || hdev->online_queues <= hdev->shost->nr_hw_queues)
+ goto pci_disable;
+
+ spraid_change_host_state(hdev, SPRAID_LIVE);
+
+ spraid_send_all_aen(hdev);
+
+ return;
+
+pci_disable:
+ spraid_pci_disable(hdev);
+out:
+ spraid_change_host_state(hdev, SPRAID_DEAD);
+ dev_err(hdev->dev, "[%s] err, host reset failed\n", __func__);
+}
+
+static int spraid_reset_work_sync(struct spraid_dev *hdev)
+{
+ if (!spraid_change_host_state(hdev, SPRAID_RESETTING)) {
+ dev_info(hdev->dev, "[%s] can't change to reset state\n",
+ __func__);
+ return -EBUSY;
+ }
+
+ if (!queue_work(spraid_wq, &hdev->reset_work)) {
+ dev_err(hdev->dev, "[%s] err, host is already in reset state\n",
+ __func__);
+ return -EBUSY;
+ }
+
+ flush_work(&hdev->reset_work);
+ if (hdev->state != SPRAID_LIVE)
+ return -ENODEV;
+
+ return 0;
+}
+
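+/*
+ * Poll in 500 ms steps, bounded by SPRAID_WAIT_ABNL_CMD_TIMEOUT, for a
+ * timed-out command to be completed by the firmware after an abort or
+ * reset has been issued for it.
+ */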
+static int spraid_wait_abnl_cmd_done(struct spraid_iod *iod)
+{
+ u16 times = 0;
+
+ do {
+ if (READ_ONCE(iod->state) == SPRAID_CMD_TMO_COMPLETE)
+ break;
+ msleep(500);
+ times++;
+ } while (times <= SPRAID_WAIT_ABNL_CMD_TIMEOUT);
+
+ /* timed out waiting for the aborted/reset command to complete */
+ if (times > SPRAID_WAIT_ABNL_CMD_TIMEOUT)
+ return -ETIMEDOUT;
+
+ return 0;
+}
+
+static int spraid_abort_handler(struct scsi_cmnd *scmd)
+{
+ struct spraid_dev *hdev = shost_priv(scmd->device->host);
+ struct spraid_iod *iod = scsi_cmd_priv(scmd);
+ struct spraid_sdev_hostdata *hostdata;
+ u16 hwq, cid;
+ int ret;
+
+ scsi_print_command(scmd);
+
+ if (hdev->state != SPRAID_LIVE || !spraid_wait_abnl_cmd_done(iod) ||
+ spraid_check_scmd_completed(scmd))
+ return SUCCESS;
+
+ hostdata = scmd->device->hostdata;
+ spraid_get_tag_from_scmd(scmd, &hwq, &cid);
+
+ dev_warn(hdev->dev, "cid[%d] qid[%d] timeout, aborting\n", cid, hwq);
+ ret = spraid_send_abort_cmd(hdev, hostdata->hdid, hwq, cid);
+ if (ret != ADMIN_ERR_TIMEOUT) {
+ ret = spraid_wait_abnl_cmd_done(iod);
+ if (ret) {
+ dev_warn(hdev->dev, "cid[%d] qid[%d] abort failed;"
+ " not found\n", cid, hwq);
+ return FAILED;
+ }
+ dev_warn(hdev->dev, "cid[%d] qid[%d] abort succ\n", cid, hwq);
+ return SUCCESS;
+ }
+ dev_warn(hdev->dev, "cid[%d] qid[%d] abort failed, timeout\n",
+ cid, hwq);
+ return FAILED;
+}
+
+static int spraid_tgt_reset_handler(struct scsi_cmnd *scmd)
+{
+ struct spraid_dev *hdev = shost_priv(scmd->device->host);
+ struct spraid_iod *iod = scsi_cmd_priv(scmd);
+ struct spraid_sdev_hostdata *hostdata;
+ u16 hwq, cid;
+ int ret;
+
+ scsi_print_command(scmd);
+
+ if (hdev->state != SPRAID_LIVE || !spraid_wait_abnl_cmd_done(iod) ||
+ spraid_check_scmd_completed(scmd))
+ return SUCCESS;
+
+ hostdata = scmd->device->hostdata;
+ spraid_get_tag_from_scmd(scmd, &hwq, &cid);
+
+ dev_warn(hdev->dev, "cid[%d] qid[%d] timeout, target reset\n",
+ cid, hwq);
+ ret = spraid_send_reset_cmd(hdev, SPRAID_RESET_TARGET, hostdata->hdid);
+ if (ret == 0) {
+ ret = spraid_wait_abnl_cmd_done(iod);
+ if (ret) {
+ dev_warn(hdev->dev,
+ "cid[%d] qid[%d]target reset failed;"
+ " not found\n", cid, hwq);
+ return FAILED;
+ }
+
+ dev_warn(hdev->dev, "cid[%d] qid[%d] target reset success\n",
+ cid, hwq);
+ return SUCCESS;
+ }
+
+ dev_warn(hdev->dev, "cid[%d] qid[%d] ret[%d] target reset failed\n",
+ cid, hwq, ret);
+ return FAILED;
+}
+
+static int spraid_bus_reset_handler(struct scsi_cmnd *scmd)
+{
+ struct spraid_dev *hdev = shost_priv(scmd->device->host);
+ struct spraid_iod *iod = scsi_cmd_priv(scmd);
+ struct spraid_sdev_hostdata *hostdata;
+ u16 hwq, cid;
+ int ret;
+
+ scsi_print_command(scmd);
+
+ if (hdev->state != SPRAID_LIVE || !spraid_wait_abnl_cmd_done(iod) ||
+ spraid_check_scmd_completed(scmd))
+ return SUCCESS;
+
+ hostdata = scmd->device->hostdata;
+ spraid_get_tag_from_scmd(scmd, &hwq, &cid);
+
+ dev_warn(hdev->dev, "cid[%d] qid[%d] timeout, bus reset\n", cid, hwq);
+ ret = spraid_send_reset_cmd(hdev, SPRAID_RESET_BUS, hostdata->hdid);
+ if (ret == 0) {
+ ret = spraid_wait_abnl_cmd_done(iod);
+ if (ret) {
+ dev_warn(hdev->dev,
+ "cid[%d] qid[%d] bus reset failed;"
+ " not found\n", cid, hwq);
+ return FAILED;
+ }
+
+ dev_warn(hdev->dev, "cid[%d] qid[%d] bus reset succ\n",
+ cid, hwq);
+ return SUCCESS;
+ }
+
+ dev_warn(hdev->dev, "cid[%d] qid[%d] ret[%d] bus reset failed\n",
+ cid, hwq, ret);
+ return FAILED;
+}
+
+static int spraid_shost_reset_handler(struct scsi_cmnd *scmd)
+{
+ u16 hwq, cid;
+ struct spraid_dev *hdev = shost_priv(scmd->device->host);
+
+ scsi_print_command(scmd);
+ if (hdev->state != SPRAID_LIVE || spraid_check_scmd_completed(scmd))
+ return SUCCESS;
+
+ spraid_get_tag_from_scmd(scmd, &hwq, &cid);
+ dev_warn(hdev->dev, "cid[%d] qid[%d] host reset\n", cid, hwq);
+
+ if (spraid_reset_work_sync(hdev)) {
+ dev_warn(hdev->dev, "cid[%d] qid[%d] host reset failed\n",
+ cid, hwq);
+ return FAILED;
+ }
+
+ dev_warn(hdev->dev, "cid[%d] qid[%d] host reset success\n", cid, hwq);
+
+ return SUCCESS;
+}
+
+static pci_ers_result_t spraid_pci_error_detected(struct pci_dev *pdev,
+ pci_channel_state_t state)
+{
+ struct spraid_dev *hdev = pci_get_drvdata(pdev);
+
+ dev_info(hdev->dev, "enter pci error detect, state:%d\n", state);
+
+ switch (state) {
+ case pci_channel_io_normal:
+ dev_warn(hdev->dev, "channel is normal, do nothing\n");
+
+ return PCI_ERS_RESULT_CAN_RECOVER;
+ case pci_channel_io_frozen:
+ dev_warn(hdev->dev,
+ "channel io frozen, need reset controller\n");
+
+ scsi_block_requests(hdev->shost);
+
+ spraid_change_host_state(hdev, SPRAID_RESETTING);
+
+ return PCI_ERS_RESULT_NEED_RESET;
+ case pci_channel_io_perm_failure:
+ dev_warn(hdev->dev, "channel io failure, request disconnect\n");
+
+ return PCI_ERS_RESULT_DISCONNECT;
+ }
+
+ return PCI_ERS_RESULT_NEED_RESET;
+}
+
+static pci_ers_result_t spraid_pci_slot_reset(struct pci_dev *pdev)
+{
+ struct spraid_dev *hdev = pci_get_drvdata(pdev);
+
+ dev_info(hdev->dev, "restart after slot reset\n");
+
+ pci_restore_state(pdev);
+
+ if (!queue_work(spraid_wq, &hdev->reset_work)) {
+ dev_err(hdev->dev, "[%s] err, the device is resetting state\n",
+ __func__);
+ return PCI_ERS_RESULT_NONE;
+ }
+
+ flush_work(&hdev->reset_work);
+
+ scsi_unblock_requests(hdev->shost);
+
+ return PCI_ERS_RESULT_RECOVERED;
+}
+
+static void spraid_reset_done(struct pci_dev *pdev)
+{
+ struct spraid_dev *hdev = pci_get_drvdata(pdev);
+
+ dev_info(hdev->dev, "enter spraid reset done\n");
+}
+
+static ssize_t csts_pp_show(struct device *cdev,
+ struct device_attribute *attr, char *buf)
+{
+ struct Scsi_Host *shost = class_to_shost(cdev);
+ struct spraid_dev *hdev = shost_priv(shost);
+ int ret = -1;
+
+ if (pci_device_is_present(hdev->pdev)) {
+ ret = (readl(hdev->bar + SPRAID_REG_CSTS)
+ & SPRAID_CSTS_PP_MASK);
+ ret >>= SPRAID_CSTS_PP_SHIFT;
+ }
+
+ return snprintf(buf, PAGE_SIZE, "%d\n", ret);
+}
+
+static ssize_t csts_shst_show(struct device *cdev,
+ struct device_attribute *attr, char *buf)
+{
+ struct Scsi_Host *shost = class_to_shost(cdev);
+ struct spraid_dev *hdev = shost_priv(shost);
+ int ret = -1;
+
+ if (pci_device_is_present(hdev->pdev)) {
+ ret = (readl(hdev->bar + SPRAID_REG_CSTS)
+ & SPRAID_CSTS_SHST_MASK);
+ ret >>= SPRAID_CSTS_SHST_SHIFT;
+ }
+
+ return snprintf(buf, PAGE_SIZE, "%d\n", ret);
+}
+
+static ssize_t csts_cfs_show(struct device *cdev,
+ struct device_attribute *attr, char *buf)
+{
+ struct Scsi_Host *shost = class_to_shost(cdev);
+ struct spraid_dev *hdev = shost_priv(shost);
+ int ret = -1;
+
+ if (pci_device_is_present(hdev->pdev)) {
+ ret = (readl(hdev->bar + SPRAID_REG_CSTS)
+ & SPRAID_CSTS_CFS_MASK);
+ ret >>= SPRAID_CSTS_CFS_SHIFT;
+ }
+
+ return snprintf(buf, PAGE_SIZE, "%d\n", ret);
+}
+
+static ssize_t csts_rdy_show(struct device *cdev,
+ struct device_attribute *attr, char *buf)
+{
+ struct Scsi_Host *shost = class_to_shost(cdev);
+ struct spraid_dev *hdev = shost_priv(shost);
+ int ret = -1;
+
+ if (pci_device_is_present(hdev->pdev))
+ ret = (readl(hdev->bar + SPRAID_REG_CSTS) & SPRAID_CSTS_RDY);
+
+ return snprintf(buf, PAGE_SIZE, "%d\n", ret);
+}
+
+static ssize_t fw_version_show(struct device *cdev,
+ struct device_attribute *attr, char *buf)
+{
+ struct Scsi_Host *shost = class_to_shost(cdev);
+ struct spraid_dev *hdev = shost_priv(shost);
+
+ return snprintf(buf, PAGE_SIZE, "%s\n", hdev->ctrl_info->fr);
+}
+
+static DEVICE_ATTR_RO(csts_pp);
+static DEVICE_ATTR_RO(csts_shst);
+static DEVICE_ATTR_RO(csts_cfs);
+static DEVICE_ATTR_RO(csts_rdy);
+static DEVICE_ATTR_RO(fw_version);
+
+static struct device_attribute *spraid_host_attrs[] = {
+ &dev_attr_csts_pp,
+ &dev_attr_csts_shst,
+ &dev_attr_csts_cfs,
+ &dev_attr_csts_rdy,
+ &dev_attr_fw_version,
+ NULL,
+};
+
+static int spraid_get_vd_info(struct spraid_dev *hdev,
+ struct spraid_vd_info *vd_info, u16 vid)
+{
+ struct spraid_admin_command admin_cmd;
+ u8 *data_ptr = NULL;
+ dma_addr_t data_dma = 0;
+ int ret;
+
+ data_ptr = dma_alloc_coherent(hdev->dev, PAGE_SIZE,
+ &data_dma, GFP_KERNEL);
+ if (!data_ptr)
+ return -ENOMEM;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.usr_cmd.opcode = USR_CMD_READ;
+ admin_cmd.usr_cmd.info_0.subopcode = cpu_to_le16(USR_CMD_VDINFO);
+ admin_cmd.usr_cmd.info_1.data_len = cpu_to_le16(USR_CMD_RDLEN);
+ admin_cmd.usr_cmd.info_1.param_len = cpu_to_le16(VDINFO_PARAM_LEN);
+ admin_cmd.usr_cmd.cdw10 = cpu_to_le32(vid);
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
+
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+ if (!ret)
+ memcpy(vd_info, data_ptr, sizeof(struct spraid_vd_info));
+
+ dma_free_coherent(hdev->dev, PAGE_SIZE, data_ptr, data_dma);
+
+ return ret;
+}
+
+static int spraid_get_bgtask(struct spraid_dev *hdev,
+ struct spraid_bgtask *bgtask)
+{
+ struct spraid_admin_command admin_cmd;
+ u8 *data_ptr = NULL;
+ dma_addr_t data_dma = 0;
+ int ret;
+
+ data_ptr = dma_alloc_coherent(hdev->dev, PAGE_SIZE,
+ &data_dma, GFP_KERNEL);
+ if (!data_ptr)
+ return -ENOMEM;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.usr_cmd.opcode = USR_CMD_READ;
+ admin_cmd.usr_cmd.info_0.subopcode = cpu_to_le16(USR_CMD_BGTASK);
+ admin_cmd.usr_cmd.info_1.data_len = cpu_to_le16(USR_CMD_RDLEN);
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
+
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+ if (!ret)
+ memcpy(bgtask, data_ptr, sizeof(struct spraid_bgtask));
+
+ dma_free_coherent(hdev->dev, PAGE_SIZE, data_ptr, data_dma);
+
+ return ret;
+}
+
+static ssize_t raid_level_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct scsi_device *sdev;
+ struct spraid_dev *hdev;
+ struct spraid_vd_info *vd_info;
+ struct spraid_sdev_hostdata *hostdata;
+ int ret;
+
+ sdev = to_scsi_device(dev);
+ hdev = shost_priv(sdev->host);
+ hostdata = sdev->hostdata;
+
+ vd_info = kmalloc(sizeof(*vd_info), GFP_KERNEL);
+ if (!vd_info || !SPRAID_DEV_INFO_ATTR_VD(hostdata->attr)) {
+ kfree(vd_info);
+ return snprintf(buf, PAGE_SIZE, "NA\n");
+ }
+
+ ret = spraid_get_vd_info(hdev, vd_info, sdev->id);
+ if (ret)
+ vd_info->rg_level = ARRAY_SIZE(raid_levels) - 1;
+
+ ret = (vd_info->rg_level < ARRAY_SIZE(raid_levels)) ?
+ vd_info->rg_level : (ARRAY_SIZE(raid_levels) - 1);
+
+ kfree(vd_info);
+
+ return snprintf(buf, PAGE_SIZE, "RAID-%s\n", raid_levels[ret]);
+}
+
+static ssize_t raid_state_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct scsi_device *sdev;
+ struct spraid_dev *hdev;
+ struct spraid_vd_info *vd_info;
+ struct spraid_sdev_hostdata *hostdata;
+ int ret;
+
+ sdev = to_scsi_device(dev);
+ hdev = shost_priv(sdev->host);
+ hostdata = sdev->hostdata;
+
+ vd_info = kmalloc(sizeof(*vd_info), GFP_KERNEL);
+ if (!vd_info || !SPRAID_DEV_INFO_ATTR_VD(hostdata->attr)) {
+ kfree(vd_info);
+ return snprintf(buf, PAGE_SIZE, "NA\n");
+ }
+
+ ret = spraid_get_vd_info(hdev, vd_info, sdev->id);
+ if (ret) {
+ vd_info->vd_status = 0;
+ vd_info->rg_id = 0xff;
+ }
+
+ ret = (vd_info->vd_status < ARRAY_SIZE(raid_states)) ?
+ vd_info->vd_status : 0;
+
+ kfree(vd_info);
+
+ return snprintf(buf, PAGE_SIZE, "%s\n", raid_states[ret]);
+}
+
+static ssize_t raid_resync_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct scsi_device *sdev;
+ struct spraid_dev *hdev;
+ struct spraid_vd_info *vd_info;
+ struct spraid_bgtask *bgtask;
+ struct spraid_sdev_hostdata *hostdata;
+ u8 rg_id, i, progress = 0;
+ int ret;
+
+ sdev = to_scsi_device(dev);
+ hdev = shost_priv(sdev->host);
+ hostdata = sdev->hostdata;
+
+ vd_info = kmalloc(sizeof(*vd_info), GFP_KERNEL);
+ if (!vd_info || !SPRAID_DEV_INFO_ATTR_VD(hostdata->attr)) {
+ kfree(vd_info);
+ return snprintf(buf, PAGE_SIZE, "NA\n");
+ }
+
+ ret = spraid_get_vd_info(hdev, vd_info, sdev->id);
+ if (ret)
+ goto out;
+
+ rg_id = vd_info->rg_id;
+
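+ /*
+ * Reuse the vd_info buffer for the background-task query; this
+ * assumes struct spraid_bgtask does not exceed the size of
+ * struct spraid_vd_info.
+ */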
+ bgtask = (struct spraid_bgtask *)vd_info;
+ ret = spraid_get_bgtask(hdev, bgtask);
+ if (ret)
+ goto out;
+ for (i = 0; i < bgtask->task_num; i++) {
+ if ((bgtask->bgtask[i].type == BGTASK_TYPE_REBUILD) &&
+ (le16_to_cpu(bgtask->bgtask[i].vd_id) == rg_id))
+ progress = bgtask->bgtask[i].progress;
+ }
+
+out:
+ kfree(vd_info);
+ return snprintf(buf, PAGE_SIZE, "%d\n", progress);
+}
+
+static DEVICE_ATTR_RO(raid_level);
+static DEVICE_ATTR_RO(raid_state);
+static DEVICE_ATTR_RO(raid_resync);
+
+static struct device_attribute *spraid_dev_attrs[] = {
+ &dev_attr_raid_level,
+ &dev_attr_raid_state,
+ &dev_attr_raid_resync,
+ NULL,
+};
+
+static struct pci_error_handlers spraid_err_handler = {
+ .error_detected = spraid_pci_error_detected,
+ .slot_reset = spraid_pci_slot_reset,
+ .reset_done = spraid_reset_done,
+};
+
+static int spraid_sysfs_host_reset(struct Scsi_Host *shost, int reset_type)
+{
+ int ret;
+ struct spraid_dev *hdev = shost_priv(shost);
+
+ dev_info(hdev->dev, "[%s] start sysfs host reset cmd\n", __func__);
+ ret = spraid_reset_work_sync(hdev);
+ dev_info(hdev->dev, "[%s] stop sysfs host reset cmd[%d]\n",
+ __func__, ret);
+
+ return ret;
+}
+
+static struct scsi_host_template spraid_driver_template = {
+ .module = THIS_MODULE,
+ .name = "Ramaxel Logic spraid driver",
+ .proc_name = "spraid",
+ .queuecommand = spraid_queue_command,
+ .slave_alloc = spraid_slave_alloc,
+ .slave_destroy = spraid_slave_destroy,
+ .slave_configure = spraid_slave_configure,
+ .eh_timed_out = spraid_scmd_timeout,
+ .eh_abort_handler = spraid_abort_handler,
+ .eh_target_reset_handler = spraid_tgt_reset_handler,
+ .eh_bus_reset_handler = spraid_bus_reset_handler,
+ .eh_host_reset_handler = spraid_shost_reset_handler,
+ .change_queue_depth = scsi_change_queue_depth,
+ .this_id = -1,
+ .shost_attrs = spraid_host_attrs,
+ .sdev_attrs = spraid_dev_attrs,
+ .host_reset = spraid_sysfs_host_reset,
+};
+
+static void spraid_shutdown(struct pci_dev *pdev)
+{
+ struct spraid_dev *hdev = pci_get_drvdata(pdev);
+
+ spraid_remove_io_queues(hdev);
+ spraid_disable_admin_queue(hdev, true);
+}
+
+/* bsg dispatch user command */
+static int spraid_bsg_host_dispatch(struct bsg_job *job)
+{
+ struct Scsi_Host *shost = dev_to_shost(job->dev);
+ struct spraid_dev *hdev = shost_priv(shost);
+ struct request *rq = blk_mq_rq_from_pdu(job);
+ struct spraid_bsg_request *bsg_req = job->request;
+ int ret = 0;
+
+ dev_info(hdev->dev, "[%s] msgcode[%d], msglen[%d], timeout[%d];"
+ " req_nsge[%d], req_len[%d]\n",
+ __func__, bsg_req->msgcode, job->request_len,
+ rq->timeout, job->request_payload.sg_cnt,
+ job->request_payload.payload_len);
+
+ job->reply_len = 0;
+
+ switch (bsg_req->msgcode) {
+ case SPRAID_BSG_ADM:
+ ret = spraid_user_admin_cmd(hdev, job);
+ break;
+ case SPRAID_BSG_IOQ:
+ ret = spraid_user_ioq_cmd(hdev, job);
+ break;
+ default:
+ dev_info(hdev->dev, "[%s] unsupport msgcode[%d]\n",
+ __func__, bsg_req->msgcode);
+ break;
+ }
+
+ if (ret > 0)
+ ret = ret | (ret << 8);
+
+ bsg_job_done(job, ret, 0);
+ return 0;
+}
+
+static inline void spraid_remove_bsg(struct spraid_dev *hdev)
+{
+ if (hdev->bsg_queue) {
+ bsg_unregister_queue(hdev->bsg_queue);
+ blk_cleanup_queue(hdev->bsg_queue);
+ }
+}
+
+static int spraid_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+{
+ struct spraid_dev *hdev;
+ struct Scsi_Host *shost;
+ int node, ret;
+ char bsg_name[15];
+
+ shost = scsi_host_alloc(&spraid_driver_template, sizeof(*hdev));
+ if (!shost) {
+ dev_err(&pdev->dev, "Failed to allocate scsi host\n");
+ return -ENOMEM;
+ }
+ hdev = shost_priv(shost);
+ hdev->pdev = pdev;
+ hdev->dev = get_device(&pdev->dev);
+
+ node = dev_to_node(hdev->dev);
+ if (node == NUMA_NO_NODE) {
+ node = first_memory_node;
+ set_dev_node(hdev->dev, node);
+ }
+ hdev->numa_node = node;
+ hdev->shost = shost;
+ pci_set_drvdata(pdev, hdev);
+
+ ret = spraid_dev_map(hdev);
+ if (ret)
+ goto put_dev;
+
+ init_rwsem(&hdev->devices_rwsem);
+ INIT_WORK(&hdev->scan_work, spraid_scan_work);
+ INIT_WORK(&hdev->timesyn_work, spraid_timesyn_work);
+ INIT_WORK(&hdev->reset_work, spraid_reset_work);
+ INIT_WORK(&hdev->fw_act_work, spraid_fw_act_work);
+ spin_lock_init(&hdev->state_lock);
+
+ ret = spraid_alloc_resources(hdev);
+ if (ret)
+ goto dev_unmap;
+
+ ret = spraid_pci_enable(hdev);
+ if (ret)
+ goto resources_free;
+
+ ret = spraid_setup_admin_queue(hdev);
+ if (ret)
+ goto pci_disable;
+
+ ret = spraid_init_ctrl_info(hdev);
+ if (ret)
+ goto disable_admin_q;
+
+ ret = spraid_alloc_iod_ext_mem_pool(hdev);
+ if (ret)
+ goto disable_admin_q;
+
+ ret = spraid_setup_io_queues(hdev);
+ if (ret)
+ goto free_iod_mempool;
+
+ spraid_shost_init(hdev);
+
+ ret = scsi_add_host(hdev->shost, hdev->dev);
+ if (ret) {
+ dev_err(hdev->dev, "Add shost to system failed, ret: %d\n",
+ ret);
+ goto remove_io_queues;
+ }
+
+ snprintf(bsg_name, sizeof(bsg_name), "spraid%d", shost->host_no);
+ hdev->bsg_queue = bsg_setup_queue(&shost->shost_gendev, bsg_name,
+ spraid_bsg_host_dispatch,
+ spraid_cmd_size(hdev, true, false));
+ if (IS_ERR(hdev->bsg_queue)) {
+ dev_err(hdev->dev, "err, setup bsg failed\n");
+ hdev->bsg_queue = NULL;
+ goto remove_io_queues;
+ }
+
+ if (hdev->online_queues == SPRAID_ADMIN_QUEUE_NUM) {
+ dev_warn(hdev->dev, "warn only admin queue can be used\n");
+ return 0;
+ }
+
+ hdev->state = SPRAID_LIVE;
+
+ spraid_send_all_aen(hdev);
+
+ ret = spraid_dev_list_init(hdev);
+ if (ret)
+ goto remove_bsg;
+
+ ret = spraid_configure_timestamp(hdev);
+ if (ret)
+ dev_warn(hdev->dev, "init set timestamp failed\n");
+
+ ret = spraid_alloc_ioq_ptcmds(hdev);
+ if (ret)
+ goto remove_bsg;
+
+ scsi_scan_host(hdev->shost);
+
+ return 0;
+
+remove_bsg:
+ spraid_remove_bsg(hdev);
+remove_io_queues:
+ spraid_remove_io_queues(hdev);
+free_iod_mempool:
+ spraid_free_iod_ext_mem_pool(hdev);
+disable_admin_q:
+ spraid_disable_admin_queue(hdev, false);
+pci_disable:
+ spraid_pci_disable(hdev);
+resources_free:
+ spraid_free_resources(hdev);
+dev_unmap:
+ spraid_dev_unmap(hdev);
+put_dev:
+ put_device(hdev->dev);
+ scsi_host_put(shost);
+
+ return -ENODEV;
+}
+
+static void spraid_remove(struct pci_dev *pdev)
+{
+ struct spraid_dev *hdev = pci_get_drvdata(pdev);
+ struct Scsi_Host *shost = hdev->shost;
+
+ dev_info(hdev->dev, "enter spraid remove\n");
+
+ spraid_change_host_state(hdev, SPRAID_DELETING);
+ flush_work(&hdev->reset_work);
+
+ if (!pci_device_is_present(pdev))
+ spraid_back_all_io(hdev);
+
+ spraid_remove_bsg(hdev);
+ scsi_remove_host(shost);
+ spraid_free_ioq_ptcmds(hdev);
+ kfree(hdev->devices);
+ spraid_remove_io_queues(hdev);
+ spraid_free_iod_ext_mem_pool(hdev);
+ spraid_disable_admin_queue(hdev, false);
+ spraid_pci_disable(hdev);
+ spraid_free_resources(hdev);
+ spraid_dev_unmap(hdev);
+ put_device(hdev->dev);
+ scsi_host_put(shost);
+
+ dev_info(hdev->dev, "exit spraid remove\n");
+}
+
+static const struct pci_device_id spraid_id_table[] = {
+ { PCI_DEVICE(PCI_VENDOR_ID_RAMAXEL_LOGIC,
+ SPRAID_SERVER_DEVICE_HBA_DID) },
+ { PCI_DEVICE(PCI_VENDOR_ID_RAMAXEL_LOGIC,
+ SPRAID_SERVER_DEVICE_RAID_DID) },
+ { 0, }
+};
+MODULE_DEVICE_TABLE(pci, spraid_id_table);
+
+static struct pci_driver spraid_driver = {
+ .name = "spraid",
+ .id_table = spraid_id_table,
+ .probe = spraid_probe,
+ .remove = spraid_remove,
+ .shutdown = spraid_shutdown,
+ .err_handler = &spraid_err_handler,
+};
+
+static int __init spraid_init(void)
+{
+ int ret;
+
+ spraid_wq = alloc_workqueue("spraid-wq", WQ_UNBOUND | WQ_MEM_RECLAIM |
+ WQ_SYSFS, 0);
+ if (!spraid_wq)
+ return -ENOMEM;
+
+ spraid_class = class_create(THIS_MODULE, "spraid");
+ if (IS_ERR(spraid_class)) {
+ ret = PTR_ERR(spraid_class);
+ goto destroy_wq;
+ }
+
+ ret = pci_register_driver(&spraid_driver);
+ if (ret < 0)
+ goto destroy_class;
+
+ return 0;
+
+destroy_class:
+ class_destroy(spraid_class);
+destroy_wq:
+ destroy_workqueue(spraid_wq);
+
+ return ret;
+}
+
+static void __exit spraid_exit(void)
+{
+ pci_unregister_driver(&spraid_driver);
+ class_destroy(spraid_class);
+ destroy_workqueue(spraid_wq);
+ ida_destroy(&spraid_instance_ida);
+}
+
+MODULE_AUTHOR("songyl(a)ramaxel.com");
+MODULE_DESCRIPTION("Ramaxel Memory Technology SPraid Driver");
+MODULE_LICENSE("GPL");
+MODULE_VERSION(SPRAID_DRV_VERSION);
+module_init(spraid_init);
+module_exit(spraid_exit);
--
2.27.0
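
A note on the error handling above: spraid_probe() acquires its resources
in a fixed order and, on failure, unwinds only the steps already completed,
in reverse, through the chain of goto labels (remove_bsg, remove_io_queues,
..., put_dev). Below is a minimal user-space sketch of that unwind idiom;
the resource names are stand-ins, not the driver's real API.

#include <stdio.h>
#include <stdlib.h>

/* Stand-ins for the driver's resources (bar mapping, queues, pools). */
static void *acquire(const char *name, int fail)
{
	if (fail) {
		printf("acquire %s: failed\n", name);
		return NULL;
	}
	printf("acquire %s\n", name);
	return malloc(1);
}

static void release(const char *name, void *res)
{
	printf("release %s\n", name);
	free(res);
}

static int probe(int fail_at)
{
	void *bar, *queues, *pool;

	bar = acquire("bar mapping", fail_at == 1);
	if (!bar)
		goto out;
	queues = acquire("queues", fail_at == 2);
	if (!queues)
		goto unmap_bar;
	pool = acquire("dma pool", fail_at == 3);
	if (!pool)
		goto free_queues;

	return 0;	/* success: resources stay held until remove() */

free_queues:
	release("queues", queues);
unmap_bar:
	release("bar mapping", bar);
out:
	return -1;
}

int main(void)
{
	/* Fail at the third step: only the first two get unwound. */
	return probe(3) == -1 ? 0 : 1;
}

Each label releases exactly one step and falls through to the labels below
it, so every failure point jumps to the first label matching what has been
acquired so far.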

[PATCH openEuler-1.0-LTS 1/2] blk-mq: fix abnormal free in single queue process
by Yang Yingliang 20 Dec '21
From: Zhang Wensheng <zhangwensheng5(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: 185922, https://gitee.com/openeuler/kernel/issues/I4NDPF
CVE: NA
--------------------------------
Fix the problem introduced by commit 80086651acbd ("blk: Fix
lock inversion between ioc lock and bfqd lock"), which led
blk_queue_bio() to free requests abnormally via
blk_mq_free_request(). Fix it by using __blk_put_request()
instead.
Fixes: 80086651acbdb ("blk: Fix lock inversion between ioc lock and bfqd lock")
Signed-off-by: Zhang Wensheng <zhangwensheng5(a)huawei.com>
Reviewed-by: Laibin Qiu <qiulaibin(a)huawei.com>
Reviewed-by: Jason Yan <yanaijie(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
block/blk-mq.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/blk-mq.h b/block/blk-mq.h
index bb8aabbc5c579..b3540d62c541e 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -228,7 +228,7 @@ static inline void blk_mq_free_requests(struct list_head *list)
struct request *rq = list_entry_rq(list->next);
list_del_init(&rq->queuelist);
- blk_mq_free_request(rq);
+ __blk_put_request(rq->q, rq);
}
}
--
2.25.1
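
The inversion at issue is the classic ABBA pattern: one code path takes
lock A then lock B while another takes B then A, so two threads can each
hold one lock and wait forever on the other. Below is a small pthread
sketch of the single-ordering rule that avoids it; the mutexes are
stand-ins for the ioc and bfqd locks, not kernel code.

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;

/* Safe: every path acquires lock_a before lock_b. The bug pattern is one
 * path doing b-then-a while another does a-then-b. */
static void *worker(void *arg)
{
	(void)arg;
	for (int i = 0; i < 100000; i++) {
		pthread_mutex_lock(&lock_a);
		pthread_mutex_lock(&lock_b);
		/* critical section touching both structures */
		pthread_mutex_unlock(&lock_b);
		pthread_mutex_unlock(&lock_a);
	}
	return NULL;
}

int main(void)
{
	pthread_t t1, t2;

	pthread_create(&t1, NULL, worker, NULL);
	pthread_create(&t2, NULL, worker, NULL);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	puts("done without deadlock");
	return 0;
}

As the commit message says, requests on the legacy blk_queue_bio path must
be put back with __blk_put_request() rather than blk_mq_free_request().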

[PATCH openEuler-1.0-LTS] scsi: hisi_sas: Add support for sata disk I/O errors report to libsas
by Yang Yingliang 20 Dec '21
From: Xingui Yang <yangxingui(a)huawei.com>
driver inclusion
category: feature
bugzilla: NA
CVE: NA
---------------------------
If the response frame has been written back to memory and carries the
disk error and status when a SATA I/O completes abnormally, then set the
task stat to SAS_PROTO_RESPONSE and let libsas handle it.
Signed-off-by: Xingui Yang <yangxingui(a)huawei.com>
Reviewed-by: Kangfenglong <kangfenglong(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 17 +++++++++++++----
1 file changed, 13 insertions(+), 4 deletions(-)
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
index 4508c4a2f02fc..3c662fa5f6c0c 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
@@ -470,6 +470,9 @@ struct hisi_sas_err_record_v3 {
#define RX_DATA_LEN_UNDERFLOW_OFF 6
#define RX_DATA_LEN_UNDERFLOW_MSK (1 << RX_DATA_LEN_UNDERFLOW_OFF)
+#define RX_FIS_STATUS_ERR_OFF 0
+#define RX_FIS_STATUS_ERR_MSK (1 << RX_FIS_STATUS_ERR_OFF)
+
#define HISI_SAS_COMMAND_ENTRIES_V3_HW 4096
#define HISI_SAS_MSI_COUNT_V3_HW 32
@@ -2241,17 +2244,20 @@ slot_err_v3_hw(struct hisi_hba *hisi_hba, struct sas_task *task,
hisi_sas_status_buf_addr_mem(slot);
u32 dma_rx_err_type = record->dma_rx_err_type;
u32 trans_tx_fail_type = record->trans_tx_fail_type;
+ u16 sipc_rx_err_type = le16_to_cpu(record->sipc_rx_err_type);
+ u32 dw0 = le32_to_cpu(complete_hdr->dw0);
+ u32 dw3 = le32_to_cpu(complete_hdr->dw3);
switch (task->task_proto) {
case SAS_PROTOCOL_SSP:
if (dma_rx_err_type & RX_DATA_LEN_UNDERFLOW_MSK) {
ts->residual = trans_tx_fail_type;
ts->stat = SAS_DATA_UNDERRUN;
- if ((!(complete_hdr->dw0 & CMPLT_HDR_RSPNS_GOOD_MSK)) &&
- (complete_hdr->dw0 & CMPLT_HDR_RSPNS_XFRD_MSK)) {
+ if (!(dw0 & CMPLT_HDR_RSPNS_GOOD_MSK) &&
+ (dw0 & CMPLT_HDR_RSPNS_XFRD_MSK)) {
hisi_sas_set_sense_data(task, slot);
}
- } else if (complete_hdr->dw3 & CMPLT_HDR_IO_IN_TARGET_MSK) {
+ } else if (dw3 & CMPLT_HDR_IO_IN_TARGET_MSK) {
ts->stat = SAS_QUEUE_FULL;
slot->abort = 1;
} else {
@@ -2262,7 +2268,10 @@ slot_err_v3_hw(struct hisi_hba *hisi_hba, struct sas_task *task,
case SAS_PROTOCOL_SATA:
case SAS_PROTOCOL_STP:
case SAS_PROTOCOL_SATA | SAS_PROTOCOL_STP:
- if (complete_hdr->dw3 & CMPLT_HDR_IO_IN_TARGET_MSK) {
+ if ((dw0 & CMPLT_HDR_RSPNS_XFRD_MSK) &&
+ (sipc_rx_err_type & RX_FIS_STATUS_ERR_MSK)) {
+ ts->stat = SAS_PROTO_RESPONSE;
+ } else if (dw3 & CMPLT_HDR_IO_IN_TARGET_MSK) {
ts->stat = SAS_PHY_DOWN;
slot->abort = 1;
} else if (dma_rx_err_type & RX_DATA_LEN_UNDERFLOW_MSK) {
--
2.25.1
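
The new SATA branch reports SAS_PROTO_RESPONSE only when two bits agree:
the completion header says the response FIS was transferred back to memory
(CMPLT_HDR_RSPNS_XFRD) and the received FIS status carries a device error
(RX_FIS_STATUS_ERR). A minimal sketch of that two-condition gate follows;
the XFRD bit offset is invented for illustration, only the
RX_FIS_STATUS_ERR offset comes from the patch.

#include <stdio.h>
#include <stdint.h>

#define CMPLT_HDR_RSPNS_XFRD_OFF 10	/* illustrative offset */
#define CMPLT_HDR_RSPNS_XFRD_MSK (1u << CMPLT_HDR_RSPNS_XFRD_OFF)
#define RX_FIS_STATUS_ERR_OFF 0	/* as in the patch */
#define RX_FIS_STATUS_ERR_MSK (1u << RX_FIS_STATUS_ERR_OFF)

/* 1 when the completion should become SAS_PROTO_RESPONSE: the FIS made it
 * back to memory *and* it carries a device error. */
static int sata_error_fis_received(uint32_t dw0, uint16_t sipc_rx_err_type)
{
	return (dw0 & CMPLT_HDR_RSPNS_XFRD_MSK) &&
	       (sipc_rx_err_type & RX_FIS_STATUS_ERR_MSK);
}

int main(void)
{
	printf("%d\n", sata_error_fis_received(CMPLT_HDR_RSPNS_XFRD_MSK, 0x0001)); /* 1 */
	printf("%d\n", sata_error_fis_received(0, 0x0001));                        /* 0 */
	return 0;
}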

20 Dec '21
Ramaxel inclusion
category: features
bugzilla: https://gitee.com/openeuler/kernel/issues/I4JXCG
CVE: NA
Changes from v1:
1. Add more debug info.
2. Remove some unnecessary module parameters.
3. Report disks in order of channel/target id.
4. Add host_reset handler.
5. Use get_unaligned_be24.
Signed-off-by: Yanling Song <songyl(a)ramaxel.com>
Reviewed-by: Jiang Yu<yujiang(a)ramaxel.com>
---
drivers/scsi/spraid/Kconfig | 6 +-
drivers/scsi/spraid/spraid.h | 142 ++-
drivers/scsi/spraid/spraid_main.c | 1571 +++++++++++++++--------------
3 files changed, 938 insertions(+), 781 deletions(-)
diff --git a/drivers/scsi/spraid/Kconfig b/drivers/scsi/spraid/Kconfig
index 83962efaab07..bfbba3db8db0 100644
--- a/drivers/scsi/spraid/Kconfig
+++ b/drivers/scsi/spraid/Kconfig
@@ -5,7 +5,9 @@
config RAMAXEL_SPRAID
tristate "Ramaxel spraid Adapter"
depends on PCI && SCSI
+ select BLK_DEV_BSGLIB
depends on ARM64 || X86_64
- default m
help
- This driver supports Ramaxel spraid driver.
+ This driver supports the Ramaxel SPRxxx series
+ RAID controllers, which have a PCIe Gen4 host
+ interface and support SAS/SATA HDDs and SSDs.
diff --git a/drivers/scsi/spraid/spraid.h b/drivers/scsi/spraid/spraid.h
index da46d8e1b4b6..983d7af2faa8 100644
--- a/drivers/scsi/spraid/spraid.h
+++ b/drivers/scsi/spraid/spraid.h
@@ -1,4 +1,5 @@
/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Ramaxel Memory Technology, Ltd */
#ifndef __SPRAID_H_
#define __SPRAID_H_
@@ -24,7 +25,7 @@
#define SENSE_SIZE(depth) ((depth) * SCSI_SENSE_BUFFERSIZE)
#define SPRAID_AQ_DEPTH 128
-#define SPRAID_NR_AEN_COMMANDS 1
+#define SPRAID_NR_AEN_COMMANDS 16
#define SPRAID_AQ_BLK_MQ_DEPTH (SPRAID_AQ_DEPTH - SPRAID_NR_AEN_COMMANDS)
#define SPRAID_AQ_MQ_TAG_DEPTH (SPRAID_AQ_BLK_MQ_DEPTH - 1)
@@ -44,7 +45,7 @@
#define SMALL_POOL_SIZE 256
#define MAX_SMALL_POOL_NUM 16
-#define MAX_CMD_PER_DEV 32
+#define MAX_CMD_PER_DEV 64
#define MAX_CDB_LEN 32
#define SPRAID_UP_TO_MULTY4(x) (((x) + 4) & (~0x03))
@@ -53,7 +54,7 @@
#define PCI_VENDOR_ID_RAMAXEL_LOGIC 0x1E81
-#define SPRAID_SERVER_DEVICE_HAB_DID 0x2100
+#define SPRAID_SERVER_DEVICE_HBA_DID 0x2100
#define SPRAID_SERVER_DEVICE_RAID_DID 0x2200
#define IO_6_DEFAULT_TX_LEN 256
@@ -142,11 +143,15 @@ enum {
enum {
SPRAID_AEN_DEV_CHANGED = 0x00,
+ SPRAID_AEN_FW_ACT_START = 0x01,
SPRAID_AEN_HOST_PROBING = 0x10,
};
enum {
- SPRAID_AEN_TIMESYN = 0x07
+ SPRAID_AEN_TIMESYN = 0x00,
+ SPRAID_AEN_FW_ACT_FINISH = 0x02,
+ SPRAID_AEN_EVENT_MIN = 0x80,
+ SPRAID_AEN_EVENT_MAX = 0xff,
};
enum {
@@ -175,6 +180,16 @@ enum spraid_state {
SPRAID_DEAD,
};
+enum {
+ SPRAID_CARD_HBA,
+ SPRAID_CARD_RAID,
+};
+
+enum spraid_cmd_type {
+ SPRAID_CMD_ADM,
+ SPRAID_CMD_IOPT,
+};
+
struct spraid_completion {
__le32 result;
union {
@@ -217,8 +232,6 @@ struct spraid_dev {
struct dma_pool *prp_page_pool;
struct dma_pool *prp_small_pool[MAX_SMALL_POOL_NUM];
mempool_t *iod_mempool;
- struct blk_mq_tag_set admin_tagset;
- struct request_queue *admin_q;
void __iomem *bar;
u32 max_qid;
u32 num_vecs;
@@ -232,23 +245,27 @@ struct spraid_dev {
u32 ctrl_config;
u32 online_queues;
u64 cap;
- struct device ctrl_device;
- struct cdev cdev;
int instance;
struct spraid_ctrl_info *ctrl_info;
struct spraid_dev_info *devices;
- struct spraid_ioq_ptcmd *ioq_ptcmds;
+ struct spraid_cmd *adm_cmds;
+ struct list_head adm_cmd_list;
+ spinlock_t adm_cmd_lock;
+
+ struct spraid_cmd *ioq_ptcmds;
struct list_head ioq_pt_list;
spinlock_t ioq_pt_lock;
- struct work_struct aen_work;
struct work_struct scan_work;
struct work_struct timesyn_work;
struct work_struct reset_work;
+ struct work_struct fw_act_work;
enum spraid_state state;
spinlock_t state_lock;
+
+ struct request_queue *bsg_queue;
};
struct spraid_sgl_desc {
@@ -347,6 +364,35 @@ struct spraid_get_info {
__u32 rsvd12[4];
};
+struct spraid_usr_cmd {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 hdid;
+ union {
+ struct {
+ __le16 subopcode;
+ __le16 rsvd1;
+ } info_0;
+ __le32 cdw2;
+ };
+ union {
+ struct {
+ __le16 data_len;
+ __le16 param_len;
+ } info_1;
+ __le32 cdw3;
+ };
+ __u64 metadata;
+ union spraid_data_ptr dptr;
+ __le32 cdw10;
+ __le32 cdw11;
+ __le32 cdw12;
+ __le32 cdw13;
+ __le32 cdw14;
+ __le32 cdw15;
+};
+
enum {
SPRAID_CMD_FLAG_SGL_METABUF = (1 << 6),
SPRAID_CMD_FLAG_SGL_METASEG = (1 << 7),
@@ -393,6 +439,7 @@ struct spraid_admin_command {
struct spraid_get_info get_info;
struct spraid_abort_cmd abort;
struct spraid_reset_cmd reset;
+ struct spraid_usr_cmd usr_cmd;
};
};
@@ -456,9 +503,6 @@ struct spraid_ioq_command {
};
};
-#define SPRAID_IOCTL_RESET_CMD _IOWR('N', 0x80, struct spraid_passthru_common_cmd)
-#define SPRAID_IOCTL_ADMIN_CMD _IOWR('N', 0x41, struct spraid_passthru_common_cmd)
-
struct spraid_passthru_common_cmd {
__u8 opcode;
__u8 flags;
@@ -494,8 +538,6 @@ struct spraid_passthru_common_cmd {
__u32 result1;
};
-#define SPRAID_IOCTL_IOQ_CMD _IOWR('N', 0x42, struct spraid_ioq_passthru_cmd)
-
struct spraid_ioq_passthru_cmd {
__u8 opcode;
__u8 flags;
@@ -560,7 +602,21 @@ struct spraid_ioq_passthru_cmd {
__u32 result1;
};
-struct spraid_ioq_ptcmd {
+struct spraid_bsg_request {
+ u32 msgcode;
+ u32 control;
+ union {
+ struct spraid_passthru_common_cmd admcmd;
+ struct spraid_ioq_passthru_cmd ioqcmd;
+ };
+};
+
+enum {
+ SPRAID_BSG_ADM,
+ SPRAID_BSG_IOQ,
+};
+
+struct spraid_cmd {
int qid;
int cid;
u32 result0;
@@ -572,14 +628,6 @@ struct spraid_ioq_ptcmd {
struct list_head list;
};
-struct spraid_admin_request {
- struct spraid_admin_command *cmd;
- u32 result0;
- u32 result1;
- u16 flags;
- u16 status;
-};
-
struct spraid_queue {
struct spraid_dev *hdev;
spinlock_t sq_lock; /* spinlock for lock handling */
@@ -607,7 +655,6 @@ struct spraid_queue {
};
struct spraid_iod {
- struct spraid_admin_request req;
struct spraid_queue *spraidq;
enum spraid_cmd_state state;
int npages;
@@ -623,13 +670,51 @@ struct spraid_iod {
};
#define SPRAID_DEV_INFO_ATTR_BOOT(attr) ((attr) & 0x01)
-#define SPRAID_DEV_INFO_ATTR_HDD(attr) ((attr) & 0x02)
+#define SPRAID_DEV_INFO_ATTR_VD(attr) (((attr) & 0x02) == 0x0)
#define SPRAID_DEV_INFO_ATTR_PT(attr) (((attr) & 0x22) == 0x02)
#define SPRAID_DEV_INFO_ATTR_RAWDISK(attr) ((attr) & 0x20)
#define SPRAID_DEV_INFO_FLAG_VALID(flag) ((flag) & 0x01)
#define SPRAID_DEV_INFO_FLAG_CHANGE(flag) ((flag) & 0x02)
+#define BGTASK_TYPE_REBUILD 4
+#define USR_CMD_READ 0xc2
+#define USR_CMD_RDLEN 0x1000
+#define USR_CMD_VDINFO 0x704
+#define USR_CMD_BGTASK 0x504
+#define VDINFO_PARAM_LEN 0x04
+
+struct spraid_vd_info {
+ __u8 name[32];
+ __le16 id;
+ __u8 rg_id;
+ __u8 rg_level;
+ __u8 sg_num;
+ __u8 sg_disk_num;
+ __u8 vd_status;
+ __u8 vd_type;
+ __u8 rsvd1[4056];
+};
+
+#define MAX_REALTIME_BGTASK_NUM 32
+
+struct bgtask_info {
+ __u8 type;
+ __u8 progress;
+ __u8 rate;
+ __u8 rsvd0;
+ __le16 vd_id;
+ __le16 time_left;
+ __u8 rsvd1[4];
+};
+
+struct spraid_bgtask {
+ __u8 sw;
+ __u8 task_num;
+ __u8 rsvd[6];
+ struct bgtask_info bgtask[MAX_REALTIME_BGTASK_NUM];
+};
+
struct spraid_dev_info {
__le32 hdid;
__le16 target;
@@ -649,6 +734,11 @@ struct spraid_dev_list {
struct spraid_sdev_hostdata {
u32 hdid;
+ u16 max_io_kb;
+ u8 attr;
+ u8 flag;
+ u8 rg_id;
+ u8 rsvd[3];
};
#endif
diff --git a/drivers/scsi/spraid/spraid_main.c b/drivers/scsi/spraid/spraid_main.c
index a0a75ecb0027..976b7ded336a 100644
--- a/drivers/scsi/spraid/spraid_main.c
+++ b/drivers/scsi/spraid/spraid_main.c
@@ -1,8 +1,8 @@
// SPDX-License-Identifier: GPL-2.0
-/*
- * Linux spraid device driver
- * Copyright(c) 2021 Ramaxel Memory Technology, Ltd
- */
+/* Copyright(c) 2021 Ramaxel Memory Technology, Ltd */
+
+/* Ramaxel Raid SPXXX Series Linux Driver */
+
#define pr_fmt(fmt) "spraid: " fmt
#include <linux/sched/signal.h>
@@ -23,6 +23,9 @@
#include <linux/debugfs.h>
#include <linux/io-64-nonatomic-lo-hi.h>
#include <linux/blkdev.h>
+#include <linux/bsg-lib.h>
+#include <asm/unaligned.h>
+#include <linux/sort.h>
#include <scsi/scsi.h>
#include <scsi/scsi_cmnd.h>
@@ -31,28 +34,17 @@
#include <scsi/scsi_transport.h>
#include <scsi/scsi_dbg.h>
+
#include "spraid.h"
static u32 admin_tmout = 60;
module_param(admin_tmout, uint, 0644);
MODULE_PARM_DESC(admin_tmout, "admin commands timeout (seconds)");
-static u32 scmd_tmout_pt = 30;
-module_param(scmd_tmout_pt, uint, 0644);
-MODULE_PARM_DESC(scmd_tmout_pt, "scsi commands timeout for passthrough(seconds)");
-
static u32 scmd_tmout_nonpt = 180;
module_param(scmd_tmout_nonpt, uint, 0644);
MODULE_PARM_DESC(scmd_tmout_nonpt, "scsi commands timeout for rawdisk&raid(seconds)");
-static u32 wait_abl_tmout = 3;
-module_param(wait_abl_tmout, uint, 0644);
-MODULE_PARM_DESC(wait_abl_tmout, "wait abnormal io timeout(seconds)");
-
-static bool use_sgl_force;
-module_param(use_sgl_force, bool, 0644);
-MODULE_PARM_DESC(use_sgl_force, "force IO use sgl format, default false");
-
static int ioq_depth_set(const char *val, const struct kernel_param *kp);
static const struct kernel_param_ops ioq_depth_ops = {
.set = ioq_depth_set,
@@ -106,16 +98,20 @@ static const struct kernel_param_ops small_pool_num_ops = {
.get = param_get_byte,
};
+/* It was found that the spinlock of a single pool is heavily
+ * contended when multiple CPUs allocate from it, so multiple
+ * pools are introduced to reduce the contention.
+ */
static unsigned char small_pool_num = 4;
module_param_cb(small_pool_num, &small_pool_num_ops, &small_pool_num, 0644);
MODULE_PARM_DESC(small_pool_num, "set prp small pool num, default 4, MAX 16");
static void spraid_free_queue(struct spraid_queue *spraidq);
static void spraid_handle_aen_notice(struct spraid_dev *hdev, u32 result);
-static void spraid_handle_aen_vs(struct spraid_dev *hdev, u32 result);
+static void spraid_handle_aen_vs(struct spraid_dev *hdev, u32 result, u32 result1);
static DEFINE_IDA(spraid_instance_ida);
-static dev_t spraid_chr_devt;
+
static struct class *spraid_class;
#define SPRAID_CAP_TIMEOUT_UNIT_MS (HZ / 2)
@@ -133,7 +129,7 @@ static struct workqueue_struct *spraid_wq;
#define ADMIN_TIMEOUT (admin_tmout * HZ)
#define ADMIN_ERR_TIMEOUT 32757
-#define SPRAID_WAIT_ABNL_CMD_TIMEOUT (wait_abl_tmout * 2)
+#define SPRAID_WAIT_ABNL_CMD_TIMEOUT (3 * 2)
#define SPRAID_DMA_MSK_BIT_MAX 64
@@ -147,6 +143,13 @@ enum FW_STAT_CODE {
FW_STAT_NEED_RETRY
};
+static const char * const raid_levels[] = {"0", "1", "5", "6", "10", "50", "60", "NA"};
+
+static const char * const raid_states[] = {
+ "NA", "NORMAL", "FAULT", "DEGRADE", "NOT_FORMATTED", "FORMATTING", "SANITIZING",
+ "INITIALIZING", "INITIALIZE_FAIL", "DELETING", "DELETE_FAIL", "WRITE_PROTECT"
+};
+
static int ioq_depth_set(const char *val, const struct kernel_param *kp)
{
int n = 0;
@@ -263,12 +266,6 @@ static int spraid_pci_enable(struct spraid_dev *hdev)
return ret;
}
-static inline
-struct spraid_admin_request *spraid_admin_req(struct request *req)
-{
- return blk_mq_rq_to_pdu(req);
-}
-
static int spraid_npages_prp(u32 size, struct spraid_dev *hdev)
{
u32 nprps = DIV_ROUND_UP(size + hdev->page_size, hdev->page_size);
@@ -419,7 +416,7 @@ static void spraid_submit_cmd(struct spraid_queue *spraidq, const void *cmd)
writel(spraidq->sq_tail, spraidq->q_db);
spin_unlock_irqrestore(&spraidq->sq_lock, flags);
- dev_log_dbg(spraidq->hdev->dev, "cid[%d], qid[%d], opcode[0x%x], flags[0x%x], hdid[%u]\n",
+ dev_log_dbg(spraidq->hdev->dev, "cid[%d] qid[%d], opcode[0x%x], flags[0x%x], hdid[%u]\n",
acd->command_id, spraidq->qid, acd->opcode, acd->flags, le32_to_cpu(acd->hdid));
}
@@ -605,18 +602,15 @@ static void spraid_setup_rw_cmd(struct spraid_dev *hdev,
if (scmd->cmd_len == 6) {
datalength = (u32)(scmd->cmnd[4] == 0 ?
IO_6_DEFAULT_TX_LEN : scmd->cmnd[4]);
- start_lba_lo = ((u32)scmd->cmnd[1] << 16) |
- ((u32)scmd->cmnd[2] << 8) | (u32)scmd->cmnd[3];
+ start_lba_lo = (u32)get_unaligned_be24(&scmd->cmnd[1]);
start_lba_lo &= 0x1FFFFF;
}
/* 10-byte READ(0x28) or WRITE(0x2A) cdb */
else if (scmd->cmd_len == 10) {
- datalength = (u32)scmd->cmnd[8] | ((u32)scmd->cmnd[7] << 8);
- start_lba_lo = ((u32)scmd->cmnd[2] << 24) |
- ((u32)scmd->cmnd[3] << 16) |
- ((u32)scmd->cmnd[4] << 8) | (u32)scmd->cmnd[5];
+ datalength = (u32)get_unaligned_be16(&scmd->cmnd[7]);
+ start_lba_lo = get_unaligned_be32(&scmd->cmnd[2]);
if (scmd->cmnd[1] & FUA_MASK)
control |= SPRAID_RW_FUA;
@@ -624,42 +618,26 @@ static void spraid_setup_rw_cmd(struct spraid_dev *hdev,
/* 12-byte READ(0xA8) or WRITE(0xAA) cdb */
else if (scmd->cmd_len == 12) {
- datalength = ((u32)scmd->cmnd[6] << 24) |
- ((u32)scmd->cmnd[7] << 16) |
- ((u32)scmd->cmnd[8] << 8) | (u32)scmd->cmnd[9];
- start_lba_lo = ((u32)scmd->cmnd[2] << 24) |
- ((u32)scmd->cmnd[3] << 16) |
- ((u32)scmd->cmnd[4] << 8) | (u32)scmd->cmnd[5];
+ datalength = get_unaligned_be32(&scmd->cmnd[6]);
+ start_lba_lo = get_unaligned_be32(&scmd->cmnd[2]);
if (scmd->cmnd[1] & FUA_MASK)
control |= SPRAID_RW_FUA;
}
/* 16-byte READ(0x88) or WRITE(0x8A) cdb */
else if (scmd->cmd_len == 16) {
- datalength = ((u32)scmd->cmnd[10] << 24) |
- ((u32)scmd->cmnd[11] << 16) |
- ((u32)scmd->cmnd[12] << 8) | (u32)scmd->cmnd[13];
- start_lba_lo = ((u32)scmd->cmnd[6] << 24) |
- ((u32)scmd->cmnd[7] << 16) |
- ((u32)scmd->cmnd[8] << 8) | (u32)scmd->cmnd[9];
- start_lba_hi = ((u32)scmd->cmnd[2] << 24) |
- ((u32)scmd->cmnd[3] << 16) |
- ((u32)scmd->cmnd[4] << 8) | (u32)scmd->cmnd[5];
+ datalength = get_unaligned_be32(&scmd->cmnd[10]);
+ start_lba_lo = get_unaligned_be32(&scmd->cmnd[6]);
+ start_lba_hi = get_unaligned_be32(&scmd->cmnd[2]);
if (scmd->cmnd[1] & FUA_MASK)
control |= SPRAID_RW_FUA;
}
/* 32-byte READ(0x88) or WRITE(0x8A) cdb */
else if (scmd->cmd_len == 32) {
- datalength = ((u32)scmd->cmnd[28] << 24) |
- ((u32)scmd->cmnd[29] << 16) |
- ((u32)scmd->cmnd[30] << 8) | (u32)scmd->cmnd[31];
- start_lba_lo = ((u32)scmd->cmnd[16] << 24) |
- ((u32)scmd->cmnd[17] << 16) |
- ((u32)scmd->cmnd[18] << 8) | (u32)scmd->cmnd[19];
- start_lba_hi = ((u32)scmd->cmnd[12] << 24) |
- ((u32)scmd->cmnd[13] << 16) |
- ((u32)scmd->cmnd[14] << 8) | (u32)scmd->cmnd[15];
+ datalength = get_unaligned_be32(&scmd->cmnd[28]);
+ start_lba_lo = get_unaligned_be32(&scmd->cmnd[16]);
+ start_lba_hi = get_unaligned_be32(&scmd->cmnd[12]);
if (scmd->cmnd[10] & FUA_MASK)
control |= SPRAID_RW_FUA;
@@ -814,7 +792,7 @@ static void spraid_map_status(struct spraid_iod *iod, struct scsi_cmnd *scmd,
if (scmd->result & SAM_STAT_CHECK_CONDITION) {
memset(scmd->sense_buffer, 0, SCSI_SENSE_BUFFERSIZE);
memcpy(scmd->sense_buffer, iod->sense, SCSI_SENSE_BUFFERSIZE);
- set_driver_byte(scmd, DRIVER_SENSE);
+ scmd->result = (scmd->result & 0x00ffffff) | (DRIVER_SENSE << 24);
}
break;
case FW_STAT_ABORTED:
@@ -850,14 +828,13 @@ static int spraid_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
int ret;
if (unlikely(!scmd)) {
- dev_err(hdev->dev, "err, scmd is null, return 0\n");
+ dev_err(hdev->dev, "err, scmd is null\n");
return 0;
}
if (unlikely(hdev->state != SPRAID_LIVE)) {
set_host_byte(scmd, DID_NO_CONNECT);
scmd->scsi_done(scmd);
- dev_err(hdev->dev, "[%s] err, hdev state is not live\n", __func__);
return 0;
}
@@ -894,7 +871,7 @@ static int spraid_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
WRITE_ONCE(iod->state, SPRAID_CMD_IN_FLIGHT);
spraid_submit_cmd(ioq, &ioq_cmd);
elapsed = jiffies - scmd->jiffies_at_alloc;
- dev_log_dbg(hdev->dev, "cid[%d], qid[%d] submit IO cost %3ld.%3ld seconds\n",
+ dev_log_dbg(hdev->dev, "cid[%d] qid[%d] submit IO cost %3ld.%3ld seconds\n",
cid, hwq, elapsed / HZ, elapsed % HZ);
return 0;
@@ -945,6 +922,10 @@ static int spraid_slave_alloc(struct scsi_device *sdev)
scan_host:
hostdata->hdid = le32_to_cpu(hdev->devices[idx].hdid);
+ hostdata->max_io_kb = le16_to_cpu(hdev->devices[idx].max_io_kb);
+ hostdata->attr = hdev->devices[idx].attr;
+ hostdata->flag = hdev->devices[idx].flag;
+ hostdata->rg_id = 0xff;
sdev->hostdata = hostdata;
up_read(&hdev->devices_rwsem);
return 0;
@@ -964,13 +945,13 @@ static int spraid_slave_configure(struct scsi_device *sdev)
struct spraid_sdev_hostdata *hostdata = sdev->hostdata;
u32 max_sec = sdev->host->max_sectors;
- if (!hostdata) {
+ if (hostdata) {
idx = hostdata->hdid - 1;
if (sdev->channel == hdev->devices[idx].channel &&
sdev->id == le16_to_cpu(hdev->devices[idx].target) &&
sdev->lun < hdev->devices[idx].lun) {
if (SPRAID_DEV_INFO_ATTR_PT(hdev->devices[idx].attr))
- timeout = scmd_tmout_pt * HZ;
+ timeout = 30 * HZ;
else
timeout = scmd_tmout_nonpt * HZ;
max_sec = le16_to_cpu(hdev->devices[idx].max_io_kb) << 1;
@@ -991,7 +972,6 @@ static int spraid_slave_configure(struct scsi_device *sdev)
if ((max_sec == 0) || (max_sec > sdev->host->max_sectors))
max_sec = sdev->host->max_sectors;
- blk_queue_max_hw_sectors(sdev->request_queue, max_sec);
dev_info(hdev->dev, "[%s] sdev->channel:id:lun[%d:%d:%lld], scmd_timeout[%d]s, maxsec[%d]\n",
__func__, sdev->channel, sdev->id, sdev->lun, timeout / HZ, max_sec);
@@ -1176,6 +1156,75 @@ static inline bool spraid_cqe_pending(struct spraid_queue *spraidq)
spraidq->cq_phase;
}
+static void spraid_sata_report_zone_handle(struct scsi_cmnd *scmd, struct spraid_iod *iod)
+{
+ int i = 0;
+ unsigned int bytes = 0;
+ struct scatterlist *sg = scsi_sglist(scmd);
+
+ scsi_for_each_sg(scmd, sg, iod->nsge, i) {
+ unsigned int offset = 0;
+
+ if (bytes == 0) {
+ char *hdr;
+ u32 list_length;
+ u64 max_lba, opt_lba;
+ u16 same;
+
+ hdr = sg_virt(sg);
+
+ list_length = get_unaligned_le32(&hdr[0]);
+ same = get_unaligned_le16(&hdr[4]);
+ max_lba = get_unaligned_le64(&hdr[8]);
+ opt_lba = get_unaligned_le64(&hdr[16]);
+ put_unaligned_be32(list_length, &hdr[0]);
+ hdr[4] = same & 0xf;
+ put_unaligned_be64(max_lba, &hdr[8]);
+ put_unaligned_be64(opt_lba, &hdr[16]);
+ offset += 64;
+ bytes += 64;
+ }
+ while (offset < sg_dma_len(sg)) {
+ char *rec;
+ u8 cond, type, non_seq, reset;
+ u64 size, start, wp;
+
+ rec = sg_virt(sg) + offset;
+ type = rec[0] & 0xf;
+ cond = (rec[1] >> 4) & 0xf;
+ non_seq = (rec[1] & 2);
+ reset = (rec[1] & 1);
+ size = get_unaligned_le64(&rec[8]);
+ start = get_unaligned_le64(&rec[16]);
+ wp = get_unaligned_le64(&rec[24]);
+ rec[0] = type;
+ rec[1] = (cond << 4) | non_seq | reset;
+ put_unaligned_be64(size, &rec[8]);
+ put_unaligned_be64(start, &rec[16]);
+ put_unaligned_be64(wp, &rec[24]);
+ WARN_ON(offset + 64 > sg_dma_len(sg));
+ offset += 64;
+ bytes += 64;
+ }
+ }
+}
+
+static inline void spraid_handle_ata_cmd(struct spraid_dev *hdev, struct scsi_cmnd *scmd,
+ struct spraid_iod *iod)
+{
+ if (hdev->ctrl_info->card_type != SPRAID_CARD_HBA)
+ return;
+
+ switch (scmd->cmnd[0]) {
+ case ZBC_IN:
+ dev_info(hdev->dev, "[%s] process report zone\n", __func__);
+ spraid_sata_report_zone_handle(scmd, iod);
+ break;
+ default:
+ break;
+ }
+}
+
static void spraid_complete_ioq_cmnd(struct spraid_queue *ioq, struct spraid_completion *cqe)
{
struct spraid_dev *hdev = ioq->hdev;
@@ -1197,12 +1246,12 @@ static void spraid_complete_ioq_cmnd(struct spraid_queue *ioq, struct spraid_com
iod = scsi_cmd_priv(scmd);
elapsed = jiffies - scmd->jiffies_at_alloc;
- dev_log_dbg(hdev->dev, "cid[%d], qid[%d] finish IO cost %3ld.%3ld seconds\n",
+ dev_log_dbg(hdev->dev, "cid[%d] qid[%d] finish IO cost %3ld.%3ld seconds\n",
cqe->cmd_id, ioq->qid, elapsed / HZ, elapsed % HZ);
if (cmpxchg(&iod->state, SPRAID_CMD_IN_FLIGHT, SPRAID_CMD_COMPLETE) !=
SPRAID_CMD_IN_FLIGHT) {
- dev_warn(hdev->dev, "cid[%d], qid[%d] enters abnormal handler, cost %3ld.%3ld seconds\n",
+ dev_warn(hdev->dev, "cid[%d] qid[%d] enters abnormal handler, cost %3ld.%3ld seconds\n",
cqe->cmd_id, ioq->qid, elapsed / HZ, elapsed % HZ);
WRITE_ONCE(iod->state, SPRAID_CMD_TMO_COMPLETE);
@@ -1215,6 +1264,8 @@ static void spraid_complete_ioq_cmnd(struct spraid_queue *ioq, struct spraid_com
return;
}
+ spraid_handle_ata_cmd(hdev, scmd, iod);
+
spraid_map_status(iod, scmd, cqe);
if (iod->nsge) {
iod->nsge = 0;
@@ -1224,38 +1275,36 @@ static void spraid_complete_ioq_cmnd(struct spraid_queue *ioq, struct spraid_com
scmd->scsi_done(scmd);
}
-static inline void spraid_end_admin_request(struct request *req, __le16 status,
- __le32 result0, __le32 result1)
-{
- struct spraid_admin_request *rq = spraid_admin_req(req);
-
- rq->status = le16_to_cpu(status) >> 1;
- rq->result0 = le32_to_cpu(result0);
- rq->result1 = le32_to_cpu(result1);
- blk_mq_complete_request(req);
-}
-
static void spraid_complete_adminq_cmnd(struct spraid_queue *adminq, struct spraid_completion *cqe)
{
- struct blk_mq_tags *tags = adminq->hdev->admin_tagset.tags[0];
- struct request *req;
+ struct spraid_dev *hdev = adminq->hdev;
+ struct spraid_cmd *adm_cmd;
- req = blk_mq_tag_to_rq(tags, cqe->cmd_id);
- if (unlikely(!req)) {
+ adm_cmd = hdev->adm_cmds + cqe->cmd_id;
+ if (unlikely(adm_cmd->state == SPRAID_CMD_IDLE)) {
dev_warn(adminq->hdev->dev, "Invalid id %d completed on queue %d\n",
cqe->cmd_id, le16_to_cpu(cqe->sq_id));
return;
}
- spraid_end_admin_request(req, cqe->status, cqe->result, cqe->result1);
+
+ adm_cmd->status = le16_to_cpu(cqe->status) >> 1;
+ adm_cmd->result0 = le32_to_cpu(cqe->result);
+ adm_cmd->result1 = le32_to_cpu(cqe->result1);
+
+ complete(&adm_cmd->cmd_done);
}
+static void spraid_send_aen(struct spraid_dev *hdev, u16 cid);
+
static void spraid_complete_aen(struct spraid_queue *spraidq, struct spraid_completion *cqe)
{
struct spraid_dev *hdev = spraidq->hdev;
u32 result = le32_to_cpu(cqe->result);
- dev_info(hdev->dev, "rcv aen, status[%x], result[%x]\n",
- le16_to_cpu(cqe->status) >> 1, result);
+ dev_info(hdev->dev, "rcv aen, cid[%d], status[0x%x], result[0x%x]\n",
+ cqe->cmd_id, le16_to_cpu(cqe->status) >> 1, result);
+
+ spraid_send_aen(hdev, cqe->cmd_id);
if ((le16_to_cpu(cqe->status) >> 1) != SPRAID_SC_SUCCESS)
return;
@@ -1264,22 +1313,19 @@ static void spraid_complete_aen(struct spraid_queue *spraidq, struct spraid_comp
spraid_handle_aen_notice(hdev, result);
break;
case SPRAID_AEN_VS:
- spraid_handle_aen_vs(hdev, result);
+ spraid_handle_aen_vs(hdev, result, le32_to_cpu(cqe->result1));
break;
default:
dev_warn(hdev->dev, "Unsupported async event type: %u\n",
result & 0x7);
break;
}
- queue_work(spraid_wq, &hdev->aen_work);
}
-static void spraid_put_ioq_ptcmd(struct spraid_dev *hdev, struct spraid_ioq_ptcmd *cmd);
-
static void spraid_complete_ioq_sync_cmnd(struct spraid_queue *ioq, struct spraid_completion *cqe)
{
struct spraid_dev *hdev = ioq->hdev;
- struct spraid_ioq_ptcmd *ptcmd;
+ struct spraid_cmd *ptcmd;
ptcmd = hdev->ioq_ptcmds + (ioq->qid - 1) * SPRAID_PTCMDS_PERQ +
cqe->cmd_id - SPRAID_IO_BLK_MQ_DEPTH;
@@ -1289,8 +1335,6 @@ static void spraid_complete_ioq_sync_cmnd(struct spraid_queue *ioq, struct sprai
ptcmd->result1 = le32_to_cpu(cqe->result1);
complete(&ptcmd->cmd_done);
-
- spraid_put_ioq_ptcmd(hdev, ptcmd);
}
static inline void spraid_handle_cqe(struct spraid_queue *spraidq, u16 idx)
@@ -1304,7 +1348,7 @@ static inline void spraid_handle_cqe(struct spraid_queue *spraidq, u16 idx)
return;
}
- dev_log_dbg(hdev->dev, "cid[%d], qid[%d], result[0x%x], sq_id[%d], status[0x%x]\n",
+ dev_log_dbg(hdev->dev, "cid[%d] qid[%d], result[0x%x], sq_id[%d], status[0x%x]\n",
cqe->cmd_id, spraidq->qid, le32_to_cpu(cqe->result),
le16_to_cpu(cqe->sq_id), le16_to_cpu(cqe->status));
@@ -1452,62 +1496,119 @@ static u32 spraid_bar_size(struct spraid_dev *hdev, u32 nr_ioqs)
return (SPRAID_REG_DBS + ((nr_ioqs + 1) * 8 * hdev->db_stride));
}
-static inline void spraid_clear_spraid_request(struct request *req)
+static int spraid_alloc_admin_cmds(struct spraid_dev *hdev)
{
- if (!(req->rq_flags & RQF_DONTPREP)) {
- spraid_admin_req(req)->flags = 0;
- req->rq_flags |= RQF_DONTPREP;
+ int i;
+
+ INIT_LIST_HEAD(&hdev->adm_cmd_list);
+ spin_lock_init(&hdev->adm_cmd_lock);
+
+ hdev->adm_cmds = kcalloc_node(SPRAID_AQ_BLK_MQ_DEPTH, sizeof(struct spraid_cmd),
+ GFP_KERNEL, hdev->numa_node);
+
+ if (!hdev->adm_cmds) {
+ dev_err(hdev->dev, "Alloc admin cmds failed\n");
+ return -ENOMEM;
+ }
+
+ for (i = 0; i < SPRAID_AQ_BLK_MQ_DEPTH; i++) {
+ hdev->adm_cmds[i].qid = 0;
+ hdev->adm_cmds[i].cid = i;
+ list_add_tail(&(hdev->adm_cmds[i].list), &hdev->adm_cmd_list);
}
+
+ dev_info(hdev->dev, "Alloc admin cmds success, num[%d]\n", SPRAID_AQ_BLK_MQ_DEPTH);
+
+ return 0;
}
-static struct request *spraid_alloc_admin_request(struct request_queue *q,
- struct spraid_admin_command *cmd,
- blk_mq_req_flags_t flags)
+static void spraid_free_admin_cmds(struct spraid_dev *hdev)
{
- u32 op = COMMAND_IS_WRITE(cmd) ? REQ_OP_DRV_OUT : REQ_OP_DRV_IN;
- struct request *req;
+ kfree(hdev->adm_cmds);
+ hdev->adm_cmds = NULL;
+ INIT_LIST_HEAD(&hdev->adm_cmd_list);
+}
+
+static struct spraid_cmd *spraid_get_cmd(struct spraid_dev *hdev, enum spraid_cmd_type type)
+{
+ struct spraid_cmd *cmd = NULL;
+ unsigned long flags;
+ struct list_head *head = &hdev->adm_cmd_list;
+ spinlock_t *slock = &hdev->adm_cmd_lock;
- req = blk_mq_alloc_request(q, op, flags);
- if (IS_ERR(req))
- return req;
- req->cmd_flags |= REQ_FAILFAST_DRIVER;
- spraid_clear_spraid_request(req);
- spraid_admin_req(req)->cmd = cmd;
+ if (type == SPRAID_CMD_IOPT) {
+ head = &hdev->ioq_pt_list;
+ slock = &hdev->ioq_pt_lock;
+ }
+
+ spin_lock_irqsave(slock, flags);
+ if (list_empty(head)) {
+ spin_unlock_irqrestore(slock, flags);
+ dev_err(hdev->dev, "err, cmd[%d] list empty\n", type);
+ return NULL;
+ }
+ cmd = list_entry(head->next, struct spraid_cmd, list);
+ list_del_init(&cmd->list);
+ spin_unlock_irqrestore(slock, flags);
+
+ WRITE_ONCE(cmd->state, SPRAID_CMD_IN_FLIGHT);
- return req;
+ return cmd;
}
-static int spraid_submit_admin_sync_cmd(struct request_queue *q,
- struct spraid_admin_command *cmd,
- u32 *result, void *buffer,
- u32 bufflen, u32 timeout, int at_head, blk_mq_req_flags_t flags)
+static void spraid_put_cmd(struct spraid_dev *hdev, struct spraid_cmd *cmd,
+ enum spraid_cmd_type type)
{
- struct request *req;
- int ret;
+ unsigned long flags;
+ struct list_head *head = &hdev->adm_cmd_list;
+ spinlock_t *slock = &hdev->adm_cmd_lock;
- req = spraid_alloc_admin_request(q, cmd, flags);
- if (IS_ERR(req))
- return PTR_ERR(req);
+ if (type == SPRAID_CMD_IOPT) {
+ head = &hdev->ioq_pt_list;
+ slock = &hdev->ioq_pt_lock;
+ }
- req->timeout = timeout ? timeout : ADMIN_TIMEOUT;
- if (buffer && bufflen) {
- ret = blk_rq_map_kern(q, req, buffer, bufflen, GFP_KERNEL);
- if (ret)
- goto out;
+ spin_lock_irqsave(slock, flags);
+ WRITE_ONCE(cmd->state, SPRAID_CMD_IDLE);
+ list_add_tail(&cmd->list, head);
+ spin_unlock_irqrestore(slock, flags);
+}
+
+static int spraid_submit_admin_sync_cmd(struct spraid_dev *hdev, struct spraid_admin_command *cmd,
+ u32 *result0, u32 *result1, u32 timeout)
+{
+ struct spraid_cmd *adm_cmd = spraid_get_cmd(hdev, SPRAID_CMD_ADM);
+
+ if (!adm_cmd) {
+ dev_err(hdev->dev, "err, get admin cmd failed\n");
+ return -EFAULT;
}
- blk_execute_rq(req->q, NULL, req, at_head);
- if (result)
- *result = spraid_admin_req(req)->result0;
+ timeout = timeout ? timeout : ADMIN_TIMEOUT;
- if (spraid_admin_req(req)->flags & SPRAID_REQ_CANCELLED)
- ret = -EINTR;
- else
- ret = spraid_admin_req(req)->status;
+ init_completion(&adm_cmd->cmd_done);
-out:
- blk_mq_free_request(req);
- return ret;
+ cmd->common.command_id = adm_cmd->cid;
+ spraid_submit_cmd(&hdev->queues[0], cmd);
+
+ if (!wait_for_completion_timeout(&adm_cmd->cmd_done, timeout)) {
+ dev_err(hdev->dev, "[%s] cid[%d] qid[%d] timeout, opcode[0x%x] subopcode[0x%x]\n",
+ __func__, adm_cmd->cid, adm_cmd->qid, cmd->usr_cmd.opcode,
+ cmd->usr_cmd.info_0.subopcode);
+ WRITE_ONCE(adm_cmd->state, SPRAID_CMD_TIMEOUT);
+ spraid_put_cmd(hdev, adm_cmd, SPRAID_CMD_ADM);
+ return -EINVAL;
+ }
+
+ if (result0)
+ *result0 = adm_cmd->result0;
+ if (result1)
+ *result1 = adm_cmd->result1;
+
+ spraid_put_cmd(hdev, adm_cmd, SPRAID_CMD_ADM);
+
+ return adm_cmd->status;
}
static int spraid_create_cq(struct spraid_dev *hdev, u16 qid,
@@ -1524,8 +1625,7 @@ static int spraid_create_cq(struct spraid_dev *hdev, u16 qid,
admin_cmd.create_cq.cq_flags = cpu_to_le16(flags);
admin_cmd.create_cq.irq_vector = cpu_to_le16(cq_vector);
- return spraid_submit_admin_sync_cmd(hdev->admin_q, &admin_cmd, NULL,
- NULL, 0, 0, 0, 0);
+ return spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
}
static int spraid_create_sq(struct spraid_dev *hdev, u16 qid,
@@ -1542,8 +1642,7 @@ static int spraid_create_sq(struct spraid_dev *hdev, u16 qid,
admin_cmd.create_sq.sq_flags = cpu_to_le16(flags);
admin_cmd.create_sq.cqid = cpu_to_le16(qid);
- return spraid_submit_admin_sync_cmd(hdev->admin_q, &admin_cmd, NULL,
- NULL, 0, 0, 0, 0);
+ return spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
}
static void spraid_free_queue(struct spraid_queue *spraidq)
@@ -1581,8 +1680,7 @@ static int spraid_delete_queue(struct spraid_dev *hdev, u8 op, u16 id)
admin_cmd.delete_queue.opcode = op;
admin_cmd.delete_queue.qid = cpu_to_le16(id);
- ret = spraid_submit_admin_sync_cmd(hdev->admin_q, &admin_cmd, NULL,
- NULL, 0, 0, 0, 0);
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
if (ret)
dev_err(hdev->dev, "Delete %s:[%d] failed\n",
@@ -1663,19 +1761,28 @@ static int spraid_set_features(struct spraid_dev *hdev, u32 fid, u32 dword11, vo
size_t buflen, u32 *result)
{
struct spraid_admin_command admin_cmd;
- u32 res;
int ret;
+ u8 *data_ptr = NULL;
+ dma_addr_t data_dma = 0;
+
+ if (buffer && buflen) {
+ data_ptr = dma_alloc_coherent(hdev->dev, buflen, &data_dma, GFP_KERNEL);
+ if (!data_ptr)
+ return -ENOMEM;
+
+ memcpy(data_ptr, buffer, buflen);
+ }
memset(&admin_cmd, 0, sizeof(admin_cmd));
admin_cmd.features.opcode = SPRAID_ADMIN_SET_FEATURES;
admin_cmd.features.fid = cpu_to_le32(fid);
admin_cmd.features.dword11 = cpu_to_le32(dword11);
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
- ret = spraid_submit_admin_sync_cmd(hdev->admin_q, &admin_cmd, &res,
- buffer, buflen, 0, 0, 0);
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, result, NULL, 0);
- if (!ret && result)
- *result = res;
+ if (data_ptr)
+ dma_free_coherent(hdev->dev, buflen, data_ptr, data_dma);
return ret;
}
@@ -1764,8 +1871,7 @@ static int spraid_setup_io_queues(struct spraid_dev *hdev)
break;
}
dev_info(hdev->dev, "[%s] max_qid: %d, queue_count: %d, online_queue: %d, ioq_depth: %d\n",
- __func__, hdev->max_qid, hdev->queue_count,
- hdev->online_queues, hdev->ioq_depth);
+ __func__, hdev->max_qid, hdev->queue_count, hdev->online_queues, hdev->ioq_depth);
return spraid_create_io_queues(hdev);
}
@@ -1889,10 +1995,11 @@ static int spraid_get_dev_list(struct spraid_dev *hdev, struct spraid_dev_info *
u32 nd = le32_to_cpu(hdev->ctrl_info->nd);
struct spraid_admin_command admin_cmd;
struct spraid_dev_list *list_buf;
+ dma_addr_t data_dma = 0;
u32 i, idx, hdid, ndev;
int ret = 0;
- list_buf = kmalloc(sizeof(*list_buf), GFP_KERNEL);
+ list_buf = dma_alloc_coherent(hdev->dev, PAGE_SIZE, &data_dma, GFP_KERNEL);
if (!list_buf)
return -ENOMEM;
@@ -1901,9 +2008,9 @@ static int spraid_get_dev_list(struct spraid_dev *hdev, struct spraid_dev_info *
admin_cmd.get_info.opcode = SPRAID_ADMIN_GET_INFO;
admin_cmd.get_info.type = SPRAID_GET_INFO_DEV_LIST;
admin_cmd.get_info.cdw11 = cpu_to_le32(idx);
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
- ret = spraid_submit_admin_sync_cmd(hdev->admin_q, &admin_cmd, NULL, list_buf,
- sizeof(*list_buf), 0, 0, 0);
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
if (ret) {
dev_err(hdev->dev, "Get device list failed, nd: %u, idx: %u, ret: %d\n",
@@ -1916,12 +2023,11 @@ static int spraid_get_dev_list(struct spraid_dev *hdev, struct spraid_dev_info *
for (i = 0; i < ndev; i++) {
hdid = le32_to_cpu(list_buf->devices[i].hdid);
- dev_info(hdev->dev, "list_buf->devices[%d], hdid: %u target: %d, channel: %d, lun: %d, attr[%x]\n",
- i, hdid,
- le16_to_cpu(list_buf->devices[i].target),
- list_buf->devices[i].channel,
- list_buf->devices[i].lun,
- list_buf->devices[i].attr);
+ dev_info(hdev->dev, "list_buf->devices[%d], hdid: %u target: %d, channel: %d, lun: %d, attr[0x%x]\n",
+ i, hdid, le16_to_cpu(list_buf->devices[i].target),
+ list_buf->devices[i].channel,
+ list_buf->devices[i].lun,
+ list_buf->devices[i].attr);
if (hdid > nd || hdid == 0) {
dev_err(hdev->dev, "err, hdid[%d] invalid\n", hdid);
continue;
@@ -1936,21 +2042,29 @@ static int spraid_get_dev_list(struct spraid_dev *hdev, struct spraid_dev_info *
}
out:
- kfree(list_buf);
+ dma_free_coherent(hdev->dev, PAGE_SIZE, list_buf, data_dma);
return ret;
}
-static void spraid_send_aen(struct spraid_dev *hdev)
+static void spraid_send_aen(struct spraid_dev *hdev, u16 cid)
{
struct spraid_queue *adminq = &hdev->queues[0];
struct spraid_admin_command admin_cmd;
memset(&admin_cmd, 0, sizeof(admin_cmd));
admin_cmd.common.opcode = SPRAID_ADMIN_ASYNC_EVENT;
- admin_cmd.common.command_id = SPRAID_AQ_BLK_MQ_DEPTH;
+ admin_cmd.common.command_id = cid;
spraid_submit_cmd(adminq, &admin_cmd);
- dev_info(hdev->dev, "send aen, cid[%d]\n", SPRAID_AQ_BLK_MQ_DEPTH);
+ dev_info(hdev->dev, "send aen, cid[%d]\n", cid);
+}
+
+static inline void spraid_send_all_aen(struct spraid_dev *hdev)
+{
+ u16 i;
+
+ for (i = 0; i < hdev->ctrl_info->aerl; i++)
+ spraid_send_aen(hdev, i + SPRAID_AQ_BLK_MQ_DEPTH);
}
static int spraid_add_device(struct spraid_dev *hdev, struct spraid_dev_info *device)
@@ -1958,6 +2072,10 @@ static int spraid_add_device(struct spraid_dev *hdev, struct spraid_dev_info *de
struct Scsi_Host *shost = hdev->shost;
struct scsi_device *sdev;
+ dev_info(hdev->dev, "add device, hdid: %u target: %d, channel: %d, lun: %d, attr[0x%x]\n",
+ le32_to_cpu(device->hdid), le16_to_cpu(device->target),
+ device->channel, device->lun, device->attr);
+
sdev = scsi_device_lookup(shost, device->channel, le16_to_cpu(device->target), 0);
if (sdev) {
dev_warn(hdev->dev, "Device is already exist, channel: %d, target_id: %d, lun: %d\n",
@@ -1974,9 +2092,13 @@ static int spraid_rescan_device(struct spraid_dev *hdev, struct spraid_dev_info
struct Scsi_Host *shost = hdev->shost;
struct scsi_device *sdev;
+ dev_info(hdev->dev, "rescan device, hdid: %u target: %d, channel: %d, lun: %d, attr[0x%x]\n",
+ le32_to_cpu(device->hdid), le16_to_cpu(device->target),
+ device->channel, device->lun, device->attr);
+
sdev = scsi_device_lookup(shost, device->channel, le16_to_cpu(device->target), 0);
if (!sdev) {
- dev_warn(hdev->dev, "Device is not exit, channel: %d, target_id: %d, lun: %d\n",
+ dev_warn(hdev->dev, "device is not exit rescan it, channel: %d, target_id: %d, lun: %d\n",
device->channel, le16_to_cpu(device->target), 0);
return -ENODEV;
}
@@ -1991,9 +2113,13 @@ static int spraid_remove_device(struct spraid_dev *hdev, struct spraid_dev_info
struct Scsi_Host *shost = hdev->shost;
struct scsi_device *sdev;
+ dev_info(hdev->dev, "remove device, hdid: %u target: %d, channel: %d, lun: %d, attr[0x%x]\n",
+ le32_to_cpu(org_device->hdid), le16_to_cpu(org_device->target),
+ org_device->channel, org_device->lun, org_device->attr);
+
sdev = scsi_device_lookup(shost, org_device->channel, le16_to_cpu(org_device->target), 0);
if (!sdev) {
- dev_warn(hdev->dev, "Device is not exit, channel: %d, target_id: %d, lun: %d\n",
+ dev_warn(hdev->dev, "device is not exit remove it, channel: %d, target_id: %d, lun: %d\n",
org_device->channel, le16_to_cpu(org_device->target), 0);
return -ENODEV;
}
@@ -2029,36 +2155,54 @@ static int spraid_dev_list_init(struct spraid_dev *hdev)
return 0;
}
+static int luntarget_cmp_func(const void *l, const void *r)
+{
+ const struct spraid_dev_info *ln = l;
+ const struct spraid_dev_info *rn = r;
+
+ if (ln->channel == rn->channel)
+ return ln->target - rn->target;
+
+ return ln->channel - rn->channel;
+}
+
static void spraid_scan_work(struct work_struct *work)
{
struct spraid_dev *hdev =
container_of(work, struct spraid_dev, scan_work);
struct spraid_dev_info *devices, *org_devices;
+ struct spraid_dev_info *sortdevice;
u32 nd = le32_to_cpu(hdev->ctrl_info->nd);
u8 flag, org_flag;
int i, ret;
+ int count = 0;
devices = kcalloc(nd, sizeof(struct spraid_dev_info), GFP_KERNEL);
if (!devices)
return;
+
+ sortdevice = kcalloc(nd, sizeof(struct spraid_dev_info), GFP_KERNEL);
+ if (!sortdevice)
+ goto free_list;
+
ret = spraid_get_dev_list(hdev, devices);
if (ret)
- goto free_list;
+ goto free_all;
org_devices = hdev->devices;
for (i = 0; i < nd; i++) {
org_flag = org_devices[i].flag;
flag = devices[i].flag;
- dev_log_dbg(hdev->dev, "i: %d, org_flag: 0x%x, flag: 0x%x\n",
- i, org_flag, flag);
+ dev_log_dbg(hdev->dev, "i: %d, org_flag: 0x%x, flag: 0x%x\n", i, org_flag, flag);
if (SPRAID_DEV_INFO_FLAG_VALID(flag)) {
if (!SPRAID_DEV_INFO_FLAG_VALID(org_flag)) {
down_write(&hdev->devices_rwsem);
memcpy(&org_devices[i], &devices[i],
- sizeof(struct spraid_dev_info));
+ sizeof(struct spraid_dev_info));
+ memcpy(&sortdevice[count++], &devices[i],
+ sizeof(struct spraid_dev_info));
up_write(&hdev->devices_rwsem);
- spraid_add_device(hdev, &devices[i]);
} else if (SPRAID_DEV_INFO_FLAG_CHANGE(flag)) {
spraid_rescan_device(hdev, &devices[i]);
}
@@ -2071,6 +2215,16 @@ static void spraid_scan_work(struct work_struct *work)
}
}
}
+
+ dev_info(hdev->dev, "scan work add device count = %d\n", count);
+
+ sort(sortdevice, count, sizeof(sortdevice[0]), luntarget_cmp_func, NULL);
+
+ for (i = 0; i < count; i++)
+ spraid_add_device(hdev, &sortdevice[i]);
+
+free_all:
+ kfree(sortdevice);
free_list:
kfree(devices);
}
@@ -2083,6 +2237,15 @@ static void spraid_timesyn_work(struct work_struct *work)
spraid_configure_timestamp(hdev);
}
+static int spraid_init_ctrl_info(struct spraid_dev *hdev);
+static void spraid_fw_act_work(struct work_struct *work)
+{
+ struct spraid_dev *hdev = container_of(work, struct spraid_dev, fw_act_work);
+
+ if (spraid_init_ctrl_info(hdev))
+ dev_err(hdev->dev, "get ctrl info failed after fw act\n");
+}
+
static void spraid_queue_scan(struct spraid_dev *hdev)
{
queue_work(spraid_wq, &hdev->scan_work);
@@ -2094,6 +2257,9 @@ static void spraid_handle_aen_notice(struct spraid_dev *hdev, u32 result)
case SPRAID_AEN_DEV_CHANGED:
spraid_queue_scan(hdev);
break;
+ case SPRAID_AEN_FW_ACT_START:
+ dev_info(hdev->dev, "fw activation starting\n");
+ break;
case SPRAID_AEN_HOST_PROBING:
break;
default:
@@ -2101,25 +2267,25 @@ static void spraid_handle_aen_notice(struct spraid_dev *hdev, u32 result)
}
}
-static void spraid_handle_aen_vs(struct spraid_dev *hdev, u32 result)
+static void spraid_handle_aen_vs(struct spraid_dev *hdev, u32 result, u32 result1)
{
- switch (result) {
+ switch ((result & 0xff00) >> 8) {
case SPRAID_AEN_TIMESYN:
queue_work(spraid_wq, &hdev->timesyn_work);
break;
+ case SPRAID_AEN_FW_ACT_FINISH:
+ dev_info(hdev->dev, "fw activation finish\n");
+ queue_work(spraid_wq, &hdev->fw_act_work);
+ break;
+ case SPRAID_AEN_EVENT_MIN ... SPRAID_AEN_EVENT_MAX:
+ dev_info(hdev->dev, "rcv card event[%d], param1[0x%x] param2[0x%x]\n",
+ (result & 0xff00) >> 8, result, result1);
+ break;
default:
- dev_warn(hdev->dev, "async event result: %x\n", result);
+ dev_warn(hdev->dev, "async event result: 0x%x\n", result);
}
}
-static void spraid_async_event_work(struct work_struct *work)
-{
- struct spraid_dev *hdev =
- container_of(work, struct spraid_dev, aen_work);
-
- spraid_send_aen(hdev);
-}
-
static int spraid_alloc_resources(struct spraid_dev *hdev)
{
int ret, nqueue;
@@ -2149,10 +2315,16 @@ static int spraid_alloc_resources(struct spraid_dev *hdev)
goto destroy_dma_pools;
}
+ ret = spraid_alloc_admin_cmds(hdev);
+ if (ret)
+ goto free_queues;
+
dev_info(hdev->dev, "[%s] queues num: %d\n", __func__, nqueue);
return 0;
+free_queues:
+ kfree(hdev->queues);
destroy_dma_pools:
spraid_destroy_dma_pools(hdev);
free_ctrl_info:
@@ -2164,50 +2336,18 @@ static int spraid_alloc_resources(struct spraid_dev *hdev)
static void spraid_free_resources(struct spraid_dev *hdev)
{
+ spraid_free_admin_cmds(hdev);
kfree(hdev->queues);
spraid_destroy_dma_pools(hdev);
kfree(hdev->ctrl_info);
ida_free(&spraid_instance_ida, hdev->instance);
}
-static void spraid_setup_passthrough(struct request *req, struct spraid_admin_command *cmd)
-{
- memcpy(cmd, spraid_admin_req(req)->cmd, sizeof(*cmd));
- cmd->common.flags &= ~SPRAID_CMD_FLAG_SGL_ALL;
-}
-
-static inline void spraid_clear_hreq(struct request *req)
-{
- if (!(req->rq_flags & RQF_DONTPREP)) {
- spraid_admin_req(req)->flags = 0;
- req->rq_flags |= RQF_DONTPREP;
- }
-}
-
-static blk_status_t spraid_setup_admin_cmd(struct request *req, struct spraid_admin_command *cmd)
+static void spraid_bsg_unmap_data(struct spraid_dev *hdev, struct bsg_job *job)
{
- spraid_clear_hreq(req);
-
- memset(cmd, 0, sizeof(*cmd));
- switch (req_op(req)) {
- case REQ_OP_DRV_IN:
- case REQ_OP_DRV_OUT:
- spraid_setup_passthrough(req, cmd);
- break;
- default:
- WARN_ON_ONCE(1);
- return BLK_STS_IOERR;
- }
-
- cmd->common.command_id = req->tag;
- return BLK_STS_OK;
-}
-
-static void spraid_unmap_data(struct spraid_dev *hdev, struct request *req)
-{
- struct spraid_iod *iod = blk_mq_rq_to_pdu(req);
- enum dma_data_direction dma_dir = rq_data_dir(req) ?
- DMA_TO_DEVICE : DMA_FROM_DEVICE;
+ struct request *rq = blk_mq_rq_from_pdu(job);
+ struct spraid_iod *iod = job->dd_data;
+ enum dma_data_direction dma_dir = rq_data_dir(rq) ? DMA_TO_DEVICE : DMA_FROM_DEVICE;
if (iod->nsge)
dma_unmap_sg(hdev->dev, iod->sg, iod->nsge, dma_dir);
@@ -2215,36 +2355,36 @@ static void spraid_unmap_data(struct spraid_dev *hdev, struct request *req)
spraid_free_iod_res(hdev, iod);
}
-static blk_status_t spraid_admin_map_data(struct spraid_dev *hdev, struct request *req,
- struct spraid_admin_command *cmd)
+static int spraid_bsg_map_data(struct spraid_dev *hdev, struct bsg_job *job,
+ struct spraid_admin_command *cmd)
{
- struct spraid_iod *iod = blk_mq_rq_to_pdu(req);
- struct request_queue *admin_q = req->q;
- enum dma_data_direction dma_dir = rq_data_dir(req) ?
- DMA_TO_DEVICE : DMA_FROM_DEVICE;
- blk_status_t ret = BLK_STS_IOERR;
- int nr_mapped;
- int res;
+ struct request *rq = blk_mq_rq_from_pdu(job);
+ struct spraid_iod *iod = job->dd_data;
+ enum dma_data_direction dma_dir = rq_data_dir(rq) ? DMA_TO_DEVICE : DMA_FROM_DEVICE;
+ int ret = 0;
+
+ iod->sg = job->request_payload.sg_list;
+ iod->nsge = job->request_payload.sg_cnt;
+ iod->length = job->request_payload.payload_len;
+ iod->use_sgl = false;
+ iod->npages = -1;
+ iod->sg_drv_mgmt = false;
- sg_init_table(iod->sg, blk_rq_nr_phys_segments(req));
- iod->nsge = blk_rq_map_sg(admin_q, req, iod->sg);
if (!iod->nsge)
goto out;
- dev_info(hdev->dev, "nseg: %u, nsge: %u\n",
- blk_rq_nr_phys_segments(req), iod->nsge);
-
- ret = BLK_STS_RESOURCE;
- nr_mapped = dma_map_sg_attrs(hdev->dev, iod->sg, iod->nsge, dma_dir, DMA_ATTR_NO_WARN);
- if (!nr_mapped)
+ ret = dma_map_sg_attrs(hdev->dev, iod->sg, iod->nsge, dma_dir, DMA_ATTR_NO_WARN);
+ if (!ret)
goto out;
- res = spraid_setup_prps(hdev, iod);
- if (res)
+ ret = spraid_setup_prps(hdev, iod);
+ if (ret)
goto unmap;
+
cmd->common.dptr.prp1 = cpu_to_le64(sg_dma_address(iod->sg));
cmd->common.dptr.prp2 = cpu_to_le64(iod->first_dma);
- return BLK_STS_OK;
+
+ return 0;
unmap:
dma_unmap_sg(hdev->dev, iod->sg, iod->nsge, dma_dir);
@@ -2252,137 +2392,29 @@ static blk_status_t spraid_admin_map_data(struct spraid_dev *hdev, struct reques
return ret;
}
-static blk_status_t spraid_init_admin_iod(struct request *rq, struct spraid_dev *hdev)
-{
- struct spraid_iod *iod = blk_mq_rq_to_pdu(rq);
- int nents = blk_rq_nr_phys_segments(rq);
- unsigned int size = blk_rq_payload_bytes(rq);
-
- if (nents > SPRAID_INT_PAGES || size > SPRAID_INT_BYTES(hdev)) {
- iod->sg = mempool_alloc(hdev->iod_mempool, GFP_ATOMIC);
- if (!iod->sg)
- return BLK_STS_RESOURCE;
- } else {
- iod->sg = iod->inline_sg;
- }
-
- iod->nsge = 0;
- iod->use_sgl = false;
- iod->npages = -1;
- iod->length = size;
- iod->sg_drv_mgmt = true;
-
- return BLK_STS_OK;
-}
-
-static blk_status_t spraid_queue_admin_rq(struct blk_mq_hw_ctx *hctx,
- const struct blk_mq_queue_data *bd)
-{
- struct spraid_queue *adminq = hctx->driver_data;
- struct spraid_dev *hdev = adminq->hdev;
- struct request *req = bd->rq;
- struct spraid_iod *iod = blk_mq_rq_to_pdu(req);
- struct spraid_admin_command cmd;
- blk_status_t ret;
-
- ret = spraid_setup_admin_cmd(req, &cmd);
- if (ret)
- goto out;
-
- ret = spraid_init_admin_iod(req, hdev);
- if (ret)
- goto out;
-
- if (blk_rq_nr_phys_segments(req)) {
- ret = spraid_admin_map_data(hdev, req, &cmd);
- if (ret)
- goto cleanup_iod;
- }
-
- blk_mq_start_request(req);
- spraid_submit_cmd(adminq, &cmd);
- return BLK_STS_OK;
-
-cleanup_iod:
- spraid_free_iod_res(hdev, iod);
-out:
- return ret;
-}
-
-static blk_status_t spraid_error_status(struct request *req)
-{
- switch (spraid_admin_req(req)->status & 0x7ff) {
- case SPRAID_SC_SUCCESS:
- return BLK_STS_OK;
- default:
- return BLK_STS_IOERR;
- }
-}
-
-static void spraid_complete_admin_rq(struct request *req)
-{
- struct spraid_iod *iod = blk_mq_rq_to_pdu(req);
- struct spraid_dev *hdev = iod->spraidq->hdev;
-
- if (blk_rq_nr_phys_segments(req))
- spraid_unmap_data(hdev, req);
- blk_mq_end_request(req, spraid_error_status(req));
-}
-
-static int spraid_admin_init_hctx(struct blk_mq_hw_ctx *hctx, void *data, unsigned int hctx_idx)
-{
- struct spraid_dev *hdev = data;
- struct spraid_queue *adminq = &hdev->queues[0];
-
- WARN_ON(hctx_idx != 0);
- WARN_ON(hdev->admin_tagset.tags[0] != hctx->tags);
-
- hctx->driver_data = adminq;
- return 0;
-}
-
-static int spraid_admin_init_request(struct blk_mq_tag_set *set, struct request *req,
- unsigned int hctx_idx, unsigned int numa_node)
-{
- struct spraid_dev *hdev = set->driver_data;
- struct spraid_iod *iod = blk_mq_rq_to_pdu(req);
- struct spraid_queue *adminq = &hdev->queues[0];
-
- WARN_ON(!adminq);
- iod->spraidq = adminq;
- return 0;
-}
-
-static enum blk_eh_timer_return
-spraid_admin_timeout(struct request *req, bool reserved)
-{
- struct spraid_iod *iod = blk_mq_rq_to_pdu(req);
- struct spraid_queue *spraidq = iod->spraidq;
- struct spraid_dev *hdev = spraidq->hdev;
-
- dev_err(hdev->dev, "Admin cid[%d] qid[%d] timeout\n",
- req->tag, spraidq->qid);
-
- if (spraid_poll_cq(spraidq, req->tag)) {
- dev_warn(hdev->dev, "cid[%d] qid[%d] timeout, completion polled\n",
- req->tag, spraidq->qid);
- return BLK_EH_DONE;
- }
-
- spraid_end_admin_request(req, cpu_to_le16(-EINVAL), 0, 0);
- return BLK_EH_DONE;
-}
-
static int spraid_get_ctrl_info(struct spraid_dev *hdev, struct spraid_ctrl_info *ctrl_info)
{
struct spraid_admin_command admin_cmd;
+ u8 *data_ptr = NULL;
+ dma_addr_t data_dma = 0;
+ int ret;
+
+ data_ptr = dma_alloc_coherent(hdev->dev, PAGE_SIZE, &data_dma, GFP_KERNEL);
+ if (!data_ptr)
+ return -ENOMEM;
memset(&admin_cmd, 0, sizeof(admin_cmd));
admin_cmd.get_info.opcode = SPRAID_ADMIN_GET_INFO;
admin_cmd.get_info.type = SPRAID_GET_INFO_CTRL;
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
+
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+ if (!ret)
+ memcpy(ctrl_info, data_ptr, sizeof(struct spraid_ctrl_info));
- return spraid_submit_admin_sync_cmd(hdev->admin_q, &admin_cmd, NULL,
- ctrl_info, sizeof(struct spraid_ctrl_info), 0, 0, 0);
+ dma_free_coherent(hdev->dev, PAGE_SIZE, data_ptr, data_dma);
+
+ return ret;
}
static int spraid_init_ctrl_info(struct spraid_dev *hdev)
@@ -2416,6 +2448,11 @@ static int spraid_init_ctrl_info(struct spraid_dev *hdev)
dev_info(hdev->dev, "[%s]sn = %s\n", __func__, hdev->ctrl_info->sn);
dev_info(hdev->dev, "[%s]fr = %s\n", __func__, hdev->ctrl_info->fr);
+ if (!hdev->ctrl_info->aerl)
+ hdev->ctrl_info->aerl = 1;
+ if (hdev->ctrl_info->aerl > SPRAID_NR_AEN_COMMANDS)
+ hdev->ctrl_info->aerl = SPRAID_NR_AEN_COMMANDS;
+
return 0;
}
@@ -2444,99 +2481,54 @@ static void spraid_free_iod_ext_mem_pool(struct spraid_dev *hdev)
mempool_destroy(hdev->iod_mempool);
}
-static int spraid_submit_user_cmd(struct request_queue *q, struct spraid_admin_command *cmd,
- void __user *ubuffer, unsigned int bufflen, u32 *result,
- unsigned int timeout)
-{
- struct request *req;
- struct bio *bio = NULL;
- int ret;
-
- req = spraid_alloc_admin_request(q, cmd, 0);
- if (IS_ERR(req))
- return PTR_ERR(req);
-
- req->timeout = timeout ? timeout : ADMIN_TIMEOUT;
- spraid_admin_req(req)->flags |= SPRAID_REQ_USERCMD;
-
- if (ubuffer && bufflen) {
- ret = blk_rq_map_user(q, req, NULL, ubuffer, bufflen, GFP_KERNEL);
- if (ret)
- goto out;
- bio = req->bio;
- }
- blk_execute_rq(req->q, NULL, req, 0);
- if (spraid_admin_req(req)->flags & SPRAID_REQ_CANCELLED)
- ret = -EINTR;
- else
- ret = spraid_admin_req(req)->status;
- if (result) {
- result[0] = spraid_admin_req(req)->result0;
- result[1] = spraid_admin_req(req)->result1;
- }
- if (bio)
- blk_rq_unmap_user(bio);
-out:
- blk_mq_free_request(req);
- return ret;
-}
-
-static int spraid_user_admin_cmd(struct spraid_dev *hdev,
- struct spraid_passthru_common_cmd __user *ucmd)
+static int spraid_user_admin_cmd(struct spraid_dev *hdev, struct bsg_job *job)
{
- struct spraid_passthru_common_cmd cmd;
+ struct spraid_bsg_request *bsg_req = job->request;
+ struct spraid_passthru_common_cmd *cmd = &(bsg_req->admcmd);
struct spraid_admin_command admin_cmd;
- u32 timeout = 0;
+ u32 timeout = msecs_to_jiffies(cmd->timeout_ms);
+ u32 result[2] = {0};
int status;
- if (!capable(CAP_SYS_ADMIN)) {
- dev_err(hdev->dev, "Current user hasn't administrator right, reject service\n");
- return -EACCES;
- }
-
- if (copy_from_user(&cmd, ucmd, sizeof(cmd))) {
- dev_err(hdev->dev, "Copy command from user space to kernel space failed\n");
- return -EFAULT;
- }
-
- if (cmd.flags) {
- dev_err(hdev->dev, "Invalid flags in user command\n");
- return -EINVAL;
+ if (hdev->state >= SPRAID_RESETTING) {
+ dev_err(hdev->dev, "[%s] err, host state:[%d] is not right\n",
+ __func__, hdev->state);
+ return -EBUSY;
}
- dev_info(hdev->dev, "user_admin_cmd opcode: 0x%x, subopcode: 0x%x\n",
- cmd.opcode, cmd.cdw2 & 0x7ff);
+ dev_info(hdev->dev, "[%s] opcode[0x%x] subopcode[0x%x] init\n",
+ __func__, cmd->opcode, cmd->info_0.subopcode);
memset(&admin_cmd, 0, sizeof(admin_cmd));
- admin_cmd.common.opcode = cmd.opcode;
- admin_cmd.common.flags = cmd.flags;
- admin_cmd.common.hdid = cpu_to_le32(cmd.nsid);
- admin_cmd.common.cdw2[0] = cpu_to_le32(cmd.cdw2);
- admin_cmd.common.cdw2[1] = cpu_to_le32(cmd.cdw3);
- admin_cmd.common.cdw10 = cpu_to_le32(cmd.cdw10);
- admin_cmd.common.cdw11 = cpu_to_le32(cmd.cdw11);
- admin_cmd.common.cdw12 = cpu_to_le32(cmd.cdw12);
- admin_cmd.common.cdw13 = cpu_to_le32(cmd.cdw13);
- admin_cmd.common.cdw14 = cpu_to_le32(cmd.cdw14);
- admin_cmd.common.cdw15 = cpu_to_le32(cmd.cdw15);
-
- if (cmd.timeout_ms)
- timeout = msecs_to_jiffies(cmd.timeout_ms);
-
- status = spraid_submit_user_cmd(hdev->admin_q, &admin_cmd,
- (void __user *)(uintptr_t)cmd.addr, cmd.info_1.data_len,
- &cmd.result0, timeout);
-
- dev_info(hdev->dev, "user_admin_cmd status: 0x%x, result0: 0x%x, result1: 0x%x\n",
- status, cmd.result0, cmd.result1);
+ admin_cmd.common.opcode = cmd->opcode;
+ admin_cmd.common.flags = cmd->flags;
+ admin_cmd.common.hdid = cpu_to_le32(cmd->nsid);
+ admin_cmd.common.cdw2[0] = cpu_to_le32(cmd->cdw2);
+ admin_cmd.common.cdw2[1] = cpu_to_le32(cmd->cdw3);
+ admin_cmd.common.cdw10 = cpu_to_le32(cmd->cdw10);
+ admin_cmd.common.cdw11 = cpu_to_le32(cmd->cdw11);
+ admin_cmd.common.cdw12 = cpu_to_le32(cmd->cdw12);
+ admin_cmd.common.cdw13 = cpu_to_le32(cmd->cdw13);
+ admin_cmd.common.cdw14 = cpu_to_le32(cmd->cdw14);
+ admin_cmd.common.cdw15 = cpu_to_le32(cmd->cdw15);
+
+ status = spraid_bsg_map_data(hdev, job, &admin_cmd);
+ if (status) {
+ dev_err(hdev->dev, "[%s] err, map data failed\n", __func__);
+ return status;
+ }
+ status = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, &result[0], &result[1], timeout);
if (status >= 0) {
- if (put_user(cmd.result0, &ucmd->result0))
- return -EFAULT;
- if (put_user(cmd.result1, &ucmd->result1))
- return -EFAULT;
+ job->reply_len = sizeof(result);
+ memcpy(job->reply, result, sizeof(result));
}
+ dev_info(hdev->dev, "[%s] opcode[0x%x] subopcode[0x%x], status[0x%x] result0[0x%x] result1[0x%x]\n",
+ __func__, cmd->opcode, cmd->info_0.subopcode, status, result[0], result[1]);
+
+ spraid_bsg_unmap_data(hdev, job);
+
return status;
}
@@ -2548,8 +2540,8 @@ static int spraid_alloc_ioq_ptcmds(struct spraid_dev *hdev)
INIT_LIST_HEAD(&hdev->ioq_pt_list);
spin_lock_init(&hdev->ioq_pt_lock);
- hdev->ioq_ptcmds = kcalloc_node(ptnum, sizeof(struct spraid_ioq_ptcmd),
- GFP_KERNEL, hdev->numa_node);
+ hdev->ioq_ptcmds = kcalloc_node(ptnum, sizeof(struct spraid_cmd),
+ GFP_KERNEL, hdev->numa_node);
if (!hdev->ioq_ptcmds) {
dev_err(hdev->dev, "Alloc ioq_ptcmds failed\n");
@@ -2567,55 +2559,35 @@ static int spraid_alloc_ioq_ptcmds(struct spraid_dev *hdev)
return 0;
}
-static struct spraid_ioq_ptcmd *spraid_get_ioq_ptcmd(struct spraid_dev *hdev)
+static void spraid_free_ioq_ptcmds(struct spraid_dev *hdev)
{
- struct spraid_ioq_ptcmd *cmd = NULL;
- unsigned long flags;
-
- spin_lock_irqsave(&hdev->ioq_pt_lock, flags);
- if (list_empty(&hdev->ioq_pt_list)) {
- spin_unlock_irqrestore(&hdev->ioq_pt_lock, flags);
- dev_err(hdev->dev, "err, ioq ptcmd list empty\n");
- return NULL;
- }
- cmd = list_entry((&hdev->ioq_pt_list)->next, struct spraid_ioq_ptcmd, list);
- list_del_init(&cmd->list);
- spin_unlock_irqrestore(&hdev->ioq_pt_lock, flags);
-
- WRITE_ONCE(cmd->state, SPRAID_CMD_IDLE);
-
- return cmd;
-}
-
-static void spraid_put_ioq_ptcmd(struct spraid_dev *hdev, struct spraid_ioq_ptcmd *cmd)
-{
- unsigned long flags;
+ kfree(hdev->ioq_ptcmds);
+ hdev->ioq_ptcmds = NULL;
- spin_lock_irqsave(&hdev->ioq_pt_lock, flags);
- list_add(&cmd->list, (&hdev->ioq_pt_list)->next);
- spin_unlock_irqrestore(&hdev->ioq_pt_lock, flags);
+ INIT_LIST_HEAD(&hdev->ioq_pt_list);
}
static int spraid_submit_ioq_sync_cmd(struct spraid_dev *hdev, struct spraid_ioq_command *cmd,
- u32 *result, void **sense, u32 timeout)
+ u32 *result, u32 *reslen, u32 timeout)
{
- struct spraid_queue *ioq;
int ret;
dma_addr_t sense_dma;
- struct spraid_ioq_ptcmd *pt_cmd = spraid_get_ioq_ptcmd(hdev);
-
- *sense = NULL;
+ struct spraid_queue *ioq;
+ void *sense_addr = NULL;
+ struct spraid_cmd *pt_cmd = spraid_get_cmd(hdev, SPRAID_CMD_IOPT);
- if (!pt_cmd)
+ if (!pt_cmd) {
+ dev_err(hdev->dev, "err, get ioq cmd failed\n");
return -EFAULT;
+ }
- dev_info(hdev->dev, "[%s] ptcmd, cid[%d], qid[%d]\n", __func__, pt_cmd->cid, pt_cmd->qid);
+ timeout = timeout ? timeout : ADMIN_TIMEOUT;
init_completion(&pt_cmd->cmd_done);
ioq = &hdev->queues[pt_cmd->qid];
ret = pt_cmd->cid * SCSI_SENSE_BUFFERSIZE;
- pt_cmd->priv = ioq->sense + ret;
+ sense_addr = ioq->sense + ret;
sense_dma = ioq->sense_dma_addr + ret;
cmd->common.sense_addr = cpu_to_le64(sense_dma);
@@ -2625,260 +2597,87 @@ static int spraid_submit_ioq_sync_cmd(struct spraid_dev *hdev, struct spraid_ioq
spraid_submit_cmd(ioq, cmd);
if (!wait_for_completion_timeout(&pt_cmd->cmd_done, timeout)) {
- dev_err(hdev->dev, "[%s] cid[%d], qid[%d] timeout\n",
- __func__, pt_cmd->cid, pt_cmd->qid);
+ dev_err(hdev->dev, "[%s] cid[%d] qid[%d] timeout, opcode[0x%x] subopcode[0x%x]\n",
+ __func__, pt_cmd->cid, pt_cmd->qid, cmd->common.opcode,
+ (le32_to_cpu(cmd->common.cdw3[0]) & 0xffff));
WRITE_ONCE(pt_cmd->state, SPRAID_CMD_TIMEOUT);
+ spraid_put_cmd(hdev, pt_cmd, SPRAID_CMD_IOPT);
return -EINVAL;
}
- if (result) {
- result[0] = pt_cmd->result0;
- result[1] = pt_cmd->result1;
+ if (result && reslen) {
+ if ((pt_cmd->status & 0x17f) == 0x101) {
+ memcpy(result, sense_addr, SCSI_SENSE_BUFFERSIZE);
+ *reslen = SCSI_SENSE_BUFFERSIZE;
+ }
}
- if ((pt_cmd->status & 0x17f) == 0x101)
- *sense = pt_cmd->priv;
+ spraid_put_cmd(hdev, pt_cmd, SPRAID_CMD_IOPT);
return pt_cmd->status;
}
-static int spraid_user_ioq_cmd(struct spraid_dev *hdev,
- struct spraid_ioq_passthru_cmd __user *ucmd)
+static int spraid_user_ioq_cmd(struct spraid_dev *hdev, struct bsg_job *job)
{
- struct spraid_ioq_passthru_cmd cmd;
+ struct spraid_bsg_request *bsg_req = (struct spraid_bsg_request *)(job->request);
+ struct spraid_ioq_passthru_cmd *cmd = &(bsg_req->ioqcmd);
struct spraid_ioq_command ioq_cmd;
- u32 timeout = 0;
int status = 0;
- u8 *data_ptr = NULL;
- dma_addr_t data_dma;
- enum dma_data_direction dma_dir = DMA_NONE;
- void *sense = NULL;
-
- if (!capable(CAP_SYS_ADMIN)) {
- dev_err(hdev->dev, "Current user hasn't administrator right, reject service\n");
- return -EACCES;
- }
-
- if (copy_from_user(&cmd, ucmd, sizeof(cmd))) {
- dev_err(hdev->dev, "Copy command from user space to kernel space failed\n");
- return -EFAULT;
- }
+ u32 timeout = msecs_to_jiffies(cmd->timeout_ms);
- if (cmd.data_len > PAGE_SIZE) {
+ if (cmd->data_len > PAGE_SIZE) {
dev_err(hdev->dev, "[%s] data len bigger than 4k\n", __func__);
return -EFAULT;
}
- dev_info(hdev->dev, "[%s] opcode: 0x%x, subopcode: 0x%x, datalen: %d\n",
- __func__, cmd.opcode, cmd.info_1.subopcode, cmd.data_len);
-
- if (cmd.addr && cmd.data_len) {
- data_ptr = dma_alloc_coherent(hdev->dev, PAGE_SIZE, &data_dma, GFP_KERNEL);
- if (!data_ptr)
- return -ENOMEM;
-
- dma_dir = (cmd.opcode & 1) ? DMA_TO_DEVICE : DMA_FROM_DEVICE;
+ if (hdev->state != SPRAID_LIVE) {
+ dev_err(hdev->dev, "[%s] err, host state:[%d] is not live\n",
+ __func__, hdev->state);
+ return -EBUSY;
}
- if (dma_dir == DMA_TO_DEVICE) {
- if (copy_from_user(data_ptr, (void __user *)(uintptr_t)cmd.addr, cmd.data_len)) {
- dev_err(hdev->dev, "[%s] copy user data failed\n", __func__);
- status = -EFAULT;
- goto free_dma_mem;
- }
- }
+ dev_info(hdev->dev, "[%s] opcode[0x%x] subopcode[0x%x] init, datalen[%d]\n",
+ __func__, cmd->opcode, cmd->info_1.subopcode, cmd->data_len);
memset(&ioq_cmd, 0, sizeof(ioq_cmd));
- ioq_cmd.common.opcode = cmd.opcode;
- ioq_cmd.common.flags = cmd.flags;
- ioq_cmd.common.hdid = cpu_to_le32(cmd.nsid);
- ioq_cmd.common.sense_len = cpu_to_le16(cmd.info_0.res_sense_len);
- ioq_cmd.common.cdb_len = cmd.info_0.cdb_len;
- ioq_cmd.common.rsvd2 = cmd.info_0.rsvd0;
- ioq_cmd.common.cdw3[0] = cpu_to_le32(cmd.cdw3);
- ioq_cmd.common.cdw3[1] = cpu_to_le32(cmd.cdw4);
- ioq_cmd.common.cdw3[2] = cpu_to_le32(cmd.cdw5);
- ioq_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
-
- ioq_cmd.common.cdw10[0] = cpu_to_le32(cmd.cdw10);
- ioq_cmd.common.cdw10[1] = cpu_to_le32(cmd.cdw11);
- ioq_cmd.common.cdw10[2] = cpu_to_le32(cmd.cdw12);
- ioq_cmd.common.cdw10[3] = cpu_to_le32(cmd.cdw13);
- ioq_cmd.common.cdw10[4] = cpu_to_le32(cmd.cdw14);
- ioq_cmd.common.cdw10[5] = cpu_to_le32(cmd.data_len);
-
- memcpy(ioq_cmd.common.cdb, &cmd.cdw16, cmd.info_0.cdb_len);
-
- ioq_cmd.common.cdw26[0] = cpu_to_le32(cmd.cdw26[0]);
- ioq_cmd.common.cdw26[1] = cpu_to_le32(cmd.cdw26[1]);
- ioq_cmd.common.cdw26[2] = cpu_to_le32(cmd.cdw26[2]);
- ioq_cmd.common.cdw26[3] = cpu_to_le32(cmd.cdw26[3]);
-
- if (cmd.timeout_ms)
- timeout = msecs_to_jiffies(cmd.timeout_ms);
- timeout = timeout ? timeout : ADMIN_TIMEOUT;
-
- status = spraid_submit_ioq_sync_cmd(hdev, &ioq_cmd, &cmd.result0, &sense, timeout);
-
- if (status >= 0) {
- if (put_user(cmd.result0, &ucmd->result0)) {
- status = -EFAULT;
- goto free_dma_mem;
- }
- if (put_user(cmd.result1, &ucmd->result1)) {
- status = -EFAULT;
- goto free_dma_mem;
- }
- if (dma_dir == DMA_FROM_DEVICE &&
- copy_to_user((void __user *)(uintptr_t)cmd.addr, data_ptr, cmd.data_len)) {
- status = -EFAULT;
- goto free_dma_mem;
- }
- }
-
- if (sense) {
- if (copy_to_user((void *__user *)(uintptr_t)cmd.sense_addr,
- sense, cmd.info_0.res_sense_len)) {
- status = -EFAULT;
- goto free_dma_mem;
- }
- }
-
-free_dma_mem:
- if (data_ptr)
- dma_free_coherent(hdev->dev, PAGE_SIZE, data_ptr, data_dma);
-
- return status;
-
-}
-
-static int spraid_reset_work_sync(struct spraid_dev *hdev);
-
-static int spraid_user_reset_cmd(struct spraid_dev *hdev)
-{
- int ret;
-
- dev_info(hdev->dev, "[%s] start user reset cmd\n", __func__);
- ret = spraid_reset_work_sync(hdev);
- dev_info(hdev->dev, "[%s] stop user reset cmd[%d]\n", __func__, ret);
-
- return ret;
-}
-
-static int hdev_open(struct inode *inode, struct file *file)
-{
- struct spraid_dev *hdev =
- container_of(inode->i_cdev, struct spraid_dev, cdev);
- file->private_data = hdev;
- return 0;
-}
-
-static long hdev_ioctl(struct file *file, u32 cmd, unsigned long arg)
-{
- struct spraid_dev *hdev = file->private_data;
- void __user *argp = (void __user *)arg;
-
- switch (cmd) {
- case SPRAID_IOCTL_ADMIN_CMD:
- return spraid_user_admin_cmd(hdev, argp);
- case SPRAID_IOCTL_IOQ_CMD:
- return spraid_user_ioq_cmd(hdev, argp);
- case SPRAID_IOCTL_RESET_CMD:
- return spraid_user_reset_cmd(hdev);
- default:
- return -ENOTTY;
- }
-}
-
-static const struct file_operations spraid_dev_fops = {
- .owner = THIS_MODULE,
- .open = hdev_open,
- .unlocked_ioctl = hdev_ioctl,
- .compat_ioctl = hdev_ioctl,
-};
-
-static int spraid_create_cdev(struct spraid_dev *hdev)
-{
- int ret;
-
- device_initialize(&hdev->ctrl_device);
- hdev->ctrl_device.devt = MKDEV(MAJOR(spraid_chr_devt), hdev->instance);
- hdev->ctrl_device.class = spraid_class;
- hdev->ctrl_device.parent = hdev->dev;
- dev_set_drvdata(&hdev->ctrl_device, hdev);
- ret = dev_set_name(&hdev->ctrl_device, "spraid%d", hdev->instance);
- if (ret)
- return ret;
- cdev_init(&hdev->cdev, &spraid_dev_fops);
- hdev->cdev.owner = THIS_MODULE;
- ret = cdev_device_add(&hdev->cdev, &hdev->ctrl_device);
- if (ret) {
- dev_err(hdev->dev, "Add cdev failed, ret: %d", ret);
- put_device(&hdev->ctrl_device);
- kfree_const(hdev->ctrl_device.kobj.name);
- return ret;
- }
-
- return 0;
-}
-
-static inline void spraid_remove_cdev(struct spraid_dev *hdev)
-{
- cdev_device_del(&hdev->cdev, &hdev->ctrl_device);
-}
-
-static const struct blk_mq_ops spraid_admin_mq_ops = {
- .queue_rq = spraid_queue_admin_rq,
- .complete = spraid_complete_admin_rq,
- .init_hctx = spraid_admin_init_hctx,
- .init_request = spraid_admin_init_request,
- .timeout = spraid_admin_timeout,
-};
-
-static void spraid_remove_admin_tagset(struct spraid_dev *hdev)
-{
- if (hdev->admin_q && !blk_queue_dying(hdev->admin_q)) {
- blk_mq_unquiesce_queue(hdev->admin_q);
- blk_cleanup_queue(hdev->admin_q);
- blk_mq_free_tag_set(&hdev->admin_tagset);
+ ioq_cmd.common.opcode = cmd->opcode;
+ ioq_cmd.common.flags = cmd->flags;
+ ioq_cmd.common.hdid = cpu_to_le32(cmd->nsid);
+ ioq_cmd.common.sense_len = cpu_to_le16(cmd->info_0.res_sense_len);
+ ioq_cmd.common.cdb_len = cmd->info_0.cdb_len;
+ ioq_cmd.common.rsvd2 = cmd->info_0.rsvd0;
+ ioq_cmd.common.cdw3[0] = cpu_to_le32(cmd->cdw3);
+ ioq_cmd.common.cdw3[1] = cpu_to_le32(cmd->cdw4);
+ ioq_cmd.common.cdw3[2] = cpu_to_le32(cmd->cdw5);
+
+ ioq_cmd.common.cdw10[0] = cpu_to_le32(cmd->cdw10);
+ ioq_cmd.common.cdw10[1] = cpu_to_le32(cmd->cdw11);
+ ioq_cmd.common.cdw10[2] = cpu_to_le32(cmd->cdw12);
+ ioq_cmd.common.cdw10[3] = cpu_to_le32(cmd->cdw13);
+ ioq_cmd.common.cdw10[4] = cpu_to_le32(cmd->cdw14);
+ ioq_cmd.common.cdw10[5] = cpu_to_le32(cmd->data_len);
+
+ memcpy(ioq_cmd.common.cdb, &cmd->cdw16, cmd->info_0.cdb_len);
+
+ ioq_cmd.common.cdw26[0] = cpu_to_le32(cmd->cdw26[0]);
+ ioq_cmd.common.cdw26[1] = cpu_to_le32(cmd->cdw26[1]);
+ ioq_cmd.common.cdw26[2] = cpu_to_le32(cmd->cdw26[2]);
+ ioq_cmd.common.cdw26[3] = cpu_to_le32(cmd->cdw26[3]);
+
+ status = spraid_bsg_map_data(hdev, job, (struct spraid_admin_command *)&ioq_cmd);
+ if (status) {
+ dev_err(hdev->dev, "[%s] err, map data failed\n", __func__);
+ return status;
}
-}
-static int spraid_alloc_admin_tags(struct spraid_dev *hdev)
-{
- if (!hdev->admin_q) {
- hdev->admin_tagset.ops = &spraid_admin_mq_ops;
- hdev->admin_tagset.nr_hw_queues = 1;
+ status = spraid_submit_ioq_sync_cmd(hdev, &ioq_cmd, job->reply, &job->reply_len, timeout);
- hdev->admin_tagset.queue_depth = SPRAID_AQ_MQ_TAG_DEPTH;
- hdev->admin_tagset.timeout = ADMIN_TIMEOUT;
- hdev->admin_tagset.numa_node = hdev->numa_node;
- hdev->admin_tagset.cmd_size =
- spraid_cmd_size(hdev, true, false);
- hdev->admin_tagset.flags = BLK_MQ_F_NO_SCHED;
- hdev->admin_tagset.driver_data = hdev;
+ dev_info(hdev->dev, "[%s] opcode[0x%x] subopcode[0x%x], status[0x%x], reply_len[%d]\n",
+ __func__, cmd->opcode, cmd->info_1.subopcode, status, job->reply_len);
- if (blk_mq_alloc_tag_set(&hdev->admin_tagset)) {
- dev_err(hdev->dev, "Allocate admin tagset failed\n");
- return -ENOMEM;
- }
+ spraid_bsg_unmap_data(hdev, job);
- hdev->admin_q = blk_mq_init_queue(&hdev->admin_tagset);
- if (IS_ERR(hdev->admin_q)) {
- dev_err(hdev->dev, "Initialize admin request queue failed\n");
- blk_mq_free_tag_set(&hdev->admin_tagset);
- return -ENOMEM;
- }
- if (!blk_get_queue(hdev->admin_q)) {
- dev_err(hdev->dev, "Get admin request queue failed\n");
- spraid_remove_admin_tagset(hdev);
- hdev->admin_q = NULL;
- return -ENODEV;
- }
- } else {
- blk_mq_unquiesce_queue(hdev->admin_q);
- }
- return 0;
+ return status;
}
static bool spraid_check_scmd_completed(struct scsi_cmnd *scmd)
@@ -2891,7 +2690,7 @@ static bool spraid_check_scmd_completed(struct scsi_cmnd *scmd)
spraid_get_tag_from_scmd(scmd, &hwq, &cid);
spraidq = &hdev->queues[hwq];
if (READ_ONCE(iod->state) == SPRAID_CMD_COMPLETE || spraid_poll_cq(spraidq, cid)) {
- dev_warn(hdev->dev, "cid[%d], qid[%d] has been completed\n",
+ dev_warn(hdev->dev, "cid[%d] qid[%d] has been completed\n",
cid, spraidq->qid);
return true;
}
@@ -2927,8 +2726,7 @@ static int spraid_send_abort_cmd(struct spraid_dev *hdev, u32 hdid, u16 qid, u16
admin_cmd.abort.sqid = cpu_to_le16(qid);
admin_cmd.abort.cid = cpu_to_le16(cid);
- return spraid_submit_admin_sync_cmd(hdev->admin_q, &admin_cmd, NULL,
- NULL, 0, 0, 0, 0);
+ return spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
}
/* send reset command by admin quueue temporary */
@@ -2941,8 +2739,7 @@ static int spraid_send_reset_cmd(struct spraid_dev *hdev, int type, u32 hdid)
admin_cmd.reset.hdid = cpu_to_le32(hdid);
admin_cmd.reset.type = type;
- return spraid_submit_admin_sync_cmd(hdev->admin_q, &admin_cmd, NULL,
- NULL, 0, 0, 0, 0);
+ return spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
}
static bool spraid_change_host_state(struct spraid_dev *hdev, enum spraid_state newstate)
@@ -3022,7 +2819,7 @@ static void spraid_back_fault_cqe(struct spraid_queue *ioq, struct spraid_comple
scsi_dma_unmap(scmd);
spraid_free_iod_res(hdev, iod);
scmd->scsi_done(scmd);
- dev_warn(hdev->dev, "Back fault CQE, cid[%d], qid[%d]\n",
+ dev_warn(hdev->dev, "Back fault CQE, cid[%d] qid[%d]\n",
cqe->cmd_id, ioq->qid);
}
@@ -3032,6 +2829,8 @@ static void spraid_back_all_io(struct spraid_dev *hdev)
struct spraid_queue *ioq;
struct spraid_completion cqe = { 0 };
+ scsi_block_requests(hdev->shost);
+
for (i = 1; i <= hdev->shost->nr_hw_queues; i++) {
ioq = &hdev->queues[i];
for (j = 0; j < hdev->shost->can_queue; j++) {
@@ -3039,6 +2838,8 @@ static void spraid_back_all_io(struct spraid_dev *hdev)
spraid_back_fault_cqe(ioq, &cqe);
}
}
+
+ scsi_unblock_requests(hdev->shost);
}
static void spraid_dev_disable(struct spraid_dev *hdev, bool shutdown)
@@ -3106,17 +2907,13 @@ static void spraid_reset_work(struct work_struct *work)
if (ret)
goto pci_disable;
- ret = spraid_alloc_admin_tags(hdev);
- if (ret)
- goto pci_disable;
-
ret = spraid_setup_io_queues(hdev);
if (ret || hdev->online_queues <= hdev->shost->nr_hw_queues)
goto pci_disable;
spraid_change_host_state(hdev, SPRAID_LIVE);
- spraid_send_aen(hdev);
+ spraid_send_all_aen(hdev);
return;
@@ -3174,8 +2971,8 @@ static int spraid_abort_handler(struct scsi_cmnd *scmd)
scsi_print_command(scmd);
- if (!spraid_wait_abnl_cmd_done(iod) || spraid_check_scmd_completed(scmd) ||
- hdev->state != SPRAID_LIVE)
+ if (hdev->state != SPRAID_LIVE || !spraid_wait_abnl_cmd_done(iod) ||
+ spraid_check_scmd_completed(scmd))
return SUCCESS;
hostdata = scmd->device->hostdata;
@@ -3206,8 +3003,8 @@ static int spraid_tgt_reset_handler(struct scsi_cmnd *scmd)
scsi_print_command(scmd);
- if (!spraid_wait_abnl_cmd_done(iod) || spraid_check_scmd_completed(scmd) ||
- hdev->state != SPRAID_LIVE)
+ if (hdev->state != SPRAID_LIVE || !spraid_wait_abnl_cmd_done(iod) ||
+ spraid_check_scmd_completed(scmd))
return SUCCESS;
hostdata = scmd->device->hostdata;
@@ -3241,8 +3038,8 @@ static int spraid_bus_reset_handler(struct scsi_cmnd *scmd)
scsi_print_command(scmd);
- if (!spraid_wait_abnl_cmd_done(iod) || spraid_check_scmd_completed(scmd) ||
- hdev->state != SPRAID_LIVE)
+ if (hdev->state != SPRAID_LIVE || !spraid_wait_abnl_cmd_done(iod) ||
+ spraid_check_scmd_completed(scmd))
return SUCCESS;
hostdata = scmd->device->hostdata;
@@ -3272,7 +3069,7 @@ static int spraid_shost_reset_handler(struct scsi_cmnd *scmd)
struct spraid_dev *hdev = shost_priv(scmd->device->host);
scsi_print_command(scmd);
- if (spraid_check_scmd_completed(scmd) || hdev->state != SPRAID_LIVE)
+ if (hdev->state != SPRAID_LIVE || spraid_check_scmd_completed(scmd))
return SUCCESS;
spraid_get_tag_from_scmd(scmd, &hwq, &cid);
@@ -3288,6 +3085,62 @@ static int spraid_shost_reset_handler(struct scsi_cmnd *scmd)
return SUCCESS;
}
+static pci_ers_result_t spraid_pci_error_detected(struct pci_dev *pdev,
+ pci_channel_state_t state)
+{
+ struct spraid_dev *hdev = pci_get_drvdata(pdev);
+
+ dev_info(hdev->dev, "enter pci error detect, state:%d\n", state);
+
+ switch (state) {
+ case pci_channel_io_normal:
+ dev_warn(hdev->dev, "channel is normal, do nothing\n");
+
+ return PCI_ERS_RESULT_CAN_RECOVER;
+ case pci_channel_io_frozen:
+ dev_warn(hdev->dev, "channel io frozen, need reset controller\n");
+
+ scsi_block_requests(hdev->shost);
+
+ spraid_change_host_state(hdev, SPRAID_RESETTING);
+
+ return PCI_ERS_RESULT_NEED_RESET;
+ case pci_channel_io_perm_failure:
+ dev_warn(hdev->dev, "channel io failure, request disconnect\n");
+
+ return PCI_ERS_RESULT_DISCONNECT;
+ }
+
+ return PCI_ERS_RESULT_NEED_RESET;
+}
+
+static pci_ers_result_t spraid_pci_slot_reset(struct pci_dev *pdev)
+{
+ struct spraid_dev *hdev = pci_get_drvdata(pdev);
+
+ dev_info(hdev->dev, "restart after slot reset\n");
+
+ pci_restore_state(pdev);
+
+ if (!queue_work(spraid_wq, &hdev->reset_work)) {
+ dev_err(hdev->dev, "[%s] err, the device is resetting state\n", __func__);
+ return PCI_ERS_RESULT_NONE;
+ }
+
+ flush_work(&hdev->reset_work);
+
+ scsi_unblock_requests(hdev->shost);
+
+ return PCI_ERS_RESULT_RECOVERED;
+}
+
+static void spraid_reset_done(struct pci_dev *pdev)
+{
+ struct spraid_dev *hdev = pci_get_drvdata(pdev);
+
+ dev_info(hdev->dev, "enter spraid reset done\n");
+}
+
static ssize_t csts_pp_show(struct device *cdev, struct device_attribute *attr, char *buf)
{
struct Scsi_Host *shost = class_to_shost(cdev);
@@ -3347,7 +3200,7 @@ static ssize_t fw_version_show(struct device *cdev, struct device_attribute *att
struct Scsi_Host *shost = class_to_shost(cdev);
struct spraid_dev *hdev = shost_priv(shost);
- return snprintf(buf, sizeof(hdev->ctrl_info->fr), "%s\n", hdev->ctrl_info->fr);
+ return snprintf(buf, PAGE_SIZE, "%s\n", hdev->ctrl_info->fr);
}
static DEVICE_ATTR_RO(csts_pp);
@@ -3365,6 +3218,185 @@ static struct device_attribute *spraid_host_attrs[] = {
NULL,
};
+static int spraid_get_vd_info(struct spraid_dev *hdev, struct spraid_vd_info *vd_info, u16 vid)
+{
+ struct spraid_admin_command admin_cmd;
+ u8 *data_ptr = NULL;
+ dma_addr_t data_dma = 0;
+ int ret;
+
+ data_ptr = dma_alloc_coherent(hdev->dev, PAGE_SIZE, &data_dma, GFP_KERNEL);
+ if (!data_ptr)
+ return -ENOMEM;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.usr_cmd.opcode = USR_CMD_READ;
+ admin_cmd.usr_cmd.info_0.subopcode = cpu_to_le16(USR_CMD_VDINFO);
+ admin_cmd.usr_cmd.info_1.data_len = cpu_to_le16(USR_CMD_RDLEN);
+ admin_cmd.usr_cmd.info_1.param_len = cpu_to_le16(VDINFO_PARAM_LEN);
+ admin_cmd.usr_cmd.cdw10 = cpu_to_le32(vid);
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
+
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+ if (!ret)
+ memcpy(vd_info, data_ptr, sizeof(struct spraid_vd_info));
+
+ dma_free_coherent(hdev->dev, PAGE_SIZE, data_ptr, data_dma);
+
+ return ret;
+}
+
+static int spraid_get_bgtask(struct spraid_dev *hdev, struct spraid_bgtask *bgtask)
+{
+ struct spraid_admin_command admin_cmd;
+ u8 *data_ptr = NULL;
+ dma_addr_t data_dma = 0;
+ int ret;
+
+ data_ptr = dma_alloc_coherent(hdev->dev, PAGE_SIZE, &data_dma, GFP_KERNEL);
+ if (!data_ptr)
+ return -ENOMEM;
+
+ memset(&admin_cmd, 0, sizeof(admin_cmd));
+ admin_cmd.usr_cmd.opcode = USR_CMD_READ;
+ admin_cmd.usr_cmd.info_0.subopcode = cpu_to_le16(USR_CMD_BGTASK);
+ admin_cmd.usr_cmd.info_1.data_len = cpu_to_le16(USR_CMD_RDLEN);
+ admin_cmd.common.dptr.prp1 = cpu_to_le64(data_dma);
+
+ ret = spraid_submit_admin_sync_cmd(hdev, &admin_cmd, NULL, NULL, 0);
+ if (!ret)
+ memcpy(bgtask, data_ptr, sizeof(struct spraid_bgtask));
+
+ dma_free_coherent(hdev->dev, PAGE_SIZE, data_ptr, data_dma);
+
+ return ret;
+}
+
+static ssize_t raid_level_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ struct scsi_device *sdev;
+ struct spraid_dev *hdev;
+ struct spraid_vd_info *vd_info;
+ struct spraid_sdev_hostdata *hostdata;
+ int ret;
+
+ sdev = to_scsi_device(dev);
+ hdev = shost_priv(sdev->host);
+ hostdata = sdev->hostdata;
+
+ vd_info = kmalloc(sizeof(*vd_info), GFP_KERNEL);
+ if (!vd_info || !SPRAID_DEV_INFO_ATTR_VD(hostdata->attr))
+ return snprintf(buf, PAGE_SIZE, "NA\n");
+
+ ret = spraid_get_vd_info(hdev, vd_info, sdev->id);
+ if (ret)
+ vd_info->rg_level = ARRAY_SIZE(raid_levels) - 1;
+
+ ret = (vd_info->rg_level < ARRAY_SIZE(raid_levels)) ?
+ vd_info->rg_level : (ARRAY_SIZE(raid_levels) - 1);
+
+ kfree(vd_info);
+
+ return snprintf(buf, PAGE_SIZE, "RAID-%s\n", raid_levels[ret]);
+}
+
+static ssize_t raid_state_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ struct scsi_device *sdev;
+ struct spraid_dev *hdev;
+ struct spraid_vd_info *vd_info;
+ struct spraid_sdev_hostdata *hostdata;
+ int ret;
+
+ sdev = to_scsi_device(dev);
+ hdev = shost_priv(sdev->host);
+ hostdata = sdev->hostdata;
+
+ vd_info = kmalloc(sizeof(*vd_info), GFP_KERNEL);
+ if (!vd_info || !SPRAID_DEV_INFO_ATTR_VD(hostdata->attr))
+ return snprintf(buf, PAGE_SIZE, "NA\n");
+
+ ret = spraid_get_vd_info(hdev, vd_info, sdev->id);
+ if (ret) {
+ vd_info->vd_status = 0;
+ vd_info->rg_id = 0xff;
+ }
+
+ ret = (vd_info->vd_status < ARRAY_SIZE(raid_states)) ? vd_info->vd_status : 0;
+
+ kfree(vd_info);
+
+ return snprintf(buf, PAGE_SIZE, "%s\n", raid_states[ret]);
+}
+
+static ssize_t raid_resync_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ struct scsi_device *sdev;
+ struct spraid_dev *hdev;
+ struct spraid_vd_info *vd_info;
+ struct spraid_bgtask *bgtask;
+ struct spraid_sdev_hostdata *hostdata;
+ u8 rg_id, i, progress = 0;
+ int ret;
+
+ sdev = to_scsi_device(dev);
+ hdev = shost_priv(sdev->host);
+ hostdata = sdev->hostdata;
+
+ vd_info = kmalloc(sizeof(*vd_info), GFP_KERNEL);
+ if (!vd_info || !SPRAID_DEV_INFO_ATTR_VD(hostdata->attr))
+ return snprintf(buf, PAGE_SIZE, "NA\n");
+
+ ret = spraid_get_vd_info(hdev, vd_info, sdev->id);
+ if (ret)
+ goto out;
+
+ rg_id = vd_info->rg_id;
+
+ bgtask = (struct spraid_bgtask *)vd_info;
+ ret = spraid_get_bgtask(hdev, bgtask);
+ if (ret)
+ goto out;
+ for (i = 0; i < bgtask->task_num; i++) {
+ if ((bgtask->bgtask[i].type == BGTASK_TYPE_REBUILD) &&
+ (le16_to_cpu(bgtask->bgtask[i].vd_id) == rg_id))
+ progress = bgtask->bgtask[i].progress;
+ }
+
+out:
+ kfree(vd_info);
+ return snprintf(buf, PAGE_SIZE, "%d\n", progress);
+}
+
+static DEVICE_ATTR_RO(raid_level);
+static DEVICE_ATTR_RO(raid_state);
+static DEVICE_ATTR_RO(raid_resync);
+
+static struct device_attribute *spraid_dev_attrs[] = {
+ &dev_attr_raid_level,
+ &dev_attr_raid_state,
+ &dev_attr_raid_resync,
+ NULL,
+};
+
+static struct pci_error_handlers spraid_err_handler = {
+ .error_detected = spraid_pci_error_detected,
+ .slot_reset = spraid_pci_slot_reset,
+ .reset_done = spraid_reset_done,
+};
+
+static int spraid_sysfs_host_reset(struct Scsi_Host *shost, int reset_type)
+{
+ int ret;
+ struct spraid_dev *hdev = shost_priv(shost);
+
+ dev_info(hdev->dev, "[%s] start sysfs host reset cmd\n", __func__);
+ ret = spraid_reset_work_sync(hdev);
+ dev_info(hdev->dev, "[%s] stop sysfs host reset cmd[%d]\n", __func__, ret);
+
+ return ret;
+}
+
static struct scsi_host_template spraid_driver_template = {
.module = THIS_MODULE,
.name = "Ramaxel Logic spraid driver",
@@ -3379,9 +3411,11 @@ static struct scsi_host_template spraid_driver_template = {
.eh_bus_reset_handler = spraid_bus_reset_handler,
.eh_host_reset_handler = spraid_shost_reset_handler,
.change_queue_depth = scsi_change_queue_depth,
- .host_tagset = 1,
+ .host_tagset = 0,
.this_id = -1,
.shost_attrs = spraid_host_attrs,
+ .sdev_attrs = spraid_dev_attrs,
+ .host_reset = spraid_sysfs_host_reset,
};
static void spraid_shutdown(struct pci_dev *pdev)
@@ -3392,11 +3426,53 @@ static void spraid_shutdown(struct pci_dev *pdev)
spraid_disable_admin_queue(hdev, true);
}
+/* bsg dispatch user command */
+static int spraid_bsg_host_dispatch(struct bsg_job *job)
+{
+ struct Scsi_Host *shost = dev_to_shost(job->dev);
+ struct spraid_dev *hdev = shost_priv(shost);
+ struct request *rq = blk_mq_rq_from_pdu(job);
+ struct spraid_bsg_request *bsg_req = job->request;
+ int ret = 0;
+
+ dev_info(hdev->dev, "[%s] msgcode[%d], msglen[%d], timeout[%d], req_nsge[%d], req_len[%d]\n",
+ __func__, bsg_req->msgcode, job->request_len, rq->timeout,
+ job->request_payload.sg_cnt, job->request_payload.payload_len);
+
+ job->reply_len = 0;
+
+ switch (bsg_req->msgcode) {
+ case SPRAID_BSG_ADM:
+ ret = spraid_user_admin_cmd(hdev, job);
+ break;
+ case SPRAID_BSG_IOQ:
+ ret = spraid_user_ioq_cmd(hdev, job);
+ break;
+ default:
+ dev_info(hdev->dev, "[%s] unsupport msgcode[%d]\n", __func__, bsg_req->msgcode);
+ break;
+ }
+
+ if (ret > 0)
+ ret = ret | (ret << 8);
+
+ bsg_job_done(job, ret, 0);
+ return 0;
+}
+
+static inline void spraid_remove_bsg(struct spraid_dev *hdev)
+{
+ if (hdev->bsg_queue) {
+ bsg_unregister_queue(hdev->bsg_queue);
+ blk_cleanup_queue(hdev->bsg_queue);
+ }
+}
static int spraid_probe(struct pci_dev *pdev, const struct pci_device_id *id)
{
struct spraid_dev *hdev;
struct Scsi_Host *shost;
int node, ret;
+ char bsg_name[15];
shost = scsi_host_alloc(&spraid_driver_template, sizeof(*hdev));
if (!shost) {
@@ -3421,10 +3497,10 @@ static int spraid_probe(struct pci_dev *pdev, const struct pci_device_id *id)
goto put_dev;
init_rwsem(&hdev->devices_rwsem);
- INIT_WORK(&hdev->aen_work, spraid_async_event_work);
INIT_WORK(&hdev->scan_work, spraid_scan_work);
INIT_WORK(&hdev->timesyn_work, spraid_timesyn_work);
INIT_WORK(&hdev->reset_work, spraid_reset_work);
+ INIT_WORK(&hdev->fw_act_work, spraid_fw_act_work);
spin_lock_init(&hdev->state_lock);
ret = spraid_alloc_resources(hdev);
@@ -3439,17 +3515,13 @@ static int spraid_probe(struct pci_dev *pdev, const struct pci_device_id *id)
if (ret)
goto pci_disable;
- ret = spraid_alloc_admin_tags(hdev);
- if (ret)
- goto disable_admin_q;
-
ret = spraid_init_ctrl_info(hdev);
if (ret)
- goto free_admin_tagset;
+ goto disable_admin_q;
ret = spraid_alloc_iod_ext_mem_pool(hdev);
if (ret)
- goto free_admin_tagset;
+ goto disable_admin_q;
ret = spraid_setup_io_queues(hdev);
if (ret)
@@ -3464,9 +3536,15 @@ static int spraid_probe(struct pci_dev *pdev, const struct pci_device_id *id)
goto remove_io_queues;
}
- ret = spraid_create_cdev(hdev);
- if (ret)
+ snprintf(bsg_name, sizeof(bsg_name), "spraid%d", shost->host_no);
+ hdev->bsg_queue = bsg_setup_queue(&shost->shost_gendev, bsg_name,
+ spraid_bsg_host_dispatch, NULL,
+ spraid_cmd_size(hdev, true, false));
+ if (IS_ERR(hdev->bsg_queue)) {
+ dev_err(hdev->dev, "err, setup bsg failed\n");
+ hdev->bsg_queue = NULL;
goto remove_io_queues;
+ }
if (hdev->online_queues == SPRAID_ADMIN_QUEUE_NUM) {
dev_warn(hdev->dev, "warn only admin queue can be used\n");
@@ -3475,11 +3553,11 @@ static int spraid_probe(struct pci_dev *pdev, const struct pci_device_id *id)
hdev->state = SPRAID_LIVE;
- spraid_send_aen(hdev);
+ spraid_send_all_aen(hdev);
ret = spraid_dev_list_init(hdev);
if (ret)
- goto remove_cdev;
+ goto remove_bsg;
ret = spraid_configure_timestamp(hdev);
if (ret)
@@ -3487,20 +3565,18 @@ static int spraid_probe(struct pci_dev *pdev, const struct pci_device_id *id)
ret = spraid_alloc_ioq_ptcmds(hdev);
if (ret)
- goto remove_cdev;
+ goto remove_bsg;
scsi_scan_host(hdev->shost);
return 0;
-remove_cdev:
- spraid_remove_cdev(hdev);
+remove_bsg:
+ spraid_remove_bsg(hdev);
remove_io_queues:
spraid_remove_io_queues(hdev);
free_iod_mempool:
spraid_free_iod_ext_mem_pool(hdev);
-free_admin_tagset:
- spraid_remove_admin_tagset(hdev);
disable_admin_q:
spraid_disable_admin_queue(hdev, false);
pci_disable:
@@ -3524,22 +3600,17 @@ static void spraid_remove(struct pci_dev *pdev)
dev_info(hdev->dev, "enter spraid remove\n");
spraid_change_host_state(hdev, SPRAID_DELETING);
+ flush_work(&hdev->reset_work);
- if (!pci_device_is_present(pdev)) {
- scsi_block_requests(shost);
+ if (!pci_device_is_present(pdev))
spraid_back_all_io(hdev);
- scsi_unblock_requests(shost);
- }
- flush_work(&hdev->reset_work);
+ spraid_remove_bsg(hdev);
scsi_remove_host(shost);
-
- kfree(hdev->ioq_ptcmds);
+ spraid_free_ioq_ptcmds(hdev);
kfree(hdev->devices);
- spraid_remove_cdev(hdev);
spraid_remove_io_queues(hdev);
spraid_free_iod_ext_mem_pool(hdev);
- spraid_remove_admin_tagset(hdev);
spraid_disable_admin_queue(hdev, false);
spraid_pci_disable(hdev);
spraid_free_resources(hdev);
@@ -3551,7 +3622,7 @@ static void spraid_remove(struct pci_dev *pdev)
}
static const struct pci_device_id spraid_id_table[] = {
- { PCI_DEVICE(PCI_VENDOR_ID_RAMAXEL_LOGIC, SPRAID_SERVER_DEVICE_HAB_DID) },
+ { PCI_DEVICE(PCI_VENDOR_ID_RAMAXEL_LOGIC, SPRAID_SERVER_DEVICE_HBA_DID) },
{ PCI_DEVICE(PCI_VENDOR_ID_RAMAXEL_LOGIC, SPRAID_SERVER_DEVICE_RAID_DID) },
{ 0, }
};
@@ -3563,6 +3634,7 @@ static struct pci_driver spraid_driver = {
.probe = spraid_probe,
.remove = spraid_remove,
.shutdown = spraid_shutdown,
+ .err_handler = &spraid_err_handler,
};
static int __init spraid_init(void)
@@ -3573,14 +3645,10 @@ static int __init spraid_init(void)
if (!spraid_wq)
return -ENOMEM;
- ret = alloc_chrdev_region(&spraid_chr_devt, 0, SPRAID_MINORS, "spraid");
- if (ret < 0)
- goto destroy_wq;
-
spraid_class = class_create(THIS_MODULE, "spraid");
if (IS_ERR(spraid_class)) {
ret = PTR_ERR(spraid_class);
- goto unregister_chrdev;
+ goto destroy_wq;
}
ret = pci_register_driver(&spraid_driver);
@@ -3591,8 +3659,6 @@ static int __init spraid_init(void)
destroy_class:
class_destroy(spraid_class);
-unregister_chrdev:
- unregister_chrdev_region(spraid_chr_devt, SPRAID_MINORS);
destroy_wq:
destroy_workqueue(spraid_wq);
@@ -3603,12 +3669,11 @@ static void __exit spraid_exit(void)
{
pci_unregister_driver(&spraid_driver);
class_destroy(spraid_class);
- unregister_chrdev_region(spraid_chr_devt, SPRAID_MINORS);
destroy_workqueue(spraid_wq);
ida_destroy(&spraid_instance_ida);
}
-MODULE_AUTHOR("Ramaxel Memory Technology");
+MODULE_AUTHOR("songyl(a)ramaxel.com");
MODULE_DESCRIPTION("Ramaxel Memory Technology SPraid Driver");
MODULE_LICENSE("GPL");
MODULE_VERSION(SPRAID_DRV_VERSION);
--
2.27.0
1
0
[PATCH openEuler-1.0-LTS v2 01/10] arm64: mm: Restore mm_cpumask (revert commit 38d96287504a ("arm64: mm: kill mm_cpumask usage"))
by Yang Yingliang 20 Dec '21
From: Takao Indoh <indou.takao(a)fujitsu.com>
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4BLL0
CVE: NA
---------------------------
mm_cpumask was deleted by commit 38d96287504a ("arm64: mm: kill
mm_cpumask usage") because it was unused at the time. It is now needed
to find the appropriate CPUs for a TLB flush, so this patch reverts that
commit.
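As a rough illustration of why the mask matters (a minimal sketch under
assumed usage, not code from this series), a targeted flush can IPI only
the CPUs recorded in mm_cpumask() instead of broadcasting to every CPU:

#include <linux/smp.h>
#include <linux/mm_types.h>

/* Hypothetical per-CPU callback; the real invalidation is arch code. */
static void sketch_flush_one(void *info)
{
	struct mm_struct *mm = info;

	(void)mm;	/* local TLB invalidation for this mm would go here */
}

/* Flush only on the CPUs that have ever run @mm, per mm_cpumask(). */
static void sketch_flush_tlb_mm(struct mm_struct *mm)
{
	on_each_cpu_mask(mm_cpumask(mm), sketch_flush_one, mm, 1);
}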
Signed-off-by: QI Fuli <qi.fuli(a)fujitsu.com>
Signed-off-by: Takao Indoh <indou.takao(a)fujitsu.com>
Signed-off-by: Cheng Jian <cj.chengjian(a)huawei.com>
Reviewed-by: Xie XiuQi <xiexiuqi(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
arch/arm64/kernel/smp.c | 6 ++++++
arch/arm64/mm/context.c | 2 ++
2 files changed, 8 insertions(+)
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index fe562778de352..e86940d353a3e 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -387,6 +387,7 @@ asmlinkage notrace void secondary_start_kernel(void)
*/
mmgrab(mm);
current->active_mm = mm;
+ cpumask_set_cpu(cpu, mm_cpumask(mm));
/*
* TTBR0 is only used for the identity mapping at this stage. Make it
@@ -489,6 +490,11 @@ int __cpu_disable(void)
*/
irq_migrate_all_off_this_cpu();
+ /*
+ * Remove this CPU from the vm mask set of all processes.
+ */
+ clear_tasks_mm_cpumask(cpu);
+
return 0;
}
diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index 2b80ceff5d6c2..27d1f3fec1cc9 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -207,6 +207,7 @@ static u64 new_context(struct mm_struct *mm, unsigned int cpu)
set_asid:
__set_bit(asid, asid_map);
cur_idx = asid;
+ cpumask_clear(mm_cpumask(mm));
return idx2asid(asid) | generation;
}
@@ -254,6 +255,7 @@ void check_and_switch_context(struct mm_struct *mm, unsigned int cpu)
switch_mm_fastpath:
arm64_apply_bp_hardening();
+ cpumask_set_cpu(cpu, mm_cpumask(mm));
/*
* Defer TTBR0_EL1 setting for user threads to uaccess_enable() when
--
2.25.1
1
9

[PATCH openEuler-1.0-LTS 1/2] audit: improve robustness of the audit queue handling
by Yang Yingliang 20 Dec '21
From: Paul Moore <paul(a)paul-moore.com>
hulk inclusion
category: bugfix
bugzilla: 185904, https://gitee.com/openeuler/kernel/issues/I4NAZN
CVE: NA
Reference: https://patchwork.kernel.org/project/linux-audit/patch/163942029335.62691.7…
-------------------------------------------------------------------
If the audit daemon were ever to get stuck in a stopped state the
kernel's kauditd_thread() could get blocked attempting to send audit
records to the userspace audit daemon. With the kernel thread
blocked it is possible that the audit queue could grow unbounded as
certain audit record generating events must be exempt from the queue
limits else the system enters a deadlock state.
This patch resolves this problem by lowering the kernel thread's
socket sending timeout from MAX_SCHEDULE_TIMEOUT to HZ/10 and tweaks
the kauditd_send_queue() function to better manage the various audit
queues when connection problems occur between the kernel and the
audit daemon. With this patch, the backlog may temporarily grow
beyond the defined limits when the audit daemon is stopped and the
system is under heavy audit pressure, but kauditd_thread() will
continue to make progress and drain the queues as it would for other
connection problems. For example, with the audit daemon put into a
stopped state and the system configured to audit every syscall it
was still possible to shutdown the system without a kernel panic,
deadlock, etc.; granted, the system was slow to shutdown but that is
to be expected given the extreme pressure of recording every syscall.
The timeout value of HZ/10 was chosen primarily through
experimentation and this developer's "gut feeling". There is likely
no one perfect value, but as this scenario is limited in scope (root
privileges would be needed to send SIGSTOP to the audit daemon), it
is likely not worth exposing this as a tunable at present. This can
always be done at a later date if it proves necessary.
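For readers skimming the diff below, the drain-with-bounded-retry
pattern it arrives at looks roughly like this (a simplified sketch of
kauditd_send_queue(), with the skb_hook/err_hook details omitted):

#include <linux/netlink.h>
#include <linux/skbuff.h>

/* Sketch: drain @queue, retrying each skb at most @retry_limit times. */
static int sketch_send_queue(struct sock *sk, u32 portid,
			     struct sk_buff_head *queue,
			     unsigned int retry_limit)
{
	struct sk_buff *skb;
	unsigned int failed = 0;	/* local, reset on every flush */
	int rc = 0;

	while ((skb = skb_dequeue(queue))) {
		if (!sk) {		/* connection already given up on */
			kfree_skb(skb);
			continue;
		}
retry:
		skb_get(skb);		/* extra ref in case the send fails */
		rc = netlink_unicast(sk, skb, portid, 0);
		if (rc < 0) {
			if (++failed >= retry_limit ||
			    rc == -ECONNREFUSED || rc == -EPERM) {
				sk = NULL;	/* stop sending, keep draining */
				kfree_skb(skb);
				continue;
			}
			goto retry;	/* transient error, try again */
		}
		consume_skb(skb);	/* sent, drop the extra reference */
		failed = 0;
	}
	return rc < 0 ? rc : 0;
}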
Cc: stable(a)vger.kernel.org
Fixes: 5b52330bbfe63 ("audit: fix auditd/kernel connection state tracking")
Reported-by: Gaosheng Cui <cuigaosheng1(a)huawei.com>
Tested-by: Gaosheng Cui <cuigaosheng1(a)huawei.com>
Reviewed-by: Richard Guy Briggs <rgb(a)redhat.com>
Signed-off-by: Paul Moore <paul(a)paul-moore.com>
Signed-off-by: Cui GaoSheng <cuigaosheng1(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
kernel/audit.c | 21 ++++++++++-----------
1 file changed, 10 insertions(+), 11 deletions(-)
diff --git a/kernel/audit.c b/kernel/audit.c
index 45741c3c48a47..968921d376b98 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -726,7 +726,7 @@ static int kauditd_send_queue(struct sock *sk, u32 portid,
{
int rc = 0;
struct sk_buff *skb;
- static unsigned int failed = 0;
+ unsigned int failed = 0;
/* NOTE: kauditd_thread takes care of all our locking, we just use
* the netlink info passed to us (e.g. sk and portid) */
@@ -743,32 +743,30 @@ static int kauditd_send_queue(struct sock *sk, u32 portid,
continue;
}
+retry:
/* grab an extra skb reference in case of error */
skb_get(skb);
rc = netlink_unicast(sk, skb, portid, 0);
if (rc < 0) {
- /* fatal failure for our queue flush attempt? */
+ /* send failed - try a few times unless fatal error */
if (++failed >= retry_limit ||
rc == -ECONNREFUSED || rc == -EPERM) {
- /* yes - error processing for the queue */
sk = NULL;
if (err_hook)
(*err_hook)(skb);
- if (!skb_hook)
- goto out;
- /* keep processing with the skb_hook */
+ if (rc == -EAGAIN)
+ rc = 0;
+ /* continue to drain the queue */
continue;
} else
- /* no - requeue to preserve ordering */
- skb_queue_head(queue, skb);
+ goto retry;
} else {
- /* it worked - drop the extra reference and continue */
+ /* skb sent - drop the extra reference and continue */
consume_skb(skb);
failed = 0;
}
}
-out:
return (rc >= 0 ? 0 : rc);
}
@@ -1557,7 +1555,8 @@ static int __net_init audit_net_init(struct net *net)
audit_panic("cannot initialize netlink socket in namespace");
return -ENOMEM;
}
- aunet->sk->sk_sndtimeo = MAX_SCHEDULE_TIMEOUT;
+ /* limit the timeout in case auditd is blocked/stopped */
+ aunet->sk->sk_sndtimeo = HZ / 10;
return 0;
}
--
2.25.1
1
1
backport psi feature and avoid kabi change
bugzilla: https://gitee.com/openeuler/kernel/issues/I47QS2
Changes since v9:
1. Third-party kernel modules should not use struct cgroup, so the
struct changes are hidden behind __GENKSYMS__ to avoid kabi changes
(suggested by Xie Xiuqi).
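The usual shape of that workaround (an assumed sketch, not the exact
hunk from this series) hides the new member from the symbol-CRC pass:

struct psi_group;			/* from the new psi_types.h */

/*
 * Sketch: the addition is invisible to genksyms, so exported-symbol
 * CRCs stay unchanged. This is only safe because out-of-tree modules
 * are not expected to embed or dereference struct cgroup directly.
 */
struct cgroup_sketch {
	unsigned long flags;		/* pre-existing members keep offsets */
#ifndef __GENKSYMS__
	struct psi_group *psi;		/* hypothetical new PSI state */
#endif
};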
Baruch Siach (1):
psi: fix reference to kernel commandline enable
Dan Schatzberg (1):
kernel/sched/psi.c: expose pressure metrics on root cgroup
Jason Xing (1):
psi: get poll_work to run when calling poll syscall next time
Johannes Weiner (14):
mm: workingset: tell cache transitions from workingset thrashing
sched: loadavg: consolidate LOAD_INT, LOAD_FRAC, CALC_LOAD
sched: loadavg: make calc_load_n() public
sched: sched.h: make rq locking and clock functions available in
stats.h
sched: introduce this_rq_lock_irq()
psi: pressure stall information for CPU, memory, and IO
psi: cgroup support
psi: make disabling/enabling easier for vendor kernels
psi: fix aggregation idle shut-off
psi: avoid divide-by-zero crash inside virtual machines
fs: kernfs: add poll file operation
kernel: cgroup: add poll file operation
sched/psi: Fix sampling error and rare div0 crashes with cgroups and
high uptime
psi: Fix a division error in psi poll()
Josef Bacik (1):
blk-iolatency: use a percentile approache for ssd's
Liu Xinpeng (2):
psi:enable psi in config
psi:avoid kabi change
Miklos Szeredi (1):
fuse: ignore PG_workingset after stealing
Miles Chen (1):
sched/psi: Correct overly pessimistic size calculation
Odin Ugedal (1):
cgroup: fix psi monitor for root cgroup
Olof Johansson (1):
kernel/sched/psi.c: simplify cgroup_move_task()
Peter Zijlstra (1):
sched/psi: Reduce psimon FIFO priority
Suren Baghdasaryan (9):
psi: introduce state_mask to represent stalled psi states
psi: make psi_enable static
psi: rename psi fields in preparation for psi trigger addition
psi: split update_stats into parts
psi: track changed states
include/: refactor headers to allow kthread.h inclusion in psi_types.h
psi: introduce psi monitor
sched/psi: Do not require setsched permission from the
sched/psi: Fix OOB write when writing 0 bytes to PSI files
Yafang Shao (1):
mm, memcg: add workingset_restore in memory.stat
Documentation/accounting/psi.txt | 180 ++++
Documentation/admin-guide/cgroup-v2.rst | 22 +
Documentation/admin-guide/kernel-parameters.txt | 4 +
arch/arm64/configs/openeuler_defconfig | 2 +
arch/powerpc/platforms/cell/cpufreq_spudemand.c | 2 +-
arch/powerpc/platforms/cell/spufs/sched.c | 9 +-
arch/s390/appldata/appldata_os.c | 4 -
arch/x86/configs/openeuler_defconfig | 2 +
block/blk-iolatency.c | 183 +++-
drivers/cpuidle/governors/menu.c | 4 -
drivers/spi/spi-rockchip.c | 1 +
fs/fuse/dev.c | 1 +
fs/kernfs/file.c | 31 +-
fs/proc/loadavg.c | 3 -
include/linux/cgroup-defs.h | 11 +
include/linux/cgroup.h | 15 +
include/linux/kernfs.h | 8 +
include/linux/kthread.h | 4 +
include/linux/page-flags.h | 5 +
include/linux/psi.h | 65 ++
include/linux/psi_types.h | 173 +++
include/linux/sched.h | 13 +
include/linux/sched/loadavg.h | 24 +-
include/linux/swap.h | 1 +
include/trace/events/mmflags.h | 1 +
init/Kconfig | 28 +
kernel/cgroup/cgroup.c | 131 ++-
kernel/debug/kdb/kdb_main.c | 7 +-
kernel/fork.c | 4 +
kernel/kthread.c | 1 +
kernel/sched/Makefile | 1 +
kernel/sched/core.c | 16 +-
kernel/sched/loadavg.c | 139 ++-
kernel/sched/psi.c | 1292 +++++++++++++++++++++++
kernel/sched/sched.h | 178 ++--
kernel/sched/stats.h | 86 ++
kernel/workqueue.c | 23 +
kernel/workqueue_internal.h | 6 +-
mm/compaction.c | 5 +
mm/filemap.c | 20 +-
mm/huge_memory.c | 1 +
mm/migrate.c | 2 +
mm/page_alloc.c | 9 +
mm/swap_state.c | 1 +
mm/vmscan.c | 10 +
mm/workingset.c | 113 +-
46 files changed, 2562 insertions(+), 279 deletions(-)
create mode 100644 Documentation/accounting/psi.txt
create mode 100644 include/linux/psi.h
create mode 100644 include/linux/psi_types.h
create mode 100644 kernel/sched/psi.c
--
1.8.3.1
1
35
[PATCH openEuler-1.0-LTS] block/wbt: fix negative inflight counter when remove scsi device
by Yang Yingliang 18 Dec '21
From: Laibin Qiu <qiulaibin(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: 185857
CVE: NA
--------------------------------
Wbt is disabled by setting WBT_STATE_OFF_DEFAULT in
wbt_disable_default() when the elevator is switched to bfq, and it is
re-enabled by wbt_enable_default() when the scsi device is removed. A
write request submitted in between can therefore see wbt disabled in
wbt_wait() but enabled in wbt_track(), a false positive.
The following is the scenario that triggered the problem.
T1                      T2                      T3
elevator_switch_mq
 bfq_init_queue
  wbt_disable_default
  <= set rwb->enable_state (OFF)
                        submit_bio
                        blk_mq_make_request
                         rq_qos_throttle
                         <= rwb->enable_state (OFF)
                                                scsi_remove_device
                                                 sd_remove
                                                  del_gendisk
                                                   blk_unregister_queue
                                                    elv_unregister_queue
                                                     wbt_enable_default
                                                     <= set rwb->enable_state (ON)
                         rq_qos_track
                         <= rwb->enable_state (ON)
^^^^^^ this request is marked WBT_TRACKED without the matching inflight
increment, so wbt_done() later drops rqw->inflight to -1, which triggers
an IO hang.
Fix this by moving wbt_enable_default() from elv_unregister_queue() to
bfq_exit_queue(), so that wbt is only re-enabled when bfq exits.
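The broken invariant can be boiled down to a few lines (a schematic
sketch, not the real blk-wbt code): a request may only carry the tracked
flag if the same enable-state decision also bumped the inflight counter.

#include <linux/atomic.h>

#define SKETCH_TRACKED	(1UL << 0)

struct sketch_rqw {
	atomic_t inflight;
};

/* wbt_wait() side: wbt looked disabled, so no inflight increment. */
static void sketch_throttle(struct sketch_rqw *rqw, bool enabled)
{
	if (enabled)
		atomic_inc(&rqw->inflight);
}

/* wbt_track() side: wbt re-enabled in between, request gets flagged. */
static void sketch_track(unsigned long *rq_flags, bool enabled)
{
	if (enabled)
		*rq_flags |= SKETCH_TRACKED;
}

/* wbt_done() side: the flagged request decrements, inflight hits -1. */
static void sketch_done(struct sketch_rqw *rqw, unsigned long rq_flags)
{
	if (rq_flags & SKETCH_TRACKED)
		atomic_dec(&rqw->inflight);
}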
Fixes: 76a8040817b4b ("blk-wbt: make sure throttle is enabled properly")
Conflict:
block/elevator.c
Signed-off-by: Laibin Qiu <qiulaibin(a)huawei.com>
Reviewed-by: Jason Yan <yanaijie(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
block/bfq-iosched.c | 4 ++++
block/elevator.c | 2 --
2 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 9078299ffe3da..fe4255698d810 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -5381,6 +5381,7 @@ static void bfq_exit_queue(struct elevator_queue *e)
{
struct bfq_data *bfqd = e->elevator_data;
struct bfq_queue *bfqq, *n;
+ struct request_queue *q = bfqd->queue;
hrtimer_cancel(&bfqd->idle_slice_timer);
@@ -5404,6 +5405,9 @@ static void bfq_exit_queue(struct elevator_queue *e)
#endif
kfree(bfqd);
+
+ /* Re-enable throttling in case elevator disabled it */
+ wbt_enable_default(q);
}
static void bfq_init_root_group(struct bfq_group *root_group,
diff --git a/block/elevator.c b/block/elevator.c
index 70ea9f3e64d7e..9621a812319e7 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -876,8 +876,6 @@ void elv_unregister_queue(struct request_queue *q)
kobject_uevent(&e->kobj, KOBJ_REMOVE);
kobject_del(&e->kobj);
e->registered = 0;
- /* Re-enable throttling in case elevator disabled it */
- wbt_enable_default(q);
}
}
--
2.25.1
1
0
18 Dec '21
From: Ye Bin <yebin10(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: 185875
CVE: NA
-----------------------------------------------
We got the following issue:
[ 833.786542] nbd: failed to add new device
[ 833.791613] ==================================================================
[ 833.794918] BUG: KASAN: use-after-free in blk_mq_free_rqs+0x558/0x6c0
[ 833.798108] Read of size 8 at addr ffff800109b7c288 by task kworker/0:3/113
[ 833.804216] CPU: 0 PID: 113 Comm: kworker/0:3 Kdump: loaded Not tainted 4.19.90 #1
[ 833.807635] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[ 833.811798] Workqueue: events __blk_release_queue
[ 833.815035] Call trace:
[ 833.817964] dump_backtrace+0x0/0x3c0
[ 833.821070] show_stack+0x28/0x38
[ 833.824091] dump_stack+0xfc/0x154
[ 833.827042] print_address_description+0x68/0x278
[ 833.830147] kasan_report+0x204/0x330
[ 833.833121] __asan_report_load8_noabort+0x30/0x40
[ 833.836180] blk_mq_free_rqs+0x558/0x6c0
[ 833.839089] blk_mq_sched_tags_teardown+0xf4/0x1c0
[ 833.842035] blk_mq_exit_sched+0x1b8/0x260
[ 833.844878] elevator_exit+0x114/0x148
[ 833.847634] blk_exit_queue+0x68/0xe8
[ 833.850352] __blk_release_queue+0xd0/0x408
[ 833.853113] process_one_work+0x55c/0x10d0
[ 833.855864] worker_thread+0x3d4/0xe30
[ 833.858558] kthread+0x2c8/0x348
[ 833.861139] ret_from_fork+0x10/0x18
[ 833.863714]
[ 833.866000] Allocated by task 186531:
[ 833.868467] kasan_kmalloc+0xe0/0x190
[ 833.870936] kmem_cache_alloc_trace+0x104/0x218
[ 833.873483] nbd_dev_add+0x54/0x760 [nbd]
[ 833.875988] nbd_genl_connect+0x3c4/0x1348 [nbd]
[ 833.878591] genl_family_rcv_msg+0x798/0xa10
[ 833.881113] genl_rcv_msg+0xc0/0x170
[ 833.883489] netlink_rcv_skb+0x1b4/0x370
[ 833.885897] genl_rcv+0x40/0x58
[ 833.888225] netlink_unicast+0x4bc/0x660
[ 833.890661] netlink_sendmsg+0x880/0xa60
[ 833.893112] sock_sendmsg+0xb8/0x110
[ 833.895513] ____sys_sendmsg+0x570/0x698
[ 833.897927] ___sys_sendmsg+0x108/0x188
[ 833.900350] __sys_sendmsg+0xe8/0x198
[ 833.900360] __arm64_sys_sendmsg+0x78/0xa8
[ 833.906911] el0_svc_common+0x10c/0x330
[ 833.909289] el0_svc_handler+0x60/0xd0
[ 833.911660] el0_svc+0x8/0x1b0
[ 833.913963]
[ 833.916117] Freed by task 186531:
[ 833.918445] __kasan_slab_free+0x120/0x228
[ 833.920860] kasan_slab_free+0x10/0x18
[ 833.923193] kfree+0x80/0x1f0
[ 833.925392] nbd_dev_add+0xf0/0x760 [nbd]
[ 833.927686] nbd_genl_connect+0x3c4/0x1348 [nbd]
[ 833.929989] genl_family_rcv_msg+0x798/0xa10
[ 833.932231] genl_rcv_msg+0xc0/0x170
[ 833.934335] netlink_rcv_skb+0x1b4/0x370
[ 833.936444] genl_rcv+0x40/0x58
[ 833.938460] netlink_unicast+0x4bc/0x660
[ 833.940570] netlink_sendmsg+0x880/0xa60
[ 833.942682] sock_sendmsg+0xb8/0x110
[ 833.944745] ____sys_sendmsg+0x570/0x698
[ 833.946849] ___sys_sendmsg+0x108/0x188
[ 833.948924] __sys_sendmsg+0xe8/0x198
[ 833.950980] __arm64_sys_sendmsg+0x78/0xa8
[ 833.953059] el0_svc_common+0x10c/0x330
[ 833.955121] el0_svc_handler+0x60/0xd0
[ 833.957143] el0_svc+0x8/0x1b0
[ 833.959088]
[ 833.960846] The buggy address belongs to the object at ffff800109b7c280
[ 833.960846] which belongs to the cache kmalloc-512 of size 512
[ 833.965502] The buggy address is located 8 bytes inside of
[ 833.965502] 512-byte region [ffff800109b7c280, ffff800109b7c480)
[ 833.970108] The buggy address belongs to the page:
[ 833.972390] page:ffff7e000426df00 count:1 mapcount:0 mapping:ffff8000c0003800 index:0x0 compound_mapcount: 0
[ 833.975269] flags: 0x7ffff0000008100(slab|head)
[ 833.977640] raw: 07ffff0000008100 ffff7e00035d1600 0000000200000002 ffff8000c0003800
[ 833.980426] raw: 0000000000000000 00000000000c000c 00000001ffffffff 0000000000000000
[ 833.983212] page dumped because: kasan: bad access detected
[ 833.985778]
[ 833.987935] Memory state around the buggy address:
[ 833.990448] ffff800109b7c180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 833.993265] ffff800109b7c200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 833.996097] >ffff800109b7c280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 833.998930] ^
[ 834.001384] ffff800109b7c300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 834.004329] ffff800109b7c380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 834.007264] ==================================================================
Commit 98fd847a495f added a check on "disk->first_minor"; when that
check fails, the error path frees the tags and the subsequent put_disk()
frees the request_queue:
blk_mq_free_tag_set
  blk_mq_free_map_and_requests
    blk_mq_free_rqs
put_disk:
  __blk_release_queue
    blk_exit_queue
      elevator_exit
        blk_mq_exit_sched
          blk_mq_sched_tags_teardown
            blk_mq_free_rqs  --> will trigger UAF
To address this issue, move the 'disk->first_minor' check to the very
beginning of nbd_dev_add().
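To make the range check concrete, here is a small userspace
demonstration with assumed values (part_shift is normally derived from
a module parameter):

#include <stdio.h>

#define MINORMASK 0xfffff	/* (1U << 20) - 1, as in the kernel */

int main(void)
{
	int part_shift = 5;			/* assumed value */
	int index = 1 << 16;			/* a too-big index */
	int first_minor = index << part_shift;	/* 0x200000 */

	if (first_minor < index || first_minor > MINORMASK)
		printf("index %d rejected: first_minor 0x%x out of range\n",
		       index, first_minor);
	return 0;
}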
Fixes: 98fd847a495f ("nbd: add sanity check for first_minor")
Signed-off-by: Ye Bin <yebin10(a)huawei.com>
Reviewed-by: Yu Kuai <yukuai3(a)huawei.com>
Reviewed-by: Jason Yan <yanaijie(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/block/nbd.c | 22 ++++++++++------------
1 file changed, 10 insertions(+), 12 deletions(-)
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 775cbb4c1bbcd..a35189098581b 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -1702,6 +1702,15 @@ static int nbd_dev_add(int index)
struct gendisk *disk;
struct request_queue *q;
int err = -ENOMEM;
+ int first_minor = index << part_shift;
+
+ /*
+ * Too big index can cause duplicate creation of sysfs files/links,
+ * because MKDEV() expect that the max first minor is MINORMASK, or
+ * index << part_shift can overflow.
+ */
+ if (first_minor < index || first_minor > MINORMASK)
+ return -EINVAL;
nbd = kzalloc(sizeof(struct nbd_device), GFP_KERNEL);
if (!nbd)
@@ -1765,18 +1774,7 @@ static int nbd_dev_add(int index)
refcount_set(&nbd->refs, 1);
INIT_LIST_HEAD(&nbd->list);
disk->major = NBD_MAJOR;
-
- /*
- * Too big index can cause duplicate creation of sysfs files/links,
- * because MKDEV() expect that the max first minor is MINORMASK, or
- * index << part_shift can overflow.
- */
- disk->first_minor = index << part_shift;
- if (disk->first_minor < index || disk->first_minor > MINORMASK) {
- err = -EINVAL;
- goto out_free_tags;
- }
-
+ disk->first_minor = first_minor;
disk->fops = &nbd_fops;
disk->private_data = nbd;
sprintf(disk->disk_name, "nbd%d", index);
--
2.25.1
1
0
[PATCH openEuler-1.0-LTS 1/5] block, bfq: improve asymmetric scenarios detection
by Yang Yingliang 17 Dec '21
From: Federico Motta <federico(a)willer.it>
mainline inclusion
from mainline-v4.20-rc1
commit 2d29c9f89fcd9bf408fcdaaf515c90a169f22ecd
category: bugfix
bugzilla: 185905
CVE: NA
---------------------------
bfq defines as asymmetric a scenario where an active entity, say E
(representing either a single bfq_queue or a group of other entities),
has a higher weight than some other entities. If the entity E does sync
I/O in such a scenario, then bfq plugs the dispatch of the I/O of the
other entities in the following situation: E is in service but
temporarily has no pending I/O request. In fact, without this plugging,
all the times that E stops being temporarily idle, it may find the
internal queues of the storage device already filled with an
out-of-control number of extra requests, from other entities. So E may
have to wait for the service of these extra requests, before finally
having its own requests served. This may easily break service
guarantees, with E getting less than its fair share of the device
throughput. Usually, the end result is that E gets the same fraction of
the throughput as the other entities, instead of getting more, according
to its higher weight.
Yet there are two other more subtle cases where E, even if its weight is
actually equal to or even lower than the weight of any other active
entities, may get less than its fair share of the throughput in case the
above I/O plugging is not performed:
1. other entities issue larger requests than E;
2. other entities contain more active child entities than E (or in
general tend to have more backlog than E).
In the first case, other entities may get more service than E because
they get larger requests, than those of E, served during the temporary
idle periods of E. In the second case, other entities get more service
because, by having many child entities, they have many requests ready
for dispatching while E is temporarily idle.
This commit addresses this issue by extending the definition of
asymmetric scenario: a scenario is asymmetric when
- active entities representing bfq_queues have differentiated weights,
as in the original definition
or (inclusive)
- one or more entities representing groups of entities are active.
This broader definition makes sure that I/O plugging will be performed
in all the above cases, provided that there is at least one active
group. Of course, this definition is very coarse, so it will trigger
I/O plugging also in cases where it is not needed, such as, e.g.,
multiple active entities with just one child each, and all with the same
I/O-request size. The reason for this coarse definition is just that a
finer-grained definition would be rather heavy to compute.
On the opposite end, even this new definition does not trigger I/O
plugging in all cases where there is no active group, and all bfq_queues
have the same weight. So, in these cases some unfairness may occur if
there are asymmetries in I/O-request sizes. We made this choice because
I/O plugging may lower throughput, and probably a user that has not
created any group cares more about throughput than about perfect
fairness. At any rate, as for possible applications that may care about
service guarantees, bfq already guarantees a high responsiveness and a
low latency to soft real-time applications automatically.
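One detail worth spelling out before the diff (a sketch of the
reasoning, not new driver code): a weight-counter rbtree holds one node
per distinct weight, so "at least two distinct weights" reduces to "the
root node has a child":

#include <linux/rbtree.h>

/* True iff @root holds two or more nodes, i.e. two distinct weights. */
static bool sketch_two_or_more_weights(struct rb_root *root)
{
	return root->rb_node &&
	       (root->rb_node->rb_left || root->rb_node->rb_right);
}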
Signed-off-by: Federico Motta <federico(a)willer.it>
Signed-off-by: Paolo Valente <paolo.valente(a)linaro.org>
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
Signed-off-by: Yu Kuai <yukuai3(a)huawei.com>
Reviewed-by: Hou Tao <houtao1(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
block/bfq-iosched.c | 223 ++++++++++++++++++++++++--------------------
block/bfq-iosched.h | 27 +++---
block/bfq-wf2q.c | 35 +++----
3 files changed, 154 insertions(+), 131 deletions(-)
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 4686ca1e93afe..e2b5c3ccd5502 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -625,12 +625,13 @@ void bfq_pos_tree_add_move(struct bfq_data *bfqd, struct bfq_queue *bfqq)
}
/*
- * Tell whether there are active queues or groups with differentiated weights.
+ * Tell whether there are active queues with different weights or
+ * active groups.
*/
-static bool bfq_differentiated_weights(struct bfq_data *bfqd)
+static bool bfq_varied_queue_weights_or_active_groups(struct bfq_data *bfqd)
{
/*
- * For weights to differ, at least one of the trees must contain
+ * For queue weights to differ, queue_weights_tree must contain
* at least two nodes.
*/
return (!RB_EMPTY_ROOT(&bfqd->queue_weights_tree) &&
@@ -638,9 +639,7 @@ static bool bfq_differentiated_weights(struct bfq_data *bfqd)
bfqd->queue_weights_tree.rb_node->rb_right)
#ifdef CONFIG_BFQ_GROUP_IOSCHED
) ||
- (!RB_EMPTY_ROOT(&bfqd->group_weights_tree) &&
- (bfqd->group_weights_tree.rb_node->rb_left ||
- bfqd->group_weights_tree.rb_node->rb_right)
+ (bfqd->num_active_groups > 0
#endif
);
}
@@ -658,26 +657,25 @@ static bool bfq_differentiated_weights(struct bfq_data *bfqd)
* 3) all active groups at the same level in the groups tree have the same
* number of children.
*
- * Unfortunately, keeping the necessary state for evaluating exactly the
- * above symmetry conditions would be quite complex and time-consuming.
- * Therefore this function evaluates, instead, the following stronger
- * sub-conditions, for which it is much easier to maintain the needed
- * state:
+ * Unfortunately, keeping the necessary state for evaluating exactly
+ * the last two symmetry sub-conditions above would be quite complex
+ * and time consuming. Therefore this function evaluates, instead,
+ * only the following stronger two sub-conditions, for which it is
+ * much easier to maintain the needed state:
* 1) all active queues have the same weight,
- * 2) all active groups have the same weight,
- * 3) all active groups have at most one active child each.
- * In particular, the last two conditions are always true if hierarchical
- * support and the cgroups interface are not enabled, thus no state needs
- * to be maintained in this case.
+ * 2) there are no active groups.
+ * In particular, the last condition is always true if hierarchical
+ * support or the cgroups interface are not enabled, thus no state
+ * needs to be maintained in this case.
*/
static bool bfq_symmetric_scenario(struct bfq_data *bfqd)
{
- return !bfq_differentiated_weights(bfqd);
+ return !bfq_varied_queue_weights_or_active_groups(bfqd);
}
/*
* If the weight-counter tree passed as input contains no counter for
- * the weight of the input entity, then add that counter; otherwise just
+ * the weight of the input queue, then add that counter; otherwise just
* increment the existing counter.
*
* Note that weight-counter trees contain few nodes in mostly symmetric
@@ -688,25 +686,25 @@ static bool bfq_symmetric_scenario(struct bfq_data *bfqd)
* In most scenarios, the rate at which nodes are created/destroyed
* should be low too.
*/
-void bfq_weights_tree_add(struct bfq_data *bfqd, struct bfq_entity *entity,
+void bfq_weights_tree_add(struct bfq_data *bfqd, struct bfq_queue *bfqq,
struct rb_root *root)
{
+ struct bfq_entity *entity = &bfqq->entity;
struct rb_node **new = &(root->rb_node), *parent = NULL;
/*
- * Do not insert if the entity is already associated with a
+ * Do not insert if the queue is already associated with a
* counter, which happens if:
- * 1) the entity is associated with a queue,
- * 2) a request arrival has caused the queue to become both
+ * 1) a request arrival has caused the queue to become both
* non-weight-raised, and hence change its weight, and
* backlogged; in this respect, each of the two events
* causes an invocation of this function,
- * 3) this is the invocation of this function caused by the
+ * 2) this is the invocation of this function caused by the
* second event. This second invocation is actually useless,
* and we handle this fact by exiting immediately. More
* efficient or clearer solutions might possibly be adopted.
*/
- if (entity->weight_counter)
+ if (bfqq->weight_counter)
return;
while (*new) {
@@ -716,7 +714,7 @@ void bfq_weights_tree_add(struct bfq_data *bfqd, struct bfq_entity *entity,
parent = *new;
if (entity->weight == __counter->weight) {
- entity->weight_counter = __counter;
+ bfqq->weight_counter = __counter;
goto inc_counter;
}
if (entity->weight < __counter->weight)
@@ -725,66 +723,67 @@ void bfq_weights_tree_add(struct bfq_data *bfqd, struct bfq_entity *entity,
new = &((*new)->rb_right);
}
- entity->weight_counter = kzalloc(sizeof(struct bfq_weight_counter),
- GFP_ATOMIC);
+ bfqq->weight_counter = kzalloc(sizeof(struct bfq_weight_counter),
+ GFP_ATOMIC);
/*
* In the unlucky event of an allocation failure, we just
- * exit. This will cause the weight of entity to not be
- * considered in bfq_differentiated_weights, which, in its
- * turn, causes the scenario to be deemed wrongly symmetric in
- * case entity's weight would have been the only weight making
- * the scenario asymmetric. On the bright side, no unbalance
- * will however occur when entity becomes inactive again (the
- * invocation of this function is triggered by an activation
- * of entity). In fact, bfq_weights_tree_remove does nothing
- * if !entity->weight_counter.
+ * exit. This will cause the weight of queue to not be
+ * considered in bfq_varied_queue_weights_or_active_groups,
+ * which, in its turn, causes the scenario to be deemed
+ * wrongly symmetric in case bfqq's weight would have been
+ * the only weight making the scenario asymmetric. On the
+ * bright side, no unbalance will however occur when bfqq
+ * becomes inactive again (the invocation of this function
+ * is triggered by an activation of queue). In fact,
+ * bfq_weights_tree_remove does nothing if
+ * !bfqq->weight_counter.
*/
- if (unlikely(!entity->weight_counter))
+ if (unlikely(!bfqq->weight_counter))
return;
- entity->weight_counter->weight = entity->weight;
- rb_link_node(&entity->weight_counter->weights_node, parent, new);
- rb_insert_color(&entity->weight_counter->weights_node, root);
+ bfqq->weight_counter->weight = entity->weight;
+ rb_link_node(&bfqq->weight_counter->weights_node, parent, new);
+ rb_insert_color(&bfqq->weight_counter->weights_node, root);
inc_counter:
- entity->weight_counter->num_active++;
+ bfqq->weight_counter->num_active++;
}
/*
- * Decrement the weight counter associated with the entity, and, if the
+ * Decrement the weight counter associated with the queue, and, if the
* counter reaches 0, remove the counter from the tree.
* See the comments to the function bfq_weights_tree_add() for considerations
* about overhead.
*/
void __bfq_weights_tree_remove(struct bfq_data *bfqd,
- struct bfq_entity *entity,
+ struct bfq_queue *bfqq,
struct rb_root *root)
{
- if (!entity->weight_counter)
+ if (!bfqq->weight_counter)
return;
- entity->weight_counter->num_active--;
- if (entity->weight_counter->num_active > 0)
+ bfqq->weight_counter->num_active--;
+ if (bfqq->weight_counter->num_active > 0)
goto reset_entity_pointer;
- rb_erase(&entity->weight_counter->weights_node, root);
- kfree(entity->weight_counter);
+ rb_erase(&bfqq->weight_counter->weights_node, root);
+ kfree(bfqq->weight_counter);
reset_entity_pointer:
- entity->weight_counter = NULL;
+ bfqq->weight_counter = NULL;
}
/*
- * Invoke __bfq_weights_tree_remove on bfqq and all its inactive
- * parent entities.
+ * Invoke __bfq_weights_tree_remove on bfqq and decrement the number
+ * of active groups for each queue's inactive parent entity.
*/
void bfq_weights_tree_remove(struct bfq_data *bfqd,
struct bfq_queue *bfqq)
{
struct bfq_entity *entity = bfqq->entity.parent;
- __bfq_weights_tree_remove(bfqd, &bfqq->entity,
+ __bfq_weights_tree_remove(bfqd, bfqq,
&bfqd->queue_weights_tree);
for_each_entity(entity) {
@@ -798,17 +797,13 @@ void bfq_weights_tree_remove(struct bfq_data *bfqd,
* next_in_service for details on why
* in_service_entity must be checked too).
*
- * As a consequence, the weight of entity is
- * not to be removed. In addition, if entity
- * is active, then its parent entities are
- * active as well, and thus their weights are
- * not to be removed either. In the end, this
- * loop must stop here.
+ * As a consequence, its parent entities are
+ * active as well, and thus this loop must
+ * stop here.
*/
break;
}
- __bfq_weights_tree_remove(bfqd, entity,
- &bfqd->group_weights_tree);
+ bfqd->num_active_groups--;
}
}
@@ -3539,9 +3534,11 @@ static bool bfq_better_to_idle(struct bfq_queue *bfqq)
* symmetric scenario where:
* (i) each of these processes must get the same throughput as
* the others;
- * (ii) all these processes have the same I/O pattern
- (either sequential or random).
- * In fact, in such a scenario, the drive will tend to treat
+ * (ii) the I/O of each process has the same properties, in
+ * terms of locality (sequential or random), direction
+ * (reads or writes), request sizes, greediness
+ * (from I/O-bound to sporadic), and so on.
+ * In fact, in such a scenario, the drive tends to treat
* the requests of each of these processes in about the same
* way as the requests of the others, and thus to provide
* each of these processes with about the same throughput
@@ -3550,18 +3547,50 @@ static bool bfq_better_to_idle(struct bfq_queue *bfqq)
* certainly needed to guarantee that bfqq receives its
* assigned fraction of the device throughput (see [1] for
* details).
+ * The problem is that idling may significantly reduce
+ * throughput with certain combinations of types of I/O and
+ * devices. An important example is sync random I/O, on flash
+ * storage with command queueing. So, unless bfqq falls in the
+ * above cases where idling also boosts throughput, it would
+ * be important to check conditions (i) and (ii) accurately,
+ * so as to avoid idling when not strictly needed for service
+ * guarantees.
+ *
+ * Unfortunately, it is extremely difficult to thoroughly
+ * check condition (ii). And, in case there are active groups,
+ * it becomes very difficult to check condition (i) too. In
+ * fact, if there are active groups, then, for condition (i)
+ * to become false, it is enough that an active group contains
+ * more active processes or sub-groups than some other active
+ * group. We address this issue with the following bi-modal
+ * behavior, implemented in the function
+ * bfq_symmetric_scenario().
*
- * We address this issue by controlling, actually, only the
- * symmetry sub-condition (i), i.e., provided that
- * sub-condition (i) holds, idling is not performed,
- * regardless of whether sub-condition (ii) holds. In other
- * words, only if sub-condition (i) holds, then idling is
+ * If there are active groups, then the scenario is tagged as
+ * asymmetric, conservatively, without checking any of the
+ * conditions (i) and (ii). So the device is idled for bfqq.
+ * This behavior matches also the fact that groups are created
+ * exactly if controlling I/O (to preserve bandwidth and
+ * latency guarantees) is a primary concern.
+ *
+ * On the opposite end, if there are no active groups, then
+ * only condition (i) is actually controlled, i.e., provided
+ * that condition (i) holds, idling is not performed,
+ * regardless of whether condition (ii) holds. In other words,
+ * only if condition (i) does not hold, then idling is
* allowed, and the device tends to be prevented from queueing
- * many requests, possibly of several processes. The reason
- * for not controlling also sub-condition (ii) is that we
- * exploit preemption to preserve guarantees in case of
- * symmetric scenarios, even if (ii) does not hold, as
- * explained in the next two paragraphs.
+ * many requests, possibly of several processes. Since there
+ * are no active groups, then, to control condition (i) it is
+ * enough to check whether all active queues have the same
+ * weight.
+ *
+ * Not checking condition (ii) evidently exposes bfqq to the
+ * risk of getting less throughput than its fair share.
+ * However, for queues with the same weight, a further
+ * mechanism, preemption, mitigates or even eliminates this
+ * problem. And it does so without consequences on overall
+ * throughput. This mechanism and its benefits are explained
+ * in the next three paragraphs.
*
* Even if a queue, say Q, is expired when it remains idle, Q
* can still preempt the new in-service queue if the next
@@ -3575,11 +3604,7 @@ static bool bfq_better_to_idle(struct bfq_queue *bfqq)
* idling allows the internal queues of the device to contain
* many requests, and thus to reorder requests, we can rather
* safely assume that the internal scheduler still preserves a
- * minimum of mid-term fairness. The motivation for using
- * preemption instead of idling is that, by not idling,
- * service guarantees are preserved without minimally
- * sacrificing throughput. In other words, both a high
- * throughput and its desired distribution are obtained.
+ * minimum of mid-term fairness.
*
* More precisely, this preemption-based, idleless approach
* provides fairness in terms of IOPS, and not sectors per
@@ -3598,27 +3623,27 @@ static bool bfq_better_to_idle(struct bfq_queue *bfqq)
* 1024/8 times as high as the service received by the other
* queue.
*
- * On the other hand, device idling is performed, and thus
- * pure sector-domain guarantees are provided, for the
- * following queues, which are likely to need stronger
- * throughput guarantees: weight-raised queues, and queues
- * with a higher weight than other queues. When such queues
- * are active, sub-condition (i) is false, which triggers
- * device idling.
+ * The motivation for using preemption instead of idling (for
+ * queues with the same weight) is that, by not idling,
+ * service guarantees are preserved (completely or at least in
+ * part) without minimally sacrificing throughput. And, if
+ * there is no active group, then the primary expectation for
+ * this device is probably a high throughput.
*
- * According to the above considerations, the next variable is
- * true (only) if sub-condition (i) holds. To compute the
- * value of this variable, we not only use the return value of
- * the function bfq_symmetric_scenario(), but also check
- * whether bfqq is being weight-raised, because
- * bfq_symmetric_scenario() does not take into account also
- * weight-raised queues (see comments on
- * bfq_weights_tree_add()). In particular, if bfqq is being
- * weight-raised, it is important to idle only if there are
- * other, non-weight-raised queues that may steal throughput
- * to bfqq. Actually, we should be even more precise, and
- * differentiate between interactive weight raising and
- * soft real-time weight raising.
+ * We are now left only with explaining the additional
+ * compound condition that is checked below for deciding
+ * whether the scenario is asymmetric. To explain this
+ * compound condition, we need to add that the function
+ * bfq_symmetric_scenario checks the weights of only
+ * non-weight-raised queues, for efficiency reasons (see
+ * comments on bfq_weights_tree_add()). Then the fact that
+ * bfqq is weight-raised is checked explicitly here. More
+ * precisely, the compound condition below takes into account
+ * also the fact that, even if bfqq is being weight-raised,
+ * the scenario is still symmetric if all active queues happen
+ * to be weight-raised. Actually, we should be even more
+ * precise here, and differentiate between interactive weight
+ * raising and soft real-time weight raising.
*
* As a side note, it is worth considering that the above
* device-idling countermeasures may however fail in the
@@ -5408,7 +5433,7 @@ static int bfq_init_queue(struct request_queue *q, struct elevator_type *e)
bfqd->idle_slice_timer.function = bfq_idle_slice_timer;
bfqd->queue_weights_tree = RB_ROOT;
- bfqd->group_weights_tree = RB_ROOT;
+ bfqd->num_active_groups = 0;
INIT_LIST_HEAD(&bfqd->active_list);
INIT_LIST_HEAD(&bfqd->idle_list);
diff --git a/block/bfq-iosched.h b/block/bfq-iosched.h
index a41e9884f2dd2..e9960727cb565 100644
--- a/block/bfq-iosched.h
+++ b/block/bfq-iosched.h
@@ -108,15 +108,14 @@ struct bfq_sched_data {
};
/**
- * struct bfq_weight_counter - counter of the number of all active entities
+ * struct bfq_weight_counter - counter of the number of all active queues
* with a given weight.
*/
struct bfq_weight_counter {
- unsigned int weight; /* weight of the entities this counter refers to */
- unsigned int num_active; /* nr of active entities with this weight */
+ unsigned int weight; /* weight of the queues this counter refers to */
+ unsigned int num_active; /* nr of active queues with this weight */
/*
- * Weights tree member (see bfq_data's @queue_weights_tree and
- * @group_weights_tree)
+ * Weights tree member (see bfq_data's @queue_weights_tree)
*/
struct rb_node weights_node;
};
@@ -151,8 +150,6 @@ struct bfq_weight_counter {
struct bfq_entity {
/* service_tree member */
struct rb_node rb_node;
- /* pointer to the weight counter associated with this entity */
- struct bfq_weight_counter *weight_counter;
/*
* Flag, true if the entity is on a tree (either the active or
@@ -266,6 +263,9 @@ struct bfq_queue {
/* entity representing this queue in the scheduler */
struct bfq_entity entity;
+ /* pointer to the weight counter associated with this entity */
+ struct bfq_weight_counter *weight_counter;
+
/* maximum budget allowed from the feedback mechanism */
int max_budget;
/* budget expiration (in jiffies) */
@@ -449,14 +449,9 @@ struct bfq_data {
*/
struct rb_root queue_weights_tree;
/*
- * rbtree of non-queue @bfq_entity weight counters, sorted by
- * weight. Used to keep track of whether all @bfq_groups have
- * the same weight. The tree contains one counter for each
- * distinct weight associated to some active @bfq_group (see
- * the comments to the functions bfq_weights_tree_[add|remove]
- * for further details).
+ * number of groups with requests still waiting for completion
*/
- struct rb_root group_weights_tree;
+ unsigned int num_active_groups;
/*
* Number of bfq_queues containing requests (including the
@@ -854,10 +849,10 @@ struct bfq_queue *bic_to_bfqq(struct bfq_io_cq *bic, bool is_sync);
void bic_set_bfqq(struct bfq_io_cq *bic, struct bfq_queue *bfqq, bool is_sync);
struct bfq_data *bic_to_bfqd(struct bfq_io_cq *bic);
void bfq_pos_tree_add_move(struct bfq_data *bfqd, struct bfq_queue *bfqq);
-void bfq_weights_tree_add(struct bfq_data *bfqd, struct bfq_entity *entity,
+void bfq_weights_tree_add(struct bfq_data *bfqd, struct bfq_queue *bfqq,
struct rb_root *root);
void __bfq_weights_tree_remove(struct bfq_data *bfqd,
- struct bfq_entity *entity,
+ struct bfq_queue *bfqq,
struct rb_root *root);
void bfq_weights_tree_remove(struct bfq_data *bfqd,
struct bfq_queue *bfqq);
diff --git a/block/bfq-wf2q.c b/block/bfq-wf2q.c
index e830715fe15d6..cdd164c82398a 100644
--- a/block/bfq-wf2q.c
+++ b/block/bfq-wf2q.c
@@ -788,25 +788,29 @@ __bfq_entity_update_weight_prio(struct bfq_service_tree *old_st,
new_weight = entity->orig_weight *
(bfqq ? bfqq->wr_coeff : 1);
/*
- * If the weight of the entity changes, remove the entity
- * from its old weight counter (if there is a counter
- * associated with the entity), and add it to the counter
- * associated with its new weight.
+ * If the weight of the entity changes, and the entity is a
+ * queue, remove the entity from its old weight counter (if
+ * there is a counter associated with the entity).
*/
if (prev_weight != new_weight) {
- root = bfqq ? &bfqd->queue_weights_tree :
- &bfqd->group_weights_tree;
- __bfq_weights_tree_remove(bfqd, entity, root);
+ if (bfqq) {
+ root = &bfqd->queue_weights_tree;
+ __bfq_weights_tree_remove(bfqd, bfqq, root);
+ } else
+ bfqd->num_active_groups--;
}
entity->weight = new_weight;
/*
- * Add the entity to its weights tree only if it is
- * not associated with a weight-raised queue.
+ * Add the entity, if it is not a weight-raised queue,
+ * to the counter associated with its new weight.
*/
- if (prev_weight != new_weight &&
- (bfqq ? bfqq->wr_coeff == 1 : 1))
- /* If we get here, root has been initialized. */
- bfq_weights_tree_add(bfqd, entity, root);
+ if (prev_weight != new_weight) {
+ if (bfqq && bfqq->wr_coeff == 1) {
+ /* If we get here, root has been initialized. */
+ bfq_weights_tree_add(bfqd, bfqq, root);
+ } else
+ bfqd->num_active_groups++;
+ }
new_st->wsum += entity->weight;
@@ -1014,8 +1018,7 @@ static void __bfq_activate_entity(struct bfq_entity *entity,
container_of(entity, struct bfq_group, entity);
struct bfq_data *bfqd = bfqg->bfqd;
- bfq_weights_tree_add(bfqg->bfqd, entity,
- &bfqd->group_weights_tree);
+ bfqd->num_active_groups++;
}
#endif
@@ -1702,7 +1705,7 @@ void bfq_add_bfqq_busy(struct bfq_data *bfqd, struct bfq_queue *bfqq)
if (!bfqq->dispatched)
if (bfqq->wr_coeff == 1)
- bfq_weights_tree_add(bfqd, &bfqq->entity,
+ bfq_weights_tree_add(bfqd, bfqq,
&bfqd->queue_weights_tree);
if (bfqq->wr_coeff > 1)
--
2.25.1
1
4

[PATCH openEuler-1.0-LTS 01/10] arm64: mm: Restore mm_cpumask (revert commit 38d96287504a ("arm64: mm: kill mm_cpumask usage"))
by Yang Yingliang 17 Dec '21
by Yang Yingliang 17 Dec '21
17 Dec '21
From: Takao Indoh <indou.takao(a)fujitsu.com>
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4BLL0
CVE: NA
---------------------------
mm_cpumask was deleted by the commit 38d96287504a ("arm64: mm: kill
mm_cpumask usage") because it was not used at that time. Now this is needed
to find appropriate CPUs for TLB flush, so this patch reverts this commit.
Signed-off-by: QI Fuli <qi.fuli(a)fujitsu.com>
Signed-off-by: Takao Indoh <indou.takao(a)fujitsu.com>
Signed-off-by: Cheng Jian <cj.chengjian(a)huawei.com>
Reviewed-by: Xie XiuQi <xiexiuqi(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
arch/arm64/kernel/smp.c | 6 ++++++
arch/arm64/mm/context.c | 2 ++
2 files changed, 8 insertions(+)
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index fe562778de352..e86940d353a3e 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -387,6 +387,7 @@ asmlinkage notrace void secondary_start_kernel(void)
*/
mmgrab(mm);
current->active_mm = mm;
+ cpumask_set_cpu(cpu, mm_cpumask(mm));
/*
* TTBR0 is only used for the identity mapping at this stage. Make it
@@ -489,6 +490,11 @@ int __cpu_disable(void)
*/
irq_migrate_all_off_this_cpu();
+ /*
+ * Remove this CPU from the vm mask set of all processes.
+ */
+ clear_tasks_mm_cpumask(cpu);
+
return 0;
}
diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index 2b80ceff5d6c2..27d1f3fec1cc9 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -207,6 +207,7 @@ static u64 new_context(struct mm_struct *mm, unsigned int cpu)
set_asid:
__set_bit(asid, asid_map);
cur_idx = asid;
+ cpumask_clear(mm_cpumask(mm));
return idx2asid(asid) | generation;
}
@@ -254,6 +255,7 @@ void check_and_switch_context(struct mm_struct *mm, unsigned int cpu)
switch_mm_fastpath:
arm64_apply_bp_hardening();
+ cpumask_set_cpu(cpu, mm_cpumask(mm));
/*
* Defer TTBR0_EL1 setting for user threads to uaccess_enable() when
--
2.25.1
1
9

[PATCH openEuler-1.0-LTS] fget: check that the fd still exists after getting a ref to it
by Yang Yingliang 17 Dec '21
by Yang Yingliang 17 Dec '21
17 Dec '21
From: Linus Torvalds <torvalds(a)linux-foundation.org>
stable inclusion
from stable-4.19.220
commit 8bf31f9d9395b71af3ed33166a057cd3ec0c59da
category: bugfix
bugzilla: 185907
CVE: CVE-2021-4083
-------------------------------------------------
commit 054aa8d439b9185d4f5eb9a90282d1ce74772969 upstream.
Jann Horn points out that there is another possible race wrt Unix domain
socket garbage collection, somewhat reminiscent of the one fixed in
commit cbcf01128d0a ("af_unix: fix garbage collect vs MSG_PEEK").
See the extended comment about the garbage collection requirements added
to unix_peek_fds() by that commit for details.
The race comes from how we can locklessly look up a file descriptor just
as it is in the process of being closed, and with the right artificial
timing (Jann added a few strategic 'mdelay(500)' calls to do that), the
Unix domain socket garbage collector could see the reference count
decrement of the close() happen before fget() took its reference to the
file and the file was attached onto a new file descriptor.
This is all (intentionally) correct on the 'struct file *' side, with
RCU lookups and lockless reference counting very much part of the
design. Getting that reference count out of order isn't a problem per
se.
But the garbage collector can get confused by seeing this situation of
having seen a file not having any remaining external references and then
seeing it being attached to an fd.
In commit cbcf01128d0a ("af_unix: fix garbage collect vs MSG_PEEK") the
fix was to serialize the file descriptor install with the garbage
collector by taking and releasing the unix_gc_lock.
That's not really an option here, but since this all happens when we are
in the process of looking up a file descriptor, we can instead simply
just re-check that the file hasn't been closed in the meantime, and just
re-do the lookup if we raced with a concurrent close() of the same file
descriptor.
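For context, a condensed sketch of how the patched lookup loop reads,
based on the 4.19-era __fget() (mask handling kept minimal; this is an
illustration, not the exact final code):
	rcu_read_lock();
loop:
	file = fcheck_files(files, fd);			/* lockless lookup */
	if (file) {
		if (file->f_mode & mask)
			file = NULL;
		else if (!get_file_rcu_many(file, refs))
			goto loop;			/* lost a refcount race, retry */
		else if (__fcheck_files(files, fd) != file) {
			/* fd was closed between the lookup and our ref:
			 * drop the reference and retry */
			fput_many(file, refs);
			goto loop;
		}
	}
	rcu_read_unlock();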
Reported-and-tested-by: Jann Horn <jannh(a)google.com>
Acked-by: Miklos Szeredi <mszeredi(a)redhat.com>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Baokun Li <libaokun1(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
fs/file.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/fs/file.c b/fs/file.c
index 0caadedd3cc43..20a44d0ac400b 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -779,6 +779,10 @@ static struct file *__fget(unsigned int fd, fmode_t mask, unsigned int refs)
file = NULL;
else if (!get_file_rcu_many(file, refs))
goto loop;
+ else if (__fcheck_files(files, fd) != file) {
+ fput_many(file, refs);
+ goto loop;
+ }
}
rcu_read_unlock();
--
2.25.1
1
0
1
0

[PATCH openEuler-1.0-LTS 1/4] Revert "Revert "ext4: Allow parallel DIO reads""
by Yang Yingliang 17 Dec '21
by Yang Yingliang 17 Dec '21
17 Dec '21
From: Zhang Xiaoxu <zhangxiaoxu5(a)huawei.com>
hulk inclusion
category: feature
bugzilla: 185871, https://gitee.com/openeuler/kernel/issues/I4MIQ0
CVE: NA
---------------------------
This reverts commit bb24e7e6309ae1bbc80062366b8f9bb2e8fcc025.
In order to improve the dio reads performance, we enable the
parallel dio reads.
And make it configuratable
Conflict:
fs/ext4/Kconfig
fs/ext4/inode.c
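As a rough sketch of the configurable read path this series re-enables
(condensed from the diff below; error handling elided):
	/* CONFIG_EXT4_PARALLEL_DIO_READ: shared lock allows concurrent readers */
	inode_lock_shared(inode);
	ret = filemap_write_and_wait_range(mapping, iocb->ki_pos,
					   iocb->ki_pos + count - 1);
	if (!ret)
		ret = __blockdev_direct_IO(iocb, inode, inode->i_sb->s_bdev,
					   iter, ext4_dio_get_block, NULL, NULL,
					   0 /* no DIO_LOCKING: we hold the lock */);
	inode_unlock_shared(inode);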
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5(a)huawei.com>
Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
fs/ext4/Kconfig | 5 +++++
fs/ext4/inode.c | 34 ++++++++++++++++++++++++++++------
2 files changed, 33 insertions(+), 6 deletions(-)
diff --git a/fs/ext4/Kconfig b/fs/ext4/Kconfig
index a453cc87082b5..de836b75c6f2b 100644
--- a/fs/ext4/Kconfig
+++ b/fs/ext4/Kconfig
@@ -120,3 +120,8 @@ config EXT4_DEBUG
If you select Y here, then you will be able to turn on debugging
with a command such as:
echo 1 > /sys/module/ext4/parameters/mballoc_debug
+config EXT4_PARALLEL_DIO_READ
+ bool "EXT4 parallel dio reads"
+ depends on EXT4_FS
+ help
+ Enable parallel dio read.
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index fb4de707cd9ad..2f5231899d342 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3795,15 +3795,31 @@ static ssize_t ext4_direct_IO_write(struct kiocb *iocb, struct iov_iter *iter)
static ssize_t ext4_direct_IO_read(struct kiocb *iocb, struct iov_iter *iter)
{
+ struct address_space *mapping = iocb->ki_filp->f_mapping;
+ struct inode *inode = mapping->host;
+#ifdef CONFIG_EXT4_PARALLEL_DIO_READ
+ size_t count = iov_iter_count(iter);
+#else
int unlocked = 0;
- struct inode *inode = iocb->ki_filp->f_mapping->host;
+#endif
ssize_t ret;
loff_t offset = iocb->ki_pos;
loff_t size = i_size_read(inode);
if (offset >= size)
return 0;
-
+#ifdef CONFIG_EXT4_PARALLEL_DIO_READ
+ /*
+ * Shared inode_lock is enough for us - it protects against concurrent
+ * writes & truncates and since we take care of writing back page cache,
+ * we are protected against page writeback as well.
+ */
+ inode_lock_shared(inode);
+ ret = filemap_write_and_wait_range(mapping, iocb->ki_pos,
+ iocb->ki_pos + count - 1);
+ if (ret)
+ goto out_unlock;
+#else
if (ext4_should_dioread_nolock(inode)) {
/*
* Nolock dioread optimization may be dynamically disabled
@@ -3813,18 +3829,24 @@ static ssize_t ext4_direct_IO_read(struct kiocb *iocb, struct iov_iter *iter)
inode_dio_begin(inode);
smp_mb();
if (unlikely(ext4_test_inode_state(inode,
- EXT4_STATE_DIOREAD_LOCK)))
+ EXT4_STATE_DIOREAD_LOCK)))
inode_dio_end(inode);
else
unlocked = 1;
}
+#endif
+
ret = __blockdev_direct_IO(iocb, inode, inode->i_sb->s_bdev,
- iter, ext4_dio_get_block,
- NULL, NULL,
+ iter, ext4_dio_get_block, NULL, NULL,
+#ifdef CONFIG_EXT4_PARALLEL_DIO_READ
+ 0);
+out_unlock:
+ inode_unlock_shared(inode);
+#else
unlocked ? 0 : DIO_LOCKING);
-
if (unlocked)
inode_dio_end(inode);
+#endif
return ret;
}
--
2.25.1
1
3

[PATCH openEuler-1.0-LTS 1/4] net: hns3: fix VF RSS failed problem after PF enable multi-TCs
by Yang Yingliang 16 Dec '21
by Yang Yingliang 16 Dec '21
16 Dec '21
From: Guangbin Huang <huangguangbin2(a)huawei.com>
mainline inclusion
from mainline-v5.16-rc3
commit 8d2ad993aa05c0768f00c886c9d369cd97a337ac
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MWF6
CVE: NA
----------------------------
When the PF is set to multiple TCs and the mapping relationship between
priorities and TCs is configured, the hardware will activate these
settings for this PF and its VFs.
In this case, if the VF uses only one TC and its rx packets carry a
priority that is not mapped to TC0, the hardware always puts such
packets into queue 0 because the other TCs of the VF are not valid. As a
result, these VF packets cannot use the RSS function.
To fix this problem, set the tc mode of all unused TCs of the VF to the
setting of TC0, so that rx packets with a priority mapping to an unused
TC will be directed to TC0.
Fixes: e2cb1dec9779 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support")
Signed-off-by: Guangbin Huang <huangguangbin2(a)huawei.com>
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Yonglong Liu <liuyonglong(a)huawei.com>
Reviewed-by: li yongxin <liyongxin1(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
index 8e79818f149d2..ebe09a304aeda 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
@@ -686,9 +686,9 @@ static int hclgevf_set_rss_tc_mode(struct hclgevf_dev *hdev, u16 rss_size)
roundup_size = ilog2(roundup_size);
for (i = 0; i < HCLGEVF_MAX_TC_NUM; i++) {
- tc_valid[i] = !!(hdev->hw_tc_map & BIT(i));
+ tc_valid[i] = 1;
tc_size[i] = roundup_size;
- tc_offset[i] = rss_size * i;
+ tc_offset[i] = (hdev->hw_tc_map & BIT(i)) ? rss_size * i : 0;
}
hclgevf_cmd_setup_basic_desc(&desc, HCLGEVF_OPC_RSS_TC_MODE, false);
--
2.25.1
1
3

[PATCH openEuler-1.0-LTS 1/2] hugetlbfs: flush TLBs correctly after huge_pmd_unshare
by Yang Yingliang 16 Dec '21
by Yang Yingliang 16 Dec '21
16 Dec '21
From: Nadav Amit <namit(a)vmware.com>
stable inclusion
from linux-v4.19.219
commit b0313bc7f5fbb6beee327af39d818ffdc921821a
category: bugfix
bugzilla: 185854
CVE: CVE-2021-4002
-----------------------------------------------
commit a4a118f2eead1d6c49e00765de89878288d4b890 upstream.
When __unmap_hugepage_range() calls to huge_pmd_unshare() succeed, a TLB
flush is missing. This TLB flush must be performed before releasing the
i_mmap_rwsem, in order to prevent an unshared PMDs page from being
released and reused before the TLB flush took place.
Arguably, a comprehensive solution would use mmu_gather interface to
batch the TLB flushes and the PMDs page release, however it is not an
easy solution: (1) try_to_unmap_one() and try_to_migrate_one() also call
huge_pmd_unshare() and they cannot use the mmu_gather interface; and (2)
deferring the release of the page reference for the PMDs page until
after i_mmap_rwsem is dropped can confuse huge_pmd_unshare() into
thinking PMDs are shared when they are not.
Fix __unmap_hugepage_range() by adding the missing TLB flush, and
forcing a flush when unshare is successful.
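The core of the fix, condensed from the diff below (most of the unmap
loop is elided for brevity):
	bool force_flush = false;

	/* ... inside the per-address loop of __unmap_hugepage_range() ... */
	if (huge_pmd_unshare(mm, &address, ptep)) {
		spin_unlock(ptl);
		/* record the PUD-sized range uncovered by the unshare */
		tlb_flush_pmd_range(tlb, address & PUD_MASK, PUD_SIZE);
		force_flush = true;
		continue;
	}

	/* ... after the loop, before i_mmap_rwsem can be dropped ... */
	if (force_flush)
		tlb_flush_mmu_tlbonly(tlb);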
Fixes: 24669e58477e ("hugetlb: use mmu_gather instead of a temporary linked list for accumulating pages)" # 3.6
Signed-off-by: Nadav Amit <namit(a)vmware.com>
Reviewed-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Aneesh Kumar K.V <aneesh.kumar(a)linux.vnet.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu(a)jp.fujitsu.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Conflicts:
include/asm-generic/tlb.h
mm/mmu_gather.c
Signed-off-by: Liu Shixin <liushixin2(a)huawei.com>
Reviewed-by: tong tiangen <tongtiangen(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
arch/arm/include/asm/tlb.h | 8 ++++++++
arch/ia64/include/asm/tlb.h | 10 ++++++++++
arch/s390/include/asm/tlb.h | 16 ++++++++++++++++
arch/sh/include/asm/tlb.h | 10 ++++++++++
arch/um/include/asm/tlb.h | 12 ++++++++++++
include/asm-generic/tlb.h | 2 ++
mm/hugetlb.c | 23 +++++++++++++++++++----
mm/mmu_gather.c | 10 ++++++++++
8 files changed, 87 insertions(+), 4 deletions(-)
diff --git a/arch/arm/include/asm/tlb.h b/arch/arm/include/asm/tlb.h
index f854148c8d7c2..00baa13c158d7 100644
--- a/arch/arm/include/asm/tlb.h
+++ b/arch/arm/include/asm/tlb.h
@@ -280,6 +280,14 @@ tlb_remove_pmd_tlb_entry(struct mmu_gather *tlb, pmd_t *pmdp, unsigned long addr
tlb_add_flush(tlb, addr);
}
+static inline void
+tlb_flush_pmd_range(struct mmu_gather *tlb, unsigned long address,
+ unsigned long size)
+{
+ tlb_add_flush(tlb, address);
+ tlb_add_flush(tlb, address + size - PMD_SIZE);
+}
+
#define pte_free_tlb(tlb, ptep, addr) __pte_free_tlb(tlb, ptep, addr)
#define pmd_free_tlb(tlb, pmdp, addr) __pmd_free_tlb(tlb, pmdp, addr)
#define pud_free_tlb(tlb, pudp, addr) pud_free((tlb)->mm, pudp)
diff --git a/arch/ia64/include/asm/tlb.h b/arch/ia64/include/asm/tlb.h
index 516355a774bfe..5d032d97c254e 100644
--- a/arch/ia64/include/asm/tlb.h
+++ b/arch/ia64/include/asm/tlb.h
@@ -268,6 +268,16 @@ __tlb_remove_tlb_entry (struct mmu_gather *tlb, pte_t *ptep, unsigned long addre
tlb->end_addr = address + PAGE_SIZE;
}
+static inline void
+tlb_flush_pmd_range(struct mmu_gather *tlb, unsigned long address,
+ unsigned long size)
+{
+ if (tlb->start_addr > address)
+ tlb->start_addr = address;
+ if (tlb->end_addr < address + size)
+ tlb->end_addr = address + size;
+}
+
#define tlb_migrate_finish(mm) platform_tlb_migrate_finish(mm)
#define tlb_start_vma(tlb, vma) do { } while (0)
diff --git a/arch/s390/include/asm/tlb.h b/arch/s390/include/asm/tlb.h
index b31c779cf5817..1df28a8e2f19e 100644
--- a/arch/s390/include/asm/tlb.h
+++ b/arch/s390/include/asm/tlb.h
@@ -116,6 +116,20 @@ static inline void tlb_remove_page_size(struct mmu_gather *tlb,
return tlb_remove_page(tlb, page);
}
+static inline void tlb_flush_pmd_range(struct mmu_gather *tlb,
+ unsigned long address, unsigned long size)
+{
+ /*
+ * the range might exceed the original range that was provided to
+ * tlb_gather_mmu(), so we need to update it despite the fact it is
+ * usually not updated.
+ */
+ if (tlb->start > address)
+ tlb->start = address;
+ if (tlb->end < address + size)
+ tlb->end = address + size;
+}
+
/*
* pte_free_tlb frees a pte table and clears the CRSTE for the
* page table from the tlb.
@@ -177,6 +191,8 @@ static inline void pud_free_tlb(struct mmu_gather *tlb, pud_t *pud,
#define tlb_remove_tlb_entry(tlb, ptep, addr) do { } while (0)
#define tlb_remove_pmd_tlb_entry(tlb, pmdp, addr) do { } while (0)
#define tlb_migrate_finish(mm) do { } while (0)
+#define tlb_flush_pmd_range(tlb, addr, sz) do { } while (0)
+
#define tlb_remove_huge_tlb_entry(h, tlb, ptep, address) \
tlb_remove_tlb_entry(tlb, ptep, address)
diff --git a/arch/sh/include/asm/tlb.h b/arch/sh/include/asm/tlb.h
index 77abe192fb43d..adcb0bfe238e3 100644
--- a/arch/sh/include/asm/tlb.h
+++ b/arch/sh/include/asm/tlb.h
@@ -127,6 +127,16 @@ static inline void tlb_remove_page_size(struct mmu_gather *tlb,
return tlb_remove_page(tlb, page);
}
+static inline void
+tlb_flush_pmd_range(struct mmu_gather *tlb, unsigned long address,
+ unsigned long size)
+{
+ if (tlb->start > address)
+ tlb->start = address;
+ if (tlb->end < address + size)
+ tlb->end = address + size;
+}
+
#define tlb_remove_check_page_size_change tlb_remove_check_page_size_change
static inline void tlb_remove_check_page_size_change(struct mmu_gather *tlb,
unsigned int page_size)
diff --git a/arch/um/include/asm/tlb.h b/arch/um/include/asm/tlb.h
index dce6db147f245..02e61f6abfcab 100644
--- a/arch/um/include/asm/tlb.h
+++ b/arch/um/include/asm/tlb.h
@@ -130,6 +130,18 @@ static inline void tlb_remove_page_size(struct mmu_gather *tlb,
return tlb_remove_page(tlb, page);
}
+static inline void
+tlb_flush_pmd_range(struct mmu_gather *tlb, unsigned long address,
+ unsigned long size)
+{
+ tlb->need_flush = 1;
+
+ if (tlb->start > address)
+ tlb->start = address;
+ if (tlb->end < address + size)
+ tlb->end = address + size;
+}
+
/**
* tlb_remove_tlb_entry - remember a pte unmapping for later tlb invalidation.
*
diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index 6be86c1c5c583..cfc86e3ba4606 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -139,6 +139,8 @@ void tlb_flush_mmu(struct mmu_gather *tlb);
void arch_tlb_finish_mmu(struct mmu_gather *tlb,
unsigned long start, unsigned long end, bool force);
void tlb_flush_mmu_free(struct mmu_gather *tlb);
+void tlb_flush_pmd_range(struct mmu_gather *tlb, unsigned long address,
+ unsigned long size);
extern bool __tlb_remove_page_size(struct mmu_gather *tlb, struct page *page,
int page_size);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 2f65dad443ab3..b2ed9174bfd73 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3595,6 +3595,7 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
unsigned long sz = huge_page_size(h);
unsigned long mmun_start = start; /* For mmu_notifiers */
unsigned long mmun_end = end; /* For mmu_notifiers */
+ bool force_flush = false;
WARN_ON(!is_vm_hugetlb_page(vma));
BUG_ON(start & ~huge_page_mask(h));
@@ -3621,10 +3622,8 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
ptl = huge_pte_lock(h, mm, ptep);
if (huge_pmd_unshare(mm, &address, ptep)) {
spin_unlock(ptl);
- /*
- * We just unmapped a page of PMDs by clearing a PUD.
- * The caller's TLB flush range should cover this area.
- */
+ tlb_flush_pmd_range(tlb, address & PUD_MASK, PUD_SIZE);
+ force_flush = true;
continue;
}
@@ -3688,6 +3687,22 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
}
mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
tlb_end_vma(tlb, vma);
+
+ /*
+ * If we unshared PMDs, the TLB flush was not recorded in mmu_gather. We
+ * could defer the flush until now, since by holding i_mmap_rwsem we
+ * guaranteed that the last refernece would not be dropped. But we must
+ * do the flushing before we return, as otherwise i_mmap_rwsem will be
+ * dropped and the last reference to the shared PMDs page might be
+ * dropped as well.
+ *
+ * In theory we could defer the freeing of the PMD pages as well, but
+ * huge_pmd_unshare() relies on the exact page_count for the PMD page to
+ * detect sharing, so we cannot defer the release of the page either.
+ * Instead, do flush now.
+ */
+ if (force_flush)
+ tlb_flush_mmu_tlbonly(tlb);
}
void __unmap_hugepage_range_final(struct mmu_gather *tlb,
diff --git a/mm/mmu_gather.c b/mm/mmu_gather.c
index 2a9fbc4a37d59..c147a5aacfa96 100644
--- a/mm/mmu_gather.c
+++ b/mm/mmu_gather.c
@@ -139,6 +139,16 @@ bool __tlb_remove_page_size(struct mmu_gather *tlb, struct page *page, int page_
return false;
}
+void tlb_flush_pmd_range(struct mmu_gather *tlb, unsigned long address,
+ unsigned long size)
+{
+ if (tlb->page_size != 0 && tlb->page_size != PMD_SIZE)
+ tlb_flush_mmu(tlb);
+
+ tlb->page_size = PMD_SIZE;
+ tlb->start = min(tlb->start, address);
+ tlb->end = max(tlb->end, address + size);
+}
#endif /* HAVE_GENERIC_MMU_GATHER */
#ifdef CONFIG_HAVE_RCU_TABLE_FREE
--
2.25.1
1
1

[PATCH openEuler-1.0-LTS 1/2] mm: share_pool: adjust sp_make_share_k2u behavior when coredump
by Yang Yingliang 16 Dec '21
by Yang Yingliang 16 Dec '21
16 Dec '21
From: Guo Mengqi <guomengqi3(a)huawei.com>
ascend inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MUV2
CVE: NA
When k2u is being executed on the whole sharepool group and one
process coredumps, k2u will skip the coredumped process and continue
with the remaining processes in the group.
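A condensed sketch of the changed group loop (taken from the diff
below; cleanup on fatal errors is elided):
	list_for_each_entry(spg_node, &spg->procs, proc_node) {
		kc.state = K2U_NORMAL;
		ret_addr = sp_remap_kva_to_vma(kva, spa, spg_node->master->mm,
					       spg_node->prot, &kc);
		if (IS_ERR_VALUE(ret_addr)) {
			if (kc.state == K2U_COREDUMP)
				continue;	/* skip only the dumping process */
			goto out;		/* any other failure aborts k2u */
		}
		uva = (void *)ret_addr;
	}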
Signed-off-by: Guo Mengqi <guomengqi3(a)huawei.com>
Reviewed-by: Weilong Chen <chenweilong(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
mm/share_pool.c | 50 +++++++++++++++++++++++++++++++------------------
1 file changed, 32 insertions(+), 18 deletions(-)
diff --git a/mm/share_pool.c b/mm/share_pool.c
index c9079e6a40b1a..ac97a68a70264 100644
--- a/mm/share_pool.c
+++ b/mm/share_pool.c
@@ -678,8 +678,25 @@ static unsigned long sp_mmap(struct mm_struct *mm, struct file *file,
struct sp_area *spa, unsigned long *populate,
unsigned long prot);
static void sp_munmap(struct mm_struct *mm, unsigned long addr, unsigned long size);
+
+#define K2U_NORMAL 0
+#define K2U_COREDUMP 1
+
+struct sp_k2u_context {
+ unsigned long kva;
+ unsigned long kva_aligned;
+ unsigned long size;
+ unsigned long size_aligned;
+ unsigned long sp_flags;
+ int state;
+ int spg_id;
+ bool to_task;
+ struct timespec64 start;
+ struct timespec64 end;
+};
+
static unsigned long sp_remap_kva_to_vma(unsigned long kva, struct sp_area *spa,
- struct mm_struct *mm, unsigned long prot);
+ struct mm_struct *mm, unsigned long prot, struct sp_k2u_context *kc);
static void free_sp_group_id(int spg_id)
{
@@ -1334,7 +1351,7 @@ int mg_sp_group_add_task(int pid, unsigned long prot, int spg_id)
spin_unlock(&sp_area_lock);
if (spa->type == SPA_TYPE_K2SPG && spa->kva) {
- addr = sp_remap_kva_to_vma(spa->kva, spa, mm, prot);
+ addr = sp_remap_kva_to_vma(spa->kva, spa, mm, prot, NULL);
if (IS_ERR_VALUE(addr))
pr_warn("add group remap k2u failed %ld\n", addr);
@@ -2606,7 +2623,7 @@ static unsigned long __sp_remap_get_pfn(unsigned long kva)
/* when called by k2u to group, always make sure rw_lock of spg is down */
static unsigned long sp_remap_kva_to_vma(unsigned long kva, struct sp_area *spa,
- struct mm_struct *mm, unsigned long prot)
+ struct mm_struct *mm, unsigned long prot, struct sp_k2u_context *kc)
{
struct vm_area_struct *vma;
unsigned long ret_addr;
@@ -2618,6 +2635,8 @@ static unsigned long sp_remap_kva_to_vma(unsigned long kva, struct sp_area *spa,
if (unlikely(mm->core_state)) {
pr_err("k2u mmap: encountered coredump, abort\n");
ret_addr = -EBUSY;
+ if (kc)
+ kc->state = K2U_COREDUMP;
goto put_mm;
}
@@ -2703,7 +2722,7 @@ static void *sp_make_share_kva_to_task(unsigned long kva, unsigned long size, un
spa->kva = kva;
- uva = (void *)sp_remap_kva_to_vma(kva, spa, current->mm, prot);
+ uva = (void *)sp_remap_kva_to_vma(kva, spa, current->mm, prot, NULL);
__sp_area_drop(spa);
if (IS_ERR(uva))
pr_err("remap k2u to task failed %ld\n", PTR_ERR(uva));
@@ -2731,6 +2750,8 @@ static void *sp_make_share_kva_to_spg(unsigned long kva, unsigned long size,
struct mm_struct *mm;
struct sp_group_node *spg_node;
void *uva = ERR_PTR(-ENODEV);
+ struct sp_k2u_context kc;
+ unsigned long ret_addr = -ENODEV;
down_read(&spg->rw_lock);
spa = sp_alloc_area(size, sp_flags, spg, SPA_TYPE_K2SPG, current->tgid);
@@ -2745,12 +2766,17 @@ static void *sp_make_share_kva_to_spg(unsigned long kva, unsigned long size,
list_for_each_entry(spg_node, &spg->procs, proc_node) {
mm = spg_node->master->mm;
- uva = (void *)sp_remap_kva_to_vma(kva, spa, mm, spg_node->prot);
- if (IS_ERR(uva)) {
+ kc.state = K2U_NORMAL;
+ ret_addr = sp_remap_kva_to_vma(kva, spa, mm, spg_node->prot, &kc);
+ if (IS_ERR_VALUE(ret_addr)) {
+ if (kc.state == K2U_COREDUMP)
+ continue;
+ uva = (void *)ret_addr;
pr_err("remap k2u to spg failed %ld\n", PTR_ERR(uva));
__sp_free(spg, spa->va_start, spa_size(spa), mm);
goto out;
}
+ uva = (void *)ret_addr;
}
out:
@@ -2775,18 +2801,6 @@ static bool vmalloc_area_set_flag(unsigned long kva, unsigned long flags)
return false;
}
-struct sp_k2u_context {
- unsigned long kva;
- unsigned long kva_aligned;
- unsigned long size;
- unsigned long size_aligned;
- unsigned long sp_flags;
- int spg_id;
- bool to_task;
- struct timespec64 start;
- struct timespec64 end;
-};
-
static void trace_sp_k2u_begin(struct sp_k2u_context *kc)
{
if (!sysctl_sp_perf_k2u)
--
2.25.1
1
1
bugzilla: https://gitee.com/openeuler/kernel/issues/I469VQ
Tian Tao (2):
drm/hisilicon: Support i2c driver algorithms for bit-shift adapters
drm/hisilicon: Features to support reading resolutions from EDID
drivers/gpu/drm/hisilicon/hibmc/Makefile | 2 +-
.../gpu/drm/hisilicon/hibmc/hibmc_drm_drv.h | 25 ++++-
.../gpu/drm/hisilicon/hibmc/hibmc_drm_i2c.c | 99 +++++++++++++++++++
.../gpu/drm/hisilicon/hibmc/hibmc_drm_vdac.c | 38 ++++++-
4 files changed, 158 insertions(+), 6 deletions(-)
create mode 100644 drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_i2c.c
--
2.20.1
1
2
backport psi feature and avoid kabi change
bugzilla: https://gitee.com/openeuler/kernel/issues/I47QS2
Changes since v9:
1. There are some problems when allocating cgroups, from Cheng Jian.
First, we used `sizeof(struct cgroup_psi)` to allocate extra space,
which is inappropriate.
Second, the variable-length array isn't the last member of
cgroup_psi, so the offset of the psi member will not be a
compile-time-determined value. It's dangerous to use cgroup_psi()
(see the sketch after this list).
2. The free path should change too, from Xie Xiuqi.
psi: fix reference to kernel commandline enable
Dan Schatzberg (1):
kernel/sched/psi.c: expose pressure metrics on root cgroup
Johannes Weiner (12):
mm: workingset: tell cache transitions from workingset thrashing
sched: loadavg: consolidate LOAD_INT, LOAD_FRAC, CALC_LOAD
sched: loadavg: make calc_load_n() public
sched: sched.h: make rq locking and clock functions available in
stats.h
sched: introduce this_rq_lock_irq()
psi: pressure stall information for CPU, memory, and IO
psi: cgroup support
psi: make disabling/enabling easier for vendor kernels
psi: fix aggregation idle shut-off
psi: avoid divide-by-zero crash inside virtual machines
fs: kernfs: add poll file operation
sched/psi: Fix sampling error and rare div0 crashes with cgroups and
high uptime
Josef Bacik (1):
blk-iolatency: use a percentile approache for ssd's
Liu Xinpeng (2):
psi:enable psi in config
psi:avoid kabi change
Miklos Szeredi (1):
fuse: ignore PG_workingset after stealing
Olof Johansson (1):
kernel/sched/psi.c: simplify cgroup_move_task()
Suren Baghdasaryan (6):
psi: introduce state_mask to represent stalled psi states
psi: make psi_enable static
psi: rename psi fields in preparation for psi trigger addition
psi: split update_stats into parts
psi: track changed states
include/: refactor headers to allow kthread.h inclusion in psi_types.h
Yafang Shao (1):
mm, memcg: add workingset_restore in memory.stat
Documentation/accounting/psi.txt | 73 +++
Documentation/admin-guide/cgroup-v2.rst | 22 +
Documentation/admin-guide/kernel-parameters.txt | 4 +
arch/arm64/configs/openeuler_defconfig | 2 +
arch/powerpc/platforms/cell/cpufreq_spudemand.c | 2 +-
arch/powerpc/platforms/cell/spufs/sched.c | 9 +-
arch/s390/appldata/appldata_os.c | 4 -
arch/x86/configs/openeuler_defconfig | 2 +
block/blk-iolatency.c | 183 +++++-
drivers/cpuidle/governors/menu.c | 4 -
drivers/spi/spi-rockchip.c | 1 +
fs/fuse/dev.c | 1 +
fs/kernfs/file.c | 31 +-
fs/proc/loadavg.c | 3 -
include/linux/cgroup-defs.h | 12 +
include/linux/cgroup.h | 17 +
include/linux/kernfs.h | 8 +
include/linux/kthread.h | 4 +
include/linux/page-flags.h | 5 +
include/linux/psi.h | 55 ++
include/linux/psi_types.h | 95 +++
include/linux/sched.h | 13 +
include/linux/sched/loadavg.h | 24 +-
include/linux/swap.h | 1 +
include/trace/events/mmflags.h | 1 +
init/Kconfig | 28 +
kernel/cgroup/cgroup.c | 69 +-
kernel/debug/kdb/kdb_main.c | 7 +-
kernel/fork.c | 4 +
kernel/kthread.c | 3 +
kernel/sched/Makefile | 1 +
kernel/sched/core.c | 16 +-
kernel/sched/loadavg.c | 139 ++--
kernel/sched/psi.c | 823 ++++++++++++++++++++++++
kernel/sched/sched.h | 178 ++---
kernel/sched/stats.h | 86 +++
kernel/workqueue.c | 23 +
kernel/workqueue_internal.h | 6 +-
mm/compaction.c | 5 +
mm/filemap.c | 20 +-
mm/huge_memory.c | 1 +
mm/migrate.c | 2 +
mm/page_alloc.c | 9 +
mm/swap_state.c | 1 +
mm/vmscan.c | 10 +
mm/workingset.c | 113 +++-
46 files changed, 1841 insertions(+), 279 deletions(-)
create mode 100644 Documentation/accounting/psi.txt
create mode 100644 include/linux/psi.h
create mode 100644 include/linux/psi_types.h
create mode 100644 kernel/sched/psi.c
--
1.8.3.1
3
28

[PATCH openEuler-1.0-LTS 1/4] time: Normalize timespec64 before timespec64_compare()
by Yang Yingliang 15 Dec '21
by Yang Yingliang 15 Dec '21
15 Dec '21
From: Yu Liao <liaoyu15(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: 185831, https://gitee.com/openeuler/kernel/issues/I4MO2G
CVE: NA
-------------------------------------------------
Passing unnormalized timespec64 to timespec64_compare() may cause
incorrect results.
For example:
wall_to_monotonic = {tv_sec = -10, tv_nsec = 900000000}
ts_delta = {tv_sec = -9, tv_nsec = -900000000}
timespec64_compare() returns -1, but actually wall_to_monotonic > ts_delta.
This will cause wall_to_monotonic to become a positive number.
Use timespec64_sub() instead of direct subtraction to avoid this.
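A minimal illustration with arbitrary values (timespec64_sub()
normalizes via set_normalized_timespec64(), keeping tv_nsec in
[0, NSEC_PER_SEC)):
	struct timespec64 ts = { .tv_sec = 1,  .tv_nsec = 50000000 };
	struct timespec64 xt = { .tv_sec = 10, .tv_nsec = 950000000 };
	struct timespec64 bad, good;

	/* direct subtraction: { tv_sec = -9, tv_nsec = -900000000 }, unnormalized */
	bad.tv_sec  = ts.tv_sec  - xt.tv_sec;
	bad.tv_nsec = ts.tv_nsec - xt.tv_nsec;

	/* timespec64_sub(): { tv_sec = -10, tv_nsec = 100000000 }, normalized */
	good = timespec64_sub(ts, xt);

	/* only 'good' is safe to pass to timespec64_compare() */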
Signed-off-by: Yu Liao <liaoyu15(a)huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
include/linux/time64.h | 2 ++
kernel/time/timekeeping.c | 3 +--
2 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/include/linux/time64.h b/include/linux/time64.h
index b58142aa0b1dd..e58f59674ad6d 100644
--- a/include/linux/time64.h
+++ b/include/linux/time64.h
@@ -62,6 +62,8 @@ static inline int timespec64_equal(const struct timespec64 *a,
* lhs < rhs: return <0
* lhs == rhs: return 0
* lhs > rhs: return >0
+ *
+ * Note: Both lhs and rhs must be normalized.
*/
static inline int timespec64_compare(const struct timespec64 *lhs, const struct timespec64 *rhs)
{
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 64aa492f79422..17ba0a2f5f5d1 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1236,8 +1236,7 @@ int do_settimeofday64(const struct timespec64 *ts)
timekeeping_forward_now(tk);
xt = tk_xtime(tk);
- ts_delta.tv_sec = ts->tv_sec - xt.tv_sec;
- ts_delta.tv_nsec = ts->tv_nsec - xt.tv_nsec;
+ ts_delta = timespec64_sub(*ts, xt);
if (timespec64_compare(&tk->wall_to_monotonic, &ts_delta) > 0) {
ret = -EINVAL;
--
2.25.1
1
3

15 Dec '21
openEuler inclusion
category: bugfix
bugzilla: NA
CVE: NA
------------------------
In line 1767, sas_alloc_slow_task() allocates and initializes a
sas_task structure. When errors occur, line 1778 and line 1795 forget
to free this structure, which leads to a memory leak.
There is a similar snippet of code in the same file (in function
pm8001_send_read_log), allocating and initializing in line 1812 and
releasing the memory in line 1822 and line 1867.
We can fix it by calling sas_free_task() when res or ret is nonzero,
before the function returns.
Signed-off-by: Jianglei Nie <niejianglei2021(a)163.com>
---
drivers/scsi/pm8001/pm8001_hwi.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/pm8001/pm8001_hwi.c b/drivers/scsi/pm8001/pm8001_hwi.c
index 124cb69740c6..25045a91620e 100644
--- a/drivers/scsi/pm8001/pm8001_hwi.c
+++ b/drivers/scsi/pm8001/pm8001_hwi.c
@@ -1774,8 +1774,10 @@ static void pm8001_send_abort_all(struct pm8001_hba_info *pm8001_ha,
task->task_done = pm8001_task_done;
res = pm8001_tag_alloc(pm8001_ha, &ccb_tag);
- if (res)
+ if (res) {
+ sas_free_task(task);
return;
+ }
ccb = &pm8001_ha->ccb_info[ccb_tag];
ccb->device = pm8001_ha_dev;
@@ -1791,8 +1793,10 @@ static void pm8001_send_abort_all(struct pm8001_hba_info *pm8001_ha,
ret = pm8001_mpi_build_cmd(pm8001_ha, circularQ, opc, &task_abort,
sizeof(task_abort), 0);
- if (ret)
+ if (ret) {
+ sas_free_task(task);
pm8001_tag_free(pm8001_ha, ccb_tag);
+ }
}
--
2.25.1
1
0
Hello, I am testing the perf spe-c2c feature with the latest openEuler-1.0, but I found that there is no arm_spe_0 directory under /sys/devices.
When running perf spe-c2c record, it shows:
event syntax error: 'arm_spe_0/ts_enable=1,pct_enable=1,pa_enable=1,load_filter=1,jitter=1,store_filter=1,min_latency=0/'
\___ Cannot find PMU `arm_spe_0'. Missing kernel support?
想问下您这是为什么? 需要哪里开启spe功能吗?
3
2
14 Dec '21
From: zhangwensheng <zhangwensheng5(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: 185891
CVE: NA
--------------------------------
The last_events field in md_rdev was enlarged; hide the layout change behind __GENKSYMS__ to preserve the KABI.
Signed-off-by: zhangwensheng <zhangwensheng5(a)huawei.com>
Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/md/md.h | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/drivers/md/md.h b/drivers/md/md.h
index e745f701e1a82..422af63f1e1ee 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -47,8 +47,16 @@ struct md_rdev {
sector_t sectors; /* Device size (in 512bytes sectors) */
struct mddev *mddev; /* RAID array if running */
- long long last_events; /* IO event timestamp */
+ /*
+ * IO event timestamp, this pos has 64 bit space,
+ * can enlarge it solving its small size.
+ */
+#ifndef __GENKSYMS__
+ long long last_events;
+#else
+ int last_events;
+#endif
/*
* If meta_bdev is non-NULL, it means that a separate device is
* being used to store the metadata (superblock/bitmap) which
--
2.25.1
1
0
[PATCH openEuler-1.0-LTS] Revert "dm space maps: don't reset space map allocation cursor when committing"
by Yang Yingliang 14 Dec '21
by Yang Yingliang 14 Dec '21
14 Dec '21
From: Luo Meng <luomeng12(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: 185894, https://gitee.com/openeuler/kernel/issues/I4MITL
CVE: NA
-----------------------------------------------
This reverts commit 38cb5d45845b9d8962eb26660c14b7f05abc602f.
Signed-off-by: Luo Meng <luomeng12(a)huawei.com>
Reviewed-by: Jason Yan <yanaijie(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/md/persistent-data/dm-space-map-disk.c | 9 +--------
drivers/md/persistent-data/dm-space-map-metadata.c | 9 +--------
2 files changed, 2 insertions(+), 16 deletions(-)
diff --git a/drivers/md/persistent-data/dm-space-map-disk.c b/drivers/md/persistent-data/dm-space-map-disk.c
index e0acae7a3815d..bf4c5e2ccb6ff 100644
--- a/drivers/md/persistent-data/dm-space-map-disk.c
+++ b/drivers/md/persistent-data/dm-space-map-disk.c
@@ -171,14 +171,6 @@ static int sm_disk_new_block(struct dm_space_map *sm, dm_block_t *b)
* Any block we allocate has to be free in both the old and current ll.
*/
r = sm_ll_find_common_free_block(&smd->old_ll, &smd->ll, smd->begin, smd->ll.nr_blocks, b);
- if (r == -ENOSPC) {
- /*
- * There's no free block between smd->begin and the end of the metadata device.
- * We search before smd->begin in case something has been freed.
- */
- r = sm_ll_find_common_free_block(&smd->old_ll, &smd->ll, 0, smd->begin, b);
- }
-
if (r)
return r;
@@ -207,6 +199,7 @@ static int sm_disk_commit(struct dm_space_map *sm)
return r;
memcpy(&smd->old_ll, &smd->ll, sizeof(smd->old_ll));
+ smd->begin = 0;
smd->nr_allocated_this_transaction = 0;
r = sm_disk_get_nr_free(sm, &nr_free);
diff --git a/drivers/md/persistent-data/dm-space-map-metadata.c b/drivers/md/persistent-data/dm-space-map-metadata.c
index da439ac857963..9e3c64ec2026f 100644
--- a/drivers/md/persistent-data/dm-space-map-metadata.c
+++ b/drivers/md/persistent-data/dm-space-map-metadata.c
@@ -452,14 +452,6 @@ static int sm_metadata_new_block_(struct dm_space_map *sm, dm_block_t *b)
* Any block we allocate has to be free in both the old and current ll.
*/
r = sm_ll_find_common_free_block(&smm->old_ll, &smm->ll, smm->begin, smm->ll.nr_blocks, b);
- if (r == -ENOSPC) {
- /*
- * There's no free block between smm->begin and the end of the metadata device.
- * We search before smm->begin in case something has been freed.
- */
- r = sm_ll_find_common_free_block(&smm->old_ll, &smm->ll, 0, smm->begin, b);
- }
-
if (r)
return r;
@@ -511,6 +503,7 @@ static int sm_metadata_commit(struct dm_space_map *sm)
return r;
memcpy(&smm->old_ll, &smm->ll, sizeof(smm->old_ll));
+ smm->begin = 0;
smm->allocated_this_transaction = 0;
return 0;
--
2.25.1
1
0
[PATCH openEuler-1.0-LTS 1/4] nvme-fabrics: reject I/O to offline device
by Yang Yingliang 14 Dec '21
by Yang Yingliang 14 Dec '21
14 Dec '21
From: Victor Gladkov <victor.gladkov(a)kioxia.com>
mainline inclusion
from mainline-v5.11-rc1
commit 8c4dfea97f15b80097b3f882ca428fb2751ec30c
category: bugfix
bugzilla: NA
CVE: NA
Link: https://gitee.com/openeuler/kernel/issues/I4JFPM?from=project-issue
-------------------------------------------------
Commands get stuck while Host NVMe-oF controller is in reconnect state.
The controller enters into reconnect state when it loses connection with
the target. It tries to reconnect every 10 seconds (default) until
a successful reconnect or until the reconnect time-out is reached.
The default reconnect time out is 10 minutes.
Applications are expecting commands to complete with success or error
within a certain timeout (30 seconds by default). The NVMe host is
enforcing that timeout while it is connected, but during reconnect the
timeout is not enforced and commands may get stuck for a long period or
even forever.
To fix this long delay due to the default timeout, introduce new
"fast_io_fail_tmo" session parameter. The timeout is measured in seconds
from the controller reconnect and any command beyond that timeout is
rejected. The new parameter value may be passed during 'connect'.
The default value of -1 means no timeout (similar to current behavior).
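As an illustrative userspace sketch (the transport, address and NQN
below are placeholders, not values from this patch), the new token is
simply appended to the options string written to /dev/nvme-fabrics:

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	/* reject pending I/O 30 seconds after reconnect starts */
	const char *opts =
		"transport=rdma,traddr=192.168.1.10,trsvcid=4420,"
		"nqn=nqn.2021-12.io.example:subsys1,fast_io_fail_tmo=30";
	int fd = open("/dev/nvme-fabrics", O_RDWR);

	if (fd < 0) {
		perror("open");
		return 1;
	}
	/* the kernel parses this token list in nvmf_parse_options() */
	if (write(fd, opts, strlen(opts)) < 0)
		perror("write");
	close(fd);
	return 0;
}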
Signed-off-by: Victor Gladkov <victor.gladkov(a)kioxia.com>
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni(a)wdc.com>
Reviewed-by: Hannes Reinecke <hare(a)suse.de>
Reviewed-by: Sagi Grimberg <sagi(a)grimberg.me>
Reviewed-by: Chao Leng <lengchao(a)huawei.com>
Signed-off-by: Christoph Hellwig <hch(a)lst.de>
conflicts:
drivers/nvme/host/core.c
[adjust context]
Signed-off-by: jiangtao <jiangtao62(a)huawei.com>
Reviewed-by: chengjike <chengjike.cheng(a)huawei.com>
Reviewed-by: Ao Sun <sunao.sun(a)huawei.com>
Reviewed-by: Zhenwei Yang <yangzhenwei(a)huawei.com>
Reviewed-by: Hou Tao <houtao1(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/nvme/host/core.c | 51 +++++++++++++++++++++++++++++++++--
drivers/nvme/host/fabrics.c | 28 ++++++++++++++++---
drivers/nvme/host/fabrics.h | 5 ++++
drivers/nvme/host/multipath.c | 2 ++
drivers/nvme/host/nvme.h | 3 +++
5 files changed, 83 insertions(+), 6 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index b45ae13aa5fe6..c9ca375a5ddb9 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -130,6 +130,37 @@ static void nvme_queue_scan(struct nvme_ctrl *ctrl)
queue_work(nvme_wq, &ctrl->scan_work);
}
+static void nvme_failfast_work(struct work_struct *work)
+{
+ struct nvme_ctrl *ctrl = container_of(to_delayed_work(work),
+ struct nvme_ctrl, failfast_work);
+
+ if (ctrl->state != NVME_CTRL_CONNECTING)
+ return;
+
+ set_bit(NVME_CTRL_FAILFAST_EXPIRED, &ctrl->flags);
+ dev_info(ctrl->device, "failfast expired\n");
+ nvme_kick_requeue_lists(ctrl);
+}
+
+static inline void nvme_start_failfast_work(struct nvme_ctrl *ctrl)
+{
+ if (!ctrl->opts || ctrl->opts->fast_io_fail_tmo == -1)
+ return;
+
+ schedule_delayed_work(&ctrl->failfast_work,
+ ctrl->opts->fast_io_fail_tmo * HZ);
+}
+
+static inline void nvme_stop_failfast_work(struct nvme_ctrl *ctrl)
+{
+ if (!ctrl->opts)
+ return;
+
+ cancel_delayed_work_sync(&ctrl->failfast_work);
+ clear_bit(NVME_CTRL_FAILFAST_EXPIRED, &ctrl->flags);
+}
+
int nvme_reset_ctrl(struct nvme_ctrl *ctrl)
{
if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_RESETTING))
@@ -385,8 +416,21 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
ctrl->state = new_state;
spin_unlock_irqrestore(&ctrl->lock, flags);
- if (changed && ctrl->state == NVME_CTRL_LIVE)
- nvme_kick_requeue_lists(ctrl);
+ if (changed) {
+ switch (ctrl->state) {
+ case NVME_CTRL_LIVE:
+ if (old_state == NVME_CTRL_CONNECTING)
+ nvme_stop_failfast_work(ctrl);
+ nvme_kick_requeue_lists(ctrl);
+ break;
+ case NVME_CTRL_CONNECTING:
+ if (old_state == NVME_CTRL_RESETTING)
+ nvme_start_failfast_work(ctrl);
+ break;
+ default:
+ break;
+ }
+ }
return changed;
}
EXPORT_SYMBOL_GPL(nvme_change_ctrl_state);
@@ -3712,6 +3756,7 @@ void nvme_stop_ctrl(struct nvme_ctrl *ctrl)
{
nvme_mpath_stop(ctrl);
nvme_stop_keep_alive(ctrl);
+ nvme_stop_failfast_work(ctrl);
flush_work(&ctrl->async_event_work);
cancel_work_sync(&ctrl->fw_act_work);
if (ctrl->ops->stop_ctrl)
@@ -3776,6 +3821,7 @@ int nvme_init_ctrl(struct nvme_ctrl *ctrl, struct device *dev,
int ret;
ctrl->state = NVME_CTRL_NEW;
+ clear_bit(NVME_CTRL_FAILFAST_EXPIRED, &ctrl->flags);
spin_lock_init(&ctrl->lock);
mutex_init(&ctrl->scan_lock);
INIT_LIST_HEAD(&ctrl->namespaces);
@@ -3791,6 +3837,7 @@ int nvme_init_ctrl(struct nvme_ctrl *ctrl, struct device *dev,
INIT_DELAYED_WORK(&ctrl->ka_work, nvme_keep_alive_work);
memset(&ctrl->ka_cmd, 0, sizeof(ctrl->ka_cmd));
ctrl->ka_cmd.common.opcode = nvme_admin_keep_alive;
+ INIT_DELAYED_WORK(&ctrl->failfast_work, nvme_failfast_work);
BUILD_BUG_ON(NVME_DSM_MAX_RANGES * sizeof(struct nvme_dsm_range) >
PAGE_SIZE);
diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
index 738794af3f389..650b3bd89968f 100644
--- a/drivers/nvme/host/fabrics.c
+++ b/drivers/nvme/host/fabrics.c
@@ -550,6 +550,7 @@ blk_status_t nvmf_fail_nonready_command(struct nvme_ctrl *ctrl,
{
if (ctrl->state != NVME_CTRL_DELETING &&
ctrl->state != NVME_CTRL_DEAD &&
+ !test_bit(NVME_CTRL_FAILFAST_EXPIRED, &ctrl->flags) &&
!blk_noretry_request(rq) && !(rq->cmd_flags & REQ_NVME_MPATH))
return BLK_STS_RESOURCE;
@@ -607,6 +608,7 @@ static const match_table_t opt_tokens = {
{ NVMF_OPT_HOST_TRADDR, "host_traddr=%s" },
{ NVMF_OPT_HOST_ID, "hostid=%s" },
{ NVMF_OPT_DUP_CONNECT, "duplicate_connect" },
+ { NVMF_OPT_FAIL_FAST_TMO, "fast_io_fail_tmo=%d" },
{ NVMF_OPT_ERR, NULL }
};
@@ -626,6 +628,7 @@ static int nvmf_parse_options(struct nvmf_ctrl_options *opts,
opts->reconnect_delay = NVMF_DEF_RECONNECT_DELAY;
opts->kato = NVME_DEFAULT_KATO;
opts->duplicate_connect = false;
+ opts->fast_io_fail_tmo = NVMF_DEF_FAIL_FAST_TMO;
options = o = kstrdup(buf, GFP_KERNEL);
if (!options)
@@ -750,6 +753,17 @@ static int nvmf_parse_options(struct nvmf_ctrl_options *opts,
pr_warn("ctrl_loss_tmo < 0 will reconnect forever\n");
ctrl_loss_tmo = token;
break;
+ case NVMF_OPT_FAIL_FAST_TMO:
+ if (match_int(args, &token)) {
+ ret = -EINVAL;
+ goto out;
+ }
+
+ if (token >= 0)
+ pr_warn("I/O will fail on after %d sec reconnect\n",
+ token);
+ opts->fast_io_fail_tmo = token;
+ break;
case NVMF_OPT_HOSTNQN:
if (opts->host) {
pr_err("hostnqn already user-assigned: %s\n",
@@ -830,11 +844,17 @@ static int nvmf_parse_options(struct nvmf_ctrl_options *opts,
opts->nr_io_queues = 0;
opts->duplicate_connect = true;
}
- if (ctrl_loss_tmo < 0)
+
+ if (ctrl_loss_tmo < 0) {
opts->max_reconnects = -1;
- else
+ } else {
opts->max_reconnects = DIV_ROUND_UP(ctrl_loss_tmo,
opts->reconnect_delay);
+ if (ctrl_loss_tmo < opts->fast_io_fail_tmo)
+ pr_warn("failfast tmo (%d) > ctrl_loss_tmo (%d)\n",
+ opts->fast_io_fail_tmo,
+ ctrl_loss_tmo);
+ }
if (!opts->host) {
kref_get(&nvmf_default_host->ref);
@@ -903,8 +923,8 @@ EXPORT_SYMBOL_GPL(nvmf_free_options);
#define NVMF_REQUIRED_OPTS (NVMF_OPT_TRANSPORT | NVMF_OPT_NQN)
#define NVMF_ALLOWED_OPTS (NVMF_OPT_QUEUE_SIZE | NVMF_OPT_NR_IO_QUEUES | \
NVMF_OPT_KATO | NVMF_OPT_HOSTNQN | \
- NVMF_OPT_HOST_ID | NVMF_OPT_DUP_CONNECT)
-
+ NVMF_OPT_HOST_ID | NVMF_OPT_DUP_CONNECT |\
+ NVMF_OPT_FAIL_FAST_TMO)
static struct nvme_ctrl *
nvmf_create_ctrl(struct device *dev, const char *buf, size_t count)
{
diff --git a/drivers/nvme/host/fabrics.h b/drivers/nvme/host/fabrics.h
index 188ebbeec32ce..645ed75f22980 100644
--- a/drivers/nvme/host/fabrics.h
+++ b/drivers/nvme/host/fabrics.h
@@ -24,6 +24,8 @@
/* default to 600 seconds of reconnect attempts before giving up */
#define NVMF_DEF_CTRL_LOSS_TMO 600
#define NVMF_DEF_RECONNECT_FOREVER -1
+/* default is -1: the fail fast mechanism is disabled */
+#define NVMF_DEF_FAIL_FAST_TMO -1
/*
* Define a host as seen by the target. We allocate one at boot, but also
@@ -59,6 +61,7 @@ enum {
NVMF_OPT_CTRL_LOSS_TMO = 1 << 11,
NVMF_OPT_HOST_ID = 1 << 12,
NVMF_OPT_DUP_CONNECT = 1 << 13,
+ NVMF_OPT_FAIL_FAST_TMO = 1 << 20,
};
/**
@@ -86,6 +89,7 @@ enum {
* @max_reconnects: maximum number of allowed reconnect attempts before removing
* the controller, (-1) means reconnect forever, zero means remove
* immediately;
+ * @fast_io_fail_tmo: Fast I/O fail timeout in seconds
*/
struct nvmf_ctrl_options {
unsigned mask;
@@ -102,6 +106,7 @@ struct nvmf_ctrl_options {
unsigned int kato;
struct nvmf_host *host;
int max_reconnects;
+ int fast_io_fail_tmo;
};
/*
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index f4616df1f7f15..9201946568630 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -198,6 +198,8 @@ static bool nvme_available_path(struct nvme_ns_head *head)
struct nvme_ns *ns;
list_for_each_entry_rcu(ns, &head->list, siblings) {
+ if (test_bit(NVME_CTRL_FAILFAST_EXPIRED, &ns->ctrl->flags))
+ continue;
switch (ns->ctrl->state) {
case NVME_CTRL_LIVE:
case NVME_CTRL_RESETTING:
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 85f814a91ce9c..cd66364eb5429 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -211,6 +211,7 @@ struct nvme_ctrl {
struct work_struct scan_work;
struct work_struct async_event_work;
struct delayed_work ka_work;
+ struct delayed_work failfast_work;
struct nvme_command ka_cmd;
struct work_struct fw_act_work;
unsigned long events;
@@ -245,6 +246,8 @@ struct nvme_ctrl {
u16 icdoff;
u16 maxcmd;
int nr_reconnects;
+ unsigned long flags;
+#define NVME_CTRL_FAILFAST_EXPIRED 0
struct nvmf_ctrl_options *opts;
struct page *discard_page;
--
2.25.1
1
3
[PATCH openEuler-1.0-LTS 01/12] nvme: add proper discard setup for the multipath device
by Yang Yingliang 14 Dec '21
by Yang Yingliang 14 Dec '21
14 Dec '21
From: Christoph Hellwig <hch(a)lst.de>
mainline inclusion
from mainline-v5.1-rc1
commit 2631857160ecbea04e54423f5053133fe2b6ea45
category: bugfix
bugzilla: NA
CVE: NA
Link: https://gitee.com/openeuler/kernel/issues/I4JFBE?from=project-issue
-------------------------------------------------
Add a gendisk argument to nvme_config_discard so that the call to
nvme_update_disk_info for the multipath device node updates the
proper request_queue.
Signed-off-by: Christoph Hellwig <hch(a)lst.de>
Reported-by: Sagi Grimberg <sagi(a)grimberg.me>
Reviewed-by: Keith Busch <keith.busch(a)intel.com>
Reviewed-by: Max Gurtovoy <maxg(a)mellanox.com>
Tested-by: Sagi Grimberg <sagi(a)grimberg.me>
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
conflicts:
drivers/nvme/host/core.c
[adjust context]
Signed-off-by: chengjike <chengjike.cheng(a)huawei.com>
Reviewed-by: Ao Sun <sunao.sun(a)huawei.com>
Reviewed-by: Zhenwei Yang <yangzhenwei(a)huawei.com>
Reviewed-by: Hou Tao <houtao1(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/nvme/host/core.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 522ab9c0133ca..7c4a97e2cd93a 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1558,10 +1558,10 @@ static void nvme_set_chunk_size(struct nvme_ns *ns)
blk_queue_chunk_sectors(ns->queue, rounddown_pow_of_two(chunk_size));
}
-static void nvme_config_discard(struct nvme_ns *ns)
+static void nvme_config_discard(struct gendisk *disk, struct nvme_ns *ns)
{
struct nvme_ctrl *ctrl = ns->ctrl;
- struct request_queue *queue = ns->queue;
+ struct request_queue *queue = disk->queue;
u32 size = queue_logical_block_size(queue);
if (!(ctrl->oncs & NVME_CTRL_ONCS_DSM)) {
@@ -1647,7 +1647,7 @@ static void nvme_update_disk_info(struct gendisk *disk,
capacity = 0;
set_capacity(disk, capacity);
- nvme_config_discard(ns);
+ nvme_config_discard(disk, ns);
if (id->nsattr & (1 << 0))
set_disk_ro(disk, true);
--
2.25.1
1
11
[PATCH openEuler-1.0-LTS 1/2] md: Fix undefined behaviour in is_mddev_idle
by Yang Yingliang 14 Dec '21
by Yang Yingliang 14 Dec '21
14 Dec '21
From: zhangwensheng <zhangwensheng5(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: 185891
CVE: NA
--------------------------------
UBSAN reports this problem:
[ 5984.281385] UBSAN: Undefined behaviour in drivers/md/md.c:8175:15
[ 5984.281390] signed integer overflow:
[ 5984.281393] -2147483291 - 2072033152 cannot be represented in type 'int'
[ 5984.281400] CPU: 25 PID: 1854 Comm: md101_resync Kdump: loaded Not tainted 4.19.90
[ 5984.281404] Hardware name: Huawei TaiShan 200 (Model 5280)/BC82AMDDA
[ 5984.281406] Call trace:
[ 5984.281415] dump_backtrace+0x0/0x310
[ 5984.281418] show_stack+0x28/0x38
[ 5984.281425] dump_stack+0xec/0x15c
[ 5984.281430] ubsan_epilogue+0x18/0x84
[ 5984.281434] handle_overflow+0x14c/0x19c
[ 5984.281439] __ubsan_handle_sub_overflow+0x34/0x44
[ 5984.281445] is_mddev_idle+0x338/0x3d8
[ 5984.281449] md_do_sync+0x1bb8/0x1cf8
[ 5984.281452] md_thread+0x220/0x288
[ 5984.281457] kthread+0x1d8/0x1e0
[ 5984.281461] ret_from_fork+0x10/0x18
When the stat accum of the disk is greater than INT_MAX, its value
becomes negative after casting to 'int', which may lead to overflow
after subtracting a positive number. In the same way, when the value
of sync_io is greater than INT_MAX, overflow may also occur. These
situations will lead to undefined behavior.
Moreover, if the stat accum of the disk is close to INT_MAX when
creating raid arrays, the initial value of last_events would be set
close to INT_MAX when mddev initializes IO event counters.
'curr_events - rdev->last_events > 64' will then always be false during
synchronization. If all the disks of mddev are in this case,
is_mddev_idle() will always return 1, which may cause non-sync IO to
become very slow.
To address these problems, use a 64-bit signed integer type for
sync_io, last_events, and curr_events.
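A standalone sketch of the arithmetic with the numbers from the UBSAN
report (a demonstration only, not driver code):

#include <stdio.h>

int main(void)
{
	/* (int)part_stat_read_accum(...) after truncation, per the report */
	int last_events = -2147483291;
	int sync_io = 2072033152;

	/* the old code did the subtraction in 32 bits; the true result
	 * -4219516443 does not fit in int, which is the signed overflow
	 * UBSAN flagged:
	 *	int curr = last_events - sync_io;
	 */

	/* the fix: do the subtraction in a 64-bit signed type */
	long long curr = (long long)last_events - (long long)sync_io;

	printf("%lld\n", curr);
	return 0;
}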
Signed-off-by: zhangwensheng <zhangwensheng5(a)huawei.com>
Reviewed-by: Hou Tao <houtao1(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/md/md.c | 7 ++++---
drivers/md/md.h | 6 +++---
include/linux/genhd.h | 2 +-
3 files changed, 8 insertions(+), 7 deletions(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 409ec5ffd28d3..8a68ff71fbbf8 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -8161,14 +8161,15 @@ static int is_mddev_idle(struct mddev *mddev, int init)
{
struct md_rdev *rdev;
int idle;
- int curr_events;
+ long long curr_events;
idle = 1;
rcu_read_lock();
rdev_for_each_rcu(rdev, mddev) {
struct gendisk *disk = rdev->bdev->bd_contains->bd_disk;
- curr_events = (int)part_stat_read_accum(&disk->part0, sectors) -
- atomic_read(&disk->sync_io);
+ curr_events =
+ (long long)part_stat_read_accum(&disk->part0, sectors) -
+ atomic64_read(&disk->sync_io_sectors);
/* sync IO will cause sync_io to increase before the disk_stats
* as sync_io is counted when a request starts, and
* disk_stats is counted when it completes.
diff --git a/drivers/md/md.h b/drivers/md/md.h
index ee3c9768ff9e7..e745f701e1a82 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -47,7 +47,7 @@ struct md_rdev {
sector_t sectors; /* Device size (in 512bytes sectors) */
struct mddev *mddev; /* RAID array if running */
- int last_events; /* IO event timestamp */
+ long long last_events; /* IO event timestamp */
/*
* If meta_bdev is non-NULL, it means that a separate device is
@@ -528,12 +528,12 @@ extern void mddev_unlock(struct mddev *mddev);
static inline void md_sync_acct(struct block_device *bdev, unsigned long nr_sectors)
{
- atomic_add(nr_sectors, &bdev->bd_contains->bd_disk->sync_io);
+ atomic64_add(nr_sectors, &bdev->bd_contains->bd_disk->sync_io_sectors);
}
static inline void md_sync_acct_bio(struct bio *bio, unsigned long nr_sectors)
{
- atomic_add(nr_sectors, &bio->bi_disk->sync_io);
+ atomic64_add(nr_sectors, &bio->bi_disk->sync_io_sectors);
}
struct md_personality
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 9c398294b6279..38737c853b28d 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -212,7 +212,6 @@ struct gendisk {
struct kobject *slave_dir;
struct timer_rand_state *random;
- atomic_t sync_io; /* RAID */
struct disk_events *ev;
#ifdef CONFIG_BLK_DEV_INTEGRITY
struct kobject integrity_kobj;
@@ -220,6 +219,7 @@ struct gendisk {
int node_id;
struct badblocks *bb;
struct lockdep_map lockdep_map;
+ atomic64_t sync_io_sectors; /* RAID */
#ifndef __GENKSYMS__
unsigned long *user_ro_bitmap;
--
2.25.1
1
1
[PATCH openEuler-1.0-LTS 1/4] xfs: ensure that the inode uid/gid match values match the icdinode ones
by Yang Yingliang 14 Dec '21
by Yang Yingliang 14 Dec '21
14 Dec '21
From: Christoph Hellwig <hch(a)lst.de>
mainline inclusion
from mainline-v5.6-rc4
commit 3d8f2821502d0b60bac2789d0bea951fda61de0c
category: bugfix
bugzilla: 185881
CVE: CVE-2021-4037
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
-------------------------------------------------
Instead of only synchronizing the uid/gid values in xfs_setup_inode,
ensure that they always match to prepare for removing the icdinode
fields.
Signed-off-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong(a)oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong(a)oracle.com>
Signed-off-by: Guo Xuenan <guoxuenan(a)huawei.com>
Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
fs/xfs/libxfs/xfs_inode_buf.c | 2 ++
fs/xfs/xfs_icache.c | 4 ++++
fs/xfs/xfs_inode.c | 8 ++++++--
fs/xfs/xfs_iops.c | 3 ---
4 files changed, 12 insertions(+), 5 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c
index 82c4374acb4c6..ed58741e09ea8 100644
--- a/fs/xfs/libxfs/xfs_inode_buf.c
+++ b/fs/xfs/libxfs/xfs_inode_buf.c
@@ -225,7 +225,9 @@ xfs_inode_from_disk(
to->di_format = from->di_format;
to->di_uid = be32_to_cpu(from->di_uid);
+ inode->i_uid = xfs_uid_to_kuid(to->di_uid);
to->di_gid = be32_to_cpu(from->di_gid);
+ inode->i_gid = xfs_gid_to_kgid(to->di_gid);
to->di_flushiter = be16_to_cpu(from->di_flushiter);
/*
diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index 56e9043bddc71..ceee27b703842 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -286,6 +286,8 @@ xfs_reinit_inode(
uint64_t version = inode_peek_iversion(inode);
umode_t mode = inode->i_mode;
dev_t dev = inode->i_rdev;
+ kuid_t uid = inode->i_uid;
+ kgid_t gid = inode->i_gid;
error = inode_init_always(mp->m_super, inode);
@@ -294,6 +296,8 @@ xfs_reinit_inode(
inode_set_iversion_queried(inode, version);
inode->i_mode = mode;
inode->i_rdev = dev;
+ inode->i_uid = uid;
+ inode->i_gid = gid;
return error;
}
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index c3226c48079da..70d5f0e2aa129 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -812,15 +812,19 @@ xfs_ialloc(
inode->i_mode = mode;
set_nlink(inode, nlink);
- ip->i_d.di_uid = xfs_kuid_to_uid(current_fsuid());
- ip->i_d.di_gid = xfs_kgid_to_gid(current_fsgid());
+ inode->i_uid = current_fsuid();
+ ip->i_d.di_uid = xfs_kuid_to_uid(inode->i_uid);
inode->i_rdev = rdev;
xfs_set_projid(ip, prid);
if (pip && XFS_INHERIT_GID(pip)) {
+ inode->i_gid = VFS_I(pip)->i_gid;
ip->i_d.di_gid = pip->i_d.di_gid;
if ((VFS_I(pip)->i_mode & S_ISGID) && S_ISDIR(mode))
inode->i_mode |= S_ISGID;
+ } else {
+ inode->i_gid = current_fsgid();
+ ip->i_d.di_gid = xfs_kgid_to_gid(inode->i_gid);
}
/*
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 0ac63cafb32a1..2c9d2d2f92983 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -1305,9 +1305,6 @@ xfs_setup_inode(
/* make the inode look hashed for the writeback code */
inode_fake_hash(inode);
- inode->i_uid = xfs_uid_to_kuid(ip->i_d.di_uid);
- inode->i_gid = xfs_gid_to_kgid(ip->i_d.di_gid);
-
i_size_write(inode, ip->i_d.di_size);
xfs_diflags_to_iflags(inode, ip);
--
2.25.1
1
3
[PATCH openEuler-1.0-LTS] configfs: fix a use-after-free in __configfs_open_file
by Yang Yingliang 14 Dec '21
by Yang Yingliang 14 Dec '21
14 Dec '21
From: Daiyue Zhang <zhangdaiyue1(a)huawei.com>
mainline inclusion
from mainline-v5.12-rc3
commit 14fbbc8297728e880070f7b077b3301a8c698ef9
category: bugfix
bugzilla: NA
CVE: CVE-2021-39656
-----------------------------------------------
Commit b0841eefd969 ("configfs: provide exclusion between IO and removals")
uses ->frag_dead to mark the fragment state, thus not bothering with an
extra refcount on config_item when opening a file. The
configfs_get_config_item() call was removed from __configfs_open_file(),
but the matching config_item_put() was not. So the refcount on
config_item loses its balance, causing use-after-free issues in
situations like this:
Test:
1. Mount configfs on /config with read-only items:
drwxrwx--- 289 root root 0 2021-04-01 11:55 /config
drwxr-xr-x 2 root root 0 2021-04-01 11:54 /config/a
--w--w--w- 1 root root 4096 2021-04-01 11:53 /config/a/1.txt
......
2. Then run:
for file in /config
do
echo $file
grep -R 'key' $file
done
3. __configfs_open_file will be called in parallel, the first one
got called will do:
if (file->f_mode & FMODE_READ) {
if (!(inode->i_mode & S_IRUGO))
goto out_put_module;
config_item_put(buffer->item);
kref_put()
package_details_release()
kfree()
the other one will run into use-after-free issues like this:
BUG: KASAN: use-after-free in __configfs_open_file+0x1bc/0x3b0
Read of size 8 at addr fffffff155f02480 by task grep/13096
CPU: 0 PID: 13096 Comm: grep VIP: 00 Tainted: G W 4.14.116-kasan #1
TGID: 13096 Comm: grep
Call trace:
dump_stack+0x118/0x160
kasan_report+0x22c/0x294
__asan_load8+0x80/0x88
__configfs_open_file+0x1bc/0x3b0
configfs_open_file+0x28/0x34
do_dentry_open+0x2cc/0x5c0
vfs_open+0x80/0xe0
path_openat+0xd8c/0x2988
do_filp_open+0x1c4/0x2fc
do_sys_open+0x23c/0x404
SyS_openat+0x38/0x48
Allocated by task 2138:
kasan_kmalloc+0xe0/0x1ac
kmem_cache_alloc_trace+0x334/0x394
packages_make_item+0x4c/0x180
configfs_mkdir+0x358/0x740
vfs_mkdir2+0x1bc/0x2e8
SyS_mkdirat+0x154/0x23c
el0_svc_naked+0x34/0x38
Freed by task 13096:
kasan_slab_free+0xb8/0x194
kfree+0x13c/0x910
package_details_release+0x524/0x56c
kref_put+0xc4/0x104
config_item_put+0x24/0x34
__configfs_open_file+0x35c/0x3b0
configfs_open_file+0x28/0x34
do_dentry_open+0x2cc/0x5c0
vfs_open+0x80/0xe0
path_openat+0xd8c/0x2988
do_filp_open+0x1c4/0x2fc
do_sys_open+0x23c/0x404
SyS_openat+0x38/0x48
el0_svc_naked+0x34/0x38
To fix this issue, remove the config_item_put in
__configfs_open_file to balance the refcount of config_item.
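A compact userspace model of the imbalance (the names and the kref
logic are simplified stand-ins for the configfs structures):

#include <stdio.h>
#include <stdlib.h>

struct item { int refcount; };

static void item_put(struct item *it)
{
	/* models config_item_put() -> kref_put() -> ..._release() */
	if (--it->refcount == 0) {
		printf("released\n"); /* kfree() in package_details_release() */
		free(it);
	}
}

int main(void)
{
	struct item *it = malloc(sizeof(*it));

	it->refcount = 1; /* the reference taken when mkdir created the item */

	/* after b0841eefd969 the open path takes no reference of its own,
	 * so an unconditional put in the error path drops the creator's
	 * only reference... */
	item_put(it);

	/* ...and a concurrent __configfs_open_file() then touches freed
	 * memory -- the use-after-free KASAN reported */
	return 0;
}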
Fixes: b0841eefd969 ("configfs: provide exclusion between IO and removals")
Signed-off-by: Daiyue Zhang <zhangdaiyue1(a)huawei.com>
Signed-off-by: Yi Chen <chenyi77(a)huawei.com>
Signed-off-by: Ge Qiu <qiuge(a)huawei.com>
Reviewed-by: Chao Yu <yuchao0(a)huawei.com>
Acked-by: Al Viro <viro(a)zeniv.linux.org.uk>
Signed-off-by: Christoph Hellwig <hch(a)lst.de>
Signed-off-by: Ye Bin <yebin10(a)huawei.com>
Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com>
Reviewed-by: weiyang wang <wangweiyang2(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
fs/configfs/file.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/fs/configfs/file.c b/fs/configfs/file.c
index bb0a427517e92..50b7c4c4310e0 100644
--- a/fs/configfs/file.c
+++ b/fs/configfs/file.c
@@ -392,7 +392,7 @@ static int __configfs_open_file(struct inode *inode, struct file *file, int type
attr = to_attr(dentry);
if (!attr)
- goto out_put_item;
+ goto out_free_buffer;
if (type & CONFIGFS_ITEM_BIN_ATTR) {
buffer->bin_attr = to_bin_attr(dentry);
@@ -405,7 +405,7 @@ static int __configfs_open_file(struct inode *inode, struct file *file, int type
/* Grab the module reference for this attribute if we have one */
error = -ENODEV;
if (!try_module_get(buffer->owner))
- goto out_put_item;
+ goto out_free_buffer;
error = -EACCES;
if (!buffer->item->ci_type)
@@ -449,8 +449,6 @@ static int __configfs_open_file(struct inode *inode, struct file *file, int type
out_put_module:
module_put(buffer->owner);
-out_put_item:
- config_item_put(buffer->item);
out_free_buffer:
up_read(&frag->frag_sem);
kfree(buffer);
--
2.25.1
1
0
Variables allocated by kvmalloc() should not be freed with kfree(),
because they may have been allocated by vmalloc().
Replace kfree() with kvfree() here.
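A minimal sketch of the pairing rule (these are the real kernel APIs;
the wrapper function itself is illustrative):

#include <linux/errno.h>
#include <linux/mm.h>
#include <linux/slab.h>

static int alloc_example(size_t size)
{
	/* kvmalloc() tries kmalloc() first but may fall back to
	 * vmalloc() for large or fragmented allocations... */
	void *bitmap_buffer = kvmalloc(size, GFP_KERNEL);

	if (!bitmap_buffer)
		return -ENOMEM;

	/* ...so the buffer must always be released with kvfree();
	 * kfree() on a vmalloc()ed pointer is invalid */
	kvfree(bitmap_buffer);
	return 0;
}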
Fixes: 738fe155f58b ("vfio/iommu_type1: Add support for manual dirty log clear")
Signed-off-by: Jiacheng Shi <billsjc(a)sjtu.edu.cn>
---
drivers/vfio/vfio_iommu_type1.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 5daceec48811..6811a85109aa 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -1150,7 +1150,7 @@ static int vfio_iova_dirty_log_clear(u64 __user *bitmap,
}
out:
- kfree(bitmap_buffer);
+ kvfree(bitmap_buffer);
return ret;
}
--
2.17.1
3
4
[PATCH openEuler-1.0-LTS 1/2] share_pool: Remove the redundant warning message
by Yang Yingliang 13 Dec '21
by Yang Yingliang 13 Dec '21
13 Dec '21
From: Ding Tianhong <dingtianhong(a)huawei.com>
ascend inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4M253
CVE: NA
-------------------------------------------------
There is no need to log a redundant message on allocation failure;
the kernel already reports it on OOM.
Signed-off-by: Ding Tianhong <dingtianhong(a)huawei.com>
Reviewed-by: Weilong Chen <chenweilong(a)huawei.com>
Reviewed-by: Hanjun Guo <guohanjun(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
mm/share_pool.c | 20 +++++---------------
1 file changed, 5 insertions(+), 15 deletions(-)
diff --git a/mm/share_pool.c b/mm/share_pool.c
index 2cc15f894cf84..ea61f69cdd6ad 100644
--- a/mm/share_pool.c
+++ b/mm/share_pool.c
@@ -137,10 +137,8 @@ static struct sp_group_master *sp_init_group_master_locked(
}
master = kmalloc(sizeof(struct sp_group_master), GFP_KERNEL);
- if (master == NULL) {
- pr_err_ratelimited("no memory for spg master\n");
+ if (master == NULL)
return ERR_PTR(-ENOMEM);
- }
INIT_LIST_HEAD(&master->node_list);
master->count = 0;
@@ -192,10 +190,8 @@ static struct sp_proc_stat *create_proc_stat(struct mm_struct *mm,
struct sp_proc_stat *stat;
stat = kmalloc(sizeof(*stat), GFP_KERNEL);
- if (stat == NULL) {
- pr_err_ratelimited("no memory for proc stat\n");
+ if (stat == NULL)
return ERR_PTR(-ENOMEM);
- }
atomic_set(&stat->use_count, 1);
atomic64_set(&stat->alloc_size, 0);
@@ -342,10 +338,8 @@ static struct spg_proc_stat *create_spg_proc_stat(int tgid, int spg_id)
struct spg_proc_stat *stat;
stat = kmalloc(sizeof(struct spg_proc_stat), GFP_KERNEL);
- if (stat == NULL) {
- pr_err_ratelimited("no memory for spg proc stat\n");
+ if (stat == NULL)
return ERR_PTR(-ENOMEM);
- }
stat->tgid = tgid;
stat->spg_id = spg_id;
@@ -413,10 +407,8 @@ static struct sp_spg_stat *create_spg_stat(int spg_id)
struct sp_spg_stat *stat;
stat = kmalloc(sizeof(*stat), GFP_KERNEL);
- if (stat == NULL) {
- pr_err_ratelimited("no memory for spg stat\n");
+ if (stat == NULL)
return ERR_PTR(-ENOMEM);
- }
stat->spg_id = spg_id;
atomic_set(&stat->hugepage_failures, 0);
@@ -1624,10 +1616,8 @@ static struct sp_area *sp_alloc_area(unsigned long size, unsigned long flags,
}
spa = __kmalloc_node(sizeof(struct sp_area), GFP_KERNEL, node_id);
- if (unlikely(!spa)) {
- pr_err_ratelimited("no memory for spa\n");
+ if (unlikely(!spa))
return ERR_PTR(-ENOMEM);
- }
spin_lock(&sp_area_lock);
--
2.25.1
1
1
[PATCH OLK-5.10 000/107] NTFS read-write driver GPL implementation by Paragon Software
by Yin Xiujiang 13 Dec '21
by Yin Xiujiang 13 Dec '21
13 Dec '21
This patch series adds an NTFS read-write driver to fs/ntfs3.
Having decades of expertise in commercial file system development and huge
test coverage, we at Paragon Software GmbH want to make our contribution to
the Open Source Community by providing an implementation of an NTFS
read-write driver for the Linux kernel.
This is a fully functional NTFS read-write driver. The current version works
with NTFS (including v3.1) and normal/compressed/sparse files, and supports
journal replaying.
We plan to maintain this version once the codebase is merged, and to add new
features and fix bugs. For example, full journaling support over JBD will be
added in later updates.
v2:
- patch split into chunks (file-wise)
- build issues fixed
- sparse and checkpatch.pl errors fixed
- NULL pointer dereference on mkfs.ntfs-formatted volume mount fixed
- cosmetics + code cleanup
v3:
- added acl, noatime, no_acs_rules, prealloc mount options
- added fiemap support
- fixed encodings support
- removed typedefs
- adapted Kernel-way logging mechanisms
- fixed typos and corner-case issues
v4:
- atomic_open() refactored
- code style updated
- bugfixes
v5:
- nls/nls_alt mount options added
- Unicode conversion fixes
- Improved very fragmented files operations
- logging cosmetics
v6:
- Security Descriptor processing changed: added system.ntfs_security
xattr to set the SD
- atomic_open() optimized
- cosmetics
Christophe JAILLET (2):
fs/ntfs3: Remove a useless test in 'indx_find()'
fs/ntfs3: Remove a useless shadowing variable
Colin Ian King (4):
fs/ntfs3: Fix various spelling mistakes
fs/ntfs3: Fix integer overflow in multiplication
fs/ntfs3: Remove redundant initialization of variable err
fs/ntfs3: Fix a memory leak on object opts
Dan Carpenter (5):
fs/ntfs3: add checks for allocation failure
fs/ntfs3: fix an error code in ntfs_get_acl_ex()
fs/ntfs3: Fix error code in indx_add_allocate()
fs/ntfs3: Potential NULL dereference in hdr_find_split()
fs/ntfs3: Fix error handling in indx_insert_into_root()
Gustavo A. R. Silva (1):
fs/ntfs3: Fix fall-through warnings for Clang
Jiapeng Chong (1):
fs/ntfs3: Remove unused including <linux/version.h>
Kari Argillander (54):
fs/ntfs3: Use linux/log2 is_power_of_2 function
fs/ntfs3: Add ifndef + define to all header files
fs/ntfs3: Fix one none utf8 char in source file
fs/ntfs3: Restyle comment block in ni_parse_reparse()
fs/ntfs3: Use kernel ALIGN macros over driver specific
fs/ntfs3: Do not use driver own alloc wrappers
fs/ntfs3: Use kcalloc/kmalloc_array over kzalloc/kmalloc
fs/ntfs3: Restyle comments to better align with kernel-doc
fs/ntfs3: Remove fat ioctl's from ntfs3 driver for now
fs/ntfs3: Fix integer overflow in ni_fiemap with fiemap_prep()
fs/ntfs3: Remove unnecessary condition checking from
ntfs_file_read_iter
fs/ntfs3: Remove GPL boilerplates from decompress lib files
fs/ntfs3: Change how module init/info messages are displayed
fs/ntfs3: Remove unnecesarry mount option noatime
fs/ntfs3: Remove unnecesarry remount flag handling
fs/ntfs3: Convert mount options to pointer in sbi
fs/ntfs3: Use new api for mounting
fs/ntfs3: Init spi more in init_fs_context than fill_super
fs/ntfs3: Make mount option nohidden more universal
fs/ntfs3: Add iocharset= mount option as alias for nls=
fs/ntfs3: Rename mount option no_acs_rules > (no)acsrules
fs/ntfs3: Show uid/gid always in show_options()
fs/ntfs3. Add forward declarations for structs to debug.h
fs/ntfs3: Add missing header files to ntfs.h
fs/ntfs3: Add missing headers and forward declarations to ntfs_fs.h
fs/ntfs3: Add missing header and guards to lib/ headers
fs/ntfs3: Change right headers to bitfunc.c
fs/ntfs3: Change right headers to upcase.c
fs/ntfs3: Change right headers to lznt.c
fs/ntfs3: Remove unneeded header files from c files
fs/ntfs3: Limit binary search table size
fs/ntfs3: Make binary search to search smaller chunks in beginning
fs/ntfs3: Always use binary search with entry search
fs/ntfs3: Remove '+' before constant in ni_insert_resident()
fs/ntfs3: Place Comparisons constant right side of the test
fs/ntfs3: Remove braces from single statment block
fs/ntfs3: Remove tabs before spaces from comment
fs/ntfs3: Fix ntfs_look_for_free_space() does only report -ENOSPC
fs/ntfs3: Remove always false condition check
fs/ntfs3: Use clamp/max macros instead of comparisons
fs/ntfs3: Use min/max macros instated of ternary operators
fs/ntfs3: Fix wrong error message $Logfile -> $UpCase
fs/ntfs3: Change EINVAL to ENOMEM when d_make_root fails
fs/ntfs3: Remove impossible fault condition in fill_super
fs/ntfs3: Return straight without goto in fill_super
fs/ntfs3: Remove unnecessary variable loading in fill_super
fs/ntfs3: Use sb instead of sbi->sb in fill_super
fs/ntfs3: Remove tmp var is_ro in ntfs_fill_super
fs/ntfs3: Remove tmp pointer bd_inode in fill_super
fs/ntfs3: Remove tmp pointer upcase in fill_super
fs/ntfs3: Initialize pointer before use place in fill_super
fs/ntfs3: Initiliaze sb blocksize only in one place + refactor
Doc/fs/ntfs3: Fix rst format and make it cleaner
fs/ntfs3: Remove deprecated mount options nls
Konstantin Komarov (37):
fs/ntfs3: Add headers and misc files
fs/ntfs3: Add initialization of super block
fs/ntfs3: Add bitmap
fs/ntfs3: Add file operations and implementation
fs/ntfs3: Add attrib operations
fs/ntfs3: Add compression
fs/ntfs3: Add NTFS journal
fs/ntfs3: Add Kconfig, Makefile and doc
fs/ntfs3: Add NTFS3 in fs/Kconfig and fs/Makefile
fs/ntfs3: Rework file operations
fs/ntfs3: Restyle comments to better align with kernel-doc
fs/ntfs3: Fix insertion of attr in ni_ins_attr_ext
fs/ntfs3: Change max hardlinks limit to 4000
fs/ntfs3: Add sync flag to ntfs_sb_write_run and al_update
fs/ntfs3: Fix logical error in ntfs_create_inode
fs/ntfs3: Move ni_lock_dir and ni_unlock into ntfs_create_inode
fs/ntfs3: Refactor ntfs_get_acl_ex for better readability
fs/ntfs3: Pass flags to ntfs_set_ea in ntfs_set_acl_ex
fs/ntfs3: Change posix_acl_equiv_mode to posix_acl_update_mode
fs/ntfs3: Refactoring lock in ntfs_init_acl
fs/ntfs3: Reject mount if boot's cluster size < media sector size
fs/ntfs3: Refactoring of ntfs_init_from_boot
fs/ntfs3: Check for NULL if ATTR_EA_INFO is incorrect
fs/ntfs3: Use available posix_acl_release instead of
ntfs_posix_acl_release
fs/ntfs3: Remove locked argument in ntfs_set_ea
fs/ntfs3: Refactoring of ntfs_set_ea
fs/ntfs3: Forbid FALLOC_FL_PUNCH_HOLE for normal files
fs/ntfs3: Remove unnecessary functions
fs/ntfs3: Keep prealloc for all types of files
fs/ntfs3: Fix memory leak if fill_super failed
fs/ntfs3: Rework ntfs_utf16_to_nls
fs/ntfs3: Refactor ntfs_readlink_hlp
fs/ntfs3: Refactor ntfs_create_inode
fs/ntfs3: Refactor ni_parse_reparse
fs/ntfs3: Refactor ntfs_read_mft
fs/ntfs3: Check for NULL pointers in ni_try_remove_attr_list
fs/ntfs3: Add MAINTAINERS
Nathan Chancellor (1):
fs/ntfs3: Remove unused variable cnt in ntfs_security_init()
Yin Xiujiang (2):
fs/ntfs3: Fix the issue from backport 5.15 to 5.10
fs/ntfs3: Add ntfs3 module in openeuler_defconfig
Documentation/filesystems/index.rst | 1 +
Documentation/filesystems/ntfs3.rst | 115 +
MAINTAINERS | 9 +
arch/arm64/configs/openeuler_defconfig | 4 +
arch/x86/configs/openeuler_defconfig | 4 +
fs/Kconfig | 1 +
fs/Makefile | 1 +
fs/ntfs3/Kconfig | 46 +
fs/ntfs3/Makefile | 36 +
fs/ntfs3/attrib.c | 2083 ++++++++++
fs/ntfs3/attrlist.c | 457 +++
fs/ntfs3/bitfunc.c | 128 +
fs/ntfs3/bitmap.c | 1491 +++++++
fs/ntfs3/debug.h | 55 +
fs/ntfs3/dir.c | 593 +++
fs/ntfs3/file.c | 1254 ++++++
fs/ntfs3/frecord.c | 3284 +++++++++++++++
fs/ntfs3/fslog.c | 5213 ++++++++++++++++++++++++
fs/ntfs3/fsntfs.c | 2506 ++++++++++++
fs/ntfs3/index.c | 2584 ++++++++++++
fs/ntfs3/inode.c | 1960 +++++++++
fs/ntfs3/lib/decompress_common.c | 319 ++
fs/ntfs3/lib/decompress_common.h | 343 ++
fs/ntfs3/lib/lib.h | 32 +
fs/ntfs3/lib/lzx_decompress.c | 670 +++
fs/ntfs3/lib/xpress_decompress.c | 142 +
fs/ntfs3/lznt.c | 453 ++
fs/ntfs3/namei.c | 387 ++
fs/ntfs3/ntfs.h | 1224 ++++++
fs/ntfs3/ntfs_fs.h | 1138 ++++++
fs/ntfs3/record.c | 602 +++
fs/ntfs3/run.c | 1111 +++++
fs/ntfs3/super.c | 1507 +++++++
fs/ntfs3/upcase.c | 104 +
fs/ntfs3/xattr.c | 991 +++++
35 files changed, 30848 insertions(+)
create mode 100644 Documentation/filesystems/ntfs3.rst
create mode 100644 fs/ntfs3/Kconfig
create mode 100644 fs/ntfs3/Makefile
create mode 100644 fs/ntfs3/attrib.c
create mode 100644 fs/ntfs3/attrlist.c
create mode 100644 fs/ntfs3/bitfunc.c
create mode 100644 fs/ntfs3/bitmap.c
create mode 100644 fs/ntfs3/debug.h
create mode 100644 fs/ntfs3/dir.c
create mode 100644 fs/ntfs3/file.c
create mode 100644 fs/ntfs3/frecord.c
create mode 100644 fs/ntfs3/fslog.c
create mode 100644 fs/ntfs3/fsntfs.c
create mode 100644 fs/ntfs3/index.c
create mode 100644 fs/ntfs3/inode.c
create mode 100644 fs/ntfs3/lib/decompress_common.c
create mode 100644 fs/ntfs3/lib/decompress_common.h
create mode 100644 fs/ntfs3/lib/lib.h
create mode 100644 fs/ntfs3/lib/lzx_decompress.c
create mode 100644 fs/ntfs3/lib/xpress_decompress.c
create mode 100644 fs/ntfs3/lznt.c
create mode 100644 fs/ntfs3/namei.c
create mode 100644 fs/ntfs3/ntfs.h
create mode 100644 fs/ntfs3/ntfs_fs.h
create mode 100644 fs/ntfs3/record.c
create mode 100644 fs/ntfs3/run.c
create mode 100644 fs/ntfs3/super.c
create mode 100644 fs/ntfs3/upcase.c
create mode 100644 fs/ntfs3/xattr.c
--
2.30.0
1
34
From: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
mainline inclusion
from mainline-v5.15
commit 4534a70b7056fd4b9a1c6db5a4ce3c98546b291e
category: feature
bugzilla:
https://gitee.com/openeuler/kernel/issues/I4G67J?from=project-issue
CVE: NA
----------------------------------------------------------------------
This adds headers and misc files
Signed-off-by: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
Signed-off-by: Yin Xiujiang <yinxiujiang(a)kylinos.cn>
---
fs/ntfs3/debug.h | 64 +++
fs/ntfs3/ntfs.h | 1238 ++++++++++++++++++++++++++++++++++++++++++++
fs/ntfs3/ntfs_fs.h | 1092 ++++++++++++++++++++++++++++++++++++++
fs/ntfs3/upcase.c | 105 ++++
4 files changed, 2499 insertions(+)
create mode 100644 fs/ntfs3/debug.h
create mode 100644 fs/ntfs3/ntfs.h
create mode 100644 fs/ntfs3/ntfs_fs.h
create mode 100644 fs/ntfs3/upcase.c
diff --git a/fs/ntfs3/debug.h b/fs/ntfs3/debug.h
new file mode 100644
index 000000000000..dfaa4c79dc6d
--- /dev/null
+++ b/fs/ntfs3/debug.h
@@ -0,0 +1,64 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ *
+ * Copyright (C) 2019-2021 Paragon Software GmbH, All rights reserved.
+ *
+ * useful functions for debuging
+ */
+
+// clang-format off
+#ifndef Add2Ptr
+#define Add2Ptr(P, I) ((void *)((u8 *)(P) + (I)))
+#define PtrOffset(B, O) ((size_t)((size_t)(O) - (size_t)(B)))
+#endif
+
+#define QuadAlign(n) (((n) + 7u) & (~7u))
+#define IsQuadAligned(n) (!((size_t)(n)&7u))
+#define Quad2Align(n) (((n) + 15u) & (~15u))
+#define IsQuad2Aligned(n) (!((size_t)(n)&15u))
+#define Quad4Align(n) (((n) + 31u) & (~31u))
+#define IsSizeTAligned(n) (!((size_t)(n) & (sizeof(size_t) - 1)))
+#define DwordAlign(n) (((n) + 3u) & (~3u))
+#define IsDwordAligned(n) (!((size_t)(n)&3u))
+#define WordAlign(n) (((n) + 1u) & (~1u))
+#define IsWordAligned(n) (!((size_t)(n)&1u))
+
+#ifdef CONFIG_PRINTK
+__printf(2, 3)
+void ntfs_printk(const struct super_block *sb, const char *fmt, ...);
+__printf(2, 3)
+void ntfs_inode_printk(struct inode *inode, const char *fmt, ...);
+#else
+static inline __printf(2, 3)
+void ntfs_printk(const struct super_block *sb, const char *fmt, ...)
+{
+}
+
+static inline __printf(2, 3)
+void ntfs_inode_printk(struct inode *inode, const char *fmt, ...)
+{
+}
+#endif
+
+/*
+ * Logging macros ( thanks Joe Perches <joe(a)perches.com> for implementation )
+ */
+
+#define ntfs_err(sb, fmt, ...) ntfs_printk(sb, KERN_ERR fmt, ##__VA_ARGS__)
+#define ntfs_warn(sb, fmt, ...) ntfs_printk(sb, KERN_WARNING fmt, ##__VA_ARGS__)
+#define ntfs_info(sb, fmt, ...) ntfs_printk(sb, KERN_INFO fmt, ##__VA_ARGS__)
+#define ntfs_notice(sb, fmt, ...) \
+ ntfs_printk(sb, KERN_NOTICE fmt, ##__VA_ARGS__)
+
+#define ntfs_inode_err(inode, fmt, ...) \
+ ntfs_inode_printk(inode, KERN_ERR fmt, ##__VA_ARGS__)
+#define ntfs_inode_warn(inode, fmt, ...) \
+ ntfs_inode_printk(inode, KERN_WARNING fmt, ##__VA_ARGS__)
+
+#define ntfs_malloc(s) kmalloc(s, GFP_NOFS)
+#define ntfs_zalloc(s) kzalloc(s, GFP_NOFS)
+#define ntfs_vmalloc(s) kvmalloc(s, GFP_KERNEL)
+#define ntfs_free(p) kfree(p)
+#define ntfs_vfree(p) kvfree(p)
+#define ntfs_memdup(src, len) kmemdup(src, len, GFP_NOFS)
+// clang-format on
diff --git a/fs/ntfs3/ntfs.h b/fs/ntfs3/ntfs.h
new file mode 100644
index 000000000000..40398e6c39c9
--- /dev/null
+++ b/fs/ntfs3/ntfs.h
@@ -0,0 +1,1238 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ *
+ * Copyright (C) 2019-2021 Paragon Software GmbH, All rights reserved.
+ *
+ * on-disk ntfs structs
+ */
+
+// clang-format off
+
+/* TODO:
+ * - Check 4K mft record and 512 bytes cluster
+ */
+
+/*
+ * Activate this define to use binary search in indexes
+ */
+#define NTFS3_INDEX_BINARY_SEARCH
+
+/*
+ * Check each run for marked clusters
+ */
+#define NTFS3_CHECK_FREE_CLST
+
+#define NTFS_NAME_LEN 255
+
+/*
+ * ntfs.sys used 500 maximum links
+ * on-disk struct allows up to 0xffff
+ */
+#define NTFS_LINK_MAX 0x400
+//#define NTFS_LINK_MAX 0xffff
+
+/*
+ * Activate to use 64 bit clusters instead of 32 bits in ntfs.sys
+ * Logical and virtual cluster number
+ * If needed, may be redefined to use 64 bit value
+ */
+//#define CONFIG_NTFS3_64BIT_CLUSTER
+
+#define NTFS_LZNT_MAX_CLUSTER 4096
+#define NTFS_LZNT_CUNIT 4
+#define NTFS_LZNT_CLUSTERS (1u<<NTFS_LZNT_CUNIT)
+
+struct GUID {
+ __le32 Data1;
+ __le16 Data2;
+ __le16 Data3;
+ u8 Data4[8];
+};
+
+/*
+ * this struct repeats layout of ATTR_FILE_NAME
+ * at offset 0x40
+ * it used to store global constants NAME_MFT/NAME_MIRROR...
+ * most constant names are shorter than 10
+ */
+struct cpu_str {
+ u8 len;
+ u8 unused;
+ u16 name[10];
+};
+
+struct le_str {
+ u8 len;
+ u8 unused;
+ __le16 name[];
+};
+
+static_assert(SECTOR_SHIFT == 9);
+
+#ifdef CONFIG_NTFS3_64BIT_CLUSTER
+typedef u64 CLST;
+static_assert(sizeof(size_t) == 8);
+#else
+typedef u32 CLST;
+#endif
+
+#define SPARSE_LCN64 ((u64)-1)
+#define SPARSE_LCN ((CLST)-1)
+#define RESIDENT_LCN ((CLST)-2)
+#define COMPRESSED_LCN ((CLST)-3)
+
+#define COMPRESSION_UNIT 4
+#define COMPRESS_MAX_CLUSTER 0x1000
+#define MFT_INCREASE_CHUNK 1024
+
+enum RECORD_NUM {
+ MFT_REC_MFT = 0,
+ MFT_REC_MIRR = 1,
+ MFT_REC_LOG = 2,
+ MFT_REC_VOL = 3,
+ MFT_REC_ATTR = 4,
+ MFT_REC_ROOT = 5,
+ MFT_REC_BITMAP = 6,
+ MFT_REC_BOOT = 7,
+ MFT_REC_BADCLUST = 8,
+ //MFT_REC_QUOTA = 9,
+ MFT_REC_SECURE = 9, // NTFS 3.0
+ MFT_REC_UPCASE = 10,
+ MFT_REC_EXTEND = 11, // NTFS 3.0
+ MFT_REC_RESERVED = 11,
+ MFT_REC_FREE = 16,
+ MFT_REC_USER = 24,
+};
+
+enum ATTR_TYPE {
+ ATTR_ZERO = cpu_to_le32(0x00),
+ ATTR_STD = cpu_to_le32(0x10),
+ ATTR_LIST = cpu_to_le32(0x20),
+ ATTR_NAME = cpu_to_le32(0x30),
+ // ATTR_VOLUME_VERSION on Nt4
+ ATTR_ID = cpu_to_le32(0x40),
+ ATTR_SECURE = cpu_to_le32(0x50),
+ ATTR_LABEL = cpu_to_le32(0x60),
+ ATTR_VOL_INFO = cpu_to_le32(0x70),
+ ATTR_DATA = cpu_to_le32(0x80),
+ ATTR_ROOT = cpu_to_le32(0x90),
+ ATTR_ALLOC = cpu_to_le32(0xA0),
+ ATTR_BITMAP = cpu_to_le32(0xB0),
+ // ATTR_SYMLINK on Nt4
+ ATTR_REPARSE = cpu_to_le32(0xC0),
+ ATTR_EA_INFO = cpu_to_le32(0xD0),
+ ATTR_EA = cpu_to_le32(0xE0),
+ ATTR_PROPERTYSET = cpu_to_le32(0xF0),
+ ATTR_LOGGED_UTILITY_STREAM = cpu_to_le32(0x100),
+ ATTR_END = cpu_to_le32(0xFFFFFFFF)
+};
+
+static_assert(sizeof(enum ATTR_TYPE) == 4);
+
+enum FILE_ATTRIBUTE {
+ FILE_ATTRIBUTE_READONLY = cpu_to_le32(0x00000001),
+ FILE_ATTRIBUTE_HIDDEN = cpu_to_le32(0x00000002),
+ FILE_ATTRIBUTE_SYSTEM = cpu_to_le32(0x00000004),
+ FILE_ATTRIBUTE_ARCHIVE = cpu_to_le32(0x00000020),
+ FILE_ATTRIBUTE_DEVICE = cpu_to_le32(0x00000040),
+ FILE_ATTRIBUTE_TEMPORARY = cpu_to_le32(0x00000100),
+ FILE_ATTRIBUTE_SPARSE_FILE = cpu_to_le32(0x00000200),
+ FILE_ATTRIBUTE_REPARSE_POINT = cpu_to_le32(0x00000400),
+ FILE_ATTRIBUTE_COMPRESSED = cpu_to_le32(0x00000800),
+ FILE_ATTRIBUTE_OFFLINE = cpu_to_le32(0x00001000),
+ FILE_ATTRIBUTE_NOT_CONTENT_INDEXED = cpu_to_le32(0x00002000),
+ FILE_ATTRIBUTE_ENCRYPTED = cpu_to_le32(0x00004000),
+ FILE_ATTRIBUTE_VALID_FLAGS = cpu_to_le32(0x00007fb7),
+ FILE_ATTRIBUTE_DIRECTORY = cpu_to_le32(0x10000000),
+};
+
+static_assert(sizeof(enum FILE_ATTRIBUTE) == 4);
+
+extern const struct cpu_str NAME_MFT;
+extern const struct cpu_str NAME_MIRROR;
+extern const struct cpu_str NAME_LOGFILE;
+extern const struct cpu_str NAME_VOLUME;
+extern const struct cpu_str NAME_ATTRDEF;
+extern const struct cpu_str NAME_ROOT;
+extern const struct cpu_str NAME_BITMAP;
+extern const struct cpu_str NAME_BOOT;
+extern const struct cpu_str NAME_BADCLUS;
+extern const struct cpu_str NAME_QUOTA;
+extern const struct cpu_str NAME_SECURE;
+extern const struct cpu_str NAME_UPCASE;
+extern const struct cpu_str NAME_EXTEND;
+extern const struct cpu_str NAME_OBJID;
+extern const struct cpu_str NAME_REPARSE;
+extern const struct cpu_str NAME_USNJRNL;
+
+extern const __le16 I30_NAME[4];
+extern const __le16 SII_NAME[4];
+extern const __le16 SDH_NAME[4];
+extern const __le16 SO_NAME[2];
+extern const __le16 SQ_NAME[2];
+extern const __le16 SR_NAME[2];
+
+extern const __le16 BAD_NAME[4];
+extern const __le16 SDS_NAME[4];
+extern const __le16 WOF_NAME[17]; /* WofCompressedData */
+
+/* MFT record number structure */
+struct MFT_REF {
+ __le32 low; // The low part of the number
+ __le16 high; // The high part of the number
+ __le16 seq; // The sequence number of MFT record
+};
+
+static_assert(sizeof(__le64) == sizeof(struct MFT_REF));
+
+static inline CLST ino_get(const struct MFT_REF *ref)
+{
+#ifdef CONFIG_NTFS3_64BIT_CLUSTER
+ return le32_to_cpu(ref->low) | ((u64)le16_to_cpu(ref->high) << 32);
+#else
+ return le32_to_cpu(ref->low);
+#endif
+}
+
+struct NTFS_BOOT {
+ u8 jump_code[3]; // 0x00: Jump to boot code
+ u8 system_id[8]; // 0x03: System ID, equals "NTFS "
+
+ // NOTE: this member is not aligned(!)
+ // bytes_per_sector[0] must be 0
+ // bytes_per_sector[1] must be multiplied by 256
+ u8 bytes_per_sector[2]; // 0x0B: Bytes per sector
+
+ u8 sectors_per_clusters;// 0x0D: Sectors per cluster
+ u8 unused1[7];
+ u8 media_type; // 0x15: Media type (0xF8 - harddisk)
+ u8 unused2[2];
+ __le16 sct_per_track; // 0x18: number of sectors per track
+ __le16 heads; // 0x1A: number of heads per cylinder
+ __le32 hidden_sectors; // 0x1C: number of 'hidden' sectors
+ u8 unused3[4];
+ u8 bios_drive_num; // 0x24: BIOS drive number =0x80
+ u8 unused4;
+ u8 signature_ex; // 0x26: Extended BOOT signature =0x80
+ u8 unused5;
+ __le64 sectors_per_volume;// 0x28: size of volume in sectors
+ __le64 mft_clst; // 0x30: first cluster of $MFT
+ __le64 mft2_clst; // 0x38: first cluster of $MFTMirr
+ s8 record_size; // 0x40: size of MFT record in clusters(sectors)
+ u8 unused6[3];
+ s8 index_size; // 0x44: size of INDX record in clusters(sectors)
+ u8 unused7[3];
+ __le64 serial_num; // 0x48: Volume serial number
+ __le32 check_sum; // 0x50: Simple additive checksum of all
+ // of the u32's which precede the 'check_sum'
+
+ u8 boot_code[0x200 - 0x50 - 2 - 4]; // 0x54:
+ u8 boot_magic[2]; // 0x1FE: Boot signature =0x55 + 0xAA
+};
+
+static_assert(sizeof(struct NTFS_BOOT) == 0x200);
+
+enum NTFS_SIGNATURE {
+ NTFS_FILE_SIGNATURE = cpu_to_le32(0x454C4946), // 'FILE'
+ NTFS_INDX_SIGNATURE = cpu_to_le32(0x58444E49), // 'INDX'
+ NTFS_CHKD_SIGNATURE = cpu_to_le32(0x444B4843), // 'CHKD'
+ NTFS_RSTR_SIGNATURE = cpu_to_le32(0x52545352), // 'RSTR'
+ NTFS_RCRD_SIGNATURE = cpu_to_le32(0x44524352), // 'RCRD'
+ NTFS_BAAD_SIGNATURE = cpu_to_le32(0x44414142), // 'BAAD'
+ NTFS_HOLE_SIGNATURE = cpu_to_le32(0x454C4F48), // 'HOLE'
+ NTFS_FFFF_SIGNATURE = cpu_to_le32(0xffffffff),
+};
+
+static_assert(sizeof(enum NTFS_SIGNATURE) == 4);
+
+/* MFT Record header structure */
+struct NTFS_RECORD_HEADER {
+ /* Record magic number, equals 'FILE'/'INDX'/'RSTR'/'RCRD' */
+ enum NTFS_SIGNATURE sign; // 0x00:
+ __le16 fix_off; // 0x04:
+ __le16 fix_num; // 0x06:
+ __le64 lsn; // 0x08: Log file sequence number
+};
+
+static_assert(sizeof(struct NTFS_RECORD_HEADER) == 0x10);
+
+static inline int is_baad(const struct NTFS_RECORD_HEADER *hdr)
+{
+ return hdr->sign == NTFS_BAAD_SIGNATURE;
+}
+
+/* Possible bits in struct MFT_REC.flags */
+enum RECORD_FLAG {
+ RECORD_FLAG_IN_USE = cpu_to_le16(0x0001),
+ RECORD_FLAG_DIR = cpu_to_le16(0x0002),
+ RECORD_FLAG_SYSTEM = cpu_to_le16(0x0004),
+ RECORD_FLAG_UNKNOWN = cpu_to_le16(0x0008),
+};
+
+/* MFT Record structure */
+struct MFT_REC {
+ struct NTFS_RECORD_HEADER rhdr; // 'FILE'
+
+ __le16 seq; // 0x10: Sequence number for this record
+ __le16 hard_links; // 0x12: The number of hard links to record
+ __le16 attr_off; // 0x14: Offset to attributes
+ __le16 flags; // 0x16: See RECORD_FLAG
+ __le32 used; // 0x18: The size of used part
+ __le32 total; // 0x1C: Total record size
+
+ struct MFT_REF parent_ref; // 0x20: Parent MFT record
+ __le16 next_attr_id; // 0x28: The next attribute Id
+
+ __le16 res; // 0x2A: High part of mft record?
+ __le32 mft_record; // 0x2C: Current mft record number
+ __le16 fixups[]; // 0x30:
+};
+
+#define MFTRECORD_FIXUP_OFFSET_1 offsetof(struct MFT_REC, res)
+#define MFTRECORD_FIXUP_OFFSET_3 offsetof(struct MFT_REC, fixups)
+
+static_assert(MFTRECORD_FIXUP_OFFSET_1 == 0x2A);
+static_assert(MFTRECORD_FIXUP_OFFSET_3 == 0x30);
+
+static inline bool is_rec_base(const struct MFT_REC *rec)
+{
+ const struct MFT_REF *r = &rec->parent_ref;
+
+ return !r->low && !r->high && !r->seq;
+}
+
+static inline bool is_mft_rec5(const struct MFT_REC *rec)
+{
+ return le16_to_cpu(rec->rhdr.fix_off) >=
+ offsetof(struct MFT_REC, fixups);
+}
+
+static inline bool is_rec_inuse(const struct MFT_REC *rec)
+{
+ return rec->flags & RECORD_FLAG_IN_USE;
+}
+
+static inline bool clear_rec_inuse(struct MFT_REC *rec)
+{
+ return rec->flags &= ~RECORD_FLAG_IN_USE;
+}
+
+/* Possible values of ATTR_RESIDENT.flags */
+#define RESIDENT_FLAG_INDEXED 0x01
+
+struct ATTR_RESIDENT {
+ __le32 data_size; // 0x10: The size of data
+ __le16 data_off; // 0x14: Offset to data
+ u8 flags; // 0x16: resident flags ( 1 - indexed )
+ u8 res; // 0x17:
+}; // sizeof() = 0x18
+
+struct ATTR_NONRESIDENT {
+ __le64 svcn; // 0x10: Starting VCN of this segment
+ __le64 evcn; // 0x18: End VCN of this segment
+ __le16 run_off; // 0x20: Offset to packed runs
+ // Unit of Compression size for this stream, expressed
+ // as a log of the cluster size.
+ //
+ // 0 means file is not compressed
+ // 1, 2, 3, and 4 are potentially legal values if the
+ // stream is compressed, however the implementation
+ // may only choose to use 4, or possibly 3. Note
+ // that 4 means cluster size time 16. If convenient
+ // the implementation may wish to accept a
+ // reasonable range of legal values here (1-5?),
+ // even if the implementation only generates
+ // a smaller set of values itself.
+ u8 c_unit; // 0x22
+ u8 res1[5]; // 0x23:
+ __le64 alloc_size; // 0x28: The allocated size of attribute in bytes
+ // (multiple of cluster size)
+ __le64 data_size; // 0x30: The size of attribute in bytes <= alloc_size
+ __le64 valid_size; // 0x38: The size of valid part in bytes <= data_size
+ __le64 total_size; // 0x40: The sum of the allocated clusters for a file
+ // (present only for the first segment (0 == vcn)
+ // of compressed attribute)
+
+}; // sizeof()=0x40 or 0x48 (if compressed)
+
+/* Possible values of ATTRIB.flags: */
+#define ATTR_FLAG_COMPRESSED cpu_to_le16(0x0001)
+#define ATTR_FLAG_COMPRESSED_MASK cpu_to_le16(0x00FF)
+#define ATTR_FLAG_ENCRYPTED cpu_to_le16(0x4000)
+#define ATTR_FLAG_SPARSED cpu_to_le16(0x8000)
+
+struct ATTRIB {
+ enum ATTR_TYPE type; // 0x00: The type of this attribute
+ __le32 size; // 0x04: The size of this attribute
+ u8 non_res; // 0x08: Is this attribute non-resident ?
+ u8 name_len; // 0x09: This attribute name length
+ __le16 name_off; // 0x0A: Offset to the attribute name
+ __le16 flags; // 0x0C: See ATTR_FLAG_XXX
+ __le16 id; // 0x0E: unique id (per record)
+
+ union {
+ struct ATTR_RESIDENT res; // 0x10
+ struct ATTR_NONRESIDENT nres; // 0x10
+ };
+};
+
+/* Define attribute sizes */
+#define SIZEOF_RESIDENT 0x18
+#define SIZEOF_NONRESIDENT_EX 0x48
+#define SIZEOF_NONRESIDENT 0x40
+
+#define SIZEOF_RESIDENT_LE cpu_to_le16(0x18)
+#define SIZEOF_NONRESIDENT_EX_LE cpu_to_le16(0x48)
+#define SIZEOF_NONRESIDENT_LE cpu_to_le16(0x40)
+
+static inline u64 attr_ondisk_size(const struct ATTRIB *attr)
+{
+ return attr->non_res ? ((attr->flags &
+ (ATTR_FLAG_COMPRESSED | ATTR_FLAG_SPARSED)) ?
+ le64_to_cpu(attr->nres.total_size) :
+ le64_to_cpu(attr->nres.alloc_size)) :
+ QuadAlign(le32_to_cpu(attr->res.data_size));
+}
+
+static inline u64 attr_size(const struct ATTRIB *attr)
+{
+ return attr->non_res ? le64_to_cpu(attr->nres.data_size) :
+ le32_to_cpu(attr->res.data_size);
+}
+
+static inline bool is_attr_encrypted(const struct ATTRIB *attr)
+{
+ return attr->flags & ATTR_FLAG_ENCRYPTED;
+}
+
+static inline bool is_attr_sparsed(const struct ATTRIB *attr)
+{
+ return attr->flags & ATTR_FLAG_SPARSED;
+}
+
+static inline bool is_attr_compressed(const struct ATTRIB *attr)
+{
+ return attr->flags & ATTR_FLAG_COMPRESSED;
+}
+
+static inline bool is_attr_ext(const struct ATTRIB *attr)
+{
+ return attr->flags & (ATTR_FLAG_SPARSED | ATTR_FLAG_COMPRESSED);
+}
+
+static inline bool is_attr_indexed(const struct ATTRIB *attr)
+{
+ return !attr->non_res && (attr->res.flags & RESIDENT_FLAG_INDEXED);
+}
+
+static inline __le16 const *attr_name(const struct ATTRIB *attr)
+{
+ return Add2Ptr(attr, le16_to_cpu(attr->name_off));
+}
+
+static inline u64 attr_svcn(const struct ATTRIB *attr)
+{
+ return attr->non_res ? le64_to_cpu(attr->nres.svcn) : 0;
+}
+
+/* the size of resident attribute by its resident size */
+#define BYTES_PER_RESIDENT(b) (0x18 + (b))
+
+static_assert(sizeof(struct ATTRIB) == 0x48);
+static_assert(sizeof(((struct ATTRIB *)NULL)->res) == 0x08);
+static_assert(sizeof(((struct ATTRIB *)NULL)->nres) == 0x38);
+
+static inline void *resident_data_ex(const struct ATTRIB *attr, u32 datasize)
+{
+ u32 asize, rsize;
+ u16 off;
+
+ if (attr->non_res)
+ return NULL;
+
+ asize = le32_to_cpu(attr->size);
+ off = le16_to_cpu(attr->res.data_off);
+
+ if (asize < datasize + off)
+ return NULL;
+
+ rsize = le32_to_cpu(attr->res.data_size);
+ if (rsize < datasize)
+ return NULL;
+
+ return Add2Ptr(attr, off);
+}
+
+static inline void *resident_data(const struct ATTRIB *attr)
+{
+ return Add2Ptr(attr, le16_to_cpu(attr->res.data_off));
+}
+
+static inline void *attr_run(const struct ATTRIB *attr)
+{
+ return Add2Ptr(attr, le16_to_cpu(attr->nres.run_off));
+}
+
+/* Standard information attribute (0x10) */
+struct ATTR_STD_INFO {
+ __le64 cr_time; // 0x00: File creation time
+ __le64 m_time; // 0x08: File modification time
+ __le64 c_time; // 0x10: Last time any attribute was modified
+ __le64 a_time; // 0x18: File last access time
+ enum FILE_ATTRIBUTE fa; // 0x20: Standard DOS attributes & more
+ __le32 max_ver_num; // 0x24: Maximum Number of Versions
+ __le32 ver_num; // 0x28: Version Number
+ __le32 class_id; // 0x2C: Class Id from bidirectional Class Id index
+};
+
+static_assert(sizeof(struct ATTR_STD_INFO) == 0x30);
+
+#define SECURITY_ID_INVALID 0x00000000
+#define SECURITY_ID_FIRST 0x00000100
+
+struct ATTR_STD_INFO5 {
+ __le64 cr_time; // 0x00: File creation time
+ __le64 m_time; // 0x08: File modification time
+ __le64 c_time; // 0x10: Last time any attribute was modified
+ __le64 a_time; // 0x18: File last access time
+ enum FILE_ATTRIBUTE fa; // 0x20: Standard DOS attributes & more
+ __le32 max_ver_num; // 0x24: Maximum Number of Versions
+ __le32 ver_num; // 0x28: Version Number
+ __le32 class_id; // 0x2C: Class Id from bidirectional Class Id index
+
+ __le32 owner_id; // 0x30: Owner Id of the user owning the file.
+ __le32 security_id; // 0x34: The Security Id is a key in the $SII Index and $SDS
+ __le64 quota_charge; // 0x38:
+ __le64 usn; // 0x40: Last Update Sequence Number of the file. This is a direct
+ // index into the file $UsnJrnl. If zero, the USN Journal is
+ // disabled.
+};
+
+static_assert(sizeof(struct ATTR_STD_INFO5) == 0x48);
+
+/* attribute list entry structure (0x20) */
+struct ATTR_LIST_ENTRY {
+ enum ATTR_TYPE type; // 0x00: The type of attribute
+ __le16 size; // 0x04: The size of this record
+ u8 name_len; // 0x06: The length of attribute name
+ u8 name_off; // 0x07: The offset to attribute name
+ __le64 vcn; // 0x08: Starting VCN of this attribute
+ struct MFT_REF ref; // 0x10: MFT record number with attribute
+ __le16 id; // 0x18: struct ATTRIB ID
+ __le16 name[3]; // 0x1A: Just to align. To get real name can use bNameOffset
+
+}; // sizeof(0x20)
+
+static_assert(sizeof(struct ATTR_LIST_ENTRY) == 0x20);
+
+static inline u32 le_size(u8 name_len)
+{
+ return QuadAlign(offsetof(struct ATTR_LIST_ENTRY, name) +
+ name_len * sizeof(short));
+}
+
+/* returns 0 if 'attr' has the same type and name */
+static inline int le_cmp(const struct ATTR_LIST_ENTRY *le,
+ const struct ATTRIB *attr)
+{
+ return le->type != attr->type || le->name_len != attr->name_len ||
+ (le->name_len &&
+ memcmp(Add2Ptr(le, le->name_off),
+ Add2Ptr(attr, le16_to_cpu(attr->name_off)),
+ le->name_len * sizeof(short)));
+}
+
+static inline __le16 const *le_name(const struct ATTR_LIST_ENTRY *le)
+{
+ return Add2Ptr(le, le->name_off);
+}
+
+/* File name types (the field type in struct ATTR_FILE_NAME ) */
+#define FILE_NAME_POSIX 0
+#define FILE_NAME_UNICODE 1
+#define FILE_NAME_DOS 2
+#define FILE_NAME_UNICODE_AND_DOS (FILE_NAME_DOS | FILE_NAME_UNICODE)
+
+/* Filename attribute structure (0x30) */
+struct NTFS_DUP_INFO {
+ __le64 cr_time; // 0x00: File creation time
+ __le64 m_time; // 0x08: File modification time
+ __le64 c_time; // 0x10: Last time any attribute was modified
+ __le64 a_time; // 0x18: File last access time
+ __le64 alloc_size; // 0x20: Data attribute allocated size, multiple of cluster size
+ __le64 data_size; // 0x28: Data attribute size <= alloc_size
+ enum FILE_ATTRIBUTE fa; // 0x30: Standard DOS attributes & more
+ __le16 ea_size; // 0x34: Packed EAs
+ __le16 reparse; // 0x36: Used by Reparse
+
+}; // 0x38
+
+struct ATTR_FILE_NAME {
+ struct MFT_REF home; // 0x00: MFT record for directory
+ struct NTFS_DUP_INFO dup;// 0x08
+ u8 name_len; // 0x40: File name length in words
+ u8 type; // 0x41: File name type
+ __le16 name[]; // 0x42: File name
+};
+
+static_assert(sizeof(((struct ATTR_FILE_NAME *)NULL)->dup) == 0x38);
+static_assert(offsetof(struct ATTR_FILE_NAME, name) == 0x42);
+#define SIZEOF_ATTRIBUTE_FILENAME 0x44
+#define SIZEOF_ATTRIBUTE_FILENAME_MAX (0x42 + 255 * 2)
+
+static inline struct ATTRIB *attr_from_name(struct ATTR_FILE_NAME *fname)
+{
+ return (struct ATTRIB *)((char *)fname - SIZEOF_RESIDENT);
+}
+
+static inline u16 fname_full_size(const struct ATTR_FILE_NAME *fname)
+{
+ // don't return struct_size(fname, name, fname->name_len);
+ return offsetof(struct ATTR_FILE_NAME, name) +
+ fname->name_len * sizeof(short);
+}
+
+static inline u8 paired_name(u8 type)
+{
+ if (type == FILE_NAME_UNICODE)
+ return FILE_NAME_DOS;
+ if (type == FILE_NAME_DOS)
+ return FILE_NAME_UNICODE;
+ return FILE_NAME_POSIX;
+}
+
+/* Index entry defines ( the field flags in NtfsDirEntry ) */
+#define NTFS_IE_HAS_SUBNODES cpu_to_le16(1)
+#define NTFS_IE_LAST cpu_to_le16(2)
+
+/* Directory entry structure */
+struct NTFS_DE {
+ union {
+ struct MFT_REF ref; // 0x00: MFT record number with this file
+ struct {
+ __le16 data_off; // 0x00:
+ __le16 data_size; // 0x02:
+ __le32 res; // 0x04: must be 0
+ } view;
+ };
+ __le16 size; // 0x08: The size of this entry
+ __le16 key_size; // 0x0A: The size of the key: file name length in bytes + 0x42
+ __le16 flags; // 0x0C: Entry flags: NTFS_IE_XXX
+ __le16 res; // 0x0E:
+
+ // Here any indexed attribute can be placed
+ // One of them is:
+ // struct ATTR_FILE_NAME AttrFileName;
+ //
+
+ // The last 8 bytes of this structure contains
+ // the VBN of subnode
+ // !!! Note !!!
+ // This field is presented only if (flags & NTFS_IE_HAS_SUBNODES)
+ // __le64 vbn;
+};
+
+static_assert(sizeof(struct NTFS_DE) == 0x10);
+
+static inline void de_set_vbn_le(struct NTFS_DE *e, __le64 vcn)
+{
+ __le64 *v = Add2Ptr(e, le16_to_cpu(e->size) - sizeof(__le64));
+
+ *v = vcn;
+}
+
+static inline void de_set_vbn(struct NTFS_DE *e, CLST vcn)
+{
+ __le64 *v = Add2Ptr(e, le16_to_cpu(e->size) - sizeof(__le64));
+
+ *v = cpu_to_le64(vcn);
+}
+
+static inline __le64 de_get_vbn_le(const struct NTFS_DE *e)
+{
+ return *(__le64 *)Add2Ptr(e, le16_to_cpu(e->size) - sizeof(__le64));
+}
+
+static inline CLST de_get_vbn(const struct NTFS_DE *e)
+{
+ __le64 *v = Add2Ptr(e, le16_to_cpu(e->size) - sizeof(__le64));
+
+ return le64_to_cpu(*v);
+}
+
+static inline struct NTFS_DE *de_get_next(const struct NTFS_DE *e)
+{
+ return Add2Ptr(e, le16_to_cpu(e->size));
+}
+
+static inline struct ATTR_FILE_NAME *de_get_fname(const struct NTFS_DE *e)
+{
+ return le16_to_cpu(e->key_size) >= SIZEOF_ATTRIBUTE_FILENAME ?
+ Add2Ptr(e, sizeof(struct NTFS_DE)) :
+ NULL;
+}
+
+static inline bool de_is_last(const struct NTFS_DE *e)
+{
+ return e->flags & NTFS_IE_LAST;
+}
+
+static inline bool de_has_vcn(const struct NTFS_DE *e)
+{
+ return e->flags & NTFS_IE_HAS_SUBNODES;
+}
+
+static inline bool de_has_vcn_ex(const struct NTFS_DE *e)
+{
+ return (e->flags & NTFS_IE_HAS_SUBNODES) &&
+ (u64)(-1) != *((u64 *)Add2Ptr(e, le16_to_cpu(e->size) -
+ sizeof(__le64)));
+}
+
+#define MAX_BYTES_PER_NAME_ENTRY \
+ QuadAlign(sizeof(struct NTFS_DE) + \
+ offsetof(struct ATTR_FILE_NAME, name) + \
+ NTFS_NAME_LEN * sizeof(short))
+
+struct INDEX_HDR {
+ __le32 de_off; // 0x00: The offset from the start of this structure
+ // to the first NTFS_DE
+ __le32 used; // 0x04: The size of this structure plus all
+ // entries (quad-word aligned)
+ __le32 total; // 0x08: The allocated size for this structure plus all entries
+ u8 flags; // 0x0C: 0x00 = Small directory, 0x01 = Large directory
+ u8 res[3];
+
+ //
+ // de_off + used <= total
+ //
+};
+
+static_assert(sizeof(struct INDEX_HDR) == 0x10);
+
+static inline struct NTFS_DE *hdr_first_de(const struct INDEX_HDR *hdr)
+{
+ u32 de_off = le32_to_cpu(hdr->de_off);
+ u32 used = le32_to_cpu(hdr->used);
+ struct NTFS_DE *e = Add2Ptr(hdr, de_off);
+ u16 esize;
+
+ if (de_off >= used || de_off >= le32_to_cpu(hdr->total))
+ return NULL;
+
+ esize = le16_to_cpu(e->size);
+ if (esize < sizeof(struct NTFS_DE) || de_off + esize > used)
+ return NULL;
+
+ return e;
+}
+
+static inline struct NTFS_DE *hdr_next_de(const struct INDEX_HDR *hdr,
+ const struct NTFS_DE *e)
+{
+ size_t off = PtrOffset(hdr, e);
+ u32 used = le32_to_cpu(hdr->used);
+ u16 esize;
+
+ if (off >= used)
+ return NULL;
+
+ esize = le16_to_cpu(e->size);
+
+ if (esize < sizeof(struct NTFS_DE) ||
+ off + esize + sizeof(struct NTFS_DE) > used)
+ return NULL;
+
+ return Add2Ptr(e, esize);
+}
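+
+/*
+ * Usage sketch (illustrative, not from the original patch): enumerate
+ * all entries of an index header with the two helpers above; the chain
+ * is terminated by the entry carrying NTFS_IE_LAST.
+ *
+ *   struct NTFS_DE *e;
+ *
+ *   for (e = hdr_first_de(hdr); e; e = hdr_next_de(hdr, e)) {
+ *           if (de_is_last(e))
+ *                   break;
+ *           ... process e ...
+ *   }
+ */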
+
+static inline bool hdr_has_subnode(const struct INDEX_HDR *hdr)
+{
+ return hdr->flags & 1;
+}
+
+struct INDEX_BUFFER {
+ struct NTFS_RECORD_HEADER rhdr; // 'INDX'
+ __le64 vbn; // 0x10: vcn if index >= cluster, or vsn if index < cluster
+ struct INDEX_HDR ihdr; // 0x18:
+};
+
+static_assert(sizeof(struct INDEX_BUFFER) == 0x28);
+
+static inline bool ib_is_empty(const struct INDEX_BUFFER *ib)
+{
+ const struct NTFS_DE *first = hdr_first_de(&ib->ihdr);
+
+ return !first || de_is_last(first);
+}
+
+static inline bool ib_is_leaf(const struct INDEX_BUFFER *ib)
+{
+ return !(ib->ihdr.flags & 1);
+}
+
+/* Index root structure ( 0x90 ) */
+enum COLLATION_RULE {
+ NTFS_COLLATION_TYPE_BINARY = cpu_to_le32(0),
+ // $I30
+ NTFS_COLLATION_TYPE_FILENAME = cpu_to_le32(0x01),
+ // $SII of $Secure and $Q of Quota
+ NTFS_COLLATION_TYPE_UINT = cpu_to_le32(0x10),
+ // $O of Quota
+ NTFS_COLLATION_TYPE_SID = cpu_to_le32(0x11),
+ // $SDH of $Secure
+ NTFS_COLLATION_TYPE_SECURITY_HASH = cpu_to_le32(0x12),
+ // $O of ObjId and "$R" for Reparse
+ NTFS_COLLATION_TYPE_UINTS = cpu_to_le32(0x13)
+};
+
+static_assert(sizeof(enum COLLATION_RULE) == 4);
+
+//
+struct INDEX_ROOT {
+ enum ATTR_TYPE type; // 0x00: The type of attribute to index on
+ enum COLLATION_RULE rule; // 0x04: The rule
+ __le32 index_block_size;// 0x08: The size of index record
+ u8 index_block_clst; // 0x0C: The number of clusters or sectors per index
+ u8 res[3];
+ struct INDEX_HDR ihdr; // 0x10:
+};
+
+static_assert(sizeof(struct INDEX_ROOT) == 0x20);
+static_assert(offsetof(struct INDEX_ROOT, ihdr) == 0x10);
+
+#define VOLUME_FLAG_DIRTY cpu_to_le16(0x0001)
+#define VOLUME_FLAG_RESIZE_LOG_FILE cpu_to_le16(0x0002)
+
+struct VOLUME_INFO {
+ __le64 res1; // 0x00
+ u8 major_ver; // 0x08: NTFS major version number (before .)
+ u8 minor_ver; // 0x09: NTFS minor version number (after .)
+ __le16 flags; // 0x0A: Volume flags, see VOLUME_FLAG_XXX
+
+}; // sizeof=0xC
+
+#define SIZEOF_ATTRIBUTE_VOLUME_INFO 0xc
+
+#define NTFS_LABEL_MAX_LENGTH (0x100 / sizeof(short))
+#define NTFS_ATTR_INDEXABLE cpu_to_le32(0x00000002)
+#define NTFS_ATTR_DUPALLOWED cpu_to_le32(0x00000004)
+#define NTFS_ATTR_MUST_BE_INDEXED cpu_to_le32(0x00000010)
+#define NTFS_ATTR_MUST_BE_NAMED cpu_to_le32(0x00000020)
+#define NTFS_ATTR_MUST_BE_RESIDENT cpu_to_le32(0x00000040)
+#define NTFS_ATTR_LOG_ALWAYS cpu_to_le32(0x00000080)
+
+/* $AttrDef file entry */
+struct ATTR_DEF_ENTRY {
+ __le16 name[0x40]; // 0x00: Attr name
+ enum ATTR_TYPE type; // 0x80: struct ATTRIB type
+ __le32 res; // 0x84:
+ enum COLLATION_RULE rule; // 0x88:
+ __le32 flags; // 0x8C: NTFS_ATTR_XXX (see above)
+ __le64 min_sz; // 0x90: Minimum attribute data size
+ __le64 max_sz; // 0x98: Maximum attribute data size
+};
+
+static_assert(sizeof(struct ATTR_DEF_ENTRY) == 0xa0);
+
+/* Object ID (0x40) */
+struct OBJECT_ID {
+ struct GUID ObjId; // 0x00: Unique Id assigned to file
+ struct GUID BirthVolumeId;// 0x10: Birth Volume Id is the Object Id of the Volume on
+ // which the Object Id was allocated. It never changes
+ struct GUID BirthObjectId; // 0x20: Birth Object Id is the first Object Id that was
+ // ever assigned to this MFT Record. I.e. If the Object Id
+ // is changed for some reason, this field will reflect the
+ // original value of the Object Id.
+ struct GUID DomainId; // 0x30: Domain Id is currently unused but it is intended to be
+ // used in a network environment where the local machine is
+ // part of a Windows 2000 Domain. This may be used in a Windows
+ // 2000 Advanced Server managed domain.
+};
+
+static_assert(sizeof(struct OBJECT_ID) == 0x40);
+
+/* O Directory entry structure ( rule = 0x13 ) */
+struct NTFS_DE_O {
+ struct NTFS_DE de;
+ struct GUID ObjId; // 0x10: Unique Id assigned to file
+ struct MFT_REF ref; // 0x20: MFT record number with this file
+ struct GUID BirthVolumeId; // 0x28: Birth Volume Id is the Object Id of the Volume on
+ // which the Object Id was allocated. It never changes
+ struct GUID BirthObjectId; // 0x38: Birth Object Id is the first Object Id that was
+ // ever assigned to this MFT Record. I.e. If the Object Id
+ // is changed for some reason, this field will reflect the
+ // original value of the Object Id.
+ // This field is valid if data_size == 0x48
+ struct GUID BirthDomainId; // 0x48: Domain Id is currently unused but it is intended
+ // to be used in a network environment where the local
+ // machine is part of a Windows 2000 Domain. This may be
+ // used in a Windows 2000 Advanced Server managed domain.
+};
+
+static_assert(sizeof(struct NTFS_DE_O) == 0x58);
+
+#define NTFS_OBJECT_ENTRY_DATA_SIZE1 \
+ 0x38 // struct NTFS_DE_O.BirthDomainId is not used
+#define NTFS_OBJECT_ENTRY_DATA_SIZE2 \
+ 0x48 // struct NTFS_DE_O.BirthDomainId is used
+
+/* Q Directory entry structure ( rule = 0x11 ) */
+struct NTFS_DE_Q {
+ struct NTFS_DE de;
+ __le32 owner_id; // 0x10: Unique Id assigned to file
+ __le32 Version; // 0x14: 0x02
+ __le32 flags2; // 0x18: Quota flags, see above
+ __le64 BytesUsed; // 0x1C:
+ __le64 ChangeTime; // 0x24:
+ __le64 WarningLimit; // 0x2C:
+ __le64 HardLimit; // 0x34:
+ __le64 ExceededTime; // 0x3C:
+
+ // SID is placed here
+}; // sizeof() = 0x44
+
+#define SIZEOF_NTFS_DE_Q 0x44
+
+#define SecurityDescriptorsBlockSize 0x40000 // 256K
+#define SecurityDescriptorMaxSize 0x20000 // 128K
+#define Log2OfSecurityDescriptorsBlockSize 18
+
+struct SECURITY_KEY {
+ __le32 hash; // Hash value for descriptor
+ __le32 sec_id; // Security Id (guaranteed unique)
+};
+
+/* Security descriptors (the content of $Secure::SDS data stream) */
+struct SECURITY_HDR {
+ struct SECURITY_KEY key; // 0x00: Security Key
+ __le64 off; // 0x08: Offset of this entry in the file
+ __le32 size; // 0x10: Size of this entry, 8 byte aligned
+ //
+ // Security descriptor itself is placed here
+ // Total size is 16 byte aligned
+ //
+} __packed;
+
+#define SIZEOF_SECURITY_HDR 0x14
+
+/* SII Directory entry structure */
+struct NTFS_DE_SII {
+ struct NTFS_DE de;
+ __le32 sec_id; // 0x10: Key: sizeof(security_id) = wKeySize
+ struct SECURITY_HDR sec_hdr; // 0x14:
+} __packed;
+
+#define SIZEOF_SII_DIRENTRY 0x28
+
+/* SDH Directory entry structure */
+struct NTFS_DE_SDH {
+ struct NTFS_DE de;
+ struct SECURITY_KEY key; // 0x10: Key
+ struct SECURITY_HDR sec_hdr; // 0x18: Data
+ __le16 magic[2]; // 0x2C: 0x00490049 "I I"
+};
+
+#define SIZEOF_SDH_DIRENTRY 0x30
+
+struct REPARSE_KEY {
+ __le32 ReparseTag; // 0x00: Reparse Tag
+ struct MFT_REF ref; // 0x04: MFT record number with this file
+}; // sizeof() = 0x0C
+
+static_assert(offsetof(struct REPARSE_KEY, ref) == 0x04);
+#define SIZEOF_REPARSE_KEY 0x0C
+
+/* Reparse Directory entry structure */
+struct NTFS_DE_R {
+ struct NTFS_DE de;
+ struct REPARSE_KEY key; // 0x10: Reparse Key
+ u32 zero; // 0x1c
+}; // sizeof() = 0x20
+
+static_assert(sizeof(struct NTFS_DE_R) == 0x20);
+
+/* CompressReparseBuffer.WofVersion */
+#define WOF_CURRENT_VERSION cpu_to_le32(1)
+/* CompressReparseBuffer.WofProvider */
+#define WOF_PROVIDER_WIM cpu_to_le32(1)
+/* CompressReparseBuffer.WofProvider */
+#define WOF_PROVIDER_SYSTEM cpu_to_le32(2)
+/* CompressReparseBuffer.ProviderVer */
+#define WOF_PROVIDER_CURRENT_VERSION cpu_to_le32(1)
+
+#define WOF_COMPRESSION_XPRESS4K cpu_to_le32(0) // 4k
+#define WOF_COMPRESSION_LZX32K cpu_to_le32(1) // 32k
+#define WOF_COMPRESSION_XPRESS8K cpu_to_le32(2) // 8k
+#define WOF_COMPRESSION_XPRESS16K cpu_to_le32(3) // 16k
+
+/*
+ * ATTR_REPARSE (0xC0)
+ *
+ * The reparse struct GUID structure is used by all 3rd party layered drivers to
+ * store data in a reparse point. For non-Microsoft tags, The struct GUID field
+ * cannot be GUID_NULL.
+ * The constraints on reparse tags are defined below.
+ * Microsoft tags can also be used with this format of the reparse point buffer.
+ */
+struct REPARSE_POINT {
+ __le32 ReparseTag; // 0x00:
+ __le16 ReparseDataLength;// 0x04:
+ __le16 Reserved;
+
+ struct GUID Guid; // 0x08:
+
+ //
+ // Here GenericReparseBuffer is placed
+ //
+};
+
+static_assert(sizeof(struct REPARSE_POINT) == 0x18);
+
+//
+// Maximum allowed size of the reparse data.
+//
+#define MAXIMUM_REPARSE_DATA_BUFFER_SIZE (16 * 1024)
+
+//
+// The value of the following constant needs to satisfy the following
+// conditions:
+// (1) Be at least as large as the largest of the reserved tags.
+// (2) Be strictly smaller than all the tags in use.
+//
+#define IO_REPARSE_TAG_RESERVED_RANGE 1
+
+//
+// The reparse tags are a ULONG. The 32 bits are laid out as follows:
+//
+// 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
+// 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+// +-+-+-+-+-----------------------+-------------------------------+
+// |M|R|N|R| Reserved bits | Reparse Tag Value |
+// +-+-+-+-+-----------------------+-------------------------------+
+//
+// M is the Microsoft bit. When set to 1, it denotes a tag owned by Microsoft.
+// All ISVs must use a tag with a 0 in this position.
+// Note: If a Microsoft tag is used by non-Microsoft software, the
+// behavior is not defined.
+//
+// R is reserved. Must be zero for non-Microsoft tags.
+//
+// N is name surrogate. When set to 1, the file represents another named
+// entity in the system.
+//
+// The M and N bits are OR-able.
+// The following macros check for the M and N bit values:
+//
+
+//
+// Macro to determine whether a reparse point tag corresponds to a tag
+// owned by Microsoft.
+//
+#define IsReparseTagMicrosoft(_tag) (((_tag)&IO_REPARSE_TAG_MICROSOFT))
+
+//
+// Macro to determine whether a reparse point tag is a name surrogate
+//
+#define IsReparseTagNameSurrogate(_tag) (((_tag)&IO_REPARSE_TAG_NAME_SURROGATE))
+
+//
+// The following constant represents the bits that are valid to use in
+// reparse tags.
+//
+#define IO_REPARSE_TAG_VALID_VALUES 0xF000FFFF
+
+//
+// Macro to determine whether a reparse tag is a valid tag.
+//
+#define IsReparseTagValid(_tag) \
+ (!((_tag) & ~IO_REPARSE_TAG_VALID_VALUES) && \
+ ((_tag) > IO_REPARSE_TAG_RESERVED_RANGE))
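+
+/*
+ * Worked example (illustrative): IO_REPARSE_TAG_SYMLINK == 0xA000000C
+ * carries both the Microsoft bit (0x80000000) and the name-surrogate
+ * bit (0x20000000), so IsReparseTagMicrosoft() and
+ * IsReparseTagNameSurrogate() are both true for it, and the tag passes
+ * IsReparseTagValid().
+ */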
+
+//
+// Microsoft tags for reparse points.
+//
+
+enum IO_REPARSE_TAG {
+ IO_REPARSE_TAG_SYMBOLIC_LINK = cpu_to_le32(0),
+ IO_REPARSE_TAG_NAME_SURROGATE = cpu_to_le32(0x20000000),
+ IO_REPARSE_TAG_MICROSOFT = cpu_to_le32(0x80000000),
+ IO_REPARSE_TAG_MOUNT_POINT = cpu_to_le32(0xA0000003),
+ IO_REPARSE_TAG_SYMLINK = cpu_to_le32(0xA000000C),
+ IO_REPARSE_TAG_HSM = cpu_to_le32(0xC0000004),
+ IO_REPARSE_TAG_SIS = cpu_to_le32(0x80000007),
+ IO_REPARSE_TAG_DEDUP = cpu_to_le32(0x80000013),
+ IO_REPARSE_TAG_COMPRESS = cpu_to_le32(0x80000017),
+
+ //
+ // The reparse tag 0x80000008 is reserved for Microsoft internal use
+ // (may be published in the future)
+ //
+
+ //
+ // Microsoft reparse tag reserved for DFS
+ //
+ IO_REPARSE_TAG_DFS = cpu_to_le32(0x8000000A),
+
+ //
+ // Microsoft reparse tag reserved for the file system filter manager
+ //
+ IO_REPARSE_TAG_FILTER_MANAGER = cpu_to_le32(0x8000000B),
+
+ //
+ // Non-Microsoft tags for reparse points
+ //
+
+ //
+ // Tag allocated to CONGRUENT, May 2000. Used by IFSTEST
+ //
+ IO_REPARSE_TAG_IFSTEST_CONGRUENT = cpu_to_le32(0x00000009),
+
+ //
+ // Tag allocated to ARKIVIO
+ //
+ IO_REPARSE_TAG_ARKIVIO = cpu_to_le32(0x0000000C),
+
+ //
+ // Tag allocated to SOLUTIONSOFT
+ //
+ IO_REPARSE_TAG_SOLUTIONSOFT = cpu_to_le32(0x2000000D),
+
+ //
+ // Tag allocated to COMMVAULT
+ //
+ IO_REPARSE_TAG_COMMVAULT = cpu_to_le32(0x0000000E),
+
+ // OneDrive??
+ IO_REPARSE_TAG_CLOUD = cpu_to_le32(0x9000001A),
+ IO_REPARSE_TAG_CLOUD_1 = cpu_to_le32(0x9000101A),
+ IO_REPARSE_TAG_CLOUD_2 = cpu_to_le32(0x9000201A),
+ IO_REPARSE_TAG_CLOUD_3 = cpu_to_le32(0x9000301A),
+ IO_REPARSE_TAG_CLOUD_4 = cpu_to_le32(0x9000401A),
+ IO_REPARSE_TAG_CLOUD_5 = cpu_to_le32(0x9000501A),
+ IO_REPARSE_TAG_CLOUD_6 = cpu_to_le32(0x9000601A),
+ IO_REPARSE_TAG_CLOUD_7 = cpu_to_le32(0x9000701A),
+ IO_REPARSE_TAG_CLOUD_8 = cpu_to_le32(0x9000801A),
+ IO_REPARSE_TAG_CLOUD_9 = cpu_to_le32(0x9000901A),
+ IO_REPARSE_TAG_CLOUD_A = cpu_to_le32(0x9000A01A),
+ IO_REPARSE_TAG_CLOUD_B = cpu_to_le32(0x9000B01A),
+ IO_REPARSE_TAG_CLOUD_C = cpu_to_le32(0x9000C01A),
+ IO_REPARSE_TAG_CLOUD_D = cpu_to_le32(0x9000D01A),
+ IO_REPARSE_TAG_CLOUD_E = cpu_to_le32(0x9000E01A),
+ IO_REPARSE_TAG_CLOUD_F = cpu_to_le32(0x9000F01A),
+
+};
+
+#define SYMLINK_FLAG_RELATIVE 1
+
+/* Microsoft reparse buffer. (see DDK for details) */
+struct REPARSE_DATA_BUFFER {
+ __le32 ReparseTag; // 0x00:
+ __le16 ReparseDataLength; // 0x04:
+ __le16 Reserved;
+
+ union {
+ // If ReparseTag == 0xA0000003 (IO_REPARSE_TAG_MOUNT_POINT)
+ struct {
+ __le16 SubstituteNameOffset; // 0x08
+ __le16 SubstituteNameLength; // 0x0A
+ __le16 PrintNameOffset; // 0x0C
+ __le16 PrintNameLength; // 0x0E
+ __le16 PathBuffer[]; // 0x10
+ } MountPointReparseBuffer;
+
+ // If ReparseTag == 0xA000000C (IO_REPARSE_TAG_SYMLINK)
+ // https://msdn.microsoft.com/en-us/library/cc232006.aspx
+ struct {
+ __le16 SubstituteNameOffset; // 0x08
+ __le16 SubstituteNameLength; // 0x0A
+ __le16 PrintNameOffset; // 0x0C
+ __le16 PrintNameLength; // 0x0E
+ // 0-absolute path 1- relative path, SYMLINK_FLAG_RELATIVE
+ __le32 Flags; // 0x10
+ __le16 PathBuffer[]; // 0x14
+ } SymbolicLinkReparseBuffer;
+
+ // If ReparseTag == 0x80000017U
+ struct {
+ __le32 WofVersion; // 0x08 == 1
+ /* 1 - WIM backing provider ("WIMBoot"),
+ * 2 - System compressed file provider
+ */
+ __le32 WofProvider; // 0x0C
+ __le32 ProviderVer; // 0x10: == 1 WOF_FILE_PROVIDER_CURRENT_VERSION == 1
+ __le32 CompressionFormat; // 0x14: 0, 1, 2, 3. See WOF_COMPRESSION_XXX
+ } CompressReparseBuffer;
+
+ struct {
+ u8 DataBuffer[1]; // 0x08
+ } GenericReparseBuffer;
+ };
+};
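+
+/*
+ * Usage sketch (illustrative, not part of the original patch): for a
+ * symlink reparse point the target lives inside PathBuffer at
+ * SubstituteNameOffset, SubstituteNameLength bytes long:
+ *
+ *   const struct REPARSE_DATA_BUFFER *rp = ...;
+ *   const __le16 *nm = Add2Ptr(rp->SymbolicLinkReparseBuffer.PathBuffer,
+ *           le16_to_cpu(rp->SymbolicLinkReparseBuffer.SubstituteNameOffset));
+ *   u16 nm_bytes = le16_to_cpu(rp->SymbolicLinkReparseBuffer.SubstituteNameLength);
+ */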
+
+/* ATTR_EA_INFO (0xD0) */
+
+#define FILE_NEED_EA 0x80 // See ntifs.h
+/* FILE_NEED_EA, indicates that the file to which the EA belongs cannot be
+ * interpreted without understanding the associated extended attributes.
+ */
+struct EA_INFO {
+ __le16 size_pack; // 0x00: Size of buffer to hold in packed form
+ __le16 count; // 0x02: Count of EA's with FILE_NEED_EA bit set
+ __le32 size; // 0x04: Size of buffer to hold in unpacked form
+};
+
+static_assert(sizeof(struct EA_INFO) == 8);
+
+/* ATTR_EA (0xE0) */
+struct EA_FULL {
+ __le32 size; // 0x00: (not in packed)
+ u8 flags; // 0x04
+ u8 name_len; // 0x05
+ __le16 elength; // 0x06
+ u8 name[]; // 0x08
+};
+
+static_assert(offsetof(struct EA_FULL, name) == 8);
+
+#define ACL_REVISION 2
+#define ACL_REVISION_DS 4
+
+#define SE_SELF_RELATIVE cpu_to_le16(0x8000)
+
+struct SECURITY_DESCRIPTOR_RELATIVE {
+ u8 Revision;
+ u8 Sbz1;
+ __le16 Control;
+ __le32 Owner;
+ __le32 Group;
+ __le32 Sacl;
+ __le32 Dacl;
+};
+static_assert(sizeof(struct SECURITY_DESCRIPTOR_RELATIVE) == 0x14);
+
+struct ACE_HEADER {
+ u8 AceType;
+ u8 AceFlags;
+ __le16 AceSize;
+};
+static_assert(sizeof(struct ACE_HEADER) == 4);
+
+struct ACL {
+ u8 AclRevision;
+ u8 Sbz1;
+ __le16 AclSize;
+ __le16 AceCount;
+ __le16 Sbz2;
+};
+static_assert(sizeof(struct ACL) == 8);
+
+struct SID {
+ u8 Revision;
+ u8 SubAuthorityCount;
+ u8 IdentifierAuthority[6];
+ __le32 SubAuthority[];
+};
+static_assert(offsetof(struct SID, SubAuthority) == 8);
+
+// clang-format on
diff --git a/fs/ntfs3/ntfs_fs.h b/fs/ntfs3/ntfs_fs.h
new file mode 100644
index 000000000000..0c3ac89c3115
--- /dev/null
+++ b/fs/ntfs3/ntfs_fs.h
@@ -0,0 +1,1092 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ *
+ * Copyright (C) 2019-2021 Paragon Software GmbH, All rights reserved.
+ *
+ */
+
+// clang-format off
+#define MINUS_ONE_T ((size_t)(-1))
+/* Biggest MFT / smallest cluster */
+#define MAXIMUM_BYTES_PER_MFT 4096
+#define NTFS_BLOCKS_PER_MFT_RECORD (MAXIMUM_BYTES_PER_MFT / 512)
+
+#define MAXIMUM_BYTES_PER_INDEX 4096
+#define NTFS_BLOCKS_PER_INODE (MAXIMUM_BYTES_PER_INDEX / 512)
+
+/* ntfs specific error code when fixup failed */
+#define E_NTFS_FIXUP 555
+/* ntfs specific error code about resident->nonresident */
+#define E_NTFS_NONRESIDENT 556
+/* ntfs specific error code about punch hole */
+#define E_NTFS_NOTALIGNED 557
+
+
+/* sbi->flags */
+#define NTFS_FLAGS_NODISCARD 0x00000001
+/* Set when LogFile is replaying */
+#define NTFS_FLAGS_LOG_REPLAYING 0x00000008
+/* Set when we changed first MFT's which copy must be updated in $MftMirr */
+#define NTFS_FLAGS_MFTMIRR 0x00001000
+#define NTFS_FLAGS_NEED_REPLAY 0x04000000
+
+
+/* ni->ni_flags */
+/*
+ * Data attribute is external compressed (lzx/xpress)
+ * 1 - WOF_COMPRESSION_XPRESS4K
+ * 2 - WOF_COMPRESSION_XPRESS8K
+ * 3 - WOF_COMPRESSION_XPRESS16K
+ * 4 - WOF_COMPRESSION_LZX32K
+ */
+#define NI_FLAG_COMPRESSED_MASK 0x0000000f
+/* Data attribute is deduplicated */
+#define NI_FLAG_DEDUPLICATED 0x00000010
+#define NI_FLAG_EA 0x00000020
+#define NI_FLAG_DIR 0x00000040
+#define NI_FLAG_RESIDENT 0x00000080
+#define NI_FLAG_UPDATE_PARENT 0x00000100
+// clang-format on
+
+struct ntfs_mount_options {
+ struct nls_table *nls;
+
+ kuid_t fs_uid;
+ kgid_t fs_gid;
+ u16 fs_fmask_inv;
+ u16 fs_dmask_inv;
+
+ unsigned uid : 1, /* uid was set */
+ gid : 1, /* gid was set */
+ fmask : 1, /* fmask was set */
+ dmask : 1, /* dmask was set */
+ sys_immutable : 1, /* immutable system files */
+ discard : 1, /* issue discard requests on deletions */
+ sparse : 1, /* create sparse files */
+ showmeta : 1, /* show meta files */
+ nohidden : 1, /* do not show hidden files */
+ force : 1, /* rw mount dirty volume */
+ no_acs_rules : 1, /* exclude acs rules */
+ prealloc : 1 /* preallocate space when file is growing */
+ ;
+};
+
+/* special value to unpack and deallocate */
+#define RUN_DEALLOCATE ((struct runs_tree *)(size_t)1)
+
+/* TODO: use rb tree instead of array */
+struct runs_tree {
+ struct ntfs_run *runs;
+ size_t count; // Currently used size of the ntfs_run storage.
+ size_t allocated; // Currently allocated ntfs_run storage size.
+};
+
+struct ntfs_buffers {
+ /* Biggest MFT / smallest cluster = 4096 / 512 = 8 */
+ /* Biggest index / smallest cluster = 4096 / 512 = 8 */
+ struct buffer_head *bh[PAGE_SIZE >> SECTOR_SHIFT];
+ u32 bytes;
+ u32 nbufs;
+ u32 off;
+};
+
+enum ALLOCATE_OPT {
+ ALLOCATE_DEF = 0, // Allocate all clusters
+ ALLOCATE_MFT = 1, // Allocate for MFT
+};
+
+enum bitmap_mutex_classes {
+ BITMAP_MUTEX_CLUSTERS = 0,
+ BITMAP_MUTEX_MFT = 1,
+};
+
+struct wnd_bitmap {
+ struct super_block *sb;
+ struct rw_semaphore rw_lock;
+
+ struct runs_tree run;
+ size_t nbits;
+
+ size_t total_zeroes; // total number of free bits
+ u16 *free_bits; // free bits in each window
+ size_t nwnd;
+ u32 bits_last; // bits in last window
+
+ struct rb_root start_tree; // extents, sorted by 'start'
+ struct rb_root count_tree; // extents, sorted by 'count + start'
+ size_t count; // extents count
+
+ /*
+ * -1 - Tree is activated but not updated (too many fragments)
+ * 0 - Tree is not activated
+ * 1 - Tree is activated and updated
+ */
+ int uptodated;
+ size_t extent_min; // Minimal extent used while building
+ size_t extent_max; // Upper estimate of biggest free block
+
+ /* Zone [bit, end) */
+ size_t zone_bit;
+ size_t zone_end;
+
+ bool set_tail; // not necessary in driver
+ bool inited;
+};
+
+typedef int (*NTFS_CMP_FUNC)(const void *key1, size_t len1, const void *key2,
+ size_t len2, const void *param);
+
+enum index_mutex_classed {
+ INDEX_MUTEX_I30 = 0,
+ INDEX_MUTEX_SII = 1,
+ INDEX_MUTEX_SDH = 2,
+ INDEX_MUTEX_SO = 3,
+ INDEX_MUTEX_SQ = 4,
+ INDEX_MUTEX_SR = 5,
+ INDEX_MUTEX_TOTAL
+};
+
+/* ntfs_index - allocation unit inside directory */
+struct ntfs_index {
+ struct runs_tree bitmap_run;
+ struct runs_tree alloc_run;
+ /* read/write access to 'bitmap_run'/'alloc_run' while ntfs_readdir */
+ struct rw_semaphore run_lock;
+
+ /* TODO: remove 'cmp' */
+ NTFS_CMP_FUNC cmp;
+
+ u8 index_bits; // log2(root->index_block_size)
+ u8 idx2vbn_bits; // log2(root->index_block_clst)
+ u8 vbn2vbo_bits; // index_block_size < cluster? 9 : cluster_bits
+ u8 type; // index_mutex_classed
+};
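+
+/*
+ * Worked example (illustrative): with 4K index blocks on a volume with
+ * 1K clusters, index_bits == 12 and index_block_clst == 4, so
+ * idx2vbn_bits == 2 and vbn2vbo_bits == cluster_bits == 10; an index
+ * block number then maps to a byte offset as vbn << vbn2vbo_bits.
+ */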
+
+/* Minimum mft zone */
+#define NTFS_MIN_MFT_ZONE 100
+
+/* ntfs file system in-core superblock data */
+struct ntfs_sb_info {
+ struct super_block *sb;
+
+ u32 discard_granularity;
+ u64 discard_granularity_mask_inv; // ~(discard_granularity - 1)
+
+ u32 cluster_size; // bytes per cluster
+ u32 cluster_mask; // == cluster_size - 1
+ u64 cluster_mask_inv; // ~(cluster_size - 1)
+ u32 block_mask; // sb->s_blocksize - 1
+ u32 blocks_per_cluster; // cluster_size / sb->s_blocksize
+
+ u32 record_size;
+ u32 sector_size;
+ u32 index_size;
+
+ u8 sector_bits;
+ u8 cluster_bits;
+ u8 record_bits;
+
+ u64 maxbytes; // Maximum size for normal files
+ u64 maxbytes_sparse; // Maximum size for sparse file
+
+ u32 flags; // See NTFS_FLAGS_XXX
+
+ CLST bad_clusters; // The count of marked bad clusters
+
+ u16 max_bytes_per_attr; // maximum attribute size in record
+ u16 attr_size_tr; // attribute size threshold (320 bytes)
+
+ /* Records in $Extend */
+ CLST objid_no;
+ CLST quota_no;
+ CLST reparse_no;
+ CLST usn_jrnl_no;
+
+ struct ATTR_DEF_ENTRY *def_table; // attribute definition table
+ u32 def_entries;
+ u32 ea_max_size;
+
+ struct MFT_REC *new_rec;
+
+ u16 *upcase;
+
+ struct {
+ u64 lbo, lbo2;
+ struct ntfs_inode *ni;
+ struct wnd_bitmap bitmap; // $MFT::Bitmap
+ /*
+ * MFT records [11-24) are used to expand the MFT itself.
+ * They are always marked as used in $MFT::Bitmap.
+ * 'reserved_bitmap' contains the real bitmap of these records.
+ */
+ ulong reserved_bitmap; // bitmap of used records [11 - 24)
+ size_t next_free; // The next record to allocate from
+ size_t used; // mft valid size in records
+ u32 recs_mirr; // Number of records in MFTMirr
+ u8 next_reserved;
+ u8 reserved_bitmap_inited;
+ } mft;
+
+ struct {
+ struct wnd_bitmap bitmap; // $Bitmap::Data
+ CLST next_free_lcn;
+ } used;
+
+ struct {
+ u64 size; // in bytes
+ u64 blocks; // in blocks
+ u64 ser_num;
+ struct ntfs_inode *ni;
+ __le16 flags; // cached current VOLUME_INFO::flags, VOLUME_FLAG_DIRTY
+ u8 major_ver;
+ u8 minor_ver;
+ char label[65];
+ bool real_dirty; /* real fs state */
+ } volume;
+
+ struct {
+ struct ntfs_index index_sii;
+ struct ntfs_index index_sdh;
+ struct ntfs_inode *ni;
+ u32 next_id;
+ u64 next_off;
+
+ __le32 def_security_id;
+ } security;
+
+ struct {
+ struct ntfs_index index_r;
+ struct ntfs_inode *ni;
+ u64 max_size; // 16K
+ } reparse;
+
+ struct {
+ struct ntfs_index index_o;
+ struct ntfs_inode *ni;
+ } objid;
+
+ struct {
+ struct mutex mtx_lznt;
+ struct lznt *lznt;
+#ifdef CONFIG_NTFS3_LZX_XPRESS
+ struct mutex mtx_xpress;
+ struct xpress_decompressor *xpress;
+ struct mutex mtx_lzx;
+ struct lzx_decompressor *lzx;
+#endif
+ } compress;
+
+ struct ntfs_mount_options options;
+ struct ratelimit_state msg_ratelimit;
+};
+
+/*
+ * one MFT record (usually 1024 bytes); consists of attributes
+ */
+struct mft_inode {
+ struct rb_node node;
+ struct ntfs_sb_info *sbi;
+
+ struct MFT_REC *mrec;
+ struct ntfs_buffers nb;
+
+ CLST rno;
+ bool dirty;
+};
+
+/* nested class for ntfs_inode::ni_lock */
+enum ntfs_inode_mutex_lock_class {
+ NTFS_INODE_MUTEX_DIRTY,
+ NTFS_INODE_MUTEX_SECURITY,
+ NTFS_INODE_MUTEX_OBJID,
+ NTFS_INODE_MUTEX_REPARSE,
+ NTFS_INODE_MUTEX_NORMAL,
+ NTFS_INODE_MUTEX_PARENT,
+};
+
+/*
+ * ntfs inode - extends the linux inode; consists of one or more mft inodes
+ */
+struct ntfs_inode {
+ struct mft_inode mi; // base record
+
+ /*
+ * Valid size: [0 - i_valid) - this range of the file contains valid data
+ * Range [i_valid - inode->i_size) - reads as zeros
+ * Usually i_valid <= inode->i_size
+ */
+ u64 i_valid;
+ struct timespec64 i_crtime;
+
+ struct mutex ni_lock;
+
+ /* file attributes from std */
+ enum FILE_ATTRIBUTE std_fa;
+ __le32 std_security_id;
+
+ /*
+ * tree of mft_inode
+ * not empty when the primary MFT record (usually 1024 bytes) can't hold
+ * all attributes, e.g. the file becomes too fragmented or contains a lot of names
+ */
+ struct rb_root mi_tree;
+
+ /*
+ * This member is used in ntfs_readdir to ensure that all subrecords are loaded
+ */
+ u8 mi_loaded;
+
+ union {
+ struct ntfs_index dir;
+ struct {
+ struct rw_semaphore run_lock;
+ struct runs_tree run;
+#ifdef CONFIG_NTFS3_LZX_XPRESS
+ struct page *offs_page;
+#endif
+ } file;
+ };
+
+ struct {
+ struct runs_tree run;
+ struct ATTR_LIST_ENTRY *le; // 1K aligned memory
+ size_t size;
+ bool dirty;
+ } attr_list;
+
+ size_t ni_flags; // NI_FLAG_XXX
+
+ struct inode vfs_inode;
+};
+
+struct indx_node {
+ struct ntfs_buffers nb;
+ struct INDEX_BUFFER *index;
+};
+
+struct ntfs_fnd {
+ int level;
+ struct indx_node *nodes[20];
+ struct NTFS_DE *de[20];
+ struct NTFS_DE *root_de;
+};
+
+enum REPARSE_SIGN {
+ REPARSE_NONE = 0,
+ REPARSE_COMPRESSED = 1,
+ REPARSE_DEDUPLICATED = 2,
+ REPARSE_LINK = 3
+};
+
+/* functions from attrib.c*/
+int attr_load_runs(struct ATTRIB *attr, struct ntfs_inode *ni,
+ struct runs_tree *run, const CLST *vcn);
+int attr_allocate_clusters(struct ntfs_sb_info *sbi, struct runs_tree *run,
+ CLST vcn, CLST lcn, CLST len, CLST *pre_alloc,
+ enum ALLOCATE_OPT opt, CLST *alen, const size_t fr,
+ CLST *new_lcn);
+int attr_make_nonresident(struct ntfs_inode *ni, struct ATTRIB *attr,
+ struct ATTR_LIST_ENTRY *le, struct mft_inode *mi,
+ u64 new_size, struct runs_tree *run,
+ struct ATTRIB **ins_attr, struct page *page);
+int attr_set_size(struct ntfs_inode *ni, enum ATTR_TYPE type,
+ const __le16 *name, u8 name_len, struct runs_tree *run,
+ u64 new_size, const u64 *new_valid, bool keep_prealloc,
+ struct ATTRIB **ret);
+int attr_data_get_block(struct ntfs_inode *ni, CLST vcn, CLST clen, CLST *lcn,
+ CLST *len, bool *new);
+int attr_data_read_resident(struct ntfs_inode *ni, struct page *page);
+int attr_data_write_resident(struct ntfs_inode *ni, struct page *page);
+int attr_load_runs_vcn(struct ntfs_inode *ni, enum ATTR_TYPE type,
+ const __le16 *name, u8 name_len, struct runs_tree *run,
+ CLST vcn);
+int attr_load_runs_range(struct ntfs_inode *ni, enum ATTR_TYPE type,
+ const __le16 *name, u8 name_len, struct runs_tree *run,
+ u64 from, u64 to);
+int attr_wof_frame_info(struct ntfs_inode *ni, struct ATTRIB *attr,
+ struct runs_tree *run, u64 frame, u64 frames,
+ u8 frame_bits, u32 *ondisk_size, u64 *vbo_data);
+int attr_is_frame_compressed(struct ntfs_inode *ni, struct ATTRIB *attr,
+ CLST frame, CLST *clst_data);
+int attr_allocate_frame(struct ntfs_inode *ni, CLST frame, size_t compr_size,
+ u64 new_valid);
+int attr_collapse_range(struct ntfs_inode *ni, u64 vbo, u64 bytes);
+int attr_punch_hole(struct ntfs_inode *ni, u64 vbo, u64 bytes, u32 *frame_size);
+
+/* functions from attrlist.c*/
+void al_destroy(struct ntfs_inode *ni);
+bool al_verify(struct ntfs_inode *ni);
+int ntfs_load_attr_list(struct ntfs_inode *ni, struct ATTRIB *attr);
+struct ATTR_LIST_ENTRY *al_enumerate(struct ntfs_inode *ni,
+ struct ATTR_LIST_ENTRY *le);
+struct ATTR_LIST_ENTRY *al_find_le(struct ntfs_inode *ni,
+ struct ATTR_LIST_ENTRY *le,
+ const struct ATTRIB *attr);
+struct ATTR_LIST_ENTRY *al_find_ex(struct ntfs_inode *ni,
+ struct ATTR_LIST_ENTRY *le,
+ enum ATTR_TYPE type, const __le16 *name,
+ u8 name_len, const CLST *vcn);
+int al_add_le(struct ntfs_inode *ni, enum ATTR_TYPE type, const __le16 *name,
+ u8 name_len, CLST svcn, __le16 id, const struct MFT_REF *ref,
+ struct ATTR_LIST_ENTRY **new_le);
+bool al_remove_le(struct ntfs_inode *ni, struct ATTR_LIST_ENTRY *le);
+bool al_delete_le(struct ntfs_inode *ni, enum ATTR_TYPE type, CLST vcn,
+ const __le16 *name, size_t name_len,
+ const struct MFT_REF *ref);
+int al_update(struct ntfs_inode *ni);
+static inline size_t al_aligned(size_t size)
+{
+ return (size + 1023) & ~(size_t)1023;
+}
+
+/* globals from bitfunc.c */
+bool are_bits_clear(const ulong *map, size_t bit, size_t nbits);
+bool are_bits_set(const ulong *map, size_t bit, size_t nbits);
+size_t get_set_bits_ex(const ulong *map, size_t bit, size_t nbits);
+
+/* globals from dir.c */
+int ntfs_utf16_to_nls(struct ntfs_sb_info *sbi, const struct le_str *uni,
+ u8 *buf, int buf_len);
+int ntfs_nls_to_utf16(struct ntfs_sb_info *sbi, const u8 *name, u32 name_len,
+ struct cpu_str *uni, u32 max_ulen,
+ enum utf16_endian endian);
+struct inode *dir_search_u(struct inode *dir, const struct cpu_str *uni,
+ struct ntfs_fnd *fnd);
+bool dir_is_empty(struct inode *dir);
+extern const struct file_operations ntfs_dir_operations;
+
+/* globals from file.c*/
+int ntfs_getattr(struct user_namespace *mnt_userns, const struct path *path,
+ struct kstat *stat, u32 request_mask, u32 flags);
+void ntfs_sparse_cluster(struct inode *inode, struct page *page0, CLST vcn,
+ CLST len);
+int ntfs3_setattr(struct user_namespace *mnt_userns, struct dentry *dentry,
+ struct iattr *attr);
+int ntfs_file_open(struct inode *inode, struct file *file);
+int ntfs_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
+ __u64 start, __u64 len);
+extern const struct inode_operations ntfs_special_inode_operations;
+extern const struct inode_operations ntfs_file_inode_operations;
+extern const struct file_operations ntfs_file_operations;
+
+/* globals from frecord.c */
+void ni_remove_mi(struct ntfs_inode *ni, struct mft_inode *mi);
+struct ATTR_STD_INFO *ni_std(struct ntfs_inode *ni);
+struct ATTR_STD_INFO5 *ni_std5(struct ntfs_inode *ni);
+void ni_clear(struct ntfs_inode *ni);
+int ni_load_mi_ex(struct ntfs_inode *ni, CLST rno, struct mft_inode **mi);
+int ni_load_mi(struct ntfs_inode *ni, struct ATTR_LIST_ENTRY *le,
+ struct mft_inode **mi);
+struct ATTRIB *ni_find_attr(struct ntfs_inode *ni, struct ATTRIB *attr,
+ struct ATTR_LIST_ENTRY **entry_o,
+ enum ATTR_TYPE type, const __le16 *name,
+ u8 name_len, const CLST *vcn,
+ struct mft_inode **mi);
+struct ATTRIB *ni_enum_attr_ex(struct ntfs_inode *ni, struct ATTRIB *attr,
+ struct ATTR_LIST_ENTRY **le,
+ struct mft_inode **mi);
+struct ATTRIB *ni_load_attr(struct ntfs_inode *ni, enum ATTR_TYPE type,
+ const __le16 *name, u8 name_len, CLST vcn,
+ struct mft_inode **pmi);
+int ni_load_all_mi(struct ntfs_inode *ni);
+bool ni_add_subrecord(struct ntfs_inode *ni, CLST rno, struct mft_inode **mi);
+int ni_remove_attr(struct ntfs_inode *ni, enum ATTR_TYPE type,
+ const __le16 *name, size_t name_len, bool base_only,
+ const __le16 *id);
+int ni_create_attr_list(struct ntfs_inode *ni);
+int ni_expand_list(struct ntfs_inode *ni);
+int ni_insert_nonresident(struct ntfs_inode *ni, enum ATTR_TYPE type,
+ const __le16 *name, u8 name_len,
+ const struct runs_tree *run, CLST svcn, CLST len,
+ __le16 flags, struct ATTRIB **new_attr,
+ struct mft_inode **mi);
+int ni_insert_resident(struct ntfs_inode *ni, u32 data_size,
+ enum ATTR_TYPE type, const __le16 *name, u8 name_len,
+ struct ATTRIB **new_attr, struct mft_inode **mi);
+int ni_remove_attr_le(struct ntfs_inode *ni, struct ATTRIB *attr,
+ struct ATTR_LIST_ENTRY *le);
+int ni_delete_all(struct ntfs_inode *ni);
+struct ATTR_FILE_NAME *ni_fname_name(struct ntfs_inode *ni,
+ const struct cpu_str *uni,
+ const struct MFT_REF *home,
+ struct ATTR_LIST_ENTRY **entry);
+struct ATTR_FILE_NAME *ni_fname_type(struct ntfs_inode *ni, u8 name_type,
+ struct ATTR_LIST_ENTRY **entry);
+int ni_new_attr_flags(struct ntfs_inode *ni, enum FILE_ATTRIBUTE new_fa);
+enum REPARSE_SIGN ni_parse_reparse(struct ntfs_inode *ni, struct ATTRIB *attr,
+ void *buffer);
+int ni_write_inode(struct inode *inode, int sync, const char *hint);
+#define _ni_write_inode(i, w) ni_write_inode(i, w, __func__)
+int ni_fiemap(struct ntfs_inode *ni, struct fiemap_extent_info *fieinfo,
+ __u64 vbo, __u64 len);
+int ni_readpage_cmpr(struct ntfs_inode *ni, struct page *page);
+int ni_decompress_file(struct ntfs_inode *ni);
+int ni_read_frame(struct ntfs_inode *ni, u64 frame_vbo, struct page **pages,
+ u32 pages_per_frame);
+int ni_write_frame(struct ntfs_inode *ni, struct page **pages,
+ u32 pages_per_frame);
+
+/* globals from fslog.c */
+int log_replay(struct ntfs_inode *ni, bool *initialized);
+
+/* globals from fsntfs.c */
+bool ntfs_fix_pre_write(struct NTFS_RECORD_HEADER *rhdr, size_t bytes);
+int ntfs_fix_post_read(struct NTFS_RECORD_HEADER *rhdr, size_t bytes,
+ bool simple);
+int ntfs_extend_init(struct ntfs_sb_info *sbi);
+int ntfs_loadlog_and_replay(struct ntfs_inode *ni, struct ntfs_sb_info *sbi);
+const struct ATTR_DEF_ENTRY *ntfs_query_def(struct ntfs_sb_info *sbi,
+ enum ATTR_TYPE Type);
+int ntfs_look_for_free_space(struct ntfs_sb_info *sbi, CLST lcn, CLST len,
+ CLST *new_lcn, CLST *new_len,
+ enum ALLOCATE_OPT opt);
+int ntfs_look_free_mft(struct ntfs_sb_info *sbi, CLST *rno, bool mft,
+ struct ntfs_inode *ni, struct mft_inode **mi);
+void ntfs_mark_rec_free(struct ntfs_sb_info *sbi, CLST rno);
+int ntfs_clear_mft_tail(struct ntfs_sb_info *sbi, size_t from, size_t to);
+int ntfs_refresh_zone(struct ntfs_sb_info *sbi);
+int ntfs_update_mftmirr(struct ntfs_sb_info *sbi, int wait);
+enum NTFS_DIRTY_FLAGS {
+ NTFS_DIRTY_CLEAR = 0,
+ NTFS_DIRTY_DIRTY = 1,
+ NTFS_DIRTY_ERROR = 2,
+};
+int ntfs_set_state(struct ntfs_sb_info *sbi, enum NTFS_DIRTY_FLAGS dirty);
+int ntfs_sb_read(struct super_block *sb, u64 lbo, size_t bytes, void *buffer);
+int ntfs_sb_write(struct super_block *sb, u64 lbo, size_t bytes,
+ const void *buffer, int wait);
+int ntfs_sb_write_run(struct ntfs_sb_info *sbi, const struct runs_tree *run,
+ u64 vbo, const void *buf, size_t bytes);
+struct buffer_head *ntfs_bread_run(struct ntfs_sb_info *sbi,
+ const struct runs_tree *run, u64 vbo);
+int ntfs_read_run_nb(struct ntfs_sb_info *sbi, const struct runs_tree *run,
+ u64 vbo, void *buf, u32 bytes, struct ntfs_buffers *nb);
+int ntfs_read_bh(struct ntfs_sb_info *sbi, const struct runs_tree *run, u64 vbo,
+ struct NTFS_RECORD_HEADER *rhdr, u32 bytes,
+ struct ntfs_buffers *nb);
+int ntfs_get_bh(struct ntfs_sb_info *sbi, const struct runs_tree *run, u64 vbo,
+ u32 bytes, struct ntfs_buffers *nb);
+int ntfs_write_bh(struct ntfs_sb_info *sbi, struct NTFS_RECORD_HEADER *rhdr,
+ struct ntfs_buffers *nb, int sync);
+int ntfs_bio_pages(struct ntfs_sb_info *sbi, const struct runs_tree *run,
+ struct page **pages, u32 nr_pages, u64 vbo, u32 bytes,
+ u32 op);
+int ntfs_bio_fill_1(struct ntfs_sb_info *sbi, const struct runs_tree *run);
+int ntfs_vbo_to_lbo(struct ntfs_sb_info *sbi, const struct runs_tree *run,
+ u64 vbo, u64 *lbo, u64 *bytes);
+struct ntfs_inode *ntfs_new_inode(struct ntfs_sb_info *sbi, CLST nRec,
+ bool dir);
+extern const u8 s_default_security[0x50];
+bool is_sd_valid(const struct SECURITY_DESCRIPTOR_RELATIVE *sd, u32 len);
+int ntfs_security_init(struct ntfs_sb_info *sbi);
+int ntfs_get_security_by_id(struct ntfs_sb_info *sbi, __le32 security_id,
+ struct SECURITY_DESCRIPTOR_RELATIVE **sd,
+ size_t *size);
+int ntfs_insert_security(struct ntfs_sb_info *sbi,
+ const struct SECURITY_DESCRIPTOR_RELATIVE *sd,
+ u32 size, __le32 *security_id, bool *inserted);
+int ntfs_reparse_init(struct ntfs_sb_info *sbi);
+int ntfs_objid_init(struct ntfs_sb_info *sbi);
+int ntfs_objid_remove(struct ntfs_sb_info *sbi, struct GUID *guid);
+int ntfs_insert_reparse(struct ntfs_sb_info *sbi, __le32 rtag,
+ const struct MFT_REF *ref);
+int ntfs_remove_reparse(struct ntfs_sb_info *sbi, __le32 rtag,
+ const struct MFT_REF *ref);
+void mark_as_free_ex(struct ntfs_sb_info *sbi, CLST lcn, CLST len, bool trim);
+int run_deallocate(struct ntfs_sb_info *sbi, struct runs_tree *run, bool trim);
+
+/* globals from index.c */
+int indx_used_bit(struct ntfs_index *indx, struct ntfs_inode *ni, size_t *bit);
+void fnd_clear(struct ntfs_fnd *fnd);
+static inline struct ntfs_fnd *fnd_get(void)
+{
+ return ntfs_zalloc(sizeof(struct ntfs_fnd));
+}
+static inline void fnd_put(struct ntfs_fnd *fnd)
+{
+ if (fnd) {
+ fnd_clear(fnd);
+ ntfs_free(fnd);
+ }
+}
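+/*
+ * Usage sketch (illustrative): fnd_get()/fnd_put() bracket an index
+ * lookup; fnd_put() both clears and frees the handle:
+ *
+ *   struct ntfs_fnd *fnd = fnd_get();
+ *
+ *   if (!fnd)
+ *           return -ENOMEM;
+ *   ... indx_find(indx, ni, root, key, key_len, param, &diff, &e, fnd) ...
+ *   fnd_put(fnd);
+ */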
+void indx_clear(struct ntfs_index *idx);
+int indx_init(struct ntfs_index *indx, struct ntfs_sb_info *sbi,
+ const struct ATTRIB *attr, enum index_mutex_classed type);
+struct INDEX_ROOT *indx_get_root(struct ntfs_index *indx, struct ntfs_inode *ni,
+ struct ATTRIB **attr, struct mft_inode **mi);
+int indx_read(struct ntfs_index *idx, struct ntfs_inode *ni, CLST vbn,
+ struct indx_node **node);
+int indx_find(struct ntfs_index *indx, struct ntfs_inode *dir,
+ const struct INDEX_ROOT *root, const void *Key, size_t KeyLen,
+ const void *param, int *diff, struct NTFS_DE **entry,
+ struct ntfs_fnd *fnd);
+int indx_find_sort(struct ntfs_index *indx, struct ntfs_inode *ni,
+ const struct INDEX_ROOT *root, struct NTFS_DE **entry,
+ struct ntfs_fnd *fnd);
+int indx_find_raw(struct ntfs_index *indx, struct ntfs_inode *ni,
+ const struct INDEX_ROOT *root, struct NTFS_DE **entry,
+ size_t *off, struct ntfs_fnd *fnd);
+int indx_insert_entry(struct ntfs_index *indx, struct ntfs_inode *ni,
+ const struct NTFS_DE *new_de, const void *param,
+ struct ntfs_fnd *fnd);
+int indx_delete_entry(struct ntfs_index *indx, struct ntfs_inode *ni,
+ const void *key, u32 key_len, const void *param);
+int indx_update_dup(struct ntfs_inode *ni, struct ntfs_sb_info *sbi,
+ const struct ATTR_FILE_NAME *fname,
+ const struct NTFS_DUP_INFO *dup, int sync);
+
+/* globals from inode.c */
+struct inode *ntfs_iget5(struct super_block *sb, const struct MFT_REF *ref,
+ const struct cpu_str *name);
+int ntfs_set_size(struct inode *inode, u64 new_size);
+int reset_log_file(struct inode *inode);
+int ntfs_get_block(struct inode *inode, sector_t vbn,
+ struct buffer_head *bh_result, int create);
+int ntfs3_write_inode(struct inode *inode, struct writeback_control *wbc);
+int ntfs_sync_inode(struct inode *inode);
+int ntfs_flush_inodes(struct super_block *sb, struct inode *i1,
+ struct inode *i2);
+int inode_write_data(struct inode *inode, const void *data, size_t bytes);
+struct inode *ntfs_create_inode(struct user_namespace *mnt_userns,
+ struct inode *dir, struct dentry *dentry,
+ const struct cpu_str *uni, umode_t mode,
+ dev_t dev, const char *symname, u32 size,
+ struct ntfs_fnd *fnd);
+int ntfs_link_inode(struct inode *inode, struct dentry *dentry);
+int ntfs_unlink_inode(struct inode *dir, const struct dentry *dentry);
+void ntfs_evict_inode(struct inode *inode);
+extern const struct inode_operations ntfs_link_inode_operations;
+extern const struct address_space_operations ntfs_aops;
+extern const struct address_space_operations ntfs_aops_cmpr;
+
+/* globals from name_i.c*/
+int fill_name_de(struct ntfs_sb_info *sbi, void *buf, const struct qstr *name,
+ const struct cpu_str *uni);
+struct dentry *ntfs3_get_parent(struct dentry *child);
+
+extern const struct inode_operations ntfs_dir_inode_operations;
+extern const struct inode_operations ntfs_special_inode_operations;
+
+/* globals from record.c */
+int mi_get(struct ntfs_sb_info *sbi, CLST rno, struct mft_inode **mi);
+void mi_put(struct mft_inode *mi);
+int mi_init(struct mft_inode *mi, struct ntfs_sb_info *sbi, CLST rno);
+int mi_read(struct mft_inode *mi, bool is_mft);
+struct ATTRIB *mi_enum_attr(struct mft_inode *mi, struct ATTRIB *attr);
+// TODO: id?
+struct ATTRIB *mi_find_attr(struct mft_inode *mi, struct ATTRIB *attr,
+ enum ATTR_TYPE type, const __le16 *name,
+ size_t name_len, const __le16 *id);
+static inline struct ATTRIB *rec_find_attr_le(struct mft_inode *rec,
+ struct ATTR_LIST_ENTRY *le)
+{
+ return mi_find_attr(rec, NULL, le->type, le_name(le), le->name_len,
+ &le->id);
+}
+int mi_write(struct mft_inode *mi, int wait);
+int mi_format_new(struct mft_inode *mi, struct ntfs_sb_info *sbi, CLST rno,
+ __le16 flags, bool is_mft);
+void mi_mark_free(struct mft_inode *mi);
+struct ATTRIB *mi_insert_attr(struct mft_inode *mi, enum ATTR_TYPE type,
+ const __le16 *name, u8 name_len, u32 asize,
+ u16 name_off);
+
+bool mi_remove_attr(struct mft_inode *mi, struct ATTRIB *attr);
+bool mi_resize_attr(struct mft_inode *mi, struct ATTRIB *attr, int bytes);
+int mi_pack_runs(struct mft_inode *mi, struct ATTRIB *attr,
+ struct runs_tree *run, CLST len);
+static inline bool mi_is_ref(const struct mft_inode *mi,
+ const struct MFT_REF *ref)
+{
+ if (le32_to_cpu(ref->low) != mi->rno)
+ return false;
+ if (ref->seq != mi->mrec->seq)
+ return false;
+
+#ifdef CONFIG_NTFS3_64BIT_CLUSTER
+ return le16_to_cpu(ref->high) == (mi->rno >> 32);
+#else
+ return !ref->high;
+#endif
+}
+
+static inline void mi_get_ref(const struct mft_inode *mi, struct MFT_REF *ref)
+{
+ ref->low = cpu_to_le32(mi->rno);
+#ifdef CONFIG_NTFS3_64BIT_CLUSTER
+ ref->high = cpu_to_le16(mi->rno >> 32);
+#else
+ ref->high = 0;
+#endif
+ ref->seq = mi->mrec->seq;
+}
+
+/* globals from run.c */
+bool run_lookup_entry(const struct runs_tree *run, CLST vcn, CLST *lcn,
+ CLST *len, size_t *index);
+void run_truncate(struct runs_tree *run, CLST vcn);
+void run_truncate_head(struct runs_tree *run, CLST vcn);
+void run_truncate_around(struct runs_tree *run, CLST vcn);
+bool run_lookup(const struct runs_tree *run, CLST vcn, size_t *Index);
+bool run_add_entry(struct runs_tree *run, CLST vcn, CLST lcn, CLST len,
+ bool is_mft);
+bool run_collapse_range(struct runs_tree *run, CLST vcn, CLST len);
+bool run_get_entry(const struct runs_tree *run, size_t index, CLST *vcn,
+ CLST *lcn, CLST *len);
+bool run_is_mapped_full(const struct runs_tree *run, CLST svcn, CLST evcn);
+
+int run_pack(const struct runs_tree *run, CLST svcn, CLST len, u8 *run_buf,
+ u32 run_buf_size, CLST *packed_vcns);
+int run_unpack(struct runs_tree *run, struct ntfs_sb_info *sbi, CLST ino,
+ CLST svcn, CLST evcn, CLST vcn, const u8 *run_buf,
+ u32 run_buf_size);
+
+#ifdef NTFS3_CHECK_FREE_CLST
+int run_unpack_ex(struct runs_tree *run, struct ntfs_sb_info *sbi, CLST ino,
+ CLST svcn, CLST evcn, CLST vcn, const u8 *run_buf,
+ u32 run_buf_size);
+#else
+#define run_unpack_ex run_unpack
+#endif
+int run_get_highest_vcn(CLST vcn, const u8 *run_buf, u64 *highest_vcn);
+
+/* globals from super.c */
+void *ntfs_set_shared(void *ptr, u32 bytes);
+void *ntfs_put_shared(void *ptr);
+void ntfs_unmap_meta(struct super_block *sb, CLST lcn, CLST len);
+int ntfs_discard(struct ntfs_sb_info *sbi, CLST Lcn, CLST Len);
+
+/* globals from bitmap.c*/
+int __init ntfs3_init_bitmap(void);
+void ntfs3_exit_bitmap(void);
+void wnd_close(struct wnd_bitmap *wnd);
+static inline size_t wnd_zeroes(const struct wnd_bitmap *wnd)
+{
+ return wnd->total_zeroes;
+}
+int wnd_init(struct wnd_bitmap *wnd, struct super_block *sb, size_t nbits);
+int wnd_set_free(struct wnd_bitmap *wnd, size_t bit, size_t bits);
+int wnd_set_used(struct wnd_bitmap *wnd, size_t bit, size_t bits);
+bool wnd_is_free(struct wnd_bitmap *wnd, size_t bit, size_t bits);
+bool wnd_is_used(struct wnd_bitmap *wnd, size_t bit, size_t bits);
+
+/* Possible values for 'flags' 'wnd_find' */
+#define BITMAP_FIND_MARK_AS_USED 0x01
+#define BITMAP_FIND_FULL 0x02
+size_t wnd_find(struct wnd_bitmap *wnd, size_t to_alloc, size_t hint,
+ size_t flags, size_t *allocated);
+int wnd_extend(struct wnd_bitmap *wnd, size_t new_bits);
+void wnd_zone_set(struct wnd_bitmap *wnd, size_t Lcn, size_t Len);
+int ntfs_trim_fs(struct ntfs_sb_info *sbi, struct fstrim_range *range);
+
+/* globals from upcase.c */
+int ntfs_cmp_names(const __le16 *s1, size_t l1, const __le16 *s2, size_t l2,
+ const u16 *upcase, bool bothcase);
+int ntfs_cmp_names_cpu(const struct cpu_str *uni1, const struct le_str *uni2,
+ const u16 *upcase, bool bothcase);
+
+/* globals from xattr.c */
+#ifdef CONFIG_NTFS3_FS_POSIX_ACL
+struct posix_acl *ntfs_get_acl(struct inode *inode, int type);
+int ntfs_set_acl(struct user_namespace *mnt_userns, struct inode *inode,
+ struct posix_acl *acl, int type);
+int ntfs_init_acl(struct user_namespace *mnt_userns, struct inode *inode,
+ struct inode *dir);
+#else
+#define ntfs_get_acl NULL
+#define ntfs_set_acl NULL
+#endif
+
+int ntfs_acl_chmod(struct user_namespace *mnt_userns, struct inode *inode);
+int ntfs_permission(struct user_namespace *mnt_userns, struct inode *inode,
+ int mask);
+ssize_t ntfs_listxattr(struct dentry *dentry, char *buffer, size_t size);
+extern const struct xattr_handler *ntfs_xattr_handlers[];
+
+int ntfs_save_wsl_perm(struct inode *inode);
+void ntfs_get_wsl_perm(struct inode *inode);
+
+/* globals from lznt.c */
+struct lznt *get_lznt_ctx(int level);
+size_t compress_lznt(const void *uncompressed, size_t uncompressed_size,
+ void *compressed, size_t compressed_size,
+ struct lznt *ctx);
+ssize_t decompress_lznt(const void *compressed, size_t compressed_size,
+ void *uncompressed, size_t uncompressed_size);
+
+static inline bool is_ntfs3(struct ntfs_sb_info *sbi)
+{
+ return sbi->volume.major_ver >= 3;
+}
+
+/*(sb->s_flags & SB_ACTIVE)*/
+static inline bool is_mounted(struct ntfs_sb_info *sbi)
+{
+ return !!sbi->sb->s_root;
+}
+
+static inline bool ntfs_is_meta_file(struct ntfs_sb_info *sbi, CLST rno)
+{
+ return rno < MFT_REC_FREE || rno == sbi->objid_no ||
+ rno == sbi->quota_no || rno == sbi->reparse_no ||
+ rno == sbi->usn_jrnl_no;
+}
+
+static inline void ntfs_unmap_page(struct page *page)
+{
+ kunmap(page);
+ put_page(page);
+}
+
+static inline struct page *ntfs_map_page(struct address_space *mapping,
+ unsigned long index)
+{
+ struct page *page = read_mapping_page(mapping, index, NULL);
+
+ if (!IS_ERR(page)) {
+ kmap(page);
+ if (!PageError(page))
+ return page;
+ ntfs_unmap_page(page);
+ return ERR_PTR(-EIO);
+ }
+ return page;
+}
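+
+/*
+ * Usage sketch (illustrative): a page successfully returned by
+ * ntfs_map_page() is kmapped and referenced, so it must be released
+ * with ntfs_unmap_page():
+ *
+ *   struct page *page = ntfs_map_page(mapping, idx);
+ *
+ *   if (IS_ERR(page))
+ *           return PTR_ERR(page);
+ *   ... use page_address(page) ...
+ *   ntfs_unmap_page(page);
+ */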
+
+static inline size_t wnd_zone_bit(const struct wnd_bitmap *wnd)
+{
+ return wnd->zone_bit;
+}
+
+static inline size_t wnd_zone_len(const struct wnd_bitmap *wnd)
+{
+ return wnd->zone_end - wnd->zone_bit;
+}
+
+static inline void run_init(struct runs_tree *run)
+{
+ run->runs = NULL;
+ run->count = 0;
+ run->allocated = 0;
+}
+
+static inline struct runs_tree *run_alloc(void)
+{
+ return ntfs_zalloc(sizeof(struct runs_tree));
+}
+
+static inline void run_close(struct runs_tree *run)
+{
+ ntfs_vfree(run->runs);
+ memset(run, 0, sizeof(*run));
+}
+
+static inline void run_free(struct runs_tree *run)
+{
+ if (run) {
+ ntfs_vfree(run->runs);
+ ntfs_free(run);
+ }
+}
+
+static inline bool run_is_empty(struct runs_tree *run)
+{
+ return !run->count;
+}
+
+/* NTFS uses quad aligned bitmaps */
+static inline size_t bitmap_size(size_t bits)
+{
+ return QuadAlign((bits + 7) >> 3);
+}
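+
+/*
+ * Worked example (illustrative): bitmap_size(100) ==
+ * QuadAlign((100 + 7) >> 3) == QuadAlign(13) == 16 bytes.
+ */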
+
+#define _100ns2seconds 10000000
+#define SecondsToStartOf1970 0x00000002B6109100
+
+#define NTFS_TIME_GRAN 100
+
+/*
+ * kernel2nt
+ *
+ * converts in-memory kernel timestamp into nt time
+ */
+static inline __le64 kernel2nt(const struct timespec64 *ts)
+{
+ // 10^7 units of 100 nanoseconds = one second
+ return cpu_to_le64(_100ns2seconds *
+ (ts->tv_sec + SecondsToStartOf1970) +
+ ts->tv_nsec / NTFS_TIME_GRAN);
+}
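+
+/*
+ * Worked example (illustrative): the Unix epoch, ts == {0, 0}, maps to
+ * 10^7 * 0x2B6109100 == 116444736000000000, the well-known NT
+ * timestamp of 1970-01-01T00:00:00Z.
+ */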
+
+/*
+ * nt2kernel
+ *
+ * converts on-disk nt time into kernel timestamp
+ */
+static inline void nt2kernel(const __le64 tm, struct timespec64 *ts)
+{
+ u64 t = le64_to_cpu(tm) - _100ns2seconds * SecondsToStartOf1970;
+
+ // WARNING: do_div changes its first argument(!)
+ ts->tv_nsec = do_div(t, _100ns2seconds) * 100;
+ ts->tv_sec = t;
+}
+
+static inline struct ntfs_sb_info *ntfs_sb(struct super_block *sb)
+{
+ return sb->s_fs_info;
+}
+
+/* Align up on cluster boundary */
+static inline u64 ntfs_up_cluster(const struct ntfs_sb_info *sbi, u64 size)
+{
+ return (size + sbi->cluster_mask) & sbi->cluster_mask_inv;
+}
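+
+/* e.g. with 4K clusters, ntfs_up_cluster(sbi, 5000) == 8192 (illustrative) */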
+
+/* Align up on cluster boundary */
+static inline u64 ntfs_up_block(const struct super_block *sb, u64 size)
+{
+ return (size + sb->s_blocksize - 1) & ~(u64)(sb->s_blocksize - 1);
+}
+
+static inline CLST bytes_to_cluster(const struct ntfs_sb_info *sbi, u64 size)
+{
+ return (size + sbi->cluster_mask) >> sbi->cluster_bits;
+}
+
+static inline u64 bytes_to_block(const struct super_block *sb, u64 size)
+{
+ return (size + sb->s_blocksize - 1) >> sb->s_blocksize_bits;
+}
+
+static inline struct buffer_head *ntfs_bread(struct super_block *sb,
+ sector_t block)
+{
+ struct buffer_head *bh = sb_bread(sb, block);
+
+ if (bh)
+ return bh;
+
+ ntfs_err(sb, "failed to read volume at offset 0x%llx",
+ (u64)block << sb->s_blocksize_bits);
+ return NULL;
+}
+
+static inline bool is_power_of2(size_t v)
+{
+ return v && !(v & (v - 1));
+}
+
+static inline struct ntfs_inode *ntfs_i(struct inode *inode)
+{
+ return container_of(inode, struct ntfs_inode, vfs_inode);
+}
+
+static inline bool is_compressed(const struct ntfs_inode *ni)
+{
+ return (ni->std_fa & FILE_ATTRIBUTE_COMPRESSED) ||
+ (ni->ni_flags & NI_FLAG_COMPRESSED_MASK);
+}
+
+static inline int ni_ext_compress_bits(const struct ntfs_inode *ni)
+{
+ return 0xb + (ni->ni_flags & NI_FLAG_COMPRESSED_MASK);
+}
+
+/* bits - 0xc, 0xd, 0xe, 0xf, 0x10 */
+static inline void ni_set_ext_compress_bits(struct ntfs_inode *ni, u8 bits)
+{
+ ni->ni_flags |= (bits - 0xb) & NI_FLAG_COMPRESSED_MASK;
+}
+
+static inline bool is_dedup(const struct ntfs_inode *ni)
+{
+ return ni->ni_flags & NI_FLAG_DEDUPLICATED;
+}
+
+static inline bool is_encrypted(const struct ntfs_inode *ni)
+{
+ return ni->std_fa & FILE_ATTRIBUTE_ENCRYPTED;
+}
+
+static inline bool is_sparsed(const struct ntfs_inode *ni)
+{
+ return ni->std_fa & FILE_ATTRIBUTE_SPARSE_FILE;
+}
+
+static inline int is_resident(struct ntfs_inode *ni)
+{
+ return ni->ni_flags & NI_FLAG_RESIDENT;
+}
+
+static inline void le16_sub_cpu(__le16 *var, u16 val)
+{
+ *var = cpu_to_le16(le16_to_cpu(*var) - val);
+}
+
+static inline void le32_sub_cpu(__le32 *var, u32 val)
+{
+ *var = cpu_to_le32(le32_to_cpu(*var) - val);
+}
+
+static inline void nb_put(struct ntfs_buffers *nb)
+{
+ u32 i, nbufs = nb->nbufs;
+
+ if (!nbufs)
+ return;
+
+ for (i = 0; i < nbufs; i++)
+ put_bh(nb->bh[i]);
+ nb->nbufs = 0;
+}
+
+static inline void put_indx_node(struct indx_node *in)
+{
+ if (!in)
+ return;
+
+ ntfs_free(in->index);
+ nb_put(&in->nb);
+ ntfs_free(in);
+}
+
+static inline void mi_clear(struct mft_inode *mi)
+{
+ nb_put(&mi->nb);
+ ntfs_free(mi->mrec);
+ mi->mrec = NULL;
+}
+
+static inline void ni_lock(struct ntfs_inode *ni)
+{
+ mutex_lock_nested(&ni->ni_lock, NTFS_INODE_MUTEX_NORMAL);
+}
+
+static inline void ni_lock_dir(struct ntfs_inode *ni)
+{
+ mutex_lock_nested(&ni->ni_lock, NTFS_INODE_MUTEX_PARENT);
+}
+
+static inline void ni_unlock(struct ntfs_inode *ni)
+{
+ mutex_unlock(&ni->ni_lock);
+}
+
+static inline int ni_trylock(struct ntfs_inode *ni)
+{
+ return mutex_trylock(&ni->ni_lock);
+}
+
+static inline int attr_load_runs_attr(struct ntfs_inode *ni,
+ struct ATTRIB *attr,
+ struct runs_tree *run, CLST vcn)
+{
+ return attr_load_runs_vcn(ni, attr->type, attr_name(attr),
+ attr->name_len, run, vcn);
+}
+
+static inline void le64_sub_cpu(__le64 *var, u64 val)
+{
+ *var = cpu_to_le64(le64_to_cpu(*var) - val);
+}
diff --git a/fs/ntfs3/upcase.c b/fs/ntfs3/upcase.c
new file mode 100644
index 000000000000..9617382aca64
--- /dev/null
+++ b/fs/ntfs3/upcase.c
@@ -0,0 +1,105 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * Copyright (C) 2019-2021 Paragon Software GmbH, All rights reserved.
+ *
+ */
+#include <linux/blkdev.h>
+#include <linux/buffer_head.h>
+#include <linux/module.h>
+#include <linux/nls.h>
+
+#include "debug.h"
+#include "ntfs.h"
+#include "ntfs_fs.h"
+
+static inline u16 upcase_unicode_char(const u16 *upcase, u16 chr)
+{
+ if (chr < 'a')
+ return chr;
+
+ if (chr <= 'z')
+ return chr - ('a' - 'A');
+
+ return upcase[chr];
+}
+
+/*
+ * Thanks Kari Argillander <kari.argillander(a)gmail.com> for idea and implementation 'bothcase'
+ *
+ * Straight way to compare names:
+ * - case insensitive
+ * - if name equals and 'bothcases' then
+ * - case sensitive
+ * 'Straight way' code scans input names twice in the worst case
+ * Optimized code scans input names only once
+ */
+int ntfs_cmp_names(const __le16 *s1, size_t l1, const __le16 *s2, size_t l2,
+ const u16 *upcase, bool bothcase)
+{
+ int diff1 = 0;
+ int diff2;
+ size_t len = min(l1, l2);
+
+ if (!bothcase && upcase)
+ goto case_insentive;
+
+ for (; len; s1++, s2++, len--) {
+ diff1 = le16_to_cpu(*s1) - le16_to_cpu(*s2);
+ if (diff1) {
+ if (bothcase && upcase)
+ goto case_insentive;
+
+ return diff1;
+ }
+ }
+ return l1 - l2;
+
+case_insentive:
+ for (; len; s1++, s2++, len--) {
+ diff2 = upcase_unicode_char(upcase, le16_to_cpu(*s1)) -
+ upcase_unicode_char(upcase, le16_to_cpu(*s2));
+ if (diff2)
+ return diff2;
+ }
+
+ diff2 = l1 - l2;
+ return diff2 ? diff2 : diff1;
+}
+
+int ntfs_cmp_names_cpu(const struct cpu_str *uni1, const struct le_str *uni2,
+ const u16 *upcase, bool bothcase)
+{
+ const u16 *s1 = uni1->name;
+ const __le16 *s2 = uni2->name;
+ size_t l1 = uni1->len;
+ size_t l2 = uni2->len;
+ size_t len = min(l1, l2);
+ int diff1 = 0;
+ int diff2;
+
+ if (!bothcase && upcase)
+ goto case_insentive;
+
+ for (; len; s1++, s2++, len--) {
+ diff1 = *s1 - le16_to_cpu(*s2);
+ if (diff1) {
+ if (bothcase && upcase)
+ goto case_insentive;
+
+ return diff1;
+ }
+ }
+ return l1 - l2;
+
+case_insentive:
+ for (; len; s1++, s2++, len--) {
+ diff2 = upcase_unicode_char(upcase, *s1) -
+ upcase_unicode_char(upcase, le16_to_cpu(*s2));
+ if (diff2)
+ return diff2;
+ }
+
+ diff2 = l1 - l2;
+ return diff2 ? diff2 : diff1;
+}
--
2.30.0
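For contrast with the single-pass ntfs_cmp_names() above, here is a minimal
sketch of the 'straight way' its comment describes — a hypothetical two-pass
helper, not part of the patch, reusing upcase_unicode_char() from the same
file:

static int cmp_names_two_pass(const __le16 *s1, size_t l1,
			      const __le16 *s2, size_t l2,
			      const u16 *upcase, bool bothcase)
{
	size_t i, len = min(l1, l2);
	int diff;

	/* pass 1: case-insensitive comparison via the upcase table */
	for (i = 0; i < len; i++) {
		diff = upcase_unicode_char(upcase, le16_to_cpu(s1[i])) -
		       upcase_unicode_char(upcase, le16_to_cpu(s2[i]));
		if (diff)
			return diff;
	}
	if (l1 != l2 || !bothcase)
		return l1 - l2;

	/* pass 2: equal case-insensitively, so re-scan case-sensitively */
	for (i = 0; i < len; i++) {
		diff = le16_to_cpu(s1[i]) - le16_to_cpu(s2[i]);
		if (diff)
			return diff;
	}
	return 0;
}

The patch's version avoids the second scan by comparing case-sensitively as
it goes and switching to the case-insensitive loop from the first mismatch,
keeping the case-sensitive result as a tiebreak.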
Hi,
Thanks for your patch.
Please subscribe to the kernel(a)openeuler.org and kernel-discuss(a)openeuler.org mailing lists before sending
patches; otherwise your patches will be blocked and others will not receive your email.
Please resend this patch.
On 2021/12/13 15:11, Jiao Fenfang wrote:
> From: JiaoFenfang <jiaofenfang(a)uniontech.com>
>
> openEuler inclusion
> category: feature
> bugzilla: https://gitee.com/openeuler/kernel/issues/I4KVP7?from=project-issue
>
Add commit message here.
> Signed-off-by: Jiao Fenfang <jiaofenfang(a)uniontech.com>
> ---
> arch/x86/configs/openeuler_defconfig | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
Also enable it on arm64 platform?
> diff --git a/arch/x86/configs/openeuler_defconfig b/arch/x86/configs/openeuler_defconfig
> index 7b608301823c..01a902eba3b9 100644
> --- a/arch/x86/configs/openeuler_defconfig
> +++ b/arch/x86/configs/openeuler_defconfig
> @@ -7451,7 +7451,13 @@ CONFIG_XFS_POSIX_ACL=y
> CONFIG_GFS2_FS=m
> CONFIG_GFS2_FS_LOCKING_DLM=y
> # CONFIG_OCFS2_FS is not set
> -# CONFIG_BTRFS_FS is not set
> +CONFIG_BTRFS_FS=m
> +# CONFIG_BTRFS_FS_POSIX_ACL is not set
> +# CONFIG_BTRFS_FS_CHECK_INTEGRITY is not set
> +# CONFIG_BTRFS_FS_RUN_SANITY_TESTS is not set
> +# CONFIG_BTRFS_DEBUG is not set
> +# CONFIG_BTRFS_ASSERT is not set
> +# CONFIG_BTRFS_FS_REF_VERIFY is not set
> # CONFIG_NILFS2_FS is not set
> # CONFIG_F2FS_FS is not set
> # CONFIG_ZONEFS_FS is not set
>
13 Dec '21
From: Jesse Brandeburg <jesse.brandeburg(a)intel.com>
mainline inclusion
from mainline-v5.13-rc4
commit 63e39d29b3da02e901349f6cd71159818a4737a6
category: bugfix
bugzilla: NA
CVE: CVE-2021-33098
-----------------------------------------------
Check that the MTU value requested by the VF is in the supported
range of MTUs before attempting to set the VF large packet enable,
otherwise reject the request. This also avoids unnecessary
register updates in the case of the 82599 controller.
Fixes: 872844ddb9e4 ("ixgbe: Enable jumbo frames support w/ SR-IOV")
Co-developed-by: Piotr Skajewski <piotrx.skajewski(a)intel.com>
Signed-off-by: Piotr Skajewski <piotrx.skajewski(a)intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg(a)intel.com>
Co-developed-by: Mateusz Palczewski <mateusz.palczewski(a)intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski(a)intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski(a)intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen(a)intel.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Huang Guobin <huangguobin4(a)huawei.com>
Reviewed-by: Wei Yongjun <weiyongjun1(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c | 16 +++++++---------
1 file changed, 7 insertions(+), 9 deletions(-)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
index f6ffd9fb20793..8aaf856771d7b 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
@@ -467,12 +467,16 @@ static int ixgbe_set_vf_vlan(struct ixgbe_adapter *adapter, int add, int vid,
return err;
}
-static s32 ixgbe_set_vf_lpe(struct ixgbe_adapter *adapter, u32 *msgbuf, u32 vf)
+static int ixgbe_set_vf_lpe(struct ixgbe_adapter *adapter, u32 max_frame, u32 vf)
{
struct ixgbe_hw *hw = &adapter->hw;
- int max_frame = msgbuf[1];
u32 max_frs;
+ if (max_frame < ETH_MIN_MTU || max_frame > IXGBE_MAX_JUMBO_FRAME_SIZE) {
+ e_err(drv, "VF max_frame %d out of range\n", max_frame);
+ return -EINVAL;
+ }
+
/*
* For 82599EB we have to keep all PFs and VFs operating with
* the same max_frame value in order to avoid sending an oversize
@@ -532,12 +536,6 @@ static s32 ixgbe_set_vf_lpe(struct ixgbe_adapter *adapter, u32 *msgbuf, u32 vf)
}
}
- /* MTU < 68 is an error and causes problems on some kernels */
- if (max_frame > IXGBE_MAX_JUMBO_FRAME_SIZE) {
- e_err(drv, "VF max_frame %d out of range\n", max_frame);
- return -EINVAL;
- }
-
/* pull current max frame size from hardware */
max_frs = IXGBE_READ_REG(hw, IXGBE_MAXFRS);
max_frs &= IXGBE_MHADD_MFS_MASK;
@@ -1240,7 +1238,7 @@ static int ixgbe_rcv_msg_from_vf(struct ixgbe_adapter *adapter, u32 vf)
retval = ixgbe_set_vf_vlan_msg(adapter, msgbuf, vf);
break;
case IXGBE_VF_SET_LPE:
- retval = ixgbe_set_vf_lpe(adapter, msgbuf, vf);
+ retval = ixgbe_set_vf_lpe(adapter, msgbuf[1], vf);
break;
case IXGBE_VF_SET_MACVLAN:
retval = ixgbe_set_vf_macvlan_msg(adapter, msgbuf, vf);
--
2.25.1
1
0
Hi,
Thanks for your patch.
Please subscribe to the kernel(a)openeuler.org and kernel-discuss(a)openeuler.org mailing lists before sending
patches; otherwise your patches will be blocked.
Please resend this patch and specify the right branch.
You are welcome to start a topic at the openEuler kernel SIG meeting by sending mail to kernel-discuss(a)openeuler.org.
---
You can join the openEuler kernel SIG WeChat group via the openEuler kernel assistant:
https://openeuler.gitee.io/kernel-portal/img/wechat/openEuler_kernel_helper…
On 2021/12/12 18:14, Jiacheng Shi wrote:
> From: billsjchw <billsjc(a)126.com>
>
> Variables allocated by kvmalloc should not be freed by kfree.
> Because they may be allocated by vmalloc.
> So we replace kfree with kvfree here.
>
> Signed-off-by: Jiacheng Shi <billsjc(a)sjtu.edu.cn>
> ---
> drivers/vfio/vfio_iommu_type1.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 5daceec48811..6811a85109aa 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -1150,7 +1150,7 @@ static int vfio_iova_dirty_log_clear(u64 __user *bitmap,
> }
>
> out:
> - kfree(bitmap_buffer);
> + kvfree(bitmap_buffer);
> return ret;
> }
>
>
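For background, the rule the quoted patch enforces: kvmalloc() may satisfy an
allocation from the slab or, for larger sizes, via vmalloc(), so the matching
release must always be kvfree(). A minimal self-contained sketch (hypothetical
bitmap user, not code from this driver):

#include <linux/mm.h>		/* kvmalloc, kvfree */
#include <linux/bitops.h>	/* BITS_TO_LONGS */

static int process_bitmap(size_t nbits)
{
	/* slab-backed for small sizes, vmalloc-backed for large ones */
	unsigned long *bitmap = kvmalloc(BITS_TO_LONGS(nbits) * sizeof(long),
					 GFP_KERNEL);

	if (!bitmap)
		return -ENOMEM;

	/* ... fill and consume the bitmap ... */

	kvfree(bitmap);	/* correct for both backings; kfree() is not */
	return 0;
}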
12 Dec '21
stable inclusion
category: bugfix
bugzilla: NA
CVE: NA
Line 6191 (#1) allocates a memory chunk for input with kmalloc().
Line 6213 (#3) frees the input before the function returns, while
line 6199 (#2) forgets to free it, which leads to a memory leak.
This bug affects all stable versions from 5.15.1 to 5.15.7.
We should kfree() input at line 6199 (#2).
6186 static int rtw_mp_QueryDrv(struct net_device *dev,
6187 struct iw_request_info *info,
6188 union iwreq_data *wrqu, char *extra)
6189 {
6191 char *input = kmalloc(wrqu->data.length, GFP_KERNEL);
// #1: kmalloc space
6195 if (!input)
6196 return -ENOMEM;
6198 if (copy_from_user(input, wrqu->data.pointer, wrqu->data.length))
6199 return -EFAULT; // #2: missing kfree
6213 kfree(input); // #3: kfree space
6214 return 0;
6215 }
Signed-off-by: Jianglei Nie <niejianglei2021(a)163.com>
---
drivers/staging/r8188eu/os_dep/ioctl_linux.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/staging/r8188eu/os_dep/ioctl_linux.c b/drivers/staging/r8188eu/os_dep/ioctl_linux.c
index 906a57eae1af..edc660f15436 100644
--- a/drivers/staging/r8188eu/os_dep/ioctl_linux.c
+++ b/drivers/staging/r8188eu/os_dep/ioctl_linux.c
@@ -6195,8 +6195,11 @@ static int rtw_mp_QueryDrv(struct net_device *dev,
if (!input)
return -ENOMEM;
- if (copy_from_user(input, wrqu->data.pointer, wrqu->data.length))
- return -EFAULT;
+ if (copy_from_user(input, wrqu->data.pointer, wrqu->data.length)) {
+ kfree(input);
+ return -EFAULT;
+ }
+
DBG_88E("%s:iwpriv in =%s\n", __func__, input);
qAutoLoad = strncmp(input, "autoload", 8); /* strncmp true is 0 */
--
2.25.1
12 Dec '21
stable inclusion
category: bugfix
bugzilla: NA
CVE: NA
Line 5968 (#1) allocates a memory chunk for input with kmalloc().
Line 5973 (#2), line 5989 (#4) and line 5994 (#5) free the input
before the function returns, while line 5986 (#3) forgets to free it,
which leads to a memory leak. This bug affects all stable
versions from 5.15 to 5.15.7.
We should kfree() input at line 5986 (#3).
5960 static int rtw_mp_pwrtrk(struct net_device *dev,
5961 struct iw_request_info *info,
5962 struct iw_point *wrqu, char *extra)
5963 {
5968 char *input = kmalloc(wrqu->length, GFP_KERNEL);
// #1: kmalloc space
5970 if (!input)
5971 return -ENOMEM;
5972 if (copy_from_user(input, wrqu->pointer, wrqu->length)) {
5973 kfree(input); // #2: kfree space
5974 return -EFAULT;
5975 }
5980 if (strncmp(input, "stop", 4) == 0) {
5981 enable = 0;
5982 sprintf(extra, "mp tx power tracking stop");
5983 } else if (sscanf(input, "ther =%d", &thermal)) {
5984 ret = Hal_SetThermalMeter(padapter, (u8)thermal);
5985 if (ret == _FAIL)
5986 return -EPERM; // #3: missing kfree
5987 sprintf(extra, "mp tx power tracking start,
target value =%d ok ", thermal);
5988 } else {
5989 kfree(input); // #4: kfree space
5990 return -EINVAL;
5991 }
5994 kfree(input); // #5: kfree space
6000 return 0;
6001 }
Signed-off-by: Jianglei Nie <niejianglei2021(a)163.com>
---
drivers/staging/r8188eu/os_dep/ioctl_linux.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/staging/r8188eu/os_dep/ioctl_linux.c b/drivers/staging/r8188eu/os_dep/ioctl_linux.c
index 0eccce57c63a..906a57eae1af 100644
--- a/drivers/staging/r8188eu/os_dep/ioctl_linux.c
+++ b/drivers/staging/r8188eu/os_dep/ioctl_linux.c
@@ -5982,8 +5982,10 @@ static int rtw_mp_pwrtrk(struct net_device *dev,
sprintf(extra, "mp tx power tracking stop");
} else if (sscanf(input, "ther =%d", &thermal)) {
ret = Hal_SetThermalMeter(padapter, (u8)thermal);
- if (ret == _FAIL)
+ if (ret == _FAIL) {
+ kfree(input);
return -EPERM;
+ }
sprintf(extra, "mp tx power tracking start, target value =%d ok ", thermal);
} else {
kfree(input);
--
2.25.1
[PATCH master] powercap: DTPM: Fix reference leak in cpuhp_dtpm_cpu_offline()
by Jianglei Nie 12 Dec '21
stable inclusion
category: bugfix
bugzilla: NA
CVE: NA
At line 153 (#1), cpufreq_cpu_get() increments the kobject reference
counter of the policy it returns on success. According to the
documentation, the policy returned by cpufreq_cpu_get() has to be
released with cpufreq_cpu_put() to balance its kobject reference
counter properly. Forgetting the cpufreq_cpu_put() operation
results in a reference leak. This bug affects all stable versions
from v5.15 to v5.15.7.
We can fix it by calling cpufreq_cpu_put() before the function
returns (#2, #3 and #4).
147 static int cpuhp_dtpm_cpu_offline(unsigned int cpu)
148 {
153 policy = cpufreq_cpu_get(cpu);
// #1: reference increment
155 if (!policy)
156 return 0;
158 pd = em_cpu_get(cpu);
159 if (!pd)
160 return -EINVAL; // #2: missing reference decrement
166 if (cpumask_weight(policy->cpus) != 1)
167 return 0; // #3: missing reference decrement
174 return 0; // #4: missing reference decrement
175 }
Signed-off-by: Jianglei Nie <niejianglei2021(a)163.com>
---
drivers/powercap/dtpm_cpu.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/drivers/powercap/dtpm_cpu.c b/drivers/powercap/dtpm_cpu.c
index 51c366938acd..6c94515b21ef 100644
--- a/drivers/powercap/dtpm_cpu.c
+++ b/drivers/powercap/dtpm_cpu.c
@@ -156,21 +156,25 @@ static int cpuhp_dtpm_cpu_offline(unsigned int cpu)
return 0;
pd = em_cpu_get(cpu);
- if (!pd)
+ if (!pd) {
+ cpufreq_cpu_put(policy);
return -EINVAL;
+ }
dtpm = per_cpu(dtpm_per_cpu, cpu);
power_sub(dtpm, pd);
- if (cpumask_weight(policy->cpus) != 1)
+ if (cpumask_weight(policy->cpus) != 1) {
+ cpufreq_cpu_put(policy);
return 0;
+ }
for_each_cpu(cpu, policy->related_cpus)
per_cpu(dtpm_per_cpu, cpu) = NULL;
dtpm_unregister(dtpm);
-
+ cpufreq_cpu_put(policy);
return 0;
}
--
2.25.1
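As a side note, the balance this fix restores can be shown in a minimal
sketch — a hypothetical inspection function, not the dtpm code itself —
where every successful cpufreq_cpu_get() is matched by cpufreq_cpu_put()
on each exit path:

#include <linux/cpufreq.h>

static int inspect_policy(unsigned int cpu)
{
	struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
	int ret = 0;

	if (!policy)
		return 0;	/* no reference taken, nothing to put */

	if (cpumask_weight(policy->cpus) != 1)
		ret = -EBUSY;	/* hypothetical error condition */

	cpufreq_cpu_put(policy);	/* balances the get on every path */
	return ret;
}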
[PATCH master] security:trusted_tpm2: Fix memory leak in tpm2_key_encode()
by Jianglei Nie 12 Dec '21
openEuler inclusion
category: bugfix
bugzilla: NA
CVE: NA
Line 36 (#1) allocates a memory chunk for scratch with kmalloc(), but
it is never freed anywhere in the function, which leads to a memory
leak.
We should kfree() scratch before the function returns (#2, #3 and #4).
31 static int tpm2_key_encode(struct trusted_key_payload *payload,
32 struct trusted_key_options *options,
33 u8 *src, u32 len)
34 {
36 u8 *scratch = kmalloc(SCRATCH_SIZE, GFP_KERNEL);
// #1: kmalloc space
37 u8 *work = scratch, *work1;
50 if (!scratch)
51 return -ENOMEM;
56 if (options->blobauth_len == 0) {
60 if (WARN(IS_ERR(w), "BUG: Boolean failed to encode"))
61 return PTR_ERR(w); // #2: missing kfree
63 }
71 if (WARN(work - scratch + pub_len + priv_len + 14 > SCRATCH_SIZE,
72 "BUG: scratch buffer is too small"))
73 return -EINVAL; // #3: missing kfree
// #4: missing kfree: scratch is never used afterwards.
82 if (WARN(IS_ERR(work1), "BUG: ASN.1 encoder failed"))
83 return PTR_ERR(work1);
85 return work1 - payload->blob;
86 }
Signed-off-by: Jianglei Nie <niejianglei2021(a)163.com>
---
security/keys/trusted-keys/trusted_tpm2.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/security/keys/trusted-keys/trusted_tpm2.c b/security/keys/trusted-keys/trusted_tpm2.c
index 0165da386289..3408a74c855f 100644
--- a/security/keys/trusted-keys/trusted_tpm2.c
+++ b/security/keys/trusted-keys/trusted_tpm2.c
@@ -57,8 +57,10 @@ static int tpm2_key_encode(struct trusted_key_payload *payload,
unsigned char bool[3], *w = bool;
/* tag 0 is emptyAuth */
w = asn1_encode_boolean(w, w + sizeof(bool), true);
- if (WARN(IS_ERR(w), "BUG: Boolean failed to encode"))
+ if (WARN(IS_ERR(w), "BUG: Boolean failed to encode")) {
+ kfree(scratch);
return PTR_ERR(w);
+ }
work = asn1_encode_tag(work, end_work, 0, bool, w - bool);
}
@@ -69,9 +71,12 @@ static int tpm2_key_encode(struct trusted_key_payload *payload,
* trigger, so if it does there's something nefarious going on
*/
if (WARN(work - scratch + pub_len + priv_len + 14 > SCRATCH_SIZE,
- "BUG: scratch buffer is too small"))
+ "BUG: scratch buffer is too small")) {
+ kfree(scratch);
return -EINVAL;
+ }
+ kfree(scratch);
work = asn1_encode_integer(work, end_work, options->keyhandle);
work = asn1_encode_octet_string(work, end_work, pub, pub_len);
work = asn1_encode_octet_string(work, end_work, priv, priv_len);
--
2.25.1
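As a general observation, several leak fixes in this series add a kfree() to
each early-return path individually; kernel code often funnels all failure
paths through a single unwind label instead, so the buffer is freed in
exactly one place after its last use. A minimal sketch with hypothetical
names (encode_blob(), build_payload(), copy_encoded() and SCRATCH_SIZE are
illustrative, not from any of the patches):

#include <linux/slab.h>

#define SCRATCH_SIZE 4096	/* hypothetical */

/* hypothetical helpers, for illustration only */
bool build_payload(u8 *buf, size_t len);
bool copy_encoded(u8 *dst, size_t dst_len, const u8 *src);

static int encode_blob(u8 *dst, size_t dst_len)
{
	u8 *scratch = kmalloc(SCRATCH_SIZE, GFP_KERNEL);
	int ret = 0;

	if (!scratch)
		return -ENOMEM;

	if (!build_payload(scratch, SCRATCH_SIZE)) {
		ret = -EINVAL;
		goto out;	/* every failure path funnels through one kfree */
	}

	if (!copy_encoded(dst, dst_len, scratch)) {
		ret = -EIO;
		goto out;
	}

out:
	kfree(scratch);	/* freed once, after the last use of scratch */
	return ret;
}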
openEuler inclusion
category: bugfix
bugzilla: NA
CVE: NA
Line 1169 (#3) allocates a memory chunk for victim_name with kmalloc(),
but when the function returns at line 1184 (#4), the victim_name
allocated at line 1169 (#3) is not freed, which leads to a memory leak.
A similar snippet of code earlier in this function allocates a memory
chunk for victim_name at line 1104 (#1) and releases the memory
at line 1116 (#2).
We should kfree() victim_name when the return value of backref_in_log()
is less than zero, before the function returns at line 1184 (#4).
1057 static inline int __add_inode_ref(struct btrfs_trans_handle *trans,
1058 struct btrfs_root *root,
1059 struct btrfs_path *path,
1060 struct btrfs_root *log_root,
1061 struct btrfs_inode *dir,
1062 struct btrfs_inode *inode,
1063 u64 inode_objectid, u64 parent_objectid,
1064 u64 ref_index, char *name, int namelen,
1065 int *search_done)
1066 {
1104 victim_name = kmalloc(victim_name_len, GFP_NOFS);
// #1: kmalloc (victim_name-1)
1105 if (!victim_name)
1106 return -ENOMEM;
1112 ret = backref_in_log(log_root, &search_key,
1113 parent_objectid, victim_name,
1114 victim_name_len);
1115 if (ret < 0) {
1116 kfree(victim_name); // #2: kfree (victim_name-1)
1117 return ret;
1118 } else if (!ret) {
1169 victim_name = kmalloc(victim_name_len, GFP_NOFS);
// #3: kmalloc (victim_name-2)
1170 if (!victim_name)
1171 return -ENOMEM;
1180 ret = backref_in_log(log_root, &search_key,
1181 parent_objectid, victim_name,
1182 victim_name_len);
1183 if (ret < 0) {
1184 return ret; // #4: missing kfree (victim_name-2)
1185 } else if (!ret) {
1241 return 0;
1242 }
Signed-off-by: Jianglei Nie <niejianglei2021(a)163.com>
---
fs/btrfs/tree-log.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 8ab33caf016f..d373fec55521 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -1181,6 +1181,7 @@ static inline int __add_inode_ref(struct btrfs_trans_handle *trans,
parent_objectid, victim_name,
victim_name_len);
if (ret < 0) {
+ kfree(victim_name);
return ret;
} else if (!ret) {
ret = -ENOENT;
--
2.25.1
12 Dec '21
openEuler inclusion
category: bugfix
bugzilla: NA
CVE: NA
Line 715 (#1) calls mempool_alloc() to allocate an element
from a specific memory pool. When an error occurs, line 727 (#2)
returns the element to the pool, but the path through line 731 (#3)
does not, which leads to a memory leak.
We can fix it by calling mempool_free() when cbfn is not
NULL, before the function returns 0 at line 732 (#3).
705 static int
706 csio_wr_eq_destroy(struct csio_hw *hw, void *priv, int eq_idx,
707 void (*cbfn) (struct csio_hw *, struct csio_mb *))
708 {
710 struct csio_mb *mbp;
715 mbp = mempool_alloc(hw->mb_mempool, GFP_ATOMIC);
// #1: allocate memory pool
716 if (!mbp)
717 return -ENOMEM;
725 rv = csio_mb_issue(hw, mbp);
726 if (rv != 0) {
727 mempool_free(mbp, hw->mb_mempool);
// #2: free memory pool
728 return rv;
729 }
731 if (cbfn != NULL)
732 return 0; // #3: missing free
734 return csio_wr_eq_destroy_rsp(hw, mbp, eq_idx);
735 }
Signed-off-by: Jianglei Nie <niejianglei2021(a)163.com>
---
drivers/scsi/csiostor/csio_wr.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/scsi/csiostor/csio_wr.c b/drivers/scsi/csiostor/csio_wr.c
index fe0355c964bc..7dcc4fda0483 100644
--- a/drivers/scsi/csiostor/csio_wr.c
+++ b/drivers/scsi/csiostor/csio_wr.c
@@ -728,8 +728,10 @@ csio_wr_eq_destroy(struct csio_hw *hw, void *priv, int eq_idx,
return rv;
}
- if (cbfn != NULL)
+ if (cbfn != NULL) {
+ mempool_free(mbp, hw->mb_mempool);
return 0;
+ }
return csio_wr_eq_destroy_rsp(hw, mbp, eq_idx);
}
--
2.25.1
[PATCH master] drm/amdgpu: Fix reference leak in psp_xgmi_reflect_topology_info()
by Jianglei Nie 12 Dec '21
openEuler inclusion
category: bugfix
bugzilla: NA
CVE: NA
At line 1138 (#1), amdgpu_get_xgmi_hive() increases the kobject reference
counter of the hive it returns. The hive returned by
amdgpu_get_xgmi_hive() should be released with
amdgpu_put_xgmi_hive() to balance its kobject reference counter properly.
Forgetting the amdgpu_put_xgmi_hive() operation results in a
reference leak.
We can fix it by calling amdgpu_put_xgmi_hive() before the end of the
function (#2).
1128 static void psp_xgmi_reflect_topology_info(struct psp_context *psp,
1129 struct psp_xgmi_node_info node_info)
1130 {
1138 hive = amdgpu_get_xgmi_hive(psp->adev);
// #1: kzalloc space reference increment
1139 list_for_each_entry(mirror_adev, &hive->device_list, gmc.xgmi.head) {
1140 struct psp_xgmi_topology_info *mirror_top_info;
1141 int j;
1143 if (mirror_adev->gmc.xgmi.node_id != dst_node_id)
1144 continue;
1146 mirror_top_info = &mirror_adev->psp.xgmi_context.top_info;
1147 for (j = 0; j < mirror_top_info->num_nodes; j++) {
1148 if (mirror_top_info->nodes[j].node_id != src_node_id)
1149 continue;
1151 mirror_top_info->nodes[j].num_hops = dst_num_hops;
1157 if (dst_num_links)
1158 mirror_top_info->nodes[j].num_links = dst_num_links;
1160 break;
1161 }
1163 break;
1164 }
// #2: missing reference decrement
1165 }
Signed-off-by: Jianglei Nie <niejianglei2021(a)163.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index c641f84649d6..f6362047ed71 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -1162,6 +1162,7 @@ static void psp_xgmi_reflect_topology_info(struct psp_context *psp,
break;
}
+ amdgpu_put_xgmi_hive(hive);
}
int psp_xgmi_get_topology_info(struct psp_context *psp,
--
2.25.1
12 Dec '21
mainline inclusion
from mainline-v5.16-rc7
category: bugfix
commit: c56c96303e9289cc34716b1179597b6f470833de
bugzilla: NA
CVE: NA
------------------------
At line 800 (#1), nfp_cpp_area_alloc() allocates and initializes a
CPP area structure. But at line 807 (#2), when the cache allocation
fails, this CPP area structure is not freed, which results in a
memory leak.
We can fix it by freeing the CPP area when the cache allocation
fails (#2).
792 int nfp_cpp_area_cache_add(struct nfp_cpp *cpp, size_t size)
793 {
794 struct nfp_cpp_area_cache *cache;
795 struct nfp_cpp_area *area;
800 area = nfp_cpp_area_alloc(cpp, NFP_CPP_ID(7, NFP_CPP_ACTION_RW, 0),
801 0, size);
// #1: allocates and initializes
802 if (!area)
803 return -ENOMEM;
805 cache = kzalloc(sizeof(*cache), GFP_KERNEL);
806 if (!cache)
807 return -ENOMEM; // #2: missing free
817 return 0;
818 }
Fixes: 4cb584e0ee7d ("nfp: add CPP access core")
Signed-off-by: Jianglei Nie <niejianglei2021(a)163.com>
Acked-by: Simon Horman <simon.horman(a)corigine.com>
Link: https://lore.kernel.org/r/20211209061511.122535-1-niejianglei2021@163.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Jianglei Nie <niejianglei2021(a)163.com>
---
drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c b/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c
index d7ac0307797f..34c0d2ddf9ef 100644
--- a/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c
+++ b/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c
@@ -803,8 +803,10 @@ int nfp_cpp_area_cache_add(struct nfp_cpp *cpp, size_t size)
return -ENOMEM;
cache = kzalloc(sizeof(*cache), GFP_KERNEL);
- if (!cache)
+ if (!cache) {
+ nfp_cpp_area_free(area);
return -ENOMEM;
+ }
cache->id = 0;
cache->addr = 0;
--
2.25.1
Hi,
Thanks for your patchset.
Please subscribe to the kernel(a)openeuler.org and kernel-discuss(a)openeuler.org mailing lists before sending
patches; otherwise your patches will be blocked.
You are welcome to start a topic at the openEuler kernel SIG meeting by sending mail to kernel-discuss(a)openeuler.org.
---
You can join the openEuler kernel SIG WeChat group via the openEuler kernel assistant:
https://openeuler.gitee.io/kernel-portal/img/wechat/openEuler_kernel_helper…
On 2021/12/10 16:09, Jianglei Nie wrote:
> Line 1169 (#3) allocates a memory chunk for victim_name with kmalloc(),
> but when the function returns at line 1184 (#4), the victim_name
> allocated at line 1169 (#3) is not freed, which leads to a memory leak.
> A similar snippet of code earlier in this function allocates a memory
> chunk for victim_name at line 1104 (#1) and releases the memory
> at line 1116 (#2).
>
> We should kfree() victim_name when the return value of backref_in_log()
> is less than zero, before the function returns at line 1184 (#4).
>
> 1057 static inline int __add_inode_ref(struct btrfs_trans_handle *trans,
> 1058 struct btrfs_root *root,
> 1059 struct btrfs_path *path,
> 1060 struct btrfs_root *log_root,
> 1061 struct btrfs_inode *dir,
> 1062 struct btrfs_inode *inode,
> 1063 u64 inode_objectid, u64 parent_objectid,
> 1064 u64 ref_index, char *name, int namelen,
> 1065 int *search_done)
> 1066 {
>
> 1104 victim_name = kmalloc(victim_name_len, GFP_NOFS);
> // #1: kmalloc (victim_name-1)
> 1105 if (!victim_name)
> 1106 return -ENOMEM;
>
> 1112 ret = backref_in_log(log_root, &search_key,
> 1113 parent_objectid, victim_name,
> 1114 victim_name_len);
> 1115 if (ret < 0) {
> 1116 kfree(victim_name); // #2: kfree (victim_name-1)
> 1117 return ret;
> 1118 } else if (!ret) {
>
> 1169 victim_name = kmalloc(victim_name_len, GFP_NOFS);
> // #3: kmalloc (victim_name-2)
> 1170 if (!victim_name)
> 1171 return -ENOMEM;
>
> 1180 ret = backref_in_log(log_root, &search_key,
> 1181 parent_objectid, victim_name,
> 1182 victim_name_len);
> 1183 if (ret < 0) {
> 1184 return ret; // #4: missing kfree (victim_name-2)
> 1185 } else if (!ret) {
>
> 1241 return 0;
> 1242 }
>
> Signed-off-by: Jianglei Nie <niejianglei2021(a)163.com>
> ---
> fs/btrfs/tree-log.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
> index 8ab33caf016f..d373fec55521 100644
> --- a/fs/btrfs/tree-log.c
> +++ b/fs/btrfs/tree-log.c
> @@ -1181,6 +1181,7 @@ static inline int __add_inode_ref(struct btrfs_trans_handle *trans,
> parent_objectid, victim_name,
> victim_name_len);
> if (ret < 0) {
> + kfree(victim_name);
> return ret;
> } else if (!ret) {
> ret = -ENOENT;
>
[PATCH openEuler-5.10 1/7] bfq: Remove merged request already in bfq_requests_merged()
by Zheng Zengkai 10 Dec '21
From: Jan Kara <jack(a)suse.cz>
mainline inclusion
from mainline-v5.14-rc1
commit a921c655f2033dd1ce1379128efe881dda23ea37
category: bugfix
bugzilla: 185777 https://gitee.com/openeuler/kernel/issues/I4LM14
CVE: NA
---------------------------
Currently, bfq does very little in bfq_requests_merged() and handles all
the request cleanup in bfq_finish_requeue_request() called from
blk_mq_free_request(). That is currently safe only because
blk_mq_free_request() is called shortly after bfq_requests_merged()
while bfqd->lock is still held. However to fix a lock inversion between
bfqd->lock and ioc->lock, we need to call blk_mq_free_request() after
dropping bfqd->lock. That would mean that already merged request could
be seen by other processes inside bfq queues and possibly dispatched to
the device which is wrong. So move cleanup of the request from
bfq_finish_requeue_request() to bfq_requests_merged().
Acked-by: Paolo Valente <paolo.valente(a)linaro.org>
Signed-off-by: Jan Kara <jack(a)suse.cz>
Link: https://lore.kernel.org/r/20210623093634.27879-2-jack@suse.cz
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
conflict: in bfq_finish_requeue_request(), the hulk branch has the line
atomic_dec(&rq->mq_hctx->elevator_queued); which conflicts;
Signed-off-by: zhangwensheng <zhangwensheng5(a)huawei.com>
Reviewed-by: qiulaibin <qiulaibin(a)huawei.com>
Reviewed-by: Jason Yan <yanaijie(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
block/bfq-iosched.c | 41 +++++++++++++----------------------------
1 file changed, 13 insertions(+), 28 deletions(-)
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index fd3c23d516b8..27e01b4cd528 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -2326,7 +2326,7 @@ static void bfq_requests_merged(struct request_queue *q, struct request *rq,
*next_bfqq = bfq_init_rq(next);
if (!bfqq)
- return;
+ goto remove;
/*
* If next and rq belong to the same bfq_queue and next is older
@@ -2349,6 +2349,14 @@ static void bfq_requests_merged(struct request_queue *q, struct request *rq,
bfqq->next_rq = rq;
bfqg_stats_update_io_merged(bfqq_group(bfqq), next->cmd_flags);
+remove:
+ /* Merged request may be in the IO scheduler. Remove it. */
+ if (!RB_EMPTY_NODE(&next->rb_node)) {
+ bfq_remove_request(next->q, next);
+ if (next_bfqq)
+ bfqg_stats_update_io_remove(bfqq_group(next_bfqq),
+ next->cmd_flags);
+ }
}
/* Must be called with bfqq != NULL */
@@ -5901,6 +5909,7 @@ static void bfq_finish_requeue_request(struct request *rq)
{
struct bfq_queue *bfqq = RQ_BFQQ(rq);
struct bfq_data *bfqd;
+ unsigned long flags;
/*
* rq either is not associated with any icq, or is an already
@@ -5918,40 +5927,16 @@ static void bfq_finish_requeue_request(struct request *rq)
rq->io_start_time_ns,
rq->cmd_flags);
+ spin_lock_irqsave(&bfqd->lock, flags);
if (likely(rq->rq_flags & RQF_STARTED)) {
- unsigned long flags;
-
- spin_lock_irqsave(&bfqd->lock, flags);
-
if (rq == bfqd->waited_rq)
bfq_update_inject_limit(bfqd, bfqq);
bfq_completed_request(bfqq, bfqd);
- bfq_finish_requeue_request_body(bfqq);
atomic_dec(&rq->mq_hctx->elevator_queued);
-
- spin_unlock_irqrestore(&bfqd->lock, flags);
- } else {
- /*
- * Request rq may be still/already in the scheduler,
- * in which case we need to remove it (this should
- * never happen in case of requeue). And we cannot
- * defer such a check and removal, to avoid
- * inconsistencies in the time interval from the end
- * of this function to the start of the deferred work.
- * This situation seems to occur only in process
- * context, as a consequence of a merge. In the
- * current version of the code, this implies that the
- * lock is held.
- */
-
- if (!RB_EMPTY_NODE(&rq->rb_node)) {
- bfq_remove_request(rq->q, rq);
- bfqg_stats_update_io_remove(bfqq_group(bfqq),
- rq->cmd_flags);
- }
- bfq_finish_requeue_request_body(bfqq);
}
+ bfq_finish_requeue_request_body(bfqq);
+ spin_unlock_irqrestore(&bfqd->lock, flags);
/*
* Reset private fields. In case of a requeue, this allows
--
2.20.1
10 Dec '21
Addresses printed with %p in the kernel expose kernel address information, which is unsafe.
Since Linux v4.15, %p has therefore printed a hashed value instead of the real address.
This patchset adds the no_hash_pointers startup parameter, which disables that restriction so that %p prints the actual kernel address.
I applied these patches together with the associated test modules, recompiled, and all the tests passed.
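For illustration, a minimal throwaway test module sketching the behavior
described above (assumes the patchset is applied for the no_hash_pointers
case; %px is the standard specifier that always prints the raw address,
while %p prints a hash by default):

#include <linux/module.h>
#include <linux/printk.h>

static int demo;

static int __init ptr_demo_init(void)
{
	/* hashed by default; raw when booted with no_hash_pointers */
	pr_info("via %%p:  %p\n", &demo);
	/* always the raw address -- use only for debugging */
	pr_info("via %%px: %px\n", &demo);
	return 0;
}

static void __exit ptr_demo_exit(void)
{
}

module_init(ptr_demo_init);
module_exit(ptr_demo_exit);
MODULE_LICENSE("GPL");

Booting with 'no_hash_pointers' on the kernel command line makes the %p
line match the %px line.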
Tobin C. Harding (3):
lib/test_printf: Add empty module_exit function
kselftest: Add test module framework header
lib: Use new kselftest header
Timur Tabi (3):
kselftest: add support for skipped tests
lib/vsprintf: no_hash_pointers prints all addresses as unhashed
lib: use KSTM_MODULE_GLOBALS macro in kselftest drivers
.../admin-guide/kernel-parameters.txt | 15 +++
Documentation/dev-tools/kselftest.rst | 94 +++++++++++++++++-
lib/test_bitmap.c | 23 +----
lib/test_printf.c | 29 +++---
lib/vsprintf.c | 36 ++++++-
tools/testing/selftests/kselftest_module.h | 54 ++++++++++
6 files changed, 215 insertions(+), 36 deletions(-)
create mode 100644 tools/testing/selftests/kselftest_module.h
--
2.30.0
[PATCH openEuler-1.0-LTS] block, bfq: move bfqq to root_group if parent group is offlined
by Yang Yingliang 10 Dec '21
From: Yu Kuai <yukuai3(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: 185863
CVE: NA
---------------------------
Our test report a uaf problem:
[ 154.237639] ==================================================================
[ 154.239896] BUG: KASAN: use-after-free in __bfq_deactivate_entity+0x25/0x290
[ 154.241910] Read of size 1 at addr ffff88824501f7b8 by task rmmod/2447
[ 154.244248] CPU: 7 PID: 2447 Comm: rmmod Not tainted 4.19.90+ #1
[ 154.245962] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[ 154.248184] Call Trace:
[ 154.248532] dump_stack+0x7a/0xac
[ 154.248995] print_address_description+0x6c/0x237
[ 154.249649] ? __bfq_deactivate_entity+0x25/0x290
[ 154.250297] kasan_report.cold+0x88/0x29c
[ 154.250853] __bfq_deactivate_entity+0x25/0x290
[ 154.251483] bfq_pd_offline+0x13e/0x790
[ 154.252017] ? blk_mq_freeze_queue_wait+0x165/0x180
[ 154.252687] ? bfq_reparent_leaf_entity+0xa0/0xa0
[ 154.253333] ? bfq_put_queue+0x12c/0x1e0
[ 154.253877] ? kmem_cache_free+0x8e/0x1e0
[ 154.254433] ? hrtimer_active+0x53/0xa0
[ 154.254966] ? hrtimer_try_to_cancel+0x6d/0x1c0
[ 154.255576] ? __hrtimer_get_remaining+0xf0/0xf0
[ 154.256197] ? __bfq_deactivate_entity+0x11b/0x290
[ 154.256843] blkcg_deactivate_policy+0x106/0x1f0
[ 154.257464] bfq_exit_queue+0xf1/0x110
[ 154.257975] blk_mq_exit_sched+0x114/0x140
[ 154.258530] elevator_exit+0x9a/0xa0
[ 154.259023] blk_exit_queue+0x3d/0x70
[ 154.259523] blk_cleanup_queue+0x160/0x1e0
[ 154.260099] null_del_dev+0xda/0x1f0 [null_blk]
[ 154.260723] null_exit+0x5f/0xab [null_blk]
[ 154.261298] __x64_sys_delete_module+0x20e/0x2f0
[ 154.261931] ? __ia32_sys_delete_module+0x2f0/0x2f0
[ 154.262597] ? exit_to_usermode_loop+0x45/0xe0
[ 154.263219] do_syscall_64+0x73/0x280
[ 154.263731] ? page_fault+0x8/0x30
[ 154.264197] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 154.264882] RIP: 0033:0x7f033bf63acb
[ 154.265370] Code: 73 01 c3 48 8b 0d bd 33 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00
00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8d 33 0c 00 f7 d8 64 89 01 48
[ 154.267880] RSP: 002b:00007ffc7fe52548 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[ 154.268900] RAX: ffffffffffffffda RBX: 00005583e2b8e530 RCX: 00007f033bf63acb
[ 154.269865] RDX: 000000000000000a RSI: 0000000000000800 RDI: 00005583e2b8e598
[ 154.270837] RBP: 00007ffc7fe525a8 R08: 0000000000000000 R09: 0000000000000000
[ 154.271802] R10: 00007f033bfd7ac0 R11: 0000000000000206 R12: 00007ffc7fe52770
[ 154.272763] R13: 00007ffc7fe536f8 R14: 00005583e2b8d2a0 R15: 00005583e2b8e530
[ 154.273939] Allocated by task 2350:
[ 154.274419] kasan_kmalloc+0xc6/0xe0
[ 154.274916] kmem_cache_alloc_node_trace+0x119/0x240
[ 154.275594] bfq_pd_alloc+0x50/0x510
[ 154.276081] blkg_alloc+0x237/0x310
[ 154.276557] blkg_create+0x48a/0x5e0
[ 154.277044] blkg_lookup_create+0x144/0x1c0
[ 154.277614] generic_make_request_checks+0x5cf/0xad0
[ 154.278290] generic_make_request+0xdd/0x6c0
[ 154.278877] submit_bio+0xaa/0x250
[ 154.279342] mpage_readpages+0x2a2/0x3b0
[ 154.279878] read_pages+0xdf/0x3a0
[ 154.280343] __do_page_cache_readahead+0x27c/0x2a0
[ 154.280989] ondemand_readahead+0x275/0x460
[ 154.281556] generic_file_read_iter+0xc4e/0x1790
[ 154.282182] aio_read+0x174/0x260
[ 154.282635] io_submit_one+0x7d4/0x14b0
[ 154.283164] __x64_sys_io_submit+0x102/0x230
[ 154.283749] do_syscall_64+0x73/0x280
[ 154.284250] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 154.285159] Freed by task 2315:
[ 154.285588] __kasan_slab_free+0x12f/0x180
[ 154.286150] kfree+0xab/0x1d0
[ 154.286561] blkg_free.part.0+0x4a/0xe0
[ 154.287089] rcu_process_callbacks+0x424/0x6d0
[ 154.287689] __do_softirq+0x10d/0x370
[ 154.288395] The buggy address belongs to the object at ffff88824501f700
which belongs to the cache kmalloc-2048 of size 2048
[ 154.290083] The buggy address is located 184 bytes inside of
2048-byte region [ffff88824501f700, ffff88824501ff00)
[ 154.291661] The buggy address belongs to the page:
[ 154.292306] page:ffffea0009140600 count:1 mapcount:0 mapping:ffff88824bc0e800 index:0x0 compound_mapcount: 0
[ 154.293610] flags: 0x17ffffc0008100(slab|head)
[ 154.294211] raw: 0017ffffc0008100 ffffea000896da00 0000000200000002 ffff88824bc0e800
[ 154.295247] raw: 0000000000000000 00000000800f000f 00000001ffffffff 0000000000000000
[ 154.296294] page dumped because: kasan: bad access detected
[ 154.297261] Memory state around the buggy address:
[ 154.297913] ffff88824501f680: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 154.298884] ffff88824501f700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 154.299858] >ffff88824501f780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 154.300824] ^
[ 154.301505] ffff88824501f800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 154.302479] ffff88824501f880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 154.303459] ==================================================================
This is because when a bfq_group is offlined, bfq_queues that are not in
the active tree keep parents (bfqq->entity.parent) that still point to the
offlined bfq_group. After some I/Os are issued to such bfq_queues,
the offlined bfq_group is reinserted into the service tree.
Fix the problem by moving the bfq_queue to root_group if its parent
is found to be offlined.
Fixes: e21b7a0b9887 ("block, bfq: add full hierarchical scheduling and cgroups support")
Signed-off-by: Yu Kuai <yukuai3(a)huawei.com>
Reviewed-by: Hou Tao <houtao1(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
block/bfq-cgroup.c | 14 ++++++++++----
block/bfq-wf2q.c | 9 +++++++++
2 files changed, 19 insertions(+), 4 deletions(-)
diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c
index 78cfd008b89d7..73b82a5c03717 100644
--- a/block/bfq-cgroup.c
+++ b/block/bfq-cgroup.c
@@ -556,6 +556,7 @@ void bfq_bfqq_move(struct bfq_data *bfqd, struct bfq_queue *bfqq,
struct bfq_group *bfqg)
{
struct bfq_entity *entity = &bfqq->entity;
+ struct bfq_group *old_parent = bfqq_group(bfqq);
/*
* Get extra reference to prevent bfqq from being freed in
@@ -577,17 +578,21 @@ void bfq_bfqq_move(struct bfq_data *bfqd, struct bfq_queue *bfqq,
bfq_deactivate_bfqq(bfqd, bfqq, false, false);
else if (entity->on_st)
bfq_put_idle_entity(bfq_entity_service_tree(entity), entity);
- bfqg_and_blkg_put(bfqq_group(bfqq));
entity->parent = bfqg->my_entity;
entity->sched_data = &bfqg->sched_data;
/* pin down bfqg and its associated blkg */
bfqg_and_blkg_get(bfqg);
- if (bfq_bfqq_busy(bfqq)) {
- bfq_pos_tree_add_move(bfqd, bfqq);
+ /*
+ * Don't leave the bfqq->pos_root to old bfqg, since the ref to old
+ * bfqg will be released and the bfqg might be freed.
+ */
+ bfq_pos_tree_add_move(bfqd, bfqq);
+ bfqg_and_blkg_put(old_parent);
+
+ if (bfq_bfqq_busy(bfqq))
bfq_activate_bfqq(bfqd, bfqq);
- }
if (!bfqd->in_service_queue && !bfqd->rq_in_driver)
bfq_schedule_dispatch(bfqd);
@@ -860,6 +865,7 @@ static void bfq_pd_offline(struct blkg_policy_data *pd)
put_async_queues:
bfq_put_async_queues(bfqd, bfqg);
+ pd->plid = BLKCG_MAX_POLS;
spin_unlock_irqrestore(&bfqd->lock, flags);
/*
diff --git a/block/bfq-wf2q.c b/block/bfq-wf2q.c
index 316a1c2d1b610..e830715fe15d6 100644
--- a/block/bfq-wf2q.c
+++ b/block/bfq-wf2q.c
@@ -1684,6 +1684,15 @@ void bfq_del_bfqq_busy(struct bfq_data *bfqd, struct bfq_queue *bfqq,
*/
void bfq_add_bfqq_busy(struct bfq_data *bfqd, struct bfq_queue *bfqq)
{
+#ifdef CONFIG_BFQ_GROUP_IOSCHED
+ /* If parent group is offlined, move the bfqq to root group */
+ if (bfqq->entity.parent) {
+ struct bfq_group *bfqg = bfq_bfqq_to_bfqg(bfqq);
+
+ if (bfqg->pd.plid >= BLKCG_MAX_POLS)
+ bfq_bfqq_move(bfqd, bfqq, bfqd->root_group);
+ }
+#endif
bfq_log_bfqq(bfqd, bfqq, "add to busy");
bfq_activate_bfqq(bfqd, bfqq);
--
2.25.1
Hello!
The openEuler Kernel SIG invites you to a Zoom meeting (auto-recorded) to be held at 2021-12-17 14:00.
Subject: openEuler Kernel Tech Sharing No. 16 & biweekly meeting
Agenda:
14:10-15:30 openEuler Kernel Tech Sharing No. 16 - an introduction to openEuler kernel live patching
Meeting link: https://us06web.zoom.us/j/88618073121?pwd=V3JyZG5IOWRVWENieDJhNndpV2x5UT09
Note: You are advised to change your participant name after joining the meeting; you may also use your ID at gitee.com.
More information: https://openeuler.org/en/ (Chinese: https://openeuler.org/zh/)
[PATCH openEuler-1.0-LTS 1/2] io_uring: fix soft lockup when call __io_remove_buffers
by Yang Yingliang 09 Dec '21
From: Ye Bin <yebin10(a)huawei.com>
mainline inclusion
from mainline-v5.16-rc3
commit 1d0254e6b47e73222fd3d6ae95cccbaafe5b3ecf
category: bugfix
bugzilla: 185746
CVE: NA
-----------------------------------------------
I hit the following issue:
[ 567.094140] __io_remove_buffers: [1]start ctx=0xffff8881067bf000 bgid=65533 buf=0xffff8881fefe1680
[ 594.360799] watchdog: BUG: soft lockup - CPU#2 stuck for 26s! [kworker/u32:5:108]
[ 594.364987] Modules linked in:
[ 594.365405] irq event stamp: 604180238
[ 594.365906] hardirqs last enabled at (604180237): [<ffffffff93fec9bd>] _raw_spin_unlock_irqrestore+0x2d/0x50
[ 594.367181] hardirqs last disabled at (604180238): [<ffffffff93fbbadb>] sysvec_apic_timer_interrupt+0xb/0xc0
[ 594.368420] softirqs last enabled at (569080666): [<ffffffff94200654>] __do_softirq+0x654/0xa9e
[ 594.369551] softirqs last disabled at (569080575): [<ffffffff913e1d6a>] irq_exit_rcu+0x1ca/0x250
[ 594.370692] CPU: 2 PID: 108 Comm: kworker/u32:5 Tainted: G L 5.15.0-next-20211112+ #88
[ 594.371891] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
[ 594.373604] Workqueue: events_unbound io_ring_exit_work
[ 594.374303] RIP: 0010:_raw_spin_unlock_irqrestore+0x33/0x50
[ 594.375037] Code: 48 83 c7 18 53 48 89 f3 48 8b 74 24 10 e8 55 f5 55 fd 48 89 ef e8 ed a7 56 fd 80 e7 02 74 06 e8 43 13 7b fd fb bf 01 00 00 00 <e8> f8 78 474
[ 594.377433] RSP: 0018:ffff888101587a70 EFLAGS: 00000202
[ 594.378120] RAX: 0000000024030f0d RBX: 0000000000000246 RCX: 1ffffffff2f09106
[ 594.379053] RDX: 0000000000000000 RSI: ffffffff9449f0e0 RDI: 0000000000000001
[ 594.379991] RBP: ffffffff9586cdc0 R08: 0000000000000001 R09: fffffbfff2effcab
[ 594.380923] R10: ffffffff977fe557 R11: fffffbfff2effcaa R12: ffff8881b8f3def0
[ 594.381858] R13: 0000000000000246 R14: ffff888153a8b070 R15: 0000000000000000
[ 594.382787] FS: 0000000000000000(0000) GS:ffff888399c00000(0000) knlGS:0000000000000000
[ 594.383851] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 594.384602] CR2: 00007fcbe71d2000 CR3: 00000000b4216000 CR4: 00000000000006e0
[ 594.385540] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 594.386474] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 594.387403] Call Trace:
[ 594.387738] <TASK>
[ 594.388042] find_and_remove_object+0x118/0x160
[ 594.389321] delete_object_full+0xc/0x20
[ 594.389852] kfree+0x193/0x470
[ 594.390275] __io_remove_buffers.part.0+0xed/0x147
[ 594.390931] io_ring_ctx_free+0x342/0x6a2
[ 594.392159] io_ring_exit_work+0x41e/0x486
[ 594.396419] process_one_work+0x906/0x15a0
[ 594.399185] worker_thread+0x8b/0xd80
[ 594.400259] kthread+0x3bf/0x4a0
[ 594.401847] ret_from_fork+0x22/0x30
[ 594.402343] </TASK>
Message from syslogd@localhost at Nov 13 09:09:54 ...
kernel:watchdog: BUG: soft lockup - CPU#2 stuck for 26s! [kworker/u32:5:108]
[ 596.793660] __io_remove_buffers: [2099199]start ctx=0xffff8881067bf000 bgid=65533 buf=0xffff8881fefe1680
We can reproduce this issue with the following syzkaller log:
r0 = syz_io_uring_setup(0x401, &(0x7f0000000300), &(0x7f0000003000/0x2000)=nil, &(0x7f0000ff8000/0x4000)=nil, &(0x7f0000000280)=<r1=>0x0, &(0x7f0000000380)=<r2=>0x0)
sendmsg$ETHTOOL_MSG_FEATURES_SET(0xffffffffffffffff, &(0x7f0000003080)={0x0, 0x0, &(0x7f0000003040)={&(0x7f0000000040)=ANY=[], 0x18}}, 0x0)
syz_io_uring_submit(r1, r2, &(0x7f0000000240)=@IORING_OP_PROVIDE_BUFFERS={0x1f, 0x5, 0x0, 0x401, 0x1, 0x0, 0x100, 0x0, 0x1, {0xfffd}}, 0x0)
io_uring_enter(r0, 0x3a2d, 0x0, 0x0, 0x0, 0x0)
The reason for the above issue is that 'buf->list' has 2,100,000 nodes;
walking them all monopolizes the CPU and leads to a soft lockup.
To solve this issue, we need to add a scheduling point to the loop in
'__io_remove_buffers'.
After adding the scheduling point we reran the test and got the following data.
[ 240.141864] __io_remove_buffers: [1]start ctx=0xffff888170603000 bgid=65533 buf=0xffff8881116fcb00
[ 268.408260] __io_remove_buffers: [1]start ctx=0xffff8881b92d2000 bgid=65533 buf=0xffff888130c83180
[ 275.899234] __io_remove_buffers: [2099199]start ctx=0xffff888170603000 bgid=65533 buf=0xffff8881116fcb00
[ 296.741404] __io_remove_buffers: [1]start ctx=0xffff8881b659c000 bgid=65533 buf=0xffff8881010fe380
[ 305.090059] __io_remove_buffers: [2099199]start ctx=0xffff8881b92d2000 bgid=65533 buf=0xffff888130c83180
[ 325.415746] __io_remove_buffers: [1]start ctx=0xffff8881b92d1000 bgid=65533 buf=0xffff8881a17d8f00
[ 333.160318] __io_remove_buffers: [2099199]start ctx=0xffff8881b659c000 bgid=65533 buf=0xffff8881010fe380
...
Fixes: 8bab4c09f24e ("io_uring: allow conditional reschedule for intensive iterators")
Signed-off-by: Ye Bin <yebin10(a)huawei.com>
Link: https://lore.kernel.org/r/20211122024737.2198530-1-yebin10@huawei.com
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
conflicts:
fs/io_uring.c
Signed-off-by: Ye Bin <yebin10(a)huawei.com>
Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
fs/io_uring.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index d07388600bbed..f3b5f9d670df3 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -3462,6 +3462,7 @@ static int __io_remove_buffers(struct io_ring_ctx *ctx, struct io_buffer *buf,
kfree(nxt);
if (++i == nbufs)
return i;
+ cond_resched();
}
i++;
kfree(buf);
--
2.25.1
[PATCH openEuler-1.0-LTS] block: Fix fsync always failed if once failed
by Yang Yingliang 09 Dec '21
From: Ye Bin <yebin10(a)huawei.com>
mainline inclusion
from mainline-v5.17
commit 8a7518931baa8ea023700987f3db31cb0a80610b
category: bugfix
bugzilla: 185819
CVE: NA
-----------------------------------------------
We ran tests with error-fault injection based on v4.19; after testing for some
time we found that sync /dev/sda always failed.
[root@localhost] sync /dev/sda
sync: error syncing '/dev/sda': Input/output error
scsi log as follows:
[19069.812296] sd 0:0:0:0: [sda] tag#64 Send: scmd 0x00000000d03a0b6b
[19069.812302] sd 0:0:0:0: [sda] tag#64 CDB: Synchronize Cache(10) 35 00 00 00 00 00 00 00 00 00
[19069.812533] sd 0:0:0:0: [sda] tag#64 Done: SUCCESS Result: hostbyte=DID_OK driverbyte=DRIVER_OK
[19069.812536] sd 0:0:0:0: [sda] tag#64 CDB: Synchronize Cache(10) 35 00 00 00 00 00 00 00 00 00
[19069.812539] sd 0:0:0:0: [sda] tag#64 scsi host busy 1 failed 0
[19069.812542] sd 0:0:0:0: Notifying upper driver of completion (result 0)
[19069.812546] sd 0:0:0:0: [sda] tag#64 sd_done: completed 0 of 0 bytes
[19069.812549] sd 0:0:0:0: [sda] tag#64 0 sectors total, 0 bytes done.
[19069.812564] print_req_error: I/O error, dev sda, sector 0
ftrace log as follows:
rep-306069 [007] .... 19654.923315: block_bio_queue: 8,0 FWS 0 + 0 [rep]
rep-306069 [007] .... 19654.923333: block_getrq: 8,0 FWS 0 + 0 [rep]
kworker/7:1H-250 [007] .... 19654.923352: block_rq_issue: 8,0 FF 0 () 0 + 0 [kworker/7:1H]
<idle>-0 [007] ..s. 19654.923562: block_rq_complete: 8,0 FF () 18446744073709551615 + 0 [0]
<idle>-0 [007] d.s. 19654.923576: block_rq_complete: 8,0 WS () 0 + 0 [-5]
Commit 8d6996630c03 introduced 'fq->rq_status', which is only updated while
the 'flush_rq' reference count isn't zero. Once a flush request fails, its
error code is recorded in 'fq->rq_status'; if there is no later chance to
update 'fq->rq_status', every subsequent fsync will fail.
To address this issue, reset 'fq->rq_status' after returning the error code
to the upper layer.
Fixes: 8d6996630c03 ("block: fix null pointer dereference in blk_mq_rq_timed_out()")
Signed-off-by: Ye Bin <yebin10(a)huawei.com>
Reviewed-by: Ming Lei <ming.lei(a)redhat.com>
Link: https://lore.kernel.org/r/20211129012659.1553733-1-yebin10@huawei.com
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
conflicts:
block/blk-flush.c
Signed-off-by: Ye Bin <yebin10(a)huawei.com>
Reviewed-by: Jason Yan <yanaijie(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
block/blk-flush.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/block/blk-flush.c b/block/blk-flush.c
index 022a2bbb012d7..c1bfcde165af5 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -245,8 +245,10 @@ static void flush_end_io(struct request *flush_rq, blk_status_t error)
* avoiding use-after-free.
*/
WRITE_ONCE(flush_rq->state, MQ_RQ_IDLE);
- if (fq->rq_status != BLK_STS_OK)
+ if (fq->rq_status != BLK_STS_OK) {
error = fq->rq_status;
+ fq->rq_status = BLK_STS_OK;
+ }
hctx = blk_mq_map_queue(q, flush_rq->mq_ctx->cpu);
if (!q->elevator) {
--
2.25.1