Hi Thomas Monjalon, Ferruh Yigit, and others,
I'm analyzing multi-process support in EAL and have some questions I'd like to ask you.
First, after rte_eal_init() is executed, the master and slave processes start successfully
and the tester sends traffic continuously. If I run kill -9 to stop the slave process, restart it, and resume packet receiving and sending,
how can I ensure that the EAL resources of the slave process are cleaned up?
Second, how can the remove function be invoked to release the probe resources of the slave process after it exits?
Finally, I found that rte_eal_cleanup() does not unregister the mp action after the process exits. Why is that?
I look forward to your response.
Thanks,
Lijun Ou
Hi,
Sorry, your questions are quite confusing. Please start by explaining what problem you are trying to solve.
In general, closing a process does not mean removing the device, because it can be used by other processes.
04/02/2021 07:56, oulijun:
> Hi Thomas Monjalon, Ferruh Yigit, and others,
> I'm analyzing multi-process support in EAL and have some questions I'd like to ask you.
> First, after rte_eal_init() is executed, the master and slave processes start successfully
> and the tester sends traffic continuously. If I run kill -9 to stop the slave process, restart it, and resume packet receiving and sending,
> how can I ensure that the EAL resources of the slave process are cleaned up?
> Second, how can the remove function be invoked to release the probe resources of the slave process after it exits?
> Finally, I found that rte_eal_cleanup() does not unregister the mp action after the process exits. Why is that?
> I look forward to your response.
> Thanks,
> Lijun Ou
On 2021/2/4 17:25, Thomas Monjalon wrote:
> Hi,
> Sorry, your questions are quite confusing. Please start by explaining what problem you are trying to solve.
I start the master and slave processes at the same time, and then run kill -9 to kill the slave process. The slave process should call rte_eal_cleanup() to release its resources, but I find that it releases nothing, so I think there is a resource leak.
> In general, closing a process does not mean removing the device, because it can be used by other processes.
On Thu, Feb 04, 2021 at 07:47:01PM +0800, oulijun wrote:
> I start the master and slave processes at the same time, and then run kill -9 to kill the slave process. The slave process should call rte_eal_cleanup() to release its resources, but I find that it releases nothing, so I think there is a resource leak.
"kill -9" forcibly kills a process immediately, so no cleanup will ever be done in that case. It is equivalent to the process crashing, so it would be considered an abnormal termination.
Regards, /Bruce
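The point above can be checked directly: SIGKILL cannot be caught, so no handler, and therefore no cleanup code, can ever run for it. The following is a minimal sketch in plain C; `can_catch()` and `noop_handler()` are illustrative names chosen here and are not part of DPDK.

```c
#include <signal.h>

/* Handler that does nothing; it only exists so we have something to install. */
static void noop_handler(int signum)
{
    (void)signum;
}

/*
 * Returns 1 if a handler can be installed for signum, 0 otherwise.
 * The previous disposition is restored before returning.
 */
int can_catch(int signum)
{
    void (*prev)(int) = signal(signum, noop_handler);

    if (prev == SIG_ERR)
        return 0;           /* the kernel refused, as it does for SIGKILL */
    signal(signum, prev);   /* put the original disposition back */
    return 1;
}
```

Calling `can_catch(SIGTERM)` succeeds while `can_catch(SIGKILL)` is rejected, which is why a process stopped with kill -9 never gets a chance to release anything.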
04/02/2021 12:47, oulijun:
> I start the master and slave processes at the same time, and then run kill -9 to kill the slave process.
No. If you use kill -9 (SIGKILL), the process aborts immediately.
> The slave process should call rte_eal_cleanup() to release its resources, but I find that it releases nothing, so I think there is a resource leak.
Try signals other than SIGKILL.
My understanding is that SIGKILL can simulate a crash in the process. How to handle such a case is to be defined per driver/library.
On 04-Feb-21 11:47 AM, oulijun wrote:
> I start the master and slave processes at the same time, and then run kill -9 to kill the slave process. The slave process should call rte_eal_cleanup() to release its resources, but I find that it releases nothing, so I think there is a resource leak.
To add to what others have said: not only will there be a resource leak whenever you kill processes with SIGKILL, but the cleanup is also up to individual applications to perform. It is not the responsibility of the DPDK library itself to install signal handlers and handle SIGINT or other signals.
So, if the application leaks resources, it's up to the application to catch termination signals and clean up after itself. Unfortunately, there's also no way to recover any memory leaked this way: there is no garbage collection or any similar mechanism in DPDK. Therefore, while the primary-secondary process model is slightly more resilient than the single-process model, there are no mechanisms to reclaim memory from a crashed process, and a crashing secondary process still leads to undefined behavior.
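As a rough illustration of "catch termination signals and clean up after itself", here is a minimal sketch in plain C. It is not taken from DPDK: `install_term_handlers()`, `run_until_signalled()` and the `demo_*` callbacks are hypothetical names. In a real DPDK application the poll callback would be the RX/TX loop body, and the cleanup callback would stop and close the ports and then call rte_eal_cleanup().

```c
#include <signal.h>

/* Written by the handler, read by the main loop. */
static volatile sig_atomic_t stop_requested;

static void on_term_signal(int signum)
{
    (void)signum;
    stop_requested = 1;   /* only set a flag; real work happens outside the handler */
}

/* Install the handler for SIGINT and SIGTERM. Returns 0 on success, -1 on failure. */
int install_term_handlers(void)
{
    if (signal(SIGINT, on_term_signal) == SIG_ERR)
        return -1;
    if (signal(SIGTERM, on_term_signal) == SIG_ERR)
        return -1;
    return 0;
}

/*
 * Call poll_once() until a termination signal has been seen, then call
 * cleanup() exactly once. Returns the number of loop iterations executed.
 */
unsigned long run_until_signalled(void (*poll_once)(void), void (*cleanup)(void))
{
    unsigned long iterations = 0;

    while (!stop_requested) {
        poll_once();
        iterations++;
    }
    cleanup();   /* hypothetical: a DPDK app would end with rte_eal_cleanup() here */
    return iterations;
}

/* Hypothetical demo callbacks, used only to exercise the loop. */
static int cleanup_calls;

void demo_poll(void)
{
    raise(SIGTERM);   /* simulate the signal arriving during the first iteration */
}

void demo_cleanup(void)
{
    cleanup_calls++;
}

int demo_cleanup_count(void)
{
    return cleanup_calls;
}
```

After `install_term_handlers()`, calling `run_until_signalled(demo_poll, demo_cleanup)` runs one iteration and then the cleanup. This only helps for signals that can be caught, such as SIGINT and SIGTERM; it does nothing for SIGKILL or a crash.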
For example, a secondary process may crash while holding a lock, and there's no way to release the lock without reinitializing it (which often means a restart). The secondary process may also crash while processing buffers, and those in-flight buffers will be lost. There's nothing we can do about it, at least for now.
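The lock scenario can be reproduced outside DPDK with a small sketch, under these assumptions: a C11 `atomic_flag` in anonymous shared memory stands in for a spinlock in DPDK shared memory, a forked child stands in for the secondary process, and `lock_held_after_kill()` is a hypothetical name. It requires a POSIX system.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc */
#include <signal.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if a lock taken by a child process is still held after the
 * child was killed with SIGKILL, 0 if it is free, -1 on a setup error.
 */
int lock_held_after_kill(void)
{
    /* One flag in memory shared between parent and child. */
    atomic_flag *lock = mmap(NULL, sizeof(*lock), PROT_READ | PROT_WRITE,
                             MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (lock == MAP_FAILED)
        return -1;
    atomic_flag_clear(lock);

    pid_t pid = fork();
    if (pid < 0) {
        munmap(lock, sizeof(*lock));
        return -1;
    }
    if (pid == 0) {
        /* Child: take the lock, then die with no chance to unlock it. */
        (void)atomic_flag_test_and_set(lock);
        raise(SIGKILL);
        _exit(1);   /* not reached */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) != pid) {
        munmap(lock, sizeof(*lock));
        return -1;
    }

    /* test_and_set returns the previous value: true means it was still locked. */
    bool still_held = atomic_flag_test_and_set(lock);

    munmap(lock, sizeof(*lock));
    return still_held ? 1 : 0;
}
```

The parent finds the flag still set after the child is gone. Nothing in the operating system knows the flag was a lock, so the only remedy is for some surviving process to reinitialize it, as described above.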
On 2021/2/10 23:59, Burakov, Anatoly wrote:
> For example, a secondary process may crash while holding a lock, and there's no way to release the lock without reinitializing it (which often means a restart). The secondary process may also crash while processing buffers, and those in-flight buffers will be lost. There's nothing we can do about it, at least for now.
Thank you very much for your response.