19Jan2022

Fork in multithreaded programs

A new heavyweight process is created by fork(2). As virtual memory came into the UNIX world, that was augmented with vfork(2) and some others. A fork(2) call copies the entire address space of the process, including all the registers, and puts the new process under the control of the operating system scheduler. The next time the scheduler runs the child, execution picks up at the instruction following the fork: the forked child process is a clone of the parent. If you want to run another program, say because you're writing a shell, you follow the fork with an exec(2) call, which loads a new program into that address space, replacing the one that was cloned.
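The fork-then-exec pattern described above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation; run_command is a hypothetical helper name, and execvp is just one member of the exec family that could be used here.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] in a child process and return its
 * exit status, or -1 if the fork or wait failed or the child did not
 * exit normally. */
int run_command(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;                 /* fork failed, no child exists */

    if (pid == 0) {
        /* Child: a clone of the parent. Replace it with the new program. */
        execvp(argv[0], argv);
        _exit(127);                /* only reached if exec failed */
    }

    /* Parent: wait for the child and report how it ended. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A shell would call something like run_command with the parsed command line, for example an argument vector of "ls", "-l", NULL.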

Your answer is buried in that explanation. When a process with many LWPs (threads) forks under a fork-all model such as the Solaris fork(2), you get two independent processes with many threads, running concurrently; under POSIX fork semantics, only the calling thread is duplicated in the child. Forking from a threaded parent is still useful: in many programs, a parent process has many threads, some of which fork new child processes. For example, an HTTP server might handle each connection to port 80 in a thread and then fork a child process for something like a CGI program; exec(2) would then be called to run the CGI program in place of the clone of the parent process.

My experience of calling fork from within threads is really bad. The software generally fails pretty quickly. I've found several solutions to the matter; you may not like them much, but I think they are generally the best way to avoid errors that are close to undebuggable. First, fork before you create any threads. Assuming you know the number of external processes you need at the start, you can create them upfront and have them sit there waiting for an event. Once you have forked enough children, you are free to use threads and communicate with those forked processes via pipes, semaphores, etc.
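A rough sketch of forking a helper process upfront, before any thread exists, might look like the following. The struct worker type and the start_worker and stop_worker names are hypothetical, and the "work" (adding one to a byte) is only a stand-in for whatever the real helper would do.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical handle for a helper process forked before any threads. */
struct worker {
    pid_t pid;
    int to_child;    /* write requests here */
    int from_child;  /* read replies here */
};

int start_worker(struct worker *w)
{
    int req[2], rep[2];
    if (pipe(req) < 0)
        return -1;
    if (pipe(rep) < 0) {
        close(req[0]); close(req[1]);
        return -1;
    }
    pid_t pid = fork();
    if (pid < 0) {
        close(req[0]); close(req[1]);
        close(rep[0]); close(rep[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: block on the pipe until the parent sends a request. */
        close(req[1]); close(rep[0]);
        char c;
        while (read(req[0], &c, 1) == 1) {
            c = (char)(c + 1);              /* stand-in for real work */
            if (write(rep[1], &c, 1) != 1)
                break;
        }
        _exit(0);                           /* EOF: parent closed the pipe */
    }
    close(req[0]); close(rep[1]);
    w->pid = pid;
    w->to_child = req[1];
    w->from_child = rep[0];
    return 0;
}

int stop_worker(struct worker *w)
{
    close(w->to_child);                     /* child sees EOF and exits */
    close(w->from_child);
    int status;
    if (waitpid(w->pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The parent would call start_worker for each helper at startup and only then create its threads; any thread can later talk to a helper through the two descriptors, provided access to them is serialized.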

From the time you create a first thread, you can no longer call fork. Second, in some circumstances it may be possible to stop all of your threads, start a process, and then restart your threads. This is somewhat similar to point 1 in the sense that you do not want threads running at the time you call fork, although it requires a way to know about all the threads currently running in your software (something not always possible with third-party libraries).

Remember that "stopping a thread" using a wait is not going to work. You have to join the thread so that it has fully exited, because a wait requires a mutex, and mutexes need to be unlocked when you call fork. Third, the other obvious possibility is to choose either threads or processes and not mix them, so you never have to worry about one interfering with the other. This is by far the simplest method, if it is at all possible in your software.
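Joining every thread before the fork could be sketched along these lines. The worker body, the stop flag, and the name fork_with_threads_joined are all hypothetical; the sketch assumes the program knows about every thread it has started.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define NUM_WORKERS 4

static atomic_int stop_requested;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static long counter;

static void *worker_main(void *arg)
{
    (void)arg;
    /* Hypothetical work loop: runs until asked to stop. */
    while (!atomic_load(&stop_requested)) {
        pthread_mutex_lock(&counter_lock);
        counter++;
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Start workers, join them all, and only then fork. Returns the
 * child's exit status, or -1 on error. */
int fork_with_threads_joined(void)
{
    pthread_t tid[NUM_WORKERS];
    int started = 0;

    atomic_store(&stop_requested, 0);
    for (; started < NUM_WORKERS; started++)
        if (pthread_create(&tid[started], NULL, worker_main, NULL) != 0)
            break;

    /* Ask the workers to finish and wait until each has fully exited. */
    atomic_store(&stop_requested, 1);
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);

    /* Only the calling thread is left, so no worker can hold counter_lock. */
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        pthread_mutex_lock(&counter_lock);   /* cannot block on a dead owner */
        counter++;
        pthread_mutex_unlock(&counter_lock);
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

After the fork the parent is free to start its workers again.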

In some software, one creates one or more threads in a function, uses those threads, then joins all of them before the function returns. This is somewhat equivalent to point 2 above, only you micromanage threads as required instead of creating threads that sit around and get used when necessary.

This works too, but keep in mind that creating a thread is a costly call: it has to allocate a new task with a stack and its own set of registers. However, this makes it easy to know when you have threads running and, except from within those functions, you are free to call fork. In my programming, I have used all three solutions. I used point 2 because I use the threaded version of log4cplus and needed fork for some parts of my software.

As mentioned by others, if you are using fork in order to then call execve, the idea is to do as little as possible between the two calls. That is likely to work: as long as you do not hit any of the mutexes held by the other threads, it will work without issue.

On the other hand, if like me you want to fork and never call execve, then it is not likely to work right while any other thread is running. The issue is that fork creates a separate copy of only the current task (a process under Linux is called a task in the kernel). Each additional thread is another task within the same process, and fork ignores those extra tasks when duplicating the currently running one. This means that if any of those other threads holds a lock on a mutex or something similar at the time of the fork, the child process is going to lock up quickly.

The locks are the worst, but any resources that the other threads still hold at the time the fork happens are lost (socket connections, memory allocations, device handles, etc.).
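The lock problem can be observed without actually hanging the child by using pthread_mutex_trylock instead of a blocking lock. The following is only a demonstration sketch; the function and variable names are hypothetical, and calling pthread functions in the child of a multithreaded process is exactly the kind of thing the text warns against, so it is done here purely to show the inherited lock state.

```c
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static pthread_mutex_t held_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_int holder_ready, holder_release;

static void *holder_main(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&held_lock);
    atomic_store(&holder_ready, 1);
    while (!atomic_load(&holder_release))   /* keep the lock across the fork */
        sched_yield();
    pthread_mutex_unlock(&held_lock);
    return NULL;
}

/* Returns 1 if the child found the mutex locked, 0 if not, -1 on error. */
int child_sees_locked_mutex(void)
{
    pthread_t tid;
    atomic_store(&holder_ready, 0);
    atomic_store(&holder_release, 0);
    if (pthread_create(&tid, NULL, holder_main, NULL) != 0)
        return -1;
    while (!atomic_load(&holder_ready))     /* wait until the lock is taken */
        sched_yield();

    pid_t pid = fork();
    if (pid == 0) {
        /* The holder thread was not copied, but the locked mutex was.
         * A blocking pthread_mutex_lock here would never return. */
        int rc = pthread_mutex_trylock(&held_lock);
        _exit(rc == EBUSY ? 1 : 0);
    }

    atomic_store(&holder_release, 1);       /* parent side: let the holder go */
    pthread_join(tid, NULL);
    if (pid < 0)
        return -1;

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

In the parent the holder thread eventually unlocks the mutex; in the child nothing ever will, which is why the real (blocking) case deadlocks.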

This is where point 2 above comes in. You need to know your state before the fork. If you have a very small number of threads or worker threads defined in one place and can easily stop all of them, then it will be easy enough.

If you are using the Unix fork system call, then you are not technically using threads; you are using processes. They have their own memory space and therefore cannot interfere with each other.

The danger in a multithreaded parent is a mutex that some other thread holds at the time of the fork: the child gets a copy of the locked mutex, but not the thread that would unlock it. So, any thread in the child that tries to lock the mutex waits forever.

The standard vfork(2) function is unsafe in multithreaded programs. As in nonthreaded implementations, vfork does not copy the address space for the child process.

Be careful that the thread in the child process does not change memory before it calls exec(2). Remember that vfork gives the parent address space to the child. The parent gets its address space back after the child calls exec or exits. It is important that the child not change the state of the parent.
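A minimal sketch of the discipline this implies: after vfork, the child does nothing except call exec or _exit. The name spawn_with_vfork is hypothetical, and the _DEFAULT_SOURCE define is an assumption about glibc, where it exposes vfork when compiling in a strict standard mode.

```c
#define _DEFAULT_SOURCE   /* assumption: makes glibc's <unistd.h> declare vfork */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the program at 'path' via vfork + exec and
 * return its exit status, or -1 on error. */
int spawn_with_vfork(const char *path, char *const argv[])
{
    pid_t pid = vfork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: it is borrowing the parent's address space, so it
         * touches no variables and goes straight to exec. */
        execv(path, argv);
        _exit(127);        /* exec failed; _exit, never exit or return */
    }

    /* Parent resumes once the child has called exec or exited. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Where it is available, posix_spawn is often suggested as an alternative that packages the same pattern behind a library call.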

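For example, it is disastrous to create new threads between the call to vfork and the call to exec.

To prevent the deadlock in a fork, register fork handlers with pthread_atfork, which takes three function pointers: a prepare handler called before the fork, and parent and child handlers called after it in the respective processes. Any one of these can be set to NULL. For example, a prepare handler could acquire all the mutexes needed, and then the parent and child handlers could release them. This ensures that all the relevant locks are held by the thread that calls the fork function before the process is forked, preventing the deadlock in the child.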
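The handler scheme just described might be sketched as follows for a single mutex. The handler names, the busy thread, and fork_with_handlers are hypothetical; a real program would have to cover every lock the child may need, including ones hidden inside libraries, which this sketch does not attempt.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_int busy_stop;
static long shared_state;

/* prepare runs in the forking thread before the fork; the other two
 * run after it, in the parent and in the child respectively. */
static void prepare_handler(void) { pthread_mutex_lock(&state_lock); }
static void parent_handler(void)  { pthread_mutex_unlock(&state_lock); }
static void child_handler(void)   { pthread_mutex_unlock(&state_lock); }

static void *busy_main(void *arg)
{
    (void)arg;
    while (!atomic_load(&busy_stop)) {
        pthread_mutex_lock(&state_lock);
        shared_state++;
        pthread_mutex_unlock(&state_lock);
        sched_yield();
    }
    return NULL;
}

/* Forks while another thread keeps using state_lock. Returns the
 * child's exit status, or -1 on error. */
int fork_with_handlers(void)
{
    pthread_t tid;
    if (pthread_atfork(prepare_handler, parent_handler, child_handler) != 0)
        return -1;
    atomic_store(&busy_stop, 0);
    if (pthread_create(&tid, NULL, busy_main, NULL) != 0)
        return -1;

    pid_t pid = fork();   /* the forking thread owns state_lock at this point */
    if (pid == 0) {
        /* child_handler already released the lock, so this succeeds. */
        pthread_mutex_lock(&state_lock);
        shared_state++;
        pthread_mutex_unlock(&state_lock);
        _exit(0);
    }

    atomic_store(&busy_stop, 1);
    pthread_join(tid, NULL);
    if (pid < 0)
        return -1;

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the prepare handler waits for the busy thread to release the mutex, the fork happens only while the forking thread itself owns it, and the child's copy is released by a thread that exists there.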

pthread_atfork returns zero on success. Any other return value indicates that an error occurred.

The Solaris fork(2) function duplicates the address space and all the threads and LWPs in the child. This is useful, for example, when the child process never calls exec(2) but does use its copy of the parent address space. Note that when one thread in a process calls Solaris fork(2), threads that are blocked in an interruptible system call return EINTR.

Also, be careful not to create locks that are held by both the parent and child processes. Note that this is not a problem if the fork-one model is used. For example, when one thread reads a file serially and another thread in the process successfully calls one of the forks, each process then contains a thread that is reading the file.
