What are processes?

A process is basically a program in execution and a program is a piece of code which may be a single line or millions of lines long written in a programming language.

When a UNIX machine gets powered up, the kernel will be loaded and complete its initialization process. Once initialization is completed, the kernel creates a set of processes in the user space, including the scheduling of the system management daemon process (usually named init) which has PID 1 and is responsible for running the right complement of services and daemons at any given time.

One of the important responsibilities of init is the reaping of zombie processes, which I will cover below.

We can see the list of process by using the ps aux command.


Parent, child, orphan, daemon, zombie processes

Now we know what processes are, what are the differences between these different kinds of processes then?

Parent & child processes

Process creation

In UNIX, every process, except the first process (PID 0 swapper process), is created by another process by executing the fork() syscall.

The process that creates other processes is known as the Parent process. The processes that were created by the Parent process are known as Child processes.

The child process is largely identical to the parent process, only with its distinct PID and own accounting information.

After a fork(), a child process often uses one of the exec() syscall to begin execution of a new program

When we run ps command with the -aef flag, we can see 2 columns that are of interest to us: PID & PPID.

PID: process ID PPID: parent process ID

$ ps -aef
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 13:39 ?        00:00:04 /usr/lib/systemd/systemd --switched-root --system --deserialize 21
root         2     0  0 13:39 ?        00:00:00 [kthreadd]
root         4     2  0 13:39 ?        00:00:00 [kworker/0:0H]
root         6     2  0 13:39 ?        00:00:00 [ksoftirqd/0]
root         7     2  0 13:39 ?        00:00:00 [migration/0]

Now run it again with the --forest flag. This will display the processes in a tree structure, clearly depicting the relationship between processes.

$ ps -aef --forest
UID        PID  PPID  C STIME TTY          TIME CMD
.
.
root     17368     1  0 13:43 ?        00:00:02 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
root     17612     1  0 13:43 ?        00:00:01 /usr/bin/amazon-ssm-agent
root     17697 17612  0 13:43 ?        00:00:14  \_ /usr/bin/ssm-agent-worker
root     32548 17697  0 19:41 ?        00:00:05      \_ /usr/bin/ssm-session-worker svc_123456 i-123456
ec2-user 32560 32548  0 19:41 pts/0    00:00:00          \_ sh
root     32561 32560  0 19:41 pts/0    00:00:00              \_ sudo su - ec2-user

We can visualize this using a tree diagram.

Process termination

When a process completes, it calls a routine named _exit to notify the kernel that is is ready to die (sounds dark). It supplies an exit code (an integer) which provides information on why it is exiting.

Before the completed process is allowed to be removed, the process’s parent has to acknowledge by calling the wait() syscall to remove its entry (PID) in the process table for reuse.


Orphan processes

As the name suggests, an orphan process is one where the parent process terminates before the child process.

When this happens, the kernel will adjust the orphan processes and make them children of the init process and call the wait() syscall on these newly adopted child processes.

There are 2 kinds of orphan processes: unintentional orphan & intentional orphan

Unintentional orphan

This happens when the parent process terminates or crashes unexpectedly.

The process group mechanism can be used in such cases to coordinate and terminate all child processes using the SIGHUP process signal instead of letting them continue to run as orphans.

Intentional orphan aka Daemon processes

An intentional orphan process is one that is expected to continue running in the background. Typically daemon process names end with letter “d” (e.g systemd-journald).

Example will be the use of nohup to run a job in indefinitely.

$ nohup sh custom-script.sh &

Zombie processes

We covered the process termination routine earlier on how wait() syscall is called by the parent process to clean up a process entry (PID) in process table after a child process is completed.

A zombie process is one that has completed but isn’t “waited()” by its parent process and hence continues to hold a process entry in the process table (ps -axf -o pid,ppid,tty,stat,cmd). Because the parent process is still running, the zombie process cannot be adopted by the init process and reaped (aka wait()). ()

This can potentially result in memory leak due to unreleased kernel resources.

That being said, there are some situations where zombie processes are desirable, such as when we want to ensure parent process creates a child process with a different PID or to obtain information about the child processes at a later time.

NOTE:

  • a zombie process cannot be killed since it is technically already “dead”
  • processes are never responsible for cleaning up their grandchildren processes. this task is always handled by PID 1, which is usually the init process

References