[Linux] The Secret Behind Smoothly Running Processes: Process Control

User 11456817

Published 2025-06-16 10:34:42

Included in the column: 学习学习

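Process Creation and Termination

Copy-on-Write

Normally the parent and child share the same code, and as long as neither of them writes, they share the same data as well. As soon as either side writes, the page being written is copied, so each process ends up with its own private copy. This is copy-on-write.

In the figure, the code segment in the parent's page table is read-only, while the data segment is readable and writable before the child is created. Once the parent creates a child, the operating system marks the data segment read-only as well. When the child then tries to write to the data segment, the hardware sees a write to a read-only page and raises a page fault. The operating system checks the faulting address and finds that it is legal, that it lies in the data segment, and that the page table mapping is valid. The access is therefore not treated as an error; instead the fault triggers copy-on-write: the page is copied for the child and the write is carried out on the copy.

The operating system checks that the access is legal.

How does the operating system know whether the user is accessing the data segment or the code segment?

Each process has its own virtual address space, described by mm_struct. mm_struct records the start and end virtual address of every region, including the code segment and the data segment:

start_code ------- end_code
start_data ------- end_data

Why copy-on-write?

If the parent has a lot of data, copying all of it up front takes a long time. Copy-on-write makes process creation much cheaper.
If some of the parent's data is only ever read, the child has no need for its own copy. Only the data that is actually written gets copied, which saves memory.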

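Process Termination

Process exit scenarios

The code runs to completion and the result is correct
The code runs to completion and the result is wrong
The code terminates abnormally

In the C/C++ programs we have written so far, the return value of main carries meaning.

In our own programs main is not the first function to be called. The first one is the C runtime startup routine: mainCRTStartup() under VS and _start() on Linux, and main is called from there. Any function whose return type is not void has a return value. The main functions we usually write return 0, which means the program exited normally; any other value means something went wrong, and different values stand for different errors.
The return value of main is in fact the exit code of the process.

On Linux we can print the return value of main:

echo $?: prints the exit code of the most recently finished process
?: a special shell variable that holds that exit code
If main returns 1, running echo $? right after the program prints 1. Running echo $? a second time prints 0, because this time $? holds the exit code of the previous echo, which succeeded.

When a process exits, its exit code is written into its task_struct.

The C standard library provides error codes together with a string describing each one; the strerror function returns the string for a given code.

On our system these error messages come from the C standard library.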

#include<stdio.h>
#include<string.h>//header for strerror
 
int main()
{
  for(int i=0;i<200;i++)
  {
     printf("%d->%s\n",i,strerror(i));
  }
   return 0;
}
--------------------------------
//134 error codes (0 to 133) and their strings
0->Success
1->Operation not permitted
2->No such file or directory
3->No such process
4->Interrupted system call
5->Input/output error
6->No such device or address
7->Argument list too long
8->Exec format error
9->Bad file descriptor
10->No child processes
11->Resource temporarily unavailable
12->Cannot allocate memory
13->Permission denied
14->Bad address
15->Block device required
16->Device or resource busy
17->File exists
18->Invalid cross-device link
19->No such device
20->Not a directory
21->Is a directory
22->Invalid argument
23->Too many open files in system
24->Too many open files
25->Inappropriate ioctl for device
26->Text file busy
27->File too large
28->No space left on device
29->Illegal seek
30->Read-only file system
31->Too many links
32->Broken pipe
33->Numerical argument out of domain
34->Numerical result out of range
35->Resource deadlock avoided
36->File name too long
37->No locks available
38->Function not implemented
39->Directory not empty
40->Too many levels of symbolic links
41->Unknown error 41
42->No message of desired type
43->Identifier removed
44->Channel number out of range
45->Level 2 not synchronized
46->Level 3 halted
47->Level 3 reset
48->Link number out of range
49->Protocol driver not attached
50->No CSI structure available
51->Level 2 halted
52->Invalid exchange
53->Invalid request descriptor
54->Exchange full
55->No anode
56->Invalid request code
57->Invalid slot
58->Unknown error 58
59->Bad font file format
60->Device not a stream
61->No data available
62->Timer expired
63->Out of streams resources
64->Machine is not on the network
65->Package not installed
66->Object is remote
67->Link has been severed
68->Advertise error
69->Srmount error
70->Communication error on send
71->Protocol error
72->Multihop attempted
73->RFS specific error
74->Bad message
75->Value too large for defined data type
76->Name not unique on network
77->File descriptor in bad state
78->Remote address changed
79->Can not access a needed shared library
80->Accessing a corrupted shared library
81->.lib section in a.out corrupted
82->Attempting to link in too many shared libraries
83->Cannot exec a shared library directly
84->Invalid or incomplete multibyte or wide character
85->Interrupted system call should be restarted
86->Streams pipe error
87->Too many users
88->Socket operation on non-socket
89->Destination address required
90->Message too long
91->Protocol wrong type for socket
92->Protocol not available
93->Protocol not supported
94->Socket type not supported
95->Operation not supported
96->Protocol family not supported
97->Address family not supported by protocol
98->Address already in use
99->Cannot assign requested address
100->Network is down
101->Network is unreachable
102->Network dropped connection on reset
103->Software caused connection abort
104->Connection reset by peer
105->No buffer space available
106->Transport endpoint is already connected
107->Transport endpoint is not connected
108->Cannot send after transport endpoint shutdown
109->Too many references: cannot splice
110->Connection timed out
111->Connection refused
112->Host is down
113->No route to host
114->Operation already in progress
115->Operation now in progress
116->Stale file handle
117->Structure needs cleaning
118->Not a XENIX named type file
119->No XENIX semaphores available
120->Is a named type file
121->Remote I/O error
122->Disk quota exceeded
123->No medium found
124->Wrong medium type
125->Operation canceled
126->Required key not available
127->Key has expired
128->Key has been revoked
129->Key was rejected by service
130->Owner died
131->State not recoverable
132->Operation not possible due to RF-kill
133->Memory page has hardware error

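To make a program return a matching exit code, besides setting one by hand, you can also return errno.

#include<stdio.h>
#include<string.h>
#include<errno.h> //header for errno
int main()
{
    FILE*fp=fopen("test.txt","r");
    if(fp==NULL)
    {
        return errno; //fopen failed: use the error code as the exit code
    }
    fclose(fp);
    return 0;
}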

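When a program terminates abnormally, its exit code is meaningless.

When a process hits an exception, it is usually because the process received a signal (TODO)

exit() and return

exit(exit code)--- >can be called anywhere; it ends the process, the function never returns, and the exit code is handed back to the parent process. It terminates the whole program
return---- >only ends the current function and hands control back to the caller. A return from main ends the process, because the startup code passes main's return value on as the exit code.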

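exit() and _exit()--- >exit() is declared in stdlib.h, _exit() in unistd.h

exit(): provided by the C standard library. Before the process exits it performs user-space cleanup, such as flushing the stdio buffers
_exit(): provided by the system. The process exits immediately without that cleanup, so the stdio buffers are not flushed
exit calls _exit underneath, because only the operating system can terminate a process

Where is the buffer? Where can it definitely not be?

The buffer lives in the library; it is the buffer provided by the C standard library
It cannot be inside the operating system: if it were, _exit would flush it as well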

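Process Waiting

Why a process needs to be waited for

Waiting reaps zombie processes and so avoids memory leaks
By waiting, the parent reclaims the child's resources and obtains the child's exit information
We need to know, for example, whether the child finished with a correct result, and whether it exited normally at all.

How to wait

Process waiting means calling wait or waitpid

The wait method

status: exit status information of the process

pid_t wait(int *status);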

#include<stdio.h>
#include<string.h>
#include<errno.h>
#include<sys/types.h>//pid_t
#include<unistd.h>
#include<stdlib.h>
#include<sys/wait.h>//wait, waitpid
int main()
{
    pid_t id=fork();
    if(id==0)
    {
        //child: print five times, then exit
        int cnt=5;
        while(cnt)
        {
            printf("I am the child process, my pid: %d, parent pid: %d\n",getpid(),getppid());
            sleep(1);
            cnt--;
        }
        exit(0);
    }
    sleep(10);//the child exits after about 5 seconds and stays a zombie until wait is called
    pid_t ret=wait(NULL);
    if(ret>0)
    {
        printf("wait success, rid: %d\n",ret);
    }
    sleep(10);
    return 0;
}

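The waitpid method

pid_t waitpid(pid_t pid, int *status, int options);

pid:

When pid is -1, waitpid waits for any child. With options set to 0 it then behaves the same as wait.
When pid is greater than 0, waitpid waits for the child whose process id equals pid.

options:

When options is 0, the parent blocks until the child exits. The parent is blocked inside the kernel, waiting to be woken up.
When options is WNOHANG, waitpid does not wait: if the child specified by pid has not exited yet, it returns 0 immediately; if that child has already exited, it returns the child's id.

status:

status is an output parameter filled in by the operating system. Passing NULL means the caller does not care about the child's exit status. Otherwise the operating system uses it to report the child's exit information to the parent.

Return value:

On success, returns the process id of the child that was collected.
If options is WNOHANG and waitpid finds no exited child to collect, returns 0.
On error, returns -1 and sets errno to indicate what went wrong.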

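status: an int, 32 bits in total. Only the low 16 bits are used: bits 8 to 15 hold the exit code, bits 0 to 6 hold the number of the signal that terminated the process, and bit 7 is the core dump flag.

//a shift and a mask extract the exit code
(status>>8)&0xFF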

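Process Replacement and a Custom Shell

After fork(), the parent and child each execute part of the parent's code. What if the child wants to run a completely new program? Program replacement does exactly that!

Program replacement uses a dedicated interface to load a completely new program (code and data) from disk into the address space of the calling process!

How replacement works

After fork creates a child, parent and child run the same program, possibly taking different branches. To make the child run a new program, it calls one of the exec functions. The call overwrites the original code and data with the new program's code and data. In terms of the figure: the stack, heap, data segment and code segment that the original PCB refers to are replaced with those of the new program.

The exec functions do not create a new process; they only overwrite the existing code and data with those of the new program. The pid of the process is therefore the same before and after the call.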

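#include<stdio.h>
#include<errno.h>
#include<unistd.h>//header for execl
int main()
{
    printf("About to start running\n");
    execl("/usr/bin/ls","ls","-l","-a",NULL);
    printf("My program has finished\n");
    return 0;
}
//Once the old code has been replaced by the new code it no longer exists, so the code after execl is gone and never runs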

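Return value of the exec family:

These functions return only when the call fails (returning -1 and setting errno); on success they do not return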

#include<stdio.h>
#include<string.h>
#include<unistd.h>
 
int main()
{
    printf("My program is about to run!\n");
    int n = execl("/usr/bn/ls","ls","-l","-a",NULL);  //the path is wrong on purpose, to observe the return value
    printf("My program has finished: %d\n",n);
    return 0;
}

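Program replacement interfaces

#include <unistd.h>

int execl(const char *path, const char *arg, ...);
int execlp(const char *file, const char *arg, ...);
int execle(const char *path, const char *arg, ..., char *const envp[]);
int execv(const char *path, char *const argv[]);
int execvp(const char *file, char *const argv[]);
int execvpe(const char *file, char *const argv[], char *const envp[]);

int execl(const char *path, const char *arg, ...);

const char *path: path plus program name. Tells the function which program to run

const char *arg, ...: how to run the program. This is a variable argument list. Pass the arguments exactly as you would type them on the command line, separated by commas, e.g. "ls","-l","-a"

This style of passing arguments is a list: the options are passed one by one, which is what the "l" in execl stands for. The last argument must be NULL to mark the end of the list

To keep the replacement from affecting the parent, fork a child, let the child call exec, and have the parent wait for it

Why is the parent not affected?

Processes are independent of each other
The replacement happens in the child; the code and data it shared with the parent are separated by copy-on-write, so the parent keeps its own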

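The exec functions really belong to the loader. Operating system courses say that a program must be loaded into memory before it becomes a process, and that loading is the loader's job

Can we replace a process with a program we wrote ourselves?

Write a C++ program and use it to replace the C program

int execlp(const char *file, const char *arg, ...);

const char *file: file name

const char *arg, ...: how to run the program. This is a variable argument list. Pass the arguments exactly as you would type them on the command line, separated by commas, e.g. "ls","-l","-a"

Why does execlp not need a path?

Because execlp looks the command up by itself in the directories listed in the PATH environment variable. The "p" in execlp stands for PATH.

int execv(const char *path, char *const argv[]);

const char *path: same as above

char *const argv[]: a command-line argument table, that is, an array of pointers terminated by NULL

The "v" in execv stands for "vector"

int execvp(const char *file, char *const argv[]);

const char *file, char *const argv[]: same as above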

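int execvpe(const char *file, char *const argv[], char *const envp[]);

const char *file, char *const argv[]: same as above

char *const envp[]: environment variables

The printed result contains only the environment variables we passed ourselves; where did the original ones go?

execvpe makes the replaced child use a completely new set of environment variables, namely the env table that was passed in

To pass environment variables as additions instead:

With the exec interfaces that have an "e", first putenv the variable you want to add, then pass the environ pointer
With the exec interfaces without an "e", just putenv and then call the interface as usual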

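Even when no environment is passed to an exec interface, the child still gets the parent's environment variables: the virtual address space reserves a region for the command-line arguments and environment variables, and the child is created as a copy of the parent.

Looking at the figure, the v family seems to be missing execve. That is because execve is a system call, while the functions in the figure are library-level wrappers.

execve: all the functions above end up calling execve. Some of them take no environment argument, but internally they still pass one to execve: if the user supplied an environment it is used, otherwise the default one, extern char **environ, is used.

Custom Shell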

#include<cstdio>
#include<cstdlib>
#include<cstring>
#include<string>
#include<iostream>
#include<unistd.h>
#include<sys/types.h>
#include<sys/wait.h>

#define command_line 1024
#define FORMAT "[%s@%s %s]# "


//global variables of the custom shell
#define MAXARGC 128
char* g_argv[MAXARGC];
int g_argc=0;

const char*GetUserName()
{
    const char*name=getenv("USER");
    return name==NULL?"none":name;
}

const char*GetHostName()
{
    const char*hostname=getenv("HOSTNAME");
    return hostname==NULL?"none":hostname;
}

const char*GetPWD()
{
    const char*pwd=getenv("PWD");
    return pwd==NULL?"none":pwd;
}

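bool GetCommandParse(char*commandline)
{
#define SEP " "
    //split the command line into g_argv, e.g. "ls -l -a" -> "ls","-l","-a"
    g_argc=0;
    g_argv[g_argc]=strtok(commandline,SEP);
    //keep one slot free for the terminating NULL that execvp needs
    while(g_argv[g_argc]!=nullptr&&g_argc<MAXARGC-1)
    {
        g_argv[++g_argc]=strtok(nullptr,SEP);
    }
    g_argv[g_argc]=nullptr;
    return g_argc>0;
}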

void PrintArgv()
{
    for(int i=0;g_argv[i];i++)
    {
        printf("argv[%d]->%s\n",i,g_argv[i]);
    }
    printf("argc:%d\n",g_argc);
}


bool GetCommandLine(char*out,int size)
{
    char*c=fgets(out,size,stdin);
    if(c==NULL)return false;
    //strip the trailing newline, if there is one
    size_t len=strlen(out);
    if(len>0&&out[len-1]=='\n')out[len-1]=0;
    if(strlen(out)==0)return false;
    return true;
}

std::string DirName(const char*pwd)
{
#define SLASH "/"
    std::string dir=pwd;
    if(dir==SLASH)return SLASH;
    auto pos =dir.rfind(SLASH);
    if(pos==std::string::npos)return "BUG";
    return dir.substr(pos+1);
}



void MakeCommandline(char cmd_prompt[],int size)
{
   snprintf(cmd_prompt,size,FORMAT,GetUserName(),GetHostName(),DirName(GetPWD()).c_str()); 
}

void PrintCommandPrompt()
{
    char prompt[command_line];
    MakeCommandline(prompt,sizeof(prompt));
    printf("%s",prompt);
    fflush(stdout);
}

int Execute()
{
    pid_t id=fork();
    if(id==0)
    {
      execvp(g_argv[0],g_argv);
      exit(1);
    }
    
    pid_t rid=waitpid(id,nullptr,0);
    (void)rid;
    return 0;
}


int main()
{
   
    while(true)
    {
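      //print the command prompt
      PrintCommandPrompt();
      
      //read the user's command
      char commandline[command_line];
      if(!GetCommandLine(commandline,sizeof(commandline)))
      {
          if(feof(stdin))break; //end of input (Ctrl+D): leave the shell
          continue;
      }
      
      //parse the command line
      if(!GetCommandParse(commandline))continue; //nothing to run
      // PrintArgv();
    
      //run the command
      Execute();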

      

    }
    
    return 0;
}

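This article is part of the Tencent Cloud self-media sync exposure program and is shared from the author's personal site/blog.

Originally published: 2025-06-15. In case of infringement, contact cloudcommunity@tencent.com for removal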
