Returning char* vs. char[] from a function

Environment

 Linux VM-0-2-centos 2.6.32-754.35.1.el6.x86_64 #1 SMP Sat Nov 7 12:42:14 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

Example

#include <stdio.h>

char* get_name_1()
{
   char name[] = "zhaoyuhe";
   return name;
}
char* get_name_2()
{
   char *name = "wangzhere";
   return name;
}
char *get_name_3()
{
   return "ouyangli";
}

int get_value_1()
{
   int value = 99;
   return value;
}
int* get_value_2()
{
   int value = 65;
   return &value;
}
int get_value_3()
{
   return 52;
}
int main()
{
   printf("%s %p\n", get_name_1(),get_name_1());
   printf("%s %p\n", get_name_2(),get_name_2());
   printf("%s %p\n", get_name_3(),get_name_3());
   printf("%d\n", get_value_1());
   printf("%d %p\n", (int*)get_value_2(),get_value_2());
   printf("%d\n", get_value_3());

   return 0;
}

Output


 .▒ 0x7ffe5bbc6c90
wangzhere 0x400748
ouyangli 0x400752
99
1539075228 0x7ffe5bbc6c9c
52

Analysis of char[]

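get_name_1 returns the local array name[]. The printed address is a stack address in the process's virtual memory, and that stack frame is released as soon as get_name_1 returns, so the string prints as garbage. The disassembly in gdb shows what the function does: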


(gdb) disassemble
Dump of assembler code for function get_name_1:
   0x00000000004004c4 <+0>:     push   %rbp
   0x00000000004004c5 <+1>:     mov    %rsp,%rbp
=> 0x00000000004004c8 <+4>:     movl   $0x6f61687a,-0x10(%rbp)
   0x00000000004004cf <+11>:    movl   $0x65687579,-0xc(%rbp)
   0x00000000004004d6 <+18>:    movb   $0x0,-0x8(%rbp)
   0x00000000004004da <+22>:    lea    -0x10(%rbp),%rax
   0x00000000004004de <+26>:    leaveq
   0x00000000004004df <+27>:    retq
End of assembler dump.

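The instructions movl $0x6f61687a,-0x10(%rbp), movl $0x65687579,-0xc(%rbp) and movb $0x0,-0x8(%rbp) copy "zhao", "yuhe" and 0x0 (the string terminator) into name[]. Because my system is little-endian, the bytes of the immediates 0x6f61687a and 0x65687579 appear in the reverse order of the characters in the string.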
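The byte order can be checked by hand: 'z' is 0x7a, 'h' is 0x68, 'a' is 0x61 and 'o' is 0x6f, and the first character ends up in the lowest byte. A minimal sketch of that packing, where pack_le32 is a hypothetical helper written for illustration and not something the compiler emits:

```c
#include <stdint.h>

/* Hypothetical helper: combines the first four bytes of s the way a
 * little-endian CPU sees them when they are loaded as one 32-bit value.
 * The first character lands in the least significant byte. */
uint32_t pack_le32(const char *s)
{
    const unsigned char *b = (const unsigned char *)s;
    return (uint32_t)b[0]
         | (uint32_t)b[1] << 8
         | (uint32_t)b[2] << 16
         | (uint32_t)b[3] << 24;
}
```

Calling pack_le32("zhao") gives 0x6f61687a and pack_le32("yuhe") gives 0x65687579, the two immediates in the disassembly.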

get_value_1 likewise stores its value directly on the stack:


Breakpoint 3, get_value_1 () at return.c:20
20         int value = 99;
(gdb) disassemble
Dump of assembler code for function get_value_1:
   0x00000000004004fd <+0>:     push   %rbp
   0x00000000004004fe <+1>:     mov    %rsp,%rbp
=> 0x0000000000400501 <+4>:     movl   $0x63,-0x4(%rbp)
   0x0000000000400508 <+11>:    mov    -0x4(%rbp),%eax
   0x000000000040050b <+14>:    leaveq
   0x000000000040050c <+15>:    retq
End of assembler dump.

movl $0x63,-0x4(%rbp) stores 0x63 (99) in the stack slot of the local variable.

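get_value_2 also returns a stack address, so the int it points to is no longer valid once the function returns. The call in main passes that pointer itself to %d, so the 1539075228 in the output is 0x5bbc6c9c, the low 32 bits of the address 0x7ffe5bbc6c9c, and not the stored value 65.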
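The 1539075228 on the get_value_2 line of the output can be checked against the address 0x7ffe5bbc6c9c printed next to it. A minimal sketch, assuming the %d conversion simply consumed the low 32 bits of the pointer argument (formally undefined behavior, but what typically happens on x86-64). low32_as_int is a hypothetical helper, not part of the original program:

```c
#include <stdint.h>

/* Hypothetical helper: keeps only the low 32 bits of a 64-bit address
 * and returns them as a 32-bit signed integer. For this address the
 * masked value is below 0x80000000, so the conversion is exact. */
int32_t low32_as_int(uint64_t addr)
{
    return (int32_t)(uint32_t)(addr & 0xffffffffu);
}
```

low32_as_int(0x7ffe5bbc6c9cULL) evaluates to 1539075228, which matches the number in the output.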

Analysis of char*

The string in get_name_2 prints correctly because the string literal is stored in the constant (read-only data) area, not on the stack.


Breakpoint 1, get_name_2 () at return.c:10
10         char *name = "wangzhere";
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.212.el6_10.3.x86_64
(gdb) disassemble
Dump of assembler code for function get_name_2:
   0x00000000004004e0 <+0>:     push   %rbp
   0x00000000004004e1 <+1>:     mov    %rsp,%rbp
=> 0x00000000004004e4 <+4>:     movq   $0x400748,-0x8(%rbp)
   0x00000000004004ec <+12>:    mov    -0x8(%rbp),%rax
   0x00000000004004f0 <+16>:    leaveq
   0x00000000004004f1 <+17>:    retq
End of assembler dump.
(gdb) p (char*)0x400748
$1 = 0x400748 "wangzhere"

In movq $0x400748,-0x8(%rbp), 0x400748 is an address in the constant area. Its lifetime is the same as the process's, which is why p (char*)0x400748 can print the string.


Breakpoint 2, get_name_3 () at return.c:15
15         return "ouyangli";
(gdb) disassemble
Dump of assembler code for function get_name_3:
   0x00000000004004f2 <+0>:     push   %rbp
   0x00000000004004f3 <+1>:     mov    %rsp,%rbp
=> 0x00000000004004f6 <+4>:     mov    $0x400752,%eax
   0x00000000004004fb <+9>:     leaveq
   0x00000000004004fc <+10>:    retq
End of assembler dump.
(gdb) p (char*)0x400752
$2 = 0x400752 "ouyangli"

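get_name_3 returns the string literal directly. The assembly is essentially the same as for get_name_2, except that the address is loaded straight into %eax without first going through a stack slot.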
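The reason both functions are safe is that a string literal has static storage duration, so the pointer stays valid after the function has returned. A minimal sketch of that property; get_literal mirrors get_name_3 with const added, and literal_survives is a hypothetical check written for illustration:

```c
#include <string.h>

/* Same shape as get_name_3 in the example, with const added because
 * the literal must not be modified. */
const char *get_literal(void)
{
    return "ouyangli";
}

/* Hypothetical check: reads the string through the returned pointer
 * after the call has finished. Returns 1 if the bytes are intact. */
int literal_survives(void)
{
    const char *p = get_literal();
    return strcmp(p, "ouyangli") == 0;
}
```

literal_survives() returns 1, whereas the same check against get_name_1 would read a released stack frame.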

Returning a value

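get_value_1 and get_value_3 return an int directly. An int fits in a single register, %eax, unlike name[], which is an array and can only be handed back as an address. So even though the value was first stored on the stack, it is copied into the register in full before the frame is released.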


Breakpoint 5, get_value_1 () at return.c:20
20         int value = 99;
(gdb) n
21         return value;
(gdb) disassemble
Dump of assembler code for function get_value_1:
   0x00000000004004fd <+0>:     push   %rbp
   0x00000000004004fe <+1>:     mov    %rsp,%rbp
   0x0000000000400501 <+4>:     movl   $0x63,-0x4(%rbp)
=> 0x0000000000400508 <+11>:    mov    -0x4(%rbp),%eax
   0x000000000040050b <+14>:    leaveq
   0x000000000040050c <+15>:    retq
End of assembler dump.

Breakpoint 6, get_value_3 () at return.c:30
30         return 52;
(gdb) disassemble
Dump of assembler code for function get_value_3:
   0x000000000040051e <+0>:     push   %rbp
   0x000000000040051f <+1>:     mov    %rsp,%rbp
=> 0x0000000000400522 <+4>:     mov    $0x34,%eax
   0x0000000000400527 <+9>:     leaveq
   0x0000000000400528 <+10>:    retq
End of assembler dump.

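The only difference between returning 52 directly and returning value is that returning value first stores it to the stack and then loads it back.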
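The same "copy the value out before the frame goes away" behavior can be obtained for a small string by wrapping the array in a struct, since a struct can be returned by value while a bare array cannot. A minimal sketch, assuming a fixed 16-byte buffer is enough; struct name_buf and get_name_by_value are hypothetical names that do not appear in the original program:

```c
#include <string.h>

/* Hypothetical wrapper type: the array lives inside a struct, so the
 * whole thing can be returned by value. */
struct name_buf {
    char text[16];
};

struct name_buf get_name_by_value(void)
{
    struct name_buf n;
    memset(&n, 0, sizeof n);
    strcpy(n.text, "zhaoyuhe");   /* 8 characters + NUL fits in 16 bytes */
    return n;                     /* the struct's contents are copied to the caller */
}
```

The caller writes struct name_buf n = get_name_by_value(); and can then print n.text safely, because n is the caller's own copy.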

Summary

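A char[] is allocated on the stack and the string is copied into it, so once the function returns that space has been released and the string can no longer be printed reliably.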

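A string literal assigned to a char* lives in the constant area, so the pointer can be returned directly and the string prints correctly.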
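When the returned string is not a literal, two common ways to avoid the dangling stack pointer are to let the caller supply the buffer or to return a heap copy. A minimal sketch of both, not taken from the original program; get_name_into and get_name_heap are hypothetical names:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Option 1: the caller owns the buffer, so it outlives the call.
 * Returns 0 on success, -1 if the buffer is too small. */
int get_name_into(char *buf, size_t size)
{
    int n = snprintf(buf, size, "%s", "zhaoyuhe");
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}

/* Option 2: heap copy. The caller must free() the result.
 * Returns NULL if the allocation fails. */
char *get_name_heap(void)
{
    const char *src = "zhaoyuhe";
    char *copy = malloc(strlen(src) + 1);
    if (copy != NULL)
        strcpy(copy, src);
    return copy;
}
```

With the first option the caller declares char buf[16] and calls get_name_into(buf, sizeof buf). With the second, the caller keeps the returned pointer and calls free() on it when done.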