A Walkthrough of the cp Source Code
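Preface
Earlier articles in this series mentioned the cp command. There we used strace as a magnifying glass to look at its skeleton, that is, the system-call layer. This article goes one step further, down to the level of "cells", with a brief look at its source code.
The call stack of copy
The actual data copy is done by the function copy_reg. Its call stack, innermost frame first, is: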
copy_reg
copy_internal
copy
do_copy
main
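The call stack is listed mainly to give the reader a bird's-eye view. This article does not cover the callers of copy_reg (copy_internal through main).
copy_reg - the function that actually copies the data
Its main code is as follows: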
if (data_copy_required)
{
/* Choose a suitable buffer size; it may be adjusted later. */
size_t buf_size = io_blksize (&sb);
size_t hole_size = STP_BLKSIZE (&sb);
/* Deal with sparse files. */
//Added note: 1. Determine whether the source file is a sparse file
enum scantype scantype = infer_scantype (source_desc, &src_open_sb,
&scan_inference);
if (scantype == ERROR_SCANTYPE)
{
error (0, errno, _("cannot lseek %s"), quoteaf (src_name));
return_val = false;
goto close_src_and_dst_desc;
}
bool make_holes
= (S_ISREG (sb.st_mode)
&& (x->sparse_mode == SPARSE_ALWAYS
|| (x->sparse_mode == SPARSE_AUTO
&& scantype != PLAIN_SCANTYPE)));
//Added note: 2. Call fdadvise to improve read performance
fdadvise (source_desc, 0, 0, FADVISE_SEQUENTIAL);
/* If not making a sparse file, try to use a more-efficient
buffer size. */
if (! make_holes)
{
//Added note: 3. Compute the buffer size for a regular file
/* Compute the least common multiple of the input and output
buffer sizes, adjusting for outlandish values.
Note we read in multiples of the reported block size
to support (unusual) devices that have this constraint. */
size_t blcm_max = MIN (SIZE_MAX, SSIZE_MAX);
size_t blcm = buffer_lcm (io_blksize (&src_open_sb), buf_size,
blcm_max);
/* Do not bother with a buffer larger than the input file, plus one
byte to make sure the file has not grown while reading it. */
if (S_ISREG (src_open_sb.st_mode) && src_open_sb.st_size < buf_size)
buf_size = src_open_sb.st_size + 1;
/* However, stick with a block size that is a positive multiple of
blcm, overriding the above adjustments. Watch out for
overflow. */
buf_size += blcm - 1;
buf_size -= buf_size % blcm;
if (buf_size == 0 || blcm_max < buf_size)
buf_size = blcm;
}
off_t n_read;
bool wrote_hole_at_eof = false;
if (! (
#ifdef SEEK_HOLE
scantype == LSEEK_SCANTYPE
? lseek_copy (source_desc, dest_desc, &buf, buf_size, hole_size,
scan_inference.ext_start, src_open_sb.st_size,
make_holes ? x->sparse_mode : SPARSE_NEVER,
x->reflink_mode != REFLINK_NEVER,
src_name, dst_name) //Added note: 4.2 copy a sparse file
:
#endif
sparse_copy (source_desc, dest_desc, &buf, buf_size,
make_holes ? hole_size : 0,
x->sparse_mode == SPARSE_ALWAYS,
x->reflink_mode != REFLINK_NEVER,
src_name, dst_name, UINTMAX_MAX, &n_read,
&wrote_hole_at_eof))) //Added note: 4.1 copy a regular file
{
return_val = false;
goto close_src_and_dst_desc;
}
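Before analyzing the code, the reader needs to understand one concept: the sparse file (a file with holes). If it is unfamiliar, please look it up first. I will start by creating one sparse file and one regular file, both to give a first impression and to have concrete examples for the code discussion below.
0. Create a regular file and a sparse file
# Regular file, 20480 bytes
dd if=/dev/urandom of=./reg_file bs=4096 count=5
# Sparse file: 4096-byte hole + 4096 bytes of data + 4096-byte hole + 4096 bytes of data + 4096 bytes of all-zero data
dd if=/dev/urandom of=holed_file bs=4k count=1 seek=1
dd if=/dev/urandom of=holed_file bs=4k count=1 seek=3
dd if=/dev/zero of=holed_file bs=4k count=1 seek=4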
There is one more concept, reflink, which we will leave aside for now.
1. Determine whether the source file is a sparse file
enum scantype scantype = infer_scantype (source_desc, &src_open_sb,
&scan_inference);
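The decision is based on two pieces of information in the stat structure, the size and the block count. Let me simply use the stat command to compare a regular file with a sparse file: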
mzhai:$ stat reg_file
File: reg_file
Size: 20480 Blocks: 40 IO Block: 4096 regular file
Device: fd01h/64769d Inode: 930859 Links: 1
mzhai:$ stat holed_file
File: holed_file
Size: 20480 Blocks: 24 IO Block: 4096 regular file
Device: fd01h/64769d Inode: 930853 Links: 1
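Both files have the same Size, 20480 bytes, but their Blocks values differ. For reg_file the blocks add up to 40*512 = 20480 bytes, exactly Size. holed_file has only 24 blocks, so it occupies 24*512 = 12288 bytes on disk, which is less than Size. This is precisely the criterion for detecting a sparse file, i.e. this check in the code:
infer_scantype (...) {
...
STP_NBLOCKS (sb) < sb->st_size / ST_NBLOCKSIZE
...
Once the file is known to be sparse, cp goes on to find where the first real data starts. It probes with lseek, and the result is used later: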
off_t ext_start = lseek (fd, 0, SEEK_DATA);
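The block-count heuristic above can be written out as a small standalone sketch. It is not the coreutils code: the function names are made up for illustration, and it assumes that st_blocks is counted in 512-byte units, which holds on Linux but is not guaranteed everywhere.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Assumption: st_blocks is reported in 512-byte units (true on Linux). */
enum { STAT_BLOCK_UNIT = 512 };

/* True when fewer blocks are allocated than the apparent size needs. */
bool blocks_suggest_sparse(off_t size, blkcnt_t blocks)
{
    return blocks < size / STAT_BLOCK_UNIT;
}

/* Apply the heuristic to an open file descriptor (regular files only). */
bool fd_is_probably_sparse(int fd)
{
    struct stat st;
    if (fstat(fd, &st) != 0)
        return false;
    return S_ISREG(st.st_mode)
        && blocks_suggest_sparse(st.st_size, st.st_blocks);
}
```

With the numbers from the stat output above, blocks_suggest_sparse(20480, 40) is false for reg_file and blocks_suggest_sparse(20480, 24) is true for holed_file.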
2. Call fdadvise to improve performance
fdadvise (source_desc, 0, 0, FADVISE_SEQUENTIAL);
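This tells the kernel that the file is about to be read sequentially, so that it can prepare (for example by reading ahead more aggressively) and improve performance. fdadvise is gnulib's thin wrapper around posix_fadvise; offset 0 and length 0 together mean the whole file.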
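3. Compute the buf_size used to copy a regular file
When copying a large file, we cannot read the whole content into one huge buffer and then write it to the destination in a single call; the file is copied chunk by chunk instead. As mentioned in an earlier article, cp considers 128*1024 bytes to be the best buffer size. The code above then rounds buf_size up to a multiple of blcm, the least common multiple of the source and destination I/O block sizes.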
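To make the rounding step concrete, here is a simplified sketch of the calculation. It is an assumption-laden stand-in, not the coreutils code: pick_buf_size and gcd are hypothetical names, both block sizes are assumed to be non-zero, and the overflow handling of the real buffer_lcm is left out.

```c
#include <stddef.h>

/* Greatest common divisor by Euclid's algorithm. */
static size_t gcd(size_t a, size_t b)
{
    while (b != 0) {
        size_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Simplified version of the buffer-size choice for a regular file.
   src_blksize and dst_blksize play the role of io_blksize() of the
   source and the destination; both must be non-zero. */
size_t pick_buf_size(size_t src_blksize, size_t dst_blksize, size_t file_size)
{
    size_t buf_size = dst_blksize;
    size_t blcm = src_blksize / gcd(src_blksize, dst_blksize) * dst_blksize;

    /* No need for a buffer larger than the file, plus one byte so that
       a file that grew while being read can be noticed. */
    if (file_size < buf_size)
        buf_size = file_size + 1;

    /* Round up to a positive multiple of blcm. */
    buf_size += blcm - 1;
    buf_size -= buf_size % blcm;
    return buf_size;
}
```

For the 20480-byte reg_file with both block sizes at 128*1024, the buffer is first shrunk to 20481 bytes and then rounded back up to 131072, so the 128 KiB figure survives the adjustment.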
4.1 Copy a regular file
sparse_copy (source_desc, dest_desc, &buf, buf_size,
make_holes ? hole_size : 0,
x->sparse_mode == SPARSE_ALWAYS,
x->reflink_mode != REFLINK_NEVER,
src_name, dst_name, UINTMAX_MAX, &n_read,
&wrote_hole_at_eof)))
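It first tries the efficient copy function copy_file_range; if that does not succeed, it falls back to copying chunk by chunk with read and write.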
The copy_file_range() system call performs an in-kernel copy between two file descriptors without the additional cost of transferring data from the kernel to user space and then back into the kernel.
The copy_file_range() system call first appeared in Linux 4.5, but glibc 2.27 provides a user-space emulation when it is not available.
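The "try the fast path, then fall back" idea can be sketched as follows. This is a simplified illustration, not the logic of sparse_copy: the function name copy_all is hypothetical, the system call is invoked through syscall() so that the sketch does not depend on the glibc wrapper, and any failure of the fast path simply leads to the plain read/write loop.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy from the current offset of in_fd to out_fd until end of file.
   Returns the number of bytes copied, or -1 on error. */
long long copy_all(int in_fd, int out_fd)
{
    long long total = 0;

#ifdef SYS_copy_file_range
    /* Fast path: let the kernel move the data.  NULL offsets mean
       "use and update the file offsets of the descriptors". */
    for (;;) {
        long n = syscall(SYS_copy_file_range, in_fd, NULL, out_fd, NULL,
                         (size_t) 1 << 20, 0);
        if (n <= 0)
            break;              /* EOF or unsupported: let the loop below decide */
        total += n;
    }
#endif

    /* Fallback: ordinary read/write loop with a 128 KiB buffer. */
    char buf[128 * 1024];
    for (;;) {
        ssize_t n_read = read(in_fd, buf, sizeof buf);
        if (n_read == 0)
            return total;       /* end of file */
        if (n_read < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        for (ssize_t off = 0; off < n_read; ) {
            ssize_t n_written = write(out_fd, buf + off, (size_t) (n_read - off));
            if (n_written < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += n_written;
        }
        total += n_read;
    }
}
```

Because both paths advance the same file offsets, the fallback loop picks up exactly where the fast path stopped, whether that was at end of file or at an error.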
4.2 Copy a sparse file
scantype == LSEEK_SCANTYPE
? lseek_copy (source_desc, dest_desc, &buf, buf_size, hole_size,
scan_inference.ext_start, src_open_sb.st_size,
make_holes ? x->sparse_mode : SPARSE_NEVER,
x->reflink_mode != REFLINK_NEVER,
src_name, dst_name)
The principle is fairly simple: call lseek to tell data regions from holes, using its third argument, whence (SEEK_DATA or SEEK_HOLE).
Since version 3.1, Linux supports the following additional values for whence:
SEEK_DATA
Adjust the file offset to the next location in the file greater than or equal to offset containing data. If offset points to data, then the file offset is set to offset.
SEEK_HOLE
Adjust the file offset to the next hole in the file greater than or equal to offset. If offset points into the middle of a hole, then the file offset is set to offset. If there is no hole past offset, then the file offset is adjusted to the end of the file (i.e., there is an implicit hole at the end of any file).
For a hole, create_hole is called to create a hole in the destination file. For a data region, the sparse_copy function described above is called to copy the data.
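For the holed_file created above, which contains two holes, this means that the two holes are recreated with create_hole and the data regions in between are copied with sparse_copy.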
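The way SEEK_DATA and SEEK_HOLE alternate can be seen in the following sketch, which walks a file and adds up the bytes that sit in data regions. It only illustrates the scanning loop, not lseek_copy itself: count_data_bytes is a hypothetical name, and the fallback definitions of SEEK_DATA and SEEK_HOLE assume the Linux values, since some libc headers expose them only when _GNU_SOURCE is defined.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SEEK_DATA
# define SEEK_DATA 3   /* assumption: Linux value */
# define SEEK_HOLE 4   /* assumption: Linux value */
#endif

/* Return the total number of bytes in the data regions of fd,
   or -1 on error.  The file offset of fd is changed by the scan. */
long long count_data_bytes(int fd)
{
    long long total = 0;
    off_t pos = 0;

    for (;;) {
        /* Start of the next data region at or after pos. */
        off_t data = lseek(fd, pos, SEEK_DATA);
        if (data < 0)
            return errno == ENXIO ? total : -1;   /* ENXIO: no more data */

        /* End of that region: the next hole, or end of file. */
        off_t hole = lseek(fd, data, SEEK_HOLE);
        if (hole < 0)
            return -1;

        /* [pos, data) is a hole; [data, hole) holds data. */
        total += hole - data;
        pos = hole;
    }
}
```

On a file system that reports holes at 4096-byte granularity, holed_file would be expected to produce a hole at [0, 4096), data at [4096, 8192), a hole at [8192, 12288) and data at [12288, 20480), for a total of 12288 data bytes, which matches the 24 blocks shown by stat. A copy loop would call the hole-creating routine for each hole and the data-copying routine for each data region.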
The cp command looks simple, yet it touches on a lot of Linux knowledge and is well worth reading. Every if statement tells a story.