Perl Study Notes (8): File and Directory Operations

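Overview of File Operations in Perl

Perl reads from and writes to files through file handles. A file handle is the name of an I/O connection: once a handle has been attached to a file, every read or write goes through it. Perl predefines three standard file handles, STDIN, STDOUT and STDERR, for standard input, standard output and standard error.

A file can be opened in Perl in the following ways: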

open FILEHANDLE, EXPR
open FILEHANDLE
sysopen FILEHANDLE, FILENAME, MODE, PERMS
sysopen FILEHANDLE, FILENAME, MODE

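Parameters:

FILEHANDLE: the file handle that will refer to the opened file. It is a name, not a number; the numeric file descriptor is managed by the operating system underneath.
EXPR: an expression made up of the access mode and the file name.
MODE: the access mode.
PERMS: the permission bits used if the file is created.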

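The open function

Opening read-only

open opens a file. Its basic form is open(filevar, filename), where filevar is the file handle and filename is the file name, given as either an absolute or a relative path. open(DATA,"<file.txt"); opens file.txt read-only (the < sign selects read-only mode), as shown below: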

biotest@ubuntu:~/perl/06file$ cat open.pl 
#!/usr/bin/perl
open(DATA,"<file.txt") or die "file.txt can not be opened,$!";
while(<DATA>){
    print "$_";
}
biotest@ubuntu:~/perl/06file$ perl open.pl 
This is a test file.
biotest@ubuntu:~/perl/06file$ cat file.txt 
This is a test file.
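
The session above uses the two-argument form of open with a bareword handle. A minimal sketch of the same read loop with the three-argument form and a lexical file handle follows. The temporary file and the variable names ($path, $in, $line_count) are assumptions made so the sketch runs on its own; they are not part of the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a throwaway file so the sketch does not depend on file.txt existing.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
print {$tmp_fh} "This is a test file.\n";
close($tmp_fh) or die "cannot close $path: $!";

# Three-argument open: the mode is separate from the file name,
# and $in is a lexical handle instead of a bareword such as DATA.
open(my $in, '<', $path) or die "$path cannot be opened: $!";
my $line_count = 0;
while (my $line = <$in>) {
    print $line;
    $line_count++;
}
close($in) or die "cannot close $path: $!";
```

Keeping the mode in its own argument means a file name that happens to start with < or > cannot change how the file is opened.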

Opening for writing

The > sign opens a file for writing. If the file exists it is truncated; otherwise it is created:

open(DATA,">file.txt") or die "file.txt can not be opened, $!";

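Opening for reading and writing

To open a file for both reading and writing, put a + sign in front of the < or > character. The +< form keeps the existing content of the file:

open(DATA,"+<file.txt") or die "file.txt can not be opened,$!";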

Truncating the file

To empty the file while opening it for reading and writing, use +> instead, that is, a > sign after the + sign:

open DATA, "+>file.txt" or die "file can not be opened, $!";

Appending data

The double greater-than sign (>>) opens a file for appending:

open(DATA,">>file.txt") || die "file.txt can not be opened,$!";

Reading and appending

To read from a file that is opened for appending, put a + sign in front of the append sign:

open(DATA,"+>>file.txt") || die "file.txt can not be opened,$!";

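Summary of access modes

The letters in parentheses are the equivalent C fopen modes; Perl's open itself only accepts the symbolic forms.

Mode	Description
`<` (`r`)	Open read-only, with the file pointer at the start of the file.
`>` (`w`)	Open for writing, with the file pointer at the start of the file and the file truncated to zero length. The file is created if it does not exist.
`>>` (`a`)	Open for writing, with the file pointer at the end of the file. The file is created if it does not exist.
`+<` (`r+`)	Open for reading and writing, with the file pointer at the start of the file.
`+>` (`w+`)	Open for reading and writing, with the file pointer at the start of the file and the file truncated to zero length. The file is created if it does not exist.
`+>>` (`a+`)	Open for reading and writing, with the file pointer at the end of the file. The file is created if it does not exist.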
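
The difference between the write, append and truncate behaviour in the table can be checked with a short sketch like the one below. The temporary directory, the file name modes.txt and the variable names are assumptions chosen for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);   # scratch directory, removed at exit
my $path = "$dir/modes.txt";        # hypothetical file name

# '>' creates the file (or truncates it) and writes from the start.
open(my $out, '>', $path) or die "cannot write $path: $!";
print {$out} "first\n";
close($out) or die "cannot close $path: $!";

# '>>' keeps the existing content and writes at the end.
open(my $app, '>>', $path) or die "cannot append to $path: $!";
print {$app} "second\n";
close($app) or die "cannot close $path: $!";

# '<' reads the file back; both lines are still there.
open(my $in, '<', $path) or die "cannot read $path: $!";
my @after_append = <$in>;
close($in) or die "cannot close $path: $!";

# Opening with '>' again truncates the file to zero length.
open(my $trunc, '>', $path) or die "cannot write $path: $!";
close($trunc) or die "cannot close $path: $!";
my $is_empty = -z $path ? 1 : 0;

print "lines after append: ", scalar(@after_append), "\n";
print "empty after reopening with '>': $is_empty\n";
```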

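The sysopen function

sysopen is similar to open but takes the access mode as a separate numeric argument:

sysopen(FH, $filename, O_RDWR|O_CREAT, 0666);   # the | between the flags is a bitwise OR

Parameters:

1. FH: the file handle, the same as the first argument of open.
2. $filename: the file name.
3. O_RDWR|O_CREAT: the mode value, built by combining constants from the Fcntl module with bitwise OR.
4. The permission bits. This argument is optional and is an octal value; 0666 is the usual choice for data files and 0777 for programs.

sysopen returns true if the file could be opened and false otherwise.

The equivalent of opening a file for reading and writing (+<filename) is sysopen(DATA,"file.txt",O_RDWR);. To truncate the file before updating it, use sysopen(DATA,"file.txt",O_RDWR|O_TRUNC);. The available mode values are listed in the table below: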

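Mode	Description
O_RDWR	Open for reading and writing, with the file pointer at the start of the file.
O_RDONLY	Open read-only, with the file pointer at the start of the file.
O_WRONLY	Open write-only. On its own it neither truncates nor creates the file; combine it with O_TRUNC or O_CREAT for that.
O_CREAT	Create the file if it does not exist.
O_APPEND	Append: every write goes to the end of the file.
O_TRUNC	Truncate the file to zero length.
O_EXCL	Used together with O_CREAT, fail if the file already exists; this can be used to test whether a file exists.
O_NONBLOCK	Non-blocking I/O: an operation either succeeds or returns an error immediately instead of blocking.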
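
As a minimal sketch of how these flags combine, the following creates a file only if it does not already exist and then shows that a second exclusive create fails. The file name sysopen.txt, the 0644 permission value and the variable names are assumptions for illustration; the O_ constants come from the core Fcntl module.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl;                          # exports O_WRONLY, O_CREAT, O_EXCL, ...
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);   # scratch directory, removed at exit
my $path = "$dir/sysopen.txt";      # hypothetical file name

# Create the file only if it does not exist yet (O_CREAT | O_EXCL).
sysopen(my $fh, $path, O_WRONLY | O_CREAT | O_EXCL, 0644)
    or die "cannot create $path: $!";
print {$fh} "created with sysopen\n";
close($fh) or die "cannot close $path: $!";

# A second exclusive create fails because the file now exists.
my $second_ok = sysopen(my $again, $path, O_WRONLY | O_CREAT | O_EXCL, 0644);
print "second exclusive create ",
      ($second_ok ? "succeeded" : "failed: $!"), "\n";
```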

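The close function

When you have finished with a file, close it so that the I/O buffers associated with the file handle are flushed. The syntax is:

close FILEHANDLE
close

FILEHANDLE is the file handle to close. close returns true if the handle was closed successfully:

close(DATA) || die "can not close file";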

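Reading and writing files

There are several ways to read data from and write data to a file.

Method 1: the `<FILEHANDLE>` operator

The main way to read from an open file handle is the <FILEHANDLE> operator. In scalar context it returns a single line from the file handle; in list context it returns all remaining lines, which the following import.pl uses to load a whole file into an array:

open(DATA,"<import.txt") or die "can not open file";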
@lines=<DATA>;
print @lines;
close(DATA);
biotest@ubuntu:~/perl/06file$ perl import.pl
1
2
3
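
The difference between scalar and list context can be seen side by side in the following sketch. The temporary file stands in for import.txt, and the variable names are assumptions for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Throwaway three-line file standing in for import.txt.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
print {$tmp_fh} "1\n2\n3\n";
close($tmp_fh) or die "cannot close $path: $!";

open(my $in, '<', $path) or die "cannot open $path: $!";
my $first = <$in>;      # scalar context: one line ("1\n")
my @rest  = <$in>;      # list context: all remaining lines ("2\n", "3\n")
close($in) or die "cannot close $path: $!";

chomp($first);          # strip the trailing newline
chomp(@rest);           # chomp works on every element of an array
print "first: $first\n";
print "rest: @rest\n";
```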

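The getc function

getc returns a single character from the specified FILEHANDLE, or from STDIN if no file handle is given:

getc FILEHANDLE
getc

It returns undef if an error occurs or the file handle is at end of file.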
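
A minimal sketch of reading a file one character at a time with getc, relying on the undef return value to stop at end of file. The temporary file and the variable names are assumptions for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($tmp_fh, $path) = tempfile(UNLINK => 1);
print {$tmp_fh} "abc";
close($tmp_fh) or die "cannot close $path: $!";

open(my $in, '<', $path) or die "cannot open $path: $!";
my @chars;
# getc returns undef at end of file, which ends the loop.
# defined() is needed so that a "0" character does not stop it early.
while (defined(my $ch = getc($in))) {
    push @chars, $ch;
}
close($in) or die "cannot close $path: $!";

print join(',', @chars), "\n";
```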

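The read function

read reads a block of data from a file handle through Perl's buffered I/O layer. It is typically used to read binary data from a file:

read FILEHANDLE,SCALAR,LENGTH,OFFSET
read FILEHANDLE,SCALAR,LENGTH

Parameters:

1. FILEHANDLE: the file handle to read from.
2. SCALAR: the variable that receives the data. Without OFFSET, the data is placed at the start of SCALAR; otherwise it is placed after the first OFFSET bytes of SCALAR.
3. LENGTH: the number of bytes to read.
4. OFFSET: the offset within SCALAR.

read returns the number of bytes actually read, 0 at end of file, and undef if an error occurs.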
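
The effect of OFFSET and the three kinds of return value are easier to see in a small sketch. The ten-byte temporary file and the variable names are assumptions for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($tmp_fh, $path) = tempfile(UNLINK => 1);
binmode($tmp_fh);
print {$tmp_fh} "0123456789";
close($tmp_fh) or die "cannot close $path: $!";

open(my $in, '<', $path) or die "cannot open $path: $!";
binmode($in);                       # treat the data as raw bytes

my $buf = '';
my $n1 = read($in, $buf, 4);        # no OFFSET: $buf becomes "0123"
my $n2 = read($in, $buf, 4, 4);     # placed after the first 4 bytes of $buf
my $n3 = read($in, $buf, 4, 8);     # only 2 bytes are left in the file
my $n4 = read($in, $buf, 4, 10);    # end of file: read returns 0
close($in) or die "cannot close $path: $!";

print "returned: $n1 $n2 $n3 $n4\n";
print "buffer: $buf\n";
```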

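The print function

Whichever function is used to read from a file handle, the main function for writing is print:

print FILEHANDLE LIST
print LIST
print

With a file handle, print sends its output to that handle; without one, it writes to the default output handle, normally STDOUT (standard output):

print "Hello World!\n";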
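
The three forms of print can be compared in the following sketch, which writes to STDOUT, to STDERR and to a file. The temporary file and the variable names are assumptions for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($out, $path) = tempfile(UNLINK => 1);

print "to STDOUT\n";              # no handle: goes to STDOUT by default
print STDERR "to STDERR\n";       # explicit standard handle
print {$out} "to the file\n";     # lexical handle; there is no comma after it
close($out) or die "cannot close $path: $!";

# Read the file back to confirm what was written.
open(my $in, '<', $path) or die "cannot open $path: $!";
my $written = <$in>;
close($in) or die "cannot close $path: $!";
```

Wrapping the handle in braces, as in print {$out} ..., is optional for a simple scalar but makes it clear which argument is the handle.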

File operation examples

Copying file data

The following example opens the existing file file1.txt, reads it line by line, and writes each line to file2.txt:

biotest@ubuntu:~/perl/06file$ cat file1.txt
This is line1 of file1.txt
This is line2 of file1.txt
This is line3 of file1.txt
biotest@ubuntu:~/perl/06file$ cat operation.pl 
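#!/usr/bin/perl
# Open file1.txt read-only
open(DATA1,"<file1.txt") or die "file1.txt can not be opened, $!";
# Open (create or truncate) file2.txt for writing
open(DATA2,">file2.txt") or die "file2.txt can not be opened, $!";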
# copy data
while(<DATA1>)
{
    print DATA2 $_;
}
close(DATA1);
close(DATA2);
biotest@ubuntu:~/perl/06file$ perl operation.pl 
biotest@ubuntu:~/perl/06file$ cat file2.txt 
This is line1 of file1.txt
This is line2 of file1.txt
This is line3 of file1.txt
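
The loop above copies the file by hand. As an alternative sketch, the core File::Copy module offers a copy function that does the same job in one call. This swaps in a different technique from the original script; the temporary directory and the variable names are assumptions for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Copy qw(copy);
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);    # scratch directory, removed at exit
my ($src, $dst) = ("$dir/file1.txt", "$dir/file2.txt");

# Recreate the three-line source file used in the example above.
open(my $out, '>', $src) or die "cannot write $src: $!";
print {$out} "This is line$_ of file1.txt\n" for 1 .. 3;
close($out) or die "cannot close $src: $!";

# copy returns true on success and false (with $! set) on failure.
copy($src, $dst) or die "copy failed: $!";

my $bytes = -s $dst;                # size of the copy in bytes
print "copied $bytes bytes\n";
```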

Renaming a file

The following example renames file1.txt to new_file1.txt:

biotest@ubuntu:~/perl/06file$ ls -lh *.txt
-rw-rw-r-- 1 biotest biotest 81 May 18 20:13 file1.txt
-rw-rw-r-- 1 biotest biotest 81 May 18 20:17 file2.txt
-rw-rw-r-- 1 biotest biotest 21 May 18 19:44 file.txt
-rw-rw-r-- 1 biotest biotest  7 May 18 20:04 import.txt
biotest@ubuntu:~/perl/06file$ cat rename.pl 
#!/usr/bin/perl
rename("/home/biotest/perl/06file/file1.txt","/home/biotest/perl/06file/new_file1.txt");
biotest@ubuntu:~/perl/06file$ perl rename.pl 
biotest@ubuntu:~/perl/06file$ ls -lh *.txt
-rw-rw-r-- 1 biotest biotest 81 May 18 20:17 file2.txt
-rw-rw-r-- 1 biotest biotest 21 May 18 19:44 file.txt
-rw-rw-r-- 1 biotest biotest  7 May 18 20:04 import.txt
-rw-rw-r-- 1 biotest biotest 81 May 18 20:13 new_file1.txt

Deleting a file

The following example deletes file2.txt from the current directory with the unlink function:

biotest@ubuntu:~/perl/06file$ ls -lh *.txt
-rw-rw-r-- 1 biotest biotest 81 May 18 20:17 file2.txt
-rw-rw-r-- 1 biotest biotest 21 May 18 19:44 file.txt
-rw-rw-r-- 1 biotest biotest  7 May 18 20:04 import.txt
-rw-rw-r-- 1 biotest biotest 81 May 18 20:13 new_file1.txt
biotest@ubuntu:~/perl/06file$ vim delete.pl
biotest@ubuntu:~/perl/06file$ cat delete.pl 
#!/usr/bin/perl
unlink("/home/biotest/perl/06file/file2.txt");
biotest@ubuntu:~/perl/06file$ perl delete.pl 
biotest@ubuntu:~/perl/06file$ ls -lh *.txt
-rw-rw-r-- 1 biotest biotest 21 May 18 19:44 file.txt
-rw-rw-r-- 1 biotest biotest  7 May 18 20:04 import.txt
-rw-rw-r-- 1 biotest biotest 81 May 18 20:13 new_file1.txt

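Positioning within a file

tell reports the current position in a file, and seek moves to a given position.

The tell function

tell is used as follows:

tell FILEHANDLE
tell

With FILEHANDLE, tell returns the current position of the file pointer in bytes. Without it, tell uses the file handle that was last read from.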

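The seek function

seek moves the read/write pointer of a file handle, so that the next read or write starts at the chosen byte position:

seek FILEHANDLE, POSITION, WHENCE

Parameters:

1. FILEHANDLE: the file handle whose pointer is moved.
2. POSITION: the number of bytes to move the read/write pointer.
3. WHENCE: the position the move is measured from. It can be 0, 1 or 2, meaning the start of the file, the current position, or the end of the file.

The following call moves the pointer to byte 256, counted from the start of the file:

seek DATA, 256, 0;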
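
How seek and tell interact can be sketched on a small file. The named constants SEEK_SET and SEEK_END come from the core Fcntl module and stand for the WHENCE values 0 and 2; the ten-byte temporary file and the variable names are assumptions for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(:seek);                # SEEK_SET (0), SEEK_CUR (1), SEEK_END (2)
use File::Temp qw(tempfile);

my ($tmp_fh, $path) = tempfile(UNLINK => 1);
print {$tmp_fh} "0123456789";
close($tmp_fh) or die "cannot close $path: $!";

open(my $fh, '<', $path) or die "cannot open $path: $!";

seek($fh, 4, SEEK_SET) or die "seek failed: $!";   # 4 bytes from the start
my $pos_after_seek = tell($fh);

read($fh, my $chunk, 3);                           # reads the next 3 bytes
my $pos_after_read = tell($fh);

seek($fh, -2, SEEK_END) or die "seek failed: $!";  # 2 bytes before the end
my $tail = <$fh>;                                  # rest of the file

close($fh) or die "cannot close $path: $!";
print "seek->$pos_after_seek chunk=$chunk read->$pos_after_read tail=$tail\n";
```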

Usage example: file tests

The following code checks whether a file exists and, if so, reports its type and size:

biotest@ubuntu:~/perl/06file$ cat file_info.pl 
#!/usr/bin/perl
my $file="/home/biotest/perl/06file/file.txt";
my (@description, $size);
if (-e $file)
{
    push @description, 'is a binary file' if (-B _);
    push @description, 'is a socket' if (-S _);
    push @description, 'is a text file' if (-T _);
    push @description, 'is a block special file' if (-b _);
    push @description, 'is a character special file' if (-c _);
    push @description, 'is a directory' if (-d _);
    push @description, 'is executable' if (-x _);
    push @description, (($size=-s _))? "$size byte":'null';
    print "$file information: ",join(',',@description),"\n";
}
biotest@ubuntu:~/perl/06file$ perl file_info.pl 
/home/biotest/perl/06file/file.txt information: is a text file,21 byte

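The file test operators are listed in the table below:

Operator	Description
`-A`	Time since the file was last accessed, in days
`-B`	File is a binary file
`-C`	Time since the file's inode was last changed, in days
`-M`	Time since the file was last modified, in days
`-O`	File is owned by the real UID
`-R`	File or directory is readable by the real UID/GID
`-S`	File is a socket
`-T`	File is a text file
`-W`	File or directory is writable by the real UID/GID
`-X`	File or directory is executable by the real UID/GID
`-b`	File is a block special file (such as a mounted disk)
`-c`	File is a character special file (such as an I/O device)
`-d`	File is a directory
`-e`	File or directory exists
`-f`	File is a plain file
`-g`	File or directory has the setgid bit set
`-k`	File or directory has the sticky bit set
`-l`	File is a symbolic link
`-o`	File is owned by the effective UID
`-p`	File is a named pipe (FIFO)
`-r`	File is readable by the effective UID/GID
`-s`	File or directory exists and has nonzero size (returns the size in bytes)
`-t`	File handle is a TTY (the result of the isatty() system call; this test cannot be applied to a file name)
`-u`	File or directory has the setuid bit set
`-w`	File is writable by the effective UID/GID
`-x`	File is executable by the effective UID/GID
`-z`	File exists and has zero size, that is, it is an empty file (always false for directories)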
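
A few of the operators in the table can be tried on a scratch file and directory, as in the sketch below. The file name empty.txt, the hash keys and the variable names are assumptions for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);   # scratch directory, removed at exit
my $path = "$dir/empty.txt";        # hypothetical file name

# Create an empty plain file to test against.
open(my $out, '>', $path) or die "cannot write $path: $!";
close($out) or die "cannot close $path: $!";

# Each test returns a true or false value; normalise to 1/0 for printing.
my %info = (
    file_present  => (-e $path ? 1 : 0),
    file_is_plain => (-f $path ? 1 : 0),
    file_is_dir   => (-d $path ? 1 : 0),
    file_is_empty => (-z $path ? 1 : 0),
    dir_is_dir    => (-d $dir  ? 1 : 0),
);

print "$_: $info{$_}\n" for sort keys %info;
```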

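Directory operations in Perl

The following functions are commonly used to work with directories in Perl:

opendir DIRHANDLE, EXPR  # open a directory
readdir DIRHANDLE        # read the next entry of a directory
rewinddir DIRHANDLE      # move the position back to the start
telldir DIRHANDLE        # return the current position in the directory
seekdir DIRHANDLE, POS   # move to position POS in the directory
closedir DIRHANDLE       # close the directory

Listing all files

The following example lists files with the glob function, first everything in a directory and then only the .pl files: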

biotest@ubuntu:~/perl/06file$ cat glob.pl 
#!/usr/bin/perl
# display /home/biotest/perl/06file/ all files
$dir = "/home/biotest/perl/06file/*";
my @files=glob($dir);
foreach(@files){
    print $_. "\n";
}
# display all files with .pl in /home/biotest/perl/06file/
$dir="/home/biotest/perl/06file/*.pl";
@files=glob($dir);
foreach(@files){
    print $_."\n";
}
biotest@ubuntu:~/perl/06file$ perl glob.pl 
/home/biotest/perl/06file/delete.pl
/home/biotest/perl/06file/file.txt
/home/biotest/perl/06file/file_info.pl
/home/biotest/perl/06file/glob.pl
/home/biotest/perl/06file/import.pl
/home/biotest/perl/06file/import.txt
/home/biotest/perl/06file/m1.pl
/home/biotest/perl/06file/new_file1.txt
/home/biotest/perl/06file/open.pl
/home/biotest/perl/06file/operation.pl
/home/biotest/perl/06file/rename.pl
/home/biotest/perl/06file/delete.pl
/home/biotest/perl/06file/file_info.pl
/home/biotest/perl/06file/glob.pl
/home/biotest/perl/06file/import.pl
/home/biotest/perl/06file/m1.pl
/home/biotest/perl/06file/open.pl
/home/biotest/perl/06file/operation.pl
/home/biotest/perl/06file/rename.pl

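To list every entry in the current directory, use opendir and readdir. Note that the output includes the special entries . and ..: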

biotest@ubuntu:~/perl/06file$ cat opendir.pl 
#!/usr/bin/perl
opendir(DIR,'.') or die "Can not open directory,$!";
while($file=readdir DIR){
    print "$file\n";
}
closedir DIR;
biotest@ubuntu:~/perl/06file$ perl opendir.pl 
open.pl
file.txt
..
rename.pl
glob.pl
import.pl
opendir.pl
operation.pl
new_file1.txt
file_info.pl
.
m1.pl
delete.pl
import.txt

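Listing all files ending in `.pl` in the current directory

The following script filters the directory entries with grep and sorts them: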

biotest@ubuntu:~/perl/06file$ cat display_file_pl.pl 
#!/usr/bin/perl
opendir(DIR,'.') or die "can not open directory, $!";
foreach (sort grep(/^.*\.pl$/,readdir(DIR))){
     print "$_\n";
}
closedir DIR;
biotest@ubuntu:~/perl/06file$ perl display_file_pl.pl 
delete.pl
display_file_pl.pl
file_info.pl
glob.pl
import.pl
m1.pl
open.pl
opendir.pl
operation.pl
rename.pl
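
Both directory listings above can be combined in one sketch that uses a lexical directory handle, drops the . and .. entries, and then filters by extension. The temporary directory, the three file names and the variable names are assumptions for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);    # scratch directory, removed at exit

# Populate the directory with a few hypothetical files.
for my $name (qw(b.pl a.pl notes.txt)) {
    open(my $out, '>', "$dir/$name") or die "cannot write $dir/$name: $!";
    close($out) or die "cannot close $dir/$name: $!";
}

opendir(my $dh, $dir) or die "cannot open directory $dir: $!";
# readdir returns bare names, including '.' and '..', in no particular order.
my @entries = sort grep { $_ ne '.' && $_ ne '..' } readdir($dh);
closedir($dh);

# Keep only the names that end in .pl.
my @scripts = grep { /\.pl\z/ } @entries;

print "entries: @entries\n";
print "scripts: @scripts\n";
```

Because readdir returns names without the directory part, prepend the directory (for example "$dir/$name") before opening or testing an entry.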

Creating a new directory

mkdir is the Perl function that creates a new directory. You need sufficient permissions on the parent directory for it to succeed:

biotest@ubuntu:~/perl/06file$ cat mkdir.pl 
#!/usr/bin/perl
$dir="/home/biotest/perl/06file/new";
mkdir($dir) or die "Can not create $dir directory, $!";
print "Create a directory!\n";
biotest@ubuntu:~/perl/06file$ perl mkdir.pl 
Create a directory!
biotest@ubuntu:~/perl/06file$ ls
delete.pl           glob.pl     mkdir.pl       open.pl
display_file_pl.pl  import.pl   new            operation.pl
file_info.pl        import.txt  new_file1.txt  rename.pl
file.txt            m1.pl       opendir.pl

Deleting a directory

rmdir deletes a directory. The directory must be empty, and you need sufficient permissions to remove it:

biotest@ubuntu:~/perl/06file$ cat del_dir.pl 
#!/usr/bin/perl
$dir="/home/biotest/perl/06file/new";
rmdir($dir) or die "Can not delete $dir directory, $!";
print "Delete directory successfully!\n";
biotest@ubuntu:~/perl/06file$ perl del_dir.pl 
Delete directory successfully!
biotest@ubuntu:~/perl/06file$ ls
del_dir.pl          file.txt    m1.pl          open.pl
delete.pl           glob.pl     mkdir.pl       operation.pl
display_file_pl.pl  import.pl   new_file1.txt  rename.pl
file_info.pl        import.txt  opendir.pl
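
One detail that the scripts above do not exercise is that rmdir only removes empty directories. A minimal sketch of that behaviour follows; the temporary base directory, the names new and file.txt, and the variable names are assumptions for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $base = tempdir(CLEANUP => 1);   # scratch directory, removed at exit
my $dir  = "$base/new";             # hypothetical directory name

mkdir($dir) or die "cannot create $dir: $!";
my $created = -d $dir ? 1 : 0;

# rmdir only removes empty directories, so it fails while a file is inside.
open(my $out, '>', "$dir/file.txt") or die "cannot write $dir/file.txt: $!";
close($out) or die "cannot close $dir/file.txt: $!";
my $removed_nonempty = rmdir($dir) ? 1 : 0;

# After the file is deleted the directory is empty and rmdir succeeds.
unlink("$dir/file.txt") or die "cannot delete $dir/file.txt: $!";
my $removed_empty = rmdir($dir) ? 1 : 0;

print "created=$created removed_nonempty=$removed_nonempty ",
      "removed_empty=$removed_empty\n";
```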

Changing directory

chdir changes the current working directory:

biotest@ubuntu:~/perl/06file$ pwd
/home/biotest/perl/06file
biotest@ubuntu:~/perl/06file$ cat chdir.pl 
#!/usr/bin/perl
$dir="/home/biotest";
chdir($dir) or die "Can not change to $dir, $!";
print "Your current directory is $dir\n";
opendir(DIR,'.') or die "Can not open $dir,$!";
while($file=readdir DIR){
    print "$file\n";
}
closedir DIR;
biotest@ubuntu:~/perl/06file$ perl chdir.pl|head
Your current directory is /home/biotest
Desktop
Miniconda2-latest-Linux-x86_64.sh
filename
Pictures
.dmrc
Videos
.xsession-errors.old
.bash_history
.cache

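As the output shows, running the script does not move the shell into /home/biotest, yet the opendir call that follows lists the files in /home/biotest. This is worth remembering: chdir changes the working directory of the running Perl process only, not that of the shell that started it.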
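
The process-local effect of chdir can be observed from inside the script with getcwd from the core Cwd module, as in the sketch below. The temporary directory and the variable names are assumptions for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

my $start = getcwd();               # working directory when the script starts
my $dir   = tempdir(CLEANUP => 1);  # scratch directory, removed at exit

chdir($dir) or die "cannot change to $dir: $!";
my $inside = getcwd();              # now reports the scratch directory

# Go back so the scratch directory can be cleaned up at exit.
chdir($start) or die "cannot change back to $start: $!";
my $back = getcwd();

print "start:  $start\n";
print "inside: $inside\n";
print "back:   $back\n";
```

After the script exits, running pwd in the shell still prints the directory the script was started from.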

References

Perl Tutorial | Runoob (菜鸟教程)
