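What is the sed editor
sed is a stream editor. Unlike interactive editors such as VS Code or vim, sed processes its input as a stream. A typical editing pass works as follows:
- read one line of data from the input;
- match the data against the supplied sed editing commands and modify it accordingly;
- write the resulting data to standard output.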
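The three steps above can be seen without any file at all. The following is a minimal sketch that pipes two lines into sed on standard input; the variable name result is only for illustration.

```shell
# Feed two lines to sed through a pipe. sed reads them one line at a time,
# applies the substitute command to each line, and writes the result to stdout.
result=$(printf 'this is the first line.\nthis is the second line.\n' | sed 's/this/This/')

# Show what sed wrote to standard output
printf '%s\n' "$result"
```

Because sed only reads a stream and writes a stream, it fits naturally in the middle of a shell pipeline.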
A simple example: suppose we have a text file test.txt with the following content:
this is the first line.
this is the second line.
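Now suppose we want to capitalize the first letter of each line. With sed, we can use the following command:
sed "s/this/This/" test.txt
The text inside the double quotes is the sed editing command, and test.txt is the file to edit. In the command, s is the substitute command, which replaces the old string this with the new string This.
The command above produces the following on standard output: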
This is the first line.
This is the second line.
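With a single command on the command line, sed produced the modified content. Note that the result goes to standard output; test.txt itself is not changed.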
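To keep the edited text, the output can be redirected to another file. The sketch below assumes a temporary working directory and a hypothetical output name result.txt, neither of which comes from the original example.

```shell
# Work in a throwaway directory (an assumption made for this sketch)
workdir=$(mktemp -d)
printf 'this is the first line.\nthis is the second line.\n' > "$workdir/test.txt"

# sed writes the edited text to stdout; redirect it into result.txt (hypothetical name)
sed "s/this/This/" "$workdir/test.txt" > "$workdir/result.txt"

# Read both files back: the input file still holds the original text
original=$(cat "$workdir/test.txt")
edited=$(cat "$workdir/result.txt")
printf 'test.txt:\n%s\n' "$original"
printf 'result.txt:\n%s\n' "$edited"
```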
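Basic usage of the sed editor
Substitution
The earlier example used the substitute command s for a simple replacement. By default, however, s replaces only the first occurrence of the pattern on each line. More involved operations, such as replacing every occurrence, require substitution flags. The syntax is:
sed "s/pattern/replacement/flags"
Here flags stands for the substitution flags. The common ones are:
- a number, which selects which occurrence of pattern on a line is replaced;
- g, which replaces every occurrence of pattern;
- p, which prints the line after the substitution, for lines where a substitution occurred;
- w filename, which writes the lines where a substitution occurred to the given file.
The following file test.txt is used to illustrate each substitution flag: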
This is the first line, again, this is the first line.
This is the second line, again, this is the second line.
This is the third line, again, this is the third line.
Suppose we want to replace line with sentence. Using the plain substitute command:
sed "s/line/sentence/" test.txt
Output:
This is the first sentence, again, this is the first line.
This is the second sentence, again, this is the second line.
This is the third sentence, again, this is the third line.
As we can see, the default substitute command replaces only the first occurrence on each line. So we add the g flag:
sed "s/line/sentence/g" test.txt
Output:
This is the first sentence, again, this is the first sentence.
This is the second sentence, again, this is the second sentence.
This is the third sentence, again, this is the third sentence.
Now every occurrence has been replaced.
To replace only the second occurrence of line on each line, use a numeric flag:
sed "s/line/sentence/2" test.txt
Output:
This is the first line, again, this is the first sentence.
This is the second line, again, this is the second sentence.
This is the third line, again, this is the third sentence.
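To print the lines on which a substitution took place, use the p flag. However, the p flag prints the modified line while sed also prints every line by default, so each modified line would appear twice. Combining it with sed's -n option, which suppresses the default output, prints only the lines that were modified:
sed -n "s/second/2nd/p" test.txt
Output:
This is the 2nd line, again, this is the second line.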
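The effect of -n is easiest to see by running the same command with and without it. The following sketch recreates the three-line test.txt in a temporary directory (an assumption for illustration) and counts how many output lines contain 2nd in each case.

```shell
# Recreate the sample file in a throwaway directory (assumed for this sketch)
workdir=$(mktemp -d)
cat > "$workdir/test.txt" <<'EOF'
This is the first line, again, this is the first line.
This is the second line, again, this is the second line.
This is the third line, again, this is the third line.
EOF

# Without -n: default output plus the extra copy printed by the p flag
without_n=$(sed "s/second/2nd/p" "$workdir/test.txt")
# With -n: only the copy printed by the p flag
with_n=$(sed -n "s/second/2nd/p" "$workdir/test.txt")

# Count the output lines that contain the replacement text
count_without_n=$(printf '%s\n' "$without_n" | grep -c '2nd')
count_with_n=$(printf '%s\n' "$with_n" | grep -c '2nd')
echo "without -n: $count_without_n line(s) containing 2nd"
echo "with -n: $count_with_n line(s) containing 2nd"
```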
Several flags can be used together. For example, to replace every occurrence and print the modified lines, combine the p and g flags:
sed -n "s/second/2nd/pg" test.txt
Output:
This is the 2nd line, again, this is the 2nd line.
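To save the modified lines to a separate file, use the w flag:
sed "s/second/2nd/gw out.txt" test.txt
This command specifies both the g and w flags and produces a file out.txt with the following content:
This is the 2nd line, again, this is the 2nd line.
These four flags can be combined freely when substituting file content.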
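It may help to see that the w flag does not replace the normal output: sed still prints every line to standard output, and only the lines where a substitution occurred go to the file. This is a small sketch, assuming a temporary directory for test.txt and out.txt.

```shell
# Recreate the sample file in a throwaway directory (assumed for this sketch)
workdir=$(mktemp -d)
cat > "$workdir/test.txt" <<'EOF'
This is the first line, again, this is the first line.
This is the second line, again, this is the second line.
This is the third line, again, this is the third line.
EOF

# g replaces every occurrence; w writes the changed lines to out.txt.
# Everything after "w " up to the end of the command is taken as the file name.
stdout_text=$(sed "s/second/2nd/gw $workdir/out.txt" "$workdir/test.txt")
saved_text=$(cat "$workdir/out.txt")

# Standard output still carries all lines; out.txt holds only the changed one
stdout_lines=$(printf '%s\n' "$stdout_text" | grep -c '')
echo "lines on stdout: $stdout_lines"
echo "out.txt: $saved_text"
```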
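Line addressing
By default, the commands given to sed apply to every line of the input. To restrict a command to certain lines, use line addressing.
sed supports two forms of line addressing:
- line numbers, which select a single line or a range of lines;
- text patterns, which filter lines by content.
The command format is:
# single-line form
[address]command
# multi-line form
[address] {
command1
command2
command3
}
Numeric line addressing
The following file test.txt is used to demonstrate line addressing: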
This is the first line, again, this is the first line.
This is the second line, again, this is the second line.
This is the third line, again, this is the third line.
This is the forth line, again, this is the forth line.
Suppose we only need to change the word line on the second line to sentence. With line addressing, we can use the following command:
sed "2 s/line/sentence/g" test.txt
Output:
This is the first line, again, this is the first line.
This is the second sentence, again, this is the second sentence.
This is the third line, again, this is the third line.
This is the forth line, again, this is the forth line.
To change line on the second through third lines, use:
sed "2,3 s/line/sentence/g" test.txt
Output:
This is the first line, again, this is the first line.
This is the second sentence, again, this is the second sentence.
This is the third sentence, again, this is the third sentence.
This is the forth line, again, this is the forth line.
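As shown, a line range is written as start-line,end-line.
The special symbol $ used as the end line denotes the last line, which is convenient when we do not know how many lines the file has.
To change line on every line from the second to the end of the file, use the following command (single quotes keep the shell from interpreting $):
sed '2,$ s/line/sentence/g' test.txt
Output: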
This is the first line, again, this is the first line.
This is the second sentence, again, this is the second sentence.
This is the third sentence, again, this is the third sentence.
This is the forth sentence, again, this is the forth sentence.
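The $ symbol can also serve as a single address rather than the end of a range, in which case the command applies to the last line only. A minimal sketch, assuming the four-line test.txt is recreated in a temporary directory:

```shell
# Recreate the four-line sample file in a throwaway directory (assumed for this sketch)
workdir=$(mktemp -d)
cat > "$workdir/test.txt" <<'EOF'
This is the first line, again, this is the first line.
This is the second line, again, this is the second line.
This is the third line, again, this is the third line.
This is the forth line, again, this is the forth line.
EOF

# "$" alone addresses only the last line of the input.
# Single quotes stop the shell from treating "$" as the start of a variable.
last_only=$(sed '$ s/line/sentence/g' "$workdir/test.txt")

# Only the final line should now contain "sentence"
changed_count=$(printf '%s\n' "$last_only" | grep -c 'sentence')
final_line=$(printf '%s\n' "$last_only" | tail -n 1)
echo "changed lines: $changed_count"
echo "last line: $final_line"
```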
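Text filtering
We can use a text pattern to select the lines to modify and then apply the command to those lines. The form is:
/pattern/command
The pattern is enclosed between two forward slashes. For example, to change the login shell of the user coolxxy in /etc/passwd, we can use:
sed '/coolxxy/s/bash/fish/' /etc/passwd
In the output, the shell on that user's line is changed from bash to fish.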
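Since the user coolxxy is unlikely to exist on other machines, the same idea can be tried on a small stand-in file. The sketch below uses a hypothetical passwd.sample with two made-up entries; the file name and its contents are assumptions for illustration only.

```shell
# Build a stand-in for /etc/passwd with two made-up entries (hypothetical data)
workdir=$(mktemp -d)
cat > "$workdir/passwd.sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
coolxxy:x:1000:1000::/home/coolxxy:/bin/bash
EOF

# Only lines matching /coolxxy/ are passed to the substitute command
changed=$(sed '/coolxxy/s/bash/fish/' "$workdir/passwd.sample")
printf '%s\n' "$changed"
```

The root entry keeps /bin/bash because its line does not match the address pattern, even though it contains the string bash.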
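The real power of text filtering is that pattern can be a regular expression. Regular expressions are a large topic in their own right and will be covered in a separate article.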
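As a small taste of a regular expression used as an address, the sketch below selects lines that begin with "This is the " followed by either s or t, which in the four-line sample file means the second and third lines. The temporary directory is an assumption for illustration.

```shell
# Recreate the four-line sample file in a throwaway directory (assumed for this sketch)
workdir=$(mktemp -d)
cat > "$workdir/test.txt" <<'EOF'
This is the first line, again, this is the first line.
This is the second line, again, this is the second line.
This is the third line, again, this is the third line.
This is the forth line, again, this is the forth line.
EOF

# ^ anchors the match at the start of the line; [st] matches one character, s or t
filtered=$(sed '/^This is the [st]/ s/line/sentence/g' "$workdir/test.txt")

# Lines two and three are rewritten; lines one and four are left alone
changed_count=$(printf '%s\n' "$filtered" | grep -c 'sentence')
printf '%s\n' "$filtered"
echo "changed lines: $changed_count"
```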
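Combining commands
Several commands can be applied to the addressed lines. For example, on every line from the second onward, to change line to sentence and This (or this) to That (or that), use the following command:
sed '2,$ {
s/line/sentence/g
s/This/That/g
s/this/that/g
}' test.txt
Output: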
This is the first line, again, this is the first line.
That is the second sentence, again, that is the second sentence.
That is the third sentence, again, that is the third sentence.
That is the forth sentence, again, that is the forth sentence.
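The same group of commands can also be written on one line by separating the commands with semicolons. The sketch below compares the two spellings on the four-line sample file, recreated in a temporary directory as an assumption for illustration.

```shell
# Recreate the four-line sample file in a throwaway directory (assumed for this sketch)
workdir=$(mktemp -d)
cat > "$workdir/test.txt" <<'EOF'
This is the first line, again, this is the first line.
This is the second line, again, this is the second line.
This is the third line, again, this is the third line.
This is the forth line, again, this is the forth line.
EOF

# Multi-line form: one command per line inside the braces
multi_line=$(sed '2,$ {
s/line/sentence/g
s/This/That/g
s/this/that/g
}' "$workdir/test.txt")

# One-line form: commands separated by semicolons, including one before the closing brace
one_line=$(sed '2,$ { s/line/sentence/g; s/This/That/g; s/this/that/g; }' "$workdir/test.txt")

printf '%s\n' "$one_line"
```

The multi-line form tends to be easier to read once a group holds more than two or three commands.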