Extract the last element and keep the folder name
Question:
I have 40 folders, each with a unique name, and inside each folder is a file named summary.txt that looks like this:
HISAT2 summary stats:
Total reads: 36590175
Aligned 0 time: 1238197 (3.38%)
Aligned 1 time: 33866701 (92.56%)
Aligned >1 times: 1485277 (4.06%)
Overall alignment rate: 96.62%
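I want to create a new .txt file with one column for the folder name and one column for the final value ("96.62%" here), so that the end result looks like this: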
Folder name alignment rate
Sample1 96.62%
Sample2 94.53%
... ...
SampleN 96.22%
Is there a way to do this from the command line, maybe with awk? Any help would be appreciated.
Harry
Answer
Using the find command (the three-argument match() and \w used here require GNU awk):
$ echo -e "Folder name\talignment rate" > output.txt
$ find . -iname "summary.txt" -exec awk 'END{ match(FILENAME,/\/(\w+)\//,a); print a[1]"\t\t"$4}' {} \; >> output.txt
Output:
Folder name alignment rate
dir1 96.62%
dir2 96.62%
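For awk implementations other than GNU awk, the same idea can be sketched with split() instead of the three-argument match(). This is only an illustration under stated assumptions: the demo_runs directory, the two sample folders, and the demo_output.txt file name are all hypothetical placeholders created by the script itself, and the rate is taken as the last field of the "Overall alignment rate" line.

```shell
#!/bin/sh
# Hypothetical demo data: two sample folders under demo_runs/.
mkdir -p demo_runs/Sample1 demo_runs/Sample2
printf 'HISAT2 summary stats:\nOverall alignment rate: 96.62%%\n' > demo_runs/Sample1/summary.txt
printf 'HISAT2 summary stats:\nOverall alignment rate: 94.53%%\n' > demo_runs/Sample2/summary.txt

# Write the header once, then append one row per summary.txt found.
printf 'Folder name\talignment rate\n' > demo_output.txt
find demo_runs -name summary.txt -exec awk '
    /^Overall alignment rate/ {
        # Split the path on "/"; the parent folder is the second-to-last piece.
        n = split(FILENAME, parts, "/")
        print parts[n-1] "\t" $NF
    }' {} + | sort >> demo_output.txt

cat demo_output.txt
```

The rows are piped through sort because find does not guarantee any particular order.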
Answer
awk solution:
Preliminary step (set the header line in result.txt; type the line, then press Ctrl-D to end input):
$ cat > result.txt
Folder name alignment rate
awk '/^Overall/{
printf "%-20s%s\n", substr(FILENAME, 1, index(FILENAME, "/")-1), $NF >> "result.txt"
}' Sample*/summary.txt
The result.txt contents should look like this:
Folder name alignment rate
Sample1 96.62%
Sample2 94.53%
...
Answer
A simple awk script:
$ awk -F': ' 'BEGIN { print "folder", "rate" }
/Overall/ { sub("/.*","",FILENAME); print FILENAME, $2 }' */summary.txt
folder rate
a 96.62%
b 91.63%
c 93.22%
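If a tab-separated file is wanted rather than terminal output, the script above could be adapted roughly as follows. This is a sketch, not part of the original answer: the glob_demo directory, the folders a and b, and the rates.tsv file name are hypothetical, and FILENAME is copied into a local variable so the built-in is left untouched.

```shell
#!/bin/sh
# Hypothetical demo data: folders a and b under glob_demo/.
mkdir -p glob_demo/a glob_demo/b
printf 'Total reads: 100\nOverall alignment rate: 96.62%%\n' > glob_demo/a/summary.txt
printf 'Total reads: 200\nOverall alignment rate: 91.63%%\n' > glob_demo/b/summary.txt

(
  cd glob_demo || exit 1
  # -F splits each line on ": "; OFS makes print emit tab-separated columns.
  awk -F': ' -v OFS='\t' '
      BEGIN { print "folder", "rate" }
      /^Overall alignment rate/ {
          dir = FILENAME          # work on a copy, not on FILENAME itself
          sub(/\/.*/, "", dir)    # keep only the part before the first "/"
          print dir, $2
      }' */summary.txt > rates.tsv
)

cat glob_demo/rates.tsv
```

Running awk from inside the parent directory keeps the folder name as the first path component, which is what the sub() call relies on.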