grep command with an if statement

Question:

FSDFFDSFFDSFDS VCXVCXVCX 3343022340 IT_ON FDSFR0W3EV VXDF03 
DDSDS232323SD DSADFSDA SDA32323 SDADSDQ SDAFDSADS SDA DSADSE3QZCD 
DDSDS232323SD DSADFSDA SDA32323 SDADSDQ SDAFDSADS SDA DSADSE3QZCD 
DDSDSDEERWREF FSFDSDFFDS SDA32323 SDADSDQ SDAFDSADS SDA DSADSE3Q 
DDSDS232323SD DSADFSDA SDA32323 SDADSDQ SDAFDSADSDA 
DSADSE3QZCD FFDSFDAREDFS 23FDSFDDS IT_ON FDSFR0W3EV VXDF03ETRRT 
FFDSFDAREDFS 23FDSFDDSFK 3343022340 IT_OFF FDSFR0W3EV VXDF03ETRRT 
DDSDSDEERWREF FSFDSDFFDS SDA32323 SDADSDQ SDAFDSADS SDA DSADSE3QZCD 
DDSDS232323SD DSADFSDA SDA32323 SDADSDQ SDAFDSADS SDA DSADSE3QZCD 
FFDSFDAREDFS 23FDSFDDSFK 3343022340 IT_ON FDSFR0W3EV VXDF03ETRRT 
FFDSFDAREDFS 23FDSFDDSFK 3343022340 IT_OFF FDSFR0W3EV VXDF03ETRRF 
DDSDSDEERWREF FSFDSDFFDS SDA32323 SDADSDQ SDAFDSADS SDA DSADSE3QZCD 
DDSDS232323SD DSADFSDA SDA32323 SDADSDQ SDAFDSADS SDA DSADSE3QZCD 
FFDSFDAREDFS 23FDSFDDSFK 3343022340 IT_ON FDSFR0W3EV VXDF03ETRRT 
FFDSFDAREDFS 23FDSFDDSFK 3343022340 IT_OFF FDSFR0W3EV VXDF03ETRR 
FFDSFDAREDFS 23FDSFDDSFK 3343022340 IT_OFF FDSFR0W3EV VXDF03ETRR

I need to count how many transitions from IT_ON to IT_OFF and from IT_OFF to IT_ON occur, i.e.

IT_ON to IT_OFF : 3 
IT_OFF to IT_ON : 2

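I have been trying to do this with if statements around grep "IT_ON" and grep "IT_OFF", but it is getting a bit complicated. Any help?

Define a transition for this use case. Also, how do you identify a transition pair, and what have you tried so far? – 2012-07-10 11:03:22

I am asking how to define an IT_OFF to IT_OFF transition; I don't know how to do it. – user1492786 2012-07-10 11:15:05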

Answer

awk '/IT_ON/ {on = 1; if (off) {off_to_on++}; off = 0} /IT_OFF/ {off = 1; if (on) {on_to_off++}; on = 0} END {print "IT_ON to IT_OFF :", on_to_off + 0; print "IT_OFF to IT_ON :", off_to_on + 0}' inputfile

Broken out on multiple lines:

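awk '
    /IT_ON/ {
        on = 1
        if (off) {
            off_to_on++     # previous state was OFF, so this is OFF -> ON
        }
        off = 0
    }
    /IT_OFF/ {
        off = 1
        if (on) {
            on_to_off++     # previous state was ON, so this is ON -> OFF
        }
        on = 0
    }
    END {
        # "+ 0" forces numeric output so an unset counter prints 0
        print "IT_ON to IT_OFF :", on_to_off + 0
        print "IT_OFF to IT_ON :", off_to_on + 0
    }' inputfile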

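If there is an ID and you need to track transitions per ID, you can use the same technique with arrays keyed by that ID. Also, you may need a flag for the initial state if the first IT_ON should be counted as an OFF-to-ON transition.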
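
The per-ID variant could look roughly like the sketch below. It assumes, hypothetically, that the ID is the third whitespace-separated field and the state is the fourth; neither is stated in the question, and `count_per_id` is a name made up for this illustration.

```shell
# count_per_id FILE: print transition counts per ID, one ID per line.
# Assumption (not from the question): field 3 is the ID, field 4 is the state.
count_per_id() {
    awk '
        $4 == "IT_ON" || $4 == "IT_OFF" {
            id = $3
            # Count only when this ID was seen before with a different state.
            if ((id in last) && last[id] != $4) {
                if ($4 == "IT_OFF") on_to_off[id]++
                else                off_to_on[id]++
            }
            last[id] = $4
        }
        END {
            for (id in last)
                printf "%s IT_ON->IT_OFF=%d IT_OFF->IT_ON=%d\n", id, on_to_off[id], off_to_on[id]
        }' "$1" | sort
}
```

Call it as `count_per_id inputfile`. The output is piped through sort only because awk's `for (id in last)` order is unspecified.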

Answer

Not exactly what you want, but this might work:

sed -n 's/.*\(IT_ON\|IT_OFF\).*/\1/p' input | uniq > input.tmp 
grep $(head -1 input.tmp) input.tmp | uniq -c 
expr $(grep $(head -2 input.tmp | tail -1) input.tmp | wc -l) - 1 
rm input.tmp

Answer

Here is another way:

grep -Po "IT_(ON|OFF)" inputFile \
| uniq | paste - - \
| awk 'NR==1 && NF==2{print;f=1}END{if(f)printf "%3d\t%3d\n", NR,NR-1}'

Output format:

IT_ON IT_OFF 
    3  2

Answer

Here is a shell script in bash that does what you are asking:

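#!/bin/bash

testfile="test.txt"

# Collapse repeated states; "IT_O." matches IT_ON and the "IT_OF" of IT_OFF.
uniques=$(command grep -o 'IT_O.' "$testfile" | uniq)
# n runs of states mean n-1 transitions, alternating from the first state.
n=$(printf '%s' "$uniques" | grep -c .)
from_first=$(( n / 2 ))
from_second=$(( (n - 1) / 2 ))

if [[ ${uniques:0:5} = "IT_ON" ]]; then
    echo "IT_ON -> IT_OFF: $from_first"
    echo "IT_OFF -> IT_ON : $from_second"
else
    echo "IT_ON -> IT_OFF: $from_second"
    echo "IT_OFF -> IT_ON : $from_first"
fi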

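Unfortunately, I could not spend much time testing this. Please run some tests of your own to check whether it is good enough for your use case.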

Answer

In awk:

/IT_ON/  { on=1; }
on && /IT_OFF/ { offs++; on=0; off=1; }
off && /IT_ON/ { ons++; off=0; on=1; }
/IT_OFF/ { off=1; }   # remember OFF even when the input starts with IT_OFF
END {
    printf("ON to OFF: %d\nOFF to ON: %d\n", offs, ons);
}

ON to OFF: 3 
OFF to ON: 2

You can implement the same logic in any language, including shell, but this seems clean to me.
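
As an illustration of the same logic in plain shell, here is a minimal POSIX sh sketch. The function name `count_transitions` and its variables are made up for this example. It assumes at most one state per line and skips any line that contains neither IT_ON nor IT_OFF.

```shell
# count_transitions FILE: state tracking with only shell builtins.
# Note: POSIX sh has no "local", so these variables are global.
count_transitions() {
    prev=""; on_to_off=0; off_to_on=0
    # The "|| [ -n ... ]" part keeps a final line that lacks a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            *IT_ON*)  cur=ON ;;
            *IT_OFF*) cur=OFF ;;
            *)        continue ;;   # line carries no state
        esac
        if [ "$prev" = ON ] && [ "$cur" = OFF ]; then
            on_to_off=$((on_to_off + 1))
        elif [ "$prev" = OFF ] && [ "$cur" = ON ]; then
            off_to_on=$((off_to_on + 1))
        fi
        prev=$cur
    done < "$1"
    printf 'ON to OFF: %d\nOFF to ON: %d\n' "$on_to_off" "$off_to_on"
}
```

Run it as `count_transitions inputfile`. A read loop like this is much slower than awk on large files, so treat it as a demonstration of the logic rather than a replacement.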

Answer

Assuming your data file is named data.log:

grep -Eo 'IT_(ON|OFF)' data.log | uniq | tail -n +2 |sort |uniq -c

Output:

3 IT_OFF 
2 IT_ON

Annotated:

grep -Eo 'IT_(ON|OFF)' data.log $(: -E for extended regex, -o to only print matching part) \
    | uniq              $(: deduplicate adjacent items) \
    | tail -n +2        $(: drop the first line) \
    | sort | uniq -c    $(: sort, then give a count for each unique item)
