How can I use GNU Parallel to parallelize a bash script that contains nested for loops over a huge dataset?

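Question:

To check whether a server accepts a given cipher, I open many connections to IP addresses using OpenSSL. I run this script against 1,000,000 servers (listed in "listeIpShuffle.txt"). My script therefore contains two for loops: the first reads each line of the file containing the IP addresses, and the second tests every cipher available in my OpenSSL version to see whether the server accepts or rejects it.

I saw in the GNU Parallel documentation that loops of this kind can be parallelized: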

(for x in `cat xlist` ; do
    for y in `cat ylist` ; do
        do_something $x $y
    done
done) | process_output

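...which can be written like this:

parallel do_something {1} {2} :::: xlist ylist | process_output

But I have not managed to apply this to my script. The file containing my IP addresses is too big, and I get the well-known "too many arguments" error. How can I deal with this and make the script run in parallel?

Thanks in advance!

Here is my script: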

#!/usr/bin/env bash 

# get all of the ciphers known to openssl
ciphers=$(openssl ciphers 'ALL:eNULL' | sed -e 's/:/ /g') 
fichier="./serveursMail/listeIpShuffle.txt" 
port=":25" 
>resultDeprecatedCipher.txt 
nbInconnu=0 
echo Cipher list de $(openssl version). 

for ligne in $(<$fichier) 
do 
    ligneIp=$(echo $ligne | tr "|" "\n")  
    ip=($ligneIp) 
    ipPort=$ip$port 
    dns=$(echo $ligneIp |cut -f 2 -d ' ') 


    ciphers=$(openssl ciphers 'ALL:eNULL' | sed -e 's/:/ /g') 

    for cipher in ${ciphers[@]} 
    do 
     if [[ $nbInconnu < 4 ]] ; then 
      echo -n Test $ligneIp " : " $cipher... 

      result=$(echo -n | timeout 10s openssl s_client -starttls smtp -cipher "$cipher" -connect $ipPort -servername $dns 2>&1) # no response before the timeout => FAIL

      if [[ "$result" =~ ":error:" ]] ; then 
      error=$(echo -n $result | cut -d':' -f6) 
      echo NON \($error\) 
      let "nbInconnu=0" 
      else 
       if [[ "$result" =~ "Cipher is ${cipher}" || "$result" =~ "Cipher :" ]] ; then 
        echo OUI 
        let "nbInconnu=0" 
        echo $ligneIp " : " $cipher >> resultDeprecatedCipher.txt 
       else 
        echo REPONSE INCONNUE 
        let "nbInconnu++" # increment
        echo $nbInconnu 
        echo $result 
       fi 
      fi 
     else 
      let "nbInconnu=0" 
      break 
     fi 
    done 
done

Answer:

echo alt3.gmail-smtp-in.l.google.com > serverlist 
echo fo-ds-ats.member.g02.yahoodns.net >> serverlist 

doit() { 
    ip="$1" 
    port="$2" 
    cipher="$3" 
    openssl s_client -starttls smtp -cipher "$cipher" -connect "$ip:$port" -servername "$ip" < /dev/null
}
export -f doit
parallel --tag --timeout 10 --retries 4 doit :::: serverlist ::: 25 ::: $(openssl ciphers 'ALL:eNULL' | sed -e 's/:/ /g') >tmp_results
# Post process tmp_results as needed
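
The post-processing step is left open here. As one possible sketch, assuming `--tag` prefixes every output line with the job's arguments (server, port and cipher, separated by spaces) followed by a TAB, the accepted server/cipher pairs could be pulled out of tmp_results as shown below. The helper name `accepted_ciphers` is hypothetical, and the "Cipher is <name>" marker is the same one the script in the question tests for.

```bash
#!/usr/bin/env bash
# Sketch only: list the "server : cipher" pairs that the server accepted.
# Assumption: each line of the input file looks like
#   <server> <port> <cipher><TAB><one line of openssl output>
# accepted_ciphers is a hypothetical helper, not part of GNU Parallel.
accepted_ciphers() {
    awk -F'\t' '
    {
        n = split($1, tag, " ")            # tag[1]=server, tag[2]=port, tag[3]=cipher
        if (n < 3) next                    # not a tagged line in the expected shape
        needle = "Cipher is " tag[3]
        rest = substr($0, length($1) + 2)  # everything after the first TAB
        # Require the line to END with the needle, so that AES128-SHA
        # is not reported when the server negotiated AES128-SHA256.
        if (length(rest) >= length(needle) &&
            substr(rest, length(rest) - length(needle) + 1) == needle)
            print tag[1] " : " tag[3]
    }' "$1"
}

# Usage:
#   accepted_ciphers tmp_results > resultDeprecatedCipher.txt
```

The output format mirrors the lines the original script appended to resultDeprecatedCipher.txt.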

To avoid overloading a single server, add --shuf. To run as many jobs in parallel as possible, add -j0.
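
The script in the question reads lines of the form `ip|dns`, whereas `doit` uses one value for both `-connect` and `-servername`. A minimal sketch of a variant that takes such a line as its first argument could look like this. The names `entry_ip`, `entry_dns` and `doit_entry` are hypothetical, and the `ip|dns` layout is inferred from the `tr "|"` and `cut` calls in the question.

```bash
#!/usr/bin/env bash
# Sketch only: split an "ip|dns" entry with parameter expansion.
# entry_ip, entry_dns and doit_entry are hypothetical helper names.

# Text before the first "|" (the whole entry if there is no "|").
entry_ip() { printf '%s\n' "${1%%|*}"; }

# Text after the first "|" (the whole entry if there is no "|").
entry_dns() { printf '%s\n' "${1#*|}"; }

# Same idea as doit, but connects to the IP and sends the DNS name as SNI.
doit_entry() {
    ip=$(entry_ip "$1")
    dns=$(entry_dns "$1")
    port="$2"
    cipher="$3"
    openssl s_client -starttls smtp -cipher "$cipher" \
        -connect "$ip:$port" -servername "$dns" < /dev/null
}

# Usage (bash, with GNU Parallel installed):
#   export -f entry_ip entry_dns doit_entry
#   parallel --tag --timeout 10 --retries 4 doit_entry \
#       :::: ./serveursMail/listeIpShuffle.txt ::: 25 \
#       ::: $(openssl ciphers 'ALL:eNULL' | sed -e 's/:/ /g') > tmp_results
```

Because `::::` makes GNU Parallel read the entries from the file itself, the million lines never have to fit on a command line. An entry without a `|` falls back to using the same value for both fields, which matches the behaviour of `doit`.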
