Getting the same SHA1 hash for all strings

Problem description:

I open a file, look for every occurrence of HASH("<stuff>"), and want to replace it with HASH(<sha1(stuff)>).

The full script is:

import sys 
import re 
import hashlib 

def _hash(seq, trim_bits=64): 
    assert trim_bits % 8 == 0 
    temp = hashlib.sha1(seq).hexdigest() 
    temp = int(temp, 16) & eval('0x{}'.format('F' * (trim_bits/4))) 
    temp = hex(temp) 
    return str(temp[2:]).replace('L', '') 

if __name__ == '__main__': 
    assert len(sys.argv) == 3 
    in_file = sys.argv[1] 
    out_file = sys.argv[2] 
    with open(in_file, 'r') as f:
        lines = f.readlines()
        out_handle = open(out_file, 'w')
        for line in lines:
            new_line = re.sub(r'HASH\((["\'])(.*?)\1\)', 'HASH({})'.format(_hash(r'\2')), line)
            out_handle.write(new_line)
        out_handle.close()

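However, when I run this, all of the SHA1 hashes come out exactly the same, which makes no sense to me. If, instead of writing the hash, I switch the replacement to 'HASH({})'.format(r'\2'), it substitutes the character sequence between the double quotes. So why does the SHA1 hash return the same string every time?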

It looks like you are calling _hash(r'\2'), which always returns the same value. (http://blog.codinghorror.com/regular-expressions-now-you-have-two-problems/) – quamrana 2014-10-29 14:23:18

Answer

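You are computing the hash of the literal string r'\2'. The re module only expands that placeholder when it appears in the replacement string passed to re.sub(), and here it never gets there: _hash() has already consumed it before re.sub() is called.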
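To make the order of evaluation concrete, here is a minimal sketch of what the replacement argument amounts to. It assumes Python 3 (so the string is encoded to bytes before hashing) and uses the full, untrimmed digest; the variable names are only illustrative.

```python
import hashlib

# The replacement argument is an ordinary expression. Python evaluates it
# once, before re.sub() ever looks at the input line.
placeholder = r'\2'  # just two characters: a backslash and the digit 2
digest = hashlib.sha1(placeholder.encode('utf-8')).hexdigest()
replacement = 'HASH({})'.format(digest)

# The placeholder is gone; only a fixed hex digest is left for re.sub().
print(replacement)
```

Because the resulting string no longer contains \2, re.sub() has nothing to expand and inserts the same text for every match.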

Pass in the group from the match object instead, using a replacement function:

def replace_with_hash(match): 
    return 'HASH({})'.format(_hash(match.group(2))) 

new_line = re.sub(r'HASH\((["\'])(.*?)\1\)', replace_with_hash, line)

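The replace_with_hash() function is passed the match object for each match, and its return value is used as the replacement. Now you can compute the hash of the second group!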
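Putting the pieces together, the following is a rough Python 3 sketch of the per-line rewrite. The names HASH_CALL, short_sha1 and rewrite_line are hypothetical, not from the question. The trimming is also simplified: it slices the last trim_bits / 4 hex digits rather than masking an integer, so unlike the original _hash() it keeps leading zeros.

```python
import hashlib
import re

# Same pattern as in the question, compiled once for reuse.
HASH_CALL = re.compile(r'HASH\((["\'])(.*?)\1\)')

def short_sha1(text, trim_bits=64):
    """Return the low trim_bits bits of the SHA-1 digest as hex digits."""
    assert trim_bits % 4 == 0
    digest = hashlib.sha1(text.encode('utf-8')).hexdigest()
    return digest[-(trim_bits // 4):]

def replace_with_hash(match):
    # group(2) is the text between the quotes of this particular match
    return 'HASH({})'.format(short_sha1(match.group(2)))

def rewrite_line(line):
    # A callable replacement is invoked once per match
    return HASH_CALL.sub(replace_with_hash, line)

a = rewrite_line('x = HASH("foo")\n')
b = rewrite_line("y = HASH('bar')\n")
print(a, b, sep='')
```

Each line now gets a digest derived from its own quoted text, and lines without a HASH(...) call pass through unchanged.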

Demo:

>>> import re
>>> def _hash(string):
...     return 'HASHED: {}'.format(string[::-1])
...
>>> sample = '''\
... HASH("<stuff>")
... '''
>>> re.sub(r'HASH\((["\'])(.*?)\1\)', 'HASH({})'.format(_hash(r'\2')), sample)
'HASH(HASHED: 2\\)\n'
>>> def replace_with_hash(match):
...     return 'HASH({})'.format(_hash(match.group(2)))
...
>>> re.sub(r'HASH\((["\'])(.*?)\1\)', replace_with_hash, sample)
'HASH(HASHED: >ffuts<)\n'

My _hash() function simply reverses the input string, to show what happens.

The first re.sub() is your version; note how it returns '2\\', which is r'\2' reversed! My version neatly "hashes" <stuff> to >ffuts<.

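I didn't know the placeholder was only substituted when it is part of the replacement string itself. Awesome demo and explanation! Thanks! – ZWiki 2014-10-29 14:34:05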

Just a coincidence, sorry :/ – ZWiki 2014-10-29 14:44:12
