CPU utilization of a Python script when I use a BPF filter
Question:
I got the code below from here.
from binascii import hexlify
from ctypes import create_string_buffer, addressof
from socket import socket, AF_PACKET, SOCK_RAW, SOL_SOCKET
from struct import pack, unpack
sniff_interval=120
# A subset of Berkeley Packet Filter constants and macros, as defined in
# linux/filter.h.
# Instruction classes
BPF_LD = 0x00
BPF_JMP = 0x05
BPF_RET = 0x06
# ld/ldx fields
BPF_H = 0x08
BPF_B = 0x10
BPF_ABS = 0x20
# alu/jmp fields
BPF_JEQ = 0x10
BPF_K = 0x00
def bpf_jump(code, k, jt, jf):
    return pack('HBBI', code, jt, jf, k)

def bpf_stmt(code, k):
    return bpf_jump(code, k, 0, 0)
# Ordering of the filters is backwards of what would be intuitive for
# performance reasons: the check that is most likely to fail is first.
filters_list = [
    # Must have dst port 5201. Load (BPF_LD) a half word value (BPF_H) in
    # ethernet frame at absolute byte offset 36 (BPF_ABS). If value is equal to
    # 5201 then do not jump, else jump 5 statements.
    bpf_stmt(BPF_LD | BPF_H | BPF_ABS, 36),
    bpf_jump(BPF_JMP | BPF_JEQ | BPF_K, 5201, 0, 5),
    # Must be TCP (check protocol field at byte offset 23; 0x06 is TCP)
    bpf_stmt(BPF_LD | BPF_B | BPF_ABS, 23),
    bpf_jump(BPF_JMP | BPF_JEQ | BPF_K, 0x06, 0, 3),
    # Must be IPv4 (check ethertype field at byte offset 12)
    bpf_stmt(BPF_LD | BPF_H | BPF_ABS, 12),
    bpf_jump(BPF_JMP | BPF_JEQ | BPF_K, 0x0800, 0, 1),
    bpf_stmt(BPF_RET | BPF_K, 0x0fffffff),  # pass
    bpf_stmt(BPF_RET | BPF_K, 0),  # reject
]
# Create filters struct and fprog struct to be used by SO_ATTACH_FILTER, as
# defined in linux/filter.h.
filters = ''.join(filters_list)
b = create_string_buffer(filters)
mem_addr_of_filters = addressof(b)
fprog = pack('HL', len(filters_list), mem_addr_of_filters)
# As defined in asm/socket.h
SO_ATTACH_FILTER = 26
# Create listening socket with filters
s = socket(AF_PACKET, SOCK_RAW, 0x0800)
s.setsockopt(SOL_SOCKET, SO_ATTACH_FILTER, fprog)
s.bind(('eth0', 0x0800))
while True:
    data, addr = s.recvfrom(65565)
    #print "*****"
    print 'got data from', addr, ':', hexlify(data)  # CPU stays at 2% only when the data is printed
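The filter program can be checked offline, without a raw socket or root access. The following is only a sketch: `run_filter` and `sample_frame` are hypothetical helpers, the interpreter handles just the four opcodes used in this filter, and it assumes a 20-byte IPv4 header with no options (which is also what the fixed offsets 23 and 36 assume). It is written to run on Python 3 as well.

```python
from struct import pack, unpack

BPF_LD, BPF_JMP, BPF_RET = 0x00, 0x05, 0x06
BPF_H, BPF_B, BPF_ABS = 0x08, 0x10, 0x20
BPF_JEQ, BPF_K = 0x10, 0x00

def bpf_jump(code, k, jt, jf):
    return pack('HBBI', code, jt, jf, k)

def bpf_stmt(code, k):
    return bpf_jump(code, k, 0, 0)

# Same instructions as the filter in the question.
program = [
    bpf_stmt(BPF_LD | BPF_H | BPF_ABS, 36),
    bpf_jump(BPF_JMP | BPF_JEQ | BPF_K, 5201, 0, 5),
    bpf_stmt(BPF_LD | BPF_B | BPF_ABS, 23),
    bpf_jump(BPF_JMP | BPF_JEQ | BPF_K, 0x06, 0, 3),
    bpf_stmt(BPF_LD | BPF_H | BPF_ABS, 12),
    bpf_jump(BPF_JMP | BPF_JEQ | BPF_K, 0x0800, 0, 1),
    bpf_stmt(BPF_RET | BPF_K, 0x0fffffff),
    bpf_stmt(BPF_RET | BPF_K, 0),
]

def run_filter(program, frame):
    """Hypothetical mini-interpreter: returns the RET value (0 = reject)."""
    acc = 0
    pc = 0
    while pc < len(program):
        code, jt, jf, k = unpack('HBBI', program[pc])
        pc += 1
        if code in (BPF_LD | BPF_H | BPF_ABS, BPF_LD | BPF_B | BPF_ABS):
            size = 2 if code & BPF_B == 0 else 1
            if k + size > len(frame):
                return 0  # a load past the end of the packet rejects it
            acc = unpack('!H' if size == 2 else '!B', frame[k:k + size])[0]
        elif code == BPF_JMP | BPF_JEQ | BPF_K:
            pc += jt if acc == k else jf  # offsets are relative to the next instruction
        elif code == BPF_RET | BPF_K:
            return k
        else:
            raise ValueError('unsupported opcode 0x%02x' % code)
    return 0

def sample_frame(proto, dst_port):
    """Hypothetical test frame: Ethernet + 20-byte IPv4 header + L4 ports."""
    eth = b'\x00' * 12 + pack('!H', 0x0800)
    ip = bytearray(20)
    ip[0] = 0x45      # IPv4, header length 5 words
    ip[9] = proto     # protocol field, frame offset 14 + 9 = 23
    ports = pack('!HH', 40000, dst_port)  # dst port lands at frame offset 36
    return eth + bytes(ip) + ports
```

With these definitions, `run_filter(program, sample_frame(6, 5201))` returns the pass value, while a UDP frame (`proto=17`) or a different destination port returns 0.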
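I tested with iperf3, generating traffic from another laptop to my laptop over an Ethernet cable. The server (my laptop) listens on port 5201 and the client (the other laptop) sends the data.
- If I comment out the line
print 'got data from', addr, ':', hexlify(data)
and run the script, its CPU utilization rises to 30%-40% in the presence of 100MB of traffic.
- If I uncomment the line
print 'got data from', addr, ':', hexlify(data)
and run it again, CPU utilization drops to 2% with the same traffic present.
I checked this with htop.
So, what is going on here?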
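Answer
I'd bet that either hexlify() or, most likely, print (since it synchronizes with standard output) is giving your main thread a much-needed rest, some breathing space, instead of just hammering the socket read in an infinite while loop.
Try adding time.sleep(0.05) (after importing time first, of course) in place of the print statement and check the CPU usage again.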
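A minimal sketch of that suggestion, not a definitive fix: the helper name `receive_throttled` and its `pause` and `max_packets` parameters are hypothetical, added so the loop can be bounded and tried without a raw socket. It should run on Python 2 and Python 3.

```python
import time

def receive_throttled(sock, handle, pause=0.05, max_packets=None):
    """Read packets from sock, sleeping `pause` seconds after each one.

    handle(data, addr) is called for every packet. max_packets bounds the
    loop; None means run forever, like the original `while True`.
    Returns the number of packets handled.
    """
    handled = 0
    while max_packets is None or handled < max_packets:
        data, addr = sock.recvfrom(65565)
        handle(data, addr)
        handled += 1
        time.sleep(pause)  # yield the CPU here instead of printing
    return handled
```

With the socket `s` from the question, `receive_throttled(s, lambda data, addr: None)` would stand in for the `while True` loop; whether CPU usage actually drops is something to confirm with htop.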
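Hey, thanks for the answer. I don't want to use 'sleep'; high-rate data/traffic may show up on the interface. What happens if we lose some traffic or something? – Veerendra
If your socket provides a large enough buffer, you won't lose anything; you're just giving other threads/processes some room to breathe. The same thing probably happens when you call print/hexlify(), you just don't control it directly. – zwer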
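The buffer in question is the socket receive buffer, which can be requested with SO_RCVBUF. A small sketch, assuming Linux: the helper name `enlarge_rcvbuf` and the 4 MiB default are illustrative choices, not values from this thread. The kernel caps the request at net.core.rmem_max, and on Linux getsockopt reports twice the value that was actually granted, so the returned number is what to check.

```python
import socket

def enlarge_rcvbuf(sock, requested=4 * 1024 * 1024):
    """Ask for a larger receive buffer and return the size the kernel reports.

    The reported size may differ from `requested`: the kernel may cap it,
    and Linux reports double the granted value for bookkeeping overhead.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
```

Calling `enlarge_rcvbuf(s)` right after creating the socket `s` in the question would show how much queueing room is available while the loop sleeps; packets arriving after the buffer fills are still dropped.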