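Implementing a basic queue/thread process in Python

Problem description:

Looking for some eyeballs to verify that the following chunk of pseudo-Python makes sense. I'm looking to spawn a number of threads to run some in-process functions as fast as possible. The idea is to spawn the threads in the master loop, so that the app runs the threads simultaneously, in a parallel/concurrent manner.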

Chunk of code:
- get the filenames from a dir
- write each filename to a queue
- spawn a thread for each filename, where each thread
    waits for/reads a value from the queue
- the threadParse function then handles the actual processing
    based on the file that's included via the "execfile" function...


# System modules
import os
import time
from Queue import Queue   # Python 2 name; in Python 3: from queue import Queue
from threading import Thread

# Local modules
#import feedparser

# Set up some global variables
appqueue = Queue()

# more than the app will need
# this matches the number of files that will ever be in the
# urldir
#
num_fetch_threads = 200


def threadParse(q):
    # decompose the packet to get the various elements
    # (decompose() is assumed to be defined elsewhere)
    line = q.get()
    college, level, packet = decompose(line)

    # build name of included file
    fname = college + "_" + level + "_Parse.py"
    execfile(fname)   # Python 2 only; Python 3 would need exec(open(fname).read())
    q.task_done()


# set up the master loop
while True:
    time.sleep(2)
    # get the files from the dir
    filelist = os.listdir("/urldir")
    if not filelist:
        continue

    # set up threads: one worker per filename
    for file_ in filelist:
        worker = Thread(target=threadParse, args=(appqueue,))
        worker.start()

    # set up the queue from the same listing, so that every worker
    # started above receives exactly one filename
    for file_ in filelist:
        # stuff the filename in the queue
        appqueue.put(file_)

    # Now wait for the queue to be empty, indicating that we have 
    # processed all of the downloads. 

    #don't care about this part 

    #print '*** Main thread waiting' 
    #appqueue.join() 
    #print '*** Done' 

Thoughts/comments/pointers are appreciated...

Thanks

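If I understand this right: you spawn a lot of threads to get things done faster.

This only works if the main part of the work done in each thread is done without holding the GIL. So if there is a lot of waiting for data from the network, disk, or something similar, it might be a good idea. If each task uses a lot of CPU, this will run much as it would on a single-core, single-CPU machine, and you might as well do the tasks in sequence.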
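To illustrate the I/O-bound case, here is a minimal sketch, not the poster's actual code: a small fixed pool of worker threads pulls filenames from a queue, and `time.sleep` stands in for a network or disk wait (it releases the GIL, as blocking I/O does in CPython). The names `fake_io_task` and `run_pool`, and the pool size of 4, are assumptions made up for the example.

```python
import threading
import time
from queue import Queue  # Python 3 name; in Python 2 this is `from Queue import Queue`


def fake_io_task(name, delay=0.2):
    """Hypothetical stand-in for I/O-bound work: sleeping releases the GIL."""
    time.sleep(delay)
    return name.upper()


def run_pool(names, num_workers=4):
    """Process every name with a fixed pool of worker threads."""
    q = Queue()
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            name = q.get()
            if name is None:      # sentinel: no more work for this worker
                q.task_done()
                break
            out = fake_io_task(name)
            with lock:            # protect the shared results list
                results.append(out)
            q.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for name in names:
        q.put(name)
    for _ in threads:
        q.put(None)               # one sentinel per worker
    q.join()                      # wait until every item is marked done
    for t in threads:
        t.join()
    return sorted(results)


if __name__ == "__main__":
    start = time.time()
    print(run_pool(["a", "b", "c", "d"]))
    print("elapsed: %.2fs" % (time.time() - start))
```

With four workers the four 0.2-second waits overlap, so the run should take roughly 0.2 s rather than 0.8 s. Replacing the sleep with a pure-Python CPU loop would be expected to remove most of that gain under CPython.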

I should add that what I wrote is true for CPython, but not necessarily for Jython/IronPython. I should also add that if you need to make use of more CPUs/cores, the multiprocessing module might help.
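For the CPU-bound case, a rough sketch of what using multiprocessing could look like is shown here. The task `count_primes` and the input sizes are hypothetical, chosen only to keep a core busy; the real per-file parsing would take its place.

```python
from multiprocessing import Pool


def count_primes(limit):
    """Hypothetical CPU-bound task: count primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


if __name__ == "__main__":
    limits = [20000, 30000, 40000, 50000]
    # Each limit is handled in a separate process, so the work can use several cores.
    with Pool(processes=4) as pool:
        print(pool.map(count_primes, limits))
```

The `if __name__ == "__main__":` guard matters here: on platforms that start worker processes by re-importing the main module, it keeps the children from trying to create pools of their own.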