Unable to extract the file extension from a line of a file

Problem description:

When I run the program below, I don't get the expected output.

import os 
import re 

f = open('outputFile','w') 


#flag set to 1 when we are processing the required diffs 
diff_flag=0 #Initialized to 0 in beginning 

#with open('Diff_output.txt') as fp: 
with open('testFile') as fp: 
    for line in fp: 
     if re.match('diff --git',line): 
       #fileExtension = os.path.splitext(line)[1] 
       words=line.split(".") 
       diff_flag=0 
#    print fileExtension 
       str=".rtf"  

       print words[-1] 

       if words[-1] != "rtf": 
         print "Not a text file.."  
         diff_flag = 1 
         f.write(line) 
         print "writing -> " + line  

     elif diff_flag == 1: 
       f.write(line) 
     else: 
       continue

The output I get is as follows:

python read.py 
rtf 

Not a text file.. 
writing -> diff --git a/archived-output/NEW/action-core[best].rtf b/archived-output/NEW/action-core[best].rtf

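The file is a text file, so the if condition should evaluate to false. When I print words[-1] or fileExtension, I get the correct extension, yet I can't understand why the condition fails. Is there something wrong with the contents of these two variables, given that the condition evaluates to true (not equal)? I am trying to read the file line by line and extract the file name's extension from each line.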

Answer

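When you iterate over a file the way you are doing, each line includes the trailing newline character "\n". You should do one of two things:

words = line.strip().split(".")

or

if words[-1].strip() != "rtf":

But what I'd do if I were you is:

if line.strip().endswith(".rtf"):

instead of splitting the line.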
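The endswith approach can be sketched as a small filter. This is only an illustration in Python 3, not the asker's script: the function name keep_non_rtf_diffs and the sample paths are hypothetical, and it assumes the goal is to keep every diff block whose header does not end in ".rtf".

```python
def keep_non_rtf_diffs(lines):
    """Yield the lines of every diff block whose header does not end in .rtf."""
    keep = False  # True while we are inside a diff block we want to keep
    for line in lines:
        if line.startswith("diff --git"):
            # strip() drops the trailing "\n" before the suffix test
            keep = not line.strip().endswith(".rtf")
        if keep:
            yield line


# Hypothetical sample input, standing in for the lines of the diff file
sample = [
    "diff --git a/docs/notes.rtf b/docs/notes.rtf\n",
    "+rtf change\n",
    "diff --git a/src/main.py b/src/main.py\n",
    "+py change\n",
]
kept = list(keep_non_rtf_diffs(sample))
print("".join(kept), end="")
```

Only the header and body of the non-.rtf diff are kept; the .rtf block is skipped.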

BTW, the proof of the newline is in your output:

rtf 
<-- empty line here.
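A quick way to make the hidden newline visible is to print the repr() of the last piece. The snippet below is a minimal Python 3 sketch that uses the header line from the question's output, with a "\n" appended on the assumption that it was read from a file:

```python
line = ("diff --git a/archived-output/NEW/action-core[best].rtf "
        "b/archived-output/NEW/action-core[best].rtf\n")

last = line.split(".")[-1]
print(repr(last))              # 'rtf\n' -- the newline is part of the string
print(last != "rtf")           # True, which is why the condition fired
print(last.strip() != "rtf")   # False once the newline is stripped
```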

Thanks... it works... \n was the problem here... – Zack 2015-02-08 07:38:53

Answer

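Two points:

1. re.match() tries to match the pattern only from the beginning of the line. If you want to find a match anywhere in the string, use re.search() instead. (See also search() vs. match().)

2. words=line.split(".") does not give you the word list you expect, because the line contains leading or trailing whitespace such as \n. You need to strip the line first: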

words=line.strip().split(".")
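Both points can be checked with a short Python 3 sketch. The sample line is hypothetical: it is indented on purpose so that the pattern does not start at position 0.

```python
import re

# Hypothetical header line with leading spaces and a trailing newline
line = "  diff --git a/notes.rtf b/notes.rtf\n"

# match() only looks at the start of the string, so the indent defeats it
print(re.match("diff --git", line))                # None
# search() scans the whole string
print(re.search("diff --git", line) is not None)   # True

# Stripping first removes the "\n", so the last piece is exactly "rtf"
words = line.strip().split(".")
print(words[-1])                                   # rtf
```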

Thanks... it works... \n was the problem here... – Zack 2015-02-08 07:34:25

@Zack You're welcome! – Kasramvd 2015-02-08 07:35:40
