Parsing a list of paths in Python

Question:

I have a list of paths in a .txt file, and I'm trying to use Python to parse one folder out of each path name.

9999\New_folder\A\23818\files\ 
9999\New_folder\A\18283_HO\files\ 
...

What I'd like to do is strip out 9999\New_folder\A\ and \files\, so that my final strings are:

23818 
18283_HO

Any help would be greatly appreciated!

Edit: Thanks a lot, everyone! With your input I came up with the following code.

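# Write the fourth component (index 3) of each path to output.txt.
# The with-statement closes both files even if an error occurs.
with open('C:\\Python\\textintolist\\Document1.txt', 'r') as input_text, \
        open('output.txt', 'w') as output_text:
    for line in input_text:
        path = line.strip()  # drop the trailing newline and spaces
        if path:  # skip blank lines
            output_text.write(path.split('\\')[3] + "\n")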

Use regular expressions: http://docs.python.org/howto/regex.html – profitehlolz 2012-08-13 21:12:38

Answer

If your paths are always in this format:

>>> paths 
['9999\\New_folder\\A\\23818\\files\\', '9999\\New_folder\\A\\18283_HO\\files'] 
>>> for path in paths: 
...     print(path.split('\\')[3])
... 
23818 
18283_HO
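The same fixed-position idea can also be sketched with the standard library's pathlib (Python 3), which splits Windows-style paths on any operating system. This is only an illustration: the helper name `folder_from_path` is made up here, and it assumes, like the snippet above, that the wanted folder is always the fourth component.

```python
from pathlib import PureWindowsPath


def folder_from_path(path):
    # Hypothetical helper: return the fourth component (index 3) of a
    # Windows-style path. Assumes every path has at least four components.
    return PureWindowsPath(path.strip()).parts[3]


paths = ['9999\\New_folder\\A\\23818\\files\\',
         '9999\\New_folder\\A\\18283_HO\\files']
for path in paths:
    print(folder_from_path(path))
```

Because `parts` ignores a trailing separator, paths with and without the final backslash are handled the same way.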

Answer

>>> s = '9999\\New_folder\\A\\23818\\files\\' 
>>> s.split('9999\\New_folder\\A\\')[1].split('\\')[0] 
'23818'

Answer

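There are many ways to solve this. If all the paths look like 9999\New_folder\A\#number#\files\, you can simply take the substring between the third-to-last and the second-to-last "\". You can use rfind() for that (http://docs.python.org/library/string.html#string.rfind).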
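A minimal sketch of the rfind() idea, under a few assumptions: the helper name `folder_via_rfind` is invented for illustration, each path contains at least three backslashes, and a trailing backslash is added when it is missing so that the "third-to-last / second-to-last" positions line up.

```python
def folder_via_rfind(path):
    # Hypothetical helper: return the text between the third-to-last and
    # second-to-last backslash of a path shaped like ...\<folder>\files\
    path = path.strip()
    if not path.endswith('\\'):
        path += '\\'  # normalise so the path always ends with a backslash
    last = path.rfind('\\')                        # trailing backslash
    second_last = path.rfind('\\', 0, last)        # backslash before "files"
    third_last = path.rfind('\\', 0, second_last)  # backslash before the folder
    return path[third_last + 1:second_last]


for p in ['9999\\New_folder\\A\\23818\\files\\',
          '9999\\New_folder\\A\\18283_HO\\files']:
    print(folder_via_rfind(p))
```

Unlike splitting on a fixed index, this counts from the end of the path, so it keeps working if the number of leading folders changes.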

Another, more general approach is to use regular expressions: http://docs.python.org/library/re.html

Answer

# Something like this should work:
import re

with open("file path") as file_handler:
    for line in file_handler:
        match = re.search(r'\\([^\\]+)\\files', line)  # folder just before \files
        if match:
            print(match.group(1))
