How do I convert a .txt file to an XML file using Python?

Question:

Latitude :23.1100348 
Longitude:72.5364922 
date&time :30:August:2014 05:04:31 PM 
gsm cell id: 4993 
Neighboring List- Lac : Cid : RSSI 
15000  : 7072  : 25 dBm 
15000  : 7073  : 23 dBm 
15000  : 6102  : 24 dBm 
15000  : 6101  : 24 dBm 
15000  : 6103  : 17 dBm 

Latitude :23.1120549 
Longitude:72.5397988 
date&time :30:August:2014 05:04:34 PM 
gsm cell id: 4993 
Neighboring List- Lac : Cid : RSSI 
15000  : 7072  : 24 dBm 
15000  : 7073  : 22 dBm 
15000  : 6102  : 23 dBm 
15000  : 6101  : 23 dBm 
15000  : 2552  : 16 dBm

This is the content of my My.txt file. I want to convert it into an XML file like this:

<celldata> 
<time>  </time> 
<latitude> </latitude> 
<longitude> </longitude> 

</celldata>

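I tried to build a list of all the components, but I get no output. I want to store all the values of latitude, longitude, gsm cell id, and time in lists, which will then be added to the XML file. I wrote the code below.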

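import re

pa = 'Longitude|Latitude|gsm cell id|Neighboring List- Lac : Cid : RSSI'

# 'rw' is not a valid mode in Python 3; the file is only read here.
with open('cell.txt', 'r') as file:
    for line in file:
        line = line.strip()  # strip() returns a new string, so keep the result
        if re.search(pa, line):
            lineInfo = line.split(':')
            title = lineInfo[0]
            value = lineInfo[1]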

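Are 'time', 'latitude', and 'longitude' the only values you need in the XML file? If not, please provide the complete XML structure. – nu11p01n73R 2014-09-26 09:37:47

When you say "I tried to build a list of all the components, but I get no output": have you tried printing anything? You have only shown the code that reads the file. – doctorlove 2014-09-26 09:44:22

Answer

Try the following code as a starter: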

#!python3

import re
import xml.etree.ElementTree as ET

rex = re.compile(r'''(?P<title>Longitude
                       |Latitude
                       |date&time
                       |gsm\s+cell\s+id
                     )
                     \s*:?\s*
                     (?P<value>.*)
                     ''', re.VERBOSE)

root = ET.Element('root')
root.text = '\n'    # newline before the celldata element

with open('cell.txt') as f:
    celldata = ET.SubElement(root, 'celldata')
    celldata.text = '\n'    # newline before the collected element
    celldata.tail = '\n\n'  # empty line after the celldata element
    for line in f:
        # Empty line starts new celldata element (hack style, ugly)
        if line.isspace():
            celldata = ET.SubElement(root, 'celldata')
            celldata.text = '\n'
            celldata.tail = '\n\n'

        # If the line contains the wanted data, process it.
        m = rex.search(line)
        if m:
            # Fix some problems with the title as it will be used
            # as the tag name.
            title = m.group('title')
            title = title.replace('&', '')
            title = title.replace(' ', '')

            e = ET.SubElement(celldata, title.lower())
            e.text = m.group('value')
            e.tail = '\n'

# Display for debugging
ET.dump(root)

# Include the root element to the tree and write the tree
# to the file.
tree = ET.ElementTree(root)
tree.write('cell.xml', encoding='utf-8', xml_declaration=True)

For your data, it displays:

<root> 
<celldata> 
<latitude>23.1100348</latitude> 
<longitude>72.5364922</longitude> 
<datetime>30:August:2014 05:04:31 PM</datetime> 
<gsmcellid>4993</gsmcellid> 
</celldata> 

<celldata> 
<latitude>23.1120549</latitude> 
<longitude>72.5397988</longitude> 
<datetime>30:August:2014 05:04:34 PM</datetime> 
<gsmcellid>4993</gsmcellid> 
</celldata> 

</root>
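
The script above controls the line breaks by hand through the `text` and `tail` attributes. As an alternative sketch, assuming Python 3.9 or newer, `xml.etree.ElementTree.indent` can add the whitespace after the tree has been built. The `build_record` helper and the hard-coded values are hypothetical and only illustrate the call; they are not part of the original answer.

```python
import xml.etree.ElementTree as ET


def build_record(root, latitude, longitude):
    """Hypothetical helper: append one <celldata> element to root."""
    celldata = ET.SubElement(root, 'celldata')
    ET.SubElement(celldata, 'latitude').text = latitude
    ET.SubElement(celldata, 'longitude').text = longitude
    return celldata


root = ET.Element('root')
build_record(root, '23.1100348', '72.5364922')
build_record(root, '23.1120549', '72.5397988')

# ET.indent (Python 3.9+) inserts newlines and indentation in place,
# so no manual .text/.tail bookkeeping is needed.
ET.indent(root, space='  ')
pretty = ET.tostring(root, encoding='unicode')
print(pretty)
```

The trade-off is that `ET.indent` decides the layout for you, so the empty line between `celldata` elements produced by the hand-written `tail` values is not reproduced.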

Update for the wanted neighbour list:

#!python3

import re
import xml.etree.ElementTree as ET

rex = re.compile(r'''(?P<title>Longitude
                       |Latitude
                       |date&time
                       |gsm\s+cell\s+id
                       |Neighboring\s+List-\s+Lac\s+:\s+Cid\s+:\s+RSSI
                     )
                     \s*:?\s*
                     (?P<value>.*)
                     ''', re.VERBOSE)

root = ET.Element('root')
root.text = '\n'    # newline before the celldata element

with open('cell.txt') as f:
    celldata = ET.SubElement(root, 'celldata')
    celldata.text = '\n'    # newline before the collected element
    celldata.tail = '\n\n'  # empty line after the celldata element
    for line in f:
        # Empty line starts new celldata element (hack style, ugly)
        if line.isspace():
            celldata = ET.SubElement(root, 'celldata')
            celldata.text = '\n'
            celldata.tail = '\n\n'
        else:
            # If the line contains the wanted data, process it.
            m = rex.search(line)
            if m:
                # Fix some problems with the title as it will be used
                # as the tag name.
                title = m.group('title')
                title = title.replace('&', '')
                title = title.replace(' ', '')

                if line.startswith('Neighboring'):
                    neighbours = ET.SubElement(celldata, 'neighbours')
                    neighbours.text = '\n'
                    neighbours.tail = '\n'
                else:
                    e = ET.SubElement(celldata, title.lower())
                    e.text = m.group('value')
                    e.tail = '\n'
            else:
                # This is the neighbour item. Split it by colon,
                # and set the attributes of the item element.
                item = ET.SubElement(neighbours, 'item')
                item.tail = '\n'

                lac, cid, rssi = (a.strip() for a in line.split(':'))
                item.attrib['lac'] = lac
                item.attrib['cid'] = cid
                item.attrib['rssi'] = rssi.split()[0]  # dBm removed

# Include the root element to the tree and write the tree
# to the file.
tree = ET.ElementTree(root)
tree.write('cell.xml', encoding='utf-8', xml_declaration=True)
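
The neighbour rows are split on colons and the `dBm` unit is dropped. Here is a minimal sketch of that step in isolation, using a hypothetical helper named `parse_neighbour` that is not part of the original answer. Unlike the script above, it returns `None` for rows that do not have exactly three fields instead of raising an exception.

```python
import xml.etree.ElementTree as ET


def parse_neighbour(line):
    """Return (lac, cid, rssi) from a row such as '15000 : 7072 : 25 dBm'.

    Returns None when the row does not have exactly three non-empty fields.
    """
    parts = [part.strip() for part in line.split(':')]
    if len(parts) != 3 or not all(parts):
        return None
    lac, cid, rssi = parts
    return lac, cid, rssi.split()[0]  # keep the number, drop 'dBm'


neighbours = ET.Element('neighbours')
for row in ['15000  : 7072  : 25 dBm', 'not a neighbour row']:
    parsed = parse_neighbour(row)
    if parsed is not None:
        lac, cid, rssi = parsed
        # Keyword arguments of SubElement become XML attributes.
        ET.SubElement(neighbours, 'item', lac=lac, cid=cid, rssi=rssi)

print(ET.tostring(neighbours, encoding='unicode'))
```

Keeping the parsing in a small function makes it possible to check the row format before an `item` element is created.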

Update that accepts an empty line before the neighbours, and is also a better implementation for general purposes:

#!python3

import re
import xml.etree.ElementTree as ET

rex = re.compile(r'''(?P<title>Longitude
                       |Latitude
                       |date&time
                       |gsm\s+cell\s+id
                       |Neighboring\s+List-\s+Lac\s+:\s+Cid\s+:\s+RSSI
                     )
                     \s*:?\s*
                     (?P<value>.*)
                     ''', re.VERBOSE)

root = ET.Element('root')
root.text = '\n'    # newline before the celldata element

with open('cell.txt') as f:
    celldata = ET.SubElement(root, 'celldata')
    celldata.text = '\n'    # newline before the collected element
    celldata.tail = '\n\n'  # empty line after the celldata element

    status = 0  # init status of the finite automaton
    for line in f:
        if status == 0:  # lines of the heading expected
            # If the line contains the wanted data, process it.
            m = rex.search(line)
            if m:
                # Fix some problems with the title as it will be used
                # as the tag name.
                title = m.group('title')
                title = title.replace('&', '')
                title = title.replace(' ', '')

                if line.startswith('Neighboring'):
                    neighbours = ET.SubElement(celldata, 'neighbours')
                    neighbours.text = '\n'
                    neighbours.tail = '\n'
                    status = 1  # empty line and then list of neighbours expected
                else:
                    e = ET.SubElement(celldata, title.lower())
                    e.text = m.group('value')
                    e.tail = '\n'
                    # keep the same status

        elif status == 1:  # empty line expected
            if line.isspace():
                status = 2  # list of neighbours must follow
            else:
                raise RuntimeError('Empty line expected. (status == {})'.format(status))

        elif status == 2:  # neighbour or the empty line as final separator
            if line.isspace():
                celldata = ET.SubElement(root, 'celldata')
                celldata.text = '\n'
                celldata.tail = '\n\n'
                status = 0  # go to the initial status
            else:
                # This is the neighbour item. Split it by colon,
                # and set the attributes of the item element.
                item = ET.SubElement(neighbours, 'item')
                item.tail = '\n'

                lac, cid, rssi = (a.strip() for a in line.split(':'))
                item.attrib['lac'] = lac
                item.attrib['cid'] = cid
                item.attrib['rssi'] = rssi.split()[0]  # dBm removed
                # keep the same status

        else:
            # Defensive check: no other status value is defined.
            raise RuntimeError('Unexpected status {}.'.format(status))

# Display for debugging
ET.dump(root)

# Include the root element to the tree and write the tree
# to the file.
tree = ET.ElementTree(root)
tree.write('cell.xml', encoding='utf-8', xml_declaration=True)

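The code implements a so-called finite automaton, where the status variable represents its current state. You can visualize it with pencil and paper: draw small circles with the state numbers inside (called nodes in graph theory). In a given state, only certain kinds of input (line) are allowed. When the input is recognized, draw an arrow (a directed edge in graph theory) to another state (possibly the same state, as a loop returning to the same node). Each arrow is annotated with `condition | action`.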

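The result may look complicated at first; however, it is easy in the sense that you can always focus on the part of the code that belongs to one particular state. The code can also be modified easily. Finite automata have limited power, but they are just perfect for this kind of problem.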
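
To illustrate the idea of one code section per state, here is a simplified sketch that is not the answer's code. It uses two hypothetical named states and a hypothetical function `parse_records`, collects plain dictionaries instead of XML elements, and assumes the neighbour rows follow the heading directly, as in the sample in the question.

```python
HEADER, NEIGHBOURS = 'header', 'neighbours'  # hypothetical state names


def parse_records(lines):
    """Group lines into records with a two-state automaton."""
    records = []
    current = None
    state = HEADER
    for line in lines:
        line = line.strip()
        if state == HEADER:
            if not line:
                continue  # blank line | stay in HEADER
            if current is None:
                current = {'fields': {}, 'neighbours': []}
                records.append(current)
            if line.startswith('Neighboring'):
                state = NEIGHBOURS  # heading seen | switch state
            else:
                key, _, value = line.partition(':')
                current['fields'][key.strip()] = value.strip()
        elif state == NEIGHBOURS:
            if not line:
                current = None
                state = HEADER  # blank line | close the record
            else:
                current['neighbours'].append(
                    tuple(part.strip() for part in line.split(':')))
    return records


sample = [
    'Latitude :23.1100348',
    'gsm cell id: 4993',
    'Neighboring List- Lac : Cid : RSSI',
    '15000  : 7072  : 25 dBm',
    '',
    'Latitude :23.1120549',
]
records = parse_records(sample)
print(records)
```

Each `if state == ...` branch corresponds to one circle in the pencil-and-paper diagram, and each assignment to `state` corresponds to one arrow.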

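I also want the neighbour list in this XML file, like cid=',,' rssi=',,,' – yogeshbhimani 2014-09-27 04:20:53

@yogeshbhimani: *"I want to learn this type of programming, please give me some links or anything."* The advice here is both simple and difficult. It depends on your age, your previous education, where you want to get to, and how much effort you want to spend. – pepr 2014-09-27 13:47:27

I should warn you that I do not consider my code to be nice. It was put together in quite a hurry and needs some cleaning (refactoring). In my opinion, Stack Overflow is perfect for answering exact questions; however, it is not well suited for mentoring or for discussing questions that evolve over time. For that, I suggest another forum. I am personally active at Experts Exchange (http://www.experts-exchange.com/Programming/Languages/Scripting/Python/) under the same nickname. Let's discuss it there. – pepr 2014-09-27 13:55:32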
