How do I convert a .txt file to an XML file using Python?

Problem description:

Latitude :23.1100348 
Longitude:72.5364922 
date&time :30:August:2014 05:04:31 PM 
gsm cell id: 4993 
Neighboring List- Lac : Cid : RSSI 
15000  : 7072  : 25 dBm 
15000  : 7073  : 23 dBm 
15000  : 6102  : 24 dBm 
15000  : 6101  : 24 dBm 
15000  : 6103  : 17 dBm 

Latitude :23.1120549 
Longitude:72.5397988 
date&time :30:August:2014 05:04:34 PM 
gsm cell id: 4993 
Neighboring List- Lac : Cid : RSSI 
15000  : 7072  : 24 dBm 
15000  : 7073  : 22 dBm 
15000  : 6102  : 23 dBm 
15000  : 6101  : 23 dBm 
15000  : 2552  : 16 dBm 

This is my .txt file. I want to convert it to an XML file like this:

<celldata> 
<time>  </time> 
<latitude> </latitude> 
<longitude> </longitude> 

</celldata> 

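I tried to make a list of all the components, but I did not get any o/p (output). I want to store all the values of latitude, longitude, gsm cell id, and time in a list, which will then be added to the XML file. I wrote the code below.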

import re

pa = 'Longitude|Latitude|gsm cell id|Neighboring List- Lac : Cid : RSSI'

with open('cell.txt', 'r') as file:  # 'rw' is not a valid mode in Python 3
    for line in file:
        line = line.strip()  # strip() returns a new string
        if re.search(pa, line):
            lineInfo = line.split(':')
            title = lineInfo[0]
            value = lineInfo[1]

Are 'time', 'latitude', and 'longitude' the only values you need in the XML file? If not, please provide the complete XML structure. – nu11p01n73R 2014-09-26 09:37:47

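When you say "I tried to make a list of all the components, but I did not get any o/p", did you try to output anything? You have only shown the reading code. – doctorlove 2014-09-26 09:44:22

Try the following code as a starter: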

#!python3

import re
import xml.etree.ElementTree as ET

rex = re.compile(r'''(?P<title>Longitude
                       |Latitude
                       |date&time
                       |gsm\s+cell\s+id
                     )
                     \s*:?\s*
                     (?P<value>.*)
                     ''', re.VERBOSE)

root = ET.Element('root')
root.text = '\n'    # newline before the celldata element

with open('cell.txt') as f:
    celldata = ET.SubElement(root, 'celldata')
    celldata.text = '\n'    # newline before the collected element
    celldata.tail = '\n\n'  # empty line after the celldata element
    for line in f:
        # An empty line starts a new celldata element (hack style, ugly)
        if line.isspace():
            celldata = ET.SubElement(root, 'celldata')
            celldata.text = '\n'
            celldata.tail = '\n\n'

        # If the line contains the wanted data, process it.
        m = rex.search(line)
        if m:
            # Fix some problems with the title as it will be used
            # as the tag name.
            title = m.group('title')
            title = title.replace('&', '')
            title = title.replace(' ', '')

            e = ET.SubElement(celldata, title.lower())
            e.text = m.group('value').strip()  # drop trailing whitespace
            e.tail = '\n'

# Display for debugging
ET.dump(root)

# Include the root element to the tree and write the tree
# to the file.
tree = ET.ElementTree(root)
tree.write('cell.xml', encoding='utf-8', xml_declaration=True)

For your data it displays, for example:

<root> 
<celldata> 
<latitude>23.1100348</latitude> 
<longitude>72.5364922</longitude> 
<datetime>30:August:2014 05:04:31 PM</datetime> 
<gsmcellid>4993</gsmcellid> 
</celldata> 

<celldata> 
<latitude>23.1120549</latitude> 
<longitude>72.5397988</longitude> 
<datetime>30:August:2014 05:04:34 PM</datetime> 
<gsmcellid>4993</gsmcellid> 
</celldata> 

</root> 
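
The date&time value ends up verbatim in the datetime element. If a sortable timestamp is preferred, it could be normalised before it is written. The following is only a minimal sketch, assuming the format seen in the sample data (day, full month name, year, then a 12-hour time) and an English locale; the helper name to_iso is hypothetical and is not part of the code above.

```python
from datetime import datetime


def to_iso(raw):
    """Convert e.g. '30:August:2014 05:04:31 PM' to ISO 8601 text."""
    # %B parses the full month name and depends on the current locale;
    # an English (or C) locale is assumed here.
    parsed = datetime.strptime(raw.strip(), '%d:%B:%Y %I:%M:%S %p')
    return parsed.isoformat()


print(to_iso('30:August:2014 05:04:31 PM '))  # prints 2014-08-30T17:04:31
```

The result could be assigned to e.text in place of the raw value when the matched title is date&time.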

Update for the wanted neighbour list:

#!python3

import re
import xml.etree.ElementTree as ET

rex = re.compile(r'''(?P<title>Longitude
                       |Latitude
                       |date&time
                       |gsm\s+cell\s+id
                       |Neighboring\s+List-\s+Lac\s+:\s+Cid\s+:\s+RSSI
                     )
                     \s*:?\s*
                     (?P<value>.*)
                     ''', re.VERBOSE)

root = ET.Element('root')
root.text = '\n'    # newline before the celldata element

with open('cell.txt') as f:
    celldata = ET.SubElement(root, 'celldata')
    celldata.text = '\n'    # newline before the collected element
    celldata.tail = '\n\n'  # empty line after the celldata element
    for line in f:
        # An empty line starts a new celldata element (hack style, ugly)
        if line.isspace():
            celldata = ET.SubElement(root, 'celldata')
            celldata.text = '\n'
            celldata.tail = '\n\n'
        else:
            # If the line contains the wanted data, process it.
            m = rex.search(line)
            if m:
                # Fix some problems with the title as it will be used
                # as the tag name.
                title = m.group('title')
                title = title.replace('&', '')
                title = title.replace(' ', '')

                if line.startswith('Neighboring'):
                    neighbours = ET.SubElement(celldata, 'neighbours')
                    neighbours.text = '\n'
                    neighbours.tail = '\n'
                else:
                    e = ET.SubElement(celldata, title.lower())
                    e.text = m.group('value').strip()  # drop trailing whitespace
                    e.tail = '\n'
            else:
                # This is the neighbour item. Split it by colon,
                # and set the attributes of the item element.
                item = ET.SubElement(neighbours, 'item')
                item.tail = '\n'

                lac, cid, rssi = (a.strip() for a in line.split(':'))
                item.attrib['lac'] = lac
                item.attrib['cid'] = cid
                item.attrib['rssi'] = rssi.split()[0]  # dBm removed

# Include the root element to the tree and write the tree
# to the file.
tree = ET.ElementTree(root)
tree.write('cell.xml', encoding='utf-8', xml_declaration=True)
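
The neighbour handling can also be tried in isolation. The sketch below wraps the same split-by-colon logic in a small function so that a single line can be checked without a file; the function name neighbour_item is an assumption made for illustration and does not appear in the code above.

```python
import xml.etree.ElementTree as ET


def neighbour_item(parent, line):
    """Append an <item> element for one 'lac : cid : rssi dBm' line."""
    lac, cid, rssi = (part.strip() for part in line.split(':'))
    item = ET.SubElement(parent, 'item')
    item.set('lac', lac)
    item.set('cid', cid)
    item.set('rssi', rssi.split()[0])  # keep the number, drop the 'dBm' unit
    return item


neighbours = ET.Element('neighbours')
neighbour_item(neighbours, '15000  : 7072  : 25 dBm \n')
print(ET.tostring(neighbours, encoding='unicode'))
```

A line that does not contain exactly two colons makes the tuple unpacking raise ValueError, which is one way to notice malformed input early.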

Update: a better implementation for general use, which also accepts the empty line before the neighbours:

#!python3

import re
import xml.etree.ElementTree as ET

rex = re.compile(r'''(?P<title>Longitude
                       |Latitude
                       |date&time
                       |gsm\s+cell\s+id
                       |Neighboring\s+List-\s+Lac\s+:\s+Cid\s+:\s+RSSI
                     )
                     \s*:?\s*
                     (?P<value>.*)
                     ''', re.VERBOSE)

root = ET.Element('root')
root.text = '\n'    # newline before the celldata element

with open('cell.txt') as f:
    celldata = ET.SubElement(root, 'celldata')
    celldata.text = '\n'    # newline before the collected element
    celldata.tail = '\n\n'  # empty line after the celldata element
    status = 0              # init status of the finite automaton
    for line in f:
        if status == 0:     # lines of the heading expected
            # If the line contains the wanted data, process it.
            m = rex.search(line)
            if m:
                # Fix some problems with the title as it will be used
                # as the tag name.
                title = m.group('title')
                title = title.replace('&', '')
                title = title.replace(' ', '')

                if line.startswith('Neighboring'):
                    neighbours = ET.SubElement(celldata, 'neighbours')
                    neighbours.text = '\n'
                    neighbours.tail = '\n'
                    status = 1  # empty line and then list of neighbours expected
                else:
                    e = ET.SubElement(celldata, title.lower())
                    e.text = m.group('value').strip()  # drop trailing whitespace
                    e.tail = '\n'
                    # keep the same status

        elif status == 1:   # empty line expected
            if line.isspace():
                status = 2  # list of neighbours must follow
            else:
                raise RuntimeError('Empty line expected. (status == {})'.format(status))

        elif status == 2:   # neighbour or the empty line as final separator
            if line.isspace():
                celldata = ET.SubElement(root, 'celldata')
                celldata.text = '\n'
                celldata.tail = '\n\n'
                status = 0  # go to the initial status
            else:
                # This is the neighbour item. Split it by colon,
                # and set the attributes of the item element.
                item = ET.SubElement(neighbours, 'item')
                item.tail = '\n'

                lac, cid, rssi = (a.strip() for a in line.split(':'))
                item.attrib['lac'] = lac
                item.attrib['cid'] = cid
                item.attrib['rssi'] = rssi.split()[0]  # dBm removed
                # keep the same status

        else:
            # LogicError is not a built-in exception; RuntimeError is used instead.
            raise RuntimeError('Unexpected status {}.'.format(status))

# Display for debugging
ET.dump(root)

# Include the root element to the tree and write the tree
# to the file.
tree = ET.ElementTree(root)
tree.write('cell.xml', encoding='utf-8', xml_declaration=True)

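The code implements a so-called finite automaton, where the status variable represents its current state. You can visualize it with pencil and paper: draw small circles with the state numbers inside (called nodes in graph theory). In a given state, only certain kinds of input (line) are allowed. When the input is recognized, draw an arrow (a directed edge in graph theory) to another state (possibly the same state, as a loop returning to the same node). Each arrow is annotated 'condition | action'.

The result may look complicated at first; however, it is simple in the sense that you can always focus on the part of the code that belongs to one particular state. The code is also easy to modify. Finite automata have limited power, but they are just perfect for this kind of problem.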
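
The pencil-and-paper diagram can also be written down as a transition function without any XML handling. This is a minimal sketch, assuming the same three states as the status values 0, 1, and 2; the names HEADING, GAP, NEIGHBOURS, next_state, and trace are hypothetical and chosen only for illustration.

```python
HEADING, GAP, NEIGHBOURS = 0, 1, 2  # same meaning as status 0, 1, 2


def next_state(state, line):
    """Return the state reached from `state` after reading `line`."""
    blank = line == '' or line.isspace()
    if state == HEADING:
        # The 'Neighboring' header announces the list; other lines stay here.
        return GAP if line.startswith('Neighboring') else HEADING
    if state == GAP:
        if blank:
            return NEIGHBOURS
        raise ValueError('Empty line expected after the list header')
    if state == NEIGHBOURS:
        # A blank line closes the record; anything else is a list item.
        return HEADING if blank else NEIGHBOURS
    raise ValueError('Unexpected state {}'.format(state))


def trace(lines):
    """Return the list of states visited while reading `lines`."""
    state = HEADING
    visited = [state]
    for line in lines:
        state = next_state(state, line)
        visited.append(state)
    return visited


sample = [
    'Latitude :23.1100348\n',
    'Neighboring List- Lac : Cid : RSSI\n',
    '\n',
    '15000  : 7072  : 25 dBm\n',
    '\n',
]
print(trace(sample))  # prints [0, 0, 1, 2, 2, 0]
```

Keeping the transitions separate from the actions makes it possible to check the state diagram on a few sample lines before any elements are created.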

I also want the neighbour list in this XML file, like cid =',,' rssi =',,,' – yogeshbhimani 2014-09-27 04:20:53

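@yogeshbhimani: *"I want to learn this type of programming, please give me some links or anything."* The advice here is both simple and difficult. It depends on your age, your previous education, where you want to go, and how much effort you want to spend. – pepr 2014-09-27 13:47:27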

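I should warn you that I do not consider my code to be good. It was put together in a hurry and needs some cleaning (refactoring). In my opinion, Stack Overflow is perfect for answering exact questions; however, it is not suited to tutoring or to discussing problems that evolve over time. For that, I suggest another forum. I am personally active at Experts Exchange (http://www.experts-exchange.com/Programming/Languages/Scripting/Python/) under the same nickname. Let's discuss it there. – pepr 2014-09-27 13:55:32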