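Excel VBA: Opening a UTF-16 XML file

Problem description:

I am currently struggling to open a UTF-16 encoded XML file in Excel using VBA. I read the file into a string variable named EntireFile.

My string variable currently starts like this: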

ÿþ<?xml version="1.0" encoding="utf-16"?> 
<Test xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"> 

As you can see, there are some characters at the beginning that seem off.

This is how I fill the string variable:

Open PathToFile For Input As #1
    Do Until EOF(1)
       Line Input #1, textline
       EntireFile = EntireFile & textline
    Loop
Close #1

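According to Notepad++, the file is in UCS-2 Little Endian format, but a quick search through the internet revealed that this is Microsoft's equivalent of UTF-16?

I tried the brute-force approach of deleting the first two characters, but that left me with an empty string.

All the Google results are about saving an XML file without a BOM, but what I am looking for is kind of the opposite.

Thanks for your time.

You can use Win32 API functions to convert the encoding.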

Private Declare Function WideCharToMultiByte Lib "kernel32.dll" ( _
         ByVal CodePage As Long, _ 
         ByVal dwFlags As Long, _ 
         ByVal lpWideCharStr As Long, _ 
         ByVal cchWideChar As Long, _ 
         ByVal lpMultiByteStr As Long, _ 
         ByVal cbMultiByte As Long, _ 
         ByVal lpDefaultChar As Long, _ 
         ByVal lpUsedDefaultChar As Long) As Long 

Private Declare Function MultiByteToWideChar Lib "kernel32.dll" ( _
         ByVal CodePage As Long, _ 
         ByVal dwFlags As Long, _ 
         ByVal lpMultiByteStr As Long, _ 
         ByVal cbMultiByte As Long, _ 
         ByVal lpWideCharStr As Long, _ 
         ByVal cchWideChar As Long) As Long 

Private Const CP_UTF16 As Long = 1200& 

Private Function ConvertToUTF16(ByRef Source As String) As Byte() 

    Dim Length As Long 
    Dim Pointer As Long 
    Dim Size As Long 
    Dim Buffer() As Byte 

    Length = Len(Source) 
    Pointer = StrPtr(Source) 
    Size = WideCharToMultiByte(CP_UTF16, 0, Pointer, Length, 0, 0, 0, 0) 
    ReDim Buffer(0 To Size - 1) 

    WideCharToMultiByte CP_UTF16, 0, Pointer, Length, VarPtr(Buffer(0)), _ 
     Size, 0, 0 

    ConvertToUTF16 = Buffer 

End Function 

Private Function ConvertFromUTF16(ByRef Source() As Byte) As String 

    Dim Size As Long 
    Dim Pointer As Long 
    Dim Length As Long 
    Dim Buffer As String 

    Size = UBound(Source) - LBound(Source) + 1 
    Pointer = VarPtr(Source(LBound(Source))) 
    Length = MultiByteToWideChar(CP_UTF16, 0, Pointer, Size, 0, 0) 
    Buffer = Space$(Length) 
    MultiByteToWideChar CP_UTF16, 0, Pointer, Size, StrPtr(Buffer), Length 
    ConvertFromUTF16 = Buffer 

End Function 

Private Const CP_UTF16 As Long = 1200& means that code page 1200 is UTF-16 little endian.

You can see a list of all code pages here: https://msdn.microsoft.com/de-de/library/windows/desktop/dd317756(v=vs.85).aspx
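Since VBA stores its own strings as UTF-16 little endian, one alternative worth trying is to read the file in binary mode and assign the raw bytes to a string, which avoids the API calls altogether. The following is only a minimal sketch under that assumption. ReadUtf16LeFile is a hypothetical helper name that does not appear in the code above, and the sketch assumes the file really is UTF-16 LE with an optional byte order mark.

```vba
Option Explicit

' Hypothetical helper: reads a UTF-16 LE file into a VBA String without API calls.
Public Function ReadUtf16LeFile(ByVal PathToFile As String) As String
    Dim FileNo As Integer
    Dim Bytes() As Byte
    Dim Content As String

    FileNo = FreeFile
    Open PathToFile For Binary Access Read As #FileNo
    If LOF(FileNo) > 0 Then
        ReDim Bytes(0 To LOF(FileNo) - 1)
        Get #FileNo, , Bytes
        ' Assigning a Byte array to a String copies the bytes unchanged,
        ' so UTF-16 LE file content ends up as ordinary VBA characters.
        Content = Bytes
    End If
    Close #FileNo

    ' Drop the byte order mark (U+FEFF) if the file starts with one.
    If Len(Content) > 0 Then
        If AscW(Content) = &HFEFF Then Content = Mid$(Content, 2)
    End If

    ReadUtf16LeFile = Content
End Function
```

With this in a module, EntireFile = ReadUtf16LeFile(PathToFile) would replace the Line Input loop. Unlike that loop, it keeps the line breaks of the file.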

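Thanks for the answer. I tried adding the code block as a module and calling the function ConvertToUTF16 (which I made public) from my code, passing EntireFile after it has been filled. WideCharToMultiByte gives me an index out of bounds error. I'm sure this is a mistake on my side, since it is an imported function, but I can't tell where. – celphy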

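I checked the whole code again and found that the imported function does perform as expected, if not for Len(Source) returning 3000 (which is obviously wrong). – celphy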
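When the length of a string looks wrong like this, it can help to inspect the character codes it actually contains. The sketch below is only a diagnostic aid and not part of the answer's code. DumpCodes is a hypothetical helper name, and it assumes nothing beyond standard VBA string functions.

```vba
Option Explicit

' Hypothetical diagnostic helper: lists the first Count character codes as hex.
Public Function DumpCodes(ByVal Source As String, ByVal Count As Long) As String
    Dim i As Long
    Dim Result As String

    ' Never read past the end of the string.
    If Count > Len(Source) Then Count = Len(Source)

    For i = 1 To Count
        If i > 1 Then Result = Result & " "
        ' AscW returns a signed Integer; Hex$ still prints its 16-bit pattern.
        Result = Result & Right$("000" & Hex$(AscW(Mid$(Source, i, 1))), 4)
    Next i

    DumpCodes = Result
End Function
```

For example, Debug.Print DumpCodes(EntireFile, 4) shows the first four codes in the Immediate window. A string that was decoded as UTF-16 and still carries its byte order mark would start with FEFF, whereas a string that was read one byte per character would likely show the two mark bytes as separate characters, which also makes Len report roughly twice the real character count.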