Excel VBA: Opening a UTF-16 XML file
Question:
I am currently struggling to open a UTF-16 encoded XML file with VBA in Excel.
My string variable, named EntireFile, currently starts like this:
ÿþ<?xml version="1.0" encoding="utf-16"?>
<Test xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
As you can see, there are some characters at the beginning that seem off.
This is how I fill the string variable:
Open PathToFile For Input As #1
Do Until EOF(1)
    Line Input #1, textline
    EntireFile = EntireFile & textline
Loop
Close #1
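According to Notepad++, the file is in UCS-2 Little Endian format, but a quick search through the internet revealed that this is Microsoft's equivalent of UTF-16?
I tried the brute-force approach of deleting the first two characters, but that left me with an empty string.
All the Google results are about saving an XML file without a BOM, but what I am looking for is kind of the opposite.
Thanks for your time.
Answer:
You can use Win32 API functions to convert the encoding.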
' Note: on 64-bit Office these declarations need the PtrSafe keyword,
' and the pointer arguments need LongPtr instead of Long.
Private Declare Function WideCharToMultiByte Lib "kernel32.dll" ( _
    ByVal CodePage As Long, _
    ByVal dwFlags As Long, _
    ByVal lpWideCharStr As Long, _
    ByVal cchWideChar As Long, _
    ByVal lpMultiByteStr As Long, _
    ByVal cbMultiByte As Long, _
    ByVal lpDefaultChar As Long, _
    ByVal lpUsedDefaultChar As Long) As Long

Private Declare Function MultiByteToWideChar Lib "kernel32.dll" ( _
    ByVal CodePage As Long, _
    ByVal dwFlags As Long, _
    ByVal lpMultiByteStr As Long, _
    ByVal cbMultiByte As Long, _
    ByVal lpWideCharStr As Long, _
    ByVal cchWideChar As Long) As Long

Private Const CP_UTF16 As Long = 1200&

Private Function ConvertToUTF16(ByRef Source As String) As Byte()
    Dim Length As Long
    Dim Pointer As Long
    Dim Size As Long
    Dim Buffer() As Byte

    Length = Len(Source)
    Pointer = StrPtr(Source)
    ' The first call, with no output buffer, returns the required size in bytes (0 on failure).
    Size = WideCharToMultiByte(CP_UTF16, 0, Pointer, Length, 0, 0, 0, 0)
    ReDim Buffer(0 To Size - 1)
    ' The second call writes the converted bytes into Buffer.
    WideCharToMultiByte CP_UTF16, 0, Pointer, Length, VarPtr(Buffer(0)), _
        Size, 0, 0
    ConvertToUTF16 = Buffer
End Function

Private Function ConvertFromUTF16(ByRef Source() As Byte) As String
    Dim Size As Long
    Dim Pointer As Long
    Dim Length As Long
    Dim Buffer As String

    Size = UBound(Source) - LBound(Source) + 1
    Pointer = VarPtr(Source(LBound(Source)))
    ' The first call, with no output buffer, returns the required length in characters (0 on failure).
    Length = MultiByteToWideChar(CP_UTF16, 0, Pointer, Size, 0, 0)
    Buffer = Space$(Length)
    ' The second call writes the converted characters into Buffer.
    MultiByteToWideChar CP_UTF16, 0, Pointer, Size, StrPtr(Buffer), Length
    ConvertFromUTF16 = Buffer
End Function

Private Const CP_UTF16 As Long = 1200&
means that code page 1200 is UTF-16 little endian.
You can see a list of all code pages here: https://msdn.microsoft.com/de-de/library/windows/desktop/dd317756(v=vs.85).aspx
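
As a possible alternative that avoids the API calls altogether: the ÿþ at the start of the string in the question is the UTF-16 little-endian byte order mark (bytes FF FE), shown as two ANSI characters because Open ... For Input reads the file as ANSI text. A VBA String is already UTF-16 internally, so one option is to read the file as raw bytes and assign them to a String. The sketch below assumes the file really is UTF-16 LE; ReadUtf16File is a hypothetical helper name of mine, not something from the code above. It may also be worth noting that Microsoft's code page table describes 1200 as available only to managed applications, so the two API functions might simply return 0 for it.

```vba
' Sketch only: reads a UTF-16 LE file as raw bytes into a VBA String.
' ReadUtf16File is a hypothetical helper name (an assumption, not from the answer).
Public Function ReadUtf16File(ByVal PathToFile As String) As String
    Dim FileNo As Integer
    Dim Data() As Byte
    Dim Content As String

    FileNo = FreeFile
    Open PathToFile For Binary Access Read As #FileNo
    If LOF(FileNo) > 0 Then
        ReDim Data(0 To LOF(FileNo) - 1)
        Get #FileNo, , Data
        ' Assigning a Byte array to a String copies the bytes unchanged,
        ' so UTF-16 LE data ends up as the String's own representation.
        Content = Data
    End If
    Close #FileNo

    ' Drop the byte order mark (U+FEFF) if the file starts with one.
    If Len(Content) > 0 Then
        If Left$(Content, 1) = ChrW$(&HFEFF) Then Content = Mid$(Content, 2)
    End If

    ReadUtf16File = Content
End Function
```

With that in place, the loop from the question could be replaced by EntireFile = ReadUtf16File(PathToFile). Unlike the Line Input loop, this keeps the line breaks in the string.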
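Thanks for the answer. I tried adding the code block as a module and calling the function ConvertToUTF16 (which I made public) from my code with EntireFile after it had been filled. WideCharToMultiByte gives me an index out of bounds error. I'm sure this is a mistake on my part, since it is an imported function, but I don't know where. – celphy
I went through the whole code again and found that the imported function would behave as expected if Len(Source) didn't return 3000 (which is obviously wrong). – celphy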