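How to split a string character by character, taking special characters into account
Question:
I am trying to split a string into its individual characters, but I am running into a problem with special characters. I currently use the following code: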
<?php
$input = "Comment ça va?";
$array_input = str_split($input, 1);
print_r($array_input);
?>
Here is the output:
Array (
[0] => C [1] => o [2] => m [3] => m [4] => e
[5] => n [6] => t [7] => [8] => � [9] => �
[10] => a [11] => [12] => v [13] => a [14] => ?)
I have the same problem when the string contains a line break:
Input:
"Hé!
Oui?"
Output:
Array ([0] => H [1] => � [2] => � [3] => ! [4] =>
[5] => [6] => O [7] => u [8] => i [9] => ?)
Does anyone have a solution to this problem? Thanks a lot.
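Answer
str_split has problems with Unicode strings: it splits on bytes, so a multibyte UTF-8 character is cut into pieces. You can use preg_split with the u modifier instead.
For example: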
$input = "Comment ça va?";
$letters1 = str_split($input);
$letters2 = preg_split('//u', $input, -1, PREG_SPLIT_NO_EMPTY);
print_r($letters1);
print_r($letters2);
This will output:
Array ([0] => C [1] => o [2] => m [3] => m [4] => e
[5] => n [6] => t [7] => [8] => � [9] => �
[10] => a [11] => [12] => v [13] => a [14] => ?)
Array ([0] => C [1] => o [2] => m [3] => m [4] => e
[5] => n [6] => t [7] => [8] => ç [9] => a
[10] => [11] => v [12] => a [13] => ?)
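The difference between the two arrays comes from str_split counting bytes rather than characters. Here is a minimal sketch of that, assuming the script file is saved as UTF-8 and the mbstring extension is loaded (mb_strlen is used only for comparison):

```php
<?php
// Assumption: this file is saved as UTF-8, so "ç" is stored as two bytes.
$char = "ç";

echo strlen($char), "\n";              // byte count: 2
echo mb_strlen($char, 'UTF-8'), "\n";  // character count: 1
echo bin2hex($char), "\n";             // the two bytes in hex: c3a7

// str_split cuts between those two bytes and leaves two invalid fragments;
// the u modifier makes PCRE treat the subject as UTF-8 characters instead.
$bytes = str_split($char);
$chars = preg_split('//u', $char, -1, PREG_SPLIT_NO_EMPTY);

var_dump(count($bytes)); // int(2)
var_dump(count($chars)); // int(1)
```

Each half of the split character is not valid UTF-8 on its own, which is why print_r shows a replacement symbol at positions 8 and 9.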
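Answer
This is because PHP's str_split function is not multibyte safe, i.e. it splits on bytes and cannot handle Unicode correctly. You can use the function below instead, a multibyte-safe implementation of str_split (source: a user comment in the PHP documentation):
// PHP 7.4+ ships a native mb_str_split, so define this only when it is missing.
if (!function_exists('mb_str_split')) {
    function mb_str_split($string) {
        # Split at every position that is not right after the start (^)
        # and not right before the end ($)
        return preg_split('/(?<!^)(?!$)/u', $string);
    }
}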
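To see what the two lookarounds in that pattern are for, it may help to compare it with an empty pattern. This is only an illustrative sketch, again assuming the file is saved as UTF-8; the variable names are made up for the example:

```php
<?php
// Assumption: this file is saved as UTF-8.
$input = "ça";

// An empty pattern also matches before the first and after the last
// character, so without PREG_SPLIT_NO_EMPTY both ends yield empty strings.
$plain = preg_split('//u', $input);
var_dump($plain);   // 4 elements: "", "ç", "a", ""

// The lookarounds rule out the start (^) and end ($) positions, so only the
// boundaries between characters are used as split points.
$between = preg_split('/(?<!^)(?!$)/u', $input);
var_dump($between); // 2 elements: "ç", "a"
```

One caveat: unless the D modifier is set, $ also matches just before a trailing newline, so with the lookaround pattern a final "\n" stays attached to the character before it. The '//u' pattern combined with PREG_SPLIT_NO_EMPTY does not have that quirk.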
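Thank you for your answer. It works for special characters, but not for line breaks: Input: hé! oui? Array ([0] => h [1] => é [2] => ! [3] => [4] => [5] => o [6] => u [7] => i [8] => ?) – Zorkzyd 2012-04-29 14:51:33
@Zorkzyd: It is actually working: positions 3 and 4 are \r and \n respectively (try ord($letters[3]) and ord($letters[4]); you will get 13 and 10, which are the ASCII codes for '\r' and '\n'). – nico 2012-04-29 14:58:30
Thanks for the explanation. Is it possible to "merge" \r\n into a single element in the output array? – Zorkzyd 2012-04-29 15:03:43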
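One possible way to keep \r\n together, offered as a sketch rather than as the method from the answers above, is to match tokens with preg_match_all instead of splitting. It assumes UTF-8 input with Windows line endings; the names $matches and $letters are just illustrative:

```php
<?php
// Assumption: this file is saved as UTF-8; the input uses "\r\n" line endings.
$input = "hé!\r\noui?";

// Try "\r\n" first so the pair is captured as one element; otherwise take
// any single character. s lets . match newlines, u enables UTF-8 mode.
preg_match_all('/\r\n|./su', $input, $matches);
$letters = $matches[0];

var_dump(count($letters));          // int(8)
var_dump($letters[3] === "\r\n");   // bool(true)
```

The result holds h, é, !, the "\r\n" pair, o, u, i and ?, so the line break takes one slot instead of two. A lone \r or \n still comes out as its own element, because . matches it under the s modifier.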