Regular expression matching

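Question:

How can I match words, but not digits, in a culture-independent way?

\w matches letters as well as digits, but I want to ignore digits. So for "111 or this", \w\s will not work.

I want to get only "or this". I don't think {^[A-Za-z]+$} is the solution, because the German alphabet has some extra letters.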

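Should "or this" be treated as one match or as two? –

I want to get matches for the pattern "word1 word2". Note that "mark1 is 1" should give me 1 match, "mark1 is". Also, "my birthday is 11/08/2000" should produce matches on "my birthday" and "birthday is" (the date should not be matched). – Nickolodeon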

Answer

I think the regular expression should be [^\d\s]+, i.e. anything that is not a digit or a whitespace character.

Answer

This should work for matching words:

\b[^\d\s]+\b

Breakdown:

\b - word boundary 
[ - start of character class 
^ - negation within character class 
\d - numerals 
\s - whitespace 
] - end of character class 
+ - repeat previous character one or more times 
\b - word boundary

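This matches any run of characters delimited by word boundaries that contains no digits and no whitespace (so a "word" such as "aa?aa!aa" will be matched).

Alternatively, if you want to exclude those as well, you can use: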
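To make that behaviour concrete, here is a minimal sketch that runs the pattern over a few sample strings. The class name `NonDigitRuns` and the inputs are made up for illustration; only `Regex.Matches` and the pattern itself come from the discussion.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class NonDigitRuns
{
    // Returns every run of non-digit, non-whitespace characters
    // that starts and ends at a word boundary.
    public static string[] Find(string text) =>
        Regex.Matches(text, @"\b[^\d\s]+\b")
             .Cast<Match>()
             .Select(m => m.Value)
             .ToArray();

    public static void Main()
    {
        foreach (var s in new[] { "111 or this", "aa?aa!aa", "aaa?" })
            Console.WriteLine($"{s} -> [{string.Join(", ", Find(s))}]");
        // 111 or this -> [or, this]
        // aa?aa!aa    -> [aa?aa!aa]   (inner punctuation is kept)
        // aaa?        -> [aaa]        (trailing punctuation is dropped by \b)
    }
}
```

Punctuation inside a run survives because the character class only excludes digits and whitespace; punctuation at the end is cut off because the closing \b has to sit next to a word character.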

\b[\p{L}\p{M}]+\b

Breakdown:

\b - word boundary 
[  - start of character class 
\p{L} - single code point in the category "letter" 
\p{M} - code point that is a combining mark (such as diacritics) 
]  - end of character class 
+  - repeat previous character one or more times 
\b - word boundary
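As a rough check of the letters-only pattern, the sketch below applies it to the punctuation example and to a German phrase. The class name `LetterWords` and the sample inputs are assumptions for illustration; the non-ASCII letters are written as escapes so that the precomposed forms are used.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class LetterWords
{
    // Returns runs made up solely of Unicode letters and combining marks.
    public static string[] Find(string text) =>
        Regex.Matches(text, @"\b[\p{L}\p{M}]+\b")
             .Cast<Match>()
             .Select(m => m.Value)
             .ToArray();

    public static void Main()
    {
        // Punctuation now splits the run into separate words.
        Console.WriteLine(string.Join(", ", Find("aa?aa!aa")));   // aa, aa, aa

        // "gr\u00F6\u00DFer" is "größer"; the number is skipped.
        Console.WriteLine(string.Join(", ", Find("gr\u00F6\u00DFer als 42")));
    }
}
```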

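Good call. I've never used word boundaries before. Now I will. :) – bozdoz

This will also match "words" such as "aaa?", "aaa!", "aaa#" and so on. – mifki

@mifki - Punctuation is not matched. You would need to use something other than `\b` to include it. – Oded

Answer

I would suggest using this: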

foundMatch = Regex.IsMatch(SubjectString, @"\b[\p{L}\p{M}]+\b");

This matches only Unicode letters.

While @Oded's answer would also work, it also matches this: p+ü+üü++üüü++ü, which is not exactly a word.

Explanation:

" 
\b    # Assert position at a word boundary 
[\p{L}\p{M}] # Match a single character present in the list below 
        # A character with the Unicode property “letter” (any kind of letter from any language) 
        # A character with the Unicode property “mark” (a character intended to be combined with another character (e.g. accents, umlauts, enclosing boxes, etc.)) 
    +    # Between one and unlimited times, as many times as possible, giving back as needed (greedy) 
\b    # Assert position at a word boundary 
"

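You also need to include `\p{M}`, because accents may be encoded as separate code points. – mifki

@mifki +1 Thanks for the pointer. – FailedDev

Answer

Use this expression: \b[\p{L}\p{M}]+\b. It uses a lesser-known notation for matching Unicode characters (code points) that belong to a given category. \p{L} matches all letters, and \p{M} matches all combining marks. The latter is needed because an accented character is sometimes encoded as two code points (the letter itself plus a combining mark), and in that case \p{L} alone would match only one of them.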
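The effect of \p{M} can be seen with a decomposed string. The following is a small sketch, assuming a hypothetical class `CombiningMarkDemo` and a sample word of my own choosing: "über" spelled as 'u' followed by U+0308 (COMBINING DIAERESIS).

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class, for illustration only.
public static class CombiningMarkDemo
{
    // "über" in decomposed form: 'u' + U+0308, then "ber" (5 UTF-16 code units).
    public const string Decomposed = "u\u0308ber";

    // Pattern from the answer: letters plus combining marks.
    public static string WithMarks(string text) =>
        Regex.Match(text, @"\b[\p{L}\p{M}]+\b").Value;

    // Same pattern without \p{M}, for comparison.
    public static string LettersOnly(string text) =>
        Regex.Match(text, @"\b\p{L}+\b").Value;

    public static void Main()
    {
        // With \p{M} the whole word, including the combining mark, is matched.
        Console.WriteLine(WithMarks(Decomposed) == Decomposed);   // True

        // Without \p{M} the combining mark interrupts the run of letters,
        // so the whole word is not returned.
        Console.WriteLine(LettersOnly(Decomposed) == Decomposed); // False
    }
}
```

The precomposed spelling ("\u00FCber") is a plain run of letters, so the difference only shows up when the text contains combining marks.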

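Also note that this is a general expression for matching words that may contain international characters. If you need, for example, to match several words at once or to allow words that end in digits, you will have to modify the pattern accordingly.

+1. Didn't know the \p{M} trick :) – FailedDev

OK, but why? You should always try to explain why your solution works when the OP's didn't. Here on SO, drive-by answers like this are frowned upon. –

@AlanMoore I explained it in my comment on FailedDev's answer. I'll update my answer as well. – mifki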
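One possible modification for the "word1 word2" case from the question is sketched below. It is not taken from any of the answers: the class name `WordPairs`, the definition of a word as "a letter followed by letters, marks or decimal digits", and the use of a lookahead to obtain overlapping pairs are all assumptions made for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class WordPairs
{
    // Assumed word shape: starts with a letter or mark, may continue with digits.
    private const string Word = @"[\p{L}\p{M}][\p{L}\p{M}\p{Nd}]*";

    // The lookahead captures "word whitespace word" without consuming it;
    // only the first word is consumed, so the second word can start the next pair.
    private static readonly Regex PairPattern =
        new Regex(@"\b(?=(" + Word + @"\s+" + Word + @")\b)" + Word);

    public static string[] Find(string text) =>
        PairPattern.Matches(text)
                   .Cast<Match>()
                   .Select(m => m.Groups[1].Value)
                   .ToArray();

    public static void Main()
    {
        Console.WriteLine(string.Join(" | ", Find("my birthday is 11/08/2000")));
        // my birthday | birthday is
        Console.WriteLine(string.Join(" | ", Find("mark1 is 1")));
        // mark1 is
    }
}
```

Because a word must begin with a letter, neither the date nor the lone "1" can take part in a pair, which matches the examples given in the question.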
