Using a regular expression to match everything after a hyphen

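Question:

I'm trying to extract headlines from news articles. The problem is that news sites often append a hyphen and the company name to the headline, so I'm trying to write a regex that matches a space, a hyphen, a space, and everything after it. For example: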

'Minecraft - Xbox 360 Edition' future mash up packs and Xbox One updates posted - National Video Game News

should match:

- National Video Game News

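I want the regex to match the space + hyphen + space and everything after it only when what follows is at most 4 words long and every word starts with an uppercase letter. I tried a negative lookahead to exclude words that start with a lowercase letter: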

\s-\s(?!([a-z]+\s){3,}[a-z]+).*

But it matches the space, the first hyphen, and everything after it:

- Xbox 360 Edition' future mash up packs and Xbox One updates posted - National Video Game News

I'm not sure what to do here. Can anyone help?

Answer

Instead of using a regex, just use string methods to find the last hyphen and extract everything after it:

string title = "'Minecraft - Xbox 360 Edition' future mash up packs and Xbox One updates posted - National Video Game News"; 
string name = title.Substring(title.LastIndexOf("-") + 1).Trim(); 

Console.WriteLine(name); // "National Video Game News"
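
The snippet above extracts the publisher name. If the goal is the headline itself, the same idea can be turned around. The following is only a sketch under a few assumptions: `StripSuffix` is a hypothetical helper name, the code is written with C# 9 top-level statements, and it searches for " - " (with spaces) rather than a bare "-" on the assumption that the publisher is always set off by spaces.

```csharp
using System;

// Hypothetical helper: returns the headline without a trailing " - Publisher" part.
static string StripSuffix(string input)
{
    // Search for the last " - " so that hyphenated words are not treated as separators.
    int index = input.LastIndexOf(" - ", StringComparison.Ordinal);

    // No separator found: return the input unchanged.
    return index < 0 ? input : input.Substring(0, index).TrimEnd();
}

string headline = "'Minecraft - Xbox 360 Edition' future mash up packs and Xbox One updates posted - National Video Game News";

// Prints: 'Minecraft - Xbox 360 Edition' future mash up packs and Xbox One updates posted
Console.WriteLine(StripSuffix(headline));
```

Note that this cuts at the last " - " no matter what follows it, so a headline with no publisher suffix but with a " - " of its own (such as the quoted game name here) would be truncated. That is the case the word-count and capitalization rule in the question is meant to guard against.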

Answer

Why not write it the normal way, without an exclusion pattern?

\s-(\s[A-Z][a-z]+){0,4}$
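
One way this pattern might be applied in C# is with `Regex.Replace`, removing the matched suffix and leaving the headline. This is a minimal sketch rather than a definitive implementation: `RemovePublisher` is a hypothetical helper name, and the code assumes C# 9 top-level statements.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: removes a trailing " - Publisher Name" of up to four capitalized words.
static string RemovePublisher(string input)
{
    // Pattern from the answer: whitespace, a hyphen, then zero to four words that each
    // start with an uppercase letter followed by lowercase letters, anchored at the end.
    const string SuffixPattern = @"\s-(\s[A-Z][a-z]+){0,4}$";
    return Regex.Replace(input, SuffixPattern, string.Empty);
}

string withPublisher = "'Minecraft - Xbox 360 Edition' future mash up packs and Xbox One updates posted - National Video Game News";
string withoutPublisher = "'Minecraft - Xbox 360 Edition' future mash up packs and Xbox One updates posted";

// Both lines print: 'Minecraft - Xbox 360 Edition' future mash up packs and Xbox One updates posted
Console.WriteLine(RemovePublisher(withPublisher));
Console.WriteLine(RemovePublisher(withoutPublisher));
```

The hyphen inside the quoted game name is left alone because "Xbox 360 Edition'..." is not a run of capitalized words reaching the end of the string. Since each word must match `[A-Z][a-z]+`, publisher names containing all-caps words or digits would not be removed by this pattern.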

Thank you, I'm terrible with regular expressions. – user2517599

Answer

Try this:

(?<Title>'[\w\s-\s\w]+')(?<Name>[\w\s]+)-(?<Publisher>[\s\w]+)

Edit: Yes, that's not what you were expecting. Would this work, or is that part missing in some titles?

var text = new[]
{
    @"'Minecraft - Xbox 360 Edition' future mash up packs and Xbox One updates posted - National Game News",
    @"'Minecraft - Xbox 360 Edition' future mash up packs and Xbox One updates posted",
    @"'Minecraft Xbox 360 Edition' future mash up packs and Xbox One updates posted"
};

var regex = new Regex ("(?<Title>'[\\w\\s-]*')(?<Name>[\\w\\s]+)[-]*(?<Publisher>[\\s\\w]*)");

foreach (var t in text)
{
    var matches = regex.Matches (t);

    foreach (Match match in matches)
    {
        Console.WriteLine ("Title:\t\t{0}", match.Groups ["Title"].Value.Trim());
        Console.WriteLine ("Name:\t\t{0}", match.Groups ["Name"].Value.Trim());
        Console.WriteLine ("Publisher:\t{0}", match.Groups ["Publisher"].Value.Trim());
        Console.WriteLine();
    }
}

Output for the 3 different strings:

Title:  'Minecraft - Xbox 360 Edition' 
Name:  future mash up packs and Xbox One updates posted 
Publisher: National Game News 

Title:  'Minecraft - Xbox 360 Edition' 
Name:  future mash up packs and Xbox One updates posted 
Publisher: 

Title:  'Minecraft Xbox 360 Edition' 
Name:  future mash up packs and Xbox One updates posted 
Publisher:
