Matching multiple capture groups, but the result is not what I expected

Problem description:

I am trying to learn Java regular expressions. I want to match several capture groups (namely j(a(va))) against another string (namely this is java. this is ava, this is va). The output I expected was:

I found the text "java" starting at index 8 and ending at index 12. 
I found the text "ava" starting at index 21 and ending at index 24.  
I found the text "va" starting at index 34 and ending at index 36. 
Number of group: 2 

However, the IDE only outputs:

I found the text "java" starting at index 8 and ending at index 12. 
Number of group: 2 

Why does this happen? Is there something I am missing?

Original code:

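import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The class name is illustrative; the original snippet did not show one.
public class RegexTest {
    public static void main(String[] args) throws IOException {
        BufferedReader br = new BufferedReader(new InputStreamReader(System.in));

        // Read the regex and the input string from standard input.
        System.out.println("\nEnter your regex:");
        Pattern pattern = Pattern.compile(br.readLine());

        System.out.println("\nEnter input string to search:");
        Matcher matcher = pattern.matcher(br.readLine());

        boolean found = false;
        // Each find() call advances to the next match of the whole regex.
        while (matcher.find()) {
            System.out.format("I found the text"
                + " \"%s\" starting at "
                + "index %d and ending at index %d.%n",
                matcher.group(),
                matcher.start(),
                matcher.end());
            found = true;
            // groupCount() is the number of capturing groups in the pattern.
            System.out.println("Number of group: " + matcher.groupCount());
        }
        if (!found) {
            System.out.println("No match found.");
        }
    }
}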

After running the code above, I entered the following input:

Enter your regex: 
j(a(va)) 

Enter input string to search: 
this is java. this is ava, this is va 

And the IDE outputs:

I found the text "java" starting at index 8 and ending at index 12. 
Number of group: 2 

Try using https://regex101.com/ –

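I think you misunderstand what capture groups do. They do not make the other parts of the regex optional, so your regex only matches the whole string 'java'. – Barmar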

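Please do not post questions that read from System.in and then do some processing on the result, because that means you could easily debug the code to tell whether the error comes from reading System.in or from processing a hard-coded string. In either case, the code is not a minimal example and/or the source of the error could easily be narrowed down. It also means more work to reproduce the problem. – f*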

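Your regex only matches the whole string java; it does not match ava or va on their own. When it matches java, it sets capture group 1 to ava and capture group 2 to va, but those groups are not searched for as separate strings. The regex that will produce the result you want is: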

j?(a?(va)) 

The ? makes the preceding item optional, so the regex also matches the later occurrences that lack those prefixes.
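To make the difference concrete, here is a minimal sketch that runs both patterns against the string from the question and records the whole match together with each capture group. The class name CaptureGroupDemo, the helper describeMatches, and the compact output format are assumptions made for illustration; they are not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original question.
class CaptureGroupDemo {
    // Returns one line per match: whole match, start-end indices, then each group.
    static List<String> describeMatches(String regex, String input) {
        List<String> lines = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            StringBuilder sb = new StringBuilder();
            sb.append(m.group()).append('@').append(m.start()).append('-').append(m.end());
            // Groups 1..groupCount() are sub-parts of this single match.
            for (int g = 1; g <= m.groupCount(); g++) {
                sb.append(" g").append(g).append('=').append(m.group(g));
            }
            lines.add(sb.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        String input = "this is java. this is ava, this is va";
        // Without '?': one match only, groups are pieces of "java".
        describeMatches("j(a(va))", input).forEach(System.out::println);
        // prints: java@8-12 g1=ava g2=va

        // With '?': the prefixes are optional, so three matches are found.
        describeMatches("j?(a?(va))", input).forEach(System.out::println);
        // prints: java@8-12 g1=ava g2=va
        //         ava@22-25 g1=ava g2=va
        //         va@35-37 g1=va g2=va
    }
}
```

Indices are zero-based and the end index is exclusive, so on this input ava is reported at 22 to 25 and va at 35 to 37.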

DEMO

Thank you very much for your help! I really appreciate it! – Thor

The regex you need is (j?(a?(va)))

Pattern p = Pattern.compile("(j?(a?(va)))");
Matcher m = p.matcher("this is java. this is ava, this is va");

while (m.find()) {
    String group = m.group();
    int start = m.start();
    int end = m.end();
    System.out.format("I found the text"
        + " \"%s\" starting at "
        + "index %d and ending at index %d.%n",
        group,
        start,
        end);
}

You can see a demo here
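One detail that may explain the "Number of group: 2" line in the question: Matcher.groupCount() reports how many capturing groups the pattern declares, not how many matches were found. The sketch below separates the two numbers; the class name GroupCountDemo and the helpers countMatches and countGroups are hypothetical names chosen for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answers.
class GroupCountDemo {
    // Number of times the regex matches in the input.
    static int countMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    // Number of capturing groups declared in the pattern (group 0 is not counted).
    static int countGroups(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    public static void main(String[] args) {
        String input = "this is java. this is ava, this is va";
        for (String regex : new String[] {"j(a(va))", "j?(a?(va))", "(j?(a?(va)))"}) {
            System.out.println(regex + ": groups=" + countGroups(regex)
                + ", matches=" + countMatches(regex, input));
        }
        // prints: j(a(va)): groups=2, matches=1
        //         j?(a?(va)): groups=2, matches=3
        //         (j?(a?(va))): groups=3, matches=3
    }
}
```

The extra outer parentheses in (j?(a?(va))) add a third group whose content equals the whole match, so they change groupCount() but not the set of matches.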