Scala: regular expression pattern matching

Problem description:

I have the following input strings:

"/horses/[email protected]" 
"/Goats/[email protected]" 
"/CATS/[email protected]" 

I want to get:

"horses", "c132", "[email protected]" 
"Goats", "b-01", "[email protected]" 
"CATS", "001", "[email protected]" 

I tried the following.

StandardTokenParsers (the output follows):

import scala.util.parsing.combinator.syntactical._ 
val p = new StandardTokenParsers { 
lexical.reserved ++= List("/", "?", "XXX=") 
def p = "/" ~ opt(ident) ~ "/" ~ opt(ident) ~ "?" ~ "XXX=" ~ opt(ident) 
} 
p: scala.util.parsing.combinator.syntactical.StandardTokenParsers{def p: this.Parser[this.~[this.~[this.~[String,Option[String]],String],Option[String]]]} = [email protected] 

scala> p.p(new p.lexical.Scanner("/horses/[email protected]")) 
warning: there was one feature warning; re-run with -feature for details 
res3: p.ParseResult[p.~[p.~[p.~[String,Option[String]],String],Option[String]]] = 
[1.1] failure: ``/'' expected but ErrorToken(illegal character) found 

/horses/[email protected] 
^ 

Regular expression:

import scala.util.matching.Regex 
val p1 = "(/)(.*)(/)(.*)(?)(XXX)(=)(.*)".r 
p1: scala.util.matching.Regex = (/)(.*)(/)(.*)(?)(XXX)(=)(.*) 

scala> val p1(_,animal,_,id,_,_,_,company) = "/horses/[email protected]" 
scala.MatchError: /horses/[email protected] (of class java.lang.String) 
    ... 32 elided 

Can anyone help? Thanks!

Your pattern looks like /(desired-group1)/(desired-group2)?XXX=(desired-group3)

So the regular expression would be:

scala> val extractionPattern = """(/)(.*)(/)(.*)(\?XXX=)(.*)""".r 
extractionPattern: scala.util.matching.Regex = (/)(.*)(/)(.*)(\?XXX=)(.*) 

- Escape the ? character. Unescaped, ? is regex syntax (a quantifier), not a literal question mark.

How it works:
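To illustrate why the escape matters, here is a minimal sketch comparing an escaped and an unescaped `?`. The object name `EscapeDemo`, the helper `groups`, and the sample input (in particular the value after `XXX=`) are made up for illustration, since the original input strings are obscured in this post. The patterns here capture only the three wanted parts.

```scala
import scala.util.matching.Regex

object EscapeDemo {
  // Hypothetical sample input; the value after XXX= is a placeholder.
  val sample = "/horses/c132?XXX=owner@example.com"

  // Escaped: \? matches a literal question mark.
  val escaped: Regex = """/(.*)/(.*)\?XXX=(.*)""".r

  // Unescaped: ? acts as a quantifier that makes the preceding group optional,
  // so the literal '?' from the input ends up inside the second capture.
  val unescaped: Regex = """/(.*)/(.*)?XXX=(.*)""".r

  // Returns the captured groups of the first match, if any.
  def groups(r: Regex, s: String): Option[List[String]] =
    r.findFirstMatchIn(s).map(_.subgroups)
}
```

With the escaped pattern the second group should come out as `c132`; with the unescaped one the question mark is swallowed into that group instead.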

Full match `/horses/[email protected]` 
Group 1. `/` 
Group 2. `horses` 
Group 3. `/` 
Group 4. `c132` 
Group 5. `?XXX=` 
Group 6. `[email protected]` 

Now, applying it gives you all the groups of the regex match:

scala> extractionPattern.findAllIn("""/horses/[email protected]""") 
         .matchData.flatMap{m => m.subgroups}.toList 
res15: List[String] = List(/, horses, /, c132, ?XXX=, [email protected]) 

Since you only care about the 2nd, 4th and 6th groups, collect only those.

So the solution would look like:

scala> extractionPattern.findAllIn("""/horses/[email protected]""") 
         .matchData.map(_.subgroups) 
         .flatMap(matches => Seq(matches(1), matches(3), matches(5))).toList 
res16: List[String] = List(horses, c132, [email protected]) 
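Since only three parts are wanted, the separators do not need capture groups of their own. A minimal sketch of that variant, using Scala's regex extractor in a pattern match, might look like the following. The names `AnimalPath`, `Path` and `parse` are hypothetical, and the character classes `[^/]+` and `[^?]+` assume that the first segment contains no `/` and the second contains no `?`.

```scala
object AnimalPath {
  // Only the three wanted parts are captured; "/" and "?XXX=" are plain literals.
  // Assumption: no '/' in the first segment and no '?' in the second.
  private val Path = """/([^/]+)/([^?]+)\?XXX=(.*)""".r

  // Returns the three parts, or None when the whole input does not match.
  def parse(input: String): Option[(String, String, String)] = input match {
    case Path(animal, id, company) => Some((animal, id, company))
    case _                         => None
  }
}
```

A call such as `AnimalPath.parse("/Goats/b-01?XXX=owner@example.com")` (with a placeholder value after `XXX=`) would then yield the three parts as a tuple. The extractor only succeeds when the entire string matches the pattern.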

When your input does not match the regex, you get an empty result:

scala> extractionPattern.findAllIn("""/horses/c132""") 
         .matchData.map(_.subgroups) 
         .flatMap(matches => Seq(matches(1), matches(3), matches(5))).toList 
res17: List[String] = List() 
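The same "no match, no result" behaviour can be had for a whole batch of inputs. A pattern-bound `val` such as the question's `val p1(...) = ...` throws scala.MatchError on a non-matching string, whereas `collect` with a regex extractor simply skips it. The sketch below assumes the six-group pattern from above; `BatchExtract` and `extractAll` are hypothetical names, and the sample values in the usage note are placeholders.

```scala
object BatchExtract {
  // Same six-group pattern as above, with the ? escaped.
  private val extractionPattern = """(/)(.*)(/)(.*)(\?XXX=)(.*)""".r

  // Keeps groups 2, 4 and 6 of each matching input; non-matching inputs are dropped.
  def extractAll(inputs: Seq[String]): Seq[(String, String, String)] =
    inputs.collect {
      case extractionPattern(_, animal, _, id, _, company) => (animal, id, company)
    }
}
```

For example, `BatchExtract.extractAll(Seq("/horses/c132?XXX=a@example.com", "/horses/c132"))` should return a single tuple for the first string and silently ignore the second.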

Working regex - https://regex101.com/r/HuGRls/1/