Regex to remove text before "http://"?

Question:

I have a Ruby app that parses a bunch of URLs out of a string:

@text = "a string with a url http://example.com"

urls = @text.split.grep(/http[s]?:\/\/\w/)

urls[0] # => "http://example.com"

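This works fine ^^

But sometimes a URL has text attached directly in front of the http://, so the whitespace split leaves that text stuck to the URL.

Is there a regex that selects the text before "http://" in a string, so that I can strip it out?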

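Heads up: you will have the same problem at the end of the URL too, and that will be much harder to deal with. – JohnFx 2009-07-30 16:04:26

Yes, I agree with JohnFx. Regex isn't a great fit for this problem. Matching URLs in a string has been asked on SO before. Look at what solutions those questions settled on, i.e. which libraries etc. – Pod 2009-07-30 16:07:04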

Answer

Splitting and then grepping is an odd way to do this. Why not just use String#scan:

@text = "a string with a url http://example.com"
urls = @text.scan(/http[s]?:\/\/\S+/)
urls[0] # => "http://example.com"

Answer

.*(?=http://)
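
A minimal sketch of how that lookahead could be applied in Ruby, assuming the goal is to strip the prefix from a single whitespace-delimited token. The sample token and the variable names are illustrative and do not come from the question.

```ruby
# Hypothetical token where text is glued to the front of the URL.
token = "here:http://example.com"

# %r{...} avoids having to escape the slashes in the pattern.
# The lookahead (?=http://) is not consumed, so only the prefix is replaced.
# .* is greedy, so everything up to the LAST "http://" in the token is removed.
url = token.sub(%r{.*(?=http://)}, "")

puts url
```

If the token contains no "http://", the pattern does not match and sub returns the string unchanged.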

Answer

Or you can combine the two.

.*(?=(f|ht)tps?://)

Answer

Just search for http://, then remove the part of the string before it (=~ returns the offset of the match in the string).
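
The offset approach might look roughly like this. It is a sketch under the assumption that only the first URL in the string matters; the sample text and variable names are made up for illustration.

```ruby
# Hypothetical input with text attached in front of the URL.
text = "another URL here:http://2.example.com"

# String#=~ returns the index of the first match, or nil when there is none.
offset = text =~ %r{https?://}

# Keep everything from the match onward; fall back to the original string.
url_and_rest = offset ? text[offset..-1] : text

puts url_and_rest
```

This only trims the front: any text that follows the URL stays in the result, which is the harder problem mentioned in the comments.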

Answer

Perhaps a better way to get the same result is to use the URI standard library.

require 'uri' 
text = "a string with a url http://example.com and another URL here:http://2.example.com and this here" 
URI.extract(text, ['http', 'https']) 
# => ["http://example.com", "http://2.example.com"]

Documentation: URI.extract
