Notepad++ regex: getting links that start with a given string in HTML

Problem description:

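I have a large HTML file and I want to keep all the links that start with "https://exampledomain.com/category/" and delete the rest. The HTML also contains links like "https://exampledomain.com/edit/ ..." and "https://exampledomain.com/view/ ...", as well as tags and text. I want to delete everything except "https://exampledomain.com/category/.../".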

The final result must look like this:

https://www.exampledomain/category/presents/ 
https://www.exampledomain/category/books/ 
https://www.exampledomain/category/clothes/ 
https://www.exampledomain/category/bags/


Any ideas? Thanks! :)

Quick way: replace 'a href="' with a new line, then sort the lines (TextFX) –

Can you post a text example? Transcribing that image would be a hassle. – chris85

Thanks Alex, your idea saved me! :D – Emanuel

Answer

As Alex suggested, I used search and replace (in Extended mode, so that \n is a newline) to put each link on a line of its own...

Search: (https://www.exampledomain/category/[^"]*) matches each link up to the closing quote (") of href="URL"
Replace: \n\n\1\n\n
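
For anyone who would rather script this step than run it in the editor, the same search and replace can be sketched in Python. This is only an illustration and not part of the original answer: the function name `isolate_links` and the sample HTML are made up, and the pattern is the one above with the prefix escaped so that its dots are literal.

```python
import re

# Prefix taken from the Search pattern above; re.escape makes the dots literal.
PREFIX = "https://www.exampledomain/category/"
LINK_RE = re.compile("(" + re.escape(PREFIX) + r'[^"]*)')

def isolate_links(html):
    """Surround every matching link with blank lines so it sits on its own line."""
    # In the replacement template, \n becomes a newline and \1 is the captured link.
    return LINK_RE.sub(r"\n\n\1\n\n", html)

if __name__ == "__main__":
    # Hypothetical sample input, only for demonstration.
    sample = '<a href="https://www.exampledomain/category/books/">Books</a>'
    print(isolate_links(sample))
```

Links that do not start with the prefix are left untouched, so a second pass is still needed to drop everything else.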

When that is done, I use Notepad++ "Ctrl + F > Mark" to mark all lines containing

https://www.exampledomain/category/

Then remove the lines that were not marked... using menu > Search > Bookmark > Remove Unmarked Lines...
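
The mark-and-remove step amounts to keeping only the lines that contain the category prefix. A minimal Python sketch of that filter, assuming the links already sit on their own lines; the name `keep_category_lines` and the sample text are hypothetical:

```python
# Prefix taken from the answer above.
PREFIX = "https://www.exampledomain/category/"

def keep_category_lines(text):
    """Return the lines that contain PREFIX, in their original order."""
    # Mirrors "mark lines containing PREFIX, then remove the unmarked lines".
    return [line.strip() for line in text.splitlines() if PREFIX in line]

if __name__ == "__main__":
    # Hypothetical text as it might look after the links were split out.
    text = (
        '<a href="\n\n'
        "https://www.exampledomain/category/books/\n\n"
        '">Books</a>\n'
        '<a href="https://www.exampledomain/edit/1">Edit</a>'
    )
    for link in keep_category_lines(text):
        print(link)
```

Like the Mark dialog, this keeps a whole line whenever the prefix appears anywhere in it, so it only gives clean URLs if each link really is alone on its line.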

Thanks! :D

How do you get rid of the trailing residue? – sln

Answer

You can use this:

Wrap around: yes
Find: .*?"(https://www.exampledomain/category/.*?)"|.*
Replace: \1\n
Regular expression: yes
. matches newline: yes

Click Replace All
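
The same one-pass idea can be tried outside Notepad++. The following Python sketch is an illustration under a few assumptions rather than an exact equivalent: `re.DOTALL` stands in for the ". matches newline" option, the dot in the host name is escaped, and the function name `extract_links` and the sample HTML are made up. Because the final `.*` alternative leaves empty lines behind, the sketch drops blank lines at the end.

```python
import re

# Pattern from the answer, with the dot in the host name escaped.
# re.DOTALL lets "." match newlines, like ". matches newline" in Notepad++.
PATTERN = re.compile(
    r'.*?"(https://www\.exampledomain/category/.*?)"|.*',
    re.DOTALL,
)

def extract_links(html):
    """Replace each match with the captured link plus a newline, then drop blank lines."""
    # group(1) is None when only the ".*" alternative matched.
    replaced = PATTERN.sub(lambda m: (m.group(1) or "") + "\n", html)
    return [line for line in replaced.splitlines() if line]

if __name__ == "__main__":
    # Hypothetical sample input, only for demonstration.
    html = (
        "<p>intro</p>\n"
        '<a href="https://www.exampledomain/category/books/">Books</a>\n'
        '<a href="https://www.exampledomain/edit/1">Edit</a>\n'
        '<a href="https://www.exampledomain/category/bags/">Bags</a> tail'
    )
    for link in extract_links(html):
        print(link)
```

The first alternative consumes everything up to and including the next quoted category link, while the second swallows whatever is left after the last one, which is why nothing but the links survives.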

Could you let me know whether this works for you? – trincot
