Python: incrementing the counter at the end of a URL

Question:

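I've come back to an old Python project, but I seem to have forgotten how I managed to extract the data. If anyone could point me in the right direction, and to the documentation for doing this, it would be much appreciated.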

I implemented a web crawler that extracts information from an HTML page by scanning its HTML. I use the BeautifulSoup and urllib2 libraries to scan the URL mywebsite.com/product=1.

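However, I would like the number at the end of the mywebsite.com URL to increment up to a maximum of 10. How exactly can I extract, read, and replace the end of the URL? I've noticed that others use the urlparse library to replace the domain, but that is not the same as my approach.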

> mywebsite.com/product=1 
> mywebsite.com/product=2 
> mywebsite.com/product=3 
> mywebsite.com/product=4 .. 
> mywebsite.com/product=10

Thanks!

Answer:

Do you mean looping and crawling 10 times?

for i in range(1, 11):
    # Build the URL for product i; plain concatenation works as well:
    # url = "mywebsite.com/product=" + str(i)
    url = "mywebsite.com/product={}".format(i)
    print(url)

    # crawl and extract
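
If you would rather start from an existing URL and replace the number at its end, something like the following might work. This is only a sketch: the helper name `product_urls` is made up for illustration, and the `http://` scheme is an assumption (the question's URLs have none, but fetching usually needs one).

```python
import re


def product_urls(base_url, last):
    """Yield base_url with its trailing number replaced by 1..last.

    Hypothetical helper, not part of any library.
    """
    # Strip the digits at the end, e.g. ".../product=1" -> ".../product="
    prefix = re.sub(r"\d+$", "", base_url)
    for i in range(1, last + 1):
        yield "{}{}".format(prefix, i)


# Assumed example URL; the scheme is added here for illustration only.
urls = list(product_urls("http://mywebsite.com/product=1", 10))
for url in urls:
    print(url)
```

Each generated URL can then be passed to whatever fetching and parsing code you already have.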

