Scrapy: how to get the text inside another tag

Question:

<p>Hello <strong>I'm G </strong></p>

I'm trying to get all the text inside the <p>, including the part inside the <strong> tag. I tried the code below, but I only get "Hello":

for text in response.css("div.entry-content"):
    yield {
        "parag": text.css("p::text").extract(),
    }

I also tried selecting the first child, as you would in CSS, but this time nothing is returned:

"parag": text.css("p:strong::text").extract()

Edit: it is not necessarily <strong>; it could be any other tag. So the goal is to get the text of the first child.

Comment: the css tag won't help here ;)

Answer

Here is a working example:

>>> from scrapy.http import HtmlResponse
>>> # A str body needs an explicit encoding on Python 3
>>> response = HtmlResponse(url="http://example.com", body="<p>Hello <strong>I'm G </strong> <b>I write code</b></p>", encoding="utf-8")

# Text of <p> plus its first child (the first two text nodes)
>>> ' '.join(t.strip() for i, t in enumerate(response.css('p ::text').extract()) if i < 2).strip()
"Hello I'm G"

# Text of <p> and all of its children (whitespace-only nodes are skipped)
>>> ' '.join(t.strip() for t in response.css('p ::text').extract() if t.strip())
"Hello I'm G I write code"
