



So basically: "The Book of Jacob", Chapters 1-7.



     <dt>Sub Title</dt> 
    <dt>Title 2</dt> 
     <dt>Sub Title 2</dt> 
#this continues for Title 3, Sub title 3, etc etc 


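import requests
import bs4

scripture_url = 'http://scriptures.nephi.org/docbook/bom/'
response = requests.get(scripture_url)
# Name the parser explicitly so the result does not depend on what is installed.
soup = bs4.BeautifulSoup(response.text, 'html.parser')

# Every <dt> nested inside a <dd>, i.e. the sub titles of every title on the page.
links = soup.select('dl dd dt')
for item in links:
    # Drop the first space-separated token and keep the rest of the text.
    title = item.get_text().split(' ', 1)[1]
    print(title)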


Chapter 1 
Chapter 2 
Chapter 3 
Chapter 4 
Chapter 5 
Chapter 6 
Chapter 7 
Chapter 8 
Chapter 9 
Chapter 10 
Chapter 11 
Chapter 12 
Chapter 13 
Chapter 14 
Chapter 15 
Chapter 16 
Chapter 17 
Chapter 18 
Chapter 19 
Chapter 20 
Chapter 21 
Chapter 22 
Chapter 1 
Chapter 2 
Chapter 3 
Chapter 4 
Chapter 5 
Chapter 6 
Chapter 7 
Chapter 8 
Chapter 9 
Chapter 10 
Chapter 11 
Chapter 12 
Chapter 13 
Chapter 14 
Chapter 15 
Chapter 16 
Chapter 17 
Chapter 18 
Chapter 19 
Chapter 20 
Chapter 21 
Chapter 22 
Chapter 23 
Chapter 24 
Chapter 25 
Chapter 26 
Chapter 27 
Chapter 28 
Chapter 29 
Chapter 30 
Chapter 31 
Chapter 32 
Chapter 33 
Chapter 1 
Chapter 2 
Chapter 3 
Chapter 4 
Chapter 5 
Chapter 6 
Chapter 7 
Chapter 1 
Chapter 1 

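The description is unclear and somewhat contradictory. Posting the expected output might help us understand what you are actually trying to achieve. – har07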


@har07 Thank you. I went ahead and clarified the question, added the output, and tried to explain it better. – nadermx


book_title = 'The Book of Jacob'
# Find the link whose text is exactly the book title.
book = soup.find('a', text=book_title)
print(book.text)


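# The chapters sit in the <dd> that follows the book's entry. Newer
# Beautiful Soup releases may reject a selector that starts with a
# combinator ('+ dd > dl > dt'), so step to the sibling first.
links = book.parent.find_next_sibling('dd').select('dl > dt')
for item in links:
    # Drop the first space-separated token and keep the rest of the text.
    title = item.get_text().split(' ', 1)[1]
    print(title)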


The Book of Jacob 
Chapter 1 
Chapter 2 
Chapter 3 
Chapter 4 
Chapter 5 
Chapter 6 
Chapter 7 
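
If you need to do this for more than one book, the lookup above can be wrapped in a small function. This is only a sketch: `chapters_of` and `SAMPLE_HTML` are names I made up, and the sample markup is an abbreviated, hypothetical stand-in for the real table of contents, which may be nested differently. `string=` is the newer name of the `text=` argument used above.

```python
from bs4 import BeautifulSoup

# Hypothetical, abbreviated stand-in for the page's table of contents.
SAMPLE_HTML = """
<dl>
  <dt><a href="#jacob">The Book of Jacob</a></dt>
  <dd><dl>
    <dt><a href="#j1">1. Chapter 1</a></dt>
    <dt><a href="#j2">2. Chapter 2</a></dt>
  </dl></dd>
  <dt><a href="#enos">The Book of Enos</a></dt>
  <dd><dl>
    <dt><a href="#e1">1. Chapter 1</a></dt>
  </dl></dd>
</dl>
"""


def chapters_of(soup, book_title):
    """Return the chapter titles listed under the book named book_title."""
    link = soup.find('a', string=book_title)
    if link is None:
        return []
    # The chapters live in the <dd> that follows the book's <dt>.
    dd = link.find_parent('dt').find_next_sibling('dd')
    if dd is None:
        return []
    titles = []
    for dt in dd.select('dl > dt'):
        text = dt.get_text(strip=True)
        # Drop the leading numbering token ("1."), if there is one.
        titles.append(text.split(' ', 1)[-1])
    return titles


if __name__ == '__main__':
    soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')
    for chapter in chapters_of(soup, 'The Book of Jacob'):
        print(chapter)
```

Changing the second argument selects a different book; a title that is not on the page gives an empty list instead of an exception.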


links = soup.select('dl dd dt')
# Skip the last two entries.
for item in links[:-2]:
    title = item.get_text().split(' ', 1)[1]
    print(title)

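This doesn't seem to work for what I need. I updated the question to clarify. The site has more than one title, and they go on for a while. I need to be able to change the terms and select, say, Title 1 or Title 7, etc. – nadermx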


title = links[0]
subtitle = links[1]

I updated my question. I need to be able to select the first one, then the second or the third, because it is a page with many sections. – nadermx
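
One way to pick a section by position is to group the sub titles under their title first and then index into the result. A minimal sketch, assuming the titles are the direct `<dt>` children of the first `<dl>` on the page and each one is followed by a `<dd>` holding its sub titles; `sections` and `SAMPLE_HTML` are hypothetical names, and the sample markup only mirrors the placeholder structure from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical markup that mirrors the Title / Sub Title placeholders.
SAMPLE_HTML = """
<dl>
  <dt>Title 1</dt>
  <dd><dl>
    <dt>Sub Title 1</dt>
  </dl></dd>
  <dt>Title 2</dt>
  <dd><dl>
    <dt>Sub Title 2</dt>
    <dt>Another Sub Title 2</dt>
  </dl></dd>
  <dt>Title 3</dt>
</dl>
"""


def sections(soup):
    """Return [(title, [sub titles]), ...] in page order."""
    result = []
    top = soup.find('dl')
    if top is None:
        return result
    # Only the direct <dt> children are titles; nested ones are sub titles.
    for dt in top.find_all('dt', recursive=False):
        nxt = dt.find_next_sibling(['dt', 'dd'])
        subs = []
        # A title owns the sub titles only if a <dd> comes right after it.
        if nxt is not None and nxt.name == 'dd':
            subs = [s.get_text(strip=True) for s in nxt.select('dl > dt')]
        result.append((dt.get_text(strip=True), subs))
    return result


if __name__ == '__main__':
    soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')
    title, subs = sections(soup)[1]  # second title on the page
    print(title)
    for sub in subs:
        print(sub)
```

Indexing `sections(soup)` with 0, 1, 2 and so on then selects the first, second or third title together with everything listed under it.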