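I've recently been learning Python web scraping and am using BeautifulSoup to scrape book information from my school library's catalog. The problem is that the Douban summary shown on the book page never appears in what I scrape. Any help would be appreciated.
Here is the main code: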
import urllib.request
from bs4 import BeautifulSoup

url = 'http://202.119.112.133:8080/opac/item.php?marc_no=0000365400'
user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
headers = {'User-Agent': user_agent}
# Build a request that carries the custom User-Agent header
request = urllib.request.Request(url, headers=headers)
# Pass the Request object (not the bare url) so the headers are actually sent
response = urllib.request.urlopen(request)
content = response.read().decode('utf-8')
book_soup = BeautifulSoup(content, "lxml")
# Collect every <dl class="booklist"> element on the page
book_intro = book_soup.find_all('dl', {'class': 'booklist'})
for item in book_intro:
    print(item.get_text('', strip=True))
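One thing that might help narrow this down: if the Douban summary is filled in by JavaScript after the page loads (only a guess, I have not confirmed it for this site), it would never be in the HTML that urllib downloads. Below is a rough diagnostic sketch using only the standard library. The names `ScriptProbe` and `probe_static_html` and the default keyword "douban" are made up for illustration. It reports whether the keyword occurs anywhere in the raw HTML and which scripts the page references.

```python
from html.parser import HTMLParser


class ScriptProbe(HTMLParser):
    """Collects <script src> URLs and counts inline script chunks mentioning a keyword."""

    def __init__(self, keyword):
        super().__init__()
        self.keyword = keyword.lower()
        self.script_srcs = []    # external scripts referenced by the page
        self.inline_hits = 0     # inline script chunks containing the keyword
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
            src = dict(attrs).get("src")
            if src:
                self.script_srcs.append(src)

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Only look at text that sits inside a <script> element
        if self._in_script and self.keyword in data.lower():
            self.inline_hits += 1


def probe_static_html(html, keyword="douban"):
    """Summarise where (if anywhere) the keyword shows up in the downloaded HTML."""
    probe = ScriptProbe(keyword)
    probe.feed(html)
    probe.close()
    return {
        "keyword_in_html": keyword.lower() in html.lower(),
        "inline_script_hits": probe.inline_hits,
        "script_srcs": probe.script_srcs,
    }
```

Calling `probe_static_html(content)` on the decoded page from the code above would show whether the keyword only appears inside scripts. If it does, the summary text is probably fetched by a separate request, and that request would have to be found (for example in the browser's network panel) and scraped on its own.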