Scraping Taobao Products with Python

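Python is a popular programming language used for data mining, web scraping, and many other tasks. Here we show how to use Python with the requests and BeautifulSoup libraries to scrape product listings from Taobao.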

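import requests
from bs4 import BeautifulSoup


def getHTMLText(url):
    """Fetch a page and return its HTML, or an empty string on failure."""
    try:
        r = requests.get(url, timeout=10)
        r.raise_for_status()
        # Let requests guess the encoding from the response body
        r.encoding = r.apparent_encoding
        return r.text
    except requests.RequestException:
        return ""


def parsePage(ilt, html):
    """Append [title, price] pairs found in html to the list ilt."""
    soup = BeautifulSoup(html, "html.parser")
    # Class names as given in the article; adjust them if the page markup differs
    price = soup.find_all('div', attrs={'class': 'price g_price g_price-highlight'})
    title = soup.find_all('a', attrs={'class': 'J_ClickStat'})
    # zip stops at the shorter list, so a missing title or price cannot raise IndexError
    for t, p in zip(title, price):
        ilt.append([t.text.strip(), p.text.strip()])


def printGoodsList(ilt):
    """Print the collected items as a numbered, tab-separated table."""
    tplt = "{:4}\t{:16}\t{:8}"
    print(tplt.format("No.", "Title", "Price"))
    for count, g in enumerate(ilt, start=1):
        print(tplt.format(count, g[0], g[1]))


def main():
    goods = '書包'  # search keyword ("school bag")
    depth = 2  # number of result pages to fetch
    start_url = 'https://s.taobao.com/search?q=' + goods
    infoList = []
    for i in range(depth):
        # Each results page holds 44 items; s is the offset of the first item
        url = start_url + '&s=' + str(44 * i)
        html = getHTMLText(url)
        parsePage(infoList, html)
    printGoodsList(infoList)


if __name__ == "__main__":
    main()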

In this example, getHTMLText fetches the HTML of a page and parsePage parses that HTML. The main function calls getHTMLText to fetch each Taobao search results page and passes the HTML to parsePage to extract the data.

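The parsePage function finds the price and title of each product and appends them to a list. Finally, printGoodsList formats and prints the product information stored in that list.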
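To see the pairing step in isolation, here is a minimal sketch that runs the same two find_all calls against a small hand-written HTML fragment. The fragment is made up for illustration and is not real Taobao markup, and parse_items is a hypothetical helper name that does not appear in the script above.

```python
from bs4 import BeautifulSoup

# Hand-written sample markup (an assumption, not copied from Taobao)
SAMPLE_HTML = """
<div class="item">
  <div class="price g_price g_price-highlight"><span>¥</span><strong>59.00</strong></div>
  <a class="J_ClickStat" href="#"> Canvas school bag </a>
</div>
<div class="item">
  <div class="price g_price g_price-highlight"><span>¥</span><strong>128.00</strong></div>
  <a class="J_ClickStat" href="#"> Leather backpack </a>
</div>
"""


def parse_items(html):
    """Return a list of [title, price] pairs found in html."""
    soup = BeautifulSoup(html, "html.parser")
    prices = soup.find_all('div', attrs={'class': 'price g_price g_price-highlight'})
    titles = soup.find_all('a', attrs={'class': 'J_ClickStat'})
    # Both lists are in document order, so the n-th title goes with the n-th price
    return [[t.text.strip(), p.text.strip()] for t, p in zip(titles, prices)]


if __name__ == "__main__":
    for title, price in parse_items(SAMPLE_HTML):
        print(title, price)
```

Pairing by position only works when every product contributes exactly one matching title and one matching price. If the real page breaks that assumption, it may be safer to locate each product container first and search inside it.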

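We use “書包” (school bag) as the example search term and set the page depth to 2, so main requests two result pages, advancing the s parameter by 44 items per page. You can change the goods and depth variables and adapt the script to scrape other Taobao products with Python and BeautifulSoup.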
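The way goods and depth turn into request URLs can be sketched separately. The version below uses urllib.parse.urlencode from the standard library, so that a non-ASCII keyword is percent-encoded explicitly. build_search_urls, BASE_URL, and ITEMS_PER_PAGE are hypothetical names introduced for this illustration, and the page size of 44 is taken from the script above rather than from any Taobao documentation.

```python
from urllib.parse import urlencode

BASE_URL = "https://s.taobao.com/search"
ITEMS_PER_PAGE = 44  # page size assumed by the script above (s = 44 * i)


def build_search_urls(goods, depth):
    """Return one search URL per results page for the given keyword."""
    urls = []
    for page in range(depth):
        # urlencode percent-encodes the keyword as UTF-8
        query = urlencode({"q": goods, "s": ITEMS_PER_PAGE * page})
        urls.append(f"{BASE_URL}?{query}")
    return urls


if __name__ == "__main__":
    for url in build_search_urls("書包", 2):
        print(url)
```

Building the query string from a dictionary keeps the keyword and the offset in one place, which makes it easier to add further parameters later.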
