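I'm trying to scrape product information from Shopee using Python.
Because the site loads its data via AJAX, I tried to pull the item information from: https://shopee.com.my/api/v2/item/get?itemid=5859069631&shopid=206039726
After a few requests, I found that the JSON it returns contains fake values (for example, the item's actual rating is 4.78, but the response says 0.24).
I tried to fix this by changing the headers and the IP/proxy, but it still doesn't work.
Is there any other way to potentially solve this?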
import requests
from fake_useragent import UserAgent

def fetch_json(url):
    # Request one API URL through a proxy; return the parsed JSON, or None on failure.
    # `referer` and `get_daili()` (returns a proxy address) are defined elsewhere (not shown).
    headers = {'User-Agent': UserAgent().random,
               'Accept': '*/*',
               'Accept-Language': 'en-US,en;q=0.5',
               'X-Shopee-Language': 'en',
               'X-Requested-With': 'XMLHttpRequest',
               'X-API-SOURCE': 'pc',
               'If-None-Match-': '55b03-2ff39563c299cbdc937f8ab86ef322ab',
               'DNT': '1',
               'Referer': referer,
               'TE': 'Trailers'}
    # requests expects a scheme-to-proxy mapping; the original nested it
    # under a "proxies" key, which requests silently ignores.
    proxies = {'https': get_daili()}
    try:
        response = requests.get(url, headers=headers, proxies=proxies, verify=False)
    except requests.ConnectionError as e:
        print(f' {url} error', e.args)
        return None
    if response.status_code == 200:
        return response.json()
    return None

def get_info(url, itemurl):
    requests.adapters.DEFAULT_RETRIES = 5
    # Shop details
    shop_info = fetch_json(url)
    if shop_info is not None:
        shop_name = shop_info['data']['name']
        followers = shop_info['data']['follower_count']
        ratinggood = shop_info['data']['rating_good']
        ratingbad = shop_info['data']['rating_bad']
        ratingnormal = shop_info['data']['rating_normal']
    # Item details (this is the response that comes back with fake values)
    item_info = fetch_json(itemurl)
    #print(json.dumps(item_info, indent=4))
    print(itemurl)

Posted on 2021-06-20 23:30:55
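I think this is an algorithm that protects their API service, so that people don't abuse their servers.
Maybe you can try using Python Selenium together with Selenium Wire to capture the data instead.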
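
A rough sketch of what that could look like, not a tested solution for Shopee: it loads the product page in a real browser through Selenium Wire and collects the JSON responses the page itself fetches. The function names `extract_json_responses` and `capture_item_json`, the default `url_fragment`, and the timeout are all illustrative choices of mine; it also assumes the page requests the same `/api/v2/item/get` endpoint mentioned in the question and that a Chrome driver is available.

```python
import json
import time


def extract_json_responses(requests_seen, url_fragment):
    """Return parsed JSON bodies of captured 200 responses whose URL contains url_fragment."""
    results = []
    for req in requests_seen:
        resp = req.response
        # A request that has not been answered yet has response == None.
        if resp is None or url_fragment not in req.url:
            continue
        if resp.status_code != 200:
            continue
        try:
            results.append(json.loads(resp.body.decode("utf-8")))
        except (UnicodeDecodeError, json.JSONDecodeError):
            continue  # skip bodies that are not plain UTF-8 JSON
    return results


def capture_item_json(page_url, url_fragment="/api/v2/item/get", timeout=30):
    """Open page_url in Chrome and return the matching JSON responses it triggered."""
    # Imported here so the helper above works without selenium-wire installed.
    from seleniumwire import webdriver

    # disable_encoding asks servers for uncompressed bodies, so resp.body is plain bytes.
    driver = webdriver.Chrome(seleniumwire_options={"disable_encoding": True})
    try:
        driver.get(page_url)
        # Poll the captured traffic until a matching response shows up or we time out.
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            found = extract_json_responses(driver.requests, url_fragment)
            if found:
                return found
            time.sleep(0.5)
        return []
    finally:
        driver.quit()
```

You would call `capture_item_json(...)` with the URL of the product page as you see it in your browser, then read the rating out of the returned dictionaries. Since the browser makes the request itself, the API call carries whatever cookies and headers the site's own JavaScript sets, which is the part that is hard to reproduce with plain `requests`.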
https://stackoverflow.com/questions/67891213