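Asking the experts here for help with a small scraping problem
http://www.yinsuwang.com/exam/catalogue.html?id=2 I want to scrape the free question bank, but I can't get the data.
The data shown in the screenshot is exactly what I want, but I can never get it with Python.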
import requests
import json
headers ={"Accept": "application/json, text/javascript, */*; q=0.01",
"Accept-Encoding": "gzip, deflate",
"Accept-Language": "zh-CN,zh;q=0.9",
"Connection":"keep-alive",
"Content-Length": "24",
"Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
"Cookie": "PHPSESSID=6e7e1dl6h5lu8sd28lio5rbge2",
"Host": "www.yinsuwang.com",
"Origin": "http://www.yinsuwang.com",
"Referer": "http://www.yinsuwang.com/exam/taste.html?id=2&exam_id=14375",
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36",
"X-Requested-With": "XMLHttpRequest"}
data = "{'key':'value'}"
url = "http://www.yinsuwang.com/api/taste/get.json"
response = requests.post(url,data=data,headers=headers)
print(response.text)
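For what it's worth, here is a small standard-library sketch of what the request body probably needs to look like. The `Content-Length: 24` and `Content-Type: application/x-www-form-urlencoded` headers copied from the browser suggest a form-encoded body. The field names `level_id` and `exam_id` come from the replies in this thread; reading the headers this way is my assumption, not something the site documents.

```python
from urllib.parse import urlencode

# Field names as used in the replies in this thread; the values are examples.
params = {"level_id": "2", "exam_id": "14441"}

# This is the encoding requests applies when data= is given a dict.
form_body = urlencode(params)

# A string passed to data= is sent unchanged, so the server would see this
# text instead of form fields.
literal_body = "{'key':'value'}"

print(form_body)       # level_id=2&exam_id=14441
print(len(form_body))  # 24, the same as the copied Content-Length
```

If that reading is right, passing a dict to `data=` and dropping the hard-coded `Content-Length` header (requests computes it) should send the same body the browser does.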
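大连小熊 posted on 2021-8-31 07:39
I added the parameters and it still doesn't work. Could you solve it in Python for me? I don't know JS.
import requests
Replace the cookie with your own. Requesting exam_id=14441 works for me; 14408 may simply be a resource that requires login.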
import requests
import json
headers ={
"Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
"Cookie": "xxxxxxxxxxxxxxxx",
"Host": "www.yinsuwang.com",
"Origin": "http://www.yinsuwang.com",
"Referer": "http://www.yinsuwang.com/exam/taste.html?id=2&exam_id=14441",
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36",
"X-Requested-With": "XMLHttpRequest"}
data = {'level_id':'2','exam_id':'14441'}
url = "http://www.yinsuwang.com/api/taste/get.json"
response = requests.post(url,data=data,headers=headers).json()
print(response)
You forgot to pass the parameters, didn't you?
The site has added checks.
Last edited by q2419068625 on 2021-8-30 23:56
const superagent = require('superagent');
const url = 'http://www.yinsuwang.com/api/taste/get.json'
const header ={
"Accept": "application/json, text/javascript, */*; q=0.01",
"Accept-Encoding": "gzip, deflate",
"Accept-Language": "zh-CN,zh;q=0.9",
"Connection":"keep-alive",
"Content-Length": "24",
"Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
"Cookie": "PHPSESSID=pahq734f3j4fnu3ndopgdj3p4o",
"Host": "www.yinsuwang.com",
"Origin": "http://www.yinsuwang.com",
"Referer": "http://www.yinsuwang.com/exam/taste.html?id=2&exam_id=14375",
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36",
"X-Requested-With": "XMLHttpRequest"
}
function request(url) {
superagent.post(url).set(header).send({
level_id: 2,
exam_id: 14408
}).end((err, res) => {
if (err) return
const papers = JSON.parse(res.text).data.paper
papers.forEach(v => {
if (v.annex) {
console.log(`http://www.yinsuwang.com${v.annex}`);
}
});
})
}
request(url)
This is the Node.js version.
The site is very unstable; sometimes the request doesn't even go through. Rubbish site.
http://www.yinsuwang.com/upfiles/images/201907/2515640373036563.jpg
http://www.yinsuwang.com/upfiles/images/201907/2515640374121698.png
http://www.yinsuwang.com/upfiles/images/201907/2515640375226727.png
http://www.yinsuwang.com/upfiles/images/201907/2515640376075124.png
http://www.yinsuwang.com/upfiles/images/201907/2515640374813662.png
http://www.yinsuwang.com/upfiles/images/201907/2515640373563922.jpg
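These are the images I scraped; not sure whether they're what you want.
Why on earth can't I scrape it with Python?
I added the parameters and it still doesn't work. Could you solve it in Python for me? I don't know JS.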
import requests
import json
headers ={
"Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
"Cookie": "PHPSESSID=6e7e1dl6h5lu8sd28lio5rbge2",
"Host": "www.yinsuwang.com",
"Origin": "http://www.yinsuwang.com",
"Referer": "http://www.yinsuwang.com/exam/taste.html?id=2&exam_id=14375",
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36",
"X-Requested-With": "XMLHttpRequest"}
data = "{'level_id':'2','exam_id'='14408'}"
data_json = json.dumps(data)
url = "http://www.yinsuwang.com/api/taste/get.json"
response = requests.post(url,data=data_json,headers=headers)
change = response.json()
new_req = json.dumps(change,ensure_ascii=False)
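print(new_req)
wynanwong posted on 2021-8-31 14:13
Replace the cookie with your own. Requesting exam_id=14441 works for me; 14408 may simply be a resource that requires login.
import reque ...
Still doesn't work. It returns {'status': False, 'code': None, 'error': '创建失败,请登录后操作'} (the error says: creation failed, please log in first)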
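A minimal Python sketch of the same loop as the Node.js version, with a check on `status` first. It assumes the parsed response has the shape seen in this thread: `data.paper` entries with an optional `annex` path on success, and `status: False` plus `error` on failure. The function name `annex_urls` and the sample dict are made up for illustration, and `status: True` on success is a guess.

```python
BASE_URL = "http://www.yinsuwang.com"

def annex_urls(payload):
    """Return full image URLs from a parsed get.json response (hypothetical helper)."""
    # The failure reported in this thread has status False and an error message.
    if payload.get("status") is False:
        raise RuntimeError(payload.get("error") or "request rejected")
    papers = (payload.get("data") or {}).get("paper") or []
    # Keep only entries with a non-empty annex path, as the Node.js loop does.
    return [BASE_URL + p["annex"] for p in papers if p.get("annex")]

# Made-up sample shaped like a successful response.
sample_ok = {"status": True, "data": {"paper": [
    {"annex": "/upfiles/images/201907/2515640373036563.jpg"},
    {"annex": ""},
]}}
print(annex_urls(sample_ok))
```

With requests this would be called as `annex_urls(response.json())`. The login error above would surface as a `RuntimeError`, which probably means the server did not accept the `PHPSESSID` value that was sent.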