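I set up a crawler with the URLs I want to scrape, the actor is working, and I tested it with the cookie/screenshot example. My only problem is passing cookies from the actor to the crawler: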
const Apify = require('apify');

Apify.main(async () => {
    const input = await Apify.getValue('INPUT');

    const browser = await Apify.launchPuppeteer();
    const page = await browser.newPage();
    await page.goto('http://xy.com/login');

    // Login
    await page.type('#form_user_login_email', input.username);
    await page.type('#form_user_login_password', input.password);
    await page.evaluate(() => { document.querySelectorAll('.btn-full-width')[1].click(); });
    await page.waitForNavigation();

    // Get cookies
    const cookies = await page.cookies();

    // Use cookies in other tab or browser
    //const page2 = await browser.newPage();
    //await page2.setCookie(...cookies);

    // Get cookies after login
    const apifyClient = Apify.client;

    // call crawler with cookies
    const execution = await apifyClient.crawlers.startExecution({
        crawlerId: 'mhi',
        settings: {
            cookies: cookies
        }
    });

    console.log('Done.');

    console.log('Closing Puppeteer...');
    await browser.close();
});
I think the cookies are not being passed, because the crawler is not logged in.
Posted on 2019-05-10 18:28:58
Your code should work. You could try setting cookiesPersistence: 'OVER_CRAWLER_RUNS' in the settings. If you are not sure whether the cookies were passed, you can check using the API endpoint https://api.apify.com/v1/user_id/crawlers/crawler_id?token=api_apify_token&executionId=execution_id.
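However, you don't need to pass the cookies to the crawler at all; you can do the scraping directly in the actor using the Apify SDK. You only need to override the gotoFunction in PuppeteerCrawler and set the cookies there. See the PuppeteerCrawler documentation for details.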
const Apify = require('apify');

Apify.main(async () => {
    const input = await Apify.getValue('INPUT');

    const browser = await Apify.launchPuppeteer();
    const page = await browser.newPage();
    await page.goto('http://xy.com/login');

    // Login
    await page.type('#form_user_login_email', input.username);
    await page.type('#form_user_login_password', input.password);
    await page.evaluate(() => { document.querySelectorAll('.btn-full-width')[1].click(); });
    await page.waitForNavigation();

    // Get cookies
    const cookies = await page.cookies();

    const crawler = new Apify.PuppeteerCrawler({
        // puppeteer crawler options (e.g. requestList, handlePageFunction) go here
        gotoFunction: async ({ request, page }) => {
            // setCookie takes the cookies as separate arguments, so spread the array
            await page.setCookie(...cookies);
            return page.goto(request.url);
        }
    });

    await crawler.run();

    console.log('Done.');

    console.log('Closing Puppeteer...');
    await browser.close();
});
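One detail worth noting when setting cookies: Puppeteer's page.setCookie accepts cookie objects as separate arguments rather than a single array, which is why the commented-out line in the question uses the spread operator. Below is a minimal sketch of the difference. It uses a stand-in object (createFakePage is hypothetical and not part of Puppeteer or the Apify SDK) that only records what it receives, and the cookie values are made up for illustration.

```javascript
// Hypothetical stand-in for a Puppeteer page: it only records the arguments
// passed to setCookie, mirroring the variadic shape page.setCookie(...cookies).
function createFakePage() {
    const calls = [];
    return {
        calls,
        setCookie(...cookies) {
            calls.push(cookies);
            return Promise.resolve();
        },
    };
}

// Made-up cookie objects, roughly in the shape page.cookies() returns.
const cookies = [
    { name: 'session', value: 'abc', domain: 'xy.com' },
    { name: 'lang', value: 'en', domain: 'xy.com' },
];

const fakePage = createFakePage();
fakePage.setCookie(cookies);    // passes ONE argument: the whole array
fakePage.setCookie(...cookies); // passes TWO arguments: one per cookie

const [withoutSpread, withSpread] = fakePage.calls;
console.log(withoutSpread.length); // 1
console.log(withSpread.length);    // 2
```

Without the spread, the function receives a single argument that is an array instead of the individual cookie objects, so the cookies are unlikely to be set as intended.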
https://stackoverflow.com/questions/56066248