I am trying to use Node.js with Puppeteer to scrape YouTube video information from the search results. Unfortunately, the scrape doesn't return the elements I select with document.querySelectorAll.
import puppeteer from "puppeteer";

const breakSearchTermDownForYoutubeUrl = (term) => {
  return `=${term.replace(/ /g, "+")}`
}

const getPageToScrape = async (link) => {
  const browser = await puppeteer.launch({
    headless: false,
    defaultViewport: null,
  });
  const page = await browser.newPage();
  return page
}

const scapeYoutubeLink = async (term) => {
  let url = breakSearchTermDownForYoutubeUrl(term)
  let youtubePage = await getPageToScrape(url)
  await youtubePage.goto(url, {
    waitUntil: 'networkidle0',
  });
  let videos = await youtubePage.evaluate(() => {
    const result = document.querySelectorAll("div")
    return result
  })
  console.log(videos)
};

// Start the scraping
scapeYoutubeLink("acevane");
I'm not sure whether this is because the DOM hasn't rendered yet or because Google somehow doesn't allow it. When I run this code, the value returned from evaluate (the variable "result" in the "scapeYoutubeLink" function) comes back as undefined in my terminal, when it should be a NodeList. If I console.log it inside the Puppeteer browser's console I get an array (or NodeList), but not in my terminal.
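
For reference, one possible explanation is that page.evaluate can only send serializable values back to Node.js, and a NodeList of DOM elements is not serializable, which would match seeing the list in the browser but not in the terminal. Below is a minimal sketch of returning plain data instead. The selector "a#video-title" and the helper names toPlainData and scrapeVideoLinks are assumptions for illustration, not something confirmed against YouTube's current markup.

```javascript
// Sketch only: the selector "a#video-title" and both helper names are assumptions.

// Runs inside the page. Converts DOM nodes into plain objects that can be
// serialized and sent back to Node.js.
const toPlainData = (elements) =>
  Array.from(elements, (el) => ({
    title: (el.textContent || "").trim(),
    href: el.href || null,
  }));

const scrapeVideoLinks = async (url) => {
  // Loaded lazily so toPlainData can be used without launching a browser.
  const { default: puppeteer } = await import("puppeteer");
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: "domcontentloaded" });
    // Wait until at least one matching element has been rendered.
    await page.waitForSelector("a#video-title");
    // $$eval passes an array of the matched elements to the callback
    // and returns whatever serializable value the callback produces.
    return await page.$$eval("a#video-title", toPlainData);
  } finally {
    await browser.close();
  }
};
```

If the selector matches, calling scrapeVideoLinks with the search URL and logging the result should print an array of plain objects rather than undefined. Waiting for a specific selector also addresses the "DOM hasn't rendered yet" possibility more directly than waiting for the network to go idle.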