Can someone help me find a way to load a public web page that requires JavaScript and blocks access from developer tools? I had an automated process that worked as follows:
$TdyDate = $(get-date -f yyyyMMdd)
$wsjurl = "https://www.wsj/print-edition/$TdyDate/frontpage"
$wsjweb = Invoke-WebRequest -Uri $wsjurl -UseBasicParsing
This recently started generating "Please enable JS and disable any ad blocker" errors.
Based on this Stack Overflow post, I tried the following, which gets past these errors but only pulls down an "Access Blocked" landing page instead of the full web page that renders in my browser.
Set-Alias msedge 'C:\Program Files (x86)\Microsoft\Edge\Application\msedge.exe'
msedge --headless --dump-dom --disable-gpu $wsjurl
If anyone could help me figure out a way around this, it would be greatly appreciated. The web page I'm targeting is publicly accessible.
asked Mar 16 at 16:01 by jbug187
Comment: Try using Postman to make the request and see if you get the same error. Postman is very robust and adds HTTP headers to the request automatically. If Postman works, check the Postman Console for the raw request, then add any HTTP headers that Postman added to your PS request. Issues like this are often caused by the User-Agent header in Postman differing from the one in your PS request. (jdweng, Mar 16 at 17:17)
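A minimal sketch of what the comment suggests: sending browser-like headers with the PowerShell request. `New-WsjRequestParams` is a hypothetical helper, and the User-Agent and header values are placeholders, not values known to satisfy the site; the idea is to replace them with whatever Postman or the browser actually sent.

```powershell
# Hypothetical helper: builds splatting parameters for Invoke-WebRequest
# with browser-like headers. All header values below are placeholders;
# copy the real ones from the Postman Console or the browser's network tab.
function New-WsjRequestParams {
    param(
        [Parameter(Mandatory)] [string] $Uri,
        [string] $UserAgent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)'
    )
    @{
        Uri             = $Uri
        UserAgent       = $UserAgent
        Headers         = @{
            'Accept'          = 'text/html,application/xhtml+xml'
            'Accept-Language' = 'en-US,en;q=0.9'
        }
        UseBasicParsing = $true
    }
}

# example.com is a stand-in; pass $wsjurl in the real script.
$params = New-WsjRequestParams -Uri 'https://example.com/'
# $response = Invoke-WebRequest @params   # uncomment to actually send the request
```

Splatting (`@params`) keeps the header set in one place, so it is easy to add or remove headers one at a time while comparing against the raw request from Postman.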
1 Answer
The following code snippet could help:
$wsjDate = Get-Date
if ( 0 -eq $wsjDate.DayOfWeek.value__ ) {
    $TdyDate = "{0:yyyyMMdd}" -f $wsjDate.AddDays( -1) # Sunday -> Saturday
} else {
    $TdyDate = "{0:yyyyMMdd}" -f $wsjDate
}
$wsjurl = "https://www.wsj/print-edition/$TdyDate/frontpage"
$wsjweb = Invoke-WebRequest -Uri $wsjurl -Method Options -UseBasicParsing
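The Sunday adjustment in the snippet above can also be isolated in a small function, which makes it easy to check against known dates. This is only a sketch: `Get-PrintEditionDate` is a hypothetical name, and the two sample dates are arbitrary.

```powershell
# Hypothetical helper: returns the print-edition date string (yyyyMMdd),
# falling back from Sunday to the preceding Saturday.
function Get-PrintEditionDate {
    param([datetime] $Date = (Get-Date))
    if ($Date.DayOfWeek -eq [System.DayOfWeek]::Sunday) {
        $Date = $Date.AddDays(-1)   # Sunday -> Saturday
    }
    # Invariant culture keeps the digits and calendar independent of locale.
    $Date.ToString('yyyyMMdd', [cultureinfo]::InvariantCulture)
}

Get-PrintEditionDate -Date ([datetime]'2025-03-16')   # a Sunday -> 20250315
Get-PrintEditionDate -Date ([datetime]'2025-03-17')   # a Monday -> 20250317
```

Comparing against `[System.DayOfWeek]::Sunday` is equivalent to the `0 -eq ...value__` test, since Sunday has the numeric value 0 in that enum.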
Explanation:
- the (seemingly) somewhat complicated calculation of $TdyDate accounts for the fact that the pages are not defined on Sundays;
- -Method Options circumvents the "Please enable JS and disable any ad blocker" error, so that $wsjweb.Content contains the full web page code: <!DOCTYPE html><html lang="en-US"> … … … </script></body></html>
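To confirm on your own machine that the response is the real page and not one of the block pages from the question, the content can be searched for those messages. A minimal sketch: `Test-BlockPage` is a hypothetical helper, and the two marker strings are simply the messages quoted in the question, so a differently worded block page would not be detected.

```powershell
# Hypothetical helper: returns $true when the HTML contains one of the
# block-page messages quoted in the question.
function Test-BlockPage {
    param(
        [Parameter(Mandatory)] [AllowEmptyString()] [string] $Content
    )
    $markers = 'Please enable JS', 'Access Blocked'
    foreach ($marker in $markers) {
        if ($Content.IndexOf($marker, [System.StringComparison]::OrdinalIgnoreCase) -ge 0) {
            return $true
        }
    }
    return $false
}

Test-BlockPage -Content '<html>Please enable JS and disable any ad blocker</html>'   # True
Test-BlockPage -Content '<!DOCTYPE html><html lang="en-US"><body>front page</body></html>'   # False
```

In the real script this would be called as `Test-BlockPage -Content $wsjweb.Content`. An empty body also returns False here, so it is worth checking the content length separately.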
Moreover, $wsjweb.Headers could shed some light on the problem (see the properties X-XSS-Protection and X-Content-Type-Options):
$wsjweb.Headers
# truncated
Key                    Value
---                    -----
…
X-XSS-Protection       1; mode=block
X-Content-Type-Options nosniff
…
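To look at just those two headers instead of scanning the whole table, the header dictionary can be filtered by name. A small sketch under stated assumptions: `Select-ResponseHeader` is a hypothetical helper, and `$headers` is a hand-built stand-in for `$wsjweb.Headers` that contains only the two values shown in the output above.

```powershell
# Stand-in for $wsjweb.Headers, holding only the two values shown above.
$headers = @{
    'X-XSS-Protection'       = '1; mode=block'
    'X-Content-Type-Options' = 'nosniff'
}

# Hypothetical helper: picks the named headers out of a header dictionary.
function Select-ResponseHeader {
    param(
        [Parameter(Mandatory)] [System.Collections.IDictionary] $Headers,
        [Parameter(Mandatory)] [string[]] $Name
    )
    foreach ($n in $Name) {
        if ($Headers.Keys -contains $n) {
            # -join covers header values that arrive as a collection of strings.
            [pscustomobject]@{ Key = $n; Value = $Headers[$n] -join ', ' }
        }
    }
}

Select-ResponseHeader -Headers $headers -Name 'X-XSS-Protection', 'X-Content-Type-Options'
```

Names that are not present in the response are skipped, so the same call can be reused with a longer list of headers of interest.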