Configuring Puppeteer for Anti-Detection
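To reduce the chance of detection when scraping websites, configure Puppeteer's launch arguments, viewport, user agent, and navigation waits so that the session resembles an ordinary desktop browser.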
Launch Arguments
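const browser = await puppeteer.launch({
  headless: true,
  args: [
    // Disable the Chromium sandbox; often needed when running as root in Docker.
    '--no-sandbox',
    '--disable-setuid-sandbox',
    // Use /tmp instead of /dev/shm, which is small in many containers.
    '--disable-dev-shm-usage',
    // Turn off GPU acceleration, which is often unavailable on servers.
    '--disable-accelerated-2d-canvas',
    '--disable-gpu',
  ],
});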
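These flags mainly help Chromium start reliably in containers, and the sandbox flags reduce browser isolation, so it can make sense to add them only where they are needed. The following is a minimal sketch of that idea; `buildLaunchOptions` and its `inContainer` option are hypothetical names introduced for illustration and are not part of Puppeteer.

```javascript
// Hypothetical helper (not part of Puppeteer): builds the options object
// that would be passed to puppeteer.launch().
function buildLaunchOptions({ inContainer = false } = {}) {
  // Flags from the section above that do not weaken browser isolation.
  const args = ['--disable-accelerated-2d-canvas', '--disable-gpu'];

  if (inContainer) {
    // Add the sandbox and shared-memory flags only when the environment
    // (for example Docker running as root) needs them.
    args.unshift(
      '--no-sandbox',
      '--disable-setuid-sandbox',
      '--disable-dev-shm-usage'
    );
  }

  return { headless: true, args };
}

// Usage (assumes `puppeteer` has been imported):
//   const browser = await puppeteer.launch(buildLaunchOptions({ inContainer: true }));
```

With `inContainer: true` the helper returns the same five flags listed above; without it, only the two GPU-related flags are included.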
Viewport Configuration
await page.setViewport({
  width: 1920, // Desktop resolution
  height: 1080
});
User Agent
Set a realistic browser user agent:
await page.setUserAgent(
  'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) ' +
  'AppleWebKit/537.36 (KHTML, like Gecko) ' +
  'Chrome/120.0.0.0 Safari/537.36'
);
Navigation Strategy
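Use waitUntil: 'networkidle2' so that navigation is considered finished once there have been no more than two network connections for at least 500 ms:
await page.goto(url, {
  waitUntil: 'networkidle2',
  timeout: 30000 // give up after 30 seconds
});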
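Because page.goto() throws when the timeout elapses, a slow page ends the run unless the error is handled. One possible approach is to retry the navigation a few times. The sketch below assumes a simple fixed number of retries; `gotoWithRetry` and its `retries` option are hypothetical names, not Puppeteer APIs.

```javascript
// Hypothetical helper: retries page.goto() when navigation throws
// (for example on a timeout). `page` is any object with a
// goto(url, options) method, such as a Puppeteer Page.
async function gotoWithRetry(page, url, { retries = 2, timeout = 30000 } = {}) {
  let lastError;
  // One initial attempt plus up to `retries` further attempts.
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await page.goto(url, { waitUntil: 'networkidle2', timeout });
    } catch (error) {
      lastError = error; // remember the failure and try again
    }
  }
  // Every attempt failed: surface the most recent error to the caller.
  throw lastError;
}

// Usage (assumes `page` and `url` exist):
//   await gotoWithRetry(page, url, { retries: 2, timeout: 30000 });
```

The retries here run back to back; adding a pause between attempts is a reasonable extension if the target site is rate limited.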
Additional Wait Times
After navigation, pause briefly to give dynamically rendered content time to load:
await new Promise(resolve => setTimeout(resolve, 2000)); // 2-second pause
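A fixed pause can be longer than necessary on fast pages and too short on slow ones. When the page has a known element that appears once the content is ready, waiting for that element may be more reliable. The sketch below wraps Puppeteer's page.waitForSelector(); the helper name `waitForContent` and the selector in the usage line are assumptions made for illustration.

```javascript
// Hypothetical helper: waits until `selector` appears on the page.
// Resolves to true if the element was found, or false if
// page.waitForSelector() rejected (typically because `timeout` ms elapsed).
async function waitForContent(page, selector, timeout = 5000) {
  try {
    await page.waitForSelector(selector, { timeout });
    return true;
  } catch {
    // Any error from waitForSelector is treated as "not found".
    return false;
  }
}

// Usage (assumes `page` exists; '#results' is a placeholder selector):
//   const ready = await waitForContent(page, '#results', 5000);
```

Returning a boolean instead of throwing lets the caller decide whether to fall back to a fixed pause or skip the page.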
Complete Example
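// Top-level await requires an ES module (.mjs or "type": "module").
import puppeteer from 'puppeteer';

const url = 'https://example.com'; // placeholder; replace with the page to scrape

const browser = await puppeteer.launch({
  headless: true,
  args: [
    '--no-sandbox',
    '--disable-setuid-sandbox',
    '--disable-dev-shm-usage',
    '--disable-accelerated-2d-canvas',
    '--disable-gpu',
  ],
});

try {
  const page = await browser.newPage();
  await page.setViewport({ width: 1920, height: 1080 });
  // Use the full string from the User Agent section in place of the "...".
  await page.setUserAgent('Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) ...');
  await page.goto(url, { waitUntil: 'networkidle2', timeout: 30000 });
  await new Promise(resolve => setTimeout(resolve, 2000));
} finally {
  // Close the browser even if a step above throws.
  await browser.close();
}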