PickSkill
← Back

podman-browser

README.md
Rendered from GitHub raw
View raw ↗

podman-browser

Headless browser automation using Podman + Playwright for scraping JavaScript-rendered pages.

What it does

This skill runs a headless Chromium browser inside a Podman container using Microsoft's official Playwright image. It's useful for:

  • Scraping pages that require JavaScript to render content
  • Fetching dynamically loaded data (SPAs, React apps, etc.)
  • Automated browser tasks without installing browser dependencies locally

Prerequisites

  • Podman 5.x+ installed and running
  • Node.js 18+ (for running the CLI)
  • Internet connection (first run pulls ~1.5GB container image)

Installation

For OpenClaw users

Copy the skill folder to your OpenClaw skills directory:

cp -r podman-browser ~/.openclaw/workspace/skills/

Create a symlink to make it available in your PATH:

ln -sf ~/.openclaw/workspace/skills/podman-browser/browse.js ~/.local/bin/podman-browse
chmod +x ~/.openclaw/workspace/skills/podman-browser/browse.js

Standalone usage

Clone this repo and add to your PATH:

git clone https://github.com/ricardodantas/podman-browser.git
cd podman-browser
chmod +x browse.js
ln -sf "$(pwd)/browse.js" ~/.local/bin/podman-browse

Usage

Basic usage

# Get rendered text content from a page
podman-browse "https://example.com"
 
# Or run directly
./browse.js "https://example.com"

Options

Option Description
--html Return raw HTML instead of text
--wait <ms> Wait time after load (default: 2000ms)
--selector <css> Wait for specific CSS selector before capturing
-h, --help Show help

Examples

# Get rendered text content from Hacker News
podman-browse "https://news.ycombinator.com"
 
# Get raw HTML
podman-browse --html "https://news.ycombinator.com"
 
# Wait for specific element
podman-browse --selector ".itemlist" "https://news.ycombinator.com"
 
# Extra wait time for slow pages
podman-browse --wait 5000 "https://news.ycombinator.com/newest"

How it works

  1. Launches Microsoft's official Playwright container via Podman
  2. Uses Chromium in headless mode with a realistic user agent
  3. Navigates to the URL and waits for network idle
  4. Optionally waits for a specific CSS selector
  5. Applies additional wait time for JavaScript to settle
  6. Returns text content (or HTML with --html flag)

Files

File Description
browse.js Self-contained Node.js CLI (handles args + spawns podman)
SKILL.md OpenClaw skill documentation

Container Image

Uses mcr.microsoft.com/playwright:v1.50.0-noble with playwright@1.50.0 npm package.

Note: The Playwright npm package version must match the container image version.

Performance Notes

  • First run pulls the container image (~1.5GB)
  • Each run starts a fresh container (clean but takes ~10-15s)
  • Uses --ipc=host for Chromium stability
  • Uses --init to handle zombie processes

License

GPL-3.0 License - see LICENSE file.

Contributing

Contributions welcome! Please open an issue or PR.

Author

Ricardo Dantas (@ricardodantas)