diff --git a/README.md b/README.md
index 479ff58..2d04912 100644
--- a/README.md
+++ b/README.md
@@ -116,8 +116,8 @@ Interact with pages using simplified commands (requires an active session):
 ```bash
 notte page observe # Get page state and available actions
 notte page scrape --instructions "..." # Scrape content from the page
-notte page click B3 # Click an element by ID
-notte page fill I1 "text" # Fill an input field
+notte page click "@B3" # Click an element by ID
+notte page fill "@I1" "text" # Fill an input field
 notte page goto "https://example.com" # Navigate to a URL
 notte page back # Go back in history
 notte page forward # Go forward in history
@@ -361,12 +361,19 @@ Use `notte` for web automation. Run `notte --help` for all commands.
 Core workflow:
 1. `notte sessions start` - Start a browser session
 2. `notte page goto ` - Navigate to a URL
-3. `notte page observe` - Get interactive elements with IDs (B1, B2)
-4. `notte page click B1` / `notte page fill B2 "text"` - Interact using element IDs
+3. `notte page observe` - Get interactive elements with IDs (@B1, @B2)
+4. `notte page click "@B1"` / `notte page fill "@I1" "text"` - Interact using element IDs
 5. `notte page scrape --instructions "..."` - Extract structured data
 6. `notte sessions stop` - Clean up when done
 ```
 
+### Tips
+
+- **Viewing headless sessions**: When you start a session, the output includes a `ViewerUrl` - open it to watch your headless browser live
+- **Element selectors**: If element IDs from `observe` (like `@B1`) don't work, use Playwright selectors: `#id`, `.class`, `button:has-text('Submit')`
+- **Multiple matches**: Use `>> nth=0` suffix to select the first match: `button:has-text('OK') >> nth=0`
+- **Closing modals**: `notte page press "Escape"` reliably dismisses most dialogs
+
 ### Skills Documentation
 
 For comprehensive documentation including templates and reference guides, see the [skills/notte-browser](skills/notte-browser/SKILL.md) folder.
diff --git a/skills/notte-browser/SKILL.md b/skills/notte-browser/SKILL.md
index 7b0451a..8a8ec56 100644
--- a/skills/notte-browser/SKILL.md
+++ b/skills/notte-browser/SKILL.md
@@ -23,9 +23,11 @@ notte page goto "https://example.com"
 notte page observe
 notte page screenshot
 
-# 4. Execute actions
+# 4. Execute actions (use @IDs from observe, or Playwright selectors)
 notte page click "@B3"
 notte page fill "@I1" "hello world"
+# If @IDs don't work, use Playwright selectors:
+# notte page click "button:has-text('Submit')"
 
 # 5. Scrape content
 notte page scrape --instructions "Extract all product names and prices"
@@ -399,6 +401,102 @@ notte functions schedule --id --cron "0 9 * * *"
 notte functions runs --id
 ```
 
+## Tips & Troubleshooting
+
+### Handling Inconsistent `observe` Output
+
+The `observe` command may sometimes return stale or partial DOM state, especially with dynamic content, modals, or single-page applications. If the output seems wrong:
+
+1. **Use screenshots to verify**: `notte page screenshot` always shows the current visual state
+2. **Fall back to Playwright selectors**: Instead of `@ID` references, use standard selectors like `#id`, `.class`, or `button:has-text('Submit')`
+3. **Add a brief wait**: `notte page wait 500` before observing can help with dynamic content
+
+### Selector Syntax
+
+Both element IDs from `observe` and Playwright selectors are supported:
+
+```bash
+# Using element IDs from observe output
+notte page click "@B3"
+notte page fill "@I1" "text"
+
+# Using Playwright selectors (recommended when @IDs don't work)
+notte page click "#submit-button"
+notte page click ".btn-primary"
+notte page click "button:has-text('Submit')"
+notte page click "[data-testid='login']"
+notte page fill "input[name='email']" "user@example.com"
+```
+
+**Handling multiple matches** - Use `>> nth=0` to select the first match:
+
+```bash
+# When multiple elements match, select by index
+notte page click "button:has-text('OK') >> nth=0"
+notte page click ".submit-btn >> nth=0"
+```
+
+### Working with Modals and Dialogs
+
+Modals and popups can interfere with page interactions. Tips:
+
+- **Close modals with Escape**: `notte page press "Escape"` reliably dismisses most dialogs and modals
+- **Wait after modal actions**: Add `notte page wait 500` after closing a modal before the next action
+- **Check for overlays**: If clicks aren't working, a modal or overlay might be blocking - use screenshot to verify
+
+```bash
+# Common pattern for handling unexpected modals
+notte page press "Escape"
+notte page wait 500
+notte page click "#target-element"
+```
+
+### Viewing Headless Sessions
+
+Running with `--headless` (the default) doesn't mean you can't see the browser:
+
+- **ViewerUrl**: When you start a session, the output includes a `ViewerUrl` - open it in your browser to watch the session live
+- **Viewer command**: `notte sessions viewer` opens the viewer directly
+- **Non-headless mode**: Use `--headless=false` only if you need a local browser window (not available on remote/CI environments)
+
+```bash
+# Start headless session and get viewer URL
+notte sessions start -o json | jq -r '.viewer_url'
+
+# Or open viewer for current session
+notte sessions viewer
+```
+
+### Bot Detection / Stealth
+
+If you're getting blocked or seeing CAPTCHAs, try these approaches (requires restarting the session with new parameters):
+
+1. **Change browser type**: Some sites have different detection for different browsers
+   ```bash
+   notte sessions stop
+   notte sessions start --browser-type firefox
+   ```
+
+2. **Enable proxies**: Rotate IP addresses to avoid rate limiting
+   ```bash
+   notte sessions stop
+   notte sessions start --proxies
+   ```
+
+3. **Firefox + CAPTCHA solving**: The `--solve-captchas` flag works best with Firefox
+   ```bash
+   notte sessions stop
+   notte sessions start --browser-type firefox --solve-captchas
+   ```
+
+4. **Combine strategies**: For heavily protected sites, combine multiple options
+   ```bash
+   notte sessions stop
+   notte sessions start --browser-type firefox --proxies --solve-captchas
+   ```
+
+**Note**: Always stop the current session before starting a new one with different parameters. Session configuration cannot be changed mid-session.
+
 ## Additional Resources
 
 - [Session Management Reference](references/session-management.md) - Detailed session lifecycle guide
diff --git a/skills/notte-browser/templates/form-automation.sh b/skills/notte-browser/templates/form-automation.sh
index 9727880..035c59d 100755
--- a/skills/notte-browser/templates/form-automation.sh
+++ b/skills/notte-browser/templates/form-automation.sh
@@ -16,11 +16,17 @@ set -euo pipefail
 TARGET_URL="https://example.com/contact"
 FORM_DATA=(
     # Format: "selector|value"
-    # Use @ID for element IDs from observe, or CSS selectors
+    # Use @ID for element IDs from observe, or Playwright selectors
+    # Examples:
+    #   "@I1|value" - Element ID from observe
+    #   "#name|value" - CSS ID selector
+    #   "input[name='email']|value" - Attribute selector
+    #   ".form-input >> nth=0|value" - First match when multiple elements
     "@name|John Doe"
     "@email|john@example.com"
     "@message|Hello, this is a test message."
 )
+# Tip: If @submit doesn't work, try: "button:has-text('Submit')" or "#submit-button"
 SUBMIT_SELECTOR="@submit"
 SUCCESS_INDICATOR="Thank you" # Text that appears on success
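
The selector fallback this diff describes (try the `@ID` from `observe` first, then a Playwright selector) could be wrapped in a small helper. The following is only a sketch: `click_first_working` is a hypothetical name that appears nowhere in the diff, and it assumes `notte page click` exits non-zero when a selector cannot be resolved, which the diff does not confirm.

```bash
# Hypothetical helper, not part of the notte CLI or of this diff.
# Assumption: `notte page click` returns a non-zero exit status when the
# selector does not resolve to a clickable element.

# Usage: click_first_working SELECTOR [SELECTOR...]
# Tries each selector in order and prints the first one that succeeded.
click_first_working() {
    local selector
    for selector in "$@"; do
        if notte page click "$selector" >/dev/null 2>&1; then
            printf '%s\n' "$selector"   # report which selector worked
            return 0
        fi
    done
    return 1   # every candidate failed
}
```

In the template it might be called as `click_first_working "@submit" "button:has-text('Submit')" "#submit-button"`, which matches the order suggested by the tip comment above `SUBMIT_SELECTOR`.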