15 changes: 15 additions & 0 deletions AGENTS.md
Original file line number Diff line number Diff line change
@@ -67,6 +67,21 @@ Preferred verification flow for docs/content changes:

## Docs Authoring Rules

### User Journey Funnel

Maintain a directed "funnel" for documentation to maximize user success and conversion:

1. **Phase 1: Quickstart (Local Demo)** — The primary entry point. Run `html2rss-web` with Docker and generate a feed from a page URL in minutes.
2. **Phase 2: Production (Deployment)** — The goal for invested users. Move to a stable, production-ready instance.
3. **Phase 3: Refinement (Custom Configs)** — Secondary optimization. Author custom YAML configs only when automatic generation needs precise control.

**Rules for Funnel Maintenance:**

- Avoid branching paths in introductory pages; always point toward the next phase in the funnel.
- Define "html2rss-web" as the primary interface and "page-to-RSS" as the primary workflow.
- Use "Feed Directory" consistently to refer to the pre-built feed catalog; avoid terms like "catalog", "included feeds", or "packaged configs" in user-facing docs.
- Do not introduce new terminology (e.g., "toolkit") or unrelated infrastructure concepts (e.g., "custom domains") unless they are essential to a specific guide.
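
The terminology rule above could be checked mechanically. The following is a minimal sketch, not part of this PR: `DISCOURAGED_TERMS` and `findDiscouragedTerms` are hypothetical names, and the term list is copied from the rule text.

```javascript
// Hypothetical helper (not part of the repository): flags terms that the
// funnel rules ask authors to avoid in user-facing docs.
const DISCOURAGED_TERMS = ["catalog", "included feeds", "packaged configs"];

// Returns the discouraged terms found in `text`, matched case-insensitively.
function findDiscouragedTerms(text) {
  const lower = text.toLowerCase();
  return DISCOURAGED_TERMS.filter((term) => lower.includes(term));
}

// Usage: text that only says "Feed Directory" passes, "catalog" is flagged.
console.log(findDiscouragedTerms("Browse the Feed Directory."));
console.log(findDiscouragedTerms("the packaged fallback/catalog path"));
```

A substring match is deliberately crude; it would also flag the rule's own definition ("pre-built feed catalog"), so a real check would likely need an allowlist.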

### Code Snippets

In docs content (`src/content/docs/**`) and docs-supporting components:
40 changes: 25 additions & 15 deletions astro.config.mjs
@@ -13,6 +13,19 @@ export default defineConfig({
"/components/html2rss": "/ruby-gem/",
"/components/html2rss-configs": "/creating-custom-feeds/",
"/components": "/",
"/web-application/how-to/deployment": "/web-application/deployment/",
"/web-application/how-to/automatic-updates": "/web-application/deployment/#auto-update-with-watchtower",
"/web-application/how-to/use-automatic-feed-generation":
"/web-application/guides/use-the-feed-directory/",
"/web-application/how-to/use-automatic-feed-generation/":
"/web-application/guides/use-the-feed-directory/",
"/web-application/how-to": "/web-application/guides/",
"/ruby-gem/how-to/dynamic-parameters": "/ruby-gem/guides/dynamic-parameters/",
"/ruby-gem/how-to/dynamic-parameters/": "/ruby-gem/guides/dynamic-parameters/",
"/ruby-gem/how-to": "/ruby-gem/guides/",
"/ruby-gem/tutorials": "/ruby-gem/guides/",
"/ruby-gem/tutorials/": "/ruby-gem/guides/",
"/web-application/guides/use-included-configs": "/web-application/guides/use-the-feed-directory/",
},
build: {
inlineStylesheets: "auto",
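
Several redirects in this hunk are listed twice, once with and once without a trailing slash. A minimal sketch of generating both variants from a single entry follows; `withTrailingSlashVariants` is a hypothetical helper, and whether both variants are actually required depends on the site's trailing-slash and hosting settings, which this diff does not show.

```javascript
// Hypothetical helper (not part of this PR): expands each redirect source
// into a bare and a trailing-slash variant pointing at the same target.
function withTrailingSlashVariants(redirects) {
  const expanded = {};
  for (const [from, to] of Object.entries(redirects)) {
    // Normalize to the bare form first, leaving the root path untouched.
    const bare = from.length > 1 && from.endsWith("/") ? from.slice(0, -1) : from;
    expanded[bare] = to;
    if (bare !== "/") expanded[`${bare}/`] = to;
  }
  return expanded;
}

// Usage: one entry in, two entries out.
console.log(
  withTrailingSlashVariants({ "/ruby-gem/tutorials": "/ruby-gem/guides/" }),
);
```

The result could be spread into the `redirects` object, which would keep the hand-written list to one line per moved page.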
@@ -249,34 +262,31 @@ export default defineConfig({
sidebar: [
{
label: "Getting Started",
link: "/getting-started",
link: "/getting-started/",
},
{
label: "Feed Directory",
link: "/feed-directory/",
},
{
label: "Create Custom Feeds",
link: "/creating-custom-feeds",
link: "/creating-custom-feeds/",
},
{
label: "Web Application",
collapsed: true,
items: [
"web-application",
"web-application/getting-started",
"web-application/deployment",
{
label: "How-to",
autogenerate: { directory: "web-application/how-to" },
label: "Guides",
autogenerate: { directory: "web-application/guides" },
},
{
label: "Reference",
autogenerate: { directory: "web-application/reference" },
},
{
label: "Tutorials",
autogenerate: { directory: "web-application/tutorials" },
},
],
},
{
@@ -286,22 +296,22 @@ export default defineConfig({
"ruby-gem",
"ruby-gem/installation",
{
label: "How-to",
autogenerate: { directory: "ruby-gem/how-to" },
},
{
label: "Reference",
autogenerate: { directory: "ruby-gem/reference" },
label: "Guides",
autogenerate: { directory: "ruby-gem/guides" },
},
{
label: "Tutorials",
autogenerate: { directory: "ruby-gem/tutorials" },
},
{
label: "Reference",
autogenerate: { directory: "ruby-gem/reference" },
},
],
},
{
label: "About",
link: "/about",
link: "/about/",
},
{
label: "Get Involved",
30 changes: 9 additions & 21 deletions examples/deployment/.env
@@ -1,26 +1,14 @@
# Domain & routing
CADDY_HOST=example.com

# Core runtime
RACK_ENV=production

# Security
# Production secrets
# Generate with: openssl rand -hex 32
HTML2RSS_SECRET_KEY=replace-with-64-hex-characters-generated-by-openssl-rand-hex-32

# Authenticated health endpoint token
# Required by the documented Compose stack.
# If you build a custom stack and probe only /api/v1/health/live and /api/v1/health/ready,
# you can omit this value.
HEALTH_CHECK_TOKEN=replace-with-strong-health-token

# Auto source (optional; keep false unless you need automatic feed generation)
AUTO_SOURCE_ENABLED=false
# Web UI / feed creation token
# Paste this into the web app when it asks for an access token.
HTML2RSS_ACCESS_TOKEN=replace-with-strong-access-token

# Observability (optional)
#SENTRY_DSN=
# Optional authenticated health token
# Set this only if you plan to use GET /api/v1/health instead of /api/v1/health/ready.
# HEALTH_CHECK_TOKEN=replace-with-strong-health-token

# Performance tuning (override defaults only when needed)
WEB_CONCURRENCY=2
WEB_MAX_THREADS=5
RACK_TIMEOUT_SERVICE_TIMEOUT=15
# `AUTO_SOURCE_ENABLED=true` and `BOTASAURUS_SCRAPER_URL=http://botasaurus:4010`
# come from docker-compose.yml in this example.
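
The `.env` comments say the secret key is generated with `openssl rand -hex 32`, which prints 64 hexadecimal characters. A minimal sketch of a pre-deploy check for that shape follows; `isValidSecretKey` is a hypothetical helper, and whether html2rss-web itself enforces this format is an assumption not confirmed by the diff.

```javascript
// Hypothetical pre-deploy check (not part of this PR): accepts only values
// shaped like the output of `openssl rand -hex 32`, i.e. 64 hex characters.
function isValidSecretKey(value) {
  return typeof value === "string" && /^[0-9a-f]{64}$/i.test(value);
}

// Usage: the placeholder shipped in the example file is rejected.
const placeholder =
  "replace-with-64-hex-characters-generated-by-openssl-rand-hex-32";
console.log(isValidSecretKey(placeholder)); // false
console.log(isValidSecretKey("ab".repeat(32))); // true
```

A check like this catches the common mistake of deploying with the placeholder still in place.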
38 changes: 4 additions & 34 deletions examples/deployment/docker-compose.yml
@@ -6,43 +6,13 @@ services:
- path: .env
required: false
environment:
RACK_ENV: production
PORT: 4000
HTML2RSS_SECRET_KEY: ${HTML2RSS_SECRET_KEY:?set HTML2RSS_SECRET_KEY}
HTML2RSS_ACCESS_TOKEN: ${HTML2RSS_ACCESS_TOKEN:?set HTML2RSS_ACCESS_TOKEN}
AUTO_SOURCE_ENABLED: "true"
BOTASAURUS_SCRAPER_URL: http://botasaurus:4010

botasaurus:
image: html2rss/botasaurus-scrape-api:latest
restart: unless-stopped

caddy:
image: caddy:2-alpine
depends_on:
- html2rss-web
command:
- caddy
- reverse-proxy
- --from
- ${CADDY_HOST}
- --to
- html2rss-web:4000
ports:
- "80:80"
- "443:443"
volumes:
- caddy_data:/data

watchtower:
image: containrrr/watchtower
depends_on:
- html2rss-web
- caddy
- botasaurus
command:
- --cleanup
- --interval
- "7200"
volumes:
- /var/run/docker.sock:/var/run/docker.sock:ro
restart: unless-stopped

volumes:
caddy_data:
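
The Compose file relies on the `${VAR:?message}` interpolation form, which makes Compose abort with the given message when the variable is unset or empty. The sketch below imitates that behavior in plain JavaScript for illustration only; `requireVar` is a hypothetical name and the error wording is not Compose's own.

```javascript
// Hypothetical illustration of `${NAME:?message}` semantics: return the
// value when it is set and non-empty, otherwise fail with the message.
function requireVar(env, name, message) {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`${name}: ${message}`);
  }
  return value;
}

// Usage: a present value is returned, a missing one raises an error.
const env = { HTML2RSS_SECRET_KEY: "ab".repeat(32) };
console.log(requireVar(env, "HTML2RSS_SECRET_KEY", "set HTML2RSS_SECRET_KEY"));
try {
  requireVar(env, "HTML2RSS_ACCESS_TOKEN", "set HTML2RSS_ACCESS_TOKEN");
} catch (error) {
  console.log(error.message);
}
```

Failing at startup this way means a stack with a missing secret or access token never comes up half-configured.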
5 changes: 3 additions & 2 deletions src/components/docs/AutoGenerationOptional.astro
@@ -3,6 +3,7 @@ import { Aside } from "@astrojs/starlight/components";
---

<Aside type="note" title="Automatic generation may be disabled">
The direct `Create a feed` workflow is not enabled on every deployment. If you want that path, continue with
<a href="/web-application/how-to/use-automatic-feed-generation/">Use automatic feed generation</a>.
The Feed Directory is the fallback path when it already covers your site. If you want the primary page-URL
flow on your own instance, continue with
<a href="/web-application/guides/use-automatic-feed-generation/">Use automatic feed generation</a>.
</Aside>
58 changes: 10 additions & 48 deletions src/components/docs/DockerComposeSnippet.astro
@@ -1,6 +1,6 @@
---
import { Code } from "@astrojs/starlight/components";
import { botasaurusImage, browserlessImage, caddyImage, watchtowerImage, webImage } from "../../data/docker";
import { botasaurusImage, caddyImage, watchtowerImage, webImage } from "../../data/docker";

interface Props {
variant: "minimal" | "productionCaddy" | "secure" | "watchtower" | "resourceGuardrails";
@@ -12,35 +12,15 @@ const snippets: Record<Props["variant"], string> = {
minimal: `services:
html2rss-web:
image: ${webImage}
restart: unless-stopped
ports:
- "127.0.0.1:4000:4000"
env_file:
- path: .env
required: false
environment:
RACK_ENV: production
PORT: 4000
HTML2RSS_SECRET_KEY: \${HTML2RSS_SECRET_KEY:?set HTML2RSS_SECRET_KEY}
HEALTH_CHECK_TOKEN: \${HEALTH_CHECK_TOKEN:?set HEALTH_CHECK_TOKEN}
SENTRY_DSN: \${SENTRY_DSN:-}
BROWSERLESS_IO_WEBSOCKET_URL: ws://browserless:4002
BROWSERLESS_IO_API_TOKEN: \${BROWSERLESS_IO_API_TOKEN:?set BROWSERLESS_IO_API_TOKEN}
RACK_ENV: development
HTML2RSS_ACCESS_TOKEN: CHANGE_ME_ADMIN_TOKEN
BOTASAURUS_SCRAPER_URL: http://botasaurus:4010

botasaurus:
image: ${botasaurusImage}
restart: unless-stopped

browserless:
image: "${browserlessImage}"
restart: unless-stopped
ports:
- "127.0.0.1:4002:4002"
environment:
PORT: 4002
CONCURRENT: 10
TOKEN: \${BROWSERLESS_IO_API_TOKEN:?set BROWSERLESS_IO_API_TOKEN}`,
image: ${botasaurusImage}`,
productionCaddy: `services:
caddy:
image: ${caddyImage}
@@ -68,24 +48,15 @@ const snippets: Record<Props["variant"], string> = {
RACK_ENV: production
PORT: 4000
HTML2RSS_SECRET_KEY: \${HTML2RSS_SECRET_KEY:?set HTML2RSS_SECRET_KEY}
HEALTH_CHECK_TOKEN: \${HEALTH_CHECK_TOKEN:?set HEALTH_CHECK_TOKEN}
HTML2RSS_ACCESS_TOKEN: \${HTML2RSS_ACCESS_TOKEN:?set HTML2RSS_ACCESS_TOKEN}
AUTO_SOURCE_ENABLED: "true"
SENTRY_DSN: \${SENTRY_DSN:-}
BROWSERLESS_IO_WEBSOCKET_URL: ws://browserless:4002
BROWSERLESS_IO_API_TOKEN: \${BROWSERLESS_IO_API_TOKEN:?set BROWSERLESS_IO_API_TOKEN}
BOTASAURUS_SCRAPER_URL: http://botasaurus:4010

botasaurus:
image: ${botasaurusImage}
restart: unless-stopped

browserless:
image: "${browserlessImage}"
restart: unless-stopped
environment:
PORT: 4002
CONCURRENT: 10
TOKEN: \${BROWSERLESS_IO_API_TOKEN:?set BROWSERLESS_IO_API_TOKEN}

volumes:
caddy_data:`,
secure: `services:
@@ -99,23 +70,14 @@ volumes:
RACK_ENV: production
PORT: 4000
HTML2RSS_SECRET_KEY: \${HTML2RSS_SECRET_KEY:?set HTML2RSS_SECRET_KEY}
HEALTH_CHECK_TOKEN: \${HEALTH_CHECK_TOKEN:?set HEALTH_CHECK_TOKEN}
HTML2RSS_ACCESS_TOKEN: \${HTML2RSS_ACCESS_TOKEN:?set HTML2RSS_ACCESS_TOKEN}
AUTO_SOURCE_ENABLED: "true"
SENTRY_DSN: \${SENTRY_DSN:-}
BROWSERLESS_IO_WEBSOCKET_URL: ws://browserless:4002
BROWSERLESS_IO_API_TOKEN: \${BROWSERLESS_IO_API_TOKEN:?set BROWSERLESS_IO_API_TOKEN}
BOTASAURUS_SCRAPER_URL: http://botasaurus:4010

botasaurus:
image: ${botasaurusImage}
restart: unless-stopped

browserless:
image: "${browserlessImage}"
restart: unless-stopped
environment:
PORT: 4002
CONCURRENT: 10
TOKEN: \${BROWSERLESS_IO_API_TOKEN:?set BROWSERLESS_IO_API_TOKEN}`,
restart: unless-stopped`,
watchtower: `services:
watchtower:
image: ${watchtowerImage}
@@ -124,7 +86,7 @@ volumes:
- /var/run/docker.sock:/var/run/docker.sock:ro
# Optional for private registries only:
# - "\${HOME}/.docker/config.json:/config.json:ro"
command: --cleanup --interval 7200 html2rss-web botasaurus browserless caddy`,
command: --cleanup --interval 7200 html2rss-web botasaurus caddy`,
resourceGuardrails: `services:
html2rss-web:
image: ${webImage}
4 changes: 2 additions & 2 deletions src/content/docs/common-use-cases.mdx
@@ -91,6 +91,6 @@ Follow multiple open source projects and their updates.

## Next Steps

- **[Run html2rss-web with Docker](/web-application/getting-started)** to verify your own instance.
- **[Use automatic feed generation](/web-application/how-to/use-automatic-feed-generation/)** when you want direct page-URL conversion.
- **[Run html2rss-web with Docker](/web-application/getting-started/)** to verify your own instance.
- **[Use automatic feed generation](/web-application/guides/use-automatic-feed-generation/)** when you want direct page-URL conversion.
- **[Create custom feeds](/creating-custom-feeds/)** when you need stable, reviewable extraction rules.
14 changes: 7 additions & 7 deletions src/content/docs/creating-custom-feeds.mdx
@@ -9,7 +9,7 @@ import { Aside, Code } from "@astrojs/starlight/components";

When existing feeds or auto-sourcing are not enough, write a YAML config for the site you want to follow.

**Prerequisites:** You should be familiar with the [Getting Started](/getting-started) guide before diving into custom configurations.
**Prerequisites:** You should be familiar with the [Getting Started](/getting-started/) guide before diving into custom configurations.

<Aside type="tip" title="Use this guide when you need more control">
Reach for a custom config when you need stable, reviewable extraction rules or generated output misses
@@ -117,9 +117,9 @@ This says: "Find each article, get the title from the h2 anchor, and get the lin
lang="yaml"
/>

**Step 3:** Test it with your html2rss-web instance or the [Ruby gem](/ruby-gem/installation).
**Step 3:** Test it with your html2rss-web instance or the [Ruby gem](/ruby-gem/installation/).

**Need help?** See our [troubleshooting guide](/troubleshooting/troubleshooting) for common issues.
**Need help?** See our [troubleshooting guide](/troubleshooting/troubleshooting/) for common issues.

---

@@ -186,7 +186,7 @@ there.
2. Click "Fork" → "Add file" → Create `domain.com.yml`
3. Paste your config → "Commit new file" → "Open pull request"

**Need help?** See our [contribution guide](/get-involved/contributing) for detailed instructions.
**Need help?** See our [contribution guide](/get-involved/contributing/) for detailed instructions.

---

@@ -200,7 +200,7 @@ there.
- **Missing content?** Try a browser-based rendering strategy during troubleshooting
- **Wrong data extracted?** Verify your selectors are pointing to the right elements

**Need more help?** See our [comprehensive troubleshooting guide](/troubleshooting/troubleshooting) or ask in [GitHub Discussions](https://github.com/orgs/html2rss/discussions).
**Need more help?** See our [comprehensive troubleshooting guide](/troubleshooting/troubleshooting/) or ask in [GitHub Discussions](https://github.com/orgs/html2rss/discussions).

---

@@ -212,7 +212,7 @@ there.

**For Beginners:**

- **[Run html2rss-web with Docker](/web-application/getting-started)** - Use the newest integrated behavior
- **[Run html2rss-web with Docker](/web-application/getting-started/)** - Use the newest integrated behavior
- **[Learn more about selectors](/ruby-gem/reference/selectors/)** - Master CSS selectors
- **[Submit your config via GitHub Web](https://github.com/html2rss/html2rss-configs)** - No Git knowledge required!

@@ -221,4 +221,4 @@ there.
- **[Browse existing configs](https://github.com/html2rss/html2rss-configs/tree/master/lib/html2rss/configs)** - See real examples
- **[Join discussions](https://github.com/orgs/html2rss/discussions)** - Connect with other users
- **[Learn about strategies](/ruby-gem/reference/strategy/)** - Decide when to use static vs JavaScript/browser-based extraction
- **[Learn advanced features](/ruby-gem/how-to/advanced-features/)** - Take your configs to the next level
- **[Learn advanced features](/ruby-gem/guides/advanced-features/)** - Take your configs to the next level
4 changes: 3 additions & 1 deletion src/content/docs/feed-directory/index.mdx
@@ -14,9 +14,11 @@ import FeedDirectory from "../../../components/FeedDirectory.astro";

---

Need the main onboarding path first? Start with [Getting Started](/web-application/getting-started/) and create a feed from your own page URL. The directory below is the packaged fallback/catalog path for fast demos, known sample sources, or cases where the catalog already covers your site.

Need a different instance? You can use the built-in default, self-host your own, or find more options on the [community-run wiki](https://github.com/html2rss/html2rss-web/wiki/Instances).

[🚀 Host Your Own Instance (and share it!)](/web-application/how-to/deployment)
[🚀 Host Your Own Instance (and share it!)](/web-application/deployment/)

---
