
Add sitemap and robots.txt for SEO #16

Merged
ooloth merged 10 commits into main from add-sitemap on Dec 19, 2025

Conversation

ooloth (Owner) commented Dec 19, 2025

✅ What

  • Adds /sitemap.xml with all static pages and dynamic blog posts from Notion
  • Adds /robots.txt that allows all crawlers and references the sitemap
  • Adds sitemap link to HTML <head> for search engine discovery
  • Uses trailing slashes in URLs to match existing RSS feed format
  • All files statically generated at build time
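For illustration, a route of the shape described above might look like this. This is a hedged sketch, not the merged code: the local SitemapEntry type stands in for Next.js's MetadataRoute.Sitemap, and getPosts here is a placeholder for the real Notion-backed helper.

```typescript
// Hypothetical sketch of an app/sitemap.ts metadata route.
// SitemapEntry mirrors the shape of Next.js's MetadataRoute.Sitemap entries.
type SitemapEntry = {
  url: string
  lastModified?: Date
  changeFrequency?: 'always' | 'hourly' | 'daily' | 'weekly' | 'monthly' | 'yearly' | 'never'
  priority?: number
}

const SITE_URL = 'https://example.com/' // placeholder; note the trailing slash

// Stand-in for the real Notion-backed getPosts helper.
async function getPosts(): Promise<{ slug: string; firstPublished: string }[]> {
  return [{ slug: 'hello-world', firstPublished: '2025-01-01' }]
}

export default async function sitemap(): Promise<SitemapEntry[]> {
  const posts = await getPosts()

  // Static pages, all with trailing slashes to match the RSS feed format.
  const staticPages: SitemapEntry[] = [
    { url: SITE_URL, priority: 1 },
    { url: `${SITE_URL}blog/`, changeFrequency: 'weekly', priority: 0.8 },
    { url: `${SITE_URL}likes/`, priority: 0.5 },
  ]

  // One entry per blog post, dated by its publication date.
  const blogPosts: SitemapEntry[] = posts.map((post) => ({
    url: `${SITE_URL}${post.slug}/`,
    lastModified: new Date(post.firstPublished),
    priority: 0.7,
  }))

  return [...staticPages, ...blogPosts]
}
```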

🤔 Why

  • Search engines can now efficiently discover and index all site content
  • Better SEO means more organic traffic to blog posts
  • Sitemap auto-updates when new blog posts are published in Notion

Copilot AI review requested due to automatic review settings December 19, 2025 03:57

Copilot AI left a comment


Pull request overview

This PR implements SEO improvements by adding XML sitemap generation and a robots.txt file. The sitemap includes all static pages (home, blog index, likes) and dynamically fetches blog posts from Notion, while robots.txt allows all crawlers and references the sitemap location.

Key changes:

  • Added sitemap generation with static pages and dynamic blog posts from Notion
  • Added robots.txt configuration allowing all crawlers with sitemap reference
  • Added sitemap link to HTML head via layout metadata

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Summary per file:

app/sitemap.ts: Generates sitemap with static pages and blog posts fetched from Notion
app/sitemap.test.ts: Comprehensive test coverage for sitemap generation logic
app/robots.ts: Configures robots.txt to allow all crawlers and reference sitemap
app/robots.test.ts: Test coverage for robots.txt configuration
app/layout.tsx: Adds sitemap link to HTML head metadata
app/layout.test.tsx: Test coverage for layout metadata including sitemap link
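For context, the robots route listed above is typically only a few lines. Here is a hedged sketch (the local RobotsConfig type stands in for Next.js's MetadataRoute.Robots, and the URL is a placeholder, not the repository's actual value):

```typescript
// Hypothetical sketch of app/robots.ts.
type RobotsConfig = {
  rules: { userAgent: string; allow: string }[]
  sitemap: string
}

const SITE_URL = 'https://example.com/' // placeholder; trailing slash included

// Allow all crawlers everywhere and point them at the sitemap.
export default function robots(): RobotsConfig {
  return {
    rules: [{ userAgent: '*', allow: '/' }],
    sitemap: `${SITE_URL}sitemap.xml`,
  }
}
```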



const blogPosts: MetadataRoute.Sitemap = posts.map((post) => ({
  url: `${SITE_URL}${post.slug}/`,
  lastModified: new Date(post.firstPublished),
Copilot AI commented Dec 19, 2025

The field name 'firstPublished' suggests the initial publication date, but 'lastModified' typically represents the most recent update. Consider using a 'lastUpdated' or 'lastModified' field from the post data if available, or rename this to 'published' to better reflect that it represents the publication date rather than the last modification date.

app/sitemap.ts Outdated

const staticPages: MetadataRoute.Sitemap = [
  { url: SITE_URL, priority: 1 },
  { url: `${SITE_URL}blog`, changeFrequency: 'weekly', priority: 0.8 },
Copilot AI commented Dec 19, 2025

The 'blog' URL is missing a trailing slash while 'likes' and blog post URLs include them. This creates inconsistency in URL formatting. For consistency with the trailing slash pattern used elsewhere (line 18), add a trailing slash to the blog URL.

Suggested change
{ url: `${SITE_URL}blog`, changeFrequency: 'weekly', priority: 0.8 },
{ url: `${SITE_URL}blog/`, changeFrequency: 'weekly', priority: 0.8 },

claude bot commented Dec 19, 2025

Pull Request Review: Add sitemap and robots.txt for SEO

Thank you for this well-structured PR! The implementation is clean and follows good practices. Here's my detailed review:


Strengths

  1. Excellent Test Coverage: All new files have comprehensive tests with good edge case coverage

    • sitemap.test.ts covers static pages, dynamic posts, empty states, and integration scenarios
    • robots.test.ts validates configuration completely
    • layout.test.tsx ensures metadata updates are verified
  2. Consistent Patterns: Implementation follows existing codebase conventions

    • Uses force-static export (matches RSS feed pattern in app/rss.xml/route.ts:8)
    • Leverages existing getPosts function with proper error handling
    • Follows Result pattern with .unwrap() (consistent with codebase error handling)
  3. SEO Best Practices: Proper implementation of standard SEO features

    • Sitemap includes priority and changeFrequency attributes
    • robots.txt references sitemap location
    • Sitemap link added to HTML <head> for search engine discovery
  4. URL Consistency: Trailing slashes match existing RSS feed format (app/rss.xml/route.ts:40)


⚠️ Issues & Recommendations

1. Error Handling Missing (High Priority)

Issue: app/sitemap.ts:9 calls .unwrap() which will throw if getPosts fails, causing build failures.

Location: app/sitemap.ts:9

const posts = (await getPosts({ sortDirection: 'descending' })).unwrap()

Problem: Unlike the RSS feed, which handles fetch errors in a loop (app/rss.xml/route.ts:30-33), the sitemap has no fallback for API failures.

Recommendation: Add error handling to gracefully degrade:

export default async function sitemap(): Promise<MetadataRoute.Sitemap> {
  const postsResult = await getPosts({ sortDirection: 'descending' })
  
  const staticPages: MetadataRoute.Sitemap = [
    { url: SITE_URL, priority: 1 },
    { url: `${SITE_URL}blog`, changeFrequency: 'weekly', priority: 0.8 },
    { url: `${SITE_URL}likes`, priority: 0.5 },
  ]

  // Return static pages if posts fetch fails
  if (!postsResult.ok) {
    console.error('[Sitemap] Failed to fetch posts:', postsResult.error)
    return staticPages
  }

  const blogPosts: MetadataRoute.Sitemap = postsResult.value.map((post) => ({
    url: `${SITE_URL}${post.slug}/`,
    lastModified: new Date(post.firstPublished),
    priority: 0.7,
  }))

  return [...staticPages, ...blogPosts]
}

Why: This ensures the sitemap still generates with static pages even if Notion API fails, preventing complete build failures.


2. Test Gap: Error Handling Not Tested (Medium Priority)

Issue: Tests don't cover the getPosts failure scenario, which is critical for build reliability.

Location: app/sitemap.test.ts

Recommendation: Add test case:

import { Err } from '@/utils/errors/result'

it('returns static pages when getPosts fails', async () => {
  vi.mocked(getPosts).mockResolvedValue(Err(new Error('API failure')))

  const urls = await sitemap()

  // Should still return the 3 static pages
  expect(urls).toHaveLength(3)
  expect(urls.every((u) => u.url.startsWith('https://michaeluloth.com/'))).toBe(true)
})

3. Sitemap Link in <head> - Non-Standard (Low Priority)

Issue: app/layout.tsx:18-26 adds sitemap link via metadata.icons.other, which is semantically incorrect.

Context: While this works, sitemaps aren't icons. The <link rel="sitemap"> tag is informational only (search engines discover sitemaps via robots.txt or manual submission).

Options:

  1. Keep as-is: Works fine, no functional impact
  2. Remove: Sitemap in robots.txt is sufficient (app/robots.ts:13)
  3. Use Next.js metadata.other (if you want to keep it):
    export const metadata: Metadata = {
      // ... other fields
      other: {
        'sitemap': '/sitemap.xml',
      },
    }

Recommendation: Consider removing since it's redundant with robots.txt, but no harm in keeping it.


4. Minor: SITE_URL Duplication (Low Priority)

Observation: SITE_URL constant duplicated across files:

  • app/sitemap.ts:3
  • app/robots.ts:3
  • app/rss.xml/route.ts:6

Recommendation: Consider extracting to shared constant (e.g., lib/constants.ts) to ensure consistency and easier updates. Not critical for this PR.
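A sketch of the suggested extraction (the file path and domain here are illustrative, not the repository's actual values):

```typescript
// utils/constants.ts (hypothetical): single source of truth for the site URL.
// The trailing slash matters because every consumer appends paths like
// `${SITE_URL}blog/` without a leading slash.
export const SITE_URL = 'https://example.com/' // placeholder domain

// Consumers (app/sitemap.ts, app/robots.ts, app/rss.xml/route.ts) would then:
// import { SITE_URL } from '@/utils/constants'
```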


🔒 Security

✅ No security concerns. All data from getPosts is already validated via Zod schemas at the I/O boundary (io/notion/getPosts.ts:24-60).


Performance

✅ Excellent:

  • Static generation at build time (force-static)
  • No runtime overhead
  • Leverages existing cache system via getPosts

📝 Code Quality

✅ Generally excellent:

  • Clean, readable code
  • Follows TypeScript best practices
  • Proper use of Next.js 16 metadata routes
  • Consistent with existing patterns

🎯 Summary

Must Fix Before Merge:

  1. Add error handling to sitemap.ts to prevent build failures

Nice to Have:

  1. Add test for getPosts failure scenario
  2. Consider removing or relocating the sitemap <head> link

Overall Assessment: Strong implementation with one critical fix needed. The test coverage and code quality are exemplary. Once error handling is added, this will be production-ready.

Great work! 🚀

Previously, if fetching blocks failed for any post, the RSS feed would
silently skip that post and continue building. This meant we could
deploy an incomplete RSS feed without knowing.

Now using .unwrap() to crash the build if any post's blocks can't be
fetched, ensuring we never deploy a broken RSS feed.

Updated test to verify build crashes on block fetch failures.

The <link rel="sitemap"> tag in HTML is:
- Not part of the HTML spec
- Not used by major search engines
- Redundant with robots.txt (which is the standard way to reference sitemaps)

Search engines discover sitemaps via robots.txt, manual submission to
search consoles, or by crawling /sitemap.xml directly.

Centralizes SITE_URL in utils/constants.ts to ensure consistent trailing
slash usage across sitemap, robots.txt, and RSS feed.

The trailing slash is critical for URL consistency - all page URLs use
trailing slashes (/, /blog/, /likes/, /slug/). Having a single source
of truth prevents accidental inconsistencies.
@ooloth ooloth merged commit 2e55f2b into main Dec 19, 2025
4 checks passed
@ooloth ooloth deleted the add-sitemap branch December 19, 2025 04:21