Add built-in schemas for common sites (Springer, arXiv)

## Summary

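Add pre-built schemas for commonly used academic and document sites (Springer books and arXiv papers), plus a generic fallback for URLs that no specific schema matches.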

## Schemas to Implement

### 1. Springer Books

```python
# fetcharoo/schemas/springer.py
from .base import SiteSchema
from .registry import schema
import re

@schema
class SpringerBook(SiteSchema):
    name = "springer_book"
    url_pattern = r"https?://link\.springer\.com/book/10\.\d+/.*"
    description = "Springer book with chapters"
    
    include_patterns = ["*.pdf"]
    exclude_patterns = ["*bbm*", "*bfm*"]  # Back/front matter
    sort_by = "numeric"
    recommended_depth = 1
    request_delay = 1.0
    
    test_url = "https://link.springer.com/book/10.1007/978-3-031-41026-0"
    expected_min_pdfs = 5
    
    @staticmethod
    def sort_key(url: str) -> tuple:
        """Sort by chapter number in filename (e.g., 978-3-xxx_5.pdf)."""
        match = re.search(r'_(\d+)\.pdf$', url)
        return (int(match.group(1)),) if match else (float('inf'),)
```
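A minimal sketch of how the filter and sort fields above might combine. It assumes `include_patterns` and `exclude_patterns` are shell-style globs matched against the full URL; the issue does not specify this, so treat it as an assumption. The `select_pdfs` helper and the `example.org` URLs are hypothetical and for illustration only.

```python
import re
from fnmatch import fnmatchcase

# Values mirrored from the SpringerBook schema above.
INCLUDE_PATTERNS = ["*.pdf"]
EXCLUDE_PATTERNS = ["*bbm*", "*bfm*"]


def sort_key(url: str) -> tuple:
    """Same logic as SpringerBook.sort_key: chapter number, else sort last."""
    match = re.search(r'_(\d+)\.pdf$', url)
    return (int(match.group(1)),) if match else (float('inf'),)


def select_pdfs(urls):
    """Hypothetical helper: keep included URLs, drop excluded ones, then sort."""
    kept = [
        u for u in urls
        if any(fnmatchcase(u, p) for p in INCLUDE_PATTERNS)
        and not any(fnmatchcase(u, p) for p in EXCLUDE_PATTERNS)
    ]
    return sorted(kept, key=sort_key)


# Made-up URLs, discovered in arbitrary order.
urls = [
    "https://example.org/content/pdf/978-3-000-00000-0_10.pdf",
    "https://example.org/content/pdf/978-3-000-00000-0_2.pdf",
    "https://example.org/content/pdf/bfm-978-3-000-00000-0.pdf",
    "https://example.org/content/pdf/978-3-000-00000-0.pdf",
    "https://example.org/about.html",
]
ordered = select_pdfs(urls)
for url in ordered:
    print(url)
```

Chapter 2 sorts before chapter 10 because the key is an integer rather than a string, and the PDF without a chapter suffix lands at the end via `float('inf')`.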

### 2. arXiv Papers

```python
# fetcharoo/schemas/arxiv.py
from .base import SiteSchema
from .registry import schema

@schema
class ArxivPaper(SiteSchema):
    name = "arxiv"
    url_pattern = r"https?://arxiv\.org/(abs|pdf)/\d+\.\d+.*"
    description = "arXiv preprint paper"
    
    sort_by = "none"
    recommended_depth = 0
    request_delay = 0.5
    
    test_url = "https://arxiv.org/abs/2301.07041"
    expected_min_pdfs = 1
```
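A quick check of which URLs the `url_pattern` above accepts, assuming the registry applies it with `re.match` (anchored at the start of the URL); the actual matching call is not specified in this issue. One thing it shows: old-style arXiv identifiers such as `hep-th/9901001` do not match, so it may be worth deciding whether that is intended.

```python
import re

# Pattern copied from the ArxivPaper schema above.
ARXIV_PATTERN = re.compile(r"https?://arxiv\.org/(abs|pdf)/\d+\.\d+.*")

candidates = [
    "https://arxiv.org/abs/2301.07041",      # abstract page (the test_url)
    "https://arxiv.org/pdf/2301.07041v2",    # direct PDF link with a version suffix
    "http://arxiv.org/abs/2301.07041",       # plain http
    "https://arxiv.org/abs/hep-th/9901001",  # old-style identifier
    "https://arxiv.org/list/cs.AI/recent",   # listing page, not a paper
]

# Map each URL to whether the pattern accepts it.
results = {url: bool(ARXIV_PATTERN.match(url)) for url in candidates}
for url, matched in results.items():
    print(f"{'match' if matched else 'no match':8}  {url}")
```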

### 3. Generic/Fallback

```python
# fetcharoo/schemas/generic.py
from .base import SiteSchema
from .registry import schema

@schema
class GenericSite(SiteSchema):
    name = "generic"
    url_pattern = r".*"  # Matches anything (lowest priority)
    description = "Generic fallback for unknown sites"
    
    sort_by = "none"  # Preserve discovery order
    recommended_depth = 0
    request_delay = 0.5
    
    # No test_url - this is the fallback
```

## Directory Structure

```
fetcharoo/schemas/
├── __init__.py      # Exports all schemas
├── base.py          # SiteSchema dataclass
├── registry.py      # Registry functions
├── springer.py      # Springer schemas
├── arxiv.py         # arXiv schema
└── generic.py       # Generic fallback
```

## Tasks

- [ ] Create `springer.py` with `SpringerBook` schema
- [ ] Create `arxiv.py` with `ArxivPaper` schema
- [ ] Create `generic.py` with fallback schema
- [ ] Auto-register all schemas on import
- [ ] Ensure detection priority (specific before generic)
- [ ] Add integration tests with mocked responses
- [ ] Document each schema's behavior

## Acceptance Criteria

- `detect_schema("https://link.springer.com/book/...")` returns `SpringerBook`
- `detect_schema("https://arxiv.org/abs/...")` returns `ArxivPaper`
- `detect_schema("https://random-site.com")` returns `GenericSite`
- `SpringerBook.sort_key` orders chapter PDFs by chapter number, and URLs without a chapter number sort last

## Dependencies

- #11 (SiteSchema base class)
- #12 (Schema registry)

## Part of

Parent issue: #10
