|
1 | | -# CLAUDE.md |
| 1 | +# RedditModLog |
2 | 2 |
|
3 | | -This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. |
| 3 | +Automated Reddit moderation log publisher — writes mod actions to a subreddit wiki page on a schedule. |
4 | 4 |
|
5 | | -## Project Overview |
| 5 | +## Stack |
| 6 | +- Python 3.11 / PRAW (Reddit API) |
| 7 | +- SQLite (deduplication and retention) |
| 8 | +- Docker with s6-overlay (deployment) |
6 | 9 |
|
7 | | -This is a Python-based Reddit moderation log publisher that automatically scrapes moderation actions from a subreddit and publishes them to a wiki page. The application uses PRAW (Python Reddit API Wrapper) and SQLite for data persistence. |
8 | | - |
9 | | -## Core Architecture |
10 | | - |
11 | | -- **Main application**: `modlog_wiki_publisher.py` - Single-file application containing all core functionality |
12 | | -- **Database layer**: `ModlogDatabase` class handles SQLite operations for deduplication and retention |
13 | | -- **Configuration**: JSON-based config with CLI override support |
14 | | -- **Logging**: Per-subreddit log files in `logs/` directory with rotating handlers |
15 | | -- **Authentication**: Reddit OAuth2 script-type app authentication |
16 | | - |
17 | | -## Development Commands |
18 | | - |
19 | | -**IMPORTANT**: Always use `/opt/.venv/redditbot/bin/python` for all Python commands in this project. |
20 | | - |
21 | | -### Setup and Dependencies |
22 | | -```bash |
23 | | -# Dependencies are pre-installed in the venv |
24 | | -# Copy template config (required for first run) |
25 | | -cp config_template.json config.json |
26 | | -``` |
27 | | - |
28 | | -### Running the Application |
| 10 | +## Dev |
29 | 11 | ```bash |
30 | | -# Test connection and configuration |
31 | 12 | /opt/.venv/redditbot/bin/python modlog_wiki_publisher.py --test |
32 | | - |
33 | | -# Single run |
34 | | -/opt/.venv/redditbot/bin/python modlog_wiki_publisher.py --source-subreddit SUBREDDIT_NAME |
35 | | - |
36 | | -# Continuous daemon mode |
37 | | -/opt/.venv/redditbot/bin/python modlog_wiki_publisher.py --source-subreddit SUBREDDIT_NAME --continuous |
38 | | - |
39 | | -# Force wiki update only (using existing database data) |
40 | | -/opt/.venv/redditbot/bin/python modlog_wiki_publisher.py --source-subreddit SUBREDDIT_NAME --force-wiki |
41 | | - |
42 | | -# Debug authentication issues |
43 | | -/opt/.venv/redditbot/bin/python scripts/debug_auth.py |
44 | | -``` |
45 | | - |
46 | | -### Database Operations |
47 | | -```bash |
48 | | -# View recent processed actions with removal reasons |
49 | | -sqlite3 modlog.db "SELECT action_id, action_type, moderator, removal_reason, subreddit, created_at FROM processed_actions ORDER BY created_at DESC LIMIT 10;" |
50 | | - |
51 | | -# View actions by subreddit |
52 | | -sqlite3 modlog.db "SELECT action_type, moderator, target_author, removal_reason FROM processed_actions WHERE subreddit = 'usenet' ORDER BY created_at DESC LIMIT 5;" |
53 | | - |
54 | | -# Track content lifecycle by target ID |
55 | | -sqlite3 modlog.db "SELECT target_id, action_type, moderator, removal_reason, datetime(created_at, 'unixepoch') FROM processed_actions WHERE target_id LIKE '%1mkz4jm%' ORDER BY created_at;" |
56 | | - |
57 | | -# Manual cleanup of old entries |
58 | | -sqlite3 modlog.db "DELETE FROM processed_actions WHERE created_at < date('now', '-30 days');" |
59 | | -``` |
60 | | - |
61 | | -## Configuration |
62 | | - |
63 | | -The application supports multiple configuration methods with the following priority (highest to lowest): |
64 | | -1. **Command line arguments** (highest priority) |
65 | | -2. **Environment variables** (override config file) |
66 | | -3. **JSON config file** (base configuration) |
67 | | - |
68 | | -### Environment Variables |
69 | | - |
70 | | -All configuration options can be set via environment variables: |
71 | | - |
72 | | -#### Reddit Credentials |
73 | | -- `REDDIT_CLIENT_ID`: Reddit app client ID |
74 | | -- `REDDIT_CLIENT_SECRET`: Reddit app client secret |
75 | | -- `REDDIT_USERNAME`: Reddit bot username |
76 | | -- `REDDIT_PASSWORD`: Reddit bot password |
77 | | - |
78 | | -#### Application Settings |
79 | | -- `SOURCE_SUBREDDIT`: Target subreddit name |
80 | | -- `WIKI_PAGE`: Wiki page name (default: "modlog") |
81 | | -- `RETENTION_DAYS`: Database cleanup period in days |
82 | | -- `BATCH_SIZE`: Entries fetched per run |
83 | | -- `UPDATE_INTERVAL`: Seconds between updates in daemon mode |
84 | | -- `ANONYMIZE_MODERATORS`: **MUST be `true`** (enforced for security) |
85 | | -- `DATABASE_PATH`: Path to SQLite database file (default: "modlog.db") |
86 | | -- `LOGS_DIR`: Directory for log files (default: "logs") |
87 | | - |
88 | | -#### Advanced Settings |
89 | | -- `WIKI_ACTIONS`: Comma-separated list of actions to show (e.g., "removelink,removecomment,approvelink") |
90 | | -- `IGNORED_MODERATORS`: Comma-separated list of moderators to ignore |
91 | | - |
92 | | -### Command Line Options |
93 | | -- `--source-subreddit`: Target subreddit for reading/writing logs |
94 | | -- `--wiki-page`: Wiki page name (default: "modlog") |
95 | | -- `--retention-days`: Database cleanup period (default: 30) |
96 | | -- `--batch-size`: Entries fetched per run (default: 100) |
97 | | -- `--interval`: Seconds between updates in daemon mode (default: 300) |
98 | | -- `--debug`: Enable verbose logging |
99 | | - |
100 | | -### Configuration Examples |
101 | | - |
102 | | -#### Using Environment Variables (Docker/Container) |
103 | | -```bash |
104 | | -# Set credentials via environment |
105 | | -export REDDIT_CLIENT_ID="your_client_id" |
106 | | -export REDDIT_CLIENT_SECRET="your_client_secret" |
107 | | -export REDDIT_USERNAME="your_bot_username" |
108 | | -export REDDIT_PASSWORD="your_bot_password" |
109 | | -export SOURCE_SUBREDDIT="usenet" |
110 | | - |
111 | | -# Run without config file |
112 | | -python modlog_wiki_publisher.py |
| 13 | +/opt/.venv/redditbot/bin/python modlog_wiki_publisher.py --source-subreddit NAME --continuous |
113 | 14 | ``` |
114 | 15 |
|
115 | | -#### Docker Example |
116 | | -```bash |
117 | | -docker run -e REDDIT_CLIENT_ID="id" \ |
118 | | - -e REDDIT_CLIENT_SECRET="secret" \ |
119 | | - -e REDDIT_USERNAME="bot" \ |
120 | | - -e REDDIT_PASSWORD="pass" \ |
121 | | - -e SOURCE_SUBREDDIT="usenet" \ |
122 | | - -e ANONYMIZE_MODERATORS="true" \ |
123 | | - your-modlog-image |
124 | | -``` |
125 | | - |
126 | | -#### Mixed Configuration |
127 | | -```bash |
128 | | -# Use config file + env overrides + CLI args |
129 | | -export SOURCE_SUBREDDIT="usenet" # Override config file |
130 | | -python modlog_wiki_publisher.py --debug --batch-size 25 # CLI takes priority |
131 | | -``` |
132 | | - |
133 | | -### Display Options |
134 | | -- `anonymize_moderators`: **REQUIRED** to be `true` for security (default: true) |
135 | | - - `true` (ENFORCED): Shows "AutoModerator", "Reddit", or "HumanModerator" |
136 | | - - `false`: **BLOCKED** - Would expose moderator identities publicly |
137 | | - |
138 | | -**SECURITY NOTE**: Setting `anonymize_moderators=false` is permanently disabled to protect moderator privacy. The application will refuse to start if this is attempted. |
139 | | - |
140 | | -### Action Types Displayed |
141 | | - |
142 | | -The application uses configurable action type variables for flexibility: |
143 | | - |
144 | | -#### Default Configuration |
145 | | -- **REMOVAL_ACTIONS**: `removelink`, `removecomment`, `spamlink`, `spamcomment` |
146 | | -- **APPROVAL_ACTIONS**: `approvelink`, `approvecomment` |
147 | | -- **REASON_ACTIONS**: `addremovalreason` |
148 | | -- **DEFAULT_WIKI_ACTIONS**: All above combined |
149 | | - |
150 | | -#### Display Behavior |
151 | | -- **Manual Actions**: Show as-is (e.g., `removelink`, `removecomment`) |
152 | | -- **AutoMod Filters**: Show with `filter-` prefix (e.g., `filter-removelink`, `filter-removecomment`) |
153 | | -- **Removal Reasons**: Combined with removal action when targeting same content |
154 | | -- **Human Approvals**: Only shown for reversals of Reddit/AutoMod actions |
155 | | -- **Approval Context**: Shows original removal reason and moderator (e.g., "Approved AutoModerator removal: Rule violation") |
156 | | - |
157 | | -### Database Features |
158 | | -- **Multi-subreddit support**: Single database handles multiple subreddits safely |
159 | | -- **Removal reason storage**: Full text/number handling from Reddit API |
160 | | -- **Target author tracking**: Actual usernames stored and displayed |
161 | | -- **Content ID extraction**: Unique IDs from permalinks for precise tracking |
162 | | -- **Data separation**: Subreddit column prevents cross-contamination |
163 | | - |
164 | | -## Authentication Requirements |
165 | | - |
166 | | -The bot account needs: |
167 | | -- Moderator status on the target subreddit |
168 | | -- Wiki edit permissions for the specified wiki page |
169 | | -- Reddit app credentials (script type, not web app) |
170 | | - |
171 | | -## File Structure |
172 | | - |
173 | | -- `modlog_wiki_publisher.py`: Main application |
174 | | -- `scripts/debug_auth.py`: Authentication debugging utility |
175 | | -- `tests/test_removal_reasons.py`: Test suite for removal reason processing |
176 | | -- `config.json`: Runtime configuration (created from template) |
177 | | -- `data/`: Runtime data directory (database files) |
178 | | -- `logs/`: Per-subreddit log files |
179 | | -- `requirements.txt`: Python dependencies |
180 | | - |
181 | | -## Testing |
182 | | - |
183 | | -Use `--test` flag to verify configuration and Reddit API connectivity without making changes. |
184 | | - |
185 | | -## Content Link Guidelines |
186 | | - |
187 | | -**CRITICAL**: Content links in the modlog should NEVER point to user profiles (`/u/username`). Links should only point to: |
188 | | -- Actual removed posts (`/comments/postid/`) |
189 | | -- Actual removed comments (`/comments/postid/_/commentid/`) |
190 | | -- No link at all if no actual content is available |
191 | | - |
192 | | -User profile links are a privacy concern and not useful for modlog purposes. |
193 | | - |
194 | | -## Recent Improvements (v1.2) |
195 | | - |
196 | | -### Environment Variable Support & Validation |
197 | | -- ✅ Complete environment variable support for all configuration options |
198 | | -- ✅ Standard configuration hierarchy: CLI args → Environment vars → Config file |
199 | | -- ✅ Container/Docker ready with secure credential handling |
200 | | -- ✅ Strict validation with 44+ known Reddit modlog actions in `VALID_MODLOG_ACTIONS` |
201 | | -- ✅ Fail-fast validation rejects invalid actions with clear error messages |
202 | | - |
203 | | -## Previous Improvements (v1.1) |
204 | | - |
205 | | -### Enhanced Removal Tracking |
206 | | -- ✅ Added approval action tracking for `approvelink` and `approvecomment` |
207 | | -- ✅ Smart filtering shows only approvals of Reddit/AutoMod removals in wiki |
208 | | -- ✅ Combined display of removal actions with their associated removal reasons |
209 | | -- ✅ AutoMod actions display as `filter-removelink`/`filter-removecomment` to distinguish from manual removals |
210 | | -- ✅ Approval actions show original removal context: "Approved AutoModerator removal: [reason]" |
211 | | -- ✅ Cleaner wiki presentation while maintaining full data integrity in database |
212 | | - |
213 | | -## Previous Improvements (v2.1) |
214 | | - |
215 | | -### Multi-Subreddit Database Support |
216 | | -- ✅ Fixed critical error that prevented multi-subreddit databases from working |
217 | | -- ✅ Single database now safely handles multiple subreddits with proper data separation |
218 | | -- ✅ Per-subreddit wiki updates without cross-contamination |
219 | | -- ✅ Subreddit-specific logging and error handling |
220 | | - |
221 | | -### Removal Reason Transparency |
222 | | -- ✅ Fixed "Removal reason applied" showing instead of actual text |
223 | | -- ✅ Full transparency - shows ALL available removal reason data including template numbers |
224 | | -- ✅ Consistent handling between storage and display logic using correct Reddit API fields |
225 | | -- ✅ Displays actual removal reasons like "Invites - No asking", "This comment has been filtered due to crowd control" |
226 | | - |
227 | | -### Unique Content ID Tracking |
228 | | -- ✅ Fixed duplicate IDs in markdown tables where all comments showed same post ID |
229 | | -- ✅ Comments now show unique comment IDs (e.g., "n7ravg2") for precise tracking |
230 | | -- ✅ Posts show post IDs for clear content identification |
231 | | -- ✅ Each modlog entry has a unique identifier for easy reference |
232 | | - |
233 | | -### Content Linking and Display |
234 | | -- ✅ Content links point to actual Reddit posts/comments, never user profiles for privacy |
235 | | -- ✅ Fixed target authors showing as [deleted] - now displays actual usernames |
236 | | -- ✅ Proper content titles extracted from Reddit API data |
237 | | -- ✅ AutoModerator displays as "AutoModerator" (not anonymized) |
238 | | -- ✅ Configurable anonymization for human moderators |
239 | | - |
240 | | -### Data Integrity |
241 | | -- ✅ Pipe character escaping for markdown table compatibility |
242 | | -- ✅ Robust error handling for mixed subreddit scenarios |
243 | | -- ✅ Database schema at version 5 with all required columns |
244 | | -- ✅ Consistent Reddit API field usage (action.details vs action.description) |
| 16 | +Always use `/opt/.venv/redditbot/bin/python`, not system python. |
245 | 17 |
|
246 | | -## Development Guidelines |
| 18 | +## Structure |
| 19 | +- `modlog_wiki_publisher.py` — Single-file application (ModlogDatabase class + main logic) |
| 20 | +- `config_template.json` — Config template |
| 21 | +- `scripts/debug_auth.py` — Auth debugging utility |
| 22 | +- `tests/` — Test suite |
247 | 23 |
|
248 | | -### Git Workflow |
249 | | -- If branch is not main, you may commit and push if a PR is draft or not open |
250 | | -- Use conventional commits for all changes |
251 | | -- Use multiple commits if needed, or patch if easier |
252 | | -- Always update CLAUDE.md and README.md when making changes |
| 24 | +## Config Priority |
| 25 | +CLI args > Environment variables > JSON config file |
253 | 26 |
|
254 | | -### Code Standards |
255 | | -- Always escape markdown table values like removal reasons for pipes |
256 | | -- Store pipe-free data in database to prevent markdown issues |
257 | | -- Confirm cache file of wiki page and warn if same, interactively ask to force refresh |
258 | | -- Always use the specified virtual environment path |
| 27 | +## Key Environment Variables |
| 28 | +`REDDIT_CLIENT_ID`, `REDDIT_CLIENT_SECRET`, `REDDIT_USERNAME`, `REDDIT_PASSWORD`, `SOURCE_SUBREDDIT` |
259 | 29 |
|
260 | | -### Documentation |
261 | | -- Always update commands and flags in documentation |
262 | | -- Remove CHANGELOG from CLAUDE.md (keep separate) |
263 | | -- Create and update changelog based on git tags (should be scripted) |
| 30 | +## Security |
| 31 | +- `anonymize_moderators` MUST be `true` (enforced, app refuses to start otherwise) |
| 32 | +- Content links must never point to user profiles — only to posts/comments |
| 33 | +- Escape pipe characters in removal reasons for markdown table compatibility |
264 | 34 |
|
265 | | -## Common Issues |
| 35 | +## Docker |
| 36 | +Image: `ghcr.io/baker-scripts/redditmodlog` |
| 37 | +Tags: `:1`, `:1.4`, `:1.4.x`, `:latest` |
| 38 | +Uses s6-overlay for init, PUID/PGID user management. |
266 | 39 |
|
267 | | -- **401 errors**: Check app type is "script" and verify client_id/client_secret |
268 | | -- **Wiki permission denied**: Ensure bot has moderator or wiki contributor access |
269 | | -- **Rate limiting**: Increase `--interval` and/or reduce `--batch-size` |
270 | | -- **Module not found**: Always use `/opt/.venv/redditbot/bin/python` instead of system python |
| 40 | +## Git Workflow |
| 41 | +- Conventional commits |
| 42 | +- May commit/push directly if branch is not main and PR is draft or not open |
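
Note on the new "Config Priority" line: the order `CLI args > Environment variables > JSON config file` can be illustrated with a minimal sketch. This is not the code in `modlog_wiki_publisher.py`. The helper name `resolve_setting`, the lowercase JSON key (for example `source_subreddit`), and the `config.json` default path are assumptions made for illustration only.

```python
import json
import os


def resolve_setting(name, cli_value=None, config_path="config.json", default=None):
    """Hypothetical helper: resolve one setting by precedence.

    Order: CLI value, then environment variable, then JSON config file.
    """
    # 1. A value passed on the command line always wins.
    if cli_value is not None:
        return cli_value

    # 2. Otherwise look for an upper-case environment variable,
    #    e.g. SOURCE_SUBREDDIT for the setting "source_subreddit".
    env_value = os.environ.get(name.upper())
    if env_value is not None:
        return env_value

    # 3. Fall back to the JSON file; a missing file means "no base config".
    try:
        with open(config_path, encoding="utf-8") as fh:
            file_config = json.load(fh)
    except FileNotFoundError:
        file_config = {}

    # Assumption: JSON keys are the lower-case form of the setting name.
    return file_config.get(name.lower(), default)
```

Values read from the environment arrive as strings, so a real implementation would likely need to convert settings such as `BATCH_SIZE` or `ANONYMIZE_MODERATORS` to their intended types before use.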