
docs: add rule against RegExp.escape() for MongoDB-bound regex patterns

Document the escapeStringForMongoRegex convention introduced in the #11235
fix, so future MongoDB queries don't reintroduce the PCRE2 \u bug (error
51091). RegExp.escape() stays valid for in-process .test()/.replace().

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Yuki Takei 2 weeks ago
parent
commit
e5f2f34bb7
2 changed files with 47 additions and 0 deletions
  1. .claude/rules/mongodb-regex.md (+46, -0)
  2. AGENTS.md (+1, -0)

+ 46 - 0
.claude/rules/mongodb-regex.md

@@ -0,0 +1,46 @@
+# MongoDB Regex Escaping
+
+## RegExp.escape() must not be used for MongoDB-bound regex patterns
+
+Node.js 24's built-in `RegExp.escape()` escapes non-ASCII whitespace (code points
+≥ U+0100, e.g. U+3000 IDEOGRAPHIC SPACE) into `\uXXXX` form. MongoDB's PCRE2 engine
+does **not** support `\u`, so such a pattern throws:
+
+```
+Regular expression is invalid: PCRE2 does not support \L, \l, \N{name}, \U, or \u
+  code: 51091
+```
+
+This breaks page creation, v5 page migration, page listing, etc. for any path that
+contains those characters. (`escape-string-regexp`, used before the v7.5.0 refactor,
+passed non-ASCII characters through literally and did not have this problem.)
+
+## The Rule
+
+When a regex is sent to **MongoDB** — used as a `$regex` value, or wrapped in
+`new RegExp(...)` and assigned to a query field (`path`, `name`, …) in a Mongoose
+`find` / `updateMany` / `aggregate` / `count` / `bulkWrite` — escape the dynamic part
+with **`escapeStringForMongoRegex()`** from `@growi/core/dist/utils`, never `RegExp.escape()`.
+
+`escapeStringForMongoRegex()` escapes only regex metacharacters and passes every other
+character through literally (equivalent to `escape-string-regexp` v5), so its output
+never contains `\u` and is safe for PCRE2.
+
+```typescript
+import { escapeStringForMongoRegex } from '@growi/core/dist/utils';
+
+// ❌ WRONG — pattern goes to MongoDB
+Page.find({ path: new RegExp(`^${RegExp.escape(path)}`) });
+
+// ✅ CORRECT
+Page.find({ path: new RegExp(`^${escapeStringForMongoRegex(path)}`) });
+```
+
+## Exception: in-process JS regex is fine
+
+`RegExp.escape()` is acceptable for regexes evaluated **in-process by V8** — i.e.
+`.test()` / `.replace()` / `.match()` on local strings that are never sent to MongoDB.
+V8 interprets `\uXXXX` correctly, so there is no need to change those call sites.
+
+See `escapeStringForMongoRegex` (`packages/core/src/utils/escape-string-for-regex.ts`)
+and issue #11235 for background.
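
Reviewer note (not part of the commit): the rule describes `escapeStringForMongoRegex()` as equivalent to `escape-string-regexp` v5, so a minimal sketch of that behaviour may help readers who have not opened the helper. The function name `escapeForMongoRegexSketch` and the sample path are hypothetical, and the real implementation in `packages/core/src/utils/escape-string-for-regex.ts` may differ.

```typescript
// Hypothetical sketch modelled on escape-string-regexp v5; NOT the actual GROWI helper.
// It backslash-escapes regex metacharacters only and leaves every other
// character (including non-ASCII whitespace such as U+3000) untouched.
function escapeForMongoRegexSketch(input: string): string {
  return input
    .replace(/[|\\{}()[\]^$+*?.]/g, '\\$&') // escape regex metacharacters
    .replace(/-/g, '\\x2d'); // escape "-" in a form PCRE2 also accepts
}

// Sample path (an assumption for illustration) containing U+3000 IDEOGRAPHIC SPACE.
const path = '/docs/設計\u3000メモ (v2)';
const escaped = escapeForMongoRegexSketch(path);
const pattern = new RegExp(`^${escaped}`);

console.log(escaped.includes('\\u')); // false
console.log(pattern.test(path)); // true
```

Because the output never contains a `\u` sequence, a pattern built this way should avoid the PCRE2 error 51091 described above, while still matching the original path when evaluated in-process.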

+ 1 - 0
AGENTS.md

@@ -27,6 +27,7 @@ GROWI is a team collaboration wiki platform using Markdown, featuring hierarchic
 | **github-cli** | **CRITICAL**: gh CLI auth required; stop immediately if unauthenticated |
 
 | **testing** | Test commands, pnpm vitest usage |
+| **mongodb-regex** | `RegExp.escape()` breaks MongoDB PCRE2 for non-ASCII whitespace; use `escapeStringForMongoRegex` for query-bound patterns |
 
 ### On-Demand Skills