How Does TypeScript Handle Control Characters?
Why Does It Matter?
In enterprise applications, where the cost of a breach is high, injection attacks deserve close attention.[1][2] Control characters such as CR (\r) and LF (\n) are a common attack vector: CRLF injection can lead to HTTP response splitting, session hijacking, and log poisoning.[3][4] Understanding how TypeScript treats control characters at the type level is the first step toward building safe input handling.
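To make the log-poisoning case concrete, here is a minimal sketch. The `formatLoginLog` and `escapeLineBreaks` helpers are hypothetical names invented for this illustration, and escaping is only one possible mitigation.

```typescript
// Hypothetical logger helper: interpolates untrusted input directly into a log line.
function formatLoginLog(user: string): string {
  return `login ok user=${user}`;
}

// An attacker-supplied "username" that smuggles in a CRLF sequence.
const malicious = "alice\r\nlogin ok user=admin";

// One log call now yields two lines, and the second one is forged.
const forgedLines = formatLoginLog(malicious).split("\r\n");
console.log(forgedLines.length); // 2

// One possible mitigation (an assumption, not the only option):
// escape CR and LF so the entry stays on a single line.
function escapeLineBreaks(value: string): string {
  return value.replace(/\r/g, "\\r").replace(/\n/g, "\\n");
}

const safeLines = formatLoginLog(escapeLineBreaks(malicious)).split("\r\n");
console.log(safeLines.length); // 1
```

The same mechanism applies to HTTP headers: a value written verbatim into a header can end that header early and start a new one.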
How TypeScript "Interprets" Control Characters
In TypeScript, control characters (such as \0, \n, \t, and \r) are just part of the string type.[5] There is no special type-level distinction for them.
However, there are some interesting nuances:
1. String literal types can include escape sequences[5]
type Newline = "\n"; // valid literal type
type Tab = "\t"; // valid literal type
type Null = "\0"; // valid literal type
const a: Newline = "\n"; // OK
const b: Newline = "\t"; // Error: "\t" is not assignable to type "\n"
2. Unicode escapes work in literal types too[7]
type Bell = "\u0007"; // BEL control character
type Escape = "\u001B"; // ESC
3. TypeScript doesn't validate string content beyond literal matching
// There's no built-in way to say "a string with no control characters"
type SafeString = string; // no narrower option natively
4. Template literal types can match a specific control character, but not a class or range of them[6]
// You can test for one known character:
type HasNewline<S extends string> =
  S extends `${infer _}\n${infer __}` ? true : false;
type Test1 = HasNewline<"hello\nworld">; // true
type Test2 = HasNewline<"hello world">; // false
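The same pattern extends to a small, explicitly listed set of characters by putting a union in the placeholder. The sketch below is illustrative only; `HasLineBreak` is a hypothetical name. It also shows the main limitation: the check is meaningful only for literal types.

```typescript
// Matches a string literal type that contains CR or LF anywhere.
// The union inside the placeholder distributes into one pattern per character.
type HasLineBreak<S extends string> =
  S extends `${string}${"\r" | "\n"}${string}` ? true : false;

// These assignments only compile if the conditional type resolves as annotated.
const withCrlf: HasLineBreak<"a\r\nb"> = true;
const withLf: HasLineBreak<"a\nb"> = true;
const plain: HasLineBreak<"a b"> = false;

// Caveat: for the general `string` type the result is `false`, because
// `string` is not assignable to the pattern. That says nothing about what
// a runtime value actually contains.
const fromPlainString: HasLineBreak<string> = false;
```

Because user input is typed as `string` rather than as a literal, this technique cannot replace runtime validation.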
5. At runtime, all control characters are valid string values
const s: string = "\x00\x01\x02"; // perfectly valid
s.length; // 3
The key takeaway: TypeScript treats control characters as ordinary string content. You can use them in string literal types for exact matching, but there's no built-in type to exclude or require them. To enforce "no control characters", you need a runtime validation layer (such as Zod[8]) whose result is recorded in the type system, for example with a branded type and a type guard[9].
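A branded type with a type guard might look like the following minimal sketch. The names `SafeString`, `isSafeString`, and `describeInput` are assumptions made for this example and do not come from any library.

```typescript
// Brand: a compile-time-only marker with no runtime representation.
type SafeString = string & { readonly __brand: "SafeString" };

// Type guard: narrows `string` to `SafeString` when the value contains
// no character from the Unicode Control category (Cc).
function isSafeString(value: string): value is SafeString {
  return !/\p{Cc}/u.test(value);
}

function describeInput(input: string): string {
  if (isSafeString(input)) {
    const safe: SafeString = input; // narrowed inside this branch
    return `safe (${safe.length} chars)`;
  }
  return "rejected: contains control characters";
}

console.log(describeInput("hello")); // safe (5 chars)
console.log(describeInput("hel\u0000lo")); // rejected: contains control characters
```

Note that tab and newline are control characters too, so this guard rejects multi-line text. Whether that is desirable depends on the field being validated.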
Why Use Types Over Regex?
The regex approach uses a pattern like /[\x00-\x1F\x7F]/ to detect control characters and reject the input. The type approach creates a newtype (a validated wrapper type) that guarantees no control characters exist in the value, enforced at construction time.[9][10]
Key reasons to prefer types
- Compile-time guarantees: Once you have a value of the validated type, you know it's clean everywhere it's used. With regex, you have to remember to validate at every entry point.[9]
- Single point of validation: The type's constructor validates once, whereas regex validation tends to scatter throughout the codebase and get duplicated.[11]
- Self-documenting: A type name like SafeString or NoControlChars makes the intent clear in function signatures, while a regex hidden in validation logic is easy to overlook.[10]
- Composability: Types integrate naturally with the type system, so functions accepting SafeString can't accidentally receive unvalidated input. TypeScript uses structural typing, but branded types simulate nominal typing to prevent accidental interchangeability.[12]
- Regex pitfalls: Hand-written control character regexes are error-prone. It is easy to miss edge cases such as Unicode control characters or to introduce off-by-one errors in ranges, and different regex engines behave inconsistently.[7][13]
- Parse, don't validate: Using types to make invalid states unrepresentable aligns with the parse-don't-validate principle.[14]
Types Over Regex in Practice
The regex problem
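function saveComment(text: string) {
  if (/[\x00-\x1F\x7F]/.test(text)) {
    throw new Error("control characters not allowed");
  }
  // ... save it
}
The check itself works, but the parameter is still a plain string. Every other function that accepts a string must remember to repeat the check, and nothing stops unvalidated input from slipping through.[5]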
The branded type approach[9]
type CleanText = string & { readonly __brand: unique symbol };
function cleanText(raw: string): CleanText {
  if (/\p{Cc}/u.test(raw)) {
    throw new Error("control characters not allowed");
  }
  return raw as CleanText;
}
Now functions declare what they need:
function saveComment(text: CleanText) {
  // guaranteed clean: a raw string cannot be passed here
}
saveComment("hello"); // Type error
saveComment(cleanText("hello")); // OK
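One consequence of this design is worth seeing in code: a branded value still behaves like a string, but any string operation returns a plain `string`, so derived values have to be parsed again. The sketch below repeats the definitions above so it can run on its own. The variable names are illustrative.

```typescript
type CleanText = string & { readonly __brand: unique symbol };

function cleanText(raw: string): CleanText {
  if (/\p{Cc}/u.test(raw)) {
    throw new Error("control characters not allowed");
  }
  return raw as CleanText;
}

const greeting = cleanText("hello");

// A CleanText is still a string, so string APIs work unchanged.
console.log(greeting.length); // 5

// String operations return plain `string`, so the brand is dropped.
const joined: string = greeting + "\nworld";

// The derived value must be parsed again before it counts as clean.
let rejected = false;
try {
  cleanText(joined);
} catch {
  rejected = true;
}
console.log(rejected); // true
```

This is usually the desired behavior: concatenation can reintroduce control characters, so the compiler is right to stop treating the result as clean.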
The class approach (runtime enforcement)
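class CleanText {
  // A #private field makes the class behave nominally: a plain
  // { value: "..." } object literal is not assignable to CleanText.
  readonly #value: string;
  constructor(raw: string) {
    if (/\p{Cc}/u.test(raw)) {
      throw new Error("control characters not allowed");
    }
    this.#value = raw;
  }
  get value(): string {
    return this.#value;
  }
}
function saveComment(text: CleanText) {
  db.save(text.value); // db is assumed to be defined elsewhere
}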
Why \p{Cc} over \x00-\x1F
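The Unicode property escape \p{Cc} (which requires the u flag) matches every character in the Unicode Control category: U+0000 to U+001F and U+007F to U+009F.[7][13] A hand-written range like [\x00-\x1F\x7F] misses the C1 controls \x80-\x9F. Note that \p{Cc} does not cover invisible format characters such as U+200B (zero width space), which belong to category Cf; add \p{Cf} to the pattern if those should be rejected too.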
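The difference is easy to check directly. This short sketch compares the two patterns on a C1 control character, then on a format character from category Cf (invisible characters such as the zero width space), which neither pattern matches. The variable names are illustrative.

```typescript
const asciiRange = /[\x00-\x1F\x7F]/;
const unicodeCc = /\p{Cc}/u;

// U+0085 NEXT LINE is a C1 control character.
const nextLine = "a\u0085b";
console.log(asciiRange.test(nextLine)); // false
console.log(unicodeCc.test(nextLine)); // true

// U+200B ZERO WIDTH SPACE is a format character (Cf), not a control character.
const zeroWidth = "a\u200Bb";
console.log(unicodeCc.test(zeroWidth)); // false

// Combining both categories catches it.
const controlOrFormat = /[\p{Cc}\p{Cf}]/u;
console.log(controlOrFormat.test(zeroWidth)); // true
```

Which categories to reject is a policy decision. Some Cf characters are needed for correct rendering of certain scripts and emoji sequences, so rejecting all of them may be too strict for free-form text.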
Comparison
| Concern | string + regex | Branded / class type |
|---|---|---|
| Forgot to validate? | Silent bug | Compile-time error[9] |
| Validation location | Every call site | Once, at construction[11] |
| Self-documenting | No | CleanText in signature[10] |
| Correctness | Easy to miss chars | \p{Cc} covers Unicode[7] |
| Raw string sneaks in? | Yes | No |
The brand makes the compiler enforce what regex leaves to discipline.[9]
Parse, Don't Validate
"Parse, don't validate" is a design principle originally articulated by Alexis King[14] that says: instead of repeatedly checking (validating) raw input data and then still passing around the same raw, potentially-invalid structures, you should transform that data once into a richer type that cannot represent invalid states, and then only work with that type thereafter.[15]
Core idea
- Validate-as-you-go (bad pattern): You keep raw data (strings, loose maps, JSON blobs) and sprinkle if checks everywhere to ensure it's "still valid" before each use.[11]
- Parse-then-trust (preferred): At the boundary (e.g., HTTP request, file, DB row), you parse that data into a dedicated type whose constructor enforces all the rules; once you have that value, the rest of your code can assume it's correct without more checks.[14]
Simple example
Instead of:
// Raw data everywhere; age might be invalid
function handleUser(input: any) {
  if (typeof input.age !== "number" || input.age < 0 || input.age > 130) {
    throw new Error("Invalid age");
  }
  // Later…
  if (input.age < 18) { /* ... */ }
}
You create a type and parse into it:
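class Age {
  // The private constructor forces every Age to be created through parse().
  private constructor(public readonly value: number) {}
  static parse(n: unknown): Age {
    // NaN has typeof "number" and passes both range comparisons, so reject it explicitly.
    if (typeof n !== "number" || Number.isNaN(n) || n < 0 || n > 130) {
      throw new Error("Invalid age");
    }
    return new Age(n);
  }
}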
function handleUser(input: any) {
  const age = Age.parse(input.age);
  if (age.value < 18) { /* ... no further validity checks needed */ }
}
Now the only way to construct an Age is through Age.parse, so any Age the rest of the system receives has already passed the check. This eliminates "shotgun validation" and many latent bugs.[14][15]
Why people like it
- Fail early at boundaries: Bad data is rejected before touching business logic, so you avoid partial updates and rollback headaches.[16]
- Simpler core logic: Most functions deal only with already-parsed, guaranteed-valid types, so they can focus purely on behavior.[17]
- Better invariants: By making illegal states unrepresentable in your types, you encode rules into the type system instead of scattered conditionals.[14]
Relationship between parsing and validation
You still validate, but you do it inside the parsing step and only once, at the edges. A good mental model: parsing = validation + transformation into a trustworthy type. After that, you work with the parsed type, not the original input.[14][16]
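The "validation + transformation" model can be sketched for the control-character case as follows. The `Username` type and the `parseUsername` and `greet` functions are hypothetical, and the specific rules (trimming, NFC normalization, non-empty) are assumptions chosen for illustration.

```typescript
type Username = string & { readonly __brand: "Username" };

// Hypothetical boundary parser: validates and normalizes in a single step.
function parseUsername(raw: unknown): Username {
  if (typeof raw !== "string") {
    throw new Error("username must be a string");
  }
  // Validate before trimming, so a trailing "\n" is rejected rather than silently removed.
  if (/\p{Cc}/u.test(raw)) {
    throw new Error("username must not contain control characters");
  }
  // Transformation: trim surrounding whitespace and apply Unicode NFC normalization.
  const normalized = raw.trim().normalize("NFC");
  if (normalized.length === 0) {
    throw new Error("username must not be empty");
  }
  return normalized as Username;
}

// Core logic accepts only the parsed type and needs no further checks.
function greet(name: Username): string {
  return `Welcome, ${name}!`;
}

console.log(greet(parseUsername("  ada  "))); // Welcome, ada!
```

The returned value differs from the input (it is trimmed and normalized), which is what separates parsing from a validation function that only returns true or false.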
References
1. OWASP Top 10:2025, A05 Injection
2. Injection Prevention Cheat Sheet (OWASP)
3. CRLF Injection (OWASP Foundation)
4. What is CRLF Injection (Imperva)
5. TypeScript Handbook, Literal Types
6. TypeScript Handbook, Template Literal Types
7. Unicode Character Class Escape \p{...} (MDN Web Docs)
8. Zod: TypeScript-first Schema Validation
9. Branded Types in TypeScript (egghead.io)
10. Branded Types (Learning TypeScript)
11. Parse, Don't Validate (DevIQ)
12. Nominal Typing Techniques in TypeScript (Michal Zalecki)
13. Unicode Property Escapes in JavaScript Regular Expressions (Mathias Bynens)
14. Parse, Don't Validate (Alexis King)
15. Parse, Don't Validate (Yevhen Tytov, LinkedIn)
16. What "Parse, Don't Validate" Means in Python (Bite Code)
17. Parse, Don't Validate: Embracing Data Integrity in Elixir (DEV Community)