How Does TypeScript Handle Control Characters?

Why Does It Matter?

Enterprise applications carry far higher security stakes than side projects, so injection attacks deserve attention.[1][2] Control characters such as CR (\r) and LF (\n) are a common attack vector: CRLF injection can lead to HTTP response splitting, session hijacking, and log poisoning.[3][4] Understanding how TypeScript treats control characters at the type level is the first step toward building safe input handling.
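
To make the log-poisoning risk concrete, here is a minimal sketch. The formatLogLine and escapeControlChars helpers are hypothetical names invented for this illustration, not part of any library, and escaping is only one possible mitigation.

```typescript
// Hypothetical logger helper: one log entry is expected to occupy one line.
function formatLogLine(user: string, action: string): string {
  return `user=${user} action=${action}`;
}

// Hypothetical mitigation: replace each control character with a visible \uXXXX escape.
function escapeControlChars(s: string): string {
  return s.replace(/\p{Cc}/gu, (c) =>
    "\\u" + ("000" + c.charCodeAt(0).toString(16)).slice(-4),
  );
}

// Attacker-controlled value that smuggles a CRLF pair into the log.
const attackerInput = "alice\r\nuser=admin action=delete-all";

const poisoned = formatLogLine(attackerInput, "login");
const escaped = formatLogLine(escapeControlChars(attackerInput), "login");

console.log(poisoned.split(/\r?\n/).length); // 2: the forged entry appears as its own line
console.log(escaped.split(/\r?\n/).length);  // 1: the CR and LF are now literal text
```

The same idea applies to HTTP headers: any place where a line break is a record separator is a place where raw CR or LF in user input changes the structure of the output.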

How TypeScript "Interprets" Control Characters

In TypeScript, control characters such as \0, \n, \t, and \r are ordinary members of the string type.[5] The type system makes no special distinction for them.

However, there are some interesting nuances:

1. String literal types can include escape sequences[5]

type Newline = "\n";        // valid literal type
type Tab = "\t";             // valid literal type
type Null = "\0";            // valid literal type
const a: Newline = "\n";    // OK
const b: Newline = "\t";    // Error

2. Unicode escapes work in literal types too[7]

type Bell = "\u0007";        // BEL control character
type Escape = "\u001B";      // ESC

3. TypeScript doesn't validate string content beyond literal matching

// There's no built-in way to say "a string with no control characters"
type SafeString = string;  // no narrower option natively

4. Template literal types can pattern-match a specific control character, but cannot express a character range[6]

// Matching one known character works:
type HasNewline<S extends string> =
  S extends `${infer _}\n${infer __}` ? true : false;

type Test1 = HasNewline<"hello\nworld">;  // true
type Test2 = HasNewline<"hello world">;   // false

5. At runtime, all control characters are valid string values

const s: string = "\x00\x01\x02";  // perfectly valid
s.length;  // 3
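
As a sketch of how far point 4 can be pushed, a union of individual characters lets a template literal type check for several control characters at once. The ControlChar union, the HasControlChar type, and the literalWithoutControlChars helper are hypothetical names for this illustration, and the union lists only a handful of characters. The check applies only to string literals known at compile time; a value of plain type string passes through unchecked.

```typescript
// A few control characters chosen for illustration; extend the union as needed.
type ControlChar = "\0" | "\t" | "\n" | "\r" | "\u001B" | "\u007F";

// true when the literal S contains at least one character from the union.
type HasControlChar<S extends string> =
  S extends `${string}${ControlChar}${string}` ? true : false;

// These assignments compile only if the type-level answers are as annotated.
const containsTab: HasControlChar<"a\tb"> = true;
const containsNone: HasControlChar<"ab"> = false;

// Hypothetical helper: a literal containing a listed character makes the
// parameter type collapse to never, so the call fails to compile.
function literalWithoutControlChars<S extends string>(
  s: HasControlChar<S> extends true ? never : S,
): S {
  return s;
}

const greeting = literalWithoutControlChars("hello world"); // OK
// literalWithoutControlChars("hello\nworld");              // compile-time error

console.log(greeting, containsTab, containsNone);
```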

The key takeaway: TypeScript treats control characters as ordinary string content. You can use them in string literal types for exact matching, but there is no built-in type that excludes or requires them. To enforce "no control characters" on arbitrary strings, the check has to happen at runtime, and the type system can then record its result: use a runtime validation layer (like Zod[8] or a branded type with a type guard[9]).
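
A minimal sketch of the branded-type-plus-type-guard option follows. The names NoControlChars, isNoControlChars, buildHeader, and tryBuildHeader are assumptions made for this example, not an established API.

```typescript
// Hypothetical brand for this sketch; the property exists only at the type level.
type NoControlChars = string & { readonly __brand: "NoControlChars" };

// Type guard: the runtime regex check narrows string to NoControlChars.
function isNoControlChars(s: string): s is NoControlChars {
  return !/\p{Cc}/u.test(s);
}

// Hypothetical consumer that only accepts validated values.
function buildHeader(value: NoControlChars): string {
  return `X-Note: ${value}`;
}

function tryBuildHeader(input: string): string | undefined {
  // In the true branch of the conditional, input has type NoControlChars.
  return isNoControlChars(input) ? buildHeader(input) : undefined;
}

console.log(tryBuildHeader("hello"));                     // "X-Note: hello"
console.log(tryBuildHeader("hello\r\nSet-Cookie: x=1"));  // undefined
```

A type guard suits call sites that want to branch on validity; a throwing constructor function suits call sites where invalid input is exceptional.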

Why Use Types Over Regex?

The regex approach uses a pattern like /[\x00-\x1F\x7F]/ to detect control characters and reject the input. The type approach creates a newtype or validated type that guarantees the value contains no control characters, enforced at construction time.[9][10]

Key reasons to prefer types

  1. Compile-time guarantees — Once you have a value of the validated type, you know it's clean everywhere it's used. With regex, you have to remember to validate at every entry point.[9]
  2. Single point of validation — The type's constructor validates once, whereas regex validation tends to scatter throughout the codebase and get duplicated.[11]
  3. Self-documenting — A type name like SafeString or NoControlChars makes the intent clear in function signatures, while a regex hidden in validation logic is easy to overlook.[10]
  4. Composability — Types integrate naturally with the type system, so functions accepting SafeString can't accidentally receive unvalidated input. TypeScript uses structural typing, but branded types simulate nominal typing to prevent accidental interchangeability.[12]
  5. Regex pitfalls — Control character regexes are error-prone: easy to miss edge cases, Unicode control characters, off-by-one errors in ranges, and different regex engines behave inconsistently.[7][13]
  6. Parse, don't validate — Using types to make invalid states unrepresentable aligns with the parse-don't-validate principle.[14]
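
To illustrate the composability point, the sketch below shows two brands over string that the compiler refuses to interchange, even though both are plain strings at runtime. The Brand helper, the Slug type, and the function names are hypothetical and exist only for this example.

```typescript
// Generic brand helper (hypothetical); the __brand property is never set at runtime.
type Brand<T, Name extends string> = T & { readonly __brand: Name };

type CleanText = Brand<string, "CleanText">;
type Slug = Brand<string, "Slug">;

// Stand-in constructors for this sketch: each validates once, then brands.
function toCleanText(raw: string): CleanText {
  if (/\p{Cc}/u.test(raw)) throw new Error("control characters not allowed");
  return raw as CleanText;
}
function toSlug(raw: string): Slug {
  if (!/^[a-z0-9-]+$/.test(raw)) throw new Error("not a slug");
  return raw as Slug;
}

function renderTitle(title: CleanText): string {
  return title.toUpperCase(); // a branded string still has every string method
}

const title = toCleanText("Release notes");
const slug = toSlug("release-notes");

console.log(renderTitle(title)); // "RELEASE NOTES"
// renderTitle(slug);            // type error: Slug is not assignable to CleanText
// renderTitle("Release notes"); // type error: string is not assignable to CleanText
const plain: string = slug;      // OK: a branded value is still usable as a string
console.log(plain);
```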

Types Over Regex in Practice

The regex problem

function saveComment(text: string) {
  if (/[\x00-\x1F\x7F]/.test(text)) {
    throw new Error("control characters not allowed");
  }
  // ... save it
}

string is string everywhere — nothing stops unvalidated input from slipping through.[5]

The branded type approach[9]

type CleanText = string & { readonly __brand: unique symbol };

function cleanText(raw: string): CleanText {
  if (/[\p{Cc}]/u.test(raw)) {
    throw new Error("control characters not allowed");
  }
  return raw as CleanText;
}

Now functions declare what they need:

function saveComment(text: CleanText) {
  // guaranteed clean — can't pass a raw string
}

saveComment("hello");              // Type error
saveComment(cleanText("hello"));   // OK
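
Where throwing is inconvenient, the same brand can be produced by a function that returns a result object instead. This is a sketch of one possible shape; CleanTextResult and tryCleanText are hypothetical names, and the error message format is an assumption.

```typescript
type CleanText = string & { readonly __brand: unique symbol };

// Discriminated union: callers must check ok before touching value.
type CleanTextResult =
  | { ok: true; value: CleanText }
  | { ok: false; error: string };

function tryCleanText(raw: string): CleanTextResult {
  const match = /\p{Cc}/u.exec(raw);
  if (match !== null) {
    // Report the first offending character as U+XXXX along with its position.
    const hex = ("000" + match[0].charCodeAt(0).toString(16).toUpperCase()).slice(-4);
    return { ok: false, error: `control character U+${hex} at index ${match.index}` };
  }
  return { ok: true, value: raw as CleanText };
}

const good = tryCleanText("hello");
const bad = tryCleanText("hello\nworld");

console.log(good.ok);                // true
if (!bad.ok) console.log(bad.error); // "control character U+000A at index 5"
```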

The class approach (runtime enforcement)

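class CleanText {
  // A private field makes the class nominal, so a plain { value: "..." }
  // object literal cannot be passed where a CleanText is expected.
  private readonly text: string;

  constructor(raw: string) {
    if (/\p{Cc}/u.test(raw)) {
      throw new Error("control characters not allowed");
    }
    this.text = raw;
  }

  get value(): string {
    return this.text;
  }
}

function saveComment(text: CleanText) {
  // db stands for the application's persistence layer, defined elsewhere
  db.save(text.value);
}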

Why \p{Cc} over \x00-\x1F

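The Unicode property escape \p{Cc} (which requires the u flag) matches every character in the Unicode Control category: the C0 controls U+0000-U+001F, DEL (U+007F), and the C1 controls U+0080-U+009F.[7][13] A hand-written range like [\x00-\x1F\x7F] misses the C1 block \x80-\x9F. Note that \p{Cc} does not cover invisible format characters such as zero-width spaces or bidirectional overrides (category Cf), nor the line and paragraph separators U+2028 and U+2029; add those explicitly if your threat model includes them.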
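
The difference is easy to check directly. In the sketch below, the broader pattern is an assumption added for comparison, not a recommendation for every input: rejecting category Cf also rejects the zero-width joiner used in some emoji sequences.

```typescript
const asciiRange = /[\x00-\x1F\x7F]/;
const unicodeCc = /\p{Cc}/u;
// Assumed broader class for this sketch: Cc plus format characters (Cf)
// and the line and paragraph separators (Zl, Zp).
const broader = /[\p{Cc}\p{Cf}\p{Zl}\p{Zp}]/u;

const samples: Array<[string, string]> = [
  ["LF   (U+000A)", "\n"],
  ["NEL  (U+0085)", "\u0085"],
  ["ZWSP (U+200B)", "\u200B"],
  ["RLO  (U+202E)", "\u202E"],
  ["LS   (U+2028)", "\u2028"],
];

for (const [name, ch] of samples) {
  console.log(name, asciiRange.test(ch), unicodeCc.test(ch), broader.test(ch));
}
// LF   (U+000A) true  true  true
// NEL  (U+0085) false true  true
// ZWSP (U+200B) false false true
// RLO  (U+202E) false false true
// LS   (U+2028) false false true
```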

Comparison

| Concern               | string + regex     | Branded / class type       |
|-----------------------|--------------------|----------------------------|
| Forgot to validate?   | Silent bug         | Compile-time error[9]      |
| Validation location   | Every call site    | Once, at construction[11]  |
| Self-documenting      | No                 | CleanText in signature[10] |
| Correctness           | Easy to miss chars | \p{Cc} covers Unicode[7]   |
| Raw string sneaks in? | Yes                | No                         |

The brand makes the compiler enforce what regex leaves to discipline.[9]

Parse, Don't Validate

"Parse, don't validate" is a design principle originally articulated by Alexis King[14] that says: instead of repeatedly checking (validating) raw input data and then still passing around the same raw, potentially-invalid structures, you should transform that data once into a richer type that cannot represent invalid states, and then only work with that type thereafter.[15]

Simple example

Instead of:

// Raw data everywhere; age might be invalid
function handleUser(input: any) {
  if (typeof input.age !== "number" || input.age < 0 || input.age > 130) {
    throw new Error("Invalid age");
  }
  // Later…
  if (input.age < 18) { /* ... */ }
}

You create a type and parse into it:

class Age {
  private constructor(public readonly value: number) {}
  static parse(n: unknown): Age {
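    // NaN passes both range comparisons, so reject non-finite numbers explicitly
    if (typeof n !== "number" || !Number.isFinite(n) || n < 0 || n > 130) {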
      throw new Error("Invalid age");
    }
    return new Age(n);
  }
}

function handleUser(input: any) {
  const age = Age.parse(input.age);
  if (age.value < 18) { /* ... no further validity checks needed */ }
}

Now the rest of the system cannot even represent an invalid age without failing at construction, which eliminates "shotgun validation" and many latent bugs.[14][15]

Relationship between parsing and validation

You still validate, but you do it inside the parsing step and only once, at the edges. A good mental model: parsing = validation + transformation into a trustworthy type. After that, you work with the parsed type, not the original input.[14][16]
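
As a sketch of "validation + transformation" applied to this article's topic, the parser below accepts unknown input, normalizes it, and returns a branded type. DisplayName and parseDisplayName are hypothetical names, and the specific rules (NFC normalization, trimming, non-empty) are assumptions chosen for illustration.

```typescript
type DisplayName = string & { readonly __brand: "DisplayName" };

function parseDisplayName(raw: unknown): DisplayName {
  if (typeof raw !== "string") {
    throw new TypeError("display name must be a string");
  }
  // Transformation: canonical Unicode form, surrounding whitespace removed.
  // trim() also drops leading and trailing tabs and line breaks.
  const normalized = raw.normalize("NFC").trim();
  // Validation: whatever remains must be non-empty and free of control characters.
  if (normalized.length === 0) {
    throw new Error("display name must not be empty");
  }
  if (/\p{Cc}/u.test(normalized)) {
    throw new Error("display name must not contain control characters");
  }
  return normalized as DisplayName;
}

// Downstream code takes DisplayName and never re-checks it.
function greet(name: DisplayName): string {
  return `Welcome, ${name}!`;
}

console.log(greet(parseDisplayName("  Ada  "))); // "Welcome, Ada!"
```

The returned value differs from the input (it is normalized and trimmed), which is the sense in which parsing produces a new, trustworthy value rather than a verdict about the old one.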

References

  1. OWASP Top 10:2025 — A05 Injection
  2. Injection Prevention Cheat Sheet — OWASP
  3. CRLF Injection — OWASP Foundation
  4. What is CRLF Injection — Imperva
  5. TypeScript Handbook — Literal Types
  6. TypeScript Handbook — Template Literal Types
  7. Unicode Character Class Escape \p{...} — MDN Web Docs
  8. Zod — TypeScript-first Schema Validation
  9. Branded Types in TypeScript — egghead.io
  10. Branded Types — Learning TypeScript
  11. Parse, Don't Validate — DevIQ
  12. Nominal Typing Techniques in TypeScript — Michal Zalecki
  13. Unicode Property Escapes in JavaScript Regular Expressions — Mathias Bynens
  14. Parse, Don't Validate — Alexis King
  15. Parse, Don't Validate — Yevhen Tytov (LinkedIn)
  16. What "Parse, Don't Validate" Means in Python — Bite Code
  17. Parse, Don't Validate: Embracing Data Integrity in Elixir — DEV Community