How Does TypeScript Handle Control Characters?

2026-02-23

Why Does It Matter?

In enterprise applications, where a security flaw costs far more than it does in a side project, injection attacks deserve attention.^[1]^[2] Control characters such as CR (\r) and LF (\n) are a common attack vector: CRLF injection can lead to HTTP response splitting, session hijacking, and log poisoning.^[3]^[4] Understanding how TypeScript treats control characters at the type level is the first step toward building safe input handling.
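To make the log-poisoning case concrete, here is a minimal sketch. The formatLogEntry helper and the log format are hypothetical and exist only for illustration; the point is that an unchecked CR LF pair in user input turns one log entry into two.

```typescript
// Hypothetical logger helper: intended to produce exactly one line per entry.
function formatLogEntry(user: string, action: string): string {
  return `user=${user} action=${action}`;
}

// Attacker-controlled input that smuggles in a CR LF pair.
const attackerInput = "mallory\r\nuser=admin action=delete-all";

const entry = formatLogEntry(attackerInput, "login");

// A line-oriented log reader now sees two entries instead of one.
const logLines = entry.split("\r\n");
console.log(logLines.length); // 2
console.log(logLines[1]);     // user=admin action=delete-all action=login
```

The second line looks like a genuine entry for another user, which is why the input has to be rejected or escaped before it reaches the log.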

How TypeScript "Interprets" Control Characters

In TypeScript, control characters such as \0, \n, \t, and \r are ordinary members of the string type.^[5] The type system makes no special distinction for them.

However, there are some interesting nuances:

1. String literal types can include escape sequences^[6]

type Newline = "\n";        // valid literal type
type Tab = "\t";             // valid literal type
type Null = "\0";            // valid literal type
const a: Newline = "\n";    // OK
const b: Newline = "\t";    // Error

2. Unicode escapes work in literal types too^[7]

type Bell = "\u0007";        // BEL control character
type Escape = "\u001B";      // ESC

3. TypeScript doesn't validate string content beyond literal matching

// There's no built-in way to say "a string with no control characters"
type SafeString = string;  // no narrower option natively

4. Template literal types can pattern-match a specific control character, but not a whole class of them^[6]

// Matching one known character works, but only when S is a literal type:
type HasNewline<S extends string> =
  S extends `${infer _}\n${infer __}` ? true : false;

type Test1 = HasNewline<"hello\nworld">;  // true
type Test2 = HasNewline<"hello world">;   // false
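The same pattern extends to a small, hand-picked set of characters by placing a union inside the template. The sketch below is illustrative only: CommonControlChar and HasCommonControlChar are names invented here, and the union covers four characters rather than the whole control range, which is impractical to enumerate this way.

```typescript
// Illustrative subset only; this is not a complete list of control characters.
type CommonControlChar = "\0" | "\t" | "\n" | "\r";

// A union inside a template literal type expands to one pattern per member,
// so this matches any string literal containing at least one listed character.
type HasCommonControlChar<S extends string> =
  S extends `${string}${CommonControlChar}${string}` ? true : false;

type T1 = HasCommonControlChar<"a\tb">; // true
type T2 = HasCommonControlChar<"a b">;  // false
type T3 = HasCommonControlChar<string>; // false: a non-literal string cannot be inspected

// This tuple only compiles if the three types evaluate as the comments claim.
const expectations: [T1, T2, T3] = [true, false, false];
console.log(expectations.join(","));
```

Note the T3 case: for values typed as plain string, such as anything read at runtime, the check reports false even if the value contains a control character, so it cannot replace runtime validation.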

5. At runtime, all control characters are valid string values

const s: string = "\x00\x01\x02";  // perfectly valid
s.length;  // 3

The key takeaway: TypeScript treats control characters as ordinary string content. You can use them in string literal types for exact matching, but there is no built-in type that excludes or requires them. To carry a "no control characters" guarantee through the type system, pair a runtime check with a distinct type, for example a Zod schema^[8] or a branded type with a type guard^[9].
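As a rough sketch of the type-guard option, the runtime check and the type can be tied together with a user-defined type guard. The names NoControlChars, isNoControlChars, and describeInput are made up for this example, and the pattern assumes a compile target that supports Unicode property escapes (ES2018 or later).

```typescript
// The brand exists only at the type level; at runtime the value is a plain string.
type NoControlChars = string & { readonly __brand: unique symbol };

// Type guard: when it returns true, the compiler narrows value to NoControlChars.
function isNoControlChars(value: string): value is NoControlChars {
  return !/\p{Cc}/u.test(value);
}

function describeInput(raw: string): string {
  if (isNoControlChars(raw)) {
    const safe: NoControlChars = raw; // narrowed by the guard, no cast needed
    return `accepted ${safe.length} characters`;
  }
  return "rejected: contains control characters";
}

console.log(describeInput("hello"));    // accepted 5 characters
console.log(describeInput("a\u0007b")); // rejected: contains control characters
```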

Why Use Types Over Regex?

The regex approach tests each input against a pattern such as /[\x00-\x1F\x7F]/ and rejects any string that matches. The type approach creates a newtype or validated type that guarantees no control characters exist in the value, enforced at construction time.^[9]^[10]

Key reasons to prefer types

Compile-time guarantees — Once you have a value of the validated type, you know it's clean everywhere it's used. With regex, you have to remember to validate at every entry point.^[9]
Single point of validation — The type's constructor validates once, whereas regex validation tends to scatter throughout the codebase and get duplicated.^[11]
Self-documenting — A type name like SafeString or NoControlChars makes the intent clear in function signatures, while a regex hidden in validation logic is easy to overlook.^[10]
Composability — Types integrate naturally with the type system, so functions accepting SafeString can't accidentally receive unvalidated input. TypeScript uses structural typing, but branded types simulate nominal typing to prevent accidental interchangeability.^[12]
Regex pitfalls — Control character regexes are error-prone: easy to miss edge cases, Unicode control characters, off-by-one errors in ranges, and different regex engines behave inconsistently.^[7]^[13]
Parse, don't validate — Using types to make invalid states unrepresentable aligns with the parse-don't-validate principle.^[14]

Types Over Regex in Practice

The regex problem

function saveComment(text: string) {
  if (/[\x00-\x1F\x7F]/.test(text)) {
    throw new Error("control characters not allowed");
  }
  // ... save it
}

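The check works, but the parameter is still a plain string. Every other function that accepts a string has no way to tell whether the check ran, so nothing stops unvalidated input from slipping through.^[5]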

The branded type approach^[9]

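// The brand exists only at the type level; at runtime a CleanText is a plain string.
type CleanText = string & { readonly __brand: unique symbol };

// The only place where a raw string becomes a CleanText.
function cleanText(raw: string): CleanText {
  // \p{Cc} matches any Unicode control character and requires the u flag.
  if (/\p{Cc}/u.test(raw)) {
    throw new Error("control characters not allowed");
  }
  return raw as CleanText; // the cast is justified by the check above
}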

Now functions declare what they need:

function saveComment(text: CleanText) {
  // guaranteed clean — can't pass a raw string
}

saveComment("hello");              // Type error
saveComment(cleanText("hello"));   // OK

The class approach (runtime enforcement)

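class CleanText {
  // The #private field makes the class nominal: a plain { value: "..." }
  // object literal is not assignable to CleanText.
  readonly #value: string;

  constructor(raw: string) {
    if (/\p{Cc}/u.test(raw)) {
      throw new Error("control characters not allowed");
    }
    this.#value = raw;
  }

  get value(): string {
    return this.#value;
  }
}

function saveComment(text: CleanText) {
  db.save(text.value); // db is assumed to be defined elsewhere
}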

Why `\p{Cc}` over `\x00-\x1F`

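The Unicode property escape \p{Cc} (which requires the u flag) matches every character in the Unicode Control category: the C0 controls \x00-\x1F, DEL (\x7F), and the C1 controls \x80-\x9F.^[7]^[13] A hand-written range like [\x00-\x1F\x7F] misses the C1 block. Neither pattern covers invisible format characters such as U+200B (zero-width space) or U+202E (right-to-left override); those belong to \p{Cf} and need a separate check if they matter for your input.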
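The difference is easy to see on two sample strings. This is a small sketch, assuming a compile target of ES2018 or later so that property escapes are available; the variable names are arbitrary.

```typescript
const asciiRange = /[\x00-\x1F\x7F]/; // hand-written ASCII-only range
const unicodeCc = /\p{Cc}/u;          // Unicode Control category

const withNextLine = "a\u0085b";      // U+0085 NEL, a C1 control character
const withRtlOverride = "a\u202Eb";   // U+202E, a format character (Cf), not Cc

const asciiSeesNel = asciiRange.test(withNextLine);   // false: C1 is outside the range
const ccSeesNel = unicodeCc.test(withNextLine);       // true
const ccSeesRtl = unicodeCc.test(withRtlOverride);    // false: U+202E is not in Cc
const ccCfSeesRtl = /[\p{Cc}\p{Cf}]/u.test(withRtlOverride); // true

console.log(asciiSeesNel, ccSeesNel, ccSeesRtl, ccCfSeesRtl);
```

Whether to add \p{Cf} depends on the input: format characters are legitimate in some scripts and in emoji sequences, so rejecting them outright is a policy decision rather than a default.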

Comparison

Concern	`string` + regex	Branded / class type
Forgot to validate?	Silent bug	Compile-time error^[9]
Validation location	Every call site	Once, at construction^[11]
Self-documenting	No	`CleanText` in signature^[10]
Correctness	Easy to miss chars	`\p{Cc}` covers C0, DEL, and C1 controls^[7]
Raw `string` sneaks in?	Yes	Not without an explicit cast

The brand makes the compiler enforce what regex leaves to discipline.^[9]

Parse, Don't Validate

"Parse, don't validate" is a design principle articulated by Alexis King.^[14] Instead of repeatedly checking (validating) raw input and then still passing around the same raw, potentially invalid structures, you transform the data once into a richer type that cannot represent invalid states, and work only with that type thereafter.^[15]

Core idea

Validate-as-you-go (bad pattern): You keep raw data (strings, loose maps, JSON blobs) and sprinkle if checks everywhere to ensure it's "still valid" before each use.^[11]
Parse-then-trust (preferred): At the boundary (e.g., HTTP request, file, DB row), you parse that data into a dedicated type whose constructor enforces all the rules; once you have that value, the rest of your code can assume it's correct without more checks.^[14]
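Applied to control characters, parsing at the boundary might look like the sketch below. Everything here is hypothetical: the CommentInput shape, the ParseResult type, and the function names are invented for illustration, and returning a result object instead of throwing is just one possible design.

```typescript
type CleanText = string & { readonly __brand: unique symbol };

// Hypothetical shape of a parsed comment; both fields are already validated.
interface CommentInput {
  readonly author: CleanText;
  readonly body: CleanText;
}

type ParseResult<T> = { ok: true; value: T } | { ok: false; error: string };

function parseCleanText(raw: unknown, field: string): ParseResult<CleanText> {
  if (typeof raw !== "string") {
    return { ok: false, error: `${field}: expected a string` };
  }
  if (/\p{Cc}/u.test(raw)) {
    return { ok: false, error: `${field}: control characters not allowed` };
  }
  return { ok: true, value: raw as CleanText };
}

// Boundary function: untyped payload in, fully typed value (or an error) out.
function parseComment(payload: unknown): ParseResult<CommentInput> {
  if (typeof payload !== "object" || payload === null) {
    return { ok: false, error: "payload: expected an object" };
  }
  const record = payload as Record<string, unknown>;
  const author = parseCleanText(record["author"], "author");
  if (!author.ok) return author;
  const body = parseCleanText(record["body"], "body");
  if (!body.ok) return body;
  return { ok: true, value: { author: author.value, body: body.value } };
}

const accepted = parseComment(JSON.parse('{"author":"ada","body":"hello"}'));
const refused = parseComment(JSON.parse('{"author":"ada","body":"line1\\nline2"}'));
console.log(accepted.ok, refused.ok); // true false
```

Code past this boundary can take CommentInput as a parameter and never look at the raw payload again.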

Simple example

Instead of:

// Raw data everywhere; age might be invalid
function handleUser(input: any) {
  if (typeof input.age !== "number" || Number.isNaN(input.age) || input.age < 0 || input.age > 130) {
    throw new Error("Invalid age");
  }
  // Later…
  if (input.age < 18) { /* ... */ }
}

You create a type and parse into it:

class Age {
  private constructor(public readonly value: number) {}
  static parse(n: unknown): Age {
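    // NaN has typeof "number" and fails both range comparisons, so reject it explicitly.
    if (typeof n !== "number" || Number.isNaN(n) || n < 0 || n > 130) {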
      throw new Error("Invalid age");
    }
    return new Age(n);
  }
}

function handleUser(input: any) {
  const age = Age.parse(input.age);
  if (age.value < 18) { /* ... no further validity checks needed */ }
}

Now the rest of the system never sees an invalid age: Age.parse is the only way to obtain an Age, and it throws on bad input. This eliminates "shotgun validation" and many latent bugs.^[14]^[15]

Why people like it

Fail early at boundaries: Bad data is rejected before touching business logic, so you avoid partial updates and rollback headaches.^[16]
Simpler core logic: Most functions deal only with already-parsed, guaranteed-valid types, so they can focus purely on behavior.^[17]
Better invariants: By making illegal states unrepresentable in your types, you encode rules into the type system instead of scattered conditionals.^[14]

Relationship between parsing and validation

You still validate, but you do it inside the parsing step and only once, at the edges. A good mental model: parsing = validation + transformation into a trustworthy type. After that, you work with the parsed type, not the original input.^[14]^[16]
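A small sketch of "validation + transformation" for text input follows. The SingleLine type and parseSingleLine function are hypothetical, and trimming before checking is one possible normalization policy, not a recommendation for every field.

```typescript
type SingleLine = string & { readonly __brand: unique symbol };

function parseSingleLine(raw: string): SingleLine {
  // Transformation: drop surrounding whitespace, including a trailing newline.
  const trimmed = raw.trim();
  // Validation: whatever remains must be non-empty and free of control characters.
  if (trimmed.length === 0) {
    throw new Error("empty input");
  }
  if (/\p{Cc}/u.test(trimmed)) {
    throw new Error("control characters not allowed");
  }
  return trimmed as SingleLine;
}

const title = parseSingleLine("  hello\n");
console.log(JSON.stringify(title)); // "hello"
```

The returned value differs from the input (it is trimmed) and carries a type that records the check, which is what separates parsing from a validation that only returns a boolean.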