Transformers

Transformers are the core building blocks of Skifta. Each transformer defines how to mask, replace, or generate data for specific column types. They're designed to be flexible, configurable, and composable.

What Are Transformers?

A transformer is a rule that tells Skifta how to modify values in a specific column. For example:

  • Email transformer: Replaces a real email address with a generated, valid-looking one
  • Name transformer: Replaces Alice Johnson with Emma Williams
  • Credit Card transformer: Replaces 4532-1234-5678-9010 with valid-format test card numbers
  • Static transformer: Replaces any value with a fixed string like [REDACTED]

Transformers run locally on your machine and require no API keys or external services.

How Transformers Work

  1. You configure which columns to transform in skifta.toml
  2. Skifta parses your SQL dump and identifies INSERT statements
  3. Transformers apply the configured transformations to specified columns
  4. Skifta writes the output with masked data, preserving schema and relationships
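
The steps above can be sketched in a few lines of Python. This is an illustration only, not Skifta's implementation: `mask_line`, `redact`, and the `rules` mapping are hypothetical names, and the parser assumes single-row INSERT statements whose quoted values contain no commas.

```python
import re
from typing import Callable

Transformer = Callable[[str], str]

# Matches a simple single-row INSERT statement on one line.
INSERT_RE = re.compile(r"^INSERT INTO (\w+) \(([^)]*)\) VALUES \((.*)\);$")


def redact(_value: str) -> str:
    """Toy transformer: replace any value with a fixed string."""
    return "[REDACTED]"


def mask_line(line: str, rules: dict[str, dict[str, Transformer]]) -> str:
    """Rewrite one INSERT line; pass every other line through unchanged."""
    match = INSERT_RE.match(line)
    if match is None:
        return line
    table, column_list, value_list = match.groups()
    columns = [c.strip() for c in column_list.split(",")]
    # Naive split: assumes quoted values contain no commas.
    values = [v.strip() for v in value_list.split(",")]
    table_rules = rules.get(table, {})
    masked = []
    for column, value in zip(columns, values):
        transformer = table_rules.get(column)
        # Only quoted string values are transformed in this sketch.
        if transformer is not None and value.startswith("'"):
            value = "'" + transformer(value.strip("'")) + "'"
        masked.append(value)
    return f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({', '.join(masked)});"


rules = {"users": {"email": redact}}
line = "INSERT INTO users (id, email) VALUES (1, 'alice@example.com');"
print(mask_line(line, rules))
# INSERT INTO users (id, email) VALUES (1, '[REDACTED]');
```

Only the configured column changes. The table name, column list, and untouched values are written back as they were, which is how the schema survives the transformation.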

Available Transformers

PII Masking Transformers

Perfect for masking personally identifiable information:

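| Transformer  | Use Case                             | Example                                   |
|--------------|--------------------------------------|-------------------------------------------|
| Email        | Mask email addresses                 | real address → generated address          |
| Name         | Replace real names with fake ones    | John Smith → Emma Williams                |
| Phone Number | Generate realistic phone numbers     | 555-1234 → 555-8901                       |
| Credit Card  | Replace with valid-format test cards | 4111-1111-1111-1111 → 4532-xxxx-xxxx-1234 |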

General Purpose Transformers

Flexible transformers for various use cases:

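| Transformer | Use Case                         | Example                             |
|-------------|----------------------------------|-------------------------------------|
| Static      | Replace with a fixed value       | Any value → [REDACTED]              |
| Obfuscate   | Scramble while preserving format | ABC123 → XYZ789                     |
| Lipsum      | Replace with Lorem Ipsum text    | Long text → Lorem ipsum placeholder |
| Date        | Shift dates by random intervals  | 2024-01-15 → 2024-02-03             |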
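
To make "scramble while preserving format" concrete, here is a minimal Python sketch. It is not Skifta's Obfuscate algorithm, and `scramble_preserving_format` is a hypothetical helper. It assumes that "format" means each character keeps its class (digit, uppercase letter, lowercase letter) and that punctuation stays in place.

```python
import random
import string


def scramble_preserving_format(value: str, rng: random.Random) -> str:
    """Replace each ASCII letter or digit with a random one of the same class."""
    out = []
    for ch in value:
        if ch in string.digits:
            out.append(rng.choice(string.digits))
        elif ch in string.ascii_uppercase:
            out.append(rng.choice(string.ascii_uppercase))
        elif ch in string.ascii_lowercase:
            out.append(rng.choice(string.ascii_lowercase))
        else:
            # Separators such as "-" are kept so the shape stays recognizable.
            out.append(ch)
    return "".join(out)


rng = random.Random()
print(scramble_preserving_format("ABC-123-xyz", rng))
```

The result has the same length and the same letter and digit layout as the input, so length checks and simple pattern validations in downstream code keep passing.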

Usage Examples

Transformers can be used in two ways: shorthand (simple) or object (advanced with options).

Basic Usage (Shorthand)

Use the transformer name directly for default behavior:

toml
[[table]]
name = "users"
columns = [
  { name = "email", transformer = "email" },
  { name = "full_name", transformer = "name" },
  { name = "phone", transformer = "phone-number" }
]

This applies each transformer with its default settings, which is enough for most use cases.

Advanced Usage (Object with Options)

Configure transformers with custom options:

toml
[[table]]
name = "users"
columns = [
  # Only generate first names
  { name = "first_name", transformer = { name = { first-name = true, last-name = false } } },

  # Replace with a specific static value
  { name = "ssn", transformer = { static = { value = "XXX-XX-XXXX" } } },

  # Obfuscate while preserving length
  { name = "internal_id", transformer = { obfuscate = { preserve_length = true } } }
]

ℹ️ Each transformer documents its available configuration options on its dedicated page. Click the transformer name above to see full details.

Real-World Example

Here's a complete configuration for a typical e-commerce database:

toml
dialect = "postgres"

[[table]]
name = "customers"
columns = [
  { name = "email", transformer = "email" },
  { name = "first_name", transformer = { name = { first-name = true, last-name = false } } },
  { name = "last_name", transformer = { name = { first-name = false, last-name = true } } },
  { name = "phone", transformer = "phone-number" },
  { name = "address", transformer = "lipsum" }
]

[[table]]
name = "orders"
columns = [
  { name = "shipping_notes", transformer = "lipsum" }
]

[[table]]
name = "payment_methods"
columns = [
  { name = "card_number", transformer = "credit-card" },
  { name = "cardholder_name", transformer = "name" }
]

Configuration Format

Skifta uses TOML for configuration. Columns can be defined using either:

1. Array of Inline Tables

toml
[[table]]
name = "users"
columns = [
  { name = "email", transformer = "email" },
  { name = "name", transformer = "name" }
]

2. Nested Tables

toml
[[table]]
name = "users"

[[table.columns]]
name = "email"
transformer = "email"

[[table.columns]]
name = "name"
transformer = "name"

Choose the style that best fits your tooling or preference.

Choosing the Right Transformer

Not sure which transformer to use? Here's a quick decision guide:

| Data Type                      | Recommended Transformer | Why                                       |
|--------------------------------|-------------------------|-------------------------------------------|
| Email addresses                | email                   | Generates valid-looking email addresses   |
| Person names                   | name                    | Realistic first/last name combinations    |
| Phone numbers                  | phone-number            | Valid phone number formats                |
| Credit cards                   | credit-card             | Valid test card numbers (pass Luhn check) |
| Addresses                      | lipsum                  | Replaces with placeholder text            |
| Dates of birth                 | date                    | Shifts dates while maintaining format     |
| Generic sensitive text         | lipsum or static        | Safe placeholder replacement              |
| Unique IDs you want to hide    | obfuscate               | Scrambles while preserving uniqueness     |
| Fields you want fully redacted | static                  | Simple, predictable redaction             |
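
The credit-card row mentions the Luhn check. For readers who want to verify masked card numbers themselves, here is a small standalone Python implementation of the standard Luhn algorithm. `passes_luhn` is an illustrative helper and not part of Skifta.

```python
def passes_luhn(card_number: str) -> bool:
    """Return True if the digits in card_number satisfy the Luhn checksum."""
    digits = [int(ch) for ch in card_number if ch in "0123456789"]
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


print(passes_luhn("4111-1111-1111-1111"))  # True
print(passes_luhn("4111-1111-1111-1112"))  # False
```

A number that passes this check is accepted by applications that validate card formats, which matters when you test against masked data.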

Transformer Best Practices

Preserve Referential Integrity

Skifta transformers generate deterministic output by default. This means the same input always produces the same output within a transformation run, preserving foreign key relationships:

toml
# user_id: 123 in both tables will transform to the same masked value
[[table]]
name = "users"
columns = [{ name = "user_id", transformer = "obfuscate" }]

[[table]]
name = "orders"
columns = [{ name = "user_id", transformer = "obfuscate" }]
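
One common way to get this kind of determinism is a keyed hash: the same input and key always give the same output, and a fresh key per run keeps the mapping stable within that run. The Python sketch below illustrates the idea only. It is not Skifta's algorithm, and `RUN_KEY` and `pseudonymize` are hypothetical names.

```python
import hashlib
import hmac
import secrets

# A fresh random key per run: mappings are stable within the run
# but cannot be reproduced by someone who does not hold the key.
RUN_KEY = secrets.token_bytes(32)


def pseudonymize(value: str, key: bytes = RUN_KEY) -> str:
    """Map a value to a stable 12-character token using HMAC-SHA256."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:12]


users_row = {"user_id": "123"}
orders_row = {"user_id": "123"}

masked_user = pseudonymize(users_row["user_id"])
masked_order = pseudonymize(orders_row["user_id"])

# The foreign key still joins because both sides got the same token.
print(masked_user == masked_order)  # True
```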

Test Your Configuration

Always test your transformation on a small sample before processing large dumps:

bash
# Test on the first 100 lines (make sure the cut does not split an INSERT statement)
head -n 100 production-dump.sql > sample.sql
skifta -i sample.sql -o test-output.sql

# Review the output
cat test-output.sql
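
Reading the output by eye works for small samples, and a short script can back it up. The Python sketch below lists any email addresses that appear in both the original and the masked text. `leaked_emails` and the regular expression are illustrative assumptions, not a Skifta feature, and the pattern is deliberately simple rather than a full email grammar.

```python
import re

# Simple pattern, good enough for spotting addresses in a dump.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def leaked_emails(original_sql: str, masked_sql: str) -> set[str]:
    """Return the addresses that appear in both the original and the masked text."""
    return set(EMAIL_RE.findall(original_sql)) & set(EMAIL_RE.findall(masked_sql))


original = "INSERT INTO users (id, email) VALUES (1, 'alice@example.com');"
masked = "INSERT INTO users (id, email) VALUES (1, 'user123@example.org');"

print(leaked_emails(original, masked))    # set()
print(leaked_emails(original, original))  # {'alice@example.com'}
```

To run it against real files, read `sample.sql` and `test-output.sql` with `pathlib.Path(...).read_text()` and pass both strings to the function. An empty set means no original address survived.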

Document Your Choices

Add comments to your skifta.toml to explain transformer choices:

toml
[[table]]
name = "users"
columns = [
  # GDPR: Must mask all personal identifiers
  { name = "email", transformer = "email" },
  { name = "name", transformer = "name" },

  # HIPAA: Dates shifted to preserve age ranges
  { name = "dob", transformer = "date" }
]

Learn More

  • Configuration Guide: Complete configuration reference
  • Quickstart Guide: Get started with your first transformation
  • Individual Transformer Pages: Click any transformer name above for detailed documentation and examples