
CI/CD Integration

Automate PII masking in your continuous integration and deployment pipelines. Skifta is a single binary, so it drops into any CI/CD system that can run a shell command; this page covers GitHub Actions, GitLab CI, Jenkins, CircleCI, and Docker.

Why Integrate Skifta in CI/CD?

  • Automated test data generation: Create safe, anonymized datasets for every test run
  • Pre-deployment sanitization: Mask production data before deploying to staging/dev
  • Compliance automation: Ensure no PII leaks through your pipeline
  • Consistent transformations: Same masking rules applied everywhere
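Whatever the platform, the recipes below reduce to the same three steps: export, mask, load. As a plain shell sketch (connection strings come from the environment; file names are illustrative):

```shell
# The dump -> mask -> restore flow that every pipeline below automates.
# DATABASE_URL / STAGING_DATABASE_URL are assumed to be set in the environment.
mask_pipeline() {
    pg_dump "$DATABASE_URL" > production-dump.sql        # 1. export production
    skifta -i production-dump.sql -o masked-dump.sql     # 2. mask PII
    psql "$STAGING_DATABASE_URL" < masked-dump.sql       # 3. load into staging
}
```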

GitHub Actions

Basic Workflow

Create .github/workflows/mask-data.yml:

yaml
name: Mask SQL Data

on:
  workflow_dispatch:
  schedule:
    - cron: '0 2 * * *'  # Daily at 2 AM

jobs:
  mask-data:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Install Skifta
        run: |
          curl --proto '=https' --tlsv1.2 -LsSf https://sh.skifta.dev | sh
          echo "$HOME/.cargo/bin" >> $GITHUB_PATH

      - name: Download production dump
        env:
          DATABASE_URL: ${{ secrets.DATABASE_URL }}
        run: |
          pg_dump $DATABASE_URL > production-dump.sql

      - name: Mask PII with Skifta
        run: |
          skifta -i production-dump.sql -o masked-dump.sql

      - name: Upload masked dump
        uses: actions/upload-artifact@v4
        with:
          name: masked-database
          path: masked-dump.sql

Advanced: Multi-Schema Workflow

yaml
name: Mask Multiple Schemas

on:
  push:
    branches: [main]

jobs:
  mask-databases:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        db: [users, orders, analytics]

    steps:
      - uses: actions/checkout@v4

      - name: Install Skifta
        run: |
          curl --proto '=https' --tlsv1.2 -LsSf https://sh.skifta.dev | sh
          echo "$HOME/.cargo/bin" >> $GITHUB_PATH

      - name: Export database dump
        run: |
          pg_dump ${{ secrets.DATABASE_URL }} \
            --schema=${{ matrix.db }} > ${{ matrix.db }}.sql

      - name: Mask with Skifta
        run: |
          skifta -i ${{ matrix.db }}.sql -o masked-${{ matrix.db }}.sql

      - name: Import to staging
        run: |
          psql ${{ secrets.STAGING_DATABASE_URL }} < masked-${{ matrix.db }}.sql

Caching for Faster Builds

yaml
- name: Cache Skifta binary
  id: cache
  uses: actions/cache@v4
  with:
    path: ~/.cargo/bin/skifta
    key: skifta-${{ runner.os }}-0.15.1

- name: Install Skifta (if not cached)
  if: steps.cache.outputs.cache-hit != 'true'
  run: curl --proto '=https' --tlsv1.2 -LsSf https://sh.skifta.dev | sh

GitLab CI

Basic Pipeline

Create .gitlab-ci.yml:

yaml
stages:
  - mask
  - deploy

mask-data:
  stage: mask
  image: ubuntu:latest
  before_script:
    - apt-get update && apt-get install -y curl postgresql-client
    - curl --proto '=https' --tlsv1.2 -LsSf https://sh.skifta.dev | sh
    - export PATH="$HOME/.cargo/bin:$PATH"
  script:
    - pg_dump $DATABASE_URL > production-dump.sql
    - skifta -i production-dump.sql -o masked-dump.sql
  artifacts:
    paths:
      - masked-dump.sql
    expire_in: 7 days

deploy-to-staging:
  stage: deploy
  dependencies:
    - mask-data
  script:
    - psql $STAGING_DATABASE_URL < masked-dump.sql
  only:
    - main

Using Docker

yaml
mask-with-docker:
  stage: mask
  image: postgres:16
  before_script:
    - apt-get update && apt-get install -y curl ca-certificates
    - curl --proto '=https' --tlsv1.2 -LsSf https://sh.skifta.dev | sh
    - export PATH="$HOME/.cargo/bin:$PATH"
  script:
    - skifta -i input.sql -o output.sql

Jenkins

Declarative Pipeline

groovy
pipeline {
    agent any

    stages {
        stage('Install Skifta') {
            steps {
                sh '''
                    curl --proto '=https' --tlsv1.2 -LsSf https://sh.skifta.dev | sh
                '''
            }
        }

        stage('Export Database') {
            steps {
                withCredentials([string(credentialsId: 'db-url', variable: 'DB_URL')]) {
                    sh 'pg_dump $DB_URL > dump.sql'
                }
            }
        }

        stage('Mask PII') {
            steps {
                sh '''
                    export PATH="$HOME/.cargo/bin:$PATH"
                    skifta -i dump.sql -o masked-dump.sql
                '''
            }
        }

        stage('Archive Artifacts') {
            steps {
                archiveArtifacts artifacts: 'masked-dump.sql', fingerprint: true
            }
        }
    }
}

CircleCI

Create .circleci/config.yml:

yaml
version: 2.1

jobs:
  mask-data:
    docker:
      - image: cimg/base:stable
    steps:
      - checkout

      - run:
          name: Install Skifta
          command: |
            curl --proto '=https' --tlsv1.2 -LsSf https://sh.skifta.dev | sh
            echo 'export PATH="$HOME/.cargo/bin:$PATH"' >> $BASH_ENV

      - run:
          name: Transform SQL dump
          command: |
            skifta -i input.sql -o output.sql

      - store_artifacts:
          path: output.sql

workflows:
  mask-and-deploy:
    jobs:
      - mask-data

Docker Integration

Dockerfile

Create a reusable Docker image with Skifta:

dockerfile
FROM ubuntu:22.04

RUN apt-get update && apt-get install -y \
    curl \
    ca-certificates \
    && rm -rf /var/lib/apt/lists/*

# Install Skifta
RUN curl --proto '=https' --tlsv1.2 -LsSf https://sh.skifta.dev | sh
ENV PATH="/root/.cargo/bin:${PATH}"

WORKDIR /data

ENTRYPOINT ["skifta"]

Usage in Docker Compose

yaml
services:
  skifta:
    build: .
    volumes:
      - ./dumps:/data
      - ./skifta.toml:/data/skifta.toml
    command: ["-i", "/data/input.sql", "-o", "/data/output.sql"]

Best Practices

1. Secure Your Configuration

Store skifta.toml in version control, but never commit actual dumps:

gitignore
# .gitignore
*.sql
# Only commit schema, not data
!schema.sql

2. Use Environment-Specific Configs

bash
# Development
skifta -i prod.sql -o dev.sql
cp skifta-dev.toml skifta.toml

# Staging
skifta -i prod.sql -o staging.sql
cp skifta-staging.toml skifta.toml
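A tiny wrapper can make the config swap explicit and harder to get wrong. A sketch assuming per-environment files named skifta-<env>.toml (the stand-in config files below exist only to make the snippet self-contained):

```shell
# Hypothetical helper: copy the per-environment config into place before
# masking. Config file names are examples; adapt to your repo layout.
select_config() {
    cp "skifta-$1.toml" skifta.toml
    echo "using skifta-$1.toml"
}

# Stand-in config files so the snippet runs anywhere:
printf '# dev masking rules\n' > skifta-dev.toml
printf '# staging masking rules\n' > skifta-staging.toml

select_config dev        # then: skifta -i prod.sql -o dev.sql
select_config staging    # then: skifta -i prod.sql -o staging.sql
```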

3. Validate Output

Always verify transformations worked:

bash
# After masking
grep '@' masked-dump.sql | head -5  # Spot-check that email values look masked
grep 'INSERT INTO users' masked-dump.sql | head -3  # Verify data structure
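Manual spot checks are easy to forget; a small guard that fails the job whenever a known real domain survives masking can run as a CI step instead. A sketch (the domain list and file names are illustrative, not Skifta output):

```shell
# Fail if any address from a list of known real domains survives masking.
# Domains and file names are examples; use the ones from your production data.
check_no_real_pii() {
    if grep -E '@(acme-corp\.com|internal\.example\.org)' "$1"; then
        echo "ERROR: unmasked email addresses found in $1" >&2
        return 1
    fi
    echo "OK: no known real domains in $1"
}

# Demonstration on a stand-in masked dump:
printf "INSERT INTO users VALUES (1, 'user_1@masked.invalid');\n" > sample-masked.sql
check_no_real_pii sample-masked.sql
```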

4. Performance Optimization

For large dumps in CI/CD:

bash
# Compress before transfer
skifta -i prod.sql -o masked.sql
gzip masked.sql

# Upload compressed file
aws s3 cp masked.sql.gz s3://bucket/
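A quick integrity check before upload catches truncated archives early. A sketch (the stand-in file replaces real Skifta output so the commands run anywhere):

```shell
# Compress the masked dump, then verify the archive before uploading it.
# masked.sql here is a stand-in for real Skifta output.
printf "INSERT INTO users VALUES (1);\n" > masked.sql
gzip -f masked.sql                  # replaces masked.sql with masked.sql.gz
gzip -t masked.sql.gz && echo "archive OK"
```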

5. Audit Logging

Track when and how data was masked:

yaml
- name: Log masking operation
  run: |
    echo "Masked at $(date) by $GITHUB_ACTOR" >> masking-log.txt
    echo "Skifta version: $(skifta --version)" >> masking-log.txt
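The same idea works on any platform as a plain shell line. A sketch with an ISO-8601 timestamp (the field names are illustrative; $GITHUB_ACTOR is only set on GitHub Actions, so the snippet falls back to the local user):

```shell
# Append one structured line per masking run; falls back to the local user
# when $GITHUB_ACTOR is unset (it is only provided by GitHub Actions).
actor="${GITHUB_ACTOR:-${USER:-unknown}}"
stamp=$(date -u +%Y-%m-%dT%H:%M:%SZ)
printf '%s actor=%s tool=skifta\n' "$stamp" "$actor" >> masking-log.txt
tail -n 1 masking-log.txt
```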

Troubleshooting

Skifta Command Not Found in CI

bash
# Add to PATH explicitly
export PATH="$HOME/.cargo/bin:$PATH"
skifta --version
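If exporting the PATH is not enough, a short diagnostic shows whether the binary exists at all. A sketch that runs even where Skifta is not installed:

```shell
# Report whether skifta is on PATH, and list the install dir if it is not.
if command -v skifta >/dev/null 2>&1; then
    status="found at $(command -v skifta)"
else
    status="missing"
    echo "contents of ~/.cargo/bin:" >&2
    ls "$HOME/.cargo/bin" >&2 2>/dev/null || echo "(directory does not exist)" >&2
fi
echo "skifta: $status"
```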

Permission Issues

bash
# Ensure binary is executable
chmod +x ~/.cargo/bin/skifta

Examples Repository

Find complete CI/CD examples in the Skifta Examples Repository (coming soon).

Need Help?