Intelligent Data Transformation
Transform raw data from any source into schema-compliant, validated database records. ContentAtlas handles currency normalization, phone formatting, date standardization, and complex column transformations automatically.
Mapping Configuration Structure
Define transformations, validations, and column mappings in a portable JSON configuration. The engine processes data in three stages: column mapping, data transformation, and validation.
{
  "source_columns": ["Revenue", "Cost", "Margin"],
  "target_columns": ["revenue_usd", "cost_usd", "margin_pct"],
  "transformations": [
    {
      "type": "clean_currency",
      "source_column": "Revenue",
      "target_column": "revenue_usd",
      "locale_hint": "en_US"
    },
    {
      "type": "clean_currency",
      "source_column": "Cost",
      "target_column": "cost_usd",
      "locale_hint": "en_US"
    },
    {
      "type": "coalesce_columns",
      "source_columns": ["Margin", "Calc_Margin"],
      "target_column": "margin_pct"
    }
  ],
  "column_validations": [
    {"column": "revenue_usd", "validator": "currency_usd", "required": true},
    {"column": "cost_usd", "validator": "currency_usd", "required": true}
  ]
}
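To make the staged processing concrete, here is a minimal Python sketch of how a configuration like the one above might be applied to a single row. The handler functions, the `HANDLERS` registry, and `apply_config` are hypothetical names chosen for illustration, not ContentAtlas's actual API. The sketch handles only en_US amounts, checks only the `required` flag, and folds column mapping into the transformation step by writing directly to the target columns.

```python
import json
import re
from decimal import Decimal

# Trimmed copy of the configuration shown above.
CONFIG = json.loads("""
{
  "target_columns": ["revenue_usd", "cost_usd", "margin_pct"],
  "transformations": [
    {"type": "clean_currency", "source_column": "Revenue", "target_column": "revenue_usd"},
    {"type": "clean_currency", "source_column": "Cost", "target_column": "cost_usd"},
    {"type": "coalesce_columns", "source_columns": ["Margin", "Calc_Margin"], "target_column": "margin_pct"}
  ],
  "column_validations": [
    {"column": "revenue_usd", "required": true},
    {"column": "cost_usd", "required": true}
  ]
}
""")


def clean_currency(row, step):
    """Hypothetical handler: en_US only, keeps digits, '.', and '-'."""
    raw = row.get(step["source_column"])
    if raw in (None, ""):
        row[step["target_column"]] = None
    else:
        row[step["target_column"]] = Decimal(re.sub(r"[^0-9.\-]", "", raw))


def coalesce_columns(row, step):
    """Hypothetical handler: first value that is neither missing nor empty."""
    row[step["target_column"]] = next(
        (row.get(c) for c in step["source_columns"] if row.get(c) not in (None, "")),
        None,
    )


# Dispatch table keyed by the "type" field of each transformation.
HANDLERS = {"clean_currency": clean_currency, "coalesce_columns": coalesce_columns}


def apply_config(row, config):
    """Run the transformations, then the 'required' checks, on one row."""
    working = dict(row)
    for step in config["transformations"]:
        HANDLERS[step["type"]](working, step)
    errors = [
        rule["column"]
        for rule in config.get("column_validations", [])
        if rule.get("required") and working.get(rule["column"]) is None
    ]
    record = {col: working.get(col) for col in config["target_columns"]}
    return record, errors
```

Calling `apply_config({"Revenue": "$1,234.56", "Cost": "$200.00", "Margin": None, "Calc_Margin": "83.8"}, CONFIG)` returns the three target columns with an empty error list; a row with a blank `Revenue` would report `revenue_usd` as a failed required column.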
Currency Transformation in Action
The clean_currency transformation automatically strips symbols and standardizes format for database storage.
Real-time currency normalization: $1,234.56 → 1234.56
Comprehensive Transformation Types
Apply these transformations during import to normalize, clean, and restructure your data automatically.
| Transformation | Input Example | Output Example | Description |
|---|---|---|---|
| clean_currency | $1,234.56 | 1234.56 | Strip currency symbols and format to decimal |
| standardize_phone | (415) 610-7325 | +14156107325 | Format to international E.164 standard |
| split_multi_value_column | sales,marketing | tag_1: sales, tag_2: marketing | Split delimited values into separate columns |
| compose_international_phone | 415, 610, 7325, 1 | +14156107325 | Reconstruct phone from country, area, local parts |
| regex_replace | ABC-12345 | 12345 | Replace content using regex pattern (strip ABC-) |
| coalesce_columns | NULL, value | value | Use first non-null value from multiple columns |
| group_columns_to_json | col1: a, col2: b | {"col1": "a", "col2": "b"} | Group multiple columns into a JSON object |
| date_standardization | 01/15/2024 | 2024-01-15 | Convert dates to ISO 8601 format |
| currency_precision | 1200.50 | Flagged (JPY: 0 decimals expected) | Flag currency values with incorrect decimal precision for review |
| concatenate_columns | John + Doe | John Doe | Combine multiple columns into one |
| value_remap | yes | true | Map values according to defined rules |
| split_international_phone | +14156107325 | 1, 415, 6107325 | Split phone into country, area, local number |
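Several of the simpler transformations in the table can be approximated in a few lines each. The Python sketch below is illustrative only: the function signatures and parameters (`pattern`, `mapping`, `separator`, `prefix`, `delimiter`) are assumptions, since the source does not document how ContentAtlas configures them.

```python
import json
import re


def regex_replace(value, pattern, replacement=""):
    """Replace every match of `pattern` in `value`."""
    return re.sub(pattern, replacement, value)


def value_remap(value, mapping):
    """Map a value through a lookup table; unmapped values pass through unchanged."""
    return mapping.get(value, value)


def concatenate_columns(row, columns, separator=" "):
    """Join the named columns of a row into one string."""
    return separator.join(str(row[c]) for c in columns)


def group_columns_to_json(row, columns):
    """Serialize the named columns of a row as one JSON object."""
    return json.dumps({c: row[c] for c in columns})


def split_multi_value_column(value, prefix, delimiter=","):
    """Split a delimited string into numbered columns: prefix_1, prefix_2, ..."""
    parts = (part.strip() for part in value.split(delimiter))
    return {f"{prefix}_{i}": part for i, part in enumerate(parts, start=1)}
```

For example, `regex_replace("ABC-12345", r"^ABC-")` yields `12345`, and `split_multi_value_column("sales,marketing", "tag")` yields `tag_1` and `tag_2` columns, matching the corresponding table rows.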
Phone Standardization Example
The standardize_phone transformation converts common phone number formats to the international E.164 format.
International phone formatting: (415) 610-7325 → +14156107325
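As a rough illustration of what this normalization involves, the Python sketch below strips formatting and prepends a country code. It assumes a default country code of `1` when the input has no leading `+`; the `default_country_code` parameter is hypothetical, and the function does not verify that the number is actually assigned. A production import would more likely rely on a dedicated library such as the `phonenumbers` package.

```python
import re


def standardize_phone(raw, default_country_code="1"):
    """Normalize a phone string to E.164 ('+' followed by up to 15 digits)."""
    digits = re.sub(r"\D", "", raw)
    if raw.strip().startswith("+"):
        # The input already carries its own country code.
        e164 = "+" + digits
    elif default_country_code == "1" and len(digits) == 11 and digits.startswith("1"):
        # North American number written with a leading 1 but no '+'.
        e164 = "+" + digits
    else:
        e164 = "+" + default_country_code + digits
    if not re.fullmatch(r"\+[1-9]\d{1,14}", e164):
        raise ValueError(f"cannot express {raw!r} as E.164")
    return e164
```

With these assumptions, `(415) 610-7325`, `1-415-610-7325`, and `+1 415-610-7325` all normalize to `+14156107325`.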
Currency Cleaning Matrix
Handle mixed currency formats from any locale. ContentAtlas normalizes all representations to DECIMAL(18,2) for precise storage.
| Input Format | Locale | Output |
|---|---|---|
| $1,234.56 | en_US | 1234.56 |
| USD 1,234.56 | en_US | 1234.56 |
| €1.234,56 | de_DE | 1234.56 |
| (1,234.56) | en_US | -1234.56 |
| 1 234,56 € | fr_FR | 1234.56 |
| ¥1500 | ja_JP | 1500 |
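The matrix above can be reproduced with a small locale-aware parser. The Python sketch below is one possible approach, not the ContentAtlas implementation: the `DECIMAL_SEPARATOR` table is an assumption that covers only the four locales shown, and accounting-style parentheses or a leading minus sign are treated as negative.

```python
from decimal import Decimal

# Decimal separator per locale (assumed; covers only the locales in the table).
DECIMAL_SEPARATOR = {"en_US": ".", "de_DE": ",", "fr_FR": ",", "ja_JP": "."}
DIGITS = "0123456789"


def clean_currency(raw, locale_hint="en_US"):
    """Strip symbols and grouping separators, returning a Decimal."""
    text = raw.strip()
    negative = (text.startswith("(") and text.endswith(")")) or text.startswith("-")
    dec_sep = DECIMAL_SEPARATOR[locale_hint]
    # Keep only digits and the locale's decimal separator; currency symbols,
    # currency codes, spaces, and grouping separators are dropped.
    kept = "".join(ch for ch in text if ch in DIGITS or ch == dec_sep)
    number = Decimal(kept.replace(dec_sep, "."))
    return -number if negative else number
```

Returning `Decimal` rather than `float` avoids binary rounding error before the value reaches a DECIMAL(18,2) column.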
IBAN and Credit Card Validation
Built-in validators ensure financial data integrity before database insertion.
iban (ISO 13616)
International Bank Account Number validation with country-specific structure checks.
Pattern: r'^[A-Z]{2}\d{2}[A-Z0-9]{1,30}$'
Examples: DE89370400440532013000, GB82WEST12345698765432

credit_card (Luhn validated)
Payment card validation using the Luhn algorithm (mod 10) for checksum verification.
Pattern: r'^(?:\d{4}[-\s]?){3}\d{4}$'
Example formats: 4532-0154-2981-4456, 5425 2334 3010 9903
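The patterns above check only the shape of a value; the checksums are separate steps. The Python sketch below pairs each pattern with its standard checksum: the mod 97 check used for IBANs and the Luhn mod 10 check for card numbers. The function names are hypothetical, and the IBAN check is generic, so the country-specific structure checks mentioned above (for example, per-country lengths) are not included.

```python
import re

IBAN_PATTERN = re.compile(r'^[A-Z]{2}\d{2}[A-Z0-9]{1,30}$')
CARD_PATTERN = re.compile(r'^(?:\d{4}[-\s]?){3}\d{4}$')


def is_valid_iban(value):
    """Pattern check plus the mod 97 checksum (no per-country length rules)."""
    iban = value.replace(" ", "").upper()
    if not IBAN_PATTERN.match(iban):
        return False
    # Move the country code and check digits to the end, then map
    # letters to numbers (A=10 ... Z=35) and test the remainder.
    rearranged = iban[4:] + iban[:4]
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(numeric) % 97 == 1


def is_valid_card(value):
    """Pattern check plus the Luhn (mod 10) checksum."""
    if not CARD_PATTERN.match(value):
        return False
    digits = [int(ch) for ch in value if ch.isdigit()]
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            # Double every second digit from the right; subtract 9 if over 9.
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0
```

Both IBAN examples pass this check. Of the two card examples, `5425 2334 3010 9903` passes the pattern and the Luhn check, while `4532-0154-2981-4456` matches the pattern but fails the Luhn check in this sketch, which shows why the two checks need to be applied together.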
Validation Error Highlighting
ContentAtlas color-codes validation results so that hard failures, soft corrections, and passing values can be told apart at a glance.
Color-coded validation: Hard failures (red), soft corrections (yellow), passed (green)
International Standards Compliance
ContentAtlas implements internationally recognized standards for currency and date formatting: ISO 4217 for currencies and ISO 8601 for dates.
Currency Validation
Validates monetary values against ISO 4217 currency codes with mandated decimal place requirements.
| Currency | Decimals | Example |
|---|---|---|
| JPY, KRW, VND | 0 | 1500 |
| USD, EUR, GBP, CAD | 2 | 1500.00 |
| KWD, BHD, OMR | 3 | 1500.000 |
Supported Transformations:
- clean_currency: Strip symbols and convert to numeric
- currency_precision: Flag values with incorrect decimal precision for review
- currency_code_remap: Remap country codes to ISO currency codes
- currency_decimal_rounding: Round decimals per currency rules
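The precision and rounding rules can be sketched with Python's `decimal` module. This is an illustration under stated assumptions: the `CURRENCY_DECIMALS` table contains only the currencies listed above, a value is flagged only when it carries more decimal places than its currency allows, and half-up rounding is an arbitrary choice because the source does not name a rounding mode.

```python
from decimal import Decimal, ROUND_HALF_UP

# Decimal places per currency, limited to the currencies listed above.
CURRENCY_DECIMALS = {
    "JPY": 0, "KRW": 0, "VND": 0,
    "USD": 2, "EUR": 2, "GBP": 2, "CAD": 2,
    "KWD": 3, "BHD": 3, "OMR": 3,
}


def needs_precision_review(amount, currency):
    """Return True when the amount has more decimals than the currency allows."""
    allowed = CURRENCY_DECIMALS[currency]
    actual = max(0, -Decimal(amount).as_tuple().exponent)
    return actual > allowed


def round_for_currency(amount, currency):
    """Round to the currency's decimal places (half-up is an assumption)."""
    quantum = Decimal(1).scaleb(-CURRENCY_DECIMALS[currency])
    return Decimal(amount).quantize(quantum, rounding=ROUND_HALF_UP)
```

Here `needs_precision_review("1200.50", "JPY")` returns `True`, and `round_for_currency("1500", "KWD")` pads the value to three decimal places.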
Date Standardization
All dates are normalized to ISO 8601 format (YYYY-MM-DDTHH:MM:SSZ) for consistent database storage.
Supported Input Formats:
- 2024-09-04T23:09:18Z (ISO 8601)
- 2024-09-04 (ISO date)
- MM/DD/YYYY (US format)
- DD/MM/YYYY (European format)
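Because a value such as 01/02/2024 is ambiguous between the US and European formats, a parser needs to be told which input format to expect. The Python sketch below takes that as an explicit `source_format` argument; the argument, its `iso`/`us`/`eu` keys, and the function name are hypothetical. It also assumes that date-only inputs stay date-only in the output rather than gaining a midnight UTC time.

```python
from datetime import datetime

# strptime patterns per input format; the flag marks patterns with a time part.
FORMATS = {
    "iso": [("%Y-%m-%dT%H:%M:%SZ", True), ("%Y-%m-%d", False)],
    "us": [("%m/%d/%Y", False)],
    "eu": [("%d/%m/%Y", False)],
}


def standardize_date(raw, source_format):
    """Parse `raw` using the named input format and return an ISO 8601 string."""
    for pattern, has_time in FORMATS[source_format]:
        try:
            parsed = datetime.strptime(raw.strip(), pattern)
        except ValueError:
            continue
        out_pattern = "%Y-%m-%dT%H:%M:%SZ" if has_time else "%Y-%m-%d"
        return parsed.strftime(out_pattern)
    raise ValueError(f"{raw!r} does not match the {source_format!r} format")
```

For instance, `standardize_date("01/15/2024", "us")` and `standardize_date("15/01/2024", "eu")` both return `2024-01-15`, while `standardize_date("01/15/2024", "eu")` raises `ValueError` because there is no month 15.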
Available Validators:
- date_iso: ISO 8601 date validation
- date_us: US date format (MM/DD/YYYY)
- date_eu: European date format (DD/MM/YYYY)

Complete the picture
ContentAtlas transforms data coming in. Helm governs every AI action going out.
The same propose → approve → execute → reverse model that governs your data imports applies to your LLM agents — every API call, MCP tool invocation, and database write gated before it happens.