FastLoad is a high‑performance bulk data loading utility in Teradata, designed to load very large volumes of data into empty tables as fast as possible.
FastLoad achieves its speed by:
- Bypassing most SQL overhead
- Loading data directly into AMPs in parallel
- Minimizing logging
- Avoiding row‑by‑row processing
In short: FastLoad is optimized for speed, not flexibility.
2. Key Characteristics of FastLoad
| Feature | Description |
|---|---|
| Load Type | Bulk insert only |
| Target Table | Must be empty |
| Supported Tables | Permanent tables only |
| Speed | Fastest Teradata load utility |
| Logging | Minimal (no rollback) |
| SQL Support | Very limited |
| Error Handling | Error tables required |
| Concurrence | Single FastLoad per table |
3. What FastLoad Can Do
FastLoad is ideal for:
- Initial data population
- Loading millions or billions of rows
- Batch ETL processing
- Data warehouse staging or base tables
FastLoad:
- Loads data directly to AMPs based on hashing
- Uses multiple sessions to maximize parallelism
- Skips row‑level locking
- Writes data blocks efficiently to vdisks
4. What FastLoad Cannot Do (Important!)
FastLoad has many restrictions by design:
🚫 Cannot load into:
- Tables with existing data
- Tables with:
- Secondary indexes
- Join indexes
- Triggers
- Referential integrity
- Identity columns (older versions)
- Volatile tables
- Non-empty tables
🚫 Cannot perform:
- UPDATE or DELETE
- Upserts
- Row‑level restart
- Concurrent loads into the same table
This is why FastLoad is usually followed by index creation after the load.
5. FastLoad Architecture (Behind the Scenes)
FastLoad works below SQL:
Source File
↓
FastLoad Controller
↓
Multiple FastLoad Sessions
↓
BYNET
↓
Target AMPs
↓
Data written directly to vdisks
Each AMP receives only the rows it owns based on PI hashing.
6. FastLoad Processing Phases
FastLoad runs in two major phases, often called Phase 1 and Phase 2.
Phase 1 – Acquisition Phase
Purpose: Collect and distribute data to AMPs
Steps in Phase 1
- Target table is locked with an exclusive lock
- FastLoad sessions start (typically 8–32)
- Data is read from the input file
- Rows are hashed on Primary Index
- Rows are sent directly to owning AMPs via BYNET
- AMPs store rows in worktables (temporary storage)
Important Notes
- No data is yet committed to the base table
- No secondary indexes are updated
- Duplicate PI rows are not yet validated
- Errors are captured into error tables
📌 This phase is pure parallel data acquisition
Phase 2 – Application Phase
Purpose: Move data into final table structure
Steps in Phase 2
- AMPs sort rows by rowID
- Rows are merged into the base table
- Duplicate PI rows are detected
- Error rows are written to error tables
- Data blocks are finalized on disk
- Table lock is released
Only at the end of Phase 2 is data fully visible.
📌 If FastLoad fails before Phase 2 completes, no data is loaded
7. FastLoad Error Tables
FastLoad automatically creates two error tables:
| Error Table | Purpose |
|---|---|
| Error Table 1 | Data conversion errors (bad input data) |
| Error Table 2 | Constraint violations (duplicate PI rows) |
These tables:
- Must not exist before FastLoad
- Are dropped at the end unless specified otherwise
- Are crucial for data validation
8. Performance Why FastLoad Is So Fast
FastLoad performance comes from:
✅ AMP‑level parallelism
✅ Direct data path (bypass SQL engine)
✅ Minimal logging
✅ No index maintenance during load
✅ Block‑level I/O instead of row‑level
This is why FastLoad often outperforms:
- Multi‑row INSERT
- TPump
- BTEQ INSERT
9. When Is FastLoad Optimal?
✅ Use FastLoad When:
- Table is empty
- Loading millions+ of rows
- Data comes from flat files
- No need for row‑level rollback
- No secondary indexes during load
- Batch/offline ETL processing
- Performance is the #1 goal
Ideal example use cases
- Initial warehouse load
- Historical backfill
- Nightly staging loads
- One‑time migration
❌ Do NOT Use FastLoad When:
- Table already contains data
- You need to:
- UPDATE existing rows
- DELETE rows
- Perform upserts
- Table has secondary indexes or constraints you must maintain
- Small data volume (overhead is too heavy)
- Need concurrent loads
- Need restartability at row level
In these cases, consider:
- MultiLoad
- TPump
- INSERT…SELECT
- TPT Load Operator
10. FastLoad vs Other Utilities (Quick Context)
| Utility | Best For |
|---|---|
| FastLoad | Initial bulk loads |
| MultiLoad | Bulk UPSERT (I/U/D) |
| TPump | Small trickle inserts |
| TPT | Modern replacement of all above |
Final Expert Summary
FastLoad:
- Is Teradata’s fastest bulk load utility
- Works in two distinct phases
- Achieves speed through parallelism and minimal logging
- Is ideal for large, empty-table loads
- Trades flexibility and logging for raw performance
If you care about speed and your table is empty → FastLoad
If you need flexibility or incremental loads → use something else
Leave a Reply