actionETL
Cross-platform .NET ETL library combining the best of the ETL mindset with the tools & techniques of modern application development
Code-First ETL
actionETL runs cross-platform on Linux and Windows, using .NET Framework 4.6.2+, .NET Standard 2.0+, and .NET 6+.
Use it with C#, Visual Basic, F#, etc.
Dedicated MariaDB™, MySQL™, PostgreSQL®, SQLite, and SQL Server® database providers, as well as ODBC for other SQL and NoSQL databases.
Supports both on-premises and cloud deployments, including cloud versions of the above databases (Azure SQL, Azure PostgreSQL, etc., as well as on other cloud platforms).
Platforms
News
Release 0.41.0 Available supporting any data type in the dataflow and much more
New Bulk and Batch Insert Benchmark
Up to 19 Times Throughput Difference!
New Release 0.40.0 Available with Expanded Bulk Insert and Broader Transaction Support
actionETL is a high-performance, highly productive .NET Core and .NET Framework library for easily writing ETL (Extract, Transform, Load) data processing applications in .NET languages such as C# and VB.
Use it to add ETL processing to your existing applications, and to create new ones, replacing or augmenting traditional ETL tools.
It suits ETL developers with anywhere from limited to extensive .NET programming experience, as well as .NET developers who have ETL requirements.
The Effective Way to ETL, using .NET
Outstanding Features
High Performance
- Combine in-memory row-by-row processing, database, and file-based processing to get the best of each approach
- Unlimited and configurable parallelism with low overhead, supporting millions of simultaneous or short-duration workers, including micro-batches
- Row and column high-performance mapping, copying, and comparing facilities for out-of-box and custom workers
- Statically typed, high throughput dataflow (or pipeline) with highly tuned workers
- Batch Insert for all databases and Bulk Insert for selected databases
Broad Data Source Support
- Wraps and extends .NET database providers to simplify writing database-agnostic code
- Dedicated MariaDB™, MySQL™, PostgreSQL®, SQLite, and SQL Server® providers, supporting both on-premises and cloud databases
- ODBC provider for other SQL and NoSQL databases and querying systems
- Local transactions across multiple workers
- Supports many database-specific data types (SqlDateTime, NpgsqlBox, …) end to end, avoiding conversion issues and reducing coding
- Read and write Delimited and Fixed Format flat files, streams, and strings
- Read and write XLSX (Excel) spreadsheets, JSON, and text files
- Read .NET IEnumerable, and read and write Collections
- Easily extensible to other databases and formats, as well as transfer protocols such as SFTP, SCP, and FTP
Familiar ETL Concepts
- Control flow with start constraints, grouping, hierarchical structure, and highly parallel applications
- 52 dataflow workers provide extensive row-by-row processing capabilities
- Read and write data sources, cleanse, combine and transform data
- Divide and conquer – implement requirements using many small (and reusable) parts
- Effective debugging with breakpoints on worker and port state changes, and inspecting and editing rows in flight
- Highly capable (as well as replaceable) logging and configuration systems
- Automatically track and aggregate dataflow error row counts and log error row contents
- Common tasks require very little programming, more akin to setting a configuration
- Familiar concepts, project templates, a concise API, and extensive documentation make it easy to learn
- NuGet.org packages and project templates for .NET Core / .NET Framework and a 100% C# implementation provide simple integration and updates
Modern Application Development
- Compose and encapsulate existing workers into new (control flow and dataflow) reusable workers
- Highly flexible workers that optionally accept code snippets (lambdas), e.g. to perform a greater-than join instead of the default equi-join, or to specify ordering columns
- .NET code-based programming and the actionETL architecture handle complexity very well
- Perform effective ETL testing with any .NET test framework
- Take full advantage of Visual Studio® or other .NET development environments for refactoring, source control, CI/CD, etc.
Unique Strengths
- Merged Dataflow and Control flow functionality (including constraints), which reduces code complexity and data staging
- Support any data type in the Dataflow
- Reuse and combine column schemas (groups of columns) to minimize dataflow bugs, maintenance effort, and code size
- Single programming model for both using and extending the ETL framework, including for constraints, custom workers, etc., simplifying creating reusable custom functionality
- Distinct recoverable vs. unrecoverable failures, minimizing start constraint bugs
Examples
Easy to Learn
Build your ETL application using the 69 included workers, which provide control flow and dataflow functionality and handle both simple and highly complex scenarios with ease. The database workers generally work with all supported providers, which makes solutions easy to implement and to port between databases.
This small C# ETL snippet moves a file and groups three workers that execute an external command, read a CSV file, and insert the records into a database.
The architecture also makes it easy to create new custom workers that can be used and reused just like the out-of-box workers.
23 Times Less Code
With excellent reusability and composability, actionETL required 23 times less C# code (9 kB) to create a high-performance, reusable custom Slowly Changing Dimension (SCD) worker (fully included in the documentation) than similar functionality implemented in one commonly used traditional ETL tool (209 kB).
Excels with Large and Complex Requirements
In this comprehensive ETL C# Process Incoming Files Example, a custom worker provides standardized and reliable processing, archiving and logging of incoming files, for any file type.