Integrator for Amazon Redshift, IBM Netezza, or Apache Spark is a data integration tool that loads and transforms data more efficiently than traditional ETL tools, with a smaller footprint, at a fraction of the cost.
Play with Integrator in our new sandbox, or reach out to us to join our beta program.
Integrator uses an intuitive visual dataflow language where icons connected by arrows represent sources, operations, and targets of data.
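Integrator programs are drawn, not typed, but conceptually a dataflow is a directed graph whose nodes are sources, operations, and targets, and whose arrows are the connections between them. A minimal sketch of that idea (the node names and dictionary shape here are hypothetical, not Integrator's internal format):

```python
# Illustrative only: a dataflow modeled as a directed graph of
# sources, operations, and targets. Node names are hypothetical.
flow = {
    "orders_src":    {"kind": "source",    "feeds": ["filter_recent"]},
    "filter_recent": {"kind": "operation", "feeds": ["orders_tgt"]},
    "orders_tgt":    {"kind": "target",    "feeds": []},
}

def downstream(flow, node):
    """Follow the arrows from `node` to every node it eventually feeds."""
    seen, stack = [], list(flow[node]["feeds"])
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.append(n)
            stack.extend(flow[n]["feeds"])
    return seen
```

Walking the graph from a source visits each operation and target it feeds, which is exactly what following the arrows in the visual editor does.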
Testing and deployment framework
Integrator comes with advanced debugging and logging features to make deployment and maintenance easy.
Built-in jobs and batch scripts
Integrator presents users with a hierarchy of jobs and batches which can be easily integrated into production environments.
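The hierarchy works like a tree: a batch groups jobs (or nested batches) so a scheduler can invoke the whole unit at once. A rough sketch of the structure, with hypothetical names (this is not Integrator's actual API):

```python
from dataclasses import dataclass, field

# Illustrative only: a batch is a named group of jobs or nested
# batches that can be scheduled as a unit. All names are hypothetical.
@dataclass
class Job:
    name: str

@dataclass
class Batch:
    name: str
    steps: list = field(default_factory=list)  # Jobs or nested Batches

    def flatten(self):
        """Return every Job in this batch, depth-first."""
        jobs = []
        for step in self.steps:
            jobs.extend(step.flatten() if isinstance(step, Batch) else [step])
        return jobs

nightly = Batch("nightly", [
    Batch("staging", [Job("load_orders"), Job("load_customers")]),
    Job("build_marts"),
])
```

A production scheduler would then only need to trigger `nightly` to run the entire hierarchy in order.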
Advanced agent security scheme
Agents are processes that run on the source and target machines and relay data between them. They keep communications private by intelligently managing the connections between them and using strong encryption.
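To give a flavor of what "strong encryption" means in practice, here is one way an agent could pin its TLS settings before opening a channel to its peer. This is a hedged sketch using Python's standard `ssl` module, not Integrator's actual security scheme:

```python
import ssl

# Illustrative only: an agent-side TLS context that refuses weak
# protocols and verifies the peer's identity. The actual agent
# security scheme is internal to Integrator.
def make_agent_context():
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse TLS < 1.2
    ctx.check_hostname = True                     # verify peer identity
    return ctx
```

Wrapping each agent-to-agent socket in a context like this keeps traffic private even when source and target sit in different networks.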
Unlike traditional ETL tools, Integrator has no engine of its own: it relies on the underlying target engine (Spark, Netezza, Redshift, etc.). This significantly reduces Integrator's footprint.
A best practices product
The designers of Integrator are longtime practitioners of ELT and data warehousing and have built their experience into the product.
Bottlenecks in data integration processes happen at computational boundaries: landing data to disk or transferring data across a network. Eliminating a server from the series of machines that data passes through on its way from source to target confers a significant performance advantage. This is all the more true when the eliminated machine (the ETL server) is resource-bound relative to the machine that replaces it (the target data integration server, generally a purpose-built MPP platform or cluster). The result: ELT architectures have significant speed advantages over traditional architectures.
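The mechanics of that elimination are simple: rather than pulling rows off the target, transforming them on a middle-tier server, and writing them back, an ELT tool generates SQL that the target engine runs in place. A minimal sketch of the pattern (the table names and predicate are hypothetical):

```python
# Illustrative only: ELT pushes the transform to the target engine as
# SQL, so no rows ever leave it. Table/column names are hypothetical.
def pushdown_insert(target, source, predicate):
    """Build an INSERT..SELECT the target engine executes itself."""
    return (f"INSERT INTO {target} "
            f"SELECT * FROM {source} WHERE {predicate};")

sql = pushdown_insert("dw.orders", "staging.orders",
                      "order_date >= '2024-01-01'")
```

The data never crosses a network boundary to a middle tier; only a short SQL string does.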
Traditional ETL systems contain powerful engines that implement efficient joins, sorts, filters, and other operations on large datasets. To work well, these systems require powerful dedicated servers. Those servers then feed data integration servers, which, ironically, are even more capable of performing the same joins, sorts, filters, and other large-dataset operations than the ETL servers are. For historical reasons, traditional ETL systems have become costly duplications of effort. To eliminate the ETL servers and make the data integration servers perform in the required role, however, some inexpensive "glue" is required. Integrator is that glue. The result: Integrator is an order of magnitude less expensive than the ETL systems it replaces, essentially eliminating their cost.
Integrator is designed to look and feel familiar to users of traditional ETL systems such as Informatica and DataStage. Integrator users write data integration programs using an intuitive visual dataflow language where icons connected by arrows represent sources, operations, and targets of data. Integrator presents users with a hierarchy of jobs and batches which can be readily integrated into production environments. Integrator provides parameters to control job behavior and to pass information between jobs, and offers debug modes and detailed logging to help developers diagnose misbehaving programs. The result: Integrator is easy for anyone to learn and use, and will feel familiar from day one to users of traditional ETL tools.
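To illustrate parameter passing between jobs, here is a hedged sketch of a job chain where batch-level parameters flow to each job, each job can override them, and values a job emits are visible to later jobs. The function names, parameter names, and log format are all hypothetical, not Integrator's API:

```python
# Illustrative only: batch parameters flow into each job, per-job
# overrides win, and values a job returns feed later jobs. All names
# and values here are hypothetical.
def run_chain(jobs, batch_params):
    ctx = dict(batch_params)
    log = []
    for name, fn, overrides in jobs:
        params = {**ctx, **overrides}          # batch params + overrides
        log.append(f"{name}: params={sorted(params)}")  # debug-style log
        ctx.update(fn(params) or {})           # emitted values persist
    return ctx, log

extract = lambda p: {"rows_loaded": 42}
transform = lambda p: {"status": "ok" if p["rows_loaded"] else "empty"}

ctx, log = run_chain(
    [("extract", extract, {"table": "orders"}),
     ("transform", transform, {})],
    {"run_date": "2024-06-01"},
)
```

Here `transform` never received `rows_loaded` directly; it arrived through the shared context that `extract` populated, which is the essence of passing information between jobs.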