ETL/Express – Bulk Data Loading Tool

If you are an organisation challenged with reducing the costs and improving the performance of your Oracle based ETL solutions, and meeting the demands of DevOps and Cloud Computing, then we may be able to help with our simple, low cost, and highly effective and efficient PL/SQL based ETL tool ETLExpress.

The Challenges

Data warehouses processing very large data volumes with complex multiple concurrent feeds can be immensely difficult to successfully deliver.  As a result, there can be big challenges associated with minimising ETL batch times, eradicating concurrent data loading issues to improve throughput, meeting minimum expected developer skill levels, and justifying the cost of expensive enterprise ETL tools. Additionally, keeping on top of all of this whilst juggling with the demands of meeting DevOps and Cloud Computing requirements only exasperates the challenges.

The Questions

Please take a look at these points to see if ETLEx/press can help you:

  • Are you using Oracle 11g or higher to perform your ETL tasks being into, out of, and within the databases?
  • Do you pay for expensive licences for your enterprise ETL technology?
  • Do you rely on costly specialist resources to administer, develop, and support your ETL solutions?
  • Are your ETL solutions challenged with continuing performance and scalability issues?
  • Find that you have limited scope to improve performance without resorting to bespoke PL/SQL ETL solutions
  • Do your bespoke PL/SQL based ETL solutions lack adequate frameworks, and beyond your in-house skills set to stabilise or improve?
  • Are you under pressure to reduce your IT skills and training costs, with possible outsourcing and/or off-shoring?
  • Are you looking to improve your IT development throughput and time-to-market for your data warehousing solutions?

The Answer

So, answering yes to most of the above will more than likely mean our feature-packed comprehensive ETL application ETL/Express  can help you.

ETLExpress

As a result, ETL/Express is a solution developed to overcome these common issues. Especially relevant is that is an Oracle PL/SQL based application focused on utilising the most specialist and complex bulk loading features available in the Oracle database. These features are empowered through a simple XML based declarative language, supplemented with embedded SQL.

Main Objectives

The main objective of ETL/Express is to provide a simple to use, compact, highly efficient, and cost effective ETL solution, specifically to:

  • Maximise the performance of bulk data loading, over and above what is achievable with costly enterprise ETL tools.
  • Simplify implementation via a compact easy-to-deploy single PL/SQL package solution.
  • Coded via simple XML based declarative configuration data and SQL, saved into the database, and managed as text documents through typical source control, build, and deployment processes.
  • Having less riskier outsourcing/off-shoring of technical responsibilities by reducing the required Oracle development skills level
  • Simplify coding and improve development turnaround to achieve a significantly reduced time-to-market.
  • Provide a low-cost ETL solution with no dependency on dedicated hardware and infrastructure.

Advanced Features

There has been no holding back with embracing all of the available data loading methods and associated techniques to maximise the throughput and robustness. The robustness and credibility of ETL/Express is also been enabled by having a chief designer with over twenty years of developing very large Oracle based data warehousing solutions. This has been with both enterprise ETL tools as well as bespoke PL/SQL solutions, providing a huge insight into the issues involved, and hence providing the inspiration for ETL/Express.

Bulk Loading

Not all features available in Oracle are available or easily utilised in Enterprise ETL Tools, usually due to the separate server based architecture they depend on, and the one-size-fits-all support for multiple vendors. These are the features available:

  • Set based loading, implemented using INSERT AS SELECT, CTAS, and MERGE.
  • Direct data loading, implemented using the APPEND hint.
  • DML Parallelism using hints and session settings
  • DDL Parallelism for index and constraint building and rebuilding.
  • Support for batched row-by-row processing using FORALL bulk loading for situations where this has advantages.
  • Loading into multiple target tables from a single source using INSERT ALL/FIRST
  • Full support for literal variables via dynamic SQL to enable stable execution plan management.
  • Support for single transaction loading to enable rollback of failed loads.
  • Managing of indexes during bulk loading (UNSABLE/REBUILD)
  • Parallel execute implemented using DBMS_PARALLEL_EXCUTE for improved performance where direct loading and parallel DML not possible.
  • Comprehensive support for statistics managements, including incremental stats calculation.

Partition Tables

Partitioned tables are an important design aspect of any large database tables holding millions if not billions of rows. Loading partitions efficiently and robustly needs to address a number of challenges by supporting these features:

  • Concurrent data loading into multiple partitions with guarantee of locking contention partition exchange.
  • Partition maintenance with built-in features to create new future partitions and archive or purge old partitions.
  • Incremental stats maintenance on large partitioned tables for efficient application of stats.
  • Management of both local and global indexes during data loading
  • Support for list, range, hash, system, and interval partitions

External Data

Moving data is not only confined to within the database, but also to and from the file system. ETLExpress

  • Generation of extract flat files via UTL_FILE.
  • Loading from flat files via external tables.
  • Sourcing of data from URL based service calls supporting CSV, XML, and JSON format.

Instrumentation

It is critical to monitor the progress of ETL processes, identifying bottlenecks, reporting on row counts and execution times of individual steps, and debugging issues:

  • APEX based web application provided for detailed load monitoring.
  • Defined ETL processes translated into a list of distinct easy-to-monitor tasks further broken down into a hierarchy of sub-tasks logged with detailed execution metrics.
  • All loading tasks recorded with the executed SQL for debugging and transparency of loading steps.
  • All tasks recorded with detailed row counts and timing metrics for identification of performance bottlenecks and to monitor deterioration of execution times.
  • Analysis and real-time monitoring achieved through direct querying of supplied database views.
  • Logging data easily accessible and replicated into third party monitoring tools.

Devops and Cloud Compatibility

With the IT industry pushing more for the latest approaches, it comes with the challenges of adapting existing legacy Oracle systems. ETLExpress can help to simplify achieving their goals:

  • Main skills set confined primarily to SQL, with an appreciation of Oracle specifics being an advantage, but with minimal reliance on PL/SQL.
  • Being fully resident in the database means its supported by virtualised database infrastructure solutions such as Oracle Cloud, AWS, and Delphix.
  • Easy deployment into a database as a component of a DBaaS implementation.

What Next?

Please do not hesitate to contact us via email or our contact form so we can discuss how ETLExpress can help you. We will be able to provide a trial licence version and some initial training and support to get you going. The tool is very is easy to implement and use, so you will be up running in no time.

Please contact us for further infomation …

Our other products…

Related articles …

0

Leave a Reply