|
by Philip Lambert
With Oracle’s rich support for XML in the
database it is now very hard to not exploit its use in your database
application designs. XML is now becoming the most popular format for
holding configuration metadata for bespoke systems and even the
native format for many desktop applications. This article describes
how to develop a generic PL/SQL package to read the hierarchical
structure of XML metadata and map it to a hierarchical structure of
a PL/SQL nested table record type driven by its own XML mapping
rules. It is directed at PL/SQL developers looking to use XML as a
metadata format for their data driven applications.
To download the metadata mapping package
click here.
Data Driven Applications
Traditional Approach
Before looking at how XML can be read by a
PL/SQL application, let us take a look at how we may have
traditionally designed a data driven application. As an experienced
data modeller, over the years I have developed or come across many
enterprise level data models that have had a subset of entities to
control the behaviour of the application - the data driven “rules”
of the system.
Typically, the physical design would have resulted in a series of
screens to maintain these metadata tables, often having to navigate
down several parent child relationships. Configuring a development
system via these screens may have been acceptable, but configuring a
controlled test system or production system using proper versioning
control and deployment methods often forced developers to deliver
the rules using DML scripts. This is still not a particularly
elegant approach and not really properly addressing the issues with
maintainability, version control, and deployment.
From a PL/SQL coding perspective, retrieving and navigating around
the metadata in the database tables meant repetitive calls to the
database in nested loops. A more elegant design would have been to
map the relational data to a nested table PL/SQL record type which
would have enabled more efficient and flexible navigation.
The problem with rules and configuration metadata is they can become
enhanced and changed over time. This usually resulted in changes to
the table design, new columns, etc, which are not very convenient to
chop and change after a system has been deployed and been in use for
a while.
So the traditional approach, though a familiar and better understood
solution by veteran database designers, has plenty of scope for
improvement. The drawbacks of taking a traditional approach can be
summarized as follows:
-
Maintainability – data maintained
via screens and/or DML scripts.
-
Version control – to achieve good
traceability and version control of the configuration data
changes would probably require this built into the tables adding
a further unwieldy dimension to the data design.
-
Deployment – new data and further
changes would require DML scripts to deploy.
-
Flexibility - Changes to the
metadata design would require DDL changes and possible
transformation of data to accommodate a new structure.
XML Approach
XML has become a very attractive means of
representing certain types of data, in particular small “envelopes”
of data such as message payloads, configuration metadata, transient
data in web applications, etc.
In my view, it is definitely not ready for high capacity use such as
storing transactional data, performing complex queries, etc. So
there is still a very strong and valid case for having a hybrid
approach, where XML provides an excellent alternative to holding
configuration metadata in a series of relational tables.
The attraction of using XML comes from the way it overcomes the
limitations discussed earlier:
-
Maintainability – If you are content
with editing XML documents with an XML editor, then this rids
the need for maintenance screens. Otherwise, with a bit of XSL
development it is possible to develop browser based screens to
edit the XML directly.
-
Version control – XML is a plain
text document which can be easily stored in a version control
system such as CVS or VSS, along with comment tags to stamp the
version and other information against the document.
-
Deployment – XML can be easily
stored in the database either in an XMLTYPE column or in a CLOB
as a single item of data, and subsequently retrieved by a PL/SQL
application and the XML mapped to a nested table PL/SQL record
type.
-
Flexibility - Changing the structure
of XML is very easy, in particular adding or removing elements
and attributes.
Design Considerations
XML and PL/SQL
At first glance it is easy to be over-whelmed
with the available XML functionality in PL/SQL. It helps if you can
understand the origins of the various capabilities and see what
supersedes what as there is quite a bit of “legacy” support. The
main split is between XML Developers Kit (XDK) introduced in Oracle
8i and XMLDB introduced in Oracle 9i R2.
I find XMLDB, with its XMLTYPE and API packages DBMS_DOM,
DBMS_PARSER, and DBMS_XMLPROCESSOR, is by far the most convenient to
use and easily satisfies all of my XML programming needs. The
XMLTYPE object type is also supported with a host of methods to
interrogate and manipulate the XML content.
There are several ways of reading the content of an XMLTYPE
document, mainly via its ExtractValue method or the DBMS_XMLDOM
package. ExtractValue is convenient for pulling out specific values
from the XML document identified by an XPATH string, and DBMS_XMLDOM
is better for walking the through the nodes of the XML document in
particular where you want to read in the full data content.
PL/SQL Control Flow
Whichever method you op for to read the content
of the XMLTYPE, where it is being used to drive the flow of the
program as in the case of metadata, then in needs to be converted
into a more accessible format. The best way to simulate the
hierarchical structure of XML is using nested tables within PL/SQL
record types. In Oracle 9i there is no limitation to how many levels
of nesting you can descend, so your typical XML document should be
easily accommodated.
Once you have all of your XML metadata loaded into a nested table
PL/SQL record type, then it easy to traverse the arrays via nested
loops in the PL/SQL code, and having very little cost or overhead
with any further repetitive traverses.
We are now onto the main objective of this article which is to
explain how you would retrieve data from XML and load up the PL/SQL
record type. In the past I would have done this by looping through
the nodes in the XMLTYPE using DBMS_XMLDOM, mapping parent elements
to records in an associative array, and then mapping elements and
attributes holding data into columns on the record type. Apart from
having to hard code all of your mappings, this has been fairly
straight forward to code and has worked effectively. When you start
to use this approach extensively and find that you are repeating the
same structure of code again and again, and keep including the same
little intricacies of using DBMS_XMLDOM, then its time to write a
generic package which is driven by its own metadata to define the
mappings between the XMLTYPE and the PL/SQL record type.
Strong Typing
The main issue with PL/SQL record types in a
compiled program is that they are strong typed, i.e. their
definition is predefined at compile time, which means that any
generic package developed to perform the mapping would have to be
compiled with the PL/SQL record type it is trying to populate. A
very useful way to overcome this is to use dynamic PL/SQL and
generate the PL/SQL on the fly to access the PL/SQL record type.
Ideally, we would want to have a PL/SQL function to return a PL/SQL
record type populated with the data content of the XML. Since
parameters have to be strong typed as well, then it will not be
possible to develop a packaged function or procedure that returns
the PL/SQL record type populated with the XML data content. An
alternative means of passing data between programs is to use package
global variables. Dynamic PL/SQL can reference global variables in
any package loaded into memory by the session, so it is feasible to
generate dynamic PL/SQL code to populate each component of the
record type global variable.
Dynamic PL/SQL
Having the EXECUTE IMMEDIATE command makes
using dynamic PL/SQL very straight forward. Though this command does
have its limitations where code cannot exceed the 32k bytes in size
of code. The alternative is to use the more traditional, but also
more resilient method of using the DBMS_SQL supplied package, which
has no limit to the size of SQL you use. It is not as easy to use as
EXECUTE IMMEDIATE, but it does the job.
Solution
Outlined Design
So we have now discussed and overcome each
potential limitation and now have a workable solution. So to
summarise the overall design we have:
-
The generic PL/SQL program would use its
own XML metadata to define the XML to PL/SQL record type
mappings.
-
DBMS_XMLDOM would be used to retrieve the
data from the XMLTYPE.
-
The data content of the XML would be mapped
into PL/SQL record type using dynamic PL/SQL.
-
A global variable representing the PL/SQL
record type would be used as an interface between the the
calling program and the package solution.
Working Solution
My next step could be to explain how to code up
a solution, but instead I have developed my own open source solution
for you to use and examine. Feel free to download it and use it in
your own solutions. Any feedback or proposed enhancements are very
welcome.
To download the metadata mapping package
click here.
Demonstration Example
Source XML Metadata
Rather than show an authentic example and have
to explain what the metadata is representing, I have put together a
hypothetical example. It shows a simple hierarchical structure that
descends five levels with content data in a mixture of elements and
attributes. The elements have been given obvious names eg Level1,
Level2, ... etc.
<?xml version="1.0"?>
<level1 Id="1" Name="Name 1">
<description>Level 1 Demo</description>
<level2 Id="11" Name="Name 1.1">
<description>Level 2 Demo - 1.1</description>
<level3 Id="111" Name="Name 1.1.1">
<description>Level 3 Demo - 1.1.1</description>
<level4 Id="1111" Name="Name 1.1.1.1">
<description>Level 4 Demo - 1.1.1.1</description>
<level5 Id="11111" Name="Name 1.1.1.1.1">
<description>Level 5 Demo - 1.1.1.1.1</description>
</level5>
<level5 Id="11112" Name="Name 1.1.1.1.2">
<description>Level 5 Demo - 1.1.1.1.2</description>
</level5>
</level4>
</level3>
<level3 Id="112" Name="Name 1.1.2">
<description>Level 3 Demo - 1.1.2</description>
<level4 Id="1121" Name="Name 1.1.2.1">
<description>Level 4 Demo - 1.1.2.1</description>
<level5 Id="11211" Name="Name 1.1.2.1.1">
<description>Level 5 Demo - 1.1.2.1.1</description>
</level5>
<level5 Id="11212" Name="Name 1.1.2.1.2">
<description>Level 5 Demo - 1.1.2.1.2</description>
</level5>
</level4>
</level3>
</level2>
<level2 Id="12" Name="Name 1.2">
<description>Level 2 Demo - 1.2</description>
</level2>
</level1>
Target Record Type
This XML data needs to be loaded into a PL/SQL
record type that simulates the hierarchical structure of the XML
metadata:
TYPE t_level5 IS RECORD
(
id VARCHAR2(10),
name VARCHAR2(100),
descr VARCHAR2(100)
);
TYPE t_level5_col IS TABLE OF t_level5;
TYPE t_level4 IS RECORD
(
id VARCHAR2(10),
name VARCHAR2(100),
descr VARCHAR2(100),
level5 t_level5_col
);
TYPE t_level4_col IS TABLE OF t_level4;
TYPE t_level3 IS RECORD
(
id VARCHAR2(10),
name VARCHAR2(100),
descr VARCHAR2(100),
level4 t_level4_col
);
TYPE t_level3_col IS TABLE OF t_level3;
TYPE t_level2 IS RECORD
(
id VARCHAR2(10),
name VARCHAR2(100),
descr VARCHAR2(100),
level3 t_level3_col
);
TYPE t_level2_col IS TABLE OF t_level2;
TYPE t_level1 IS RECORD
(
id VARCHAR2(10),
name VARCHAR2(100),
descr VARCHAR2(100),
level2 t_level2_col
);
TYPE t_level1_col IS TABLE OF t_level1;
TYPE t_param IS RECORD
(
level1 t_level1_col
);
The record types do not necessarily have to be
PL/SQL record types, but can also be object types defined in the
database.
XML to Record Type Mappings
To define a mapping between the source XML
metadata and the target PL/SQL record type, we use some XML
metadata:
<?xml version="1.0"?>
<Mappings Global="demo_metadata.g_param" DateFormat="YYYYMMDD">
<Record
Id="Level1"
ParentId=""
Source="Level1"
ParentSource=""
Target="level1"
Type="test_metadata2.t_level1_col">
<Mapping Type="String" Source="@Id" Target="id"/>
<Mapping Type="String" Source="@Name" Target="name"/>
<Mapping Type="String" Source="Description" Target="descr"/>
<Mapping Type="Boolean" Source="@IsValid"
Target="is_valid_ind"/>
<Mapping Type="Number" Source="@Version"
Target="version"/>
<Mapping Type="LargeString" Source="Comments"
Target="comments"/>
<Mapping Type="Date" Source="CreatedOn" Target="create_dt"/>
<Mapping Type="Date" Source="UpdatedOn" Target="update_dt"
DateFormat="YYYYMMDDHH24MISS"/>
</Record>
<Record
Id="Level2"
ParentId="Level1"
Source="Level2"
ParentSource="Level1"
Target="level2"
Type="test_metadata2.t_level2_col">
<Mapping Type="String" Source="@Id" Target="id"/>
<Mapping Type="String" Source="@Name" Target="name"/>
<Mapping Type="String" Source="Description" Target="descr"/>
</Record>
<Record
Id="Level3"
ParentId="Level2"
Source="Level3"
ParentSource="Level2"
Target="level3"
Type="test_metadata2.t_level3_col">
<Mapping Type="String" Source="@Id" Target="id"/>
<Mapping Type="String" Source="@Name" Target="name"/>
<Mapping Type="String" Source="Description" Target="descr"/>
</Record>
<Record
Id="Level4"
ParentId="Level3"
Source="Level4"
ParentSource="Level3"
Target="level4"
Type="test_metadata2.t_level4_col">
<Mapping Type="String" Source="@Id" Target="id"/>
<Mapping Type="String" Source="@Name" Target="name"/>
<Mapping Type="String" Source="Description" Target="descr"/>
</Record>
<Record
Id="Level5"
ParentId="Level4"
Source="Level5"
ParentSource="Level4"
Target="level5"
Type="test_metadata2.t_level5_col">
<Mapping Type="String" Source="@Id" Target="id"/>
<Mapping Type="String" Source="@Name" Target="name"/>
<Mapping Type="String" Source="Description" Target="descr"/>
</Record>
</Mappings>
The metadata has two levels of mapping: one at
record level and one at column level. There are rules for defining
the mappings:
-
The record mappings are driven from the
record types, i.e. from the target's perspective.
-
The target record type must be a collection
(TABLE TYPE)
-
The hierarchical relationship between the
records is defined using an arbitrarily defined unique id ("Id")
and the parent id ("ParentId") on the child record acting as a
foreign key. The full parent-child chain must be defined without
any missing links or gaps.
-
The Record Mapping definitions must follow
in a parent child sequence, where a record mapping defined
before its parent is not supported.
-
The records do not have to contain any
column mappings
-
There can only be one and only one top most
parent record, i.e. where the ParentId is NULL. It must also be
the first to be defined.
-
The "Source" refers to the source element
that contains the attributes and/or elements holding the data
content. The "ParentSource" refers to the parent element name in
order to distinguish between elements and attributes used in
different parent elements.
-
The "Target" is the name of the record
column on the target PL/SQL record type.
-
The "Global" is the name of the target
global variable to be populated with the XML data content.
-
It is also important to define the "Type"
of the PL/SQL record, including making any reference to the
package header the type is defined in. Database object types
also can be used.
-
Column mappings are defined within the
"Mapping" elements. Source attribute names must be prefixed with
an "@" as in the XPATH notation.
-
The column type is used to determine what
data conversion is required. The allowable values are
-
String. No conversion, but must be less
than 32k bytes.
-
Number (performs a to_number
conversion), returns null if conversion fails.
-
Boolean (accepts yes, no, true, false,
Y, N, which it will convert into a TRUE or FALSE value in
BOOLEAN column on the record type.
-
Date which assumes a format mask as
specified by the "FormatMask" attribute in the mappings
metadata. If this mask has not been supplied, then it uses
the default date format of the database. The format mask can
be defined globally against the "Metadata" root element, or
against an individual column mapping.
-
LongString. The maximum size the XMLDOM
package can read from an XML node is 64k, sp this is to
handle data between 32k bytes and 64k bytes. Rather than use
CLOBs in the record type ad worry about the management of
temporary LOBs, the data is read into an table type of
VARCHAR2(32767). With current limitations this would result
in array not exceeding two entries.
Conclusion
The solution described here gives a quick and
easy method for mapping an XML document to a PL/SQL record type. The
dbi_metadata package I have developed can be incorporated very
easily into to your own package solutions giving you the power and
flexibility to quickly utilise XML as a metadata medium.
The dbi_metadata package has been released as
an open source project. So feel free to use it and provide back
whatever feedback you may have.
To download the metadata mapping package
click here.
|