|
In this article we make a 'test drive' of XML indexes. We compare a traditional database schema with 2 master - Details tables versus one table where data is encoded in XML. Master table is called Orders, and the details table is called OrderDetails. They are joined using the column ?id?. You can download a script to create and populate all tables here CREATE TABLE [dbo].[Orders] ( [id] [varchar](20) NOT NULL , [doc_number] [int] NOT NULL , [doc_date] [datetime] NULL , [branch] [varchar](10) NOT NULL , [address] [varchar](255) NOT NULL , [payment_date] [datetime] NULL , [processed] [int] NULL ) ON [PRIMARY] GO CREATE TABLE [dbo].[OrderDetails] ( [id] [varchar](20) NOT NULL , [code] [varchar](4) NOT NULL , [price] [decimal](15,2) NULL , [quantity] [int] NULL , [tax] [decimal](2) NOT NULL , [exp_date] [datetime] NULL ) ON [PRIMARY] GO
We populate table Orders with 94,300 rows and with about 1,200,000 rows in OrderDetails (approx. 100 rows in orderDetails in one Order). This is done by stored procedure RandTable2. We also insert the same data into the table, where all information from the table Details is stored in one XML column 'x' (it is done by stored procedure FillXML_Attr):
CREATE TABLE [dbo].[OrdersXML_Attr] ( [id] [varchar](20) NOT NULL , [doc_number] [int] NOT NULL , [doc_date] [datetime] NULL , [branch] [varchar](10) NOT NULL , [address] [varchar](255) NOT NULL , [payment_date] [datetime] NULL , [processed] [int] NULL , [x] [xml] NULL ) ON [PRIMARY] GO
Data is stored in the following format (attribute-style): <Orders id="00000001239230086">
<Orders>
Then we run some simple tests against these tables. First test retrieves all Orders containing items with code ?0123?. We made all test against the relational tables Orders/OrderDetails and against the tables with xml data. We performed these tests with one xml-index (VALUE) and with two indexes (VALUE+PATH). -- Relational
indexes GO ALTER TABLE OrdersXML_Elem ADD CONSTRAINT OrdersXML_Elem_XPK PRIMARY KEY (id) GO CREATE UNIQUE INDEX Orders_XK1 on Orders (id) GO CREATE INDEX OrderDetails_XK1 on OrderDetails (id,code) GO -- XML indexes CREATE PRIMARY XML INDEX OrdersXML_Attr_PRIMARY on OrdersXML_Attr (x) GO CREATE XML INDEX OrdersXML_Attr_VALUE ON OrdersXML_Attr(x) USING XML INDEX OrdersXML_Attr_PRIMARY FOR VALUE GO CREATE PRIMARY XML INDEX OrdersXML_Elem_PRIMARY on OrdersXML_Elem (x) GO CREATE XML INDEX OrdersXML_Elem_VALUE ON OrdersXML_Elem(x) USING XML INDEX OrdersXML_Elem_PRIMARY FOR VALUE GO
(**) element-style query select id, x.value('(/Orders/OrderDetails/code)[1]','int') as o from OrdersXML_Elem where x is not null and x.exist('/Orders/OrderDetails [code="0123"]')=1
select distinct o.id, d.code, o.doc_number from Orders o, OrderDetails d where o.id = d.id and d.code = '0123'
In this table you can see results of our investigation. We provide 2 results in each cell, one for 'hot' execution (2nd or 3rd execution of the same query) and the second value for 'cold' execution (a very first executuon with void buffer cache, after execution of DBCC DROPCLEANBUFFERS)
Conclusion As usual, XML is nbiot a silver bullet. It is not a good idea to replace everything with XML. But you can use it rwhere eally fits, where data structured are not stable, are badly defined, changes every day, come from external data sources etc.
|
|
Disclaimer:
In many articles you will find the execution times of different queries. Please do not interpret these results scientifically or as official benchmarks. You will not even find the configuration of the computer used for these experiments because it is rather unimportant. What is important is that query A is running 2 times slower then query B. On your computer, it could be different: 1.5 or even 3 times slower. Read the full disclaimer » | |
|
|