About the Series …
This article is a member of the series, MDX Essentials.
The series is designed to provide hands-on application of the fundamentals of
the Multidimensional Expressions (MDX) language, with each tutorial
progressively adding features designed to meet specific real-world needs.
For more information about the series in general, as well as
the software and systems requirements for getting the most out of the lessons
included, please see my first article, MDX at
First Glance: Introduction to MDX Essentials.
Note: Service Pack 3 updates are assumed for MSSQL
Server 2000, MSSQL Server 2000 Analysis Services, and the related Books
Online and Samples.
Overview
In
an earlier article, The
CROSSJOIN() Function: Breaking Bottlenecks, we examined the use of CROSSJOIN(), and
factors that can render this otherwise powerful function suboptimal within our
queries. We discussed a business need as defined by a hypothetical group of
information consumers, in which we were asked to tune an MDX query for more
optimal performance. Our focus centered upon enhancing query performance when using CROSSJOIN()
in medium- to
large-sized data sets. After discussing how CROSSJOIN() works in
general, and pointing out the way in which its operations can result in
crippling performance overhead, we exposed approaches to mitigating that
overhead within practice exercises designed to reinforce the concepts.
We learned that using NONEMPTYCROSSJOIN()
is, by far, the most effective avenue to minimizing the bottlenecks that plague
standard CROSSJOINS() within challenging cube scenarios. We examined
two approaches to using NONEMPTYCROSSJOIN() in achieving our ends,
finding refinements in the second approach, where we employed the optional set
count parameter in the function, to provide more efficiency than the first
(which, even in its "vanilla" context, had demonstrated its power to
enhance performance dramatically). We noted, however, a prevailing concern
amid all this success: NONEMPTYCROSSJOIN() filters out calculated members, and so it is
not useful in a scenario where calculated members are to be returned.
In this article, we will examine the enhancement of queries
using CROSSJOIN() where calculated members are, indeed, to be returned.
We will begin with a simple, but intensive, CROSSJOIN() scenario,
reviewing how CROSSJOIN()
performance can become an issue where larger sized data sets are involved. We
will then undertake a multiple-step practice example, which will initially help
us to gain an understanding of the issues encountered with calculated
members. We will attempt the approach to minimizing performance overhead
that we used in The CROSSJOIN() Function: Breaking Bottlenecks, where we met with a simpler
scenario that did not involve calculated members. We will then provide an
approach that provides palpable relief of the performance issues, while
returning a calculated member that we require to achieve our reporting and
analysis objectives.
To accomplish our examination of CROSSJOIN()
enhancement when calculated members are a factor, we will undertake the
following steps in this article:
-
Review CROSSJOIN()
performance considerations that we introduced earlier; -
Create a copy of the Warehouse
sample cube for use in our practice exercise; -
Add a calculated member
to a dimension in the clone cube for consideration within our article; -
Prepare the cube further by
processing; -
Examine an
instance of suboptimal query performance that we determine to be due to the
resource-intensive use of CROSSJOIN(), highlighting factors that cause
performance degradation; -
Demonstrate
issues inherent in the attempt to enhance a suboptimal CROSSJOIN() scenario by substituting NONEMPTYCROSSJOIN(),
when calculated members need to be retrieved; -
Provide an
approach to enhancement of the CROSSJOIN() scenario
with concomitant return of calculated members, using the GENERATE() function; -
Explain the results we obtain from the steps we take to
accomplish the solution.