Choices are always good; more selection means a greater chance of finding exactly what you want. In the world of analytical databases, however, it’s not always clear which choice best suits the needs of an organization. Such is the case with the SQL Server 2016 Analysis Services. The tabular mode of analysis services was introduced in 2012, and saw improvements in 2014 and 2016, while the classic multidimensional mode has changed little since that time. Does this mean that the tabular mode is the hot new toolset that will replace the venerable multidimensional mode? Absolutely not! Let’s look at the strengths and weaknesses of each.
SQL Server 7 was the first version of SQL Server to come with an analytic database; the current Analysis Services technology debuted in 2005. In a multidimensional Analysis Services database, the many varied ways that analysts might wish to “slice and dice” their data looking for insights are designed right into the database itself. Analysts need not concern themselves with the raw data or with organizing the raw data into useful form; the database already knows about fiscal quarters, the calendar year, and the sales territories. Providing ease of use to analysts, however, does not come without without substantial cost. Well-training and experienced database architects must carefully handcraft a multidimensional database with a clear understanding of what the analysts need.
Observations on the ways real businesses were utilizing the Analysis Services revealed a variation of Pareto’s rule, often called the 80-20 rule. 80% of the work being accomplished with the Analysis Services was using only 20% of the system’s capabilities. This caused Microsoft to pose an important question. Might we create a new system focussed on the most frequently used capabilities and make that new system faster and easier to learn? This question opened up fertile new areas that have brought us not only the tabular mode of Analysis Services but also Excel Power Pivot and other tools in Microsoft’s “self-service” business intelligence lineup. Data is stored in a tabular structure much more familiar to folks, so the tabular mode is easier to learn. Power Pivot and the tabular mode also introduced a new data language called DAX (Data Analysis eXpressions). DAX was designed to resemble the functions used in Microsoft Excel, rendering the new database platform less daunting to analysts with Excel experience, but who are not themselves database professionals.
In addition to being less daunting and easier to use, the nin-memory structure of the tabular mode proved to be blazingly fast. In my own experiments, in those circumstances where the multidimensional mode and the tabular mode can accomplish the same task, the tabular mode has always come out on top for performance.
Sounds like the Analysis Services tabular mode is the flat-out winner. Well, no, not quite. Advantages always come with a cost, and in the case of the tabular mode the most notable cost is computer memory. A tabular mode database must live entirely in RAM. This places an absolute physical limit on the size of the database. The database size, in fact, must be significantly less than the size of available memory, since not only must the data fit entirely into RAM, but also the code and support structures. If your data load exceeds the limits of RAM there is no decision to be made; you cannot use the tabular model.
In contrast, a multidimensional database is, for all practical purposes, unlimited in size. More data requires more hard drive space, nothing else. While having more RAM is always better, of course, there is no absolute limit imposed in the multidimensional mode as in the tabular. (There are ultimate limits, of course, but none that are encountered in actual practice.) And, while implementing a multidimensional database still require custom design work on the part of administrators, Analysis Services multidimensional databases can be hosted on load-balancing clusters, so even the limitations of a single machine are not the limitations of the multidimensional database.
Lastly, well, it is called the “80-20” rule after all. There’s always that 20% who use more of the features supported by the multidimensional mode. These features include support for more complex queries and for integrating the data with customizable features called “actions”. These features are simply not available in the tabular mode. Some might point out that the multidimensional mode supports the Microsoft data mining tools, which the tabular mode does not. This is certainly true, but in my opinion should not be a deciding factor. The Analysis Services data mining tools have always been weak and poorly designed. They are too arcane for people without data mining experience to understand, and they are too idiosyncratic and inflexible to be of interest to folks who do have this experience.
Microsoft’s relatively recent acquisition of Revolution Analytics indicates a newfound and serious commitment by our friends in Redmond to the statistical language R. In my opinion, this marks a change in direction for data mining within the Microsoft business intelligence stack. R is, I believe, the toolset with the greater future. The Analysis Services data mining tool should not be considered when evaluating the relative merits of the multidimensional versus the tabular mode.
The short summary, then, is that the tabular mode will provide better performance and ease of use for many, but not all, analytic database requirements of a modern enterprise. Organizations with large volumes of data will have to rely on the multidimensional mode, as will analysts who need to explore data of subtlety and complexity beyond the limits of the tabular mode.