In the world of blog writing, “x vs. y” is a common theme. Contrasting two concepts, products, or services provides several cognitive advantages. Comparisons highlight both unique characteristics and shared features. Understanding the relationship between two items requires critical thinking and analysis. We also tend to learn better when information is presented in context, as opposed to abstraction.
However, OLTP (online transaction processing) vs. OLAP (online analytical processing) is a special case in the data space. Because OLTP and OLAP are, quite frequently, at odds with each other. Writing for the Data Products Substack, Gable.ai’s Mark Freeman shares his first-hand account of the problematic ways OLTP and OLAP can shape the “data worldview” of their users:
In my last role as a data scientist, I was hired at an HR tech company specifically to translate between the engineering and research teams, essentially turning analytics and I/O psychology theory into data products.
Despite both teams having the same ultimate goal of building amazing HR data products, I experienced firsthand the breakdown of communication and misunderstanding of needs amongst these two groups through this translation work. Upon talking to others in the data industry I quickly learned that this challenge wasn’t unique to my job, but rather an endemic issue across data teams of various company sizes.
Fortunately, Mark’s experiences have also led him to some thoughts on how OLTP and OLAP practitioners can bridge the “versus” gap in their actual working environments. But first, let’s cover what these two forms of data processing are, the roles they play, and where the complications arise.
OLTP and OLAP data processing
In the context of the data space OLTP and OLAP are two fundamental concepts that serve different but complementary purposes, both in managing and utilizing data within organizations.
OLTP at a glance
Basic role
OLTP systems are designed to manage transaction-oriented applications. As such, they are optimized to handle large numbers of short atomic (i.e., indivisible), and often simple operations.
Primarily, these operations are involved in many day-to-day transactions that help a business operate like order entry, retail sales, financial transactions, and customer relationship interactions.
Key characteristics
- Operationally focused, directly involved in business operations
- High throughput for short operations like create, read, update, and delete (CRUD)
- Ensures data integrity in real-time
- Deal with current data that is highly detailed
Use case examples
- Banking systems
- Online booking systems
- Retail point of sale (POS) systems
OLAP at a glance
Basic role
OLAP systems are designed for query-intensive and complex analytical calculations. These systems are optimized for data analysis and geared toward querying large sets of data at once to aid in looking for patterns, trends, and insights.
The primary role of OLAP is to aid in decision-making processes, like data mining, business intelligence (BI), and trend analysis.
Key characteristics
- Analytically focused, used to drive strategic business decisions
- Very efficient in handling complex queries that involve aggregations and joining large datasets
- When used, data is often consolidated from various OLTP databases and other sources
- Involves historical data analysis, as OLAP systems allow users to work on large volumes of historical data to derive insights over time
Use case examples
- Financial reporting, customer relationship management (CRM) analytics, data warehousing systems
How OLTP and OLAP work together
In integrated data warehousing environments, OLTP and OLAP will often share a data source-destination relationship. OLTP systems collect data through daily business operations and then pass it to OLAP systems for analysis. As part of this process, OLTP systems typically use an Entity-Relationship (ER) model, optimized for transactional efficiency. OLAP systems, alternately, will use a star schema or snowflake schema more optimized for reading and analyzing data.
This congruent relation is important to highlight, as OLTP and OLAP, despite their differences, function as complementary systems. Due to OLTP’s focus on efficient transaction processing and OLAP’s on complex data analysis, both are often necessary for a comprehensive data strategy that is two parts of a whole.
Appreciation of this interplay between OLTP and OLAP is important, as it lays a foundation for helping us appreciate how these complementary systems foster such a surprising amount of discord amongst data professionals.
Unpacking the “vs.” in OLTP vs. OLAP
As Mark witnessed firsthand, the tension between OLTP data producers and OLAP data consumers stems from how these two groups approach the handling and interpretation of data. While related challenges will be unique to each organization, here are seven factors that consistently contribute to the OLTP/OLAP communication gap:
1. Separate objectives and priorities
As the data producers in this equation, OLTP systems are employed in an organization to manage and process transactional data. Use cases like updating inventory and managing customer information requires immediate, real-time data. This necessitates that OLTP users prioritize operational efficiency, preserving data integrity, and sheer speed.
OLAP system users, as data consumers, have very different priorities. They are looking to cast a slow, thorough gaze across large amounts of historical data in an attempt to identify trends that can improve strategic decision-making. This requires a combination of data aggregation, complex queries, and measured data analysis in order to extract insights and support long-term planning.
2. Different data structures and models
OLTP data producers use a normalized data model in order to reduce data redundancy and maintain data integrity. Normalized data models are also easy to organize and update, in addition to being very consistent.
However, while these factors benefit transactional processes, normalized data models are inefficient when used for complex queries, requiring data from multiple tables. This is exactly what data analysis requires, which is why OLAP data systems frequently utilize a denormalized data model. These models organize data into large comprehensive models that facilitate querying which can be both quick and complex, by comparison.
3. Misaligned interpretations of data
Holding very distinct objectives, producers and users may understandably interpret the same data in very different ways. Again, producers are looking for the quickest and most efficient ways to record and retrieve transactional data so that it remains highly in sync with the real-world operations of a business. But for the consumers, working to identify trends through moment-to-moment changes would be like studying individual grains of sand to track the ebb and flow of an entire coastline.
4. Distinct languages
OLTP and OLAP users quite literally end up speaking very different languages.
OLTP users typically leverage more technical, operational language in their work. Terms and concepts in the world of real-time transactional processes involve database management, application performance, and transactional integrity. Water-cooler discussion might center on resolving issues related to query speed, system reliability, and database normalization.
Meanwhile, in the world of the OLAP user, language will be much more business and analytically focused, including business intelligence (BI) terminology, statistical analysis, and data visualization concepts. Around this respective water cooler, practitioners might instead discuss data warehousing, business metrics, dimensional modeling, and big data analytics.
5. Data quality and transformation issues
As data moves from an OLTP to an OLAP system, issues can arise regarding data quality and transformation. Attempts to resolve these issues through denormalization or integration from additional sources can result in misalignments concerning data accuracy, timeliness, and completeness.
6. Complexity in business logic
The business logic used to great effect in OLTP environments may not translate cleanly into an OLAP system.
For example, in a sales environment, every time a customer makes a purchase, inventory needs to be updated, sales details need to be captured, and loyalty or rewards points might even need to be tallied–all of which an OLTP system would log behind the scenes. Business logic in this example would involve immediate updates to ensure inventory accuracy, recording of sales details, and updates to customer profiles. Stock quantities need to be decremented and loyalty points incremented.
Downstream, however, OLAP users are working to analyze sales data over time to help stakeholders make strategic decisions, like understanding the seasonality of customer buying patterns and predicting future inventory needs. This changes the business logic. Here, the focus is not on individual transactions and instead is on aggregating the sales data over time. Users and their systems might look to calculate average sale values per customer, or to review years of historical sales data to improve their ability to forecast future stock requirements.
No surprise then, that in our example business logic from the former would not seamlessly transfer to the latter. While still complementary, disparate logic would need to be corrected each time a data transfer is made, resulting in the aforementioned discrepancies with how the data needs to be understood and used.
7. Different performance metrics
Finally, both professional camps would be equally concerned with measuring success. But while OLTPs would be judged based on metrics like transaction throughput and response time, OLAPs would instead like to be indicators of data freshness and query response time. Even when all involved in a given organization understand the metrics used by the other party, there’s no reason to believe they’ll inherently value them accordingly.
How then to bridge the OLTP vs. OLAP gap?
Not only is the OLTP vs. OLAP issue interesting to consider, but overcoming it may prove instructive, as more “versus” situations may be ahead, with more organizations using more data more of the time. But Mark’s own experiences lead us to a conclusion we can build on:
...the solution to this divide between OLTP and OLAP database users is not to force the benefits of one side or the other but to instead foster empathy so that both sides can translate the needs of each other...we made great strides by including the data consumers in the product requirements document (PRD) writing phase. It worked well because we added the OLAP perspective before data producers were constrained by existing requirements to deliver a product on a tight deadline. For the data consumers, we finally had an opportunity to see upstream and learn the product constraints that impacted our business logic.
But, as Mark notes, any solution that bridges this gap needs to scale in order to be effective. Because when they don’t, teams drift back into their comfort zones, perpetuating the OLAT vs. OLAP issues.
Ultimately, this led Mark to realize data contracts are the ideal avenue through which to scale and maintain empathy between OLTP and OLAP users—creating ecosystems where data sharing and analysis are not only more efficient but also more aligned with the needs and expectations of all parties involved.
Moreover, it also led Mark to Gable.ai, where he’s now part of our passionate team, working to transform how data producers and consumers work together.
Speaking of, our product waitlist is currently open. Make sure to secure your spot today.