Data Warehousing
Using data warehousing processes developed tailored for each customer we can provide our customers with a single database that will contain all of the critical business information collected from various different application used in the organization. This technique allows for simplified reporting based on a more complete view of the business while also dramatically improving report performance.
What is data warehousing?
Simply put data warehousing is a term used to describe the processes that consolidate data from multiple databases into a single database.
What data goes into a data warehouse?
Most often the source databases are operational databases that store daily transactions. A common example of such a database would be an accounting package (ERP), inventory control system or time & billing system.
Why would I benefit from having a data warehouse?
Many organizations have multiple databases responsible for managing different aspects of the business which of course restricts the ability of creating reports and analyzing data across these dissimilar databases. The following are some of the most immediate benefits that having a data warehouse can provide:
- Dramatic report performance improvements. Reports that previously took minutes (or hours) to complete may only take seconds.
- Consistent information across all departments within an organization.
- Reduce one-off, ad-hoc reports generated by multiple staff using multiple data sources which creates inconsistent outcomes and promotes excessive spreadsheets & report files.
- Consistent implementation of organization specific calculations which helps reduce duplication of formulas spread across hundreds or thousands of spreadsheets.
- Control information filters from a single location which helps reduce possible unintended exposure of information based on the outcome of various aggregations.
I only have one database. Do I still need a data warehouse?
Possibly. As mentioned above data warehouses provide a number of key benefits to companies regardless of whether or not the source information originated from one or many databases. If your single database does not provide you with the reports you need or the existing reports run very slowly that you can benefit from a data warehouse.
I am thinking of changing or upgrading my database software (I.E. Accounting package). What happens to my data warehouse?
When designing a data warehouse one of the goals of that design is around the information needs of the organization. The data warehouse is there to help answer questions about the business. Since this is a much higher-level scope the actual source of the data is not as critical to the warehouse. By changing the data source we may have to change the method used to retrieve that information but this will not impact the functions within the data warehouse or reports you have developed that rely on the warehouse.
What is an analytical database?
Quite often data warehouse database are also referred to as analytical databases. Analytical databases are database designed for the purpose of data extraction and analysis. One of the key principles of analytical databases is the propensity to de-normalize data. If done correctly de-normalization within the context of the analysis to be performed can take report generation that might take 10 minutes down to 10 seconds when compared with the same report generated against an operational database. Of course the performance benefit will vary depending on the analysis being performed but the results are dramatic.
What is an analytical database?
Normalization is a term used to describe a rule of database design that simply put means a single information asset shall be stored only one in a database and simply referenced by some index or key when necessary. A simple example of normalized information would be the last name of an employee. If you would like to change the last name of an employee instead of storing that on every transaction they have created throughout the database we would simply store the user ID with any relevant transactions while the actual name would be stored in a proper employee table. This is a very simple principle however it has far reaching implications on the reliability and scalability of your databases. As the name would indication de-normalization is essentially the opposite of normalization where information is duplicated in the database. Each technique has an appropriate context in which to be applied.
Report Design & Consultation:
One of the principle reasons for data collection is the generation of reports. These reports may vary in function from operational reports such as invoice generation to analytical reports summarizing various key performance indicators (KPI). While the report purpose will vary, the user expectation tends to be the same: deliver accurate information in a minimal period of time.
What is a reasonable amount of time for a report to take?
This is a very subjective but common question. Of course the answer depends on numerous factors ranging from available hardware resources, database engine limitations and database utilization. Unless there are some extenuating circumstances we generally tend to target under 30 seconds from the time the user request a report to the time the report is visible on the screen. In cases of larger databases (100GB+) accomplishing these goals becomes very dependent on data warehousing and OLAP technologies.
What is the difference between analytical reports and operational reports?
Operational reports tend to be quite simplistic and only format data as it pertains to a specific business transaction. Invoices, purchase orders, order forms, statements are examples of common operational reports that are used frequently. Analytical reports will either contain rules/equations proprietary to the organization as a method of producing or analyzing key performance indicators (KPI). Analytical reports may also be derived from information either directly in the operational database or from a separate data warehouse (see more information about in the Data Warehousing section above).
Is it possible to increase the speed of a report that came with my existing software package?
Almost any database application will have a series of reports that are included with the application. In the case of ‘off-the-shelf’ software, reports are designed to be as flexible to as many industries as possible. Of course it is not reasonable to expect these generic reports to take into account the knowledge of which information assets are more critical than others in your organization. The result being a less than ideal report taking long periods of time to complete. This problem is exacerbated as the database size and user count increase over time with the size of your organization. By using one or more strategies such as data warehousing, database index management and/or report design changes, BKY has helped reduce report generation time from hours to seconds. Of course the scale of performance improvement will vary between reports and organizations. Some of the most common databases we work with in terms of increasing report performance are ERP databases (I.E. Accounting software packages such as Great Plains, Sage Accpac or SAP).
What is index management?
Index management is the process of evaluating the business function of the information stored in a database and determining the most critical information that is frequently accessed, the type of information as well as the function of that information. By taking these factors into account we can recommend and implement new indexes to the database that will not impact the day-to-day functionality of the database while improving the ability to generate reports from that database. In most cases it is our preference to defer to data warehousing as a method of improving report performance since any changes or improvements to the data warehouse will have no impact on the source database.
Can new reports be added to existing software?
Yes. The majority of applications will be based on industry known database engines which can be accessed by a number of report designers (I.E. SAP Crystal Reports, SQL Reporting Services, IBM Cognos, etc). If the database engine used for the application is more obscure this can impede some of the options available for customizing existing or creating new reports as well as finding individuals with the expertise in those reporting packages.
I have some information spread across multiple databases but I need to consolidate this data on a single report. Is this possible?
Yes. Depending on the complexity of the data this can be accomplished by most reporting packages. In more demanding situations it may be far more cost effective and performance oriented if a data warehousing strategy is employed for those scenarios.
What are some of the things I should look for in a reporting package?
Some of the most important factors you should consider when shopping for a reporting package include the following:
- Licensing cost
Licensing can be one of the most important factors simply due to the wide range of licensing costs in the industry. These costs can range anywhere from a few hundred dollar to hundreds of thousands of dollars while the quality and effectiveness of the package may not relate to the price. Here are a few question to ask:- How much does a designer license cost?
- How many users will need to create new reports?
- How many users may generate (print) reports at any given time? (concurrent user count)
- Will the reports require web delivery and does web delivery carry and special complexity or costs?
- Compatibility
If there is a preferred or recommended report package that may have been used with the development of your existing database this should be seriously considered. Selecting a reporting package other than the one that came with your application may involve more effort to integrate depending on the application.
What are some of the things I should look for in a consultant to help with designing reports?
Choosing the right people to help create or modify reports can have a significant impact on the performance, accuracy and scalability of reports. It is important the people designing your reports ask many questions and preferable to have familiarity with the database structure you are working with. The following are some other key points that report designer consultants should have:
- Expertise
Selecting or continuing to use an obscure reporting package will equate to higher labour costs as well as dependency on fewer companies to provide training, design and deployment services. It is far more cost effective if the reporting package is a reputable, industry trusted solution. In some cases this may outweigh the benefit of using the reporting package that may have been provided with your existing database. - Training
As your organization grows it is assured your reporting needs will also grow. Having employees that understand even basic report design can help reduce the cost of external consultants. This is not a requirement but can have significant cost benefits depending on the number of reports to be managed. - Transparency
Perhaps more critical for analytical report it is important that your organization understand and test the business logic and outcomes generated by reports. If you hire an external consultant to create a report for you it is important the consultant understand and explain how the information will be processed. This will help reduce or eliminate incorrect business logic from undermining the accuracy of the reports. If your report designer cannot articulate how the data will be reflected in the report this may be an indication that he/she either does not understand the goals of the report or they are not sharing critical information that would allow your company to improve the reports in the future in an effort to create a dependency on those consultants.
BKY Technologies is your partner for Expertly Managed IT Services. Contact us today for a consultation.