Data Warehousing Techniques and Data Mining
Corporations often maintain their data in various “legacy” information systems that have evolved over the years to track “corporate knowledge.” Because such systems may have drawn from different databases or spreadsheets, may be on different operating systems and computer platforms, and were often developed at different times by different staff for a wide range of purposes, the quality of their data varies significantly. Legacy systems are also extremely difficult, costly, and inefficient to maintain. Most important, though, is that key pieces of data from across the systems cannot be integrated into one knowledge base to help in making sound corporate decisions.
The Information Sciences Group has developed a way to integrate essential pieces of corporate knowledge from legacy systems. By using a data warehouse approach, the team has successfully consolidated these disparate systems into an effective information warehouse that can be used to support corporate decision making.
Argonne’s team of information scientists has been researching, developing, and applying technology and processes that bring disparate databases and information sources together in an integrated system to house corporate knowledge bases. These processes focus on capturing metadata (i.e., information about information) from the source systems. Doing so provides the context, purpose, and limitations of the data sources and helps explain why these sources were created.
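As an illustration only, a metadata record of this kind might be sketched in Python as follows. The class name, field names, and sample source systems are hypothetical and are not taken from Argonne's tools; the sketch simply shows context, purpose, and limitations being recorded for each source system and indexed in a catalog.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class SourceMetadata:
    """Hypothetical metadata record for one legacy source system."""
    system_name: str
    purpose: str          # why the source was created
    context: str          # where and how the data are collected
    limitations: List[str] = field(default_factory=list)  # known caveats


def build_catalog(entries: List[SourceMetadata]) -> Dict[str, SourceMetadata]:
    """Index metadata records by system name, rejecting duplicates."""
    catalog: Dict[str, SourceMetadata] = {}
    for entry in entries:
        if entry.system_name in catalog:
            raise ValueError(f"duplicate source system: {entry.system_name}")
        catalog[entry.system_name] = entry
    return catalog


# Invented sample entries, for illustration only.
catalog = build_catalog([
    SourceMetadata(
        system_name="hr_spreadsheet",
        purpose="Track staff assignments",
        context="Maintained by hand, updated monthly",
        limitations=["No history before the current fiscal year"],
    ),
    SourceMetadata(
        system_name="procurement_db",
        purpose="Record purchase orders",
        context="Populated by an order-entry application",
    ),
])
```

Keeping the limitations alongside the purpose means that anyone reading the catalog later can see what a source should not be used for, not just what it contains.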
Once the metadata are clearly defined, the Information Sciences Group applies a traceable series of transformations that convert the data, step by step, into a usable, easily integrated format. The transformations for each source data system provide the components of a data warehouse, which allows investigation of integration points among the data sources and construction of data marts.
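The idea of stepwise transformation with traceability can be sketched in Python as shown here. This is a minimal illustration under stated assumptions, not the group's actual implementation: the TracedDataset class, the step functions, and the sample record are all hypothetical. Each step returns a new dataset and appends its name to a lineage list, so the path from source format to warehouse format can be reconstructed.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Record = Dict[str, object]


@dataclass
class TracedDataset:
    """Hypothetical container: rows plus the ordered list of steps applied."""
    source: str
    rows: List[Record]
    lineage: List[str] = field(default_factory=list)

    def apply(self, step_name: str, fn: Callable[[Record], Record]) -> "TracedDataset":
        # Copy each row so earlier stages stay intact and can be audited.
        new_rows = [fn(dict(row)) for row in self.rows]
        return TracedDataset(self.source, new_rows, self.lineage + [step_name])


def rename_fields(mapping: Dict[str, str]) -> Callable[[Record], Record]:
    """Build a step that renames source columns to warehouse column names."""
    def step(row: Record) -> Record:
        return {mapping.get(key, key): value for key, value in row.items()}
    return step


def normalize_id(row: Record) -> Record:
    """Trim and upper-case the identifier so it can be matched across sources."""
    row["employee_id"] = str(row["employee_id"]).strip().upper()
    return row


# Invented sample row from an assumed legacy spreadsheet.
legacy = TracedDataset("hr_spreadsheet", [{"EmpNo": " a17 ", "Dept": "ENG"}])

staged = (
    legacy
    .apply("rename_fields", rename_fields({"EmpNo": "employee_id", "Dept": "department"}))
    .apply("normalize_id", normalize_id)
)
```

After both steps, staged.lineage holds the step names in the order they ran, while legacy still holds the untouched source rows. A normalized key such as employee_id is the kind of field that could serve as an integration point between sources.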
Construction of a data warehouse allows development of data mining and decision analysis tools that identify patterns, trends, and other information to support corporate decisions. Argonne then uses advanced visualization tools, statistical analysis tools, and full-text search engines to mine, extract, and analyze the data in the warehouse. This process yields insight into the corporate knowledge base to determine “what we already know.”
Argonne’s data integration techniques provide lasting solutions that can improve an organization’s competitive edge. In today’s changing cyber environment, data warehousing and data mining techniques are providing essential tools and technologies for leveraging corporate knowledge, thereby leading to better and faster decisions.
For more information, contact:
Craig Swietlik
Information Sciences Group
Decision and Information Sciences Division
Argonne National Laboratory
9700 South Cass Ave., Bldg. 900
Argonne, IL 60439
Phone: 630-252-8912
Fax: 630-252-5128
E-mail Craig Swietlik