This sample assignment for international computer science students is titled "How Effective is Metadata?".
In a rapidly developing technological world, large organizations deal with huge amounts of information. As this information grows, retrieving historical data has become an essential task for decision makers. This is where the data warehouse comes into play. A data warehouse is a collection of integrated databases designed to support managerial decision making and problem solving.
Metadata is one of the most important features of a data warehouse. This paper explains how effective metadata is in a data warehouse and how it helps improve the warehouse's performance measures, making it highly efficient.
Retrieving the required information from a data warehouse without metadata would be a daunting task. Metadata is a small piece of data that tells decision-making analysts what kind of data they are looking at and where it is stored, which makes the task easier and saves time. Metadata therefore has a significant role in a data warehouse. It has two parts: front room and back room metadata. Back room metadata supports extraction, cleaning and loading. Front room metadata is descriptive and helps the warehouse function smoothly.
When a data warehouse is created for a large organization, many kinds of metadata have to be stored, and a meta model is created for this purpose. A meta model is a conceptual model of the metadata database in which the warehouse's metadata is stored. It gives a detailed description of the metadata units and the relationships between them. The types of metadata and their abstraction levels have a direct impact on how the meta model is built. Metadata management systems mostly follow a federated structure, which combines the advantages of centralized and distributed structures and allows a variety of metadata databases to be used for storage. The meta model should be highly scalable so that, when application requirements change, users can customize the application-specific components of the metadata.
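The idea of a meta model as a set of metadata units plus the relationships between them can be illustrated with a minimal Python sketch. The class names (`MetadataUnit`, `MetaModel`), the `kind` labels and the example table and column names are all hypothetical and chosen only for illustration; they do not come from any particular warehouse product.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataUnit:
    """One unit of metadata, e.g. the description of a table or a column."""
    name: str
    kind: str  # hypothetical labels such as "table" or "column"
    # Free-form attributes stand in for the application-specific part
    # that users can customize when requirements change.
    attributes: dict = field(default_factory=dict)


@dataclass
class MetaModel:
    """Stores metadata units and the relationships between them."""
    units: dict = field(default_factory=dict)
    relationships: list = field(default_factory=list)  # (source, relation, target)

    def add_unit(self, unit: MetadataUnit) -> None:
        self.units[unit.name] = unit

    def relate(self, source: str, relation: str, target: str) -> None:
        # Only registered units may take part in a relationship.
        if source not in self.units or target not in self.units:
            raise KeyError("both units must exist before they can be related")
        self.relationships.append((source, relation, target))

    def related_to(self, source: str) -> list:
        """Return (relation, target) pairs that start at the given unit."""
        return [(rel, tgt) for src, rel, tgt in self.relationships if src == source]


model = MetaModel()
model.add_unit(MetadataUnit("sales", "table", {"location": "warehouse.sales"}))
model.add_unit(MetadataUnit("sales.amount", "column", {"data_type": "decimal"}))
model.relate("sales", "has_column", "sales.amount")
print(model.related_to("sales"))  # [('has_column', 'sales.amount')]
```

Keeping the attributes in an open dictionary is one simple way to leave room for customization; a production meta model would more likely use a typed schema.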
2. EFFECTS OF METADATA ON DATA WAREHOUSE
Metadata and its management play a very important role in a data warehouse system. When a user needs to access data in the warehouse, they first consult the metadata to find where it is located within that mass of information. Without metadata, the user would have to skim through all of the stored information to find what they need, which is practically impossible. Metadata thus makes the data warehouse highly efficient. In the context of a data warehouse, metadata is broadly classified as business, technical and operational metadata.
Business metadata describes business definitions, policies and similar information. Technical metadata holds technical information such as database system names, table and column names, data types and values. Operational metadata shows the currency of the data, that is, how up to date it is. Metadata also supports code reusability, accuracy, consistency and integrity of the system, and it strongly supports the development, maintenance and upgrading of the data warehouse.
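The three categories can be pictured as one catalog entry per data item, which a user consults instead of scanning the warehouse itself. The sketch below is an assumption-laden illustration: the `catalog` layout, the `locate` helper and every system, table and column name in it are invented for the example.

```python
# Hypothetical metadata catalog: one entry per data item, split into the
# business, technical and operational categories described above.
catalog = {
    "monthly_revenue": {
        "business": {"definition": "Total invoiced sales per calendar month"},
        "technical": {
            "system": "sales_db",
            "table": "fact_revenue",
            "column": "amount",
            "data_type": "decimal",
        },
        "operational": {"last_loaded": "2024-01-31"},
    },
}


def locate(item: str) -> str:
    """Use the technical metadata to report where an item is stored."""
    tech = catalog[item]["technical"]
    return f'{tech["system"]}.{tech["table"]}.{tech["column"]}'


print(locate("monthly_revenue"))  # sales_db.fact_revenue.amount
print(catalog["monthly_revenue"]["operational"]["last_loaded"])  # 2024-01-31
```

A single dictionary lookup replaces a search through the stored data, which is the efficiency gain the text attributes to metadata.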
When the different layers of metadata fail to communicate and update successfully, users end up searching for data in the wrong place.
Proper maintenance should be carried out on a regular basis, and a notification mechanism should be triggered whenever an update does not take place, to ensure that the metadata is up to date and works correctly. A good-quality data warehouse should ensure good performance measures such as quality control, confidentiality of data, integrity and availability. The following sections show how metadata plays a vital role in improving these performance measures.
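A notification mechanism of this kind could, for example, compare each metadata entry's last update time against an allowed maximum age. The following sketch assumes a simple mapping from entry name to timestamp; the function names, the one-day threshold and the table names are hypothetical, and `notify` only builds messages rather than sending them through a real alerting channel.

```python
from datetime import datetime, timedelta


def find_stale_entries(last_updated: dict, now: datetime, max_age: timedelta) -> list:
    """Return the names of metadata entries not updated within max_age."""
    return sorted(name for name, ts in last_updated.items() if now - ts > max_age)


def notify(stale: list) -> list:
    # Stand-in for a real alerting channel (e-mail, ticket system, ...).
    return [f"metadata for '{name}' was not updated on schedule" for name in stale]


now = datetime(2024, 2, 1, 12, 0)
last_updated = {
    "fact_revenue": datetime(2024, 2, 1, 6, 0),   # refreshed six hours ago
    "dim_customer": datetime(2024, 1, 28, 6, 0),  # missed several refreshes
}

stale = find_stale_entries(last_updated, now, max_age=timedelta(days=1))
for message in notify(stale):
    print(message)  # metadata for 'dim_customer' was not updated on schedule
```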
3. QUALITY CONTROL BY METADATA
Implementing a data warehouse is an efficient solution for decision makers. In some cases, however, a data warehouse fails to meet requirements because of poor data quality. The main area where data quality fails is the integration of input sources. In a metadata quality control system, the quality demands are first gathered from the users and then converted into specifications. These specifications are added to the metadata.
The complete data warehouse architecture is then combined with this metadata so that the data is examined along the data flow. This quality control mechanism is implemented in the metadata because the metadata repository is responsible for every characteristic of the data in the warehouse. Hence, applying the quality demands in the metadata helps to rectify quality issues in the data warehouse.
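One way to picture quality specifications stored in metadata and applied along the data flow is a per-column rule set that every incoming record is checked against. This is only a sketch under stated assumptions: the rule keys (`required`, `min`, `allowed`), the column names and the sample records are invented, and a real quality model would cover many more rule types.

```python
# Hypothetical quality specifications, as they might be stored in metadata
# after user quality demands have been converted into rules.
quality_specs = {
    "amount": {"required": True, "min": 0},
    "country": {"required": True, "allowed": {"DE", "FR", "US"}},
}


def check_record(record: dict, specs: dict) -> list:
    """Return a list of quality problems found in one record."""
    problems = []
    for column, spec in specs.items():
        value = record.get(column)
        if value is None:
            if spec.get("required"):
                problems.append(f"{column}: missing")
            continue
        if "min" in spec and value < spec["min"]:
            problems.append(f"{column}: below minimum {spec['min']}")
        if "allowed" in spec and value not in spec["allowed"]:
            problems.append(f"{column}: value not allowed")
    return problems


# Records arriving from an input source are examined before loading.
incoming = [
    {"amount": 120.0, "country": "DE"},
    {"amount": -5.0, "country": "XX"},
    {"country": "US"},
]
accepted = [r for r in incoming if not check_record(r, quality_specs)]
print(len(accepted), "of", len(incoming), "records accepted")  # 1 of 3 records accepted
```

Because the rules live in the metadata rather than in the loading code, changing a quality demand means editing `quality_specs` only.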
The advantages of this approach are that the users' quality demands are satisfied, efficiency is improved and the performance measures are boosted.
The approach also has drawbacks. First, although the quality control measure is applied at the initial stage, the data is not guaranteed to keep the same quality once it is stored in the warehouse. Second, the warehouse architecture is modified slightly to fit the quality control model, which may affect the performance of the warehouse.
A proper quality check should therefore also be carried out after the data is stored, by including quality maintenance in the warehouse. In addition, the performance measures should be checked before the warehouse architecture is modified, so that the change does not reduce efficiency.
4. SECURITY MEASURES BY METADATA
A high-quality data warehouse should have a highly confidential storage system. Security matters all the more in this competitive world, where hacking of information is no longer unusual. A data warehouse is an accessible system that makes a huge amount of data easily available to users. Security aspects are therefore considered before the data warehouse is built and are inserted into the metadata in the architecture, so that each user is restricted to the particular area for which access has been granted; the rest of the data cannot be accessed without proper authentication. Adding security measures after the data warehouse has been developed is less effective and less cost efficient. The security measures are applied in the metadata because applying them to other parts would affect the performance of the data warehouse and would not work as efficiently.
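Restricting each user to a permitted area through metadata can be sketched as a lookup that runs before any data is read. The example below assumes a role-based scheme; the `access_metadata` mapping, the role names and the `read_table` helper are hypothetical and stand in for whatever access rules a real warehouse records in its metadata.

```python
# Hypothetical security metadata: which roles may read which table.
access_metadata = {
    "fact_revenue": {"finance", "management"},
    "dim_customer": {"marketing", "management"},
}


def can_access(user_roles: set, table: str) -> bool:
    """Allow access only if the user holds a role listed for the table."""
    allowed = access_metadata.get(table, set())  # unknown tables are denied
    return bool(user_roles & allowed)


def read_table(user_roles: set, table: str) -> str:
    # The metadata check happens before the data itself is touched.
    if not can_access(user_roles, table):
        raise PermissionError(f"access to {table} denied")
    return f"rows from {table}"  # placeholder for the actual query


print(read_table({"finance"}, "fact_revenue"))  # rows from fact_revenue
print(can_access({"finance"}, "dim_customer"))  # False
```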
The advantages are that the confidentiality of the data is maintained, unauthorized access is prevented and performance increases.
The drawbacks are that this security model only benefits data whose content is described by metadata, and that it restricts users' access, so users end up making decisions with only the data available to them, which indirectly affects the performance of the data warehouse.
Rather than granting access to a single area, the security system should be designed so that only highly trusted users are let into the warehouse, and those users are allowed to move throughout the system and make better decisions by considering all the available data.
For large organizations, metadata management is of great importance and has great potential. Metadata not only provides information about the data in the warehouse but also helps boost various performance measures, so that user requirements are satisfied. By improving the quality of the data, it maintains consistency and efficiency in the warehouse.
Providing users with secured data according to their demands increases the performance of the warehouse. Other security and quality control measures involve more complexity than those based on metadata, so using metadata is a simple and prudent choice.
In this goal-driven business environment, it is important to serve the specific demands of the user rather than give them a comprehensive result. Metadata is a small part of the data warehouse architecture, but it has a large effect on how the warehouse functions. Metadata gives meaning to data and adds clarity for users. This paper has summarized the effects of metadata in a data warehouse and how it helps improve the warehouse's performance measures. These metadata-based performance improvement techniques are not widely followed, both because the importance of metadata is poorly understood and because metadata management has a high degree of complexity. Nevertheless, metadata is essential and highly effective in data warehousing.