Understanding Corporate Data Governance – A Legal Perspective

2025/06/06


Data Governance is a collection of activities (planning, monitoring, and executing) that exercise authority and control over data asset management. The data governance function guides how other data management functions should be executed. Under this definition, corporate data governance refers to an organization’s management and control of data during its operations to ensure data quality, availability, security, and regulatory compliance. The purpose of corporate data governance is to enable companies to effectively manage their data resources, enhance decision-making processes, and improve business operational efficiency. From a legal perspective, it concerns how data from front-end collection to back-end utilization or sharing can serve the company while avoiding illegal data collection, processing, and utilization, as well as how to avoid misuse, misjudgment, leakage, or other regulatory compliance risks.

“Data is the oil of the 21st century,” but data and oil differ substantially in nature. First, oil is a finite natural resource that inevitably diminishes with extraction; data, by contrast, is becoming ever easier to obtain, generate, and preserve, and its volume only grows. Second, oil is a one-time consumable whose value is exhausted after use; data can be reused repeatedly and even provided to multiple users simultaneously. Third, oil is valuable because it is scarce or difficult to extract, whereas data is more likely to generate new business value the more of it is accumulated and the more it is combined with data from other fields. Moreover, because data is not a natural resource like oil but is given meaning by humans, law naturally becomes the source of rights in data. This makes data, as a production factor of the new era, not only an important corporate asset but also a significant source of risk. Below we explain, through several characteristics that data exhibits in the legal field, why companies are facing unprecedented legal risks:

I. From Data, Databases to Big Data

History is a process. The role “data” plays in human society has evolved with advances in data processing technology: from single data points, to records of multiple data points, to extracting needed information from records, to meaningfully collecting, organizing, and aggregating data into databases that use structured methods to enhance people’s understanding of the world. Technological progress has now made the quantity, frequency, granularity, and types of recordable data different by orders of magnitude from the past, ushering in an era of Big Data that humans still struggle to truly grasp. Data that people previously considered meaningless or not worth collecting can now, through massive accumulation combined with data mining, artificial intelligence, and high-speed computing, yield valuable information whose context humans have not fully understood. Such data will inevitably become an important resource for the “data economy.”

Take an example widely circulated on social media platforms, originating from a post by someone in China’s e-commerce industry: “In our e-commerce industry, there’s an unspoken iron rule for finding girlfriends. Never pick someone whose Taobao positive review rate is below 98%, DiDi rating below 4.8, whose core search word is ‘dress,’ and whose average transaction is below 128 yuan… Anyone satisfying all three conditions simultaneously is difficult to please.” This statement sparked many responses. Setting aside the rigid gender stereotypes it implies, the example shows that seemingly unrelated information may ultimately be used as an important factor in decisions made by artificial or human intelligence, which differs in nature from earlier controversies over cross-matching databases. With the development of Big Data and AI, information that was previously perceived as unrelated may yield meaningful insights that no one expected once the data’s scope becomes “broad” enough and its quantity “large” enough. The move from data to databases to Big Data is not just a quantitative difference; it may have produced a qualitative change.

While this change certainly generates new business possibilities, it also creates unpredictable risks. Governments worldwide are grappling with the issues it raises, and strengthening corporate legal responsibilities in order to gain public trust is the most convenient approach. The law faces a similar shift: neither government legislation nor corporate responses to Big Data can rest on traditional concepts of data or databases. Otherwise, it will be difficult to address the enormous risks and opportunities Big Data may bring.

II. Internationalization and Localization

1. Internationalization

Data, as something intangible, is similar to intellectual property rights such as copyrights, trademarks, patents, and trade secrets. Legal systems designed for it show characteristics of both internationalization and localization. This is evident from the impact of the EU’s General Data Protection Regulation (GDPR) on other countries and large corporations. The past internationalization of intellectual property law stemmed from international treaties, such as the Paris Convention for the Protection of Industrial Property and the Berne Convention for the Protection of Literary and Artistic Works, because the protection of intellectual property was generally accepted by all countries. Driven by the need for cross-national exchange at World Expos and by the needs of cross-border trade, these regimes spread over more than a hundred years from developed countries to almost all countries worldwide. This author believes data-related legal systems will likewise gradually converge through competition among national legal systems, as the data economy arrives and data increasingly needs to circulate across borders.

On May 31, 2018, immediately after GDPR came into effect, Japan’s Personal Information Protection Commission and the European Commission reached substantial agreement on mutual recognition of cross-border transfers of personal data, and Japan obtained adequacy status in January 2019. South Korea passed amendments to its Personal Information Protection Act referencing GDPR in January 2020, concluded adequacy talks with the EU on March 30, 2021, and received its adequacy decision in December 2021. In 2018, under pressure from GDPR’s implementation, California passed the California Consumer Privacy Act of 2018 (CCPA) (effective January 1, 2020). Because internet giants like Google, Apple, and Facebook are located in California, CCPA became the benchmark for consumer personal data protection in the United States. On November 3, 2020, California voters approved Proposition 24 in the general election, enacting the California Privacy Rights Act (CPRA), which took effect on January 1, 2023 and established the California Privacy Protection Agency with investigative, enforcement, and rulemaking authority. China’s Personal Information Protection Law was passed by the Standing Committee of the National People’s Congress on August 20, 2021. Countries are successively enacting or amending their personal data laws and gradually raising protection standards. Although this has not been achieved through international treaties, in an interconnected internet world and under pressure from competition among national legal systems, personal data protection standards will gradually converge.

Additionally, experience with GDPR implementation shows that many multinational corporations, because they must comply with GDPR themselves, also require suppliers or partners who would not otherwise be subject to GDPR to comply “voluntarily” through contracts. Failure to comply then gives rise to liability for breach of contract. As with corporate social responsibility, anti-corruption, or anti-money laundering rules, the influence of these legal systems extends through contracts along the chain of international trade and cross-border internet services, even where the local country has no such regulations. This is another reason why data-related legal systems may have internationalizing effects. When companies conduct data governance, they must consider not only local laws but also contractual obligations imposed by partners.

2. Localization

Intellectual property rights and data-related legal systems, while internationalizing, also have localization characteristics. The main reason is that rights such as trademark and patent rights, unlike rights in tangible objects viewed from a “natural rights” perspective, are not “naturally” owned by producers or manufacturers in any country. For example, if a human ancestor in the Stone Age picked up a flint, shaped it by pounding, and tied it to a wooden stick to make a weapon, that weapon would still have an owner even without law; it simply would not be legally protected, and the owner would need personal strength and constant possession to keep others from taking it. Intangible property, whether intellectual property rights or data-related rights, is different. It is given meaning by humans, and the rights in it usually come from human political organizations acting through certain procedures. This is an exercise of state power, part of what is generally recognized as “national sovereignty,” which countries usually will not easily relinquish. Because intangible assets depend on recognition by each national legal system and on the protection mechanisms each establishes, they have strong localization characteristics.

Therefore, in fields involving the “formation of rights” in intangible assets, such as copyrights, trademark rights, patent rights, or personal data rights, rights holders must apply in each country individually and obtain rights according to local laws (there is basically no such thing as a world patent or a world trademark), and when copyrights or personal data are infringed, rights must be asserted according to local laws. Additionally, countries may adapt their legal systems to a greater or lesser degree in light of national economic development or cultural characteristics. Even where countries do grant legal protection to intangible assets, the rights they grant may differ.

This author believes that in the field of data-related legal systems, personal data is a special case: it stems from information privacy rights, a protection extending from “personality rights.” Apart from personal data, most laws currently enacted to address the data economy do not grant rights to specific people through state sovereignty but instead take the form of administrative control. Because this data mostly comes from each country’s companies and citizens, especially data related to citizens’ activities, it is considered an important component of digital sovereignty, giving rise to the concept of Data Sovereignty. It is therefore naturally treated as falling within the scope of state sovereignty, meaning local laws must be followed. In particular, because countries hold different views on “economy” and “markets,” the legislative models and regulatory intensity they adopt toward “data” in the era of the “data economy” will differ. Given these differences among national legal systems, sound data governance becomes all the more important for multinational corporations and companies with cross-border business needs.

III. Rights Fragmentation

A very important difference between databases and Big Data is that databases usually collect data item by item. As a result, each data item generally has a clear source, and where each item is accessed or used can be tracked and compared relatively easily. Before providing database services externally, companies must clarify the rights in the database; otherwise, they may face lawsuits from rights holders. In the Big Data field, however, data collection is driven mainly by technical feasibility, so information that was previously simplified or left out because of high collection costs may now all be collected, stored, and subjected to various analyses and uses.

Take convenience store customer information as an example. Early street-corner mom-and-pop stores relied on the owner’s exceptional memory. When your mother was busy cooking and asked you to buy soy sauce, while you stood before various soy sauce bottles complaining that your mother didn’t specify which kind, the owner had already recognized which family’s child you were and which soy sauce your family usually bought. No law would regulate personal data the owner could collect through physical perception, because human memory is limited and eventually fades, so there is no need for legal control. Modern convenience stores, however, began using cash registers to quickly collect information from every customer transaction. A store may also use buttons on its digital devices to record the customer’s “appearance,” such as gender, age, whether accompanied by children, or simple emotional classifications, allowing data analysts to depict the transaction context more accurately. Video footage from cameras originally installed for monitoring, which records store interiors over long periods and in large quantities, probably had no use for store operations other than retrieving evidence in theft or sexual harassment cases. But once Big Data technologies are combined with artificial intelligence, not only “transacting” customers but anyone entering a convenience store may be profiled and analyzed. Do data analysts need to simplify the collected data fields in advance? Or can they directly use AI to deconstruct video data and convert it into structured, analyzable data? With labor shortages, the latter will inevitably prevail.

Such indiscriminate data collection is unprecedented in thousands of years of human history, and the same is true on the legal side. Previously, laws usually granted rights in “data” worth protecting by identifying it separately in different forms. For example, copyrights and trade secrets in the intellectual property field actually exist in the form of data; names, portraits, privacy, and other personality rights can also be presented in data form; and information privacy rights have already materialized as the protection of personal data. But now that digital devices can indiscriminately collect all kinds of information, data that might once have been worthless to humans has become valuable thanks to Big Data and AI-related technologies, although it may have value only when sufficiently complete and massive.

This creates a problem that is very difficult to avoid: those who collect and use Big Data can hardly clarify sources and obtain authorization item by item as with databases, mainly because a market for such authorization may never have existed, or because, for the sake of Big Data’s integrity and diversity, portions of the data known to be potentially problematic cannot be discarded. A situation in which only part of the data requires authorization is obviously the norm in Big Data collection and use, because most data was valueless in the past and never entered the scope of legal protection. Yet merely identifying the data that does require authorization may demand considerable time and resources. These rights, which may or may not exist, may be large or small, and may be property rights, creditor’s rights, or personality rights, lie scattered in unknown locations within Big Data, waiting to become legal risks once the Big Data acquires high enough commercial value. Besides being resolved through legislation, this rights fragmentation requires companies to adopt appropriate measures, such as data cleaning, fair use planning, and de-identification mechanisms, to reduce risks and truly enjoy the benefits the “data economy” brings.
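As a rough illustration of what a de-identification step might look like in practice, the following Python sketch drops direct identifiers, replaces a member number with a keyed hash, and generalizes an exact age into a band. The field names (name, phone, member_id, age) and the record layout are hypothetical and chosen only for this example. Note also that a keyed hash is a form of pseudonymization; whether the result counts as de-identified or anonymous data depends on the applicable law and on who holds the key, so this is a sketch under stated assumptions rather than a compliance recipe.

```python
import hashlib
import hmac

# Hypothetical field names, used only for illustration.
DIRECT_IDENTIFIERS = {"name", "phone"}


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed hash (HMAC-SHA256, hex encoded)."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def age_band(age: int, width: int = 10) -> str:
    """Generalize an exact age into a band such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def de_identify(record: dict, secret_key: bytes) -> dict:
    """Return a copy of the record with reduced identifiability."""
    # Drop fields that identify a person directly.
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    # Keep records linkable for analysis without exposing the raw member number.
    if "member_id" in cleaned:
        cleaned["member_id"] = pseudonymize(str(cleaned["member_id"]), secret_key)
    # Replace the exact age with a coarser band.
    if "age" in cleaned:
        cleaned["age_band"] = age_band(cleaned.pop("age"))
    return cleaned
```

A company would typically keep the secret key separate from the analysis environment, so that analysts can link records belonging to the same customer without being able to recover who that customer is.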

IV. Data Legality is the Foundation for Operating Assets

Like trade secrets, data may affect a company’s overall operations rather than a single product or service. Once the legality of data becomes problematic, it is very difficult to immediately separate the tainted data from corporate operations. Taking personal data regulations as an example, personal data illegally collected before a law’s passage must be deleted before the transition period expires. Some early telemarketing companies, for instance, may have bought data from different sources. After the Personal Data Protection Act took effect, they had to lawfully re-obtain enough data to operate; otherwise, they could face countless personal data lawsuits. Where a legislative model like the EU’s GDPR calculates penalties for large companies as a percentage of revenue, companies can only face high fines, because conduct that violates GDPR is very difficult to stop or correct immediately. In early 2025, it was reported that noyb (https://noyb.eu/), a privacy rights organization located in Austria, filed GDPR complaints against six Chinese internet and technology companies, namely TikTok, AliExpress, Shein, Temu, WeChat, and Xiaomi, in Austria, Belgium, Greece, Italy, and the Netherlands¹. These are legal risks that companies operating with traditional production factors could not have anticipated.

Thomas C. Redman, President of Data Quality Solutions, known as “the Data Doc,” once said, “Where there is data smoke, there is business fire.” This shows that data, as an important corporate operating asset, is like firewood piled indoors in ancient times: without advance planning and thorough management, no one can know whether it will serve as fuel for daily life or become the source of a fire. If the “data economy” is an inevitable trend, then “data governance” should be a company’s responsibility to shareholders and society.

¹ https://www.cna.com.tw/news/aopl/202501170369.aspx