Getting value from your data shouldn’t be so difficult


The potential impact of the world's continuing data explosion keeps inspiring the imagination. A 2018 report estimated that, on average, every person on earth was producing 1.7 MB of data every second of every day. Annual data creation has more than doubled since then and is expected to double again by 2025. A report by the McKinsey Global Institute estimates that clever use of big data could generate an additional USD 3 trillion in economic activity, enabling applications as varied as self-driving cars, personalized health care, and traceable food supply chains.

However, all of this data flooding into systems also creates confusion about how to find, use, manage, and share it legally, safely, and effectively. Where does a given data set come from? Who owns it? Who is allowed to see it? Where does it live? Can it be shared? Can it be sold? Can people see how it is used?

As data applications grow and become more common, data producers, consumers, owners, and stewards are finding there is no playbook to follow. Consumers want to connect to data they trust so they can make the best decisions. Producers need tools to share their data safely with those who need it. But technology platforms fall short, and there is no real common source of truth connecting the two sides.

How do we find the data? When should we move it?

In a perfect world, data would flow as freely as a utility that everyone can tap into. It could be packaged and sold like a raw material. Anyone with the right to view it could do so without complications. Its origin and movement could be traced, removing any concern about malicious use somewhere along the way.

Of course, today's world does not operate like this. The massive data explosion has created a long list of issues and opportunities that make sharing large amounts of information tricky.

Because data is created almost everywhere, both inside and outside an organization, the first challenge is simply determining what is being collected and how to organize it so that it can be found.

Several challenges stand in the way:

  • A lack of transparency and sovereignty over where data is stored and processed leads to trust issues.
  • Moving data from multiple technology stacks to a centralized location is expensive and inefficient.
  • Without open metadata standards and widely accessible application programming interfaces (APIs), data can be difficult to access and use.
  • Industry-specific data ontologies can make it hard for people outside an industry to benefit from new data sources.
  • With multiple stakeholders and hard-to-reach data services, sharing is difficult without a governance model.
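
To make the metadata point concrete, here is a minimal sketch of the kind of descriptor an open metadata standard might attach to a data set. The DatasetDescriptor class and its fields are illustrative assumptions, not drawn from any real Gaia-X or HPE schema:

    from dataclasses import dataclass, field

    # Illustrative only: these fields are assumptions about what an open
    # metadata standard might capture, not any particular real standard.
    @dataclass
    class DatasetDescriptor:
        name: str                    # human-readable identifier
        owner: str                   # who owns (and may license) the data
        origin: str                  # where and how the data was produced
        location: str                # where it currently lives (a URI)
        license: str                 # terms under which it may be shared or sold
        allowed_consumers: list[str] = field(default_factory=list)

    # A producer would publish one descriptor per data set it offers.
    telemetry = DatasetDescriptor(
        name="assembly-line-telemetry",
        owner="example-motors",
        origin="plant-7 IoT sensors",
        location="s3://example-motors/telemetry/2021/",
        license="partners-only",
        allowed_consumers=["supplier-42", "analytics-team"],
    )
    print(telemetry)

A shared descriptor like this, published in a common format, is what would let producers and consumers answer the "who owns it, where does it live, who may see it" questions without bespoke integration work.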

Europe is in a leading position

Despite these problems, large-scale data-sharing projects are moving forward. A nonprofit organization backed by the European Union is creating an interoperable data exchange called Gaia-X, in which companies can share data under the protection of strict European data-privacy laws. The exchange is conceived as a container for sharing data across industries, as well as a repository of information about data services for artificial intelligence (AI), analytics, and the Internet of Things.

Hewlett Packard Enterprise recently announced a solutions framework to support companies, service providers, and public organizations participating in Gaia-X. The dataspace platform now under development is cloud-native and built on open standards; it democratizes access by making data, data analytics, and AI easier for domain experts and ordinary users alike to reach. It gives domain experts a place to identify trustworthy data sets and to run analytics safely against operational data, without the high cost of always moving that data to a centralized location.

By using this framework to integrate complex data sources across IT environments, companies can provide data transparency at scale, so that everyone, data scientist or not, knows what data they have, how to access it, and how to use it in real time.

Data-sharing initiatives also rank high on corporate agendas. One important priority for companies is vetting the data used to train internal AI and machine learning models. AI and machine learning are already used widely across enterprises and industries to drive continuous improvement in everything from product development to hiring to manufacturing. And we are just getting started: IDC predicts that the global artificial intelligence market will grow from USD 328 billion in 2021 to USD 554 billion in 2025.

To unleash the true potential of artificial intelligence, governments and companies need to better understand the collective lineage of all the data that feeds these models. How does an AI model make its decisions? Is it biased? Is it trustworthy? Could untrusted parties access or alter the data a company has used to train its models? Connecting data producers and data consumers in a more transparent and efficient way can help answer some of these questions.
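
One plausible way to approach the tampering question is to fingerprint training data so that any later change is detectable. Below is a minimal sketch, assuming the training files are CSVs in a local training_data directory; the function names and manifest format are hypothetical, not a product feature:

    import hashlib
    import json
    from pathlib import Path

    # Hypothetical sketch: fingerprint every file in a training-data
    # directory so consumers can later verify that nothing was altered.
    # The directory layout and manifest format are assumptions.

    def fingerprint_dataset(data_dir: str) -> dict[str, str]:
        """Map each CSV file's name to the SHA-256 digest of its contents."""
        return {
            path.name: hashlib.sha256(path.read_bytes()).hexdigest()
            for path in sorted(Path(data_dir).glob("*.csv"))
        }

    def verify_dataset(data_dir: str, manifest: dict[str, str]) -> bool:
        """Re-hash the files and compare against the recorded manifest."""
        return fingerprint_dataset(data_dir) == manifest

    # The producer records the manifest when the model is trained...
    manifest = fingerprint_dataset("training_data")
    Path("manifest.json").write_text(json.dumps(manifest, indent=2))

    # ...and anyone can later confirm the training data is unchanged.
    recorded = json.loads(Path("manifest.json").read_text())
    print("training data intact:", verify_dataset("training_data", recorded))

A published manifest of this sort does not make a model trustworthy by itself, but it gives consumers a verifiable link between a model and the exact data it was trained on.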

Build data maturity

Enterprises will not solve the problem of how to unlock all their data overnight. But they can prepare to take advantage of the technologies and management concepts that foster a data-sharing mindset. They can make sure they are mature enough to use and share data strategically and effectively, rather than ad hoc.

Data producers can prepare for wider data distribution through a series of steps. First, they need to understand where their data is and how it is collected. Then they need to ensure that the people who use the data can reach the right data set at the right time. That is the starting point.
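
As a rough illustration of that starting point, the sketch below assumes the producer keeps a simple catalog that consumers query by name, and that checks access rights before revealing where a data set lives. The catalog structure and all names are invented for the example:

    # Minimal sketch, assuming a producer keeps a simple in-memory catalog:
    # consumers look a data set up by name, and the catalog checks whether
    # they may see it before revealing its location.
    CATALOG = {
        "assembly-line-telemetry": {
            "location": "s3://example-motors/telemetry/2021/",
            "allowed": {"supplier-42", "analytics-team"},
        },
    }

    def find_dataset(name: str, consumer: str) -> str:
        """Return a data set's location if this consumer may access it."""
        entry = CATALOG.get(name)
        if entry is None:
            raise KeyError(f"no data set registered under {name!r}")
        if consumer not in entry["allowed"]:
            raise PermissionError(f"{consumer} may not access {name}")
        return entry["location"]

    print(find_dataset("assembly-line-telemetry", "supplier-42"))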

Then comes the harder part. If a data producer has consumers, who may sit inside or outside the organization, those consumers must be connected to the data. This is as much an organizational challenge as a technical one. Many organizations want to govern how their data is shared with other organizations. Democratizing data, making it at least discoverable across the organization, is a question of organizational maturity. How do they deal with that?

Companies in the automotive industry actively share data with suppliers, partners, and subcontractors. Assembling a car takes a lot of parts and a lot of coordination, so it pays for partners to share information on everything from engines to tires to maintenance channels. An automotive dataspace could serve more than 10,000 suppliers. Other industries are more siloed, however. Some large companies may not even want to share sensitive information among their own business units.

Build a data-sharing mindset

Companies on both sides of the producer-consumer continuum can sharpen their data-sharing mindset by asking themselves these strategic questions:

  • If the company is building AI and machine learning solutions, where do its teams get the data? How do they connect to that data? How do they track its history to ensure its credibility and provenance?
  • If the data is valuable to others, what monetization path is the team taking today to extend that value, and how will it be governed?
  • If a company already exchanges or monetizes data, can it license a broader set of services across multiple platforms, both on-premises and in the cloud?
  • For organizations that need to share data with suppliers, how do those suppliers coordinate around the same data sets and updates today?
  • Do producers want to copy their data out, or have consumers bring their models to them? Some data sets are too large to replicate. Should companies host developers, moving models in and out of the platforms where the data resides? (A minimal sketch of this pattern follows this list.)
  • How can staff in the departments that consume data influence the practices of upstream data producers in their organization?
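
The "bring the model to the data" option can be sketched in a few lines. In the hypothetical pattern below, the producer exposes one narrow entry point that runs a consumer-supplied scoring function next to the data, so only the results, never the raw records, leave the platform. All names are illustrative assumptions:

    from typing import Callable, Iterable

    # Hypothetical sketch of "bringing the model to the data": the producer
    # runs a consumer-supplied scoring function where the data lives and
    # returns only the scores, never the underlying records.

    def run_at_source(records: Iterable[dict],
                      model: Callable[[dict], float]) -> list[float]:
        """Apply a consumer's model next to the data; raw records never leave."""
        return [model(record) for record in records]

    # Producer side: an operational data set too large to copy elsewhere.
    telemetry = ({"machine": i, "temp": 20 + i % 15} for i in range(1_000_000))

    # Consumer side: a trivial stand-in for a trained model.
    def overheat_score(record: dict) -> float:
        return max(0.0, record["temp"] - 30) / 5

    scores = run_at_source(telemetry, overheat_score)
    print("machines flagged:", sum(score > 0 for score in scores))

The design choice here is the trade-off the question raises: hosting consumer code is operationally heavier for the producer, but it avoids replicating enormous data sets and keeps the raw data under the producer's governance.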

Take action

The data revolution is creating business opportunities, along with a lot of confusion about how to find, collect, manage, and draw insight from data in a strategic way. Data producers and data consumers are growing more connected, and HPE is building a platform, based on open source and spanning on-premises deployments and the public cloud, that uses solutions such as the HPE Ezmeral software platform to give both sides the common foundation they need to make the data revolution work for them.

Read the original article at Enterprise.nxt.

This content was produced by Hewlett Packard Enterprise. It was not written by the editors of MIT Technology Review.


