
Submitted URL: https://d.email.ibm.com/Mjk4LVJTRS02NTAAAAGKekCDpElWzynRi3CKBj1CH84JJ-jxDq8s9pyO_hEvMVMXDsPcceyKqGOeqIAKnRxZJuuZ8Uk=
Effective URL: https://www.ibm.com/analytics/common/smartpapers/data-privacy-security/?utm_medium=Email&utm_source=Nurture&utm_cont...
Submission: On March 14 via api from US — Scanned from DE



DATA LEADERS: TURN COMPLIANCE INTO COMPETITIVE ADVANTAGE




A DATA PRIVACY AND SECURITY PLAN THAT DELIVERS BUSINESS BENEFITS

01



Making all enterprise data accessible and improving its security and compliance.

3 min read


ACCESS DATA MORE SECURELY ACROSS MULTIPLE SOURCES

02



Data virtualization and a single access point are vital for secure, simplified
data access.

4 min read


BUILD A TRUSTED FOUNDATION OF DATA QUALITY

03



Data quality and organization changes time spent searching for data to time
spent using it for insight.

9 min read


DELIVER COMPLIANT OUTCOMES WITH DATA ACCESS AND LINEAGE

04



Some of the key aspects of governance are knowing when, where and how data is
used and by whom.

7 min read


MANAGE RISK AND COMPLIANCE WITH REPORTS AND AUDITING

05



Reporting and auditing should be made simple through automation and powerful
UIs.

7 min read


FIND YOUR HOLISTIC DATA PRIVACY AND SECURITY SOLUTION

06



Learn more about the way IBM Cloud Pak® for Data enables these governance
practices.

2 min read

01

3 min read

A data privacy and security plan that delivers business benefits


HOW ARE ENTERPRISES BRINGING TOGETHER DATA ACCESS, SECURITY AND COMPLIANCE?

As chief data officers and other data leaders look to data and AI to help drive
innovation and competitive advantage, they face two seemingly contradictory
imperatives: make all enterprise data accessible and ensure that information is
secure and compliant. As strategies and solutions evolve, these leaders are
coming to realize that this is a false dichotomy: the same tools that help
businesses stay compliant with the latest data privacy regulations can also help
them make better use of their data.

> 38% of executives said a barrier they faced in seeking closer alignment to
> GDPR was the complexity of aligning the IT landscape.1

Historically, the data privacy and security landscape has been plagued by
piecemeal approaches that employ a web of disparate point solutions which, only
when cobbled together, provide the necessary view and understanding. At best,
this approach creates a complicated architecture that requires additional time
and resources to integrate manually. At worst, it fails to deliver a complete
view of data and its usage, leaving blind spots that can open the business up to
unknown risks and increase the potential for the losses and fines associated
with breaches or noncompliance.

To combat this issue, leading businesses are looking to more holistic strategies
and solutions that provide visibility across the entire data and AI lifecycle —
from building and securing a trusted, compliant data foundation to optimizing AI
models and their impact on the business and, ultimately, auditing and regulating
compliance. Organizations need a unified solution from which they can view the
impact of sensitive data and universally enforce policies.

In this paper, we’ll explain why a data fabric approach is the way forward and
dig deeper into each of those key areas to explore the vital data privacy and
security elements at each stage of the journey.

1 Championing Data Protection and Privacy (PDF, 2.7 MB), Capgemini Research
Institute, 2019


02

4 min read

Access data more securely across multiple sources


USE A DATA FABRIC TO GOVERN YOUR DATA AND ALLOW SELF-SERVICE CONSUMPTION

One of the biggest challenges data leaders face is managing the diverse
landscape of data stored across multiple siloed environments. In the past, the
inability to ensure compliance from one environment or business unit to another
has led many parts of the business to be reluctant to share data with
colleagues. In these scenarios, compliance became a hindrance to the business
rather than an advantage, further entrenching the disjointed repositories of
data and forcing IT teams to protect and secure each on an individual basis.

> 7 in 10 organizations are unable to secure data that moves across multiple
> cloud and on-premises environments.2

One answer to this siloed landscape is data fabric. A data fabric is an
architectural approach to simplify data access in an organization to facilitate
self-service consumption. Data virtualization is a component of data fabric and
allows users to access data from a single access point no matter where in the
organization or even outside of the organization it happens to be. This access
occurs without moving the data or using extract, transform, load (ETL)
processes, so the risk of data corruption and loss is mitigated significantly.
Moreover, the queries sent from this single access point to the data
repositories are protected with Secure Sockets Layer (SSL) and Transport Layer
Security (TLS) encryption using standard protocols. So, even though the data
itself doesn’t move, you can be sure the communications are secured as well.

Forward-looking implementations of data virtualization also provide schema
folding — automatically detecting common schemas across repositories and making
them appear as a single schema. For example, if a similar sales table existed
across 20 databases, it would appear to the user as if it were just one table
that could be queried. This method heightens users’ ability to use more complete
sets of data for greater accuracy when developing insights or models.
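
The schema folding idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `fold_schemas` helper, the `sales` table layout, and the `emea` and `apac` source names are hypothetical, and SQLite in-memory databases stand in for the separate repositories. Unlike a real virtualization layer, which would query the data in place, this toy version fetches rows eagerly to keep the example short.

```python
import sqlite3


def make_db(rows):
    """Create an in-memory database holding a hypothetical sales table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (item TEXT, quantity INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return conn


def fold_schemas(connections, table):
    """Group same-named tables by column layout and combine their rows.

    Returns a dict that maps each distinct schema (a tuple of
    (column name, declared type) pairs) to the rows found under it,
    with the source name prepended to every row.
    """
    folded = {}
    for source, conn in connections.items():
        # PRAGMA table_info yields (cid, name, type, notnull, default, pk).
        # The table name is interpolated directly, so it must be trusted.
        info = conn.execute(f"PRAGMA table_info({table})").fetchall()
        schema = tuple((col[1], col[2]) for col in info)
        if not schema:
            continue  # this source has no such table
        rows = conn.execute(f"SELECT * FROM {table}").fetchall()
        folded.setdefault(schema, []).extend((source, *row) for row in rows)
    return folded


sources = {
    "emea": make_db([("widget", 3)]),
    "apac": make_db([("gadget", 5), ("widget", 1)]),
}
folded = fold_schemas(sources, "sales")
schema, rows = next(iter(folded.items()))
print(schema)
print(rows)
```

Because both sources share one layout, `folded` holds a single entry, which is the "one table" view the user would query. Sources with a different layout would land under a separate key rather than being merged silently.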



Yet, one of the most useful aspects of a data fabric from a data privacy and
security viewpoint is that data can be governed at that single access point. So,
instead of adding governance across myriad different places, it can be
implemented at the place where users are receiving self-service data access.
These different aspects of governance are covered in the next section.


2 25 Trends for 2022 and Beyond, IBM Institute for Business Value, 2021


03

9 min read

Build a trusted foundation of data quality


ORGANIZE YOUR DATA WITH A COMMON CATALOG AND METADATA

Data leaders know that quality data, or a lack thereof, can be the difference
between insights that are confidently acted upon and ones that aren’t trusted.
Low-quality data that goes into AI models could even lead to regulatory
noncompliance if the models produce discriminatory results. Some questions of
data quality can be answered with metadata, such as the source of the data and
how fresh it is.

However, an additional level of data cleansing is often helpful. A data catalog
with a built-in data quality analysis and refinery should be used. The data
quality analysis can be used to make inferences and identify anomalies, while
the refinery can be used to discover, cleanse and transform the data with data
shaping operations.

One of the best examples of the complementary nature of data privacy and data
use is how both are supported by governance. At its core, data governance is
about organization: knowing where data comes from, what it is, who can access it
and when it should be retired. While this information certainly is important for
auditability, right-to-be-forgotten requests and determining access rights, it
also helps data users determine the most relevant, freshest and cleanest data so
they can deliver the best insights. We’ll explore several critical components of
governance, along with how they complement both data privacy and better data
use.

> 31% of executives said the most significant challenge in getting ready for
> GDPR was cataloging and inventorying their data.3


A COMMON CATALOG

One of the most difficult challenges when bringing together data from across the
business is that people use different terms to refer to the same thing or may be
using the same term to refer to two different things. Creating a common taxonomy
makes sure that everyone can communicate more effectively. Doing so is important
for data privacy because, if the wrong term is used, data that should be limited
in access might accidentally be made available to the whole business. From a
data use perspective, using multiple terms or terms that are incorrect for that
business or industry can make finding data for models and understanding insights
more difficult and time-consuming.

Data catalogs, particularly ones with access to AI tools, can help with this
issue. Organizations should look for opportunities to use AI to search documents
and text within the business and industry to pull out the unique terminology
that’s most applicable to them. This process will make their taxonomy efforts
much more relevant to the data they possess.

> The number one challenge and area for adopting a privacy-centric approach was
> performing data discovery and ensuring data accuracy.4


METADATA

Metadata is at the heart of both privacy and use for the same reason — if you
don’t know details about your data, how can you truly say who is meant to see it
or how you might be able to use it? Metadata keeps track of the origins of the
data, age of the data, privacy level, potential uses and much more.

While this information can be added manually, doing so can be an extremely
cumbersome and time-intensive process. Fortunately, machine learning allows for
automated metadata generation. Based on the existing data catalog and other
business policies, data is reviewed and then automatically tagged with relevant
metadata based on what the machine learning algorithm finds. Not only does this
process help make data ready for use as it comes in, it also helps eliminate
human errors that might occur when applying metadata manually. Moreover, it
mitigates the problem of so-called “dark data,” which remains hidden or unused
because little to no information is known about it after it is ingested.

> Organizations struggle with data readiness. 45% cited unactionable data
> formats and 44% cited regulatory restrictions on data use as key barriers.5

Automated metadata generation is particularly important with regard to access
and anonymization procedures. Consider, for example, an enterprise that wants to
bring in a new data set that contains information about transactions that
include item descriptions, quantity purchased, name, address and credit card
number. When this data set is ingested, automated tagging would tag the item
descriptions and quantity as general transaction data, the name and address as
personal data, and the credit card number as financial data. This tagging allows
policy enforcement at the point of access. So, if business users were to access
the data set, they could see the general transaction data, but the personal and
financial data would be automatically anonymized — another automation feature
being introduced in the most up-to-date governance tools. As such, policies are
easily enforced and even more sensitive data can be used in a nonidentifiable
and compliant way. Of course, those individuals with the need and authority to
access personal or financial data from this data set still can, and those access
rights are acknowledged at the single access point for data, as well. Additional
information about anonymization features is provided in the next section.
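
The transaction example above can be sketched as follows. Everything named here is an assumption made for illustration: the column names, the `COLUMN_TAGS` mapping (written by hand, where the scenario describes machine learning inferring the tags), the two roles, and the `mask` helper. A truncated hash is only a stand-in for a real anonymization technique and would not be adequate protection for values such as card numbers.

```python
import hashlib

# Hypothetical tags, standing in for the output of automated tagging.
COLUMN_TAGS = {
    "item_description": "general",
    "quantity": "general",
    "name": "personal",
    "address": "personal",
    "credit_card": "financial",
}

# Hypothetical roles and the tags each role may see in the clear.
ROLE_CLEARANCE = {
    "business_user": {"general"},
    "fraud_analyst": {"general", "personal", "financial"},
}


def mask(value):
    """Replace a value with a short token derived from its hash.

    Illustrative only: an unsalted, truncated hash is not a sound way
    to anonymize low-entropy values such as card numbers.
    """
    digest = hashlib.sha256(str(value).encode("utf-8")).hexdigest()
    return "anon_" + digest[:8]


def read_record(record, role):
    """Return the record as the given role is allowed to see it.

    Columns with no tag, and roles with no clearance entry, are masked
    by default.
    """
    allowed = ROLE_CLEARANCE.get(role, set())
    return {
        column: value if COLUMN_TAGS.get(column) in allowed else mask(value)
        for column, value in record.items()
    }


record = {
    "item_description": "garden hose",
    "quantity": 2,
    "name": "Ana Diaz",
    "address": "12 Elm Street",
    "credit_card": "4111111111111111",
}
business_view = read_record(record, "business_user")
analyst_view = read_record(record, "fraud_analyst")
print(business_view)
```

The design choice worth noting is the default: anything untagged is masked, so a column that the tagging step missed is hidden rather than exposed.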


3 Maximizing the value of your data privacy investments, January 2019
4 Privacy Gains: Business Benefits of Privacy Investment, 2019
5 Unleash your platform’s power: 5 ways to create next-wave digital experiences,
August 2021


04

7 min read

Deliver compliant outcomes with data access and lineage


TRACK WHERE DATA COMES FROM AND HOW IT’S USED WITHOUT OVERCOMPLICATING DATA
ACCESS

Data access and lineage apply most directly to privacy concerns and
auditability. Privacy is all about data being used only by the people who need
it, and lineage shows who has had that access in practice. However, these
safeguards are also important for self-sufficient use of data. If data users are
confused over what they can or can’t use, they may opt not to employ a valuable
data set to which they should have access. Moreover, it takes time to sort
usable data from unusable data.

> 17% of executives are evaluating or implementing anonymization and
> pseudonymization as a solution.6

A much better option is to have access restrictions built directly into the
single access point where users are getting their data so only the data they
have authorization to use is visible. It removes any confusion they might
otherwise have. Another helpful feature is dynamic masking of sensitive data so
that data sets and models can be used and shared without exposing private data
to those who shouldn’t have access. After access has been granted, the
governance solution should also be able to create reports that analyze the flow
of data from data sources through jobs and stages, and into databases, data
files, business intelligence reports, models and other assets. This data lineage
capability, alongside the access protections, should make auditing for internal
or external purposes much easier.
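
A lineage report of this kind amounts to walking a graph of assets. The sketch below assumes a hypothetical `LINEAGE` mapping from each asset to the assets it was derived from; the asset names are invented, and a real governance tool would capture these edges automatically rather than from a hand-written dictionary.

```python
from collections import deque

# Hypothetical lineage edges: each asset maps to what it was derived from.
LINEAGE = {
    "sales_report": ["sales_mart"],
    "churn_model": ["sales_mart", "crm_extract"],
    "sales_mart": ["cleanse_job"],
    "cleanse_job": ["pos_transactions"],
    "crm_extract": ["crm_database"],
}


def upstream(asset, lineage):
    """Return every asset that feeds into `asset`, nearest first."""
    seen = set()
    order = []
    queue = deque(lineage.get(asset, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # already visited through another path
        seen.add(node)
        order.append(node)
        queue.extend(lineage.get(node, []))
    return order


def original_sources(asset, lineage):
    """Return the upstream assets that have no recorded parents."""
    return [node for node in upstream(asset, lineage) if node not in lineage]


print(upstream("sales_report", LINEAGE))
print(original_sources("churn_model", LINEAGE))
```

Asking for the original sources of `churn_model` answers the auditor's question of which systems ultimately fed a model, while `upstream` lists every job and intermediate store along the way.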

It’s worthwhile to focus in a bit more detail on two of these stages as they
relate to lineage. The first is the ingestion of data itself. Data lineage
helps to answer the question: “Where did this data come from?” That answer is
important because it speaks to the accuracy and relevancy of data for future
insight. For example, data that comes directly from transactions and is the full
data set may be more accurate than a sample of data pulled from social media. By
the same principle, data from a predominantly South American population may lead
to insights that would be incorrect to apply to an Eastern European market.
Knowing the data’s origin more accurately tells users where it should be
applied.

This, in turn, relates to the other stage in data lineage worth discussing
here: AI model lineage. The increased interest in producing AI models for deeper
and more robust insight necessitates a higher level of scrutiny from a data
lineage standpoint. Understanding where data comes from is vital to make sure
models are trained on data that’s applicable to where the model is used in
production; however, tracing the lineage of the model itself can be just as
important. Specifically, it means tracking when and how the model was created,
as well as when and where it has been used, and the decisions that resulted.
Such model lineage is part of a new push for explainable AI that’s receiving
ever more scrutiny and regulation. Essentially, it’s not enough to know that a
decision was made; enterprises must be able to explain why a decision was made
and why that decision was correct.

Take, for instance, an AI model that determines whether home loans should be
approved. An important consideration in these types of decisions is whether they
are discriminatory, either in the data considered or in their results. The
enterprise could reassure itself of unbiased decisions with data lineage by
confirming that the model was trained using data representative of the
population, that it was applied uniformly in all cases where such a decision
needed to be made, and that the end results or decisions didn’t
disproportionately harm a particular group of people. Or, if errors were found,
thanks to reporting or auditing of this lineage, they can be corrected quickly
before heightened regulatory or reputational harm can come to the business. Such
reports and
auditing are the subject of the next section.
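
One simple way to check whether outcomes fall unevenly across groups is to compare approval rates. The sketch below is a minimal illustration, not a complete fairness audit: the group labels and decisions are made-up data, and the 0.8 cut-off is an assumption borrowed from the "four-fifths" rule of thumb rather than a threshold stated in this paper.

```python
from collections import defaultdict


def approval_rates(decisions):
    """Compute the share of approved applications per group.

    `decisions` is an iterable of (group, approved) pairs.
    """
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, was_approved in decisions:
        totals[group] += 1
        if was_approved:
            approved[group] += 1
    return {group: approved[group] / totals[group] for group in totals}


def impact_ratio(rates):
    """Return the lowest approval rate divided by the highest."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody was approved, so no group is favored
    return min(rates.values()) / highest


# Made-up decisions: group A has 8 of 10 approved, group B has 5 of 10.
decisions = (
    [("A", True)] * 8 + [("A", False)] * 2
    + [("B", True)] * 5 + [("B", False)] * 5
)
rates = approval_rates(decisions)
ratio = impact_ratio(rates)
THRESHOLD = 0.8  # assumed cut-off for illustration
flagged = ratio < THRESHOLD
print(rates, ratio, flagged)
```

A flagged ratio does not prove discrimination on its own; it is a prompt to trace the model and its training data back through the lineage records.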


6 Data Protection and Privacy Officer Priorities 2020 Report, CPO Magazine, 29
March 2020


05

7 min read

Manage risk and compliance with reports and auditing


SIMPLIFY AND DEMOCRATIZE COMPLIANCE AND RISK MANAGEMENT ACROSS THE ORGANIZATION

Data privacy and security can be a confusing process with a wide variety of
regulations that differ by industry, location and even the type of data itself.
Greater consciousness of data privacy in the public will continue to lead to
more regulation in the coming years, which businesses will have to track and
adhere to so they remain compliant. A holistic data privacy and security
solution should provide capabilities that help businesses stay aware of these
policies, implement them effectively and regularly audit their compliance.
Automation is another crucial factor that helps eliminate manual effort, saves
time and increases accuracy.

> EU data protection authorities have handed out a total of USD 1.2 billion in
> fines over breaches of the bloc’s GDPR law since Jan. 2021.7

As a first step, solutions should be used to break down complex regulations into
a catalog of requirements, understand how they affect the business specifically
and create actionable tasks that the business can undertake. Regulatory
information should be ingested automatically from sources like Thomson Reuters
and Wolters Kluwer and automatically applied to terminology and workflows.
Similar regulations should also be deduplicated. Any actions should then be made
clear and measurable with terminology that’s specific to that industry or
organization. And those actions should be grouped logically and assigned to
specific owners within the system. In this way, an organization need not be an
expert on every regulatory compliance initiative to act on them.

The best solutions should also go beyond these steps to simplify adherence to
regulations for business users who may not be involved with managing risk on a
day-to-day basis. A user interface (UI) should be implemented that mitigates or
even eliminates the need for training by using AI-powered dashboard widgets to
help contextualize information in the moment and suggest courses of action. A
business user, guided by a virtual assistant within the UI, can then easily
follow suggestions on how to handle potentially sensitive data, or on whether it
should be used at all, and can ask the assistant for guidance if confused.

Once proper guidelines and processes have been established, automated data
collection and constant monitoring for dashboards and audits must be
implemented. Tracking and documenting all data privacy and IT incidents
automatically to facilitate root cause analysis is vital, but so is preventative
monitoring. Outlier detection, using machine learning and statistical modeling,
can be used to flag anomalous activities and give them a high-risk score. This
process, in turn, can be set up to trigger an alert that will signal the data
security team to investigate. To save time, enterprises should look for embedded
workflow features that work out of the box for a variety of use cases. For more
complex workflows, drag-and-drop functionality should be available to make
creation easier.
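
The outlier detection step can be approximated with a basic statistical score. This sketch assumes a hypothetical feed of daily record-access counts per user and uses a z-score as the risk score; the user names, the counts, and the alert threshold of 3.0 are all illustrative choices, and a production system would likely use richer features and models.

```python
from statistics import mean, stdev


def risk_scores(history, today):
    """Score each user's activity today against their own history.

    The score is the absolute z-score: how many standard deviations
    today's count sits from the user's historical mean. Each history
    needs at least two observations.
    """
    scores = {}
    for user, counts in history.items():
        mu = mean(counts)
        sigma = stdev(counts)
        observed = today.get(user, 0)
        if sigma == 0:
            # A perfectly flat history makes any deviation maximally unusual.
            scores[user] = 0.0 if observed == mu else float("inf")
        else:
            scores[user] = abs(observed - mu) / sigma
    return scores


def alerts(scores, threshold=3.0):
    """Return the users whose score meets the assumed alert threshold."""
    return sorted(user for user, score in scores.items() if score >= threshold)


# Made-up daily counts of records accessed by each user.
history = {
    "alice": [100, 110, 90, 105, 95],
    "bob": [20, 22, 18, 21, 19],
}
today = {"alice": 104, "bob": 400}

scores = risk_scores(history, today)
print(alerts(scores))
```

Scoring each user against their own baseline means a heavy but consistent user is not flagged, while a sudden jump from a normally light user is.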

Auditing is equally important and should be done not only when a problem has
been identified, but as part of routine data privacy and security practices.
One-click audit reports are a clear benefit in this regard, for regular, quick
check-ups. These reports can quickly show key stakeholders how the data is being
used and by whom over certain time periods. Of course, for more in-depth audits,
the ability to manage and monitor the audit’s execution, as well as the
assignment and tracking of resources from a central system, is crucial. That’s
why having a proper data lineage, as discussed earlier, and an audit trail is so
essential. Data points, such as modification of configuration data, user
actions, privileged access, and system events, should all be accessible.
Organizations should also consider the length of time the audit data is
available. Whether performing an internal or external audit, lacking needed
records is far from ideal.

One final consideration is the auditing of models, which was discussed more
thoroughly in the previous section. Given the importance of models in driving
insight and the increased scrutiny they are receiving for accuracy on a company
and regulatory level, it behooves data professionals to check them often. A good
place to start is by creating a comprehensive model inventory and maintaining it
along with the purpose of each model. Then ownership, roles and responsibilities
can be established for each model. From there, interactive dashboards can be set
up to indicate model risk and further assessments can be conducted if anything
looks amiss.


7 Fines for breaches of EU privacy law spike sevenfold to USD 1.2 billion, as
Big Tech bears the brunt, CNBC, January 2022


06

2 min read

Find your holistic data privacy and security solution


GET STARTED WITH A DATA FABRIC

Data privacy and security can and should work together with the universal
business desire to get as much value as possible out of data. Connecting
disparate data through a data fabric helps introduce privacy and security at a
single access point while offering users an easier way to self-serve the data
they need. Strong governance makes the right data, quality data, easier to find
for those who should have access to it, while allowing sensitive data to remain
hidden unless appropriate. And the ability to conduct real-time monitoring and
audits helps secure the systems and comply with regulations, but it also helps
the business mitigate data loss through breaches and keep models accurate.

IBM Cloud Pak® for Data brings together a comprehensive solution that addresses
all the needs discussed previously with built-in data virtualization and the
cataloging of the IBM Watson® Knowledge Catalog. The IBM Security™ Guardium®
Insights solution and IBM OpenPages® software also help with the monitoring and
auditing capabilities as part of the IBM Cloud Pak for Data platform.

Learn more below.

NEXT STEPS

Talk with our experts

Schedule a no-charge, one-on-one conversation to discuss these topics

Ask us your questions

Data governance and privacy

Create a business-ready data foundation with data fabric

Learn more

Optimize your data strategy

Here are 5 recommended steps for data leaders

Explore now (170 KB) PDF