<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://lms.onnocenter.or.id/wiki/index.php?action=history&amp;feed=atom&amp;title=Data_Science_Strategy%3A_Memanaged_Konsistensi_Data</id>
	<title>Data Science Strategy: Memanaged Konsistensi Data - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://lms.onnocenter.or.id/wiki/index.php?action=history&amp;feed=atom&amp;title=Data_Science_Strategy%3A_Memanaged_Konsistensi_Data"/>
	<link rel="alternate" type="text/html" href="https://lms.onnocenter.or.id/wiki/index.php?title=Data_Science_Strategy:_Memanaged_Konsistensi_Data&amp;action=history"/>
	<updated>2026-04-20T09:06:58Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.45.1</generator>
	<entry>
		<id>https://lms.onnocenter.or.id/wiki/index.php?title=Data_Science_Strategy:_Memanaged_Konsistensi_Data&amp;diff=63138&amp;oldid=prev</id>
		<title>Onnowpurbo: Created page with &quot; Managing Data Consistency Across the Data Science Environment It might seem like a simple task to ensure data consistency across the different parts of the data science envir...&quot;</title>
		<link rel="alternate" type="text/html" href="https://lms.onnocenter.or.id/wiki/index.php?title=Data_Science_Strategy:_Memanaged_Konsistensi_Data&amp;diff=63138&amp;oldid=prev"/>
		<updated>2021-04-07T02:44:48Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;quot; Managing Data Consistency Across the Data Science Environment It might seem like a simple task to ensure data consistency across the different parts of the data science envir...&amp;quot;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;&lt;br /&gt;
Managing Data Consistency Across the&lt;br /&gt;
Data Science Environment&lt;br /&gt;
It might seem like a simple task to ensure data consistency across the different&lt;br /&gt;
parts of the data science environment, but it’s much more difficult than it seems.&lt;br /&gt;
First off, this area tends to be more complex than it needs to be, eating up more&lt;br /&gt;
time and resources than originally estimated. The need for consistency includes&lt;br /&gt;
aspects such as data governance and data formats, but also the consistent labeling&lt;br /&gt;
of data: using customer IDs across many different sources, for example, to enable&lt;br /&gt;
correlation of different data types related to the same customer.&lt;br /&gt;
The challenge is that there is a built-in contradiction in terms of infrastructure&lt;br /&gt;
between enabling usage of special tools to allow data scientists and data engineers&lt;br /&gt;
to be innovative and productive and at the same time ensuring consistency in the&lt;br /&gt;
data. This is because specialized tools are optimized to focus on solving certain&lt;br /&gt;
problems but either don’t keep the format consistent or don’t interface well with&lt;br /&gt;
other tools needed in the end-to-end flow. Optimized, specialized machine&lt;br /&gt;
learning tools are simply not good at playing together with other, similar specialized&lt;br /&gt;
tools that are addressing comparable or other adjacent problems.&lt;br /&gt;
But is it really that bad? Well, it can lead to real problems, depending on how&lt;br /&gt;
much freedom is allowed in the architectural implementation and among the&lt;br /&gt;
teams. Some examples of problems that can stem from a lack of consistency across&lt;br /&gt;
the AI environment are described in this list:&lt;br /&gt;
» » Ad hoc solutions: Every case is treated as an isolated problem that needs to&lt;br /&gt;
be solved this instant in order for the team to move forward. The result? No&lt;br /&gt;
long-term solution and no learning between teams.&lt;br /&gt;
» » Increased cost: When you have to duplicate tool capabilities in order to&lt;br /&gt;
manage a lack of consistency or when you have to build capabilities into&lt;br /&gt;
purchased tools to secure just the basic consistency, those costs add up.&lt;br /&gt;
» » End-to-end not working: Inconsistencies can occur when the infrastructure is&lt;br /&gt;
implemented across several cloud vendors, which then makes it difficult or&lt;br /&gt;
impossible to transfer data and keep data consistent across different&lt;br /&gt;
virtualized environments.&lt;br /&gt;
Because corporate management cannot enforce, and may not want to enforce,&lt;br /&gt;
data consistency across the organization as a company policy, they have to use&lt;br /&gt;
other means to preserve data consistency end-to-end. One way is to ensure that&lt;br /&gt;
all teams follow proper and relevant guidelines for evaluating and purchasing new&lt;br /&gt;
tools that incorporate specific directives related to data consistency. It also helps to&lt;br /&gt;
explain clearly why this is key to executing a data science strategy successfully.&lt;br /&gt;
It’s also vital to consider which limits are needed for each individual company,&lt;br /&gt;
depending on the type of business, its objectives, and so on. Hold the line when&lt;br /&gt;
it comes to data consistency: Otherwise, you may end up with a cumbersome and&lt;br /&gt;
costly implementation of data science, one far removed from the productive data&lt;br /&gt;
science environment you were hoping for.&lt;/div&gt;</summary>
		<author><name>Onnowpurbo</name></author>
	</entry>
</feed>