Foundation

The strongest and tallest buildings all start with a strong foundation.

Foundation is the backbone of the entire data strategy.

A strong Foundation consists of five key ideas: alignment, initiatives, technology, documentation, and data, each of which is necessary for the successful development of a data strategy. Without a competent, well-grounded Foundation, we cannot confidently advance to the other four pillars.

Alignment

First of all, we need to talk about organizational alignment and why it is so important.

What do I mean by organizational alignment? Alignment is when a company works as a single mechanism, like a set of gears in which every element complements the others and keeps the whole running smoothly.

The success of all your data projects comes down to shared, well-communicated values within the company, while lack of organizational alignment is the number one reason why most data science and AI projects fail to be executed or adopted.

Whole books can be written about company alignment. It's a vast subject. I'm not trying to replace any of them, but I couldn't write a book about data strategy without mentioning it and how it relates to our initiatives.

Some symptoms allow us to quickly identify misalignment between the company's data initiatives and the organization itself:

  • People don't understand what data science is, how it works, or how it will help them or the organization;
  • Solutions are developed but not used;
  • There is a general fear or distrust of data and AI.

In companies with a healthy data strategy, people understand the data science and AI (DSAI) solutions, not necessarily technically but conceptually. They know why these solutions exist and how they help them, individually and as an organization.

When implemented right, these solutions are quickly absorbed by the organization, often in a fully transparent, seamless way. By “seamless” I mean that the most successful designs are not visible at all: they work in the background, and people realize their significance only when they are taken away for some reason.

These systems are, without exception, centered around people's needs and the company’s goals and integrate perfectly with the existing solutions.

Goals

When an organization is aligned, everyone is working in the same direction and aware of the same goals.

The most obvious thing to do is to define a long-term vision. This abstract concept reveals, at the highest level, what an organization most hopes to be and achieve.

However, even though it's great to have a direction in which the company is heading, a vision's natural ambiguity leaves it open to different interpretations. That's why it's also helpful to turn your strategy into concrete goals and divide them into measurable objectives and actions that employees should take to achieve the overall goal.

Many frameworks exist to measure goals and objectives. One of the first was MBO (Management by Objectives), presented in 1954. At the time of this writing, probably the best known is SMART. However, companies are increasingly introducing OKRs (Objectives and Key Results) and KPIs (Key Performance Indicators), which are also ways to monitor performance and direction.

Don't worry if you’re not familiar with these concepts. They will be explained in the simplest way possible further down the book.

In a data-driven company, the vision, the objectives, and the health of day-to-day operations can all be measured with data and driven by data. These measurements are necessary to build a successful data strategy, since your designs will be based on them and later evaluated against the previously defined indicators.

Need-driven, not tool-driven

The pile of projects that don’t see any adoption is tall and keeps on rising.

Perhaps the main reason for this is that they do not satisfy a real need or, if they do, they do it badly. Often, acting with good intentions, we forget that only those who face the difficulties we're trying to solve really understand them, and they will most likely be the ones applying our solution.

If we want these projects to be accepted and adopted, DSAI projects must be user-centered, i.e., based on people’s actual needs and demands, with feedback from the right people. The right people are rarely data scientists, because these are usually not the ones facing the problems they are solving.

Focus on the highest-priority needs and problems, learn as much as you can from the people who face them, and use technology to solve them.

Be need-driven, not tool-driven.

This book will help you understand how to do this by conducting interviews and observations with the people who will actually use the solutions.

Communication and understanding

I can’t even say how many times I found someone was afraid to talk to me knowing that I have worked on automation and even built super-human artificial intelligence. It’s a reasonable fear as we often see automation as a way to replace humans. However, I see artificial intelligence and data science differently. I see them as leverage to empower people, not replace them.

For companies seeking to grow, the ultimate goal of the data strategy is not to replace people in organizations but to help them perform their jobs more efficiently and effectively than before. People will be replaced in repetitive, simple processes. However, once these are automated, people can focus on high-quality problems: problems that only humans can solve, that are truly complex, and that require creative solutions or human interaction.

This idea is not the most common one. Since childhood, films, articles, and other media have presented us with only one side of technology and AI, usually an adversarial one. However, we should understand that when applied correctly, artificial intelligence and data science won’t end up creating overlords ruling the world. Instead, they will serve us and help us achieve what we want.

The only way to change this opinion is through communication. There is a whole chapter on communication in this book, but if I can give you two rules right now, they are: keep it simple and make it visual.

Initiatives

One of the first things a company should start doing is defining initiatives for data science and artificial intelligence. These initiatives are a way to formalize and isolate the development of different systems with a particular goal.

Turning Ideas into Initiatives

Most company managers feel lost when deciding what the company should focus on first, once they’re ready to start doing data science.

Selecting which initiatives to develop should not be purely intuition-based but evidence-based, either through data, simulations or both.

It all starts with describing an initiative. Each initiative should have at least the following items:

  • Name;
  • Description;
  • Feasibility:
      ◦ Necessary data;
      ◦ Involved stakeholders, their awareness and availability;
      ◦ Necessary infrastructure to deploy the solution;
      ◦ Necessary integrations with the current systems;
  • Associated objectives and expected impact on each of them;
  • Associated indicators and expected impact on each of them.

Then these initiatives should be prioritized accordingly, taking feasibility, expected impact, importance and urgency into consideration.
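As a rough illustration of how such a description could be captured and ranked, here is a minimal Python sketch. Everything specific in it is an assumption made for the example: the 1-to-5 scoring scale, the collapsing of feasibility into a single score, the weights, and the two sample initiatives. A real prioritization should be grounded in your own evidence.

```python
from dataclasses import dataclass, field


@dataclass
class Initiative:
    """One DSAI initiative, described with the items listed above."""
    name: str
    description: str
    # The four prioritization criteria, scored from 1 (low) to 5 (high).
    # The scale is an assumption made for this sketch.
    feasibility: int
    expected_impact: int
    importance: int
    urgency: int
    objectives: list[str] = field(default_factory=list)
    indicators: list[str] = field(default_factory=list)


# Hypothetical weights; tune them to your organization.
WEIGHTS = {
    "feasibility": 0.3,
    "expected_impact": 0.4,
    "importance": 0.2,
    "urgency": 0.1,
}


def priority_score(initiative: Initiative) -> float:
    """Weighted sum of the four prioritization criteria."""
    return sum(
        getattr(initiative, criterion) * weight
        for criterion, weight in WEIGHTS.items()
    )


def prioritize(initiatives: list[Initiative]) -> list[Initiative]:
    """Return the initiatives ordered from highest to lowest score."""
    return sorted(initiatives, key=priority_score, reverse=True)


# Made-up example backlog.
backlog = [
    Initiative("Invoice OCR", "Extract fields from scanned invoices",
               feasibility=2, expected_impact=3, importance=2, urgency=2),
    Initiative("Churn prediction", "Flag customers likely to leave",
               feasibility=4, expected_impact=5, importance=4, urgency=3,
               objectives=["Increase retention rate"],
               indicators=["Retention rate"]),
]
ranked = prioritize(backlog)
```

A weighted sum is only one possible way to combine the criteria. Its main benefit is that the trade-offs are written down and can be discussed, rather than living in someone's intuition.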

Tracking Initiatives and Knowing When to Move On

If you define the initiatives correctly, as described earlier, you now have a way to track them and their results.

Tracking initiatives is a seriously undervalued action that companies must take to be effective with DSAI.

The Observation chapter is extremely helpful for keeping track of initiatives, so don't forget to read it.

Focus, Clarity and Motivation

Having a clear view of the indicators the project affects will give you clarity on why the project is important and how you’re moving forward.

A lot of times, teams are so deeply immersed in the project that they lose clarity over the reasons why the project started in the first place. Tracking your initiatives against the objectives will help keep the focus, clarity and motivation of the people involved.

[Figure: Motivation]

Knowing when to stop: Objective already accomplished

Let’s say your target is to raise your retention rate to 15%, and you have a budget of one year for the project.

After Q1, you have already reached the goal. This is a good time to stop. Even if you estimate that it’s possible to reach 20%, the law of diminishing returns quickly sets in, and that 20% is never guaranteed.

Be happy with your success and add the improvement to that project as a new initiative, to have its priority assessed later against all the other initiatives.

It's very easy to get carried away and keep improving a project. So much so that I had one project where we only stopped after we tested a human against our machine. Our machine was more than 2.4 times better. Great for bragging rights, but an awful waste of time that the company could have used somewhere else.

This is a silent killer of projects. As one project becomes successful, it's easy to get even more motivated to focus on it and forget about other initiatives that the company needs. Remember, FORCE initiatives always have at least one objective defined. Once it's accomplished, look into the next high-value initiative!

[Figure: Goal Reached]

Objective is Currently Unattainable

It's important to know when to move on.

Sometimes the objective is just too hard to achieve in the current context and you start hitting a plateau.

Imagine the target objective is to reach a 50% retention rate. Suppose that, after one year of development, progress has stalled and the rate seems to have converged at 20%. It's time to stop.

Even though it might feel like defeat, it's a victory to be able to control our egos and understand that there are other, perhaps equally important, initiatives on which we should be focusing instead.

Sometimes the effort required to improve these systems would be much more productive elsewhere. This is particularly true when you're starting your journey. Accept it and move on. You can always come back later with knowledge and tools that you didn't have before.

[Figure: Goal Unattainable]
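The two stopping rules from this section can be written down as a simple check on the tracked indicator. The sketch below is only an illustration. It assumes a higher value is better, and both the window of three periods and the minimum gain of 0.5 points are hypothetical thresholds that you would choose for your own context.

```python
def next_step(history: list[float], target: float,
              window: int = 3, min_gain: float = 0.5) -> str:
    """Suggest whether to keep working on an initiative.

    history  -- indicator value per period, oldest first
                (e.g. retention rate in % per quarter)
    target   -- the objective defined for that indicator
    window   -- number of recent periods inspected for a plateau (assumed)
    min_gain -- smallest improvement over the window that still
                counts as progress (assumed)
    """
    if not history:
        return "continue"
    # Rule 1: the objective has already been accomplished.
    if history[-1] >= target:
        return "stop: objective reached"
    # Rule 2: progress over the last `window` periods has flattened out.
    if len(history) > window:
        gain = history[-1] - history[-1 - window]
        if gain < min_gain:
            return "stop: plateau, objective currently unattainable"
    return "continue"


# Target of 15% reached early: stop and open a new initiative if needed.
early_win = next_step([9.0, 12.0, 15.5], target=15.0)

# Target of 50%, but the rate has converged around 20%: move on.
plateau = next_step([12.0, 18.0, 19.8, 20.0, 20.1, 20.0], target=50.0)
```

A rule like this does not replace judgment. It is a prompt to have the conversation about moving on at the right moment.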

Supporting Technology

Of course, it would be pretty useless to have the perfect data strategy and forget about the technology needed to support it.

You might know exactly what you want to learn, what data you need, which needs you are addressing, and which people are involved, and those people might be highly motivated, but without technology, you won’t go far.

You need technology to:

  • Collect data;
  • Store data;
  • Move data;
  • Process data;
  • Visualize and communicate;
  • Deploy models and analysis.

Different technological solutions can do one or more of these. A good ERP will do all of these, but usually only to a certain extent.

My Basic Rules for Choosing Technology

When choosing technologies, these are my rules. Like any list of rules, there are exceptions, but these are my strong preferences:

  • Extensibility: easy to extend with new modules;
  • High interoperability: known and clear interfaces that allow integration with other services (this can be seen as a type of extensibility, if the system is part of a whole);
  • Portable compatibility: always opt for something that is not OS-dependent. I tend to like web apps, as they run in browsers and, as such, are natively portable;
  • Low maintenance: you can choose between having your own platform that you’ll have to maintain and getting a managed service. I would go for the managed service for various reasons. Yes, managed services are more expensive, but they're worth it;
  • Open-source: I’m an open-source advocate. Please note that open-source is not the same as free.

Be ready to pay for the technology and don’t be cheap on your foundation. A great building needs a great foundation. Invest the amount necessary to guarantee it.

Companies that fail to do this tend to suffer in the long run. You don’t want to have your data scientists or other specialized people focused on maintaining databases when some services do this automatically, do you?

The need for documentation

Now that you've got your company aligned, your initiatives are well planned and ready to be tracked, and the necessary technology is set up, you’re ready to start, right? Wrong.

I know you’ll hate me for saying this, as many technical people do, but you have to define documentation standards.

Why? To remove dependencies and manage knowledge. If projects are not well documented, this is what will happen:

  • Every time a new person joins the project, they’ll be entirely lost and require lots and lots of attention from your most senior people. You know, the ones that should be working on getting you the most impact possible in the least amount of time.
  • Every time someone is unavailable, it won’t be clear what this person did, and if changes are required, questions will go unanswered.
  • Every time someone leaves the project, you’re in trouble. Without documentation, no one will fully understand how the project works.

Fortunately for you, I’ve added a chapter to this book where I go deeper into documentation for different topics: code, databases, etc.

Documentation is essential for your projects’ long-term success and, ultimately, for your data strategy.

Everything in its place - Processes, Data and the Single Source of Truth

Finally, it should be completely clear where all this information can be found.

Have you been in a situation where everyone feels a bit lost because a single person had to miss work? Maybe this person even documented their processes, but you don’t know exactly where these documents are? Maybe some people save them on their computers, while others use Google Docs, email, or even paper.

You have an amazing team, fully aligned, with your initiatives defined, a great technology foundation, and even most processes documented, but no one knows where they all are! What a waste!

Imagine instead knowing exactly where all documentation lives. Everything, from all departments and all teams. No matter what the process is, it can be found in this place, and people know it. With good access control and security, these processes are available to everyone who should have access to them, and no one else. Great!

When the same is applied to data, we call this a Single Source of Truth, and it’s a fundamental concept for a successful data strategy.

What is a Single Source of Truth

A single source of truth, or SSOT, is the single reference point where a company’s data can be found. It is not a specific system, tool, or strategy. Historically, SSOTs have been implemented in many different ways, and to this day there is no single best way to implement one; the implementation will always depend on the context of each organization. What they all have in common is that they aggregate data from the many systems within an organization and make it available to the right users.

Why is an SSOT needed?

Perhaps the most important benefit of an SSOT is ensuring that the organization operates on the same standardized version of data across the company. Without a single source of truth, each team, department, or subsidiary of an organization is a potential data silo, isolated from the rest. An SSOT also enables better communication between different business units. They will all share the same “language” and information: no more meetings or decisions based on inconsistent numbers from various departments.

Implementing a single source of truth enables decision-makers to make data-driven decisions based on the whole organization’s data rather than on fragmented black boxes.

An SSOT should contain everything that can be used to inform business decisions. There is no limit to the kind of data. Everything from customer data to sales data should be there, if relevant to the organization.

Who is responsible for building a Single Source of Truth?

Because SSOTs collect data from the many systems within an organization, it’s critical to assign responsibility for this data to the right people.

While a dedicated team should be responsible for setting up the single source of truth system, this team cannot be responsible for the data there.

I’m a firm believer that the team that creates the data is also responsible for keeping it updated within the defined quality standards. They are the only ones with the necessary domain knowledge to do so. Therefore, they should be responsible for sharing their data correctly with the rest of the organization.

Being responsible does not mean that they should do it alone. Instead, the different teams should cooperate with the technical team to ensure that the data is processed correctly.
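To make the idea of a single reference point with clear ownership a little more concrete, here is a small Python sketch of a dataset catalog. It is not a recommended implementation, since the right one depends on your organization. The class names, the role names, and the `warehouse://` location are all hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetEntry:
    name: str                      # business name of the dataset
    owner_team: str                # team that creates the data and keeps it updated
    location: str                  # where the canonical copy lives (placeholder URI)
    allowed_roles: frozenset[str]  # roles that may access the dataset


class Catalog:
    """Single reference point: each dataset is registered exactly once."""

    def __init__(self) -> None:
        self._entries: dict[str, DatasetEntry] = {}

    def register(self, entry: DatasetEntry) -> None:
        # A second registration would create a competing "truth".
        if entry.name in self._entries:
            raise ValueError(f"{entry.name!r} already has a canonical source")
        self._entries[entry.name] = entry

    def owner(self, name: str) -> str:
        """Return the team responsible for the dataset's quality."""
        return self._entries[name].owner_team

    def locate(self, name: str, role: str) -> str:
        """Return the canonical location if the role may access the dataset."""
        entry = self._entries[name]
        if role not in entry.allowed_roles:
            raise PermissionError(f"role {role!r} may not access {name!r}")
        return entry.location


catalog = Catalog()
catalog.register(DatasetEntry(
    name="sales_orders",
    owner_team="Sales",
    location="warehouse://sales/orders",
    allowed_roles=frozenset({"analyst", "finance"}),
))
orders_location = catalog.locate("sales_orders", role="analyst")
```

In this sketch the technical team would maintain the `Catalog` itself, while each entry names the team that answers for the data's quality, which mirrors the split of responsibilities described above.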

Moving forward

If you followed all this, you should now have a solid foundation to support all your upcoming projects. You’re ready to start with the following module: Observation.

In this module, I’ll explain the importance of Observation in leading a business both in strategic and tactical ways.

You’ll learn to see if you’re going in the right direction and if the machine is running smoothly or if there’s a loose nut that needs to be fixed. You’ll also learn how to notice anomalies and get notified as soon as they happen, so you can give them the attention they require to prevent any possible consequences, or arrange a party to celebrate your unusual success in sales, for example.