prahlaad r.
HealthcareDataThoughts

Clinical Data Stewardship: A Rant

“Why are these column values null?”

One of my major responsibilities as a product person, across all the healthcare organizations I’ve been a part of that deal with sensitive data, is maintaining Data Governance. Beyond that, I’ve been introduced to the concept of Data Stewardship.


Approve the staging

Okay, let’s say you already have the validation from the Executive Committee to launch your Data Governance program.

That’s great. What you need now is to really ONBOARD your data sponsors (you surely spoke to them earlier to get buy-in, but you have to go further now).

Make them talk about :

  • Their biggest current pain point (could be anything!) : is a system not synchronized with another? do they receive too many emails from customers?
  • Which data they need to manage their scope and make good decisions
  • Who do they need to succeed? (do they need IT to implement a new system? another business team to provide them key information?)

Now talk to them about :

  • Having the right data for their needs, with quality fit for their usage
  • Not being in charge of data quality remediation - at least not fully!
  • Relying on a sparring partner (you!) to coordinate their data initiatives with IT and other business teams
  • And, most importantly, the idea and usefulness of identifying Data Stewards in their teams 😈

Here you’ll have to talk through their fears : is the title of “data sponsor” unclear? is it the workload of the Data Steward role?

🔍 Tip #1 : You need to reassure them, dispel all doubts and prevent any pushback that could come up later during committees. This is what I call “pre-approval”, and it relies mainly on influence skills.

Prepare the scene

To help your data sponsors, you need to define precisely what the Data Stewards will be doing : the scope of the role, the decisions and the tasks expected.

1️⃣ Define the role & decision rights

  • Mission : Define the meaning, functional quality, and usage of data in a specific domain, not the servers or ETL jobs.
  • Decision rights : Write and approve definitions with the data sponsor, prioritize data quality rules, accept/reject quality thresholds, arbitrate naming conventions.
  • Out of scope : Infra ownership, solution architecture sign-off (they partner, they don’t own).

👉 You can find a full description of the Data Steward role in my templates.

2️⃣ Specify the domains & critical data elements (CDEs)

  • Define the sub-domains with the data sponsor.
  • Identify the top 5 CDEs for each sub-domain where inconsistency hurts most (e.g., “Active Customer,” “Invoice Status”).
  • Have the data sponsor pre-identify a Data Steward for each CDE.

3️⃣ Write the Data Steward Playbook

  • Describe their purpose, scope, decision rights, KPIs, time commitment (typically 10–20% FTE), and escalation path.

People usually write a role description but forget to go into detail on each task : what needs to be done? how often? in which tool?
That’s exactly what you should write in the Data Steward Playbook.

Example of detailed tasks for the Data Steward

  • Include a RACI for core workflows : definition management, issue triage, data quality control and remediation, etc.
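
To make the RACI idea concrete, here is a minimal sketch in Python. The three workflows come from the list above, but the role names and the R/A/C/I letters assigned to them are purely hypothetical placeholders, not a recommendation. The small check only enforces two common RACI conventions, which I am assuming here : exactly one Accountable and at least one Responsible per workflow.

```python
from collections import Counter

# Hypothetical RACI matrix: workflow -> {role: letter}.
# Roles and assignments are illustrative assumptions only.
RACI = {
    "definition management": {
        "Data Steward": "R",
        "Data Sponsor": "A",
        "IT": "C",
        "Governance lead": "I",
    },
    "issue triage": {
        "Data Steward": "R",
        "Data Sponsor": "I",
        "IT": "C",
        "Governance lead": "A",
    },
    "data quality control and remediation": {
        "Data Steward": "A",
        "Data Sponsor": "I",
        "IT": "R",
        "Governance lead": "C",
    },
}


def raci_problems(raci):
    """Return a list of readable problems; an empty list means the matrix passes."""
    problems = []
    for workflow, assignments in raci.items():
        counts = Counter(assignments.values())
        invalid = set(counts) - set("RACI")
        if invalid:
            problems.append(f"{workflow}: unknown letters {sorted(invalid)}")
        if counts["A"] != 1:
            problems.append(f"{workflow}: expected exactly one A, found {counts['A']}")
        if counts["R"] < 1:
            problems.append(f"{workflow}: nobody is Responsible")
    return problems


if __name__ == "__main__":
    for problem in raci_problems(RACI):
        print(problem)
```

Running a check like this before the data sponsor signs the playbook is a cheap way to catch a workflow where nobody, or everybody, is accountable.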

👉 Finally, get the data sponsor to sign the Data Steward Playbook and approve the workload associated with the Stewards’ detailed tasks.


Make them heroes

Okay, we have a data sponsor onboarded with a clear understanding of the need for Data Stewards. This data sponsor has validated the role description and pre-identified the people who could be appointed as Data Stewards.

What could go wrong? 😅

Well, some people might be reluctant.

Here is a list of things I have heard from people who do not want to be Data Stewards :

  • Too busy
  • Can’t decide on definitions or quality because they depend on many different people
  • Might discourage others from taking care of data if only one person is doing it

👉 What to answer :

  • Time : The Data Steward role is capped at ~2–4 hrs/month and replaces ad-hoc firefighting with a short, planned cadence.
  • Many stakeholders : We’ll coordinate the work between different teams to make sure the definition is common and not on one person’s shoulders. There’s also an escalation path and a public decision log.
  • Won’t discourage others : The Data Steward should carry the data mindset and spread it to others, not take care of everything while others look on.

Moving forward

I recommend starting with the easiest Data Stewards : people who are already data-driven and already care about data governance. But here are 2 ideas to make it a “hero” path :

Career pathing

  • Create a career map Data Steward → Domain Owner → Governance Council.
  • Discuss it with the HR team to link the roles to specific skills, incentives and rewards.

Metrics that make heroes visible

Track monthly, per domain :

  • Definition coverage : % CDEs with approved, versioned definitions
  • Quality rule coverage : % of records passing the defined data quality rules
  • Issue resolution : average time to resolve data quality issues

Keep it to no more than 3 KPIs that are easy to track. Publish a simple scorecard with all domains consolidated, and celebrate the wins of Data Stewards.
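
The three KPIs above boil down to simple arithmetic, so here is a minimal sketch of one scorecard row in Python. The `DomainSnapshot` fields, the "Billing" domain and all the numbers are hypothetical; in practice the inputs would come from whatever glossary and data quality tools you use.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class DomainSnapshot:
    """Monthly inputs for one domain (all field names are illustrative)."""
    domain: str
    cde_total: int
    cde_with_approved_definition: int
    records_checked: int
    records_passing: int
    issue_resolution_days: list[float]


def pct(part, whole):
    """Percentage rounded to one decimal, or None when there is nothing to measure."""
    return round(100 * part / whole, 1) if whole else None


def scorecard_row(s):
    """Compute the three KPIs for one domain."""
    days = s.issue_resolution_days
    return {
        "domain": s.domain,
        "definition_coverage_pct": pct(s.cde_with_approved_definition, s.cde_total),
        "quality_pass_rate_pct": pct(s.records_passing, s.records_checked),
        "avg_days_to_resolve": round(mean(days), 1) if days else None,
    }


if __name__ == "__main__":
    # Made-up numbers for a made-up domain.
    snapshot = DomainSnapshot("Billing", 5, 4, 2000, 1900, [2.0, 4.0, 6.0])
    print(scorecard_row(snapshot))
```

Returning `None` instead of 0 for a domain with no CDEs or no closed issues is a deliberate choice in this sketch : it keeps "nothing measured yet" visibly different from "measured and bad" on the consolidated scorecard.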

Finally, update your playbook by adding the names of your Data Stewards and their success stories.