Post 55 — Aug 19, 2022
“The most powerful person in the world is the storyteller. The storyteller sets the vision, values and agenda of an entire generation that is to come.” — Steve Jobs
Storytelling with data is a critical skill, and data visualizations allow us to package the intent of our data product, as well as create templates & data contracts.
I will be posting categories of data visualizations and different types of charts within those categories, and then how we can weave them into our data products.
Post 54 — Aug, 2022
Leadership is tricky, as is culture, especially in a fast-moving business with risks, demands on people’s time, technology that is growing exponentially in complexity, and explosive competition for the limited pool of talent with the experience to integrate all of these systems.
The biggest gap is a process for business & data teams to communicate effectively in order to align interests and deliver ROI. That’s what I’m working on with the “Art of Data Products”. Please reach out if you’d like to learn more.
Post 53 — Aug, 2022
“The most powerful person in the world is the storyteller. The storyteller sets the vision, values, and agenda of an entire generation that is to come…” — Steve Jobs
Thanks Robbie Crabtree for writing this. So true, and a critical skill I want to develop. The field of computational psychometrics (data science + psychometrics) is something I’m very passionate about. Posting this to see the reaction to Robbie’s post and the introduction of the phrase “psychometrics” to those that follow my series. If I get a lot of comments, likes, shares, etc., then it tells me it is something I should dive deeper into.
Post 52 — Aug, 2022
“Data Has Information, But No Meaning”: what does that mean? Here is a very simple step-by-step walkthrough. It is important to understand these relationships, as they can give teams a common framework when collaborating on designing and building software together.
Your follows, likes, comments, and shares are greatly appreciated!
Post 51 — Aug, 2022
Atomic Data Contracts are generalizable agreements for communication that are aligned at the information, data, and conceptual planes (and beyond, but we haven’t gotten there yet).
This proposal demonstrates a system for API payloads, or conversation, or a business concept glossary, etc.
Really enjoying how this is shaping up, and the support I’m getting from the community, thank you all!
Post 50 — Aug, 2022
I had fun & informative conversations with Chris Tabb and Caleb Keller about these questions: How do we measure the value of our KPIs? How do we make sure our investments in KPIs make financial sense? How can we help leaders focus on what’s important and reduce the noise of their inputs? Welcome to the topic of “KPIs of KPIs”!
I hope this thread continues to grow and evolve on LinkedIn, and we welcome your input to help us know what problems you are facing, and what you’d like us to address on this topic.
I’m grateful I’ve gotten the opportunity to meet and collaborate slowly, iteratively, and organically, with people like Chris and Caleb. I hope you get a similar experience.
Post 49 — Aug 7, 2022
The conceptual layer is experiential and what we share socially through language, symbols, and music.
Previously, I’ve discussed the flow of Information > Data > Concepts > Knowledge, and I’d like to introduce how they connect to how we can code & design our systems: https://lnkd.in/g_hh8vUG
I’d like to dive a bit deeper into the conceptual layer, introducing a few concepts from psychometrics, and designing knowledge experiences.
My goal is to connect the worlds of semantics, data, software engineering, design, psychometrics, and product management into a single framework, so that teams can collaborate better. It’s going to be different, and it’s going to challenge how things are currently done. It’s also not for everyone, and that’s ok. Thank you Ole Olesen-Bagneux for giving me the encouragement to explore & experiment, and Jessica Talisman for helping me think through how to connect elements of this framework to more traditional and formal LIS practices.
Post 48 — Aug 6, 2022
Here’s the #1 reason why I see large enterprise innovation projects fail.
Semantic-driven design is a strategy to set up your company’s innovation efforts for success BEFORE they begin by mapping how concepts & logic are connected and used by teams, applications, and processes in your org networks.
Agile development can be really valuable in the right context, but without being anchored in the “network effects” of how organizations operate cross-functionally, it can lead to siloed development.
The Lean Data Product Playbook has been developed as a companion to agile, to ensure teams are aligned using ‘network-thinking’. This will save your company tremendous amounts of money and time, and ensure better outcomes.
Post 47 — Aug 6, 2022
Want pitching advice for your startup, or presentation? Here’s a story for you! A LinkedIn connection had a startup, and asked me to review their pitch and give them feedback, which I was happy to do.
Their CEO was talking 100 miles per hour, had a lot of anxiety & negative thoughts, and I felt very overwhelmed. I gave them feedback, and I wanted to share it, because I think everyone, including myself could use this reminder.
Your audience is at step 1, and we might be at step 10, but we need to meet them where they are, and make it simple enough for them to understand. They’ll never get to step 2 if they can’t connect at step 1.
Your energy is their “User Experience” of how you think, feel, communicate, and run your startup. Slow down. Connect with them. Find what you like about yourself and your vision, and use that to ground yourself and feel empowered. Don’t feel proud of yourself? I totally get it, it took me a long, long time to start finding things I felt good about. It’s hard work, but you can find those victories and connect them to your startup. Think of how you want that person to experience your energy.
Your goal is to get them curious about Step 2. Learn from your conversation. Where are natural steps forward, and where do they not exist? Maybe you need to update your pitch or even startup strategy, or maybe you can learn who is not a good investor/audience. The goal is to make moving to step 2 feel natural and easy.
Post 46 — Aug 5, 2022
Want to better understand what data scientist experiences are? ‘Flip the paradigm’ of how your data products are built in order to create greater ROI and improve team collaboration.
Post 45 — August 4, 2022
Want to better understand what data scientist experiences are? ‘Flip the paradigm’ of how your data products are built in order to create greater ROI and improve team collaboration.
Original post: https://www.linkedin.com/posts/ronitelman_lean-data-products-the-data-scientist-experience-activity-6961350828410490880-YUa1?utm_source=linkedin_share&utm_medium=member_desktop_web
Post 44 — August 4, 2022
Post 43 — August 3, 2022
I was chatting with Chris Tabb, and he made a joke that went something like this: “Why do we call it the ‘modern data stack’? Because everything else is taken to get budget”.
I’ve developed a simple and straightforward data product management methodology to address some of the inefficiencies I see and hear about from data teams. The goal of the data product management framework is to reduce complexity among teams through an easy checklist of getting everyone on the same page, knowledge sharing, and identifying issues *before* investing in building, to enable a highly iterative and efficient execution of collaborative efforts.
In the comments I am going to list posts from various thought leaders on related topics if it helps create understanding and perspective.
The methodology is focused on the human element of leadership, vision, continual development of expertise toward mastery, knowledge, and a culture of mentorship. I believe an investment in team culture and processes can have equal or greater ROI than any investment in a new tech paradigm.
If you’d like to try this data product management and data experience design at your organization, please reach out, or forward this post to a leader in your organization. Clarkston Consulting has been incredibly generous in supporting this effort, and it is an open-source methodology, so that everyone can benefit.
Follow for more on data product management & data experience design.
P.S. — I will put links in the comments to other thought leaders who have written posts aligned with and influencing this one, as well as a link to my full series of posts on the topic.
Post 42 — August 2, 2022
I wanted to share with you a very simple, elegant, and generalizable framework to align on language when talking about information, data, concepts, and knowledge.
This framework is grounded in my work for the world’s largest knowledge publishing company, where I was incredibly lucky to work with some of the most brilliant PhDs in Learning Science, Computational Psychometrics, AI/ML, and Personalized Learning & Analytics. That work informed this framework and will make your strategy, management, and execution of your data projects more successful if you get your teams to align on this paradigm.
My next posts will be about knowledge intelligence, the “KPIs of KPIs” (how to value your BI investments & strategy) w/Chris Tabb, and a “Guided Semantic Experience” exploration with Jessica Talisman. Hopefully we can get Ole Olesen-Bagneux to review our designs and provide input (if we are lucky ;) ).
Please follow if you want to learn more!
Post 41 — August 1, 2022
I was lucky enough to chat with Ole Olesen-Bagneux about his upcoming book on Enterprise Data Search, and am continuing to learn from him through other media, such as Loris Marini’s podcast on “Discovering Data”.
I recommend you give it a listen if you want to learn about how Data Catalogs enable Enterprise Search, and if you want to expand your horizons about bridging the worlds of how we manage concepts.
This post contains some visualizations from his podcast interview with Loris, and I added some thoughts about why a data product management framework is needed to bring together the different disciplines of semantic/concept management, software engineering, design, and business strategy. My goal is to provide a framework for all of these siloed teams to work effectively and efficiently by improving our communication, understanding, and alignment of our goals.
The challenge of assigning a value to data & knowledge will be addressed in a separate post, titled, “The KPIs of KPIs”, which I will be collaborating on with Chris Tabb.
I hope you enjoy this mini-presentation, and please follow for more!
Post 40 — August 1, 2022
I’ve seen some posts refer to data products as dashboards, and others refer to them as knowledge, and I thought I would share the differences I see between the two in the context of a data product management framework.
Having worked for one of the largest knowledge companies in the world alongside teams of PhDs in learning science, computational psychometrics, and personalized learning experience designers, I am opinionated on the topic of knowledge. I think it benefits the community to have clear distinctions.
Joe Horgan has a post covering different meanings of the term data products, and data.world has an excellent post (link in comments) by Juan Sequeda on being knowledge-first. The summary of my definition is: if the context is ‘assembled data to solve a problem’, then it is a knowledge product. If the context is ‘optimally communicating data that can be assembled’, then it is a data product. The quicker we can align on a clear definition of these terms, the easier it will be to avoid confusion and make progress in ‘data as a product’ and ‘knowledge intelligence’ (which I will post on soon).
Post 39 — July 30, 2022
I’ve been thinking about how to bring together the various domains of knowledge and disciplines involved with data products into a single data product management framework.
The levels of complexity in tools and how to tie them all together are skyrocketing, and the talent pool is limited. I’ve seen amazing teams operate in silos, and seen the shocking destruction in productivity and value because of it. We need to address this.
I’ve learned so much from data science, software engineering, experience design, data engineering, BI analysts, and product people, and I believe we can bring them all together using product management methodologies for data, which will unlock tremendous value for business leaders.
Follow me for more on this series, and the link to the complete series is in the comments.
Post 38 — July 30, 2022
A fascinating thought exercise was posed by Pablo K: “If there’s no incentive structure governing data production, it isn’t a sustainable or scalable op model.” Juan Sequeda brought up an equally provocative idea: “If we are going to treat data as a product, it has a cost. It has value. A transaction needs to occur.”
I don’t have a strong sense of where we are headed, but they are definitely on to something on the topic of our complete lack of ability to measure the value and other properties of data products!
When an employee leaves, estimates are that it costs a business 2–4x their salary, when you factor in lost time for new hire training, finding an employee, loss in productivity, etc. But what about the cost of their knowledge being gone? When a senior data engineer leaves your team, how much is their knowledge in how to ‘get things done’ worth?
Let’s dive deeper. Executive leadership needs to decide whether to invest in project A or project B, and the quality of their decision could affect thousands of jobs, share prices, and sales. The wrong decision could jeopardize the very financial stability of the company. They need the right high-quality data, they need it now, and they need your data science team to analyze it and make a recommendation. How much is the knowledge worth? How much is the data worth that makes the knowledge?
I’m going to have another post on how data products construct knowledge, but the two are fundamentally linked.
If we were talking about physical items, like screws (a product) on a production line that makes a car (another product), factories would have a wide and rich variety of tools to measure the cost of a good. But are we measuring the cost of our data goods? Are we investing in reducing our costs by increasing the reliability and quality of our production line?
One way to measure concepts is to measure the frequency with which they are communicated in your organization, and to how broad an audience. They can be connected to financial outcomes or other temporal-based events, and they can be weighted by who uses them. They can be ranked with what is most relevant for which business problem, and more.
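As a rough illustration of the first two measures (how often a concept is communicated, and to how broad an audience), here is a minimal Python sketch. The message log, the audience sizes, and the two glossary terms are made up for the example, and the matching is a naive substring check, so treat it as a starting point rather than a real measurement system.

```python
# Hypothetical communication log: (sender, audience_size, text).
messages = [
    ("ceo", 500, "Churn is our top priority this quarter."),
    ("analyst", 5, "Updated the churn dashboard with new cohorts."),
    ("engineer", 8, "Refactored the cohort pipeline."),
]

# Assumed glossary of business concepts to track.
concepts = ["churn", "cohort"]


def concept_stats(messages, concepts):
    """Count mentions of each concept and the total audience reached."""
    stats = {c: {"mentions": 0, "reach": 0} for c in concepts}
    for _sender, audience_size, text in messages:
        lowered = text.lower()
        for c in concepts:
            # Naive substring match; a real system would need proper term matching.
            if c in lowered:
                stats[c]["mentions"] += 1
                stats[c]["reach"] += audience_size
    return stats


stats = concept_stats(messages, concepts)
for concept, values in stats.items():
    print(concept, values)
```

In this toy log both concepts are mentioned equally often, but one reaches a far larger audience, which is the kind of difference a frequency count alone would hide.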
But this only happens if we invest in “Unostrichizing” our data, algorithms, and decision-making systems.
Post 37 — July 30, 2022
The other day I was speaking with Ole Olesen-Bagneux, author of The Enterprise Data Catalog (O’Reilly Media). He was so generous with his time, and patient, because he knows like 10,000,000 times more than me on the topic of semantic management.
I had tremendous fun discussing with him my challenges and goal of finding a way to bring the worlds of data management, semantics, product management, experience design, and software engineering together into a seamless data product management framework.
I’m going to put together a few pieces based on what we discussed. First, we talked about the experience of posting on LinkedIn, and I want to share some of my learnings and thank Ole for his positive encouragement to continue to be myself:
1. There is sometimes the experience of being open by ‘thinking out loud’ and exposing oneself to the inevitable ‘expert’ who hasn’t really read your post but is all-too-eager to tell you why you are wrong (even though they are siloed in their own way of thinking). They state that they know more than you, and then they post a link to their website. I’m learning to focus on the messages I get from people who feel like I’m writing about their experience. I don’t mind someone questioning or pointing out something that they don’t agree with or that doesn’t make sense, but there’s a way to do it that is positive, virtuous, polite, and contributes to the topic, instead of being antagonistic.
2. There are the people who will copy your content and post it without tagging you. This sometimes feels a bit sucky. So I recommend with images you add your name in the image so that if it is shared, at least you can get credit. If they erase your name from the image, then that’s on them and at the minimum, take that as a compliment for the quality of work. Not worth wasting emotional energy on them/their behavior.
4. Write to a specific persona. In my data product management and data experience series, I’m trying to be the voice of the data scientists and analytics engineers who don’t often get the opportunity to voice their frustrations. I want to amplify their voice, and I want to offer a solution to business leaders to improve their data product management and data experience capabilities.
5. LinkedIn Youssef Koutari Katie Carroll Tomer Cohen can we please get more analytics as creators to help us create better free content for your users? It gives us and them better experiences. I have like 5 requests :)
If you enjoyed this post, please follow me.
If you want to read more about my series on data products on Medium, please go to the link in the comments.
Thank you for allowing me to write about something I’m deeply curious about, and showing me there is a community I can connect with.
Original post: https://www.linkedin.com/posts/ronitelman_the-other-day-i-was-speaking-with-ole-olesen-bagneux-activity-6959322580671488000-dzV_?utm_source=linkedin_share&utm_medium=member_desktop_web
Post 36 — July 30, 2022
Commonly, a business builds a technology ecosystem which generates a lot of data, and then afterwards business teams want data scientists to create value from it.
Business leaders think they are ‘data-driven’ because they produce lots of data and have a data team to create dashboards for them.
If you flip that process, and ask your data teams “how would you design our business and technology system to generate the highest quality data that will drive the most value?” you are putting data teams at the beginning, and not the end, of the value creation chain.
You will be surprised at how they design your systems, and what they can do when they get the data they need, at the level of quality they need, to accelerate your business.
Every team that uses data is having a user experience with that data, so why not have a conversation with them and see what their experience of creating value for you is like? Here’s an illustration I made of what is quite typical: https://www.linkedin.com/posts/ronitelman_one-of-my-all-time-favorite-posts-on-a-data-activity-6958864836629839872-SUwx?utm_source=linkedin_share&utm_medium=member_desktop_web
Empower your data teams with great experiences, and they will empower you.
If your organization would like coaching on how to set up data product management and data experience design capabilities, please reach out.
Please follow to see my series on data product management and data product design, and you can find the entire collection by following the link in the comments.
Post 35 — July 29, 2022
It articulates so clearly what so many BI analysts, data engineers, and data scientists have as a regular Data User Experience.
We need to change this. Companies need to evolve because technology is evolving faster and faster.
There is a cost not just to our employees and co-workers, there is a real monetary loss in productivity to companies, and a real loss in value to customers.
I believe a new software development paradigm is required, and I believe a new product management methodology is required to bring all of these various domains of knowledge together to have a semantic-first and knowledge-first approach.
I’m trying to figure that out with these posts, so I share my thoughts. This is my MVP… the ideas and feedback, and the collaborative nature of LinkedIn, have me believing that we will find a way to generate tremendous value and dramatically improve the user experience of our data developers, and of all the other teams that are connected to, and benefit from, their hard work.
Please follow if you want to continue to read this series. I will put a link to the entire series about Data Product Management & Data Experience Design in the comments.
Post 34 — July 29, 2022
People don’t usually think in networks; they operate in silos because ‘this is just how it is’. Here is how to find internal innovation opportunities by applying data product thinking in networks:
1. Map our business processes & logic, how things are ‘supposed’ to work.
2. Map user behavior, how things ‘actually’ work (this is usually because employees have to get around obstacles of poor performing software systems, user experiences, bottlenecks of other teams)
3. Connect together all of the financial and time impacts of how the system is not working as designed to calculate a cost.
4. Design a new system where the smallest change can have the highest ROI across the entire network
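Steps 3 and 4 can be sketched as simple arithmetic. The Python below is only an illustration: the workaround names, hours, hourly rate, fix costs, and the 48-week working year are all invented assumptions, and a real analysis would need to trace effects across the whole network rather than rank isolated items.

```python
from dataclasses import dataclass


@dataclass
class Workaround:
    """A gap between how a process is 'supposed' to work and how it 'actually' works."""
    name: str
    hours_per_week: float  # time lost to the workaround, summed over everyone affected
    hourly_cost: float     # assumed blended hourly rate
    fix_cost: float        # assumed one-time cost to remove the obstacle

    def annual_cost(self, weeks=48):
        # Step 3: turn the time impact into a yearly financial cost.
        return self.hours_per_week * self.hourly_cost * weeks

    def roi(self, weeks=48):
        # Step 4: yearly savings per unit of money spent on the fix.
        return self.annual_cost(weeks) / self.fix_cost


# Hypothetical findings from mapping designed vs. actual behavior.
workarounds = [
    Workaround("manual CSV export", hours_per_week=20, hourly_cost=60, fix_cost=15000),
    Workaround("duplicate data entry", hours_per_week=5, hourly_cost=60, fix_cost=2000),
]

# Rank candidate changes by return, highest first.
ranked = sorted(workarounds, key=lambda w: w.roi(), reverse=True)
for w in ranked:
    print(f"{w.name}: annual cost {w.annual_cost():.0f}, ROI {w.roi():.2f}")
```

With these made-up numbers, the smaller problem ranks first because it is so cheap to fix, which is the "smallest change, highest ROI" idea in step 4.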
You’ll be surprised how many internal innovation opportunities you find. The next question is: can you get internal support to make a change? To answer that, we need to find effective ways to clearly communicate and do it in a way which doesn’t make the custodians of the current system feel threatened or perceived as judged negatively. Some are excited by discovering opportunities for doing things better, some may not react so positively.
Link to my series on data product management in the comments.
Thanks for reading!
Post 33 — July 28, 2022
Design for simplicity.
Technological advancements are increasing in complexity at ever-increasing rates, and getting organizational systems and leaders to coordinate and collaborate effectively at the pace of technological change is getting harder than ever.
Ask your data, product, software, and design teams what would simplify the complexity in their tasks. Compile a list cross-functionally. See how the dots connect. You will be surprised at what you find.
I believe that in most cases, if not all, leaders can safely focus their investments of time, energy, and money into improving employee experiences by designing for simplicity. Nurturing a culture optimizing for simplicity also helps in focusing communication with clients and their experiences. Empower internal teams to think about how to design their employee experience from a bottom-up approach rather than strictly a top-down one.
For more on my data product management series, please follow me and you can browse the entire series in the link in the comments.
Post 32 — July 28, 2022
I started reading Christian Kaul’s paper “Closing the Business-IT Gap with a Model-Driven Architecture”, which covers data modeling not only from a data perspective, but from the perspective of needing a ‘ubiquitous language’. I highly recommend you follow Christian if you can; I learn from almost every post he makes on the topic. I’ll put the link to his paper in the comments.
I took a snapshot of something in the paper which stood out to me, and I wanted to brainstorm on a few properties of modeling that I have been noodling on, which hopefully are additive to Christian’s framework.
In addition to concepts, connections, and details (or perhaps these are included in details if Christian has already thought of them), I would add the property of “frequency”, meaning how frequently a concept is used across an organization’s network of systems and processes. That way, we can begin to think about the value of a concept: is it used by everyone, all the time, or only very rarely and by a specific few people?
If we add weights to the users of concepts in a graph, so that, for example, a CEO might have a higher weight than a temporary admin, we can also think about the value of a concept. A CEO using a concept 1x a year to make a major decision might carry less total weight than 10,000 employees using a concept 10x a day.
Additionally, we can measure the change in frequency of a concept. Maybe something is used rarely and suddenly becomes very frequent.
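To make the weighting idea concrete, here is a small Python sketch of the CEO versus 10,000 employees comparison, plus a simple measure of change in frequency. The weights (1,000 for the CEO, 1 for everyone else) and the 250 working days per year are arbitrary assumptions chosen only for illustration.

```python
def weighted_usage(uses_per_year, user_count, weight):
    """Total weighted usage of a concept by one group of users over a year."""
    return uses_per_year * user_count * weight


# Assumed weights: a CEO counts 1,000x as much as a regular employee.
ceo_usage = weighted_usage(uses_per_year=1, user_count=1, weight=1_000)

# 10,000 employees using the concept 10x a day over an assumed 250 working days.
staff_usage = weighted_usage(uses_per_year=10 * 250, user_count=10_000, weight=1)


def frequency_change(previous, current):
    """Relative change in usage between two periods (1.0 means usage doubled)."""
    if previous == 0:
        # No baseline to compare against; flag any new usage as unbounded growth.
        return float("inf") if current > 0 else 0.0
    return (current - previous) / previous


print("CEO weighted usage:", ceo_usage)
print("Staff weighted usage:", staff_usage)
print("Change from 4 to 40 uses:", frequency_change(4, 40))
```

Even with a very generous weight for the CEO, the broad everyday usage dominates in this toy setup, which matches the intuition in the paragraph above. Different weights would of course give different answers.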
The semantic system of the future I imagine will give each concept its own web page, with taxonomical description, analytics on the concept’s usage within systems, etc. Additionally I imagine we will want to measure the ‘bits’ of information required to communicate a concept. Why is that? Because if a concept with a few bits has a massive influence on a system it is more valuable than a concept with a high number of bits that has a very weak influence on the system. I describe this in my blog on the topic of ‘meta-intelligence’, which I will also link to in the comments. In that article I try to describe measuring concepts as a mass, and the influence of concepts on a system as gravity (f=ma). The acceleration of a concept on systems’ state toward the highest reward configuration is how I’m thinking of how to design intelligent systems intelligently.
Thanks Christian for sharing your knowledge with us. Curious if you have thoughts / feedback on this smorgasbord of thoughts ;).
Post 31 — July 27, 2022
“I believe that scientific knowledge has fractal properties, that no matter how much we learn, whatever is left, however small it may seem, is just as infinitely complex as the whole was to start with. That, I think, is the secret of the Universe.”
― Isaac Asimov
Of all my posts on data product management, this is the one I am most afraid of writing.
In the center we have a dog, a simple concept, right? Everyone knows what it is, so people in the company use language to describe a “Dog”. But teams then implement the concept of dog in different ways, and can represent “Dog” not in words, but in code (analyst group). Go deeper into the data engineering team, and they will create models of “Dog”, which can have different levels of granularity of what a “Dog” is: its paw at the mezzo level, its DNA at the micro level, and dog packs at the macro level. But that’s not all, because the sales team wants to model the psychological experience of a human with the word “Dog” in marketing campaigns, and “Dog” for sales teams may mean story narratives around “Loyalty” and “Best Friend” and other adjacent concepts not even present in the word “Dog”.
Many, if not most, companies don’t have fully mature taxonomy & ontology teams. No one except a select few knows how to leverage the value they bring and actually integrate it into user experiences and products. And even with a semantic team, the rest of the non-computational human folks still use “Dog” in their own context and their own language. And the table columns and the code for processing “Dog” information may use different names.
Tony Dahlager has a great post (link in the comments) about knowledge: “The great resignation has exposed a critical organizational weakness — a lack of knowledge-capture and knowledge-sharing in data teams.”
If anyone understands fractal systems, and knowledge as a fractal, I’m in awe, because I’m still trying to wrap my head around it. I bought a book on the fractal nature of entropy and maybe understood 1.1235813%. But I think about how much more efficient organizations would be if data, processes, code, and people’s language had something really simple to implement to capture concepts and explain in plain English what they are and what they are connected to. And what if, rather than doing this at the end of building an expensive enterprise system, we flipped it, and did this first?
What if, and this may sound crazy, dumb, radical, or brilliant... what if the act of designing our concepts, what they are, what they do, and what they connect with, could generate our applications, queries, and other experiences?
I may never achieve understanding the fractal nature of knowledge, but exploring how knowledge is created, curated, and disseminated among groups to continuously evolve is absolutely a breathtaking and fascinating world to explore.
Post 30 — July 27, 2022
What is a data product?
A comment I often get about my definition of data products is that it doesn’t exactly match someone else’s meaning, which is ironic, because that mismatch is exactly the purpose of, and need for, data products.
My definition of a data product is: “the communication of information from source to recipient with an agreed upon conceptual map”. So JSON communicated from an API to a business analyst with a taxonomy & schema reference embedded in the communication itself would be a data product.
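Here is a minimal Python sketch of what such a payload could look like, with the conceptual map travelling inside the message. The field names (`concept_map`, `taxonomy`, `schema`), the example.com URL, and the tiny type vocabulary are all hypothetical choices for illustration, not an established standard.

```python
import json

# Hypothetical payload: the data plus the conceptual map needed to interpret it.
payload = {
    "concept_map": {
        "taxonomy": "https://example.com/taxonomy/animals#dog",  # made-up reference
        "schema": {"name": "string", "weight_kg": "number"},
    },
    "data": [{"name": "Rex", "weight_kg": 31.5}],
}

# Assumed mapping from schema type names to Python types.
TYPE_CHECKS = {"string": str, "number": (int, float)}


def conforms(payload):
    """Check that every record matches the schema embedded in the same payload."""
    schema = payload["concept_map"]["schema"]
    for record in payload["data"]:
        if set(record) != set(schema):
            return False
        for field, type_name in schema.items():
            if not isinstance(record[field], TYPE_CHECKS[type_name]):
                return False
    return True


# Source side: serialize. Recipient side: parse and verify against the agreed map.
wire_format = json.dumps(payload)
received = json.loads(wire_format)
print(conforms(received))
```

The point of the sketch is that the recipient does not need a side channel to know what the fields mean or whether the data honors the agreement; both arrive with the data itself.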
This example is different from, let’s say, a ‘dashboard’ being a data product (end-user experience), but a dashboard can also be a data product.
It is a bit recursive (and my next post will be on the fractal nature of knowledge), because data products can be combined with other data products to hierarchically and/or relationally form data products. In this visualization we have a dog ‘product’ sold in a store made of legos. But an individual lego can also be a ‘product’ sold to internal teams or to a person.
The data product management framework I’m developing is based on this ‘fractal’ pattern of knowledge. While very abstract, this pattern is very powerful and worthwhile to understand, because we can ‘compress’ value by making the lego blocks re-usable components to generate more and more products. For example, the “Dog” product can be part of an “Animal” lego set.
Claude Shannon’s paper defining information was titled “A Mathematical Theory of Communication”, and this is what I would like to ground my semantic/knowledge/data product theory in: the properties of the communication of information are themselves the data product, which can be experienced by data scientists as JSON, XML (or anything else), and used to build user experiences in recursive structures that we can zoom in and out of.
User feedback, A/B/n testing, and other product management methodologies are vitally important, but grounding a data product in an information-theoretic approach is how I believe we can make data product management a scientific endeavor that aims to measure the entropy and efficiency of data product design to achieve business objectives and create value.
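As one possible starting point for "measuring the entropy" of a data product, here is a short Python sketch that computes the Shannon entropy, in bits, of the values observed in a single field. Applying it per field like this is an assumption made for illustration; it is not a full method for scoring data product design.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy in bits of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    # H = sum over distinct values of p * log2(1 / p), with p = count / total.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Hypothetical field values from two data products.
constant_field = ["dog", "dog", "dog", "dog"]
varied_field = ["dog", "cat", "bird", "fish"]

print(shannon_entropy(constant_field))
print(shannon_entropy(varied_field))
```

A field that always carries the same value has zero entropy and so tells the recipient nothing new, while a field spread evenly over four values carries two bits per record.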
I’m going to leave the link to my running blog on data product management in the comments. I’d love for you to follow, share, or reach out, it’s how I get feedback on what resonates with people, and these posts are my MVP to a/b/n test with the community!
Post 29 — July 26, 2022
I thought about how much less innovation companies have created, how many fewer cures for diseases, and how many terrible memories have been created because of the lack of focus on employee experiences.
To be clear, there are companies that really try. And there are companies that spend a LOT of internal and external marketing effort to advertise their employee culture and how much they care.
But I think there’s a real metric that can only be set by the C-suite, and that has to do with “What KPIs and what incentives are you creating around your employee experience?” Simon Sinek has excellent lectures on YouTube about this.
The visualization below is about 1) KPIs: often toxic leaders who maximize the yield out of employees at any cost are rewarded because they generated higher sales in a shorter amount of time. Never mind the trail of people who felt mistreated. Meanwhile, someone who is slower to reach sales targets, but builds a team that is nurtured and supported and has a great culture and experience, will be penalized. 2) Coaching: I had some amazing coaching sessions with Christopher Young, which made me wonder how many companies have functioning internal coaching programs to make sure employees succeed. It’s hard, expensive, and messy. And yet, I believe, doable. But it is long-term thinking versus short-term.
How can we influence leaders to switch from short-term thinking, which is about putting out fires, to long-term thinking, which is about seeing value sustained and increased over time? I’m not sure how to even have this conversation with leaders, or how to begin to shift their perspective. I want to believe it is possible, but I haven’t seen it yet. I’d like my growth as a consultant to have a way to communicate and influence leaders to think about this.
I firmly believe that a long-term investment in your employee experience will vastly increase the value of your business over the long-term compared to short-term thinking. But perhaps that mind shift is too scary, too difficult, or perhaps I’m too naive. I believe that after seeing so many great people leave companies because of toxic managers that there is a quantitative way to measure the cost. I hope that is my next area of growth. The technology is easy. Having a truly positive impact on teams and people so that people want to be their best at work is hard.
Post 28 — July 26, 2022
❌ Data Product Management isn’t about writing user stories, or estimating time to complete stories.
✅ Data Product Management is about getting stakeholders with wildly different language, expertise, and perspectives to agree to use the same map (conceptual language) in order to communicate and collaborate effectively, in order to drive business value and continuously innovate to create fantastic data experiences.
For more on Data Product Management Strategies for your organization: https://lnkd.in/geR78tZq
To anyone who took the time to read this, thank you :). Please follow me and share if you think others might enjoy it.
Post 27 - July 26, 2022
I was reading about PayPal’s innovation tournament for employees, inspired by venture capital: https://www.linkedin.com/posts/srishivananda_how-paypal-gets-employees-invested-in-innovation-activity-6952714578745049088-IT-y/?utm_source=linkedin_share&utm_medium=member_desktop_web
I’ve been thinking a lot about the incentive structures of internal innovation programs, specifically related to the “Data Divide”: business executives say they keep investing in hiring more BI analysts and buying tools, and yet aren’t able to get the dashboards, reports, and insights they want fast enough; data executives say they can’t possibly keep up with all of the demand; and business analysts say there is usually a lack of documentation on metadata for the ‘data swamp’ files they dig through.
I’m wondering if leaders incentivize companies to think about ‘how quickly data teams can ship a dashboard’, which is short-term thinking and can often lead to stressful environments, rather than rewarding teams that think longer-term and build the proper foundations for generating higher-value, higher-quality data products and a better culture. @paulgaffney has a post on quality (link in comments), and I recommend you follow him for a great CTO perspective.
The image below follows a Simon Sinek lecture where he describes one sales team hitting a quota and getting a bonus but having a very toxic culture, and another team missing the quota but building steady and reliable growth toward the goal with great culture and long-term thinking.
The most common feedback I get in conversations with business analysts and data scientists working on our larger client projects is that quite often they don’t know how to have a conversation with business leaders around the ROI of longer-term thinking. I’d love to hear any suggestions you have on articles, visuals, or frameworks that are great for getting business and data leaders on the same page on what they want to incentivize, and what culture they want to build.
Setting the KPIs and reward structure on long-term thinking is not the norm, and I’d love to understand A) why, and B) how to influence leaders to incentivize great team culture and solid foundations that eliminate BI bottlenecks.
Please share your thoughts and follow me if you’d like to hear more on Data Product Management.
Thank you all for being part of the conversation!
For my full series on Data Product Management, it is on a single Medium page (free) titled: “Data Product Management: A Series On Methodologies, Strategies, And Communities”, link: https://ron-itelman.medium.com/data-product-management-a-series-on-methodologies-strategies-and-communities-b2d6b944a654
#businessintelligence #cto #ceo #cio #cdo
Post 26, July 2022
A) Chad’s post is brilliant. Every CTO / CIO / CDO should immediately print this out and frame it on their wall / tape it to their monitor / etc. so they see this every. single. day.
B) You would be well served to ask your leaders, “How are we measuring our incentives and investments?” Because if people who put out fires and scramble well are the ones you decide to promote and listen to, you will get more of that growing in your organization.
C) Simon Sinek has a great presentation where he discusses team reward structures for sales in short & long-term thinking, and I can’t emphasize enough how this is equally applicable to data teams. Sometimes organizations reward teams that have high-volatility and high-stress, causing high attrition, which is a short-term focus. Rewarding teams that perhaps are slower but build foundations for long-term gains to avoid the situation that Chad describes creates a culture of positivity, loyalty and trust to each other and projects, and will over-deliver in value.
D) Do you have your success metrics and KPIs for your data teams?
Post 25, July 25 2022
I started reading ‘SQL for Data Analysis’ by @cathytanimura and what struck me is that on page 19 she references that data scientists spend 50–80% of their time on cleaning data rather than doing data analysis… and this is from 2014! As that number hasn’t really changed in 2022 that seems like a really big deal.
Beyond the incredible value of saving the time of very expensive talent, it also means companies aren’t realizing the value of AI/ML, analytics, etc. There’s a cost to society too, from all of the value destroyed by this problem, and that’s what is most troubling. How many cures for diseases have been delayed or destroyed because of data issues?
I learned that defining a problem is 50% of solving a problem, and I wanted to zoom out as much as possible. Here’s what I see:
In order to bridge the “data divide” between businesses that need to transform to prioritize knowledge, semantics, and data science, and the data & engineering teams that are challenged by even finding the data, understanding it, and being able to create data products, we need a ‘data product management framework’ to get everyone on the same page.
I view a data product manager not as someone who writes user stories, but as someone whose job it is to set a vision and align the teams to it.
Therefore, I believe a unified framework for a singular data product needs the following: 1) integration into all of the tools and systems that consume the data product; 2) semantic management (taxonomy, schema, ontology); 3) experience, which is currently missing: a beautiful user experience in which *anyone* (with permission/role access controls) in the organization can view the definition of a concept and how it is used across various systems; and 4) a way to measure the data product, and how easy it is to create data products, so that the business side can see an ROI / cost reduction and continue to invest.
“Data Product Studio” Apache 2 / open-source approach: My aim is to make it first a methodology, and secondly an open-source Streamlit app you can begin using. It won’t be enterprise-ready, because we will need a company to partner with us. But in the meantime, I hope you can follow this thread, offer recommendations for what you need, support with your ideas and questions, etc.
If your organization does want to partner with Clarkston Consulting on taking the alpha to something that is more enterprise ready in a co-development deal, please reach out!
#CTO #CDO #CIO #datascientists #business #management
Post 24, July 2022
When I started writing about Data Product Management, it was because I was seeing a problem I couldn’t exactly articulate, and Chad Sanderson wrote some amazing articles that inspired me.
Many years ago I took a Product Owner Scrum certification class (check out JJ Sutherland), which had us literally walk out into public and interview random strangers to get feedback on our product ideas. It taught me to get past my fears and get feedback as soon as possible, even with paper prototypes.
On LinkedIn I decided to make my posts my “MVPs” to test out ideas and get feedback from the community. What has emerged is that, unexpectedly I’ve met a ton of people who have shared a ton of knowledge and sent me messages about how my posts have either resonated with their experience at work, or they want to see where this live & social brainstorm goes around a new framework for data product management focused on semantics and knowledge.
So this post is just about saying thank you to LinkedIn for enabling us to connect with each other, build communities, teach and inspire each other, AND it is about saying thank you to the people who’ve been reading the posts and those that have been sharing knowledge on how to develop this series better.
I want to thank Simon Sinek for writing about staying focused on early adopters and early influencers (he calls it the diffusion model), which is exactly the strategy I want to follow.
Lastly, I want to call out a few people who write about writing on social media and have helped me get over my initial fear. I recommend you follow them if you are interested in the topic:
Katelyn Bourgoin — “Buyer Psychology Geek”, posts about psychology & communication
Chris Munn — his “effective presentations” post is brilliant
Justin Welsh — he shares great resources and tips, including how to make your LinkedIn profile better (I’m going to follow his advice soon and update mine).
Wes Kao — Next-level tips on better writing
My wife thought the drawing reminded her of “Charlie and the Chocolate Factory”, so I added a flock of mini-dragons to add a GoT / Daenerys baby dragons spice to it. Hey, I don’t control what comes out in the artistic process, it just comes out, which is why I love being creative ;)
Thank you everyone, and if you want to follow my thread which is working on a Data Product Management Framework on LinkedIn, with support from the community, please follow me!
Have a great day everyone!
Post 23, July 2022
Data Product Management Framework Exploration — Post 22
We’ve begun referring to a common conflict with our clients between the BI & data teams, and the business owners. The data team is constantly putting out fires and trying to get through bottlenecks to answer business questions, unable to even easily find the data because of years of ‘data swamps’ building up. The business team keeps feeling they are sold promises that if they buy tool X or tool Y, or hire more BI people that the problem will be solved. We call this the “Digital Data Divide”.
Knowledge-first approaches (shout out to Juan Sequeda) need a product management methodology that has evolved for data products & internal customers.
Sean Goodpasture once posted that 99% of developers aren’t in FAANG companies with super-advanced systems and mature development teams with experts available to set up best practices. Therefore, I’d like to propose that we need a knowledge / semantic focused data product management framework for the 99% who are not in a large company with a big enough budget to hire and maintain professional ontology teams / data librarians. This is the key to unlocking search, knowledge, and healing the “Digital Divide”.
Allen Holub’s post on this topic of product management hits the nail on the head about it not being about estimating time to complete to-do-lists. Product Management processes should be focused on enabling communication among teams and alignment in effort to succeed: https://lnkd.in/gkif3S6E
I also recommend you follow Bruno Aziza if you want to learn more about data products.
Semantics for the 99%, the bare minimum for a data product I believe should be:
1. Schema defined in JSON. This defines a contract for communication of the data. GraphQL is a great option.
2. Concepts used by that data product listed in a taxonomy. This can be an Excel file, a Confluence page, etc. This defines a contract for the meaning of the data.
3. Query catalog used to answer business questions, with the tables and databases / data sources, so that other people can more easily find data sources and know how to solve business questions. This defines a contract for how to use data.
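As a rough sketch of what those three contracts could look like side by side, the example below describes one data product as a plain Python dictionary and checks it for gaps. The product name, field names, table name, and the `contract_gaps` helper are all hypothetical and only meant to illustrate the idea.

```python
REQUIRED_CONTRACTS = ("schema", "taxonomy", "query_catalog")

# Hypothetical data product with the three "bare minimum" contracts.
data_product = {
    "name": "employee_cost",
    # 1. Schema: the contract for communicating the data.
    "schema": {
        "type": "object",
        "properties": {
            "employee_id": {"type": "string"},
            "cost_usd": {"type": "number"},
        },
        "required": ["employee_id", "cost_usd"],
    },
    # 2. Taxonomy: the contract for what each concept means.
    "taxonomy": {
        "employee_id": "Unique identifier assigned to an employee by HR.",
        "cost_usd": "Total monthly cost of the employee in US dollars.",
    },
    # 3. Query catalog: the contract for how to use the data.
    "query_catalog": [
        {
            "question": "What is the total employee cost?",
            "sources": ["hr_db.payroll"],
            "query": "SELECT SUM(cost_usd) FROM payroll",
        }
    ],
}


def contract_gaps(product):
    """List missing contracts and schema fields with no taxonomy entry."""
    gaps = [
        f"missing contract: {key}"
        for key in REQUIRED_CONTRACTS
        if not product.get(key)
    ]
    fields = (product.get("schema") or {}).get("properties", {})
    terms = product.get("taxonomy") or {}
    gaps += [f"undefined concept: {name}" for name in fields if name not in terms]
    return gaps


print(contract_gaps(data_product))
```

A check like this could run whenever someone edits a data product, so that a field never ships without an agreed definition.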
I hope this post helps communicate the problem in the digital data divide, and some simple steps we can do to make it better.
Please follow if you like this thread :)
Post 22, July 2022
On Data Product Management Series, Post 21
One #systemsthinking strategy is to visualize how data stakeholders are connected: from business owners asking a question, to getting data-informed answers.
The deeper I dive into the problem of “How do we design our end-to-end data systems to maximize business value, with greater efficiency and efficacy?”, the more I realize I have so much more to learn.
Below are people I recommend you follow related to this question, they certainly are sparking creativity and sharing valuable knowledge that I appreciate:
1 — Business & ROI: Ethan Aaron
2 — Strategy & Leadership: Chad Sanderson, who inspired me to start posting on the topic because every time he would post I would be like “you are saying exactly what I’m feeling and thinking and articulating it so brilliantly!”.
3 — Data Modeling: Christian Kaul
4 — Schemas: Jessica Talisman
5 — Ontologies: Jessica Talisman
6 — Data Catalog: Ole Olesen-Bagneux
7 — Data Engineering: Lauren Balik
Rather than a sequential and siloed process, this graph is meant to get everyone to have a voice and share what they need to be successful, so that teams can be aligned and working synchronously.
Any recommendations for developing this framework further? I am thinking of a sample project I can begin to include in this series, going through each team’s role in creating a data-first application. We’ll create a very simple JSON-schema, ontology, model, etc., and your feedback will help inform the process. It will probably be an app where users can vote if they like dog or cat pictures better.
Please follow if you’d like to get updates as the series continues, and thank you for your participation & recommendations!
Post 21, July 2022
Post 21 on Data Product Management, writing to learn:
Christian Kaul’s thoughtful post (link in comments) hits the nail on the head: “When you are building data stuff, you are industrializing the structure of your thoughts. If your thoughts aren’t that structured to begin with, well, …”
What he brought to attention is even more complex when we are in a social network at work, with multiple stakeholders, who may have slightly different ways to think about concepts.
This morning I find myself curious about whether there is a simple way for business and product owners to model concepts in an atomic fashion, relative to the insights or user experiences they want to create. I’m not talking about getting to the level of “optimal normal form” that a professional data modeler would want to get to, but literally a simple to understand, yet powerful framework for us mere mortals to document ‘the structure of our thoughts’ in a way that data modelers can easily use to translate that structure into something that can be implemented.
Aishwarya Srinivasan posted about the massive value created by a semantic layer on AI & BI programs. I’m wondering what the best user experience for managing concepts is between data & business teams. Any suggestions for frameworks, books, or apps to use? Please leave a comment and if you like the topic of Data Product Management, please follow me!
Thanks for reading :)
Post 20, July 2022
Data-As-A-Product, Writing to learn, Post #20
Met with the team today to discuss what I believe is the hardest topic in Data-As-A-Product: the organizational mindset that needs to shift in order to truly unlock a company’s data teams and the data itself.
As technologies evolve, and the level of expertise required to fully understand how to design, leverage, and implement successful data systems increases at ever-expanding rates, a “digital divide” is emerging within organizations.
Furthermore, many program managers may not want to openly admit they “don’t know” how data systems and product management processes need to be set up for initiatives to succeed. They think they are data driven and that things should ‘just work’ when they invest in a tool or hire new people.
We’re wondering what is the best language to describe this growing digital divide, and how to describe the remedy. We’ve discussed “Business Translator”, who knows enough about data and business cross-functionally to unite everyone towards a common vision. I call that a product manager for data products.
The visual below is what I call a data product. Each business concept is effectively illustrated as a physical ‘box’ from someone who creates business concepts and data (like a web app that logs user interactions), or someone who gets and uses data (like a business analyst for a dashboard, for example). Each ‘box’ is what I would call a data product *IF* it also contains semantic representation that teams cross-functionally agree to use (like a data contract both for technology and people). Data product management should include: Schemas (like GraphQL), Taxonomies (like Schema.org), API catalogs, logic design (pseudo code is fine), databases & tables being clearly documented for that data product, any ontology information (I am still learning about ontologies but OWL seems to be the standard), and also the data visualization documentation for designers and coders.
Each data product should be able to have its own web page with a unique url. This is what I would call a ‘complete’ data product, that anyone creating or consuming data can use. Ideally federated teams, business owners, data owners, etc. can own their own semantic management. Lots to learn here, both in terms of designing a web-based system that makes it easy to manage data-products, as well as the product management methodologies, and how to communicate to organizational leaders about the paramount importance of treating data as a product in order to maximize the value of their data, and help support the data teams with the leadership buy-in to evolve a company into a “data-first” culture, and out of siloed thinking.
If you are an organizational leader, and want Clarkston Consulting to provide strategic support in transitioning to a “data-as-a-product” methodology, please DM me or leave a comment. Thank you so much, excited to see if this resonates with organizational leaders who want to tackle their digital divides.
Post 19, July 2022
Post 19 on writing to learn about Data-As-A-Product:
I spoke with a colleague who implements and trains BI teams for both mature and startup stage life sciences companies. He described BI as the top of a pyramid, and the insights business leaders want as a ‘lightbulb’ as a north star. He described a dynamic from organizational leaders who hire teams and buy powerful BI tools like Tableau and Power BI, and expect a self-serve data system, but aren’t getting the digital transformation at the rate they were expecting. I wanted to dig in a bit of why he sees that happening, and a key reason he highlighted was that buying BI tools doesn’t solve the problem of how to translate key business concepts and objectives into logical and reproducible building blocks. I call this ‘translation’ process semantic management, and so in this visual I put it in between the data sources and the connectors to BI tools. I believe having a robust way to manage concepts and logical processes is the key to unlock the insights business leaders want when they invest in BI.
I am most curious about how to calculate the cost of not having semantic management. I believe Prof. Bent Flyvbjerg’s post (and the paper in the link) hits the nail on the head. There is a type of heuristic related to organizational decision-making around data investments. For example, I’ve seen engineers hired and told to start building things before agreement has been reached cross-functionally on data structures or product objectives. I have no doubt that some leaders think “I need to hire more BI people, and buy more tools, and that will solve our problem”, but do not factor in the importance of semantic management. I don’t know yet how to communicate on heuristics, “processes that ignore information and enable fast decisions”, but it is important that we begin to think about how to set up data projects for success: https://lnkd.in/gv3uMmaf
Post 18, July 2022
Post 18 in “writing to learn”:
An expert in ontological management recommended I look at LinkML to manage schemas. It can generate RDF, OWL, GraphQL, SQL DDL. Has anyone used it and have thoughts on it they want to share?
Post 17, July 2022
One challenge teams have in ‘data-as-a-product’ is converting business needs to a logical format that can be technically executed on. Therefore, as part of our step-by-step methodology, we ask stakeholders to explain what concept they want represented, and an analyst or technical product manager can break down the concept into sequential logical building blocks with natural language (pseudo-code). From there, teams can convert the logic into whatever language they need.
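As a small illustration of that step, the sketch below takes a hypothetical concept ("monthly employee cost"), writes the analyst's natural-language building blocks as comments, and then converts each block into a function. The concept definition and function names are assumptions made up for this example, not a rule from our methodology.

```python
# Hypothetical concept requested by a stakeholder: "monthly employee cost".
#
# Natural-language building blocks an analyst might write down:
#   1. Start with the employee's annual salary.
#   2. Add the employee's annual benefits.
#   3. Divide the annual total by 12 to get a monthly figure.


def annual_compensation(annual_salary, annual_benefits):
    """Blocks 1 and 2: salary plus benefits for the year."""
    return annual_salary + annual_benefits


def monthly_employee_cost(annual_salary, annual_benefits):
    """Block 3: spread the annual total evenly across 12 months."""
    return annual_compensation(annual_salary, annual_benefits) / 12


print(monthly_employee_cost(96_000, 24_000))
```

The same three blocks could just as easily be translated into SQL or a BI tool's formula language. The point is that the logic is agreed on in plain language first.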
The Data Product Studio is meant to be a communication tool that gets teams aligned using common product management and design techniques. Software engineering & data modeling processes like Hierarchical Problem Decomposition & Dynamic Programming are critical skills for getting everyone on the same page.
If you’d like to try our Data Product Studio (Apache 2 open source license) alpha version at your org, to see if it can help your teams communicate better and work effectively, please leave a comment or send me a DM. Thanks!
Post 16, July 2022
Post 15 on “Data-As-A-Product”: Visualizations
Getting everyone on the same page, which is not a technology, is a core philosophy of good product management. To create data product visualizations we may need 1) a designer, 2) a software engineer (if JS), 3) a business intelligence analyst (if Tableau or PowerBI), 4) a DevOps engineer (to create APIs), and 5) a business domain owner to describe what data product they need.
We aim for our ‘visualizations’ view on our Data Product Studio to enable us to do just that: create a data product as a reusable building block for our internal customers to use. Getting everyone on the same page so they can collaborate and create is our mission.
If you’d like to discuss our open source Alpha, please reach out via DM or in the comments. As always, thank you for your feedback and please follow if you’d like to see how we go from design to wireframes to production code!
Post 15, July 2022
One thing that Priyanka Vergadia does really, really well in her book “Visualizing Google Cloud” is that she sometimes adds characters having conversations over the content visualizations. Priyanka, I hope that you take imitation as the highest form of flattery!
For this post (#14) on “Data-As-A-Product” I’d like to talk about “Aligning Objectives” for all stakeholders of data products, from a product management perspective. We can call them epics and stories, or objectives and key results, but the purpose is not about writing tickets to throw over a wall and have other teams execute. The purpose is to create an easy process and flow for communication that enables teams to get a shared and mutual understanding. Thanks Keith Klundt for recommending the visual about everyone thinking about the same concepts symmetrically (shown with triangles).
When we create data products, we want data engineers, designers, product managers, business domain owners, DBAs, data scientists, and software engineers to be on the same page, and that starts with communication before any technology implementation. We want our products to become ‘lego building blocks’ that can be reused. For more on this topic I recommend Chad Sanderson’s excellent post: https://lnkd.in/gqup7HVC
The ‘LinkedIn Whiteboard Sketch’ below is Clarkston Consulting’s Apache-licensed, open-source Data Product Studio, which we are formalizing and beginning to build. If you’d like an invite for the alpha, or would like to learn more, please reach out.
If you want to see the journey as we design, mockup, and build our Data Product Studio, please follow me! Your feedback, comments, and questions, are very helpful in letting me know what content resonates with the community.
Post 14, July 2022
Post 13: The Data Product Studio Taxonomy Manager
We want any stakeholder (business intelligence, data engineer, business domain owner, etc.) to be able to:
1. Look up / edit / add a concept that has its own unique URL
2. View the hierarchical construct that can be used for a JSON schema.
3. Now, any other data product can reference this concept’s web page / unique ID if it is used in a logical construct, or to embed metadata directly in the data communicated.
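The three capabilities above can be sketched roughly as follows. The `Concept` class, the `example.com` base URL, and the concept names are hypothetical, and the output is only a JSON-Schema-style fragment rather than a validated schema.

```python
from dataclasses import dataclass, field

BASE_URL = "https://example.com/concepts"  # hypothetical taxonomy host


@dataclass
class Concept:
    concept_id: str
    description: str
    children: list = field(default_factory=list)

    @property
    def url(self):
        """Capability 1: every concept has its own unique URL."""
        return f"{BASE_URL}/{self.concept_id}"

    def to_schema(self):
        """Capability 2: render the hierarchy as a JSON-Schema-style dict."""
        node = {"$id": self.url, "description": self.description}
        if self.children:
            node["type"] = "object"
            node["properties"] = {
                child.concept_id: child.to_schema() for child in self.children
            }
        return node


employee_cost = Concept(
    "employee_cost",
    "Total cost of an employee.",
    children=[
        Concept("salary", "Base pay."),
        Concept("benefits", "Employer-paid benefits."),
    ],
)

schema = employee_cost.to_schema()

# Capability 3: another data product references the concept by its URL.
dashboard_metric = {"name": "avg_cost_per_team", "uses_concept": employee_cost.url}
print(schema["$id"], dashboard_metric["uses_concept"])
```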
The formal designs of the data studio are being iterated on and hopefully released soon. Are there any features you would like to see? We’re avoiding RBAC (for now) for the alpha. If you’d like to try the alpha, please leave a comment or DM me.
Thanks for all of your feedback!
Post 13, July 2022
The Data Product Design Studio is an open-source project we’re working on here at Clarkston Consulting to walk product, data, design, and engineering teams through a step-by-step process to create data products. Our North Star is to “Get everyone on the same page to communicate, collaborate, and create data products”
We start with a whiteboard / paper design and get feedback as quickly as possible from potential users, and these posts and our iterations will continue over LinkedIn for our digital whiteboard!
Our first ‘paper’ prototype: A search function so someone can look for an entity like “Employee Cost”. A data engineer might look for a table, and a business analyst might look for how employee cost is calculated. Therefore, in our search experience we have ‘hot keys’ or tags so that you can add context for the search.
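A minimal sketch of that tagged search might look like the following. The catalog entries, the tag names ("table", "calculation"), and the `search` function are assumptions for illustration, not the actual prototype.

```python
# Hypothetical catalog: the same entity name can appear in several contexts.
CATALOG = [
    {"name": "Employee Cost", "kind": "table", "detail": "hr_db.employee_cost"},
    {"name": "Employee Cost", "kind": "calculation", "detail": "salary + benefits"},
    {"name": "Revenue Total", "kind": "table", "detail": "finance_db.revenue"},
]


def search(term, tag=None):
    """Return entries whose name contains the term, optionally narrowed by tag."""
    term = term.lower()
    return [
        entry
        for entry in CATALOG
        if term in entry["name"].lower() and (tag is None or entry["kind"] == tag)
    ]


# A data engineer adds the "table" tag; a business analyst adds "calculation".
print(search("employee cost", tag="table"))
print(search("employee cost", tag="calculation"))
```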
If you’d like to discuss trying out our Alpha, please DM or leave a comment, thanks!
Post 12, July 2022
Post 11 on visualizing “Data-As-A-Product” concepts:
At Clarkston Consulting we’ve been discussing the topic of “Data-As-A-Product”, and the challenges our clients are having. There are a lot of powerful tools out there to manage and get value out of data, like dbt Labs, Airbyte, RelationalAI, Snowflake, Snorkel AI, Monte Carlo, Databricks, and many more.
We have a pretty unique methodology we’ve been working on to help clients go through digital transformation via treating data-as-a-product. This means that the user experience of using data to create knowledge by business domain owners, data scientists, data engineers, and analysts is given the same amount of care as a consumer product experience.
One of the challenges we see is that there are a lot of tools out there, competition for talent is fierce, and a lot of the time people are putting out fires to deal with data issues because “that’s just the way it is”.
Rather than creating yet another technology tool, we are starting with the most important element in data systems: people. We are creating a “Data Product Studio” to help get teams on the same page, walking them through a step-by-step methodology for creating data products that we have been developing at Clarkston.
And the best part? We are releasing it open-source, under the Apache license. Someone kindly asked if I would post a blog about the journey I’ve had on LinkedIn over this series of posts, and in a way, the community’s reaction is our MVP. I learn from every interaction I have with you. Thank you; this is what was born out of our LinkedIn MVP community-driven interactions.
The Data Product Studio is meant to be a checklist that step by step gets everyone on the same page in an easy to use interface to create data products. I believe this will solve a lot of problems by focusing on a methodology that supports best practices and collaboration. Think of it as a “Computational Center Of Excellence For Data Product Management”.
We’re building our alpha using Streamlit. Want to try the alpha? Please leave a comment or reach out! We’ll be posting progress through LinkedIn along the way.
Post 11, July 2022
This post is inspired by Priyanka Vergadia’s book “Visualizing Google Cloud”, as her visual writing style is so effective.
At the top of the visualization is a knowledge graph (long term goal), the first ring/step I’m thinking about is defining data management processes which is the second image (medium term goal), and at the bottom is an ideation on an open-source web app meant to guide various stakeholders through that process (short term goal).
I added a ‘taxonomy’ and ‘rules’ drawing, as business domain owners may define concepts, and business analysts define rules. The third section is a user experience to get everyone on the same page: The business domain owner (or product manager), the business analyst, a DBA, data engineer, and DevOps engineer, so that every concept can be understood from the highest-to-the-lowest level.
Ideally this user experience is searchable as well, has interactive design tools for experimentation to help teams get on the same page while figuring things out, and is available to all users in a company who work with data, rather than only users with licenses to complex and sometimes expensive enterprise metadata management systems.
Does anyone have any tools to recommend (especially open source) that may be similar or achieve similar results? Much appreciated!
Post 10, July 2022
Day 9 of visualizing “Data-As-A-Product, Semantics & Hypergraphs”:
For fun, if this post tags Daniel Shapero (who gives excellent advice over walks he records on video), and then I create a symmetrical post on Twitter (with cross-references between networks), I could (in theory) have a taxonomy & schema which creates a universal semantic layer for multiple networks. I could then begin to think about analytics which measure impact of a digital footprint when networks are graphed together rather than separate.
Another example of using multiple networks (graphs) might be Katelyn “The Customer Whisperer” Bourgoin’s post: https://lnkd.in/g8G3Bmxt which shows using Google Autocomplete as a recommendation (based on their network of human interactions), and then cross-referenced with using answerthepublic.com, and she shows how to use tools like Facebook Ad Library, and more (it is excellent).
The ability to combine data from multiple network systems into a semantically managed hypergraph, and communicate information between networks and watch their network effects is something I’m quite excited to continue to learn about.
I view organizational systems as networks as well, and I’m wondering what the most efficient way is to adopt a data-as-a-product paradigm across different internal systems.
Andrej Karpathy retweeted Jascha Sohl-Dickstein’s post of JSON based tasks used by Big-bench: “The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to probe large language models and extrapolate their future capabilities.” https://lnkd.in/g4FtiXs3
The design challenge (I believe) is how do we find a generalizable way to combine multiple networks which may have different semantics into a hypergraph with a universal semantic layer? I believe JSON will be the de-facto language, and I believe that a computational system to connect various graphs and leverage their data for insights, along with the user experiences and products that can be built on multiple networks, has the potential to create great value.
I will post this on Twitter as well, tagging the same people, and referencing the LinkedIn post, and then I will go back and edit this post with the Twitter post to see if I can observe anything interesting from the cross-network references.
And the post-Twitter post edit: https://lnkd.in/gqVdGJ2a
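To make the idea in this post more concrete, here is a minimal Python sketch of a hypergraph whose nodes come from different networks but share a semantic ID. Everything in it is an assumption for illustration: the `Node` and `Hypergraph` classes, the local IDs, and the semantic ID strings are hypothetical, not an existing standard or library.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Iterable, List


@dataclass(frozen=True)
class Node:
    network: str      # source network, e.g. "linkedin" or "twitter"
    local_id: str     # identifier inside that network (hypothetical values below)
    semantic_id: str  # identifier in the shared, cross-network taxonomy (assumed)


@dataclass
class Hypergraph:
    # Each hyperedge is a named set of nodes and may span several networks.
    hyperedges: Dict[str, FrozenSet[Node]] = field(default_factory=dict)

    def add_hyperedge(self, label: str, nodes: Iterable[Node]) -> None:
        self.hyperedges[label] = frozenset(nodes)

    def networks_spanned(self, label: str) -> List[str]:
        """Networks touched by one hyperedge."""
        return sorted({node.network for node in self.hyperedges[label]})

    def edges_for(self, semantic_id: str) -> List[str]:
        """Hyperedges containing any node that maps to the given semantic ID."""
        return sorted(
            label
            for label, nodes in self.hyperedges.items()
            if any(node.semantic_id == semantic_id for node in nodes)
        )


graph = Hypergraph()
# The same piece of content published on two networks shares one semantic ID.
linkedin_post = Node("linkedin", "post-day-9", "content:day-9-post")
twitter_post = Node("twitter", "tweet-day-9", "content:day-9-post")
linkedin_tag = Node("linkedin", "profile-tagged", "person:tagged-reader")

graph.add_hyperedge("cross_post", [linkedin_post, twitter_post, linkedin_tag])
graph.add_hyperedge("linkedin_only", [linkedin_post, linkedin_tag])

print(graph.networks_spanned("cross_post"))
print(graph.edges_for("content:day-9-post"))
```

The shared `semantic_id` plays the role of the universal semantic layer: analytics can group nodes by meaning rather than by the network they came from.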
Post 9, July 2022
I came across a video of Drew Banin of dbt Labs presenting on the semantic layer at the apply() Conference 2022. Additionally, Tristan Handy has a VentureBeat interview about the semantic layer: https://lnkd.in/gXDde53v
At their August 4th Staging event, they will present more on dbt’s semantic layer work.
Here’s the semantic management UX (visualized). I wonder if dbt will enable this, or if I can use dbt to create this:
Semantic Studio View:
1. I can call a 3rd party data API endpoint and see what is returned in JSON.
2. I can then click on any entity in the JSON and add a taxonomical definition, add any ontological relationships to other entities, and use other data management features (query, API, schema management, etc.).
3. Every JSON entity has a unique URL / ID, a type, and a description.
The “Semantic Payload”:
1. Now when a data consumer gets the same ‘revenue_total’ data, an “@revenue_total” entry is communicated along with it, so **metadata is embedded with the data**, pointing to our taxonomy.
2. Once the original API has been taken through the semantic studio (step 1), all future API calls for this data should be made through the semantically managed API by any app in the system.
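As a rough illustration of the “semantic payload” idea, the Python sketch below wraps a raw API response so that each known field travels with an “@”-prefixed metadata entry. The taxonomy contents, the example URL, and the `to_semantic_payload` helper are hypothetical names made up for this sketch, not part of dbt or any existing tool.

```python
import json

# Hypothetical taxonomy entry; the ID, type, and description are illustrative assumptions.
TAXONOMY = {
    "revenue_total": {
        "id": "https://example.com/taxonomy/revenue_total",
        "type": "number",
        "description": "Total revenue for the reporting period",
    }
}


def to_semantic_payload(raw: dict, taxonomy: dict) -> dict:
    """Return a copy of raw with an "@<key>" metadata entry for every key found in the taxonomy."""
    payload = dict(raw)
    for key in raw:
        if key in taxonomy:
            payload[f"@{key}"] = dict(taxonomy[key])
    return payload


# Stand-in for what a 3rd party endpoint might return.
raw_response = {"revenue_total": 125000.0}
semantic_response = to_semantic_payload(raw_response, TAXONOMY)
print(json.dumps(semantic_response, indent=2))
```

A consumer that ignores the “@” keys still reads the same data, while a consumer that understands them can follow the ID back to the taxonomy.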
Post 8, July 2022
Post #8 on “data-as-a-product”.
The technology I’m most excited about related to semantics is the knowledge graph, and the user experiences we can create from the data products that knowledge graphs can power. In the future, every company will have its own ‘customizable and personalized Google’ that deeply understands employee problems and company objectives, and ML can be used to automatically observe actions and outcomes to make recommendations and augment decision-making.
I believe semantic data contracts using JSON are the fundamental key to unlock this capability across any system and any team.
Post 7, July 2022
Post #7 on “Visualizing A Formal Data-As-A-Product Framework”
I’m struggling with the many definitions of ‘data products’, and not sure if I want to use that phrase. If you look at Xavier Gumara Rigol’s article, a “data product” refers to: ‘DJ Patil, former United States Chief Data Scientist, defined a data product as “a product that facilitates an end goal through the use of data”’
If you look at Kevin Troyanos’s article on Harvard Business Review on a Product Mindset for Data: “Many organizations’ failure to adapt to a highly digitized business landscape stems from their inability to convince employees to embrace data-driven decision-making”
Barr Moses’ Monte Carlo: “Data as a Product (DaaP) is the simplest model to understand: the job of the data team is to provide the data that the company needs, for whatever purpose, be it making decisions, building personalized products, or detecting fraud.”, which is very much a data team perspective.
I’d like to propose my own ‘flavor’ (visualized): a ‘data product’ is data communicated using a semantic contract. In my visual, the ‘data products’ are the arrows of a shared communication standard between systems.
I’d like to hear your feedback, because I want to communicate what I’ve learned about what becomes possible when teams agree on shared definitions of concepts and shared standards (contracts) for communicating them, and about the pain points that come from not having semantic management. What tools do you recommend? If my definition of “data-as-a-product” doesn’t fit where the data science community is, what term do you recommend?
I’m going to tag some of the most influential data thought leaders on LinkedIn, to see if we can get their opinion in this thread. Will a luminary in the field of data shed wisdom on this topic? I don’t know, but to quote Steve Jobs: “Most people never pick up the phone and ask and that’s what separates the people who do things, and the people who just dream about them.”
Allie K. Miller Cassie Kozyrkov Srivatsan Srinivasan Vincent Granville Greg Coquillo Aishwarya Srinivasan 🚀 Abhishek Thakur Ajit Jaokar Vinay Srihari Thierry Cruanes Michel Tricot Maxime Beauchemin Suresh Srinivas Sriharsha Chintalapani Andrew Ng Tristan Handy Margaret Francis Michelle Yi
Post 6, July 2022
Day 6 of “Data-as-a-Product” visualization:
Reference links below in comments.
I decided to visualize what Paroma Varma was describing as the difference between model-centric and data-centric AI, using some of Andrew Ng’s presentation material on the topic.
Alexander Ratner, co-founder of Snorkel, says on the page: “In a data-centric approach, you spend time efficiently labeling, managing, slicing, augmenting, and curating the data, with the model itself remaining relatively more fixed.”
Andrew Ng brings up the paradigm of having less than 10,000 examples, and that 90% of papers are on models being improved, versus innovation on data & ML-ops.
While I 100% agree with Andrew, Alex, and Paroma, the data-centric approach is focused on the data scientist’s perspective, with a focus on programmatic labelling, etc. Having never used their tool, I’m also 100% sure their AI development platform brings incredible value given their history with Andrew & top-tier companies that are their clients.
I recently came across Barr Moses’s post on data contracts, and what resonated with me in relation to ‘data-as-a-product’ and AI was that she highlighted JSON as the interchange format. The key, she wrote, was two words: data contracts.
JSON is the universal language for data scientists, software engineers, and data engineers. It is machine and human readable, and data contracts communicated in JSON are the optimal ‘lego blocks’: https://lnkd.in/gsKn6VFG
I imagine that web apps of the future will be designed so that the high-quality data labels that are optimized for ML Ops systems are auto-generated with every single user interaction. The challenge we have is that the data science community is often siloed from product managers and designers.
Integrating semantic management into our app designs, and removing the silos of current app development paradigms, will create the ‘knowledge layer’ which Chad Sanderson has so eloquently written about. This will enable real-time analytics and can enable automation of recommendations & insights vs. data scientists collecting and cleaning data sets in an asynchronous manner.
The biggest challenge I see (based on people’s comments) is that problems with data quality and the lack of a universal semantic layer are ‘the norm’, and organizations don’t necessarily understand these concepts, let alone make the investments necessary to operationalize them.
A Data-As-A-Product framework is about treating our data with the same care as we do commercial products. That means we think about how data is communicated, consumed, and produced across our organization, including the organizational change management that it requires. I certainly don’t have the answers, but I am loving learning about this paradigm and appreciate the dialog by the LinkedIn community on this topic.
Post 5, July 22
Day 5 of visualizing ‘data-as-a-product’ concepts:
Semantic design & knowledge is something I’m very passionate about, and I’m “writing to learn”, inspired by Nir Eyal’s post (url in the comments). Your comments help me understand where people’s pain points are for adoption, and so far organizational change management seems to be the top.
Today’s post is about how we can bring product managers and designers into the fold for a ‘bar code system for data’ in our data products. I believe that showing how all stakeholders can adopt a semantic management strategy can act as a catalyst for a digital transformation into a ‘semantic organization’ (I forgot who wrote that comment in one of the posts, thank you!).
If a data engineer has a SQL schema for a table, and a front-end engineer has a JSON object communicated via an API, my previous post (in comments) described how to embed the metadata in the data itself (the bar code). However, business domain owners can also be responsible for owning semantics! Product Managers are typically responsible as the bridge between technical, design, and business domains. Is this a correct description Keith Klundt? Keith literally wrote the book on Product Mindset at Slalom, and I’m grateful for all I learned working with him at Slalom on strategies for organizational leaders to design effective product management teams.
I think a transformative opportunity exists for organizations to adopt a ‘product mindset’ towards their own data from a semantic (concept management) perspective. A product manager may write a story: “As a user, I can enter my first name”, and they can link it to a shared knowledge document so that the designer, data engineer, software engineer, and business stakeholders all have a reference for what first_name means. For example, does first_name mean the legal first name, the preferred first name, or something else? Is there a context to first name, for example as private HR information or public speaking information? This relates to domain-driven design; I recommend Vlad Khononov’s book, and following Cassie Kozyrkov, who talks about conceptual representations (urls in comments): “Operationalization is the creation of measurable proxies for rigorously investigating fuzzy concepts.”
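One way to picture the shared knowledge document is as a small glossary keyed by term and context. The Python sketch below is only an illustration under my own assumptions: the `Definition` class, the semantic IDs, and the wording of each meaning are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Definition:
    semantic_id: str  # unique ID, so the same word in two contexts stays distinct
    term: str
    context: str
    meaning: str


# Hypothetical glossary entries for the first_name example.
GLOSSARY = [
    Definition("hr.first_name.legal", "first_name",
               "private HR information", "Legal first name"),
    Definition("public.first_name.preferred", "first_name",
               "public speaking information", "Preferred first name"),
]


def lookup(term: str, context: str) -> Definition:
    """Return the single definition for a term in a context, or fail loudly."""
    matches = [d for d in GLOSSARY if d.term == term and d.context == context]
    if len(matches) != 1:
        raise LookupError(f"Expected one definition for {term!r} in {context!r}, found {len(matches)}")
    return matches[0]


print(lookup("first_name", "private HR information").meaning)
```

Failing loudly when a term has no agreed definition is a deliberate choice here: it surfaces the ambiguity when the story is written instead of after the data has been collected.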
My perspective is that we need a ‘Rosetta Stone’ to get product management, software engineering, experience design, data engineering, data scientists, and business domain owners all on the same page from a semantic ground truth. Sometime next week I will post about how we extend this paradigm to API & query design, and how we can embed semantics directly in our web apps, like Google does with Schema.org
I appreciate your comments, likes, and follows, it is a good feedback loop which lets me know which posts are resonating with people!
Post 4, July 2022
Day 4 of visualizing ‘data-as-a-product’ concepts.
I’d like to highlight the excellent post that Mike Renwick makes (url in comments) where he describes an opportunity to improve the data experience of 3rd party data, and increase the value of one’s data:
“Mike’s plea to data vendors: Please give us data about our data, as data, in a commonly agreed basic format.
We spend a lot on 3rd party data, and yet, amazingly, most of those providers do not provide us with machine readable metadata.
Critically, all of this should be delivered WITH the data, AS data.”
The visualization below shows a proposed format (code below) to address this:
SQL tables are generated like this:
CREATE TABLE employees (
employee_id SERIAL PRIMARY KEY,
first_name CHARACTER VARYING (20)
);
They contain a schema for the columns, but not a taxonomical description or hierarchical relationship that we would want in a semantic management / knowledge layer.
GraphQL can have types in JSON, but they don’t have semantic metadata embedded as part of their schema, and GraphQL schema naming conventions may not map 1:1 with a SQL database schema.
I’d like to propose to those interested in the topic a simple standard way to do so, a type of “Universal Embedded Metadata” specification. Example code below:
“type”:”POSTGRESQL v14.4 DATABASE OBJECT”,
“description”:”The global HR data object used by all systems in the organization”,
“type”: “POSTGRESQL v14.4 TABLE OBJECT”,
“description”: “The global list of all employees, for local employee information view semantic id:3411”,
“type”: “POSTGRESQL v14.4 COLUMN — SERIAL PRIMARY KEY”,
“description”: “The unique ID for the global list, for ids used by the Salesforce system, view semantic id:4982389”
The JSON code has embedded metadata which maps 1:1 with a SQL database, and can be consumed entirely in both human and machine readable form directly in the data communicated. The “@” symbol is a reference pointer to the property for which the metadata is being defined. Any entity is REQUIRED in this paradigm to have a 1) SEMANTIC ID, 2) TYPE, and 3) DESCRIPTION. This way, if the same word is used in two different hierarchical properties, there is a unique ID showing they are different.
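To show how the three required fields could be checked automatically, here is a minimal Python sketch of a validator. The nesting of database, table, and column, the key names, and the numeric IDs are my assumptions for illustration; only the type strings and the "@" convention come from the example above.

```python
from typing import List

REQUIRED_FIELDS = ("id", "type", "description")


def validate_metadata(node: dict, path: str = "") -> List[str]:
    """List every "@" entry missing a required field and every data key without an "@" entry."""
    errors = []
    for key, value in node.items():
        if key.startswith("@"):
            for name in REQUIRED_FIELDS:
                if name not in value:
                    errors.append(f"{path}/{key}: missing {name}")
        else:
            if f"@{key}" not in node:
                errors.append(f"{path}/{key}: no embedded metadata")
            if isinstance(value, dict):
                errors.extend(validate_metadata(value, f"{path}/{key}"))
    return errors


# Hypothetical document: IDs and key names are illustrative assumptions.
hr_document = {
    "@hr": {"id": 1001, "type": "POSTGRESQL v14.4 DATABASE OBJECT",
            "description": "The global HR data object used by all systems in the organization"},
    "hr": {
        "@employees": {"id": 1002, "type": "POSTGRESQL v14.4 TABLE OBJECT",
                       "description": "The global list of all employees"},
        "employees": {
            "@employee_id": {"id": 1003, "type": "POSTGRESQL v14.4 COLUMN - SERIAL PRIMARY KEY",
                             "description": "The unique ID for the global list"},
            "employee_id": 1,
        },
    },
}

incomplete = {"@first_name": {"id": 1004, "type": "POSTGRESQL v14.4 COLUMN"}, "first_name": "Ada"}

print(validate_metadata(hr_document))
print(validate_metadata(incomplete))
```

A check like this could run wherever the data is produced, so a payload without its semantic ID, type, and description never reaches a consumer.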
If this is something you’d like to participate in, in an open-source manner, please send me a message or a connection request.
Post 3, July 2022
Day 3 of visualizing data-as-a-product digital transformation concepts:
I recently came across the Harvard Business Review article “A Better Way to Put Your Data to Work”, which came to my attention from Bruno Aziza’s post (urls in the comments). Quotes from the article with my perspective and how it connects to the visualization below:
“Like a Lego brick, a data product wired to support one or more consumption archetypes can be quickly snapped into any number of business applications.”
The visual you see describes a “lego block” being constructed out of a ‘semantic contract’. My simple definition of semantics is the process of defining operational business concepts to reduce ambiguity. The benefits are higher efficiencies, better planning, and reduced mistakes. A more technical definition (I personally use) is that semantics define concepts in a taxonomy, with a unique ID, and within a context (hierarchy) in order to create a contract that data stores, data producers, and data consumers will all agree to use.
“Most companies struggle to capture the enormous potential of their data.”
APIs are how we communicate data in a way that guarantees the use of a semantic contract between data stores and business apps. This is where ‘digital transformation’ comes in, which requires organizational change management. Often, misalignment between data engineers, data scientists, designers, product managers, software engineers, middle management, and upper management emerges and creates conflict during this transformation process. Leadership must provide a robust way to manage the challenges that come up in order to set up digital transformation for success. APIs describe technological communication, and they are also a great way to describe how teams communicate. This concept was introduced to me in the book Team Topologies by Matthew Skelton and Manuel Pais (thank you Sean Goodpasture for the recommendation).
“In our work we’ve seen that companies that treat data like a product can reduce the time it takes to implement it in new use cases by as much as 90%, decrease their total ownership (technology, development, and maintenance) costs by up to 30%, and reduce their risk and data governance burden.”
Once the “data-as-a-product” paradigm is adopted through a digital transformation process, we are ready for our data to create value. One measure of this is that we want a single ‘lego block’ to be usable by many applications, and those applications in turn (also using the semantic contract) create more data (lego blocks in different colors) which can be used for more apps and more insights. A virtuous cycle (thanks Kelly Taylor for teaching me that phrase) for data producers and consumers.
Ultimately we want to drive the maximization of value from our data and the investments we make into our data products.
Content created with Stephanie Bankes’ thought partnership, thanks Stephanie! :)
Post 2, July 2022
Day 2 of visualizing concepts on the topic of converging product, design, and data systems thinking.
I too often hear from data scientists that they have to dig through many layers to find a data file from a data swamp, which also lacks any governance or semantic metadata, making their job of cleaning, understanding, and using data to create value incredibly inefficient.
I recently watched Zhamak Dehghani’s fantastic presentation on the Data Mesh paradigm that is taking the data industry by storm. The presentation is sponsored by the Stanford University School of Medicine and recently shared by Amir Bahmani, Director of Stanford’s Deep Data Research Computing Center and Lecturer at Stanford University. https://lnkd.in/gxhavQ8i
Zhamak has created a common language for us to discuss / ideate / iterate on common pain points and inefficiencies we see, as well as a compelling roadmap with innovation strategies for organizational design as well as data architecture (Conway’s law: “Organizations, who design systems, are constrained to produce designs which are copies of the communication structures of these organizations”).
Instead of throwing data into a lake or warehouse, which is destined to become a data swamp as ever-increasing volume, velocity, and complexity of data is added, what if we were to start by thinking of our data as a product, and design data experiences that maximize value of data in a scalable way? This visual is my interpretation of the first page of Chapter 13 from Zhamak’s book: “Data Mesh: Delivering Data-Driven Value At Scale”.
Effectively, we want our data systems to be designed so that they maximize value, and this is an excellent place where I think data, design, and product thinking converge. The user experience for the data scientist (or any data consumer) should have everything they need: data is discoverable, understandable, trustworthy, and explorable. The savings in time have massive ROI for an organization: employee time savings, decision-making quality, the ability to implement real-time decision analytics, and the ability to enable computational knowledge.
In a way, I think the concepts in Zhamak’s book have become the cultural schema and taxonomy for us to continue to innovate and evolve how to design intelligent human and machine learning systems, the ultimate competitive advantage for any business.
#businessanalytics #datadriven #datamanagement #businessintelligence #datascientists #bestpractices #datawarehouse #datastrategy #cdo #augmentedanalytics #VC #Cloud #SelfService #Governance #artificialintelligence #machinelearning #bigdata #analysts #datalakes #datacatalog #migration #serverless #modernization #ml #insights #strategy #datalake #scale #algorithm #value #datascience #datamesh
Post 1, July 2022
I have decided to start illustrating concepts on the topics of “Data-as-a-Product”, “Computational Knowledge”, and “Data Experience Design”. A fun exercise that will have me engaged in continual learning of the art of visual storytelling for complex and abstract concepts around human + machine learning systems.