Data Dictionary?

One of the biggest needs for enabling self-service BI is documentation of the available data models, especially for novice end users. I’ve been documenting all our fields, but I cannot find a way to export them to some form of data dictionary. Is hovering over a field the only way users can view data definitions?

I faced exactly this problem yesterday when trying to self-service some stats data for a product whose data models I’m not familiar with.

I think currently there are 2 problems:

1- The unit of interaction that an end user (not a data person) sees, at least when looking for stuff, is the dataset, not the data model.

Working with datasets makes it hard for end users to build a high-level mental model of the data structures (what are the models, their fields, their relationships, etc.?) so that they can self-serve effectively.

2- (your point above) There isn’t an interface to list all/relevant models + descriptions, except for hovering over the model.

My current workaround is to use dbdocs.io (also made by Holistics) to create a data dictionary of the tables and share it with people/consume myself. But that’s quite troublesome (have to sync between dbdocs and Holistics). I’d imagine Holistics can consider baking something like dbdocs into the self-service interface itself.

I’d love to see a data dictionary added here. That’s one major feature Looker has that Holistics doesn’t, a real table-stakes item. That being said, I’m a little shocked I haven’t seen dbdocs.io before on Holistics’ site. If it’s a viable workaround, I’m happy to put in a little extra effort until they develop a solution. Much thanks for that tip!!!

I’m gonna kick myself if I don’t ask… any other Holistics products out there besides dbdocs.io?

Haha, there’s Holistics which is a BI tool. And then there’s dbdiagram.io, dbdocs.io and dbml.org, which is an ecosystem of database development tools.

That’s about it for now, but who knows what comes in the future ;)

Hi @sm_mk ,

Could you help clarify a few things to help our Product team understand the need better?

  • What problem do you want to solve with the data dictionary? How do you imagine novice users leveraging the data dictionary in their daily operations?
  • What information would you need to document in the data dictionary?

Sure, we’re trying to establish an environment of data democratization and self-service. A dictionary will help users understand what the fields mean and how they relate to other models/tables. Beyond that, it creates a single central definition for all consumers of the data, so there are no competing definitions of terms or misunderstandings about what relates to what. Dataedo, for example, is a great product for this: it scans database schemas, lets you add further documentation to any item, and then publishes the dictionary to PDF or a website, among other formats. We use the website option. Novice end users open the site, search for the term they’re looking for, and get back all instances to explore. A person’s ID may be named one way in one area and completely differently in another, but hopefully the definitions will match and be caught by the user’s search terms.

We also have technical users who aren’t familiar with databases outside their own departments using this to understand how other structures are set up and related. For the Holistics piece, I’d like something similar, where novice users can see all the fields along with their definitions and not have to hover over every single one. In my arena that’s pretty much SOP for the vendors we work with. dbdocs.io seems to be going in that general direction, but getting definitions into or out of Holistics doesn’t seem possible yet (or maybe I just missed it).

I’d also like to be able to programmatically get these definitions into and out of Holistics, so that if your tool doesn’t work exactly as we need, I can develop a workaround to keep documents up to date without double effort. Right now I have to manually copy/paste definitions from our data dictionary into Holistics and remember to update them whenever they change, which doubles the effort and opens up more opportunity for errors.

As for the information I’d like to see: a simple overview, model name, field name, field definition… maybe field type, although that might be too much for a novice user. Anyway, that’s kind of where I’m looking to go. Happy to follow up if needed.
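To make the request above a bit more concrete, here is a rough Python sketch of the kind of dictionary being described: one entry per field (model name, field name, definition, optional type), a search that matches on definitions as well as names, and a CSV export. Everything in it is an assumption for illustration. The `FieldEntry` structure, the sample models and fields, and the function names are hypothetical, and none of this is a Holistics or dbdocs API.

```python
import csv
import io
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class FieldEntry:
    """One row of a (hypothetical) data dictionary."""
    model: str
    field: str
    definition: str
    field_type: str = ""


def search(entries, term):
    """Return entries whose field name or definition contains the term (case-insensitive)."""
    needle = term.lower()
    return [
        e for e in entries
        if needle in e.field.lower() or needle in e.definition.lower()
    ]


def to_csv(entries):
    """Serialize the dictionary to CSV text so it can be shared or diffed."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf,
        fieldnames=["model", "field", "definition", "field_type"],
        lineterminator="\n",
    )
    writer.writeheader()
    for e in entries:
        writer.writerow(asdict(e))
    return buf.getvalue()


# Made-up sample entries: the same person ID is named differently in two models.
ENTRIES = [
    FieldEntry("orders", "customer_id", "Unique ID of the person who placed the order", "number"),
    FieldEntry("support_tickets", "requester_ref", "Unique ID of the person who opened the ticket", "number"),
    FieldEntry("orders", "total_amount", "Order value in USD, including tax", "number"),
]

if __name__ == "__main__":
    for hit in search(ENTRIES, "person"):
        print(f"{hit.model}.{hit.field}: {hit.definition}")
    print(to_csv(ENTRIES))
```

Searching for "person" here finds both `customer_id` and `requester_ref` even though the field names differ, because the match also runs against the definitions. That is the behavior described above for differently named IDs.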

Hey Mike, thanks for sharing in detail! I understand the need for a data dictionary for both novice and technical users.

I just had a discussion yesterday with @huy , and we believe that a data dictionary also helps with exploratory analysis, when you don’t have any specific question in mind. Putting the available data in front of users lets them know what’s there, so they can mix and match it to explore the hidden corners of their business.

I will continue discussing this idea with my internal team and keep this thread updated with new ideas and plans :wink:
