Key Skills
These are the skills you'll need to put together a Problem Definition.
General
YAML
At present, Problem Definitions are defined using YAML. We'll be introducing a graphical way of configuring these in a future update.
Querying
Gather Phase
When performing queries against a data source - e.g. SQL Server, MongoDB, Elasticsearch - you'll use the query language that's native to that platform.
For example, when using SQL Server, you'd use T-SQL. For MySQL, you'd use MySQL's own SQL dialect. With MongoDB, it'd be MQL.
Analyze Phase
Analyze phase queries are performed using DuckDB. This allows you to query Parquet files using SQL as if they were tables.
DuckDB is Copyright 2018-2023 Stichting DuckDB Foundation.
For each Parquet file used as an input to the Analyze phase, a kind of virtual table is used. For example, if you have a file named Foo.parquet, then you would have a table named Foo. These virtual tables can then be queried using the DuckDB flavor of SQL.
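As a rough illustration of that naming convention, the sketch below derives a table name from a Parquet file name by dropping the extension, then builds a query against it. The helper `table_name_for` and the example paths are hypothetical, and the product's actual rules (for instance around case or special characters) may differ.

```python
from pathlib import PurePath


def table_name_for(parquet_path: str) -> str:
    """Return the table name assumed for a Parquet file: its name minus the extension."""
    path = PurePath(parquet_path)
    if path.suffix.lower() != ".parquet":
        raise ValueError(f"not a Parquet file: {parquet_path}")
    return path.stem


# Hypothetical input files for an Analyze phase.
inputs = ["Foo.parquet", "gather/Orders.parquet"]
tables = [table_name_for(p) for p in inputs]

# A query you might then write against the first virtual table.
query = f'SELECT * FROM "{tables[0]}" LIMIT 10'
print(tables)
print(query)
```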
Restructuring JSON data
The end result that DataBug needs to create cases is a Parquet file with "flat" data, as opposed to hierarchical data like JSON.
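To make the distinction concrete, here is a minimal sketch of turning one hierarchical record into flat rows. The record shape, the field names, and the `flatten` helper are all assumptions made up for illustration; this is not how DataBug performs the transformation, only a picture of the before and after.

```python
import json

# Hypothetical hierarchical record: one customer with nested contracts.
record = {
    "customer": {"id": 42, "name": "Acme"},
    "contracts": [
        {"ref": "C-1", "expires": "2024-01-31"},
        {"ref": "C-2", "expires": "2024-06-30"},
    ],
}


def flatten(rec: dict) -> list[dict]:
    """Produce one flat row per nested contract, repeating the parent fields."""
    return [
        {
            "CustomerId": rec["customer"]["id"],
            "CustomerName": rec["customer"]["name"],
            "ContractRef": contract["ref"],
            "ContractExpires": contract["expires"],
        }
        for contract in rec["contracts"]
    ]


rows = flatten(record)
print(json.dumps(rows, indent=2))
```

Each resulting row contains only scalar values, which is the kind of shape that maps onto the columns of a table.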
When querying JSON-based data sources, you may be able to flatten your data using the query itself. For example, Cosmos DB's SQL syntax allows you to JOIN between hierarchical layers of a record and SELECT fields from multiple levels.
However, in some cases - especially JSON files or APIs - you can't control the structure of the data you get back from the data source. In these situations, we offer two ways of transforming the data you receive prior to a flat Parquet file being written.
JSONata
JSONata is a lightweight querying and transformation language for JSON data. It was inspired by the location path semantics of XPath.
For any given JSON structure, you can transform it into another shape. The JSONata Exerciser gives a good example of what can be achieved and you can use it as a playground for experimentation.
JSONata has a rich language for querying JSON.
Handlebars
Handlebars is a {{ mustache-based }} templating language. Using Handlebars to transform data gives you complete control over every character of the output JSON you want to create.
Handlebars is Copyright (C) 2011-2019 by Yehuda Katz
Because you control every character of the output, you are responsible for ensuring that it is syntactically correct JSON. In most cases, JSONata is going to be a far easier way to go.
The Handlebars Expressions documentation details how to put placeholders from a JSON object into output text. The Playground also lets you experiment with templates.
Creating Cases
Cases are created using Markdown and Handlebars.
Markdown
Markdown is a simple and easy-to-use markup language you can use to format virtually any document.
Its rich syntax allows you to create multiple levels of headings, style fonts, create tables or checklists, etc.
Use Markdown to present the narrative of a case to your end users. A good structure to follow is:
What's happened;
Why it's important / What the impact is;
How to resolve it, step-by-step.
However, you're free to write whatever works best for your team!
Markdown allows you to control the format and overall wording of your case, but to include data from Parquet files, you'll need to use Handlebars.
Handlebars
Handlebars is a {{ mustache-based }} templating language.
Handlebars is Copyright (C) 2011-2019 by Yehuda Katz
The Handlebars Expressions documentation details how to put placeholders from a JSON object into your Markdown content. The Playground also lets you experiment with templates.
Each row from a Parquet file is passed to your template as a JSON object. Handlebars can then be used to "mail merge" data into your narrative, for example:
Customer {{ CustomerName }}'s contract is close to expiry.
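To show the idea behind the merge, the sketch below substitutes a row's values into the template above. It is a toy stand-in written with a regular expression, not the Handlebars engine: it handles only simple `{{ Name }}` placeholders, with no helpers, blocks, nested paths, or escaping. The `render` function and the sample row are hypothetical.

```python
import re

# Matches simple placeholders such as {{ CustomerName }} (identifier only).
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, row: dict) -> str:
    """Replace each simple placeholder with the matching value from the row."""

    def substitute(match: re.Match) -> str:
        # Fields missing from the row become an empty string.
        return str(row.get(match.group(1), ""))

    return PLACEHOLDER.sub(substitute, template)


template = "Customer {{ CustomerName }}'s contract is close to expiry."
row = {"CustomerName": "Acme Ltd"}  # hypothetical row from a Parquet file

rendered = render(template, row)
print(rendered)  # Customer Acme Ltd's contract is close to expiry.
```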
On top of the "out-of-the-box" Handlebars expressions, we have also introduced several of our own to make writing cases easier:
One
Two
Three