\r\n\r\nWhile the above statistics are only estimates, it is telling that even after decades of data-driven initiatives, much of the technical workforce still cannot access data in a meaningful way. They continue to rely on other people and processes to bring their ideas to fruition.\r\n\r\n \r\n\x3Ch2>The \"self-serve\" that was promised\x3C/h2>\r\nStepping back for a moment, “self-serve” business intelligence (BI) was indeed the promise made to business users in the first quarter of this century. Unlike the 90s, when in-house experts delivered reports, the idea in the early 2000s was to decentralize decision-making through greater collaboration and collective knowledge, an approach often referred to as business intelligence 2.0. Numerous BI tools, including Microsoft Power BI, Salesforce Tableau, Google Looker and Data Studio, IBM Cognos, Amazon QuickSight, ThoughtSpot, SAP BusinessObjects, Mode Analytics, HEX Technologies, JetBrains Datalore, Elastic Kibana, Qlik Sense, MicroStrategy, Apache Superset, and even newer ones such as Databricks, Snowflake, Sigma Computing, and Omni Analytics, set out to replace SQL with dashboards.\r\n\r\nUnfortunately, dashboarding tools didn’t change things much for business users:\r\n\x3Col>\r\n \t\x3Cli>Creating dashboards requires pulling data, setting up data models, crafting visualizations, and preparing them for business users. All of this requires expertise in both the data and the BI tool, which is typically scarce and keeps business users blocked.\x3C/li>\r\n \t\x3Cli>The process of building dashboards is time-consuming, taking anywhere from days to weeks. It is also iterative, requiring business users to communicate their requirements and ensure they are translated correctly.\x3C/li>\r\n \t\x3Cli>Dashboarding is still limited to presenting data and charts, even though the ultimate goal is to make decisions. 
As a result, business users may need to further process the results before they can use them.\x3C/li>\r\n \t\x3Cli>Finally, organizations typically end up with far more dashboards than they need, while still creating new ones all the time. This leads to a situation where finding the right dashboard is difficult, and even though so much time was spent building them, many remain unused.\x3C/li>\r\n\x3C/ol>\r\nIn short, with BI 2.0, dashboards have replaced SQL not just as a new interface but also as a new pain point, while “self-serve” remains an illusion even after decades of tooling.\r\n\r\n \r\n\x3Ch2>Text-to-SQL is a hammer in search of a nail\x3C/h2>\r\nThe new wave (or rather, rage) of AI tools has reignited the debate about whether SQL will become redundant. With large language models (LLMs) becoming really good at processing natural language, the question is whether business users can finally express their needs directly to the LLMs. Ironically, answering questions on databases using LLMs has turned into a SQL-writing problem; rather than SQL being eliminated from the picture, users are now inundated with auto-generated SQL queries. This is not helpful, since people don’t really want SQL that maps to tabular results; they want answers. Interpreting the tabular data could lead to answers, but that requires the SQL to be correct in the first place, and someone needs to check that.\r\n\r\nLLMs are trained on vast amounts of public data and can perform a lot of reasoning tasks, yet they are not grounded in enterprise data. To overcome this, most new AI tools put the onus of verifying SQL queries on the users. This brings us back to square one, where knowing SQL is still required to solve the problem of SQL being hard to know. So the question remains: will we ever reach a point where people no longer need to know SQL? 
Interestingly, this seems difficult with pre-trained models, especially as \x3Ca href=\"https://www.youtube.com/watch?v=1yvBqasHLZs\">pre-training is hitting the wall\x3C/a>. Coincidentally, text-to-SQL accuracy has also plateaued:\r\n\r\n\x3Cimg class=\"alignnone wp-image-1248\" src=\"https://blog.tursio.ai/wp-content/uploads/2025/10/accuracy-300x181.jpg\" alt=\"\" width=\"441\" height=\"266\" />\r\n\r\nSo while on the one hand business users are asking to convert their text into answers and not SQL, on the other hand even the accuracy improvements of text-to-SQL have become incremental.\r\n\r\n \r\n\x3Ch2>Towards higher-level query engines\x3C/h2>\r\nMost people do not want to deal with SQL. They care about their questions and want to use data to answer them. Forcing them to inspect and verify the generated SQL is counterproductive, much like asking people to inspect assembly code. Instead, modern users want to be assured of correctness by the new-age AI that is expected to connect data with its consumers. The challenge is to control LLMs and ensure they operate within the boundaries of the enterprise data.\r\n\r\nWe need higher-level query engines that can constrain AI to operate strictly within the realm of the underlying data. This requires reimagining the entire stack, from data modeling and semantic understanding all the way to query compilation, optimization, execution, and result processing. 
To illustrate, consider the following question on the TPC-H dataset:\r\n\r\nExample: “\x3Cem>What are the top Chinese suppliers offering the most discounts in 2021?\x3C/em>”\r\n\r\nThis question has several parts to it, including:\r\n\x3Cul>\r\n \t\x3Cli>\x3Cem>Chinese suppliers\x3C/em>\x3C/li>\r\n \t\x3Cli>\x3Cem>Most discounts\x3C/em>\x3C/li>\r\n \t\x3Cli>\x3Cem>In 2021\x3C/em>\x3C/li>\r\n \t\x3Cli>\x3Cem>Top\x3C/em>\x3C/li>\r\n\x3C/ul>\r\nA lot of things need to be correct when answering the above question, including the joins (based on the join graph), aggregates (count vs count distinct, average vs sum), comparisons (equality vs like), filtering (dates vs strings), syntax (limit, etc.), augmenting with additional information, visualizations, aspects, and so on. Tursio handles all of these nuances every single time, as illustrated in the demo below:\r\n\r\n[video width=\"3822\" height=\"2168\" mp4=\"https://blog.tursio.ai/wp-content/uploads/2025/10/Research-Mode.mp4\"][/video]\r\n\r\n \r\n\r\n\x3Cstrong>So, what’s the secret sauce? \x3C/strong>\r\n\x3Col>\r\n \t\x3Cli>The key is reducing the ambiguity that general-purpose LLMs suffer from. Tursio does this systematically by first inferring a semantic model and then identifying which portions of the semantic graph to constrain the query to. This is a completely new way of processing queries, very different from traditional SQL query processors, that combines determinism with creativity.\x3C/li>\r\n \t\x3Cli>In addition to constraining, Tursio also guides the LLMs via relevant context. It auto-generates a large corpus of valid questions to cover the space of all queries uniformly and then identifies the relevant context for the incoming question. The goal is to automatically generate synthetic context over large databases and reduce the ambiguity by providing only the relevant portion.\x3C/li>\r\n \t\x3Cli>Users today expect that they should be able to ask anything in natural language. 
However, long-winded questions can confuse the LLMs and make the responses flaky. Tursio reduces the noise by tokenizing questions into fragments for better interpretability, i.e., instead of trying to interpret the question as a whole, it first identifies the various components of the question (as in the example above).\x3C/li>\r\n \t\x3Cli>Finally, SQL query generation using LLMs can be unstable or even incorrect. Tursio avoids these issues by building query plans systematically, step by step. It first constructs the operator trees and then rewrites them iteratively, using techniques from query processing, to make them match the user question as closely as possible.\x3C/li>\r\n\x3C/ol>\r\n\x3Cp data-start=\"87\" data-end=\"466\">The next generation of data analytics will be AI-powered -- simpler, faster, and more efficient. Tursio is a practical step in that direction, with a reimagined query processing stack that is already deployed in several customer environments. But this is just the beginning; fusing enterprise data with AI is an exciting new world that is yet to be explored fully.\x3C/p>",can_share_on_x:!0,can_share_on_facebook:!0,can_share_on_linkedin:!0,blog_image:"https://blog.tursio.ai/wp-content/uploads/2025/10/Blog-Graphics.png",author:"Alekh Jindal",author_image:void 0,published_date:"October 3, 2025",tags:"Generative AI",description:"Accessing enterprise data shouldn’t be hard, but for billions of professionals, SQL remains a barrier. Despite decades of BI tools and dashboards promising “self-serve” analytics, business users are still blocked by complexity, slow workflows, and reliance on data experts. Text-to-SQL AI seemed like a solution, but auto-generated queries still require verification, leaving the SQL wall intact. Tursio reimagines data analytics by connecting natural language questions directly to enterprise data while ensuring correctness. 
By inferring semantic models, constraining AI queries to relevant data, and systematically building query plans, Tursio delivers accurate, interpretable, and actionable answers, making AI-powered analytics truly accessible."}},$R[11]={id:1171,slug:"why-ai-fails-despite-great-models",acf:$R[12]={title:"Why AI fails despite \"great\" models? ",content:"We’ve all seen the AI demos: someone types in a question, and an instant answer appears. The audience feels the wow and the presenter steals the show. Everyone leaves convinced that this is the next big thing. But in real life? That’s not how it works.\r\n\r\nThe gap between \x3Cstrong>demo magic\x3C/strong> and \x3Cstrong>real use\x3C/strong> is where most enterprise AI fails. In fact, a recent report from MIT’s NANDA project reveals that \x3Ca href=\"https://fortune.com/2025/08/18/mit-report-95-percent-generative-ai-pilots-at-companies-failing-cfo/\">95% of the generative AI pilots are failing\x3C/a>. This is intriguing, especially when the latest GPT models are now touted to possess \x3Ca href=\"https://openai.com/index/introducing-gpt-5/\">PhD-level intelligence\x3C/a>.\r\n\r\nConsidering Google Trends as a proxy, the graphic below shows the interest in \"AI in production\" over the last year (green curve). We can see an astronomical rise in this topic in recent months. Coincidentally, this is matched, nearly pixel by pixel, by interest in \"AI accuracy\" (blue curve). Is it a mere coincidence that people are suddenly interested in accuracy just as they are interested in getting AI to production?\r\n\r\n\x3Cimg class=\"alignnone wp-image-884\" src=\"https://blog.tursio.ai/wp-content/uploads/2025/08/AITrends-300x113.jpg\" alt=\"\" width=\"751\" height=\"283\" />\r\n\r\nTurns out that the best of the models are only as good as the prompts they get. 
In fact, \x3Ca href=\"https://mitsloan.mit.edu/ideas-made-to-matter/study-generative-ai-results-depend-user-prompts-much-models\">studies\x3C/a> suggest that \x3Cem>only half the gains seen with a more advanced AI model come from the model itself, while the other half come from how users adapt their prompts\x3C/em>. No wonder GPT-5, like its predecessors, comes with an elaborate \x3Ca href=\"https://cookbook.openai.com/examples/gpt-5/gpt-5_prompting_guide\">prompting guide\x3C/a> instructing users how to prompt the model in different scenarios.\r\n\r\nHere is the surprising truth for people disappointed with the answers to their questions: \x3Cstrong>it's not the model's fault; it’s the question’s. \x3C/strong>The ability to ask the right question, with the right context and in the right format, is the single most overlooked skill in AI today, also referred to as prompt engineering. Unfortunately, most people don’t know how to prompt. And most tools aren’t built to help them.\r\n\x3Ch2>\x3Cstrong>The current state: Prompting is a tedious art\x3C/strong>\x3C/h2>\r\nToday’s LLM tools expect users to behave like prompt engineers. The common guidance is to \"just ask it like you would a human\". But that’s easier said than done. The reality is that \x3Cstrong>how you prompt matters immensely\x3C/strong>, and most AI tools do very little to help you prompt better.\r\n\r\nTo illustrate, here is the \x3Ca href=\"https://docs.anthropic.com/en/release-notes/system-prompts#august-5-2025\">system prompt\x3C/a> that powers the magic behind Claude, as released by Anthropic on August 5, 2025. This prompt is 2,560 words long and spans 6 pages, with detailed instructions about various types of questions that users may ask. 
The screenshot below shows just a snippet of this massive prompt:\r\n\r\n\x3Cimg class=\"alignnone wp-image-870\" src=\"https://blog.tursio.ai/wp-content/uploads/2025/08/system-prompt-300x239.png\" alt=\"\" width=\"568\" height=\"453\" />\r\n\r\nBuilding such sophisticated prompts is near-impossible for regular users. While people can often start with simpler prompts, depending on their application, covering various corner cases makes them increasingly complex over time. In enterprise settings, where users are querying structured data, the stakes are higher, and the complexity is even deeper. Since you are not chatting about movie ideas but rather making decisions based on what’s inside complex enterprise databases, the ability to get what’s needed quickly and accurately is paramount. Prompting over such data requires a good understanding of the structure, the underlying semantics, and the business context, as well as the skill to craft all of that into the right prompts for the AI: clearly a tall order for most users.\r\n\r\n\x3Cstrong>So, here is the typical user behavior today: \x3C/strong>Users either freeze at the prompt box or spiral into 10+ turns of trial-and-error chats, only to give up with mediocre output. When prompts are vague or misaligned with the data, you get hallucinations, non-answers, or worse: confident lies. That’s a fast track to lost trust.\r\n\x3Ch2>\x3Cstrong>The problem: Garbage in, garbage out\x3C/strong>\x3C/h2>\r\nThe bottleneck in enterprise AI isn't the AI; it is the human input.\r\n\r\nNew-age AI tools are incredibly hard to control without prompt perfectionism. 
We are trying to avoid prompts that are ambiguous or vague, prompts that are confusing or have contradictory instructions, prompts requesting unknown or unsupported facts, prompts that are long, overly broad, or multi-faceted, prompts lacking retrieval or grounding (citations), prompts that “jailbreak” the model on safety and truthfulness, prompts that are adversarial or misleading, and the list goes on. In short, we are clearly not talking to another human anymore, but trying to operate Formula 1 machines with tricycle training. And this is where even technically savvy users struggle to translate what they want into precise, semantically valid, data-aware prompts. That’s not just inefficient. It’s \x3Cem>dangerous\x3C/em>, because it creates a false sense of reliability.\r\n\r\nWhat’s really broken? Well, most enterprise users are asked to learn the art of prompting and to start doing it from scratch. There’s no autocomplete. No query validation. No semantic hinting. We’ve created an illusion of power, where users can \"ask anything\", while at the same time we ask users to reinvent the wheel every time. And this design flaw doesn’t just create friction. It kills momentum.\r\n\r\nGiven the complexities in enterprise data, treating prompting like typing in a search box ends up being a non-starter. Prompting for data is a dynamic process that needs \x3Cem>scaffolding\x3C/em>. While expert data users might be able to muddle through, users in ops, marketing, or finance are likely to flounder. Essentially, we are locking knowledge behind an interface that pretends to be simple but isn’t.\r\n\r\n \r\n\x3Ch2>\x3Cstrong>The counterintuitive solution: Auto mode\x3C/strong>\x3C/h2>\r\nMost people don’t need a \"chatbot\". They need a smart assistant that scaffolds their thinking. 
That’s why at \x3Ca href=\"https://www.tursio.ai/\">Tursio\x3C/a>, we built \x3Cstrong>Auto Mode\x3C/strong>: a new way to prompt based on \x3Cstrong>what exists\x3C/strong>, \x3Cstrong>what’s meaningful\x3C/strong>, and \x3Cstrong>what’s probable\x3C/strong>, all grounded in your data. Let's unpack each of these below:\r\n\r\n\x3Cstrong>What exists?\x3C/strong> Asking questions on the data carries the implicit assumption that the answers will come only from the data. This is often tedious, since either the questions are too far from what exists in the data or the responses are hallucinated. Discovering what exists in the data can help users frame their questions more accurately and set better expectations for the responses.\r\n\r\n\x3Cstrong>What's meaningful?\x3C/strong> Data operations need to be meaningful with respect to the underlying data properties. Combining customers and employees that have no join relationship between them will yield garbage responses. Likewise, counting patients without deduplicating them on their IDs or summing up ages instead of averaging them is rarely meaningful. Identifying such data properties and guiding users through them can help improve correctness.\r\n\r\n\x3Cstrong>What's probable?\x3C/strong> SQL is grounded in first-order logic, i.e., predicates and quantifiers on sets. Unfortunately, this is not so intuitive to people when they are trying to ask data queries in natural language that will ultimately be translated into SQL statements. Guiding users through the probable predicates and quantifiers, while also steering them away from anything that cannot be expressed as operations on sets, can help ground their expectations.\r\n\r\nEssentially, Auto Mode is like working inside a smart IDE for queries. The intended experience is to feel like \"vibe coding\", i.e., you explore, select, and accept. The system fills in the gaps. 
Eventually, the goal is to make Tursio accessible to non-experts, so that they can forget the underlying SQL ever existed.\r\n\r\nA natural reaction to this counterintuitive solution could be to just teach prompting: \"But people just need training\". Maybe. But most people aren’t prompt engineers, and they shouldn’t have to be. We believe AI tools should adapt to humans, not the other way around. This isn’t about dumbing down the AI. It’s about \x3Cstrong>removing friction so people can get real work done\x3C/strong>.\r\n\r\n \r\n\x3Ch2>\x3Cstrong>Final thoughts\x3C/strong>\x3C/h2>\r\nEvery new wave of technology goes through a cycle of hype followed by grounding. AI is no different, with many AI tools already entering the grounding phase. Prompting has been a revelation in this process, as people quickly realized that it is both an art and a science. Consequently, more mature AI tools now abstract complex parts of prompts into system and developer layers, requiring minimal to no prompting from users. Tursio takes this a step further by introducing auto-prompting for data, hiding the complexities of data prompts so users no longer have to worry about them. Ultimately, the goal is to make data simpler and to get the job done.",can_share_on_x:!0,can_share_on_facebook:!0,can_share_on_linkedin:!0,blog_image:"https://blog.tursio.ai/wp-content/uploads/2025/09/New-Blog-Graphic-2.0.png",author:"Alekh Jindal",author_image:void 0,published_date:"September 19, 2025",tags:"Databases, Generative AI",description:"We’ve all seen AI demos: type a question, get an instant answer. In reality, 95% of enterprise AI pilots fail, not because models are weak, but because prompting is hard. Most users aren’t prompt engineers, especially when querying structured data, leading to hallucinations and lost trust. At Tursio, Auto Mode solves this: it scaffolds questions based on what exists, what’s meaningful, and what’s probable in your data. 
Users explore, select, and accept—without worrying about SQL or prompt complexity. AI should adapt to humans, not the other way around. The result? Fast, reliable insights you can trust."}},$R[13]={id:1172,slug:"redefining-productivity-with-ai",acf:$R[14]={title:"Redefining Productivity with AI",content:"\x3Cem>\x3Cstrong>The rise of AI has sparked a familiar fear: that machines will replace people. \x3C/strong>\x3C/em>\r\n\r\n\x3Cspan data-contrast=\"auto\">However, the real story unfolding across industries is far more nuanced and optimistic. AI isn’t here to replace your team. It’s here to rewire how they work, think, and create value.\x3C/span>\x3Cspan data-ccp-props=\"{"134233117":false,"134233118":false,"335559738":240,"335559739":240}\"> \x3C/span>\r\n\r\n\x3Cspan data-contrast=\"auto\">This isn’t about subtraction. It’s about transformation. The organizations that thrive in the AI era won’t be the ones that cut headcount- they’ll be the ones that reimagine how humans and machines collaborate.\x3C/span>\x3Cspan data-ccp-props=\"{"134233117":false,"134233118":false,"335559738":240,"335559739":240}\"> \x3C/span>\r\n\x3Ch3>\x3Cb>\x3Cspan data-contrast=\"auto\">AI-Powered Teams Will Outperform The Rest\x3C/span>\x3C/b>\x3C/h3>\r\n\x3Cspan data-contrast=\"auto\">The myth of AI as a job destroyer is giving way to a more grounded reality: AI is a force multiplier. It’s not eliminating roles-it’s enhancing them.\x3C/span>\x3Cspan data-ccp-props=\"{"134233117":false,"134233118":false,"335559738":240,"335559739":240}\"> \x3C/span>\r\n\r\n\x3Cspan data-contrast=\"auto\">A recent \x3Ca href=\"https://www.axios.com/2025/06/30/ai-job-vibe-coding-upwork\">Upwork-Axios study\x3C/a> found that freelancers using AI earned 25% more year-over-year. That’s not a fluke-it’s a signal. AI is enabling professionals to deliver more value, faster, and with greater precision. 
From automating repetitive tasks to accelerating research and ideation, AI is becoming a co-pilot for knowledge workers across functions.\x3C/span>\x3Cspan data-ccp-props=\"{"134233117":false,"134233118":false,"335559738":240,"335559739":240}\"> \x3C/span>\r\n\r\n\x3Cspan data-contrast=\"auto\">Rather than replacing expertise, AI is amplifying it, freeing up time for higher-order thinking, creativity, and strategic execution.\x3C/span>\x3Cspan data-ccp-props=\"{"134233117":false,"134233118":false,"335559738":240,"335559739":240}\"> \x3C/span>\r\n\x3Ch3>The New Productivity Formula: Decision Velocity > Hours Worked\x3C/h3>\r\n\x3Cspan data-contrast=\"auto\">Traditional productivity metrics, such as hours worked or tasks completed, are giving way to a more modern and meaningful measure: decision velocity.\x3C/span>\x3Cspan data-ccp-props=\"{"134233117":false,"134233118":false,"335559738":240,"335559739":240}\"> \x3C/span>\r\n\r\n\x3Cspan data-contrast=\"auto\">In a world where speed and accuracy are competitive advantages, the ability to make fast, informed decisions is more valuable than ever. AI enables this shift by surfacing insights in real-time, contextualizing data, and reducing the cognitive load on teams.\x3C/span>\x3Cspan data-ccp-props=\"{"134233117":false,"134233118":false,"335559738":240,"335559739":240}\"> \x3C/span>\r\n\r\n\x3Cspan data-contrast=\"auto\">According to recent studies, 64% of businesses expect AI to improve productivity, with some estimating up to 40% efficiency gains. 
\x3Ca href=\"https://www.mckinsey.com/capabilities/mckinsey-digital/our-insights/the-economic-potential-of-generative-ai-the-next-productivity-frontier\">McKinsey projects that generative AI could contribute between $2.6 and $4.4 trillion annually in global productivity.\x3C/a>\x3C/span>\x3Cspan data-ccp-props=\"{"134233117":false,"134233118":false,"335559738":240,"335559739":240}\"> \x3C/span>\r\n\r\n\x3Cem>This isn’t about doing more- it’s about deciding better, faster, and with greater confidence. \x3C/em>\r\n\x3Ch3>\x3Cb>\x3Cspan data-contrast=\"auto\">Roles Most Impacted: Analysts, Marketers, Customer Teams, and Managers\x3C/span>\x3C/b>\x3Cspan data-ccp-props=\"{"134233117":false,"134233118":false,"335559738":240,"335559739":240}\"> \x3C/span>\x3C/h3>\r\n\x3Cspan data-contrast=\"auto\">AI is reshaping the contours of work across departments by eliminating certain roles while evolving others.\x3C/span>\x3Cspan data-ccp-props=\"{"134233117":false,"134233118":false,"335559738":240,"335559739":240}\"> \x3C/span>\r\n\x3Cul>\r\n \t\x3Cli data-leveltext=\"\" data-font=\"Symbol\" data-listid=\"5\" data-list-defn-props=\"{"335552541":1,"335559685":720,"335559991":360,"469769226":"Symbol","469769242":[8226],"469777803":"left","469777804":"","469777815":"hybridMultilevel"}\" aria-setsize=\"-1\" data-aria-posinset=\"1\" data-aria-level=\"1\">\x3Cb>\x3Cspan data-contrast=\"auto\">Analysts\x3C/span>\x3C/b>\x3Cspan data-contrast=\"auto\"> are spending less time wrangling data and more time interpreting it, thanks to AI tools that automate data preparation and visualization.\x3C/span>\x3Cspan data-ccp-props=\"{"134233117":false,"134233118":false,"335559738":0,"335559739":0}\"> \x3C/span>\x3C/li>\r\n\x3C/ul>\r\n\x3Cul>\r\n \t\x3Cli data-leveltext=\"\" data-font=\"Symbol\" data-listid=\"5\" data-list-defn-props=\"{"335552541":1,"335559685":720,"335559991":360,"469769226":"Symbol","469769242":[8226],"469777803":"left","469777804":"","469777815":"hybridMultilevel"}\" 
aria-setsize=\"-1\" data-aria-posinset=\"2\" data-aria-level=\"1\">\x3Cb>\x3Cspan data-contrast=\"auto\">Marketers\x3C/span>\x3C/b>\x3Cspan data-contrast=\"auto\"> are leading the charge in AI adoption, with \x3Ca href=\"https://softspacesolutions.com/blog/application-of-ai-in-marketing/\" target=\"_blank\" rel=\"noopener\">88% using AI daily. Specifically, 93% leverage\x3C/a>\x3Ca href=\"https://softspacesolutions.com/blog/application-of-ai-in-marketing/\"> it for content generation, 81% for insight discovery, and 90% for faster decision-making.\x3C/a>\x3C/span>\x3Cspan data-ccp-props=\"{"134233117":false,"134233118":false,"335559738":0,"335559739":0}\"> \x3C/span>\x3C/li>\r\n\x3C/ul>\r\n\x3Cul>\r\n \t\x3Cli data-leveltext=\"\" data-font=\"Symbol\" data-listid=\"5\" data-list-defn-props=\"{"335552541":1,"335559685":720,"335559991":360,"469769226":"Symbol","469769242":[8226],"469777803":"left","469777804":"","469777815":"hybridMultilevel"}\" aria-setsize=\"-1\" data-aria-posinset=\"3\" data-aria-level=\"1\">\x3Cb>\x3Cspan data-contrast=\"auto\">Customer-facing teams\x3C/span>\x3C/b>\x3Cspan data-contrast=\"auto\"> are seeing measurable gains in service quality and speed. 
\x3Ca href=\"https://www.squadstack.com/voicebot/ai-virtual-assistants\">AI-powered assistants have boosted productivity by 15% in support roles.\x3C/a>\x3C/span>\x3Cspan data-ccp-props=\"{"134233117":false,"134233118":false,"335559738":0,"335559739":0}\"> \x3C/span>\x3C/li>\r\n\x3C/ul>\r\n\x3Cul>\r\n \t\x3Cli data-leveltext=\"\" data-font=\"Symbol\" data-listid=\"5\" data-list-defn-props=\"{"335552541":1,"335559685":720,"335559991":360,"469769226":"Symbol","469769242":[8226],"469777803":"left","469777804":"","469777815":"hybridMultilevel"}\" aria-setsize=\"-1\" data-aria-posinset=\"4\" data-aria-level=\"1\">\x3Cb>\x3Cspan data-contrast=\"auto\">Managers\x3C/span>\x3C/b>\x3Cspan data-contrast=\"auto\"> are shifting from micromanagement to strategic leadership, as AI takes over reporting, forecasting, and performance tracking.\x3C/span>\x3Cspan data-ccp-props=\"{"134233117":false,"134233118":false,"335559738":0,"335559739":0}\"> \x3C/span>\x3C/li>\r\n\x3C/ul>\r\n\x3Cspan data-contrast=\"auto\">These changes aren’t about displacement-they’re about elevation. \x3C/span>\r\n\r\n\x3Cem>AI is helping professionals focus on what humans do best: critical thinking, empathy, and innovation. 
\x3C/em>\r\n\x3Ch3>\x3Cb>\x3Cspan data-contrast=\"auto\">Tomorrow’s Skillsets: Data Literacy, AI Copiloting, Cross-Functional Agility\x3C/span>\x3C/b>\x3Cspan data-ccp-props=\"{"134233117":false,"134233118":false,"335559738":240,"335559739":240}\"> \x3C/span>\x3C/h3>\r\n\x3Cspan data-contrast=\"auto\">As AI becomes embedded in daily workflows, the most valuable skill sets are evolving.\x3C/span>\x3Cspan data-ccp-props=\"{"134233117":false,"134233118":false,"335559738":240,"335559739":240}\"> \x3C/span>\r\n\x3Cul>\r\n \t\x3Cli data-leveltext=\"\" data-font=\"Symbol\" data-listid=\"6\" data-list-defn-props=\"{"335552541":1,"335559685":720,"335559991":360,"469769226":"Symbol","469769242":[8226],"469777803":"left","469777804":"","469777815":"hybridMultilevel"}\" aria-setsize=\"-1\" data-aria-posinset=\"1\" data-aria-level=\"1\">\x3Cb>\x3Cspan data-contrast=\"auto\">Data literacy\x3C/span>\x3C/b>\x3Cspan data-contrast=\"auto\"> is becoming essential, not in the form of coding but in the ability to interpret and question data-driven insights.\x3C/span>\x3Cspan data-ccp-props=\"{"134233117":false,"134233118":false,"335559738":0,"335559739":0}\"> \x3C/span>\x3C/li>\r\n\x3C/ul>\r\n\x3Cul>\r\n \t\x3Cli data-leveltext=\"\" data-font=\"Symbol\" data-listid=\"6\" data-list-defn-props=\"{"335552541":1,"335559685":720,"335559991":360,"469769226":"Symbol","469769242":[8226],"469777803":"left","469777804":"","469777815":"hybridMultilevel"}\" aria-setsize=\"-1\" data-aria-posinset=\"2\" data-aria-level=\"1\">\x3Cb>\x3Cspan data-contrast=\"auto\">AI copiloting\x3C/span>\x3C/b>\x3Cspan data-contrast=\"auto\"> is emerging as a core competency. 
Knowing how to prompt, validate, and collaborate with AI tools will be as important as knowing how to use a spreadsheet.\x3C/span>\x3Cspan data-ccp-props=\"{"134233117":false,"134233118":false,"335559738":0,"335559739":0}\"> \x3C/span>\x3C/li>\r\n\x3C/ul>\r\n\x3Cul>\r\n \t\x3Cli data-leveltext=\"\" data-font=\"Symbol\" data-listid=\"6\" data-list-defn-props=\"{"335552541":1,"335559685":720,"335559991":360,"469769226":"Symbol","469769242":[8226],"469777803":"left","469777804":"","469777815":"hybridMultilevel"}\" aria-setsize=\"-1\" data-aria-posinset=\"3\" data-aria-level=\"1\">\x3Cb>\x3Cspan data-contrast=\"auto\">Cross-functional agility\x3C/span>\x3C/b>\x3Cspan data-contrast=\"auto\"> is key. As AI breaks down silos, teams that can collaborate across disciplines and share a common data language will have a distinct edge.\x3C/span>\x3Cspan data-ccp-props=\"{"134233117":false,"134233118":false,"335559738":0,"335559739":0}\"> \x3C/span>\x3C/li>\r\n\x3C/ul>\r\n\x3Cspan data-contrast=\"auto\">Yet, there’s a gap. \x3Ca href=\"https://www.isaca.org/about-us/newsroom/press-releases/2025/ai-use-is-outpacing-policy-and-governance-isaca-finds\">While 83% of professionals in Europe use generative AI, only 31% report having formal AI policies in place.\x3C/a> This underscores the urgent need for training, governance, and cultural alignment.\x3C/span>\x3Cspan data-ccp-props=\"{"134233117":false,"134233118":false,"335559738":240,"335559739":240}\"> \x3C/span>\r\n\x3Ch3>Empowered Teams Win with AI\x3C/h3>\r\nThe real AI advantage isn’t in the tools - it’s in the teams that use them.\r\n\r\nOrganizations leading the AI race aren’t pushing solutions top-down; they’re empowering teams to explore, adapt, and innovate. 
When people are given the autonomy to apply AI in ways that make sense for their work, they unlock real productivity gains, saving hours each week and redirecting that time toward strategic thinking and creative problem-solving.\r\n\r\nCompanies with deliberate, team-driven AI strategies are seeing faster growth and stronger outcomes.\r\n\r\n\x3Cem>The message is clear: when you trust your teams with AI, they don’t just adopt it - they amplify its impact.\x3C/em>\r\n\x3Ch3>\x3Cb>\x3Cspan data-contrast=\"auto\">The Tursio Perspective: Unlocking Human Potential in the AI Era\x3C/span>\x3C/b>\x3Cspan data-ccp-props=\"{"134233117":false,"134233118":false,"335559738":240,"335559739":240}\"> \x3C/span>\x3C/h3>\r\n\x3Cspan data-contrast=\"auto\">At Tursio, we’re building for this future. Our platform is designed to democratize access to structured data in natural language. Whether you’re in marketing, finance, or operations, Tursio empowers you to ask smarter questions and get instant answers.\x3C/span>\x3Cspan data-ccp-props=\"{"134233117":false,"134233118":false,"335559738":240,"335559739":240}\"> \x3C/span>\r\n\r\n\x3Cspan data-contrast=\"auto\">We believe productivity in the AI era isn’t about replacing people- it’s about unlocking their full potential. By removing technical barriers and accelerating insight generation, we help teams move from reactive reporting to proactive decision-making.\x3C/span>\x3Cspan data-ccp-props=\"{"134233117":false,"134233118":false,"335559738":240,"335559739":240}\"> \x3C/span>\r\n\r\n\x3Cem>In a world where speed, clarity, and adaptability define success, Tursio is your partner in rewriting the productivity playbook with quick insights at your fingertips. 
\x3C/em>\r\n\r\n ",can_share_on_x:!0,can_share_on_facebook:!0,can_share_on_linkedin:!0,blog_image:"https://blog.tursio.ai/wp-content/uploads/2025/08/1.png",author:"Nilanshi Dhoundiyal",author_image:void 0,published_date:"August 6, 2025",tags:"Generative AI",description:"AI isn’t replacing people—it’s amplifying them. Across functions, AI acts as a force multiplier, automating repetitive work, accelerating research, and enabling faster, more confident decisions. Analysts interpret rather than wrangle data, marketers generate content and insights instantly, customer teams serve faster, and managers focus on strategy over reporting. The new productivity measure is decision velocity, not hours worked. Tomorrow’s professionals need data literacy, AI copiloting skills, and cross-functional agility. At Tursio, we empower teams to access structured data in natural language, removing barriers and unlocking human potential, so AI amplifies impact instead of just adding tools."}},$R[15]={id:1174,slug:"mcp-for-databases-new-trick-for-old-elephants",acf:$R[16]={title:"MCP for Databases: New trick for old elephants",content:"In the evolving landscape of AI applications, one challenge keeps resurfacing: How do you make large language models (LLMs) actually understand enterprise data in context?\r\n\r\nWhile many developers have turned to orchestration frameworks like LangChain, a growing number are leaning toward something more structured, secure, and production-ready: Model Context Protocol (MCP) by Anthropic.\r\n\r\nOften called the “USB-C for AI applications,” MCP is quietly standardizing how tools, databases, and models communicate. 
But what’s really driving this shift - and how does it feel to use MCP in the real world?\r\n\r\nIn this post, we explore the human side of MCP: what developers love, where it shines, what makes it challenging, and why so many are betting on it as the foundation for scalable, secure, and context-aware AI workflows.\r\n\r\n \r\n\x3Ch2>\x3Cstrong>MCP Rapid Fire\x3C/strong>\x3C/h2>\r\n\x3Cstrong>What does an MCP server do?\x3C/strong>\r\n\r\nAn MCP server bridges the gap between natural language and structured enterprise data - securely and accurately.\r\n\r\nTake the Microsoft SQL Server MCP, for example. It translates plain English questions into secure, schema-aware SQL queries without requiring users to write a single line of SQL.\r\n\r\nThis enables fast, conversational access to enterprise data for use cases like:\r\n\x3Cul>\r\n \t\x3Cli>Financial reporting\x3C/li>\r\n \t\x3Cli>Supply chain insights\x3C/li>\r\n \t\x3Cli>Sales analytics\x3C/li>\r\n \t\x3Cli>Customer service dashboards\x3C/li>\r\n\x3C/ul>\r\nAll of this happens while respecting schema constraints, RBAC permissions, and existing infrastructure.\r\n\r\n\x3Cstrong>Why is everyone talking about it?\x3C/strong>\r\n\r\nMCP makes structured data LLM-readable - without the glue code.\r\n\r\nWith MCP servers, developers no longer need to manually wire up database logic or map schema fields into prompts. They provide LLMs with direct, yet secure, access to databases like SQL Server, enabling queries such as: \"What were our top-performing regions last quarter?\"\r\n\r\nMCP can provide instant answers using schema-aware, validated SQL, while reducing exposure to prompt injection and data leakage.\r\n\r\n\x3Cstrong>Is MCP the new LangChain?\x3C/strong>\r\n\r\nLangChain and MCP solve different problems but often come up in the same conversation.\r\n\r\nLangChain is great for chaining tools, APIs, and agent logic. 
It's open-source and modular - fast to prototype, but often unstable, especially for production use.\r\n\r\nMCP is designed for structured context. It’s production-grade by design and prioritizes schema control, access rules, and data safety.\r\n\r\n \r\n\x3Ch2>\x3Cstrong>The Human Side\x3C/strong>\x3C/h2>\r\n\x3Cstrong>LangChain: flexible, but fragile\x3C/strong>\r\n\r\nWhat People Love:\r\n\x3Cul>\r\n \t\x3Cli>“Flexible and fast to prototype.”\x3C/li>\r\n \t\x3Cli>“Modular and open.”\x3C/li>\r\n \t\x3Cli>“Feels like Zapier for LLMs.”\x3C/li>\r\n\x3C/ul>\r\nWhere it Struggles:\r\n\x3Cul>\r\n \t\x3Cli>“Breaks every time I update.”\x3C/li>\r\n \t\x3Cli>“Not production-ready yet.”\x3C/li>\r\n \t\x3Cli>“Steep learning curve.”\x3C/li>\r\n\x3C/ul>\r\n\x3Cstrong>MCP: a structure that feels secure\x3C/strong>\r\n\r\nWhat People Love:\r\n\x3Cul>\r\n \t\x3Cli>“Finally, a standard.”\x3C/li>\r\n \t\x3Cli>“Less glue code!”\x3C/li>\r\n \t\x3Cli>“Great for regulated data.”\x3C/li>\r\n\x3C/ul>\r\nWhere it Struggles:\r\n\x3Cul>\r\n \t\x3Cli>“Hard to set up right.”\x3C/li>\r\n \t\x3Cli>“Some servers are half-baked.”\x3C/li>\r\n \t\x3Cli>“Security holes need patching.”\x3C/li>\r\n\x3C/ul>\r\n \r\n\x3Ch2>\x3Cstrong>MCP in Practice\x3C/strong>\x3C/h2>\r\nAcross industries, MCP servers are already being used to power secure, LLM-enabled access to structured data - from SQL databases to patient records and financial systems. 
Here are some notable implementations:\r\n\r\n\x3Cstrong>SQL Server\x3C/strong>\r\n\x3Cul>\r\n \t\x3Cli>DreamFactory MCP Server - Low-code API generation, RBAC, SQL injection protection\x3C/li>\r\n \t\x3Cli>Azure SQL + OpenAI - GPT-native SQL querying with enterprise-grade auth\x3C/li>\r\n\x3C/ul>\r\n\x3Cstrong>Snowflake\x3C/strong>\r\n\x3Cul>\r\n \t\x3Cli>Isaac Wasserman’s Server - OSS for developers\r\nGitHub Repo: \x3Ca href=\"https://github.com/isaacwasserman/mcp-snowflake-server\">isaacwasserman/mcp-snowflake-server\x3C/a>\x3C/li>\r\n \t\x3Cli>PulseMCP - Enterprise security and logging\x3C/li>\r\n \t\x3Cli>Wren Semantic Layer - Natural language mapped to SQL metadata\x3C/li>\r\n\x3C/ul>\r\n\x3Cstrong>Healthcare\x3C/strong>\r\n\x3Cul>\r\n \t\x3Cli>Mindbowser FHIR MCP - Secure endpoints for medical records and lab data\r\nURL: \x3Ca href=\"https://www.mindbowser.com/model-context-protocol/\">mindbowser.com/model-context-protocol\x3C/a>\x3C/li>\r\n \t\x3Cli>Medplum AI MCP - Cloud-native, GraphQL interface with FHIR\r\nURL: \x3Ca href=\"https://www.medplum.com/\">medplum.com\x3C/a>\x3C/li>\r\n \t\x3Cli>AWS HealthLake + Bedrock - Disease trend and cohort analysis\r\nURL: \x3Ca href=\"https://aws.amazon.com/healthlake\">aws.amazon.com/healthlake\x3C/a>\x3C/li>\r\n\x3C/ul>\r\n\x3Cstrong>Finance\x3C/strong>\r\n\x3Cul>\r\n \t\x3Cli>Zerodha Kite MCP - Retail portfolio analysis via chat\r\nURL: \x3Ca href=\"https://zerodha.com/z-connect/featured/connect-your-zerodha-account-to-ai-assistants-with-kite-mcp\">zerodha.com/z-connect/kite-mcp\x3C/a>\x3C/li>\r\n \t\x3Cli>FinGPT MCP Adapter - Real-time financial data processing\r\nGitHub: \x3Ca href=\"https://github.com/AI4Finance-Foundation/FinGPT\">AI4Finance Foundation\x3C/a>\x3C/li>\r\n \t\x3Cli>PulseMCP for Core Banking - Conversational access to core banking data\x3C/li>\r\n\x3C/ul>\r\n \r\n\x3Ch2>\x3Cstrong>Conclusion: Connectivity vs Experience\x3C/strong>\x3C/h2>\r\n\x3Cstrong>Model Context Protocol (MCP)\x3C/strong> is 
\x3Cstrong>connectivity first\x3C/strong>. It’s a powerful protocol for defining how AI systems connect with structured databases like SQL, Snowflake, or healthcare APIs. It focuses on schema exposure, context passing, and secure access, enabling LLMs to “see” the data properly and safely.\r\n\r\n\x3Cstrong>Tursio\x3C/strong>, by contrast, is \x3Cstrong>experience-first\x3C/strong>. We focus on making it effortless for users to ask questions, understand answers, and explore structured data, without writing SQL or knowing how the data is wired behind the scenes. Our strength lies in the natural language layer, contextual explanations, and an interface built for actual decision-makers.\r\n\r\nDespite these differences, \x3Cstrong>Tursio and MCP share a common architectural principle\x3C/strong>: both rely on structured schema modeling, context injection, and secure query generation behind the scenes.\r\n\r\n ",can_share_on_x:!0,can_share_on_facebook:!0,can_share_on_linkedin:!0,blog_image:"https://blog.tursio.ai/wp-content/uploads/2025/07/3.png",author:"Shraddhaa Khanna",author_image:void 0,published_date:"July 11, 2025",tags:"Generative AI",description:"In enterprise AI, making LLMs understand structured data is tricky. Many turn to LangChain for rapid prototyping, but Model Context Protocol (MCP) is emerging as a production-ready alternative—sometimes called the “USB-C for AI.” MCP servers translate natural language into schema-aware, secure queries, giving LLMs safe access to SQL, Snowflake, or FHIR databases without glue code. While LangChain excels at flexibility, MCP shines in regulated, structured environments. Tursio complements this by focusing on the human experience: effortless natural language querying, context-aware explanations, and insights for decision-makers. 
Together, structured access and user-centric interfaces define modern AI workflows."}},$R[17]={id:1175,slug:"is-100x-productivity-possible-with-ai",acf:$R[18]={title:"Is 100x Productivity Possible with AI?",content:"\x3Ch2>\x3Cem>“You spend 20 minutes searching for that one number... then ping the data team. Again.”\x3C/em>\x3C/h2>\r\nIf that feels familiar, you're not alone. Across industries, knowledge workers are increasingly overwhelmed by tools, reports, and dashboards - yet starved for actual answers. According to McKinsey, \x3Ca href=\"https://www.mckinsey.com/industries/technology-media-and-telecommunications/our-insights/the-social-economy\">employees spend nearly \x3Cstrong>20% of their time just searching for information\x3C/strong>\x3C/a>.\r\n\r\n\x3Cstrong>That’s an entire day each week, lost to inefficiency.\x3C/strong>\r\n\r\nBut what if that friction could be eliminated? What if getting the insight you need was as easy as asking a question?\r\n\r\nWe’re on the brink of a new work paradigm - one where \x3Cstrong>AI doesn’t just accelerate tasks but unlocks entirely new workflows\x3C/strong>. At the heart of it: efficiency not just as speed, but as \x3Cstrong>strategic leverage\x3C/strong>.\r\n\r\n \r\n\x3Ch3>The New Definition of Efficiency\x3C/h3>\r\nIn today's knowledge economy, efficiency isn’t about doing more in less time - it’s about \x3Cstrong>getting to impact faster\x3C/strong>.\r\n\r\nWhen data is everywhere but insights are nowhere, teams stall. Decisions get delayed. Opportunities get missed. Efficiency, in this context, means:\r\n\x3Cul>\r\n \t\x3Cli>\x3Cstrong>Reducing time-to-insight\x3C/strong> from days to seconds\x3C/li>\r\n \t\x3Cli>\x3Cstrong>Empowering anyone\x3C/strong>, not just analysts, to explore data safely\x3C/li>\r\n \t\x3Cli>\x3Cstrong>Cutting down reliance\x3C/strong> on static dashboards and outdated reports\x3C/li>\r\n\x3C/ul>\r\nThe future isn’t about more data. 
It’s about \x3Cem>access\x3C/em> to the right data, in the right context, at the right time.\r\n\r\n \r\n\x3Ch3>What’s Blocking You?\x3C/h3>\r\nMost organizations have the tools. They have dashboards, analytics platforms, and reports. But here’s what’s still broken:\r\n\x3Cul>\r\n \t\x3Cli>\x3Cstrong>Silos\x3C/strong>: Marketing, product, finance - each team has its own language and systems\x3C/li>\r\n \t\x3Cli>\x3Cstrong>Technical barriers\x3C/strong>: “I don’t know SQL,” or “I’m afraid to mess up the dashboard.”\x3C/li>\r\n \t\x3Cli>\x3Cstrong>Latency\x3C/strong>: Stakeholders wait hours or days for data answers that should take seconds\x3C/li>\r\n\x3C/ul>\r\nConsider this: A VP of Growth wants to know how Q2 revenue compares across regions. Instead of exploring the data herself, she submits a request. The data team queues it. The answer comes back two days later. By then, the window for the decision may have closed.\r\n\r\nThis isn’t a tooling issue - it’s a workflow issue. And that’s where AI steps in.\r\n\r\n \r\n\x3Ch3>What 100x Efficiency \x3Cem>Really\x3C/em> Looks Like\x3C/h3>\r\n100x doesn’t mean working 100x harder. It means \x3Cstrong>reducing the friction between question and answer so drastically\x3C/strong> that every decision cycle compresses, again and again.\r\n\r\nIt looks like this:\r\n\x3Cul>\r\n \t\x3Cli>A marketer comparing campaign performance \x3Cstrong>mid-meeting\x3C/strong>, not two days later\x3C/li>\r\n \t\x3Cli>A PM identifying drop-off segments \x3Cstrong>without waiting for an analyst\x3C/strong>\x3C/li>\r\n \t\x3Cli>A finance lead stress-testing budget models with \x3Cstrong>natural language inputs\x3C/strong>\x3C/li>\r\n\x3C/ul>\r\nThese are not one-off productivity boosts. 
They are significant shifts in how work is done.\r\n\r\n“When AI understands both your question and your data without needing a translator, you unlock compound efficiency.”\r\n\r\n \r\n\x3Ch3>Use Cases: Where the Shift Is Already Happening\x3C/h3>\r\nIn our work with modern teams, we’re seeing 100x efficiency in action every day:\r\n\x3Cul>\r\n \t\x3Cli>\x3Cstrong>Product\x3C/strong>: Teams instantly surface user behavior patterns without relying on embedded analysts\x3C/li>\r\n \t\x3Cli>\x3Cstrong>Finance\x3C/strong>: Budget owners run their own scenario models\x3C/li>\r\n \t\x3Cli>\x3Cstrong>Sales\x3C/strong>: Managers access real-time funnel metrics during pipeline reviews\x3C/li>\r\n\x3C/ul>\r\nEach of these moments takes what used to be a multi-day back-and-forth and turns it into a 30-second conversation.\r\n\r\n \r\n\x3Ch3>Curiosity: Powered by AI\x3C/h3>\r\nThe knowledge workers of tomorrow aren’t opening 12 dashboards or waiting three days for a report. They’re asking better questions and expecting instant answers.\r\n\r\nThis isn’t automation for the sake of novelty. It’s \x3Cstrong>AI as an enabler of curiosity\x3C/strong>, exploration, and speed.\r\n\r\n“The next competitive advantage isn’t in who has the most data - it’s in who can use it the fastest.”\r\n\r\nIf you’re rethinking how your teams access insights and make decisions, \x3Ca href=\"https://www.tursio.ai/contact\">let’s talk\x3C/a>. You don’t need another dashboard.\r\nYou need answers at the speed of thought.",can_share_on_x:!0,can_share_on_facebook:!0,can_share_on_linkedin:!0,blog_image:"https://blog.tursio.ai/wp-content/uploads/2025/07/4.png",author:"Nilanshi Dhoundiyal",author_image:void 0,published_date:"July 9, 2025",tags:"Generative AI",description:"Knowledge workers waste nearly 20% of their time hunting for data. Dashboards, reports, and analytics tools exist—but insights are buried. True efficiency isn’t about doing more; it’s about getting answers faster. 
AI can remove friction between question and insight, letting anyone explore data instantly without SQL or analyst handholding. Imagine a marketer checking campaign performance mid-meeting, a PM spotting drop-offs in real time, or finance stress-testing budgets on the spot. This is 100x efficiency: decisions made at the speed of thought. The next advantage isn’t more data—it’s faster, smarter access to it."}},$R[19]={id:1176,slug:"how-credit-unions-are-winning-with-generative-ai",acf:$R[20]={title:"How Credit Unions Are Winning with Generative AI",content:"\x3Cp data-start=\"109\" data-end=\"169\">\x3Cspan style=\"color: #000000;\">Most credit unions know AI is coming—but where do you start?\x3C/span>\x3C/p>\r\n\x3Cp data-start=\"410\" data-end=\"782\">\x3Cspan style=\"color: #000000;\">Our work with credit unions has revealed a practical roadmap for adoption—one that balances near-term ROI with\x3C/span> long-term transformation. It’s a multi-stage approach that lets financial institutions start small, see immediate value, and expand over time. Here's what we’ve learned, the challenges encountered, and why a phased implementation—Levels 1 through 3—is proving successful.\x3C/p>\r\n\r\n\x3Ch3 data-start=\"789\" data-end=\"841\">\x3Cstrong data-start=\"793\" data-end=\"841\">The Starting Point: Learnings from the Field\x3C/strong>\x3C/h3>\r\n\x3Col data-start=\"843\" data-end=\"1826\">\r\n \t\x3Cli data-start=\"843\" data-end=\"1105\">\r\n\x3Cp data-start=\"846\" data-end=\"1105\">\x3Cstrong data-start=\"846\" data-end=\"883\">Adoption Is a Journey, Not a Jump\x3C/strong>\r\nMost credit unions don't leap into enterprise-wide AI deployments. Instead, they prefer to start with targeted, low-risk use cases. 
This gradual approach builds confidence, showcases value, and fosters internal buy-in.\x3C/p>\r\n\x3C/li>\r\n \t\x3Cli data-start=\"1107\" data-end=\"1345\">\r\n\x3Cp data-start=\"1110\" data-end=\"1345\">\x3Cstrong data-start=\"1110\" data-end=\"1137\">Data Complexity Is Real\x3C/strong>\r\nFrom core banking systems to spreadsheets, data is scattered and inconsistent. Without a way to unify and contextualize it, AI can’t deliver value. Automation in semantic modeling and mapping is critical.\x3C/p>\r\n\x3C/li>\r\n \t\x3Cli data-start=\"1347\" data-end=\"1584\">\r\n\x3Cp data-start=\"1350\" data-end=\"1584\">\x3Cstrong data-start=\"1350\" data-end=\"1371\">Quick Wins Matter\x3C/strong>\r\nEarly success with simple, natural language-driven insights makes a big difference. Teams get excited when they can ask, \x3Cem data-start=\"1496\" data-end=\"1528\">“What’s our daily net inflow?”\x3C/em> and get instant, accurate answers—no analysts required.\x3C/p>\r\n\x3C/li>\r\n \t\x3Cli data-start=\"1586\" data-end=\"1826\">\r\n\x3Cp data-start=\"1589\" data-end=\"1826\">\x3Cstrong data-start=\"1589\" data-end=\"1626\">Collaboration Drives Deeper Value\x3C/strong>\r\nOnce initial adoption proves effective, deeper use cases emerge organically—from finance to lending to risk. 
These require closer alignment across departments and more sophisticated data handling.\x3C/p>\r\n\x3C/li>\r\n\x3C/ol>\r\n\r\n\x3Chr data-start=\"2769\" data-end=\"2772\" />\r\n\r\n\x3Ch3 data-start=\"1833\" data-end=\"1893\">\x3Cstrong data-start=\"1837\" data-end=\"1893\">The Multi-Stage Approach: 3 Levels of Gen AI Maturity\x3C/strong>\x3C/h3>\r\n\x3Cp data-start=\"1895\" data-end=\"2035\">We’ve structured our solution into three implementation levels, each designed to match the organization’s readiness and business priorities.\x3C/p>\r\n\r\n\x3Ch3 data-start=\"2042\" data-end=\"2113\">\x3Cstrong data-start=\"2046\" data-end=\"2113\">Level 1 – Quick, Ad-Hoc Analysis \x3C/strong>\x3C/h3>\r\n\x3Cp data-start=\"2115\" data-end=\"2279\">This entry-level implementation is designed for teams that want fast insights without the overhead of reports or dashboards. It delivers tangible value within days.\x3C/p>\r\n\r\n\x3Cul data-start=\"2281\" data-end=\"2646\">\r\n \t\x3Cli data-start=\"2281\" data-end=\"2337\">\r\n\x3Cp data-start=\"2283\" data-end=\"2337\">\x3Cstrong data-start=\"2283\" data-end=\"2310\">Connect to data sources\x3C/strong> (databases, tables, views)\x3C/p>\r\n\x3C/li>\r\n \t\x3Cli data-start=\"2338\" data-end=\"2409\">\r\n\x3Cp data-start=\"2340\" data-end=\"2409\">\x3Cstrong data-start=\"2340\" data-end=\"2373\">Auto-generate semantic models\x3C/strong> using AI to understand data context\x3C/p>\r\n\x3C/li>\r\n \t\x3Cli data-start=\"2410\" data-end=\"2504\">\r\n\x3Cp data-start=\"2412\" data-end=\"2504\">\x3Cstrong data-start=\"2412\" data-end=\"2449\">Ask questions in natural language\x3C/strong>, like “Which branches had the most growth this month?”\x3C/p>\r\n\x3C/li>\r\n \t\x3Cli data-start=\"2505\" data-end=\"2567\">\r\n\x3Cp data-start=\"2507\" data-end=\"2567\">\x3Cstrong data-start=\"2507\" data-end=\"2557\">Spot trends, anomalies, and generate summaries\x3C/strong> instantly\x3C/p>\r\n\x3C/li>\r\n \t\x3Cli data-start=\"2568\" 
data-end=\"2646\">\r\n\x3Cp data-start=\"2570\" data-end=\"2646\">\x3Cstrong data-start=\"2570\" data-end=\"2606\">Visualize with charts and graphs\x3C/strong> automatically created based on the data\x3C/p>\r\n\x3C/li>\r\n\x3C/ul>\r\n\x3Cp data-start=\"2648\" data-end=\"2767\">✅ \x3Cem data-start=\"2650\" data-end=\"2767\">Ideal for business analysts, operations teams, and executives looking for real-time insights without waiting on IT.\x3C/em>\x3C/p>\r\n\r\n\r\n\x3Chr data-start=\"2769\" data-end=\"2772\" />\r\n\r\n\x3Ch3 data-start=\"2774\" data-end=\"2830\">\x3Cstrong data-start=\"2778\" data-end=\"2830\">Level 2 – Advanced Search and Financial Analysis\x3C/strong>\x3C/h3>\r\n\x3Cp data-start=\"2832\" data-end=\"3028\">Once Level 1 is in place, many credit unions want to go deeper. Level 2 enables strategic, cross-functional financial insights, and typically requires a few weeks of data mapping and ontology setup.\x3C/p>\r\n\x3Cp data-start=\"3030\" data-end=\"3046\">Sample insights:\x3C/p>\r\n\r\n\x3Cul data-start=\"3048\" data-end=\"3708\">\r\n \t\x3Cli data-start=\"3048\" data-end=\"3121\">\r\n\x3Cp data-start=\"3050\" data-end=\"3121\">\x3Cstrong data-start=\"3050\" data-end=\"3063\">Liquidity\x3C/strong>: Are we maintaining enough reserves to handle volatility?\x3C/p>\r\n\x3C/li>\r\n \t\x3Cli data-start=\"3122\" data-end=\"3209\">\r\n\x3Cp data-start=\"3124\" data-end=\"3209\">\x3Cstrong data-start=\"3124\" data-end=\"3144\">Capital Adequacy\x3C/strong>: How does our CAR measure up to internal and regulatory targets?\x3C/p>\r\n\x3C/li>\r\n \t\x3Cli data-start=\"3210\" data-end=\"3283\">\r\n\x3Cp data-start=\"3212\" data-end=\"3283\">\x3Cstrong data-start=\"3212\" data-end=\"3234\">Expense Management\x3C/strong>: Where can we cut costs without harming service?\x3C/p>\r\n\x3C/li>\r\n \t\x3Cli data-start=\"3284\" data-end=\"3361\">\r\n\x3Cp data-start=\"3286\" data-end=\"3361\">\x3Cstrong data-start=\"3286\" data-end=\"3303\">Member Growth\x3C/strong>: Are we 
retaining the right segments for long-term value?\x3C/p>\r\n\x3C/li>\r\n \t\x3Cli data-start=\"3362\" data-end=\"3448\">\r\n\x3Cp data-start=\"3364\" data-end=\"3448\">\x3Cstrong data-start=\"3364\" data-end=\"3389\">Loan & Deposit Trends\x3C/strong>: Are member behaviors shifting in ways we need to address?\x3C/p>\r\n\x3C/li>\r\n \t\x3Cli data-start=\"3449\" data-end=\"3533\">\r\n\x3Cp data-start=\"3451\" data-end=\"3533\">\x3Cstrong data-start=\"3451\" data-end=\"3468\">Risk Exposure\x3C/strong>: What risks are emerging, and how effective are our mitigations?\x3C/p>\r\n\x3C/li>\r\n \t\x3Cli data-start=\"3534\" data-end=\"3628\">\r\n\x3Cp data-start=\"3536\" data-end=\"3628\">\x3Cstrong data-start=\"3536\" data-end=\"3558\">Market Positioning\x3C/strong>: How do we compare with peers, and what macro trends should we watch?\x3C/p>\r\n\x3C/li>\r\n \t\x3Cli data-start=\"3629\" data-end=\"3708\">\r\n\x3Cp data-start=\"3631\" data-end=\"3708\">\x3Cstrong data-start=\"3631\" data-end=\"3646\">Projections\x3C/strong>: Are our strategic plans in line with what the data suggests?\x3C/p>\r\n\x3C/li>\r\n\x3C/ul>\r\n\x3Cp data-start=\"3710\" data-end=\"3802\">✅ \x3Cem data-start=\"3712\" data-end=\"3802\">Level 2 equips product, finance, strategy, and leadership teams to make better decisions, faster.\x3C/em>\x3C/p>\r\n\r\n\r\n\x3Chr data-start=\"3804\" data-end=\"3807\" />\r\n\r\n\x3Ch3 data-start=\"3809\" data-end=\"3869\">\x3Cstrong data-start=\"3813\" data-end=\"3869\">Level 3 – Department-Level Deep Dives and Simulation\x3C/strong>\x3C/h3>\r\n\x3Cp data-start=\"3871\" data-end=\"4081\">At this stage, the AI solution becomes a true strategic advisor. 
We collaborate with specific departments—like Lending, Risk, or Member Services—to build simulations, scenario planning, and root cause analyses.\x3C/p>\r\n\x3Cp data-start=\"4083\" data-end=\"4113\">\x3Cstrong data-start=\"4083\" data-end=\"4113\">Loan Portfolio Monitoring:\x3C/strong>\x3C/p>\r\n\r\n\x3Cul data-start=\"4115\" data-end=\"4318\">\r\n \t\x3Cli data-start=\"4115\" data-end=\"4176\">\r\n\x3Cp data-start=\"4117\" data-end=\"4176\">What’s driving the $X billion (X%) increase in total loans?\x3C/p>\r\n\x3C/li>\r\n \t\x3Cli data-start=\"4177\" data-end=\"4241\">\r\n\x3Cp data-start=\"4179\" data-end=\"4241\">Why did new auto loans drop while used auto loans stayed flat?\x3C/p>\r\n\x3C/li>\r\n \t\x3Cli data-start=\"4242\" data-end=\"4318\">\r\n\x3Cp data-start=\"4244\" data-end=\"4318\">Should we rethink our credit card growth strategy based on balance trends?\x3C/p>\r\n\x3C/li>\r\n\x3C/ul>\r\n\x3Cp data-start=\"4320\" data-end=\"4346\">\x3Cstrong data-start=\"4320\" data-end=\"4346\">Delinquency Analytics:\x3C/strong>\x3C/p>\r\n\r\n\x3Cul data-start=\"4348\" data-end=\"4521\">\r\n \t\x3Cli data-start=\"4348\" data-end=\"4395\">\r\n\x3Cp data-start=\"4350\" data-end=\"4395\">Why did delinquencies rise by X basis points?\x3C/p>\r\n\x3C/li>\r\n \t\x3Cli data-start=\"4396\" data-end=\"4454\">\r\n\x3Cp data-start=\"4398\" data-end=\"4454\">Which segments are showing early signs of credit stress?\x3C/p>\r\n\x3C/li>\r\n \t\x3Cli data-start=\"4455\" data-end=\"4521\">\r\n\x3Cp data-start=\"4457\" data-end=\"4521\">How can we proactively adjust lending policies to mitigate risk?\x3C/p>\r\n\x3C/li>\r\n\x3C/ul>\r\n\x3Cp data-start=\"4523\" data-end=\"4623\">✅ \x3Cem data-start=\"4525\" data-end=\"4623\">This level unlocks powerful domain-specific insights that can transform how departments operate.\x3C/em>\x3C/p>\r\n\r\n\r\n\x3Chr data-start=\"4625\" data-end=\"4628\" />\r\n\r\n\x3Ch3 data-start=\"4630\" data-end=\"4679\">\x3Cstrong data-start=\"4634\" 
data-end=\"4679\">Overcoming Challenges: What We've Learned\x3C/strong>\x3C/h3>\r\n\x3Cul data-start=\"4681\" data-end=\"5289\">\r\n \t\x3Cli data-start=\"4681\" data-end=\"4913\">\r\n\x3Cp data-start=\"4683\" data-end=\"4913\">\x3Cstrong data-start=\"4683\" data-end=\"4718\">Data Governance is a Bottleneck\x3C/strong>\x3Cbr data-start=\"4718\" data-end=\"4721\" />Many credit unions are cautious with member data—as they should be. Our solution can run fully on-premises or within secured VPC environments to meet strict compliance and privacy standards.\x3C/p>\r\n\x3C/li>\r\n \t\x3Cli data-start=\"4915\" data-end=\"5084\">\r\n\x3Cp data-start=\"4917\" data-end=\"5084\">\x3Cstrong data-start=\"4917\" data-end=\"4945\">Change Management is Key\x3C/strong>\x3Cbr data-start=\"4945\" data-end=\"4948\" />Empowering non-technical users to self-serve insights is a cultural shift. Training, support, and quick wins are critical to adoption.\x3C/p>\r\n\x3C/li>\r\n \t\x3Cli data-start=\"5086\" data-end=\"5289\">\r\n\x3Cp data-start=\"5088\" data-end=\"5289\">\x3Cstrong data-start=\"5088\" data-end=\"5116\">Interoperability Matters\x3C/strong>\x3Cbr data-start=\"5116\" data-end=\"5119\" />AI has to work with existing systems, not replace them. Our platform plugs into what you already use—core systems, CRMs, data warehouses—and builds intelligence on top.\x3C/p>\r\n\x3C/li>\r\n \t\x3Cli>\x3Cstrong data-start=\"5088\" data-end=\"5116\">Deployment must be simplified\x3C/strong>\x3Cbr data-start=\"5116\" data-end=\"5119\" />Simplifying the path to GoLive, while considering various stakeholder concerns, is critical. 
Our deployment model is battle-tested with baked in security and performance.\x3C/li>\r\n\x3C/ul>\r\n\r\n\x3Chr data-start=\"5291\" data-end=\"5294\" />\r\n\r\n\x3Ch3 data-start=\"5296\" data-end=\"5339\">\x3Cstrong data-start=\"5300\" data-end=\"5339\">Conclusion: Start Smart, Scale Fast\x3C/strong>\x3C/h3>\r\n\x3Cp data-start=\"5341\" data-end=\"5600\">Start where ROI is fastest (Level 1), move toward strategic analysis (Level 2), and then go deep with functional use cases (Level 3).\x3C/p>\r\n\x3Cp data-start=\"738\" data-end=\"966\">This staged approach ensures you build momentum, get results at each phase, and avoid the pitfalls of a “big bang” rollout. We're here to guide you at every step—with tech that’s production-ready and built for your unique needs.\x3C/p>",can_share_on_x:!0,can_share_on_facebook:!0,can_share_on_linkedin:!0,blog_image:"https://blog.tursio.ai/wp-content/uploads/2025/07/5.png",author:"Murali Mahalingam",author_image:void 0,published_date:"July 1, 2025",tags:"Credit Unions, Generative AI",description:"AI adoption for credit unions works best as a staged journey, not a leap. Start small with Level 1—quick, natural language insights for immediate ROI. Level 2 expands to advanced financial analysis across departments, enabling data-driven strategy, risk management, and growth tracking. Level 3 brings department-level simulations, scenario planning, and deep operational insights for Lending, Risk, and Member Services. Success requires strong data governance, simplified deployment, and empowering non-technical users. 
By starting smart, scaling fast, and aligning AI with business priorities, credit unions can unlock actionable insights, improve decision-making, and maximize value without the pitfalls of a “big bang” rollout."}},$R[21]={id:1177,slug:"searching-clinical-data-using-generative-ai",acf:$R[22]={title:"Searching Clinical Data using Generative AI",content:"\x3Ch1>\x3C/h1>\r\n\x3Ch1>Background\x3C/h1>\r\nHealthcare is a fertile ground for generative AI applications, with clinical data being the common denominator across patients, payers, and providers. Searching clinical data and running analyses over it can lead to better outcomes in patient diagnosis and care. Unfortunately, clinical data is often messy, making the search for relevant information challenging. Efforts to standardize the clinical data with medical codes go way back to the 1850s, and successive iterations have improved the categorization details. For example, the International Classification of Diseases (ICD) categorizes disease descriptions, with ICD-10 being the most prevalent and ICD-11 now getting adopted.\r\n\r\nICD codes follow a tree-like structure, beginning with broad categories such as “respiratory diseases” and branching into specific conditions like “chronic bronchitis” or “asthma”. While this organization helps group related illnesses logically, it complicates the search process, particularly when the everyday medical terminology doctors use doesn't align with the official language in the coding system. In contrast to code assignment, which is a one-to-one mapping from each disease or drug to a single standardized code, searching is a one-to-many problem where users look for broader families of diseases or drugs.\r\n\r\n \r\n\x3Ch1>SearchAI for Clinical Data\x3C/h1>\r\nOur solution, \x3Cstrong>SearchAI\x3C/strong>, enables generative AI-powered patient search on clinical databases. 
For example, a physician can select a specific ICD code dataset and enter patient symptoms or diagnoses (e.g., \x3Cem>“fever and cough”\x3C/em>) into the search bar; the system then retrieves relevant ICD codes related to respiratory infections, helping the physician reach an accurate diagnosis more quickly. Unlike simple SQL query generators, SearchAI uses natural language processing (NLP) techniques to interpret complex medical queries. It extracts key medical concepts and retrieves a broad set of relevant medical codes based on contextual and ontological understanding. Overall, our solution focuses on three core ideas:\r\n\x3Col>\r\n \t\x3Cli>\x3Cstrong>Boolean Decomposition:\x3C/strong> We train small models to decompose patient search queries into the underlying Boolean logic.\x3C/li>\r\n \t\x3Cli>\x3Cstrong>Ontology-Aware Navigation:\x3C/strong> We train hierarchical models to traverse coding ontologies while preserving their structural relationships.\x3C/li>\r\n \t\x3Cli>\x3Cstrong>Instance-Specific Tuning: \x3C/strong>We tune the hierarchies for the specific database instance to narrow down the scope of errors and make the patient search better tailored.\x3C/li>\r\n\x3C/ol>\r\nTo illustrate, users can ask the following kinds of patient queries, and SearchAI will systematically process them:\r\n\x3Cul>\r\n \t\x3Cli>Show sepsis patients.\x3C/li>\r\n \t\x3Cli>Show patients who are diagnosed with Anemia.\x3C/li>\r\n \t\x3Cli>Show patients who have external causes of abnormal reactions for surgical operations.\x3C/li>\r\n \t\x3Cli>Show patients with external causes of falls on the same level as slipping.\x3C/li>\r\n \t\x3Cli>Show patients with chronic ischemic heart disease.\x3C/li>\r\n \t\x3Cli>Show patients who are on psychoactive substance use.\x3C/li>\r\n \t\x3Cli>Show patients diagnosed with Type 2 diabetes mellitus with diabetic nephropathy.\x3C/li>\r\n \t\x3Cli>Show patients diagnosed with prediabetes who undergo drug abuse 
counseling.\x3C/li>\r\n\x3C/ul>\r\nSearchAI introduces a modern way of interacting with clinical data that is fast and easy. The idea is to save valuable time and arrive at better outcomes for all stakeholders.\r\n\r\n \r\n\x3Ch1>Hierarchical models\x3C/h1>\r\nCoding ontologies were originally designed for human interpretation, and automating them using AI requires several adjustments. Below, we briefly describe the hierarchical models for traversing the coding ontology. Readers can refer to the full paper for the detailed algorithms.\r\n\x3Cul>\r\n \t\x3Cli>\x3Cstrong>Default Hierarchical Predictor\x3C/strong>: The baseline method follows the standard top-down traversal of the medical code hierarchy. It predicts codes by navigating through parent-child relationships in the original structure. While this approach provides a reasonable starting point and achieves moderate accuracy, it often fails when the hierarchy contains semantic gaps or inconsistent relationships between parent and child nodes.\x3C/li>\r\n \t\x3Cli>\x3Cstrong>Hierarchical Flattening\x3C/strong>: To address the above limitations, we introduced a restructuring technique that adjusts hierarchical depths based on ICD code descriptions. Our method targets ICD codes that are unreachable through conventional top-down traversal, often because their names are semantically disconnected from their parent codes or consist of overly generic, single-word labels. By re-aligning these outlier codes into a more coherent structure, the system can interpret them more accurately.\x3C/li>\r\n \t\x3Cli>\x3Cstrong>Hybrid Approach\x3C/strong>: To further improve flexibility and accuracy, we designed a hybrid method combining structured and unstructured search strategies. It begins with a traditional top-down traversal but dynamically switches to a randomized flat search when the path becomes uninformative or ambiguous. 
This adaptive mechanism allows the model to escape rigid hierarchical constraints and intelligently jump to more promising starting points, achieving a better balance between precision and coverage.\x3C/li>\r\n\x3C/ul>\r\nOur hierarchical models significantly enhance the machine interpretability of medical codes, making them more accurate and accessible.\r\n\r\n \r\n\x3Ch1>Results\x3C/h1>\r\nWe evaluated SearchAI on both production and publicly available Medicare fee-for-service (FFS) datasets. We generated search queries for all ICD-10 codes and measured the proportion of cases where SearchAI successfully retrieved the intended ICD-10 code. The figures below show the result.\r\n\r\n \r\n\x3Ch3>Accuracy\x3C/h3>\r\n \r\n\r\n\x3Cimg class=\"alignnone wp-image-444\" src=\"https://blog.tursio.ai/wp-content/uploads/2025/05/Screenshot-2025-05-13-at-3.21.48 PM-300x183.png\" alt=\"\" width=\"400\" height=\"244\" data-wp-editing=\"1\" />\x3Cimg class=\"alignnone wp-image-445\" src=\"https://blog.tursio.ai/wp-content/uploads/2025/05/Screenshot-2025-05-13-at-3.22.33 PM-300x181.png\" alt=\"\" width=\"400\" height=\"242\" />\r\n\r\nFigure 1a and 1b: SearchAI accuracy on FFS and production datasets.\r\n\r\n \r\n\r\nWe see that the default hierarchical predictor has lower accuracies of 67.35% and 60% on FFS and production datasets, respectively. However, the accuracy improves significantly with flattened and hybrid variants of the algorithm. For the FFS dataset, accuracies reached 99% and 98.63%, while for the production dataset, the accuracies were 98.3% and 98.6%. These are promising numbers with measurable improvements in accuracy.\r\n\r\nSearchAI is robust to semantic variations of the search queries, with 79.86% and 88.23% accuracies on FFS and production datasets when the above queries were rephrased using ChatGPT. 
SearchAI also has low latency, ranging from a few milliseconds to a hundred milliseconds in the worst case, and it scales well with the dataset sizes.\r\n\r\n \r\n\x3Ch1>Extensions\x3C/h1>\r\nSearchAI also applies to other medical codes. Specifically, we tested the following:\r\n\x3Cul>\r\n \t\x3Cli>\x3Cstrong>National Drug Codes (NDC)\x3C/strong>: NDCs uniquely identify medications and are essential for pharmacy operations, electronic prescribing, and billing. We are integrating NDC support by fine-tuning our model to understand drug formulations, dosages, and brand/generic variations. With this capability, clinicians and researchers can enter queries like “\x3Cem>500mg oral amoxicillin” \x3C/em>and receive precise NDC code mappings.\x3C/li>\r\n \t\x3Cli>\x3Cstrong>Current Procedural Terminology (CPT) codes\x3C/strong>: CPT codes represent medical procedures and services. Accurate CPT coding is critical for documentation, billing, and reimbursement. We are enhancing SearchAI to accurately interpret procedural language and retrieve the most clinically relevant CPT codes, helping ensure consistency in medical reporting.\x3C/li>\r\n \t\x3Cli>\x3Cstrong>Modifier codes\x3C/strong>: These codes add specificity to CPT codes by capturing variations in how procedures are performed (e.g., bilateral performance, repeat procedures). SearchAI will assist medical professionals in identifying the appropriate modifiers based on procedure context, improving documentation accuracy.\x3C/li>\r\n \t\x3Cli>\x3Cstrong>Merit-based Incentive Payment System (MIPS) codes\x3C/strong>: MIPS codes track performance metrics related to healthcare quality, payment cost, and patient outcomes. 
This allows SearchAI to interpret clinical and policy-driven queries (e.g., “\x3Cem>preventive care for chronic illness”\x3C/em>) and map them to appropriate MIPS categories.\x3C/li>\r\n\x3C/ul>\r\n\x3Ch1>\x3C/h1>\r\n \r\n\x3Ch1>Parting thoughts\x3C/h1>\r\nThe healthcare industry can benefit from various process-related efficiencies, and SearchAI is a meaningful step in that direction. We demonstrated how generative AI models can help interpret natural language queries, traverse complex medical ontologies, and return accurate, hierarchical search results. We started from ICD codes and extended the approach to other codes, including CPT, MIPS, and Modifiers.\r\n\r\nUltimately, we envision SearchAI as a search engine for clinical data, one that bridges the gap between technical complexity and clinical usability.\r\n\r\n \r\n\r\n\x3Cstrong>For more insights, please see our full research on arXiv:\x3C/strong>\r\n\r\n\x3Ca href=\"https://arxiv.org/abs/2505.24090\">https://arxiv.org/abs/2505.24090 \x3C/a>",can_share_on_x:!0,can_share_on_facebook:!0,can_share_on_linkedin:!0,blog_image:"https://blog.tursio.ai/wp-content/uploads/2025/07/6.png",author:"Karan Hanswadkar",author_image:void 0,published_date:"June 6, 2025",tags:"Clinical Research, Generative AI",description:"Healthcare data is messy, and querying it effectively is critical for better patient outcomes. SearchAI leverages generative AI to enable natural language search over clinical databases, including ICD, CPT, NDC, MIPS, and modifier codes. Using Boolean decomposition, ontology-aware navigation, and instance-specific tuning, SearchAI interprets complex medical queries, maps them to hierarchical codes, and returns accurate results. Hierarchical flattening and hybrid approaches enhance precision and coverage, achieving up to 99% accuracy. 
Fast, robust, and semantically aware, SearchAI empowers clinicians and researchers to quickly access actionable information, bridging the gap between technical complexity and clinical usability, and transforming how healthcare data is explored."}},$R[23]={id:1178,slug:"tursio-product-updates-towards-100x-knowledge-workers",acf:$R[24]={title:"Tursio Product Updates: Towards 100x knowledge workers",content:"\x3Cspan data-contrast=\"auto\">Knowledge workers are the backbone of modern enterprises. Yet, conversations with our customers reveal that knowledge workers struggle with productivity when it comes to enterprise data. Tursio empowers them with quick analyses using generative AI, and our latest product release is aimed at reducing friction and making Tursio AI work end-to-end for knowledge workers.\x3C/span>\r\n\r\n \r\n\x3Ch3>Auto Mode – Ask Better, Ask Faster\x3C/h3>\r\n\x3Cspan data-contrast=\"auto\">Querying data sources can be overwhelming. We heard feedback that knowledge workers are often unaware of the underlying data schemas and structures, making it difficult for them to ask for the right piece of information. Therefore, we are introducing “Auto” mode to guide users through the semantic model and help them construct questions that retrieve the desired data. Tursio interprets these questions behind the scenes and generates the backend queries. \x3C/span>\x3Cspan data-ccp-props=\"{"335559738":240,"335559739":240}\"> \x3C/span>\r\n\r\n\x3Cspan data-contrast=\"auto\">“Auto” mode makes it simple. Instead of manually figuring out what to ask, just start typing your question. Tursio will walk you through it—step by step—using your semantic model. 
You get the right answers every single time without needing to know how to write perfect queries.\x3C/span>\x3Cspan data-ccp-props=\"{"335559738":240,"335559739":240}\"> \x3C/span>\r\n\r\n\x3Cspan data-contrast=\"auto\">For example, a loan analyst asking \x3C/span>\x3Cspan data-contrast=\"auto\">“S\x3C/span>\x3Ci>\x3Cspan data-contrast=\"auto\">how down payment trends for Accent, Accord, and Taurus between 2010–2013\x3C/span>\x3C/i>\x3Cspan data-contrast=\"auto\">” gets the trend data along with a detailed analysis without any syntax errors or second-guessing.\x3C/span>\x3Cspan data-ccp-props=\"{"335559738":240,"335559739":240}\"> \x3C/span>\r\n\x3Ch3>\x3C/h3>\r\n\x3Ch3>Analyze Mode – Explore Deeper\x3C/h3>\r\n\x3Cspan data-contrast=\"auto\">Knowledge workers have more questions once they see an answer, a curiosity to dig deeper with open-ended analyses. Therefore, we are introducing “Analyze” mode to combine accurate data search with creative reasoning for holistic perspectives and fast decisions. \x3C/span>\x3Cspan data-ccp-props=\"{}\"> \x3C/span>\r\n\r\n\x3Cspan data-contrast=\"auto\">The “Analyze” mode\x3C/span>\x3Cspan data-contrast=\"auto\"> operates on the data already retrieved and helps users refine the answer, draw more insights, or simply combine the data with external knowledge available to the language models. Essentially, it helps leverage the creativity of language models while still grounding the response strictly in the facts.\x3C/span>\x3Cspan data-ccp-props=\"{}\"> \x3C/span>\r\n\r\n\x3Cspan data-contrast=\"auto\">For e\x3C/span>\x3Cspan data-contrast=\"auto\">xample, a loan strategist can ask for a forward-looking perspective, e.g., \x3C/span>\x3Cspan data-contrast=\"auto\">“\x3C/span>\x3Ci>\x3Cspan data-contrast=\"auto\">What should the expansion strategy be based on down payments and car model popularity?\x3C/span>\x3C/i>\x3Cspan data-contrast=\"auto\">”. 
The response is still grounded in facts from their portfolio.\x3C/span>\r\n\x3Ch3>\x3C/h3>\r\n\x3Ch3>Persistent Sharing – Never Lose an Insight\x3C/h3>\r\n\x3Cspan data-contrast=\"auto\">Insights are valuable and knowledge workers want to share all the time. We got feedback that people want to share results with others for both verification and consumption purposes. Therefore, we now persist cached results into permalinks that can be shared and accessed anytime. Tursio still ensures access control to the underlying data.\x3C/span>\x3Cspan data-ccp-props=\"{"335559738":240,"335559739":240}\"> \x3C/span>\r\n\r\n\x3Cspan data-contrast=\"auto\">Thus, important insights never vanish. Send it, save it, come back to it anytime. Whether it’s leadership, teams, or clients—you can now keep everyone on the same page.\x3C/span>\x3Cspan data-ccp-props=\"{"335559738":240,"335559739":240}\"> \x3C/span>\r\n\x3Ch3>\x3C/h3>\r\n\x3Ch3 aria-level=\"2\">Smarter PDF Export – Clean. Clear. Contextual.\x3C/h3>\r\n\x3Cspan data-contrast=\"auto\">We have revamped PDF report generation with support for large tables and wide layouts. The report truncates data results smoothly and backs it up with summarized highlights. \x3C/span>\x3Cspan data-ccp-props=\"{"335559738":240,"335559739":240}\"> \x3C/span>\r\n\r\n\x3Cspan data-contrast=\"auto\">Thus, users can present very persuasive and insightful reports to their stakeholders, while also creating the artifacts for future reference. \x3C/span>\x3Cspan data-ccp-props=\"{"335559738":240,"335559739":240}\"> \x3C/span>\r\n\x3Ch3>\x3C/h3>\r\n\x3Ch3>User Recommended – Power of the Crowd\x3C/h3>\r\n\x3Cspan data-contrast=\"auto\">Knowledge workers do not work alone, and they want to recommend things they found to each other. 
We have now made it easier to surface upvoted insights as “User Recommended” questions, making asking questions a collaborative effort.\x3C/span>\x3Cspan data-ccp-props=\"{"335559738":240,"335559739":240}\"> \x3C/span>\r\n\r\n\x3Cspan data-contrast=\"auto\">Users can discover questions their peers are asking and explore them for their own additional context.\x3C/span>\r\n\x3Ch3>\x3C/h3>\r\n\x3Ch3>Research Mode – Feeling Lucky\x3C/h3>\r\n\x3Cspan data-contrast=\"auto\">Finally, we introduce a “Research” mode for people who may not know what to ask or who would like to be surprised by what comes out. This is both experimental and open-ended, giving users the chance to try new things.\x3C/span>\x3Cspan data-ccp-props=\"{"335559738":240,"335559739":240}\"> \x3C/span>\r\n\r\n\x3Cspan data-contrast=\"auto\">For example, a CLO may ask “How is my loan portfolio doing?”, an open-ended inquiry meant to be followed up with more specific questions.\x3C/span>\x3Cspan data-ccp-props=\"{"335559738":240,"335559739":240}\"> \x3C/span>\r\n\x3Ch3>\x3C/h3>\r\n\x3Ch3>Summary\x3C/h3>\r\n\x3Cspan data-contrast=\"auto\">This new product release is designed to make enterprise data accessible and actionable for knowledge workers—at every stage of their workflow.\x3C/span>\x3Cspan data-ccp-props=\"{"134233117":false,"134233118":false,"335551550":0,"335551620":0,"335559738":240,"335559739":240}\"> \x3C/span>\x3Cspan data-contrast=\"auto\">With Auto, Analyze, and Research modes, plus better sharing and smarter exports, Tursio helps you go from questions to decisions faster.\x3C/span>\x3Cspan data-ccp-props=\"{"134233117":false,"134233118":false,"335551550":0,"335551620":0,"335559738":240,"335559739":240}\"> \x3C/span>\r\n\r\n\x3Cspan data-contrast=\"auto\">Reach out to us if you want to try Tursio on your data. We would love to learn what you think. 
This is just the beginning — stay tuned for more updates!\x3C/span>\x3Cspan data-ccp-props=\"{"134233117":false,"134233118":false,"335551550":0,"335551620":0,"335559738":240,"335559739":240}\"> \x3C/span>\r\n\r\n\x3Cspan data-ccp-props=\"{"134233117":false,"134233118":false,"335551550":0,"335551620":0,"335559738":240,"335559739":240}\"> \x3C/span>\r\n\r\n[video width=\"1280\" height=\"720\" mp4=\"https://blog.tursio.ai/wp-content/uploads/2025/05/Product-Updates.mp4\"][/video]",can_share_on_x:!0,can_share_on_facebook:!0,can_share_on_linkedin:!0,blog_image:"https://blog.tursio.ai/wp-content/uploads/2025/07/7.png",author:"Shraddhaa Khanna",author_image:void 0,published_date:"May 13, 2025",tags:"Spotlight Stories",description:"Knowledge workers often struggle to get actionable insights from enterprise data. Tursio’s latest release addresses this with Auto, Analyze, and Research modes. Auto guides users through semantic models, generating accurate queries without requiring technical expertise. Analyze lets users dig deeper, combining facts with creative reasoning for holistic insights. Research enables open-ended exploration for discovery. Persistent sharing and smarter PDF exports ensure insights are never lost and can be presented clearly. User-recommended questions foster collaboration across teams. Together, these features make Tursio an end-to-end tool, transforming questions into decisions faster and empowering knowledge workers to fully leverage their data."}},$R[25]={id:1179,slug:"from-queries-to-conversations-in-sql-server",acf:$R[26]={title:"From Queries to Conversations in SQL Server",content:"I spent a meaningful chapter of my life at Microsoft, working closely with brilliant minds and incredible technology. 
Tools like SQL Server Analysis Services (SSAS) and SQL Server Machine Learning Services (MLS) were part of my everyday toolkit—and I’ll be the first to say they’re excellent at what they’re built for: structured analytics, business intelligence, and model execution within the SQL Server ecosystem.\r\n\r\nBut here’s the thing. The way people interact with data has evolved. Business users don’t want to write Data Analysis Expressions (DAX) or Multidimensional Expressions (MDX). They don’t want to call a data team just to ask, “How is the customer retention metric between last year and this year?” They want answers—accurately, instantly, intuitively, and in their own words.\r\n\r\n\x3Cstrong>Microsoft SQL Server Analysis Services\x3C/strong>\r\n\r\nMicrosoft created SQL Server Analysis Services (SSAS) to provide a powerful online analytical processing (OLAP) and data mining tool that enables organizations to analyze and make sense of complex information spread across multiple databases or disparate data sources. SSAS was designed to complement the SQL Server relational database engine, which excels at transactional processing but is not optimized for complex analytical queries and aggregations on large volumes of data.\r\n\r\nThe origins of SSAS trace back to Microsoft's 1996 acquisition of OLAP technology from Panorama Software, aiming to enter the OLAP market and enhance its business intelligence offerings. The first version, OLAP Services, shipped with SQL Server 7.0 in 1998, providing multidimensional analysis capabilities. Microsoft renamed it Analysis Services in SQL Server 2000, adding data mining features to complete its analytical capabilities and support richer BI solutions.\r\n\r\nThe core motivation was to enable enterprises to build enterprise-grade semantic data models on top of data warehouses, facilitating scalable, fast, and flexible data analysis and reporting. 
SSAS supports both multidimensional and tabular models, allowing organizations to choose the best approach based on their data volume and analytical needs. This semantic layer enables client applications like Power BI and Excel to deliver deeper insights with better performance than querying raw data sources directly.\r\n\r\n\x3Cstrong>Microsoft SQL Server Machine Learning Services\x3C/strong>\r\n\r\nSQL Server Machine Learning Services is a feature within SQL Server that enables organizations to run Python and R scripts directly inside the database, leveraging open-source machine learning frameworks alongside relational data. This in-database execution removes the need to move large datasets across systems, streamlining advanced analytics and machine learning workflows.\r\n\r\nMicrosoft created SQL Server Machine Learning Services (originally introduced as R Services in SQL Server 2016) to bring advanced analytics and machine learning capabilities directly into the SQL Server database engine. The main reasons were to enable in-database processing of R and later Python scripts, eliminating the need to move large volumes of data out of the database for analysis, which reduces latency, enhances security, and simplifies compliance.\r\n\r\nBy integrating machine learning into the database, Microsoft aimed to allow data scientists and developers to prepare data, train, evaluate, and deploy machine learning models within the same environment where the data resides. This integration supports full dataset processing at scale, real-time scoring, and operationalization of models through stored procedures, making predictive analytics more efficient and accessible to enterprises.\r\n\r\n \r\n\r\n\x3Cstrong>The Natural Language Shift: A New Standard for Data Access\x3C/strong>\r\n\r\nToday, data needs to be conversational. 
Whether it’s a frontline manager or a marketing analyst, the expectation is this: \x3Cem>“I should be able to ask a question in plain English—and get a meaningful answer from my data.”\x3C/em>\r\n\r\nNatural Language Search (NLS) has moved from being a “nice-to-have” to a critical capability. But the legacy infrastructure many companies still rely on—SSAS, MLS, and similar technologies—just wasn’t designed for this.\r\n\r\n\x3Cstrong>Why SSAS and MLS Struggle with Natural Language Search\x3C/strong>\r\n\x3Col>\r\n \t\x3Cli>\x3Cstrong>No Native NLP Capabilities\x3C/strong>\x3C/li>\r\n\x3C/ol>\r\n\x3Cp style=\"text-align: left;\">SSAS is phenomenal with structured data models. But ask it to interpret a sentence like \x3Cem>“Which regions had the highest YoY revenue growth last quarter?”\x3C/em>—and it won’t know where to begin. MLS allows Python or R scripting, but you’re left building natural language logic from scratch. No semantic parsing, no pre-built NLP features.\x3C/p>\r\n\r\n\x3Col start=\"2\">\r\n \t\x3Cli>\x3Cstrong> Rigid and Technical Interfaces\x3C/strong>\x3C/li>\r\n\x3C/ol>\r\nMDX and DAX aren’t exactly user-friendly, especially for non-technical users. MLS scripts, while powerful, require explicit task programming. There’s no fluidity—no way to dynamically interpret the variety and ambiguity of human language.\r\n\x3Col start=\"3\">\r\n \t\x3Cli>\x3Cstrong> No Understanding of Intent or Context\x3C/strong>\x3C/li>\r\n\x3C/ol>\r\nHumans are nuanced. We say things like “top-performing products” or “last month’s churn.” Systems like SSAS and MLS have no intent detection or semantic recognition. They can’t map synonyms or understand conversational context.\r\n\x3Col start=\"4\">\r\n \t\x3Cli>\x3Cstrong> Cumbersome Handling of Unstructured Data\x3C/strong>\x3C/li>\r\n\x3C/ol>\r\nMost organizations now rely on both structured and unstructured data—emails, CRM notes, chat logs. SSAS is optimized for cubes and tables. 
MLS, on the other hand, can technically process text, but only through painstaking integration with external NLP libraries.\r\n\x3Col start=\"5\">\r\n \t\x3Cli>\x3Cstrong> Performance and Scalability Issues\x3C/strong>\x3C/li>\r\n\x3C/ol>\r\nModern NLS systems need speed. Parsing, entity recognition, query generation—these must happen in near real time. SSAS and MLS weren’t built for that. Heavy NLP workloads within MLS, in particular, can strain database resources.\r\n\x3Col start=\"6\">\r\n \t\x3Cli>\x3Cstrong> Disconnected User Experience\x3C/strong>\x3C/li>\r\n\x3C/ol>\r\nToday’s data users expect to ask questions through intuitive natural language search interfaces similar to those from OpenAI and Perplexity, or through integrations with productivity applications like Slack, Teams, or even voice assistants. SSAS and MLS are tied to BI dashboards and lack native APIs for these modern channels.\r\n\r\n \r\n\r\n\x3Cstrong>Summary: Where Legacy Tools Miss the Mark\x3C/strong>\r\n\x3Ctable>\r\n\x3Cthead>\r\n\x3Ctr>\r\n\x3Ctd>\x3Cstrong>Capability\x3C/strong>\x3C/td>\r\n\x3Ctd>\x3Cstrong>SSAS/MLS\x3C/strong>\x3C/td>\r\n\x3Ctd>\x3Cstrong>Natural Language Search Needs\x3C/strong>\x3C/td>\r\n\x3Ctd>\x3Cstrong>Key Gap\x3C/strong>\x3C/td>\r\n\x3C/tr>\r\n\x3C/thead>\r\n\x3Ctbody>\r\n\x3Ctr>\r\n\x3Ctd>NLP Understanding\x3C/td>\r\n\x3Ctd>❌ Not available\x3C/td>\r\n\x3Ctd>✅ Native NLP support\x3C/td>\r\n\x3Ctd>Cannot interpret human language\x3C/td>\r\n\x3C/tr>\r\n\x3Ctr>\r\n\x3Ctd>Intent Recognition\x3C/td>\r\n\x3Ctd>❌ No semantic engine\x3C/td>\r\n\x3Ctd>✅ Contextual and adaptive\x3C/td>\r\n\x3Ctd>No intent or contextual parsing\x3C/td>\r\n\x3C/tr>\r\n\x3Ctr>\r\n\x3Ctd>Unstructured Data Handling\x3C/td>\r\n\x3Ctd>⚠️ Manual & inefficient\x3C/td>\r\n\x3Ctd>✅ Native support for mixed data\x3C/td>\r\n\x3Ctd>Poor text data support\x3C/td>\r\n\x3C/tr>\r\n\x3Ctr>\r\n\x3Ctd>Real-Time Performance\x3C/td>\r\n\x3Ctd>❌ Not optimized\x3C/td>\r\n\x3Ctd>✅ Millisecond-level 
responsiveness\x3C/td>\r\n\x3Ctd>High latency, not scalable for NLS\x3C/td>\r\n\x3C/tr>\r\n\x3Ctr>\r\n\x3Ctd>Query Flexibility\x3C/td>\r\n\x3Ctd>❌ Requires MDX/DAX/scripts\x3C/td>\r\n\x3Ctd>✅ Open, intuitive input\x3C/td>\r\n\x3Ctd>No dynamic translation of questions to queries\x3C/td>\r\n\x3C/tr>\r\n\x3Ctr>\r\n\x3Ctd>Integration with Chat/Voice Systems\x3C/td>\r\n\x3Ctd>❌ Not available\x3C/td>\r\n\x3Ctd>✅ Seamless conversational UX\x3C/td>\r\n\x3Ctd>No native chatbot or assistant integrations\x3C/td>\r\n\x3C/tr>\r\n\x3C/tbody>\r\n\x3C/table>\r\n \r\n\r\n \r\n\r\nAt Tursio, we have set out with a bold mission: \x3Cstrong>To build the AI for future 100x workers.\x3C/strong>\r\n\r\nHere’s how we’re doing it:\r\n\r\n\x3Cstrong>Accuracy You Can Trust\x3C/strong>\r\n\r\nTursio’s proprietary NLP engine is trained on business language across domains—sales, finance, marketing, operations. We use semantic models that don’t just understand keywords, but the \x3Cem>meaning\x3C/em> behind your question. This ensures your query always maps to the right dataset, metrics, and timeframes.\r\n\r\n\x3Cstrong>Speed That Matches Your Workflow\x3C/strong>\r\n\r\nOur platform delivers real-time responses, optimized to run at scale. Whether your data lives in SQL Server, Snowflake, or BigQuery, Tursio interprets and executes your queries interactively. Ask a question. Get an accurate answer. Keep moving.\r\n\r\n\x3Cstrong>Enterprise-Grade Security\x3C/strong>\r\n\r\nAs a former Microsoft engineer, I know how critical trust and data privacy are. That’s why we’ve built Tursio with enterprise-grade encryption, fine-grained access control, and full audit trails. Your data stays protected, and your governance rules stay intact.\r\n\r\n\x3Cstrong>Structured Data, Unleashed\x3C/strong>\r\n\r\nTursio is purpose-built to work with structured datasets—across cloud data warehouses like Snowflake and BigQuery, as well as on-premises systems like SQL Server. 
We eliminate the friction between data and decision-makers by turning rigid queries into human conversations.\r\n\r\n\x3Cstrong>Empowering Data Access Across the Organization\x3C/strong>\r\n\r\nTursio isn’t just an NLS tool—it’s a transformation layer. Imagine:\r\n\x3Cul>\r\n \t\x3Cli>A \x3Cstrong>sales rep\x3C/strong> asking, \x3Cem>“What were my top accounts last quarter by revenue?”\x3C/em>—and getting an instant answer.\x3C/li>\r\n \t\x3Cli>A \x3Cstrong>finance analyst\x3C/strong> typing, \x3Cem>“Compare spend vs. budget for Q1 across departments,”\x3C/em>—no SQL analyst needed.\x3C/li>\r\n \t\x3Cli>A \x3Cstrong>customer success manager\x3C/strong> querying, \x3Cem>“How many clients churned after their renewal date?”\x3C/em>—without opening a dashboard.\x3C/li>\r\n\x3C/ul>\r\nThis is what data accessibility should look like. \x3Cstrong>No code. No delays. No bottlenecks.\x3C/strong>\r\n\r\n\x3Cstrong>Conclusion: The Future of Querying Is NLS\x3C/strong>\r\n\r\nFor folks who are still navigating SSAS cubes and ML scripts—I see you. I’ve been there, and I understand the immense value those tools offer. But times are changing.\r\n\r\n\x3Cstrong>Natural Language Search is not a future feature. It’s a present-day necessity.\x3C/strong>\r\n\r\nAnd legacy systems weren’t built for this new world.\r\n\r\nAt Tursio, we’re not here to replace everything you’ve built—we’re here to make it smarter, faster, and radically more accessible. 
We want to unlock the full potential of your structured data and enable every employee—whether they code or not.\r\n\r\nSo, here’s the real question:\r\n\r\nCan your current system answer your next business question the moment you think of it?\r\n\r\nIf not, it’s time to see what \x3Ca href=\"https://www.tursio.ai/\">Tursio\x3C/a> can do.",can_share_on_x:!0,can_share_on_facebook:!0,can_share_on_linkedin:!0,blog_image:"https://blog.tursio.ai/wp-content/uploads/2025/05/20250509_1402_Natural-Language-Data-Interaction_simple_compose_01jtvdkcf2ffrbtxa798n3j898-1.png",author:"Rony Chatterjee",author_image:void 0,published_date:"May 5, 2025",tags:"Databases, Generative AI",description:"Tursio bridges this gap, turning structured enterprise data into real-time, conversational intelligence. Our NLP engine understands intent, delivers fast and accurate answers, and works securely across systems. Tursio empowers every employee to query data naturally, making business intelligence intuitive, accessible, and actionable."}},$R[27]={id:1180,slug:"ai-powered-querying-for-100x-knowledge-workers",acf:$R[28]={title:"AI-powered Querying for 100x Knowledge Workers",content:"\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\r\n\x3Ca class=\"ag fe\" href=\"https://blog.tursio.ai/power-of-asking-questions-c1472ca815fe\" target=\"_blank\" rel=\"noopener ugc nofollow\">Asking questions\x3C/a> is one of the fundamental activities for modern knowledge workers. They can be analysts, researchers, architects, managers, or leaders across teams in product, sales, marketing, finance, business development, etc., and their jobs require high-level thinking and information processing. 
Finding answers quickly and with facts is crucial for the knowledge worker function.\r\n\x3Cp id=\"fe86\" class=\"pw-post-body-paragraph yr ys tg yt b yu zm yw yx yy zn za zb ry zo zd ze sb zp zg zh se zq zj zk zl ew bj\" data-selectable-paragraph=\"\">Unfortunately, asking questions has traditionally been categorized as business intelligence (BI), and it has defaulted to building dashboards—a slow and painful process that still doesn’t answer the question. Building dashboards is also a highly technical activity, making it especially difficult for the vast majority of non-technical users who could easily outnumber their technical peers by 100x or more.\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"dd80\" class=\"yi yj tg be yk lu yl lv lx ly ym lz mb jc yn jd jg me yo mf mi mj yp mk mn yq bj\" data-selectable-paragraph=\"\">Generational Productivity\x3C/h1>\r\n\x3Cp id=\"4682\" class=\"pw-post-body-paragraph yr ys tg yt b yu yv yw yx yy yz za zb ry zc zd ze sb zf zg zh se zi zj zk zl ew bj\" data-selectable-paragraph=\"\">Interestingly, we are in the midst of an AI transformation that promises a generational step up in worker productivity, similar to what happened back in the industrial age, or more recently with computers, the internet, and the cloud. AI can now provide intelligence for laborious, repetitive work while the workers can focus on their net new value addition. This means that knowledge workers should focus on their questions and analysis rather than on figuring out the data, queries, or dashboards. All these are supposed to be the laborious and repetitive implementation details that AI will take care of.\x3C/p>\r\n\x3Cp id=\"942b\" class=\"pw-post-body-paragraph yr ys tg yt b yu zm yw yx yy zn za zb ry zo zd ze sb zp zg zh se zq zj zk zl ew bj\" data-selectable-paragraph=\"\">AI adoption is already picking up pace. 
The recent \x3Ca class=\"ag fe\" href=\"https://x.com/tobi/status/1909251946235437514\" target=\"_blank\" rel=\"noopener ugc nofollow\">memo\x3C/a> from the CEO of Shopify underlines the fundamental expectation that employees should be using AI in their everyday work. However, Shopify is not the only one; companies of all sizes are thinking about how to leverage AI for their business. The goal is to do far more with far less, i.e., ambitious targets of boosting productivity by multiple orders of magnitude.\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"0fea\" class=\"yi yj tg be yk lu yl lv lx ly ym lz mb jc yn jd jg me yo mf mi mj yp mk mn yq bj\" data-selectable-paragraph=\"\">AI-powered Querying\x3C/h1>\r\n\x3Cp id=\"f071\" class=\"pw-post-body-paragraph yr ys tg yt b yu yv yw yx yy yz za zb ry zc zd ze sb zf zg zh se zi zj zk zl ew bj\" data-selectable-paragraph=\"\">So, how do knowledge workers turbocharge themselves in the AI age? How do they become 100x? They need to ask way more questions and get answers way faster using AI, i.e., \x3Cem class=\"akv\">AI-powered querying\x3C/em>. The goal should be to focus on high-level thinking, such as what new questions to ask, what new analyses to do, and what new interpretations to draw, while leaving all low-level mechanics to the AI. Tursio is building such an AI capability; talk to us to learn more.\x3C/p>\r\n\x3Cp id=\"0f1f\" class=\"pw-post-body-paragraph yr ys tg yt b yu zm yw yx yy zn za zb ry zo zd ze sb zp zg zh se zq zj zk zl ew bj\" data-selectable-paragraph=\"\">Now, one may wonder, what does AI mean for the good old world of BI? In all likelihood, BI as we know it is going to be dead. 
Hammering the daily questions using tedious dashboards is too unproductive for the modern workforce, and this entire process is long overdue for transformation, one that makes knowledge workers truly productive.\x3C/p>\r\n\x3Cp id=\"ce7b\" class=\"pw-post-body-paragraph yr ys tg yt b yu zm yw yx yy zn za zb ry zo zd ze sb zp zg zh se zq zj zk zl ew bj\" data-selectable-paragraph=\"\">Human productivity has come a long way, but it is fascinating to see that it still has so much room to grow. Possibly, it could even trigger a natural selection in our workplaces — intriguing, but only time will tell.\x3C/p>",can_share_on_x:!0,can_share_on_facebook:!0,can_share_on_linkedin:!0,blog_image:"https://blog.tursio.ai/wp-content/uploads/2025/04/growtika-nGoCBxiaRO0-unsplash.jpg",author:"Alekh Jindal",author_image:void 0,published_date:"April 15, 2025",tags:"Databases, Generative AI",description:"Asking questions is central for knowledge workers, yet traditional business intelligence tools—like dashboards—are slow, technical, and inefficient. Modern AI offers a generational productivity leap, automating repetitive tasks so workers can focus on analysis, interpretation, and asking better questions. AI-powered querying enables rapid, high-level insights, letting knowledge workers operate 100x more efficiently. Companies are increasingly adopting AI to do more with less, transforming workflows. 
In this new paradigm, conventional BI becomes obsolete, replaced by tools that empower humans to think, question, and act faster, unlocking unprecedented workplace productivity and reshaping the way knowledge-driven decisions are made."}},$R[29]={id:1181,slug:"why-is-it-hard-to-bet-on-ai",acf:$R[30]={title:"Why is it hard to bet on AI?",content:"\x3Ch1 id=\"ef20\" class=\"zq zr tu be zs lv zt lw ly lz zu ma mc jc zv jd jg mf zw mg mj mk zx ml mo zy bj\" data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Cp data-selectable-paragraph=\"\">Generative AI raises uncomfortable questions that often no one wants to answer. One is whether the AI bet is worth taking, or is it still la-la land? In this post, I want to touch upon this topic and unpack some perspectives from people in the field. Most of it is subjective to our experiences at \x3Ca class=\"ag fe\" href=\"https://www.tursio.ai/\" target=\"_blank\" rel=\"noopener ugc nofollow\">Tursio\x3C/a>, so it should be taken with a grain of salt.\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"ef20\" class=\"zq zr tu be zs lv zt lw ly lz zu ma mc jc zv jd jg mf zw mg mj mk zx ml mo zy bj\" data-selectable-paragraph=\"\">Newfound World\x3C/h1>\r\n\x3Cp id=\"9f29\" class=\"pw-post-body-paragraph yv yw tu yx b yy zz za zb zc aba ze zf so abb zh zi sr abc zk zl su abd zn zo zp ew bj\" data-selectable-paragraph=\"\">More of our world problems are unsolved today than we would like to believe. We are already past the first quarter of the twenty-first century, and yet most organizations we come across continue to grapple with staggering inefficiencies in their daily operations. From producing goods and delivering services to running sales, marketing, finance, operations, and customer service, these organizations rely on a considerable amount of human effort. 
Sadly, many of the tasks involved are insanely repetitive, and yet they consume valuable human resources, something that is getting more precious every passing day.\x3C/p>\r\n\x3Cp id=\"1d9a\" class=\"pw-post-body-paragraph yv yw tu yx b yy yz za zb zc zd ze zf so zg zh zi sr zj zk zl su zm zn zo zp ew bj\" data-selectable-paragraph=\"\">Scaling with better efficiencies is an age-old problem. During the medieval age, larger empires were considered stronger and more efficient. More landmass meant more resources that could be put to use. This changed to putting more people into the labor force during the industrial age and then to getting more machines on demand in the modern cloud computing age. The goal has always been to parallelize the work with more workers. Modern-day workers are knowledge workers who need to think and ask questions to get things done. Access to information, therefore, becomes the new resource, and Generative AI is showing promising signs of putting information in the hands of these knowledge workers.\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"da54\" class=\"zq zr tu be zs lv zt lw ly lz zu ma mc jc zv jd jg mf zw mg mj mk zx ml mo zy bj\" data-selectable-paragraph=\"\">The Dark Clouds\x3C/h1>\r\n\x3Cp id=\"91b5\" class=\"pw-post-body-paragraph yv yw tu yx b yy zz za zb zc aba ze zf so abb zh zi sr abc zk zl su abd zn zo zp ew bj\" data-selectable-paragraph=\"\">Interestingly, the promising new world of generative AI is already muddled in dark clouds. The market is rapidly getting flooded with new AI products and services every day. Especially, all major incumbents have infused “AI” throughout their products, sales, and marketing. 
The noise is overwhelming not just for the investors, who I see having little to no chance of figuring out how one AI differs from the other in a market that is more willing to go with the incumbents, but also for the customers, who I see grappling to understand where to extract the value and how to move the needle, and lastly for the builders, who I see having a hard time figuring out the killer app that works perfectly and is a no-brainer for ROI.\x3C/p>\r\n\x3Cp id=\"2e58\" class=\"pw-post-body-paragraph yv yw tu yx b yy yz za zb zc zd ze zf so zg zh zi sr zj zk zl su zm zn zo zp ew bj\" data-selectable-paragraph=\"\">A big part of the above din stems from the fact that generative AI is still in its nascent stages. Generative AI models are getting perfected every single day with new capabilities to make them more usable and reliable. Current models hardly work out-of-the-box, with a lot of work required from developers to craft them into applications that work.\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"2a64\" class=\"zq zr tu be zs lv zt lw ly lz zu ma mc jc zv jd jg mf zw mg mj mk zx ml mo zy bj\" data-selectable-paragraph=\"\">Horse Looking for a Carriage\x3C/h1>\r\n\x3Cp id=\"b2fc\" class=\"pw-post-body-paragraph yv yw tu yx b yy zz za zb zc aba ze zf so abb zh zi sr abc zk zl su abd zn zo zp ew bj\" data-selectable-paragraph=\"\">AI is seen as the exciting new magic, and so the first thing a customer asks, I hear, is “Show me something I don’t know.” In fact, I routinely find that customers are no longer impressed by traditional statistics and machine learning. They may even feel let down if the output is something they can understand easily. The expectation, a by-product of the hype, is to see something magical. As such, I find it very easy to fall into the trap of having an AI horse that is looking for a carriage to drive home. 
To navigate this better, I see reframing the goals into one of the following to be helpful:\x3C/p>\r\n\r\n\x3Cul class=\"\">\r\n \t\x3Cli id=\"2450\" class=\"yv yw tu yx b yy yz za zb zc zd ze zf so zg zh zi sr zj zk zl su zm zn zo zp abe abf abg bj\" data-selectable-paragraph=\"\">Simplify (something I want to do better) — Identifying tasks that could be simplified and made more efficient with AI is a wonderful framing. This is my favorite because it focuses on making people more productive, and so naturally, the discussion revolves around the pain points they are seeing. It is also more achievable since it is targeted at a specific user.\x3C/li>\r\n \t\x3Cli id=\"614d\" class=\"yv yw tu yx b yy abh za zb zc abi ze zf so abj zh zi sr abk zk zl su abl zn zo zp abe abf abg bj\" data-selectable-paragraph=\"\">Automate (Something I don’t want to do) — Many people look for specific business processes to be automated. It is more ambitious but still concrete, i.e., clear outcomes and expected ROI, typically in terms of cost, that move the discussion more towards feasibility. However, there is still a significant learning curve to understand the current process.\x3C/li>\r\n \t\x3Cli id=\"0a1f\" class=\"yv yw tu yx b yy abh za zb zc abi ze zf so abj zh zi sr abk zk zl su abl zn zo zp abe abf abg bj\" data-selectable-paragraph=\"\">Breakthrough (Something I don’t know I should be doing) — Searching for things that people should be doing is very open. You are chasing magical findings, and it requires a much bigger appetite for investment and risk. 
It takes dedicated focus, deep expertise, and a lot of luck to discover that one big thing; in short, it happens rarely!\x3C/li>\r\n\x3C/ul>\r\n \r\n\r\n \r\n\x3Ch1 id=\"289d\" class=\"zq zr tu be zs lv zt lw ly lz zu ma mc jc zv jd jg mf zw mg mj mk zx ml mo zy bj\" data-selectable-paragraph=\"\">It's Too Complex\x3C/h1>\r\n\x3Cp id=\"d1c2\" class=\"pw-post-body-paragraph yv yw tu yx b yy zz za zb zc aba ze zf so abb zh zi sr abc zk zl su abd zn zo zp ew bj\" data-selectable-paragraph=\"\">Unfortunately, many AI solutions are way too complex today. I often see customers struggling to find their way and make the solution work for their scenario. This has several implications:\x3C/p>\r\n\r\n\x3Col class=\"\">\r\n \t\x3Cli id=\"8461\" class=\"yv yw tu yx b yy yz za zb zc zd ze zf so zg zh zi sr zj zk zl su zm zn zo zp abm abf abg bj\" data-selectable-paragraph=\"\">Despite the AI hype, people often don’t have the basic things working in the first place. Increasingly, I see customers finding value in simpler end-to-end products rather than ones that drown them in features, bells, and whistles. This is especially true for AI, where a simple search box is the standardized interface.\x3C/li>\r\n \t\x3Cli id=\"d8f1\" class=\"yv yw tu yx b yy abh za zb zc abi ze zf so abj zh zi sr abk zk zl su abl zn zo zp abm abf abg bj\" data-selectable-paragraph=\"\">A complex solution is hard to trust. Organizations have dedicated all their time and energy to building what they build, but how does the AI do better than what they do every single day? This is particularly a challenge when trying to automate processes or search for breakthroughs. 
Domain knowledge is paramount in most businesses, and it's hard to mask it under complexity.\x3C/li>\r\n \t\x3Cli id=\"8e93\" class=\"yv yw tu yx b yy abh za zb zc abi ze zf so abj zh zi sr abk zk zl su abl zn zo zp abm abf abg bj\" data-selectable-paragraph=\"\">Training AI models requires humongous amounts of data, and therefore, the question when applying AI to enterprises is how many data points do we have anyway? In reality, the data points are far fewer and the space is too sparse. Building complex solutions and workflows oversimplifies the business reality and likely makes things worse.\x3C/li>\r\n \t\x3Cli id=\"a0eb\" class=\"yv yw tu yx b yy abh za zb zc abi ze zf so abj zh zi sr abk zk zl su abl zn zo zp abm abf abg bj\" data-selectable-paragraph=\"\">Human workers spend countless hours and training sessions to learn how to operate, and they still use their judgement to handle tricky scenarios. Machines need to learn far more in order to cover those corner cases and have to incorporate feedback, i.e., constantly growing training data. Without this infrastructure, complex solutions find it tough to hold their ground.\x3C/li>\r\n \t\x3Cli id=\"b9b5\" class=\"yv yw tu yx b yy abh za zb zc abi ze zf so abj zh zi sr abk zk zl su abl zn zo zp abm abf abg bj\" data-selectable-paragraph=\"\">Often I hear customers asking “how does it work?” or “what is the AI behind the scenes?” or “what did it learn?”. In reality, they are having trust issues with running their business using this new magic wand. The goal should be to revisit the assumptions and make the solution as simple as possible. 
More complexity needs to earn the trust first.\x3C/li>\r\n\x3C/ol>\r\n \r\n\r\n \r\n\x3Ch1 id=\"9f35\" class=\"zq zr tu be zs lv zt lw ly lz zu ma mc jc zv jd jg mf zw mg mj mk zx ml mo zy bj\" data-selectable-paragraph=\"\">Keep It Simple\x3C/h1>\r\n\x3Cp id=\"eb2e\" class=\"pw-post-body-paragraph yv yw tu yx b yy zz za zb zc aba ze zf so abb zh zi sr abc zk zl su abd zn zo zp ew bj\" data-selectable-paragraph=\"\">Making AI work is the single biggest challenge out there today. Customers love products that work every single time, no matter how simple they are, as opposed to something that works only 50–60% of the time. So while it is important to recalibrate customer expectations, it is equally important to make the AI product work. And here, despite all the competition, every product is essentially competing with itself on things like hallucination, accuracy, security, and onboarding speed when it comes to AI. This is precisely our focus at Tursio — to keep it simple and to make it work!\x3C/p>\r\n\x3Cp data-selectable-paragraph=\"\">\x3C/p>",can_share_on_x:!0,can_share_on_facebook:!0,can_share_on_linkedin:!0,blog_image:"https://blog.tursio.ai/wp-content/uploads/2025/04/0_wGoqxmF9p5shaeEN.webp",author:"Alekh Jindal",author_image:void 0,published_date:"April 4, 2025",tags:"Databases, Generative AI",description:"Generative AI promises efficiency for knowledge workers but raises questions about its real value. While AI can simplify, automate, or enable breakthroughs, the market is crowded with complex tools, creating confusion for investors, customers, and builders. Many solutions require vast data and expertise, making trust and usability a challenge. Enterprises often prefer simple, reliable interfaces like search boxes over complex workflows. Success depends on delivering consistent, accurate, and actionable results while maintaining simplicity. 
At Tursio, the focus is on making generative AI work seamlessly within enterprise databases, empowering users without moving data, reducing complexity, and maximizing impact."}},$R[31]={id:1182,slug:"power-of-asking-questions",acf:$R[32]={title:"Power of Asking Questions",content:"\x3Ch1 id=\"e535\" class=\"zd ze tg be zf lu zg lv lx ly zh lz mb jc zi jd jg me zj mf mi mj zk mk mn zl bj\" data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Cp id=\"86fe\" class=\"pw-post-body-paragraph yh yi tg yj b yk yl ym yn yo yp yq yr ry ys yt yu sb yv yw yx se yy yz za zb ew bj\" data-selectable-paragraph=\"\">Rudyard Kipling wrote “\x3Cem class=\"zc\">I Keep Six Honest Serving Men\x3C/em>” in 1902, referring to What, Why, When, How, Where, and Who. He was exploring the importance of curiosity and questioning, and how they can be helpful if used effectively. It is interesting how Kipling saw critical questions as mortals who must always serve the individual.\x3C/p>\r\n\x3Cp id=\"9cba\" class=\"pw-post-body-paragraph yh yi tg yj b yk yl ym yn yo yp yq yr ry ys yt yu sb yv yw yx se yy yz za zb ew bj\" data-selectable-paragraph=\"\">Unsurprisingly, asking questions is a fundamental skill required in most trades, e.g., a teacher must question whether students understand, a doctor must question how the patient feels, a scientist must question the status quo, a journalist must question the facts, a politician must question the policies, and so on. Businesses are no different, with startups asking what pain points to address, growth-stage companies asking how to scale, and large corporates asking which markets to capture. 
In all cases, asking questions is critical to staying in business.\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"e535\" class=\"zd ze tg be zf lu zg lv lx ly zh lz mb jc zi jd jg me zj mf mi mj zk mk mn zl bj\" data-selectable-paragraph=\"\">First Quarter of 21st Century\x3C/h1>\r\n\x3Cp id=\"1771\" class=\"pw-post-body-paragraph yh yi tg yj b yk zm ym yn yo zn yq yr ry zo yt yu sb zp yw yx se zq yz za zb ew bj\" data-selectable-paragraph=\"\">Business intelligence is the traditional go-to approach to answering business questions. The idea is to hire a set of engineers and analysts, who gather the requirements, i.e., the questions and answers the business stakeholders are looking for, collect the data needed to serve those requirements, and create dashboards to answer the business questions. The process is repeated for follow-up questions, modifications, or new requirements.\x3C/p>\r\n\x3Cp id=\"623c\" class=\"pw-post-body-paragraph yh yi tg yj b yk yl ym yn yo yp yq yr ry ys yt yu sb yv yw yx se yy yz za zb ew bj\" data-selectable-paragraph=\"\">Countless tools have been built to help collect data and squeeze it into dashboards, an art that BI teams have mastered over the years. No wonder typical enterprises deal with hundreds to thousands of dashboards, each taking days to weeks to build, thus justifying massive resources and budgets. But are those helping answer the questions? Are they helping to ask enough questions? Are they helping ask the right questions?\x3C/p>\r\n\x3Cp id=\"140c\" class=\"pw-post-body-paragraph yh yi tg yj b yk yl ym yn yo yp yq yr ry ys yt yu sb yv yw yx se yy yz za zb ew bj\" data-selectable-paragraph=\"\">Effectively, businesses are maintaining an actual set of “\x3Cem class=\"zc\">Honest Serving Men\x3C/em>”, typically way more than six, and relying on them for their questioning needs. 
While this has been the prevalent practice in the last quarter of the century, it is quickly getting outdated in the new age of knowledge workers and the knowledge economy.\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"456b\" class=\"zd ze tg be zf lu zg lv lx ly zh lz mb jc zi jd jg me zj mf mi mj zk mk mn zl bj\" data-selectable-paragraph=\"\">Entering the Knowledge Age\x3C/h1>\r\n\x3Cblockquote class=\"zr zs zt\">\r\n\x3Cp id=\"c439\" class=\"yh yi zc yj b yk yl ym yn yo yp yq yr ry ys yt yu sb yv yw yx se yy yz za zb ew bj\" data-selectable-paragraph=\"\">“Increasing the productivity of knowledge workers is the most important contribution management needs to make in the 21st century.”\x3C/p>\r\n\x3Cp id=\"06b4\" class=\"yh yi zc yj b yk yl ym yn yo yp yq yr ry ys yt yu sb yv yw yx se yy yz za zb ew bj\" data-selectable-paragraph=\"\">Peter Drucker, 2005\x3C/p>\r\n\x3C/blockquote>\r\n\x3Cp id=\"c023\" class=\"pw-post-body-paragraph yh yi tg yj b yk yl ym yn yo yp yq yr ry ys yt yu sb yv yw yx se yy yz za zb ew bj\" data-selectable-paragraph=\"\">Knowledge workers need to think for a living, and thinking is closely tied to questioning. Therefore, knowledge workers need to ask all the time, and outsourcing this critical function to other entities not only makes them unproductive but also risks making them irrelevant. This is not a surprise for sales, marketing, customer, revenue, finance, or other teams who are stymied by data and BI teams for critical insights and decision-making.\x3C/p>\r\n\x3Cp id=\"ba28\" class=\"pw-post-body-paragraph yh yi tg yj b yk yl ym yn yo yp yq yr ry ys yt yu sb yv yw yx se yy yz za zb ew bj\" data-selectable-paragraph=\"\">Asking questions is critical for knowledge workers, and it is an iterative process for making any meaningful decision. 
A marketing professional looking for better conversion rates needs to understand their customer micro-segments and analyze which A/B experiments are likely to produce better results. Likewise, a clinical researcher needs to understand their patient population along multiple dimensions while a banking officer needs to correlate internal trends with external data. These knowledge workers need to be able to ask questions at will and that requires them to command the \x3Cem class=\"zc\">power of asking questions\x3C/em>.\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"68a8\" class=\"zd ze tg be zf lu zg lv lx ly zh lz mb jc zi jd jg me zj mf mi mj zk mk mn zl bj\" data-selectable-paragraph=\"\">Why Generative AI Matters?\x3C/h1>\r\n\x3Cp id=\"4f37\" class=\"pw-post-body-paragraph yh yi tg yj b yk zm ym yn yo zn yq yr ry zo yt yu sb zp yw yx se zq yz za zb ew bj\" data-selectable-paragraph=\"\">Generative AI is a powerful technology for interpreting natural language, and it can help transform raw enterprise data into processed information and ultimately into actionable knowledge. With generative AI, knowledge workers can look at facts, generate analysis, draw conclusions, and iterate quickly to get their job done, without relying on tedious and cumbersome processes at each step.\x3C/p>\r\n\x3Cp id=\"5db1\" class=\"pw-post-body-paragraph yh yi tg yj b yk yl ym yn yo yp yq yr ry ys yt yu sb yv yw yx se yy yz za zb ew bj\" data-selectable-paragraph=\"\">\x3Ca class=\"ag fe\" href=\"https://www.tursio.ai/\" target=\"_blank\" rel=\"noopener ugc nofollow\">Tursio\x3C/a> helps accelerate this further by bringing generative AI to enterprise databases, without any data movement or privacy concerns. Using proprietary query processing techniques, Tursio enables knowledge workers to ask advanced questions and generate rich analyses in natural language. 
The goal is to keep the “\x3Cem class=\"zc\">Honest Serving Men\x3C/em>” employed at all times and truly entrust the knowledge workers with the power to ask questions that they deserve.\x3C/p>\r\n\x3Cp data-selectable-paragraph=\"\">\x3C/p>",can_share_on_x:!0,can_share_on_facebook:!0,can_share_on_linkedin:!0,blog_image:"https://blog.tursio.ai/wp-content/uploads/2025/04/0_YaKuwPIaPP-odujn.webp",author:"Alekh Jindal",author_image:void 0,published_date:"December 30, 2024",tags:"Databases, Generative AI",description:"Rudyard Kipling’s “Six Honest Serving Men”—What, Why, When, How, Where, Who—highlights the power of questioning, a skill essential in every field. Traditional business intelligence relies on engineers and dashboards to answer questions, but this slows knowledge workers, limiting productivity and insight. In the knowledge economy, professionals—from marketers to clinicians to bankers—must ask questions iteratively to make informed decisions. Generative AI enables natural-language queries over enterprise data, transforming raw data into actionable knowledge. Tursio brings generative AI to databases without moving data, empowering knowledge workers to ask advanced questions, generate analyses, and iterate quickly, keeping critical decision-making agile and efficient."}},$R[33]={id:1183,slug:"why-tursio-the-goat-market-mystery-unveiled",acf:$R[34]={title:"Why Tursio? The Goat Market Mystery Unveiled",content:"\x3Ch1 id=\"9b47\" class=\"pj pk il be pl pm pn po gj pp pq pr gl ps pt pu pv pw px py pz qa qb qc qd qe bj\" data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Cp id=\"9c0f\" class=\"pw-post-body-paragraph oo op il oq b or os ot ou ov ow ox oy gm oz pa pb gp pc pd pe gs pf pg ph pi hm bj\" data-selectable-paragraph=\"\">After leaving Eltropy, I took a year to unwind — traveling, hiking, cooking, and basically living my best life — before finally taking the plunge with Tursio. 
One day, my daughter overheard me chatting with my new boss about my role and, with a puzzled look, asked, “Dad, are you working in a goat market now?” It took me a second to figure out what she meant, and then it hit me. “Oh no, not goat market. Go-to-Market!” And just like that, my glamorous new role at Tursio became the comedy at home.\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"9b47\" class=\"pj pk il be pl pm pn po gj pp pq pr gl ps pt pu pv pw px py pz qa qb qc qd qe bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"al\">It’s the attitude, not the role!\x3C/strong>\x3C/h1>\r\n\x3Cp id=\"fc6f\" class=\"pw-post-body-paragraph oo op il oq b or qf ot ou ov qg ox oy gm qh pa pb gp qi pd pe gs qj pg ph pi hm bj\" data-selectable-paragraph=\"\">When I first met Alekh Jindal, the CEO and co-founder of Tursio, in Palo Alto, it was immediately clear he had uncovered something truly unique. While I won’t pretend to fully grasp the depth of the research of the three PhD founders, I was convinced after reading the founders’ \x3Ca class=\"af qk\" href=\"https://www.cidrdb.org/cidr2024/papers/p81-jindal.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">research paper\x3C/a>, which clearly outlined the speed, accuracy, and privacy issues in the current landscape of Gen AI for Data and emphasized the need for the \x3Ca class=\"af qk\" href=\"https://www.cidrdb.org/cidr2024/papers/p81-jindal.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">GOD\x3C/a> machine.\x3C/p>\r\n\x3Cp id=\"440a\" class=\"pw-post-body-paragraph oo op il oq b or os ot ou ov ow ox oy gm oz pa pb gp pc pd pe gs pf pg ph pi hm bj\" data-selectable-paragraph=\"\">Some of their first customers were already using Tursio’s Gen AI machine for complex use cases that would typically seem too risky for a company just starting out.\x3C/p>\r\n\x3Cp id=\"c0e8\" class=\"pw-post-body-paragraph oo op il oq b or os ot ou ov ow ox oy gm oz pa pb gp pc pd pe gs pf pg ph pi hm 
bj\" data-selectable-paragraph=\"\">Tursio’s founders, Alekh and Shi, have experience in early-stage startups where the owner mindset is embraced, so I’m bringing that same ownership mentality to my GTM role as we tackle challenges in the banking industry.\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"d91a\" class=\"pj pk il be pl pm pn po gj pp pq pr gl ps pt pu pv pw px py pz qa qb qc qd qe bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"al\">Shared Challenges: Banks vs Credit Unions\x3C/strong>\x3C/h1>\r\n\x3Cp id=\"1de5\" class=\"pw-post-body-paragraph oo op il oq b or qf ot ou ov qg ox oy gm qh pa pb gp qi pd pe gs qj pg ph pi hm bj\" data-selectable-paragraph=\"\">During my two years at Eltropy, leading the AI business, I had the opportunity to observe the operational challenges and budget constraints faced by Credit Unions. However, driving efficient bank operations is not new to me. As GM/CIO at ICICI Bank in 2014, I managed the technology infrastructure for 30 million account holders, including core and digital banking systems, data centers, 4,500 branches, and a network of 10,000 ATMs — all on roughly one-fifth the budget of a typical U.S. bank of comparable size and complexity.\x3C/p>\r\n\x3Cp id=\"51b5\" class=\"pw-post-body-paragraph oo op il oq b or os ot ou ov ow ox oy gm oz pa pb gp pc pd pe gs pf pg ph pi hm bj\" data-selectable-paragraph=\"\">My takeaway from these experiences is clear: whether it’s a large bank or a smaller Credit Union, the challenges for executives remain remarkably similar. 
Their focus must always be on growing accounts, deposits, and loans, while effectively managing risks, compliance, fraud, and profitability.\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"90bc\" class=\"pj pk il be pl pm pn po gj pp pq pr gl ps pt pu pv pw px py pz qa qb qc qd qe bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"al\">Road to AI Transformation\x3C/strong>\x3C/h1>\r\n\x3Cp id=\"d063\" class=\"pw-post-body-paragraph oo op il oq b or qf ot ou ov qg ox oy gm qh pa pb gp qi pd pe gs qj pg ph pi hm bj\" data-selectable-paragraph=\"\">When a leader asks the right question, it changes everything for the company. It wasn’t until Steve Jobs asked the question, \x3Cem class=\"ql\">“What pricing strategy will allow us to maintain high margins while attracting a loyal customer base?”\x3C/em>, that Apple became the first trillion-dollar company.\x3C/p>\r\n\x3Cp id=\"5114\" class=\"pw-post-body-paragraph oo op il oq b or os ot ou ov ow ox oy gm oz pa pb gp pc pd pe gs pf pg ph pi hm bj\" data-selectable-paragraph=\"\">Bank executives still struggle with getting instant answers to big questions. The questions get passed down from leaders to business analysts to data analysts to the developer, and finally a report or a model gets created. 
The complexity of the data and the technical barriers to overcome make this a time-consuming process, so executives rarely get instant answers.\x3C/p>\r\n\x3Cp id=\"218b\" class=\"pw-post-body-paragraph oo op il oq b or os ot ou ov ow ox oy gm oz pa pb gp pc pd pe gs pf pg ph pi hm bj\" data-selectable-paragraph=\"\">Here is where Tursio comes in.\x3C/p>\r\n\x3Cp data-selectable-paragraph=\"\">\x3Cimg src=\"https://miro.medium.com/v2/resize:fit:1372/1*XEx-JFNYU_lZSmRtQyQfUg.png\" />\x3C/p>\r\n\x3Cp id=\"f03a\" class=\"pw-post-body-paragraph oo op il oq b or os ot ou ov ow ox oy gm oz pa pb gp pc pd pe gs pf pg ph pi hm bj\" data-selectable-paragraph=\"\">The journey to AI transformation begins with bridging the gap between AI and BI expertise. Many organizations underestimate the challenges of handling vast amounts of structured and unstructured data from diverse sources. Traditional BI tools demand specialized skills to build data models and generate actionable reports. Adding to the complexity, leveraging large language models (LLMs) for tasks like training, fine-tuning, summarizing, and analyzing data to uncover dynamic insights requires a highly skilled workforce.\x3C/p>\r\n\x3Cp id=\"434d\" class=\"pw-post-body-paragraph oo op il oq b or os ot ou ov ow ox oy gm oz pa pb gp pc pd pe gs pf pg ph pi hm bj\" data-selectable-paragraph=\"\">Tursio is pioneering a radical new approach of turning domain-specific databases into \x3Cstrong class=\"oq im\">generative AI machines, \x3C/strong>thus helping organizations build LLM applications with no additional infrastructure or expertise needed.\x3C/p>\r\n\x3Cp id=\"952d\" class=\"pw-post-body-paragraph oo op il oq b or os ot ou ov ow ox oy gm oz pa pb gp pc pd pe gs pf pg ph pi hm bj\" data-selectable-paragraph=\"\">With Tursio, enterprises can seamlessly deploy generative AI applications like enterprise search, augmented analytics, and predictive intelligence. 
All this happens while keeping the data within the enterprise database and operating fully within the private enterprise environment. The diagram below shows Tursio in action.\x3C/p>\r\n\x3Cp data-selectable-paragraph=\"\">\x3Cimg src=\"https://miro.medium.com/v2/resize:fit:1392/1*oavs0O9nHl6BJesVG_s-7g.png\" />\x3C/p>\r\n\x3Cp id=\"19b2\" class=\"pw-post-body-paragraph oo op il oq b or os ot ou ov ow ox oy gm oz pa pb gp pc pd pe gs pf pg ph pi hm bj\" data-selectable-paragraph=\"\">Figure 1: Tursio turning databases into generative AI machines.\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"f463\" class=\"pj pk il be pl pm pn po gj pp pq pr gl ps pt pu pv pw px py pz qa qb qc qd qe bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"al\">Use Cases to Accelerate Lending, Credit and Risk Evaluation\x3C/strong>\x3C/h1>\r\n\x3Cp id=\"1f93\" class=\"pw-post-body-paragraph oo op il oq b or qf ot ou ov qg ox oy gm qh pa pb gp qi pd pe gs qj pg ph pi hm bj\" data-selectable-paragraph=\"\">Generative AI technology holds immense potential to revolutionize the way lenders evaluate borrowers and make lending decisions. Specifically, here are some of the use cases in Loan Portfolio Management that we have seen at Tursio:\x3C/p>\r\n\x3Cp id=\"49b3\" class=\"pw-post-body-paragraph oo op il oq b or os ot ou ov ow ox oy gm oz pa pb gp pc pd pe gs pf pg ph pi hm bj\" data-selectable-paragraph=\"\">1. \x3Cstrong class=\"oq im\">Scenario Simulation\x3C/strong>: Predictive analytics empowers lenders to simulate different economic scenarios, assessing a borrower’s financial stability under various conditions. \x3Cem class=\"ql\">Ask Tursio\x3C/em>: “How does the change in interest rate impact the borrower’s ability to repay?”.\x3C/p>\r\n\x3Cp id=\"f9d5\" class=\"pw-post-body-paragraph oo op il oq b or os ot ou ov ow ox oy gm oz pa pb gp pc pd pe gs pf pg ph pi hm bj\" data-selectable-paragraph=\"\">2. 
\x3Cstrong class=\"oq im\">Risk Assessment\x3C/strong>: By analyzing historical data, lenders can spot red flags in borrower profiles, minimizing the chances of issuing bad loans. Gen AI on data allows lenders to predict defaults and assess risks with greater precision. \x3Cem class=\"ql\">Ask Tursio\x3C/em>: “Analyze data points such as credit scores and social media activity to predict a borrower’s likelihood of default”.\x3C/p>\r\n\x3Cp id=\"f7ae\" class=\"pw-post-body-paragraph oo op il oq b or os ot ou ov ow ox oy gm oz pa pb gp pc pd pe gs pf pg ph pi hm bj\" data-selectable-paragraph=\"\">3. \x3Cstrong class=\"oq im\">Regulatory Compliance\x3C/strong>: Proactively monitoring suspicious activities protects both lenders and borrowers. With AI, lenders can automatically flag discrepancies or missing information, ensuring compliance and averting potential penalties. \x3Cem class=\"ql\">Ask Tursio\x3C/em>: “Find anomalies in member transactions for the last 2 years”.\x3C/p>\r\n\x3Cp id=\"287a\" class=\"pw-post-body-paragraph oo op il oq b or os ot ou ov ow ox oy gm oz pa pb gp pc pd pe gs pf pg ph pi hm bj\" data-selectable-paragraph=\"\">4. \x3Cstrong class=\"oq im\">Dynamic Pricing\x3C/strong>: Interest rates and loan terms can be dynamically adjusted based on real-time analytics of market conditions and individual risk profiles. By leveraging up-to-the-minute data, lenders can fine-tune offerings. \x3Cem class=\"ql\">Ask Tursio\x3C/em>: “Determine the loan eligibility, interest rate and loan term for the borrower’s educational background, employment stability and credit information”.\x3C/p>\r\n\x3Cp id=\"f0fa\" class=\"pw-post-body-paragraph oo op il oq b or os ot ou ov ow ox oy gm oz pa pb gp pc pd pe gs pf pg ph pi hm bj\" data-selectable-paragraph=\"\">5. 
\x3Cstrong class=\"oq im\">Portfolio Monitoring\x3C/strong>: After loan origination, data analytics enables continuous monitoring to detect anomalies in repayment patterns, signaling potential risks or fraud. Generative AI can assist portfolio managers by automating performance reports, drafting portfolio optimization summaries, and creating subsegment-specific strategies aligned with risk appetite. Additionally, it enhances early-warning systems by analyzing real-time unstructured data, such as news or market reports, to flag borrowers or segments requiring attention. \x3Cem class=\"ql\">Ask Tursio\x3C/em>: “Generate portfolio performance report for this quarter”.\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"82fe\" class=\"pj pk il be pl pm pn po gj pp pq pr gl ps pt pu pv pw px py pz qa qb qc qd qe bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"al\">Conclusion\x3C/strong>\x3C/h1>\r\n\x3Cp id=\"03a4\" class=\"pw-post-body-paragraph oo op il oq b or qf ot ou ov qg ox oy gm qh pa pb gp qi pd pe gs qj pg ph pi hm bj\" data-selectable-paragraph=\"\">In conclusion, my journey from Eltropy to Tursio has been filled with exciting challenges and opportunities, all driven by a shared mission: to revolutionize banking through AI-driven solutions. Tursio’s innovative approach to generative AI is transforming how financial institutions make decisions, manage risk, and optimize portfolios. 
If you’re ready to accelerate your journey toward AI transformation, I encourage you to explore Tursio and see how our technology can unlock new possibilities for your organization.\x3C/p>\r\n\x3Cp id=\"ea3e\" class=\"pw-post-body-paragraph oo op il oq b or os ot ou ov ow ox oy gm oz pa pb gp pc pd pe gs pf pg ph pi hm bj\" data-selectable-paragraph=\"\">Read more- \x3Ca class=\"af qk\" href=\"https://blog.tursio.ai/the-future-of-clinical-research-asking-questions-using-ai-656edcab9828\" target=\"_blank\" rel=\"noopener ugc nofollow\">Tursio AI for Regulated Industries\x3C/a>\x3C/p>",can_share_on_x:!0,can_share_on_facebook:!0,can_share_on_linkedin:!0,blog_image:"https://blog.tursio.ai/wp-content/uploads/2024/12/1_NBn9ky8XxMfOGoZgw71Rzg.webp",author:"Murali Mahalingam",author_image:void 0,published_date:"December 11, 2024",tags:"Generative AI",description:"After a year-long sabbatical, I joined Tursio to help revolutionize AI in banking. Drawing on experience at ICICI Bank and Eltropy, I’ve seen that both large banks and credit unions face similar challenges: managing risk, compliance, and profitability. Tursio transforms enterprise databases into in-situ generative AI machines, enabling natural language queries, predictive insights, and augmented analytics without moving data or requiring extra expertise. Use cases include scenario simulation, risk assessment, regulatory compliance, dynamic pricing, and portfolio monitoring. 
By bridging AI and BI, Tursio accelerates decision-making, enhances operational efficiency, and empowers banks to optimize lending, credit, and risk evaluation seamlessly."}},$R[35]={id:1184,slug:"the-future-of-clinical-research-asking-questions-using-ai",acf:$R[36]={title:"The Future of Clinical Research: Asking Questions using AI",content:"\x3Ch1 id=\"0825\" class=\"po pp il be pq pr ps pt gj pu pv pw gl px py pz qa qb qc qd qe qf qg qh qi qj bj\" data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch5 id=\"419b\" class=\"pw-post-body-paragraph oa ob il oc b od oe of og oh oi oj ok gm ol om on gp oo op oq gs or os ot ou hm bj\">Authors: Anika Kanchi and Rony Chatterjee\x3C/h5>\r\n \r\n\r\nGenerative AI is transforming the healthcare landscape by offering unprecedented opportunities for faster and more accurate diagnoses, personalized medicine, and improved patient outcomes. By interpreting incomplete data, leveraging domain-specific ontologies, and linking disparate datasets, generative AI has the potential to revolutionize healthcare research and practice.\r\n\x3Cp id=\"98e0\" class=\"pw-post-body-paragraph oa ob il oc b od oe of og oh oi oj ok gm ol om on gp oo op oq gs or os ot ou hm bj\" data-selectable-paragraph=\"\">However, deploying generative AI at scale often requires significant infrastructure and expertise. \x3Ca class=\"af pn\" href=\"https://www.tursio.ai/\" target=\"_blank\" rel=\"noopener ugc nofollow\">Tursio\x3C/a> is addressing this challenge by enabling organizations to convert their domain-specific databases into generative AI machines, eliminating the need for additional infrastructure while accelerating AI adoption. 
This blog discusses the technological trends in clinical research from the Tursio lens.\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"0825\" class=\"po pp il be pq pr ps pt gj pu pv pw gl px py pz qa qb qc qd qe qf qg qh qi qj bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"al\">Evolution of Clinical Trials\x3C/strong>\x3C/h1>\r\n\x3Cp id=\"9d35\" class=\"pw-post-body-paragraph oa ob il oc b od qk of og oh ql oj ok gm qm om on gp qn op oq gs qo os ot ou hm bj\" data-selectable-paragraph=\"\">Clinical research has evolved significantly from its early days of observation-based medicine, with no standardized tests or ethical guidelines, to the sophisticated and regulated processes of today. The history of clinical trials can be traced back to the 18th century when James Lind’s controlled trial compared treatments for scurvy, laying the foundation for clinical experiment design.\x3C/p>\r\n\x3Cp id=\"fdbb\" class=\"pw-post-body-paragraph oa ob il oc b od oe of og oh oi oj ok gm ol om on gp oo op oq gs or os ot ou hm bj\" data-selectable-paragraph=\"\">By the 20th century, randomized controlled trials (RCTs) became the “gold standard” for minimizing bias, and double-blind studies were introduced to ensure objectivity. Ethical frameworks such as the Nuremberg Code and the Declaration of Helsinki established informed consent and safeguards for human participants. 
Today, clinical research continues to evolve, incorporating advanced technologies and diverse data sources, from genetic profiles to real-world evidence, while addressing new challenges through innovative solutions.\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"7b54\" class=\"po pp il be pq pr ps pt gj pu pv pw gl px py pz qa qb qc qd qe qf qg qh qi qj bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"al\">Complexity of Modern Clinical Trials\x3C/strong>\x3C/h1>\r\n\x3Cp id=\"0a36\" class=\"pw-post-body-paragraph oa ob il oc b od qk of og oh ql oj ok gm qm om on gp qn op oq gs qo os ot ou hm bj\" data-selectable-paragraph=\"\">Modern clinical trials are increasingly complex due to evolving demands of personalized medicine, rising patient research costs, and the explosion of data across various sources. The new era presents several challenges that require innovative solutions but also offers immense promise.\x3C/p>\r\n\x3Cp id=\"42aa\" class=\"pw-post-body-paragraph oa ob il oc b od oe of og oh oi oj ok gm ol om on gp oo op oq gs or os ot ou hm bj\" data-selectable-paragraph=\"\">The heart of medical trials involves the patient. Identifying these patients is no longer as simple as meeting a couple of clinical criteria. Trials now require patients with rare genetic markers or specific lifestyle profiles, which turns recruitment into a race against time. Additionally, retaining qualified participants and ensuring they adhere to protocols demands proper communication, user-friendly monitoring tools, and a trial experience that feels seamless.\x3C/p>\r\n\x3Cp id=\"80e0\" class=\"pw-post-body-paragraph oa ob il oc b od oe of og oh oi oj ok gm ol om on gp oo op oq gs or os ot ou hm bj\" data-selectable-paragraph=\"\">These trial adjustments come with a heavy price tag. It can cost up to $2.8 billion to bring a new drug to market and can take over a decade for trials to complete. 
With intricate trial designs, lengthy approval processes, and post-market surveillance, efficiency becomes more important. Delays equal more expenses and the pressure to deliver treatments grows ever more intense.\x3C/p>\r\n\x3Cp id=\"da89\" class=\"pw-post-body-paragraph oa ob il oc b od oe of og oh oi oj ok gm ol om on gp oo op oq gs or os ot ou hm bj\" data-selectable-paragraph=\"\">Fueling these trials is also the influx of disparate data sources. Electronic Health Records (EHRs) provide great quantities of both structured and unstructured patient data, including imaging results, clinical notes, and medication records. Wearables add another layer by generating real-time, longitudinal datasets, such as heart rate, activity levels, and sleep patterns. Genetic databases contribute even more detailed insights, offering patient-specific biomarkers that can inform personalized treatment plans. Integrating these data sets into a meaningful output is a large task that is only complicated further by the sheer scale and complexity.\x3C/p>\r\n\x3Cp id=\"d7b8\" class=\"pw-post-body-paragraph oa ob il oc b od oe of og oh oi oj ok gm ol om on gp oo op oq gs or os ot ou hm bj\" data-selectable-paragraph=\"\">These intertwined challenges — prioritizing patients, managing escalating costs, and working with an overwhelming amount of data — present complexities that redefine the landscape of clinical trials. 
For organizations to thrive, they must now adopt innovative technologies and solutions.\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"3b16\" class=\"po pp il be pq pr ps pt gj pu pv pw gl px py pz qa qb qc qd qe qf qg qh qi qj bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"al\">Revolutionizing Clinical Research with AI\x3C/strong>\x3C/h1>\r\n\x3Cp id=\"281f\" class=\"pw-post-body-paragraph oa ob il oc b od qk of og oh ql oj ok gm qm om on gp qn op oq gs qo os ot ou hm bj\" data-selectable-paragraph=\"\">AI, particularly generative AI, is a game changer in addressing the challenges of clinical research. From patient recruitment to post-trial analysis, AI is streamlining processes, uncovering new insights, and transforming how treatments are developed and delivered.\x3C/p>\r\n\x3Cp id=\"95aa\" class=\"pw-post-body-paragraph oa ob il oc b od oe of og oh oi oj ok gm ol om on gp oo op oq gs or os ot ou hm bj\" data-selectable-paragraph=\"\">AI enables personalized medicine by evaluating genetic, environmental, and lifestyle factors to craft tailored treatment plans. It also enhances efficiency by quickly analyzing large datasets to match trial criteria, reducing recruitment timelines and costs. In drug discovery, AI accelerates the process by identifying promising candidates from vast biological datasets. 
Post-trial, it analyzes real-world evidence and long-term outcomes, ensuring treatments remain effective and safe.\x3C/p>\r\n\x3Cp id=\"dd49\" class=\"pw-post-body-paragraph oa ob il oc b od oe of og oh oi oj ok gm ol om on gp oo op oq gs or os ot ou hm bj\" data-selectable-paragraph=\"\">By streamlining these processes, AI is revolutionizing clinical research, delivering faster, more precise results while improving patient care.\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"251c\" class=\"po pp il be pq pr ps pt gj pu pv pw gl px py pz qa qb qc qd qe qf qg qh qi qj bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"al\">Empowering Healthcare with Generative AI\x3C/strong>\x3C/h1>\r\n\x3Cp id=\"3506\" class=\"pw-post-body-paragraph oa ob il oc b od qk of og oh ql oj ok gm qm om on gp qn op oq gs qo os ot ou hm bj\" data-selectable-paragraph=\"\">Generative AI can address the complexities of modern clinical trials and streamline the trial process. Specifically, here are some of the scenarios that we have seen at Tursio:\x3C/p>\r\n\r\n\x3Col class=\"\">\r\n \t\x3Cli id=\"7956\" class=\"oa ob il oc b od oe of og oh oi oj ok gm ol om on gp oo op oq gs or os ot ou qp qq qr bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"oc im\">Simplifying Patient Identification for Trials: \x3C/strong>Hospitals face the challenge of identifying patients who meet trial inclusion criteria, impacting patient care. Pharmaceutical companies also struggle to find hospitals with the right patient pool. 
Tursio simplifies this by allowing natural language queries, enabling both hospitals and pharmaceutical companies to quickly identify eligible patients, improving recruitment efficiency as well as accelerating drug development.\x3C/li>\r\n \t\x3Cli id=\"0038\" class=\"oa ob il oc b od qs of og oh qt oj ok gm qu om on gp qv op oq gs qw os ot ou qp qq qr bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"oc im\">Accelerating Trial Design and Post-Trial Analysis: \x3C/strong>A critical component of trial design is identifying the relevant patient population suffering from specific diseases, often categorized by ICD (International Classification of Diseases) codes. Mapping ICD codes to trial criteria is essential for targeting the right participants, but it can be extremely labor-intensive and error prone. Tursio can automate this by processing complex data schemas and creating domain-specific models. This ensures that patient conditions are accurately aligned with trial requirements, optimizing participant selection, and reducing the risk of human error. With Tursio, medical practitioners can gain relevant insights in under three seconds, enabling faster trial design and seamless evaluation of long-term outcomes post-trial.\x3C/li>\r\n \t\x3Cli id=\"f603\" class=\"oa ob il oc b od qs of og oh qt oj ok gm qu om on gp qv op oq gs qw os ot ou qp qq qr bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"oc im\">Generating Patient Predictions\x3C/strong>: Improving patient outcomes requires tracking and predicting clinical behavior based on patterns in similar cohorts, such as identifying patients likely to develop diabetes, hypertension, or other conditions. Manually exploring such cohorts is complex and time-consuming. Tursio simplifies this for non-experts by allowing natural language queries, automating analysis, and providing actionable insights. 
It auto-detects anomalies and enables real-time alerts for early intervention.\x3C/li>\r\n\x3C/ol>\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"66fe\" class=\"po pp il be pq pr ps pt gj pu pv pw gl px py pz qa qb qc qd qe qf qg qh qi qj bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"al\">Conclusion\x3C/strong>\x3C/h1>\r\n\x3Cp id=\"4bfb\" class=\"pw-post-body-paragraph oa ob il oc b od qk of og oh ql oj ok gm qm om on gp qn op oq gs qo os ot ou hm bj\" data-selectable-paragraph=\"\">Generative AI is revolutionizing clinical research by addressing the complexities of modern trials, such as data integration, personalized medicine, and regulatory compliance. By leveraging AI to analyze vast datasets, predict outcomes, and streamline patient recruitment, healthcare organizations can accelerate drug discovery and improve patient care. This shift promises faster, more effective trials while maintaining ethical and regulatory standards.\x3C/p>\r\n\x3Cp id=\"34d7\" class=\"pw-post-body-paragraph oa ob il oc b od oe of og oh oi oj ok gm ol om on gp oo op oq gs or os ot ou hm bj\" data-selectable-paragraph=\"\">Tursio is pioneering a radically new approach that turns domain-specific databases into generative AI machines without additional infrastructure. By simplifying patient identification, integrating disparate data sources, and enabling real-time monitoring, Tursio can make clinical trial processes more efficient and personalized. 
With its innovative approach, Tursio is empowering researchers and physicians to leverage state-of-the-art AI technology to innovate and deliver state-of-the-art medical care.\x3C/p>\r\n\x3Cp id=\"738f\" class=\"pw-post-body-paragraph oa ob il oc b od oe of og oh oi oj ok gm ol om on gp oo op oq gs or os ot ou hm bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"oc im\">If you are interested, please contact us \x3C/strong>\x3Ca class=\"af pn\" href=\"https://www.tursio.ai/contact\" target=\"_blank\" rel=\"noopener ugc nofollow\">\x3Cstrong class=\"oc im\">here\x3C/strong>\x3C/a>\x3Cstrong class=\"oc im\">.\x3C/strong>\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"f536\" class=\"po pp il be pq pr ps pt gj pu pv pw gl px py pz qa qb qc qd qe qf qg qh qi qj bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"al\">References\x3C/strong>\x3C/h1>\r\n\x3Col class=\"\">\r\n \t\x3Cli id=\"ca51\" class=\"oa ob il oc b od qk of og oh ql oj ok gm qm om on gp qn op oq gs qo os ot ou qp qq qr bj\" data-selectable-paragraph=\"\">Reddy, S. Generative AI in healthcare: an implementation science informed translational path on application, integration, and governance. Implementation Sci 19, 27 (2024). \x3Ca class=\"af pn\" href=\"https://doi.org/10.1186/s13012-024-01357-9\" target=\"_blank\" rel=\"noopener ugc nofollow\">https://doi.org/10.1186/s13012-024-01357-9\x3C/a>\x3C/li>\r\n \t\x3Cli id=\"d3a6\" class=\"oa ob il oc b od qs of og oh qt oj ok gm qu om on gp qv op oq gs qw os ot ou qp qq qr bj\" data-selectable-paragraph=\"\">Bhatt A. (2010). Evolution of clinical research: a history before and beyond James Lind. Perspectives in clinical research, 1(1), 6–10.\x3C/li>\r\n \t\x3Cli id=\"84b6\" class=\"oa ob il oc b od qs of og oh qt oj ok gm qu om on gp qv op oq gs qw os ot ou qp qq qr bj\" data-selectable-paragraph=\"\">Thati, S. (2024, March 11). 
\x3Cem class=\"qx\">Precision medicine in clinical trials: A statistical perspective\x3C/em>. American Pharmaceutical Review. \x3Ca class=\"af pn\" href=\"https://www.americanpharmaceuticalreview.com/Featured-Articles/611945-Precision-Medicine-in-Clinical-Trials-A-Statistical-Perspective/\" target=\"_blank\" rel=\"noopener ugc nofollow\">https://www.americanpharmaceuticalreview.com/Featured-Articles/611945-Precision-Medicine-in-Clinical-Trials-A-Statistical-Perspective/\x3C/a>\x3C/li>\r\n \t\x3Cli id=\"ffea\" class=\"oa ob il oc b od qs of og oh qt oj ok gm qu om on gp qv op oq gs qw os ot ou qp qq qr bj\" data-selectable-paragraph=\"\">Hart, Inc. (2024, October 15). \x3Cem class=\"qx\">Top challenges in Healthcare Data Management Today\x3C/em>. Hart. \x3Ca class=\"af pn\" href=\"https://hart.com/blog/top-challenges-in-healthcare-data-management-today\" target=\"_blank\" rel=\"noopener ugc nofollow\">https://hart.com/blog/top-challenges-in-healthcare-data-management-today\x3C/a>\x3C/li>\r\n \t\x3Cli id=\"1734\" class=\"oa ob il oc b od qs of og oh qt oj ok gm qu om on gp qv op oq gs qw os ot ou qp qq qr bj\" data-selectable-paragraph=\"\">Chopra, H., Annu, Shin, D. K., Munjal, K., Priyanka, Dhama, K., & Emran, T. B. (2023). Revolutionizing clinical trials: the role of AI in accelerating medical breakthroughs. International journal of surgery (London, England), 109(12), 4211–4220. \x3Ca class=\"af pn\" href=\"https://doi.org/10.1097/JS9.0000000000000705\" target=\"_blank\" rel=\"noopener ugc nofollow\">https://doi.org/10.1097/JS9.0000000000000705\x3C/a>\x3C/li>\r\n \t\x3Cli id=\"466b\" class=\"oa ob il oc b od qs of og oh qt oj ok gm qu om on gp qv op oq gs qw os ot ou qp qq qr bj\" data-selectable-paragraph=\"\">Wouters, O. J., McKee, M., & Luyten, J. (2020). Estimated Research and Development Investment Needed to Bring a New Medicine to Market, 2009–2018. JAMA, 323(9), 844–853. 
\x3Ca class=\"af pn\" href=\"https://doi.org/10.1001/jama.2020.1166\" target=\"_blank\" rel=\"noopener ugc nofollow\">https://doi.org/10.1001/jama.2020.1166\x3C/a>\x3C/li>\r\n\x3C/ol>",can_share_on_x:!0,can_share_on_facebook:!0,can_share_on_linkedin:!0,blog_image:"https://blog.tursio.ai/wp-content/uploads/2025/05/0_TReZ4SQNsQBWxEuR.webp",author:"Rony Chatterjee",author_image:void 0,published_date:"December 9, 2024",tags:"Clinical Research, Generative AI",description:"Generative AI is transforming clinical research by streamlining trials, integrating disparate datasets, and enabling personalized medicine. Modern trials face challenges including patient recruitment, high costs, and vast, complex data from EHRs, wearables, and genetic databases. Tursio turns domain-specific databases into in-situ generative AI machines, allowing natural language queries, rapid patient identification, automated trial design, and predictive insights—all without moving data or requiring additional infrastructure. By accelerating recruitment, improving trial efficiency, and supporting real-time monitoring, Tursio empowers healthcare organizations to reduce costs, enhance patient outcomes, and advance drug discovery while maintaining ethical and regulatory compliance."}},$R[37]={id:1185,slug:"sql-server-lifecycle-and-considerations-for-enterprises",acf:$R[38]={title:"SQL Server lifecycle and considerations for enterprises",content:"\x3Ch1>\x3C/h1>\r\n \r\n\x3Cp id=\"4f30\" class=\"pw-post-body-paragraph ot ou il ov b ow ox oy oz pa pb pc pd gm pe pf pg gp ph pi pj gs pk pl pm pn hm bj\" data-selectable-paragraph=\"\">SQL Server is one of the most versatile databases which enterprises trust for their database workloads. It’s a traditional Online Transactional Processing (OLTP) database and over the years enterprises across different industry verticals like financial sector, healthcare, media and entertainment, manufacturing, insurance etc. 
have built a plethora of applications using SQL Server. Every few years, Microsoft releases a new version of SQL Server (such as 2014, 2016, 2017, 2019, and 2022) with feature enhancements that make the product more secure and more compliant, with a performant database engine that keeps up with the growing needs of enterprise data. I’ve spent several years in the core SQL Server product team and can proudly vouch for the rigorous testing done on the product prior to any release. SQL Server engineering and product teams are known across the industry for their decades of engineering excellence in delivering such a robust engine, impacting millions of customers worldwide.\x3C/p>\r\n\x3Cp id=\"b11c\" class=\"pw-post-body-paragraph ot ou il ov b ow ox oy oz pa pb pc pd gm pe pf pg gp ph pi pj gs pk pl pm pn hm bj\" data-selectable-paragraph=\"\">Each version of SQL Server is backed by a minimum of 10 years of support, which includes five years of mainstream support (functional, performance, scalability, and security updates) and five years of extended support (security updates only). Customers nearing the end of those 10 years on a particular version can migrate to the cloud on Azure SQL, move to an Azure Virtual Machine for free extended security updates, upgrade to a more recent version of SQL Server, or purchase an extended security updates subscription from Microsoft. Enterprise customers typically choose to remain on the n-1 or n-2 version of the product (n being the latest version) and, before the 10-year end of life, have to choose one of the options above. Several enterprise customers need to remain on-premises for their critical workloads and for business reasons, and cannot move to the cloud. They are tasked with migrating to the latest version of SQL Server along with upgrading their physical hardware. 
Most recently, July 9, 2024 marked the end of support for \x3Ca class=\"af os\" href=\"https://www.microsoft.com/en-us/americas-partner-blog/2024/03/28/sql-server-2014-end-of-support-keep-your-customers-secure/\" target=\"_blank\" rel=\"noopener ugc nofollow\">SQL Server 2014\x3C/a>. On-premises customers will need to move to a recent version of SQL Server and also upgrade the necessary hardware to meet the system requirements. This involves significant cost and planning for enterprises.\x3C/p>\r\n\x3Cp id=\"c6b9\" class=\"pw-post-body-paragraph ot ou il ov b ow ox oy oz pa pb pc pd gm pe pf pg gp ph pi pj gs pk pl pm pn hm bj\" data-selectable-paragraph=\"\">Customers have built applications on SQL Server, and most of these applications demand some form of reporting and machine learning capabilities on the data stored in SQL Server. Customers use SQL Server Machine Learning Services, launched in SQL Server 2016 with R support and in 2017 with Python support, to run ML workloads within SQL Server database instances. However, when using the ML services, the R or Python code is wrapped within an sp_execute_external_script stored procedure in T-SQL, and customers lose IntelliSense and debugging capabilities. I’ve seen instances where data scientists query the SQL instances, pull the data outside SQL Server to create their ML models, store these ML models as binary objects within SQL Server, and then score against them. In this approach, the moment the data is pulled outside SQL Server, the trust boundaries of the data are lost and customer data is potentially exposed to a larger attack surface.\x3C/p>\r\n\x3Cp id=\"4dd6\" class=\"pw-post-body-paragraph ot ou il ov b ow ox oy oz pa pb pc pd gm pe pf pg gp ph pi pj gs pk pl pm pn hm bj\" data-selectable-paragraph=\"\">Now in 2024, we see a new wave of workloads where enterprise customers are trying to enable GenAI capabilities over their databases. 
Enterprises are trying either to help their customers find information more efficiently or to improve the overall experience of their applications. For outward-facing use cases, customers want capabilities like enterprise search on their data, replacing the current drop-downs and filters in their applications with a simple search-like experience in which their customers ask questions in natural language and get responses from their databases.\x3C/p>\r\n\x3Cp id=\"c3d2\" class=\"pw-post-body-paragraph ot ou il ov b ow ox oy oz pa pb pc pd gm pe pf pg gp ph pi pj gs pk pl pm pn hm bj\" data-selectable-paragraph=\"\">From my time at both Microsoft and Amazon, I’ve seen BI teams being randomized by the constant questions that leadership teams ask of the data. Every time, a new ad-hoc report gets created, and enterprises end up with hundreds of reports, wasting both time and resources. We also observe internal-facing use cases where customers ask ad-hoc questions over their database instances, replacing manually created reports in SSRS and Power BI with questions in natural language. Imagine if enterprises had a natural language search bar where leadership could ask questions of their database instances and see results from across thousands of tables.\x3C/p>\r\n\x3Cp id=\"17e7\" class=\"pw-post-body-paragraph ot ou il ov b ow ox oy oz pa pb pc pd gm pe pf pg gp ph pi pj gs pk pl pm pn hm bj\" data-selectable-paragraph=\"\">At \x3Ca class=\"af os\" href=\"https://www.tursio.ai/\" target=\"_blank\" rel=\"noopener ugc nofollow\">Tursio\x3C/a>, we are turning SQL Servers into GenAI machines. Enterprise customers running SQL Server instances anywhere, on-premises (yes, you heard that right!) or in the cloud, can get an in-situ GenAI solution using Tursio. 
Tursio can be deployed entirely on-premises (without any cloud connectivity), where enterprise customers can ask questions in natural language and get responses from within their databases. All the data modeling happens inside SQL Server instances and there is zero data movement. None of the data ever leaves your SQL Server. Tursio understands the ontology of the data, and as the underlying data changes, the models are constantly refreshed, providing customers with accurate and up-to-date results from the database whenever a question is asked. Enterprises can invoke the same search bar using a simple REST API endpoint from within their applications. Tursio tries to look beyond just answering the questions enterprises are asking, to the value they are seeking once they get the answer: Are customers trying to predict demand? Are customers trying to find anomalies? Are customers trying to forecast? Are they trying to classify? Customers using the Tursio platform get predictive insights from the data, allowing them to make business decisions faster and improve time to value, all within 3 seconds. Customers can define their own KPIs, and Tursio constantly learns and fine-tunes the data models to provide accurate results.\x3C/p>\r\n\x3Cp id=\"d6d7\" class=\"pw-post-body-paragraph ot ou il ov b ow ox oy oz pa pb pc pd gm pe pf pg gp ph pi pj gs pk pl pm pn hm bj\" data-selectable-paragraph=\"\">If you are a SQL Server customer and want to turbocharge your applications with GenAI capabilities without your data ever leaving SQL Server, feel free to drop a note below. In addition to SQL Server and Azure SQL, the \x3Ca class=\"af os\" href=\"https://www.tursio.ai/\" target=\"_blank\" rel=\"noopener ugc nofollow\">Tursio\x3C/a> platform also supports additional databases and data warehouses like Microsoft Fabric, AWS Redshift, Snowflake, Google BigQuery, Teradata, PostgreSQL, MySQL, etc. 
Here are some teaser screenshots of bringing generative AI to your data:\x3C/p>\r\n\x3Cp data-selectable-paragraph=\"\">\x3Cimg class=\"aligncenter\" src=\"https://miro.medium.com/v2/resize:fit:1400/1*5DVGzvCIOBNyjFSskDRH1A.png\" width=\"1367\" height=\"657\" />\x3C/p>\r\n\x3Cp style=\"text-align: left;\" data-selectable-paragraph=\"\">\x3Cstrong>Example 1. Enterprise Search Questions using Tursio\x3C/strong>\x3C/p>\r\n \r\n\x3Cp data-selectable-paragraph=\"\">\x3Cimg class=\"aligncenter\" src=\"https://miro.medium.com/v2/resize:fit:1400/1*sBcydIkWZCaIOj2LHOQQKQ.png\" width=\"1319\" height=\"614\" />\x3C/p>\r\n\x3Cp style=\"text-align: left;\" data-selectable-paragraph=\"\">\x3Cstrong>Example 2. Understanding business KPIs using Tursio\x3C/strong>\x3C/p>\r\n \r\n\x3Cp data-selectable-paragraph=\"\">\x3Cimg class=\"aligncenter\" src=\"https://miro.medium.com/v2/resize:fit:1400/1*xf-41yI8PqNPHvUWStsLzg.png\" />\x3C/p>\r\n\x3Cp style=\"text-align: left;\" data-selectable-paragraph=\"\">\x3Cstrong>Example 3. Analytical Questions using Tursio\x3C/strong>\x3C/p>",can_share_on_x:!0,can_share_on_facebook:!0,can_share_on_linkedin:!0,blog_image:"https://blog.tursio.ai/wp-content/uploads/2025/05/1_m4vpPEg6lFrfjuyi7Ev2kg.webp",author:"Rony Chatterjee",author_image:void 0,published_date:"August 20, 2024",tags:"Databases, Generative AI",description:"SQL Server powers enterprise workloads across industries, offering robust, secure, and performant database engines with long-term support. Enterprises face challenges when upgrading versions, managing ML workloads, and enabling Generative AI while keeping data secure. Traditional approaches often expose data when models are trained externally or ad-hoc reports are created. Tursio addresses this by turning SQL Server into an in-situ GenAI machine, fully on-premises or in the cloud. 
Data never leaves SQL Server, models refresh automatically, and users can ask natural language questions to gain predictive insights, KPI analysis, and ad-hoc analytics, enabling faster business decisions and enhanced application experiences."}},$R[39]={id:1186,slug:"generative-ai-for-simpler-and-smarter-decision-making-in-supply-chain",acf:$R[40]={title:"Generative AI for Simpler and Smarter Decision-Making in Supply Chain",content:"\x3Ch2 id=\"addd\" class=\"pw-post-body-paragraph ou ov il ow b ox oy oz pa pb pc pd pe gm pf pg ph gp pi pj pk gs pl pm pn po hm bj\">\x3C/h2>\r\n \r\n\x3Cp id=\"e0dc\" class=\"pw-post-body-paragraph ou ov il ow b ox oy oz pa pb pc pd pe gm pf pg ph gp pi pj pk gs pl pm pn po hm bj\" data-selectable-paragraph=\"\">The search for better supply chain management goes back a long way. In the late 1950s, when the U.S. Navy tackled the Polaris missile program, facing nearly 70,000 tasks and thousands of contractors, it adopted PERT (Program Evaluation and Review Technique), which enabled it to deliver the missile 18 months early. Around the same time, DuPont revolutionized plant shutdowns with the Critical Path Method (CPM), cutting shutdown time by 32 days and saving $1 million in the process.\x3C/p>\r\n\x3Cp id=\"25fa\" class=\"pw-post-body-paragraph ou ov il ow b ox oy oz pa pb pc pd pe gm pf pg ph gp pi pj pk gs pl pm pn po hm bj\" data-selectable-paragraph=\"\">But today’s supply chains are vastly more complex. Walmart handles 200 million transactions weekly [1], Amazon’s fulfillment centers could fit 28 football fields [2], and Toyota measures inventory in hours, not days. The recent blockage of the Suez Canal by the Ever Given, costing $400 million per day [3], is just one example of the challenges we face. 
While PERT and CPM laid the groundwork, traditional methods are no longer sufficient.\x3C/p>\r\n\r\n\x3Ch2>\x3C/h2>\r\n \r\n\x3Ch1 id=\"addd\" class=\"pw-post-body-paragraph ou ov il ow b ox oy oz pa pb pc pd pe gm pf pg ph gp pi pj pk gs pl pm pn po hm bj\">\x3Cstrong class=\"ow im\">The complexity of current supply chains\x3C/strong>\x3C/h1>\r\n\x3Cp id=\"9ce4\" class=\"pw-post-body-paragraph ou ov il ow b ox oy oz pa pb pc pd pe gm pf pg ph gp pi pj pk gs pl pm pn po hm bj\" data-selectable-paragraph=\"\">Modern supply chains rely on massive ERP databases and CRM systems, with constantly updated information, to track inventory, monitor demand, and optimize logistics. These tools often require a lot of expertise and a deep understanding of how they are implemented, making them inaccessible to many within the organization. This creates a skills gap that limits adoption and forces companies to invest heavily in specialized teams, requiring extensive training and onboarding, which can take months, if not years, and incur significant operational costs. Even with these teams in place, the sheer volume of data often leads to analysis paralysis, delaying critical decisions and hindering a company’s ability to respond quickly to market shifts or disruptions. Take, for example, Nike’s infamous supply chain glitch in 2000. The company’s overreliance on complex demand-planning software resulted in a $100 million loss and a significant backlog of orders. This costly mistake highlights the danger of relying solely on outdated tools [4]. 
Overall, enterprises are too dependent on traditional, sluggish methods, calling to mind the saying often attributed to Einstein, \x3Cem class=\"pp\">“Insanity is doing the same thing over and over and expecting different results”.\x3C/em>\x3C/p>\r\n\r\n\x3Ch2>\x3C/h2>\r\n \r\n\x3Ch1 id=\"bc85\" class=\"pw-post-body-paragraph ou ov il ow b ox oy oz pa pb pc pd pe gm pf pg ph gp pi pj pk gs pl pm pn po hm bj\">\x3Cstrong class=\"ow im\">How can Generative AI help?\x3C/strong>\x3C/h1>\r\n\x3Cp id=\"d34a\" class=\"pw-post-body-paragraph ou ov il ow b ox oy oz pa pb pc pd pe gm pf pg ph gp pi pj pk gs pl pm pn po hm bj\" data-selectable-paragraph=\"\">The complexity of modern supply chains makes it hard to get a clear picture of the current state and to watch out for future disruptions and opportunities. Here are three ways Generative AI can help simplify and improve decision-making:\x3C/p>\r\n\x3Cp id=\"00a7\" class=\"pw-post-body-paragraph ou ov il ow b ox oy oz pa pb pc pd pe gm pf pg ph gp pi pj pk gs pl pm pn po hm bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"ow im\">a. Beyond Spreadsheets and Dashboards, Using Natural Language\x3C/strong>\x3C/p>\r\n\x3Cp id=\"b1a9\" class=\"pw-post-body-paragraph ou ov il ow b ox oy oz pa pb pc pd pe gm pf pg ph gp pi pj pk gs pl pm pn po hm bj\" data-selectable-paragraph=\"\">NLP models, trained on massive datasets of supply chain-specific information, can understand complex queries like “What’s the impact of the recent flood on my electronics components from Indonesia?”, deciphering the crucial relationships between locations, events, products, and potential disruptions to provide actionable intelligence. This approach also involves leveraging specialized data structures to interpret and respond to questions, disambiguating intent based on domain-specific vocabulary, and optimizing for interactive performance and low cost. 
Planners asking, “What’s the impact of recent port congestion on my furniture shipments from Vietnam?” need a solution that understands not just the words but the intricate relationships between shipping routes, product types, and real-time events. Generative AI can create specialized data structures and domain-specific compilers to interpret and respond to supply-chain questions accurately.\x3C/p>\r\n\x3Cp id=\"ebf4\" class=\"pw-post-body-paragraph ou ov il ow b ox oy oz pa pb pc pd pe gm pf pg ph gp pi pj pk gs pl pm pn po hm bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"ow im\">b. Anticipating Disruptions Before They Happen\x3C/strong>\x3C/p>\r\n\x3Cp id=\"e66d\" class=\"pw-post-body-paragraph ou ov il ow b ox oy oz pa pb pc pd pe gm pf pg ph gp pi pj pk gs pl pm pn po hm bj\" data-selectable-paragraph=\"\">Searching through the high-dimensional decision space can help to anticipate disruptions, mitigate risks, and optimize resources proactively. In contrast, reactive approaches tend to be fragile, as seen in the COVID-19 pandemic when global trade plummeted by 5.3% in 2020 [5], while a staggering 94% of Fortune 1000 companies suffered disruptions [6]. Businesses need to adopt proactive approaches that predict in real time and at scale. Such a transformation requires algorithms that find patterns and similarities at a deeper level over vast datasets of historical supply chain events. Identifying semantic similarities, subtle trends, and anomalies can help predict potential disruptions before they occur. By integrating these AI models with real-time data feeds from sources like IoT sensors, news outlets, and social media, businesses can maintain a constant pulse on emerging risks and dynamically adjust their strategies to ensure an uninterrupted flow of goods and services. 
Generative AI can help collate, combine, analyze, and synthesize insights to make supply chain truly resilient and agile.\x3C/p>\r\n\x3Cp id=\"c84b\" class=\"pw-post-body-paragraph ou ov il ow b ox oy oz pa pb pc pd pe gm pf pg ph gp pi pj pk gs pl pm pn po hm bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"ow im\">c. Democratizing Insights\x3C/strong>\x3C/p>\r\n\x3Cp id=\"6535\" class=\"pw-post-body-paragraph ou ov il ow b ox oy oz pa pb pc pd pe gm pf pg ph gp pi pj pk gs pl pm pn po hm bj\" data-selectable-paragraph=\"\">Generative AI doesn’t just benefit data scientists and supply chain experts. By making data accessible and understandable to everyone, it empowers all stakeholders, from procurement managers to warehouse operators, to make informed, data-driven decisions. No longer confined to complex spreadsheets or technical jargon, insights become accessible through intuitive interfaces powered by natural language processing and data visualization tools.\x3C/p>\r\n\x3Cp id=\"f5e4\" class=\"pw-post-body-paragraph ou ov il ow b ox oy oz pa pb pc pd pe gm pf pg ph gp pi pj pk gs pl pm pn po hm bj\" data-selectable-paragraph=\"\">Generative AI is an exciting new way to take supply chain forward, empowering businesses to know better and act faster. Tursio and FirstShift have partnered on one such effort to bring domain-specific knowledge together with specialized data and language models. 
Read more about it in the whitepaper \x3Ca href=\"https://www.firstshift.ai/resources/democratizing-supply-chain-insights-diagnostics-domain-specific-generative-ai?source=post_page\">here\x3C/a>.\x3C/p>\r\n \r\n\r\n \r\n\x3Ch1 id=\"722c\" class=\"pw-post-body-paragraph ou ov il ow b ox oy oz pa pb pc pd pe gm pf pg ph gp pi pj pk gs pl pm pn po hm bj\">\x3Cstrong class=\"ow im\">References:\x3C/strong>\x3C/h1>\r\n\x3Col class=\"\">\r\n \t\x3Cli id=\"3c33\" class=\"ou ov il ow b ox oy oz pa pb pc pd pe gm pf pg ph gp pi pj pk gs pl pm pn po qn qo qp bj\" data-selectable-paragraph=\"\">“The Most Powerful Man in Payments,” Jessica Leber, MIT Technology Review, March 29, 2012.\x3C/li>\r\n \t\x3Cli id=\"273f\" class=\"ou ov il ow b ox qq oz pa pb qr pd pe gm qs pg ph gp qt pj pk gs qu pm pn po qn qo qp bj\" data-selectable-paragraph=\"\">In-Person Amazon Fulfillment Center Tours, Amazon.\x3C/li>\r\n \t\x3Cli id=\"1cf1\" class=\"ou ov il ow b ox qq oz pa pb qr pd pe gm qs pg ph gp qt pj pk gs qu pm pn po qn qo qp bj\" data-selectable-paragraph=\"\">“The cost of the Suez Canal blockage,” Mary-Ann Russon, BBC News, March 29, 2021.\x3C/li>\r\n \t\x3Cli id=\"5666\" class=\"ou ov il ow b ox qq oz pa pb qr pd pe gm qs pg ph gp qt pj pk gs qu pm pn po qn qo qp bj\" data-selectable-paragraph=\"\">Case Study 16: Nike’s 100 Million Dollar Supply Chain ‘Speed Bump’, October 16, 2022.\x3C/li>\r\n \t\x3Cli id=\"529f\" class=\"ou ov il ow b ox qq oz pa pb qr pd pe gm qs pg ph gp qt pj pk gs qu pm pn po qn qo qp bj\" data-selectable-paragraph=\"\">World Trade Organization, Press Release No. 
PRESS/876, August 8, 2024.\x3C/li>\r\n \t\x3Cli id=\"f58a\" class=\"ou ov il ow b ox qq oz pa pb qr pd pe gm qs pg ph gp qt pj pk gs qu pm pn po qn qo qp bj\" data-selectable-paragraph=\"\">“94% of the Fortune 1000 are seeing coronavirus supply chain disruptions: Report,” Erik Sherman, Fortune, February 21, 2020\x3C/li>\r\n\x3C/ol>\r\n ",can_share_on_x:!0,can_share_on_facebook:!0,can_share_on_linkedin:!0,blog_image:"https://blog.tursio.ai/wp-content/uploads/2025/05/1_z43IoHHA9Q7Pqyd_Ome7bw.webp",author:"Tursio Editorial",author_image:void 0,published_date:"August 8, 2024",tags:"Generative AI",description:"Modern supply chains are vastly complex, relying on massive ERP and CRM systems, real-time updates, and specialized teams. Traditional methods like PERT and CPM are no longer sufficient, as seen in disruptions such as the Ever Given blockage or Nike’s 2000 supply chain glitch. Generative AI can transform supply chain management by enabling natural language queries, anticipating disruptions via pattern recognition over vast historical and real-time data, and democratizing insights for all stakeholders. By combining domain-specific knowledge with specialized AI models, businesses can proactively optimize operations, mitigate risks, and make faster, smarter decisions, ushering in a more resilient and agile supply chain."}},$R[41]={id:1187,slug:"introducing-tursio",acf:$R[42]={title:"Introducing Tursio",content:"\x3Ch1 id=\"d6b6\" class=\"oq or ho bf os ot ou ov fh ow ox oy fk oz pa pb pc pd pe pf pg ph pi pj pk pl bk\" data-selectable-paragraph=\"\">A New Era\x3C/h1>\r\n\x3Cp id=\"7331\" class=\"pw-post-body-paragraph pm pn ho po b pp pq pr ps pt pu pv pw fl px py pz fo qa qb qc fr qd qe qf qg gp bk\" data-selectable-paragraph=\"\">Humans differentiate by their desire to win. If ancient people won by natural abundances and medieval people won by armies, modern day people win by technology. Today, faster adoption of new technology is key to staying ahead of the curve. 
But this has not been easy since new technology brings new pain. The industrial age took a hundred years to mature, and the internet and cloud computing took two decades to become mainstream. Will generative AI be different?\x3C/p>\r\n\r\n\x3Ch1 id=\"8a76\" class=\"oq or ho bf os ot ou ov fh ow ox oy fk oz pa pb pc pd pe pf pg ph pi pj pk pl bk\" data-selectable-paragraph=\"\">Rise of AI Machines\x3C/h1>\r\n\x3Cp id=\"c959\" class=\"pw-post-body-paragraph pm pn ho po b pp pq pr ps pt pu pv pw fl px py pz fo qa qb qc fr qd qe qf qg gp bk\" data-selectable-paragraph=\"\">Both the industrial age and cloud computing were built on the premise of doing more work, i.e., scaling the number of people or the number of compute servers. On the other hand, artificial intelligence (AI) is rooted in doing smart work, i.e., figuring out what to do before doing it at scale. This is music to the ears of modern businesses that are far too complex and suffer from a lack of decision-making on a daily basis, leading to the existential question of how to win. AI can look into the past and learn patterns that could be used for future decisions. Generative AI can further generate new data points that never existed, opening up brand-new possibilities for the business. No wonder the massive excitement around generative AI is fueled by the belief that this new technology can help people win like never before.\x3C/p>\r\n\r\n\x3Ch1 id=\"d64d\" class=\"oq or ho bf os ot ou ov fh ow ox oy fk oz pa pb pc pd pe pf pg ph pi pj pk pl bk\" data-selectable-paragraph=\"\">Tursio Story\x3C/h1>\r\n\x3Cp id=\"a5d9\" class=\"pw-post-body-paragraph pm pn ho po b pp pq pr ps pt pu pv pw fl px py pz fo qa qb qc fr qd qe qf qg gp bk\" data-selectable-paragraph=\"\">Every single company today wants to be an AI company. They want to win by working smarter, and leveraging generative AI is a no-brainer. However, the question is how to adopt generative AI. 
While countless foundation models and related tooling are screaming off the shelves, enterprises are still figuring out where to start from. It turns out that AI is only as good as the data it is trained on, and for enterprises, the data lives in databases. The prevailing wisdom is to move data to a new generative AI stack for fine-tuning, retrieval augmentation, or prompting. Unfortunately, this introduces complexity, cost, and risk.\x3C/p>\r\n\x3Cp id=\"5be1\" class=\"pw-post-body-paragraph pm pn ho po b pp qh pr ps pt qi pv pw fl qj py pz fo qk qb qc fr ql qe qf qg gp bk\" data-selectable-paragraph=\"\">\x3Cstrong class=\"po hp\">Turning databases into generative AI machines.\x3C/strong> We believe that instead of moving data out, AI should come to the data. And today, I am happy to announce Tursio, which is on a mission to turn enterprise databases into generative AI machines. This requires us to reimagine the existing database architectures and make them work for the new age generative AI workload. We started on this mission a year and a half ago, and having talked to hundreds of customers, we are convinced that databases are the right home for generative AI, bringing this new technology from lab to market and enabling enterprises to drive new efficiencies and experiences in their business. 
Our goal is to meet these customers where they are, right within their cloud, hybrid, or on-premises database environments.\x3C/p>\r\n\r\n\x3Ch1 id=\"1ff6\" class=\"oq or ho bf os ot ou ov fh ow ox oy fk oz pa pb pc pd pe pf pg ph pi pj pk pl bk\" data-selectable-paragraph=\"\">The Tursions\x3C/h1>\r\n\x3Cp id=\"499a\" class=\"pw-post-body-paragraph pm pn ho po b pp pq pr ps pt pu pv pw fl px py pz fo qa qb qc fr qd qe qf qg gp bk\" data-selectable-paragraph=\"\">A lot has gone into bringing Tursio to life, and here are a few snapshots from behind the scenes.\x3C/p>\r\n\x3Cimg class=\"aligncenter size-full wp-image-49\" src=\"https://blog.tursio.ai/wp-content/uploads/2025/04/1_S8bi5ri-KF0NtWmbyr2ojA.webp\" alt=\"\" width=\"100%\" />\r\n\r\n \r\n\r\n[embed]https://youtu.be/66nTX1LIKwM[/embed]\r\n\r\nExciting times at Tursio, where the journey has just begun. Looking forward to helping people win!",can_share_on_x:!0,can_share_on_facebook:!0,can_share_on_linkedin:!0,blog_image:"https://blog.tursio.ai/wp-content/uploads/2025/04/0_y-pZsitoOmA5kYui.webp",author:"Alekh Jindal",author_image:void 0,published_date:"July 19, 2024",tags:"Generative AI, Spotlight Stories",description:"Humans have always competed by leveraging resources, armies, or technology. Today, winning depends on adopting new technologies rapidly, with generative AI promising smarter, faster decisions. Unlike industrial or cloud revolutions, AI focuses on doing smart work—analyzing past patterns and generating new possibilities. Enterprises struggle to adopt AI due to data migration complexity, cost, and risk. Tursio’s mission is to bring generative AI to enterprise databases instead, reimagining architectures to handle AI workloads in-place. 
By keeping data within existing systems, Tursio enables smarter insights, faster deployment, and operational efficiency, helping businesses leverage AI without moving their critical data."}},$R[43]={id:1189,slug:"database-workloads-a-song-of-ice-and-fire",acf:$R[44]={title:"Database Workloads: A Song of Ice and Fire",content:"\x3Ch1 id=\"2a19\" class=\"zc zd tg be ze lv zf lw ly lz zg ma mc jc zh jd jg mf zi mg mj mk zj ml mo zk bj\" data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Cp id=\"a77e\" class=\"pw-post-body-paragraph yh yi tg yj b yk yl ym yn yo yp yq yr ry ys yt yu sb yv yw yx se yy yz za zb ew bj\" data-selectable-paragraph=\"\">Databases have evolved to serve business workloads over the last 50 years. However, in 2005, Mike Stonebraker and Uğur Çetintemel described the previous 25 years of database development as “One size fits all” [1], i.e., referring to the monolithic database architectures for all types of workloads. Of course, the other extreme would be to build a new database for every new workload. Let us look at how the 20 years since the Stonebraker paper fared between these two extremes.\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"2a19\" class=\"zc zd tg be ze lv zf lw ly lz zg ma mc jc zh jd jg mf zi mg mj mk zj ml mo zk bj\" data-selectable-paragraph=\"\">2010: Big Data\x3C/h1>\r\n\x3Cp id=\"cb21\" class=\"pw-post-body-paragraph yh yi tg yj b yk zl ym yn yo zm yq yr ry zn yt yu sb zo yw yx se zp yz za zb ew bj\" data-selectable-paragraph=\"\">2010s saw a huge wave of large-scale data analytics, inspired by Google’s data processing infrastructure and popularized by the open-source Hadoop data stack. The argument was that big data is not a database workload anymore, and it cannot fit into traditional database architectures. 
Hadoop, in particular, was considered a completely new platform to process unstructured or semi-structured data, write MapReduce programs, and scale massively to a large number of machines. All this at much lower performance and flexibility compared to databases.\x3C/p>\r\n\x3Cp id=\"8dec\" class=\"pw-post-body-paragraph yh yi tg yj b yk yl ym yn yo yp yq yr ry ys yt yu sb yv yw yx se yy yz za zb ew bj\" data-selectable-paragraph=\"\">While researchers were still debating whether MapReduce is friend or foe with parallel DBMSs, practitioners were already busy deploying Hadoop into their data stack. The main drivers for this active interest were ease of use — people can start running Hadoop directly on their existing files (or data lakes) and write simpler imperative programs without being a SQL expert— and lower cost — no need to pay expensive license cost of DBMSs, build complex ingestion pipelines, or hire expert DBAs before they can make the databases run.\x3C/p>\r\n\x3Cp id=\"d806\" class=\"pw-post-body-paragraph yh yi tg yj b yk yl ym yn yo yp yq yr ry ys yt yu sb yv yw yx se yy yz za zb ew bj\" data-selectable-paragraph=\"\">Over time, however, people realized that Hadoop and big data platforms resemble databases in more ways than they had imagined. Hadoop implements all typical database operations, but they are hard-coded into a static execution plan. Making that flexible and allowing for alternate implementations of those operators, just like databases — can make Hadoop equally flexible and performant [2]. Furthermore, the query language can still be structured with numerous extensions. This philosophy was espoused by efforts like Hive, LLAP, Impala, SCOPE, Spark, BigQuery, among others.\x3C/p>\r\n\x3Cp id=\"d0f9\" class=\"pw-post-body-paragraph yh yi tg yj b yk yl ym yn yo yp yq yr ry ys yt yu sb yv yw yx se yy yz za zb ew bj\" data-selectable-paragraph=\"\">Today, big data platforms have morphed into modern cloud-native data warehousing platforms. 
They have all the elements of traditional database architecture, including SQL, query optimization, query processing, data layouts, partitioning, indexing, materialized views, and so on. They also support open formats to process data directly from the data lake, and provide a lot of tooling and automation, e.g., built-in partitioning, auto-scaling, workload management, etc., for a better user experience. Hadoop may not be deployed anymore, but the workloads inspired by Hadoop have been fully assimilated by database architectures.\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"2538\" class=\"zc zd tg be ze lv zf lw ly lz zg ma mc jc zh jd jg mf zi mg mj mk zj ml mo zk bj\" data-selectable-paragraph=\"\">2015: Graph Analytics\x3C/h1>\r\n\x3Cp id=\"a154\" class=\"pw-post-body-paragraph yh yi tg yj b yk zl ym yn yo zm yq yr ry zn yt yu sb zo yw yx se zp yz za zb ew bj\" data-selectable-paragraph=\"\">Graph analytics became a hot topic around 2015, popularized by new applications in social media, ecommerce, retail, transportation, recommendation systems, and web search. The core idea was that linked data is different and does not fit existing databases. It involves operations like graph traversals, shortest paths, spanning trees, cliques, etc., over large graphs, which are hard to support in databases.\x3C/p>\r\n\x3Cp id=\"cda3\" class=\"pw-post-body-paragraph yh yi tg yj b yk yl ym yn yo yp yq yr ry ys yt yu sb yv yw yx se yy yz za zb ew bj\" data-selectable-paragraph=\"\">As it did with MapReduce, Google introduced another data processing framework, called Pregel, for running graph analytics in bulk synchronous parallel (BSP) fashion. This design gained popularity with an open-source implementation, called Giraph, on top of Hadoop, and a commercial implementation called GraphLab. 
Several other extensions continued developing these specialized graph systems further.\x3C/p>\r\n\x3Cp id=\"7d7f\" class=\"pw-post-body-paragraph yh yi tg yj b yk yl ym yn yo yp yq yr ry ys yt yu sb yv yw yx se yy yz za zb ew bj\" data-selectable-paragraph=\"\">Over time, as with Hadoop, people realized that graph analytics can also be supported in relational databases. Like Hadoop, Giraph has a static execution plan that processes parallel vertex computations in supersteps. This can be expressed as a database query execution plan and optimized using a combination of SQL and user-defined functions [3]. Furthermore, column store layouts can perform fast self-joins to traverse the graph iteratively.\x3C/p>\r\n\x3Cp id=\"d883\" class=\"pw-post-body-paragraph yh yi tg yj b yk yl ym yn yo yp yq yr ry ys yt yu sb yv yw yx se yy yz za zb ew bj\" data-selectable-paragraph=\"\">Today, many databases support graph analytics, including SQL Server, Oracle, Teradata, and Spark, among others, and even specialized graph systems have a database-oriented architecture. Application developers can combine SQL and graph operations in the same database while still having the flexibility and performance of Pregel.\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"5cd2\" class=\"zc zd tg be ze lv zf lw ly lz zg ma mc jc zh jd jg mf zi mg mj mk zj ml mo zk bj\" data-selectable-paragraph=\"\">2020: Machine Learning, Data Science\x3C/h1>\r\n\x3Cp id=\"55f8\" class=\"pw-post-body-paragraph yh yi tg yj b yk zl ym yn yo zm yq yr ry zn yt yu sb zo yw yx se zp yz za zb ew bj\" data-selectable-paragraph=\"\">Machine learning and data science became extremely hot topics by the 2020s. Once again, Google came up with TensorFlow to democratize ML platforms, and data science being dubbed the “sexiest job of the 21st century” helped fuel the wave. 
ML workloads were seen as different from database workloads, and they needed a new set of tools for training and deployment.\x3C/p>\r\n\x3Cp id=\"2516\" class=\"pw-post-body-paragraph yh yi tg yj b yk yl ym yn yo yp yq yr ry ys yt yu sb yv yw yx se yy yz za zb ew bj\" data-selectable-paragraph=\"\">The prevailing wisdom was to consider the end-to-end machine learning lifecycle and build new platforms right from feature engineering all the way to model tracking and inference. Data scientists predominantly operate in Python, which has also become one of the most popular languages with a comprehensive ecosystem of libraries and toolchains. All of these were completely isolated from the databases.\x3C/p>\r\n\x3Cp id=\"48ee\" class=\"pw-post-body-paragraph yh yi tg yj b yk yl ym yn yo yp yq yr ry ys yt yu sb yv yw yx se yy yz za zb ew bj\" data-selectable-paragraph=\"\">Yet again, people soon realized that machine learning and databases are better off being close to one another. On the one hand, people started building database extensions to run ML workloads natively within a database, e.g., ML Services in SQL Server or SQL extensions in BigQuery. On the other hand, it became possible to push down data science programs written in Python into scalable database platforms [4]. These developments have led to better-integrated architectures where ML engineers and data scientists can work right on top of the data in their databases.\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"2852\" class=\"zc zd tg be ze lv zf lw ly lz zg ma mc jc zh jd jg mf zi mg mj mk zj ml mo zk bj\" data-selectable-paragraph=\"\">2025: Generative AI\x3C/h1>\r\n\x3Cp id=\"e8ca\" class=\"pw-post-body-paragraph yh yi tg yj b yk zl ym yn yo zm yq yr ry zn yt yu sb zo yw yx se zp yz za zb ew bj\" data-selectable-paragraph=\"\">Generative AI is changing how businesses operate. This time, it was not Google but rather OpenAI that had the Sputnik moment with ChatGPT. 
For enterprises, while 2023 was the year of exploration (what is possible), 2024 seems the year of evaluation (will it work for me), 2025 is likely to be the year of much-awaited evolution (making it for real).\x3C/p>\r\n\x3Cp id=\"d0e7\" class=\"pw-post-body-paragraph yh yi tg yj b yk yl ym yn yo yp yq yr ry ys yt yu sb yv yw yx se yy yz za zb ew bj\" data-selectable-paragraph=\"\">There is a general consensus that AI is only as good as the data. However, the current approach is to treat data as a kitchen sink and throw it into new generative AI systems. These include systems for model finetuning, vector databases for retrieval augmentation, or toolchains to feed massive prompts to the LLMs. The current belief is that none of these can happen within a database, and we need to build completely new generative AI platforms. However, interestingly, we already see many database vendors building their own in-situ vector indexes. So we know where this is going.\x3C/p>\r\n\x3Cp id=\"3d71\" class=\"pw-post-body-paragraph yh yi tg yj b yk yl ym yn yo yp yq yr ry ys yt yu sb yv yw yx se yy yz za zb ew bj\" data-selectable-paragraph=\"\">Will generative AI be yet another workload that could be completely operated within a database? At Tursio, we believe so and are on a mission to \x3Cem class=\"zq\">turn databases into generative AI machines\x3C/em> [5]. The upsides are obvious — data stays within the database, simplified architecture lowers cost and risk, accelerated time to develop and deploy, all using design principles that have been perfected over the last 50 years.\x3C/p>\r\n\x3Cp id=\"ceb5\" class=\"pw-post-body-paragraph yh yi tg yj b yk yl ym yn yo yp yq yr ry ys yt yu sb yv yw yx se yy yz za zb ew bj\" data-selectable-paragraph=\"\">To conclude, does “One size fits all”? 
While the jury may still be out, databases have time and again emerged as a \x3Cem class=\"zq\">free-size\x3C/em> that can fit many.\x3C/p>\r\n\r\n\x3Ch2 data-selectable-paragraph=\"\">\x3C/h2>\r\n \r\n\x3Ch2 id=\"f6ee\" class=\"zr zd tg be ze ru zs rv ly rw zt rx mc ry zu rz sa sb zv sc sd se zw sf sg zx bj\" data-selectable-paragraph=\"\">References\x3C/h2>\r\n\x3Col class=\"\">\r\n \t\x3Cli id=\"d5ad\" class=\"yh yi tg yj b yk zl ym yn yo zm yq yr ry zn yt yu sb zo yw yx se zp yz za zb zy zz aba bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"yj lu\">“One size fits all”: an idea whose time has come and gone\x3C/strong>, Michael Stonebraker and Ugur Çetintemel, \x3Cem class=\"zq\">International Conference on Data Engineering, 2005\x3C/em>.\x3C/li>\r\n \t\x3Cli id=\"fb68\" class=\"yh yi tg yj b yk abb ym yn yo abc yq yr ry abd yt yu sb abe yw yx se abf yz za zb zy zz aba bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"yj lu\">Hadoop++: Making a Yellow Elephant Run Like a Cheetah (Without It Even Noticing)\x3C/strong>, Jens Dittrich, Jorge-Arnulfo Quiane-Ruiz, Alekh Jindal, Yagiz Kargin, Vinay Setty, Jorg Schad, \x3Cem class=\"zq\">International Conference on Very Large Data Bases, 2010\x3C/em>.\x3C/li>\r\n \t\x3Cli id=\"a16a\" class=\"yh yi tg yj b yk abb ym yn yo abc yq yr ry abd yt yu sb abe yw yx se abf yz za zb zy zz aba bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"yj lu\">Graph analytics using vertica relational database\x3C/strong>, Alekh Jindal, Samuel Madden, Malú Castellanos, Meichun Hsu, \x3Cem class=\"zq\">IEEE International Conference on Big Data, 2015\x3C/em>.\x3C/li>\r\n \t\x3Cli id=\"2f4f\" class=\"yh yi tg yj b yk abb ym yn yo abc yq yr ry abd yt yu sb abe yw yx se abf yz za zb zy zz aba bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"yj lu\">Magpie: Python at Speed and Scale using Cloud Backends\x3C/strong>, Alekh Jindal, Venkatesh Emani, Maureen Daum, Olga Poppe, Brandon Haynes, Anna Pavlenko, Ayushi Gupta, Karthik 
Ramachandra, Carlo Curino, Andreas Mueller, Wentao Wu, Hiren Patel, \x3Cem class=\"zq\">Conference on Innovative Data Systems Research, 2021\x3C/em>.\x3C/li>\r\n \t\x3Cli id=\"eac9\" class=\"yh yi tg yj b yk abb ym yn yo abc yq yr ry abd yt yu sb abe yw yx se abf yz za zb zy zz aba bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"yj lu\">Turning Databases Into Generative AI Machines\x3C/strong>, Alekh Jindal, Shi Qiao, Sathwik Reddy Madhula, Kanupriya Raheja, Sandhya Jain, \x3Cem class=\"zq\">Conference on Innovative Data Systems Research, 2024\x3C/em>.\x3C/li>\r\n\x3C/ol>\r\n ",can_share_on_x:!0,can_share_on_facebook:!0,can_share_on_linkedin:!0,blog_image:"https://blog.tursio.ai/wp-content/uploads/2025/04/0_oJRTcxEJaEhoacs9.webp",author:"Alekh Jindal",author_image:void 0,published_date:"May 28, 2024",tags:"Databases, Engineering, Generative AI",description:"Over the last 50 years, databases have evolved from monolithic “one size fits all” systems to flexible platforms handling diverse workloads. The 2010s introduced big data analytics with Hadoop, later integrated into modern cloud-native warehouses with SQL, optimization, and automation. By 2015, graph analytics required specialized systems, now supported in relational databases. The 2020s brought machine learning and data science, initially isolated but increasingly integrated via native database ML extensions. In 2025, generative AI is the next frontier. 
Tursio aims to turn databases into generative AI machines, combining security, scalability, interactivity, and automation, showing that databases can adapt to any modern workload."}},$R[45]={id:1190,slug:"generative-ai-for-enterprise-data",acf:$R[46]={title:"Generative AI for Enterprise Data",content:"\x3Ch1 id=\"de40\" class=\"ov ow ho be ox oy oz pa fg pb pc pd fj pe pf pg ph pi pj pk pl pm pn po pp pq bj\" data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Cp id=\"3a01\" class=\"pw-post-body-paragraph oa ob ho oc b od oe of og oh oi oj ok fk ol om on fn oo op oq fq or os ot ou gp bj\" data-selectable-paragraph=\"\">Generative AI has caught people’s imagination for a variety of tasks — from coding to creative arts — that are otherwise tedious and hard. Enterprise data is likewise tedious and hard for many people, and so the question is whether generative AI can help. Data analytics, in particular, is a natural next task for generative AI with many new approaches coming up. We discuss these below.\x3C/p>\r\n \r\n\x3Ch1 id=\"de40\" class=\"ov ow ho be ox oy oz pa fg pb pc pd fj pe pf pg ph pi pj pk pl pm pn po pp pq bj\" data-selectable-paragraph=\"\">Current Approaches\x3C/h1>\r\n\x3Cp id=\"07f8\" class=\"pw-post-body-paragraph oa ob ho oc b od pr of og oh ps oj ok fk pt om on fn pu op oq fq pv os ot ou gp bj\" data-selectable-paragraph=\"\">Current generative AI approaches for data analytics fall into three main categories, namely (1) \x3Cem class=\"pw\">text-to-SQL\x3C/em>, (2) \x3Cem class=\"pw\">contextual\x3C/em>, and (3) \x3Cem class=\"pw\">finetuning\x3C/em>:\x3C/p>\r\n\r\n\x3Cul class=\"\">\r\n \t\x3Cli id=\"70ea\" class=\"oa ob ho oc b od oe of og oh oi oj ok fk ol om on fn oo op oq fq or os ot ou px py pz bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"oc hp\">Text-to-SQL\x3C/strong>: Generating a SQL query from natural language is a popular technique for data analytics. The idea dates back to at least 1978 when William A. 
Martin presented “\x3Ca class=\"ag nz\" href=\"https://dl.acm.org/doi/10.1145/800127.804086\" target=\"_blank\" rel=\"noopener ugc nofollow\">Natural Language Database Query System\x3C/a>”, called EQS, at MIT. However, recent advances have motivated a lot of people to build natural language interfaces for databases. These include numerous startups, e.g., text2sql.ai, seek.ai, defog.ai, nlsql.com, blazesql.com, and large vendors, e.g., ThoughtSpot, Sage, Microsoft Synapse Fabric, Amazon QuickSight, and Databricks English SDK.\x3C/li>\r\n \t\x3Cli id=\"083d\" class=\"oa ob ho oc b od qa of og oh qb oj ok fk qc om on fn qd op oq fq qe os ot ou px py pz bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"oc hp\">Contextual\x3C/strong>: The other approach is to provide relevant data as context when sending prompts to the language models. OpenAI Code Interpreter, for instance, can provide an entire data file as context. For larger data, retrieval-augmented generation (RAG) can fetch relevant portions using a vector database, e.g., through a framework such as LlamaIndex. We can further improve prompting using few-shot, iterative, chain-of-thought, or tree-of-thought approaches to split the problem into smaller pieces.\x3C/li>\r\n \t\x3Cli id=\"0380\" class=\"oa ob ho oc b od qa of og oh qb oj ok fk qc om on fn qd op oq fq qe os ot ou px py pz bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"oc hp\">Finetuning\x3C/strong>: The final approach is to finetune the language model on the given enterprise data. This requires a massive amount of resources and in-house expertise. Examples include Dolly and BloombergGPT, which were finetuned on data collected within their organizations.\x3C/li>\r\n\x3C/ul>\r\n\x3Cp id=\"a2ab\" class=\"pw-post-body-paragraph oa ob ho oc b od oe of og oh oi oj ok fk ol om on fn oo op oq fq or os ot ou gp bj\" data-selectable-paragraph=\"\">Each of the above approaches has its own set of challenges. 
Let’s discuss the typical ones below.\x3C/p>\r\n \r\n\x3Ch2 id=\"2b9a\" class=\"qf ow ho be ox fe qg ff fg fh qh fi fj fk qi fl fm fn qj fo fp fq qk fr fs ql bj\" data-selectable-paragraph=\"\">Challenge 1: Big Data\x3C/h2>\r\n\x3Cp id=\"a19d\" class=\"pw-post-body-paragraph oa ob ho oc b od pr of og oh ps oj ok fk pt om on fn pu op oq fq pv os ot ou gp bj\" data-selectable-paragraph=\"\">Enterprise data can quickly grow big, proverbially referred to as \x3Cem class=\"pw\">big data\x3C/em>. This is a poor fit for the text-to-SQL approach, which runs queries directly over the raw data. Text-to-SQL is also limited to small schemas that fit within the context size limits, and it requires sophisticated prompting, which could end up describing every single operator, to get queries right on complex schemas. The contextual approach is likewise constrained by context size limits, e.g., a few hundred MB in Code Interpreter or 32k tokens in GPT-4.\x3C/p>\r\n\r\n\x3Cblockquote class=\"qm qn qo\">\r\n\x3Cp id=\"1fd2\" class=\"oa ob pw oc b od oe of og oh oi oj ok fk ol om on fn oo op oq fq or os ot ou gp bj\" data-selectable-paragraph=\"\">This is reminiscent of the quote “Nobody will ever need more than 640K RAM” that is \x3Ca class=\"ag nz\" href=\"https://www.reddit.com/media?url=https%3A%2F%2Fi.redd.it%2F68a75x1v2ez21.jpg\" target=\"_blank\" rel=\"noopener ugc nofollow\">famously attributed\x3C/a> to Bill Gates.\x3C/p>\r\n\x3C/blockquote>\r\n\x3Cp id=\"436f\" class=\"pw-post-body-paragraph oa ob ho oc b od oe of og oh oi oj ok fk ol om on fn oo op oq fq or os ot ou gp bj\" data-selectable-paragraph=\"\">Moreover, new results show that the \x3Ca class=\"ag nz\" href=\"https://arxiv.org/abs/2307.03172\" target=\"_blank\" rel=\"noopener ugc nofollow\">order of data matters\x3C/a> as context size grows. 
Finally, note that big data makes finetuning even more resource-intensive, with GPT models costing between 4 and 100 million dollars and BloombergGPT requiring nearly 1.3 million GPU hours!\x3C/p>\r\n \r\n\x3Ch2 id=\"dfca\" class=\"qf ow ho be ox fe qg ff fg fh qh fi fj fk qi fl fm fn qj fo fp fq qk fr fs ql bj\" data-selectable-paragraph=\"\">Challenge 2: Hallucination\x3C/h2>\r\n\x3Cp id=\"7c5d\" class=\"pw-post-body-paragraph oa ob ho oc b od pr of og oh ps oj ok fk pt om on fn pu op oq fq pv os ot ou gp bj\" data-selectable-paragraph=\"\">Data analytics requires accurate answers; however, recent studies from OpenAI show GPT model accuracy ranging from 50% to 80%. In fact, OpenAI recommends that:\x3C/p>\r\n\r\n\x3Cblockquote class=\"qm qn qo\">\r\n\x3Cp id=\"82ba\" class=\"oa ob pw oc b od oe of og oh oi oj ok fk ol om on fn oo op oq fq or os ot ou gp bj\" data-selectable-paragraph=\"\">“Great care should be taken when using language model outputs, particularly in high-stakes contexts, with the exact protocol (such as human review, grounding with additional context, or avoiding high-stakes uses altogether) matching the needs of specific applications.” — \x3Ca class=\"ag nz\" href=\"https://arxiv.org/pdf/2303.08774.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">GPT-4 Technical Report\x3C/a>.\x3C/p>\r\n\x3C/blockquote>\r\n\x3Cp id=\"b96c\" class=\"pw-post-body-paragraph oa ob ho oc b od oe of og oh oi oj ok fk ol om on fn oo op oq fq or os ot ou gp bj\" data-selectable-paragraph=\"\">No wonder text-to-SQL approaches have accuracy ranging between 50% and 85% on popular leaderboards like \x3Ca class=\"ag nz\" href=\"https://yale-lily.github.io/spider\" target=\"_blank\" rel=\"noopener ugc nofollow\">Spider\x3C/a> and \x3Ca class=\"ag nz\" href=\"https://bird-bench.github.io/\" target=\"_blank\" rel=\"noopener ugc nofollow\">Bird\x3C/a>. More importantly, users need to be experts in SQL so that they can spot and fix errors. 
The contextual and finetuning approaches reduce hallucination, but they need a lot of training data or context, and they still do not eliminate it entirely.\x3C/p>\r\n \r\n\x3Ch2 id=\"c22a\" class=\"qf ow ho be ox fe qg ff fg fh qh fi fj fk qi fl fm fn qj fo fp fq qk fr fs ql bj\" data-selectable-paragraph=\"\">Challenge 3: Data Leak\x3C/h2>\r\n\x3Cp id=\"3e0e\" class=\"pw-post-body-paragraph oa ob ho oc b od pr of og oh ps oj ok fk pt om on fn pu op oq fq pv os ot ou gp bj\" data-selectable-paragraph=\"\">Data leakage is a major concern for enterprises. In fact, an increasing number of companies, including Amazon, Apple, Samsung, Verizon, Bank of America, Goldman Sachs, and others, have banned ChatGPT. Likewise, many countries have blocked access to ChatGPT due to privacy concerns.\x3C/p>\r\n\r\n\x3Cblockquote class=\"qm qn qo\">\r\n\x3Cp id=\"02fc\" class=\"oa ob pw oc b od oe of og oh oi oj ok fk ol om on fn oo op oq fq or os ot ou gp bj\" data-selectable-paragraph=\"\">“We ask that you diligently adhere to our security guideline and failure to do so may result in a breach or compromise of company information resulting in disciplinary action up to and including termination of employment,” — \x3Ca class=\"ag nz\" href=\"https://www.japantimes.co.jp/news/2023/05/02/business/tech/samsung-bans-chatgpt-workplace-use/\" target=\"_blank\" rel=\"noopener ugc nofollow\">Samsung memo\x3C/a>, May 2, 2023.\x3C/p>\r\n\x3C/blockquote>\r\n\x3Cp id=\"866b\" class=\"pw-post-body-paragraph oa ob ho oc b od oe of og oh oi oj ok fk ol om on fn oo op oq fq or os ot ou gp bj\" data-selectable-paragraph=\"\">While text-to-SQL needs to share complete database schemas, retrieval augmented generation (RAG) pulls data out of enterprise data sources and sends it to the language models. 
Using a vector database also requires duplicating data as embeddings into the vector store.\x3C/p>\r\n \r\n\x3Ch2 id=\"31e3\" class=\"qf ow ho be ox fe qg ff fg fh qh fi fj fk qi fl fm fn qj fo fp fq qk fr fs ql bj\" data-selectable-paragraph=\"\">Challenge 4: Interactivity\x3C/h2>\r\n\x3Cp id=\"90e3\" class=\"pw-post-body-paragraph oa ob ho oc b od pr of og oh ps oj ok fk pt om on fn pu op oq fq pv os ot ou gp bj\" data-selectable-paragraph=\"\">Modern analytics demands interactive query performance. Typically, interactivity implies response times within 2 seconds. Unfortunately, text-to-SQL approaches generate direct queries over the raw data, which can end up being slow. This is in contrast to most analytics platforms, which provide mechanisms to transform and aggregate data before querying it interactively, e.g., Imports in Power BI, Extracts in Tableau, PDTs in Looker, Preferred Tables in BigQuery BI Engine, Saved Queries in Superset, and SPICE in Amazon QuickSight.\x3C/p>\r\n\x3Cp id=\"453a\" class=\"pw-post-body-paragraph oa ob ho oc b od oe of og oh oi oj ok fk ol om on fn oo op oq fq or os ot ou gp bj\" data-selectable-paragraph=\"\">Contextual approaches, on the other hand, rely on large context sizes. 
Unfortunately, the response times of language models increase progressively with larger contexts, e.g., \x3Ca class=\"ag nz\" href=\"https://gptforwork.com/tools/openai-api-and-other-llm-apis-response-time-tracker\" target=\"_blank\" rel=\"noopener ugc nofollow\">typical latency\x3C/a> of more than 10 seconds for GPT-4.\x3C/p>\r\n \r\n\x3Ch2 id=\"5b66\" class=\"qf ow ho be ox fe qg ff fg fh qh fi fj fk qi fl fm fn qj fo fp fq qk fr fs ql bj\" data-selectable-paragraph=\"\">The Ideal Scenario\x3C/h2>\r\n\x3Cp id=\"01a7\" class=\"pw-post-body-paragraph oa ob ho oc b od pr of og oh ps oj ok fk pt om on fn pu op oq fq pv os ot ou gp bj\" data-selectable-paragraph=\"\">Ideally, analytics on enterprise data must scale to large data sizes, produce accurate answers, minimize data leaks, and be highly interactive. Is this ideal picture possible?\x3C/p>\r\n \r\n\x3Ch1 id=\"ce51\" class=\"ov ow ho be ox oy oz pa fg pb pc pd fj pe pf pg ph pi pj pk pl pm pn po pp pq bj\" data-selectable-paragraph=\"\">The Good ol’ Data Models\x3C/h1>\r\n\x3Cdiv class=\"gq gr gs gt gu l\">\x3Carticle>\r\n\x3Cdiv class=\"l\">\r\n\x3Cdiv class=\"l\">\x3Csection>\r\n\x3Cdiv class=\"gp hi hj hk hl\">\r\n\x3Cdiv class=\"ab cb\">\r\n\x3Cdiv class=\"ci bg gv gw gx gy\">\r\n\x3Cp id=\"605b\" class=\"pw-post-body-paragraph oa ob ho oc b od oe of og oh oi oj ok fk ol om on fn oo op oq fq or os ot ou gp bj\" data-selectable-paragraph=\"\">Tursio is taking a brand-new approach to generative AI on enterprise data. We build on the well-known concept of data models and apply a generative approach to it. Specifically, we introduce a \x3Cem class=\"pw\">Large Data Model\x3C/em> (LDM) that pre-generates data models and retrieves the relevant ones in response to user queries. Since queries are executed against data models, without any embedding of the physical data, they can \x3Cstrong class=\"oc hp\">scale\x3C/strong> to arbitrary data sizes. 
All answers are rooted in data models and hence guaranteed to be \x3Cstrong class=\"oc hp\">correct\x3C/strong>. Furthermore, data stays within the database at all times, thus providing better \x3Cstrong class=\"oc hp\">privacy\x3C/strong> and lower leakage risk. Finally, Tursio manages all data models with intelligent caching and refresh, making them highly \x3Cstrong class=\"oc hp\">optimized\x3C/strong> for interactive performance.\x3C/p>\r\n\x3Cp id=\"5047\" class=\"pw-post-body-paragraph oa ob ho oc b od oe of og oh oi oj ok fk ol om on fn oo op oq fq or os ot ou gp bj\" data-selectable-paragraph=\"\">Together, the LDM and the LLM form the left and right brains, respectively, combining factfulness and creativity for modern intelligence.\x3C/p>\r\n\x3Cp data-selectable-paragraph=\"\">\x3C/p>\r\n\r\n\x3C/div>\r\n\x3C/div>\r\n\x3C/div>\r\n\x3C/section>\x3C/div>\r\n\x3C/div>\r\n\x3Cdiv class=\"l\">\r\n\x3Cdiv class=\"l\">\x3Csection>\r\n\x3Cdiv class=\"gp hi hj hk hl\">\r\n\x3Cdiv class=\"ab cb\">\r\n\x3Cdiv class=\"ci bg gv gw gx gy\">\r\n\x3Ch3 id=\"2796\" class=\"qf ow ho be ox fe qg ff fg fh qh fi fj fk qi fl fm fn qj fo fp fq qk fr fs ql bj\">Application 1: Automated Q&A\x3C/h3>\r\n\x3Cp id=\"7259\" class=\"pw-post-body-paragraph oa ob ho oc b od pr of og oh ps oj ok fk pt om on fn pu op oq fq pv os ot ou gp bj\" data-selectable-paragraph=\"\">Tursio makes data accessible to everyone within an organization, without sacrificing accuracy, privacy, interactivity, or scale. 
Users can start asking natural language questions, and the system generates the corresponding data models that are shared with other users.\x3C/p>\r\n\r\n\x3Ch2 data-selectable-paragraph=\"\">\x3C/h2>\r\n\x3Ch3 id=\"2274\" class=\"qf ow ho be ox fe qg ff fg fh qh fi fj fk qi fl fm fn qj fo fp fq qk fr fs ql bj\">Application 2: Automated Dashboards\x3C/h3>\r\n\x3Cp id=\"7d86\" class=\"pw-post-body-paragraph oa ob ho oc b od pr of og oh ps oj ok fk pt om on fn pu op oq fq pv os ot ou gp bj\" data-selectable-paragraph=\"\">Tursio generates visualizations for every data model, and users can pin those visualizations onto a dashboard. Thereafter, Tursio also takes care of managing the data models, i.e., generating data pipelines and refresh, thus making dashboarding a no-code affair.\x3C/p>\r\n\r\n\x3Ch2 data-selectable-paragraph=\"\">\x3C/h2>\r\n\x3Ch3 id=\"c16a\" class=\"qf ow ho be ox fe qg ff fg fh qh fi fj fk qi fl fm fn qj fo fp fq qk fr fs ql bj\">Application 3: Automated Monitoring\x3C/h3>\r\n\x3Cp id=\"34b1\" class=\"pw-post-body-paragraph oa ob ho oc b od pr of og oh ps oj ok fk pt om on fn pu op oq fq pv os ot ou gp bj\" data-selectable-paragraph=\"\">Data models change over time, and it is hard for users to keep track of the growing number of data models. 
Tursio helps monitor data models for any unusual behavior and surfaces them if they require attention.\x3C/p>\r\n\r\n\x3C/div>\r\n\x3C/div>\r\n\x3C/div>\r\n\x3Ch2 data-selectable-paragraph=\"\">\x3C/h2>\r\n\x3Cdiv class=\"gp hi hj hk hl\">\r\n\x3Cdiv class=\"ab cb\">\r\n\x3Cdiv class=\"ci bg gv gw gx gy\">\r\n\x3Ch3 id=\"30b9\" class=\"qf ow ho be ox fe qg ff fg fh qh fi fj fk qi fl fm fn qj fo fp fq qk fr fs ql bj\">Application 4: Automated Reports\x3C/h3>\r\n\x3Cp id=\"4ce4\" class=\"pw-post-body-paragraph oa ob ho oc b od pr of og oh ps oj ok fk pt om on fn pu op oq fq pv os ot ou gp bj\" data-selectable-paragraph=\"\">The end goal of analytics is to summarize the actions that need to be taken into a report. Tursio can generate reports over one or more data models, indicating their behavior, trends, and insights.\x3C/p>\r\n\x3Cp id=\"fa6c\" class=\"pw-post-body-paragraph oa ob ho oc b od oe of og oh oi oj ok fk ol om on fn oo op oq fq or os ot ou gp bj\" data-selectable-paragraph=\"\">In summary, a Large Data Model can solve many of the challenges in applying generative AI on enterprise data. It simplifies analytics and makes it accessible to a wider audience. 
But more importantly, it learns to automate tedious manual tasks that would otherwise consume valuable human time.\x3C/p>\r\n\x3Cp id=\"5f48\" class=\"pw-post-body-paragraph oa ob ho oc b od oe of og oh oi oj ok fk ol om on fn oo op oq fq or os ot ou gp bj\" data-selectable-paragraph=\"\">To learn more about our approach, visit \x3Ca href=\"https://www.tursio.ai/\">https://www.tursio.ai\x3C/a> or write to us at contact@tursio.ai\x3C/p>\r\n\r\n\x3C/div>\r\n\x3C/div>\r\n\x3C/div>\r\n\x3C/section>\x3C/div>\r\n\x3C/div>\r\n\x3C/article>\x3C/div>\r\n\x3Cdiv class=\"ab cb\">\r\n\x3Cdiv class=\"ci bg gv gw gx gy\">\r\n\x3Cdiv class=\"qv qw ab jt\">\r\n\x3Cdiv class=\"qx ab\">\r\n\x3Cdiv class=\"qz ee cx ra ha rb rc be b bf z bj eu\">\x3C/div>\r\n\x3C/div>\r\n\x3Cdiv class=\"qx ab\">\x3C/div>\r\n\x3C/div>\r\n\x3C/div>\r\n\x3C/div>",can_share_on_x:!0,can_share_on_facebook:!0,can_share_on_linkedin:!0,blog_image:"https://blog.tursio.ai/wp-content/uploads/2025/04/0_i4wnWqvT8Eo4-kXW.webp",author:"Alekh Jindal",author_image:void 0,published_date:"August 14, 2023",tags:"Engineering, Generative AI",description:"Generative AI shows promise in data analytics, but current approaches—text-to-SQL, contextual methods, and finetuning—face challenges with scale, accuracy, privacy, and interactivity. Big data strains query generation and context sizes; hallucination reduces reliability; enterprise data leaks raise security concerns; and slow response times limit usability. Tursio offers a new path with its Large Data Model (LDM): pre-generated models that ensure scalability, accuracy, and privacy while supporting interactive performance. LDM powers automated Q&A, dashboards, monitoring, and reports—making analytics accessible, secure, and efficient. 
Together with LLMs, it blends factfulness and creativity, enabling organizations to turn enterprise data into actionable intelligence."}},$R[47]={id:1191,slug:"the-incredible-interns",acf:$R[48]={title:"The Incredible Interns",content:"\x3Ch1 id=\"c717\" class=\"pw px il be py pz qa qb gj qc qd qe gl qf qg qh qi qj qk ql qm qn qo qp qq qr bj\" data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Cp id=\"60bd\" class=\"pw-post-body-paragraph ot ou il ov b ow ox oy oz pa pb pc pd gm pe pf pg gp ph pi pj gs pk pl pm pn hm bj\" data-selectable-paragraph=\"\">Internships have become an integral part of our modern education system. Originating from the practice of apprenticeships centuries ago, where the trainee lived with a master to learn the trade, and later evolving into vocational training geared towards factory skills, the intern concept started with medical students in the early twentieth century.\x3C/p>\r\n\x3Cp id=\"8015\" class=\"pw-post-body-paragraph ot ou il ov b ow ox oy oz pa pb pc pd gm pe pf pg gp ph pi pj gs pk pl pm pn hm bj\" data-selectable-paragraph=\"\">Today, internships are well-defined programs that help students in all fields transition from academia to industry. According to estimates, there are \x3Ca class=\"af os\" href=\"https://www.prosperityforamerica.org/internship-statistics/#:~:text=In%20the%20US%2C%20there%20are,the%20organization%20where%20they%20interned\" target=\"_blank\" rel=\"noopener ugc nofollow\">300k interns per year in the US\x3C/a>, with \x3Ca class=\"af os\" href=\"https://www.taylorresearchgroup.com/news/2017/4/5/a-brief-history-of-the-internship\" target=\"_blank\" rel=\"noopener ugc nofollow\">20–40k in Washington DC alone\x3C/a>. 
While the students build their resumes, the employers get entry-level skills at much lower cost, with estimates indicating that companies save $2 billion per year globally via internships.\x3C/p>\r\n\x3Cp id=\"a362\" class=\"pw-post-body-paragraph ot ou il ov b ow ox oy oz pa pb pc pd gm pe pf pg gp ph pi pj gs pk pl pm pn hm bj\" data-selectable-paragraph=\"\">At SmartApps, internships are an integral part of our culture. Our core values when designing an internship program are as follows:\x3C/p>\r\n\r\n\x3Cul class=\"\">\r\n \t\x3Cli id=\"cfd8\" class=\"ot ou il ov b ow ox oy oz pa pb pc pd gm pe pf pg gp ph pi pj gs pk pl pm pn po pp pq bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"ov im\">Paid\x3C/strong> — Even though estimates suggest that 40% of interns are unpaid, we strongly believe that interns must be compensated for their time and effort. The budgets may vary but the intent must be there.\x3C/li>\r\n \t\x3Cli id=\"38a9\" class=\"ot ou il ov b ow pr oy oz pa ps pc pd gm pt pf pg gp pu pi pj gs pv pl pm pn po pp pq bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"ov im\">Project\x3C/strong> — Intern projects must be well defined and thought through before the intern(s) arrive. This is critical to make the program worth anyone’s time and to set the intern up for success.\x3C/li>\r\n \t\x3Cli id=\"67a0\" class=\"ot ou il ov b ow pr oy oz pa ps pc pd gm pt pf pg gp pu pi pj gs pv pl pm pn po pp pq bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"ov im\">Outcomes\x3C/strong> — Projects must have well-defined outcomes that are aligned with the core business and mission. 
This includes having a clear understanding of future ownership and adoption of the work done.\x3C/li>\r\n \t\x3Cli id=\"f0ae\" class=\"ot ou il ov b ow pr oy oz pa ps pc pd gm pt pf pg gp pu pi pj gs pv pl pm pn po pp pq bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"ov im\">Experience\x3C/strong> — The program must be geared towards generating energy, fostering novel ideas, validating possibilities, and inspiring new directions. There must be something for everyone.\x3C/li>\r\n \t\x3Cli id=\"7378\" class=\"ot ou il ov b ow pr oy oz pa ps pc pd gm pt pf pg gp pu pi pj gs pv pl pm pn po pp pq bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"ov im\">Mentorship\x3C/strong> — The everyday experience of an intern is as important as the overall goal. Therefore, it is crucial to pair them up with the right partners who are committed to investing time and resources.\x3C/li>\r\n \t\x3Cli id=\"4cf5\" class=\"ot ou il ov b ow pr oy oz pa ps pc pd gm pt pf pg gp pu pi pj gs pv pl pm pn po pp pq bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"ov im\">Community\x3C/strong> — Internship programs are a great way to serve the community, welcoming new members to the workforce and staying connected with them even after the program ends.\x3C/li>\r\n\x3C/ul>\r\n\x3Cp id=\"cd6f\" class=\"pw-post-body-paragraph ot ou il ov b ow ox oy oz pa pb pc pd gm pe pf pg gp ph pi pj gs pk pl pm pn hm bj\" data-selectable-paragraph=\"\">We are excited to welcome two incredible interns at SmartApps this year. 
Here is a short interview with them as they are getting started.\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"c717\" class=\"pw px il be py pz qa qb gj qc qd qe gl qf qg qh qi qj qk ql qm qn qo qp qq qr bj\" data-selectable-paragraph=\"\">Kanupriya Raheja, Columbia University\x3C/h1>\r\n\x3Cp data-selectable-paragraph=\"\">\x3Cimg class=\"\" src=\"https://miro.medium.com/v2/resize:fit:1400/1*H7z09w8-7jMccfZyXHzUBQ.jpeg\" width=\"600\" height=\"471\" />\x3C/p>\r\n\x3Cp id=\"d1fc\" class=\"pw-post-body-paragraph ot ou il ov b ow ox oy oz pa pb pc pd gm pe pf pg gp ph pi pj gs pk pl pm pn hm bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"ov im\">Tell us about yourself\r\n\x3C/strong>I am Kanu, and I am pursuing my master’s in computer science at Columbia University. I enjoy a variety of things, from solving problems to going on a walk. I have also been doing Sudoku every day since I was a kid.\x3C/p>\r\n\x3Cp id=\"c4e3\" class=\"pw-post-body-paragraph ot ou il ov b ow ox oy oz pa pb pc pd gm pe pf pg gp ph pi pj gs pk pl pm pn hm bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"ov im\">What are your goals for the internship?\r\n\x3C/strong>My goal for this internship is to learn and get better at AI by working hands-on with practical problems.\x3C/p>\r\n\x3Cp id=\"a698\" class=\"pw-post-body-paragraph ot ou il ov b ow ox oy oz pa pb pc pd gm pe pf pg gp ph pi pj gs pk pl pm pn hm bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"ov im\">What other places did you consider?\x3C/strong>\r\nI considered an internship at a security firm in New York but felt the internship at SmartApps suited my interests better.\x3C/p>\r\n\x3Cp id=\"fd6e\" class=\"pw-post-body-paragraph ot ou il ov b ow ox oy oz pa pb pc pd gm pe pf pg gp ph pi pj gs pk pl pm pn hm bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"ov im\">Why did you choose SmartApps?\x3C/strong>\r\nDuring my interview process when I learned more about the company and its 
product, I found it extremely interesting. It seemed like a place where I could learn and grow through challenging projects and mentorship. Additionally, SmartApps’ utilization of cutting-edge technologies further solidified my interest in being part of their team.\x3C/p>\r\n\x3Cp id=\"0209\" class=\"pw-post-body-paragraph ot ou il ov b ow ox oy oz pa pb pc pd gm pe pf pg gp ph pi pj gs pk pl pm pn hm bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"ov im\">Tell us about your project at SmartApps\x3C/strong>\r\nMy project at SmartApps is to learn the behavior of data models in large databases. The main objective is to identify changes in the data model within various parameters and keep customers informed about it. Additionally, I will also be working on better management and analysis of log data and models over it.\x3C/p>\r\n\x3Cp id=\"32ed\" class=\"pw-post-body-paragraph ot ou il ov b ow ox oy oz pa pb pc pd gm pe pf pg gp ph pi pj gs pk pl pm pn hm bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"ov im\">How has the early experience been so far?\x3C/strong>\r\nMy experience so far has been great. We took an iterative approach in my project. This was very helpful in getting started, I have been able to get the first version of my data model smarts to production. I also started working on the log manager and am currently testing it on our own logs at SmartApps.\x3C/p>\r\n\x3Cp id=\"4d54\" class=\"pw-post-body-paragraph ot ou il ov b ow ox oy oz pa pb pc pd gm pe pf pg gp ph pi pj gs pk pl pm pn hm bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"ov im\">What are you looking forward to?\x3C/strong>\r\nI am eagerly looking forward to customers trying out our data model smarts on their workloads and providing us their feedback. Additionally, I am excited to enhance and refine the feature based on the inputs we receive and our own testing. 
Lastly, I am looking forward to a potential trip to Mount Rainier!\x3C/p>\r\n\x3Cp id=\"7ac2\" class=\"pw-post-body-paragraph ot ou il ov b ow ox oy oz pa pb pc pd gm pe pf pg gp ph pi pj gs pk pl pm pn hm bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"ov im\">Any plans for the rest of your stay in Seattle?\x3C/strong>\r\nI don’t have any specific plans, but I am excited to relish all the nature and sunshine that Seattle has to offer. I am looking forward to visiting different parks, going on short hikes, and going kayaking.\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"4958\" class=\"pw px il be py pz qa qb gj qc qd qe gl qf qg qh qi qj qk ql qm qn qo qp qq qr bj\" data-selectable-paragraph=\"\">Sathwik Reddy Madhula, University of California San Diego\x3C/h1>\r\n\x3Cp data-selectable-paragraph=\"\">\x3Cimg class=\"\" src=\"https://miro.medium.com/v2/resize:fit:1206/1*RvnxerE9T-nQDepD4DXgDg.jpeg\" width=\"601\" height=\"509\" />\x3C/p>\r\n\x3Cp id=\"af90\" class=\"pw-post-body-paragraph ot ou il ov b ow ox oy oz pa pb pc pd gm pe pf pg gp ph pi pj gs pk pl pm pn hm bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"ov im\">Tell us about yourself\r\n\x3C/strong>I am interested in NLP and computer engineering. My interests evolved from signals and systems to deep learning. Clearly, I have been taking a curved path rather than a direct one.\x3C/p>\r\n\x3Cp id=\"ca77\" class=\"pw-post-body-paragraph ot ou il ov b ow ox oy oz pa pb pc pd gm pe pf pg gp ph pi pj gs pk pl pm pn hm bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"ov im\">What are your goals for the internship?\r\n\x3C/strong>Real exposure to working in a team and seeing my work deployed in the real world. 
Having my work help other people is the most enjoyable part.\x3C/p>\r\n\x3Cp id=\"65b8\" class=\"pw-post-body-paragraph ot ou il ov b ow ox oy oz pa pb pc pd gm pe pf pg gp ph pi pj gs pk pl pm pn hm bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"ov im\">What other places did you consider?\x3C/strong>\r\nI applied to a lot of industry places and got into the interview rounds of several of them.\x3C/p>\r\n\x3Cp id=\"a95e\" class=\"pw-post-body-paragraph ot ou il ov b ow ox oy oz pa pb pc pd gm pe pf pg gp ph pi pj gs pk pl pm pn hm bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"ov im\">Why did you choose SmartApps?\x3C/strong>\r\nI found the problem they solve very interesting, and the specific work involves a good feature. I also anticipated a lot of guidance from an experienced team since it was small. Interning at a big company may not have gotten my work deployed or seen by real users, while at SmartApps I could see it deployed fast and for the real world.\x3C/p>\r\n\x3Cp id=\"060a\" class=\"pw-post-body-paragraph ot ou il ov b ow ox oy oz pa pb pc pd gm pe pf pg gp ph pi pj gs pk pl pm pn hm bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"ov im\">Tell us about your project at SmartApps\x3C/strong>\r\nMy project involves improving the searchability of the data models and training in-house language models for featurization.\x3C/p>\r\n\x3Cp id=\"39bb\" class=\"pw-post-body-paragraph ot ou il ov b ow ox oy oz pa pb pc pd gm pe pf pg gp ph pi pj gs pk pl pm pn hm bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"ov im\">How has the early experience been so far?\x3C/strong>\r\nIt’s been good, and I have learned a lot. Whatever I did in academics, I saw in parts, but here I combine it all together. 
I have studied databases but never used them in an application; seeing data and AI coming together in an end-to-end application is nice.\x3C/p>\r\n\x3Cp id=\"1ac6\" class=\"pw-post-body-paragraph ot ou il ov b ow ox oy oz pa pb pc pd gm pe pf pg gp ph pi pj gs pk pl pm pn hm bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"ov im\">What are you looking forward to?\x3C/strong>\r\nI am looking forward to deploying my feature to AWS using SageMaker, improving the generation tasks further, running detailed evaluations on them, and working through the corner cases.\x3C/p>\r\n\x3Cp id=\"be1f\" class=\"pw-post-body-paragraph ot ou il ov b ow ox oy oz pa pb pc pd gm pe pf pg gp ph pi pj gs pk pl pm pn hm bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"ov im\">Any plans for the rest of your stay in Seattle?\x3C/strong>\r\nI am hoping to tour Mount Rainier and the Space Needle. I am also looking to explore more restaurants in Seattle.\x3C/p>",can_share_on_x:!0,can_share_on_facebook:!0,can_share_on_linkedin:!0,blog_image:"https://blog.tursio.ai/wp-content/uploads/2025/05/0_kUbvq6P_rgBu4vr1.webp",author:"Alekh Jindal",author_image:void 0,published_date:"July 18, 2023",tags:"Spotlight Stories",description:"Internships, rooted in centuries-old apprenticeships, now play a vital role in bridging academia and industry. With 300k interns annually in the U.S., companies benefit from affordable talent while students gain experience. At SmartApps, internships are guided by six core values: Paid, Project, Outcomes, Experience, Mentorship, and Community. This year, we welcomed two interns: Kanupriya Raheja (Columbia), working on detecting changes in data models and log analysis, and Sathwik Reddy Madhula (UCSD), focused on improving data model searchability and training in-house language models. 
Both value hands-on learning, mentorship, and the chance to see their work deployed in real-world applications."}},$R[49]={id:1173,slug:"enterprise-search-on-structured-data",acf:$R[50]={title:"Enterprise Search on Structured Data",content:"Generative AI is unleashing a new generation of go-getters who believe in getting things done faster than ever before. Most of the success stories leverage pre-trained large language models (LLMs) for personal productivity tasks such as writing, designing, coding, reporting, and web search.\r\n\r\n\x3Cspan data-contrast=\"auto\">However, when it comes to business applications, like analyzing products, supporting customers, or running operations, pre-trained models start to fall short. \x3C/span>\x3Cspan data-contrast=\"auto\">That’s because these tasks depend on structured enterprise data, typically stored as systems of record in relational databases, which generic AI models can’t access or understand easily. \x3C/span>\x3Cspan data-contrast=\"auto\">Pulling relevant pieces of information, aka \x3Cstrong>enterprise search\x3C/strong>, from structured \x3C/span>\x3Cspan data-contrast=\"auto\">databases\x3C/span>\x3Cspan data-contrast=\"auto\">,\x3C/span>\x3Cspan data-contrast=\"auto\"> is critical to assisting in these tasks.\x3C/span>\x3Cspan data-ccp-props=\"{"134233117":false,"134233118":false,"335559738":240,"335559739":240}\"> \x3C/span>\r\n\r\n\x3Cspan data-contrast=\"auto\"> \x3C/span>\x3Cspan data-ccp-props=\"{"134233117":false,"134233118":false,"335559738":240,"335559739":240}\"> \x3C/span>\r\n\x3Ch1>Current Patterns\x3C/h1>\r\nTraditional enterprise search is built around unstructured data, like documents, text, images, emails, and relies on keyword-based retrieval, e.g., \x3Ca href=\"https://www.elastic.co/\">Elastic\x3C/a>, \x3Ca href=\"https://www.glean.com/\">Glean\x3C/a>, and even \x3Ca href=\"https://www.notion.com/product/enterprise-search\">Notion\x3C/a>. 
Structured data, on the other hand, needs valid transformations before it can be retrieved, which often makes the search process error-prone. Popular approaches to improve accuracy include treating structured data as training input, as unstructured embeddings, and as prompts for text-to-SQL.\r\n\x3Ch3>Treating as Training Input\x3C/h3>\r\nSince pre-training is falling short, an obvious solution is to fine-tune the model on structured data (see \x3Ca href=\"https://arxiv.org/pdf/2406.17642\">Lamini memory tuning\x3C/a>). This is a seemingly good approach, but it is not scalable on several counts: (i) fine-tuning is expensive on large databases that change all the time, (ii) the space is too large and even very large databases end up being very sparse, (iii) shifting probabilities to recall the database facts does not consider transformation of those facts, and (iv) fine-tuning still does not guarantee that the transformations on structured data are correct.\r\n\x3Ch3>Treating as Unstructured Embeddings\x3C/h3>\r\nAnother popular approach is retrieval augmented generation (RAG), i.e., converting structured data into unstructured embeddings and retrieving relevant portions in response to queries. Popular vector databases for managing these embeddings include \x3Ca href=\"https://www.pinecone.io/\">Pinecone\x3C/a>, \x3Ca href=\"https://weaviate.io/\">Weaviate\x3C/a>, and \x3Ca href=\"https://qdrant.tech/\">Qdrant\x3C/a>. Like fine-tuning, RAG has cost issues, accuracy issues, and data transformation issues. Additionally, the embedding model itself needs to be tuned to the database at hand for the embeddings to be differentiated. Overall, RAG adds significant infrastructure overheads and long implementation cycles.\r\n\x3Ch3>Treating as Prompts for Text-to-SQL\x3C/h3>\r\nFinally, the third popular approach is text-to-SQL, i.e., prompting pre-trained language models with sufficient context to generate SQL queries. 
The context could include database schemas, sample values, examples of queries, and so on. While this method is lightweight, text-to-SQL suffers from accuracy issues, with the leaderboard accuracy on the \x3Ca href=\"https://bird-bench.github.io/\">BIRD-SQL benchmark\x3C/a> hovering around 77%. This is unacceptable for enterprise users who may not know when to trust the results.\r\n\r\n \r\n\x3Ch1>What do people really want?\x3C/h1>\r\nStepping into the user's shoes, let us consider what they really want.\r\n\x3Ch3>Answers, answers, answers!\x3C/h3>\r\nUltimately, business users care about getting answers to their questions, either to consume directly or to feed into other business applications. Most of these users are not equipped to verify SQL statements, and so text-to-SQL with built-in inaccuracies does not fly. Even presenting tables and charts isn’t\x3Cspan class=\"TrackChangeTextInsertion TrackedChange SCXW255461365 BCX0\">\x3Cspan class=\"TextRun SCXW255461365 BCX0\" lang=\"EN-US\" xml:lang=\"EN-US\" data-contrast=\"auto\">\x3Cspan class=\"NormalTextRun SCXW255461365 BCX0\"> enough. It still puts the burden on the user to interpret and derive conclusions. 
\x3C/span>\x3C/span>\x3C/span>What they really need is direct, trustworthy answers. \x3Ca href=\"https://arxiv.org/abs/2408.14717\">Table augmented generation\x3C/a> (TAG) is one such approach.\r\n\r\nFor example, using \x3Ca href=\"https://www.tursio.ai/\">Tursio\x3C/a> to analyze the extended price for Automobile, Household, and Machinery segments in different nations presents a detailed analysis like the one below:\r\n\r\n\x3Cimg class=\"alignnone wp-image-741\" src=\"https://blog.tursio.ai/wp-content/uploads/2025/07/Screenshot-2025-07-21-at-9.20.42-PM-300x218.png\" alt=\"\" width=\"416\" height=\"302\" />\x3Cimg class=\"alignnone wp-image-742\" src=\"https://blog.tursio.ai/wp-content/uploads/2025/07/Screenshot-2025-07-21-at-9.20.51-PM-300x218.png\" alt=\"\" width=\"426\" height=\"310\" />\r\n\x3Ch3>Better make sure the answers are correct\x3C/h3>\r\nUsers don’t\x3Cspan class=\"TrackChangeTextInsertion TrackedChange 
TrackChangeHoverSelectColorRed SCXW256787932 BCX0\"> just want answers, they want answers they can trust\x3C/span>\x3Cspan class=\"TrackChangeTextInsertion TrackedChange TrackChangeHoverSelectColorRed SCXW256787932 BCX0\">\x3Cspan class=\"TrackedChange SCXW256787932 BCX0\">\x3Cspan class=\"TextRun SCXW256787932 BCX0\" lang=\"EN-US\" xml:lang=\"EN-US\" data-contrast=\"auto\">\x3Cspan class=\"NormalTextRun TrackChangeHoverSelectHighlightRed SCXW256787932 BCX0\"> every single time. 
That means responses must be \x3C/span>\x3C/span>\x3C/span>\x3C/span>factually correct and explainable\x3Cspan class=\"TrackChangeTextInsertion TrackedChange TrackChangeHoverSelectColorRed SCXW256787932 BCX0\">\x3Cspan class=\"TextRun SCXW256787932 BCX0\" lang=\"EN-US\" xml:lang=\"EN-US\" data-contrast=\"auto\">\x3Cspan class=\"NormalTextRun TrackChangeHoverSelectHighlightRed SCXW256787932 BCX0\">, step by step. 
\x3C/span>\x3C/span>\x3C/span>While the presentation can vary, just as different people in a boardroom might describe the same thing differently, the underlying facts must be the same.\r\n\r\nTo illustrate, \x3Ca href=\"https://www.tursio.ai/\">Tursio\x3C/a> interprets queries consistently, every single time, and provides step-by-step explainability as follows:\r\n\r\n\x3Cimg class=\"wp-image-745 aligncenter\" src=\"https://blog.tursio.ai/wp-content/uploads/2025/07/Screenshot-2025-07-21-at-9.42.06-PM-300x253.png\" alt=\"\" width=\"657\" height=\"554\" />\r\n\x3Ch3>People don't really know how to ask\x3C/h3>\r\nStructured data is complex and most business users don't know how to ask for what they are looking for. Rather than being taught how to ask, they need to be guided toward enough clarity to arrive at the right database question or SQL query. Once the right data is at hand, business users know what they want and must have the freedom to express themselves.\r\n\r\n\x3Ca href=\"https://www.tursio.ai/\">Tursio\x3C/a> solves this by providing a search box with auto-prompting to help users navigate through their questions:\r\n\r\n\x3Cimg class=\"wp-image-750 aligncenter\" src=\"https://blog.tursio.ai/wp-content/uploads/2025/07/Screenshot-2025-07-21-at-9.49.02-PM-300x53.png\" alt=\"\" width=\"691\" height=\"122\" />\r\n\x3Ch3>Need to work out of the box\x3C/h3>\r\nAI is exciting, but it is also extremely hard to get working. No wonder most AI POCs fail, and so it is imperative for enterprise search on structured data to work out of the box. 
Business users have neither the expertise nor the patience to tune their search experience.\r\n\r\n\x3Ca href=\"https://www.tursio.ai/\">Tursio\x3C/a> has crafted a simple, straightforward interface to connect databases, run training, and start asking questions within minutes, a fully managed experience with no complexity or security risk:\r\n\r\n\x3Cimg class=\"wp-image-753 aligncenter\" src=\"https://blog.tursio.ai/wp-content/uploads/2025/07/Screenshot-2025-07-21-at-10.24.40-PM-300x155.png\" alt=\"\" width=\"683\" height=\"353\" />\r\n\x3Ch3>Single search box to rule them all\x3C/h3>\r\nMost businesses use multiple database backends and want to search them all in one place. Furthermore, they want the search box to be pluggable into their existing applications.\r\n\r\n\x3Ca href=\"https://www.tursio.ai/\">Tursio\x3C/a> allows adding connections to all major databases and then switching between them freely at query time, using the exact same search box for the exact same experience!\r\n\r\n\x3Cimg class=\"wp-image-756 aligncenter\" src=\"https://blog.tursio.ai/wp-content/uploads/2025/07/Screenshot-2025-07-21-at-10.32.14-PM-300x104.png\" alt=\"\" width=\"776\" height=\"269\" />\r\n\r\n \r\n\x3Ch1>Putting it all together\x3C/h1>\r\nClearly, enterprise search on structured data is picking up. But let's also think about what it means.\r\n\x3Ch3>What makes structured data different?\x3C/h3>\r\nEnterprise search on structured data is not for developers trying to learn and build the best SQL; that is one extreme. It's also not a replacement for dashboards, which you still need for good reasons; that is the other extreme. Rather, enterprise search on structured data is for everyone and everything in between: the vast majority of people who are not SQL ninjas but want to get answers fast. 
Answers to analyze products, understand customers, research markets, devise strategies, and so on, basically anything that needs to be grounded in your company and your products.\r\n\x3Ch3>Where are we heading?\x3C/h3>\r\nWe are witnessing a transformational shift in our software stack, one that is catering to the needs of a new generation of go-getters. They want things to be fast, unified, and simplified.\r\n\r\nStructured databases have long been expert systems accessible to a chosen few. They now need to open up to everyone -- a single search box to serve them all.\r\n\r\n \r\n\r\nNote: \x3Ca href=\"https://www.tursio.ai/\">Tursio\x3C/a> helps customers deploy proprietary enterprise search for their verticals. Please reach out at \x3Ca href=\"mailto:contact@tursio.ai\">contact@tursio.ai\x3C/a> if you are interested in learning more.",can_share_on_x:!0,can_share_on_facebook:!0,can_share_on_linkedin:!0,blog_image:"https://blog.tursio.ai/wp-content/uploads/2025/07/2.png",author:"Alekh Jindal",author_image:void 0,published_date:"July 23, 2025",tags:"Databases, Generative AI",description:"Generative AI is transforming personal productivity, but pre-trained LLMs struggle with structured enterprise data. Traditional search methods (fine-tuning, embeddings, text-to-SQL) face scalability, accuracy, and transformation issues. Business users don’t want SQL or dashboards; they want answers they can trust, consistently. Tursio addresses this with a unified search box, auto-prompting for query guidance, and step-by-step explainable results. It works out of the box across multiple databases, delivering accurate, contextual insights for products, customers, and operations. 
Structured data is no longer just for SQL experts; it’s accessible to everyone, empowering faster, reliable decision-making."}},$R[51]={id:1192,slug:"rethinking-data-pipelines",acf:$R[52]={title:"Rethinking Data Pipelines ",content:"\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"5c96\" class=\"oa ob ho be oc od oe of fg og oh oi fj oj ok ol om on oo op oq or os ot ou ov bj\" data-selectable-paragraph=\"\">Cosmos\x3C/h1>\r\n\x3Cp id=\"d21c\" class=\"pw-post-body-paragraph ow ox ho oy b oz pa pb pc pd pe pf pg fk ph pi pj fn pk pl pm fq pn po pp pq gp bj\" data-selectable-paragraph=\"\">Back in the early 2000s, \x3Ca class=\"ag nz\" href=\"http://vldb.org/pvldb/vol14/p3148-jindal.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">Microsoft started a project\x3C/a> for highly reliable storage of user mailboxes in Hotmail. This was later incubated as \x3Cem class=\"pr\">Cosmos\x3C/em> into their Bing division in 2005. It had a Clientlib for users to handcraft distributed computations over massively growing datasets. Soon they came up with a Nebula algebra to define stages of computations, which was eventually executed using Dryad, and later Perl-SQL to generate the Nebula file from SQL and Perl fragments. This was followed by DiscoSQL to drag and drop stages together, instead of writing scripts, using a GUI.\x3C/p>\r\n\x3Cp id=\"bcf1\" class=\"pw-post-body-paragraph ow ox ho oy b oz ps pb pc pd pt pf pg fk pu pi pj fn pv pl pm fq pw po pp pq gp bj\" data-selectable-paragraph=\"\">Ultimately, by the end of a decade since its origin, data processing in Cosmos turned into a modern compilation and query processing platform called SCOPE (abbreviation for Structured Computations Optimized for Parallel Execution). SCOPE supports both SQL operators and custom operators in C#, Java, Python, and R. 
Today, \x3Ca class=\"ag nz\" href=\"https://www.microsoft.com/en-us/research/publication/microlearner-a-fine-grained-learning-optimizer-for-big-data-workloads-at-microsoft/\" target=\"_blank\" rel=\"noopener ugc nofollow\">Cosmos\x3C/a> powers every single business unit at Microsoft, processing hundreds of thousands of jobs, running over hundreds of thousands of machines, and individual jobs that can consume tens of petabytes of data and run millions of tasks in parallel!\x3C/p>\r\n\x3Cp id=\"744e\" class=\"pw-post-body-paragraph ow ox ho oy b oz ps pb pc pd pt pf pg fk pu pi pj fn pv pl pm fq pw po pp pq gp bj\" data-selectable-paragraph=\"\">The journey of Cosmos from manually handcrafted data processing stages to a declarative and optimized query processing platform is now being replayed in many new data transformation and pipelining tools today.\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"e28c\" class=\"oa ob ho be oc od oe of fg og oh oi fj oj ok ol om on oo op oq or os ot ou ov bj\" data-selectable-paragraph=\"\">The Pipeline-Verse\x3C/h1>\r\n\x3Cp id=\"1656\" class=\"pw-post-body-paragraph ow ox ho oy b oz ps pb pc pd pt pf pg fk pu pi pj fn pv pl pm fq pw po pp pq gp bj\" data-selectable-paragraph=\"\">Data pipelines have become ubiquitous in the modern data stack, especially with the popularity of ELT architectures, where the transformation step is done once the data lands into a data warehouse. Consequently, several new tools, such as Fivetran, DBT, and Airflow, have become popular for copying data, modeling data, and building workflows on top of data. Specifically, Airflow has emerged as one of the most popular data pipeline tools in recent years. Given that it is open-source, users can run it on their own compute nodes. 
However, Airflow is also available as a managed service on all major cloud providers, e.g., Cloud Composer on GCP, Amazon Managed Workflows for Apache Airflow (MWAA) on AWS, and Azure Data Factory Managed Airflow on Azure. In addition, members from the Airflow open-source community have spun up their own managed Airflow service, called Astronomer, which is also available on all clouds.\x3C/p>\r\n\r\n\x3Ch2 data-selectable-paragraph=\"\">\x3C/h2>\r\n \r\n\x3Ch2 id=\"755e\" class=\"qd ob ho be oc fe qe ff fg fh qf fi fj fk qg fl fm fn qh fo fp fq qi fr fs qj bj\" data-selectable-paragraph=\"\">The Effort Required\x3C/h2>\r\n\x3Cp id=\"7f2b\" class=\"pw-post-body-paragraph ow ox ho oy b oz pa pb pc pd pe pf pg fk ph pi pj fn pk pl pm fq pn po pp pq gp bj\" data-selectable-paragraph=\"\">As per the \x3Ca class=\"ag nz\" href=\"https://airflow.apache.org/blog/airflow-survey-2022/\" target=\"_blank\" rel=\"noopener ugc nofollow\">2022 survey of the Airflow user\x3C/a> community, 62% of the users have between 11 and 250 DAGs in their largest instance and 61% of the users have more than 25 tasks in their largest DAG. The pie chart below shows the full distribution of the maximum number of tasks in a single DAG, as reported by the Airflow users.\x3C/p>\r\n\x3Cimg class=\"alignnone wp-image-141\" src=\"https://blog.tursio.ai/wp-content/uploads/2025/05/image7-300x186.png\" alt=\"\" width=\"460\" height=\"285\" />\r\n\r\nMaximum number of tasks used in a single DAG. Source: \x3Ca class=\"ag nz\" href=\"https://airflow.apache.org/blog/airflow-survey-2022/\" target=\"_blank\" rel=\"noopener ugc nofollow\">Airflow Survey 2022\x3C/a>\r\n\r\n \r\n\r\nWe can see a serious amount of manual work in creating and maintaining complex workflows, placed on the data teams of sizable organizations. 
This is further confirmed in DBT’s \x3Ca class=\"ag nz\" href=\"https://www.getdbt.com/state-of-analytics-engineering-2023/\" target=\"_blank\" rel=\"noopener ugc nofollow\">State of Analytics Engineering Survey of 2023\x3C/a>, where 66% of the respondents identified maintaining datasets as the most time-consuming activity.\r\n\r\n\x3Cimg class=\"alignnone wp-image-136\" src=\"https://blog.tursio.ai/wp-content/uploads/2025/04/1_F2qTea4IFKhAC0S4O2LEhg-300x83.webp\" alt=\"\" width=\"463\" height=\"128\" />\r\n\r\nHow data teams spend most of their time. Source: \x3Ca class=\"ag nz\" href=\"https://www.getdbt.com/state-of-analytics-engineering-2023/\" target=\"_blank\" rel=\"noopener ugc nofollow\">DBT Survey 2023\x3C/a>\r\n\r\n \r\n\x3Cp id=\"ab58\" class=\"pw-post-body-paragraph ow ox ho oy b oz ps pb pc pd pt pf pg fk pu pi pj fn pv pl pm fq pw po pp pq gp bj\" data-selectable-paragraph=\"\">Data pipeline tools require the data engineer, or the analytics engineer, to specify the various stages in data transformation and stitch them together into a pipeline. The process is manual and can easily run into weeks or months. It also requires several iterations to refine and improve the pipelines to serve business and efficiency needs.\x3C/p>\r\n\x3Cp id=\"6b3a\" class=\"pw-post-body-paragraph ow ox ho oy b oz ps pb pc pd pt pf pg fk pu pi pj fn pv pl pm fq pw po pp pq gp bj\" data-selectable-paragraph=\"\">The question therefore is whether manual stitching of pipelines will also evolve into a declarative system, much as data processing on Cosmos evolved into SCOPE. 
It took Cosmos roughly a decade, and given that Airflow debuted in 2015, will it become declarative by 2025?\x3C/p>\r\n\r\n\x3Ch2 data-selectable-paragraph=\"\">\x3C/h2>\r\n \r\n\x3Ch2 id=\"57e0\" class=\"qd ob ho be oc fe qe ff fg fh qf fi fj fk qg fl fm fn qh fo fp fq qi fr fs qj bj\" data-selectable-paragraph=\"\">Performance Gap\x3C/h2>\r\n\x3Cp id=\"93e4\" class=\"pw-post-body-paragraph ow ox ho oy b oz pa pb pc pd pe pf pg fk ph pi pj fn pk pl pm fq pn po pp pq gp bj\" data-selectable-paragraph=\"\">In addition to the growing complexity of the data pipelines that are hard to handle manually, there is also a significant performance gap that users can end up leaving on the table. To illustrate, consider Apache Superset, where it is typical practice to combine multiple tables using Saved Queries. Users can build one or more charts on top of these saved queries. Users can schedule the saved queries (the data pipeline) so that the data is appropriately fresh for all consumers, which is often very tricky. Alternatively, saved queries can be executed directly every time a chart is refreshed. However, the direct query could be 12x slower compared to the optimally scheduled one, as shown in the comparison below.\x3C/p>\r\n\x3Cp data-selectable-paragraph=\"\">\x3Cimg class=\"alignnone wp-image-145\" src=\"https://blog.tursio.ai/wp-content/uploads/2025/05/1_UcLLjTPH5MUCoCoXLMl2kQ-300x93.webp\" alt=\"\" width=\"988\" height=\"306\" />\x3C/p>\r\n\x3Cp data-selectable-paragraph=\"\">Comparing performance of manual and optimized data models on Apache Superset.\x3C/p>\r\n\x3Cp id=\"4e11\" class=\"pw-post-body-paragraph ow ox ho oy b oz ps pb pc pd pt pf pg fk pu pi pj fn pv pl pm fq pw po pp pq gp bj\" data-selectable-paragraph=\"\">We can see that it is difficult for the analysts to build the optimized data pipelines manually and they are likely to leave a lot of performance on the table. 
When analysts spend time building the right data transformation pipelines before delivering the dashboards, the time to insights is delayed. Furthermore, data pipelines need constant updates and re-optimization with newer dashboards, leaving the data team in constant disarray.\x3C/p>\r\n\x3Cp id=\"dd36\" class=\"pw-post-body-paragraph ow ox ho oy b oz ps pb pc pd pt pf pg fk pu pi pj fn pv pl pm fq pw po pp pq gp bj\" data-selectable-paragraph=\"\">Similar data pipeline constructs exist in most analytics and reporting tools out there today, including Extracts in Tableau, Imports in Power BI, Persistent Derived Tables in Looker, Preferred Tables in BigQuery, SPICE in QuickSight, and so on. These are in addition to various engine-side tools to build generic data pipelines, e.g., Databricks pipelines, Snowflake tasks and DAGs, or simply plain old materialized views. Leveraging these tools and getting good performance out of them requires a tremendous amount of expertise and effort.\x3C/p>\r\n\r\n\x3Ch2 data-selectable-paragraph=\"\">\x3C/h2>\r\n \r\n\x3Ch2 id=\"9ee8\" class=\"qd ob ho be oc fe qe ff fg fh qf fi fj fk qg fl fm fn qh fo fp fq qi fr fs qj bj\" data-selectable-paragraph=\"\">Cost Differences\x3C/h2>\r\n\x3Cp id=\"201c\" class=\"pw-post-body-paragraph ow ox ho oy b oz pa pb pc pd pe pf pg fk ph pi pj fn pk pl pm fq pn po pp pq gp bj\" data-selectable-paragraph=\"\">Apart from performance, data pipelines can quickly drive up costs since they do the heavy lifting of moving and transforming data in bulk. Data teams need to pay close attention to consolidating pipelines, avoiding redundant work, and constantly exploring opportunities to optimize for lower costs. The cost problem grows exponentially with the size of the organization due to the combinatorial nature of data sources and applications. To illustrate the difference in cost that organizations can see, consider a data modeling pipeline for filter-aggregate queries on the TPC-H dataset. 
The figure below compares the total Airflow runtime, query time, and bytes processed in the naive (Base) and optimized (SmartApps) approaches.\x3C/p>\r\n\x3Cp data-selectable-paragraph=\"\">\x3Cimg class=\"alignnone wp-image-147\" src=\"https://blog.tursio.ai/wp-content/uploads/2025/05/1_hjpfJyl_iPWnkPcHZYallQ-300x79.webp\" alt=\"\" width=\"980\" height=\"258\" />\x3C/p>\r\n\x3Cp data-selectable-paragraph=\"\">Comparing the performance of manual and optimized data models on TPC-H.\x3C/p>\r\n\x3Cp id=\"e53b\" class=\"pw-post-body-paragraph ow ox ho oy b oz ps pb pc pd pt pf pg fk pu pi pj fn pv pl pm fq pw po pp pq gp bj\" data-selectable-paragraph=\"\">We see that the overall Airflow DAG runtimes are 3x faster on 1TB, while the cumulative query runtime is 7x faster and the total bytes processed is 3x less with the well-crafted optimizer. Constantly doing manual optimizations to deliver the above performance is hard for the data team.\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"c199\" class=\"oa ob ho be oc od oe of fg og oh oi fj oj ok ol om on oo op oq or os ot ou ov bj\" data-selectable-paragraph=\"\">Towards a Generative Approach\x3C/h1>\r\n\x3Cp id=\"6089\" class=\"pw-post-body-paragraph ow ox ho oy b oz pa pb pc pd pe pf pg fk ph pi pj fn pk pl pm fq pn po pp pq gp bj\" data-selectable-paragraph=\"\">At SmartApps, we have taken a generative approach to data pipelines, where users ask questions, and the system takes care of automatically generating all required data pipelines while keeping them optimized for performance and cost at all times. As a result, users focus on “what” while the system figures out the “how” part. 
This also means that pipelines are like physical execution plans, i.e., the implementation details of a declarative system, just like in Microsoft Cosmos!\x3C/p>",can_share_on_x:!0,can_share_on_facebook:!0,can_share_on_linkedin:!0,blog_image:"https://blog.tursio.ai/wp-content/uploads/2025/05/0_jeqTxUzOPXUPqQ9I.webp",author:"Alekh Jindal",author_image:void 0,published_date:"July 13, 2023",tags:"Engineering",description:"Microsoft’s Cosmos began as handcrafted distributed computations but evolved into SCOPE, a declarative platform powering petabyte-scale jobs. Today, modern data stacks face similar challenges with Airflow, DBT, and pipeline tools, where data teams manually stitch complex DAGs, wasting time and leaving performance gaps. Surveys show pipeline maintenance is the most time-consuming task, while poor optimization causes delays and higher costs. Much like Cosmos’ decade-long journey, pipelines may evolve into declarative systems by 2025. At SmartApps, we envision a generative approach—users declare the “what,” and the system generates optimized pipelines for performance and cost, automating the “how.”"}},$R[53]={id:1193,slug:"generating-insights-on-private-data",acf:$R[54]={title:"Generating Insights on Private Data",content:"\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"2627\" class=\"nz oa hn be ob oc od oe ff of og oh fi oi oj ok ol om on oo op oq or os ot ou bj\" data-selectable-paragraph=\"\">Back To the Future\x3C/h1>\r\n\x3Cp id=\"7210\" class=\"pw-post-body-paragraph ov ow hn ox b oy oz pa pb pc pd pe pf fj pg ph pi fm pj pk pl fp pm pn po pp go bj\" data-selectable-paragraph=\"\">In 1770, Wolfgang von Kempelen built the \x3Ca class=\"af ny\" href=\"https://en.wikipedia.org/wiki/Mechanical_Turk\" target=\"_blank\" rel=\"noopener ugc nofollow\">Mechanical Turk\x3C/a> that gave the illusion of a machine playing chess. He was inspired by illusion acts in the court of Maria Theresa of Austria. 
142 years later, in 1912, Leonardo Torres y Quevedo built \x3Ca class=\"af ny\" href=\"https://en.wikipedia.org/wiki/El_Ajedrecista\" target=\"_blank\" rel=\"noopener ugc nofollow\">El Ajedrecista\x3C/a>, the first chess automaton that did not require human guidance. Two and a half decades later, in 1936, Alan Turing introduced the \x3Ca class=\"af ny\" href=\"https://en.wikipedia.org/wiki/Turing_machine\" target=\"_blank\" rel=\"noopener ugc nofollow\">Turing Machine\x3C/a>, which was capable of implementing any computer algorithm. In 1945, Alan Turing predicted that computers would play very good chess one day.\x3C/p>\r\n\x3Cp id=\"713b\" class=\"pw-post-body-paragraph ov ow hn ox b oy pq pa pb pc pr pe pf fj ps ph pi fm pt pk pl fp pu pn po pp go bj\" data-selectable-paragraph=\"\">Later in 1951, 72 years ago, the \x3Ca class=\"af ny\" href=\"https://www.britannica.com/technology/artificial-intelligence/Alan-Turing-and-the-beginning-of-AI\" target=\"_blank\" rel=\"noopener ugc nofollow\">first successful AI programs\x3C/a> could play the complete game of Checkers and learn to do shopping. Checkers was also the first AI application to run in the US, with extensions to make it learn from experiences, i.e., evolutionary. \x3Ca class=\"af ny\" href=\"http://infolab.stanford.edu/pub/voy/museum/samuel.html\" target=\"_blank\" rel=\"noopener ugc nofollow\">“Machine learning”\x3C/a> was coined in 1959 by Arthur Samuel, followed by decades of scientific progress in building machines that learn. 
The \x3Ca class=\"af ny\" href=\"https://analyticsindiamag.com/story-eliza-first-chatbot-developed-1966/\" target=\"_blank\" rel=\"noopener ugc nofollow\">first chatbot, ELIZA\x3C/a>, appeared in 1966, and \x3Ca class=\"af ny\" href=\"https://web.stanford.edu/~learnest/sail/oldcart.html\" target=\"_blank\" rel=\"noopener ugc nofollow\">Stanford Cart\x3C/a> was the first autonomous vehicle in the 60s and 70s.\x3C/p>\r\n\x3Cp id=\"fd0f\" class=\"pw-post-body-paragraph ov ow hn ox b oy pq pa pb pc pr pe pf fj ps ph pi fm pt pk pl fp pu pn po pp go bj\" data-selectable-paragraph=\"\">By the 1970s, the questions evolved from can machines think to can machines talk, with people wondering whether \x3Ca class=\"af ny\" href=\"https://dl.acm.org/doi/pdf/10.1145/800194.805902\" target=\"_blank\" rel=\"noopener ugc nofollow\">natural language is unnatural for machines\x3C/a>, and one of the earliest natural language database systems, the \x3Ca class=\"af ny\" href=\"https://dl.acm.org/doi/pdf/10.1145/800127.804086\" target=\"_blank\" rel=\"noopener ugc nofollow\">English querying system (EQS)\x3C/a>, was developed at MIT. All of these were signs of incredible possibilities with groundbreaking starts. 
However, they were still early experiments confined to labs and prototypes in elite institutions.\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"34d1\" class=\"nz oa hn be ob oc od oe ff of og oh fi oi oj ok ol om on oo op oq or os ot ou bj\" data-selectable-paragraph=\"\">Welcome To the Present\x3C/h1>\r\n\x3Cimg class=\"attachment-266x266 \" src=\"https://blog.tursio.ai/wp-content/uploads/2025/05/roger-ce-W32yvc0JJjw-unsplash.jpg\" sizes=\"auto, (max-width: 266px) 100vw, 266px\" srcset=\"https://i0.wp.com/tursio.wpcomstaging.com/wp-content/uploads/2025/05/roger-ce-W32yvc0JJjw-unsplash.jpg?w=6605&ssl=1 6605w, https://i0.wp.com/tursio.wpcomstaging.com/wp-content/uploads/2025/05/roger-ce-W32yvc0JJjw-unsplash.jpg?resize=300%2C200&ssl=1 300w, https://i0.wp.com/tursio.wpcomstaging.com/wp-content/uploads/2025/05/roger-ce-W32yvc0JJjw-unsplash.jpg?resize=1024%2C683&ssl=1 1024w, https://i0.wp.com/tursio.wpcomstaging.com/wp-content/uploads/2025/05/roger-ce-W32yvc0JJjw-unsplash.jpg?resize=768%2C512&ssl=1 768w, https://i0.wp.com/tursio.wpcomstaging.com/wp-content/uploads/2025/05/roger-ce-W32yvc0JJjw-unsplash.jpg?resize=1536%2C1024&ssl=1 1536w, https://i0.wp.com/tursio.wpcomstaging.com/wp-content/uploads/2025/05/roger-ce-W32yvc0JJjw-unsplash.jpg?resize=2048%2C1365&ssl=1 2048w, https://i0.wp.com/tursio.wpcomstaging.com/wp-content/uploads/2025/05/roger-ce-W32yvc0JJjw-unsplash.jpg?w=1280&ssl=1 1280w, https://i0.wp.com/tursio.wpcomstaging.com/wp-content/uploads/2025/05/roger-ce-W32yvc0JJjw-unsplash.jpg?w=1920&ssl=1 1920w\" alt=\"\" width=\"803\" height=\"535\" />\r\n\r\n \r\n\r\nToday, we have come a long way with ChatGPT, estimated to have crossed \x3Ca class=\"af ny\" href=\"https://www.demandsage.com/chatgpt-statistics\" target=\"_blank\" rel=\"noopener ugc nofollow\">1 billion users\x3C/a>! 
AI is no longer science fiction or the Tin Man from the \x3Cem class=\"qb\">Wizard of Oz\x3C/em>, but a modern, workable technology that everyone wants to play with and build upon. Moreover, it is encouraging to see the exciting AI landscape out there, with ChatGPT, Llama, Dolly, and Bard all competing for the users’ mindshare. Language models have unleashed the creativity of the masses, much as the internet and mobile did, and there is a global movement to rebuild pretty much everything we see in our world today. So much so that researchers estimate a 50% chance of human-level AI before 2061.\r\n\r\n\x3Cimg class=\"alignnone wp-image-201\" src=\"https://blog.tursio.ai/wp-content/uploads/2025/05/1_AQL1dErVJRwoReX1nEJZpw-300x85.webp\" alt=\"\" width=\"854\" height=\"242\" />\r\n\r\nSource: \x3Ca class=\"af ny\" href=\"https://ourworldindata.org/ai-timelines\" target=\"_blank\" rel=\"noopener ugc nofollow\">https://ourworldindata.org/ai-timelines\x3C/a>\r\n\r\n \r\n\r\nThis optimism is a golden opportunity for people working on data systems in two ways: (1) \x3Cem class=\"qb\">More usage\x3C/em>: advances in AI have simplified the interfaces to complex systems, making them accessible to everyone in natural language, and (2) \x3Cem class=\"qb\">New usage\x3C/em>: there is an explosion of applications that surface data in interesting ways, thus creating newer scenarios and workloads. 
Altogether, data systems people are due for some exciting times, and they should brace up for that.\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"bb04\" class=\"nz oa hn be ob oc od oe ff of og oh fi oi oj ok ol om on oo op oq or os ot ou bj\" data-selectable-paragraph=\"\">Generative AI for Data Applications\x3C/h1>\r\n\x3Cp id=\"fd3d\" class=\"pw-post-body-paragraph ov ow hn ox b oy pq pa pb pc pr pe pf fj ps ph pi fm pt pk pl fp pu pn po pp go bj\" data-selectable-paragraph=\"\">At SmartApps, we are pushing to create the best-in-class generative AI experience for modern data applications. We came up with the idea of \x3Ca class=\"af ny\" href=\"https://blog.smart-apps.ai/where-is-my-large-data-model-75d92b8c045a\" target=\"_blank\" rel=\"noopener ugc nofollow\">“large data model”\x3C/a> (LDM) that learns how to transform data in a data warehouse, and presented \x3Ca class=\"af ny\" href=\"https://blog.smart-apps.ai/a-large-data-model-for-snowflake-data-marketplace-627fb4252906\" target=\"_blank\" rel=\"noopener ugc nofollow\">PikePlace\x3C/a>, which is pre-trained over Snowflake data marketplace. Our LDM has already trained over 125M data models and generates \x3Cspan style=\"box-sizing: border-box; margin: 0px; padding: 0px;\">lightning\x3Ca href=\"https://blog.smart-apps.ai/remember-that-time-is-money-5213e4fffb9a\" target=\"_blank\" rel=\"noopener\">-fast\x3C/a>\x3C/span>\x3Ca class=\"af ny\" href=\"https://blog.smart-apps.ai/remember-that-time-is-money-5213e4fffb9a\" target=\"_blank\" rel=\"noopener ugc nofollow\"> insights\x3C/a> (6ms-200ms) over a variety of datasets, including real estate, Covid-19, economy atlas, workforce, consumer engagement, vendor analytics, e-commerce, housing, web ads, Crunchbase, GitHub, and IPL. 
LDM helps transform data in these warehouses into high-quality data models for business applications; more importantly, however, LDM-generated data models are guaranteed to be \x3Cstrong class=\"ox ho\">correct\x3C/strong> and \x3Cstrong class=\"ox ho\">fast\x3C/strong>, i.e., the business users can rely on them consistently.\x3C/p>\r\n\x3Cp id=\"119a\" class=\"pw-post-body-paragraph ov ow hn ox b oy pq pa pb pc pr pe pf fj ps ph pi fm pt pk pl fp pu pn po pp go bj\" data-selectable-paragraph=\"\">We now look to push the envelope further and use LDM to generate insights on \x3Cem class=\"qb\">private data\x3C/em>, i.e., bring the same correctness and speed of generated data models to any given data source. Thanks to the big data movement of the last decade, every organization today has volumes of data collected into its data lake, and later extracted and ingested into its data warehouses. Yet, up to \x3Ca class=\"af ny\" href=\"https://aws.amazon.com/executive-insights/content/the-power-of-the-data-driven-enterprise/\" target=\"_blank\" rel=\"noopener ugc nofollow\">97% of business data sits unused\x3C/a> by organizations, thus missing tremendous business value to be unlocked. The right tools can transform such data into insights that could be useful. 
This is where LDM can help by:\x3C/p>\r\n\r\n\x3Cul class=\"\">\r\n \t\x3Cli id=\"58e6\" class=\"ov ow hn ox b oy pq pa pb pc pr pe pf fj ps ph pi fm pt pk pl fp pu pn po pp qe qf qg bj\" data-selectable-paragraph=\"\">Reducing the grunt work in identifying typical data patterns (most likely ways to join, group, aggregate, etc.).\x3C/li>\r\n \t\x3Cli id=\"2d7e\" class=\"ov ow hn ox b oy qh pa pb pc qi pe pf fj qj ph pi fm qk pk pl fp ql pn po pp qe qf qg bj\" data-selectable-paragraph=\"\">Automating boilerplate SQL queries, data cleaning rules, and visualizations on different databases.\x3C/li>\r\n \t\x3Cli id=\"366e\" class=\"ov ow hn ox b oy qh pa pb pc qi pe pf fj qj ph pi fm qk pk pl fp ql pn po pp qe qf qg bj\" data-selectable-paragraph=\"\">Pruning the search space of data models to a small, meaningful one that can be explored visually.\x3C/li>\r\n \t\x3Cli id=\"4e74\" class=\"ov ow hn ox b oy qh pa pb pc qi pe pf fj qj ph pi fm qk pk pl fp ql pn po pp qe qf qg bj\" data-selectable-paragraph=\"\">Providing a quick overview of what’s in the data and surfacing things that people may have otherwise missed.\x3C/li>\r\n \t\x3Cli id=\"3f9e\" class=\"ov ow hn ox b oy qh pa pb pc qi pe pf fj qj ph pi fm qk pk pl fp ql pn po pp qe qf qg bj\" data-selectable-paragraph=\"\">Allowing people to declare or specify what kind of data models they want before operationalizing them.\x3C/li>\r\n \t\x3Cli id=\"8d5e\" class=\"ov ow hn ox b oy qh pa pb pc qi pe pf fj qj ph pi fm qk pk pl fp ql pn po pp qe qf qg bj\" data-selectable-paragraph=\"\">Giving a good starting point to generate more ideas on how to analyze data for the problem at hand.\x3C/li>\r\n \t\x3Cli id=\"c958\" class=\"ov ow hn ox b oy qh pa pb pc qi pe pf fj qj ph pi fm qk pk pl fp ql pn po pp qe qf qg bj\" data-selectable-paragraph=\"\">Enabling data teams to do more with fewer resources and empowering non-experts to self-serve in many scenarios.\x3C/li>\r\n\x3C/ul>\r\n\x3Ch1 
data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"4173\" class=\"nz oa hn be ob oc od oe ff of og oh fi oi oj ok ol om on oo op oq or os ot ou bj\" data-selectable-paragraph=\"\">Introducing SanJuan: LDM on Private Data\x3C/h1>\r\n\x3Cp id=\"4f32\" class=\"pw-post-body-paragraph ov ow hn ox b oy pq pa pb pc pr pe pf fj ps ph pi fm pt pk pl fp pu pn po pp go bj\" data-selectable-paragraph=\"\">Today, we announce SanJuan, our next version of generative analytics that applies the “large data model” (LDM) to private data files. With as many as 750 million to 2 billion people in the world \x3Ca class=\"af ny\" href=\"https://askwonder.com/research/number-google-sheets-users-worldwide-eoskdoxav\" target=\"_blank\" rel=\"noopener ugc nofollow\">using either Excel or Google Sheets\x3C/a>, data files are all around us, but they are often disorganized and tedious to run traditional data analytics on. Data files are also the de facto input/output for data scientists, with CSV and Excel formats being supported by all popular data science libraries. SanJuan allows users to generate insights over their data files in three simple steps:\x3C/p>\r\n\r\n\x3Col class=\"\">\r\n \t\x3Cli id=\"2290\" class=\"ov ow hn ox b oy pq pa pb pc pr pe pf fj ps ph pi fm pt pk pl fp pu pn po pp qn qf qg bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"ox ho\">Login.\x3C/strong> Users log in using standard authenticators (Google authenticator supported currently), and all their information is visible only to them.\x3C/li>\r\n \t\x3Cli id=\"4d5e\" class=\"ov ow hn ox b oy qh pa pb pc qi pe pf fj qj ph pi fm qk pk pl fp ql pn po pp qn qf qg bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"ox ho\">Upload.\x3C/strong> Once logged in, users can upload CSV or Excel files. They can upload multiple files, each up to 10MB in size, which get stored in a dedicated Snowflake database for each account. 
A background job runs the LDM inference on newly uploaded files every 20 minutes, which is also the length of an \x3Ca class=\"af ny\" href=\"https://en.wikipedia.org/wiki/Break_(work)#:~:text=Coffee%20breaks%20usually%20last%20from,third%20of%20the%20work%20shift.\" target=\"_blank\" rel=\"noopener ugc nofollow\">average coffee break\x3C/a>.\x3C/li>\r\n \t\x3Cli id=\"0ffa\" class=\"ov ow hn ox b oy qh pa pb pc qi pe pf fj qj ph pi fm qk pk pl fp ql pn po pp qn qf qg bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"ox ho\">Analyze.\x3C/strong> Once the user is back from a coffee break, they can refresh the page and start analyzing models and visualizations that are automatically generated on their private data files.\x3C/li>\r\n\x3C/ol>\r\n\x3Cp id=\"48ad\" class=\"pw-post-body-paragraph ov ow hn ox b oy pq pa pb pc pr pe pf fj ps ph pi fm pt pk pl fp pu pn po pp go bj\" data-selectable-paragraph=\"\">Voila! The above three steps get your data files talking to you visually!\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"5df2\" class=\"nz oa hn be ob oc od oe ff of og oh fi oi oj ok ol om on oo op oq or os ot ou bj\" data-selectable-paragraph=\"\">Scenarios\x3C/h1>\r\n\x3Cp id=\"2e78\" class=\"pw-post-body-paragraph ov ow hn ox b oy oz pa pb pc pd pe pf fj pg ph pi fm pj pk pl fp pm pn po pp go bj\" data-selectable-paragraph=\"\">Let’s look at some of the interesting scenarios that we have come across.\x3C/p>\r\n\r\n\x3Ch2 data-selectable-paragraph=\"\">\x3C/h2>\r\n \r\n\x3Ch2 id=\"317d\" class=\"qo oa hn be ob fd qp fe ff fg qq fh fi fj qr fk fl fm qs fn fo fp qt fq fr qu bj\" data-selectable-paragraph=\"\">Rideshare\x3C/h2>\r\n\x3Cp id=\"4c72\" class=\"pw-post-body-paragraph ov ow hn ox b oy oz pa pb pc pd pe pf fj pg ph pi fm pj pk pl fp pm pn po pp go bj\" data-selectable-paragraph=\"\">Ride share companies like Uber and Lyft are looking to analyze patterns of their usage and find opportunities to improve their services. 
Using SanJuan on a sample \x3Ca class=\"af ny\" href=\"https://www.kaggle.com/datasets/deexithreddy/rideshare-kaggle\" target=\"_blank\" rel=\"noopener ugc nofollow\">rideshare dataset\x3C/a> from Boston can help discover interesting insights: most rides take place in cloudy weather, Uber gets 54% of the rides while Lyft gets 46%, and the Financial District is the least popular destination while Northeastern University is the most popular.\x3C/p>\r\n\x3Cp data-selectable-paragraph=\"\">\x3Cimg class=\"alignnone wp-image-204\" src=\"https://blog.tursio.ai/wp-content/uploads/2025/05/1_MOjfRXy-cZ3-r8RCP67M_A-300x148.webp\" alt=\"\" width=\"807\" height=\"398\" />\x3C/p>\r\n\r\n\x3Ch2 data-selectable-paragraph=\"\">\x3C/h2>\r\n \r\n\x3Ch2 id=\"9ed7\" class=\"qo oa hn be ob fd qp fe ff fg qq fh fi fj qr fk fl fm qs fn fo fp qt fq fr qu bj\" data-selectable-paragraph=\"\">Fast Food Chains\x3C/h2>\r\n\x3Cp id=\"8b66\" class=\"pw-post-body-paragraph ov ow hn ox b oy oz pa pb pc pd pe pf fj pg ph pi fm pj pk pl fp pm pn po pp go bj\" data-selectable-paragraph=\"\">The fast-food industry is very big, and analysts are constantly looking for ways to boost sales. SanJuan can quickly show insights on this sample dataset of the \x3Ca class=\"af ny\" href=\"https://www.kaggle.com/datasets/iamsouravbanerjee/top-50-fastfood-chains-in-usa\" target=\"_blank\" rel=\"noopener ugc nofollow\">top-50 fast food chains in the USA\x3C/a>. For example, Chick-fil-A is #3 after Starbucks and McDonald’s in terms of total sales, but it has the highest average sales per unit in the U.S. Or, Jersey Mike’s added the most stores while Subway lost the most stores since 2020. 
These generated insights can save a lot of time for data scientists who may otherwise end up doing a lot of grunt work in building a long and verbose \x3Ca class=\"af ny\" href=\"https://www.kaggle.com/code/seasonwong/top-50-fast-food-data\" target=\"_blank\" rel=\"noopener ugc nofollow\">Notebook for the same data set\x3C/a>.\x3C/p>\r\n\x3Cp data-selectable-paragraph=\"\">\x3Cimg class=\"alignnone wp-image-206\" src=\"https://blog.tursio.ai/wp-content/uploads/2025/05/1_kpd8n_roLtAH5_4XF3Bnrg-300x148.webp\" alt=\"\" width=\"809\" height=\"399\" />\x3C/p>\r\n\r\n\x3Ch2 data-selectable-paragraph=\"\">\x3C/h2>\r\n \r\n\x3Ch2 id=\"4b40\" class=\"qo oa hn be ob fd qp fe ff fg qq fh fi fj qr fk fl fm qs fn fo fp qt fq fr qu bj\" data-selectable-paragraph=\"\">Finance Regulation (FIRE) Data Standard\x3C/h2>\r\n\x3Cp id=\"a75b\" class=\"pw-post-body-paragraph ov ow hn ox b oy oz pa pb pc pd pe pf fj pg ph pi fm pj pk pl fp pm pn po pp go bj\" data-selectable-paragraph=\"\">Financial data can be messy and there is a need to standardize by harmonizing data from various sources into a common model. Suade has come up with one such data model, called \x3Ca class=\"af ny\" href=\"https://github.com/SuadeLabs/fire\" target=\"_blank\" rel=\"noopener ugc nofollow\">FIRE\x3C/a>. 
SanJuan can process data in FIRE schema and generate insights over assets, ratings, encumbrance, balance, interest, and others.\x3C/p>\r\n\x3Cp data-selectable-paragraph=\"\">\x3Cimg class=\"alignnone wp-image-208\" src=\"https://blog.tursio.ai/wp-content/uploads/2025/05/1_sXIp8ljMRpCRcC4IFDF1cQ-300x149.webp\" alt=\"\" width=\"809\" height=\"402\" />\x3C/p>\r\n\r\n\x3Ch2 data-selectable-paragraph=\"\">\x3C/h2>\r\n \r\n\x3Ch2 id=\"edfe\" class=\"qo oa hn be ob fd qp fe ff fg qq fh fi fj qr fk fl fm qs fn fo fp qt fq fr qu bj\" data-selectable-paragraph=\"\">KDD Cup 2023\x3C/h2>\r\n\x3Cp id=\"56ab\" class=\"pw-post-body-paragraph ov ow hn ox b oy oz pa pb pc pd pe pf fj pg ph pi fm pj pk pl fp pm pn po pp go bj\" data-selectable-paragraph=\"\">SIGKDD runs the KDD Cup every year, and this year’s \x3Ca class=\"af ny\" href=\"https://www.aicrowd.com/challenges/amazon-kdd-cup-23-multilingual-recommendation-challenge\" target=\"_blank\" rel=\"noopener ugc nofollow\">challenge\x3C/a> is to recommend the next product for an e-commerce store. Such an exercise requires exploratory data analysis (EDA) as a first step, where the data scientist sees what’s in the data and how different variables relate to each other. 
SanJuan can help data scientists do such an analysis by quickly generating insight and providing them a way to understand data before they build their ML models.\x3C/p>\r\n\x3Cp data-selectable-paragraph=\"\">\x3Cimg class=\"alignnone wp-image-209\" src=\"https://blog.tursio.ai/wp-content/uploads/2025/05/1_Od1WE3GJfVjf7WQUXOatSw-300x149.webp\" alt=\"\" width=\"846\" height=\"420\" />\x3C/p>\r\n\r\n\x3Ch2 data-selectable-paragraph=\"\">\x3C/h2>\r\n \r\n\x3Ch2 id=\"2474\" class=\"qo oa hn be ob fd qp fe ff fg qq fh fi fj qr fk fl fm qs fn fo fp qt fq fr qu bj\" data-selectable-paragraph=\"\">Indian Premier League (IPL)\x3C/h2>\r\n\x3Cp id=\"bf17\" class=\"pw-post-body-paragraph ov ow hn ox b oy oz pa pb pc pd pe pf fj pg ph pi fm pj pk pl fp pm pn po pp go bj\" data-selectable-paragraph=\"\">The Indian Premier League for Cricket has emerged as the \x3Ca class=\"af ny\" href=\"https://www.forbes.com/sites/tristanlavalette/2022/06/14/the-indian-premier-leagues-jaw-dropping-6-billion-broadcast-deal-will-have-major-ramifications-in-cricket/?sh=352a7d7843e6\" target=\"_blank\" rel=\"noopener ugc nofollow\">second-richest sports league in the world\x3C/a>, behind only the NFL. Fans and advertisers are looking for newer ways to analyze the game and its related statistics. SanJuan can help analysts surface interesting insights. 
For example, match \x3Ca class=\"af ny\" href=\"https://www.kaggle.com/datasets/biswajitbrahmma/ipl-complete-dataset-2008-2022\" target=\"_blank\" rel=\"noopener ugc nofollow\">data from 2008–2022\x3C/a> shows that while 63% of teams field first, only 53% are able to chase successfully, or that Mumbai has hosted twice as many matches as other cities, or that the number of matches per season has varied over the years.\x3C/p>\r\n\x3Cp data-selectable-paragraph=\"\">\x3Cimg class=\"alignnone wp-image-211\" src=\"https://blog.tursio.ai/wp-content/uploads/2025/05/1_a0k0twagCdR0VTQQUGkJrA-300x149.webp\" alt=\"\" width=\"852\" height=\"423\" />\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"9548\" class=\"nz oa hn be ob oc od oe ff of og oh fi oi oj ok ol om on oo op oq or os ot ou bj\" data-selectable-paragraph=\"\">The Beginnings\x3C/h1>\r\n\x3Cp id=\"589b\" class=\"pw-post-body-paragraph ov ow hn ox b oy oz pa pb pc pd pe pf fj pg ph pi fm pj pk pl fp pm pn po pp go bj\" data-selectable-paragraph=\"\">Data has become the mainstay of modern enterprises. And with newer advances in AI, enterprise data has the potential to take any business to the next level. However, this data needs to be transformed into models that can be consumed by fast-growing intelligent apps. With the “large data model” (LDM), our goal is to automate data transformation in every organization, thus unlocking business value quickly. 
However, just like the early chess-playing machines, LDM still has a long way to go; with enough iterations, we hope to one day turn any data into self-serving intelligence for stakeholders to address their business needs — this is our mission at SmartApps!\x3C/p>",can_share_on_x:!0,can_share_on_facebook:!0,can_share_on_linkedin:!0,blog_image:"https://blog.tursio.ai/wp-content/uploads/2025/05/0_RuiKCdCNM_RYGFBZ.webp",author:"Alekh Jindal",author_image:void 0,published_date:"May 19, 2023",tags:"Engineering, Generative AI",description:"From early illusions like the Mechanical Turk to Turing’s vision and modern AI breakthroughs, the journey of intelligent machines has been remarkable. Today, generative AI models such as ChatGPT have mainstreamed creativity and data access. At SmartApps, we built the Large Data Model (LDM), first demonstrated with PikePlace, delivering millisecond insights over public datasets. Now, with SanJuan, we extend LDM to private data, letting users upload spreadsheets and quickly generate insights. From rideshare analytics to finance, fast food, IPL, and regulation, SanJuan enables businesses to unlock hidden value. The mission: transform any data into self-serving intelligence efficiently."}},$R[55]={id:1194,slug:"remember-that-time-is-money",acf:$R[56]={title:"Remember That Time is Money",content:"\x3Ch1 id=\"873d\" class=\"oz pa ho be pb pc pd pe fg pf pg ph fj pi pj pk pl pm pn po pp pq pr ps pt pu bj\" data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Cp id=\"c335\" class=\"pw-post-body-paragraph oa ob ho oc b od oe of og oh oi oj ok fk ol om on fn oo op oq fq or os ot ou gp bj\" data-selectable-paragraph=\"\">Benjamin Franklin famously said this in his \x3Ca class=\"ag nz\" href=\"https://founders.archives.gov/documents/Franklin/01-03-02-0130\" target=\"_blank\" rel=\"noopener ugc nofollow\">“Advice to a Young Tradesman”\x3C/a> more than two centuries ago in 1748 to convey the opportunity cost of laziness. 
The phrase can be further traced back to The Free Thinker newspaper in 1719, referring to a woman advising her husband on the sense of urgency in his shoemaking business. Even the earliest use of the term “business intelligence” dates back to 1865, when Richard Millar Devens described how banker Sir Henry Furnese profited by receiving and acting upon information about his environment \x3Cem class=\"ov\">prior\x3C/em> to his competitors:\x3C/p>\r\n\r\n\x3Cblockquote class=\"ow ox oy\">\r\n\x3Cp id=\"5637\" class=\"oa ob ov oc b od oe of og oh oi oj ok fk ol om on fn oo op oq fq or os ot ou gp bj\" data-selectable-paragraph=\"\">\x3Cem>“Throughout Holland, Flanders, France, and Germany, he maintained a complete and perfect train of business intelligence. The news of the many battles fought was thus received first by him, and the fall of Namur added to his profits, owing to his early receipt of the news.”\x3C/em> — \x3Ca class=\"ag nz\" href=\"https://archive.org/details/cyclopaediacomm00devegoog/page/n262/mode/2up\" target=\"_blank\" rel=\"noopener ugc nofollow\">Devens, p. 210\x3C/a>\x3C/p>\r\n\x3C/blockquote>\r\n\x3Cp id=\"10bd\" class=\"pw-post-body-paragraph oa ob ho oc b od oe of og oh oi oj ok fk ol om on fn oo op oq fq or os ot ou gp bj\" data-selectable-paragraph=\"\">These anecdotes serve as a reminder of how time has always been critical to businesses, and how better use of time means a more profitable business.\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"873d\" class=\"oz pa ho be pb pc pd pe fg pf pg ph fj pi pj pk pl pm pn po pp pq pr ps pt pu bj\" data-selectable-paragraph=\"\">Time-to-Insights: The New Money\x3C/h1>\r\n\x3Cp id=\"929f\" class=\"pw-post-body-paragraph oa ob ho oc b od pv of og oh pw oj ok fk px om on fn py op oq fq pz os ot ou gp bj\" data-selectable-paragraph=\"\">Today, when modern businesses run on data, time-to-insights on the data is the new money, and there is a direct monetary cost to being lazy (or delayed) with data insights. 
Timely insights can help optimize processes, uncover opportunities, improve experiences, maximize profits, and make strategic decisions. And yet, consuming data from a data source and turning it into insights remains a challenge. At SmartApps, we ran a survey asking people who build analytics applications how long it takes them to build a dashboard and how long it takes to optimize it for their scenario. The two pie charts below show the results.\x3C/p>\r\n\x3Cp data-selectable-paragraph=\"\">\x3Cimg class=\"alignnone wp-image-157\" src=\"https://blog.tursio.ai/wp-content/uploads/2025/05/1_qlqOoShVddarGOKVxRGalw-300x153.webp\" alt=\"\" width=\"502\" height=\"256\" />\x3C/p>\r\n\x3Cp data-selectable-paragraph=\"\">Figure 1: Results of the survey of data analysts and application builders.\x3C/p>\r\n\r\n\x3Ch1 id=\"873d\" class=\"oz pa ho be pb pc pd pe fg pf pg ph fj pi pj pk pl pm pn po pp pq pr ps pt pu bj\" data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Cp data-selectable-paragraph=\"\">Ninety percent of the people take 1 week or more to create a dashboard, and eighty percent of the people take 1 month or more to optimize their dashboards. This is after the data has already landed in a data warehouse. While this was shocking to us initially, more application builders corroborated such an experience in our interviews. 
In fact, a quick search on the internet reveals a similar ordeal that people must go through:\x3C/p>\r\n\x3Cp data-selectable-paragraph=\"\">\x3Cimg class=\"alignnone wp-image-159\" src=\"https://blog.tursio.ai/wp-content/uploads/2025/05/Screenshot-2025-05-01-at-5.42.03 PM-300x274.png\" alt=\"\" width=\"452\" height=\"413\" />\x3C/p>\r\n\x3Cp data-selectable-paragraph=\"\">Figure 2: A user sharing their experience with Power BI.\x3C/p>\r\n\r\n\x3Ch1 id=\"873d\" class=\"oz pa ho be pb pc pd pe fg pf pg ph fj pi pj pk pl pm pn po pp pq pr ps pt pu bj\" data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Cp data-selectable-paragraph=\"\">\x3Cimg class=\"alignnone wp-image-160\" src=\"https://blog.tursio.ai/wp-content/uploads/2025/05/Screenshot-2025-05-01-at-5.42.37 PM-300x171.png\" alt=\"\" width=\"453\" height=\"258\" />\x3C/p>\r\n\x3Cp data-selectable-paragraph=\"\">Figure 3: A user sharing their experience with Tableau.\x3C/p>\r\n\r\n\x3Ch1 id=\"873d\" class=\"oz pa ho be pb pc pd pe fg pf pg ph fj pi pj pk pl pm pn po pp pq pr ps pt pu bj\" data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Cp id=\"628f\" class=\"pw-post-body-paragraph oa ob ho oc b od oe of og oh oi oj ok fk ol om on fn oo op oq fq or os ot ou gp bj\" data-selectable-paragraph=\"\">We see that analysts routinely spend a lot of time on both data and visualizations, leading to a loss of productivity and business value. This is because the data and application platforms remain siloed, and transforming data into models that can power applications requires a series of manual steps, including understanding schemas, gathering data distributions and statistics, doing basic exploration, building intuition, and deriving preliminary insights. 
Current tools leave all of these steps painfully manual and slow, wasting precious time with a high opportunity cost for any business.\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"388f\" class=\"oz pa ho be pb pc pd pe fg pf pg ph fj pi pj pk pl pm pn po pp pq pr ps pt pu bj\" data-selectable-paragraph=\"\">“What” First, “How” Later!\x3C/h1>\r\n\x3Cp id=\"801e\" class=\"pw-post-body-paragraph oa ob ho oc b od pv of og oh pw oj ok fk px om on fn py op oq fq pz os ot ou gp bj\" data-selectable-paragraph=\"\">The worst-case scenario for an analyst is to run an entire data marathon, transforming data from the data warehouse to the application, only to realize that the end user needs something else and the marathon must be run all over again. Instead, the analyst would like to first establish “what” they are looking for and whether it makes sense for the end users, i.e., a meaningful and interesting (visual) insight from clean, high-quality data. Only after the “what” is established do they want to operationalize it into updatable, scalable, and efficient model workflows. In other words, the analysts simply want to \x3Cem class=\"ov\">see\x3C/em> before they \x3Cem class=\"ov\">believe\x3C/em>. 
Current tools do not employ such an insight-oriented approach, with visualization being non-trivial in most cases, as is evident in the reaction below:\x3C/p>\r\n\x3Cp data-selectable-paragraph=\"\">\x3Cimg class=\"alignnone wp-image-162\" src=\"https://blog.tursio.ai/wp-content/uploads/2025/05/Screenshot-2025-05-01-at-5.43.39 PM-300x180.png\" alt=\"\" width=\"431\" height=\"260\" />\x3C/p>\r\n\x3Cp data-selectable-paragraph=\"\">Figure 4: A user sharing their experience with query result visualization.\x3C/p>\r\n\r\n\x3Ch1 id=\"873d\" class=\"oz pa ho be pb pc pd pe fg pf pg ph fj pi pj pk pl pm pn po pp pq pr ps pt pu bj\" data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Cp id=\"c265\" class=\"pw-post-body-paragraph oa ob ho oc b od oe of og oh oi oj ok fk ol om on fn oo op oq fq or os ot ou gp bj\" data-selectable-paragraph=\"\">Figma is a great example of development teams first figuring out “what” they want and iterating on the wireframe before fleshing it out. As a result, with Figma, product managers can first run mockups by the end users, understand whether their requirements are captured, and iterate on them several times before the dev teams start integrating them with the backend. Unfortunately, a Figma-like approach is missing for data analytics, one where analysts could visualize what they want before fiddling with complex data models and workflows.\x3C/p>\r\n\x3Cp id=\"51d0\" class=\"pw-post-body-paragraph oa ob ho oc b od oe of og oh oi oj ok fk ol om on fn oo op oq fq or os ot ou gp bj\" data-selectable-paragraph=\"\">The other aspect of figuring out “what” before “how” is that one can consider multiple “how” options to achieve a given “what” and pick the best one. Databases do this by providing a declarative query language (SQL), where users specify what they want, and using a query optimizer internally to figure out the best way to execute a given SQL query. 
Users do not have to worry about physical design (data layouts, partitioning, indexing) or about execution details (operator implementation, ordering, parallelism). Declarative data processing is today the de facto standard. Another everyday example of a declarative task is Google Maps, where users simply say where they want to go and the map figures out the best route for them.\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"51a2\" class=\"oz pa ho be pb pc pd pe fg pf pg ph fj pi pj pk pl pm pn po pp pq pr ps pt pu bj\" data-selectable-paragraph=\"\">Generating Insights With “Large Data Model”\x3C/h1>\r\n\x3Cp id=\"7e62\" class=\"pw-post-body-paragraph oa ob ho oc b od pv of og oh pw oj ok fk px om on fn py op oq fq pz os ot ou gp bj\" data-selectable-paragraph=\"\">At SmartApps, we are building a pre-trained “large data model” so that analysts do not have to waste time building data models and can instead focus on the “what” and jump directly to insights. This generative approach reduces the overall time to insights from hours and days to minutes. Our current implementation, \x3Ca class=\"ag nz\" href=\"https://pikeplace.smart-apps.ai/\" target=\"_blank\" rel=\"noopener ugc nofollow\">PikePlace\x3C/a>, showcases this approach on the Snowflake data marketplace to generate insights over several marketplace datasets. The chart below shows the average latency over all requests received by the PikePlace servers.\x3C/p>\r\n\x3Cp data-selectable-paragraph=\"\">\x3Cimg class=\"alignnone wp-image-164\" src=\"https://blog.tursio.ai/wp-content/uploads/2025/05/1_GgwTgYdFZzcT2B5KIB4qJQ-300x156.webp\" alt=\"\" width=\"418\" height=\"219\" />\x3C/p>\r\n\x3Cp id=\"75c9\" class=\"pw-post-body-paragraph oa ob ho oc b od oe of og oh oi oj ok fk ol om on fn oo op oq fq or os ot ou gp bj\" data-selectable-paragraph=\"\">We can see that the average latency ranges between 6 and 200 milliseconds, with the overall average under 100 milliseconds. 
Thus, the pre-trained “large data model” indeed delivers insights at lightning-fast speed.\x3C/p>\r\n\x3Cp id=\"0b22\" class=\"pw-post-body-paragraph oa ob ho oc b od oe of og oh oi oj ok fk ol om on fn oo op oq fq or os ot ou gp bj\" data-selectable-paragraph=\"\">Early reactions to PikePlace have been very encouraging:\x3C/p>\r\n\r\n\x3Cblockquote class=\"ow ox oy\">\r\n\x3Cp id=\"eb76\" class=\"oa ob ov oc b od oe of og oh oi oj ok fk ol om on fn oo op oq fq or os ot ou gp bj\" data-selectable-paragraph=\"\">\x3Cem>“This shaves off first six hours of my data science tasks.”\x3C/em>\r\n\x3Cem>“I can see this bringing down data analyst work from 3–4 days to a day.”\x3C/em>\r\n\x3Cem>“You are turning analytics on its head.”\x3C/em>\x3C/p>\r\n\x3C/blockquote>\r\n\x3Cp id=\"f97b\" class=\"pw-post-body-paragraph oa ob ho oc b od oe of og oh oi oj ok fk ol om on fn oo op oq fq or os ot ou gp bj\" data-selectable-paragraph=\"\">Our quest to learn a “large data model” is just getting started, and we have a long way to go. Some of the challenges we see include fine-tuning the models quickly to any given schema, contextualizing insights, structured search, and data privacy and security. However, the guiding principle remains helping businesses minimize the opportunity cost of time-to-insights. Or, in the words of Benjamin Franklin:\x3C/p>\r\n\r\n\x3Cblockquote class=\"ow ox oy\">\r\n\x3Cp id=\"4d5f\" class=\"oa ob ov oc b od oe of og oh oi oj ok fk ol om on fn oo op oq fq or os ot ou gp bj\" data-selectable-paragraph=\"\">\x3Cem>“In short, the Way to Wealth, if you desire it, is as plain as the Way to Market. It depends chiefly on two Words, Industry and Frugality; i.e. 
Waste neither Time nor Money, but make the best Use of both.”\x3C/em> — Benjamin Franklin\x3C/p>\r\n\x3C/blockquote>\r\n\x3Cp id=\"5223\" class=\"pw-post-body-paragraph oa ob ho oc b od oe of og oh oi oj ok fk ol om on fn oo op oq fq or os ot ou gp bj\" data-selectable-paragraph=\"\">To conclude, remember that time is money.\x3C/p>",can_share_on_x:!0,can_share_on_facebook:!0,can_share_on_linkedin:!0,blog_image:"https://blog.tursio.ai/wp-content/uploads/2025/05/0_9A-cDAMjDNIoRf0F.webp",author:"Alekh Jindal",author_image:void 0,published_date:"April 19, 2023",tags:"Generative AI",description:"Benjamin Franklin’s phrase “time is money” highlights the opportunity cost of delay, a principle echoed across history and business. In today’s data-driven world, time-to-insights is the new currency, yet most analysts spend weeks creating and optimizing dashboards due to siloed tools and manual workflows. This inefficiency delays decision-making and reduces value. At SmartApps, we propose focusing on the “what” before the “how,” enabling faster iterations, much like Figma or SQL. Our pre-trained “large data model,” PikePlace, generates insights in milliseconds, drastically reducing effort. 
The goal remains clear: minimize time-to-insights and unlock business value efficiently—wasting neither time nor money."}},$R[57]={id:1195,slug:"a-large-data-model-for-snowflake-data-marketplace",acf:$R[58]={title:"A “large data model” for Snowflake Data Marketplace",content:"\x3Ch1 id=\"7474\" class=\"ov ow hn be ox oy oz pa ff pb pc pd fi pe pf pg ph pi pj pk pl pm pn po pp pq bj\" data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Cp id=\"69c3\" class=\"pw-post-body-paragraph nz oa hn ob b oc od oe of og oh oi oj fj ok ol om fm on oo op fp oq or os ot go bj\" data-selectable-paragraph=\"\">\x3Cem class=\"ou\">Recap\x3C/em>: In the last post, we explored the question of \x3Ca class=\"af ny\" href=\"https://blog.smart-apps.ai/where-is-my-large-data-model-75d92b8c045a\" target=\"_blank\" rel=\"noopener ugc nofollow\">whether data models can be learned\x3C/a> for providing quick answers to data questions. In this post, we take a step forward to evaluating its feasibility on the Snowflake data marketplace.\x3C/p>\r\n\x3Cp id=\"985c\" class=\"pw-post-body-paragraph nz oa hn ob b oc od oe of og oh oi oj fj ok ol om fm on oo op fp oq or os ot go bj\" data-selectable-paragraph=\"\">The rise of cloud data warehouses has made it extremely easy to share and process data on demand. Today, anyone can get access to the most sophisticated query processors on the planet and start performing complex data transformations within minutes. No wonder there is a shift from traditional ETL style design to newer ELT or zero-ETL styles, where transformations happen once the data has already landed in the data warehouse. Thereafter, the onus is on the data analyst or the analytics engineer to create the right set of data transformations that can power the business apps. 
To understand how AI/ML can help in this process, let’s use data marketplaces as a concrete example below.\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"7474\" class=\"ov ow hn be ox oy oz pa ff pb pc pd fi pe pf pg ph pi pj pk pl pm pn po pp pq bj\" data-selectable-paragraph=\"\">Snowflake Data Marketplace\x3C/h1>\r\n\x3Cp id=\"434c\" class=\"pw-post-body-paragraph nz oa hn ob b oc od oe of og oh oi oj fj ok ol om fm on oo op fp oq or os ot go bj\" data-selectable-paragraph=\"\">The data marketplace concept has been around since the early 2000s, with data.gov launching in 2009 and Snowflake announcing its data marketplace a decade later in 2019. Since then, nearly all cloud vendors have launched their own marketplaces, including AWS, Azure, Google, and Databricks. Snowflake, in particular, has made it very easy for data providers and consumers to share and consume data. Thanks to its disaggregated storage and compute architecture, Snowflake users pay only for the compute they run over the shared data. Many large organizations also have similar shared data platforms internally, e.g., the Cosmos big data platform at Microsoft. Industry trends show that data marketplace platforms are growing fast and are expected to be a $5B+ market by 2030.\x3C/p>\r\n\x3Cp id=\"9f0b\" class=\"pw-post-body-paragraph nz oa hn ob b oc od oe of og oh oi oj fj ok ol om fm on oo op fp oq or os ot go bj\" data-selectable-paragraph=\"\">Typically, a data analyst starts by running sample queries like “\x3Cem class=\"ou\">Select * from T limit 10\x3C/em>” to understand the data, before crafting the right data transformations and models. This means spending an inordinate amount of time hammering the data before getting to the actual insights, and then iterating this process over and over again to meet the business requirements. 
For instance, consider the \x3Ca class=\"af ny\" href=\"https://app.snowflake.com/marketplace/listing/GZTYZ3HT1R1/reason-automation-amazon-vendor-analytics-sample-dataset\" target=\"_blank\" rel=\"noopener ugc nofollow\">Amazon Vendor Analytics dataset\x3C/a> on the Snowflake marketplace, which contains 52 tables and 1,568 columns. Such a dataset requires a significant amount of analyst time before it can be turned into intelligence. Instead of running the entire data marathon first, data analysts would rather \x3Cem class=\"ou\">see\x3C/em> things quickly and \x3Cem class=\"ou\">then\x3C/em> shoehorn the data based on what they want.\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"5b4e\" class=\"ov ow hn be ox oy oz pa ff pb pc pd fi pe pf pg ph pi pj pk pl pm pn po pp pq bj\" data-selectable-paragraph=\"\">Introducing PikePlace: Seeing is Analyzing\x3C/h1>\r\n\x3Cimg class=\"alignnone wp-image-171\" src=\"https://blog.tursio.ai/wp-content/uploads/2025/05/1_CPxVau5_-7qXr2nmaaCNNQ-300x169.webp\" alt=\"\" width=\"612\" height=\"345\" />\r\n\r\nPike Place Market Entrance by Mtaylor444. CC BY-SA 3.0\r\n\r\n \r\n\r\nWe present PikePlace, the first “large data model” for the Snowflake data marketplace. Just like the \x3Ca class=\"af ny\" href=\"https://en.wikipedia.org/wiki/Pike_Place_Market\" target=\"_blank\" rel=\"noopener ugc nofollow\">Pike Place Market\x3C/a> that delights curious visitors, our goal is to get analysts the data models they need to power their business apps. Once the analyst connects to their data source, instead of messing with SQL queries, PikePlace shows them a set of \x3Cem class=\"ou\">generated\x3C/em> data models to explore, search, refine, and ultimately operationalize. 
The following screenshot shows what an analyst sees when connecting to the same Amazon Vendor Analytics dataset in the Snowflake marketplace:\r\n\r\n\x3Cimg class=\"alignnone wp-image-173\" src=\"https://blog.tursio.ai/wp-content/uploads/2025/05/1_nQJeQRFEI-D9gdQkpMqOMg-300x186.webp\" alt=\"\" width=\"615\" height=\"381\" />\r\n\r\n \r\n\r\nThe analyst gets to directly see interesting visualizations in a \x3Cem class=\"ou\">single click!\x3C/em> They can play with the interactive charts or ask for other similar visualizations by clicking “Another”. They can also inspect the data model by clicking the model tab, which shows the SQL statement, its natural language explanation, a model score, and a declarative syntax to plot the visualization, as illustrated in the screenshot below:\r\n\r\n\x3Cimg class=\"alignnone wp-image-174\" src=\"https://blog.tursio.ai/wp-content/uploads/2025/05/1_0KZbU_OY3Vak0QrLkbJL0g-300x249.webp\" alt=\"\" width=\"469\" height=\"389\" />\r\n\x3Cp id=\"5497\" class=\"pw-post-body-paragraph nz oa hn ob b oc od oe of og oh oi oj fj ok ol om fm on oo op fp oq or os ot go bj\" data-selectable-paragraph=\"\">Analysts can refine models and collect all interesting ones into the workspace by clicking the “Save” button. They can preview saved models (across datasets) and export them into scalable workflows to keep them updated. 
The entire process is insights-first, with all data shoehorning happening later, which brings several key advantages:\x3C/p>\r\n\r\n\x3Col class=\"\">\r\n \t\x3Cli id=\"ef64\" class=\"nz oa hn ob b oc od oe of og oh oi oj fj ok ol om fm on oo op fp oq or os ot qa qb qc bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"ob ho\">Reduces\x3C/strong> the time to insights by helping analysts discover and deliver the right data models to power business apps quickly.\x3C/li>\r\n \t\x3Cli id=\"4774\" class=\"nz oa hn ob b oc qd oe of og qe oi oj fj qf ol om fm qg oo op fp qh or os ot qa qb qc bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"ob ho\">Enables\x3C/strong> anyone with access to data to start analyzing immediately without waiting for data engineers to craft the data for them.\x3C/li>\r\n \t\x3Cli id=\"bb0d\" class=\"nz oa hn ob b oc qd oe of og qe oi oj fj qf ol om fm qg oo op fp qh or os ot qa qb qc bj\" data-selectable-paragraph=\"\">\x3Cstrong class=\"ob ho\">Optimizes\x3C/strong> data better using machine learning, thus providing interactive data exploration at much lower cost.\x3C/li>\r\n\x3C/ol>\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"ad8e\" class=\"ov ow hn be ox oy oz pa ff pb pc pd fi pe pf pg ph pi pj pk pl pm pn po pp pq bj\" data-selectable-paragraph=\"\">Generating Data Models: The Secret Sauce\x3C/h1>\r\n\x3Cp id=\"4b15\" class=\"pw-post-body-paragraph nz oa hn ob b oc qi oe of og qj oi oj fj qk ol om fm ql oo op fp qm or os ot go bj\" data-selectable-paragraph=\"\">The secret sauce in PikePlace is how the system learns to generate data models from a given data source. 
We apply three criteria when learning what kind of data models to generate.\x3C/p>\r\n\r\n\x3Col class=\"\">\r\n \t\x3Cli id=\"710e\" class=\"nz oa hn ob b oc od oe of og oh oi oj fj ok ol om fm on oo op fp oq or os ot qa qb qc bj\" data-selectable-paragraph=\"\">\x3Cem class=\"ou\">Quality\x3C/em>, i.e., a data model contains clean and meaningful data.\x3C/li>\r\n \t\x3Cli id=\"da80\" class=\"nz oa hn ob b oc qd oe of og qe oi oj fj qf ol om fm qg oo op fp qh or os ot qa qb qc bj\" data-selectable-paragraph=\"\">\x3Cem class=\"ou\">Correctness\x3C/em>, i.e., the data model is semantically valid and makes sense.\x3C/li>\r\n \t\x3Cli id=\"57d8\" class=\"nz oa hn ob b oc qd oe of og qe oi oj fj qf ol om fm qg oo op fp qh or os ot qa qb qc bj\" data-selectable-paragraph=\"\">\x3Cem class=\"ou\">Interestingness\x3C/em>, i.e., a data model stands out in what it shows.\x3C/li>\r\n\x3C/ol>\r\n\x3Cp id=\"1505\" class=\"pw-post-body-paragraph nz oa hn ob b oc od oe of og oh oi oj fj ok ol om fm on oo op fp oq or os ot go bj\" data-selectable-paragraph=\"\">Our algorithms learn to generate data models that satisfy these criteria in a three-step process: enumeration via hundreds of parameterized rules, evaluation over a variety of metrics, and feedback to tune and improve in response to real-time observations.\x3C/p>\r\n\x3Cp id=\"f856\" class=\"pw-post-body-paragraph nz oa hn ob b oc od oe of og oh oi oj fj ok ol om fm on oo op fp oq or os ot go bj\" data-selectable-paragraph=\"\">We have been training the “large data model” on Snowflake marketplace datasets over the last few months and have created a pre-trained version that can generate data models for several diverse datasets, including areas such as real estate, COVID-19, economy atlas, economic indicators, workforce, consumer engagement, and vendor analytics. We are further improving this version with more usage and feedback every single day. 
Still, the “large data model” is relatively small right now and will need a significant amount of training to generalize much further. Fortunately, the large number of data marketplaces out there provides an excellent source of real-world datasets to train on. Like language models, we believe that with enough training and tuning, we can learn how to generate the most useful data models over any given data source, thus providing quick insights to anyone over any data.\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"cc31\" class=\"ov ow hn be ox oy oz pa ff pb pc pd fi pe pf pg ph pi pj pk pl pm pn po pp pq bj\" data-selectable-paragraph=\"\">A Challenging Road\x3C/h1>\r\n\x3Cp id=\"b875\" class=\"pw-post-body-paragraph nz oa hn ob b oc qi oe of og qj oi oj fj qk ol om fm ql oo op fp qm or os ot go bj\" data-selectable-paragraph=\"\">We envision large data models to be a \x3Ca class=\"af ny\" href=\"https://maithraraghu.com/blog/2023/does-one-model-rule-them-all/\" target=\"_blank\" rel=\"noopener ugc nofollow\">specialized AI system\x3C/a> for solving a domain-specific problem, with numerous customizations and retrofitting that are peculiar to the world of data modeling and transformations. Our early results on the Snowflake marketplace are encouraging, but there is still a long road ahead. 
Just like the \x3Ca class=\"af ny\" href=\"https://awoiaf.westeros.org/index.php/First_Men\" target=\"_blank\" rel=\"noopener ugc nofollow\">First Men\x3C/a>, we hope the first “large data models” come riding horses, with bronze swords, and great leathern shields to change the world for the better.\x3C/p>",can_share_on_x:!0,can_share_on_facebook:!0,can_share_on_linkedin:!0,blog_image:"https://blog.tursio.ai/wp-content/uploads/2025/05/0_u2Yd8TyMTtJATR43.webp",author:"Alekh Jindal",author_image:void 0,published_date:"April 6, 2023",tags:"Engineering",description:"Cloud data warehouses and marketplaces like Snowflake make data sharing easy, but analysts still spend excessive time exploring raw tables before generating insights. PikePlace is introduced as a “large data model” for Snowflake, enabling analysts to instantly see generated data models, visualizations, SQL, and explanations without manual querying. Models can be refined, saved, and exported into workflows, cutting time to insights, democratizing access, and lowering costs. PikePlace’s secret lies in algorithms that prioritize quality, correctness, and interestingness, trained on diverse marketplace datasets. Early results are promising, though scaling to robust, general-purpose large data models remains a long journey."}},$R[59]={id:1196,slug:"where-is-my-large-data-model",acf:$R[60]={title:"Where Is My Large Data Model?",content:"\x3Ch1 id=\"aabe\" class=\"ov ow hn be ox oy oz pa ff pb pc pd fi pe pf pg ph pi pj pk pl pm pn po pp pq bj\" data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Cp id=\"dd51\" class=\"pw-post-body-paragraph oa ob hn oc b od oe of og oh oi oj ok fj ol om on fm oo op oq fp or os ot ou go bj\" data-selectable-paragraph=\"\">This is a follow-up post on connecting advancements in data and AI. 
Check out the previous post \x3Ca class=\"af nz\" href=\"https://blog.smart-apps.ai/the-ai-ambition-and-the-ground-reality-7619f33e558a\" target=\"_blank\" rel=\"noopener ugc nofollow\">here\x3C/a>.\x3C/p>\r\n\x3Cp id=\"d03c\" class=\"pw-post-body-paragraph oa ob hn oc b od oe of og oh oi oj ok fj ol om on fm oo op oq fp or os ot ou go bj\" data-selectable-paragraph=\"\">AI continues going places, with OpenAI announcing app-store-style \x3Ca class=\"af nz\" href=\"https://openai.com/blog/chatgpt-plugins\" target=\"_blank\" rel=\"noopener ugc nofollow\">plugins for ChatGPT\x3C/a>, Databricks showing how to build \x3Ca class=\"af nz\" href=\"https://www.databricks.com/blog/2023/03/24/hello-dolly-democratizing-magic-chatgpt-open-models.html\" target=\"_blank\" rel=\"noopener ugc nofollow\">ChatGPT-like magic entirely with open source\x3C/a>, and Microsoft claiming to have seen \x3Ca class=\"af nz\" href=\"https://arxiv.org/pdf/2303.12712.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">sparks of artificial general intelligence in GPT-4\x3C/a>, all within the span of the last week. As someone working in data, I both admire and envy this incredible progress in large language models. In particular, I wonder where my large data model is, the one that would solve many of the hard data problems that remain unsolved. 
Let’s explore this thought further below.\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"aabe\" class=\"ov ow hn be ox oy oz pa ff pb pc pd fi pe pf pg ph pi pj pk pl pm pn po pp pq bj\" data-selectable-paragraph=\"\">Traditional Data Models\x3C/h1>\r\n\x3Cp id=\"d58c\" class=\"pw-post-body-paragraph oa ob hn oc b od pr of og oh ps oj ok fj pt om on fm pu op oq fp pv os ot ou go bj\" data-selectable-paragraph=\"\">According to Wikipedia, the purpose of a data model is to \x3Ca class=\"af nz\" href=\"https://en.wikipedia.org/wiki/Data_model\" target=\"_blank\" rel=\"noopener ugc nofollow\">\x3Cem class=\"pw\">organize elements of data and standardize how they relate\x3C/em>\x3C/a>. Classical \x3Ca class=\"af nz\" href=\"https://www.elsevier.com/books/developing-high-quality-data-models/west/978-0-12-375106-5\" target=\"_blank\" rel=\"noopener ugc nofollow\">textbooks\x3C/a> on data modeling describe how the definition and format of data are crucial for developing information systems that share data across applications. 
Typically, a data model could be one of three types:\x3C/p>\r\n\r\n\x3Col class=\"\">\r\n \t\x3Cli id=\"ee74\" class=\"oa ob hn oc b od oe of og oh oi oj ok fj ol om on fm oo op oq fp or os ot ou px py pz bj\" data-selectable-paragraph=\"\">Conceptual data model, as described by the semantics of the business.\x3C/li>\r\n \t\x3Cli id=\"c160\" class=\"oa ob hn oc b od qa of og oh qb oj ok fj qc om on fm qd op oq fp qe os ot ou px py pz bj\" data-selectable-paragraph=\"\">Logical data model, as described by the data processing system.\x3C/li>\r\n \t\x3Cli id=\"ea49\" class=\"oa ob hn oc b od qa of og oh qb oj ok fj qc om on fm qd op oq fp qe os ot ou px py pz bj\" data-selectable-paragraph=\"\">Physical data model, as described by the physical data storage.\x3C/li>\r\n\x3C/ol>\r\n\x3Cp id=\"cbca\" class=\"pw-post-body-paragraph oa ob hn oc b od oe of og oh oi oj ok fj ol om on fm oo op oq fp or os ot ou go bj\" data-selectable-paragraph=\"\">Databases also have physical and logical data independence, wherein changes to the physical or logical schema or \x3Cem class=\"pw\">view\x3C/em> of the data should not impact the applications, i.e., they do not need to be rewritten. Physical schema changes include modifications to how the data is stored, indexed, partitioned, etc. 
Logical schema changes include modifications to table or view definitions, e.g., changes to how they are computed or materialized.\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"970f\" class=\"ov ow hn be ox oy oz pa ff pb pc pd fi pe pf pg ph pi pj pk pl pm pn po pp pq bj\" data-selectable-paragraph=\"\">Data Models in Modern Data Stack\x3C/h1>\r\n\x3Cp id=\"82f4\" class=\"pw-post-body-paragraph oa ob hn oc b od pr of og oh ps oj ok fj pt om on fm pu op oq fp pv os ot ou go bj\" data-selectable-paragraph=\"\">Data modeling plays a critical role in the modern data stack, where data is quickly extracted and loaded into a central data processing system, e.g., a data cloud or a lakehouse, with all the transformations running afterwards, an approach known as the ELT paradigm. These transformations include curating, selecting, combining, aggregating, and applying custom logic in various ways. Essentially, these transformations encapsulate the logic needed to power the business apps and they are typically managed by data analysts or analytics engineers, who act as the connector between the data and business worlds, as illustrated below.\x3C/p>\r\n\x3Cp data-selectable-paragraph=\"\">\x3Cimg class=\"alignnone wp-image-182\" src=\"https://blog.tursio.ai/wp-content/uploads/2025/05/1_FiWd-yjuZe-4Lpv_i0Mh1g-300x52.webp\" alt=\"\" width=\"577\" height=\"100\" />\x3C/p>\r\n\x3Cp data-selectable-paragraph=\"\">Figure 1: The data analyst or the analytics engineer connects the data and business worlds.\x3C/p>\r\n\x3Cp id=\"c616\" class=\"pw-post-body-paragraph oa ob hn oc b od oe of og oh oi oj ok fj ol om on fm oo op oq fp or os ot ou go bj\" data-selectable-paragraph=\"\">The above transformations could also be in stages, e.g., the \x3Ca class=\"af nz\" href=\"https://www.databricks.com/glossary/medallion-architecture\" target=\"_blank\" rel=\"noopener ugc nofollow\">medallion design pattern\x3C/a> in Databricks Lakehouse, or in hierarchical pipelines, e.g., 
the \x3Ca class=\"af nz\" href=\"https://www.vldb.org/pvldb/vol15/p3710-leeka.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">Asimov data pipelines\x3C/a> in Microsoft’s Cosmos data analytics. Data transformation and modeling are also crucial for machine learning and data science applications. For instance, the \x3Ca class=\"af nz\" href=\"https://dl.acm.org/doi/abs/10.1145/3357223.3362726\" target=\"_blank\" rel=\"noopener ugc nofollow\">Peregrine workload optimization platform\x3C/a> for optimizing data platforms using ML relies on transforming various sets of system logs and metrics into an intermediate representation. Even OpenAI asks for elaborate \x3Ca class=\"af nz\" href=\"https://platform.openai.com/docs/guides/fine-tuning\" target=\"_blank\" rel=\"noopener ugc nofollow\">data preparation and high data quality\x3C/a> when using the prompt dataset for fine-tuning. Clearly, there is no substitute for good data modeling.\x3C/p>\r\n\x3Cp id=\"54ac\" class=\"pw-post-body-paragraph oa ob hn oc b od oe of og oh oi oj ok fj ol om on fm oo op oq fp or os ot ou go bj\" data-selectable-paragraph=\"\">Several tools exist for data modeling in the modern data stack. 
We illustrate a small subset of them in the figure below.\x3C/p>\r\n\x3Cp class=\"pw-post-body-paragraph oa ob hn oc b od oe of og oh oi oj ok fj ol om on fm oo op oq fp or os ot ou go bj\" data-selectable-paragraph=\"\">\x3Cimg class=\"alignnone wp-image-185\" src=\"https://blog.tursio.ai/wp-content/uploads/2025/05/1_YDo0rOdcotxE1jAyfGp82Q-300x112.webp\" alt=\"\" width=\"576\" height=\"215\" />\x3C/p>\r\n\x3Cp data-selectable-paragraph=\"\">Figure 2: A subset of tools for data modeling in the modern data stack.\x3C/p>\r\n\r\n\x3Cdiv class=\"gp gq gr gs gt l\">\x3Carticle>\r\n\x3Cdiv class=\"l\">\r\n\x3Cdiv class=\"l\">\x3Csection>\r\n\x3Cdiv>\r\n\x3Cdiv class=\"go hh hi hj hk\">\r\n\x3Cdiv class=\"ab ca\">\r\n\x3Cdiv class=\"ch bg gu gv gw gx\">\r\n\x3Cp id=\"4052\" class=\"pw-post-body-paragraph oa ob hn oc b od oe of og oh oi oj ok fj ol om on fm oo op oq fp or os ot ou go bj\" data-selectable-paragraph=\"\">Given that modern data processing systems have column stores with vectorized query processing for interactive performance, data analysts can directly run transformation queries whenever needed by the business app. However, for better performance and predictability, analysts typically save the transformation queries as materialized views (e.g., in relational databases such as PostgreSQL), tasks (e.g., in Snowflake), or pipelines (e.g., in Databricks) in different data processing systems. Alternatively, they could run the transformations as repeatable scripts external to the data processing systems using tools such as DBT, Airflow, Astronomer, and others.\x3C/p>\r\n\x3Cp id=\"07fc\" class=\"pw-post-body-paragraph oa ob hn oc b od oe of og oh oi oj ok fj ol om on fm oo op oq fp or os ot ou go bj\" data-selectable-paragraph=\"\">Apart from creating data models on the data processing platform side, analysts could also create data models on the application side. For example, they could use Looker’s LookML to define models and create persistent derived tables (PDTs). 
Or they could define saved queries in Superset, extracts in Tableau, imports in Power BI, preferred tables in BigQuery, and SPICE tables in Amazon QuickSight. Each of these application-side mechanisms allows analysts to build application-specific data models.\x3C/p>\r\n\r\n\x3C/div>\r\n\x3C/div>\r\n\x3C/div>\r\n\x3C/div>\r\n\x3C/section>\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n\x3Csection>\r\n\x3Cdiv>\r\n\r\n \r\n\x3Cdiv class=\"go hh hi hj hk\">\r\n\x3Cdiv class=\"ab ca\">\r\n\x3Cdiv class=\"ch bg gu gv gw gx\">\r\n\x3Ch1 id=\"95be\" class=\"ov ow hn be ox oy oz pa ff pb pc pd fi pe pf pg ph pi pj pk pl pm pn po pp pq bj\" data-selectable-paragraph=\"\">Towards Learned Data Models\x3C/h1>\r\n\x3Cp id=\"3ea7\" class=\"pw-post-body-paragraph oa ob hn oc b od pr of og oh ps oj ok fj pt om on fm pu op oq fp pv os ot ou go bj\" data-selectable-paragraph=\"\">Current approaches to data modeling in the modern data stack are challenging on several counts. First, the above data modeling tools are all manual, requiring the data analyst to spend a painful amount of time and effort handcrafting the right data models. As a result, it takes days or weeks before data analysts can surface the data for the business users to get insights. Moreover, there is an entire lifecycle of events (including deploying, updating, sharing, and optimizing data models) that is extremely tedious for the data analysts to manage. For example, updating the data models requires the analyst to be aware of the data arrival rates, the application requirements, and all the cascading dependencies on any given data model. Therefore, the question is whether we can do better: can AI help here?\x3C/p>\r\n\x3Cp id=\"390a\" class=\"pw-post-body-paragraph oa ob hn oc b od oe of og oh oi oj ok fj ol om on fm oo op oq fp or os ot ou go bj\" data-selectable-paragraph=\"\">It is worth noting that even though the space of data models is very large, the actual data models built by the analysts are not arbitrary. 
Instead, they are meaningful business logic with patterns that often overlap with, or resemble, other logic that their team (or even other teams) has written. For example, the Cosmos analytics platform at Microsoft routinely has teams with \x3Ca class=\"af nz\" href=\"https://www.microsoft.com/en-us/research/uploads/prod/2018/03/cloudviews-sigmod2018.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">more than half of their analytics jobs overlapping\x3C/a> with each other. These are natural patterns to learn from in order to assist the data analyst. Additionally, many application frameworks generate canned SQL statements, e.g., the \x3Ca class=\"af nz\" href=\"https://cloud.google.com/looker/docs/best-practices/understanding-symmetric-aggregates\" target=\"_blank\" rel=\"noopener ugc nofollow\">symmetric aggregates\x3C/a> in Looker ensure there is no SQL fanout. Such logic is based on well-defined rules that are useful to learn. Finally, users know a lot about their applications and their interactions could provide insights into which data models make sense. Overall, we see an interesting case for learning data models based on the data patterns, rules, and interactions.\x3C/p>\r\n\x3Cp id=\"3cbe\" class=\"pw-post-body-paragraph oa ob hn oc b od oe of og oh oi oj ok fj ol om on fm oo op oq fp or os ot ou go bj\" data-selectable-paragraph=\"\">Learned data models raise many obvious questions. First, what makes a data model better than others? What is the training objective? Some candidate properties to evaluate data models could be: 1. \x3Cem class=\"pw\">Quality\x3C/em>, i.e., how clean or complete the data in a model is, 2. \x3Cem class=\"pw\">Correctness\x3C/em>, i.e., whether a model follows well-defined rules and heuristics, and 3. \x3Cem class=\"pw\">Interestingness\x3C/em>, i.e., whether the model stands out in statistically defined metrics or as an interesting data pattern. 
Understanding more properties that differentiate data models is an open question.\x3C/p>\r\n\x3Cp id=\"a64b\" class=\"pw-post-body-paragraph oa ob hn oc b od oe of og oh oi oj ok fj ol om on fm oo op oq fp or os ot ou go bj\" data-selectable-paragraph=\"\">An even bolder question is whether large data models could be \x3Cem class=\"pw\">pre-trained\x3C/em>, i.e., whether data models could be learned independent of the data and application platforms. Indeed, current trends show an increasing interest in running \x3Ca class=\"af nz\" href=\"https://www.youtube.com/watch?v=zf-5SCUo0N0\" target=\"_blank\" rel=\"noopener ugc nofollow\">SQL queries over anything\x3C/a> using tools like Starburst and in optimizing those queries in a \x3Ca class=\"af nz\" href=\"https://patentimages.storage.googleapis.com/38/96/1b/1e9b4c673a9bca/US11567936.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">platform-agnostic manner\x3C/a> using an \x3Ca class=\"af nz\" href=\"https://sigmodrecord.org/publications/sigmodRecord/2209/pdfs/10_Industry_JIndal.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">optimizer as a service\x3C/a>. Likewise, tools like DBT encourage creating data models independently before using them in the application platform. Both these trends suggest that it may be possible to learn large data models!\x3C/p>\r\n\r\n\x3C/div>\r\n\x3C/div>\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Cdiv class=\"ab ca\">\r\n\x3Cdiv class=\"ch bg gu gv gw gx\">\r\n\x3Ch1 id=\"506b\" class=\"ov ow hn be ox oy oz pa ff pb pc pd fi pe pf pg ph pi pj pk pl pm pn po pp pq bj\" data-selectable-paragraph=\"\">Data Models vs Language Models\x3C/h1>\r\n\x3Cp id=\"e6f0\" class=\"pw-post-body-paragraph oa ob hn oc b od pr of og oh ps oj ok fj pt om on fm pu op oq fp pv os ot ou go bj\" data-selectable-paragraph=\"\">Finally, let’s step back and see how data models are similar to or different from language models. 
Large language models have made answers to natural language questions accessible within minutes, without searching through scores of links and webpages. Large data models can have a similar effect of delivering insights quickly (within minutes) instead of digging through a jungle of tables and crafting the right data models that can show insights. This opens up a deluge of interesting questions: Is a large data model possible? Aren’t large language models all we need? Given that several recent approaches to \x3Ca class=\"af nz\" href=\"https://www.vldb.org/pvldb/vol16/p1534-fu.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">map natural language to SQL\x3C/a> make people better at asking questions, how do we get better at answering them? How do we ensure correctness? Can we fine-tune to a specific business context?\x3C/p>\r\n\x3Cp id=\"6d50\" class=\"pw-post-body-paragraph oa ob hn oc b od oe of og oh oi oj ok fj ol om on fm oo op oq fp or os ot ou go bj\" data-selectable-paragraph=\"\">AI is defining a new world structure and the role of data is evolving along with it. While data continues to power pretty much everything around us, it is no longer something at the frontend that operators want to figure out or fiddle with. Instead, the underlying hope is for automation that makes data show up magically, whenever and however it is needed. 
Is that future possible?\x3C/p>\r\n\r\n\x3C/div>\r\n\x3C/div>\r\n\x3C/div>\r\n\x3C/div>\r\n\x3C/section>\x3C/div>\r\n\x3C/div>\r\n\x3C/article>\x3C/div>\r\n\x3Cdiv class=\"ab ca\">\r\n\x3Cdiv class=\"ch bg gu gv gw gx\">\r\n\x3Cdiv class=\"qm qn ab js\">\r\n\x3Cdiv class=\"qo ab\">\x3C/div>\r\n\x3C/div>\r\n\x3C/div>\r\n\x3C/div>",can_share_on_x:!0,can_share_on_facebook:!0,can_share_on_linkedin:!0,blog_image:"https://blog.tursio.ai/wp-content/uploads/2025/05/0_RwRMpW76unADfP0v.webp",author:"Alekh Jindal",author_image:void 0,published_date:"March 28, 2023",tags:"Engineering",description:"AI is advancing rapidly, with OpenAI plugins, Databricks’ open-source ChatGPT-like systems, and Microsoft’s claims of sparks of AGI in GPT-4. Yet, while large language models thrive, data models remain manual and slow. In the modern data stack, analysts must handcraft transformations, pipelines, and materialized views across platforms like Snowflake, Databricks, Looker, and DBT. This process is tedious, error-prone, and often repetitive, with overlapping patterns ripe for automation. 
The vision of learned data models suggests AI could capture quality, correctness, and interestingness of data logic, even pre-trained in platform-agnostic ways, enabling insights as easily as LLMs deliver answers."}},$R[61]={id:1197,slug:"the-ai-ambition-and-the-ground-reality",acf:$R[62]={title:"The AI Ambition and the Ground Reality",content:"\x3Cp id=\"4b75\" class=\"pw-post-body-paragraph nz oa hn ob b oc od oe of og oh oi oj fj ok ol om fm on oo op fp oq or os ot go bj\" data-selectable-paragraph=\"\">This blog attempts to connect the dots between the present AI excitement and what it could mean for the future of data, while reconciling with some of the lessons learned from the recent past.\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"f146\" class=\"ou ov hn be ow ox oy oz ff pa pb pc fi pd pe pf pg ph pi pj pk pl pm pn po pp bj\" data-selectable-paragraph=\"\">AI-powered World\x3C/h1>\r\n\x3Cp id=\"d287\" class=\"pw-post-body-paragraph nz oa hn ob b oc pq oe of og pr oi oj fj ps ol om fm pt oo op fp pu or os ot go bj\" data-selectable-paragraph=\"\">We are living in one of the most exciting times for AI with ChatGPT being the Sputnik moment that got the whole world scrambling for action — from \x3Ca class=\"af ny\" href=\"https://hbr.org/2022/11/how-generative-ai-is-changing-creative-work\" target=\"_blank\" rel=\"noopener ugc nofollow\">creativity and productivity\x3C/a> to \x3Ca class=\"af ny\" href=\"https://www.forbes.com/sites/markminevich/2023/01/29/the-generative-ai-revolution-is-creating-the-next-phase-of-autonomous-enterprise\" target=\"_blank\" rel=\"noopener ugc nofollow\">efficiency and growth\x3C/a> to \x3Ca class=\"af ny\" href=\"https://www.gartner.com/en/articles/beyond-chatgpt-the-future-of-generative-ai-for-enterprises\" target=\"_blank\" rel=\"noopener ugc nofollow\">life and material sciences\x3C/a> — there is an endless list of what is possible with AI today and the world is changing faster than anyone can 
keep up with. Many believe generative AI is akin to the \x3Ca class=\"af ny\" href=\"https://www.zdnet.com/article/just-how-big-is-this-new-generative-ai-think-internet-level-disruption/\" target=\"_blank\" rel=\"noopener ugc nofollow\">dawn of the internet\x3C/a> or the \x3Ca class=\"af ny\" href=\"https://www.sequoiacap.com/article/generative-ai-a-creative-new-world/\" target=\"_blank\" rel=\"noopener ugc nofollow\">rise of mobile\x3C/a>, each of which unleashed a new wave of applications that destroyed existing industries and created new ones.\x3C/p>\r\n\x3Cp id=\"39d4\" class=\"pw-post-body-paragraph nz oa hn ob b oc od oe of og oh oi oj fj ok ol om fm on oo op fp oq or os ot go bj\" data-selectable-paragraph=\"\">No wonder there is an unparalleled ambition to reimagine pretty much every part of our world, with generative AI applications in text, image, audio, video, gaming, code, chat, apps, and legal, to name a few. A recent count shows \x3Ca class=\"af ny\" href=\"https://app.dealroom.co/lists/33530\" target=\"_blank\" rel=\"noopener ugc nofollow\">210+ generative AI startups\x3C/a> out there with \x3Ca class=\"af ny\" href=\"https://twitter.com/AlexLee611/status/1628290705490857984/photo/1\" target=\"_blank\" rel=\"noopener ugc nofollow\">40+ generative AI companies backed by the Y Combinator 2023 batch\x3C/a> alone. And you know there is a \x3Ca class=\"af ny\" href=\"https://www.nytimes.com/2023/03/14/technology/ai-funding-boom.html\" target=\"_blank\" rel=\"noopener ugc nofollow\">new gold rush\x3C/a> driving this crazy landscape when a year-old startup announces \x3Ca class=\"af ny\" href=\"https://venturebeat.com/ai/this-ai-startup-just-raised-350-million-for-generative-ai-trained-to-use-every-software-tool-and-api/\" target=\"_blank\" rel=\"noopener ugc nofollow\">raising $350 million\x3C/a>, even though the product is yet to launch. Still, there are more fundamental questions as well. 
For instance, what will the future \x3Ca class=\"af ny\" href=\"https://a16z.com/2023/01/19/who-owns-the-generative-ai-platform/\" target=\"_blank\" rel=\"noopener ugc nofollow\">tech stack\x3C/a> look like, will \x3Ca class=\"af ny\" href=\"https://www.forbes.com/sites/lanceeliot/2023/03/01/generative-ai-chatgpt-as-masterful-manipulator-of-humans-worrying-ai-ethics-and-ai-law/?sh=b6e6b8a1d669\" target=\"_blank\" rel=\"noopener ugc nofollow\">AI manipulate humans\x3C/a>, and will this be the end for Google? That said, this article is entirely human-written, with Google still being the primary research tool.\x3C/p>\r\n\x3Cp id=\"6f86\" class=\"pw-post-body-paragraph nz oa hn ob b oc od oe of og oh oi oj fj ok ol om fm on oo op fp oq or os ot go bj\" data-selectable-paragraph=\"\">While a new future of applications is unfolding with AI, what does AI mean for data? What about the enterprise data that largely remains unused or unaccessed even today? Is data getting better with AI?\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"238f\" class=\"ou ov hn be ow ox oy oz ff pa pb pc fi pd pe pf pg ph pi pj pk pl pm pn po pp bj\" data-selectable-paragraph=\"\">AI-powered Data Analytics\x3C/h1>\r\n\x3Cp id=\"e062\" class=\"pw-post-body-paragraph nz oa hn ob b oc pq oe of og pr oi oj fj ps ol om fm pt oo op fp pu or os ot go bj\" data-selectable-paragraph=\"\">There is a growing interest in leveraging AI for data analytics. This is driven by the need to make analytics self-serve and democratized for all units of business that are looking to turn more and more data into intelligence. In fact, experts believe more than \x3Ca class=\"af ny\" href=\"https://youtu.be/iNRzxP8eOzM?t=2219\" target=\"_blank\" rel=\"noopener ugc nofollow\">500 million intelligent applications are going to be built\x3C/a> over the next few years, raising serious questions about the scalability and cost of the underlying data platforms. 
Furthermore, data analytics is getting more complex from requirements to actions, thereby demanding more user expertise. Yet, a quick search on LinkedIn reveals more than 7M “analysts” compared to only 200K “data engineers”, thus indicating far more non-expert users asking business questions than experts who can work on the data stack.\x3C/p>\r\n\x3Cp id=\"b315\" class=\"pw-post-body-paragraph nz oa hn ob b oc od oe of og oh oi oj fj ok ol om fm on oo op fp oq or os ot go bj\" data-selectable-paragraph=\"\">Recent approaches are trying to make analytics more accessible via conversational interfaces. These include both existing players, such as ThoughtSpot SearchIQ, Salesforce Tableau, Amazon QuickSight, and Microsoft Power BI, as well as new startups such as Seek, Defog, Ai2sql, Nlsql, Outerbase, ChatSpot, etc. The challenge, however, is to ensure correctness, i.e., conversations cannot lead to wrong results. Correctness is also a broader concern with generative AI (see attempts for \x3Ca class=\"af ny\" href=\"https://techcrunch.com/2023/03/08/forethought-aims-to-build-more-accurate-chatbots-with-more-constrained-generative-ai-models/\" target=\"_blank\" rel=\"noopener ugc nofollow\">accurate chatbots\x3C/a>), and therefore, current conversational interfaces for data analytics are mostly suggestive and best efforts.\x3C/p>\r\n\x3Cp id=\"df8d\" class=\"pw-post-body-paragraph nz oa hn ob b oc od oe of og oh oi oj fj ok ol om fm on oo op fp oq or os ot go bj\" data-selectable-paragraph=\"\">Apart from query interfaces, other efforts for AI-powered data analytics include Trifacta and DataRobot/Paxata for traditional data preparation, Lume for schema mapping, Turntable for assisted data modeling, and Keebo for warehouse optimization. 
Many of these are early-stage efforts, and many other problems, including data discovery, integration, models, pipelines, privacy, quality, operations, etc., remain a rich opportunity for AI to seep deeper into the data analytics stack and solve some of the harder problems there. Overall, we are still in the early days of making AI-powered data analytics practical and end-to-end.\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"3389\" class=\"ou ov hn be ow ox oy oz ff pa pb pc fi pd pe pf pg ph pi pj pk pl pm pn po pp bj\" data-selectable-paragraph=\"\">Déjà vu: AI-powered Data Systems\x3C/h1>\r\n\x3Cp id=\"7d64\" class=\"pw-post-body-paragraph nz oa hn ob b oc pq oe of og pr oi oj fj ps ol om fm pt oo op fp pu or os ot go bj\" data-selectable-paragraph=\"\">Interestingly, a very similar story of infusing AI has been playing out for data systems in the last few years. Consider data platforms like Spark, Snowflake, or BigQuery, and the various system-level problems inside them, such as performance, scale, resource utilization, efficiency, cost, configurations, etc. Many of these problems have become \x3Ca class=\"af ny\" href=\"http://vldb.org/pvldb/vol14/p3202-jindal.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">incredibly complex in cloud\x3C/a> with more sophisticated systems and workloads, less expert users (thanks to the ease of getting started), a lack of control in managed services, and too many moving parts with multiple layers of abstraction. 
As a result, it is hard for customers since they do not have the DBAs of the older on-premises world, and hard for cloud service providers since they are grappling with a deluge of customer requests to meet the quality of service.\x3C/p>\r\n\x3Cp id=\"cab3\" class=\"pw-post-body-paragraph nz oa hn ob b oc od oe of og oh oi oj fj ok ol om fm on oo op fp oq or os ot go bj\" data-selectable-paragraph=\"\">Together with my colleagues in Azure Data, I previously spent several years at Microsoft building and deploying a range of AI-powered techniques for data systems. Let’s briefly look at three of them below.\x3C/p>\r\n\r\n\x3Ch2 data-selectable-paragraph=\"\">\x3C/h2>\r\n \r\n\x3Ch2 id=\"e1d6\" class=\"pv ov hn be ow fd pw fe ff fg px fh fi fj py fk fl fm pz fn fo fp qa fq fr qb bj\" data-selectable-paragraph=\"\">1. Learned Cardinality Estimation\x3C/h2>\r\n\x3Cp id=\"fad1\" class=\"pw-post-body-paragraph nz oa hn ob b oc od oe of og oh oi oj fj ok ol om fm on oo op fp oq or os ot go bj\" data-selectable-paragraph=\"\">Data systems typically use a query optimizer to convert declarative SQL queries (represented as a tree of operators) to physical query execution plans (an optimized tree of operators). At its core, a query optimizer must estimate cardinalities (i.e., row counts) at each point in the operator tree. Cardinalities help estimate the cost of different physical operator trees and thus pick the cheapest one for execution. 
Unfortunately, cardinality estimation has been a long-standing problem in databases:\x3C/p>\r\n\r\n\x3Cblockquote class=\"qi qj qk\">\r\n\x3Cp id=\"d006\" class=\"nz oa ql ob b oc od oe of og oh oi oj fj ok ol om fm on oo op fp oq or os ot go bj\" data-selectable-paragraph=\"\">\x3Cem>\"The root of all evil, the Achilles Heel of query optimization, is the estimation of the size of intermediate results, known as cardinalities\"\x3C/em> — Guy Lohman.\x3C/p>\r\n\x3C/blockquote>\r\n\x3Cp id=\"4af7\" class=\"pw-post-body-paragraph nz oa hn ob b oc od oe of og oh oi oj fj ok ol om fm on oo op fp oq or os ot go bj\" data-selectable-paragraph=\"\">At Microsoft, \x3Ca class=\"af ny\" href=\"http://www.vldb.org/pvldb/vol12/p210-wu.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">cardinality estimation in the Cosmos big data workload\x3C/a> ranges from 10,000 times under-estimation all the way to a million times over-estimation for different operator sub-trees, exacerbated particularly due to lack of statistics at massive scale, presence of large volumes of unstructured data, and quite generous use of user defined operators that are hard for the optimizer to reason about.\x3C/p>\r\n\x3Cp id=\"3107\" class=\"pw-post-body-paragraph nz oa hn ob b oc od oe of og oh oi oj fj ok ol om fm on oo op fp oq or os ot go bj\" data-selectable-paragraph=\"\">We exploited two observations for building CardLearner, the ML-based cardinality estimator: (i) the recurring nature of the workloads where similar jobs with different inputs and parameters were executed repeatedly, and (ii) a large number of jobs having similar sub-trees across them. Together, these provided an excellent training input for learning several small and highly accurate \x3Cem class=\"ql\">micromodels\x3C/em>. 
These micromodels are then served to the query optimizer at compile time via an insights service.\x3C/p>\r\n\x3Cp id=\"17f9\" class=\"pw-post-body-paragraph nz oa hn ob b oc od oe of og oh oi oj fj ok ol om fm on oo op fp oq or os ot go bj\" data-selectable-paragraph=\"\">\x3Ca class=\"af ny\" href=\"https://www.microsoft.com/en-us/research/publication/microlearner-a-fine-grained-learning-optimizer-for-big-data-workloads-at-microsoft/\" target=\"_blank\" rel=\"noopener ugc nofollow\">Validation over production workloads\x3C/a> showed the 95th-percentile error dropping by five orders of magnitude, from 465711% to just 1%, leading to better cost estimates, lower job latencies, and, surprisingly, even fewer containers used by the workload. CardLearner is the first ML-based cardinality estimator to be deployed in production, and the key to its success was 4+ years of hardening, from an intern project to being enabled by default for critical production workloads, handling numerous corner cases, performance regressions, and fallback mechanisms along the way.\x3C/p>\r\n\r\n\x3Ch2 data-selectable-paragraph=\"\">\x3C/h2>\r\n \r\n\x3Ch2 id=\"dece\" class=\"pv ov hn be ow fd pw fe ff fg px fh fi fj py fk fl fm pz fn fo fp qa fq fr qb bj\" data-selectable-paragraph=\"\">2. Learned Resource Allocation\x3C/h2>\r\n\x3Cp id=\"f597\" class=\"pw-post-body-paragraph nz oa hn ob b oc od oe of og oh oi oj fj ok ol om fm on oo op fp oq or os ot go bj\" data-selectable-paragraph=\"\">Cloud data systems are increasingly “serverless”, where users do not have to provision resources upfront and the system can decide resources dynamically. However, most systems still allow users to provide hints when submitting queries, e.g., concurrency level in Snowflake, executor count in Spark, and token counts in SCOPE at Microsoft. 
Unfortunately, SCOPE users rarely make an informed decision:\x3C/p>\r\n\r\n\x3Cblockquote class=\"qi qj qk\">\r\n\x3Cp id=\"b6f2\" class=\"nz oa ql ob b oc od oe of og oh oi oj fj ok ol om fm on oo op fp oq or os ot go bj\" data-selectable-paragraph=\"\">\x3Cem>\"At no point did I feel I had a better understanding beyond this: more tokens mean faster job completion… [U]se minimal tokens (<50) for tiny jobs and as many tokens as possible otherwise\"\x3C/em> — SCOPE user.\x3C/p>\r\n\x3C/blockquote>\r\n\x3Cp id=\"a3e9\" class=\"pw-post-body-paragraph nz oa hn ob b oc od oe of og oh oi oj fj ok ol om fm on oo op fp oq or os ot go bj\" data-selectable-paragraph=\"\">No wonder 40–60% of the jobs in \x3Ca class=\"af ny\" href=\"http://www.vldb.org/pvldb/vol13/p3326-sen.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">Cosmos have over-allocated tokens by as much as 1000x\x3C/a>, thus blocking resources for other jobs in the shared cluster while also creating an artificial peak demand that is higher than actually needed (see here the production \x3Ca class=\"af ny\" href=\"https://github.com/microsoft/Peregrine\" target=\"_blank\" rel=\"noopener ugc nofollow\">resource distributions\x3C/a>). An ML-based model, AutoToken, that predicts the peak resource for each job has an RMSE of less than 10% (two orders of magnitude lower than the previous state of the art), with the \x3Ca class=\"af ny\" href=\"http://www.vldb.org/pvldb/vol13/p3326-sen.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">token ask in one of the customer workloads reduced by 97%\x3C/a>.\x3C/p>\r\n\x3Cp id=\"73ee\" class=\"pw-post-body-paragraph nz oa hn ob b oc od oe of og oh oi oj fj ok ol om fm on oo op fp oq or os ot go bj\" data-selectable-paragraph=\"\">Interestingly, resources less than peak may not impact performance disproportionately. There is a sweet spot where reduced resources (from the peak needed) still have acceptable performance. 
Models for such \x3Ca class=\"af ny\" href=\"https://openproceedings.org/2022/conf/edbt/paper-78.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">optimal resource allocation\x3C/a> can be built using careful experimentation, and the approach is applicable for other data systems, e.g., \x3Ca class=\"af ny\" href=\"https://openproceedings.org/2023/conf/edbt/paper-186.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">optimal executor counts in Spark\x3C/a>. However, AI alone cannot substitute the need to model the problem carefully, discover the relationships between performance and cost, formulate it as a generalized model, and test and improve it repeatedly. This was the key to deploying learned resource allocation.\x3C/p>\r\n\r\n\x3Ch2 data-selectable-paragraph=\"\">\x3C/h2>\r\n \r\n\x3Ch2 id=\"d1c8\" class=\"pv ov hn be ow fd pw fe ff fg px fh fi fj py fk fl fm pz fn fo fp qa fq fr qb bj\" data-selectable-paragraph=\"\">3. Learned Query Optimizer\x3C/h2>\r\n\x3Cp id=\"065c\" class=\"pw-post-body-paragraph nz oa hn ob b oc od oe of og oh oi oj fj ok ol om fm on oo op fp oq or os ot go bj\" data-selectable-paragraph=\"\">The query optimizer has long been a signature component of relational database systems, with a rich history of both academic and industrial research around it. From the early days of System R to modern data systems, this component has evolved significantly, with the cascades architecture becoming popular with several database implementations, including SQL Server, SCOPE, Spark, Calcite, Greenplum, Snowflake, Spanner, and F1. Interestingly, many individuals who worked on Cascades have moved between companies, cross-pollinating a common set of ideas and design principles across industry. There is an interesting anecdote of the SCOPE team hiring engineers who had previously worked on a distant past version of the SCOPE codebase, a few acquisitions, mergers, and reorgs ago. 
It is no wonder then that people working on query optimization are scarce and highly sought after.\x3C/p>\r\n\x3Cp id=\"4eb9\" class=\"pw-post-body-paragraph nz oa hn ob b oc od oe of og oh oi oj fj ok ol om fm on oo op fp oq or os ot go bj\" data-selectable-paragraph=\"\">Given how fundamental query optimizers are to databases, there was a ChatGPT moment of sorts when people started \x3Ca class=\"af ny\" href=\"https://www.vldb.org/pvldb/vol12/p1705-marcus.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">proposing to replace the entire query optimizer with a learned one\x3C/a>. This was an ambitious move, making query optimizer people wonder about the future of this complex area. However, researchers quickly realized that it was hard to completely replace query optimizers with learned ones, and so they proposed a \x3Ca class=\"af ny\" href=\"https://people.csail.mit.edu/tatbul/publications/bao_sigmod21.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">more practical version\x3C/a> that sits side by side with the existing query optimizer to help guide the query plan search.\x3C/p>\r\n\x3Cp id=\"37ed\" class=\"pw-post-body-paragraph nz oa hn ob b oc od oe of og oh oi oj fj ok ol om fm on oo op fp oq or os ot go bj\" data-selectable-paragraph=\"\">My former team at Microsoft worked with the leading researchers on grounding these ideas in the industry-strength workloads in SCOPE. We noticed that SCOPE has 256 optimizer rules, leading to 2²⁵⁶ possible optimizer configurations, compared with only 48 configurations considered in the original work. Therefore, we needed to divide this massive search space and came up with a \x3Ca class=\"af ny\" href=\"https://dl.acm.org/doi/pdf/10.1145/3448016.3457568\" target=\"_blank\" rel=\"noopener ugc nofollow\">rule signature\x3C/a> that captures the code path that a query takes inside a query optimizer. We could then learn smaller, specialized models to steer the optimizer toward good paths. 
Still, performance regression is a big challenge when \x3Ca class=\"af ny\" href=\"https://vldb.org/pvldb/vol14/p3362-hossain.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">deploying AI for systems\x3C/a>. The team came up with more innovations to carefully design pre-production experiments before the model could be \x3Ca class=\"af ny\" href=\"https://dl.acm.org/doi/abs/10.1145/3514221.3526052\" target=\"_blank\" rel=\"noopener ugc nofollow\">deployed to production\x3C/a>, a multi-year effort in taking the early AI excitement to production reality.\x3C/p>\r\n\r\n\x3Ch2 data-selectable-paragraph=\"\">\x3C/h2>\r\n \r\n\x3Ch2 id=\"cf3d\" class=\"pv ov hn be ow fd pw fe ff fg px fh fi fj py fk fl fm pz fn fo fp qa fq fr qb bj\" data-selectable-paragraph=\"\">Lesson Learned\x3C/h2>\r\n\x3Cp id=\"64da\" class=\"pw-post-body-paragraph nz oa hn ob b oc pq oe of og pr oi oj fj ps ol om fm pt oo op fp pu or os ot go bj\" data-selectable-paragraph=\"\">There are three key lessons to be learned from the above examples of deploying AI. First, it is tempting to think of replacing things with AI. Yet, AI is better off assisting someone or something that exists in doing a better job. Second, many people instinctively try to learn large global models that can predict everything; however, it is more practical to learn smaller local models that predict fewer things but with far more accuracy. Local models are also simpler, smaller, faster, and even explainable. And finally, the devil is in a lot of details, particularly when it comes to AI. 
It is non-trivial to make AI work reliably in any production setting and so it is important to consider the corner cases that can show up.\x3C/p>\r\n\r\n\x3Ch1 data-selectable-paragraph=\"\">\x3C/h1>\r\n \r\n\x3Ch1 id=\"8ef4\" class=\"ou ov hn be ow ox oy oz ff pa pb pc fi pd pe pf pg ph pi pj pk pl pm pn po pp bj\" data-selectable-paragraph=\"\">Conclusion\x3C/h1>\r\n\x3Cp id=\"e871\" class=\"pw-post-body-paragraph nz oa hn ob b oc pq oe of og pr oi oj fj ps ol om fm pt oo op fp pu or os ot go bj\" data-selectable-paragraph=\"\">To conclude, many classic data problems are still not solved, and we continue to hear “data is the biggest blocker” from many practitioners. It will be interesting to see how the new wave of AI unfolds for data, and hopefully, we can draw lessons from some of the recent experiences that we as a community have had in the field.\x3C/p>\r\n ",can_share_on_x:!0,can_share_on_facebook:!0,can_share_on_linkedin:!0,blog_image:"https://blog.tursio.ai/wp-content/uploads/2025/05/0_3EisrYiaKXjr6aRN.webp",author:"Alekh Jindal",author_image:void 0,published_date:"March 17, 2023",tags:"Engineering",description:"AI is reshaping industries at internet-scale, with generative AI fueling a surge of startups and applications. Yet, enterprise data remains underused, raising the question: can AI make data better? Conversational analytics tools like Tableau, QuickSight, and startups aim to democratize insights, though correctness is a hurdle. In data systems, Microsoft’s Azure Data team deployed AI successfully: CardLearner improved query accuracy, AutoToken optimized resources, and learned optimizers guided query planning. Key lessons: AI works best as an assistant, local models outperform global ones, and production reliability depends on handling edge cases. Data remains the biggest blocker, but AI offers promise"}}],total:27});$R[63]($R[6],$R[7]);$R[63]($R[4],!0);
From technical deep-dives to industry spotlights, explore how generative AI is reshaping the way we work with data.
Generative AI
Will SQL be the new Assembly Language?
Accessing enterprise data shouldn’t be hard—but for billions of professionals, SQL remains a barrier. Despite decades of BI tools and dashboards promising “self-serve” analytics, business users are still blocked by complexity, slow workflows, and reliance on data experts. Text-to-SQL AI seemed like a solution, but auto-generated queries still require verification, leaving the SQL wall intact. Tursio reimagines data analytics by connecting natural language questions directly to enterprise data while ensuring correctness. By inferring semantic models, constraining AI queries to relevant data, and systematically building query plans, Tursio delivers accurate, interpretable, and actionable answers—making AI-powered analytics truly accessible.
Alekh Jindal
Databases
Generative AI
Why AI fails despite “great” models?
We’ve all seen AI demos—type a question, get an instant answer. In reality, 95% of enterprise AI pilots fail—not because models are weak, but because prompting is hard. Most users aren’t prompt engineers, especially when querying structured data, leading to hallucinations and lost trust. At Tursio, Auto Mode solves this: it scaffolds questions based on what exists, what’s meaningful, and what’s probable in your data. Users explore, select, and accept—without worrying about SQL or prompt complexity. AI should adapt to humans, not the other way around. The result? Fast, reliable insights you can trust.
Alekh Jindal
Generative AI
Redefining Productivity with AI
AI isn’t replacing people—it’s amplifying them. Across functions, AI acts as a force multiplier, automating repetitive work, accelerating research, and enabling faster, more confident decisions. Analysts interpret rather than wrangle data, marketers generate content and insights instantly, customer teams serve faster, and managers focus on strategy over reporting. The new productivity measure is decision velocity, not hours worked. Tomorrow’s professionals need data literacy, AI copiloting skills, and cross-functional agility. At Tursio, we empower teams to access structured data in natural language, removing barriers and unlocking human potential, so AI amplifies impact instead of just adding tools.
Nilanshi Dhoundiyal
Generative AI
MCP for Databases: New trick for old elephants
In enterprise AI, making LLMs understand structured data is tricky. Many turn to LangChain for rapid prototyping, but Model Context Protocol (MCP) is emerging as a production-ready alternative—sometimes called the “USB-C for AI.” MCP servers translate natural language into schema-aware, secure queries, giving LLMs safe access to SQL, Snowflake, or FHIR databases without glue code. While LangChain excels at flexibility, MCP shines in regulated, structured environments. Tursio complements this by focusing on the human experience: effortless natural language querying, context-aware explanations, and insights for decision-makers. Together, structured access and user-centric interfaces define modern AI workflows.
Shraddhaa Khanna
Generative AI
Is 100x Productivity Possible with AI?
Knowledge workers waste nearly 20% of their time hunting for data. Dashboards, reports, and analytics tools exist—but insights are buried. True efficiency isn’t about doing more; it’s about getting answers faster. AI can remove friction between question and insight, letting anyone explore data instantly without SQL or analyst handholding. Imagine a marketer checking campaign performance mid-meeting, a PM spotting drop-offs in real time, or finance stress-testing budgets on the spot. This is 100x efficiency: decisions made at the speed of thought. The next advantage isn’t more data—it’s faster, smarter access to it.
Nilanshi Dhoundiyal
Credit Unions
Generative AI
How Credit Unions Are Winning with Generative AI
AI adoption for credit unions works best as a staged journey, not a leap. Start small with Level 1—quick, natural language insights for immediate ROI. Level 2 expands to advanced financial analysis across departments, enabling data-driven strategy, risk management, and growth tracking. Level 3 brings department-level simulations, scenario planning, and deep operational insights for Lending, Risk, and Member Services. Success requires strong data governance, simplified deployment, and empowering non-technical users. By starting smart, scaling fast, and aligning AI with business priorities, credit unions can unlock actionable insights, improve decision-making, and maximize value without the pitfalls of a “big bang” rollout.
Murali Mahalingam
Clinical Research
Generative AI
Searching Clinical Data using Generative AI
Healthcare data is messy, and querying it effectively is critical for better patient outcomes. SearchAI leverages generative AI to enable natural language search over clinical databases, including ICD, CPT, NDC, MIPS, and modifier codes. Using Boolean decomposition, ontology-aware navigation, and instance-specific tuning, SearchAI interprets complex medical queries, maps them to hierarchical codes, and returns accurate results. Hierarchical flattening and hybrid approaches enhance precision and coverage, achieving up to 99% accuracy. Fast, robust, and semantically aware, SearchAI empowers clinicians and researchers to quickly access actionable information, bridging the gap between technical complexity and clinical usability, and transforming how healthcare data is explored.
Karan Hanswadkar
Spotlight Stories
Tursio Product Updates: Towards 100x knowledge workers
Knowledge workers often struggle to get actionable insights from enterprise data. Tursio’s latest release addresses this with Auto, Analyze, and Research modes. Auto guides users through semantic models, generating accurate queries without requiring technical expertise. Analyze lets users dig deeper, combining facts with creative reasoning for holistic insights. Research enables open-ended exploration for discovery. Persistent sharing and smarter PDF exports ensure insights are never lost and can be presented clearly. User-recommended questions foster collaboration across teams. Together, these features make Tursio an end-to-end tool, transforming questions into decisions faster and empowering knowledge workers to fully leverage their data.
Shraddhaa Khanna
Databases
Generative AI
From Queries to Conversations in SQL Server
Tursio bridges this gap, turning structured enterprise data into real-time, conversational intelligence. Our NLP engine understands intent, delivers fast and accurate answers, and works securely across systems. Tursio empowers every employee to query data naturally, making business intelligence intuitive, accessible, and actionable.
Rony Chatterjee
Databases
Generative AI
AI-powered Querying for 100x Knowledge Workers
Asking questions is central for knowledge workers, yet traditional business intelligence tools—like dashboards—are slow, technical, and inefficient. Modern AI offers a generational productivity leap, automating repetitive tasks so workers can focus on analysis, interpretation, and asking better questions. AI-powered querying enables rapid, high-level insights, letting knowledge workers operate 100x more efficiently. Companies are increasingly adopting AI to do more with less, transforming workflows. In this new paradigm, conventional BI becomes obsolete, replaced by tools that empower humans to think, question, and act faster, unlocking unprecedented workplace productivity and reshaping the way knowledge-driven decisions are made.
Alekh Jindal
Databases
Generative AI
Why is it hard to bet on AI?
Generative AI promises efficiency for knowledge workers but raises questions about its real value. While AI can simplify, automate, or enable breakthroughs, the market is crowded with complex tools, creating confusion for investors, customers, and builders. Many solutions require vast data and expertise, making trust and usability a challenge. Enterprises often prefer simple, reliable interfaces like search boxes over complex workflows. Success depends on delivering consistent, accurate, and actionable results while maintaining simplicity. At Tursio, the focus is on making generative AI work seamlessly within enterprise databases, empowering users without moving data, reducing complexity, and maximizing impact.
Alekh Jindal
Databases
Generative AI
Power of Asking Questions
Rudyard Kipling’s “Six Honest Serving Men”—What, Why, When, How, Where, Who—highlights the power of questioning, a skill essential in every field. Traditional business intelligence relies on engineers and dashboards to answer questions, but this slows knowledge workers, limiting productivity and insight. In the knowledge economy, professionals—from marketers to clinicians to bankers—must ask questions iteratively to make informed decisions. Generative AI enables natural-language queries over enterprise data, transforming raw data into actionable knowledge. Tursio brings generative AI to databases without moving data, empowering knowledge workers to ask advanced questions, generate analyses, and iterate quickly, keeping critical decision-making agile and efficient.
Alekh Jindal