Theo is in the interviewer’s chair for this episode as Frank Liu from Zilliz joins the show to talk about how AI and machine learning are making it possible for developers to understand and extract more value from unstructured data such as text, audio, images, and video. Traditional databases are great for handling structured data, meaning information that can easily be captured and categorized in a table. For example, a book database would have columns for the title and author, which allows the data to be searched, sorted, and filtered according to those values. Unstructured data isn’t so easily managed. Frank explains how vector databases and AI can use algorithms and increased compute power to search unstructured data without relying solely on tags and keywords. This work has led Zilliz to become a major contributor to two open source projects: Milvus, which stores embedding vectors and feature maps from trained machine learning models, and Towhee, a machine learning platform built on top of Google’s TensorFlow and the open source PyTorch machine learning framework. Tools like these could be used to run a reverse image search to find similar photos, or even to make music recommendations based on the rhythm or sound of the music instead of being restricted to artist and genre keywords. Frank shares some of the highly specific and interesting use cases Zilliz has seen in fields like network security and pharmaceutical molecular analysis on its journey to becoming the one-stop shop for anyone needing to understand human-generated unstructured data.