How to Train Your AI Chat Dragon: Hearby Uses Chat Technology to Help Fans Find Grassroots Music They’ll Love
By Karen Elliott, Digital Music News (October 30, 2024)
https://www.digitalmusicnews.com/2024/10/30/hearby-chat-technology-ai-find-music-concerts/

Photo Credit: AI

ChatGPT is impressive out-of-the-box but challenging to apply to real-world problems. Area4 Labs and Hearby are building with AI technology to create a data-driven live event concierge.

The following comes from Hearby, a fast-emerging player in concert discovery and a DMN partner. Enjoy!

Full disclosure: “Train” when applied to Chat technology is the same “train” that we might apply to cats. That is, we ask them to do things they were going to do anyway in a way that doesn’t displease them, and then we figure out how to be happy with what they did.

This has been our biggest lesson in creating “Ask Hearby,” our AI chatbot music concierge. In this article, I’ll bring you behind the scenes on our AI adventure.

At Hearby, we aim to use technology to find and uplift grassroots music and help people discover the wonderful music hidden right in their neighborhoods.  Whether you’re looking for a night of clubbing, a free classical concert, or music to keep the kids out of your hair, it’s all out there.  You may not realize there’s a great music venue right in the industrial park next door, on a dockyard in Liverpool, or in a thrift shop in London.

We want to get people exploring and finding music that they’ll love, and to do this, we spent a lot of time investing in fast search technologies, data-driven filters, and map visualizations. Then we ran right into the wall of ‘Too Much Stuff.’

Searching, filtering, and reviewing results is a lot of work for something that should be fun. Enter the chatbot, which lets fans fast-forward and simply say what they want, skipping all those tedious steps.

However, my experience with chatbots has been a big meh, and we wanted to do something more intriguing.

Our main product requirement was “be useful and don’t be irritating”.  Yes, it took us a long time to get that, and we fell off the dragon a few too many times. But seeing how this ubiquitous technology works and encouraging more ideas and dreams with it has been interesting.  At its core, it’s a foray into using a Large Language Model (LLM), and I will breathlessly say the possibilities are unlimited.

It took a while, but after several tries, we finally have something useful and entertaining to use. So here’s the behind-the-scenes on what we tried that didn’t work — and what finally did.

    • Train, train, and train again.
    • Give me all the data!  More data!
    • Hybrids: Just how many technologies can we cram in here?
    • It’s a sandwich.

So first, a little more about Training when it comes to Machine Learning.

I need to bring up the topic of training, partly for my snazzy title but also because it underlies everything you’re hearing about AI.

To train ML, we first choose a neural net architecture, then give it a vast set of data items labeled with the correct answers (for example, Cat/Dog, T-shirt/Skirt, Pedestrian/Bollard).  This type of supervised learning is expensive in computing power, requiring a huge amount of ethically obtained, accurately labeled data. Training enables the ML architecture – the layers and feedback loops that make up the neural net  – to adjust to create maximally accurate predictions.  For example: “99% chance this image is a cat”.
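The training loop described above can be sketched in miniature. This is a toy, hand-rolled example, not anything from Hearby's stack: a single logistic "neuron" learning a cat/dog boundary from a handful of labeled points, with made-up features chosen purely for illustration.

```python
# Minimal sketch of supervised learning: one logistic "neuron" trained
# on labeled toy data (1 = cat, 0 = dog). Features and data are invented.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# toy labeled data: (ear_pointiness, bark_frequency) -> label
data = [
    ((0.9, 0.1), 1),  # cat
    ((0.8, 0.2), 1),  # cat
    ((0.2, 0.9), 0),  # dog
    ((0.1, 0.8), 0),  # dog
]

w = [0.0, 0.0]
b = 0.0
lr = 0.5

for _ in range(1000):            # gradient-descent training loop
    for (x1, x2), y in data:
        p = sigmoid(w[0] * x1 + w[1] * x2 + b)
        err = p - y              # gradient of log-loss w.r.t. the logit
        w[0] -= lr * err * x1
        w[1] -= lr * err * x2
        b -= lr * err

def predict(x1, x2):
    """Return P(cat) for a new example."""
    return sigmoid(w[0] * x1 + w[1] * x2 + b)
```

The real thing swaps this single neuron for layers of millions, and four labeled points for millions of them, which is exactly where the cost explodes.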

Going beyond cat/dog to something actually relevant quickly gets expensive and time-consuming. It’s pretty much prohibitive on large data sets for all but the biggest players.  Enter LLMs, which come ready-trained on massive amounts of human text right out-of-the-box for anyone to use.

This is what powers our chat dragon: the ability to “understand” human language, figure out what is being asked, and create amazing responses in human language.  On the topic of whether there is any actual human-style understanding of concepts, I can start an argument in an empty room (so I won’t go there).  It doesn’t matter for our purposes as long as the output is accurate, useful, valuable, safe, and reliable.

This brings me to our challenge: how to make already trained chat technology do what we want.

For a small amount of money and a lot of delight, you can get a subscription to OpenAI’s ChatGPT, which will happily write you a letter to Grandma, your term paper, or a pretty decent novel – at least better than anything I can write.  Whether soulless or best-selling is in the eye of the beholder, but I prefer to consider it a fantastic tool to help spur creativity.

But as impressive as this is, these out-of-the-box answers are standalone, and the type of chatbot we wanted to create is a conversation that builds as we go along, with context and informality, powered by accurate event, venue, and band data.  The challenge, then, is how to get a language-based model to incorporate this external data and use it in its responses and how to have the conversation build as it progresses (memory).
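To make the "memory" point concrete: a chat LLM is stateless, so each request resends the prior turns plus any retrieved event data as context. Here is a minimal sketch using the common role/content message convention; the helper name, venue data, and conversation are all illustrative, not Hearby's actual code.

```python
# Sketch of conversational context for a stateless chat LLM: every turn
# re-sends the history plus retrieved facts. Names and data are invented.

def build_messages(history, retrieved_facts, user_question):
    """Assemble the full prompt for one chat turn."""
    messages = [{
        "role": "system",
        "content": (
            "You are a live-music concierge. Answer ONLY from the "
            "event data provided below.\n" + "\n".join(retrieved_facts)
        ),
    }]
    messages.extend(history)             # prior turns = the "memory"
    messages.append({"role": "user", "content": user_question})
    return messages

history = [
    {"role": "user", "content": "What's on in London tonight?"},
    {"role": "assistant", "content": "The Troubadour has a folk night at 8pm."},
]
facts = ["Troubadour, London: folk night, 8pm", "Ronnie Scott's: jazz, 9pm"]
msgs = build_messages(history, facts, "Anything jazzier?")
```

The follow-up "Anything jazzier?" only makes sense because the earlier turns ride along in the same request.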

Data! Give me all the data!

The challenge is getting our data into ChatGPT to inform its responses.  In a “normal” program, this is a matter of, well, programming.  However, an LLM is different: rather than programming, information needs to be text-based to be taken on board.

It’s weird, but not so much when we remember this is a language model. This is precisely how we listen, take in new info, understand it, and use it to inform our actions.  In all fairness, the latest models also allow other forms of input, expanding beyond text input. But this is where it was when we started, so that’s where we began.

We started with text-to-SQL, in which we describe in words how to find the answers to questions using the tables in our database: essentially, explaining to the model, as you would to a programmer, how to formulate database queries.  This sounded so crazy and improbable that we thought it just might actually work.

Sometimes, it did, but mostly, it sulked, made stuff up, or ignored us.  Or all of the above.  If you’re thinking cat again, I’m right there with you.
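For flavor, here is roughly what the text-to-SQL setup looks like, plus the kind of sanity check you end up needing when the model "makes stuff up." The schema, prompt wording, and validator below are illustrative assumptions for this sketch, not Hearby's production code.

```python
# Sketch of the text-to-SQL idea: describe the schema in words, ask the
# LLM for a query, then sanity-check what comes back before running it.
# Schema, prompt, and validator are invented for illustration.

SCHEMA_NOTES = """
Table Event(id, name, date, venue_id, genre)
Table Venue(id, name, city)
Join Event.venue_id to Venue.id. Filter by Venue.city for location.
"""

def make_prompt(question):
    return (f"Using this schema:\n{SCHEMA_NOTES}\n"
            f"Write one SQL SELECT answering: {question}")

ALLOWED_TABLES = {"event", "venue"}

def looks_safe(sql):
    """Reject non-SELECT statements and hallucinated table names."""
    lowered = sql.lower()
    if not lowered.lstrip().startswith("select"):
        return False
    # crude check: every word after FROM/JOIN must be a known table
    tokens = lowered.replace(",", " ").split()
    for i, tok in enumerate(tokens[:-1]):
        if tok in ("from", "join") and tokens[i + 1] not in ALLOWED_TABLES:
            return False
    return True
```

The validator is the cat-training part: you cannot stop the model inventing a `Tickets` table, but you can refuse to run the query when it does.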

Bring in the hybrids.

So, we moved on to hybridizing and searching our database using ChatGPT for its language capabilities. Among the many challenges:

(1) Knowing what the fan is asking about – An event? A venue? A neighborhood? A genre? A person?

(2) Find the data in our database with a fuzzy search – the whole point of chatting is that the fan doesn’t have to be specific.

(3) Get the data into ChatGPT in words, which is all it understands.

(4) Receive a human-ready answer from ChatGPT.

(5) Augment that answer with links and images.
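The five steps above chain together into one pipeline. In this sketch every helper is a hypothetical stub standing in for the real component (NER, fuzzy search, a live LLM call); only the shape of the flow is the point.

```python
# The five hybrid steps sketched as one pipeline. All helpers are
# illustrative stubs, not real Hearby components.

def classify_intent(question):          # step 1: what is the fan asking about?
    return "event" if " on " in f" {question.lower()} " else "venue"

def fuzzy_search(intent, question):     # step 2: fuzzy lookup in our database
    return [{"name": "Jazz Brunch", "venue": "The Blue Note", "url": "/e/1"}]

def to_text(records):                   # step 3: feed data to the LLM as words
    return "; ".join(f"{r['name']} at {r['venue']}" for r in records)

def ask_llm(context, question):         # step 4: stubbed stand-in for ChatGPT
    return f"You could try: {context}."

def add_links(answer_text, records):    # step 5: augment with links/images
    return answer_text + " " + " ".join(r["url"] for r in records)

def answer(question):
    intent = classify_intent(question)
    records = fuzzy_search(intent, question)
    reply = ask_llm(to_text(records), question)
    return add_links(reply, records)
```

Each stage is replaceable on its own, which is what made the trial-and-error bearable.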

We quickly realized we needed to confirm it was using our data and not going elsewhere. The relevant dial in LLM terms is temperature, which controls how much randomness goes into each response; turning it down keeps answers close to the data we supply.  Or, in human terms: don’t make stuff up!

It’s a sandwich

After a number of tries, we ended up with a workable sandwich of technologies: BERT NER to understand what the fan is asking about; specialized models to detect essential but idiosyncratic info like informal dates (“in 3 weeks”); a vector database to translate a fuzzy human question into something specific we can ask our existing search capability; a layer to feed the search answer to ChatGPT in words; a method to receive the ChatGPT response in human language; and, finally, a layer to augment it with images and links.
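One layer of that sandwich is worth sketching: the vector lookup that maps a fuzzy question onto the nearest known query. Real systems use learned embeddings and a proper vector store; the hand-made three-number "embeddings" below are purely illustrative.

```python
# Toy vector lookup: pick the known query template whose vector is
# closest (by cosine similarity) to the question's vector. The
# dimensions and values here are invented for illustration.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# pretend embedding axes: (jazz-ness, brunch-ness, tonight-ness)
templates = {
    "events_tonight": [0.1, 0.0, 1.0],
    "jazz_brunch":    [0.9, 0.9, 0.1],
}

def nearest_template(question_vec):
    return max(templates, key=lambda k: cosine(question_vec, templates[k]))
```

The payoff is that "Where can I take Aunt Nelly for a jazz brunch?" does not need to match any keyword exactly; it just needs to land near the right template.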

Voila!  If this all sounds like a bit much, I get you.  But we were delighted to see that a fan can ask a reasonable question, “What’s on in London tonight?” or “Where can I take Aunt Nelly for a jazz brunch?” and get a believable answer that makes sense.

More interesting is that a fan can ask an unreasonable question and get an answer about music events or venues, and an explanation as to why, or, if it’s too far a stretch, simply a reasonable on-topic answer.  And, to put your mind at ease somewhat, some questions bring in the guard rails: “I cannot assist you with that”.

Onward!

In addition to our chatbot launching later this year, we are working on several other AI efforts, mainly in Machine Learning and classification. These are focused on highlighting the music scene for fans and encouraging them to explore and find new music and venues.  Off their sofas and into venues!

The chatbot has been a very interesting excursion for us into LLMs, which have enormous potential to change how we live with software. So, I hope this has shone a little light on this powerful technology for you.

We’re focused on music and using these incredible tools to uplift grassroots music. Still, I hope this gave you some ideas on how this kind of technology might help in your part of the music world – places where you want people to be able to get to the point faster, have informal access to better information, or be able to explore and expand on an idea on the fly.

Are We There Yet? How Area4Labs Is Utilizing AI to Highlight Our Vibrant Grassroots Music Scene
By Karen Elliott, Digital Music News (August 29, 2024)
https://www.digitalmusicnews.com/2024/08/29/area4labs-ai-grassroots-music-scene-concerts/

Photo depicting Area4Labs platform Hearby, generated by AI.

AI is full of theoretical hype — but Area4Labs is applying AI to construct real solutions for mapping live shows.

The following comes from Area4Labs, the company behind Hearby and a fast-emerging player in concert discovery. 

Area4Labs has been diving headfirst into AI and sifting through lots of theoretically exciting possibilities. But we’re also crafting concrete solutions that are attracting serious partnerships and changing the game for show listings and concert discovery. In this article, I’ll give you a breakdown of what we’re building right now via our discovery platform and app, Hearby.

About three years ago, we started applying AI to our favorite challenges—identifying bands playing at events and creating a general-purpose event website scraper. In retrospect, these problems were enticing, but they were too ambitious for the tech available at that time, plus we had a lot of learning to do first.

Now that we have some mileage, we’re working on expanded versions of these same capabilities: how to find events (acquisition) and how to know what they really are (classification). Finding events incorporates an LLM, and classification uses a statistical model. We are also revisiting our neural net-based band identification project.

All over the world, organizations are sorting hype from reality and devising ways to avoid or fill shortfalls to get real work done now.

GPT-4 is deeply impressive, but getting it to do something fact-oriented and useful is a challenge. For example, asking it for “best venues in Boston” will get you a partial and out-of-date list. It’s a beautiful list, but it includes closed venues and doesn’t tell you what’s on tonight. To add a human touch, it will defer to actual human-curated ‘Best of’ lists, which are okay as a compilation of knowledge but nothing you couldn’t have found by Googling.

Hardware to train these models on is prohibitively expensive, leaving this in the hands of mega-corporations like Google, Facebook, OpenAI, and Amazon, not to mention the difficulty of acquiring clean, ethically procured data.

However, those closed doors are now opening due to the (relatively) recent advancement of incremental training. As a result, general-purpose models can be created by a large organization, then acquired by smaller groups and fine-tuned to meet special interests or needs.

I’ll be sharing some observations as we grapple with these problems, starting with:

  • It’s not as great as you think
  • You need clean data to learn on (and lots of it)
  • The chasm of supervised vs. unsupervised
  • The bizarre job you’ve never heard of: ‘Prompt Engineer’

First observation: It’s not as great as you think

AI can both perform impressive tasks and fail at simple things a 5-year-old (or a dog) could manage. Frustratingly, sometimes ChatGPT gives a coherent, useful answer, but sometimes it just gives back junk or simply refuses to answer.

It seems simple to hook in a database of facts, but this crosses two paradigms: computer-based information and human-like language. So, the challenge, as it is with humans, is to describe a database or task in human language. This task is exactly as clumsy as it sounds, as we all know from trying to explain something complex to another person. 

Second observation: You need clean data to learn on (and lots of it).

Models need to learn on already cleaned and categorized data, which is hard to find and trust. This data needs to be ethically obtained. In the volume that is needed—millions of data points—this is prohibitive. LLMs provide a pre-trained model that can be adapted and/or expanded, which lightens this load but doesn’t remove it.

Thirdly: supervised vs. unsupervised

Supervised learning vs. unsupervised simply means whether the model is trained on categorized data (i.e., the correct answers are known) or left to find structure in the data on its own.
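The contrast fits in a few lines. Both snippets below use the same toy "ticket price" feature; the data, labels, and the venue-size framing are invented for illustration only.

```python
# Supervised vs. unsupervised in miniature, on an invented toy feature.
prices = [8, 10, 12, 95, 110, 120]

# Supervised: labels are known, so we learn a boundary directly from them.
labeled = [(8, "club"), (10, "club"), (12, "club"),
           (95, "arena"), (110, "arena"), (120, "arena")]
boundary = (max(p for p, lab in labeled if lab == "club") +
            min(p for p, lab in labeled if lab == "arena")) / 2

def classify(price):
    return "club" if price < boundary else "arena"

# Unsupervised: no labels -- let 1-D k-means (k=2) find the groups itself.
def kmeans_1d(xs, iters=10):
    c1, c2 = min(xs), max(xs)               # start centers at the extremes
    for _ in range(iters):
        g1 = [x for x in xs if abs(x - c1) <= abs(x - c2)]
        g2 = [x for x in xs if abs(x - c1) > abs(x - c2)]
        c1, c2 = sum(g1) / len(g1), sum(g2) / len(g2)
    return sorted((c1, c2))
```

The unsupervised version finds the same two groups here, but it cannot tell you they are "clubs" and "arenas"; naming the categories still takes labels, and labels take work.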

Finding events is relatively simple—we start by looking where we know we will find music events, such as ticketing APIs, scraping venue websites, or understanding the weekly or monthly schedules of small venues.

But what about events that are promoted alone and without context? A café poster or a Google result? An API event that is not categorized? Is it music, theater, sports, or family?

If you see “AC/DC versus Led Zeppelin,” you know exactly what that is — what kind of music, and probably what kind of venue, crowd, and vibe it involves. And if you see “Arsenal versus Manchester United,” you also know precisely what that is. But pity the AI that must figure that out.

The challenge is to gather enough events and bands and fully understand them, including what kinds of events are happening and what type of band is playing what—then use this information to train a Machine Learning model. Complicating matters is that music scenes vary by city and even genre. In the UK, tribute bands are popular; in the US, less so. A model trained on New York City will probably be less accurate in connecting and categorizing bands in Manchester.
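To see why this is hard for a machine, note that "AC/DC versus Led Zeppelin" and "Arsenal versus Manchester United" are identical in shape; only knowledge of the names tells them apart. The tiny lookup table below is a stand-in for the mountain of labeled data a real classifier needs, and everything in it is illustrative.

```python
# Toy event classifier: vote by known names found in the title. The
# lookup table stands in for real labeled training data.
KNOWN = {
    "ac/dc": "music", "led zeppelin": "music",
    "arsenal": "sports", "manchester united": "sports",
}

def classify_event(title):
    lowered = title.lower()
    votes = [cat for name, cat in KNOWN.items() if name in lowered]
    if not votes:
        return "unknown"          # names outside the training data fail
    return max(set(votes), key=votes.count)
```

The failure mode is exactly the one in the text: a model whose "KNOWN" names come from New York City draws a blank on a Manchester tribute-band poster.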

Lastly, the bizarre job you’ve never heard of: ‘Prompt Engineer’

As a lifelong programmer, the need to translate ideas, concepts, and requirements into natural human language to get the best results out of an LLM like ChatGPT is most unnatural to me. I’m used to doing this in various computer languages, but the opposite is odd. I recently saw a fascinating piece of AI art, which I can only describe as a beautiful feathered orange flying chicken woman. Only in 2024 do these words even go together.

But how was this art created? An AI artist designed a prompt specifying exactly the type and tone of image wanted. The resulting art is captivating and unique.

For more software-like needs, this is the Prompt Engineer job. They come up with a ChatGPT prompt that specifies not only what is wanted but also how to get it and what tone to use. With LLMs, just like the genie in the bottle, you will (probably) get what you ask for, and it may surprise you.

As a first step in this area, we started with text-to-SQL, meaning we needed to phrase a computer problem as a human language directive so that a computer could “understand” it in its language-oriented structures. For our usage, a prompt might be:

“Find events by looking in the Event database table by location, then looking up the venue in the Venue table. Pay special attention to the city and be sure not to confuse city with band name as they are sometimes the same. It is very important to return the soonest results first. Return the results in the style of a friendly guidebook.”

If you think about it, formulating a problem in human language so that a computer can understand it is a pretty ironic job. We are currently working on ways to optimize our database for this chat usage, including looking at OpenSearch and vector databases.

As we explore this more and sometimes hit frustrating walls with this incredible technology, I try to remember that we’re in a growth phase, and growth is not linear. Learning is messy, but the end results will be worth it.

I am delighted by the possibilities that AI has as a useful tool to improve our lives and optimistic that we can use it to elevate grassroots music.
