Monitoring the LLM Explosion
Insights from Marc Klingen of Langfuse on Building and Debugging AI Apps
This week we have a discussion on Langfuse, an open-source startup focused on large language model (LLM) observability. First, we speak with Max to dive deep into Langfuse and understand what's behind the product. Later, we talk to Marc Klingen, one of Langfuse's co-founders.
Is AI Engineering the new Data Science? Will LLMs become normal and boring?
I see an interesting parallel between the evolution of data science as a profession, starting around 2011-2012, and what we're seeing now with AI engineering. Back then, there was no clear path to becoming a data scientist and no university programs to study it. The first data scientists had PhDs in math, physics, and similar fields. There were a few certifications but no established industry standards. It wasn't until 2015-2016 that dedicated educational programs and jobs started to open up.
I see a similar trajectory for AI engineering. Right now, there are no set educational paths, and the industry is still defining what an AI engineer even is. But in 2-3 years, I expect the first master's programs in AI engineering to emerge. And in 5-8 years, AI won't seem so unique - it will be deeply embedded in all applications.
We saw a similar evolution with mobile. In 2010-2011, companies rushed to develop mobile apps, thinking mobile was a separate platform. Today, we rarely install new apps, and most companies focus on responsive web apps that work across devices.
In the same way, LLMs will eventually disappear into the background. They'll be available locally on devices for us to build on top of, rather than hosted on remote servers. For most use cases, we'll interact with intelligent features powered by local LLMs shipped by hardware makers. As AI engineers, we'll focus on the user experience while treating the LLMs as black boxes.
Let me know your thoughts in the comments! Do you agree LLMs and AI will become ubiquitous? Or will they hit barriers to real-world usage and fizzle out?
AI Pro Ducks on Langfuse
Takeaways
Observability is crucial for LLM applications, and companies like Langfuse provide specific tools to address the unique needs of generative AI.
Open source software can simplify the onboarding process and attract independent developers and smaller organizations.
Usage-based pricing offers flexibility and scalability, but companies should also consider the need for predictability and budget planning.
Existing ML observability players can collaborate with LLM observability companies to expand their offerings and increase their contract value.
Improving onboarding and providing education are key strategies for LLM observability companies to attract and retain users.
With Max, a VC and corporate innovation veteran, we're kicking off a new series analyzing AI products - not just the technology but the business side. If someone wants to build an AI application, what does it take? We call this series AI Pro Ducks.
We'll be looking at Langfuse this week. For those unfamiliar with the company, Langfuse offers an open-source suite to monitor LLM applications. With the ChatGPT explosion, observability is crucial as companies deploy LLMs. Langfuse lets you track metrics, errors, and more across your LLM apps.
I first ask Max what catches his eye about Langfuse as an investor. He notes the enormous potential market: as LLM adoption grows, Langfuse can grow. He also likes that it solves a clear need, one that applies equally to apps, machine learning models, and now LLMs.
We then dive into questions Max would ask Langfuse. First, he compliments their impressive young team that knows when to pivot. Strong execution is critical. He'd also look at market size and Langfuse's differentiators. What insights drove them to realize LLM observability was overlooked? And do they have early signs of traction?
I share my own experience testing Langfuse. As a new product, it has some rough edges, sure. However, the team wowed me with their responsiveness to questions and issues. This ability to iterate and improve fast matters most.
We discuss Langfuse's open-source model. I prefer it for easy testing versus demos and sales calls. Max notes open source works well for developers, but enterprises want support services and SLAs. He sees parallels to other open-source projects that grew quickly and later built monetization around business customers.
I ask how Langfuse should sell into enterprises. Max advises continuing the bottom-up developer-focused approach before building an enterprise sales motion. Support and SSO capabilities will be essential later. We also debate ideas around pricing - usage-based models versus predictable SaaS packages.
Shifting gears, we discuss defensibility. The base technology isn't too hard to replicate. But Max says Langfuse's edge will be transforming raw LLM observability data into business insights, surfacing key metrics without requiring SQL expertise. We also explore the $100B+ potential market size.
Finally, Max advises Langfuse to double down on enterprise capabilities like SSO as big companies adopt LLMs. I suggest improving onboarding and education on why observability matters. We're still early!
Let us know what other AI products or trends you want us to explore next!
Marc Klingen from Langfuse
Takeaways
Langfuse is an open-source monitoring tool for AI engineers that provides observability and analytics for complex language model applications.
Experimentation and user feedback are crucial in building successful AI applications.
The future of AI engineering lies in automating experimentation and integrating with other tools and platforms.
Building reliable and robust AI agents is a challenging task that requires unique setups and experimentation.
AI has the potential to enhance productivity and automate repetitive tasks, but it will only partially replace human expertise.
To progress in the AI engineering field, newcomers should focus on understanding the building blocks of AI, iterating quickly, and spotting problems that can be solved with AI technology.
Designing usable interfaces for AI applications is essential for user adoption and success.
Staying current in the AI field can be done by following the right people on Twitter and engaging with the AI community.
The potential of AI assistants lies in their ability to automate tasks and provide personalized services, such as meal planning.
The AI engineering field is rapidly evolving, and many opportunities exist to build innovative applications and tools.
Marc shares how Langfuse got started. Their team joined Y Combinator and built various LLM prototypes, like a microlearning course generator. Through constant experimentation, they realized the difficulty of iterating LLM apps without observability into how they perform. This insight spurred the idea for Langfuse.
Langfuse traces complex LLM app executions, attaching metadata like latency, errors, and feedback scores. You can break down metrics by end users to pinpoint quality bottlenecks. Langfuse aims to structure the process of building, monitoring, and improving LLM apps.
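To make this concrete, here's a minimal sketch of what instrumenting a single request might look like with Langfuse's Python SDK. The trace name, user ID, model, and messages are placeholder values I've made up for illustration, and exact method names differ between SDK versions, so treat this as a sketch rather than the canonical API:

```python
from langfuse import Langfuse

# The client reads LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY, and
# LANGFUSE_HOST from the environment.
langfuse = Langfuse()

# One trace per end-to-end request; user_id lets you break down
# metrics per end user, as described above.
trace = langfuse.trace(name="docs-qa", user_id="user-123")

# Record the LLM call as a generation nested inside the trace.
generation = trace.generation(
    name="answer",
    model="gpt-4",
    input=[{"role": "user", "content": "How do I set up push notifications?"}],
)

# ... call your LLM provider here ...
answer = "You can set up push notifications by ..."

# Close out the generation with the model output; latency is derived
# from the start and end timestamps.
generation.end(output=answer)

# Events are batched in the background, so flush before the process exits.
langfuse.flush()
```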
We discuss typical use cases like conversational documentation bots. Langfuse lets you replay full chat logs to see what context led to a negative user experience. This helps highlight knowledge gaps: if users ask about React Native but your docs lack it, you can add the missing content. Langfuse helps engineers understand how real-world usage differs from their assumptions.
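Continuing the sketch above, user feedback can be attached to a trace as a score, which is what lets you filter for negatively rated conversations and replay them later. The score name, value convention, and comment here are hypothetical:

```python
# Attach a thumbs-down from the user to the trace so the conversation
# can be found and replayed when hunting for knowledge gaps.
langfuse.score(
    trace_id=trace.id,
    name="user-feedback",
    value=0,  # e.g. 0 = thumbs down, 1 = thumbs up
    comment="Asked about React Native; docs had no answer",
)
```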
Marc emphasizes starting simple - using no-code tools like Langflow to easily try frameworks before getting into heavy custom coding. We discuss cautions when shipping LLM apps - perfectionism kills speed. Marc advises launching quickly with internal dogfooding to gather feedback before over-investing in robust evaluation frameworks.
Looking ahead, Marc expects more automation of LLM experimentation and open ecosystems integrating specialized tools. He's excited about process-automating enterprise use cases like agent assistants negotiating between company systems.
Marc notes LLMs already amplify individual productivity massively. While they may commoditize repetitive work, the leverage for innovation is incredible. He advises newcomers to immerse themselves in the field regularly to keep up with the rapid progress, and to leverage prebuilt modules to prototype valuable use cases for business workflows. The key is structuring output, not just chatting.
Let me know what stuck out or surprised you from Marc's insights! What are you building with or curious about in leveraging LLMs?
The AI Engineer Office Hours
Remember that we run free office hours available to all subscribers of The AI Engineer.
On our call, you can ask me about topics like:
Transitioning careers into AI Engineering
Deciphering AI buzzwords and concepts
Building your skills and resume as an aspiring AI Engineer
Putting principles into practice in real-world systems
My journey and career path so far
Or anything else related to AI and its real-world applications!
Here's How It Works
Subscribe to our newsletter: If you haven't already, make sure you're subscribed to The AI Engineer newsletter on Substack. This is your key to unlocking exclusive access to these free office hours.
Book your slot: Use this link and choose a date and time. Remember to mention the email address you used to subscribe. This will help me verify your subscription and ensure you're eligible for a free office hours call.
Receive manual confirmation: Once verified, I'll share the details to log in to the scheduled call. You'll have my undivided attention to discuss any AI engineering or career-related questions you may have.