Show HN: Opik, an open source LLM evaluation framework https://ift.tt/5gOoFcE
Hey HN! I'm Caleb, one of the contributors to Opik, a new open source framework for LLM evaluations.

Over the last few months, my colleagues and I have been working on a project to solve what we see as the most painful parts of writing evals for an LLM application. For this initial release, we've focused on a few core features that we think are the most essential:

- Simplifying the implementation of more complex LLM-based evaluation metrics, like Hallucination and Moderation.
- Enabling step-by-step tracking, such that you can test and debug each individual component of your LLM application, even in more complex multi-agent architectures.
- Exposing an API for "model unit tests" (built on Pytest), to allow you to run evals as part of your CI/CD pipelines.
- Providing an easy UI for scoring, annotating, and versioning your logged LLM data, for further evaluation or training.

It's often hard to feel like you can trust an LLM application in production, not just because of the stochastic nature of the model, but because of the opaqueness of the application itself. Our belief is that with better tooling for evaluations, we can meaningfully improve this situation, and unlock a new wave of LLM applications.

You can run Opik locally, or with a free API key via our cloud platform. You can use it with any model server or hosted model, but we currently have a built-in integration with the OpenAI Python library, which means it automatically works not just with OpenAI models, but with any model served via a compatible model server (ollama, vLLM, etc). Opik also currently has out-of-the-box integrations with LangChain, LlamaIndex, Ragas, and a few other popular tools.

This is our initial release of Opik, so if you have any feedback or questions, I'd love to hear them!

https://ift.tt/zRElFto

September 17, 2024 at 03:01AM
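To make the ideas of step-by-step tracking and "model unit tests" more concrete, here is a minimal sketch in plain Python. It does not use Opik's API: `track_step`, `TRACE`, `retrieve`, `generate`, `answer`, and `contains_score` are all hypothetical names invented for illustration, and the "model" is a stub that returns a canned string. The test function follows Pytest's naming convention, so Pytest would collect it, but it relies only on plain `assert` statements.

```python
import functools

# Hypothetical in-memory trace log; a real framework would persist and display this.
TRACE = []


def track_step(func):
    """Record each call's name, inputs, and output (illustrative, not Opik's API)."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        TRACE.append(
            {"step": func.__name__, "args": args, "kwargs": kwargs, "output": result}
        )
        return result
    return wrapper


@track_step
def retrieve(question):
    # Stand-in for a retrieval component.
    return ["Opik is an open source LLM evaluation framework."]


@track_step
def generate(question, context):
    # Stand-in for a model call; a real application would query an LLM here.
    return context[0]


def answer(question):
    # Two tracked steps, so each component can be inspected separately.
    return generate(question, retrieve(question))


def contains_score(output, expected):
    """Toy metric: 1.0 if the expected phrase appears in the output, else 0.0."""
    return 1.0 if expected.lower() in output.lower() else 0.0


def test_answer_mentions_evaluation():
    TRACE.clear()
    output = answer("What is Opik?")
    # Score the final answer with the toy metric.
    assert contains_score(output, "evaluation framework") == 1.0
    # Check that both components ran and were logged in order.
    assert [entry["step"] for entry in TRACE] == ["retrieve", "generate"]
```

Saved as a file such as `test_sketch.py` (an assumed name), this could be run with `pytest` in a CI job. Swapping the stub for a real model call and the toy metric for an LLM-based one is where a dedicated evaluation framework would come in.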