Hacker News — shad42's comments

Mendral co-founder here. What happened at PostHog is not uncommon. While building Mendral, we talked to hundreds of teams, and they all describe a similar situation. Initially they come to us to make their CI pipelines faster, but as the agent dives in, the urgency becomes keeping all pipelines reliable. It comes from growing a codebase along with its test suite. Of course it has to change eventually: splitting the test suite, running specific parts of the CI depending on the code that changed, etc. But the situation described in the article is widespread among products that grow quickly.


In some ways: we use their product and they use Mendral


100% and LLMs have tons of related training data


very interesting, curious if there is any downside to running this at scale (compute?)


I'd assume it probably depends how large and varied your logs are?

But, my guess: I could see an algorithm like that being very fast. It's basically doing a form of compression, so I'd ballpark the cost at roughly the same as just zipping the log.

Can't be anything CLOSE to the compute cost of running any part of the file through an LLM haha


We did not want to make the post engineering-focused, but we have 18 companies in production today (we wrote about PostHog in the blog). At some point we should post some case studies. The metric we track for usefulness is our monthly revenue :)


Mendral is replacing a human platform engineer. It debugs the CI logs, looks at the associated commit, looks at the implementation of the tests, etc. It then proposes fixes and takes care of opening a PR.

We wrote about how this works for PostHog: https://www.mendral.com/blog/ci-at-scale


There is a cost associated with each investigation the Mendral agent performs, and we spend time tuning the orchestration between agents. Yes, it's expensive, but we make money on top of what it costs us. So far we have been able to bring the cost down while increasing the relevance of each root-cause analysis.

We're writing another post about that specifically; we'll publish it sometime next week.


What is your pricing like? Do you do usage based pricing by any chance?


I agree. We automated in the Mendral agent what is time-consuming for a human (like debugging a flaky test), but it still needs permission to confirm the remediation and open a PR.

But it's night and day to fix your CI when someone (in this case an agent) has already dug into the logs and the code of the test and proposes options to fix it. We have several customers asking us to automate the rest (all the way to merging code), but we haven't done it for the reasons you mention. Although I'm sure we'll get there sometime this year.


Shameless plug here for Lexega—a deterministic policy enforcement layer for SQL in CI/CD :) https://lexega.com

There are bridges here that the industry has yet to figure out. There is absolutely a place for LLMs in these workflows, and what you've done with the Mendral agent is very disciplined, which is, I'd venture to say, uncommon. Leadership wants results, which pressures teams to ship things that maybe shouldn't be shipped quite yet. IMO the industry is moving faster than teams can keep up with the implications.


LLMs are better now at pulling context themselves (as opposed to your feeding everything you can into the prompt). So you can expose enough query primitives to the LLM that it's able to filter out the noise.

I don't think implementing filtering at log ingestion is the right approach, because you don't know what is noise at that stage. We spent more time thinking about the schema and indexes to make sure complex queries perform at scale.
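To make the idea concrete, here is a minimal sketch of what a "query primitive" over an indexed log store could look like. The schema, index, and `search_logs` function are all hypothetical, assumed for illustration (using SQLite in memory), not Mendral's actual design:

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ci_logs (
        run_id   TEXT,
        step     TEXT,
        level    TEXT,
        ts       INTEGER,
        message  TEXT
    );
    -- Composite index chosen for the common agent query:
    -- one run, one severity level, ordered by time.
    CREATE INDEX idx_run_level_ts ON ci_logs (run_id, level, ts);
""")

conn.executemany(
    "INSERT INTO ci_logs VALUES (?, ?, ?, ?, ?)",
    [
        ("run-42", "pytest", "INFO",  1, "collected 1200 items"),
        ("run-42", "pytest", "ERROR", 2, "test_checkout_flow failed: TimeoutError"),
        ("run-42", "lint",   "INFO",  3, "all checks passed"),
    ],
)

# A query primitive the agent can call, instead of receiving the full log dump:
# the agent decides what to filter, so nothing is thrown away at ingestion.
def search_logs(run_id, level=None, pattern=None, limit=50):
    sql, args = "SELECT step, message FROM ci_logs WHERE run_id = ?", [run_id]
    if level:
        sql += " AND level = ?"
        args.append(level)
    if pattern:
        sql += " AND message LIKE ?"
        args.append(f"%{pattern}%")
    sql += " ORDER BY ts LIMIT ?"
    args.append(limit)
    return conn.execute(sql, args).fetchall()

print(search_logs("run-42", level="ERROR"))
# → [('pytest', 'test_checkout_flow failed: TimeoutError')]
```

The point of the design is that filtering happens at query time, driven by the agent, which is why the index layout matters more than any ingestion-time filter.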


I'm sure you've heard it before: there are only two hard things in CS: cache invalidation and naming things.

In the history of this company, I can honestly say that this SQL/LLM thing wasn't the hardest :)


And the other of the two problems is off-by-one errors.

