Hacker Newsnew | past | comments | ask | show | jobs | submit | neop1x's commentslogin

Many people wanted to be able to set a spending limit on google cloud account for many years but they were unable to implement anything, always suggesting a workaround by hosting a Cloud Run function which would remove billing from a project via API https://docs.cloud.google.com/billing/docs/how-to/disable-bi...

As someone who is new to the whole google cloud ecosystem, the amount of dark patterns they employ are absolutely shocking. Just off the top of my head:

1. You never know how much a single API request will cost or did cost for the gemini api

2. It takes anywhere between 12-24 hours to tell you how much they will charge you for past aggregate requests

3. No simple way to set limits on payment anywhere in google cloud

4. Either they are charging for the batch api before even returning a result, or their "minimal" thinking mode is burning through 15k tokens for a simple image description task with <200 output tokens. I have no way of knowing which of the two it is. The tokens in the UI are not adding up to the costs, so I can only assume its the first.

5. Incomplete batch requests can't be retrieved if they expire, despite being charged.

6. A truly labyrinthine ui experience that makes modern gacha game developers blush

All I have learned here is to never, ever use a google product.


At scale, distributed API routing shouldn't call accounting transactions, that expands the availability risk surface and adds latency to all valid requests for no reason (other than helping the minority of companies/users who want their product to stop working when it is popular).

Distributed “shared nothing” API handling should make usage available to accounting, and the API handling orchestrator should have a hook that allows accounting to revoke or flag a key.

This gets the accounting transactions and key availability management out of the request handling.


That is a nice excuse, do you work at Google? :) I get the idea of not slowing down requests or risking availability, but don’t tell me a company as big as Google can’t design an asynchronous accounting system robust enough to handle this. We’re not talking about penny-perfect precision - blocking at 110% or even 150% of the set cap would be enough. Right now, though, there’s nothing to prevent a $5k, 20k or even higher bill surprise due to API key leaks, misuse or wrong configuration. To me, this is unacceptable and one of the reason I try to avoid using gcloud (the other one is unbearably slow gogole cloud console "webapp").

That’s exactly what the cloud function does

Yes but each admin has to use their product (cloud function), configure IAM and do that for every project. This is clearly just a work-around.

I haven't used these budget alerts, maybe they are a pain to implement?

https://docs.cloud.google.com/billing/docs/how-to/budgets

They are still not a spending cap of course.


reminds me: Ever used Gemini API on Google Vertex Cloud API? The usage will show up like 24-48 hours later in the dashboard. So when you use Gemini's API on their Cloud me as Workspace admin cannot even track my own usage in near realtime there. Which makes me think that even Google cannot track it in realtime.

I use a sandbox example extension with comes with Pi, it uses the anthropic sandbox runtime (bubblewrap on linux). The runtime has one bug and needs one improvement (I've made PRs, no response yet). Pi's sandbox example extension does not block internal tools (read/write) according to rules, I've created a PR but can't submit because of Pi's OSS vacation BS... https://github.com/badlogic/pi-mono/compare/main...k3a:pi-mo... I am quite happy with my patched forks for now

I just told PI to generate itself a permissioned_* equivalents of read,write,bash,edit. Now, permissioned_read,permissioned_write,permissioned_edit have full access to anything from current dir and deeper, and permissioned_bash is always permission-gated.

Default read,edit,write,bash are disabled.

It seems to work really good.

Generally, I'm in awe. I think I've already changed the way I work.


Use OpenRouter, lots of great open-weights models like MiniMax, Kimi K2, Mistral, Qwen, ...

so are people pairing OpenRouter with open agent harnesses like Pi or OpenCode?

In Czechia we have LaskaKit which is ESP32-based board with a GoodDisplay e-paper. It works perfectly with ESPHome and Home Assistant. :) https://www.laskakit.cz/en/laskakit-live-7-5-e-paper-stavebn...

From the start, I didn't understand why people pay so much for those subscriptions. Often, there are no clearly stated quotas, limits, or SLAs. For example, OpenAI states that requests are "limited" for the basic plan and "extended" for more expensive plans. It's such a joke.

I've been a pay-as-you-go API user from the beginning, carefully managing my context and usage while also leveraging cheaper, open alternatives to those overpriced "SOTA" models. I'm paying ~6 USD a month and I'm using it quite a lot.


Recent open-weights models(MiniMax, Kimi K2, GLM, Mistral) are also quite good, can be self-hosted or accessed through 3rd-party hosters or OpenRouter and they are sufficient for most of the tasks. Just stop paying overpriced "unlimited" subscription bul**hit.

Good for him! But it is possible he won't stay there for a long time. Like Geohot at Apple. There is a difference between working on a fun project which you completely control and being under a constant pressure and having to follow constrains and requirements set by managers in a corporation.

Can we stop calling LLMs "artificial intelligence"?! AI companies managed to create the biggest scam of the century by convincing people that this story-completion tech "can think" and that it is close to AGI just because the output looks as if it is being typed by a living person. DotCom-like stock crash is inevitable. (not saying LLMs are useless; they can be very helpful but not AI-companies-valuation helpful)

It's simple - if a camera sends pictures and images to a cloud (other prople's computers) you must expect that thay process and store the data (and it's often in their ToC anyway). It's sad that many non-technical people don't realize that.

There is no TikTok in China. A local version Douyin enforces strict time restrictions for users under the age of 14.

    Daily usage limit: Users under 14 are limited to 40 minutes per day on the app.

    Access hours: The app is locked from 10 p.m. to 6 a.m. for this age group.

    Enforcement: These restrictions are part of a mandatory "Youth Mode" that activates automatically for users under 14 who complete real-name authentication.

    Content focus: Youth Mode promotes educational content such as science experiments, museum tours, historical stories, and nature scenery. 

China also implemented time restrictions in games played by minors.

They are well aware of negative impacts of those shi*y services.


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: