> One way to do it would be to assign various jobs a value. Or you could use real money.
It's not the value of the outcome of that job that you're interested in, but rather its sensitivity to delay in executing it.
For example, preemptively converting YouTube videos to a lower resolution with optimal compression, so you don't have to do it in real time (when the video is played) with crappy compression (to be fast), is valuable for sure. It's just that it can be postponed for 24 hours without real impact. Executing a search for a single user is less valuable in terms of overall impact, but much more latency-sensitive.
(You can think of value and latency sensitivity as two independent dimensions.)
This idea helps save the planet for sure, but it requires cloud providers to build APIs that let devs switch from the "here's the SSH to the server, do what you want with it" model to one where the devs say instead "here's a lambda function and its desired execution latency, please schedule it for me and let me know when the result is ready" ( https://en.wikipedia.org/wiki/Inversion_of_control )
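To make that concrete, here is a minimal sketch of what such a deadline-aware submission API might look like. Everything in it is hypothetical: the Job shape, the DeadlineScheduler type, and the run-immediately policy are invented for illustration only.

    import Foundation

    // Hypothetical deadline-aware API: instead of renting a server, the dev
    // hands over work plus a latency budget and lets the provider decide
    // when (and where) to run it.
    struct Job {
        let deadline: Date             // latest acceptable completion time
        let work: () -> Data           // the "lambda": pure work, no server attached
    }

    final class DeadlineScheduler {
        // Placeholder policy: run immediately. A real provider would hold the
        // job and execute it in a low-carbon / low-load window before `deadline`.
        func submit(_ job: Job, onResult: (Data) -> Void) {
            onResult(job.work())
        }
    }

    // Usage: transcoding can wait 24 hours; a user-facing search could not.
    let scheduler = DeadlineScheduler()
    let transcode = Job(deadline: Date().addingTimeInterval(24 * 3600)) {
        Data("transcoded video".utf8)
    }
    scheduler.submit(transcode) { result in
        print("done: \(result.count) bytes")
    }

The key design point is that the deadline travels with the job, so the scheduling decision moves entirely to the provider's side.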
Google was able to do that because it owns a large share of the jobs executed in its datacenters. Hence it could build this adaptive scheduling for its own jobs quickly, without necessarily passing through a cloud-based API that inverts control of job scheduling.
I think this is much more useful to a cloud provider than to a customer.
As a customer I think you could configure something like you said using spot instances on AWS, but that's it: you're going to save a small amount of dollars in a year, and if you account for the engineering hours needed to set it up, it's maybe not really worth it.
As a cloud provider you could juggle your clients between datacenters depending on the load and the price of energy at each one. A flat rate for a cloud region means there's an opportunity for arbitrage between datacenters that could add up to thousands in extra profit on their side.
If the spot instance price is the only communication channel between you and the cloud provider, it's hard to do a good job of this. For example, if the spot price was $0.60 one hour ago and is $0.55 right now, and you still have 13 hours of latency budget left for your job's execution, should you trigger it or not? (How do you figure out your bid level?) You could use statistical data to get an intuition for the lowest price historically hit on similar days of the week, but it's inexact and overly complicated.
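To illustrate how awkward the customer-side guesswork gets, here is a sketch of a "bid from historical quantiles" heuristic. The price history, the quantile rule, and the tightening schedule are all made-up assumptions, not anything AWS offers:

    import Foundation

    // Hypothetical customer-side heuristic: trigger the job when the current
    // spot price falls below some historical quantile, or when the deadline
    // forces our hand. All numbers here are invented for illustration.
    func quantile(_ sorted: [Double], _ q: Double) -> Double {
        let idx = Int(Double(sorted.count - 1) * q)
        return sorted[idx]
    }

    func shouldTrigger(currentPrice: Double,
                       priceHistory: [Double],      // e.g. same weekday, past weeks
                       hoursOfBudgetLeft: Double) -> Bool {
        if hoursOfBudgetLeft <= 1 { return true }   // out of slack: run regardless
        let sorted = priceHistory.sorted()
        // Be pickier while we still have slack, looser as the deadline nears.
        let q = min(0.5, hoursOfBudgetLeft / 48.0)  // arbitrary tightening rule
        return currentPrice <= quantile(sorted, q)
    }

    // The $0.60 -> $0.55 example from above, with 13 hours of budget left:
    let history = [0.42, 0.47, 0.51, 0.55, 0.58, 0.60, 0.63, 0.70]
    print(shouldTrigger(currentPrice: 0.55, priceHistory: history, hoursOfBudgetLeft: 13))

The arbitrariness of the tightening rule is exactly the point: the customer is reverse-engineering information the provider already has.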
If the cloud provider is instead aware of the remaining latency budget for each job, it can do a much better job. It can look across the entire job queue in that datacenter, where each job carries its remaining time until deadline; it knows the predicted carbon/solar/wind mix for the next hours; and the whole implementation sits on the cloud provider's side (keeping the customers' lives easy and simple).
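Here is a toy version of that provider-side policy, assuming an hourly carbon-intensity forecast is available. The greedy "greenest hour before the deadline" rule is a deliberate simplification for illustration, not how any real provider schedules:

    import Foundation

    // Toy provider-side scheduler: each queued job picks the hour with the
    // lowest forecast carbon intensity among the hours before its deadline.
    struct QueuedJob {
        let name: String
        let hoursUntilDeadline: Int   // remaining latency budget, in hours
    }

    // gCO2/kWh forecast for the next hours (index 0 = this hour). Made-up numbers.
    let carbonForecast: [Double] = [480, 450, 300, 120, 90, 110, 250, 400]

    func greenestSlot(for job: QueuedJob, forecast: [Double]) -> Int {
        let window = forecast.prefix(max(1, min(job.hoursUntilDeadline, forecast.count)))
        // Greedy: run in the single greenest hour inside the deadline window.
        return window.indices.min(by: { window[$0] < window[$1] })!
    }

    let queue = [
        QueuedJob(name: "transcode-video", hoursUntilDeadline: 24),
        QueuedJob(name: "nightly-report", hoursUntilDeadline: 3),
    ]
    for job in queue {
        let slot = greenestSlot(for: job, forecast: carbonForecast)
        print("\(job.name) -> run in \(slot) hour(s), \(carbonForecast[slot]) gCO2/kWh")
    }

Note how the job with more slack lands in the genuinely greenest hour, while the tight-deadline job settles for the best hour it can still reach.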
Of course, in the end, the benefit goes to the planet, but this is much more likely to succeed if the proper cloud API exists and the implementation initiatives are aligned so that redundant implementations on the customer side are avoided.
Thing is, Google requires an absolutely stupid amount of computing resources for running their core business. YouTube transcoding is a great example and a big one for sure, but I bet they have even bigger ones in there somewhere. I have no real data to base this on (and I'm sure nobody does), but I'd bet 5:1 odds that if Google were an AWS customer, they'd be bigger than all the others combined.
So in that case, optimizing for a single customer makes perfect sense if it's the right customer.
Something sort of similar exists on iOS. BGTaskScheduler (https://developer.apple.com/documentation/backgroundtasks/bg...) is the public interface, but the internal interface allows for more subtle calculations about how long the system can wait for the job to complete and what the optimal physical condition of the phone is (e.g. battery charging, user not using the phone, wireless network connection). One of the important features is that when you run a job you have to check in fairly often (every second or so) to see if your job can still run, and if it can't, stop the job. This stops your two-hour ML training session from burning battery if the user wakes up at 2am and decides to go for a walk.
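For reference, a sketch of how that looks with the public BGTaskScheduler API; the task identifier and the training-step stubs are placeholders, and the "check in" behavior shows up here as the expirationHandler rather than literal polling:

    import BackgroundTasks

    // Placeholder identifier; it must also be listed in the app's Info.plist
    // under BGTaskSchedulerPermittedIdentifiers.
    let taskID = "com.example.ml-training"

    // Stub work units standing in for a real training loop.
    var stepsDone = 0
    func trainingIsFinished() -> Bool { stepsDone >= 1_000 }
    func runOneTrainingStep() { stepsDone += 1 }

    // 1. Register a launch handler early in app launch.
    func registerTrainingTask() {
        BGTaskScheduler.shared.register(forTaskWithIdentifier: taskID, using: nil) { task in
            runTraining(task as! BGProcessingTask)
        }
    }

    // 2. Ask the system to run the job when conditions are right.
    func scheduleTraining() {
        let request = BGProcessingTaskRequest(identifier: taskID)
        request.requiresExternalPower = true         // e.g. overnight on the charger
        request.requiresNetworkConnectivity = true
        try? BGTaskScheduler.shared.submit(request)
    }

    // 3. Do the work, stopping promptly when the system revokes permission
    // (the "check in often or stop" behavior described above).
    func runTraining(_ task: BGProcessingTask) {
        var keepRunning = true
        task.expirationHandler = { keepRunning = false } // user picked up the phone, etc.

        while keepRunning && !trainingIsFinished() {
            runOneTrainingStep()
        }
        task.setTaskCompleted(success: trainingIsFinished())
    }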