> One way to do it would be to assign various jobs a value. Or you could use real money.
It's not the value of the outcome of that job that you're interested in, but rather its sensitivity to delay in executing it.
For example, preemptively converting YouTube videos to a lower resolution with optimal compression, so you don't have to do it in real time (when the video is played) with crappy compression (to be fast), is valuable for sure. It's just that it can be postponed for 24 hours without real impact. Executing a search for a single user is less valuable in terms of overall impact, but much more latency-sensitive.
(You can think of value and latency sensitivity as two independent dimensions.)
This idea helps save the planet for sure, but it requires cloud providers to build APIs that let devs switch from the "here's the SSH to the server, do what you want with it" model to one where the devs say instead "here's a lambda function and its desired execution latency, please schedule it for me and let me know when the result is ready" ( https://en.wikipedia.org/wiki/Inversion_of_control )
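To make that concrete, here is a minimal sketch of what such a deadline-aware submission API might look like. Everything in it is hypothetical: the Job shape, the DeadlineScheduler type, and the run-immediately policy are invented for illustration only.

    import Foundation

    // Hypothetical deadline-aware API: instead of renting a server, the dev
    // hands over work plus a latency budget and lets the provider decide
    // when (and where) to run it.
    struct Job {
        let deadline: Date             // latest acceptable completion time
        let work: () -> Data           // the "lambda": pure work, no server attached
    }

    final class DeadlineScheduler {
        // Placeholder policy: run immediately. A real provider would hold the
        // job and execute it in a low-carbon / low-load window before `deadline`.
        func submit(_ job: Job, onResult: (Data) -> Void) {
            onResult(job.work())
        }
    }

    // Usage: transcoding can wait 24 hours; a user-facing search could not.
    let scheduler = DeadlineScheduler()
    let transcode = Job(deadline: Date().addingTimeInterval(24 * 3600)) {
        Data("transcoded video".utf8)
    }
    scheduler.submit(transcode) { result in
        print("done: \(result.count) bytes")
    }

The key design point is that the deadline travels with the job, so the scheduling decision moves entirely to the provider's side.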
Google was able to do that because it owns a large share of the jobs executed in its datacenters. Hence it could build this adaptive scheduling for its own jobs quickly, without necessarily passing through a cloud-based API that inverts control of job scheduling.
I think this is much more useful to a cloud provider than to a customer.
As a customer I think you could configure something like you said using spot instances on AWS, but that's it: you're going to save a small amount of dollars in a year, and if you account for the engineering hours needed to set it up, it's maybe not really worth it.
As a cloud provider you could juggle your clients between datacenters depending on the load and the price of energy at each one. A flat rate for a cloud region means there's an opportunity for arbitrage between datacenters that could add up to thousands in extra profit on their side.
If the spot instance price is the only communication channel between you and the cloud provider, it's hard to do a good job of this. For example, if the spot price was $0.60 one hour ago and is $0.55 right now, and you still have 13 hours of latency budget left for your job's execution, should you trigger it or not? (How do you figure out your bid level?) You could use statistical data to get an intuition for the lowest price historically hit on similar days of the week, but it's inexact and overly complicated.
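To illustrate how awkward the customer-side guesswork gets, here is a sketch of a "bid from historical quantiles" heuristic. The price history, the quantile rule, and the tightening schedule are all made-up assumptions, not anything AWS offers:

    import Foundation

    // Hypothetical customer-side heuristic: trigger the job when the current
    // spot price falls below some historical quantile, or when the deadline
    // forces our hand. All numbers here are invented for illustration.
    func quantile(_ sorted: [Double], _ q: Double) -> Double {
        let idx = Int(Double(sorted.count - 1) * q)
        return sorted[idx]
    }

    func shouldTrigger(currentPrice: Double,
                       priceHistory: [Double],      // e.g. same weekday, past weeks
                       hoursOfBudgetLeft: Double) -> Bool {
        if hoursOfBudgetLeft <= 1 { return true }   // out of slack: run regardless
        let sorted = priceHistory.sorted()
        // Be pickier while we still have slack, looser as the deadline nears.
        let q = min(0.5, hoursOfBudgetLeft / 48.0)  // arbitrary tightening rule
        return currentPrice <= quantile(sorted, q)
    }

    // The $0.60 -> $0.55 example from above, with 13 hours of budget left:
    let history = [0.42, 0.47, 0.51, 0.55, 0.58, 0.60, 0.63, 0.70]
    print(shouldTrigger(currentPrice: 0.55, priceHistory: history, hoursOfBudgetLeft: 13))

The arbitrariness of the tightening rule is exactly the point: the customer is reverse-engineering information the provider already has.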
If the cloud provider is instead aware of the remaining latency budget for each job, it can do a much better job. It can look across the entire job queue in that datacenter, where each job carries its remaining time until deadline; it knows the predicted carbon/solar/wind mix for the next hours; and the whole implementation sits on the cloud provider's side (keeping the customers' lives easy and simple).
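Here is a toy version of that provider-side policy, assuming an hourly carbon-intensity forecast is available. The greedy "greenest hour before the deadline" rule is a deliberate simplification for illustration, not how any real provider schedules:

    import Foundation

    // Toy provider-side scheduler: each queued job picks the hour with the
    // lowest forecast carbon intensity among the hours before its deadline.
    struct QueuedJob {
        let name: String
        let hoursUntilDeadline: Int   // remaining latency budget, in hours
    }

    // gCO2/kWh forecast for the next hours (index 0 = this hour). Made-up numbers.
    let carbonForecast: [Double] = [480, 450, 300, 120, 90, 110, 250, 400]

    func greenestSlot(for job: QueuedJob, forecast: [Double]) -> Int {
        let window = forecast.prefix(max(1, min(job.hoursUntilDeadline, forecast.count)))
        // Greedy: run in the single greenest hour inside the deadline window.
        return window.indices.min(by: { window[$0] < window[$1] })!
    }

    let queue = [
        QueuedJob(name: "transcode-video", hoursUntilDeadline: 24),
        QueuedJob(name: "nightly-report", hoursUntilDeadline: 3),
    ]
    for job in queue {
        let slot = greenestSlot(for: job, forecast: carbonForecast)
        print("\(job.name) -> run in \(slot) hour(s), \(carbonForecast[slot]) gCO2/kWh")
    }

Note how the job with more slack lands in the genuinely greenest hour, while the tight-deadline job settles for the best hour it can still reach.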
Of course, in the end, the benefit goes to the planet, but this is much more likely to succeed if the proper cloud API exists and the implementation initiatives are aligned so that redundant implementations on the customer side are avoided.
Thing is, Google requires an absolutely stupid amount of computing resources for running their core business. YouTube transcoding is a great example and a big one for sure, but I bet they have even bigger ones in there somewhere. I have no real data to base this on (and I'm sure nobody does), but I'd bet 5:1 odds that if Google were an AWS customer, they'd be bigger than all the others combined.
So in that case, optimizing for a single customer makes perfect sense if it's the right customer.
Something sort of similar exists on iOS. BGTaskScheduler (https://developer.apple.com/documentation/backgroundtasks/bg...) is the public interface, but the internal interface allows for more subtle calculations about how long the system can wait for the job to complete and what the optimal physical condition of the phone is (e.g. battery charging, user not using the phone, wireless network connection). One of the important features is that when you run a job you have to check in fairly often (every second or so) to see if your job can still run, and if it can't, stop the job. This stops your two-hour ML training session from burning battery if the user wakes up at 2am and decides to go for a walk.
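For reference, a sketch of how that looks with the public BGTaskScheduler API; the task identifier and the training-step stubs are placeholders, and the "check in" behavior shows up here as the expirationHandler rather than literal polling:

    import BackgroundTasks

    // Placeholder identifier; it must also be listed in the app's Info.plist
    // under BGTaskSchedulerPermittedIdentifiers.
    let taskID = "com.example.ml-training"

    // Stub work units standing in for a real training loop.
    var stepsDone = 0
    func trainingIsFinished() -> Bool { stepsDone >= 1_000 }
    func runOneTrainingStep() { stepsDone += 1 }

    // 1. Register a launch handler early in app launch.
    func registerTrainingTask() {
        BGTaskScheduler.shared.register(forTaskWithIdentifier: taskID, using: nil) { task in
            runTraining(task as! BGProcessingTask)
        }
    }

    // 2. Ask the system to run the job when conditions are right.
    func scheduleTraining() {
        let request = BGProcessingTaskRequest(identifier: taskID)
        request.requiresExternalPower = true         // e.g. overnight on the charger
        request.requiresNetworkConnectivity = true
        try? BGTaskScheduler.shared.submit(request)
    }

    // 3. Do the work, stopping promptly when the system revokes permission
    // (the "check in often or stop" behavior described above).
    func runTraining(_ task: BGProcessingTask) {
        var keepRunning = true
        task.expirationHandler = { keepRunning = false } // user picked up the phone, etc.

        while keepRunning && !trainingIsFinished() {
            runOneTrainingStep()
        }
        task.setTaskCompleted(success: trainingIsFinished())
    }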