Yes, but the point is that big company crawlers aren’t paying for questionably sourced residential proxies.
If this person is seeing a lot of traffic from residential IPs then I would be shocked if it’s really Amazon. I think someone else is doing something sketchy and they put “AmazonBot” in the user agent to make victims think it’s Amazon.
You can set the user agent string to anything you want, as we all know.
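To make that concrete, here is a minimal sketch using Python's standard library. The URL and the UA string are made up for illustration (this is not claimed to be the exact string Amazon's crawler sends), and nothing is actually sent over the network:

```python
from urllib.request import Request

def build_request(url: str, user_agent: str) -> Request:
    # The User-Agent header is just a client-supplied string;
    # the server has no way to verify it on its own.
    return Request(url, headers={"User-Agent": user_agent})

# Hypothetical UA string that merely contains the word "Amazonbot".
fake_ua = "Mozilla/5.0 (compatible; Amazonbot/0.1)"
req = build_request("https://example.com/", fake_ua)

# urllib stores header names capitalized as "User-agent".
print(req.get_header("User-agent"))
```

So a UA match in your logs tells you what the client claims to be, not who it is.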
You joke, but do people analyze it continuously forever also? Because if we’re being paranoid, that’s something you’d need to do in order to account for random updates that are probably happening all the time.
I used to work on malware detection at a security company, and we looked at residential IP proxy services.
They are very, very, very expensive for the amount of data you get. You are paying per bit of data. Even with Amazon's money, the numbers quickly become untenable.
It was literally cheaper for us to subscribe to business ADSL/cable/fiber optic services at our corp office buildings and trunk them together.
They worked on malware detection. The most likely reason is very obvious: if you only allow traffic from residential addresses to your Command & Control server, you make anti-malware research (which is most likely coming from either a datacenter or an office building) an awful lot harder - especially when you give non-residential IPs a different and harmless response instead of straight-up blocking them.
To add: it’s also kinda silly on the surface of it for Amazon to use consumer devices to hide their crawling traffic, but still leave “Amazonbot” in their UA string… it’s pretty safe to assume they’re not doing this.
Wild. While I'm sure the service is technically legal since it can be used for non-nefarious purposes, signing up for a service like that seems like a guarantee that you are contributing to problematic behavior.
Requests coming from residential IPs are really suspicious.
Edit: the motivation for such a DDoS might be targeting Amazon, by taking down smaller sites and making it look like Amazon is responsible.
If it is Amazon, one place to start is blocking all the IP ranges they publish. Although it sounds like there are requests outside those ranges...
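A rough sketch of that check in Python, assuming the published list has the same shape as AWS's ip-ranges.json (a "prefixes" list with "ip_prefix" entries and an "ipv6_prefixes" list with "ipv6_prefix" entries). The sample data below uses documentation-only address blocks, not real Amazon ranges, and the function names are my own:

```python
import ipaddress

def load_networks(ranges: dict) -> list:
    # Collect IPv4 and IPv6 CIDR blocks from an ip-ranges.json-style dict.
    nets = []
    for entry in ranges.get("prefixes", []):
        nets.append(ipaddress.ip_network(entry["ip_prefix"]))
    for entry in ranges.get("ipv6_prefixes", []):
        nets.append(ipaddress.ip_network(entry["ipv6_prefix"]))
    return nets

def in_published_ranges(ip: str, networks: list) -> bool:
    # True if the address falls inside any published block.
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)

# Placeholder data (RFC 5737 / RFC 3849 documentation ranges).
sample = {
    "prefixes": [{"ip_prefix": "192.0.2.0/24"}],
    "ipv6_prefixes": [{"ipv6_prefix": "2001:db8::/32"}],
}
networks = load_networks(sample)

for ip in ("192.0.2.77", "198.51.100.5", "2001:db8::1"):
    print(ip, in_published_ranges(ip, networks))
```

Run your access-log IPs for the "AmazonBot" UA through something like this: anything that lands outside the published ranges is a good sign the UA is spoofed.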