Residential vs. Hosting IPs
Why Your Firewall Is Failing to Stop Botnets
Here's the uncomfortable truth about every IP blocklist you're running in production: it's wrong about most botnets. Not because the maintainers are lazy, and not because your vendor is slow. It's wrong because by the time a botnet IP makes it onto a list, the attacker has already moved on to a fresh one, usually on an AWS, Google Cloud, or OVH instance they spun up thirty seconds ago.
The real signal isn't "is this IP bad?" — it's "what kind of network is this IP even on?" Because real humans don't sign up for your SaaS from a Hetzner datacenter in Frankfurt. They sign up from Comcast, Verizon, Orange, or whatever ISP runs the wire into their apartment. If you want to detect hosting proxies and stop bot scraping at scale, the primitive you need is one level below the blocklist: network classification.
Why IP blocklists fail against modern botnets
Think about how a blocklist gets populated. Some honeypot, somewhere, gets hit by an IP. The IP is added to the list. Your firewall pulls the list every hour (or every day, or every week, depending on your vendor). You block the IP.
Now here's what the attacker is doing during that same window:
- Spinning up a new VPS on DigitalOcean for $4/month.
- Rotating through 10,000 IPs on a commercial proxy network such as BrightData or Oxylabs (yes, the "residential proxy" vendors end up serving attackers too).
- Cycling through a /22 of hosting IPs they control, with 1,024 fresh addresses ready to burn.
By the time any single IP lands on a public blocklist, it's been used for hours or days. The attacker has already scraped your pricing page, stuffed credentials into your login, or probed your signup form for validation regexes. The blocklist catches the body after the damage is done — it's a morgue, not a security system.
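The 1,024 figure is not a guess. It follows from the prefix length: an IPv4 /22 leaves 32 - 22 = 10 host bits. A minimal sketch of that arithmetic (the function name is purely illustrative):

```javascript
// Number of IPv4 addresses covered by a CIDR prefix of the given length.
function addressesInPrefix(prefixLength) {
  if (!Number.isInteger(prefixLength) || prefixLength < 0 || prefixLength > 32) {
    throw new RangeError('prefix length must be an integer from 0 to 32');
  }
  return 2 ** (32 - prefixLength);
}

console.log(addressesInPrefix(22)); // 1024
console.log(addressesInPrefix(24)); // 256
```

Blocking those addresses one at a time means up to 1,024 separate list entries. Classifying the network they sit on takes one decision.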
Residential IP vs. data center: the signal that actually scales
The fundamental asymmetry in your favor is this: an attacker can rotate IPs infinitely, but they can't change the kind of network the IPs sit on. Cloud infrastructure is cloud infrastructure. A brand-new IP on AS16509 (Amazon) is still on Amazon. A brand-new IP on AS14061 (DigitalOcean) is still on DigitalOcean. The ASN is baked into the route announcement — it's public, it's deterministic, and it updates in minutes, not weeks.
This is why the residential IP vs. data center distinction is the single highest-leverage fraud signal most teams aren't using. It's not 1,000 features of behavioral analysis. It's one lookup. Is this IP on a network where real humans browse the internet? Or is it on a network that exists purely to run virtual machines?
Picture two IPs that have never been seen before: one freshly allocated on a hosting ASN, one on a consumer ISP. Neither is on any blocklist. Both would sail straight through a FireHOL or AbuseIPDB check. But one is obviously a bot and one is obviously a customer, because the network classification tells you so, instantly, without waiting for a single abuse report to accumulate.
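The core of that lookup can be sketched in a few lines. This is only an illustration: the table holds just the three ASNs named in this post, and both `ASN_TYPES` and `classifyAsn` are hypothetical names. A real classifier would need a maintained ASN dataset behind it.

```javascript
// Hypothetical lookup table, limited to ASNs mentioned in this post.
const ASN_TYPES = new Map([
  [16509, 'hosting'],    // Amazon
  [14061, 'hosting'],    // DigitalOcean
  [7922, 'residential'], // Comcast
]);

// Accepts "AS16509", "as16509", or a bare number; returns the network type.
function classifyAsn(asn) {
  const match = /^(?:AS)?(\d+)$/i.exec(String(asn).trim());
  if (!match) return 'unknown';
  return ASN_TYPES.get(Number(match[1])) ?? 'unknown';
}

console.log(classifyAsn('AS16509')); // hosting
console.log(classifyAsn(7922));      // residential
console.log(classifyAsn('AS64500')); // unknown
```

The point of the sketch is that the answer depends only on the ASN, so a brand-new IP gets the same label as every other address announced by that network.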
How CandycornDB's P3 classifier stops bot scraping at ingress
In our most recent release we shipped P3: ASN Classification & Proxy Detection. Every IP in our database now carries two new fields on top of the trust score: asnType and isProxy. They're surfaced in every /api/public/ip-score and /api/public/bulk-score response — no extra calls, no extra credits.
- asnType: "hosting" means cloud, VPS, datacenter, or colocation: AWS, GCP, Azure, OVH, Hetzner, Vultr, Linode, and roughly 200 other hosting ASNs we keep classified. This is your single most actionable "stop bot scraping" signal.
- asnType: "residential" means consumer broadband: Comcast, Charter, AT&T, Verizon FiOS, Orange, Deutsche Telekom, Starlink.
- asnType: "mobile" means cellular carriers: Verizon Wireless, T-Mobile, Vodafone, and the CGNAT-heavy networks behind most phone traffic.
- isProxy: true is a broader signal than asnType === "hosting". It fires when the ISP string matches a hosting keyword or when the reverse-DNS hostname contains vps, server, node, proxy, or vpn. It catches residential ISPs being actively used as proxies, a pattern that plain ASN classification would miss.
Under the hood, these fields are the output of a continuously maintained ASN database plus a hostname-override classifier. We don't just match on "is this AWS?". We check the ASN org name against a curated keyword set for each category (hosting, residential, mobile), and we apply a second pass on the PTR record to catch ISPs that are technically residential but are being abused as proxy infrastructure. The whole decision runs on every IP we clean and is cached for every subsequent query, so classification adds no extra latency to your lookup.
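To make the two-pass idea concrete, here is a rough sketch of an isProxy-style check. The hostname tokens are the ones listed above. The `HOSTING_KEYWORDS` list, the function name, and the example hostnames are assumptions made for illustration, not the production keyword set or logic.

```javascript
// Hostname tokens named in this post.
const PROXY_HOSTNAME_TOKENS = ['vps', 'server', 'node', 'proxy', 'vpn'];
// Illustrative only: the real curated keyword set is not published here.
const HOSTING_KEYWORDS = ['hosting', 'cloud', 'datacenter', 'vps', 'colo'];

// Returns true if either the ISP name or the PTR hostname trips a keyword.
function looksLikeProxy(ispName, ptrHostname) {
  const isp = (ispName || '').toLowerCase();
  const ptr = (ptrHostname || '').toLowerCase();
  const ispHit = HOSTING_KEYWORDS.some((kw) => isp.includes(kw));
  const ptrHit = PROXY_HOSTNAME_TOKENS.some((tok) => ptr.includes(tok));
  return ispHit || ptrHit;
}

console.log(looksLikeProxy('Comcast Cable', 'vpn-12.example.net'));         // true
console.log(looksLikeProxy('Comcast Cable', 'c-73-14-58-201.example.net')); // false
console.log(looksLikeProxy('Example Cloud Hosting', null));                 // true
```

Plain substring matching like this will over-match (a token such as "node" appears in plenty of innocent hostnames), which is one reason to treat the result as a signal to combine with asnType rather than as a verdict on its own.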
Real-world usage: three lines of code
Here's what this looks like in a typical login-gating flow. You call /api/public/ip-score, you read asnType, you decide. No ML, no rules engine, no vendor-specific SDK.
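    const axios = require('axios');

    // Inside an Express-style async handler: `ip` is the client address,
    // `KEY` is your API key, and `res` is the response object.
    const { data } = await axios.get(
      'https://candycorndb.com/api/public/ip-score',
      { params: { ip }, headers: { 'x-api-key': KEY } }
    );
    if (data.asnType === 'hosting' || data.isProxy) {
      // No human signs up from a datacenter IP.
      // CAPTCHA, rate-limit, or block outright.
      return res.status(403).send('Request blocked.');
    }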
That single if statement will stop the majority of scraping, credential-stuffing, fake-signup, and content-theft traffic hitting your site today. And critically, it works on IPs that have never been seen before — because the ASN is known the moment the IP is routed, not six weeks after enough victims have reported it.
When to use asnType vs. isProxy
- Gating high-trust flows (signup, checkout, password reset): block on asnType === 'hosting' || isProxy. You will lose approximately zero paying customers and block approximately all scrapers.
- Risk scoring (fraud model inputs, manual-review queues): use asnType as a categorical feature and isProxy as a boolean. The signal stacks cleanly with your existing model.
- Content gating (API keys, pricing pages, anything AI scrapers want): block on asnType === 'hosting' unless the request has a verified Googlebot rDNS. See our post on blocking AI scrapers without hurting SEO.
- Analytics cleanup (real-user traffic only): filter out all rows where asnType !== 'residential' && asnType !== 'mobile'. Watch your "users from Virginia" column quietly shrink by 30–60%.
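Those four cases fit in a handful of small functions. The sketch below is one way to encode them, not an official client: the flow names, the action strings, and the `verifiedGooglebot` flag (which your own rDNS check would have to set) are all assumptions introduced for the example.

```javascript
// Gating decision for a score response, per flow.
function decideAction(flow, score, { verifiedGooglebot = false } = {}) {
  const hosting = score.asnType === 'hosting';
  const proxy = score.isProxy === true;
  switch (flow) {
    case 'high-trust': // signup, checkout, password reset
      return hosting || proxy ? 'block' : 'allow';
    case 'content': // pricing pages and other scraper targets
      return hosting && !verifiedGooglebot ? 'block' : 'allow';
    default:
      throw new Error(`unknown flow: ${flow}`);
  }
}

// Risk scoring: pass the fields through as model features instead of blocking.
function riskFeatures(score) {
  return { asnType: score.asnType, isProxy: Boolean(score.isProxy) };
}

// Analytics cleanup: keep only rows from consumer networks.
function realUserRows(rows) {
  return rows.filter((r) => r.asnType === 'residential' || r.asnType === 'mobile');
}

const hostingHit = { asnType: 'hosting', isProxy: false };
console.log(decideAction('high-trust', hostingHit));                         // block
console.log(decideAction('content', hostingHit, { verifiedGooglebot: true })); // allow
```

Keeping the decision in a pure function, separate from the HTTP call, makes the policy easy to unit-test and easy to loosen (for example, swapping 'block' for a CAPTCHA) without touching the lookup code.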
What the data looks like end-to-end
The classifier isn't a black box. Every call returns both the label and the reasoning, so you can audit and tune. A high-risk response looks like this:
    {
      "ip": "185.220.101.44",
      "score": 98,
      "asn": "AS9009 - M247 Ltd",
      "country": "DE",
      "asnType": "hosting",
      "isProxy": true,
      "scoreReasons": [
        "Known Tor exit node (+45)",
        "Hosting/datacenter ASN (+15)",
        "Proxy/VPN signal in ASN or hostname (+20)",
        "31 of 256 /24 neighbors already flagged (+15)"
      ],
      "firstSeen": "2024-07-18T02:19:41.000Z",
      "lastSeen": "2026-04-22T14:03:09.441Z"
    }
Compare that to a clean residential IP:
    {
      "ip": "73.14.58.201",
      "score": 12,
      "asn": "AS7922 - Comcast",
      "country": "US",
      "asnType": "residential",
      "isProxy": false,
      "scoreReasons": [],
      "firstSeen": "2025-11-04T09:11:22.000Z",
      "lastSeen": "2026-04-22T06:48:03.910Z"
    }
Same schema, identical to parse, radically different decision. The full field reference lives in the Data Dictionary.
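If you want to audit the reasoning programmatically, the reason strings can be split into a label and a point value. The sketch below assumes every reason ends in "(+N)", which is inferred from the sample response above rather than from a documented format, so treat the pattern as a guess and fall back gracefully when it doesn't match.

```javascript
// Split each reason into its label and its "(+N)" contribution.
function parseReasons(scoreReasons) {
  return scoreReasons.map((reason) => {
    const match = /^(.*)\s\(\+(\d+)\)$/.exec(reason);
    return match
      ? { label: match[1], points: Number(match[2]) }
      : { label: reason, points: null }; // unrecognized format: keep the text
  });
}

const reasons = parseReasons([
  'Known Tor exit node (+45)',
  'Hosting/datacenter ASN (+15)',
  'Proxy/VPN signal in ASN or hostname (+20)',
  '31 of 256 /24 neighbors already flagged (+15)',
]);
const total = reasons.reduce((sum, r) => sum + (r.points ?? 0), 0);
console.log(total); // 95
```

Note that the listed contributions add up to 95 while the sample score is 98, so the score is evidently not a plain sum of the reasons. Use the parsed points to rank which signals mattered, not to recompute the score.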
The bottom line
Blocklists will always be one step behind. That's architectural — they can only catalog the damage after it happens. But every IP on Earth sits on a network, and every network has a type. Once you can tell "AWS" apart from "Comcast," you've already caught the traffic blocklists can't touch: brand-new IPs on hosting ASNs, rotating residential proxies with telltale PTR records, and the thousand tiny VPS providers attackers cycle through every day.
If your firewall is still running on a 2015 playbook of IP lists and User-Agent regexes, you're leaving the highest-leverage fraud signal on the table. asnType is one lookup, one field, one if statement. And it stops the attacks blocklists can't.