My IP has been blocked, can I get around it
Deep_Ocean
Posts: 553 Forumite
in Techie Stuff
I have some computer software which scrapes a website, uses the information the site provides, and supplies me with very useful information.
The problem is that the software searches around 30,000 pages of this site in an hour (as you can see, it is much quicker for the software to do this than for me to look manually). The site has now blocked me from accessing it, so my software doesn't work either.
I have next to no computer knowledge at all, but I understand that internet providers do change the last few digits of your IP address every now and then. I am with Sky and I hope that when my IP address changes again I will be able to access the site.
The problem is that there is nothing to stop them blocking me again.
I realise that the software must take up a lot of their bandwidth so I can see how it would be annoying to them. This is unfortunate.
Is there any way I can prevent myself from being blocked again?
If you wish in this world to advance, your merits you're bound to enhance; You must stir it and stump it, and blow your own trumpet, or trust me, you haven't a chance.
Comments
-
Finding arbs?

I would rent a VPS for a few pounds a month. You'll get one IP, and additional IPs are a quid a pop usually. You can get a dozen IPs, and spread the scraping over the IPs. If you can run the scripts on the server, then great, but if not you could just use the VPS as a proxy to scrape from home. If the IPs get blacklisted either get new ones from that provider or ditch the VPS and get another from elsewhere.
They say it's genetic, they say he can't help it, they say you can catch it - but sometimes you're born with it
Can't you just go through a proxy server?
-
Deep_Ocean wrote: »I realise that the software must take up a lot of their bandwidth so I can see how it would be annoying to them. This is unfortunate.
It is unfortunate for the poor sods who have legitimate use of the site. How would you feel if that happened here?
You're basically stealing intellectual property from it, as you obviously don't have permission to harvest the information.
I doubt it takes up a significant amount of their bandwidth if you're able to do it on a home line, tbh.
They say it's genetic, they say he can't help it, they say you can catch it - but sometimes you're born with it
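For a rough sense of scale, here is a back-of-the-envelope sketch of what 30,000 pages an hour works out to. The 30,000 figure is from the original post; the 50 kB average page size is purely an assumption for illustration, so the bandwidth number is only indicative.

```python
PAGES_PER_HOUR = 30_000   # figure quoted in the original post
ASSUMED_PAGE_KB = 50      # assumption: average page size, not from the thread


def requests_per_second(pages_per_hour):
    """Convert an hourly page count into requests per second."""
    return pages_per_hour / 3600


def bandwidth_mbit_per_s(pages_per_hour, page_kb):
    """Approximate throughput in Mbit/s (1 kB = 8 kbit, 1000 kbit = 1 Mbit)."""
    return requests_per_second(pages_per_hour) * page_kb * 8 / 1000


print(round(requests_per_second(PAGES_PER_HOUR), 2))                    # 8.33
print(round(bandwidth_mbit_per_s(PAGES_PER_HOUR, ASSUMED_PAGE_KB), 2))  # 3.33
```

So the raw bandwidth may well be modest, but roughly eight requests every second, sustained for an hour, is a request pattern that is easy for a site to spot.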
-
Deep_Ocean wrote: »I have some computer software which scrapes a website, uses the information the site provides, and supplies me with very useful information.
The problem is that the software searches around 30,000 pages of this site in an hour,
You ask the site owner for permission to harvest their data and for access using a programming interface rather than screenscraping.
They may charge for this.
A kind word lasts a minute, a skelped erse is sair for a day.
Agree. And you access the information from the site as you need it, rather than downloading the whole site.
Owain_Moneysaver wrote: »You ask the site owner for permission to harvest their data and for access using a programming interface rather than screenscraping.
They may charge for this.
If this were my site I wouldn't be blocking the crawler - I'd be redirecting it to some spam farm sites and letting it play on them.
Why on earth should a webmaster allow a bot unfettered access to their data? Ethical crawlers obey robots.txt directives too.
Sorry, no sympathy at all.
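As a minimal sketch of what "obeying robots.txt" can look like in practice, the snippet below uses Python's standard `urllib.robotparser`. The robots.txt content, the `ExampleBot` user agent and the `example.com` URLs are all made up for illustration; a real crawler would fetch the site's actual robots.txt instead of parsing a hard-coded string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, inlined so the example needs no network access.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""


def build_parser(robots_txt):
    """Parse robots.txt text into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(EXAMPLE_ROBOTS_TXT)

# Hypothetical user agent and URLs.
allowed = parser.can_fetch("ExampleBot", "https://example.com/prices/1")
blocked = parser.can_fetch("ExampleBot", "https://example.com/private/data")
delay = parser.crawl_delay("ExampleBot")  # seconds to wait between requests

print(allowed, blocked, delay)
```

A crawler built this way would skip any URL where `can_fetch` returns `False` and sleep for `delay` seconds between requests when a crawl delay is given.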
To be fair to the OP here... web data is public, and public data is open to being scraped.
You can't build a public road and say "only Fords can drive on it". I know that analogy is quite a bit off but you'll get what I'm saying...
I understand the webmaster's point of view: 1) they don't want to be DoS'd by bots and 2) they want to conserve bandwidth. But in reality, if the website is of any importance it will have more than enough bandwidth to handle this, with DoS protection in place that limits traffic per IP rather than blocking it.
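To illustrate "limit traffic per IP rather than blocking it", here is a small token-bucket sketch. It is only one common way of doing per-IP rate limiting, not a description of what the site in question runs; the class name, the parameters and the IP addresses (taken from the documentation-only 203.0.113.0/24 range) are all hypothetical.

```python
import time


class PerIpRateLimiter:
    """Token bucket per IP: allows short bursts, then a steady request rate."""

    def __init__(self, rate_per_second, burst, clock=time.monotonic):
        self.rate = rate_per_second   # tokens added back per second
        self.burst = burst            # maximum tokens an IP can hold
        self.clock = clock            # injectable clock, handy for testing
        self._buckets = {}            # ip -> (tokens, time of last update)

    def allow(self, ip):
        """Return True if this request is within the IP's allowance."""
        now = self.clock()
        tokens, last = self._buckets.get(ip, (self.burst, now))
        # Refill in proportion to elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1:
            self._buckets[ip] = (tokens - 1, now)
            return True
        self._buckets[ip] = (tokens, now)
        return False


limiter = PerIpRateLimiter(rate_per_second=2, burst=5)
results = [limiter.allow("203.0.113.7") for _ in range(8)]
print(results)  # the first requests pass, later ones are throttled
```

A throttled client is slowed down rather than locked out: once it backs off, its bucket refills and requests start succeeding again.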
Blocking an IP for making many requests to a web server is a pretty basic form of DoS protection, and it doesn't even work against the real thing. A proper DoS attack would come from many different sources (a botnet, so to speak), so the block is most likely just a checkbox in the web server's security configuration....
To get round it you simply need to change the IP address the site sees. There are many free proxies around which will route your connection through an external server, or as another poster mentioned, try using a virtual server for a few bob a month: save an image of the server, then every time you need a new IP bring up a new server from that image!
For people very against bots auto scraping web data... I mean come on... It's not causing any harm...
Lighten up
"The internet is a great way to get on the net."
- Bob Dole, Republican presidential candidate
https://www.hidemyass.com
Also, why not turn your router off and then back on so you get a new IP from your ISP's DHCP server?
Tbh omen most give you a longer lease these days so you'll just get the same IP..
Depends on the connection really.
"The internet is a great way to get on the net."
- Bob Dole, Republican presidential candidate
This discussion has been closed.