Programming Query

Slim
Slim Posts: 77 Forumite
I'm thinking of learning a bit of programming. Specifically, I'd like to learn how to write something that can go to web sites and extract information into, say, an Excel spreadsheet. I know that the Visual Basic software that comes with standard Excel will go to specified web pages to extract information, but what software has the ability to search for information on, say, a larger web site such as FT.com or eBay?

Comments

  • Lord_Chris
    Lord_Chris Posts: 358 Forumite
    hmm... unless you're really serious about learning a programming language, abandon your plans - it's NOT easy. If you are serious:

    be very patient... I think probably the best programming language for beginners is C++ or Java - I started learning C++ a few months ago, but GCSEs have pushed that aside for the last couple of months, and I don't know if I can actually be bothered to pick up the book again lol.
  • wolfman
    wolfman Posts: 3,225 Forumite
    Have a look at .Net 2.0. You could use something like C#. There are plenty of resources and tutorials at https://www.asp.net and you can download Visual Studio Express for free from Microsoft.

    Screen scraping is quite tricky. You're basically making a web request to a set URL, then filtering through the response (i.e. the HTML page) to find the bits of data you want.

    Regular expressions (google it) are something you'll want to use to identify set pieces of information, no matter what language you choose (doing the same job with manual string manipulation tends to be more processor- and memory-intensive, and generally slower).

    I started doing something similar myself for https://www.imdb.com. Just be aware that some sites aren't too keen on people doing such things.
    "Boonowa tweepi, ha, ha."
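    A minimal sketch of the request-then-filter approach wolfman describes above, written in Python rather than C# purely to keep it short. The `<td class="price">` markup, the sample page and the function names are all made up for illustration; a real site's HTML will differ, and the pattern would have to be written against it.

```python
import re
from urllib.request import urlopen


def fetch_html(url):
    """Make a web request to a set URL and return the response body as text."""
    with urlopen(url, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Hypothetical markup: each price sits in <td class="price">12.50</td>.
PRICE_PATTERN = re.compile(r'<td class="price">\s*([0-9]+(?:\.[0-9]+)?)\s*</td>')


def extract_prices(html):
    """Filter the HTML with a regular expression and return the prices found."""
    return [float(match) for match in PRICE_PATTERN.findall(html)]


# Stand-in for a downloaded page, so the example runs without a network.
SAMPLE_HTML = """
<table>
  <tr><td class="item">Kettle</td><td class="price">12.50</td></tr>
  <tr><td class="item">Toaster</td><td class="price"> 7 </td></tr>
</table>
"""

print(extract_prices(SAMPLE_HTML))  # [12.5, 7.0]
```

    `fetch_html` is defined but not called here, so the example runs offline; for a live page you would pass its result to `extract_prices` in place of `SAMPLE_HTML`.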
  • Chippy_Minton
    Chippy_Minton Posts: 3,339 Forumite
    I've done something similar myself using Excel VBA to scrape data from a web page, though not the searching bit. Almost any programming language could do the job. However, since you want to extract the data into an Excel spreadsheet, VBA is a good choice, particularly for a beginner, and you could get it working relatively quickly.

    Whichever language you choose you'll need to use the Microsoft HTML Object Library (MSHTML) or possibly Microsoft XML (MSXML). These give you access to the Document Object Model (DOM) and you can programmatically populate search fields, click search buttons and access data in the HTML.

    As others have said, this isn't an easy task and if the web site design changes in even the smallest way your code may not work properly and will also need modifying.
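    MSHTML and MSXML are Windows COM libraries, so the sketch below swaps in Python's standard `html.parser` module as a stand-in to show the same idea: work from the page structure rather than the raw text, pull out the table cells, and write them as CSV, which Excel can open. The `TableRows` class, the `table_to_csv` function and the sample table are hypothetical, and the sketch does not populate search fields or click buttons.

```python
import csv
import io
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_csv(html):
    """Parse the HTML and return its table rows as CSV text."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Made-up page fragment standing in for a real search result.
SAMPLE = (
    "<table><tr><th>Item</th><th>Price</th></tr>"
    "<tr><td>Kettle</td><td>12.50</td></tr></table>"
)

print(table_to_csv(SAMPLE))
```

    The parser depends on the page keeping its `<tr>`/`<td>` layout, which is exactly the fragility mentioned above: if the site is redesigned, the extraction code has to change with it.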
  • Lord_Chris
    Lord_Chris Posts: 358 Forumite
    since I'm not actually a programmer, there's a good chance I'm wrong, but I think I read on the net somewhere that a lot of Microsoft's programming programs are unstable... you might want to find an alternative to Microsoft

    is that right? I don't actually know, maybe it's just my immense dislike of Microsoft (coming from somebody who uses a ton of their software :P)
  • wolfman
    wolfman Posts: 3,225 Forumite
    Far from it. Have a go with Visual Studio (even the older .Net 2002 and 2003 versions). It's excellent, one of the best programming applications about. The most recent version, named just "Visual Studio", is very good. I work with it every day and have never found a bug or problem with it.
    "Boonowa tweepi, ha, ha."
  • IvanOpinion
    IvanOpinion Posts: 22,136 Forumite
    Lord_Chris wrote:
    since I'm not actually a programmer, there's a good chance I'm wrong, but I think I read on the net somewhere that a lot of Microsoft's programming programs are unstable... you might want to find an alternative to Microsoft

    is that right? I don't actually know, maybe it's just my immense dislike of Microsoft (coming from somebody who uses a ton of their software :P)
    Any development environment can be unstable, and Microsoft's, in my experience, is one of the flakier environments I have come across (but then it does a lot more than some others). We develop using Visual Studio .Net, SQL Server, C#, SPPS, CMS and CRM and have come across many, many problems (it is fine doing the simple stuff, but when you start to push it the holes start appearing). That, however, will not stop a professional programmer ... that is part of the skill ... finding alternative ways of doing things when your environment lets you down.

    Those that spend their time on the web whingeing are either amateurs looking for a bit of a fiddle or just simply Microsoft bashers and should not be taken that seriously.

    I should also add that I am not a great Microsoft fan but you have to follow the money and try to have skills that suit the market. Therefore I agree with wolfman and his pick of languages.

    Ivan
    I don't care about your first world problems; I have enough of my own!
This discussion has been closed.