C# - Best way to scrape source code from a webpage?

- February 15, 2014

I'm working on a C# app. What is the best way to scrape the source code from a webpage?

Right now, I'm viewing the page source in my browser (Chrome), copying and pasting it into a text file, and then sucking that into my parser.

I was thinking I'd first create a textbox in my application where I'd be able to paste a URL. The application would then pull that page's source code and pass it to my parser.

I'd consider HtmlAgilityPack. You can download a page like this:

HtmlDocument document = new HtmlDocument();
document.LoadHtml(new WebClient().DownloadString("http://www.bing.com"));

If you are looking for a parser as well, I've had experience with ScrapySharp, which adds extension methods to HtmlAgilityPack's HtmlDocument so you can select elements on a page using CSS selectors like the ones you'd find in jQuery, like this:

document.DocumentNode.CssSelect(".sessions .main-head-row td.download a.text-pdf")
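
Putting the pieces together for the textbox workflow described in the question, a minimal sketch might look like the following. It assumes the HtmlAgilityPack and ScrapySharp NuGet packages are referenced; the `PageScraper` class and its method names are hypothetical helpers made up for illustration, not part of either library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using HtmlAgilityPack;          // assumed NuGet package: HtmlAgilityPack
using ScrapySharp.Extensions;   // assumed NuGet package: ScrapySharp (provides CssSelect)

// Hypothetical helper class; the names here are illustrative only.
public static class PageScraper
{
    // Accepts only absolute http/https URLs, e.g. text pasted into a textbox.
    public static bool TryParseHttpUrl(string input, out Uri url)
    {
        return Uri.TryCreate(input?.Trim(), UriKind.Absolute, out url)
            && (url.Scheme == Uri.UriSchemeHttp || url.Scheme == Uri.UriSchemeHttps);
    }

    // Downloads the page source and parses it into an HtmlDocument.
    public static HtmlDocument Load(Uri url)
    {
        using (var client = new WebClient())
        {
            var document = new HtmlDocument();
            document.LoadHtml(client.DownloadString(url));
            return document;
        }
    }

    // Returns the trimmed inner text of every element matching the CSS selector.
    public static List<string> SelectText(HtmlDocument document, string cssSelector)
    {
        return document.DocumentNode
            .CssSelect(cssSelector)
            .Select(node => node.InnerText.Trim())
            .ToList();
    }
}
```

In a button click handler you would pass the textbox text to `TryParseHttpUrl`, and on success call `Load` followed by `SelectText` with whatever selector fits the target page. On newer .NET versions `WebClient` is marked obsolete and produces a compiler warning, so `HttpClient` may be the better choice there.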
