C# - Best way to scrape source code from a webpage?
I'm working on a C# app. What is the best way to scrape the source code from a webpage?
Right now, I'm viewing the page source in my browser (Chrome), copying and pasting it into a text file, and sucking that into my parser.
I'm thinking I'd first create a textbox in my application where I'd be able to paste a URL. The application would then pull that page's source code and pass it to the parser.
I'd consider HtmlAgilityPack. You can download a page like this:
HtmlDocument document = new HtmlDocument();
document.LoadHtml(new WebClient().DownloadString("http://www.bing.com"));
If you're looking for a parser as well, I've had experience with ScrapySharp, which adds extension methods to HtmlAgilityPack's HtmlDocument so you can select elements on the page using CSS selectors like the ones you'd find in jQuery, like this:
document.DocumentNode.CssSelect(".sessions .main-head-row td.download a.text-pdf")
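To connect this to the textbox idea from the question, here is a minimal sketch of the "take a URL, pull the page source, hand it to the parser" step. It is only an illustration: the names PageFetcher and FetchSourceAsync are made up for this example, and it uses HttpClient from the base class library in place of WebClient, which is an assumption on my part and not what the snippet above uses. The URL would come from your textbox; here it is read from the command line to keep the example self-contained.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper, not part of HtmlAgilityPack or ScrapySharp.
static class PageFetcher
{
    // One shared HttpClient for the lifetime of the application.
    private static readonly HttpClient Client = new HttpClient();

    // Downloads the raw HTML of the page at the given URL.
    public static async Task<string> FetchSourceAsync(string url)
    {
        // Reject anything that is not an absolute http(s) URL before making a request.
        if (!Uri.TryCreate(url, UriKind.Absolute, out Uri uri) ||
            (uri.Scheme != Uri.UriSchemeHttp && uri.Scheme != Uri.UriSchemeHttps))
        {
            throw new ArgumentException("Expected an absolute http(s) URL.", nameof(url));
        }

        return await Client.GetStringAsync(uri);
    }
}

static class Program
{
    static async Task Main(string[] args)
    {
        // In a GUI app this value would come from the textbox instead.
        string url = args.Length > 0 ? args[0] : "http://www.bing.com";

        string html = await PageFetcher.FetchSourceAsync(url);

        // Pass html to your parser here, for example HtmlDocument.LoadHtml(html).
        Console.WriteLine($"Downloaded {html.Length} characters from {url}");
    }
}
```

The returned string can be passed straight to LoadHtml as in the snippet above, so the parsing side stays unchanged.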