C++ - Get the HTML of a site
I'm trying to get the HTML of a page as a string (or char[]). I know how to use basic sockets and how to connect a client and a server.
I've written a client in the past that gets an IP and port, connects to it, and sends images and such between client and server using sockets.
I've searched the internet a bit and found that I can connect to the website, send a request, get the HTTP content of the page and store it in a variable, though I have a few problems:
1) I'm trying to get the HTML of a page that isn't the main page of the site: not stackoverflow.com, but stackoverflow.com/help and such (not the "official" page of the site, but a page inside the site).
2) I'm not sure how to either send the request or store the data I get back from it.
I saw there are outside libraries I could use, but I'd rather use sockets only.
By the way, I'm using Windows 7, and I aim for it to work on Windows only (so it's fine if it won't work on Linux).
Thanks for your help! :)
To access a resource on a host, you specify the path to the resource in the first line of the request, after 'GET'. For example (see http://www.jmarshall.com/easy/http/#http1.1):
    GET /path/file.html HTTP/1.1
    Host: www.host1.com:80
    [blank line here]
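For the sockets-only route, here is a minimal sketch of the string handling around that request. It does no networking at all: the helper names (split_url, build_get_request, extract_body) are made up for illustration, and it assumes a plain http:// URL and a response that is not chunk-encoded. You would pass the request string to send() and append every chunk returned by recv() to a std::string before calling extract_body.

```cpp
#include <string>
#include <utility>

// Hypothetical helper: split "http://host/path" into {host, path}.
// Assumes a plain http:// URL; a missing path becomes "/".
std::pair<std::string, std::string> split_url(const std::string& url) {
    const std::string scheme = "http://";
    std::size_t start =
        url.compare(0, scheme.size(), scheme) == 0 ? scheme.size() : 0;
    std::size_t slash = url.find('/', start);
    if (slash == std::string::npos)
        return {url.substr(start), "/"};
    return {url.substr(start, slash - start), url.substr(slash)};
}

// Hypothetical helper: build the text of a GET request for the given path.
// "Connection: close" asks the server to close the socket when it is done,
// so the client can simply read until recv() reports end of stream.
std::string build_get_request(const std::string& host, const std::string& path) {
    return "GET " + path + " HTTP/1.1\r\n"
           "Host: " + host + "\r\n"
           "Connection: close\r\n"
           "\r\n";
}

// Hypothetical helper: return everything after the blank line that ends
// the response headers. Returns an empty string if no blank line is found.
std::string extract_body(const std::string& response) {
    std::size_t pos = response.find("\r\n\r\n");
    if (pos == std::string::npos)
        return "";
    return response.substr(pos + 4);
}
```

For stackoverflow.com/help this gives the host "stackoverflow.com" (the name you resolve and connect to on port 80) and the path "/help" (what goes after GET). One caveat: an HTTP/1.1 server may answer with chunked transfer encoding, which this sketch does not decode; sending HTTP/1.0 in the request line instead is a simple way to avoid that.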
I'd recommend using a portable library like Boost.Asio instead of raw sockets. I'd also recommend using an existing, portable library that implements the HTTP protocol, unless of course this is a matter of learning how to implement it yourself.
Even if you want to implement it yourself, it's worth knowing the existing solutions. For instance, this is how you can get a webpage using cpp-netlib (http://cpp-netlib.org/0.10.1/index.html):
    using namespace boost::network;
    using namespace boost::network::http;

    client::request request_("http://127.0.0.1:8000/");
    request_ << header("Connection", "close");
    client client_;
    client::response response_ = client_.get(request_);
    std::string body_ = body(response_);
This is how you can do it using the curl library (http://curl.haxx.se/libcurl/c/simple.html):
    #include <stdio.h>
    #include <curl/curl.h>

    int main(void)
    {
      CURL *curl;
      CURLcode res;

      curl = curl_easy_init();
      if(curl) {
        curl_easy_setopt(curl, CURLOPT_URL, "http://example.com");
        /* example.com is redirected, so we tell libcurl to follow redirection */
        curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L);

        /* Perform the request, res will get the return code */
        res = curl_easy_perform(curl);
        /* Check for errors */
        if(res != CURLE_OK)
          fprintf(stderr, "curl_easy_perform() failed: %s\n",
                  curl_easy_strerror(res));

        /* always cleanup */
        curl_easy_cleanup(curl);
      }
      return 0;
    }
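The curl example above writes the page to stdout. Since the goal here is to end up with the HTML in a string, a write callback is one way to collect it. The sketch below shows only the callback; the name append_to_string is made up, and it assumes the pointer registered with CURLOPT_WRITEDATA is a std::string*.

```cpp
#include <cstddef>
#include <string>

// Has the shape libcurl expects for a CURLOPT_WRITEFUNCTION callback.
// libcurl may call it many times, once per chunk of received data.
// Assumption: userdata is the std::string* passed via CURLOPT_WRITEDATA.
std::size_t append_to_string(char* data, std::size_t size, std::size_t nmemb,
                             void* userdata) {
    std::size_t bytes = size * nmemb;
    static_cast<std::string*>(userdata)->append(data, bytes);
    // Returning a count different from `bytes` tells libcurl the write failed.
    return bytes;
}
```

To use it, declare a std::string html; before curl_easy_perform and register both pieces: curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, append_to_string); and curl_easy_setopt(curl, CURLOPT_WRITEDATA, &html);. After a successful curl_easy_perform, html should hold the page.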
Both libraries are portable, but if you'd rather use a Windows-specific API you might check out WinINet (http://msdn.microsoft.com/en-us/library/windows/desktop/aa383630%28v=vs.85%29.aspx), though it's less pleasant to use.