Java - Getting some data from HTML using regex


I am trying to extract data from HTML. Here is my code:

    public static void main(String[] args) {
        final String str = "<div class=\"b-vacancy-list-salary\">\n"
                + "            50 000\n"
                + "             70 000\n"
                + "             usd.\n"
                + "        </div>";
        System.out.println(Arrays.toString(getTagValues(str).toArray()));
    }

    static final String TAG = "<div class=\"b-vacancy-list-salary\">\n";
    private static final Pattern TAG_REGEX = Pattern.compile(TAG + "(.+?)</div>");

    private static List<String> getTagValues(final String str) {
        System.out.println(TAG);
        final List<String> tagValues = new ArrayList<String>();
        final Matcher matcher = TAG_REGEX.matcher(str);
        while (matcher.find()) {
            tagValues.add(matcher.group(1));
        }
        return tagValues;
    }

It returns [] instead of the value. What's wrong?

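By default, `.` in a Java regex does not match line terminators, so `(.+?)` cannot span the multi-line content between the tags. You can remove the line feeds before matching.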
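As an alternative to removing the line feeds, the pattern can be compiled with `Pattern.DOTALL`, which lets `.` match line terminators as well. The following is only a minimal sketch of that idea: the class name `DotAllExample`, the use of `Pattern.quote`, and the whitespace collapsing are my own assumptions, not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name, used only for this illustration.
class DotAllExample {
    // Opening tag without a trailing "\n"; quoted below so it is matched literally.
    static final String TAG = "<div class=\"b-vacancy-list-salary\">";

    // DOTALL makes '.' match line terminators too, so (.+?) can span several lines.
    private static final Pattern TAG_REGEX =
            Pattern.compile(Pattern.quote(TAG) + "(.+?)</div>", Pattern.DOTALL);

    static List<String> getTagValues(final String str) {
        final List<String> tagValues = new ArrayList<>();
        final Matcher matcher = TAG_REGEX.matcher(str);
        while (matcher.find()) {
            // Assumption: collapsing whitespace runs into single spaces is the desired output.
            tagValues.add(matcher.group(1).trim().replaceAll("\\s+", " "));
        }
        return tagValues;
    }

    public static void main(String[] args) {
        final String str = "<div class=\"b-vacancy-list-salary\">\n"
                + "            50 000\n"
                + "             70 000\n"
                + "             usd.\n"
                + "        </div>";
        System.out.println(getTagValues(str)); // prints [50 000 70 000 usd.]
    }
}
```

With this flag the input string does not have to be modified before matching, which may matter if the line breaks are needed elsewhere.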

A better way to parse HTML is to use a DOM parser or XPath.

For example, the regex approach with the line feeds removed:

    public static void main(String[] args) {
        final String str = "<div class=\"b-vacancy-list-salary\">\n"
                + "            50 000\n"
                + "             70 000\n"
                + "             usd.\n"
                + "        </div>";
        System.out.println(Arrays.toString(getTagValues(str).toArray()));
    }

    // The tag no longer ends with "\n", because line feeds are stripped from the input.
    static final String TAG = "<div class=\"b-vacancy-list-salary\">";
    private static final Pattern TAG_REGEX = Pattern.compile(TAG + "(.+?)</div>");

    private static List<String> getTagValues(final String str) {
        System.out.println(TAG);
        final List<String> tagValues = new ArrayList<String>();
        // Remove line feeds so that '.' can match the whole content between the tags.
        final Matcher matcher = TAG_REGEX.matcher(str.replace("\n", ""));
        while (matcher.find()) {
            tagValues.add(matcher.group(1).trim());
        }
        return tagValues;
    }
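The XPath suggestion above could look roughly like the sketch below, which uses only the JDK's `javax.xml.xpath` API. It assumes the input is well-formed XML, which happens to be true for this fragment but often is not for real-world HTML, where a dedicated HTML parser would be the safer choice. The class name `XPathExample` and the method `getSalary` are hypothetical names chosen for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical class name, used only for this illustration.
class XPathExample {
    // Assumption: the input is well-formed XML with a single root element.
    static String getSalary(final String html) {
        final XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            // normalize-space() trims the text and collapses internal whitespace runs.
            return xpath.evaluate(
                    "normalize-space(//div[@class='b-vacancy-list-salary'])",
                    new InputSource(new StringReader(html)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Input could not be parsed as XML", e);
        }
    }

    public static void main(String[] args) {
        final String str = "<div class=\"b-vacancy-list-salary\">\n"
                + "            50 000\n"
                + "             70 000\n"
                + "             usd.\n"
                + "        </div>";
        System.out.println(getSalary(str)); // prints 50 000 70 000 usd.
    }
}
```

Unlike the regex versions, this does not depend on the exact spelling of the opening tag, such as attribute order or extra whitespace inside it.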
