ruby - How to parse the attribute and text value of a tag in an HTML file at the same time with Nokogiri?
Say I have an HTML file called ex.html with the following content:
<ul>
  <li data-value="datav1">val1</li>
  <li data-value="datav2">val2</li>
  <li data-value="datav3">val3</li>
</ul>
I want to extract the attribute data-value and the text value line by line, and output the result as below:
datav1:val1
datav2:val2
datav3:val3
However, I'm new to Nokogiri and only know the code below, which can extract the attribute data-value. I don't know how to extract the attribute and the text value in the same loop.
require 'nokogiri'

# Parse the file, then print each data-value attribute node
page_temp = Nokogiri::HTML(File.open("ex.html"))
page_temp.xpath('//li/@data-value').each do |node|
  puts node
end
I'd appreciate it if someone could teach me how to make this work with Nokogiri, and it would be even better if there were another solution using a shell script.
Update
Thanks @rajarshi das and @arun kumar, your answers partly solved my problem. The remaining problem is that node.text contains Chinese characters, and they are unrecognizable when I print them out in the terminal. I tried printing page_temp right after executing page_temp = Nokogiri::HTML(open("ex.html")), and found that the Chinese characters appear as €. I guess I am reading the ex.html file incorrectly in Ruby.
This should do it.

page_temp.xpath('//li').each do |node|
  puts "#{node['data-value']}:#{node.text}"
end
The code is self-explanatory, but let me explain. You're looping over the li elements and printing the value of the data-value attribute along with the text contained in each li element.