Regex Matching Words in XML Between Two Other Words (Matlab regexp)
I'm trying to parse an XML file using MATLAB's regexp. I want to retrieve an array of the occurrences of the word "curvepoint" that fall between "deposits" and "/deposits". For the XML below, the result should be a [6x1] array like
"
<curvepoint> <curvepoint> <curvepoint> <curvepoint> <curvepoint> <curvepoint>
"
My attempt below doesn't work because there is lots of other text interspersed between each occurrence of "curvepoint" and the look-ahead/look-behind anchors, and I don't know how to handle this.
regexp(xmltext,'(?<=<deposits>)(<curvepoint>)(?=</deposits>)','match')'
where xmltext is:
<?xml version="1.0" encoding="utf-8"?> <interestratecurve> <effectiveasof>2016-11-07</effectiveasof> <currency>eur</currency> <baddayconvention>m</baddayconvention> <deposits> <daycountconvention>act/360</daycountconvention> <snaptime>2016-11-04t15:00:00.000z</snaptime> <spotdate>2016-11-09</spotdate> <calendars> <calendar>none</calendar> </calendars> <curvepoint> <tenor>1m</tenor> <maturitydate>2016-12-09</maturitydate> <parrate>-0.00373</parrate> </curvepoint> <curvepoint> <tenor>2m</tenor> <maturitydate>2017-01-09</maturitydate> <parrate>-0.00339</parrate> </curvepoint> <curvepoint> <tenor>3m</tenor> <maturitydate>2017-02-09</maturitydate> <parrate>-0.00312</parrate> </curvepoint> <curvepoint> <tenor>6m</tenor> <maturitydate>2017-05-09</maturitydate> <parrate>-0.00213</parrate> </curvepoint> <curvepoint> <tenor>9m</tenor> <maturitydate>2017-08-09</maturitydate> <parrate>-0.0013</parrate> </curvepoint> <curvepoint> <tenor>1y</tenor> <maturitydate>2017-11-09</maturitydate> <parrate>-0.00071</parrate> </curvepoint> </deposits> <swaps> <fixeddaycountconvention>30/360</fixeddaycountconvention> <floatingdaycountconvention>act/360</floatingdaycountconvention> <fixedpaymentfrequency>1y</fixedpaymentfrequency> <floatingpaymentfrequency>6m</floatingpaymentfrequency> <snaptime>2016-11-04t15:00:00.000z</snaptime> <spotdate>2016-11-09</spotdate> <calendars> <calendar>none</calendar> </calendars> <curvepoint> <tenor>2y</tenor> <maturitydate>2018-11-09</maturitydate> <parrate>-0.00157</parrate> </curvepoint> <curvepoint> <tenor>3y</tenor> <maturitydate>2019-11-09</maturitydate> <parrate>-0.00115</parrate> </curvepoint> <curvepoint> <tenor>4y</tenor> <maturitydate>2020-11-09</maturitydate> <parrate>-0.00059</parrate> </curvepoint> <curvepoint> <tenor>5y</tenor> <maturitydate>2021-11-09</maturitydate> <parrate>0.00017</parrate> </curvepoint> <curvepoint> <tenor>6y</tenor> <maturitydate>2022-11-09</maturitydate> <parrate>0.00108</parrate> </curvepoint> <curvepoint> 
<tenor>7y</tenor> <maturitydate>2023-11-09</maturitydate> <parrate>0.0021</parrate> </curvepoint> <curvepoint> <tenor>8y</tenor> <maturitydate>2024-11-09</maturitydate> <parrate>0.00316</parrate> </curvepoint> <curvepoint> <tenor>9y</tenor> <maturitydate>2025-11-09</maturitydate> <parrate>0.00419</parrate> </curvepoint> <curvepoint> <tenor>10y</tenor> <maturitydate>2026-11-09</maturitydate> <parrate>0.00513</parrate> </curvepoint> <curvepoint> <tenor>12y</tenor> <maturitydate>2028-11-09</maturitydate> <parrate>0.00673</parrate> </curvepoint> <curvepoint> <tenor>15y</tenor> <maturitydate>2031-11-09</maturitydate> <parrate>0.00838</parrate> </curvepoint> <curvepoint> <tenor>20y</tenor> <maturitydate>2036-11-09</maturitydate> <parrate>0.00966</parrate> </curvepoint> <curvepoint> <tenor>30y</tenor> <maturitydate>2046-11-09</maturitydate> <parrate>0.01006</parrate> </curvepoint> </swaps> </interestratecurve>
Never use regex to parse XML. At best, the solution will be brittle. Use a real XML parser instead.
In MATLAB, use the xmlread, xmlwrite, and xslt functions to read, write, and transform XML.
Note that the MathWorks blog has XML posts on using these functions in MATLAB.
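As a rough illustration of the parser-based approach, here is a minimal sketch that collects only the curvepoint elements nested inside deposits. It assumes xmltext holds the XML string from the question, that the tag names are exactly as shown there, and that your MATLAB release lets xmlread accept an org.xml.sax.InputSource in place of a file name. The variable names (doc, deposits, points, tenors, parrates) are only illustrative.

```matlab
% Assumption: xmltext is a char vector holding the XML shown in the question.
% xmlread normally takes a file name; wrapping the text in a Java
% StringReader/InputSource is a common way to parse from memory instead.
src = org.xml.sax.InputSource(java.io.StringReader(xmltext));
doc = xmlread(src);

% Restrict the search to the first <deposits> element so that the
% <curvepoint> elements under <swaps> are not picked up.
deposits = doc.getElementsByTagName('deposits').item(0);
points   = deposits.getElementsByTagName('curvepoint');

n        = points.getLength;   % number of curve points inside <deposits>
tenors   = cell(n, 1);
parrates = zeros(n, 1);

for k = 1:n
    p = points.item(k - 1);    % DOM node lists are zero-based
    tenors{k}   = char(p.getElementsByTagName('tenor').item(0).getTextContent);
    parrates(k) = str2double(char(p.getElementsByTagName('parrate').item(0).getTextContent));
end
```

If xmlread in your version only accepts a file name, writing xmltext to a temporary file first and passing that path should work the same way. Because the lookup is scoped to the deposits node, nothing depends on how much other text sits between the tags, which is what made the regex attempt fragile.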