c++ - Validation for error -


While developing a personal library, I stumbled upon what I think is an error inside libstdc++6.

Because I'm quite sure the library has been reviewed by a lot of more highly skilled people, I came here to validate my finding and to get assistance on further steps.

Consider the following code:

#include <regex>
#include <iostream>

int main()
{
    std::string uri = "http://example.com/test.html";
    std::regex reg(...);
    std::smatch match;
    std::regex_match(uri, match, reg);
    for (auto& e : match)
    {
        std::cout << e.str() << std::endl;
    }
}

I have written a regex to parse a URL into:

  • protocol
  • user/pass (optional)
  • host
  • port (optional)
  • path (optional)
  • query (optional)
  • location (optional)

I used the following regex (in C++):

std::regex reg("^(.+):\\/\\/(.+@)?([a-zA-Z\\.\\-0-9]+)(:\\d{1,5})?([^?\\n\\#]*)(\\?[^#\\n]*)?(\\#.*)?$");

This worked fine in an online tester and with MSVC++ 2015 Update 3, but it fails on my build host: the host part matches both the host and the path.

Build host:

g++ (Ubuntu 5.4.0-6ubuntu1~16.04.2) 5.4.0 20160609

libstdc++6:amd64 5.4.0-6ubuntu1~16.04.2

I consider this an error because if I change the regex to this:

std::regex reg("^(.+):\\/\\/(.+@)?([a-zA-Z\\.0-9\\-]+)(:\\d{1,5})?([^?\\n\\#]*)(\\?[^#\\n]*)?(\\#.*)?$");

it works fine, although it should behave the same.

Failing regex: https://ideone.com/7n2jdk

Working regex: https://ideone.com/6nmpuw

Am I missing something important here, or is this an error within libstdc++6?

The difference is in the character class:

[a-zA-Z\\.\\-0-9] // not working
[a-zA-Z\\.0-9\\-] // working

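This is a bug because "[.\\-0]" should be parsed as a character class matching a character that is either . or - (since the hyphen is escaped with a literal \) or 0. For an unknown reason, the hyphen is parsed as a range operator, and the [a-zA-Z\\.\\-0-9]+ subexpression becomes equal to [a-zA-Z.-0-9]+. The range .-0 also covers /, which is why the host group swallows the path. See this regex demo.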
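To see what the range reading implies without needing an affected library, the range can be written out explicitly. The following is only a sketch: accepted_chars is a hypothetical helper (not from the original post) that lists every printable ASCII character a bracket expression accepts.

```cpp
#include <regex>
#include <string>

// Hypothetical helper for illustration: returns every printable ASCII
// character (0x20..0x7e) that the given bracket expression accepts.
std::string accepted_chars(const std::string& bracket_expr)
{
    const std::regex re(bracket_expr);  // ECMAScript grammar by default
    std::string accepted;
    for (char c = 0x20; c < 0x7f; ++c) {
        // Match a one-character string against the bracket expression.
        if (std::regex_match(std::string(1, c), re)) {
            accepted += c;
        }
    }
    return accepted;
}
```

Called with "[.-0]" (an explicit range) this returns "./0", so the slash is accepted. Called with "[.0-]" (hyphen last, hence literal) it returns "-.0", which is what the escaped form is meant to accept.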

The second expression works because a - at the end of a character class (and at the start) is parsed as a literal hyphen.

Another example of the same bug:
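As a rough illustration of that workaround, the sketch below uses the question's regex with the hyphen placed last in the host class. The HostAndPath struct and the split_url function are names invented for this example, and only two of the seven capture groups are exposed.

```cpp
#include <regex>
#include <string>

// Hypothetical result type for this sketch (host and path only).
struct HostAndPath {
    std::string host;
    std::string path;
};

// Splits a URL using the regex variant whose host class ends in the
// hyphen. Returns false if the URL does not match the pattern.
bool split_url(const std::string& url, HostAndPath& out)
{
    static const std::regex re(
        R"(^(.+):\/\/(.+@)?([a-zA-Z\.0-9\-]+)(:\d{1,5})?([^?\n\#]*)(\?[^#\n]*)?(\#.*)?$)");
    std::smatch m;
    if (!std::regex_match(url, m, re)) {
        return false;
    }
    out.host = m[3].str();  // group 3: host
    out.path = m[5].str();  // group 5: path (empty if absent)
    return true;
}
```

For "http://example.com/test.html" this yields the host "example.com" and the path "/test.html". A hyphenated host such as "ftp://my-host.org" is still accepted, with an empty path.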

std::string uri = "%";
std::regex reg(R"([$\-&])");
std::smatch match;
std::regex_match(uri, match, reg);
for (auto& e : match)
{
    std::cout << e.str() << std::endl;
}

The [$\-&] regex should not match %; it should match only $, - or &. For whatever reason, % (which sits between $ and & in the ASCII table) is still matched.
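One way to guard against this at run time is a small probe plus a spelling of the class that does not rely on the escape. Both function names below are made up for this sketch. The probe reports whether the library in use treats the escaped hyphen as a literal.

```cpp
#include <regex>
#include <string>

// Hypothetical probe: true when "\-" inside a bracket expression is a
// literal hyphen, i.e. when the behaviour described above is absent.
// Under the range reading, '-' lies outside $..& and '%' lies inside it,
// so either condition failing indicates the problem.
bool escaped_hyphen_is_literal()
{
    const std::regex re(R"([$\-&])");
    const bool matches_hyphen  = std::regex_match("-", re);
    const bool matches_percent = std::regex_match("%", re);
    return matches_hyphen && !matches_percent;
}

// Same class written with the hyphen last, where it is always literal.
bool is_dollar_amp_or_hyphen(char c)
{
    static const std::regex re("[$&-]");
    return std::regex_match(std::string(1, c), re);
}
```

With the hyphen placed last, is_dollar_amp_or_hyphen accepts $, & and - and rejects %, independent of how the library handles the escaped form.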

