I am trying to use gb_regex.h to parse a string and I am running into an infinite loop. The expression I am compiling is:
| char *ChatMessageRegex = R"Yes(^:[A-Za-z0-9_]+![A-Za-z0-9_]+@[A-Za-z0-9_]+\.tmi\.twitch\.tv PRIVMSG #[A-Za-z0-9_]+ :(.+))Yes";
|
Then I am running this code:
| response ParseResponse(char *Buffer)
{
    response Result = {};
    gbRegex Regex;
    gbreError Error = gbre_compile(&Regex, ChatMessageRegex, Q_strlen(ChatMessageRegex));
    int Count = gbre_capture_count(&Regex);
    gbreCapture Captures;
    gbreBool Match = gbre_match(&Regex, Buffer, Q_strlen(Buffer), &Captures, Count);
    return Result;
}
|
The char *Buffer will be:
| ":tmi.twitch.tv 001 barret5ocal_tog_dev :Welcome, GLHF!\r\n\n:tmi.twitch.tv 002 barret5ocal_tog_dev :Your host is tmi.twitch.tv\r\n\n:tmi.twitch.tv 003 barret5ocal_tog_dev :This server is rather new\r\n\n:tmi.twitch.tv 004 barret5ocal_tog_dev :-\r\n\n:tmi.twitch.tv 375 barret5ocal_tog_dev :-\r\n\n:tmi.twitch.tv 372 barret5ocal_tog_dev :You are in a maze of twisty passages, all alike.\r\n\n:tmi.twitch.tv 376 barret5ocal_tog_dev :>\r\n\n"
|
which shouldn't match, so the call should return false. (Sorry about the long string that runs off the page; I wanted to make sure the string you got was correct.)
The first question is whether I did anything wrong in this expression or code.
Also, I figured out that the problem starts when the gbre_match call gets to the first '[' in the expression. When gbre__exec_single gets to the '[' and runs the GBRE_OP_ONE_OR_MORE case, it calls gbre__exec_single recursively and runs the GBRE_OP_ANY_OF case. This case calls gbre__context_no_match, which sets gbreContext.offset to -1. I'm not sure whether gbre__context_no_match is supposed to set the offset or the op to -1, because I suspect that setting the offset to -1 is what causes the infinite loop.
I think the problem happens when this code gets to gbre__consume. This for loop:
| for (;;) {
    c = gbre__exec_single(re, op, str, str_len, c.offset, 0, 0);
    if (c.offset > str_len) break;
    if (c.op >= re->buf_len) return c;
    next_c = gbre__exec(re, c.op, str, str_len, c.offset, 0, 0);
    if (next_c.offset <= str_len) {
        if (captures)
            gbre__exec(re, c.op, str, str_len, c.offset, captures, max_capture_count);
        best_c = next_c;
        if (!is_greedy) break;
    }
}
|
might be the problem since, as far as I can tell, the only way to break out of the loop in this case is c.offset > str_len, which never happens because c.offset keeps being set to -1.
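To show the suspected failure mode in isolation, here is a minimal standalone sketch. It is not code from gb_regex.h: `fake_context`, `fake_no_match`, and both `iterations_*` functions are hypothetical names, and it assumes the offset is a signed integer (if the real field is unsigned, -1 would wrap to a huge value and the original test would fire). The loops are capped at `max_iters` so the sketch itself cannot hang.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for gbreContext; ASSUMES a signed offset. */
typedef struct {
    ptrdiff_t op;
    ptrdiff_t offset;
} fake_context;

/* Models a step that always fails and reports it as offset = -1. */
static fake_context fake_no_match(ptrdiff_t op) {
    fake_context c;
    c.op = op;
    c.offset = -1;
    return c;
}

/* Loop with only the `offset > str_len` exit test.
   Returns the number of iterations executed (capped at max_iters). */
static int iterations_without_guard(ptrdiff_t str_len, int max_iters) {
    fake_context c = {0, 0};
    int n = 0;
    assert(max_iters > 0);
    while (n < max_iters) {
        c = fake_no_match(c.op);
        n++;
        /* With signed types, -1 > str_len is false, so this never breaks. */
        if (c.offset > str_len) break;
    }
    return n;
}

/* Same loop with an extra explicit check for the negative sentinel. */
static int iterations_with_guard(ptrdiff_t str_len, int max_iters) {
    fake_context c = {0, 0};
    int n = 0;
    assert(max_iters > 0);
    while (n < max_iters) {
        c = fake_no_match(c.op);
        n++;
        if (c.offset < 0 || c.offset > str_len) break;
    }
    return n;
}
```

Under these assumptions, `iterations_without_guard` runs until it hits the cap, while `iterations_with_guard` stops after the first failed step. Whether an extra `c.offset < 0` check is the right fix for gbre__consume is something I can't confirm from the outside.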
Please tell me if this is a bug, if I am doing something wrong, or if you need more info from me. Any help would be appreciated. Thank you!