![]() If this path gets too long, the number of backtracking steps will eventually become very large, resulting in catastrophic backtracking, leading to a possible denial of service. What happens behind the curtains is that any time a symbol is being tested by the engine and it fails to match the next one, it will backtrack and look for another way to compile the previous symbol. This behavior is an extreme case and happens because the algorithm will go through all the possible paths until failing. And if we change it to aaaaaaaaaaaaaaaaX, 65536 possible paths will be generated. Each additional ” a” doubles this number. If we modify the input to aaaaaaaaaaX, it will have 1024 steps. To this, if we choose the input aaaaX, 16 possible paths will exist in the graph from figure 12. We can use the same methodology of inputting different sizes in order to understand the NFA behavior. This will try all paths of the NFA until it reaches an accepting state, that is, where a match occurred or all the paths were attempted but with no match.Ĭonsidering the regex ˆ(a+)+$ and its correspondent NFA: Hence, for each input symbol, the NFA will transit to a new state until all the input symbols have been consumed. In any case, to accomplish the regular expression matching, the engine builds a Nondeterministic Finite Automaton (NFA), which is a finite state machine where for each pair of state and input symbol, there may be several possible next states. So far, we have seen how the PHP implementation uses the PCRE, Perl-based, and Go uses the RE2. The most commonly used algorithms to implement regular expression matching are: This clarifies that for the Python and JavaScript implementations the time doubles when a new C is added.įinally, in the chart seen in figure 11, it becomes visually clear what the discrepancies are between the results and places of the performances of PHP and Go side by side. For the purpose of this post, we only tested ten inputs, starting with 20 ’C’ characters and incrementing one unit. In table 1, we summarize the obtained results for each programming language, where it is displayed the elapsed time in seconds for each input that was tested. The official regex package implemented in Go uses the RE2 engine 2, which does not support backreferences, and guarantees a linear time execution while avoiding regex denial of service. In the PHP regex implementation, we used the Perl Compatible Regular Expressions (PCRE) library 1, which uses backreferences. The malicious inputs used in Python and JavaScript generated exponential time increments, versus the linear time responses for PHP and Go. We can see that the results from PHP (8) and Go (10) implementations are much different from the Python (4) and JavaScript (6) results. The topic of this report was motivated byongoing research on the topic of Go security, where we aim to discover vulnerabilities lurking in Go packages.įunc sqli() Listing 8: Go Regex compiler This blog post includes a set of practical examples using different programming languages, aiming to show how the Go implementation avoids ReDoS. But first, let’s start by explaining the concept of ReDoS and how such attacks can be exploited and mitigated. ![]() Go is compiled, is statically typed as in C (with garbage collection), with limited structural typing, memory safety features and CSP-style concurrent features. In this blog post, we'll recap Go’s security posture facing Regular Expression Denial of Service (ReDoS) attacks. ![]() For example, if your loop is something like this right now: inputs := stringįmo = nil // now GC can reclaim the map.Īlso, for your particular regular expression and how you’re using ReplaceAllString, you can switch the implementation of fixAnnotationKey to use strings.Go Programming Language (also known as Golang) is an open source programming language created by Google. Are the input strings ever repeated? If so, you might actually save memory by caching the results. You said you’re using this function in a loop. Caching results would affect memory usage, but the result could go either way you’d need to benchmark/profile it. ![]()
0 Comments
Leave a Reply. |