c++ - How can you detect if two regular expressions overlap in the strings they can match? -


I have a container of regular expressions. I want to analyze them to determine whether it is possible to generate a string that matches more than one of them. Short of writing my own regex engine with this use case in mind, is there an easy way to solve this problem in C++ or Python?

There is no easy way.

As long as your regular expressions use only standard features (Perl lets you embed arbitrary code in matching, I think), you can produce from each one a nondeterministic finite automaton (NFA) that compactly encodes all the strings the RE matches.

Given any pair of NFAs, it is decidable whether their intersection is empty. If the intersection is not empty, then some string matches both REs in the pair (and vice versa).

The standard decidability proof is to determinize them into DFAs first, and then to build a new DFA whose states are pairs of the two DFAs' states, and whose final states are exactly those pairs in which both states are final in their original DFA. Alternatively, if you have already shown how to compute the complement of an NFA, then you can get the intersection (De Morgan's law style) as complement(union(complement(A), complement(B))). Unfortunately, NFA -> DFA involves a possible exponential size explosion (because the DFA's states are subsets of the NFA's states):

Some classes of regular languages can only be described by deterministic finite automata whose size grows exponentially in the size of the shortest equivalent regular expression. The standard example is the family of languages consisting of all strings over the alphabet {a, b} whose kth-last letter equals a.
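To make the blow-up concrete, here is a minimal Python sketch that builds a (k+1)-state NFA for "the kth-last letter is a" and counts how many subset states determinization reaches. The function names and the dictionary encoding of the transition relation are assumptions made for illustration, not part of any library.

```python
from collections import deque

def kth_last_nfa(k):
    """NFA over {a, b} for strings whose k-th last letter is 'a'.
    States are 0..k; 0 is the start state, k is the only final state.
    delta maps (state, letter) -> set of target states."""
    delta = {(0, "a"): {0, 1}, (0, "b"): {0}}
    for i in range(1, k):
        delta[(i, "a")] = {i + 1}
        delta[(i, "b")] = {i + 1}
    return delta, 0, {k}

def count_dfa_states(delta, start, alphabet="ab"):
    """Run the subset construction and count the reachable DFA states."""
    start_set = frozenset([start])
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        for letter in alphabet:
            target = frozenset(
                t for s in current for t in delta.get((s, letter), ())
            )
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return len(seen)

for k in range(1, 6):
    delta, start, finals = kth_last_nfa(k)
    print(k, count_dfa_states(delta, start))
```

For this family the count doubles with each increment of k (2, 4, 8, ...), while the NFA only gains one state.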

By the way, you should definitely use OpenFst. You can create automata as text files and play around with operations like minimization and intersection, so that you can see how efficient they are for your problem. There are already open source regexp -> NFA -> DFA compilers (I remember a Perl module); modify one to output OpenFst automata files and play around.

Fortunately, it is possible to avoid the subset-of-states explosion and intersect the two NFAs directly, using the same construction as for DFAs:

If A ->a B (in one NFA, you can go from state A to state B, outputting the letter 'a')

and X ->a Y (in the other NFA),

then (A, X) ->a (B, Y) in the intersection.

(C, Z) is final iff C is final in the one NFA and Z is final in the other.

To start the process, you begin in the pair of start states of the two NFAs, e.g. (A, X); this is the start state of the intersection NFA. Each time you first visit a state, generate an arc by the above rule for every pair of arcs leaving the two states, and then visit all the (new) states those arcs reach. You store the fact that you have expanded a state's arcs (e.g. in a hash table) and end up exploring all the states reachable from the start.
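The exploration described above could be sketched in Python roughly as follows. The tuple encoding `(start, finals, arcs)` and the function name `intersect` are assumptions chosen for this example, and epsilon arcs are not handled here.

```python
def intersect(nfa1, nfa2):
    """Build the reachable part of the product of two epsilon-free NFAs.
    Each NFA is (start, finals, arcs), where arcs maps a state to a list
    of (letter, target) pairs."""
    start1, finals1, arcs1 = nfa1
    start2, finals2, arcs2 = nfa2
    start = (start1, start2)
    arcs = {}          # doubles as the "already expanded" hash table
    finals = set()
    stack = [start]
    while stack:
        pair = stack.pop()
        if pair in arcs:
            continue   # this pair's arcs were generated earlier
        p, q = pair
        if p in finals1 and q in finals2:
            finals.add(pair)
        out = []
        for letter1, t1 in arcs1.get(p, []):
            for letter2, t2 in arcs2.get(q, []):
                if letter1 == letter2:
                    out.append((letter1, (t1, t2)))
        arcs[pair] = out
        stack.extend(target for _, target in out)
    return start, finals, arcs

# Hand-built NFAs for a*b and (a|b)b; both match "ab".
nfa1 = ("A", {"B"}, {"A": [("a", "A"), ("b", "B")]})
nfa2 = ("X", {"Z"}, {"X": [("a", "Y"), ("b", "Y")], "Y": [("b", "Z")]})
start, finals, arcs = intersect(nfa1, nfa2)
print(finals)  # {('B', 'Z')}
```

Because only pairs reachable from the start pair are ever expanded, a non-empty `finals` set already means the two expressions overlap. The product has at most |states1| * |states2| pairs, so it stays polynomial in the size of the two NFAs.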

If you allow epsilon transitions (which do not output a letter), that is fine:

If A ->epsilon B is in the first NFA, then for every state (A, Y) you reach, add the arc (A, Y) ->epsilon (B, Y), and similarly for epsilon transitions in the second NFA.
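As a rough illustration of the epsilon rule, the following Python sketch computes the arcs leaving one product state. Using `None` as the epsilon label and the helper name `pair_arcs` are assumptions of this example.

```python
EPS = None  # label standing for an epsilon arc (a convention of this sketch)

def pair_arcs(p, q, arcs1, arcs2):
    """Arcs leaving the product state (p, q).
    arcs1 and arcs2 map a state to a list of (label, target) pairs."""
    out = []
    for label1, t1 in arcs1.get(p, []):
        if label1 is EPS:
            # The first NFA moves alone; the second stays where it is.
            out.append((EPS, (t1, q)))
            continue
        for label2, t2 in arcs2.get(q, []):
            if label2 == label1:
                # Both NFAs read the same letter together.
                out.append((label1, (t1, t2)))
    for label2, t2 in arcs2.get(q, []):
        if label2 is EPS:
            # The second NFA moves alone; the first stays where it is.
            out.append((EPS, (p, t2)))
    return out

arcs1 = {"A": [(EPS, "B"), ("a", "A")]}
arcs2 = {"X": [("a", "Y"), (EPS, "Z")]}
print(pair_arcs("A", "X", arcs1, arcs2))
```

The design choice is that an epsilon arc advances only the automaton that owns it, so no epsilon closure has to be computed up front.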

Epsilon transitions are useful (but not required) for taking the union of two NFAs when translating a regexp into an NFA; whenever you have an alternation regexp1|regexp2|regexp3, you take the union: an NFA whose start state has an epsilon transition to the start state of each NFA representing a regexp in the alternation.
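A small Python sketch of that union construction might look like this. The fresh start state name `"start"`, the `None` epsilon label, and tagging each state with the index of its NFA to keep names apart are all assumptions of this example.

```python
EPS = None  # label standing for an epsilon arc (a convention of this sketch)

def union(nfas):
    """Union of several NFAs, each given as (start, finals, arcs) with
    arcs mapping a state to a list of (label, target) pairs."""
    start = "start"            # hypothetical fresh start state
    arcs = {start: []}
    finals = set()
    for i, (s, f, a) in enumerate(nfas):
        # Epsilon arc from the new start into the i-th NFA's start state.
        arcs[start].append((EPS, (i, s)))
        finals.update((i, q) for q in f)
        for state, out in a.items():
            # Copy the arcs, renaming every state to (i, state).
            arcs[(i, state)] = [(label, (i, t)) for label, t in out]
    return start, finals, arcs

# NFAs for "a" and for the empty string, both using a state called "A".
start, finals, arcs = union([
    ("A", {"B"}, {"A": [("a", "B")]}),
    ("A", {"A"}, {}),
])
print(arcs[start])
```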

Deciding emptiness for an NFA is easy: if you ever reach a final state in a depth-first search from the start state, the language is not empty.
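One way to sketch that search in Python, returning a matching string as evidence rather than just a yes/no answer, is shown below. The function name `find_witness` and the automaton encoding are assumptions of this example; epsilon arcs are assumed to be labelled `None`.

```python
def find_witness(start, finals, arcs):
    """Depth-first search from the start state.
    Returns some accepted string, or None if no final state is reachable.
    arcs maps a state to a list of (label, target) pairs."""
    stack = [(start, "")]
    seen = {start}
    while stack:
        state, word = stack.pop()
        if state in finals:
            return word
        for label, target in arcs.get(state, []):
            if target not in seen:
                seen.add(target)
                # An epsilon arc (label None) adds nothing to the word.
                stack.append((target, word + (label or "")))
    return None

# A hand-built product automaton whose only accepted string is "ab".
arcs = {
    ("A", "X"): [("a", ("A", "Y")), ("b", ("B", "Y"))],
    ("A", "Y"): [("b", ("B", "Z"))],
    ("B", "Y"): [],
    ("B", "Z"): [],
}
print(find_witness(("A", "X"), {("B", "Z")}, arcs))  # ab
```

Compare the result with `is not None` rather than testing its truthiness, since the empty string is a valid witness when the start state is itself final.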

This NFA intersection is similar to finite-state transducer composition (a transducer is an NFA that outputs pairs of symbols, which are concatenated pairwise to match both an input and an output string, or to transform a given input into an output).

