A finite state automaton is a way of looking at a computer, a way of solving parsing problems and a way of designing special purpose hardware. The automaton at any time in is one of a finite number of internal states. It is a black box. You feed it input, usually a character at a time. It turns its internal crank and applies its rules and goes into a new state and accepts more input. Every once in a while it also produces some output. A simple parser might classify Java source as either java commands or comment. It would have states for having seen /, //, /*, /*… *, /*… */ etc.
In the days of C, the state was stored as an integer 0 .. n. You had a complicated nest of switch statements that examined both the current state and the input to determine the next state.
In 1.4-, one way of writing finite state automata was to have a singleton class represent each possible state. There is a state variable that represents the current state. You feed the input to a standard method of the class’s interface and it computes the next state. That way states that are very similar can inherit default behaviours. You can have a static method to categorise the input and and a separate instance method to handle each categroy, or use a common method and a switch to dispatch to code based on input category to decide the next state. You don’t need any switch code based on current state. The dynamic method overriding features of Java handle that.
In Java version 1.5 or later, one way to write finite state automata is to use an enum constant to represent each state. A custom next method on each enum constant examines the input and calculates the next state. The various parsers used by JDisplay work using enums. The problem with this approach is you must use statics for variables shared between states. You can’t instantiate several different finite state automata.
Finite state automata come in two flavours: DFA (Deterministic Finite Automaton) and NFA (Non-deterministic Finite Automaton). NFAs (Non-deterministic Finite Automatons) don’t always give the same answer.
You might wonder what use a non-deterministic program could be. Consider integration in two dimensions by the Monte Carlo method. All you need is a way of determining if a point is inside or outside the area you want to integrate. Then you generate random points and count how many are inside and how many outside. The ratio tells you the ratio of the area inside to outside which when applied to the total area gives you an approximation of the area of the region to integrate. You want a random pattern of test points. Any regular pattern could be thrown off by regularities in the shape of the region you are trying to integrate.
Here is the source code for a the Compactor finite state automaton that removes excess white space from HTML to make it more compact for transmission. Compactor: compacts a group of files, a single file or a string.HTMLCharCategory: categorises characters. HTMLState: the finite state automaton that parse the HTML and decides which spaces and new
|recommend book⇒Introduction to the Theory of Computation, second edition|
|This is a university textbook that will explain such mysteries and finite state automata and how to convert them into regex expressions. It covers Turing machines. This is about the theory of computing, not a cookbook on how to code anything. This is thin but expensive book. third edition 978-1-133-18779-0 is due 2012-05-30|
|Greyed out stores probably do not have the item in stock. Try looking for it with a bookfinder.|
This page is posted
Optional Replicator mirror
Your face IP:[188.8.131.52]
You are visitor number|