com.ssv.utils.parser
Class Parser

java.lang.Object
  extended by com.ssv.utils.parser.Parser
Direct Known Subclasses:
ContestParser

public class Parser
extends java.lang.Object

This class is a substitute for Java's StringTokenizer. It is a much more powerful and flexible tokenizer: it allows the user to specify custom keywords, symbols, comment styles, and so on. There are 6 predefined tokens:

EOF - end of file reached.
IDENTIFIER - identifier encountered; its name is accessible through getIdentifier().
NUMBER - number encountered; its value is accessible through getNumber().
DOUBLENUMBER - floating-point number encountered; its value is accessible through getDoubleNumber().
SINGLEQUOTEDSTRING - single-quoted string literal encountered; its value is accessible through getString().
DOUBLEQUOTEDSTRING - double-quoted string literal encountered; its value is accessible through getString().

The user can also define custom symbols (see below).

To define a keyword, use addKeyword(String). Store the returned int value so that you can recognize the token when it is encountered in the input stream. For example:


     final static int BEGIN = addKeyword("BEGIN");
     ...
     if(token()==BEGIN) { // BEGIN just scanned from the input stream.
          ...
     }
 

Use the same approach when declaring your own symbols with the addToken method. There are 2 versions of this method: addToken(String) and addToken(String, String). The first one just declares a new symbol, which is shown in diagnostic messages exactly as it was declared (this is fine when the symbol contains no non-printing characters). If the symbol includes such characters, it is more practical to assign a string name to it. In this example we declare an LF token (i.e. end of line):


     final static int LF = addToken("\r\n", "LF");
     ...
     if(token()==LF) { 
          // Printing log message, actually "LF encountered." will be printed.
          System.out.println(tokenToStr(token())+" encountered.");
     }
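
The registration scheme described above (each keyword or symbol gets an int ID, and a symbol may carry a printable name for diagnostics) can be pictured with a small stand-alone sketch. This is not the Parser implementation: the class TokenRegistrySketch, the starting ID of 100, and the behaviour on repeated registration are assumptions made for illustration only. The method names addToken and tokenToStr merely mirror the ones used in the examples.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in: shows how symbols could map to int IDs and display names.
class TokenRegistrySketch {
    private static final int FIRST_USER_ID = 100; // assumed; the real IDs are unknown
    private final Map<String, Integer> idByText = new HashMap<>();
    private final Map<Integer, String> nameById = new HashMap<>();
    private int nextId = FIRST_USER_ID;

    // One-argument form: the symbol text doubles as its display name.
    int addToken(String text) {
        return addToken(text, text);
    }

    // Two-argument form: a printable name for symbols with non-printing characters.
    int addToken(String text, String displayName) {
        Integer existing = idByText.get(text);
        if (existing != null) {
            return existing; // assumption: registering twice returns the same ID
        }
        int id = nextId++;
        idByText.put(text, id);
        nameById.put(id, displayName);
        return id;
    }

    // Name used in log and diagnostic messages.
    String tokenToStr(int id) {
        return nameById.getOrDefault(id, "UNKNOWN(" + id + ")");
    }

    public static void main(String[] args) {
        TokenRegistrySketch registry = new TokenRegistrySketch();
        int semicolon = registry.addToken(";");
        int lf = registry.addToken("\r\n", "LF");
        System.out.println(registry.tokenToStr(semicolon) + " encountered.");
        System.out.println(registry.tokenToStr(lf) + " encountered.");
    }
}
```

The point of the two-argument form is visible in the last line: the log shows the name LF rather than a raw line break.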
 

There is also a way to handle comments: addCommentSymbols(String, String) and addLineCommentSymbol(String). The first registers a pair of delimiters so that everything from "begin" to "end" is treated as a comment. For example, if you do this:


 addCommentSymbols("{", "}");
 

you get Pascal-style comments, which means everything between '{' and '}' is skipped by the parser.
The other comment style is handled by addLineCommentSymbol(String). The difference is that you specify only the symbol that begins the comment, and everything after it is skipped until the end of the line.
If you want to analyze the contents of comments, you can register a listener with setCommentHandler(CommentHandler) (see below).
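
The two comment styles and the handler callback can be illustrated with a stand-alone sketch. It does not use or reproduce the Parser class: CommentSkipSketch and its strip method are hypothetical, the handler is modelled as a plain java.util.function.Consumer rather than the real CommentHandler type, and string literals containing comment symbols are deliberately not handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in: skips block and line comments, reporting each body to a handler.
class CommentSkipSketch {
    static String strip(String src, String begin, String end, String linePrefix,
                        Consumer<String> handler) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < src.length()) {
            if (src.startsWith(begin, i)) {
                // Block comment: skip up to and including the closing symbol.
                int bodyStart = i + begin.length();
                int close = src.indexOf(end, bodyStart);
                int bodyEnd = (close < 0) ? src.length() : close;
                handler.accept(src.substring(bodyStart, bodyEnd));
                i = (close < 0) ? src.length() : close + end.length();
            } else if (src.startsWith(linePrefix, i)) {
                // Line comment: skip up to, but not including, the line feed.
                int bodyStart = i + linePrefix.length();
                int eol = src.indexOf('\n', bodyStart);
                int bodyEnd = (eol < 0) ? src.length() : eol;
                handler.accept(src.substring(bodyStart, bodyEnd));
                i = bodyEnd;
            } else {
                out.append(src.charAt(i++));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> comments = new ArrayList<>();
        String code = "a := 1; { set a }\nb := 2; // set b\n";
        String cleaned = strip(code, "{", "}", "//", comments::add);
        System.out.print(cleaned);
        System.out.println(comments);
    }
}
```

Keeping the line feed after a line comment matters when the end of line is itself a meaningful token rather than whitespace.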

Another important matter is how whitespace is handled. You can register as many whitespace characters as you need by invoking addSpace(char). Usually spaces, tabs and line endings are treated as whitespace, so you will probably want to do this:


 addSpace(' ');
 addSpace('\r');
 addSpace('\n');
 addSpace('\t');
 

This is not mandatory, however; for example, you may want to handle the end of line as a meaningful token.
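
The effect of registering or not registering '\n' as whitespace can be shown with a stand-alone sketch. WhitespaceSketch is a hypothetical class, not part of this package; its addSpace method only mirrors the name of the documented one, and reporting an unregistered line feed as the string "LF" is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in: splits text on a configurable set of whitespace characters.
class WhitespaceSketch {
    private final Set<Character> spaces = new HashSet<>();

    void addSpace(char c) {
        spaces.add(c);
    }

    List<String> split(String src) {
        List<String> tokens = new ArrayList<>();
        StringBuilder word = new StringBuilder();
        for (char c : src.toCharArray()) {
            boolean isSpace = spaces.contains(c);
            // A line feed that was not registered as whitespace becomes a token.
            boolean isLineEnd = (c == '\n') && !isSpace;
            if (isSpace || isLineEnd) {
                if (word.length() > 0) {
                    tokens.add(word.toString());
                    word.setLength(0);
                }
                if (isLineEnd) {
                    tokens.add("LF");
                }
            } else {
                word.append(c);
            }
        }
        if (word.length() > 0) {
            tokens.add(word.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        WhitespaceSketch lineAware = new WhitespaceSketch();
        lineAware.addSpace(' ');
        lineAware.addSpace('\t');
        lineAware.addSpace('\r');
        // '\n' is not registered, so line ends show up in the token list.
        System.out.println(lineAware.split("x = 1\r\ny = 2"));
    }
}
```

Calling addSpace('\n') on the same instance makes the line ends disappear from the result, which corresponds to the four-call setup shown above.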

Author:
Sergey Siryk sergey.siryk@gmail.com

Nested Class Summary
static class Parser.ParserContext
           
static class Parser.Token
           
 
Field Summary
static int DOUBLENUMBER
          Float number.
static int DOUBLEQUOTEDSTRING
          Double-quoted string.
static int EOF
          End of file.
static int IDENTIFIER
          Identifier.
static int NONNUMBER
          Invalid token, starts with number and.
static int NUMBER
          Long number.
static int SINGLEQUOTEDSTRING
          Single-quoted string.
 
Method Summary
 int accept(java.lang.Integer... tokens)
          Checks that the current token is one we expect, then reads the next token.
 Parser.ParserContext accepted()
          Returns a reference to the context created by the last accept() call.
 Parser.Token ctoken()
          Returns current symbol.
 Parser.ParserContext fetchContext()
          Creates copy of current context.
 double getDoubleNumber()
           
 java.lang.String getIdentifier()
           
 long getNumber()
           
 java.lang.String getString()
           
 void init(java.io.InputStream is)
          Method opens the stream to read from.
static void main(java.lang.String[] args)
           
 int next()
          Reads next token from the input stream.
 void setCommentHandler(CommentHandler ch)
          Sets the listener to invoke when a comment is encountered.
 java.lang.String stoken()
          Returns string representation of the current token.
 java.lang.String stokenshort(int ID)
           
 int token()
          Returns current symbol.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

EOF

public static final int EOF
End of file.

See Also:
Constant Field Values

IDENTIFIER

public static final int IDENTIFIER
Identifier. Invoke getIdentifier() to access identifier's name.

See Also:
Constant Field Values

NUMBER

public static final int NUMBER
Long number. Invoke getNumber() to get the value.

See Also:
Constant Field Values

DOUBLENUMBER

public static final int DOUBLENUMBER
Float number. Invoke getDoubleNumber() to get the value.

See Also:
Constant Field Values

SINGLEQUOTEDSTRING

public static final int SINGLEQUOTEDSTRING
Single-quoted string. Invoke getString() to get the value.

See Also:
Constant Field Values

DOUBLEQUOTEDSTRING

public static final int DOUBLEQUOTEDSTRING
Double-quoted string. Invoke getString() to get the value.

See Also:
Constant Field Values

NONNUMBER

public static final int NONNUMBER
Invalid token, starts with number and. Invoke getString() to get the value.

See Also:
Constant Field Values
Method Detail

init

public void init(java.io.InputStream is)
          throws java.io.IOException
Method opens the stream to read from. If another stream has been opened previously, it will be closed.

Parameters:
is - stream to open.
Throws:
java.io.IOException

accept

public int accept(java.lang.Integer... tokens)
           throws ParserException,
                  java.io.IOException
Checks that the current token is one of those we expect, then reads the next token. Using this method can be tricky because the current token is lost once it returns. For example,

                parser.accept(IDENTIFIER);
                System.out.println(parser.getIdentifier());
 
will not work, because the value of the identifier is already lost by the time we try to get it. The code should be written this way instead:

                String id_value=parser.getIdentifier();
                parser.accept(IDENTIFIER);
                System.out.println(id_value);
 
But this is still not the best way to do it. There is a special method that holds the context of the accepted symbol: accepted(). You can use it like this:

                parser.accept(IDENTIFIER);
                System.out.println(parser.accepted().getIdentifier());
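
The reason the first snippet fails and the last one works can be seen in a stand-alone sketch. AcceptSketch is a hypothetical stand-in, not the Parser class: it splits input on whitespace, knows only three token kinds, takes int... instead of Integer..., and throws IllegalStateException where the real method throws ParserException. What it does share with the description above is the order of operations: accept() snapshots the current token and then advances.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in: accept() stores a snapshot of the token before moving on.
class AcceptSketch {
    static final int EOF = 0, IDENTIFIER = 1, NUMBER = 2;

    // Snapshot of a token's kind and text.
    static final class Context {
        final int token;
        final String text;
        Context(int token, String text) {
            this.token = token;
            this.text = text;
        }
    }

    private final List<String> words;
    private int pos = 0;
    private Context accepted;

    AcceptSketch(String input) {
        String trimmed = input.trim();
        words = trimmed.isEmpty()
                ? Collections.<String>emptyList()
                : Arrays.asList(trimmed.split("\\s+"));
    }

    private Context current() {
        if (pos >= words.size()) {
            return new Context(EOF, "");
        }
        String w = words.get(pos);
        boolean digits = w.chars().allMatch(Character::isDigit);
        return new Context(digits ? NUMBER : IDENTIFIER, w);
    }

    int token() { return current().token; }

    String text() { return current().text; }

    int accept(int... expected) {
        Context cur = current();
        for (int e : expected) {
            if (e == cur.token) {
                accepted = cur; // remember what was accepted
                pos++;          // then advance: the current token changes here
                return cur.token;
            }
        }
        throw new IllegalStateException("Unexpected token: " + cur.text);
    }

    Context accepted() { return accepted; }

    public static void main(String[] args) {
        AcceptSketch parser = new AcceptSketch("count 42");
        parser.accept(IDENTIFIER);
        System.out.println(parser.text());            // text of the token after the identifier
        System.out.println(parser.accepted().text);   // text of the accepted identifier
    }
}
```

After accept(IDENTIFIER), the getters on the parser describe the following token, while the snapshot returned by accepted() still describes the identifier.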
 

Parameters:
tokens - the tokens we expect.
Throws:
ParserException - if current token is not what we expect.
java.io.IOException - if there is input stream problem.

fetchContext

public Parser.ParserContext fetchContext()
Creates copy of current context.


accepted

public Parser.ParserContext accepted()
Returns a reference to the context created by the last accept() call. See also accept(Integer...).


getIdentifier

public java.lang.String getIdentifier()
Returns:
value of last parsed identifier.

getNumber

public long getNumber()
Returns:
value of last parsed integer constant.

getDoubleNumber

public double getDoubleNumber()
Returns:
value of last parsed float constant.

getString

public java.lang.String getString()
Returns:
value of last parsed string literal.

next

public int next()
         throws java.io.IOException,
                ParserException
Reads next token from the input stream.

Returns:
last parsed token.
Throws:
java.io.IOException - if there is an error while reading from the input stream.
GeneralTokenException - syntax error.
ParserException

token

public int token()
          throws java.io.IOException,
                 ParserException
Returns current symbol.

Throws:
ParserException
java.io.IOException

ctoken

public Parser.Token ctoken()
                    throws java.io.IOException,
                           ParserException
Returns current symbol.

Throws:
ParserException
java.io.IOException

setCommentHandler

public void setCommentHandler(CommentHandler ch)
Sets the listener to invoke when a comment is encountered. If a listener is defined, the whole body of the comment is passed to it.

Parameters:
ch - the listener to invoke.

stoken

public java.lang.String stoken()
Returns a string representation of the current token. Be careful: the current implementation is not accurate; it returns a string like this:
IDENTIFIER [ identname ]
where identname is current value of getIdentifier().

Returns:
string representation of the current token.

stokenshort

public java.lang.String stokenshort(int ID)
Parameters:
ID - token ID.
Returns:
token's name.

main

public static void main(java.lang.String[] args)