parser Class Reference

#include <parse.hpp>


Public Types

enum  kind {
  eof, identifier, number, plus = '+',
  minus = '-', times = '*', slash = '/', lparen = '(',
  rparen = ')', equal = '='
}

Public Member Functions

 parser (std::istream &input)
bool get_expr (double &result)

Private Member Functions

std::string charify (char c)
bool get_number (std::string const &token, double &result)
bool get_add_expr (double &result)
bool get_mul_expr (double &result)
bool get_primary (double &result)
bool get_unary (double &result)
kind get_token (std::string &token)
void get_identifier (std::string &identifier)
void push_back (std::string const &token, kind k)
bool isalpha (char c) const
bool isalnum (char c) const
bool isdigit (char c) const
bool isprint (char c) const

Private Attributes

std::istream & input_
 Share the input stream.
std::ctype< char > const & ctype_
 Cache the ctype facet for checking character categories.
std::string token_
 One token push-back.
kind kind_
 The kind of token that was pushed back.


Detailed Description

Parser class. The parser reads tokens from an input stream. A token can be a keyword, numeric literal, identifier, or symbol (operator or punctuator). The design allows symbols with multiple characters (e.g., :=), although get_token() as listed below recognizes only single-character symbols.

A recursive-descent parser sometimes reads one token more than the current rule needs, so the parser keeps a single push-back token. When a parsing function reads a token that does not belong to its rule, it pushes that token back, and the next call to get_token() returns it.

Only one token can be pushed back at a time, so the syntax must be parsable with a single token of lookahead.
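
The push-back mechanism can be sketched in isolation. The following is a minimal illustration, not the code from parse.hpp: the class pushback_reader, its members, and the helper reread_second are hypothetical names, and tokens are simplified to whitespace-separated words.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the parser's push-back logic.
// Tokens here are just whitespace-separated words.
class pushback_reader {
public:
  explicit pushback_reader(std::istream& in)
  : in_(in), saved_(), has_saved_(false)
  {}

  // Return the pushed-back token if there is one; otherwise read the stream.
  bool get(std::string& token) {
    if (has_saved_) {
      token = saved_;
      has_saved_ = false;
      return true;
    }
    return not (in_ >> token).fail();
  }

  // Save one token. A second push_back before the next get() overwrites it.
  void push_back(std::string const& token) {
    saved_ = token;
    has_saved_ = true;
  }

private:
  std::istream& in_;
  std::string saved_;
  bool has_saved_;
};

// Read two tokens, push the second one back, and read it again.
inline std::string reread_second(std::string const& text) {
  std::istringstream in(text);
  pushback_reader reader(in);
  std::string first, second, again;
  reader.get(first);
  reader.get(second);
  reader.push_back(second);
  reader.get(again);
  return again;
}
```

The sketch uses a separate flag to record whether a token is saved. The parser class instead treats an empty token_ as "nothing pushed back", which works because every real token has at least one character.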

Definition at line 31 of file parse.hpp.


Member Enumeration Documentation

enum parser::kind

Token kind. Declare a name for each single-character token, to ensure the enumerated type can represent any operator or punctuator character.

Enumerator:
eof 
identifier 
number 
plus 
minus 
times 
slash 
lparen 
rparen 
equal 

Definition at line 37 of file parse.hpp.

enum kind { eof, identifier, number,
    plus='+', minus='-', times='*', slash='/', lparen='(', rparen=')', equal='=' };
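
Because each symbol enumerator has the value of its character, a scanner can turn a character into a kind with a cast, which is what get_token() does with kind(c). A minimal sketch follows; the namespace-scope copy of the enumeration and the helper symbol_kind are assumptions made for illustration only.

```cpp
#include <string>

// Copy of the enumerators from parser::kind, declared at namespace scope
// for this sketch only.
enum kind { eof, identifier, number,
            plus = '+', minus = '-', times = '*', slash = '/',
            lparen = '(', rparen = ')', equal = '=' };

// Hypothetical helper: map an operator or punctuator character to its kind.
// Returns eof for any character that is not one of the named symbols.
inline kind symbol_kind(char c) {
  static std::string const symbols("+-*/()=");
  if (symbols.find(c) != std::string::npos)
    return static_cast<kind>(c);
  return eof;
}
```

The first three enumerators take the values 0, 1, and 2, which are control characters, so they cannot collide with any operator or punctuator.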


Constructor & Destructor Documentation

parser::parser ( std::istream &  input  ) 

Constructor. Save the input stream.

Parameters:
input The input stream

Definition at line 5 of file parse.cpp.

: input_(input),
  ctype_(std::use_facet<std::ctype<char> >(input.getloc())),
  token_(),
  kind_()
{}
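
The facet lookup in the constructor can be tried on its own. This sketch assumes a hypothetical helper, count_alpha, that obtains the ctype facet from a stream's locale once and then reuses the reference for every character, in the same way the parser caches ctype_.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: count the alphabetic characters in text, using the
// ctype facet of the locale imbued in a stream.
inline int count_alpha(std::string const& text) {
  std::istringstream in(text);
  // Look up the facet once and keep a reference to it.
  std::ctype<char> const& ct = std::use_facet<std::ctype<char> >(in.getloc());
  int count = 0;
  for (std::string::size_type i = 0; i != text.size(); ++i)
    if (ct.is(std::ctype_base::alpha, text[i]))
      ++count;
  return count;
}
```

A cached facet reference does not follow later changes: if the stream is imbued with a different locale after construction, the cached facet still describes the old one.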


Member Function Documentation

bool parser::get_expr ( double &  result  ) 

Read one expression and store the result in result.

Parameters:
result Where to store the result of the expression.
Returns:
true to continue or false to end the loop
Exceptions:
parse_error for various syntax and other errors

Definition at line 142 of file parse.cpp.

References eof, get_add_expr(), get_token(), identifier, push_back(), and set_variable().

Referenced by get_primary(), and parse_loop().

{
  std::string token;
  kind k(get_token(token));
  if (k == eof)
    return false;

  if (k == identifier and token == "var") {
    std::string name;
    // Define a variable.
    k = get_token(name);
    if (k != identifier)
      throw parse_error("syntax error: expected IDENTIFIER, but got " + name);
    k = get_token(token);
    if (k != '=')
      throw parse_error("syntax error: expected =, but got " + token);
    if (not get_add_expr(result))
      throw parse_error("syntax error: expected additive-expression in assignment");
    set_variable(name, result);
    return true;
  }

  if (k == identifier and token == "quit")
    std::exit(0);

  push_back(token, k);
  if (not get_add_expr(result))
    throw parse_error("syntax error: expected an additive-expression");

  return true;
}

std::string parser::charify ( char  c  )  [private]

Convert a character to a readable form.

Parameters:
c The character
Returns:
A C++-style character literal that ensures c is readable.

Definition at line 12 of file parse.cpp.

References isprint().

Referenced by get_identifier(), and get_token().

{
  if (c == '\a') return "\'\\a\'";
  if (c == '\b') return "\'\\b\'";
  if (c == '\f') return "\'\\f\'";
  if (c == '\n') return "\'\\n\'";
  if (c == '\r') return "\'\\r\'";
  if (c == '\t') return "\'\\t\'";
  if (c == '\v') return "\'\\v\'";
  if (c == '\'') return "\'\\'\'";
  if (c == '\\') return "\'\\\\\'";

  if (isprint(c))
    return std::string("\'") + c + '\'';
  else {
    std::ostringstream stream;
    stream << "'\\x" << std::hex;
    stream.fill('0');
    stream.width(2);
    stream << (std::char_traits<char>::to_int_type(c) & 0xFF) << '\'';
    return stream.str();
  }
}
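
To see the three kinds of output that charify() produces, here is a reduced free-function variant. It is a sketch under stated assumptions: the name charify_sketch is hypothetical, only the newline escape is kept, and std::isprint from the C library replaces the locale facet used by the member function.

```cpp
#include <cctype>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical free-function variant of charify() that covers only the
// newline escape, printable characters, and the hexadecimal fallback.
inline std::string charify_sketch(char c) {
  if (c == '\n') return "'\\n'";
  unsigned char uc = static_cast<unsigned char>(c);
  if (std::isprint(uc))
    return std::string("'") + c + '\'';
  // Anything else is shown as a two-digit hexadecimal escape.
  std::ostringstream stream;
  stream << "'\\x" << std::hex << std::setfill('0') << std::setw(2)
         << static_cast<int>(uc) << '\'';
  return stream.str();
}
```

With this variant, the letter a is rendered as 'a', a newline as '\n', and the control character with value 1 as '\x01'.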

bool parser::get_number ( std::string const &  token,
double &  result 
) [private]

Parse a floating-point number.

Parameters:
token The token to parse
result Store the number here
Returns:
true if token is a valid number or false for an error

Definition at line 135 of file parse.cpp.

Referenced by get_primary().

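{
  std::istringstream stream(token);
  // If the value overflows or is otherwise invalid, return false.
  // Test the stream state explicitly; newer compilers do not convert
  // a stream to bool implicitly in a return statement.
  return not (stream >> result).fail();
}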
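
The conversion can be exercised outside the class. The sketch below assumes a hypothetical free function, to_double, that follows the same approach and adds one check the member function does not make: it rejects tokens with characters left over after the number.

```cpp
#include <sstream>
#include <string>

// Hypothetical free-function version of get_number(), with an extra check
// that the whole token was consumed.
inline bool to_double(std::string const& token, double& result) {
  std::istringstream stream(token);
  if ((stream >> result).fail())
    return false;
  // Reject trailing characters such as the "x" in "12x".
  char extra;
  return (stream >> extra).fail();
}
```

Inside the parser the extra check is not needed, because get_token() only hands over text that already matches the NUMBER grammar.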

bool parser::get_add_expr ( double &  result  )  [private]

Parse an additive expression.

     ADD_EXPR ::= MUL_EXPR | ADD_EXPR '+' MUL_EXPR | ADD_EXPR '-' MUL_EXPR
     
Parameters:
result Store the result here
Returns:
true if an additive expression was parsed, or false at end of input. Syntax errors throw parse_error.

Definition at line 174 of file parse.cpp.

References get_mul_expr(), get_token(), and push_back().

Referenced by get_expr().

{
  if (not get_mul_expr(result))
    return false;
  std::string token;
  while (kind k = get_token(token)) {
    if (k != '+' and k != '-') {
      push_back(token, k);
      return true;
    } else {
      double right;
      if (not get_mul_expr(right))
        throw parse_error("syntax error: unterminated expression. Expected a multiplicative-expression after " + token);
      if (k == '+')
        result += right;
      else
        result -= right;
    }
  }
  return true;
}
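
The grammar rule is left-recursive, but a recursive-descent function cannot begin by calling itself, so get_add_expr() reads the first operand and then loops over the remaining operators. The loop also makes the operators left-associative. A minimal sketch of that loop, assuming a hypothetical function eval_additive in which a plain number stands in for MUL_EXPR:

```cpp
#include <sstream>
#include <string>

// Hypothetical sketch: evaluate NUMBER (('+' | '-') NUMBER)* from a string.
// A plain number stands in for MUL_EXPR to keep the example short.
inline bool eval_additive(std::string const& text, double& result) {
  std::istringstream in(text);
  if ((in >> result).fail())
    return false;                 // no leading operand
  char op;
  while (in >> op) {
    // Unexpected token: get_add_expr() pushes it back and returns true,
    // but this sketch has no caller to hand it to, so it reports failure.
    if (op != '+' and op != '-')
      return false;
    double right;
    if ((in >> right).fail())
      return false;               // operator without a right operand
    if (op == '+')
      result += right;
    else
      result -= right;
  }
  return true;
}
```

Evaluating "10 - 4 - 3" with this loop gives 3, that is (10 - 4) - 3, rather than the 9 that right-to-left grouping would produce.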

bool parser::get_mul_expr ( double &  result  )  [private]

Parse a multiplicative expression.

     MUL_EXPR ::= UNARY | MUL_EXPR '*' UNARY | MUL_EXPR '/' UNARY
     
Parameters:
result Store the result here
Returns:
true if a multiplicative expression was parsed, or false at end of input. Syntax errors and division by zero throw parse_error.

Definition at line 196 of file parse.cpp.

References get_token(), get_unary(), and push_back().

Referenced by get_add_expr().

{
  if (not get_unary(result))
    return false;
  std::string token;
  while (kind k = get_token(token)) {
    if (k != '*' and k != '/') {
      push_back(token, k);
      return true;
    } else {
      double right;
      if (not get_unary(right))
        throw parse_error("syntax error: unterminated expression. Expected a unary-expression after " + token);
      if (k == '*')
        result *= right;
      else if (right == 0.0)
        throw parse_error("division by zero");
      else
        result /= right;
    }
  }
  return true;
}

bool parser::get_primary ( double &  result  )  [private]

Parse a primary expression.

     PRIMARY ::= NUMBER | IDENTIFIER | '(' EXPR ')'
     
Parameters:
result Store the result here
Returns:
true if a primary expression was parsed, or false at end of input. Syntax errors throw parse_error.

Definition at line 239 of file parse.cpp.

References eof, get_expr(), get_number(), get_token(), get_variable(), identifier, and number.

Referenced by get_unary().

{
  std::string token;
  if (kind k = get_token(token)) {
    if (k == '(') {
      if (not get_expr(result))
        return false;
      k = get_token(token);
      if (k == eof)
        throw parse_error("syntax error: EOF when expecting ')'");
      else if (k != ')')
        throw parse_error("syntax error: expected ')', but got " + token);
      else
        return true;
    } else if (k == number) {
      if (not get_number(token, result))
        throw parse_error("Invalid numeric literal: " + token);
      return true;
    } else if (k == identifier) {
      result = get_variable(token);
      return true;
    } else {
      throw parse_error("syntax error: expected a primary, but got " + token);
    }
  }
  return false;
}

bool parser::get_unary ( double &  result  )  [private]

Parse a unary expression.

     UNARY ::= '-' PRIMARY | '+' PRIMARY | PRIMARY
     
Parameters:
result Store the result here
Returns:
true if a unary expression was parsed, or false at end of input. Syntax errors throw parse_error.

Definition at line 220 of file parse.cpp.

References get_primary(), get_token(), and push_back().

Referenced by get_mul_expr().

{
  std::string token;
  if (kind k = get_token(token)) {
    if (k == '-') {
      if (not get_primary(result))
        return false;
      result = -result;
      return true;
    } else if (k == '+') {
      return get_primary(result);
    } else {
      push_back(token, k);
      return get_primary(result);
    }
  }
  return false;
}
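
The four expression functions call each other in order of precedence: additive, multiplicative, unary, primary. The following condensed evaluator is a sketch of that layering, not the parser's code. The class mini_eval is hypothetical, it reads from a string instead of a stream, it omits variables and the var and quit keywords, and it uses std::strtod as a shortcut for the number scanner (which accepts some forms, such as hexadecimal, that the NUMBER grammar does not).

```cpp
#include <cctype>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical string-based evaluator with one function per grammar level.
class mini_eval {
public:
  explicit mini_eval(std::string const& text) : text_(text), pos_(0) {}

  // ADD ::= MUL (('+' | '-') MUL)*
  double add() {
    double result = mul();
    for (;;) {
      char op = peek();
      if (op != '+' and op != '-') return result;
      ++pos_;
      double right = mul();
      if (op == '+') result += right; else result -= right;
    }
  }

private:
  // MUL ::= UNARY (('*' | '/') UNARY)*
  double mul() {
    double result = unary();
    for (;;) {
      char op = peek();
      if (op != '*' and op != '/') return result;
      ++pos_;
      double right = unary();
      if (op == '*') result *= right;
      else if (right == 0.0) throw std::runtime_error("division by zero");
      else result /= right;
    }
  }

  // UNARY ::= ('-' | '+')? PRIMARY
  double unary() {
    char op = peek();
    if (op == '-') { ++pos_; return -primary(); }
    if (op == '+') ++pos_;
    return primary();
  }

  // PRIMARY ::= NUMBER | '(' ADD ')'
  double primary() {
    if (peek() == '(') {
      ++pos_;
      double result = add();
      if (peek() != ')') throw std::runtime_error("expected ')'");
      ++pos_;
      return result;
    }
    if (not std::isdigit(static_cast<unsigned char>(peek())))
      throw std::runtime_error("expected a primary");
    char const* start = text_.c_str() + pos_;
    char* end = 0;
    double result = std::strtod(start, &end);
    pos_ += static_cast<std::string::size_type>(end - start);
    return result;
  }

  // Skip white space and return the next character, or '\0' at the end.
  char peek() {
    while (pos_ < text_.size() and
           std::isspace(static_cast<unsigned char>(text_[pos_])))
      ++pos_;
    return pos_ < text_.size() ? text_[pos_] : '\0';
  }

  std::string text_;
  std::string::size_type pos_;
};
```

Because add() calls mul() for each operand, multiplication binds tighter than addition: mini_eval("1 + 2 * 3").add() evaluates 2 * 3 first. Peeking at a character here plays the role that get_token() followed by push_back() plays in the parser.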

parser::kind parser::get_token ( std::string &  token  )  [private]

Parse a token. A token can be an identifier (keywords such as var and quit are scanned as identifiers), a numeric literal, or a symbol.

     TOKEN ::= IDENTIFIER | NUMBER | SYMBOL
     IDENTIFIER ::= ALPHA (ALPHA | DIGIT)*
     NUMBER ::= DIGIT+ ('.' DIGIT+)? (('E' | 'e') SIGN? DIGIT+)?
     SYMBOL ::= '+' | '-' | '*' | '/' | '%' | '(' | ')' | '='
     
Parameters:
token Store the text of the token here.
Returns:
the token kind

Definition at line 62 of file parse.cpp.

References charify(), eof, get_identifier(), identifier, input_, isalpha(), kind_, number, and token_.

Referenced by get_add_expr(), get_expr(), get_mul_expr(), get_primary(), and get_unary().

{
  // Return the pushed-back token, if there is one.
  if (not token_.empty())
  {
    token = token_;
    kind result(kind_);
    token_.clear();
    kind_ = eof;
    return result;
  }

  char c;
  if (not (input_ >> c))
    return eof;
  if (isalpha(c)) {
    input_.unget();
    get_identifier(token);
    return identifier;
  }

  // Get a single-character symbol.
  token.clear();
  if (c == '+' or c == '-' or c == '*' or c == '/' or c == '%' or c == '(' or c == ')' or c == '=') {
    token += c;
    return kind(c);
  }

  // Get a numeric literal.
  if (c < '0' or c > '9') {
    input_.unget();
    throw parse_error("syntax error: expected digit, got " + charify(c));
  }
  while (c >= '0' and c <= '9') {
    token += c;
    if (not input_.get(c))
      return number;
  }
  if (c == '.') {
    token += c;
    if (not input_.get(c))
      throw parse_error("unterminated number: expected digit after the decimal point");
    if (c < '0' or c > '9') {
      input_.unget();
      throw parse_error("syntax error: expected digit after decimal point, got " + charify(c));
    }
    while (c >= '0' and c <= '9') {
      token += c;
      if (not input_.get(c))
        return number;
    }
  }
  if (c == 'e' or c == 'E') {
    token += c;
    if (not input_.get(c))
      throw parse_error("unterminated number: expected digit in the exponent");
    if (c == '-' or c == '+') {
      token += c;
      if (not input_.get(c))
        throw parse_error("unterminated number: expected digit after sign in the exponent");
    }
    if (c < '0' or c > '9') {
      input_.unget();
      throw parse_error("syntax error: expected digit in the exponent, got " + charify(c));
    }
    while (c >= '0' and c <= '9') {
      token += c;
      if (not input_.get(c))
        return number;
    }
  }
  input_.unget();
  return number;
}
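
The NUMBER rule accounts for most of get_token(). A compact sketch of the same rule follows, under stated assumptions: the functions read_digits, scan_number, and scan_number_from are hypothetical, they use peek() instead of pairs of get() and unget(), and they return false where get_token() throws parse_error.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: append consecutive digits from in to token and
// return how many were read. The first non-digit stays in the stream.
inline int read_digits(std::istream& in, std::string& token) {
  int count = 0;
  for (int next = in.peek(); next >= '0' and next <= '9'; next = in.peek()) {
    token += static_cast<char>(in.get());
    ++count;
  }
  return count;
}

// Scan NUMBER ::= DIGIT+ ('.' DIGIT+)? (('E' | 'e') SIGN? DIGIT+)?
// Returns false if the input does not start with a well-formed number.
inline bool scan_number(std::istream& in, std::string& token) {
  token.clear();
  if (read_digits(in, token) == 0)
    return false;
  if (in.peek() == '.') {
    token += static_cast<char>(in.get());
    if (read_digits(in, token) == 0)
      return false;               // no digit after the decimal point
  }
  int next = in.peek();
  if (next == 'e' or next == 'E') {
    token += static_cast<char>(in.get());
    next = in.peek();
    if (next == '+' or next == '-')
      token += static_cast<char>(in.get());
    if (read_digits(in, token) == 0)
      return false;               // no digit in the exponent
  }
  return true;
}

// Convenience wrapper: scan a number from the start of text.
// Returns an empty string if no well-formed number is found.
inline std::string scan_number_from(std::string const& text) {
  std::istringstream in(text);
  std::string token;
  return scan_number(in, token) ? token : std::string();
}
```

For the input "12.5e-3+4" the scanner stops before the second operator and yields the token 12.5e-3, leaving +4 in the stream for the next call.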

void parser::get_identifier ( std::string &  identifier  )  [private]

Parse an identifier.

Parameters:
identifier Store the identifier here.
Precondition:
first input character is alphabetic

Definition at line 36 of file parse.cpp.

References charify(), input_, isalnum(), and isalpha().

Referenced by get_token().

{
  identifier.clear();
  char c;
  if (not input_.get(c))
    return;
  if (not isalpha(c))
    throw parse_error("syntax error: expected alphabetic, got " + charify(c));
  identifier += c;
  while (input_.get(c)) {
    if (not isalnum(c)) {
      input_.unget();
      return;
    }
    identifier += c;
  }
  return;
}
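
The get() and unget() pattern leaves the character that ends the identifier in the stream, so the next call to get_token() sees it. The sketch below shows this with two hypothetical free functions, scan_identifier and identifier_and_next. Unlike the member function, scan_identifier returns false instead of throwing when the first character is not alphabetic.

```cpp
#include <istream>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical free-function version of get_identifier().
inline bool scan_identifier(std::istream& in, std::string& identifier) {
  std::ctype<char> const& ct = std::use_facet<std::ctype<char> >(in.getloc());
  identifier.clear();
  char c;
  if (not in.get(c))
    return false;                 // end of input
  if (not ct.is(std::ctype_base::alpha, c)) {
    in.unget();
    return false;
  }
  identifier += c;
  while (in.get(c)) {
    if (not ct.is(std::ctype_base::alnum, c)) {
      in.unget();                 // leave the delimiter for the next read
      break;
    }
    identifier += c;
  }
  return true;
}

// Scan one identifier from text, then append '|' and the character that
// follows the identifier, if there is one.
inline std::string identifier_and_next(std::string const& text) {
  std::istringstream in(text);
  std::string id;
  if (not scan_identifier(in, id))
    return std::string();
  char next;
  if (in.get(next))
    id += std::string("|") + next;
  return id;
}
```

Given "x1=2", the scan stops at the equal sign, which is still available to the next read. Per the grammar ALPHA (ALPHA | DIGIT)*, an underscore also ends an identifier.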

void parser::push_back ( std::string const &  token,
kind  k 
) [inline, private]

Push back a token. The next call to get_token() will return the pushed-back token.

Parameters:
token The token to push back.
k The kind of token to push back

Definition at line 119 of file parse.hpp.

References kind_, and token_.

Referenced by get_add_expr(), get_expr(), get_mul_expr(), and get_unary().

{ token_ = token; kind_ = k; }

bool parser::isalpha ( char  c  )  const [inline, private]

Return true if c is alphabetic. Use the locale of the input stream.

Parameters:
c The character to test.

Definition at line 125 of file parse.hpp.

References ctype_.

Referenced by get_identifier(), and get_token().

{ return ctype_.is(ctype_.alpha, c); }

bool parser::isalnum ( char  c  )  const [inline, private]

Return true if c is alphanumeric. Use the locale of the input stream.

Parameters:
c The character to test.

Definition at line 130 of file parse.hpp.

References ctype_.

Referenced by get_identifier().

{ return ctype_.is(ctype_.alnum, c); }

bool parser::isdigit ( char  c  )  const [inline, private]

Return true if c is a digit. Use the locale of the input stream.

Parameters:
c The character to test.

Definition at line 135 of file parse.hpp.

References ctype_.

{ return ctype_.is(ctype_.digit, c); }

bool parser::isprint ( char  c  )  const [inline, private]

Return true if c is printable. Use the locale of the input stream.

Parameters:
c The character to test.

Definition at line 140 of file parse.hpp.

References ctype_.

Referenced by charify().

{ return ctype_.is(ctype_.print, c); }


Member Data Documentation

std::istream& parser::input_ [private]

Share the input stream.

Definition at line 142 of file parse.hpp.

Referenced by get_identifier(), and get_token().

std::ctype<char> const& parser::ctype_ [private]

Cache the ctype facet for checking character categories.

Definition at line 143 of file parse.hpp.

Referenced by isalnum(), isalpha(), isdigit(), and isprint().

std::string parser::token_ [private]

One token push-back.

Definition at line 144 of file parse.hpp.

Referenced by get_token(), and push_back().

kind parser::kind_ [private]

The kind of token that was pushed back.

Definition at line 145 of file parse.hpp.

Referenced by get_token(), and push_back().


The documentation for this class was generated from the following files:
parse.hpp
parse.cpp
Generated on Sun Nov 30 10:04:49 2008 for Calculator by  doxygen 1.5.3