Calculator  Step 4
parser Class Reference

#include <parse.hpp>

Public Types

enum  kind : int {
  eof, identifier, number, plus = '+',
  minus = '-', times = '*', slash = '/', lparen = '(',
  rparen = ')', equal = '=', comma = ','
}
 

Public Member Functions

 parser (std::istream &input)
 
bool get_statement (std::ostream &output)
 

Private Member Functions

std::string charify (char c)
 
bool get_number (std::string const &token, node &result)
 
bool get_expr (node &result)
 
bool get_add_expr (node &result)
 
bool get_mul_expr (node &result)
 
bool get_primary (node &result)
 
bool get_unary (node &result)
 
void get_definition (std::string &name, identifier_list &parameters, node &definition)
 
kind get_token (std::string &token)
 
void get_identifier (std::string &identifier)
 
void get_expr_list (node_list &result)
 
template<class OutputIterator >
OutputIterator get_namelist (OutputIterator output)
 
void push_back (std::string const &token, kind k)
 
bool isalpha (char c) const
 
bool isalnum (char c) const
 
bool isdigit (char c) const
 
bool isprint (char c) const
 

Private Attributes

std::istream & input_
 Share the input stream.
 
std::ctype< char > const & ctype_
 Cache the ctype facet for checking character categories.
 
std::string token_
 One token push-back.
 
kind kind_
 The kind of token that was pushed back.
 

Detailed Description

Parser class. The parser reads tokens from an input stream. A token can be a keyword, numeric literal, identifier, or symbol (operator or punctuator). Symbols can have multiple characters (e.g., :=).

Because a recursive-descent parser sometimes reads one token past the end of the construct it is parsing, it keeps a push-back token. Once the parser knows it has gone too far, it pushes back the most recently read token. The next call to get_token() retrieves the pushed-back token.

Only one push-back is available, which limits the complexity of the syntax.

Definition at line 25 of file parse.hpp.
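
As an illustration of the single push-back slot described above, here is a minimal sketch. The names word_reader and words_after_numbers are hypothetical and are not part of parse.hpp; the sketch works on whitespace-separated words rather than on the calculator's tokens.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical reader (not from parse.hpp) with a one-word push-back slot.
class word_reader {
public:
  explicit word_reader(std::istream& input)
  : input_(input), pending_(), has_pending_(false)
  {}

  // Return the pushed-back word if there is one; otherwise read the next word.
  // Returns false when the input is exhausted.
  bool get(std::string& word) {
    if (has_pending_) {
      word = pending_;
      has_pending_ = false;
      return true;
    }
    return static_cast<bool>(input_ >> word);
  }

  // Give back the most recently read word. Only one slot is available.
  void push_back(std::string const& word) {
    assert(not has_pending_ && "only one push-back is available");
    pending_ = word;
    has_pending_ = true;
  }

private:
  std::istream& input_;
  std::string pending_;
  bool has_pending_;
};

// Skip the leading words that start with a digit, then return the remaining
// words joined by single spaces.
std::string words_after_numbers(std::string const& text) {
  std::istringstream input{text};
  word_reader reader{input};
  std::string word;
  while (reader.get(word)) {
    if (word[0] < '0' or word[0] > '9') {
      // Read one word too many: hand it back for the next loop.
      reader.push_back(word);
      break;
    }
  }
  std::string rest;
  while (reader.get(word)) {  // sees the pushed-back word first
    if (not rest.empty())
      rest += ' ';
    rest += word;
  }
  return rest;
}
```

The first loop cannot know that the numbers have ended until it has read a word that is not a number, which is the same situation the parser is in when it reads a token that belongs to the enclosing rule.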

Member Enumeration Documentation

enum parser::kind : int

Token kind. Declare a name for each single-character token, to ensure the enumerated type can represent any operator or punctuator character.

Enumerator
eof 
identifier 
number 
plus 
minus 
times 
slash 
lparen 
rparen 
equal 
comma 

Definition at line 31 of file parse.hpp.
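
The following sketch shows why giving each operator enumerator the value of its character is convenient. The names token_kind, to_kind, and is_additive are hypothetical stand-ins used only for illustration; they are not declared in parse.hpp.

```cpp
#include <cassert>

// Hypothetical miniature of the token-kind idea (illustration only).
enum token_kind : int {
  tk_eof,         // value 0, so it tests false in a condition
  tk_identifier,  // value 1
  tk_number,      // value 2
  tk_plus = '+',
  tk_minus = '-'
};

// Convert an operator character to its token kind. The underlying type is
// fixed to int, so any character value can be represented.
inline token_kind to_kind(char c) {
  // Assumption of this sketch: the character must not collide with the
  // small values reserved for tk_eof, tk_identifier, and tk_number.
  assert(c > tk_number && "character collides with a named token kind");
  return static_cast<token_kind>(c);
}

// A kind can be compared directly with a character literal.
inline bool is_additive(token_kind k) {
  return k == '+' or k == '-';
}
```

Because the end-of-file kind has the value zero, a loop of the form while (kind k = get_token(token)) stops at end of input without a separate comparison.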

Constructor & Destructor Documentation

parser::parser ( std::istream &  input)

Constructor. Save the input stream.

Parameters
input  The input stream

Definition at line 10 of file parse.cpp.

11 : input_(input),
12  ctype_(std::use_facet<std::ctype<char> >(input.getloc())),
13  token_(),
14  kind_()
15 {}

Member Function Documentation

std::string parser::charify ( char  c)
private

Convert a character to a readable form.

Parameters
c  The character
Returns
A C++-style character literal that ensures c is readable.

Definition at line 17 of file parse.cpp.

References isprint().

Referenced by get_identifier(), and get_token().

18 {
19  if (c == '\a') return R"('\a')";
20  if (c == '\b') return R"('\b')";
21  if (c == '\f') return R"('\f')";
22  if (c == '\n') return R"('\n')";
23  if (c == '\r') return R"('\r')";
24  if (c == '\t') return R"('\t')";
25  if (c == '\v') return R"('\v')";
26  if (c == '\'') return R"('\'')";
27  if (c == '\\') return R"('\\')";
28 
29  if (isprint(c))
30  return std::string{"\'"} + std::string(1,c) + "\'";
31  else {
32  std::ostringstream stream{};
33  stream << "'\\x" << std::hex;
34  stream.fill('0');
35  stream.width(2);
36  stream << (std::char_traits<char>::to_int_type(c) & 0xFF) << '\'';
37  return stream.str();
38  }
39 }
bool parser::get_add_expr ( node &  result)
private

Parse an additive expression.

ADD_EXPR ::= MUL_EXPR | ADD_EXPR + MUL_EXPR | ADD_EXPR - MUL_EXPR
Parameters
result  Store the result here
Returns
true to continue parsing or false to stop (end of file or error)

Definition at line 213 of file parse.cpp.

References get_mul_expr(), get_token(), and push_back().

Referenced by get_expr().

214 {
215  if (not get_mul_expr(result))
216  return false;
217  std::string token{};
218  while (kind k = get_token(token)) {
219  if (k != '+' and k != '-') {
220  push_back(token, k);
221  return true;
222  } else {
223  node right{};
224  if (not get_mul_expr(right))
225  throw syntax_error{"unterminated expression. Expected a multiplicative-expression after " + token};
226  result = node(result, k, right);
227  }
228  }
229  return true;
230 }
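
The listing above turns a left-recursive grammar rule into a loop: parse one operand, then keep folding operands into the result while an additive operator follows. A minimal standalone sketch of that technique follows. The function eval_sum is hypothetical, works on plain numbers instead of node objects, and assumes the operands and operators are separated by spaces.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of parse.cpp). It evaluates
//   SUM ::= NUMBER | SUM '+' NUMBER | SUM '-' NUMBER
// by reading one operand and then looping while an operator follows.
double eval_sum(std::string const& text) {
  std::istringstream in{text};
  double result{};
  if (not (in >> result))
    throw std::runtime_error{"expected a number"};
  char op{};
  while (in >> op) {
    double right{};
    if ((op != '+' and op != '-') or not (in >> right))
      throw std::runtime_error{"malformed sum"};
    // Folding into result at every step makes the operators left-associative.
    result = (op == '+') ? result + right : result - right;
  }
  return result;
}
```

With this structure, "10 - 4 - 3" is evaluated as (10 - 4) - 3, which is what the left-recursive rule requires.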
void parser::get_definition ( std::string &  name,
identifier_list &  parameters,
node &  definition 
)
private

Parse a function or variable definition. A variable is just like a function that takes no parameters.

DEFINITION ::= DEF IDENTIFIER OPT_PARAMETERS '=' EXPR
OPT_PARAMETERS ::= empty | '(' OPT_IDENTIFIER_LIST ')'
OPT_IDENTIFIER_LIST ::= empty | IDENTIFIER_LIST
IDENTIFIER_LIST ::= IDENTIFIER | IDENTIFIER_LIST ',' IDENTIFIER
Parameters
[out] name  Store the variable or function name here
[out] parameters  Store the list of parameter names here
[out] definition  Store the definition expression here

Definition at line 157 of file parse.cpp.

References get_expr(), get_namelist(), get_token(), and identifier.

Referenced by get_statement().

158 {
159  // Define a variable.
160  kind k{get_token(name)};
161  if (k != identifier)
162  throw syntax_error{"expected IDENTIFIER, got " + name};
163 
164  std::string token;
165  k = get_token(token);
166  if (k == '(') {
167  get_namelist(std::back_inserter(parameters));
168  k = get_token(token);
169  }
170 
171  if (k != '=')
172  throw syntax_error{"expected = in definition, got " + token};
173 
174  if (not get_expr(definition))
175  throw syntax_error{"expected expression in assignment"};
176 }
bool parser::get_expr ( node &  result)
private

Parse an expression.

Parameters
result  Store the result here
Returns
true to continue parsing or false to stop (end of file or error)

Definition at line 208 of file parse.cpp.

References get_add_expr().

Referenced by get_definition(), get_expr_list(), get_primary(), and get_statement().

209 {
210  return get_add_expr(result);
211 }
void parser::get_expr_list ( node_list &  result)
private

Parse a comma-separated expression list.

Parameters
[out] result  Store the result here

Definition at line 272 of file parse.cpp.

References get_expr(), get_token(), and push_back().

Referenced by get_primary().

273 {
274  result.clear();
275  std::string token{};
276  while (kind k = get_token(token)) {
277  if (k == ')')
278  return;
279  push_back(token, k);
280  node expr{};
281  if (not get_expr(expr))
282  throw syntax_error{"unexpected end of line in function argument"};
283  result.push_back(expr);
284  k = get_token(token);
285  if (k == ')')
286  return;
287  else if (k != ',')
288  throw syntax_error{"expected comma in argument list, got " + token};
289  }
290  throw syntax_error{"unexpected end of line in function argument list"};
291 }
void parser::get_identifier ( std::string &  identifier)
private

Parse an identifier.

Parameters
identifier  Store the identifier here.
Precondition
The first input character is alphabetic.

Definition at line 41 of file parse.cpp.

References charify(), input_, isalnum(), and isalpha().

Referenced by get_token().

42 {
43  identifier.clear();
44  char c{};
45  if (not input_.get(c))
46  return;
47  if (not isalpha(c))
48  throw syntax_error("expected alphabetic, got " + charify(c));
49  identifier += c;
50  while (input_.get(c)) {
51  if (not isalnum(c)) {
52  input_.unget();
53  return;
54  }
55  identifier += c;
56  }
57  return;
58 }
bool parser::get_mul_expr ( node &  result)
private

Parse a multiplicative expression.

MUL_EXPR ::= UNARY | MUL_EXPR * UNARY | MUL_EXPR / UNARY
Parameters
result  Store the result here
Returns
true to continue parsing or false to stop (end of file or error)

Definition at line 232 of file parse.cpp.

References get_token(), get_unary(), and push_back().

Referenced by get_add_expr().

233 {
234  if (not get_unary(result))
235  return false;
236  std::string token{};
237  while (kind k = get_token(token)) {
238  if (k != '*' and k != '/') {
239  push_back(token, k);
240  return true;
241  } else {
242  node right{};
243  if (not get_unary(right))
244  throw syntax_error{"unterminated expression. Expected a unary-expression after " + token};
245  result = node(result, k, right);
246  }
247  }
248  return true;
249 }
template<class OutputIterator >
OutputIterator parser::get_namelist ( OutputIterator  output)
private

Parse a list of parameter names. Names are identifiers, separated by commas. The list can be empty. This is a template so the container type is unimportant. Any output iterator will do.

Parameters
[out] output  Store the identifiers here
Returns
a copy of output after storing all the identifiers

Definition at line 193 of file parse.hpp.

References get_token(), and identifier.

Referenced by get_definition().

194 {
195  std::string token{};
196  while (kind k = get_token(token)) {
197  if (k == ')')
198  return output;
199  else if (k != identifier)
200  throw syntax_error{"expected function parameter, got " + token};
201  else {
202  *output = token;
203  ++output;
204 
205  k = get_token(token);
206  if (k == ')')
207  return output;
208  if (k != ',')
209  throw syntax_error{"expected comma in function parameter list, got " + token};
210  }
211  }
212  throw syntax_error{"unexpected end of line in function parameter list"};
213 }
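
To illustrate why taking an output iterator keeps the function independent of the container type, here is a minimal sketch. The functions split_names and names_of are hypothetical; they split a plain comma-separated string instead of reading parser tokens.

```cpp
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from parse.hpp): write each comma-separated name
// in text to output and return the advanced iterator.
template<class OutputIterator>
OutputIterator split_names(std::string const& text, OutputIterator output) {
  std::istringstream in{text};
  std::string name;
  while (std::getline(in, name, ',')) {
    *output = name;
    ++output;
  }
  return output;
}

// One possible caller: collect the names in a vector. Any other container
// that works with an output iterator could be used without changing
// split_names.
std::vector<std::string> names_of(std::string const& text) {
  std::vector<std::string> result;
  split_names(text, std::back_inserter(result));
  return result;
}
```

get_definition() uses the same pattern when it passes std::back_inserter(parameters) to get_namelist().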
bool parser::get_number ( std::string const &  token,
node &  result 
)
private

Parse a floating-point number.

Parameters
token  The token to parse
result  Store the number here
Returns
true if token is a valid number or false for an error

Definition at line 146 of file parse.cpp.

Referenced by get_primary().

147 {
148  std::istringstream stream{token};
149  // If the value overflows or is otherwise invalid, return false.
150  double value{};
151  if (not (stream >> value))
152  return false;
153  result = node(value);
154  return true;
155 }
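
The listing above accepts a token whenever the stream extraction succeeds. As a hedged variation, the sketch below also rejects trailing characters after the number. The function to_double is hypothetical and is not part of parse.cpp; whether the calculator needs this extra check is an assumption, because get_token() already restricts which characters reach get_number().

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: convert token to a double. Returns false if the text
// is not a number or if anything follows the number.
bool to_double(std::string const& token, double& value) {
  std::istringstream stream{token};
  if (not (stream >> value))
    return false;
  // After a complete match nothing may remain in the stream.
  return stream.peek() == std::char_traits<char>::eof();
}
```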
bool parser::get_primary ( node &  result)
private

Parse a primary expression.

PRIMARY ::= NUMBER | IDENTIFIER | '(' EXPR ')' | FUNCTION_CALL
FUNCTION_CALL ::= IDENTIFIER '(' OPT_EXPR_LIST ')'
OPT_EXPR_LIST ::= empty | EXPR_LIST
EXPR_LIST ::= EXPR | EXPR_LIST ',' EXPR
Parameters
result  Store the result here
Returns
true to continue parsing or false to stop (end of file or error)

Definition at line 293 of file parse.cpp.

References eof, get_expr(), get_expr_list(), get_number(), get_token(), identifier, number, and push_back().

Referenced by get_unary().

294 {
295  std::string token{};
296  kind k = get_token(token);
297  if (k == eof)
298  return false;
299 
300  if (k == '(') {
301  // Parenthesized expression
302  if (not get_expr(result))
303  throw syntax_error{"expected expression, got end of line"};
304  k = get_token(token);
305  if (k != ')')
306  throw syntax_error{"expected ')', got " + token};
307  else
308  return true;
309  }
310 
311  if (k == number) {
312  // Numeric literal
313  if (not get_number(token, result))
314  throw syntax_error{"Invalid numeric literal: " + token};
315  return true;
316  }
317 
318  if (k == identifier) {
319  // Identifier: variable or function call
320  std::string next{};
321  k = get_token(next);
322  if (k == '(') {
323  // function call
324  node_list arguments{};
325  get_expr_list(arguments);
326  result = node{std::move(token), std::move(arguments)};
327  } else {
328  static const node_list no_arguments{};
329  // Variable reference or function call with no arguments
330  push_back(next, k);
331  result = node(std::move(token), no_arguments);
332  }
333  return true;
334  }
335  throw syntax_error("expected a primary, got " + token);
336 }
node_list is std::vector< node >, a sequence of nodes (node.hpp:13).
bool parser::get_statement ( std::ostream &  output)

Read and parse one statement. If the statement is a variable or function definition, store the variable or function. If the statement is an expression, evaluate it and print the result to output.

STATEMENT ::= DEFINITION | QUIT | EXPR
Parameters
output  The output stream.
Returns
true to continue or false to end the loop
Exceptions
parse_error  for various syntax and other errors

Definition at line 178 of file parse.cpp.

References eof, node::evaluate(), get_definition(), get_expr(), get_token(), identifier, push_back(), and set_function().

179 {
180  std::string token{};
181  kind k(get_token(token));
182  if (k == eof)
183  return false;
184 
185  if (k == identifier and token == "def") {
186  node definition{};
187  identifier_list parameters{};
188  get_definition(token, parameters, definition);
189  set_function(token, node{std::move(parameters), definition});
190  return true;
191  }
192 
193  if (k == identifier and token == "quit")
194  std::exit(0);
195 
196  // Otherwise, the statement must be an expression.
197  push_back(token, k);
198  node n{};
199  if (not get_expr(n))
200  return false;
201  else {
202  // Evaluate the expression and print the result.
203  output << n.evaluate() << '\n';
204  return true;
205  }
206 }
identifier_list is std::vector< std::string >, a sequence of identifiers such as parameter names (node.hpp:19). set_function(std::string const &name, node value) is defined at variables.cpp:70, and node::evaluate() const, which returns double, at node.cpp:57.
parser::kind parser::get_token ( std::string &  token)
private

Parse a token. A token can be an identifier (including keywords such as def and quit), a numeric literal, or a symbol.

TOKEN ::= IDENTIFIER | NUMBER | SYMBOL
IDENTIFIER ::= ALPHA (ALPHA | DIGIT)*
NUMBER ::= DIGIT+ ('.' DIGIT+)? (('e' | 'E') SIGN? DIGIT+)?
SYMBOL ::= '+' | '-' | '*' | '/' | '%' | '(' | ')' | '=' | ','
Parameters
token  Store the text of the token here.
Returns
the token kind

Definition at line 69 of file parse.cpp.

References charify(), eof, get_identifier(), identifier, input_, isalpha(), kind_, number, and token_.

Referenced by get_add_expr(), get_definition(), get_expr_list(), get_mul_expr(), get_namelist(), get_primary(), get_statement(), and get_unary().

70 {
71  if (not token_.empty())
72  {
73  kind result{kind_};
74  token = token_;
75 
76  token_.clear();
77  kind_ = eof;
78 
79  return result;
80  }
81 
82  char c{};
83  if (not (input_ >> c)) {
84  token = "end of line";
85  return eof;
86  }
87  if (isalpha(c)) {
88  input_.unget();
89  get_identifier(token);
90  return identifier;
91  }
92 
93  // Get a single-character symbol or a numeric literal.
94  token.clear();
95  if (c == '+' or c == '-' or c == '*' or c == '/' or c == '%' or c == '(' or c == ')' or c == '=' or c == ',') {
96  token += c;
97  return kind(c);
98  }
99 
100  if (c < '0' or c > '9') {
101  input_.unget();
102  throw syntax_error{"expected digit, got " + charify(c)};
103  }
104  while (c >= '0' and c <= '9') {
105  token += c;
106  if (not input_.get(c))
107  return number;
108  }
109  if (c == '.') {
110  token += c;
111  if (not input_.get(c))
112  throw syntax_error{"unterminated number: expected digit after the decimal point"};
113  if (c < '0' or c > '9') {
114  input_.unget();
115  throw syntax_error{"expected digit after decimal point, got " + charify(c)};
116  }
117  while (c >= '0' and c <= '9') {
118  token += c;
119  if (not input_.get(c))
120  return number;
121  }
122  }
123  if (c == 'e' or c == 'E') {
124  token += c;
125  if (not input_.get(c))
126  throw syntax_error{"unterminated number: expected digit in the exponent"};
127  if (c == '-' or c == '+') {
128  token += c;
129  if (not input_.get(c))
130  throw syntax_error{"unterminated number: expected digit after sign in the exponent"};
131  }
132  if (c < '0' or c > '9') {
133  input_.unget();
134  throw syntax_error{"expected digit in the exponent, got " + charify(c)};
135  }
136  while (c >= '0' and c <= '9') {
137  token += c;
138  if (not input_.get(c))
139  return number;
140  }
141  }
142  input_.unget();
143  return number;
144 }
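
The NUMBER rule that the listing above implements can also be checked in isolation. The sketch below is a hypothetical validator, not code from parse.cpp; it tests whether a complete string matches DIGIT+ ('.' DIGIT+)? (('e' | 'E') SIGN? DIGIT+)? and assumes plain ASCII digits, as the listing does.

```cpp
#include <cstddef>
#include <string>

namespace {
// Advance pos past a run of ASCII digits. Returns true if at least one digit
// was consumed.
bool skip_digits(std::string const& s, std::size_t& pos) {
  std::size_t const start{pos};
  while (pos < s.size() and s[pos] >= '0' and s[pos] <= '9')
    ++pos;
  return pos != start;
}
}

// Hypothetical helper: return true if s matches the NUMBER rule exactly.
bool matches_number(std::string const& s) {
  std::size_t pos{0};
  if (not skip_digits(s, pos))         // integer part is required
    return false;
  if (pos < s.size() and s[pos] == '.') {
    ++pos;
    if (not skip_digits(s, pos))       // digits must follow the point
      return false;
  }
  if (pos < s.size() and (s[pos] == 'e' or s[pos] == 'E')) {
    ++pos;
    if (pos < s.size() and (s[pos] == '+' or s[pos] == '-'))
      ++pos;                           // optional sign
    if (not skip_digits(s, pos))       // exponent digits are required
      return false;
  }
  return pos == s.size();              // nothing may follow the number
}
```

Unlike get_token(), which stops at the first character that cannot continue the number and leaves it in the stream, this validator requires the whole string to match.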
bool parser::get_unary ( node &  result)
private

Parse a unary expression.

UNARY ::= '-' PRIMARY | '+' PRIMARY | PRIMARY
Parameters
result  Store the result here
Returns
true to continue parsing or false to stop (end of file or error)

Definition at line 251 of file parse.cpp.

References eof, get_primary(), get_token(), and push_back().

Referenced by get_mul_expr().

252 {
253  std::string token{};
254  kind k = get_token(token);
255  if (k == eof)
256  return false;
257  if (k == '-') {
258  if (not get_primary(result))
259  throw syntax_error{"expected primary after unary " + token + ", got end of line"};
260  result = node(k, result);
261  return true;
262  } else if (k == '+') {
263  if (not get_primary(result))
264  throw syntax_error{"expected primary after unary +, got end of line"};
265  return true;
266  } else {
267  push_back(token, k);
268  return get_primary(result);
269  }
270 }
bool parser::isalnum ( char  c) const
inline private

Return true if c is alphanumeric. Use the locale of the input stream.

Parameters
c  The character to test.

Definition at line 167 of file parse.hpp.

References ctype_.

Referenced by get_identifier().

167 { return ctype_.is(ctype_.alnum, c); }
bool parser::isalpha ( char  c) const
inline private

Return true if c is alphabetic. Use the locale of the input stream.

Parameters
c  The character to test.

Definition at line 162 of file parse.hpp.

References ctype_.

Referenced by get_identifier(), and get_token().

162 { return ctype_.is(ctype_.alpha, c); }
bool parser::isdigit ( char  c) const
inline private

Return true if c is a digit. Use the locale of the input stream.

Parameters
c  The character to test.

Definition at line 172 of file parse.hpp.

References ctype_.

172 { return ctype_.is(ctype_.digit, c); }
bool parser::isprint ( char  c) const
inline private

Return true if c is printable. Use the locale of the input stream.

Parameters
c  The character to test.

Definition at line 177 of file parse.hpp.

References ctype_.

Referenced by charify().

177 { return ctype_.is(ctype_.print, c); }
void parser::push_back ( std::string const &  token,
kind  k 
)
private

Push back a token. The next call to get_token() will return the pushed-back token.

Parameters
token  The token to push back.
k  The kind of token being pushed back

Definition at line 60 of file parse.cpp.

References eof, kind_, and token_.

Referenced by get_add_expr(), get_expr_list(), get_mul_expr(), get_primary(), get_statement(), and get_unary().

61 {
62  kind_ = k;
63  if (kind_ == eof)
64  token_ = "end of line";
65  else
66  token_ = token;
67 }

Member Data Documentation

std::ctype<char> const& parser::ctype_
private

Cache the ctype facet for checking character categories.

Definition at line 180 of file parse.hpp.

Referenced by isalnum(), isalpha(), isdigit(), and isprint().
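
As a minimal sketch of how a ctype facet classifies characters, the hypothetical function below counts the alphanumeric characters in a string. The name count_alnum and the default of the classic "C" locale are assumptions made for this illustration; the parser itself takes the locale from its input stream.

```cpp
#include <cstddef>
#include <locale>
#include <string>

// Hypothetical helper (not from parse.hpp): count the characters of text that
// the ctype facet of loc classifies as alphanumeric.
std::size_t count_alnum(std::string const& text,
                        std::locale const& loc = std::locale::classic()) {
  // Look up the facet once and reuse it for every character, in the same
  // spirit as the cached ctype_ member.
  std::ctype<char> const& facet = std::use_facet<std::ctype<char>>(loc);
  std::size_t count{0};
  for (char c : text)
    if (facet.is(std::ctype_base::alnum, c))
      ++count;
  return count;
}
```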

std::istream& parser::input_
private

Share the input stream.

Definition at line 179 of file parse.hpp.

Referenced by get_identifier(), and get_token().

kind parser::kind_
private

The kind of token that was pushed back.

Definition at line 182 of file parse.hpp.

Referenced by get_token(), and push_back().

std::string parser::token_
private

One token push-back.

Definition at line 181 of file parse.hpp.

Referenced by get_token(), and push_back().


The documentation for this class was generated from the following files: