A couple more fixes and edits

Daniel Holden
2014-01-21 11:29:08 +00:00
parent 172d4ae5d9
commit 51dbf66b50
4 changed files with 223 additions and 237 deletions

237
README.md

@@ -1,13 +1,13 @@
Micro Parser Combinators Micro Parser Combinators
======================== ========================
_mpc_ is a lightweight but powerful Parser Combinator library for C. _mpc_ is a lightweight and powerful Parser Combinator library for C.
Using _mpc_ might be of interest to you if you are... Using _mpc_ might be of interest to you if you are...
* Building a new programming language * Building a new programming language
* Building a new data format * Building a new data format
* Parsing an existing programming languages * Parsing an existing programming language
* Parsing an existing data format * Parsing an existing data format
* Embedding a Domain Specific Language * Embedding a Domain Specific Language
* Implementing [Greenspun's Tenth Rule](http://en.wikipedia.org/wiki/Greenspun%27s_tenth_rule) * Implementing [Greenspun's Tenth Rule](http://en.wikipedia.org/wiki/Greenspun%27s_tenth_rule)
@@ -21,15 +21,15 @@ Features
* Easy to Integrate (One Source File in ANSI C) * Easy to Integrate (One Source File in ANSI C)
* Error Messages * Error Messages
* Regular Expression Parser Generator * Regular Expression Parser Generator
* Grammar Parser Generator * Language/Grammar Parser Generator
Alternatives Alternatives
------------ ------------
The current main alternative C based parser combinator is a branch of [Cesium3](https://github.com/wbhart/Cesium3/tree/combinators). The current main alternative in C based parser combinators is a branch of [Cesium3](https://github.com/wbhart/Cesium3/tree/combinators).
_mpc_ provides a number of features that this project does not offer, but it also overcomes a number of potential downsides: _mpc_ provides a number of features that this project does not offer, and also overcomes a number of potential downsides:
* _mpc_ Works for Generic Types * _mpc_ Works for Generic Types
* _mpc_ Doesn't rely on Boehm-Demers-Weiser Garbage Collection * _mpc_ Doesn't rely on Boehm-Demers-Weiser Garbage Collection
@@ -38,39 +38,40 @@ _mpc_ provides a number of features that this project does not offer, but it als
View From the Top View From the Top
----------------- =================
In this example I create a parser for a basic maths language. The function `parse_maths` takes as input some mathematical expression and outputs an instance of `mpc_ast_t`. In this example I create a parser for a basic maths language. The function `parse_maths` takes as input some mathematical expression and outputs an instance of `mpc_ast_t`.
```c ```c
#include "mpc.h" #include "mpc.h"
mpc_ast_t* parse_maths(const char* input) { void parse_maths(const char *input) {
mpc_parser_t* Expr = mpc_new("expression"); mpc_parser_t *Expr = mpc_new("expression");
mpc_parser_t* Prod = mpc_new("product"); mpc_parser_t *Prod = mpc_new("product");
mpc_parser_t* Value = mpc_new("value"); mpc_parser_t *Value = mpc_new("value");
mpc_parser_t* Maths = mpc_new("maths"); mpc_parser_t *Maths = mpc_new("maths");
mpca_lang( mpca_lang(
" \ " \
expression : <product> (('+' | '-') <product>)*; \ expression : <product> (('+' | '-') <product>)*; \
product : <value> (('*' | '/') <value> )*; \ product : <value> (('*' | '/') <value> )*; \
value : /[0-9]+/ | '(' <expression> ')'; \ value : /[0-9]+/ | '(' <expression> ')'; \
maths : /^/ <expression> /$/; \ maths : /^/ <expression> /$/; \
", ",
Expr, Prod, Value, Maths); Expr, Prod, Value, Maths);
mpc_result_t r; mpc_result_t r;
if (!mpc_parse("<parse_maths>", input, Maths, &r)) { if (!mpc_parse("<parse_maths>", input, Maths, &r)) {
mpc_ast_print(r.output);
mpc_ast_delete(r.output);
} else {
mpc_err_print(r.error); mpc_err_print(r.error);
mpc_err_delete(r.error); mpc_err_delete(r.error);
exit(EXIT_FAILURE);
} }
mpc_cleanup(4, Expr, Prod, Value, Maths); mpc_cleanup(4, Expr, Prod, Value, Maths);
return r.output;
} }
``` ```
@@ -94,9 +95,11 @@ If you were to input `"(4 * 2 * 11 + 2) - 5"` into this function, the `mpc_ast_t
product|value|regex: '5' product|value|regex: '5'
``` ```
Getting Started
===============
View From the Bottom Introduction
-------------------- ------------
Parser Combinators are structures that encode how to parse a particular language. They can be combined using intuitive operators to create new parsers of increasing complexity. Using these operators detailed grammars and languages can be parsed and processed in a quick, efficient, and easy way. Parser Combinators are structures that encode how to parse a particular language. They can be combined using intuitive operators to create new parsers of increasing complexity. Using these operators detailed grammars and languages can be parsed and processed in a quick, efficient, and easy way.
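To make the idea concrete, here is a minimal sketch of parser combinators in plain C. It is not how _mpc_ is implemented; the names `parser_t`, `run`, and `ident_length` are hypothetical and exist only for this illustration. Each parser is a small struct, and combinators such as "or", "and", and "many" simply hold pointers to other parsers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical illustration types, not part of mpc. */
typedef enum { P_CHAR, P_RANGE, P_OR, P_AND, P_MANY } kind_t;

typedef struct parser {
  kind_t kind;
  char lo, hi;                 /* P_CHAR uses lo; P_RANGE uses lo..hi */
  const struct parser *a, *b;  /* sub-parsers used by the combinators */
} parser_t;

/* Runs parser p at the start of s.
   Returns the number of characters consumed, or -1 on failure. */
static int run(const parser_t *p, const char *s) {
  int n, m, total;
  assert(p != NULL && s != NULL);
  switch (p->kind) {
  case P_CHAR:
    return (*s != '\0' && *s == p->lo) ? 1 : -1;
  case P_RANGE:
    return (*s != '\0' && *s >= p->lo && *s <= p->hi) ? 1 : -1;
  case P_OR:   /* try a, and if it fails try b at the same position */
    n = run(p->a, s);
    return n >= 0 ? n : run(p->b, s);
  case P_AND:  /* run a, then run b on whatever input is left */
    n = run(p->a, s);
    if (n < 0) return -1;
    m = run(p->b, s + n);
    return m < 0 ? -1 : n + m;
  case P_MANY: /* repeat a until it fails or stops consuming input */
    total = 0;
    while ((n = run(p->a, s + total)) > 0) total += n;
    return total;
  }
  return -1;
}

/* Length of the C style identifier at the start of s, or -1 if none. */
int ident_length(const char *s) {
  static const parser_t lower = { P_RANGE, 'a', 'z', NULL, NULL };
  static const parser_t upper = { P_RANGE, 'A', 'Z', NULL, NULL };
  static const parser_t digit = { P_RANGE, '0', '9', NULL, NULL };
  static const parser_t under = { P_CHAR, '_', '_', NULL, NULL };
  static const parser_t alpha = { P_OR, 0, 0, &lower, &upper };
  static const parser_t head  = { P_OR, 0, 0, &alpha, &under };
  static const parser_t alnum = { P_OR, 0, 0, &alpha, &digit };
  static const parser_t rest1 = { P_OR, 0, 0, &alnum, &under };
  static const parser_t rest  = { P_MANY, 0, 0, &rest1, NULL };
  static const parser_t ident = { P_AND, 0, 0, &head, &rest };
  return run(&ident, s);
}
```

With these definitions `ident_length("foo_1 bar")` returns `5` and `ident_length("9abc")` returns `-1`. The identifier parser is built purely by combining smaller parsers, which is the property the paragraph above describes.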
@@ -105,7 +108,7 @@ The trick behind Parser Combinators is the observation that by structuring the l
As is shown in the above example _mpc_ takes this one step further, and actually allows you to specify the grammar directly, or to build up parsers using library functions. As is shown in the above example _mpc_ takes this one step further, and actually allows you to specify the grammar directly, or to build up parsers using library functions.
Parsers Parsing
------- -------
The Parser Combinator type in _mpc_ is `mpc_parser_t`. This encodes a function that attempts to parse some string and, if successful, returns a pointer to some data. Otherwise it returns some error. A parser can be run using `mpc_parse`. The Parser Combinator type in _mpc_ is `mpc_parser_t`. This encodes a function that attempts to parse some string and, if successful, returns a pointer to some data. Otherwise it returns some error. A parser can be run using `mpc_parse`.
@@ -113,24 +116,24 @@ The Parser Combinator type in _mpc_ is `mpc_parser_t`. This encodes a function t
* * * * * *
```c ```c
int mpc_parse(const char* filename, const char* string, mpc_parser_t* p, mpc_result_t* r); int mpc_parse(const char *filename, const char *string, mpc_parser_t *p, mpc_result_t *r);
``` ```
This function returns `1` on success and `0` on failure. It takes as input some parser `p`, some input `string`, and some `filename`. It outputs into `r` the result of the parse - which is either a pointer to some data object, or an error. The type `mpc_result_t` is a union type defined as follows. This function returns `1` on success and `0` on failure. It takes as input some parser `p`, some input `string`, and some `filename`. It outputs into `r` the result of the parse - which is either a pointer to some data object, or an error. The type `mpc_result_t` is a union type defined as follows.
```c ```c
typedef union { typedef union {
mpc_err_t* error; mpc_err_t *error;
mpc_val_t* output; mpc_val_t *output;
} mpc_result_t; } mpc_result_t;
``` ```
where `mpc_val_t` is synonymous with `void*` and simply represents some pointer to data - the exact type of which is dependent on the parser. Some variations on the above also exist. For almost all of the built-in and basic parsers the return type for a successful parser will be `char*`. where `mpc_val_t` is synonymous with `void *` and simply represents some pointer to data - the exact type of which is dependent on the parser. Some variations on the above also exist. For almost all of the built-in and basic parsers the return type for a successful parser will be `char *`.
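Because the result is a plain union, nothing inside it records which member is valid; only the integer return value of the parse call tells you. The sketch below shows that calling convention with stand-in types. Everything prefixed `demo_` is hypothetical and is not part of _mpc_; it only mirrors the shape of the union above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins that mirror the shape of the union above. */
typedef struct { int pos; const char *expected; } demo_err_t;
typedef void demo_val_t;

typedef union {
  demo_err_t *error;
  demo_val_t *output;
} demo_result_t;

/* Parses the leading decimal digits of input.
   On success returns 1 and stores a heap-allocated string in r->output.
   On failure returns 0 and stores a heap-allocated error in r->error.
   The caller owns whichever member was written and must free it. */
int demo_parse_digits(const char *input, demo_result_t *r) {
  size_t n = 0;
  assert(input != NULL && r != NULL);
  while (input[n] >= '0' && input[n] <= '9') n++;
  if (n == 0) {
    demo_err_t *e = malloc(sizeof *e);
    if (e == NULL) abort();
    e->pos = 0;
    e->expected = "digit";
    r->error = e;
    return 0;
  } else {
    char *out = malloc(n + 1);
    if (out == NULL) abort();
    memcpy(out, input, n);
    out[n] = '\0';
    r->output = out;
    return 1;
  }
}
```

A caller branches on the return value first and only then reads `r.output` or `r.error`, never both. Reading the wrong member would reinterpret one pointer as the other.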
* * * * * *
```c ```c
int mpc_fparse(const char* filename, FILE* file, mpc_parser_t* p, mpc_result_t* r); int mpc_parse_file(const char *filename, FILE *file, mpc_parser_t *p, mpc_result_t *r);
``` ```
Parses the contents of `file` with parser `p` and returns the result in `r`. Returns `1` on success and `0` on failure. This is also the correct method to parse input from pipes or streams. Parses the contents of `file` with parser `p` and returns the result in `r`. Returns `1` on success and `0` on failure. This is also the correct method to parse input from pipes or streams.
@@ -138,7 +141,7 @@ Parses the contents of `file` with parser `p` and returns the result in `r`. Ret
* * * * * *
```c ```c
int mpc_fparse_contents(const char* filename, mpc_parser_t* p, mpc_result_t* r); int mpc_parse_contents(const char* filename, mpc_parser_t *p, mpc_result_t* r);
``` ```
Opens file `filename` and parses its contents with `p`. Returns the result in `r`. Returns `1` on success and `0` on failure. Opens file `filename` and parses its contents with `p`. Returns the result in `r`. Returns `1` on success and `0` on failure.
@@ -154,7 +157,7 @@ All the following functions return basic parsers. All of those parsers return a
* * * * * *
```c ```c
mpc_parser_t* mpc_any(void); mpc_parser_t *mpc_any(void);
``` ```
Matches any individual character Matches any individual character
@@ -162,7 +165,7 @@ Matches any individual character
* * * * * *
```c ```c
mpc_parser_t* mpc_char(char c); mpc_parser_t *mpc_char(char c);
``` ```
Matches a single given character `c` Matches a single given character `c`
@@ -170,7 +173,7 @@ Matches a single given character `c`
* * * * * *
```c ```c
mpc_parser_t* mpc_range(char s, char e); mpc_parser_t *mpc_range(char s, char e);
``` ```
Matches any single given character in the range `s` to `e` (inclusive) Matches any single given character in the range `s` to `e` (inclusive)
@@ -178,7 +181,7 @@ Matches any single given character in the range `s` to `e` (inclusive)
* * * * * *
```c ```c
mpc_parser_t* mpc_oneof(const char* s); mpc_parser_t *mpc_oneof(const char* s);
``` ```
Matches any single given character in the string `s` Matches any single given character in the string `s`
@@ -186,14 +189,14 @@ Matches any single given character in the string `s`
* * * * * *
```c ```c
mpc_parser_t* mpc_noneof(const char* s); mpc_parser_t *mpc_noneof(const char* s);
``` ```
Matches any single given character not in the string `s` Matches any single given character not in the string `s`
* * * * * *
```c ```c
mpc_parser_t* mpc_satisfy(int(*f)(char)); mpc_parser_t *mpc_satisfy(int(*f)(char));
``` ```
Matches any single given character satisfying function `f` Matches any single given character satisfying function `f`
@@ -201,7 +204,7 @@ Matches any single given character satisfying function `f`
* * * * * *
```c ```c
mpc_parser_t* mpc_string(const char* s); mpc_parser_t *mpc_string(const char* s);
``` ```
Matches exactly the string `s` Matches exactly the string `s`
@@ -214,7 +217,7 @@ Several other functions exist that return basic parsers with some other special
* * * * * *
```c ```c
mpc_parser_t* mpc_pass(void); mpc_parser_t *mpc_pass(void);
``` ```
Consumes no input, always successful, returns `NULL` Consumes no input, always successful, returns `NULL`
@@ -222,7 +225,8 @@ Consumes no input, always successful, returns `NULL`
* * * * * *
```c ```c
mpc_parser_t* mpc_fail(const char* m); mpc_parser_t *mpc_fail(const char* m);
mpc_parser_t *mpc_failf(const char* fmt, ...);
``` ```
Consumes no input, always fails with message `m`. Consumes no input, always fails with message `m`.
@@ -230,7 +234,7 @@ Consumes no input, always fails with message `m`.
* * * * * *
```c ```c
mpc_parser_t* mpc_failf(const char* fmt, ...); mpc_parser_t *mpc_failf(const char* fmt, ...);
``` ```
Consumes no input, always fails with formatted message given by `fmt` and following parameters. Consumes no input, always fails with formatted message given by `fmt` and following parameters.
@@ -238,7 +242,7 @@ Consumes no input, always fails with formatted message given by `fmt` and follow
* * * * * *
```c ```c
mpc_parser_t* mpc_lift(mpc_ctor_t f); mpc_parser_t *mpc_lift(mpc_ctor_t f);
``` ```
Consumes no input, always successful, returns the result of function `f` Consumes no input, always successful, returns the result of function `f`
@@ -246,7 +250,7 @@ Consumes no input, always successful, returns the result of function `f`
* * * * * *
```c ```c
mpc_parser_t* mpc_lift_val(mpc_val_t* x); mpc_parser_t *mpc_lift_val(mpc_val_t* x);
``` ```
Consumes no input, always successful, returns `x` Consumes no input, always successful, returns `x`
@@ -264,7 +268,8 @@ Here are the main combinators and how to use then.
* * * * * *
```c ```c
mpc_parser_t* mpc_expect(mpc_parser_t* a, const char* e); mpc_parser_t *mpc_expect(mpc_parser_t *a, const char* e);
mpc_parser_t *mpc_expectf(mpc_parser_t *a, const char* fmt, ...);
``` ```
Returns a parser that runs `a`, and on success returns the result of `a`, while on failure reports that `e` was expected. Returns a parser that runs `a`, and on success returns the result of `a`, while on failure reports that `e` was expected.
@@ -272,8 +277,8 @@ Returns a parser that runs `a`, and on success returns the result of `a`, while
* * * * * *
```c ```c
mpc_parser_t* mpc_apply(mpc_parser_t* a, mpc_apply_t f); mpc_parser_t *mpc_apply(mpc_parser_t *a, mpc_apply_t f);
mpc_parser_t* mpc_apply_to(mpc_parser_t* a, mpc_apply_to_t f, void* x); mpc_parser_t *mpc_apply_to(mpc_parser_t *a, mpc_apply_to_t f, void* x);
``` ```
Returns a parser that applies function `f` (optionally taking extra input `x`) to the result of parser `a`. Returns a parser that applies function `f` (optionally taking extra input `x`) to the result of parser `a`.
@@ -281,8 +286,8 @@ Returns a parser that applies function `f` (optionality taking extra input `x`)
* * * * * *
```c ```c
mpc_parser_t* mpc_not(mpc_parser_t* a, mpc_dtor_t da); mpc_parser_t *mpc_not(mpc_parser_t *a, mpc_dtor_t da);
mpc_parser_t* mpc_not_lift(mpc_parser_t* a, mpc_dtor_t da, mpc_ctor_t lf); mpc_parser_t *mpc_not_lift(mpc_parser_t *a, mpc_dtor_t da, mpc_ctor_t lf);
``` ```
Returns a parser with the following behaviour. If parser `a` succeeds, then it fails and consumes no input. If parser `a` fails, then it succeeds, consumes no input and returns `NULL` (or the result of lift function `lf`). Destructor `da` is used to destroy the result of `a` on success. Returns a parser with the following behaviour. If parser `a` succeeds, then it fails and consumes no input. If parser `a` fails, then it succeeds, consumes no input and returns `NULL` (or the result of lift function `lf`). Destructor `da` is used to destroy the result of `a` on success.
@@ -290,8 +295,8 @@ Returns a parser with the following behaviour. If parser `a` succeeds, then it f
* * * * * *
```c ```c
mpc_parser_t* mpc_maybe(mpc_parser_t* a); mpc_parser_t *mpc_maybe(mpc_parser_t *a);
mpc_parser_t* mpc_maybe_lift(mpc_parser_t* a, mpc_ctor_t lf); mpc_parser_t *mpc_maybe_lift(mpc_parser_t *a, mpc_ctor_t lf);
``` ```
Returns a parser that runs `a`. If `a` is successful then it returns the result of `a`. If `a` is unsuccessful then it succeeds, but returns `NULL` (or the result of `lf`). Returns a parser that runs `a`. If `a` is successful then it returns the result of `a`. If `a` is unsuccessful then it succeeds, but returns `NULL` (or the result of `lf`).
@@ -299,7 +304,7 @@ Returns a parser that runs `a`. If `a` is successful then it returns the result
* * * * * *
```c ```c
mpc_parser_t* mpc_many(mpc_fold_t f, mpc_parser_t* a); mpc_parser_t *mpc_many(mpc_fold_t f, mpc_parser_t *a);
``` ```
Keeps running `a` until it fails. Results are combined using fold function `f`. See the _Function Types_ section for more details. Keeps running `a` until it fails. Results are combined using fold function `f`. See the _Function Types_ section for more details.
@@ -307,7 +312,7 @@ Keeps running `a` until it fails. Results are combined using fold function `f`.
* * * * * *
```c ```c
mpc_parser_t* mpc_many1(mpc_fold_t f, mpc_parser_t* a); mpc_parser_t *mpc_many1(mpc_fold_t f, mpc_parser_t *a);
``` ```
Attempts to run `a` one or more times. Results are combined with fold function `f`. Attempts to run `a` one or more times. Results are combined with fold function `f`.
@@ -315,7 +320,7 @@ Attempts to run `a` one or more times. Results are combined with fold function `
* * * * * *
```c ```c
mpc_parser_t* mpc_count(int n, mpc_fold_t f, mpc_parser_t* a, mpc_dtor_t da); mpc_parser_t *mpc_count(int n, mpc_fold_t f, mpc_parser_t *a, mpc_dtor_t da);
``` ```
Attempts to run `a` exactly `n` times. If this fails, any partial results are destructed with `da`. If successful results of `a` are combined using fold function `f`. Attempts to run `a` exactly `n` times. If this fails, any partial results are destructed with `da`. If successful results of `a` are combined using fold function `f`.
@@ -323,7 +328,7 @@ Attempts to run `a` exactly `n` times. If this fails, any partial results are de
* * * * * *
```c ```c
mpc_parser_t* mpc_or(int n, ...); mpc_parser_t *mpc_or(int n, ...);
``` ```
Attempts to run `n` parsers in sequence, returning the first one that succeeds. If all fail, returns an error. Attempts to run `n` parsers in sequence, returning the first one that succeeds. If all fail, returns an error.
@@ -331,7 +336,7 @@ Attempts to run `n` parsers in sequence, returning the first one that succeeds.
* * * * * *
```c ```c
mpc_parser_t* mpc_and(int n, mpc_fold_t f, ...); mpc_parser_t *mpc_and(int n, mpc_fold_t f, ...);
``` ```
Attempts to run `n` parsers in sequence, returning the fold of the results using fold function `f`. First parsers must be specified, followed by destructors for each parser, excluding the final parser. These are used in case of partial success. For example: `mpc_and(3, mpcf_strfold, mpc_char('a'), mpc_char('b'), mpc_char('c'), free, free);` would attempt to match `'a'` followed by `'b'` followed by `'c'`, and if successful would concatenate them using `mpcf_strfold`. Otherwise would use `free` on the partial results. Attempts to run `n` parsers in sequence, returning the fold of the results using fold function `f`. First parsers must be specified, followed by destructors for each parser, excluding the final parser. These are used in case of partial success. For example: `mpc_and(3, mpcf_strfold, mpc_char('a'), mpc_char('b'), mpc_char('c'), free, free);` would attempt to match `'a'` followed by `'b'` followed by `'c'`, and if successful would concatenate them using `mpcf_strfold`. Otherwise would use `free` on the partial results.
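As a rough illustration of what "folding" results means, the following sketch concatenates heap-allocated strings left to right, the way the `'a'`, `'b'`, `'c'` example above would. The two-argument fold signature and all `demo_` names are assumptions made for this sketch; they are not taken from _mpc_ and may not match the real `mpc_fold_t` or `mpcf_strfold`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fold type: combines an accumulator with the next result. */
typedef void *(*demo_fold_t)(void *acc, void *x);

/* Appends string x to string acc. Takes ownership of both arguments
   and returns a single heap-allocated string the caller must free. */
void *demo_strfold(void *acc, void *x) {
  char *a = acc, *b = x, *out;
  size_t la, lb;
  if (a == NULL) return b;
  if (b == NULL) return a;
  la = strlen(a);
  lb = strlen(b);
  out = realloc(a, la + lb + 1);
  if (out == NULL) abort();
  memcpy(out + la, b, lb + 1);  /* copies the terminating '\0' as well */
  free(b);
  return out;
}

/* Folds n heap-allocated results left to right, starting from NULL. */
void *demo_fold_all(demo_fold_t f, int n, void **xs) {
  void *acc = NULL;
  int i;
  assert(f != NULL && n >= 0);
  for (i = 0; i < n; i++) acc = f(acc, xs[i]);
  return acc;
}

/* Heap-allocated one-character string, standing in for the result
   of a single character parser. */
char *demo_char_result(char c) {
  char *s = malloc(2);
  if (s == NULL) abort();
  s[0] = c;
  s[1] = '\0';
  return s;
}
```

Folding the three results `"a"`, `"b"`, and `"c"` with `demo_strfold` yields the single string `"abc"`. The destructors passed to `mpc_and` cover the other case, where a later parser fails and the earlier results have to be released instead of folded.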
@@ -339,7 +344,7 @@ Attempts to run `n` parsers in sequence, returning the fold of the results using
* * * * * *
```c ```c
mpc_parser_t* mpc_predictive(mpc_parser_t* a); mpc_parser_t *mpc_predictive(mpc_parser_t *a);
``` ```
Returns a parser that runs `a` with backtracking disabled. This means if `a` consumes any input, it will not be reverted, even on failure. Turning backtracking off has good performance benefits for grammars which are `LL(1)`. These are grammars where the first character completely determines the parse result - such as the decision of parsing either a C identifier, number, or string literal. This option should not be used for non `LL(1)` grammars or it will produce incorrect results or crash the parser. Returns a parser that runs `a` with backtracking disabled. This means if `a` consumes any input, it will not be reverted, even on failure. Turning backtracking off has good performance benefits for grammars which are `LL(1)`. These are grammars where the first character completely determines the parse result - such as the decision of parsing either a C identifier, number, or string literal. This option should not be used for non `LL(1)` grammars or it will produce incorrect results or crash the parser.
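The following sketch models why disabling backtracking is only safe for `LL(1)` choices. It is a simplified model written for this explanation, not the logic inside _mpc_, and the names `match_literal` and `match_either` are hypothetical. A failed alternative either restores the input position (backtracking) or leaves it where the failure happened (predictive).

```c
#include <assert.h>
#include <stddef.h>

/* Tries to match the literal lit at s[*pos], advancing *pos for each
   matched character. On failure *pos is restored to where it started
   only when backtrack is nonzero. Returns 1 on success, 0 on failure. */
static int match_literal(const char *s, size_t *pos, const char *lit,
                         int backtrack) {
  size_t start = *pos;
  while (*lit != '\0') {
    if (s[*pos] != *lit) {
      if (backtrack) *pos = start;
      return 0;
    }
    (*pos)++;
    lit++;
  }
  return 1;
}

/* Ordered choice between literals x and y at the start of s. */
int match_either(const char *s, const char *x, const char *y,
                 int backtrack) {
  size_t pos = 0;
  assert(s != NULL && x != NULL && y != NULL);
  if (match_literal(s, &pos, x, backtrack)) return 1;
  return match_literal(s, &pos, y, backtrack);
}
```

With the alternatives `"ab"` and `"ac"`, which share a first character, the input `"ac"` matches when backtracking is on (`match_either("ac", "ab", "ac", 1)` returns `1`) but fails when it is off, because the `'a'` consumed by the first alternative is never given back. With the alternatives `"x1"` and `"y1"` the first character decides the branch, so `match_either("y1", "x1", "y1", 0)` still returns `1`.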
@@ -412,13 +417,13 @@ Then we can actually specify the grammar using combinators to say how the basic
```c ```c
char* parse_ident(char* input) { char* parse_ident(char* input) {
mpc_parser_t* alpha = mpc_or(2, mpc_range('a', 'z'), mpc_range('A', 'Z')); mpc_parser_t *alpha = mpc_or(2, mpc_range('a', 'z'), mpc_range('A', 'Z'));
mpc_parser_t* digit = mpc_range('0', '9'); mpc_parser_t *digit = mpc_range('0', '9');
mpc_parser_t* underscore = mpc_char('_'); mpc_parser_t *underscore = mpc_char('_');
mpc_parser_t* ident0 = mpc_or(2, alpha, underscore); mpc_parser_t *ident0 = mpc_or(2, alpha, underscore);
mpc_parser_t* ident1 = mpc_many(strfold, mpc_or(3, alpha, digit, underscore)); mpc_parser_t *ident1 = mpc_many(strfold, mpc_or(3, alpha, digit, underscore));
mpc_parser_t* ident = mpc_and(2, strfold, ident0, ident1, free); mpc_parser_t *ident = mpc_and(2, strfold, ident0, ident1, free);
mpc_result_t r; mpc_result_t r;
if (!mpc_parse("<parse_ident>", input, ident, &r)) { if (!mpc_parse("<parse_ident>", input, ident, &r)) {
@@ -444,7 +449,7 @@ Building parsers in the above way can have issues with self-reference or cyclic-
* * * * * *
```c ```c
mpc_parser_t* mpc_new(const char* name); mpc_parser_t *mpc_new(const char* name);
``` ```
This will construct a parser called `name` which can then be used by others, including itself, without ownership being transferred. Any parser created using `mpc_new` is said to be _retained_. This means it will behave differently to a normal parser when referenced. When deleting a parser that includes a _retained_ parser, the _retained_ parser will not be deleted along with it. To delete a retained parser `mpc_delete` must be used on it directly. This will construct a parser called `name` which can then be used by others, including itself, without ownership being transferred. Any parser created using `mpc_new` is said to be _retained_. This means it will behave differently to a normal parser when referenced. When deleting a parser that includes a _retained_ parser, the _retained_ parser will not be deleted along with it. To delete a retained parser `mpc_delete` must be used on it directly.
@@ -454,7 +459,7 @@ A _retained_ parser can then be defined using...
* * * * * *
```c ```c
mpc_parser_t* mpc_define(mpc_parser_t* p, mpc_parser_t* a); mpc_parser_t *mpc_define(mpc_parser_t *p, mpc_parser_t *a);
``` ```
This assigns the contents of parser `a` to `p`, and deletes `a`. With this technique parsers can now reference each other, as well as themselves, without trouble. This assigns the contents of parser `a` to `p`, and deletes `a`. With this technique parsers can now reference each other, as well as themselves, without trouble.
@@ -462,7 +467,7 @@ This assigns the contents of parser `a` to `p`, and deletes `a`. With this techn
* * * * * *
```c ```c
mpc_parser_t* mpc_undefine(mpc_parser_t* p); mpc_parser_t *mpc_undefine(mpc_parser_t *p);
``` ```
A final step is required. Parsers that reference each other must all be undefined before they are deleted. It is important to do any undefining before deletion. The reason for this is that to delete a parser it must look at each sub-parser that is used by it. If any of these have already been deleted a segfault is unavoidable - even if they were retained beforehand. A final step is required. Parsers that reference each other must all be undefined before they are deleted. It is important to do any undefining before deletion. The reason for this is that to delete a parser it must look at each sub-parser that is used by it. If any of these have already been deleted a segfault is unavoidable - even if they were retained beforehand.
@@ -482,57 +487,57 @@ Common Parsers
A number of common parsers are included. A number of common parsers are included.
* `mpc_parser_t* mpc_soi(void);` Matches only the start of input, returns `NULL` * `mpc_soi(void);` Matches only the start of input, returns `NULL`
* `mpc_parser_t* mpc_eoi(void);` Matches only the end of input, returns `NULL` * `mpc_eoi(void);` Matches only the end of input, returns `NULL`
* `mpc_parser_t* mpc_space(void);` Matches any whitespace character (" \f\n\r\t\v") * `mpc_space(void);` Matches any whitespace character (" \f\n\r\t\v")
* `mpc_parser_t* mpc_spaces(void);` Matches zero or more whitespace characters * `mpc_spaces(void);` Matches zero or more whitespace characters
* `mpc_parser_t* mpc_whitespace(void);` Matches spaces and frees the result, returns `NULL` * `mpc_whitespace(void);` Matches spaces and frees the result, returns `NULL`
* `mpc_parser_t* mpc_newline(void);` Matches `'\n'` * `mpc_newline(void);` Matches `'\n'`
* `mpc_parser_t* mpc_tab(void);` Matches `'\t'` * `mpc_tab(void);` Matches `'\t'`
* `mpc_parser_t* mpc_escape(void);` Matches a backslash followed by any character * `mpc_escape(void);` Matches a backslash followed by any character
* `mpc_parser_t* mpc_digit(void);` Matches any character in the range `'0'` - `'9'` * `mpc_digit(void);` Matches any character in the range `'0'` - `'9'`
* `mpc_parser_t* mpc_hexdigit(void);` Matches any character in the range `'0'` - `'9'` as well as `'A'` - `'F'` and `'a'` - `'f'` * `mpc_hexdigit(void);` Matches any character in the range `'0'` - `'9'` as well as `'A'` - `'F'` and `'a'` - `'f'`
* `mpc_parser_t* mpc_octdigit(void);` Matches any character in the range `'0'` - `'7'` * `mpc_octdigit(void);` Matches any character in the range `'0'` - `'7'`
* `mpc_parser_t* mpc_digits(void);` Matches one or more digit * `mpc_digits(void);` Matches one or more digit
* `mpc_parser_t* mpc_hexdigits(void);` Matches one or more hexdigit * `mpc_hexdigits(void);` Matches one or more hexdigit
* `mpc_parser_t* mpc_octdigits(void);` Matches one or more octdigit * `mpc_octdigits(void);` Matches one or more octdigit
* `mpc_parser_t* mpc_lower(void);` Matches any lower case character * `mpc_lower(void);` Matches any lower case character
* `mpc_parser_t* mpc_upper(void);` Matches any upper case character * `mpc_upper(void);` Matches any upper case character
* `mpc_parser_t* mpc_alpha(void);` Matches any alphabet character * `mpc_alpha(void);` Matches any alphabet character
* `mpc_parser_t* mpc_underscore(void);` Matches `'_'` * `mpc_underscore(void);` Matches `'_'`
* `mpc_parser_t* mpc_alphanum(void);` Matches any alphabet character, underscore or digit * `mpc_alphanum(void);` Matches any alphabet character, underscore or digit
* `mpc_parser_t* mpc_int(void);` Matches digits and returns an `int*` * `mpc_int(void);` Matches digits and returns an `int*`
* `mpc_parser_t* mpc_hex(void);` Matches hexdigits and returns an `int*` * `mpc_hex(void);` Matches hexdigits and returns an `int*`
* `mpc_parser_t* mpc_oct(void);` Matches octdigits and returns an `int*` * `mpc_oct(void);` Matches octdigits and returns an `int*`
* `mpc_parser_t* mpc_number(void);` Matches `mpc_int`, `mpc_hex` or `mpc_oct` * `mpc_number(void);` Matches `mpc_int`, `mpc_hex` or `mpc_oct`
* `mpc_parser_t* mpc_real(void);` Matches some floating point number as a string * `mpc_real(void);` Matches some floating point number as a string
* `mpc_parser_t* mpc_float(void);` Matches some floating point number and returns a `float*` * `mpc_float(void);` Matches some floating point number and returns a `float*`
* `mpc_parser_t* mpc_char_lit(void);` Matches some character literal surrounded by `'` * `mpc_char_lit(void);` Matches some character literal surrounded by `'`
* `mpc_parser_t* mpc_string_lit(void);` Matches some string literal surrounded by `"` * `mpc_string_lit(void);` Matches some string literal surrounded by `"`
* `mpc_parser_t* mpc_regex_lit(void);` Matches some regex literal surrounded by `/` * `mpc_regex_lit(void);` Matches some regex literal surrounded by `/`
* `mpc_parser_t* mpc_ident(void);` Matches a C style identifier * `mpc_ident(void);` Matches a C style identifier
Useful Parsers Useful Parsers
-------------- --------------
* `mpc_parser_t* mpc_start(mpc_parser_t* a);` Matches the start of input followed by `a` * `mpc_start(mpc_parser_t *a);` Matches the start of input followed by `a`
* `mpc_parser_t* mpc_end(mpc_parser_t* a, mpc_dtor_t da);` Matches `a` followed by the end of input * `mpc_end(mpc_parser_t *a, mpc_dtor_t da);` Matches `a` followed by the end of input
* `mpc_parser_t* mpc_enclose(mpc_parser_t* a, mpc_dtor_t da);` Matches the start of input, `a`, and the end of input * `mpc_enclose(mpc_parser_t *a, mpc_dtor_t da);` Matches the start of input, `a`, and the end of input
* `mpc_parser_t* mpc_strip(mpc_parser_t* a);` Matches `a` stripping any surrounding whitespace * `mpc_strip(mpc_parser_t *a);` Matches `a` stripping any surrounding whitespace
* `mpc_parser_t* mpc_tok(mpc_parser_t* a);` Matches `a` and strips any trailing whitespace * `mpc_tok(mpc_parser_t *a);` Matches `a` and strips any trailing whitespace
* `mpc_parser_t* mpc_sym(const char* s);` Matches string `s` and strips any trailing whitespace * `mpc_sym(const char* s);` Matches string `s` and strips any trailing whitespace
* `mpc_parser_t* mpc_total(mpc_parser_t* a, mpc_dtor_t da);` Matches the whitespace stripped `a`, enclosed in the start and end of input * `mpc_total(mpc_parser_t *a, mpc_dtor_t da);` Matches the whitespace stripped `a`, enclosed in the start and end of input
* `mpc_parser_t* mpc_between(mpc_parser_t* a, mpc_dtor_t ad, const char* o, const char* c);` Matches `a` between strings `o` and `c` * `mpc_between(mpc_parser_t *a, mpc_dtor_t ad, const char* o, const char* c);` Matches `a` between strings `o` and `c`
* `mpc_parser_t* mpc_parens(mpc_parser_t* a, mpc_dtor_t ad);` Matches `a` between `"("` and `")"` * `mpc_parens(mpc_parser_t *a, mpc_dtor_t ad);` Matches `a` between `"("` and `")"`
* `mpc_parser_t* mpc_braces(mpc_parser_t* a, mpc_dtor_t ad);` Matches `a` between `"<"` and `">"` * `mpc_braces(mpc_parser_t *a, mpc_dtor_t ad);` Matches `a` between `"<"` and `">"`
* `mpc_parser_t* mpc_brackets(mpc_parser_t* a, mpc_dtor_t ad);` Matches `a` between `"{"` and `"}"` * `mpc_brackets(mpc_parser_t *a, mpc_dtor_t ad);` Matches `a` between `"{"` and `"}"`
* `mpc_parser_t* mpc_squares(mpc_parser_t* a, mpc_dtor_t ad);` Matches `a` between `"["` and `"]"` * `mpc_squares(mpc_parser_t *a, mpc_dtor_t ad);` Matches `a` between `"["` and `"]"`
* `mpc_parser_t* mpc_tok_between(mpc_parser_t* a, mpc_dtor_t ad, const char* o, const char* c);` Matches `a` between `o` and `c`, where `o` and `c` have their trailing whitespace stripped. * `mpc_tok_between(mpc_parser_t *a, mpc_dtor_t ad, const char* o, const char* c);` Matches `a` between `o` and `c`, where `o` and `c` have their trailing whitespace stripped.
* `mpc_parser_t* mpc_tok_parens(mpc_parser_t* a, mpc_dtor_t ad);` Matches `a` between trailing whitespace stripped `"("` and `")"` * `mpc_tok_parens(mpc_parser_t *a, mpc_dtor_t ad);` Matches `a` between trailing whitespace stripped `"("` and `")"`
* `mpc_parser_t* mpc_tok_braces(mpc_parser_t* a, mpc_dtor_t ad);` Matches `a` between trailing whitespace stripped `"<"` and `">"` * `mpc_tok_braces(mpc_parser_t *a, mpc_dtor_t ad);` Matches `a` between trailing whitespace stripped `"<"` and `">"`
* `mpc_parser_t* mpc_tok_brackets(mpc_parser_t* a, mpc_dtor_t ad);` Matches `a` between trailing whitespace stripped `"{"` and `"}"` * `mpc_tok_brackets(mpc_parser_t *a, mpc_dtor_t ad);` Matches `a` between trailing whitespace stripped `"{"` and `"}"`
* `mpc_parser_t* mpc_tok_squares(mpc_parser_t* a, mpc_dtor_t ad);` Matches `a` between trailing whitespace stripped `"["` and `"]"` * `mpc_tok_squares(mpc_parser_t *a, mpc_dtor_t ad);` Matches `a` between trailing whitespace stripped `"["` and `"]"`
Fold Functions Fold Functions
@@ -590,10 +595,10 @@ And then we use this to specify a basic grammar, which folds together any result
```c ```c
int parse_maths(char* input) { int parse_maths(char* input) {
mpc_parser_t* Expr = mpc_new("expr"); mpc_parser_t *Expr = mpc_new("expr");
mpc_parser_t* Factor = mpc_new("factor"); mpc_parser_t *Factor = mpc_new("factor");
mpc_parser_t* Term = mpc_new("term"); mpc_parser_t *Term = mpc_new("term");
mpc_parser_t* Maths = mpc_new("maths"); mpc_parser_t *Maths = mpc_new("maths");
mpc_define(Expr, mpc_or(2, mpc_define(Expr, mpc_or(2,
mpc_and(3, mpcf_maths, Factor, mpc_oneof("*/"), Factor, free, free), mpc_and(3, mpcf_maths, Factor, mpc_oneof("*/"), Factor, free, free),
@@ -632,7 +637,7 @@ Even with all that has been shown above, specifying parts of text can be a tedio
* * * * * *
```c ```c
mpc_parser_t* mpc_re(const char* re); mpc_parser_t *mpc_re(const char* re);
``` ```
This returns a parser that will attempt to match the given regular expression pattern, and return the matched string on success. It does not have support for groups and match objects, but should be sufficient for simple tasks. This returns a parser that will attempt to match the given regular expression pattern, and return the matched string on success. It does not have support for groups and match objects, but should be sufficient for simple tasks.
@@ -670,7 +675,7 @@ Like with the regular expressions, this user input is parsed by existing parts o
* * * * * *
```c ```c
mpc_parser_t* mpca_grammar(const char* grammar, ...); mpc_parser_t *mpca_grammar(const char* grammar, ...);
``` ```
This takes in some single right hand side of a rule, as well as a list of any of the parsers it refers to, and outputs a parser that does exactly what is specified by the rule. This takes in some single right hand side of a rule, as well as a list of any of the parsers it refers to, and outputs a parser that does exactly what is specified by the rule.
@@ -694,7 +699,7 @@ This reads in the contents of file `f` and inputs it into `mpca_lang`.
* * * * * *
```c ```c
mpc_err_t* mpca_lang_filename(const char* filename, ...); mpc_err_t* mpca_lang_contents(const char* filename, ...);
``` ```
This opens and reads in the contents of the file given by `filename` and passes it to `mpca_lang`. This opens and reads in the contents of the file given by `filename` and passes it to `mpca_lang`.
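
The open, check, and read pattern behind a `_contents` style function can be sketched in plain C as follows. This is a minimal illustration under assumptions, not the library's code: `read_contents` is a hypothetical helper, and it loads the whole file into memory rather than going through mpc's own input type.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not part of mpc): returns the file's contents as a
** malloc'd, NUL-terminated buffer, or NULL if the file cannot be read.
** The caller frees the result. */
static char *read_contents(const char *filename) {
  char *buf;
  long len;
  size_t got;
  FILE *f = fopen(filename, "rb");

  /* Report failure instead of crashing when the file is missing */
  if (f == NULL) { return NULL; }

  /* Measure the file so a single allocation is enough */
  if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return NULL; }
  len = ftell(f);
  if (len < 0) { fclose(f); return NULL; }
  rewind(f);

  buf = malloc((size_t)len + 1);
  if (buf == NULL) { fclose(f); return NULL; }

  /* Terminate after however many bytes were actually read */
  got = fread(buf, 1, (size_t)len, f);
  fclose(f);
  buf[got] = '\0';
  return buf;
}
```

A buffer obtained this way could then be handed to a function that takes the language as a string, such as `mpca_lang`. Opening in `"rb"` mode matches what the library source does and avoids newline translation on platforms that distinguish text and binary streams.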

186
mpc.c

@@ -4,13 +4,6 @@
** State Type ** State Type
*/ */
typedef struct {
char next;
int pos;
int row;
int col;
} mpc_state_t;
static mpc_state_t mpc_state_invalid(void) { static mpc_state_t mpc_state_invalid(void) {
mpc_state_t s; mpc_state_t s;
s.next = '\0'; s.next = '\0';
@@ -33,14 +26,6 @@ static mpc_state_t mpc_state_new(void) {
** Error Type ** Error Type
*/ */
struct mpc_err_t {
char *filename;
mpc_state_t state;
int expected_num;
char **expected;
char *failure;
};
static mpc_err_t *mpc_err_new(const char *filename, mpc_state_t s, const char *expected) { static mpc_err_t *mpc_err_new(const char *filename, mpc_state_t s, const char *expected) {
mpc_err_t *x = malloc(sizeof(mpc_err_t)); mpc_err_t *x = malloc(sizeof(mpc_err_t));
x->filename = malloc(strlen(filename) + 1); x->filename = malloc(strlen(filename) + 1);
@@ -54,11 +39,11 @@ static mpc_err_t *mpc_err_new(const char *filename, mpc_state_t s, const char *e
return x; return x;
} }
static mpc_err_t *mpc_err_fail(const char *filename, const char *failure) { static mpc_err_t *mpc_err_fail(const char *filename, mpc_state_t s, const char *failure) {
mpc_err_t *x = malloc(sizeof(mpc_err_t)); mpc_err_t *x = malloc(sizeof(mpc_err_t));
x->filename = malloc(strlen(filename) + 1); x->filename = malloc(strlen(filename) + 1);
strcpy(x->filename, filename); strcpy(x->filename, filename);
x->state = mpc_state_invalid(); x->state = s;
x->expected_num = 0; x->expected_num = 0;
x->expected = NULL; x->expected = NULL;
x->failure = malloc(strlen(failure) + 1); x->failure = malloc(strlen(failure) + 1);
@@ -116,7 +101,7 @@ void mpc_err_print(mpc_err_t *x) {
} }
void mpc_err_print_to(mpc_err_t *x, FILE *f) { void mpc_err_print_to(mpc_err_t *x, FILE *f) {
char *str; mpc_err_string(x, &str); char *str = mpc_err_string(x);
fprintf(f, "%s", str); fprintf(f, "%s", str);
free(str); free(str);
} }
@@ -157,7 +142,7 @@ static char *mpc_err_char_unescape(char c) {
} }
void mpc_err_string(mpc_err_t *x, char **out) { char *mpc_err_string(mpc_err_t *x) {
char *buffer = calloc(1, 1024); char *buffer = calloc(1, 1024);
int max = 1023; int max = 1023;
@@ -169,8 +154,7 @@ void mpc_err_string(mpc_err_t *x, char **out) {
"error: %s\n", "error: %s\n",
x->filename, x->state.row, x->filename, x->state.row,
x->state.col, x->failure); x->state.col, x->failure);
*out = buffer; return buffer;
return;
} }
mpc_err_string_cat(buffer, &pos, &max, mpc_err_string_cat(buffer, &pos, &max,
@@ -193,36 +177,41 @@ void mpc_err_string(mpc_err_t *x, char **out) {
mpc_err_string_cat(buffer, &pos, &max, mpc_err_char_unescape(x->state.next)); mpc_err_string_cat(buffer, &pos, &max, mpc_err_char_unescape(x->state.next));
mpc_err_string_cat(buffer, &pos, &max, "\n"); mpc_err_string_cat(buffer, &pos, &max, "\n");
*out = realloc(buffer, strlen(buffer) + 1); return realloc(buffer, strlen(buffer) + 1);
}
static mpc_err_t *mpc_err_either(mpc_err_t *x, mpc_err_t *y) {
int i;
if (x->state.pos > y->state.pos) { mpc_err_delete(y); return x; }
if (x->state.pos < y->state.pos) { mpc_err_delete(x); return y; }
if (x->state.pos == y->state.pos) {
for (i = 0; i < y->expected_num; i++) {
if (mpc_err_contains_expected(x, y->expected[i])) { continue; }
else { mpc_err_add_expected(x, y->expected[i]); }
}
mpc_err_delete(y);
return x;
}
return NULL;
} }
static mpc_err_t *mpc_err_or(mpc_err_t** x, int n) { static mpc_err_t *mpc_err_or(mpc_err_t** x, int n) {
mpc_err_t *e = x[0];
int i; int i, j;
for (i = 1; i < n; i++) { mpc_err_t *e = malloc(sizeof(mpc_err_t));
e = mpc_err_either(e, x[i]); e->state = mpc_state_invalid();
e->expected_num = 0;
e->expected = NULL;
e->failure = NULL;
e->filename = malloc(strlen(x[0]->filename)+1);
strcpy(e->filename, x[0]->filename);
for (i = 0; i < n; i++) {
if (x[i]->state.pos > e->state.pos) { e->state = x[i]->state; }
}
for (i = 0; i < n; i++) {
if (x[i]->state.pos < e->state.pos) { continue; }
if (x[i]->failure) {
e->failure = malloc(strlen(x[i]->failure)+1);
strcpy(e->failure, x[i]->failure);
break;
}
for (j = 0; j < x[i]->expected_num; j++) {
if (!mpc_err_contains_expected(e, x[i]->expected[j])) { mpc_err_add_expected(e, x[i]->expected[j]); }
}
}
for (i = 0; i < n; i++) {
mpc_err_delete(x[i]);
} }
return e; return e;
@@ -276,22 +265,6 @@ static mpc_err_t *mpc_err_count(mpc_err_t *x, int n) {
return y; return y;
} }
void mpc_err_expected(mpc_err_t *x, char **out, int *out_num, int out_max) {
int i;
out_max = out_max < x->expected_num ? out_max : x->expected_num;
*out_num = 0;
for (i = 0; i < out_max; i++) {
out[i] = x->expected[i];
(*out_num)++;
}
}
char *mpc_err_filename(mpc_err_t *x) { return x->filename; }
int mpc_err_line(mpc_err_t *x) { return x->state.row; }
int mpc_err_column(mpc_err_t *x) { return x->state.col; }
char mpc_err_unexpected(mpc_err_t *x) { return x->state.next; }
/* /*
** Input Type ** Input Type
*/ */
@@ -706,7 +679,7 @@ static mpc_stack_t *mpc_stack_new(const char *filename) {
s->results = NULL; s->results = NULL;
s->returns = NULL; s->returns = NULL;
s->err = mpc_err_fail(filename, "Unknown Error"); s->err = mpc_err_fail(filename, mpc_state_invalid(), "Unknown Error");
return s; return s;
} }
@@ -731,7 +704,10 @@ static int mpc_stack_terminate(mpc_stack_t *s, mpc_result_t *r) {
} }
static void mpc_stack_err(mpc_stack_t *s, mpc_err_t* e) { static void mpc_stack_err(mpc_stack_t *s, mpc_err_t* e) {
s->err = mpc_err_either(s->err, e); mpc_err_t *errs[2];
errs[0] = s->err;
errs[1] = e;
s->err = mpc_err_or(errs, 2);
} }
/* Stack Parser Stuff */ /* Stack Parser Stuff */
@@ -898,10 +874,10 @@ static mpc_err_t *mpc_stack_merger_err(mpc_stack_t *s, int n) {
** But it is now a pretty ugly beast... ** But it is now a pretty ugly beast...
*/ */
#define MPC_RETURN(st, x) mpc_stack_set_state(stk, st); mpc_stack_pushp(stk, x); continue #define MPC_CONTINUE(st, x) mpc_stack_set_state(stk, st); mpc_stack_pushp(stk, x); continue
#define MPC_SUCCESS(x) mpc_stack_popp(stk, &p, &st); mpc_stack_pushr(stk, mpc_result_out(x), 1); continue #define MPC_SUCCESS(x) mpc_stack_popp(stk, &p, &st); mpc_stack_pushr(stk, mpc_result_out(x), 1); continue
#define MPC_FAILURE(x) mpc_stack_popp(stk, &p, &st); mpc_stack_pushr(stk, mpc_result_err(x), 0); continue #define MPC_FAILURE(x) mpc_stack_popp(stk, &p, &st); mpc_stack_pushr(stk, mpc_result_err(x), 0); continue
#define MPC_FUNCTION(x, f) if (f) { MPC_SUCCESS(x); } else { MPC_FAILURE(mpc_err_fail(i->filename, "Incorrect Input")); } #define MPC_PRIMATIVE(x, f) if (f) { MPC_SUCCESS(x); } else { MPC_FAILURE(mpc_err_fail(i->filename, i->state, "Incorrect Input")); }
int mpc_parse_input(mpc_input_t *i, mpc_parser_t *init, mpc_result_t *final) { int mpc_parse_input(mpc_input_t *i, mpc_parser_t *init, mpc_result_t *final) {
@@ -925,28 +901,28 @@ int mpc_parse_input(mpc_input_t *i, mpc_parser_t *init, mpc_result_t *final) {
/* Trivial Parsers */ /* Trivial Parsers */
case MPC_TYPE_UNDEFINED: MPC_FAILURE(mpc_err_fail(i->filename, "Parser Undefined!")); case MPC_TYPE_UNDEFINED: MPC_FAILURE(mpc_err_fail(i->filename, i->state, "Parser Undefined!"));
case MPC_TYPE_PASS: MPC_SUCCESS(NULL); case MPC_TYPE_PASS: MPC_SUCCESS(NULL);
case MPC_TYPE_FAIL: MPC_FAILURE(mpc_err_fail(i->filename, p->data.fail.m)); case MPC_TYPE_FAIL: MPC_FAILURE(mpc_err_fail(i->filename, i->state, p->data.fail.m));
case MPC_TYPE_LIFT: MPC_SUCCESS(p->data.lift.lf()); case MPC_TYPE_LIFT: MPC_SUCCESS(p->data.lift.lf());
case MPC_TYPE_LIFT_VAL: MPC_SUCCESS(p->data.lift.x); case MPC_TYPE_LIFT_VAL: MPC_SUCCESS(p->data.lift.x);
/* Basic Parsers */ /* Basic Parsers */
case MPC_TYPE_SOI: MPC_FUNCTION(NULL, mpc_input_soi(i)); case MPC_TYPE_SOI: MPC_PRIMATIVE(NULL, mpc_input_soi(i));
case MPC_TYPE_EOI: MPC_FUNCTION(NULL, mpc_input_eoi(i)); case MPC_TYPE_EOI: MPC_PRIMATIVE(NULL, mpc_input_eoi(i));
case MPC_TYPE_ANY: MPC_FUNCTION(s, mpc_input_any(i, &s)); case MPC_TYPE_ANY: MPC_PRIMATIVE(s, mpc_input_any(i, &s));
case MPC_TYPE_SINGLE: MPC_FUNCTION(s, mpc_input_char(i, p->data.single.x, &s)); case MPC_TYPE_SINGLE: MPC_PRIMATIVE(s, mpc_input_char(i, p->data.single.x, &s));
case MPC_TYPE_RANGE: MPC_FUNCTION(s, mpc_input_range(i, p->data.range.x, p->data.range.y, &s)); case MPC_TYPE_RANGE: MPC_PRIMATIVE(s, mpc_input_range(i, p->data.range.x, p->data.range.y, &s));
case MPC_TYPE_ONEOF: MPC_FUNCTION(s, mpc_input_oneof(i, p->data.string.x, &s)); case MPC_TYPE_ONEOF: MPC_PRIMATIVE(s, mpc_input_oneof(i, p->data.string.x, &s));
case MPC_TYPE_NONEOF: MPC_FUNCTION(s, mpc_input_noneof(i, p->data.string.x, &s)); case MPC_TYPE_NONEOF: MPC_PRIMATIVE(s, mpc_input_noneof(i, p->data.string.x, &s));
case MPC_TYPE_SATISFY: MPC_FUNCTION(s, mpc_input_satisfy(i, p->data.satisfy.f, &s)); case MPC_TYPE_SATISFY: MPC_PRIMATIVE(s, mpc_input_satisfy(i, p->data.satisfy.f, &s));
case MPC_TYPE_STRING: MPC_FUNCTION(s, mpc_input_string(i, p->data.string.x, &s)); case MPC_TYPE_STRING: MPC_PRIMATIVE(s, mpc_input_string(i, p->data.string.x, &s));
/* Application Parsers */ /* Application Parsers */
case MPC_TYPE_EXPECT: case MPC_TYPE_EXPECT:
if (st == 0) { MPC_RETURN(1, p->data.expect.x); } if (st == 0) { MPC_CONTINUE(1, p->data.expect.x); }
if (st == 1) { if (st == 1) {
if (mpc_stack_popr(stk, &r)) { if (mpc_stack_popr(stk, &r)) {
MPC_SUCCESS(r.output); MPC_SUCCESS(r.output);
@@ -957,7 +933,7 @@ int mpc_parse_input(mpc_input_t *i, mpc_parser_t *init, mpc_result_t *final) {
} }
case MPC_TYPE_APPLY: case MPC_TYPE_APPLY:
if (st == 0) { MPC_RETURN(1, p->data.apply.x); } if (st == 0) { MPC_CONTINUE(1, p->data.apply.x); }
if (st == 1) { if (st == 1) {
if (mpc_stack_popr(stk, &r)) { if (mpc_stack_popr(stk, &r)) {
MPC_SUCCESS(p->data.apply.f(r.output)); MPC_SUCCESS(p->data.apply.f(r.output));
@@ -967,7 +943,7 @@ int mpc_parse_input(mpc_input_t *i, mpc_parser_t *init, mpc_result_t *final) {
} }
case MPC_TYPE_APPLY_TO: case MPC_TYPE_APPLY_TO:
if (st == 0) { MPC_RETURN(1, p->data.apply_to.x); } if (st == 0) { MPC_CONTINUE(1, p->data.apply_to.x); }
if (st == 1) { if (st == 1) {
if (mpc_stack_popr(stk, &r)) { if (mpc_stack_popr(stk, &r)) {
MPC_SUCCESS(p->data.apply_to.f(r.output, p->data.apply_to.d)); MPC_SUCCESS(p->data.apply_to.f(r.output, p->data.apply_to.d));
@@ -977,7 +953,7 @@ int mpc_parse_input(mpc_input_t *i, mpc_parser_t *init, mpc_result_t *final) {
} }
case MPC_TYPE_PREDICT: case MPC_TYPE_PREDICT:
if (st == 0) { mpc_input_backtrack_disable(i); MPC_RETURN(1, p->data.predict.x); } if (st == 0) { mpc_input_backtrack_disable(i); MPC_CONTINUE(1, p->data.predict.x); }
if (st == 1) { if (st == 1) {
mpc_input_backtrack_enable(i); mpc_input_backtrack_enable(i);
mpc_stack_popp(stk, &p, &st); mpc_stack_popp(stk, &p, &st);
@@ -989,7 +965,7 @@ int mpc_parse_input(mpc_input_t *i, mpc_parser_t *init, mpc_result_t *final) {
/* TODO: Update Not Error Message */ /* TODO: Update Not Error Message */
case MPC_TYPE_NOT: case MPC_TYPE_NOT:
if (st == 0) { mpc_input_mark(i); MPC_RETURN(1, p->data.not.x); } if (st == 0) { mpc_input_mark(i); MPC_CONTINUE(1, p->data.not.x); }
if (st == 1) { if (st == 1) {
if (mpc_stack_popr(stk, &r)) { if (mpc_stack_popr(stk, &r)) {
mpc_input_rewind(i); mpc_input_rewind(i);
@@ -1003,7 +979,7 @@ int mpc_parse_input(mpc_input_t *i, mpc_parser_t *init, mpc_result_t *final) {
} }
case MPC_TYPE_MAYBE: case MPC_TYPE_MAYBE:
if (st == 0) { MPC_RETURN(1, p->data.not.x); } if (st == 0) { MPC_CONTINUE(1, p->data.not.x); }
if (st == 1) { if (st == 1) {
if (mpc_stack_popr(stk, &r)) { if (mpc_stack_popr(stk, &r)) {
MPC_SUCCESS(r.output); MPC_SUCCESS(r.output);
@@ -1016,10 +992,10 @@ int mpc_parse_input(mpc_input_t *i, mpc_parser_t *init, mpc_result_t *final) {
/* Repeat Parsers */ /* Repeat Parsers */
case MPC_TYPE_MANY: case MPC_TYPE_MANY:
if (st == 0) { MPC_RETURN(st+1, p->data.repeat.x); } if (st == 0) { MPC_CONTINUE(st+1, p->data.repeat.x); }
if (st > 0) { if (st > 0) {
if (mpc_stack_peekr(stk, &r)) { if (mpc_stack_peekr(stk, &r)) {
MPC_RETURN(st+1, p->data.repeat.x); MPC_CONTINUE(st+1, p->data.repeat.x);
} else { } else {
mpc_stack_popr(stk, &r); mpc_stack_popr(stk, &r);
mpc_stack_err(stk, r.error); mpc_stack_err(stk, r.error);
@@ -1028,10 +1004,10 @@ int mpc_parse_input(mpc_input_t *i, mpc_parser_t *init, mpc_result_t *final) {
} }
case MPC_TYPE_MANY1: case MPC_TYPE_MANY1:
if (st == 0) { MPC_RETURN(st+1, p->data.repeat.x); } if (st == 0) { MPC_CONTINUE(st+1, p->data.repeat.x); }
if (st > 0) { if (st > 0) {
if (mpc_stack_peekr(stk, &r)) { if (mpc_stack_peekr(stk, &r)) {
MPC_RETURN(st+1, p->data.repeat.x); MPC_CONTINUE(st+1, p->data.repeat.x);
} else { } else {
if (st == 1) { if (st == 1) {
mpc_stack_popr(stk, &r); mpc_stack_popr(stk, &r);
@@ -1045,10 +1021,10 @@ int mpc_parse_input(mpc_input_t *i, mpc_parser_t *init, mpc_result_t *final) {
} }
case MPC_TYPE_COUNT: case MPC_TYPE_COUNT:
if (st == 0) { mpc_input_mark(i); MPC_RETURN(st+1, p->data.repeat.x); } if (st == 0) { mpc_input_mark(i); MPC_CONTINUE(st+1, p->data.repeat.x); }
if (st > 0) { if (st > 0) {
if (mpc_stack_peekr(stk, &r)) { if (mpc_stack_peekr(stk, &r)) {
MPC_RETURN(st+1, p->data.repeat.x); MPC_CONTINUE(st+1, p->data.repeat.x);
} else { } else {
if (st != (p->data.repeat.n+1)) { if (st != (p->data.repeat.n+1)) {
mpc_stack_popr(stk, &r); mpc_stack_popr(stk, &r);
@@ -1070,14 +1046,14 @@ int mpc_parse_input(mpc_input_t *i, mpc_parser_t *init, mpc_result_t *final) {
if (p->data.or.n == 0) { MPC_SUCCESS(NULL); } if (p->data.or.n == 0) { MPC_SUCCESS(NULL); }
if (st == 0) { MPC_RETURN(st+1, p->data.or.xs[st]); } if (st == 0) { MPC_CONTINUE(st+1, p->data.or.xs[st]); }
if (st <= p->data.or.n) { if (st <= p->data.or.n) {
if (mpc_stack_peekr(stk, &r)) { if (mpc_stack_peekr(stk, &r)) {
mpc_stack_popr(stk, &r); mpc_stack_popr(stk, &r);
mpc_stack_popr_err(stk, st-1); mpc_stack_popr_err(stk, st-1);
MPC_SUCCESS(r.output); MPC_SUCCESS(r.output);
} }
if (st < p->data.or.n) { MPC_RETURN(st+1, p->data.or.xs[st]); } if (st < p->data.or.n) { MPC_CONTINUE(st+1, p->data.or.xs[st]); }
if (st == p->data.or.n) { MPC_FAILURE(mpc_stack_merger_err(stk, p->data.or.n)); } if (st == p->data.or.n) { MPC_FAILURE(mpc_stack_merger_err(stk, p->data.or.n)); }
} }
@@ -1085,7 +1061,7 @@ int mpc_parse_input(mpc_input_t *i, mpc_parser_t *init, mpc_result_t *final) {
if (p->data.or.n == 0) { MPC_SUCCESS(p->data.and.f(0, NULL)); } if (p->data.or.n == 0) { MPC_SUCCESS(p->data.and.f(0, NULL)); }
if (st == 0) { mpc_input_mark(i); MPC_RETURN(st+1, p->data.and.xs[st]); } if (st == 0) { mpc_input_mark(i); MPC_CONTINUE(st+1, p->data.and.xs[st]); }
if (st <= p->data.and.n) { if (st <= p->data.and.n) {
if (!mpc_stack_peekr(stk, &r)) { if (!mpc_stack_peekr(stk, &r)) {
mpc_input_rewind(i); mpc_input_rewind(i);
@@ -1093,7 +1069,7 @@ int mpc_parse_input(mpc_input_t *i, mpc_parser_t *init, mpc_result_t *final) {
mpc_stack_popr_out(stk, st-1, p->data.and.dxs); mpc_stack_popr_out(stk, st-1, p->data.and.dxs);
MPC_FAILURE(r.error); MPC_FAILURE(r.error);
} }
if (st < p->data.and.n) { MPC_RETURN(st+1, p->data.and.xs[st]); } if (st < p->data.and.n) { MPC_CONTINUE(st+1, p->data.and.xs[st]); }
if (st == p->data.and.n) { mpc_input_unmark(i); MPC_SUCCESS(mpc_stack_merger_out(stk, p->data.and.n, p->data.and.f)); } if (st == p->data.and.n) { mpc_input_unmark(i); MPC_SUCCESS(mpc_stack_merger_out(stk, p->data.and.n, p->data.and.f)); }
} }
@@ -1101,7 +1077,7 @@ int mpc_parse_input(mpc_input_t *i, mpc_parser_t *init, mpc_result_t *final) {
default: default:
MPC_FAILURE(mpc_err_fail(i->filename, "Unknown Parser Type Id!")); MPC_FAILURE(mpc_err_fail(i->filename, i->state, "Unknown Parser Type Id!"));
} }
} }
@@ -1109,10 +1085,10 @@ int mpc_parse_input(mpc_input_t *i, mpc_parser_t *init, mpc_result_t *final) {
} }
#undef MPC_RETURN #undef MPC_CONTINUE
#undef MPC_SUCCESS #undef MPC_SUCCESS
#undef MPC_FAILURE #undef MPC_FAILURE
#undef MPC_FUNCTION #undef MPC_PRIMATIVE
int mpc_parse(const char *filename, const char *string, mpc_parser_t *p, mpc_result_t *r) { int mpc_parse(const char *filename, const char *string, mpc_parser_t *p, mpc_result_t *r) {
int x; int x;
@@ -1122,7 +1098,7 @@ int mpc_parse(const char *filename, const char *string, mpc_parser_t *p, mpc_res
return x; return x;
} }
int mpc_fparse(const char *filename, FILE *file, mpc_parser_t *p, mpc_result_t *r) { int mpc_parse_file(const char *filename, FILE *file, mpc_parser_t *p, mpc_result_t *r) {
int x; int x;
mpc_input_t *i = mpc_input_new_file(filename, file); mpc_input_t *i = mpc_input_new_file(filename, file);
x = mpc_parse_input(i, p, r); x = mpc_parse_input(i, p, r);
@@ -1130,18 +1106,18 @@ int mpc_fparse(const char *filename, FILE *file, mpc_parser_t *p, mpc_result_t *
return x; return x;
} }
int mpc_fparse_contents(const char *filename, mpc_parser_t *p, mpc_result_t *r) { int mpc_parse_contents(const char *filename, mpc_parser_t *p, mpc_result_t *r) {
FILE *f = fopen(filename, "rb"); FILE *f = fopen(filename, "rb");
int res; int res;
if (f == NULL) { if (f == NULL) {
r->output = NULL; r->output = NULL;
r->error = mpc_err_fail(filename, "Unable to open file!"); r->error = mpc_err_fail(filename, mpc_state_new(), "Unable to open file!");
return 0; return 0;
} }
res = mpc_fparse(filename, f, p, r); res = mpc_parse_file(filename, f, p, r);
fclose(f); fclose(f);
return res; return res;
} }
@@ -1953,7 +1929,7 @@ mpc_parser_t *mpc_re(const char *re) {
RegexEnclose = mpc_enclose(mpc_predictive(Regex), (mpc_dtor_t)mpc_delete); RegexEnclose = mpc_enclose(mpc_predictive(Regex), (mpc_dtor_t)mpc_delete);
if(!mpc_parse("<mpc_re_compiler>", re, RegexEnclose, &r)) { if(!mpc_parse("<mpc_re_compiler>", re, RegexEnclose, &r)) {
mpc_err_string(r.error, &err_msg); err_msg = mpc_err_string(r.error);
err_out = mpc_failf("Invalid Regex: %s", err_msg); err_out = mpc_failf("Invalid Regex: %s", err_msg);
mpc_err_delete(r.error); mpc_err_delete(r.error);
free(err_msg); free(err_msg);
@@ -2813,7 +2789,7 @@ mpc_parser_t *mpca_grammar_st(const char *grammar, mpca_grammar_st_t *st) {
)); ));
if(!mpc_parse("<mpc_grammar_compiler>", grammar, GrammarTotal, &r)) { if(!mpc_parse("<mpc_grammar_compiler>", grammar, GrammarTotal, &r)) {
mpc_err_string(r.error, &err_msg); err_msg = mpc_err_string(r.error);
err_out = mpc_failf("Invalid Grammar: %s", err_msg); err_out = mpc_failf("Invalid Grammar: %s", err_msg);
mpc_err_delete(r.error); mpc_err_delete(r.error);
free(err_msg); free(err_msg);
@@ -3012,7 +2988,7 @@ mpc_err_t *mpca_lang(const char *language, ...) {
return err; return err;
} }
mpc_err_t *mpca_lang_filename(const char *filename, ...) { mpc_err_t *mpca_lang_contents(const char *filename, ...) {
mpca_grammar_st_t st; mpca_grammar_st_t st;
mpc_input_t *i; mpc_input_t *i;
@@ -3023,7 +2999,7 @@ mpc_err_t *mpca_lang_filename(const char *filename, ...) {
FILE *f = fopen(filename, "rb"); FILE *f = fopen(filename, "rb");
if (f == NULL) { if (f == NULL) {
return mpc_err_fail(filename, "Unable to open file!"); return mpc_err_fail(filename, mpc_state_new(), "Unable to open file!");
} }
va_start(va, filename); va_start(va, filename);

35
mpc.h

@@ -21,20 +21,25 @@
** Error Type ** Error Type
*/ */
struct mpc_err_t; typedef struct {
typedef struct mpc_err_t mpc_err_t; char next;
int pos;
int row;
int col;
} mpc_state_t;
void mpc_err_delete(mpc_err_t *x); typedef struct {
void mpc_err_print(mpc_err_t *x); mpc_state_t state;
void mpc_err_print_to(mpc_err_t *x, FILE *f); char *filename;
void mpc_err_string(mpc_err_t *x, char **out); char *failure;
int expected_num;
char **expected;
} mpc_err_t;
int mpc_err_line(mpc_err_t *x); void mpc_err_delete(mpc_err_t *e);
int mpc_err_column(mpc_err_t *x); char *mpc_err_string(mpc_err_t *e);
char mpc_err_unexpected(mpc_err_t *x); void mpc_err_print(mpc_err_t *e);
void mpc_err_expected(mpc_err_t *x, char **out, int *out_num, int out_max); void mpc_err_print_to(mpc_err_t *e, FILE *f);
char *mpc_err_filename(mpc_err_t *x);
char *mpc_err_failure(mpc_err_t *x);
/* /*
** Parsing ** Parsing
@@ -51,8 +56,8 @@ struct mpc_parser_t;
typedef struct mpc_parser_t mpc_parser_t; typedef struct mpc_parser_t mpc_parser_t;
int mpc_parse(const char *filename, const char *string, mpc_parser_t *p, mpc_result_t *r); int mpc_parse(const char *filename, const char *string, mpc_parser_t *p, mpc_result_t *r);
int mpc_fparse(const char *filename, FILE* file, mpc_parser_t *p, mpc_result_t *r); int mpc_parse_file(const char *filename, FILE* file, mpc_parser_t *p, mpc_result_t *r);
int mpc_fparse_contents(const char *filename, mpc_parser_t *p, mpc_result_t *r); int mpc_parse_contents(const char *filename, mpc_parser_t *p, mpc_result_t *r);
/* /*
** Function Types ** Function Types
@@ -270,7 +275,7 @@ mpc_parser_t *mpca_grammar(const char *grammar, ...);
mpc_err_t *mpca_lang(const char *language, ...); mpc_err_t *mpca_lang(const char *language, ...);
mpc_err_t *mpca_lang_file(FILE *f, ...); mpc_err_t *mpca_lang_file(FILE *f, ...);
mpc_err_t *mpca_lang_filename(const char *filename, ...); mpc_err_t *mpca_lang_contents(const char *filename, ...);
/* /*
** Debug & Testing ** Debug & Testing


@@ -87,7 +87,7 @@ void test_language_file(void) {
Value = mpc_new("value"); Value = mpc_new("value");
Maths = mpc_new("maths"); Maths = mpc_new("maths");
mpca_lang_filename("./tests/maths.grammar", Expr, Prod, Value, Maths); mpca_lang_contents("./tests/maths.grammar", Expr, Prod, Value, Maths);
mpc_cleanup(4, Expr, Prod, Value, Maths); mpc_cleanup(4, Expr, Prod, Value, Maths);