
Added the ability to dump the token stream or ast in bin. (#278)

* Added the ability to dump the token stream or ast in bin.

The dump functionality works both for files and REPL.

--dump-tokens (or -t for short) dumps the token stream to stdout, and --dump-ast (or -a for short) dumps the ast to stdout.

The dumping of tokens and ast is mutually exclusive, and when dumping it won't run the code.

* Fixed some issues with rustfmt.

* Added serde serialization and deserialization to token and the ast.

* Added a dynamic multi-format dumping of token stream and ast in bin.

- Changed the --dump-tokens and --dump-ast to be an optional argument that optionally takes a value of format type ([--opt=[val]]).
- The default format for --dump-tokens and --dump-ast is Debug format which calls std::fmt::Debug.
- Added Json and JsonMinified formats for both dumps; they use serde_json internally.
- It is easy to support other format types, such as Toml with toml-rs for example.

* Made serde an optional dependency.

- Serde serialization and deserialization can be switched on by using the feature flag "serde-ast".

* Changed the JSON dumping format.

- The Json dumping format now prints the data in minified JSON form by default.
- Removed JsonMinified.
- Added JsonPretty as a way to dump the data in pretty printed JSON format.

* Updated the docs.
pull/290/head
HalidOdat 5 years ago committed by GitHub
parent
commit
5a85c595d4
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
  1. Cargo.lock (4 changes)
  2. README.md (8 changes)
  3. boa/Cargo.toml (2 changes)
  4. boa/src/lib.rs (3 changes)
  5. boa/src/syntax/ast/constant.rs (4 changes)
  6. boa/src/syntax/ast/expr.rs (5 changes)
  7. boa/src/syntax/ast/keyword.rs (4 changes)
  8. boa/src/syntax/ast/op.rs (10 changes)
  9. boa/src/syntax/ast/pos.rs (4 changes)
  10. boa/src/syntax/ast/punc.rs (4 changes)
  11. boa/src/syntax/ast/token.rs (9 changes)
  12. boa_cli/Cargo.toml (2 changes)
  13. boa_cli/src/main.rs (153 changes)
  14. docs/debugging.md (33 changes)

4
Cargo.lock generated

@ -9,6 +9,7 @@ dependencies = [
"gc_derive", "gc_derive",
"rand", "rand",
"regex", "regex",
"serde",
"serde_json", "serde_json",
"wasm-bindgen", "wasm-bindgen",
] ]
@ -610,6 +611,9 @@ name = "serde"
version = "1.0.104" version = "1.0.104"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "414115f25f818d7dfccec8ee535d76949ae78584fc4f79a6f45a904bf8ab4449" checksum = "414115f25f818d7dfccec8ee535d76949ae78584fc4f79a6f45a904bf8ab4449"
dependencies = [
"serde_derive",
]
[[package]] [[package]]
name = "serde_derive" name = "serde_derive"

8
README.md

@ -86,12 +86,18 @@ see [CHANGELOG](./CHANGELOG.md)
``` ```
USAGE: USAGE:
boa_cli [FILE]... boa_cli [OPTIONS] [FILE]...
FLAGS: FLAGS:
-h, --help Prints help information -h, --help Prints help information
-V, --version Prints version information -V, --version Prints version information
OPTIONS:
-a, --dump-ast <FORMAT> Dump the ast to stdout with the given format [possible values: Debug, Json,
JsonPretty]
-t, --dump-tokens <FORMAT> Dump the token stream to stdout with the given format [possible values: Debug, Json,
JsonPretty]
ARGS: ARGS:
<FILE>... The JavaScript file(s) to be evaluated <FILE>... The JavaScript file(s) to be evaluated
``` ```
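
The `<FORMAT>` value in the help text above is optional, so each dump flag has three states: absent, present without a value, and present with a value. The following is a minimal std-only sketch of how those states could collapse into one effective format, assuming `Debug` is the default as the commit message states. The `effective_format` helper and the hand-written `FromStr` impl are hypothetical illustrations; the actual CLI relies on `structopt` and `arg_enum!` for parsing.

```rust
use std::str::FromStr;

/// The three formats listed in the help text.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DumpFormat {
    Debug,
    Json,
    JsonPretty,
}

impl FromStr for DumpFormat {
    type Err = String;

    // Case-insensitive match, mirroring the `case_insensitive = true`
    // setting used in boa_cli/src/main.rs.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.to_ascii_lowercase().as_str() {
            "debug" => Ok(DumpFormat::Debug),
            "json" => Ok(DumpFormat::Json),
            "jsonpretty" => Ok(DumpFormat::JsonPretty),
            other => Err(format!("unknown dump format: {}", other)),
        }
    }
}

/// Hypothetical helper (not part of the commit).
/// - `None`: the flag was not passed, so nothing is dumped.
/// - `Some(None)`: the flag was passed without a value, so fall back to `Debug`.
/// - `Some(Some(f))`: the flag was passed with an explicit format.
fn effective_format(flag: Option<Option<DumpFormat>>) -> Option<DumpFormat> {
    flag.map(|value| value.unwrap_or(DumpFormat::Debug))
}
```

Collapsing the nested option once would let the `Debug` default live in a single place instead of a separate `None` arm per dump kind.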

2
boa/Cargo.toml

@ -11,6 +11,7 @@ exclude = ["../.vscode/*", "../Dockerfile", "../Makefile", "../.editorConfig"]
edition = "2018" edition = "2018"
[features] [features]
serde-ast = ["serde"]
default = ["wasm-bindgen"] default = ["wasm-bindgen"]
[dependencies] [dependencies]
@ -22,6 +23,7 @@ regex = "1.3.4"
# Optional Dependencies # Optional Dependencies
wasm-bindgen = { version = "0.2.58", optional = true } wasm-bindgen = { version = "0.2.58", optional = true }
serde = { version = "1.0", features = ["derive"], optional = true }
[dev-dependencies] [dev-dependencies]
criterion = "0.3.1" criterion = "0.3.1"

3
boa/src/lib.rs

@ -19,6 +19,9 @@ use crate::{
syntax::{ast::expr::Expr, lexer::Lexer, parser::Parser}, syntax::{ast::expr::Expr, lexer::Lexer, parser::Parser},
}; };
#[cfg(feature = "serde-ast")]
pub use serde_json;
fn parser_expr(src: &str) -> Result<Expr, String> { fn parser_expr(src: &str) -> Result<Expr, String> {
let mut lexer = Lexer::new(src); let mut lexer = Lexer::new(src);
lexer.lex().map_err(|e| format!("SyntaxError: {}", e))?; lexer.lex().map_err(|e| format!("SyntaxError: {}", e))?;

4
boa/src/syntax/ast/constant.rs

@ -1,6 +1,10 @@
use gc_derive::{Finalize, Trace}; use gc_derive::{Finalize, Trace};
use std::fmt::{Display, Formatter, Result}; use std::fmt::{Display, Formatter, Result};
#[cfg(feature = "serde-ast")]
use serde::{Deserialize, Serialize};
#[cfg_attr(feature = "serde-ast", derive(Serialize, Deserialize))]
#[derive(Clone, Debug, Trace, Finalize, PartialEq)] #[derive(Clone, Debug, Trace, Finalize, PartialEq)]
/// A Javascript Constant /// A Javascript Constant
pub enum Const { pub enum Const {

5
boa/src/syntax/ast/expr.rs

@ -8,6 +8,10 @@ use std::{
fmt::{Display, Formatter, Result}, fmt::{Display, Formatter, Result},
}; };
#[cfg(feature = "serde-ast")]
use serde::{Deserialize, Serialize};
#[cfg_attr(feature = "serde-ast", derive(Serialize, Deserialize))]
#[derive(Clone, Trace, Finalize, Debug, PartialEq)] #[derive(Clone, Trace, Finalize, Debug, PartialEq)]
pub struct Expr { pub struct Expr {
/// The expression definition /// The expression definition
@ -27,6 +31,7 @@ impl Display for Expr {
} }
} }
#[cfg_attr(feature = "serde-ast", derive(Serialize, Deserialize))]
#[derive(Clone, Debug, Trace, Finalize, PartialEq)] #[derive(Clone, Debug, Trace, Finalize, PartialEq)]
/// A Javascript Expression /// A Javascript Expression
pub enum ExprDef { pub enum ExprDef {

4
boa/src/syntax/ast/keyword.rs

@ -4,6 +4,10 @@ use std::{
str::FromStr, str::FromStr,
}; };
#[cfg(feature = "serde-ast")]
use serde::{Deserialize, Serialize};
#[cfg_attr(feature = "serde-ast", derive(Serialize, Deserialize))]
#[derive(Clone, Copy, PartialEq, Debug)] #[derive(Clone, Copy, PartialEq, Debug)]
/// A Javascript Keyword /// A Javascript Keyword
/// As specificed by <https://www.ecma-international.org/ecma-262/#sec-keywords> /// As specificed by <https://www.ecma-international.org/ecma-262/#sec-keywords>

10
boa/src/syntax/ast/op.rs

@ -1,6 +1,9 @@
use gc_derive::{Finalize, Trace}; use gc_derive::{Finalize, Trace};
use std::fmt::{Display, Formatter, Result}; use std::fmt::{Display, Formatter, Result};
#[cfg(feature = "serde-ast")]
use serde::{Deserialize, Serialize};
/// Represents an operator /// Represents an operator
pub trait Operator { pub trait Operator {
/// Get the associativity as a boolean that is true if it goes rightwards /// Get the associativity as a boolean that is true if it goes rightwards
@ -13,6 +16,7 @@ pub trait Operator {
} }
} }
#[cfg_attr(feature = "serde-ast", derive(Serialize, Deserialize))]
#[derive(Clone, Debug, Trace, Finalize, PartialEq)] #[derive(Clone, Debug, Trace, Finalize, PartialEq)]
/// A numeric operation between 2 values /// A numeric operation between 2 values
pub enum NumOp { pub enum NumOp {
@ -47,6 +51,7 @@ impl Display for NumOp {
} }
} }
#[cfg_attr(feature = "serde-ast", derive(Serialize, Deserialize))]
#[derive(Clone, Debug, Trace, Finalize, PartialEq)] #[derive(Clone, Debug, Trace, Finalize, PartialEq)]
/// A unary operation on a single value /// A unary operation on a single value
pub enum UnaryOp { pub enum UnaryOp {
@ -88,6 +93,7 @@ impl Display for UnaryOp {
} }
} }
#[cfg_attr(feature = "serde-ast", derive(Serialize, Deserialize))]
#[derive(Clone, Debug, Trace, Finalize, PartialEq)] #[derive(Clone, Debug, Trace, Finalize, PartialEq)]
/// A bitwise operation between 2 values /// A bitwise operation between 2 values
pub enum BitOp { pub enum BitOp {
@ -119,6 +125,7 @@ impl Display for BitOp {
} }
} }
#[cfg_attr(feature = "serde-ast", derive(Serialize, Deserialize))]
#[derive(Clone, Debug, Trace, Finalize, PartialEq)] #[derive(Clone, Debug, Trace, Finalize, PartialEq)]
/// A comparitive operation between 2 values /// A comparitive operation between 2 values
pub enum CompOp { pub enum CompOp {
@ -159,6 +166,7 @@ impl Display for CompOp {
} }
} }
#[cfg_attr(feature = "serde-ast", derive(Serialize, Deserialize))]
#[derive(Clone, Debug, Trace, Finalize, PartialEq)] #[derive(Clone, Debug, Trace, Finalize, PartialEq)]
/// A logical operation between 2 boolean values /// A logical operation between 2 boolean values
pub enum LogOp { pub enum LogOp {
@ -181,6 +189,7 @@ impl Display for LogOp {
} }
} }
#[cfg_attr(feature = "serde-ast", derive(Serialize, Deserialize))]
#[derive(Clone, Debug, Trace, Finalize, PartialEq)] #[derive(Clone, Debug, Trace, Finalize, PartialEq)]
/// A binary operation between 2 values /// A binary operation between 2 values
pub enum BinOp { pub enum BinOp {
@ -240,6 +249,7 @@ impl Display for BinOp {
} }
} }
#[cfg_attr(feature = "serde-ast", derive(Serialize, Deserialize))]
#[derive(Clone, Debug, Trace, Finalize, PartialEq)] #[derive(Clone, Debug, Trace, Finalize, PartialEq)]
/// A binary operation between 2 values /// A binary operation between 2 values
pub enum AssignOp { pub enum AssignOp {

4
boa/src/syntax/ast/pos.rs

@ -1,3 +1,7 @@
#[cfg(feature = "serde-ast")]
use serde::{Deserialize, Serialize};
#[cfg_attr(feature = "serde-ast", derive(Serialize, Deserialize))]
#[derive(Clone, Copy, PartialEq, Debug)] #[derive(Clone, Copy, PartialEq, Debug)]
/// A position in the Javascript source code /// A position in the Javascript source code
/// Stores both the column number and the line number /// Stores both the column number and the line number

4
boa/src/syntax/ast/punc.rs

@ -1,5 +1,9 @@
use std::fmt::{Display, Error, Formatter}; use std::fmt::{Display, Error, Formatter};
#[cfg(feature = "serde-ast")]
use serde::{Deserialize, Serialize};
#[cfg_attr(feature = "serde-ast", derive(Serialize, Deserialize))]
#[derive(PartialEq, Clone, Copy, Debug)] #[derive(PartialEq, Clone, Copy, Debug)]
/// Punctuation /// Punctuation
pub enum Punctuator { pub enum Punctuator {

9
boa/src/syntax/ast/token.rs

@ -1,9 +1,12 @@
use crate::syntax::ast::{keyword::Keyword, pos::Position, punc::Punctuator}; use crate::syntax::ast::{keyword::Keyword, pos::Position, punc::Punctuator};
use std::fmt::{Debug, Display, Formatter, Result}; use std::fmt::{Debug, Display, Formatter, Result};
#[derive(Clone, PartialEq)] #[cfg(feature = "serde-ast")]
use serde::{Deserialize, Serialize};
/// Represents a token /// Represents a token
#[derive(Debug)] #[cfg_attr(feature = "serde-ast", derive(Serialize, Deserialize))]
#[derive(Debug, Clone, PartialEq)]
pub struct Token { pub struct Token {
/// The token Data /// The token Data
pub data: TokenData, pub data: TokenData,
@ -38,7 +41,7 @@ impl Debug for VecToken {
write!(f, "{}", buffer) write!(f, "{}", buffer)
} }
} }
#[cfg_attr(feature = "serde-ast", derive(Serialize, Deserialize))]
#[derive(Clone, PartialEq, Debug)] #[derive(Clone, PartialEq, Debug)]
/// Represents the type of Token /// Represents the type of Token
pub enum TokenData { pub enum TokenData {
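
The AST and token files above all use the same pattern: `#[cfg_attr(feature = "serde-ast", derive(...))]` adds the serde derives only when the feature is enabled, so builds without it never need serde. Below is a small std-only sketch of that mechanism. The feature name `extra-derives`, the `Pos` type, and the `extra_derives_enabled` helper are made up for illustration, and `PartialOrd` stands in for the serde derives so the snippet compiles without external crates.

```rust
// With the (hypothetical) `extra-derives` feature off, the `cfg_attr` line
// expands to nothing and only the unconditional derives below apply.
// With it on, `PartialOrd` is derived as well.
#[cfg_attr(feature = "extra-derives", derive(PartialOrd))]
#[derive(Clone, Copy, PartialEq, Debug)]
struct Pos {
    line: u64,
    column: u64,
}

/// Reports whether the hypothetical feature was enabled at compile time.
fn extra_derives_enabled() -> bool {
    cfg!(feature = "extra-derives")
}
```

Compiled without any feature flags, `extra_derives_enabled()` returns `false` and `Pos` has no `PartialOrd` impl; recent compilers may warn about the unknown feature name when this is built outside Cargo.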

2
boa_cli/Cargo.toml

@ -11,5 +11,5 @@ exclude = ["../.vscode/*", "../Dockerfile", "../Makefile", "../.editorConfig"]
edition = "2018" edition = "2018"
[dependencies] [dependencies]
Boa = { path = "../boa", default-features = false } Boa = { path = "../boa", features = ["serde-ast"], default-features = false }
structopt = "0.3.9" structopt = "0.3.9"

153
boa_cli/src/main.rs

@ -3,18 +3,140 @@
#![allow(clippy::cognitive_complexity)] #![allow(clippy::cognitive_complexity)]
use boa::builtins::console::log; use boa::builtins::console::log;
use boa::serde_json;
use boa::syntax::ast::{expr::Expr, token::Token};
use boa::{exec::Executor, forward_val, realm::Realm}; use boa::{exec::Executor, forward_val, realm::Realm};
use std::io; use std::io::{self, Write};
use std::{fs::read_to_string, path::PathBuf}; use std::{fs::read_to_string, path::PathBuf};
use structopt::clap::arg_enum;
use structopt::StructOpt; use structopt::StructOpt;
/// CLI configuration for Boa. /// CLI configuration for Boa.
//
// Added #[allow(clippy::option_option)] because to StructOpt an Option<Option<T>>
// is an optional argument that optionally takes a value ([--opt=[val]]).
// https://docs.rs/structopt/0.3.11/structopt/#type-magic
#[allow(clippy::option_option)]
#[derive(Debug, StructOpt)] #[derive(Debug, StructOpt)]
#[structopt(author, about)] #[structopt(author, about)]
struct Opt { struct Opt {
/// The JavaScript file(s) to be evaluated. /// The JavaScript file(s) to be evaluated.
#[structopt(name = "FILE", parse(from_os_str))] #[structopt(name = "FILE", parse(from_os_str))]
files: Vec<PathBuf>, files: Vec<PathBuf>,
/// Dump the token stream to stdout with the given format.
#[structopt(
long,
short = "-t",
value_name = "FORMAT",
possible_values = &DumpFormat::variants(),
case_insensitive = true,
conflicts_with = "dump-ast"
)]
dump_tokens: Option<Option<DumpFormat>>,
/// Dump the ast to stdout with the given format.
#[structopt(
long,
short = "-a",
value_name = "FORMAT",
possible_values = &DumpFormat::variants(),
case_insensitive = true
)]
dump_ast: Option<Option<DumpFormat>>,
}
impl Opt {
/// Returns whether a dump flag has been used.
fn has_dump_flag(&self) -> bool {
self.dump_tokens.is_some() || self.dump_ast.is_some()
}
}
arg_enum! {
/// The different types of format available for dumping.
///
// NOTE: This can easily support other formats just by
// adding a variant to this enum and adding the necessary
// implementation. Example: Toml, Html, etc.
//
// NOTE: The variants of this enum do not have doc comments because
// the arg_enum! macro does not support them.
#[derive(Debug)]
enum DumpFormat {
// This is the default format that you get from std::fmt::Debug.
Debug,
// This is a minified json format.
Json,
// This is a pretty printed json format.
JsonPretty,
}
}
/// Lexes the given source code into a stream of tokens and returns it.
///
/// Returns an error of type String with a message,
/// if the source has a syntax error.
fn lex_source(src: &str) -> Result<Vec<Token>, String> {
use boa::syntax::lexer::Lexer;
let mut lexer = Lexer::new(src);
lexer.lex().map_err(|e| format!("SyntaxError: {}", e))?;
Ok(lexer.tokens)
}
/// Parses the token stream into an ast and returns it.
///
/// Returns an error of type String with a message,
/// if the token stream has a parsing error.
fn parse_tokens(tokens: Vec<Token>) -> Result<Expr, String> {
use boa::syntax::parser::Parser;
Parser::new(tokens)
.parse_all()
.map_err(|e| format!("ParsingError: {}", e))
} }
/// Dumps the token stream or ast to stdout depending on the given arguments.
///
/// Returns an error of type String with an error message,
/// if the source has a syntax or parsing error.
fn dump(src: &str, args: &Opt) -> Result<(), String> {
let tokens = lex_source(src)?;
if let Some(ref arg) = args.dump_tokens {
match arg {
Some(format) => match format {
DumpFormat::Debug => println!("{:#?}", tokens),
DumpFormat::Json => println!("{}", serde_json::to_string(&tokens).unwrap()),
DumpFormat::JsonPretty => {
println!("{}", serde_json::to_string_pretty(&tokens).unwrap())
}
},
// Default token stream dumping format.
None => println!("{:#?}", tokens),
}
} else if let Some(ref arg) = args.dump_ast {
let ast = parse_tokens(tokens)?;
match arg {
Some(format) => match format {
DumpFormat::Debug => println!("{:#?}", ast),
DumpFormat::Json => println!("{}", serde_json::to_string(&ast).unwrap()),
DumpFormat::JsonPretty => {
println!("{}", serde_json::to_string_pretty(&ast).unwrap())
}
},
// Default ast dumping format.
None => println!("{:#?}", ast),
}
}
Ok(())
}
pub fn main() -> Result<(), std::io::Error> { pub fn main() -> Result<(), std::io::Error> {
let args = Opt::from_args(); let args = Opt::from_args();
@ -25,9 +147,16 @@ pub fn main() -> Result<(), std::io::Error> {
for file in &args.files { for file in &args.files {
let buffer = read_to_string(file)?; let buffer = read_to_string(file)?;
match forward_val(&mut engine, &buffer) { if args.has_dump_flag() {
Ok(v) => print!("{}", v.to_string()), match dump(&buffer, &args) {
Err(v) => eprint!("{}", v.to_string()), Ok(_) => {}
Err(e) => eprintln!("{}", e),
}
} else {
match forward_val(&mut engine, &buffer) {
Ok(v) => print!("{}", v.to_string()),
Err(v) => eprint!("{}", v.to_string()),
}
} }
} }
@ -37,10 +166,20 @@ pub fn main() -> Result<(), std::io::Error> {
io::stdin().read_line(&mut buffer)?; io::stdin().read_line(&mut buffer)?;
match forward_val(&mut engine, buffer.trim_end()) { if args.has_dump_flag() {
Ok(v) => println!("{}", v.to_string()), match dump(&buffer, &args) {
Err(v) => eprintln!("{}", v.to_string()), Ok(_) => {}
Err(e) => eprintln!("{}", e),
}
} else {
match forward_val(&mut engine, buffer.trim_end()) {
Ok(v) => println!("{}", v.to_string()),
Err(v) => eprintln!("{}", v.to_string()),
}
} }
// The flush is needed because we're in a REPL and we do not want buffering.
std::io::stdout().flush().unwrap();
} }
} }
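
The REPL loop above flushes stdout after every iteration and unwraps the result. Since `main` already returns `Result<(), std::io::Error>`, the flush error could instead be propagated with `?`. The sketch below shows one way to do that; `write_and_flush` is a hypothetical helper, not code from this commit, and it is generic over `Write` so it can be exercised against an in-memory buffer.

```rust
use std::io::{self, Write};

/// Hypothetical helper: writes `text` without a trailing newline and then
/// flushes, so the output is visible immediately even when the writer buffers.
/// Any I/O error is returned to the caller rather than causing a panic.
fn write_and_flush<W: Write>(out: &mut W, text: &str) -> io::Result<()> {
    out.write_all(text.as_bytes())?;
    out.flush()
}
```

Inside the REPL loop this could be called as `write_and_flush(&mut io::stdout(), &output)?;`, where `output` is whatever string the iteration produced.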

33
docs/debugging.md

@ -13,7 +13,16 @@ These are added in order of how the code is read:
The first thing boa will do is generate tokens from source code. The first thing boa will do is generate tokens from source code.
If the token generation is wrong the rest of the operation will be wrong, this is usually a good starting place. If the token generation is wrong the rest of the operation will be wrong, this is usually a good starting place.
Navigate to `parser_expr` in [lib.rs](../src/lib/lib.rs#L48) and add `dbg!(&tokens);` just below tokens to see the array of token output. You code should look like this: To print the tokens to stdout, you can use the `boa_cli` command-line flag `--dump-tokens`, which can optionally take a format type. Supports these formats: `Debug`, `Json`, `JsonPretty`. By default it is the `Debug` format.
```bash
cargo run -- test.js --dump-tokens # token dump format is Debug by default.
```
or with interactive mode (REPL):
```bash
cargo run -- --dump-tokens # token dump format is Debug by default.
```
Or you can do it manually by navigating to `parser_expr` in [lib.rs](../boa/src/lib.rs#L25) and adding `dbg!(&tokens);` just below tokens to see the array of token output. Your code should look like this:
```rust ```rust
let mut lexer = Lexer::new(src); let mut lexer = Lexer::new(src);
@ -22,18 +31,32 @@ Navigate to `parser_expr` in [lib.rs](../src/lib/lib.rs#L48) and add `dbg!(&toke
dbg!(&tokens); dbg!(&tokens);
... ...
``` ```
Seeing the order of tokens can be a big help to understanding what the parser is working with. Seeing the order of tokens can be a big help to understanding what the parser is working with.
**Note:** flags `--dump-tokens` and `--dump-ast` are mutually exclusive. When using the flag `--dump-tokens`, the code will not be executed.
## Expressions ## Expressions
Assuming the tokens looks fine, the next step is to see the AST. Assuming the tokens looks fine, the next step is to see the AST.
You can output the expressions in [forward](../src/lib/lib.rs#L57), add `dbg!(&expr);` You can use the `boa_cli` command-line flag `--dump-ast`, which can optionally take a format type. Supports these formats: `Debug`, `Json`, `JsonPretty`. By default it is the `Debug` format.
This will print out the entire parse tree.
Dumping the AST of a file:
```bash
cargo run -- test.js --dump-ast # AST dump format is Debug by default.
```
or with interactive mode (REPL):
```bash
cargo run -- --dump-ast # AST dump format is Debug by default.
```
Or manually, you can output the expressions in [forward](../boa/src/lib.rs#L36) by adding `dbg!(&expr);`.
These methods will print out the entire parse tree.
**Note:** flags `--dump-tokens` and `--dump-ast` are mutually exclusive. When using the flag `--dump-ast`, the code will not be executed.
## Execution ## Execution
Once the tree has been generated [exec](../src/lib/exec.rs#L66) will begin to run through each expression. If the tokens and tree looks fine, you can start looking here. Once the tree has been generated [exec](../boa/src/lib.rs#L67) will begin to run through each expression. If the tokens and tree looks fine, you can start looking here.
I usually just add `dbg!()` in the relevent places to see what the output is at the time. I usually just add `dbg!()` in the relevent places to see what the output is at the time.
## Debugger ## Debugger
