After extracting the specified special characters and expanding environment variables and wildcards, we need to split every string into tokens and translate the tokens into symbols. This translation requires writing a new grammar.
In our case, only the pipe (|), the logical operators (&&, ||), and the redirections (<, >, <<, >>) are considered.
To keep the grammar from going too deep or too far, we preprocess each token by identifying its type based on its position.
Token         Symbol
LBRACE        (
RBRACE        )
PIPE          |
IN_RID        <     (with filepath)
OUT_RID       >     (with filepath)
IN_HEREDOC    <<    (with filepath)
OUT_HEREDOC   >>    (with filepath)
AND_IF        &&
OR_IF         ||
CMD           cmd (str)
OPTION        option (str starting with '-')
ARG           argument (str)
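The position-based typing described above can be sketched as follows. This is only a minimal illustration, not the project's actual implementation: the `classify` function, the `TokenType` enum, and the two flags are hypothetical names, and the `FILEPATH` type is assumed from the "(with filepath)" note and the examples further down. The sketch assumes the input has already been split into words.

```python
from enum import Enum, auto


class TokenType(Enum):
    LBRACE = auto()
    RBRACE = auto()
    PIPE = auto()
    IN_RID = auto()
    OUT_RID = auto()
    IN_HEREDOC = auto()
    OUT_HEREDOC = auto()
    AND_IF = auto()
    OR_IF = auto()
    CMD = auto()
    OPTION = auto()
    ARG = auto()
    FILEPATH = auto()  # assumption: the word that follows a redirection


OPERATORS = {
    "(": TokenType.LBRACE, ")": TokenType.RBRACE, "|": TokenType.PIPE,
    "<": TokenType.IN_RID, ">": TokenType.OUT_RID,
    "<<": TokenType.IN_HEREDOC, ">>": TokenType.OUT_HEREDOC,
    "&&": TokenType.AND_IF, "||": TokenType.OR_IF,
}
REDIRECTIONS = {TokenType.IN_RID, TokenType.OUT_RID,
                TokenType.IN_HEREDOC, TokenType.OUT_HEREDOC}
# After one of these, the next plain word starts a new command.
COMMAND_STARTERS = {TokenType.PIPE, TokenType.AND_IF,
                    TokenType.OR_IF, TokenType.LBRACE}


def classify(words):
    """Give each already-split word a type based on its text and position."""
    tokens = []
    expect_cmd = True    # no command seen yet in the current simple command
    expect_path = False  # previous word was a redirection operator
    for word in words:
        if word in OPERATORS:
            kind = OPERATORS[word]
            expect_path = kind in REDIRECTIONS
            if kind in COMMAND_STARTERS:
                expect_cmd = True
        elif expect_path:
            kind = TokenType.FILEPATH
            expect_path = False
        elif expect_cmd:
            kind = TokenType.CMD
            expect_cmd = False
        elif word.startswith("-"):
            kind = TokenType.OPTION
        else:
            kind = TokenType.ARG
        tokens.append((word, kind))
    return tokens
```

For instance, `classify(["<<", "end", ">", "outfile", "cat"])` types `end` and `outfile` as file paths and `cat` as the command, because no command word has been seen before it.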
%start complete_command
%%
complete_command : <and_or>
and_or           : <pipeline>
                 | <and_or> OR_IF <pipeline>
                 | <and_or> AND_IF <pipeline>
pipeline         : <simple_command>
                 | <pipeline> PIPE <simple_command>
simple_command   : <command>
                 | <brace_group>
                 | <redirect_list> <command>
brace_group      : LBRACE <and_or> RBRACE
command          : CMD
                 | CMD <options>
options          : <argument>
                 | OPTION <argument>
redirect_list    : <redirect>
                 | <redirect_list> <redirect>
argument         : ARG
                 | <argument> ARG
redirect         : OUT_HEREDOC
                 | IN_HEREDOC
                 | IN_RID
                 | OUT_RID
As long as no new special characters are added, this grammar is deep enough.
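One way to turn this grammar into a tree builder is a small recursive-descent parser. The sketch below is a hypothetical illustration, not the project's parser. It makes three assumptions: tokens arrive as `(type, text)` pairs, named as in the token list; each redirection operator is followed by a separate `FILEPATH` token; and the left-recursive rules (`and_or`, `pipeline`, `redirect_list`, `argument`) are replaced by loops, which yields the same left-associative trees. The `Parser` class and the tuple-based tree shape are made up for this example.

```python
REDIRECTS = {"IN_RID", "OUT_RID", "IN_HEREDOC", "OUT_HEREDOC"}


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens  # list of (type, text) pairs
        self.pos = 0

    def peek(self):
        """Type of the next token, or None at end of input."""
        return self.tokens[self.pos][0] if self.pos < len(self.tokens) else None

    def take(self, kind):
        """Consume the next token if it has the given type, else fail."""
        if self.peek() != kind:
            raise SyntaxError(f"expected {kind}, got {self.peek()}")
        text = self.tokens[self.pos][1]
        self.pos += 1
        return text

    def complete_command(self):
        tree = self.and_or()
        if self.peek() is not None:
            raise SyntaxError(f"unexpected {self.peek()}")
        return tree

    def and_or(self):
        # and_or : pipeline | and_or (AND_IF | OR_IF) pipeline
        left = self.pipeline()
        while self.peek() in ("AND_IF", "OR_IF"):
            op = self.take(self.peek())
            left = (op, left, self.pipeline())
        return left

    def pipeline(self):
        # pipeline : simple_command | pipeline PIPE simple_command
        left = self.simple_command()
        while self.peek() == "PIPE":
            op = self.take("PIPE")
            left = (op, left, self.simple_command())
        return left

    def simple_command(self):
        # simple_command : command | brace_group | redirect_list command
        if self.peek() == "LBRACE":
            self.take("LBRACE")
            inner = self.and_or()
            self.take("RBRACE")
            return ("group", inner)
        redirects = []
        while self.peek() in REDIRECTS:
            op = self.take(self.peek())
            redirects.append((op, self.take("FILEPATH")))
        return ("command", self.command(), redirects)

    def command(self):
        # command : CMD | CMD options ; options : argument | OPTION argument
        words = [self.take("CMD")]
        if self.peek() == "OPTION":
            words.append(self.take("OPTION"))
            words.append(self.take("ARG"))  # the grammar requires an argument here
        while self.peek() == "ARG":
            words.append(self.take("ARG"))
        return words


tokens = [("CMD", "echo"), ("ARG", '"1"'), ("AND_IF", "&&"),
          ("CMD", "echo"), ("ARG", '"2"'), ("OR_IF", "||"),
          ("CMD", "echo"), ("ARG", '"3"')]
tree = Parser(tokens).complete_command()
```

Here `tree` has `||` at the root, with the `&&` node as its left child and the last `echo` as its right child. Because the sketch follows the grammar literally, an OPTION with no argument after it is rejected.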
Example 1: echo -n "hi"
1. Skip extraction and expansion, because this input poses no problem for tokenizing.
2. Tokenize (using a linked list).
   str  : echo -> str  : -n     -> str  : "hi"
   type : cmd     type : option    type : argument
3. Build the tree using the grammar.
   root : echo
            \
             -n
               \
                "hi"
Example 2: a heredoc (<< end) and an output redirection (> outfile) in front of cat
1. Separate each redirection operator from its filepath and split them into individual strings.
   {"<<", "end", ">", "outfile", "cat"}
2. Tokenize.
   str  : <<         -> str  : end      -> str  : >       -> str  : outfile  -> str  : cat
   type : in_heredoc    type : filepath    type : out_rid    type : filepath    type : cmd
3. Build the tree using the grammar.
       root : cat
             /
      << - end
             \
              > - outfile
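Steps 1 and 2 of this example can be sketched as below. This is a rough illustration under stated assumptions, not the project's code: it assumes an operator may be glued to its filepath (for example a hypothetical word such as `<<end`), it ignores quoting entirely, and it types every word that is not an operator or a filepath as `cmd`, which is enough for this example only. The `Token` node and the `split_redirections`, `tokenize`, and `render` helpers are made-up names.

```python
import re
from dataclasses import dataclass
from typing import Optional

OPERATOR_TYPES = {"<<": "in_heredoc", ">>": "out_heredoc",
                  "<": "in_rid", ">": "out_rid"}


@dataclass
class Token:
    """One node of the singly linked token list."""
    text: str
    kind: str
    next: Optional["Token"] = None


def split_redirections(words):
    """Step 1: detach redirection operators from the text glued to them."""
    pieces = []
    for word in words:
        # Two-character operators are listed first so "<<" is not read as "<", "<".
        pieces.extend(re.findall(r"<<|>>|<|>|[^<>]+", word))
    return pieces


def tokenize(pieces):
    """Step 2: chain the pieces into a linked list, typing each by position."""
    head = tail = None
    after_operator = False
    for text in pieces:
        if text in OPERATOR_TYPES:
            kind = OPERATOR_TYPES[text]
        elif after_operator:
            kind = "filepath"
        else:
            kind = "cmd"  # simplification: options and arguments are not handled
        after_operator = text in OPERATOR_TYPES
        node = Token(text, kind)
        if head is None:
            head = node
        else:
            tail.next = node
        tail = node
    return head


def render(head):
    """Walk the list and show it in a 'str:type -> str:type' form."""
    parts = []
    node = head
    while node is not None:
        parts.append(f"{node.text}:{node.kind}")
        node = node.next
    return " -> ".join(parts)


pieces = split_redirections(["<<end", ">outfile", "cat"])
tokens = tokenize(pieces)
```

With this input, `pieces` matches the string array shown in step 1, and walking `tokens` visits the same five typed nodes as step 2.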
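Example 3: echo "1" && echo "2" || echo "3"
1. Skip extraction and expansion.
2. Tokenize.
   str  : echo -> str  : "1"      -> str  : &&     -> str  : echo -> str  : "2"      -> str  : ||    -> str  : echo -> str  : "3"
   type : cmd     type : argument    type : and_if    type : cmd     type : argument    type : or_if    type : cmd     type : argument
3. Build the tree using the grammar.
            root : ||
                  /  \
                &&    echo
               /  \      \
           echo    echo   "3"
              \       \
              "1"     "2"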
📖 reference
https://pubs.opengroup.org/onlinepubs/009604499/utilities/xcu_chap02.html#tag_02_10_02