bash - Using variables in awk within echo statement that prints into a file - Stack Overflow

We use a script that prints bash commands into a file that is then run on an HPC system. It is supposed to run through a large text file containing geographic coordinates separated by whitespace and extract a specific region from that file (e.g. extract all lines with an x coordinate between xmin and xmax and a y coordinate between ymin and ymax).

Ideally, I'd like to use awk for that like so (from memory since I don't have my computer available at the moment):

awk -v xmin=-13000 -v xmax=13000 -v ymin=-500 -v ymax=500 -F ' ' {if ($1 > xmin && $1 < xmin && $2 > ymin && $2 < ymin) print $1 $2} $infile > $outfile

That would probably execute fine. However, as suggested by the title, we save this line indirectly for 25 regions, each with their own xmin, xmax etc. There are more operations following after that (using GMT calls etc). Here's a little snippet:

xmin=-13000
xmax=13000
ymin=-500
ymax=500
infile=./full_file.txt
outfile=./filtered_file.yxy
srcfile=./region_1.txt

echo """awk -v xmin=$xmin -v xmax=$xmax -v ymin=$ymin -v ymax=$ymax -F ' ' {if ($1 > $xmin && $1 < $xmin && $2 > $ymin && $2 < $ymin) print $1 $2} $infile > $outfile""" >> $srcfile

Obviously, this raises errors when running due to variable expansion. I've tried escaping the awk column identifiers, but to no avail; perhaps I didn't understand the pattern correctly. Could someone point me to a solution that allows us to keep the indirect approach?

asked Nov 16, 2024 at 20:23 by Sacha Viquerat; edited Nov 17, 2024 at 10:23 by tripleee
  • This might help: Difference between single and double quotes in Bash – Cyrus Commented Nov 16, 2024 at 20:30
  • 1 It's impossible for a given $1 to be both >xmin and <xmin; did you mean <xmax and similarly <ymax? You don't need to specify -F ' ' because it's the default (and trying to use it in your echo context is more complicated). You don't need if...print; you can just do (if I'm right above) awk '$1>xmin&&$1<xmax&&$2>ymin&&$2<ymax' infile >outfile (a pattern with no action defaults to print $0) but to echo that to a file which will work as shell input you need to add quoted quotes around the script. – dave_thompson_085 Commented Nov 16, 2024 at 20:33
  • and also I think it would be good to enclose awk code in ' if you don't want shell to expand variables. – Arkadiusz Drabczyk Commented Nov 16, 2024 at 20:40
  • "We use a script that prints bash commands into a file that is then run on foo" - that approach [almost?] always leads to unnecessary complexity and fragility vs simply writing a script to run on foo. You might want to ask a question about how to do whatever it is you need to do rather than this question about how to implement your idea of how to do it. – Ed Morton Commented Nov 17, 2024 at 9:35
2 Answers

IIUC, you have to either escape each dollar sign like this:

{if (\$1 > xmin && \$1 < xmin

or temporarily close a double quote and put a dollar sign in a single quote:

"{if ("'$1'" > xmin && "'$1'" < xmin"

or use the Bash-specific %q printf specifier:

$ read
awk -v xmin=-13000 -v xmax=13000 -v ymin=-500 -v ymax=500 -F ' ' {if ($1 > xmin && $1 < xmin && $2 > ymin && $2 < ymin) print $1 $2} $infile > $outfile
$ printf "%q\n" "$REPLY"
awk\ -v\ xmin=-13000\ -v\ xmax=13000\ -v\ ymin=-500\ -v\ ymax=500\ -F\ \'\ \'\ \{if\ \(\$1\ \>\ xmin\ \&\&\ \$1\ \<\ xmin\ \&\&\ \$2\ \>\ ymin\ \&\&\ \$2\ \<\ ymin\)\ print\ \$1\ \$2\}\ \$infile\ \>\ \$outfile
$ echo awk\ -v\ xmin=-13000\ -v\ xmax=13000\ -v\ ymin=-500\ -v\ ymax=500\ -F\ \'\ \'\ \{if\ \(\$1\ \>\ xmin\ \&\&\ \$1\ \<\ xmin\ \&\&\ \$2\ \>\ ymin\ \&\&\ \$2\ \<\ ymin\)\ print\ \$1\ \$2\}\ \$infile\ \>\ \$outfile
awk -v xmin=-13000 -v xmax=13000 -v ymin=-500 -v ymax=500 -F ' ' {if ($1 > xmin && $1 < xmin && $2 > ymin && $2 < ymin) print $1 $2} $infile > $outfile

And also I think it would be good to enclose awk code in ' if you don't want shell to expand variables.
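For illustration, here is a minimal end-to-end sketch of the escaping approach. It assumes the xmax/ymax bounds suggested in the comments rather than the xmin/ymin typo. The sample points, the temporary directory, and the comma in print are my own additions and not part of the question.

```shell
#!/usr/bin/env bash
# Sketch only: sample data and temp paths are made up for illustration.
workdir=$(mktemp -d)
infile=$workdir/full_file.txt
outfile=$workdir/filtered_file.yxy
srcfile=$workdir/region_1.txt
xmin=-13000 xmax=13000 ymin=-500 ymax=500

# Three sample points; only the first lies inside the region.
printf '%s\n' '100 20' '20000 20' '100 900' > "$infile"

# \$1 and \$2 are escaped so they reach the file unexpanded, while $xmin
# etc. are expanded right now. The single quotes are literal characters
# inside the double quotes; they protect the awk program when the
# generated file is run later.
echo "awk -v xmin=$xmin -v xmax=$xmax -v ymin=$ymin -v ymax=$ymax '\$1 > xmin && \$1 < xmax && \$2 > ymin && \$2 < ymax { print \$1, \$2 }' $infile > $outfile" >> "$srcfile"

cat "$srcfile"      # inspect the generated command
bash "$srcfile"     # run it, as the HPC job would
cat "$outfile"      # prints: 100 20
```

The generated line contains plain $1 and $2 inside single quotes, so the shell that eventually runs the file passes them to awk untouched.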

Creating a separate temporary script seems superfluous. Just loop over the parameters.

while read -r xmin xmax ymin ymax\
              infile outfile
do
    awk -v xmin="$xmin" -v xmax="$xmax" -v ymin="$ymin" -v ymax="$ymax" \
     '$1 > xmin && $1 < xmax && $2 > ymin && $2 < ymax { print $1 $2 }' "$infile" > "$outfile"
done <<____
-13000 13000 -500 500 full_file.txt  filtered_file.yxy
    17    42  19   21 littlefile.txt other.yxy
-27350 27350 -123 123 another.txt    moar.yxy
____

The ____ is just a cute alternative to the more conventional EOF heredoc delimiter. The lines in the here document should each be one set of values for the variables in the read.
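To see how read assigns the fields of each here-document line, a small sketch like the following may help. The show_regions function name is hypothetical and only exists for this demonstration.

```shell
#!/usr/bin/env bash
# show_regions is a made-up helper that only echoes what read parsed.
show_regions() {
    while read -r xmin xmax ymin ymax infile outfile
    do
        printf 'x: %s..%s  y: %s..%s  %s -> %s\n' \
            "$xmin" "$xmax" "$ymin" "$ymax" "$infile" "$outfile"
    done
}

# Leading whitespace and runs of spaces between fields are ignored,
# so the columns can be aligned freely.
show_regions <<____
-13000 13000 -500 500 full_file.txt  filtered_file.yxy
    17    42  19   21 littlefile.txt other.yxy
____
```

Each line yields one iteration with six variables filled in, which is all the awk loop needs.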

If you really want to print each snippet to a separate file (perhaps to submit each to run on a different cluster node, for example), maybe learn to use printf instead of echo.

while read -r xmin xmax ymin ymax\
              infile outfile srcfile
do
    printf 'awk -v xmin="%i" -v xmax="%i" -v ymin="%i" -v ymax="%i" \
     '"'"'$1 > xmin && $1 < xmax && $2 > ymin && $2 < ymax { print $1 $2 }'"'"' "./%s" > "./%s"\n' \
        "$xmin" "$xmax" "$ymin" "$ymax" "$infile" "$outfile" >>"./$srcfile"
done <<____
-13000 13000 -500 500 full_file.txt  filtered_file.yxy region1.txt
    17    42  19   21 littlefile.txt other.yxy         region2.txt 
-27350 27350 -123 123 another.txt    moar.yxy          region3.txt
____

(though printing commands to .txt files is still really weird).

For what it's worth, the triple quotes in your attempt do nothing useful. Python (for example) has this syntax, but in the shell, """ simply parses into an empty string inside a pair of quotes "" followed by an opening double quote ".
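A quick way to convince yourself of this (the variable names are arbitrary):

```shell
#!/usr/bin/env bash
name=world

# """...""" is just "" + "..." + "", so expansion still happens inside.
a="""hello $name"""
b="hello $name"

printf '%s\n' "$a" "$b"
[ "$a" = "$b" ] && echo "identical"
```

Both variables hold the same expanded string, which is why the triple quotes offered no protection for $1 and $2.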

Similarly, the printf example above demonstrates one way to produce a literal single quote inside a single-quoted string. 'foo'"'"'bar' is (single-quoted) foo next to double-quoted ' next to single-quoted bar, which when pasted together produces foo'bar.
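As a small standalone sketch of that quoting trick, alongside the backslash variant that is also commonly used (variable names are arbitrary):

```shell
#!/usr/bin/env bash
# Close the single quote, add a double-quoted ', reopen the single quote.
s='foo'"'"'bar'

# Same idea, but with a backslash-escaped ' between the quoted parts.
t='foo'\''bar'

printf '%s\n' "$s" "$t"    # both lines read: foo'bar
```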

I also slightly refactored your Awk script to make it more idiomatic, and fixed the missing quoting.
