I need to modify a csv file using bash.
Input (a csv file):
firstletter="s"
surname="houston"
emaildomain="@zzz"
input=$(cat 1.csv)
1.csv:
1,1,Susan houston,Director of Services,,
2,1,Christina Gonzalez,Director,,
3,2,Brenda brown,"Director, Second Career Services",,
How can I add text between the last two commas using Linux bash? I tried something like:
for i in $(cat $input);do
sed -i "s/,$/${firstletter}${surname}${emaildomain},/g" $i;
done
However, that results in an error:
sed: -e expression #1, char 5: unterminated `s' command
Expected output:
1,1,Susan houston,Director of Services,shouston@zzz,
2,1,Christina Gonzalez,Director,cgonzalez@zzz,
3,2,Brenda brown,"Director, Second Career Services",bbrown@zzz,
edited Mar 29 at 2:50 by John Kugelman
asked Mar 28 at 16:40 by NetRanger
3 Answers
The question isn't clear, but I think this might be what you're trying to do, using GNU awk for "inplace" editing and gensub():
$ cat 1.csv
1,1,Susan houston,Director of Services,,
2,1,Christina Gonzalez,Director,,
3,2,Brenda brown,"Director, Second Career Services",,
$ awk -i inplace 'BEGIN{FS=OFS=","} {$(NF-1)=tolower(gensub(/(.).* (.*)/,"\\1\\2",1,$3) "@zzz")} 1' 1.csv
$ cat 1.csv
1,1,Susan houston,Director of Services,shouston@zzz,
2,1,Christina Gonzalez,Director,cgonzalez@zzz,
3,2,Brenda brown,"Director, Second Career Services",bbrown@zzz,
See What's the most robust way to efficiently parse CSV using awk? for more information on processing CSVs with awk.
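Where GNU awk is not available, the same transformation can be sketched with POSIX awk features only. This is a minimal sketch, not part of the answer above: the function name `fill_email_inplace` is hypothetical, `substr()` and `split()` stand in for `gensub()`, and a temporary file stands in for `-i inplace`. Like the original command, it splits on every comma and therefore relies on the name being in field 3 and the target being the second-to-last field.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the answer above): fills the second-to-last
# comma-separated field with <first letter><last word of field 3>@zzz, lowercased.
fill_email_inplace() {
    local file=$1 tmp
    tmp=$(mktemp) || return 1
    if awk 'BEGIN { FS = OFS = "," }
        {
            n = split($3, parts, " ")    # words of the full name
            $(NF-1) = tolower(substr($3, 1, 1) parts[n]) "@zzz"
            print
        }' "$file" > "$tmp"
    then
        # Copy back instead of mv so the file keeps its permissions.
        cat "$tmp" > "$file"
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

Calling `fill_email_inplace 1.csv` should leave the file in the same state as the gawk command above.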
I'm sure there's some cleverness with sed that would achieve what you want, but I'd personally go with GNU awk here.
cat a.csv
1,1,Susan houston,Director of Services,,
2,1,Christina Gonzalez,Director,,
gawk -i inplace 'BEGIN{FS=OFS=","}{fn=gensub(/(.).*/,"\\1","1",$3);split($3,ln," ");$5=fn"."ln[length(ln)]"@zzz";print}' a.csv
cat a.csv
1,1,Susan houston,Director of Services,S.houston@zzz,
2,1,Christina Gonzalez,Director,C.Gonzalez@zzz,
-i inplace is a GNU extension that allows awk to emulate sed's -i.
The BEGIN section tells awk that both the input and output field separators are commas.
fn=gensub(...) pulls the first letter of the first name (the full name being the third field, $3).
We then split the name into an array ln (assuming some people may have middle names).
We set the 5th field (the empty space between the last two commas) to the first letter, a dot, and the last element of the array, followed by @zzz.
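The steps above can be spread over several commented lines. This is a sketch under assumptions, not the answer's exact command: it reads standard input and writes standard output instead of editing in place, it uses `substr()` in place of `gensub()` so that it runs in any POSIX awk, and the function name `add_dotted_email` is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: sets field 5 to <first letter>.<last word of field 3>@zzz.
add_dotted_email() {
    awk '
    BEGIN { FS = OFS = "," }        # commas separate fields on input and output
    {
        fn = substr($3, 1, 1)       # first letter of the full name in field 3
        n  = split($3, ln, " ")     # ln[1..n] holds the words of the name
        $5 = fn "." ln[n] "@zzz"    # the last word is taken as the surname
        print
    }'
}
```

For example, `printf '4,1,Mary Jane Watson,Manager,,\n' | add_dotted_email` prints `4,1,Mary Jane Watson,Manager,M.Watson@zzz,`. Because field 5 is addressed by position, a row whose title contains a quoted comma would shift the fields; `$(NF-1)` may be a safer target in that case.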
If $5 may already be filled in on some rows, only set it where it is empty:
gawk -i inplace 'BEGIN{FS=OFS=","}$5==""{fn=gensub(/(.).*/,"\\1","1",$3);split($3,ln," ");$5=fn"."ln[length(ln)]"@zzz"}{print}' a.csv
Solution using jq:
With -R parameter (--raw-input) it reads plain text.
With -r parameter (--raw-output) it outputs plain text.
$ jq -Rr --arg emaildomain "@zzz" '
# split instruction creates an array
split(",")
# using JSON object makes the code readable
|{input_array: .}
# extract first letter and surname
|.name = .input_array[2]
|.firstletter = .name[0:1]
|.surname = (.name |split(" ") |.[1])
# construct email
|.email = (.firstletter+.surname+$emaildomain |ascii_downcase)
|.output_array = .input_array
|.output_array[-2] = .email
|.output_array
|join(",")
' 1.csv
1,1,Susan houston,Director of Services,shouston@zzz,
2,1,Christina Gonzalez,Director,cgonzalez@zzz,
3,2,Brenda brown,"Director, Second Career Services",bbrown@zzz,
sed is a line editor used to operate on any number of lines in a file; using bash to feed lines to sed is an anti-pattern. Thirdly, all the variables in your replacement string are undefined. You should really feed your script (don't omit the shebang line) through shellcheck. – tink, commented Mar 28 at 16:44
... bash, because if any field is quoted and contains commas, it's hard to process those fields, and there are a lot of corner cases (e.g. quoted quotes). It will work for carefully controlled input (e.g. input that contains no quotes or comma-containing fields), but if your CSV is generated by standard off-the-shelf tools for generic data, you are better off using a tool designed for CSV, such as csvtool (designed for use in bash scripts) or Python (csv module). – Vercingatorix, commented Mar 28 at 17:46
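The first comment's points can be sketched as follows: sed already iterates over the lines of its input, so no shell loop is needed, and the variables have to be defined before they are expanded. This is a minimal illustration rather than a full solution: it writes the same fixed address on every line, as the question's attempt would have, and the function name `append_fixed_email` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: inserts one fixed address between the two trailing commas
# of every line read from standard input.
append_fixed_email() {
    local firstletter="s" surname="houston" emaildomain="@zzz"
    # \$ keeps the end-of-line anchor literal inside double quotes;
    # sed applies the substitution to each input line on its own.
    sed "s/,,\$/,${firstletter}${surname}${emaildomain},/"
}
```

Running `append_fixed_email < 1.csv` prints the modified lines without touching the file. Since the pattern is anchored at the end of the line, quoted commas earlier in a row do not affect this particular edit; deriving a per-row address from the name field is where a CSV-aware tool becomes the safer choice.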