What's the best way of getting only the final match of a regular expression in a file using grep?
Also, is it possible to begin grepping from the end of the file instead of the beginning and stop when it finds the first match?
I always use cat, although it makes the command a little longer: cat file | grep pattern | tail -1
I blame my Linux admin course teacher at college, who loves cats :))))
-- You don't have to cat a file before grepping it. grep pattern file | tail -1 works and is more efficient, too.
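On the second part of the question (starting from the end and stopping at the first match), a minimal sketch is below. It assumes tac from GNU coreutils is installed and that your grep supports -m (GNU and BSD grep do). The function names last_match and last_match_portable are made up for illustration.

```shell
# Sketch only: last_match and last_match_portable are hypothetical helper names.

# Read the file backwards with tac (GNU coreutils) and let grep stop after
# the first hit (-m 1), which is the last match in the original line order.
last_match() {
    tac -- "$2" | grep -m 1 -e "$1"
}

# Fallback without tac: scan the whole file and keep only the final match.
last_match_portable() {
    grep -e "$1" -- "$2" | tail -n 1
}
```

Usage would be something like last_match 'pattern' file. The tac variant can stop early when the match is near the end of the file, while the fallback always reads the whole file.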
This is for someone working with huge text files on Unix/Linux/Mac/Cygwin. If you use Windows, check this out about Linux tools on Windows: https://stackoverflow.com/questions/3519738/what-is-the-best-way-to-use-linux-utilities-under-windows.
One can follow this workflow to get good performance:
zq from the package. Quote from its GitHub README:
Creating an index
zindex needs to be told what part of each line constitutes the index. This can be done by a regular expression, by field, or by piping each line through an external program.
By default zindex creates an index of file.gz.zindex when asked to index file.gz.
Example: create an index on lines matching a numeric regular expression. The capture group indicates the part that's to be indexed, and the options show each line has a unique, numeric index.
$ zindex file.gz --regex 'id:([0-9]+)' --numeric --unique
Example: create an index on the second field of a CSV file:
$ zindex file.gz --delimiter , --field 2
Example: create an index on a JSON field orderId.id in any of the items in the document root's actions array (requires jq). The jq query creates an array of all the orderId.ids, then joins them with a space to ensure each individual line piped to jq creates a single line of output, with multiple matches separated by spaces (which is the default separator).
$ zindex file.gz --pipe "jq --raw-output --unbuffered '[.actions[].orderId.id] | join(\" \")'"
Querying the index
The zq program is used to query an index. It's given the name of the compressed file and a list of queries. For example:
$ zq file.gz 1023 4443 554
It's also possible to output by line number, so to print lines 1 and 1000 from a file:
$ zq file.gz --line 1 1000
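The above solutions only work for a single file. To print the last occurrence in each of many files (say, with suffix .txt), use the following bash script:
#!/bin/bash
# Loop over the glob directly; parsing ls output breaks on filenames with spaces.
for fn in *.txt
do
    # Keep only the last matching line of this file.
    result=$(grep 'pattern' "$fn" | tail -n 1)
    echo "$result"
done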
where 'pattern' is what you would like to grep.
If you have several files, use an inline for loop:
for a in *.txt; do grep "pattern" "$a" /dev/null | tail -n 1; done
The /dev/null provides a second file so grep will list the filename where the pattern is found.