A Little Background
At the inception of UNIX, three core kinds of program made everything work: a compiler (today, e.g. gcc), a shell (today, e.g. bash), and an editor (ed).
Because UNIX was developed in the 1970s, computational resources were very limited, with RAM amounting to a few tens of kilobytes. ed is a line editor; you read that right, ed operates line by line. This editor uses a clever command mode that lets users get a lot done by combining line numbers with mnemonic commands. For example, suppose we wanted to:
- Print the 5th line of a text file:
5p
- Print all the lines:
1,$p
- From the 1st to the 4th line, globally search for a regular expression and delete those matching lines:
1,4g/re/d
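The address-plus-command pattern above can be tried without opening ed. The following is a minimal sketch that uses sed as a stand-in, since sed inherited ed's addressing syntax; the sample file and its contents are hypothetical, and the regular expression o replaces the placeholder re.

```shell
# Build a small sample file (hypothetical content, for illustration only).
sample=$(mktemp)
printf 'one\ntwo\nthree\nfour\nfive\nsix\n' > "$sample"

# ed: 5p        -> print only the 5th line
fifth=$(sed -n '5p' "$sample")

# ed: 1,$p      -> print every line, from line 1 to the last line ($)
all=$(sed -n '1,$p' "$sample")

# ed: 1,4g/o/d  -> within lines 1 to 4, delete the lines matching the regex "o"
remaining=$(sed '1,4{/o/d;}' "$sample")

printf '%s\n' "$fifth"
printf '%s\n' "$remaining"
```

Within lines 1 to 4, only "three" has no "o", so it survives the deletion together with the untouched lines 5 and 6.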
What is grep?
One evening, Ken Thompson, the original developer of ed, was talking with Lee McMahon, who wanted to find occurrences of an arbitrary word in The Federalist Papers, written under the pseudonym Publius by Alexander Hamilton, James Madison and John Jay.
The plain text file is 1.2 megabytes, and since RAM was scarce, McMahon couldn’t open the file in ed. That’s when Ken Thompson wrote grep in one night in PDP-11 assembly, specializing the g (global) command from ed.
The name grep comes from g/re/p: globally search for a regular expression and print. In other words, it finds all the lines in a file, or in a directory of files, that match the regular expression /re/ and prints them to stdout.
Where GREP Came From, with Prof. Brian Kernighan, on the Computerphile YouTube channel.
Rationale
As programmers, a big portion of our career is spent editing files, and grep eases the task of finding occurrences of functions, classes or variables in a project.
By simply running grep -R some_function my_project, we get all the lines where some_function appears, searching recursively through all the files inside the directory my_project.
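To make the recursive search concrete, here is a small sketch that builds a throwaway project and searches it. The directory layout and file contents are hypothetical; only the -R, -n and -l flags of grep are relied on.

```shell
# Hypothetical project layout, created in a temporary directory.
my_project=$(mktemp -d)
mkdir -p "$my_project/src" "$my_project/tests"
printf 'def some_function():\n    return 42\n' > "$my_project/src/lib.py"
printf 'from lib import some_function\nassert some_function() == 42\n' > "$my_project/tests/test_lib.py"
printf 'nothing to see here\n' > "$my_project/README"

# -R: recurse into directories; -n: prefix each match with its line number.
matches=$(grep -Rn some_function "$my_project")
printf '%s\n' "$matches"

# -l: list only the names of the files that contain at least one match.
files=$(grep -Rl some_function "$my_project" | sort)
printf '%s\n' "$files"
```

Each output line of the first search has the form path:line_number:matching_line, which is usually enough to jump straight to the occurrence in an editor.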
The main issue with using grep in a modern development environment is all the files and folders that we don’t want to include in our search: logs, .vim directories, tmp files, etc.
While we can use option flags with grep, it gets out of hand really quickly, and that’s why, when working within a development project directory, we should use other tools.
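As an illustration of how the flags pile up, the sketch below filters the kinds of noise mentioned above out of a recursive search. The project layout is hypothetical, and it assumes a grep that supports --exclude-dir and --exclude, as GNU grep does.

```shell
# Hypothetical project with one real source file and three sources of noise.
my_project=$(mktemp -d)
mkdir -p "$my_project/src" "$my_project/.vim" "$my_project/tmp"
printf 'some_function()\n' > "$my_project/src/main.py"
printf 'some_function()\n' > "$my_project/.vim/session.vim"
printf 'some_function()\n' > "$my_project/tmp/scratch.txt"
printf 'called some_function\n' > "$my_project/debug.log"

# A plain recursive search reports every file, wanted or not.
noisy=$(cd "$my_project" && grep -Rl some_function . | wc -l | tr -d ' ')

# Each kind of noise needs its own flag, and the list grows with the project.
clean=$(cd "$my_project" && grep -Rl \
  --exclude-dir=.vim \
  --exclude-dir=tmp \
  --exclude='*.log' \
  some_function .)

printf 'files matched without filters: %s\n' "$noisy"
printf 'files matched with filters: %s\n' "$clean"
```

Three flags are already needed for this tiny example; a real project with build output, vendored dependencies and editor state needs many more, and they have to be retyped or aliased for every search.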
Alternatives
Thankfully there are many options for developers: tools that, to varying degrees, respect the .gitignore file and are optimized for the programming workflow.
Ack
This program is written purely in Perl 5 and it takes advantage of Perl’s naturally powerful text processing capabilities and regular expressions engine.
It is very portable as Perl runs almost everywhere.
Ag, the Silver Searcher
According to the creator of ag, Geoff Greer, it is an “order of magnitude” faster than ack. It started as an ack clone, but its features have diverged, and it is a really fast solution.
We can also add more patterns to ignore in a .ignore file, such as *.min.js files. Also, ag is written in C and is readily available on many operating systems.
Ag: The Silver Searcher, source code on GitHub.
Sift, Grep on Steroids
Written in Go, sift claims to be even faster than ag, while also adding useful features for developers such as conditions, searching inside gzipped files, and filtering by file type, file extension, full path, file name, etc.
It also makes it easier to find and replace without the use of other programs.
Sift: Grep on Steroids, source code on GitHub.
Other Solutions
Conclusion
Being aware of these programs will help us be more efficient in our workflow.
Cheers,