A little Background

At the inception of UNIX, three core programs made everything work: a compiler (like today's gcc), a shell (like today's bash), and an editor (ed).

UNIX was developed in the 70’s, when computational resources were very limited and RAM reached only a few tens of kilobytes. ed was a line editor; you read that right, ed operates line by line by default. It uses a clever command mode that lets users get a lot done by combining line numbers with mnemonic commands. For example, suppose we wanted to:

  • Print the 5th line of a text file: 5p
  • Print all the lines: 1,$p
  • From the 1st to the 4th line, globally search for a regular expression and delete the matching lines: 1,4g/re/d
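The commands in the list above can be tried non-interactively by piping them into ed. This is a minimal sketch, assuming ed is installed; the file name sample.txt, its contents and the pattern et are made up for illustration. The address range is typed as plain numbers before the command.

```shell
# Build a small throwaway file to edit (contents are hypothetical).
tmpdir=$(mktemp -d)
printf 'alpha\nbeta\ngamma\ndelta\nepsilon\nzeta\n' > "$tmpdir/sample.txt"

# 5p: print the 5th line. -s silences ed's byte counts, q quits.
fifth=$(printf '5p\nq\n' | ed -s "$tmpdir/sample.txt")

# 1,$p: print every line, from line 1 to the last line ($).
all=$(printf '1,$p\nq\n' | ed -s "$tmpdir/sample.txt")

# 1,4g/et/d: within lines 1 to 4, delete each line matching the regex "et".
# ,p then prints what is left in the buffer. Q quits without saving,
# so the file on disk is never modified.
remaining=$(printf '1,4g/et/d\n,p\nQ\n' | ed -s "$tmpdir/sample.txt")

printf '%s\n' "$remaining"
```

Only beta is deleted here: zeta also matches, but it sits on line 6, outside the 1,4 range.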

What is grep?

GREP

One evening, the original ed developer, Ken Thompson, was having a conversation with Lee McMahon, who was interested in finding occurrences of an arbitrary word in The Federalist Papers, written under the pseudonym Publius by Alexander Hamilton, James Madison and John Jay.

The plain text file is 1.2 megabytes, and since ed loads the whole file into memory and RAM was scarce, McMahon couldn’t open it in ed. That’s when Ken Thompson wrote grep in one night in PDP-11 assembly, specializing the g (global) command from ed.

Grep stands for g/re/p: globally find all the lines in a file (or a directory of files) that match the regular expression /re/ and print them to stdout.
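To make the g/re/p idea concrete, here is a small sketch on a throwaway file; the file contents and the pattern the are made up for illustration.

```shell
# Create a three-line sample file (hypothetical contents).
tmpfile=$(mktemp)
printf 'We the people\nno match here\nthe end\n' > "$tmpfile"

# Print every line that matches the regular expression "the",
# which is what g/the/p would do inside ed.
grep 'the' "$tmpfile"

# -c counts the matching lines instead of printing them.
count=$(grep -c 'the' "$tmpfile")
```

The first and third lines match, so they are the ones printed.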

Video: Where GREP Came From, with Prof. Brian Kernighan, on the Computerphile YouTube channel.

Rationale

As programmers, we spend a big portion of our careers editing files, and grep eases the task of finding occurrences of functions, classes or variables in a project.

By simply running grep -R some_function my_project, we get every line where some_function appears, searching recursively through all the files inside the directory my_project.
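A quick way to see the recursive search in action is to build a tiny project and run the command against it. This is only a sketch; the directory layout and file contents are made up for illustration.

```shell
# Build a tiny hypothetical project tree.
project=$(mktemp -d)
mkdir -p "$project/src" "$project/tests"
printf 'def some_function():\n    pass\n' > "$project/src/app.py"
printf 'from app import some_function\n' > "$project/tests/test_app.py"
printf 'nothing to see here\n' > "$project/README"

# -R descends into every subdirectory; each hit is prefixed with its file path.
grep -R some_function "$project"

# -l lists only the names of the files that contain a match.
hit_files=$(grep -Rl some_function "$project" | wc -l)
```

Two of the three files contain the name, so hit_files ends up as 2.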

The main issue with using grep in a modern development environment is all the files and folders that we don’t want to include in our search: logs, .vim directories, tmp files, etc.

While we can exclude them with grep's option flags, the command line gets out of hand really quickly, and that’s why we should reach for other tools when working inside a development project directory.
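As a rough illustration of how the flags pile up, the sketch below uses the --exclude-dir and --exclude options (available in GNU grep; other implementations may differ). The project layout is made up for illustration.

```shell
# Hypothetical project with one real source file and three noise files.
project=$(mktemp -d)
mkdir -p "$project/src" "$project/.vim"
printf 'some_function();\n' > "$project/src/main.c"
printf 'some_function called\n' > "$project/debug.log"
printf 'some_function\n' > "$project/.vim/session"
printf 'some_function\n' > "$project/scratch.tmp"

# A plain recursive search reports all four files.
noisy=$(grep -Rl some_function "$project" | wc -l)

# Every kind of noise needs its own flag, and the list grows with the project.
filtered=$(grep -Rl \
  --exclude-dir=.vim \
  --exclude='*.log' \
  --exclude='*.tmp' \
  some_function "$project")

printf '%s\n' "$filtered"
```

With the three exclusions in place, only src/main.c is left in the results.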

Alternatives

Thankfully, there are many options for developers: tools that respect the .gitignore file and are optimized for the programming workflow.

Ack

This program is written purely in Perl 5, and it takes advantage of Perl’s naturally powerful text processing capabilities and regular expression engine.

It is very portable as Perl runs almost everywhere.

Ag, the Silver Searcher

According to the creator of ag, Geoff Greer, it is an “order of magnitude” faster than ack. It started as an ack clone, but its features have since diverged, and it remains a really fast solution.

We can also add more patterns to ignore in a .ignore file, such as *.min.js files. Also, ag is written in C and is readily available on many operating systems.
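A minimal sketch of such a .ignore file follows. The project, the file names and the *.log pattern are made up for illustration, and the search itself only runs if ag happens to be installed.

```shell
# Hypothetical project with a source file and its minified copy.
project=$(mktemp -d)
printf 'function some_function() {}\n' > "$project/app.js"
printf 'function some_function(){}\n' > "$project/app.min.js"

# One pattern per line, in the same spirit as a .gitignore file.
cat > "$project/.ignore" <<'EOF'
*.min.js
*.log
EOF

# Run the search only when ag is available on this machine.
if command -v ag >/dev/null 2>&1; then
  (cd "$project" && ag some_function)
fi
```

With the .ignore file in place, ag should skip app.min.js and report only the hit in app.js.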

Ag: The Silver Searcher, Source Code at Github.

Sift, Grep on Steroids

Written in Go, sift claims to be even faster than ag, while also adding useful features for developers such as conditions, searching inside gzipped files, and filtering by file type, file extension, full path, file name, etc.

It also makes it easier to find and replace without the use of other programs.

Sift: Grep on Steroids, Source Code at Github.

Other Solutions

Conclusion

Being aware of these programs will help us be more efficient in our workflow.

Cheers,