Regular expressions are part of the fundamental makeup of modern software, yet few schools teach children how to use them.

Here's what ICT should really teach kids: how to do regular expressions

This article is more than 11 years old
Regexps are part of the fundamental makeup of modern software and can make everyday people's lives much easier

"Technical literacy" is the subject of an ongoing, worldwide educational debate. Some argue that every kid should learn the basics of programming and successfully write a program or two before graduating – just as we expect every student to write an essay or two before leaving school.

I think that this is sound. We tend to think of innovative, original software as originating with engineers and computer scientists who come up with better ways to solve problems, such as the breakthroughs that let our digital cameras improve their light-balancing and signal processing. This stuff is important, it's elegant, and it's creative, but it's not the whole story.

The other kind of innovation comes from people who identify new problems to solve, new opportunities for automation and computer-assisted processes. These are just as likely to come from information-civilians as professionals. Bakers, cab drivers, and artists are just as capable of proposing novel solutions to interesting problems as skilled engineers. Indeed, some of our greatest automation successes have come from people outside computer science solving their own problems – the showstopper being Tim Berners-Lee inventing a way to share documents among physicists and, in the process, inventing the world wide web.

Historically, a "domain expert" who wants to automate a system will approach an engineer, who will go through a formal process of requirements gathering, technical design, implementation, testing and refinement. That's fine as far as it goes, but there are huge dividends to be earned by giving people the power to solve their own problems without having to suffer through the inevitable signal degradation from being interpreted by others who've never had to do the job you're trying to improve.

So yes, let's expose all our kids to the fundamentals of writing code. Most people will never write code after school, just as most will never write an essay. But much of the important information you receive will be structured like an essay, including the position papers of politicians, the curriculum statements from your kids' schools, and your workplace's code of conduct.

Much of the world you interact with, from cash machines to your bank's website to the website where you sign on for disability benefits to the alarm clock that wakes you in the morning to the phone that tracks your location, social network and personal thoughts, is underpinned by software. At the very least, a cursory understanding of the working of software will help you judge the reliability, quality, and utility of the software in your life.

But writing code – and essays – is a higher-level skill, the peak of a pyramid composed of innumerable sub-skills that are truly foundational and fundamental. You can't write either without knowing something about spelling, grammar and structure. And these days, you can't do either without knowing a little typing.

Knowing how to type is one of those skills that is improbably important these days. Regardless of your professional path, being a good, fast, skilled, accurate typist is probably to your benefit. From doctors to secretaries, from call-centre workers to police officers, nearly every job today involves a little typing, and many jobs require a lot of typing. A once-esoteric skill has become ubiquitous and essential.

Which is weird, because typing is fundamentally old-fashioned. There are many ways to interact with our devices today that don't involve typing – everything from voice to gestures. But even as these non-typing interfaces proliferate, computers worm their way into new places, and the keyboard resurfaces. Every day sees devices shedding their keyboards – but every day also sees more devices adding keyboards.

There are other skills like this, skills that are notionally obsolete, based on old-fashioned ways of interacting with computers that are vestiges of more technical eras, but that are really, secretly indispensable if you want to get the most out of the modern world. One of my favourites is "regular expressions" (also called regex or regexp). These are short, fabulously useful commands that tell a computer how to tease apart long blocks of text and find words or phrases that match your criteria.

For example, if you had a list of names and you wanted to find all the michaels, Michaels, Mikes and mikes, you could use a simple regular expression inside of a search-box to locate all of them at once. You can use regular expressions to find all the files in a directory that end with jpg (or jpeg, or JPG or JPEG). You can use them to find all the street addresses (every string beginning with a number, followed by a space, followed by one or more words, followed by Street, or St, or Road or Rd, etc).
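To make those three searches concrete, here is a rough sketch in Python. The patterns and the sample data are illustrative assumptions, not a definitive recipe: real names, filenames and addresses are messier than this, and the syntax a given search-box accepts may differ from Python's.

```python
import re

# Hypothetical sample data, invented for illustration only.
names = ["Michael Jones", "mike smith", "Michelle Obama", "Mikey Day", "Sam Mike"]
files = ["cat.jpg", "dog.JPEG", "notes.txt", "jpeg-guide.pdf", "scan.Jpg"]
lines = ["221 Baker Street", "10 Downing St", "Baker Street 221",
         "42 Abbey Road", "12 Old Kent Road", "7 Rd"]

# michael, Michael, mike or Mike as a whole word:
# [Mm] accepts either case of the first letter, (?:chael|ke) either ending,
# and \b stops "Mikey" from counting.
NAME = re.compile(r"\b[Mm]i(?:chael|ke)\b")

# Filenames ending in .jpg or .jpeg. IGNORECASE is a simplification:
# it accepts any mix of upper and lower case, not only jpg/jpeg/JPG/JPEG.
JPEG = re.compile(r"\.jpe?g$", re.IGNORECASE)

# A number, a space, one or more words, then Street, St, Road or Rd.
ADDRESS = re.compile(r"^\d+ (?:[A-Za-z]+ )+(?:Street|St|Road|Rd)\b")

name_hits = [n for n in names if NAME.search(n)]
file_hits = [f for f in files if JPEG.search(f)]
address_hits = [a for a in lines if ADDRESS.search(a)]

print(name_hits)
print(file_hits)
print(address_hits)
```

Each pattern is only a line long, which is the point: the same handful of operators (character sets, alternatives, "one or more") covers all three searches.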

Regular expressions are part of the fundamental makeup of modern software. They are present, in some form or another, in every modern operating system. Word processors, search-engines, blogging programs … though it's been decades since software for everyday people was designed with the assumption that users would know regexps, they still lurk in practically every environment we use.

I think that technical people underestimate how useful regexps are for "normal" people, whether a receptionist laboriously copy-pasting all the surnames from a word-processor document into a spreadsheet, a school administrator trying to import an old set of school records into a new system, or a mechanic hunting through a parts list for specific numbers.
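The receptionist's chore might look something like the following sketch in Python. It assumes, purely for illustration, that each line of the document reads "Firstname Surname - ext. NNN"; the sample text, the layout and the single "surname" column are all invented here, and a real document would need a pattern fitted to its own layout.

```python
import csv
import io
import re

# Hypothetical text pasted out of a word-processor document.
document_text = """\
Alice Nguyen - ext. 204
Bob O'Brien - ext. 117
Carla de Souza - ext. 332
"""

# On each line: skip the first word (the given name), then capture
# everything up to the " - " separator as the surname.
SURNAME = re.compile(r"^\S+ (.+?) - ", re.MULTILINE)

surnames = SURNAME.findall(document_text)

# Write the surnames as a one-column CSV, which a spreadsheet can open.
# An in-memory buffer stands in for a real file here.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["surname"])
writer.writerows([s] for s in surnames)

print(buffer.getvalue())
```

One pattern replaces a copy-paste per name, and it handles three names or three thousand with the same effort.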

The reason technical people forget this is that once you know regexps, they become second nature. Any search that involves more than a few criteria is almost certainly easier to put into a regexp, even if your recollection of the specifics is fuzzy enough that you need to quickly look up some syntax online.

Knowing regexp can mean the difference between solving a problem in three steps and solving it in 3,000 steps. When you're a nerd, you forget that the problems you solve with a couple of keystrokes can take other people days of tedious, error-prone work to slog through.

Like typing or spelling, regexp is a foundational skill that involves a fair bit of practice and learning by rote. It's the sort of skill that is best taught at an early age. Regexps are easy to teach with games, since each "operator" in a regular expression blocks or matches a different set of characters, so you can easily make tower-defence style games where the player has to construct progressively more complex regexps to block the incoming monsters – there are tons of regexp games online already.

So far as I can tell, no school ICT class yet bothers with them, despite the fact that regexps organise so much of our underlying infrastructure and can make such an enormous difference to the lives of everyday people, even those who don't code.

If we're going to teach kids to use PowerPoint and word processors as part of their core education, we should be teaching them regular expressions. Very few pieces of technical arcana have such widespread applicability – and such a lack of widespread appreciation.
