Vim regular expressions and escape characters

Today I learned that regex in Vim can be even more irritating than normal regex.

I work for an agency that is (rightly) anal about typography, which means that apostrophes should always be curly. In practice that means (mis-)typing &rsquo; a lot. I thought I’d write a little :substitute command in my vimrc to make all apostrophes in body copy (titles, paragraphs, links and spans) “curly”. I started off by working out a regex that would find such things:


Let’s break it down

  • \v – the “very magic” switch, which means fewer escape characters
  • \< – a literal <, i.e. the start of an opening tag
  • (h[1-6]|p|a|span) – any of the tags used for body copy
  • (\_.)* – zero or more characters (incl. new lines)
  • \zs'\ze – limit the match to the apostrophe itself, which must be followed by:
  • (\_.)*\<\/\1\> – zero or more characters and the accompanying closing tag (the \1 matches the captured opening tag)
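
Put back together, those pieces give a search along the following lines. Treat it as a sketch assembled from the mapping further down rather than a verbatim copy; in particular, writing the slash of the closing tag as \/ is an assumption, made because / is also the search delimiter.

```vim
" Find a straight apostrophe that sits inside an h1-h6, p, a or span element.
" \v        very magic, so ( ) | need no backslashes
" \< \>     literal angle brackets (a bare < or > would mean word boundaries)
" \zs \ze   only the apostrophe itself counts as the match
/\v\<(h[1-6]|p|a|span).+(\_.)*\zs'\ze(\_.)*\<\/\1\>
```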

This (eventually) worked like a charm. It can no doubt be optimised by someone with more of a Vim Regex brain than me.

I also could have been smart about the characters trailing the apostrophe (like 's, 'd, 'm, 'll, 've etc.), but this seemed needlessly complex and my brain was already hurting. I was then able to do a global substitute for the encoded curly quote:
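
As a sketch, that substitute wraps the search pattern in :%s and uses the HTML entity as the replacement. The entity is written \&rsquo; because a bare & in a :substitute replacement stands for the whole match; escaping the closing-tag slash as \/ is an assumption, needed because / delimits the command.

```vim
" Replace straight apostrophes in body-copy elements with the &rsquo; entity.
" \& in the replacement is a literal ampersand (a bare & would re-insert the match).
:%s/\v\<(h[1-6]|p|a|span).+(\_.)*\zs'\ze(\_.)*\<\/\1\>/\&rsquo;/
```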


The next step was to transfer all this to my .vimrc. I mapped it to <leader>Q for “Quotes”.

" replace apostrophes with curly ones in body copy
nnoremap <leader>Q :%s/\v\<(h[1-6]|p|a|span).+(\_.)*\zs'\ze(\_.)*\<\/\1\>/\&rsquo;/<CR>

I saved my config and was promptly shouted at by Vim (my .vimrc auto-sources on save).


After a lot of head scratching it turned out that the OR operator needed to be escaped in the mapping, even when using \v and even though it worked unescaped whilst searching and substituting in a buffer. The reason is that | is the command separator in Vim script: inside a :map command a bare | ends the mapping, so it has to be written as \| (or <Bar>) to become part of the mapped command.
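
To see the effect in isolation, here is a minimal sketch with two throwaway mappings (the keys <leader>1 and <leader>2 and the foo/bar/baz pattern are made up for illustration). Both spellings keep the alternation inside the mapping; by the time the mapped command runs, Vim has turned \| or <Bar> back into a plain |, which \v then treats as OR.

```vim
" Hypothetical mappings: replace foo or bar with baz on the current line.
" A bare | here would end the :nnoremap command early.
nnoremap <leader>1 :s/\vfoo\|bar/baz/<CR>
nnoremap <leader>2 :s/\vfoo<Bar>bar/baz/<CR>
```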

This is the final snippet:

" replace apostrophes with curly ones in body copy
nnoremap <leader>Q :%s/\v\<(h[1-6]\|p\|a\|span).+(\_.)*\zs'\ze(\_.)*\<\/\1\>/\&rsquo;/<CR>

Problem solved, designers happy and lots of curly quotes everywhere.

