Skip to main content

Proof: The language consisting of strings whose lengths are prime is not regular

This is a problem that came up during my discussion with my friend for CS172: prove or disprove that the language consisting of all strings such that the length of each string is a prime number is regular. I’ll disprove this statement by showing that the pumping lemma for regular languages does not hold for this language. Because the pumping lemma holds for every regular language, the contrapositive shows that if the lemma does not hold, then the language is not regular.

Proof #

Suppose the statement is true, and the language is regular. Then there exists a finite state automaton that recognizes this language. Let us call it M. By the pumping lemma for regular languages, there exists a natural number p such that for every string s in L(M) of length at least p, there is a decomposition of s=xyz such that:

  1. |y|>0
  2. |xy|p
  3. iZ0,xyizL(M)

Since there are infinitely many prime numbers, there must exist a string wL(M) such that |w|=k is the first prime number greater than p. Because w is in L(M) and |w|>p, w can be decomposed as w=xyz that satisfies the above conditions.

Now consider the string v=xyk+1z. By the condition 3 above, vL(M). Thus, the length of v must be a prime number. But |v|=|xyk+1z|=|xyz|+|yk|=k+|y|k=k(1+|y|). Clearly, kk(1+|y|) and k>1. Hence |v| is not a prime. This contradiction implies that the supposition is false, and the given language is not regular. Q.E.D.