
Regex to capture an attribute collection

Wayne posted an elegant solution on the regex list today in answer to the following question: "How do I find all INPUT tags and pick out the attribute/value pairs within them?" For the record, here's the pattern that Wayne came up with:
<input \s+
  (
    (?'Attr'\w+) \s* = \s*
    (?'Value' [^\s"'>]+ | "[^"]*" | '[^']*')
    \s*
  )*      #match zero or more Attrs
  /?>
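... and here is the sample code he provided to demonstrate its use. Note the use of Captures on each Match: because the attribute group repeats, the Attr and Value groups each hold one capture per attribute, not just the last one matched.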
  Regex rex = new Regex(@"
      <input \s+
      (
        (?'Attr'\w+) \s* = \s*
        (?'Value' [^\s""'>]+ | ""[^""]*"" | '[^']*')
        \s*
      )*      #match zero or more Attrs
      /?>",
      RegexOptions.ExplicitCapture |
      RegexOptions.IgnorePatternWhitespace);
  
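  // textToSearch is assumed to hold the HTML being scanned.
  foreach (Match m in rex.Matches(textToSearch))
  {
    Console.WriteLine("Found a match with these attributes:");
    // Attr and Value are captured together on every pass through the
    // repeated group, so their Captures collections line up by index.
    CaptureCollection attrs = m.Groups["Attr"].Captures;
    CaptureCollection values = m.Groups["Value"].Captures;
    for (int i = 0; i < attrs.Count; i++)
    {
      Console.WriteLine("Attr:  " + attrs[i].Value);
      Console.WriteLine("Value: " + values[i].Value);
    }
    Console.WriteLine();
  }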
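
If you want to try this end to end, here is a minimal sketch that wraps Wayne's pattern in a small console program. The class name InputAttributeDemo, the ParseInputs helper and the sample HTML are my own illustrative choices, not part of Wayne's post. I have also added RegexOptions.IgnoreCase, which is an assumption on my part so that upper-case <INPUT ...> tags match too, and the helper strips the surrounding quotes from each value.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class InputAttributeDemo
{
    // Wayne's pattern, unchanged apart from the added IgnoreCase option
    // (an assumption, so that <INPUT ...> is matched as well as <input ...>).
    private static readonly Regex InputTag = new Regex(@"
        <input \s+
        (
          (?'Attr'\w+) \s* = \s*
          (?'Value' [^\s""'>]+ | ""[^""]*"" | '[^']*')
          \s*
        )*      #match zero or more Attrs
        /?>",
        RegexOptions.ExplicitCapture |
        RegexOptions.IgnorePatternWhitespace |
        RegexOptions.IgnoreCase);

    // Hypothetical helper: returns one name/value dictionary per matched tag.
    public static List<Dictionary<string, string>> ParseInputs(string html)
    {
        var result = new List<Dictionary<string, string>>();
        foreach (Match m in InputTag.Matches(html))
        {
            var attrs = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
            CaptureCollection names = m.Groups["Attr"].Captures;
            CaptureCollection values = m.Groups["Value"].Captures;
            for (int i = 0; i < names.Count; i++)
            {
                // Remove any leading/trailing quote characters from the raw capture.
                // If an attribute is repeated, the last occurrence wins.
                attrs[names[i].Value] = values[i].Value.Trim('"', '\'');
            }
            result.Add(attrs);
        }
        return result;
    }

    public static void Main()
    {
        string html =
            "<form><input type=\"text\" name='user' size=20 /><INPUT type=submit></form>";

        foreach (Dictionary<string, string> attrs in ParseInputs(html))
        {
            Console.WriteLine("Found a match with these attributes:");
            foreach (KeyValuePair<string, string> pair in attrs)
            {
                Console.WriteLine(pair.Key + " = " + pair.Value);
            }
            Console.WriteLine();
        }
    }
}
```

One thing to watch for: \w+ will not match attribute names containing a hyphen, and an unquoted value written hard up against a self-closing slash (size=20/>) will pick up the slash as part of the value, so treat this as a quick extraction aid and not as an HTML parser.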
 
If you're not on the regex list already and you're even remotely interested in regular expressions, then I urge you to jump over and sign up today. It's a low-traffic, highly focused list and it's full of smart people like Wayne :-)
