How to split a string on whitespace when there is quoted stuff in it

published Dec 21, 2007 12:33   by admin ( last modified Dec 21, 2007 12:33 )

When you are dealing with strings such as the log lines from an Apache or other web server log, it is common to find stuff that is inside quotes.

Let's say you want to split such a line into its items, e.g. referrer, user agent and other stuff. Since some of that stuff is inside quotes and contains spaces itself, splitting on whitespace alone won't work: it would chop the quoted items into pieces. The trick is to first split on the quotes!

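That way you get an array/list in which, counting from one, every odd element is guaranteed to be outside any quotes and every even element is guaranteed to be inside a pair of quotes. (With zero-based indexing it is the other way round: even indexes are outside, odd indexes inside.) This assumes the quotes are balanced and that no quoted item contains an escaped quote. Just iterate through the list: when you are at an odd item, split it on whitespace and store the pieces; when you are at an even item, store it as it is, since it is quoted.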
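
Here is a minimal sketch of that idea in Python. The function name split_quoted and the sample log line are made up for illustration, and the code assumes balanced double quotes with no escaped quotes inside them. Note that Python lists are zero-based, so the unquoted chunks sit at the even indexes.

```python
def split_quoted(line, quote='"'):
    """Split a line on whitespace, keeping quoted items in one piece.

    Assumption: quotes are balanced and never escaped inside an item.
    """
    items = []
    for index, chunk in enumerate(line.split(quote)):
        if index % 2 == 0:
            # Even zero-based index: outside quotes, so split on whitespace.
            items.extend(chunk.split())
        else:
            # Odd zero-based index: inside quotes, so keep it as it is.
            items.append(chunk)
    return items


# Hypothetical log line, only for demonstration.
sample = ('127.0.0.1 - - [21/Dec/2007:12:33:00 +0100] '
          '"GET /index.html HTTP/1.1" 200 512 '
          '"http://example.com/" "Mozilla/5.0 (X11; Linux)"')

fields = split_quoted(sample)
```

With this sample the request, the referrer and the user agent each come out as one item. The bracketed timestamp is not quoted, so it still gets split at its space; that would need separate handling.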