In an odd turn, I was given text like the following to display on a page.

Some Title<br />
&nbsp;&nbsp;&nbsp;&nbsp;A.&nbsp;&nbsp;Some&nbsp;subheader&nbsp;here<br />
&nbsp;&nbsp;&nbsp;&nbsp;B.&nbsp;&nbsp;Some&nbsp;other&nbsp;subheader&nbsp;here<br />

Two issues here (aside from the fact that it should have been an HTML list): 1) it needed to retain the spacing format, and 2) it needed to wrap within a sized element. A non-breaking space renders like an ordinary space but, true to its name, doesn't give the browser a place to break the line, so the text never wraps at those points inside the element. Hmmmm....

So, we needed to replace any &nbsp; that is preceded and followed by a printable character with a regular space, leaving runs of consecutive &nbsp; intact for the fake indentation. I figured this was best left to a regular expression used in ColdFusion's ReReplace() method, but my RegEx is pretty rusty, so I reached out on Twitter.

Andy Matthews and Kris Jones both got back to me with possible expressions for this, but nothing was working. Each expression matched the characters around an &nbsp; and then removed those characters along with the non-breaking space. Hmmmm....

But both of these folks had pointed me in the right direction. It seems that in some RegEx replace engines you can reference groups from the expression in your replacement output. Unfortunately, I couldn't get this to work with ReReplace() (its replacement string is documented to accept backreferences such as \1, but I hadn't figured out how to use them here).
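To make the difference concrete, here's a minimal sketch in plain Java, whose regex engine is one that supports group references in the replacement. The class name and the simple `(\S)&nbsp;(\S)` pattern are my own assumptions for illustration, not the expressions Andy or Kris suggested.

```java
import java.util.regex.Pattern;

class GroupReferenceDemo {
    // Hypothetical pattern for illustration: one non-whitespace character
    // captured on each side of the &nbsp; entity.
    static final Pattern NBSP = Pattern.compile("(\\S)&nbsp;(\\S)");

    // Literal replacement: the whole match, neighbours included, becomes one space.
    static String literalReplace(String text) {
        return NBSP.matcher(text).replaceAll(" ");
    }

    // Group references: $1 and $2 put the captured neighbours back around the space.
    static String groupReplace(String text) {
        return NBSP.matcher(text).replaceAll("$1 $2");
    }

    public static void main(String[] args) {
        String sample = "Some&nbsp;subheader";
        System.out.println(literalReplace(sample)); // Som ubheader
        System.out.println(groupReplace(sample));   // Some subheader
    }
}
```

The first call reproduces the symptom described above (the neighbouring characters vanish along with the entity), while the second keeps them.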

So I said to myself, "Self" ('cause that's what I call myself) "Self, what about tapping into ColdFusion's underlying Java?" Fingers flying I hit Google. BAM! Up pops Ben Nadel talking pattern matching with the underlying java.util.regex package, with code examples all over (here's one). Time to play.

First I needed the Java RegEx Pattern object:

CreateObject("java","java.util.regex.Pattern")

Then I needed to define the pattern I was searching for. This RegEx Glossary gave me a ton of info on Java RegEx, which I used to define my matching pattern:

.compile(javaCast("string","(\p{Print})&nbsp;(\p{Print})"))

The \p{Print} class identifies any printable character (I don't want to include my <br /> tags), and I want only non-breaking spaces bracketed by printable characters. The next step is defining the matcher (what the expression will be run against):

.matcher(javaCast("string",REQUEST.matchThis))
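Before replacing anything, it can help to see what the matcher actually finds. The following is a small sketch in plain Java (outside ColdFusion) that lists every match of the same pattern; the class and method names are hypothetical, and the sample strings are mine.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PrintPatternDemo {
    // Same expression as in the post: a printable character, the entity, a printable character.
    static final Pattern NBSP_BETWEEN_PRINTABLES =
            Pattern.compile("(\\p{Print})&nbsp;(\\p{Print})");

    // Collects every substring the pattern matches, in order of appearance.
    static List<String> matches(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = NBSP_BETWEEN_PRINTABLES.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(matches("Some&nbsp;subheader&nbsp;here")); // [e&nbsp;s, r&nbsp;h]
        // A newline is not a printable character, so an entity right after a line break is skipped.
        System.out.println(matches("<br />\n&nbsp;A."));               // []
    }
}
```

Note that each match includes the character on either side of the entity, which is why the replacement has to put those characters back.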

And then, the final step, replacing the &nbsp; with a space. The expression returns the characters as well, so I need group 1 + space + group 2 in my output (what I couldn't do in ReReplace). That RegEx Glossary helped with this too:

.replaceAll("$1 $2")

The groups in the expression (the bits within parens) are available in the replaceAll replacement string by number: $1 for the first group, $2 for the second, and so on. The entire thing then looks something like this:

REQUEST.finalValue = CreateObject("java", "java.util.regex.Pattern").compile(javaCast("string", "(\p{Print})&nbsp;(\p{Print})")).matcher(javaCast("string", REQUEST.matchThis)).replaceAll("$1 $2");
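For anyone who wants to try this outside ColdFusion, here's a rough equivalent in plain Java run against one of the sample lines. One thing worth knowing: the characters of the entity itself (`;` and `&`) are printable, so inside a run of entities the pattern also matches every other &nbsp;. The result alternates entities and spaces, which should still render as indentation. The second method is a stricter variant of my own (an assumption, not from the original code) that uses lookarounds to skip any entity touching another one. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class NbspCleanupDemo {
    // Same pattern and replacement as the ColdFusion one-liner.
    static String originalApproach(String text) {
        return Pattern.compile("(\\p{Print})&nbsp;(\\p{Print})")
                .matcher(text)
                .replaceAll("$1 $2");
    }

    // Assumed stricter variant (not from the post): lookarounds check the
    // neighbours without consuming them and skip entities adjacent to another &nbsp;.
    static String lookaroundVariant(String text) {
        return Pattern.compile("(?<=\\p{Print})(?<!&nbsp;)&nbsp;(?!&nbsp;)(?=\\p{Print})")
                .matcher(text)
                .replaceAll(" ");
    }

    public static void main(String[] args) {
        String line = "&nbsp;&nbsp;&nbsp;&nbsp;A.&nbsp;&nbsp;Some&nbsp;subheader&nbsp;here<br />";
        System.out.println(originalApproach(line));
        // &nbsp; &nbsp; A. &nbsp;Some subheader here<br />
        System.out.println(lookaroundVariant(line));
        // &nbsp;&nbsp;&nbsp;&nbsp;A.&nbsp;&nbsp;Some subheader here<br />
    }
}
```

Either way the words now have real spaces between them for the browser to wrap on; the variant only matters if the indentation runs need to stay byte-for-byte untouched.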

Worked like a charm! Thanks to all who helped me get my head around this one.