Non-breaking white space Internet Explorer 8 JavaScript regexp bug (and how to fix it)

While developing a jQuery plugin for the upcoming "Items in select boxes" bookmarking plugin for our Vudu CMS, I wrote a simple regexp to strip a few characters (pipe, minus, apostrophe and white space) that are added before the actual item text.

 function cleanOptionText(txt)
 {
 return txt.replace(/^[\s|'-]+/, '');
 };

It worked just fine in Firefox 9 and Chrome, but in IE8 only the first pipe (|) was removed. After some debugging I discovered that the text contained both regular spaces and non-breaking spaces that should be removed, and that in IE8 the character class shorthand \s (which should match all white space) does not match the non-breaking space.
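A quick way to see whether a given browser is affected is to test \s against a non-breaking space directly. The snippet below is only a sketch: the helper name whitespaceClassMatchesNbsp and the sample string are made up for illustration and are not part of the plugin.

```javascript
// Hypothetical helper (not from the original plugin): reports whether
// the \s shorthand matches U+00A0 (non-breaking space) in this engine.
function whitespaceClassMatchesNbsp() {
  return /\s/.test('\xA0');
}

// Sample text that starts with a mix of non-breaking and regular spaces.
var nbspText = '\xA0 \xA0 Some option';

// In an engine where \s matches NBSP, the whole prefix is stripped.
// In an affected engine the string starts with NBSP, so ^\s+ matches nothing.
var stripped = nbspText.replace(/^\s+/, '');

var report = whitespaceClassMatchesNbsp()
  ? '\\s matches NBSP here, stripped text: (' + stripped + ')'
  : '\\s does NOT match NBSP here, text left as: (' + stripped + ')';
```

Showing report with alert() or console.log() in each browser under test makes the difference visible without stepping through the plugin code.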

The code for the non-breaking space is 0xA0 (decimal 160), so the regexp should be updated as follows:

function cleanOptionText(txt)
{
return txt.replace(/^[\s\xA0|'-]+/, '');
};
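
To illustrate what the updated pattern strips, here is a small usage sketch. The sample strings are invented for this example (real option texts in the CMS may look different); the function body is the same as above.

```javascript
// Same function as above, repeated so the example is self-contained.
function cleanOptionText(txt) {
  return txt.replace(/^[\s\xA0|'-]+/, '');
}

// Made-up option texts with different prefix characters.
var samples = [
  "|\xA0\xA0- Child item",   // pipe, two NBSPs, minus, space
  "\xA0 \xA0 Some option",   // NBSPs mixed with regular spaces
  "'-- Another item",        // apostrophe and minus signs
  "Plain item"               // nothing to strip
];

var cleaned = [];
for (var i = 0; i < samples.length; i++) {
  cleaned.push(cleanOptionText(samples[i]));
}
// cleaned is ["Child item", "Some option", "Another item", "Plain item"]
```

Because \xA0 is listed explicitly in the character class, the result does not depend on how the browser defines \s.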

We also took the opportunity to update our JavaScript trim functions:

String.prototype.trim = function()
{
return this.replace(/^[\s\xA0]+|[\s\xA0]+$/g,"");
}

String.prototype.ltrim = function()
{
return this.replace(/^[\s\xA0]+/g,"");
}

String.prototype.rtrim = function()
{
return this.replace(/[\s\xA0]+$/g,"");
}
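
If overwriting String.prototype.trim unconditionally is a concern, one possible variation is to keep the patterns in standalone helpers and patch the prototype only when the built-in method is missing or leaves non-breaking spaces behind. This is just a sketch of that idea, not what we actually deployed; the names trimText, ltrimText and rtrimText are hypothetical.

```javascript
// Shared character class: everything \s matches, plus NBSP explicitly.
var WS = '[\\s\\xA0]+';
var LEADING = new RegExp('^' + WS);
var TRAILING = new RegExp(WS + '$');

// Hypothetical standalone helpers mirroring ltrim, rtrim and trim above.
function ltrimText(s) { return String(s).replace(LEADING, ''); }
function rtrimText(s) { return String(s).replace(TRAILING, ''); }
function trimText(s) { return rtrimText(ltrimText(s)); }

// Patch String.prototype.trim only if it is missing or does not strip NBSP.
if (typeof String.prototype.trim !== 'function' ||
    '\xA0x\xA0'.trim() !== 'x') {
  String.prototype.trim = function () { return trimText(this); };
}
```

For example, trimText('\xA0 Some option \xA0') gives 'Some option', while ltrimText and rtrimText strip only one side.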

Here is the complete HTML test file:


<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>nbsp test</title>
  </head>
  <body>
    <form action="#">
      <fieldset>
      <select id="testselect">
        <option value="1">&nbsp; &nbsp; Some option</option>
      </select>
      </fieldset>
    </form>

    <script type="text/javascript">
      /*<![CDATA[*/

      var txt=document.getElementById('testselect').options[0].text;

      // Build a dump of every character with its char code, so NBSP (160) is visible
      var analyzeTxt='';
      var l=txt.length;
      for(var i=0; i<l; i++)
      {
        analyzeTxt+= '[ ('+txt.charCodeAt(i)+')=('+txt.charAt(i)+')]';
      }

      alert(analyzeTxt);

      var newTxt1 = txt.replace(/^\s+/, '');
      alert('Using only \\s = ('+newTxt1+')');


      var newTxt2 = txt.replace(/^[\s\xA0]+/, '');
      alert('Using \\s and \\xA0= ('+newTxt2+')');
      /*]]>*/
    </script>
  </body>
</html>
