Simulating array_unique in JavaScript

One of the beautiful things about PHP is its wealth of array functions. JavaScript seems ridiculously inadequate in comparison, and you find yourself having to write your own methods or patch the existing ones. One function I especially cherish is array_unique(), which returns a new array with all the duplicates filtered out. This is easy to write in JavaScript. All you need to do is:

  • create a new object
  • loop through the array and use the array values as new properties of the object (that way the property simply gets re-set and not added as a new one to the object when it comes up again)
  • loop through the properties of the object and add each value to the results array

Technically this should do it:

function array_unique(ar){
  var sorter = {};
  for(var i=0,j=ar.length;i<j;i++){
    sorter[ar[i]] = ar[i];
  }
  ar = [];
  for(var i in sorter){
    ar.push(i);
  }
  return ar;
}

Now array_unique([1,2,3,1,1]) returns the three unique values, which is what we want. However, there are two snags. Because the second loop pushes the property names, and property names are always strings, you actually get “['1','2','3']” rather than “[1,2,3]”. And what if the array contains elements that are almost the same but of a different type? When you run array_unique([1,2,3,"1",1]) you still only get three elements back, when what you really need is “[1,2,3,'1']”. The solution to both is to store the value and its type in the property name and push the stored values to the results array:

function array_unique(ar){
  var sorter = {};
  for(var i=0,j=ar.length;i<j;i++){
    sorter[ar[i]+typeof ar[i]] = ar[i];
  }
  ar = [];
  for(var i in sorter){
    ar.push(sorter[i]);
  }
  return ar;
}
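
To see why this keeps 1 and "1" apart, here is a quick sketch of what the property names look like. typedKey is just a helper name I made up for illustration; it is not part of the function above.

```javascript
// typedKey is a hypothetical helper: it builds the same property name
// the function above uses (string form of the value plus its type name).
function typedKey(value) {
  return value + typeof value;
}

var keys = [typedKey(1), typedKey('1'), typedKey(true), typedKey('true')];
// keys is now ['1number', '1string', 'trueboolean', 'truestring']
```

The number and the string end up under different property names, so neither overwrites the other.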

The next thing I can think of is to ensure that the array is really an array. We can test this by checking that it has a non-zero length property and is not a string. Anything that fails the test is returned unchanged:

function array_unique(ar){
  if(ar.length && typeof ar!=='string'){
    var sorter = {};
    for(var i=0,j=ar.length;i<j;i++){
      sorter[ar[i]+typeof ar[i]] = ar[i];
    }
    ar = [];
    for(var i in sorter){
      ar.push(sorter[i]);
    }
  }
  return ar;
}
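
It is worth poking at that test on its own to see what it lets through. This is only a sketch: looksLikeArray is a name I invented for the condition in the if statement, and the last input is a hand-made array-like object.

```javascript
// looksLikeArray is a hypothetical wrapper around the condition used above.
function looksLikeArray(ar) {
  return Boolean(ar.length) && typeof ar !== 'string';
}

var checks = {
  realArray: looksLikeArray([1, 2]),                       // true
  string: looksLikeArray('hello'),                         // false
  emptyArray: looksLikeArray([]),                          // false, a length of 0 is falsy
  arrayLike: looksLikeArray({length: 2, 0: 'a', 1: 'b'})   // true
};
```

So an empty array simply comes back untouched, and any object with a length property is treated like an array.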

However, two loops can be slow, and for…in tends to be a lot slower than a plain indexed loop. We can avoid the second loop altogether by pushing each value to an output array the first time we see it:

function array_unique(ar){
  if(ar.length && typeof ar!=='string'){
    var sorter = {};
    var out = [];
    for(var i=0,j=ar.length;i<j;i++){
      if(!sorter[ar[i]+typeof ar[i]]){
        out.push(ar[i]);
        sorter[ar[i]+typeof ar[i]]=true;
      }
    }
  }
  return out || ar;
}
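
Here is the final version again with a few sample calls, as a sketch of how it behaves. The sample inputs are my own picks, nothing more.

```javascript
function array_unique(ar){
  if(ar.length && typeof ar!=='string'){
    var sorter = {};
    var out = [];
    for(var i=0,j=ar.length;i<j;i++){
      if(!sorter[ar[i]+typeof ar[i]]){
        out.push(ar[i]);
        sorter[ar[i]+typeof ar[i]]=true;
      }
    }
  }
  return out || ar;
}

var mixed = array_unique([3, 1, 3, '1', 1]); // [3, 1, '1']
var text = array_unique('hello');            // 'hello', strings pass through
var nans = array_unique([NaN, NaN]);         // one NaN, both share the key 'NaNnumber'
```

The values come back in the order in which they first appear, with their original types.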

Anything I have forgotten?

5 Responses to “Simulating array_unique in JavaScript”

  1. Michael Says:

    I might change your for construct and declare a variable that holds the currently indexed item’s value to avoid calling “typeof” twice:

    function array_unique(ar){
      if(ar.length && typeof ar!=='string'){
        var sorter = {};
        var out    = [];
        for (var i = 0; i < ar.length; i++) {
          var item = ar[i] + typeof ar[i];
          if (!sorter[item]) {
            out.push(ar[i]);
            sorter[item] = true;
          }
        }
      }
      return out || ar;
    };
  2. Julien Royer Says:

    Hi,

    This function doesn’t work for objects as it uses toString to compare them:
    alert({a: 1}.toString() == {}.toString()); // true
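
Julien's point can be checked against the key the function builds. A minimal sketch, with variable names chosen only for this example:

```javascript
var first = {a: 1};
var second = {};

// Plain objects all stringify to "[object Object]", so the keys are identical.
var firstKey = first + typeof first;    // '[object Object]object'
var secondKey = second + typeof second; // '[object Object]object'
var collide = firstKey === secondKey;   // true
```

Since both objects share one key, only the first of them survives in the result.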

  3. steve Says:

    I would change your array type test to:

    function array_unique(ar){
      if(ar instanceof Array){
        var sorter = {};
        var out = [];
        for(var i=0,j=ar.length;i<j;i++){
          if(!sorter[ar[i]+typeof ar[i]]){
            out.push(ar[i]);
            sorter[ar[i]+typeof ar[i]]=true;
          }
        }
      }
      return out || ar;
    }

    because if you pass in, say, a NodeList that you get from, say… document.getElementsByTagName('div');

    It will have a length property, and it will be an integer… but it doesn’t quite work like an array… as you access the elements by .item(idx);

    Using instanceof will ensure that you really do have an Array, not an object of any other kind that happens to have a length property.

    cheers
    steve
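
The difference between the two tests can be sketched without a browser by using a hand-made array-like object as a stand-in for a NodeList (an assumption made for this example). Array.isArray is not mentioned in the comment; it is a later standard alternative (ES5), shown here for comparison because instanceof can report false for arrays created in another frame.

```javascript
var realArray = ['a', 'b'];
var arrayLike = {length: 2, 0: 'a', 1: 'b'}; // stand-in for a NodeList

var results = {
  realInstanceof: realArray instanceof Array, // true
  likeInstanceof: arrayLike instanceof Array, // false
  realIsArray: Array.isArray(realArray),      // true
  likeIsArray: Array.isArray(arrayLike)       // false
};
```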

  4. Matt Snider Says:

    I think this is a great little function and have changed my unique method to be similar. I agree with steve’s comment. Length isn’t enough to determine that it is a valid array. I am always forgetting to treat nodelists separately from arrays and have created an isArray() function to do this type detection.

  5. David Golightly Says:

    You’ve forgotten the non-typecasting equality operator: ===. Using this, you don’t need to use typeof to keep track of types. Array.indexOf doesn’t convert type either. Thus, you can shorten to:


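    Array.prototype.removeDuplicates = function () {
      // filter out duplicates
      var item, seen = [];

      // inside a prototype method the list being processed is `this`
      for (var i = 0, len = this.length; i < len; i++) {
        item = this[i];
        // indexOf returns -1 when the item has not been seen yet
        if (!(seen.indexOf(item) + 1)) {
          seen[seen.length] = item;
        }
      }

      return seen;
    };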

    Array.prototype.indexOf is not provided natively in JS 1.5 (hence not in IE), but you can get an implementation via the MDC (google Mozilla Array indexOf).

    To call this on a nodelist, use


    var nodupes = [].removeDuplicates.call(nodelist);
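
A runnable sketch of this approach follows, assuming an engine that provides Array.prototype.indexOf. The sample data and the array-like stand-in for a nodelist are made up for the example.

```javascript
// Sketch of the method from the comment, with `this` as the list being processed.
Array.prototype.removeDuplicates = function () {
  var item, seen = [];
  for (var i = 0, len = this.length; i < len; i++) {
    item = this[i];
    // indexOf compares with strict equality, so 1 and '1' stay distinct
    if (seen.indexOf(item) === -1) {
      seen.push(item);
    }
  }
  return seen;
};

var obj = {a: 1};
var fromArray = [1, '1', 1, obj, {}, obj].removeDuplicates(); // [1, '1', obj, {}]

var arrayLike = {length: 3, 0: 'x', 1: 'y', 2: 'x'}; // stand-in for a nodelist
var fromArrayLike = Array.prototype.removeDuplicates.call(arrayLike); // ['x', 'y']
```

Because objects are compared by identity here, two different objects are both kept and only repeated references are dropped. The trade-off is that indexOf scans the seen list for every element, and it never matches NaN, so repeated NaN values are not removed.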

    PS. You might consider changing the “HTML Allowed” section of your comments to be consistent with your “no-links” policy.


Wait till I come! is the blog of Christian Heilmann, a developer evangelist living and working in London, England.