Simulating array_unique in JavaScript
One of the beautiful things about PHP is its wealth of array functions. JavaScript in comparison seems ridiculously inadequate, and you find yourself having to write your own methods or patch the existing ones. One function I especially cherish is array_unique(), which returns a new array that has all the duplicates filtered out. This is easy to write in JavaScript. All you need to do is:
- create a new object
- loop through the array and use the array values as new properties of the object (that way the property simply gets re-set and not added as a new one to the object when it comes up again)
- loop through the properties of the object and add each value to the results array
Technically this should do it:
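function array_unique(ar){
    var sorter = {};
    // a repeated value simply re-sets the property it created earlier
    for(var i=0,j=ar.length;i<j;i++){
        sorter[ar[i]] = ar[i];
    }
    ar = [];
    // collect the property names (these are always strings)
    for(var i in sorter){
        ar.push(i);
    }
    return ar;
}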
Now array_unique([1,2,3,1,1]) returns ["1","2","3"], which is almost what we want: the duplicates are gone, but the values come back as strings because property names are always strings. There is a second snag, too. What if the array contains elements that are almost the same but of a different type? When you run array_unique([1,2,3,"1",1]) you still only get three elements, when what you'd really need is [1,2,3,"1"]. The solution to both is to store the value together with its type in the property name and to push the stored values, not the property names, to the results array:
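function array_unique(ar){
    var sorter = {};
    for(var i=0,j=ar.length;i<j;i++){
        // value plus type: 1 becomes "1number", "1" becomes "1string"
        sorter[ar[i]+typeof ar[i]] = ar[i];
    }
    ar = [];
    for(var i in sorter){
        // push the stored value so its original type survives
        ar.push(sorter[i]);
    }
    return ar;
}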
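To see why the combined property name keeps 1 and "1" apart, it can help to look at the names themselves. This is only a minimal sketch: the helper makeKey is a hypothetical name introduced for illustration and is not part of the function above.

```javascript
// Hypothetical helper (not in the original function): builds the same
// property name that array_unique uses for each element.
function makeKey(value) {
  return value + typeof value;
}

var samples = [1, "1", true, "true"];
var keys = [];
for (var i = 0; i < samples.length; i++) {
  keys.push(makeKey(samples[i]));
}
// keys is now ["1number", "1string", "trueboolean", "truestring"]
```

Every element gets a different name here, so none of them overwrites another in the sorter object.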
The next thing I can think of is to ensure that the array is really an array. We can test this by checking if it has a length property and is not a string.
function array_unique(ar){
    // only continue for things that have a length and are not strings
    if(ar.length && typeof ar!=='string'){
        var sorter = {};
        for(var i=0,j=ar.length;i<j;i++){
            sorter[ar[i]+typeof ar[i]] = ar[i];
        }
        ar = [];
        for(var i in sorter){
            ar.push(sorter[i]);
        }
    }
    return ar;
}
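However, two loops do more work than one, and for...in is a slow construct that also visits any enumerable properties inherited through the prototype chain. We can avoid the second loop by pushing each value to an output array the first time we see it: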
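function array_unique(ar){
    if(ar.length && typeof ar!=='string'){
        var sorter = {};
        var out = [];
        for(var i=0,j=ar.length;i<j;i++){
            // only values we have not seen yet go into the output
            if(!sorter[ar[i]+typeof ar[i]]){
                out.push(ar[i]);
                sorter[ar[i]+typeof ar[i]]=true;
            }
        }
    }
    // out is undefined when the test above failed, so return the input instead
    return out || ar;
}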
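A few sample calls show how this final version behaves. This is a sketch, not a full test suite; the function is repeated so the snippet runs on its own, and the variable names mixed, text and empty are made up for the example.

```javascript
// Copy of the final version above, repeated so this sketch is self-contained.
function array_unique(ar) {
  if (ar.length && typeof ar !== 'string') {
    var sorter = {};
    var out = [];
    for (var i = 0, j = ar.length; i < j; i++) {
      if (!sorter[ar[i] + typeof ar[i]]) {
        out.push(ar[i]);
        sorter[ar[i] + typeof ar[i]] = true;
      }
    }
  }
  return out || ar;
}

var mixed = array_unique([3, 1, "1", 3, 1, 2]); // [3, 1, "1", 2]
var text = array_unique("hello");               // "hello", returned untouched
var empty = array_unique([]);                   // [], length is 0 so the input comes back
```

Because each value is pushed the first time it is seen, the result keeps the order of first occurrence.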
Anything I have forgotten?


August 8th, 2007 at 3:37 pm
I might change your for construct and declare a variable that holds the currently indexed item’s value to avoid calling “typeof” twice:
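One way to read that suggestion, as a minimal sketch based on the final version in the post (the variable names item and key are assumptions):

```javascript
function array_unique(ar) {
  if (ar.length && typeof ar !== 'string') {
    var sorter = {};
    var out = [];
    for (var i = 0, j = ar.length; i < j; i++) {
      var item = ar[i];             // current element, looked up once
      var key = item + typeof item; // typeof is evaluated once per element
      if (!sorter[key]) {
        out.push(item);
        sorter[key] = true;
      }
    }
  }
  return out || ar;
}
```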
August 8th, 2007 at 3:38 pm
Hi,
This function doesn’t work for objects as it uses toString to compare them:
alert({a: 1}.toString() == {}.toString()); // true
August 8th, 2007 at 3:47 pm
I would change your array type test to:
because if you pass in, say, a NodeList that you get from document.getElementsByTagName('div'),
it will have a length property, and it will be an integer, but it doesn't quite work like an array: it has none of the array methods such as push, and it exposes its elements through .item(idx).
Using instanceof will ensure that you really do have an Array, not an object of some other kind that happens to have a length property.
cheers
steve
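The instanceof test steve describes could be sketched like this. It is an illustration built on the final version from the post, not steve's exact code, and the names arrayLike, untouched and cleaned are made up for the example.

```javascript
function array_unique(ar) {
  // Only genuine arrays are processed; everything else is returned untouched.
  if (!(ar instanceof Array)) {
    return ar;
  }
  var sorter = {};
  var out = [];
  for (var i = 0, j = ar.length; i < j; i++) {
    var key = ar[i] + typeof ar[i];
    if (!sorter[key]) {
      out.push(ar[i]);
      sorter[key] = true;
    }
  }
  return out;
}

// Stand-in for a NodeList: it has a length but is not an Array.
var arrayLike = { length: 2, 0: "div", 1: "div" };
var untouched = array_unique(arrayLike);         // the same object comes back
var cleaned = array_unique(["div", "div", "p"]); // ["div", "p"]
```

One caveat: instanceof Array is false for arrays created in another frame or window, because each frame has its own Array constructor. Where it is available, Array.isArray(ar) avoids that problem.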
August 8th, 2007 at 11:14 pm
I think this is a great little function and have changed my unique method to be similar. I agree with steve's comment: length isn't enough to determine that something is a valid array. I am always forgetting to treat NodeLists separately from arrays and have created an isArray() function to do this type detection.
August 13th, 2007 at 10:36 pm
You've forgotten the non-typecasting equality operator: ===. Using this, you don't need typeof to keep track of types. Array.prototype.indexOf doesn't convert types either. Thus, you can shorten it to:
Array.prototype.removeDuplicates = function () {
    // filter out duplicates; "this" is the array the method is called on
    var item, seen = [];
    for (var i = 0, len = this.length; i < len; i++) {
        item = this[i];
        // indexOf returns -1 when item is not yet in seen
        if (!(seen.indexOf(item) + 1)) {
            seen[seen.length] = item;
        }
    }
    return seen;
};
Array.prototype.indexOf is not provided natively in JS 1.5 (hence not in IE), but you can get an implementation via the MDC (google Mozilla Array indexOf).
To call this on a nodelist, use
var nodupes = Array.prototype.removeDuplicates.call(nodelist);
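As a rough, runnable illustration of that call, the sketch below defines the method with this in place of the array and applies it to a plain array-like object. The object arrayLike is a made-up stand-in for a NodeList so the snippet works outside a browser, and a native Array.prototype.indexOf is assumed.

```javascript
Array.prototype.removeDuplicates = function () {
  var item, seen = [];
  for (var i = 0, len = this.length; i < len; i++) {
    item = this[i];
    // indexOf compares without type conversion, so 1 and "1" stay distinct
    if (seen.indexOf(item) === -1) {
      seen.push(item);
    }
  }
  return seen;
};

// Stand-in for a NodeList: any object with a length and indexed entries.
var arrayLike = { length: 4, 0: "a", 1: "b", 2: "a", 3: 1 };
var nodupes = Array.prototype.removeDuplicates.call(arrayLike); // ["a", "b", 1]
var typed = [1, "1", 1].removeDuplicates();                     // [1, "1"]
```

Two trade-offs are worth keeping in mind. indexOf scans the seen array for every element, so this approach gets slower on long arrays than the object lookup in the post. On the other hand, it compares objects by identity, so two different objects are no longer treated as duplicates.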
PS. You might consider changing the “HTML Allowed” section of your comments to be consistent with your “no-links” policy.