package com.test; import java.util.List; import java.util.ArrayList; class MyCollection{ public static void populateList(List<Long> l, int multiple){ for(int i=0; i<50000; i++){ Long value = Long.valueOf(i*multiple); l.add(value); } } public static void main(String[] args) { List<Long> listOf2 = new ArrayList<Long>(); populateList(listOf2, 2); List<Long> listOf3 = new ArrayList<Long>(); populateList(listOf3, 3); long startTimestamp, endTimestamp; List<Long> commonA = null, commonB = null, commonC = null; //First Attempt - Runs in 60 seconds startTimestamp = System.currentTimeMillis(); commonA = new ArrayList<Long>(listOf2); commonA.retainAll(listOf3); endTimestamp = System.currentTimeMillis(); System.out.println("Execution Time : " + (endTimestamp-startTimestamp)/1000); //Second Attempt - Runs in 73 seconds - //There are fewer elements in commonB shouldn't this run faster? startTimestamp = System.currentTimeMillis(); commonB = new ArrayList<Long>(listOf3); commonB.retainAll(listOf2); endTimestamp = System.currentTimeMillis(); System.out.println("Execution Time : " + (endTimestamp-startTimestamp)/1000); System.out.println("Are Equal : " + (commonA.equals(commonB))); //Third Attempt - Runs in 2 seconds startTimestamp = System.currentTimeMillis(); /* This part has been intentionally left blank. * That is because I need to have a question for the puzzle, * and I felt like leaving out the Third part would be the right * thing do to. * I've done most of the hard work so this should be easy to fill. * Yes this must run 10 times faster than the solution I have provided * and should be done just as many lines (3 to 5 lines of code should be fine!). * Well I never said anything about life being fair, did I? * */ endTimestamp = System.currentTimeMillis(); System.out.println("Execution Time : " + (endTimestamp-startTimestamp)/1000); System.out.println("Are Equal : " + (commonA.equals(commonC))); } }
Just keep in mind that both the "Are Equal" print statements print true and that you don't use the reference commonA or commonB when writing the code for the third part. You are free to make any other changes (in the place where the comments are there).
Update - The commonA, commonB were incorrectly pointing to the objects listOf2/listof3. That has been corrected to create new Objects -- it now reads commonA = new ArrayList
Thanks to Colin Hebert for pointing it out.
You forgot to reset your list between the first and the second attempt. That's why your test isn't correct.
ReplyDeleteAt line 28 and 35 you pass a reference so you modify the original List at each time.
To make it work faster on the third attempt you just have to write :
commonC = listOf3;
@Colin - Eeeks! Your right. Not my intention at all. I've corrected the puzzle and I'm hoping I've got it right this time.
ReplyDeleteFor this particular case we need to find the numbers that divide by 2 and 3 smaller then 50000 * 2.
ReplyDeletecommonC = new ArrayList();
for ( int i = 0; i < ( 50000 * 2 ) / 3 ; i+=2 )
{
Long value = Long.valueOf(i*3);
commonC.add( value );
}
Convert lists to arrays:
ReplyDeleteLong[] array2 = listOf2.toArray(new Long[listOf2.size()]);
Long[] array3 = listOf3.toArray(new Long[listOf3.size()]);
for (Long item : array2) { if (Arrays.binarySearch(array3, item) >= 0) commonC.add(item); }
The data set you are using is a Set and retainAll is a set operation. If you use a set it will run 2000x faster or more. By tuning the JVM you can get each operation down to 12 milli-seconds.
ReplyDeleteLong[] array = new Long[16667];
ReplyDeletefor(int i=0; i<array.length; i++) {
array[i] = Long.valueOf(i*6);
}
commonC = Arrays.asList(array);
final Set setC = new HashSet(listOf3);
ReplyDeletecommonC = new LinkedList();
for (final Long value : listOf2){
if (setC.contains(value)){
commonC.add(value);
}
}
Execution Time : 4297
Execution Time : 4437
Are Equal : true
Execution Time : 36
Are Equal : true
Since both lists are sorted, following solution works:
ReplyDelete...
commonC = new ArrayList();
int i = 0;
for (Long elem : listOf2) {
long res = elem.longValue() - listOf3.get(i).longValue();
if (res >= 0) {
if (res == 0) {
commonC.add(elem);
}
i++;
}
}
...
But in my environment is not in 10 times, but almost 1000 times faster:
Execution Time : 4234
Execution Time : 4149
Are Equal : true #16667
Execution Time : 11
Are Equal : true #16667
(I'm displaying milliseconds instead of seconds and after results of comparison total number of elements in collection is provided).
With 100000 elements results even better:
Execution Time : 16288
Execution Time : 16608
Are Equal : true #33334
Execution Time : 18
Are Equal : true #33334
First of all - great blog!
ReplyDeleteI have a guess, but unfortunately based on a Set, not a List. To this particular case it doesn't matter, but in general case it makes my guess unacceptable.
Here is the code:
commonC = new LinkedList(listOf2);
Set b = new HashSet(listOf3);
commonC.retainAll(b);
And the measurements:
[A] Execution Time : 146
[B] Execution Time : 153
[B] Are Equal : true
[C] Execution Time : 0 (505 milis)
[C] Are Equal : true
//Well, I got it to 0 (about 10M nano). My //original "number 1" was about 16 seconds though.
ReplyDelete//I use a version of "merge-sort" tailored for this case
//The general case would be a bit more complex - but the only real requirement is that the lists are sorted.
commonC = new ArrayList(listOf2.size());
int j=0;
for(int i=0; i current3) {
j++;
current3 = listOf3.get(j);
}
if(current2.longValue() == current3.longValue()) {
commonC.add(current2);
}
}
Oh, and I forgot. The factor is x2000, not x10.
ReplyDeleteAnother solution which does not depend on order of list elements:
ReplyDeleteSet commonE = new HashSet(listOf2);
commonC = new ArrayList();
for (Long value : listOf3) {
if (commonE.contains(value)) {
commonC.add(value);
}
}
Results of execution on my PC:
For 50000 elements:
Execution Time : 4238
Execution Time : 4158
Are Equal : true #16667
Execution Time : 27
Are Equal : true #16667
For 100000 elements:
Execution Time : 16288
Execution Time : 16609
Are Equal : true #33334
Execution Time : 39
Are Equal : true #33334
Attempts 1 and 2 are the same.
ReplyDeletenew ArrayList(listOf2).retainAll(listOf3) is the same as new ArrayList(listOf3).retainAll(listOf2).
Btw both attempts run in slightly over 4 seconds on my simple laptop. Why does it take 60 sec when you run it?
@Wouter - Umm I got a 5 year old P4 with 1G ram and stuff runs kinda really slow on it. If someone find the code running fast enough on their machine, I suggest increasing the LOOP value from 50,000 to 150,000 or higher. The performance contrast should be obvious.
ReplyDelete@All - There are lot of other comments (with answers). I'll be approving everything this weekend. Comments are hidden to prevent 'spoilers' for anyone reading the puzzle!
Hi there! My try is:
ReplyDeletestartTimestamp = System.currentTimeMillis();
HashSet hashSet = new HashSet(listOf2);
commonC = new ArrayList();
for (Long val : listOf3) {
if (!hashSet.add(val)) { commonC.add(val); }
}
Collections.sort(listOf3);
ReplyDeletecommonC = new ArrayList();
int result = 0;
for (Long long1 : listOf2) {
result = Collections.binarySearch(listOf3, long1);
if (result >= 0) {
commonC.add(long1);
}
}
commonC = new ArrayList();
ReplyDeletefor (Long _l : listOf2)
{
if ((_l % 3) == 0)
commonC.add(_l);
}