algorithm - Java Benchmarking: Ensuring that objects are not reused after coming out of scope -
algorithm - Java Benchmarking: Ensuring that objects are not reused after coming out of scope -
i benchmarking several algorithms variation of k-th closest neighbour problem. i'm seeing troubling results when repeatedly running algorithms sort data.
updateit seems info not beingness cached, people have said. ran same n = 50000 test, generated random points in both arraylists every time ran 3 tests. speedup wary of still occurred. seem jitc feature.
output:
stopping :: brute force. time (ms): 21716 stopping :: sorted 'smart' brute force. time (ms): 24014 stopping :: quickselect. time (ms): 17655 stopping :: brute force. time (ms): 22034 stopping :: sorted 'smart' brute force. time (ms): 23975 stopping :: quickselect. time (ms): 18438 stopping :: brute force. time (ms): 20097 stopping :: sorted 'smart' brute force. time (ms): 15677 stopping :: quickselect. time (ms): 14399 stopping :: brute force. time (ms): 20457 stopping :: sorted 'smart' brute force. time (ms): 15141 stopping :: quickselect. time (ms): 14146 stopping :: brute force. time (ms): 20143 stopping :: sorted 'smart' brute force. time (ms): 15834 stopping :: quickselect. time (ms): 14084 stopping :: brute force. time (ms): 20173 stopping :: sorted 'smart' brute force. time (ms): 15170 stopping :: quickselect. time (ms): 13745 stopping :: brute force. time (ms): 19625 stopping :: sorted 'smart' brute force. time (ms): 14924 stopping :: quickselect. time (ms): 15972 stopping :: brute force. time (ms): 19388 stopping :: sorted 'smart' brute force. time (ms): 15209 stopping :: quickselect. time (ms): 13639 stopping :: brute force. time (ms): 19420 stopping :: sorted 'smart' brute force. time (ms): 14779 stopping :: quickselect. time (ms): 13798 stopping :: brute force. time (ms): 19390 stopping :: sorted 'smart' brute force. time (ms): 15078 stopping :: quickselect. time (ms): 13548
original query background/details:
i have 2 unsorted n-sized arraylists of points- , b. every point p in a, check see how many points in b within distance d p.
algorithms:
brute force for every point p1 in a, check every point p2 in b. sorted smart brute force create temporary arraylist c, , deep re-create b it. sort c x-value of points. scan b until reaching first p2.x exceeds (p1.x + d) quickselect create temporary arraylist c, , deep re-create b it. sort c x-value of points. for every point p1 in a, binary search through c, until point p2 found x value within p1.x ± d. scan left , right found point. stop scanning when reaching point lies outside of d range.sample:
public void quickselect(arraylist<point> field1, arraylist<point> input2, int distance){ arraylist<point> field2 = new arraylist<point>(); deepcopy(field2, input2); starttimer("starting :: quickselect.\n"); collections.sort(field2, new comparepoints()); for(int = 0; < field1.size(); ++i){ //find pivot ... //scan left until out of bounds. ... //scan right until out of bounds. ... } endtimer("stopping :: quickselect. "); field2.clear(); }
where deepcopy is:
private void deepcopy(arraylist<point> field1, arraylist<point> input1){ for(int = 0; < input1.size(); ++i){ field1.add(input1.get(i).getlocation()); } }
testing method:
public static void main(string[] args){ int points = 50000; arraylist<point> field1 = generator.makegraph(points); arraylist<point> field2 = generator.makegraph(points); graphtester tester = new graphtester(field1); int maxdistance = 300; for(int = 0; < 10; ++i){ tester.bruteforce(field1, field2, maxdistance); } for(int = 0; < 10; ++i){ tester.sortedbruteforce(field1, field2, maxdistance); } for(int = 0; < 10; ++i){ tester.quickselect(field1, field2, maxdistance); } }
the results:
stopping :: brute force. time (ms): 22851 stopping :: brute force. time (ms): 22482 stopping :: brute force. time (ms): 20690 stopping :: brute force. time (ms): 21073 stopping :: brute force. time (ms): 20860 stopping :: brute force. time (ms): 21311 stopping :: brute force. time (ms): 20847 stopping :: brute force. time (ms): 21000 stopping :: brute force. time (ms): 20503 stopping :: brute force. time (ms): 21342 stopping :: sorted 'smart' brute force. time (ms): 23083 stopping :: sorted 'smart' brute force. time (ms): 22616 stopping :: sorted 'smart' brute force. time (ms): 15881 stopping :: sorted 'smart' brute force. time (ms): 15323 stopping :: sorted 'smart' brute force. time (ms): 15930 stopping :: sorted 'smart' brute force. time (ms): 15360 stopping :: sorted 'smart' brute force. time (ms): 16072 stopping :: sorted 'smart' brute force. time (ms): 15601 stopping :: sorted 'smart' brute force. time (ms): 16952 stopping :: sorted 'smart' brute force. time (ms): 15950 stopping :: quickselect. time (ms): 18202 stopping :: quickselect. time (ms): 18109 stopping :: quickselect. time (ms): 14685 stopping :: quickselect. time (ms): 14401 stopping :: quickselect. time (ms): 14052 stopping :: quickselect. time (ms): 14782 stopping :: quickselect. time (ms): 14175 stopping :: quickselect. time (ms): 14187 stopping :: quickselect. time (ms): 13870 stopping :: quickselect. time (ms): 14601
as can see, sorted smart brute forcefulness , quickselect algorithms start off high computation time, level off much improve number tests repeated.
though i'm clearing temporary arraylist, sense sorted arraylist somehow beingness retained jvm , reused/remapped new arraylist pointers.
my question two-fold, then:
does seem case of caching? can guarantee info not cached? if randomize points every time generate graphs, i'm betting there more control, couldn't sure algorithms matched.
i can't see why you're deep-cloning data.
does seem case of caching?
no, it's jitc getting improve on you're running. each non-trivial java benchmark needs warmup, during statistics gathered , relevant part of code (the hotspot
) gets compiled , optimized heavily.
a more complex algorithm need more warmup.
can guarantee info not cached?
i can't see caching there. it? jitc can't understand you're doing. if knew you're computing same thing time, times 1 zero.
if randomize points every time generate graphs, i'm betting there more control, couldn't sure algorithms matched.
go if sense this. guess can rather trust benchmark without doing it.
update after commentwarmup: mutual situation server running many hours (at least) doing same operations time. such server measuring performance during first minutes create little sense. situation needed warmup measuring or optimizing fast operations e.g. string.hashcode, typically repeated many thousands times. situation single long-taking computation seems exception.
still, wrote warmup plausible explanation observations. anyway, quickselect
winner in runs, selection clear. fact improved later nice bonus winner. next places less clear, 22851 vs 23083 no real win, measurement error bigger (and might cost deep-copy). re-run benchmark several times (with fresh jvm) if want know more (1 iteration should suffice).
java algorithm caching arraylist garbage-collection
Comments
Post a Comment