Elasticsearch on Hadoop - Should ES nodes be Colocated with Hadoop DataNodes?

From the elasticsearch-hadoop documentation:

Whenever possible, elasticsearch-hadoop shares the Elasticsearch cluster information with Hadoop to facilitate data co-location. In practice, this means whenever data is read from Elasticsearch, the source nodes' IPs are passed on to Hadoop to optimize task execution. If co-location is desired/possible, hosting the Elasticsearch and Hadoop clusters within the same rack will provide significant network savings.

Does this mean that, ideally, an Elasticsearch node should be colocated with every DataNode on the Hadoop cluster, or am I misreading this?

You may find this joint presentation by Elasticsearch and Hortonworks useful in answering this question:

http://www.slideshare.net/hortonworks/hortonworks-elastic-searchfinal

You'll note that slides 33 and 34 show multiple architectures: one where the ES nodes are co-located on the Hadoop nodes, and one where the two are separate clusters. The first alternative gives you the best co-location of data, which is important for managing Hadoop performance. The second approach allows you to tune each cluster separately and scale them independently.

I don't know that one approach is better than the other, as there are tradeoffs. Running on the same node minimizes data access latency at the expense of a loss of isolation and the ability to tune each cluster separately.
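To make the tradeoff concrete, here is a minimal sketch of what "passing the source nodes' IPs to Hadoop" can buy you. It is a toy model, not Hadoop's actual scheduler or anything from elasticsearch-hadoop: the function names (assign_splits, local_fraction) and the IP addresses are made up for illustration. It assumes one input split per Elasticsearch shard, with each split carrying the IP of the node that holds the shard as a location hint, and it places the task on that host when a Hadoop worker runs there.

```python
from collections import defaultdict


def assign_splits(shard_hosts, worker_hosts):
    """Assign each shard (treated as one input split) to a worker.

    shard_hosts:  dict of shard id -> IP of the Elasticsearch node holding it
                  (the "location hint" for the split).
    worker_hosts: list of IPs where Hadoop tasks can run.
    Returns a dict of shard id -> (worker IP, is_local).
    """
    if not worker_hosts:
        raise ValueError("at least one worker host is required")

    workers = set(worker_hosts)
    load = defaultdict(int)  # number of tasks already placed on each worker
    assignment = {}

    for shard, host in sorted(shard_hosts.items()):
        if host in workers:
            # A worker runs on the same machine as the shard: read locally.
            chosen, local = host, True
        else:
            # No co-located worker: fall back to the least-loaded worker
            # (ties broken by IP so the result is deterministic).
            chosen = min(worker_hosts, key=lambda w: (load[w], w))
            local = False
        load[chosen] += 1
        assignment[shard] = (chosen, local)

    return assignment


def local_fraction(assignment):
    """Fraction of splits that were placed on the host holding their shard."""
    if not assignment:
        return 0.0
    local = sum(1 for _, is_local in assignment.values() if is_local)
    return local / len(assignment)


if __name__ == "__main__":
    shards = {0: "10.0.0.1", 1: "10.0.0.2", 2: "10.0.0.3"}

    # Architecture 1: an ES node on every Hadoop worker.
    colocated = assign_splits(shards, ["10.0.0.1", "10.0.0.2", "10.0.0.3"])
    # Architecture 2: separate clusters, no shared hosts.
    separate = assign_splits(shards, ["10.0.1.1", "10.0.1.2"])

    print("co-located:", local_fraction(colocated))
    print("separate:  ", local_fraction(separate))
```

In the co-located layout every read can stay on the local machine, while with separate clusters every read crosses the network, which is why the documentation suggests at least keeping the two clusters within the same rack. The sketch says nothing about the other side of the tradeoff: contention for CPU, memory and disk when both systems share a node.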

