Elasticsearch on Hadoop - Should ES nodes be colocated with Hadoop DataNodes?
From the elasticsearch-hadoop documentation:
"Whenever possible, elasticsearch-hadoop shares the Elasticsearch cluster information with Hadoop to facilitate data co-location. In practice, this means whenever data is read from Elasticsearch, the source nodes' IPs are passed on to Hadoop to optimize task execution. If co-location is desired/possible, hosting the Elasticsearch and Hadoop clusters within the same rack will provide significant network savings."
Does this mean that ideally an Elasticsearch node should be colocated with every DataNode on the Hadoop cluster, or am I misreading this?
You may find this joint presentation by Elasticsearch and Hortonworks useful in answering the question:
http://www.slideshare.net/hortonworks/hortonworks-elastic-searchfinal
You'll note that slides 33 and 34 show multiple architectures: one with the ES nodes co-located on the Hadoop nodes, and one with separate clusters. The first alternative gives the best co-location of data, which is important for managing Hadoop performance. The second approach allows you to tune each cluster separately and scale them independently.
I don't know that you can say one approach is better than the other, as there are tradeoffs. Running on the same node minimizes data access latency at the expense of a loss of isolation and of the ability to tune each cluster separately.
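To make the locality hint from the documentation quote concrete, here is a minimal Python sketch of the idea, not elasticsearch-hadoop's actual scheduling code. It assumes each shard is read by one task and that the scheduler knows the IP of the ES node holding the shard; the function names (assign_tasks, local_fraction) and the IP addresses are hypothetical and only for illustration.

```python
from collections import defaultdict


def assign_tasks(shard_hosts, worker_hosts):
    """Assign one read task per shard, preferring a worker on the shard's host.

    shard_hosts:  dict of shard id -> IP of the ES node holding that shard
                  (hypothetical stand-in for the "source node IPs" hint)
    worker_hosts: list of IPs of the Hadoop nodes that can run tasks
    Returns a dict of shard id -> (chosen worker IP, is_local).
    """
    workers = sorted(set(worker_hosts))
    load = defaultdict(int)  # number of tasks already placed on each worker
    assignment = {}
    for shard, host in sorted(shard_hosts.items()):
        if host in workers:
            # A task slot exists on the same machine as the shard: local read.
            chosen, is_local = host, True
        else:
            # No worker on that machine: fall back to the least-loaded worker,
            # which means the data has to cross the network.
            chosen = min(workers, key=lambda w: load[w])
            is_local = False
        load[chosen] += 1
        assignment[shard] = (chosen, is_local)
    return assignment


def local_fraction(assignment):
    """Fraction of tasks that read from an ES node on their own machine."""
    if not assignment:
        return 0.0
    local = sum(1 for _, is_local in assignment.values() if is_local)
    return local / len(assignment)


# Hypothetical layout: three shards, one per ES node.
shards = {0: "10.0.0.1", 1: "10.0.0.2", 2: "10.0.0.3"}

# Architecture 1: ES nodes co-located with the Hadoop nodes.
colocated = assign_tasks(shards, ["10.0.0.1", "10.0.0.2", "10.0.0.3"])

# Architecture 2: separate clusters, so no worker shares a host with a shard.
separate = assign_tasks(shards, ["10.0.1.1", "10.0.1.2"])
```

With co-located nodes every task in this toy model reads locally, while with separate clusters every read is remote. That is the latency side of the tradeoff; the sketch says nothing about the isolation and independent-tuning benefits of separate clusters.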
hadoop elasticsearch