We have experience with 5,000-6,000 node clusters. Although they ran/run fine, any heavy-hitter activity such as decommissioning needed to be carefully planned. In terms of files and blocks, we have multiple clusters running stably with over 500M files and blocks, some at over 800M with the max heap at 256GB. It can probably go higher, but we haven't done performance testing and optimization beyond 256GB yet. All our clusters are un-federated. Funny how the feature was developed at Yahoo! and ended up not being used here. :) We have a cluster with about 180PB of provisioned space, and many clusters use over 100PB in their steady state. We don't run datanodes too dense, so we can't tell what the per-datanode limit is.
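If you want to pull the same numbers on your own cluster, a minimal sketch along these lines should work (the NameNode web address/port is a placeholder, not something from our setup; FilesTotal and BlocksTotal come from the NameNode's JMX servlet):

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FsStatus;

public class ClusterScaleCheck {
    public static void main(String[] args) throws Exception {
        // Picks up core-site.xml / hdfs-site.xml from the classpath.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Provisioned vs. used space for the whole cluster.
        FsStatus status = fs.getStatus();
        double pb = Math.pow(1024, 5);
        System.out.printf("Capacity: %.2f PB, Used: %.2f PB%n",
                status.getCapacity() / pb, status.getUsed() / pb);

        // Files and blocks counts are exposed on the NameNode JMX servlet
        // under the FSNamesystem bean (adjust host/port to your cluster).
        URL jmx = new URL("http://namenode.example.com:9870/jmx"
                + "?qry=Hadoop:service=NameNode,name=FSNamesystem");
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(jmx.openStream()))) {
            in.lines()
              .filter(l -> l.contains("FilesTotal") || l.contains("BlocksTotal"))
              .forEach(System.out::println);
        }
    }
}
```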
Thanks and 73
On Thu, Jun 13, 2019 at 1:57 PM Wei-Chiu Chuang <[hidden email]> wrote:
We run Hadoop installations similar to what Kihwal describes. Thanks for sharing, Kihwal.
Our clusters are also not Federated.
So far the growth rate is the same as we reported in our talk last year (slide #5), 2x per year:
We track three main metrics: total space used, number of objects (files + blocks), and number of tasks per day.
I have found that the number of nodes is mostly irrelevant as a measure of cluster size, since nodes are very diverse in configuration and are constantly being upgraded: you may have the same number of nodes but many more drives, cores, and RAM on each of them, which is a bigger cluster.
I do not see a 200 GB heap size as a limit. We ran Dynamometer experiments with a bigger heap fitting 1 billion files and blocks. It should be doable, but we may hit other scalability limits when we get to that many objects. See Erik's talk discussing the experiments and solutions:
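As a rough sanity check on where 1 billion objects lands, the commonly cited rule of thumb of roughly 150-200 bytes of NameNode heap per namespace object (a figure not taken from this thread) already puts you in the same ballpark as the heaps discussed here:

```java
// Back-of-envelope heap estimate, assuming ~200 bytes of NameNode heap per
// namespace object (file or block). This is a rule of thumb, not a measured
// figure from this discussion.
public class NameNodeHeapEstimate {
    public static void main(String[] args) {
        long objects = 1_000_000_000L;   // files + blocks
        long bytesPerObject = 200L;      // assumed upper end of the rule of thumb
        double heapGb = (double) objects * bytesPerObject / (1L << 30);
        System.out.printf("~%.0f GB of heap for %,d objects%n", heapGb, objects);
        // Prints roughly 186 GB, close to the 200 GB heaps mentioned above.
    }
}
```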
Well, Hadoop scalability has always been a moving target for me. I don't think you can set it in stone once and for all.
On Sat, Jun 15, 2019 at 5:20 PM Wei-Chiu Chuang <[hidden email]> wrote:
Thank you, Kihwal, for the insightful comments!