Re: Journal node recovery after a failure

Re: Journal node recovery after a failure

Clay Baenziger (BLOOMBERG/ 731 LEX)
A coworker filed HDFS-10665 about this seemingly manual process a number of years ago. We automated it in our Chef code back then and have not found a better process since.

-Clay

From: [hidden email] At: 07/27/18 19:43:07
To: [hidden email]
Subject: Journal node recovery after a failure

Hi all,

I have an HA cluster set up with three journal nodes. Everything works fine until a journal node fails and I try to replace it with a new one.

Currently, I manually copy the ‘edits’ directory from one of the live journal nodes to the new node and then start the new journal node.

Is there a way to automate this? For example, is there a command to bootstrap a journal node?

Any help is greatly appreciated.

Thanks,
Suman.
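
The manual step described above (copying the ‘edits’ directory from a live journal node to the replacement before starting it) could be scripted roughly as follows. This is only a minimal sketch, not the Chef automation mentioned in the reply. It assumes the live node's directory is already reachable as a local path (for example through a mount or a prior remote copy), and the function name copy_journal_dir is hypothetical. Skipping in_use.lock is also an assumption: the idea is that a lock file held by the running source process should not be carried over to the new node.

```python
import shutil
from pathlib import Path

# Assumption: lock files belong to the running source JournalNode process
# and should not be copied to the replacement node.
SKIP_NAMES = {"in_use.lock"}


def copy_journal_dir(source_dir, target_dir):
    """Copy a journal directory tree into an empty target, skipping lock files.

    Returns the list of copied files as POSIX-style paths relative to source_dir.
    """
    source = Path(source_dir)
    target = Path(target_dir)

    if not source.is_dir():
        raise FileNotFoundError(f"source journal directory not found: {source}")
    # Refuse to overwrite a target that already holds data.
    if target.exists() and any(target.iterdir()):
        raise FileExistsError(f"target is not empty: {target}")

    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(source.rglob("*")):
        if path.name in SKIP_NAMES:
            continue
        rel = path.relative_to(source)
        dest = target / rel
        if path.is_dir():
            dest.mkdir(parents=True, exist_ok=True)
        else:
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, dest)  # copy2 also preserves timestamps
            copied.append(rel.as_posix())
    return copied
```

The new journal node would be started only after the copy finishes, as in the manual procedure. Refusing a non-empty target is a deliberate choice here so that a rerun cannot silently overwrite a node that already has data.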

Re: Journal node recovery after a failure

Suman Somasundar
Thanks Clay. I'll look to achieve something similar.

Suman.



