Understanding the YARN Linux container manager

Daniel Peebles
Hi all,

Please tell me if this is the wrong place to ask!

I'm trying to understand the isolation properties of LinuxContainerExecutor in YARN. I've looked through the documentation and traced through the code down to the C helper tool, and as far as I've been able to determine, it only applies cgroups to the subprocess. Is that right? I was trying to figure out whether it also unshares any namespaces (filesystem, pid, network, etc.) from the parent process or otherwise isolates itself in other ways.
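One way to check this empirically on a NodeManager host is to compare the namespace links under /proc/<pid>/ns for the NodeManager and for a running container process. The following is a minimal Python sketch, assuming a Linux /proc filesystem; the function names and the proc_root parameter are illustrative and not part of YARN.

```python
import os

# Namespace types exposed as symlinks under /proc/<pid>/ns on Linux.
NAMESPACES = ("mnt", "pid", "net", "ipc", "uts", "user")


def read_namespaces(pid, proc_root="/proc"):
    """Return {namespace: identifier} for a process, e.g. {'pid': 'pid:[4026531836]'}."""
    ids = {}
    for ns in NAMESPACES:
        path = os.path.join(proc_root, str(pid), "ns", ns)
        try:
            ids[ns] = os.readlink(path)
        except OSError:
            # Link missing, or not readable without sufficient privileges.
            ids[ns] = None
    return ids


def shared_namespaces(a, b):
    """Return the sorted namespaces for which both processes report the same identifier."""
    return sorted(ns for ns in a if a[ns] is not None and a[ns] == b.get(ns))
```

Calling shared_namespaces(read_namespaces(nm_pid), read_namespaces(container_pid)) with the two real PIDs lists the namespaces the processes have in common. Reading another user's /proc/<pid>/ns entries generally requires elevated privileges, so unreadable entries are reported as None and left out of the comparison.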

If I'm correct and it doesn't do namespaces, does that mean I should use the DockerContainerExecutor instead to get namespace isolation? That one has a big scary security warning saying that using it might allow privilege escalation so I'm hesitant.

I've also been trying to understand whether any parts of the application run outside of the container during a normal Hadoop/YARN (or, e.g., Spark) execution. Is there a good place to read up on the container architecture in general?

Thanks,
Dan Peebles

Re: Understanding the YARN Linux container manager

Eric Badger-2
Hi Daniel,

As far as I know, there is no namespace isolation for the default runtime of LinuxContainerExecutor outside of cgroups. Someone please correct me if I am wrong. 
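Cgroup membership of a container process can be inspected by reading /proc/<pid>/cgroup, whose lines have the form hierarchy-ID:controller-list:path. The sketch below parses that format; the function name is illustrative, and the sample path /hadoop-yarn/container_example is an assumption (the actual hierarchy depends on the NodeManager's cgroup configuration).

```python
def parse_proc_cgroup(text):
    """Parse the contents of /proc/<pid>/cgroup into {controller: cgroup path}."""
    membership = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the first two colons only, since the path itself may contain ':'.
        _hierarchy_id, controllers, path = line.split(":", 2)
        for controller in controllers.split(","):
            # cgroup v2 entries have an empty controller list ("0::/path").
            membership[controller or "unified"] = path
    return membership


# Hypothetical sample for illustration; on a real node, read /proc/<pid>/cgroup instead.
sample = (
    "4:cpu,cpuacct:/hadoop-yarn/container_example\n"
    "3:memory:/hadoop-yarn/container_example\n"
    "1:name=systemd:/user.slice\n"
)
print(parse_proc_cgroup(sample))
```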

There has been a significant amount of work over the past few years related to DockerLinuxContainerRuntime, which is a specific runtime that LinuxContainerExecutor can use. DockerContainerExecutor has been deprecated and was removed in 3.0, I believe. There are certainly security implications with running the DockerLinuxContainerRuntime, but we have done a ton of work to close these security holes and harden the infrastructure around running Docker. I very strongly suggest that you don't run Docker on Hadoop unless you are running at least 3.0, preferably 3.1. There is some code in 2.9, but it has not been maintained closely and is very experimental. Much of the code was rewritten in 3.0 to provide better security. See YARN-3611 and YARN-8472 for a list of Docker improvements made.

If you do choose to run Docker on Hadoop, you need to make sure to read up on the implicit security ramifications of running Docker (NM talks to dockerd, which is a root daemon, privileged containers are dangerous, bind-mounts directly affect the host and can allow breaking out of the container, etc.). Personally, I believe that Docker can be run on Hadoop with adequate security that makes it an improvement over running YARN containers bare-metal. I'd be happy to talk in more detail about the things to consider and steps to take to harden your setup, if you'd like.
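As a rough illustration of the kind of settings to review, the sketch below scans the [docker] section of a container-executor.cfg for two of the risks mentioned above: privileged containers and writable bind-mounts of sensitive host paths. The key names docker.privileged-containers.enabled and docker.allowed.rw-mounts are taken from the Hadoop 3.x Docker documentation as best recalled here; treat them as assumptions and verify them against your version. The list of sensitive paths and the function name are hypothetical.

```python
import configparser

# Hypothetical list of host paths that should not be writable from a container.
SENSITIVE_PREFIXES = ("/etc", "/root", "/var/run/docker.sock")


def audit_docker_section(cfg_text):
    """Return a list of findings for the [docker] section of a container-executor.cfg."""
    parser = configparser.ConfigParser(interpolation=None)
    # The real file starts with keys outside any section, so prepend a dummy header.
    parser.read_string("[global]\n" + cfg_text)
    findings = []
    if not parser.has_section("docker"):
        return findings
    docker = parser["docker"]
    privileged = docker.get("docker.privileged-containers.enabled", "false")
    if privileged.strip().lower() == "true":
        findings.append("privileged containers are enabled")
    rw_mounts = docker.get("docker.allowed.rw-mounts", "")
    for mount in (m.strip() for m in rw_mounts.split(",")):
        if mount and (mount == "/" or mount.startswith(SENSITIVE_PREFIXES)):
            findings.append("writable bind-mount of " + mount)
    return findings
```

A check like this only flags obvious misconfigurations; it is not a substitute for reading the security considerations in the Docker-on-YARN documentation.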

Eric

In reply to Daniel Peebles's message of Fri, Oct 19, 2018 at 10:40 AM.

Re: Understanding the YARN Linux container manager

Daniel Peebles
Hi Eric,

This is great information, thanks!

Unfortunately, the provider I’m using (EMR) isn’t on Hadoop 3 yet, so I don’t have much control over that.

I’m definitely interested in any other pointers you have for hardening, as well as additional reading that might be relevant. I’m mostly trying to understand the degree of mutual trust required between my YARN jobs that share nodes with each other (even if they don’t run concurrently), so if there’s any reading on what (potentially untrusted) job code runs in what contexts or similar information, anything like that would be very helpful to me!

Thanks,
Dan

In reply to Eric Badger's message of Fri, Oct 19, 2018 at 11:56.

Re: Understanding the YARN Linux container manager

Eric Badger-2
My expertise on broader Hadoop security concepts is more limited. There's everything from Kerberos authentication of users to HDFS encryption zones and secure RPC/spill. Then there's the fact that tasks run on the same node with access to the same resources. If they aren't running concurrently, you have a smaller class of problems to deal with, but I'm sure some remain. As for what potentially untrusted code runs in what contexts: it depends on what you consider to be untrusted. If you take a zero-trust stance, then all user code running on top of Hadoop is untrusted. Otherwise, it's up to you to decide whether jobs run on your cluster are trusted (via Kerberos auth to Hadoop, access to gateway machines, headless users, etc.). Again, I'm not a Hadoop security expert, just an enthusiast.

As far as documentation goes, that's a good question that I don't know the answer to. Unfortunately I don't think any of the above is a great answer to your question, so hopefully someone else can add additional insights.

Eric

In reply to Daniel Peebles's message of Fri, Oct 19, 2018 at 11:00 AM.